Navigating the storm: IMPACT, eMOP, and agile steering standards

L. Mandell, C. Neudecker, A. Antonacopoulos, Elizabeth Grumbach, L. Auvil, M. Christy, Jacob A. Heil, T. Samuelson
Digit. Scholarsh. Humanit. DOI: . /llc/fqv

This article discusses two major initiatives tasked with developing tools to improve optical character recognition (OCR) or the mechanical keying of texts that are digitally available only as page images. The two initiatives are the Improving Access to Text project in Europe and the Early Modern OCR Project in the USA. Because of dealing with a multilayered problem like OCR technologies and having to collaborate with radically interdisciplinary and international team members, the two projects…

Data paper

Corresponding author: Yann Ryan, Dept. of English, Queen Mary, University of London, London, UK. y.c.y.ryan@qmul.ac.uk

Keywords: newspapers; digital humanities; metadata; newspaper history; British Library; archives

To cite this article: Ryan, Y., & McKernan, L. ( ). Converting the British Library’s catalogue of British and Irish newspapers into a public domain dataset: processes and applications. Journal of Open Humanities Data, : , pp. – . DOI: https://doi.org/ . /johd.

Abstract
This paper describes the production of a title-level list of British, Irish, British Overseas Territories and Crown Dependencies newspapers ( – ) held by the British Library, and its potential for reuse and research. The data was extracted from the British Library’s catalogue of over , British and Irish newspaper titles, cleaned, and published on the British Library Research Repository, an open access repository for the research produced by staff and research associates of the British Library. Bespoke versions of the data have been made available to specialist users, notably the British Library/Alan Turing Institute’s ‘Living with Machines’ project, enabling greater historical analysis of nineteenth-century British news and selective digitisation.
Yann Ryan
Luke McKernan
*Author affiliations can be found in the back matter of this article.

Converting the British Library’s catalogue of British and Irish newspapers into a public domain dataset: processes and applications

Ryan and McKernan, Journal of Open Humanities Data. DOI: . /johd.

Overview

Repository location
DOI: https://doi.org/ . /

Context
Produced as part of the British Library’s Heritage Made Digital newspaper project, digitising historical newspapers and exploring options for creative re-use of newspaper data (https://blogs.bl.uk/thenewsroom/ / /heritage-made-digital-the-newspapers.html).

Method

Steps
The original data comes from the British Library’s catalogue of world newspapers (there is no separate newspaper catalogue, but all newspaper titles are included in the British Library’s Aleph management system and discoverable via its integrated catalogue at https://explore.bl.uk). This has been built up over c. years of collecting of newspapers, a collection that now comprises some , titles or million issues dating from to the present day. The collection of British and Irish titles (around two-thirds of the entire newspaper collection) runs from to the present day. It is not absolutely complete, but most titles published from the s onwards are held, and effectively all titles are held from onwards, when legal deposit was introduced, by which publishers are required to send one copy of each newspaper issue to the British Library. There are a few omissions (either entire titles or gaps in the run of a title), while for reasons of space usually only one edition of an issue has been taken by the Library since .
The newspaper catalogue is at title-level, with changes in a newspaper title and regional variants resulting in a new catalogue record, and often a new catalogue record where there has been a change in format (i.e., microfilm or digital copies). Over the long period of collecting, inevitable inconsistencies and gaps in the metadata have built up. The titles in the dataset are exactly as reflected in the catalogue, following the British Library cataloguing practice of the times when the titles were acquired.

The data was extracted from the British Library catalogue (through the Aleph integrated library system) by the Collections Metadata team. Aleph stores metadata relating to the newspaper collection, in the form of MARC records, in a number of fields, often with complicated holding information. The team extracted years of publication from the free-text information about date holdings; these extracted years are published alongside the original holdings field. This was then aligned to data from a separate master negative database of microfilm copies of newspaper print originals (newspaper titles linked by system ID numbers), as well as an up-to-date list of digital holdings on the British Newspaper Archive, which hosts digitised newspapers from the British Library collection (https://www.britishnewspaperarchive.co.uk).

The initial data extraction was followed by a process of cleaning the extracted data by the News Collections team. We manually adjusted some , holdings records, mostly where years had not been extracted or there were inconsistencies surrounding place of publication, either where alternative spellings or punctuation had been used, or to ensure that the entire dataset used the same set of UK county boundaries. For this, and for joining the initial dataset to the microfilm and digital holdings, custom scripts for cleaning and extracting the data were developed using R, before exporting to the final .csv format.
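The year extraction and the join to microfilm holdings described above can be illustrated with a short sketch. The authors' scripts were written in R and are not reproduced in the article, so the Python below is only an illustration under stated assumptions: the sample holdings strings, the field names `system_id`, `holdings`, `years` and `on_microfilm`, and the restriction to four-digit years from 1600 to 2099 are all hypothetical.

```python
import re

# Assumption: publication years appear as four-digit numbers between 1600 and 2099.
YEAR_PATTERN = re.compile(r"\b(?:1[6-9]\d{2}|20\d{2})\b")


def extract_year_range(holdings_text):
    """Return (first_year, last_year) found in a free-text holdings note, or None."""
    years = [int(year) for year in YEAR_PATTERN.findall(holdings_text)]
    if not years:
        return None
    return min(years), max(years)


def flag_microfilm(title_records, microfilm_system_ids):
    """Add an on_microfilm flag to each record by matching on its system ID."""
    return [
        dict(record, on_microfilm=record["system_id"] in microfilm_system_ids)
        for record in title_records
    ]


# Hypothetical records; a real catalogue export will look different.
records = [
    {"system_id": "A1", "holdings": "1 Jan. 1855-30 Dec. 1871; 1873-1880"},
    {"system_id": "B2", "holdings": "no dates recorded"},
]
for record in records:
    record["years"] = extract_year_range(record["holdings"])

flagged = flag_microfilm(records, microfilm_system_ids={"A1"})
```

Records in which no year is found come back as `None`, which corresponds to the kind of case that, according to the text, had to be adjusted by hand.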
Sampling strategy
To produce a reusable dataset, the decision was made to limit this to British and Irish newspapers, where there were fewer complications with the data, such as dealing with languages other than English, the need for research into the history of some titles, or requiring consultation with other British Library curators in relevant area studies. A complete listing of all titles in the newspaper collection will be a follow-up project, scheduled to take place in .

Quality control
Not applicable.

Dataset description

Object name: British and Irish Newspapers: a title-level list of British, Irish, British Overseas Territories and Crown Dependencies newspapers held by the British Library
Format names and versions: Excel; CSV; plaintext
Creation dates: - - – - -
Dataset creators:
Danskin, Alan – Collection Metadata Standards Manager (British Library)
Lester, Stephen – Curator, Newspaper Collections (British Library)
McKernan, Luke – Lead Curator, News and Moving Image (British Library)
Ryan, Yann – Curator, Newspaper Data (now Post-doctoral Researcher, Queen Mary University of London)
Language: English
License: CC
Repository name: British Library Research Repository
Publication date: - -

Reuse potential
The past six years or so have seen the rise of ‘collections as data’: the idea that metadata from holdings of cultural heritage collections can function as data to be analysed in its own right (see Collections as Data National Forum, , for a definition and discussion of the term).
Tim Sherratt, for example, has used the metadata from the National Library of Australia’s Trove digitised newspaper collection to undertake historical analyses (Sherratt, ). As newspapers are digitised, detailed metadata are produced in tandem, providing issue-level details on the place and date of publication, which can then be exploited by researchers (Fyfe, ). However, this only relates to the portion of the collection which has been digitised, currently consisting of just over % of the entire British Library newspaper collection of million pages. Until now, no readily accessible survey of the Library’s print holdings has been available to researchers. We see the main reuse potential of this dataset as four-fold:

Firstly, the list can be used in conjunction with the physical holdings and the Library’s Explore catalogue (https://explore.bl.uk) as a general finding aid, or to narrow down one’s search to a specific corpus of newspapers. While Explore is already an excellent search tool, this list aids discovery by enabling easy browsing by date and location.

Secondly, it opens up newspaper data to the non-specialist. We purposely standardised and simplified the data fields so that users could take advantage of the filtering, sorting and graphing functions in software such as Excel or Google Sheets.

Thirdly, it allows for geographical and diachronic analyses of the British and Irish newspaper industries, allowing for easy production, for example, of time-series statistics on the establishment of new titles, or of maps of individual ‘hotspots’ of newspaper growth on a county or city level. While geographic coordinates were beyond the scope of the dataset, code to georeference the structured data accurately is being developed specifically for use with the title list (Ryan et al., ).
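The time-series statistics on the establishment of new titles mentioned above can be sketched in a few lines. This is an illustration only: the record layout and the field names `title`, `county` and `first_year` are hypothetical and do not reflect the published column names, and the sample titles are invented.

```python
from collections import Counter


def new_titles_per_decade(records):
    """Count titles by the decade in which they were first published."""
    counts = Counter()
    for record in records:
        first_year = record.get("first_year")
        if first_year is None:
            continue  # skip titles with no extracted start year
        counts[(first_year // 10) * 10] += 1
    return dict(sorted(counts.items()))


# Invented sample rows standing in for lines of the title list.
sample = [
    {"title": "Example Gazette", "county": "Kent", "first_year": 1855},
    {"title": "Example Mercury", "county": "Kent", "first_year": 1859},
    {"title": "Example Courier", "county": "Devon", "first_year": 1861},
    {"title": "Example Herald", "county": "Devon", "first_year": None},
]

per_decade = new_titles_per_decade(sample)
kent_only = new_titles_per_decade(
    [record for record in sample if record["county"] == "Kent"]
)
```

Filtering the records by county before counting, as in `kent_only`, gives the per-place series that a ‘hotspot’ map would be built from.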
Finally, understanding the print collection helps us to understand the digitised portion in context. Researchers across the world now use the data from the British Library’s digitised newspaper collection for historical research. Many of these projects employ large-scale text mining or image analytics over the entire collection to make broad historical claims: previous projects, for instance, have used the corpus to estimate dates when electricity took over from horses, or to analyse ‘subjective well-being’ (Lansdall-Welfare et al., ; Hills, Proto, Sgroi, & Seresinhe, ). However, it is also recognised that these types of claims must be understood in terms of the idiosyncrasies existing in the digitised collection. The corpus ultimately only represents a fraction of the entirety of the Library’s newspaper collection and has not been produced to be particularly systematic or representative (Shaw, ). This list helps to contextualise the data in the digitised collection. A project undertaking text mining, for example, may adjust its weighting methods once it knows what proportion of the print holdings of a particular place each digitised newspaper represents. The ‘Living with Machines’ project (a British Library and Alan Turing Institute initiative using data analysis to understand the lived experience of the nineteenth century) is already using a version of this list to carry out a ‘topographical survey’ of the digitised newspaper collection (Vane, ).

Acknowledgements
The authors would like to thank the British Library News Collections team, British Library Digital Scholarship, and the researchers of the ‘Living with Machines’ project for their help with and feedback on this dataset.

Funding information
British Library Grant-in-Aid.

Competing interests
The authors have no competing interests to declare.
Author contributions
Ryan: conceptualization; investigation; data curation; writing – original draft
McKernan: conceptualization; investigation; supervision; resources; project administration; writing – review & editing

Author affiliations
Yann Ryan, orcid.org/ - - - , Department of English, Queen Mary, University of London, London, UK
Luke McKernan, orcid.org/ - - - x, Newspaper Collections, British Library, London, UK

Published: January
Copyright: © The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution . International License (CC-BY . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/ . /.
Journal of Open Humanities Data is a peer-reviewed open access journal published by Ubiquity Press.

References
Collections as Data National Forum. ( ). The Santa Barbara statement on collections as data. Retrieved November , , from https://collectionsasdata.github.io/statement/
Fyfe, P. ( ). An archaeology of Victorian newspapers. Victorian Periodicals Review, ( ), – . DOI: https://doi.org/ . /vpr. .
Hills, T. T., Proto, E., Sgroi, D., & Seresinhe, C. I. ( ). Historical analysis of national subjective wellbeing using millions of digitized books. Nature Human Behaviour, ( ), – . DOI: https://doi.org/ . /s - - -z
Lansdall-Welfare, T., Sudhahar, S., Thompson, J., Lewis, J., FindMyPast Newspaper Team, & Cristianini, N. ( ). Content analysis of years of British periodicals. Proceedings of the National Academy of Sciences, ( ), e –e . DOI: https://doi.org/ . /pnas.
Ryan, Y., Ardanuy, M. C., van Strien, D., Hosseini, K., Beelen, K., Hetherington, J., McDonough, K., McGillivray, B., Ridge, M., Vane, O., & Wilson, D. ( ). Using smart annotations to map the geography of newspapers. Paper presented at DH , Ottawa, Canada (held online). https://dh .adho.org/wp-content/uploads/ / / _usingsmartannotationstomapthegeographyofnewspapers.html
Shaw, J. ( ). billion words: the British Library newspapers – project: some guidelines for large-scale newspaper digitisation. Paper presented at IFLA, Oslo. https://origin-archive.ifla.org/iv/ifla /papers/ e-shaw.pdf
Sherratt, T. ( ). From collection search to collections as data. Retrieved from http://doi.org/ . /zenodo.
Vane, O. ( ). Press Picker: visualising formats and title name changes in the British Library’s newspaper holdings. Retrieved from https://livingwithmachines.ac.uk/press-picker-visualising-formats-and-title-name-changes-in-the-british-librarys-newspaper-holdings/
The value of critical destruction: evaluating multispectral image processing methods for the analysis of primary historical texts

Alejandro Giacometti
Department of Medical Physics and Biomedical Engineering, UCL Centre for Digital Humanities, University College London, London
Alberto Campagnolo
Ligatus Research Centre, CCW Graduate School, University of the Arts London, London
Lindsay MacDonald
Photogrammetry, D Imaging and Metrology Research Centre, University College London, London
Simon Mahony
UCL Centre for Digital Humanities, Department of Information Studies, University College London, London
Stuart Robson
Photogrammetry, D Imaging and Metrology Research Centre, University College London, London
Tim Weyrich
Department of Computer Science, UCL Centre for Digital Humanities, University College London, London
Melissa Terras
Department of Information Studies, UCL Centre for Digital Humanities, University College London, London
Adam Gibson
Department of Medical Physics and Biomedical Engineering, University College London, London

Correspondence: Melissa Terras, Department of Information Studies, Foster Court, University College London, Gower Street, WC E BT, London. E-mail: m.terras@ucl.ac.uk

Digital Scholarship in the Humanities © The Author . Published by Oxford University Press on behalf of EADH. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/ . /), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
DOI: . /llc/fqv
Digital Scholarship in the Humanities Advance Access published October ,

Abstract
Multispectral imaging, a method for acquiring image data over a series of wavelengths across the light spectrum, is becoming a valuable tool within the cultural and heritage sector for the recovery and enhancement of information contained within primary historical texts. However, most applications of this technique, to date, have been bespoke: analysing particular documents of historic importance. There has been little prior work done on evaluating this technique in a structured fashion, to provide recommendations on how best to capture and process images when working with damaged and abraded textual material. This article introduces a new approach for evaluating the efficacy of image processing algorithms in recovering information from multispectral images of deteriorated primary historical texts. We present a series of experiments that deliberately degrade samples cut from a real historical document to provide a set of images acquired before and after damage. These images then allow us to compare, both objectively and quantitatively, the effectiveness of multispectral imaging and image processing for recovering information from damaged text. We develop a methodological framework for the continuing study of the techniques involved in the analysis and processing of multispectral images of primary historical texts, and a dataset which will be of use to others interested in advanced digitisation techniques within the cultural heritage sector.
Introduction

Multispectral imaging is an advanced digitisation method for acquiring image data over a series of wavelengths across the light spectrum. Combined with image processing, it has become a valuable tool for the enhancement and recovery of information contained within culturally important documents, providing a means, in some cases, to recover lost text, or examine other features no longer detectable by the human eye. However, applications of multispectral imaging within the cultural and heritage sector have mainly been bespoke, with limited access to or understanding of the techniques and methods used to recover damaged text. The barriers to accessing this technology will become lower as the equipment becomes commercially available; however, it is important that we better understand the methods and approaches used for multispectral imaging in order to be able to use such techniques efficiently, whilst maximising the information we can recover from cultural objects.

This article describes a highly interdisciplinary approach to evaluating multispectral imaging and image processing in the context of primary historical sources. We introduce a formal methodology to evaluate image processing of multispectral data and provide a framework for developing new, best practice methods when using multispectral processes to image damaged texts. We do so by first building up a large dataset of multispectral images of actual parchment, taken before and after a set of degradation procedures that were designed to match the most likely types of damage which may occur over the lifetime of parchment documents.
This dataset then allows us to evaluate the efficacy of image processing algorithms attempting to recover damaged text, and to make recommendations on how best to apply multispectral imaging when attempting to recover information from damaged text. Our novel approach, which requires the necessary, controlled destruction of a historical parchment document, presents a formal methodology in acquiring, processing, and analysing multispectral data. It also led to the creation of a large dataset consisting of a series of multispectral images showing both the initial and degraded state of samples from a real manuscript, providing a valuable tool for the advanced digitisation research community. As such, this article makes a major contribution to our understanding of how multispectral imaging can be used across the cultural and heritage sector, and demonstrates how an interdisciplinary approach centred on questions raised from within a digital humanities project can advance our understanding of image processing for both the cultural heritage and engineering science sectors.

The digital humanities and imaging

Although most effort in the digital humanities is focussed on the production, analysis, and visualisation of text, there is a recent and growing interest in the community towards digital imaging, and how image capture and processing techniques can aid us in uncovering new bodies of information, particularly from historical documents. Digital imaging technology has been used to produce detailed and trustworthy surrogates of historical documents for decades (Deegan and Tanner, ; Hughes, ; Terras, ), and digitized versions of primary historical sources are often adequate for the needs of most scholars.
However, improvements in image processing and analysis have led to a number of exciting and important digital humanities projects which can reveal a greater wealth of information about the originals, beyond traditional digitisation technologies. Building on technological improvements in image acquisition and image processing, humanities scholars have been able to image, analyse, and recover more information from historical texts (Chabries et al., ; Terras, a; Salerno et al., ; Tanner and Bearman, ). One of the most promising techniques is multispectral imaging, which can provide additional evidence of the content of a document when it is difficult to read with the naked eye, when further information about the physical composition of a document and ink identification is required (Senvaitenë et al., ), or when information is required about its provenance (Tanner and Bearman, ).

Multispectral imaging

Light is an electromagnetic wave, often characterised by its wavelength: the distance between two consecutive peaks of the wave, which we perceive as colour. The spectrum that is visible to humans includes wavelengths from approximately nm to nm (Fig. ). Light with a wavelength longer than nm is referred to as infrared; ultraviolet is light with wavelengths shorter than nm (Peatross and Ware, ). Most digital imaging equipment captures the same broad spectrum of light that is visible to humans with a combination of broadband red, green, and blue sensors (this is hardly surprising, given that the outputs of most imaging technologies are those which humans should be able to see). In contrast, multispectral imaging measures a series of discrete wavelengths over a defined range. These images can be acquired in the visible spectrum and also in the infrared and ultraviolet spectrum (with images that include a broader range of wavelengths often being referred to as hyperspectral (Landgrebe, )).
Multispectral images are reasonably straightforward to acquire if appropriate light sources and detectors are available along with a method for wavelength selection. A band of the spectrum is usually selected through the use of filters, or via the light source. For example, a series of filters placed in front of a camera lens can allow images to be captured in distinct wavebands (Hardeberg et al., ; Attas, ; Rapantzikos and Balas, ) or, in a more recent development, light sources can be used which emit at specific wavelengths (Easton et al., ; Marengo et al., ; Hollaus et al., ). Images may then be acquired using a commercial camera or a more sophisticated scanning system. The resulting sets of images can show different aspects of a document at different wavelengths (Fig. ).

Multispectral imaging was first developed by NASA in the s to determine the composition of objects in space (Landgrebe, ) and more recently has been used in medical imaging, for example in imaging the interior of the eye (Everdell et al., ). It is also very useful for reading documents. Because different inks have different spectral signatures due to their differing chemical composition, multispectral imaging can be used to differentiate inks used in different areas of a document or at different depths within a document (such as in the case of palimpsests), or to differentiate ink from other types of document damage, such as mould or abrasion.
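Because ink and parchment reflect light differently from one waveband to the next, one simple way to compare bands is to measure how strongly the ink stands out from its background in each. The sketch below uses a Michelson-style contrast computed on mean intensities. The choice of measure, the pixel values, and the wavelengths 450, 650 and 940 nm are all illustrative assumptions and are not the procedure or data reported in the article.

```python
def region_contrast(ink_pixels, background_pixels):
    """Michelson-style contrast between mean ink and mean background intensity."""
    ink_mean = sum(ink_pixels) / len(ink_pixels)
    background_mean = sum(background_pixels) / len(background_pixels)
    total = ink_mean + background_mean
    if total == 0:
        return 0.0  # both regions are black, so there is no contrast to measure
    return abs(background_mean - ink_mean) / total


def rank_bands_by_contrast(bands):
    """Return wavelengths sorted from highest to lowest ink/background contrast."""
    return sorted(
        bands,
        key=lambda wavelength: region_contrast(*bands[wavelength]),
        reverse=True,
    )


# Hypothetical 8-bit intensities sampled from an ink stroke and from bare parchment,
# keyed by an assumed waveband centre in nanometres.
bands = {
    450: ([40, 50], [200, 210]),
    650: ([90, 110], [200, 220]),
    940: ([180, 190], [200, 210]),
}
ranking = rank_bands_by_contrast(bands)
```

A ranking of this kind only says where a given ink is most visible; separating two inks, or ink from mould, would need the full spectral signature at each pixel rather than a single contrast figure.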
In the cultural and heritage sectors, multispectral imaging has been used across a range of documents including the Archimedes Palimpsest (Salerno et al., ), the Dead Sea Scrolls (Chabries et al., ; Tanner and Bearman, ), carbonised scrolls from Herculaneum (Chabries et al., ), letters from the Hudson Bay Archives (Goltz et al., ), and palimpsests from the Saint Catherine Monastery in Egypt (Easton et al., ). It has also been used to improve tarnished daguerreotypes (Goltz and Hill, ), to remove the effects of ink-bleeding, ink-corrosion, and foxing (Joo Kim et al., ), and to recover the diaries of David Livingstone (Knox et al., ). Although multispectral imaging is currently the leading technique for recovering lost text in historical manuscripts, there are no guidelines which determine when it is the most appropriate technique compared to imaging at a single wavelength, or which wavelengths are best to use. One purpose of this work is to establish a means to compare different imaging approaches objectively so that such guidelines can be evidence-based.

Most of the reported applications of this technique in the cultural and heritage sector are to specific documents of great historical importance. Wider use of the technology is now inevitable as more examples of successful recovery from multispectral images of historical documents arise, although a careful cost–benefit analysis is required to consider the type of data that a multispectral imaging project might yield, given the present (but falling) costs of undertaking this kind of imaging. As the availability of the technique is expected to increase, it is important to consider best practice in capturing and processing multispectral images.
Questions remain as to how best to take advantage of digital visualisation technology to present multispectral image data of cultural heritage to historians and palaeographers (Bonanni et al., ; Ponto et al., ). In addition, there is little evidence available about how best to process or analyse multispectral images (Giacometti, ). Further processing (the computational manipulation of digital images, see Gonzalez and Woods ( ) for an introduction) of multispectral images can allow important historical features and details to be identified, enhanced, and separated from other features, and it is important to understand what image processing approaches are most useful when dealing specifically with multispectral images of particular types of damage found on primary historical texts. Additionally, multispectral imaging can be misunderstood, with the technology sometimes being described as if it were magic (for example, see Zolfagharifard, ), and there is a need for a systematic investigation into the effectiveness and usefulness of the technique for the cultural and heritage sector.

Fig. Multispectral images are captured in a similar process to colour images, but with many images captured at discrete narrow ranges of the light spectrum, rather than a small number of images which are each sensitive to light at a broader range of wavelengths.

Previous multispectral imaging capture projects, applied to specific examples of texts of historical importance, have concentrated on recording documents in their current state (generally once important features are illegible). Here, we investigate best practice in the multispectral imaging of heritage material by imaging a parchment document before and after a series of degradation processes, allowing us to assess the effectiveness of image processing algorithms to recover information from degraded documents.
this gives us a unique platform for evaluating the quality of recovered images, and allows us to assess the performance of image processing algorithms for analysis of these images. we propose a method for objectively comparing images of degraded documents, and develop a method for indicating which image processing methods are most appropriate for recovering text which has suffered from specific types of damage. in addition, at a time when ‘critical making’ is being much discussed in the digital humanities, we propose that our approach to ‘critical destruction’ demonstrates the importance of adopting quantitative approaches when undertaking digital humanities research.

method

the evaluation of image quality and the performance of methods which produce images of cultural heritage documents is a complex and challenging task (macdonald and jacobsen, ), and there has been little attempt to evaluate multispectral image quality previously. partly this is because ‘quality’ is ill-defined. here, we are able to introduce a new, objective definition of ‘quality’, namely, the amount of shared information between an image of the undamaged parchment and one of the recovered text. this is explored more fully in section . . existing multispectral image data are often particular to an individual document, and the success or failure of analysis is determined by the subjectively perceived legibility of the writing (easton et al., ; attas, ; knox, ). in order to assess multispectral image processing methods objectively, it is necessary to acquire data under controlled conditions: capturing multispectral images of a manuscript before any degradation occurs, and then capturing images of the manuscript after degradation. these two sets of images enable evaluation of the image processing methods and their performance on a real degraded document.
naturally, this is impossible to do with historical text which has already been degraded (and no curator would allow us to degrade a primary historical text of any importance), but it is possible to adopt an experimental approach in which a real manuscript is deliberately degraded in a rigorously controlled fashion, and its corresponding deterioration documented via multispectral imaging. this allows us to quantitatively and objectively compare the recovery of text from the degraded manuscript and to evaluate the success of multispectral imaging and its related processing approaches when faced with specific documentary damage.

fig. multispectral detail of a single feature from our sample r captured using a monochrome camera. note the variation in intensity and contrast of the writing and ink across the imaged wavelengths. it can be observed how, initially, the ink gains contrast slightly, with a darker background in the shorter wavelengths. around the longer wavelengths in the visible spectrum and into the near-infrared, the contrast suddenly drops quite significantly, rendering the contrast in the nm image almost null.

such an experimental approach is common in the field of conservation: the changes that parchment (or more specifically, its collagen fibres) undergo have been studied when subjecting samples to heat (chahine, ), ultraviolet light (meghea et al., ), even open flame (giurginca et al., ), and chemical solutions (dobrusina and visotskite, ).
other methods, such as optical coherence tomography (góra et al., ), x-ray fluorescence, optical fluorescence (dolgin et al., ), and x-ray diffraction (kennedy and wess, ), have been used to identify the state of degradation of parchment. however, ours is the first known study to focus on the macroscopic effects of degradation agents on the legibility of primary historical texts in order to understand how we can most effectively use multispectral methods.

degraded and degrading text

we chose to focus on parchment documents for our study, given that parchment remains the primary medium of large quantities of culturally important documents in archives, museums, libraries, and private collections. a durable, stiff material, parchment is made of animal skin and consists of structured collagen fibres, but is highly sensitive to changes in humidity, and is endangered by biological, thermochemical, and mechanical damage (reed, ; clarkson, ; larsen, ). through consultation with conservators and archivists, we identified twenty methods of degradation that commonly affect historical parchment material, changing its physical characteristics at both microscopic and macroscopic levels (vnouček, ). these included both physical and chemical agents to mimic the kinds of damage that parchment documents can be expected to incur during their lives, from technological mistakes during production, to improper use, unsuitable storage conditions, disasters, and natural ageing (table ). the damage agents were selected so as to affect not only the properties of the parchment, but also the legibility of text in various ways, for example, shrinking or otherwise deforming the parchment, and obscuring or effacing the writing via physical, chemical, or biological reactions and stains (giacometti et al., ).

the parchment prepared

an eighteenth-century manuscript was donated for our experiment from london metropolitan archives.
the document was an assignment of property which had been deemed to hold no historical or scholarly value, and had been de-accessioned from their collection prior to our request for parchment material in accordance with the national archives guidance on deaccessioning and disposal (the national archives, ). the manuscript was written in iron gall ink on prepared animal skin, measured approximately × cm, and was composed of two large leaves (fig. ), which were folded in thirds both horizontally and vertically. both leaves had red margin guidelines and a blue seal glued outside the left margin. the outer leaf contained a large stamp on the top left corner and a fold trapping the inner leaf with red wax. both leaves had writing on the recto covering most of the area of the leaf enclosed by the red margin. the outer leaf had a written section on its verso, detailing the date and contents of the document. the recto of the manuscript corresponded with the flesh side of the parchment, the verso with the hair side. apart from some signs of wear and tear, especially around the folds of the text, it was in overall good condition.

twenty-three × cm flat square sections were cut from the parchment as samples for this research: each sample contains written text. each of the samples was selected from a flat area of the manuscript, where the writing covered the surface of the recto (flesh side), and the verso was empty and without blemishes. folds and marks of any kind were avoided. there were three exceptions: two of the samples were cut so as to contain writing on both the recto and verso, and one sample was cut from a folded area. the old fold sample and one of the samples with writing on both sides were kept as controls. the samples were imaged, then a series of treatments were applied to damage the twenty samples, as described in table ; three were left untreated and kept intact as controls (giving a total of samples). the samples were then reimaged, producing image sets acquired before and after damage.

the parchment imaged

multispectral images were acquired of every sample in two separate sessions in order to capture the samples in an untreated and treated state. the imaging station and equipment were not moved between sessions. each sample was imaged under three modalities which represent different ways of illuminating and capturing the images (macdonald et al., ). the nikon camera is a consumer camera which has a high pixel resolution and acquires colour images; the monochrome camera is a scientific camera which acquires greyscale images only (making it more convenient for multispectral imaging through a filter) and has increased sensitivity to the infrared, but has a lower pixel resolution. using both reflective and transmission imaging allowed both surface and deeper features to be detected.

table summary of different types of degradation (in small capitals), giving the reason for the degradation, the circumstances in which it might occur, and the type of degradation
degradation | reason | circumstances | mechanical damage | chemical, biological, or environmental damage | damage by extraneous substances
technological mistakes during manufacture lime solution, acidity, finishing scraping hydrochloric acid, calcium hydroxide oil during writing ink acidity sulphuric acid during binding unsympathetic binding mechanical damage storage environmental changes temperature, humidity heat, desiccant, mould exposure to light visible light, uv uv light, controls pollutants, dirt chemical reactions smoke, sulphuric acid, controls natural disasters fire, smoke, water heat, smoke, water biodegradation micro-organisms, insects, rodents mould mechanical destruction rubbing, folding mechanical damage use erasures, changes to text corrections, re-usage scraping iron gall ink mishandling, misuse mechanical damage, scrunching accidents spillage blood, red wine, black tea, iron gall ink, water blood, oil, red wine, black tea, aniline dye, iron gall ink, indian ink repairs historical, conservation treatments water, sodium hypochlorite rebinding unsympathetic binding mechanical damage palaeographical and conservation experiments palimpsest text recovery, bleaching hydrochloric acid oil reformatting, digitisation mechanical damage uv light natural ageing controls
in the table are also highlighted the kinds of degradation that naturally occurred to the manuscript during our experiments; these are identified by the keyword controls.

(1) colour reflective imaging. this maximises sensitivity to surface features on the parchment. a nikon d camera was mounted on a copystand with white tungsten-halogen incandescent lighting set at (these emit from to , nm, with peak power from – nm; see fig. ). sixteen bandpass filters with centre wavelengths – nm and bandwidth nm (unaxis optics, usa) were fixed in turn to the front of the camera lens. the camera was fitted with a nikkor mm f . macro lens and set to an aperture of f throughout.

(2) colour transmission imaging. this used the same camera, but the light source was a lightbox beneath the parchment, ensuring that only light which had passed through the parchment was detected, thereby increasing sensitivity to deeper regions of the parchment, and avoiding specular reflections. its fluorescent lamps had a narrower bandwidth and did not provide measurable power above nm (which did not affect the colour imaging, because our longest wavelength filter was nm). the same sixteen bandpass filters and the same camera lens were used. we also imaged each sample with no filter, giving captures per sample.

(3) monochromatic reflective. in order to detect light in the near infrared, we used a monochrome camera (kodak megaplus . i) which did not have an infrared cutoff filter.
this enabled us to investigate longer wavelengths using five additional infrared bandpass filters from – nm, centred every nm. this camera has a lower spatial resolution than the nikon ( , × , pixels compared to , × , pixels) and a lower bit depth ( bits compared to bits). the lens was a nikkor mm f , which was set at a constant aperture of f/ . for all images. the sampling resolution on the surface of the parchment was approximately pixels/mm ( dpi).

a total of , images were acquired in two imaging sessions, requiring careful data management processes to name each sample and record metadata about the process (table ; giacometti, ).

fig. a diagram of the location of the samples cut from both leaves of the iron gall ink on parchment manuscript. each sample is cm square, giving an overall impression of the size of the original parchment.

fig. imaging set up for colour reflective imaging. the camera is locked facing vertically downwards. the sample is placed on the copystand over a piece of black card under a sheet of anti-reflective glass. the process of acquiring the images involved manually exchanging each filter in front of the camera lens, causing small movements of the camera each time (which meant the resulting images required further image registration, see giacometti, ).

table imaging details per session: imaging system and lighting modality, and photograph counts
session | modality | samples | captures | number of images
before degradation | colour reflective | | |
before degradation | colour transmission | | |
before degradation | monochromatic reflective | | |
before degradation | session total | | | ,
after degradation | colour reflective | | |
after degradation | colour transmission | | |
after degradation | monochromatic reflective | | |
after degradation | session total | | | ,
total | | | | ,
for each sample, one photograph was taken with each of filters, plus one photograph without a filter.

image processing

the analysis of images created by our experimental approach is important in several ways, as we can use various image processing algorithms to produce estimates of the original writing from the degraded samples, and calculate how different our results are from the untreated samples (this is demonstrated in fig. ). there is a variety of image processing methods suitable for this task, including k-means clustering, principal components analysis (pca), independent components analysis (ica), and linear spectral mixture analysis (lsma), although early on in this research we demonstrated the limited success of k-means clustering for our application (giacometti, ). we therefore processed the multispectral images of each treated sample using three different methods: pca, ica, and lsma.

first, data were cleaned and prepared: each monochrome image was cropped to a square area of × pixels, and the stack of images representing the same sample was co-registered so that each pixel represents the same coordinates on the physical sample. this corrected for movement in the camera as lens filters were changed, and also for some degradation treatments which caused the samples to shrink or curl. we used a two-step registration process. an initial linear step performed gross realignment and a subsequent non-linear step provided non-rigid registration (giacometti, ). both steps were implemented using software called niftireg (modat et al., ). this resulted in a series of images of the treated sample which could be directly compared pixel-by-pixel to images of the untreated sample. the three image analysis procedures were applied to these co-registered images.
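as a rough illustration of the pca step applied to a co-registered stack, the following python sketch treats each pixel as a vector of band intensities and projects it onto the leading principal axes, giving one 'layer' image per component. it is a minimal sketch using numpy only and is not the implementation used in this study; the function name `pca_layers`, the array layout (bands, height, width), and the default of four components are assumptions made for illustration.

```python
import numpy as np


def pca_layers(stack, n_components=4):
    """Project a co-registered image stack onto its leading principal axes.

    stack: array of shape (bands, height, width), one image per filter
           (hypothetical layout, assumed for this sketch).
    Returns (layers, variance_ratio): layers has shape
    (n_components, height, width); variance_ratio gives the fraction of
    spectral variance carried by each returned component.
    """
    bands, height, width = stack.shape
    # One row per pixel, one column per spectral band.
    pixels = stack.reshape(bands, -1).T.astype(np.float64)
    # Centre each band so the SVD describes variance, not mean brightness.
    pixels = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes in spectral space, ordered by
    # decreasing singular value.
    _, singular_values, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:n_components].T
    variance_ratio = singular_values**2 / np.sum(singular_values**2)
    layers = scores.T.reshape(n_components, height, width)
    return layers, variance_ratio[:n_components]
```

the sign of each component is arbitrary, so ink may come out light on dark or dark on light; this is one reason a sign-independent similarity measure is needed when comparing layers with the untreated images.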
the output of these consisted of a series of individual images, each of which was intended to show one feature of the sample; for example, we might have four images or ‘layers’ representing two inks, a stain, and the underlying parchment.

a comparison of the recovered images was then performed. our aim was to be as objective and quantitative as possible, which was made difficult because the layers obtained from pca and ica show degrees of correlation rather than similar intensities. for example, ink might appear as dark on a light background in the sample, but the layer representing ink might appear as a light pattern on a dark background. a straightforward similarity measure such as least squares similarity would give a poor result in this case even if the ink had been perfectly identified. we therefore used an approach known as ‘mutual information’, which is a quantitative measure of the amount of information shared between two images, independent of their colour or intensity (wells et al., ; panagiotou, ). in our case, a successful identification of the text would give a high mutual information score if the patterns were correctly identified, irrespective of the colour assigned to the recovered layer.

controlled damage

we describe three of the twenty methods of destruction in more detail below. these three examples were chosen to illustrate different classes of damage: one where the ink was physically removed (and therefore no longer remains on the document); one where the ink remains in its original form but has been obliterated by a stain; and finally one where the ink remains but has been chemically altered by a bleaching agent.

scraping

forcibly removing the surface layer and the visible writing off the parchment with pumice stone was a commonly employed method to re-use parchment (diringer, ; netz and noel, ), and therefore is a common problem for which multispectral imaging may help in recovering the erased text.
we divided our sample into three areas, leaving the top area untreated, the middle area scraped gently, and the bottom area scraped thoroughly for a longer period. the scraping was undertaken by beating a pumice stone until it became powder, and, using a piece of cotton wool, scraping the parchment with this powder, in circular motions.

blood

bloodstains are not uncommon in historical documents (wechsler, ), and historical texts can even be written in blood (gurkina and rebrikova, ; kieschnick, ). moreover, blood looks similar to iron gall ink, as both haemoglobin and iron gall are rich in the same colour-carrying iron oxides, which contributes to the difficulty of separating ink from bloodstains. in criminology, multispectral imaging has been found of assistance when dating blood stains at crime scenes (edelman et al., ). we stained our parchment with human blood (obtained as expired blood from the uclh blood transfusion unit), until it fully penetrated the parchment. the excess blood was removed with blotting paper.

sodium hypochlorite

sodium hypochlorite is a strong alkaline substance and a common bleaching agent. reports of experiments with both paper and parchment describe the bleaching effects of both the parchment turning pink and the iron gall ink fading (smith, ). there are also anecdotes of unscrupulous bleaching of parchment by curators and conservators in an attempt to read them (blagden, ; fuchs, ). in our method, – % sodium hypochlorite (naocl) was diluted in ml of de-ionised water (ph- . ) until a ph of . was reached. this solution was then applied to the parchment.

results

we describe the results of imaging our three methods of destruction in more detail below. we concentrate here on results obtained from monochromatic reflective imaging, as the analysis of these images has produced useful outputs.
see giacometti ( ) for further research on our other imaging modalities.

example : scraping

figure shows the initial sample, a photograph of the sample after scraping, and the best recovered image. the scraping has rendered the affected areas illegible to the naked eye (centre of fig. ). there are traces of text remaining on the middle third of the sample, where the scraping was lighter, but the bottom third has very few marks where the writing used to be. on the top third, where a single word was removed, there is a darker mark on the parchment, but no traces of the word can be seen. the degradation to the integrity of the parchment is also visible in the samples: the parchment has become physically thinner in the scraped areas.

figure shows the recovered component images from the sample shown in fig. . the first row of fig. shows the four largest principal components of the images of the untreated samples. it can be seen that the pattern of the text is recovered successfully, but the intensity is not faithfully recovered: the text is white on black whereas in the original sample (see fig. ), it is black on white. this illustrates the effect described in section . and the necessity of a technique like mutual information, which is sensitive to the overall patterns but not the absolute values, to compare the images quantitatively. each subsequent row displays four registered recovery images resulting from one of the three image processing algorithms (from rows – , pca, ica, and lsma, respectively).

the mutual information between images of the damaged and undamaged samples ranged from zero, where there was no information shared between a pair of images, to . , which was the highest quantitative similarity and which occurred between no-pc and sc-ica in fig. above.
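mutual information scores of the kind reported above can be estimated from the joint intensity histogram of two co-registered images. the sketch below is a minimal numpy illustration under stated assumptions, not the implementation used in this study: the function name `mutual_information`, the bin count, and the use of base-2 logarithms (so the score is in bits) are all choices made for illustration, and the absolute values it returns are therefore not comparable with the scores quoted in the text.

```python
import numpy as np


def mutual_information(image_a, image_b, bins=32):
    """Estimate the mutual information (in bits) between two images.

    Both images must have the same shape and be co-registered, so that
    corresponding pixels describe the same point on the sample.
    """
    # Joint histogram of paired pixel intensities.
    joint, _, _ = np.histogram2d(image_a.ravel(), image_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    # Marginal distributions of each image.
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    # Sum only over occupied cells to avoid log(0).
    occupied = p_ab > 0
    ratio = p_ab[occupied] / (p_a @ p_b)[occupied]
    return float(np.sum(p_ab[occupied] * np.log2(ratio)))
```

because the score depends only on how intensities in one image predict intensities in the other, an image and its photographic negative share exactly as much information as an image shares with itself, whereas a least squares comparison would rate the negative as a poor match.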
in this sample, then, the treatment has significantly affected the writing, but there is information that can be recovered using multispectral imaging, and we can visualise some recovery of the writing, while quantitatively indicating the success of ica to achieve the best results.

example : blood

on our sample section shown in fig. , the blood treatment has obscured the writing on the bottom half of the sample. the bloodstains are of a dark brownish colour. the lack of contrast between the stained parchment and the ink renders the writing almost illegible. the treatment has also changed the geometry of the sample, shrinking it slightly. this effect has been corrected in the recovery image by co-registering it with the image of the sample before the treatment shown in figure . in this case, the affected area of the sample that holds ink is stained, and any interpretation of the sample is difficult on the unprocessed images. the highest mutual information measure was . and occurred between the unprocessed image and bl-pc . the mutual information between the unprocessed image and bl-ls was almost equally high ( . ).

our systematic data gathering enables further comparison. we can look at the intensity of each pixel as a function of wavelength (fig. ). the top part of fig. shows the untreated parchment. the band of pixels with high reflectance at all wavelengths corresponds to the background, or the parchment substrate. a smaller, less clustered group which corresponds to the writing is darker at shorter wavelengths, but has similar reflectance to the parchment in the infrared. however, after the treatment is applied (bottom of fig. ), a second band of pixels corresponding to the blood appears and follows a similar path to that of the ink, with the response across the spectrum appearing similar.
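the comparison of intensity against wavelength described above can be sketched in python as follows. this is an illustrative sketch only, assuming a co-registered stack laid out as (bands, height, width) and a boolean mask marking ink pixels; the function names `mean_spectrum` and `contrast_by_band`, the mask, and the use of michelson contrast are assumptions introduced here and are not taken from this study.

```python
import numpy as np


def mean_spectrum(stack, mask):
    """Mean intensity per band over the pixels selected by a boolean mask.

    stack: (bands, height, width) co-registered images (assumed layout).
    mask:  (height, width) boolean array, True for pixels of interest.
    """
    return stack[:, mask].mean(axis=1)


def contrast_by_band(stack, ink_mask):
    """Michelson contrast between background and ink for each band.

    Values near zero mean the ink is indistinguishable from its
    surroundings in that band; larger values mean darker ink.
    Assumes intensities are non-negative and not all zero.
    """
    ink = mean_spectrum(stack, ink_mask)
    background = mean_spectrum(stack, ~ink_mask)
    return (background - ink) / (background + ink)
```

plotting the two mean spectra, or the contrast, against the filter centre wavelengths gives a compact version of the per-pixel plot: bands in which the ink curve meets the background curve are those in which the writing effectively disappears.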
this demonstrates that the wavelength dependence of blood is similar to that of much of the ink and indicates why the blood makes the ink difficult to read (see also the spectral reconstruction in macdonald et al., ).

in this case, the recovery estimate shows a clear trace of the writing with identifiable letters, even though the image of the treated sample looks, to the human eye, too obscured to be legible. figure shows that the spectrum of blood is similar to that of ink, so even multispectral imaging might not be completely successful. however, we have shown that pca is capable of enhancing text, and that pca is the first image processing method that should be tried when faced with such damage. lsma also performed well and should be considered if good knowledge of the spectra of the blood and ink is available.

example : sodium hypochlorite

the writing on the sample treated with sodium hypochlorite has become very faint (see the bottom part of the central image in fig. ). both the parchment and the writing have become lighter. the areas where the writing has remained visible appear to be where there are stronger ink marks, suggesting that the ink penetrated deeper into the structure of the parchment. during the application of the treatment, the parchment started to lose structural integrity and some small particles separated from the sample. the treatment was stopped earlier than planned because of this, as the sample needed to be preserved intact for imaging purposes. after drying, the sample became stable again. however, it remained in a fragile state, lighter to the touch, more transparent, smoother, and more flexible than before treatment.

in this case (fig. ), the maximum mutual information was . and occurred for sh-pc . the second highest occurred for sh-ic ( . ). even the best recovered estimate (right image in fig. ) appears to be very similar to the image of the treated sample (middle image of fig.
), and our methods have not been able to extract any further information from the text. multispectral imaging cannot recover text in every example of damage, and our systematic investigations indicate where it is worthwhile using multispectral imaging, and where it simply will not recover any additional information from damaged and deteriorated texts. our research indicates that text damaged by bleaching agents such as sodium hypochlorite might not generate useful results when imaged multispectrally: this could inform cost analysis on when it may, or may not be, worth imaging particular texts which have indications of this sort of damage.

further results and discussion regarding all of our degradation techniques, and the results from both capture and processing, are catalogued in giacometti ( ), providing a framework in which to understand the successes and limitations of multispectral imaging and the image processing algorithms used.

fig. sample o r (left to right) before treatment, after treatment (scraping), and the best possible recovery estimate using our methods (third independent component).

fig. sample r (scraping) image processing results by three different processing algorithms. the top row includes pca of the untreated samples, the next row the pca of the treated samples. row three indicates the results further processed with ica, row four shows lsma: both are highlighting similarities between the original and the treated samples.

discussion

our approach, acquiring multispectral images of historical parchment from a set of samples before and after they were submitted to various forms of degradation, is novel. in this work, we attempted to recover writing from multispectral images, whilst objectively and quantitatively evaluating the effectiveness of image processing algorithms.
our approach successfully identifies the samples which contain more mutual information shared with the original text, and successfully ranks partial recovery of information.

the effect of each of the twenty treatments on both the parchment and the visibility of the writing it carries varied significantly. in some samples in which the writing has been rendered unreadable by the treatment, the writing can be recovered; these include the samples treated with aniline dye, oil, and blood. in some samples the writing is completely obscured or the parchment has been severely affected and recovery is all but impossible; these include the samples treated with iron gall ink, india ink, and mould. in most cases, however, the image processing algorithms can extract more information from the multispectral images of treated samples corresponding to the writing than the human eye can see.

pca outperformed ica and lsma as the image processing means by which to produce accurate recovery estimates for almost all the samples (although one of our examples shown above, the scraped fragment, was more successfully recovered with ica). this shows that there is not one approach or algorithm which suits all types of document degradation, and that the specific condition of a document affects the processing methods which should be used on resulting images. however, pca is a standard processing algorithm which appears to be accurate and robust in this application, and is therefore recommended to be used as the first in a range of processes when analysing multispectral images. further processing may yield improved results.

our research depends on deliberately degrading square samples cut from a real historical iron gall ink manuscript on parchment. this degradation was necessary to model the type of documentary damage commonly seen in historical documents, and to understand how they affect the reading and interpretation of writing, both before and after multispectral imaging of the samples.
the critical destruction is therefore a core part of our method, as it is central to a complete understanding of the effectiveness of multispectral imaging on primary historical texts.

however, our approach does not provide systematic information about any single degradation cause. there is much research to follow on from this, given that we have shown that using carefully prepared historical evidence can provide an effective framework for the evaluation and analysis of the application of multispectral imaging. additional analysis of our data is possible, and we have already carried out further research into the estimation of spectral signatures of the materials present in the documents from the collected multispectral images (macdonald et al., ). we envisage that the dataset has the potential to become an invaluable asset for libraries and archives, research in conservation, and various problems in image and signal processing, and have made all of the data generated from this project available for use by others. our dataset provides physical information of how parchment reacts to various forms of degradation, and also provides documentation on acquisition, and will provide a resource for future research (reducing the need for experimentation on valuable primary historical texts). our next step will be to carry out a similar process concentrating solely on systematically reproducing different degrees of an individual type of degradation (such as water damage or smoke and heat) to provide further information to help both future conservation and digitisation efforts.

fig. sample i r (left to right) before treatment, after treatment (blood), and the best possible recovery estimate using our methods (second principal component).

fig. sample i r (blood) image processing results by processing algorithm. the top row includes pca of the untreated samples, the next row the pca of the treated samples. row three indicates the results processed with ica, row four shows lsma: both highlight similarities between the original and the treated samples.

fig. sample r (blood) spectral intensity against wavelength. above: before treatment. below: after treatment.

fig. sample o r (left to right) before treatment, after treatment (sodium hypochlorite), and the best possible recovery estimate using our methods (first principal component).

fig. sample o r (sodium hypochlorite) image processing results by processing algorithm. the top row includes pca of the untreated samples, the next row the pca of the treated samples. row three indicates the results further processed with ica, row four shows lsma.

conclusion

as multispectral imaging becomes more frequently used in the cultural and heritage sectors, it is important to understand the framework which underpins its application to the capture and analysis of primary historical texts. our research has provided a systematic methodology for the continuing study and evaluation of the techniques involved in the analysis and processing of multispectral images of degraded cultural heritage documents, and a basis for further testing and development. understanding the most efficient way to apply these techniques to damaged and abraded texts is central to ensuring that the images created when using multispectral imaging (which become evidence to be used by a range of scholars including historians, palaeographers, and papyrologists) can be trusted by researchers, whilst also making the most efficient use of resources.
our systematic approach provides a framework for the analysis of deteriorated documents using multispectral techniques. carrying out this type of interdisciplinary research facilitates a deeper understanding of the artefacts, multispectral imaging, and image processing methods. specifically, it provides a methodology for the continuing study of the techniques involved in the analysis and processing of multispectral images of degraded cultural heritage documents, and a framework for further testing and development. it has required input from conservators, digitisation specialists, medical physicists, engineers, computer scientists, and archivists, all collaborating in a digital humanities project where aspects of computing are advanced as much as our understanding of a process that can be useful for humanities scholars. our unique methodology, where the destruction of a historical text is necessary to acquire experimental data for evaluation, can now be used to evaluate a process for reading other, more valuable, historical texts. our combined critical approach to a developing technology allows us to advise and steer the application of multispectral techniques to primary historical texts.

funding

this work was supported by the engineering and physical sciences research council [grant number ep/f x/ ]. we would like to thank london metropolitan archives for donating the parchment manuscript which allowed us to carry out this research.

references

attas, e. m. ( ). enhancement of document legibility using spectroscopic imaging. archivaria, : – .
balas, c., papadakis, v., papadakis, n., papadakis, a., vazgiouraki, e. and themelis, g. ( ). a novel hyper-spectral imaging apparatus for the non-destructive analysis of objects of artistic and historic value. journal of cultural heritage, (s ): – .
barnett, t., chalmers, a., diaz-andreu, m., ellis, g., longhurst, p., sharpe, k. and trinks, i. ( ). d laser scanning for recording and monitoring rock art erosion.
international newsletter on rock art, : – .
baumann, r., porter, d. c. and seales, w. b. ( ). the use of micro-ct in the study of archaeological artifacts. th international conference on ndt of art. jerusalem, israel.
blagden, c. ( ). some observations on ancient inks, with the proposal of a new method of recovering the legibility of decayed writings: by charles blagden, m. d. sec. r. s. and f. a. s. philosophical transactions of the royal society of london, : – .
bonanni, l., xiao, x., hockenberry, m., subramani, p., ishii, h., seracini, m. and schulze, j. ( ). wetpaint: scraping through multi-layered images. proceedings of the sigchi conference on human factors in computing systems (chi ‘ ), new york, ny: acm, pp. – . http://doi.acm.org/ . / . . doi= . / . .
chabries, d. m., booras, s. w. and bearman, g. h. ( ). imaging the past: recent applications of multispectral imaging technology to deciphering manuscripts. antiquity, ( ): – .
chahine, c. ( ). changes in hydrothermal stability of leather and parchment with deterioration: a dsc study. thermochimica acta, ( – ): – .
clarkson, c. ( ). rediscovering parchment: the nature of the beast. the paper conservator, ( ): – .
conway, p. ( ). best practices for digitizing photographs: a network analysis of influences. proceedings of is&t’s archiving , imaging science and technology, berne, – june.
crowther, c., nyhan, j., tarte, s. and dahl, j. ( ). new and recent developments in image analysis: theory and practice. panel session, digital humanities . http://dharchive.org/paper/dh /panel- .xml
deegan, m. and tanner, s. ( ). digital futures: strategies for the information age. london: library association publishing.
diringer, d. ( ). the book before printing: ancient, medieval, and oriental. mineola, ny: courier dover publications.
dh ( ).
plenary sessions, panels, long papers, short papers, posters and workshops at digital humanities . http://dharchive.org/
dobrusina, s. a. and visotskite, v. k. ( ). chemical treatment effects on parchment properties in the course of ageing. restaurator, ( ): – .
dolgin, b., bulatov, v. and schechter, i. ( ). non-destructive assessment of parchment deterioration by optical methods. analytical and bioanalytical chemistry, ( ): – .
earl, g., martinez, k. and malzbender, t. ( ). archaeological applications of polynomial texture mapping: analysis, conservation and representation. journal of archaeological science, ( ): – .
easton, r. l., jr., knox, k. t. and christens-barry, w. a. ( ). multispectral imaging of the archimedes palimpsest. proceedings of the nd applied imagery pattern recognition workshop, san jose, california, pp. – .
easton, r. l., jr., knox, k. t., christens-barry, w. a., boydston, k., toth, m. b., emery, d. and noel, w. ( ). standardized system for multispectral imaging of palimpsests. proceedings of spie , computer vision and image analysis of art d. , san jose, california.
edelman, g., van leeuwen, t. g. and aalders, m. c. ( ). hyperspectral imaging for the age estimation of blood stains at the crime scene. forensic science international, ( - ): - .
everdell, n. l., styles, i. b., claridge, e., hebden, j. c. and calcagni, a. s. ( ). multispectral imaging of the ocular fundus using led illumination. in depeursinge, c. and vitkin (eds), novel optical instrumentation for biomedical applications iv, vol. . proceedings of spie-osa biomedical optics. optical society of america, munich, germany.
fuchs, r. ( ). the history of chemical reinforcement of texts in manuscript - what should we do now? in fellows-jensen, g. and springborg, p. (eds), care and conservation of manuscripts : proceedings of the seventh international seminar held at the royal library, vol. . copenhagen, denmark: museum tusculanum press.
giacometti, a., campagnolo, a., macdonald, l., mahony, s., terras, m., robson, s., weyrich, t. and gibson, a. ( ). documenting parchment degradation via multispectral imaging. proceedings of bcs conference on electronic imaging and the visual arts (eva), london, pp. – .
giacometti, a. ( ). evaluating multispectral imaging processing methodologies for analysing cultural heritage documents. ph.d. thesis, university college london, forthcoming.
giacometti, a., terras, m. and gibson, a. ( ). objectively evaluating text recovery methodologies for multispectral images of palimpsests. international journal of heritage in the digital era, th issue dedicated to computer vision in cultural heritage.
giurginca, m., lacatusu, i. and miu and i. petroviciu, l. ( ). parchment behaviour under extreme heat and fire conditions, ( ): – .
goltz, d. m., cloutis, e., norman, l. and attas, m. ( ). enhancement of faint text using visible ( - nm) multispectral imaging. restaurator, : – .
goltz, d. and hill, g. ( ). hyperspectral imaging of daguerreotypes. restaurator: international journal for the preservation of library and archival material, ( ): – .
gonzalez, r. c. and woods, r. e. ( ). digital image processing. reading, massachusetts: addison-wesley publishing.
góra, m., pircher, m., götzinger, e., bajraszewski, t., strlic, m., kolar, j., hitzenberger, c. k. and targowski, p. ( ). optical coherence tomography for examination of parchment degradation. laser chemistry, : – .
gurkina, s. and rebrikova, n. ( ). treatment of parchment fragments of a hebrew bible. restaurator, ( ): – .
gray, r. and neuhoff, d. ( ). quantization. ieee transactions on information theory, ( ): – .
hardeberg, j. y., schmitt, f. and brettel, h. ( ). multispectral color image capture using a liquid crystal tunable filter. optical engineering, ( ): – .
hartigan, j. a. and wong, m. a. ( ). algorithm as : a k-means clustering algorithm. journal of the royal statistical society: series c (applied statistics), ( ): – .
heinz, d. and chang, c. i. ( ). fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery. ieee transactions on geoscience and remote sensing, ( ): – .
hill, d. l. g., batchelor, p. g., holden, m. and hawkes, d. j. ( ). medical image registration. physics in medicine and biology, ( ): r – .
hollaus, f., gau, m. and sablatnig, r. ( ). acquisition and enhancement of multispectral images of ancient manuscripts. berlin, germany: kultur und informatik: visual worlds and interactive spaces, pp. – .
hughes, l. ( ). digitizing collections: strategic issues for the information manager. london: facet publishing.
hyvärinen, a., karhunen, j. and oja, e. ( ). independent component analysis. new york, ny: john wiley and sons.
information in images ( ). multispectral document imaging. www.informationinimages.com/#!multispectral-document-scanning/c yhe
jolliffe, i. t. ( ). principal component analysis. new york, ny: springer-verlag.
joo kim, s., deng, f. and brown, m. s. ( ). visual enhancement of old documents with hyperspectral imaging. pattern recognition, ( ): – .
kennedy, c. j. and wess, t. j. ( ). chapter the use of x-ray scattering to analyse parchment structure and degradation. in david, b. and dudley, c. (eds), physical techniques in the study of art, archaeology and cultural heritage. elsevier, pp. – .
kieschnick, j. ( ). blood writing in chinese buddhism. journal of the international association of buddhist studies, ( ): – .
knox, k. t. ( ). enhancement of overwritten text in the archimedes palimpsest. proc. spie , computer image analysis in the study of art, ( february ); doi: . / . .
knox, k. t., easton, r. l., jr., christens-barry, w. a. and boydston, k. ( ). recovery of handwritten text from the diaries and papers of david livingstone.
proceedings of spie , computer vision and image analysis of art ii , pp. – .
landgrebe, d. ( ). information extraction principles and methods for multispectral and hyperspectral image data. in chen, c. (ed.), information processing for remote sensing. river edge, nj: world scientific publishing company, pp. – .
larsen, r. ( ). introduction to damage and damage assessment. in larsen, r. (ed.), improved damage assessment of parchment (idap): assessment, data collection and sharing of knowledge, st edn. european commission, directorate-general for environment, pp. – .
lucchese, l. and mitra, s. k. ( ). color image segmentation: a state-of-the-art survey. proceedings of the indian national science academy (insa-a), delhi, indian: national science academy, ( ): – .
macdonald, l. and jacobsen, r. ( ). assessing image quality. in macdonald, l. (ed.), digital heritage, applying digital imaging to cultural heritage. oxford: butterworth-heinemann, pp. – .
macdonald, l. w., giacometti, a., campagnolo, a., robson, s., weyrich, t., terras, m. and gibson, a. ( ). multispectral imaging of degraded parchment. computational color imaging, th international workshop, cciw , chiba, japan, - march . proceedings in tominaga, s., schettini, r., and trémeau, a. (eds), lecture notes in computer science, vol. , chiba, japan: springer berlin heidelberg.
marengo, e., manfredi, m., zerbinati, o., robotti, e., mazzucco, e., gosetti, f., bearman, g., france, f. and shor, p. ( ). development of a technique based on multi-spectral imaging for monitoring the conservation of cultural heritage objects. analytica chimica acta, ( ): – .
meghea, a., giurginca, m., iftimie, n., miu, l., viorica, b. and budrugeac, p. ( ). behaviour to accelerate ageing of some natural biopolymer constituents of parchment. molecular crystals and liquid crystals, ( ): – .
modat, m., ridgway, g. r., taylor, z. a., lehmann, m., barnes, j., hawkes, d. j., fox, n. c. and ourselin, s. ( ).
fast free-form deformation using graphics processing units. computer methods and programs in biomedicine, ( ): – .
netz, r. and noel, w. ( ). the archimedes codex: how a medieval prayer book is revealing the true genius of antiquity’s greatest scientist. st edn. da capo press, london.
panagiotou, c. ( ). information theoretic regularization in diffuse optical tomography. ph.d. thesis. london: university college london.
peatross, j. and ware, m. ( ). physics of light and optics. provo: brigham young university.
ponto, k., seracini, m. and kuester, f. ( ). wipe-off: an intuitive interface for exploring ultra-large multispectral data sets for cultural heritage diagnostics. computer graphics forum, ( ): – .
rapantzikos, k. and balas, c. ( ). hyperspectral imaging: potential in non-destructive analysis of palimpsests. ieee international conference on image processing, .
ramsay, s. and rockwell, g. ( ). developing things: towards an epistemology of building in the digital humanities. in gold, m. k. (ed.), debates in the digital humanities. minneapolis: university of minnesota press, pp. – .
ratto, m. ( ). critical making: conceptual and material studies in technology and social life. the information society, ( ): – .
reed, r. ( ). ancient skins, parchments and leathers. london, uk: seminar press.
salerno, e., tonazzini, a. and bedini, l. ( ). digital image analysis to enhance underwritten text in the archimedes palimpsest. international journal of document analysis and recognition (ijdar), ( – ): – .
schuman, r. ( ). i tweeted a joke that started a big ass ruckus: pan kisses kafka. http://pankisseskafka.com/ / / /i-tweeted-a-joke-that-started-a-big-ass-ruckus/ (accessed january ).
senvaitienė, j., beganskienė, a., tautkus, s., padarauskas, a. and kareiva, a. ( ).
characterization of historical writing inks by different analytical techniques. chemija, ( – ): – .
smith, t. ( ). an evaluation of historical bleaching with chlorine dioxide gas, sodium hypochlorite, and chloramine-t at the fogg art museum. restaurator, ( – ): – .
tanner, s. and bearman, g. ( ). digitising the dead sea scrolls: archiving . arlington, va: the society for imaging science and technology, pp. – .
terras, m. ( a). image to interpretation: intelligent systems to aid historians in the reading of the vindolanda texts. oxford studies in ancient documents, oxford university press, oxford.
terras, m. ( b). disciplined: using educational studies to analyse humanities computing. literary and linguistic computing, ( ): – .
terras, m. ( ). digital images for the information professional. london: ashgate.
the national archives ( ). deaccessioning and disposal: guidance for archive services. www.nationalarchives.gov.uk/documents/deaccessioning-and-disposal-guide.pdf
vnouček, j. ( ). typology of the damage of the parchment in manuscripts of the codex form. in larsen, r. (ed.), improved damage assessment of parchment (idap): assessment, data collection and sharing of knowledge, st edn. european commission, directorate-general for environment, luxembourg, pp. – .
wells, w. m., jr iii., viola, p., atsumi, h., nakajima, s. and kikinis, r. ( ). multimodal volume registration by maximization of mutual information. medical image analysis . , ( ): – .
weingart, s. ( a). acceptances to digital humanities (part ). www.scottbot.net/hial/?p= (accessed april ).
weingart, s. ( b). submissions to digital humanities . www.scottbot.net/hial/?p= (accessed november ).
weingart, s. ( ). acceptances to digital humanities (part ). www.scottbot.net/hial/?p= (accessed april ).
wechsler, t. ( ). the origin of the so called dead sea scrolls. the jewish quarterly review, ( ): – .
workman, j. and weyer, l. ( ). practical guide to interpretive near-infrared spectroscopy.
crc press, london.
zolfagharifard, e. ( ). does the bible have secrets to reveal? scholars hope to restore hidden text in ancient new testament manuscript. http://www.dailymail.co.uk/sciencetech/article- /scholars-hope-restore-hidden-text-ancient-new-testament-manuscript.html#ixzz lgzpvv o

notes

an analysis of submissions to the digital humanities conference carried out by scott weingart demonstrates that work on text processing remains the core focus of the digital humanities community (weingart, b), with an analysis of dh acceptances indicating that ‘literary studies, text analysis, and text mining still reign supreme’ (weingart, ). this follows the same trends identified in weingart’s analysis of digital humanities acceptances (weingart, a). an earlier analysis of the most used words in the ach/allc conference abstracts – (terras, b, p. ) indicates that text was also the focus of this earlier work of the digital humanities community.
at the digital humanities conference , one of the eight panels was devoted to image processing (crowther et al., ), and the program also contained a range of short and long papers dealing with image processing, optical character recognition, and the search, retrieval, and navigation of high resolution document image collections (dh, ).
other emerging techniques of interest to those aiming to recover information from primary historical sources include infrared or near-infrared imaging (workman and weyer, ), and three dimensional imaging such as micro-ct (baumann et al., ), d laser scanning (barnett et al., ), and reflectance transformation imaging (rti) (earl et al., ).
these are the most popular ways to capture multispectral images in the heritage sector, although the cost of obtaining equipment can still be prohibitive for many institutions wishing to undertake this sort of analysis. at the time of writing, a set of narrowband multispectral filters retails in the region of £ , (and also requires additional camera and lighting equipment before it can be used: this is the system we use in this experiment). a full system for production and capture of specific light wavelengths currently retails for £ , . camera sensors that can select wavelengths automatically have been developed (balas et al., ), but these are not commercially available. relatively low cost scanners have been developed that claim full multispectral capabilities, currently retailing for £ , (information in images, ), but these claims have not been verified by independent tests.
although there are now over forty different guidelines in existence which detail best practice in straightforward digitisation of cultural and heritage materials (conway, ), none of them has described ideal approaches for the capture, analysis, and storage of multispectral images of heritage material.
ratto, ; ramsay and rockwell, ; schuman, .
the creation of virtual models, or ‘phantoms’, to allow this comparison is also explored in detail in giacometti ( ) and giacometti et al. ( ).
www.lma.gov.uk
k-means clustering is a method to separate data points into a number (k) of clusters according to underlying shared characteristics. for example, an image showing parchment and two different inks might be separated into k= clusters, so that the pixels representing the three ‘layers’ are identified separately. see hartigan and wong ( ), gray and neuhoff ( ), and lucchese and mitra ( ).
pca is a technique for decomposing a set of data into its intrinsic variability, preserving the maximum variability of the data in fewer dimensions (jolliffe, ). in the ideal case, each of the principal components would show one layer from the image.
ica is designed to separate sources of signals from a series of measurements (hyvärinen et al., ). independent components are not ranked, and the energy of each dimension is neither preserved nor meaningful. it behaves similarly to pca but can give different results. again, the aim is for each of the independent components to show a different layer from the image.
lsma decomposes multispectral image data into layers of materials by using a priori knowledge of the spectral signals of materials that are present (heinz and chang, ). this requires knowledge of the absorption spectrum of each dye, which might not always be available.
further descriptions of these techniques and applications are available in chapter of giacometti ( ).
non-linear or non-rigid transformations are those that affect one area of an image in a different way to other areas (hill et al., ), thus allowing compensation when, for example, one side of the parchment has shrunk.
the amount of information in an image can be formally calculated as the entropy of the image. a blank image has no information and has entropy= , whereas a completely random image carries maximum information (in that the value of one pixel cannot be predicted by that of its neighbours) and therefore has maximum entropy. the information shared between two images can be given as the joint entropy which increases as two images differ, because if the images are different, one cannot be used to predict the other. if the entropy h of an image x is h(x) and the joint entropy of images x and y is h(x,y), then we can define the mutual information i(x,y) as the information shared between two images, or equivalently their similarity. then, formally, $i(x,y) = h(x) + h(y) - h(x,y)$.
the procedure sometimes involved softening the parchment using a mixture of cheese, milk, and lime, before proceeding to scrape the writing using a knife or razor (diringer, ).
full data are available in appendix c of giacometti ( ).
the doi for this dataset is . / .ds.
the figures included in this paper were originally published in giacometti ( ).
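the mutual information defined in the notes above can be estimated from grey-level histograms. the python sketch below is an illustration only, not the authors' implementation: the function names `entropy` and `mutual_information`, the use of numpy, the base-2 logarithm, and the default of 32 histogram bins are all assumptions introduced here.

```python
import numpy as np


def entropy(counts):
    """shannon entropy, in bits, of a histogram given as an array of counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()  # empty bins contribute nothing
    return float(-np.sum(p * np.log2(p)))


def mutual_information(x, y, bins=32):
    """estimate i(x,y) = h(x) + h(y) - h(x,y) for two images of equal size.

    the joint histogram counts how often a grey level in x coincides with a
    grey level in y at the same pixel; its row and column sums give the two
    marginal histograms.
    """
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    h_x = entropy(joint.sum(axis=1))   # marginal histogram of x
    h_y = entropy(joint.sum(axis=0))   # marginal histogram of y
    h_xy = entropy(joint.ravel())      # joint histogram of (x, y)
    return h_x + h_y - h_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    before = rng.integers(0, 256, size=(64, 64)).astype(float)
    unrelated = rng.integers(0, 256, size=(64, 64)).astype(float)
    print("image against itself:", mutual_information(before, before))
    print("image against noise: ", mutual_information(before, unrelated))
```

an image compared with itself yields its own entropy, while two unrelated images yield a much lower value. with small images and many bins the histogram estimate is biased upwards, so the bin count should be chosen with the image size in mind.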
information
article
software support for discourse-based textual information analysis: a systematic literature review and software guidelines in practice
patricia martin-rodilla * and miguel sánchez
information retrieval lab (irlab), facultade de informática, university of a coruña, c.p. a coruña, spain
chocosoft s.l., c.p. santiago de compostela, spain; miguel@chocosoft.net
* correspondence: patricia.martin.rodilla@udc.es
received: february ; accepted: may ; published: may

abstract: the intrinsic characteristics of humanities research require technological support and software assistance that necessarily involve the analysis of textual narratives. when these narratives become increasingly complex, pragmatics analysis (i.e., at discourse or argumentation levels) assisted by software is a great ally in the digital humanities. in recent years, solutions have been developed from the information visualization domain to support discourse analysis or argumentation analysis of textual sources via software, with applications in political speeches, debates, online forums, but also in written narratives, literature or historical sources. this paper presents a wide and interdisciplinary systematic literature review (slr), both in software-related areas and humanities areas, on the information visualization and the software solutions adopted to support pragmatics textual analysis. as a result of this review, this paper detects weaknesses in existing works in the field, especially related to solutions’ availability, pragmatic framework dependence and lack of information sharing and reuse software mechanisms.
the paper also provides some software guidelines for addressing the detected weaknesses, exemplifying some guidelines in practice through their implementation in a new web tool, viscourse. viscourse is conceived as a complementary tool to assist textual analysis and to facilitate the reuse of informational pieces from discourse and argumentation text analysis tasks.

keywords: discourse analysis; argumentation analysis; information visualization; software assistance; viscourse; systematic literature review; information reuse

information , , ; doi: . /info www.mdpi.com/journal/information

introduction

information management in humanities disciplines necessarily involves the analysis of natural language textual sources at any level. recently, the digital humanities area has begun to include initiatives and works on how to assist textual analysis tasks via software (see chapters – from [ ] for some examples). this software assistance mainly takes two directions: ( ) natural language processing automatization solutions and ( ) the development of visualization techniques to visualize results from automation tasks (from group ) or to help with manual or semiautomatic textual analysis, including annotation tasks, adapted to humanistic disciplines.

regarding the second kind of software assistance, there are several proposals at visualization level [ , ] for visualizing grammatical or lexical structures of the texts, dealing with morphology, lexicology or syntax levels of linguistic analysis. however, when we focus on more conceptual or relational aspects of the textual sources (at semantics or pragmatics levels), the amount of software assistance available decreases. although textual analysis at the discursive or argumentation level is currently being applied in a wide range of studies and disciplines (political speeches, debates, online forums, historical sources, among others), there are no systematic studies on the software assistance offered at discursive or argumentative levels.

this paper presents a wide and interdisciplinary systematic literature review (slr), both in software-related areas and humanities areas, on the information visualization and the software solutions adopted to support pragmatics textual analysis at discursive or argumentation levels. note that, although we are conscious that linguistically and even philosophically the discursive and argumentation approaches to text analysis have multiple analysis possibilities and frameworks, this paper treats the discursive-argumentation information at the same level, regardless of the linguistic framework that is taken as a basis, in order to perform a more complete study.

the paper is organized as follows: section details the materials and methods employed (the systematic literature review methodology, search strategies and inclusion/exclusion criteria employed). section presents the systematic review results, a first discussion of the weaknesses detected and the answers to the research questions defined for this slr study. based on these results, section presents some software guidelines identified for improvement and an implementation proposal for the guidelines’ application in a web tool, viscourse. finally, section discusses future directions.

materials and methods

this section presents the systematic literature review (slr) performed, detailing the methodology, search strategies and inclusion/exclusion criteria adopted.
systematic literature review

in order to analyze existing solutions on software visualization for supporting or assisting discursive or argumentation textual analysis, we have performed a systematic literature review (hereafter slr). we have followed the slr guidelines by kitchenham and charters in [ ], where a systematic literature review is defined as a methodology for “identifying, evaluating and interpreting all available research relevant to a particular research question, or topic area, or phenomenon of interest” [ ]. the rest of this section documents our slr process.

research questions

the first step in slr methodology consists of defining research questions that underpin the slr process. we have identified relevant research questions to guide the review:

rq : what evidence is there that discourse and argumentation textual analysis is currently supported via information visualization software?
rq : what kind of support are these works providing and how is it implemented? we have defined seven categories for analyzing the kind of support provided by the main contribution presented in these works:
• infovis technique: the support provided is mainly visual. for example, a new information visualization technique or its application to a new kind of discursive or argumentation textual information.
• linguistic resource: the main support is provided by offering a new linguistic resource: a new corpus, annotation information, taxonomy, ontological information, etc.
• complete software tool: the assistance is materialized as an entire new software tool.
• application example: the support is provided by illustrating a new discursive or argumentation analysis in a new domain of application or corpora.
• discourse metrics measurement: the support is provided by implementing software mechanisms to calculate discursive or argumentation standard metrics for helping in the textual analysis.
• fully automatic analysis support: the support is provided by implementing automatic software solutions for visualizing automatic discursive or argumentation analysis: fully automatic parsers, new algorithms or machine learning techniques, etc.
• survey/empirical or qualitative study: works focused on qualitative analysis.
rq : does the software support offered in these works present any weakness or deficiency reported in the study itself or detected as a result of the review?
rq : is it possible to identify some software guidelines for improving the existing information visualization software solutions for supporting or assisting discourse and argumentation textual analysis tasks?

figure details the slr process performed. first, we identified the main research repositories for finding relevant works that answer our research questions. then, we eliminated duplicates and defined a filtered inclusion/exclusion strategy. finally, we analyzed the title and abstract of the resultant publications. in this step, we also applied some quality assessment criteria as a checklist to the works obtained. this quality assessment step ensured that the works reviewed have no important bias and fit the scope and relevance criteria. the resultant set of publications constitutes the relevant work done on visualization software assistance (labeled as group in the introduction) for supporting or assisting discourse and argumentation textual analysis tasks.
figure . systematic literature review (slr) process steps, inspired by the kitchenham and charters guidelines ( ) [ ].

sources, search strategies and filtered criteria

in order to answer the research questions, we have defined a combined search strategy over two different kinds of sources. on the one hand, we have searched four well-known international digital libraries of research publications. specifically, we have chosen science direct [ ], springer link [ ], acm library [ ] and ieee xplore [ ] due to their accessibility, reliability and relevance in software engineering. on the other hand, we have also performed the same searches in repositories relevant to the digital humanities that contain works on software engineering and discourse or argumentation textual analysis.
specifically, we have reviewed the acl anthology [ ] and the rst (rhetorical structure theory) repository [ – ] as the main sources for works on computational linguistics in the area. moreover, we have included the two main international research journals on digital humanities: digital scholarship in the humanities (henceforth dsh) [ ] and digital humanities quarterly (henceforth dhq) [ ]. the queries include as keywords the relevant terms for retrieving computational information visualization solutions that assist discourse or argumentation textual analysis, such as “discourse analysis”, “argument mining”, “information visualization”, “software tool”, etc. note that, in order to deal with a manageable number of publications, in this first phase of the slr we also add filters for the date of publication and the publisher. because some repositories also act as hubs for the publications of other repositories (e.g., acm library), it is necessary to limit the searches to their own resources in order to avoid duplicates. table summarizes the repositories, the queries performed, and the number of initial results obtained.
table . slr repositories, search queries and the number of resultant publications.
repository | search query | number of results
springer link | (“discourse analysis” or “argument mining”) and (“information visualization” or “visualization” or “visual analytics”) and (“software” or “tool”); filter - |
science direct | (“discourse analysis” or “argument mining”) and (“information visualization” or “visualization” or “visual analytics”) and (“software” or “tool”); filter - |
acm library | [[all: “discourse analysis”] or [all: “argument mining”]] and [[all: “information visualization”] or [all: “visualization”] or [all: “visual analytics”]] and [[all: “software”] or [all: “tool”] or [all: “]] and [publication date: ( / / to / / )]; filter acm publisher |
ieee xplore | (‘discourse and analysis’ or ‘argument mining’) and (‘information and visualization’ or ‘visualization’ or ‘visual analytics’) and (‘software’ or ‘tool’); filter - |
acl anthology | (“discourse analysis” or “argument mining”) and (“information visualization” or “visualization” or “visual analytics”) and (“software” or “tool”) |
rst repository | “software” |
dsh journal | (discourse analysis or argument mining and information visualization or visual analytics and software or tool). published: after january |
dhq journal | query: (“discourse analysis” or “argument mining”) and (“information visualization” or “visualization” or “visual analytics”) and (“software” or “tool”) |
total | |

as table shows, the preliminary set of publications consists of works. following the slr guidelines, a set of inclusion/exclusion criteria is defined, focusing on original and recent works on information visualization implemented via software solutions for supporting discourse or argumentation analysis tasks. thus, the following refinement criteria were applied:

• the year of publication is between and .
• original publications written in english.
• only original publications: papers in journals and full papers in conferences (also edited as chapters), excluding workshops.
• only publications with a scope in computer science (software engineering, information visualization and computational linguistics included) and linguistics/discourse/argumentation-related areas.
• only publications associated with original software, the use of existing software, demonstrators or tools that provide visual support for discursive/argumentation textual analysis.

applying these criteria, we obtained an intermediate reduced pool of different publications. finally, a quality assessment step is applied to this intermediate set (see next section).

quality assessment

due to the heterogeneity and diverse sources of the publications included in the presented slr, we systematically applied the following checklist as a quality assessment mechanism, answering yes (y) or no (n) to the following questions:

• q1: are the study goals clearly stated and related to textual analysis assistance, and are the software proposals clearly detailed?
• q2: are the studies proposing an original software or an original application of existing software for assisting textual analysis through visual software resources?
• q3: is the proposal validated with real text analysis cases?
• q4: is the proposal dealing with textual discursive/argumentation analysis information?
• q5: is the proposal offering some software mechanisms for promoting the reuse and sharing of the information generated during their use?

each publication of the intermediate set is evaluated by applying the checklist (yes = 1 point, no = 0 points). thus, each publication obtains a quality assessment score (qn) from qn = 0 (minimum score) to qn = 5 (maximum score). only publications that meet the following two quality requirements are included in the final set:

- publications with qn greater than 2, that is, publications with at least three affirmative answers to the quality questions.
- publications with an affirmative answer to question .
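the two inclusion requirements above can be sketched in a few lines of python. this is only an illustration, not the procedure actually used in the review: it assumes one point per affirmative answer and the stated threshold of at least three affirmative answers, and it takes the number of the mandatory question as a parameter (`required_question`), since that number is not given here. the publication names and answers are invented for the example.

```python
from dataclasses import dataclass

NUM_QUESTIONS = 5  # checklist questions q1..q5


@dataclass(frozen=True)
class Assessment:
    """Yes/no answers of one publication to the quality checklist."""
    title: str                 # hypothetical identifier for the publication
    answers: tuple[bool, ...]  # answers to q1..q5, in order

    def score(self) -> int:
        """Quality score qn: one point per affirmative answer."""
        return sum(self.answers)


def is_included(a: Assessment, required_question: int, min_yes: int = 3) -> bool:
    """Apply both requirements: enough 'yes' answers and a 'yes' to one question."""
    if len(a.answers) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} answers, got {len(a.answers)}")
    if not 1 <= required_question <= NUM_QUESTIONS:
        raise ValueError(f"required_question must be between 1 and {NUM_QUESTIONS}")
    return a.score() >= min_yes and a.answers[required_question - 1]


# Illustrative data only; these are not publications from the review.
pool = [
    Assessment("tool-a", (True, True, True, False, False)),
    Assessment("tool-b", (True, False, True, True, True)),
    Assessment("tool-c", (True, True, False, False, False)),
]

# required_question=2 is an assumption made purely for this example.
final_set = [a.title for a in pool if is_included(a, required_question=2)]
```

keeping the mandatory question as a parameter makes the rule explicit: a publication can have a high score and still be excluded if it fails that single question.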
the second requirement ensures that the contribution of each included publication is made through information visualization mechanisms, which is the main area of this study. table a shows the final quality assessment evaluation. only the publications shaded in light gray in the table are included in the final slr set. after the quality assessment phase, the final slr set is composed of publications. the next section presents and discusses in depth the slr findings and the guidelines extracted.

results

systematic literature review results

several observations can be drawn from the slr performed. to begin with, the number of publications has been reduced substantially, with search results of publications, an intermediate slr set of publications and a final slr repository of publications. this refining process has been influenced by two aspects. first, the selected keywords and their combinations (multi-word terms) used in the search queries are common in numerous disciplines and are polysemous across the different research communities. thus, the initial number of query results is quite high for an slr process, and many of the initially retrieved publications did not meet any of the quality assessment criteria proposed by our slr process. the subsequent quality assessment process therefore allowed us to focus on works that specifically proposed software solutions to assist textual analysis on a discursive and argumentation level. secondly, many of the publications reviewed, although relevant according to the keywords, did not cover the full scope of this work. these works are shown with a white background in table a and they correspond to:

• fully automatic solutions and approaches, such as automatic parsers and automatic detection or prediction methods or tools, all of which are referred to in table a .
because our goal is to focus on software assistance tools, we have not included works whose main contribution is based on the full automation of tasks, both at the level of detecting discursive or argumentation structures and at the level of automatically generating visualizations that do not allow end-user interaction.
• approaches based on pieces of information considered discursive but that do not correspond to an analysis of the complete structure of the discourse, such as applications of topic modeling, statistical studies (basic analysis of term frequencies or similar descriptive statistics), and works on metaphors, stems, taxonomies or ontologies, all of which are referred to in table a . many of them also adopt an automation approach.

we present below the synthesis of evidence from our slr. we begin with a general analysis of the results from the final slr set. next, we present the answers to the research questions previously defined.

availability

it is important to highlight the small number of freely usable tools available throughout the study area. only publications from the slr intermediate repository ( publications) provide access to at least one free demo or, in the best cases, to software repositories or fully implemented free software tools (see the last column in table a ). another small group could be accessed through institutional credentials, while most of the rest give urls in the publication that are now broken links, or were never available online.

visualization techniques employed

regarding the visualization techniques used, most of them follow a principle of visualization based on the original text, and many are based on discourse tree [ ] approaches or similar structures for argumentation. some alternative proposals, such as conceptual recurrence plots [ ], do not allow discursive or argumentation analysis based on the original text.
the visualization is carried out after the textual analyses are completed and focuses only on a specific metric (intervention turns, similarity between utterances, etc.).

discursive and argumentation framework supported

another interesting aspect extracted from the slr is that most of the tools with a high quality score (score qn = or score qn = ) are developed specifically to support a particular framework of discursive or argumentation analysis, mainly rst (rhetorical structure theory) [ ] (such as [ , ]) or iat (inference anchoring theory) [ ] (such as [ – ]), but also some ad hoc frameworks [ , ]. although, conceptually, these frameworks present similar ontologies (segments and relationships between them) and similar visual possibilities, most of the tools are conceived to assist in textual analysis using a single discursive or argumentation framework. this creates a dependency between the software tool and the pragmatic framework chosen, since users must employ that framework to carry out their textual analyses. this dependence on a discursive or argumentation framework recurs in all proposals. thus, it is not possible to extend the theoretical framework to create similar textual analyses with customized discursive or argumentation schemas.

sharing and reuse software mechanisms

finally, we have analyzed the sharing and reuse mechanisms of the final publications. although most of these software proposals focus on collaborative editing and analysis, only software tools present sharing and reuse functionality, while software tools reviewed lack any planning for sharing and reuse of the information generated during the textual analyses. the most common mechanism is exporting information in standard file formats, mainly xml (extensible markup language) or other formats such as json (javascript object notation), as an interchange mechanism.
only the rstweb tool [ ] presents extra mechanisms allowing for better reuse of the resulting analysis information. in summary, this slr shows that the discursive and argumentation aspects continue to present few software alternatives for textual analysis assistance, in comparison with the wide range of proposals at the lexical or grammatical levels of textual analysis, or in fully automatic approaches at any level. the main weaknesses detected in the existing software proposals are the dependency on one specific discursive or argumentation framework (problems generalizing the analysis), availability (problems accessing, using and keeping the software updated) and the lack of sharing and reuse mechanisms that allow for re-analyses, collaborative editing, and comparative reasoning about the textual analysis performed. the following section elaborates on these results, answering the research questions initially proposed.

answering the research questions

rq1: what evidence is there that discourse and argumentation textual analysis is currently supported via information visualization software?

appendix a shows the slr final repository, with all publications reviewed, their scores and availability. the slr allows us to answer rq and rq of this study, regarding evidence about visualization techniques that are applied to support discourse or argumentation textual analysis via software. while the lexical or grammatical levels of textual analysis are gaining methods and software tools for their assistance, the discursive and argumentation aspects continue to present few software alternatives. there are only a few recent tools [ , – ] that assist in this textual analysis through an available information visualization software resource.

rq2: what kind of support are these works providing and how is it implemented?
an interesting macro-analysis resulting from the slr process concerns the kind of software support provided by the publications, according to the categorization defined in rq2. the “main contribution” column of table a shows in bold the kind of software support contribution for each publication, according to our seven categories. note that, in some cases, one publication could present several contributions in different rq2 categories, although the common scenario is that one publication focuses on one main kind of software support contribution. thus, considering only the main category associated with each publication (the first category reported in appendix a), the distribution of software support in the works reviewed is as follows: with a total of works reviewed, the largest category ( publications) presents complete support in the form of a software tool, although the main objective and functionality of each tool may differ. next come works presenting support in the form of infovis techniques ( publications), application examples (seven publications), automatic solutions (seven publications), linguistic resources (four publications), qualitative studies (three publications) and finally the measurement of basic discourse metrics (two publications). the large number of complete software tools reviewed gives us an idea of the current interest in software support for discursive and argumentation textual analysis. however, as we have already mentioned in the results of the slr process, many of these tools present the weaknesses detected, and most of them are not even available for evaluation or use.

rq3: does the software support offered in these works present any weakness or deficiency reported in the study itself or detected as a result of the review?

regarding rq3, we found some weaknesses in the current software tools, especially related to availability, framework dependence and a lack of software mechanisms for information sharing and reuse.
based on these weaknesses, four software guidelines are defined, and their implementation is exemplified through viscourse (see section ).

rq4: is it possible to identify some software guidelines for improving the existing information visualization software solutions for supporting or assisting discourse and argumentation textual analysis tasks?

answering rq4, viscourse tries to act as a complementary software tool, allowing for the generalization of discursive or argumentation analyses thanks to its flexibility and independence from the segmentation method (free definition of segments or multi-phrase groups) and from the theoretical framework (rst or iat, among others), together with user-customizable visualization criteria (screen position, color palette, color criteria, etc.). typical users of viscourse are mainly researchers in humanistic disciplines, but also teachers or students in discourse or argumentation areas. in addition, viscourse focuses on the sharing and reuse of the analytical information generated, thanks to a black-box mechanism that hides the import/export formats from researchers without losing sharing and reuse capacity. the next section details the guidelines extracted and the solutions proposed for implementing them in viscourse.

extracted guidelines in practice

considering the slr results, we have defined a non-exclusive set of software guidelines for improving the existing information visualization software assistance provided to discursive and argumentation textual analysis:

1. textual granularity: visual mechanisms should be added to change the level of granularity of the textual analysis. this means that the user must be able to change the visual focus of the analysis, either focusing on specific text paragraphs, phrases or other textual segments, or raising the level of abstraction by calculating general metrics for the full text analyzed.
2. linguistic framework flexibility: software mechanisms should be developed to keep the visual mechanisms independent of the specific discursive or argumentation framework used for the textual analysis, allowing for the extension of the software tools to future discursive or argumentation frameworks. approaches here include separate conceptual modelling strategies for the visual solution and for each specific framework applied in each analysis performed.
3. sharing and reuse mechanisms: software mechanisms should be developed to allow the sharing and reuse of the informational pieces resulting from the textual analyses in a way that is transparent for end users. these mechanisms are particularly useful both in future analyses by the same users and in collaborative or comparative analyses by other researchers. approaches here include black-box and transparent export/import mechanisms for the informational pieces produced during the textual analyses.
4. availability alternatives: the software assistance provided should offer some availability and maintenance solutions. this does not necessarily imply free or open models for all the software tools, but rather prior planning of availability mechanisms so that the learning effort made by users is rewarded.

how could these guidelines be implemented in a real software solution for assisting discursive and argumentation textual analysis? building on previous works on discourse and argumentation studies via software [ – ], the viscourse tool [ ] is a web platform for discursive/argumentation textual analysis. in the next subsections we detail our approach for implementing the guidelines presented above in viscourse.

textual granularity

as a tool for supporting discursive/argumentation textual analysis, viscourse is initially conceived for an analysis attached to the text.
this implies that the user provides the literal text to be analyzed, segments it and performs the analysis at the discursive or argumentative level. although this is the most common use, viscourse includes some mechanisms related to guideline 1 for ensuring software support at different degrees of textual granularity. specifically, two mechanisms have been implemented related to this guideline. first, the user can vary the level of granularity of the text by grouping the textual segments into larger units, called groups. this allows one to associate discursive or argumentation characteristics with large sets of text, even entire paragraphs. secondly, it is possible to hide the literal text of the analysis performed by selecting the “simplified mode” option. in this mode, all segments and groups acquire the same size and alignment, hiding the literal text and keeping only the name of the segment or group. this simplified view allows the user to abstract from the literal text, focusing on the discursive or argumentation relationships detected and facilitating the comparison between several analyses of the same text or the calculation of simple metrics (number of discursive relationships of a specific type, etc.). in the future, we plan to offer automatic metrics calculation and comparison functionality as a module in viscourse. figures and show the same textual analysis (the “bouquets in the basket” text from the rst corpus) with different levels of granularity.
figure . viscourse textual analysis for the “bouquets in a basket” rhetorical structure theory (rst) corpus text.

figure . viscourse textual analysis for the bouquets in the basket rst corpus text. simplified mode.

linguistic framework flexibility

our slr showed that most of the current tools are developed specifically to support a particular framework of discursive or argumentation analysis, such as rstweb for rst [ ] or ova for iat [ ]. despite their theoretical and linguistic differences, most of these frameworks base the discursive or argumentation analysis on the same needs in terms of software visualization structures: textual segmentation or grouping, and the definition of discursive or argumentation relationships between segments (or similar textual elements with greater granularity). thus, guideline 2 holds that a software tool supporting this type of textual analysis should maintain a certain independence between the discursive/argumentation framework used during a given textual analysis and the visual elements employed in the tool.
this conceptual decoupling at the software tool level allows support to be extended in the future to analyses with new discursive or argumentation frameworks that may arise. to implement guideline 2, the initial version of viscourse decouples the visual mechanisms from the discursive/argumentation framework used. first, viscourse allows for textual analysis based on user-customized segmentation of the text and on the grouping of segments into wider levels that also allow for analysis on a multi-phrase or paragraph level. viscourse currently allows for the analysis of relationships between segments or multi-phrase groups, with highly customizable features, so any discursive and/or argumentation relationship framework can be used. in addition, on a visualization level, the text follows existing approaches such as discourse trees, but adds color visualization for segments or multi-phrase groups. the colors of the groups and of the discursive or argumentation relationships are also selectable by the user, which allows them to apply color criteria (it is common, for example, to use red for discursive and/or argumentation relationships with contrast or disagreement semantics). it is also possible to highlight only a specific type of relation or segment in the tool for visual clarity.
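the decoupling described in this subsection can be illustrated with a small data model in which segments, groups and relationships carry no knowledge of rst, iat or any other framework: relation types and their colors are plain user-defined entries. this is a minimal sketch under our own assumptions, not the viscourse implementation; the class names (`Unit`, `Relation`, `Analysis`), their fields and the sample texts are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A text segment, or a group that bundles other units."""
    name: str
    text: str = ""                                    # literal text; empty for groups
    members: list[str] = field(default_factory=list)  # child unit names, for groups
    color: str = "#cccccc"                            # user-selected display color


@dataclass
class Relation:
    source: str
    target: str
    kind: str  # name of a user-defined relation type


@dataclass
class Analysis:
    units: dict[str, Unit] = field(default_factory=dict)
    relation_colors: dict[str, str] = field(default_factory=dict)  # type -> color
    relations: list[Relation] = field(default_factory=list)

    def add_unit(self, unit: Unit) -> None:
        self.units[unit.name] = unit

    def define_relation(self, kind: str, color: str) -> None:
        """Register a relation type; nothing here is tied to one framework."""
        self.relation_colors[kind] = color

    def relate(self, source: str, target: str, kind: str) -> None:
        if source not in self.units or target not in self.units:
            raise KeyError("both ends of a relation must be known units")
        if kind not in self.relation_colors:
            raise KeyError(f"undefined relation type: {kind}")
        self.relations.append(Relation(source, target, kind))

    def highlight(self, kind: str) -> list[Relation]:
        """Relations of a single type, e.g. to emphasize them on screen."""
        return [r for r in self.relations if r.kind == kind]

    def relation_counts(self) -> Counter:
        """Simple metric: number of relations per type."""
        return Counter(r.kind for r in self.relations)

    def label(self, name: str, simplified: bool = False) -> str:
        """Display label; the simplified form hides the literal text."""
        unit = self.units[name]
        if simplified or not unit.text:
            return unit.name
        return f"{unit.name}: {unit.text}"


analysis = Analysis()
analysis.add_unit(Unit("s1", "first segment of the text"))
analysis.add_unit(Unit("s2", "second segment of the text"))
analysis.add_unit(Unit("s3", "third segment of the text"))
analysis.add_unit(Unit("g1", members=["s1", "s2"], color="#ffd27f"))
analysis.define_relation("contrast", "red")      # red for contrast semantics
analysis.define_relation("elaboration", "blue")
analysis.relate("s1", "s2", "contrast")
analysis.relate("g1", "s3", "elaboration")
analysis.relate("s2", "s3", "contrast")
```

because relation types are data rather than code, supporting a different framework in this sketch only means calling `define_relation` with a different vocabulary; the grouping, highlighting and counting logic stays unchanged.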
figure shows an example of the well-known bouquets in a basket text from the rst corpus [ ] analyzed using viscourse. viscourse natively implements rst analysis [ , ] due to the applicability of the platform to ongoing projects. in addition, thanks to the customization possibilities in terms of segment, group and relationship definitions, it is possible to perform multiple analyses using different discursive or argumentation frameworks as a basis.

sharing and reuse mechanisms

as our slr showed, a few of the reviewed tools have automatic mechanisms for exporting the information generated during the analysis [ , , – ], which allows some initial steps toward sharing and reusing this information. most of these works use mechanisms that require knowledge of certain file exchange formats, in most cases requiring the researcher to also edit the generated files. viscourse also includes the classic import/export mechanisms through editable standard file formats from the web platform (figure shows the json editor included in viscourse, with three options for file view: code, tree or node view). this mechanism turns viscourse into a complete json editor for the discourse-based information produced during a textual analysis. note that the tool allows users to save multiple analyses in their user account, maintaining a visualization carousel for each user. each analysis performed generates an underlying json file with information about the original textual sources, the segments, groups, relationships, etc., created by the user, together with the color and position information that allows the software to replicate and export each analysis.
figure . viscourse classic import/export mechanisms through editable javascript object notation (json) files from the viscourse web platform, with three options for json file view: code, tree or node view.

however, in our experience, many of the analyses performed are reused by the researchers themselves, or shared with colleagues for comparison or communication purposes, and many of these researchers do not need to edit the files generated by the applications for this purpose.
for this reason, viscourse also includes, following guideline , an import/export mechanism that acts as a “black box” for the researcher, through its own viscourse code mechanism. the implementation encapsulates all the information produced during a textual analysis (information about the original text, the discursive or argumentation information produced during the analysis, and the visual decisions about colors, positions, etc., taken by the user) in a black-box piece associated with an automatically generated unique code. the black-box import/export mechanism in viscourse follows this workflow:

• once the textual analysis is finished, the user selects “share selected visualization” in the options menu. an export message is shown. the user can then generate an internal viscourse code that matches the json file created for the textual analysis, with exactly their visualization and textual analysis parameters.
• the generated code is shown to the user, with automatic copy options. by sharing the viscourse code with any other viscourse user, the textual analysis can be imported into any web viscourse session.
• to import, the user selects the “import from code” option near their visualization carousel on the main screen. pasting the viscourse code is enough to reproduce the textual analysis and all its visualization options in other viscourse sessions or user accounts.

figure visually summarizes this workflow, showing the on-screen actions performed by the user in an export/import operation.

figure . viscourse code, a black-box mechanism for sharing and reusing textual analysis information between users.
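the two sharing mechanisms described in this section can be illustrated with a minimal sketch. it is not the viscourse implementation: the class names, field names, the in-memory store and the use of a random token as the sharing code are all assumptions made for illustration. the sketch only shows the general idea of (a) serializing an analysis (source text, segments, groups, relationships, colors and positions) to editable json and (b) hiding that json behind an opaque code that another session can use to rebuild the same analysis.

```python
import json
import secrets
from dataclasses import dataclass, field, asdict


@dataclass
class Segment:
    id: str
    text: str
    color: str        # visual decision taken by the user (hypothetical field)
    position: tuple   # (x, y) placement on the canvas (hypothetical field)


@dataclass
class Relationship:
    source: str       # id of a segment or group
    target: str
    label: str        # e.g. the name of an rst relation


@dataclass
class Analysis:
    source_text: str
    framework: str
    segments: list = field(default_factory=list)
    groups: dict = field(default_factory=dict)   # group id -> member ids
    relationships: list = field(default_factory=list)

    def to_json(self) -> str:
        """Editable JSON export (the 'classic' mechanism)."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "Analysis":
        data = json.loads(payload)
        # JSON has no tuple type, so positions come back as lists.
        data["segments"] = [
            Segment(**{**s, "position": tuple(s["position"])})
            for s in data["segments"]
        ]
        data["relationships"] = [Relationship(**r) for r in data["relationships"]]
        return cls(**data)


class ShareRegistry:
    """Hypothetical store mapping opaque sharing codes to JSON payloads."""

    def __init__(self):
        self._store = {}

    def export(self, analysis: Analysis) -> str:
        # The code carries no information itself; it only identifies the payload.
        code = secrets.token_urlsafe(8)
        self._store[code] = analysis.to_json()
        return code

    def import_from_code(self, code: str) -> Analysis:
        return Analysis.from_json(self._store[code])


# usage: one user exports, another imports by pasting the code
analysis = Analysis(
    source_text="example text",
    framework="rst",
    segments=[Segment("s1", "example", "#ff0000", (10, 20)),
              Segment("s2", "text", "#00ff00", (30, 20))],
    groups={"g1": ["s1", "s2"]},
    relationships=[Relationship("s1", "s2", "elaboration")],
)
registry = ShareRegistry()
shared_code = registry.export(analysis)
restored = registry.import_from_code(shared_code)
```

in this sketch the person receiving the code never sees or edits the json, which is the point of the black-box variant, while `to_json` remains available for researchers who do want to edit the file.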
availability alternatives

as previously detailed, one of the weaknesses that our slr analysis points out is the lack of tools that are currently available to the end user. although it is not the objective of this paper to go into the many reasons why this can happen, we think that, as guideline details, prior planning of the software maintenance model and of the availability of the tools that support discursive and argumentation textual analysis is necessary. we are currently conducting this prior analysis to provide viscourse with different availability and use alternatives: free use, research licenses, free trials, paid subscriptions, availability to users in code repositories, etc. there are several alternatives to ensure that the work done is effectively available to its end users.

future steps

as shown throughout this paper, the development of software assistance for textual analysis through information visualization techniques presents great challenges. it is also an area of vital importance for humanistic and social disciplines, where researchers have special needs in terms of textual analysis assistance as compared to other disciplines. thus, the systematic literature review carried out here is a valuable contribution to the field, covering heterogeneous references in terms of source repositories, disciplines and approaches, and presenting a broad panorama of the area.
while we believe that the steps taken so far are satisfactory, further steps are planned. our immediate work includes a formal validation of the viscourse tool and of its implementation guidelines. the viscourse software tool needs to be evaluated on a wide set of case studies by its target users. an evaluation with researchers from various branches of the digital humanities is planned, in order to identify the weaknesses and strengths of the tool and to obtain an assessment from end users. regarding the connection between viscourse and the works reviewed here, future work includes exploring the possible connection between the visualization, sharing and reuse mechanisms of viscourse and current natural language processing (nlp) algorithms and approaches to the automatic extraction of linguistic information at the pragmatic level. the flexibility of viscourse in terms of the discursive or argumentation framework used, together with its black-box sharing and reuse capabilities, allows us to explore the use of viscourse as a complementary piece of software for visualizing and interacting with the outputs of different nlp discourse parsers. in addition, the next improvements to viscourse will focus on implementing a comparative visualization solution that allows several analyses of the same text, made with different frameworks or by different users, to be compared in the same interface. adding comparative features will further enhance viscourse’s reuse and sharing capabilities.

author contributions: conceptualization, p.m.-r.; formal analysis, p.m.-r.; funding acquisition, p.m.-r. and m.s.; investigation, p.m.-r. and m.s.; methodology, p.m.-r.; project administration, p.m.-r.; validation, p.m.-r. and m.s.; visualization, p.m.-r. and m.s.; writing—original draft, p.m.-r.; writing—review & editing, p.m.-r. all authors have read and agreed to the published version of the manuscript.
funding: this research was partially funded by spanish ministry of economy, industry and competitiveness under its competitive juan de la cierva postdoctoral research programme, grant fjci- - and by the “ministerio de ciencia, innovación y universidades” of the government of spain (research grant rti - -b-c , co-funded by the european regional development fund, erdf/feder program).

conflicts of interest: the authors declare no conflict of interest. the funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

appendix a

table a . quality assessment slr phase results, evaluating publications ( publications eliminating replications) from different source repositories. gray marked rows constitute publications included in the final slr repository ( publications). “main contribution” column shows in bold the kind of software support contribution according to our seven categories (defined in rq ) for each publication.

source-slrcode | title | main contribution | q | q | q | q | q | qn availability
springer link-sl [ ] | a survey on information visualization: recent advances and challenges | infovis techniques for textual analysis: discourse trees | y | y | y | y | n | n
springer link-sl [ ] | towards automatic argument extraction and visualization in a deliberative model of online consultations for local governments | annotated corpora on politics. argumentation analysis. | n | y | y | y | n | n
springer link-sl [ ] | argumentation in the us presidential elections: annotated corpora of television debates and social | ova tool application example for supporting argumentation analysis. | y | y | y | y | y | y
springer link-sl [ ] | the argument web: an online ecosystem of tools, systems and services for argumentation | ova tool + argument analytics tool for supporting argumentation analysis. | y | y | y | y | y | y
springer link-sl [ ] | sapte: a multimedia information system to support the discourse analysis and information retrieval of television programs | sapte tool for tv domain. some discourse analysis metrics. | y | n | y | y | n | n
springer link-sl [ ] | text to multi-level mindmaps | infovis techniques for textual analysis: manual mind maps | y | n | y | n | n | n
springer link-sl [ ] | knowledge building discourse explorer: a social network analysis application for knowledge building discourse | kbdex tool. some discourse analysis metrics. | y | y | n | y | n | y
springer link-sl [ ] | polycafe—automatic support for the polyphonic analysis of cscl chats | polycafe tool for automatic analysis of conversations: learning analytics domain. | y | y | y | n | y | n
springer link-sl [ ] | polycafe - polyphonic conversation analysis and feedback | polycafe tool for automatic analysis of conversations. | y | y | y | n | y | n
springer link-sl [ ] | facilitating the analysis of discourse phenomena in an interoperable nlp platform | u-compare tool for nlp workflows construction: automatic discourse extensions. | y | n | y | y | n | n
springer link-sl [ ] | computer assisted text analysis in the social sciences (alceste tool) | alceste tool. some discourse analysis metrics. | y | y | n | n | n | n
springer link-sl [ ] | mass collaboration on the web: textual content analysis by means of natural language processing | nlp application example on mass collaboration domain. | y | n | n | n | n | n
science direct-sd [ ] | towards computational discourse analysis: a methodology for mining twitter backchanneling conversations | methodology and application example on automatic concept map creation from twitter data. | y | y | y | n | n | n
science direct-sd [ ] | margot: a web server for argumentation mining | margot tool for automatic argumentation textual analysis. | y | n | y | y | n | y
science direct-sd [ ] | using visual text analytics to examine broadcast interviewing | infovis techniques for textual analysis: conceptual recurrence plots | y | y | y | y | n | n
acm lib.-acm [ ] | visual analytics of academic writing | xip tool for automatic textual analysis and application example on scientific discourse | y | n | n | n | n | n
acm lib.-acm [ ] | temporal analytics with discourse analysis: tracing ideas and impact on communal discourse | some discourse analysis metrics. | y | n | y | n | n | n
acm lib.-acm [ ] | humor, support and criticism: a taxonomy for discourse analysis about political crisis on twitter | taxonomy on politics. | y | n | y | n | n | n
acm lib.-acm [ ] | discourse-centric learning analytics | cohere tool for automatic analysis of discourse. | y | y | y | y | y | y
acm lib.-acm [ ] | experiments in automated support for argument reconstruction | automatic topic modelling and argumentation experiments. | n | n | y | y | n | n
acm lib.-acm [ ] | highly interactive and natural user interfaces: enabling visual analysis in historical lexicography | infovis techniques for textual analysis. | y | y | y | n | n | n
acm lib.-acm [ ] | using argumentative structure to interpret debates in online deliberative democracy and erulemaking | application example for supporting argumentation analysis on politics | y | n | y | y | n | n
acm lib.-acm [ ] | themestreams: visualizing the stream of themes discussed in politics | infovis techniques for textual analysis: themestreams | y | y | y | n | n | n
acm lib.-acm [ ] | web-retrieval supported argument space exploration | automatic information retrieval methods for argumentation analysis. | y | y | y | n | n | n
acm lib.-acm [ ] | visualizing natural language descriptions: a survey | survey on graphical systems for natural language support. | y | n | n | n | n | n
acm lib.-acm [ ] | marius, the giraffe: a comparative informatics case study of linguistic features of the social media discourse | application example on social media. some discourse analysis metrics. | y | y | n | n | n | n
acm lib.-acm [ ] | single or multiple conversational agents?: an interactional coherence comparison | application example on chatbots. some discourse analysis metrics. | y | n | y | n | n | n
acm lib.-acm [ ] | analyzing wikipedia deletion debates with a group decision-making forecast model | automatic machine learning technique. application example on debates. | n | n | y | n | n | n
ieee xplore-ix [ ] | current work practice and users’ perspectives on visualization and interactivity in business intelligence | qualitative empirical study on infovis + business intelligence uses. | y | n | n | n | n | n
ieee xplore-ix [ ] | visualization of sensory perception descriptions | wine fingerprints + topics themes infovis tools. sentiment analysis and topic modelling applications. | y | y | y | y | n | y
ieee xplore-ix [ ] | conceptual recurrence plots: revealing patterns in human discourse | infovis techniques for textual analysis: conceptual recurrence plots | y | y | y | y | n | n
ieee xplore-ix [ ] | visual unrolling of network evolution and the analysis of dynamic discourse | infovis technique: dcra visualization prototype | y | y | y | n | n | n
ieee xplore-ix [ ] | a survey on computer assisted qualitative data analysis software | survey on data analysis software. | y | n | n | n | n | n
ieee xplore-ix [ ] | a tool for discourse analysis and visualization | tool for supporting discourse analysis. | y | y | y | y | n | n
ieee xplore-ix [ ] | robust adaptive discourse parsing for e-learning fora | agora tool for automatic contrast parsing on internet forums | y | n | y | y | n | n
ieee xplore-ix [ ] | assessing collaborative process in cscl with an intelligent content analysis toolkit | some discourse analysis metrics. | y | n | y | n | n | n
ieee xplore-ix [ ] | epicurus: a platform for the visualisation of forensic documents based on a linguistic approach | epicurus tool for supporting discourse analysis. | y | y | y | y | n | n
ieee xplore-ix [ ] | text cohesion visualizer | text cohesion tool for infovis techniques. some discourse analysis metrics. | y | y | y | y | n | n
ieee xplore-ix [ ] | a pilot study of cztalk: a graphical tool for collaborative knowledge work | infovis techniques: graph visualizations for discourse. | y | n | y | n | n | n
ieee xplore-ix [ ] | the competency building process of human computer interaction in game-based teaching: adding the flexibility of an asynchronous format | application example on massively multiplayer online games (mmog) domain. some discourse analysis metrics. | n | n | y | n | n | n
acl anth.-acl [ ] | arguminsci: a tool for analyzing argumentation and rhetorical aspects in scientific writing | arguminsci tool: automatic argumentation and discourse parsing. | y | y | y | y | n | y
acl anth.-acl [ ] | two practical rhetorical structure theory parsers | automatic discourse parsers. | y | n | y | y | n | y
acl anth.-acl [ ] | capturing chat: annotation and tools for multiparty casual conversation | infovis technique + stave tool for conversational analysis. | y | n | y | n | n | n
acl anth.-acl [ ] | interactive exploration of asynchronous conversations: applying a user-centered approach to design a visual text analytic system | infovis techniques for conversational analysis. | y | n | n | n | n | n
acl anth.-acl [ ] | rstweb–a browser-based annotation interface for rhetorical structure theory and discourse relations | rstweb tool for supporting discourse analysis. | y | y | y | y | y | y
acl anth.-acl [ ] | tree annotator: versatile visual annotation of hierarchical text relations | tree annotator tool: graphical tool for annotating tree-like structures | y | y | y | y | y | y
acl anth.-acl [ ] | the impact of modeling overall argumentation with tree kernels | automatic representation methodology for argumentation | y | n | y | y | n | n
acl anth.-acl [ ] | ilcm - a virtual research infrastructure for large-scale qualitative data | ilcm tool for discourse analysis. | y | y | y | y | y | y
rst-rst [ ] | the gum corpus: creating multilayer resources in the classroom | annotated corpora on education. rst discourse analysis. | y | y | y | n | n | y
rst -duplicated | rstweb - a browser-based annotation interface for rhetorical structure theory and discourse relations | - | - | - | - | - | - | -
dsh journal-dsh [ ] | paperminer—a real-time spatiotemporal visualization for newspaper articles | infovis techniques. application example on newspapers. some discourse metrics. | y | y | y | n | n | n
dsh journal-dsh [ ] | mining ethnicity: discourse-driven topic modelling of immigrant discourses in the usa | automatic topic modelling. application example on historical texts. | y | n | y | n | n | n
dsh journal-dsh [ ] | exploratory thematic analysis for digitized archival collections | tome tool: automatic topic modelling. application example on historical texts. | y | n | y | n | n | n
dsh journal-dsh [ ] | non-representational approaches to modeling interpretation in a graphical environment | infovis techniques for textual analysis. | n | y | n | n | n | n
dsh journal-dsh [ ] | supporting exploratory text analysis in literature study | application example on literature. some discourse analysis metrics. | y | n | y | n | n | n
dsh journal-dsh [ ] | non-traditional prosodic features for automated phrase break prediction | automatic phrase break prediction review. | n | n | y | n | n | n
dsh journal-dsh [ ] | analysis of variation significance in artificial traditions using stemmaweb | stemmaweb tool for stemmatology. | y | n | y | n | n | n
dsh journal-dsh [ ] | networks of networks: a citation network analysis of the adoption, use, and adaptation of formal network techniques in archaeology | automatic network analysis techniques. application examples on archaeology. | y | n | y | n | n | n
dsh journal-dsh [ ] | ontology-based analysis of the large collection of historical hebrew manuscripts | manual ontology analysis. application example on ancient texts. | n | y | y | n | n | y
dhq journal-dhq [ ] | a pedagogy for computer-assisted literary analysis: introducing galgo (golden age literature glossary online) | taxonomy-glossary resource. | y | n | y | n | n | n

references
schreibman, s.; siemens, r.; unsworth, j. a new companion to digital humanities; john wiley & sons: chichester, uk, .
kucher, k.; kerren, a. (eds.) text visualization techniques: taxonomy, visual survey, and community insights. in proceedings of the ieee pacific visualization symposium (pacificvis), hangzhou, china, – april .
alharbi, m.; laramee, r.s. sos textvis: an extended survey of surveys on text visualization. computers , , .
kitchenham, b.; charters, s. guidelines for performing systematic literature reviews in software engineering; ebse report no. - ; durham university: durham, uk, .
elsevier. sciencedirect® elsevier, b.v. . available online: https://www.sciencedirect.com/ (accessed on may ).
springer. springer link. springer nature switzerland ag. . available online: https://link.springer.com/ (accessed on may ).
acm. acm digital library. association for computing machinery. . available online: https://dl.acm.org/ (accessed on may ).
ieee. ieee xplore. . available online: https://ieeexplore.ieee.org/xplore/home.jsp (accessed on may ).
acl. acl anthology. association for computational linguistics (acl). . available online: https://www.aclweb.org/anthology/ (accessed on may ).
mann, w.c.; taboada, m. rst—rhetorical structure theory – . available online: https://www.sfu.ca/rst/ tools/index.html (accessed on may ).
mann, w.c.; thompson, s.a. rhetorical structure theory: toward a functional theory of text organization. j. study discourse , , – .
mann, w.c.; thompson, s.a. rhetorical structure theory: a theory of text organization; information sciences institute: los angeles, ca, usa, .
taboada, m.; mann, w.c. rhetorical structure theory: looking back and moving ahead. discourse stud. , , – .
taboada, m.; mann, w.c. applications of rhetorical structure theory. discourse stud. , , – .
dsh. digital scholarship in the humanities. oxford university press. . available online: https://academic.oup.com/dsh (accessed on may ).
dhq. digital humanities quarterly. association for computers and the humanities (ach) and the alliance of digital humanities organizations (adho). . available online: http://digitalhumanities.org/dhq/about/about.html (accessed on may ).
zhao, j.; chevalier, f.; collins, c.; balakrishnan, r. facilitating discourse analysis with interactive visualization. ieee trans. vis. comput. graph. , , – .
angus, d.; smith, a.; wiles, j. conceptual recurrence plots: revealing patterns in human discourse. ieee trans. vis. comput. graph. , , – .
zeldes, a. (ed.) rstweb—a browser-based annotation interface for rhetorical structure theory and discourse relations. in proceedings of the conference of the north american chapter of the association for computational linguistics: demonstrations, san diego, ca, usa, – june .
budzynska, k.; reed, c. whence inference? technical report; university of dundee: dundee, uk, .
visser, j.; konat, b.; duthie, r.; koszowy, m.; budzynska, k.; reed, c. argumentation in the us presidential elections: annotated corpora of television debates and social media reaction. lang. resour. eval. , , – .
lawrence, j.; park, j.; budzynska, k.; cardie, c.; konat, b.; reed, c. using argumentative structure to interpret debates in online deliberative democracy and erulemaking. acm trans. internet technol. , , – .
reed, c.; budzynska, k.; duthie, r.; janier, m.; konat, b.; lawrence, j.; pease, a.; snaith, m. the argument web: an online ecosystem of tools, systems and services for argumentation. philos. technol. , , – .
niekler, a.; bleier, a.; kahmann, c.; posch, l.; wiedemann, g.; erdogan, k.; heyer, g.; strohmaier, m. ilcm-a virtual research infrastructure for large-scale qualitative data. arxiv, ; arxiv: .
de liddo, a.; shum, s.b.; quinto, i.; bachler, m.; cannavacciuolo, l. (eds.) discourse-centric learning analytics. in proceedings of the st international conference on learning analytics and knowledge, banff, al, canada, february– march .
martín-rodilla, p.; gonzalez-perez, c. (eds.) an iso/iec -derived modelling language for discourse analysis. in proceedings of the ieee th international conference on research challenges in information science (rcis), marrakech, morocco, – may .
gamallo, p.; martín-rodilla, p.; calderón, b. (eds.) identifying causal relations in legal documents with dependency syntactic analysis. in proceedings of the th symposium on languages, applications and technologies (slate ), coimbra, portugal, – june .
martin-rodilla, p. digging into software knowledge generation in cultural heritage; springer: cham, switzerland, ; isbn - - - - .
martin-rodilla, p.; sanchez, m. viscourse. . available online: https://viscourse.org/ (accessed on may ).
liu, s.; cui, w.; wu, y.; liu, m. a survey on information visualization: recent advances and challenges. vis. comput. , , – .
bembenik, r.; andruszkiewicz, p. (eds.) towards automatic argument extraction and visualization in a deliberative model of online consultations for local governments. in proceedings of the east european conference on advances in databases and information systems, prague, czech republic, – august .
pereira, m.h.; de souza, c.l.; pádua, f.l.; silva, g.d.; de assis, g.t.; pereira, a.c. sapte: a multimedia information system to support the discourse analysis and information retrieval of television programs. multimed. tools appl. , , – .
elhoseiny, m.; elgammal, a. text to multi-level mindmaps. multimed. tools appl. , , – .
oshima, j.; oshima, r.; matsuzawa, y. knowledge building discourse explorer: a social network analysis application for knowledge building discourse. educ. technol. res. dev. , , – .
trausan-matu, s.; dascalu, m.; rebedea, t. polycafe—automatic support for the polyphonic analysis of cscl chats. int. j. comput.-supported collab. learn. , , – .
dascalu, m. polycafe-polyphonic conversation analysis and feedback. in analyzing discourse and text complexity for learning and collaborating; springer: cham, switzerland, ; pp. – .
batista-navarro, r.t.; kontonatsios, g.; mihăilă, c.; thompson, p.; rak, r.; nawaz, r.; korkontzelos, i.; ananiadou, s. facilitating the analysis of discourse phenomena in an interoperable nlp platform. in proceedings of the international conference on intelligent text processing and computational linguistics cicling , samos, greece, – march .
brier, a.; hopp, b. computer assisted text analysis in the social sciences. qual. quant. , , – .
habernal, i.; daxenberger, j.; gurevych, i. mass collaboration on the web: textual content analysis by means of natural language processing. in mass collaboration and education; springer: cham, switzerland, ; pp. – .
lipizzi, c.; dessavre, d.g.; iandoli, l.; marquez, j.e.r. towards computational discourse analysis: a methodology for mining twitter backchanneling conversations. comput. hum. behav. , , – .
lippi, m.; torroni, p. margot: a web server for argumentation mining. expert syst. appl. , , – .
angus, d.; fitzgerald, r.; atay, c.; wiles, j. using visual text analytics to examine broadcast interviewing. discourse context media , , – .
simsek, d.; shum, s.b.; de liddo, a.; ferguson, r.; sándor, Á. (eds.) visual analytics of academic writing. in proceedings of the th international conference on learning analytics and knowledge, indianapolis, in, usa, – march .
lee, a.v.y.; tan, s.c. (eds.) temporal analytics with discourse analysis: tracing ideas and impact on communal discourse. in proceedings of the th international learning analytics & knowledge conference, vancouver, bc, canada, – march .
teixeira, c.r.g.; kurtz, g.; leuck, l.p.; tietzmann, r.; de souza, d.r.; lerina, j.m.f.; manssour, i.h.; silveira, m.s. humor, support and criticism: a taxonomy for discourse analysis about political crisis on twitter. in proceedings of the th annual international conference on digital government research: governance in the data age, delft, the netherlands, may– june .
winkels, r.; douw, j.; veldhoen, s. experiments in automated support for argument reconstruction. in proceedings of the th international conference on artificial intelligence and law, rome, italy, – june .
therón, r.; seguín, c.; de la cruz, l.; vaquero, m. highly interactive and natural user interfaces: enabling visual analysis in historical lexicography. in proceedings of the first international conference on digital access to textual cultural heritage, madrid, spain, — may .
de rooij, o.; odijk, d.; de rijke, m. themestreams: visualizing the stream of themes discussed in politics. in proceedings of the th international acm sigir conference on research and development in information retrieval, dublin, ireland, jul– august .
thiel, m.; ludwig, p.; mossakowski, t.; neuhaus, f.; nürnberger, a. web-retrieval supported argument space exploration. in proceedings of the conference on conference human information interaction and retrieval, oslo, norway, – march .
hassani, k.; lee, w.-s. visualizing natural language descriptions: a survey. acm comput. surv. , , – .
zimmerman, c.; chen, y.; hardt, d.; vatrapu, r. marius, the giraffe: a comparative informatics case study of linguistic features of the social media discourse. in proceedings of the th acm international conference on collaboration across boundaries: culture, distance & technology, kyoto, japan, – august .
chaves, a.p.; gerosa, m.a. single or multiple conversational agents? an interactional coherence comparison. in proceedings of the chi conference on human factors in computing systems, montreal, qc, canada, – april .
mayfield, e.; black, a.w. analyzing wikipedia deletion debates with a group decision-making forecast model. proc. acm hum. comput. interact. , , – .
aigner, w. current work practice and users’ perspectives on visualization and interactivity in business intelligence. in proceedings of the th international conference on information visualisation, london, uk, – july .
kerren, a.; prangova, m.; paradis, c. visualization of sensory perception descriptions. in proceedings of the th international conference on information visualisation, london, uk, – july .
brandes, u.; corman, s.r. visual unrolling of network evolution and the analysis of dynamic discourse. inf. vis. , , – .
reis, l.p.; costa, a.p.; de souza, f.n. a survey on computer assisted qualitative data analysis software. in proceedings of the th iberian conference on information systems and technologies (cisti), las palmas, spain, – june .
chiru, c.-g.; trausan-matu, s. a tool for discourse analysis and visualization. in proceedings of the rd international conference on emerging intelligent data and web technologies, bucharest, romania, – september .
lucas, n.; giguet, e. robust adaptive discourse parsing for e-learning fora. in proceedings of the th ieee international conference on advanced learning technologies, santander, spain, – july .
li, y.; wang, j.; liao, j.; zhao, d.; huang, r. assessing collaborative process in cscl with an intelligent content analysis toolkit. in proceedings of the th ieee international conference on advanced learning technologies (icalt ), niigata, japan, – july .
somaraki, v.; xu, z. epicurus: a platform for the visualisation of forensic documents based on a linguistic approach. in proceedings of the nd international conference on automation and computing (icac), colchester, uk, – september .
nukoolkit, c.; chansripiboon, p.; mongkolnam, p.; todd, r.w. text cohesion visualizer. in proceedings of the th international conference on computer science & education (iccse), singapore, – august .
lam, h.; fisher, b.; dill, j. a pilot study of cztalk: a graphical tool for collaborative knowledge work. in proceedings of the th annual hawaii international conference on system sciences, big island, hi, usa, january .
emad, s.; halvorson, w.; broillet, a.; dunwell, n. the competency building process of human computer interaction in game-based teaching: adding the flexibility of an asynchronous format. in proceedings of the ieee international professonal communication conference, vancouver, bc, canada, – july .
lauscher, a.; glavaš, g.; eckert, k. arguminsci: a tool for analyzing argumentation and rhetorical aspects in scientific writing. in proceedings of the th workshop on argument mining, brussels, belgium, october– november .
surdeanu, m.; hicks, t.; valenzuela-escárcega, m.a. two practical rhetorical structure theory parsers. in proceedings of the conference of the north american chapter of the association for computational linguistics: demonstrations, denver, co, usa, may– june .
gilmartin, e.; campbell, n. capturing chat: annotation and tools for multiparty casual conversation. in proceedings of the th international conference on language resources and evaluation (lrec‘ ), portorož, slovenia, – may .
hoque, e.; carenini, g.; joty, s. interactive exploration of asynchronous conversations: applying a user-centered approach to design a visual text analytic system. in proceedings of the workshop on interactive language learning, visualization, and interfaces, baltimore, ma, usa, june .
helfrich, p.; rieb, e.; abrami, g.; lücking, a.; mehler, a. treeannotator: versatile visual annotation of hierarchical text relations. in proceedings of the th international conference on language resources and evaluation (lrec ), miyazaki, japan, – may .
wachsmuth, h.; da san martino, g.; kiesel, d.; stein, b. the impact of modeling overall argumentation with tree kernels. in proceedings of the conference on empirical methods in natural language processing, copenhagen, denmark, – september .
zeldes, a. the gum corpus: creating multilayer resources in the classroom. lang. resour. eval. , , – .
kutty, s.; nayak, r.; turnbull, p.; chernich, r.; kennedy, g.; raymond, k. paperminer—a real-time spatiotemporal visualization for newspaper articles. digit. scholarsh. human. , , – .
viola, l.; verheul, j. mining ethnicity: discourse-driven topic modelling of immigrant discourses in the usa, – . digit. scholarsh. human. .
klein, l.f.; eisenstein, j.; sun, i. exploratory thematic analysis for digitized archival collections. digit. scholarsh. human. , , i –i .
drucker, j. non-representational approaches to modeling interpretation in a graphical environment. digit. scholarsh. human.
, , – . [crossref] . muralidharan, a.; hearst, m.a. supporting exploratory text analysis in literature study. lit. linguist. comput. , , – . [crossref] . brierley, c.; atwell, e. non-traditional prosodic features for automated phrase break prediction. lit. linguist. comput. , , – . [crossref] . andrews, t.l. analysis of variation significance in artificial traditions using stemmaweb. digit. scholarsh. human. , , – . [crossref] . brughmans, t. networks of networks: a citation network analysis of the adoption, use, and adaptation of formal network techniques in archaeology. lit. linguist. comput. , , – . [crossref] . zhitomirsky-geffe, m.; prebor, g.; miller, y. ontology-based analysis of the large collection of historical hebrew manuscripts. proc. assoc. inf. sci. technol. , , – . [crossref] . garcía, n.a.; caplan, a.; mering, b. a pedagogy for computer-assisted literary analysis: introducing galgo (golden age literature glossary online). dhq digit. hum. q. , , – . © by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /). http://dx.doi.org/ . /s - - -x http://dx.doi.org/ . /llc/fqy http://dx.doi.org/ . /llc/fqz http://dx.doi.org/ . /llc/fqv http://dx.doi.org/ . /llc/fqx http://dx.doi.org/ . /llc/fqs http://dx.doi.org/ . /llc/fqr http://dx.doi.org/ . /llc/fqu http://dx.doi.org/ . /llc/fqt http://dx.doi.org/ . /pra . . http://creativecommons.org/ http://creativecommons.org/licenses/by/ . /. 
América Crítica ( ), – , , ISSN: - , https://doi.org/ . /americacritica/

Digital Humanities at CUNY. Building Communities of Practice in the Public University

Stefano Morello
The Graduate Center, CUNY, United States

Received: / / Accepted: / /

Abstract: In this essay, I reflect on my experience working in the field of digital humanities at the Graduate Center (GC) of the City University of New York (CUNY) to refute the misconception that the point of intersection of humanities and computation is dependent on robust technological infrastructure and, therefore, outside of the reach of underfunded public institutions. On the contrary, my tenure as a GC Digital Fellow suggests that the development of DH communities of practice can be an especially valuable asset for public universities, due to the waterfall effect they can produce for both the academic and the local community. Finally, I present evidence of second- and third-order effects of the GC’s institutional DH culture by briefly introducing two projects developed at CUNY that both rely on and engage critically with technology: the CUNY Distance Learning Archive (CDLA), a GC class project, and QC Voices, a structured initiative established at one of the four-year CUNY colleges.

Keywords: Digital humanities, digital praxis, critical university studies, community of practice, American studies.
Abstract (from the Italian): This essay reflects on my experience in the digital humanities at the Graduate Center (GC) of the City University of New York (CUNY) in order to refute the commonplace that the intersection of the humanities and the computational sciences requires a robust technological infrastructure and is, consequently, difficult to pursue in public institutions that often operate under austerity. On the contrary, my experience suggests that the development of “communities of practice” devoted to the study and application of DH can be a valuable resource above all for public universities, thanks to the cascade effect they can generate within both the academic and the local community. As evidence, the essay analyzes two projects that depend on technology and engage with it critically: the CUNY Distance Learning Archive (CDLA), a project developed within a DH seminar at the GC, and QC Voices, a systematic pedagogical initiative at one of the CUNY colleges.

Contact data: Stefano Morello, s.morello@me.com

Introduction

At a recent open house event for the PhD program in English at the Graduate Center (GC) of the City University of New York (CUNY), a faculty member sketched a parallel between the graduate student experience and the quest of the protagonist of P. D. Eastman’s children’s book Are You My Mother? Born in an empty nest, Eastman’s hatchling bird embarks on a journey to find his missing parent. The search brings the baby bird to ask a number of animals and animated objects if they are his mother. The hatchling’s quest resonates with that of a graduate student, the then-deputy executive officer of the program noted: bouncing between
disciplinary homes, methodologies, formal and informal mentors, and para-curricular activities, until they find their figurative nurturers and, with them, their academic homes.

The metaphor immediately resonated with me. While my commitment to American studies has been consistent throughout my admittedly short academic career, both the inherently speculative nature of scholarly research and the interdisciplinary anatomy of my work have pulled me in manifold directions during my time as a PhD student. In addition to genuine intellectual curiosity and the need to overcome theoretical or practical research challenges, what further prompts graduate students to pose the proverbial “are you my mother?” question to different actors, methodologies, and disciplines are the unstable nature of a job market that increasingly requires applicants to be fluent in multiple fields and disciplinary areas, and a desire for community in a context of ever-growing academic alienation.

From the early stages of their graduate careers at the GC, students, especially those willing to break out of their disciplinary bubbles, are typically exposed to more opportunities than they can take on. In the fall of , when I began my PhD program, I was introduced to the manifold formal and informal resources the GC offers its students through a number of orientations that kicked off the academic year. Such initiatives included student- and faculty-led cross-departmental research groups, certificate programs, and intra-institutional centers geared towards supporting different approaches to academic research, often through the employment of graduate students. I was first exposed to the field of digital humanities (DH) in the kinds of overwhelming circumstances that make new student orientations almost disorienting.
Completely oblivious to over fifty years of scholarship in the field and parroting some of my colleagues’ impressions, I distinctly remember dismissing what was being demoed at the event (distant reading, data visualization, and mapping projects) as an emphasis on form over content. Besides, because of my slight familiarity with computer programming and my confidence in my own digital literacy, I did not see the point of investing in more digital skills when there was so much theory I had to master in my actual field (as a non-literature major in college and a first-generation college student, I was especially affected by impostor syndrome).

Despite my appreciation for the liveliness of the DH community that surrounded me (I had often admired the warm and welcoming environment that characterized their events), it was not until two years later, when I found myself in need of what DH had to offer to my dissertation project, that I retraced my steps. In the fall of , I had the opportunity to lay my hands on unearthed archival material documenting the punk scenes and the subcultural formations at the heart of my dissertation. Lawrence Livermore, countercultural figure and co-founder of the Berkeley-based record label Lookout Records, had made his zine collection and a number of artifacts from his days in the East Bay available to me. With an eye to the increasing institutionalization of punk (the acquisition of punk ephemera by academic institutions that often de facto prevents non-academic subcultural participants from accessing the material), I became intrigued by the idea of making the content of Livermore’s archive available to both scholars and subcultural participants through an open access digital archive, mirroring my commitments to work with and for the community and to produce public-facing scholarship.
My first knock on the door of DH, when I first asked myself if it were, indeed, my metaphorical mother, was driven by purely utilitarian intentions: I viewed DH as a means (a set of methodologies and tools) to reach an end (curating and publishing Livermore’s digital collection). However, what I discovered in the process of developing the East Bay Punk Digital Archive (EBP-DA) and through my further involvement with the DH community were otherwise modes of academic engagement: collaborative, praxis-driven, and public-facing.

Note: See East Bay Punk Digital Archive at www.eastbaypunkda.com.

What follows is an account of my DH history at the GC (CUNY). Rather than producing a self-referential narrative of success, I aim to refute the misconception that the point of intersection of humanities and computation is dependent on robust technological infrastructure and, therefore, outside of the reach of underfunded public institutions. I argue, on the contrary, that DH hubs are not predominantly dependent on vanguard technology. The development of DH communities of practice can be an especially valuable asset for resource-scarce public universities, due to the waterfall effect they can produce for both the academic and the local community.

GCDI and the Digital Fellows Program

The GC is the principal doctoral-granting institution of the CUNY system, the largest public urban university system in the United States, comprising campuses: eleven senior colleges, seven community colleges, one undergraduate honors college, and seven postgraduate institutions. As of , the CUNY system counted more than , enrolled students (CUNY ).
Not unlike other institutions, the GC offers training in DH methods through departmental or cross-departmental courses (including the Interactive Technology and Pedagogy Certificate, a three-course sequence that offers interdisciplinary training in technology and pedagogy), fellowship programs, and para-curricular workshops. Within this constellation, GC Digital Initiatives (GCDI) is an intra-institutional initiative led by Lisa Rhody and Matthew K. Gold that offers opportunities to learn, support, and promote digital scholarship. The program is run by a group of graduate fellows, faculty, and staff, and central to its mission is the aim to build and sustain a community around the shared idea of a “digital GC,” envisioning and actively devising productive, inclusive, and ethical ways to integrate technology in the curriculum and in the research process. The majority of GCDI’s activities are conducted through the Digital Fellows Program, “an in-house think-and-do tank for digital projects, connecting Fellows to digital initiatives throughout the Graduate Center” (GC Digital Fellows n.d.). The Digital Fellows team, a diverse group of doctoral students, offers events, workshops, office hours, faculty consultations, week-long institutes, and community-based working groups.

My first practical encounter with DH took place through GCDI’s Digital Research Institute (DRI), a free week-long in-house training course usually held during the last week of winter break and taught by the Digital Fellows to staff, students, and faculty of the GC. Taking a foundational approach, the institute introduces its participants to technical skills and a conceptual vocabulary that serve as a basis for further learning and engagement in the field.
As pointed out by Rhody in a blog post on the Digital Humanities Research Institute (DHRI, a scaled-up version of the DRI aimed at training faculty from US universities with the goal of setting up similar courses in their home institutions), “knowing the underlying technologies will inform that choice and help with troubleshooting problems, asking for help on forums, collaborating with programmers and designers” (Rhody ). This pedagogical approach “also leads to second and third-order effects as students teach themselves and others, builds confidence, and flexibility” (Rhody ). In other words, by taking a foundational, as opposed to an instrumental, approach (i.e., teaching students how to deploy a particular tool for a specific end), the DRI aims to teach its participants a forma mentis, rather than merely a modus operandi.

Note: The curricula for the edition included workshops in the command line, digital ethics and data, Git, Python, text analysis, introduction to R, data manipulation, data visualization, mapping, Omeka, HTML and CSS and platforms, and Twitter/API. See https://gcdri.commons.gc.cuny.edu/ for further information.

What I found most valuable, aside from being introduced to a number of tools, was indeed the institute’s pedagogical model. Instead of relying solely on the expertise of the instructor, the Digital Fellows fostered a kind of learning-in-common by facilitating exchanges, relationship-building, and skill-sharing among learners from across the disciplines. In doing so, the institute put into practice a set of common values that digital humanists aspire to attain in concordance with its goals. In her popular essay in Debates in Digital Humanities, Lisa Spiro identified the values that inform the DH ethos as openness, collaboration, collegiality and connectedness, diversity, and experimentation ( , ).
My positive experience as a DRI participant and the autodidactic efforts that ensued (and eventually led to the development of the EBP-DA, with the support of the New Media Lab, a vital node of the DH ecosystem at the GC that provides access to technology and various forms of support to students and faculty seeking to integrate digital media into traditional academic practice) prompted me, shortly thereafter, to apply for the Digital Fellows Program myself. Whereas the majority of DH graduate fellowships in the United States offer either formalized training (whereby individual or group projects are developed, often in response to an artificial prompt) or financial and technical support to bring a project of one’s own design to realization, being a Digital Fellow is a rather unique employment opportunity that puts graduate students in the position of both receiving from and giving back to their community. Each fellow joins the program with a specific set of skills and, usually, a DH project that they are developing as part of their academic pursuit. While graduate fellows receive training and support towards accomplishing their research goals, the fellowship allows them an extraordinary amount of freedom: in concert with the team, they decide what tools, methods, and outputs are most conducive to their professional formation and desirable to different constituencies of the GC, as well as how to learn them, and how to disseminate the knowledge they produce.

Note: As of , some of the distinguished centers that focus primarily on supporting and developing faculty projects include the Maryland Institute for Technology in the Humanities (MITH) at the University of Maryland, the Roy Rosenzweig Center for History and New Media (RRCHNM) at George Mason University, and the Center for Digital Humanities and Social Sciences (MATRIX).
In other words, the program offers fellows an opportunity to learn while producing output for the use of the community (rather than an artificial final product), in the form of workshops, working groups, events, and collaborative projects. Faculty and student consultations, usually hosted in the Digital Scholarship Lab, are further opportunities for the Digital Fellows to work with, rather than for, the GC population. Through their collaborative approach, the Digital Fellows foster sustainable training on anything from theoretical concerns to more practical issues and technical obstacles, with the ultimate goal of putting scholars in the best position possible to be the experts on their own projects. If the majority of funding schemes reproduce the empirical experience of institutions with generous funding models and extraordinary infrastructural capacity (especially in the form of well-equipped digital labs and dedicated personnel assisting individual projects), the Digital Fellows Program aims to replicate an organic learning-by-doing process that prepares early career scholars for real-life scenarios likely to be found in public universities, community colleges, and even small liberal arts colleges.

While the development of the EBP-DA offered me the opportunity to put into practice and expand on some of the foundational skills I had learned as a DRI participant (the command line, HTML and CSS, and Git, among others), developing an expertise in Omeka and digital archiving led to my becoming an instructor at the following iteration of the institute. Omeka is a free content management system (CMS) and web publishing system built by the Roy Rosenzweig Center for History and New Media (RRCHNM) at George Mason University (GMU) to create searchable online databases and scholarly online interpretations of digital collections.
In addition to being used by archives, historical societies, libraries, and museums, Omeka is also employed by individual researchers and teachers to describe primary sources according to archival standards and publish online digital collections, as well as to curate interpretive online exhibits from those items. My workshop, built upon an open-access tutorial developed by DH scholar Amanda French, engaged with some of the conceptual challenges of digital archives before introducing participants to the nuts and bolts of the platform. By the end of two -minute sessions, participants had created a small digital collection and a short exhibit, and had been introduced to the resources available at the GC for those interested in pursuing such projects. Reflecting the increasing implementation of digital archives in both the classroom and scholarly research (a platform such as Omeka offers an invaluable opportunity for cultural preservation with little to no institutional funding), the workshop has since transcended the DRI setting and has become a staple of the Digital Fellows’ offerings, along with “Getting Started with TEI,” “Intro to Python,” “Building Websites with WordPress,” “Data Privacy and Ethics,” and “Introduction to Mapping.” Held in the fall and spring semesters, GCDI’s workshops are typically accompanied by material distributed in open access (e.g., web tutorials, PowerPoint slides, and GitHub repositories), allowing the scope of the fellows’ work to extend beyond the workshop setting and the GC. As Kathleen Fitzpatrick has suggested, open access work entails “free access not just in the sense of gratis, but also in the sense of libre work that, subject to appropriate scholarly standards of citation, is free to be built upon” ( , ). Many of GCDI’s workshops live in open access GitHub repositories, allowing future Digital Fellows and DH practitioners to update them, build upon them, or adapt them to their learning settings.
As per Fitzpatrick’s understanding of free access, GCDI’s approach to knowledge dissemination is informed by the same ethos of openness: knowledge is produced to be distributed to the community and to influence more knowledge production at both an intra- and extra-institutional level.

Note: On digital archives and cultural preservation, see especially projects that seek to preserve the cultural heritage of marginalized communities, such as “New Roots: Voices from Carolina del Norte!” (https://newroots.lib.unc.edu/), “Dawnland Voices: Writing of Indigenous New England” (https://dawnlandvoices.org/collections/), and “Wearing Gay History” (http://wearinggayhistory.com/).

As DH practitioners, rather than using the do-it-yourself (DIY) affordances of technology to replace other professional figures, we are interested in working with them to imagine and develop new and better methodologies. Aside from building a set of technical skills, developing the EBP-DA also involved familiarizing myself with archival theory and practice. I engaged in conversation with archivists, librarians, faculty, and fellow grad students to learn from their experience on matters such as metadata, file format standards, information architecture (especially its relationship with discoverability and accessibility), rights and permissions, and sustainability. Through this process, I realized the extraordinary amount of work in and around digital archives at the GC, as well as the need for a platform to put different constituencies in conversation. After further surveying the community about its needs and desires, as part of my Digital Fellows duties, I spearheaded the Digital Archive Research Collective (DARC). In the fall semester, the working group, co-led by Filipa Calado and supported by Param Ajmera and Di Yoong, created a wiki that contains information about various institutional resources, featured projects by students and faculty, and overviews of several digital archival methods, approaches, and tools.

Note: Among these are projects completed by the American Social History Project, developed at the New Media Lab, and in the context of the praxis class of the ITP certificate. For a survey of digital archives developed at the GC, see “Projects – DARC (Digital Archive Research Collective),” https://darc.gcdiprojects.org/projects.

Note: See “Digital Archive Research Collective (DARC) Wiki,” https://darc.gcdiprojects.org/.

The Wikimedia platform allows the repository to be developed collaboratively by the community, allowing any user to add and edit content. In parallel with other working groups, such as the Python User Group (PUG), the R User’s Group (RUG), and the GIS/Mapping Working Group, DARC also holds monthly meetings open to all members of the community, of all skill levels, disciplines, and backgrounds. During working group meetings, Digital Fellows do not cast themselves as the only experts in the room, but rather invite those with an interest in specific methodologies to congregate to work and learn together. Finally, in the spring of , DARC held an event series that included talks by experts in the field and workshops on tools and platforms such as TEI, Tropy, Audacity, and HathiTrust.

Note: See https://darc.gcdiprojects.org/darc_event_series.

By developing awareness around digital archival work and facilitating access to technical and academic support, DARC aims, in accordance with GCDI’s mission, to foster the birth and development of a self-sustained community of practice. As defined by Lave and Wenger, communities of practice are groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly ( ). By emphasizing human relationships and common interests, communities of practice have the capacity to bring constituencies from across the disciplines together and to bridge frozen dialectics among different fields. Furthermore, according to Etienne and Beverly Wenger-Trayner, fostering two complementary forms of participation, competence and knowledgeability, allows higher education to foster a kind of knowing-in-practice ( : vi). Especially in settings with a rapid turnover (of either students or contingent faculty), communities of practice, born and developed through the very acts of learning and doing together, have the potential of producing a lasting impact, since expertise tends to be a shared asset and its dissemination a shared responsibility. This allows GCDI to extend the longevity of its communities of practice beyond the tenure of Digital Fellows with specific skills, as well as beyond institutional investment in specific technologies or methodologies.

Tagging the Tower, the blog used by the Digital Fellows to share resources and reflect on their experiences, abounds with accounts that resonate with mine and especially emphasize the desire not only to build community around technology-based scholarship, but also to further build bridges across communities and disciplines. As early as , former Digital Fellow Laura Keane wrote:

“The Digital Fellowship program has sharpened my programming and web development skills, and has given me a new venue to employ such skills. [...] I’ve found that my work in the Digital Fellows Program has been based on collaboration and building a community around technology at the Graduate Center – this is exciting! [...] I’d like to see the Fellows working together with representatives from other programs at the Graduate Center to build an infrastructure for communication across disciplines – a ‘digital GC’ – and I think technology plays a crucial role in realizing that goal.”
(Keane )

As illustrated through the examples in edited volumes such as Debates in Digital Humanities and Digital Pedagogy in the Humanities, as well as in journals like the Journal of Digital Humanities (JDH) and the Journal of Interactive Technology and Pedagogy (JITP), DH has often proved to foster successful interdisciplinary work, produce new types of knowledge, and drive curricular innovation. I thus urge the skeptical reader not to think of technology in higher education solely through a Marxist lens, i.e., as a means to relegate the intellectual worker to an appendage of both the machine and the neoliberal university, as part of a perpetual effort to extract her fullest productive capacity. On the contrary, as Brian Greenspan has recently argued,

“the digital humanities involve a close scrutiny of the affordances and constraints that govern most scholarly work today, whether they are technical (relating to media, networks, platforms, interfaces, codes, and databases), social (involving collaboration, authorial capital, copyright and IP, censorship and firewalls, viral memes, the idea of ‘the book,’ audiences, literacies, and competencies), or labor-related (emphasizing the often-hidden work of students, librarians and archivists, programmers, techies, research and teaching assistants, and alt-ac workers).” ( : n.p.)

As DH practitioners, we object to technological essentialism (technology as having an inherently good or bad nature) in favor of a praxis that uses digital means towards building academic practices that are better than the ones we have, more conducive to ethical and collaborative work.
In other words, as we think of “the digital” as a catalyst for research in the humanities, our technological praxis can and must be informed by new and better standards of humanity and care. Furthermore, as DH often enables work geared towards non-academic publics, communities of practice can have a pivotal role in creating synergetic connections with non-academic communities and in promoting dialogues and collaborations across boundaries, emphasizing the public research agenda of city and state colleges.

Especially in institutional contexts with limited financial, technological, and human resources, diverse communities of practice can thus be building blocks for a thriving DH hub. Despite its wide range of activities, GCDI relies on a rather limited budget, the impact of which has been extended through its community-oriented approach. For instance, the initial funding that supported the training materials built for the DRI came from a one-time Strategic Investment Initiative award, a state grant offered to CUNY for particular projects based on strategic infrastructure building. The impact of the grant was scaled up through the Digital Fellows Program, sustained through funding from the Provost’s Office, often in the context of the overall support packages offered to PhD students. Whereas at many other (especially private) institutions graduate funding packages often come with lower (or no) work requirements, being a Digital Fellow, like most GC fellowships, requires a service commitment of hours per week. Furthermore, as argued by Rhody ( ) and demonstrated by and through my personal experience, training provided through a foundational approach and developed through communities of practice often produces second- and third-order effects.
In the next section, I will provide two examples of such effects by briefly introducing two projects developed at CUNY that rely on and engage critically with technology: the CUNY Distance Learning Archive (CDLA), a GC class project that outgrew its original scope, and QC Voices, a structured initiative established at one of the four-year CUNY colleges.

Note: On extending DH communities of practice beyond academia, see also Joan Fragaszy Troyano and Lisa M. Rhody, “Expanding Communities of Practice,” in Journal of Digital Humanities, vol. , no. , spring , accessed online at http://journalofdigitalhumanities.org/ - /expanding-communities-of-practice/.

On Second- and Third-Order Effects

In the spring of , Gold, a faculty member in the English and Digital Humanities programs, led a graduate seminar on knowledge infrastructures that required, as a final project, “an intervention [...] into the knowledge infrastructures at the GC or in CUNY” (Gold ). The global COVID-19 pandemic pushed the class to commit to a cause much earlier than anticipated. On March , the news of CUNY’s switch to distance learning to mitigate the health risks posed by the pandemic broke just a few minutes before our last in-person class of the semester. Over the course of two hours, the students in the class unanimously decided that the intervention would have to be related to the unique moment we were experiencing as students and teachers. Over the rest of the semester, under Gold’s supervision and through the extraordinary involvement of the students in the class, the CDLA was developed as a crowdsourced archive that allows students, faculty, and staff from across the CUNY system’s campuses to submit personal narratives about the experience of moving online, emails, and communications related to the decisions to move online, documentation of online learning experiences (e.g., photos, narratives, screenshots), and links to digital media artifacts that capture the event in real time.
(cuny distance learning archive, )

furthermore, the cdla also sought to preserve social media posts and reactions (twitter, reddit, facebook, and instagram) of the cuny community to both the crisis and the shift to remote learning. from the archive’s initial conception, the class moved forward quickly, under pressure to capture the moment. within the first week of cuny’s transition to online instruction, the team developed a website through the cuny academic commons (an academic social network created by and for cuny that includes a customised installation of wordpress), an online submission system, and a social media presence on major digital platforms. over the following weeks, gold’s class partnered with the core interactive technology and pedagogy class of the itp program, whose students devised a number of suggested writing prompts for cdla contributors.

the founding members of the cdla team are matthew k. gold, travis bartley, nicole cote, jean hyemin kim, charlie markbreiter, zach muhlbauer, michael gossett, and myself.

stefano morello, digital humanities at cuny

while moving the project forward allowed the team to learn by doing, students also studied technical, ethical, and theoretical challenges faced by similar ‘crisis archives’ (such as the september 11 digital archive and our marathon) and learned from experts in the field (including jim mcgrath, former project director for our marathon, ed summers, technical lead for documenting the now, and johnathan thayer, assistant professor at queens college’s graduate school of library and information studies), invited as (remote) guest speakers in the remaining sessions of gold’s class.
as of september , without any funding and relying mostly on its original team’s labour, the cdla has collected dozens of contributions (in the form of personal narratives, correspondence, official email communications, and learning resources), and its social media collection efforts have resulted in scraping close to a hundred thousand posts. if the goal of the cdla is to “document this moment of crisis response from a critical approach to educational technology,” collecting different forms of data from a wide range of sources is aimed at producing a multi-perspective narrative that includes both the institutional and the lived experiences of multiple actors occupying different positionalities and identities. through their juxtaposition, the cdla team hopes to enable researchers, students, and members of the community to understand, learn from, and engage critically with this moment. as travis bartley, one of the members of the team, noted:

with this archive, we hope to better understand the particular means through which the accommodation of distance learning has in some ways troubled educational instruction. further, given the possibility that distance learning practices may become instituted as the norm for higher education, we hope to maintain a collection that acknowledges the human cost of such practices, assisting in the development of pedagogy that truly meets student needs through the digital medium. ( )

moving forward, the cdla team hopes to find institutional backing to ensure the longevity of its archiving efforts, either by merging its collection with an established repository or through the provision of funds for the migration of data to a secure storage platform.
it is also currently seeking external funding for the next stages of the project, geared towards curation and preservation solutions, metadata standardization, ethical practices for handling social media datasets, and the creation of an archive front-end to ensure accessibility. the case of the cdla and its ongoing development, from class assignment to public resource, is not only further proof of the indissoluble relationship between dh practice and theory in both research and classroom settings, whereby community-oriented projects offer outstanding opportunities to develop a praxis that acts on the theoretical underpinnings of the field. it also allows me to emphasize the pivotal role of a human infrastructure, the result of a synergetic approach to building dh communities of practice that comprises both curricular and para-curricular activities. this infrastructure relies on a set of foundational skills to approach, devise, and develop a dh project and contributes, on the one hand, to overcoming financial and technological scarcity and, on the other hand, to the development of a “digital gc.”

qc voices: a collaborative writing platform

third-order effects of the presence of a community committed to integrating technology in their scholarship also percolate beyond the r1 settings of the gc and into undergraduate pedagogy. benefits of the gc’s digital knowledge infrastructure also extend to other cuny campuses and their populations, where funding of digital initiatives is not as robust. for one, as graduate students and alumni develop a sensibility to dh tools and methods during their graduate careers, they often carry it with them to the cuny community and four-year colleges, where many of them find employment as faculty, teaching fellows, adjunct teachers, and staff.
if the use of course sites and blogs has become somewhat widespread, digital tools such as digital archives or data visualization software are also making their way into undergraduate teaching. as an example of this growing tendency, i want to draw attention to some initiatives promoted at queens college (qc), with which i have been affiliated for several years in different capacities. over the past three years, as part of its efforts to further integrate technology in english courses, writing at queens (the program that supports and administers the college’s writing curriculum) has run several faculty development workshops to encourage writing instructors to further implement multimodal assignments in their courses. as posited by cynthia selfe, “multimodal writing” extends traditional classroom composition work into “visual, audio, gestural, spatial, or linguistic means of creating meaning” beyond what is traditionally considered literature and allows teachers to foster their students’ multiliteracies (selfe , ).

américa crítica ( ): –

a number of para-curricular activities also rely on the affordances of technology to promote otherwise pedagogies and modes of engagement with writing. a particularly interesting case is that of qc voices, a program that uses a local installation of wordpress (qwriting) as a platform for a collective blog featuring student writers. currently on hiatus due to the budget cuts that resulted from the covid-19 emergency, qc voices was spearheaded in by gc alumni jason tougaw (faculty in the qc english department) and boone gorges (qc’s educational technologist and, at the time, phd candidate in philosophy at the gc). the project’s generative questions were: first, since the domains of writing and information technology are increasingly intertwined, how is the latter influencing the purposes of writing, the genres of written communication, and the nature of audience and author?
second, at a time when citizens are bombarded by media messages and information is delivered mostly through digital platforms, how can we further develop and channel digital writing fluency towards critical thinking, effective communication, and active citizenship? (tougaw ). rather than achieving proficiency with specific software packages and technological devices, the goal of the program was to collaborate effectively, asynchronously and synchronously, across spatial barriers, to produce, analyze, and share information on a digital platform. every semester, with these pedagogical goals in mind, qc voices hired a diverse cohort of a dozen graduate and undergraduate students, selected from a large pool of applicants from across the disciplines, to each publish six non-fiction thematic columns. in addition to a stipend of $ per semester, student participation was driven by the opportunity to be part of a program run like a professional publication, with the support of tougaw, in the role of faculty mentor, and two remunerated editors (usually an adjunct professor with experience as a professional content editor and an early-career dh scholar in the role of multimedia editor). as explained by tougaw in a recent interview:

we try to structure it like a literary-magazine editing experience [...] we do all the steps that i would go through if i was publishing something. they submit the first draft, we give them notes, it usually takes them another week or so to revise, and then we do a round of more sentence level, detail-oriented editing. in the meantime, one of the technology fellows works with them on assembling the visual elements and doing layout. (“sharing student perspectives” )

through writing workshops, a professional editorial process, and one-on-one mentoring, writers learn about the distinctive elements of writing online, including visual rhetoric, savvy linking, and media integration.
the workshops were hosted every few weeks during free hour, when classes aren’t in session, in the digital writing studio, a lab built through a grant earned by kevin ferguson (gc alumnus and faculty in english at qc and in the ma program in digital humanities at the gc), equipped with five round tables with dedicated screens and a laptop cart, and primarily used to promote multimodal writing in composition courses. workshop topics included podcasting, digital editorial practices, visual rhetoric, online pitching, developing an online presence, online collaboration, and building a community of writers. the program’s investment in technology is thus especially geared towards learning outcomes such as cooperation, discussion, and community-building. in keeping with the collaborative ethos that informs the program, while writers benefit from one-on-one mentoring, peer networks were also often born out of the workshops. the qc voices website still gets thousands of visits each month, making it both a public forum for members of the qc community and a highly visible online representation of some of the college’s most outstanding students, speaking their minds through a range of styles (from poetic prose to journalism, from creative non-fiction to digital exhibits) on a plethora of topics (recent columns have focused on environmental activism, prison reform, nerd culture, immigrant life, local food culture, afrocentricity, theater, hip hop, and muslim-american identity). the initiative can thus be framed as lying at the intersection of digital and public humanities, whereby students produce public content pertinent to their lived experience and their community. in addition, it also operated as a kind of professional development, with alumni of the program working as professional writers or using the digital literacy, communication skills, and collaborative approach to writing they developed through qc voices in their professional work.
in light of the cuny-wide mass budget cuts under the covid-19 crisis, queens college has deemed qc voices too expensive to run. the emphasis college administrators put on the cost of the editing fellows is further proof of a peculiar kind of shortsightedness: sustaining digital infrastructures (and computational humanities) through massive investments in technology, including million-dollar contracts to purchase licenses for platforms developed with little regard for ethics by for-profit corporations (cunyfirst, blackboard, g suite for education, and the like), rather than in human capital.

conclusion

even within public universities, i am aware of the gc’s privileged position in terms of human and intellectual capital, as well as resources available to its affiliates through the ecosystem to which it belongs. despite its pathological austerity blues, to quote michael fabricant and stephen brier ( ), cuny is the largest public urban university system in the nation, located in one of the largest urban technology hubs in the world. however, scaling up training in dh research methods is a desirable goal for both public institutions and the dh community itself. on the one hand, a truly diverse dh community (the field is to this day still extremely white and male-dominated) can only coalesce when training in the field reaches higher education’s largest pools of diverse resources: community colleges and public university systems. on the other hand, public institutions can benefit from dh’s ability to promote horizontal collaborative research practices that foster mentorship and non-hierarchical relationships among diverse perspectives, training, and fields of expertise to de-silo knowledge creation and public impact.
in an institutional context steeped in dh, such as that of the gc, the digital fellows program represents a sustainable funding scheme aimed at employing and training graduate students, while also producing output for the community in the form of support for dh scholarship. initiatives like the dri and dhri, aimed not only at teaching foundational computational skills but also at scaling up the pedagogical philosophy that informs gcdi’s work, are another example of sustainable professional development that can produce a waterfall effect for the community. if dh practitioners at better-funded universities are more likely to have access to the newest technology and to professional assistance than those who are not, public universities can and must promote an institutional culture that aims at nurturing the computational skills of graduate students, staff, and faculty and devise opportunities for them to join forces across disciplines and hierarchies. while communities of practice coalesce by doing together, they do not necessarily come or stay together spontaneously. public institutions need to actively stimulate, facilitate, or formalize such initiatives. investing in human, rather than merely technological, infrastructure is essential to build communities of practice and spark a virtuous circle that can lead to further infrastructural development, a larger scope of operations, an institutional dh culture, and eventually formal and informal inter-institutional networks of practice.

references

bartley, travis. personal interview. august , .
cuny. . “total enrollment by undergraduate and graduate level, full-time/part-time attendance, and college, fall .” accessed july , . https://www.cuny.edu/irdatabook/rpts _ay_current/enrl_ _uggr_ftpt.rpt.pdf.
cuny distance learning archive. . “about.” accessed september , . https://cdla.commons.gc.cuny.edu/about/.
fabricant, michael, and stephen brier. .
austerity blues: fighting for the soul of public higher education. baltimore: johns hopkins university press.
fitzpatrick, kathleen. . generous thinking: a radical approach to saving the university. baltimore: johns hopkins university press.
fragaszy troyano, joan, and lisa m. rhody. . “expanding communities of practice.” journal of digital humanities ( ). accessed july , . http://journalofdigitalhumanities.org/ - /expanding-communities-of-practice/.
gc digital fellows. “about.” accessed july , . https://digitalfellows.commons.gc.cuny.edu/about/.
lave, jean, and etienne wenger. . situated learning: legitimate peripheral participation. cambridge: cambridge university press.
kane, laura. . “a fresh perspective.” tagging the tower. accessed august , . https://digitalfellows.commons.gc.cuny.edu/ / / /am-i-an-author/.
gold, matthew k. . “knowledge infrastructure.” syllabus, the graduate center, cuny. accessed july , . https://kinfrastructures.commons.gc.cuny.edu/syllabus/.
greenspan, brian. . “the scandal of digital humanities.” debates in the digital humanities, edited by matthew k. gold and lauren f. klein. minneapolis: university of minnesota press. accessed november , . https://dhdebates.gc.cuny.edu/read/untitled-f acf c-a - d -be - f ac e a /section/ b be c- c- f -a a - ec a c#ch .
rhody, lisa. . “dhri: notes toward our pedagogical approach.” accessed july , . http://www.lisarhody.com/dhri-notes-toward-our-pedagogical-approach/.
selfe, cynthia l., ed. . multimodal composition. new york: hampton.
“sharing student perspectives.” . the qview, . accessed august , https://www.qc.cuny.edu/communications/documents/qview/qview .pdf.
spiro, lisa. . “‘this is why we fight’: defining the values of the digital humanities.” debates in the digital humanities, edited by matthew k. gold, - . minneapolis: university of minnesota press. also accessible at https://dhdebates.gc.cuny.edu/read/untitled- c - - b-a be- fdb bfbd e/section/ e -c - ab- b - f #ch .
tougaw, jason. personal interview. september, , .
wenger-trayner, etienne and beverly. . “foreword.” implementing communities of practice in higher education, edited by jacquie mcdonald and aileen cater-steel, v-viii.
singapore: springer.

dasch: data and service center for the humanities

lukas rosenthaler and peter fornaro
digital humanities lab, university of basel, basel, switzerland

claire clivaz
university of lausanne, ladhul, faculty of social and political sciences, swiss institute of bioinformatics, vital-it, lausanne (ch), switzerland

abstract

research data in the humanities needs to be sustainable, and access to digital resources must be possible over a long period. only if these prerequisites are fulfilled can research data be used as a source for other projects. in addition, reliability is a fundamental requirement so that digital sources can be cited, reused, and quoted. to address this problem, we present our solution: the data and service center for the humanities located in switzerland. the centralized infrastructure is based on flexible and extendable software that is in turn reliant on modern technologies. such an approach allows for the straightforward migration of existing research project databases with limited life spans in the humanities. we will demonstrate the basic concepts behind this proposed solution and our first experiences in the application thereof.
correspondence: peter fornaro, digital humanities lab, university of basel, bernoullistrasse , ch- basel, switzerland. e-mail: peter.fornaro@unibas.ch

digital scholarship in the humanities, vol. , supplement , . © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com. doi: . /llc/fqv . advance access published on october .

introduction

the long-term preservation of digital data and resources has been an ongoing topic within the information technology (it) industry and archiving community (kuny, ). it is not only an it challenge but also a challenge to both social and human sciences, and academics in general, in terms of changing research and training habits in the scholarly world of the humanities. this short article focuses mainly on the former issue but will offer concluding remarks, from a swiss perspective, on the latter.

while the open archival information system reference model offers a very reasonable framework for the long-term archiving of digital data such as digitized motion pictures, images, sound, or text documents, the archiving of highly structured digital data, such as databases, still raises a lot of problems. the flattening of databases into xml text files has been successfully used to archive the contents of relational database management systems (rdbms). however, this method reduces accessibility since the xml files usually have to be read back into an rdbms. in addition, the logic of the application has to be reconstructed in order to regain the full usability of the data. currently, the best method for maintaining the sustainability, functionality, and usability of structured data is to migrate data repositories and their software environments (user interfaces, analytical tools, and so forth) to new technologies, thus ensuring their ongoing functional accessibility (rosenthaler et al., ). the replacement of obsolete hardware and software infrastructure is an ongoing and labor-intensive process that requires continuous financial effort (rosenthaler et al., ). furthermore, given that online research data are often constantly modified to reflect new insights and are thus changing dynamically, referencing them (for citations, for example) is not straightforward.

since research is not a standardized process, different research projects tend to make use of project-specific solutions and technologies. this variety of technical solutions and methods leads to generic problems with respect to the interoperability and longevity of research data. despite these difficulties, the use of digital research data, including databases, has become common in the humanities. at the same time, the terms used to describe the digital resources that humanities researchers use and produce may be misleading: research data in the humanities comprise not only collections of digital objects, such as digitized manuscripts, collections of digitized photographs, and the related metadata, but also the enrichment of such sources with annotations, semantically rich links, and other elements. as long as project funding is available, many of these digital sources are accessible to the research community online. however, once the funding ceases, most of these sources will remain accessible only as long as the supporting hardware and software can be kept in working order. after a certain amount of time, measurable in years, there is a danger that most of these research databases will be abandoned due to lack of maintenance. thus, most of the digital resources and results created within research projects will have a relatively short life span and will no longer be available to the research community; the citation of the affected digital objects will become impossible.
these digital sources can provide a valuable reference base for future projects, but only if their maintenance is ensured through continuous funding. unfortunately, such continuous financial support cannot be realized because of the heterogeneity and lack of critical mass of most project-related databases. furthermore, many of the printed publications (paper or e-paper such as pdf) published during the life span of a research project rely on results obtained from the original digital data collections. in order to provide the often-demanded transparency (traceability and conformability), the source material that the research is based on should also be available for critical review.

approach

given this challenging situation, the swiss academy of humanities and social sciences (sahss), on behalf of the state secretariat for education, research and innovation (seri), has launched a project to address it in the national context of switzerland. the digital humanities lab of the university of basel (dhlab), in conjunction with the universities of lausanne and bern and in association with the swiss national archives, participated in a call for proposals, which resulted in the joint team being given the task of establishing a solution. the initial -year period ( – ) is a pilot program leading to the establishment of a data and service center for the humanities (dasch). this program will be based on several test cases of different sizes and complexities from different disciplines; the methods and processes, legal aspects, infrastructure needs, and, finally, the costs and expenses all have to be evaluated. the proposed dasch is based on the following premises: (1) preserving software in a usable and working condition remains a difficult task, as illustrated at the recent meeting ‘preserving.exe: toward a national strategy for preserving software’ held by the american library of congress ( ).
it would therefore be too difficult and costly to maintain a multitude of different systems for a long time. (2) emulation of obsolete hardware and software, as proposed by r.a. lorie (lorie, ), is also not an easy approach and has its share of problems (luan et al., ).

instead, with respect to dasch, we propose that only a minimal number of centralized service locations be operated, where the different digital sources or databases will be integrated through an import and translation process. as a result, ideally only one type of technological infrastructure has to be maintained. dasch, as defined by the call for proposals, addresses the following. the first phase focuses on the adoption of existing data resources of research projects that have reached the end of their funding. in the second phase, the goal is for researchers of ongoing or future research projects to be guided through the creation and use of digital sources in order to facilitate the accessibility of the data once funding has ended. dasch facilitates the exchange between platforms and infrastructures by implementing interfaces for the import, export, and querying of information (data restricted by legal constraints, such as copyright issues and/or the protection of personal rights, are mapped in a sophisticated access-rights system). the adopted digital sources must be accessible through a flexible and extendable user interface that allows for search and analysis. we currently support a restful web service (representational state transfer, an architectural style for scalable web services) and a linked open data sparql endpoint in order to integrate the data into other research projects and/or databases. dasch should encourage new collaborative research models in order to allow for the optimal use of digital sources.
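to make the role of the sparql endpoint more concrete, the following sketch shows how an external project might prepare a query and read the answer. it is only an illustration under stated assumptions: the endpoint address, the query, and the function names are hypothetical and are not taken from dasch or salsah. the request follows the general sparql protocol convention of passing the query as a `query` parameter and asking for json results.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint address: the article does not give the real one.
ENDPOINT = "https://example.org/sparql"

# Illustrative query: list a few resources together with their labels.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label
WHERE { ?resource rdfs:label ?label }
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a GET request that sends the query and asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(payload: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON result document into plain dictionaries."""
    document = json.loads(payload)
    return [
        {variable: cell["value"] for variable, cell in binding.items()}
        for binding in document["results"]["bindings"]
    ]
```

the request object can be handed to `urllib.request.urlopen` against a real endpoint; the parsing step is kept separate so that it can be tried on a saved response without network access.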
dasch should also be able to facilitate efficient training modules and support for all new research projects funded by the swiss national science foundation. in order to prepare for linking swiss digital humanities research to international research, international contacts are a key focus for this center.

given the nature of research in the humanities, the data sources dasch has to deal with are heterogeneous and consist mainly of qualitative data (possibly linked to digitized objects). the experience thus far, with respect to the integration of approximately projects of different sizes and complexities, has shown that anything from simple spreadsheets to complex, undocumented relational databases with more than tables can be expected. the system for annotation and linkage of sources in arts and humanities (salsah) platform was chosen as the base for the consolidation of different data sources. the repository of salsah is based on semantic web technologies [resource description framework (rdf), resource description framework schema (rdfs), and web ontology language (owl)]; it has a modular, layered architecture and is thus well suited to emulating the basic functionality of simple rdbms-style databases such as ms access and filemaker. salsah is currently being actively developed by the dhlab and will be available as open source by the end of (figs – ). within a research project funded by the swiss national science foundation, salsah is currently being extended with several important new features (expected in ):

• a time machine that will allow digital objects to be referenced by permalinks that include a timestamp. such permalinks will always display the digital object in the state it was in at the time it was referenced. these permalinks will add true ‘citation-ability’ to the salsah environment.
• salsah, which is currently organized technically as a centralized system, will be transformed into a distributed, self-organizing p2p system. at the same time, an archival system based on distarnet (distributed archival network, a self-organizing peer-to-peer network for archival storage of digital data; see subotic et al., ) will be added to salsah for protection against data loss due to catastrophic events such as hardware failure, flooding, and fire at any salsah location. distarnet also uses p2p technology to maintain multiple redundant backups of the data within the network.
• within the dasch project, salsah will be extended to support open data standards for access and linked data. however, open access may be restricted for legal reasons (such as copyright and privacy laws). salsah includes fine-grained identity and rights management.
• salsah will be continuously enhanced according to the needs of the researchers using the platform. it is planned for salsah to migrate to open source by the end of .

fig. the web interface of salsah implements a desktop interface within the web-browser window in order to allow for the visualization of a working rdf graph with multiple sources simultaneously.

fig. salsah allows a dynamic visualization of the graph-like structure of the rdf data representation.

the main tasks of dasch will be three-fold: (1) maintaining the technological infrastructure and adapting it to the needs of researchers and changing technology.
this task will be handled in basel during the pilot phase, but since salsah will become an open-source project, other institutions and individuals may contribute to its base. however, in our experience, open-source projects need a powerful coordinating institution in order to be successful. the dhlab will be on hand to play this role.
(2) assistance to researchers: in the humanities, many researchers working with digital sources do not have the technical knowledge to fully exploit the advantages of digital processes. dasch will support researchers with regard to the best possible use of digital methods and tools for their research. training and education will also be encouraged.
(3) to create a report with recommendations with respect to proceeding with the project and transferring the pilot project to a permanent institution.

it should be noted that funding for this project goes far beyond normal scientific project financing, and this is evidence of the strong commitment of all parties involved (sahss, seri, swiss national science foundation) in the foundation of a sustainable national data curation and service center for digital research data in the humanities. the team comprising the universities of basel, bern, and lausanne, and the national archives demonstrates this commitment. the project’s pilot phase is financed by sahss and seri. since switzerland is a multilingual nation with a highly federalized structure, we decided to create dasch as a virtual center where the technological infrastructure will be located and maintained in basel for the time being, but all the other tasks will be performed by local branch offices that are very close to the researchers. during the pilot phase, lausanne and bern will be testing this branch-office, satellite model. as soon as salsah’s p2p functionality is implemented, the technical infrastructure may be distributed as necessary or desired.
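the timestamped permalinks planned for the salsah "time machine" can be pictured with a minimal python sketch. it only illustrates the idea of resolving a link to the state an object had at the referenced time; it is not salsah's implementation. the url layout, the example.org host, the class name `VersionedObject`, and the functions `make_permalink` and `resolve` are all assumptions made for this example.

```python
from bisect import bisect_right
from datetime import datetime, timezone


class VersionedObject:
    """Keeps every saved state of one digital object, ordered by time."""

    def __init__(self):
        self._times = []   # save timestamps, ascending
        self._states = []  # state stored at the matching timestamp

    def save(self, when, state):
        # Assumption: states are saved in chronological order.
        if self._times and when < self._times[-1]:
            raise ValueError("states must be saved in chronological order")
        self._times.append(when)
        self._states.append(state)

    def state_at(self, when):
        """Return the latest state saved at or before `when`."""
        i = bisect_right(self._times, when)
        if i == 0:
            raise LookupError("object did not exist at that time")
        return self._states[i - 1]


STAMP = "%Y%m%dT%H%M%SZ"  # hypothetical timestamp format inside the link


def make_permalink(object_id, when):
    # Hypothetical URL layout; example.org is a placeholder host.
    return f"https://example.org/obj/{object_id}?t={when.strftime(STAMP)}"


def resolve(permalink, store):
    """Look up the object named in the link as it was at the link's timestamp."""
    path, _, stamp = permalink.partition("?t=")
    object_id = path.rsplit("/", 1)[-1]
    when = datetime.strptime(stamp, STAMP).replace(tzinfo=timezone.utc)
    return store[object_id].state_at(when)
```

because the link carries its own timestamp, later edits to the object do not change what an already published citation displays.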
due to the distarnet archival system, data are secured against loss without the local branches needing to build an expensive and complicated backup infrastructure.

fig. the architecture of the software infrastructure is highly modular, flexible, and extendable; the design follows strict layering.

as a final point of focus, the it element of this infrastructure project requires explanation with respect to the swiss institutions. indeed, until now, the funds devoted to research and the funds focused on infrastructure have always been separated. through this pilot project, in collaboration with other colleagues working on digital humanities (dh) projects, we hope to progressively overcome, at least in part, this defined division. even the best system of data curation has to overcome a significant obstacle: the (un)willingness of a social scientist or humanist (ssh) researcher to give their digital material to a platform. to overcome this recurrent difficulty, several solutions have to be explored. firstly, dh training and education directly influence research and it developments. secondly, with respect to the usual individual habits of ssh researchers, a top-down project such as this one could be complemented by a bottom-up approach, represented, for example, by the recent nakala platform created by huma-num. indeed, to promote salsah as a solution for all the dh material in a country would be a near-impossible task in human, financial, and concrete terms. we would rather, through salsah, encourage and stimulate data-life-cycle management in the humanities. in such a proposed center, parallel services could be developed, such as copyright counseling for images and online data.
these remarks do not concern only possible further developments, as the striking example of the arts and humanities data service in the uk proves: it was ‘a united kingdom national service aiding the discovery, creation and preservation of digital resources in and for research, teaching and learning in the arts and humanities, [. . .] established in and ceased operation in ’. this project, however, had too great a focus on technical challenges and was unable to offer the services and research that make dh truly come alive. the overcoming of the division between research funding and infrastructure funding is surely one of the key challenges of such regional/national projects.

references

kuny, t. ( ). a digital dark ages? challenges in the preservation of electronic information. in rd ifla general conference. international federation of library associations and institutions, copenhagen, denmark.

lorie, r. a. ( ). long term preservation of digital information. in proceedings of the st acm/ieee-cs joint conference on digital libraries, roanoke, va, – june . new york: association of computing machinery, pp. – .

luan, f., nygard, m., and mesti, t. ( ). a survey of digital preservation strategies. world digital libraries, l ( ), ios press.

rosenthaler, l., gschwind, r., and frey, f. ( ). the digital age in long-term image archival – risks and prospects. in icom-cc.

preserving.exe: toward a national strategy for preserving software ( ). washington, dc: library of congress, http://www.digitalpreservation.gov/meetings/documents/othermeetings/preservingsoftware /preserving_exe_agenda.pdf; last accessed on / /

subotic, i., rosenthaler, l., and schuldt, h. ( ). a distributed archival network for process-oriented autonomic long-term digital preservation, acm proceedings of the joint conference on digital libraries. new york: acm, pp. – .

notes

e.g. siard of the national archives of switzerland, http://www.bar.admin.ch/dienstleistungen/ / /index.html?lang=en (accessed october ).
xena of the national archives of australia, http://xena.sourceforge.net (accessed october ). kopal/kolibri of the german kopal project, http://kopal.langzeitarchivierung.de/index_kolibri.php.en

which we consider to be equal to a printed paper from a technical point of view.

system for annotation and linkage of sources in arts and humanities, a virtual research platform, see http://www.salsah.org for the generic entry point, http://www.salsah.org/dokubib for a (simplified) entry point for the documentation library of st. moritz, and http://www.salsah.org/kuhaba for the kunsthalle basel.

https://www.nakala.fr/; last accessed on / / .

huma-num means ‘la très grande infrastructure des humanités numérique’, see http://www.huma-num.fr/

see http://en.wikipedia.org/wiki/arts_and_humanities_data_service (accessed october ).

learning designers in the ‘third space’: the socio-technical
construction of moocs and their relationship to educator and learning designer roles in he

journal of interactive media in education

white, s and white, s learning designers in the ‘third space’: the socio-technical construction of moocs and their relationship to educator and learning designer roles in he. journal of interactive media in education, ( ): , pp. – , doi: http://dx.doi.org/ . /jime.

steven white and su white

massive open online courses (moocs) are frequently portrayed as “agents of change” in higher education (he), impacting on institutional practices, processes and structures throughout he. however, these courses do not “fit” neatly with the established aims and functions of universities, and accounts of technology-led change in universities predominate, simplistically emphasising technologically determinist narratives with incidental social effects. this study aims to explore the consequences of introducing these courses into he in terms of the roles of educators, learning designers and the socio-technical construction of moocs. the research takes a socio-technical perspective, combining the established analytical strategy of socio-technical interaction networks (stin) with the social theoretical ‘third space’ framework of he activity. the paper reports on the first of three institutional case studies, finding that learning designers occupy a hub-like position in the networks of actors involved in mooc development within an emergent ‘third space’ between academic and managerial roles. the analysis also reveals how the massive and open elements of these courses elicit involvement of seemingly peripheral actors, who exert a strong influence on course production processes and content, with educators taking a less central role.
this work adds a socio-technical element to understandings of third space activity in higher education, and can inform the planning and development of online education projects in accounting for changing roles in he where massiveness and openness are combined in a course.

keywords: mooc; stin; third space; learning designer; roles

introduction

massive open online courses (moocs) have prompted substantial discussion and debate in both public and academic discourse. some perceive them as disruptive forces, whilst others claim they are catalysts for openness and access to education (boven ). of course understandings and realisations of each term within the acronym (the precise nature of openness in a course, for example) are not fixed (anderson ). however, discourse is increasingly focusing on more practical issues of the place of moocs within he (kovanović et al. ) and even critics acknowledge that moocs have foregrounded online learning in discussions of he strategy, and have created renewed interest in digital technologies on the part of academics (laurillard , p. ).

reviews of the literature suggest that moocs may act as potential “change agents” in some areas of he, including in the area of teaching and learning (liyanagunawardena, adams & williams ). however, moocs don’t align fully with typical university functions of “teaching, research and service” (daniel ), especially in terms of their open and massive nature. further, investigating the impact of moocs (or indeed other educational technologies) in he can be problematic. moocs are often presented as irresistible forces of nature (a “tsunami”, “avalanche”, “online wave”) or as indicators of inevitable scientific progress (bulfin et al. ) but “there is a lack of evidence for the causal effects of technology” in this respect (oliver , p.
). such reports represent a technologically determinist perspective, viewing technology as possessing inherent properties, leading to inevitable impacts on users, thus changing the social world (selwyn ). this gives an oversimplified view of the dynamics and consequences of introducing new technologies into particular social contexts.

acknowledging the interaction of technologies and their context of use, siemens ( ) argues that moocs represent one way in which contemporary universities are struggling to redefine their role in the era of the internet. moocs are, he claims, a “middle ground” for education “between the highly organised and structured classroom environment and the chaotic open web of fragmented information” (siemens, , p. ). such a “middle ground” involves a range of stakeholders in he, and this paper explores the interactions between moocs, the educators who contribute to them, and the learning designers (lds) who create them. this connection between moocs and educators/learning designers is recognised as important, yet under-researched (liyanagunawardena, adams & williams ; veletsianos & shepherdson ), whilst the need to better understand the processes underlying the development of online learning is well-established (yuan et al. ).

drawing on ideas from the fields of social informatics and education, this paper explores the (sometimes unexpected) consequences of introducing new technologies into social settings. after reviewing relevant literature, the theoretical framework and methods are outlined. the findings reveal how openness and massiveness, realised through a course structure in an he setting, can entail socio-technical influences which shape the roles of educators and lds, and the courses produced.

university of southampton, gb. corresponding author: steven white (s.t.white@soton.ac.uk)
background literature

scholarship and educational technology

the link between technology and scholarship is an area of growing interest for researchers. fry highlights the “need to develop a grounded understanding of how scholars are actually using icts in their work” (fry , p. ), while weller ( ) argues that technologies which are “cheap, fast and out-of-control” have great potential to change academic work. however, these qualities seem less relevant (as weller acknowledges) to the forms of moocs which are part of this case study, as they are typically time and resource-intensive to produce (hollands & tirthali ). in terms of course production, research into the development of online learning initiatives reveals the need for teamwork in these projects (cowie & nichols ), rather than the more individualist focus on academics’ use of digital technologies taken in weller’s work on digital scholarship ( ). this focus on teamwork in online learning initiatives can be linked to the “unbundling” of faculty roles in online education (tucker & neely ) and in contemporary he more generally (king & bjarnason ). such changes may reflect a challenge to perceptions of academia as “the last remaining cottage industry” (elton ), in which the “master teacher” operates as “jack-of-all-trades” (moore, : ). indeed, trowler et al. ( ) identify a range of institutional and external contextual forces, which challenge established conceptions of disciplinary norms and routines commonly understood as “tribes and territories” in he (becher & trowler ).

online course development, lds and educators

studies of ld roles in online learning initiatives hint at the complexity of such projects in he. research in instructional design shows the need for collaboration between a range of stakeholders (chao et al. ) but that the “role of the learning designer is crucial in supporting academics to develop quality products” (seeto & herrington
). in a case study of instructional designer roles in blended learning initiatives, keppell ( ) sees instructional designers as having a “brokering” role across different academic communities and departments. this idea of lds in a “border crossing role” has interesting parallels with whitchurch’s ( a) research on third space work in he, which will be discussed below. in another case study, cowie & nichols ( ) see the potential for conflict and tension in online learning project implementation. online learning initiatives, they claim, require the “bridging of distinctive cultures”. observing a renegotiation of power relations between educators and lds during development of online and hybrid courses, they argue for the primacy of relationships (rather than timelines or targets) in these projects. research has shown a clear difference between production processes underpinning conventional (face-to-face) and online courses (gregory & lodge ). however, further investigations are required to understand how or whether findings from these studies of blended or online learning initiatives align with the realities of mooc development.

the relationship between moocs, educators and lds

the extent of research concerning the relationship between educators and/or lds and moocs is limited. bayne & ross explore factors influencing mooc pedagogy, but do not aim to consider wider influences on ld and educator roles. more pertinent to this research, najafi et al. ( ) find that educators value the opportunities for collaboration with lds during mooc initiatives, though the study is relatively small scale. czerniewicz et al. ( ) explore how engagement with moocs encourages educators to reflect on openness in their academic practice, and find emergent tensions around openness of content in the face of copyright constraints.
literat ( ) and cheverie ( ) discuss legal and copyright challenges linked to moocs, although the studies consist of reviews and commentary rather than empirical research.

theoretical framework

this study uses the stin strategy as a way to avoid technological or social determinism, by examining mooc development as a “socio-technical system in a way that privileges neither the technical nor the social” (meyer ). whitchurch’s ( a) social theory of third space activity in higher education is then used to relate the metaphorical stin representations of mooc production to the activities of educators and lds involved with them.

the study aims to explore the consequences of introducing moocs into he contexts, particularly for educators, lds and those who work with them to produce courses. the following overarching research question guides the research:

to what extent are educator and learning designer roles influenced by participation in mooc development in he institutions?

two sub-questions inform the primary research question:
1. what are the socio-technical systems related to mooc production and implementation in which learning designers and educators are involved?
2. what are the roles of educators and learning designers within mooc development and implementation projects?

defining educators and learning designers

this study focuses primarily on the roles and activities of educators and lds, but seeks to uncover other significant actors or factors which emerge from the analysis. in the context of mooc development in this study, educators are typically lecturers at the case study institution (with teaching and research roles), but function as the subject matter experts (sme) outlined in caplan and graham’s delineation of online course development roles ( , p. ).
this role is clearly distinct from past conceptions of the “lone ranger” academics who produce courses in relative isolation, relying on their own technical and pedagogical knowledge to do so (bates, ; in chao et al. ). according to caplan, smes typically provide content for course materials, check alignment of learning objectives and content, and suggest activities to be included. lds (also known as instructional designers), on the other hand, are conventionally understood as those who conduct “the systematic and reflective process of translating principles of learning and instruction into plans for instruction materials, activities, information resources, and evaluation” (smith & ragan , p. ). their role includes adapting, creating and sequencing content and learning outcomes, following an addie process of analysis, design, development, implementation, and evaluation. it is argued that their role is becoming more complex and extensive, as studies of instructional designer practices have shown that formal addie processes are rarely followed precisely in practice (kenny et al. ). indeed, seeto and herrington ( , p. ) link the development of constructivist learning theory and more open, web-based learning environments to a diminished focus on ‘instruction’, and the new title of learning designer for this “diversifying and expanding” role. it is for this reason that the term learning designer is used in this paper.

the need for collaboration between a range of team members (in addition to lds and smes) in online course design projects is recognised in the literature as a way to foster quality in course design (caplan & graham, , p. ; chao et al., ). the stin approach employed in this study aims to take into account this range of social actors in order to understand the course development process “not simply as a technical methodology to be applied to design situations, but also as a socially [and technically] constructed practice” (campbell et al. , p. ).
the initial decision to focus on lds and educators in particular was made as their interactions are seen as a particular site of ‘culture clashes’ (cowie and nichols, ) where lds act as “brokers” between academic departments and other professional departments in the university (keppell, ). this applies particularly to educators as smes who are active in the course design and development process itself, rather than postgraduate students who deal mainly with educators in providing content, or those who facilitate in discussion forums once courses are already under way. of course, one aim of the stin strategy (and of social informatics more generally) is to uncover actors, groups or technologies which may have a hitherto unrecognised importance in the use of technologies within social settings (walker & creanor ).

socio-technical interaction networks

rq1. what are the socio-technical systems related to mooc production and implementation in which learning designers and educators are involved?

the stin strategy aims to provide detailed and nuanced accounts of the way technical and social factors interact to shape technologies and their contexts of use. stin originates in the field of social informatics, which has generated a substantial body of research to support three key principles of information technology use in social settings. these principles are that information technologies (1) are embedded in their contexts of use, (2) have a characteristic ‘duality’ of enabling and constraining effects, and (3) are configurable in that they can be understood differently in particular settings (kling et al. ). the idea of a network is used as a metaphor in which a stin is defined as:

“a network that includes people (including organisations), equipment, data, diverse resources (money, skill, status), documents and messages, legal arrangements, enforcement mechanisms, and resource flows” (kling et al. ).
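the definition quoted above can be pictured as a small typed graph. the python sketch below is one possible reading and not a representation used in the study: the class name `Stin`, the kind labels in `NODE_KINDS`, and the example nodes and flows are illustrative assumptions rather than data from the case.

```python
from dataclasses import dataclass, field

# Node kinds paraphrased from the quoted definition (Kling et al.);
# the identifiers themselves are illustrative assumptions.
NODE_KINDS = {
    "person", "organisation", "equipment", "data", "resource",
    "document", "legal_arrangement", "enforcement_mechanism",
}


@dataclass
class Stin:
    """A toy socio-technical interaction network."""
    nodes: dict = field(default_factory=dict)  # name -> kind
    flows: list = field(default_factory=list)  # (source, target, what flows)

    def add_node(self, name, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = kind

    def add_flow(self, source, target, label):
        # A resource flow only makes sense between nodes already in the network.
        for end in (source, target):
            if end not in self.nodes:
                raise KeyError(f"unknown node: {end}")
        self.flows.append((source, target, label))

    def neighbours(self, name):
        """Every node that exchanges something with `name`, in either direction."""
        outgoing = {t for s, t, _ in self.flows if s == name}
        incoming = {s for s, t, _ in self.flows if t == name}
        return outgoing | incoming


# Hypothetical example, not taken from the study's data.
net = Stin()
net.add_node("learning designer", "person")
net.add_node("educator", "person")
net.add_node("mooc platform", "equipment")
net.add_node("content licence", "legal_arrangement")
net.add_flow("educator", "learning designer", "course content")
net.add_flow("learning designer", "mooc platform", "course steps")
net.add_flow("content licence", "learning designer", "reuse conditions")
```

treating human and non-human elements as nodes of the same graph mirrors the stin aim of privileging neither the technical nor the social.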
these metaphorical networks help illustrate the complex ways in which technologies are embedded, shaped and used within organisations. stin focuses on the routines and consequences of technology use, rather than processes of adoption or innovation which are the concerns of social construction of technology (scot) and actor network theory (ant).

this study applies the stin strategy to analyse mooc use in universities to reveal “the complexity of introducing new artefacts into existing networks, where outcomes are frequently unpredictable and may propagate through wider networks to have effects often far removed from the original intentions” (walker & creanor ). a set of “heuristics” characterises the stin approach and forms the basis of the study. they are intended to highlight key elements in a socio-technical system (kling et al, ):
• identify a relevant population of system interactions
• identify core interactor groups
• identify incentives and impediments
• identify excluded actors and undesired actions
• identify existing communication forums
• identify system architectural choice points
• identify resource flows
• map architectural choice points to socio-technical characteristics

these heuristics are applied in the analysis of interview and documentary data, and to participant observation accounts made in the field by the researcher.

having used the stin strategy to frame mooc development and implementation as a socio-technical network, the study applies the relevant social theory of third space activity in he to interpret the stin data in relation to educator and ld roles (rq2).

moocs as ‘third space’ initiatives

rq2. what are the roles of educators and learning designers within mooc development and implementation projects?

whitchurch’s concept of third space activity in he is used as a way of exploring the roles of those involved in mooc development, answering rq2.
defining ‘roles’

castells relates and differentiates a ‘role’ from ‘identity’, explaining that “[i]n simple terms, identities organize the meaning while roles organize the functions” of activity (castells , p. ). this interpretation recognizes that an identity or role can be a fluid, “cumulative project” (whitchurch b) rather than one of essentialist, fixed properties. this positions individual roles (and agency) in negotiation with social structures and the roles of others. the current study aims to add to understandings of how roles may change over time and across spaces in organisations as interactions are co-constructed by social and technical factors.

third space environments and processes

in her extensive studies of change in higher education, whitchurch ( a) has identified a ‘third space’ which defies conventional binary definitions of academic (e.g. lecturer) and professional (e.g. marketing, registry) roles in he. she argues that individuals often cross conventional boundaries of departments or functions in he, responding to the demand for heterogeneous project teams in university projects and initiatives (often involving online learning technologies). figure illustrates academic and professional roles in he, and how a ‘third space’ exists outside of their perceived conventional areas of operation.

this study investigates whether mooc projects have these characteristics, as institutions continue to grapple with questions of how moocs fit within existing structures and business models (daniel ; yuan & powell ). whitchurch & law’s ( ) narratives and processes of contestation, reconciliation and reconstruction are used as a conceptual framework through which to understand the “dynamics of third space environments”:
• contestation process: tensions and challenges of working across professional and academic spheres become apparent.
individuals define themselves in relation to ‘rules and resources’ of an institution for pragmatic reasons, but may not privately identify with them.
• reconciliation process: negotiation of difference as the possibility for fruitful collaboration emerges. critical exchange and sharing of multiple perspectives occurs in the context of commitment to overall ideological aims of a project.
• reconstruction process: active participation of individuals toward the creation of a pluralistic environment in which new rules and resources are created in relation to the new space. new identities and networks develop, perhaps alongside new language or extended understandings of certain terms.

the idea of third space activity and the processes operating within them will serve as a lens through which to understand mooc development and the roles of those working on them.

method

case study selection

this paper reports on the first of three case studies of uk universities which produce moocs on a major commercial platform. after conducting a literature review and preliminary interviews with experts in the field of moocs and online learning (n= ) the first of three cases was selected (university a). the three cases were selected using purposive sampling in order to compare between meaningful situations in context (bryman ). university a is a mid-sized uk university, which has produced multiple moocs in partnership with a commercial platform provider.

participants

fourteen participants in mooc development were interviewed, and observation notes taken during site visits. participants included educators, lds and professional staff in senior management, marketing and legal functions. educators were drawn from three different departments, whilst learning designers had experience of working on a range of different online courses and learning technologies, including further iterations of moocs.
research instruments

semi-structured interviews (n= ), participant observations and documentary analysis ( documents) were used to generate credible, triangulated data in the study (bowen ). interviews followed a flexible guide derived from the stin heuristics.

data analysis

thematic analysis was used on all data generated. codes were first derived from the stin heuristics and research questions. subsequently, emergent themes were identified from inductive analysis of data (corbin & strauss ), following stin research by meyer ( ). thematic analysis followed a six-step process as set out by vaismoradi et al. ( ):
1. familiarise with data
2. generate initial codes
3. search for themes
4. review themes
5. define, name and refine themes
6. report

results

in mooc development at university a, lds become a hub for mooc development activity, filtering and mediating the demands of external and internal university stakeholders often embodied through non-human actants, which “influence the range of actions of other actors and actants” (meyer, ). complex patterns of activity at university a are illustrated by the idea of socio-technical interaction networks, and the dynamics at play within them, as shown in figure . figure shows this hub-like position of lds, through which they filter and interpret the demands of other social actors in mooc development. it is, however, difficult to represent the full range of evolving relationships, incentives and pressures at play in a single stin diagram. the sections which follow elaborate on the main themes identified in the analysis (with selected excerpts from documents and interviews), centring on significant actors, motivations, constraints, processes, and architectural choice points in mooc development and implementation.
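one way to make the notion of a "hub-like position" concrete is degree centrality, the share of other actors to which a given actor is directly tied. the python sketch below is a hedged illustration only: the study reports a qualitative stin diagram, not a centrality calculation, and the list of ties is invented for the example rather than taken from the interview data. the function name `degree_centrality` and the variable names are assumptions.

```python
from collections import Counter


def degree_centrality(ties):
    """Share of the other actors each actor is directly tied to.

    `ties` is a list of undirected (actor, actor) pairs with no duplicates.
    """
    degree = Counter()
    for a, b in ties:
        degree[a] += 1
        degree[b] += 1
    others = len(degree) - 1
    if others < 1:
        return {}
    return {actor: count / others for actor, count in degree.items()}


# Invented ties for illustration; NOT the study's data.
ties = [
    ("learning designer", "educator"),
    ("learning designer", "platform provider"),
    ("learning designer", "legal"),
    ("learning designer", "marketing"),
    ("learning designer", "media production"),
    ("educator", "media production"),
]

centrality = degree_centrality(ties)
hub = max(centrality, key=centrality.get)  # actor with the most direct ties
```

in this toy network the learning designer is tied to every other actor, while the educator is tied to only two of the five others, which is the kind of asymmetry the hub metaphor describes.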
significant actors - hubs and peripheral roles

although there was (especially initially) some diversity of approaches across teams producing different moocs, investigation of the actors, actants and groups involved in mooc development revealed the significance of lds and some seemingly ‘peripheral’ actors in the process. educators were of course involved in structuring and selecting course content in “co-creation” with lds, but most actors recognised that lds took a hub-like role in mooc projects, acting as the “linchpin” for activities in which they often “had a very free hand” in decision-making and defining the roles of others. one educator described the ld as the “producer and director of the mooc” who also acted as a “gateway” for the platform provider, interpreting guidelines or requirements of the platform. in contrast, the educator described their own role as “scriptwriter or researcher” (albeit one with final say over matters of academic content/accuracy). representatives of legal, marketing and media production departments also took influential roles in the production process, perhaps leading educators to perceive their role as somewhat diluted as compared to their responsibilities and control over other types of courses. educators took an active part in some aspects of the development process, but had a less consistent presence in decision-making processes regarding legal, quality assurance and marketing issues that seemed to influence wider course design and development processes.

figure : representation of whitchurch’s concept of ‘third space’ activity in he (whitchurch & law, ).

reputational enhancement as motivation and risk

mooc development and implementation at university a can be linked in complex ways with the themes of reputational enhancement and reputational risk. reputational enhancement of the institution was identified by all study participants as a key institutional incentive behind mooc production, and constitutes a significant choice point in terms of the selection of platform provider and of individual courses selected for development (those subjects linked to research strengths of the university). the massive scale, reach, and visibility of mooc course offerings were intended to provide the university with a way to establish itself “at the vanguard of a new era of delivery of education”. it should be noted that the ‘reach’ of moocs was also an incentive for many educators to participate in terms of letting them “spread the word” about research in their areas or widening access to education as “what we should be doing”. moocs on commercial platforms have extensive reach to the public via the web, making the platform and the web itself an important actant in this system.

however, the high profile nature of the activity and commercial aspect of the venture also entailed legal and reputational risks to the institution which seemed to influence course production and actor roles in various and significant ways. lds felt limitations (both externally and self-imposed) on creativity and ambition partly because “it was a very short timescale and a very complicated project”. the “tremendous” legal issues of rights clearances for course materials experienced by educators and lds were, for example, significant complications which occupied much course development time for educators and lds.

a conservative approach to course design

this sense of pressures on time and resources, and perceived reputational risk seemed, over time, to engender a somewhat conservative approach to course design and development, limiting innovation and creativity.
most actors recognise that although some central funding was made available (especially in the early stages of mooc development), much time contributed was “gifted” as “goodwill” to these projects. lds initially attempted to work creatively around limitations in funding and platform affordances in order to “get away from the notion … that it was a content push”. lds also claimed to have influenced the on-going development of the platform in discussions and feedback sessions with platform representatives. some lds introduced online tools which were external to the platform, but found that use of external technologies put pressure on other actors across the network (such as ict support or the legal department). this combination of social and technical factors influenced ld approaches; for one ld, “a design decision is placing a constraint on myself”. the substantial legal restrictions on content permissions also led lds to limit educator access to the platform.

figure : simplified stin diagram of mooc development activity at university a.

media production values also influenced educator contributions to course video content in cases where, during filming of educator contributions, “they [media producers and ld] very quickly said ‘this isn’t going to work, this is too academic, this is too text heavy’”. these examples illustrate a renegotiation of control and responsibility relating to actor roles in course design and content selection, as well as highlighting the technical requirements of mooc production, and the various constraints associated with working through a commercial platform provider. this also to some extent discouraged revision of courses for future reuse, though adaptations to some courses were undertaken.
in this context, representatives of legal, media production, or marketing functions were able to influence course development procedures, course design decisions, and (as the next section will show) the configuration and selection of certain technologies.

architectural choice points
as mooc projects evolved within the institution, various social and technical forces were reified as ‘non-human actants’ in the form of formal governance structures and technical choices. procedures for course development were adapted and formalised over time to ensure all relevant actors had some opportunities for input and review (with the legal department playing a significant and consistent part in the process). in addition, significant technical choices were made which mediated the process of content creation and design. one such choice was the adoption of a proxy site which allowed lds to maintain full control of content on the platform, responding primarily to legal concerns and restrictions regarding content. a further choice was the university’s subscription to a commercial provider of stock images for use in courses. this resource was introduced to resolve tensions between educator demands for accurate imagery on courses on the one hand, and legal or branding considerations involving the institution and/or the platform provider on the other. such images needed to “be acceptable from a scientific standpoint, but also meet the glossy slick standards for putting [images] out on a very public platform”. a final example related to deciding whether individual activities (learning objects) should be designated as open access (publicly searchable, rather than open only to registered course participants). most were made open, but some were deemed unsuitable for this, for example because they dealt with sensitive topics which needed to be contextualised within the wider course material.
discussion
using findings from the stin analysis, this section examines the applicability of the concept of third space activity to mooc production at university a. this will help critically examine understandings of the roles of educators and lds in moocs and how this might relate to such roles in other online learning initiatives.

mooc development as third space activity
mooc projects at university a are complex and require new roles and forms of collaboration. the analysis supports trowler et al.’s ( ) claim that a variety of (institutional and external contextual) forces influence the practices of educators and lds in addition to disciplinary norms and routines in he. the mooc initiative is characterized by “the emergence of broadly-based, extended projects across the university, which are no longer containable within firm boundaries, [which] have created new portfolios of activity” (whitchurch, , p. ). as discussed previously, whitchurch & law ( ) illustrate activities occurring in a third space which are distinct from solely academic or professional functions in he. figure represents this situation in terms of third space activity and moocs at university a.

figure : moocs development as third space activity at university a.

in figure , generalist and specialist functions (management, marketing etc.) and mainstream academics are positioned outside of the third space as either professional or academic roles. such roles have clearer structural boundaries and defined positions within the institution. those in perimeter roles may actively cross boundaries to achieve particular aims.
however, those positioned within the third space (mooc learning designers, mooc project manager, for example) are likely to be involved in particular projects which demand work across professional boundaries, likely focusing on overall institutional or project goals, rather than those of a specific department.

dynamics of third space activity: contestation, reconciliation and reconstruction
to help illustrate the dynamics of mooc projects through the lens of third space activity, table one categorises and interprets findings from the stin strategy as processes of contestation, reconciliation and reconstruction (see ‘theoretical framework’ section). particular areas of tension and change which embody these processes are identified in the stin analysis in areas of mooc design, mooc development processes and funding and resources.

contestation
regarding processes of contestation, the table shows how a sense of uncertainty and ambiguity initially existed around mooc project roles, as part of a “mysterious process”, according to one ld. a case study of learning support functions by whitchurch and law ( ) finds similarly that participants “had to create [their] own role” and “find [their] own way into systems” in the face of the challenges and uncertainty of third space activity. indeed, whitchurch ( ) cites a discussion from an e-learning conference in which participants in third space projects are described as “a unique group who had almost come together because there was a job to be done but it couldn’t quite be articulated”. the stin analysis adds a concern with technical elements within third space environments, for example where perceived limitations of the platform or understandings of openness are contested. attempts to implement technological solutions (using external applications to innovate new learning activities, or applying creative commons licences to content) meet organisational, financial or legal barriers for third space actors.
this demonstrates the social informatics principle of the duality of technology - that it has both enabling and constraining effects in organisations (kling et al., ). these challenges parallel examples used by whitchurch ( ) to illustrate the increasing complexity of learning technologist [sic] roles in he, which far exceed mere provision of technical support for educators.

reconciliation
as lds come to appreciate the possibilities and constraints operating in the socio-technical arrangements of which they are a part, processes of reconstruction are enacted. design decisions require cooperation from other sections of the university (ict support, media production) or the platform provider, entailing the negotiation, critical exchange and invention characteristic of whitchurch and law’s reconciliation phase. in development processes, lds start to explore and adapt their own roles and those of others in recognition of the pluralistic environment of moocs, and develop a problem-solving approach toward the entire process, rather than ‘fire fighting’ individual problems (whitchurch and law, ). this allows them to find new ways to interpret and articulate problems (whitchurch, ), manage conflict over funding, and respond to the underlying sense of reputational risk which influences the activity of actors.

reconstruction
research on third space activity and studies of online learning projects (cowie & nichols, ) have emphasised the need for a focus on relationships (rather than timelines or disciplinary boundaries) in certain he initiatives. whitchurch and law ( ) claim that fostering relationships allows the “formation of a new, plural space” in which reconstruction processes can be rooted. the stin analysis revealed how lds came to define their own roles and those of others as experience of mooc projects developed. management facilitated the creation of new structures and decision-making procedures on matters of resourcing and technical practices.
lds ultimately place limits on activity types, and controls on content selection procedures in recognition of the complexity and reputational risks associated with the project, and the resource and time constraints under which it operates. however, negotiation over budget allocation for mooc development continues, as does exploration of different business models and strategies for mooc development. this demonstrates that the reconstruction phase has perhaps yet to be reached in terms of mooc funding, as reflected in table .

mooc actor roles in the third space

learning designers as hubs
the stin analysis demonstrated that lds take a hub-like role in mooc development (see figure ), and this to some extent extends findings of previous research into online learning more generally. research has highlighted the “brokering” (keppell, ) or “bridging” (cowie & nichols, ) role taken by lds in he projects, spanning different disciplinary communities of academics. as hubs in a third space environment, lds at university a are able to interpret procedures and configure technical resources provided by the university and the platform, thus influencing the roles of others. lds describe mooc projects as “a massive team effort”, involving equal relations and “co-creation”. however, analysis of interview and documentary data suggests lds can in fact command the “final say” in order to “get things done” from their position in the network, reflecting kehm’s ( ) idea of “secret managers”. in a wider sense, it could be argued that lds are taking the responsibility of aligning pedagogy, technology and organisation - crucial considerations in teaching and learning (dron & anderson ) and the successful diffusion of online learning in institutions (jochems et al. ).
perceptions of reduced educator influence in course design
educators perceive a reduction of their influence in mooc course development compared to their activities on other (mainly face-to-face) courses. this may be attributable in part to an “unbundling of faculty roles” in online education (tucker & neely, ) and an increasingly globalised higher education sector more widely (king & bjarnason ). cowie and nichols ( ) emphasise the need for teamwork in online education development projects, noting resistance to this from faculty “wedded as they are to the jack-of-all-trades idea of the master teacher” (moore , p. ). the stin analysis has enabled the identification of a wider network of seemingly peripheral actors to which some conventional educator roles are ‘unbundled’. this seems to be a response to internal and external contextual pressures and incentives, which are linked to the open and massive character of moocs. however, these pressures and incentives shaping mooc development also necessitate the involvement of a range of social actors outside of the academic departments concerned with particular content areas, as outlined below.

influence of peripheral actors
seemingly ‘peripheral’ actors in fact take on significant roles in the mooc development process, influencing the selection, presentation and protocols for sharing of content, and the configuration of the technical tools used in these activities. the idea of significant peripheral actors in implementation of ict systems has been identified in social informatics research (eschenfelder & chase ), and at university a new roles (mooc project manager, facilitation coordinator, asset specialist) were created to facilitate the creation of moocs.

evaluation
application of the stin analytic strategy has generated a useful systems view of mooc production, embedded in the social and organisational context of university a.
the stin findings regarding actor roles and interactions also fit well with the concept of ‘border crossing’ activity in the third space. however, the combination of stin and third space concepts presents challenges in its application. a fundamental principle of social informatics is that technologies are embedded in their social contexts of use, but whitchurch argues that individuals in the third space resist constraints and boundaries in such social contexts, redefining them dynamically. this presents something of a ‘moving target’ for stin studies - into what context exactly are mooc technologies embedded? further, the introduction of new technology itself both changes and is changed by the context and the actors which shape it. the degree of contingency in these circumstances seems high and as such modelling the dynamics of the situation is very challenging.

table : third space processes of contestation, reconciliation and reconstruction in mooc development.
columns: contestation | reconciliation | reconstruction
mooc design: reactions against ‘content’ push approach; reactions against limitations of platform; limitations / absence of institutional procedures; emerging complexity of learning designer role; conflict over content, approach, control; need for cooperation across departments emerges; negotiation of activities, resources, procedures within the institution and with fl; reflection on / response to mooc participant behaviour and feedback; lds redefine own roles and those of others; lds constrain creativity, content and activity types in relation to pressures on resources, time and reputational risk
mooc development: uncertainty and tension regarding development roles, processes and allocation of resources; diverse approaches to mooc projects (among different mooc teams); conflict over power relations between educators, lds, legal and marketing teams; negotiation of roles and decision-making in development processes; need for a problem solving approach is realized; recognition/understanding of perceptions of reputational and legal risks; lds and management establish new organizational and decision making processes for moocs (limited educator input); consolidation of a problem solving approach – lds as relationship builders and brokers; limitations on educator access to content and resources; quality assurance procedures reformulated and standardized
funding/resources: top down funding announced; need for substantial ‘goodwill’ of contributors emerges; ambiguity, tensions and conflict over funding allocation (for mentoring, support, media production); less funding available for course re-runs / course development; negotiation of cost burden between learning support unit and departments; exploration of different business models (recruitment, partnerships, third party funding)

conclusion
the universities uk mooc report ( ) called for greater understanding of “how the development and application of online approaches require changes in the processes and procedures that underpin that mission”. this study demonstrates how the roles of educators and learning designers are strongly shaped by involvement in the complex socio-technical network of mooc production, which is in turn embedded in the particular social and organisational context of university a. the stin findings and third space lens add to current understandings of these roles in highlighting how issues such as legal constraints (or concerns with marketing, media production etc.) can shape organisational structures around moocs and the technical configurations of the tools that contribute to course development and delivery. at university a, learning designers occupy and define a hub-like, ‘third space’ role which straddles academic and professional functions.
complex interactions with seemingly peripheral actors (legal, marketing, media production) shape the course design and development process, to some extent diluting or ‘unbundling’ the conventional ‘jack-of-all-trades’ role of educators, or creating new roles required to satisfy organisational needs and priorities, or technical platform requirements.

these findings raise questions about the implications of introducing courses with these elements of massiveness and openness into he contexts. universities must grapple with competing internal and external pressures and motivations (especially those related to reputational enhancement or risk) in developing and delivering such courses, and this in turn shapes the courses produced and the roles of those who produce them. these findings can inform decision-making on the strategic planning of courses, and course design and development processes. it is not possible to generalise these findings from one case, so future research will compare and triangulate these findings with those of two further case study locations.

competing interests
both authors are employed at universities which are members of the futurelearn consortium from which the case studies are being drawn.

references
anderson, t promise and/or peril: moocs and open and distance learning. commonwealth of learning, pp. – .
bayne, s and ross, j the pedagogy of the massive open online course (mooc): the uk view, edinburgh: the higher education academy. available at: http://www.heacademy.ac.uk/resources/detail/elt/the_pedagogy_of_the_mooc_uk_view [accessed may , ].
becher, t and trowler, p academic tribes and territories: intellectual enquiry and the culture of disciplines, open university press.
boven, d the next game changer: the historical antecedents of the mooc movement in education. elearning papers, .
bowen, g document analysis as a qualitative research method. qualitative research journal, ( ), pp. – . doi: http://dx.doi.org/ . /qrj
bryman, a social research methods, oxford university press.
bulfin, s, pangrazio, l and selwyn, n making “moocs”: the construction of a new digital higher education within news media discourse. the international review of research in open and distance learning, ( ). doi: http://dx.doi.org/ . /irrodl.v i .
campbell, k, schwier, r and kenny, r the critical, relational practice of instructional design in higher education: an emerging model of change agency. educational technology research, , pp. – . doi: http://dx.doi.org/ . /s - - -
caplan, d and graham, r the development of online courses. in t anderson & f elloumi, eds. theory and practice of online learning. athabasca university, pp. – .
castells, m the power of identity, oxford: blackwell.
chao, i t, saj, t and hamilton, d using collaborative course development to achieve online course quality standards. the international review of research in open and distributed learning, ( ), pp. – .
cheverie, j copyright challenges in a mooc environment. educause.
corbin, j and strauss, a basics of qualitative research: techniques and procedures for developing grounded theory, sage publications. doi: http://dx.doi.org/ . /
cowie, p and nichols, m the clash of cultures: hybrid learning course development as management of tension. journal of distance education (online), ( ), pp. – .
czerniewicz, l, glover, m, deacon, a and walji, s moocs, openness and changing educator practices: an activity theory case study. available at: http:// . . . /handle/ / [accessed may , ].
daniel, j foreword to the special section on massive open online courses. merlot journal of online learning and teaching, ( ), pp. i–iv.
dron, j and anderson, t teaching crowds, edmonton: au press.
elton, l task differentiation in universities: towards a new collegiality. tertiary education and management, ( ), pp. – . doi: http://dx.doi.org/ . /bf
eschenfelder, k r and chase, l c socio-technical networks of large, post-implementation web information systems: tracing effects and influences. in th hawaii international conference on system sciences. doi: http://dx.doi.org/ . /hicss. .
fry, j the cultural shaping of icts within academic fields: corpus-based linguistics as a case study. literary and linguistic computing, ( ), pp. – . doi: http://dx.doi.org/ . /llc/ . .
gregory, m and lodge, j academic workload: the silent barrier to the implementation of technology-enhanced learning strategies in higher education. distance education, ( ), pp. – . doi: http://dx.doi.org/ . / . .
hollands, f m and tirthali, d moocs: expectations and reality. full report. may , new york, new york, usa. available at: http://cbcse.org/wordpress/wp-content/uploads/ / /moocs_expectations_and_reality.pdf.
jochems, w, van merrienboer, j and koper, r an introduction to integrated e-learning. in w jochems, j van merrienboer, & r koper, eds. integrated e-learning. london: routledgefalmer, pp. – .
kehm, b strengthening quality through qualifying mid-level management. in m fremerey & m pletsch-betancourt, eds. prospects of change in higher education. towards new qualities and relevance: festschrift for matthias wesseler. frankfurt: iko, pp. – .
kenny, r, zhang, z and schwier, r a review of what instructional designers do: questions answered and questions not asked. canadian journal of learning and technology, ( ). doi: http://dx.doi.org/ . /t jw p
keppell, m j instructional designers on the borderline: brokering across communities of practice. instructional design: case studies in communities of practice, pp. – .
king, r and bjarnason, s the university in the global age, palgrave macmillan.
kling, r, mckim, g and king, a a bit more to it: scholarly communication forums as socio-technical interaction networks. journal of the american society for information science and technology, ( ), pp. – . doi: http://dx.doi.org/ . /asi.
kling, r, rosenbaum, h and sawyer, s understanding and communicating social informatics: a framework for studying and teaching the human contexts of information and communication, information today, inc.
kovanović, v, joksimović, s, gašević, d, siemens, g, and hatala, m what public media reveals about moocs: a systematic analysis of news reports. british journal of educational technology, ( ), pp. – . doi: http://dx.doi.org/ . /bjet.
laurillard, d how should professors adapt to the changing digital education environment? in l engwall, u teichler, & e de corte, eds. from books to moocs? emerging models of learning and teaching in higher education. portland press, pp. – .
literat, i implications of massive open online courses for higher education: mitigating or reifying educational inequities? higher education research & development, ( ), pp. – . doi: http://dx.doi.org/ . / . .
liyanagunawardena, t r, adams, a a and ann williams, s moocs: a systematic study of the published literature – . the international review of research in open and distance learning, ( ), pp. – .
meyer, e socio-technical interaction networks: a discussion of the strengths, weaknesses and future of kling’s stin model. in social informatics: an information society for all? in remembrance of rob kling. pp. – . available at: http://link.springer.com/chapter/ . / - - - - _ [accessed august , ].
meyer, e socio-technical perspectives on digital photography: scientific digital photography use by marine mammal researchers. indiana university.
moore, m g teamwork. american journal of distance education, ( ), pp. – . doi: http://dx.doi.org/ . /
najafi, h, rolheiser, c, harrison, l and håklev, s university of toronto instructors’ experiences with developing moocs. the international review of research in open and distributed learning, ( ). doi: http://dx.doi.org/ . /irrodl.v i .
oliver, m learning technology: theorising the tools we study. british journal of educational technology, ( ), pp. – . doi: http://dx.doi.org/ . /j. - . . .x
seeto, d and herrington, j a design-based research and the learning designer. in l markauskaite, p goodyear, & p reimann, eds. annual conference of the australasian society for computers in learning in tertiary education. sydney university press, pp. – .
selwyn, n looking beyond learning: notes towards the critical study of educational technology. journal of computer assisted learning, ( ), pp. – . doi: http://dx.doi.org/ . /j. - . . .x
siemens, g massive open online courses: innovation in education? open educational resources: innovation in education, pp. – . available at: https://oerknowledgecloud.org/sites/oerknowledgecloud.org/files/pub_ps_oer-irp_ch .pdf [accessed june , ].
smith, p and ragan, t instructional design, wiley.
trowler, p, saunders, m and bamber, v tribes and territories in the st century: rethinking the significance of disciplines in higher education, routledge.
tucker, j and neely, p unbundling faculty roles in online distance education programs. the international review of research in open and distance learning, ( ).
universities uk massive open online courses: higher education’s digital moment? london. available at: http://www.universitiesuk.ac.uk/highereducation/documents/ /massiveopenonlinecourses.pdf.
vaismoradi, m, turunen, h and bondas, t content analysis and thematic analysis: implications for conducting a qualitative descriptive study. nursing & health sciences, , pp. – . doi: http://dx.doi.org/ . /nhs.
veletsianos, g and shepherdson, p a systematic analysis and synthesis of the empirical mooc literature published in – . the international review of research in open and distributed learning, ( ). doi: http://dx.doi.org/ . /irrodl.v i .
walker, s and creanor, l the stin in the tale: a socio-technical interaction perspective on networked learning. journal of educational technology & society, ( ), pp. – .
weller, m the digital scholar: how technology is transforming scholarly practice, bloomsbury. doi: http://dx.doi.org/ . / whitchurch, c a shifting identities and blurring boundaries: the emergence of third space profession- als in uk higher education. higher education quarterly, ( ), pp. – . doi: http://dx.doi.org/ . / j. - . . .x whitchurch, c b beyond “administration” and “management”: reconstructing professional identities in uk higher education. university of london. whitchurch, c and law, p optimising the potential of third space professionals in uk highereducation. available at: http://oro.open.ac.uk/ / [accessed january , ]. yuan, l and powell, s moocs and open education: implications for higher education, available at: http:// pdf.thepdfportal.com/pdffiles/ .pdf. yuan, l, powell, s and olivier, b beyond-moocs- sustainable-online-learning-in-institutions.pdf. white paper cetis. available at: http://publications.cetis. ac.uk/wp-content/uploads/ / /beyond-moocs- sustainable-online-learning-in-institutions.pdf. how to cite this article: white, s and white, s learning designers in the ‘third space’: the socio-technical construction of moocs and their relationship to educator and learning designer roles in he. journal of interactive media in education, ( ): , pp.  – , doi: http://dx.doi.org/ . /jime. submitted: july accepted: october published: november copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /. open access journal of interactive media in education is a peer-reviewed open access journal published by ubiquity press. http://dx.doi.org/ . /irrodl.v i . http://dx.doi.org/ . / http://dx.doi.org/ . /j. - . . .x http://dx.doi.org/ . /j. - . . 
identifying translationese at the word and sub-word level
ehud alexander avner∗ noam ordan† shuly wintner‡
abstract
we use text classification to distinguish automatically between original and translated texts in hebrew, a morphologically complex language. to this end, we design several linguistically informed feature sets that capture word-level and sub-word-level (in particular, morphological) properties of hebrew. such features are abstract enough to allow for the development of accurate, robust classifiers, and they also lend themselves to linguistic interpretation. careful evaluation shows that some of the classifiers we define are, indeed, highly accurate, and scale up nicely to domains that they were not trained on. in addition, analysis of the best features provides insight into the morphological properties of translated texts.
introduction
much research in translation studies suggests that the language of translated texts, often called translationese, exhibits different linguistic properties from the language of original, non-translated texts.
the differences are so marked that automatic (machine learning based) classification techniques can distinguish between original and translated texts with high accuracy, and indeed, several translationese classifiers have been defined for a few european languages. in this work, we employ text classification for the investigation of translationese in a morphologically complex language, namely modern hebrew. this work is, to the best of our knowledge, the first to address automatic identification of translationese in a semitic language; we are also the first to train our classifiers on a corpus of twentieth-century literary texts.
author affiliations:
∗ department für linguistik, universität potsdam, germany
† angewandte sprachwissenschaft sowie Übersetzen und dolmetschen, universität des saarlandes, germany
‡ department of computer science, university of haifa, israel
another novelty of the present work is that we focus on morphological (and, more generally, sub-word) features. an advantage of morphological features is that they lend themselves to interpretation, i.e., to qualitative analysis, as they can potentially capture structural and stylistic differences between translated and original texts. such differences are realized in more analytic languages (like english) on the token level.
we thus set out to design several feature sets that capture word-level and sub-word-level phenomena – specifically morphological properties – of hebrew translationese, and thus focus on the linguistic information encoded in tokens and sub-tokens. as will be shown, using the output of a morphological analyzer does not suffice; more sophisticated feature engineering is called for. we present a novel approach to approximating hebrew word structure by means of alphabet abstraction. this approach, when enhanced with morphosyntactic information (that is, part-of-speech tags), turns out to be one of the most accurate and scalable among the classifiers we define.
the main contribution of this work is the construction of accurate classifiers that identify hebrew translationese and can scale up to domains they were not trained on. this is important not only theoretically; numerous studies have shown that statistical machine translation (smt) systems can benefit a great deal when knowledge of the direction of translation is incorporated into the language and translation models (kurokawa, goutte, & isabelle, ; lembersky, ordan, & wintner, , a, b, ). robust detection of translationese is thus highly relevant for smt.
this is the first work to address the automatic identification of translationese in hebrew (or any other semitic language), and the first to focus on the morphological manifestation of translated texts’ properties. we thus also contribute to better understanding of the translation product. in addition, we show that literary corpora are suitable for the development of scalable identification systems, and introduce a novel approach to approximating hebrew word structure that might be applicable to other semitic languages.
in the next section we survey existing work on the automatic identification of translationese. in section we introduce some relevant characteristics of hebrew orthography and morphology. the experimental setup is described in detail in section . the features we define, and the rationale for using them, are discussed in section . we then list the results of several computational experiments in section and analyze them in section . we conclude with directions for future research.
related work
the term translationese was coined by gellerstam ( ), who compared texts originally written in swedish with texts translated from english into swedish, and concluded that the striking differences between them do not indicate poor translation but rather a statistical phenomenon, a systematic influence of the source language on the target language.
more recent works have suggested that all translations, regardless of source and target language, share certain features characteristic of translated texts (baker, ; toury, ).
baroni and bernardini ( ) were the first to employ text classification to investigate and identify translationese. their comparable corpus is a collection of articles from an italian geopolitics journal; each article is treated as a data point, i.e., as a training instance. the source languages from which articles are translated are assumed to be mainly english, arabic, french, spanish, russian, and other languages. prior to the classification, the corpus is tagged and lemmatized, and proper names are replaced with a dynamic id-marker. the learning method they employ is support vector machines (svms). they experiment with numerous feature sets: frequencies of unigrams, bigrams, and trigrams of words, lemmas, part-of-speech (pos) tags, and a mixed mode in which function words are left untouched in their surface form, while content words are substituted by their corresponding pos tag. they also experiment with combinations of the single svm classifiers trained on the aforementioned feature sets. experiments are run using sixteen-fold cross-validation. single-feature classifiers yield accuracy of at most . % (word unigrams and mixed mode bigrams). trigram models obtain . %- . %. the worst classification model is pos unigrams with accuracy of . %. the best classifier ( . %) is a combination of five models: lemma unigrams and bigrams, unigrams and bigrams of the mixed representation, and pos trigrams. good features include function words and morphosyntactic categories in general, and personal pronouns and adverbs in particular. the accuracy of some classifiers is said to outperform human judgment.
kurokawa, goutte, and isabelle ( ) identify translationese in english and canadian french, and show what impact their findings have on machine translation systems.
the corpus they use is a large portion of the canadian hansard, transcripts of the canadian parliament proceedings. following baroni and bernardini ( ), they produce four different representations of the corpus: surface forms, lemmas, pos tags, and a mixed representation. they, too, train svm classifiers using n-gram frequencies: –to– -grams of pos tags and of the mixed representation, and –to– -grams of surface forms and lemmas. classification is performed on blocks of text of varying lengths and on sentences. using ten-fold cross-validation, the best classification results are just below % accuracy (for blocks) and % (for sentences). both these results are achieved by svms trained on word bigram frequencies. classifiers focusing on linguistic patterns, i.e., pos tags and the mixed representation, yield around %. the authors find that “[g]lobally, the relationship between the feature representations is clear: word > lemma > mixed > pos,” and that “there seems to be an optimal n-gram length: bigram[s] for words and lemmas, trigrams for pos and mixed” (p. ). finally, kurokawa et al. show that the direction of translation has an impact on smt: translation systems whose direction matches that of the training data perform better than systems going in the opposite direction.
ilisei, inkpen, pastor, and mitkov ( ) train and test their system on a translated and non-translated spanish technical and medical dataset. they set out to go beyond the practical purpose of developing a classifier for translationese and “explore the characteristic [universal] features which most influence the translated language” (p. ). specifically, they are interested in the contribution of features designed to capture the simplification hypothesis (blum-kulka & levenston, ; cf. also baker, ) to the identification of translationese.
according to this hypothesis, outputs of translators are less complex in terms of grammar, vocabulary, etc., than the source texts they render. ilisei et al. propose various such ‘simplification features’: average sentence length, sentence depth (i.e., parse-tree depth), and lexical richness (type-token ratio), among other features. they compare several classification algorithms; the classifiers are trained on pos frequencies, including and excluding the simplification features. they find that removing the simplification features leads to decreased accuracy, and lexical richness is found to be the most informative feature. the best accuracy, . %, is obtained by an svm classifier. ilisei and inkpen ( ) apply similar methods to romanian newspaper articles and obtain similar results.
popescu ( ) studies english translationese at the character level. his corpus consists of book-length literary works, most of them from the nineteenth century. the subcorpus of original english contains works written by british and american authors; the translation subcorpus contains works translated from french and works translated from german. in the present work, we, too, train our classifiers on a literary corpus. unlike popescu, we strictly use twentieth-century literature. the features popescu extracts are simply character -grams, irrespective of word and sentence boundaries. classification is performed on the book level (i.e., the training and testing instances are complete books), using svms and ten-fold cross-validation, and achieves virtually % accuracy. however, when the svm is trained on british english original texts and on translations from french, but tested on american english and on translations from german, the accuracy drops to . %, implying that the classifiers are overfitting. popescu repeats the previous experiment, this time eliminating from the feature space all -grams that the french original texts and their translated counterparts share.
the accuracy obtained this time is . %. the advantages of the character n-gram approach are obvious: it is language-independent, does not presuppose any language processing tools, and seems to promise relatively high classification accuracy. we hypothesize that character n-grams capture morphological features, and that such features could also be captured with n-grams shorter than .
koppel and ordan ( ) identify translationese, but also detect the source language of translated texts. they work on english translated from several languages (finnish, french, german, italian, and spanish), using the europarl corpus, which records the proceedings of the european parliament (koehn, ). the feature set in all their experiments is a list of function words, and the learning method is bayesian logistic regression. training is done on , chunks of , words each, half of which are original english, the other half translated english (where each source language constitutes exactly one fifth of the translated data). using ten-fold cross-validation, they identify translationese with . % accuracy. the source language is correctly classified in . % of the chunks. in addition, they train a classifier on europarl and test it on a different corpus containing newspaper articles in original english and in english translated from greek, hebrew, and korean. this classifier obtains . % accuracy. in the opposite setting, i.e., when training on newspaper articles and testing on europarl, the result is worse – namely . %. we adopt their setup, working with -word chunks. we, too, test our classifiers on datasets very different from the ones they are trained on. unlike koppel and ordan, one of our goals is to find meaningful feature sets that are able to scale up to out-of-domain corpora.
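The character n-gram features discussed above (the approach attributed to Popescu) can be illustrated with a short sketch. This is an assumed, minimal Python illustration and not code from any of the cited studies: the function name `char_ngram_frequencies` and the use of relative rather than raw frequencies are illustrative choices. N-grams are taken over the raw character stream, so they may span word boundaries.

```python
from collections import Counter
from typing import Dict


def char_ngram_frequencies(text: str, n: int) -> Dict[str, float]:
    """Relative frequencies of all overlapping character n-grams in text.

    Spaces and punctuation are kept, so n-grams can cross word boundaries.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    total = len(text) - n + 1  # number of overlapping n-grams
    if total <= 0:
        return {}
    counts = Counter(text[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}


# Illustrative input only; any string (e.g. a whole book) can be passed.
trigram_freqs = char_ngram_frequencies("the cat sat", 3)
```

Each text (or chunk) would then be represented by such a frequency map over a fixed n-gram inventory before being handed to a classifier.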
most recently, volansky, ordan, and wintner (forthcoming) distinguish between original english and english translated from ten source languages (danish, dutch, finnish, french, german, greek, italian, portuguese, spanish, and swedish). they, too, use parts of europarl as their corpus: million tokens of original english and , tokens from each of the source languages are partitioned into chunks of , tokens which are then used as training instances. the classification algorithm is svm; testing is done using ten-fold cross-validation. similar to ilisei et al. ( ), volansky et al.’s objective goes beyond the development of a working identification model. they set out to test several hypotheses – e.g., the simplification hypothesis mentioned above – that have been proposed by translation scholars as translation universals. to this end, they define several feature sets that reflect these universals. following popescu ( ), they train classifiers using frequencies of character n-grams. in addition, they employ a precompiled list of prefixes and suffixes as a feature set approximating english morphological structure. the former classifier achieves % accuracy, the latter %.
like volansky et al. and ilisei et al. ( ), our objective is not only to design accurate identification models, but also to explore computationally the properties of translated texts that distinguish them from original ones. however, in contrast to volansky et al. and ilisei et al., we work on a morphologically rich language with nonconcatenative morphology, and focus exactly on the word-level and sub-word-level features that cannot be investigated in more analytic languages such as english. like koppel and ordan ( ), and unlike all other works mentioned above, we test our classifiers on datasets from domains different from the one they are trained on.
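The affix-based feature set mentioned above (a precompiled list of prefixes and suffixes) could be computed along the following lines. The affix lists in this sketch are tiny, hypothetical placeholders, since the list used by Volansky et al. is not given in this text; the normalization by the number of tokens is likewise an assumption.

```python
from typing import Dict, List

# Hypothetical affix lists, for illustration only.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("ing", "ly", "ness")


def affix_features(tokens: List[str]) -> Dict[str, float]:
    """Share of tokens that start or end with each listed affix."""
    n = len(tokens) or 1  # avoid division by zero on empty input
    features: Dict[str, float] = {}
    for prefix in PREFIXES:
        # Require a non-empty remainder so the affix is not the whole token.
        hits = sum(t.startswith(prefix) and len(t) > len(prefix) for t in tokens)
        features["prefix:" + prefix] = hits / n
    for suffix in SUFFIXES:
        hits = sum(t.endswith(suffix) and len(t) > len(suffix) for t in tokens)
        features["suffix:" + suffix] = hits / n
    return features


example_affix_features = affix_features(["redo", "undoing", "quickly", "sadness", "the"])
```

Such string matching only approximates morphology (for example, "red" would count as carrying the prefix "re"), which is consistent with the text describing the list as an approximation of morphological structure.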
hebrew orthography and morphology
the hebrew alphabet is a -letter abjad (daniels, ) for which two main standards exist: the full script, in which vocalization diacritics decorate words, thereby explicating all vowels, and the lacking script, in which these diacritics are missing, and the two letters w and i are occasionally added to represent some, but not all, of the vowels which would otherwise be represented by diacritics. the overwhelming majority of modern hebrew texts – and all the texts our classifiers are trained and tested on – are written in the lacking variant. in this script, most of the five hebrew vowels are left underspecified: /e/ and /a/ are usually not explicated (when they are, they are typically realized by the characters a and h); /o/ and /u/, when specified, are realized by the same character, w, which is also used for the consonant /v/. similarly, the single character i is used both for the vowel /i/ (when it is specified) and for the consonant /y/. the four characters which represent (some of) the vowels – a, h, w, i – are traditionally known as matres lectionis; they also play a significant role in hebrew derivational morphology.
footnote: for the sake of readability, a straightforward ascii transliteration of hebrew is used in this study. the characters, in hebrew alphabetical order, are abgdhwzxtiklmnsypcqr$t.
many particles are realized as prefixes attached to the words immediately following them. these include the definite article h, the coordinating conjunction w (“and”), four of the most frequent prepositions – b (“in”), k (“as”), l (“to”), and m (“from”) – and subordinating conjunctions, such as k$ (“when”) and $ (“that/which”). when one of the prepositions b, k, or l precedes the definite article h, the latter is assimilated with the prefixing preposition and the resulting surface form becomes ambiguous with respect to definiteness.
hebrew has a rich, partly nonconcatenative morphology.
derivational processes are based on a root-and-pattern system; inflectional processes are mainly carried out by suffixation, but also involve prefixes, circumfixes, and pattern shifts.
an example of the root-and-pattern mechanism is the system of seven hebrew binyanim, i.e., the verbal patterns. each pattern (binyan) is traditionally associated with a certain (vague, and thus not always predictable) meaning. for example, the hif’il pattern productively generates causative variants of verbs; similarly (and, to a lesser extent), hitpa’el is used for reflexives; three patterns systematically express the passive voice of three counterpart patterns, namely, nif’al (vs. pa’al), pu’al (vs. pi’el), and huf’al (vs. hif’il). consider the three-letter root k.t.b, broadly denoting the notion of writing. when combined with the pa’al pattern CCC, this root yields the form ktb (“write”); when combined with the nif’al pattern, nCCC, traditionally the passive counterpart of pa’al, it yields nktb (“being written”); when combined with the hif’il pattern, hCCiC, traditionally denoting causativization, it yields hktib (“dictate”). these morphological patterns are mechanisms for expressing constructions that require syntactic or lexical solutions in other languages. similarly, a root can be combined with nominal patterns, for instance, hktbh (“dictation”) is the result of combining the k.t.b root with the nominal pattern hCCCh which typically produces nominalized forms of the verbal pattern hif’il. the C’s in the pattern represent the slots for the three consonants of the root.
verbs inflect for number, gender (masculine and feminine), person, and tense. kwtbt, for instance, is the present tense feminine singular realization (underspecified for person) of the pa’al verbal pattern. the hebrew tense system is relatively simple, with three tenses and no aspectual distinctions.
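The root-and-pattern combination described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the helper name `apply_pattern` is hypothetical, roots are written with dots between consonants (as in k.t.b), and slots are written as an uppercase `C` so that they cannot be confused with the transliteration letter c. Real Hebrew morphology involves further alternations that this toy interdigitation ignores.

```python
def apply_pattern(root: str, pattern: str, slot: str = "C") -> str:
    """Fill the slots of a pattern with the consonants of a root, in order."""
    consonants = root.split(".")
    if pattern.count(slot) != len(consonants):
        raise ValueError("pattern slots and root consonants do not match")
    remaining = iter(consonants)
    # Non-slot characters (e.g. the h and i of hCCiC) are copied unchanged.
    return "".join(next(remaining) if ch == slot else ch for ch in pattern)


# The four patterns used as examples in the text, applied to the root k.t.b.
for name, pattern in [("pa'al", "CCC"), ("nif'al", "nCCC"),
                      ("hif'il", "hCCiC"), ("nominal", "hCCCh")]:
    print(name, apply_pattern("k.t.b", pattern))
```

The loop reproduces the forms given in the text for this root: ktb, nktb, hktib, and hktbh.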
note that the present tense is actually a participle form that can also be used as an adjective or a noun (akin to -ing forms in english). nouns inflect for number, adjectives for number and gender, numerals for gender.
nouns, adjectives, participles, numerals, and quantifiers have two morphologically (and often phonologically) distinct forms: the unmarked absolute state and the construct state. the latter is used, in the case of nouns, adjectives, and participles, for the construction of compounds. it is also involved, in the case of nouns, in possessor-possessed constructions (i.e., noun compounds are, in point of fact, lexicalized possessive constructions). for example, $mlh (“dress,” absolute state) vs. $mlt (“dress,” construct state), as in $mlt klh (“dress,” construct + “bride,” absolute → “wedding dress,” but also “a dress of a bride”). the fact that, in the lacking script, approximately half of the construct forms are orthographically identical to the absolute forms adds substantially to the ambiguity of hebrew word forms.
there are several ways to express possessiveness in hebrew, one of which is by attaching pronominal suffixes that inflect for number, gender, and person. the base form for these constructions is the construct state. for example, the first person singular possessive suffix i can be attached to $mlt (“dress”, construct) to yield $mlti (“my dress”). the other possessive constructions go beyond the word level and involve the preposition $l (“of”).
the morphological complexity, the deficient orthography, and the affixation of frequent particles bring about a system that produces highly ambiguous texts: “first, the first and last few characters of each token may be either part of the stem or bound morphemes (prefixes or suffixes). second, the lack of explicitly marked vowels yields many homographs” (fabri, gasser, habash, kiraz, & wintner, ). hence, word segmentation is not straightforward, pos tagging is “a much messier task [...]
than in other languages, such as english” (koppel, mughaz, & akiva, ), and automatic morphological analysis is an immensely complex enterprise.
experimental setup
this is a corpus-based study; we use several corpora, which we automatically pre-process, to train and test machine learning based classifiers. we now explain the experimental setup and our methodology in more detail.
corpus design
the main corpus used in this study is a subset of a corpus compiled by jason perry with the purpose of comparing translated and non-translated hebrew texts. it is a monolingual comparable corpus that consists of first chapters of books published in hebrew in the last decade (usually only the first chapter from each book). perry downloaded the texts from a public internet site aimed at exposing readers to newly published books.
the corpus is annotated with the following metadata information: author name, book title, and source language. we add the following fields: translator name, genre (prose, play, verse, children’s literature, etc.), and the author’s year of birth. to allow for better comparability, we restrict our training data to texts written by authors born after , to english as the source language (of the translated texts), and to prose as the genre.
the subcorpus of original hebrew literature (henceforth o_heb) contains book chapters by authors. the translation subcorpus (t_en) contains book chapters by authors translated from english by translators. each subcorpus contains exactly , tokens.
o_heb and t_en are the corpora we train our classifiers with; henceforth, we occasionally refer to them as the training data; together, they constitute the inc [in-corpus] experimental scenario introduced in section .
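The selection criteria for the training data (prose only, English as the source language of translations, authors born after a cutoff year) amount to a simple metadata filter. The sketch below assumes a hypothetical record type `Chapter` with field names invented for illustration, and it leaves the cutoff year as a parameter because the year is not legible in this text; none of these names come from the original corpus.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chapter:
    """Hypothetical metadata record for one book chapter."""
    author: str
    title: str
    genre: str
    author_birth_year: int
    source_language: Optional[str] = None  # None means originally Hebrew
    translator: Optional[str] = None


def select_training_chapters(
    chapters: List[Chapter], min_birth_year: int
) -> Tuple[List[Chapter], List[Chapter]]:
    """Split eligible chapters into (original Hebrew, translated from English)."""
    original: List[Chapter] = []
    translated: List[Chapter] = []
    for chapter in chapters:
        # Keep only prose by authors born after the cutoff year.
        if chapter.genre != "prose" or chapter.author_birth_year <= min_birth_year:
            continue
        if chapter.source_language is None:
            original.append(chapter)
        elif chapter.source_language == "english":
            translated.append(chapter)
        # Chapters translated from other languages are left out of training.
    return original, translated
```

Chapters translated from other source languages are not discarded entirely in the study; they can serve as separate test data, as with the French translations described later.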
footnotes: http://hebrewcorpus.nmelrc.org/, accessed june . http://text.org.il/. in this study, punctuation marks are counted as tokens.
recall that most work on the identification of translationese has been carried out on corpora containing data from restricted domains, e.g., geopolitics (baroni & bernardini, ), parliament proceedings (kurokawa et al., ; koppel & ordan, ; volansky et al., forthcoming), or technical and medical data (ilisei et al., ). we use a corpus containing twentieth-century literary texts in this study, first and foremost, because we are not aware of any other large-scale comparable hebrew corpus containing texts from domains such as the above. in fact, we are not aware of any other large-scale comparable hebrew corpus of any domain. there are, moreover, several benefits and interesting aspects to using a literary corpus:
1. the quality of translation is arguably very high. not only can literary translators be assumed to be very competent translators, the common practice is that literary translations pass through an editorial cycle (copyediting, proofreading) before actually being published.
2. the multitude of authors and translators in our training data ensures that the classifiers do not learn to identify a specific author or translator but rather the phenomenon of hebrew translationese.
3. a corpus of contemporary literature could be easily expanded for future research: in the age of the internet the majority of publishers make excerpts of newly published books available online.
4. metadata, such as the source language of the text, the birth date of the author, or the name of the translator, can normally be extracted with relative ease.
5. identifying translationese by training on a corpus containing twentieth-century literature affords us an opportunity to explore a domain which very little work has been done on (one exception is popescu ( ), whose corpus, however, consists of nineteenth-century literature).
in fact, classifying literary translations is probably a harder task than classifying other genres, both because of the diversity of the texts and because much effort is invested in the translation of literary works, and more freedom is given to the translator to render the text as similar as possible to original writing. this is in contrast to more “technical” translations, which are often done under strict deadlines, resulting in more source-influenced, less fluent translations. indeed, addressing a different but related task, namely, source language detection, lynch and vogel ( ), who train and test their models on nineteenth-century literary texts, state that they believe that literary translations “will pose a greater challenge [...] than the europarl corpus, which is more homogeneous in style” (p. ).
6. finally, classifiers trained on a literary corpus might be able to scale up to scenarios which they are not trained on. baroni and bernardini ( ), referring to their corpus, state that “[a] drawback of having a very uniform, very comparable corpus is that the results of our experiment may be true only for the specific genre and domain under analysis” (p. ). we conjecture that a corpus of contemporary literature, being more heterogeneous than other closed-domain corpora, is suitable for the development of robust identification models.
we use additional datasets in order to check to what extent our classifiers scale up to scenarios they are not trained on. first, we test whether our models can predict translationese within the same domain (literature), but on texts translated from a different source language (french). secondly, we test how well the classifiers predict translationese in a different domain, but on texts translated from the same source language as the training data (i.e., english). this last task is notoriously difficult (argamon, ).
none of the works discussed in section , with the exception of koppel and ordan’s ( ), test their systems on a domain different from the one their systems are trained on.
footnote: popescu ( ) tests his model on texts translated from a different source language but in the same domain.
in-domain corpus
we construct a small corpus containing book chapters translated from french, rather than from english, extracted from the preliminary full corpus compiled by perry. we refer to this corpus as the in-domain dataset (ind_fr). it includes book chapters by authors translated by translators, totaling , tokens.
out-of-domain corpora
we use two additional small datasets: one containing journal and newspaper articles dealing with social science topics, often in a popular science style, the other containing journal and newspaper articles from the economics domain. we refer to these corpora as the out-of-domain datasets (ood-soc[ial], ood-eco[nomics]). ood-soc consists of translated articles and original articles, and ood-eco of translated and original. the number of authors and translators in these datasets is unknown; however, since the texts come from several different newspapers and journals, it is safe to assume that no one author (or translator) is overrepresented. each of the ood datasets contains , tokens of texts translated from english and , tokens of texts originally written in hebrew.
morphological analysis and chunking
after applying a minimal cleaning script to the data, the corpora are first tokenized and then morphologically analyzed using the mila tools (yona & wintner, ; itai & wintner, ). the morphological analyzer is a rule-based computational implementation of the inflectional morphology of modern hebrew, based on a lexicon of almost , lemmas. the morphological processor produces, for each token, its pos category. then, according to the pos, several other properties are specified. for verbs, e.g., these properties include binyan (verbal pattern, cf.
section ), gender, number, person, and tense. in addition, the morphological analyzer segments tokens by specifying the sequence of affix particles attached to them, as well as the form and function of these affixes. as an example, figure depicts the output of the morphological processor on the word forms wk$htxlti וכשהתחלתי (“and when i began”) and sprihm ספריהם (“their books”). observe that in the first example, two prefixes are identified (ו, w “and”, followed by כש, k$ “when”), followed by the lemma htxil “begin”. then, the main pos is listed (verb), followed by a sequence of morphological features. the second example shows also a suffix, יהם (ihm), denoting a possessive pronoun in third person, masculine, plural (“their”). we come back to these examples in section . below.
footnote: the tagset includes tags: adjective, adverb, conjunction, copula, existential, foreign, interjection, interrogative, modal, mwe (multi word expression), negation, noun, numberexpression, numeral, participle, preposition, pronoun, propername, punctuation, quantifier, title, unknown, url, verb, and wordprefix.
the output of the analyzer is disambiguated using the tagger of bar-haim, sima’an, and winter ( ): this is a stochastic tagger, trained on newspaper articles, and it ranks the analyses produced by the analyzer by assigning a score to each analysis (typically, ‘ . ’ for the correct analysis, ‘ . ’ for the incorrect ones). unfortunately, the tagger is not always able to produce a unique top-ranked candidate; in cases where the tagger returns more than one optimal candidate, we simply pick the optimal result appearing first in the output. the reported accuracy of the pos tagger is . %, but this evaluation is based on cross-validation experiments. as is well-known, out-of-domain evaluation of similar tasks usually reveals poorer performance. this is indeed our observation: on our corpus, the accuracy of tagging
seems to be lower, although we do not have precise data. in particular, the tagger often fails to distinguish between verbal analyses that differ in the binyan only. we also do not have data on the accuracy of the tagger on identifying any specific feature, but a different hebrew tagger (lembersky, shacham, & wintner, ), reporting similar overall accuracy on the same test set, reports over % accuracy on main pos, around % for number, gender, and person, over % on tense, etc.
figure : output of the morphological analyzer on the tokens wk$htxlti וכשהתחלתי (“and when i began”) and sprihm ספריהם (“their books”).
once a corpus is tokenized, analyzed, and tagged, it is partitioned into chunks, each containing , tokens; there is no correlation between the number of chunks we extract from a corpus and the number of texts (i.e., chapters or articles) this corpus contains. that is to say, each chunk contains exactly , tokens, regardless of chapter/article and sentence boundaries. since the main objective of this work is to observe word-level and sub-word-level phenomena in general, and to learn from morphological features packaged in single words in particular, we do not alter the size of instances; we treat each corpus (translated and non-translated) as a single continuous stream of data. we believe that , -token chunks strike a balance between having enough chunks per corpus, on the one hand, and having big enough chunks to avoid problems of sparsity for certain rare word-level and sub-word-level features, on the other hand. since none of our classifiers goes beyond the word level, sentence boundaries are irrelevant.
footnote: punctuation marks count as tokens.
table summarizes the properties of the corpora used in the study.
corpus | chunks | tokens | texts | authors | translators | split
o_heb | | , | | | | all original
t_en | | , | | | | all translated from english
ind_fr | | , | | | | all translated from french
ood-soc | | , | | | | % orig., % trans. from eng.
ood-eco | | , | | | | % orig., % trans. from eng.
table : the literary corpora: training (ohe b and te n) and test (ind f r, translated from french); and the out-of-domain test corpora.

methodology

the core of our experimental methodology is the development of classifiers that can automatically distinguish between instances belonging to different classes (in our case, there are only two classes: translated and non-translated texts). the classifiers are trained on a corpus containing training data, that is, instances of the classes to be distinguished, each labeled a priori as belonging to one of the classes. each of these instances is represented as a feature vector, a set of numeric attributes designed by the developers of the classifier to capture certain characteristics of the classes. the values of these features are extracted from the training instances (e.g., frequencies of certain words in an instance; see next section). during training, the classifiers learn to distinguish between the labeled instances, thereby assigning different weights to the features. a trained classifier can then be applied to unseen test instances and determine their class. if the features selected to represent the instances are meaningful, the classifier should be accurate when applied to test data.

such methodologies have been extensively and successfully used for the automatic classification of texts according to, e.g., topic or genre (sebastiani, ). they have been similarly used for automatic author attribution, i.e., “inferring characteristics of the author from the characteristics of documents written by that author” (juola, , p. ), for example, for identifying authors of newspaper articles (diederich, kindermann, leopold, & paass, ), or for determining the gender of a document’s author (koppel, argamon, & shimoni, ).

support vector machine (svm) is the classification algorithm employed in all our experiments.
svms “probably represent the most successful technology for text categorization today” (witten & frank, , p. ), and indeed, svms have been widely and successfully used for identifying translationese (e.g., baroni & bernardini, ; kurokawa et al., ; popescu, ; volansky et al., forthcoming). specifically, we apply the sequential minimal optimization algorithm (smo) for training svms (platt, ), using the default linear kernel, as implemented in the weka machine learning toolkit (hall et al., ).

all the identification models are trained and tested on the corpus containing ohe b and te n in a ten-fold cross-validation procedure (we later refer to this experimental scenario as the inc [in-corpus] scenario). the obtained svm classifiers are then also tested on the three additional datasets discussed above (ind f r, ood-soc, and ood-eco). for all the experiments we report accuracy, namely the percentage of text chunks the classifier correctly classifies. in section we analyze the resulting classifiers, exploiting the values of the weights assigned by svms to the features used for classification.

feature design

the essence of text classification is the design of the feature vectors by which the text data are represented. as we do not go beyond the word level in this study, we design several feature sets aimed at capturing linguistic (specifically morphological) characteristics of surface tokens and sub-tokens. in this section we describe and motivate these feature sets.

token-based features

we use two different kinds of token-based features: word unigrams extracted from the training data, and a precompiled list of function words extracted from external corpora. in both settings, a list of tokens constitutes the feature vector representing a chunk, and feature values are the frequencies of these tokens in the chunk.
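The chunking and token-frequency representation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `split_into_chunks` and `frequency_vector` are hypothetical, the toy stream and vocabulary are made up, and dropping a trailing partial chunk is our assumption (the text only says that every chunk has exactly the same number of tokens).

```python
from collections import Counter
from typing import Iterable, List, Sequence


def split_into_chunks(tokens: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Cut a token stream into consecutive chunks of exactly chunk_size tokens.

    Text and sentence boundaries are ignored. A trailing remainder shorter
    than chunk_size is dropped (an assumption of this sketch).
    """
    n_full = len(tokens) // chunk_size
    return [list(tokens[i * chunk_size:(i + 1) * chunk_size]) for i in range(n_full)]


def frequency_vector(chunk: Iterable[str], vocabulary: Sequence[str]) -> List[int]:
    """Represent a chunk by the frequency of each vocabulary item in it."""
    counts = Counter(chunk)
    return [counts[word] for word in vocabulary]


# Toy data: ten made-up tokens, chunks of four tokens each.
stream = "w a b a w c a b w a".split()
chunks = split_into_chunks(stream, chunk_size=4)
vocabulary = ["a", "w"]  # stands in for a word-unigram or function-word list
vectors = [frequency_vector(chunk, vocabulary) for chunk in chunks]
```

The same `frequency_vector` helper would serve both token-based settings; only the vocabulary changes (all training-data words in one case, a precompiled function-word list in the other).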
word unigrams: we compile a list of all the words in the training data, i.e., in the union of te n and ohe b (excluding punctuation), and use each word as a feature. like volansky et al. (forthcoming), we treat this experiment as a sanity check, since, being highly content-dependent, this feature set is expected to yield very good classification results when tested on the training corpus in a ten-fold cross-validation scenario, but not to scale up to external domains.

function words: since mosteller and wallace’s seminal work on the federalist papers ( ), function words have been extensively and successfully used in text classification. this approach to feature design has also proven instrumental in identifying translationese, albeit not very scalable (koppel & ordan, ). these words “carry little meaning by themselves [...] but [...] define relationships of syntactic or semantic functions between other (‘content’) words in the sentence [... they] are therefore largely topic-independent and may serve as useful indicators of an author’s preferred way to express broad concepts” (juola, , p. ). being highly frequent, these words exist in every chunk of text, regardless of its size, and since they are so frequent, it is safe to assume that more often than not text producers do not control the use of these words, i.e., do not select them consciously.

unlike english, however, hebrew text tokens often contain more than one lexical item, and many typical function words, such as prepositions, are concatenated to other words belonging to other parts of speech (cf. section above). hence, closed sets containing several hundred function words, like the ones used for english, cannot be compiled for hebrew. the list we use in this study contains hebrew words belonging to the following categories: quantifiers, pronouns, prepositions, negation words, interrogative markers, existentials, copulas, and conjunctions.
for each word, it contains all possible inflections, but only those surface forms that appear at least once in a collection of six large external hebrew corpora. the list (which is available from mila, http://mila.cs.technion.ac.il) contains , items. due to the morphological and orthographic challenges hebrew poses (e.g., the fact that many function words are realized as affixes), classifiers based on function words are not expected to perform on our data as well as they do on english texts.

features that reflect morphological aspects

since we are interested in investigating the morphological aspects of translationese, we define a set of features that reflect such information. to this end, we use the output of the morphological processor mentioned above (section . ). based on the processor’s output (cf. figure ), we define the following feature sets:

pos: while pos tags may be considered syntactic rather than morphological features, we mainly employ them, as will be described below (section . ), in order to enhance the performance and sophistication of other feature sets. we also use them, like ilisei et al. ( ) do, as a baseline for testing the contribution of other features, in our case the ‘pure’ morphological features; i.e., we first train a classifier based solely on the pos tags in the tagset, and then test this classifier with each of the morphological features added to it, and also with combinations thereof. this should give us a good indication of the contribution made by each morphological feature. for example, in figure , the value of pos is verb in the first example, and noun in the second.

binyan: the features in this category are the seven hebrew verbal patterns, the binyanim (cf. section above). since the verbal patterns have no counterpart in english, the source language of our study, we expect the frequencies of at least some of them to differ between original and translated texts. in figure , the value of binyan in the first example is hif’il.
status: the two features in this category are applicable to nouns, adjectives, participles, numerals, and quantifiers: the features construct and absolute reflect the construct and the absolute states, respectively (cf. section above). since english does not have a form which is equivalent to the construct state, we expect the distribution of constructions involving the construct state (e.g., possessive noun-noun constructions) to differ between ohe b and te n.

possessive: this feature set contains only one feature indicating whether a possessive suffix is attached to the token. since hebrew has several ways of expressing possessiveness, one of which is by means of attaching possessive suffixes (cf. section ), we expect the distribution of these suffixes to be different across ohe b and te n. in figure , a possessive suffix is attached to the second example.

prefix_ , prefix_ : even though hebrew words can take several prefixes, it is rarely the case that more than two prefixes are attached to one token. we therefore consider only the first two prefix positions as feature categories. we expect them to convey significant classification cues, since they correspond to function words: recall that the definite article, the conjunction and, and numerous prepositions are realized as prefixes in hebrew. in figure , the first example exhibits two prefixes: the value of prefix_ is conjunction and the value of prefix_ is temporalsubconj.

the values of the morphological and pos feature sets are the frequencies of those features within a chunk. we also experimented with the logarithm of the frequencies as the actual values of features, but this turned out to be beneficial only for two classifiers, namely for the pos and the binyan classifiers. we therefore use log frequencies for these two feature sets.

note that we do not define a feature for every coordinate of the morphological analysis provided by the analyzer.
for example, we find gender, tense, and number to be less relevant for our task. first, we consider them more lexical than morphological, not least due to hebrew’s grammatical gender. second, these features typically do not reflect translators’ decisions, as they are imposed by the source text (unlike, e.g., possessive or passive constructions, where translators have several alternatives to choose from).

features based on character n-grams

following popescu ( ), we experiment with character n-grams. the feature set he designs contains character -grams, irrespective of word boundaries. unlike him, we experiment with -grams through -grams, as well as with the union of all of them. presumably, longer n-grams would capture many lexical phenomena, and would thus yield accurate in-domain but inaccurate out-of-domain classifiers (hebrew words tend to be rather short due to lack of vowels). we also do not go beyond the word level; that is, we calculate n-grams occurring only within one token, since n-grams spanning several tokens are expected to capture syntactic properties of the language, whereas the focus of this study is on morphological features.

inspired by koppel et al. ( ), who use hebrew and aramaic prefixes and suffixes as features for the classification of rabbinic manuscripts, we design another feature set: we collect bigrams occurring at word boundaries, i.e., at the beginning and the end of tokens. unlike them, we do not employ a predefined list of suffixes and prefixes. note that since each bigram in this feature set is preceded or followed by a reserved character marking a word boundary (see footnote ), the bigrams in this experiment are, in point of fact, trigrams (in other words, this feature set is a proper subset of the character trigram feature set).
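A minimal sketch of within-token character n-grams may help make the boundary convention concrete. The function names are hypothetical, and returning no boundary bigrams for one-letter tokens is our assumption, since the text does not cover that case. The reserved boundary character `_` follows the paper's own footnote example.

```python
from collections import Counter
from typing import Iterable, List

BOUNDARY = "_"  # reserved character marking a word boundary


def char_ngrams(token: str, n: int) -> List[str]:
    """All character n-grams of one token, with boundaries counted as characters."""
    padded = BOUNDARY + token + BOUNDARY
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def boundary_bigrams(token: str) -> List[str]:
    """The first and last bigram of a token, each with its boundary marker.

    Because the marker is attached, each item is three characters long.
    One-letter tokens yield nothing here (an assumption of this sketch).
    """
    if len(token) < 2:
        return []
    return [BOUNDARY + token[:2], token[-2:] + BOUNDARY]


def chunk_ngram_counts(tokens: Iterable[str], n: int) -> Counter:
    """Frequencies of within-token n-grams over a whole chunk."""
    counts: Counter = Counter()
    for token in tokens:
        counts.update(char_ngrams(token, n))
    return counts
```

Because n-grams are computed per token, no n-gram ever spans two tokens, which matches the decision to stay below the word level.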
footnote: word boundaries are counted as characters. so, for example, the bigrams corresponding to a word like ab are {_a, ab, b_}, where ‘_’ is a reserved character marking a word boundary.

footnote: volansky et al. (forthcoming) apply a similar feature set to the identification of english translationese.

features that approximate word structure

we also design a set of features that reflect, on the one hand, the formal representation of morphological information (i.e., the way morphological features are expressed in the orthography), but, on the other hand, are as content- and domain-independent as possible, that is, features that do not directly reflect lexical information. to this end, we define an abstraction mechanism which is expected to approximate hebrew word structures, e.g., verbal and nominal patterns. the idea is to reduce the hebrew alphabet to a smaller alphabet, allowing symbols in the reduced alphabet to capture sets of characters. we run experiments with three such abstract alphabets (aba), listed here in decreasing order of abstraction:

aba consists of only two symbols: c, representing all consonants, and v, replacing the characters traditionally known as matres lectionis (cf. section ). these characters play a significant role in hebrew derivational morphology, among other things representing some of the vowels. formally: aba := {c, v}, where c represents the consonants b, g, d, z, x, t, k, l, m, n, s, y, p, c, q, r, $, t and v represents a, h, w, i.

aba in this alphabet, c is as above, but v is spelled out. aba thus contains five symbols: {c, a, h, w, i}. not only do the matres lectionis play a significant role in nonconcatenative morphology (e.g., in verbal and nominal patterns), but the prefixes h and w also reflect the definite article and the coordinating conjunction and, respectively.

aba includes ten symbols: {c, a, h, w, i, b, k, l, m, t}, where c stands for all remaining letters.
the spelled-out consonants b, k, l, and m are prepositions which are realized as prefixes. the characters k and m can also reflect other grammatical properties, such as gender, number, possessiveness, and tense. the consonant t participates in the construction of many verbal and nominal patterns; e.g., it is part of the unmarked feminine plural suffix wt.

figure illustrates how the surface token wk$hmcxiqwt (“and when the funny ones [feminine]...”) is represented in each of the three abstract alphabets. the feature values in the aba experiments are (the frequencies of) complete abstracted tokens.

surface | w k $ h m c x i q w t
aba | v c c v c c c v c v c
aba | w c c h c c c i c w c
aba | w k c h m c c i c w t

figure : the three different abstract representations of the surface form wk$hmcxiqwt.

since no language-specific processing tools are necessary in order to create these abstract representations, applying them to other languages with nonconcatenative morphology (specifically arabic) is straightforward.

feature combinations

we define two ways of combining features: disjunction and conjunction. the disjunction f1 ∪ f2 results in the union of the feature sets f1 and f2. although the feature vector grows as a result of the disjunction, the features and their values remain the same. combining by means of disjunction allows for a better understanding of the contribution each feature subset makes.

the conjunction f1 × f2, on the other hand, results in a new feature set altogether, namely, the cartesian product of f1 and f2. in this study, we employ conjunction in order to enrich the different aba and character n-gram feature sets with pos information. for example, consider the conjunction of character bigrams and pos; given the input word hlk (“walked”), which is tagged as a verb, the features extracted from it are the pairs 〈_h, verb〉, 〈hl, verb〉, 〈lk, verb〉, and 〈k_, verb〉; the feature space includes the cartesian product of all possible bigrams with all pos tags.
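The alphabet abstraction and the conjunction with pos can be sketched together as follows. Several things here are our assumptions rather than the paper's: the names `abstract` and `conjoin`, the labels `two_symbol`, `five_symbol`, and `ten_symbol` (chosen after the symbol counts stated in the text), and the use of uppercase `C` and `V` for the abstract symbols, which keeps them apart from the transliteration letter c. Any character outside the listed sets is treated as a consonant.

```python
from typing import List, Tuple

MATRES = set("ahwi")         # characters the text lists under v
TEN_KEPT = set("ahwibklmt")  # characters spelled out in the ten-symbol alphabet


def abstract(token: str, alphabet: str) -> str:
    """Rewrite a transliterated token in one of the three abstract alphabets."""
    out = []
    for ch in token:
        if alphabet == "two_symbol":
            out.append("V" if ch in MATRES else "C")
        elif alphabet == "five_symbol":
            out.append(ch if ch in MATRES else "C")
        elif alphabet == "ten_symbol":
            out.append(ch if ch in TEN_KEPT else "C")
        else:
            raise ValueError("unknown alphabet: " + alphabet)
    return "".join(out)


def conjoin(features: List[str], pos: str) -> List[Tuple[str, str]]:
    """Conjunction with pos: pair every feature of a token with the token's tag."""
    return [(feature, pos) for feature in features]


# The surface form from the figure, in all three alphabets.
surface = "wk$hmcxiqwt"
forms = [abstract(surface, a) for a in ("two_symbol", "five_symbol", "ten_symbol")]

# Bigrams of hlk ("walked", tagged verb), conjoined with the pos tag.
pairs = conjoin(["_h", "hl", "lk", "k_"], "verb")
```

Under disjunction the two feature lists would simply be concatenated; conjunction instead creates one new feature per (feature, tag) pair, which is why the feature space grows multiplicatively.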
similarly, when combining aba × pos, each aba feature, e.g., cvc, results in a set of features, one for each pos tag: 〈cvc, noun〉, 〈cvc, verb〉, etc. by applying pos conjunction to the aba alphabets and character n-grams, we obtain more nuanced and more interpretable feature sets, which remain, nevertheless, abstract and content-independent.

results

we implemented the features discussed in the previous section and constructed svm classifiers based on each set of features. we then tested the accuracy of each of the classifiers in four experimental setups corresponding to the corpora introduced in section :

inc (in-corpus): the training data, i.e., ohe b and te n. it includes chunks: original hebrew instances and translated from english; evaluation is done using ten-fold cross-validation.

ind f r: testing on the in-domain dataset containing chunks of twentieth-century literature translated into hebrew from french. note that this dataset contains only chunks translated from french, that is, no texts originally written in hebrew.

ood-soc: testing on the out-of-domain dataset dealing with social science topics ( chunks, evenly balanced: original hebrew, translated from english).

ood-eco: testing on the out-of-domain dataset dealing with economics ( chunks, evenly balanced: original hebrew, translated from english).

for all the experiments we report accuracy; the baseline (choosing at random) is always %, as the test set is balanced. since the test corpora are relatively small ( chunks in the case of ind f r and in the out-of-domain experiments), most differences in accuracy on these test sets are not statistically significant. however, differences of a few percentage points on the training set, for which we conduct cross-validation evaluation, are typically significant. to emphasize that, we graphically depict % confidence intervals (clopper & pearson, ) for the results of some of the inc experiments.
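For readers who want to reproduce such intervals, here is a minimal standard-library sketch of the exact (Clopper-Pearson) interval for an accuracy figure. The function names are hypothetical, and the default `alpha=0.05` (a 95% interval) is our assumption, since the confidence level did not survive in the text. The bounds are found by bisection on the binomial tail probabilities rather than through beta quantiles.

```python
from math import exp, lgamma, log, log1p
from typing import Callable, Tuple


def accuracy(correct: int, total: int) -> float:
    """Share of chunks classified correctly."""
    return correct / total


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = log(p), log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += exp(log_pmf)
    return min(total, 1.0)


def _solve_increasing(g: Callable[[float], float], target: float) -> float:
    """Bisection on [0, 1] for g(x) = target, where g is non-decreasing."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(correct: int, total: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided confidence interval for a binomial proportion."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    half = alpha / 2
    # Lower bound: the p at which P(X >= correct | p) equals alpha / 2.
    lower = 0.0 if correct == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(correct - 1, total, p), half)
    # Upper bound: the p at which P(X <= correct | p) equals alpha / 2,
    # solved in q = 1 - p so that the function is increasing.
    upper = 1.0 if correct == total else 1.0 - _solve_increasing(
        lambda q: binom_cdf(correct, total, 1.0 - q), half)
    return lower, upper
```

With few test chunks the interval is wide, which illustrates why differences of a few points on the small test sets are rarely significant, while the larger cross-validation set gives much tighter intervals.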
complete ranked confidence interval plots for all the experiments (inc, ind f r, ood-soc, and ood-eco) are listed in the appendix.

classifiers based on tokens

the accuracies of the classifiers trained on token-based features are given in table . as we conjectured, the classifier trained on word unigrams is highly accurate in the in-corpus scenario, but does not scale up to the in-domain and out-of-domain datasets. similarly, and like koppel and ordan ( ), we find that a classifier trained solely on function words, while achieving convincing in-corpus results, does not perform very well when applied in other experimental scenarios.

classifier | inc | ind f r | ood-soc | ood-eco
word unigrams | | | |
function words | | | |

table : results of classifiers based on tokens.

classifiers based on morphological analysis

the results of the classifiers that reflect morphological aspects, namely, the ones trained solely on the output of the morphological analyzer, are given in table . the confidence intervals of the cross-validation experiments are plotted in figure .

classifier | inc | ind f r | ood-soc | ood-eco
pos | | | |
binyan (bi) | | | |
status (st) | | | |
possessive (ps) | | | |
prefix_ (p ) | | | |
prefix_ (p ) | | | |
pos ∪ bi | | | |
pos ∪ st | | | |
pos ∪ ps | | | |
pos ∪ p | | | |
pos ∪ p | | | |
pos ∪ bi ∪ st ∪ ps ∪ p ∪ p | | | |
bi ∪ st ∪ ps ∪ p ∪ p | | | |

table : results of classifiers based on morphological analysis.

figure : confidence intervals, classifiers based on morphological analysis (inc scenario). [plot of inc accuracy per classifier, listed in the order: possessive (ps), prefix_ (p ), binyan (bi), status (st), prefix_ (p ), bi ∪ st ∪ ps ∪ p ∪ p , pos ∪ ps, pos, pos ∪ bi, pos ∪ st, pos ∪ p , pos ∪ p , pos ∪ bi ∪ st ∪ ps ∪ p ∪ p ]

the pos classifier, here used mainly as a baseline to test the contribution of the ‘pure’ morphological features, yields . % accuracy in the in-corpus scenario. while performing quite well on the separate dataset of literary texts translated from french ( %), it fails to
identify translationese in the ood datasets. interestingly, baroni and bernardini ( ) report that a similar classifier obtains . % accuracy on their italian data. they note that “the strikingly low performance of the unigram [pos] model is not surprising, since this model is using the relative frequency of [pos] tags as its only cue” (p. ). volansky et al. (forthcoming), on the other hand, obtain % when applying a similar feature set to identifying english translationese.

apart from prefix_ , and the somewhat less impressive binyan, no other classifier based on a single morphological feature (including pos) manages to perform better than the baseline in all four experimental scenarios. when the features are combined by means of disjunction, the results improve somewhat. not surprisingly, the combined feature sets that yield the best results in the in-corpus scenario are the ones that use both pos and prefix_ (pos ∪ p ; pos ∪ bi ∪ st ∪ ps ∪ p ∪ p ). once the pos baseline is removed and a classifier is trained using only the combination of the single, pure morphological features (bi ∪ st ∪ ps ∪ p ∪ p ), accuracy drops in all scenarios except ood-soc. interestingly, this ood-soc result is the best that any of the pure morphological classifiers yields.

in sum, classifiers based on features produced by morphological analysis fail to produce accurate classification results, especially out of domain. the reason may be the low quality of the morphological processing tools we use or the low dimensionality of these classifiers (sometimes containing only one or two features), coupled with the relatively small size of the training set.

classifiers based on character n-grams

capturing much lexical information, classifiers based on character n-grams unsurprisingly yield good results (cf. popescu, ; volansky et al., forthcoming). these results are given in table , and the confidence intervals of the inc experiments are plotted in figure .
classifier | inc | ind f r | ood-soc | ood-eco
-grams | | | |
-grams | | | |
-grams at word boundaries | | | |
-grams | | | |
-grams | | | |
-grams, top- features | | | |
-grams | | | |
- ∪ - ∪ - ∪ - ∪ -grams | | | |
-grams × pos | | | |
-grams at word boundaries × pos | | | |
-grams × pos | | | |
-grams × pos | | | |

table : results of classifiers based on character n-grams.

figure : confidence intervals, classifiers based on character n-grams (inc scenario). [plot of inc accuracy per classifier]

the optimal n for n-gram classifiers seems to be ; not only is the -gram classifier highly accurate in cross-validation, it scales up nicely to out-of-domain tasks. extending n-gram length to , or taking all n-grams of lengths to , does not seem to improve much. enhancing the n-grams with pos information brings about a small accuracy gain in most cases.

as a further indication of the robustness of the n-gram classifiers, we experimented with a -gram classifier that only uses the most indicative features of ohe b and the most indicative of te n: after training a classifier with the entire set of -grams, we selected the features whose weights were greatest (most indicative of ohe b) and the features whose weights were lowest (most indicative of te n). we then trained a new classifier using only these features. the results of this top- -gram classifier are listed in table , and demonstrate the power of simple, low-dimensional classifiers. some of the most indicative features are discussed in section .

classifiers based on abstract alphabets

the results of the classifiers that approximate hebrew word structure by means of alphabet abstractions are given in table , and the confidence intervals of the inc experiments are plotted in figure .

aba and aba reveal mixed results.
while performing extremely well in certain scenarios (for example, the result aba obtains on the ood-eco dataset, . %, is by far the best one any of our classifiers yields in that scenario, cf. figure in the appendix), they fail miserably in other scenarios. in contrast, aba yields competitive results in all four scenarios. considering the fact that these results are obtained without applying any feature selection methods, they are very promising. it stands to reason that by cautiously reducing the feature spaces of these simple aba classifiers, their performance will increase significantly. we intend to do that in future work.

footnote: the specific features are listed in the appendix.

classifier | inc | ind f r | ood-soc | ood-eco
aba | | | |
aba | | | |
aba | | | |
aba × pos | | | |
aba × pos | | | |
aba × pos | | | |
aba × pos, top- features | | | |

table : results of classifiers based on abstract alphabets.

figure : confidence intervals, classifiers based on abstract alphabets (inc scenario). [plot of inc accuracy per classifier]

upon enriching the abstract templates with pos information by means of conjunction, we observe accuracy improvement in most scenarios. while all three classifiers yield results well above the baseline, aba × pos is the best overall classifier we train in this study. here, too, we experimented with a classifier using only the top-n features most indicative of o and the top-n most indicative of t, this time with n being . this very low-dimensional classifier is the only one in this study obtaining more than % in all experimental scenarios.

analysis

we now look more closely at the results discussed in the previous section. specifically, in order to understand which features are more relevant than others for a given classifier, we examine the weights an svm assigns to the features it uses: the higher the weight, the more important the feature is considered to be.
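Selecting the top-n features from a trained linear model amounts to ranking features by weight and keeping both ends of the ranking. The sketch below assumes the weights have already been exported as a mapping from feature name to weight; the function names are hypothetical and the numeric weights are invented purely for illustration. The sign convention follows the description given for the n-gram experiment: the greatest weights indicate ohe b, the lowest indicate te n.

```python
from typing import Dict, List, Tuple


def top_features(weights: Dict[str, float], n: int) -> Tuple[List[str], List[str]]:
    """Return the n highest-weighted and the n lowest-weighted feature names."""
    if n < 1 or 2 * n > len(weights):
        raise ValueError("need 1 <= n and 2 * n <= number of features")
    ranked = sorted(weights, key=lambda name: weights[name], reverse=True)
    highest = ranked[:n]        # most indicative of original texts
    lowest = ranked[-n:][::-1]  # most indicative of translations, strongest first
    return highest, lowest


def project(vector: Dict[str, float], selected: List[str]) -> Dict[str, float]:
    """Restrict a feature vector to the selected features, for retraining."""
    return {name: vector.get(name, 0.0) for name in selected}


# Invented weights; only the feature names echo markers discussed in the text.
weights = {"_dw": 1.4, "gm_": 0.9, "abc": 0.05, "hwa": -0.6, "kdi": -1.1}
original_markers, translation_markers = top_features(weights, n=2)
reduced = project({"_dw": 3.0, "abc": 7.0}, original_markers + translation_markers)
```

A new classifier would then be trained on the projected vectors only, which is what makes the resulting model low-dimensional.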
note that low weights are not always an indication that a feature is not important, as potential dependencies among features can discount important features. the inverse, however, does not hold: a feature assigned a high weight indicates a significant property of one of the classes to be distinguished. in the following, we highlight some of the more successful discriminating features.

word unigrams: the token dwwqa is one of the most prominent markers of ohe b, i.e., it is underrepresented in translated texts. dwwqa is an adverb that roughly means “contrary to expectations.” importantly, it is not lexicalized in english. this is a typical case of negative interference (toury, ): a lexical gap between the two language systems involved in the translation process creates a situation where nothing in the source language (in this case, english) triggers the generation of the lacking item (in this case, dwwqa) when translating into the target language (hebrew). in our corpus, dwwqa is almost six times more frequent in ohe b than in te n.

pos: three major differences in the pos distribution between te n and ohe b are given in table .

pos tag | ohe b | te n | ratio (te n / ohe b)
propername | | |
copula | | |
modal | | |

table : three major differences in the pos distribution across te n and ohe b.

the difference in the distribution of proper names goes hand in hand with the explicitation hypothesis (blum-kulka, ). according to this hypothesis, translators tend to render implicit utterances in the source text (e.g., pronouns) more explicitly in the target text they produce, specifically by means of proper names. we note, however, that we (unlike, e.g., baroni & bernardini, ) do not notice remarkable differences in the distribution of pronouns between ohe b and te n.

we also find that modal constructions are more frequent in translated than in non-translated hebrew texts.
this is a case of positive interference (toury, ), i.e., overrepresentation in translations of features characteristic of the source language (in this case, english). interestingly, the different distribution of modal verbs across te n and ohe b provides us with a partial explanation of the excessive use of copulas in translated texts: in hebrew, most modal verbs do not inflect for tense; in order to express the past tense, modals are combined with copula past tense forms (functioning in these constructions as an auxiliary verb). and indeed, we find in our data that copulas in the past tense are highly collocated with modal verbs. this partially explains why copulas are more frequent in te n. another contributing factor stems from the optionality of hebrew present-tense copulas in non-verbal sentences (haugereid, melnik, & wintner, ); since counterpart copulas are mandatory in english, they tend to be explicated in translations to hebrew.

morphological features: the classifiers trained on the pure, single morphological feature sets do not perform very well, neither in the ten-fold cross-validation in-corpus scenario, nor when tested on the additional in-domain and out-of-domain datasets. this might be due to the low dimensionality of these classifiers, to the relatively small amount of training data, or to the performance of the morphological analyzer, which is often inaccurate.

perhaps not surprisingly, the classifier based on prefix_ is the best performing one, as it reflects linguistic information which is realized in other languages (like english) as function words, e.g., conjunctions and prepositions. we find that the most significant marker of ohe b is the prefix corresponding to the coordinating conjunction and. indeed, this prefix is . times more frequent in ohe b than in te n. this finding calls for further research by translation scholars.
even though the binyan classifier, based on the seven hebrew verbal patterns, manages to perform slightly better than the baseline in all experimental scenarios, we cannot interpret the results it yields. the reason is that the accuracy of the morphological processor is particularly poor with respect to the verbal patterns.

character n-grams: many of the most discriminative word unigrams are also reflected in the results of other feature classes like character n-grams and abstract alphabets. so, for instance, the character trigrams _dw, dww, wwq, wqa, and qa_ (corresponding to the token dwwqa mentioned above) are amongst the most prominent markers of ohe b. in other words, even though we set out to capture morphological properties by looking at sub-tokens and alphabet abstractions, we sometimes end up capturing lexical cues.

a detailed analysis of the most significant features of the -gram classifier reveals the following pattern: among the ten strongest indications of original hebrew are three substrings of the lexical item dwwqa, followed by ywd “more/still/yet”, kbr “already”, gm “also” and mwl “against”, with indications of word boundaries at either side of these short words. note that these are all function words that likely do not have direct, one-to-one counterparts in the source languages, and hence are distributed very differently between ohe b and te n. other indications of ohe b include bi$r (the prefix of bi$ral “in israel”) and _tl_ (with word boundaries at both ends), clearly referring to tel aviv. strong indications of translations include the prepositions kdi “in-order-to” and bzmn “while/during”, the adjective nwsp “additional”, the modal yewi “may”, but also n-grams that are more abstract and less transparent.

aba: we find that one of the most prominent features of te n is a triplet of matres lectionis, namely vvv. this template mostly encompasses four tokens which play a crucial role in hebrew grammar:
1. hwa, which can be either a pronoun (“he”) or a third person singular copula in the present tense (“is”)
2. hia (“she”), which is the same as hwa, namely, both a pronoun and a copula, only feminine
3. hih (“was”) and
4. hiw (“were”), which are copula forms in the past tense.

table illustrates how these four most characteristic instantiations of this template are realized across ohe b and te n. together they constitute % of this template’s occurrences in the training data. a reason for the overrepresentation of copulas in te n, namely, positive interference, was discussed above.

vvv (aba) | ohe b | te n | ratio (te n / ohe b)
hwa | | |
hia | | |
hih | | |
hiw | | |

table : the four most characteristic instantiations of the aba template vvv and their distribution in te n and ohe b.

by looking at an abstract feature as simple as the vvv sequence, which potentially leaves room for surface forms (in practice, only of them are realized in the training data), we already have at our disposal numerous highly frequent distinguishing markers.

aba: naturally, some of the results found in aba are also reproduced in aba . for example, the spelled-out instance hwa of the aba vvv-template is one of the top markers of te n. another discriminating marker of te n captured by aba is the template hccia, which captures certain instances of the verb pattern hif’il (in past tense, third person singular masculine), namely, those instances with a as the third (and final) letter of the root. the hif’il pattern is predominantly used as a causative in hebrew and might indicate a structural difference between english and hebrew. this calls for further investigation.

the same aba template, hccia, also captures the token hn$ia (“the president”), reflecting perhaps a cultural marker (the israeli head of state is the prime minister, rather than the president).
although lexical, this feature captures a significant difference between te n and ohe b which is scalable to other domains, since it is rather frequent in genres such as newspaper articles and parliament proceedings.

aba

while less abstract than aba and aba, this third alphabet touches on morphological templates that cannot be captured with the more abstract alphabets. consider the template mcwcl, which, theoretically speaking, exclusively captures the masculine singular passive participle of roots whose third (and final) letter is l. this template is three times more frequent in te n. the most frequent instance of this template, the modal mswgl (“capable of”), reflects constructions which are, as discussed above, more frequent in translated texts due to positive interference. the other instances of this aba template suggest that there are different distributions of morphological items between te n and ohe b. this, too, calls for further studies.

aba, general

importantly, we manage to capture with the aba templates many discriminative markers – whether lexical, morphological, or (morpho)syntactic – without relying on a ready-made, morphologically or syntactically informed mechanism. enriching the templates with pos information improves the results. once the aba templates are restricted to capture smaller token spaces, they become more precise. in aba × pos, a prominent aba feature like vvv is spelled out into features (each corresponding to a different pos tag), thereby making 〈vvv, pronoun〉 and 〈vvv, copula〉 much more dominant than, say, 〈vvv, noun〉. the pos enhancement thus helps the classifiers to separate the wheat from the chaff.

conclusion

we have employed text classification for the investigation of translationese in a morphologically complex language, namely modern hebrew.
this is the first work addressing the automatic identification of translationese in a semitic language, and the first focusing on the morphological manifestation of translated texts’ properties. specifically, we have trained several svm classifiers that distinguish with high accuracy between twentieth-century literary texts translated from english and similar texts originally written in hebrew. some of these classifiers have proven to be robust, yielding good results when tested on different datasets, i.e., on texts from the same domain (twentieth-century literature), but translated from a different source language (french), and on texts from other domains, namely newspaper and journal articles dealing with the social sciences and economics. the fact that some of the classifiers scale up to other experimental scenarios supports our hypothesis that training on a corpus of contemporary literature – a very heterogeneous dataset – is suitable and beneficial for the development of scalable classifiers. numerous feature design strategies have been explored: function words, word unigrams, pure morphological features, pos tags, character n-grams, and three different instances of a novel alphabet abstraction mechanism aimed at approximating hebrew word structure. we have also experimented with several hybrid feature sets, i.e., combinations of some of the aforementioned feature sets by means of disjunction and conjunction. classifiers trained solely on morphological information do not obtain very good results; this might be due to the performance of the often inaccurate morphological analyzer, the low dimensionality of these classifiers, or the relatively small amount of training data. the classifiers obtaining the best overall results use combined models: conjunctions of pos information with either an alphabet abstraction or character n-grams.
this indicates that, currently, the best way to represent word-level and sub-word-level phenomena in hebrew, for the purpose of identifying translationese, is by approximating morphological analysis using shallow abstractions, and restricting those abstractions to specific pos spaces. as we saw in the previous section, even though we set out to capture morphological properties of hebrew, we sometimes end up capturing non-morphological features. even when applying pos annotation and alphabet abstractions, lexical markers manage to “sneak in,” e.g., in the form of proper nouns. indeed, some of the most significant classification cues do not reflect morphological traits of the hebrew language. let us revisit the example of the lemma dwwqa (roughly meaning “contrary to expectations”). it appears . times more often in ohe b than in te n ( vs. occurrences), thus averaging slightly more than one occurrence per original hebrew chunk; its probability of appearing in a te n chunk, on the other hand, is / (assuming a uniform distribution). although not a morphological feature, dwwqa does reflect a structural difference between hebrew and english (and french), and due to its relatively high frequency it contributes immensely to classification. for example, cvvcv, the abstract template that dwwqa instantiates, is one of the most significant features in the aba experiment. similarly, as described above in section , n-grams corresponding to substrings of dwwqa are amongst the most prominent markers of ohe b. in this sense we conclude that although our abstractions are not purely morphology based, the non-morphological features that do manage to sneak in are of both theoretical and practical value. in future work, our first concern will be dimensionality reduction. as we show with the -gram and aba × pos classifiers, a very low-dimensional space of only or features suffices for producing highly accurate results.
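as a minimal sketch of the two feature families revisited above, the python snippet below derives boundary-marked character n-grams and a consonant/vowel-letter abstraction for a transliterated token, and pairs an abstract template with a pos tag. the letter mapping is an assumption inferred from the examples in the text (hwa, hia, hih, and hiw all map to vvv, and dwwqa maps to cvvcv): the letters a, h, w, and i are treated as matres lectionis and every other letter as a consonant. the function names are hypothetical, and the sketch is not the feature extractor used in the experiments.

```python
# Assumption: in the transliteration used here, the four matres lectionis
# are written a, h, w, and i (inferred from hwa, hia, hih, hiw -> vvv).
MATRES = frozenset("ahwi")


def char_ngrams(token, n):
    """Return the character n-grams of a token, using '_' as a word-boundary mark."""
    padded = f"_{token}_"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def abstract_token(token):
    """Map each mater lectionis to 'v' and every other letter to 'c'."""
    return "".join("v" if ch in MATRES else "c" for ch in token)


def with_pos(token, pos_tag):
    """Conjoin the abstract template of a token with its POS tag (hypothetical helper)."""
    return (abstract_token(token), pos_tag)


if __name__ == "__main__":
    print(char_ngrams("dwwqa", 3))    # ['_dw', 'dww', 'wwq', 'wqa', 'qa_']
    print(abstract_token("dwwqa"))    # cvvcv
    print(with_pos("hwa", "pronoun"))  # ('vvv', 'pronoun')
```

under this assumed mapping, a three-letter template such as vvv can be realized by at most 4 × 4 × 4 surface forms, which is why conjoining it with a pos tag, as in 〈vvv, pronoun〉, narrows the token space it covers.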
we intend to employ state-of-the-art algorithms in order to rank feature sets and select the most discriminative feature subsets, thereby reducing the size of feature vectors and limiting the effect of overfitting to the training data. this should not only bring about accuracy gains but also facilitate a better understanding of the morphological properties of hebrew translationese. we also intend to explore other ways of designing alphabet abstractions, more sophisticated than the ones developed and discussed in the present work. substitutions could, for example, be made dependent upon positions within the surface token, thereby simulating hebrew prefixes and suffixes. in this study, we have combined abstract alphabets only with pos tags. this has turned out to be a promising approach. combinations with other feature sets might also prove fruitful, in particular with prefix_ , the pure morphological feature set yielding the best results. finally, we plan to apply similar alphabet abstractions to other semitic languages, such as arabic, building on the similar root-and-pattern morphological structures of words in these languages.

acknowledgments

this research was supported by a grant from the israeli ministry of science and technology. we are grateful to ted briscoe for suggesting some of the n-gram experiments and for useful discussions. we wish to thank titus von der malsburg for suggesting the use of confidence intervals. we are also grateful to bracha lang for providing us with the out-of-domain corpora, and to kayla jacobs for providing us with the list of hebrew function words. thanks are also due to irit noy for annotating the literary texts with additional metadata.

most distinctive n-gram features

the most distinctive n-gram features (section . ) are: wqa_, wwqa, dwwq, ywd_, _kbr, _gm_, mwl_, kbr_, _ywd, _mwl, _dww, bier, bkll, _wrq, _egm, _acl, kll_, ela_, egm_, klwm, _lw_, _klw, wla_, _wla, _tl_, _ph_, _npe, ain_, blnw, sbln, arwx, _yew, _lmy, bmek, ylwl, hwa_, mswg, laxw, _ah_, dmii, _mlb, _zmn, hbiy, _bzm, briq, mlbd, _msw, _ydi, amwr, _keh, lehi, mek_, ydii, mbri, _hwa, yewi, nwsp, bzmn, _kdi, kdi_

confidence interval plots

we graphically depict below % confidence intervals corresponding to the experiments reported on in section : the inc experiments (figure ), the ind f r experiments (figure ), the ood-soc experiments (figure ), and the ood-eco experiments (figure ).

[plots: each figure shows accuracy on the horizontal axis, with one confidence interval per feature set]
figure : confidence intervals, inc experiments.
figure : confidence intervals, ind f r experiments.
figure : confidence intervals, ood-soc experiments.
figure : confidence intervals, ood-eco experiments.

book review: "online evaluation of creativity and the arts" by hiesun cecilia suhr (ed.)

anna jobin, laboratory of digital cultures and humanities, university of lausanne, lausanne, switzerland
anna.jobin@unil.ch

notice: this is the author’s version of a work that was accepted for publication in digital scholarship in the humanities. changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. the definitive version has been accepted for publication in digital scholarship in the humanities, doi: . /llc/fqv . http://dsh.oxfordjournals.org/

this book ambitiously sets out to give an overview of the potential, as well as the limitations and challenges, of 'online evaluation cultures' in eight chapters focusing on 'creativity and the arts'.
in the introduction the author and contributing author hiesun cecilia suhr outlines the motivation and scope of the book and also argues that for artists building a reputation – built on evaluation and judgment – is of economic importance. because digital environments help build such a reputation, she continues, there needs to be interdisciplinary understanding of these digital environments with regard to their 'evaluation cultures'. the collection includes diverse contributions covering the 'different creative fields' of visual art, music, photography, makeup tutorials, design, fashion blogging, reputation ranking and game design. in an attempt to unite the different chapters, the introduction identifies five distinct dimensions that are related to online evaluation, on the grounds that all chapters 'intersect with at least one of the five aspects'. the first of these aspects is linked to the materiality of the medium and its role in the modalities of communication-evaluation practice. joseph reagle's chapter about one particular online gallery might be seen to fit within this particular aspect. reagle's contribution is an interesting account of how an online user community assesses the potential implications of different possible technical implementations of a rating system for pictures. limitations of the current standard solutions for online evaluations – 'liking' and commenting – for nuanced feedback and evaluation are repeated assertions throughout the book and this is most apparent in the chapters about visual arts and online music contests. linda vigor's pertinent examination of art studio critique sessions and its comparison to the feedback that is enabled by online visual art websites demonstrates that different settings influence the nature of feedback beyond explicit divergent rules and instructions. 
while music contests are nothing new, according to suhr the nature and process of an online-based contest may impact profoundly on the way that the music is evaluated. another example, given by ramon reichert, is youtube makeup tutorials, which, through their online format and the characteristics that come with it, he says blur classic conceptions of production and reception. the second and third aspects used to unite the work presented are social dimension and power and politics respectively. evaluation, in any given form, is not only a matter of individual taste but also one of social convention, which can, of course, be both explicit and implicit. in her compelling contribution about fashion bloggers, brooke erin duffy makes a thorough examination of social norms and their enactment within a specific online community of practice. of particular interest – and reminiscent of the first aspect – is her depiction of how some of these norms are entangled with the technological specificity of blogging where links, comments and ‘likes’ are not only a means to express feedback, but also a way to augment a website's reputation and economic status. helen kennedy broaches power and politics in her contribution, through a nuanced account of anti-spec movements and their importance for rejecting non-compensated design competitions and spec work. spec, or speculative, work describes the potentially unpaid labour solicited by such design competitions where compensated work has become the prize. in providing a closer look at this topic, her chapter shows how different discourses about spec work reflect powerful social dynamics. notably, spec work also overlaps with the fourth aspect identified in the introduction as related to online evaluation: tension 'between creativity and the commercial market'.
kennedy outlines how the tendency to glorify the 'amateur culture' contributes to a legitimization of spec work which, in turn, increases the precarity of creative work and undermines the ethics of professional creative labour. evaluation as learning opportunity is identified as the fifth aspect related to online evaluation. while a running thread throughout the work, this approach is most explicitly examined in the chapters concerning visual art and spec work. vigor's research on critique in the visual arts explains how cognitive and structural components intersect and impact the possible level of learning that can be derived from any evaluation. online spec work, as illustrated in kennedy’s chapter, inhibits the learning process that is indispensable to professional design work. the same aspect, however, also appears in another light, because obviously other contexts result in participants appreciating online-based evaluation for the learning opportunity it provides, as is highlighted in trammell's contribution on online board game design forums. in fashion blogging, it is the participants' hope of learning something that comes to form a large part of their motivation. the five aspects highlighted above are very different in nature. some relate clearly to the declared topic of the book, whereas others are generic analytical dimensions of human practice and representations. using a mixture of central aspects in this way makes for an interestingly broad perspective, as does the wide range of topics addressed. the breadth of topics here might serve as exploratory groundwork for further addressing complex issues such as evaluative processes in digital environments. indeed, suhr takes the opportunity to call for interdisciplinarity and non-binary modes of inquiry in her contribution to this ambitious book. 
one might, however, also argue that instead of giving an overview, the large scope of the work results in a volume where it is hard to see an obvious relationship between the various contributions. because several contributions broaden the basic concepts to which the book is addressed even further, be it evaluation or creativity, it can be quite hard to grasp its scope. alessandro gandini's pertinent study 'critique on klout', for example, is a discussion of whether social media metrics of influence are correlated to offline networks. admittedly the study concerns 'freelance creatives', but it strays a long way from addressing creativity and the arts. in conclusion, while appreciating the breadth of study, this reviewer became confused in the mix of chapters thematizing evaluation of an online practice and chapters focusing on online evaluation as a practice. the distinction between these is, unfortunately, not well made for the reader and the use of the five aspects in attempting to categorize the work does not achieve this. while i cannot recommend the book as a coherent volume on digital evaluation of creative and artistic practices, there are individual chapters that have merit and the breadth of the work might be useful for the exploration of the different topics addressed going forward.

whitmire, a. l., boock, m., & sutton, s. c. ( ). variability in academic research data management practices: implications for data services development from a faculty survey. program, ( ), - . doi: . /prog- - - .
emerald group publishing limited, version of record

variability in academic research data management practices: implications for data services development from a faculty survey

amanda l. whitmire and michael boock, oregon state university, corvallis, oregon, usa, and shan c. sutton, university of arizona, tucson, arizona, usa

abstract
purpose – the purpose of this paper is to demonstrate how knowledge of local research data management (rdm) practices critically informs the progressive development of research data services (rds) after basic services have already been established.
design/methodology/approach – an online survey was distributed via e-mail to all university faculty in the fall of , and was left open for just over one month. the authors sent two reminder e-mails before closing the survey. survey data were downloaded from qualtrics survey software and analyzed in r.
findings – in this paper, the authors reviewed a subset of survey findings that included data types, volume, and storage locations, rdm roles and responsibilities, and metadata practices. the authors found that oregon state university (osu) researchers are generating a wide variety of data types, and that practices vary between colleges. the authors discovered that faculty are not utilizing campus-wide storage infrastructure, and are maintaining their own storage servers in surprising numbers. faculty-level research assistants perform the majority of data-related tasks at osu, with the exception of data sharing, which is primarily handled by the professorial ranks. the authors found that many faculty on campus are creating metadata, but that there is a need to provide support in how to discover and create standardized metadata.
originality/value – this paper presents a novel example of how to efficiently move from establishing basic rdm services to providing more focussed services that meet specific local needs. it provides an approach for others to follow when tackling the difficult question of, “what next?” with regard to providing academic rds.
keywords research data services, data management, academic libraries, metadata, survey, data sharing
paper type case study

program: electronic library and information systems vol. no. , pp. - © emerald group publishing limited - doi . /prog- - - received february revised april accepted april

. introduction

the increasing ease and speed with which researchers can collect large, complex data sets is outpacing their development of the knowledge and skills that are necessary to properly manage them. these skills are crucial to ensuring data quality, integrity, shareability, discoverability, and reuse over time. as funding agencies steadily enact mandates for the submission of data management or sharing plans with proposals, investigators will be held accountable to them (holdren, ). similar expectations for data accessibility are emerging from some journal publishers, such as plos. academic libraries are increasingly sources of infrastructure and research support in the area of data stewardship (akers and doty, ; and references therein), and directly assessing researchers’ data needs through the use of surveys is a common tactic employed during the process of developing services (akers and doty, ; averkamp et al., ; marchionini, ; rolando et al., ; scaramozzino et al., ; steinhart et al., ; tenopir et al., ).
for example, akers and doty ( ) used results from a campus survey to make the decision not to expand institutional repository functionality to include preservation and sharing of data sets. averkamp et al. ( ) discovered widespread dissatisfaction with the lack of both centralized data storage and university-supported cloud storage, and shared these concerns with the research services arm of the university information technology group. survey results gathered by a provost’s task force on the stewardship of digital research data at the university of north carolina (unc) at chapel hill revealed that more often than not, researchers were relying upon themselves to store data and were using “less desirable practices for data storage” (marchionini, ). they also found that less than percent of survey respondents were aware of certain data management support services that were available. based on direct feedback from faculty, the task force was able to make strong recommendations to the unc campus administration regarding establishing or expanding cyberinfrastructure and data support services. several surveys have found that creating metadata is something that researchers struggle with, and that they often use non-standardized methods to document their data or fail to document their data at all (rolando et al., ; steinhart et al., ; tenopir et al., ). the proposed solution to this challenge largely involves training for researchers (e.g. rolando et al., ), but site-specific survey data can also elucidate the extent to which researchers would be receptive to such training if it were developed. for example, steinhart et al. ( ) found that, “nearly two-thirds of respondents reported they would not use a metadata service, whether fee-based or free of charge.” in that case, despite the fact that researchers need training, developing a metadata service would likely be a wasted effort. 
while the results of faculty surveys often reveal common themes, there is no substitute for having an understanding of local research practices when investing in the development of research support services. this case study reviews the history of data services development at oregon state university (osu), and describes how recent faculty survey results are being used to further refine these services. an online survey was distributed to all osu faculty during the fall of . the survey covered several aspects of research data management (rdm), ranging from characterizing the data that faculty generate, to asking what rdm tasks they struggle with, and what their opinions are regarding who should pay for data services and infrastructure. in this case study paper, we focus on five areas of the survey that generated surprising or particularly important results, and discuss how we will use or have used these discoveries to modify or develop our existing research data services (rds). first, we discuss the types of data that faculty in different colleges are generating, and review the possible implications for targeting outreach and training. then we discuss the volume of data that faculty report they are generating, and how this informs planning for future data storage and sharing infrastructure. one of the most important aspects of practice variation among faculty is where they store their data, and we present some unexpected results in this area. as much of the support that our data services group provides occurs one-on-one with researchers, it is critical to understand to whom we should target for assistance. we asked the faculty to describe who performs the majority of rdm tasks in their research endeavors, and now have a better understanding of who to reach out to when we develop new services or products. lastly, we review current practices on campus for creating metadata, and discuss how we may try to address gaps in this area. 
timeline of data services development at osu

osu libraries has investigated and engaged in the provision of data services on a limited basis for some time. historically, osu libraries’ data services have focused on two areas: aggregating and visualizing oregon natural resources-related geospatial data; and building data repository services. in , osu libraries partnered with the osu college of forestry, college of science, usda forest service lab in corvallis, and the northwest alliance for computational science and engineering to create virtual oregon, a data archive and portal for “environmental and other place-based data on oregon and associated areas” (keon et al., ). virtual oregon was discontinued due to lack of funding, but was soon replaced with oregon explorer (http://oregonexplorer.info/), a series of web portals that include data archiving as well as data visualization tools pertaining to oregon natural resources.

in addition to portal development to make specific types of data available to the osu and wider communities, osu libraries have worked with faculty and staff on campus in a variety of ways to better ascertain campus needs regarding data. in , meetings were held by members of the osu community to discuss issues relating to the management and curation of research data across campus, and the feasibility of establishing a spatial data repository for osu. underpinning these conversations was the recognition that increasingly large volumes of data were being produced across campus, with no way of knowing what was stored where, by whom and how it was organized. the series of meetings served to gather knowledge about the different kinds of research data that were being produced at the university and potential avenues for sharing information about best practices.
the library was an active participant in these meetings, and one result was that the scholarsarchive@osu institutional repository was deemed to be an appropriate repository for static data sets smaller than two gigabytes (avery et al., ). in , osu libraries invited faculty from across the university to two lunch meetings at which attendees were asked a series of questions about their data and the libraries’ potential role in relation to those data. at this point, the scholarsarchive@osu institutional repository, built on the dspace platform and managed by the libraries, housed a variety of spatial data sets from faculty involved in the data meetings, as well as a small number of data sets associated with student theses and dissertations. one outcome of these meetings was that the libraries decided to focus on research data associated with theses and dissertations “as a way for the libraries to learn how to do the work involved in curating data” (boock and chadwell, ). although the data services that osu libraries currently provides are informed by this history of engagement with osu faculty, the library still lacked sufficient staffing and critical details that it deemed necessary to provide targeted services and support that would meet campus researcher needs. in , a data management specialist position was established in osu libraries to provide leadership in formalizing and expanding the organization’s data services. one of the position’s initial roles was to participate in the arl/dlf/duraspace e-science institute (e-science institute, ) as part of a small team of librarians and a member of the university’s information services (is) department, in order to produce a strategic agenda for rds at osu. the agenda provided a roadmap for the development of services in four primary areas: planning and consultation services, access and preservation infrastructure, data management training, and open data consortia and collaborations. 
it also identified the campus survey whose results are discussed in this paper as the best way to further discern campus needs, and direct an expansion of library and technology support services pertaining to the university’s research data (sutton et al., ).

methods

the purpose of the survey was to improve our understanding of practices, opinions, concerns, and needs regarding data at osu. we endeavored to understand the nature of the data sets that osu researchers are generating, and how they are being managed. as such, we distributed the survey to faculty across all ranks from professorial (assistant, associate, and full) to support faculty (faculty research assistants (fras) and research associates) and post-doctoral researchers. survey questions generally fell within the following areas that represent primary issues in data stewardship: data stewardship policies, roles and responsibilities; data characteristics and short-term management practices; data management services and support; data management funding; research data standards and documentation; data sharing; and long-term preservation. the web-based survey was developed using qualtrics software, referring to the survey from marchionini et al. (marchionini, ) as a starting point. we obtained significant constructive feedback from the osu survey research center to refine aspects of the survey structure, flow, and question design. the survey was distributed to all osu faculty members via e-mail addresses that were obtained from the office of human resources (hr). the hr database query resulted in , e-mail addresses. data were then downloaded from qualtrics and analyzed in the software program r (whitmire, ).

results

survey response

the survey was open from october -december , .
after the survey was deployed, it became evident that emeritus faculty were inadvertently included in the e-mail list. while of responses in the “other” category of the faculty rank question actually self-identified as emeritus (via write-in response), we excluded all “other” answers from the results and from our response rate calculation. in total, surveys were started; surveys were completed. there were no required questions, so response rates for each question vary. in total, e-mails bounced or failed to deliver. therefore, a response rate of . percent was estimated based on how we treated “other” faculty responses. excluding all “other” faculty ranks from responses (numerator) and emeritus and bounced e-mail addresses from denominator, we find:

\[
\text{response rate} = \frac{\text{responses} - \text{“other” responses}}{\text{contacts} - \text{emeritus addresses} - \text{bounced e-mails}} \times 100\%
\]

we utilized the “anonymize response” feature in the survey termination section of the qualtrics survey flow to disassociate responses from the individual survey link and scrub the ip address. this effectively de-identified the survey results. faculty from every college and unit responded to the survey (table i), and response rates were generally greater than percent (table ii). response rates varied across the ranks, ranging from percent for full professors (n = ) to percent for instructors/other/unknown ranks (n = ; table iii).

table i. number of completed responses from each college or unit, by rank
columns: college or unit; prof.; assoc. prof.; asst. prof.; sfra/fra; res. assoc.; post doc; total
rows: agricultural sciences (agr); business (bus); earth, ocean and atmos. sci. (ceoas); education (edu); engineering (engr); forestry (for); liberal arts (libart); pharmacy (pharm); public health and human sci. (phhs); science (sci); veterinary medicine (vet med); university libraries (lib); other; total
notes: these response numbers do not include responses from faculty who responded as “other” to the question regarding their rank (n = ). the college and unit abbreviations used in figures are shown in parentheses. ranks are professor (prof.), associate professor (assoc. prof.), assistant professor (asst. prof.), senior faculty research assistant and faculty research assistant (sfra/fra), research associate (res. assoc.; not including post-docs), post-doctoral researchers (all types, including research associate, fellow, etc.) and other (affiliations include research centers and institutes, student affairs and academic programs such as the graduate school, extension; ranks include instructors, courtesy faculty, support faculty affiliated with a research center, etc.)

table ii. response rates by college or unit
columns: college or unit; contacts; responses; response rate (%)
rows: agricultural sciences; business; earth, ocean and atmos. sci.; education; engineering; forestry; liberal arts; pharmacy; public health and human sci.; science; veterinary medicine; university libraries; other; total
notes: the number of contacts shown includes emeritus faculty who were inadvertently contacted, and bounced e-mails. as such, these response rates shown are slightly lower than the estimated response rate for the survey as a whole. these response numbers do not include responses from faculty who responded as “other” to the question regarding their rank (n = )

table iii. survey response rates by rank
columns: position/rank; contacts; responses; response rate (%)
rows: professor; associate professor; assistant professor; research associate/fellow; faculty research assistant; senior faculty research assistant; instructor/other/unknown; total
notes: the number of contacts shown includes emeritus faculty who were inadvertently contacted, and bounced e-mails. as such, these response rates shown are slightly lower than the estimated response rate for the survey as a whole. these response numbers include responses from faculty who responded as “other” to the question regarding their rank, but these responses were removed from the survey analysis

data types, volume, and storage locations

the most common data types that osu researchers produce are quantitative data (e.g. spreadsheets, delimited text, spss, xml; . percent of total responses), digital images ( . percent), and non-digital (handwritten) text ( . percent; figure ). as expected, differences in the most common data types are evident across colleges. for example, a higher percentage of researchers within the colleges of earth, ocean and atmospheric sciences (ceoas) and forestry (for) produce geospatial data than in other colleges, while qualitative text (e.g. an interview transcript) is more prevalent in education (edu), public health and human sciences (phhs), and liberal arts (libart).

figure . responses to the question, “please indicate whether or not you generate each of the following data format(s) as a part of your research process. select yes or no for each”
axes: data type (non-dig. images; non-dig. text; video; audio; gene seq.; samples; dig. images; eln; qual. text; quant. text; databases; geospatial; artistic prod.; quantitative) by faculty affiliation (agr; bus; ceoas; edu; engr; for; libart; pharm; phhs; sci; vetmed; lib; total). color scale: % creating data type.
notes: color scale indicates what percentage of respondents in each college or unit selected “yes” for each data type. light gray with a bullet indicates zero “yes” responses. the number above each column shows the total number of faculty responses for that college/unit

when asked about how much data they are producing, osu faculty report that for a “typical” research project, they generate less than gb in most cases (n = ; figure ). again, depending on their discipline, some researchers produce much more and some much less. there were no responses in the ranges from tb- pb or > pb. in total, percent of respondents indicated that they did not know how much data they typically produce. when asked about the largest amount of data they have produced for a single project, only three respondents answered in the tb- pb range (colleges of agriculture (n = ) and liberal arts (n = )), and one in the > pb range (ceoas; data not shown).

figure . responses to the question, “what has been the typical amount of digital data for a single project you have worked on in the past years?”
axes: data volume range (> pb; tb - pb; tb - tb; gb - tb; gb - gb; i don’t know; < gb) by faculty affiliation (agr; bus; ceoas; edu; engr; for; libart; pharm; phhs; sci; vetmed; lib; total). color scale: % creating data in vol. range.
notes: color scale indicates what percentage of respondents in each college or unit selected the given data volume range. light gray with a bullet indicates zero responses. the number above each column shows the number of faculty responses in college/unit. the percent of library faculty responses is off-scale at percent in the < gb range (dark gray)

overall, osu faculty report storing short-term data (data less than five years old) most often on personal computers (pc; percent) and external storage devices ( percent; faculty could report storing data in multiple locations; figure ). in several colleges and departments, faculty report storing data on servers held within their research group, in most cases despite the fact that their college or department offers replicated, network server-based storage as a service. colleges with high numbers of respondents using their own research group servers include ceoas ( percent report having their own server), engineering ( percent), science ( percent), and vet med ( percent). college and departmental servers are also well utilized by faculty for storing short-term data, especially in agriculture ( percent), business ( percent), engineering ( percent), forestry ( percent), pharmacy ( percent), and vet med ( percent). cloud-based storage options are utilized at rates between percent (forestry) and percent (education). it is interesting to notice that campus-wide server-based storage infrastructure, offered through is, is not heavily utilized. only percent of respondents indicated that they store data with is, and most were in units that also reported producing smaller data sets. it is important to note that a given faculty member may employ different data storage options at different times, so the use of one method does not entirely preclude the utilization of others.

figure . responses to the question, “thinking about data you’ve generated in the last five years (short-term data), please indicate where you store and/or backup these data. select yes or no for each”
axes: storage location (other; cloud; is server; unit server; indiv. server; external hd; desktop/laptop) by faculty affiliation (agr; bus; ceoas; edu; eng; for; libart; pharm; phhs; sci; vetmed; lib; total). color scale: % storing data in location.
notes: color scale is the percent of faculty that responded “yes,” where the total responses include “yes,” “no,” and “i don’t know.” light gray with a bullet indicates zero “yes” responses. the number above each column shows the total number of faculty responses (y+n+idk) for that storage location and college/unit
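the storage-location percentages can be derived from raw responses along the following lines. this is a minimal python sketch, not the authors’ r analysis: the record layout, the function name `percent_yes`, and the sample rows are hypothetical. the only rule taken from the figure notes is that the denominator counts “yes,” “no,” and “i don’t know” answers together; skipping blank answers is an added assumption, based on the fact that no question was required.

```python
from collections import defaultdict


def percent_yes(records):
    """Return {(college, location): percent of 'yes' answers}.

    records: iterable of (college, location, answer) tuples, where answer is
    'yes', 'no', or "i don't know". Blank answers (None) are skipped.
    """
    yes = defaultdict(int)
    total = defaultdict(int)
    for college, location, answer in records:
        if answer is None:
            continue  # unanswered question: not part of the denominator
        key = (college, location)
        total[key] += 1  # yes + no + i don't know
        if answer == "yes":
            yes[key] += 1
    return {key: 100.0 * yes[key] / total[key] for key in total}


# hypothetical responses, for illustration only
sample = [
    ("ceoas", "indiv. server", "yes"),
    ("ceoas", "indiv. server", "no"),
    ("ceoas", "indiv. server", "i don't know"),
    ("ceoas", "indiv. server", "yes"),
    ("edu", "cloud", "yes"),
    ("edu", "cloud", None),
]
rates = percent_yes(sample)
print(rates[("ceoas", "indiv. server")])  # 50.0
print(rates[("edu", "cloud")])            # 100.0
```

because a faculty member can answer “yes” for several locations, the percentages for one college are computed independently per location and need not sum to 100.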
data management tasks and roles

with the exceptions of data analysis, sharing, and disposal, the survey results indicate that fras handle the majority of data management tasks (figure ). at osu, personnel in research support positions, such as laboratory technicians and research assistants, are distinguished from administrative staff in that they have non-tenure track faculty status (as opposed to “classified staff” status). as such, research personnel are known as “faculty research assistants,” or fras, and they are almost exclusively supported on “soft money” by research grants. in the case of researchers in less data-intensive colleges (e.g. liberal arts or business), however, principal investigators (pis) handle the majority of these tasks themselves (college-level data not shown). graduate students are almost never responsible for data sharing outside of the research group, nor are they typically involved in data archiving or data disposal. while less involved than research assistants, faculty reported that graduate students do participate in data collection, metadata creation, quality control, and analysis. the only data management tasks for which faculty reported involvement by is were data backup and archiving. the professorial ranks handle the majority of data sharing.

figure . responses to the question, “who performs the majority of each of the following digital data management tasks associated with your research?”
axes: data management task (disposal; archive; sharing; store/org.; analysis; backup; qa/qc; metadata; data coll.) by position type (pi; grad stud; ra/fra; it staff; other; not appl.; total). color scale: % doing this task in this position.
notes: for each task, respondents could choose one position type only. color indicates what percent of each task is being conducted by the given position type. light gray with a bullet indicates zero responses. the numbers in the “total” column show the total number of responses for each task. note that the color scale range is to percent

metadata practices

the proportion of faculty who report that they create metadata varies widely by college. only percent of veterinary medicine faculty create metadata, while percent of earth, ocean, and atmospheric sciences faculty do (table iv). response rates to the question were also highly variable. in total, percent of forestry faculty responded to this question, while percent of liberal arts faculty did. only faculty who responded “yes” to the first metadata question were asked about which metadata standard they are currently using. faculty who are creating metadata are overwhelmingly using a schema that has been standardized within their research group (table v). interestingly, - percent of faculty do not know if they are using the metadata standards listed in the survey question.

table iv. responses to the question, “do you generate metadata? for example, do you currently document or describe your data, create code books, data dictionaries, ‘readme’ files, etc.?”
columns: college or unit; yes; no; total; % yes; survey responses; response rate (%)
rows: agricultural sciences; business; earth, ocean and atmos. sci.; education; engineering; forestry; liberal arts; pharmacy; public health and human sci.; science; veterinary medicine; university libraries; all units
note: for comparison, the two “responses” columns on the right show the number of respondents from each college for the survey as a whole, and the subsequent within-survey response rate for this question

table v. responses to the question, “please indicate which metadata standard you currently use to describe your data. select yes or no for each”
columns: metadata standard; yes; no; i don’t know; total
rows: dc (dublin core); dwc (darwin core); ddi (data documentation initiative); dif (directory interchange format); eml (ecological metadata language); fgdc (federal geographic data committee); iso (geographic information); ogis (open gis); metadata standardized within my lab; other (specify)
notes: only respondents who answered, “yes” to the question regarding metadata creation were prompted to answer this question

discussion

nature and volume of data produced at osu

the survey results clearly demonstrate that faculty are generating a wide variety of data types. this was not unexpected, but it’s helpful to see “who” (faculty in which colleges) is generating “what” (data types) as we consider adding support services or training. for example, a high percentage (> percent) of faculty in the colleges of earth, ocean and atmospheric sciences, engineering, and forestry are creating digital text for quantitative research (figure ), which was described in the survey as, “software scripts and codes, descriptive information/metadata.” we have been considering offering software carpentry (sc; http://software-carpentry.org/) workshops, which would provide instruction on topics like version control, programming languages (r, python, and matlab), and using databases. the survey results indicate that we would want to target our outreach and marketing for these workshops to those three colleges, due to the prevalence of software code being generated by their constituents. results also indicate that faculty, graduate students, and research assistants are all involved in the analysis phase of the research lifecycle (figure ), so we would have to consider either targeting the sc workshop to an audience with a broad level of experience (beginning coder to very experienced), or creating separate workshops for students and faculty.
in seven of the colleges surveyed, more than percent of faculty report that they are creating digital image data, and in three of the colleges more than percent of faculty are (figure ). despite the fact that several campus surveys have found the same results with respect to the prevalence of digital image data (averkamp et al., ; marchionini, ; rolando et al., ; steinhart et al., ), this was an interesting and unexpected finding that has broad implications for data storage and backup, file organization and naming, metadata, and data sharing and preservation. the fact that so many digital images are being created on campus points to a potential need for providing specialized support materials (e.g. cornell university library, r.d., ; jisc, ) or a workshop on best practices for managing digital images. this observation also generates more questions: what are they taking pictures of? how are they being analyzed? with increasing funder and publisher mandates for data sharing, how will researchers share them? these questions point to a need to further engage with faculty on the topic of the use of digital images for research, perhaps via a series of data curation profiles (carlson, ; witt et al., ) dedicated to the issue. a topic closely related to the types of data that faculty are generating, is the volume of data being generated. while the topic of “big data” has been grabbing headlines, research funding (zgorski, ), and even its own journals (e.g. elsevier’s big data research and springer’s journal of big data), the large majority of researchers ( percent at osu) are still creating what we consider to be “regular data” (figure ), which we arbitrarily define as being less than one terabyte in size. only researchers out of respondents ( . 
percent) report that they are producing “typical” data sets in the - terabyte range, and none report creating anything larger than that under usual circumstances (three researchers report that the largest data set they have created is terabytes- petabyte in size, and one reported their largest data set was > petabyte; data not shown). the main implication of this finding is that meeting the data storage needs of our faculty is likely to be a tractable challenge. faculty at osu already have access to gigabytes of free google drive cloud storage, and will soon also each have access to tb of free storage with microsoft’s onedrive cloud storage service. while not universally ideal, cloud-based, vendor-provided data storage has some advantages over using laptops, external hard drives, and individually maintained servers. most notably, cloud storage is replicated and secure, and assuming an internet connection is available, can be accessed from anywhere. unlike data kept on laptops and external drives, data stored in the cloud is not at risk of physical theft or accidental damage. drive and onedrive also offer variable access permissions at the file and folder level, so that researchers involved in collaborative work can more easily share data within the project. a drawback of using cloud-based storage on our campus is the potential for upload/download bottlenecks due to the limited speed of our connection with the internet. researchers who need to move large data sets around would be better off working on local servers, which have much faster transfer rates within the campus network. ultimately, there will be plenty of space available for most researchers to store their short-term data, but how well faculty are managing those data (e.g.
using a thoughtful and consistent file-naming convention) remains an open question. perhaps the more interesting question is whether or not faculty will take advantage of these storage options as they become available. results from the data storage location question reveal some unexpected and disconcerting habits.

data storage habits

perhaps the most surprising discovery revealed by this survey is the fact that a large percentage of faculty are managing their own data servers. we expected to see that faculty are storing short-term data (less than five years old) on desktop and laptop computers and external storage drives, and that is borne out in the results (figure ). in addition to central data storage options that are available through is, several colleges on campus have their own computing support services for data storage and backup. in light of this, we did not expect faculty to be maintaining their own servers in any appreciable number. however, as noted in the results section, significant proportions of faculty in ceoas ( percent), engineering ( percent), science ( percent), and vet med ( percent) report that they are storing data on servers that they maintain themselves, despite the fact that replicated, networked storage is available within their college. likewise, only percent of faculty store data with is. this indicates that either the centralized (college and university level) cyberinfrastructure resources do not currently meet faculty needs in this area, or that faculty are unaware of their data storage options. a sample of write-in responses to the survey sheds some light on how faculty view this problem:

having reliable, scalable, and relatively inexpensive short term data storage is critical for our work. our current model requires us to buy lab specific equipment that degrades over time and that is not completely backed up.
it would be fantastic to have a central repository for data that can use economies of scale to increase reliability and redundancy and allow us to focus on analyzing the data rather than managing it.

some college level services (data storage and backup) should be available at the university level in a visible osu data center.

we constantly struggle with adequate, secure, backed-up disk space for our projects. central data storage (on the order of s to s of tbs), provided by the college/university, would be a big help.

given that centralized data storage services do exist with is, a critical issue to explore further is the degree to which faculty’s low use of is options is attributable to a lack of awareness vs shortcomings in the options themselves, so those centralized services may be improved to enable wider adoption. between pcs, external hard drives and personal servers, this level of ad hoc, do-it-yourself data storage exposes a significant proportion of the data produced at osu to serious risks. how much of the data stored in these locations is backed up in multiple locations (i.e. replication)? are faculty aware of the life expectancy of pc and external hard drives and their rates of failure? are researchers adequately prepared to manage their own storage servers effectively? why do faculty rely on themselves for data storage at such high percentages? is self-reliance in this area a personal preference, or is it a lack of high-quality, affordable centralized infrastructure (either within the college or the university)? our library has an effective, valuable working relationship with is, and the results of this survey have sparked a substantive conversation about possible causes of and remedies for the lack of uptake of centralized cyberinfrastructure by the osu community.
is is currently working on implementing an advanced “object storage” cyberinfrastructure system, which is expected to effectively eliminate storage volume limits and significantly reduce costs for campus users. we will be working with them to advertise this option and provide support for how to take advantage of its features once this option becomes available. as faculty shift over to a new storage system, this will also provide opportunities to have conversations with them on topics such as file-naming conventions and folder organization. . targeting outreach and services one of the biggest challenges that we have faced in developing rds at osu libraries has been our lack of visibility on campus. for example, only percent of faculty reported that they are aware of our services related to developing or reviewing data management plans (data not shown). in light of this, we believed that it would be beneficial to better understand who on campus, in terms of their position, is handling which data-related tasks. the hope was that a better understanding of who is doing what would enable us to focus outreach efforts on the appropriate audience. this would make better use of our limited rds resources and improve our chances of service uptake. fras perform the majority of several rdm tasks, including data collection, metadata creation, quality control, and data backup, storage, and organization (figure ). they share about equally in data analysis and data archive tasks with those in the professorial ranks. the conclusion we can draw from these results is clear: we need to be reaching out to research assistants with support and training in all aspects of rdm best practices. fras play a large role in data storage and organization, with . percent of respondents indicating that fras are primarily responsible for this activity. 
as we collaborate with is in building out new storage infrastructure, we need to make the professorial ranks aware of the resource, but teach fras how to use it. of all rdm tasks, data sharing had the highest percentage of responses clustered in a single rank, at . percent. in this case, professors were predominantly responsible for the sharing of their research. given that professors are the ones serving as pi on the grants that support much of the research performed at osu, and that they oversee the projects and the work, it is no surprise that they act as gatekeepers to the products of their work. as federal mandates for the sharing of research results continue to expand across agencies, and become more rigorously audited (holdren, ), we can play an important role in helping pis stay up to date with the sharing requirements. it will also be important for us to be aware of the growing number of options for archiving and sharing research data, so that we can help pis discover and utilize them effectively. these options currently range from federal, discipline-specific archives (e.g. the national center for biotechnology information, or the national ocean data center), to private, discipline-agnostic sharing platforms (such as figshare) or repositories that exist to support the sharing of data sets associated with publications (like dryad).

another data sharing option that is becoming increasingly available for academic researchers is using an institutional repository. osu libraries has just begun the development of a new platform for our institutional repository (ir; currently on dspace). the ir will be built on a hydra/fedora repository backend (http://projecthydra.org), which will enable a much more nuanced and robust data model.
It will also allow for much more expansive and flexible metadata, and explicitly define relationships between objects in the repository. For example, we will be able to assign researcher IDs to data set creators (e.g. ORCID) and we will be able to retain the folder and file structure of data deposits. While the transition to a new repository system should be invisible to campus users (with the exception of a significantly improved user interface), we will be able to encourage depositors to use RDM best practices by supporting an expanded range of metadata schemas, preserving folder and file structure, and adding functional links between data sets and related content (both inside and outside of the IR). While the programmers work to develop and refine the IR platform, our data specialists will need to invest significant effort toward developing outreach and training materials for PIs and FRAs (since FRAs share in the work of data archiving). We will also need to be prepared to spend time offering workshops and guest lectures for broad audiences on features of the new IR and how to take advantage of them.

Metadata support is needed

The survey results regarding metadata practices are promising, but also provide a potential area for engagement with faculty and the development of training exercises. With (or percent) of survey respondents answering the question of whether or not they create metadata, percent report that they do (Table IV). This agrees well with the results of an international survey on researcher data management practices, which found that percent of researchers were creating metadata (Tenopir et al., ). Within the group that reported creating metadata, Tenopir et al. ( ) found that a combined . percent of the researchers were either not using a metadata standard, or were using a standard devised within their lab. Likewise, we found that a total of . percent of respondents to our survey were either using a standard within their group ( . percent) or were not using one at all ( . percent; Table V, including “other (specify)” responses not shown). It is encouraging that nearly half of OSU researchers report that they create metadata. However, the extent to which researchers are not using standard metadata schemas is an area where we can improve data stewardship on campus. Data sets that have metadata that conforms to a standard will be more interoperable with other data sets, more discoverable (by machines and by humans), and are likely to be more thoroughly documented compared to those that have an ad hoc schema.

Since so many researchers are already creating metadata, it’s not likely that we would get much traction with an introductory metadata workshop (unless perhaps, we geared it toward early-stage graduate students). There appears to be more of a need for training in how to implement specific metadata standards, ideally using an available tool to do so. For example, Ecological Metadata Language (EML) and FGDC/ISO were among the most commonly selected schemas among faculty who are using a standard (Table V). It would make sense, then, to develop a workshop to train faculty in metadata creation under each of those standards, using existing tools for doing so (e.g. Morpho, in the case of EML).

Survey respondents were also asked about how important it was for OSU to invest in providing certain data services, including guidance in how to use metadata standards. A combined total of percent of faculty rated this type of guidance as moderately or very important (Figure ), which indicates that there may be a sizeable pool of faculty who would be receptive to training in this area and that development of this training should be strongly considered.
We also know from the survey results that faculty rate creating metadata as one of their most difficult tasks (data not shown), and that research assistants are most commonly performing data documentation tasks (Figure ). This implies that we should be tailoring the content of metadata training, and how we do outreach, for the FRAs. This is a particularly good example of how valuable local survey results can be in helping to determine how to most effectively invest limited RDS resources and time.

Conclusions

The primary goal of launching a data stewardship survey was to characterize the RDM practices of OSU faculty, and subsequently determine where expanded RDS efforts could be most effectively applied. In this paper, we focussed on five aspects of the survey: the types and volume of data being generated by OSU faculty; their data storage habits; the roles and responsibilities for various data-related tasks; and metadata. We had a response rate of just over percent, with almost faculty completing the survey. After excluding results from faculty of unknown rank, we had data from completed surveys (though no survey questions were required, so response rates vary by question).

We found that OSU researchers are generating a wide variety of data types, and that practices vary between colleges. We were surprised to discover that such a large percentage of researchers on campus are generating digital images as a part of their research ( . percent). We are motivated to further engage with faculty on this topic in order to better understand their habits and how we can support them.

We also discovered that faculty are largely not availing themselves of centralized cyberinfrastructure resources. Instead, even faculty in colleges that have computing support are often going so far as to maintain their own data storage servers. In several cases, faculty who are maintaining their own servers are also generating data at a higher volume.
This level of ad hoc storage exposes a significant portion of the OSU research data corpus to significant risk of loss. This finding provides impetus for library collaboration with IS and the campus administration on how we can increase the utilization of centralized, replicated data storage options.

[Figure . Categorical responses to the question about the importance of providing metadata support. Question: “How important do you think it is for OSU to spend resources on providing the following services […] guidance on how to use appropriate metadata standards?” Response categories: not at all important, somewhat important, moderately important, very important. Notes: n= . The white dot shows the mean response.]

At OSU, faculty-level research assistants perform the majority of data-related tasks, including data collection, metadata creation, quality control, and data storage, organization, and backup. They share data analysis and data archiving responsibilities with PIs, while PIs play the most significant role in data sharing. These observations provide clear direction regarding whom we should be targeting with outreach, training and support for given RDM tasks. Since faculty report that they currently struggle with how and where to share data (results not shown), we now understand that we need to provide support directly to PIs in this area.

Finally, we reviewed the metadata habits of our faculty. While we are buoyed by the discovery that such a large percentage of faculty are creating metadata ( percent of respondents to that question, which had an percent response rate itself), we see room for improvement in the area of increasing the use of standardized metadata schemas.
Since so many faculty are already creating metadata, we believe that a more advanced workshop that would enable attendees to learn how to create standardized metadata pertinent to their disciplines is warranted. There is also likely a place for an introductory metadata workshop for early-stage graduate students (and open-minded faculty), given that over half of OSU faculty are not creating metadata (when those who did not answer the question are included).

Overall, the results of a campus-wide faculty survey on research data stewardship have provided us with significant insight into local RDM practices. We see several areas where we can develop targeted services and training, and areas where we would like to delve more deeply into what, how and why researchers do what they do with their data. The purpose of this case study, which shares abbreviated results from that survey, was to provide other academic libraries and/or RDS personnel with a few examples of the value and utility of conducting such a survey. To the extent that it agrees with the findings of other RDM practices surveys, we also believe that some of these results may be generalizable to the wider academic community (e.g. PIs are likely to be the best point of engagement for offering data sharing services, are often employing their own servers for data storage, and are somewhat unlikely to be employing standard metadata schema). It is almost certainly true in most places that university faculty produce an impressively diverse corpus of data types, formats, and sizes, and that these data sets are stored in a myriad of locations, from ideal to less so. The results of our survey, taken in context with the results of other such surveys, point to a ubiquitous need for thoughtfully planned academic RDS that are simultaneously broad in scope and strategically focussed on addressing specific local needs.
While it is possible to generalize about the common challenges that researchers face with respect to RDM, it is also undoubtedly true that in the endeavor to address those challenges, the devil is in the details.

Acknowledgments

The authors thank Lydia Newton and the OSU Survey Research Center for valuable guidance during the development of the survey. The authors appreciate prompt interactions with the OSU Office of Human Resources in providing faculty e-mail addresses, and the IRB’s expedited review process was fantastic. The authors thank Steve Van Tuyl for productive and engaging discussions regarding the survey results. The authors also thank two anonymous reviewers for their thoughtful comments, which helped to improve the manuscript.

References

Akers, K.G. and Doty, J. ( ), “Disciplinary differences in faculty research data management practices and perspectives”, Int. J. Digit. Curation, Vol. No. , pp. - . doi: . /ijdc.v i . .

Averkamp, S., Gu, X. and Rogers, B. ( ), Data Management at the University of Iowa: A University Libraries Report on Campus Research Data Needs, Univ. Iowa Libr. Staff Publ., Iowa City.

Avery, B.E., Chau, M., Vondracek, R. and Wirth, A.A. ( ), “OSU Libraries and research dataset curation: a beginning”, working paper, Oregon State University Libraries, Corvallis.

Boock, M. and Chadwell, F.A. ( ), “Steps toward implementation of data curation services”, Oregon State University Libraries, Corvallis.

Carlson, J. ( ), “Opportunities and barriers for librarians in exploring data: observations from the data curation profile workshops”, J. eScience Librariansh., Vol. No. , available at: http://dx.doi.org/ . /jeslib. .

Cornell University Library, R.D. ( ), “Digital imaging tutorial – contents”, Mov. Theory Pract. Digit. Imaging Tutor, available at: www.library.cornell.edu/preservation/tutorial/contents.html (accessed June, ).

e-Science Institute ( ), “Home page”, available at: http://duraspace.org/e-science-institute (accessed August, ).

Holdren, J.P. ( ), “Memorandum for the heads of executive departments and agencies: expanding public access to the results of federally funded research”, Executive Office of the President, Office of Science and Technology Policy, Washington, DC, February , available at: www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_ .pdf

JISC ( ), “Systems for managing digital media collections”, JISC Digit. Media, available at: www.jiscdigitalmedia.ac.uk/guide/systems-for-managing-digital-media-collections/ (accessed June, ).

Keon, D., Pancake, C. and Wright, D. ( ), “Virtual Oregon: seamless access to distributed environmental information”, Proceedings of the nd ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL ’ , ACM, New York, NY, pp. - .

Marchionini, G. ( ), Research Data Stewardship at UNC: Recommendations for Scholarly Practice and Leadership, University of North Carolina, Chapel Hill, NC.

Rolando, L., Doty, C., Hagenmaier, W., Valk, A. and Parham, S.W. ( ), “Institutional readiness for data stewardship: findings and recommendations from the research data assessment”, technical report, Georgia Institute of Technology, Atlanta.

Scaramozzino, J.M., Ramírez, M.L. and McGaughey, K.J. ( ), “A study of faculty data curation behaviors and attitudes at a teaching-centered university”, Coll. Res. Libr., Vol. No. , pp. - .

Steinhart, G., Chen, E., Arguillas, F., Dietrich, D. and Kramer, S. ( ), “Prepared to plan? A snapshot of researcher readiness to address data management planning requirements”, J. eScience Librariansh., Vol. No. . doi: . /jeslib. . .

Sutton, S., Barber, D. and Whitmire, A.L. ( ), Oregon State University Libraries and Press Strategic Agenda for Research Data Services, Oregon State University, Corvallis.

Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A.U., Wu, L., Read, E., Manoff, M. and Frame, M. ( ), “Data sharing by scientists: practices and perceptions”, PLoS ONE, Vol. , No. , e . doi: . /journal.pone. .

Whitmire, A.L. ( ), “Data and code from: variability in academic research data management practices: implications for data services development from a faculty survey”, Oregon State University Libraries, Corvallis, available at: http://dx.doi.org/ . /n j r

Witt, M., Carlson, J., Brandt, D.S. and Cragin, M.H. ( ), “Constructing data curation profiles”, Int. J. Digit. Curation, Vol. No. , pp. - . doi: . /ijdc.v i . .

Zgorski, L.-J. ( ), “NSF leads federal efforts in big data, press release - ”, Natl. Sci. Found., available at: www.nsf.gov/news/news_summ.jsp?cntn_id= &org=nsf&from=news (accessed June, ).
Appendix . Survey instrument

Introduction. Thank you for participating in the Center for Digital Scholarship & Services survey on research data stewardship at Oregon State University. Your responses will help us better understand the data landscape at OSU: how much data are being created and in what types and formats, and how faculty are managing them. Results from this research survey will contribute to our efforts to build better support and services for research data stewardship on campus. This survey covers topics including funding agency and publisher mandates regarding data, perceptions of data ownership, funding support and services for research data management, and current researcher practices.
These topics are relevant to all researchers at OSU, and your participation may benefit you and the wider OSU research community by enabling informed, targeted expansion of services to meet current needs.

Your participation is voluntary; you may skip questions or end the survey at any time. After the conclusion of the survey, your name and e-mail address will not be associated with your responses in any way. Results from the survey will be reported in aggregate by such factors as rank and college appointment, and the data set and analysis will be shared in a publicly accessible repository and via conference proceedings and publications. It is theoretically possible that your identity may be ascertained by pairing your rank and department, but there is no risk associated with answering the survey.

The survey will take approximately - minutes to complete. You may close the survey and return to it at any time using the survey link you received in the e-mail invitation. The security and confidentiality of information collected from you online cannot be guaranteed. Confidentiality will be kept to the extent permitted by the technology being used. Information collected online can be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses.

If you have any questions or concerns, please contact the principal investigator of this research, OSU Libraries’ data management specialist, Dr. Amanda Whitmire, at amanda.whitmire@oregonstate.edu or - - . If you have questions about your rights as a survey participant, please contact the Oregon State University Institutional Review Board (IRB) by e-mail at irb@oregonstate.edu and refer to study number (Survey on Research Data Stewardship at Oregon State University).
Corresponding author
Dr Amanda L. Whitmire can be contacted at: amanda.whitmire@oregonstate.edu

For instructions on how to order reprints of this article, please visit our website: www.emeraldgrouppublishing.com/licensing/reprints.htm or contact us for further details: permissions@emeraldinsight.com

Cercasi “digital scholar”: profili emergenti dei ricercatori in rete
Looking for “digital scholars”: emerging profiles of networked researchers

Antonella Esposito | PhD candidate at UOC, Universitat Oberta de Catalunya (ES) and Università degli Studi di Milano (IT)
* Università degli Studi di Milano | Via Festa del Perdono , Milano, Italia | antonella.esposito@unimi.it

Summary
What impact are the new Web 2.0 tools having on researchers’ communication and publishing practices? Drawn from a Master of Research dissertation, the article reports a selection of the findings from semi-structured interviews with as many senior researchers, junior researchers and doctoral students working in the humanities and social sciences, physics and medicine. While the most widespread attitude is a pragmatic, efficiency-minded approach to selecting and using old and new tools, a few isolated profiles nonetheless emerge of new ‘digital scholars’ who build their digital identity online alongside producing and distributing knowledge, despite the lack of legitimation from their own professional context.

Keywords: university, research practices, digital scholar, knowledge production and distribution.

Abstract
What impact are new Web 2.0 tools having on communication and publishing practices in the research field?
Drawn from an unpublished master’s dissertation, this paper reports a selection of findings from semi-structured interviews with senior, early-career and doctoral researchers working in humanities, social sciences, medicine and physics. The prevalent attitude is a pragmatic and efficiency-driven approach in selecting and using traditional and new tools. However, a few isolated examples have emerged of new ‘digital scholars’. These are researchers who, as well as producing and distributing knowledge, are devoted to building their personal digital identity, even though this aspect is not legitimized within their specific research context.

Key-words: higher education, research practices, digital scholar, knowledge production and distribution.

Esposito A. ( ). Cercasi “digital scholar”: profili emergenti dei ricercatori in rete. TD Tecnologie Didattiche, ( ), pp. -

The innovative potential of the social web for researchers’ communication, sharing and publishing practices has in recent years received attention from a number of important international studies, yet it remains largely unexplored territory, particularly in the Italian university landscape. This article is drawn from a dissertation written as a requirement of a Master of Research and centred on exploring the relationship between digital scholarship practices (that is, uses of web technology for producing and distributing knowledge in academia) and new trends towards open scholarship practices, with reference to the broadening, enabled by new generations of online communication tools, of the cultures of sharing that have always characterized the various scientific fields. The original qualitative study was carried out on a small sample of senior researchers, junior researchers and doctoral students affiliated with the Università degli Studi di Milano.
The method adopted involved conducting semi-structured interviews with researchers selected through “convenience” and “snowball” sampling strategies, in the humanities, social sciences, physics and medicine. The research aimed to provide a “snapshot” of current and emerging uses of old and new tools in research practices and to gather a range of views on the possible evolution of cultures of sharing within the different disciplinary fields. This article concentrates on the descriptive part of the study, concerning the technology practices recounted by individual researchers and their idea of the digital scholar. The selection of data presented here is interpreted through a comparison with large-scale empirical studies and by considering the types of online engagement individuals have with the web (White and Le Cornu, ) and the phenomenon of the new digital, networked and open researcher discussed by Martin Weller ( ).

The interpretive lens

In his most recent book, devoted to the emerging figure of the digital scholar, Martin Weller ( ) offers a reading of the changes under way in the role of researchers, who are subject to the pressures and opportunities of Web 2.0 technologies. The author draws on an analysis of innovative research practices that make use of the social web: for example, from the systematic use of blogs as a means of informal publication to the adoption of open access online journals; from the sharing of bibliographic references to the use of crowdsourcing (that is, relying on the collective, voluntary contribution of web users) as a data collection technique; from experimentation with open notebooks in science to the co-design of learning analytics tools.
Weller proposes an avowedly provisional description of the digital scholar as «someone who adopts digital, networked and open approaches to demonstrate specialization in a given field of study» (Weller, ). Weller gives a quick overview of how these three characteristics attributed to the researcher have in fact manifested themselves in different ways since the early 1990s. However, interpreted radically, this new model of the digital scholar implies for the author a process of democratization of the system for recruiting researchers, set in motion by the social web, thanks to which «an established researcher may well be someone without any institutional affiliation» (ibid.), since their reputation is defined more by networks of online relationships and by the digital identity built up over time than by institutional membership. This idea of the digital researcher implies a close relationship between use of the social web and the adoption of a deeper culture of sharing among peers and between teachers and students, in both university teaching and research. However, the spread of emerging attitudes has to reckon with what the author calls a «matrix of resilience towards digital scholarship», which considers obstacles and inhibiting factors at the level of national policy, the institution, the disciplinary area and individual inclination.

A conceptual scheme useful for interpreting the break between a before and an after with respect to web technologies, and the kind of online engagement required of the individual to use them, is instead the polarization between Visitors and Residents proposed by White and Le Cornu ( ).
The social web environment is made up of new types of software applications that are explained more effectively by the metaphor of space and “place”, understood as «being present with others» (White and Le Cornu, ), than by the metaphor of the tool, that is, «an instrument suited to a purpose» (ibid.). For the authors this implies a paradigm shift in the kind of online engagement people have: from Visitors, who use the web as a tool shed from which to take, as needed, the tool required for a given purpose and time frame, to Residents, who see the web as «a place to express opinions, a place in which relationships can be formed and extended» (ibid.) and where content and digital identity tend to overlap. The Visitors/Residents alternation is in fact conceived as a continuum along which to place individuals’ digital behaviours appropriately: the leaning towards one or the other of the two poles is independent of the degree of technological competence and, for White and Le Cornu, should be assessed against the full set of “digital literacies”, that is, the set of competences and skills required by the context and disciplinary field in which the individual operates.

Empirical studies

Many recent studies, conducted mainly in the United Kingdom and the USA, are devoted to investigating social media in research activities. These studies agree that there is very scant evidence of the celebrated new communication channels spreading among university teachers and researchers (Harley et al., ; Procter, Williams and Stewart, ; Schonfeld and Housewright, ).
Traditional channels, such as conferences and seminars: «often made more efficient by the transition to digital, but otherwise practically unchanged, still remain the most important means through which researchers communicate both formally and informally» (Schonfeld and Housewright, : p. ).

Web 2.0 tools (such as blogs, RSS feeds, wikis, Twitter) are not cited in these studies as popular mechanisms: in some cases they are even seen as «a waste of time because they are not subject to peer review» (Harley et al., : p. ). These findings are also confirmed by small-scale research, such as the survey by Kraker and Lindstaedt ( ) among e-learning researchers and the audit carried out by Pearce ( ) among Open University staff. On the one hand, Procter, Williams and Stewart ( : p. ), in their nationwide study in the United Kingdom, find that «the process of experimentation and innovation is at present highly localized and dispersed, and will in all likelihood go on for a long time». On the other, the CIBER study, an online questionnaire that surveyed researchers worldwide, maintains on the basis of its results that «social media have found a place in the research workflow for many academics and are proving their usefulness» ( : p. ).

Turning to demographic factors, available for the United Kingdom, adoption of Web 2.0 technologies appears increasingly widespread among doctoral students (JISC/British Library, ) and early-career researchers (James et al., ), but data on how frequently the new tools are used hold some surprises in favour of older generations of researchers (Procter, Williams and Stewart, ).
Even more interesting are the reasons researchers give regarding the likely adoption of new tools: «the services most likely to succeed are those tools in which researchers are involved in discovering, exploring and realizing new capabilities and adapting them to their own ends, in keeping with the cultures and contexts in which they carry out their work» (Procter, Williams and Stewart, : p. ).

Moreover, while working with colleagues from different institutions can favour the adoption of new technologies (CIBER, ), the most important barrier to taking up these emerging tools is the «lack of clarity about the concrete benefits that researchers might derive from them» (CIBER, : p. ) and the fact that «few services have reached the critical mass needed to show the network effect that stimulates pervasive use by specific communities» (Procter, Williams and Stewart, : p. ).

However, it should be borne in mind that these studies focused on emerging media cannot give a complete picture of technology adoption in research practices, because they often do not relate “old” and “new” technologies, overlook studies on the use of ICT in the decade preceding Web 2.0 (Fry, ) and fail to consider the role played by institutional digital environments and tools such as personal pages, digital libraries, email accounts and research information services (for the latter, Bitter and Muller, ).

The interviews: the appropriation of ICT

As was to be expected, in the sample examined email and the digital library (along with common personal productivity software) were indicated as the tools in continuous use by all the researchers interviewed, regardless of disciplinary area, age group, personal attitudes or research context.
Beyond these, however, numerous other applications are mentioned as emerging tools for use in everyday life and for specific research activities. The figure suggests the level of distribution of technologies not specific to disciplinary fields (both electronic devices and software applications) among the researchers who took part in the interviews.

Email has become over time a multi-purpose tool that goes well beyond the original one-to-one communication mechanism (Physics, # ): most researchers continue to rely on this "old" tool both for networking and for collaborative editing in the production of multi-author papers. The digital library, for its part, by enabling immediate access to an enormous quantity of published studies, effectively makes researchers more aware of «how much we have not yet read (and perhaps will never manage to read) on a given topic» (Humanities, # ) and helps to «improve interdisciplinary argumentation, precisely through the cross-reading of research» (Physics, # ).

However, the picture becomes more varied when choices about further tools come into play and the relationship between technological needs in everyday life and in research work is considered. Thus, an ordinary digital audio/video recorder becomes part of the everyday life of a social science researcher (# ), while a variety of general-purpose tools can serve a range of specialised functions for another researcher: «I use a Kindle to read e-books; an iPad, which supports Unicode, to read the classics of Greek literature.
With a smartphone I synchronise activities between home and work, including the time I spend consulting the digital library and contacts with students, if necessary» (Humanities, # ).

In some cases, however, the interviewees state that they use neither Facebook-type social networks nor smartphones because they feel no need for them; indeed, they denounce the strong commercial pressure that generates induced needs not justified by any real requirement.

Among the latest-generation tools, Skype is undoubtedly the favourite: it is commonly used to multiply the opportunities to meet colleagues or doctoral students at a distance (Social Sciences, # , # ; Medicine, # ), while in some cases (Humanities, # ; Medicine, # ; Physics, # , # ) it is considered very useful for quickly resolving problems when a collaborative project has stalled or for renegotiating decisions at the start of a new phase of work.

Keeping a blog on research-related topics, by contrast, is not a widespread activity, nor is it regarded as advisable, even as a means for doctoral students to practise academic writing: «The scientific method incorporates specific constraints and tools that you must necessarily acquire before you start building on them. You cannot sidestep these prerequisites. And it could be a risk for students to expose themselves too early. The risk is that of showing an approach that is not very solid, not very scientific» (Medicine, # ).

Nevertheless, among the social science interviewees an associate professor (# ) runs a blog to share reflections on their own research projects and a doctoral student (# ) contributes to a collective blog: «This multi-author blog is considered a sort of "showcase" for some of the lines of research developed within my department.
It hosts blog posts structured like research articles, comments on those works and sometimes guest posts as well. In a sense it also serves to do networking and to extend the boundaries of our research community» (Social Sciences, # ).

Figure . Level of distribution of technologies not specific to disciplinary fields among the researchers who took part in the interviews.

Among the researchers interviewed, the most common attitude towards the choice and patterns of use of technologies therefore seems to be a pragmatic approach, driven by the efficiency that these means can bring to their activities: «The use and choice of a digital tool is absolutely functional to my research needs, to the research questions and to the sample of subjects I have to study. It doesn't matter how difficult the tool is. If it can really help and suits the research situation, I am entirely willing to put in the time needed to learn how it works… this is the key to everything» (Medicine, # ).

Attributes such as speed, completeness of information (Humanities, # ) and the facilitation of existing practices (Social Sciences, # ; Medicine, # ) characterise a way of seeing technologies as means of solving practical problems. This attitude readily turns into an ability to adapt to a new tool (for example, Dropbox used for an interdisciplinary project) when the tool offers a benefit at zero cost in terms of time.

The "analogue" channels of seminars and conferences remain for all interviewees the privileged (because recognised) means of informally sharing work in progress, and some firmly stress the importance of personal acquaintance as a prerequisite for starting a research collaboration (Medicine, # ).
It is a fact that even the few who make intensive use of social networking environments state that these environments do not really help to widen their research community (Social Sciences, # ; Humanities, # ).

However, a subgroup of researchers (Social Sciences, # ; Humanities, # ; # ) seems more inclined to experiment with new tools and at the same time to build a digital academic identity through a sophisticated strategy for using the various means of communication. «I use a number of tools and environments that I tend to classify as "frequent", "academic" and "personal" tools, and which I often use in "mobile" mode (via iPad, BlackBerry) and for a variety of purposes, such as project management, online surveys, blogging, microblogging, bookmarking and meeting planning. I find Twitter very useful as a "knowledge feeder" that draws on an international community: in this role I think it is more efficient (at least for me) than other social networks. On Facebook, by contrast, I discuss research topics, studies, personal opinions and "worldviews" with my peers but also with students. On these same themes my blog hosts more personal reflections, which find an audience when I link the posts on Twitter. Finally, I use LinkedIn to attract new contacts around my professional commitments outside academia» (Social Sciences, # ).

Others prefer to keep their private networks separate from their research ones: «It happens that some researcher asks to join my network on Facebook: however, if an informal exchange of information is likely to become a research collaboration I prefer to move to other tools, such as email or Skype, to continue the discussion in a more private environment» (Humanities, # ).
However, the exploratory attitude adopted by the individual researcher does not seem to be so rewarding in some disciplinary fields: «I have a profile on many of these emerging technologies, such as social networks, social bookmarking sites, social citation... but so far I have not been able to identify any concrete benefit for my research work. For example, for a while I used CiteULike to exchange bibliographic references. But I soon realised that very few classicists are in the habit of sharing bibliographic references, and so my presence in that environment had no value» (Humanities, # ).

Such obstacles do not, however, discourage this researcher from maintaining a separate digital identity of her own (under a pseudonym) in social networks built around literary interests, where «I can play a different role that has nothing to do with my responsibilities as a university researcher» (Humanities, # ).

In another case, the emerging state of her own disciplinary field encourages remote collaboration for a doctoral student in (digital) archaeology: «First of all, at the CNR, where I work on a scholarship, there is intense teamwork, which shows for example in the use of Google Docs for the collaborative production of any document. But even more important is the fact that in our field it is vital to seek new contacts and collaborate remotely with the international community of open source developers, to co-design graphics environments and shape tools that enable the construction of virtual museums of archaeological sites. As "digital archaeologists" we are still relatively few and we are open to observing other experiences, solutions and types of research output, and we look for all this by scouring all kinds of sites and communities on the web» (Humanities, # ).
It can be noted, by contrast, that among senior researchers there is a more disenchanted attitude towards these emerging technologies: some point out that in the hard sciences «the emotional impact of ICT was already experienced in the early nineties» (Physics, # ); others note that the facilities offered by Web 1.0 have long made it possible to carry out remote activities, such as discussing research projects and collaboratively drafting texts, which these much-hyped new media flaunt as their own prerogative (Social Sciences, # ); others, finally, stress that the new generations, grappling with these "ready-to-use" technologies, risk losing critical control over the tools they use to do and communicate research (Humanities, # ).

Finally, on the idea of the digital researcher there is a clear division between those, in the humanities in particular, who do not regard the attribute "digital" as appropriate to their research practices and those, in the hard sciences, who give this "label" a meaning made obvious by decades of use of information technologies. Nevertheless, some new ideas emerge:

«Even though I work in a classical discipline and do not see much interest in technology around me, I must say that I feel I am a digital researcher: both because all my working tools are digital and online, and because I am in continuous contact with a group of developers to design new tools together» (Humanities, # ).

«There is a way of thinking about the digital researcher that has to do with our worldviews. In reality the whole set of my research ideas is strongly influenced and continually fed by everything that is shared online, through digital mechanisms» (Social Sciences, # ).
«On the one hand a digital researcher uses new tools to shape collaboratively new methods that broaden knowledge of archaeological sites and finds. On the other hand, such an approach enables the researcher to think of new and effective ways to reach a non-specialist public and share with it an extraordinary cultural heritage» (Humanities, # ).

Discussion

The previous section was intended to offer a detailed account of the personal experiences and opinions gathered from researchers dealing with old and new technologies. It should be pointed out, however, that these results are to be understood as the narratives of single individuals selected by a non-probabilistic method: they cannot therefore be considered a representative sample of current and emerging technological practices in the disciplinary areas examined. Moreover, by relying solely on the interview as an instrument, the study does not provide the contextual data that would be needed to frame the experiences narrated as situated practices. Despite these limitations, it is possible to account for prevailing behaviours and particular cases that make up the "snapshot" that was the stated aim of this study.

The data from the interviews illustrate an overall picture of research practices in which the appropriation of technologies by individual researchers implies a functional, efficiency-oriented approach, and reveals limited uptake of, and cautious interest in, Web 2.0 tools to support research activities, in line with the results of large-scale empirical studies.
The data suggest some differences between the cultures of the various disciplinary areas in how digital research practices are interpreted: see for example how the blog is viewed by the social science interviewees as a place for developing new knowledge and/or as a tool for disseminating ongoing research, while in the sciences it may even be perceived as a risk for doctoral students, who are engaged in the gradual and hierarchical acquisition of a robust scientific methodology.

However, some researchers in the sample take an eclectic and largely self-legitimising approach to the new communication technologies, even though their respective contexts appear rather indifferent to the potential of the new tools and environments.

Taking the visitors/residents continuum (White and Le Cornu, ) as a frame of reference, the individual appropriation of communication technologies, the selection of tools and the most recurrent uses found among the researchers interviewed reveal an attitude closer to the visitors model than to the residents one. Indeed the majority of researchers, in any area, tend to conceive of technologies as tools to be used if and when needed, for particular functions, and of the web as a "tool shed" rather than a social space. New tools and environments are introduced when they have a clear use and can improve the efficiency of existing practices: the typical case is the popularity of Skype for supporting interaction in remote meetings. However, a small minority of researchers show a more exploratory attitude towards the new technologies and in fact display a combination of engagement both as visitors and as residents, through the construction of their own digital identity across various social media.
But there may be tensions between this individual online engagement and the more common digital practices within the disciplinary area concerned. In this respect three cases emerge most clearly, two in the humanities and one in the social sciences, indicative of the kind of autonomy in digital behaviour that the individual expresses with respect to the context:

1) The case of a social science researcher with an established career, who claims to benefit from developing part of their academic discourse on social networks. Their choice is to self-legitimise their online engagement, combining the elaboration of "worldviews" (shared also with students) and networking with the construction of a new role as a public intellectual on the social web.
2) The case of a young humanities researcher who realises she is a "pioneer" (as regards this kind of online engagement) in her own classical discipline, and chooses to play out another kind of intellectual identity, focused on her expertise in literature and clearly separated from the "traces" of her digital identity as a university researcher.
3) The case of a doctoral student in "digital archaeology", who divides her time between the university context, tied to the tradition of classical studies, and that of the CNR, which in practice allows and requires her to adopt a habitus of collaborative work and to undertake networking in digital networks. These actions are at once ways of producing and of distributing knowledge, and aim at collectively building the method, tools and end product of a discipline striving to re-create itself in the digital.
In Weller's terminology, while all three of these digital scholars show a way of being digital that is inherent in their being networked, only the social science researcher, given the type of discipline and an established academic position, seems able to take on the role of digital scholar envisaged by Weller, that is, someone who produces and shares their expertise in digital networks regardless of institutional affiliation. The apprentice researcher in digital archaeology, on the other hand, appears to be the figure most favoured by her context, which is in transition and dispersed also in terms of institutional location. It would be interesting to understand how in future this "limbo" situation might turn into new kinds of formally recognised research practices, for example through the creation of a digital artefact to be submitted as a doctoral thesis.

Conclusions

The interviews presented in this article have provided an overview of the digital practices of a small sample of researchers affiliated with a single Italian university. Using the provisional definition of digital scholar developed by Weller ( ) as a means of synthesis, it can be said, on the basis of the data collected, that the majority of the researchers interviewed turn out to be traditionally digital, moderately networked and occasionally open (although this last aspect has not been dealt with here). In other words, the researchers show a consistent and efficient use of the "old" technologies, which seem to satisfy well all the current needs for communicating and distributing the process and products of research activity, within the formal constraints of the system for accrediting such work.
Most interviewees seem to see no clear benefit in moving to new technological means, in the absence of precise information, institutional support and some kind of recognition from their research community. However, what emerges strongly from this sample of interviewees is the distance between a late majority (Rogers, ) of researchers who do not use, and are substantially indifferent to, possible new channels of communication and publication, and the few early adopters (ibid.) who explore sophisticated uses of a variety of technologies, moving from personal use to academic and professional use, sometimes keeping them separate and at other times mixing them.

Indeed, the isolated cases of early adopters show a "digital-because-networked" attitude, characterised by intense use of the social web as a place in which to build, expand and renegotiate their digital identity, with a self-legitimising approach with respect to traditional modes of producing and distributing knowledge. These examples emerge in the humanities and social sciences, where a more individualistic culture of research work (Fry, ) enables a highly personalised use of technology; on the other hand, the very low propensity for division of labour in these disciplinary fields does not make it easy to share new digital practices within one's research community. Here the phenomenon that Tony Bates ( : p. ) called the lone rangers seems to recur, with reference to those teachers pioneering e-learning in university contexts who risk remaining isolated in their experiments.
However, unlike the lone rangers of e-learning, the pioneers of digital scholarship practices interviewed in this study seem able to find on their own, at least in this exploratory phase, both a specific interest and forms of gratification in experimenting with new forms of communication and of building their academic profile.

While studies of researchers' actual uses of the social web are still too few to offer a reliable frame of reference for the phenomenon, it is nonetheless possible to glimpse a line of development in the spread of the new communication technologies in academic research contexts. This is the use of social media to achieve new forms of impact for research activity, both within the academic community itself and in wider society. The social web opens up new spaces for personal action that help create novel relationships between the dimensions of an academic career. Consider, for example, the growing convergence between the dimensions of "personal prestige" and "networking": «whereas academics once relied on personal acquaintance among colleagues to make their work known and to increase the number of citations, what counts today is how easy it is to find a scholar's work and how many versions of the same work are out there in different channels, available to other academics and researchers» (LSE Public Policy Group, : p. ).
The added value that the new communication media can offer for the visibility and dissemination of one's research is in fact a motivation that cuts across research fields and types of online engagement, since it is independent both of any personal inclination to explore new technologies and of ideological adherence to a broader culture of sharing content, methods and collaborative practices.

Such a line of development implies in any case that universities commit to offering, particularly during the apprenticeship of future researchers, the information, support and regulatory tools whose absence seems to be the main obstacle to the adoption of social media by individual researchers.

Bibliography

Bates A. W. ( ). Technology, e-learning and distance education (2nd ed.). Abingdon, UK: Routledge.
Bitter S., Muller A. ( ). Social networking tools and research information systems: do they compete? In Proceedings of the ACM WebSci ‘ , rd International Conference on Web Science (Koblenz, DE, - June ). London, UK: The Web Science Trust. http://journal.webscience.org/ / (last accessed . . ).
CIBER ( ). Social media and research workflow. London: University College London. http://www.ucl.ac.uk/infostudies/research/ciber/social-media-report.pdf (last accessed / / ).
Fry J. ( ). Scholarly research and information practices: a domain analytic approach. Information Processing and Management, special issue Formal Methods for Information Retrieval, ( ), pp. - .
Harley D., Acord S. K., Earl-Novell S., Lawrence S., King C. J. ( ). Assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines. Berkeley, CA, USA: Center for Studies in Higher Education. http://escholarship.org/uc/item/ x g (last accessed . . ).
James L., Norman J., De Baets A. S., Burchell-Hughes I., Burchmore H., Philips A., Sheppard D., Wilks L., Wolffe J. ( ). Lives and technologies of early career researchers. London, UK: JISC. http://www.jisc.ac.uk/publications/reports/ /earlycareerresearchersstudy.aspx#downloads (last accessed . . ).
JISC/British Library ( ). Researchers of tomorrow. A three-year (BL/JISC) study tracking the research behavior of ‘Generation Y’ doctoral students. Second annual report - . London, UK: JISC. http://www.jisc.ac.uk/news/stories/ / /researchersoftomorrow.aspx (last accessed . . ).
Kraker P., Lindstaedt S. ( ). Research practices on the web in the field of technology enhanced learning. In Proceedings of the ACM WebSci ‘ , rd International Conference on Web Science (Koblenz, DE, - June ). London, UK: The Web Science Trust. http://journal.webscience.org/ / (last accessed . . ).
LSE Public Policy Group ( ). Maximising the impacts of your research: a handbook for social scientists. http://blogs.lse.ac.uk/impactofsocialsciences/ / / /maximizing-the-impacts-of-your-research-a-handbook-for-social-scientists-now-available-to-download-as-a-pdf/ (last accessed . . ).
Pearce N. ( ). Digital scholarship audit report. Milton Keynes, UK: Open University. http://oro.open.ac.uk/ / /pearce( ).pdf (last accessed . . ).
Procter R., Williams R., Stewart J. ( ). If you build it, will they come? How researchers perceive and use web 2.0. London, UK: Research Information Network. http://www.rin.ac.uk/system/files/attachments/web_ . _ screen.pdf (last accessed . . ).
Rogers E. M. ( ). Diffusion of innovations (IV ed.). New York, NY, USA: Free Press.
Schonfeld R. C., Housewright R. ( ). Faculty survey : key strategic insights for libraries, publishers, and societies. New York, NY, USA: Ithaka S+R. http://www.ithaka.org/ithaka-s-r/research/faculty-surveys- - /faculty-survey- (last accessed . . ).
Weller M. ( ). The digital scholar: how technology is transforming scholarly practice. London, UK: Bloomsbury Academic (open access). http://www.bloomsburyacademic.com/view/digitalscholar_ /book-ba- .xml (last accessed . . ).
White D. S., Le Cornu A. ( ). Visitors and residents: a new typology for online engagement. First Monday, ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / (last accessed . . ).

The invention and dissemination of the spacer GIF: implications for the future of access and use of web archives

Research article

Trevor Owens & Grace Helen Thomas

Published online: April
This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply

Abstract

Over the last two decades publishing and distributing content on the web has become a core part of society. This ephemeral content has rapidly become an essential component of the human record. Writing histories of the late 20th and early 21st century will require engaging with web archives. The scale of web content and of web archives presents significant challenges for how researchers can access and engage with this material. Digital humanities scholars are advancing computational methods to work with corpora of millions of digitized resources, but to fully engage with the growing content of two decades of web archives, we now require methods to approach and examine billions, ultimately trillions, of incongruous resources. This article approaches one seemingly insignificant, but fundamental, aspect of web design history: the use of tiny transparent images as a tool for layout design. It surfaces how traces of these files can illustrate future paths for engaging with web archives. This case study offers implications for future methods allowing scholars to engage with web archives.
It also prompts considerations for librarians and archivists in thinking about web archives as data and about the development of systems, qualitative and quantitative, through which to make this material available.

Keywords: Web archiving · Computational scholarship · Cryptographic hash · Digital history

International Journal of Digital Humanities ( ) : –

The following research represents the opinions, perspectives and ideas of the authors. It does not necessarily represent the perspectives of any institutions with which they are affiliated.

Grace Helen Thomas (grth@loc.gov) · Trevor Owens (trow@loc.gov)
U.S. Library of Congress, Washington, DC, USA

‘The web is ruined and I ruined it.’ This is the title of author and web designer David Siegel’s post to XML.com (Siegel ). Siegel, the author of the book Creating Killer Websites (Siegel ), went on to explain his role in what he describes as ‘the roots of HTML terrorism’ (Siegel ). Specifically, he contends that ‘the hacks I’ve espoused, especially the single-pixel GIF, and using frames and tables to do layout, are the duct tape of the web.’ All of these elements of design went out of fashion. As he explains, ‘I ruined the web by mixing chocolate and peanut butter so they could never become unmixed. I committed the hangable offense of mixing structure with presentation.’ In particular, he advocated the use of these single-pixel, clear GIF files as a way of building page layouts. These kinds of technical discussions of design practices in web history are invaluable resources for understanding the records of the web (Owens ). One of his self-proclaimed offenses, ‘the single-pixel GIF,’ became a subject of analysis and study by digital artist and folklorist Olia Lialina in an online exhibit (Lialina ).
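To make the single-pixel GIF layout trick concrete, here is a minimal sketch, written in Python only as a convenient way to generate markup. The file name `clear.gif` and the helper names `spacer_row` and `indent_with_spacers` are illustrative assumptions, not markup taken from Siegel's pages or from any archived site. The idea it shows is simply that each transparent image still occupies space, so a run of them pushes the following content to the right.

```python
def spacer_row(count, src="clear.gif"):
    """Return markup for `count` 1x1 transparent images placed side by side.

    `src` is a hypothetical file name for the transparent image.
    """
    tag = '<img src="{}" width="1" height="1" alt="">'.format(src)
    return tag * count


def indent_with_spacers(content_html, count, src="clear.gif"):
    """Prefix content with a run of spacers so it starts further to the right."""
    return spacer_row(count, src) + content_html


if __name__ == "__main__":
    # Three invisible images, then the visible content.
    print(indent_with_spacers("<b>Welcome</b>", 3))
```

Because the images carry no visible pixels, a reader of the rendered page sees only the shifted content; the technique is visible only in the source, or when the image link breaks.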
as part of an ongoing effort to explore and explain the early history of the web, lialina produced the online exhibit illustrated below. this presentation, clear.gif, shows a series of transparent gifs wrapped in elaborate frames. widely referred to as ‘spacer’ gifs, these single-pixel, transparent gifs were used first and foremost as a way of controlling the placement and presentation of content on a website. they were invis- ible, or rather transparent, i.e. whatever was behind them showed through. however, they still took up space. so a designer could encode into their html document any number of spacer gifs to appear in a row in order to control the placement of any given element on a page. this provided a means of controlling exactly where visual elements would appear on a given web page. as is evident in fig. , they only become visible when broken, when the link to the image file no longer resolves. these tiny files, the presence of which is only conspicuous when they are no longer present, are invaluable aids which help us understand the history of the web. simultaneously, exploration of the study of these files, furthermore, offers insight toward the future of enabling scholarly research on the history of the web. in our explanation of the findings of this investigation, we identify key ways of working with records of the web, and born-digital collections more broadly, which can inform our future understanding of our digital past. the single-pixel gif is an element of design, invisible like so many other aspects of design on the web, but still encoded in highly structured ways. in an interview about her ongoing work to explore and understand the early web, in particular the geocities archive, lialina explains, ‘i remember, everybody who made pages in the s had cgif, maybe it was called clear gif, some people would call it fig. 
screenshot of clear.gif online exhibit international journal of digital humanities ( ) : – zero-dot-gif, but it was this transparent one that would help you to make layouts.’ (johnson ). her exhibit functions as a way of drawing attention to this practice, but it also provides a point of entry to begin to explore the form and function of the history of these images in the history of web design. in , jesper rønn-jensen, asked exactly this kind of question as a blog post: who invented the spacer gif (rønn-jensen ). rønn-jensen is an early web developer who has remained passionate and outspoken about the history of web design and development. in an update to the post, rønn-jensen notes that siegel claimed credit in personal email correspondence with him. specifically, siegel claimed ‘i invented it all by myself in my living room.’ but at that point, another designer, software developer joe kleinberg, chimed in and claimed that he was really the one who had invented it (rønn-jensen ). what answers do web archives and other born-digital archives offer to such questions? furthermore and in some ways more interestingly, in what ways might we be able to track the emergence and decline of something like the single- pixel gif? cultural heritage organizations such as the internet archive, the british library, the library of congress, and hundreds of others across the globe are working to collect and preserve the web. many of these institutions now have significant holdings documenting more than two decades of the web’s history. in what follows, we approach these collections as a means of exploring the ways in which we can ask and answer such questions concerning web archives. before diving into specific questions regarding single-pixel gifs, we contextualize this work in ongoing discussions about the future of access and use of digital collec- tions. 
cultural heritage institutions are increasingly exploring ways of enabling computational scholarship by treating their collections as data. many of these conversations concern digitized collection materials, but we now have access to massive corpora of born-digital material, and these born-digital collections are functionally born computable for digital scholarship. in that section, we briefly introduce computational scholarship and how approaching digital collections as data sets results in new kinds of research. we then provide examples of ongoing projects which focus on applying computational scholarship to web archives as a model of treating web archives collections as data to support new and evolving kinds of research. next, we present the findings of our efforts to trace the history of single-pixel gifs as far back as the first instances appearing in the internet archive and library of congress web archives. then, we share the findings of the use of computational scholarship, more specifically distant reading, on the uk web archive, headquartered at the british library, to map the patterns of single-pixel gifs over a -year period of web harvesting. finally, using our methods as a case study, we discuss the findings of an approach based on tracing tiny files through terabytes of messy web archives data and the implications of these findings for researchers and digital library practitioners.

situating web archives in trends in online collections

without realizing it, humanists have been using computational methods to carry out their research for decades by using full-text search to explore electronic databases (underwood ) and, prior to this, with the advent of the computer, grappling with how to integrate computational analysis into historical inquiry, if at all (anderson ).
in other words, much of current scholarship is already computational, but many people are unaware of the role that computation plays in their research and discovery process. over the course of the last twenty years, a more sophisticated approach to computational research has developed for humanists who are working with cultural heritage collections and imposing pattern and relevance algorithms directly onto the contents they are studying. ‘distant reading’ has evolved into its own methodology of studying texts at scale (jockers ), especially for text-based collections. letting a computer ‘read’ hundreds of thousands of novels in seconds has significantly expanded the types of questions we can ask about collections, beyond keyword and word co-occurrence patterns. for example, text mining can identify linguistic patterns, highlight and map named entities (finkel et al. ), compare authors’ styles, create connected network graphs, and generate interrelated topics (blei et al. ) over a collection or corpus. these methods have been applied to a collection of twenty thousand novels to predict trends in the literary world (archer and jockers ) and to thousands of articles from eighteenth-century (newman and block ) and nineteenth-century (smith et al. ) newspapers to discover trends in news coverage and reprinting over time and geographic location.

the work has continued with specifically non-text-based collections. scholars have used similar distantly-consumptive analytic methods on their recorded sound (clement et al. ), image (lorang et al. ), audio-visual, visual, and crowdsourced collections, whether the content in the collection began as digital items or had been digitized. indeed, the expansion of these methods has itself resulted in the need for libraries, archives, and museums increasingly to rethink the modes of access they provide to collections. computational scholarship is powered by corpus level engagement with works and artifacts as data.
the library of congress collections as data events and the related always already computational initiative have stimulated conversation concerning access for digital collections and helped articulate visions for multi-modal access to digital collections (mears ). the series brought together experts and practitioners creating digital collections and using digital collections in an effort to highlight common themes throughout the process. major takeaways included a need for iterative processes with the goal of providing digital collections with better access, form, and quality (padilla ).

to date, much of the work on broad access to digital collections has focused on digitized content. however, work on web archives is one significant exception. the wayback machine, the platform developed by the internet archive to provide access to web archives, has long been the primary means of entry to viewing web archives content. alternatively, archives may use other, similar playback software, such as the community-driven open-source openwayback or pywb, a version of wayback written in the programming language python. it is important to note that the wayback machine and other, similar efforts are not the archive. rather, as software, the wayback machine, openwayback, and pywb provide windows onto the resources stored in any web archive. with basic computer and internet literacy, one is able to navigate through archived web content on the wayback machine much like browsing the live web.

see the wiki for openwayback at https://github.com/iipc/openwayback/wiki
see the documentation for pywb at https://pywb.readthedocs.io/en/latest/manual/apps.html#wayback-pywb.
however, as web archives have grown exponentially from gigabytes to petabytes, clicking through weekly captures of one section of one website gives users only a tiny fraction of the archive’s content and even of that particular website over time. the sheer amount of web archive data now necessitates computational methods to detect patterns across the archived web and highlight areas of the archive in which to dig deeper.

in the autumn of , the library of congress commissioned a pilot project simulating a potential researcher using lc web archives (gallinger and chudnov ). the lc web archiving team provided more than five terabytes of web archives content by means of a secure cloud platform to enable bulk use and analysis. the web archive file format, or warc, is the standard aggregate file for harvested web content. it combines multiple resources as content blocks within each warc, as well as associated metadata for each resource. warc files are well suited for use in a playback mechanism like the wayback machine, but the structure and scale of these files are often challenging for researchers to work with directly. utilizing the cloud infrastructure and distributed computing provided by the third-party service, the contractors generated derivatives of the warc files: web archive transformation (wat) files. wat files are a slimmed version of warc files which consist only of metadata for each resource contained in a warc file, excluding the resource itself. this metadata includes the referring uri, the resource uri, mime type, a timestamp of harvest, and the size of the resource. wat files are a lightweight option for dealing with web archive resource metadata, taking up less than % of the space of a warc file. for the pilot project, the contractors ultimately used the referring uris and resource uris to create link analysis visualizations in order to map how each website domain in the collection linked externally to other website domains.
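the link analysis described above can be illustrated with a minimal python sketch. it assumes the metadata has already been reduced to pairs of (referring uri, resource uri); the function names and the sample pairs are hypothetical and are not taken from the pilot project. the sketch compares hostnames rather than registered domains, which is a simplification.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_of(uri: str) -> str:
    """Return the host part of a URI, or an empty string if there is none."""
    return (urlsplit(uri).hostname or "").lower()


def domain_links(pairs):
    """Count links between distinct hosts from (referrer, resource) pairs."""
    edges = Counter()
    for referrer, resource in pairs:
        src, dst = domain_of(referrer), domain_of(resource)
        # Keep only external links: both hosts known and different.
        if src and dst and src != dst:
            edges[(src, dst)] += 1
    return edges


# Hypothetical metadata pairs, standing in for values read from derivative files.
SAMPLE_PAIRS = [
    ("http://example.org/index.html", "http://example.com/a.gif"),
    ("http://example.org/about.html", "http://example.com/b.gif"),
    ("http://example.org/about.html", "http://example.org/style.css"),
    ("http://example.net/", "http://example.org/index.html"),
]

edges = domain_links(SAMPLE_PAIRS)
for (src, dst), count in sorted(edges.items()):
    print(f"{src} -> {dst}: {count}")
```

the resulting weighted edge list is the kind of input a network visualization tool would take; internal links are dropped because the analysis is about how domains link to one another.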
network analysis is a common way for researchers to explore web archives and for institutions practicing web archiving to begin understanding the breadth of their own collections or perform quality review and completeness checks. this type of analysis over web archives provides a snapshot in time, i.e. a high-level view of a subset of the archive.

in order to arrive at a deeper understanding of researchers’ needs, the british library’s uk web archiving team hosted ten researchers on campus in under the big uk domain data for the arts and humanities (buddah) project. these researchers aimed to complete case studies while collaborating with the uk web archiving team as a long term project. as a result, the case studies highlighted ways in which communication between the web archiving team, project managers, and researchers would be improved and more intuitive interfaces and datasets could be created for the researchers.

see the file format description at https://www.loc.gov/preservation/digital/formats/fdd/fdd .shtml.
see the internet archive documentation at https://webarchive.jira.com/wiki/spaces/ars/pages/ /wat+overview+and+technical+details.
see the uk web archive link analysis visualization https://www.webarchive.org.uk/ukwa/visualisation/ukwa.ds. /linkage and the ongoing web archives for longitudinal knowledge (walk) project by partners at the university of waterloo, the university of alberta, and york university http://webarchives.ca/ for more information.
to this end, there have been efforts to lower the barrier of entry to warcs and analysis of web archives content. the mellon-funded archives unleashed toolkit (aut), which grew out of warcbase (lin et al. ), is currently the most robust system providing streamlined access to web archives data for researchers. aut consists of web archives data loaded onto a high-performance computing platform, with data analysis interfaces at the ready. similarly, web archiving systems api, or wasapi (bailey and taylor ), is an effort funded by the institute of museum and library services (imls), which seeks to map an interoperable api-based model for access to web archives data.

the existence and evolution of these efforts gesture toward a future in which we move increasingly away from one-at-a-time views of rendered web pages toward a model of treating web archives as digital corpora. it took tremendous effort to build something like the google ngram viewer and to make sense of the noise in digitized texts. in contrast, libraries, archives and museums have billions of born-digital files in their web archives which, as born-digital objects, are born ready for computational scholarship.

having provided this context and background, we return now to the questions raised at the beginning of this essay. traces of the single-pixel gif in web archives will offer some insights into the potentials of this mode of engaging with web archives.

explorations in the history of the single-pixel gif

what can we understand about the history of the single-pixel gif when we begin by approaching web archives computationally? part of the initial impulse to conduct this research was lialina’s online exhibit of single-pixel gifs. if we take these hand-picked and curated examples of single-pixel gifs as an initial source, we can begin to characterize them and, in turn, use that characterization to query web archives.
lialina’s exhibition links to a series of live manifestations of these images, presented in the list below. of particular note, these are each specific locations on the web where one can find, or could once find, a copy of a spacer gif. after the last forward slash in each of the urls, we find the filename and extension. one of the exhibited works comes directly from siegel’s site (killersites.com), but in each of them, even just at the filename level, we can see the different names these files take on:

http://www.geocities.com/clipart/pbi/c.gif
http://pic.geocities.com/images/pixel.gif
http://www.google.com/clear.gif
http://killersites.com/killersites/resources/dot_clear.gif
http://visit.geocities.yahoo.com/visit.gif
http://blingee.com/images/spaceball.gif
http://www-cdr.stanford.edu/~petrie/blank.gif
http://img.artlebedev.ru/;-)/n.gif
https://mail.google.com/mail/images/cleardot.gif
http://www.google.com/images/cleardot.gif

for final reports from the buddah project, see the blog https://buddah.projects.history.ac.uk/ / /.
http://archivesunleashed.org/about-project/.

characterizing/identifying files

below we have characterized each of the files using two methods. first, by querying their instances on the wayback machine, we have identified the earliest date for which the internet archive and the library of congress have captures of each respective resource in the specified location. second, we have computed a sha- cryptographic hash for each file. a cryptographic hash function is an algorithm which takes a given set of data (such as a file) and computes a sequence of characters which can then serve as a unique identifier for that data. even changing a single bit in a file will result in a different sequence of characters. for a sense of just how high that confidence can be, it is worth noting that a cryptographic hash offers more confidence as a characterizer of individualization than a dna test does for uniquely identifying a person (kruse ii and heiser , p. ). of these, the earliest recorded capture of any of the single-pixel gifs is the geocities clipart link.
with that noted, this only tells us when that file was acquired by respective institutions, not necessarily when it was created. this is a recurring pattern which we will encounter as we work through our analysis. a central challenge in interpreting the contents of web archives is retaining a certain level of skepticism: to what extent are any research findings mapping trends in web history, versus trends in how the web was collected? this is a topic we further explore later.

url | earliest lc | earliest ia | sha- | match
http://www.geocities.com/clipart/pbi/c.gif | / / | / / | f da a e ed b cb d a d b |
http://pic.geocities.com/images/pixel.gif | / / | / / | e a e d eac d f c |
http://www.google.com/images/cleardot.gif | / / | / / | d f a f a a f c ca |
http://www.google.com/clear.gif | / / | / / | a d c a d bcd a bb |
http://killersites.com/killersites/resources/dot_clear.gif | / / | / / | e a e d eac d f c |
https://mail.google.com/mail/images/cleardot.gif | / / | / / | d f a f a a f c ca |
http://visit.geocities.yahoo.com/visit.gif | none | / / | faa f c b b f f a a c d |
http://blingee.com/images/spaceball.gif | / / | / / | daeaa b f f bc d c bd acb b b a |
http://www-cdr.stanford.edu/~petrie/blank.gif | / / | / / | d cc dc e c d ad cfb b ac e a ef f |
http://img.artlebedev.ru/;-)/n.gif | none | / / | daeaa b f f bc d c bd acb b b a |

significantly, by hashing the files, we have found seven distinct files out of the original ten. the chart above is coded to show three sets of duplicate files (coded ‘ ,’ ‘ ,’ and ‘ ’ in the ‘match’ column) and four unique files. the files within each duplicate set are bit-for-bit identical (i.e. the file coded with ‘ ’ is identical to the other file coded with ‘ ’). in most cases where this occurred, one could deduce that the files with identical hash values are themselves historically related. in other words, one file is likely a later, identical copy of the original. however, in this unique case, given the minuscule file size, we cannot assume any interrelation of identical files. a tiny transparent image file leaves little room for its maker’s unique creativity, and it is possible that several users created identical files using identical processes.

single-pixel gif trends across corpora

given that we have distinct, digital fingerprints for each of these single-pixel gifs in the form of their sha- hash values, it becomes possible to query an entire corpus of a web archive to determine where and when files with the same hash value were collected. to date, the uk web archiving program remains unique in that it stores a copy of all the content it has collected in a high-performance distributed computing system. as a result, it is possible to run queries across the entirety of the content of their web archive.
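the fingerprinting step can be illustrated with a short python sketch. this is a sketch under stated assumptions, not the procedure used in the study: the sha variant named in the source text is not legible, so sha-256 is used here purely as an assumption, and the byte strings and urls are hypothetical placeholders rather than real gif files. it shows that identical bytes share a digest, that flipping a single bit yields a different digest, and how urls can be grouped into duplicate sets.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data. The algorithm choice is an assumption."""
    return hashlib.new(algorithm, data).hexdigest()


def group_by_fingerprint(files):
    """Map each digest to the sorted list of URLs whose bytes produced it."""
    groups = defaultdict(list)
    for url, data in files.items():
        groups[fingerprint(data)].append(url)
    return {digest: sorted(urls) for digest, urls in groups.items()}


# Placeholder bytes, not a real image; 'flipped' differs from it by one bit.
payload = b"placeholder bytes standing in for a tiny gif"
flipped = bytes([payload[0] ^ 0x01]) + payload[1:]

# Hypothetical URLs: two hold identical bytes, one holds the altered copy.
SAMPLE_FILES = {
    "http://example.org/pixel.gif": payload,
    "http://example.com/dot_clear.gif": payload,
    "http://example.net/blank.gif": flipped,
}

groups = group_by_fingerprint(SAMPLE_FILES)
duplicates = [urls for urls in groups.values() if len(urls) > 1]
print(duplicates)
```

once each file is reduced to a digest, finding copies across a corpus becomes a lookup on a short string rather than a byte-by-byte comparison of files.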
andrew jackson, the technical leader for the uk web archives, generously scanned the uk web archive for appearances of these seven hash values. jackson then published the scripts and data resulting from this query (jackson ). the charts below display the number of times each of the seven distinct single-pixel gifs from the geocities data set appeared in the uk web archive collections over time. a first pass at the findings shows that there are three extant examples of gifs in the archive dating from : two instances of blank.gif, three instances of pixel.gif, and instances of spaceball.gif. hence, we can conclude that spaceball.gif was the earliest widely used or at least widely collected example of single-pixel gifs. this year is significantly earlier than the first instance of each gif from the geocities data set previously discussed (fig. ).

each of the seven unique gifs studied here existed in the uk web archive by . yet, as the charts show, they made their way across the web and through time in strikingly varied ways. cleardot.gif (a category documented in two distinct, original google urls) emerges as the most widely collected gif out of the seven. in , the british library collected and documented the presence of more than one million copies of cleardot.gif ( , , copies). this collection results in a fascinating spike, while the other six gifs nearly vanish from the archive after having had a large presence in and . clear.gif had the earliest significant spike in , and the usage of dot_clear.gif/pixel.gif shot up to nearly , entries (combined total) in . blank.gif resurfaced in and all seven gifs have low representation in . to begin understanding the trends of single-pixel gifs over time, it is important to consider whether the gifs themselves had distinct histories and to examine the details of those histories, separately from collection practices.
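counts of this kind can be produced by tallying index records per hash value and per year. the python sketch below is not the published script; it assumes, hypothetically, that each index record is a pair of a 14-digit capture timestamp (yyyymmddhhmmss, a common convention in web archive indexes) and a digest, and the digests and dates in the sample are invented.

```python
from collections import Counter


def yearly_counts(records, target_digests):
    """Count captures per (digest, year) for the digests of interest."""
    counts = Counter()
    for timestamp, digest in records:
        if digest in target_digests:
            # The first four characters of the timestamp are the year.
            counts[(digest, timestamp[:4])] += 1
    return counts


# Hypothetical index records: (capture timestamp, digest).
SAMPLE_RECORDS = [
    ("20010315120000", "digest-a"),
    ("20010920093000", "digest-a"),
    ("20020101000000", "digest-a"),
    ("20020101000500", "digest-b"),
    ("20020607101500", "digest-c"),
]

counts = yearly_counts(SAMPLE_RECORDS, {"digest-a", "digest-b"})
for (digest, year), n in sorted(counts.items()):
    print(digest, year, n)
```

a tally like this counts captures, not uses on the live web, so a spike or drop may reflect how often a site was crawled as much as how often the file was used.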
exploration of the histories of each of these individual files through independent searching reveals the varied ways in which these files have been developed and used. as a post by martin brinkmann from documents, spaceball.gif was used by flickr, the community-driven website launched in hosting photographs and images, to prohibit easy download of the image files by individuals or crawlers. when a user would attempt to right click and download an image file, they would instead be tricked into downloading a tiny, transparent gif which had been invisibly masking the underlying displayed image (brinkmann ). similarly, cleardot.gif (much like spaceball.gif) appears to serve a distinctly different purpose from a spacer gif solely used for formatting. often referred to as ‘web beacons’ or ‘web bugs,’ these files are widely known to be used as a means of surveillance and tracking. specifically, their tiny size and invisibility mean that they load quickly, without being detected. each time one of these files loads, it results in a ping back to the source. indeed, the url https://mail.google.com/mail/u/ /images/cleardot.gif is an example of this (pabouk ). critiques of these methods go back to at least late , when sites for companies including quicken, fedex, metamucil, oil of olay, and statmarket were identified as using this technique (smith ).

fig. appearances of the seven distinct gifs in the uk web archive from to

these histories present interesting and challenging issues, admittedly beyond the scope of the current study: given the range of functions of single-pixel, transparent gifs, how are we to understand their presence in different locations over time?
to what extent can we take the presence of a single-pixel transparent gif as serving a formatting function when the same file has been used for other purposes, such as blocking the download of other image files? using the data, is it possible to identify which uses of the single-pixel, transparent gif predate other uses? if we were to zero in on that early year, we might well be able to pinpoint the url at which each of these images first appeared in the archive and the day they first appeared, which would constitute a possible next step for this kind of study.

discussion: what invisible files let us see

there are millions of copies of single-pixel, transparent gifs in the world’s web archives. each one is a trace of a practice and method of presenting information on the web. some are traces of changes in web design. some are traces of methods of surveillance. by working back and forth between the urls for these tiny, functionally invisible images and their hash values, we have begun to map some of this history. the findings of this preliminary mapping offer a range of considerations for the future of access and use of web archives and the history of the web. they suggest requirements for a better understanding of crawling and collecting practices, new methods for characterizing and indexing files, and issues for the interpretation of born-digital collection data.

seeing web history or web archiving history?

a web crawler whose job is to archive particular websites makes appraisal decisions in a different way than a human archivist processing a donated collection. in both processes, the documents are in front of the archivist or the crawler, and each must decide which to keep and which to pass over. however, all of the rules for a crawler must be set before the crawl starts. it is possible to change the crawler behavior during the crawl, but this change takes a significant amount of effort and ongoing quality review.
to avoid crawling the entire internet every time, the rules tell a crawler what to archive and what to avoid. restricted areas can be specified as entire domains or as a regular expression matching all urls that contain the string ‘login,’ for example. for this study, it is possible that any dramatic drop in gif appearances, such as in and , could reflect the choice of a web archivist to exclude single-pixel, transparent gifs from the crawl entirely. this decision may have been made for any number of reasons, including space constraints or a simple belief that single-pixel, transparent gifs were unnecessary to store in the archival record. it is also possible that the program stopped archiving a site or many sites which contained a large number of these single-pixel, transparent gifs. collateral content, or superfluous content the crawler ends up harvesting during a crawl, is unavoidable given the nature of the web. if most of the single-pixel gifs were crawled as collateral content, the exclusion of certain websites may have caused a reduction in their appearances.

approaching web archives as data corpora

it is imperative for libraries and archives to consider the end data utilized by researchers in the future when building digital collections in the present. part of this practice requires web archivists to create scope and content notes and keep records of crawl decisions as they are made and as crawls are performed. content processing done by the web archivists to understand their own collections as data can help with this. if an archivist saw these dramatic drops in appearances of single-pixel, transparent gifs as a result of crawling practices, the archivist could file the information and share it with a researcher attempting to understand the collection in the future.
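the idea of scope rules set before a crawl, together with a record of each decision, can be sketched in python as follows. this is an illustration only: it does not reproduce the configuration of any real crawler, and the excluded domain, the pattern list, and the function names are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Hypothetical scope rules, fixed before the crawl starts.
EXCLUDED_DOMAINS = {"ads.example.com"}
EXCLUDED_PATTERNS = [re.compile(r"login")]


def scope_decision(url):
    """Return (keep, reason) for a candidate URL under the rules above."""
    host = urlsplit(url).hostname or ""
    if host in EXCLUDED_DOMAINS:
        return False, f"excluded domain: {host}"
    for pattern in EXCLUDED_PATTERNS:
        if pattern.search(url):
            return False, f"excluded pattern: {pattern.pattern}"
    return True, "in scope"


def log_decisions(urls):
    """Build a simple decision log: one (url, keep, reason) row per URL."""
    return [(url, *scope_decision(url)) for url in urls]


decision_log = log_decisions([
    "http://example.org/index.html",
    "http://example.org/login?next=/home",
    "http://ads.example.com/pixel.gif",
])
for row in decision_log:
    print(row)
```

keeping the reason alongside each decision is one way such a log could later feed scope and content notes, so that a researcher can tell a collecting choice apart from a change on the live web.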
this study looks at transparent gifs appearing in two specific collections, olia lialina’s exhibit of transparent gifs from the geocities archive and the uk web archive. these two collections make up a small percentage of content in web archives throughout the world, web archives which have had varying crawl practices over time (milligan et al. ). we took a look at the history of seven transparent gifs in data resulting from harvesting done by the uk web archiving team. we have not looked at the complete history of all single-pixel gifs as they appeared on the live web over time (brügger ). with appropriate technical infrastructure, this same study could be completed on any organization’s web archives. since each one of these entities will have different crawl practices, multiple web archiving initiatives collecting the same websites is invaluable to researchers studying the web. as the crawl becomes more comprehensive, we can begin to see how the findings of case studies like these are influenced by crawling practices (crawl frequency, crawl depth, deduplication, etc.) and whether the findings are indicative of web usage trends throughout time. decoupling these concepts is essential for an understanding of the practice of web archiving and the history of the web, respectively, and can only be done through multiple archives.

when we approach each institution’s web archives as corpora it becomes increasingly clear that there is significant value in having a range of organizations engaged in web archiving. ideally, they are engaging in these practices with a range of tools. the trends in the appearance of these files raise all kinds of questions. for instance, what conclusions do we reach when we apply similar methods to different kinds of files? in other words, what do trends in identical copies of files themselves tell about the movement, dissemination, and popularity of practices and approaches?
there is informational content in the files, but the history of the appearance of a given file in a given place also has potential informational value.

characterizing files as key to future modes of access

knowing the specific urls at which files exist is also invaluable to the study of web history. the case of single-pixel gifs illustrates the significant value of modes of characterizing and identifying files using other methods. the ability to hash a file and use that digital fingerprint to see where else it, or files created through identical processes, exists in web archives is immensely powerful. who would have imagined there were millions of copies of one of these tiny files captured in the uk web archive in one particular year? when we discover that two urls held identical files at a particular date, we can start to track and trace the replication and movement of information. importantly, this is all derivative information about the content. even in a situation in which archives can’t offer global access to the content itself, non-consumptive hashes could very well be provided for this kind of work.

while hashes are exciting, it is important to remember that there are many other ways of characterizing similarity. an alternative approach to this kind of research could involve simply identifying all the ‘.gif’ files in a web archive that are particularly small and visually inspecting them to identify potential other candidates for different, unique single-pixel gifs. when one moves further into hash-based approaches to the study of files, it will be critical to remember that minor changes in a file are going to give it a new hash. with that noted, this only further points to the need to root the future of the study of web archives in the ability to compute against the files in these corpora.
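the size-based alternative can be sketched in python as a filter over resource metadata. the record fields (uri, mime, size), the sample records, and the 100-byte threshold are hypothetical choices made for illustration; nothing here confirms that a file is a single transparent pixel, so the output is only a shortlist for visual inspection.

```python
def small_gif_candidates(records, max_bytes=100):
    """Return URIs of gif resources no larger than max_bytes, smallest first."""
    hits = [
        r for r in records
        if r["size"] <= max_bytes
        and (r["mime"] == "image/gif" or r["uri"].lower().endswith(".gif"))
    ]
    return [r["uri"] for r in sorted(hits, key=lambda r: (r["size"], r["uri"]))]


# Hypothetical metadata records, e.g. as they might be read from derivative files.
SAMPLE_RECORDS = [
    {"uri": "http://example.org/spacer.gif", "mime": "image/gif", "size": 43},
    {"uri": "http://example.org/banner.gif", "mime": "image/gif", "size": 18000},
    {"uri": "http://example.org/dot.GIF", "mime": "application/octet-stream", "size": 49},
    {"uri": "http://example.org/tiny.png", "mime": "image/png", "size": 40},
]

candidates = small_gif_candidates(SAMPLE_RECORDS)
print(candidates)
```

because a filter like this works from metadata alone, it could in principle be run without opening the archived files themselves, with hashing and inspection reserved for the shortlist.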
implications for digital library infrastructure

access issues highlighted in the computational scholarship are a sobering reminder that ‘digital’ or ‘digitized’ does not necessarily mean immediately ready for computational scholarship. different kinds of questions require data to be prepared, processed, and made accessible in a number of ways. while digital material, rather than analog, is one step closer to becoming data, there is still work to be done to strategically arrange the content for a future of computational scholarship. furthermore, there are specific necessary affordances in technical architecture in order to enable researchers to compute against a corpus. as the library of congress pilot project showed, cracking open complex warc files to perform high-level analyses of the archive takes computing power that many researchers, and even institutions, do not always have at their disposal. the present study was, in large part, possible because a copy of the uk web archive is maintained and managed on a high-performance distributed computer system and because its archivist was willing to field a request to search across this web archive corpus to answer this particular question.

most web archives are not currently configured in a manner which enables researchers to compute against their content as a corpus. in order for this kind of research to become more of a reality, library institutions will first have to explore having compute-on-demand capabilities for their entire corpus of web archives and, more broadly, other large, born-digital and digitized collections. this has significant implications for the future of infrastructure. it largely requires either establishing local high-performance computing environments or a shift to approaching access systems that rely on cloud computing environments for access copies of content.
models that involve caching portions of content and working across multiple levels of tiered storage media simply will not be able to facilitate this kind of corpus-level querying of collections.

conclusion: researchers and web archivists embracing distant reading

the single-pixel, transparent gif seems to exemplify the essence of insignificance. the files are tiny and invisible. however, the history of these files reveals a great deal about the history of web design, tracking, and surveillance. sometimes they are spacer gifs, sometimes they are web bugs, and sometimes they are web beacons. while we have not offered conclusive answers to any of the questions about their history, we have explored single-pixel, transparent gifs as a case study to shed light on future methods of studying the history of the web through born-digital web archives collections.

the future of the study of the web and the future of collecting the web are intertwined. when we step back and see the patterns that emerge by looking at the hashes of a small set of files in the uk web archive, we immediately are prompted to raise two questions: what does this tell us about the history of the web? what does this tell us about the history of web archiving practices? researchers, now and in the future, will want to approach web archives collections by pivoting between distant reading and close reading. the pairing of distant and close reading as a method of studying the archived web is the only way of conceptualizing the sheer scale of the archived web and performing meaningful research.

however, these methods will also help iteratively to build better, more comprehensive, and more curated web archives throughout the world. the scale of a web archive is also a challenge for the archivists charged with curating and maintaining it.
yet, the same tools used by researchers can be used by web archivists and practitioners in the field to understand their archives or, sometimes more importantly, what is missing from their archives. as practitioners come to understand their archives in greater detail, this knowledge will inform future preservation practices and will provide immediate assistance in provenance for researchers utilizing the data. since the scale of web archives does not lend itself to traditional page-through reading and distant reading will become a necessity of close reading, the burden is on digital librarians to rethink the nature and structure of digital libraries, digital content, and web archives infrastructure. this could mean putting more resources into development of tools outside of web page rendering mechanisms, such as streamlined creation and delivery of data sets or web archives content derivatives. overall, detailed collection notes, especially crawling, scoping, and other specific decisions made over time, are crucial to improving the system and furthering research.

international journal of digital humanities ( ) : –
library publishing directory
edited by sarah k. lippincott

peachtree street, suite, atlanta, ga
www.librarypublishing.org
sarah@educopia.org

cc by, by library publishing coalition

contents

foreword
introduction
library publishing coalition subcommittees
reading an entry

libraries in the united states and canada: arizona state university; auburn university; boston college; brigham young university; brock university; cal poly, san luis obispo; california institute of technology; california state university san marcos; carnegie mellon university; claremont university consortium; colby college; college at brockport, suny; college of wooster; columbia university; connecticut college; cornell university; dartmouth college; duke university; emory university; florida atlantic university; florida state university; georgetown university; georgia state university; grand valley state university; gustavus adolphus college; hamilton college; illinois wesleyan university; indiana university; johns hopkins university; kansas state university; loyola university chicago; macalester college; mcgill university; miami university; mount saint vincent university; northeastern university; northwestern university; oberlin college; ohio state university; oregon state university; pacific university; pennsylvania state university; pepperdine university; portland state university; purdue university; rochester institute of technology; rutgers, the state university of new jersey; simon fraser university; state university of new york at buffalo; state university of new york at geneseo; syracuse university; temple university; texas tech university; thomas jefferson university; trinity university; tulane university; université de montréal; university of alberta; university of arizona; university of british columbia; university of calgary; university of california, berkeley; university of california system; university of central florida; university of colorado anschutz medical campus; university of colorado denver; university of florida; university of georgia; university of guelph; university of hawaii at manoa; university of idaho; university of illinois at chicago; university of iowa; university of kansas; university of kentucky; university of maryland college park; university of massachusetts amherst; university of massachusetts medical school; university of michigan; university of minnesota; university of nebraska-lincoln; university of north carolina at chapel hill; university of north carolina at charlotte; university of north carolina at greensboro; university of north texas; university of oregon; university of pittsburgh; university of san diego; university of south florida; university of tennessee; university of texas at san antonio; university of toronto; university of utah; university of victoria; university of washington; university of waterloo; university of windsor; university of wisconsin–madison; utah state university; valparaiso university; vanderbilt university; villanova university; virginia commonwealth university; virginia tech; wake forest university; washington university in st. louis; wayne state university; western university

libraries outside the united states and canada: australian national university; edith cowan university; humboldt-universität zu berlin; monash university; swinburne university of technology; university of hong kong; university of south australia

library publishing coalition strategic affiliates
platforms, tools, and service providers
personnel index

foreword

martin halbert (university of north texas), james mullins (purdue university), and tyler walters (virginia tech)

in january , we officially launched the library publishing coalition (lpc) project, a collaborative initiative that now involves academic libraries committed to advancing the emerging field of library publishing. as this new service area matures and expands, we have seen a clear need for knowledge sharing, collaboration, and development of common practices. the lpc is helping this field move forward in a number of key ways, but we are particularly proud to publish the first edition of the library publishing directory, a guide to the publishing activities of academic libraries. in documenting the breadth and depth of activities in this field, this resource aims to articulate the unique value of library publishing; to establish it as a significant and growing community of practice; and to raise its visibility within a number of stakeholder communities, including administrators, funding agencies, other scholarly publishers, librarians, and content creators. collecting this rich set of data from libraries across the united states and canada allows us to identify themes, challenges, and trends; make predictions about future directions; and position the library publishing coalition to better meet the needs of this community.
the directory also advances one of the central goals of the library publishing coalition, to facilitate and encourage collaboration among libraries as well as among libraries and publishers that share their values, especially university presses and learned societies. we hope that libraries will use the directory to learn about their peers, find mutually beneficial ways to work together, and ultimately improve their practices and enhance the value they provide to their campuses. we hope that presses will see opportunities to initiate new partnerships or expand existing ones. the lpc is a community-driven project that relies on the hard work and expertise of representatives from our participating institutions. we could not have produced the directory without the support of the directory subcommittee. we would like to thank marilyn billings (university of massachusetts-amherst), stephanie davis-kahl (illinois wesleyan university), adrian ho (university of kentucky), holly mercer (university of tennessee), elizabeth smart (brigham young university), shan sutton (oregon state university), allegra swift (claremont university consortium), beth turtle (kansas state university), and charles watkinson (purdue university) for their invaluable contributions. we also are grateful for the generous support of purdue university libraries’ scholarly publishing services unit, which donated resources, staff time, and the expertise of alexandra hoff and managing editor katherine purple; lightning source, who donated print-on-demand services; and the charlesworth group for conversion to ebook formats. finally, we would like to thank the libraries that took the time to help us to better understand, promote, and assert the significance of library publishing initiatives by providing information for this directory. the libraries listed in this first edition demonstrate the tremendous interest and energy in this field.
we look forward to continuing to watch and document library publishing services as they evolve and progress in the coming years.

introduction

sarah k. lippincott, katherine skinner, and charles watkinson

we are so pleased to share with our readership this first library publishing directory, produced by the library publishing coalition (lpc) in our organization’s inaugural year of work. this directory intends to make visible the innovation, support, and services offered today by a broad range of academic libraries in the area of scholarly communications. herein, we begin to document the strategic investments university libraries around the world are making in the area of academic publishing. once believed to be a one-off activity subsidized by a small number of libraries, “library publishing” today is evolving into a dynamic subfield in the academic publishing ecosystem.

why publish a directory?

for more than two decades, faculty, researchers, and students have come to their college and university libraries to gain technical support and staffing for early experiments in digital scholarship. from hosting ejournals and electronic theses and dissertations (etds) to collaborating with teams of researchers to construct multimedia experiences, these libraries have been willing and able partners in this academic mission of creating and disseminating scholarship. by , these library-based activities began to formalize, as documented in two key reports: ithaka s&r’s university publishing in a digital age and arl’s research library publishing services: new options for university publishing. subsequent studies reinforced the importance of these emerging library-based publishing endeavors. as demonstrated by the seminal library publishing services: strategies for success report, publishing services now are thriving across the whole range of academic libraries today, from small liberal arts colleges to premier research institutions.
this growth of library publishing activities provided the impetus and rationale for creating the lpc to help advance this subfield for u.s. and canadian academic libraries. hosted by the educopia institute, and driven by academic libraries, the lpc project ( – ) is now founding this new organization. its mission is to promote the development of innovative, sustainable publishing services in academic and research libraries to support scholars as they create, advance, and disseminate knowledge. as a key part of this work, the lpc seeks to document practices and services in the field, and to foster strategic alliances and connections both across and between libraries and other academic publishers. the lpc created this directory to begin to answer the many questions the project team had about the publishing activities currently underway in libraries. how many libraries define their scholarly communications activities as “publishing”? how long have they been doing this work? with whom do they partner? what types of publications are they producing? are libraries offering specific products and/or services to their campuses? what percentage of their publications are peer reviewed? how many staff members are working on this activity, and how are they funding their activities? are there identifiable models and trends in this subfield of publishing today? with these and other questions in mind, the lpc directory subcommittee built and disseminated an internet-based survey in spring , targeting north american listservs for academic libraries. we focused on north america for scoping reasons: we knew we could not hope to chronicle global work in full, and so began with this smaller-but-significant subset of activity. we intentionally structured this directory to encompass institutions beyond the lpc itself, inviting any institution engaged in library publishing to participate. we received more than responses to this survey.
in the following pages, we include directory listings for all institutions that responded, grouping the north american institutions first (our primary target) and programs outside the u.s. and canada next. using the survey data, the lpc directory subcommittee assembled the directory entries, shared each one with its institutional representative for editing and approval, and then published it herein. we greatly appreciate all those who gave their time and energy to help us document the efforts of their individual libraries. notably, the only institutions listed here are those that responded to our survey. undoubtedly, many important programs have been missed in this first edition. we hope that those we have missed will contact lpc (sarah@educopia.org) so we can ensure these institutions are included in future editions. the library publishing directory contributes directly to the lpc’s goal of encouraging collaboration by allowing library publishing staff, who have traditionally had relatively little contact with each other, to identify colleagues producing scholarly work in similar disciplines or using the same technology platform. the directory also is intended to open the way to collaboration with other publishers, especially mission-driven non-profit university presses and learned societies, by introducing and articulating the unique and complementary approach that libraries take to the publishing function. finally, it is hoped that the directory can help scholarly authors to become more aware of the opportunities that may exist on their own campuses or in their disciplines to experiment with new publication formats or business models. we highlight below some of the exciting library publishing trends and models we see emerging in this first directory of activity. together the answers provide a rich picture of what types of product libraries are creating and what technological, financial, and human resources they are using. 
library publishing today

individually, the directory entries reveal much about local practices, including the mission driving an institution’s activities, the funding models and staffing supporting its work, the relationship between publication and preservation, and the type and quantity of publications produced. collectively, these entries say far, far more. last year, the libraries profiled in this directory published faculty-driven journals, student-driven journals, monographs, at least , conference papers and proceedings, and nearly , each of etds and technical/research reports. these publications covered an array of disciplines, including law, agriculture, history, education, computer science, and many, many others. thirty-three libraries report disciplinary specialties in the social sciences and area, ethnic, cultural, and gender studies (a broad classification that includes a range of interdisciplinary specialties). education ( libraries), health and clinical sciences ( ), and the general humanities ( ) are also particularly well-represented areas. faculty-driven journals were the most common publication reported by these libraries. over % of the libraries in this directory published at least one in and over half ( %) published at least one student-driven journal. thirty-six percent produced at least one monograph, and more than three-quarters published etds. more than half reported publishing data, audio, and video, in addition to text and images. currently, there is no single, dominant model for the organization of publishing services. in many institutions, services are distributed across multiple library units or across campus. the lead unit varies across libraries (e.g., scholarly communications, technical services, and even special collections). library publishing programs featured in this directory range from small, experimental endeavors to large, more mature operations with several dedicated staff members.
libraries reported between . and eight full-time equivalent library staff, and many also reported employing graduate ( %) and undergraduate ( %) students. across these libraries, the most prominent services are building, implementing, maintaining, and supporting publishing platforms for authors. in this work many report using full-service digital platforms, including public knowledge project’s ojs/ocs/omp suite ( %), bepress’s digital commons platform ( %), and dspace ( %), the top three for respondents. however, many also report developing software locally ( %) and/or using a content management system like wordpress ( %) for dissemination and delivery. more than three-quarters of respondents said that they provide a broader range of services, including metadata ( %), analytics ( %), outreach ( %), doi assignment ( %), audio/video streaming ( %), and issn registration ( %). a substantial number of these libraries also provide support for editorial and production processes. these include peer review management ( %), copyediting ( %), and print-on-demand ( %). finally, some libraries support business model development ( %), budget preparation ( %), and contract and license preparation ( %). other services offered, such as author advisory on copyright ( %), build upon librarians’ strengths as educators and advocates. a hallmark of library publishing, as is repeatedly highlighted in individual directory entries, has been the building of partnerships with content creators and other publishers on and off campus. faculty, students, and other authors typically provide the editorial leadership for library publications. over % of libraries featured herein report that they have relationships with campus departments or programs; % partner with individual faculty; and over half work with graduate and undergraduate students. many of the libraries in this directory report that they work with or have administrative ties to university presses.
off-campus partners include scholarly societies, non-profit organizations, museums, library networks and consortia, and individual faculty at other institutions. despite the different forms library publishing activities have assumed to date, the directory demonstrates that these programs share a growing commonality of philosophy and approach combining traditional library values and skills (such as a concern with long-term preservation, expertise in the organization of information, and commitment to widening access) with lightweight digital workflows to create a distinctive “field” of publishing activity. the libraries in this directory overwhelmingly prefer open access publication ( % focus mostly or completely on open access). and although % of libraries rely in part or completely on their library’s operating budget to support publishing services, notably, % do not. among those libraries that are subsidizing these activities, the operating budget is contributing an average of % of the publishing budget. looking toward the future, many of the libraries in this directory report that they plan to increase the numbers and types of publications they produce, support an expanded suite of services (particularly in areas like data management), identify new partners within and beyond campus, and make improvements to software and workflows. the future of library publishing as libraries undertake the improvement and expansion of services, they will continue to confront a difficult and rapidly changing landscape. building capacity, sustaining services, and securing funding will require concerted efforts to demonstrate value and improve business models. raising credibility and visibility on campus and within the broader scholarly communications community will also require individual and collective efforts. 
libraries will need to convince campus administrators, university presses, librarians, commercial publishers, and content creators that library publishing is an important, strategic, purposeful service area that adds value to the publishing ecosystem. perhaps most important, libraries will need to cultivate and strengthen their relationships with other scholarly publishers (including university presses, scholarly societies, and commercial publishers) to build our collective capacity, extend the reach of scholarship, and ensure that the scholarly communication apparatus continues to evolve in pace with the research and knowledge produced across academia. this library publishing directory tells a compelling story, one that we believe needs dissemination in its own right. we look forward to seeing these networks continue to build upon the work they have done. we hope the directory will help existing and prospective library publishers identify new partners and learn from the experiences of their colleagues. and, of course, we hope to see the nexus of activity represented here continue to expand in the years ahead.

library publishing coalition subcommittees

the following subcommittee members have donated their time and expertise to advancing the library publishing coalition’s mission and producing its most significant resources.

program subcommittee

the program subcommittee bears primary responsibility for planning and implementing the library publishing forum.

sarah beaubien (grand valley state university); dan lee (university of arizona); mark newton (columbia university); melanie schlosser (ohio state university); marcia stockham (kansas state university); allegra swift (claremont university consortium); evviva weinraub (oregon state university)

directory subcommittee

the directory subcommittee provides support for the design and creation of the library publishing directory.
marilyn billings (university of massachusetts-amherst); stephanie davis-kahl (illinois wesleyan university); adrian ho (university of kentucky); holly mercer (university of tennessee); elizabeth smart (brigham young university); shan sutton (oregon state university); allegra swift (claremont university consortium); beth turtle (kansas state university); charles watkinson (purdue university)

research subcommittee

the research subcommittee coordinates library publishing coalition roundtable discussions and manages the organization’s research agenda.

donna beck (carnegie mellon university); marilyn billings (university of massachusetts-amherst); brad eden (valparaiso university); isaac gilman (pacific university); dan lee (university of arizona); gail mcmillan (virginia tech); catherine mitchell (california digital library); jane morris (boston college); melanie schlosser (ohio state university); mary beth thompson (university of kentucky)

reading an entry: some “health warnings”

the field of library publishing is rapidly evolving, and its boundaries have not yet been clearly defined. we have attempted to produce a directory that is readable and cohesive and that allows for cross-institutional comparison. in some cases, this means we have used terminology and categories that do not fully reflect the complex and experimental nature of activities that libraries are undertaking. we hope that through this directory, and through input from the library publishing community, we will start to establish common language as this field matures. in some cases, as described below, questions in the questionnaire on which the entries are based were not specific enough and respondents reported numbers in different ways. revised questions will be sent in future years. while “staff in support of publishing activities” are consistently reported for salaried employees, the way in which respondents reported student percentages may vary.
for example, undergraduate employees usually work a maximum of hours per week rather than the expected of a full-time staff member. therefore, it is open to question whether a . undergraduate works or hours per week. under “types of publication,” we have noticed that conference proceedings have been reported in various ways. in some cases, respondents report the number of series, while in other cases, they report the total number of faculty papers. some high estimates of the total number of publications raise the question of whether the instruction to record only activity in the last full calendar year, , was followed. readers will notice the presence of “seals” next to the title of some entries. these acknowledge the support of the institutions that fund the library publishing coalition by their generous two-year pledges. “contributing institutions” have pledged to support the foundation of the lpc with an annual contribution of $ , . “founding institutions” receive the highest honor, having pledged $ , a year to the project. to recognize their exceptional contributions, we include profiles of specific publications that founding institutions have nominated. these also give a practical sense of the wide range of types of publications produced.

[seal: founding institution, library publishing coalition]
[seal: contributing institution, library publishing coalition]

libraries in the united states and canada

arizona state university
hayden library

primary unit: informatics and cyberinfrastructure services
primary contact: mimmo bonanni, digital projects manager, digitalrepository@asu.edu
website: repository.asu.edu
social media: @asulibraries; facebook.com/asulibraries

program overview

mission/description: arizona state university libraries created the asu digital repository to support asu’s commitment to excellence, access, and impact.
the asu digital repository advances the new american university by providing a central place to collect, preserve, and discover the creative and scholarly output from asu faculty, research partners, staff, and students. providing free, online access to asu scholarship benefits our local community, encourages transdisciplinary research, and engages scholars and researchers worldwide, increasing impact globally through the rapid dissemination of knowledge. the asu digital repository improves the visibility of content by exposing it to commercial search engines such as google, the asu libraries’ one search, as well as the asu digital repository search portal. the asu digital repository helps meet public access policies and archival requirements specified by many federal grants.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); technical/research reports ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations
disciplinary specialties: state and local documents (government publications); music; dance
top publications: journal of surrealism (journal)
campus partners: campus departments or programs; individual faculty
publishing platform(s): contentdm; locally developed software
digital preservation strategy: digital preservation services under discussion
additional services: outreach; metadata; doi assignment/allocation of identifiers; author copyright advisory; digitization; audio/video streaming
plans for expansion/future directions: improving marketing and outreach (involving subject librarians and e-research staff), expanding data management
support, and exploring the addition of learning objects.

auburn university
auburn university libraries
primary contact: aaron trehub
assistant dean for technology and technical services
- -
trehuaj@auburn.edu

program overview
mission/description: to support the university’s outreach mission by making original research and scholarship by auburn university faculty and students more accessible to alabama residents and the world at large.
year publishing activities began:
organization: services are distributed across campus
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( ); charitable contributions/friends of the library organizations ( )

publishing activities
types of publications: etds ( )
media formats: text; images
top publications: etds
campus partners: campus departments or programs; individual faculty
publishing platform(s): dspace
digital preservation strategy: digital preservation services under discussion. auburn university libraries is a founding member of two private lockss networks (metaarchive cooperative; adpnet), but does not currently use these distributed digital preservation networks to preserve etds or materials in the ir.
additional services: graphic design (print or web); outreach; training; cataloging; metadata; open url support; author copyright advisory; digitization; audio/video streaming
plans for expansion/future directions: currently building and populating an institutional repository.

boston college
boston college university libraries
primary unit: scholarly communications
primary contact: jane morris
head of scholarly communications and research
- -
jane.morris@bc.edu
website: www.bc.edu/libraries/collections/escholarshiphome

program overview
mission/description: our goal is to showcase and preserve boston college’s scholarly output and to maximize research visibility and influence.
escholarship@bc encourages community contributors to archive and disseminate scholarly work, peer-reviewed publications, books, chapters, conference proceedings, and small datasets in an online open access environment.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); grey literature from centers/institutes; datasets
media formats: text; video; data
disciplinary specialties: theology; education; the middle east; libraries
top publications: catholic education (journal); studies in christian-jewish relations (journal); information technology and libraries (journal); levantine review (journal); proceedings of the catholic theological society of america (conference proceedings)
[seal: contributing institution, library publishing coalition]
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: catholic theological society of america; ala library and information technology association; council of centers on christian jewish relations; seminar on jesuit spirituality
publishing platform(s): ojs/ocs/omp; digitool
digital preservation strategy: hathitrust; lockss; metaarchive
additional services: marketing; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; dataset management; contract/license preparation; author copyright advisory; digitization; audio/video streaming
plans for expansion/future
directions: hosting more data and developing more open access journals.

brigham young university
harold b. lee library
primary unit: scholarly communication unit
scholarsarchive@byu.edu
primary contact: elizabeth smart
scholarly communication librarian
- -
elizabeth_smart@byu.edu
website: sites.lib.byu.edu/scholarsarchive

program overview
mission/description: the harold b. lee library’s primary publishing resources include an institutional repository and digital publishing services for faculty- and student-edited journals. combined, these resources are called scholarsarchive. scholarsarchive is designed to make original scholarly and creative work—such as research, publications, journals, and data—freely and persistently available. the library’s publishing efforts are targeted at supporting broader academic and public discovery and use of university scholarship. scholarsarchive may also house items of historic interest to the university. the library supports content partners with software support, digitizing, metadata creation, journal management, and free hosting services.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); undergraduate students ( .
)
funding sources (%): library operating budget ( ); charge backs ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); student conference papers and proceedings ( ); databases ( ); etds ( )
media formats: text; images
[seal: founding institution, library publishing coalition]
disciplinary specialties: religion; natural history of the american west; children’s literature
top publications: western north american naturalist (journal); byu studies (journal); children’s book and play review (journal); pacific studies (journal); tesl reporter (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: international society for the comparative study of civilizations (iscsc); association of mormon counselors and psychotherapists (amcap); council on east asian libraries (ceal)
publishing platform(s): contentdm; ojs/ocs/omp
digital preservation strategy: rosetta (moving from beta to full implementation in )
additional services: analytics; cataloging; metadata; peer review management; digitization; hosting of supplemental content
plans for expansion/future directions: areas of future exploration and possible expansion include monograph publishing, print on demand, doi support, hosting streaming media, and data management.

highlighted publication
the western north american naturalist (formerly great basin naturalist) has published peer-reviewed experimental and descriptive research pertaining to the biological natural history of western north america for more than years.
ojs.lib.byu.edu/spc/index.php/wnan

brock university
james a.
gibson library
primary contact: elizabeth yates
liaison / scholarly communication librarian
- - ext.
eyates@brocku.ca
website: www.brocku.ca/library/about-us-lib/openaccess

program overview
mission/description: the library’s publishing initiatives provide technology, expertise, and promotional support for researchers, students, and staff at brock university seeking to make their research universally accessible via open access. the library currently publishes/hosts five scholarly oa journals in partnership with scholars portal and the ontario council of university libraries. we use open journal systems (ojs) software. the library manages an open access publishing fund to help brock authors cover the costs of publishing with fully oa journals or monograph publishers. a minimum of four awards of up to $ , are granted; total funding is $ , . the library also hosts and disseminates brock scholarship through our digital repository, which collects graduate theses, major research projects, and subject- or department-based research collections and materials from our special collections and archives. we also raise awareness of open access through open access week activities, information resources, and other venues.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( )

publishing activities
types of publications: faculty-driven journals ( ); technical/research reports ( ); etds ( )
media formats: text; images
disciplinary specialties: humanities; french language; arts education; teaching and learning
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty
other partners: ontario council of university libraries/scholars portal
publishing platform(s): dspace; ojs/ocs/omp
digital preservation strategy: scholars portal
additional services: copy-editing; training; analytics; notification of a&i sources; issn registration; digitization
plans for expansion/future directions: launching a journal showcasing undergraduate student research in the faculty of applied health sciences; launching an open monograph publishing system in partnership with scholars portal and the ontario council of university libraries.

cal poly, san luis obispo
robert e. kennedy library
primary unit: digital scholarship services
primary contact: marisa ramirez
digital scholarship services librarian
- -
mramir @calpoly.edu
website: digitalcommons.calpoly.edu/; lib.calpoly.edu/scholarship

program overview
mission/description: the robert e. kennedy library provides digital services to assist the campus community with the creation, publication, sharing, and preservation of research, scholarship, and campus history.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); undergraduate students ( )
funding sources (%): library operating budget ( ); endowment income ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ); graduate internship reports
media formats: text; images; audio; video; data; multimedia/interactive content
disciplinary specialties: history; philosophy; sustainability
top publications: senior undergraduate projects; master’s theses; between the species (journal); california climate action planning (conference proceedings)
percentage of journals that are peer reviewed:
[seal: contributing institution, library publishing coalition]
campus partners: individual faculty; undergraduate students
publishing platform(s): bepress (digital commons)
digital preservation strategy: digital preservation services under discussion. we are in the process of joining lockss and metaarchive.
additional services: typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; notification of a&i sources; issn registration; peer review management; business model development; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: hiring an endowed digital scholarship services student assistant through the digital scholarship services student assistantship program, which provides paid, experiential learning opportunities for cal poly students who are interested in the various facets of the changing digital publishing landscape.

california institute of technology
caltech library
primary unit: metadata services group
primary contact: kathy johnson
repository librarian
- -
kjohnson@library.caltech.edu

program overview
mission/description: caltechthesis is part of coda, the caltech collection of open digital archives, managed by caltech library services. the mission of coda is to collect, manage, preserve, and provide global access over time to the scholarly output of the institute and the publications of campus units. caltechthesis contains phd, engineer’s, master’s, and bachelor’s/senior theses authored by caltech students. most items in caltechthesis are textual dissertations, but some may also contain software programs, maps, videos, etc. the etd is the version of record for the institute, and deposit of doctoral dissertations is required for graduation.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( )

publishing activities
media formats: text; images; audio; video; data; simple websites
disciplinary specialties: biology; chemistry and chemical engineering; engineering and applied science; geology and planetary science; physics; mathematics; astronomy
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): eprints
digital preservation strategy: digital preservation services under discussion
additional services: marketing; outreach; training; analytics; cataloging; metadata; author copyright advisory; digitization
plans for expansion/future directions: undergoing gradual move of platforms to islandora/fedora, including preservation activity.

california state university san marcos
kellogg library
primary contact: carmen mitchell
institutional repository librarian
- -
cmitchell@csusm.edu
website: csusm-dspace.calstate.edu; scholarworks.csusm.edu
social media: @csusm_library

program overview
mission/description: the purpose of the california state university san marcos institutional repository (scholarworks) is to collect, organize, preserve, and disseminate csusm research, creative works, and other academic content in a web-based environment.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( .
)
funding sources (%): library operating budget ( ); other ( )

publishing activities
types of publications: etds ( ); library exhibits
media formats: text; images; audio; video
disciplinary specialties: student work/research; library exhibits
top publications: “going paperless: student and parent perceptions of ipads in the classroom” (thesis); “lateral violence in nursing” (thesis); “nurses’ technique and site selection in subcutaneous insulin injection” (thesis); “individual differences in working memory and levels of processing” (thesis); “wounded hearts: a journey through grief” (thesis)
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): dspace
digital preservation strategy: in-house; digital preservation services under discussion
additional services: marketing; outreach; training; analytics; cataloging; metadata; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: planning to include faculty publications and datasets within the next year, working with other csu campuses on an undergraduate journal, and currently working to publish digital surrogates of items from the university archives.

carnegie mellon university
carnegie mellon university libraries
primary unit: archives and digital library initiatives
primary contact: gabrielle michalek
head of archives and digital library initiatives
- -
gabrielle@cmu.edu
website: repository.cmu.edu

program overview
mission/description: carnegie mellon university libraries’ publishing program aims to promote open access to scholarly resources, to support online journals and conference management—from article submission through peer review to open access and long-term preservation, and to publish grey literature, including theses, dissertations, and technical reports.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text
disciplinary specialties: social and behavioral sciences; engineering; physical and life sciences; arts and humanities; security
top publications: journal of privacy and confidentiality (journal); dietrich college honors theses
percentage of journals that are peer reviewed:
[seal: contributing institution, library publishing coalition]
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): bepress (digital commons)
digital preservation strategy: lockss; metaarchive
additional services: marketing; outreach; training; analytics; cataloging; metadata; peer review management; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: hosting more open access journals; supporting conference
proceedings; and publishing more theses, dissertations, and technical reports.

claremont university consortium
claremont colleges library
primary unit: center for digital initiatives
scholarship@cuc.claremont.edu
primary contact: allegra swift
digital initiatives librarian
- -
allegra_swift@cuc.claremont.edu
website: scholarship.claremont.edu; ccdl.libraries.claremont.edu
social media: @ccdiglib; facebook.com/honnoldlibrary; facebook.com/claremontcollegesdigitallibrary; flickr.com/photos/claremontcollegesdigitallibrary

program overview
mission/description: the center for digital initiatives facilitates the dissemination of knowledge by providing publishing platforms, consulting, and technical services to enable the creation and distribution of teaching and research resources to the scholarly community.
year publishing activities began: (first journal); (in earnest)
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); digital collections; lectures and symposia
media formats: text; images; video; multimedia/interactive content
disciplinary specialties: arts and humanities; social and behavioral sciences; physical and mathematical sciences; life sciences; business
top publications: cmc senior theses; journal of humanistic mathematics (journal); steam (journal); scripps senior theses; lux (journal); performance practice review (journal)
[seal: contributing institution, library publishing coalition]
percentage of journals that are peer reviewed:
campus partners: campus departments or
programs; individual faculty; graduate students; undergraduate students
other partners: rancho santa ana botanical gardens
publishing platform(s): bepress (digital commons); contentdm
digital preservation strategy: amazon glacier; amazon s ; looking into clockss, lockss, and some others
additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: possible expansion into areas of education and alternative/non-traditional publishing.

colby college
colby college libraries
primary unit: digital and special collections
primary contact: marty kelly
assistant director for digital collections
- -
mfkelly@colby.edu

program overview
mission/description: the publishing mission of colby college libraries digital and special collections is to showcase the scholarly work of colby’s faculty and students, make the college’s unique collections more broadly available, and contribute to open intellectual discourse.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); undergraduate students ( .
)
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); newsletters ( ); undergraduate capstone/honors theses ( ); alumni magazine
media formats: text; images; audio; video
disciplinary specialties: humanities; environmental science; jewish studies; economics
top publications: colby quarterly (journal); colby honors theses and senior scholars papers; colby undergraduate research symposium (conference proceedings); atlas of maine (journal); colby magazine (magazine)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
[seal: founding institution, library publishing coalition]
publishing platform(s): bepress (digital commons); wordpress
digital preservation strategy: digital preservation services under discussion
additional services: outreach; training; analytics; cataloging; metadata; open url support; peer review management; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: planning to support two new major publishing initiatives with colby’s center for the arts and humanities this coming academic year: the relaunch of the colby quarterly ( – ) and the development of a new undergraduate research journal.

highlighted publication
the colby environmental assessment team collection of student-produced watershed studies on maine’s belgrade lakes is widely used by local lake associations, town officials, and the department of environmental protection.
digitalcommons.colby.edu/lakesproject

college at brockport, suny
drake memorial library
primary unit: library technology
primary contact: kim myers
digital repository specialist
- -
kmyers@brockport.edu

program overview
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images; audio; video; data
disciplinary specialties: kinesiology; sports science; physical education; education; counselor education; philosophy; english
top publications: counselor education master’s theses; education master’s theses; technical reports from the water research community; dissenting voices (journal); journal of literary onomastic studies (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students
publishing platform(s): bepress (digital commons)
digital preservation strategy: lockss
additional services: copy-editing; marketing; training; cataloging; metadata; issn registration; author copyright advisory; digitization; hosting of supplemental content
plans for expansion/future directions: expanding in the etd arena; working with the graduate school to automate the publication of our master’s theses as they are produced.

college of wooster
college of wooster libraries
primary unit: digital scholarship and services
primary contact: stephen flynn
emerging technologies librarian
- -
sflynn@wooster.edu

program overview
mission/description: our goal is to digitally preserve and promote the original scholarship of our faculty and students.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( )
funding sources (%): library operating budget ( )

publishing activities
types of publications: undergraduate capstone/honors theses ( )
media formats: text; images; audio; video; data
campus partners: campus departments or programs; individual faculty; undergraduate students
publishing platform(s): bepress (digital commons); dspace; wordpress
digital preservation strategy: in-house; digital preservation services under discussion
additional services: metadata; digitization; hosting of supplemental content
plans for expansion/future directions: migrating from dspace to bepress, which may enable us to promote the publishing of new undergraduate journals.

columbia university
columbia university libraries/information services
primary unit: center for digital research and scholarship
info@cdrs.columbia.edu
primary contact: mark newton
production manager
- -
mnewton@columbia.edu
website: cdrs.columbia.edu
social media: @columbiacdrs; @researchatcu; @dataatcu; @scholarlycomm; facebook.com/pages/center-for-digital-research-and-scholarship-columbia-university/

program overview
mission/description: the center for digital research and scholarship (cdrs) serves the digital research and scholarly communications needs of the faculty, students, and staff of columbia university and its affiliates.
our mission is to increase the utility and impact of research produced at columbia by creating, adapting, implementing, supporting, and sustaining innovative digital tools and publishing platforms for content delivery, discovery, analysis, data curation, and preservation. in pursuit of that mission, we also engage in extensive outreach, education, and advocacy to ensure that the scholarly work produced at columbia university has a global reach and accelerates the pace of research across disciplines.
year publishing activities began: (columbia university libraries); (cdrs)
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ); undergraduate students ( . )
funding sources (%): library operating budget ( ); non-library campus budget ( ); grants ( ); licensing ( ); charge backs ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); conference papers and proceedings ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( )
[seal: founding institution, library publishing coalition]
media formats: text; images; audio; video; data; software
disciplinary specialties: law; humanities; public health; global studies; interdisciplinary studies
top publications: tremor and other hyperkinetic movements (journal); dangerous citizens (website); academic commons (digital research repository); women film pioneers project (website); columbia business law review (journal)
percentage of journals that are peer reviewed:
campus partners: columbia university press; campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: modern languages association; fordham university press; new york
university; ecological society of america. informal partners include california digital library; cornell university; purdue university.
publishing platform(s): fedora; ojs/ocs/omp; wordpress; locally developed software; drupal
digital preservation strategy: aptrust; archive-it; duracloud/dspace; dpn; in-house; digital preservation services under discussion. content is also backed up to nysernet, to two on-site locations, and off-site to tape with ironmountain.
additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming; preservation; repository deposit to pmc; seo; application development; content and platform migration; workshops and consultation; social media and journal publishing best practices workshops; informal scholarly communication events; open access week events; campus oa fund management; collaboration spaces
plans for expansion/future directions: planning to continue integration of the publishing program with the digital research repository, academic commons (academiccommons.columbia.edu), as well as to pursue new publishing partnerships with scholarly societies through members affiliated with the university. further plans include expansion into unique identifier support (such as with orcid and through ezid) as well as work in support of federal and funder mandates for access to funded research.

highlighted publication
academic commons is a digital publication platform that brings global visibility to the research and scholarship of columbia university and its affiliates.
academiccommons.columbia.edu
connecticut college
charles e. shain library
primary unit: special collections
primary contact: benjamin panciera; director of special collections; - - ; bpancier@conncoll.edu
website: digitalcommons.conncoll.edu
program overview
mission/description: connecticut college seeks to make the products of student and faculty research and campus resources as widely available as possible through its institutional repository. mandatory electronic submission of student honors theses began in . the faculty overwhelmingly passed an open access policy in , and the library has supported this by retrospectively making faculty research available through the institutional repository.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( )
media formats: text; audio
campus partners: campus departments or programs; individual faculty; undergraduate students
publishing platform(s): bepress (digital commons)
digital preservation strategy: digital preservation services under discussion; no digital preservation services provided
additional services: cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: seeking to optimize faculty participation, maximize the amount of available research in the institutional repository, and inform faculty of the possibility of using the repository to make available unpublished material such as conference papers and datasets.
cornell university
cornell university library
primary unit: digital scholarship and preservation services
primary contact: david ruddy; director, scholarly communications services; - - ; dwr @cornell.edu
contributing institution, library publishing coalition
program overview
mission/description: separate operations have their own mission statements (project euclid, arxiv, ecommons, cip). in general, we wish to promote sustainable models of scholarly communications with an emphasis on access and affordability.
year publishing activities began:
organization: services are primarily distributed across library units. a few projects involve the cornell university press.
staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . )
funding sources (%): library operating budget ( ); sales revenue ( ); other ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( ); case studies
media formats: text; audio; video; data
disciplinary specialties: mathematics; physics; statistics; computer science
percentage of journals that are peer reviewed:
campus partners: cornell university press; campus departments or programs; individual faculty; graduate students
other partners: duke university press; scholarly societies; scholars worldwide
publishing platform(s): dpubs; dspace; locally developed software
digital preservation strategy: in-house
additional services: graphic design (print or web); metadata; doi assignment/allocation of identifiers; open url support; budget preparation; digitization; hosting of supplemental content; audio/video streaming
additional information: “publishing” activities at cornell are complex and include at least four fairly distinct operations: project euclid, arxiv.org, ecommons (an
institutional repository), and cornell initiatives in publishing (cornell-related journals and books). each of these operations arguably fits the provided criteria for “library publishing” activities.

dartmouth college
dartmouth college library
primary unit: digital library program; library.dartmouth.edu/mail/send.php?to=askalib
primary contact: elizabeth kirk; associate librarian for information resources; - - ; elizabeth.e.kirk@dartmouth.edu
website: www.dartmouth.edu/~library/digital
program overview
mission/description: the dartmouth college library’s digital publishing program supports faculty publication of original scholarly content in a digital environment. our digital publications include journals, monographs, and scholarly editions. all content is available online without charge.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( ); endowment income ( ); other ( )
publishing activities
types of publications: faculty-driven journals ( ); monographs ( ); etds ( ); digital, scholarly editions of manuscripts, letters, etc.
( )
media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content
disciplinary specialties: environment; linguistics; electronic or “new” media; native american history; history of arctic exploration
top publications: elementa (journal); linguistic discovery (journal); journal of e-media studies (journal); occom circle project (digital collection); artistry of the homeric simile (monograph)
percentage of journals that are peer reviewed:
founding institution, library publishing coalition
campus partners: campus departments or programs; individual faculty
other partners: university press of new england; bioone
publishing platform(s): contentdm; locally developed software; ambra
digital preservation strategy: dpn; hathitrust; lockss; portico; in-house; digital preservation services under discussion
additional services: marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; budget preparation; other author advisory; digitization; audio/video streaming; xml consultation in jats . and tei
additional information: the partnership with the publisher bioone is enabling us to increase our technological capacity for journal publishing. bioone is a significant contributor to the staffing for elementa. the partnership with the university press of new england is enabling us to increase knowledge and capacity for monograph publishing.
plans for expansion/future directions: publishing more monographs in conjunction with the university press of new england, further developing technical capacity for journals, increasing the number of digital editions, working with student journals.
highlighted publication: through elementa: science of the anthropocene, we aim to facilitate scientific solutions to the challenges presented by this era of accelerated human impact with timely, technically sound, peer-reviewed articles that address interactions between human and natural systems and behaviors. home.elementascience.org

duke university
duke university libraries
primary unit: office of copyright and scholarly communications; open-access@duke.edu
primary contact: paolo mangiafico; coordinator of scholarly communications technology; - - ; paolo.mangiafico@duke.edu
website: library.duke.edu/openaccess
program overview
mission/description: duke university libraries partners with members of the duke community to publish and disseminate scholarship in new and creative ways, including helping to publish scholarly journals on an open access digital platform, archiving previously published and original works, and consulting on new forms of scholarly dissemination.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . ); graduate students ( .
)
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); technical/research reports ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images; audio; data; multimedia/interactive content
disciplinary specialties: greek, roman, and byzantine studies; transatlantic german studies; th-century russian studies; cultural anthropology; scholarly communications
top publications: cultural anthropology (journal); etds; greek, roman, and byzantine studies (journal); scholarly communications @ duke (blog); andererseits (journal)
founding institution, library publishing coalition
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: society for cultural anthropology; editors of particular journals and their organizations
publishing platform(s): dspace; ojs; wordpress; symplectic elements
digital preservation strategy: depends on the journal and type of content, primarily in-house, but exploring archiving with portico
additional services: outreach; training; analytics; metadata; open url support; dataset management; business model development; contract/license preparation; author copyright advisory; other author advisory; hosting of supplemental content
plans for expansion/future directions: working with more datasets, digital projects, and forms other than linear text; exploring platforms that support new publishing models, not just digital versions of old journal models.
highlighted publication: cultural anthropology is the journal of the society for cultural anthropology, a section of the american anthropological association (aaa).
it is one of journals published by the aaa, and it is widely regarded as one of the flagship journals of its discipline. culanth.org

emory university
robert w. woodruff library
primary unit: emory center for digital scholarship; allen.tullos@emory.edu
primary contact: stewart varner; digital scholarship coordinator; - - ; stewart.varner@emory.edu
program overview
mission/description: the enduring goal of a university is to create and disseminate knowledge. changes in technology offer opportunities for new forms of both creation and dissemination of scholarship through open access (oa). open access publishing also offers opportunities for emory university to fulfill its mission of creating and preserving knowledge in a way that opens disciplinary boundaries and facilitates sharing that knowledge more freely with the world.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( ); graduate students ( )
funding sources (%): library operating budget ( ); grants ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); databases ( ); etds ( )
media formats: text; images; audio; video; concept maps/modeling maps/visualizations; multimedia/interactive content
disciplinary specialties: southern studies; religion/theology
top publications: southern spaces (journal); molecular vision (journal); methodist review (journal); practical matters (journal)
percentage of journals that are peer reviewed:
contributing institution, library publishing coalition
campus partners: campus departments or programs; individual faculty; graduate students
publishing platform(s): fedora; ojs/ocs/omp; wordpress; drupal
digital preservation strategy: digital preservation services under discussion
additional services: typesetting; copy-editing; metadata; peer
review management; contract/license preparation; author copyright advisory; digitization; audio/video streaming
additional information: on june , , emory university announced the launch of the emory center for digital scholarship (ecds). the ecds brings together four units currently housed in the robert w. woodruff library: the digital scholarship commons (disc), the electronic data center, the lewis h. beck center for electronic collections, and the emory center for interactive teaching (ecit). these units have each collaborated with emory scholars who wish to incorporate technology into their teaching and research. the formation of the ecds will break down barriers between these functions and simplify the process of establishing partnerships with scholars. expanding and strengthening support for open access, digital publishing is a top priority for the ecds.
plans for expansion/future directions: reexamining the expansion of library publishing services following the recent launch of the emory center for digital scholarship.

florida atlantic university
se wimberly library
primary unit: digital library; lydig@fau.edu
primary contact: joanne parandjuk; digital initiatives librarian; - - ; jparandj@fau.edu
website: www.library.fau.edu/depts/digital_library/about.htm
program overview
mission/description: recognizing the publishing needs of campus members and local partners, an open access publishing service was initiated by the fau digital library in support of scholarly communications across campus and the wider dissemination of fau research and creative content.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); graduate students ( .
)
funding sources (%): library operating budget ( ); charge backs ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images; audio; video
disciplinary specialties: geosciences; undergraduate research; communications; local history
top publications: the florida geographer (journal); democratic communique (journal); fau undergraduate research journal (journal); journal of coastal research (journal backfile); broward legacy (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
other partners: broward county historical society
publishing platform(s): islandora (migration underway); ojs/ocs/omp
digital preservation strategy: florida digital archive member
additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; digitization; audio/video streaming
plans for expansion/future directions: we have just launched the second issue of the first volume of our undergraduate research journal to instill scholarly inquiry and practices among undergraduates, and we hope to see a rise in the research activity of our students.

florida state university
robert manning strozier library
primary unit: technology and digital scholarship
primary contact: micah vandegrift; scholarly communication librarian; - - ; mvandegrift@fsu.edu
website: diginole.lib.fsu.edu
program overview
mission/description: scholarly communications is a developing area of librarianship that deals with the production, dissemination, promotion, and preservation of scholarly research and creative works.
the scholarly communication initiative will find, assess, and provide tools and services for representing scholarship in a digital environment. our vision is to support a variety of new modes and models of dissemination for academic work (open access, digital publishing, project-based digital scholarship, etc.). areas of focus include our institutional repository (technical management, outreach, collection development); open access (education and programs on access options for scholarly work); author rights (information and resources on negotiating copyright transfer contracts); copyrights and fair use (information and resources on copyright as it pertains to academic publishing); research and writing (keeping abreast of the many changes and developments in this area, and contributing to the professional literature); and outreach (creating partnerships with campus offices, faculty, and administrators to further the scholarly communications initiative).
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); graduate students ( )
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( )
contributing institution, library publishing coalition
media formats: text
disciplinary specialties: arts and literature; art education and therapy; law
top publications: heal: humanism evolving through arts and literature (journal); journal of art for life (journal); the owl (journal); fsu law review (journal)
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): bepress (digital commons)
digital preservation strategy: digital
preservation services under discussion
additional services: marketing; outreach; training; analytics; metadata; issn registration; peer review management; contract/license preparation; author copyright advisory; hosting of supplemental content
plans for expansion/future directions: piloting an open access fund, finding a sustainable model and including it as an ongoing resource for moving scholarship and prestige to open access; growing scholcomm office to include repository manager and host research fellows (clir, mellon); coordinating with the school of library and information studies and the history of text technologies to integrate scholcomm initiatives into curriculum; providing training and investment in fsu lis students’ skills and knowledge in this area; reworking open access policy with faculty senate to make our policy more effective and more in line with the scholarly communication push internationally.

georgetown university
georgetown university libraries
primary unit: library information technologies; digitalscholarship@georgetown.edu
primary contact: kate dohe; digital services librarian; - - ; kd @georgetown.edu
website: www.library.georgetown.edu/digitalgeorgetown
social media: @gtownlibrary
program overview
mission/description: digitalgeorgetown supports the advancement of education and scholarship at georgetown and contributes to the expansion of research initiatives, both nationally and internationally. by providing the infrastructure, resources, and services, digitalgeorgetown sustains the evolution from the traditional research models of today to the enriched scholarly communication environment of tomorrow, and it provides context and leadership in developing collaborative opportunities with partners across the campus and around the world.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( .
)
funding sources (%): library operating budget ( )
publishing activities
types of publications: student-driven journals ( ); monographs ( ); technical/research reports ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); faculty papers; video interviews; citations; syllabi
media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content
disciplinary specialties: linguistics; communications; international relations/foreign policy; bioethics
top publications: georgetown university round tables on language and linguistics (monograph); the human cloning debate (monograph); the genocide in cambodia (monograph)
percentage of journals that are peer reviewed:
campus partners: georgetown university press; campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): dspace
digital preservation strategy: digital preservation services under discussion
additional services: marketing; outreach; training; analytics; cataloging; metadata; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: continuing to enhance and expand our initiative to include more open access materials, different forms and formats of etds, and other scholarly publications.
georgia state university
georgia state university library
primary unit: digital initiatives; digitalarchive@gsu.edu
primary contact: sean lind; digital initiatives librarian; - - ; slind @gsu.edu
website: digitalarchive.gsu.edu
program overview
mission/description: the mission of the institutional repository at georgia state university is to give free and open access to the impactful scholarly and creative works, research, publications, reports, and data contributed by faculty, students, staff, and administrative units of georgia state university.
year publishing activities began:
organization: services are distributed across campus
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library materials budget ( )
publishing activities
types of publications: student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text
disciplinary specialties: law review; undergraduate honors research
top publications: georgia state university law review (journal); colonial academic alliance undergraduate research journal (journal); discovery: georgia state university undergraduate honors research journal (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
publishing platform(s): bepress (digital commons)
digital preservation strategy: duracloud/dspace
additional services: marketing; outreach; training; analytics; cataloging; metadata; issn registration; open url support; author copyright advisory; other author advisory; digitization
plans for expansion/future directions: increasing the number and variety of georgia state university faculty scholarly publications openly available for download on the internet.
grand valley state university
grand valley state university libraries
primary unit: collections and scholarly communications; scholarworks@gvsu.edu
primary contact: sarah beaubien; scholarly communications outreach coordinator; - - ; beaubisa@gvsu.edu
founding institution, library publishing coalition
program overview
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images; audio; video; data
top publications: online readings in psychology and culture (digital collection); foundation review (journal); fishladder (journal); language arts journal of michigan (journal); journal of tourism insights (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: michigan council of teachers of english; resort and commercial recreation association; international association for cross-cultural psychology; johnson center for philanthropy
publishing platform(s): bepress (digital commons)
digital preservation strategy: lockss; portico; digital preservation services under discussion
additional services: outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; peer review management; author copyright advisory; digitization; hosting of supplemental content
highlighted publication: the foundation review is the first peer-reviewed journal of philanthropy, written by and for foundation staff
and boards, and those who work with them implementing programs. it provides rigorous research and writing, presented in an accessible style. scholarworks.gvsu.edu/tfr

gustavus adolphus college
folke bernadotte memorial library
primary contact: barbara fister; professor and academic librarian; - - ; fister@gac.edu
program overview
mission/description: we want to support the shift from closed, licensed access to information to open, shareable, and sustainable scholarship.
year publishing activities began:
organization: entrepreneurial, experimental, more or less a sandbox in which librarians help other faculty consider alternatives
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): other ( )
publishing activities
types of publications: monographs ( )
media formats: text
campus partners: individual faculty
publishing platform(s): contentdm; wordpress; pressbooks
digital preservation strategy: in-house
additional services: author copyright advisory; other author advisory; digitization
additional information: we have published one monograph using pressbooks: an anthology based on faculty statements about teaching, scholarship, and service submitted for tenure and promotion. we wanted it to be lightweight and without cost other than time. it worked. we also have shared platform advice with faculty interested in publishing. it is all very much at the beginning and is without much in the way of technical or financial support, but we expect the resource commitment to grow.
plans for expansion/future directions: working with similar libraries to study the possible launch of a press.
hamilton college
burke library
primary unit: department of special collections and archives; cgoodwil@hamilton.edu
primary contact: randall ericson; editor; - - ; rericson@hamilton.edu
website: couperpress.org
program overview
mission/description: the couper press was established in by couper librarian randall ericson of the burke library at hamilton college in clinton, new york. the press is named in honor of the late richard w. couper ‘ , an alumnus, life trustee of hamilton, and benefactor of the burke library. the press publishes a quarterly journal of scholarship, american communal societies quarterly (acsq), which showcases the communal societies collections of burke library. american communal societies series, a monograph series, presents new scholarship pertaining to american intentional communities as well as reprints of, and critical introductions to, important historical works that may be difficult to find or are out of print. shaker studies are short monographs on the shakers. occasional publications are published on topics that highlight the special collections of the burke library.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( .
)
funding sources (%): endowment income ( )
publishing activities
types of publications: faculty-driven journals ( ); monographs ( )
media formats: text; images
disciplinary specialties: communal studies; religious studies; sociology; american history; musicology
top publications: prison diary and letters of chester gillette (monograph); visiting the shakers, – : watervliet, hancock, tyringham, new lebanon (monograph); encyclopedic guide to american intentional communities (monograph); a promising venture: shaker photographs from the wpa (monograph); demographic directory of the harmony society (monograph)
campus partners: individual faculty; undergraduate students
other partners: museums; libraries; private collectors
digital preservation strategy: digital preservation services under discussion
additional services: graphic design (print or web); copy-editing
plans for expansion/future directions: considering making the american communal societies quarterly available through an institutional repository.

illinois wesleyan university
the ames library
primary unit: scholarly communications
primary contact: stephanie davis-kahl; scholarly communications librarian; - - ; sdaviska@iwu.edu
program overview
mission/description: the ames library publishing program focuses on disseminating excellent student-authored research, scholarship, and creative works, with an emphasis on providing education and outreach on issues related to publishing such as open access, author rights, and copyright.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); undergraduate students ( )
funding sources (%): library operating budget ( ); non-library campus budget ( )
publishing activities
types of publications: student-driven journals ( ); textbooks ( ); student conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( )
media formats: text; images; audio; video
disciplinary specialties: economics; political science; history
top publications: undergraduate economic review (journal); constructing history (journal); res publica (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
contributing institution, library publishing coalition
publishing platform(s): bepress (digital commons)
digital preservation strategy: in-house; digital preservation services under discussion
additional services: training; analytics; metadata; peer review management; author copyright advisory; other author advisory; hosting of supplemental content; audio/video streaming
additional information: regarding our funding model; percent of the cost of our bepress implementation is covered by the library, while the remaining percent is generously provided by the office of the president, office of the provost, and mellon center for faculty and curriculum development. faculty advisors for our student journals donate their time.
plans for expansion/future directions: considering how to best position the program to become a publishing outlet for faculty.
indiana university
indiana university libraries
primary unit: iuscholarworks; iusw@indiana.edu
primary contact: jennifer laherty; digital publishing librarian; - - ; jlaherty@indiana.edu
website: scholarworks.iu.edu
program overview
mission/description: iuscholarworks is a set of services from the indiana university libraries to make the work of iu scholars freely available and to ensure that these resources are preserved and organized for the future.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . ); graduate students ( )
funding sources (%): library materials budget ( ); library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); newsletters ( ); etds ( )
media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content
disciplinary specialties: folklore
percentage of journals that are peer reviewed:
campus partners: iu press; campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: american folklore society
contributing institution, library publishing coalition
publishing platform(s): dspace; ojs/ocs/omp
digital preservation strategy: archive-it; clockss; duracloud/dspace; hathitrust
additional services: outreach; training; analytics; cataloging; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming; metadata consultation.
plans for expansion/future directions: incorporating the libraries’ open access publishing activities into the development of a new campus office, the office of scholarly publishing, which includes the university press and an etextbook initiative.

johns hopkins university
sheridan libraries
primary unit: scholarly resources and special collections dissertations@jhu.edu
primary contact: david reynolds manager of scholarly digital initiatives - - davidr@jhu.edu

program overview
mission/description: to provide a publishing platform for required etds and journals for the johns hopkins academic community.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images
disciplinary specialties: education; business
top publications: international journal of interdisciplinary education (journal); new horizons for education (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students
publishing platform(s): dspace; ojs/ocs/omp
digital preservation strategy: in-house; digital preservation services under discussion
additional services: training; analytics; metadata; peer review management; author copyright advisory
additional information: we have only done an etd pilot so far, but mandatory submission was required as of september , . we are working with the school of education to publish two new oa journals. we expect the inaugural issues to appear by the second quarter of .
plans for expansion/future directions: publishing journals for the school of education; looking into providing a monograph publishing service for academic departments; revisiting the question of publishing student journals.

kansas state university
kansas state university libraries
primary unit: scholarly communications and publishing info@newprairiepress.org
primary contact: char simser coordinator of electronic publishing, new prairie press - - info@newprairiepress.org
website: newprairiepress.org
social media: @newprairiepress

program overview
mission/description: to host peer-reviewed scholarly journals, monographs, conference proceedings, and other series primarily in the humanities and social sciences; make the content freely available worldwide; and contribute to and support evolving scholarly publishing models.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); graduate students ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( )
media formats: text; images; video
disciplinary specialties: financial therapy; rural research and policy; library science; cognitive sciences and semantics; analytical philosophy
top publications: gdr bulletin (journal); baltic international yearbook (journal); journal of financial therapy (journal); online journal of rural research & policy (journal); kansas library association college and university libraries section proceedings (conference proceedings)
founding institution, library publishing coalition
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty
publishing platform(s): bepress (digital commons)
digital preservation strategy: clockss
additional services: graphic design (print or web); marketing; training; notification of a&i sources; doi assignment/allocation of identifiers; digitization; hosting of supplemental content
plans for expansion/future directions: publishing open access monographs and conference proceedings and publishing two undergraduate research journals; setting up an advisory board to help set direction and policy and recommend new titles for npp.

highlighted publication
since , the journal of financial therapy has been the leading forum dedicated to clinical, experimental, and qualitative research in the emerging field of financial therapy.
jftonline.org

loyola university chicago
loyola university chicago libraries
primary unit: library systems
primary contact: margaret heller digital services librarian - - mheller @luc.edu

program overview
mission/description: loyola ecommons is an open-access, sustainable, and secure resource created to preserve and provide access to research, scholarship, and creative works created by the university community for the benefit of loyola students, faculty, staff, and the larger academic community. sponsored by the university libraries, loyola ecommons is a suite of online resources, services, and people working in concert to facilitate a wide range of scholarly and archival activities, including collaboration, resource sharing, author rights management, digitization, preservation, and access by a global academic audience.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); graduate students ( . ); undergraduate students ( . )
funding sources (%): library operating budget ( ); non-library campus budget ( )

publishing activities
types of publications: faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( )
media formats: text; images; data
disciplinary specialties: criminal justice; economics; social work
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): bepress (digital commons)
digital preservation strategy: digital preservation services under discussion
additional services: outreach; training; analytics; metadata; digitization; hosting of supplemental content
plans for expansion/future directions: hosting conference proceedings and journals.
macalester college
dewitt wallace library
primary unit: digital scholarship and services
primary contact: johan oberg digital scholarship and services librarian - - joberg@macalester.edu
website: www.macalester.edu/library/digitalinitiatives/index.html

program overview
mission/description: the digital publishing unit of the dewitt wallace library supports the creation, management, and dissemination of local digital-born scholarship in various formats. essential to supporting this mission is the continuing exploration of evolving creation, collaboration, and publication tools; encoding methods; and development of staff skills and facility resources. the unit serves the digital scholarship and electronic publishing needs through development of digital scholarship projects as well as open access online distribution of journals, articles, and conference proceedings. the library is committed to playing an active role in the changing landscape of scholarly publishing and supports the ideals of the open access movement.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); undergraduate capstone/honors theses ( ); college alumni magazine; conference proceedings; oral histories
media formats: text; images; audio; video; data
disciplinary specialties: natural sciences; social sciences; fine arts; humanities; interdisciplinary studies
top publications: “an analysis of the career length of professional basketball players” (thesis); “the cultural omnivore in its natural habitat: music taste at a liberal arts college” (thesis); “what are the effects of mergers in the u.s. airline industry? an econometric analysis on delta-northwest merger” (thesis); “the mirror’s reflection: virgil’s aeneid in english translation” (thesis); “fat teen trouble: a sociological perspective of obesity in adolescents” (thesis)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
other partners: association for nepal and himalayan studies (anhs)
publishing platform(s): bepress (digital commons); contentdm
digital preservation strategy: in-house
additional services: typesetting; cataloging; metadata; issn registration; dataset management; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: working with faculty to develop data management curation and preservation.
mcgill university
mcgill university library
primary unit: escholarship, epublishing and digitization
primary contact: amy buckland escholarship, epublishing & digitization coordinator - - amy.buckland@mcgill.ca

program overview
mission/description: mcgill university library showcases the research done by the mcgill community to the world via publishing initiatives such as electronic theses and dissertations, open access journals and monographs, and by partnering with others to develop new methods to disseminate research.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( ); working papers
media formats: text; images; audio; video
disciplinary specialties: education; food cultures; library history
top publications: mcgill journal of education (journal); cuizine (journal); fontanus (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: public knowledge project; erudit; thesescanada
contributing institution, library publishing coalition
publishing platform(s): ojs/ocs/omp; locally developed software; digitool
digital preservation strategy: in-house; digital preservation services under discussion
additional services: training; analytics; notification of a&i sources; issn registration; author copyright advisory; other author advisory; digitization; hosting of supplemental content
plans for expansion/future directions: etd program and fontanus series are well established, but ojs journals are still in a developmental stage; looking to pair with the digital humanities community on campus to look at new ways of publishing, beyond the journal/monograph binary.

miami university
university libraries
primary unit: center for digital scholarship
primary contact: john millard head, center for digital scholarship - - millarj@miamioh.edu

program overview
mission/description: we want to serve as a collaborative partner with faculty, students, and staff by providing infrastructure and expertise to support open access journals with or without peer review.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( )
media formats: text
disciplinary specialties: computer science and engineering; psychology
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): ojs/ocs/omp
digital preservation strategy: in-house
additional services: cataloging; metadata; author copyright advisory; digitization

mount saint vincent university
mount saint vincent university library
primary unit: archives and scholarly communication ojs@msvu.ca
primary contact: roger gillis scholarly communications and archives librarian - - roger.gillis@gmail.com
website: journals.msvu.ca

program overview
mission/description: journals at the mount is a hosting service provided by the mount saint vincent university library for the mount community and/or affiliated partners. the service employs open journal systems (ojs) as the hosting platform for scholarly journals and includes training, support, and guidance for the development of new and existing publications of the mount community.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( ); other ( )

publishing activities
types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images
disciplinary specialties: women’s/gender studies; adult education
top publications: atlantis: critical studies in gender, culture & social justice (journal); canadian journal for the study of adult education (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students
other partners: canadian association for the study of adult education; public knowledge project
publishing platform(s): ojs/ocs/omp
digital preservation strategy: digital preservation services under discussion
additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; hosting of supplemental content
plans for expansion/future directions: digitizing back issues, developing student journals, and discussing with faculty the development of new journals/migrating existing journals to the ojs platform.

northeastern university
university libraries
primary unit: scholarly communication
primary contact: hillary corbett scholarly communication librarian - - h.corbett@neu.edu

program overview
mission/description: the university libraries offer a growing suite of publishing services in response to the needs of faculty, students, and staff. the libraries provide an online platform for journal publishing and the opportunity to produce innovative online collections and e-books through its digital repository service.
through the repository service, the libraries also provide open access to the university’s electronic theses and dissertations, scholarly research output, and university-produced objects.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): bepress (digital commons); fedora; omeka; issuu
digital preservation strategy: in-house; digital preservation services under discussion
founding institution, library publishing coalition
additional services: graphic design (print or web); typesetting; copy-editing; outreach; training; metadata; compiling indexes and/or tocs; notification of a&i sources; doi assignment/allocation of identifiers; dataset management; author copyright advisory; other author advisory; digitization; hosting of supplemental content
plans for expansion/future directions: working to expand the capabilities of our digital repository, in response to users’ needs for space that can accommodate new kinds of projects; bringing another faculty journal online in the coming year.

highlighted publication
annals of environmental science publishes original, peer-reviewed research in the environmental sciences, broadly defined. it has been published open-access at northeastern university since .
www.aes.neu.edu

northwestern university
northwestern university library
primary unit: center for scholarly communication and digital curation cscdc@northwestern.edu
primary contact: claire stewart head, digital collections and scholarly communication services - - claire-stewart@northwestern.edu
website: cscdc.northwestern.edu
social media: @nu_cscdc

program overview
mission/description: we are engaged in planning activities to identify tools and support models that enable distributed, preservable publishing projects across the entire university. in initial phases, we anticipate the emphasis will be heavier on non-traditional products, transitioning to open theses, open journals, and open books as the key stakeholders, including our press, move into closer technical and mission alignment.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( ); charitable contributions/friends of the library organizations ( ); grants ( )

publishing activities
types of publications: databases ( ); scholarly websites that are heavily content-driven
media formats: text; images
disciplinary specialties: classics; history
top publications: classicizing chicago (digital collection)
campus partners: campus departments or programs; individual faculty
contributing institution, library publishing coalition
publishing platform(s): fedora; wordpress; drupal
digital preservation strategy: duracloud/dspace; dpn; in-house; digital preservation services under discussion
additional services: graphic design (print or web); training; metadata; dataset management; author copyright advisory; digitization; hosting of supplemental content
additional information: we are working in many areas that blur into “library publishing,” so it is sometimes hard to isolate the people, tasks, and funding that contribute to library publishing services. it is an area that we see as a growing component of our library’s scholarly and digital programs. the fact that the university press also reports to the dean of libraries opens up avenues for fruitful discussion, but to date the press’ publishing is quite separate from the library’s.
plans for expansion/future directions: developing a consulting service for faculty seeking to establish new publications and engaging in conversations with partners on campus around a shared investment in a cloud-based wordpress service, with plans to build and extend custom plugins for publishing projects and to integrate cms-based publishing projects with the library’s digital repository; exploring possible collaborations with the university press, especially related to policy and infrastructure.

oberlin college
oberlin college library
primary unit: oberlin college library alan.boyd@oberlin.edu
primary contact: alan boyd associate director of libraries - - alan.boyd@oberlin.edu

program overview
mission/description: publish all current and retrospective honors papers and master’s theses with concurrence of the faculty department.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: undergraduate capstone/honors theses ( )
media formats: text; images; multimedia/interactive content
campus partners: campus departments or programs
publishing platform(s): ohiolink etd center
digital preservation strategy: no digital preservation services provided
additional services: outreach; cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content

ohio state university
university libraries
primary unit: digital content services schlosser. @osu.edu
primary contact: melanie schlosser digital publishing librarian - - schlosser. @osu.edu
website: library.osu.edu/projects-initiatives/knowledge-bank

program overview
mission/description: our mission is to engage with partners across the university to increase the amount, value, and impact of osu-produced digital content.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); student conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( ); conference and event lectures and presentations ( ); graduate student culminating papers and projects ( ); graduate student research forum papers and symposia posters ( ); undergraduate research forum presentations and posters ( )
media formats: text; images; audio; video; data
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: society for disability studies; the ohio academy of science
founding institution, library publishing coalition
publishing platform(s): dspace; ojs/ocs/omp
digital preservation strategy: digital preservation services under discussion
additional services: graphic design (print or web); typesetting; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; digitization; hosting of supplemental content; consulting and educational programming
additional information: although an etd program is considered library publishing for this survey, we did not include etds. our dissertations and theses are submitted by students to the ohiolink consortial etd database. since autumn term , dissertations have been produced by the student in electronic format and submitted to the ohiolink etd center.
beginning calendar , all master’s theses have been produced by the student in electronic format and submitted to the ohiolink etd center. we do not host our dissertations and theses separately.
plans for expansion/future directions: formalizing policies and procedures, recruiting new publishing partners, and adding new services.

highlighted publication
disability studies quarterly, the journal of the society for disability studies, is a multidisciplinary, international publication that covers all aspects of disability studies.
dsq-sds.org

oregon state university
oregon state university libraries and press
primary unit: center for digital scholarship and services
primary contact: michael boock head of the center for digital scholarship and services - - michael.boock@oregonstate.edu
website: cdss.library.oregonstate.edu

program overview
mission/description: oregon state university libraries’ publishing activities are primarily focused on the dissemination of scholarship produced by osu faculty and students. this is achieved largely through the institutional repository scholarsarchive@osu, which includes previously unpublished material such as electronic theses and dissertations, agricultural extension reports, and faculty datasets. osu libraries also hosts open access journals that include articles by osu faculty. the libraries’ center for digital scholarship and services digitizes selected out-of-print osu press publications, and provides open access to excerpts from press books and supplementary materials such as maps and datasets. other publishing activities involve the development of online resources that present and interpret unique holdings of osu libraries. examples include extensive documentary histories and online exhibits on the linus pauling papers and related archival collections in the history of science and other areas.
osu libraries has also developed digital resources in conjunction with books published by the osu press. examples include a mobile application for touring historic buildings that is based on a book about portland architecture, and a website that supports nature exploration related to a children’s book published by the press.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); datasets
founding institution, library publishing coalition
media formats: text; images; audio; video; data
disciplinary specialties: forestry; agriculture; history of science; water studies
top publications: growing your own (technical report); forest phytophthoras (journal); international institute for fisheries economics and trade conference proceedings (conference proceedings); journal of the transportation research forum (journal); reducing fire risk on your forest property (technical report)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: transportation research forum; international institute for fisheries economics and trade; western dry kiln association; oregon institute for natural resources
publishing platform(s): contentdm; dspace; fedora; ojs/ocs/omp; wordpress; omeka
digital preservation strategy: archive-it; lockss; metaarchive
additional services: graphic design (print or web); training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; dataset management; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming

highlighted publication
for more than a century, oregon state university’s extension service and agricultural experiment station publications have covered everything from winemaking techniques to marine economics.
ir.library.oregonstate.edu/xmlui/handle/ /

additional information: it should be noted that while the osu press is part of the osu libraries organization, the press’ publishing program, which results in the publication of approximately twenty-five books per year on the pacific northwest, has mostly operated independently from the libraries’ publishing activities. therefore, the descriptions of “library publishing” have not included the press’ current print publishing output. in the future, the publishing programs of the libraries and press will be increasingly integrated.
plans for expansion/future directions: our plans for the future largely focus on open access student journals, digital humanities, and open textbooks. student journals will publish research from osu undergraduate and graduate students, as well as students from around the world in specific disciplines. digital humanities projects will incorporate platforms that emphasize multimedia elements in presenting scholarship by osu faculty. open textbooks will involve a new partnership between the osu libraries and press and the osu extended campus open educational resources unit to support development of open textbooks by osu faculty. the osu libraries’ gray family chair for innovative library services will focus on digital publishing for at least the next three years, with a new incumbent providing vision and direction for innovation and sustainability in digital publishing.
pacific university
pacific university libraries
primary unit: local collections and publication services
primary contact: isaac gilman scholarly communications and research services librarian - - gilmani@pacificu.edu
website: www.pacificu.edu/library/services/lcps/index.cfm

program overview
mission/description: pacific university libraries’ publishing services exist to disseminate diverse and significant scholarly and creative work, regardless of a work’s economic potential. through flexible open access publishing models and author services, pacific university libraries will contribute to the discovery of new ideas (from scholars within and outside the pacific community) and to the sustainability of the publishing system.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( )

publishing activities
types of publications: faculty-driven journals ( ); monographs ( ); etds ( )
media formats: text; images; audio
disciplinary specialties: health care; philosophy; undergraduate research; librarianship
top publications: essays in philosophy (journal); journal of librarianship and scholarly communication (journal); health & interprofessional practice (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
contributing institution, library publishing coalition
publishing platform(s): bepress (digital commons)
digital preservation strategy: digital preservation services under discussion
additional services: typesetting; copy-editing; training; analytics; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; author copyright advisory; digitization

pennsylvania state university
penn state university libraries
primary unit: publishing & curation services primary contact: linda friend head, scholarly publishing services - - lxf @psu.edu website: www.libraries.psu.edu/psul/pubcur.html program overview mission/description: our mission is to provide authors and researchers with consultation on publishing options and practical, alternative ways for penn state faculty and students to publish and disseminate research in many formats. in addition, we provide assistance to scholarly journals and societies in disseminating their publications and proceedings electronically. we subscribe to the principles of open access to research information. doctoral dissertations and master’s theses for most academic programs are submitted digitally and are disseminated through the libraries, and there is an active program of collecting and making student research available. the three primary research journals in the field of pennsylvania history are part of our digitized collections. we are currently investigating the need and feasibility of offering an enhanced program of tiered publishing services, particularly for research journals, data, conference proceedings, and student-initiated work. year publishing activities began: organization: centralized library publishing unit/department. some operations and publishing workflow responsibilities are distributed among several library units/departments including technology support, cataloging, preservation, etc. staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ); sales revenue ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); graduate student research exhibition posters; undergraduate student research exhibition posters [library publishing coalition founding institution] media formats: text; images; audio; video; data disciplinary specialties: pennsylvania history and culture top publications: pennsylvania history journal (journal); pennsylvania magazine of history and biography (magazine); western pennsylvania history (journal); wepan conference proceedings (conference proceedings) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: women in engineering proactive network (wepan); historical society of pennsylvania; heinz history center; pennsylvania history association publishing platform(s): contentdm; ojs/ocs/omp; wordpress digital preservation strategy: digital preservation services under discussion; digital preservation special team is currently working on a long range plan. additional services: marketing; outreach; metadata; dataset management; other author advisory; digitization; hosting of supplemental content plans for expansion/future directions: redescribing the program, with expansion of services in the near future. highlighted publication: western pennsylvania history from the heinz history center is a colorful regional quarterly of interest to scholars and history buffs alike.
ojs.libraries.psu.edu/index.php/wph
pepperdine university pepperdine university libraries primary unit: office of the dean of libraries primary contact: mark roosa dean of libraries - - mark.roosa@pepperdine.edu website: digitalcommons.pepperdine.edu program overview mission/description: the pepperdine libraries provide a global gateway to knowledge, serving the diverse and changing needs of our learning community through personalized service at our campus locations and rich computer-based resources. at the academic heart of our educational environment, our libraries are sanctuaries for study, learning, and research, encouraging discovery, contemplation, social discourse, and creative expression. as the information universe continues to evolve, our goal is to remain responsive to users’ needs by providing seamless access to both print and digital resources essential for learning, teaching, and research. the libraries, through digital commons@pepperdine, offer a wide array of digital publications that are openly available for study, research, and learning.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; data disciplinary specialties: religion; business; public policy; psychology; law [library publishing coalition contributing institution] top publications: pepperdine law review (journal); leaven (journal); pepperdine dispute resolution law journal (journal); the journal of business, entrepreneurship and the law (journal); journal of the national association of administrative law judiciary (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons); contentdm digital preservation strategy: lockss; portico; preservica; in-house additional services: marketing; outreach; training; cataloging; metadata; dataset management; digitization; audio/video streaming plans for expansion/future directions: publishing additional undergraduate research; creating a line of monographic publications; publishing rich media content (e.g., video presentations); implementing an enterprise digital preservation solution; identifying new ways of participating in the editorial processes generally associated with publishing.
portland state university portland state university library primary unit: digital initiatives primary contact: sarah beasley scholarly communication coordinator - - bvsb@pdx.edu program overview mission/description: portland state university (psu) library provides the infrastructure and a suite of services to offer a publishing platform that facilitates open access distribution; enhanced web search engine discovery through standards-based metadata and file formatting; permanent urls; file formatting and format migration; copyright advisory for authors; and outreach for and promotion of psu faculty or psu departmentally sponsored content. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: student-driven journals ( ); monographs ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data disciplinary specialties: physics; environmental sciences; engineering and computer science; urban studies and planning; education percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty publishing platform(s): bepress (digital commons) digital preservation strategy: in-house additional services: marketing; outreach; analytics; cataloging; metadata; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: hosting journals, archiving monographs, and producing open access textbooks.
purdue university purdue university libraries primary unit: purdue scholarly publishing services primary contact: charles watkinson head, scholarly publishing services - - ctwatkin@purdue.edu website: www.lib.purdue.edu/publishing social media: @publishpurdue program overview mission/description: purdue scholarly publishing services focuses on supporting the publication efforts of various centers and departments within the purdue system. the primary publishing platform used is purdue e-pubs (www.purdue.edu/epubs), and the majority of products created are openly accessible, free-of-charge, to readers. open access is made possible by the financial support of partners, foundations, and purdue university libraries. major initiatives include the production of the journal of purdue undergraduate research, the publication of technical reports on behalf of the joint transportation research program (jtrp), and the project management of habri central, a major bibliographic reference database for researchers in the area of human-animal bond studies, produced in partnership with the purdue college of veterinary medicine. purdue scholarly publishing services and purdue university press, which publishes more formal books and journals, together constitute the publishing division of purdue libraries. our diverse publishing activities are supported by a single group of staff members with assistance from undergraduate and graduate students. by harnessing the skills of both librarians and publishers, and leveraging a common infrastructure, we believe we can better serve the needs of scholars in the digital age and enhance the impact of purdue scholarship by developing information products aligned with the university’s strengths. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ); undergraduate students ( . 
) [library publishing coalition founding institution] funding sources (%): library operating budget ( ); non-library campus budget ( ); grants ( ); sales revenue ( ); licensing ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); habri central (an information hub for human-animal bond studies built on the hubzero platform for scientific collaboration); the data curation profiles directory media formats: text; images; audio; video; data; multimedia/interactive content. disciplinary specialties: engineering (civil engineering); education (stem); library and information science; public policy; comparative literature top publications: joint transportation research program technical reports (technical reports); jpur: journal of purdue undergraduate research (journal); habri central (website); clcweb: comparative literature and culture (journal); interdisciplinary journal of problem-based learning (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: indiana department of transportation (indot); habri foundation; charleston conference/against the grain press; international association of scientific and technological university libraries (iatul) highlighted publication: the journal of purdue undergraduate research (jpur) has been established to publish outstanding research papers written by purdue undergraduates from all disciplines who have completed faculty-mentored research projects.
docs.lib.purdue.edu/jpur publishing platform(s): bepress (digital commons); hubzero for habri central digital preservation strategy: clockss and portico for most important journals; metaarchive for habri central additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; business model development; budget preparation; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming; developmental editing; project management additional information: data publication is handled by the purdue university research repository (purr), which is a collaborative project of the libraries, information technology at purdue (itap), and the office of the vice president for research. we have classified all open access journals as being products of scholarly publishing services because of the types of workflow adopted, but five of these use the purdue university press imprint. plans for expansion/future directions: working to expand the number of centers and departments we serve on campus, particularly in the area of conference proceedings and technical reports; creating better linkages between publications and materials in purdue’s data and archival repositories; developing better capacity to handle multimedia and “new form” publications; developing a clearer sustainability plan across the libraries publishing division that balances earned revenue with internal support.
rochester institute of technology the wallace center primary unit: scholarly publishing studio primary contact: nick paulus manager of scholarly publishing - - njpwml@rit.edu website: wallacecenter.rit.edu/scholarly-publishing-studio program overview mission/description: we connect stakeholders’ scholarship efforts with our comprehensive publishing services, ensuring that faculty and student research is made available to readers faster and disseminated in a way that meets their academic objectives. our approach is collaborative. we offer help with design and layout, copy-editing outsourcing, open access publishing, and pre-publishing consultation. at sps, we are committed to advancing the dissemination of scholarship. organization: centralized library publishing unit/department publishing activities campus partners: campus departments or programs; individual faculty publishing platform(s): bepress (digital commons); ojs/ocs/omp additional services: graphic design (print or web); copy-editing
rutgers, the state university of new jersey rutgers university libraries primary unit: scholarly communication center primary contact: rhonda marker rucore collection manager/head, scholarly communications center - - rmarker@rutgers.edu website: rucore.libraries.rutgers.edu/services program overview mission/description: the goal of the rutgers university community repository is to advance research and learning at rutgers, to foster interdisciplinary collaboration, and to contribute to the development of new knowledge through the archiving, preservation, and presentation of digital resources. original research products and papers of the faculty and administrators and the unique resources of the libraries will be permanently preserved and made accessible with tools developed to facilitate and encourage their continued use.
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library materials budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); databases ( ); etds ( ); research/interview videos ( ); a new scholarly communication form, the published video analytic, currently in use in our nsf-funded mathematics education collection, the video mosaic (www.videomosaic.org) media formats: text; video; data; multimedia/interactive content disciplinary specialties: mathematics education; psychology; jazz music; new jersey history; classical studies [library publishing coalition contributing institution] top publications: video mosaic collaborative (website); pragmatic case studies in psychotherapy (journal); journal of jazz studies (journal); journal of rutgers university libraries (journal); new jersey history (journal) percentage of journals that are peer reviewed: campus partners: individual faculty; graduate students publishing platform(s): fedora; ojs/ocs/omp; locally developed software digital preservation strategy: in-house additional services: graphic design (print or web); training; analytics; cataloging; metadata; notification of a&i sources; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; audio/video streaming additional information: our focus is on the unique scholarship and resources of rutgers university and on the research and education needs of our community.
we develop new tools and services, including new modes of scholarly communication, in response to faculty and student needs, often through collaboration in research grants. plans for expansion/future directions: expanding our publishing of original research and scholarship, with a particular focus on research data and digital video, including video of conferences and lectures held in the alexander library; exploring the publishing of undergraduate research in open access journals and new modes of scholarly communication, particularly in the humanities and social sciences.
simon fraser university simon fraser university library . theses/dissertations primary unit: thesis office primary contact: nicole white head, research commons - - ngjertse@sfu.ca website: www.lib.sfu.ca/help/writing/thesis program overview mission/description: responsible for accepting formatted theses and dissertations, and depositing them in the library’s institutional repository, summit. summit also acts as a publication platform for university authors (e.g., conference papers, technical reports). conforms to oai-pmh. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( %) publishing activities types of publications: technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; multimedia/interactive content campus partners: campus departments and programs; individual faculty; graduate students publishing platform(s): drupal digital preservation strategy: archivematica, in-house. work is underway to allow archivematica to store aips (archival information packages) in a coppul private lockss network.
[library publishing coalition contributing institution] additional services: copy-editing; training; analytics; compiling indexes and/or tocs; author copyright advisory; hosting of supplemental content plans for expansion/future directions: working toward becoming a trusted digital repository. moved to exclusively digital thesis submission.
. scholarly journals and conferences primary unit: public knowledge project publishing services (pkp|ps) pkp-hosting@sfu.ca primary contact: brian owen associate university librarian/pkp managing director - - brian_owen@sfu.ca website: http://www.lib.sfu.ca/collections/scholarly-publishing; https://pkpservices.sfu.ca program overview mission/description: provide online hosting and related technical support at no charge for scholarly journals and conferences that have a significant sfu faculty connection (e.g., a managing editor) or to support sfu-based teaching and research initiatives. year publishing activities began: organization: the sfu library provides the administrative and technical home for pkp and its related activities, such as pkp publishing services. in return, pkp|ps provides the technical expertise and infrastructure support for the sfu library’s scholarly communication services. pkp|ps staff work closely with the library’s liaison librarians. staff in support of publishing activities (fte): library staff (. 
) funding sources (%): library operating budget ( ); pkp|ps in-kind ( ) publishing activities types of publications: faculty-driven and graduate student journals ( ); scholarly conferences ( ) media formats: text; images; audio; video; data; multimedia/interactive content other partners: sfu’s canadian centre for studies in publishing publishing platform(s): ojs/ocs digital preservation strategy: coppul; lockss additional services: digitization; software customization/development additional information: pkp publishing services is not a typical library publishing operation. by virtue of being the developers of ojs and other pkp software, we are able to offer technical support that may not be feasible for other library publishing services. plans for expansion/future directions: hosting and related support for open monograph press (omp).
state university of new york at buffalo e. h. butler library primary unit: scholarly communication librarian primary contact: marc d. bayer scholarly communication librarian - - bayermd@buffalostate.edu website: digitalcommons.buffalostate.edu/submit_research.html program overview mission/description: the e. h. butler library publishes monographs and periodicals that feature the research, applied, and artistic works of the buffalo state community. in addition to a print publishing program, the library administers the campus institutional repository. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) publishing activities campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: monroe fordham regional history center of suny buffalo state plans for expansion/future directions: including more student research.
state university of new york at geneseo milne library primary unit: technical services milne@geneseo.edu primary contact: allison brown editor and production manager - - browna@geneseo.edu website: publishing.geneseo.edu program overview mission/description: the mission of milne library publishing services is based on a core value of libraries: knowledge sharing and literacy are an essential public good. the goal of milne publishing is to inspire authors and creators to share their works with a sustainable publishing model that rewards both authors and readers, libraries and learning. milne publishing will help transform scholarly communications and library publishing. year publishing activities began: organization: distributed across library units; open suny textbooks and individual journals are distributed among various institutions staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); student conference papers and proceedings ( ); newsletters ( ); tei digital humanities projects; omeka digital collections; best practices toolkits media formats: text; images. 
disciplinary specialties: education; library and information science; local history; humanities/liberal arts top publications: digitalthoreau.org (website); reprints and new monographs on amazon.com; educational change (journal); reprints on open monograph press; workflow toolkit (website) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students other partners: thoreau society; thoreau institute; walden woods project; new york state foundations of education association publishing platform(s): contentdm; ojs/ocs/omp; wordpress; commons in a box digital preservation strategy: no digital preservation services provided; server backup as appropriate additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; budget preparation; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: library publishing toolkit available at: www.publishingtoolkit.org. developing interactives with video, multiple choice feedback, etc. plans for expansion/future directions: expanding the use of open monograph press for textbook, reprints, and new monograph publishing; developing network hosting and training models for open journal systems and open monograph press; expanding the role of digital scholarship publishing with social reading in digital thoreau and the use of omeka.
syracuse university syracuse university libraries primary unit: scholarly communication primary contact: yuan li scholarly communication librarian - - yli @syr.edu website: surface.syr.edu program overview mission/description: to provide syracuse university (su) faculty with an alternative to commercial publishing venues, and to provide the campus community support for open access publishing models. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library materials budget ( ); library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); working papers; journal articles; images; video; and presentations media formats: text; video disciplinary specialties: law and commerce; public diplomacy; writing and rhetoric; disability and popular culture top publications: intertext (journal) percentage of journals that are peer reviewed: campus partners: syracuse university press; campus departments or programs; individual faculty; graduate students; undergraduate students [library publishing coalition founding institution] publishing platform(s): bepress (digital commons); ojs/ocs/omp digital preservation strategy: aptrust; dpn; lockss additional services: graphic design (print or web); typesetting; copy-editing; marketing; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming plans for 
expansion/future directions: launching a joint imprint (syracuse unbound) with syracuse university press and two new open access journals in the coming months; forming a new unit that brings together several units involved in digital scholarship activities, including digital publishing; formalizing a menu of publishing services for the campus community. highlighted publication: intertext aims to represent the writing of syracuse university students through publishing exemplary works submitted from any writing program undergraduate course. wrt-intertext.syr.edu
temple university temple university libraries primary unit: digital library initiatives diglib@temple.edu primary contact: delphine khanna head of digital library initiatives - - delphine@temple.edu website: digital.library.temple.edu program overview mission/description: the goal of our program is to provide free and open access to digital scholarship produced by temple university students. currently, we focus on the publishing of doctoral dissertations, master’s theses, and the winning essays of the temple university library prize for undergraduate research in general topics and in topics related to sustainability and the environment. in the future, we plan to greatly expand our publishing program to include scholarly journals and books. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: etds ( ); winning essays for the temple university library prize for undergraduate research ( ) media formats: text; images; data disciplinary specialties: full range of academic subjects in etds top publications: “the digitalization of music culture: a case study examining the musician/listener relationship with digital technology” (thesis); “profitability ratio analysis for professional service firms” (thesis); “naskh al-qur’an: a theological and juridical reconsideration of the theory of abrogation and its impact on qur’anic exegesis” (thesis); “pcaob international inspection and audit quality” (thesis); “mother of god, cease sorrow!: the significance of movement in a late byzantine icon” (thesis) campus partners: campus departments or programs publishing platform(s): contentdm digital preservation strategy: in-house. digital preservation services under discussion; our contentdm instance is hosted at oclc and they have backup procedures. we are also now considering membership in hathitrust. additional services: analytics; cataloging; metadata; hosting of supplemental content plans for expansion/future directions: planning significant expansion of services, such as the inclusion of books and journals.
texas tech university texas tech university libraries primary unit: digital resources library unit primary contact: christopher starcher digital services librarian - - christopher.starcher@ttu.edu program overview mission/description: to publish and archive the scholarship of texas tech university by its faculty, researchers, and students. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( . ); undergraduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: textbooks ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images top publications: etds; honors theses campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace digital preservation strategy: duracloud/dspace; lockss; scholars portal; in-house; digital preservation services under discussion. everything is housed at the university data center and then backed up to an out-of-town remote storage facility. additional services: outreach; training; analytics; metadata; doi assignment/allocation of identifiers; author copyright advisory; other author advisory; digitization
thomas jefferson university scott memorial library primary unit: academic and instructional support & resources primary contact: dan kipnis senior education services librarian and editor of jefferson digital commons - - dan.kipnis@jefferson.edu program overview mission/description: to provide an open access institutional repository of the work being produced by the jefferson community to a global audience. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library materials budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); videos; grand round presentations; conference posters media formats: text; images; audio; video disciplinary specialties: historical psychiatry; internal medicine; population studies; integrative medicine top publications: jefferson journal of psychiatry (journal); the medicine forum (journal); on the anatomy of the breast (monograph); a manual of military surgery (monograph); legend and lore: jefferson medical college (monograph) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: special library association publishing platform(s): bepress (digital commons) digital preservation strategy: lockss; in-house additional services: marketing; outreach; training; analytics; metadata; issn registration; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming additional information: we are encouraged by our grass roots effort to get materials in our ir and to continue our publishing efforts. plans for expansion/future directions: continuing to add journals, newsletters, and additional grey literature materials to our institutional repository.
trinity university coates library primary unit: discovery services primary contact: jane costanza head of discovery services - - jcostanz@trinity.edu website: digitalcommons.trinity.edu program overview mission/description: the trinity university open access policy encourages faculty authors to retain non-commercial copyright for their scholarly publications and provides them with the means to negotiate those rights with their publishers.
additionally, open access facilitates the sharing of peer-reviewed research through trinity’s digital repository (digital commons @ trinity), which provides broad, free access to a faculty author’s scholarly work. the open access policy at trinity depends for its effectiveness on faculty authors granting to the university permission to upload digital copies of their scholarly publications to trinity’s digital repository. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); student conference papers and proceedings ( ); undergraduate capstone/honors theses ( ); administrative reports media formats: text; images; video; data disciplinary specialties: teacher education; anthropology; psychology; mathematics; biology top publications: “cognitive bias modification: past perspectives, current findings, and future applications” (thesis); “cognitive bias modification: induced interpretive biases affect memory” (thesis); “a survey of psychologists’ attitudes towards and utilization of exposure therapy for ptsd” (thesis); “islamophobia, euro-islam, islamism and post-islamism: changing patterns of secularism in europe” (thesis); tipiti (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students other partners: society for the anthropology of lowland south america publishing platform(s): bepress (digital commons) digital preservation strategy: clockss additional services: analytics; cataloging; metadata; open url support; author copyright advisory 
additional information: we also support selectedworks. plans for expansion/future directions: continuing to help faculty members understand the issues around the economics of scholarly publishing and the benefits of providing open access to their scholarly output.

tulane university howard-tilton memorial library primary unit: digital initiatives primary contact: jeff rubin digital initiatives and publishing coordinator - - jrubin @tulane.edu website: library.tulane.edu/repository program overview mission/description: tulane university journal publishing is an open access journal publishing service that provides a web-based platform for scholarly and academic publishing to the tulane community. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); etds ( ) media formats: text; images; audio; video disciplinary specialties: zoology; botany; international affairs; literary top publications: tulane studies in zoology and botany (journal); tulane review (journal); tulane journal of international affairs (journal); second line: an undergraduate journal of literary conversation (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): ojs/ocs/omp (library publishing coalition contributing institution) digital preservation strategy: dpn; centralized storage and backup through tulane technology services additional services: training; metadata; issn registration; author copyright advisory; other author advisory; hosting of supplemental content; audio/video streaming 
université de montréal université de montréal libraries primary unit: teaching, learning and research support primary contact: diane sauvé director, teaching, learning and research support - - ext. diane.sauve@umontreal.ca website: www.bib.umontreal.ca/papyrus program overview mission/description: the université de montréal institutional repository, papyrus, provides access to the university theses and dissertations, as well as to some publications and other forms of intellectual output from the university. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: etds ( ) media formats: text; images; audio; video campus partners: campus departments or programs publishing platform(s): dspace digital preservation strategy: no digital preservation services provided

university of alberta university of alberta libraries primary unit: digital initiatives primary contact: leah vanderjagt digital repository services librarian - - leah.vanderjagt@ualberta.ca website: guides.library.ualberta.ca/oa social media: listed at www.library.ualberta.ca program overview mission/description: the university of alberta libraries provides support to community members who want to publish in oa formats (e.g., providing journal hosting and institutional repository services). year publishing activities began: staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); videos media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: library and information studies; education; pharmaceutical sciences; sociology; environmental studies (particularly oil sands) top publications: canadian journal of sociology (journal); international journal of qualitative methods (journal); journal of pharmacy & pharmaceutical sciences (journal); evidence based library and information practice (journal); canadian review of comparative literature (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: public knowledge project; research teams/projects (e.g., oil sands research and information network; canadian writing research collaboratory); local non-profit organizations (e.g., edmonton social planning council). publishing platform(s): fedora; ojs/ocs/omp; wordpress; locally developed software digital preservation strategy: archive-it; archivematica; clockss; coppul; hathitrust; lockss; portico; in-house additional services: outreach; cataloging; metadata; doi assignment/allocation of identifiers; open url support; dataset management; author copyright advisory; other author advisory; digitization; hosting of supplemental content additional information: funding for publishing services comes out of the library operations budget. however, we do not have a fixed breakdown. 
we do not charge users for our publishing services and only publish open access content. plans for expansion/future directions: supporting the growth of our institutional repository and journal hosting services; facilitating the development of campus-wide scholarly publishing initiatives (e.g., establishing an open monograph publishing service, research data “publication” and curation), open educational resources (oer), etc.

university of arizona university of arizona libraries primary unit: scholarly publishing and data management team repository@u.library.arizona.edu primary contact: dan lee director, office of copyright management and scholarly communication - - leed@email.arizona.edu website: journals.uair.arizona.edu; arizona.openrepository.com/arizona program overview mission/description: the scholarly publishing and data management team provides tools, services, and expertise that enable the creation, distribution, and preservation of scholarly works and research data in support of the mission of the university of arizona. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . 
); graduate students ( ); undergraduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); periodicals ( ) media formats: text; images; audio; video; data disciplinary specialties: agriculture; life sciences; dendrochronology; archaeology; geosciences top publications: radiocarbon (journal); journal of ancient egyptian interconnections (journal); etds; coyote papers (working papers); arizona anthropologist (journal) (library publishing coalition founding institution) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: international society of lymphology; society for range management; tree ring society publishing platform(s): contentdm; dspace; ojs/ocs/omp; locally developed software digital preservation strategy: digital preservation services under discussion additional services: training; analytics; cataloging; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content plans for expansion/future directions: discussing collaborative efforts with the university press. 
highlighted publication: radiocarbon is the main international journal of record for research articles and date lists relevant to 14C and other radioisotopes and techniques used in archaeological, geophysical, oceanographic, and related dating. www.radiocarbon.org

university of british columbia university of british columbia library primary unit: digital initiatives and scholarly communications primary contact: allan bell director, digital initiatives and scholarly communications - - allan.bell@ubc.ca website: circle.ubc.ca program overview mission/description: digital initiatives and scholarly communication services supports new models of scholarly communications, copyright services, the showcasing of ubc’s intellectual output via open access repository services, as well as the digitization of unique historical materials. digital initiatives and scholarly communication services is a key part of the library’s strategy to support the evolving needs of faculty and students and to support teaching, research and learning at ubc. our goal is to create sustainable, world-class programs and processes that promote digital scholarship, make ubc research and digital collections openly available to the world, and ensure the long-term preservation of ubc’s digital collections. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); non-thesis graduate student research ( ) media formats: text; images; audio; video; data disciplinary specialties: mining engineering; forestry; education; sustainability; earth and ocean sciences top publications: “guidelines for mine haul road design” (technical report); “comparison of limit states design” (technical report); “pain-enduring eccentric exercise” (technical report); “portable science: podcasting as an outreach tool for a large academic science and engineering library” (technical report); “wet-bulb temperature” (technical report) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: archivematica; coppul; lockss; in-house. we participate in the coppul lockss pln. additional services: marketing; outreach; training; analytics; metadata; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming additional information: we also publish lower-level undergraduate work in our repository, for example, the science one program: circle.ubc.ca/handle/ / . development partner on the public knowledge project (pkp), including the creation and maintenance of user documentation and related training materials, offering hosting and related support, performing testing, participating on pkp’s advisory and technical committees, and seeking further areas for cooperation. 
university of calgary libraries and cultural resources primary unit: centre for scholarly communication primary contact: tim au yeung coordinator, digital repository technologies - - ytau@ucalgary.ca program overview mission/description: the centre for scholarly communication provides innovative solutions for the creation, evaluation, dissemination, and preservation of the research output of the academy. a priority for libraries and cultural resources, the centre enables scholars through: sustainable electronic publishing using a variety of platforms; robust dissemination of digital collections in multiple formats; a platform for partnerships and discussion of trends and ideas; and solutions for longer term preservation of digital collections. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); occasional papers media formats: text; images; audio; video; data top publications: arctic (journal); ariel: a review of international english literature (journal); journal of military and strategic studies (journal); etds percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: scholarly societies (e.g., canadian evaluation society); research institutes (e.g., arctic institute of north america); individual faculty at other canadian universities (e.g., university of saskatchewan) publishing platform(s): contentdm; dspace; ojs/ocs/omp; locally 
developed software digital preservation strategy: archivematica; coppul; duracloud/dspace; synergies; in-house; digital preservation services under discussion additional services: graphic design (print or web); outreach; training; analytics; cataloging; metadata; doi assignment/allocation of identifiers; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: the university press is a unit within the library. working collaboratively, the library and press share expertise and technologies to support and extend scholarly publishing services. changes resulting from the integration include transition of press journals to library-hosted online journals (most now open access) and the initiation of open access book publishing. for this survey, activities associated with books under our press imprint were not included.

university of california, berkeley institute for research on labor and employment library primary unit: the irle library web team primary contact: terence k. huwe director of library and information resources - - thuwe@library.berkeley.edu website: www.irle.berkeley.edu program overview mission/description: the irle library uses digital technologies to promote the scholarly content created by the institute for research on labor and employment as well as its affiliated faculty, students, and visiting scholars. year publishing activities began: organization: individual units create their own library publishing services, but take care to work with the campus-wide and system-wide resources staff in support of publishing activities (fte): library staff ( ); graduate students ( ); undergraduate students ( . 
) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); in addition to working papers; conference papers and policy reports; gis web resources media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: employment and wage studies; employment in the “green economy”; public sector labor relations; sociology; management of organizations/organizational behavior top publications: “hidden cost of wal-mart jobs: use of safety net programs by wal-mart workers in california” (technical report); “ california establishment survey: preliminary findings on employer based healthcare reform” (technical report); “the impact of san francisco’s employer health spending requirement: initial findings from the labor and product markets” (technical report); “impact of sb on health coverage” (technical report) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: california studies association publishing platform(s): bepress (digital commons); dspace; wordpress; locally developed software digital preservation strategy: uc merritt; in-house; digital preservation services under discussion additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; issn registration; doi assignment/allocation of identifiers; dataset management; business model development; budget preparation; contract/license preparation; other author advisory; digitization; hosting of supplemental content plans for expansion/future 
directions: beginning to use the epub format for full-length ebook sales by third party outlets.

university of california system california digital library primary unit: access and publishing group primary contact: catherine mitchell director, access and publishing group - - catherine.mitchell@ucop.edu website: www.escholarship.org social media: @escholarship; facebook.com/escholarship program overview mission/description: escholarship provides a suite of open access, scholarly publishing services and research tools that enable departments, research units, publishing programs, and individual scholars associated with the university of california to have direct control over the creation and dissemination of the full range of their scholarship. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data disciplinary specialties: law; romance languages/classics; environmental studies; architecture/urban planning; linguistics/literary studies (library publishing coalition contributing institution) top publications: “assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines” (technical/research report); dermatology online journal (journal); journal of transnational american studies (journal); western journal of emergency medicine (journal); the traffic in praise: pindar and the poetics of social economy (monograph) percentage of journals that are peer 
reviewed: campus partners: uc press; campus departments or programs; individual faculty; graduate students; undergraduate students other partners: public knowledge project; uc campus libraries; pubmed; biomed central publishing platform(s): ojs; locally developed software digital preservation strategy: uc merritt additional services: outreach; training; analytics; cataloging; doi assignment/ allocation of identifiers; open url support; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: identify opportunities to support new modes of research by investigating the needs of digital humanities scholars; explore to what extent altmetrics and commenting/annotation provide utility to researchers in different disciplines by experimenting with the provision of related tools and technologies; improve the quality of escholarship journals by providing baseline standards and guidance regarding best practices for oa publications; empower escholarship contributors to better understand and manage their copyright and publishing choices; improve the ability of escholarship research units to more robustly interact with escholarship by completing an administrative interface project (begun in – ) that provides them with expanded capabilities to control their publication environment within escholarship; continue to build relationships with and contribute to the broader digital library publishing community via our major development partnership with the public knowledge project; develop and formalize user community engagement processes for access and publishing services in order to leverage super-user knowledge/ practices, better align development priorities with user needs, raise awareness of new features/development agenda, work more directly with campus contacts and increase outreach opportunities to new 
users.

university of central florida john c. hitt library primary unit: information technology and digital initiatives primary contact: lee dotson digital initiatives librarian - - lee.dotson@ucf.edu program overview mission/description: the ucf libraries currently provides publishing support for honors theses, graduate etds, and ucf affiliated or ucf faculty-edited open access e-journals. efforts to support broader dissemination of scholarship include enabling access to a wide audience through freely accessible databases and using open journal systems (ojs) open source publishing software to publish electronic journals from scratch and host electronic journals in florida oj. the ucf libraries collaborates with the florida virtual campus to provide these services. year publishing activities began: organization: services are distributed across library units/departments publishing activities types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text campus partners: campus departments or programs; individual faculty other partners: florida virtual campus publishing platform(s): ojs/ocs/omp; locally developed software digital preservation strategy: fcla daitss additional services: outreach; training; analytics; cataloging; metadata; hosting of supplemental content (library publishing coalition contributing institution)

university of colorado anschutz medical campus health sciences library primary contact: heidi zuniga electronic resources librarian - - heidi.zuniga@ucdenver.edu program overview mission/description: the university of colorado anschutz medical campus digital repository will reflect the university’s excellence; support the rapid dissemination of research; foster at all levels understanding and appreciation of the value of research, learning, and teaching at cu anschutz medical campus; ensure future, persistent, and reliable access to intellectual assets. 
year publishing activities began: organization: services are distributed across several campuses staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video; data; multimedia/interactive content disciplinary specialties: health sciences top publications: etds percentage of journals that are peer reviewed: campus partners: individual faculty publishing platform(s): digitool by exlibris digital preservation strategy: digital preservation services under discussion additional services: marketing; outreach; cataloging; metadata; author copyright advisory; digitization additional information: we don’t consider ourselves to be a “library as publisher” institution at this point, but we certainly do disseminate etds and other resources. plans for expansion/future directions: publishing works from recipients of an open access journal fund program, also administered by our library, which helps authors pay for oa costs; seeing growth in research datasets, and other material that doesn’t normally get published but may be of value to researchers; monitoring the publication output of our researchers and trying to direct those articles toward the repository. 
university of colorado denver auraria library primary unit: special collections and digital initiatives primary contact: matthew mariner head of special collections and digital initiatives - - matthew.mariner@ucdenver.edu website: digitool.library.colostate.edu/r/?func=collections&collection_id= program overview mission/description: the mission of the auraria digital library program is to securely host, faithfully present, and freely distribute cultural, historical, educational, and scholarly content to auraria campus constituents and the interested public. the curation of scholarly publications, or the intellectual output of auraria campus staff, faculty, and students is of particular importance as it serves to promote and legitimize the activities of our institutions amongst our peers. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: etds ( ) media formats: text; images; audio; video campus partners: campus departments or programs other partners: university of colorado denver graduate school publishing platform(s): digitool digital preservation strategy: amazon glacier additional services: author copyright advisory; other author advisory; digitization; audio/video streaming additional information: auraria library actually serves three unaffiliated schools on one campus (cu denver; metropolitan state university of denver; and community college of denver). currently, only cu denver grants graduate degrees requiring a thesis or dissertation, but said school recently made etds mandatory. these are submitted to proquest, but co-delivered to the library, where they are hosted and made publicly available. 
we hope to add more capacity for inclusion of undergraduate works (capstones, undergrad research) that would be published solely in our repository (unlike etds, which are technically also held by proquest). in addition to these activities, our scholarly communications librarian jeffrey beall offers advice to faculty regarding publishing, but he is currently forming plans to offer these services more concretely and publicly (i.e., campus-wide). plans for expansion/future directions: offering a space for unpublished undergraduate works, which are often ignored, but given auraria’s diverse and undergraduate-focused constituency, demand emphasis.

university of florida george a. smathers libraries primary unit: digital library center ufdc@uflib.ufl.edu primary contact: judy russell dean of university libraries - - jcrussell@ufl.edu website: digital.uflib.ufl.edu; ufdc.ufl.edu program overview organization: services are distributed across library units/departments publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); newsletters ( ); etds ( ); databases ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations disciplinary specialties: caribbean studies; entomology; african studies; psychology; physical therapy top publications: arl pd bank (database); vodou archive (digital scholarship database and archive); african studies quarterly (journal); interamerican journal of psychology (journal); florida entomologist (journal); journal of undergraduate research (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: florida virtual campus (flvc); internet archive; digital library of the caribbean (dloc); university press of florida; florida museum of natural history publishing platform(s): ojs/ocs/omp; locally developed software (sobekcm) digital 
preservation strategy: fcla daitss; in-house additional services: outreach; analytics; cataloging; metadata; dataset management; author copyright advisory; digitization; hosting of supplemental content (library publishing coalition contributing institution)

university of georgia university of georgia libraries primary unit: digital library of georgia primary contact: andy carter digital projects archivist - - cartera@uga.edu program overview mission/description: our general objectives are to identify valuable, but overlooked, work from faculty and students, and increase the amount of uga’s scholarly output that is available via open access. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ) media formats: text; images; audio disciplinary specialties: higher education campus partners: campus departments or programs; individual faculty publishing platform(s): dspace; ojs/ocs/omp additional services: cataloging; metadata; doi assignment/allocation of identifiers; author copyright advisory; digitization plans for expansion/future directions: fine-tuning etd platform; expanding journal hosting efforts using ojs, depending on need and interest on campus. (library publishing coalition contributing institution)

university of guelph university of guelph library primary unit: research enterprise and scholarly communication primary contact: wayne johnston head, research enterprise and scholarly communication - - ext. wajohnst@uoguelph.ca program overview mission/description: we seek to disseminate and preserve the scholarly output of the university. we believe open access, both green (self-archiving) and gold (open access journals), is critical to this objective. 
more broadly, we also seek to promote the digitization and dissemination of canadian scholarly journal content. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ) media formats: text; images; audio; video; data disciplinary specialties: agriculture; veterinary sciences; arts; history; international development top publications: critical studies in improvisation (journal); international review of scottish studies (journal); partnership: the canadian journal of library and information practice and research (journal); synergies canada (journal); studies by undergraduate researchers at guelph (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: scholarly societies; national organizations; provincial consortia publishing platform(s): dspace; fedora; ojs/ocs/omp; dataverse digital preservation strategy: duracloud/dspace; scholars portal; synergies additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; dataset management; author copyright advisory; other author advisory; audio/video streaming

university of hawaii at manoa university of hawaii at manoa libraries primary unit: desktop network services primary contact: beth tillinghast web support librarian, institutional repositories manager - - betht@hawaii.edu program overview mission/description: though the university of hawaii at manoa
currently does not have a formal library publishing program, our library is involved in providing publishing services through the various collections hosted in our institutional repository, scholarspace. we provide the hosting services for numerous department journal publications, conference proceedings, technical reports, department newsletters, as well as open access to some dissertations and theses. the publishing activities are consistent with our mission of acquiring, organizing, preserving, and providing access to information resources vital to the learning, teaching, and research mission of the university of hawaii at manoa. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ) funding sources (%): library operating budget ( ); non-library campus budget ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ); datasets media formats: text; images; audio; video; data disciplinary specialties: language documentation; social work; entomology; pacific islands culture; southeast asian culture (library publishing coalition contributing institution) top publications: language documentation and conservation (journal); ethnobotany research and applications (journal); the contemporary pacific (journal); journal of indigenous social development (journal); explorations (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): dspace digital preservation strategy: archive-it; portico; in-house additional services: doi assignment/allocation of identifiers; dataset management; author
copyright advisory; digitization; hosting of supplemental content

university of idaho university of idaho library primary unit: digital initiatives primary contact: devin becker digital initiatives librarian - - dbecker@uidaho.edu website: www.lib.uidaho.edu/digital; journals.lib.uidaho.edu program overview mission/description: the digital initiatives department works to preserve and make accessible publications and other research products from researchers and affiliates of the university of idaho via its open access publishing capabilities. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: student-driven journals ( ); databases ( ) media formats: text; images; audio; video; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: rangeland ecology and management; creative writing top publications: fugue (journal); journal of rangeland applications (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs publishing platform(s): contentdm; ojs/ocs/omp digital preservation strategy: in-house additional services: graphic design (print or web); typesetting; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; doi assignment/allocation of identifiers; open url support; digitization; hosting of supplemental content additional information: we also publish a number of digital collections of historical images and documents. plans for expansion/future directions: bringing etds online; using etds to start developing a more robust (and visible) institutional repository.
university of illinois at chicago university library primary unit: scholarly communications escholarship@uic.edu primary contact: sandy de groote scholarly communications librarian - - sgroote@uic.edu website: library.uic.edu/home/services/escholarship program overview mission/description: the objective/mission of the uic university library publishing program is to advance scholarly knowledge in a cost-effective manner. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . ) funding sources (%): library operating budget ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); newsletters ( ); etds ( ) media formats: text; images; data disciplinary specialties: social work; internet studies; public health informatics top publications: first monday (journal); online journal of public health informatics (journal); behavior and social issues (journal); uncommon culture (journal); journal of biomedical discovery and collaboration (journal) percentage of journals that are peer reviewed: campus partners: individual faculty (library publishing coalition founding institution) publishing platform(s): contentdm; dspace; ojs/ocs/omp; inera extyles digital preservation strategy: hathitrust; lockss additional services: graphic design (print or web); typesetting; marketing; training; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; author copyright advisory; digitization; word to xml conversion plans for expansion/future directions: exploring monograph publishing.
highlighted publication: first monday is one of the first openly accessible, peer-reviewed journals on the internet, solely devoted to the internet. firstmonday.org/index

university of iowa university of iowa libraries primary unit: digital research and publishing lib-ir@uiowa.edu primary contact: wendy robertson digital scholarship librarian - - wendy-robertson@uiowa.edu website: www.lib.uiowa.edu/drp/publishing social media: @iowareso program overview mission/description: digital research and publishing explores ways that academic libraries can best leverage digital collections, resources, and expertise to support faculty and student scholars by: collaborating on interdisciplinary scholarship built upon digital collections; offering publishing services to support sustainable scholarly communication; engaging the community through participatory digital initiatives; promoting widespread use and reuse of locally built repositories and archives; and advancing new technologies that support digital research and publishing. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ); graduate students ( .
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ) media formats: text (library publishing coalition contributing institution) top publications: walt whitman quarterly review (journal); medieval feminist forum (journal); proceedings in obstetrics & gynecology (journal); iowa journal of cultural studies (journal); poroi (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: society for medieval feminist scholarship publishing platform(s): bepress (digital commons); contentdm; wordpress digital preservation strategy: archive-it; lockss; in-house; digital preservation services under discussion additional services: cataloging; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; peer review management; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: working on adding additional services, such as dois, html versions of articles, and possibly some formatting of content; assessing campus needs for datasets.

university of kansas ku libraries primary unit: center for faculty initiatives and engagement kuscholarworks@ku.edu primary contact: marianne reed digital information specialist - - mreed@ku.edu website: journals.ku.edu program overview mission/description: digital publishing services provides support to the ku community for the design, management, and distribution of online publications, including journals, conference proceedings, monographs, and other scholarly content.
we help scholars explore new and emerging publishing models in our changing scholarly communication environment, and we help monitor and address campus concerns and questions about electronic publishing. these services are intended to enable online publishing for campus publications, and help make their content available in a manner that promotes increased visibility and access, and ensures long-term stewardship of the materials. year publishing activities began: organization: centralized library publishing unit/department; transitioning to distribution across library units staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); occasional lectures; oral histories and interviews media formats: text; audio; video (library publishing coalition contributing institution) disciplinary specialties: philosophy; natural science; humanities; oral history and interviews; linguistics top publications: biodiversity informatics (journal); american studies (journal); latin american theater review (journal); kansas working papers in linguistics (working papers); treatise online (preprints) percentage of journals that are peer reviewed: campus partners: individual faculty; graduate students publishing platform(s): dspace; ojs/ocs/omp; xtf digital preservation strategy: portico; digital preservation services under discussion additional services: outreach; training; analytics; cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming; isbns; consulting on publishing models and issues plans for expansion/future directions: some services are ongoing.
a strategic initiative to expand the program is pending. university of kentucky university of kentucky libraries primary unit: department of digital scholarship uknowledge@lsv.uky.edu primary contact: adrian k. ho director of digital scholarship - - adrian.ho@uky.edu website: uknowledge.uky.edu program overview mission/description: the university of kentucky (uk) libraries launched an institutional repository (uknowledge) in late to champion the integration and transformation of scholarly communication within the uk community. the initiative sought to improve access by students, faculty, and researchers to appropriate resources for maximizing the dissemination of their research and scholarship in an open and digital environment. a crucial component of uknowledge is providing publishing services to broadly disseminate scholarship created or sponsored by the uk community. we provide a flexible platform to publish a variety of scholarly content and to expand the discoverability of the published works. additionally, we are establishing a separate digital repository for the long-term preservation of the published content and research datasets. using state-of-the-art technologies, we are able to offer campus constituents sought-after services in different stages of the scholarly communication life cycle to help them thrive and succeed. we also inform them of scholarly communication issues such as open access, author rights, and the economics of journal publishing. providing library publishing services is one avenue through which we are making significant contributions to the fulfillment of uk’s mission. 
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library materials budget ( ); charge backs ( ) (library publishing coalition founding institution) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); image galleries/virtual exhibits ( ) media formats: text; images disciplinary specialties: higher education; hispanic studies; public health; undergraduate research (multidisciplinary) top publications: kentucky journal of higher education policy and practice (journal); nomenclatura: aproximaciones a los estudios hispánicos (journal); frontiers in public health services and systems research (journal); kaleidoscope: the university of kentucky journal of undergraduate scholarship (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house additional services: graphic design (print or web); training; analytics; cataloging; metadata; notification of a&i sources; issn registration; open url support; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content

highlighted publication: frontiers in public health services and systems research provides quick, open access to actionable public health infrastructure research to improve public health practices.
uknowledge.uky.edu/frontiersinphssr plans for expansion/future directions: strengthening existing library publishing partnerships; bringing more campus constituents on board; building upon our current library publishing services (e.g., partnering with the uk graduate school to complete the integration of our library publishing services into the workflow as they implement an electronic thesis and dissertation mandate); pursuing additional opportunities to collaborate with various campus units in support of undergraduate research as we celebrate uk students’ academic achievements by making them visible and accessible worldwide; assisting uk-based print journals to create their online presence and extend their reach beyond academia; exploring data publishing in partnership with uk researchers; continuing to advocate open access and open licensing as well as inform the uk community of new scholarly communication practices such as alternative metrics, open peer review, and researcher identity management; making uknowledge the primary online publishing avenue for uk-based research and scholarship.

university of maryland college park mckeldin library primary unit: digital stewardship primary contact: terry m. owen drum coordinator - - towen@umd.edu website: publish.lib.umd.edu; drum.lib.umd.edu program overview mission/description: capture, preserve, and provide access to the output of university of maryland faculty, researchers, centers, and labs. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( .
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ) campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: in-house; digital preservation services under discussion additional services: analytics; metadata; issn registration; author copyright advisory; hosting of supplemental content plans for expansion/future directions: expanding into epublishing in , including faculty and student-produced e-publications. (library publishing coalition contributing institution)

university of massachusetts amherst w.e.b. du bois library primary unit: office of scholarly communication scholarworks@library.umass.edu primary contact: marilyn s. billings scholarly communication & special initiatives librarian - - mbillings@library.umass.edu website: scholarworks.umass.edu program overview mission/description: scholarworks@umass amherst, an open access digital repository service, was established in to provide a digital showcase of the unique research and scholarly outputs of members of the university of massachusetts amherst community. it provides a platform for the distribution of content such as electronic dissertations, master’s theses, and capstone projects as well as scholarly output of academic departments, research centers, and institutes. scholarworks provides a wide variety of scholarly publishing services including: online journal publishing and conference management system; collaboration with scholarly presses to provide permanent location and urls for supplementary content for scholarly monographs, texts, and other scholarly materials.
scholarworks provides many services for research support that can be used in conjunction with grant applications, which now require applicants to detail how the results of the funded research will be showcased and disseminated. the scholarworks service can be included as part of the overall data management strategy for research results, reports, new journal services, conference proceedings, etc. these value-added services enhance the professional visibility for faculty and researchers and provide excellent search and retrieval facilities and broader dissemination as well as increased use of materials through services such as google scholar and other internet search engines. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . ) (library publishing coalition founding institution) funding sources (%): library materials budget ( ); library operating budget ( ); non-library campus budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); graduate student capstones and practicums media formats: text; images; audio; video; data disciplinary specialties: anthropology; engineering; community engagement; nursing; hospitality and tourism top publications: “how to do case study research” (technical report); “the impact of language barrier & cultural differences on restaurant experiences: a grounded theory approach” (conference proceedings); “theme park development costs: initial investment cost per first year attendee” (conference proceedings); “the form of the preludes to bach’s
unaccompanied cello suites” (thesis); “ratio analysis for the hospitality industry: a cross sector comparison of financial trends in the lodging, restaurant, airline and amusement sectors” (journal article) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons); eprints; fedora digital preservation strategy: lockss; in-house; digital preservation services under discussion additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content additional information: we are looking into the possibilities of coordinating more closely with our university press on a variety of services. they have expertise but not the additional time to assist with the types of publishing services faculty are starting to ask for, such as copy-editing and proofing. we are also members of the networked digital library of theses and dissertations (ndltd). plans for expansion/future directions: exploring additional publication services in collaboration with other groups on campus (copy-editing, proofing, graphic design, referral services); engaging in more extensive collaboration with the office of research on data management, intellectual property/copyright; and expanding into capturing undergraduate student work/projects.

highlighted publication: communication + provides an open forum for exploring and sharing ideas about communication across modes of inquiry and perspectives. its primary objective is to push the theoretical frontiers of communication as an autonomous and distinct field of research.
scholarworks.umass.edu/cpo

university of massachusetts medical school lamar soutter library primary unit: research and scholarly communication services primary contact: rebecca reznik-zellen head of research & scholarly communication services - - rebecca.reznik-zellen@umassmed.edu website: escholarship.umassmed.edu/about.html program overview mission/description: escholarship@umms is a digital repository offering worldwide access to the research and scholarly work of the university of massachusetts medical school community. the goal is to bring together the university’s scholarly output in order to enhance its visibility and accessibility. we help individual researchers and departments organize and publicize their research beyond the walls of the medical school, archiving publications, posters, presentations, and other materials they produce in their scholarly pursuits. our publishing services, including the journal of escience librarianship and two other open access peer-reviewed electronic journals, student dissertations and theses, and conference proceedings, highlight the works of university of massachusetts medical school authors and others.
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); monographs ( ); textbooks ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); finding aids ( ); book reviews ( ) media formats: text; images; audio; video; multimedia/interactive content (library publishing coalition contributing institution) disciplinary specialties: library science; psychiatry/mental health research; neurology; clinical and translational science; life sciences top publications: journal of escience librarianship (journal); etds; psychiatry information in brief (journal); neurological bulletin (journal); a history of the university of massachusetts medical school (e-book) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house; digital preservation services under discussion additional services: copy-editing; marketing; outreach; training; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; author copyright advisory; hosting of supplemental content; audio/video streaming; altmetrics data plans for expansion/future directions: expanding our publishing services to additional departments within the medical school, incorporating more multimedia, and enhancing publications with altmetrics data.
university of michigan university library primary unit: michigan publishing mpublishing@umich.edu website: www.publishing.umich.edu social media: @m_publishing program overview mission/description: michigan publishing is the hub of scholarly publishing at the university of michigan, and is a part of its dynamic and innovative university library. our mission as publishers, librarians, copyright experts, and technologists is to support the communications needs of scholars, and to publish, promote, and preserve the scholarly record. year publishing activities began: organization: centralized library publishing unit/department funding sources (%): library operating budget ( ); sales revenue ( ) publishing activities campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: open humanities press; american council of learned societies publishing platform(s): dspace; wordpress; locally developed software digital preservation strategy: hathitrust; in-house additional services: typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; other author advisory (library publishing coalition contributing institution)

university of minnesota university of minnesota libraries primary unit: content and collections division jkirchne@umn.edu primary contact: joy kirchner aul for content & collections - - jkirchne@umn.edu program overview year publishing activities began: organization: services are distributed across library units/departments funding sources (%): library operating budget ( ); endowment income ( ); grants ( ) publishing activities types of publications: faculty conference papers and proceedings ( ); etds ( ); working papers; blogs; online dictionary media formats: text; images; audio; video;
data campus partners: individual faculty publishing platform(s): contentdm; dspace; movabletype; drupal digital preservation strategy: clockss; duracloud/dspace; hathitrust; portico; omeka additional services: training; analytics; metadata; open url support; dataset management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: currently developing a program in a new division. (library publishing coalition contributing institution)

university of nebraska-lincoln university of nebraska-lincoln libraries primary unit: zea books/office of scholarly communications proyster@unl.edu primary contact: paul royster publisher, zea books - - proyster@unl.edu website: digitalcommons.unl.edu/zea program overview mission/description: zea books is the digital and on-demand publishing operation of the university of nebraska-lincoln libraries. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; data; concept maps/modeling maps/visualizations; multimedia/interactive content campus partners: individual faculty other partners: nebraska academy of sciences; center for great plains studies; textile society of america; lester a.
larsen tractor and power museum; center for systemic entomology; nebraska ornithological union publishing platform(s): bepress (digital commons) additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; open url support; peer review management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content

university of north carolina at chapel hill university library primary unit: library administration primary contact: will owen associate university librarian for technical services and systems - - owen@email.unc.edu program overview mission/description: the library has historically published, in print, specialized monographs on topics related to the university or library. we publish etds electronically and provide digital editions and original scholarly interpretations in support of research and instruction with a special emphasis on the american south. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( .
) funding sources (%): library operating budget ( ) publishing activities types of publications: etds ( ); digital humanities research projects media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations disciplinary specialties: the american south top publications: documenting the american south (digital collection) campus partners: unc press; campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): contentdm; fedora; locally developed software digital preservation strategy: archive-it; hathitrust; in-house (carolina digital repository); internet archive additional services: training; cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: collaborating with researchers on archiving, preserving, and publishing research data; collaborating with unc press for print-on-demand publications.

university of north carolina at charlotte atkins library primary unit: digital scholarship lab atkins-dsl@uncc.edu primary contact: somaly kim wu digital scholarship librarian - - skimwu@uncc.edu website: journals.uncc.edu; dsl.uncc.edu/dsl/services/publication program overview mission/description: we support the publication of scholarly journals online and assist journal editors with the management, editorial work, and production of their scholarly journal. the dsl offers journal hosting support services to unc charlotte faculty. our services are built on the open journal systems (ojs) journal management software, which facilitates the publication of online peer-reviewed journals. dsl services include platform software hosting, updates, and copyright consulting.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ) media formats: text disciplinary specialties: education; psychology; urban education top publications: nhsa dialog (journal); urban education research and policy annuals (journal); undergraduate journal of psychology (journal) percentage of journals that are peer reviewed: campus partners: individual faculty publishing platform(s): ojs/ocs/omp digital preservation strategy: digital preservation services under discussion additional services: graphic design (print or web); training; issn registration; dataset management; author copyright advisory plans for expansion/future directions: building an institutional repository that is planned to be online within the year. contributing institution, library publishing coalition

university of north carolina at greensboro university libraries primary unit: collections and scholarly communications primary contact: beth bernhardt assistant dean for collection management and scholarly communications - - brbernha@uncg.edu program overview mission/description: still in development year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( .
) funding sources (%): other ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); databases ( ); etds ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: public health; education; nursing; sociology top publications: international journal of nurse practitioner educators (journal); the international journal of critical pedagogy (journal); journal of backcountry studies (journal); journal of learning spaces (journal); partnerships: a journal of service-learning and civic engagement (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): contentdm; ojs/ocs/omp; locally developed software digital preservation strategy: hathitrust; in-house; digital preservation services under discussion additional services: training; analytics; cataloging; metadata; author copyright advisory; other author advisory; digitization; hosting of supplemental content plans for expansion/future directions: hosting ojs for other regional libraries; supporting faculty in new scholarly media, such as database and ui design, web pages, and usability. founding institution, library publishing coalition

highlighted publication: a peer-reviewed, open-access journal published biannually, the journal of learning spaces provides a scholarly, multidisciplinary forum for research articles, case studies, book reviews, and position pieces related to all aspects of learning space design, operation, pedagogy, and assessment in higher education.
partnershipsjournal.org/index.php/jls

university of north texas university of north texas libraries primary unit: scholarly publishing services primary contact: martin halbert dean of libraries - - martin.halbert@unt.edu program overview mission/description: the unt libraries scholarly publishing services are a collaborative program between faculty and the library to develop new and innovative forms of scholarly publications, especially using digital technologies. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ) media formats: text; images; audio; video; data; multimedia/interactive content disciplinary specialties: electronic arts top publications: möbius journal (journal); the eagle feather (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: texas state historical association publishing platform(s): locally developed software digital preservation strategy: digital preservation services under discussion additional services: graphic design (print or web); metadata plans for expansion/future directions: cultivate new ideas for collaborative scholarly publications. founding institution, library publishing coalition

highlighted publication: möbius is a journal of the iarta (initiative for advanced research in technology and the arts) research group at the university of north texas.
moebiusjournal.org

university of oregon university of oregon libraries primary unit: digital scholarship center primary contact: john russell scholarly communications librarian - - johnruss@uoregon.edu website: library.uoregon.edu/digitalscholarship program overview mission/description: the digital scholarship center (dsc) collaborates with faculty and students to transform research and scholarly communication using new media and digital technologies. based on a foundation of access, sharing, and preservation, the dsc provides digital asset management, digital preservation, training, consultations, and tools for digital scholarship. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: humanities; gender studies top publications: ada: a journal of gender, new media, and technology (journal); konturen (journal); oregon undergraduate research journal (journal); humanist studies & the digital age (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: fembot collective publishing platform(s): contentdm; dspace; ojs/ocs/omp; wordpress digital preservation strategy: in-house additional services: graphic design (print or web); copy-editing; training; analytics; cataloging; metadata; issn
registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: increasing quality control over publications. university of pittsburgh university library system primary unit: office of scholarly communication and publishing oscp@mail.pitt.edu primary contact: timothy s. deliyannides director, office of scholarly communication and publishing - - tsd@pitt.edu website: www.library.pitt.edu/dscribe social media: @oscp_pitt program overview mission/description: the university library system, university of pittsburgh offers a full range of publishing services for a variety of content types, specializing in scholarly journals and subject-based open access repositories. because we are committed to helping research communities share knowledge and ideas through open and responsible collaboration, we subsidize the costs of electronic publishing and provide incentives to promote open access to scholarly research. our program promotes open access journal publishing at a very low cost; eliminates the high cost of print journal publication and distribution; allows easy collaboration among authors, editors, and reviewers regardless of location; enhances the visibility, searchability, and navigation of publications; and incorporates innovative and sustainable technologies to speed and facilitate scholarly publishing. we are seeking partners around the world who share our commitment to open access to scholarly research information. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ); non-library campus budget ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); government documents ( ); unpublished article manuscripts ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: latin american studies; european studies; history and philosophy of science; law; health sciences top publications: revista iberoamericana (journal); university of pittsburgh law review (journal); international journal of telerehabilitation (journal); archive of european integration (digital collection); philsci-archive (preprints) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: american forensic association; brunel university; consortium of indonesian universities–pittsburgh (kptip); fonds ricoeur; grupo biblios: international network of the development of library and information science; institute for linguistic evidence; institute of integrative omics and applied biotechnology; institute of public health, bangalore, india; instituto internacional de literatura iberoamericana; kadir has university; laps/ensp oswaldo cruz foundation laps; motivational interviewing network of trainers (mint); pennsylvania library association; société américaine de philosophie de langue française; society for ricoeur studies; tale: the association for linguistic evidence; university of chapeco, department of anthropology; university of kingston centre for modern european philosophy publishing platform(s): eprints; fedora; islandora; wordpress; locally developed software digital preservation strategy: discoverygarden; hathitrust; lockss; in-house additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming founding institution, library publishing coalition

highlighted publication: the international journal of telerehabilitation (ijt) is a biannual journal dedicated to advancing telerehabilitation by disseminating information about current research and practices. lawreview.law.pitt.edu

university of san diego copley library primary unit: special collections and archives primary contact: kelly riddle digital initiatives librarian - - kriddle@sandiego.edu program overview mission/description: digital publishing at the university of san diego’s copley library offers the university community the opportunity to share research, scholarly works, and other unique resources of historical or intellectual value. the library’s digital publishing program will serve to advance faculty and student success and will foster intellectual collaboration both locally and globally. the library is dedicated to developing publishing services that will support and disseminate knowledge created or sponsored by the university so that it is readily discoverable, openly accessible, preserved, and sustainable. a goal of digital publishing will be to introduce faculty to a variety of new publishing models.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( ) funding sources (%): library operating budget ( ) publishing activities media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons); contentpro digital preservation strategy: digital preservation services under discussion additional services: outreach; training; cataloging; metadata; open url support; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming founding institution, library publishing coalition

university of south florida tampa library primary unit: academic resources scholarcommons@usf.edu primary contact: rebel cummings-sauls library operations coordinator - - rebelcs@usf.edu website: scholarcommons.usf.edu program overview mission/description: the usf tampa library strives to develop and encourage research collaboration and initiatives throughout all areas of campus. members of the usf community are encouraged to deposit their research with scholar commons. we commit to assisting faculty, staff, and students in all stages of the deposit process, to managing their work to optimize access/readership, and to ensuring long-term preservation. long-term preservation and increasing accessibility will increase citation rates and highlight the research accomplishments of this campus. scholar commons will have a direct impact on the university’s four strategic goals: student success, research innovation, sound financial management, and creating new partnerships.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( . ) funding sources (%): library operating budget ( ); endowment income ( ) publishing activities types of publications: journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ); oral histories; events and lectures; course material; grey/white works media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: geology and karst; holocaust and genocide; environmental sustainability; literature; math/quantitative literature top publications: etds; social science research: principle, methods, and practices (journal); international journal of speleology (journal); journal of strategic security (journal); studia ubb geologia (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: national cave and karst research institute (nckri); aphra behn society; union internationale de spéléologie; center for conflict management (ccm) of the national university of rwanda (nur); henley-putnam university; national numeracy network (nnn); iavcei commission on statistics in volcanology (cosiv); babeş-bolyai university; national center for suburban studies at hofstra university publishing platform(s): bepress (digital commons) digital preservation strategy: lockss; portico; in-house; digital preservation services under discussion. pln is being discussed. bepress also offers preservation and backups.
additional services: graphic design (print or web); typesetting; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content; audio/ video streaming; add dois to references; suggest pod services plans for expansion/future directions: adding a coordinator role; expanding all content areas; and we currently have three new journals in process. university of tennessee university of tennessee libraries primary unit: digital production and publishing/newfound press primary contact: holly mercer associate dean for scholarly communication & research services - - hollymercer@utk.edu website: www.newfoundpress.utk.edu; trace.tennessee.edu program overview mission/description: the university of tennessee libraries has developed a framework to make scholarly and specialized works available worldwide. newfound press, the university libraries digital imprint, advances the community of learning by experimenting with effective and open systems of scholarly communication. drawing on the resources that the university has invested in digital library development, newfound press collaborates with authors and researchers to bring new forms of publication to an expanding scholarly universe. ut libraries provides open access publishing services, copyright education, and services to help scholars meet new data management and sharing requirements. in addition, we create digital collections of regional and global importance to support research and teaching. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; multimedia/interactive content disciplinary specialties: east tennessee; great smoky mountains; anthropology; sociology; law top publications: the fishes of tennessee (monograph); building bridges in anthropology (monograph); to advance their opportunities: federal policies toward african american workers from world war i to the civil rights act of (monograph); goodness gracious, miss agnes: patchwork of country living (monograph); “why we don’t vote: low voter turnout in u.s. presidential elections” (thesis) percentage of journals that are peer reviewed: campus partners: ut press; campus departments or programs; individual faculty; graduate students; undergraduate students other partners: southern anthropological society; music theory society of the mid-atlantic publishing platform(s): bepress (digital commons); locally developed software digital preservation strategy: duracloud/dspace; metaarchive additional services: graphic design (print or web); typesetting; copy-editing; marketing; analytics; cataloging; metadata; doi assignment/allocation of identifiers; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming; assignment of isbns plans for expansion/future directions: exploring how to cultivate data publishing and how to support digital humanities on campus. founding institution, library publishing coalition
highlighted publication: the wondrous bird’s nest i & ii (das wunderbarliche vogelnest) is the only complete english translation of the fourth of the five simplician novels by seventeenth-century german-language novelist grimmelshausen. newfoundpress.utk.edu/pubs/hiller

university of texas at san antonio university of texas at san antonio libraries primary unit: learning technology primary contact: posie aagaard assistant dean for collections and curriculum support - - posie.aagaard@utsa.edu program overview mission/description: the utsa libraries collaborate with faculty to disseminate original scholarly content using a variety of platforms, ensuring open access while simultaneously acknowledging reader preferences. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty conference papers and proceedings ( ) media formats: text; images; video; concept maps/modeling maps/visualizations disciplinary specialties: astronomy top publications: torus workshop (conference proceedings) campus partners: individual faculty; graduate students other partners: science organizing committee publishing platform(s): contentdm; worldcat.org digital preservation strategy: in-house. master copy is retained in a preferred file format; copies of the files are kept on local server (which has security, disaster recovery, and backup features) and also with oclc; metadata has been created to support ongoing longevity.
additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; contract/license preparation; author copyright advisory; hosting of supplemental content; audio/video streaming additional information: for our pilot publishing project, we collaborated with faculty who expressed a strong preference for using ibooks/itunes as a publishing platform because the primary audience for the material (astronomy scholars) prefers to consume content on ipads. in addition to producing an ibook, we produced a multimedia-pdf, converting the content to a more open format for wider access and preservation purposes. plans for expansion/future directions: actively seeking new opportunities to collaborate with faculty on publishing projects.

university of toronto university of toronto libraries primary unit: information technology services primary contact: sian meikle interim director, its - - sian.meikle@utoronto.ca website: jps.library.utoronto.ca; tspace.library.utoronto.ca program overview mission/description: the university of toronto libraries maintains both the open journal systems (ojs) and t-space, the university’s research repository, with the aim of preserving and making available the university’s scholarly contributions. we provide leadership and actively support scholarly communication needs by developing alternative forms of publication and viability models for the future that ensure the production and capture of research output. year publishing activities began: organization: services are distributed across several campuses staff in support of publishing activities (fte): library staff ( .
); graduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: medicine/health sciences; humanities; social sciences; physical/natural sciences percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: university of toronto press; ontario council of university libraries (ocul); canadian association of research libraries (carl) publishing platform(s): contentdm; dspace; fedora; islandora; ojs/ocs/omp; wordpress; bibapp digital preservation strategy: archive-it; duracloud/dspace; lockss; scholars portal; synergies; internet archive additional services: graphic design (print or web); outreach; training; cataloging; metadata; business model development; contract/license preparation; author copyright advisory; digitization; audio/video streaming plans for expansion/future directions: aligning focus.library.utoronto.ca (a more outwardly facing system for faculty profiling) with t-space, the repository; working on copyright issues with our recently hired scholarly communication/copyright librarian.

university of utah j.
willard marriott library primary unit: information technology primary contact: john herbert head, digital ventures - - john.herbert@utah.edu program overview year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); etds ( ) media formats: text; images; audio; video; concept maps/modeling maps/visualizations disciplinary specialties: law; environmental studies; foreign languages; political science top publications: utah law review (journal); hinckley journal of politics (journal); utah foreign language review (journal); utah environmental law review (journal) percentage of journals that are peer reviewed: campus partners: individual faculty; graduate students; undergraduate students publishing platform(s): contentdm; ojs/ocs/omp; wordpress digital preservation strategy: rosetta additional services: graphic design (print or web); outreach; metadata; doi assignment/allocation of identifiers; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming founding institution, library publishing coalition

highlighted publication: the utah historical review is the journal of student history published by the alpha rho chapter of phi alpha theta (national history honor society) at the university of utah.
utahhistoricalreview.com

university of victoria university of victoria libraries primary unit: scholarly publishing office press@uvic.ca primary contact: inba kehoe scholarly communications librarian - - press@uvic.ca website: journals@uvic.ca; dspace.library.uvic.ca: program overview mission/description: uvic press represents the scholarly publishing expertise for the university of victoria and its partner institutions and associations. we are dedicated to the online dissemination of knowledge and research through open access of journals, monographs, and other forms of publication. uvic press offers an imprint to scholarship of a high quality, determined through peer review. we will work with emerging writers and research to promote success in scholarly publishing. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); etds ( ) media formats: text; images disciplinary specialties: humanities; social sciences; disability services; writing; creative fiction top publications: philosophy in review (journal); working papers of the linguistics circle (journal); international journal of child, youth and family studies (journal); canadian zooarchaeology (journal); appeal: review of current law and law reform (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: public knowledge project; canadian association of learned journals; universities art association of canada; association for borderlands studies publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy:
coppul; lockss; synergies additional services: copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; issn registration; contract/license preparation; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: developing a fully functional university publishing program that will include publishing of journals, conference proceedings, and books. the program will include various imprints under the university press umbrella. university of washington university of washington libraries primary unit: digital initiatives primary contact: ann lally head, digital initiatives - - alally@uw.edu website: researchworks.lib.washington.edu program overview mission/description: the university of washington libraries researchworks service provides faculty, researchers, and students with tools to archive and/or publish the products of research including datasets, monographs, images, journal articles, and technical reports. year publishing activities began: organization: services are distributed across several campuses staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); research notebooks media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations disciplinary specialties: information studies; anthropology; fisheries; native american studies percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students [contributing institution, library publishing coalition] other partners: indo-pacific prehistory association; society for slovene studies publishing platform(s): contentdm; dspace; ojs/ocs/omp digital preservation strategy: university escience dark archive additional services: graphic design (print or web); training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; peer review management; contract/license preparation; author copyright advisory; digitization; hosting of supplemental content university of waterloo university of waterloo library primary unit: digital initiatives primary contact: pascal calarco aul, research & digital discovery services - - ext. pvcalarco@uwaterloo.ca program overview mission/description: enabling original scholarly research at the university of waterloo from faculty, students, and staff. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: student-driven journals ( ); journals produced under contract/mou for external groups ( ); newsletters ( ); databases ( ); etds ( ) media formats: text; images; audio; video; multimedia/interactive content disciplinary specialties: disability studies; mechanical engineering; sociology and criminology; food science top publications: engine: pre-print server for ieee society for vehicular technology (preprints); canadian journal of disability studies (journal); canadian graduate journal of sociology and criminology (journal); canadian journal of food safety (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students mailto:pvcalarco@uwaterloo.ca other partners: theses canada; canadian disability studies association; canadian association of food safety publishing platform(s): dspace; ojs/ocs/omp; locally developed software digital preservation strategy: archive-it; scholars portal; in-house; digital preservation services under discussion; theses canada additional services: analytics; cataloging; metadata; issn registration; business model development; author copyright advisory; digitization; hosting of supplemental content additional information: we have also participated in the networked digital library of theses and dissertations since . plans for expansion/future directions: extending to working papers, pre-prints, senior undergraduate work, and other original efforts. university of windsor leddy library primary unit: information services primary contact: dave johnston information services librarian, scholarly communications coordinator - - ext. djohnst@uwindsor.ca website: scholar.uwindsor.ca; ojs.uwindsor.ca/ojs/leddy/index.php; ocs.uwindsor. 
ca/ocs/index.php/pc/virtues social media: facebook.com/leddy.library program overview mission/description: the leddy library supports the dissemination of new scholarship by graduate, faculty, and staff researchers at the university of windsor in a variety of forms. through the scholarship at uwindsor repository, we are able to support the dissemination of theses and dissertations and thus provide increased visibility to the work of our graduate students. we also use the repository to support conferences run on our campus by helping the organizers manage the submission workflow and publication process. as a longstanding supporter of open journal systems, the library helps to publish and maintain several journals run from our campus, and we are currently in the process of using the new open monograph press software to help support electronic monograph publishing. providing support for open access is a central concern in all of our publishing endeavors. we seek to educate our users about the value of open access and to encourage various forms of open access publication. 
organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); faculty conference papers and proceedings ( ); etds ( ) media formats: text; images mailto:djohnst@uwindsor.ca scholar.uwindsor.ca ojs.uwindsor.ca/ojs/leddy/index.php ocs.uwindsor.ca/ocs/index.php/pc/virtues ocs.uwindsor.ca/ocs/index.php/pc/virtues disciplinary specialties: philosophy (information logic); social justice; scholarship of teaching and learning; philosophy (phenomenology); multivariate statistical techniques top publications: informal logic (journal); collected essays in teaching and learning (journal); applied multivariate research (journal); studies in social justice (journal); phaenex (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons); ojs/ocs/omp digital preservation strategy: lockss additional services: marketing; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: extending use of existing systems to support the publication of more journals and conferences; launching an open monograph series with the philosophy department. 
university of wisconsin–madison university of wisconsin–madison libraries primary unit: general library system primary contact: elisabeth owens special assistant to the vice provost for libraries - - eowens@library.wisc.edu website: parallelpress.library.wisc.edu; uwdc.library.wisc.edu program overview mission/description: the general library system publishes print and digital works featuring new works of scholars, researchers, and poets, and important scholarly and historical materials that are available for study in both print and digital formats. these publications are the result of collaborations with the scholarly community and represent an ongoing commitment by the libraries to scholarly communication as a contribution to the wisconsin idea and in support of the outreach mission of the university. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( . ); undergraduate students ( . 
) funding sources (%): library operating budget ( ); non-library campus budget ( ); endowment income ( ); charitable contributions/friends of the library organizations ( ); sales revenue ( ); other ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); reformatted works media formats: text; images; audio; video; data; concept maps/modeling maps/ visualizations disciplinary specialties: university of wisconsin; state of wisconsin; african studies; ecology and natural resources; decorative arts and material culture mailto:eowens@library.wisc.edu parallelpress.library.wisc.edu uwdc.library.wisc.edu top publications: wi land survey records (digital collection); foreign relations of the united states (digital collection); icelandic online (digital collection); africa focus (digital collection); decorative arts library (digital collection) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty publishing platform(s): dspace; fedora; ojs/ocs/omp; wordpress; locally developed software digital preservation strategy: clockss; hathitrust; lockss; in-house; digital preservation services under discussion additional services: graphic design (print or web); copy-editing; marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; budget preparation; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: the majority of the libraries’ publishing activities involve the reformatting and dissemination of new versions of existing resources. 
we do publish new material, and our responses are primarily reflective of these activities (as opposed to our digital collections and repository services). plans for expansion/future directions: increasing emphasis on open access publications and unique archival and special collections materials. utah state university merrill-cazier library primary unit: digital initiatives primary contact: becky thoms copyright librarian - - becky.thoms@usu.edu website: digitalcommons.usu.edu program overview mission/description: usu libraries is committed to the open dissemination of knowledge, as well as its delivery in new forms. our publishing efforts emphasize open access and a commitment to look beyond traditional monographs and scholarly articles to disseminate dynamic scholarly works that can incorporate multimedia and social communications-style input. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); faculty and student posters media formats: text; images; audio; video; data top publications: journal of indigenous research (journal); journal of mormon history (journal); journal of western archives (journal); foundations of wave phenomena (journal); an introduction to editing manuscripts for medievalists (monograph) percentage of journals that are peer reviewed: [founding institution, library publishing coalition] campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house. our digital publishing content is archived on bepress servers in several geographic locations; we also archive copies on an in-house server. some titles are preserved in hathitrust, and we are investigating dpn. additional services: graphic design (print or web); cataloging; metadata; author copyright advisory; digitization plans for expansion/future directions: building on existing collaborative relationship with the usu press to connect authors with freelance providers of traditional publisher services such as peer review management, copy-editing, and typesetting. highlighted publication: folklore and the internet is a pioneering examination of the folkloric qualities of the world wide web, e-mail, and related digital media. it shows that folk culture, sustained by a new and evolving vernacular, has been a key to language, practice, and interaction online. 
digitalcommons.usu.edu/usupress_pubs/ valparaiso university christopher center for library and information resources primary unit: christopher center library services scholar@valpo.edu primary contact: jonathan bull scholarly communication services librarian - - jon.bull@valpo.edu website: scholar.valpo.edu program overview mission/description: valposcholar, a service of the christopher center library and the valparaiso university law library, is a digital repository and publication platform designed to collect, preserve, and make accessible the academic output of valpo faculty, students, staff, and affiliates. year publishing activities began: organization: services are distributed across two libraries, the christopher center and the law library staff in support of publishing activities (fte): library staff ( ); graduate students ( ); undergraduate students ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); other conference proceedings media formats: text; images; audio; video; data disciplinary specialties: business and leadership ethics; creative writing (fiction); law top publications: valparaiso law review (journal); valparaiso fiction review (journal); the journal of values-based leadership (journal); third world legal studies (journal) percentage of journals that are peer reviewed: [contributing institution, library publishing coalition] campus partners: campus departments or programs; individual faculty; undergraduate students publishing platform(s): bepress (digital commons); contentdm digital preservation strategy: no digital preservation services provided additional services: typesetting; marketing; 
outreach; training; analytics; metadata; issn registration; open url support; dataset management; peer review management; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: this is a growing service that appears to be needed and well-received on campus. we expect only growth in the future, along with external partnerships and more faculty-student collaboration. vanderbilt university jean and alexander heard library primary unit: scholarly communications primary contact: clifford b. anderson director, scholarly communications - - clifford.anderson@vanderbilt.edu website: library.vanderbilt.edu/scholarly program overview mission/description: the jean and alexander heard library fosters emerging modes of open access publishing by providing scholarly, technical, and financial support for the digital dissemination of faculty, student, and staff publications. the library maintains several publishing initiatives through its scholarly communication program. currently, it publishes four peer-reviewed, open access journals—ameriquests, homiletic, vanderbilt e-journal of luso-hispanic studies, and the vanderbilt undergraduate research journal—using open journal systems software. it also hosts a database of electronic theses and dissertations in cooperation with the graduate school. additionally, the library distributes undergraduate capstone projects through its institutional repository. 
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images disciplinary specialties: american studies; homiletics; luso-hispanic studies top publications: ameriquests (journal); homiletic (journal); vanderbilt e-journal of luso-hispanic studies (journal); vanderbilt undergraduate research journal (journal) mailto:clifford.anderson@vanderbilt.edu http://library.vanderbilt.edu/scholarly percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: academy of homiletics publishing platform(s): dspace; ojs/ocs/omp; etd-db digital preservation strategy: in-house; lockss-etd additional services: outreach; training; cataloging; author copyright advisory plans for expansion/future directions: strengthening support for the publication of scientific datasets as well as projects in the digital humanities. villanova university falvey memorial library primary unit: falvey memorial library primary contact: darren g. poley interim library director - - darren.poley@villanova.edu program overview mission/description: in support of villanova university’s academic mission, the library is committed to the creation and dissemination of scholarship; utilizing digital modes and exploring new media for scholarly communication; and whenever possible, fostering open and public access to the intellectual contributions it publishes. organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); undergraduate capstone/honors theses ( ) media formats: text; images disciplinary specialties: american catholic studies; catholic higher education; theater; humanities; liberal arts and sciences top publications: journal of catholic higher education (journal); american catholic studies (journal); expositions (journal); praxis (journal); concept (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: american catholic historical society; association of catholic colleges and universities [contributing institution, library publishing coalition] publishing platform(s): ojs/ocs/omp digital preservation strategy: in-house additional services: graphic design (print or web); digitization virginia commonwealth university vcu libraries primary unit: information management and processing primary contact: john duke senior associate university librarian - - jkduke@vcu.edu website: digarchive.library.vcu.edu program overview mission statement: vcu’s digital press provides the tools, infrastructure, and support for unique digital scholarly expressions from the vcu community of faculty and students from all disciplines. year publishing activities began: organization: services are distributed across library units/departments total fte in support of publishing activities: library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: monographs ( ); student conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: history top publications: british virginia (monograph); “information technology outsourcing in u.s. hospital systems” (thesis); “a computational biology approach to the analysis of complex physiology” (thesis); “the effects of the handwriting without tears program” (thesis); “psychology and the theater” (thesis) internal partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): contentdm; dspace digital preservation strategy: in-house; digital preservation services under discussion additional services: marketing; outreach; training; cataloging; metadata; digitization additional information: vcu libraries recruited for a new professional position to advance research data management in the first quarter of academic year - ; it expects to launch a library publishing program and a full institutional repository later this year. plans for expansion/future directions: expanding the institutional repository to become a full partner in the share initiative; creating a publishing platform for existing journals published by vcu faculty and for new scholarly journals and output from the entire vcu community. 
virginia tech university libraries primary unit: center for digital research and scholarship primary contact: gail mcmillan director, center for digital research and scholarship services - - gailmac@vt.edu website: scholar.lib.vt.edu; ejournals.lib.vt.edu; vtechworks.lib.vt.edu program overview mission/description: the libraries support the virginia tech community’s needs (e.g., conference, journal, and book publishing; rights management and open access consulting, etc.) through digital publishing services. virginia tech has been hosting, providing access to, and preserving ejournals since , but we are new to supporting the full workflow from article submission to peer review, editing, and production. we launched ojs in december . year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); etds ( ); yearbooks; annual reports media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: technology education top publications: etds; journal of technology education (journal); alan review (journal); journal of industrial teacher education (journal); journal of technology studies (journal) [founding institution, library publishing coalition] percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: while our ejournal editors largely work on behalf of scholarly societies, we do not work directly with the societies. 
publishing platform(s): dspace; ojs/ocs/omp; locally developed software digital preservation strategy: lockss; metaarchive additional services: analytics; cataloging; metadata; doi assignment/allocation of identifiers; dataset management; contract/license preparation; author copyright advisory; hosting of supplemental content; audio/video streaming plans for expansion/future directions: consulting with editors about using ojs through cdrs services; inviting hosted ejournal editors to consider using ojs; launching ocs; and collaborating with our university community to consider other publishing services. highlighted publication: the journal of research in music performance is a peer-reviewed journal designed to provide presentation of a broad range of research that represents the breadth of an emerging field of study. ejournals.lib.vt.edu/jrmp wake forest university z. smith reynolds library primary unit: digital publishing kanewp@wfu.edu primary contact: william kane digital publishing - - kanewp@wfu.edu website: digitalpublishing.wfu.edu program overview mission/description: digital publishing at wake forest university helps faculty, staff, and students create, collect, and convert previously or otherwise unpublished works into digitally distributed books, journals, articles, and the like. 
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); non-library campus budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace; scalar; wordpress; tizra [contributing institution, library publishing coalition] digital preservation strategy: amazon glacier; amazon s ; hathitrust; in-house; digital preservation services under discussion additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; open url support; business model development; budget preparation; contract/license preparation; author copyright advisory; digitization; audio/video streaming plans for expansion/future directions: doubling the number of pages published year to year. washington university in st. louis university libraries primary unit: digital library services digital@wumail.wustl.edu primary contact: emily stenberg digital publishing and preservation librarian - - emily.stenberg@wustl.edu website: openscholarship.wustl.edu program overview mission/description: the mission of the washington university in st. 
louis libraries publishing program is twofold: to provide alternatives to traditional publishing avenues, and to promote and disseminate original scholarly work of the washington university community. washington university libraries began publishing etds in , and in , we launched the open scholarship repository to continue etd publication, to provide a platform for the open access re-publication of faculty articles, and to provide for original publication of online journals and monographs. since the launch of open scholarship, we have expanded into undergraduate honors theses and presentations, and have begun publishing monographs. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): endowment income ( ) publishing activities types of publications: monographs ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images top publications: “edith wharton: vision and perception in her short stories” (thesis); “added-tone sonorities in the choral music of eric whitacre” (thesis); “fashioning women under totalitarian regimes: ‘new women’ of nazi germany and soviet russia” (thesis); “computational fluid dynamics (cfd) modeling of mixed convection flows in building enclosures” (thesis); “sentimental ideology, women’s pedagogy, and american indian women’s writing: - ” (thesis) [founding institution, library publishing coalition] campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house additional services: graphic design (print or web); copy-editing; metadata; doi assignment/allocation of identifiers; other author advisory plans for expansion/future directions: bringing a small number of 
journals (currently in development) online in the coming year. wayne state university wayne state university library system primary unit: digital publishing unit primary contact: joshua neds-fox coordinator for digital publishing - - jnf@wayne.edu program overview mission/description: wayne state’s digital publishing unit works to make unique, important, or institutionally relevant scholarly content available to the world at large, in the context of the wsu library system’s digital platforms. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text percentage of journals that are peer reviewed: campus partners: wayne state university press; campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons); fedora digital preservation strategy: in-house; digital preservation services under discussion [founding institution, library publishing coalition] additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; author copyright advisory; other author advisory; digitization; hosting of supplemental content highlighted publication: jmasm is an independent, peer-reviewed, open access journal providing a scholarly outlet for applied (non)parametric statisticians, data analysts, researchers, psychometricians, quantitative or qualitative evaluators, and methodologists. 
digitalcommons.wayne.edu/jmasm western university western libraries primary unit: library information resources management wlscholcomm@uwo.ca primary contact: karen marshall assistant university librarian - - ext. karen.marshall@uwo.ca website: ir.lib.uwo.ca program overview mission/description: scholarship@western is a multi-functional portal that collects, showcases, archives, and preserves a variety of materials created or sponsored by the university of western ontario community. it aims to facilitate knowledge sharing and broaden the international recognition of western’s academic excellence by providing open access to western’s intellectual output and professional achievements. it also serves as a platform to support western’s scholarly communication needs and provides an avenue for compliance with research funding agencies’ open access policies. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video; data percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: scholarly societies; conferences libraries outside the united states and canada australian national university australian national university library primary contact: lorena kanellopoulos manager, anu e press + - - - lorena.kanellopoulos@anu.edu.au website: epress.anu.edu.au; digitalcollections.anu.edu.au; anulib.anu.edu.au program 
overview mission/description: the library aims to support anu by ’s goals of excellence in research and education and the university’s role as a national policy resource. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty publishing platform(s): dspace; wordpress digital preservation strategy: no digital preservation services provided additional services: graphic design (print or web); cataloging; author copyright advisory plans for expansion/future directions: for key strategic directions, see anulib.anu.edu.au/_resources/reports-and-publications/publications/library_ operational_plan_draft_ .pdf. 
edith cowan university edith cowan university library primary unit: research services researchonline@ecu.edu.au primary contact: julia gross senior librarian, research services + - - - j.gross@ecu.edu.au website: ro.ecu.edu.au program overview year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library materials budget ( ); library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; concept maps/modeling maps/visualizations disciplinary specialties: education; business; social and behavioral sciences; medicine and health sciences; arts and humanities top publications: australian journal of teacher education (journal); landscapes (journal); eculture (journal); journal of emergency primary health care (journal); research journalism (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: digital preservation services under discussion additional services: marketing; outreach; training; analytics; cataloging; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; author copyright advisory; digitization plans for expansion/future directions: increasing
numbers of journals published; investigating ebook publication.

humboldt-universität zu berlin universitätsbibliothek primary unit: arbeitsgruppe elektronisches publizieren primary contact: niels fromm head electronic publishing group + - - fromm@ub.hu-berlin.de website: edoc.hu-berlin.de program overview mission/description: the edoc-server is the institutional repository of humboldt university. on this server every member of the university is able to publish his or her electronic theses and/or any documents as open access. we accept anything from single articles or volumes to series of open access publications. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library operating budget ( ); other ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ) media formats: text percentage of journals that are peer reviewed: campus partners: campus departments or programs publishing platform(s): locally developed software digital preservation strategy: clockss; lockss; in-house additional services: cataloging; metadata; digitization; document templates for ms-office; styles for endnote/citavi plans for expansion/future directions: developing a concept and a workflow for the publication of research data in addition to electronic theses.

monash university monash university library primary unit: research infrastructure division primary contact: andrew harrison research repository librarian + - - - andrew.harrison@monash.edu program overview mission/description: publishing at monash university is carried out by monash university research repository and monash university publishing, both of which are parts of the university library.
monash university research repository is a digital archive of selected content representing monash’s research activity. the repository provides staff and students a place to deposit their research collections, data, or publications so they are centrally stored and managed, with the content easily discoverable online by their peers globally and by the broader community. the university requires that successful phd theses are submitted to the repository for online publication. the repository is intended to be primarily an open access repository but does contain restricted access content on a case by case basis (e.g., embargoed theses). monash university publishing focuses on peer-reviewed monographs, which are published in both online open access and traditional print forms—as such it is not included here. it seeks to publish scholarly work of the highest quality, ensured by rigorous peer review; maximise the impact of those titles; represent the breadth and energy of monash university research interests (while not excluding contributors from anywhere); promote the free exchange of knowledge; play a coordinating role in the production and dissemination of monash’s scholarly publications, and provide a body of publishing expertise within the university. 
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); etds ( ); research datasets such as images and sound files media formats: text; images; audio; data disciplinary specialties: geographic information systems (gis); comparative literature and cultural studies; social/community work top publications: pan: philosophy activism nature (journal); practice reflexions (journal); applied gis (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: australian community workers association publishing platform(s): fedora; vital digital preservation strategy: in-house additional services: doi assignment/allocation of identifiers additional information: journal publishing is largely a legacy service. our future focus is on theses and research data. we also think that separating the university press from the repository is unhelpful: we see them as complementary, and both are exploring new ways for libraries to be involved in publishing going forward. plans for expansion/future directions: expanding theses program to include master’s and phd candidates from disciplines previously exempt from the compulsory submission process; expanding the range of research data included in the repository. changes to australian funding council rules will increase the amount of open access journal material we hold.
swinburne university of technology swinburne library primary unit: information resources primary contact: nyssa parkes online projects librarian + - - - nparkes@swin.edu.au website: www.swinburne.edu.au/lib/ir/onlinejournals; commons.swinburne.edu.au program overview mission/description: the swinburne online journals service provides publishing support to swinburne faculties and research centres who publish online open access journals. we provide hosting software and technical assistance as well as help and advice on general online publishing and copyright issues. swinburne commons is the centralized service for the management and distribution of digital media content produced across swinburne. the commons draws together quality digital media content from across the university to highlight the research strengths, teaching excellence, student accomplishments, and unique aspects of swinburne. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): other ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); video and audio publishing; videos created at the university are disseminated centrally through the library’s service media formats: text; images; audio; video; multimedia/interactive content disciplinary specialties: mathematics (videos); telecommunications; psychology; settler colonial studies percentage of journals that are peer reviewed: campus partners: individual faculty other partners: telecommunications society of australia publishing platform(s): ojs/ocs/omp; locally developed software digital preservation strategy: digital preservation services under discussion.
additional services: graphic design (print or web); marketing; training; analytics; metadata; compiling indexes and/or tocs; issn registration; doi assignment/allocation of identifiers; contract/license preparation; audio/video streaming; copyright and permissions advice; technical advice (video and audio); accessibility advice additional information: www.swinburne.edu.au/lib/ir/onlinejournals/support.html, commons.swinburne.edu.au/toolkit.php plans for expansion/future directions: investigating monograph publishing; implementing software upgrades.

university of hong kong university libraries primary unit: technical services primary contact: david t. palmer associate university librarian + - - dtpalmer@hku.hk website: hub.hku.hk program overview mission/description: we make highly visible the research and researchers of our university through our efforts, in the expectation that new offers of collaboration, contract research, employment, and so forth will be received from the government, industry, and society.
year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video disciplinary specialties: social sciences percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace digital preservation strategy: in-house; digital preservation services under discussion additional services: cataloging; metadata; doi assignment/allocation of identifiers

university of south australia library primary unit: information resources and technology unisa-research-archive@unisa.edu.au primary contact: kate sergeant coordinator, repository & archive metadata services + - - - kate.sergeant@unisa.edu.au website: ura.unisa.edu.au program overview mission/description: the university of south australia has mandated that research degree students deposit a digital copy of their thesis in the university’s institutional repository, the unisa research archive. theses are made available under an open access publishing model with the application of a non-exclusive creative commons licence; however, copyright remains with the author. whilst it is the aim of the university that theses (and other research outputs) be made open access where possible, authors do have the option to restrict access to their thesis for two years. in order to capture previous research in an electronic format, the library embarked on a process of digitizing theses back to the foundation of the university of south australia in . these were loaded to the unisa research archive in .
as a result, the university has today over digitized research degree theses available in its repository. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) publishing activities types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ); digitized content media formats: text; images; audio; video disciplinary specialties: health; education; engineering; business; information technology top publications: graduation booklets; “innovation, globalisation and performance in smes” (thesis); “factors affecting online shopping behaviour” (thesis); calendars and handbooks; “service quality improvement in the hotel industry” (thesis) percentage of journals that are peer reviewed: campus partners: campus departments or programs; graduate students publishing platform(s): ojs/ocs/omp; digitool digital preservation strategy: university server additional services: graphic design (print or web); cataloging; metadata; author copyright advisory; digitization; handle publishing (persistent link)

library publishing coalition strategic affiliates

strategic affiliates are entities (including service providers, library networks and consortia, non-profit organizations, and others) that share a common interest in this emerging field. to become a strategic affiliate, contact the library publishing coalition’s program manager, sarah k. lippincott (sarah@educopia.org).
anvil academic
association of research libraries (arl)
bepress
bibliolabs
boston library consortium (blc)
coalition for networked information (cni)
council of australian university librarians (caul)
digital public library of america (dpla)
five colleges librarians council
hastac
knowledge unlatched
oapen
open access scholarly publishers association (oaspa)
public knowledge project (pkp)
sparc
society for scholarly publishing (ssp)
tizra

platforms, tools, and service providers

libraries work with a range of external software, tools, and service providers to support preservation, markup, conversion, hosting, allocation of identifiers, and other processes related to the publishing workflow. the following list compiles the names and websites of tools, software, and service providers employed by the libraries in this directory.

editorial/production
amazon createspace: www.createspace.com
backstage library works: www.bslw.com
bookcomp: www.bookcomp.com
calibre: www.calibre-ebook.com
charlesworth: www.charlesworth-group.com
data conversion laboratory, inc.: www.dclab.com
inera extyles: www.inera.com
ingram lightning source: www.lightningsource.com
media preserve: www.themediapreserve.com
oxygen: www.oxygenxml.com
scene savers: www.scenesavers.com
sigil: www.github.com/user-none/sigil
submittable: www.submittable.com
tips technical publishing: www.technicalpublishing.com
trigonix: www.trigonix.com/english
versioning machine: www.v-machine.org

platform/hosting/infrastructure
@mire: www.atmire.com/website
ambra: www.ambraproject.org
bepress: www.bepress.com
commons in a box: www.commonsinabox.org
connexions: www.cnx.org
contentdm: www.contentdm.org
dataverse: www.thedata.org
digitool by exlibris: www.exlibrisgroup.com/category/digitoolovervie
django web framework: www.djangoproject.com
dpubs: dpubs.org
drupal: www.drupal.org
dspace: www.dspace.org
ensemble: www.ensemblevideo.com
eprints: www.eprints.org/us
etd-db: scholar.lib.vt.edu/etd-db/index.shtml
xtf (extensible text framework): xtf.cdlib.org
fedora: www.fedora-commons.org
hubzero: www.hubzero.org
issuu: www.issuu.com
kaltura: www.corp.kaltura.com
omeka: www.omeka.org
ojs/ocs/omp: pkp.sfu.ca/ojs; pkp.sfu.ca/ocs; pkp.sfu.ca/omp
panopto: www.panopto.com
pressbooks: www.pressbooks.com
scalar: scalar.usc.edu
tizra: www.tizra.com
wordpress: www.wordpress.org
vitalsource: www.vitalsource.com

discovery/marketing
altmetric.com: www.altmetric.com
bibapp: www.bibapp.org
bowker: www.bowker.com/en-us
crossref: www.crossref.org
datacite: www.datacite.org
doaj: www.doaj.org
ebsco: www.ebscohost.com
ezid: www.n t.net/ezid
loc issn registry: www.loc.gov/issn
marcive: home.marcive.com
proquest: www.proquest.com
serials solutions: www.serialssolutions.com

digital preservation
adpnet: www.adpnet.org
amazon glacier: www.aws.amazon.com/glacier
amazon s : www.aws.amazon.com/s
aptrust: www.aptrust.org
archive-it: www.archive-it.org
archivematica: www.archivematica.org
artefactual: www.artefactual.com
chronopolis: chronopolis.sdsc.edu
clockss: www.clockss.org/clockss/home
dark archive in the sunshine state (daitss): daitss.fcla.edu
digital preservation network (dpn): www.dpn.org
discoverygarden: www.discoverygarden.ca
duracloud: www.duracloud.org
hathitrust: www.hathitrust.org
hydra: www.projecthydra.org
internet archive: www.archive.org/index.php
islandora: www.islandora.ca
lockss: www.lockss.org
metaarchive: www.metaarchive.org
portico: www.portico.org/digital-preservation
preservica: www.preservica.com
rosetta: www.exlibrisgroup.com/category/rosettaoverview
safety deposit box: www.digital-preservation.com/solution/safety-deposit-box
scholars portal: spotdocs.scholarsportal.info/display/sp/home
synergies: www.synergiescanada.org
uc merritt: merritt.cdlib.org

library networks and consortia
networked digital library of theses and dissertations (ndltd): www.ndltd.org
ohiolink etd center: etd.ohiolink.edu
texas digital library: www.tdl.org
theses canada: www.collectionscanada.gc.ca/thesescanada/index-e.html

www.librarypublishing.org

participating in the library publishing coalition means joining a robust network of libraries committed to enhancing, promoting, and exploring this emerging field. our participating libraries are designing and building this organization from the ground up: making decisions about governance and services, producing resources that benefit the community, and engaging with colleagues. north american academic libraries with an interest in participating may do so at any point during the two-year project period (january –december ) as a contributing institution. in january , the lpc will launch as a membership organization. for more information, please contact sarah k. lippincott, library publishing coalition program manager (sarah@educopia.org).
“the structure of scholarly communications within academic libraries”
wm. joseph thomas, thomasw@ecu.edu
head of collection development, joyner library, east carolina university

abstract: academic libraries often define their administrative structure according to services they offer, including research services, acquisitions, cataloging and metadata, and so on. scholarly communications is something of a moving target, though. how are scholarly communications positions defined, what duties do they often include, and how do they fit within the library’s administrative structure?
some of the first positions devoted to scholarly communications required jds and focused on authors’ rights, copyright, and fair use. yet other positions recently advertised group scholarly communications librarians within digital scholarship units, which not only create and maintain institutional repositories but may also publish electronic journals and/or offer services related to data curation. a brief review of the findings recently published in a spec kit, which focuses on arl libraries, begins this article. the main intention, though, is to provide a wider context of scholarly communication activities across a variety of academic libraries. to do that, a survey of non-arl libraries was administered, reviewing their relevant positions and library organization, and the variety of scholarly communication services they offer. lastly, a set of scholarly communication core services is proposed.

keywords: scholarly communications, institutional repository, data management, open access, authors’ rights, librarian competencies

introduction: in november , the association of research libraries (arl) published spec kit , the organization of scholarly communication services. this spec kit reported the results of a survey of arl members and gathered together a variety of sample documents, including position descriptions, committee charges, organization charts, web pages and brochures designed to market scholarly communications services, assessment tools, and texts of open access policies and resolutions. the survey was designed to determine “how research institutions are currently organizing staff to support scholarly communication services, and whether their organizational structures have changed since ” (p. ). what do we mean by scholarly communications and who responded? radom, feltner-reichert, and stringer-stanback used this definition provided by the scholarly communications group from washington university in st.
louis: “the creation, transformation, dissemination, and preservation of knowledge related to teaching, research, and scholarly endeavors.” there were responses to the survey (for a return rate of %). of these were from institutions categorized by the carnegie classification as ru/vh (research university, very high research activity). there were institutions with carnegie class ruh (research universities, high research activity), canadian arl members, and the library of congress. two of the institutions were considered medium sized; all others were large. three quarters of the respondents were public. the topic is important across all academic libraries, though, so a similar survey was designed, focusing on the other members of the unc system and libraries of various sizes across the country. librarians from schools were invited to take the survey, including schools from the following basic carnegie classifications: ru/vh, ruh, dru, master’s, and baccalaureate. representatives from schools started the survey, but three did not complete it, for a return rate of %. there are only ru/vh schools that are not members of arl; all were invited and responded. there are ruh schools that are not members of arl. of those, were invited and did answer the survey. there are dru (doctoral/research university) schools; were invited but only six responded to the survey. the relatively low number of responses from ruh and dru schools means that this is still an important pool of libraries to study. the master’s schools responding to the survey were all from north carolina: seven are public and seven are private. all eight baccalaureate schools are from nc, two public and six private. the author’s institution is east carolina university, a member of the university of north carolina system with a basic carnegie classification of dru.
the survey focused on the following characteristics: leadership of scholarly communications, administrative structure and date of most recent change, outreach and educational activities, hosting and managing digital content, digital scholarship and other services. in addition, this presentation for the north carolina serials conference communicated potential for growth in scholarly communications programs in the state through shared support in expertise and shared support for technical infrastructure. finally, the concept of scholarly communications core services was introduced. leadership of scholarly communication: within arl libraries, the spec kit reports, a single librarian often leads scholarly communication efforts ( responses). most of these librarians are department heads or assistant directors, and many have the term “scholarly communications” in their titles. eight of the single librarian leaders have special training, generally either law degrees or other specific training for copyright. nine of these devote half of their time or less to scholarly communications (sc) duties. nine of them have direct reports ranging from . fte to fte. other support for sc activities comes from committee members and other librarians. the next most likely leader of scholarly communications efforts is a library unit ( responses). many have “scholarly communication” in the title; other terms include “digital initiatives/services/curation” and publishing. half of these groups have had special training (law degrees and copyright courses). there were responses that “two or more librarians” lead sc efforts. position titles included the terms scholarly communications, copyright, and digital initiatives. a majority of sc leader-librarians report to directors and associate directors. eight of the had received special training (mostly jd or copyright courses), and of them have direct support. leadership by a library committee garnered nine responses. 
the members of these committees are from a variety of departments across the library, and the groups average eight members. lastly, there were three responses that sc efforts were not led by “any single person or group.” my survey results revealed a different pattern: scholarly communications activities were much more likely to be led by a single person. library leadership by a single person accounted for of the responses to this question. leadership by two or more people accounted for responses; there was only one sc department, and two responses were that there was no sc leadership within the library. separately there was a question about a scholarly communications committee, because such committees can exist alongside clearly established leaders. three quarters of the responses ( ) were “no.” there were “yes” responses for committees made up of librarians only (some are institutional repository working groups or open access committees); there were only five sc committees with librarians and other faculty. group size is generally fewer than members: five groups report or fewer members; seven groups number to members, and three groups have more than members. these sc committees most often report to the library administrator ( of the ), while report to faculty senates, and reports to the sc librarian. administrative changes to support sc work were significant among association of research libraries members: of respondents ( %) experienced some sort of change since . the majority of these ( ) created at least one new position; created a new department. formal assessments include annual reports and performance reviews, a few surveys to faculty, and review of statistics (like number of downloads from institutional repositories). demonstrable outcomes include an increase in faculty self-archiving, publishing in open access (oa) journals, and support for oa policies.
the change rate for non-arl libraries was almost as high: % of respondents had changed a position to lead sc initiatives, and most of those changes occurred in or later. the titles for librarians leading sc efforts reveal a range of departmental affiliations. for the libraries reporting titles, most are administrative or have the term “scholarly communication” in the title ( ). another dozen refer to the director of the library and a half dozen were assistant or associate directors. ten have the term “reference” or “research” in the title, and other terms included in position titles were collection development, digital collections, and systems. while the library directors report to the provost, the majority of other respondents report to the director or ad ( ), and another five report to a department head. staffing support, where it exists, is generally parts of people’s time, in particular, liaisons and those doing work on an ir (metadata, systems, programming). assessment is varied and still in its infancy. only some respondents are counting things, mostly the number of items added to the repository, while others are counting number of attendees at events. a few are recording other measures, such as tracking recipients of oa publishing fund grants, but most are concentrating on building programs and on creating support across campus (for instance, faculty backing an oa policy). scholarly communications services: outreach and education scholarly communications services may be generally divided between outreach and educational activities and those services related to hosting and managing digital content. all arl libraries answering the questions about outreach and education offer services related to authors’ rights, and all but one consult with faculty on sc issues. most consult with graduate students ( ) and most advise authors on meeting funding mandates ( ). 
funding requirements consultations and authors’ rights discussions (which inevitably include copyright) are also seen as offered elsewhere on campus, most likely a research office and university legal counsel—suggesting partnerships for the libraries. a large number of arl libraries, of them, also plan campus-wide events; consult with undergrads about sc issues; and prepare sc-related documents for faculty discussion. it is important to note that the spec kit survey permitted librarians to mark that the service was provided both by the library and in another unit on campus, while my survey did not. for the non-arl libraries, authors’ rights education is still a significant activity: of respondents are engaged in it, across a variety of school types. there are libraries that advise authors on how to make their research open access, and as might be expected, there is a high degree of overlap between schools offering both services. only libraries plan group events related to scholarly communications. sample group events include recent presentations to faculty on journal publishing in oa and traditional publishers, and open access week talks. only of schools advise researchers on their data management plans—but of these also engage in data management activities. advising graduate students about electronic theses and dissertations (etds) takes place at schools; other schools said this activity is done by another unit, most likely the graduate school or faculty advisors. schools of varying sizes are indeed participating in scholarly communications activities, just at rates that differ from those by arls. libraries should look for potential partners within their institutions in order to increase the range and audience for their sc efforts. for several of these outreach and educational activities, the graduate school, university research office, and/or university legal offices make natural partners. 
scholarly communications services: hosting and managing digital content there were responses to questions about hosting and managing digital content recorded in the spec kit. the number of libraries offering each service is somewhat lower than the outreach and education services, though. highest numbers are for supporting campus etds ( of ), providing an ir ( ), data management ( ) and digitization ( ). more of these activities are also provided by other campus units. identifying those other units and clarifying whether the library should be involved or in what way would be very important. libraries that i surveyed are also engaged in hosting and managing digital content, and the two services most often offered are the provision of an ir and digitization. note that the ir and digitization are not offered elsewhere on campus, and that digitization (which includes everything from scanning old college yearbooks to participating in hathi trust) is the most offered service ( of responses). irs are offered by schools across the span of carnegie classes, but in decreasing frequency: only two baccalaureate schools have one, and another indicated that they are planning for one. in contrast, only two ru/vh schools reported that they did not have an ir. a little over half as many libraries ( ) have begun publishing journals compared to the arl’s, but there were two master’s colleges and a dru in addition to the ru/vh and ruh schools. a few more libraries report involvement with data management ( ), and these also included schools from across a variety of carnegie classes. what campus partners are available here, for example, to publish e-journals? campus it, various departments on campus? maybe even if another unit is already providing the basic service, the library can add value to those e-journals with services related to indexing, registering for issns, crafting a preservation plan, etc. 
scholarly communications services: other digital publishing and support the spec kit survey combined digital humanities, e-science, and “e-scholarship initiatives” without defining any of these three. a large number of the responses ( ) indicated support, and noted other campus units also offering support. this number compares well with number of libraries offering an ir and data management. there were libraries that said they are working with faculty to develop new forms of publishing, and schools noted that other units on campus are doing this too. there are libraries publishing e-journals, and who said that other units are providing this service. only of respondents indicated the library administers an oa publishing fund, and said that other units offer such a fund. who paid page charges or other publishing fees in past? likely a research office or dean’s office paid these fees, and maybe these offices would make good partners for a campus oa fund. non-arl library support for new forms of publication included smaller numbers than arl schools ( compared to ), but these were spread across ru/vh, ruh, dru, and masters schools. the surveyed schools also were less likely to offer an open access publishing fund—only out of respondents (all ru/vh or ruh)—although other schools indicated that they are looking for opportunities to offer a fund. this compares to arl schools offering an oa fund. other services mentioned related to reserves, e-reserves, and fair use consultations, new faculty orientations and graduate student orientations. one library director talked about watching nih grant-funded research projects through the campus office of sponsored programs process and tracking public access policy compliance. in all of these activities too are potential campus partners, including campus research and legal offices. 
potential for growth: exploring options and planning growth in scholarly communication will be easier if libraries can take advantage of shared support for expertise and shared support for technical infrastructure. shared support for expertise for north carolina libraries includes several web resources, a working group, and a new resource person. web resources highlighted were acrl’s scholarly communication toolkit and the arl’s “developing a scholarly communication program in your library.” the university library advisory council (ulac) recently formed a scholarly communication working group and charged it with investigating oa publishing and archiving resources available to member institutions of the university of north carolina. the new resource person is christine fruin, the visiting program officer (vpo) for scholarly communication for the association of southeastern research libraries (aserl). ms. fruin is the scholarly communication librarian for the university of florida, and in her capacity as vpo will work with sc and oa leaders within aserl on a series of articles in order to highlight sc work done in our region and to identify common themes and best practices. these are only some of the external expert sources available to libraries. in addition to other external experts, libraries should seek expertise in partners such as the university legal counsel, research office, and/or graduate school. shared support for technical infrastructure presupposes libraries working together on any of several different software packages designed to offer the following services: institutional repositories, e-journal publishing, and data management. there are several well-known institutional repository software options, including dspace and bepress. at least two regional consortia also offer shared repositories using dspace: lasr (liberal arts shared repository) and the nitle network (national institute for technology in liberal education).
unc greensboro has also created an ir system (ncdocks) that is currently shared by seven unc system schools. open journal systems is one of the best known software packages for publishing e-journals, and several unc schools are already utilizing it. a shared ojs would defray costs for other schools. some libraries publish e-journals in their dspace repositories, and bepress can also host e-journals. data storage and management is an important and growing need, so libraries are scrambling to evaluate what they can provide. dspace can store data, as can dataverse and project redcap, and there are other free repository software packages, but libraries must be careful, because this software is “free” as in “free puppies.” reflections: the scholarly communications landscape has changed rapidly in the last few years, and the pace of change continues to increase. within the past few weeks, there has been a flurry of activity: aserl announced the vpo, ulac created their task force, an oa fund was initiated at northern illinois, and positions have been posted at virginia commonwealth university, butler university, montana state university, and others. can libraries avoid being left out of the loop? more space for working in the scholarly communications arena will definitely be opened up by the recent office of science and technology policy directive for more agencies to make their funded publications oa and better manage the underlying data. libraries must ask themselves what services to offer, strategically and sustainably, while the library community at large should also consider how to bridge gaps in service across such a wide variety of library sizes. a basic takeaway from the survey data is that schools of all sizes are already offering scholarly communications services, so any of our libraries can engage in this work. the libraries still have to decide carefully what services to offer, and who their partners should be. 
perhaps a set of scholarly communication core services could offer direction for planning training, help bridge gaps across institutions of varying sizes, and lead to effective assessment of scholarly communications programs. scholarly communication core services: one of the first questions to address when considering a set of core services for scholarly communications is whether they would be program oriented or whether they would be written as librarian competencies. after all, one possibility for describing a set of core services is to consider sc as a program. acrl has guidelines for instruction programs in academic libraries that might serve as a good model. these guidelines address such functions as program design, support, key components of advanced programs, and benchmarks. there might be more flexibility, though, in concentrating on librarian competencies. these newly developed competencies could stand alone like the information literacy competency standards, or librarians could recommend that sc competencies be integrated into other competency standards. and there are certainly lots of competency sets out there: rusa’s professional competencies for reference and user services librarians has a very good structure; nasig lists draft competencies for electronic resources librarianship; there are competencies for art librarians, music librarians, and medical librarians, among others. consider the following proposal for scholarly communication core services.
related to each broad topic, librarians will:
• open access:
  o help authors make their works open access
  o understand a variety of publishing models
• copyright and publishing agreements:
  o help patrons use copyrighted materials fairly and legally
  o consult with authors on their publishing agreements
• research support:
  o help users evaluate oa resources among their lit reviews
  o help authors comply with funding mandates
in order to meet the goal to help authors make their works open access, librarians will have to be familiar with a variety of publishing models and a variety of types of open access. this competency would include the librarian being able to deposit a permissible copy of a work into an appropriate repository. (see s. potvin, , p. .) this repository might be an ir, a data repository, pubmed central, or a subject repository. copyright and publishing agreements are critical features of the scholarly communication landscape, so understanding them must be a basic competency among librarians doing sc work. consistent among comments in my survey and on the spec kit survey were remarks about the library’s role as a resource for the use of copyrighted materials: reserves were mentioned a lot, as was digitization of physical formats (like vhs), but coursepacks are another area where the library’s licenses can make a big difference to students. working with authors to understand their publishing agreements and to retain the rights they want to keep is an important proactive service that will have a direct impact downstream on the availability of research for future library users. research support services serve a wide range of library users, from students needing resources to write their papers to faculty conducting a literature review for a grant. complying with funding mandates will create more demands on librarians as the funding mandates increase.
librarian help in writing successful data management plans might be one indicator of success, as might the verification of public access policy compliance. overall, these scholarly communication core services are generally framed so that any member of the library can offer them. they are also intended to be flexible, to address variances of need whether the audience member is a student, faculty member, or other library user. initially, at least, the core services would focus on outreach and educational objectives, since such activities could precede the technological infrastructure necessary for hosting and managing digital content. feedback received during the north carolina serials conference was generally positive, with encouragement to focus on consulting and advocacy roles, to be respectful of different approaches to scholarly communication issues required by disciplinary differences, and to be sure that scholarly communication expertise is disseminated throughout the organization rather than concentrated only in one person. conclusion: librarians from a wide variety of schools were surveyed to discover their scholarly communication leadership, administrative structure, and services offered. outreach and educational activities most offered include authors’ rights and open access, and digitization and hosting the ir top the list of digital content services. these results compare favorably to the types of activities offered by arl members, although not at the same rate of adoption. in addition to suggesting potential for growth through shared expertise, the author also encourages librarians to consider implementing a set of scholarly communications core services because they might provide useful benchmarks against which to plan and evaluate locally offered services.
since the north carolina serials conference in mid-march , two publications and a presentation reveal widespread interest in incorporating scholarly communications educational activities into information literacy. acrl’s intersections of scholarly communication and information literacy ( ), and common ground at the nexus of information literacy and scholarly communication (s. davis-kahl and m. k. hensley, eds., ) were both published, and davis-kahl, kim duckett, julia gelfand, and cathy palmer presented “information literacy & scholarly communication: mutually exclusive or naturally symbiotic?” to the acrl conference in indianapolis ( ). incorporating sc activities into information literacy will provide excellent benchmarks for engaging students. hopefully while this effort is underway, librarians will come up with strategies for defining sc competencies with respect to faculty members, researchers complying with mandates, and other campus partners. librarians might also consider whether there are other preexisting competencies into which sc could be incorporated. references: association of college and research libraries. ( ). intersections of scholarly communication and information literacy: creating strategic collaborations for a changing academic environment. chicago, il: association of college and research libraries. retrieved from http://acrl.ala.org/intersections/. davis-kahl, s., duckett, k., gelfand, j., and c. palmer. ( ). information literacy & scholarly communication: mutually exclusive or naturally symbiotic? paper presented at the association of college & research libraries conference, indianapolis, in. apr. . retrieved from http://works.bepress.com/stephanie_davis_kahl/ /. davis-kahl, s., & hensley, m. k. ( ). common ground at the nexus of information literacy and scholarly communication. chicago: association of college and research libraries, a division of the american library association. potvin, s. ( ). 
the principle and the pragmatist: on conflict and coalescence for librarian engagement with open access initiatives. the journal of academic librarianship, ( ), – . doi: . /j.acalib. . . radom, r., feltner-reichert, m., and k. stringer-stanback. ( ). organization of scholarly communication resources, spec kit . washington, dc: association of research libraries. thomas, w. j. ( ). the structure of scholarly communications within academic libraries. paper presented at the nd north carolina serials conference, chapel hill, nc. mar. . retrieved from http://thescholarship.ecu.edu/handle/ / .

undergraduate research and the academic librarian: case studies and best practices
chapter *
a triumph, a fail, and a question: a pilot approach to student-faculty-librarian research collaboration
missy roser and sara smith
* this work is licensed under a creative commons attribution-noncommercial-sharealike . license, cc by-nc-sa (https://creativecommons.org/licenses/by-nc-sa/ . /).

introduction
in , amherst college received a two-year planning grant, followed by a multiyear implementation grant, from the andrew w. mellon foundation for faculty to develop a set of seminars in the humanities and social sciences. the idea was to introduce sophomores and juniors to approaches to research as a process: how to frame a researchable question, develop investigative strategies, and identify and use sources. the seminars would help students engage with topics that intersect with the scholarly interests of a faculty mentor, potentially leading to a senior thesis, a model more commonly seen in the lab sciences. the mellon pilot eventually included the following elements:
• four to eight research seminars each spring semester, capped at six students
• courses built around or directly contributing to a faculty member’s own research, ideally resulting in collaborative faculty-student publication, exhibition, or presentation
• a subject librarian affiliated with each seminar, participating at varying levels, from on-call to fully embedded
• access to other instructional staff, including project collaboration with academic technology services professionals or museum curators
• a voluntary six- to eight-week summer research fellowship component for participating seminar students
forty courses have now been offered in areas, including material culture in art history, urban planning and educational opportunity, interdisciplinary explorations of sensory systems, the world of the king james bible, performance culture at the turn of the century, and archives of childhood. over six years, fifteen faculty members, students, and seven librarians have been involved in the seminars. nineteen students have co-published with a faculty member in a peer-reviewed journal, seven books and ten journal articles have incorporated contributions from student researchers, and six exhibitions have been mounted in the college’s library and museum. supporting this extended pilot project has encouraged librarians to stretch our ideal forms of teaching research dispositions and scaffolding undergraduate research. librarians working with the mellon seminars are involved with a wide variety of activities: traditional instruction sessions on how to develop a research prospectus or use a bibliographic-management tool like zotero, consulting on course design or digital-scholarship pedagogy, serving as interlocutors for proposal workshopping, or teaching weekly “research lab” sessions to complement course content.
our roles particularly evolved to support the initiative during the summer, when students stay on campus to continue intensive work on their projects. this opportunity came at a moment of real transition for the institution, which was in the midst of a wave of faculty retirements and hiring, and for the library. a new library director had just arrived, the entire research & instruction department turned over during the course of the project due to retirements and promotions, and, as a result of increasing demand for our work with undergraduate research, we were able to add two additional teaching-librarian positions that also addressed such identified needs as outreach and user experience. the shifts in our priorities and identity seem to echo the debates and direction of instruction librarians in the profession more broadly over the same period.
background
the original grant proposed experimental seminars as a way to expand the student research program beyond thesis work and a longstanding summer-science program by developing “activities that help us ( ) enhance student understanding of how research questions are developed and pursued, and how they connect to the “big questions” underlying the liberal arts, ( ) better prepare students for successful thesis projects in the humanities and social sciences, and ( ) foster a climate of intellectual excitement and engagement that pervades both the classroom and daily life at amherst.” library instruction at amherst, a selective residential liberal-arts college with , undergraduates and an open curriculum, is already very context-specific. many departments offer research-methods courses for majors, usually with librarian support. but while percent of seniors complete an honors thesis, survey data several years ago revealed that the experience could be isolating and fragmented.
we saw potential in the mellon grant to lay more explicit groundwork for non-stem students embarking on independent work. first, we had to articulate our role in the project. the grant had been awarded before any of the current research & instruction librarians were hired, and the library hadn’t even been referenced in the original application. the faculty principal investigator declared in an initial meeting that he didn’t want any of “that database stuff.” instead, he sought proposals and input from librarians regarding the forms that collaboration could take, urging us to think creatively to reinvent the library’s relationship to faculty and curriculum. our instincts were to build on previous course-integrated instruction and thesis consultations as well as to think about opportunities for embedding instruction, like those discussed by dewey and by smith & sutton. we initially assigned the liaison librarian for the faculty member’s home department to each seminar, and their involvement that first semester indicated a range of possible activity. one seminar had only informal consultation with their librarian. another, in classics, had two sessions with the librarian to introduce research tools in that discipline. the seminar on education and history included a session on zotero and organizing research, as well as a class covering resources very specific to their project, including government documents and newly acquired microfilm. the last seminar had the closest collaboration, with a librarian teaching five sessions that were heavily integrated with the content of the course, each breaking down a type of evidence that could be used to investigate a potential research question in law and culture.
the distribution was similar across eight seminars the following year: two courses had only a session or two with their librarians to cover bibliographic management and basic research; two others, in art history and religion, had librarians teach two to three sessions focused on disciplinary approaches to research and following citations; and four of the courses had quasi-embedded librarians supporting their learning as they took on projects heavily based in archival or special collections in alternative newspapers, history of the early-modern book, nineteenth-century children’s literature, or missionary papers from turkey. because many seminars were interdisciplinary, we later went beyond liaison assignments to match seminars to librarians whose capabilities and interests closely aligned with a particular topic or mode of inquiry. as the program developed, skills such as data management and coding transcripts were integrated into research-team instruction, as were technology concepts like card-sorting and wireframing for web design. one other reference point in the first years was the university of adelaide’s research skills development framework, which helped us to describe a “research pedagogy” and distinguish the bounded-research approach we taught in regular instruction sessions from the kind of scaffolded to open-ended research characteristic of the mellon seminars. librarians taught skills early on to help students practice asking researchable questions, discover and evaluate information, and synthesize findings. as they developed proposals at the end of the semester and moved into more independent research in the summer, goals shifted to supporting student-initiated inquiry and coaching teams through testing schemas and refining their own methodologies.
partnerships
as the program continued, librarians played an increasingly connected role.
our department head attended working-group meetings, individual librarians took on key responsibilities in the seminars and in project management, and summer involvement and facilitation expanded. we explored how we could more fully partner with faculty, pairing their deep disciplinary understanding with our focus on the research process to address each seminar’s subject area and intended outcomes. faculty, realizing this, appreciated spending less time on nuts-and-bolts mechanics and more time on higher-level concepts. this model allowed for contextual application of big-picture process issues, a more nuanced version of the specificity we had been bringing to workshops and one-shot instruction sessions. while not every seminar made use of its liaison librarian in an embedded sense, several things made for a very different experience of offering research support: early conversations with a faculty member, familiarity with an entire syllabus (including often doing all the course readings), getting an advance look at assignments, and regular check-ins with the course. in many cases, the librarian would suggest places of convergence during the semester where a hands-on instruction session might make sense with the content planned for that week, e.g., exploring the history of polling in newspapers in relation to public support for the death penalty, or tracing the underlying research for a popular account of the silk road by connecting examples to the book’s bibliography. in several other courses, faculty came to increasingly value an “unsyllabus” at times, where the messiness and uncertainty of developing a research question or proposal focus would benefit from building flexibility into planning. eighty-nine students over the past four years have stayed after the completion of the semester to continue their work with faculty, and the library is now a hub of summer research activity.
most mellon seminar teams set up camp in the library (with librarians continuing to act as on-call coaches), and research & instruction librarians convene weekly research table meetings for students to share progress, ask questions, and learn from peers. we’ve established a spinoff thesis research table and broadened our workshop offerings and community-building events to better serve the needs of student researchers over these months while faculty are often not on campus. we’ve also inaugurated an annual daylong showcase of undergraduate research and creative work in the spring, in partnership with the writing center, center for community engagement, and academic technology. the event is held in the library, and current and past mellon students are well represented.
reflection
there have been ripple effects from the mellon initiative in nearly every aspect of our work as research and instruction librarians. our approach has led to a broadening of the research skills and methods we teach, a greater focus on transferable aspects of learning to do research, and improved relationships with teaching faculty. there has been increased collaboration with non-mellon faculty as word has spread, and we have seen more student-to-student referrals as well. student evaluations described greater understanding of how to develop researchable questions, analyze research methodologies, and evaluate the relevance of sources. our extended engagement with the motivated students in these disciplinary seminars also led to more interaction with many of them as they subsequently became thesis writers in the same departments. the research undertaken by students in the mellon seminars involved a sustained project that was longer than a final course paper but shorter than an honors thesis.
the highly situational orientation of the research in these seminars prepared students for thesis research in a new way, revealing the benefits of a longer timeframe for a research experience, especially for transfer or less-prepared students. for librarians, this also meant being more deliberate about the affective and metacognitive elements of research instruction, particularly using the summer to introduce often-tacit aspects of research and peer learning (and to encourage students to explore and report back on off-campus experiences, as well). when asked about tangible skills they took away from summer research tables, students described learning how to build in time for reflection and understanding the emotional ups and downs involved. our work with these students in particular led us to incorporate more aspects of team-based research, including an awareness of the stages of successful team formation and project management, and how to teach communication skills to support it, for students’ dual roles as peer colleagues and working with a faculty lead investigator. for us, this not only applies to classroom instruction but also extends to building community, increasing librarian visibility, and connecting to other instructional staff on campus, such as the writing center and academic technology.
some of the successes of this project have been accompanied by challenges. we continue to wrestle with practical questions that have arisen regarding fuzzy boundaries when we’re not the instructor of record, as well as pedagogical questions of how best to integrate subject content with information-literacy concepts, as bowler and street found. the sustainability and scaling of an embedded model is also a perpetual concern.
although the work has been incredibly satisfying for librarians involved, it has led to an expanded workload in summer, taken more time to be truly and effectively embedded in these courses, and meant disproportionate attention to some disciplinary research practices over others in a situation where each librarian works with multiple academic departments.
expressly designating these seminars as experimental has helped emphasize an iterative and developmental approach to research, encouraging students to try out new ideas (a potentially vulnerable but generative space not always common to the amherst experience). we have had success working with faculty to “unstuff” syllabi in order to create more time and space to focus on process, and we now regularly collaborate on rethinking course outcomes. there is, of course, the potential to make this program stronger. attending every class session doesn’t necessarily ensure collaboration between faculty and librarians, and we have begun to develop strategies to address this concern. in one seminar last spring, instead of having the librarian join the course four times over the semester, the faculty member asked her to hold weekly “research lab” sessions to provide more in-depth, hands-on instruction that built to the students’ collaborative research proposals. while an excellent opportunity, this format needs some revision for better integration: the labs lost the symbolic value of being in the same classroom with the professor, though their worth was evident when the students needed much less time to get up to speed for summer research.

assessment

in addition to their regular course evaluations, students generally filled out separate evaluations for their faculty about how mellon seminars differed from other courses.
while librarians didn’t see most of those, we did talk with faculty individually and in the planning group about changes to larger structural issues as well as specific session topics. we spearheaded a concerted focus on building a research community in the summer, which in turn led to more awareness of the increasing number of students doing independent work and a resulting task force convened by the dean of faculty’s office to better coordinate summer opportunities for students. because of the relationships we developed with students, we got quite a bit of informal feedback. the library sponsored an information session each fall to promote the mellon seminars to other students before course pre-registration, and these participant panels were very helpful for candid reports of how the previous seminars and summer research had gone.
most valuable for formative assessment, though, were the weekly summer research tables, which were positioned at the point of need. we structured these sessions by asking teams to report on “a triumph, a fail, and a question” of the past week; besides facilitating peer learning, it allowed us to address unanticipated gaps and build essential concepts into the next version of the course. many of these gaps involved research practices with concrete aspects (naming conventions for shared files, choosing coding software, finding cvs for scholars in a particular subfield, using the u.s. newspaper directory to identify what was missing from digitized coverage) that were situated in a larger social context of academic and research culture that we could unpack together and make more transparent for students.
what we heard overwhelmingly from students in evaluations and in person was that they became more critical readers, with much more attention given to how research is created or a claim is made: that the process was revealed and permission granted to “look under the hood” of arguments made by even senior scholars in a field.
this experience also modeled pathways for how they might go about starting to research specific questions. despite increased autonomy and self-direction in mellon seminars, students told us that it was less intimidating than expected, that “this isn’t just learning about something but doing it.” while we hadn’t been familiar with indiana university’s decoding the disciplines project at the outset of our experience, this collaborative approach to demystify how an expert would go about a disciplinary task was very consonant with our objective to make the research process more explicit.
one of the original goals for the grant was to connect students to potential thesis topics earlier in their undergraduate careers, and this has indeed been the case. while not every student developed mellon research into thesis projects, quite a few continued in related areas, and many continued to make use of librarians as their “research coaches” through graduation. initial analysis by our institutional research office shows that a much higher percentage of mellon seminar students go on to complete a thesis: percent overall, with increases from – percentage points for students of color, first-generation, and low-income students compared to non-mellon students. self-selection for the seminars is likely one factor, but the nature of these courses means that these students have previous experience with advanced research and with working closely with faculty and librarians.
several academic departments have embraced the summer thesis research table model of sustained support for their thesis cohorts, asking liaison librarians and writing associates to lead monthly meetings for them through the academic year, often with a rotating faculty partner. feedback from students has guided discussion topics, with an augmented focus on scholarly communication and open access to be added this year to help bridge the transition from consumer to producer of information.
in addition, the college’s strategic plan recommended investigating half-credit courses, possibly taught by instructional staff, which could build on this departmental model or the weekly research labs in conjunction with the semester-long seminars.

recommendations/best practices

throughout the mellon project, we’ve worked with faculty to articulate how this research seminar will be different from other courses in order to design the learning experience accordingly. the project’s coordinating working group, with a librarian at the table with faculty, helped surface these issues; it also was the most effective mechanism to raise awareness among faculty about ways to define and incorporate the librarian role. for the librarians, we’ve had to be prepared to go beyond our comfort zones in terms of discussing course content, observing and deconstructing for students the research approaches of faculty members and their larger disciplinary or interdisciplinary communities, and thinking on our feet to solve problems that emerge. we did “translation” work to scaffold and interpret research processes in the archives, the gis lab, art museums, the folger shakespeare library, and other settings through give and take with the faculty instructors, librarians, and students. this immersive alignment with faculty research process has given us a deeper understanding of methodologies and practices in particular fields, which has also been very rewarding for our own intellectual engagement in our work.
these contexts and outcomes lent themselves organically to bigger-picture thinking. as the acrl framework for information literacy for higher education drafts and final version emerged over the same period, its focus on threshold concepts, dispositions, metacognition, and affective aspects of learning to do research resonated with our experience in these intensive settings.
it also prompted us to think about alternative research products, such as digital-scholarship projects, archival exhibitions, wikipedia articles, or website development. organizationally, the mellon project helped us as a department in this time of transition to think about how to prioritize and scale our teaching, as well as about the nature of collaboration with faculty on research and acknowledgment of librarian work. as faculty considered the question of how to credit undergrads for contributions to their scholarship in the humanities and social sciences, they also came to us with questions about how to credit embedded librarians for their role in teaching or research support at an institution where librarians don’t have faculty rank or tenure. being able to draw on this experience was also crucial for librarian involvement in strategic planning around the integration of research and teaching, as well as on the committee examining potential changes to the college’s curriculum.
in reflecting on the mellon project as a whole, we identify several important elements of the experience:
1. bringing our teaching identities into classrooms in a much more overt way, to where faculty have asked for librarians to co-teach or have considered fully collaborating on future research projects or seminars; this in turn creates increased awareness of the need for our expertise in teaching research-specific approaches for students.
2. the need for time to think through and continually revisit pedagogy for teaching the messiness of research and all it entails, including the need to negotiate conceptual space for this work in research seminars within majors modeled on the mellon seminars. a larger question remains for how to scale this perspective for shorter engagements.
3.
the ongoing balance between creating structure (including designing instruction and planning specific sessions) and thinking on our feet, which incorporates many other aspects of our personalities and scholarly/teaching/librarian identities, particularly in working with multiple cohorts over subsequent years.

conclusion

the mellon pilot project gave us the incentive and space to implement new ideas for teaching the research process. we drew on the new framework to further make sense of our own practice and make connections across outcomes and skills that might have seemed to fall outside library instruction previously. most important, we were able to develop relationships with students and build trust with faculty as they took risks of their own.

notes

1. examples include austin sarat, with katherine blumstein ’ , aubrey jones ’ , heather richard ’ , and madeline sprung-keyser ’ , gruesome spectacles: botched executions and america’s death penalty (stanford, ca: stanford university press, ); hilary moss, yinan zhang ’ , and andy anderson, “assessing the impact of the inner belt: mit, highways, and housing in cambridge, massachusetts,” journal of urban history , no. ( ): – ; caitlin britos ’ , james hall ’ , soo kim ’ , daniel schulwolf ’ , karl loewenstein and the american occupation of germany, http://loewenstein.wordpress.amherst.edu; and unlocking wonder: a peek into the world of luxury cabinets (art exhibition curated by pablo morales ’ , claire castellano ’ , robert croll ’ , martha morgenthau ’ , and madeleine sung ’ , mead art museum, amherst college, amherst, ma, october –february ).
2. barbara i. dewey, “the embedded librarian: strategic campus collaborations,” resource sharing & information networks , no. – ( ): – , doi: . / j v n _ ; susan sharpless smith and lynn sutton, “embedded librarians: on the road in the deep south,” college & research libraries news , no. ( ): – , , http://crln.acrl.org/content/ / / .
3.
john willison and kerry o’regan, “commonly known, commonly not known, totally unknown: a framework for students becoming researchers,” higher education research and development , no. ( ): – , doi: . / ; see also http://www.adelaide.edu.au/rsd/framework.
4. anastasia efklides, “metacognition, affect, and conceptual difficulty,” in overcoming barriers to student understanding: threshold concepts and troublesome knowledge, ed. jan h. f. meyer and ray land (new york: routledge, ), – .
5. meagan bowler and kori street, “investigating the efficacy of embedment: experiments in information literacy integration,” reference services review , no. ( ): – , doi: . / .
6. leah shopkow, “what decoding the disciplines can offer threshold concepts,” in threshold concepts and transformational learning, ed. jan meyer, ray land, and caroline baillie (boston: sense publishers, ), – ; see also http://decodingthedisciplines.org.
7. association of college and research libraries, “framework for information literacy for higher education,” http://www.ala.org/acrl/standards/ilframework.

bibliography

association of college and research libraries. “framework for information literacy for higher education.” http://www.ala.org/acrl/standards/ilframework.
bowler, meagan, and kori street. “investigating the efficacy of embedment: experiments in information literacy integration.” reference services review , no. ( ): – . doi: . / .
dewey, barbara i. “the embedded librarian: strategic campus collaborations.” resource sharing & information networks , no. – ( ): – . doi: . / j v n _ .
efklides, anastasia.
“metacognition, affect, and conceptual difficulty.” in overcoming barriers to student understanding: threshold concepts and troublesome knowledge, edited by jan h. f. meyer and ray land, – . new york: routledge, .
shopkow, leah. “what decoding the disciplines can offer threshold concepts.” in threshold concepts and transformational learning, edited by jan meyer, ray land, and caroline baillie, – . boston: sense publishers, .
smith, susan sharpless, and lynn sutton. “embedded librarians: on the road in the deep south.” college & research libraries news , no. ( ): – , . http://crln.acrl.org/content/ / / .
willison, john, and kerry o’regan. “commonly known, commonly not known, totally unknown: a framework for students becoming researchers.” higher education research and development , no. ( ): – . doi: . / .

albert robida “la sortie de l’opéra l’an .” lithograph. library of congress prints and photographs division. .

engl : nineteenth-century speculative fiction

teaching philosophy

my teaching is informed by critical pedagogy, particularly the work of paulo freire, asao inoue, and henry giroux. for giroux, critical pedagogy means “educating students to take risks and to struggle within ongoing relations of power in order to be able to alter the grounds in which life was lived” ( ). you will see that i have scaffolded a variety of active-learning activities and recommended readings. these activities model my belief that different people engage with a course in different ways. your contributions to these activities are vital and will prove beneficial to your colleagues and to me. my primary purpose as a teacher is to help develop in students what freire calls conscientização, or critical consciousness.
critical consciousness entails understanding the social, political, ecological, and technological contradictions shaping the uneven distribution of power in our world; and, furthermore, a willingness to intervene in those conflicts on multiple levels. too many students are marginalized and silenced due to what inoue calls the assessment of “white habitus” as an internalization of racist standards of behavior and achievement that are commonly rewarded by university grading. alternatively, i grade on participation, effort, and the degree to which you help your colleagues struggle against these categories rather than reifying an abstract and oppressive sense of “ability.” we will all struggle to think and work through these categories in different and uneven ways. i hope you seize the opportunity to learn with us about science fiction, power, ideology, and collective agency in a supportive and progressive environment.

major course texts (available from the bookie or online).
https://www.amazon.com/book-urizen-facsimile-color-history/dp/ /ref=tmm_pap_swatch_ ?_encoding=utf &qid= &sr= -
https://www.amazon.com/we-have-never-been-modern-ebook/dp/b aqlfqio/ref=sr_ _ ?s=books&ie=utf &qid= &sr= - &keywords=we+have+never+been+modern
https://www.amazon.com/frankenstein-broadview-editions-mary-shelley/dp/ /ref=sr_ _ ?s=books&ie=utf &qid= &sr= - &keywords=frankenstein+third+edition
https://www.amazon.com/island-doctor-moreau-broadview-editions/dp/ /ref=sr_ _sc_ ?s=books&ie=utf &qid= &sr= - -spell&keywords=dr.+moreau+broadivew
https://www.amazon.com/she-rider-haggard- -feb- -paperback/dp/b t ym m/ref=sr_ _ ?s=books&ie=utf &qid= &sr= - &keywords=she+broadview
https://www.amazon.com/news-nowhere-william-morris/dp/ /ref=sr_ _ ?s=books&ie=utf &qid= &sr= - &keywords=news+from+nowhere+broadview
https://www.amazon.com/archaeologies-future-desire-science-fictions/dp/ /ref=sr_ _ ?s=books&ie=utf &qid= &sr= - &keywords=archaeologies+of+the+future
https://www.amazon.com/herland-related-writings-broadview-editions/dp/ /ref=sr_ _ ?s=books&ie=utf &qid= &sr= - &keywords=herland+% b+broadview
https://www.amazon.com/one-blood-hidden-givens-collection/dp/ /ref=mt_paperback?_encoding=utf &me=
https://www.amazon.com/kindred-octavia-e-butler/dp/ /ref=tmm_pap_swatch_ ?_encoding=utf &qid= &sr= -

course description

darko suvin claims that science fiction is fundamentally concerned with “cognitive estrangement,” or the presence of some element in the story that transforms how its readers understand their world. in fact, many of the developments in science, economics, and politics in the nineteenth century were also concerned with the new worlds revealed by an increasingly industrialized society. charles darwin shocked the world by postulating that natural selection determined the habits of human beings, not any divine plan. voyages to other parts of the planet were revealing new frontiers and new spaces for capitalism and colonialism to exploit. machines and unskilled labor were replacing artisans with mechanized and standardized commodities. and the hopes and fears inspired by these new worlds reappeared as dreams and nightmares in speculative fiction: darwin’s theories became the strange human-like animal hybrids of h.g. wells’s the island of dr. moreau, while imperialism inspired the “lost race” novels of h. rider haggard and made possible the utopian dreams of william morris and charlotte perkins gilman.
this course will show how science fiction articulated the hopes and fears victorians associated with the future. such anxieties are a symptom of our inability to imagine the future (or the past) in its alterity. against liberal promises of perpetual progress in which the notion of eventual inclusion tells the oppressed to stave off revolution and reassure the ruling class, science fiction enacts dramas surrounding the true danger and possibility of a future that is entirely unpredictable. in addition to the authors mentioned above, this course will show how women and authors of color used science fiction to challenge the oppressions of their day and imagine futures that asserted their freedom and power.

objectives

produce close readings of key texts that are historically informed and evidence-based.
find relationships between course content and specific research interests or areas of teaching.
incorporate theoretical perspectives and historical sources into critical readings of the nineteenth century.
respond to critiques of critical theory.

major projects: points possible

discussion leader

on two days of your choosing, i will ask you to work in groups of two to create a -minute presentation associated in some way with the readings of the day, along with questions designed to facilitate discussion of the content and your presentation. the lecture can be used to help you conceptualize your final project. you may also choose to have students engage in some kind of active learning exercise that illustrates the main points you make during your presentation. for examples, see the schedule. points or % total; points each.

final project

you will see one hour on most weeks devoted to a “workshop.” these days are designed to help you conceptualize, draft, and revise a final project. this project can take the shape of a traditional seminar paper in which you synthesize a primary reading with secondary and tertiary sources that is related to course content in some broadly conceived way. you may also decide to engage in a multimodal or digital project. whereas the teaching of various tools for digital scholarship is beyond the scope of this particular class, i am happy to help you think through these possibilities and incorporate milestones into the class’s workshop schedule. points or %.

text engagements

for each week of class, i will ask you to write a one-paragraph response to the readings for that day.
this should include an initial reaction to the readings and at least one question that the readings prompted for you. the question can be one of clarification, if you didn’t understand part of the novel or theoretical reading, or it can be more of an open-ended question meant to foster discussion. we will use these engagements often in class, so please upload them no later than midnight on the sunday prior to our monday meeting. points total or %.

schedule

assignments are due on the date listed. the schedule may change with notice from me. an asterisk before a reading denotes a reading that is available on blackboard.

date and topic/theme: / science and fiction
in-class: introductions and course overview. discussion: quoted sections by darko suvin and china mieville on the definition of science fiction. sign up for meetings and presentations.
readings due (in-class selections from):
darko suvin, “on the poetics of the science fiction genre.” college english. . ( ): - .
john newsinger, “fantasy and revolution: an interview with china mieville.” international socialism journal. ( ). http://pubs.socialistreviewindex.org.uk/isj /newsinger.htm
assignments due: pick a spinoza keyword to explore for next week.

date and topic/theme: / self and world
in-class: keywords: define and explore the interrelationship amongst deleuze’s spinoza keywords from chapter . use references from the levinson and deleuze readings to contextualize your definition. discussion: blake, wordsworth - ideology and estrangement in the nineteenth-century. workshop: pick three different texts and three different topics that you are interested in exploring for your final project. talk about your interest, both personal and professional, in class.
readings due:
william blake, the book of urizen. dover: dover publications, .
william wordsworth, “lines composed a few miles above tintern abbey, on revisiting the banks of the wye during a tour. july , .” rchs hypertext reader. http://www.rc.umd.edu/sites/default/rcoldsite/www/rchs/reader/tabbey.html
marjorie levinson, “a motion and a spirit: romancing spinoza.” studies in romanticism. . ( ): - .
gilles deleuze, “life of spinoza,” “on the difference between the ethics and a morality,” and “spinoza and us.” spinoza: practical philosophy. tran. robert hurley. san francisco: city lights books, . - ; - .
assignments due: helpful: gregory colon-semenza. “the structure of your graduate career: an ideal plan” and “publishing”

date and topic/theme: / nature and culture
in-class: lecture: spinoza, deleuze, latour, and new materialism. discussion: actor-network theory and modernity. workshop: discuss your proposal. suggest at least three revisions to each of your colleagues’ proposals.
readings due:
bruno latour, we have never been modern. harvard: harvard up, .
graham harman, “we have never been modern.” prince of networks: bruno latour and metaphysics. melbourne: re.press, . - .
assignments due: write a one-page proposal for your final project. remember that you are free to write a seminar paper or a multimodal / digital project. helpful: gregory colon-semenza. “the seminar paper.” eric hayot. “showing your iceberg” and “metalanguage.”

date and topic/theme: / science and poetry
in-class: footnotes: pick one footnote and one analogy employed by darwin in either of the cantos we read for today. give us a short history of that aspect of romantic science and its impact on darwin’s poem. discussion: griffiths’s work on analogy and its impact on nineteenth-century science. what is the relationship between matter, language, and verse driving the early history of evolution? workshop: revised proposals.
readings due:
erasmus darwin, “editors’ introduction” and “canto ” the botanic garden volume : the economy of vegetation. ed. adam komisaruk and allison dushane. london: routledge, . - .
erasmus darwin, “canto .” the botanic garden volume : the loves of the plants. ed. adam komisaruk and allison dushane. london: routledge, . - .
devin griffiths, “the intuitions of analogy in erasmus darwin’s poetics.” sel: studies in english literature - . . ( ): - .
assignments due: revise your proposal using at least one recommendation you received last week. helpful: gregory colon-semenza. “organization and time management.”

date and topic/theme: / virtual and real
in-class: digital editions: pick one of the editions of frankenstein. read the introduction and supplemental materials. what is the edition’s editorial approach? what theoretical approach to understanding the novel do you detect in its editorial approach? discuss with reference to andrew burkett’s reading of the various media surrounding frankenstein. discussion and metadata project: andy burkett joins us to discuss his article. we will also annotate shelley’s “the mortal immortal” with hypothes.is (https://web.hypothes.is/). discussion: mary shelley’s frankenstein.
readings due:
mary shelley, frankenstein. third edition. ed. d.l. macdonald and kathleen scherf.
*mary shelley, frankenstein: annotated for scientists, engineers, and creators of all kinds. ed. david guston, ed finn, and jason scott robert. cambridge: mit press, .
mary shelley, frankenstein. ed. stuart curran. romantic circles electronic editions. https://www.rc.umd.edu/editions/frankenstein
mary shelley, frankenmoo. ed. ron broglio and eric sonstrem. romantic circles electronic editions. http://homes.lmc.gatech.edu/~broglio/rc/frankenstein/index.htm
mary shelley, “the mortal immortal” ed. michael eberle-sinatra. romantic circles electronic editions. https://www.rc.umd.edu/editions/mws/immortal/index.html
*andrew burkett, “mediating monstrosity: media, information, and mary shelley’s frankenstein.” studies in romanticism. (winter ): . . - .

date and topic/theme: / civilized and savage
in-class: travel: pick one of the localities darwin describes in the voyage. compare that description of travel with the framing story from frankenstein. how does darwin’s depiction of nature compare with shelley’s? discussion: race and racism in the theory of evolution. workshop: present your -page close reading to us. give us your central arguments and show how you back up these arguments with evidence from the text.
readings due:
*charles darwin, the voyage of the h.m.s. beagle, or, journal of researches. new york: p.f. collier and son, . “porto praya,” “tierra del fuego,” “valparaiso,” “portillo pass,” “galapagos archipelago,” and “mauritius.”
*cannon schmitt, “charles darwin’s savage mnemonics.” charles darwin and the memory of the human: evolution, savages, and south america. cambridge: cambridge up: . - .
assignments due: write a -page close reading of a primary source. helpful: eric hayot, “the uneven u” and “structure and subordination.”

date and topic/theme: / nature and ecology
in-class: narrative: pick out two examples from the reading of anthropomorphism in on the origin of species. give us a close reading of how anthropomorphism works in your example and how it compares with erasmus darwin’s work on analogy. discussion: evolution and the non-human. workshop: revisions of your close reading.
readings due:
*charles darwin, on the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. new york: d. appleton and company, . “introduction,” “struggle for existence,” “natural selection,” “on the lapses of time,” “on the geological succession of organic beings,” “recapitulation and conclusion.”
*gillian beer, “fit and misfitting: anthropomorphism and the natural order.” darwin’s plots: evolutionary narrative in darwin, george eliot, and nineteenth-century fiction.
assignments due: use at least one piece of advice from last week and revise your close reading. helpful: eric hayot, “a materialist theory of writing” and “how do readers work?”

date and topic/theme: / human and animal
in-class: historical readings: pick one appendix from the broadview edition (not the reviews or evolution ), read the supplemental materials, and give us a -minute summary on what you learned. discussion: animal studies and wells. workshop: introductions and scholarly conversations.
readings due:
h.g. wells, the island of dr. moreau. ed. mason harris. peterborough: broadview press, .
*kate benston, “experimenting at the thresholds: sacrifice, anthropomorphism, and the aims of (critical) animal studies.” pmla. . ( ): - .
assignments due: write a draft of your introductory and positioning paragraphs. helpful: eric hayot, “introductions” and “institutional contexts”

date and topic/theme: / human and machine
in-class: industrialists and revolutionaries: find one point of contention between babbage and marx regarding machinery and labor. what does this disagreement reveal about their sense of human agency? discussion: marx and the non-human. workshop: bibliographies, secondary and tertiary sources.
readings due:
*charles babbage, the economies of machinery and manufacture. london: charles knight and pall mall, . - . “exerting forces too great for human power and executing operations too delicate for human touch,” and “the division of labor.”
*karl marx, “fragment on machines.” the grundrisse. tran. martin nicholaus. london: penguin, , - .
*karl marx, “the labour process and the valorization process.” capital: a critique of political economy. volume . tran. ben fowkes. london: penguin, . - .
*tamara ketabgian, “human parts and prosthetic networks: the victorian factory and mesmeric forces.” the lives of the machines: the industrial imaginary in victorian literature and culture. ann arbor: u of michigan press, . - . https://quod.lib.umich.edu/d/dcbooks/ . . / : /--lives-of-machines-the-industrial-imaginary-in-victorian?g=dculture;rgn=div ;view=fulltext;xc= # .
assignments due: write a -source annotated bibliography of secondary and tertiary sources. include a one-paragraph description summarizing its content and how you will use it in your paper. eric hayot, “citational practice” and “work as process”

date and topic/theme: / fiction and utopia
in-class: historical readings: pick one appendix from the broadview edition (not the reviews or evolution ), read the supplemental materials, and give us a -minute summary on what you learned. discussion: morris and williams. workshop: outlining and structure.
readings due:
william morris, the news from nowhere. ed. stephen arata. peterborough: broadview press, .
raymond williams, “utopia and science fiction.” science fiction studies , no. ( ): – . https://www.depauw.edu/sfs/backissues/ /williams art.htm
assignments due: bring a draft outline of your final project to class. helpful: eric hayot: “structure and subordination” and “ending well”

date and topic/theme: / history and the future
in-class: positioning jameson and utopia: pick one chapter from archaeologies that demonstrates wegner’s argument about jameson in his essay. tell us what jameson is arguing and how it revises williams’s discussion of base and superstructure. lecture: freud, lacan, marx, and jameson. discussion: the desire called utopia.
readings due:
fredric jameson, archaeologies of the future: the desire called utopia and other science fictions. new york: verso, . “part : the desire called utopia”
*philip e. wegner, “other modernisms: on the desire called utopia.” periodizing jameson: dialectics, the university, and the desire for narrative. evanston: northwestern up, . - .

date and topic/theme: / women and men
in-class: utopias: examine jameson’s reading of bloch and compare it with muñoz’s. how does each theorist’s reading of desire change their understanding of utopia? discussion: gilman and muñoz. workshop: present your drafts in class.
readings due:
charlotte perkins gilman, herland and related writings. ed. beth sutton-ramspeck. peterborough: broadview press, .
*josé esteban muñoz, “introduction” and “chapter .” cruising utopia: the then and there of queer futurity. new york: nyu press, . - .
assignments due: bring the equivalent of pages of material to class that you will merge with the -page close reading you’ve already completed. helpful: eric hayot, “eight strategies for getting writing done” and “work as a process”

date and topic/theme: colonialism and occultism
in-class: historical readings: pick one appendix from the broadview edition (not the reviews or evolution ), read the supplemental materials, and give us a -minute summary on what you learned. discussion: colonialism as cognitive estrangement.
readings due:
h. rider haggard, she: a history of adventure. ed. andrew stauffer. peterborough: broadview press, .
*john rieder. “the colonial gaze and the frame of science fiction.” colonialism and the emergence of science fiction. middletown: wesleyan up, . - .

date and topic/theme: black and white
in-class: comparisons: pick out two similarities between haggard’s and hopkins’s novels. how does hopkins transform haggard’s colonial adventure story? discussion: afrofuturism and colonialism. workshop: present your project to the class. give at least suggestions for improvement.
readings due:
pauline hopkins, of one blood: or, the hidden self. new york: washington square press, .
*lisa yaszek, “afrofuturism in american science fiction.” the cambridge companion to american science fiction. ed. gerry canavan and eric carl link. cambridge, cambridge up, . - .
assignments due: bring a complete draft of your final project to class.

date and topic/theme: then and now
in-class: anachronism: find examples of how butler discusses black female subjectivity in kindred. how do these align with or dispute christina sharpe’s understanding of living in the wake of slavery? text engagement: history and afrofuturism.
readings due:
octavia butler, kindred. new york: beacon press, .
*christina sharpe. “the wake.” in the wake: on blackness and being. durham: duke up, . - .
assignments due: exchange drafts with another student, read, and offer at least suggestions that could make the project better.

final week
final projects due.
course values: inclusion: your success in this class is important to me. we will all need accommodations because we all learn differently. if there are aspects of this course that prevent you from learning or exclude you, please let me know as soon as possible. the sooner i know about these, the earlier we can discuss possible adjustments or alternative arrangements that might help you. if you have a documentable disability, please visit the access center (washington building ; . . ) to schedule an appointment with an advisor. collaboration and reading: i see reading to be a collective project — not an individual one — that emphasizes inclusion, good faith, and comradery. i reject the vision of graduate education that separates, marginalizes, and intimidates various groups of students for the veneration of a few, who are usually white and male. as such, i do not expect any one of my students to read and know everything — in fact, i would be suspicious of a student who presented themselves in that way. instead, we collectively construct the critical dimensions of our intervention into culture, literature, theory, and life. your participation is essential in that process. university of florida professor philip wegner encourages his students to form a dialogue with the readings in the course “being attentive to their respective voices, acknowledging their particular historical and otherwise contingent beings-in-the-world, and finally working to imagine how we today might best retool the insights and modes of analysis of their various ‘unfinished projects.’” many of our readings will be progressive in some ways and regressive in others. i ask that, however possible, you bracket your initial emotional response to what you read and develop complex insights to the works we examine. 
consider that your first encounter with these authors might not exhaust their power or importance to your education, and whether or not you like a particular work may have no relevance to its importance as a historical or ideological source of knowledge. some of my favorite books were ones i didn’t like when i first read them. . academic honesty: everyone in this class, including me, must abide by the standards of academic honesty set up by washington state university. see that statement here: http://wsulibs.wsu.edu/library-instruction/plagiarism. i work hard to model appropriate academic citation. please see me if you are unclear about any of these requirements. email: i would rather talk to you in person than via email, since email depersonalizes the exchange and makes it easier for me to misinterpret what you mean. if emailing me is necessary, please allow me at least hours to respond to your email inquiries. i try to respond in a timely manner, but i do not always check my email when not in town or on the weekends safety: washington state university is committed to enhancing the safety of the students, faculty, staff, and visitors. it is highly recommended that you review the campus safety plan (http://safetyplan.wsu.edu/) and visit the office of emergency management website (http://oem.wsu.edu/) for a comprehensive listing of university policies, procedures, statistics, and information related to campus safety, emergency management, and the health and welfare of the campus community. http://wsulibs.wsu.edu/library-instruction/plagiarism https://safetyplan.wsu.edu/ https://oem.wsu.edu/ sources readings: jay clayton. “ th-century science and science fiction.” fall : vanderbilt u. benjamin morgan. “victorian speculative fiction.” [warning: pdf]. fall : u of chicago. philip e. wegner. lit : bridging the pernicious chasm: utopia, dystopia, and science fiction.” fall : u of florida. policies and design: john aycock and jim uhl. 
“choice in the classroom.” acm sigcse bulletin. . ( ): - . ashley boyd. “ young adult literature.” fall : washington state u. ashley boyd. “critical theory, literacy, and pedagogy.” fall . washington state u. anne-marie womack, annelise blanchard, cassie wang, mary catherine jessee. accessible syllabus. web. august . anne-marie womack. first-year writing: rhetoric and research in the digital era. spring : tulane. philip e. wegner. “eng . literary theory: künstlerroman.” fall : u of florida. https://vusf.wordpress.com/about/ http://v collective.org/wp-content/uploads/ / /vict-speculative-fiction.doc http://users.clas.ufl.edu/pwegner/lit f syllabus.htm https://accessiblesyllabus.tulane.edu/ https://dl.acm.org/citation.cfm?id= https://accessiblesyllabus.tulane.edu/wp-content/uploads/sites/ / / /syllabus-after- .jpg

long-term digital preservation: a digital humanities topic?
h. m. gladney
saratoga, ca, /

abstract: we argue that the so-called digital humanities fail to meet conventional criteria to be an accredited field of study on a par with literature, chemistry, computer science, and civil engineering, or even a specialized professorial emphasis such as ancient history or nuclear physics. the argument uses long-term digital preservation as an example to argue that digital humanities proponents’ case for their research agenda does not merit financial support, emphasizing practical aspects over subjective theory.

we are today as far into the electric age as the elizabethans had advanced into the typographical … age. and we are experiencing the same confusions … which they had felt when living simultaneously in two contrasted forms of society and experience. [mcluhan]

the exhaustion, the surfeit, the pressure of information have all been seen before. … this time it is different.
we are a half century further along and can begin to see how vast the scale and how strong the effects of connectedness. [gleick] formal academic recognition of digital work in the humanities remains problematic. socially this has to do with the slow pace of institutional change. intellectually it has to do with the poorly understood nature of non-verbal knowledge-bearing objects. curatorially it raises the problem of how such knowledge-bearing objects are to be preserved for the long term. culturally it runs afoul of the low status given to works of popular culture—multimedia, documentaries, interactive games, and [so on]—which tend to be dismissed as entertainment. the increasing number of digital humanities articles suggests … that serious attention is urgently needed for understanding and preserving digital objects. “digital humanities” (dh) is the name chosen by an interest group that is promot- ing their activities for funding and for inclusion in university faculties. digital document preservation is prominent among the topics proposed for investigation by this interest group. for an upcoming workshop debate, manfred thaller asked me to present a case for denying the requested support, arguing that dh is not a worthy “a considerable part of the gear and tackle of print media—now taken for granted, invisible as old wallpaper—evolved in direct response to the sense of information surfeit” (gleick , ). excerpted from http://en.wikipedia.org/wiki/digital_humanities; emphasis added. every cited web page was seen between december , and march , . abbreviations used in the text might depend on the context, as follows: dh “the digital humanities” or else “digital humanities”; a.k.a. 
“e-humanities”; dhp “dh proponents” or else “a typical dh proponent (david howard potter)”; dl “digital library” or “digital libraries”; ldp “long-term digital preservation”; se “science and engineering”, as represented in university faculties; swe “software engineering” or else “a typical software engineer (samuel william east)”. in fact, digital preservation is the only specific dh research topic i found in recent digital humanities quarterly articles. academic discipline by discussing research into long-term digital preservation (ldp), and requested this advance position paper. an outsider may be pardoned for murky understanding of what is meant by ‘the digital humanities’. even insiders are struggling with fuzzy boundaries, as might be expected of any new activity. for instance, the following excerpt typifies web- accessible comments. our definitions are often a little muddy. (melissa terras, in a keynote presentation at [the ] dh conference, called the community to task for hemming and hawing: “it's... kinda the intersection of...”) we need to get better at this! … cuny’s dh initiative has published a beginner's resource guide to the digital humanities, which includes links [to] definitions and pages [about] sample projects, basic readings, and “hot topics” in dh, … patrick svensson has a solid piece in dh quarterly called the landscape of digital humanities. a post by a uva graduate student, chris forster, attempted to define dh … [as having] four areas of activity—(i) use of computational methods for research; (ii) new media studies; (iii) how technology reshapes the humanities classroom; and (iv) how it reshapes scholarly communication and academic roles. a recent conference call asserts simply, “dh is the nexus of computing and the humanities”. and the content of borgman ( ) suggests that much of what dhp describes is covered by information science faculties. 
before proceeding further, we should compare the following definition and description of information science (is), a collection of topics that has been recognized as an academic discipline for approximately thirty years, i.e., much earlier than any mention of dh! information science (or information studies) is an interdisciplinary field primarily concerned with the analysis, collection, classification, manipulation, storage, retrieval and dissemination of information. practitioners within the field study this draft responds to an invitation to participate in an april debate: the cologne dialogue on digital humanities. http://digitalhumanities.org/answers/topic/what-is-digital-humanities. http://dh .cch.kcl.ac.uk/. http://news.stanford.edu/news/ /june/digital-humanities-conference- .html. see also (anon) at http://shapeofthings.org/resources.html. adapted from http://en.wikipedia.org/wiki/information_science. the application and usage of knowledge in organizations, along with the interaction between people, organizations and any existing information systems, with the aim of creating, replacing, improving or understanding information systems. information science is often (mistakenly) considered a branch of computer science. however, it is actually a broad, interdisciplinary field, incorporating not only aspects of computer science, but often diverse fields such as archival science, cognitive science, commerce, communications, law, library science, museology, management, mathematics, philosophy, public policy, and the social sciences. information science focuses on understanding problems from the perspective of the stakeholders involved and then applying information and other technologies as needed. in other words, it tackles systemic problems first rather than individual pieces of technology within that system. 
in this respect, information science can be seen as a response to …, the belief that technology "develops by its own laws, that it realizes its own potential, limited only by the material resources available, and must therefore be regarded as an autonomous system controlling and ultimately permeating all other subsystems of society." within information science, attention has been given in recent years to human– computer interaction, groupware, the semantic web, value sensitive design, iterative design processes and to the ways people generate, use and find information. today this field is called the field of information, and there are a growing number of schools and colleges of information. comparison of the definitions of dh and is suggests that dh is an unneeded inven- tion! any scholarly group may reasonably name its shared topics however it pleases, provided only that the chosen name does not mislead. so we have little reason to challenge the naming. the substantial issue instead is whether or not dh deserves to be ranked together with long-established university faculties such as history or sub-faculties such as analytical chemistry. or perhaps, instead of judging what is deserved, we should consider whether it will attract respect from the established faculties, and also funding that it seeks from government institutions, such as the u.s. national endowment for the humanities. funding issues are made more important than they might otherwise be by current cutbacks that threaten established university faculties (economist ), (underwood ). this neh supports … training programs for scholars … to extend their knowledge of digital humanities. … neh seeks to increase the number of humanities scholars using digital technology in their research and to disseminate knowledge about [relevant] advanced technology … and methodologies. today, complex data—its form, manipulation, and interpretation—are as important to humanities study as more traditional research materials. 
… digitized historical records … [and] multimedia collections … are increasing in number due to the … affordability of mass data storage devices, … extensive networking capabilities, and sophisticated [software] … improving interactive access to and analysis of these data … the advanced topics in the digital humanities program seeks to enable humanities scholars … to incorporate [such] advances into their scholarship and teaching. to judge the merits, we should consider several dh activities: instruction, pro- posed research, tools development, and analysis of social behavior. the current article examines only technical aspects, leaving other aspects to other commenta- tors. it emphasizes objective over subjective aspects because, whenever doing so is sensible, these tend more rapidly towards debate closure. when coleridge tried to define beauty, he returned always to one deep thought: beauty, he said, is ‘unity in variety.’ science is nothing else than the search to discover unity in the wild variety of nature—or more exactly, in the variety of our experience. poetry, painting, the arts are the same search, in coleridge's phrase, for unity in variety. (bronowski , ) what is it that computer scientists and software engineers do? their projects be- gin (logically) with abstraction. however, “some people ... think that the current abstractions of computer science ... [and] algorithms handling [them] need to be circumstance makes it appropriate to ask each dh funding applicant questions along the following lines. ( ) since university funding will not today increase, what do you recommend be given up to support e-humanities as you recommend? what balancing cuts should be made within your own university? ( ) what do you yourself propose to accomplish? how much new funding will this require? why is it needed and how is it justified? 
( ) why would your proposed research be better done in a humanities faculty than by current scientific or engineering faculty in your own university? extract from http://www.neh.gov/grants/guidelines/iatdh.html. coleridge traced [this definition] back to pythagorus: “the safest definition … of beauty, as well as the oldest, is that of pythagorus: the reduction of many to one” (bronowski , ). bronowski ( ) p. ff. provides an eloquent characterization of abstraction and its social role. adapted to fit the requirements of the humanities.” to react to such an assertion, we need specific descriptions of the adaptations they call for—descriptions seem- ingly not yet available. with these in hand, we would surely ask, “what skills are needed to provide what's called for? should we find an e-humanist for such work, or should we find a software engineer?” imagine a debate between a prototypical digital humanist, dhp, and a software engineer, swe—a debate in which swe responds to some vague dhp assertion by asking for specific, relatively objective examples. dhp might respond in some way that does not satisfy swe, leading him to request more specificity/objectivity. if this process continues for several rounds, dhp might respond angrily along the lines of, “dr. swe, your background seems insufficient for you to understand!” how might swe respond? it seems likely that he might say (or think, even if he is too polite to say), “well, since you seem unable to explain it for students, you are not qualified for a dh professorship!” a likely outcome of such debate is that, while the e-humanist community con- siders such questions, perhaps even writing articles about them, software engineers will provide responsive tools—ones that even address human factors not even iden- tified. and these engineers are likely to finish and deploy their work earlier than the e-humanists reach consensus about their opinions! 
this is likely because any objective specification of what's wanted is surpris- ingly close to specification of satisfying software. and turning specifications into implementations is what software engineers do! an example: long-term digital preservation a -jan- invitation included a conference description asserting: “preserving digital artefacts is a global challenge, which has not been solved conclusively as yet.” burgess and hamming ( ) elaborate as follows: institutional interest in exploring the possibilities for digital scholarship, after an initial flurry of activity followed by something of a hiatus, seems to be gaining impetus again. we have recently seen the establishment of new granting initiatives … as well as a general "buzz" about digital scholarship epitomized by articles in the chronicle of higher education and elsewhere, culminating in standing room only panels on digital humanities at the mla conferences … innovative work … is gaining ground among a growing cohort of digital scholars. from thaller’s notes with his workshop announcement. see http://computerspielemuseum.de/documents_public/veranstaltungen/keep _emulation_expert_workshop_berlin.pdf … scholars in the digital humanities are now starting to explore … technical and rhetorical problems of … preserving "born digital" creative works … but what about “born digital” scholarship … that never had a print analog? very few theorists have attended to this category … the work of new media researchers in the humanities tends to get lumped into a single category rather than … distinct categories of scholarship rendered in new media and scholarship about new media. institutionally, this distinction is crucial for upcoming scholars, since much of the contention centers around originality of content: if the multimedia format of the work is essential to … the argument it presents, where should it count—as a work of scholarship … or as a reworking of an existing argument? 
thus it is important to distinguish … between ‘scholarly multimedia' and other terms frequently used … . by scholarly multimedia we specifically mean critical scholarly works— interpretive and argumentative, as opposed to creative or archival—that are produced, and [perhaps] performed, in multimedia form. these works represent a new rhetorical genre of scholarship … that differs from multimedia art or hypertext fiction … such excerpts suggest questions that, as far as i know, have not been adequately answered in any professional publication. ( ) what criteria must be satisfied for a digital preservation method to be judged a solution in principle? ( ) over and above an answer to ( ), what criteria must be satisfied for a digital preservation method to be judged a practical solution? ( ) over and above answers to ( ) and ( ), what criteria must be satisfied for world-wide digital preservation practice to be judged socially satisfactory? epistemological bases an article that i no longer can identify referred to a “technical hard core to preser- vation, rather than just librarianship”. such phraseology suggests the importance of named topics being clearly identified and overlapping minimally. terms such as anybody who disagrees with “not adequately answered” is invited to cite contradictory articles. a ‘solution in principle’ is a methodological prescription that, were it to be implemented by software engineers and repository managers, would be adequate. a ‘practical solution’ is an implementation that pilot installations have demonstrated to be satisfactory. ‘socially satisfactory’ calls for managed infrastructure (perhaps within a digital repository network) that satisfies anybody who wants some particular information to endure for some specified period. (s)he would be satisfied if (s)he deemed reliable institutional promises for the service alluded to, and if fees for such service were reasonable. 
‘digital library’ and ‘archiving’ had been used for two decades before anybody mentioned ‘digital preservation’. the current article, therefore, limits ‘digital preservation’ to extensions beyond digital document management suggested by gladney ( ).

just how important and useful this tactic is can be seen by considering difficulties in burgess and hamming ( ). many of these simply disappear if one partitions communication processes into steps and intermediate message representations describing how an information bundle moves from the mind and space of its author to those of its eventual recipient(s).

figure : human and machine roles in sharing documents (simplification of gladney ( ) [figure ]; see also (oais))

part of what makes for clear analytical description is explicit attention to distinctions taught by th-century epistemology (coffa ). compare the style of bootz, szoniecky and bargaoui ( ) to analyses of communication steps hidden behind what [figure ] suggests. bootz et al. make no use of helpful basic distinctions:
• between objects and values: in most information preservation, what is to be preserved is some pattern (a value) inherent in one or more representations, each embodied in an object that can be transmitted (nimmer ). multiple representations can reduce (without eliminating) ambiguity between which information is essential and which is accidental.
• between accidental and essential information, an obviously subjective distinction. for instance, a poet might or might not intend page layout to be important. although common conventions emphasize artists' intentions, sometimes observers' intentions dominate a discussion, such as when an observer is trying to achieve something practical, as might occur in deciding whether a painting is indeed from the purported artist.
• between analog and digital information representations and, for the former, questions of precision and noise.
digital information can be transmitted without any error whatsoever. in contrast, moving information from one human being to another usually has steps with analog signals and therefore cannot avoid distortions and subjective decisions about what is good enough.

what should a digital preservation solution accomplish? as a minimum, it should:
• ensure that a copy of every preserved document survives as long as it might interest somebody;
• ensure that authorized consumers can find and use any preserved document as its producers intended, avoiding errors introduced by third parties that include archivists and editors;
• ensure that any consumer can reliably decide whether information received is sufficiently trustworthy for the intended application;
• hide technical complexity from end users; and
• replace human effort by automatic procedures whenever feasible.

conceptual difficulties

digital data … is analogous to infrastructure in the physical world … and like physical infrastructure, we want our data infrastructure to be stable, predictable, cost-effective, and sustainable. creating systems with these and other critical characteristics … involves tackling a spectrum of technical, policy, economic, research, education, and social issues. the management, organization, access, and preservation of digital data is arguably a “grand challenge” of the information age. (berman)

published difficulties of long-term digital preservation prove to be largely confusions with language. similar difficulties were addressed in early twentieth-century philosophy. we describe prominent confusions, show how to clarify the issues, and summarize a method that solves all the technical challenges described in the literature. other reports provide detailed design and analysis of the [proposed] tdo method. a purpose of the current article is to invite searching public criticism before anyone invests significant resources in creating preservation data objects. (gladney )
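the requirement above that a consumer be able to decide whether received information is trustworthy rests partly on the property that digital copies can be compared for exact equality. the following is a minimal sketch of such a comparison, not the tdo method the quoted abstract refers to. it assumes sha-256 as the digest (the text names no algorithm), and the function names `sha256_digest` and `is_bit_identical` are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def is_bit_identical(received: bytes, recorded_digest: str) -> bool:
    """Check that received bytes match a digest recorded at deposit time.

    Assumption: the recorded digest itself reached the consumer through
    a channel the consumer has reason to rely on.
    """
    return sha256_digest(received) == recorded_digest


# Hypothetical document content, used only for illustration.
deposited = b"a preserved document"
recorded = sha256_digest(deposited)

unchanged = is_bit_identical(deposited, recorded)
altered = is_bit_identical(b"a preserved d0cument", recorded)
print(unchanged, altered)
```

a matching digest shows only that the bits are unchanged since the digest was recorded. it says nothing about who created the document or whether the recorded digest is itself reliable, which are provenance questions.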
(gladney ) before addressing technology, we need to understand what people mean by ‘docu- ment preservation’, or at least achieve clarity about different concepts used by different communities. such concepts can be independent of the document media, i.e., the same for documents on paper, audio and video recordings on magnetic media and vinyl platters, and for digital objects that are shared. early digital archive literature is full of misunderstandings of basic concepts. for instance, articles about ‘trusted digital repositories’ betray problems that call their direction into question. confusion between ‘trusted’ and ‘trustworthy’ misled investigators into focusing on repositories rather than on content objects. for instance, beagrie et al ( ) call for certification that an institution has correctly executed sound preservation practices. repository-centric proposals have unavoidable weaknesses: • they depend on an unexpressed premise—that exposing an archive’s procedures can persuade its clients that its content deliveries will be authentic. such procedures have not yet been described, much less justified as achieving what their proponents seem to assume. • audits of an archive—no matter how frequent these are—cannot demonstrate that its contents have not been improperly altered years before a sensitive document is accessed. in a century or so, nobody will care about the capabilities and weaknesses of today’s repositories. instead, what people will want to know whether digital content they can fetch is credibly authentic. in casual conversation, we often say that the copy of a recording is authentic if it closely resembles the original. but consider, for example, an orchestral perform- ance, with sound reflected from walls entering imperfect microphones, signal changes in electronic circuits, and so on, until we finally hear the soundtrack of a television rendering. which of many different signal versions is ‘the original’? 
difficulties with ‘original’ and ‘authentic’ are conceptual. nobody creates an artifact in an indivisible act. what is an acceptable original is somebody’s subjective choice. when such an original has been chosen, we can describe it objectively with provenance metadata expressing everything important about the creation event. we can then judge authenticity relative to that version, and be understood.

(note: what is intended here are analog recordings such as those of the first half of the th-century.)
(note: we know how to make information trustworthy for specified applications, but do not know how to ensure that information deliveries are trusted by eventual recipients.)

conventional definitions, such as “authentic: of undisputed origin; genuine.” (concise oxford english dictionary), do not help much. for signals, for material artifacts, and even for natural entities, the following definition captures what people mean when they say ‘authentic’.

given a derivation statement r, “v is a copy of y ($v = c(y)$)”, a provenance statement s, “x said or created y as part of event z”, and a copy function, “$c(y) = t_n(\dots(t_2(t_1(y))))$”,
we say that v is a derivative of y if v is related to y according to r.
we say that “by x as part of z” is a true provenance of v if r and s are true.
we say that v is sufficiently faithful to y if c conforms to social conventions for the genre and for the circumstances at hand.
we say that v is an authentic copy of y if it is a sufficiently faithful derivative with true provenance.

here ‘copy’ means either “later instance” or “conforming to a specific conceptual object”. each $t_k$ represents a transformation that is part of a [figure ] transmission step and that potentially alters the information carried. to preserve authenticity, the metadata accompanying the input in each transmission step should be extended by a $t_k$ description.
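the definition above can be sketched in code. this is an illustration under stated assumptions, not an implementation from the source: the names `Step`, `Carried`, `transmit`, and `is_authentic_copy` are hypothetical, the truth of the provenance statement s is passed in as a flag because it cannot be computed from the data, and "sufficiently faithful" is a caller-supplied predicate standing in for the social conventions of the genre.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Step:
    """One transformation t_k, with the description recorded for it."""
    name: str                        # what the transformation does
    agent: str                       # who made this t_k choice
    apply: Callable[[bytes], bytes]  # the transformation itself


@dataclass
class Carried:
    """A value together with the metadata that accompanies it."""
    value: bytes
    provenance: str  # statement s: "x said or created y as part of event z"
    history: List[Tuple[str, str]] = field(default_factory=list)


def transmit(item: Carried, steps: List[Step]) -> Carried:
    """Compute c(y) = t_n(...(t_2(t_1(y)))), extending metadata per step."""
    value = item.value
    history = list(item.history)
    for step in steps:
        value = step.apply(value)
        history.append((step.name, step.agent))
    return Carried(value, item.provenance, history)


def is_authentic_copy(
    v: Carried,
    y: Carried,
    steps: List[Step],
    provenance_is_true: bool,
    faithful_enough: Callable[[bytes, bytes], bool],
) -> bool:
    """v is authentic if it is a sufficiently faithful derivative of y
    with true provenance."""
    is_derivative = transmit(y, steps).value == v.value  # statement r
    return (
        is_derivative
        and provenance_is_true
        and faithful_enough(y.value, v.value)
    )


# Hypothetical example: two transmission steps applied to a short text.
y = Carried(b"Hello, World", "x created y as part of event z")
steps = [
    Step("strip trailing whitespace", "archive a", bytes.rstrip),
    Step("lowercase", "archive b", bytes.lower),
]
v = transmit(y, steps)


def same_ignoring_case(a: bytes, b: bytes) -> bool:
    """Stand-in convention: letter case is accidental, not essential."""
    return a.lower() == b.lower()


print(v.history)
print(is_authentic_copy(v, y, steps, True, same_ignoring_case))
```

the history list grows by one entry per transmission step, so a consumer can see which transformations were applied and by whom before judging whether the copy is faithful enough for the purpose at hand.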
these metadata should identify who made each t_k choice and all other aspects important to consumers’ judgments of authenticity.

… reflecting on the challenge … for ensuring the reliability and authenticity of records that lack a stable form and content. the ease with which [dynamic documents] can be manipulated has given … a new reason for keeping them: ‘repurposing’. … we have to consider the possibility of substituting the characteristics of completeness, stability and fixity with the capacity of the [repositories] to trace and preserve each change the record has undergone. and perhaps we may look at the record as existing in one of two modes, as an entity in becoming … and as a fixed entity at any given time the record is used. … strategies must be developed … for both the creators and preservers … (duranti )

we disagree! neither our careful definition of ‘authenticity’ nor any other work suggests that ‘dynamic documents’ (representations of artistic and other performances) present a new or difficult preservation problem. what is different for different object kinds is merely the ease and frequency of change and of copying.

a repeat of an earlier performance would be called authentic if it were a faithful copy except for a constant time-shift. this can describe any kind of performance. its meaning is simpler for digital documents than for analog recordings or live performances because digital files already reflect the sampling errors of recording performances that are continuous in time.

the authors expressing difficulty with dynamic digital objects do not express similar uncertainty about analog recordings of music or television performances. perhaps their confusion is a misunderstanding of language, as suggested by wittgenstein ( , . ).

the “digital curation” concept is still evolving.
[lee] defines it as follows:

digital curation involves selection and appraisal by creators and archivists; evolving provision of intellectual access; redundant storage; data transformations; and, for some materials, a commitment to long-term preservation. digital curation is stewardship that provides for the reproducibility and re-use of authentic digital data and other digital assets. development of trustworthy and durable digital repositories; principles of sound metadata creation and capture; use of open standards for file formats and data encoding; and the promotion of information management literacy are all essential to the longevity of digital resources and the success of curation efforts. digital preservation is typically regarded as a key subset of digital curation. (bailey )

the social challenge and the essence of its solution are conceptually simple. without careful management, recorded information gradually would become inaccessible (rosenthal et al ). impediments include changing language. for works on paper, it might take centuries before readers are no longer comfortable with the language used. for digital documents, this period is today much shorter, partly because rendering technology is still changing rapidly and partly because usability expectations are higher than for information on paper.

both the social and technical structure of any ldp solution should parallel that for documents on paper. the only exceptions should address aspects for which we can identify reasons for deviation.

[footnote: this need not be because a book is written in latin. it can also be because key expressions, idioms, and metaphors are no longer commonly understood as their authors intended. the most sensitive examples are computer programs, for which a single changed bit might impede use.]
[footnote: an abstract reason for this assertion is occam’s razor compliance. practical reasons include that doing so will take advantage of library management practice developed over more than a century and that the resulting procedures can be designed to seem familiar to repository personnel and their clients.]

the traditional roles of repositories include acquiring, saving (including redundant copies), and sharing “interesting” content objects. they include editing that content and associated metadata only if available sources cannot make satisfactory copies available. this occurs for no more than a tiny fraction of the worthwhile literature. instead, editing and describing documents and records are traditional responsibilities of outside communities, such as those of authors, editors, and publishers.

surprisingly, librarians, archivists, and their faculty colleagues do not seem to see it this way. many of their published articles propose work on or methods for preserving digital content by extending the role of repository institutions, and prominent members of the dh community call ldp a “grand challenge” (lee and tibbo ), (berman).

we disagree. an ldp solution will not be a prescription for repository management, but instead a method for making digital objects durably useful, readily sharable, and durably trustworthy: a scheme for representing content. the next section sketches one such scheme.

an unsolved challenge is caused by the immense increase in the number of books, papers, periodicals, memorabilia, technical data (berriman and groom ), and other digital objects published. the fraction of this flood meeting any dispassionate quality criterion has probably decreased, so that what one needs to read to be well-informed has not grown nearly as quickly. and information technologists have provided, and are refining, tools that make finding the answer to any well-formulated question (if that question has in fact been addressed) much easier than it was either a decade ago or a century ago.
the remaining problems are social: making the tools easy to use, teaching the public how to do so, and choosing criteria for repository accession. the last does not seem to call for research, because it will be a matter of subjective choice by each repository community. the solution for these challenges cannot be hurried, but instead will be worked out socially over a few decades.

if there is a big problem, it seems to be that the dh community has perhaps not noticed, perhaps not comprehended, and certainly not acknowledged manifest continuing technical progress. c.p. snow’s gap between “two cultures” is still evident!

[footnote: at least, their publications suggest this.]
[footnote: the hyperbolic phrase “exponential growth” has lost much of its original force. however, published information has, in fact, experienced exponential growth.]
[footnote: objective judgment of this fraction would be difficult, even if one could achieve consensus about subjectively chosen criteria. the assertion might, however, be agreed by thoughtful critics who have experienced a growing flood of scholarly articles that teach us little that we did not already know and also wanted to know.]

a technical solution

an in-principle solution for the ldp requirements summarized above was published as early as . later, gladney ( ) disagreed with most other preservation authors by asserting that the technical core could not be procedures for managing digital repositories, but instead had to be a scheme in which a single file could package a “complete” information corpus.

the scheme for such a “trustworthy digital object” (tdo), which represents some document together with subjectively chosen critical context, is suggested by [figure ]. its most important properties follow.
• representing bit-strings are packaged with registered schema.
• the package includes or links reliably to all metadata needed for interpretation and as authenticity evidence.
• these bit-strings and metadata are encoded to be platform-independent and durably intelligible.
• every critical link to another tdo is secured by a cryptographic message authentication code.
• all this is sealed using cryptographic certificates based on public-key message authentication, with each cryptographic certificate authenticated by a recursive certificate chain grounded in a public reliable source.

[footnote: supportive evidence can easily be gathered by inspecting citations made by dh authors.]
[footnote: gladney ( ) provides a more thorough description and analysis than that in the following synopsis.]
[footnote: for some data classes, representations approaching obsolescence might have to be superseded, perhaps as often as every decade. a fail-safe way of doing this is known. implementations can be executed as batch processes that use “waste” computer cycles.]

figure : schema for a preserved information package

several articles describing this work requested public criticism before implementing pilots to test and demonstrate the ideas’ correctness and practicality. i paused, waiting for reactions. over eight years later, almost nobody has commented, nothing distinctly different and workable has been published, and repetitive preservation conferences seem remarkably similar to their counterparts of a decade earlier (gladney ). how could this happen? surely part of the problem is dh community inattention to software engineering literature.

summary and conclusion

an aspect mostly missing from ldp literature is a sense of history-in-the-making. a few commentators, following marshall mcluhan's the gutenberg galaxy, suggest that the “digital revolution” has a precedent sometimes called “the gutenberg revolution”. they point out that the social changes stimulated by the invention of movable type required about a century to play out. only years have passed since e-mail became available, and only years since the first digital libraries were deployed.
if we are indeed experiencing a digital revolution, it is only in its early days. if so, it might be silly for scholars to debate how it should work out. society will, over time, decide. a tiny group of scholars can sometimes influence society. but is the current issue such a case?

we further wonder, “why might scientists and engineers intuitively feel that dh does not merit high respect?” this might be because some dh publications display appalling inattention to prior work, such as the seminal epistemology (coffa ( ), mcdonough ( ), pincock ( )) that has long provided fundamentals of their topics. it also might be because dh does not have the richness and complexity of topics such as nuclear physics and nuclear engineering. for instance, i seem able to make meaningful comments on most dh papers and expect that i could, with a few weeks of self-education, even publish in a dh periodical. i could not manage an equivalent feat in any scientific or engineering field, not even in those of my formal education!

[footnote: while writing these notes, i discovered philosophical support for my position in bootz, szoniecky and bargaoui ( ).]
[footnote: mcluhan did not write about digital media, but rather about electronic communication of any kind.]
[footnote: for ldp, this has been illustrated by burgess and hamming ( ). a more egregious example occurs in gochenour’s discussion of mathematical graphs ( ). this fails to cite carnap’s the logical structure of the world (pincock ), a seminal epistemology text to which gochenour adds nothing new.]
[footnote: established academic practices demand varied high skills, ranging from deep conceptual thinking to relatively routine mechanical tasks. consider, for instance, chemical physics. it calls for laboratory skills: use of glassware, balances, spectroscopes, and more sophisticated instruments that are essential to most chemistry practice. learning these skills typically occupies % of an undergraduate's scheduled hours, and some chemists spend much time extending or refining such tools. and yet almost nobody confuses these aspects with “being a chemist” or contributing to human knowledge. skill with digital tools is surely necessary for humanities practice, and might require significant time to acquire (either as an undergraduate or, for today's children, in elementary school). however, this is insufficient reason to conflate such mechanistic aspects with what is needed to be a professor of humanities.]

who might be harmed by considering dh to be a discipline in itself? perhaps the community that will suffer the greatest practical disadvantages will be the strongest proponents of an independent dh! many of these might overlook the immense IS literature and its solutions to what they see as dh research challenges, possibly “solving” already-solved questions again. and their articles, labeled as dh literature, are likely to be overlooked by most other scholars, mostly because these never notice the existence of dh or, after their attention is directed to periodicals such as dhq, decide that the just-mentioned weakness of the field merits ignoring dh literature.

the current article illustrates the weakness of proposed digital humanities research agendas by showing that long-term digital preservation, the most prominently featured specific topic in recent dh articles, is a solved challenge for which all that still needs attention is software creation and deployment. unless the dh community can identify other research topics of significant depth and scope, we must conclude that there exists no persuasive dh research agenda, and therefore insufficient reason for establishing dh faculties.

[footnote: the current author was unaware of dh until manfred thaller proposed the dh debate, illustrating the first problem. the ldp example described above illustrates the second problem.]
[footnote: i have not discovered such topics in my dh readings; if they exist, the dh community needs to communicate them as part of seeking funding support and respect.]
[footnote: we must differentiate creation of a dh university (sub)department from appointment by existing departments of individual faculty whose incumbents choose to focus on dh topics. the former would be part of some administrative agenda. proscribing the latter would be an invasion of faculty independence which could reasonably be interpreted as a violation of ethical policy.]

bibliography

anonymous. . online humanities scholarship: the shape of things to come, a tabulation of dh-dl relationship resources, digital humanities quarterly ( ). http://shapeofthings.org/resources.html.
bailey, charles w. . “scholarly electronic publishing bibliography, § . , library issues: information integrity and preservation.” http://www.digital-scholarship.com/sepb/, http://www.digital-scholarship.com/sepb/lbinteg.htm.
beagrie, neil, meg bellinger, robin dale, marianne doerr, margaret hedstrom, maggie jones, anne kenney, catherine lupovici, kelly russell, colin webb, deborah woodyard. . trusted digital repositories: attributes and responsibilities. http://www.oclc.org/research/activities/past/rlg/trustedrep/repositories.pdf.
bentley, paul. . “mastering digital lives: cultural heritage institutions tackle the tower of babel”. online currents. accessed february , . http://www.twf.org.au/research/masteringdigitallives.html.
berman, francine. . “got data? a guide to data preservation in the information age.” communications of the acm ( , - ).
berriman, g. bruce and steven l. groom. . “how will astronomy archives survive the data tsunami.” communications of the acm ( ), - .
bootz, philippe, samuel szoniecky, and abderrahim bargaoui. . “entity/identity: a tool designed to index documents about digital poetry.” paper presented at the symposium e-poetry, barcelona, may - . http://archivesic.ccsd.cnrs.fr/sic_ /en/.
borgman, christine l. . scholarship in the digital age: information, infrastructure, and the internet. cambridge, ma: mit press.
bronowski, jacob. . science and human values. new york: harper & row.
burgess, helen j. and jeanne hamming. . “new media in the academy; labor and the production of knowledge in scholarly multimedia.” digital humanities quarterly ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html.
coffa, j. alberto. . the semantic tradition from kant to carnap to the vienna station. cambridge: cambridge university press.
digital humanities quarterly at http://digitalhumanities.org/dhq/vol/ / /index.html. the alliance of dh organizations promotes and supports digital research and teaching across all arts and humanities disciplines, acting as a community-based advisory force, and supporting excellence in research, publication, collaboration and training.
duranti, luciana. . “the long-term preservation of the dynamic and interactive records of the arts, sciences and e-government: interpares .” documents numérique ( ), - .
economist. . “social media in the th century: how luther went viral”. accessed december, . http://www.economist.com/blogs/babbage/ / /social-media- th-century.
economist. . “university challenge: slim down, focus and embrace technology: american universities need to be more businesslike.” accessed november , . http://www.economist.com/node/ /print.
gladney, h.m. . “are intellectual property rights a digital dilemma? controversial topics and international aspects.” imp magazine (february ).
---. . “a storage subsystem for image and records management.” ibm systems journal , – .
---. . “long-term digital preservation: why is progress lagging?” http://www-e.uni-magdeburg.de/predoiu/sda /gladney.pdf. paper submitted to the nestor workshop on semantic digital archives, berlin, september .
---. . “long-term preservation of digital records: trustworthy digital objects.” american archivist ( ), - .
---. . “principles for digital preservation.” communications of the acm ( ), - .
gleick, james. . the information. new york: pantheon books.
gochenour, phillip h. . “nodalism.” digital humanities quarterly ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html.
lee, christopher a. and helen r. tibbo. . “digital curation and trusted repositories: steps toward success.” journal of digital information ( ). accessed on february , . http://journals.tdl.org/jodi/article/view/ / .
mcdonough, richard m. . the argument of the tractatus: its relevance to contemporary theories of logic, language, mind, and philosophical truth. albany: state university of new york press.
mcluhan, marshall. . the gutenberg galaxy: the making of typographic man. new york, ny: new american library.
nimmer, david. . “adams and bits: of jewish kings and copyrights.” southern california law review : – .
oais reference model - iso . ( ). http://digitalcurationexchange.org/?q=node/ .
pincock, christopher. . “carnap’s logical structure of the world.” http://philsci-archive.pitt.edu/ / /pincock_aufbau_draft.pdf.
rosenthal, david s.h., thomas s. robertson, tom lipkis, vicky reich, and seth morabito. . “requirements for digital preservation systems: a bottom-up approach.” d-lib magazine ( ).
tibbo, helen r. and carolyn hank, christopher a. lee, rachael clemens, eds. digital curation: practice, promise & prospects. proceedings of digccurr . chapel hill, nc, april - , . http://www.ils.unc.edu/digccurr .
underwood, sarah. . “british computer scientists reboot.” communications of the acm ( ): .
von baeyer, hans christian. . information: the new language of science. london: weidenfeld & nicolson.
“digital humanities”. wikipedia. accessed december . http://en.wikipedia.org/wiki/digital_humanities.
wittgenstein, ludwig. . tractatus logico-philosophicus. routledge.

author’s curriculum vitae

henry gladney started research in as a chemical physicist and evolved to physics, to ibm research division management, and finally to computer science.
his directly pertinent contributions include leading prototype development of racf® (resource access control facility), a security product that, years later, is often copied, e.g., as part of unix® file systems. he later designed a digital library service (gladney ) that evolved into today's ibm content manager®, and then collaborated with product developers on protecting people's intellectual property rights (gladney ). since leaving ibm, he devised a digital preservation method (gladney ) for which he is implementing a prototype.

[footnote: the expressions of opinion called for in the call for the cologne dialogue on digital humanities make declaration of each participant’s background appropriate, more so than for other scholarly articles.]
[footnote: hmg’s patents and publications are listed at http://www.hgladney.com/hmgpubs.htm.]

for misuse in our cohort: , pills. conclusion: after an ed visit for acute pain a significant portion of opioids prescribed is unused and available for misuse. a large pragmatic study should be done to confirm that an opioid prescription strategy based on our results will limit unused opioid pills while maintaining pain relief. keywords: opioids

lo opiate prescribing in ontario emergency departments
b. borgundvaag, md, w. khuu, mph, s.l. mcleod, msc, t. gomes, msc, schwartz/reisman emergency medicine institute, toronto, on
introduction: increased prescribing of high potency opioids has been associated with increasing opioid addiction and linked to serious adverse outcomes including misuse, diversion, overdose and death. problems related to opioids are a major canadian public health concern yet few data are available on prescribing in most canadian provinces. the objective of this study was to describe opioid prescribing in ontario eds and patient harms associated with this practice. methods: we conducted a population-based cohort study among ontario residents aged - years who were eligible for public drug coverage between april and march .
using administrative databases, we identified patients with no opioid use in the past months who received a prescription opioid from an emergency or family physician. patients were followed for years following their index prescription. the primary outcome was hospital admission for opioid toxicity and the secondary outcome was dose-escalation exceeding mg morphine equivalents (meq). results: of the , unique patients included, , ( . %) and , ( . %) prescriptions were issued by emergency physicians (ep) and family physicians (fp), respectively. fp patients were older ( . vs . yr, msd . ), had fewer ed visits ( . vs . , msd . ), and more fp visits ( . vs . msd . ) in the year prior to their index visit. for combination products, eps were more likely to prescribe oxycodone compared to fps ( . % vs . %, Δ . , % ci: . , . ). for single agent products, eps were more likely to prescribe hydromorphone compared to fps ( . % vs . %, Δ . , % ci: . , . ). fps were more likely to prescribe codeine either as a combination or single agent formulation. ep prescriptions led to significantly more hospital admissions for opioid toxicity ( . % vs . %, Δ . , % ci: . , . ), while fp prescriptions more often resulted in dose escalation beyond mg meqs ( . % vs . %, Δ . , % ci: . , . ). conclusion: a large percentage of opioid-naïve patients receive an initial opiate prescription in the ed, where the use of high potency opioids is much more common, with / of these patients subsequently hospitalized for opioid toxicity. creation of a physician accessible provincial registry would be useful to monitor opioid prescribing and dispensing, inform clinical practice, and identify patients at high-risk who may benefit from early interventions. keywords: opioid, physician prescribing, toxicity

lo the utility of femoral nerve blocks in the emergency department; a national survey of practice
j. ringaert, md, j. broughton, md, m. pauls, md, i. laxdal, n.
ashmead, md, university of manitoba, winnipeg, mb
introduction: approximately , hip fractures occur annually in canada, and the incidence will increase with an aging population. pain control remains a challenge with these patients, as many are elderly and prone to delirium. regional anesthesia has been shown to be very effective with minimal risks, but it is not clear how often emergency physicians are using this technique to provide analgesia for patients with proximal hip fractures. this is the first canada-wide survey to evaluate the use of regional anaesthesia in the emergency department for hip fractures. it also evaluates physician comfort level with performing these blocks, perceived educational needs in this area, and barriers to performing nerve blocks. methods: a -question survey was sent to members of the canadian association of emergency physicians via email in january and february of . data was collected and analysed using an online collection program called “survey monkey”. ethics approval was obtained through the university of manitoba research ethics board. results: emergency physicians and residents took part in the survey. the majority of respondents ( . %) choose intravenous opioids as their first line of analgesia and only . % use peripheral nerve blocks (pnb) as their first line choice for analgesia in hip fracture. in response to practitioner comfort with pnbs for hip fractures, most were not at all confident ( . %) in their ability and many respondents have never performed a nerve block for a hip fracture ( . %). the most commonly identified barriers to performing pnbs include lack of training, the time to perform the procedure and a lack of confidence. a larger percentage of respondents ( . %) identified having had no training and no knowledge of how to perform pnbs for hip fractures.
conclusion: the vast majority of canadian emergency physicians who took part in this survey do not utilize pnbs as a method of pain management for hip fractures. over half have never performed one of these procedures and many have never received training in how to do so. future efforts should focus on improving access to education, disseminating information regarding the effectiveness of pnb, and addressing logistical barriers in the ed. keywords: survey, regional anesthesia, emergency department

lo gridlocked: an emergency medicine game and teaching tool
p.e. sneath, bsc, d. tsoy, j. rempel, m. mercuri, phd, a. pardhan, md, t.m. chan, md, mcmaster university, hamilton, on
introduction / innovation concept: in the controlled chaos of the emergency department (ed) it can be difficult for medical trainees to similarly recognize that there is definite order to the chaos, and many may never truly appreciate its complexity. how should medical learners develop this skill? didactic teaching cannot effectively portray the complexities of managing the ed. much like education in cardiac arrest, trauma, and multi-casualty incident management, it is our belief that the management of patient flow through the ed is best learned through simulation. thus, we developed gridlocked, a board game that requires players to work cooperatively to manage a simulated ed to win the game. methods: gridlocked development took place over a six-month period during which iterative cycles of gameplay and redevelopment were used to optimize game mechanics and improve player engagement. the patient cases were created by medical students (ps, dt, jr) and subsequently reviewed for content validity by two attending emergency physicians (tc, ap). input from attending emergency physicians, residents, medical students, and laypeople was integrated into the game through a plan-do-study-act (pdsa) model.
curriculum, tool, or material: our game includes:
1) the game board;
2) patient cards, which describe a patient, their level of acuity, and the tasks that must be completed in order to disposition the patient;
3) event cards, which cause random positive or negative events to occur, much like random events occur in real life that change the dynamics of the ed;
4) game characters, which move around the board to denote where tasks are being completed;
5) a tracking sheet to follow how many tasks each character has performed in each turn;
6) a shift-time clock, which is used to track the ‘hours’ of your shift;
7) a ‘gridlock counter’, which tracks how many ed backups or adverse patient outcomes occur (‘gridlocks’).
the goal of the game is to work cooperatively with your teammates to complete patient tasks and move patients through the ed to an ultimate disposition (e.g. admission, discharge). the game is won if you finish your shift before reaching the maximum number of ‘gridlocks’ allowed. conclusion: initial responses to gridlocked have been very positive, supporting it as both an engaging board game and potential teaching tool. we are excited to see it validated through research trials and possibly incorporated into emergency medicine training at both student and postgraduate training levels. keywords: emergency department flow, simulation, board game

scientific abstracts cjem • jcmu ; suppl s https://doi.org/ . /cem. . downloaded from https://www.cambridge.org/core. carnegie mellon university, on apr at : : , subject to the cambridge core terms of use, available at https://www.cambridge.org/core/terms.

lo the canadiem digital scholars program: an innovative international digital collaboration curriculum
f. zaver, md, a. thomas, md, s. shahbaz, md, a. helman, md, e.s. kwok, md, b. thoma, md, ma, t.m.
chan, md, university of calgary, calgary, ab
introduction / innovation concept: digital media are a new frontier in medical education scholarship. asynchronous education resources facilitate a multi-modal approach to teaching, and allow residents to personalize their learning to achieve mastery in their own time. the canadiem digital scholars program is a nationwide initiative that provides residents with practical experiences in creating digital educational materials under the supervision of experts in the field. the program allows for collaboration and access to mentorship from top digital educators from across north america. methods: interested residents accepted into the program spent a period of their pgy year completing modules developed in the theory and science behind digital education. four modules, developed in an iterative process, have been built on the topics of podcasting, blogging, digital identity, and patient communication. each fellow was supervised by members of the canadiem team, a faculty member from the resident’s home institution, and digital experts from across north america. curriculum, tool, or material: the first fellow completed all aspects of the designed curriculum. above this, he also engaged in blog content creation, initiated research on digital scholarship, and managed the editorial section of canadiem. the second fellow is currently halfway through his year (and is expected to complete the program within the year) and has co-authored blog posts and podcasts in months. conclusion: the canadiem digital scholars program utilizes a novel approach to foster development of digital educators utilizing experts across north america. we have demonstrated the feasibility and sustainability with our initial pilot years. this program is being scaled next year to include two scholars per year, which will facilitate cross-collaboration between the scholars.
keywords: innovations in emergency medicine education, social media, free open access meducation (foam)

lo not a hobby anymore: establishment of the global health emergency medicine organization at the university of toronto to facilitate academic careers in global health for faculty and residents
c. hunchak, md, mph, l. puchalski ritchie, md, phd, m. salmon, md, mph, j. maskalyk, md, m. landes, md, msc, mount sinai hospital, toronto, on
introduction / innovation concept: demand for training in global health emergency medicine (em) practice and education across canada is high and increasing. for faculty with advanced global health em training, em departments have not traditionally recognized global health as an academic niche warranting support. to address these unmet needs, expert faculty at the university of toronto (ut) established the global health emergency medicine (ghem) organization to provide both quality training opportunities for residents and an academic home for faculty in the field of global health em. methods: six faculty with training and experience in global health em founded ghem in at a ut teaching hospital, supported by the leadership of the ed chief and head of the divisions of em. this initial critical mass of faculty formed a governing body, seed funding was granted from the affiliated hospital practice plan and a five-year strategic academic plan was developed. curriculum, tool, or material: ghem has flourished at ut with growing membership and increasing academic outputs. five governing members and general faculty members currently run projects engaging over faculty and residents. formal partnerships have been developed with institutions in ethiopia, congo and malawi, supported by five granting agencies. fifteen publications have been authored to date with multiple additional manuscripts currently in review. nineteen frcp and ccfp-em residents have been mentored in global health clinical practice, research and education.
finally, ghem’s activities have become a leading recruitment tool for both em postgraduate training programs and the em department. conclusion: ghem is the first academic em organization in canada to meet the ever-growing demand for quality global health em training and to harness and support existing expertise among faculty. the productivity from this collaborative framework has established global health em at ut as a relevant and sustainable academic career. ghem serves as a model for other faculty and institutions looking to move global health em practice from the realm of ‘hobby’ to recognized academic endeavor, with proven academic benefits conferring to faculty, trainees and the institution. keywords: global health education, global health training, global health research lo safety and efficiency of emergency physician supplementation in a provincially nurse-staffed telephone service for urgent caller advice e. grafstein, md, r.b. abu-laban, md, mhsc, b. wong, mha, r. stenstrom, md, phd, f.x. scheuermeyer, md, m. root, ma, q. doan, mdcm, mhsc, phd, st. paul’s hospital, vancouver, bc introduction: in british columbia created a nurse (rn) staffed telephone triage service, (tts) to provide timely advice to non- callers ( ). a perception exists that some callers are inappropriately directed to emergency departments (eds) thereby worsening crowding. we sought to determine whether supplementary emergency physician (ep) triage would decrease ed visits while preserving caller safety and satisfaction. methods: tts rns use computer algorithms and judgment to triage callers. potentially sick callers are directed to “seek care now” (red calls). often this is to an ed depending on acuity and time of day. in the vancouver health region from april-september between : - : hours, a co-located ep also spoke with “red” callers to provide further guidance. callers were followed up with week and satisfaction was evaluated on a -point likert scale. 
the tts data was linked to the regional ed database to assess ed attendance within days, and the provincial vital statistics database for -day mortality. our primary outcome was the proportion of unique “red” callers who did not attend the ed compared with a historical cohort one year earlier without ep triage in place. secondary outcomes were the proportion of “red” callers advised not to attend the ed but (a) attended, (b) admitted, or (c) died. results: in the study period there were “red” calls of

portal: libraries and the academy, volume , number , april , pp. - (article) doi: . /pla. .
for additional information about this article http://muse.jhu.edu/journals/pla/summary/v / . .carter.html
copyright © by johns hopkins university press, baltimore, md .

area studies and special collections: shared challenges, shared strength
lisa r. carter and beth m. whittaker

abstract: special collections and area studies librarians face similar challenges in the changing academic library environment, including the need to articulate the value of these specialized collections and to mainstream processes and practices into larger discovery, teaching, learning, and research efforts. for some institutions, these similarities have led to combining these areas of librarianship into a shared administrative structure.
this article articulates the concept of “distinctive collections,” identifies the shared challenges of these programs, enumerates some essential differences, and outlines some observations from institutions that have taken this step. it further suggests opportunities for these areas to build strength and significantly impact teaching, learning, and research together. future research agendas that might propel further investigation of “distinctive collections” are proposed.

introduction

in an environment where research libraries must increasingly articulate the value of their distinctive collections to the larger enterprise, certain imperatives loom large. these imperatives include intensive engagement with users, the need for cost-effective processes that drive meaningful outcomes, and the opportunities and challenges of collections strategically curated around an area of specialization. special collections (usually encompassing rare, archival, or other primary source materials) are increasingly seen as the corpus that distinguishes one academic library from another. many special collections develop in response to a localized enthusiasm for a topic or a specialization in research that reflects an institution’s strength or a community’s passion. the opportunity to dramatically inspire teaching, learning, and research by connecting faculty, students, and other scholars to these collections holds the promise of transformative impact. similarly, area studies collections have developed as specialized accumulations of knowledge united by language, geographic region, cultural resonance, or all three. area studies collections in the research library may contain rare materials alongside more commonly held resources, but these aggregations are distinctive as collections, deliberately selected around specialized areas of expertise that are focused on related languages, regions, and cultures.
area studies materials are generally more easily discovered than special collections because they are cataloged and available for circulation, but access often requires mediation due to language, divergent access standards, or cultural sensitivity. hence, the area studies librarian’s intervention in connecting users with specialized collections is not unlike that of the special collections curator, and vice versa. for the authors, this resonance between area studies and special collections has been put in place operationally. our libraries have placed these two types of distinctive collections under one divisional umbrella. for us, the associations are a matter of daily existence. as we make sense of the pairing to our constituencies, our staff, and our colleagues, the similarities and differences between the two types of collections impact our approaches to workflows, resource allocation, and advocacy. in this article, we plan to further define the concept of “distinctive collections.” we will identify some of the shared challenges of these types of collections, as well as some essential differences, which suggest opportunities to collaboratively build strength and significantly impact teaching, learning, and research. we will also discuss observations and lessons learned from our experience working with paired programs. we intend to identify future research agendas that might propel further investigation of these themes. we do not mean to argue that libraries should organizationally combine area studies and special collections because operational solutions differ from library to library. additionally, we do not suggest that special collections and area studies are the only distinctive collections that might exist in a research library. instead, we hope to identify characteristics and opportunities of distinctiveness that may be helpful as other libraries consider how to increase the impact of collections that distinguish them from other institutions.
further, we do not intend to perpetuate any siloed or isolated approach to special or area collections, even in combination with each other. the future of distinctive collections is dependent on and critical to the whole of a research library. the convenience of organizational borders should not dictate separation of these areas from the libraries’ workflows or from infrastructure that advances access to or engagement with all library resources.

defining distinctive collections

the concept of distinctive collections originates in literature focusing on special collections. nicolas barker, in his introduction to celebrating research, argues, “where once special collections were regarded as the top dressing on the solid cake of main library management, they are now regarded as distinctive signifiers, almost trademarks.” he adds, “arl libraries want to be known for their distinctive collections, not by some characteristic shared by every other library.” this now common argument lays the foundation for the concept of distinctive collections as valuable accumulations of research material that set a library apart from its peers. barker also notes that a hallmark of collection development for special collections is the need to “catch material in time” to preserve primary expressions of knowledge and make them accessible for use. in her introduction to special collections in arl libraries, alice prochaska articulates an “ecumenical” concept of distinctiveness that envelops area studies.
in addition to the typical rare books, manuscripts, archives, and other formats, she asserts, “‘special collections’ also can be extended to include distinct collections of material relating to a particular subject or part of the world.” this report of the arl (association of research libraries) working group on special collections embraced an inclusive vision that highlights many of the distinguishing challenges both traditional special collections and area studies collections face. it explains, “our thinking has embraced libraries’ stewardship of any kind of vehicle for information and communication that lacks readily available and standardized classification schemes, and any that is vulnerable to destruction or disappearance without special treatment.” this broadening of the concept of special collections suggests needed attention to the shared characteristics of distinctive collections. rick anderson discussed some of the characteristics of distinctive collections in his briefing paper “can’t buy us love.” he argues that the opportunity dichotomy in research library collections is not between print and digital, but between “commodity/non-commodity,” further examining the critical difference of materials that are specialized. while anderson sees that “the library’s role as a broker, curator, and organizer of commodity documents is fading,” he articulates the importance of investing in the acquisition, digitization, and discoverability of “non-commodity” materials and suggests that the whole library’s role shift toward that of broker, curator, and organizer of “non-commodity” or distinctive collections. while “many of the academic library’s traditional roles are moving to the margins of the research experience,” the opportunity in “non-commodity” collections lies in their very distinctive nature: libraries should embrace the material on the margins as core and invest in actions that make them relevant.
anderson also remarks, “librarians will have to explain clearly, concisely, and compellingly why such a shift makes sense and how it will be beneficial in terms of both local and broader public good.”

distinctiveness in area studies

area studies collections in research libraries consist of both general, circulating, “commodity” collections and rare, ephemeral, “non-commodity” materials. through the expertise of area studies librarians, specialized collections curated around languages, literatures, and cultures of geographic regions or ethnic identities offer distinctive opportunities for research, teaching, and learning. dan hazen’s “area studies librarianship and interdisciplinarity: globalization, the long tail, and the cloud” in interdisciplinarity & academic libraries outlines the “distinctive” characteristics of area studies collections. hazen asserts:

non-western collections work, for example, focuses on esoteric materials in unfamiliar languages that can be difficult to acquire. these “long-tail” acquisitions, and the staff to support them, draw upon structures that reflect both the interdisciplinarity of area studies and the high-overhead, low-use resources upon which it depends.
hazen observes, “libraries’ support for area studies entails interplay between generalized procedures and systems, and the requirements of materials and services that fall outside the norm.” he also notes, “high usage is regarded as a primary indicator of collection success, and non-english materials rarely make the grade.” he further states that “area studies initially challenged both the traditional categories of discipline-based scholarship and established approaches to library operations” and consisted of “difficult materials and labor-intensive routines.” deborah jakubs’s extensive writings on the realities and futures of area studies librarianship further explore characteristics of distinctive collections. in “modernizing mycroft: the future of the area librarian,” she notes how specialization in area studies is seen as “suspect” and asks, “is specialization a luxury? or is specialization a necessity?” these are critical questions for a research library as it turns to emphasize its distinctive collections. jakubs notes that expert librarians who cultivate distinctive collections have a “rapport with faculty, the close association with and dedication to academic programs, part and parcel of the job, the broad subject knowledge and an intensity of engagement with the field.” she observes that the area librarian’s job “has evolved as a highly independent role.” in her article “a library by any other name: change, adaptation, transformation,” jakubs asserts:

area studies collections are special collections. foreign-language collections are integral to research libraries. it is our duty to collect broadly, to support the needs of researchers, and to consider the scholarly record internationally. as libraries focus on expanding access to their distinctive collections via digitization projects, area studies will become more visible.
the arl spec kit : collecting global resources further provides evidence that area studies form a nexus of specialized resources and expertise with potential impact for an organization both in distinctiveness and in international reach. the spec kit’s analysis of strategies for collection development highlights how expertise and distinctive collections operate in a self-sustaining cycle. one respondent remarked, “because we are so engaged in instruction, being in the classroom puts us in direct contact with students and faculty. it is easy to spot research trends or changes within the curriculum.” close relationships with vendors, specialized book markets, gifts, and exchange programs are key to successful acquisitions and services in area studies. the spec (systems and procedures exchange center) survey also outlines the imperative to develop collaborative collections, along with the exciting possibilities of sharing staffing and library services. the survey also notes how the preservation needs of international and area studies characterize their distinctiveness. respondents had “an acute awareness of the special needs of these resources because of poor bindings, acidic paper, etc.” coupled with a complex relationship to the role of digitization as a preservation (rather than access) strategy.
the spec kit does not provide a complete sense of how libraries administer area studies units, saying, “the organization of those units ranges from an integration of special collections and area studies units to a structure where global resource collections units report to public services.” it does observe, however, that “balancing the identity and specialized workflow needs of individual collections with a library’s need for efficiency and cost effectiveness will always be a precarious undertaking, particularly when implementing reorganizations.” several conferences have taken an intensive look at the future of area studies librarianship. while their agendas were much broader than the concept of distinctiveness, a few highlights are worth sharing. the international and area studies collections in st century libraries conference took place in november at yale university in new haven, ct. the conference found shared issues among managers of specialized collections, including a “sense of urgency about the need to better position these library units so that they can continue to thrive” and the participants’ concern for “improving their ability to advocate” and “demonstrating their organizational impact.” the center for research libraries (crl) hosted global dimensions of scholarship and research libraries: a forum on the future, at duke university in durham, nc, in december of . the literature that emerged from this conference emphasized the effects of “globalization” on campuses and is worth examining particularly where “distinctiveness” seems to be in play. charles hale discussed the work of the university of texas in austin with archival collections abroad as part of its human rights documentation initiative (hrdi) and the guatemalan national police historical archive. both collections create models of “noncustodial archiving,” where distinctiveness in area studies led to curation of born-digital primary source material with global impact.
betsy wilson discussed the involvement of the university of washington in seattle in international research areas that create a challenge for libraries as they “balance the need for national collections as well as distinctive local strengths, to provide core material for undergraduate teaching and research, as well as ‘reasonable access’ to less-used research resources.” in her preview of the collaboration, advocacy, and recruitment: area and international studies librarian workshop held at indiana university in bloomington in , christa williford hints at a significant change in approach to area studies as well as specific mention of possible overlap with special collections. “without some sort of large-scale coherent approach to curating and making accessible our global collections,” she says, “we risk losing the richness and depth our academic libraries offer to students and scholars.” specifically, she asks:

what about opportunities for closer collaboration between area and international studies specialists and those working in special collections and archives? since these professionals serve many of the same researchers, it stands to reason they could find common interests and promote one another’s work. many recipients of our cataloging hidden special collections and archives grants have had success in engaging faculty and students in their efforts to describe collections. might their strategies translate to better engagement with the work of area and international studies librarians?

in the documentation of the event at indiana university, “provocations” included concepts such as “archiving” web resources, sharing area studies librarians among institutions, and demonstrating value and impact. a “response” by peter zhou states unequivocally that area studies collections are marks of distinction important in positioning research libraries.
he asserts, “virtually no university can aspire to world-class status without a strong international studies program and supporting collection.”

leveraging distinctive collections

jakubs sees the future of area studies in moving “from relative isolation into a new role that still recognizes the value of specialization,” and hazen notes the “interplay between generalized procedures and systems, and the requirements of materials and services that fall outside the norm.” along the same lines, recent literature on mainstreaming special collections into the research library enterprise indicates new opportunities for distinctive collections. lisa carter, in her article “it’s the collections that are special,” argues, “libraries can embrace their special collections and archives as a locus of distinction, experimentation and core value. the time has come for libraries to integrate special collections into the flow in every aspect of our work.” she also issues a call for change:

it is time to integrate the selection, description, research service and technological activities in every library with those needed to connect users to our most distinctive, unique collections. libraries must recognize that while the collections are special and even have special needs, the talents and skills needed to expose them are found library-wide.

arl’s research library issues special issue on distinctive collections further explores the complex relationship between distinctiveness and integration. in “special collections at the cusp of the digital age: a credo,” clifford lynch observes, “each great research library has its own unique character; special and distinctive collections have always been central to shaping this character.” anne r. kenney’s appeal to heed the “collaborative imperative” echoes the calls from area studies thinkers on the need to collaboratively build collections. donald j.
waters examines the changing role of special collections in scholarly communications, critiquing the “value proposition” of special collections. he acknowledges that special collections “can be a source of pride, expertise, and excellence” but adds, “taken to an extreme, the argument about institutional distinctiveness can also limit scholarly productivity by provoking the impulse to protect silo-like boundaries around collections.” arl’s research library issues on mainstreaming special collections highlights cases where research libraries are integrating their distinctive collections into broader library practices and systems. drawing on tom hickerson’s observations in “rebalancing the investment in collections,” carter introduces the theme of research library issues by noting that if special collections are to become central to the research library, they need to be integrated into what hickerson calls the “common asset base” of the overall research library. this means aligning special collections with the broader mission of the library and its institution as well as creating new organizational structures and workflows. such changes will position archives and special collections in lead roles in the evolution of the twenty-first-century academic library. but this realignment needs to be viewed as “a component activity contributing to broad institutional goals,” not merely a forefronting of special collections. in discussing how his institution has integrated distinctive collections, michael b. moir describes the development of the portuguese canadian history project at york university in toronto, canada.
echoing hazen’s arguments, moir asks:

at a time when the availability of potential donations far outstrips the resources available to preserve this material and make it accessible, how do libraries ensure a reasonable return on the investment of diminishing funds through collections use by the burgeoning ranks of new faculty and graduate students with new and sometimes unpredictable research interests?

moir also underscores the need to tie the workflows and practices of distinctive collections to larger institutional work (including digitization): “such endeavors must be brought into the mainstream of annual budgeting and departmental work plans if the libraries’ objectives based on leveraging unique research collections are to be achieved.” liz mengel specifically addresses the need to break down silos in her discussion of “blended librarians,” who cross organizational boundaries, at johns hopkins university in baltimore. the integration of special collections into the mainstream of that library extends to the cooperative management of the collection budget. robert cox and his coauthors further articulated this blurring of the lines with a user focus in their discussion of how the university of massachusetts amherst brought resources across the library to bear to make special collections more accessible:

libraries know patrons are not interested in understanding the arcane internal structure of the library in order to do their research. finding ways to blur or eliminate the boundaries between two departments that are providing similar service is a great way to move away from a siloed environment to a more holistic user-centered environment.

the growing frequency and intensity of focus on distinctiveness demonstrate clear parallels in the challenges and attention of the two communities.
commonalities of distinctive collections

common ground between special collections and area studies lies in their positioning in the research library, their identification around specialization, and the need to integrate these areas more centrally into the core of the research library enterprise. shared features include:

• a high level of expertise in a distinguishing area
• highly focused collection development
• special handling and processing concerns (languages, fragility, format)
• a targeted but international user community (in addition to a more generalized group of local users)
• existing elements of the desired intensive liaison model
• shared history of positioning as outsiders, as siloed, or as different from the larger library system.

zhou’s suggestion that international and area studies are distinguished programs and contribute to distinction mirrors barker’s articulation of special collections as “distinctive signifiers, almost trademarks.” a shared overarching challenge for distinctive collections is that they tend to fall into what hazen calls a “long tail” of low-use resources that are expensive to acquire, process, and maintain. the authors believe that the future of both areas is dependent less on the stockpile of unique treasures they hold or the carefulness with which collections are cultivated and more on the library’s success in actively connecting these remarkable resources to local users and a global community of scholars. as hazen indicates, “changing modes of access, the evolving economics of cooperation, and the impact of the cloud now allow different approaches to acquisitions,” management, discovery, and use.

interdisciplinarity and breaking down silos

both special collections and area studies have existed in a space between defined academic disciplines and have long fostered interdisciplinary scholarship.
while they do not “belong” to specific academic departments, both areas often have close ties with faculty and students that connect around a specific identity (for example, middle east studies, medieval scholar, latin americanist, or comics historian). within the library, the specialized nature of primary source and rare collections and language-based materials has meant that the special collections and area studies librarians’ efforts evolved as what jakubs calls “a highly independent role” managing acquisition, description, arrangement, and service within a silo because the nature of the materials required specialized mediation. as distinctive collections have come to be understood as central to the future of the research library, as their description and processing become more normalized and ubiquitous, and as discovery and delivery of all research materials increasingly happen online, both area studies and special collections are revisiting the ways they work. they have been encouraged to break out of their silos and their restricted reading rooms to emerge into the broader library and information environment. jakubs articulated this critical transition:

it is time to move from relative isolation into a new role that still recognizes the value of specialization. the future of area librarians depends on our adapting and modernizing, integrating our skills into the library in new ways, and therefore changing our image.
unless we do so, redefining our core responsibilities, we will continue to be misperceived and undervalued, and hence, endangered.

in this way, area studies and special collections professionals must tap into their abilities to see connections and relate across areas of expertise to more effectively engage functionally throughout the library. interconnected thinking is both an advocacy strategy and a potential answer to reduced resources, enabling specialists to integrate with broader library workflows. yet as they act interdepartmentally, specialists have the shared challenge of creating understanding about the distinctive needs of their communities and collections. in addition to pushing work beyond departmental borders, area studies and special collections can share strategies for inviting in functional specialists to deal with specialized materials.

global audience, local relevance

area studies and special collections engage with audiences that can be international, visiting, and dispersed by their very nature. if a research library intentionally develops “destination” collections, it must be prepared to support an audience broader than the faculty and students on its campus. further, the general library user population often dwarfs the numbers of specialized on-campus users of distinctive collections. but those specialized users will likely engage more intensively, for longer periods, and with longer-term outputs. to demonstrate the importance of distinctive collections, libraries need to determine how to measure impact for and value of these two audiences: distant but discipline relevant, and local but with intense needs and extended timelines. this need is a shared challenge fraught with issues of qualitative assessment. further, distinctive collections must be made relevant for the general population that is critical to the parent organization. how do you catch the passion of the local business donor?
how might you advance a university’s drive toward technological and medical innovation with distinctive materials? what commitment should special collections or area studies have to the undergraduate studying in a relevant area, but who lacks language or paleographic skills? these shared challenges are, at their base, difficult because of the specialized nature of the collections. can examples of bridge building in area studies advance special collections’ efforts toward relevance, and vice versa? as librarians address the needs of each audience, they must refocus on, improve, and articulate the skills experts often excel at but need to deploy broadly across audiences and situations. for example, special collections curators may regularly succeed at cultivating relationships with donors or subject-specific scholarly communities, while area studies librarians have close ties to faculty in their language and geographic areas. either way, excellence at relationship building sparked by shared passion and sustained through regular, empathetic interaction propels discovery through long-term dialog and shared insights. area studies specialists and special collections curators often serve as intensive, long-term research partners with the scholars who use their collections and the communities who have a direct interest in them. both curators and area studies librarians are often deeply embedded in specific academic departments or centers and seen as trusted colleagues and peers. in this way, they play key roles in what janice m.
jaguszewski and karen williams call “the hybrid model of liaisons and functional specialists [which] requires a team approach as well as a strong referral system.” the aspects of relationship building that are shared or different between special collections and area studies can provide inspiration for how to engage more deeply with users. distinctive collections professionals, then, may be models in this area for colleagues in other areas of library practice.

achieving such intensive relationships at scale is a critical challenge for both special collections and area studies. and yet, the digital environment offers opportunities for special collections and area studies libraries to connect more efficiently in scholarly communities as trusted knowledge brokers and sources. just as members of the internet community gravitate toward a specific blogger because of shared tone or interests, area studies and special collections librarians can turn distinctive collections and expertise into opportunities to be “followed” and to spark connection and discussion. crl, through its study on electronic human rights documentation, identified an important role for libraries to monitor what information is available and serve as a trusted partner in connecting information with long-term scholarship. librarians are needed as aggregators and “understanders,” as people who know how knowledge is or was produced and distributed and how to express and preserve it. with their high level of expertise, area studies and special collections librarians share the ability to interpret, select, aggregate, and authenticate in distinctive areas. together, special collections and area studies people can navigate the needed expansion of their reach by sharing strategies and cross-pollinating online offerings.

hidden collections and discovery

area studies librarians have collected primary sources and ephemera that complement their “commodity” materials back to the beginnings of their collections.
just as special collections have a role in preserving primary sources to enable new research, area studies librarians similarly collect documentation from and about areas of the world, often capturing information that may not survive a shifting political or social environment. their special collections colleagues can offer advice on stabilizing and providing access to these materials and possibly offer a safer home for fragile items. similarly, special collections have collected foreign-language materials in their subject and format areas since the collections were established. area studies colleagues can assist with processing, access, and use of these collections.

distinctive collections jointly share a need to revisit collecting strategy in the context of decreasing resources. librarians and curators need to find ways to scale their collecting in both areas to the environment of their parent institutions, whether in the areas of budget and storage, the ability to get the materials processed, or the relative priority such materials might have to the core research agenda of the institution. once librarians and archivists have right-sized their collecting activities to their organizational context, they must learn to articulate the value of using the remaining focused collections to transform teaching, learning, and research. and librarians and archivists need to make their impact go further through cooperative collecting and collaborative collection efforts such as hathitrust, uborrow, borrowdirect, and the various digital collection environments, which require interoperability and openness.
in area studies, languages and non-roman scripts create barriers for nonreaders of the language and for the application of descriptive standards, requiring expert mediation in discovery and use. in special collections, original order, context, and format require interpretation. even as librarians embrace colleagues who have expertise in description, preservation, and technology to help them get materials out there, they must work side by side with their colleagues to mitigate these distinctive challenges. innovations in large-scale web discovery, linked open data, and crowdsourcing can only be leveraged if translational metadata are effectively structured and created. and yet, having libraries’ distinctive collections available at that scale is critical to knowledge building on a global level, and ultimately, to the authority and relevance of their institutions.

in the past, distinction in collections was determined by the uniqueness or sheer magnitude of materials. increasingly today differentiation involves how libraries connect these remarkable resources to users and a global community of scholars. how can the shared strengths in area studies and special collections overcome some of the challenges of exposing hidden collections to a global audience in a networked environment? ownership and copyright of government publications and primary sources are joint areas of exploration as more and more distinctive collections move online. connecting disparate sources in a subject area (whether it be human rights or samuel beckett) requires international collaboration. the interplay between distinctive collections and vendors who can digitize and repackage them is also a shared issue for special collections and area studies. in vendor negotiations, there is room to articulate the value of the original, the intellectual property in aggregation, and the shared benefit to the holding institution and the aggregator. standards and sustainability for web archiving are of shared concern, regardless of whether the web site that needs to be preserved is a wiki about manga or a university’s complex self-representation on the web.

it is exactly in the changing context of today’s research library and these potential areas for shared solutions where distinctive collections and expertise can have broader reach and greater impact. as hazen says, “more robust solutions will require specialized acquisitions that are also aggressively cooperative, better tools for discovery, and fluid mechanisms for access.”

important differences

in our examination, we found key areas of divergence for area studies and special collections that should be acknowledged and can be used to advance the whole organization. by embracing and investigating differences, special collections and area studies can grow stronger. for example, as dan hazen wrote, “external mandates and support have been crucial in establishing the footings for area studies.” area studies centers and academic departments have traditionally provided special funding to support area studies library collections and initiatives. as government funds (title vi) become less secure, area studies in libraries must examine the symbiotic relationship they have had with their centers. this evolving dynamic requires reenvisioning area studies’ importance to society writ large. what place should they have now in a society that is globalized but still has need of greater cultural understanding?
as research institutions develop global campuses, the corresponding area studies have the opportunity for a high profile on campus. the importance of distinctive collections in those areas is critical for the deep research that can transcend borders.

on the other hand, special collections have grown up locally with donations from passionate people and operational funding sourced primarily from the library. that special collections have an international audience is incidental to the fact that the collections are distinctive. having been bred locally, special collections have sometimes been distracted by the international community craving access to the unique materials. they need to refocus on the local connection that made the collection relevant to an institution in the first place. research interests at a university evolve over time, but the evolution of a university generally follows a recognizable trajectory: as a land grant institution, as a flagship university, as an intimate liberal arts college, or as an intensive science and technology school. in a time of limited resources and the need for the library to align closely with the broader institution, special collections may need to focus more intently on an institution-based context to sustain support for the resource-intensive activities needed to expose their distinctive collections.

special collections and area studies directions might also diverge around their changing relationships to the role of custodian versus monitoring facilitator. because special collections’ distinctiveness is grounded in the existence of primary artifacts, special collections and the libraries to which they belong will always have a custodial role in preserving and protecting original, unique, or rare objects.
because the bulk of area studies collections comprise “commodity” collections (albeit difficult to acquire in the international market), they might consider turning over the custodial role for their “non-commodity” collections to special collections, leaving them free to take on the role of connector, broker, or knowledge conduit.

a further difference in approach to custodianship is that area studies collections reasonably exist in an environment where not all materials on a related topic can be found under the “area studies” umbrella. few jewish studies collections, for example, would hold all the material related to judaica or jewish culture in a large research library because some material is appropriately administered as part of other collections. by their nature, jewish studies materials can be anywhere. at the same time, special collections libraries sometimes find themselves, despite collection development policies and the best of intentions, becoming a home for library materials that fall outside their collecting area, but just happen to be old, fragile, or otherwise vulnerable. in this way, area studies librarians can be purposefully explicit in their collection building, whereas the special collections curator must manage the realities of providing specialized custody.

with these and possibly other unexplored differences, area studies and special collections librarians’ separate expertise can inform each other, improving the overall library’s ability to tackle the challenges of internationalization, local relevance, advocacy for funding, and the nature of custody. even in their difference, a shared attention to distinctiveness can advance the whole.

organizational considerations

an examination of the reasons for bringing these two areas of the library together and the practical realities of those implementations helps us understand how distinctive areas can work together to advance the impact of the research library.
these dynamics play out uniquely in each library, and examples from libraries that have organizationally joined these programs offer concrete insight into the concept of distinctiveness. to determine the character of the “distinctive collections” pairing, we relied extensively on informal communication with colleagues who were aware of our interest in this area. we also consulted readily available organizational charts and searched arl’s position description bank. given the nature of academic libraries, it was impossible to examine every institution, but we believe this sample provides some interesting points for discussion.

at the university of chicago, an associate university librarian for area studies and special collections administers a division that also includes the humanities and social sciences. the position grew out of a library reorganization, which created this new administrative unit to facilitate a unified collection and service philosophy that encompasses special, distinctive, and general collections throughout the library. the focus in the first years after the merger was to improve communication and collaboration between area studies and general collections, as well as between area studies and special collections. divisional meetings allowed bibliographers and curators to learn about one another’s collections and explore areas of overlap and possible cooperation in areas such as instruction, outreach, digitization, and collection development. as collection development for general collections shifts toward building a “collective collection,” local and unique resources are the focus of increased attention.
further, the library has identified special collections and area studies as priorities in a new university capital campaign, giving these areas greater institutional visibility.

in , the university of florida libraries in gainesville merged archives, manuscripts, rare books, the baldwin library of historical children’s literature, and florida history collections to increase professional standards, in alignment with other arl institutions. prominent area studies collections that had acquired significant holdings of rare material joined the department in , forming the special and area studies collections department (sasc). the map and imagery library was added to sasc in to expand opportunities for linking area studies, special collections, and digital scholarship. a chair currently oversees the special and area studies collections department, working under an associate dean for scholarly resources and services in the george a. smathers libraries. in addition to a traditional special collections reading room, two area studies programs maintain separate reference desks in the same building, and some area studies staff provide reference services for and maintain circulating collections in the main undergraduate library. this environment creates informal connections between special and general collections. sasc has forged an identity that maintains the specificity of content management, access, and preservation required by distinctive collections while creating a collaborative culture that contributes to the institution’s capacity to gain and successfully manage large projects and grants. merging the units has created opportunities for collection and funding development that did not exist earlier. the department also promotes research across collections and encourages interdisciplinary scholarship.
at the ohio state university (osu) libraries in columbus, combining special collections and archives with area studies reflected a steady move of specialized expertise into cohesive departments and an increase in centralized support from the broader infrastructure of the libraries. special collections, including the university archives, rare books, and other special collections, were consolidated in under an assistant director for special collections and archives. an extensive library-wide reorganization in provided an opportune moment to reevaluate how best to distribute administrative oversight while considering appropriate combinations of approach. with technical services (acquisition, description, preservation, and similar processes) and technology development consolidated in other divisions, osu libraries took the opportunity to reorient on increasing use and engagement with distinctive collections and specialized expertise. the pairing was intended to make more effective use of support from across the libraries and to better enable an interconnectedness that would benefit users and the research community.

at the university of kansas (ku) libraries in lawrence, the larger reorganization that led to the creation of the distinctive collections division was explicitly focused on scalability and sustainability. the ku libraries reorganized to provide services and support to a larger number of users than the previous departmental liaison model, as well as to respond to compelling developments in higher education. a significant previous consolidation in the early s brought all of the kenneth spencer research library, including the university archives, regional history, and special collections, under one administrative head, combining staff, reading rooms, and policies. a similar combination took place in with the creation of the international area studies department under a department head.
therefore, when the ku libraries created the position of assistant dean for distinctive collections in , they combined two areas with some experience of working together with common goals.

observations from paired programs

both osu and ku are navigating the shift of technical services work to more centralized workflows and are negotiating the efficiencies of production-oriented processes alongside the input of specialized expertise. osu had a long history of siloed approaches to both area studies and special collections, where some units managed their own processing, description, and digitization. such silos resulted in a wide and inconsistent variety of hidden collections challenges. the osu libraries had centrally supported such processes as conservation, preservation, special collections cataloging, and some area studies cataloging. in , the osu libraries made a concerted effort to enhance infrastructure support in these areas and in processing, digitization, and digital initiatives. the expectation was that special collections and area studies would redirect specialized materials into mainstreamed technical services functions. addressing vacancies across the organization centrally, the osu libraries shifted or broadened several technical services positions from general collections activity to distinctive collections needs. the number of language expert staff in the description and access unit grew through reassignment of student worker funds (for example, the addition of korean, japanese, and hebrew catalogers). positions in special collections description and access were added as well (such as a processing coordinator to apply best practices and standards to archival and manuscripts materials). other functional positions were added to the osu libraries that particularly benefited special collections and area studies, such as a head of digital initiatives, application developers, and exhibits staff.
these shifts in organizational structure and workflows have increased creation and enhancement of metadata for distinctive collections across the board, initiated a plan to address backlogs in special collections, and enhanced the exhibition of distinctive collections to the academic community. osu libraries hope these changes will systematize the processing, description, digitization, and online delivery of collections that can only be found at ohio state.

combining area studies and special collections into one division at ku has advanced collaborative goals by improving ku libraries’ digitization flows. access to unique resources figured prominently in the ku libraries’ strategic plan, which included as strategy .a “enhance discovery, access, delivery and preservation of the institution’s distinct resources and assets.” the ku libraries placed new emphasis on encouraging and fostering these kinds of projects, going beyond previous successes, which included an on-site digitization laboratory and grant funding for special collections and archives. libraries-wide staffing was added not only to manage digital capture but also to provide metadata and system support. the first “test” project involves collaboration between the librarian for latin american studies and a special collections librarian, exposing guatemalan materials in the custodial care of spencer library but of great interest to latin american scholars. this project highlights another potential outcome of closer collaboration between area studies and special collections librarians: smoother engagement with issues of cultural repatriation made possible by digital technologies. ku libraries anticipate broader ownership of digitization activities within the libraries when more people perceive them as not just the purview of rare books specialists.
in another example of collaboration that improves engagement with distinctive collections, osu’s japanese librarian and billy ireland cartoon library and museum jointly assessed and reconceptualized their collaborative collection development of manga materials. they had long shared responsibility for acquiring manga before libraries widely collected the art form, striving to build a distinctive destination collection that would support osu curricular strengths in both visual studies fields and japanese studies disciplines. recently, a broad range of faculty expressed interest in assigning these materials in their classes if the osu libraries did not restrict access to a special collections research room. the japanese librarian, using expertise in international information production and a deep understanding of community use, worked with the cartoon library and special collections description and access to identify which material should circulate and which should remain protected. mainstream technical and collection services then collaborated with special collections catalogers to implement the revised collection access and development plans. the result is enhanced use of these materials in the curriculum, which supports the embedded relationship the japanese librarian has had with the east asian studies center, the institute for japanese studies, and the department of east asian languages and literatures. the long-term collaboration between the curators and the librarian, combined with the librarian’s deep embedding, has also resulted in an upcoming international symposium to be held at osu on the history of manga research. this is just one example of how distinctive specialists can model intensive engagement that truly advances transformative teaching, learning, and research.

integrating special collections and area studies operationally while highlighting distinctiveness comes with inherent challenges.
combining areas of library practice, just as with any organizational change, requires transparency and communication. broad conversations about what it means to be “distinctive” and how this characteristic benefits other parts of the library are essential. the ku libraries developed their consultant model in the wake of the reorganization, focusing on serving types of users instead of perpetuating a dedicated subject liaison relationship with particular academic departments and schools. acknowledging that there is, in fact, something particular about primary source formats, some foreign language skills, and other types of specialties means finding a balance between providing the best possible service to a variety of users with differing needs and making peace with the fact that not all consultants will be doing the same kinds of work to the same degree. content development, for example, which depends on knowledge of language, vendors, and formats, has remained an essential duty in distinctive collections while most consultants in the ku libraries no longer have responsibility for that activity. at the same time, special collections librarians and archivists were never the sole point of library contact for faculty and students in a given discipline, and area studies librarians have always worked across academic disciplines and departments. in this way, the consultant model, in the context of the distinctive collections division, has undergone an easier transition to the more scalable and meaningful engagement ku libraries sought to create.
the dynamic of highlighting distinctive collections while balancing the need to align with the larger environment is also highly relevant as libraries navigate development and external partnerships. both area studies and special collections cultivate niche communities that provide funding, enthusiasm, and international recognition, but these intimate relationships can create tension with the broader mission of the research library. for example, at osu, gifts from the external comics community provided the majority of funding for a new facility and an operating endowment for the cartoon library. the jerome lawrence and robert e. lee theatre research institute operates under a memorandum of understanding with the department of theatre that specifies engagement that impacts resource allocation for both parties. in the meantime, title vi funding from area studies centers for acquiring library collections has decreased, even while expectations from the local and international community for providing increasingly specialized resources grow. a coordinating administrator guides curators and librarians in balancing the needs of external stakeholders with the core direction of the broader academic library to successfully navigate these tensions.

forging a path together, celebrating distinctiveness

in considering these two discrete areas of librarianship, we found strong evidence that if special collections and area studies are to have significant impact on the future of the research library, understanding the shared challenges and solutions is key. to us, you can switch the terms “area studies” and “special collections” in much of the trending literature and still have a valid, resonating message. comparing the organizational decision-making of our individual institutions with others that have combined area studies and special collections indicated further resonance around the opportunities for both distinctive areas.
this resonance suggests to us that elevating the research agenda about how distinctive collections and specialized expertise engage with the broader organization holds rich possibilities for addressing the challenges in these two areas. specifically, a united distinctive collections approach to the following issues may advance the impact of the overall research library, as well as the area studies and special collections units they contain:
• in what ways can librarians integrate and mainstream operations that connect distinctive resources to users, while sustaining the distinctive identities that hold value?
• how do librarians break down silos (without undermining distinctiveness) to specifically advance interdisciplinary inquiry and scholarship?
• how might area studies, special collections, and other distinctive collecting areas share strategy, advocacy, and outreach opportunities?
• how can librarians help resource allocators navigate the dichotomy between the low-use, highly resource-intensive nature of distinctive collections and the opportunity of integrating them into transformative teaching, learning, and research?
• what is a fruitful balance between the renown that comes in exposing distinctive collections to a global but specialized audience and the return on investment that is needed for a generalized local constituency?
• how do academic institutions measure the qualitative impact of these distinctive collections when use is disparate, long-term, and prolonged?
• what is the essence of the expert’s skill in relationship building and how can that inform the evolving role of the liaison librarian?
• how do librarians achieve such intensive relationships at scale?
• how do librarians revisit their collecting strategies in the context of decreasing resources, and how might cooperative collecting and investing in the collaborative collection increase the impact of resources?
• how do librarians shift from being collection-oriented to user-oriented, shifting their mediation role from gatekeeper to connector, broker, and aggregator?
• do the commonalities in the opportunities and challenges of distinctive collections define a critical leadership or management role in research libraries?

while we do not have answers to these questions, we believe that investigating them in collaboration will likely be more fruitful than discussing them in separate communities. we advocate for increased frequency of conversation between special collections and area studies communities and for collective interrogation of these issues with the broader library community. initiation of this conversation is particularly challenging given that, until recently, there has been no venue that spans area studies as a whole. similarly, the special collections and archives communities participate in multiple professional organizations and meetings. further, the broader library community allows conversations about distinctive collections to be relegated to these varied, specialized venues. we recommend a concerted effort to address these questions. if distinctive collections are central to the future of each research library, then this is a conversation worth coordinating.

lisa r. carter is associate director for special collections and area studies at the ohio state university libraries in columbus. she may be reached by e-mail at: carter. @osu.edu. beth m.
whittaker is assistant dean of distinctive collections and director of the kenneth spencer research library at the university of kansas libraries in lawrence. she may be reached by e-mail at: bethwhittaker@ku.edu. notes . nicolas barker, “introduction,” in celebrating research: rare and special collections from the membership of the association of research libraries, ed. philip n. cronenwett, kevin osborn, and samuel a. streit (washington, dc: association of research libraries [arl], ), accessed june , , http://www.celebratingresearch.org/intro/index.shtml. . special collections in arl libraries: a discussion report from the arl working group on special collections (washington, dc: arl, ), accessed june , , http://www.arl.org/ storage/documents/publications/scwg-report-mar .pdf. . ibid., . . rick anderson, “can’t buy us love: the declining importance of library books and the rising importance of special collections,” ithaka s+r, , accessed june , , http:// www.sr.ithaka.org/sites/default/files/files/sr_briefingpaper_anderson.pdf . ibid., . . ibid., . . dan hazen, “area studies librarianship and interdisciplinarity: globalization, the long tail, and the cloud,” in interdisciplinarity & academic libraries, acrl (association of college and research libraries) publications in librarianship , ed. daniel c. mack and craig gibson (chicago: acrl, ), – . . ibid., . . ibid., . . ibid., . . deborah jakubs, “modernizing mycroft: the future of the area librarian” (bloomington: indiana university, ), , accessed june , , https://scholarworks.iu.edu/dspace/ bitstream/handle/ / /jakubs_modernizing_mycroft.pdf?sequence= . . deborah jakubs, “a library by any other name: change, adaptation, transformation,” in the future of latin american library collections and research: contributing and adapting to new trends in research libraries (papers of the th annual meeting of the seminar on the acquisition of latin american library materials [salalm], ed. 
fernando acosta-rodríguez [new orleans: salalm secretariat, tulane university, ]), – , accessed june , , http://salalm.org/ / / /salalm- -keynote-address-by-deborah-jakubs-rita-digiallonardo-holloway-university-librarian-at-duke-university/. . spec kit : collecting global resources (washington, dc: arl, ), . . ibid., . . ibid., – . . ellen h. hammond, “international and area studies collections in st century libraries: conference report,” focus on global resources , (winter ), accessed june , , http://www.crl.edu/focus/article/ . . james t. simon, “global dimensions of scholarship and research libraries: a forum on the future,” focus on global resources , (winter ), accessed june , , http://www.crl.edu/focus/article/ . . james t. simon, “the global dimensions of scholarship and research libraries: a forum on the future” (durham, nc: center for research libraries, duke university, ), accessed june , , http://www.crl.edu/events/ /conf_papers. . ibid. . clir [council on library and information resources] issues ( ), accessed july , , http://www.clir.org/pubs/issues/issues /issues /#area-studies. . “provocations,” in collaboration, advocacy, and recruitment: area and international studies librarianship workshop (bloomington: indiana university, ), accessed june , , http://www.indiana.edu/~libarea/provocations.html. . peter zhou, “response ,” in collaboration, advocacy, and recruitment: area and international studies librarianship workshop (bloomington: indiana university, ), accessed june , , http://www.indiana.edu/~libarea/response .html. . jakubs, “modernizing mycroft,” . . hazen, . . lisa carter, “it’s the collections that are special,” in the library with the lead pipe (february , ), accessed june , , http://www.inthelibrarywiththeleadpipe.org/ /its-the-collections-that-are-special/. .
special issue on distinctive collections, research library issues: a bimonthly report from arl, cni [coalition for networked information], and sparc [scholarly publishing and academic resources coalition] (december ), http://publications.arl.org/rli /. . anne r. kenney, “the collaborative imperative: special collections in the digital age,” research library issues , accessed june , , http://publications.arl.org/rli / . . donald j. waters, “the changing role of special collections in scholarly communications,” research library issues , accessed june , , http://publications.arl.org/rli / . . special issue on mainstreaming special collections, research library issues: a bimonthly report from arl, cni, and sparc (october ), accessed june , , http://publications.arl.org/rli /. . h. thomas hickerson, “rebalancing the investment in collections,” research library issues: a bimonthly report from arl, cni, and sparc (december ), – , accessed december , , http://publications.arl.org/rli /. see also the analysis by lisa r. carter, “special at the core: aligning, integrating, and mainstreaming special collections in the research library,” special issue on mainstreaming special collections, research library issues , accessed june , , http://publications.arl.org/rli / . . michael b. moir, “patron-driven acquisitions and the development of research collections: the case of the portuguese canadian history project,” special issue on mainstreaming special collections, research library issues , accessed june , , http://publications.arl.org/rli / . . ibid., . . liz mengel, “the confluence of collections at johns hopkins’s sheridan libraries,” special issue on mainstreaming special collections, research library issues , accessed june , , http://publications.arl.org/rli / . . robert s.
cox, danielle kovacs, rebecca reznick-zellen, aaron rubinstein, and jeremy smith, “metastatic metadata: transferring digital skills and digital comfort at umass amherst,” special issue on mainstreaming special collections, research library issues , accessed june , , http://publications.arl.org/rli / . . barker, “introduction”; zhou. . hazen, – . . jakubs, “modernizing mycroft,” . . ibid. . janice m. jaguszewski and karen williams, “new roles for new times: transforming liaison roles in research libraries” (washington, dc: arl, ), , accessed july , , http://www.arl.org/storage/documents/publications/nrnt-liaison-roles-final.pdf. . james t. simon, sarah van deusen phillips, and marie waltz, “human rights and electronic media: a crl study,” focus on global resources , ( ), accessed july , , http://www.crl.edu/focus/article/ . a conversation with bernard reilly, february , , helped emphasize the potential role of the librarian in such a project. . hazen, . . ibid., . . we appreciate the advice and input of alice schreyer and bernard reilly, whose thoughtful comments helped guide this article. we are also grateful to schreyer and e. haven hawley, who contributed text to the section on paired programs. . spec kit . . arl position description bank, accessed october , . requires authorized login, http://www.uflib.ufl.edu/arlpdbank/. . ku [university of kansas] libraries strategic directions – , , accessed november , , http://lib.ku.edu/strategic-plan. . with the third meeting of the iasc (international and area studies collections in the st century) group in late , a regular opportunity for discussion across area studies may have evolved.
humanities data in the library: integrity, form, access

d-lib magazine march/april volume , number /

thomas padilla
michigan state university
tpadilla@mail.lib.msu.edu
doi: . /march -padilla

abstract

digitally inflected humanities scholarship and pedagogy are on the rise. librarians are engaging this activity in part through a range of digital scholarship initiatives. while these engagements bear value, efforts to reshape library collections in light of demand remain nascent. this paper advances principles derived from practice to inform development of collections that can better support data driven research and pedagogy, examines existing practice in this area for strengths and weaknesses, and extends to consider possible futures.

introduction

commitments to digital humanities, digital history, digital art history, and digital liberal arts are on the rise. these commitments can be witnessed in federal agency and foundation activity, university and college level curriculum development, evolving positions on tenure and promotion, dedicated journals, and the hiring of faculty and staff geared toward enhancing utilization of and critical reflection on computational methods and tools within and across a wide array of disciplinary spaces. librarians have sought to engage these commitments through development of digital scholarship centers, recombination of services, creation of new positions, and implementation of user studies. while these engagements bear value, efforts to reshape library collections in light of demand remain nascent, diffuse, and unevenly distributed.
where traditional library objects like books, images, and audio clips begin to be explored as data, new considerations of integrity, form, and access come to the fore. integrity refers to the documentation practices that ensure data are amenable to critical evaluation. form refers to the formats and data structures that contain data users need to engage in a common set of activities. access refers to technologies used to make data available for use. in order to inform community steps toward developing humanities data collections, the following work advances principles derived from practice that are designed to foster the creation of data that better support digitally inflected humanities scholarship and pedagogy. after these principles are advanced, a wider field of humanities data collection models is considered for strengths and weaknesses along the axes of integrity, form, and access. the work closes with a consideration of humanities data futures that spans questions of discoverability, terms and conditions limiting access, and the possibility of a humanities data reuse paradigm.

humanities data integrity

in preparing humanities data collections, it is instructive to consider documentation practices applied by researchers to the data that are generated from them. consider ted underwood's work on page-by-page genre predictions for , english-language volumes held by the hathitrust digital library. underwood goes into great detail describing the source of his data, who funded the work, what algorithmic methods were used to generate the data, data structures, file naming conventions, decisions to subset the data, and links to software used to generate the data. taken as a whole these documentation practices make the data usable by a wide audience with intentions that simultaneously span applied computational work as well as critique of that work through a theoretical lens.
in order for both angles of inquiry to occur documentation practices must cohere to make data critically addressable. by critically addressable we mean that data documentation allows individuals to evaluate both the technical and social forces that shape the data. a researcher should be able to understand why certain data were included and excluded, why certain transformations were made, and who made those transformations. at the same time, a researcher should have access to the code and tools that were used to effect those transformations. where gaps in the data are native to the vagaries of data production and capture, as is the case with web archives, these nuances must be effectively communicated. part to whole relationships between a digital collection and a larger un-digitized collection must be indicated. periodic and/or routine updates to the data must be signaled. this notion of critical addressability is vital across the full spectrum of the research process, from the researcher seeking to select, evaluate, and extend arguments based on a humanities data collection, to the peer reviewer seeking to understand the composition of a collection, the processes that have been applied to it, and how those factors might affect arguments predicated upon it.

in order to safeguard the integrity, and thus the critical addressability, of humanities data collections, librarians can be guided by three conceptually complementary positions advanced by miriam posner, victoria stodden, and roopika risam. in a series of blog posts and talks miriam posner has advocated for an approach to making digital humanities projects more intelligible by asking a seemingly simple but often complex question: "how did they make that?" the "how did they make that?" approach entails breaking a digital humanities project apart along three levels of analysis: sources, processed, and presentation.
sources relate to the data driving a project, processed refers to processes enacted by the researcher upon the data, and presentation refers to the methods and tools used to present the processed data. although it operates in domains outside of the humanities, victoria stodden's work on reproducibility in scientific computing is complementary to posner's framework. stodden's argument for clear documentation and unimpeded access to code and data driving a research claim makes the "how did they make that?" approach a tenable proposition by helping to ensure that data and code are present for evaluation. roopika risam's work on closing an oft-asserted gap between computation and cultural critique reminds us that documentation in the humanities is about more than reproducibility. documentation of data and process connotes researcher ability to understand the provenance of labor, logics of inclusion and exclusion in data, and the rationale that favors one methodological choice and/or mode of transformation over another.

the considerations that posner, stodden, and risam introduce are not explicitly developed to influence the manner in which libraries prepare and document humanities data collections. however, their perspectives can certainly help us to develop humanities data collections that are readily usable, interpretable, and critically addressable. with posner, stodden, and risam, we approach articulation of an approximate rubric for evaluating the readiness of humanities data collections to support digitally inflected scholarship:

posner: to what extent is information about humanities data collection provenance, processing, and method of presentation available to the user?
stodden: to what extent are data and the code that generates data available to the user?
risam: to what extent are the motivations driving all of the above available to the user?
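
the three rubric questions can also be kept alongside a collection as a small machine-readable checklist. the sketch below is illustrative only: the class name, the field names, and the example values are assumptions made for this example and are not drawn from posner, stodden, or risam. it stores one free-text entry per rubric element and reports which elements are still undocumented.

```python
from dataclasses import dataclass, fields


@dataclass
class CollectionDocumentation:
    """Hypothetical checklist; field names are illustrative, not a standard."""
    provenance: str = ""     # posner: where the source data came from
    processing: str = ""     # posner: what was done to the data, and by whom
    presentation: str = ""   # posner: how the processed data are presented
    data_location: str = ""  # stodden: where the data can be obtained
    code_location: str = ""  # stodden: where the generating code can be obtained
    motivations: str = ""    # risam: why these choices were made


def missing_elements(doc):
    """Return the names of rubric elements that have no documentation yet."""
    return [f.name for f in fields(doc) if not getattr(doc, f.name).strip()]


# invented example values for a partially documented collection
example = CollectionDocumentation(
    provenance="digitized from a special collections print run",
    processing="ocr, then markup stripped to plain text",
    data_location="zip file on the library website",
)
print(missing_elements(example))
```

a checklist like this does not judge the quality of the documentation; it only makes gaps visible before a collection is released.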
committing to these principles in humanities data collection development practice marks a distinct direction for libraries. on the service side libraries typically seek to occlude subjective choices driving collection preparation and organization in the interest of presenting an objective and neutral ordering of objects. yet libraries are never neutral. our systems and practices of organization must be made more transparent. thinking critically on the contours of our collections, we must consider where it makes sense to add the seams back. while universality has long been a core tenet of libraries, presentation of this affect simultaneously erases individual labor committed by actual people with actual opinions and renders collections less readily addressable to critical inquiry.

humanities data form

with the concept of humanities data integrity as a suite of data documentation practices established, we can move on to consider a process for determining what form humanities data objects themselves should take to better support research and pedagogy at a functional level. generally speaking, digitized objects in libraries are relatively consistent in form, having the benefit of multiple decades of digitization standardization. born digital objects are typically more heterogeneous. collectively, these humanities data are instantiated in file formats. data organization within these formats (e.g. structured vs. unstructured) is contingent in part on format and in part on design of the data creator. when preparing a humanities data collection the goal of the librarian is to decide, at a functional level, what data form will be most readily usable for target user communities. depending on the institution this community could be bound by discipline, by local need (e.g. campus), by purpose (e.g. research vs. pedagogy), or by a broadly defined set of users.
generally speaking, some degree of collection transformation will be required in order to better support users who want to interact with collections computationally. librarians can approximate data form requirements by reverse engineering curriculum, web-based projects, research presented at conferences, and scholarly articles produced by and/or relevant to target user communities. for any given disciplinary activity expressed in these zones of engagement there is a relatively common set of tools and methods. across these tools and methods there are common data format requirements. the affordances of data residing within these formats vary. review of tool and method requirements makes it possible to identify core formats in addition to formats with generative potential. formats are "core" when they are fit for use in an unaltered state. formats are "generative" when they have the quality of ready transformability toward a usable state. by extending these considerations across a wider range of pedagogical and research based outputs salient to target user communities, a librarian can begin to develop a strategy for collection transformation that produces more readily functional data.

for an example of reverse engineering toward identification of core and generative formats, consider johanna drucker and david kim's dh course site. the text analysis module provides a tutorial on voyant and the data visualization module provides a tutorial on cytoscape. delving into the documentation for voyant reveals that it accepts data in the following formats: txt, html, pdf, rtf, and doc. cytoscape accepts data in the following formats: sif, nnf, gml, xgmml, sbml, biopax, psi-mi, delimited text, and xls. with respect to voyant all formats aside from txt are concessions to making it easier for users to get data into the text analysis environment.
we know this because the structured data accorded to file formats other than txt are for the most part not leveraged post ingest. therefore, we can settle upon txt as the core format for voyant. cytoscape stands distinct insofar as each format conveys varied functionality post ingest. in this case determination of a suitable format is predicated more on generative potential than on immediate fit for use. we home in on the format with the most generative potential, again, through a process of reverse engineering. networks of characters in novels are commonly represented in the digital humanities. these data typically take the form of a graph. examination of tools used to create graph data quickly surfaces the networkx python software package. networkx can be readily used to convert structured text data into graph data. assuming availability of a novel stored in a plain text file, a user could structure the plain text with a tool like stanford named entity recognizer, which provides machine readable tags. networkx could then use those tags to build the graph data required for exploration via cytoscape. from this relatively small sample, a librarian could begin to infer that plain text data are both core and generative for users interested in text analysis and data visualization.

as librarians shift their consideration across user communities the notion of what is core and generative may shift. this shift is a consequence of varying levels of skill and desired goals. for example, while data held in a series of xml files underlying a tei project are generally less readily usable to an introduction to digital humanities class, they could be considered a core format for more advanced users. with respect to variation in desired goals of a user community, it could be tempting to assume that the majority of users want to work with plain text derivatives of collection objects.
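
to return briefly to the character network example, the path from tagged plain text to graph data can be sketched in a few lines. the sketch below counts how often pairs of characters appear in the same sentence and writes the result as a tab-delimited edge list, a delimited text layout of the kind cytoscape is described above as accepting. it deliberately uses only the python standard library rather than networkx, and the tagged sentences are hypothetical stand-ins for the output of a named entity tagger.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def cooccurrence_edges(tagged_sentences):
    """Count how often each pair of characters appears in the same sentence."""
    counts = Counter()
    for people in tagged_sentences:
        # sort the unique names so (a, b) and (b, a) count as the same edge
        for pair in combinations(sorted(set(people)), 2):
            counts[pair] += 1
    return counts


def write_edge_list(counts, handle):
    """Write a tab-delimited edge list with columns source, target, weight."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "target", "weight"])
    for (source, target), weight in sorted(counts.items()):
        writer.writerow([source, target, weight])


# hypothetical tagger output: the person names found in each sentence
tagged = [
    ["elizabeth", "darcy"],
    ["elizabeth", "jane"],
    ["darcy", "elizabeth", "bingley"],
    ["jane"],
]
edges = cooccurrence_edges(tagged)
buffer = io.StringIO()
write_edge_list(edges, buffer)
print(buffer.getvalue())
```

with networkx, the same pairs could instead be added to a graph object as weighted edges; the counting step would stay the same.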
acting on the assumption that most users want plain text could lead to extraction of and sole provision of plain text data derived from high quality pdf and tiff page images. yet focus on provision of plain text data at the expense of providing page images fails to consider the potential core and/or generative value of page images relative to other types of computational questions. for example, researchers increasingly make use of page images to visualize margin space, line indentation, ornamentation, and text density, as well as to explore automatic detection of poetic content in historical newspapers and automatic identification of images.

humanities data access

addressing humanities data integrity and form must be coupled with a reconsideration of the technical solutions designed to provide access to those data. libraries have historically privileged development of technical solutions that are geared toward emulating aspects of analog object interactions, a decidedly non data oriented approach. page-turners, image zooming, and design biases toward single item use and single item download capabilities inhibit the ability of researchers to work computationally with collections at scale. these interfaces are essentially unusable for researchers who want to text mine, visualize, and/or creatively recombine more than a handful of objects into a work of their own making. in essence, what is required is a way of design thinking that considers what it takes to enable collection wide interactions.

as a case in point consider the example of researchers micki kaufman and doug reside. kaufman engaged in a computationally driven historical study of the meeting memoranda (memcons) and teleconference transcripts (telcons) held in the digital national security archives' (dnsa) kissinger collection. kaufman's set of documents totaled approximately , memcons and telcons detailing kissinger's correspondence during the period - .
like most historians kaufman was interested in exploring aspects of kissinger's foreign policy and personal motivations. the only difference between this research project and another was the scale at which the questions were asked and the methods and tools employed to ask them. on multiple levels, the dnsa was wholly unprepared to support this research. consider for a moment how profoundly unprepared most traditional digital collection interfaces are to support research of this kind.

in a similar vein but on a slightly different trajectory, doug reside was seeking to understand how digital composition practices may have influenced stephen sondheim. librarians in the music division at the library of congress pointed reside to the jonathan larson papers, a recent born digital acquisition that came to the library on a series of floppy disks. reside was able to work with library staff and the larson estate to gain access to the storage media that held the data, to engage in some digital forensic work to access the data, and ultimately transfer data to his own workspace. reside's reflections on this process present vital signposts for thinking about humanities data access:

"i think sometimes we who work in libraries and archives practice our role as guardians of material more fiercely than we practice our role as a collaborator in research ... it's ... important to note that once i had migrated the data to the servers, it was up to me, as a researcher, to make sense of it. i think often we worry too much about doing research for our readers. over the last decade or so we've come to understand that "more product, less process" is a better approach for paper collections, but i still hear a lot of fretting about how we will process and serve born digital collections if we, as library staff, don't know how to access or emulate the files ourselves. my feeling is that our role is simply to give the researchers what they need and get out of the way."
challenges to supporting researchers like kaufman and reside are both technical and social. on the one hand most access systems have not been designed to support them. on the other hand professional standards of care around collection description and access hold potential to be counterproductive. as reside mentions, archivists have sought to balance care and access, in part, by committing to the more product less process model (mplp). as we move forward with humanities data provision in libraries it will be important to pay equal attention to developing technical solutions as well as professional dispositions that support this type of collection work.

existing models

library humanities data collection models vary widely. in order to support more concerted effort in this space, it is necessary to examine models that were explicitly designed to support computational engagement with collections as well as collections and access mechanisms that were not clearly designed for this purpose but nonetheless hold the potential to inform model development. a broad view of cultural heritage institution activity in this space allows us to form a picture of practice that is complementary to humanities data provision goals. this effort reveals the following high level humanities data collection model characteristics.

data are typically made available via three primary locations:
• content steward website: e.g. simple webpages that have static hierarchical structures
• content steward repository: e.g. repository or digital collections software
• community repository: e.g. non content steward owned repository or digital collections software

data are typically made accessible from these locations as:
• compressed collections: e.g. data are accessible via download as zip files
• static collections: e.g. data have fixed structure and are accessible via tools like wget
• databased collections: e.g. data are accessible via application programming interface (api)

data are typically composed of some combination of the following content:
• descriptive metadata: item and collection level description
• objects: text, images, sound, moving images, etc.
• code: programming instructions that produced the data
• documentation: readme files, dtds, etc.

the form of data varies but they share some consistency across content type. integrity of collections and corresponding documentation are not consistent across providers.

content steward website

michigan state university libraries (msul) and the university of north carolina chapel hill libraries (unc) approach humanities data provision in a similar manner. both institutions make data available from their website, independent of repository software acting in an intermediary role. both institutions share commonality in providing access to compressed collections composed of similar combinations of data. with its docsouth data collections, derived from the documenting the american south collections, unc provides access to a series of zip files. each zip file corresponds to an individual collection. docsouth data collections contain a set of tei encoded xml files, an identical set of plain text files with markup stripped out, a table of contents file, a readme file, and an xsl file used to create derivatives within the collection. msul provides access to a wider mix of humanities data that are derived from special collections materials, data purchased from vendors, data licensed from vendors, and data negotiated from corporate entities. with respect to humanities data derived from special collections, msul provides access to a series of files via a library webpage. typically content based objects are placed in zip files, while collected metadata records, readme files, dtds, and title lists are provided separately. each collection has a dedicated webpage that functions as a readme, seen prior to downloading data.
dedicated readme-like webpages describe the digital collection and provide a preferred citation, digital collection creation background, a data summary that encompasses data format, file naming conventions, and data size, and additional sections that document data quality and acknowledge individual staff effort that went into creation of the humanities data collection as well as the source data collection.

the university of pennsylvania libraries' (upenn) openn stands distinct from unc and msul with respect to expanding the number of technical methods for accessing a humanities data collection. upenn provides data via four methods: clicking on links to files via the openn website, anonymous ftp, rsync, and wget. the ftp and rsync methods are presented as tools for downloading data in bulk, with a slight edge given to rsync. the static structure of the collection makes it easy for a researcher to utilize a tool like wget to selectively download items from the collection at scale. documentation for openn data spans licensing, metadata, intended user communities, collection background, image standards and specifications, imaging and processing equipment, and sponsorship that drove collection development.

the primary strength of the content steward website approach is that it is geared toward getting users ready access to collection data at scale through single click download of entire collections or use of simple tools like wget, ftp, and rsync. while an api could enable more granularly expansive access to the collections, sole commitment to an api as an access method risks inhibiting access for users who are either unfamiliar with apis generally or simply fatigued from having to learn yet another api that differs slightly or substantively from the ones they are used to.
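
selective download from a static collection can be sketched as follows, on the assumption that the file listing of the collection is already known. the base url and the listing below are hypothetical and do not describe any of the collections discussed here. the function keeps the paths that match a glob pattern and turns them into absolute urls.

```python
from fnmatch import fnmatchcase
from urllib.parse import urljoin

# hypothetical base url and file listing for a static collection
BASE_URL = "https://data.example.org/collection/"
LISTING = [
    "README.txt",
    "metadata/items.csv",
    "text/volume_001.txt",
    "text/volume_002.txt",
    "images/volume_001_p001.tif",
]


def select_urls(base_url, listing, pattern):
    """Return absolute urls for every listed path that matches a glob pattern."""
    return [
        urljoin(base_url, path)
        for path in listing
        if fnmatchcase(path, pattern)  # case-sensitive glob match
    ]


urls = select_urls(BASE_URL, LISTING, "text/*.txt")
print("\n".join(urls))
```

saved one per line to a file, a list like this could be handed to wget through its input file option (`-i`), so that only the selected items are fetched.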
weaknesses of the content steward website approach include lack of ability to leverage metadata accorded to humanities data collections, minimal integration with larger collections, and lack of provision of an application programming interface (api) for users who want to create a subset of a data collection predicated on multiple parameters.

content steward repository

some institutions make use of their repository or digital collections software to provide access to humanities data. the university of british columbia's (ubc) open collections and the university of pennsylvania's (upenn) magazine of early american datasets are prime examples of this approach. open collections makes data from dspace and contentdm installations accessible via api. ubc provides clear api documentation as well as an in browser api query builder to get users started. the api is intended to help users, "run powerful queries, perform advanced analysis, and build custom views, apps, and widgets with full access to the open collections' metadata and transcripts." upenn's magazine of early american datasets utilizes an installation of the bepress product digital commons to make data available. data are user submitted rather than wholly upenn sourced collections. each collection is discoverable via search or browse from the digital commons interface. default download options typically point to a single file. in order to get access to codebooks and associated data a user clicks on files in an 'additional files' section. metadata describing the item in question is not made available as an additional file.

utilization of the content steward repository approach has the advantage of leveraging metadata accorded to humanities data collections in a meaningful way, integration with other types of content and collections, integration with preservation solutions, and provision of application programming interfaces to enable granularly expansive access to data.
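
what it means to create a subset predicated on multiple parameters can be shown with a small local analogue of an api query. the sketch below is not the api of any repository named here; the record fields and values are hypothetical. each keyword argument is a criterion, given either as a single value for an exact match or as a (low, high) pair for an inclusive range.

```python
def subset(records, **criteria):
    """Keep the records whose fields satisfy every criterion.

    A criterion is either a single value (exact match) or a
    (low, high) tuple (inclusive range).
    """
    def matches(record):
        for field, wanted in criteria.items():
            value = record.get(field)
            if value is None:
                return False
            if isinstance(wanted, tuple):
                low, high = wanted
                if not low <= value <= high:
                    return False
            elif value != wanted:
                return False
        return True

    return [record for record in records if matches(record)]


# hypothetical item-level metadata records
records = [
    {"id": "a1", "language": "eng", "year": 1850, "type": "text"},
    {"id": "a2", "language": "fre", "year": 1855, "type": "text"},
    {"id": "a3", "language": "eng", "year": 1901, "type": "image"},
    {"id": "a4", "language": "eng", "year": 1872, "type": "text"},
]
selected = subset(records, language="eng", type="text", year=(1840, 1880))
print([record["id"] for record in selected])
```

an api moves this filtering to the server, so that a user downloads only the matching items rather than the whole collection.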
weaknesses of this approach depend on the combination of access mechanisms. provision of an api at the exclusion of easier to use methods for beginners runs the risk of alienating users. the api is also potentially a barrier for more advanced users whose data needs involve few parameters and who simply seek collection wide access. where the size of a given collection inhibits the ability to enable single click collection download, cultural heritage organizations may do well to explore the viability of making their data available via academic torrent.

community repository

some institutions contribute their data to a repository that they do not own. examples of this approach include use of github by indiana university bloomington and the tate modern gallery, as well as a multitude of institutions making use of hathitrust and the digital public library of america. indiana university bloomington uses github to make tei collections available for, "... easier harvesting and re-purposing ... [so that] content can ... be analyzed, parsed, and remixed outside of the context of its native interface for broader impact and exposure." collection composition spans metadata as well as objects. the tate modern gallery makes collection metadata available for about , artworks. it is not immediately clear what the purpose of this effort is, yet from the readme associated with the collection it appears that the tate looks positively on creative remixing, visualization, and analysis of its collection metadata.

the digital public library of america (dpla) and hathitrust research center (htrc) stand distinct from the prior examples in the sense that they are explicitly focused on gathering digital collections materials from cultural heritage organizations and operate under a nonprofit model. dpla currently focuses on aggregation of collections metadata.
thus, use of the dpla api provides access to metadata at scale, but not direct access to the digital objects that the metadata describe. provision of a metadata-focused api is intended to, "encourage the independent development of applications, tools, and resources that make use of data contained in the dpla platform in new and innovative ways, from anywhere, at any time." to date an interesting array of applications have been developed using the dpla api that help users navigate aggregated collections by color, visualize term frequency over time, and visualize content license type distribution across content in dpla. dpla provides thorough documentation for its api, a statement on api design philosophy, and a number of code samples that make use of the api. for those not inclined to use an api, dpla provides bulk access to metadata collections as gzipped json files via its tools for developers. the hathitrust research center (htrc) offers an api to access data, the htrc portal interface to execute computational jobs against data using htrc compute resources, a number of datasets that are constituted by extracted features of works held by hathitrust, and at some point in htrc aims to provide the ability to analyze in-copyright works via the htrc data capsule. the htrc data api provides access to zip files that contain plain text volumes, pages of volumes, and associated xml metadata records in the mets format. the htrc portal interface lets users build "worksets" that resolve to objects like literary texts, which can in turn be submitted to htrc for processing under a predetermined set of algorithmic approaches. after these processes run, users gain access to the data generated from analysis. the primary extracted features dataset spans . million volumes and consists of volume features as well as page-level features like number of tokens on a page, line count, and languages identified on a page.
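as a rough illustration of how page-level features of this kind can be put to work, the python sketch below rolls hypothetical per-page counts up into a volume-level summary. the record layout and field names (`pages`, `token_count`, `line_count`, `languages`) are assumptions made for the example and are not the actual htrc extracted features schema; a real file would need to be inspected before adapting the code.

```python
import json
from collections import Counter

# Hypothetical record layout, for illustration only. The real HTRC
# extracted features files may use different keys and nesting.
SAMPLE_VOLUME = json.loads("""
{
  "id": "example.volume.001",
  "pages": [
    {"seq": 1, "token_count": 120, "line_count": 14, "languages": ["en"]},
    {"seq": 2, "token_count": 305, "line_count": 32, "languages": ["en", "la"]},
    {"seq": 3, "token_count": 0, "line_count": 0, "languages": []}
  ]
}
""")


def summarize_volume(volume):
    """Aggregate page-level features into a single volume-level summary."""
    pages = volume["pages"]
    # Count how many pages each language was identified on.
    languages = Counter(lang for page in pages for lang in page["languages"])
    return {
        "id": volume["id"],
        "page_count": len(pages),
        "total_tokens": sum(page["token_count"] for page in pages),
        "total_lines": sum(page["line_count"] for page in pages),
        "blank_pages": sum(1 for page in pages if page["token_count"] == 0),
        "languages": dict(languages),
    }


summary = summarize_volume(SAMPLE_VOLUME)
print(summary)
```

because the sketch touches only derived counts and never the page text itself, the same pattern would in principle apply to any features dataset of this shape.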
these data are accessible via rsync as compressed tar files that contain data stored in the json format. in addition to this large scale extracted features dataset, users have the option to generate an extracted features dataset from a custom workset. finally, the htrc data capsule will provide a secure environment for analyzing in-copyright works under a non consumptive paradigm. in this framing users are not afforded the possibility of accessing full text resources on their own device. rather, htrc mediates computational requests and returns output generated by those requests to users. the strengths of the community repository approach are many. community repositories expose disparate collections to a wider audience than source repositories. in doing so they encourage wider use. organizations that lead community repositories often have greater ability to advocate for community positions on copyright and licensing that better support research and creative works. hathitrust has been active in this area through pursuit of the non consumptive paradigm and authors guild v. hathitrust. dpla has been active in this area through international harmonization of rights statements. the weaknesses of this model are better understood as opportunities. given broad reach and broad care over data collections, these repositories bear greater impact than any single institution affiliated repository. the decisions that managers of these repositories make regarding development of collections and technical features to meet target user community need are consequential for a broad spectrum of potential users. while the dpla is wonderful on the computational side for developers it is an open question how well it is suited for computationally inflected research and pedagogy. on the flip side, htrc is wonderful on the computational side for research, yet it is an open question how well suited it is for developers. 
it would be unfair to expect any one community repository to serve all potential users, but given their size and influence, a greater responsibility is borne given the range of possibilities their work engenders.   humanities data futures as the library community dedicates more effort to developing collections and platforms that support digitally inflected research and pedagogy, it will become increasingly important to consider challenges and opportunities inherent in the development of more robust solutions. it must be acknowledged that present humanities data collection development is diffuse, sometimes focusing on meeting the needs of a broadly (inter)disciplinary and (inter)professional community like the digital humanities, and other times having a narrower scope. in order to develop more resonant collections and platforms, needs assessment and other forms of user research must be executed in a more systematic and sustained manner. a small number of targeted studies show initial promise in this area, and the increased activity of the digital library federation assessment interest group is encouraging. in the few places where humanities data collections exist they are siloed and difficult to discover outside of their institutional bounds. it is likely the case that there will never be a one ring to rule them all for humanities data discovery. effort in this area could very well trend toward a series of specialized spaces not dissimilar to how repositories have arisen in other areas of inquiry. the repositories maintained by the inter-university consortium for political and social research (icpsr) and the world historical dataverse provide cogent examples. in lieu of consolidated effort in this space, librarians should of course still consider local solutions while keeping an eye on future interoperability.
one possible approach to doing both could reside in developing technical solutions that integrate with open source efforts like fedora commons, hydra in a box, and archivesspace. in the present study, focus has been placed on humanities data collections derived from library owned collections rather than collections licensed from vendors. librarians should have greater control over collections the library owns, making for an easier process of transformation that should in turn provide a local precedent that helps frame a productive discussion with vendors, especially with respect to data form and access. the integrity conversation will likely be more challenging. terms and conditions associated with many vendor products come into direct collision with principles of data reuse and research transparency. for example, terms and conditions show their age particularly around the notion of sharing "snippets" of text where a possible research question is predicated on many thousands of documents and potentially millions of words. under such a restriction how might research in this vein, predicated on resources with these terms, be properly evaluated by a peer community that may or may not have access to the data through their institution? where vendors don't actually own content they are in a difficult yet potentially promising position to broker negotiations between content owners and library licensees to make works from a diverse set of sources more usable at scale. on the nonprofit side of things, portico has expressed interest in thinking about how they might help the library community develop services and tools to support scholarship that relies on text and data mining. present humanities data provision is predicated on a push paradigm. in a push paradigm, humanities data collection development is focused on enabling data access. 
the push paradigm does not consider how to incentivize pulling humanities data back into the collection once it has been used for a given application. a push-pull paradigm for humanities data collection development aligns technical infrastructure and data preparation practices with a growing emphasis on the value of data reuse, research reproducibility, and transparency. operating under a push-pull paradigm, a library could make a data collection available; a faculty member teaching a digital humanities course could subset it and transform it into a graph data format in order to teach a network analysis class; and, following the class, the faculty member could place their data back into the repository in such a way that the provenance between the source data collection and the network data is established. ideally the link between the source data collection and the derivative data produced by the faculty member would be represented as a point of metadata that would increase the discoverability of both the source data and the derivative data for audiences seeking collections that can readily support digitally inflected research and pedagogy. the library benefits in this scenario by gaining a firmer handle on how often collections are being used and for what purposes. a broader community of users benefits from ready access to source data collections and data derived from them. maintaining public connection between source data and derived data has a corollary benefit of giving users a sense of collection possibility. creators of data uploaded back to the library benefit from having a place to make their data accessible in such a way that reusers of that data and peer reviewers have a concrete sense of where the data in question originated. this article has advanced a series of principles to inform humanities data collection development in light of demand posed by digitally inflected scholarship and pedagogy.
by paying keen attention to integrity, form, and access librarians and others operating in the cultural heritage sector situate themselves well to adapt these principles to reshape their collections to become more amenable to computational methods and tools. fluency gained with data throughout the process of humanities data collection development positions the librarian as research partner as well as content provider. this article extended to review current practice, both explicitly focused on humanities data provision as well as those that inspire this effort, and concluded with suggestions for future directions in this space. while efforts in this area are nascent, they offer exciting opportunities for thinking anew about how librarians and the collections they steward can catalyze research, pedagogy, and our collective creative potential.   references [ ] "grinnell college, university of iowa join forces to expand use of digital technology." [ ] livadas, greg. "rit to offer bachelor's degree in digital humanities and social sciences." [ ] roy rosenzweig center for history and new media, department of history, art history, george mason university, university drive, and msn e fairfax. "getty foundation funds institute for art history graduate students at rrchnm. [ ] "humanidades digitales — acerca de." [ ] "global outlook::digital humanities." [ ] "guidelines for evaluating work in digital humanities and digital media | modern language association." [ ] denbo, seth. "aha council approves guidelines for evaluation of digital projects." [ ] green, harriett e., and angela courtney. "beyond the scanned image: a needs assessment of scholarly users of digital collections." college & research libraries, september , , crl — . http://doi.org/ . /crl. . . [ ] underwood, ted. page-level genre metadata for english-language volumes in hathitrust, - . figshare, . https://doi.org/ . /m .figshare. [ ] "tedunderwood/genre." github. 
[ ] jackson, andy, "the provenance of web archives — uk web archive blog." [ ] rosenthal, david, "dshr's blog: you get what you get and you don't get upset." [ ] francois, pieter. "the sample generator — part : origins — digital scholarship blog." [ ] lincoln, matthew d. "some problems with glam data on github." matthew lincoln, january , . [ ] posner, miriam. how did they make that?, . [ ] stodden, victoria. "the scientific method in practice: reproducibility in the computational sciences." ssrn electronic journal, . http://doi.org/ . /ssrn. [ ] stodden, victoria. "trust your science? open your data and code." [ ] risam, roopika. "beyond the margins: intersectionality and the digital humanities". digital humanities quarterly , no. . . [ ] sadler, bess and chris bourg. "feminism and the future of library discover". code lib journal . . [ ] masters, christine l. "women's ways of structuring data." ada: a journal of gender, new media, and technology, november , . [ ] bowker, geoffrey c., and susan leigh star. sorting things out: classification and its consequences. cambridge, mass.: mit press, . [ ] drucker, johanna and david kim. "introduction to digital humanities | concepts, methods, and tutorials for students and instructors." [ ] sinclair, stéfan and geoffrey rockwell. voyeur tools (home page), . [ ] "cytoscape: an open source platform for complex network analysis and visualization." [ ] houston, natalie m., "visual page." [ ] elizabeth lorang et al., "developing an image-based classifier for detecting poetic content in historic newspaper collections," d-lib magazine , no. / (july ), http://doi.org/ . /july -lorang [ ] british library. "the mechanical curator." [ ] kaufman, micki. "everything on paper will be used against me: quantifying kissinger." [ ] manus, susan. "digging up the recent past: an interview with doug reside." the signal: digital preservation, n.d. [ ] greene, mark, and dennis meissner. 
"more product, less process: revamping traditional archival processing." the american archivist , no. (september ): — . http://doi.org/ . /aarc. . .c k [ ] padilla, thomas and matthew lincoln. "data-driven art history: framing, adapting, documenting". dh+lib data praxis. [ ] dalmau, michelle. "tei and plain text from digital collections services, indiana university libraries." [ ] drass, eric. "tate explorer." [ ] kräutli, florian. "the tate collection on github." [ ] nelson, chad. "color browse." [ ] farr, dean. "term frequency map." [ ] farr, dean. "dpla licenses." [ ] capitanu, boris ted underwood, peter organisciak, sayan bhattacharyya, loretta auvil, colleen fallaw, and j. stephen downie. extracted feature dataset from . million hathitrust digital library public domain volumes ( . ) [dataset]. hathitrust research center, . http://doi.org/ . /j td v m [ ] plale, beth, atul prakash, and robert mcdonald, "the data capsule for non-consumptive research: final report," february , . [ ] gore, emily, getting it right on rights." digital public library of america. [ ] green, harriett e., and angela courtney. "beyond the scanned image: a needs assessment of scholarly users of digital collections." college & research libraries, september , , crl — . [ ] green, harriett e. . "an analysis of the use and preservation of monk text mining research software." literary and linguistic computing , no. : - . http://doi.org/ . /llc/fqt [ ] ixchel m. faniel and ann zimmerman, "beyond the data deluge: a research agenda for large-scale data sharing and reuse," international journal of digital curation , no. (august , ): — . http://doi.org/ . /ijdc.v i .   about the author thomas padilla is digital scholarship librarian at michigan state university libraries. he publishes, presents, and teaches widely on humanities data, data curation, and data information literacy. 
recent national and international presentation and teaching venues include but are not limited to: the annual meeting of the american historical association, the humanities intensive learning and teaching institute, digital humanities, the digital library federation, and advancing research communication and scholarship. thomas serves as an editor for dhcommons journal and dh + lib data praxis. thomas also currently serves as co-convener of the association of college and research libraries digital humanities interest group.

copyright © thomas padilla

white paper report
report id:
application number: hd
project director: jon miller (jonmill@usc.edu)
institution: university of southern california
reporting period: / / - / /
report due: / /
date submitted: / /

white paper
grant number: hd- -
division: digital humanities
program: digital humanities start-up grants
title: essays in visual history: making use of the international mission photography archive
grant status: awarded $ ,
principal investigators at the university of southern california:
jon miller, senior research associate, center for religion and civic culture
matt gainer, director of the digital library
grantee institution: dornsife college of letters, arts and sciences. university of southern california

essays in visual history: making use of the international mission photography archive

narrative

at the university of southern california, the level one start-up grant from the office of digital humanities supported a two-day workshop for the purpose of conceptualizing and planning essays in visual history, a series of visually informed compositions that will draw upon an established repository of historical materials: the international mission photography archive (impa).
while our agenda for the workshop was thus narrowly focused, many of the strategic decisions we needed to make for essays in visual history will be shared by other humanities projects that seek to incorporate visual materials into scholarly presentations.

a. background: the international mission photography archive

essays in visual history will be a continuing series of visual essays to be authored by accomplished scholars who use images from the international mission photography archive (impa) to explore topics in their areas of expertise. the nearly , historical photographs presently in the impa database, most of them taken between and wwii, represent cultures across africa, india, china, korea, japan, oceania, the caribbean, and papua new guinea. they comprise an extraordinary resource for comparative research in the humanities. when fully implemented, essays in visual history will be hosted by the usc digital library and featured on the website of the center for religion and civic culture (crcc). the model of the visual essay and the idea of an online series of such compositions were conceptualized by the impa executive board during a meeting in september, .

when it is reduced to its key elements, a successful scholarly essay in traditional print form offers a thoughtful argument focused on a topic in the author’s special area of expertise. it features a narrative that is anchored in credible supporting evidence and presented with the intention of encouraging intellectual exchange. this uncomplicated format with all its variations has endured for centuries in print scholarship because it serves its function well. however, the incorporation of visual materials into an essay, a matter of increasing interest in the humanities and social sciences, presents special challenges that ask for innovative forms of presentation.
in static print publications, reproduction costs and concerns about rendition quality have often limited the use of visual materials or ruled them out altogether, and even where such materials are encouraged, the relationship between words and images is always attenuated by the static and confining physical limits of the page. rapidly evolving digital and video technologies offer scholars attractive ways to work around these limitations. when they are published online, compositions that feature visual content can make use of a wide array of recently developed presentation tools, and at the same time, dissemination costs for those compositions are of less pressing concern than they are for print publications. most important from our perspective, in digital video format an essay’s spoken narrative can be wrapped around digitized images much more dynamically than print allows, resulting in a fluid scholarly presentation that is engaging in very different ways.

http://digitallibrary.usc.edu/impa/controller/index.htm http://digitallibrary.usc.edu/search/controller/index.htm http://crcc.usc.edu/

taking advantage of these developments, our goal in this enterprise is to initiate a new option for humanities scholarship, namely, essays in the form of narrated videos that can be quickly and broadly disseminated via a growing variety of web publishing platforms. scholarship in the humanities has always relied on the sharing of arguments and commentaries, but perhaps the most critical feature of our project is that the scholarly arguments and the visual materials and metadata upon which they are based will be equally accessible in the same place, that is, on the impa website. viewers of the essays, including peer reviewers, can reach into the impa database to view an essay’s individual images in much greater detail.
they can examine the descriptive material attached to the images, search for other pictures that may bear upon the essay’s claims, or follow up on ideas of their own that may have emerged as they were viewing the essay. in short, the audience for an essay can directly engage the primary materials that comprise its essential content and thus be drawn into an extensive body of archival resources that might otherwise escape their attention entirely.

the essays will also be accompanied by searchable transcripts of their central narratives, bibliographies, biographical sketches of the authors, and timelines and maps that can provide important background context. the usc digital library will archive and provide access to the essays so that they remain openly accessible for research and reference. they can be shared by viewers using a variety of social media applications, and we will encourage the authors to participate in public online conversations about the essays they create. for pedagogical use the essays can be incorporated into syllabi; excerpts from them or selections from their constituent elements can be embedded in online presentations and used in lectures and assignments; and instructors can encourage their students to use the video format that we develop to fulfill their required essay assignments for the class.

b. project activities

to inaugurate this series, historian paul jenkins began work on his prototype essay, “reading an image in the other context,” in . in july, that evolving essay served as the focal point for the odh-funded workshop for the series, supported by a level one start-up grant from the office of digital humanities. for the workshop, principal investigators jon miller and matt gainer were joined by paul jenkins and two colleagues from other universities: david morgan from duke, and patricia lawton from notre dame.
a third intended participant, martha smalley from the yale divinity school, was unable to attend the workshop because of a family emergency, but she came to los angeles at a later time for a full day of consultation with the principal investigators. her evaluation of the project and her suggestions for future refinements are represented in this report. from usc, seven faculty members and several graduate students from various departments joined the discussion. in addition to jenkins’ original essay, the outline of a second essay by him and a rough cut of an essay under development by co-pi jon miller were also presented and discussed.

c. workshop accomplishments

the workshop was intended to produce a set of framing parameters for the planned essay series. using the three pilot essays as points of departure, two days of conversations yielded the following conclusions and guidelines that will shape the project as we move into the implementation phase:

1. authoring tools. our first production efforts represented a trial-and-error process that was labor intensive and somewhat confining. in the interest of cost-effectiveness and consistency, we are now focusing our efforts on building a more accessible set of applications that can be used by a constituency of investigators whose “digital savvy” will range from rudimentary to fairly sophisticated. research has shown that many investigators will avoid the use of visual materials in the ways we propose if the technical learning curve is too steep. with this in mind, the workshop participants stressed the clear need for a user-friendly toolset that will enable authors in one integrated process to search and select images from the impa database, capture the relevant descriptive information (metadata) for those images, and merge those visual materials with the recorded presentation of the essay’s narrative core. the goal is to enable an author with moderate technical skills to control as much of the production process as possible, with digital library staff and other technical experts consulting closely on the process and providing technical guidance when it is called for.

2. multimedia capability. while the toolset developed for the implementation phase must remain accessible to relative novices in the area of digital technology, the workshop also called attention to the need for greater multimedia range, that is, for technical versatility beyond a simple linear video presentation. the early essays will share certain signature framing or “branding” elements and relatively simple technical features, but in time the process will allow more multimedia freedom of form and flexibility of delivery in order to encourage a diversity of authors to address an increasingly wide range of essay topics.

3. role of the executive board. the impa executive board includes individuals with strong credentials as scholars and archivists. they have worked together for over a decade, presiding over the growth in size and range of coverage of the impa database. for the essay implementation phase, this group will take an active role in the selection of topics and authors and will be charged with monitoring the quality of the essays as they move through the production and publication process. it is likely that as the essay series matures this group will gradually share these selection and oversight functions with a more broadly-based multi-disciplinary editorial committee.

4. invited versus open submissions. the early essays carry the burden of establishing the credibility of the series. for this reason, it is sensible to rely on invited scholars for the first group of essays and then later move toward more open invitations through various professional communications channels. essayists will include both established and early career scholars. in later phases of the project, the eligibility of students working under the supervision of academic advisors will also be considered.

5. scholarly breadth. to exploit the full richness of the impa database, the essay series must be multi-disciplinary, not narrowly focused on topics in religious studies. the production, accumulation, and preservation of the photographs were certainly supported by the religious motivation of the missionaries, but the content of the pictures is by no means confined to religious subjects. it is a historical record of cultural diversity that is of interest across many scholarly fields. to exploit the richness of the repository, the executive board should encourage scholars from different fields to participate and address a very broad range of topics.

6. length. participants in the workshop agreed that fifteen minutes is a reasonable average length for the essays, but that shorter or longer presentations, provided that their quality is high, should not be ruled out as a matter of policy.

7. relationship to scalar. usc is home to the mellon-funded scalar initiative, which is building a sophisticated and multi-faceted publishing platform for a range of different types of digital scholarship. the principal investigators from that initiative participated in the workshop and outlined a number of ways in which essays in visual history can be integrated into the larger community of digital humanities scholarship. as the project continues we will identify the ways in which the essay series can collaborate with the scalar initiative in a way that preserves the core strengths of the traditional essay form, which emphasizes careful documentation (footnotes and references where appropriate) and observes rules of copyrights and permissions.

d. audiences

the planning workshop supported by the grant was convened for the specific purpose of clarifying the outlines, objectives, and methodologies of the visual essays project.
the overarching objective was and continues to be the development of proposals for external support. the workshop’s list of participants was therefore narrowly defined, as was the intended audience for its deliberations. broader audiences in the humanities and social sciences and other institutions engaged in those scholarly pursuits will be addressed when the project moves fully into the public implementation phase.

e. evaluation

at the conclusion of the workshop, all of the participants were asked to provide verbal and written comments on each of the seven topics listed under accomplishments above. it was their feedback that to a large extent shaped that list of conclusions and concerns.

f. continuation of the project

the executive board of the international mission photography archive is committed to the implementation of the essay series, and the members of the board continue to publicize the project and encourage investigators to consider contributing to the series. the search for external resources to cover the costs of the project is ongoing. the usc digital library and the center for religion and civic culture have entered into a formal agreement that assures the security and proper maintenance of the impa database. the principal investigators for the start-up grant will oversee the continuing development of the essay series and will continue to make progress on individual essays as local resources become available. they will approach potential funding sources with grant proposals whenever the opportunity arises. an implementation grant proposal submitted to odh was not funded; it will be revised and resubmitted, and proposals to other agencies, including private foundations, are under development. preliminary discussion of a possible collaborative relationship with the center for global christianity and mission at the boston university school of theology is underway.
if such a linkage is established, it will create a broader base of institutional support for the essay series and provide stronger connections to an important multidisciplinary constituency of scholars.

g. long term impact

the long term impact of essays in visual history can only be assessed when secure support for the project is in place and a critical number of mature essays have been published. if the series is a success, it will be a continuing source of creative production across a range of visual studies fields. as we envision the essays, they lend themselves naturally to classroom presentation, and assignments encouraging students to produce new essays can easily be built into upper-division and graduate curricula.

h. grant products

the inaugural visual essay authored by historian paul jenkins has been extensively revised and published on the impa website. it is available for viewing at: http://digitallibrary.usc.edu/cdm/vijenkins

hypatia vol. , no. (winter ) © by hypatia, inc.

invited review essay

cripping feminist technoscience

feminist, queer, crip. by alison kafer. bloomington: indiana university press, .
disability in science fiction: representations of technology as cure. by kathryn allen. palgrave macmillan, .
seizing the means of reproduction: entanglements of feminism, health, and technoscience. by michelle murphy. durham, n.c: duke university press, .
the posthuman. by rosi braidotti. cambridge, uk: polity press, .

aimi hamraie

in feminist technoscience studies (fts), the term technoscience conveys that scientific knowledge and technological worlds are active constructions of entangled material, social, and historical agents. feminist analyses of assisted reproduction, environmental harm, digital media, and cyborg bodies constitute some of the work of fts, a close sibling of the new materialisms and post-positivist feminist philosophies of science.
technoscience is also a familiar object of inquiry for scholars of critical disability studies (ds). ds’s historical, sociological, and philosophical engagements with medicine, the politics of design, selective reproduction, fictional cyborgs, and technology users make clear that ds and fts scholars share at least some understandings of technoscience. however, whereas feminist disability studies has emerged as a field containing hybrid developments and reciprocal critical exchanges between feminist and disability theories of embodiment, knowledge, and ethics (garland-thomson ; tremain ), the field of feminist disability technoscience studies is only on the cusp of emergence. the missing ingredient, it seems, has been the technoscientific elaboration of crip theory. crip, a concept paralleling the critical work of queer in sexuality studies, is a highly debated and contested term marking resistance to what robert mcruer calls “compulsory able-bodiedness” (mcruer ). cripping actively resists compliance with supposedly normal embodiment, behavior, and desired futures. instead, it understands disability as productive possibility and resource. the texts under review each provide a crucial puzzle piece imagining the futures of a crip feminist technoscience studies (cfts). this field dwells with the insights of feminist materialism and fts while also remaining attentive to what types of difference and embodiment are valued, omitted, or normalized when we talk about disability, objectivity, and technology. read together, the four books braid ethical and political concerns with diverse methodological tactics that will serve cfts well as it becomes a recognized trajectory for critical and activist inquiry. alison kafer’s feminist, queer, crip is a technology that produces, organizes, politicizes, and mobilizes critical studies of technoscience.
that is, it is not merely a material object, a set of well-researched scholarly arguments, or a helpful teaching tool (though it is all of those things). rather, feminist, queer, crip is an agentive tool and demonstrative manifesto of crip futurity. kafer weaves her own politicized, embodied experiences through savvy engagements with familiar theorists, including jodi dean, jasbir puar, ladelle mcwhorter, josé esteban muñoz, lee edelman, and heather love, leaving no doubt about the crucial place of crip analytics in the canon of feminist and queer critical social, political, and literary theories. the book earns its place in this canon on its own terms: through a series of active demonstrations of what a feminist queer crip analytic can do to create accessible futures. kafer’s overarching theme—that we should agitate against the depoliticization of disability in our understandings of futurity—reaches out of the text into the lifeworld, where feminist, queer, and crip activists build coalitions to create environmental and reproductive justice, liberate disability-shaming billboards with spray paint, find restrooms that disabled and gender nonconforming people can access, and invite readers to reach back in and take part. of particular interest to queer theorists, bioethicists, and activists alike, the first several chapters distinguish between curative time (the temporality of medical treatment, nostalgia for able-bodied pasts, and a eugenic impulse for a disability-free future) and crip time (the alternative temporalities of crip embodiment, reproduction, and engagement with social and built environments).
cripping edelman’s no future, kafer shows us that the expectations of reproductive futurity also operate through compulsory able-bodiedness and able-mindedness, framing cure and elimination as the ideal and preferred future for disabled people, characterizing disabled lives as not worth living, and writing the lived realities of the most marginalized people out of the present. in a series of case-study chapters—on the “ashley treatment,” consisting of growth attenuation procedures for a developmentally disabled girl referred to as “ashley x”; bioethical conflicts over deaf lesbians attempting to conceive a deaf child (put in conversation with marge piercy’s feminist utopian novel, woman on the edge of time); and the foundation for a better life’s billboard campaign promoting personal overcoming and representing disability as tragedy—kafer’s meticulous feminist and queer analyses of news media, literature, and visual and medical representation reveal that the tensions in all of these cases emerge from conflicts over the existence of disabled people in ideal futures. the book’s fifth chapter, “the cyborg and the crip,” lays a sophisticated yet radical foundation for what i am calling cfts, providing the most accessible, concise, and rigorous critique of donna haraway’s cyborg concept, and its circulation within feminist theories, to date. in their attempt to celebrate a post-essentialist hybridity of body, identity, and machine, kafer insists, cyborg theories actually depoliticize disability. that is, rather than understanding disability as a construction of medical knowledge and inaccessible worlds, fts scholars conflate disabled people with cyborgs and use their bodies as evidence of an inevitable post-human future. as a result, “‘cyborg’ and ‘physically disabled person’ are seen as synonymous. or, rather, that ‘person with physical disabilities’ is a self-evident, commonsense category of cyborgism” ( ).
by contrast, kafer reminds us of the serious economic, physical, and functional barriers to accessing and using adaptive technologies, which often leave disabled people with an “ambivalent relationship to technology” ( ). rather than eschewing technology altogether, kafer shows us what a cfts analytic can do: “cripping the cyborg, developing a non-ableist cyborg politics, requires understanding disabled people as cyborgs not because of our bodies (e.g., our use of [adaptive technologies]), but because of our political practices” ( ). crip feminist technoscientific practices thus include those that politicize disabled people’s relationships to technologies produced by the military or pharmaceutical companies, while valuing the technoscientific activism that has characterized disability rights history. such activism includes demands for accessible worlds, technologies that resist compulsory heterosexuality and able-bodiedness, diy technology activism, and disabled people’s use of technologies as diverse as the internet and wheelchairs to participate in political actions. in the final two chapters, kafer crips environmental justice by drawing upon a foundational claim of ds: that environments—both natural and built—are actively constructed rather than given. knitting together feminist materialism, eco-critique, critical race theories, and disability cultural production, kafer demonstrates the possibilities for crip eco-politics that “take disability experiences seriously, as sites of knowledge production about nature” ( ). she builds upon this engagement with “environment” and “nature” by exploring crip possibilities for environmental politics and activism: crip, anti-ableist environmental justice and reproductive justice work, which foregrounds race and class analysis alongside feminist and disability politics, and “restroom revolutionaries” working toward restroom access for crips and gender nonconforming people.
in all of these cases, concrete examples of art, cultural production, and activism underscore that creating accessible futures requires understanding disability as a critical technoscientific phenomenon grounded in a relational politics of interdependence. whereas kafer foregrounds real life disabled people, new work in disability science-fiction studies catalogues the normalizing role of technology in literature, television, and film. an edited volume of work by junior, senior, and independent scholars, many of whom are new to ds, kathryn allen’s disability in science fiction: representations of technology as cure marks an important milestone for cfts: the entrance of the social-constructivist model of disability into science-fiction scholarship. scholars in ds and fts will be especially interested in allen’s critical introduction and several theoretical chapters, most notably those by antonio cascais, joanne woiak, and hioni karamanos, which demonstrate the theoretical purchase of crip critiques of cure, rehabilitation, and enhancement. echoing kafer, allen reminds us, “the impulse to imagine our future selves as post-human paragons ignores the lived realities of the various bodies that rely on prosthetic technology today in ways that are mundane, visceral, and difficult” ( ). thus, although focused on fiction, the volume serves as a constant reminder of the ideological and cultural work of representation for nonfictional disabled people. although rarely engaging directly with feminist, queer, or anti-racist theories, the essays in disability in science fiction capture the value of ds for technoscience scholarship. most of the authors draw upon a canonical set of disability theorists. their (fairly uniform) field-building work thus consists of matching disability theories to textual examples (rather than producing new disability theory).
hardly a scholarly limitation, this consistent elaboration of the social model renders the volume as a whole more teachable. a course on disability science fiction could very effectively assign a few key ds texts alongside these essays and their primary sources, which include star wars, avatar, the bionic woman, flowers for algernon, and the novels of octavia butler. the volume’s constructivism also displays the range of what a cfts analytic can do. from chapters on prosthetics and cyborgs (by donna binns, netty mattar, and brent cline) and physical disability (ralph colvino and leigh mcreynolds) to architecture (robert cape jr.), autism (christy tidwell), and genetic disability (gerry canavan), the volume’s variety of engagement makes it exciting to imagine a next wave of disability science-fiction studies. following kafer, this work could engage with science fiction through critical feminist, queer, and crip theories of embodiment, futurity, and normalization. we can also imagine future work in this area that incorporates feminist technoscience theories, which show us that the adaptation and reuse of technology can subvert logics of cure and normalization to serve activist political purposes. a text foregrounding precisely such technoscience activism, michelle murphy’s seizing the means of reproduction: entanglements of feminism, health, and technoscience, contributes to feminist-materialist, crip, and anti-racist fts through a critical history of the s women’s self-help movement in california. murphy focuses on feminists as agents of technoscience, coining the term procedural feminism to describe the ways that they “appropriated, revised, and invented reproductive health care techniques” ( ). feminist procedural activists used technoscience to practice epistemic activism, contest medical objectivity, and make “technoscientific counter-conduct” ( ) a key method of feminist praxis.
in a fresh reading of both familiar and forgotten spaces, such as feminist health clinics (chapter ), practices, such as group cervical exams (chapter ), and technologies, such as pap smears (chapter ) and low-tech menstrual self-regulation devices (chapter ), murphy vividly and accessibly renders, through storytelling and archival images, the assemblages of feminists and technologies that produced the women’s health movement. crucially, murphy shows how enacting technoscience as activism can mobilize affective economies and posit alternative epistemologies (cleverly termed “immodest witnessing”). a more general notion of procedural activism, modeled on murphy’s procedural feminism, could be a core concept for cfts, describing the politicized technoscience activism that kafer so carefully locates in her crip cyborg theory. using murphy’s concept as a guide, we can imagine crip feminist technoscience scholarship about how the design, invention, and use of tools, machines, and even built environments can become a site for crip “hacktivism” (such as disabled people self-inventing adaptive technologies, designing accessible restrooms, and developing low-cost, 3d printed prosthetics). murphy’s most elegant, attentive, and significant contribution is her diligent attention to race and the construction of whiteness in feminist attempts to do technoscience. even when race does not appear intelligible as a factor in feminist technoscience, murphy shows that whiteness nevertheless circulates to produce privilege, access, and self-determination for white us feminists and to deny epistemic and political agency to women of color, both domestically and internationally. this framing is the crucial thread that connects the book’s chapters on self-help clinics, group cervical exams, pap smears, and the differential framings of reproductive technologies promoted domestically for individualistic freedom but internationally to curtail fertility and population growth.
murphy’s attention to the critical geographical concept of scale enables this layered reading of the cold war-era biopolitics of race, fertility, pregnancy cessation, and calculated reproductive health risks. each chapter is a teachable case study, but the persistent threads of race, feminist, and technoscience analysis woven throughout are an impressive methodological demonstration. murphy seamlessly executes her analyses of assemblage, using shifts in scale to identify actors, networks, and technologies mobilizing feminist self-help. although human agency is still its central focus, this methodological complexity and rigor makes the book a crucial contribution to feminist materialism and post-humanist studies of technology. whereas murphy turns to the past to find protocol activism, rosi braidotti’s the posthuman shows us how to use the tools of feminist philosophy to frame technoscientific futures beyond what (following queer and crip critique) we can think of as compulsory humanity. through a spinozist, materialist, monist, and (at times vertiginous) nomadic approach, braidotti produces a methodically crafted argument. as in murphy’s book, the chapters each work at a different scale, progressing through constructions of the human self (through figures such as the vitruvian man), life beyond compulsory humanity, population-level biopolitics, and the possibilities for the future of humanistic disciplines and higher education in a post-human world. at all scales, the post-human is “a relational subject constituted in and by multiplicity. . . express[ing] an embodied and embedded and hence partial form of accountability, based on a strong sense of collectivity, relationality, and hence community building” ( ). this description echoes feminist materialist theories emphasizing entanglement and material agency (such as karen barad’s agential realism), as well as kafer’s political-relational model of disability and murphy’s historical mapping of assemblages.
what distinguishes braidotti from kafer and murphy, however, is a conception of agency modeled on zoe, an understanding of living organic matter—both human and nonhuman—as “intelligent and self-organizing” ( ). post-human challenges to the anthropocene are inevitable, braidotti tells us, and entangled with technologies promoting life and destroying it. here, the feminist cyborg concept appears to mark the inevitable shift to a post-human world: “we can therefore safely start from the assumption that the cyborgs are the dominant social and cultural formations that are active throughout the social fabric. . .. the vitruvian man has gone cybernetic” ( ). the evidence? prosthetically-enhanced humans such as the disabled olympic runner oscar pistorius ( ), “figures of mixity, hybridity, and interconnectiveness” (curiously described as “transsexual” or “androgynous”) ( – ), and even the “fast-changing field of disability studies [which] is almost emblematic of the post-human predicament” ( ). these characterizations, common in the tradition of feminist cyborg theories, leave us to wonder what role body–technology interfaces may play in a post-human future besides serving as what allen calls “post-human paragons” ( ). but, braidotti assures us, the post-human future implicates all of us as we interface with the proliferation of everyday technologies, such as smartphones, and the advanced use of drones, genetic engineering, and other technologies with violent and even eugenic possibilities. the book’s most surprising and original contribution comes when braidotti takes familiar post-humanist arguments into new territory by discussing the futures of disciplines and the university system.
in the future “multi-versity,” she predicts, traditional humanities disciplines will be replaced with post-humanities: “humanistic informatics, or digital humanities; cognitive or neural humanities; environmental or sustainable humanities; bio-genetic and global humanities” ( ). this vision of the future is timely given the ongoing attacks on the humanities, and philosophically justifies scholarly interest in fields such as animal studies and eco-criticism by granting them a place in the multi-versity to come. as recent university politics demonstrate, however, the future is now and the shift toward the post-humanities has begun as an effort to shore up, rather than to render problematic, the corporate, revenue-generating university. at emory university, for instance, visual studies, journalism, and graduate programs in education studies, the interdisciplinary liberal arts, and spanish were cut to create new technoscience-driven programs studying contemporary china, digital media, interdisciplinary neuroscience, and global health. as critics noted, the emory administration privileged the technoscientific novelty of these fields in its decision-making, but failed to anticipate the disproportionate impacts of restructurings on marginalized students and scholars, particularly nontenured faculty and people of color (sullivan ). recent events should make us seriously evaluate whether a post-human multi-versity can avoid casting such displacements as inevitable or neutral, and instead maintain technoscience as a productive site for crip feminist futures. how will digital, neural, environmental, and global humanities avoid reproducing the systematic racist, sexist, heteronormative, and ableist hierarchies existing within universities today?
rather than emphasizing new neoliberal disciplines with the greatest revenue-generating possibilities, we could imagine a multi-versity that values critical post-humanistic scholarship and teaching on feminist, crip, anti-racist, and queer notions of interdependence. rather than valorizing digital humanities for temporally syncing us with an inevitable cyborgian future, we could imagine a role for cfts in training students to build accessible digital worlds, foster digital scholarship on access and inclusion, and receive funding commensurate with the goals of facilitating economic and functional access to new technologies, particularly for women, people of color, and disabled people. braidotti invites us to think about how technoscientific shifts—particularly shifts in how bodies and societies interact with technological possibility—produce new forms of relational and procedural politics, although the broader implications of these politics for accessible crip futures remain to be determined. the texts under review mark histories, futures, and fictions of technoscientific enactment, with high stakes for bodies typically excluded from ideal futures. the task of the emerging field of cfts will be not only to politicize the lived realities of people with and without access to technology, but also to use crip feminist insights to agitate against compulsory normalcy. following recent feminist technoscience projects of making, hacking, and designing, cfts, too, should not only concern itself with critique, but also with crafting practices of design and world-building, enacting crip feminist technoscience to create more accessible futures. references garland-thomson, rosemarie. . misfits: a feminist materialist disability concept. hypatia ( ): – . sullivan, mairead. . cuts disproportionately hurt faculty of color. the emory wheel, september . http://www.emorywheel.com/cuts-disproportionately-hurts-faculty-of-color/ (accessed august , ).
tremain, shelley. . introducing feminist philosophy of disability. disability studies quarterly ( ) http://dsq-sds.org/article/view/ / (accessed august , ). mcruer, robert. . crip theory: cultural signs of queerness and disability. new york: new york university press. provided by the author(s) and university college dublin library in accordance with publisher policies. please cite the published version when available. title systems in language: text analysis of government reports of the irish industrial school system with word embedding author(s) keane, mark t.; pine, emilie; leavy, susan publication date - - publication information digital scholarship in the humanities, ( ): i -i publisher oxford university press item record/more information http://hdl.handle.net/ / publisher's statement this article has been accepted for publication in digital scholarship in the humanities © the authors published by oxford university press. all rights reserved. publisher's version (doi) . /llc/fqz susan leavy university college dublin ireland susan.leavy@ucd.ie mark t. keane university college dublin ireland mark.keane@ucd.ie emilie pine university college dublin ireland emilie.pine@ucd.ie abstract industrial memories is a digital humanities initiative to supplement close readings of a government report with new distant readings, using text analytics techniques.
the ryan report ( ), the official report of the commission to inquire into child abuse (cica), details the systematic abuse of thousands of children from to in residential institutions run by religious orders and funded and overseen by the irish state. arguably, the sheer size of the ryan report—over million words— warrants a new approach that blends close readings to witness its findings, with distant readings that help surface system-wide findings embedded in the report. although cica has been lauded internationally for its work, many have critiqued the narrative form of the ryan report, for obfuscating key findings and providing poor systemic, statistical summaries that are crucial to evaluating the political and cultural context in which the abuse took place (keenan, , child sexual abuse and the catholic church: gender, power, and organizational culture. oxford university press). in this article, we concentrate on describing the distant reading methodology we adopted, using machine learning and text-analytic methods and report on what they surfaced from the report. the contribution of this work is threefold: (i) it shows how text analytics can be used to surface new patterns, summaries and results that were not apparent via close reading, (ii) it demonstrates how machine learning can be used to annotate text by using word embedding to compile domain-specific semantic lexicons for feature extraction and (iii) it demonstrates how digital humanities methods can be applied to an official state inquiry with social justice impact. keywords: text analysis, text classification, machine learning, industrial schools, child abuse . introduction the ryan report (ryan, ) details the findings of the irish government’s commission to investigate child abuse in irish industrial schools, run by catholic religious congregations from the - . 
the report provides an extensive catalogue of abuse carried out in these schools and had a major societal impact in ireland with respect to public attitudes to the moral authority of the roman catholic church (donnelly and inglis, ; pilgrim, ). however, aspects of its narrative structure have been criticised for obscuring as much as they reveal. the anonymisation of names of the clergy, for instance, has been criticised for protecting the religious orders (powell et al., ), and the structure of the document obscures the systematic nature of the abuse (pine et al., ). this paper reports on the use of text analytics to surface heretofore-invisible underlying patterns, enable a system-wide analysis of the contents of the report and facilitate new kinds of reading through an interactive web-based platform. it presents a distant reading methodology whereby word embedding is used to compile domain-specific semantic lexicons for feature extraction, enabling machine learning classifiers to annotate excerpts of the ryan report according to their meaning. in the remainder of this introduction, we identify key shortcomings of the report (see section . ), specify our motivation for doing the current work (section . ), outline the key themes in the report (section . ) and outline the structure of the remainder of the paper. . shortcomings of the ryan report the structure and narrative form of the ryan report organises information in a way that impedes a system-wide analysis of abuse in the irish industrial school system. preliminary chapters describe the historical background of the school system, the terms of the commission of investigation and how various selected sources were used. the main body of the report then comprises a collection of chapters organized by school. each chapter begins with a historical overview of the school and its management.
the narrative then moves to a consideration of the events involving clerics or lay staff in the school, about whom accusations of abuse were made. due to this segregation of information by school, the descriptions of serial abusers and their movements from school to school are distributed across many chapters. this makes it very difficult, if not impossible, for the reader to build a coherent history of a given individual who may have worked at several schools. indeed, in the context of members of religious orders spread across chapters, even the most assiduous reader cannot easily connect a given individual’s sequence of offenses in any coherent way. this narrative structure obscures the movement of staff between institutions, which was a common response of governmental and congregational bodies to allegations of abuse. this structure also makes institutional comparisons difficult, thus obfuscating the system-wide conditions that allowed abuse to emerge and become endemic. within the chapters on each school, information is further divided in sections according to individual perpetrators detailing evidence of abuse and the response of the religious orders. while this approach is consistent with the concept of individual responsibility that is fundamental to a retributive justice system (hagan et al., ), such individualised narratives deflect from the complex social phenomena that contribute to the occurrence of abuse (keenan, ). [notes: https://industrialmemories.ucd.ie/. see the commission to inquire into child abuse (cica) report, vol. . to vol. . (available at: http://www.childabusecommission.ie/rpt/). see the cica report, vol. . to vol. . .] . motivation for a distant reading of the ryan report the motivation for the current work arose from the difficulties in undertaking a cross-institutional, systemic analysis. hence, we advance a suite of techniques, using word embedding and text analytics (i.e.
text classifiers) to perform distant readings of the document and annotate extracts of text based on their content. we outline a methodology for generating a set of domain-specific keywords (doing query expansion from minimal seed keywords) to compile lexicon-based features for classifiers that can be used in conjunction with other features to identify paragraphs based on their semantic content. this methodology could potentially be used to analyse the content of similarly voluminous reports resulting from other investigations (e.g. royal commission report, australia, (covering , witness testimonies); truth and reconciliation commission, canada, (over , testimonies)). a central motivation also concerned issues with the application of machine learning techniques in the area of digital humanities. in many digital humanities projects, although corpora are too large to conduct comprehensive close readings, they are often not large enough to employ ‘big-data’ methodologies such as machine learning mainly due to the cost of compiling sufficient training data (schöch, ). we addressed this issue by outlining a scalable methodology that enables machine learning to be used for annotation with relatively small training-datasets. . knowledge categories annotating excerpts of the ryan report based on their semantic content enabled the existing narrative of the report to be deconstructed and its findings to be extracted and read in new ways. the following outlines the thematic foci of this analysis and their relevance to gaining a system-wide understanding of the dynamics of abuse at irish industrial schools: witness testimonies: extracting the accounts of individual witnesses recorded in the text, to allow us to collate and examine in detail all of the testimony embedded in the report. 
excerpts of witness testimony in the ryan report are most commonly presented as block quotes, preceded by a colon and by introductory text contextualising the source of the quotation. shorter in-text quotations are identified by quotation marks. the same punctuation is also used to signify extracts from historical documentary sources such as reports and letters, necessitating semantic analysis of the text introducing the quotations. (vol. . ) transfer events: these paragraphs deal with the responses to allegations of abuse in the industrial schools where, typically, the cleric involved was transferred from one institution to another. in some cases the cleric involved was moved out of the schools system (to a parish or a congregational house), dismissed or granted a dispensation from their vows. extracting the paragraphs that recount the movement of accused abusers enables us to view the transfer trajectories of specific individuals and to surface patterns of movement between institutions obscured by the linear narrative structure of the report. (cica vol. . ) abuse events: these paragraphs detail abusive events (i.e., physical, emotional and sexual abuse) and are crucial to understanding the scale and nature of abuse across the industrial school system. the language used to describe abusive events is complex, reflecting the varied experiences of the , witnesses who gave evidence of abuse experienced at the industrial schools. extracting such paragraphs allows us to identify, collate and examine in detail the representations of abuse in the ryan report. (cica vol. . ) . outline of paper in the remainder of this paper, we present the techniques we used for a distant reading of the ryan report. we review the main collections of research relevant to our concerns in section . we then describe the techniques and present the results of our research.
in section , we outline how we used word-embedding methods, specifically word2vec (mikolov, ), to carry out feature extraction in order to classify the semantic content of excerpts of the report. section describes how these domain-specific semantic lexicons, along with other features, can be used in a suite of classifiers designed to automatically identify particular text items in the report. this section also reports the results of evaluating the effectiveness of these classifiers in detecting the semantic content. . background the approach we adopted in this project encompasses findings from previous studies in relation to the requirements for a digital platform to enable distant reading. it also builds upon previous approaches to using machine learning to automatically classify text. . digital platforms for humanities research widlöcher et al. ( ) outlined guidelines for platforms for humanities research, demonstrating how enriching data through annotation, segmentation of documents, statistical analysis and comprehensive search functionality enables distant reading. they also emphasise the importance of retaining structural elements of documents to facilitate close readings. this incorporation of both close and distant reading functionality within an exploratory digital interface was demonstrated in work by hinrichs et al. ( ) and kopaczyk ( ). distant reading through the extraction and exploration of relationships between entities in text is a central function of many platforms (muralidharan and hearst, ; vuillemot et al., ). jockers and mimno ( ) emphasise distant reading using methods such as topic modelling and visualisation. in developing an approach to digitally analysing the ryan report we build on requirements outlined in these related digital humanities projects. . annotation in humanities research analysis of the contents of the ryan report involved the automatic classification of paragraphs based on their content.
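the lexicon-compilation idea referred to above (expanding a handful of seed keywords into a domain-specific lexicon by looking for nearby words in an embedding space) can be sketched roughly as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the three-dimensional toy vectors, the function name `expand_lexicon` and the similarity threshold are all hypothetical, and in practice the vectors would come from a word2vec model trained on the report itself.

```python
import math

# Toy word vectors invented purely for illustration (assumption).
# Real vectors would be learned by a word2vec model from the corpus.
TOY_VECTORS = {
    "transferred": [0.9, 0.1, 0.0],
    "moved":       [0.8, 0.2, 0.1],
    "sent":        [0.7, 0.3, 0.0],
    "school":      [0.1, 0.9, 0.1],
    "classroom":   [0.2, 0.8, 0.2],
    "winter":      [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(seeds, vectors, threshold=0.9):
    """Return the known seeds plus every vocabulary word whose cosine
    similarity to at least one seed meets the threshold (sorted)."""
    known_seeds = [s for s in seeds if s in vectors]
    lexicon = set(known_seeds)
    for word, vec in vectors.items():
        if any(cosine(vec, vectors[s]) >= threshold for s in known_seeds):
            lexicon.add(word)
    return sorted(lexicon)


if __name__ == "__main__":
    print(expand_lexicon(["transferred"], TOY_VECTORS))
```

a threshold keeps the expansion conservative; a domain expert could then review the candidate words before they are used as classifier features, which is in keeping with the emphasis on interpretability in the surrounding discussion.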
in exploring approaches to annotation in humanities research it is important to appreciate the central role that manual annotation plays in the critical analysis of text (jackson, ). researchers gain in-depth knowledge of the corpus through the process of evaluating its meaning and annotating the text. the development of distant reading methods must therefore aim not simply to replace this interpretative stage but to enhance it. incorporating input from domain experts into the process is key to achieving that, and also to ensuring the interpretability of automation so that the classification process itself may be critically analysed. this is demonstrated in work by sweetnam and fennell ( ), who included input from experts in each stage of their annotation process.

. automated annotation
there are two main approaches to automated annotation: rule-based and statistical machine learning. chiticariu et al. ( ) outline how the data-analytics industry primarily employs rule-based approaches to annotation and information extraction despite major developments in academia in using machine learning. this, they found, is largely due to the fact that rule-based methods are interpretable, can incorporate domain knowledge easily and do not require extensive training data. a comparable situation persists in digital humanities where, despite an abundance of research developing automated methods for annotation, many projects rely on manual annotation of text (mahlow et al., ). this is due in large part to the domain-specificity of the language of many digital humanities corpora and the high levels of accuracy required to produce reliable analysis (frank et al., ; hampson et al., ). compiling sufficient training data to yield accurate results in this context is often costly and error prone.
to address this, we explored an approach to automated classification that ensures high levels of accuracy with limited training data, while also incorporating domain knowledge and emulating the transparency and interpretability of a rule-based approach.

. annotation using word embedding and semantic text classifiers
the most relevant research on automated annotation pertains to identifying witness testimony. in the ryan report this information is represented as excerpts of reported speech. our methodology therefore builds upon previous approaches to automatically identifying reported speech in text. this commonly relies on pattern-based extraction rules to detect linguistic markers such as quotation marks (krestel et al., ; pouliquen et al., ; iosif et al., ). however, in the ryan report, the punctuation that signals speech also signals other kinds of text, so semantic information from the paragraphs has to be taken into consideration, which makes research on extracting indirect speech more relevant (krestel et al., ). using machine learning, schöch et al. ( ) developed an approach that involves semantic analysis using a lexicon of linguistic features associated with direct speech derived from a corpus of nineteenth-century french literature. these were used as features to train a classifier, yielding an accuracy of . percent. weiser and watrin ( ) developed a dictionary of verbs that introduce speech in text (reporting verbs) and used this in conjunction with pattern-based extraction rules to annotate indirect speech.

machine learning approaches to text classification commonly use a bag-of-words approach to feature selection. however, this approach is problematic when the instances to be classified are short, giving rise to over-fitting (brooks et al., ). a lexicon-based approach to feature selection can address this but encounters new issues concerning the domain specificity of some corpora.
existing lexical databases such as wordnet (miller, ) have been used to generate lists of synonyms from seed words to compile semantic lexicons (argamon et al., ). however, they often do not recognise terms specific to particular domains, such as the domain of ecclesiastical discourse used in the ryan report. our project therefore required a methodology that used machine learning with lexicon-based features that take account of specific terms used in the ryan report.

in compiling domain-specific lexicons for feature extraction we called upon work by mikolov ( ), who developed word2vec, a word-embedding algorithm. word2vec is a set of neural network models that produces distributed representations of words from text that reflect many aspects of their meanings. it implements the distributional-semantics notion that the “meaning of a word can be determined by the company it keeps”. this technique analyses word co-occurrence over large corpora, representing a given word by a large vector that reflects the other words it is found beside. using these vectors one can then establish that two words are “similar” or synonymous by virtue of whether their vectors are the same or close in a multi-dimensional space. mikolov’s ( ) work provides a method for uncovering word-similarities that are tailored to the language of the ryan report.

this word-embedding technique was used by chanen ( ) to identify synonyms and compile lexicons for feature extraction in order to account for the multiplicity of terms used to refer to the same semantic concept in a corpus of flight incident reports. their method of identifying semantically related terms involved generating word2vec ensembles, extracting terms that re-occurred over each ensemble and manually filtering antonyms and semantically dissimilar words.
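the idea of vector closeness and the ensemble step described above can be illustrated with a minimal python sketch. it is not the authors' or chanen's implementation: the two toy models below are hand-made two-dimensional vectors standing in for separately trained word2vec models, and the function names (cosine, nearest, common_neighbours) are hypothetical. the sketch ranks the neighbours of a seed word by cosine similarity in each model and keeps only the words that recur in every model; manual filtering of antonyms and dissimilar words would still follow.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(model, seed, k):
    """Return the k words whose vectors lie closest to the seed word's vector."""
    others = [word for word in model if word != seed]
    others.sort(key=lambda word: cosine(model[seed], model[word]), reverse=True)
    return others[:k]


def common_neighbours(ensembles, seed, k):
    """Keep only words that are among the top-k neighbours in every model."""
    neighbour_sets = [set(nearest(model, seed, k)) for model in ensembles]
    return sorted(set.intersection(*neighbour_sets))


# Toy, hand-made vectors (illustrative only, not trained embeddings).
model_a = {
    "transfer": (1.0, 0.1),
    "moved": (0.9, 0.2),
    "posted": (0.8, 0.3),
    "garden": (0.0, 1.0),
}
model_b = {
    "transfer": (0.2, 1.0),
    "moved": (0.3, 0.9),
    "garden": (0.25, 0.95),
    "posted": (1.0, 0.0),
}

candidates = common_neighbours([model_a, model_b], "transfer", k=2)
```

with these toy vectors only "moved" survives, since "posted" and "garden" each appear among the top neighbours in just one of the two models.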
given the domain-specificity of the language in the ryan report, chanen's approach suggests a useful way to compile lexicons that are specific to the language of the religious, industrial-school and legal worlds of the ryan report.

note: see also latent semantic analysis as a related technique (dumais, ; landauer, ) and similar methods in turney & pantel ( ).

. distant reading the ryan report: methodology
a central aim of this project was to provide a methodology for identifying the semantic content of text in the ryan report and extracting given categories of information. the semantic categories identified included testimony of witnesses included in the report (witness testimony), details of the transfer of clergy from school to school (transfer events) and descriptions of abuse (abuse events). machine learning classification was used to annotate the text based on domain-specific semantic lexicons along with other features. in order to generate these domain-specific lexicons, word embedding was used to find terms in the ryan report that were semantically similar to a given set of seed words; this task can be cast as a type of query expansion or feature extraction.

these text-analytic methods for paragraph identification were extensions to our construction of a digital platform involving an exploratory web interface and database into which the significant parts of the ryan report were processed. the core record in the database of this web-based system stored each paragraph from the relevant chapters in the ryan report. these paragraph records were then linked to other tables detailing actors in the report (witnesses, clerics, officials), the schools, the congregations and time periods. named actors were extracted using nltk (bird et al., ) and other information was identified using a rule-based approach.
the web-interface also had a string-search facility for the paragraphs along with filters for other categories of entity (e.g., one could search on a single school or a diocese). the digital platform was developed using the django framework (https://djangoproject.com). in the remainder of this section, we report on the other aspects of the methodology we developed to permit automated paragraph identification.

. method
the corpus & paragraph categories
in the ryan report, of the chapters detail events at each school. this dataset, comprising , paragraphs and , words, was the corpus annotated according to its semantic content. each paragraph is a definite unit-of-analysis in the report, as paragraphs are systematically numbered and tend to focus on particular events and issues. the following are characteristic features of each semantic category:

. feature extraction techniques: using word embedding
using machine learning with lexicon-based features can address the issue of over-fitting of classification models when instances of text to be classified are short, as is the case with paragraphs in the ryan report (see section . ). however, the language of the ryan report is domain-specific and general thesauri would not identify concepts such as “dispensation” as being synonymous with “dismissal”. hence, we used the word2vec algorithm (mikolov, ), supplemented by synonyms generated from wordnet, to find semantically related words from a set of seed keywords, building on the methodology outlined by chanen ( ). to compile the semantic lexicon, five word2vec ensembles were generated from seed words. the top words were extracted from each ensemble. a set of words common to each ensemble was identified and the results were then reviewed by a domain expert to confirm their validity as synonyms within the context of the ryan report. using this method many non-obvious synonyms were found. general synonyms were collected using the wordnet lexical database.
this involved entering each seed word and compiling a list of synonyms from the results of a search in wordnet. the resulting lists were verified manually to ensure they were appropriate synonyms for the context of the ryan report. seed words were manually selected based on initial readings of the texts. in the case of paragraphs detailing transfers and direct speech, initial seed words were straight forward to compile as terms such as ‘transfer’ and ‘said’ were commonly used in the report. however, in the case of descriptions of abuse, the language varied widely. a support vector machine-learning algorithm was used in this case to generate a classification model using example paragraphs based on a bag-of-words feature set. analysis of the support vectors highlighted words that best distinguished paragraphs describing abuse and the highest-ranking of these were used as seed words. the domain-specific semantic lexicons that resulted from this word embedding procedure are detailed in the next section. . feature extraction techniques: lexicon-based features domain-specific semantic lexicons were supplemented with other less domain-specific features. verbs introducing reported speech, colons or quotation marks signal witness testimony in the ryan report. punctuation such as commas, question marks and word contractions seemed to be used more frequently and testimony was also expressed in the first person. this information was therefore included as features to classify excerpts of direct speech in the ryan report (table ). a lexicon was also manually generated in order to filter out excerpts from written reports and letters. this lexicon included the terms: visitation, visitor, report, letter, wrote, written. ‘visitations’ for instance is the term used for inspections of industrial schools carried out by the church. various combinations of all features were examined to identify the optimal feature set. 
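examining various combinations of feature sets can be made systematic by enumerating every non-empty subset of them. the following python sketch assumes three feature-set names taken from the witness-testimony features; the function name is hypothetical, and the classifier training and scoring that would be run for each combination is omitted.

```python
from itertools import combinations


def feature_set_combinations(feature_sets):
    """Yield every non-empty combination of the named feature sets."""
    for size in range(1, len(feature_sets) + 1):
        for combo in combinations(feature_sets, size):
            yield combo


# Feature-set names as used for witness testimony (assumed labels).
witness_sets = ["reporting verbs", "punctuation", "pronouns"]
combos = list(feature_set_combinations(witness_sets))

for combo in combos:
    # A classifier would be trained and cross-validated here for each combo.
    print(", ".join(combo))
```

three feature sets give seven combinations and four give fifteen, which is consistent with the number of rows in the cross-validation tables for witness testimony and transfer events.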
generally, the person being transferred is named in a paragraph detailing such an event. similarly, in describing abuse, the perpetrator is commonly named. names were therefore included as features for both of these semantic categories. in describing transfer events, the names of the institutions were often mentioned and the events seemed often to be described in sections which concerned abuse or named the alleged perpetrator. this information was therefore included as features for classification.

table : feature sets extracted from the ryan report
semantic category: features
witness testimony: reporting verbs (domain-specific semantic lexicon); pronouns; punctuation
transfer events: transfer terms (domain-specific semantic lexicon); section heading references to types of abuse; mentions of religious actors; mentions of institutions
descriptions of abuse: abuse terms (domain-specific semantic lexicon); mentions of religious actors

. classifiers used for paragraph identification
separate classifiers were built for each of the paragraph categories. using training data for each paragraph type, features were extracted based on the semantic lexicons generated using word embedding and wordnet to build feature vectors for each paragraph based on frequency counts. a random-forest classifier (breiman, ) was then trained, using the weka toolkit (holmes et al., ), to find the relative weightings of features that predicted the content-class for given paragraphs. the random-forest algorithm was chosen because, as an ensemble learner that creates a ‘forest’ of decision trees by randomly sampling from the training set, it is well suited to learning from smaller datasets (polikar, ).

. training data
sample paragraphs belonging to each paragraph category were manually selected from the ryan report as training data for classifiers.
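a frequency-count feature vector of the kind described above might be built as follows. this is a sketch under stated assumptions rather than the project's code: the lexicons are abridged to a handful of terms, the example sentence is invented, and the function names are hypothetical. the resulting dictionary corresponds to one row of input for a classifier such as a random forest.

```python
import re
from collections import Counter

# Hypothetical, heavily abridged lexicons; the real lists are far longer.
LEXICONS = {
    "transfer_terms": {"transferred", "moved", "dismissed", "dispensation"},
    "reporting_verbs": {"said", "told", "explained"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_counts(paragraph, lexicons=LEXICONS):
    """Count, per lexicon, how many tokens of the paragraph belong to it."""
    tokens = Counter(tokenize(paragraph))
    return {
        name: sum(count for word, count in tokens.items() if word in words)
        for name, words in lexicons.items()
    }


def punctuation_counts(paragraph):
    """Count surface markers associated with quoted testimony."""
    return {"colons": paragraph.count(":"), "quote_marks": paragraph.count('"')}


# Invented example sentence, not a quotation from the report.
example = 'He said: "I was moved to another school and then moved again."'
features = {**lexicon_counts(example), **punctuation_counts(example)}
```

counting per lexicon rather than per word keeps the vector short, which is the property the lexicon-based approach relies on when paragraphs are brief.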
in order to address the issue of the cost of compiling training data in digital humanities projects (frank et al., ; hampson et al., ), minimising the number of examples required was a guiding principle. the training data consisted of paragraphs detailing transfer events, paragraphs containing direct speech and paragraphs describing abuse. the variance in numbers of training examples reflected the volume of instances in the report itself and the cost of compiling training data. positive examples of each case were selected from across the report to capture the variety within the category. negative examples were compiled through a random selection and manual verification of paragraphs.

. validation & evaluation
preliminary testing of the classifiers was done using -fold cross validation. the resulting metrics indicated the most effective combination of features, which was subsequently evaluated on a sample taken from a larger set of unseen data. the sample of unseen data was made up of randomly selected paragraphs from the report. for transfer paragraphs, given the low number of training examples ( positive and negative), an interim evaluation stage was conducted by applying the classification model to a balanced set of examples of unseen data from the report to further verify the optimal combination of features for classification.

results & discussion
the results showed that using word embedding to generate semantic lexicons for feature extraction is effective in yielding high accuracy where the language of a corpus is domain-specific and the volume of training data is limited. this allowed the integration of the semantic annotations in an online search tool for the report (fig. ). in the following sections we outline the results of using word embedding to generate semantic lexicons and then report on the effectiveness of the classifiers.

figure : search interface for ryan report .
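for reference, the measures used in the evaluation (precision, recall, f-measure and accuracy) follow directly from counts of true and false positives and negatives. the sketch below uses invented labels for eight paragraphs, not data from the report, and hypothetical function names.

```python
def confusion_counts(gold, predicted):
    """Tally true/false positives and negatives for binary labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    tn = sum(1 for g, p in pairs if not g and not p)
    return tp, fp, fn, tn


def evaluate(gold, predicted):
    """Compute precision, recall, F-measure and accuracy from the tallies."""
    tp, fp, fn, tn = confusion_counts(gold, predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / len(gold) if gold else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Invented labels: True marks a paragraph of the target category.
gold = [True, True, True, False, False, False, False, False]
predicted = [True, True, True, True, True, False, False, False]
scores = evaluate(gold, predicted)
```

with these invented labels recall is perfect while precision is lower, the same pattern the paper reports for transfer and abuse paragraphs on unseen data.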
domain-specific semantic lexicons: using word embedding semantic lexicons for each category of text were generated from an initial set of seed words derived from readings of the report. in the case of witness testimony, the seed words were the reporting verbs “said”, “told” and “explained”. a reading of the report suggested some obvious key terms to describe the transfer of staff from school to school: “transfer”, “dismiss” and “sack”. seed words for abusive events were uncovered through analysis of the support vectors in a model generated by a support vector machine learning algorithm based on the words in a sample of positive and negative paragraphs (details on this approach in section . ). this showed that terms distinguishing paragraphs describing abuse from the remainder of the report formed five semantic categories: perpetrator, abusive actions, body parts, emotions engendered in the victims and implements used in the abuse. the highest-ranking support vectors from these word types were selected as seed-words to form the semantic lexicon: abuse, beaten, raped, arms, humiliation, implement. the word lists generated from running the word vec algorithm on the full text of the ryan report are detailed in table . this details the common terms among the top- words across word-embedding ensembles generated for each seed word. after the manual verifications step, they were supplemented by general synonyms of each seed word generated from a search of the wordnet lexical database. 
text category source feature witness testimony seed terms said, told, explained word embedding accepted, acknowledged, added, admitted, advised, agreed, alleged, angry, answered, asked, asking, asserted, assured, believed, called, claimed, commented, complained, conceded, concluded, confessed, confirmed, convinced, denied, describes, explained, explained, felt, guarantee, heard, informed, insisted, knew, described, learned, mentioned, presumed, protested, questioned, realised, recalled, recollection, recounted, relieved, remarked, remember, remembered, replied, requested, said, saw, saying, says, screams, stated, stating, suggested, surmised, tells, thinks, thought, told, warned, witnessed, reported wordnet apologise, apology, articulate, articulated, assure, assured, condone, condoned, enounced, enounce, explicate, explicated, express, expressed, narrate, narrated, pardon, pardoned, posit, posited, recite, recited, recount, recounted, said, state, stated, submit, submitted, tell, told, verbalise, verbalised transfer events seed terms transfer, dismiss, sack word embedding application, applied, apply, appointed, appointment, arrival, arrived, arriving, assigned, attended, committed, continued, converted, decision, departure, discharge, discharged, dismiss, dismissal, dismissed, dispensation, dispensed, entered, expelled, leaving, move, moved, position, posted, posting, proposal, referring, release, relieved, removal, remove, removed, replaced, request, resignation, resigned, returned, returning, sacked, sanction, seek, sending, sent, served, stayed, suspended, transferred, withdraw, withdrawal wordnet transferred, transfer, moved, remove, dismiss, dismissed, sacked descriptions of abuse seed terms abuse, beaten, raped, arms, humiliation, implement word embedding lexicons pertaining to parts of the body, abusive actions, emotion engendered in the victims and implements of abuse table : domain-specific semantic lexicons . 
classifier results
the results showed that the semantic lexicons generated using word embedding played a key role in producing accurate classifiers using limited training data. in classifying abuse paragraphs the words in the semantic lexicons were the sole features used. for transfer paragraphs, the semantic lexicon denoting transfer events featured in each of the combinations yielding the highest classification results. results for classifying witness testimony were also highest when the semantic lexicon of reporting verbs was used as a feature. however, as was expected, features based on punctuation such as colons were also important in identifying this category of paragraph.

classification: witness testimonies
in classifying paragraphs containing witness testimony, the model using a combination of all feature sets gained the highest accuracy in -fold cross-validation (table ). most combinations of features were well-balanced between precision and recall.

feature sets precision recall f-measure accuracy (%)
reporting verbs, punctuation, personal pronouns . . .
punctuation, pronouns . . .
punctuation, reporting verbs . . .
pronouns . . .
pronouns, reporting verbs . . .
punctuation . . .
reporting verbs . . .
table : results of -fold cross-validation for witness testimony classification

the best performing model as indicated by the -fold cross validation was then run on the remainder of the report. based on a random sample of paragraphs, an accuracy of percent was achieved (table ).

feature sets precision recall f-measure accuracy (%)
reporting verbs, writing, punctuation, personal pronouns . . .
table : accuracy on sample of paragraphs for witness testimony classification

error analysis showed that false negative results were primarily due to in-text quotations of short phrases. there were no instances of larger blocks of quotations being missed by the classifiers.
the rate of false positives was relatively high, primarily due to the misclassification of letters, extracts from inspection reports and diary entries. however, in many cases the source of such content can be challenging to decipher even on reading the report.

classification: transfer events
when all features were included in the classifier to automatically detect paragraphs detailing the transfers of religious throughout the industrial school system, percent accuracy was gained based on -fold cross validation (table ). however, when named entities were excluded as feature sets, accuracy increased to %. this counter-intuitive result was verified further by applying the best performing models to unseen paragraphs consisting of a - balance between positive and negative examples. this showed that on a balanced set of unseen data, using all features yielded the best results (table ).

feature sets precision recall f-measure accuracy (%)
transfer terms, section heading info, mentions of school . . .
transfer terms, section heading info . . .
transfer terms, mentions of religious actors, section heading info, mentions of school . . .
transfer terms, mentions of religious actors, section heading info . . .
transfer terms, mentions of religious actors . . .
section heading info, mentions of school . . .
transfer terms, mentions of religious actors, mentions of school . . .
transfer terms, mentions of school . . .
mentions of religious actors, section heading info . . .
mentions of religious actors . . .
transfer terms . . .
mentions of religious actors, section heading info, mentions of school . . .
mentions of religious actors, mentions of school . . .
section heading info . . .
mentions of school . . .
table : results of -fold cross-validation for transfer events

the test set of sample paragraphs comprised paragraphs that were distinctly positive or negative examples of transfer paragraphs.
for this reason, a higher level of accuracy would be expected than on the rest of the report, where language can often be more vague.

feature sets precision recall f-measure accuracy (%)
transfer terms, section heading info, mentions of school . . .
transfer terms, section heading info . . .
transfer terms, mentions of religious actors, section heading info, mentions of school . . .
table : transfer events accuracy on balanced sample of paragraphs

the final phase of evaluation for paragraphs pertaining to transfer events involved applying the best performing model, from the results of the balanced set of paragraphs, to the remainder of the report and manually examining the classification of randomly sampled paragraphs (table ). these results showed high levels of recall. however, there were quite a few false positive results, leading to relatively low levels of precision.

feature sets precision recall f-measure accuracy (%)
transfer terms, section heading info, mentions of school . . .
table : transfer events accuracy on random sample of paragraphs

error analysis showed that paragraphs that were falsely categorised as being about the transfer of clergy actually pertained to the transfer of children. however, some false positive results raised potentially new questions regarding the transfer of children throughout the industrial school system as a response of the congregations to allegations of abuse: (cica vol. . )

classification: abuse events
the best performing model for identifying paragraphs describing abuse in -fold cross validation used two of the semantic categories along with the names of the alleged perpetrator (table ). the domain-specific semantic lexicons that were most useful included references to the emotions engendered in the victims and references to abusive actions.

feature sets precision recall f-measure accuracy (%)
action, emotion, mentions of religious actors . . . .
emotion, implement, action, mentions of religious actors . . . .
action, implement, mentions of religious actors . . . .
action, implement, emotion . . . .
implement, emotion, mentions of religious actors . . . .
emotion, implement, body, action, mentions of religious actors . . . .
body, action, emotion, mentions of religious actors . . . .
body, action, implement, mentions of religious actors . . . .
body, implement, mentions of religious actors . . . .
body, action, implement . . . .
actor, body, emotion . . . .
emotion, implement, body, action . . . .
body, implement, emotion . . . .
body, action, mentions of religious actors . . . .
body, action, emotion . . . .
table : results of -fold cross-validation for abuse events

the classification model was then run on the remainder of the report and a random sample of paragraphs was manually verified, yielding an overall accuracy of percent (table ).

feature sets precision recall f-measure accuracy (%)
action, emotion, mentions of religious actors . . . .
table : descriptions of abuse tested on random sample of paragraphs

while precision was reduced, reflecting the complexity of the language, recall was high. error analysis showed that false positives uncovered a similarity between the language used to describe the emotional experience of victims of abuse and some memories of young clergy when they first took up positions in the schools.

conclusions
this research demonstrates how distant reading methodologies can deconstruct an official state report narrative to enable new kinds of analysis of institutional child abuse. automatic annotation of excerpts of the report based on the meaning of the text enabled a more focussed close reading of these identified paragraphs, surfacing significant new patterns of events and language in the institutional system (pine et al., ). these insights were previously obscured by the legal constraints on and narrative form of the ryan report, which emphasised an in-depth case-by-case study in lieu of system-wide analysis.
the feasibility of using machine learning to annotate text for digital humanities projects can be enhanced by using word embedding for feature extraction. the cost of compiling training data and the domain specificity of the text of many projects can often be a barrier to using machine learning approaches to annotation. this research demonstrates how word embedding can be used to compile context-specific semantic lexicons as a method for extracting features for text classifiers to perform automated annotations of text. this is an innovative methodology building on an approach outlined by chanen ( ). high accuracy was achieved using a minimal set of training examples with features based on semantic lexicons generated from the entire dataset. there have been numerous international state investigations into the abuse of children. wright et al. ( ) documented historical child abuse enquiries to date each of which resulted in lengthy reports detailing their findings. in using automated methods to enable distant reading of the ryan report, this project presents an approach whereby key information may be extracted and restructured to facilitate a system-wide analysis of the findings of such investigations. acknowledgements this research is part of the industrial memories project funded by the irish research council under new horizons . references argamon, s., whitelaw, c., chase, p., hota, s.r., garg, n. and levitan, s., . stylistic text classification using functional lexical features. journal of the association for information science and technology, ( ), pp. - . bird, s., klein, e., & loper, e., . natural language processing with python. beijing, o'reilly media. sebastopol, ca. breiman, l., . random forests. machine learning, ( ), pp. - . vancouver brooks, m., kuksenok, k., torkildson, m.k., perry, d., robinson, j.j., scott, t.j., anicello, o., zukowski, a., harris, p. and aragon, c.r., , february. statistical affect detection in collaborative chat. 
in proceedings of the conference on computer supported cooperative work (pp. - ). acm. chanen, a., , april. deep learning for extracting word-level meaning from safety report narratives. in integrated communications navigation and surveillance (icns), (pp. d - ). ieee. chiticariu, l., li, y. and reiss, f.r., , october. rule-based information extraction is dead! long live rule-based information extraction systems!. in emnlp (no. october, pp. - ). django software foundation, . django (v ersion . . ) [computer software]. re- trieved from https://djangoproject.com. donnelly, s. and inglis, t., . the media and the catholic church in ireland: reporting clerical child sex abuse. journal of contemporary religion, ( ), pp. - . dumais, s.t., . latent semantic analysis. annual review of information science and technology, ( ), pp. - . frank, a., bögel, t., hellwig, o. and reiter, n., . semantic annotation for the digital humanities. linguistic issues in language technology, ( ), pp. - . hampson, c., munnelly, g., bailey, e., lawless, s. and conlan, o., , september. improving user control and transparency in the digital humanities. in culture and computing (culture computing), international conference on (pp. - ). ieee. hogan, r. and emler, n.p., . retributive justice. the justice motive in social behavior: adapting to times of scarcity and change, pp. - . holmes, g., donkin, a. and witten, i. h. ( ), weka: a machine learning workbench, in ‘intelligent information systems, . proceedings of the second australian and new zealand conference on’, ieee, pp. – . iosif, e. and mishra, t., , april. from speaker identification to affective analysis: a multi-step system for analyzing children's stories. in clfl@ eacl (pp. - ). jackson, h.j., . marginalia: readers writing in books. yale university press. jockers, m.l. and mimno, d., . significant themes in th-century literature. poetics, ( ), pp. - . keenan, m., . 
child sexual abuse and the catholic church: gender, power, and organizational culture. oxford university press. krestel, r., bergler, s. and witte, r., . minding the source: automatic tagging of reported speech in newspaper articles. reporter, ( ), p. . landauer, t.k., . latent semantic analysis. john wiley & sons, ltd. mahlow, c., grün, c., holupirek, a. and scholl, m.h., , september. a framework for retrieval and annotation in digital humanities using xquery full text and update in basex. in proceedings of the acm symposium on document engineering (pp. - ). acm. mikolov, t., sutskever, i., chen, k., corrado, g.s. and dean, j., . distributed representations of words and phrases and their compositionality. in advances in neural information processing systems (pp. - ). miller, g.a., . wordnet: a lexical database for english. communications of the acm, ( ), pp. - . muralidharan, a.s. and hearst, m.a., . improving the recognizability of syntactic relations using contextualized examples. in acl ( ) (pp. - ). pilgrim, d., . child abuse in irish catholic settings: a non‐reductionist account. child abuse review, ( ), pp. - . pine, e., leavy, s. and keane, m.t., . re-reading the ryan report: witnessing via and close and distant reading. Éire-ireland, ( ), pp. - . polikar, r., . ensemble based systems in decision making. ieee circuits and systems magazine, ( ), pp. - . pouliquen, b., steinberger, r. and best, c., , september. automatic detection of quotations in multilingual news. in proceedings of recent advances in natural language processing (pp. - ). powell, f., geoghegan, m., scanlon, m. and swirak, k., . the irish charity myth, child abuse and human rights: contextualising the ryan report into care institutions. british journal of social work, ( ), pp. - . ryan, s. ( ). commission to inquire into child abuse report (volumes i - v). dublin: stationery office, dublin. available at: http://www.childabusecommission.ie/rpt/. schöch, c., . big? smart? clean? messy? 
data in the humanities. journal of digital humanities, ( ), pp. - . schöch, c., schlör, d., popp, s., brunner, a., henny, u. and tello, j.c., . straight talk! automatic recognition of direct speech in nineteenth-century french novels. in digital humanities : conference abstracts (pp. - ). sweetnam, m.s. and fennell, b.a., . natural language processing and early- modern dirty data: applying ibm languageware to the depositions. literary and linguistic computing, ( ), pp. - . vancouver turney, p.d. and pantel, p., . from frequency to meaning: vector space models of semantics. journal of artificial intelligence research, , pp. - . vuillemot, r., clement, t., plaisant, c. and kumar, a., , october. what's being said near “martha”? exploring name entities in literary text collections. in visual analytics science and technology, . vast . ieee symposium on (pp. - ). ieee. weiser, s. and watrin, p., . extraction of unmarked quotations in newspapers. in proceedings of the eighth international conference on language resources and evaluation (lrec- ). widlocher, a., bechet, n., lecarpentier, j.m., mathet, y. and roger, j., , september. combining advanced information retrieval and text-mining for digital humanities. in proceedings of the acm symposium on document engineering (pp. - ). acm. wright, k., swain, s., and sköld, j. ( ). 'the age of inquiry: a global mapping of institutional abuse inquiries'. melbourne: la trobe university. doi: http://doi.org/ . / / e e a practice what you preach: engaging in humanities research through critical praxis adema, janneke pre-print copy deposited in curve march original citation: adema, j. ( ) practise what you preach: engaging in humanities research through critical praxis international journal of cultural studies ( ) - publisher: sage doi: . / this work is licensed under a creative commons attribution . 
uk: england & wales license. some differences between the journal version and this version may remain and you are advised to consult the journal if you wish to cite from it. copyright © and moral rights are retained by the author(s) and/or other copyright owners. a copy can be downloaded for personal non-commercial research or study, without prior permission or charge. this item cannot be reproduced or quoted extensively from without first obtaining permission in writing from the copyright holder(s). the content must not be changed in any way or sold commercially in any format or medium without the formal permission of the copyright holders. curve is the institutional repository for coventry university http://curve.coventry.ac.uk/open

this document is the author’s final draft of: ‘practice what you preach: engaging in humanities research through critical praxis’, international journal of cultural studies , first published on march , doi: . /

practice what you preach: engaging in humanities research through critical praxis

janneke adema, coventry university

abstract

this article explores how a cultural studies perspective can be used to critically analyse practices of conducting research within the (digital) humanities. it uses, amongst other examples, the author’s phd dissertation currently in process, which is set up as a theoretical and practical intervention into existing discourses surrounding the dominant form of formal communication within the humanities: the scholarly monograph.
a methodology of critical praxis is seen as an integral part of the research project as well as an important step in developing academic or research literacy through actively engaging in the production of communicative norms and practices. envisioning the book as a site of struggle over new forms and systems of communication within academia, the dissertation argues for alternative ways of thinking of and performing the monograph in an experimental manner. by making use of digital platforms, tools and media to share, remix and update the research as it evolves, the aim is to develop a digital, open and collaborative research practice. this will offer a practical critique of the dominant structures, politics and practices of producing and distributing research results. this article thus argues for the empowering potential of critically analysing and actively engaging with the dominant norms underlying communication in the humanities as well as with the structures that determine academic literacy and the established and accepted practices herein. by arguing for a potential new future for the book within scholarly communication as an emergent and evolving form, based on accessibility, sharing, process and change, this article makes a case for new ways of engaging a critical praxis that is more speculatory, open-ended and experimental.

keywords: critical praxis, phd dissertation, scholarly monograph, (digital) humanities, open research, remix

on september , , media studies scholar kathleen fitzpatrick wrote an article in the chronicle of higher education entitled ‘do “the risky thing” in digital humanities’. in this piece fitzpatrick reflects upon advice recently given to a graduate student who wanted to produce a digital project for her final dissertation. instead of doing the safe thing and writing a traditional dissertation, fitzpatrick advised her to ‘do the risky thing’, and to experiment and present her argument in an innovative way.
at the same time fitzpatrick was careful to emphasise to the student the importance of making sure they had someone to cover their back. fitzpatrick thus used her article in the chronicle to make a plea for mentors and dissertation supervisors to support experimental digital dissertation work. ( a) the following article can in many ways be seen as an expansion of fitzpatrick’s argument. however, although i applaud fitzpatrick’s insistence on the importance of acquiring supervisory support when doing digital research, this article draws more attention to the responsibility and agency of phd students themselves to, in fitzpatrick’s words, ‘defend their experimental work’, and their ‘deviation from the road ordinarily travelled’. it will do so by, amongst other things, looking at the reasoning and argumentation that lie at the basis of critical scholarly work that embraces the digital, and by illustrating this with examples hereof as well as from my own digital dissertation project (currently being conducted). drawing further on the example of experimental digital dissertation work, i will offer a theoretical argument as to how the choices we make during the course of our phds, and the way we conduct our research, say a lot about the scholarly communication system we want and envision. drawing on foucault and insights from cultural studies and critical literacy theory, i will argue that during the course of our phds and in the process of creating a dissertation, we are very much structured to produce a certain kind of knowledge and with that a certain kind of social identity. developing critical and digital literacy through developing what i will call a ‘critical praxis’ can prevent us from simply repeating established practices, without critically analysing the assumptions upon which they are based.
to enable us to remain critical of power structures and relations that shape knowledge, i will argue for the importance of phd students experimenting with different forms of knowledge production as part of their research process. for the practices we develop and embrace whilst doing our dissertation work have the capacity to transform the way we conduct scholarly communication. through them, i will argue, we can struggle for and enable the kind of politics and ethics we feel our systems should embody and we can start to produce knowledge differently.

valuing the digital

in order to establish where the importance of experimental digital work for humanities scholarship lies, we need to explore how we can use digital tools and technologies in a critical way to enhance and improve our scholarship and our communication systems. through the digital we have the opportunity to critically investigate and question the value of our established institutions and practices and, vice versa, through critique we can analyse and transform the digital to make it abide by a more progressive and open ethics and value system. in this respect, experimenting with open and online dissertation work can be seen as the beginning of an exploration of what a digital cultural studies could look like. it is important to stress however, as cultural and media theorist gary hall has argued extensively, that in our experiments with the digital our ethics and politics should not be fixed from the start. ( ) we need to leave room to explore them within our experiments or as part of the process of conducting the dissertation. nonetheless, whilst acknowledging this space for openness and experimentation, there is something we can say about a politics of the digital that favours a more progressive and transformative outlook.
especially with respect to how the digital, or our digital practices, can be used to build new forms of resistance and can be applied to construct alternative ways of producing knowledge outside the exploitative realm of neo-liberalist institutions and conservative practices. however, and let me state this clearly, resistance and change do not come about automatically with the help of digital tools and technologies. the real problem is not a technological but, foremost, an institutional one. creating institutional change through the lens of the digital is necessarily part of a process of struggle in which the future of the digital and that of our institutions are at stake. the potential of the digital within this struggle has been a subject of fierce debate. within the discourse on net criticism for instance, the potential of change within and through networked politics is seen as small, where it is, rightfully, argued that the digital also has the potential to reproduce social inequalities and even to promote capitalist exploitation. (terranova, ) however, although i am not disagreeing with these kinds of analyses (and i think they are important and necessary), we need to be wary not to fall from net criticism into net scepticism. the digital, in a configuration of events and networks of people and alternative practices, combining a shared interest and politics with a commitment to online social solidarity and community building, however small the effort, does have the potential to effect change. if we only look at the reach open access digital book publications have in comparison with their paper counterparts, we can see how the digital, or a digital open access politics, can be employed to reach a wider audience for scholarship and to encourage more direct interaction between authors and their public.
[footnote: ronald snijder’s findings show that the open access publishing of books enhances their discovery and online consultation. (snijder, )]

many experiments with alternative forms of scholarly communication have arisen from what has become known as the digital humanities, which has been defined as ‘not a unified field but an array of convergent practices’. (presner and schnapp, ) for some within this community there is a division between ‘coders’ and ‘builders’, and those who are more interested in theory and politics and in what the digital can for instance mean to larger questions related to the nature and purpose of the university system. however, in my vision of what a digital humanities is and can be, i am more interested in how the humanities, through the digital, or the digital with the aid of humanities critique, can act as a disruptive (political) force, asking disruptive questions. with this i am in agreement with fitzpatrick’s reflection on the digital humanities:

after much tension between media makers and media scholars, an increasing number of programs are bringing the two modes together in a rigorously theorized praxis, recognizing that the boundaries between the critical and the creative are arbitrary. in fact, the best scholarship is always creative, and the best production is always critically aware. the digital humanities seems another space within the academy where the divide between making and interpreting might be bridged in productive ways. ( : )

the digital humanities can aid in reinterpreting the humanities (in a continuous process that stretches beyond the digital).
gary hall states that it is about what the digital (or scientific methods/computation) has to offer to the humanities and vice versa, as he argues for a humanities turn within the digital humanities, where it should ‘[act] as a responsible, political or ethical opening to the (difference, heterogeneity and incalculability of the) future, including the future of the humanities.’ ( ) what would a critique with and through the digital look like? for one, it can be beneficial to look at our established practices through the lens of the digital. one could argue that the coming of a new medium offers us a gap, a moment within which, through our explorations of the new medium, dominant structures and practices become visible and we become aware of them more clearly. the discourse, institutions and practices that have come to surround our printed forms of communication, and that we have grown used and accustomed to, have not only fortified certain politics and ethics that we need to be critical about; these politics and ethics are also being transported into the digital, where our practices and institutions are being reproduced online. this applies, among other things, to the political economy surrounding scholarly publishing. the progressively unsustainable economic model for print book publishing is very worrisome, where the current monograph crisis shows that the printed monograph is no longer sustainable, as it is increasingly seen as a product that needs to make a profit according to market rules. however, as fitzpatrick, amongst others, has pointed out, although the form of the book is no longer (financially) viable, it is still required for our academic careers, to gain tenure. ( b: ) new forms of digital publishing, like experiments with not-for-profit open access publishing, offer digital alternatives to this situation based on an ethics of sharing and cooperation.
in the same manner, the digital is enlisted in the struggle to critique established forms of (double blind) peer review in favour of more open digital forms of peer-to-peer review, as well as in the debate surrounding the function and value of our current forms of authorship and ownership. the author construct and the idea of ownership of a work, whether it is owned by an author or by the one who monetarily profits from it (i.e. the work’s publisher or in some cases the author’s university), can be seen to encourage the growing marketisation of higher education. when we question authorship and ownership, we also question these more general structures on which our modern universities are built.

[footnote: as christine borgman argues, although e-publications have fewer material constraints, their form remains relatively stable or continuous to the printed book. in borgman’s vision this is not a rejection of technology but a reflection of the constructive power of scholarly practices. even though, as she states, the existing forms might not necessarily serve scholars well or best, new genres that take advantage of the fluid and mobile nature of the medium are only slow to emerge. hence today’s online books look almost identical to print books. (borgman, : )]

[footnote: forms of open and p2p review are increasingly being experimented with, for example by the journal shakespeare quarterly and by kathleen fitzpatrick for her book planned obsolescence ( ).]

of course the digital is not only used as a means to critique established (print-based) systems and practices. it also tries to foreground a value system of its own, one that is being debated and formed, amongst other places, within the community of digital humanities theorists and practitioners.
(spiro, ) within this flexible value-system-in-development, the digital is seen as offering us the possibility to rethink our scholarly communication system along the lines of an ethos of openness and sharing. based mainly on a reciprocal gift economy (free, online scholarship, open for re-use), it tries to steer our focus away from product-based market and consumer forms of thinking (including the book as a form of merchandise and the author as its owner). information is here seen not as an object or a commodity but as a social good to be shared and reused to advance the community as a whole. in this sense the digital, and experimentation with new practices within the digital, offer us the opportunity to foreground conversation and collaboration, connectivity and interaction.

developing a critical praxis

producing a thesis or dissertation in an experimental form (from using multimedia to enhance the dissertation’s argument to more advanced forms such as hypertextual or multi-format dissertations), or even using research blogs or social media to further develop the argument of a print-on-paper thesis online, can be an important aspect of acquiring digital and critical literacy. for example, reflecting on studying for a phd, historian tanya roth writes: ‘as digital tools and processes continue to offer larger benefits for [such] projects, it is increasingly important to make sure grad students understand what’s out there and how these resources and ideas can help them with their own research.’ ( ) as roth makes clear, this is not an either-or situation: ‘traditional skills’, such as how to write a research paper, also need to be part of the curriculum. one of the reasons it is important when studying for a phd to develop digital and critical literacy (which, i will argue, can be seen as a simultaneous process) is that it helps to develop and perhaps expand one’s research skills.
but more importantly, it presents an opportunity to rethink and critically analyse certain ‘traditional skills’ and research practices that have become ‘normalised’ or have become the dominant standard both within humanities research and within the process of writing and conducting a humanities dissertation. from that position, using these new critical skills and tools, we have the possibility to start performing our practices differently. by actively and critically ‘trying out’ new (digital) tools and methodologies to see how they might fit the specific research project and/or argument that is being pursued, by performing the dissertation in an experimental or alternative way, and, as part of this, taking the digital as its object of research, graduate students may be able to develop what i will call a critical praxis. praxis here relates to the process of bringing ideas, ideologies or theories into practice. it refers to how theory gets embodied in our practices. critical praxis then refers to the awareness of, and the critical reflection on, how our ideas get to be embodied in our practices, making it possible to transform them. similar to foucault’s genealogy as a theoretical method, critical praxis can be seen as a practical application of the same critical procedure and investigation. it refers to the institutional embeddedness of phd students and the transformational agency of their practices. praxis in this sense forces a link between practice and the political, where through self-critique we will be able to reconstitute and reproduce ourselves and our social systems and relationships. the process of developing a critical praxis during the course of one’s phd, examples of which i am outlining here, draws, amongst others, on ideas and theories on critical, digital, and media literacy. the insights of critical pedagogue henry giroux are essential herein.
following giroux, cultural processes and power relations are seen as integrally connected in shaping our (educational) institutions. this takes place, among other ways, through the production of social identities, where certain values and knowledge systems help shape the production, reception and transformation of particular kinds of identities. for instance, as i will argue here, structures and practices underlying knowledge production in a field enable a specific value system to emerge that (re)produces a certain kind of social identity, namely that of the phd student and ultimately of the academic scholar. importantly however, for giroux a cultural politics and critical pedagogy ‘can be appropriated in order to teach students to be critical of dominant forms of authority, both within and outside of schools, that sanction what counts as theory, legitimate knowledge, put particular subject positions in place, and make specific claims on public memory.’ ( ) developing a literacy that expands ‘beyond the culture of the book’ is in this respect essential, giroux claims, not just to learn new skills and knowledge but to be able to use these to both critically examine and analyse different (multimedia) texts and to produce these texts and technologies differently. giroux thus sees literacy foremost as a critical discourse, as a precondition for agency and self-representation. educators mcleod and vasinda draw further on this (referencing mclaughlin and devoogd) when they say that a critical literacy involving multiple media demands that we expand the concept of text, where text also can include sociocultural conditions and relationships. ( : ) hence developing critical praxis can be seen as a method to critically analyse the sociocultural conditions and relationships that constitute academia and, based on that, to produce the phd dissertation (and with that the phd student) and ultimately the scholarly field and system in which it functions differently.
that being said, i do not envision that any particular kind of critical praxis, including the ones outlined here, can be used as a ‘normative method’ or a route map towards conducting a phd in the digital age. the ‘reflection on the self’ as a social identity that a critical praxis envisions is in this respect highly personal and contextual. for this i draw on cultural studies scholar handel wright and the form of autoethnography he applies in his article ‘cultural studies as practice’. for wright ‘doing cultural studies’ means most importantly an ‘intervention in institutional, sociopolitical and cultural arrangements, events and directions.’ he sees cultural studies as a form of ‘social justice praxis’, one that warns against theoreticism and that blurs the boundaries between the academy and the community. in his description of what ‘social justice praxis’ means or what it should do he chooses not to use a model-based, more prescriptive method, but follows a more modest approach, one in which he adapts gregory jay’s ( ) idea and strategy of ‘taking multiculturalism personally’ to ‘taking cultural studies personally’, in order to advocate and explicate cultural studies praxis. (wright, : ) the examples of critical praxis that i mention here should thus not be seen as authoritative models of what a critical praxis should be, but only illustrate and describe what it possibly could be within the specific context of, for instance, a humanities dissertation. in this specific case, the university, the course of the phd itself and, more specifically, the dissertation or the monograph become the subject on which the critical praxis focuses.
wright also stresses the importance of addressing one’s own practices and institutions as sites of critical praxis: ‘in addition, i want to reiterate that the university itself must not be overlooked as a site of praxis, a site where issues of difference, representation and social justice, and even what constitutes legitimate academic work are being contested.’ ( : )

the (re-)production of the phd student

as stated before, critical praxis offers an opportunity to actively rethink ‘traditional skills’ and established research practices, and with that what is still perceived as the conventional or ‘natural’ process of doing a phd in the humanities: creating a single-authored, static, print-based argument in long-form, which should ideally be publishable as a research monograph. this ‘natural process’ of doing a phd (which of course is anything but) can be seen as a product of certain dominant ideas and discourses that function to shape how a graduate student is to write or author a dissertation. as such, this established convention or process provides a road map to becoming a scholar in which the dissertation serves as a model as to how to conduct research and, ultimately, how to produce a scholarly monograph. game studies scholar anastasia salter reflects on this state of affairs remarking that ‘the traditional dissertation as product reflects the dominance of the book: it creates a monograph that sits in a database.
the processes of the humanities are to some extent self-perpetuating: write essays as an undergraduate, conference papers as a graduate student, a dissertation as a doctoral student, and books and journal articles as a professor.’ ( ) the importance of being aware of and critiquing such dominant discourses, however, lies not only in exploring the tension between how, on the one hand, they reproduce ‘traditional scholars’ and how, on the other hand, the phd and the phd thesis are supposed to be, as political theorist angelique bletsas states in ‘the phd thesis as ‘text’’: ‘(…) the foundations of ‘new scholarship’ and as such are integral to the production of new thought and new scholars.’ ( : ) it is also important to be aware that these discourses relating to knowledge production during the phd process have, as bletsas argues, certain subjectification effects. she shows how the dissertation is not only about finishing a static text but also about finishing as a person: as she puts it, the accepted thesis completes the student as a discoursing ‘subject’. in other words, the phd student as a discoursing subject is being (re)produced in and by these dominant discourses; and with that, a certain kind of scholar, and a certain kind of scholarly communication system are also reproduced. alan o’shea already argued in for the importance of cultural studies theorists paying attention to their own institutional practices and pedagogies and the way knowledge is produced and disseminated herein, something he felt up to then had been lacking. o’shea warns of the ‘tendencies towards self-reproduction’ in higher education, effects which are not pre-given but outcomes of specific struggles. as o’shea states, echoing bletsas’s argument, ‘the practices in which we engage constitute us as particular kinds of subjects and exclude other kinds.
the more routinised our practices, the more powerfully this closure works.’ ( : ) o’shea however warns against overemphasizing the extent of this closure, focusing on the many-sided complexity of the regimes of value underlying our educational institutions, where different regimes co-exist and overlap and people move between them. he conceptualizes these regimes as ‘a field of contestations’, where we are always already positioned within certain institutions and practices:

the cultural critic is always-already positioned within institutions. to speak publicly at all you do not have to belong to a state institution, but you do have to operate within one set or another of 'institutionalized' codes and practices, with historically determinate modes of production, distribution and consumption. (o’shea, : )

critical praxis as self-assertion

drawing further on o’shea and bletsas, i will argue here that to change our institutions from within we should start with critically examining our own position and practices and how these are reproduced. at a time when digital projects are still perceived within the humanities as ‘risky’, developing a form of digital or multimedia literacy (including related skills) in experimenting with these kinds of digital projects or practices can be positioned as a process that goes hand in hand with developing critical literacy in general. it provides students with a means and an opportunity to critically rethink, through critical praxis, some of the dominant discourses and established notions (including connected ethics and politics) concerning how to conduct a dissertation, and with that, ultimately, how to write a scholarly monograph. let me emphasise here that i am not claiming that critical praxis can only be achieved or learned by experimenting with digital projects, methods and tools.
rather, i am arguing that at this specific moment these tools and methods can be employed to trigger critique and rethinking of some of our established notions concerning scholarship and scholarly communication, including authorship, peer review, copyright, and the political economy surrounding scholarly publishing. what is more, this critical praxis should be applied just as much to digital methods and to how research is being done within the digital humanities, especially insofar as digital projects reproduce notions and values from the dominant, established discourses. not all digital projects are inherently and necessarily critical, experimental or even ‘risky’; they just have the potential to be so. furthermore, i would argue that acquiring digital literacy means acquiring various kinds of literacies, including ‘traditional’ print literacy. media theorists kellner and share argue for the importance of developing forms of ‘multiple literacies’ as a response to dominant forms of literacy as they are socially constructed in educational and cultural practices and discourses. multiple literacies, in the sense of media literacy, computer literacy, multimedia literacy and digital literacy, also include books, reading, and print literacy. (kellner and share, : ) as bletsas states, drawing on foucault, there is ‘no standpoint in the field of knowledge production which is ‘innocent’ or outside of power relations.’ ( ) bletsas describes the tension in needing to be accepted and formed by, and to comply with, a certain discourse before you can critique this discourse. and just as knowledge is inherently political, so i would claim doing a phd or writing a dissertation is also a political act. the process of resisting being formed in a certain way is, for me, something that already starts during the period of studying for a phd, this being a time when we begin to evaluate critically which of the values that get reproduced in scholarly communication we should cherish.
the phd can therefore be seen as an intervention in the production of knowledge, in which one adopts a position concerning the future of scholarly communication. in order to maintain this position of the ‘interventionist potential’ of the phd process, i will necessarily not theorise the closure of the dominant discourses within academia, and the subjectification effects they have on social identities, in, as o’shea put it, an ‘overemphasized way’. for this i draw on foucault’s later work in which he advances the possibility the subject has to develop agency within subordinating systems. in foucault’s words ‘individuals are the vehicles of power’, they reproduce power in a positive, productive way. ( ) but they also have the power to produce power in a different, creative way. foucault scholar eric paras sums up these changes in foucault’s work as follows: ‘the individual, no longer seen as the pure product of mechanisms of domination, appears as the complex result of an interaction between outside coercion and techniques of the self.’ ( : – ) drawing upon the later foucault, performing the phd and one’s social identity as a student and scholar can be seen as no longer a matter of self-defence but rather of self-assertion. as paras states, becoming a subject is in foucault’s later thought less ‘an affirmation of an identity than a propagation of a creative force.’ ( : ) it is a creative effort rather than a defensive one. in this sense paras emphasises the potential in the later foucault for the subject to reflect upon its own practices and to choose among and modify them following techniques of the self, those specific practices that enable subjects to constitute themselves both within and through systems of power.
if we envision critical praxis as both a critical project and a creative, transforming, and transformative one, part of this creative impulse lies in the potential to, as cultural studies scholar ted striphas calls it, ‘perform scholarly communication differently – that is, without simply giving in, in judith butler’s words, to “the compulsion to repeat.”’ ( ) he argues that the norms of scholarly communication that we perform today through a ‘routine set of practices’ were forged under historically specific circumstances, to which we might add: circumstances that might not (in their entirety) apply today. this triggers us to ask new questions about these practices and to start performing them differently: as striphas adds, much more creatively and expansively (expanding our repertoire) than we currently do, with the ultimate goal to ‘enhance the quality of our research and our ability to share it’. ( ) applying this to the course of a phd means that, instead of seeing phd students as being completely produced by the practices they reproduce and the knowledge systems that enforce these, we can see these practices and institutions not as constituting, but as shaping them. however, this doesn’t underestimate the power these shaping practices and systems have. as o’shea argued before, the more repetitive these practices become, the more thorough and self-perpetuating this shaping process also becomes. ( ) nonetheless, as students, and as academics, we have the possibility to creatively act within these frameworks, to struggle for a more open and progressive knowledge system, performing scholarly communication differently. that being said, we should remember o’shea’s critique of the view that these (dominant) systems are monolithic. there is a complex power struggle taking place within academia over certain kinds of knowledge and knowledge systems.
this struggle can be seen to revolve around having or obtaining the power to create the possibilities to transform the structures that will enable certain kinds of values to be produced. the digital realm for instance has the potential to promote a more progressive knowledge system based on values of sharing, openness and cooperation, one that struggles against institutional inertia and conservatism, and the perseverance of neoliberal market values in education. the kind of knowledge that can come out of such a more progressive system, i will argue, might be hard to realize if we keep reproducing our print-based practices within the digital realm, practices that might (no longer) be able to promote these values to the fullest in an online environment. it is this struggle over the future of our scholarly communication system that the examples of critical praxis, including my own dissertation, focus on.
re-envisioning our research practices
the traditional phd dissertation or the ‘natural phd process’ follows many of the elements of a paper-based view of scholarly communication, increasingly inhibiting progressive practices and knowledges, such as i have outlined above, from coming to the fore. consequently, what i am arguing for is a critical praxis that explores (and remains critical of) alternative practices and structures that enable values based on a politics of sharing, openness and collaboration. these values are of the utmost importance in promoting and furthering (digital) scholarship where opening up knowledge and freely sharing and re-using research results will enable scholarship to build upon earlier thoughts and ideas. in the largely print-based scholarly communication and publishing system that we are increasingly adopting online too, our research practices run the risk of becoming more and more alienated from this alternative view.
knowledge is commodified and objectified in our increasingly profit-driven publishing systems and in our performance and assessment systems, where scholars are being judged according to their individual contributions to scholarship (single-authored journal articles and books). although we now have the technological possibilities to reproduce and distribute information online virtually for free, knowledge still runs the risk of being stowed away behind access and payment barriers. within this system it becomes hard to promote values based on sharing, accessibility and collaboration. hence i argue for critically looking at and rethinking these paper-based structures and practices so that we can reproduce them differently in a digital system. related to this, i will argue for a critical praxis that critiques established notions of authorship and stability, trying to envision how we can perform them differently in our research practices. increasingly these notions of authorship and stability don’t fit with the new (digital) practices that are being developed in the sciences and increasingly also in the humanities, where different forms of collaborative processual knowledge production and sharing are being explored. furthermore, these print-based notions of authorship and stability are embedded within a discourse that envisions humanities scholarship as a system based on (single) authors in competition with each other in an attention economy based on creating scholarly objects that determine an author’s value within that system. the notion of a stable scholarly object also becomes increasingly difficult to maintain in an environment that encourages continual updating and change. the digital offers us the possibility to experiment with updating material, to link out and refer to other works, in the process creating a network of texts and resources, and to remix and reuse material in a collaborative fashion.
as fitzpatrick explains, the online realm stimulates the open-ended nature of networked writing: ‘all three of these features – commenting, linking, and versioning – produce texts that are no longer discrete or static, but that live and develop as part of a network of other such texts, among which ideas flow.’ ( c) a critical praxis will thus trigger us to rethink institutions and practices that are at the moment still very much part of, and reproducing, an economics and politics based on a power structure that has been inherited from a print-based situation. striphas argues that we need to get beyond the blind copying of print writing practices into the digital realm, arguing for experimentation with the form, content and process of scholarly publication. there is no compelling reason, he argues, why we need to conform to “papercentric” conventions in the online world when we can also explore and make better use of the interactive features the web offers to rethink the paper-based distribution and assessment methods we are repeating in the digital realm. (striphas, )
footnote: see (spiro, ) and (simeone et al., ) for examples of collaborative projects in the (digital) humanities.
a digital, open, and collaborative research practice
to illustrate what a critical praxis might look like, and how it can envision and create such an alternative system, i will focus, amongst others, on my own phd dissertation in-process, which can be seen as an experiment in developing a digital, open research practice by exploring the possibilities of remix, liquidity and openness in the dissertation’s conduct and format.
by positioning the medium of the book as a major site of struggle within the humanities over the potential application of some of the new, digital forms and systems of communication that are increasingly affecting academia (such as open access publishing, open peer review, and liquid books), my dissertation argues for the importance of experimenting with alternative ways of thinking and performing the academic monograph. starting with the long-form argument that is the phd dissertation itself, i hope to actively challenge and critique the established (print-based) notions, politics, and practices within the field of the humanities, in form, practice and content. within the humanities scholars increasingly experiment with conducting their research in a more open way, following the idea of open research or open notebook science, which involves publishing one’s research as it evolves (including drafts and raw data) instead of only publishing the research results. examples of scholars who are experimenting with (new forms of) online publishing and who can thus be seen as developing or practicing forms of critical praxis are for instance ted striphas, who posts his working papers online in his differences and repetitions wiki, and gary hall, who is making the research for his new book media gifts freely available online on his website as it evolves. kathleen fitzpatrick put the draft version of her book planned obsolescence online for peer review using the commentpress wordpress plugin, allowing readers to comment on paragraphs of the text in the margins. examples of phd students involved in open research are librarian heather morrison, who posts her dissertation chapters online as they evolve, and english student alex gil, who is putting his work for his dissertation online on electroalex.com, using the commentpress plugin.
footnote: books that can be continually updated or added to, published under the conditions of both open editing and free content; i.e. books in/as wikis.
the focus in the above cited examples (as in the conduct of my own dissertation) on openness, open research and open access not only functions as a means to experiment with new practices of producing and distributing knowledge, but can be seen as a direct critique of the material conditions under which humanities research is currently being produced. striphas, who perceives cultural studies as a set of writing practices, has scrutinized the way these writing practices are currently set up and function by looking closely at the material conditions by which they are produced, distributed, exchanged and consumed, i.e. by exploring the politics and economics of academic publishing. the choices we as scholars make or, as striphas emphasizes, that are made for us when we publish our research results are very important. striphas underlines both the systemic power relations at play as well as our own responsibilities in repeating these practices or alternatively choosing different options. we need to have better access to the ‘instruments of the production of cultural studies’, i.e. the publishing system, and to the content that gets produced, by exploring and taking control of ‘the conditions under which scholarship in cultural studies can – and increasingly cannot – circulate.’ striphas thus emphasizes our roles as scholars within this publishing system, which serves as a good example of critical praxis in action.
this is in order to, in striphas’s words, ‘perform our writing practices differently, to appropriate and reengineer the publishing system so as to better suit our needs.’ ( ) following these examples, by making use of digital platforms and tools, all the research for my thesis (notes, drafts, chapters, etc.) will be made available online, as it progresses, via multiple outlets. to reach out to a wider readership and to connect with a peer community of sharing and collaboration, various social media outlets are used, including a weblog where first drafts and short pieces related to the dissertation are posted. the blog builds upon an existing readership (i have been blogging since ) and aims to connect with a wider community by making extended connections via twitter (a microblogging community) and zotero (an online open source reference system enabling people to collect and share references and resources), two outlets heavily used by scholars and the wider public interested in the digital. more advanced chapters will be published on a multimedia platform, where the possibility to create, edit, and read in a collaborative setting, and the possibility to make mashups and remixes of text, video, sound, illustrations, images and spoken word, will be explored. this platform will be used to find out what it means to communicate research in an other than textual format, and to have different multi-medial versions of the research whilst at the same time experimenting with bringing into practice ‘multiple literacies’. (kellner and share, ) furthermore, a wiki will be used where the authorial ‘moderating function’ still at work in the blog and the multimedia platform will be left behind. my intention is to use the wiki to explore what it means to let go of authorship as a form of authority.
in the wiki environment the author can no longer (solely) be held responsible for the text or research: the text will know no final version, it can be further commented upon, and it can be updated, remixed and re-used (in principle) indefinitely. finally, as part of the requirements of the phd to produce a single-authored written piece of work in long format, one of the versions of the dissertation will be exactly that: a traditional argument on paper. the intention of the dissertation project is to create different versions and parts of the dissertation argument, existing on different platforms, that then come to function as nodes in a multi-format, interlinked network of texts, notes, drafts, references and remixes, where no part is more or less important than the other parts, nor will one text necessarily form the end-point or final version of the dissertation project. this critical praxis will thus follow the idea of open research mentioned before, by which anyone can track what has been done (openness), can comment on the research (social), and can add to it (collaborative, remix, liquid). this possibility to reuse, remix, and modify (scholarly) material can be seen as one of the most contested forms of openness (adema, : ), as it actively challenges the established print-based notions the current system of scholarly communication is built around (i.e. stability and authority). this more radical form of openness has the potential to problematise these notions not only in the current context, but also from a more historical perspective, showing how the pillars of academia (stability, authority, authorship) are merely precariously upheld constructs, maintaining established institutional, economic or political structures.
(johns, ) the possibility to expand and to build upon, to make modifications and derivative works, to appropriate and update content (within a digital environment), shifts the focus in scholarly communication from product to process and can be seen as a critique of the increasing commodification of the book as product. as this is foremost an experiment in creating a critical praxis following an ethos of sharing and openness, part of the critique will also focus on the process itself: is it possible to easily connect to a community? will people actually reuse and remix the content or feel comfortable to do so? will the processes of collaboration, community input and reuse and adaptation actually limit the authorial intentions at work in the text? however, even the failure to achieve these intentions will be a valuable lesson, as it might show us the strengths of our established practices and institutions, and the challenges we are facing in adapting the digital to our needs.
conclusion
i have tried to show here, building on existing literature on critical and digital literacy, and using examples of critical digital scholarship (including my own dissertation), various possible ways of developing a critical praxis. the specific example of my dissertation focused on establishing a digital, open and collaborative research practice by looking at the possibilities of remix, liquidity and openness in the dissertation’s conduct and format. critical praxis here not only serves to critique established notions of how to write and conduct a dissertation within the humanities, it also helps to develop new digital research practices that enable sharing, openness, and remix of the research during its ongoing development. referring back to the beginning of this article and to how kathleen fitzpatrick mentioned that doing a digital project within the humanities is still seen as a ‘risky thing’, this research project will encounter both tension and paradox.
the paradox lies in the fact that to become an academic within the present system, we in many ways still have to adhere to the present structures and systems, resulting in the tension earlier described by bletsas, where we have to conform to the rules, regulations and practices that we at the same time try to critique and transform. however, following amongst others o’shea and striphas and the later foucault, i maintain that we are able to transform these practices from within. nonetheless, as in any struggle focused on changing a system from within, compromises have to be made to deal with the tension between ‘outside coercion and techniques of the self’. that being said, i hope that the example of my dissertation has shown that by developing a critical praxis during the process of conducting a phd, we can then continue to develop this further once our scholarly career progresses. as part of this process we will be able to actively produce and promote alternative communicative norms, politics and practices, which will aid us in the struggle to critique and transform the established academic power systems. as i stated before, the examples i have mentioned here—including my own dissertation—should not be seen as normative models, but i nonetheless hope they inspire other students and scholars to develop their own form(s) of critical praxis to aid them to produce themselves and their institutional practices differently. references       adema j ( ) open access business models for books in the humanities and social sciences. oapen project report. amsterdam. bletsas a ( ) the phd thesis as “text”: a post-structuralist encounter with the limits of discourse. new scholar ( ). available at: http://www.newscholar.org.au/index.php/ns/article/view/ . borgman c ( ) scholarship in the digital age  : information, infrastructure, and the internet. cambridge mass.: mit press. differences & repetitions wiki ((n.d.)). wiki. available at: http://wiki.diffandrep.org/. 
fitzpatrick k ( a) do “the risky thing” in digital humanities. the chronicle of higher education, september. available at: http://chronicle.com/article/do-the-risky-thing-in/ /. fitzpatrick k ( b) planned obsolescence: publishing, technology, and the future of the academy. nyu press. fitzpatrick k ( c) the digital future of authorship: rethinking originality. culture machine . available at: http://www.culturemachine.net/index.php/cm/article/viewarticle/ . fitzpatrick k ( ) the humanities, done digitally. in: gold mk (ed) debates in the digital humanities. minneapolis: university of minnesota press. foucault m ( ) two lectures. in: gordon c (ed) power/knowledge: selected interviews and other writings, - . vintage. gil a ((n.d.)) l’atelier du colibri. wiki. available at: http://www.elotroalex.com/atelier/. giroux ha ( ) cultural politics and the crisis of the university. culture machine . available at: http://www.culturemachine.net/index.php/cm/article/viewarticle/ / . hall g ( ) digitize this book!  : the politics of new media, or why we need open access now. minneapolis: university of minnesota press. hall g ( ) on the limits of openness vi: has critical theory run out of time for data-driven scholarship? media gifts. available at: http://www.garyhall.info/journal/ / / /on-the-limits-of-openness-vi-has- critical-theory-run-out-of.html. jay g ( ) taking multiculturalism personally: ethnos and ethos in the classroom. in: gallop j (ed) pedagogy: the question of impersonation. bloomington & indianapolis: indiana university press. johns a ( ) the nature of the book: print and knowledge in the making. university of chicago press. kellner d and share j ( ) toward critical media literacy: core concepts, debates, organizations, and policy. discourse: studies in the cultural politics of education ( ): – . mcleod j and vasinda s ( ) critical literacy and web . : exercising and negotiating power. computers in the schools ( - ): – . 
morrison h ((n.d.)) heather morrison  » open thesis. heather morrison’s home page. available at: http://pages.cmns.sfu.ca/heather-morrison/open-thesis-draft- introduction-march- /. o’shea a ( ) a special relationship? cultural studies, academia and pedagogy. cultural studies ( ): – . paras e ( ) foucault . : beyond power and knowledge. other press. presner t and schnapp j ( ) the digital humanities manifesto . . available at: www.humanitiesblast.com/manifesto/manifesto_v .pdf.       roth tl ( ) hacking the dissertation process. dude, where’s my tardis? available at: http://tanyaroth.wordpress.com. salter a ( ) rethinking the humanities dissertation. avatar’s quest. available at: http://selfloud.blogspot.com/ / /rethinking-humanities-dissertation.html. simeone m et al. ( ) digging into data using new collaborative infrastructures supporting humanities-based computer science research. first monday ( ). available at: http://www.firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / . snijder r ( ) the profits of free books: an experiment to measure the impact of open access publishing. learned publishing ( ): – . spiro l ( ) examples of collaborative digital humanities projects. digital scholarship in the humanities. available at: http://digitalscholarship.wordpress.com. spiro l ( ) this is why we fight: defining the values of the digital humanities. in: gold mk (ed) debates in the digital humanities. minneapolis: university of minnesota press. striphas t ( ) acknowledged goods: cultural studies and the politics of academic journal publishing. communication and critical/cultural studies ( ): – . striphas t ( ) performing scholarly communication. text and performance quarterly ( ): – . terranova t ( ) free labor: producing culture for the digital economy. electronic book review. available at: http://www.electronicbookreview.com/thread/technocapitalism/voluntary wright hk ( ) cultural studies as praxis: (making) an autobiographical case. 
cultural studies ( ): – .
used to track the ‘hours’ of your shift; ) a ‘gridlock counter’, which tracks how many ed backups or adverse patient outcomes occur (‘gridlocks’). the goal of the game is to work cooperatively with your teammates to complete patient tasks and move patients through the ed to an ultimate disposition (e.g. admission, discharge). the game is won if you finish your shift before reaching the maximum number of ‘gridlocks’ allowed. conclusion: initial responses to gridlocked have been very positive, supporting it as both an engaging board game and potential teaching tool. we are excited to see it validated through research trials and possibly incorporated into emergency medicine training at both student and postgraduate training levels. keywords: emergency department flow, simulation, board game
lo the canadiem digital scholars program: an innovative international digital collaboration curriculum
f. zaver, md, a. thomas, md, s. shahbaz, md, a. helman, md, e.s. kwok, md, b. thoma, md, ma, t.m. chan, md, university of calgary, calgary, ab
introduction / innovation concept: digital media are a new frontier in medical education scholarship. asynchronous education resources facilitate a multi-modal approach to teaching, and allow residents to personalize their learning to achieve mastery in their own time. the canadiem digital scholars program is a nationwide initiative that provides residents with practical experiences in creating digital educational materials under the supervision of experts in the field. the program allows for collaboration and access to mentorship from top digital educators from across north america. methods: interested residents accepted into the program spent a period of their pgy year completing modules developed in the theory and science behind digital education.
four modules, developed in an iterative process, have been built on the topics of podcasting, blogging, digital identity, and patient communication. each fellow was supervised by members of the canadiem team, a faculty member from the resident’s home institution, and digital experts from across north america. curriculum, tool, or material: the first fellow completed all aspects of the designed curriculum. beyond this, he also engaged in blog content creation, initiated research on digital scholarship, and managed the editorial section of canadiem. the second fellow is currently halfway through his year (and is expected to complete the program within the year) and has co-authored blog posts and podcasts in months. conclusion: the canadiem digital scholars program utilizes a novel approach to foster development of digital educators utilizing experts across north america. we have demonstrated the feasibility and sustainability with our initial pilot years. this program is being scaled next year to include two scholars per year, which will facilitate cross-collaboration between the scholars. keywords: innovations in emergency medicine education, social media, free open access meducation (foam)
lo not a hobby anymore: establishment of the global health emergency medicine organization at the university of toronto to facilitate academic careers in global health for faculty and residents
c. hunchak, md, mph, l. puchalski ritchie, md, phd, m. salmon, md, mph, j. maskalyk, md, m. landes, md, msc, mount sinai hospital, toronto, on
introduction / innovation concept: demand for training in global health emergency medicine (em) practice and education across canada is high and increasing. for faculty with advanced global health em training, em departments have not traditionally recognized global health as an academic niche warranting support.
to address these unmet needs, expert faculty at the university of toronto (ut) established the global health emergency medicine (ghem) organization to provide both quality training opportunities for residents and an academic home for faculty in the field of global health em. methods: six faculty with training and experience in global health em founded ghem in at a ut teaching hospital, supported by the leadership of the ed chief and head of the divisions of em. this initial critical mass of faculty formed a governing body, seed funding was granted from the affiliated hospital practice plan and a five-year strategic academic plan was developed. curriculum, tool, or material: ghem has flourished at ut with growing membership and increasing academic outputs. five governing members and general faculty members currently run projects engaging over faculty and residents. formal partnerships have been developed with institutions in ethiopia, congo and malawi, supported by five granting agencies. fifteen publications have been authored to date with multiple additional manuscripts currently in review. nineteen frcp and ccfp-em residents have been mentored in global health clinical practice, research and education. finally, ghem’s activities have become a leading recruitment tool for both em postgraduate training programs and the em department. conclusion: ghem is the first academic em organization in canada to meet the ever-growing demand for quality global health em training and to harness and support existing expertise among faculty. the productivity from this collaborative framework has established global health em at ut as a relevant and sustainable academic career. ghem serves as a model for other faculty and institutions looking to move global health em practice from the realm of ‘hobby’ to recognized academic endeavor, with proven academic benefits conferring to faculty, trainees and the institution. 
keywords: global health education, global health training, global health research
lo safety and efficiency of emergency physician supplementation in a provincially nurse-staffed telephone service for urgent caller advice
e. grafstein, md, r.b. abu-laban, md, mhsc, b. wong, mha, r. stenstrom, md, phd, f.x. scheuermeyer, md, m. root, ma, q. doan, mdcm, mhsc, phd, st. paul’s hospital, vancouver, bc
introduction: in british columbia created a nurse (rn) staffed telephone triage service (tts) to provide timely advice to non- callers ( ). a perception exists that some callers are inappropriately directed to emergency departments (eds), thereby worsening crowding. we sought to determine whether supplementary emergency physician (ep) triage would decrease ed visits while preserving caller safety and satisfaction. methods: tts rns use computer algorithms and judgment to triage callers. potentially sick callers are directed to “seek care now” (red calls). often this is to an ed, depending on acuity and time of day. in the vancouver health region from april-september between : - : hours, a co-located ep also spoke with “red” callers to provide further guidance. callers were followed up with week and satisfaction was evaluated on a -point likert scale. the tts data was linked to the regional ed database to assess ed attendance within days, and the provincial vital statistics database for -day mortality. our primary outcome was the proportion of unique “red” callers who did not attend the ed compared with a historical cohort one year earlier without ep triage in place. secondary outcomes were the proportion of “red” callers advised not to attend the ed but who (a) attended, (b) were admitted, or (c) died. results: in the study period there were “red” calls of
résumés scientifiques ; suppl cjem • jcmu https://doi.org/ . /cem. .
exploring the (missed) connections between digital scholarship and faculty development: a conceptual analysis
research article, open access
juliana elisa raffaghelli
correspondence: jraffaghelli@gmail.com, department of educational sciences and psychology, university of florence, florence, fi, italy
abstract
the aim of this paper is to explore the relationship between two research topics: digital scholarship and faculty development. the former topic draws attention to academics’ new practices in digital, open and networked contexts; the second is focused on the requirements and strategies to promote academics’ professional learning and career advancement. the research question guiding this study is: are faculty development strategies hindered by the lack of a cohesive view in the research on digital scholarship? the main assumption behind this research question is that clear conceptual frameworks and models of professional practice lead to effective faculty development strategies. through a wide overview of the evolution of both digital scholarship and faculty development, followed by a conceptual analysis of the intersections between fields, the paper attempts to show the extent to which the situation in one area (digital scholarship) might raise critical issues for the other (faculty development) in terms of research and practices. furthermore, three scenarios based on the several perspectives of digital scholarship are built in order to explore the research question in depth. we conclude that in the current state of the art the relationship between these two topics is weak.
moreover, the dialogue between digital scholarship and faculty development could lay the basis for forging effective professional learning contexts and instruments, with the ultimate goal of supporting academics to become digital scholars towards a more open and democratic vision of scholarship.
keywords: digital scholarship, information science, educational technology, interdiscipline, open science
raffaghelli international journal of educational technology in higher education ( ) : doi . /s - - -x
© the author(s). open access this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made.
introduction
the concept of digital scholarship was coined early in the decade to characterize the scholars’ professional practices linked to digital environments and tools (andersen & trinkle, ; ayers, ). while it has been defined generally as the use of digital evidence, methods of inquiry, research, publication and preservation to achieve scholarly and research goals (rumsey, cited on wikipedia), a first exploration of the literature yields hundreds of definitions that make the issue appear elusive. the terms adopted include e-scholarship (borgman, ), digital scholarship (costa, ; pearce, weller, scanlon, & kinsley, ; scanlon, ; weller, ), networked scholarship (quan-haase, suarez, & brown, ; b. e. stewart, a; veletsianos & kimmons, b), open scholarship (garnett & ecclesfield, ; george veletsianos, ) and research . . (esposito, ; oliveira et al., ).
moreover, the above-mentioned terms often refer to different research issues and problems, such as the affordances of cyberinfrastructures that enable scholars to become more “digital”, for example institutional repositories supporting open access (borgman, ; cox, ); or the uses and practices linked to open and social media such as facebook and twitter as a means of becoming a “social and networked” scholar (manca & ranieri, ; veletsianos, ; george veletsianos & kimmons, ). the underlying values motivating these studies are also diverse: while some of them advocate the need to open up science, paying particular attention to the public nature of science and its products (den besten, david, & schroeder, ; pontika, knoth, cancellieri, & pearce, ), others focus on scholars’ struggle against power within academia and their attempts to shape their own professional identity (costa, ; hildebrandt & couros, ). furthermore, the nature of openness in science, as the emerging paradigm for scholars to communicate and connect with the external world beyond the “academic ivory tower”, has been found to be based on several schools of thought (fecher & friesike, ). more recently, a systematic review of the literature and a study of bibliometric maps (raffaghelli, cucchiara, manganello, & persico, a) led the authors to conclude that several disciplines contribute to the topic of digital scholarship with scarce awareness of their specific methodological approaches and of the underlying conceptual frameworks adopted. if the concept of digital scholarship is fuzzy and covers different phenomena in a fragmented way, we can expect this problem to affect not only further research but also applied science. of particular interest is the case of faculty development, namely, the pedagogies that could address the professional learning processes needed to know, to do and to become a digital scholar.
scholars’ professional learning has been characterized by another convergent field of research and practice, namely, faculty development. in fact, faculty development can be defined as the practices and environments promoting scholars’ skills to advance in their careers, to perform their role with quality and excellence, and to innovate within their contexts of professional engagement (boyer, ; braxton, luckey, & helland, ). however, the main requirement for effective faculty development is a good recognition of the contents and methods for professional learning (grover, k. s. walters. s. r. c, ), something that digital scholarship may not be able to provide at its current state of advancement. therefore, the aim of this paper is to explore the relationship between two research topics, digital scholarship and faculty development, attempting to show a) the rather unexplored and intuitive relationships between the two fields and b) the way in which these missed connections could affect further research and practices addressing professional development for digital scholarship. hence, the article attempts to strengthen the dialogue between digital scholarship and faculty development as the basis for forging effective professional learning contexts and instruments, with the ultimate goal of supporting academics to become digital scholars towards a more open and democratic vision of scholarship.

methodological approach

emerging topics of research require elaboration in order to configure constructs that give rise to further empirical research. in the social sciences, and within them in educational research, we frequently witness scientific interests growing on the basis of changing societal conditions and problems. these give birth to case studies, exploratory research and/or best practices studies, and hence, to more structured, experimental approaches (gorard, ; gorard & cook, ).
instead, the exploration of research topics through several forms of literature review sheds light on the directions in which a research topic or field is advancing (petticrew & roberts, ). in cases where the topic is new or requires interdisciplinary attention, conceptual overviews of the topics under analysis, followed by critical and reflective discussion, can lay the basis for more specific interventions aimed at formalizing frameworks, methods, and constructs to be empirically explored (petticrew & roberts, ). in line with this, the current article undertakes a conceptual analysis guided by the following research question: are faculty development strategies hindered by the lack of a cohesive view in the research on digital scholarship? the main assumption behind this research question is that clear conceptual frameworks and models of professional practice (in digital scholarship) will lead to effective faculty development strategies (for digital scholarship). the conceptual analysis is carried out in three steps, each with its own specific research question:

a) the evolution and current state of digital scholarship as a research topic. the main research question, “are faculty development strategies hindered by the lack of a cohesive view in the research on digital scholarship?”, is here explored through the subsidiary question “is the current state of research on digital scholarship fragmentary?” the answer is provided through the overview of the literature and aims at understanding how digital scholarship has evolved as a topic of research, and at observing whether (as we assumed) this evolution has been fragmentary. we will build on the work of raffaghelli et al. ( ), which identified three main disciplinary perspectives of research on digital scholarship: information sciences and cyberinfrastructures, digital humanities, and professional networked learning.
we will analyse these three perspectives, from their conceptual foundations to the current trends of research, to see whether there is a fragmentary composition.

b) the evolution and current state of faculty development as a research topic. the main research question is here explored through the subsidiary question “how is faculty development defined in the research and which are the requirements for effective faculty development practices?” the answer is provided through the overview of the literature and aims at understanding the potential gaps that a fragmented vision of digital scholarship could leave uncovered. an important assumption here is that the research advancements on digital scholarship are a crucial element in configuring the ideal scenarios of professional performance and the approaches to implementing professional development on the issue.

c) the analysis of intersections between the two research topics. the main research question is here explored through the subsidiary question “how could the fragmentary vision of digital scholarship influence faculty development?” the answer is provided through a critical discussion integrated with three scenarios of faculty development based on the several perspectives of digital scholarship. the scenarios should provide concrete elements with which to examine more deeply the assumptions made through the previous overview of the literature.

in the conclusions, the overview and the connected discussion are wrapped up with remarks for future research and practices.

the research on digital scholarship: separated worlds

digital scholarship attracted increasing interest from the research community of librarians from the beginning of along with the various transformations that library services faced through their progressive digitalization.
since librarians had constantly collaborated with scholars both in searching for scientific information and in cataloguing and supporting the visibility of scholarly results for career purposes, the topic was easily perceived as a research problem (j. cox, ; zhao, ). moreover, the open access movement gave librarianship a strong impetus to reflect on its own practices and services, a reflection that was immediately transferred to academics’ practices. digital scholarship in this field was defined as “building a digital collection of information for further study and analysis, creating appropriate tools for collection-building, creating appropriate tools for the analysis and study of collections, using digital collections and analytical tools to generate new intellectual products, creating authoring tools for these new intellectual products, either in traditional forms or in digital form” (palmer & cragin, , p. ). most contributions from information and library sciences emphasized the problems librarians face in supporting scholars' understanding and use of digital textual and multimedia collections, as well as the way scholars could enhance digital infrastructures to facilitate the academic endeavour (from searching for documents to collaborating with other scholars). last, but not least, the debate focused on the way scholars could adopt digital facilities provided by libraries to increase their reputation (andersen, ; holliman, ; quigley, neely, parkolap, & groom, ; zhao, ). this concern about infrastructures developed hand in hand with the debate on open access (den besten et al., ; suber, ). the need to open up science seemed to be in transition towards the fully accessible, public and participatory concept of science in and for society, where the initial concept of escience (e for electronic) was developing into open science with its impact on scholarship (ren, ).
research in this field also built on scientometrics to study power relationships, reputation and visibility of science, since the pioneering works of de solla price ( ), praised by the sociologists of science merton and garfield ( ). in fact, the method showed, through mathematical and statistical principles, the relationships in science studied earlier by merton ( [ ]). this important contribution was translated into the current trend of analysing power influences and the processes of reputation building not only through traditional citation networks (obtained through traditional, paid publishers and scientific databases) but also through open access publications and repositories, the open web and, more recently, social media platforms, building emergent metrics or altmetrics (roemer & borchardt, ). altmetrics should support new reputation mechanisms for scholars (jamali, nicholas & herman, ) and new ways of developing collaboration and trust (jamali et al., ) towards a more open and democratic concept of science and beyond “the invisible college” (a concept adopted by de solla price, quoted by valente, ). however, most debates in this area remained disconnected from a critical perspective on the professional learning that scientists need in order to be introduced to a fair-minded digital, networked and open science. the deep debate enacted by de solla price was far removed from research exploring more or less naïve approaches to organizing and delivering digital services for scholars (in the best of cases connected to open access, but in most cases inevitably linked to digitalized but still traditional science). furthermore, the problems in this area were soundly conceptualized by christine borgman ( ) in her influential book “scholarship in the digital age: information, infrastructure and the internet”.
in borgman’s contribution, the concept of digital scholarship was tightly connected with the debate about cyberinfrastructures supporting new forms of doing research and science, namely, eresearch and escience, with the progressive digitalization of institutional infrastructures and with the expected impact on scholars’ practices for dealing with information and communication processes; but her analysis went beyond these services to think critically about how these cyberinfrastructures could reshape scholarly practices and production. borgman’s work doubtlessly arose from the deep-rooted information science field. however, she was critical of the risks of technological determinism, namely, thinking of technological platforms as the only influencers of human behaviour and organizational change. sociotechnical studies played a highly important role in this case (borgman, , p. - ) by pushing her in a new direction, opposite to the information sciences trend of analysing the building of infrastructures and the users’ experiences within them. as borgman explained: “librarianship tends to focus on methods of constructing organizational tools that reflect the world in the most authoritative manner, while recognizing that no organizational tool is static. rather, it must be updated continuously, via consensus process, to maintain its currency and relevance. sociotechnical studies, in contrast, tend to focus on how these representational tools construct the world, and how they both facilitate and constrain behavior” (borgman, , p. ). she accomplished, in this sense, a first effort in a cross-disciplinary direction, bridging the debate on cyberinfrastructures with professional practice and institutional development. in sum, the evolution of research in this field went in the direction of analysing the affordances and usage of cyberinfrastructures for the research flow as part of a scientific information life cycle (borgman, , p.
; hernon, ): filtering/accessing, creating/using, modifying/authoring, indexing/organizing, storing/retrieving, distributing/networking the results of research as digital objects containing scientific information, and doing so across traditional cyberinfrastructures, open access repositories or social media. however, another strand within this disciplinary field attempted to analyse how scholars build trust, collaboration and reputation by studying metrics based mostly on cross-citations and surveys.

in close connection with information sciences and library studies, a second strand of research emerged within the humanities at the cross-over with digital technologies, opening up a new field of research, that of “digital humanities” (terras, nyhan, & vanhoutte, ). as terras et al. pointed out, “digital humanities as a term (…) provides a big tent for all digital scholarship in the humanities” ( , p. ). the scholars connected to this perspective worked intensely to define the borders of theory and practice as a field of research (unsworth, ), and embraced the new forms of representation of cultural heritage, including history, arts and literature, through the digital medium (bentkowska-kafel, ; gardiner & musto, ; kaltenbrunner, ). moreover, the term encompassed the debate connected to the changing research methods and required professionalism in the humanities along an interdisciplinary dialogue with digital technologies (klein, ). in sum, digital humanities seemed to be at the cross-over of the debate about digital scholarship, adding its “unease” but also providing clear examples of the practice of digital scholarship (flanders, ).

a last strand, connected to academics’ professional learning and identity in the digital era, emerged in tight connection with research on educational technologies, by about .
interest in the matter of digital scholarship spread in this research area because the scholars contributing to it were interested in the complexities of technological uptake by institutions and users as a socio-cultural shift encompassing professional practices, the connected learning ecologies and the impacts on professional identity (pearce et al., ). within this field the research focused on scholars’ struggle to do (practices) and to be (identity) in the changing context of higher education. in fact, scholars were somehow pushed (in rather conflictive and contradictory ways) to keep pace with innovations based on digital, open and networked contexts (goodfellow, ; scanlon, ; weller, ). the conundrum of opening up science and education was thereby faced through the exploration of professional learning as the process the scholar undergoes in her effort to be more open, more digital or more networked (goodfellow, ). moreover, the focus of research shifted from the objective usage of cyberinfrastructures to understanding how technological affordances might create new scenarios for practice encompassing a deontological reflection (costa, ; scanlon, ; veletsianos & kimmons, b). this approach indeed aligned with socio-technical studies going beyond technological determinism (pearce et al., ).
for this group of researchers, the research problems relating to digital scholarship were mostly connected with the adoption of unconventional cyberinfrastructures like social media to do and share research (social scholarship) (greenhow & gleason, ; manca & ranieri, a; veletsianos, ); the collaboration between researchers to co-create content in more fluid processes of work that connect research with interdisciplinary interactions, teaching and dissemination (garnett & ecclesfield, ; veletsianos & kimmons, a); and the engagement of public audiences in the making of science, by extending the forms of participation along the research process (grand, wilkinson, bultitude, & winfield, ). the whole debate was connected to the need to improve scholars’ literacy to participate in digital, networked and open contexts of scholarship (goodfellow & lea, ; veletsianos & kimmons, b). moreover, in this research community it was possible to observe a strong reference to boyer’s model (boyer, ) of the academic profession. in fact, the need to reconsider the academic profession has been an issue for research since boyer’s “new priorities for the professoriate” in the ‘ s (boyer, moser, ream, & braxton, ; teichler, arimoto, & cummings, ). boyer pointed out that a new scholarship should be based on four functions: discovery (creating new knowledge through research), integration (interaction across disciplinary lines to construct new research approaches to social problems), application (transacting with society to use academic knowledge), and teaching (using academic knowledge to educate future generations of practitioners and scholars).
therefore, digital scholarship’s perspective for this group (see for example greenhow & gleason, ; weller, ) showed that boyer’s dimensions were being accelerated and transformed by: a) openness in both science and research activities; b) open learning and teaching; c) networking, as the new professional way of collaborating across geographical and institutional frontiers, based on the affordances provided by social networks and the web . . however, in spite of the interconnectedness of digitality with openness and networking, digital practices rather follow traditional schemes (goodfellow, ). for esposito ( ), scholarly practices are caught in the middle between being digital/open and being traditional, aligning this conception with the visitors/residents idea of using digital tools or living within digital spaces (white & le cornu, ). for costa ( ), scholars are reinventing themselves online along different episodes of “outcasts on the inside”, since deploying a professional identity as a digital scholar comes at a price. as she expresses, “the difference between the field and the habitus individuals bring to it leads to misrecognition of practice… ambivalence between the university world and research participants’ intellectual journeys results is a disjoined sens of identity and predisposition to symbolic revolutions” (costa, op.cit. ). moreover, stewart ( ) equates traditional practices to doing research in a “scarce context” whereas “…in the process of using, sharing, and contributing to this abundant and ever-renewing body of resources and ideas, scholars become more visible to each other and their areas of interest more legible”.
with regard to the methodological approaches in this field, while there were also extensive studies covering the way scholars adopted social media through surveys (manca & ranieri, ), most studies in this field were based on qualitative methods, observing and producing thick descriptions and narratives of scholars’ ways of approaching open social media, particularly blogging (kjellberg, ) and microblogging with twitter (stewart, ; veletsianos, ; veletsianos & kimmons, ). some of these studies adopted critical and post-structural theoretical frameworks like bourdieu’s habitus (costa, , ), foucault’s “power and technologies of the self” (hildebrandt & couros, ), or bakhtin’s “chronotopes” (esposito, sangrà, & maina, ), aiming to show how the transition from tradition and science in the “ivory tower” is in open and painful contradiction with the making of the scholar’s identity as open, networked and digital. to sum up, this research strand focused not only on how scholars behave in digital, networked and open contexts of practice, but also worked out the tensions and contradictions that lead academics to act creatively to align values and practices within the making of their professional identity.

we must mention at this point some cross-fertilization between trends. the first instance regards the highly cited work of boyer, which has been extensively adopted as a model of scholarship. almost all consulted studies in the field of educational technologies elaborate on boyer’s model, starting from weller’s work ( ); a good number of papers from the information science area also take this author into consideration (raffaghelli et al., ).
another author that we should recognize as “boundary crossing” is surely borgman ( ), whose influential work (mentioned above) introduced the problem of digital scholarship for information sciences but also acknowledged socio-technical studies as another perspective explaining the creative relationship between scholars and cyberinfrastructures. building on the concept of “literacies” for the digital university, goodfellow ( ) brings borgman ( ) and weller ( ) into dialogue, in an attempt to understand the concepts emerging from these two works in order to define digital and academic literacies. more recently, the topic of professional identity studied in the field of educational technologies has been connected to reputation (stewart, ; g. veletsianos & stewart, ); and from the side of information science studies, reputational mechanisms have been analysed under the lens of boyer’s model for professional development. however, there are tensions that a more interdisciplinary and cohesive approach could resolve. for example, the value of social media platforms in promoting new forms of communicating science and of opening science in more participatory and informal ways, which is often assumed in many of the educational technologists’ works, is highly contested by many groups of librarians. the latter see social media platforms and their business models as potentially hazardous for a public and democratic science. moreover, this last group points out the unfair competition that social media (with their appealing and user-friendly interfaces) generate against institutional repositories, which are publicly funded and hence safer for the public dissemination of science (hall, ).
wrapping up, the situation depicted lets us recognize only the tip of the iceberg, understanding some of the research subtopics and problems, the methodological approaches and the disciplinary contributions within the broad issue of digital scholarship. however, the analysis of the literature in an attempt to characterize the three strands of research shows that the connections between the several perspectives are still far from coming into being.

the undefined panorama of ds and its impact on faculty development

understanding faculty development

the term faculty development has frequently been adopted with reference to all sorts of educators engaged in higher education, particularly at undergraduate level. the term was coined to characterize teaching skills’ frameworks as well as the strategies to develop them within the professional context and along the professional life cycle. while it became a standalone area of research and practice, it evolved hand in hand with the deep debate on teachers’ professional development (tpd) at any educational level and the perceived need to continuously support their processes of professionalization (hendriks, luyten, scheerens, sleegers, & steen, ; twining, raffaghelli, albion, & knezek, ). moreover, a high number of contributions in the field came from the research area of medical education (steinert et al., ). the topics covered within studies on faculty development regarded mainly the effectiveness of professional development programs (centra, ; simon & pleschová, ), analysing not only academics’ perceptions and effective changes in their professional practices but also students’ learning (guskey & yoon, ). the analysis focused variously on the duration, format, or target group of the several faculty development actions (stes, min-leliveld, gijbels, & van petegem, ).
the strategies related mainly to the development of specific skills like internationalization and intercultural education, management, curriculum development, quality teaching, teaching innovations, or online teaching; but they also covered professional development methods like workshops, problem-based learning, professional networks, action research and reflection on practice (amundsen & wilson, ). within this context, the problem of the skills and literacies scholars need to work within digital spaces has become a specific area of interest. an impressive amount of literature mainly analysed faculty development for online teaching (meyer, ), exploring the barriers and enablers of elearning (singh & hardaker, ). recently, frameworks for professional development relating to open education have been proposed (nascimbeni & burgos, ). however, the majority of studies in faculty development in general, and in the area of online teaching specifically, have been criticized for the lack of theoretical or conceptual frameworks on professional learning underpinning practice (webster-wright, ). in this regard, there have been a few exceptions citing adult learning theories like those of transformative learning by mezirow, andragogy by knowles, or reflective practice by argyris & schon (webster-wright, ; amundsen & wilson, ; meyer, ). moreover, the balance between outcome and process approaches in the reviewed literature could be deemed uneven. some studies focus on skills’ acquisition or students’ achievements as proof of effectiveness (bahar-ozvaris, aslan, sahin-hodoglugil, & sayek, ; cole et al., ), others focus on the process of active professional learning as part of changing practices, and a last group considers how the new professional skills could modify the professional and organizational context, academics being social and situated learners (boud, ; cox, ).
for amundsen and wilson ( ) the right questions to address faculty development are “how are educational development practices designed?” and “what is the thinking underpinning the design of educational development practice?” (amundsen & wilson, , p. ). we should consider at this point that the lack of a vision capable of answering these two questions is the main obstacle not only to designing a program for professional learning, but also to understanding whether the achievements envisaged by a professional development programme took place. as evans explained, “whilst professional culture may be interpreted as shared ideologies, values and general ways of and attitudes to working […] professionalism seems generally to be seen as the identification and expression of what is required and expected of members of a profession” (evans, , p. ). professional development hence requires the acknowledgement of a professional culture within an institutional culture of development, as a dynamic and lifelong learning process of the individual towards a community. in this regard, the efforts to train professionals are based on an overarching, big picture of a professional area, which encompasses practices and a professional identity. this way of conceiving professional learning implies, at the operational level, complex systems based on the following dimensions:

1. a framework of competences and scenarios of expertise with middle stages of development (from novice to expert), which are closely connected not only with the developmental processes within the organization but also with society;
2. institutional strategies and policies connected to developmental processes within the organization, which in time acknowledge the existence of embedded professional communities with their values, identity and practices (vescio, ross, & adams, ; wenger, );
3. environments, resources and activities that, taking this organizational background into consideration, would enable professional learners to self-direct their own learning interests, opening up opportunities to reflect and to have these efforts recognized by a system (dirckinck-holmfeld, jones, & lindstrom, ; pataraia, margaryan, falconer, & littlejohn, );
4. showcase areas, namely, the possibility of showing the concrete results of professional learning: if it is connected to concrete processes to innovate practices, it should lead to new products, reflections, ideas. in the case of teaching, this is clearly connected to the models of action research, where design-based experiments are conducted in order to support experiential learning on specific teaching techniques.

discussion: the missed dialogue between digital scholarship and faculty development

the brief exploration of the literature on digital scholarship and faculty development conducted hitherto should establish the conditions to discuss the research question: are faculty development strategies hindered by the lack of a cohesive view in the research on digital scholarship? indeed, our exploration of the literature reveals two orders of problems concerning how the advancements in digital scholarship inform (or could inform) the professional development research and practices needed to become a digital scholar. the first problem is that the research on faculty development to achieve digital skills for the academic profession has been mainly focused on online teaching (mckee, johnson, ritchie, & tew, ). in spite of the importance of open education and elearning for the movement of educational technologies, digital scholarship is a far more complex practice, as the very authors of the mentioned perspective have pointed out.
as an example, on the basis of the revisited diat model of boyer, weller ( ) pointed out the importance of new forms of academic communication through blogging and social networks, between research and teaching as two areas of practice with blurring borders. furthermore, the online teaching issue is almost nonexistent in the first perspective on digital scholarship (information sciences), whose focus is mainly research and scientific communication to the scientific community or the wider public, as we showed earlier. moreover, the overwhelming amount of information about faculty development in the area of online teaching seems to be in contradiction with the fact that doing research is allegedly the primary endeavour for scholars and the main element for career advancement. seemingly, the lack of attention of faculty development to digital research skills, or discovery in boyer’s terms, could be explained by a rather pragmatic approach to professional learning in which scholars achieve the specific professional skills through highly informal activities. in fact, it is the very expertise in a research domain that guides the self-recognition of skill gaps and of the associated learning activities and resources required to fulfil the professional learning needs. the facts mentioned above underline a clear disconnection between faculty development and digital scholarship.

the second problem goes in the opposite direction, from the research on digital scholarship to the issue of faculty development. it regards the fact that the research on the former topic has not yet considered the problem of designing, deploying and evaluating professional development towards the skills and processes required to become (to act and to be) a digital scholar, a focus that would bridge research on the two issues under analysis here.
as raffaghelli, cucchiara, manganello, and persico ( ) pointed out, a close look at studies analysing digital scholarship shows that most of them are based on observational approaches that explore and observe existing practices, reporting objective data or phenomenological or narrative accounts of what it is. moreover, most studies build on more or less acknowledged values on scholarship (towards more open and digitalized practices), but they just show a current picture and eventually point out the criticalities and conflicts of trying to be a digital scholar in the middle of traditional systems. with no interventionist studies, whether experimental or design-based research, that take into consideration an initial framework, device or model to be tested, it is clear that the directions for practice are uncertain. in this vein, the same authors explored a more direct question, namely “how many studies on digital scholarship considered professional development on the topic?” the findings showed a situation where very few studies considered specific instructions for professional development: subject areas and the stage of professional development ( % of studies), general approaches to adopt digital tools for research and teaching ( %), with the design and testing of a model of professional development for digital scholarship not considered at all ( %) (raffaghelli et al., p. ). more recently, studies have started to make proposals relating to frameworks of practice and eventually competence, like academic microblogging (heap & minocha, ), reputation building (nicholas & herman, ) or the adoption of open datasets as open educational resources (atenas, havemann, & priego, ). a first comprehensive effort to build a theoretical and operational framework of competences for young researchers has been offered by ranieri ( ).
however, her attempt was based on an initial reflection on the initial training required to do research. therefore, a comprehensive framework is still missing: one that analyses scholarship as a lifelong learning endeavour and as a professional area integrated by diversified activities beyond teaching or research, and hence based on shared values and a broad vision of what being a digital scholar and practicing digital scholarship is. having said this, we can now attempt to answer the second subsidiary question, how does the fragmentary vision of digital scholarship influence incomplete practices in faculty development?, by drafting three scenarios of faculty development for digital scholarship. in order to build the scenarios, we will inspect every perspective on digital scholarship through the four requirements for effective faculty development: the framework of competences and scenarios of expertise; the institutional strategies and policies; the environments, resources and activities; and the showcase areas. we will inform every scenario with the existing (but highly fragmentary and incomplete) literature relating to professional development for the perspective on digital scholarship under analysis.

three scenarios of faculty development for digital scholarship

for the first perspective (information sciences), the focus of digital scholarship is scientific communication, enhanced by digital technologies, and more recently, open research practices, participatory science and new modes of disseminating scientific work. the main theoretical model of reference for considering professionalism is the cycle of scientific communication (borgman, ) and the competent management of the workflow it proposes. if we take into consideration the european context, only in recent years have proposals for training researchers to adopt open access more actively increased.
several european documents address the need for researchers’ training, that is, opportunities for formal learning on issues like open access, opening up science, open peer-review and open data management and publication. as a matter of fact, a complete picture of what was conceived as digital science was presented within the concept paper digital science at horizon by the dg connect (european commission, ), where a vision of digital science, a conceptual framework and a number of operational dimensions also guiding researchers’ training were considered. moreover, the communication of the european commission (com final) on the new european cyberinfrastructures supporting science highlights that a necessary action to implement the european cloud initiative is to raise awareness and change incentive structures for academics, industry and public services to share their data, and improve data management training, literacy and data stewardship skills (p. ). more recently the high level expert group on european science cloud was created, and in the first report produced by this group training is repeatedly recommended as part of a strategy to promote researchers’ engagement with and awareness of open science (ayris et al., ). moreover, in march at the futurium space of the dg connect for public consultation and debate, a working document on open scholarship for the adoption of einfrastructures introduced a synthesis of the european commission’s endeavour on the matter, calling for actions to cooperate with the new skills and professions group to design an action plan for training a new generation of scholars and shaping model policies for career development in open scholarship (matt, ).
finally, in october the horizon workprogramme on science with and for society launched a call to fund projects aiming at training scholars for open science (european commission decision, ), closed by october . a new generation of training activities will be promoted through this call, and a framework of reference will probably be developed. however, the models adopted focus on promoting specific competences rather than on an overarching discussion of what it takes to be a digital scholar. in fact, a framework of competences would be based on the cycle of information science and the competent adoption of open science instruments (open access, open data, open innovation). with regard to the institutional strategies and policies, in this first scenario the strategies are to be focused on the promotion of open access and open science as new frontiers of scientific communication, hence adopting policies to give value to open science. the environments, resources and activities required for this endeavour would be (as is already the case) a mix of blended strategies promoted by university libraries and associations promoting open science; the initiatives would probably emphasize content (e.g. regulations for open access and open sciences, differences between institutional repositories and academic social networks, bibliometrics and altmetrics for the evaluation of research quality). finally, for the showcase areas we would expect to see the products of practices relating to open access scientific communication, public and participatory science, open peer evaluation, etc.

as for the second scenario, based on the perspective of digital humanities, the evolution of the disciplinary debate on what being a humanist in the digital age means has itself led to the creation of research centres for digital humanities (romero-frías & del-barrio-garcía, ).
the framework of competences and scenarios of professional expertise are focused on exploring the specific methods of digital humanities and the methodological debate (particularly ontological) on the nature of digitalized cultural heritage (unsworth, ). the institutional policies are aimed at opening areas or centres to promote the development of digital humanities from the technical point of view (software, tools, labs), but also at cultivating the critical perspective on the methodological evolution of humanities at the cross-over with new computational methods (klein, ). therefore, the training activities would (and actually do) emphasize the adoption of instruments to digitalize or to manipulate digitalized cultural heritage. more recently, there is also an effort to build international communities sharing digital objects and discussing the methodological perspectives on their creation and analysis (borgman, ).

the third scenario is based on the advances of research in the field of educational technologies and professional learning in digital and networked spaces. it mainly refers to the theoretical diat model of boyer, revisited in the light of digital contexts and instruments; in this model, scholarship is divided into four dimensions of activity. in spite of this reference, very few frameworks of competence to become and to be an expert digital scholar have been elaborated. some of them could be too specifically related to a tool (e.g. academic blogging, by heap & minocha, ) or to a dimension of scholarship (e.g. discovery in the earlier stages of researchers’ professional development, by ranieri, ). for this perspective, in fact, the training focus could be scholars’ professional learning and identity change while dealing with digital, networked and open contexts of academic practice (goodfellow, ; pearce et al., ).
institutional strategies in this perspective should enable researchers to adopt informal, networked communication of an intertwined perspective of research and teaching in the making. moreover, open online teaching and digital contents for learning should be considered in the evaluation of activities for career advancement. a debate on this topic has clearly been led by the case of moocs (massive open online courses) (daniel, cano, & gisbert, ). as for the environments, resources and training activities, these would take the form of professional communities to reflect on the new practices and identity of the digital scholar. in line with this hypothesis, a recent open course, “the digital scholar”, offered by the open university of the uk on the basis of weller’s work, discussed the new frontiers of practice for digital scholarship, taking into consideration teaching and learning as part of digital, open and networked professional practices; in this course an internal simulation of professional networks through the “openstudio” tool showed participants how they were engaged and how their contributions were semantically organized. in fact, the showcase areas would probably aim at showing networks of digital scholarship and sharing stories on new practices, as narratives of what it takes to be a digital scholar. the table synthesizes the above hypothesized scenarios of faculty development for digital scholarship, taking into consideration the three perspectives that contribute to the development of the topic.

conclusions

in this article the challenge was to understand how the current situation of the research on digital scholarship influences the practices of faculty development. moreover, the aim was to show to what extent the lack of a cohesive conception of what it is to practice digital scholarship and to be a digital scholar hinders faculty development strategies.
a careful look at the literature highlighted the elusiveness of the concept of digital scholarship, with several disciplines contributing to the topic with diversified (and scarcely connected) conceptual frameworks and approaches to research. in this regard we analysed the literature and considered studies that already pointed out the problem of fragmentation in the field of digital scholarship. the next step was to understand how the weaknesses of research in the first area impact on faculty development as an applied area of research. to this end, we analysed the literature on faculty development, in an attempt to understand what the dimensions of effectiveness and quality are. on this basis, we acknowledge four main dimensions for effective professional development, namely: a framework of competences and scenarios of expertise with middle stages of development (from novice to expert), closely connected not only with the developmental processes within the organization but also with society; institutional strategies and policies connected to developmental processes within the organization, which at the same time recognize the existence of embedded professional communities with their values, identity and practices; environments, resources and activities that, taking into consideration this organizational background, enable professional learners to self-direct their own learning interests, opening opportunities to reflect and to have these efforts recognized by a system; and showcase areas where it is possible for the professional learner to present the concrete results of innovative practices and ideas cultivated along the process of professional learning.
at this point, the assumption that a lack of cohesive views of what a digital scholar is, as a professional profile with connected practices and values, would hinder clear action taking for professional development was clearly supported by the literature. moving forward to get further evidence in this direction, we introduced three scenarios of faculty development for digital scholarship. the three scenarios were based on the three diverse perspectives in this last area.

table: scenarios of faculty development for digital scholarship
(rows: dimensions of faculty development; columns: disciplines contributing to the perspectives on digital scholarship)

framework of competences and scenarios of expertise focused on…
  information sciences: … the cycle of scientific communication, encompassing open research practices, participatory science and new modes of disseminating the scientific work.
  digital humanities: … specific methods of digital humanities and the methodological debate (particularly ontological) on the nature of digitalized cultural heritage.
  educational technologies: … building a professional identity as digital scholars while dealing with digital, networked and open contexts of academic practice (including research and teaching).

institutional strategies and policies for…
  information sciences: … promoting open access and open science as new frontiers of scientific communication.
  digital humanities: … opening areas or centres to promote the development of digital humanities both from the technical (software, tools, labs) and the methodological point of view.
  educational technologies: … enabling researchers to adopt informal, networked communication of an intertwined perspective of research and teaching.

environments, resources and activities as…
  information sciences: … a mix of blended (online and face-to-face), active and flexible approaches with emphasis on content like regulations for open access and open sciences, differences between institutional repositories and academic social networks, bibliometrics and altmetrics for the evaluation of research quality, etc.
  digital humanities: … a mix of blended, active and flexible approaches with emphasis on the adoption of instruments to digitalize or to manipulate digitalized cultural heritage, as well as communities for the methodological debate.
  educational technologies: … a mix of blended and flexible approaches with emphasis on building professional communities to reflect on the new practices and identity of the digital scholar.

showcase areas for…
  information sciences: … showing the products of practices (open access scientific communication).
  digital humanities: … showing the products of practices (digitalized cultural heritage).
  educational technologies: … sharing stories on new practices, as narratives of what it takes to be a digital scholar.

we could observe that a sectorial perspective on the scholar’s professional activity would encompass a limited vision of the type of contents, skills and products of training, against a broader debate on what being a digital scholar is. in fact, the first and second scenarios were highly specific when it came to tools and institutional strategies, but they missed an overall picture of the professional identity forged through participation within open science and digital humanities. instead, the third scenario was less precise in connecting the reflections on digital scholarship with technical tools and infrastructures, giving excessive value to social media and networks where in other cases (information sciences) their adoption has raised controversy. the three scenarios let us envisage the goals of professional development for digital scholarship as improving innovation in the use of digital tools, becoming networked professionals and adopting a consistent code of conduct relating to opening up research and education. the way they focus contents and activities, as well as the specific competences envisaged for the professional profile, are diversified and even conflicting in some cases.
moreover, there was no agreement on the overall profile of a digital scholar, with little attention to teaching in the first and second perspectives against the importance given to this activity in the third. to achieve a broader picture of digital scholarship, the existing and frequently used framework for the academic profession, boyer’s diat model, should become more dynamic (considering levels of professional learning spanning from the novice to the expert) and be better integrated with research on the definition of a framework considering researchers’ workflow and the types of production along the scientific information life cycle for open science, which is the main focus of information science, as well as with the research on scholars’ reputation and identity. finally, a connection with the methods and approaches to online teaching analysed by the research on faculty development could be the basis for discussing methodological approaches for the overall development strategies for digital scholarship. there seems to be no need to replace models like boyer’s, revisited in several studies on open and social scholarship like those carried out by veletsianos ( , , ); rather, the model should be integrated and enriched so that it becomes a complete taxonomy or even a pedagogical ontology encompassing contents, activities, learning goals and expected outcomes, and tools for evaluation and recognition of competences.

the remarks for future research that ensue are connected to three main concerns arising from the need to create a more cohesive view of digital scholarship that bridges to faculty development. the three issues are: interdisciplining research in digital scholarship, adopting new interventionist methodological approaches and building a more comprehensive framework for professional development.
as for the interdisciplinary approach to research in digital scholarship, the immediate advantage would be the cross-fertilization of models of scholarship in the new digital, open and networked contexts of knowledge. in fact, interdisciplinarity emerged as the approach leading the effort to understand how consolidated disciplines could collaborate and to explore the effectiveness of this collaboration (klein, ; nissani, ). transnational, european and national research agencies are starting to consider and to support interdisciplinary research since it has been connected to innovation and responsiveness to social problems (moran, ). despite the advantages, researchers still struggle to engage in interdisciplinary projects since career advancement and the evaluation of research projects are based on traditional disciplinary perspectives (leahey, beckman, & stanko, ; rons, ); this is a fact that we could appreciate in our initial description of the three disciplinary traditions that study digital scholarship. this is maybe due to the way technology has evolved: on the one side, the technological development of tools, environments and interfaces requires the participation of math, computer science, engineering, etc.; on the other, technological uptake is a social and cultural phenomenon that requires the engagement of sociology, anthropology, psychology and educational sciences. the relationships between these several worlds raise not only specific methodological issues, but also epistemological questions such as the need to overcome technological determinism and the socio-material perspective (fenwick, ). as a matter of fact, in the specific case of educational technologies, a hybrid area by definition, conole, scanlon, mundin, & farrow ( , p. ) highlighted that “…it is evident that interdisciplinarity is a core feature of tel (technology-enhanced learning) research.”
however, for these same authors, interdisciplinarity entails concrete difficulties hindering the advancement of tel research: “… multiplicity also brings challenges, such as a lack of a shared coherent discourse, tensions and power struggles between the different subject domains and a lack of perceived rigour and credibility” (op.cit, p. ). we might expect that digital scholarship, as a topic at the crossover of information science, computer science, educational technologies, and sociological and anthropological studies, will introduce similar challenges that need to be explored to lay the basis for solid interdisciplinary collaboration, and the need to elaborate concrete bases for faculty development on digital scholarship could support this endeavour. the issue of methodological approaches within the research on digital scholarship is intertwined with this last assumption. as we observed here, most studies are based on observational research. to understand faculty development on digital scholarship, there is instead a need to develop frameworks to be tested, going beyond descriptions and observations to take actions and analyse impacts, as would be the case for design-based research studies on researchers’ training on digital scholarship (raffaghelli, valla, cucchiara, giglio, & persico, ). in fact, the topic of the professional development of researchers’ skills to become digital scholars is frequently missing in the research on digital scholarship, showing that the design, development and testing of models and strategies is still far off. last, but not least, the vision of expertise and its connected values lay the bases for recognizing professional learning and career advancement. in this regard, the research should also consider that organizational change (e.g. introducing open science and networked scholarship) may be based on professional development.
however, it is necessary to go beyond traditional, on-site training, towards the creation of flexible approaches, with professional learning environments that adopt advanced technologies like recommendation systems enabling participants to develop professional skills (manganello, falsetti, spalazzi, & leo, ). these different approaches should be further tested in order to analyse their effectiveness. to conclude, we would like to consider the sharp synthesis made by esposito ( ) on the current state of digital scholarship. as she pointed out, scholars are “traditionally ‘digital’, moderately ‘networked’ and occasionally ‘open’” ( , n.p.). therefore, combined research on faculty development for digital scholarship, on the basis of a more cohesive approach to the latter, could confirm criticalities and opportunities to design and test pedagogical approaches for the frontiers of digital scholarship.

endnote
http://www.open.edu/openlearn/education/the-digital-scholar/content-section-overview.

competing interests
the author declares that she has no competing interests.

publisher’s note
springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

received: february accepted: may

references
amundsen, c., & wilson, m. ( ). are we asking the right questions?: a conceptual review of the educational development literature in higher education. review of educational research, ( ), – . http://doi.org/ . / . accessed may andersen, d. l. ( ). in d. l. andersen (ed.), digital scholarship in the tenure, promotion, and review process. london & new york: m.e. sharpe. andersen, d., & trinkle, d. ( ). valuing digital scholarship in the tenure, promotion, and review process - a survey of academic historians. in d. andersen (ed.), digital scholarship in the tenure, promotion and review process (pp. – ). new york: m.e. sharpe.
atenas, j., havemann, l., & priego, e. ( ). open data as open educational resources: towards transversal skills and global citizenship. open praxis, ( ), – . http://doi.org/ . /openpraxis. . . . accessed may ayers, e. ( ). doing scholarship on the web: ten years of triumphs — and a disappointment. journal of scholarly publishing, ( ), – . http://doi.org/ . /jsp. . . . accessed may ayris, p., berthou, j.-y., bruce, r., lindstaedt, s., monreale, a., mons, b., … tochtermann, klaus wilkinson, r. ( ). realising the european open science cloud. first report and recommendations of the commission high level expert group on the european open science cloud. luxembourg. bahar-ozvaris, s., aslan, d., sahin-hodoglugil, n., & sayek, i. ( ). a faculty development program evaluation: from needs assessment to long-term effects, of the teaching skills improvement program. teaching and learning in medicine, ( ), – . http://doi.org/ . /s tlm _ . accessed may bentkowska-kafel, a. ( ). “i bought a piece of roman furniture on the internet. it’s quite good but low on polygons. ”—digital visualization of cultural heritage and its scholarly value in art history. visual resources, ( – ), – . http://doi.org/ . / . . . accessed may borgman, c. l. ( ). scholarship in the digital age. cambridge: mit press. borgman, c. l. ( ). supporting the“scholarship” in e-scholarship. educause review, ( ), – . borgman, c. l. ( ). big data, little data, no data: scholarship in the networked world. cambridge: mit press. boud, d. ( ). situating academic development in professional work: using peer learning. international journal for academic development, ( ), – . http://doi.org/ . / . accessed may boyer, e. l. ( ). scholarship reconsidered: priorities of the professoriate (vol. ). san francisco: carnegie foundation. boyer, e. l., moser, d., ream, t. c., & braxton, j. m. ( ). scholarship reconsidered: priorities of the professoriate. san francisco: wiley&sons. braxton, j. m., luckey, w., & helland, p. ( ). 
institutionalizing a broader view of scholarship through boyer’s four domains. hoboken: wiley- jossey bass. centra, j. a. ( ). faculty development in higher education. teachers college record, ( ), – . cole, k. a., barker, l. r., kolodner, k., williamson, p., wright, s. m., & kern, d. e. ( ). faculty development in teaching skills: an intensive longitudinal model. academic medicine: journal of the association of american medical colleges, ( ), – . conole, g., scanlon, e., mundin, p., & farrow, r. ( ). interdisciplinary research. findings from the technology enhanced learning research programme. milton keynes. retrieved from http://www.tlrp.org/docs/telinterdisciplinarity.pdf. accessed may costa, c. ( ). the habitus of digital scholars. research in learning technology, . http://doi.org/ . / rlt.v i . . accessed may costa, c. ( ). outcasts on the inside: academics reinventing themselves online. international journal of lifelong education, ( ), – . http://doi.org/ . / . . . accessed may cox, m. d. ( ). introduction to faculty learning communities. new directions for teaching and learning, ( ), – . http://doi.org/ . /tl. . accessed may cox, j. ( ). communicating new library roles to enable digital scholarship: a review article. new review of academic librarianship, ( / ), – . http://doi.org/ . / . . . accessed may daniel, s. j., cano, e. v., & gisbert, m. ( ). the future of moocs: adaptive learning or business model? rusc. universities and knowledge society journal, , . de solla price, d. j. ( ). little science, big science. new york: columbia university press. den besten, m., david, p., & schroeder, r. ( ). research in e-science and open access to data information. in j. husinger, l. klastrup, & j. allen (eds.), international handbook of internet research (pp. – ). london & new york. http://doi.org/ . / - - - - . accessed may dircking homfeld, l., jones, c., & lindstrom, b. ( ). in l. dircking homfeld, c. jones, & b. 
lindstrom (eds.), analysing networked learning practices in higher education and continuing professional development ( st ed.). rotterdam/boston/taipei: sense publishers. esposito, a. ( , january ). neither digital or open. just researchers: views on digital/open scholarship practices in an italian university. first monday, s.i. http://dx.doi.org/ . /fm.v i . . accessed may esposito, a. (ed.). ( ). research . and the impact of digital technologies on scholarly inquiry. hershey: igi global. http://doi.org/ . / - - - - . accessed may esposito, a., sangrà, a., & maina, m. ( ). chronotopes in learner-generated contexts. a reflection about the interconnectedness of temporal and spatial dimensions to provide a framework for the exploration of hybrid learning ecologies of doctoral e-researchers. elearn center research paper series. retrieved from http://journals.uoc.edu/index.php/elcrps/article/view/ /n -esposito-epub. accessed may european commission decision. ( ). horizon work programme – . . science with and for society, (july ), – . retrieved from http://ec.europa.eu/research/participants/data/ref/h /wp/ _ /main/h -wp -swfs_en.pdf. accessed may european commission. ( ). digital science in horizon . working document, european commission, brussels. retrieved from ec.europa.eu/information_society/newsroom/cf/dae/document.cfm?doc_id = . european commission. ( ). com( ) . european cloud initiative.
london review of education vol. , no. , march , –
issn - print/issn - online © institute of education, university of london
doi: . / http://www.informaworld.com

academic blogging: academic practice and academic identity

gill kirkup*
institute of educational technology, the open university, milton keynes, uk
this paper describes a small-scale study which investigates the role of blogging in professional academic practice in higher education. it draws on interviews with a sample of academics (scholars, researchers and teachers) who have blogs and on the author’s own reflections on blogging to investigate the function of blogging in academic practice and its contribution to academic identity. it argues that blogging offers the potential of a new genre of accessible academic production which could contribute to the creation of a new twenty-first century academic identity with more involvement as a public intellectual.

keywords: blogging; writing; academic practice; scholarly texts; identity; academic literacies

introduction: writing an academic identity

work on academic literacies (ivanic ) takes the position that all writing is a presentation of the self; in a postmodern framework it would even be described as a ‘performance’ of the self (butler ). however, the practice of academic writing is understood as problematic for both students and academics. for example, authors in a collection by williams ( ) argue that the identities created through traditional kinds of scholarly writing styles embody values and worldviews that run counter both to the identities that students bring to higher education and to the identities that a more diverse ‘workforce’ of scholars, researchers and teachers now embodies. what should constitute valid academic writing is being challenged, and it is into this landscape that blogging has entered.

the word ‘blog’, as a noun to describe a specific kind of website and as a verb to describe the process of authoring this website, is now in such common use that it needs no explanation for readers of this journal.
the activity of blogging is mature enough for some of the early enthusiasts and promoters to argue that it has been co-opted into the mainstream media and consequently lost its power as a democratic and accessible tool for self-expression and community building (lovink ). this may reflect the sentiments of those who value blogging as a radical oppositional activity, but for many, including academics, blogging has become more useful now that it is a widespread and widely understood medium.

recent educational literature has given a long list of educational reasons why blogging is useful for students (see farmer ; kerawalla et al. , ; these last two articles describe work that the author of this article was also involved with). these include its use as a reflective journal, as a notebook to record events and developing ideas, as an aggregator of resources, and as a tool for creating community and conversation with fellow students. blogging might provide students with alternative sites for academic identity creation that are less problematic than traditional ones, but blogging has been less enthusiastically embraced as offering alternatives for scholars and researchers.

*email: g.e.kirkup@open.ac.uk

a significant reason for this is that traditional forms of scholarly production do not recognise blogging as an academic product: ‘for most academics, blogs are irrelevant because they don’t count as publications’ (lovink , ). in the uk the importance for career advancement and institutional research assessment of printed monographs and publications in peer-reviewed journals has been a discouragement from investing time in the activity of blogging.
a recent us book about digital scholarship discounts blogging on the first page, where it is listed with other ‘“stuff” – the unverified and unverifiable statements of individuals, discussions on listservs… questionable advertisements for questionable products and services, and political and religious screeds in all languages’ (borgman , ), and contrasts this ‘stuff’ with ‘the substantial portion of online content [that] is extremely valuable for scholarship’. despite this, the academic practices of scholars, researchers and teachers are changing: it has become accepted scholarly practice to cite online materials of all sorts, and some scholars have even developed a professional reputation for their blogging.

another reason for the wariness of academia towards blogging is the subjective style of many blogs, a style which seems in opposition to traditional forms of academic text which value an ‘objective’ authorial voice: writing which focuses on the management and presentation of information above the management and presentation of self (hyland ). perhaps those academics described by the author in williams’ ( ) book who felt the most conflict between the identity available to them through traditional forms of scholarly writing and alternative conflicting identities (for example of race, class) will find that blogging offers them a form of writing which enables them to perform new, and less conflicted, kinds of academic identity.

this paper is a small-scale investigation into why some academics produce blogs and the perceived value of this activity to their academic practice and their academic identity. it builds on the work of gregg, who argues for ‘blogging as conversational scholarship’, which makes ‘scholarly work accessible and accountable to a readership outside the academy’ (gregg , – ), and of ewins ( ), who sees blogs as offering a medium for the creation of new academic identities.
blogademia

blogging as an activity is not only about creating scholarly products; it is ‘performative writing’ (gregg ). it creates identity through the production of what giddens ( ) describes as a narrative about the self, but it also does this by providing an alternative medium through which to do it. ewins ( ) argues that blogs contribute to the creation of what gergen ( ) defines as ‘multiphrenic’ identity; that is, an identity not only created out of a variety of narratives, but performed and presented through a variety of media. this is part of what makes a postmodern identity different from the kinds of identities that have been available to scholars in the past through the media of printed texts, letters and lectures. there is now potentially a huge range of media and kinds of narratives we can engage with to explicitly create both private and professional identities. a similar way of understanding blogging is as a foucauldian ‘technology of the self’ (lovink ); and since academics are professionals engaged in the continuous development of a professional ‘self’, blogging could play a useful role in this. thinking of blogging in these terms gives it a much more valid and potentially powerful position than borgman gave it credit for.

saper ( ) was the first to categorise academic blogging as being a particular genre of blogging, which he labelled ‘blogademia’. academic bloggers, he argued, did not see blogging as part of the production of knowledge in their disciplines because blogs did not go through any peer review or editorial process. consequently he saw blogs by academics that ‘often air dirty laundry, gripes, complaints, rants, and raves, what those blogs add to research seems outside
this kind of writing, he argued, is engaged in discussing the social processes of knowledge production and it should be valued as ‘a vehicle to comprehend mood, atmosphere, personal sensibility and the possibilities of knowledge outside the ego’s conscious thought’. blogs he asserted are one of the future tools of academia. but writing about the academic workplace – the back stage of academic performance – has its risks. benton ( ) noted the concerns that people expressed about the sensitivity of employers to what was said about them, and about what an employee might be writing outside of their academic publica- tions. mccullagh ( ) explored the issue of privacy and the professional impact of blogging with a large sample of over bloggers of all types. she noted: ‘bloggers’ privacy boundaries in the workplace have not yet been clearly established, either socially or legally’ (mccullagh , ). academia is no exception. walker ( ) recognised variety in academic blogs. she identified three genres of academic blog: as well as saper’s pseudonymous blogs about academic life, she also indentified public intel- lectual blogs and research logs. she speculated about whether blogs were a good medium to popularise research. ward ( ) began blogging as a graduate student and understood her activity as a form of ethnography of the academy. gregg ( ) has more recently examined the public blogs of a number of postgraduate students and young academics using walker’s catego- ries. she saw them as an expression of a subculture of people who were struggling to make a life in their chosen career as a scholar and researcher, while examining and critiquing the role and function of the academy and the employment practices within it. 
despite authors like farmer ( ) asserting that there are ‘numerous examples of academic bloggers taking advantage of blogs in order to engage with their peers and students and to reflect on their own learning’ ( ), given the scale of traditional academic production the number of academic bloggers seems actually quite small – and it is worth exploring the reasons for this.

the personal context for this study

like ewins, gregg, walker, ward and saper, my own interest in blogging began when i explored it as part of my own academic practice. my own blog reflected on problematic work issues and i was interested in finding a community of people engaged with similar issues. i was also testing the limits of the medium to engage with this kind of material. i was in walker’s category of blogging about academic life but not doing it pseudonymously. it became clear to me very soon that blogging is a genre of writing with its own demands. not only did i have to struggle with ‘what’ i could say in public, i had to develop a voice for the blog, decide the relationship between my public (blog) identity and other professional and private identities and think about my audience. after two years of blogging during which my blog demonstrated publicly a particular aspect of my professional identity, i decided to enquire more formally into the blogging practices of other academics to see what models of academic blogging were emerging that were professionally useful to those who created them.

the institutional context for the study

there are now a variety of staff blogging activities being supported by the university where i work. the most publicised, and the most ‘polished’ ones, are those that the university runs as part of its public communications activities.
there is a university institutional blog, which is open to the world and is clearly directed at a wide audience including students but also those who might stumble across this site while searching for the topics discussed there. this site aggregates posts by invited contributors, grouped under various headings, and university academic staff are invited to write as experts on some aspect of national or international interest. the site functions to deliver trustworthy ‘open content’ and as a marketing channel for the university’s products. it ‘belongs’ to the institution rather than to the individuals who write the posts. it enhances the identity or ‘brand’ of the organisation. this kind of blog has been a part of the websites of many companies since scoble and israel published naked conversations: how blogs are changing the way businesses talk with customers (scoble and israel ).

the university also hosts a blogging platform for both individual staff and students. initially the provision was focussed on students as part of the university’s online learning platform. staff began using it to host their blogs; many found it easier than hosting them on external platforms. other staff continue to choose external platforms in order to create a sense of professional identity separate from that of the institution in which they work. the university lists, on the internal staff website, all the staff blogs that it hosts. early in there were of these and only one third of them were open to the public. a number of these internal blogs functioned as forums for discussion of internal working practices and as consultation forums; some simply provided information. a surprisingly small number were authored by academic staff reflecting on their work and open for public readership.

method

collecting a sample of academic bloggers

i began searching for my sample of academic bloggers among colleagues in my own department (of educational technology).
eleven individuals out of a department containing academics were listed in early as having blogs; eight of these were academic staff, but of these only four were posting regularly. the others had created a blog but had not written posts for many months. this is not a lot of blogging activity, even accounting for the fact that some colleagues are keeping pseudonymous blogs that they don’t want listed, and have hosted elsewhere. the fact that only about one in ten academics were actively keeping a work-related blog suggests that in blogging is still a minority activity even among those most active in online technologies. this is about the same proportion as the % of the total us population reported by the pew research centre as having at some time created a blog (lenhart and fox ).

i also used the university’s list of institutionally hosted blogs, selecting only those owned by academic staff and open to the public. from both these sources i created a list of people whom i contacted with a request for an interview. of these i was able to interview six. three of the six are drawn from my own department (educational technology), one from sociology, one from literature and one from biology (the latter working in another institution). they ranged in seniority from professors and senior lecturers to a lecturer and a post-doctoral researcher. only one of them was a woman. in my own institution academic women are much less likely to be regular bloggers than men, even in the field of educational technology. some of my sample had blogs with a large, regular readership; others (including my own) were read mostly by friends and colleagues. such a small opportunistic sample cannot of course claim any statistical validity but, despite their wide-ranging subject areas and different levels of seniority, common themes about the relationship of the blog to other academic activities emerge.
data collection

each blogger was interviewed using a common interview schedule, but each interview was allowed to take different directions as i probed the particular practices and context of each blogger. four of the interviews were carried out face-to-face, and two were done by telephone. all the interviews were recorded digitally and i also took handwritten notes. each interview lasted between minutes and an hour.

data analysis

the analysis was done from the audio files, transcribing only those parts of each interview that represented themes or succinct ideas and concepts. the initial analysis used the interview questions as themes around which to group the data. a second analysis was then done to identify emergent themes coming out of the responses. this paper focuses on those themes that concern blogs as a genre of academic writing and as part of the process of performing an academic identity.

themes

a new medium to articulate ideas

all but one of my interviewees had taken up blogging because they wanted to write about their subject or research area in a different, less formal medium, though at least initially for the same audience of people with whom they normally engaged. the two educational technology academics were aware that an educational technology community of bloggers existed before they joined it. they felt that blogging was an activity they needed to experience as professionals in their discipline. in this case the activity of blogging was one of practicing educational technology. keeping a blog was a valid aspect of the identity of an educational technology scholar. even so, they did not find creating their own blog an easy task. ‘professor m’ had friends who were active bloggers and he wanted to create a blog for himself, but it wasn’t until his third attempt that he really got established:

i started it on study leave when i was writing a book.
i had lots of content… it was a good way to explore some of the ideas that were in the book… often the problem with a blog is getting enough momentum going – the book allowed me to do that.

‘dr k’ started blogging when she responded to a request made to her department for volunteers to write something for the university’s institutional blog. she was offered all the technical help she needed and she took this as an opportunity to ‘have a go’ at writing for a different medium. after this experience she began her own issues-based blog:

i do it about the issues i think other people aren’t doing. i ask ‘the other question’. i do what’s missing [in an issue] mostly in terms of race and gender… i bring feminist theory to sport which isn’t often done.

for dr k the blog offered a medium in which to engage with critical ideas at the periphery of her discipline, ideas that might not have been accepted in a peer-reviewed publication.

‘dr d’ also began blogging because he was invited to be part of a group blog. this blog involved authors from across a number of institutions who were working in the same scientific field. he was the most junior member of the group, but became the primary author of posts. the blog is about the ‘ideas’ that interest the authors – mainly issues to do with evolutionary biology – and, unlike the other blogs discussed here, one of its main functions is to provide a conversation about ideas among the blog’s own authors.

‘professor r’ was expert in a variety of written genres: he wrote and published fiction as well as scholarly publications, and he kept a private journal for himself. he could be described as already having a multiphrenic identity. he set up his blog initially to replace his personal journal, but it became (he noted) ‘violently professional’, partly because he saw that was what others were doing. however, the personal element remained and the blog enabled him to talk more personally about literature – more than he would normally feel able to do in his other academic writing.

‘mr a’ was the only person in my sample who was not blogging about an area of expertise. he had previously published a humorous column in a staff newspaper, which he described as ‘pre-blogging’. it was an experience in which he ‘found that i had a bit of a voice and liked writing humorous/provocative things’. when his column was stopped he felt that he didn’t have an outlet for this voice, so he built an institutional website. however, because there was institutional sensitivity about some of his content he set up a blog that was not on the university’s platform, and made it open to the public. he described his aim for his blog as having ‘somewhere where any staff could talk about issues that there is nowhere else to talk about in the university’. it is a place where he can ‘say things i felt (sounds a bit grand to say) ought to be said [when] something is nagging away, and i just get it out of my system’. for him the freedom to write what he wants in the blog is an important aspect of academic freedom. he was the only person in the sample to be primarily engaged in the kind of critical blog described by saper ( ) and gregg ( ).

no one had explicitly created their blog as an avenue for self-publicity. however, it might not be clear when a ‘technology of the self’ turns into the kind of ‘technology of self-promotion’ criticised by lovink ( ). for this sample it was the satisfaction of the activity itself, which allowed them a new voice, that kept them blogging, rather than its being part of a career plan. ‘mr a’, for example, described himself as a quiet introvert who found his voice by accident; his blog had become for him maybe the most important professional voice in his multiphrenic identity.
blogging as one medium in a multiphrenic environment there was a strong indirect relationship between the writing people did in their blogs and other professional academic writing. as they became familiar with the medium of blogging they were surprised to find that it had its own rules, it was not simply a notebook, or a place for making drafts which might be turned later into full scholarly publications. when it worked well there could be synergy between blogging and other writing. but the entire sample described how blog texts were different from other texts, and demanded care and effort to produce at quality. ‘dr c’ described this very clearly: i fondly imagined that a blog would be a good way of getting ideas off the ground for papers and proposals and things like that – it doesn’t do that… the initial draft of a paper often looks like a blog post and i could just post it… but i don’t choose to because it doesn’t feel finished. he described how a paper is reworked with gaps and un-evenness, while a blog post has to have its own sense of being ‘complete’. ‘dr c’ also described what he calls the ‘hierarchy of levels of reflection and thinking and effort’ that go into creating texts for different media. he ‘bangs out a tweet’, but a blog takes a little bit longer. ‘professor r’ also described how he composed his blog posts like ‘little articles’. he considered their length (about words) and tried to make them a ‘rounded piece’. he estimated that it took him about two hours to create a blog post. ‘dr k’ who has also published extensively in traditional academic media found that the blog fed into her other writing. ‘dr d’ felt that the blog had helped him gain confidence and facility in his writing, he was learning academic practice through blogging: i write posts faster and rapidly express things and not have to struggle to produce a whole published concept before i share it with someone. i am more confident about sharing my ideas with people. 
being a young academic, i don’t feel confident about the ideas i have.

however, he was well aware that the academic community did not value blogging as an academically credible activity, and he worried that the facility to ‘publish’ a blog gave a false sense of achievement:

… when you have finished [your post] there’s a button at the bottom saying: ‘publish’ and academic publishing is the currency, the most important thing to do… up to a point [blogging] is parasitising the importance of publishing. because you have pressed this button saying ‘publish’ and you feel great when you have done it… stimulating me to think that i am publishing when i am actually not.

at the other end of the spectrum, ‘professor m’, who has published numerous books and scholarly articles, found that his very active blogging had reduced the need he felt to do so much scholarly writing. the online ‘conversations’ he was involved in with a large number of ‘followers’ and other bloggers satisfied his need to engage with others about new ideas.

[there is a] noticeable decrease in formal publications since i started blogging… i don’t feel the need to publish formally so much. secondly the ideas sharing you want to get from formal publication i get more quickly and more satisfyingly from blogging…

‘professor m’ put significant time and effort into his blog. like ‘professor r’ he saw blogs as things that are finished and conform to a particular form and certain standards. but both ‘professor m’ and ‘professor r’ had reached a level of professional seniority that gave them the confidence to invest time in non-traditional academic production.

only for ‘mr a’, who was not writing about his subject area but about his working environment, was blog writing not a development that had synergies with his other academic writing. he described his blog as the ‘opposite’ of the kind of research writing he did.
he was very critical of this kind of research writing which he described as very controlled and disciplined and somehow had the ‘life sucked out of it’. like the young academics in gregg’s ( ) sample his identity was not convergent with that of the institution where he worked, and he could only express this through writing in his blog. nearly all the bloggers used other online media for their work, as well as traditional print publication channels. unsurprisingly the educational technologists were greatest users of and experimenters with new online applications, but there seemed to be no age correlation between using a great deal of online media and not doing so. in this sample they tended to use other online media to draw readers’ attention to their blogs, some sending an email message or a ‘tweet’ to a list of contacts when they had published a new post. the role of an audience for academic blogs? none of the sample (including myself) was writing with students in mind as the main audience for their blog, and there was no direct relationship between their teaching and their blogging. the blogs were not performances of a teaching identity. although ‘dr k’s blog was related to the course she taught when she set it up, she now addressed a much wider audience, and the idea of this wider audience freed her up to use her blog to explore her ideas. ‘professor r’ also felt that his blog was not a channel to talk to students, or to talk to his peers in his discipline (whom, he suspected, would think his blog was trivial); instead he like dr k wanted to talk to those with common interests: at first i thought that no one was reading [the blog], then people told me that they were. so when i found out that people were reading it i thought i should make it more accessible, and wondered how to get a dialogue going. ‘dr c’ had a clear picture of his audience, because of the local feedback he got; ‘it is made up of university and related "techy" folk. 
i make the assumption that they are reasonably comfortable with technology’. as i noted earlier, ‘dr d’ saw the main audience for the blog as his fellow authors, but that did not stop him wishing it had a bigger following. ‘mr a’ commented about his blog: ‘if i enjoy doing it and like the product myself, why do i need to show it to anyone?’, and yet he went on to admit that he was ‘quite attention hungry, so it’s funny that i carry on with this blog without feedback’. his attitude to his audience was very complex. he worried that if he thought too much about his audience it might inhibit his writing, and yet he used an email list of friends and colleagues to send his blog posts to them as emails.

the entire sample, even the most well known, got few comments posted by readers, and some of these comments were simply the verbal equivalent of applause, not the beginning of the conversation that most of them looked for. the bloggers in this study expected to have very few readers, but all hoped they could get more. it seems, however, that in general the idea of an audience is more important to these bloggers than an actual audience, and that practicing a blogging identity, or voice, for themselves was more important than having others listen.

the costs of blogging

an important question for academic bloggers is whether there are any professional costs involved. did blogging contribute to a higher profile academic identity? ‘professor m’, as noted earlier, felt that there had been a trade-off between his formal publication output and his blogging, which could be a professional cost depending on the career stage one was at. ‘dr d’, who was the least well established in his career, felt his blogging was not:

… career advancing or self-interested… i don’t think i have any reputation costs on the line. i don’t particularly regard it as a plus point in my career… but reputation costs are not nonexistent; i don’t tell everyone that i do it.
two of the established bloggers had felt some negative criticism from the institution about the content of their blog. ‘mr a’ remarked of his blog: ‘there is no chance of a chair once you start this kind of writing’. he had received comments from senior managers that some of the content of his blog should be removed. he had responded to this criticism and worried that his blog had become ‘bland’. even ‘professor m’ had once received criticism from the institution about the content of his blogging:

when i was the director of a project with commercial sensitivity i had to be more sensitive about what i said. people higher up thought i was being too open about developments that were commercially sensitive.

these issues of reputation cost and impact on careers have to be taken seriously. as well as overt attempts by an institution to constrain the content of blogs, some of my bloggers felt that others – peers in the discipline, or managers in the institution – would see their blog as not academically serious enough. perhaps it should not be surprising that academic institutions can be as sensitive as commercial institutions about what their employees publish. it is professionally safer to perform an academic identity that does not bring you into conflict with your employer.

conclusions

on the strength of this small sample i would argue that blogging is an emerging academic practice, and a new genre of scholarly writing, which could become an important activity for a professional academic. the possibility exists of creating a significant intellectual identity through a blog. if the formal structure of academic value refuses to engage with blogs – and other media – then academics will struggle to engage as twenty-first century public intellectuals. writing for blogs needs to be awarded academic esteem as well as public esteem. this esteem would accrue not just to the individual but also to the institution where they work.
‘professor m’ argued that academic institutions ‘should have some bloggers – [they should] engage in what is now a significant industry’.

this small sample of academic bloggers talked at length about the care with which they constructed their blog posts and how they thought about their audience when they wrote. they all had similar ideas about the size, shape and voice that worked best for blogging, which suggested that the rules for blogging as a genre can be deduced and applied. this supports some of the early writing by boyd ( ), who argued that people initially understood blogging through the metaphors of journalism and diary writing; these metaphors, however, reflected the fact that people did not know what possibilities blogging had in its own right.

the activities of this sample of academic bloggers suggest that academic blogging is a particular genre within the wider medium of blogging and that academic blogging is becoming a particular form of academic writing: a genre through which academics perform their scholarly identity, engage in knowledge production, and become public intellectuals, at least on the internet. the most considered and successful bloggers in the sample were academics who had extensive experience with other forms of text production. this might change as young academics – such as ‘dr d’ – learn to produce blogs alongside the other text production skills they are learning; however, academia would do well to encourage some of its best academic writers to take up blogging to provide models for multiphrenic academic identities.

notes on contributor

gill kirkup is senior lecturer at the institute of educational technology, open university, uk. she has published widely in the field of distance and online learning. one of her interests is the use of web . technologies by academics, in particular those technologies that support teaching.

references

benton, m. . thoughts on blogging by a poorly masked academic. reconstruction , no. .
http://reconstruction.eserver.org/ /benton.shtml accessed . . .
borgman, c.l. . scholarship in the digital age. information, infrastructure and the internet. london: mit press.
boyd, d. . a blogger’s blog: exploring the definition of a medium. reconstruction , no. . http://reconstruction.eserver.org/ /boyd.shtml accessed . . .
butler, j. . gender trouble. feminism and the subversion of identity. new york: routledge.
ewins, r. . who are you? weblogs and academic identity. e–learning , no. : – .
farmer, j. . blogging to basics: how blogs are bringing online education back from the brink. in uses of blogs, ed. a. bruns and j. jacobs, – . new york: peter lang.
gergen, k.j. . the saturated self. in self and society, ed. a. branaman, – . london: blackwell.
giddens, a. . modernity and self-identity: self and society in the late modern age. palo alto, ca: stanford university press.
gregg, m. . feeling ordinary: blogging as conversational scholarship. continuum: journal of media and cultural studies , no. : – .
gregg, m. . banal bohemia: blogging from the ivory tower hotdesk. convergence. the international journal of research into new media technologies , no. : – .
hyland, k. . options of identity in academic writing. elt journal , no. : – .
ivanic, r. . writing and identity. the discoursal construction of identity in academic writing. amsterdam: john benjamins.
kerawalla, l., s. minocha, g. kirkup, and g. conole. . characterising the different blogging behaviours of students on an online distance learning course. learning, media and technology , no. : – .
kerawalla, l., s. minocha, g. kirkup, and g. conole. . an empirically grounded framework to guide blogging in higher education. journal of computer assisted learning , no. : – .
ko, h.-c., and f.-y. kuo. . can blogging enhance subjective well-being through self-disclosure? cyberpsychology and behavior , no. : – .
lenhart, a., and s. fox. . twitterpated: mobile americans increasingly take to tweeting.
pew research centre. http://pewresearch.org/pubs/ /twitter-tweet-users-demographics.
lovink, g. . zero comments. blogging and critical internet culture. london: routledge.
mccullagh, k. . blogging: self presentation and privacy. information and communications technology law , no. : – .
saper, c. . blogademia. reconstruction , no. : – . http://reconstruction.eserver.org/ /saper.shtml.
scoble, r., and s. israel. . naked conversations. how blogs are changing the way businesses talk with customers. hoboken, nj: john wiley and sons.
walker, j. . blogging from inside the ivory tower. in uses of blogs, ed. a. bruns and j. jacobs, – . new york: peter lang.
ward, m. . thoughts on blogging as an ethnographic tool. paper presented at the rd annual ascilite conference: who’s learning? whose technology? in sydney, australia.

opinion article

problematizing digital research evaluation using dois in practice-based arts, humanities and social science research [version ; referees: approved]

muriel swijghuisen reigersberg
research office, goldsmiths, university of london, london, se nw, uk

abstract

this paper explores emerging practices in research data management in the arts, humanities and social sciences (ahss). it will do so vis-à-vis current citation conventions and impact measurement for research in ahss. case study findings on research data inventoried at goldsmiths, university of london will be presented. goldsmiths is a uk research-intensive higher education institution which specialises in arts, humanities and social science research. the paper’s aim is to raise awareness of the subject-specific needs of ahss scholars to help inform the design of future digital tools for impact analysis in ahss. firstly, i shall explore the definition of research data and how it is currently understood by ahss researchers. i will show why many researchers choose not to engage with digital dissemination techniques and orcid.
this discussion must necessarily include the idea that practice-based and applied ahss research are processes which are not easily captured in numerical ‘sets’ and cannot be labelled electronically without giving careful consideration to what a group or data item ‘represents’ as part of the academic enquiry, and therefore how it should be cited and analysed as part of any impact assessment. then, the paper will explore: the role of the monograph and arts catalogue in ahss scholarship; how citation practices and digital impact measurement in ahss currently operate in relation to authorship; and how digital identifiers may hypothetically impact on metrics, intellectual property (ip), copyright and research integrity issues in ahss. i will also show that, if we are to be truly interdisciplinary, as research funders and strategic thinkers say we should, it is necessary to revise the way we think about digital research dissemination. this will involve breaking down the boundaries between ahss and other types of research.

keywords: open access, digital identifiers, arts, humanities, social science

this article is included in the science policy research gateway and in the proceedings of the orcid-casrai joint conference collection.

referee status: invited referees: anne galliot, university of brighton, uk; ben johnson, higher education funding council for england (hefce), uk.
first published: jul , : (doi: . /f research. . )
corresponding author: muriel swijghuisen reigersberg (m.swijghuisen@gold.ac.uk)
competing interests: no competing interests were disclosed.
how to cite this article: swijghuisen reigersberg m. problematizing digital research evaluation using dois in practice-based arts, humanities and social science research [version ; referees: approved]. f research , : (doi: . /f research. . )
copyright: © swijghuisen reigersberg m. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. data associated with the article are available under the terms of the creative commons zero "no rights reserved" data waiver (cc . public domain dedication).
grant information: the author(s) declared that no grants were involved in supporting this work.

introduction

this paper explores emerging practices in research data management in the arts, humanities and social sciences (ahss). it will do so vis-à-vis current citation conventions and impact measurement for research in ahss. case study findings on research data inventoried at goldsmiths, university of london will be presented. goldsmiths is a uk research-intensive higher education institution which specialises in arts, humanities and social science research.
the subject of this paper is a topical one in the uk, where research in universities is publicly funded. government and research council funders are asking that universities in receipt of research income demonstrate how their funding is used to generate new knowledge and positive impact in all disciplines, including the arts, humanities and social sciences. the impact that this new knowledge creation and its dissemination are having must be recorded and, where possible, quantified. this quantitative information can thereafter be used to help inform future research strategies on a variety of levels. it might also be used, some tentatively suggest, to complement peer review in future research excellence frameworks.

some of this quantifiable information for research assessment might be delivered through various types of metrics, based on digital information made openly available. this could include bibliometrics, a quantitative analysis of research literature and citation rates, or altmetrics, which incorporate, for example, social media analyses and download rates of visual research-related materials alongside citation rates.

this way of measuring research impact and excellence, however, is not yet as refined as some researchers might like it to be. firstly, the data on which metrics rely must be available in a digital format and easily accessible. secondly, data must be unambiguously linked to its creator(s) through unique researcher identifier numbers, such as those provided by orcid. some argue that metrics work better for those disciplines that have focussed more heavily on digital dissemination strategies and open access publication methods. it is argued that some forms of research, such as practice-based research, do not naturally lend themselves well to digital capturing for impact measuring purposes. the upshot of this is that those disciplines that are less digitally oriented are likely to obtain unhelpful metric ratings.
this in turn, it is feared, would lead to reductions in public funding if metrics were to be used to allocate financial resources in future. some even suggest that this, in turn, might jeopardise the diversity of uk research, diminishing arts and practice-based research activity and jeopardising the sustainability of smaller specialist higher education institutions. as a result, the higher education funding council for england (hefce) conducted two independent reviews on the suitability of metrics for research and impact assessment purposes: one between – , and another soon to be published in july .

disciplinary languages also determine whether or not ahss researchers are likely to engage with metrics. many ahss researchers do not define their research outputs as ‘data’. neither do many communicate their research enquiries in writing, making bibliometrics problematic. often, when they do write, ahss researchers publish monographs or book chapters. monographs as yet are not widely available in an open access format, again impacting on the ability of the research therein to be captured by digital tools.

lastly, questions of authorship, copyright and ownership arise. often ahss research is co-created with the help of non-researchers as well as co-investigators. this impacts on individual incomes and research integrity where the sharing and citation of data is concerned. it may also impinge on research data management strategies for interdisciplinary projects where multiple outputs of different kinds are produced.

it is the above issues that i will explore in this paper. i will argue that if we are to be truly interdisciplinary, it is necessary to revise the way we think about digital research dissemination and how we write and talk about it. this will involve breaking down the boundaries between ahss and other types of research.
although presented here as an opinion piece, this article is not so much an opinion as a record of the state of play with regards to emerging practices, theory and ideas on digital publication, orcid numbers and digital object identifiers and how these might complement other mechanisms which enhance digital research impact and discoverability of ahss research. this article does not pretend to offer a complete view of all emerging practices across ahss disciplines in the uk. neither does it suggest that digital citation and discoverability are the only ways in which impact can be achieved. instead it will give an overview of some of the specific debates pertaining to digital dissemination that researchers and administrators are having at goldsmiths, university of london and how staff are engaging with open access, open data, digital object identifiers (dois) and orcid numbers.

orcid, definitions of data in ahss research

at goldsmiths, university of london, not many researchers have created their individualised orcid numbers yet and neither do they, by and large, label their data with digital object identifiers (dois), apart from possibly their journal articles, which of course contain processed data. this is because many researchers active in arts, humanities and social science (ahss) research do not identify their research outputs (e.g. field notes, monographs, art work sketches, film rushes, interview materials etc.) as research data. the term ‘data’, to them, has a much narrower definition, belonging only to the professional jargon used by researchers working in (bio)medical and other scientific fields (e.g. quantitative, numerical data sets, graphs and pie charts). other ahss researchers see ‘data’ as only being synonymous with personal data: personal information about research participants, such as names and dates of birth, subject to safeguarding by the data protection act.
the definition of what might be classified as research data, however, could be much broader if conceived of creatively. the first mission of any digital advocate, therefore, is to clarify, with the help of researchers themselves, what might count as research data in their disciplines and how this could be labelled appropriately, digitised, archived and maintained as necessary, to document research processes and methods. this process of defining and scoping research data must meet the needs of project-specific research enquiries without letting administrative requirements take the upper hand or imposing non-ahss data collection models where this is not appropriate.

to some extent the question of what counts as data in visual arts research was already addressed at goldsmiths. between – , goldsmiths participated in a joint information systems committee (jisc) funded project called kaptur together with glasgow school of art, university for the creative arts and university of the arts london. kaptur built on the work undertaken by the digital curation centre and was led by the visual arts data service. the kaptur project sought to “investigate the current state of the management of research data in the arts; develop a model of best practice applicable to both specialist arts institutions and arts departments in multidisciplinary institutions; and to apply, test and embed the model with four institutional partners” (http://www.vads.ac.uk/kaptur/about.html). the project helped question definitions of data and provide examples of what might count as data and how it might be managed.
kaptur’s findings, useful toolkits, reports and outcomes have not filtered through to most visual arts researchers, however, and few have therefore embraced the idea that the definition of research data can be broadened. those that have are struggling with ip and copyright issues for digitised work; have no time to digitise analogue research outputs to be able to create digital object identifiers; or may not publish in the digital domain, meaning that digital identification and metrics cannot currently be used effectively to assess any potential impact being created.

consequently, many ahss researchers feel orcid numbers and digital object identifiers are simply not relevant to what they currently do. as much ahss research remains unfunded, there are also no contractual or funder obligations to which researchers must adhere which stipulate that research data must be open access or discoverable. this reluctance to engage with digital dissemination and citation practices is especially in evidence when researchers work in disciplines that still rely on the production of practice-based outputs and the publication of monographs as the most important esteem indicators and research outputs in their fields.

monographs in ahss research

goldsmiths’ researchers publish many monographs. monographs contain research data: images of art works, creative writing outputs and (auto)biographical details, for example. the author is not always the copyright owner of this data or indeed its creator, and will often have to gain permission to use information for the purposes of publishing their monograph. what is owned by the author is the intellectual theory and often the text.

academic monographs are still predominantly issued in non-open access formats. if published in the digital domain, proprietary formats and specific software are used. these are unlikely to promote sustainability of digital research data in the open access domain.
the challenge is compounded by the current lack of digital research data for monographs that might easily be labelled and referenced electronically. here it is worth referring to the january crossick report on open access monographs generated by the higher education funding council for england (hefce).

whilst recognising the importance of monographs to ahss scholars, the hefce report identified the limitations of hard copy formats. those relevant to us here include:
a) video, audio and other examples cannot be embedded in hard copy monographs;
b) text-mining options and easy ways of measuring citation levels and impact rates are absent;
c) hard copy monographs are not ‘living’ documents. comments and reviews cannot be easily shared, updates require new editions, and comparisons of passages and ideas are not quickly communicated.
it is these limitations of the monograph that researchers at goldsmiths wish to explore in collaboration with publishers, computing and legal experts.

the hefce report identified several challenges with regard to labelling research data and making monographs open access. it highlights, for example, that academically authored exhibition catalogues are part of business models for independent research organisations (iros) and galleries. making exhibition catalogues and the research data openly accessible will reduce the vital income received by iros such as the tate galleries. additionally, data in catalogues and creative writing outputs have often been generated by people other than the catalogue author. this raises copyright, intellectual property and revenue challenges impacting on the licensing and sub-licensing of research data such as images and musical examples if researchers wanted to include certain materials in their open access monograph using digital object identifiers.
careful consideration must therefore be given to labelling research data before joining it to researcher orcid numbers and making it open access if in monograph or catalogue format.

the analogue, licencing, authorship and ethics

questions of research ethics, integrity and licencing also come into play when labelling practices are considered. pictures, images and text may constitute a representation of something or someone, not the actual object or person. practice-based researchers often argue that a (digital) image or recording of their analogue, and possibly temporaneous, art work is a different object epistemologically to the actual, physical work and therefore cannot be labelled as being the same item. similarly, where open access is an option, anthropologists and ethnographers frequently opt for non-derivative licences, meaning that no derived materials, data or text can be created from the original text. this is important where non-academic research collaborators agree to participate on the basis that they are represented fairly and where these collaborators often have a say in how their interview excerpts, musical materials and images (that is to say, research data collected by the researcher) are used and placed in texts.
re-using research data uncritically, or labelling it with object identifiers, often runs contrary to the highly personalised material that is being explored, which belongs to both the researcher and his/her participants at the very least, and in some cases to the research participant alone, who is sharing it with the researcher in good faith. ahss authors therefore choose the most restrictive licensing options in order to do no harm. this ethical priority reduces the re-use and therefore citation options available for their data and, as a result, the potential for metrical impact.

questions of authorship and citation surface as well. in the sciences, definitions of what constitutes ‘authorship’ vary across disciplines and between journals. citation practices are also not standardised. guidelines do exist, however. by way of contrast, in ahss research very few, if any, definitions and guidelines of what constitutes authorship exist. if one were to apply certain biomedical models of authorship definition to ahss research, many non-academic research participants would technically qualify as co-authors of research papers, as they helped shape research data via their active, sometimes non-anonymous, participation in the research enquiry, particularly in applied, bottom-up, process-oriented research enquiries.
whilst acknowledging research participants’ input and possible co-authorship of research papers and data may be a more accurate and ethical reflection of their role in the research, it could also potentially create logistical challenges in the domains of ip, copyright, ethics and citation. ethical considerations need to inform citation and author definition practices. whilst it might seem like a good idea to measure impact through non-academic authorship, citation and engagement in the way hinted at above, this approach should not be recommended without careful ethical screening addressing questions of anonymity and equitable data sharing and ownership that are likely to arise, amongst many other hurdles.

metrics, impact and digital dissemination in ahss

another challenge to be overcome is that of the use of metrics in assessing research quality and impact. orcids and dois will be especially helpful in collecting statistical information quickly and digitally on how often research is being cited and accessed, and may go some way, so the argument goes, to showing how much impact is being achieved. metrics, however, are only one way in which impact becomes measurable and, among ahss researchers, they are thought to be misleading and ineffective. in a response to hefce’s consultation on the use of metrics in assessment of research quality and impact, many goldsmiths staff remained unconvinced that metrics could be used to conclusively prove research excellence or impact. implicitly, therefore, they had little faith in the use of dois and orcid numbers as a way of improving impact analyses through metrics, although some did agree it would improve the visibility of research outputs and data. most felt, though, that discoverability should not be equated with quality or impact per se.
metrics such as citation rates and journal impact factors, they observed, operate differently not only according to discipline (e.g., between psychology and literature), but also differ significantly between varying branches of the same discipline. psychology, for example, is arguably a more diverse discipline than some, so indices like citation factors need to be interpreted very carefully even within different sub-disciplines to allow for meaningful ‘like for like’ comparisons (e.g., neuroscience journals typically have much higher impact factors than social psychology journals).

the media, communication and cultural studies association’s (meccsa) ref consultation, compiled by prof golding and reiterated in the goldsmiths’ response to the hefce metrics consultation by meccsa members employed at goldsmiths, yielded various anonymised comments from researchers. it was pointed out that evidence “suggests that variations in citation practices occur within disciplines as much as across disciplines, so the issue of calibration cannot simply be to the average for the subject as is proposed in science subjects. staff felt it difficult to envisage a reliable way in which to develop disciplinary citation norms in interdisciplinary areas against which to compare individual counts. this would be especially true in fields such as media and communications which encourage publication across a very wide range of outlets in the arts, humanities, and social sciences. the usual suggestion of web of science (thomson scientific) as the database to be used for calibration of citation counts is acknowledged to be problematic. web of science is demonstrably incomplete in many areas (http://www.meccsa.org.uk/pdfs/ref-consultation.pdf)”. colleagues were unconvinced that any database was complete and therefore felt that figures were not necessarily accurate or useful.
other goldsmiths colleagues felt that in ahss research the number of citations was not a conclusive indicator of research merit, value or impact. a colleague commented that in the humanities (literary studies in particular), no publication is ever really superseded or made obsolete, nor is the author-critic irrelevant to the argument; the argument is very often his/her interpretation and appreciation of certain phenomena. in the sciences the assumption is that scientists report hard facts/results of experiments, not their idiosyncratic and poetic take on the set of data. once a set of data or theory is superseded, scientific research tends no longer to be cited, or becomes part of what ‘everybody already knows’ (see latour & woolgar, ). this is not true for a lot of ahss research, and so the citation of data or the linking of outputs with orcids might be of limited use, it was felt.

the intellectual debates surrounding digital dissemination that are flourishing at goldsmiths will inform practical digital dissemination strategies that the university as a whole will adopt in future. intellectually these same debates seek to influence emerging theory in digital scholarship and dissemination in ahss subjects more generally.

interdisciplinarity, impact, goldsmiths and digital dissemination

despite this scepticism and caution, new developments have begun to flourish. during a recent data management scoping exercise, it became apparent that goldsmiths researchers are engaging with digital dissemination practices quite effectively. this is especially the case where researchers are working on projects that are interdisciplinary, well-funded and usually include an element of non-ahss research.

for example, goldsmiths hosts a large arts and humanities research council (ahrc) grant.
its research team actively seek to use and create scientific computing tools to collect and analyse large amounts of data to help further musicological analyses and practice. one such undertaking is the ‘transforming musicology’ project (box ).

box . excerpt from the ‘transforming musicology’ website (http://www.transforming-musicology.org/about/) – principal investigator: professor tim crawford

this research project explores how software tools developed by the music information retrieval (mir) community can be applied in musical study. specifically the project seeks to:
• enhance the use of digitally encoded sources in studying th-century lute and vocal music and using such sources to develop new musical pattern matching techniques to improve existing mir tools;
• augment traditional study of richard wagner’s leitmotif technique through audio pattern matching and supporting psychological testing;
• explore how musical communities on the web engage with their music by employing mir tools in developing a social platform for furthering musical discussion online.

a key technological contribution of transforming musicology is the enhancement of semantic web provisions for musical study. this involves augmenting existing controlled vocabularies (known as ontologies) for musical concepts, and especially developing such vocabularies for musical discourse (both academic and non-academic). it will also involve developing and promoting methods to improve the quality and accessibility of music data on the web; especially the accessibility for automatic applications, following techniques known as linked data.

the project relies heavily on the digital labelling of musical units to help catalogue and identify compositional structures and musical pieces. the research has the potential to inform debates on musical performance, copyright, composition and musical analysis amongst other areas. to this end it will develop open source software tools as well.
the process of identifying and labelling musical units with uris means that this project has a large number of data sets and individual data items which might potentially be cited and accessible in audio format in the planned monograph for this large grant. the grant’s research team are presently considering the possibility of engaging with publishers to explore open access monograph formats so that they might include digital data sets. the creation of digital data sets and an open access monograph, in turn, provides a significant impetus for the research team to consider adopting orcid numbers so that uris and dois might be linked to their names and the grant. if data sets and digital tools were made available they would also benefit non-academic and amateur music groups, such as the lute-players with whom the principal researcher works. for open access sharing to be made a reality, however, careful consideration will need to be given to how research data sets are shared in the monograph and whether or not this might contravene existing copyright legislation, for example, as data is compiled from existing musical pieces. until adequate sharing mechanisms are explored, it may not be possible to freely share the data accumulated during the project’s lifespan.

another initiative, taken by goldsmiths’ researcher joanna zylinska and her team (professor joanna zylinska, dr kamila kuc, jonathan shaw, ross varney, dr michael wamposzyc. project advisor: professor gary hall), is the creation of an open book, photomediations. the project redesigns a coffee-table book as an online experience to produce a creative resource that explores the dynamic relationship between photography and other media. photomediations: an open book uses open (libre) content, drawn from various online repositories (europeana, wikimedia commons, flickr commons) and tagged with the cc-by licence and other open licences.
in this way, the book showcases the possibility of the creative reuse of image-based digital resources.

through a comprehensive introduction and four specially commissioned chapters on light, movement, hybridity and networks that include over images, photomediations: an open book tells a unique story about the relationship between photography and other media. the book’s four main chapters are followed by three ‘open’ chapters, which will be populated with further content over the next months. the three open chapters are made up of a social space, an online exhibition and an open reader. a version of the reader, featuring academic and curatorial texts on photomediations, will be published in a stand-alone book form later in , in collaboration with open humanities press.

photomediations: an open book’s online form allows for easy sharing of its content with educators, students, publishers, museums and galleries, as well as any other interested parties. promoting the socially significant issues of ‘open access’, ‘open scholarship’ and ‘open education’, the project also explores a low-cost hybrid publishing model as an alternative to the increasingly questioned traditional publishing structures. photomediations: an open book is a collaboration between academics from goldsmiths, university of london, and coventry university. it is part of europeana space, a project funded by the european union’s ict policy support programme under ga n° . it is also a sister project to the curated online site photomediations machine: http://photomediationsmachine.net.

this example provides a good model of how ahss researchers in the visual arts might approach the production of open access monographs and arts catalogues, where licencing and copyright issues are very much foregrounded.
a third example of goldsmiths’ engagement with open access dissemination and data is that of work led by dr jennifer gabrys and her team on an erc funded project called citizen sense in the sociology department (box ).

box . excerpt from the ‘citizen sense’ website (http://www.citizensense.net/sensors/environmental-data/). leader: dr jennifer gabrys

the project, which runs from – , investigates the relationship between technologies and practices of environmental sensing and citizen engagement. wireless sensors, which are an increasing part of digital communication infrastructures, are commonly deployed for environmental monitoring within scientific study. practices of monitoring and sensing environments have migrated to a number of everyday participatory applications, where users of smart phones and networked devices are able to engage with similar modes of environmental observation and data collection. such “citizen sensing” projects intend to democratize the collection and use of environmental sensor data in order to facilitate expanded citizen engagement in environmental issues. the team examine how effective citizen sensing practices are in not just providing “crowd-sourced” data sets, but also in giving rise to new modes of environmental awareness and practice.
through intensive fieldwork, study and use of sensing applications, the project areas set out to contextualize, question and expand upon the understandings and possibilities of democratized environmental action through citizen sensing practices.

as part of their studies, the research team on citizen sense collect live scientific data on, for example, air quality, using sensor devices. the team has now developed a website which visualises air quality data so that the general public can view the crowd-sourced results. the interdisciplinary scope of this large project means that the research data generated comes in a format that is more akin to data generated in science environments than in ahss disciplines. the collection, labelling, storage and archiving of this data using dois, and the attaching of these to orcids, might therefore usefully draw on practices established in non-ahss domains. this in turn could potentially enhance the visibility and impact of this research both environmentally and academically.

conclusion

whilst many ahss researchers at goldsmiths remain sceptical about the use of orcid numbers and digital object identifiers to enhance impact, the goldsmiths examples show that there are distinct possibilities for them to enhance the visibility of on-line research outputs such as open access monographs, digital musical data and sociologically inspired scientific data. these examples, however, are sourced from projects that are highly interdisciplinary and well-funded, drawing on collaborations and resources not normally available to ahss researchers in general. by and large most research grants in ahss subjects tend to range between £ –£ k in value and many last no longer than between – months, allowing little time and few resources for the development of novel strategies for digital research dissemination.
similarly, not all research enquiries might lend themselves well to digitisation due to the ethical, epistemological and practical concerns referred to above. questions of authorship, the suitability of metrics for assessing impact, and dissemination ethics continue to influence debates on the merits of digital dissemination and will remain points of contention for the foreseeable future. however, as has been demonstrated above, there are circumstances where employing digital dissemination practices, dois and orcid numbers is highly appropriate and could potentially lead to raising the profile of research in ahss domains, demonstrating that this same research is capable of generating its theory either independently or in true collaboration with science partners.

whilst this paper has explored some of the (perceived) differences between ahss and non-ahss uses of digital approaches to data sharing and management, i would suggest, based on preliminary discussions, that there are also many similarities between researchers and how they relate to their data, regardless of their disciplinary background. in future it may therefore be useful to explore the commonalities between disciplines alongside differences to help foster interdisciplinary approaches to research data management both practically and epistemologically, using a bottom-up approach.

competing interests

no competing interests were disclosed.

grant information

the author(s) declared that no grants were involved in supporting this work.

acknowledgements

this article was written with the help of staff at goldsmiths, university of london. it uses information provided by: the chair of the media, communication and cultural studies association (meccsa); prof tim crawford and mr richard lewis; prof joanna zylinska; dr jennifer gabrys; ms caroline lloyd and mr andrew gray.

references

latour b, woolgar s: laboratory life: the construction of scientific facts. nd edn. princeton: princeton university press, .
reference source: http://home.ku.edu.tr/~mbaker/cshs /latourlablif.pdf

open peer review

version july

referee report doi: . /f research. .r

ben johnson, higher education funding council for england (hefce), avon, uk

this article gives an insightful and valuable overview of the challenges and opportunities of adopting new practices of digital scholarship within ahss research processes. building on the work of the wilsdon review and earlier studies into practices within arts, humanities and social sciences research (e.g. the crossick review), this provides a timely description of the difficulties we face in implementing broad solutions to tricky problems within a diverse research base.

the diversity of research is often name-checked by those looking at the whole system and seeking to improve the way it works, but perhaps not fully understood. the specific process-oriented examples and case studies revealed here provide important contextual information to inform the sensible and sensitive roll-out of modern research management tools and approaches – it may be desirable, from a management and assessment perspective, to see universal adoption of orcids, dois and so on, but this isn’t as easy as it sounds, and this article helps to explain why this might be the case while providing helpful examples of where it has worked and suggestions for ways forward.

a particular problem is the complex and finely balanced nature of the relationship between different facets of the research process. while ethics, ip, copyright, digitisation, licensing, identification, citation, metrics and credit are often thought of as somewhat bounded issues that can be solved by ‘fixing the plumbing’ (e.g.
by introducing orcids), this article reminds us how complex their linkages are within the research process and how upsetting just one part of the balance can introduce vulnerabilities into the whole system.

the examples given here about data management within arts disciplines are rich and informative, and justify a bottom-up approach to managing this agenda (as called for in the conclusion). it is already clear that ‘data’ means different things to different disciplines; even across (largely stem) disciplines that generate numerical data as a primary output, one finds large variations in definitions, standards, practices and expectations that tend to muddle us. extending the meaning of ‘data’ to include all inputs and outputs that inform and support the insights generated from the research process is a laudable aim of those seeking to increase the transparency, robustness, replicability, dissemination and impact of research; doing so in a way that takes sufficient account of the complex dependencies between anonymity, confidentiality, intellectual property, ethical propriety and so on is a particular challenge within ahss research and one that is perhaps not given sufficient attention by those operating at the ‘macro’ level of research administration, assessment and policy development.

beyond data, the particular problems of contributor anonymity, delineating roles within collaborations with non-academic colleagues, the invalidity of digital simulacra of real-world artistic artefacts, the complexity of documentation of data drawn from a wide range of often privately-owned sources… these are problems that are not felt by colleagues in stem (the group of disciplines from which it is often felt that moves to ‘digitise’ research are flowing). the assignment of dois, orcids, oa licences and so on to the outputs of research operating in this environment is tricky and fraught with real dangers that will require careful further investigation. there is a clear need to tease out the limitations of these new aspects of the research ‘plumbing’ within disciplines, explore novel solutions, find what works and what doesn’t, and seek a sensible way forward.

at the heart of this is the question of ethics. the close dependency between more open and transparent scholarly communication practices and more effective research integrity is not disputed, but this is often used to justify a conclusion that ‘open’ is ‘better’ in all cases. the examples above, particularly of ethnographic research, reveal that ethical limitations within disciplinary practice often inform models of communication in a way that might hinder openness, and that this is entirely appropriate in the disciplinary context. this at first appears to fly in the face of the very idea of “open science”, but in practice it only underlines the need for context-specific approaches to openness that take sufficient account of the ethical practices within disciplines. clear delineation is needed, though, between genuine ethical considerations and those simply born of more affected academic-cultural norms or resistance to practical change – we need to head off any unfair accusations of ‘special treatment’ being granted to these disciplines purely on political grounds.
we need to better understand this problem, so that we can more effectively and sensitively tailor our approaches to achieve open research communication in a way that respects good research practice in all disciplines.

finally, the question of metrics. central to the arguments made above, and elsewhere, is a concern that the ‘plumbing’ of dois, orcids, web of science coverage etc. is insufficient to enable the accurate capture of research outputs within ahss, and therefore the metrics systems that depend on counting research outputs will unfairly discriminate against these disciplines. as the article states, “the upshot of this is that those disciplines that are less digitally oriented, are likely to obtain unhelpful metric ratings.” in my view, this masks a more pressing issue, which is that metrics are most applicable to those disciplines that ‘chunk’ their outputs into easily quantifiable forms, with quantifiable relationships to one another, with quantifiable citation practices, quantifiable(ish) contributions of academics to the research, and so on. it’s clear from the above, and from my own discussions with ahss researchers, that the problems of quantifying ahss research are not only related to the coverage of dois and orcids, and we should be careful not to assume that we entirely fix the issue of metrics by fixing the plumbing (even though we might get a few ‘quick wins’ in a few areas).

competing interests: i was involved in the wilsdon review, and i expect to be involved in the implementation of many of its findings. i am also closely involved in the development of the open research agenda in the uk, including through the development and implementation of policies for open access and open data. i sit on the jisc rim group, the orcid implementation group, and several other groups that all have a bearing on the issues raised by this report.

i have read this submission.
i believe that i have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

july referee report doi: . /f research. .r

anne galliot, centre for research and development (arts), university of brighton, brighton, uk

this article explores the specific needs of arts, humanities and social sciences (ahss) data management, whether these are adequately served by existing data management systems, and whether they are appropriately considered in future data management developments.

it highlights issues with the quantification and digitisation of information for research assessment and problematizes the concept of data itself, a notion that does not easily translate into ahss disciplines. the author identifies this as a barrier to ahss researchers engaging with digital research information systems such as orcid: it simply has little relevance to them.

links are made between current practices in research data management, licencing, authorship and ethics, and the myriad considerations that might arise in ahss projects for which we do not at present have satisfactory frameworks. the article reiterates the inadequacy of metrics to capture research impact, and indeed, excellence, in the ahss; this is corroborated by their absence in the latest research excellence framework under panel d, as well as by the recent findings of the independent review of the role of metrics in research assessment and management, which indicate that it is not currently feasible to assess research outputs or impacts in the ref using quantitative indicators alone (wilsdon et al., ).

finally, the author presents two research projects where some of the issues highlighted are being targeted, specifically by the creation of digital data sets and open access monographs and arts catalogues, in the formulation of research method and output.
these projects are interdisciplinary and, crucially perhaps, ‘well-funded and usually include an element of non-ahss research’. while it is clear that interdisciplinarity can contribute much to exploring these issues, it is uncertain whether tools and findings from these projects will fare better than the kultur project, which ahss researchers are mainly unaware of.

this is an opinion article, based on the author’s experience of working, and interviews, with researchers at goldsmiths. it certainly reflects my experience as research adviser for a college of arts and humanities. it may have been useful to reference some of the arguments referred to in the introduction, although the paucity of literature on these very current issues may have played against this.

this article raises important questions about the definitions, ethical dimension, and the process of digitisation of research data, as well as about sector endeavours to quantify research impact and excellence. it works as a thought-provoking piece, prompting many follow-on questions: how might we help ahss researchers expand a definition of research data that will be relevant to them, and how do we enable the sector to acknowledge and redress the generalisation of its definition in favour of stem disciplines? how might we ensure that the findings of kaptur and future projects are taken into consideration and their toolkits used? how might we address the apparent contradictions between copyright and intellectual property considerations and open access policies? how might we ethically define authorship in cases where research participants have contributed to shaping the data?

i hope this article leads to many more engaging with these questions in more depth.

competing interests: no competing interests were disclosed.

i have read this submission.
i believe that i have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

http://www.hefce.ac.uk/pubs/rereports/year/ /metrictide/title, ,en.html

author response, jul: muriel swijghuisen reigersberg, goldsmiths, university of london, uk

dear anne, thank you for your review and supportive comments. i think based on these i ought to have better contextualized this submission, as it is somewhat unusual. this paper is in fact part of the conference proceedings of the casrai-orcid conference, barcelona, may, on research evaluation, with an emphasis on emerging practice in the humanities and social sciences (http://www.orcid-casrai- .org/). the content of this paper was broadly discussed in a lively panel entitled: “beyond authorship: recognising all research contributions.” as such, this paper was given prior to the official launch of the hefce metrics review (http://www.hefce.ac.uk/pubs/rereports/year/ /metrictide/). hence materials of this review were not included in the first versions of this paper. however, now that the report has been launched i shall be able to include a link + doi for it and draw on some of the literature it mentions, which, as you suggest, would be very useful indeed. i would recommend that anyone reading or reviewing my submission also reads the hefce report to contextualise this paper.

secondly, this submission is not actually an opinion piece. however, f research, being a predominantly bio-medical journal, did not cater to arts, humanities and social science electronic 'templates' (if ever there were any), so the 'opinion' format was the only one suited for my particular submission.
thankfully editorial staff and conference organisers were very understanding about this small logistical hurdle and i am grateful to have been given the opportunity to contribute as part of the special theme on communicating science stream. hope that's useful.

competing interests: no competing interests.

from transaction to collaboration: scholarly communications design at uconn library

the university of connecticut (uconn) library, in collaboration with the school of fine arts and the uconn humanities institute and with support from the andrew w mellon foundation, is developing greenhouse studios (gs). gs is a scholarly communications research laboratory dedicated to using collaborative models and design principles in the creation of scholarly works. scholarship laboratories that function as a combination of a scientific research lab and an art studio are a useful means of advancing the methods and outcomes of scholarly communications. we intend to examine whether flattening hierarchies through the gs model is a significant challenge for librarians who work within transactional models of interaction and are closely tied to faculty-driven service models of research support. other participants typically thought of as supporting faculty are embedded as equal participants in the design process.
we will apply qualitative methods to examine whether the gs design process facilitates development of new models of interaction among faculty, librarians, design technologists and other experts. preliminary experience finds most participants embrace the collaborative model and are energized by the experience. our assessment will focus on gs techniques as drivers for role and scholarly output changes, how these experiences might translate into changes in library culture or services, and on practical findings related to space, technology usage and administrative hurdles. this paper is the result of a presentation delivered at cni (the coalition for networked information) in early and encapsulates our thinking then and now (in early ) as we refine our assessment tools. from transaction to collaboration: scholarly communications design at uconn library keywords libraries; digital humanities; digital scholarship; digital humanities laboratories raising the question: libraries, librarians and digital scholarship how libraries and librarians should participate in the intellectual life of the university has been richly debated for decades – with the intensity of the debate increasing as academia and the world moved from a print-based culture to a digital one. it is now well understood that this change was more than a change in format; it was a revolutionary change in human communication. libraries coped with the format change well enough, embracing electronic databases, online journals, e-books, and the like, but were less adept at understanding what the coming of the ‘digital library’ meant to the library profession and to libraries in general. libraries were not alone in this crisis of identity. the growth of digital activities in the humanities, for example, also spawned ‘digital humanities’ (dh) and a debate over the differences between the traditional and the digital in that discipline as well. 
the subsequent growth of dh centers in academic libraries further compounded the confusion, as two groups of people unsure of their identities combined to sometimes confuse each other even more about their futures. dh centers in libraries were often conceived by librarians as service centers where faculty would bring projects and ‘get a website built’ by the technologically adept. libraries began to hire developers, web designers and other non-librarian staff to meet this self-created demand. while this movement was not unwelcome, it was a significant step away from one of the traditional core functions of libraries as curated sources of raw materials and repositories of culture and knowledge.

insights – , scholarly communications design at uconn library | holly jeffcoat and gregory colati

holly jeffcoat, associate dean, university of connecticut library
gregory colati, assistant university librarian for archives, special collections and digital curation, university of connecticut library

further, over the centuries, librarians evolved from protectors of scarce objects to mediators in the search for information when the amount of knowledge became greater than the ability of one person to absorb or know. the mediator role increased as the amount of information increased, librarians created elaborate finding and inventory systems, and librarians came to be seen by themselves as essential filters between the seemingly overwhelming amount of information available and the inundated researcher. the traditional conception of the librarian as someone who can ‘get the right information, from the right source to the right client at the right time’ served the profession well until online access to resources and the beginnings of artificial intelligence began to provide not only unmediated access to library resources, but, through search algorithms and recommender functions, the information filtering services previously provided by human librarians.
as early as , bill arms raised the question about the future of the automated digital library and whether librarians as filters would always be necessary: ‘the underlying question is not whether automated digital libraries can rival conventional digital libraries today. they clearly cannot. the question is whether we can conceive of a time (perhaps twenty years from now) when they will provide an acceptable substitute.’ that years has nearly elapsed and for some, the answer has clearly changed. chris bourg, director of libraries at mit (the massachusetts institute of technology), recently posted on her blog a talk she gave at harvard’s library leadership in a digital age program called ‘what happens to libraries and librarians when machines can read all the books?’ in that talk, bourg discussed the impact machine learning could have on librarians, especially reference librarians, and offered the suggestion that rather than oppose the use of algorithms and machine learning, librarians should embrace it and determine how to leverage the fact that machines can ‘read all the books’ now, and algorithms may be as good as or better than human librarians at creating bibliographies or doing literature reviews. the filtering function served the profession well before machines could read all the books and search algorithms made that skill less relevant. the technical skills of librarianship, like technical skills in any profession, have always been subject to replacement by tools. this replacement erodes what richard mason called the ‘power relationship’ of the librarians over their clients. the democratization of information discovery and access means librarians no longer hold the keys to unlock the information potential in the libraries of the world. but, as bourg says, it would be a mistake to oppose the increasing power of automated technical skills, or think that it means the end of librarianship. 
as information professionals, librarians have a significant role to play in research in a way that is more than service provider or collection builder, and as joan lippincott says, ‘…working in such partnership relationships, becoming embedded in the mission-critical aspects of higher education – research, teaching, and learning – and infusing librarians’ particular expertise, collections, and values into new types of research, is, in fact, a core responsibility of st century librarians and libraries’. in order to meet and succeed in and, better yet, create, this new environment, librarians must look outside the traditional ‘hands-off’ culture in which they currently exist.

even in this new collaborative era, librarians are often viewed as a support tool brought in for specific duties. a case in point is visible on the open science framework (osf) website. the osf connects all aspects of the research life cycle with digital tools for seamless management by the researcher and his/her team. to be fair, osf is a valuable scholarly communication partner; however, notably, the faq screenshot depicting the answer to limiting access to the overall project stages to ‘contributors’ is of a female ‘bibliographic contributor’.

many libraries are attempting to alter the transactional model. for example, the public service roving pilot program model being tested at georgia tech changes the interaction dynamic, but the interaction remains stubbornly transactional. faculty responses to a gale cengage/american library association survey indicate most faculty perceive librarians and the library in a supportive role to their dh work. only % expressed the desire to bring the librarians in as ‘a full-fledged project collaborator or participant’.
answering the question: examining and experimenting greenhouse studios: a program that tests an idea the vision behind the greenhouse studios (gs) is to build a culture of collaboration and, by extension, build a culture of rewarding collaboration rather than individual accomplishment, drawing from design studios, scientific laboratories, digital publishing and, of course, digital humanities. at its core gs is built on the principles of collaborative workflows, equitable labor hierarchies and multimodal expression created in collaborative spaces that persist as part of the scholarly record. we can use the gs experiment as one way to test how libraries and librarians can become embedded in the mission-critical aspects of higher education at the uconn library (ucl). a collaboration between the school of fine arts, the college of liberal arts and sciences, and the humanities institute at the university of connecticut (uconn), gs is a collective effort to forge diverse collaborations that build humanities scholarship in new formats to engage new audiences. although funded in part by a grant from the andrew w mellon foundation, gs is a permanently budgeted library program. it is a research laboratory that explores not only new forms of scholarship, but new forms and processes of creating, disseminating and preserving scholarship and, perhaps most importantly, new roles and relationships among those who create scholarship. its research agenda is a deep investigation into the collaborative, interdisciplinary work processes needed to transform scholarly communications for an age of proliferated modes of expression, dissemination and reception. while that agenda may at first glance seem focused on faculty, it has equally significant implications for libraries and librarians. a resumé of the gs design process is given below – see also figure for the different stages. (this model is explained in more detail on the gs website. ) figure . 
greenhouse studios design process

briefly, the design process begins with a team of people and an inquiry-focused ‘prompt’ posed externally by gs. it is the externality of the prompt that puts people and collaboration at the center of the gs process, rather than the needs of a particular faculty researcher. teams are composed of diverse talents, including librarians, faculty, students, artists, developers, acquisition editors and other publishing professionals. the first phase includes understanding what is involved (collections, technologies, audiences, internal and external funding opportunities) and any constraints (time, money, audiences). the team then produces a project brief. the second phase sees the team expand its thinking by entering a phase of divergent research and ideation in which it identifies relevant sources, knowledge and inspiration, culminating in the production of a detailed creative brief that explains in some detail the ultimate product of the team’s efforts. with the creative brief as a guide, the team then enters the ‘build’ phase, which includes weekly meetings, iterative prototyping, testing and refining of the work, with progress toward a final deliverable in mind. this phase ends when team members agree that the media manuscript is feature-complete and ready for peer review and revision. the design process concludes with release of the publication/s and the longer-term work of dissemination, assessment, preservation and access.

all members of the project team, including librarians, are active in all phases of the project. external transactional relationships are used only when necessary, and then for essentially administrative functions of the group, such as purchasing or infrastructure support. otherwise, the team is expected to be internally self-sufficient. one aspect of the hands-off culture that currently remains is the post-design process work of preservation and dissemination.
we expect to examine these activities in more detail in the future. why uconn library? the ucl’s vision of itself and its mission to ‘create a culture of learning and exploration [in a] multidisciplinary hub of activity’ is a driving force behind the programmatic activities such as gs that encourage ‘community building, collaboration, innovation, and exploration of new pedagogical and research models. it generates a communications network to collect, share, and showcase new ideas and products’ and is an ‘inspirational and inventive space that is home to all at the intersection of content and research’. gs seeks to further break down the power relationships across the continuum of research. the librarian as information broker is replaced by the librarian as information professional with the ability to ‘render judgements [about information] in situations that are unique, uncertain, equivocal, and laden with value conflicts’. the traditionally less emphasized side of the librarian’s craft turns out to be the side of the craft that is less prone to automation, and more valuable to the modern research environment. why not a ‘digital’ research lab? the decline of the technical aspect of librarianship, combined with the increasing value of the synthesizing expertise of the professional librarian, makes librarians and libraries complementary places for humanities programs. at uconn, we invited the humanities institute into the ucl to improve the natural synergies between humanists and librarians, and created the greenhouse studios to explore those relationships in greater depth. we purposefully did not include the word ‘digital’ in the name of the program, or in any of the descriptive and promotional literature. 
it may seem disingenuous to leave digital out of the title of a program that is so obviously focused on digital outputs and the use of digital technology; however, gs is focused on digital technology only because digital is the place where research is being done today. at some point in the future, we may be exploring telepathic information exchange, or some other methods that today seem just as far-fetched as sending ones and zeros over invisible carrier waves to flat pieces of glass and silicon that we put in our pockets seemed to people only years ago. we recognize the shifting landscape of research and scholarly expression, and aim to be part of the creation of that new landscape, now and in the future.

methods: or, how are we going to know if it works?

this study will make use of modified grounded theory methodology, particularly cultivation of grounded theory behind organizational identity and disruption. semi-structured interviews will be scheduled with the greenhouse studios working group, steering committee, and over time ( – ) with individual participants in cohorts a, b, and c. interviews will be collected and coded using qualitative data analysis software. constant comparison of interview coding will occur to ensure consistent coding throughout the project. responses will be categorized by standard demographic traits in order to facilitate comparisons. the initial interviews will inform a theoretical direction to possible frameworks such as identity development, social construction, intergroup relations, and role conflict. the sample size may reach – professionals. the potential number of interviewees is five per project. each of the three gs cohorts will have three to five projects.
some overlap exists between the working group and steering committee, therefore the number of interviews could reach as high as . potential for varied background and experience in a small, yet diverse, sample size is expected to be high. assessment an abundance of literature about digital humanities or digital scholarship centers in libraries exists in articles, blog posts, reports and book chapters. often the focus is on history, planning and types of centers, connections between libraries and dh, role changes and overcoming librarian ‘timidity’ , communication, skill acquisition, sustainability and perspectives on service. as we move from defining and understanding digital scholarship and library partnerships, one area in the literature is conspicuously lacking: assessment. in , lippincott and goldenberg-hart stated the need to learn what types of assessment are taking place at digital scholarship sites and how success is defined. a recent article by green outlines digital pedagogy assessment strategies. maron and pickle wrote the most comprehensive overview of dh models, funding, value and sustainability. with these exceptions, very little has been written to guide overall program assessment and even less examining the relative effectiveness of design models and impact of participant hierarchies throughout the project. expected outcomes for the gs method and cohort experience while the activities of the gs teams produce the intellectual and scholarly outputs that are part of the scholarly record, a product of gs is a community of differently trained and experienced, interdisciplinary collaborators comprised of faculty, librarians, designers, developers, students and others at uconn, along with colleagues from the publishing community and other institutions. our expectation is that this community, with its collaborative ethos, will have developed new understanding and appreciation of their own, and other cohort members’, professional identity. 
the community will continue to grow as more and more alumni of gs teams move out into the academic world, spreading the collaboration-first approach. to determine the validity of our expected outcome, assessment will track perception of individual contributor role and experience before, during, and after participation in a gs cohort. we will examine relative adaptability and acceptance among participant types to the prompt-driven (i.e. not faculty-driven) collaboration-first (i.e. equality of team members) design process as well as challenges experienced while learning a new design process with a multi-modal outcome and concerns with reward systems relative to position. do faculty, design technologists and other project members perceive role and professional identity shifts during the multiple stages in the process? what do student collaborators learn about academic power dynamics? do concerns about ownership or tenure and promotion override the collaborative nature of the project? what is the time commitment during the process and does the amount of time positively or negatively impact external deadlines or other work? who determines project completion? how will the outcome be preserved and what is the scholarly item of record? does the experience influence pedagogy? we will consider all of these aspects of the collaboration.

expected outcomes for the gs and library staff

we do not expect that we will immediately replace transactional interactions between librarians and users with total collaboration, all the time.
our more modest goals are to introduce librarians and other library staff to a new culture, enable them to experience a new approach to academic scholarship, encourage them to have the confidence to be a collaborator, and provide a professional development opportunity within the organization that at the same time improves that organization and its position on campus. librarians will be exposed to non-traditional products and projects, find themselves in non-traditional and potentially uncomfortable roles, will expand their conception of what it means to be a librarian, and allow for a new librarian-faculty-student-technologist dynamic to emerge. in fact, we expect that librarians will behave according to their personalities as the gs structure allows them the freedom to define their own place. to gauge our assumptions, we will interview librarians and library staff cohort members to learn from their experience. does the unique gs environment help overcome persistent librarian ‘timidity’ to embrace new roles in dh contexts? since participation is not mandatory, did librarians choose to prioritize traditional library work over participation, and if so, why? did the librarians inform the process in unexpected ways? how did librarians see themselves and their role in the process and did their role or professional identity reshape itself over the course of the project? will transactional-oriented librarians and staff succeed or enjoy an open-ended, no-rules, no-right-answer project? expected outcomes for the gs and transactional library culture as long as the nature of the interaction of librarians and their communities remains embedded in the hands-off culture, and library spaces reflect that transactional model, design techniques cannot easily be generalized to the larger library culture. however, iterative design thinking, along with adoption of other outside techniques like agile, can be integrated into library services at some level beyond general reference. 
notable shifts in library culture and services will be difficult to discern in the short term. however, we will devise methods to determine the value of having a scholarship research lab in the library and whether the collaboration-first process stimulates cultural change in other areas of the library. we are particularly interested to know if librarians internalize the collaborative design process, especially if participation disrupts the transactional culture and, if so, how? does the change in professional identity influence future interactions, assessment of services, or service design, and if so, at what level? are specialized scholarly communication design techniques generalizable to a tradition-bound organization within a similarly constrained institution?

other considerations

reward systems

scholarly expression continues to evolve but is too often constrained by the scholarly reward system. the remaking of the scholarly reward system in academia is not the primary subject of this article, but it needs to be mentioned in the context of the reward system for librarians. whether or not they are considered faculty, staff, or some type of hybrid, librarians live under a reward system that governs and guides their activities and professional development. those reward systems are based on criteria that are generally not set up to adequately assess individual accomplishment in a collaborative setting. we will be interested to see how participation in greenhouse studios is received within current ucl reward systems.

space and furniture

we will regularly evaluate the newly designed space which is located on a busy common floor in the homer babbidge library.
do amenities like flexible furniture, dedicated breakout space for each project team, fixed workspaces for fellows and students, remote offices for permanent staff, and abundant coffee and insomnia cookies contribute to the success of projects? the lab is enclosed with glass walls that do not reach ceiling height. do the glass walls invite interaction and questions from library users or cause noise concerns for those on either side of the wall? further, gs is co-located on the floor with the ucl maker studio, visualization lab, the scholars’ collaborative, and multiple instruction rooms. does this co-location encourage interaction or will gs cohorts and general ucl users exist separately?

technology

we will review the use of technology in team work. of the technology – from virtual reality to rolls of butcher paper – which is most useful and why? again, working with small sample sets and only anecdotal evidence, we find that the preferred technology tends to be ‘bring-your-own’, although large screens for group discussions are valuable for the working group and other small groups when working collaboratively. whiteboards, flip charts, sticky notes of various sizes and other physical means of capturing ideas remain heavily used. does this indicate that, despite differing professional training and backgrounds, design projects require less technology during each phase? does lack of a central focus on technology enable greater creativity in digital projects? how does a lack of a ‘standard’ technology set impact the library’s current technology support structures? traditionally, and for many good reasons, library it departments provide a highly standardized, comprehensive technology environment for library staff. by its nature, gs will use experimental and non-standard technology, often brought in or created by team members, and meant to be temporary.
what is the maximum level of tech support gs can expect or the minimum amount of technology control the ucl can demand in these situations?

administrative structures

the cross-departmental (crossing departments within the library) and transdisciplinary (crossing academic departments, schools and colleges, and administrative service departments within the university) nature of gs in many ways challenges the long-standing transactional arrangements of allocating money and services common to the academic bureaucracies built on disciplinary work. libraries have always welcomed non-library staff into the library building with a range of semi-permanent spaces from faculty offices to graduate study carrels. as libraries move to increase beneficial partnerships through shared library spaces, the library’s perception of its core mission is challenged by including new and uncontrolled elements into its midst. specifically, in the case of gs, is support for what some may believe to be ‘other’ (i.e. non-library) staff perceived as a questionable investment especially in a resource-constrained environment? if the ucl funds expenditures for the ‘non-library’ (however that is defined) staff and activities in gs, what is the added cost of supporting those activities with administrative services?

we have begun collecting information on the oft-mentioned administrative and institutional structures that potentially impede or redirect transdisciplinary digital scholarship efforts. hiring, grant management, operational budget impact, staff and student labor budget implications, technology funding, financial rewards and administrative support create ongoing pressures in a system that is similarly transactional and departmental in nature.
we will speak with it, financial, administrative and facilities services staff to determine the ancillary organizational weight borne by these areas and how they react and adjust to their unique requirements.

final thoughts: or, stay tuned for more

as long as the nature of the interaction of librarians and their communities remains transactional and library spaces reflect that transactional model, design techniques cannot easily be generalized to the larger library culture. however, iterative design thinking can be integrated into library services at some level beyond general reference. this brings up another, more serious question. can the ucl organize itself and its activities so that collaboration at scale is possible? on the surface, it would seem not. participation in collaborative projects requires significant commitment of effort over a protracted length of time. with thousands of faculty and tens of thousands of students, it is not possible to collaborate on a gs model with more than a small minority of the uconn community. so why do it at all? like any research endeavor, gs is testing an idea rather than implementing a service. if collaborative research becomes a valuable approach to creating scholarship, and we believe it already has passed that milestone, librarians will exert some of their particular skills and expertise on figuring out how to collaborate at scale and what that activity looks like. for the ucl, the goal of the greenhouse studios is to test and understand how librarians fit into a new collaborative model of scholarly creativity. therefore, while we are committed to reporting our results in future publications, we feel it is important to begin the discussion now, while our thoughts and opinions are still fluid, so that we can tap the combined intelligence of a larger audience to make our ultimate solution and conclusions even better.
this is the greenhouse ethos in a nutshell: to create collaborators at every step of the creative process, and involve our communities in our creations.

abbreviations and acronyms

a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘abbreviations and acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa

competing interests

the authors have declared no competing interests.

references

bedard m, phillips h and colati g, from transaction to collaboration: scholarly communications design at uconn library. in: cni, spring , albuquerque, nm: https://www.cni.org/wp-content/uploads/ / /cni_from_colati.pdf (accessed march ).
gold m k, debates in the digital humanities, , minneapolis: university of minnesota press: http://dhdebates.gc.cuny.edu/ (accessed march ).
for example, allington d, brouillette s and golumbia d, neoliberal tools (and archives): a political history of digital humanities, los angeles review of books (larb), : https://lareviewofbooks.org/article/neoliberal-tools-archives-political-history-digital-humanities/ (accessed march ).
nowviskie b, skunks in the library: a path to production for scholarly r&d, journal of library administration, , ( ), – .
vandegrift m and varner s, evolving in common: creating mutually supportive relationships between libraries and the digital humanities, journal of library administration, , ( ), – .
vandegrift m, what is digital humanities and what’s it doing in the library?, in the library with the lead pipe, , – : http://www.inthelibrarywiththeleadpipe.org/ /dhandthelib/ (accessed march ).
casson l t, libraries in the ancient world, , new haven, yale university press: http://ebookcentral.proquest.com/lib/columbia/detail.action?docid= .
abbott a d, digital paper: a manual for research and writing with library and internet materials, , chicago, the university of chicago press.
mason r o, what is an information professional?, journal of education for library and information science, , ( ), – .
arms w, automated digital libraries: how effectively can computers be used for the skilled tasks of professional librarianship? d-lib magazine, , ( / ): http://search.proquest.com/docview/ /.
bourg c, march , what happens to libraries and librarians when machines can read all the books?, feral librarian blog: https://chrisbourg.wordpress.com/ / / /what-happens-to-libraries-and-librarians-when-machines-can-read-all-the-books/ (accessed march ).
mason, r o, ref. .
lippincott j k, foreword. in: digital humanities, edited by hartsell-gundy a, braunstein l and golomb l, , chicago, il: association of college and research libraries.
open science framework, ‘add contributors to projects and components | creating and managing projects’, osf guides, : http://help.osf.io/m/projects/l/ -add-contributors-to-projects-and-components#add-contributors-to-components (accessed march ).
decker e n, givens m and henson b, from a transactional to relational model: redefining public services via a roving pilot program at the georgia tech library. in: the lita leadership guide: the librarian as entrepreneur, leader, and technologist, edited by antonucci c and clapp s, lanham, , md, rowman & littlefield publisher.
digital humanities – faculty survey results december , cengage learning and american libraries, : https://americanlibrariesmagazine.org/wp-content/uploads/ / /digital-humanities-faculty.pdf (accessed march ).
greenhouse studios, design process, : https://greenhousestudios.uconn.edu/design-process/ (accessed march ).
bedard m, jeffcoat h and uconn library, uconn libraries purposeful path forward purposeful plan of action: programmatic & empowering priorities, : http://lib.uconn.edu/wp-content/uploads/ / /purposeful-path-forward_web.pdf (accessed march ).
mason r o, ref. .
charmaz k, constructing grounded theory: a practical guide through qualitative analysis, , london; thousand oaks (calif.), sage.
glaser b g and strauss a l, the discovery of grounded theory: strategies for qualitative research, , chicago, aldine.
strauss a l and corbin j, basics of qualitative research: techniques and procedures for developing grounded theory, , sage: http://srmo.sagepub.com/view/basics-of-qualitative-research/sage.xml (accessed march ).
gioia d a and thomas j b, identity, image, and issue interpretation: sensemaking during strategic change in academia, administrative science quarterly, , ( ), – : http://www.jstor.org/stable/ (accessed march ).
glaser and strauss, ref. .
bryson t, posner m, st pierre a and varner s, spec kit : digital humanities.
sula c a, digital humanities and libraries: a conceptual model, journal of library administration, , ( ), – .
vandegrift m and varner s, evolving in common: creating mutually supportive relationships between libraries and the digital humanities, journal of library administration, , ( ), – .
cox j, communicating new library roles to enable digital scholarship: a review article, new review of academic librarianship, , ( – ), – .
bakkalbasi n, jaggars d and rockenbach b, re-skilling for the digital humanities: measuring skills, engagement, and learning, library management, , ( ), – .
maron n l, yun j and pickle s, sustaining our digital future: institutional strategies for digital content, strategic content alliance, , : http://repository.jisc.ac.uk/ / /sustaining_our_digital_future_ithaka_s+r_final.pdf (accessed march ).
muñoz t, digital humanities in the library isn’t a service, github, : https://gist.github.com/ (accessed march ).
lippincott j and goldenberg-hart d, digital scholarship centers: trends & good practice, : https://www.cni.org/wp-content/uploads/ / /cni-digitial-schol.-centers-report- .web_.pdf (accessed march ).
green h e, fostering assessment strategies for digital pedagogy through faculty-librarian collaborations: an analysis of student-generated multimodal digital scholarship, pp. – . in: laying the foundation: digital humanities in academic libraries, edited by white j w and gilbert h, , purdue university press: http://www.jstor.org/stable/j.ctt t kq. (accessed march ).
maron n l et al., ref. .

article copyright: © holly jeffcoat and gregory colati.
this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.

corresponding author
gregory colati
assistant university librarian for archives, special collections and digital curation
university of connecticut library, storrs, ct , usa
e-mail: greg@uconn.edu
orcid id: http://orcid.org/ - - -

holly jeffcoat
orcid id: http://orcid.org/ - - -

to cite this article: jeffcoat h and colati g, from transaction to collaboration: scholarly communications design at uconn library, insights, , , : – ; doi: https://doi.org/ . /uksg.

published by uksg in association with ubiquity press on may

this is an “accepted author manuscript” (aam) (also known as the “author post-print”)

cristina costa
school of education, university of strathclyde, glasgow, uk
cristina.costa@strath.ac.uk

outcasts on the inside: academics reinventing themselves online

abstract

recent developments in digital scholarship point out that academic practices supported by technologies may not only be transformed through the obvious process of digitisation, but also renovated through distributed knowledge networks that digital technologies enable, and the practices of openness that such networks develop. yet, this apparent freedom for individuals to re-invent the logic of academic practice comes at a price, as it tends to clash with the conventions of a rather conservative academic world. in other words, it may still take some time until academia and the participatory web can fully identify themselves with one another as spaces of ‘public intellectualism’, scholarly debate and engagement. through a narrative inquiry approach, this research explores how academic researchers engaged in digital scholarship practices perceive the effects of their activity on their professional identity. pierre bourdieu’s concept of habitus is used as a theoretical construct and method to capture and understand the professional trajectories of the research participants and the significance of their digital practices on their perceived academic identity. the research suggests that academics engaged in digital practices experience a disjointed sense of identity.
the findings presented in this article illustrate how experiences with and on the participatory web inform a new habitus which is at odds with a habitus that is traditionally expected in academia.

keywords: habitus, digital scholarship, pierre bourdieu, identity

introduction

as a space where intellectualism can be developed publicly and collectively, the participatory web is starting to be regarded as a catalyst for change, especially where knowledge work is concerned. when academics recognise the potential of the web as a space of participation, their approaches to how they communicate, discuss and disseminate their scholarly work are likely to start taking on different dimensions (hall, ; veletsianos and kimmons, ). this becomes even more important given that the current society is increasingly influenced by an economy reliant on digital technological developments. digital scholarship practices are understood as scholarly work supported and enhanced by the participatory web and the movements and ideals associated with it; amongst which is the open access movement that aims to make research practice and outputs accessible to a wider world (henry et al, ; fry et al, ). digital scholarship is starting a tradition of openness and transparency by placing a strong emphasis on a culture of knowledge sharing online. in this sense, the participatory web consists of communication tools, applications and environments in which knowledge networks form as a result of individuals’ active participation as contributors and sharers of information. a major side-effect of academics’ engagement online is reflected not only in the ways their work is presented, but also in how they represent and perceive themselves.
the meaning of using the web for academic purposes, i.e., of ‘being’ and perceiving oneself as a ‘digital scholar’ (weller, ) is epitomised by a renewed sense of professional identity among academics. this issue is worth exploring because the practices and, most importantly, the deployment of self- identity as ‘digital scholars’ (see own author anonymised for review purposes, ), are an emerging phenomenon within the academic community. this research is guided by the sociology of pierre bourdieu, especially his conceptualisation of habitus as internalised behaviour; product of life trajectories that individuals carry with them and which, in part, are translated into the practices they transfer to and from the social spaces in which they interact. in doing so, this article explores how academic researchers engaged in digital scholarship activities perceive their professional identity as part of their academic habitus; the perceptions of a professional self that is strongly influenced, and sometimes transformed, by their participation in online knowledge networks and web spaces. to conduct this research a narrative inquiry methodology was employed with the concept of habitus playing a vital role in the background in terms of capturing and translating the narrated experiences of practice into meaningful units of knowledge, especially those regarding the professional trajectories and related sense of identity of the research participants. considering academic identities in the current knowledge society requires attention to the growing effects of the participatory web on the academic world. how the web affects academic practice, and especially what it means in terms of professional and academic identity is central to this article. this research presents a new perspective on academic identities in connection to the digital economy and aims to inform the wider digital society debate in relation to the academic profession. this article is organised in four sections. 
following this introduction, bourdieu’s key concept of habitus is presented in tandem with literature on identities. next, i elaborate on the methodological choices made for this study. the findings of the research are then presented. i conclude the article with a discussion of the findings in relation to the work of bourdieu.

academic identities: a reflection of (a changing) habitus

the conceptualisation of habitus is a result of bourdieu’s attempt to overcome the dichotomy between structure and agency whilst acknowledging the external and historical factors that condition, restrict and/or promote change. the concept of habitus, as a socially embodied system of individual and collective dispositions made visible through social agents’ practices, is history that produces more history (bourdieu, , p. ). however, habitus is more than accumulated experience or automated repetition of actions; it consists of a complex social process in which individual and collective ever-structuring dispositions converge or diverge to form and justify individuals’ perspectives, values, actions and social positions, i.e., their embodied cultural capital.

identity, as a product of socio-cultural, historical and political contexts (markus and nurius, ; jenkins, ), is constantly being transformed by the combination of individuals’ experiences (slay and smith, ) and their personal traits (cote, ). wacquant ( ) contends that individual habitus - ‘the idiosyncratic product of a singular social trajectory and set of life experiences’ (p. ) - may or may not contrast with the collective habitus of the social groups and institutions with which an individual is affiliated. every field of action has its set of rules and conventions that help define it as a social space. this may agree or disagree with individual habitus and subsequently an individual’s sense and perception of identity.
what habitus does is to communicate the dialectics between structure and agency, between the object and the subject, through a dispositional theory of action and reflexivity. looking at the context of this research, in general, academia is known for featuring a set of durable dispositions that aim to ensure the reproduction of their symbolic power, i.e., their reputation and status quo. this is especially visible in the communication and dissemination of research through traditional channels of intellectual discourse, such as academic publications in toll-access journals because of their long established prestige (northcott and linacre, ; burdick and willis, ). although the participatory web provides very effective channels of communicating knowledge and achieving influence, its impact is more notorious outside academia (wilkinson et al, ). higher education institutions, as formal sites of knowledge production, are often more hesitant to depart from established norms (harley et al, ) or recognise disruptive practices (priem and hemminger, ) because of their long tradition and reputation. through accountability measures of academic performance that rely heavily on conventional metrics of knowledge production (talib, ; wellington and torgeson, ; northcott and linacre, ; miller et al, ) (e.g.: number of publications and citations, type of academic journals, etc.), academia, as a field of social relations, aims to reproduce a habitus that allegedly gives stability to the institution. at the same time, however, it is increasingly incongruent with more contemporary communication practices supported by the use of digital technologies and, especially, the participatory web (qualman, ). such is the case of academic blogs as a space for public discourse, knowledge networks as sites of influence and public debate. 
although academia strives to maintain its structure, scholarly work is undergoing a slow process of transformation (pearce et al, ), as an effect of ‘outside’ practices. weller ( ) revisits scholarship in the context of the digital society and puts forward three features that are starting to characterise new scholarly practices: (1) digital, (2) networked, and (3) open (p. ). with the widespread use of the web, academics are given access to ‘a growing body of research data and sophisticated research tools and services’ (conole, , p. ). the opportunities to retrieve and contribute to a living, dynamic, and evolving knowledge database are multiple. the participatory web provides academics with a new conduit for the dissemination and storage of research in an environment where different publics can converge. academics are enabled to participate in online communities and networks that not only link them to their research interests, but can also connect them to new research and collaboration opportunities. in this vein, the participatory web introduces new practices, and challenges the norms of rather stable structures on which academia has established its practice, built its identity, and consequently its policies of power (schneckenberg, ). ‘being’ a digital scholar thus implies a cultural change (becher, ; cronin, ; fry, ; kemp & jones, ; whitely, ), as digital scholarship calls for a distinctive set of practices that aim to give scholarly activity a post-modern touch. the more a social field succeeds in establishing itself as habitus the more successful it is in forming and maintaining its structure. this, in turn, assumes the individual’s identification with the institution, by reconciling the social agent’s practices with the social structure of the institution’s norms. habitus is often understood in the literature as a mechanism of reproduction of practices conveyed through a sense of experiential continuum (see, for example, king, ).
however, habitus presents a more complex nature; as the ‘justification’ of agency, habitus is not an innate or intact set of dispositions (bourdieu, ). on the contrary, in representing the social trajectory of an individual, habitus has the ability to change through the assimilation of new dispositions; the result of the individual being exposed to different realities and getting involved in new practices. hence, an individual habitus does not necessarily translate into the habitus that the academic field tries to cultivate. it is this dissonance between institutional and individual practices that gives habitus its fluidity.

just like habitus, professional identity is not a static concept, but one that rather evolves, according to ‘work role changes’ (ibarra, , p. ) and personal and social meanings individuals attribute to it. one’s sense of self can also be shaped by one’s self-conception (ibarra, ) and self-interest (du gay, ), i.e., who individuals think they are and what they would like to become. hence, professional identity can be seen as a social construction (jenkins, ) that takes into account an individual’s role, professional structures and the wider contexts in which they interact. slay and smith ( ) posit that professional identities are:
- social(ised) (individuals are socialised into the meanings of a given profession)
- changeable (individuals adjust and adapt their professional self to different roles and jobs)
- modelled (individuals’ work experiences and narratives of life help construct one’s perceptions of the self and thus determine priorities and directions)

professional identity can thus be understood as both an act of perception and of being perceived (bourdieu, ) in a given field of action.
although an individual’s practice can work in conformity with the field, through the tacit acceptance of its norms, the dispositions the individual brings to the field can also contrast with the recognised order. habitus can work as much as a form of adaptation to the field as it can divide the field of practice. this has an impact on individuals’ identification with the social field. moreover, it tends to culminate in the recognition or misrecognition of the practitioner; of the alignment (or misalignment) of self-identity with the field’s identity. the concepts of recognition and misrecognition were often used by bourdieu to convey perspectives of social classification, position and legitimacy, i.e., instances of symbolic capital that aim to preserve or subvert the social structure. the acquisition of such symbols as embodied habitus determines the inclusion or exclusion of the individual in the social field to which such symbols belong. acts of recognition thus imply that both social agents and social structures share ‘identical categories of perception and appreciation’ (bourdieu, a, p. ), whereas acts of misrecognition indicate a clash between the practices and habitus that characterise and distinguish the two parties. in the context of this study, habitus is used as a tool to capture and understand research participants’ sense of identity within the contexts and constraints in which their scholarly work takes place. the next section will explain the context of the research and how the study was conducted.

the study

narrative inquiry - the entwined process of elicited story-telling and reflection - assumes that social lives are woven from a personal and ‘experiential continuum’ (dewey, ) enveloped in a given social, cultural, political and economic context.
this research employs a narrative inquiry methodology as a form of meaning making to capture the experiential process of academic researchers who are advocates and active users of the participatory web within the context of their research practice. although narrative inquiry has often been questioned for its alleged subjectivity, given that it relies on research participants’ own conceptions of reality, it has also been praised for being a tool of empowerment and/or self-improvement that is perhaps less explicit in other methodologies (riessman, ; ). in wanting to access social realities through personal accounts i dwelled on the subjectivity debate: how could i devise a process through which i could collect, understand and interpret the professional trajectories of the research participants, i.e., their academic habitus, via their personal accounts within the limited timescale of the project? on the issue of subjectivity, conle ( ; ) suggests observing the habermasian principles of communicative action as they can provide narrative inquiry with the desired levels of research reliability. considering narrative inquiry as communicative action means challenging the narrator about the truth being told through their ability to truthfully account for their state of mind, emotions and motives in producing a coherent narrative. narrative inquiry thus becomes a process which, through different iterations, aims to establish a common understanding of the experiences narrated:

the goal of coming to an understanding [verständigung] is to bring about an agreement [einverständnis] that terminates in the intersubjective mutuality of reciprocal understanding, shared knowledge, mutual trust, and accord with one another. (habermas, , p. )

that is not to say that the mutual understanding of the narratives told is automatically translated into interpretation of the phenomenon being researched.
the conscious application of communicative action to narrative inquiry will rather result in the intersubjective rationality of the research. in other words, sharing a common understanding with my interlocutors (research participants) meant recognising the meaning they ascribed to their narratives before i submitted those understandings to the interpretative process, i.e., before i tried to grasp why they represented their professional world in the ways they did. in order to achieve the intersubjective understanding of the narratives i used digital technologies as a contemporary conduit of communication that can support, but which cannot on its own (re)produce the principles of communicative action. i used skype to interview the research participants and record their narratives of practices. i also made use of closed blogposts to share my understanding of their narratives with them as a form of providing them with an opportunity to confirm, enhance and/or rectify their accounts. further collection of data was conducted via email, as a form of eliciting short, written reflections of their practice in relation to the topic under research. the use of different online tools to record participants’ narratives enabled me to sustain the dialogue with the research participants during an extended period of time, thus allowing me to cross check the constancy of their accounts on the different platforms in which we interacted.

research participants were recruited following a purposive sampling technique, in order to access ‘information-rich cases for study in depth’ (patton , p. ). criteria were thus defined to select participants deemed representative of the phenomenon this research aims to study, i.e., digital scholarship (topp, barker, and degenhardt ). this meant that research participants:
1. were active researchers in an academic setting, that is, they held research contracts in higher education institutions.
2. were active users of the participatory web as part of their professional activity.
3. had a web presence, that is, their digital footprint was accessible online.

the empirical work consisted of eleven in-depth interviews with active researchers associated with higher education institutions in the uk, new zealand and south africa. of the eleven interviews, ten were used, because one of the research participants did not fully meet the research criteria outlined for the study. the research did not limit research participants to a country, disciplinary background, gender or age, because the focus of the research was not on these aspects, but rather on the habitus participants developed on the participatory web as a field that they all had in common. this can arguably be the greatest weakness of this research. yet, this particular focus enabled me to locate research participants’ voices within the context of their digital scholarly work, given that such practices are still emergent in academia and therefore atypical of any given disciplinary context.

the study followed an iterative process to collect and analyse the narratives of the research participants. a first pass of narrative interpretation was written up as closed blog posts and shared with the participant-narrators for their commentary and approval. a further six-month interaction with a selected number of research participants was conducted through email to deepen my understanding of their perceptions of professional identity. each interview started with a generic question about each participant’s professional background and “history of practice” as a form of positioning the participant within her/his narrative of experience and professional activity.
all interviews followed a spontaneous pattern of conversation as a form of providing the narrators with ownership of their narratives of practice. however, three themes were used to guide the personal narratives as a form of identifying research participants’ academic habitus: (1) dissemination of research and knowledge; (2) collaboration; and (3) professional identity. these themes helped create incisive narratives of practice, thus allowing me to explore how research participants’ professional identities are presented in both the contexts of a digital society and the institutions with which they are associated. i also kept a research diary throughout the entire research process, as a form of jotting down research participants’ reactions to our interactions and my own thoughts about the research process and the experience of conducting this research. once all the research data was collected and transcribed, i engaged in a process of content analysis from which i tried to identify themes related to professional identity and research participants’ sense of belonging and/or displacement in relation to their own online practices. using narrative inquiry, i was able to trace research participants’ experiences through the extraordinary shifts in academic practices caused by the exponential growth of the web, and capture how research participants deployed their professional identity in light of their newly acquired academic habitus. bourdieu too made use of life narratives (reed-danahay, , p. ) to explore and contrast the lived experiences within the contexts in which these were developed. revealing the power structures of the social contexts in which personal narratives developed was central to his studies (bourdieu, b). for this research, i build on the work of hooley ( ) who uses the reflexive sociology of bourdieu in narrative inquiry.
in doing so, the concept of habitus is “read in association with the narratives produced” (ibid, ) and the themes identified in order to get an understanding of how research participants’ online practices inform and/or transform the perceptions they have of their own professional identity. following bourdieu’s work ( ) - who throughout his career tried to reconcile practice and theory as interdependent entities - this research applies the concept of habitus as both an object and a means of inquiry (bourdieu, ; atkin, ; reay, ; wacquant ). in doing so, the concept of habitus was implicitly used to elicit the narration of career related events and practices in a chronological order leading up to participants’ current experiences with the participatory web for professional purposes. the result was research participants’ narrated reflections of the development of their own digital scholarly practices. renditions of how these practices are recognised or misrecognised by their academic peers were very prominent in their accounts.

findings

for this research, the accounts provided by the research participants are read as narrations of the self and (professional) identity. overall, the research narratives conveyed a strong sense of displacement between research participants’ professional practice and their personal and social trajectories as individuals and scholars attracted by current digital technological developments that promise to reinvent knowledge work. the result was a collective narrative of disjointed identities. this section accounts for the research participants’ perceptions of their own practice and where it places them in terms of professional identity. the bourdieuian lens is applied to illuminate the narratives and explain the implications of participants’ online and networked practices on the (re)presentation of their professional ‘self(ves)’.
deviant trajectories: reinvention of the self

the research analysis indicated a narrative thread that cut across most research narratives gathered for this project. it focused on the participants’ life circumstances that informed their career paths, and most likely challenged, and often shaped, their approach to practice, and consequently their professional identity. research participants often mentioned a “turning point” in their lives and careers that seems to have provided them with flexible approaches to practice and their careers in general, and not least their sense of professional identity. as ibarra ( ) asserts, professional identity is a dynamic concept that evolves with changes in one’s professional journey. the reported turning point was different in each narrative. yet, it raised awareness of their tendency to embrace change and to be flexible in their working practices. research participants reported:
- moving from industry or practice into academia, as was the case for anne, heidi, and luke;
- wishing to refocus their career from research into teaching, and thus carry out educational research as opposed to applied research, as represented in hector’s and luke’s narratives;
- changing countries (john and lucy);
- being embedded in a different team in a different country as part of a visiting fellowship (maria); or
- going through institutional changes and innovations (lucy and richard).

research participants’ narratives also outlined their strong support for the participatory web as an environment where academic researchers can exercise their creative and innovative spirit. on the web research participants can find and congregate with other scholars who share similar professional values independently of their geographic location: (...)
extended network and being able to share has been the big change in research for me (richard) the fostering of collaborative links, the sharing of experiences, and collective participation in open spaces are activities that partially summarise the ways in which research participants wish, and often do, conduct their research practice. endorsement of and participation in the open access movement became a pronounced example of such approaches: i think open access – that’s the way it should be to me – we do all this research and then put it into a journal that people read, and i’m not interested in that, if i’m going to do something i want as many people as possible to read it. (luke) i made a decision a couple of months ago that whatever publications i do that i follow t. a.’s lead and only publish in open access journals, and that has huge implications for me, because there aren’t many credible midwifery or nursing journals in the open access environment. (lucy) i want to change [my scholarly practice], because the open journals and the open form... is where knowledge is moving. (maria) research participants’ aspirations to transfer their ideals of digital scholarship to the institutional structures more often than not result in conflict with the rules imposed by the institution. these struggles do not deal so much with the potential of the participatory web in creating spaces for networking as they do in developing new forms of communication and dissemination of research outputs, as highlighted below: anything that i publish in [an open access journal] ...from the university’s point of view doesn’t count as research activity. (…) i am viewed – and this phrase has been used – i am viewed as a problem. (hector) this is due to the fact that collaboration as part of the research process is not a regulated research activity. dissemination of research, however, has a long tradition. 
in the case of this research, the publication of research in academic journals with established reputations no longer meets the expectations of those who make use of the participatory web as a new conduit of knowledge communication. research participants are avid proponents of the open access movement, that is, the idea of unrestricted online access to peer-reviewed scholarly research. they want to make their research accessible to a wider audience, they want to establish dialogues as part of their work, and they want to make a difference with and through the participatory web supported approaches they bring to academia: i think we’re in a different kind of researcher’s generation ... imagine the hippies in the s. we’re a group of people who think about these kind of things [digital scholarship] seriously, so we can make the difference. (maria) access to research participants’ nonlinear paths provided the research analysis with knowledge of their historical habitus and accounts of how they dealt with change in a flexible way. as pointed out by slay and smith ( ), professional identity is a process of adaptation that leads to the redefinition of the self in relation to the changes individuals embrace in their professional world. research participants’ narratives revealed that change has been a constant in their professional trajectories, thus leading to the understanding that they bring a rich set of dispositions from their past experiences that allow them not only to adapt to new situations but also to question them. bourdieu understood habitus as both personal traits and social trajectories (bourdieu, ) generative of a system of dispositions that is translated in the way individuals act in or react to a field. he also explored the notion of ‘deviant trajectories’ (bourdieu, b) as a form of transforming the field by challenging the power dynamics present therein.
the participatory web as a tool bridging the outside world with scholarly practice encourages deviant trajectories in that it stimulates the development of new approaches to scholarship and related epistemologies of practice. bourdieu has interpreted ‘deviant trajectories’ as a form of distinction. in the context of this research, however, such deviations from established practices do not yet seem to result in symbolic power able to transform collective practice; they rather seem to translate into the misrecognition of research participants’ scholarly activities with the participatory web. even though research participants are quick to adjust their scholarly practice to the imminent changes of the wider context in which they are placed, i.e., the digital economy, their symbolic position in the field determines the effectiveness of their efforts in bringing digital practices to academia. this consequently has an impact on how they perceive themselves and are perceived professionally. the next section will explore how the adoption of digital scholarship practices and beliefs shapes research participants’ sense of professional identity.

professional identities reflected: the effects of the participatory web on academic practice

research participants shared the perception that being active users of the participatory web allowed them to be seen as someone who is different in their area of practice. the quote provided below illustrates this perception of the self well: in terms of doing – using social media – you do see yourself as a bit more radical (…) someone who’s got a bit more forward thinking in some respects than a lot of other academics that you meet. (alex) professional identity as a form of self-conception (ibarra, ) is research participants’ way of distinguishing themselves from their immediate peers. it is also a form of declaring their own interests (du gay, ).
research participants view themselves not only as deviant practitioners, but also and above all as innovators who are embracing new practices; a fact that sets them apart from the majority of their academic peers. what is interesting to note here is that the perception of the self is classified in relation to the ‘others’ who follow a different approach. research participants were very vocal in expressing this perception as a crucial aspect of differentiation: when you start using social media... you sort of redefine yourself from somebody who doesn’t know anything about it. (alex) i’m definitely someone who’s breaking new ground (…) in terms of doing research in a different way. the way that i use technology is very much a defining difference between colleagues and i. (luke) the use of digital technology as part of their scholarship activity confers a sense of distinction that confirms their self-perception as pioneers of digital scholarship. such practices set research participants apart from the ‘mainstream’ scholars and allow them to translate such perceptions into a renewed sense of professional identity as ‘digital scholars’. however, in the background of such perceptions is the awareness of how they are perceived by those whom they describe as lagging behind in the adoption of technology for scholarly work, i.e., their immediate colleagues. identity as a social construction (jenkins, ) is a combination of self-perception and of being perceived. how others acknowledge or fail to acknowledge one’s practice is an act of recognition and/or misrecognition that confirms or discards one’s practice as valid, and worthwhile, in a given field. research participants elaborate on how they are perceived by their academic peers and, by default, academia: i know how others see me, they think i’m insane. they do look at me as some sort of eccentric, techie geek groupie type thing.
(heidi) if you were to take an average across the university i would be at one extreme end of that in terms of social media use, most people are basically non-users. (hector) participants’ adoption of the participatory web as part of their scholarly work has created a new sense of identity which, as stated in the quotes above, is not shared by their immediate colleagues. however, it is shared by their online peers: i’ve got a global network of people that are interested in working broader. (heidi) i’ve got an attitude that’s quite different from many of my immediate colleagues – let’s put it that way – so having this on my network with people who don’t feel that different from me, is an extremely important means of external validation. (hector) the narratives of experience featured in this study illustrate that research participants’ practices set them apart from the traditional academic. such competing perceptions of how research participants’ digital practices are viewed and acknowledged, or not, by two distinctive fields generate an internal conflict regarding the legitimation of their approach. this, in turn, impacts on their perception of professional identity. in the context of this study, research participants are positioned between these two opposed worlds: the participatory web that supports an informal intellectual sphere and academia that provides a formal structure in which academic work is validated. research participants’ adoption of the participatory web as part of their scholarly work encourages them to question the academic order by re-defining what they do and who they are professionally. by the same token, their digital scholarly practices lead others (their peers) to re-consider how they view them within academia’s structure.
outcasts on the inside: sense of isolation

as seen above, participants in this study see their use of the participatory web in the context of their scholarly practice as a distinguishing factor when compared to the majority of their colleagues. their use of digital technology becomes a trait that sets them apart from the majority of their peers. in the context of their academic position, this distinction, i.e., their deviant trajectories of practice, creates a pronounced sense of isolation. this is emphasised in research participants’ accounts: i would probably position myself as an outcast (john). i’m working fairly much in isolation as far as that’s concerned. sometimes i do feel quite isolated (hector). distinction and isolation are the two main themes that characterise research participants’ sense of identity with regards to their digital scholarship activity in the context of their academic practice. if, on the one hand, research participants’ digital activity aims to provide evidence of innovation of academic work with novel approaches and tools, on the other hand it denounces the power of the academic field as an antagonistic force able to generate a sense of professional displacement through the institutional habitus that it cultivates and aims to impose: there are very few people who get what i do. technology has very much had an impact on setting me apart from any of my colleagues, but i think even more than that is how i think about research and how i think about scholarship as a result of using the technology. (luke) by participating in online environments and within distributed networks, individuals are exposed to different ways of thinking and conducting their scholarly activity. this has an effect on the way they approach practice. it also (re)defines who they feel they are, not only in relation to their practice, but also with regards to the practices their peers carry out in conformity with the field of academia.
moreover, it separates them from the colleagues who are not yet engaged in digital scholarship activities. yet, the struggle generated by two conflicting habitus does not seem to lead to a simple re-adjustment of individuals’ dispositions to the field that arguably holds the most symbolic power, i.e., academia and its formal mechanisms to validate scholarly work; it rather generates a marked differentiation of practices and approaches to scholarly work, as the participatory web also enjoys a growing reputation, not for tradition as the latter does, but for innovation. as pointed out in the literature review of this article, the use of the participatory web in scholarly environments provides an opportunity to explore new forms of scholarship (weller, ; conole, ). however, very few studies have looked at how the participatory web can disconnect the individual from their local work environment, because of the duality that it creates between tradition and innovation. as the participants of this study hinted, the participatory web can have an isolating effect in that their participation online makes their approach to practice so distinct from the practice of those who are not part of the same online networks that they no longer identify themselves with the practices carried out at their institutions. this stems from their changing perception of professional self as scholars making a difference in academia through the use of digital technology. this conception of the self was often reiterated in research participants’ narratives. the deeper research participants went into their narratives, the more pronounced was the dissonance between institutional and individual perspectives of practice and the power struggles to which research participants are subjected because of their outlook on practice.
habitus produces practices and representations of practice that are susceptible to classification (bourdieu, ; ). digital scholarship as an emergent practice does not yet enjoy a highly rated classification, thus characterising research participants’ digital practice as deviant trajectories resistant to the norms of academia. although research participants embody a distinctive identity, as digital scholars, they are only partially esteemed for it, because what the field of academia aspires to is the establishment of a homogeneous habitus, one that it can recognise as its own rather than one that questions its ordinary norm. yet, this sense of displacement is cancelled when individuals situate their practice in the field of the participatory web which informs their changing academic habitus. an individual exists within a socially constructed space of identity and she/he is perceived according to the principles that define both the field and the individuals that share that common social space. identity is thus translated not only into a sense of belonging (stuart et al, ), but also into a form of recognition of the traits that characterise a given group (bourdieu, ). the opposition of the field of academia to deviant habitus results in a sense of displacement that seems to give prominence to individuals’ divergent scholarly dispositions through perceptions of differentiation. although this distinction might not, in the context of academia, result in the aspired merit and reputation of research participants as digital scholars, such deviant trajectories ‘(…) are undoubtedly one of the most important factors in the transformation of the field of power.’ (bourdieu, b, p. ). this is so because the participatory web, as a social field, informally supports and absorbs participants’ changing habitus as its own, thus partially compensating for what the field of academia fails to recognise.
habitus is thus more than a tool of reproduction; it can also be an instrument of change. this is evident in how research participants question the institutional habitus with the habitus they develop in and carry from their online knowledge networks. the space created between the oppositions of the two fields becomes the locus of research participants’ sense of (disjointed) identity.

discussion and conclusion

habitus, as an individual’s or group’s embodied system of dispositions, can match or differ from the social field, and sometimes even resist it. individuals belonging to numerous social spaces can see their habitus aligned to or in conflict with the different fields of action in which they co-exist and in which their practice is contextualised and validated. depending on the field, the result can either be a sense of recognition or misrecognition. the harmonious relationship between individuals’ habitus and the field in which they operate tends to translate into a sense of identification between the individual and that social space; of habitus and field becoming an indistinct social phenomenon. however, the dispositions individuals develop in one field are not necessarily absorbed by another distinctive field. the difference between the field and the habitus individuals bring to it leads to the misrecognition of practice and consequently a ‘cleft habitus generating all kinds of contradictions and tensions’ (bourdieu, , p. ). the ambivalence between the university world and research participants’ intellectual journeys results in a disjointed sense of identity and a predisposition to symbolic revolutions.
according to their narratives of practice, research participants are caught between a habitus that leans towards digital practices and a field that prefers to follow academic traditions, i.e., they are torn between what they perceive to be innovative practices that renew the meaning of their activity and the conventional rules of academia that they see as stifling their novel approaches to scholarly work. the incongruence between habitus and field leads to a strong perception of misrecognition of digital scholarship and digital scholars inside academia. yet, the misrecognition of digital scholarship in the field of academia is balanced with the informal recognition of such practices in the field that produces them, i.e., the participatory web. this can lead to the acknowledgment and perception of the ‘self’ as a digital scholar in times when academia struggles to reinvent itself in light of social, cultural, political, economic and technological developments typical of contemporary society. such a ‘critical crisis’, bourdieu alleges, is a turning point likely to transform practice and the social and professional identity perceptions associated with it as structures and dispositions come into disruption (bourdieu, ; ). in this sense, i suggest that habitus is not necessarily always defined in relation to the field, as proposed by adams (see , p. ), but rather made apparent via a given field, as social agents’ social and professional trajectories occur simultaneously across different fields. this is made obvious through the agreement or disagreement of individuals’ dispositions with the structures of the social spaces in which their practices are materialised.
nonetheless, adams (ibid) is right to assert that a field’s assimilation or rejection of individuals’ dispositions – of the field imposing itself as habitus - can respectively result in a sense of belonging or disconnection with the norms of that space of practice. this then opens up space for reflection about the discrepancy between individuals’ habitus and the field that officially substantiates their academic practice. the difference between habitus and field can produce changes, but such changes can only be effective in so far as social agents remain relevant in the field they aim to transform, i.e., hold symbolic positions that allow them to promote their habitus as field. yet, if social agents’ habitus can find recognition in another field, their changing sense of professional identity is more likely to be challenged than cancelled by the field that contradicts it. this opens scope for future change through the introduction of an ‘outside’ habitus. this is what research participants hinted at when they declared that they wanted to make a difference with their digital practices. indeed, the participatory web is known for triggering a number of changes in social practices that have repercussions on individuals’ perceptions of their professional identity. professional identities, as a social construction, are determined by a sense of distinction, and such ‘difference is asserted against what is closest, which represents the greatest threat.’ (bourdieu, , p. ). in this research, the participatory web is characterised as an instrument of change, and in this sense, as both a promise and a threat to reinventing academia and its agents. these competing perceptions result in a digital divide, not in relation to the accessibility of digital technology, but rather to a shared logic of academic practice; a new mind-set (own author anonymised for review purposes, ).
this impacts on how individuals perceive themselves and their peers professionally as either digital scholars or non-digital scholars, innovators or tradition followers, game changers or conformists. this discrepancy between field and habitus can affect how academics embracing digital scholarship perceive themselves and are perceived in the field of academia and by the social agents that interact therein. yet, the struggle for imposing or changing the dominant habitus is not only one of reproduction, but also one of transformation of practices. research participants want to reproduce the habitus acquired on the participatory web in the field of academia with the purpose of reforming it. in ‘homo academicus’ ( ), bourdieu reported on the opposition between new means of mass production and diffusion of cultural goods – at the time, typified especially by the radio and television – and the traditions shared by the academy. similarly to the mass media, the participatory web could also be said to trigger ‘an anti-institutional mood, constituted essentially by their ambivalent relationship with the university’ (ibid, p. ) in that it disturbs the ‘ordinary order’. the difference between traditional mass-media and the participatory web is however defined by who holds the power to publish and communicate scholarly knowledge. the shift is no longer from one field to another, but from the institution to the individual. ultimately, it is this power shift against which or for which field and habitus are respectively fighting that fragments individuals’ sense of professional identity. in conclusion, research participants featured in this research are embedded in a social space that generates atypical academic practices. their participation on the participatory web and knowledge networks available therein endows research participants with a very different system of dispositions that prompts them to question the practices promoted in academia.
this contrasting habitus impels them to break with the established order in an attempt to ‘(…) defend their own interests’ (bourdieu, , p. ) and try to question academia with the properties that constitute the social identity they have developed online. what the opposition between routine and innovation, between academia and the participatory web as two distinctive fields, does is to denounce the monopoly of academic knowledge production with which research participants no longer identify themselves. in doing so, they aim to transform the academic field with a new logic of practice that reflects their new academic habitus, and consequently their professional identity.

references

adams, m. ( ). hybridizing habitus and reflexivity: towards an understanding of contemporary identity? sociology, ( ), – . doi: . /
atkin, c. ( ). lifelong learning-attitudes to practice in the rural context: a study using bourdieu’s perspective of habitus. international journal of lifelong education, ( ), – . doi: . /
becher, t. ( ). academic tribes and territories: intellectual enquiry and the cultures of disciplines ( st ed.).
bourdieu, p. ( ). outline of a theory of practice. cambridge university press.
bourdieu, p. ( ). distinction: a social critique of the judgement of taste ( edition.). routledge.
bourdieu, p. ( ). homo academicus. stanford university press.
bourdieu, p. ( ). social space and symbolic power. sociological theory, ( ), – .
bourdieu, p. ( ). language and symbolic power. (g. raymond & m. adamson, trans.) (reprint edition.). cambridge, mass.: harvard university press.
bourdieu, p. ( a). practical reason: on the theory of action. stanford university press.
bourdieu, p. ( b). the state nobility: elite schools in the field of power. stanford university press.
bourdieu, p. ( ). the weight of the world: social suffering in contemporary society. stanford university press.
bourdieu, p. ( ).
science of science and reflexivity. cambridge: polity.
burdick, a., & willis, h. ( ). digital learning, digital scholarship and design thinking. design studies, ( ), – . doi: . /j.destud. . .
conle, c. ( ). the rationality of narrative inquiry in research and professional development. european journal of teacher education, ( ), – .
conle, c. ( ). practice and theory of narrative inquiry in education. in m. murphy & t. fleming (eds.), habermas, critical theory and education. london: routledge.
conole, g. ( ). chapter: new approaches to openness - beyond open educational resources - cloudworks. retrieved from http://cloudworks.ac.uk/cloud/view/
cronin, b. ( ). scholarly communication and epistemic cultures.
dewey, j. ( ). experience and education. kappa delta pi.
du gay, p. ( ). organizing identity: persons and organizations after theory. sage publications.
fry, j. ( ). studying the scholarly web: how disciplinary culture shapes online representations. cybermetrics: international journal of scientometrics, informetrics and bibliometrics, ( ). retrieved from http://www.cindoc.csic.es/cybermetrics/vol iss .html
fry, j., lockyer, s., oppenheim, c., houghton, j., & rasmussen, b. ( ). identifying benefits arising from the curation and open sharing of research data produced by uk higher education and research institutes. retrieved from http://ie-repository.jisc.ac.uk/ /
hall, r. ( ). revealing the transformatory moment of learning technology: the place of critical social theory. research in learning technology, ( ), – . doi: . / . .
harley, d., acord, s. k., earl-novell, s., lawrence, s., & king, c. j. ( ). assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines, (january), .
henkel, m. ( ). academic identity and autonomy revisited. in i. bleiklie & m. henkel (eds.), governing knowledge (pp. – ). springer netherlands.
retrieved from http://link.springer.com/chapter/ . / - - - _
henry, g., baraniuk, r. g., & kelty, c. ( , july ). the connexions project: promoting open sharing of knowledge for education. retrieved from http://hdl.handle.net/ /
hooley, n. ( ). narrative life: democratic curriculum and indigenous learning. springer.
ibarra, h. ( ). provisional selves: experimenting with image and identity in professional adaptation. administrative science quarterly, ( ), – . doi: . /
jenkins, r. ( ). social identity ( rd ed.). routledge.
kemp, b., & jones, c. ( ). academic use of digital resources: disciplinary differences and the issue of progression revisited. educational technology & society, ( ), – .
kim, t. ( ). transnational academic mobility, knowledge, and identity capital. discourse: studies in the cultural politics of education, ( ), – . doi: . / . .
king, a. ( ). thinking with bourdieu against bourdieu: a ‘practical’ critique of the habitus. sociological theory, ( ), – . doi: . / - .
miller, a. n., taylor, s. g., & bedeian, a. g. ( ). publish or perish: academic life as management faculty live it. career development international, ( ), – . doi: . /
northcott, d., & linacre, s. ( ). producing spaces for academic discourse: the impact of research assessment exercises and journal quality rankings. australian accounting review, ( ), – . doi: . /j. - . . .
patton, m. q. ( ). qualitative evaluation and research methods (second edition.). sage publications, inc.
pearce, n., weller, m., scanlon, e., & kinsley, s. ( ). digital scholarship considered : how new technologies could transform academic work. article. retrieved may , from http://dro.dur.ac.uk/ /
priem, j., & hemminger, b. h. ( ). scientometrics . : new metrics of scholarly impact on the social web. first monday, ( ). retrieved from http://pear.accc.uic.edu/ojs/index.php/fm/article/view/
qualman, e. ( ). socialnomics: how social media transforms the way we live and do business. john wiley and sons.
reay, d. ( ).
‘it’s all becoming a habitus’: beyond the habitual use of habitus in educational research. british journal of sociology of education, ( ), – . doi: . / reed-danahay, d. ( ). locating bourdieu. indiana university press. riessman, c. ( ). narrative analysis. in m. s. lewis-beck, a. bryman, & t. f. liao (eds.), the sage encyclopedia of social science research methods (illustrated edition.). sage publications, inc. riessman, c. ( ). narrative methods for the human sciences. sage publications, inc. schneckenberg, d. ( ). web . and the empowerment of the knowledge worker. journal of knowledge management, ( ), – . doi: . / slay, h. s., & smith, d. a. ( ). professional identity construction: using narrative to understand the negotiation of professional and stigmatized cultural identities. human relations, ( ), – . doi: . / http://dro.dur.ac.uk/ / this is an accepted author manuscript” (aam) (also known as the “author post-print”) stuart, m., lido, c., & morgan, j. ( ). personal stories: how students’ social and cultural life histories interact with the field of higher education. international journal of lifelong education, ( ), – . doi: . / . . talib, a. a. ( ). the continuing behavioural modification of academics since the research assessment exercise. higher education review, ( ), – . veletsianos, g., & kimmons, r. ( ). networked participatory scholarship: emergent techno- cultural pressures toward open and digital scholarship in online networks. computers & education, ( ), – . wacquant, l. ( ). homines in extremis: what fighting scholars teach us about habitus. body & society, x . doi: . / x weller, m. ( ). the digital scholar: how technology is changing academic practice. bloomsbury publishing plc. wellington, j., & torgerson, c. j. ( ). writing for publication: what counts as a ‘high status, eminent academic journal’? journal of further and higher education, ( ), – . doi: . / wilkinson, d., harries, g., thelwall, m., & price, l. ( ). 
motivations for academic web site interlinking: evidence for the web as a novel source of information on informal scholarly communication. journal of information science, ( ), – . doi: . / op-llcj .. omeka in the classroom: the challenges of teaching material culture in a digital world ............................................................................................................................................................ allison c. marsh university of south carolina ....................................................................................................................................... abstract an often-overlooked challenge to the field of digital humanities is the lack of interest in the discipline by young scholars. despite being the so-called digital natives, many of my museum studies students have no desire to engage with new technology. this short paper is a snapshot of my attempt to teach digital curation and online exhibit development within the framework of my material culture seminar. it analyses years of student exhibits developed using omeka and points to new directions the project will go over the next several years. i began the research project by asking the following questions of my students: what does material culture look like on the web? how do you curate it? how does the public interact with virtual objects? what is the relationship between virtual and phys- ical museum artifacts? however, after seeing the students struggle with basic web development, i expanded my own research questions to include: what skills do emerging professionals need and how can we integrate technical training into an academic program? ................................................................................................................................................................................. for presenters and attendees at dh , there is no need to sing the praises of the digital world. 
we are the early adopters, the converted, the evangelists. but our colleagues across the humanities are not yet entirely convinced, and of more concern to me, neither are the students. as always, i enjoyed the papers at dh and was inspired by many of the fabulous projects. this paper goes in the opposite direction. i direct the museum studies track of the masters in public history at the university of south carolina, one of the oldest public history programs in the country. it is a nationally competitive program, and our graduates have an impressive placement record: the smithsonian; the national park service; federal, state, and local government. and yet, since i joined the faculty years ago, i have been shocked that the students, the so-called digital natives, have little interest in the digital world as part of their professional training. they may communicate with each other using facebook, share photos on flickr, or post to their personal blogs, but when it comes to coursework they expect, and sometimes demand, a traditional graduate seminar where we read and discuss books. more than one student has balked at my assignments, whining, ‘i don’t need to learn how to program. i just want to be a regular historian.’ unyielding in my persistence, i argue back that it is no longer an option. wikis, blogs, and tweeting are everyday realities for museum professionals. at the very minimum, all curators and collections managers need to have a basic understanding of database architecture in order to structure their object databases and construct useful queries. more importantly, two decades of digitization have created new questions for curators of three-dimensional objects: what does material culture look like on the web? how do you curate it? how does the public interact with virtual objects? what is the relationship between virtual and physical museum artefacts? i want my students to struggle with these questions, and this paper is a short excursion into the trial-by-fire approach of my material culture seminar.

correspondence: allison c. marsh, department of history, university of south carolina, gambrell hall, columbia, sc , usa. e-mail: marsha@mailbox.sc.edu
literary and linguistic computing © the author . published by oxford university press on behalf of allc. all rights reserved. for permissions, please email: journals.permissions@oup.com
doi: . /llc/fqs
literary and linguistic computing advance access published january ,
downloaded from http://llc.oxfordjournals.org/ at new york university on march ,

each fall i teach hist : material culture studies, the foundational graduate seminar for the museums track in our masters program. on the first day of class, i ask the students to bring in five objects that describe either themselves or their research interests. i tell them to choose wisely, as they will be using those objects every week for the entire semester, but otherwise i give no guidance on object selection. the objects serve multiple purposes throughout the semester, but most importantly they are part of a larger project to create a virtual object database that represents the changing attitudes towards material culture in the digital age. as a final project, each year the students must create an online exhibit drawing from the objects in the database; they can select from the objects of both their classmates and the students of previous years. clearly, each year the number of objects in the database increases, and the distance from the early contributors becomes greater. i am in year of what i anticipate to be a decade-long study, and this short paper is designed to give preliminary results. because this course is part of a -year master’s degree, this is the first year where the students do not have direct access to the owners of objects from previous classes.
i have chosen omeka as the platform for this assignment. although i am well aware of the limitations, as well as the potential, of the open-source software, omeka has a low barrier for entry. omeka was developed by george mason university’s roy rosenzweig center for history and new media (chnm) with non-IT specialists in mind. chnm’s goal was (and continues to be) to provide museum and library professionals with a tool that allows them to concentrate on content and interpretation without worrying about programming. personally, i am concerned that by using a black box application, my students do not fully understand the implications of engaging with the virtual world. however, that is one of the compromises i have made in order to encourage budding historians to get their toes wet in the digital arena.

for their final assignment, students must create records for each of their objects, which includes uploading images, entering dublin core metadata, tagging objects with keywords, and writing short descriptions. the students then must curate their own exhibit, either by using one of the theme templates provided by chnm or by creating their own. the open-source software allows students who are more skilled or interested in web design to create more elaborate exhibits.

so far the results of the online exhibits have been mostly disastrous. as a whole, the exhibits are terrible (available at http://hist .cas.sc.edu). they have clunky navigation, lack any elegance in design, and often are just plain boring. one example from an english ma student shows her reluctance to incorporate images into her text: http://hist .cas.sc.edu/exhibits/show/methods-of-memory. if you click on a link to one of the objects (http://hist .cas.sc.edu/exhibits/show/methods-of-memory/medieval-rememberings), you can see why. she was unable to figure out how to crop and centre her image.
in fact, simply getting students to upload photos that are in focus seems to be problematic (http://hist .cas.sc.edu/exhibits/show/old-stuff/what-kind/heirlooms). a few of the students did manage to accomplish a storyline, for example http://hist .cas.sc.edu/exhibits/show/bringing-paper-to-life/, although the result is still not a particularly compelling online exhibit. after clicking on only a few of the exhibits, the dual challenges of the assignment become clear: (1) students need more training on basic digital tools and (2) students need more training on effective narration/curation/storytelling in an online exhibit. too often students were trying to translate something that would work well as a paper into something that was disastrous online.

in many ways, the exhibits are proof of my distrust of black box software for developing online exhibits and are an indicator that anyone who wants to engage seriously in the virtual world needs significantly more training (either formally or informally) than a few hours of online tutorials can provide. more generously, these online exhibits are often the first experience students have in curating, and so one of the assignment’s goals is for students to gain skills in developing effective narrative techniques (useful in both physical and digital curation). in assessing their work, it is important to be mindful of the learning process; remember that they are professionals in training, and they should not be judged on their first attempt but rather on the progress they achieve by the time they graduate.

however, as a pedagogical device, the assignment has been tremendously successful. by working through the process of creating an online exhibit, the students naturally confront the many epistemic questions relating to the use of physical objects in a virtual environment. students immediately recognize the diverse challenges of working in the digital format, from the pedestrian, such as how to search for items when a previous user failed to enter appropriate metadata, to the substantial, such as questioning the ethics of using an object as a metonym in an exhibit that is antithetical to the physical object’s authenticity.
my goal for the assignment is not for students to become master web designers, but for them to engage in the questions confronting digital curation.

although i could go on at length about the implications of this ongoing assignment, i want to offer questions, rather than conclusions, on the joint challenges of curating digital resources and the role of digital humanities in the academic curricula: how are universities training the next generation of museum professionals who will have to confront digital curation? what are effective teaching techniques? what skills do museum professionals believe graduating students should have? how do professors balance the need to provide theoretical training in how to read and interpret material culture while fostering the development of technical skills in an ever-changing digital landscape?

at breakfast on the very first day of dh , i happened to sit down with two developers for omeka. after hearing about my project, they asked: ‘but did it work?’ the answer, of course, is all in what you choose to assess. the object database is a mess. metadata is just about non-existent. the exhibits are lousy. everyone involved (both the students and myself) is frustrated. but at the end of the course, although there was still a lot of whining, there was also some recognition of particularly digital difficulties. one student brought in the program for the upcoming ncph annual conference (the main professional organization in my field) and said, ‘dr. marsh, did you know that george mason was hosting a half day workshop on omeka?’ sighing, i replied that my coursework was in fact designed to be useful, even if they did not see it that way.

for the more curious students (dare i say the better students), slogging through my assignment made them want to improve omeka. they recognized the need and potential for an open-source database that accepted three-dimensional objects, audio, and video.
they wanted to challenge the text-based digital humanists to consider the difficulties of encoding images and sound. they saw the opportunity for a tool to create online exhibitions with ease, useful for the small museums and historical associations that lack IT departments.

i now have a group of students and recent alumni who, after assessing the various free applications available, decided omeka was the best of the lot and are currently using it in a collaborative project with the smithsonian national museum of american history (www.olympiamills.cas.sc.edu). the group is engaged in ‘active reflection’ on the exhibition development process (http://developingele.wordpress.com/). they are keeping track of the difficulties that they, as novices, encounter. they are writing concise how-to guides that they call learning modules for each step of the project. they are working to educate their peers on how to take small steps into the digital world.

so did it work? the answer is a resounding ‘kind of.’ my presentation at dh was not an awe-inspiring game changer, but it elicited laughs, nods of agreement, and empathy from all of the professors and museum professionals in the audience. discussion overflowed into the hallways as numerous people came up to me with similar stories. considering the enthusiasm with which my paper was received, i am somewhat (although not altogether) surprised that there were few similarly styled papers at the conference, and even fewer in print publication, that outlined the pedagogical problems associated with digital humanities.
as a field, i believe we should be more cognizant of what is not working and openly discuss what to do about it (were, ).

in her closing plenary for dh , melissa terras correctly acknowledged the importance of digital presence and identity for the discipline of dh itself. in the case study she used, transcribe bentham, she remarked on her luck to have a phd student, rudolf ammann, who is also a gifted graphic designer. she then went on to outline the challenges of dh jobs, whether they should require a phd, and problems of unemployment or underemployment for young scholars (terras, ). what was missing was any discussion of the need for graphic design as a component in the curriculum of current young scholars. the establishment of new ma programs in digital humanities will help strengthen the discipline of dh, but i am interested in how we reach young scholars who just want to be ‘regular’ humanists. how do we convince the digital natives who have no interest in digital scholarship that metadata, graphic design, and database architecture are becoming requirements for all scholars?

my research project, which started out as an inquiry into changing attitudes about material culture, has become a longitudinal study into the technical skills of students who cycle through usc’s ma in public history. i have not changed the assignment in my material culture seminar, but i have added a more robust assessment framework than the usual course evaluation. before the first day of class, i administer a front-end evaluation of students’ technical skills and their perception of what skills are necessary to become a successful museum professional.
at the end of the course, after their battle with omeka, i repeat the survey. i follow up with the students during their job search, to document what skills are in demand, and then again a year later to find out what skills they are using on the job.

additionally, i have made overtures to both the graphic arts department and the library school to see if we can develop some interdisciplinary, inter-college collaborative projects where students from several fields struggle with the difficulties of multiple disciplines. i hope the library school students can help rationalize our metadata fields as part of their digital archives course; i hope the graphic design students can bring their aesthetic eye and talent to help our sad web pages; i hope my own history students will be able to work on developing strong content, while learning how to navigate the digital world.

this short paper is a snapshot: year of an anticipated -year project. it shows how the questions asked over time change as the project takes shape. in a few more years, i plan on writing the long paper, perhaps even two long papers: one on material culture in a virtual world and one on the curricular needs and job skills for emerging professionals in the twenty-first century.

references

terras, m. ( ). present, not voting: digital humanities in the panopticon: closing plenary speech, digital humanities . literary and linguistic computing, ( ): – .
were, g. ( ). out of touch? digital technologies, ethnographic objects and sensory orders. in chatterjee, h. (ed.), touch in museums. oxford: berg, pp. – .

notes

. http://omeka.org/about/. the omeka showcase has rich examples of the power of the tool to create beautiful websites that combine online exhibits, archives, and collections management. my students found the showcase to be a double-edged sword.
on the one hand, they were inspired by the possibilities, but on the other hand, it highlighted their frustrations with their own lack of web development skills. in many ways, they found the showcase to be misleading because of the difficulty first-time users have in creating similar sites.
. my assignment is in no way a unique experiment. professors across the humanities are having students create online exhibits using a wide variety of formats and software. the learning objectives are diverse, depending on the course and the discipline. for example, shannon mattern at the new school has students in her media+materiality course develop online exhibits using wordpress: http://www.wordsinspace.net/media-materiality/ -spring/. graeme were has anthropology students examine digital objects in a virtual learning environment and create exhibits within the moodle platform.

nordic wittgenstein review ( )
from the archives

david stern
david-stern@uiowa.edu

the university of iowa tractatus map
http://tractatus.lib.uiowa.edu/

abstract

drawing on recent work on the nature of the numbering system of the tractatus and wittgenstein’s use of that system in his composition of the prototractatus, the paper sets out the rationale for the online tool called the university of iowa tractatus map. the map consists of a website with a front page that links to two separate subway-style maps of the hypertextual numbering system wittgenstein used in his tractatus. one map displays the structure of the published tractatus; the other lays out the structure of the prototractatus.
the site makes available the full text of the german original and the two canonical english translations. while we envisage the map as a tool that we would like a wide variety of readers to find helpful, we argue that our website amounts to a radically new edition of wittgenstein’s early masterpiece, with far-reaching implications for the interpretation of that text. in particular, we claim that our visually compelling presentation of the book’s overall structure delivers on wittgenstein’s cryptic claim in a letter to his publisher that it is the numbers that “make the book surveyable and clear”.

david stern, cc-by. http://www.nordicwittgensteinreview.com/

. the numbering system of the tractatus

the university of iowa tractatus map project arose out of the discussion in my fall graduate seminar on the philosophy of ludwig wittgenstein. we spent several weeks looking at the recent debate over how to read wittgenstein’s tractatus logico-philosophicus, and much of our discussion focused on the question of how best to understand its structure, and the significance of its numbering system. the tractatus is very short, but it is also extraordinarily concise, and an intricate numbering system is used to indicate the relationship between its highly condensed parts. wittgenstein later said that every sentence should be seen as a chapter heading.

even though the tractatus has generated an enormous secondary literature, there is no scholarly agreement about even the most elementary exegetical matters. (for a brief history of tractatus interpretation, see stern .)
for the last twenty years, the principal focus of interpretive debate has been between supporters of a “traditional” argumentative reading, on which the book is construed as arguing for a systematic conception of the nature of language, logic, and representation, and “resolute” readers who contend that the traditional reading misses the whole point of the book: once we understand the author, we will see that the book – and the argument presented within – is nonsense. however, a few scholars on both sides of that divide have advocated a new way of reading the book, one that challenges a basic assumption that has previously been taken for granted by almost all interpreters on both sides of this debate, an assumption so obvious that it was very rarely explicitly articulated, and had seemed to need no defense (bazzocchi , , a, ; hacker ; kuusela ). the assumption in question is that the book should be read sequentially, from beginning to end. in other words, until very recently almost all readers have presupposed that one should start at the first sentence on the first page and end at the last sentence of the last page. the new alternative that has been proposed is that the book should be read as a hypertext, a tree-structure defined by the author’s numbering system.

given that the tractatus was written during the first world war, and published in , and the term “hypertext” was first coined around , the proposal that we approach that book as a hypertext is bound to seem incoherently anachronistic at first sight. indeed, if one relies on the wikipedia definition of hypertext as “text displayed on a computer display or other electronic devices with references (hyperlinks) to other text which the reader can immediately access, or where text can be revealed progressively at multiple levels of detail” (https://en.wikipedia.org/wiki/hypertext), it follows that the very idea of hypertext presupposes the existence of electronic computers.
however, in a broader sense of the term, a hypertext is any non-linear text, any text “which contains links to other texts” (https://www.w .org/whatis.html). the decimal numbering systems of the prototractatus and tractatus work in precisely that way: each remark begins with a number which indicates its relationship to those remarks above, below, or neighboring it in the tree structure which connects those remarks.

the tractatus consists of a series of numbered remarks, arranged in numerical order. the top level ones are numbered to ; decimal numbers are used to indicate the structure of the supporting paragraphs. when wittgenstein’s publisher asked him if he would be willing to give up the decimal numbering he replied categorically: “the decimal numbers of my remarks absolutely must be printed alongside them, because they alone make the book surveyable and clear, and without this numbering it would be an incomprehensible jumble” (wittgenstein, letter to von ficker, december , translation from hacker , ). a footnote attached to the first remark in the tractatus explains the numbering system as follows:

the decimal numbers assigned to the separate remarks indicate the logical weight of the remarks, the stress laid on them in my exposition. the remarks n. , n. , n. , etc., are comments on remark no. n; the propositions n.m , n.m , etc., are comments on the remark no. n.m; and so on. (tlp, p. , my translation)

most interpreters have either ignored these instructions, or failed to understand their significance. the majority of tractatus interpreters pass over them in silence. however, the footnote makes it quite clear that the numbering system has the structure of a logical tree, or a quite specific kind of hypertext.
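the "comments on" relation set out in the footnote can be sketched in a few lines of python. this is only an illustration under stated assumptions, not part of wittgenstein's text or of the tractatus map's code: the function names `parent` and `ancestors` are hypothetical, and the rule that padding zeros are skipped (so that a remark such as 2.01 counts as a comment on 2, since no remark numbered 2.0 exists) is an assumption drawn from the usual reading of the numbering rather than from the footnote itself.

```python
from typing import List, Optional


def parent(number: str) -> Optional[str]:
    """Return the remark that `number` comments on, or None for a whole number."""
    if "." not in number:
        return None  # whole-numbered remarks sit at the top of the tree
    whole, decimals = number.split(".")
    # Drop the last digit (n.m1 -> n.m), then any padding zeros (assumption).
    decimals = decimals[:-1].rstrip("0")
    return f"{whole}.{decimals}" if decimals else whole


def ancestors(number: str) -> List[str]:
    """Return the chain of remarks from the immediate parent up to the whole number."""
    chain = []
    current = parent(number)
    while current is not None:
        chain.append(current)
        current = parent(current)
    return chain


if __name__ == "__main__":
    for n in ["1", "1.1", "1.11", "2.01", "2.0121"]:
        print(n, "comments on", parent(n))
```

keeping the numbers as strings rather than floats is deliberate: the number of digits carries the structural information, and a float would silently discard trailing digits and padding.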
recent research on the origins of the tractatus has shown that wittgenstein relied on the numbering system to organize his work on an earlier manuscript draft of the book (wittgenstein, ms ), rearranged and published in numerical order as the prototractatus, and that the hypertextual grouping of remarks is crucial for an understanding of his work on constructing and rearranging the text.

. earlier maps – digital and others

earlier attempts to construe the numbering system of the tractatus that did not dismiss it out of hand, a tradition that goes back to de laguna’s review of the book ( ) and black’s companion to wittgenstein’s tractatus ( , ), either concentrated on the hierarchical, or parental, relationship between remarks that comment on one another, or looked for an arcane system hidden behind the numbers. in other words, they focused on the relationships between remarks such as n and n.m, and between n.m and n.m , and so on, to use the terminology wittgenstein introduced in his footnote to remark . there are a few exceptions to this general rule; the most striking and detailed is mayer ( ), who concentrates on the origins of the numbering system. both mayer ( ) and gibson ( ) provide helpful surveys of previous interpretations.

what the usual approaches to the numbering system of the tractatus overlook is that wittgenstein also draws our attention in that footnote to the sibling relations between remarks at the same level on the tree with a common parent, such as n. , n. , n. etc., and n.m , n.m , etc. indeed, it is these series of sibling remarks that he characterizes as comments on the remark at the next level up. the series of remarks that go together to form the series of comments that wittgenstein describes in his introductory footnote (remark n and the series of comments n. , n. , n.
; remark n.m, and the series of comments n.m , n.m ) are usually interspersed among other remarks, and are often on different pages of the printed text. as a result, it is very difficult to read the remarks in the order defined by the hypertextual numbering system while working with the traditional printed text. so a leading rationale for the design of any tractatus hypertext is to bring out the importance of these connections between remarks that form part of a single branch of the logical tree.

(see mcguinness , ; mayer ; bazzocchi , , , a; hacker , pilch , .)

kraft ( , , fn ) has argued that this attention to two kinds of relations between nodes shows that the tree reading of the tractatus is “not a tree in the logical or mathematical sense.” the reason he gives is that mathematical trees contain only one type of relation between nodes, while bazzocchi’s and hacker’s diagrams need “two kinds of lines, solid and dashed, to represent the ‘comment on’ relation and the ‘belongs to the same set of comments as’ relation”. however, their use of a second kind of line to draw the reader’s attention to the sets of sibling remarks is purely a notational device, for the relationship of siblinghood can be analysed without residue in terms of the parental relation: two comments are siblings just in case they have the same parent. the real problem with such logical tree representations of the tractatus, however, is not a matter of logic, but the fact that the need to draw so many lines, connecting a parent with each of its offspring and each sibling with the ones that come before and after, makes it impossible to legibly represent more than a small fraction of the whole structure on a single page.

while it is, of course, possible to depict the numbering system of the tractatus by drawing a tree with , , ...
arranged horizontally at the top, as the trunk of the tree, and then draw roughly vertical links downwards from each of them to each of the related remarks with one decimal, and so on, the upside-down tree that results rapidly becomes extremely complicated if one tries to include all the remarks. it is not only difficult to draw such a map, but more important, almost impossible to fit more than a small fragment into a single frame. bazzocchi’s many diagrams of the tree structure of the tractatus in his publications are always very selective. his diagram of the tree structure of the several dozen remarks on the first five pages of ms ( , ) is fairly close to the limit of what can conveniently fit onto a single page.

laventhol ( ), the oldest surviving tractatus map, includes a link to an extraordinarily long, narrow, and unperspicuous map which serves as a good illustration of the problems faced by any attempt to map the tractatus using a conventional approach. see:
http://www.kfs.org/jonathan/witt/mapen.html
a very similar map is included on bazzocchi’s tractatus site:
http://www.bazzocchi.net/wittgenstein/tractatus/eng/mappa.htm
pasin, who has designed several imaginative tractatus sites, provides a more compressed rendition of the whole as a logical tree by using polar co-ordinates and a shifting center, but despite this ingenuity, one can only look at a small fraction of the whole at any one time:
http://hacks.michelepasin.org/witt/spacetree#.wc p mk-kx

. mapping the tractatus

the original motivation for the university of iowa tractatus map was to find a way of representing the structure of the tractatus numbering system in a more compact and simple way. it is built as a subway-style map which displays each remark as a station, and each series of remarks which comment on a parent remark as a subway line branching off a junction station.
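the station-and-line model just described can be sketched as a small grouping routine. this is a hedged illustration only, not the code behind the university of iowa site: the names `parent` and `subway_lines` are hypothetical, the choice to put remarks that differ only in their last digit on the same line is an assumption, and so is the rule that padding zeros are skipped when looking for the junction station. plain string sorting is used because it matches numerical order as long as the whole-number part is a single digit, which holds for the tractatus.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def parent(number: str) -> Optional[str]:
    """Return the remark that `number` comments on, or None for a whole number."""
    if "." not in number:
        return None
    whole, decimals = number.split(".")
    decimals = decimals[:-1].rstrip("0")  # skipping padding zeros is an assumption
    return f"{whole}.{decimals}" if decimals else whole


def subway_lines(numbers: List[str]) -> Dict[str, Tuple[Optional[str], List[str]]]:
    """Map each line key to (junction station, stations in numerical order)."""
    stations = defaultdict(list)
    for n in sorted(numbers):
        # Remarks sharing everything but the last digit ride the same line;
        # whole numbers share the empty key and form the main line.
        stations[n[:-1]].append(n)
    return {key: (parent(group[0]), group) for key, group in stations.items()}


if __name__ == "__main__":
    sample = ["1", "1.1", "1.11", "1.12", "1.2", "1.21", "2", "2.01", "2.1"]
    for key, (junction, line) in subway_lines(sample).items():
        print(f"line {key!r}: junction {junction}, stations {line}")
```

on this sketch the remarks numbered with a padding zero form their own line off the same junction as the series without one, which is one way to model lines that share a junction but are drawn separately.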
this makes it easy to examine the arrangement of the various series of remarks described in the introductory footnote, together with the remark that they comment on. the site consists of three main pages: a brief introductory front page, with links to two separate maps. one map displays the structure of the published tractatus, while the other lays out the structure of an earlier draft, known as prototractatus. it also provides access to the full text of each line on those maps, in german and the two canonical english translations. phillip ricks drafted the first version of the tractatus map, using a pencil and graph paper, while taking part in my graduate seminar on wittgenstein’s philosophy. i turned it into an excel spreadsheet, and suggested that we use it as the basis for an online map of both the tractatus and the prototractatus. landon elkind, another seminar participant, joined us in working on the design of the online map, and made a crucial contribution to the prototractatus part of the project. the construction and design of the website was done by matthew butler and nikki white at the university of iowa library’s digital scholarship and publishing studio. we are grateful to kevin klement for his careful editorial work on the public domain english and german editions of the tractatus used on the site (wittgenstein ). we strongly recommend looking at our web-based interactive maps of the tractatus and prototractatus to see how the map performs this orientational function, and the next two paragraphs are intended to be read while looking at those maps. the yellow main line at the top of each of the full-map pages represents the series of whole-numbered remarks ( , … ), each of which is represented by a station on that line.
the red and pink lines, branching off each of the first six remarks ( . , . ; . , . ; . , . , . … and so on) represent the series of remarks that comment on the whole-numbered remarks. further levels of comments are represented by lines in orange, green, aqua, blue, purple and grey. lines containing one or more zeros are in fainter versions of the corresponding colour. readers can zoom in on any part of the map, and then move around in it, or zoom out to see the whole. clicking on the individual numbered stations, each of which stands for a remark in the text, brings up a panel containing the associated text. clicking on the lines connecting the stations, each of which stands for a series of sibling remarks and the remark that they comment on, brings up a panel containing the text of those remarks. for instance, clicking on the line that includes n. brings up the text of the whole of that branch (e.g., n. , n. , n. ...), with the text for the junction station, the remark that it comments on, namely n, at the top. the default text is the german original, but a dropdown menu in each text panel allows the reader to choose either of the canonical english translations. approaching the tractatus as a hypertext in this way is not just a matter of coming up with a striking way of representing the numbering system. it amounts to a radically new edition of a canonical text, with a number of far-reaching implications for the interpretation of that text. in the past, it has been taken for granted that the text should be read sequentially, in numerical order. but the hypertext consists of a series of branching and interconnected groupings of remarks, represented by the lines that connect the stations on our subway map. for instance, if we read the text sequentially, remark is preceded by . and succeeded by . .
but if we take the author’s footnote seriously, it should also be seen as (a) coming after , and before , , , and (b) being commented on by two further series that branch off from it, namely . , . and . , . , . … . . [footnote: see http://tractatus.lib.uiowa.edu/tlp/ and http://tractatus.lib.uiowa.edu/pt/ for the maps.]

. mapping the prototractatus

thanks to the work of brian mcguinness ( , ) we know that the source manuscript for the prototractatus (wittgenstein ) provides a chronologically ordered log of the polished paragraphs that would later be rearranged and revised in the production of the tractatus. while a facsimile of ms is included in the first and second editions of the prototractatus, the published text does not include an edited text of the manuscript in the order it was written. instead, the remarks were rearranged by the editors in numerical order. indeed, in the critical german-language edition of the tractatus, which includes the full published text of the prototractatus, together with detailed information about how each remark was revised (wittgenstein ) there is no information about the original order of the remarks, and no tables or other apparatus that would aid the reader in studying ms . when wittgenstein began to assemble the material that would ultimately be rearranged and reorganized in the familiar numerical order from to , he had not yet finished writing it, and had not yet worked out how to arrange the parts that he had written. consequently, the manuscript of the remarks that we now know as the prototractatus could not be written up in the sequential, numerical order in which the book was published. however, sometime during world war i, wittgenstein worked out the ingenious numbering system that enabled him to organize, review, and repeatedly reorganize his work in progress, despite the very limited resources available to him while serving as a soldier.
as a result, the manuscript containing the first known draft of his book (ms in von wright’s numbering system, sometimes known as bodleianus, because it is owned by the bodleian library in oxford), began with the first six whole-numbered remarks on the first page of the main text (pilch ), and then repeated that series, together with almost all the remarks with a single decimal through . , on the next page. after that, remarks were written down as wittgenstein decided to make use of them, and each remark was prefaced by a decimal number indicating its ultimate location in the sequence. the next page contains double decimal remarks appended to the whole number and single decimal remarks that formed the initial backbone for the growing book draft. progressively higher-numbered remarks soon make an appearance, but throughout the process of construction recorded in ms , remarks are added to the tree-structure, not to a numerical sequence. in october , wittgenstein wrote to russell that he had recently done a great deal of work, and that he was “in the process of summarizing it all and writing it down in the form of a treatise (abhandlung). …if i don’t survive [the war], get my people to send you all my manuscripts: among them you’ll find the final summary written in pencil on loose sheets of paper” (wittgenstein , - ). that loose-leaf “final summary” has not survived, but it is likely that it consisted of some kind of a tree-structure arrangement of his book in progress, as a sequentially-ordered arrangement would have involved constant and extensive additions to what had already been composed, while inserting material into sheets containing remarks arranged in a tree structure would have been simple.
certainly, it would have been impracticable to take in either the hypertextual structure or the sequential arrangement of the projected treatise by reviewing ms , the bound ledger containing a chronologically ordered record of his additions to the book draft. thus, while the published prototractatus looks very similar to the final tractatus, the source manuscript on which that book was based was put together in a very different way. from each of the first six whole-numbered remarks, numerical sequences branch, starting with one-decimal series such as . , . ; from these nodes, further branches stem. when ms was first discovered by von wright in , who took charge of preparing the text for publication over the next few years, the full significance of the order in which the remarks were written down was not yet appreciated. as a result, the focus of that book and of von wright’s introductory essay ( ), is on the path to the tractatus, not the composition of ms . this is already made clear in the wording of the book’s subtitle: “an early version of tractatus logico-philosophicus”. consequently, the text of the first pages was printed in the familiar numerical order, while the last fifteen pages of “corrections” were left out, as they belonged to a later stage of revision that could not be fully reconstructed from the available evidence. the immediate result of this enormous amount of careful and conscientious scholarly work was very disappointing: it was hard for the first generation of readers of the prototractatus to see what, if anything, there was to be gained or learned from this edition. the edited text looked too much like the familiar text of the tractatus to be instructively different, while the facsimile of the original seemed quite opaque. most reviewers damned the book with faint praise. w. d.
hart’s review in the journal of philosophy is exemplary in this respect: this volume has been handsomely and thoroughly wrought. indeed, the book may even have been overdone. i suspect that its hefty price tag may be due in no small part to the inclusion of a -page photocopy of wittgenstein’s handwritten manuscript; yet i have some doubt that the facsimile of the master's original text is of sufficient scholarly utility to justify the heavy tariff its inclusion occasions. … in preparing the printed german text, the editors rearranged the numbered remarks in the manuscript in wittgensteinian numerical order, though they have included page references to locate the remarks in the manuscript; it might prove interesting to know in which “contexts” which remarks were inscribed by wittgenstein. (hart , ) over and above remarks present in the one text but not the other, the tractatus and the prototractatus differ considerably in the orderings of those remarks they share. on page four of his historical introduction, von wright says that these “are probably the most interesting differences between the two works”. unfortunately, von wright says nothing to arouse any such interest in his readers. (hart , ) at this point, it may be helpful to take a step back and consider the similarities between scholarly editing of philosophical texts in general, and wittgenstein texts in particular, and home improvement projects. in both cases, one can draw a very similar graph of happiness over time. both start out with great enthusiasm and excitement over the promised results; there is then a steady decline as one becomes aware of all the problems involved; and finally satisfaction rises again as results are achieved. but while, in the case of home improvement, morale usually recovers as the project is completed, in the case of scholarly editing, it can take much longer to fully appreciate what has been accomplished.
often, it is only too easy for critics to point out how much better the job could have been done, without taking into account the fact that we can only see how it could be improved because we can make use of the work already done by our predecessors. if we can see so much further than the previous generation, it is because we are standing on their shoulders, or building on their accomplishments, when we do so. indeed, while von wright did not himself provide any further discussion of “the most interesting differences between the two works”, his work made those materials available in a form which provoked others to identify those differences, and this may well have been one of his most important contributions to our understanding of the complex relationship between ms , the prototractatus and the tractatus. however, until very recently only the most ardent scholars have been in a position to study even the principal earlier stages, usually known as the ‘core’ prototractatus, which ends at a dividing line on page of the manuscript, and the so-called proto-prototractatus, which ends at a similar dividing line near the bottom of page . researchers can consult schmidt ( ) and pilch ( ) for facsimiles and transcriptions of many of the key documents, and there is a wealth of information about the structure of ms and its relationship to both the tractatus and notebooks - in geschkowski ( ). however, all this material is only available in german, and its overall structure is far from easy to take in. in addition to providing a subway-style map of the complete text of the prototractatus, or the first pages of ms , our map site also provides parallel access to the earlier stages, or “strata” of composition, contained within the source manuscript for the prototractatus.
by choosing different start and end pages at the top of that map, one can look at different stages in the construction of the prototractatus: the chosen pages are in color, the others are greyed out. in this way, one can look at the text of different stages in the construction of the prototractatus, and map the changing arrangement of the project as it was gradually assembled. however, because the site is intended as a resource for interpreters with very different approaches, and the dating of these stages is a matter of debate (see von wright ; mcguinness , ; geschkowski ; kang ; potter ; bazzocchi ; pilch ), we do not build in any particular hypothesis about the dating of the various stages of composition of the prototractatus. instead, we simply provide information about the page on which each remark first appears, and leave it to the reader to explore the various layers. the principal goal for the next stage of the university of iowa tractatus map project is to connect up our maps of tractatus and prototractatus, in order to provide an equally graphic and accessible map of the process of revision that led from prototractatus to tractatus. because very little wording is added or removed at this late stage in the composition of the final text, the vast majority of remarks in the tractatus have a clear antecedent in one or more prototractatus remarks, and very few remarks in prototractatus have nothing corresponding to them in tractatus. in other words, the alterations in question are primarily a matter of rearrangement and reorganization of a highly structured text. the overarching organization is largely retained: the top seven whole-numbered remarks are left unchanged, and while many remarks are moved around by changing their decimal number, very few are moved to a different whole-number category.
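The page-range selection described earlier in this section amounts to a filter over the page on which each remark first appears. The following is a minimal sketch in Python, assuming a hypothetical mapping from remark numbers to first-appearance pages; the function name is invented and the page values are placeholders for illustration, not data from the manuscript or from the site.

```python
def highlight(first_page, start, end):
    """Split remarks by whether they first appear on pages start..end.

    Returns (coloured, greyed): remarks inside the chosen range are shown
    in colour, all others are greyed out.
    """
    coloured = sorted(n for n, p in first_page.items() if start <= p <= end)
    greyed = sorted(n for n, p in first_page.items() if not start <= p <= end)
    return coloured, greyed


# Hypothetical first-appearance pages, for illustration only.
first_page = {"1": 3, "1.1": 3, "2": 3, "2.01": 4, "2.1": 4, "3.001": 9}

coloured, greyed = highlight(first_page, start=3, end=4)
print("in colour:", coloured)
print("greyed out:", greyed)
```

Because the filter depends only on recorded page numbers, it builds in no hypothesis about the dating of the stages, in keeping with the neutrality described above.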
there are no structurally significant changes to the remarks grouped under and , so the task of mapping the changes from prototractatus to tractatus can be broken down into five independent sub-tasks: showing the process of revision with groups , , , and . a surveyable map of the various stages of composition and rewriting involved in wittgenstein’s construction of the prototractatus, and ultimately the final text of the tractatus, should tell us a great deal about the structure of the book.

. conclusion

our site’s innovative map of the tractatus provides a new and visually compelling presentation of the book’s overall structure. it thus delivers on wittgenstein’s cryptic claim in the letter to the book’s publisher that the numbers “make the book surveyable and clear”. “surveyable” (übersichtlich, literally, overview-able) is a key term of art for wittgenstein, and carries the sense of making it possible to take in a complex structure at a glance, in the way that one can grasp the lay of the land by looking at a landscape from a well-placed hill or tower. crucially, it is possible to look over our subway-style maps as a whole, and to examine the structure of each part. conventional tree diagrams are much more visually complicated, so much so, that they are normally only used to show a small part of the book’s overall structure at one time. in other words, our map is far easier to take in visually, and it is possible to look at the whole thing at once. as previously mentioned, the great majority of readers of the tractatus take it for granted that it should be read and interpreted in sequential numerical order, as published. very recently, bazzocchi, hacker and kuusela have argued against this approach, contending that one should only read the text as a logical tree, or a hypertext, in the order of the lines on our map. they also contend that separate branches of the tree do not cross-refer, or inform each other.
the first full-blown defence of a sequential reading that responds to their construal is published in the present issue of this journal (kraft ). because we envisage the map as a tool that we would like a wide variety of readers to find helpful, rather than advocate for a particular scholarly interpretation, we do not take a position on these exegetical issues on the site. for this reason, the front page provides a bare minimum of introductory information about the site and the texts it presents. our own considered view is that both of these are legitimate and appropriate interpretive strategies, while holding that either one of them is the only correct way to read the text is a mistake. we believe that we need to pay attention not only to the final sequential order in which the book was published, but also to the hypertextual arrangement determined by the book’s numbering system. both hacker ( ) and kraft ( ), who seem at first sight to be on diametrically opposed sides in this debate, are actually somewhat equivocal on this very issue. notably, after observing that it is useful to distinguish between the thesis that the tractatus can be read and interpreted as a tree and the thesis that it must be read and interpreted that way (kraft , , fn ), kraft goes on to maintain that “the weaker thesis is too non-committal”, and he construes the tree reading as defending the stronger thesis. he then points out that in the first full paragraph of his paper, hacker states what is clearly a version of the weaker thesis, recommending that one “avoids reading the work only consecutively, and also reads it tree-wise” ( , ).
indeed, if one takes the very next sentence of hacker’s paper out of context, it reads like an extremely insistent statement of the strong thesis: the tractatus must be read in accordance with the numbering system, and that demands that the reader follow the text after the manner of a logical tree… (hacker , ) in view of its setting, on the other hand, this sentence seems to be doing no more than insisting and demanding that one must not only read the work consecutively, but also read it as a logical tree. nevertheless, shortly after making his observation about hacker, kraft makes a strikingly similar move. after expressing his conviction that where the tree reading and his own reading conflict, his own reading is clearly superior, he observes that it is not a bad idea to keep in mind that both interpretations of the numbering system exist and can both be applied whenever discussing specific (series of) remarks. (kraft , ) in this paper i have appealed to two different, but related, reasons for reading the book as a hypertext. the first is an argument “from above”: we should take the author’s instructions at the beginning of the book about the relations between the remarks seriously, and respect his insistence that without the numbers the book would not be surveyable or clear. the second is an argument “from below”: we know that the author relied on the numbering system to organize his successive drafts of the book when he wrote it down in ms , and looking at a map of the various stages makes it possible to survey that process. but neither these arguments, nor any other i have encountered in the literature on the numbering system of the tractatus proves that it should only be read in this way. in the end, the question of how best to read those remarks is one that can only be settled passage by passage, by means of a close reading and evaluation of all the relevant texts.
in the last paragraph of the preface to the second edition of prototractatus, brian mcguinness expressed the hope that it would be possible at a future date to supplement this facsimile and printed version by an electronic version which will facilitate the comparison of the various stages described here, as well as permitting a number of other analyses both of this and the tractatus itself. (mcguinness , xii) in the final paragraph of a paper on the composition of the tractatus presented at the kirchberg wittgenstein symposium, he observed that: the execution and still more the presentation of such analyses are much facilitated by the use of electronic devices, search engines, data bases, excel and so on. in an ideal world we could all have access to one another's constructions of this kind on the internet, whether promiscuously or as members of a club. it is important that they should be accompanied by rationale, by discussion, and interpretation. (mcguinness , ) when i heard him give that paper, that world seemed very distant from our own. in the recent past, it has become much closer. earlier versions of parts of this paper were presented at the “von wright and wittgenstein in cambridge: von wright centenary symposium”, held at strathaird, cambridge, uk, at a session on early analytic philosophy organized by the society for the study of the history of analytic philosophy at the american philosophical association’s central division, held in kansas city, and (via videolink) at the th summer school on mind and language, organized by luciano bazzocchi at the university of siena, italy. i learned a great deal from the discussion at all three events, and my fall graduate seminar at the university of iowa, and want to express my gratitude to everyone who took part. ed. note: this paper was published dec. , and amended on jan. , : on p. and , some incorrect tlp remarks listed were removed. 
references

bazzocchi, luciano . “a database for a prototractatus structural analysis and the hypertext version of wittgenstein’s tractatus.” in philosophy of the information society, papers of the th iws (eds. h. hrachovec, a. pichler, j. wang) - . heusenstamm am frankfurt: ontos verlag.
bazzocchi, luciano . “the prototractatus manuscript and its corrections.” in n. venturinha (ed.), wittgenstein after his nachlass, - . new york: palgrave macmillan.
bazzocchi, luciano . l’arbre du tractatus. peterborough: college publications.
bazzocchi, luciano (ed) a. the tractatus according to its own form, ed. luciano bazzocchi. raleigh, nc: lulu. see also http://www.bazzocchi.com/wittgenstein/tractatus/index.htm
bazzocchi, luciano . “a better appraisal of wittgenstein’s tractatus manuscript”. philosophical investigations ( ), pp. - .
black, max . a companion to wittgenstein’s tractatus. cambridge university press.
geschkowski, andreas . die entstehung von wittgenstein’s prototractatus. bern: books on demand. bern studies in the history and philosophy of science.
gibson, kevin . “is the numbering system in wittgenstein’s tractatus a joke?” journal of philosophical research , pp. - .
hacker, p. m. s. . “how the tractatus was meant to be read”. the philosophical quarterly , pp. - .
hart, w. d. . “prototractatus: an early version of tractatus logico-philosophicus”. the journal of philosophy ( ), pp. - .
kang, jinho . “on the composition of the prototractatus”. the philosophical quarterly ( ), pp. - .
kraft, tim . “how to read the tractatus sequentially”, nordic wittgenstein review ( ), pp. - .
kuusela, oskari . “the tree and the net: reading the tractatus two-dimensionally”. rivista di storia della filosofia / , pp. - .
de laguna, theodore . review of tractatus. philosophical review , pp. - .
laventhol, jonathan . hypertext of the tractatus logico-philosophicus.
see: http://www.kfs.org/jonathan/witt/tlph.html
mayer, verena . “the numbering system of the tractatus”. ratio ( ), pp. - .
mcguinness, brian . “wittgenstein’s pre-tractatus manuscripts”. grazer philosophische studien : - . reprinted with revisions in mcguinness a.
mcguinness, brian . preface to the second edition of wittgenstein .
mcguinness, brian . “wittgenstein’s ‘abhandlung’”. in rudolf haller and klaus puhl (eds.) wittgenstein and the future of philosophy: a reassessment after years: proceedings of the th international wittgenstein-symposium, - . vienna: hölder-pichler-tempsky.
mcguinness, brian a. approaches to wittgenstein: collected papers. london: routledge.
pasin, michele . wittgensteiniana website. links to multiple hypertext editions of the tractatus. see: http://hacks.michelepasin.org/witt/
pilch, martin “a missing folio at the beginning of wittgenstein's ms ”. nordic wittgenstein review ( ), pp. - .
pilch, martin (ed.) wittgenstein source prototractatus tools. see: http://www.wittgensteinsource.org
potter, michael . “wittgenstein’s pre-tractatus manuscripts: a new appraisal”. in peter sullivan and michael potter (eds.) wittgenstein’s tractatus: history and interpretation, pp. - . oxford: oxford university press.
schmidt, alfred (ed.) . wittgenstein source facsimile edition of tractatus publication materials. see: http://www.wittgensteinsource.org
stern, david g. . “the methods of the tractatus: beyond positivism and metaphysics?” logical empiricism: historical and contemporary perspectives, part of the pittsburgh-konstanz studies in the philosophy and history of science series, eds. paolo parrini, wes salmon and merrilee salmon, pp. - . pittsburgh: pittsburgh university press.
von wright, g. h. . “the origin of wittgenstein’s tractatus”. historical introduction to wittgenstein , pp. - .
a revised and expanded version, “the origin of the tractatus” was published in von wright , pp. - .
von wright, g. h. . wittgenstein. oxford: blackwell.
wittgenstein, ludwig . tractatus logico-philosophicus. translated by c. k. ogden (& f. p. ramsey). london: routledge and kegan paul.
wittgenstein, ludwig . tractatus logico-philosophicus. translated by d. pears & b. mcguinness. london: routledge and kegan paul.
wittgenstein, ludwig . prototractatus: an early version of tractatus logico-philosophicus. edited by b. f. mcguinness, t. nyberg [and] g. h. von wright, with a translation by d. f. pears [and] b. f. mcguinness, and historical introduction by g. h. von wright and a facsimile of the author’s manuscript. ithaca, ny: cornell university press.
wittgenstein, ludwig . prototractatus: an early version of tractatus logico-philosophicus. second edition, with a new preface by mcguinness.
wittgenstein, ludwig . logisch-philosophische abhandlung: kritische edition. ed. b. f. mcguinness and j. schulte. frankfurt am main: suhrkamp.
wittgenstein, ludwig . wittgenstein in cambridge: letters and documents - . edited by b. f. mcguinness. malden, ma: wiley-blackwell.
wittgenstein, ludwig . side-by-side-by-side edition, version . (november , ) of tractatus logico-philosophicus, containing the original german, the ogden/ramsey, and pears/mcguinness english translations. from http://people.umass.edu/klement/tlp/

biographical note

david g. stern is a professor of philosophy and a collegiate fellow in the college of liberal arts and sciences at the university of iowa. his research interests include history of analytic philosophy, philosophy of language, philosophy of mind, and philosophy of science.
he is the author of wittgenstein’s philosophical investigations: an introduction (cambridge university press, ) and wittgenstein on mind and language (oxford university press, ). he is also a co-editor of wittgenstein: lectures, cambridge - , from the notes of g. e. moore, with brian rogers and gabriel citron (cambridge university press, ), the parallel online wittgenstein source facsimile edition of moore’s notes of wittgenstein’s lectures (ws-mwn) from the wittgenstein source website http://www.wittgensteinsource.org/, wittgenstein reads weininger, with béla szabados (cambridge university press, ) and the cambridge companion to wittgenstein, with hans sluga (cambridge university press, ). a second, extensively revised, edition of the cambridge companion to wittgenstein is forthcoming, for which he has written a chapter on wittgenstein in the s.

costa, c., burke, c. and murphy, m. ( ) capturing habitus: theory, method and reflexivity. international journal of research and method in education, ( ), pp. - . (doi: . / x. . )

there may be differences between this version and the published version. you are advised to consult the publisher’s version if you wish to cite from it.

http://eprints.gla.ac.uk/ / deposited on: december

enlighten – research publications by members of the university of glasgow http://eprints.gla.ac.uk

costa, burke and murphy submitted to international journal of research & method in education accepted manuscript.
accepted / /

capturing habitus: theory, method and reflexivity

cristina costa (a), ciaran burke (b) and mark murphy (c)

(a) school of education, university of strathclyde, glasgow, uk. contact author: email: cristina.costa@strath.ac.uk
(b) university of derby, uk
(c) robert owen centre for educational change, university of glasgow, glasgow, uk

bourdieu’s career long endeavour was to devise both theoretical and methodological tools that could apprehend and explain the social world and its mechanisms of cultural (re)production and related forms of domination. amongst the several key concepts developed by bourdieu, habitus has gained prominence as both a research lens and a research instrument useful to enter individuals’ trajectories and ‘histories’ of practices. while much attention has been paid to the theoretical significance of habitus, less emphasis has been placed on its methodological implications. this paper explores the application of the concept of habitus as both theory and method across two sub-fields of educational research: graduate employment and digital scholarship practices. the findings of this reflexive testing of habitus suggest that bridging the theory-method comes with its own set of challenges for the researcher; challenges which reveal the importance of taking the work of application seriously in research settings.

keywords: application, bourdieu, habitus, reflexivity, theory-method

introduction

according to smith ( ), one of the key functions of social theory is to provide a framework for undertaking empirical social research. it does this by ‘equipping the researcher with a vocabulary for describing social phenomena, together with a related set of assumptions about how to go about explaining them’ (p. ).
smith was writing about theory and method in relation to the work of axel honneth, who, while gaining prominence in applied fields, has not been as influential as pierre bourdieu. bourdieu’s work has provided something of a template for social theory as a conceptual vocabulary in applied research settings, with forms of capital acquiring visibility both in research literature and popular press. on a par with bourdieu’s treatment of capitals, habitus has now acquired currency in the anglophone world and further afield, as it has been applied to different research areas, a range that continues to broaden at pace. habitus, alongside other bourdieuian tools, offers an explanatory framework and theoretical vocabulary for processes of social reproduction and transformation. following bourdieu’s legacy, the conceptualisation and application of habitus in different settings comprises attempts to overcome the dichotomy between structure and agency whilst acknowledging the external and historical factors that condition, constrain and/or promote change. many researchers are attracted to habitus as a framework because it offers an alternative to overly-agentic or structural accounts of social phenomena. it also speaks to the lived experiences of researchers who are eager to examine the everyday relational modes of being that offer insights into the often invisible workings of power and privilege. the growing popularity of habitus as a conceptual tool has generated much debate, the focus of which has centred on its relationship to change. whether or not habitus is deterministic or transformative has created a division of opinion and approach between proponents of either conceptualization (see jenkins, , and yang, for examples).
these discussions, however, have mainly focused on the theoretical worth of bourdieuian concepts, thus leaving less space for considerations regarding their application in fieldwork via research methods. yet, these concepts were not meant to be used solely as theory, but rather as theory-method, as a form of preparing the research for fieldwork. in this regard, bourdieu’s key concepts, as for example habitus, have been discussed more often in relation to theorisations of research findings than to methodological choices and fieldwork applications, thus making the discussions around bourdieu’s contribution to method far less pronounced. this is most likely because such debates are scarcer in the literature. nonetheless, they were an ever-present concern in bourdieu’s work (see, for example, bourdieu and wacquant, ). this imbalance regarding the use of bourdieuian concepts as theory separated from method is something of a concern, given that bourdieu’s conceptual apparatus was an attempt to reconcile practice and theory through method, with his key concepts working in the background to unearth and understand the essence of contextualised practices (costa & murphy, ). in short, putting bourdieu’s theoretical concepts to work as part of methodological decisions and development of data collection instruments is still regarded by many – especially those new to research – as a ‘black-box’ of social inquiry. this is something we aim to (re)explore in this paper, using the application of habitus as an example of theory-method dialectics. the purpose of this paper is thus to help rectify this imbalance between theory and method by bringing together research studies on habitus in two educational related contexts – graduate employment and digital scholarship practices – and examining in detail the ways in which the research in question has endeavoured to ‘capture’ habitus in those two settings. in particular, the paper indicates that capturing habitus is not a straightforward enterprise, given that it is as much influenced by the context within which the capturing occurs as it is by the way the theoretical apparatus is framed. it also suggests that the actual process of application, the bridging mechanism too often sidelined as a secondary feature of social research, should receive more attention in discussions of theory and method. this take on application is important, not just for studies of habitus but also for the wide range of studies that endeavour to apply social theory in empirical work. these share a common concern, regardless of concept, when it comes to bridging a not-insubstantial gap between theory and method. what emerges from this endeavour – by bringing theory to life through the process of application, while also unpacking the mechanisms via which theory and method converge – is a set of challenges for researchers who wish to bridge the theory-method gap via the socio-theoretical vocabulary of concepts such as habitus. in other words, this paper explores the use of bourdieu’s key concept of habitus from a methodological perspective, which makes it a rather distinctive and relevant project.

habitus: theory and method

for bourdieu, habitus is more than theory; it is an essential instrument for tracing social practices:

the notion of habitus has several virtues. … agents have a history and are the product of an individual history and an education associated with a milieu, and … also a product of a collective history ... (bourdieu & chartier, , p. )

but what is habitus?
habitus encapsulates social action through dispositions and can be broadly explained as the evolving process through which individuals act, think, perceive and approach the world and their role in it. habitus, thus, denotes a way of being. as assimilated past without a clear consciousness, habitus is an internal archive of personal experiences rooted in the distinct aspects of individuals’ social journeys. individuals’ dispositions are a reflection of their lived trajectories and justify their approaches to practice (bourdieu, ). that said, uncovering habitus is not a straightforward task; the challenges arise on multiple fronts. for a start, they lie in the operationalisation of the theoretical concept of habitus – i.e., in capturing this fluid, broad concept – with specific methodological tools. nonetheless, one aspect that researchers tend to agree on is that sets of dispositions, however defined, are a useful gateway to habitus and its effects. this is understandable and to be expected. what is more interesting is how researchers define these dispositions, in accordance with their research questions, and the methods they employ to capture them (see costa & murphy, ). another key issue from the literature relates to the diversity of research methods used to capture habitus, with evidence suggesting considerable divergence in approaches. this suggests that there is not one single method that should be applied to this subject, but as many and as diverse as ‘demanded’ by the research phenomena explored. for example, stahl ( ) utilises narrative inquiry to grapple with white working-class habitus, while bodovski’s ( ) work on parental and adolescent habitus employs an analysis of secondary survey data to flesh out conceptions of habitus. what can also be identified in previous research is the diversity of dispositions under investigation, which suggests that, when it comes to application, it is not as simple as saying habitus can be captured by studying dispositions.
aside from clarifying what is meant by ‘disposition’, the researcher must make choices about which dispositions are relevant to the study at hand. this is a significant question, as, even when similar methods are used, there is no guarantee that the same dispositions will come to the surface. this is evident in research on somewhat comparable social groups. take stahl’s ( ) and france’s ( ) research on working class boys as an example. they uncover a concern with ‘loyalty to self’ and ‘averageness’, and ‘fighting’ and ‘stealing’ respectively. what is interesting here is that the researchers were looking for dispositions with very different research questions in mind – the former concerned with aspirations, the latter focused on the context of criminality. this does not mean that one approach is more appropriate than the other; what it suggests is that method should fit the purpose of the investigation. it also indicates that the questions asked have major implications for the answers provided. this role for interpretation is a key component in the art of application of habitus and illustrates that the complex lives of research participants, who can embody multiple, often conflicting sets of dispositions, should not be taken at face value. in other words, one isolated set of dispositions does not make a habitus. it is therefore important to highlight here that research methods are more than the types or instruments of data collection; they also encompass the process through which the researcher approaches and conceives the research phenomenon under focus. the complexity of defining and applying habitus provides much food for thought when it comes to the theory-method relationship.
this is further enhanced by bourdieu’s obsession with reflexivity, which encourages critical understandings of social realities in both the researcher and the researched. reflexivity, however, extends beyond concepts of self-reference and self-awareness to deal with the systematic exploration of the ‘unthought categories of thought which delimit the thinkable and predetermine the thought’ (bourdieu & wacquant, , p. ), i.e., one’s subjectivity. the authors’ approach in this paper is an example of reflexivity come to life and a testing and re-testing of a concept and its efficacy across fields, further strengthening its explanatory potential. this is an approach that we think bourdieu would have appreciated. that said, it is not the purpose of this paper to provide the final word on habitus and its place in social research. the objective of placing such research projects side by side is, rather, to foster further dialogue about the relationship between concepts such as habitus and methodologies employed in diverse settings. what follows is a description and analysis of the application of habitus in two different studies and a reflexive discussion of what these studies mean in relation to the theory-method relationship. the continuing presence of habitus in these two areas is particularly important when we consider theories of practice that conflict with bourdieu, such as margaret archer’s morphogenesis model ( ). in a similar vein to the late modern arguments from beck (and beck-gernsheim) and bauman, archer ( ) charts the emergence of a morphogenetic society beginning in the s and continuing until the current day. such a society is characterised by fluid identities, opportunities for rapid change and the de-structuring of “traditional” inequalities mediated by increasingly individual/autonomous levels of reflexivity via internal conversations.
in the context of this model, the habitus is an anachronistic tool unable to account for a society ‘too fluid to be consolidated into correlated dispositions, which are inherited and shared by those similarly positioned’ (archer, , p. ). two key institutions/platforms in the development of the increasingly fluid society are higher education (archer, ) and the internet (porpora, ). however, the classed/collective nature of digital dispositions and of the attitudes and practices of graduates, as demonstrated through our case studies, calls into question the role of these institutions/platforms. we argue that the flexibility within bourdieu’s model (adams, ; reay, ; emmerich, ; abrahams and ingram, ; atkinson, ) and the heuristic principle behind the habitus (hodkinson, ) provide us with a sharper set of thinking tools with which to interrogate the social world. indeed, both case studies highlight the theoretical implications for bourdieu and, as such, can act as a testbed for the challenges of methodological application.

case study : the habitus and graduate employment

this case study provides both the practical setting and the opportunity to reflect on how effectively habitus can be ‘captured’ in issues concerning graduate employment. the expansion of higher education in the uk, credited to both the robbins report ( ) and higher education: a new framework (dfe, ), has brought with it an increase in the level of participation – rising from per cent in the s (brooks & everett, ) to per cent in (heath, et al., ). a major consequence for graduate employment was that the expansion of uk higher education flooded the market with graduates at a speed and volume incompatible with the requirement of the graduate employment market. figures on graduate underemployment point to per cent (purcell, et al., ) and per cent (ons, ) of graduates unable to find graduate employment.
there are two key issues facing recent and future graduates: the role of a priori capitals and the ‘fuzzy’ nature of the labour market. as the number of university graduates rises out of proportion to graduate employment opportunities, the value of the degree – of scholastic capital – decreases. as such, a priori capitals such as cultural and economic capital, which bourdieu and boltanski ( ) argue tend to be associated with social class, play a leading role in a graduate’s ability to enter the labour market. this is compounded by an increasingly destructured and confusing labour market, one that bourdieu and boltanski ( ) term ‘fuzzy’, requiring a mastery of the market’s tacit requirements and appreciation of its constantly changing needs. contrary to the meritocratic discourse which has framed a significant portion of u.k. social policy and the rationale behind various changes in fee structure, a significant variable in deciding which graduates get these jobs is class (bourdieu & boltanski, ; brown & hesketh, ). in light of these statistics and their critical interpretation, the research under review here asked: are strategies of graduate employment influenced by the habitus of a young person? the research focused on the life histories of respondents. all members of the study had graduated from either a pre- university (southern) or a post- university (northern), had read for a non-vocational degree and had graduated between two and ten years before the data collection. in terms of the findings from this study, there was a general binary classed model of experiences and pathways of graduate employment. there were classed contrasts in dispositions including appreciating the devaluation of a university degree, confidence in their ability to find a ‘graduate level’ job and attitudes to a flexible graduate labour market.
these contrasts were articulated through and accounted for by habitus. some of the clearest illustrations of the role of habitus on dispositions did not come from comparing classed groups but when observing the reformulation of an individual respondent’s habitus and the subsequent shift both in attitude and practice. the ‘out-of-environment’ conception of experiences used here (burke, a, b) builds on bourdieu’s ( ) assertion that, while the habitus is quite durable, a large enough shift in environment can lead to an altered habitus. in this case, a small number of working class respondents, upon graduating from university, interacted with individuals or environments that radically changed their understanding of the game and their levels of confidence/expectations – in other words, their dispositions. importantly, the divergent pathways these graduates were now on were not directed from a primary habitus but, rather, from a reformulated habitus, as the new ways the graduates approached the labour market stayed quite close to the instructions/advice provided in their out-of-environment experiences.

capturing the habitus: the role of biographical research

the approach to habitus used in this case study was directed by a specific theoretical interpretation of bourdieu, where habitus is understood as being both a durable structure but also malleable: open to alternative paths through agency and change in circumstance and environment. habitus, understood this way, is quite applicable to graduate employment research. the de-structured and chaotic graduate labour market requires a strong ability to play the game and dispositions congruent to that labour market – two facets of habitus. while it is an interesting academic exercise for the habitus to be theorised, it needs to be operationalised and applied through a form of data collection.
as argued elsewhere (see burke, a, b), the durability of habitus in both its dispositions and forms of practice provides an opportunity to empirically observe its directive influence. to be specific, the habitus can be observed through the repetition of both attitudes and practices (bourdieu, ). in this case study, the biographical narrative interview method (bnim) was used to capture habitus. traditionally, the bnim is associated with grounded theory and an inductive approach to data collection (miller, ; rosenthal, ). however, we argue that it is equally applicable in a theoretically driven project and provides a clear opportunity to chart a life history and tally particular dispositions and norms, while measuring an individual’s ability to ‘play the game’ based on their understanding of the game and the end result. while a reasonable critique against any rigid form of data collection is that it will snap when faced with the practicalities of data collection, it is the set of prescriptive rules at the core of the bnim which provides its potential. the interview is typically conducted over two sittings and comprises three parts, or sub-sessions:

• sub-session : the interviewer poses a very open question or statement: ‘tell me about your life’. the respondent is allowed to talk for as long as they wish, and, importantly, the interviewer is not permitted to interject or direct this initial narration.
• sub-session : this portion is usually conducted in the same sitting as sub-session , and the interviewer can ask for greater clarification on topics which have been discussed.
• sub-session : this portion happens at a later date once the first two sub-session interviews have been transcribed and analysed. this interview can take a number of forms, as there are no technical constraints imposed on the interviewer.
the bnim allows us to return to bourdieu’s own instructions that, to empirically appreciate the habitus, we should look for repetition of attitudes and practices (bourdieu, ). through the three-stage interview process, it was possible to observe and measure certain attitudes and dispositions, such as confidence in one’s ability or hesitancy toward entering higher education. this observation permitted the researcher to demarcate different groups of respondents by their dispositions. the longitudinal aspect of life history research gave further support to this demarcation, as respondents’ attitudes manifested over a significant period of time with respondents displaying similar levels of comfort/anxiety toward graduate employment as they did to higher education. equally, the longitudinal focus provided an opportunity to compare different periods of respondents’ life histories. repetition of practice and sources of habitus reformulation can also be observed and tracked through the bnim. the strategies respondents employed, i.e., their ability to play the game, in relation to significant life events, such as educational trajectories and graduate employment pathways, could also be measured and associated with different groups. the distinction in attitudes and practices provided a classed binary model of graduate employment, illustrating their respective habitus. the bnim demonstrated the durable effects of habitus on the majority of respondents. in particular, it provided a durable undercurrent of its influence despite contradictory gaps in a respondent’s trajectory, such as a middle class respondent’s inability to secure a graduate position. in other words, it provided an understanding of the bigger picture rather than falling prey to the shortcomings of a pinpointed interview/survey.
this application of the bnim points to future opportunities for research to maintain the ethnographic-level data required to observe the habitus, but through a practical approach that would be open to a larger proportion of researchers. crucially, the bnim allows a researcher to formulate a theoretically-informed research question; it not only requires that research reflect on its findings but also prohibits theory from having an overt – and, ultimately, detrimental – directive role in the data collection process. it provides the right ‘lab conditions’ to observe and measure attitudes and practices over a significant period of time in order to capture habitus. as with any method, there are practical shortcomings and issues which must be addressed. the bnim is often required to apologise for failings measured against quantitative research’s strengths, such as reliability and validity. the ethnographic and longitudinal features of the bnim are open to the charge that, unlike many ethnographies, the bnim interview can suffer from a posteriori biographical re-construction. the issue of a respondent’s desire for the presentation of self is one that most qualitative research faces, although more so for the bnim (see rosenthal, , ; schütze, , ). this charge is based on an assumption of quite strong levels of reflexivity, synonymous with archer’s ( ) internal conversations and in contradiction to the (at least semi-) pre-reflexive nature of the habitus. while this form of interview provides a longitudinal account of an individual’s life (burke, a; b), the interview transcript is very unlikely to offer a linear account of an individual’s life the way a traditional longitudinal study would be expected to offer. instead, respondents often revisit periods of their lives throughout the sub-sessions of the interview. it is the task of the researcher to chart that life history before conducting analysis and drawing conclusions. in the analysis stage of the research process, the researcher needs to stay vigilant and apply the same level of focus and attention to each topic discussed to sufficiently chart an individual’s life history. finally, the rules of the bnim are there to provide empirical legitimacy to a theoretically-driven research project. the constraints of the interview and the benefits are only as effective as the researcher. it can be quite difficult during the interview or analysis process to stave off directing an interview or applying theory too early, but such a short-sighted stance will reduce the heuristic value of the habitus and reinforce the charges of structural determinism.

case study : dispositions in digital scholarship

the second case study relates to the study of scholarly activity online. it explores how academic practices around scholarship have been affected by digital technologies and, more concretely, the web. the web as a site of intellectual participation and production is an emergent phenomenon that is slowly redefining the contribution of academia and the role of scholars in the wider social context, with academics increasingly realising that the production and communication of knowledge can be conducted more autonomously (lupton, ). these developments, however, do not come without challenges, as digital scholarship practices are often regarded as antagonists of a long-standing academic tradition. with this observation in mind, this case study aimed to investigate the dispositions that characterise academic researchers engaged in digital scholarship practices (see costa, ) to understand the meaning they attribute to their academic work. in order to develop an understanding of the dispositions associated with digital scholarship practices, bourdieu’s theory of habitus was adapted according to categories of thinking, value systems and strategies that currently guide the practices of digital scholarship. in this regard, the research built on the work of weller ( ), who categorises digital scholarship practices according to a three-element framework: digital, networked and open(ness). if the first element of weller’s framework refers to the structure on which practice happens – the digital web – the other two elements relate to the social and cultural approaches that characterise and encourage a new type of scholarly practice online. taking into account bourdieu’s works, in which habitus is regarded as ‘…an endless capacity to engender products – perceptions, expressions, actions – whose limits are set by the historically and socially situated conditions of its productions…’ (bourdieu, , p. ), the research set out to conceptualise habitus within the historical and socio-cultural dimensions that characterise the web and its practices. although relatively young, the web was invented to serve the purposes of information sharing and collaboration (berners-lee, ) and has evolved with the goal of offering free access to information and the production of it. weller’s classification of digital scholarship practices is not far removed from this historical context, nor from the socio-cultural practices found therein, which are mainly typified by approaches to unrestricted participation and publication of knowledge – a game changer for the academy. this take on the web allowed for a conceptualisation of digital scholar dispositions as networked and open. such dispositions are carriers of a value-system which valorises free access to knowledge and sharing of information within and beyond specialised knowledge networks.
applying this categorisation to habitus theory allowed the research to explore specific dispositions that digital scholars acquire informally online and to examine how these dispositions are transferred to participants’ professional settings. thus, digital scholars’ dispositions were conceptualised as: the strategies they have developed online to challenge the traditional means of knowledge production and dissemination; their tendency to congregate with like-minded social capital online; and their propensity toward initiatives, such as the open access movement, that challenge the rules of the academic field and the game it aims to play. framing the fieldwork with bourdieu’s theory of habitus required the development of methodological instruments to not only capture the digital dispositions defined by the project but also to help trace such dispositions as part of research participants’ trajectories of practice. this did not come without challenges, as explained below.

capturing digital dispositions: the role of narrative inquiry

if the first challenge was to conceptualise digital habitus, the second challenge consisted of making methodological decisions regarding how participants’ dispositions could be accessed. employing a similar approach to bourdieu’s later work (see bourdieu, ), this case study made use of practice-based narrative inquiry as a means of unearthing what is often implied but not discussed. devising theory as method requires not only the choice of a given technique of data collection, but also a clear and well thought out way of disclosing what the research aims are (costa & murphy, ).
narrative inquiry in this specific case aimed not only to honour the complexities of participants’ practices, but also to illuminate the properties of their academic habitus in more explicit ways by materialising theory through method. narrative interviews were, thus, designed to: access participants’ own understandings of their own digital scholarship practices; examine the values and principles they shared in relation to their digital scholarship practices; and explore the strategies participants developed to put their perceptions of scholarship into practice, i.e., participants’ ability to ‘play the game’. it is important to note here that even though practice-based narrative inquiry shares similarities with bnim, they are regarded as two different sub-genres within qualitative research into social lives. these differences are determined by the research questions of the inquiry each aims to serve (kim, , p. ). even though both methods are often used in the exploration of lived experiences, they differ when it comes to the locus and temporal stretch the research aims to investigate. whereas practice-based narrative inquiry explores the particularities of participants’ professional experiences across time and contexts, bnim’s longitudinal aspect is much broader and far reaching in that it aims to capture individuals’ comprehensive personal trajectory. in other words, bnim focuses on the (re)construction of research participants’ biographical experiences as a form of accessing the development of ‘personality’ during the life course (zinn, ). practice-based narrative inquiry, on the other hand, aims to access individuals’ practices in a given or extended moment and place (connelly & clandinin, , p. ). place thus becomes an inquiry boundary that delimits the accounts of participants (kim, , p. ) in practice-based narratives.
in the case of this study, place is delimited to the web and academia as loci of scholarly practice. the design of the practice-based narrative interview guide took into account different methodological requirements. to start, the project approached reflexivity as an essential component of the study of social practice (wacquant & bourdieu, , p. ). reflexivity as a research tool is able to evoke participants’ capacity to analyse their own practice and to denote researchers’ place in the research setting. as a form of meaning-making of social experience, narrative inquiry brings attention to the perspective of the participant as both actor and first interpreter of the experiences narrated (atkinson, ). the greatest vulnerability of narrative inquiry is that it relies on participants’ accounts and conceptions of their own practice as research evidence; yet, this is probably also its greatest advantage in that it offers possibilities to access participants’ chronology of professional practice as a representation of their ever-forming academic habitus. as wacquant ( ) contends, habitus is but ‘historicised subjectivity’ (p. ). narrative inquiry focused on practice, on the other hand, is a tool to recover social reality through a process of reflexive reconstruction. it is, therefore, important to reiterate that sociological reflexivity does not aim to apprehend what happened but, rather, to access the meaning the narrator attributes to it (atkinson, ). the integrity of narrative inquiry, thus, relies on the relationship between method and findings regarding the social reality the research aims to represent. hence, the emphasis here is on trustworthiness of the research rather than on more positivist conceptions of reliability. reality, in this case, is a social construct.
in order to provide research participants with a stage to reflect on their digital practices and give researchers an opportunity to identify the dispositions that make up their digital scholarly habitus, the research interviews were devised around the digital scholarship dispositions of digital, networked and open (see weller, ). to allow participants to ‘re-live’ their experiences within the context of their academic practice, each research interview started by eliciting participants’ first encounter with the web. the interviews then allowed participants to explore their personal experiences in relation to the macro social structures in which their practices were inserted. here, the purpose of reflexivity was to bring tacit understandings of practice to a more explicit level – reflexive deliberations of internalised dispositions that had materialised into representations of digital scholarship practices. this, however, raised the challenge of ensuring that the purpose of the research was aligned to the narration while, at the same time, making adequate space for narrative flow and accuracy. the researcher’s challenge was to keep participants within the reflexive boundaries of the inquiry and make sure participants explored the idiosyncrasies of their digital practices. this they did by tracing the roots of their digital dispositions and comparing and contrasting their digital scholarship practices to more conventional approaches. in this case study, habitus is, therefore, identified when the individual feels like a ‘fish out of water’ (nowicka, ). identifying this sense of displacement in dispositional form was not always a straightforward process in this case study. one of the reasons for this is that digital scholarship practices demarcate a new, distinctive activity that is not yet fully established or recognised by and in academia.
to some extent, the lack of institutional recognition does not help the cause, as it encourages a form of misrecognition on behalf of digital scholars – a form of symbolic violence. it challenges their positionality and the legitimacy of their digital dispositions in relation to the academic game. as such, participants often differentiated between what they understood as academic practices and what they regarded as online practices. their occasional reluctance to make links between the two types of practices made the job of the researcher even more challenging; moving between the position of the interlocutor and the narrator is difficult enough methodologically without the extra layer of complexity resulting from the cleft habitus. the question of importance here as a researcher is: what is being narrated? difference is a valuable indicator of a disjointed habitus (bourdieu, ), whilst the questioning of doxified practices can develop into an instrument of self-analysis and reflexivity on behalf of the researcher and the researched. yet reflexivity, in the case of this research, can only foster trustworthiness of narration when the necessary symbolic conditions become available to challenge the dominant practices of academia (bourdieu, ). it might be the case that the researched, in a time of major change, find it difficult to extract themselves from the field within which their work is legitimised (or not).
discussion
placed side by side, what do these two different studies tell us about the relation between social theory and methodology, specifically when applying the concept of habitus in different contexts?
it can be said that the conceptualisation of habitus is specific to a given social phenomenon as well as to the purposes of the research, i.e., the dimensions the researcher aims to disclose, which in turn need to be reflected in the research instruments that are devised for each research inquiry, including the specific types of engagement with the respective research participants. the case studies presented in this paper show that the operationalisation of habitus differs from one research project to another. as such, operationalisations of the concept of habitus, i.e., how habitus informs and works in the background of data collection strategies, are driven by the questions the research aims to answer, thus showing its flexible nature with regard to the research context. for example, in the study of graduate employment, habitus is perceived through patterns of practice via the repetition of behaviours and approaches, whilst the study on digital scholarship goes on to capture participants’ habitus by identifying the different types of practices that typify and differentiate the two worlds in which participants operate. while the guiding reference for capturing habitus in both studies is centred on its historicity (see bourdieu & wacquant, , pp. - ), the way each study arrives at its understanding is quite different. whereas the first study uncovers habitus by identifying routines and patterns of practice which bourdieu sees as ‘a tendency for self-reproduction’ (ibid, p. ), the second study ends up detecting cleft habitus ‘in the form of tensions and contradictions’ (bourdieu, , p. ). nevertheless, despite contrasting definitions and forms of operationalisation, both case studies point to the heuristic value of the habitus and its continuing application when examining these concepts.
it is also the case that the application of habitus across the two studies identifies more convergence than divergence in terms of the theory-method relationship. this convergence is especially acute when we home in on the ways in which habitus is encountered by the researchers. both studies encounter the rupture of individuals’ habitus through out-of-environment experiences and embodiment of another field’s rules, respectively. this is an important point. wacquant’s description of habitus as ‘being endowed with built-in inertia’ ( , p. , emphasis in original) illustrates the empirical challenge associated with habitus. in times of change or rupture, the habitus – albeit, a reconfigured habitus – provides a point of reference to observe and examine dispositions. this temporal and historical dimension to habitus is central to its significance and to the way in which it is researched. herein lies a core dilemma for the researchers in both studies: how to balance longitudinal concerns with latitudinal research methods. in the context of the studies reported in this paper, to apprehend habitus empirically means to acquire a longitudinal understanding of the social conditions of the (re)production of dispositions through a latitudinal approach to the operationalisation of habitus. the collection and reconstruction of agents’ life histories is an important technique for the (historical) recovery of social phenomena that can no longer be retrieved through longitudinal research, given the temporal gap between the past moments in which habitus starts to develop and the present instances in which the research takes place. due to the difficulty in obtaining funding for such approaches, researchers are left to devise methodological tools that aim to collect and analyse periods of agents’ experiences through latitudinal techniques and approaches.
as demonstrated in this paper, biographical and narrative interview methods can be devised for the purpose of ‘capturing’ participants’ habitus, yet it is the preparatory work the researcher devotes to conceptualising habitus (in light of his/her research questions) that transforms such techniques into effective research methods. in other words, it is not what participants narrate about their practices as part of their continuous experience but how they account for the dispositional aspects that constitute their habitus. furthermore, the rich and thick data often produced by these methods provides an opportunity to address the double bind (bourdieu, ) researchers face when they substitute common sense for learned common sense. with this temporal aspect comes a particular set of challenges for the researcher centred around positionality and reflexivity. one of the main challenges is due to the reliance – and, therefore, vulnerability – of the researcher in relation to the research participants who, assuming the role of raconteurs, offer up the meanings they themselves attribute to both their own practices and the conventions that shape their social worlds. even though such participant-led interpretations are, in themselves, an indication of their narrators’ habitus, this type of approach requires analytical caution when working with the accounts collected. participants’ accounts should not be treated simply as research data but, rather, as interactive instances in which the participants provide personal meanings of experience whilst taking into account their interlocutors, i.e., the researchers (pereira, ).
in other words, participants’ narratives are anchored in their own interpretations and should therefore be treated as (re)constructions of lived experiences within a given socio-cultural, political and economic context, which may or may not have been already rehearsed to other non-research publics (costa, ). unsurprisingly, the issue of researcher reflexivity is a common concern in much bourdieu-inspired literature, particularly when it comes to the methodological power of the interview (clegg & stevenson, ; hampshire et al, ; pillow, ). the challenges faced in the studies presented here find echoes in the work of other researchers who, à la bourdieu, take seriously the need to problematise and minimise the impact of ‘scholasticism’ (kenway & mcleod, , p. ). take, for example, slembrouck’s anxiety over the role of situational factors impacting on ‘what is sayable’ in interview situations ( , p. ) and the concerns raised by hoskins ( , p. ) about reducing participants’ complex life experiences and life history to ‘significantly abridged versions’ in interview scenarios. it is this temporal aspect that arguably presents the greatest challenge to the methodological issues explored in this paper, and which is characterised by caetano ( , p. ) as the ‘time lapse’ between the exercise of reflexivity at specific moments and the ‘discourse produced by each individual about that process retrospectively in a research context’:
we can ask someone to talk about past reflections, but that distance in time results in a possible reconstruction of senses and meanings. each person’s discourse is filtered by memory, experience, social circumstances and emotional states, and these constrain access to what they actually thought at a given moment in their lives. (caetano, , p.
)
the interplay between subjectivity and reflexivity is, thus, an important aspect in the application of the bourdieuian habitus. the challenge for the researcher is to navigate between the two to arrive at new understandings of the phenomenon at hand. in this paper, reflexivity is achieved first through acts of narration aimed at translating individuals’ experiences into ‘tangible’ forms of knowledge that bring tacit understandings of practices to a ‘visible’ state; ‘the turning back of the experience of the individual upon [herself/himself]’ (mead, / , p. ). but, as the interaction between the researcher and the research participants evolves from mere accounts of personal history into acts of introspection – and as the dispositions that characterise individuals’ habitus start to become more perceptible – it is also the researcher’s task to engage in a second phase of reflexivity in which what was narrated with a tone of familiarity needs to be approached from a distance to arrive at renewed understandings of the social reality under focus. even then – and bourdieu would agree (see bourdieu, , p. ) – it is not easy to identify the dispositions that lie underneath the practices we aim to study. critical reflexivity is, therefore, an essential tool in acquiring new knowledge (bourdieu, , p. ). this level of reflexivity is reached by instigating an epistemological break (bourdieu et al., ). through applying an abstract theoretical lens on the everyday, we make it unfamiliar and can begin to ask questions. the process of formulating research questions is, however, dependent on the dispositions that are under investigation – the key variable in these two case studies. the issue of ‘choosing’ variables or indicators of study is a long-standing one for many areas of theoretically-informed research.
for bourdieuian research that has generally manifested itself in the operationalisation of capitals – in particular, cultural capital (bennett et al., ; savage et al., ) – the same questions need to be asked in relation to the dispositions we focus on and the areas of repetition we examine when trying to unearth the habitus. at the beginning of this paper, we asserted that an isolated set of dispositions does not make a habitus. from this position, we have to ask ourselves two questions: how do we choose the dispositions to examine/question; and how can we discuss the habitus in reference to a few essentially isolated dispositions? the answers to these questions are not easy ones and will continue to be debated; however, reflecting on bourdieu’s own methodological approach can provide a starting point. when pushed by wacquant ( ) to provide an overview of his approach to methods, bourdieu provided a three-level model that is summarised by grenfell ( , p. ):
1. analyse the position of the field vis-à-vis the field of power.
2. map out the objective structure of relations between the positions occupied by agents who compete for the legitimate forms of specific authority of which the field is a site.
3. analyse the habitus of agents; the systems of dispositions they have acquired by internalising a deterministic type of social and economic condition.
for this discussion, it is the third level of bourdieu’s model which is most pertinent. grenfell argues that the facets of the habitus – essentially, the various dispositions – are only analysed as they relate to the field. he qualifies his position: ‘in other words, we are interested in how particular attributes, which are social in as much as they only have value in terms of the field as a whole. we are not concerned with individual idiosyncrasies’ ( , p. ).
when we discuss habitus, we are talking about it in a particular context, and, as such, the dispositions which are chosen are understood to be related to a particular field, and habitus is discussed in relation to that specific context. this process requires a keen reflexive approach fostered through the combination of an epistemological break and previous research but also grounded by empirical findings. there are clear parallels between this position and weber’s ( ) comments on how to choose which social interactions are worthy of investigation whilst maintaining a level of empirical rigour. in weber’s attempt to provide a scientific method, he argues that the infinite number of interactions between individuals requires a blunt vetting system in order to provide usable data. alongside the dilution of empirical certainty – lauded by the positivists – weber advocated that ‘we cannot discover however, what is meaningful to us by means of a ‘presuppositionless’ investigation of empirical data’ ( , p. ). rather, an application of social logic – in other words, informed/theoretical common sense – will reduce the infinite number of actions, reactions and interactions to a manageable quota whilst maintaining scientific authority. while we would advocate for more strenuous oversight than advocated by weber, the principle of making an ‘informed’ decision based on the empirical requirements and the field of study is clear.
conclusion
in this paper, we have attempted to sketch out how habitus can be operationalised as method and, in turn, how it can move theoretical understandings forward through a tailored application of the concept applied to the research phenomenon at hand.
we have demonstrated that defining the properties of the habitus is a complex exercise that requires a clear understanding of the facets of habitus in which the research is interested. in their own way, the two studies go on to excavate deeper into participants’ histories to access their practice backgrounds and study instances of change or extension of their experiences. although the means through which this is achieved diverge from project to project – from biographical interviews to narrative inquiry – there is an underlying assumption that we arrive at understandings and instances of habitus by tracing individuals’ subjective trajectories. however, in this paper, we have illustrated that such tracing of dispositions has a temporal and historical dimension which tends to add another layer of complexity onto what is already a complex theory-method relationship. habitus as a research lens requires careful methodological considerations that go beyond a mere choice of research techniques. it also requires the conceptualisation of theory as a research instrument ready to unearth the unspoken realities that characterise individual and collective dispositions. it is this concerted effort to understand social practices in their methodological and theoretical dialectic that allows researchers to move forward the contribution habitus makes to the social sciences.
references
abrahams, j. and ingram, n. ( ) “the chameleon habitus: local students’ negotiations of multiple fields”. sociological research online. ( ), (online) [accessed on th march ] available from: www.socresonline.org.uk/ / / / /html.
adams, m. ( ) “hybridising habitus and reflexivity”, sociology, ( ), - .
archer, m.s. ( ). culture and agency: revised edition. cambridge: cambridge university press.
archer, m.s. ( ). making our way through the world.
cambridge: cambridge university press
atkinson, r. ( ). the life story interview. sage.
atkinson, w. ( ) class, individualisation and late modernity: in search of the reflexive worker. hampshire: palgrave macmillan.
atkinson, w. ( ) beyond bourdieu: from genetic structuralism to relational phenomenology. cambridge: polity.
berners-lee, t. ( ). architectural and philosophical points. retrieved april , , from https://www.w .org/designissues/preface.html
bodovski, k., ( ). from parental to adolescents’ habitus: challenges and insights when quantifying bourdieu, in: costa, c., murphy, m. (eds.), bourdieu, habitus and social research: the art of application (pp. – ). palgrave macmillan, houndmills, basingstoke hampshire ; new york, ny.
bourdieu, p., ( ). outline of a theory of practice. cambridge university press, cambridge.
bourdieu, p., ( ). distinction: a social critique of the judgment of taste. routledge, abingdon, oxon.
bourdieu, p. ( ). the biographical illusion, working papers and proceedings of the centre for psychosocial studies, vol. , - .
bourdieu, p. ( ). the logic of practice. stanford, california: stanford university press.
bourdieu, p. ( ). pascalian meditations. stanford university press, stanford, california.
bourdieu, p., ( ). science of science and reflexivity. polity.
bourdieu, p. ( ). sketch for a self-analysis. english ed edition. university of chicago press, chicago.
bourdieu, p. & boltanski, l. ( ). changes in social structure and changes in demand for education, in: giner, s. and archer, m.s. (eds) contemporary europe: social structure and cultural patterns. (pp. - ) london: routledge.
bourdieu, p. & boltanski, l. ( ). the educational system and the economy: titles and jobs, in: lemert, c. (ed.)
french sociology: rupture and renewal since . (pp. - ) new york: columbia university press.
bourdieu, p., chamboredon, j-c. & passeron, j-c. ( ). the craft of sociology: epistemological preliminaries. new york: walter de gruyter.
bourdieu, p. & chartier, r. ( ). the sociologist and the historian. edition. polity press, cambridge, uk ; malden, ma.
bourdieu, p. & wacquant, l., ( ). an invitation to reflexive sociology. university of chicago press, chicago.
brooks, r. & everett, g. ( ). ‘post-graduation reflections on the value of a degree’, british educational research journal, ( ), - .
brown, p. & hesketh, a. ( ). the mismanagement of talent: employability and jobs in the knowledge economy. oxford: oxford university press.
burke, c. ( ). the biographical illumination: a bourdieusian analysis of the role of theory in educational research, sociological research online, :
burke, c. ( a). culture, capitals and graduate futures: degrees of class. london: routledge
burke, c. ( b). habitus and graduate employment: a re/structured structure and the role of biographical research, in: costa, c. and murphy, m. (eds) bourdieu, habitus and social research: the art of application. basingstoke: palgrave macmillan, - .
burke, c., scurry, t., blenkinsopp, j. and graley, k. ( ) “graduate employment: critical perspectives” in: tomlinson, m. and holmes, l. (eds.) graduate employability in context: theory, research and debate. london: palgrave
caetano, a. ( ). personal reflexivity and biography: methodological challenges and strategies, international journal of social research methodology, : , -
clegg, s. & stevenson, j. ( ). the interview reconsidered: context, genre, reflexivity and interpretation in sociological approaches to interviews in higher education research, higher education research & development, : , - .
costa, c. ( ).
the participatory web in the context of academic research: landscapes of change and conflicts. university of salford. retrieved from http://usir.salford.ac.uk/ /
costa, c. ( ). the habitus of digital scholars. research in learning technology, , – .
costa, c. ( ). double gamers: academics between fields. british journal of sociology of education , – .
costa, c., murphy, m. ( ). bourdieu and the application of habitus across the social sciences, in: costa, c., murphy, m. (eds.), bourdieu, habitus and social research: the art of application. palgrave macmillan, houndmills, basingstoke hampshire (pp. – ). new york, ny.
department for education (dfe) ( ). higher education: a new framework (white paper), cm , london: hmso.
emmerich, n. ( ) medical ethics education: an interdisciplinary and social theoretical perspective. london: springer.
france, a. ( ). theorising and researching the youth crime nexus: habitus, reflexivity and the political ecology of social practices, in: costa, c., murphy, m. (eds.), bourdieu, habitus and social research: the art of application. palgrave macmillan, houndmills, basingstoke hampshire ; new york, ny, – .
grenfell, m. ( ). postscript: methodological principles, in: grenfell, m. (ed.) bourdieu: key concepts. (pp. - ). durham: acumen.
hampshire, k., iqbal, n., blell, m. & simpson, b. ( ). the interview as narrative ethnography: seeking and shaping connections in qualitative research, international journal of social research methodology, : , -
heath, a., sullivan, a., boliver, v. & zimdars, a. ( ). education under new labour, - , oxford review of economic policy, ( ), - .
hodkinson, p. ( ) “career decision making and the transition from school to work”, in: grenfell, m. and james, d. (eds.) bourdieu and education: acts of practical theory. routledge: london. pp. - .
hoskins, k. ( ).
researching female professors: the difficulties of representation, positionality and power in feminist research, gender and education, : , -
jenkins, r. ( ). pierre bourdieu and the reproduction of determinism. sociology , – .
jenkins, r. ( ) pierre bourdieu. london: routledge
lupton, d. ( ). digital sociology. abingdon, oxon: routledge.
kenway, j. & mcleod, j. ( ). bourdieu's reflexive sociology and ‘spaces of points of view’: whose reflexivity, which perspective?, british journal of sociology of education, : , - .
kim, j.-h. ( ). understanding narrative inquiry: the crafting and analysis of stories as research. london: sage publications.
mead, g. ( ). mind, self, and society: from the standpoint of a social behaviorist: new edition. university of chicago press, chicago.
miller, r. ( ) researching life stories and family histories. london: sage publications.
office for national statistics (ons) ( ) full report – graduates in the uk labour market , (online). available at: www.ons.gov.uk/ons/dcp _ .pdf (retrieved september ).
nowicka, m. ( ). habitus: its transformation and transfer through cultural encounters in migration. in c. costa & m. murphy (eds.), bourdieu, habitus and social research (pp. – ). london: palgrave macmillan uk.
pereira, f.h. ( ). el mundo de los periodistas: aspectos teóricos y metodológicos. comunicación y sociedad, pp. –
pillow, w. ( ). confession, catharsis, or cure? rethinking the uses of reflexivity as methodological power in qualitative research, international journal of qualitative studies in education, : , -
porpora, d.v. ( ). morphogenesis and social change, in: archer, m.s. (ed.) social morphogenesis. london: springer. pp. - .
purcell, k., elias, p., atfield, g., behle, h., ellison, r. & luchinskaya, d. ( ). transitions into employment, further study and other outcomes: the futuretrack stage report, the higher education careers service unit.
reay, d. ( ) “it’s all becoming a habitus: beyond the habitual use of habitus in educational research”, british journal of sociology of education, ( ), - .
robbins, l. ( ). higher education: report of a committee (the robbins report), london: hmso.
rosenthal, g. ( ) “the healing effects of storytelling: on the conditions of curative storytelling in the context of research and counselling”, qualitative inquiry, vol. ( ), - .
rosenthal, g. ( ) “biographical research” in: miller, r.l. (ed.) biographical research methods volume iii. london: sage publications. pp. - .
schütze, f. ( ) “pressure and guilt: the experience of a young german soldier in world war two and its biographical implications”, international sociology, vol. ( ), - ; vol. ( ), -
schütze, f. ( ) biography analysis on the empirical base of autobiographical narratives: how to analyse autobiographical narrative interviews - part i (online) available at http://www.biographicalcounselling.com/download/b . .pdf
slembrouck, s. ( ). reflexivity and the research interview, critical discourse studies, : , - .
smith, n. ( ). work as a sphere of norms, paradoxes, and ideologies of recognition, in: o’neill, s., smith, n.h. (eds.), recognition theory as social research: investigating the dynamics of social conflict. palgrave macmillan, london, – .
stahl, g., ( ). egalitarian habitus: narratives of reconstruction in discourses of aspiration and change, in: costa, c., murphy, m. (eds.), bourdieu, habitus and social research: the art of application. palgrave macmillan, houndmills, basingstoke hampshire ; new york, ny, – .
tomlinson, m. ( ).
‘the degree is not enough’: students’ perception of the role of higher education credentials for graduate work and employability, british journal of sociology of education, ( ), - .
wacquant, l. ( ). a concise genealogy and anatomy of habitus. the sociological review, ( ), – . http://doi.org/ . / - x.
wacquant, l. ( ). habitus, in: beckert, j. & zafirovski, m. (eds) encyclopedia of economic sociology. london: routledge. pp. - .
weber, m. ( / ). objectivity in social science and social policy, in: shils, e.a. and finch, h.a. (eds) the methodology of the social sciences. new york: free press. pp. - .
weller, m., ( ). the digital scholar: how technology is changing academic practice, st ed. bloomsbury publishing plc, london.
yang, y., ( ). bourdieu, practice and change: beyond the criticism of determinism. educational philosophy and theory , – .
zeuner, l. ( ) “review essay: margaret archer on structural and cultural morphogenesis”, acta sociologica, , - .

environmental scan
by the acrl research planning and review committee
march

association of college and research libraries
american library association
e. huron st.
chicago, il ‐
telephone: ( ) ‐ , ext.
fax: ( ) ‐
e‐mail: acrl@ala.org
web: www.acrl.org

acrl environmental scan
introduction and methodology
the environmental scan of academic libraries is the product of acrl’s research planning and review committee. in the committee produced the “top trends in academic libraries,” published in college and research libraries news (middleton et al. ). the environmental scan expands and broadens that document. although broader than the “top trends,” the environmental scan provides an overview of the current environment for academic libraries rather than an exhaustive examination.
the current scan addresses topics related to higher education in general and their resulting impact on library collections and access, research data services, discovery services, library facilities, scholarly communication, and the library’s influence on student success.
higher education environment
in a time of growing economic inequality in the united states, there is a heightened focus on social mobility and general well-being. as educational completion correlates with income level, the affordability of higher education has become a frequent topic in the media. rising student debt has led to increased scrutiny of higher education costs and outcomes. in december , the obama administration released the framework for a college ratings plan that would link federal funding to a number of performance metrics such as a college’s average net price, its students’ completion rates, the percentage of its students receiving pell grants, labor-market outcomes, and loan-repayment rates. many colleges and universities also rely on student tuition to fund most of their operating budgets at a time when net student revenues are declining. most public institutions are experiencing large cuts in state support and more government oversight. many community colleges find themselves unable to meet student demand for more affordable educational degree paths. research funding levels have decreased, leading to an increasingly competitive environment for research institutions (bidwell ). at the same time, data-intensive research is necessitating new requirements for related infrastructure and data management services, and the federal government has issued open access mandates for federally funded scientific research. federal agencies have submitted and are currently revising release plans to comply with the february white house office of science and technology policy directive (holdren ). technology is advancing new delivery models in higher education.
the for-profit sector and open education models offer convenient alternatives to traditional place-based programs. massive open online courses (moocs) and competency-based education (cbe) models represent such market-based alternatives. online learning is an attractive option for adult learners, a demographic that has been the focus of many large for-profit institutions; these students can complete degree programs and other credentials at a self-determined pace and a lower cost (hurst ). technology allows students, faculty, and staff to collaborate, teach, and learn at a level that strains existing infrastructures and service models. the current environment “offers new ways to connect things that were previously considered disparate and ‘un-connectable’: people, resources, experiences, diverse content, and communities, as well as experts and novices, formal and informal modes, mentors and advisors” (abel, brown, and suess ).

library collections & acquisitions

general overview

libraries are reassessing their collection practices and strategies and developing a more holistic approach to collections, particularly in light of emerging diversification of the scholarly record (e.g., learning materials/objects, open access materials, freely available digital resources, etc.). to address this new diversification, dempsey, malpas and lavoie ( ) offer a useful matrix based on stewardship, scarcity, and uniqueness of resources that may provide some guidance for collection managers. the authors elaborate on the consequences and implications of “outside-in” (information provided by external vendors and licensed by the library) and “inside-out” resources (locally created resources such as digitized collections, learning objects, etc.) for stewardship/preservation, infrastructure, collaboration, and internal and external workflows.

e-books: still in flux

the e-book market remains in flux, with most publishers offering options directly and through aggregators, providing both subject packages and individual firm ordering through book vendors. of particular note is the significant success of university press partnerships with well-esteemed academic portals such as project muse and jstor. digital rights management (drm) continues to be a challenge for managing and using e-books (in particular for reserves and interlibrary lending/borrowing), with restrictions on printing, downloading, and re-use of content. some of these drm issues, as noted further below, have been eliminated through the direct delivery of content by individual publishers, or through third parties who have negotiated extensively with these publishers. some print-on-demand services do exist from publishers such as springer, which allows for printing entire e-books rather than just individual chapters.

much of the discussion about e-books centers on the role of the print codex (monograph) in scholarly communication and whether or not it will retain a revered status in the academic ecosystem. as schonfeld ( ) notes in his provocative ithaka s+r article, stop the presses, the enhancements made possible in the digital format have not come to complete fruition or acceptance. a number of studies have shown that e-books and print books can serve very different purposes for researchers and patrons, whether for basic searching or for actual reading (rod-welch et al. ; staiger ; li et al. ). although there continue to be predictions of bookless libraries (with books no more than aesthetic decoration), only a few high-profile examples have emerged.

according to a recent ithaka s+r us library report (long & schonfeld ), the transition to e-books has not been as smooth as earlier predicted. for example, most library directors report that large-scale acquisition of e-books has not led to large-scale de-accession of print materials.
another ithaka s+r report focused on faculty (housewright et al. ) provided evidence that most faculty are still wary of an e-only monograph future. even for the sciences, only around % of faculty surveyed responded favorably to the statement that within the next five years “it will not be necessary to maintain library collections of hard-copy books.” rather, faculty indicated that print titles (particularly low-use titles) were more likely to move to a storage facility. with that said, only around - % of library directors still consider the acquisition of print books as a means to build research collections a high priority. some collection managers have addressed e-book growth by establishing and expanding e-approval plans, which are no longer reserved for stem publications. even with e-book approvals, though, significant percentages of titles are still received in print within profiled call number ranges.

a confounding issue in e-book acquisition and management centers upon the lending of e-books across institutions. most electronic monograph licenses remain relatively restrictive on the sharing of e-book content, thereby practically challenging the first sale doctrine upon which ill operations rely. a new pilot between springer verlag, texas tech university, the greater western library alliance (gwla) and the university of hawai’i at manoa provides a new option for sharing such e-book content. a new software program/interface, occam’s reader, which functions as an add-on to the widely used illiad lending software, is currently being tested (anderson ).

streaming media/video

an increasing number of libraries have been subscribing to streaming video and audio services (e.g., kanopy, alexander street press, naxos) to meet faculty and student demand for these resources. some libraries have also applied demand-driven acquisition to streaming services, in which the number of uses (i.e., views/listens) can trigger the purchase of a streaming license for a particular work.
kanopy has been the notable model for such a service. streaming services have definite consequences for technical services (e.g., licensing of public performance rights), systems workflows (e.g., ensuring compatibility with ezproxy servers), and access and discovery (e.g., availability of marc records). drm restrictions on re-use for teaching and research (e.g., clip-making, reserves use), ownership of perpetual streaming rights by libraries, and increased need for bandwidth are all issues at the forefront of this streaming audio and video surge.

implications

• libraries should continue to work with vendors and each other to better manage the sharing and preservation of e-book content.
• libraries will need to continue to manage a hybrid e- and print monograph world for some time to come, balancing user needs and preferences, space issues, and access.
• streaming av has its own set of challenges that are currently in a state of discussion and negotiation between libraries and vendors.

demand driven acquisition

e-book demand-driven acquisition (dda) and patron-driven acquisition (pda) pilots have now reached a level of maturity and have become an integral part of collection development and acquisition workflows within many academic libraries and consortia. in light of this significant shift and adoption, niso has recently unveiled a set of recommended practices for dda implementation (niso b). although focused primarily on e-books, the recommendations are also applicable to print dda initiatives, which have been tried out at several academic institutions in the form of using open worldcat as the primary discovery layer for patrons or using print-on-demand bookmakers. vendors such as springer already allow for print-on-demand services, but these require purchasing specific e-book collections as a whole. dda models have yielded substantial cost savings and have been at the forefront of the strategic shift between real-time collection building and long-term collection building.
although dda models have had significant impacts on library collection budgets, there are questions as to the sustainability of these models, particularly in light of recent short-term loan price increases from various publishers (some of which have reached an increase of over %). some publishers, such as wiley and palgrave, have been marketing a new model known as “evidence-based collections” in which subscribers pay an agreed-upon, upfront fee to access all e-titles in the publisher’s collection (or a subset thereof) for a year. the library can then choose which titles to add to its permanent collection, but must purchase an agreed-upon minimum threshold. a key implication of these new publisher models is that they act more as subscriptions, whereas dda models follow a more traditional monograph acquisition model and do not require an upfront fee or purchase threshold (except for record loading). the potential benefit of these publisher-directed models is the less stringent (or absent) drm. potential issues, however, center on assessment of collection use. in other words, how many uses lead to an addition to the catalog? is a pdf download of one chapter or a simple browse on the landing page enough to merit inclusion?

implications

• libraries should evaluate their ongoing, established dda programs carefully and ask for detailed usage statistics to perform such assessments.
• new publisher models of patron-based acquisition such as evidence-based models are still relatively new, and need to be carefully assessed.

textbook/course-adopted readings and libraries

textbook affordability and course reading support continue to be substantial areas of discussion among librarians (demas ), with numerous initiatives being piloted. several states have addressed textbook costs through legislation, as has the federal government, requiring students to have access to title lists prior to class enrollment.
the role of libraries in textbook support and acquisition continues to be in flux. libraries have begun promoting open educational resources (oers) through direct grants as a means to address rising costs. other institutions have begun to focus on course-adopted readings, rather than traditional textbooks, and promote e-collections as a means to better meet patron demands for these high-use materials (e.g., university of north carolina-greensboro pilot). another approach has been to purchase textbooks for certain fields and place them on reserve, using either existing collection dollars or special funds.

implication

• libraries can play an important role in providing more access to textbook and course-adopted texts (particularly with e-books), but need to take heed of and collaborate with the many internal university players in the textbook and course readings ecosystem.

curating collective collections/collaborative print management

shared print repositories continue to be of great interest to academic libraries as a means to more efficiently manage and sustain legacy print collections, expand access, and create or repurpose existing physical space in individual libraries. an oclc report, “understanding the collective collection” (dempsey et al. ), accentuates the “shift from local provisioning of library collections and services to increased reliance on cooperative infrastructure, collective collections, shared technology platforms, and ‘above-the-institution’ management strategies” (oclc ). memoranda of understanding (mous) are becoming more common as a means to govern and structure decision making around shared/collective print collections, including guidelines on retention and last copies (demas , which builds upon malpas ). per a recent arl spec survey (crist and stambaugh ), these collaborative relationships focus much more on shared management of retrospective collections than on prospective collaborative collection development or management.
although most participants in these collective arrangements are public or state universities, there is a move to more public-private partnerships (e.g., emory and georgia tech; see payne ). relatively new consulting services such as sustainable collection services (scs) have also appeared to assist individual academic libraries with a data-driven methodology for de-selection.

two new arl spec kits, # (britton and renaud ) and the aforementioned # (crist and stambaugh ), focus on print retention policies and shared and collaborative print initiatives across numerous institutions and consortia. they provide significant guidance in establishing infrastructure and addressing potential issues in print resource management, including communication strategies with relevant stakeholders. the arl spec kit # on print retention decision-making “examines research libraries’ print retention decision making strategies related to storage of materials in three different types of facilities or circumstances: on-site, staff-only shelving; remote shelving; and collaborative retention agreements.” spec kit # on shared print programs “explores the extent of arl member libraries’ participation in shared print programs, the type and scope of programs in which they choose to participate, the rationale for participation, the value and benefits the programs provide to arl and other libraries, and the roles different libraries are playing in them.”

a particularly interesting section of the shared print programs study focuses on shared print monographs and “future” services, i.e., potential leveraging of these retrospective collections in light of e-books and digitization. new possible services considered include coordinated digitization of shared collections, scan-on-demand services, metadata crosswalks between shared print and digital copies, and enhanced interlibrary lending networks. access to and discoverability of these shared collections is another issue that should be considered.
how are users able to locate these collections in a seamless fashion? several consortia and regional institutions are implementing or have already implemented joint/shared ils to manage these shared holdings in both print and electronic formats.

implication

• there should be a continued review of the collaborative and coordinated management and use of retrospective print collections and how to enhance services associated with these collections and their digital counterparts.

collections assessment

collecting metrics on library collections has long been a source for evaluating the usage of the collections and their relevance to the academic programs they support. metrics have also been used to reflect the size, ranking, and prestige of institutions. the current trend continues to focus on how collections help support the library’s alignment with the campus vision/mission/goals, and to what degree they contribute to research, student success, and other criteria. traditionally these metrics have focused on collections owned and managed by the library. as the library’s curation role expands to e-research, data, open access scholarship, born-digital resources, and open education resources, the potential for tracking and assessing what is held in institutional repositories has raised some practical issues on what to measure and the need for standards for cross-institutional and global comparisons. in addition, further studies are being undertaken to assess how the increased dissemination of scholarship might help advance research and increase institutional standing (webometrics n.d.).

the development of altmetrics that measure the impact of new modes of scholarly communication (such as blogs, social media, institutional repositories, etc.) has led to new approaches in evaluating the importance of individual authors’ works and has influenced the way library collections are both developed and evaluated.
the new measures have also opened up opportunities for library staff to engage with researchers in the ongoing dialogue of how scholarly impact is measured and to participate with other stakeholders in developing standards (niso altmetrics initiative ).

implications

• libraries will need to continue to track and assess the value of collections beyond the traditional boundaries to include new modes of scholarship.
• libraries will need to engage with researchers on the impact of new modes of scholarship and new ways to measure this impact and its implications for collection development, management, and data curation.

research data services

responses to us government and funding agencies’ policies

in , the us office of science and technology policy (ostp) released a memo to the heads of executive departments and agencies with the subject heading of “increasing access to the results of federally funded scientific research” (holdren ). this policy required that the direct results of federally funded scientific research, including both peer-reviewed publications and scientific data in digital formats, be made available and useful to the public, industries, and scientific communities. currently, all federal funding agencies with an annual budget of over $ million need to develop plans for sharing their funded research results, including providing public access to the data. higher education and research communities as well as publishers are all working toward developing suitable dissemination platforms for these agencies to share future scientific results, but they are pursuing different paths. academic libraries are participating in the shared access research ecosystem (share) project ( ), while more than publishers collectively supported the clearinghouse for the open research of the united states (chorus) project ( ).
it is still unknown which of these two possible solutions will ultimately serve federal agencies better, but the issue of data linkage will likely be a key differentiator.

the new ostp policy recognizes the need to protect confidentiality and personal privacy while maximizing public access to digital research data. however, balancing the needs of privacy protection and scientific research autonomy will not be an easy task. for example, the us department of health and human services developed the health insurance portability and accountability act (hipaa) privacy rule in , which attempted to standardize procedures for protecting the privacy of personal health information while allowing for health data sharing and reuse. but according to an institute of medicine study, the interpretation and implementation of hipaa policy has been costly and has caused unintended negative impacts on health research in many ways (nass, levit, and gostin ). the study calls for a new legal and regulatory framework to better protect privacy and facilitate responsible health research through such approaches as requiring the data provider to establish stronger security safeguards and implement legal sanctions to prohibit unauthorized re-identification of information after it has been de-identified. no matter how the new ostp policy will handle similar technical, legal, and ethical issues of public data access, academic librarians, serving both the data creators and data users, will have more opportunities to provide valuable services beyond data management plan consultation (goben and salo ).

implications

• the future of research data services of academic libraries will continue to be driven by larger academic factors and government policies, as well as even broader national development priorities and international competition and collaborations.
• academic libraries need to pull together their human and intellectual resources and collaborate on developing state-of-the-art, cross-institutional digital platforms for disseminating scholarly projects in multiple formats.
• academic libraries can leverage their expertise and experience in curation, preservation, and data management to support, educate, and facilitate government agencies that now need to make their data and information more publicly usable and accessible.

understanding researchers’ data sharing and management practices

broader and institutional-level policies and requirements that regulate and potentially change researchers’ behaviors affect the everyday tangible practices of research data sharing, management, and preservation. also important are research communities’ norms, their awareness of available resources, and individual researchers’ motivation to increase their own visibility (kim and stanton ). increasing numbers of scientists are beginning to reflect on their own data sharing abilities and challenges. institutions are trying to identify researchers’ real data needs and develop more targeted programs for research data services. meanwhile, academic librarians have also conducted more survey and interview studies on large and small groups to identify researchers’ current strategies of dealing with data.

based on an international survey of over , scientists, one study found that, although most researchers realize the importance of data sharing and preservation, they are usually limited by time, budget, and information about currently available support and tools (tenopir et al. ). another international study of over , scientists, conducted by the publisher wiley (ferguson ), revealed the national and disciplinary differences in research data sharing and found that researchers are more willing to share if they can get full credit for sharing data and thus increase their overall impact within research communities.
from the scientists’ perspectives (marx ; budin-ljøsne et al. ), extensive technical challenges still arise when sharing data in a broader range of communities. even sharing across consortia within the same disciplines is difficult, especially when reuse of data requires detailed information on research methods and software tools. faced with these challenges, scientists are not motivated enough to invest in better solutions, partly because not enough forms of recognition or ethical standards of sharing data have been developed.

smaller scale studies of scientists or research communities have developed deeper dialogues between librarians and researchers and provided opportunities for librarians to introduce newly created data services to their users (diekema et al. ; williams a). librarians have learned that most researchers are not aware of libraries’ various support services throughout the research data life cycle, and librarians have had to educate researchers about their expertise and knowledge in the relevant fields of research data.

obvious gaps exist between the available resources and information and the researchers who need data management and sharing support services. therefore, libraries must still develop outreach and education efforts with an eye to innovation, and then implement new services, programs, or research projects. detailed strategies might include, for example, a bibliographic study of academic publications to identify researchers to target with data curation services (williams b) or plans to take advantage of the end dates of funding life cycles, when researchers need to implement their data archiving plans (nilsen et al. ). these ideas have been suggested to maximize buy-in for library data services.

implications

• disciplinary and methodology differences influence researchers’ data collecting, analyzing, and sharing behaviors and thus require data services librarians to develop a deeper understanding of research processes, in order to provide suitable assistance within each research field.
• increasing numbers of data management and curation services will be developed based on an evaluation of specific research programs’ needs and practices.
• innovative outreach strategies are needed for academic libraries to market their existing data services to users who are usually unaware of librarians’ expertise and the available tools and resources.

advances in data curation services

as the data curation policy working group of oclc (erway ) has pointed out, although academic libraries are still the main stewards of research data who care about the long-term preservation of this special asset, collaboration between campuses and even institutions is key to services’ success. collaboration with other campuses or institutional units, such as research and research compliance offices and, especially, research departments, could even enable a smaller and less research-intensive university to successfully engage with faculty in data management education and curating research data for long-term preservation (shorish ). academic library data curation services have developed beyond simple extensions of institutional repositories into more customized features while librarians work closely with researchers (olendorf and koch ; miller et al. ). this can include collaborating with disciplinary repositories to maximize the visibility of otherwise hidden data held by individual researchers (akers and green ).
data curation quality control is currently a major challenge, and even institutional data repositories are inadequately performing the steps to evaluate deposited data, according to a comprehensive review article on the commitment to data quality among different types of data curators (peer et al. ). however, a clearly identifiable trend toward quality control is emerging. workflow models and examples are being presented and shared within the data curation community to make data preservation more streamlined and accountable (giarlo ; hense and quadt ; johnston a).

research data curation requires broad, cross-disciplinary expertise as well as specific content knowledge in science, engineering, and data management (mayernik et al. ). in support of this growing need, the harvard-smithsonian center for astrophysics john g. wolbach library and the harvard library have developed data scientist training for librarians or dst l (altbibl.io/dst l/), an experimental course to train and retool librarians to respond to the growing data needs of their communities. a recent study analyzing placement rates revealed that applicable knowledge and hands-on experience strongly influence whether graduates from curation programs are able to get jobs in either libraries or industries. continuing education programs allow data curators to update and further develop their skills while working in their current positions, given the new challenges facing them within the changing landscape of data curation (palmer et al. ).

implications

• data curation and preservation will require more collaborative efforts between multiple campuses and institutional units, and academic libraries could be the initiators and coordinators of policy development and program design.
• customizing features according to specific research communities’ needs and implementing reliable measures for data quality review and control will need a further understanding of research processes and deeper engagement with researchers.
• preparation of the data curation workforce requires both formal library school training and continuing education programs, and the skills and knowledge taught need to be practical and to cover science, engineering, and data management domains.

data information literacy: national and regional projects

data services librarians have been advocating data literacy as an essential aspect of information literacy for a long time. this was recently synthesized on a theoretical level into a detailed list of core content and competencies for articulated data literacy instruction, including additional newly identified competencies in data management (prado and marzal ). data librarians in academic libraries are exhibiting more collaborative and collective efforts for instruction on data information literacy: gathering user information, engaging in conversations across institutions and disciplines, and developing and sharing instructional materials, pedagogical strategies, and practical experiences.

the institute of museum and library services funded a successful multi-institutional data information literacy project in . the project counted on the participation of data services librarians and subject specialists in different research departments and laboratories from multiple institutions, including purdue university, university of minnesota, university of oregon, and cornell university. faculty and graduate students’ needs were assessed using a standardized measurement instrument, and different instruction delivery approaches were shared in timely publications (carlson et al. ; carlson et al.
) and at a symposium (data information literacy a, b) where academic librarians from across the nation gathered together to learn about each other’s experiences and to discuss further steps.

another noteworthy multi-institutional data information literacy program is the new england collaborative data management curriculum (necdmc) project ( ), with participants currently from countway library of medicine, university of massachusetts medical school, and tufts university’s marine biological laboratory and woods hole oceanographic institution library. this project has developed a series of instructional modules for teaching best practices in data management based on the frameworks for a data management curriculum (martin et al. ), which can be adopted and customized for different contexts. the project’s participants are also collecting actual cases in research data management from many different disciplines to be used for instruction.

implications

• data information literacy has been recognized as an important component of general information literacy competencies for higher education. data librarians need to join more actively in dialogues about information literacy, learn from newly developed pedagogical strategies, and contribute based on their special perspectives as well.
• data librarians or subject librarians who are assigned to, or interested in, data information literacy instruction or data management practices training could benefit from existing collaborative national and regional data services program models and curriculum materials, to customize their own efforts within local contexts.

data management services: new specialties for subject librarians

newly hired data services librarians need to work with subject specialists to provide subject-specific data management services. many times, academic libraries merely add additional data management responsibilities to existing subject librarians’ duties, rather than hiring new data specialists.
in either case, subject librarians or liaisons with schools and departments are facing this new challenge and opportunity to acquire new skills and knowledge related to data management. in many disciplinary fields, such as science, business, and health, librarians are paying attention to this new professional demand and publishing studies on the meaning and relevance of data management in their specific fields. digital humanities also provide an area where libraries can offer support through data management services. adams and gunn ( ) note that data services departments “are appearing at many academic libraries as more administrators, researchers and librarians see the possibilities for data use in the humanities as well as in the sciences.” the published studies also address the resources librarians need to equip themselves with the necessary skills so that they can quickly adapt to change (elmore and jefferson ; creamer et al. ; tenopir et al. ).

researchers have surveyed academic librarians’ perceptions and attitudes toward this currently emerging role and discovered some important differences between librarians and academic library administrators (tenopir et al. ; tenopir et al. ). as librarians are expected to take on a growing number of new responsibilities, such as support for research data management, they recognize gaps in their current store of skills and knowledge. although administrators believe that they are providing sufficient training opportunities to bridge these gaps, librarians do not perceive this level of support from their institutions.

implications

• new roles in supporting research, especially research data services, are emerging as new services within academic libraries. these growing opportunities to become further engaged in research processes are inspiring visionary library administrators to reprioritize library functions and even reorganize their libraries’ structure to align with these new needs and potential areas of innovation.
• more collaboration among different units of academic libraries will become increasingly common and important in carrying out complicated research support projects, for example, those that involve data discovery, collection, documentation, management, and curation. innovative on-site professional development opportunities, such as cross-departmental dialogues, observations, and demonstrations, will be valuable in developing new collaborative networks and relationships among librarians from different units.
• professional development opportunities need to be created for all librarians, and they should not be limited to support for attending conferences and short, one-time knowledge updates. they should also include release time and financial support for librarians to enroll in continuing education programs and to obtain certificates in new specializations.

discovery services

many libraries have implemented discovery layer services designed to deliver unified results across resource and collection types. the configuration and local enhancement/customization of a discovery service enhances the user experience and encompasses the library’s print, media, electronic resource, library services, library staff and expertise, and resource guides. enhanced discovery requires library staff with systems thinking and web development skill sets.

shared integrated library systems (ils)/resource management systems (rms)

academic libraries continue to explore ways to provide access to information in the broadest way possible through discovery services. there is also increased interest in shared integrated library systems (ils) and resource management systems (rms) that provide behind-the-scenes infrastructure to coordinate the holdings of large consortia or multi-campus systems (e.g. orbis-cascade, illinois heartland library system) (breeding ).
to meet user expectations and preferences, interface design is increasingly modeled after the discovery interfaces in the commercial sector. for example, google's search engine has become so popular that many of these systems provide a similar search-box interface (with options for more advanced search features), and “recommender” systems and relevance rankings similar to amazon’s. “cloud systems” are increasingly replacing the traditional technical and storage infrastructure to run these systems.

implications
• advances in discovery systems and shared ils/rms systems are enabling multiple institutions to provide broad user access to library collections and to provide the back-end infrastructure that supports these partnerships.
• libraries should continue to consider users’ expectations and information-seeking behaviors in developing or selecting discovery systems.

collaborations

large, multi-institutional collaborations focused on digital collections or technology infrastructure have also changed the face of discovery services. projects such as the digital public library of america join the ranks of other large portal sites like europeana to provide users with access to diverse research holdings from numerous institutions. the partnership between the library of congress and twitter to archive and provide access to the world's tweets is one of the large-scale projects addressing the preservation of and access to new modes of communication. the committee on coherence at scale sponsored by clir and vanderbilt university has been formed to analyze national-scale digital projects that help transform higher education. what sets many initiatives apart from the previous generation of library projects is the focus on designing platforms to support the sharing of code and the creation of added-value services by the community, such as apis that support development of apps and other tools (experian ).
as the number of self-contained portals, repositories, and online catalogs continues to grow, libraries want to create seamless discovery environments and service layers to help researchers search across all these information-rich silos. new developments include open source discovery applications that enable users to search across catalogs, repositories, and digital libraries and view a range of materials and formats (books, manuscripts, images, etds, e-journals, etc.) without the disparate information silos having to merge their infrastructures behind the scenes.

implication
• libraries will continue to address users' needs by providing broad access to collections via portals, exploring the benefits of large-scale collaborations for digitization, and adding service layers that facilitate searching, discovery, and manipulation of the content users find.

user-driven research: linked data, data mining, & analytical tools

linked data is about making connections between related data using the semantic web. as libraries increasingly use resource description framework (rdf), uniform resource identifiers (uri), world wide web consortium (w3c) standards, and other best practices in the management of data, researchers benefit from the ability to more easily discover data. what makes this so exciting is that it empowers researchers to make new connections between related data and facilitates the creation of new knowledge (lampert and southwick ; krafft and corson-rikert ).

user-driven research is also being supported through platforms that support data mining. for example, the hathitrust research center provides computational access to researchers for non-profit and educational use of the ht corpus of works in the hathitrust digital library. libraries frequently support text and data mining via vendor-digitized collections.
additionally, various analytical tools have been developed, such as the google books n-gram viewer, voyant tools, and raw, to help researchers perform textual analysis and create visualizations of data in ways that contribute to new insights (kerr ; varner ).

implications
• libraries have the opportunity to empower users by providing rich and deep content platforms with tools that facilitate discovery and analysis, which ultimately enables them to make information connections that contribute to the creation of new knowledge.
• in support of non-consumptive scholarly research, libraries, in collaboration with content vendors, should explore options for providing data mining functionality in aggregated databases.

library facilities

the ithaka s&r us library survey , mentioned earlier in this report, also highlights the recognition of the library as a place important to the university and to student success. in this survey of library directors, “providing a space for student collaboration” (long & schonfeld ) was a high priority for nearly % of baccalaureate, master’s and doctoral level institutions.

current discussions of library facilities focus heavily on student success services and the library as an academic or learning commons. holmgren and spencer ( ) present the results of discussions at a chief information officers’ workshop sponsored by the council on library and information resources (clir). they conclude that “by , many library buildings will have been transformed into an academic commons whose primary role is to host academic support services while also providing space for what remains of the library’s physical collection”.
as library spaces are re-envisioned for this new role, characteristics such as state of the art technology access and support, flexibility of the infrastructure and furnishings to meet current as well as future demands, accessibility for a wide variety of users, and environmental “friendliness” are essential in enabling the space to meet institutional goals. planning for library construction or remodeling projects in this environment requires consultation and collaboration with stakeholders across the university. in his discussion of ways academic libraries are adapting for the future, brad lukanic ( ) identifies four key areas libraries must pay attention to: responding to strategic campus and business needs, providing technology in every aspect of service, embracing flexibility to meet current and future needs, and providing places for engagement.

libraries are reaching across campus divisions to collaborate with student affairs and campus life personnel to develop integrated approaches and programming that foster holistic student success. academic support services are co-locating with libraries to provide seamless services. recently, new library buildings have been designed specifically for these purposes. for example, libraries at seattle university and grand valley state university (seattle university, n.d.; gvsu libraries - ) include dedicated space for additional student success services like tutoring and writing centers and a variety of physical spaces and media production facilities. in addition to providing collaboration spaces, the gvsu library also made provision for quiet study spaces (gvsu libraries - ).

new library buildings and furnishings are designed with flexibility for the future in mind. pedagogical and curricular changes are leading library planners to include technology-enhanced learning spaces in both reconfigurations and newly built facilities.
spaces are being designed to allow users to engage with a range of technologies that support multiple modes of teaching and learning, including collaborative and individual work in support of emerging high-impact practices. many libraries offer multimedia production facilities and lend technology tools that support media-enriched content creation.

digital scholarship centers as described by lippincott, hemmasi and lewis (june ) are increasingly found in academic institutions of all types and involve a variety of disciplines with the goal of co-locating expensive equipment, expertise, and services such as assistance with planning research projects, use of software, metadata, intellectual property issues and preservation. as the authors note, “[digital scholarship] centers in their early stages are experimenting with various services and staffing models as they develop partnerships and engage with various researchers; even well-established centers frequently adjust their priorities and services as the nature of digital scholarship and those engaged with such work on campus evolves.” as a central location on campuses, libraries are an obvious place to house such centers.

planning for and assessment of the outcomes and benefits of these new spaces is increasingly important. as services and collections in libraries evolve, a clear understanding of the institutional environment for teaching, learning and scholarship is necessary to ensure that library facilities continue to meet user expectations and priorities.

implications
• as libraries are increasingly required to share their spaces with other campus offices, creativity will be required to envision ways to open up space for these constituencies while still providing the spaces needed for more traditional library services.
• libraries at institutions where new buildings or major remodeling efforts are not possible will need to consider other ways to build these connections.
options include finding ways to decrease collection footprints in order to accommodate additional offices and spaces for new initiatives/technologies or to partner outside of the library facility.
• expertise for support of these new dimensions and services will necessitate new roles for staff. support for services not traditionally provided by the library requires new skills such as training and support for increasingly sophisticated technologies: 3-d printers, visualization labs, or multimedia production.

3d services, makerspaces, and technology services

another development influencing academic library buildings and facilities is the opportunity to provide a hub for cutting edge technologies that allow students to experience and make use of new technologies such as 3d printing and scanning, advanced multimedia production, and visualization facilities. typically, these services are located in a specially designated area and may offer a variety of options or just one. mobile application development rooms offer students the opportunity to develop new mobile apps and test their product on a variety of devices. new libraries such as the hunt library (completed in ) at north carolina state university (ncsu libraries, no date) provide access to large-scale visualization techniques, a game lab, decision theaters, video and audio studios as well as a makerspace with a laser cutter and 3d printer. “makerspace” is a general term and can include a host of concepts ranging from hands-on arts to building a robot.

these are fun and exciting times for libraries to be able to add value from a campus perspective. students enjoy working collaboratively and testing out new technologies for free or a nominal fee, faculty embrace the new technologies offered at the library and imagine ways of incorporating library services into classroom curricula, and library administration can report on the increased use of the space, services and circulation.
these new technology services place the library in the center of campus and increase its visibility and therefore its value. as more libraries explore these spaces, resources such as the librarymakerspace-l@lists.ufl.edu discussion list will become available for libraries wanting to initiate 3d services or to create a makerspace environment, tapping into the expertise and knowledge of library colleagues who are already offering such services.

libraries are increasingly called upon to offer students the opportunity to be creative and innovative in a high tech environment. libraries may provide technologies in the building or make them available for circulation. to make the best use of these services, internal library procedures and policies related to use, theft, or damage need to be created prior to beginning the service. providing a 3d printer requires additional policies, guidelines, space considerations, staff workflows and training (garcia et al.; gonzalez and bennett ; moorefield-lang ; colegrove ). these opportunities serve students but also pique the interest of faculty and researchers who then can develop course curricula and use the lab for assignments. libraries may want to further develop these campus partnerships and be included on grants and other funding initiatives for the maintenance and purchase of new technologies.

implication
• establishment of technology-related services requires planning for continuous support and infrastructure, including: training for users, availability of staff with the requisite skill sets to support the services, availability of physical facilities with sufficient space and power, ongoing availability of resources to keep the services up-to-date as well as establishment of appropriate policies and guidelines.
• additional expertise related to library and instructional technologies, media production, and other emerging technologies must link with institutional assessment and space planning in order to ensure library facilities meet user expectations into the future.

scholarly communication

academic library as publisher

publishing by academic libraries has steadily increased in the past few years. hahn ( ) reports the results of a survey of arl libraries. at the time of the survey, “ % of the responding arl member libraries reported they were delivering publishing services and another % were in the process of planning publishing service development.” a similar survey in late found interest had grown, with “approximately half ( %) of respondents indicated having, or being interested in, offering library publishing services …, with over three-quarters of arls being interested” (mullins et al. ). the library publishing coalition launched in as a member-supported institution devoted to research and support for library publishing. its library publishing directory (lippincott ) reports on the publishing activity of different academic libraries.

library publishing varies, from scholarly journals to monographs and technical reports, but journals lead the list. hahn ( ) reported that arl libraries were publishing journals; lippincott ( ) found that the libraries in the directory were publishing campus-based journals and a further journals for other institutions. ninety-seven percent of the campus-based journals were open access. acrl has just published an extensive guide to why, how, and what academic libraries publish: getting the word out: academic libraries as scholarly publishers (bonn & furlough ).

implications
• libraries can support open access scholarship through publishing efforts.
• libraries can build relationships with campus scholars and other campus units by acting as publishers.
copyright issues and fair use

as academic technology and scholarly communication practices continue to evolve, existing copyright law does not always reflect the new paradigm. in this environment, academic libraries rely on a set of best practices to guide the use of materials in a manner permissible under the fair use doctrine guidelines, including those specifically granted to educators. in support of standardizing practice and articulating current consensus on this subject, the association of research libraries, the center for media and social impact (cmsi) at the american university school of communication, and the program on information justice and intellectual property published a “code of best practices in fair use for academic and research libraries” (adler et al. ). cmsi has also released a “statement of best practices in fair use of orphan works for libraries and archives” (aufderheide et al. ). many research libraries have staff with expertise in fair use, authors’ rights, and copyright laws.

implication
• rights management is a complex landscape in which to maneuver. librarians can advise on best practices and the development of institutional policies.

altmetrics

as scholarly communication increasingly takes place online, alternative metrics are emerging as a methodology to measure social media visibility and research impact via online engagement around scholarly output. an altmetric score is based on the number of individuals mentioning the research, where the mentions occur, and how often the author of the mention references the research. this alternate view is an addition to the existing filters such as citation counting, the impact factor, and peer-review. in , the alfred p. sloan foundation awarded the national information standards organization (niso) a grant to explore, identify, and advance standards and/or best practices related to these new assessment methods.
niso's alternative assessment metrics initiative will also explore potential assessment criteria for non-traditional research outputs such as data sets, visualizations, software, and other applications.

leading scholarly publishers are also working with altmetrics. in , wiley partnered with altmetric to pilot alternative metrics across a number of its subscription and open access journals. a high percentage of the journals included in the trial received scores that demonstrated they were receiving attention and having an immediate impact. during the pilot, wiley also polled website visitors: % felt the metrics were useful, % agreed that altmetrics enhanced the value of the journal article, and % agreed or strongly agreed that they were more likely to submit a paper to a journal that supports altmetrics (warne ). as a result, wiley now makes altmetrics available for their fully open access journals. other scholarly publishers such as elsevier and sage also offer altmetrics information at the article level, including comments and shares made by readers via social media channels, blogs, newspapers, etc., in addition to each article’s altmetric score and demographic data about these users.

implications
• as the role and importance of repositories increases, academic librarians should develop workflows and consultation services to support the depositing of research in institutional, discipline, and agency repositories.
• as compliance requirements continue to evolve, academic librarians should take the lead in developing educational initiatives around open access and author rights.
• to enhance the discoverability of open access content, librarians should collaborate with major publishers to index open access journals.
• the increasing availability of open access journal content will impact local collection subscription decisions, as libraries continue to consider delivery/access vs. ownership/retention.
• researchers will increasingly share their research via social media that best serve their network and include altmetric data in documenting the impact of this research.

library impact on student success

academic libraries exist in a time of increased accountability as performance-based budgeting becomes a more common approach in higher education. the value of academic libraries report (oakleaf ) and a report detailing two widespread summits around the topic (brown and malenfant ) underscore the ongoing need to articulate and document libraries’ impact on student learning and success. recommendations from the summit report highlight the need for librarians to fully understand the importance of the library on multiple dimensions of student learning as well as to articulate and promote assessment competencies to document and communicate library impact. the study also recommends increased professional development for librarians in the design and implementation of strategically focused assessment activities, development of broader partnerships with higher education groups, and better use of existing acrl resources on assessment.

assessment in action

to address these issues and recommendations, acrl’s assessment in action program (conducted in partnership with the association for institutional research and the association of public and land-grant universities and with funding from the u.s. institute of museum and library services) is engaged in a multi-year project that fosters the development of effective approaches demonstrating the academic library’s value to student learning and success (association of college & research libraries ). acrl recently released a report synthesizing project results from over higher education institutions that participated in the first project cohort (brown and malenfant, ). the projects discussed in the report document positive relationships between the library and overall student learning and success.
studies investigated the effectiveness of a range of library services including library instruction, research and study spaces, use of instructional games, library use of social media, and instruction and services conducted in collaboration with other campus units. the aia teams employed a variety of assessment methodologies and tools, including surveys, rubrics, pre- and post-tests, interviews, and focus groups. the experiences of the aia teams demonstrate that library assessment is most effective when it involves collaboration with other campus units, aligns with institutional goals, employs a mixed-methods approach, and is assigned to one or more librarians as part of their position responsibilities. in building a community of practice around assessment, the project reports serve as templates that can be adapted for use by academic libraries of all sizes.

implication
• given current trends in funding models and calls for accountability in higher education, librarians must develop the expertise to articulate and document the impact of libraries on student learning and success. programs such as aia provide resources and expertise for libraries of all types to explore methods for collaboration and assessment across the institution.

teaching and learning

librarians are partnering with faculty development personnel to take advantage of acknowledged educational high impact practices. collaborations involve more than one-time instruction, instead focusing on course redesign and application of active learning in research skill development. librarians also continue to experiment with alternative service models to support and enhance rapidly evolving user needs and preferences. models include tiered services targeting distinct needs of undergraduate students, graduate students, faculty members, and researchers. where resources allow, “personal” librarians are designated for first-year students to create initial connections and foster service awareness.
liaison librarians are assigned to academic departments, programs, and other initiatives to develop resources and services targeted to those specific audiences. academic support services are co-locating within library facilities to provide seamless services, placing libraries at the heart of student learning.

as the range of libraries’ services increases, the range of skills required becomes broader than those taught in traditional library degree programs. libraries are beginning to utilize non-librarians whose skill sets match current opportunities and programmatic needs. these specialists may be instructional designers, assessment specialists, or scholars from other fields, all of whom participate in the provision of online instruction, website development, or specialized collection development and research services.

housewright et al. point out that, while librarians continue to see information literacy instruction as primarily their responsibility, “faculty members have a more mixed view of where this principal responsibility may reside” (housewright et al. ). the acrl framework for information literacy for higher education (association of college & research libraries ) asserts that “librarians have a greater responsibility in identifying core ideas within their own knowledge domain that can extend learning for students, in creating a new cohesive curriculum for information literacy, and in collaborating more extensively with faculty.” the framework expands the scope of skills and concepts necessary for students in the current information environment, including visual media, data, and social media. because it is based on a cluster of interconnected core concepts, with flexible options for implementation, rather than on a set of standards, learning outcomes, or any prescriptive enumeration of skills, it provides opportunities for deeper collaborations with faculty.
librarians are actively embedded in academic courses, in-person or online, in order to gain insight into student and faculty needs, as well as partnering with faculty to develop innovative assignments that engage students in new ways. these collaborations inform the development of new services and resources in addition to highlighting the ways in which libraries contribute to the success of learning and teaching. as more instructional content is housed in course management systems (cms), librarians are included in class rosters, forum discussions, and chat sessions. online course guides are also linked in cms course sites, highlighting library resources and services that are relevant to the course and assignments. these guides are supplemented with video and interactive tutorials that supply just-in-time instructional practice, support, and student feedback.

some libraries are creating positions for information literacy design specialists and instructional technologies librarians who are responsible for developing comprehensive suites of online learning tools and environments. as assessment of library websites and online course content continues to expand, the need for special skills in these areas grows.

implications
• pedagogical innovations such as flipped classrooms, gamification, or high impact educational practices provide librarians opportunities to engage with curriculum development and collaborate with faculty in new and productive ways.
• user experience (ux) and usability testing that informs the development of library resources will continue to be a growth area for academic librarians.

competency-based education

calls for access to higher education, reduced costs for degree completion, and options for students to demonstrate learning gained outside of the traditional degree path are leading to increased examination of competency-based education.
in these models, credit is given for demonstrated mastery of content rather than accumulation of credit hours. a variety of models are being used to document students’ learning; some link the competencies with credit hours, while others involve direct assessment of student learning independent of credit hours or other traditional metrics (fain ). competency-based assessment is also growing as institutions try to institute credit for prior learning for courses outside the scope of those traditionally given credit by examination or advanced placement.

examples of institutions exploring the direct assessment approach to competency-based education include the university of wisconsin system and college for america, a competency-based education program within southern new hampshire university. college for america, which was recognized by president obama for its innovation (southern new hampshire university, ), includes “digital fluency and information literacy” as one of nine key competency areas (college for america ) for an associate of arts in general studies degree. the university of wisconsin flexible option program (university of wisconsin ) offers four degrees (one associate of arts and sciences and three bachelor of science degrees) as well as three certificate programs (university of wisconsin ).

implications
• as institutions review curricula with competency-based education and credit for prior learning in mind, the library has an opportunity to address the need for information literacy skills as well as offer options for assessing these skills on behalf of the program.
• with higher education under increased scrutiny to demonstrate the value of a post-secondary degree, it is incumbent upon academic libraries and librarians to communicate the library’s value in relation to student and faculty recruitment, retention, and teaching and learning success.
conclusion

the trends and issues outlined in this document highlight the rapidly changing environment in which libraries provide resources and services as well as the evolving roles for library staff. with higher education under increased scrutiny to demonstrate the value of a post-secondary degree, it is incumbent upon academic libraries and librarians to document and communicate the library’s value in supporting the core mission of the institution. libraries increasingly have the opportunity to play a significant role in overall student success through collaborations across campus and in the assessment of student learning. the shifting landscape of scholarly communication, fluctuating publishing models, and focus on data management present new opportunities for librarians to engage with researchers and publishers alike. advances in technologies and a continued focus on the user experience present new expectations for the development, discovery and delivery of content and services in the virtual environment and in the library’s physical spaces. while this environment can be viewed as challenging, it also presents opportunities for academic libraries to strategically support the core missions of colleges and universities.

appendix a: acrl research planning and review committee -

jeanne davidson (chair), head of public services, south dakota state university
wayne bivens-tatum, philosophy and religion librarian, princeton university
marianne buehler, special projects librarian, university of nevada las vegas
ellen carey, librarian and instructor, santa barbara city college
lisabeth chabot (vice-chair), college librarian, ithaca college
michelle leonard, associate university librarian, university of florida
chris palazzolo, phd, head of collections (woodruff library), emory university
lorelei tanji, university librarian, university of california, irvine
minglu wang, data services librarian, rutgers university - newark

references

abel, rob, malcolm brown, and john j. suess. . “a new architecture for learning.” educause review online, october.
adams, jennifer l., and kevin gunn. . “keeping up with...digital humanities.” acrl publications: keeping up with... . april. http://www.ala.org/acrl/publications/keeping_up_with/digital_humanities.
adler, prudence s., patricia aufderheide, brandon butler, and peter jaszi. . code of best practices in fair use for academic and research libraries. washington, dc. association of research libraries. http://www.arl.org/storage/documents/publications/code-of-best-practices-fair-use.pdf.
akers, katherine g., and jennifer a. green. . “towards a symbiotic relationship between academic libraries and disciplinary data repositories: a dryad and university of michigan case study.” international journal of digital curation ( ): – . doi: . /ijdc.v i . .
anderson, rick. . “occam’s reader: an interview.” the scholarly kitchen. march . http://scholarlykitchen.sspnet.org/ / / /occams-reader-an-interview/.
antell, karen, jody bales foote, jaymie turner, and brian shults. . “dealing with data: science librarians’ participation in data management at association of research libraries institutions.” college & research libraries ( ): – . doi: . /crl. . . .
association of college & research libraries. . “framework for information literacy for higher education”. ala. http://www.ala.org/acrl/standards/ilframework.
———. . “assessment in action: academic libraries and student success | association of college & research libraries (acrl).” accessed december . http://www.ala.org/acrl/aia.
aufderheide, patricia, david r. hansen, meredith jacob, peter jaszi, and jennifer m. urban. . statement of best practices in fair use of orphan works for libraries and archives. center for media and social impact.
bidwell, allie. . “sequestration presents uncertain outlook for students, researchers, and job-seekers.” the chronicle of higher education, march . http://chronicle.com/article/sequestration-presents/ /.
bonn, maria, and mike furlough. .
getting the word out: academic libraries as scholarly publishers. chicago, il: association of college and research libraries. breeding, marshall. . “the future of library resource discovery.” niso. february. http://www.niso.org/apps/group_public/download.php/ /future_library_resource_dis covery.pdf. britton, scott, and john renaud. . “print retention decision making, spec kit (october ).” http://publications.arl.org/print-retention-decision-making-spec-kit- /. brown, karen, and kara j. malenfant. . connect, collaborate, and communicate: a report from the value of academic libraries summits. chicago: association of college & research libraries. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/value/val_summit.pdf. ———. . “academic library contributions to student success: documented practices from the field”. chicago: association of college & research libraries. http://www.ala.org/acrl/files/issues/value/contributions_report.pdf. budin-ljøsne, isabelle, julia isaeva, bartha maria knoppers, anne marie tassé, huei-yi shen, mark i mccarthy, and jennifer r harris. . “data sharing in large research consortia: experiences and recommendations from engage.” european journal of human genetics ( ): – . doi: . /ejhg. . . carlson, jake, lisa johnston, brian westra, and mason nichols. . “developing an approach for data management education: a report from the data information literacy project.” international journal of digital curation ( ): – . doi: . /ijdc.v i . . ———. . “data management skills needed by structural engineering students: case study at the university of minnesota.” journal of professional issues in engineering education and practice ( ): . doi: . /(asce)ei. - . . “chorus: advancing public access to research.” . http://www.chorusaccess.org/. coates, heather. . “ensuring research integrity.” college & research libraries news ( ): – . colegrove, tod. . “re-purposing library spaces to make an impact.” library issues ( ): – . college for america. . 
college for america academic catalog. manchester, nh: southern new hampshire university. creamer, andrew t., elaine r. martin, and donna kafel. . “research data management and the health sciences librarian.” library publications and presentations, paper . crist, rebecca, and emily stambaugh. . “shared print programs, spec kit (december ).” http://publications.arl.org/shared-print-programs-spec-kit- /. demas, sam. . “curating collective collections---emerging shared print policy choices as reflected in mous.” against the grain ( ): – . dempsey, lorcan, constance malpas, and brian lavoie. . “collection directions: the evolution of library collections and collecting.” portal: libraries and the academy ( ): – . doi: . /pla. . . dempsey, lorcan, brian lavoie, constance malpas, lynn silipigni connaway, roger c schonfeld, j. d shipengrover, and gunter waibel. . understanding the collective collection: towards a system-wide perspective on library print collections. dublin, ohio: oclc research. http://www.oclc.org/research/publications/library/ / - r.html. diekema, anne r., andrew wesolek, and cheryl d. walters. . “the nsf/nih effect: surveying the effect of data management requirements on faculty, sponsored programs, and institutional repositories.” the journal of academic librarianship ( - ): – . doi: . /j.acalib. . . . “dil symposium.” . dil: data information literacy. http://wiki.lib.purdue.edu/display/ste/symposium. “dil: data information literacy.” . dil: data information literacy. http://wiki.lib.purdue.edu/display/ste/home. elmore, justina m., and charissa o. jefferson. . “business librarians donning the data hat: perspectives on this new challenge.” public services quarterly ( ): – . doi: . / . . . erway, ricky. . starting the conversation: university-wide research data management policy. dublin, ohio: oclc research. http://www.oclc.org/content/dam/research/publications/library/ / - .pdf. experian knowledge centre. . the digital trend report. fain, paul. . 
“taking the direct path.” inside higher ed. https://www.insidehighered.com/news/ / / /direct-assessment-and-feds-take- competency-based-education. ferguson, liz. . “how and why researchers share data (and why they don’t).” wiley exchange. december . http://exchanges.wiley.com/blog/ / / /how-and-why- researchers-share-data-and-why-they-dont/. garcia, moriana m., kevin messner, richard j. urban, sam tripodis, megan e. hancock, and tod colegrove. . “ d technologies: new tools for information scientists to engage, educate and empower communities”. presented at the asist , seattle, wa, november. giarlo, michael j. . “academic libraries as data quality hubs.” journal of librarianship and scholarly communication ( ): ep . doi: . / - . . goben, abigail, and dorothea salo. . “federal research data requirements set to change.” college & research libraries news ( ): – . gonzalez, sara russell, and denise beaubien bennett. . “planning and implementing a d printing service in an academic library.” issues in science and technology librarianship. gvsu libraries. . “mary idema pew library.” university libraries, grand valley state university. . http://gvsu.edu/library/mary-idema-pew-library- .htm. hahn, karla l. . research library publishing services: new options for university publishing. washington, dc: association of research libraries. http://www.arl.org/storage/documents/publications/research-library-publishing-services- mar .pdf. hense, andreas, and florian quadt. . “acquiring high quality research data.” d-lib magazine ( / ). doi: . /january -hense. holdren, john. . increasing access to the results of federally funded scientific research. memorandum. white house office of science and technology policy. http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo _ .pdf. holdren, john p. . “memorandum for the heads of executive departments and agencies: improving the management of and access to scientific collections”. 
office of science and technology policy. http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo _ .pdf. holmgren, richard, and gene spencer. . the changing landscape of library and information services: what presidents, provosts, and finance officers need to know. online white paper . clir publication . washington, d.c.: council on library and information resources. http://www.clir.org/pubs/reports/pub . housewright, ross, roger c. schonfeld, and kate wulfson. . “ithaka s+r us faculty survey .” april : . http://www.sr.ithaka.org/sites/default/files/reports/ithaka_sr_us_faculty_survey_ _final.pdf. hurst, frederick. . “northern arizona university’s personalized learning.” educause review online, september. johnston, lisa. . a workflow model for curating research data in the university of minnesota libraries: report from the data curation pilot. university of minnesota digital conservancy. http://hdl.handle.net/ / . kerr, virginia. . “webinar: text/data mining, libraries, and online publishers”. center for research libraries, february . https://www.youtube.com/watch?v= e xymy epg. kim, youngseek, and jeffrey m. stanton. . “institutional and individual influences on scientists’ data sharing practices.” journal of computational science education ( ): – . krafft, dean b., and jon corson-rikert. . “linked data for libraries: how libraries can make use of linked open data to share information about library resources and to improve discovery, access, and understanding for library users.” lita forum preconference, albuquerque, nm. lampert, cory k., and silvia b. southwick. . “leading to linking: introducing linked data to academic library digital collections.” journal of library metadata ( - ): – . doi: . / . . . li, chan, felicia poe, michele porter, brian quigley, and jacqueline wilson. . uc libraries academic ebook usage survey: springer ebook pilot project. university of california libraries.
http://www.cdlib.org/services/uxdesign/docs/ /academic_ebook_usage_survey.pdf. lippincott, joan, harriette hemmasi, and vivian marie lewis. . “trends in digital scholarship centers.” educause review. http://www.educause.edu/ero/article/trends- digital-scholarship-centers. lippincott, sarah k. . library publishing directory . atlanta, georgia: library publishing coalition. http://www.librarypublishing.org/resources/directory/lpd . long, matthew p., and roger c. schonfeld. . ithaka s+r us library survey . ithaka s+r. http://www.sr.ithaka.org/sites/default/files/reports/sr_libraryreport_ _ .pdf. lukanic, brad. . “ ways academic libraries are adapting for the future.” fast company. october . http://www.fastcoexist.com/ / -ways-academic-libraries- are-adapting-for-the-future. malpas, constance. . shared print policy review report. dublin, ohio: oclc research. http://www.oclc.org/content/dam/research/publications/library/ / - .pdf?urlm= . martin, elaine, tracey leger-hornby, and donna kafel. . frameworks for a data management curriculum: course plans for data management instruction to undergraduate and graduate students in science, health sciences, and engineering programs. http://library.umassmed.edu/data_management_frameworks.pdf. marx, vivien. . “my data are your data.” nature biotechnology ( ): – . mayernik, matthew s., lynne davis, karon kelly, bob dattore, gary strand, steven j. worley, and mary marlino. . “research center insights into data curation education and curriculum.” in theory and practice of digital libraries – tpdl selected workshops, edited by Łukasz bolikowski, vittore casarosa, paula goodale, nikos houssos, paolo manghi, and jochen schirrwagen, : – . communications in computer and information science. cham: springer international publishing. doi: . / - - - - . middleton, cheryl, wayne bivens-tatum, beth blanton-kent, heidi steiner burkhart, ellen carey, steven carrico, jeanne davidson, chris palazzolo, and barbara petersohn. . 
“top trends in academic libraries.” college & research libraries news ( ): – . miller, laniece e., james e. powell, joyce a. guzik, paul a. bradley, and lillian f. miles. . “a pilot project to manage kepler-derived data in a digital object repository.” science & technology libraries ( ): – . doi: . / x. . . moorefield-lang, heather. a. “makers in the library: case studies of d printers and maker spaces in library settings.” library hi tech ( ): – . ———. b. “ -d printing in your libraries and classrooms.” knowledge quest ( ): – . mullins, james, catherine murray-rust, joyce ogburn, raym crow, october ivins, allyson mower, daureen nesdill, mark newton, julie speer, and charles watkinson. . library publishing services: strategies for success: final research report. washington, dc: sparc. nass, sharyl j., laura a. levit, and lawrence o. gostin. . “a new framework for protecting privacy in health research.” in beyond the hipaa privacy rule: enhancing privacy, improving health through research. washington, dc: national academies press (us). http://www.ncbi.nlm.nih.gov/books/nbk /. “new england collaborative data management curriculum.” . lamar soutter library. accessed january . http://library.umassmed.edu/necdmc/index. nilsen, karl, robin dasler, trevor muñoz, and sarah hovde. . “the position of library- based research data services : what funding data can tell us.” university of maryland, college park, april . http://drum.lib.umd.edu/handle/ / . niso. a. “niso altmetrics standards project white paper.” june . http://www.niso.org/apps/group_public/download.php/ /niso_altmetrics_white_pap er_draft_v .pdf. ———. b. “demand driven acquisition of monographs.” june . http://www.niso.org/apps/group_public/download.php/ /rp- - _dda.pdf. ———. a. “alternative metrics initiative - national information standards organization.” accessed january . http://www.niso.org/topics/tl/altmetrics_initiative/. ———. b. “white papers.” accessed february . 
http://www.niso.org/publications/white_papers/. oakleaf, megan. . the value of academic libraries: a comprehensive research review and report. chicago: association of college & research libraries. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/value/val_report.pdf. oclc. . “oclc research work analyzing system-wide print library services and collections documented in new report.” december . http://www.oclc.org/research/news/ / - .html. olendorf, robert, and steve koch. . “beyond the low hanging fruit: archiving complex data and data services at university of new mexico.” journal of digital information ( ). https://journals.tdl.org/jodi/index.php/jodi/article/view/ / . palmer, carole l., cheryl a. thompson, karen s. baker, and megan senseney. . “meeting data workforce needs: indicators based on recent data curation placements.” in iconference proceedings. ischools. doi: . / . payne, lizanne. . “winning the space race: expanding collections and services with shared depositories.” american libraries ( / ): – . peer, limor, ann green, and elizabeth stephenson. . “committing to data quality review.” international journal of digital curation ( ): – . doi: . /ijdc.v i . . prado, javier calzada, and miguel Ángel marzal. . “incorporating data literacy into information literacy programs: core competencies and contents.” libri ( ): – . rainie, lee. . “the internet of things and what it means for librarians.” pew research center’s internet & american life project. presented to internet librarian october . http://www.pewinternet.org/ / / /the-internet-of-things-and-what-it-mean-for- librarians/. rod-welch, leila june, barbara e. weeg, jerry v. caswell, and thomas l. kessler. . “relative preferences for paper and for electronic books: implications for reference services, library instruction, and collection management.” internet reference services quarterly ( - ): – . doi: . / . . . schonfeld, roger c. . 
“stop the presses: is the monograph headed toward an e-only future?” s+r (blog) ithaka s +r, december , . http://sr.ithaka.org/blog-individual/stop-presses-monograph-headed-toward-e-only-future seattle university. . “learning commons partnership - seattle university.” learning commons partnership - seattle university. accessed february . http://www.seattleu.edu/learningcommons/. shared access research ecosystem. . share notification service architectural overview. washington, dc: association of research libraries. http://www.arl.org/storage/documents/publications/share-notification-service- architectural-overview- apr .pdf. shorish, yasmeen. . “data curation is for everyone! the case for master’s and baccalaureate institutional engagement with data curation.” journal of web librarianship ( ): – . doi: . / . . . sin, gloria. . “digital public library of america launches april , api hacks welcomed | digital trends.” accessed january . http://www.digitaltrends.com/computing/digital-public-library-of-america-launches-april- /. southern new hampshire university. . “president obama recognizes snhu’s college for america in major policy speech on college affordability.” southern new hampshire university. august . http://www.snhu.edu/ .asp. staiger, jeff. . “how e-books are used.” reference & user services quarterly ( ): – . tenopir, carol, suzie allard, kimberly douglass, arsev umur aydinoglu, lei wu, eleanor read, maribeth manoff, and mike frame. . “data sharing by scientists: practices and perceptions.” plos one ( ): e . doi: . /journal.pone. . tenopir, carol, robert j. sandusky, suzie allard, and ben birch. . “academic librarians and research data services: preparation and attitudes.” ifla journal ( ): – . ———. . “research data management services in academic research libraries and perceptions of librarians.” library & information science research ( ): – . doi: . /j.lisr. . . . university of wisconsin. . 
“competency-based education: what it is, how it’s different, and why it matters to you.” university of wisconsin. january . http://flex.wisconsin.edu/blog/competency-based-education-what-it-is-how-its-different- and-why-it-matters-to-you/. ———. . “degree completion programs - uw flexible option.” uw flexible option. http://flex.wisconsin.edu/degrees-programs/. varner, stewart. . “notes on text and data mining for libraries.” http://stewartvarner.com/ / / /notes-on-text-and-data-mining-for-libraries/. warne, verity. . “wiley introduces altmetrics to its open access journals.” exchanges (blog), wiley, march , . http://exchanges.wiley.com/blog/ / / /wiley- introduces-altmetrics-to-its-open-access-journals/. webometrics. n.d. “ranking web of repositories.” http://repositories.webometrics.info/en. williams, sarah c. a. “data sharing interviews with crop sciences faculty: why they share data and how the library can help.” issues in science and technology librarianship . doi: . /f t m . ———. b. “using a bibliographic study to identify faculty candidates for data services.” science & technology libraries ( ): – . doi: . / x. . .
catalogue . : the future of the library catalogue san jose state university from the selectedworks of judy jeng judy h jeng, san jose state university available at: https://works.bepress.com/judy_jeng/ / publisher: routledge cataloging & classification quarterly a review of “catalogue .
: the future of the library catalogue” judy jeng a b a drexel university , philadelphia, pa b san jose state university , san jose, ca published online: mar . to cite this article: judy jeng ( ): a review of “catalogue . : the future of the library catalogue”, cataloging & classification quarterly, doi: . / . . cataloging & classification quarterly, : – , published with license by taylor & francis issn: - print / - online doi: .
/ . .
book review
catalogue . : the future of the library catalogue, edited by sally chambers. chicago: neal-schuman, . xxvii, p. illus. isbn - - - - . $ .
in his foreword, marshall breeding aptly describes this book as “an interesting and important exploration of the realm of the emerging technologies, products and projects that impact the way that libraries provide their customers with access to their collections and services,” as a result of dissatisfaction with the online catalog. the book provides an overview of the current state of the art of the library catalog and then looks to the future to see what the library catalog might become.
the book contains a foreword by marshall breeding, an introduction by sally chambers, and eight chapters written by experts in the field. in the “introduction,” sally chambers provides a review of how the library catalog and cataloging have changed and the current trends in those areas.
chapter , “next-generation catalogues: what do users think?,” by anne christensen, presents the lack of a user-centered approach in the development of early online public access catalogs (opacs) and how academic libraries lost what used to be a monopoly position for the provision of scientific information. librarians seem to be most concerned about data quality in next-generation catalogs. ease of use is the most important paradigm for next-generation catalogs.
chapter , “making search work for the library user,” by till kinstler, explores how users demand self-service from intuitively usable search engines and have come to expect it from library catalogs. the author describes how boolean search differs from the vector space model that is used in many search engines and explores how such search engine technologies can be applied to library catalogs. additionally, other features of modern search engines, such as search suggestions and faceted browsing, are also explored.
chapter , “next-generation discovery: an overview of the european scene,” by marshall breeding, provides a brief overview of the features and general characteristics of a number of new library discovery systems, focusing on those products that have been implemented or developed in the united kingdom and other parts of europe.
chapter , “the mobile library catalogue,” by lukas koster and driek heesakkers, defines the mobile catalog and explores the different kinds of mobile applications, including advantages and disadvantages of each application.
chapter , “frbrizing your catalogue: the facets of frbr,” by rosemie callewaert, explores how the theory behind the functional requirements for bibliographic records (frbr) model has been applied in practice in belgium, including user experience and metadata creation and enrichment.
chapter , “enabling your catalogue for the semantic web,” by emmanuelle bermès, provides a short introduction to the semantic web and its practical implementation, linked data.
chapter , “supporting digital scholarship: bibliographic control, library co-operatives and open access repositories,” by karen calhoun, examines bibliographic control, cooperative cataloging systems, and library catalogs in the context of changing library collections, new metadata sources and methods, open access repositories, digital scholarship, and the purposes of research libraries. the author concludes the chapter with a call for research libraries to consider collectively new approaches that could strengthen their roles as essential contributors to the emergent, network-level scholarly research infrastructures.
chapter , “thirteen ways of looking at libraries, discovery and the catalogue: scale, workflow, attention,” by lorcan dempsey, outlines how the catalog is changing to become a part of a larger discovery environment.
this book is timely and well-written.
the topics covered include the opac, search engines, discovery systems, mobile applications, frbr, the semantic web, and digital repositories. each chapter starts with a good introduction of the topic, followed by a discussion of current developments and explorations about the future.
this is an excellent book. i recommend it to anyone interested in knowing about the future of the library catalog.
reviewed by judy jeng, adjunct faculty, drexel university, philadelphia, pa, and san jose state university, san jose, ca

image-based empirical information acquisition, scientific reliability, and long-term digital preservation for the natural sciences and cultural heritage
mark mudge, tom malzbender, alan chalmers, roberto scopigno, james davis, oliver wang, prabath gunawardane, michael ashley, martin doerr, alberto proenca, joão barbosa
abstract
the tools and standards of best practice adopted by natural science (ns) and cultural heritage (ch) professionals will determine the digital future of ns and ch digital imaging work. this tutorial discusses emerging digital technologies and explores issues influencing widespread adoption of digital practices for ns and ch. the tutorial explores a possible digital future for ns and ch through four key concepts: adoption of digital surrogates, empirical (scientific) provenance, perpetual digital conservation, and ‘born archival’ semantic knowledge management. the tutorial discusses multiple image-based technologies along with current research, including reflectance transformation imaging (rti), photometric stereo, and new research in the next generation of multi-view rti. this research involves extending stereo correspondence methods. these technologies permit generation of digital surrogates that can serve as trusted representations of ‘real world’ content.
the tutorial explores how empirical provenance contributes to the reliability of digital surrogates, and how perpetual digital conservation can ensure that digital surrogates will be archived and available for future generations. the tutorial investigates the role of semantically based knowledge management strategies in simplifying both everyday use by ns and ch professionals and long-term preservation activities. the tutorial also investigates these emerging technologies’ potential to democratize digital technology, making digital tools and methods easy to adopt and making ns and ch materials widely available to diverse audiences. the tutorial concludes with hands-on demonstrations of image-based capture and processing methods and a practical problem-solving q&a with the audience.
keywords: reflectance transformation imaging, polynomial texture mapping, empirical provenance, photometric stereo, stereo correspondence, photogrammetry, structured light, digital preservation, archiving, cultural heritage
eurographics / m. roussou and j. leigh tutorial
. introduction
the tools and standards of best practice adopted by natural science (ns) and cultural heritage (ch) professionals will determine the scope and nature of future digital scholarship. we will explore issues that influence these adoption decisions and showcase examples of emerging digital technologies designed to remove the existing obstacles to widespread adoption of digital practices.
. sequence of presentations
mark mudge will begin by presenting an overview of the themes uniting the tutorial’s presentations. these themes will explore issues that influence technology adoption decisions made by ns and ch professionals. he will explore the advantages that can be realized when image-based empirical information acquisition is organized in conformance with the fundamental principles of the scientific method.
reflectance transformation imaging (rti) will be featured as an example of an image-based technique that can be structured in this advantageous manner. tom malzbender will discuss the ptm representation and rtis, including the advantages and limitations of the representation. he will review tools for building and viewing ptms and basic approaches to their capture. he will offer several brief case studies including the antikythera mechanism and applications in paleontology, forensics, and art conservation. he will also present work using reflectance transformation techniques in combination with photometric stereo and a high speed video and lighting array to generate real time views of enhanced object surfaces. alan chalmers will discuss the use of rti and spectrally measured historic light sources, such as oil and beeswax, to recreate authentic byzantine environments and their impact on architectural mosaics, painted icons, and frescos. roberto scopigno will discuss large object rti acquisition and present a practical, simple and robust method to acquire the spatially-varying illumination of a real-world scene. he will present an assessment of factors including the effects of light number and position influencing polynomial texture mapping (ptm) normal accuracy. m. mudge, t. malzbender, et al. / image-based acquisition, scientific reliability, and digital preservation james davis will discuss photometric stereo, structured light and related image-based techniques for capturing information about the ‘real world’. oliver wang, and prabath gunawardane will discuss new research into techniques used to visualize image-based, empirically captured objects. the research goal is to interpolate both lighting and viewing directions while using a small amount of data that can be easily transferable over the web. the work examines various alternative representations of the lighting and spatial information that can be used to compactly model this information. 
the research decomposes the measured reflectance function into view dependent and view independent components. from these results, it is possible to include not only color information, but any view independent components of the reflectance function, improving the robustness of d surface shape extraction. michael ashley will discuss the concept of 'born archival' digital surrogates and the perpetual conservation of our digital knowledge through 'smart' media and 'dumb' archives. he will advocate for both individual professional responsibility and multi-institutional, multi-disciplinary curatorial management of digital heritage content for the foreseeable future. he offers a practical approach to enticing technology adoption by repositories and institutions of cultural memory through digital surrogates that adapt to their environment, resist 'bit rot' and improve in terms of stability, semantic meaningfulness and archival potential through time. martin doerr will discuss the techniques and tools of empirical acquisition knowledge management. he will explore the concept that scientific data cannot be understood without knowledge of the meaning of the data and the means and circumstances of its creation. he will examine how this ‘metadata’ can be managed from generation to use, permanent storage and reuse. he will discuss: knowledge deployment; automatic translation of acquisition knowledge into widely used archiving formats for export and as finding aids; management and inheritance of provenance information for image-based derivatives; and determination of knowledge dependencies for digital preservation. alberto proenca and joão barbosa will discuss their work developing processing tools to automate the generation process of the ptm data representation of an object. they will demonstrate how their tools both simplify and mostly automate the capture and processing of ptms, while recording the empirical provenance generated along the processing pipeline. 
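the ptm representation and the automated ptm generation described in the presentation summaries above can be made concrete with a small numerical sketch. in the published ptm formulation, each pixel's luminance is modelled as a biquadratic polynomial in the projected light direction (lu, lv), and the six coefficients per pixel are fitted by least squares from photographs taken under known light positions. the python below is a minimal illustration under those assumptions only; the function names (ptm_basis, fit_ptm, relight) and the synthetic data are hypothetical and are not taken from the presenters' tools.

```python
import numpy as np


def ptm_basis(lu, lv):
    """Six biquadratic PTM basis terms for projected light direction (lu, lv).

    Model per pixel: L = a0*lu^2 + a1*lv^2 + a2*lu*lv + a3*lu + a4*lv + a5
    """
    lu = np.asarray(lu, dtype=float)
    lv = np.asarray(lv, dtype=float)
    return np.stack([lu**2, lv**2, lu * lv, lu, lv, np.ones_like(lu)], axis=-1)


def fit_ptm(light_dirs, intensities):
    """Least-squares fit of PTM coefficients for every pixel at once.

    light_dirs:  (N, 2) array, one (lu, lv) pair per photograph
    intensities: (N, P) array, luminance of P pixels in each of N photographs
    returns:     (6, P) array holding a0..a5 for each pixel
    """
    light_dirs = np.asarray(light_dirs, dtype=float)
    A = ptm_basis(light_dirs[:, 0], light_dirs[:, 1])  # (N, 6) design matrix
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(intensities, dtype=float), rcond=None)
    return coeffs


def relight(coeffs, lu, lv):
    """Evaluate the fitted polynomials for a new light direction; returns (P,)."""
    return ptm_basis(lu, lv) @ coeffs


# Synthetic check (assumed data, not a real capture): 12 light positions, 4 pixels.
rng = np.random.default_rng(0)
dirs = rng.uniform(-0.7, 0.7, size=(12, 2))
true_coeffs = rng.normal(size=(6, 4))
observed = ptm_basis(dirs[:, 0], dirs[:, 1]) @ true_coeffs

estimated = fit_ptm(dirs, observed)
print(np.allclose(estimated, true_coeffs))
print(relight(estimated, 0.2, -0.1))
```

with more than six well-spread light directions the system is overdetermined, so the fit averages out per-image noise, and relighting a pixel afterwards costs a single six-term dot product.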
during the final session of the tutorial the participants will demonstrate practical image-based empirical information capture, workflow, and processing techniques using commonly available photographic equipment. questions and dialog with tutorial attendees will be encouraged during the demonstrations.

. replacing the ‘r’ in ‘rti’

mark mudge and tom malzbender would like to replace their previous use of the term ‘reflection’ with the term ‘reflectance’ in their current and future work with rti. this former usage has, in turn, led to the use of ‘reflection’ by others. they suggest that those currently using rti consider incorporating this change of terminology in their future work. mark and tom’s contributions to the course notes reflect this suggestion.

*********************************************

. natural science, cultural heritage, and digital knowledge

tutorial presenter: mark mudge
additional authors: carla schroer, marlin lum
cultural heritage imaging, usa
email: mark@c-h-i.org

humanity's legacy can be unlocked and shared between people through digital representations. digital representations can communicate knowledge in a variety of ways. for clarity, we can define three types that distinguish different uses for these representations: art and entertainment, visualization, and digital surrogates of the world we experience.

. art, visualization, and digital surrogates

digital content can be fine art in its own right. it can also entertain. this content can also be used to visualize concepts and illustrate hypotheses. in this case, we use the term ‘visualization’ in its broadest sense to include hearing, smell, taste and touch. for example, a computer animation of a large asteroid impacting the yucatan peninsula million years ago is helpful to visualize the cause for worldwide dinosaur extinction.
these images are useful not because they faithfully show the shape and color of the actual asteroid moments before impact but because they effectively communicate an idea. visualizations are speculative in nature to varying degrees. current research is exploring ways to explicitly describe the extent of this speculation. [hnp ]

digital surrogates serve a different purpose. their goal is to reliably represent ‘real world’ content in a digital form. their purpose is to enable scientific study and personal enjoyment without the need for direct physical experience of the object or place. their essential scientific nature distinguishes them from speculative digital representations. they are built from inter-subjectively verifiable empirical information. digital surrogates are the focus of this discussion.

digital surrogates of our 'real world' can robustly communicate the empirical features of ns and ch materials. when digital surrogates are built transparently, according to established scientific principles, authentic, reliable scientific representations can result. these representations allow re-purposing of previously collected information and enable collaborative distributed scholarship. information about the digital surrogates, stored in a semantically rich 'common language' accessible to and from locally determined archiving architectures, permits concatenation of information across many collections and demystifies complex semantic query of vast amounts of information to efficiently find relevant material. digital surrogate archives remove physical barriers to scholarly and public access and foster widespread knowledge and enjoyment of nature and our ancestors’ achievements.

the advantages presented by adoption of digital surrogates are great, but can only be attained if well recognized obstacles are overcome and the related incentives realized.
as discussed below, the fundamental means to enable adoption of digital surrogates are understood. the necessity to achieve widespread adoption is driving the ongoing development of new tools, methods, and standards. the following four sections examine these efforts to aid digital surrogate adoption. . empirical provenance a fundamental problem of the digital age is the qualitative assessment of digital surrogate reliability during scientific inquiry. a solution to this problem is necessary for digital surrogates to find widespread use in ns and ch scholarship. widespread adoption of digital surrogates by science in all fields, including the multi-disciplinary study of our cultural heritage, requires confidence that the data they represent is reliable. for a scholar to use a digital surrogate, built by someone else, in their own work, they need to know that what’s represented in the digital surrogate is what’s observed on the physical original. if archaeologists are relying on virtual d models to study paleolithic stone tools, they must be able to judge the likelihood that a feature on the model will also be on the original and vice versa. if they can’t trust that it’s an authentic representation, they won’t use the digital surrogate in their work. we suggest that the concept of ‘empirical provenance’ offers to advance our understanding of the role of digital surrogates in scientific inquiry, enhance the development of techniques to digitally represent our world, and increase the adoption of digital surrogates as source material both for scientific research in general and the study of our collective cultural heritage in particular. an essential element of traditional scientific inquiry is the systematic gathering of observations about the world through the senses. 
in the very, very old and still vigorously pursued epistemological discussion about the nature of human knowledge, the observations of the senses are labeled ‘empirical’. within scientific discourse, the methodology employed in the process of generating scientific information has traditionally been called the inquiry’s ‘provenance’. this provenance is carefully recorded in lab notebooks or similar records during the inquiry and then becomes an integral element of the published results. this provenance explains where the information came from and permits replication experiments, central to scientific practice, to confirm the information’s quality. such provenance may include descriptions of equipment employed, mathematical and logical operations applied, controls, oversight operations, and any other process elements necessary to make both the inquiry and its results clear and transparent to scientific colleagues and the interested public.

widespread adoption of digital surrogates requires that they be able to pass this traditional lab notebook test. empirical provenance is for digital surrogates the equivalent of what a lab notebook is for non-digital representations. empirical provenance is the extension of classic scientific method into the digital documentary practices used to build digital surrogates. empirical provenance records the journey of original, unaltered empirical evidence from its initial data capture all the way through the image generation process pipeline to its final form as a digital surrogate. just as ‘real-world’ cultural material requires a provenance identifying what it is, establishing its ownership history, and proving its authenticity, digital surrogates require an empirical provenance to document the imaging practices employed to create them.
empirical provenance ensures access to both original empirical data, original photographs for example, and the complete process history enabling the user to generate a confirmatory representation to evaluate the quality and authenticity of the data. that way, the user can decide for themselves whether to rely on the digital surrogate, or not. empirical provenance permits the assessment of digital surrogate accuracy. the experience of those engaged in distributed, internet-based scientific inquiry confirms the necessity of documenting how digitally represented information is generated. these collaborations, frequently found in the biological sciences, rely heavily on process accounts of digital data creation to assess the quality of information contributed by the cooperating partners and make their own work valuable to others. [zgws ] the attributes of empirical provenance information for a given digital surrogate are dependent on the tools and methods employed to build it. for a digital photograph, the empirical provenance information would include xmp data such as: the camera make and model, firmware version, shutter speed, and aperture; parameters used to convert the raw sensor data into an image like color temperature; and all editing operations performed in tools like photoshop such as cropping, re-sizing, distortion correction, sharpening, etc. these editing operations can have a profound impact on image reliability and are examined in greater detail below. for a d geometric model displaying photo-realistic surface texture and reflective material properties, the empirical provenance is complex.
for these digital surrogates, complete process history accounts are required for the alignment of shape data acquired from different viewpoints, the registration of textural image data to geometry, the correction of geometric acquisition errors such as voids, smoothing in low signal to noise ratio situations, the effects of compressive data reduction, and other issues raised by the selected imaging method. in each case, whether digital photo or d model, the attributes including quantity of records, and ease, difficulty, or even possibility of empirical provenance collection result from the practices used to build the digital surrogate. only practices able to provide a complete empirical provenance can be used to construct reliable digital surrogates. practices unable to produce a complete empirical provenance cannot be used to create reliable digital surrogates since their digital artifacts cannot be subjected to rigorous qualitative evaluation. the requirement for empirical provenance information informs digital technology development and adoption. tools and methods used to build digital surrogates that feature simplification and trivially configured automation of empirical data post processing, including empirical provenance generation, present significant benefits over those that call for significant amounts of subjective judgments by a skilled operator, since every operator action that transforms empirical content must be documented in a digital log for future scientific evaluation. the importance of automation in the construction of reliable digital surrogates is highlighted by a recent major study. [bfrs ] this study examined the digital imaging practices in leading us museums and libraries. 
the study states: “most museums included some visual editing and other forms of image processing in their workflow…when investigated closely, it was found that visual editing decreased color accuracy in all cases… in addition to visual editing, many images also incurred retouching and sharpening steps. the fact that many of the participants sharpened the images either at capture or before the digital master was saved raised the question of whether the implications of the choices made were well understood. most of the image processing carried out was not automated; automation represents a possibility for improvement in setting up consistent, reproducible workflow.”

while an artist’s touch can increase the sales of a print in a museum gift shop or create a stunning cinematic effect, it has little direct role in the scientific construction of digital surrogates. the development of many of today’s digital imaging tools was driven by the entertainment industry’s desire to create special effects for movies and television, computer animations, video games, and multimedia products. unlike the entertainment business, where a good-looking image is the goal, scientific documentation requires that the material be represented reliably. if the empirical provenance, enabling assessment of reliability, is lacking, the digital representation may be enjoyed for visualization or entertainment purposes but not used as a digital surrogate.

in addition to reliability, the synergistic combination of empirical provenance and automated digital processing, requiring trivial operator configuration, offers advantages for the organization, communication and preservation of digital knowledge. once the process used to construct a digital surrogate is automated, an empirical provenance log describing the process can be automatically produced. knowledge management tools can map these process history actions to semantically robust information architectures.
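as a rough illustration of how an automated process could emit such a log as a side effect of running, here is a minimal python sketch. everything in it is hypothetical: the `ProvenanceLog` class, the field names, and the toy `crop` operation are assumptions made for illustration and do not describe the tools developed by chi or the university of minho. the idea shown is only that each operation records its parameters together with digests of its input and output data, so a later reader can check which data a step consumed and produced.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest that identifies a piece of data."""
    return hashlib.sha256(data).hexdigest()


class ProvenanceLog:
    """Hypothetical append-only log of processing steps (illustration only)."""

    def __init__(self):
        self.entries = []

    def record(self, operation, parameters, inputs, output):
        """Log one operation with digests of its input and output data."""
        entry = {
            "step": len(self.entries) + 1,
            "operation": operation,
            "parameters": parameters,
            "input_sha256": [sha256_of(d) for d in inputs],
            "output_sha256": sha256_of(output),
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        self.entries.append(entry)
        return entry

    def to_json(self):
        """Serialize the whole process history for archiving."""
        return json.dumps(self.entries, indent=2, sort_keys=True)


def crop(data: bytes, start: int, end: int, log: ProvenanceLog) -> bytes:
    """Toy 'editing operation' that logs itself as it runs."""
    result = data[start:end]
    log.record("crop", {"start": start, "end": end}, [data], result)
    return result


log = ProvenanceLog()
original = b"raw sensor data stand-in"   # placeholder for real image bytes
cropped = crop(original, 4, 10, log)
print(log.to_json())
```

because the operator never writes the log by hand, the record stays complete as long as every transforming operation goes through a wrapper of this kind.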
an example of a semantic knowledge management architecture is the international council of museums’ (icom’s) committee on documentation (cidoc) conceptual reference model (crm), iso standard . [ccrmweb] the cidoc/crm working group has recommended amendments to the standard to include the terms ‘digital object’ and ‘digitization process’ which can be used to describe a digital surrogate’s empirical provenance. martin doerr’s following presentation will explore these tools and methods of semantic knowledge management in greater depth. digital processing can then automatically record empirical provenance information into these semantic architectures enabling the digital surrogates to be ‘born archival’. the concept of ‘born archival’ and related issues dealing with perpetual digital conservation will be examined in greater depth in the following presentation by michael ashley.

. perpetual digital conservation

figure : rti image, with specular enhancement, showing detail of a trackway of species coltoni of the early triassic period. from the collection of the university of california museum of paleontology.

access to archival services is an essential element of the digital workflow for people who acquire and use digital surrogates. archival conservation strategies are also essential to guarantee that this digital information is available for both use and reuse by future generations. in turn, the work of archival conservators is simplified and their ability to plan ongoing conservation activities is greatly enhanced if this digital information possesses ‘born archival’ attributes.
the essential attributes of ‘born archival’ information are defined by an empirical acquisition and digital surrogate generation process that provides managed knowledge of the information’s methods of creation (empirical provenance) along with the digital surrogate’s ‘real world’ semantic context. a collaboration between chi, the university of california museum of paleontology (ucmp), and the university of california media vault program (mvp) [mvpweb] demonstrated the value a digital surrogate’s empirical provenance information can have in archival conservation. among the single-view rtis chi captured in the ucmp collection was a million year old dinosaur trackway of species coltoni in the genus cheirotherium. ptms were produced in resolutions from full resolution to a dimension of pixels along the image’s long aspect. empirical provenance information from the ptm generation process permitted the analysis of data dependencies created during ptm processing. this dependency analysis enables the determination of which files are essential to the scientific record and which files can be regenerated from the originally acquired empirical data along with the empirical provenance information. files that could be regenerated were discarded. chi, in cooperation with the mvp staff, analyzed the data dependencies and reduced the number of files requiring archival storage from to , a significant advantage in a preservation context.

figure : before dependency determination - digital files were used to build four resolutions of rti images. this includes process and log files.

figure : after dependency determination, files are saved in the uc media vault.

. democratization of technology

for widespread adoption of digital surrogates to occur, the ns and ch workers who build and use digital surrogates must be able to employ these new tools themselves.
the means by which robust digital information is captured and synthesized into digital surrogates requires great simplification, cost reduction, increased ease of use, and improved compatibility with existing working cultures. the emergence of the new family of robust image-based empirical acquisition tools offering automatic post-acquisition processing overcomes an important barrier to the adoption of digital workflow. as was previously discussed, automation requiring trivial configuration offers enhanced reliability and greatly reduces the computer technology expertise necessary to manage a digital workflow. these methods leverage new knowledge to enable ns and ch professionals to build digital surrogates with a minimum of additional training. in turn, this automation frees these workers to concentrate on the ‘real’ ns and ch tasks before them.

digital photography skills are already widespread and disseminating rapidly. employing digital photography to provide the empirical data for digital surrogates also lowers financial barriers to digital adoption. as will be seen below, rich d and d information can be captured with the equipment commonly found in a modern photographer’s kit. recent work has shown that computational extraction of information from digital photographs can create digital surrogates that reliably describe the d and d shape, location, material, and reflection properties of our world. among these new technologies are single view rti, multi-view rti and associated enhanced stereo correspondence methods, as well as photogrammetric breakthroughs that permit automatically calibrated and post-processed textured d geometric digital surrogates of objects and sites. some of these developments will be briefly reviewed here and will be explored in greater depth in following presentations by tom malzbender, alan chalmers, roberto scopigno, james davis, oliver wang, prabath gunawardane, alberto proenca, and joão barbosa.

. .
rti’s role in knowledge management research

rti using ptms was invented by tom malzbender of hewlett-packard laboratories. it is an example of computational extraction of d information from a sequence of digital photographs. rti is an image-based technology where operator post-processing can be reduced to trivial levels. the rti process has been used as a model to explore the development of empirical provenance and semantic knowledge management tools. as will be seen later in the presentation by alberto proenca and joão barbosa and the technology demonstration section of the tutorial, both chi and the university of minho have developed tools and methods to record the empirical provenance information generated during rti capture and processing. these tools create a log file of all operations performed during rti processing. combined with information stored in adobe software .xmp files generated during original raw digital image conversion, all empirical provenance for the rtis can be recorded.

in cooperation with chi and the mvp, steven stead and martin doerr of the cidoc/crm special interest group modeled rti processes as instances of the crm. this was the first application of cidoc/crm semantic knowledge management concepts to image-based empirical acquisition processes and associated empirical provenance information. prior to this work, crm applications focused on uses within and among museums, libraries, and archives. this work also laid the foundation for the development of new, archive friendly, semantic knowledge management tools that promise to increase digital technology’s ease-of-use for ns and ch professionals, enhance digital surrogate reliability, and lower barriers to digital technology adoption.

. .
recent developments in dense photogrammetry

recent developments in dense photogrammetric technologies can generate d textured geometric digital surrogates of objects and sites from automatically calibrated and post-processed sequences of digital photographs. the european project for open cultural heritage (epoch), a seven year european union sponsored initiative to develop digital tools for ch, fostered a major advance in photogrammetry-based d imaging using uncalibrated digital photos. the epoch d webservice, developed by the computer vision group at catholic university leuven, allows archaeologists and engineers to upload digital images to servers, which perform an automatic d reconstruction of the scene and return the textured d geometry to the user [ewweb].

commercial software, initially developed for the aerial mapping and mining industries by adamtech, an australian company, can automatically calibrate digital photo sequences from one or more cameras, automatically generate dense textured d polygonal geometry from one or more image pairs, and automatically align this d content using photogrammetric bundle adjustment [atweb]. these tools have been used by u.s. bureau of land management researchers neffra matthews and tom noble to document native american petroglyphs at legend rock wyoming state park in collaboration with the wyoming state parks, wyoming state university, and chi. photogrammetry digital image sequences were captured in tandem with chi’s rti photo sequences. the integrated photo sequences demonstrate the synergies between automated photogrammetric capture of image-based geometry and reflection-based capture of normal data. these synergies, presented at the computer applications in archaeology conference in berlin, april include co-registered rti images free of optical distortions, and dense, ptm textured d geometry.
chi used an identical photogrammetry image sequence of a sculpted architectural feature to test the d geometry produced by adamtech software against that returned from the epoch d webservice. the results showed that both methods generated dense d geometrical information of equivalent quality.

figure : distortion corrected rti image of petroglyphs at legend rock state park, wyoming.

. tolerance of diversity

given the powerful dynamic of change attached to all things digital and the history of human nature’s resistance to conformity, adoption of digital surrogate-based workflow will be encouraged by tolerance of decentralized digital information architectures. tolerance encourages optimizations to fit local conditions or the requirements of a given field of study. within such a tolerant environment, scholarly, discipline-based, evolving standards of best practice will continue to guide local practice as they always have. worldwide access to, evaluation of, and oversight of these practices, aided by semantic query enabled access to the empirical provenance of digital surrogates and by use of perpetual digital conservation practices for digital surrogates along with their source data, can assist the proven, self-corrective mechanisms of the scientific method to do their work.

. conclusion

empirical provenance, perpetual digital conservation, democratization of technology, and tolerance of diversity provide a road-map for future digital scholarship and enjoyment of humanity’s legacy. informed by these concepts, emerging tools and methods will enable ns and ch professionals to build reliable, reusable, archive friendly digital surrogates by themselves. archives of digital surrogates can enable distributed scholarship and public access. the aesthetic quality, usefulness to convey ideas, and completeness of empirical provenance information can guide decisions regarding which digital representations are perpetually conserved.

references

[atweb] adam technology website (accessed january ). http://www.adamtech.com.au
[bfrs ] berns, r.s., frey, f.s., rosen, m.r., smoyer, e.p.m., taplin, l.a. . direct digital capture of cultural heritage benchmarking american museum practices and defining future needs - project report, mcsl technical report.
[ccrmweb] cidoc conceptual reference model (accessed january ). http://cidoc.ics.forth.gr
[ewweb] epoch webservice (accessed january ) http://homes.esat.kuleuven.be/~visit d/webservice/html
[hnp ] hermon, s., nikodem, j., perlingieri, c., deconstructing the vr – data transparency, quantified uncertainty and reliability of d models, proceedings of the th international symposium on virtual reality, archaeology and cultural heritage (vast ), pg -
[mvpweb] media vault program (accessed january ). http://mvp.berkeley.edu
[zgws ] zhao j., goble c., greenwood m., wroe c., stevens r., . annotating, linking, and browsing provenance logs for e-science. proceedings of the workshop on semantic web technologies for searching and retrieving scientific data, october .

*********************************************

. ptm tools for relighting and enhancement

tutorial presenter: tom malzbender
hewlett packard labs, usa
email: tom.malzbender@hp.com

polynomial texture maps (ptms) [mgw ] are an extension to conventional texture maps that allow increased control of rendered appearance. although ptms were developed to be used as texture maps in the context of rendering d objects, they have found more use as ‘adjustable images’ in a d context. as opposed to storing a color per pixel, as in a conventional image or texture map, ptms store the coefficients of a second-order bi-quadratic polynomial per pixel.
this polynomial is used to model the changes that appear to a pixel’s color based on user defined parameters, typically a parameterization of light source direction. for example, if $l_u, l_v$ are the parameterized light source directions and $a_0$ to $a_5$ are the scaled and biased polynomial coefficients, a color channel intensity $c_i$ is arrived at via:

$$c_i = a_0 l_u^2 + a_1 l_v^2 + a_2 l_u l_v + a_3 l_u + a_4 l_v + a_5$$

parameterized lighting directions are arrived at by projecting the normalized light vector into the two-dimensional texture space (u,v) to yield $l_u, l_v$. for use as ‘adjustable images’, this just amounts to using the first two coordinates of a normalized vector that points towards the light source.

advantages of ptms are:
• ease of capture – several methods for capturing ptms have been developed, all of which are fairly simple. for example, none of the methods require any camera calibration and several can be performed by laypersons without any technical training. capture can be performed with low end digital cameras with minimal supporting hardware, such as a handheld flash or table lamp as light source (figure ). the procedure is to acquire a set of images under varying lighting directions. methods are available both for the case where the lighting direction is known and for the case where it is not.
• available tools – tools for making and viewing ptms are freely available via http://www.hpl.hp.com/research/ptm/ and related pages. several tools are available for displaying ptms, the ptmviewer having the most functionality. additionally, java-based viewers are available that don’t require any explicit download. once a set of images of a static scene under different and known lighting directions is acquired, the ptmfitter can be used to produce a ptm. alternatively, one can use a reflective sphere (snooker ball) to capture images with unknown lighting direction and the ptmbuilder application can be used to produce a ptm.
more detail can be found later in this document and at http://www.hpl.hp.com/research/ptm/makingptmnew.htm.
• fast rendering – ptms were specifically developed to enable fast color evaluation from lighting direction. since equation ( ) consists solely of multiplies and adds, micro-simd techniques [ffy ] (parallel subword instructions) can be used to compute color from lighting direction in real-time on any modern cpu without relying on any specific graphic hardware.
• compact file size – ptms support jpeg compression resulting in compact files, so they can be shared on the web efficiently. examples of ptms on the web are at:
http://c-h-i.org/examples/ptm/ptm.html
http://www.hpl.hp.com/research/ptm/relightdemo/index .html
http://www.hpl.hp.com/research/ptm/antikythera_mechanism/index.html
• surface detail enhancement – ptms represent a reflection function from a specific viewpoint, and as such allow interactive control of lighting direction. this greatly assists in the perception of surface shape and detail. additionally, it is possible to transform the reflectance properties represented by a ptm and this allows one to change the material properties of the object that was imaged. for certain materials this allows perception of surface detail not directly visible when inspecting the original object with the unaided eye. these methods are elaborated in the next section.

figure : original and two specular enhancements of a cuneiform tablet.
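to make the per-pixel polynomial and its multiply-and-add evaluation concrete, the following python sketch fits the six coefficients by least squares from images taken under known light directions and then evaluates the polynomial for a new light direction. it is a minimal illustration under stated assumptions, not the ptmfitter or ptmviewer code: the function names `ptm_basis`, `fit_ptm` and `relight` are hypothetical, the coefficients are kept as plain floating-point values rather than the scaled and biased bytes a ptm file stores, and the test data are synthetic.

```python
import numpy as np


def ptm_basis(lu, lv):
    """Six biquadratic basis terms, in the coefficient order a0..a5."""
    lu = np.asarray(lu, dtype=float)
    lv = np.asarray(lv, dtype=float)
    return np.stack([lu * lu, lv * lv, lu * lv, lu, lv, np.ones_like(lu)], axis=-1)


def fit_ptm(light_dirs, intensities):
    """Least-squares fit of per-pixel coefficients.

    light_dirs:  (n_images, 3) unit vectors pointing towards the light
    intensities: (n_images, n_pixels) values of one colour channel
    returns:     (n_pixels, 6) coefficients a0..a5
    """
    light_dirs = np.asarray(light_dirs, dtype=float)
    # Only the first two coordinates (lu, lv) of each light vector are used.
    basis = ptm_basis(light_dirs[:, 0], light_dirs[:, 1])
    coeffs, *_ = np.linalg.lstsq(basis, np.asarray(intensities, dtype=float), rcond=None)
    return coeffs.T


def relight(coeffs, light_dir):
    """Evaluate the polynomial for every pixel under one light direction."""
    return coeffs @ ptm_basis(light_dir[0], light_dir[1])


# Synthetic check: 4 pixels, 12 lights on two rings around the camera axis.
rng = np.random.default_rng(0)
true_coeffs = rng.uniform(-1.0, 1.0, size=(4, 6))
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
radii = np.where(np.arange(12) % 2 == 0, 0.3, 0.6)
lu, lv = radii * np.cos(angles), radii * np.sin(angles)
lights = np.stack([lu, lv, np.sqrt(1.0 - lu**2 - lv**2)], axis=1)
samples = ptm_basis(lu, lv) @ true_coeffs.T      # simulated image intensities
fitted = fit_ptm(lights, samples)
overhead = relight(fitted, (0.0, 0.0, 1.0))      # light at the camera position
```

the lights are placed on two rings rather than one because, for lights at a single elevation, the sum of the two squared terms is constant and the fit cannot separate them from the constant term.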
reflectance transformation

the interactive control of appearance as a function of lighting direction allows ptms to be used to help perceive surface shape. however, the reflectance function represented by a ptm can also be used to extract an estimate for the surface normal at each pixel. once this normal is known, several transformations of these reflectance functions can be performed within the ptmviewer that keep the geometric information (the normal) fixed, but modify the photometric properties of the surface. this is often helpful in assisting the perception of d shape, and sometimes allows the perception of surface detail not readily apparent when inspecting the object directly. we have found simple transformations of the reflectance function particularly useful [mgw ]:

1) specular enhancement – adding synthetic specular highlights to the reflectance function of a mostly diffuse object can be quite effective. the ptmviewer implements this using simple phong/blinn shading and is accessed by right clicking, as are the remaining transformations.
2) diffuse gain – the reflectance functions of diffuse objects are slowly varying. diffuse gain is an analytic transformation that keeps the normal estimate per pixel fixed, but increases the curvature (second derivative) of luminance of the reflectance function by an arbitrary gain constant under user control. as such, it has no physical analog, but is nonetheless useful.
3) light direction extrapolation – parameterizations of physical light directions specified in equation ( ) by $l_u, l_v$ are limited to the range of (-1, 1) for each coordinate. however, with a parametric description of the reflectance function we are free to specify lighting directions outside of this range. these again have no physical analog, and can be thought of as yielding lighting directions more oblique than physically possible.

figure : original and enhancements using diffuse gain.

figure : original and an extrapolation of lighting direction.
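the text does not spell out how the normal is extracted or how diffuse gain is computed, so the python sketch below is one possible reading rather than the ptmviewer's implementation. it assumes the normal estimate is the light direction that maximizes the biquadratic (found by setting both partial derivatives to zero), and it derives a diffuse gain that multiplies the three second-order coefficients by a gain while adjusting the remaining coefficients so that the location and value of that maximum stay fixed. the function names are hypothetical.

```python
import numpy as np


def ptm_value(a, lu, lv):
    """Evaluate a0*lu^2 + a1*lv^2 + a2*lu*lv + a3*lu + a4*lv + a5."""
    return a[0]*lu*lu + a[1]*lv*lv + a[2]*lu*lv + a[3]*lu + a[4]*lv + a[5]


def brightest_direction(a):
    """Stationary point of the biquadratic: both partial derivatives vanish.

    Solves 2*a0*lu + a2*lv + a3 = 0 and a2*lu + 2*a1*lv + a4 = 0.
    """
    det = 4.0 * a[0] * a[1] - a[2] ** 2
    lu0 = (a[2] * a[4] - 2.0 * a[1] * a[3]) / det
    lv0 = (a[2] * a[3] - 2.0 * a[0] * a[4]) / det
    return lu0, lv0


def normal_estimate(a):
    """Assumption: use the brightest light direction as the surface normal."""
    lu0, lv0 = brightest_direction(a)
    r = np.hypot(lu0, lv0)
    if r > 1.0:                      # clamp estimates that fall outside the unit disc
        lu0, lv0 = lu0 / r, lv0 / r
    nz = np.sqrt(max(0.0, 1.0 - lu0 ** 2 - lv0 ** 2))
    return np.array([lu0, lv0, nz])


def diffuse_gain(a, gain):
    """Scale the curvature by `gain`, keeping the peak's position and value."""
    lu0, lv0 = brightest_direction(a)
    b = np.empty(6)
    b[0:3] = gain * np.asarray(a[0:3], dtype=float)
    b[3] = (1.0 - gain) * (2.0 * a[0] * lu0 + a[2] * lv0) + a[3]
    b[4] = (1.0 - gain) * (2.0 * a[1] * lv0 + a[2] * lu0) + a[4]
    b[5] = ((1.0 - gain) * (a[0]*lu0**2 + a[1]*lv0**2 + a[2]*lu0*lv0)
            + (a[3] - b[3]) * lu0 + (a[4] - b[4]) * lv0 + a[5])
    return b


# Example pixel with a concave (diffuse-like) reflectance function.
a = np.array([-1.0, -0.5, 0.2, 0.3, 0.1, 0.6])
n = normal_estimate(a)
b = diffuse_gain(a, 3.0)
```

light direction extrapolation needs no extra code under this reading: `ptm_value` can simply be called with coordinates outside the physical range.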
capturing and building ptms

ptms are typically made from multiple images of a static scene or object illuminated from separate lighting directions for each image. these sorts of images are easily collected by a variety of methods, some of which are demonstrated at http://www.hpl.hp.com/research/ptm/makingptmnew.htm. the techniques can be broken down into two classes, each with its own set of tools to support constructing ptms from the captured images.

. ptm formats currently supported by the ptmviewer

format name                bytes per pixel    description
ptm_format_lrgb            +                  luminance as a polynomial multiplied by unscaled rgb
ptm_format_rgb             x                  polynomial coefficients for each color channel
ptm_format_lum             or                 ycrcb color space, only y as a polynomial
ptm_format_ptm_lut         +                  index to a lookup table that contains rgb values plus polynomial coefficients
ptm_format_ptm_c_lut       variable           rgb values plus an index to a lookup table that contains only polynomial coefficients
ptm_format_jpeg_rgb        variable           jpeg compression of an rgb ptm
ptm_format_jpeg_lrgb       variable           jpeg compression of an lrgb ptm
ptm_format_jpegls_rgb      variable           jpegls compression of an rgb ptm
ptm_format_jpegls_lrgb     variable           jpegls compression of an lrgb ptm

in the first class, light source direction is known and specified in a file format called a .lp file. the .lp file is typically constructed with a text editor such as wordpad; a simple example is shown below:

c:\leaves \ - .jpg - . . .
c:\leaves \ - .jpg . . .
c:\leaves \ - .jpg . - . .
c:\leaves \ - .jpg - . - . .
c:\leaves \ - .jpg - . . .
c:\leaves \ - .jpg - . . .
. . .
c:\leaves \ - .jpg - . - . - .
c:\leaves \ - .jpg - . . - .

the first line contains the number of images in the set. for each image, the image filename is given (either .jpg, .tga or .ppm), then the x, y and z coordinates of a normalized vector pointing at the light for that image are specified.
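the layout just described (a count on the first line, then one filename and one normalized light vector per image) can be written and checked with a few lines of python. this is a sketch under assumptions: the filenames are invented, the whitespace separator and the six-decimal number formatting are choices made here for illustration, and whether the ptmfitter accepts exactly this formatting has not been verified.

```python
import math


def write_lp(entries):
    """Return .lp text for a list of (filename, (x, y, z)) pairs.

    Light vectors are normalized before writing, as the format expects.
    """
    lines = [str(len(entries))]
    for name, (x, y, z) in entries:
        length = math.sqrt(x * x + y * y + z * z)
        lines.append(f"{name} {x / length:.6f} {y / length:.6f} {z / length:.6f}")
    return "\n".join(lines) + "\n"


def parse_lp(text):
    """Parse .lp text into (filename, (x, y, z)) pairs, checking the header count."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    count = int(lines[0])
    entries = []
    for ln in lines[1:]:
        # Split from the right so a filename containing spaces stays intact.
        name, x, y, z = ln.rsplit(None, 3)
        entries.append((name, (float(x), float(y), float(z))))
    if len(entries) != count:
        raise ValueError(f"header says {count} images, found {len(entries)}")
    return entries


# Hypothetical filenames and un-normalized light positions.
text = write_lp([("img_001.jpg", (0.0, 0.0, 2.0)),
                 ("img_002.jpg", (1.0, 0.0, 1.0))])
parsed = parse_lp(text)
```

normalizing on write and validating the count on read catches the two mistakes that are easiest to make when such a file is edited by hand.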
as one looks at the object to be imaged through the camera, the x axis points to the right, the y axis points towards the top, and the z axis points at the camera from the center of the image. for example, a light positioned directly overhead, where the camera is, would have direction vector (0, 0, 1). once such an .lp file is constructed, the ptmfitter is run to convert the images and .lp file into a ptm. the ptmfitter is freely available at http://www.hpl.hp.com/research/ptm/. suggested answers to ptmfitter prompts that may not be clear are: enter desired fitting format: enter basis:

figure : two inexpensive domes useful for specifying lighting direction. in both cases a digital camera is placed above and the object to be imaged is placed on the floor below. right image courtesy of wouter verhesen.

a second approach to constructing ptms will be covered in detail in section . in this approach, one uses a handheld flash to trigger the camera, so light directions or positions are not known. instead, one places one or two black or red snooker balls next to the object being photographed. the flash leaves a specular highlight on the balls, which can be used to infer the position or direction of the light. the ptmbuilder (also available at http://www.hpl.hp.com/research/ptm/) is then used to automatically detect the location of the balls in the image, recover the highlights, infer the light direction or position, and produce a ptm. this typically does not require any user interaction besides specifying the directory in which the images reside.

figure : photograph and rti enhancement of a footprint in dirt.

ptm formats

several different varieties of ptms are available, summarized in the table above. more detail is available from the ptm format document downloadable from http://www.hpl.hp.com/research/ptm/.
the most commonly used formats by far are the first two, lrgb and rgb.

real-time rti

figure : real-time surface detail enhancement is possible using high brightness leds coupled to a high speed camera and gpu.

the reflectance transformation imaging (rti) methods described above are useful for a number of applications, including seeing more detail on object surfaces. such objects must first be captured under varying lighting conditions, then these images are processed into a ptm, and finally the ptm is viewed under varying reflectance transformations. for many applications, such as criminal forensics, this workflow is still more elaborate than desired. it is possible to achieve the same functionality in real time using a combination of high speed cameras and fast gpus, as described in [mwga ]. in this system, high brightness leds are flashed sequentially as a f/sec camera captures images of the object, which are transferred to a graphics card. every / th of a second, surface normals are estimated using photometric stereo from a collection of images at spaced lighting directions. normal perturbations can be amplified, either in a local or global manner, to accentuate surface detail. additionally, synthetic specular highlights can be added, as in the specular enhancement method mentioned earlier. quantitative measures of surface roughness can be produced at frame rates as well. the resultant system allows untrained users to simply present object surfaces to the system while viewing enhanced results on a nearby display.

case studies

reflectance transformation has been used successfully in a variety of disciplines by researchers outside the fields of computer graphics and vision, using the ptm tools. some examples are highlighted below.

figure : photograph and rti enhancement of fragment of the antikythera mechanism.

cultural heritage – many examples of the deployment of reflectance transformation imaging (rti)
in the context of cultural artifacts can be found on the cultural heritage imaging (chi) web pages and elsewhere, specifically: http://c-h-i.org/examples/ptm/ptm.html. a recent application of the method was in the study of the antikythera mechanism [fbm* ] by an international research team consisting of scholars and researchers from greece, the uk and the united states, http://www.antikythera-mechanism.gr/. the antikythera mechanism is a mechanical astronomical calculator that was built by the ancient greeks around bce and resides in the national archaeological museum in athens. it was discovered by sponge divers in after being underwater for approximately two millennia. in conjunction with microfocus ct studies, reflectance imaging was applied to the device to uncover a total of over characters from a starting point of . in particular, reflectance imaging was helpful in decoding lunar and solar eclipse glyphs indicating the saros cycle.

figure : the painting “jean de la chambre at the age of ”, by frans hals, dates from between - . note the variation in brush strokes visible under varying lighting direction, from the left, center and above respectively. images courtesy of the national gallery in london (http://cima.ng-london.org.uk/ptm/ng_examples.htm).

criminal forensics – the enhancement capabilities of rti are useful in a number of criminal forensics contexts. in the united states, the fbi has used the method for looking at faint indented writing. the california department of justice has used it for studying footprints on soft substrates, and the san mateo police department has employed it for looking at faint fingerprints. several more criminal investigations using the method are underway.
art conservation – the capture and display of paintings under varying lighting direction is a more thorough characterization than any single image of the same painting. for this reason, both the national gallery and tate galleries in london have explored the use of ptms on several of the paintings in their collections [psm ]. in particular, impasto, cracks, canvas weave, wood grain, pentimenti and point surface deformations can often be easily rendered visible and documented.

paleontology – the reflectance transformation techniques in particular have proved useful to paleontologists gleaning information from fossils, specifically those specimens with low color contrast and low but definite relief [hbmg ]. one such example is shown in figure . these methods have been successfully employed on a large number of fossils with different types of preservation, including cambrian fossils from the burgess shale and chengjiang conservation lagerstätten, cambrian fossils with 3d relief from dark shales of norway, carboniferous plant fossil impressions from england, cambrian trace fossils in sandstone from sweden, and neoproterozoic impression fossils from the ediacara lagerstätten of south australia.

references

[ffy ] fisher, j., faraboschi, p., young, c., embedded computing: a vliw approach to architecture, compilers and tools, elsevier press, , isbn - - - - .
[fbm* ] freeth, t., bitsakis, x., moussas, j., seiradakis, a., tselikas, a., mangou, h., zafeiropoulou, m., hadland, r., bate, d., ramsey, a., allen, m., crawley, a., hockley, p., malzbender, t., gelb, d., ambrisco, w., edmunds, m., “decoding the ancient greek astronomical calculator known as the antikythera mechanism”, nature, vol. , nov. th, , pp. - .
[hbmg ] hammer, o., bengtson, s., malzbender, t., gelb, d., “imaging fossils using reflectance transformation and interactive manipulation of virtual light sources”, palaeontologia electronica, august , .
appears at http://palaeo-electronica.org/ _ /fossils/issue _ .htm
[psm ] padfield, j., saunders, d., malzbender, t., “polynomial texture mapping: a new tool for examining the surface of paintings”, icom committee for conservation, , vol. , pp. – .
[mgw ] malzbender, t., gelb, d., wolters, h., “polynomial texture maps”, proceedings of acm siggraph , pp. - .
[mwga ] malzbender, t., wilburn, b., gelb, d., ambrisco, b., “surface enhancement using real-time photometric stereo and reflectance transformation”, eurographics symposium on rendering , nicosia, cyprus, june - , .

recreating authentic virtual byzantine environments

tutorial presenter: alan chalmers
additional authors: eva zányi, jassim happa
warwick digital laboratory, university of warwick, uk
email: alan.chalmers@warwick.ac.uk

figure : wireframe of angeloktistis church, kiti.

computer reconstructions of heritage sites provide us with a means of visualizing past environments, allowing us a glimpse of the past that might otherwise be difficult to appreciate. to date, many computer models have been developed to recreate a multitude of past environments. these reconstructions vary vastly in quality, and very few attempt to authentically represent how a site may have appeared in the past. to achieve such a high-fidelity result, it is crucial that these models are physically based and incorporate all known evidence that may have affected the perception of a site. failure to do so runs the real danger of the virtual reconstruction providing a false impression of the past. a key feature when reconstructing past environments is authentic illumination [dc , rc ].
today the interiors of our buildings are lit by bright and steady light, but past societies relied on daylight and flame for illumination. our perception of an environment is affected by the amount and nature of light reaching the eye. a key component in creating authentic and engaging virtual environments is the accurate modeling of the visual appearance of the past environment illuminated by daylight and flame. this section describes the high-fidelity computer reconstruction of byzantine art, the rare visible remains of the long-lasting byzantine empire. we show that there is a major difference between the way in which people view byzantine art today and the way it may have appeared in the past, when displayed in its original context and illuminated by candlelight, oil lamps and daylight.

byzantine environments

the byzantine empire grew out of the eastern roman empire and comprised a large number of different cultures. scholars do not agree when the empire began, but in ad emperor constantine i (reigned - ) moved his capital to byzantium, which was renamed constantinople. the byzantine empire lasted for more than years until , when the turks occupied constantinople. despite the large number of different cultures within the empire, a common architecture and sacred art style developed. during byzantine times, cyprus closely followed the art and cultural trends of the capital, constantinople, with especially high-quality art. today it is in cyprus, a formerly rich and peaceful province of the byzantine empire, that many of the most precious surviving relics of byzantine art are to be found. one reason is that byzantine master painters visited cyprus to paint and teach their art, producing many paintings of church interiors and icons.
another reason is that cyprus achieved a state of neutrality in the th century strife between byzantium and islam and therefore remained unaffected by the iconoclastic edicts of the byzantine emperors, which resulted in many pieces of art elsewhere being destroyed.

the outsides of byzantine churches were unimposing, with little decoration or use of paint or precious materials. the interiors, however, were very different, being highly decorated with substantial amounts of gold and other precious materials. manuals, known as typicons, regulated the positioning of the lighting within the environment in great detail. lighting was deliberately used to underline the difference between divine light and profane darkness. care was thus taken to ensure the architecture used light and shadow to symbolically represent different sacral hierarchies and direct the attention of the viewer. the upper parts of the churches, which represented heaven, were better lit than the lower parts. in early byzantium this was achieved with the help of daylight through small openings in the upper parts of the walls. from middle byzantium on, the buildings had fewer openings letting in natural light, and these were replaced by oil lamps and candles [the ]. in addition, the flickering light from different directions would have significantly affected the precious materials such as the gold and silver of the icons, mosaics and frescoes, making them sparkle. the whole purpose was to draw the visitor in the church into contemplation [bel , pee ].

artifacts visualized

figure : the th century mosaic depicting the virgin maria between archangels at angeloktistis church, kiti.

the byzantines were much preoccupied with the use of gold and favored it extensively in their churches.
in the icons, massive wall and ceiling mosaics, and frescoes, the use of gold not only symbolized immortality and the supernatural but was also meant to illuminate the pictures from within. this lighting effect, in combination with certain architectural elements of the churches, was used to create certain illusions, including the holy people on the cupola mosaics seeming to step out of the golden background, approaching the viewer [hjk ]. gold was used not only for the pictures but also for candlesticks, with churches having masses of candles, both in ornate floor candle holders and in hanging candelabra. byzantine architects in fact paid careful attention to the use of direct and indirect lighting in certain parts of the church building, depending on the firmly defined religious value of the respective space [the ]. this religious value was also symbolized by the architectural form and the use of pictures. for example, the cupola, being the most characteristic architectural element of the byzantine churches, was meant to be a direct representation of heaven; it therefore had to be illuminated by as much light as possible, including the generous use of reflecting gold [hjk ].

we investigated the high-fidelity reconstruction of three artifacts, all of which contain gold.

• the th century mosaic depicting the virgin maria between archangels at angeloktistis church, kiti, near larnaca, figure . gold was used for the background and the halos. the mosaic stones were glass tesserae, which allowed light to reflect and refract within the glass.
• the icon of christ arakiotis, from the church of pantocrator of arakas from lagoudera. the icon is currently displayed in the byzantine museum & art gallery, bishops palace in nicosia. the icon is dated from the end of the th century and is painted with tempera and gold leaf on a wood panel, which was typical for artifacts primarily intended for ritual or ecclesiastical use during the byzantine period.
• the fresco of st.
george on horseback, - th century, in the chapel of sts cosmas and damian, also at the angeloktistis church, kiti.

capturing the data

detailed measurements were taken at the two environments, figures , , . the geometry was measured using a leica disto a laser distance meter, which has an accuracy of ± . mm over a range of m. light level measurements were taken at numerous points using a minolta t illuminance meter with a measuring range of . to , lx. finally, several hundred digital photographs were taken, with and without the inclusion of a macbeth color checker chart. a number of images were also stitched together to create panoramas of each of the environments. in addition, hdr images of each of the environments were created using a series of photographs at different exposure levels [dm ].

to capture a single-viewpoint ptm image, each artifact needed to be photographed from a fixed camera position. multiple photographs were taken, each illuminated from a different light position. if the positions of the lights are known, the photo sequence can be mathematically synthesized into a single ptm image. the images are captured using a process termed the ‘egyptian method’, in which a string is used to measure the illumination radius distance based on the diameter of the subject. one end of the string is tied to the light source and the other end is held near to, but not touching, the subject at the location corresponding to the center of the composed image. for each light position photographed, the subject end of the string is positioned and the light distance is determined. the subject end of the string is then moved out of the camera’s field of view and the photo taken. this process is repeated until a representative hemispheric sample of light directions is acquired around the subject.

figure : capturing the icon of christ arakiotis, byzantine museum & art gallery.
capturing ptms of the artifacts using this technique posed a number of challenging problems for the project. the presence of light sensitive objects, including tempera on the wood icon and the nature of the fresco, mandated a low photonic damage lighting system. in an isolated environment, the mosaic tesserae themselves are very resistant to photonic damage, and standard flashes or other photographic lights could have been used to document them responsibly. however, in its apse location, the proximity of light sensitive materials meant that responsible cultural heritage practice required another approach. the solution was to use a watt xenon arc lamp light source designed to power a fibre optic swimming pool illumination system. xenon sources emit visible light as well as large amounts of photonically damaging ultraviolet (uv) and infrared (ir) light. while a variety of light transmitting fibers and guides are available to carry this light, the least expensive and most widely used material is pmma acrylic cable. pmma acrylic acts as a band pass filter, excluding both uv and ir light and passing only visible wavelengths between and nm. we used a bundle of this fiber to filter our light source. a cypriot lighting contractor, andreas demetriou, loaned the equipment to the project at no charge.

figure : capturing the fresco of st. george on horseback, chapel of sts cosmas and damian, angeloktistis church, kiti.

the apse is five meters off the floor at the top and over three meters at its base, which caused some major difficulties when trying to capture the images of the mosaic. although a four meter ladder was available at the church and kindly loaned to us for this part of the work, the enclosure of the sanctuary directly below the apse is separated from the rest of the church by a high, ornate grating, which both segregated the sacred space from the main part of the church and constrained our working area.
this limited area contained the altar, freestanding crucifixes, ritual objects, furnishings for practical support of ritual activities such as multiple daily masses, and, in addition, all the necessary project equipment for the image capture, including cameras, lights, color checker charts, and reflection capturing black balls. the problem was overcome by attaching the subject end of the string for the egyptian method to a long pole, a broom handle loaned to us by the church. the subject end of the pole was cushioned with bubble wrap in case it accidentally touched the mosaic. this end was held close to, but fortunately never touching, the mosaic by a member of the team, and another person on the ladder then used the string to position the light correctly, figure . the broom handle and string were then moved out of the way and the image taken. despite all these difficulties, light positions were correctly captured, and this was enough to build the desired high quality rti images.

figure : capturing the th century mosaic depicting the virgin maria between archangels at angeloktistis church, kiti.

figure : icon of christ arakiotis lit from the (a) left, (b) middle, and (c) right.

creating the context

using the detailed measurements, accurate models of the angeloktistis church at kiti and the byzantine museum were created using the 3d modeling software maya, figures and . experimental archaeological techniques were used to build replica candles and oil for the lamps using authentic materials, in particular beeswax. these candles and oils were then lit and the detailed spectral data of each flame type measured using a spectroradiometer, which is able to measure the emission spectrum of a light source from nm to nm, in nm wavelength increments.
these spectral results were then converted into a form in which they could be incorporated into the physically based lighting simulation system radiance [ws ].

results

figure : model of angeloktistis church, kiti.

figures and show results from the ptm for the icon and the mosaic, which clearly show how the position of the lighting may have affected the appearance of the artifacts. this effect is especially pronounced with the mosaic, which is of particular interest as many of the byzantine mosaics were on curved walls and ceilings and included gold and silver glass tesserae. as the viewer or the light moved within the church, these tesserae sparkled. our study showed that the appearance of the mosaics is indeed significantly different when lit from various directions [ecma ].

figure : appearance of baby jesus from the mosaic lit from different directions.

figure shows the icon of christ arakiotis lit by simulated modern lighting, as it appears in the museum today, and figure shows it lit by simulated beeswax candlelight, as it may have appeared in the past [eyta , eyj* ].

figure : icon lit by simulated modern lighting.

summary

this section has shown two novel technologies being applied to the computer reconstruction of ancient byzantine artifacts and environments: high-fidelity physically based computer graphics techniques and ptms. the results clearly show that there is indeed a major difference in the way in which the artifacts are perceived when lit from different directions and with candlelight, oil lamps and daylight. these new insights into how byzantine art may have been viewed in the past will form the foundation for future high-fidelity computer reconstructions of cultural heritage sites and artifacts.

references

[bel ] h. belting. bild und kult: eine geschichte des bildes vor dem zeitalter der kunst. ch beck, .
[dc ] k. devlin and a. chalmers.
realistic visualisation of the pompeii frescoes. proceedings of the st international conference on computer graphics, virtual reality and visualisation, pages – , .
[dm ] p. debevec and j. malik. recovering high dynamic range radiance maps from photographs. acm siggraph , pages – , .
[ecma ] zanyi e., schroer c., mudge m., and chalmers a. lighting and byzantine glass tesserae. eva : electronic information, the visual arts and beyond, .
[eyj* ] zanyi e., chrysanthou y., happa j., hulusic v., horton m., and chalmers a. the high-fidelity computer reconstruction of byzantine art in cyprus. iv international congress of cypriot studies, .
[eyta ] zanyi e., chrysanthou y., bashford-rogers t., and chalmers a. high dynamic range display of authentically illuminated byzantine art from cyprus. vast : th international symposium on virtual reality, archaeology and cultural heritage, .
[hjk ] e. hein, a. jakovljevic, and b. kleidt. zypern. byzantinische kirchen und klöster. mosaiken und fresken. melina-verlag, ratingen, .
[pee ] g. peers. sacred shock: framing visual experience in byzantium. penn state press, .
[rc ] i. roussos and a. chalmers. high fidelity lighting of knossos. proceedings of vast , pages – , .
[the ] l. theis. lampen, leuchten, licht. in j.j.k. degenhaart, editor, byzanz - das licht aus dem osten. verlag philipp von zabern, .
[ws ] g. ward and r. shakespeare. rendering with radiance: the art and science of lighting visualization. morgan kaufmann publishers inc., san francisco, ca, usa, .

figure : icon lit by simulated beeswax candle.

*********************************************

reflection transformation imaging for large objects and quality assessment of ptms

tutorial presenter: roberto scopigno, additional author: m.
corsini
visual computing lab, isti - cnr, italy
email: {corsini, r.scopigno}@isti.cnr.it

reflection transformation imaging has proved to be a powerful method to acquire and represent the 3d reflectance properties of an object, displaying them as a 2d image. one of the most popular techniques for reflection transformation imaging is polynomial texture mapping (ptm), where, for each pixel, the reflectance function is approximated by a biquadratic polynomial. this tutorial section presents some practical issues in the creation of high-quality ptms of large objects. the aim is to analyze the acquisition pipeline and resolve, from a practical point of view, the issues related to the size of the object. moreover, we present some results on the quality assessment of ptms, showing the importance of light placement. the methodology is particularly interesting for the acquisition of certain classes of cultural heritage objects, such as bas-reliefs.

methodology

as just stated, ptms are typically acquired by positioning the object of interest inside a fixed illumination dome. this makes it possible to change the light direction automatically during photo acquisition, but limits the flexibility of the overall system. since, in this case, the objective is to acquire large objects, we decided to work with a “virtual” light dome, as explained in the next sections. in particular, we divided the acquisition process into three steps: acquisition planning, acquisition and post-processing.

acquisition planning

selecting the correct light placement is an important step in the ptm acquisition of large objects since, in general, we do not have the possibility of using a physical dome to illuminate the object. instead, we have to manually place the light in different positions, forming a “virtual” illumination dome.
the size of this illumination dome and its light distribution depend on the size of the target object and on the number of light directions we want to use to sample the reflectance function. to simplify light placement, we developed a specific software tool that helps us plan the positioning of the lights. the tool is quite simple to use: the scene setup is generated as the user inputs the size of the object to be acquired, its height from the ground and the distance of the camera. objects in the scene are scaled according to the user specifications, and the camera is pointed towards the center of the object. the next step is the definition of the acquisition pattern. the array of lights is generated by choosing the light distance and two angles (vertical and horizontal step). the tool can automatically exclude light positions that are too near to obstacles around the object of interest (if given as input) or that are aligned with the camera axis (where the light would be shadowed by the camera or would occlude it). the points are generated using a parallel-meridian grid. this does not guarantee a uniform distribution over the sphere, but having a series of light positions at the same height results in a much faster acquisition, given the manual placement. finally, given a complete dome, the program can prune the lights following the “distributed” scheme (described in section . ). this scheme, by generating a more uniform distribution, greatly reduces the number of required light positions without excessively affecting the ptm quality. when the light setup has been completed, the tool can save a written description of it, providing step-by-step instructions on light placement.

acquisition

figure : the acquisition setup.

several experimental devices have been created to acquire ptms.
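the parallel-meridian pattern described in the acquisition planning section can be sketched in python as follows. this is not the planning tool itself: the function name, the coordinate frame (object at the origin, camera on the +z axis, y up), the angular limits and the exclusion cone around the camera axis are all assumptions made for illustration, and obstacle avoidance and the “distributed” pruning scheme are omitted.

```python
import math


def generate_virtual_dome(distance, h_step_deg, v_step_deg,
                          max_azimuth_deg=75.0, max_elevation_deg=60.0,
                          camera_exclusion_deg=10.0):
    """Return a list of (elevation_deg, [light positions]) height levels.

    Each level holds lights at the same height, so the light stand only
    has to be adjusted once per level.
    """
    n_h = int(max_azimuth_deg // h_step_deg)
    n_v = int(max_elevation_deg // v_step_deg)
    levels = []
    for i in range(n_v + 1):
        elev = math.radians(i * v_step_deg)
        ring = []
        for j in range(-n_h, n_h + 1):
            azim = math.radians(j * h_step_deg)
            # Unit direction from the object center towards the light.
            direction = (math.cos(elev) * math.sin(azim),
                         math.sin(elev),
                         math.cos(elev) * math.cos(azim))
            # Angle between the light direction and the camera axis (+z).
            cos_axis = max(-1.0, min(1.0, direction[2]))
            off_axis = math.degrees(math.acos(cos_axis))
            if off_axis < camera_exclusion_deg:
                continue  # light would be shadowed by, or occlude, the camera
            ring.append(tuple(distance * c for c in direction))
        levels.append((i * v_step_deg, ring))
    return levels
```

grouping the output by elevation mirrors the reason given above for preferring a parallel-meridian grid: all positions in one level share the same light height.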
typically, these devices are suitable for sampling small objects (from a minimum of cm to a maximum of cm in size) and are characterized by a fixed dome.

following the previous considerations, our solution is shown in figure . our acquisition equipment was composed of an mpixel canon digital camera, a w halogen floodlight, a tripod and a boom stand. the fact that we used only one light also explains the parallel-meridian placement of lights: with this configuration we needed to set the height and direction of the light only once for each height level. the time needed to position the light was minimized by the acquisition planning just described, and by some reference marks placed on the floor. we sped up the acquisition using a printed scheme of the angle directions (it helped in placing the references on the floor very quickly), and a plumb line attached to the light to facilitate the positioning. the acquisition steps can be summarized as:

• measure the object, and find its center and its height from the ground.
• using these data, generate the “virtual dome” and place the reference marks following the output of the ptm planner.
• position the digital camera on the tripod. measure aperture and shutter speed under the illumination of the central light. keep these values fixed for all the photos, in order to have a constant exposure.
• for each height level, set the height and the direction of the light, then place it on each reference mark related to the level, and take the photo.

other advantages of this equipment are that it is quite cheap (nearly euros in total) and easily transportable.

data processing

in order to calculate a precise illumination function, a critical factor is that the digital camera must remain fixed from one photo to the other.
even a misalignment of a few pixels can produce a bad result, with visible aliasing. in our experimental acquisition setup, small movements of the camera could occur. this made it necessary to align the set of photos before building the ptm. we performed the alignment automatically using a freeware tool for panoramic images. this is the only data processing needed before generating the ptm.

about manual light placement

as just stated, the light in our acquisition device is placed manually for each direction sample. the acquisition planning and other solutions like the reference marks help us to reduce the time this takes. nevertheless, nothing prevents us from further reducing the acquisition time by employing solutions that eliminate the need for manual placement of the light. in fact, a useful tool that uses a mirror ball to estimate the lighting direction, without the need to measure it, has recently been used with success in ptm acquisition [mmsl ]. even in this case, the acquisition planning continues to be helpful (e.g. for obstacle avoidance). a completely image-based automatic estimation (with a certain degree of approximation) of the light direction is also possible (winnemöller et al. [wmtg ]), making the light positioning an easy and very fast task.

quality assessment

in this part of the tutorial we consider some issues regarding the quality assessment of ptms. more specifically, we performed our quality evaluation with respect to the number and position of the lights used during the acquisition. in order to perform this quality evaluation, we considered a by cm section of the xivth century tomb of archbishop giovanni masotti as a case study. we performed a very accurate ptm acquisition using a large number of light positions ( light positions, angles and height levels), and we also acquired the same object with a triangulation scanner (minolta i).
we consider the 3d scanned model as a “ground truth”, since for large objects 3d scanning is a very reliable technique in terms of accuracy. following the steps just described, we created a ptm using all the photos. we also generated a high-precision 3d model (nearly . millions of faces, / of millimeter of sampling resolution) from a set of range maps.

figure : comparison between the normal maps of the 3d scanning and the ptm: full model and detail.

our first comparison was between these two representations; as a measure of quality we compared the normals calculated from the ptm data with the surface normals of the 3d model. to do so we aligned the 3d scan model to the ptm image using a tool for image registration [fdg_ ]. in figure a comparison of the normal maps is shown. the variation of the normals in the ptm is smoother than in the corresponding 3d scan, but their values are coherent. this test demonstrates that, even though ptm provides an approximation of the object’s geometry, the obtained data are reliable. it also demonstrates that our setup does not introduce significant errors.

the other analysis concerned the degradation of ptm quality with respect to the number and position of lights. for this purpose, we created four ptms starting from subsets of the original lights. then we compared the normal maps of the “best” ptm (the one with lights) and the “sub-sampled” ones. the comparison was made by calculating the difference of the dihedral angles between the normals of each pixel.

figure : quality degradation: (a) best quality ptm (normal map) (b-e) maps of the differences in dihedral angle of normals. the sphere shows the lights placement.

in figure we show the analysis of the difference between the best ptm and four possible subsets.
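the per-pixel comparison of normal maps can be sketched as follows. this is a minimal illustration under stated assumptions, not the tool used to produce the figures: each normal map is assumed to be a grid (a list of rows) of (x, y, z) tuples, and the function names `angle_between_deg` and `degradation_stats` are hypothetical. it reports the mean, variance and peak of the angular difference, which are the quantities discussed in this section.

```python
import math
from statistics import fmean, pvariance


def angle_between_deg(n1, n2):
    """Angle in degrees between two 3D vectors (normalized internally)."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos))


def degradation_stats(reference, subsampled):
    """Compare two equally sized normal maps pixel by pixel.

    reference, subsampled: lists of rows, each row a list of (x, y, z) normals.
    Returns the mean, population variance and peak of the angular difference.
    """
    diffs = [
        angle_between_deg(a, b)
        for row_a, row_b in zip(reference, subsampled)
        for a, b in zip(row_a, row_b)
    ]
    return {"mean": fmean(diffs), "variance": pvariance(diffs), "peak": max(diffs)}


if __name__ == "__main__":
    best = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
    fewer_lights = [[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]]
    print(degradation_stats(best, fewer_lights))
```

in practice the two maps would first have to be registered to the same pixel grid, as was done here for the 3d scan and the ptm image.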
in terms of the number of lights, we can observe that it can be reduced considerably without an excessive degradation of quality. for example, we can reduce the number of photos to (see figure (c) and (d)) and still obtain a ptm where the mean value and variance (nearly . and respectively) of the overall degradation are satisfactory. as regards the placement of lights, we can observe the case of figure (c) and (d): even though we have almost the same number of lights, a more uniform distribution of the lights leads to lower mean degradation and peak error. considering these facts, we can conclude that a pattern of - properly distributed photos can produce a high-quality ptm.

. results

several objects have been acquired with the developed system in order to show the reliability of the acquisition results. in the tutorial we will show the results obtained on three artifacts: a capital, a bas-relief and a sarcophagus. snapshots of the acquired ptms will be shown in the presentation. the ptms themselves are available for download with the additional course material from the course’s website. these testbeds produced satisfying results, and showed us that ptm can be an alternative method for documenting and communicating cultural heritage information for large objects as well. moreover, they also gave useful suggestions on how to perform the acquisition more quickly without compromising the quality of the final results. a final consideration regards improving the proposed methodology by using an automatic system to estimate the light direction. this would yield more accurate results and considerably reduce the time needed for light placement.

references

[fdg_ ] franken t., dellepiane m., ganovelli f., cignoni p., montani c., scopigno r.: minimizing user intervention in registering 2d images to 3d models.
the visual computer , - (sep ), – . special issue for pacific graphics .

[mmsl ] mudge m., malzbender t., schroer c., lum m.: new reflection transformation imaging methods for rock art and multiple-viewpoint display. in vast: international symposium on virtual reality, archaeology and intelligent cultural heritage (nicosia, cyprus, ), ioannides m., arnold d., niccolucci f., mania k., (eds.), eurographics association, pp. – .

[wmtg ] winnemöller h., mohan a., tumblin j., gooch b.: light waving: estimating light positions from photographs alone. computer graphics forum , ( ), – .

**********************************************

. photometric stereo, structured light and related image based techniques for real world capture

tutorial presenters: james davis, oliver wang, and prabath gunawardance
university of california, santa cruz, usa
email:{davis, owang, prabath}@cs.ucsc.edu

this section of the tutorial provides an overview of different real world sensing methods. the most commonly used methods in the context of cultural artifacts are triangulation methods. 3d depth from triangulation has traditionally been treated in a number of separate threads in the computer vision literature, with methods like stereo, laser scanning, and coded structured light considered separately. in this overview, we attempt to unify many of these previous methods. viewing specific techniques as special cases leads to insights regarding the solutions to many of the traditional problems of individual techniques. in addition to 3d measurements, it is possible to directly measure the orientation of the object's surface using methods like photometric stereo. true 3d and direct orientation measurements each have advantages. combining both methods can lead to surface reconstruction superior to using either method alone.

. new research

we present new research into techniques used to visualize image-based, empirically captured objects.
the research goal is to interpolate both lighting and viewing directions while using a small amount of data that can be easily transferred over the web. the work examines various alternative representations of the lighting and spatial information that can be used to model this information compactly.

one of the key questions to answer is which low dimensional representation of lighting and viewpoints will most faithfully represent actual objects, especially given the interpolation and other processing that will be required. as one example, the figure below shows a coin encoded using both ptms and spherical harmonics. ptms are the current standard in museum rti imaging. the academic community has primarily been using spherical harmonics during the last few years. both representations are efficient to compute and store. the ptm encoded stripes have substantially lower contrast and fail to capture important specular components. even this simple change to lighting interpolation affects the ability of researchers to interpret the archival images.

figure : an overlaid comparison (using horizontal stripes) of ptms vs. hemispherical harmonics, showing that the hemispherical harmonic representation better preserves contrast from the original images.

view interpolation requires some notion of pixel or light ray correspondence to smoothly blend from one view to another. lightfields essentially assume planar objects and apply standard signal processing to reconstruct in-between views. an alternative approach would be to have full 3d geometry and simply render in-between views. we have been investigating to what extent optical flow (or equivalently stereo correspondence) can be used to provide approximate geometry and thus aid in the task of view interpolation. in particular, multiple lighting conditions provide access to a larger portion of the object's reflectance function than does just a single image.
by fitting a low dimensional model, such as a ward lighting model or spherical harmonics, to the lighting conditions in each view and performing stereo matching on the coefficients of this model rather than on raw image intensities, better stereo matching is possible. below is an example of stereo reconstruction using (left) standard passive stereo and (right) reconstruction using a low order reflectance field to enhance the matching. the quality is clearly improved.

figure : a comparison of the disparity maps generated using standard passive stereo matching (left) and reflectance function coefficient matching (right). darker colors indicate further distance from the camera.

references

[nrdr ] diego nehab, szymon rusinkiewicz, james davis, ravi ramamoorthi: efficiently combining positions and normals for precise 3d geometry, acm transactions on graphics (siggraph), ( ), .

[dnrr ] james davis, diego nehab, ravi ramamoorthi, szymon rusinkiewicz: spacetime stereo: a unifying framework for depth from triangulation, ieee trans. on pattern analysis and machine intelligence (pami), vol. , no. , .

*********************************************

. not all content is ‘born-archival’: digital surrogates and the perpetual conservation of digital knowledge

tutorial presenter: michael ashley
university of california, berkeley, usa
email: mashley@berkeley.edu

"thousands of years ago we recorded important matters on clay and stone that lasted thousands of years. hundreds of years ago we used parchment that lasted hundreds of years. today, we have masses of data in formats that we know will not last as long as our life times. digital storage is easy; digital preservation is not."
- danny hillis this is a tutorial about digital archives and end-user expectations, and how the practices and technologies of our collective tutorial can fundamentally revolutionize the way producers and consumers of digital content engage with media. we have focused thus far on state-of-the-art media production. here we will look at the state-of-the-field in digital archiving and preservation to see if the world is ready for such innovation. within the past hours (today is february ), the world has seen two continents lose internet access, and microsoft offer to acquire yahoo! for over billion dollars. the internet, and digital technology, remains volatile, friable and at high risk from the perspective of long-term human history. flickr, the huge photo sharing site owned by yahoo!, is a digital repository for millions of users internationally, with over , images uploaded every minute. what would happen if the internet ‘died’ or microsoft decided to pull the plug on flickr? should we be asking, what will happen ‘when’? figure : the long now from: http://www.longnow.org/about/ the long now foundation, established in “ ” seeks to “become the seed of a very long term cultural institution,” meaning adopting a counterpoint to today's ‘faster/cheaper’ mind set and promote ‘slower/better’ thinking [lnf ]. archaeologists as well as archivists don’t think that a decade is a long time, even a century, a millennium, when we take a look at time from the perspective of the human record. digital technology is changing all of this, not necessarily for the better. the tech industry measures time in financial quarters and product lifecycles. those of us who care about the future of human knowledge need to step up and figure out how to make digital content persistent, insulated from the sea changes of innovation and stock prices. this is, as stewart brand says, a “civilizational issue.” [bra ] . 
the ‘digital dark ages’

hillis describes the here and now as a “digital dark age” because information is devalued by the ubiquity of digital content that cannot outlast our lifetimes. while we have more than enough storage media to hold the cultural memory of the planet, the half-life of data is currently about five years. this is because digital preservation is neither a corporate priority nor a consumer priority at present. it must, therefore, be a producer priority. [bra ]

digital archivists resist new file formats, new metadata standards, new lifecycles and practices for all the right reasons. consider the fiduciary responsibility of institutional repositories that are charged with keeping content safe, archival, and accessible for as long as possible. minimizing file formats and standardizing metadata minimizes risk (and presumably, costs) as formats become obsolete. the problem is that by limiting the formats archives are willing to accept, we are actually putting the great majority of digital knowledge at risk. jpeg and mp3 are just two examples of ‘lossy’ file formats that are ubiquitous and yet not accepted by most ‘trusted’ repositories. is the information within these files meaningless?

if we wish to avoid a digital dark age, we need to incite consumers into action. in this case, the consumers are the archives. to do so, there are several strategies we can apply. we suggest that we need to design digital media to be ‘self-archiving’, adaptable to virtually any digital environment, so that they have no need to rely on ‘institutional’ repositories to exist, at least not in the monolithic sense. we need file formats that are too clever to ignore, that minimize risk while maximizing semantic meaningfulness, and can transmogrify themselves without degrading as they move ‘across the cloud’.
we need institutional repositories to exist, for as clifford lynch says, they are “most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution.” [bai ] until we can invent the digital equivalent of cuneiform tablets, that is, a substance that can preserve the medium and the message equally, we will need stewards of the human record. our short-term proposition (for the next few decades, say), is to provide digital archives a revolutionary way forward in sustainability. . ‘born-archival’ vs. ‘born-digital’ ideally, all of us can be carriers of the digital human genome, digital archivists in our own right, and the technologies and workflows we have been discussing and practicing in this tutorial program go a long way toward this aim. when digital file formats can provide consumers, and here we mean end-users, with digital content that is self- archival, we will have achieved the paradigm shift needed to end the reliance on digital libraries and institutions of cultural memory. john kunze, preservation specialist for the california digital library, calls for ‘born-archival’ media that is fully accessible and preservable at every stage, throughout the lifecycle of this data, from birth through pre-release to publication to revision to relative dis-use and later resurgence. data that is born-archival can remain long-term viable at significantly reduced preservation cost [kun ]. we advocate for both individual professional responsibility and multi-institutional, multi-disciplinary curatorial management of digital heritage content for the foreseeable future. unlike the physical archives of the library of alexandria, lost forever to humanity, digital heritage can be in more than one place at a time and in more than one form, potentially assuring its longevity despite the ephemeral nature of the media. 
this multiplicity of location and form is both the promise and the peril of digital heritage. with increasingly diverse data formats, larger file sizes, changing media types, distributed databases, networked information and transitive metadata standards, how are today’s heritage specialists to plan for such an uncertain virtual future? it is increasingly difficult for individual scholars and researchers to do the right thing when it comes to digital heritage conservation. the accountability for the conservation of digital heritage falls to all in the natural science (ns) and cultural heritage (ch) fields. but what is a reasonable course of action in the face of such adversity? . digital heritage conservation the importance of developing sensible plans to preserve our digital heritage cannot be minimized. responsible preservation of our most valued digital data requires answers to key questions: which data should we keep and how should we keep it? by digital heritage conservation, we mean the decision-making criteria to discern what must be saved from what can be lost. everything can't be saved nor is it desirable to do so. how is this data to be saved to ensure access in five years, years or , years? in the next years, we will go through dozens of generations of computers and storage media, and our digital data will need to be transferred from one generation to the next, by someone we trust to do it. finally, who will pay for all this? we produce more content now than it is humanly possible to preserve. current estimates are that in , billion trillion bytes -- exabytes, or billion gigabytes -- of digital data were generated in the world -- equivalent to stacks of books reaching from the earth to the sun. in just minutes, the world produces an amount of data equal to all the information held at the library of congress [bb ]. 
we can think of digital heritage in terms of the value of what is being saved, its viability, how available it is to stakeholders, and how long it will last. in other words, an ideal digital heritage repository would conserve archival quality digital surrogate files in an openly accessible way, forever. this is the simplest definition of a trusted repository. the library of congress devised a set of sustainability factors for digital content that are as pragmatic as they are difficult to maintain over time. the core principles we advocate in this tutorial strongly adhere to these sustainability factors [clir ].

adoption: wide adoption of a given digital format makes it less likely to become obsolete while reducing investment by archival institutions for its migration or emulation.

transparency: open to direct analysis without interpretation, transparency is characterized by self-evidence and substantive metadata. those who use digital surrogates benefit from complete and accessible empirical provenance.

self-documentation: xmp (extensible metadata platform) and other key forms of self-evidence, such as automatically generated empirical provenance data, dramatically increase the chances for a digital object to be sustainable over time.

external dependencies: the less a media form is dependent on proprietary software/hardware, the better. if two documentation methodologies can yield similar results in terms of accuracy and productivity, the more open / less externally dependent method is recommended.

impact of patents and copyrights: intellectual property limitations bound to content can inhibit its archival capabilities in profound ways. whenever possible, unambiguous, open licensing for content is recommended.
technical protection mechanisms: “no digital format that is inextricably bound to a particular physical carrier is suitable as a format for long-term preservation; nor is an implementation of a digital format that constrains use to a particular device or prevents the establishment of backup procedures and disaster recovery operations expected of a trusted repository.” additionally, limitations imposed by digital rights management (drm) or archaic security protocols severely limit the long-term viability of digital content. furthermore, the archaeology data service (ads) in the uk defines the most critical factor for digital heritage sustainability is to “plan for its re-use” [ads ]. indeed, the design of decision making principles for digital heritage conservation should above all aim to the perpetual use and re-use of this content by striving to assure its reliability, authenticity and usability throughout the archival lifecycle. digital technology and the creation of ‘born digital’ content are indispensable aspects of ns and ch management today. from low-tech documentation like microsoft office, html websites, pdf, and photography, to more complex technologies such as panoramas, object movies, laser/lidar scanning, scanning electron microscopy (sem), x-ray fluorescence (xrf), global positioning system (gps), d modelling, and distributed databases, to cutting edge techniques including web . , reflection transformation imaging (rti), algorithmic generation of drawings from surface normals, and the family of photogrammetry influenced texture and d geometry acquisition tools, these new media types form a spectrum of opportunities and challenges to the preservation field that did not exist even years ago. . a role for all of us we are at a unique point in history, where ns and ch professionals must work to care for the physical past while assuring that there will be a digital record for the future. 
peter brantley, executive director of the digital library federation, thinks, “the problem of digital preservation is not one for future librarians, but for future archaeologists.” if one imagines that the well-intentioned efforts of researchers and scholars in the modern era could be unreadable only fifty years from now, there is tremendous responsibility on individual ns and ch professionals to ensure a future for their digital work.

in the mid ’s, a critical gap between those who provide information for conservation (providers) through construction of digital heritage documentation and those who use it (consumers) was identified by the international council on monuments and sites (icomos), the getty conservation institute (gci) and the international committee for architectural photogrammetry (cipa), who together formed the recordim (for heritage recording, documentation and information management) initiative partnership [gci ]. a , gci-led literature review demonstrates that most of the key needs identified in recordim are evidently still with us. after reviewing the last years of cultural heritage documentation, the authors concluded, “only / th of the reviewed literature is strongly relevant to conservation.” [ec ] their suggested remedy is to correlate the needs of conservation with the potential documentation technologies by involving more diverse audiences and by creating active partnerships between heritage conservationists, heritage users, and documentation specialists.

we are focusing on another gap, between cultural heritage and digital heritage, that has been created as we have shifted away from paper in favor of pixels throughout all of our communication and analytic processes globally.
in , the library of congress recognized that “never has access to information that is authentic, reliable and complete been more important, and never has the capacity of libraries and other heritage institutions to guarantee that access been in greater jeopardy.” [clir ] we see the crisis not between producers and consumers of digital data, but in the capacities of ns and ch specialists to produce the content for themselves in ways that can adhere to the principles defined by the loc and other key international standards bodies. there is a desperate need for methodologies for digital heritage conservation that are manageable and reasonable and, most importantly, can be enacted by ns and ch professionals as essential elements of their daily work. the collaboration between ns and ch professionals and digital specialists should lead to the democratization of technology through its widespread adoption, not the continued mystification of technology that is still being defined by the persistence of a producer/consumer model [ta ].

references

[ads ] archaeology data service: digital preservation faq (york, ), archaeology data service, http://ads.ahds.ac.uk/project/faq.html

[bai ] bailey, c. w.: institutional repositories, tout de suite. ( ), digital scholarship, http://www.digital-scholarship.org/

[bb ] barksdale, j., berman, f.: saving our digital heritage

[bbc ] bbc: microsoft wants to purchase yahoo! (london, ), bbc, http://news.bbc.co.uk/ /hi/business/ .stm

[bra ] brand, s.: escaping the digital dark age: library journal ( ), http://www.rense.com/general /escap.htm

[clir ] council on library and information resources (clir) and the library of congress: preserving our digital heritage (washington, d.c., ), plan for the national digital information infrastructure and preservation program.
http://www.digitalpreservation.gov/formats/intro/intro.shtml

[cnn ] cnn.: third undersea internet cable cut in mideast ( ), cnn.com, http://www.cnn.com/ /world/meast/ / /internet.outage/index.html

[ec ] eppich, r., chabbi, a.: how does hi-tech touch the past? does it meet conservation needs? results from a literature review of documentation for cultural heritage. in vast’ : proc. th int. symp. virtual reality, archaeology and cultural heritage (cyprus, ) the european research network of excellence in open cultural heritage (epoch)

[gci ] getty conservation institute: about recordim (los angeles, ), http://extranet.getty.edu/gci/recordim/about.html

[kun ] kunze, j.: new knowledge and data preservation initiative. california digital library ( )

[lnf ] about the long now foundation: the long now foundation ( ), http://www.longnow.org/about

[loc ] library of congress:

[lv ] lyman, p., varian, h.: how much information?. school of information management, uc berkeley (berkeley, ), http://www.sims.berkeley.edu/how-much-info-

[nsf ] the national science foundation and the library of congress: it's about time (washington, d.c., ), research challenges in digital archiving and long-term preservation.

[smi ] smith, a.: distributed preservation in a national context. d-lib magazine ( )

[ta ] tringham, r., ashley, m.: the democratization of technology. (berkeley, ), virtual systems and multimedia proceedings th international conference.

[une ] unesco: charter on the preservation of digital heritage (paris, ), http://portal.unesco.org/ci/en/ev.php-url_id= &url_do=do_topic&url_section= .html

*********************************************

.
techniques and tools of empirical acquisition knowledge management

tutorial presenter: martin doerr
information systems laboratory, centre for cultural informatics of the institute of computer science, forth
email: martin@ics.forth.gr

as outlined in the previous section, scientific data cannot be understood without knowledge about the meaning of the data and the ways and circumstances of their creation. such knowledge is generally called “metadata”, i.e., data about data. in this section we deal with the problem of automating scientific image capturing and processing methods to the degree possible and of managing the metadata of these processes (“empirical provenance data”) from generation to use, permanent storage and reuse. in the following we describe requirements for tools, interface specifications, the design of the metadata lifecycle and the core data structure needed to enable wide use of the respective imaging technology, in particular as a low-cost and easy-to-apply method for low-budget customers and out-of-lab applications.

. function

ultimately, the metadata should be sufficient to support the scientific interpretation of the resulting data of an imaging process. frequently the evolution of technology, an understanding of shortcomings in the execution of a particular process, or new requirements for the quality of the results may require recalibration or reevaluation of the empirical (primary) source data. alternatively, parts of the source data may be replaced by better ones and the process reevaluated. in wider scenarios, any part of the source data may be reused for other processes in the future. integration or reuse of resulting data may also require recalibration of the results.
the value of imaging data as an information source and for reuse may deserve long-term accessibility and preservation. this means storage in central repositories, search capability and preservation of the knowledge needed to view, run or migrate imaging results to new platforms.

figure : digitization process as specialization of the cidoc crm.

. environment and processes

field observation is characteristic of cultural heritage. immobile objects, but also rare or valuable objects, are not easily moved from their permanent location, or belong to a social context such as a church or temple. therefore the capture of primary source data may have to take place with mobile equipment, within the constraints of the object's location. under these conditions, reliable registration of the process and context conditions must be possible:

• the identity of the measured or depicted object
• the experimental setup (geometry, light sources, tools, obstacles, sources of noise)
• capture parameters

it must be possible to import all metadata that already exist in other sources, such as object descriptions, tool descriptions and processing parameters (such as exif), without retyping. all data common to a series of captures should be entered only once. automatic plausibility control of manually entered data is feasible to a certain degree. the capture is carried out by a team, which may travel between various sites or stay at a home lab. actual processing may occur (for trial purposes!) in the field, or in a home lab, or by another team.
source and processed data may be transferred to other locations for use, reuse, reevaluation or permanent storage. so at various sites, supersets, subsets and derivatives of the same data emerge. the derivation graph is directed, but neither linear nor a tree. these sites may have different platform requirements. it must be possible to export and import parts and wholes of source and processed data in different formats and to preserve the identity of all the items referred to, such as objects depicted, involved actors, individual captures, resulting data and complete, integrated data sets. metadata should be created with minimal manual input, and should be consistent and free of transcription errors. they must be interoperable with multiple platforms, easy to adapt to new processes, and easy to migrate, decompose and integrate in different ways. a notion of identity for all partial products and their authenticity should be preserved despite decomposition and reintegration.

. approach

only a workflow management system can sufficiently monitor and control the imaging processes in order to capture and import metadata (such as device description data, device parameters (e.g., exif), experimental setup, calibration data, identification of software used, time of capture etc.) and to correlate input data, intermediate steps and common processing parameters with final results. primary source data together with (preliminary) end results of an imaging process form a complex, coherent whole. the metadata of this whole form a “metadata masterfile”, which should contain all data scientists would regard as necessary for later interpretation and reuse, to the degree such foresight is possible and correct. respective components, such as a single source image or scan, form self-contained subunits within the whole. in order to avoid data redundancy, the data structure will rely on suitable uri generators and a system of rich cross-linking.
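the combination of uri generation and cross-linking can be illustrated with a small sketch. it is only one possible reading of the requirements above, under assumptions that are not part of this tutorial: uris are derived from a content hash (the `urn:sha256:` prefix is a hypothetical scheme, not a registered namespace), and the class `DerivationGraph` and its methods are invented for illustration. the example shows a derivation graph that is directed but not a tree, and how the items a result depends on can be traced.

```python
import hashlib
from collections import defaultdict


def content_uri(data: bytes) -> str:
    """Hypothetical URI scheme: identical bytes always yield the same URI."""
    return "urn:sha256:" + hashlib.sha256(data).hexdigest()


class DerivationGraph:
    """Directed graph linking each item to the items it was derived from."""

    def __init__(self):
        self.sources = defaultdict(set)  # derived URI -> set of source URIs

    def add(self, data: bytes, derived_from=()):
        """Register an item (idempotent for identical data) and return its URI."""
        uri = content_uri(data)
        self.sources[uri].update(derived_from)
        return uri

    def ancestors(self, uri):
        """All items the given item depends on, directly or indirectly."""
        seen, stack = set(), [uri]
        while stack:
            for parent in self.sources.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


if __name__ == "__main__":
    g = DerivationGraph()
    photo1 = g.add(b"photo-1")
    photo2 = g.add(b"photo-2")
    ptm = g.add(b"ptm", derived_from=[photo1, photo2])
    # A derivative that uses both the ptm and one source photo: not a tree.
    normals = g.add(b"normal-map", derived_from=[ptm, photo1])
    print(sorted(g.ancestors(normals)))
    # Re-registering identical data yields the same URI (duplicate detection).
    print(g.add(b"photo-1") == photo1)
```

a content hash is only one way to obtain stable identifiers; it cannot identify a priori external items such as museum objects, which need registered identifiers instead.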
processing metadata and parameters may exhibit an incredible diversity. the only chance to create generic tools is a rigorous abstraction and generalization towards the functions described above, and to employ an extensible schema. therefore we adopt the cidoc conceptual reference model (iso ) [doe ], with the notion that scientific observations and processes can be seen as a network of real life historical events that connect things (data and physical objects) with actors, time and space. it was originally designed for schema integration of museum documentation about the historical context and observational data on museum objects. the crm allows for the seamless connection of technical data with the description of the reality under investigation, and for describing all dependencies of results on other data and of source data on contextual parameters via generalization of an open number of specialized relationships. generic functions can thus be implemented as navigation along data paths in a semantic network formed by the metadata. only an rdf or equivalent representation will allow for the necessary data manipulations, in particular decomposition and merging. thus implemented, data structures of the master metadata can be specialized without compromising the generic functionality. figure shows how a digitization process can be characterized as a specialization of the generic crm notions of measurement, creation, and modification.

. derived metadata

from the master metadata, other, more restricted metadata can be generated on demand at any time by suitable portable software for the respective environments, such as dublin core or mets representations, in particular to satisfy the access capabilities of the various repositories in which the data may be stored. the master metadata themselves must not be separated from the data they describe.
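a restricted view can be derived from master metadata by navigating paths in the semantic network, as the following sketch suggests. it is a toy illustration only: the triples, the `urn:example:` identifiers and the predicate names (`produced_by`, `carried_out_by`, `at_time`, `used_input`) are hypothetical placeholders and are not cidoc crm properties; the mapping to the dublin core elements creator, date, source and identifier is likewise an assumption made for the example.

```python
# Master metadata as (subject, predicate, object) triples.
# All identifiers and predicate names are hypothetical placeholders.
MASTER = [
    ("urn:example:ptm-1", "produced_by", "urn:example:fitting-1"),
    ("urn:example:fitting-1", "carried_out_by", "example operator"),
    ("urn:example:fitting-1", "at_time", "2000-01-01"),
    ("urn:example:fitting-1", "used_input", "urn:example:photo-1"),
    ("urn:example:fitting-1", "used_input", "urn:example:photo-2"),
]

# Each derived element is filled by following a path of predicates
# starting at the resource being described (assumed mapping).
DC_PATHS = {
    "creator": ["produced_by", "carried_out_by"],
    "date": ["produced_by", "at_time"],
    "source": ["produced_by", "used_input"],
}


def follow(triples, start, path):
    """Follow a chain of predicates from start; return all nodes reached."""
    nodes = {start}
    for predicate in path:
        nodes = {o for s, p, o in triples if s in nodes and p == predicate}
    return sorted(nodes)


def dublin_core_view(triples, resource):
    """Build a restricted, dublin-core-like record for one resource."""
    record = {"identifier": [resource]}
    for element, path in DC_PATHS.items():
        record[element] = follow(triples, resource, path)
    return record


if __name__ == "__main__":
    for element, values in dublin_core_view(MASTER, "urn:example:ptm-1").items():
        print(element, values)
```

because the derived record is computed from the master metadata, it can be regenerated for another target schema by changing only the path table.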
it is not possible to preserve in one self-contained unit all the data a result set depends on. some information, such as a jpg compression algorithm, must be regarded as common knowledge. in a controlled environment it may be uneconomic to duplicate data in cases where too many derivatives are produced from the same sources. components kept internally in one environment may be linked to from another, or vice versa. the proposed metadata management will allow the management functions that trace the availability of dependent components to be implemented as generic repository technology. if suitable uris are generated from the beginning and preserved through all processing steps, it will always be possible to recognize duplicates of existing data sets and their components. the impact of deleting or moving data that results depend on, or of the obsolescence of the technology necessary to interpret the data, may thus be effectively controlled and suitable measures taken. the situation is more complex for identifiers of a priori external data, such as museum objects, software components etc. here both political conventions and richer methods for registering the multiple identifiers in use, along with other data that assist identification, must be employed.

the approach described here, which manages metadata for the complete information lifecycle of complex scientific datasets in open, distributed environments, is innovative. to our knowledge, no other work has attempted to abstract and generalize metadata structures and management functions as described here. we regard it as a basis for new and powerful repository technology in e-science.

references

[doe ] doerr, m. . the cidoc crm - an ontological approach to semantic interoperability of metadata. ai magazine, ( ).

*********************************************
. tools to automate a ptm generation

tutorial presenters: alberto proenca, joão barbosa
university of minho, portugal
email: {aproenca, jbarbosa}@di.uminho.pt

digital photography has become a convenient and affordable method to document artifacts. polynomial texture maps (ptm) use digital photography to provide a textured representation of a 3d artifact. the generation of a ptm also relies on the light source positions, which are fed to a polynomial fitter. for small objects a homemade rigid dome with fixed lighting positions can be built, which helps to get the light source coordinates required by the ptm fitter. when documenting medium to large size objects, domes may become hard to use due to location constraints or dome size requirements.

to overcome the constraints of a physical dome, techniques were developed to help photographers place the light sources [dccs ]. these techniques require careful measurements and hand annotation of the light source position during a shooting session. methods were presented to estimate the light source directions, either from highlights on a glossy sphere [mmsl ] [bsp ] [tse_ ], or from shadows cast by small sticks [cdmr ]. the former methods are the basis for highlight-based reflectance transformation imaging (hrti). however, the generation of an hrti representation requires considerable human intervention, making it time consuming and error prone.

a software tool, the lptracker, was developed to remove the burden of estimating the light source positions, by automatically tracking these positions from the highlights on a glossy sphere. images are captured with a glossy sphere next to the object, and the lptracker applies image processing techniques coupled with special geometry to estimate the light source positions.

from the set of captured images, the lptracker guides the user through a set of pipelined operations: to find the glossy ball in the images, to compute its radius and center location, to track the highlights in each image, to geometrically estimate the light positions, to record empirical provenance data in files, to feed the required data to a ptm fitter and to open a ptm viewer to present the final result, the ptm representation of an artifact.

an hrti representation can be automatically generated by the ptmbuilder, a software bundle with three modules: the lptracker is the user interface for the pipelined operations and acts as a front end to the other two revised modules, the ptmfitter and the ptmviewer, both from hp labs [hpl]. the ptmbuilder package, currently available at the hp labs web site (free under specific conditions), runs under windows, linux and macos x.

the guided tour to the ptmbuilder package that follows presents the main user interfaces and dialogues of the software tool, leaving relevant technical details for later.

. guided tour to build a ptm

to build an hrti file representation, images of the artifact must be captured with a glossy ball nearby, and the light sources should be evenly scattered on a virtual dome over the artifact. figure illustrates a capture session with a superimposed partial -times subdivided icosahedron virtual dome over a petroglyph, while figure shows some resulting images from that session.

the ptmbuilder package contains three interconnected modules: the lptracker, the ptmfitter and the ptmviewer. once the artifact images are captured and the application is set to run, the user interface of the lptracker guides the user through a processing pipeline:

• selection of the image set and ball region of interest (figure (a));
• detection, visualization and tuning of the center and sphere radius (figure (b) and (c));
• detection, visualization and tuning of the center of the highlight (figure (d));
• generation and visualization of text files for the light position (lp and hlt file) (figure (e)).
some features can be configured by the user: selection of red/black ball, detection algorithms, empirical provenance logging and process pipeline control (figure (f)). the first three pipeline stages have a common user interface layout: the right side of the screen presents selectable image thumbnails while the left side displays the selected image for visualization and tuning.

when the first image is loaded in memory, the software checks its size and, if it is above a pre-defined threshold, asks the user to select a region containing the ball (figure (a)). when the detection of the ball is complete, the user can manually adjust the computed center and radius, using any of the images in the set, or some of the images produced during the detection stage (sobel filter gradient, median filter image, blended red channel image). once the user agrees with these computations, they can signal the software to proceed (figure (b) and (c)).

when the highlights are detected the user can browse through the image set and, if necessary, perform further adjustments on each of the detected highlight centers (figure (d)). during the highlight detection phase, two files are generated with the light positions for each image in the set: a light position file (lp) and a highlight file (hlt). the first contains the list of processed images, each followed by the normalized vector with the light source direction, while the latter contains the coordinates of the highlight center. these files allow the output of the lptracker to be used by other software modules for the polynomial fitting. they may be recorded if the user configured the ptmbuilder to do so, and may also be edited (figure (e)).

by default, the ptmbuilder is configured to process and generate the ptm file using the ptmfitter, and later to visualize the resulting ptm with the ptmviewer (both from hp labs).
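the content of a light position file, as described above, can be sketched as follows. the exact layout written by the lptracker is not given in the text; this sketch assumes a first line with the number of images, followed by one line per image holding the file name and the normalized direction. the function names and file names are hypothetical.

```python
import math


def normalize(v):
    """scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0:
        raise ValueError("zero-length light direction")
    return tuple(c / n for c in v)


def format_lp(entries):
    """build lp text: image count, then 'name x y z' per image (assumed layout)."""
    lines = [str(len(entries))]
    for name, direction in entries:
        x, y, z = normalize(direction)
        lines.append(f"{name} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


lp_text = format_lp([
    ("img_001.jpg", (0.0, 0.0, 2.0)),  # light straight above
    ("img_002.jpg", (3.0, 0.0, 4.0)),  # light tilted towards +x
])
```

because the result is plain text, it can be inspected or corrected by hand before it is passed to a fitter, which matches the editing step mentioned above.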
. technical background

the ptm generation process requires a set of captured images of the artifact and the light source position for each one. in hrti the light source position is recorded in each captured image as a highlight on the surface of a glossy ball next to the artifact, as shown in figure . applying geometry and some assumptions (e.g. viewpoint and light source at infinity), the light source position can be estimated from the highlight [mmsl ] and computer vision techniques can provide automation [bsp ]. the technical background required to grasp the techniques used in the processing tools to generate hrti images is described below: how the ball and highlights are detected, and how the light source positions are estimated.

figure : piscos man, unesco rock art at vale do coa, portugal: the shooting session with a virtual dome.

. . ball detection

the software uses two approaches to detect the ball in the image: search for a region with a given color (when red balls are used) or try to fit a circle into the image by using a hough transform (ht, computationally intensive). when black balls are used, the software only resorts to the hough transform, but when red balls are present in the image, the user can configure the software to use one out of three approaches: as a "black ball" (ht), as a "red region" or as a "red ball" (ht).

there is a strong reason to work with a "red ball": snooker balls are available almost everywhere and are sold in packages with red balls. besides, an image of a red snooker ball produces high intensity values in the r channel (behaving as a white ball) while having almost null intensity in the g and b channels (behaving as a black ball). note that all the other colored balls in the snooker package (including the green and blue balls) have a strong mix of the rgb channels.
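the channel argument above can be made concrete with a minimal sketch of a red filter followed by a center-of-mass estimate. it is not the lptracker implementation: the thresholds are arbitrary assumptions, connected-component labeling is omitted, and the radius is derived from the region area by treating the region as a disc. all names are hypothetical.

```python
import math


def is_ball_red(r, g, b, high=150, low=80):
    """true when r is strong and g, b are both weak (assumed thresholds)."""
    return r >= high and g <= low and b <= low


def red_region(image):
    """image: rows of (r, g, b) tuples. returns (cx, cy, radius) or None."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if is_ball_red(r, g, b):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    area = len(xs)
    # center of mass of the red pixels; radius from disc area = pi * r^2
    return sum(xs) / area, sum(ys) / area, math.sqrt(area / math.pi)


GRAY = (90, 90, 90)
RED = (200, 30, 20)
YELLOW = (255, 255, 0)  # strong r but also strong g, so it is rejected

image = [[GRAY] * 5 for _ in range(5)]
for yy in range(1, 4):
    for xx in range(1, 4):
        image[yy][xx] = RED
image[0][4] = YELLOW

cx, cy, radius = red_region(image)
```

the yellow pixel has a high r value but fails the g test, which is the same reason the other colored snooker balls are poor candidates.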
figure : piscos man, unesco rock art at vale do coa, portugal: set of captured images. since the ptm representation assumes that the images were taken with a fixed single viewpoint, all objects must be represented by the same set of pixels [bsp ]. this assumption simplifies the detection process and the elimination of some problems related with light conditions (e.g. a shadow cast by the ball can be confused with the ball itself). the software tool will follow one of two paths, depending on the selected approach to detect the ball. if the "black ball" approach is chosen, the software softens shadows cast by the ball using a median filter of all images, then applies a sobel filter for edge detection and a modified hough transform algorithm based on the gradient values to detect the ball contour and geometric center. the two "red" approaches follow a similar path, but using only the information in the r channel. red balls produce high intensity values in the r channel and allow the elimination of some problems, such as shadow casting, with a simple red filter. both approaches, "red region" and "red ball", have a common path: they apply red filter to all images and blend resulting images into a single one, using the maximum values. the "red region" approach uses the resulting image to compute the ball center and radius using a labeling algorithm and a region center of mass computation, while the "red ball" approach applies a sobel filter for edge detection and ht to find the contour and its geometric center and radius. . . highlight detection figure : phases in the highlight detection, with a fake highlight. the software tool searches for the highlights center within the region defined by the center and radius, as computed earlier. it uses a region labeling technique and computes the center of mass per region. to remove interferences from the artifact, the highlight detection stage cleans the region outside the ball contour. 
this process uses a labeling algorithm to identify highlight candidates for light direction extraction (figure ). these highlight candidates can be due to inter-reflection from some nearby specular surface and, if so, they must be discarded. this is done by analyzing the size of the highlight and its distance to the center of the ball: due to reflection laws on a sphere, the closest highlight to the center will be chosen. the final result of the highlight detection stage is an image combining all the highlights, which lets the user assess the quality of the lighting.

. . light source position

the light source direction can be estimated from the highlight position, according to figure . if the viewpoint and the light source position are assumed to be at infinity, then the view point direction v is parallel to the z axis, and the light direction l is the same for any ball placement in the image and for any image pixel. with these assumptions, the software tool only needs to compute the light source direction values (x; y; z) and feed these data into the ptm fitter, since the fitter normalizes this vector before computing the polynomial coefficients for the curve fitting.

figure : user interfaces of the ptmbuilder at museum d. diogo de sousa: (a) attempt to locate the ball, (b) black ball detection and tuning, (c) red ball detection and tuning, (d) highlight detection and tuning, (e) hlt/lp file edit over ptmviewer, and (f) user options interface.

to compute the light source direction under these assumptions, the software computes the normal at the highlight (n) from the coordinates of the highlight center and the ball center, and assumes the light source direction at f away from the z axis. more details are given in [mmsl ], with adjustments in [bsp ].

figure : model for estimating the light source direction from highlight.
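the geometry described above can be sketched in a few lines. this is a minimal illustration under the stated assumptions (viewer and light at infinity, view direction v = (0, 0, 1)), using the standard mirror-reflection relation l = 2(n·v)n − v; it is not the lptracker code. the function name is hypothetical, and no flip of the image y axis is applied, which a real tool may need.

```python
import math


def light_direction(highlight, center, radius):
    """estimate the unit light direction from a highlight on a glossy ball.

    highlight, center: (x, y) pixel coordinates; radius: ball radius in pixels.
    """
    hx, hy = highlight
    cx, cy = center
    # surface normal at the highlight, from its offset to the ball center
    nx = (hx - cx) / radius
    ny = (hy - cy) / radius
    d2 = nx * nx + ny * ny
    if d2 > 1.0:
        raise ValueError("highlight lies outside the ball contour")
    nz = math.sqrt(1.0 - d2)
    # reflect v = (0, 0, 1) about n: l = 2 (n . v) n - v, with n . v = nz
    return (2.0 * nz * nx, 2.0 * nz * ny, 2.0 * nz * nz - 1.0)


front = light_direction((100, 100), (100, 100), 50)  # highlight at the center
side = light_direction((130, 100), (100, 100), 50)   # highlight offset in +x
```

a highlight at the ball center yields a light along the view axis. in the second call the normal is tilted about 36.9 degrees from the z axis and the light about 73.7 degrees, twice the angle, as the law of reflection requires.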
. . tracking empirical provenance

one of the major features of hrti is the empirical data recorded in the raw format image set. each image records not only a representation of the artifact lit from a specific light source direction but also the direction of the light. if two balls are used instead of one, the precise light source position in the artifact space can be triangulated. the lptracker adds to the richness of the original image set by generating new files: the lp, the hlt and the log. the contents of the lp and hlt files were presented earlier. the log file records all actions performed by the software, and all adjustments made by the user. these files, together with the original image set, complete all the necessary information to retrace the generation process of a specific ptm file, which supports the scientific validation of the whole process and establishes its empirical provenance.

. . next steps

the presented tools address some of the constraints posed by the hrti method through the automation of the generation process of a ptm data representation. the lptracker, with a user friendly interface, frees the end user from the most tedious stages of the method, guiding them through the necessary steps to generate a valid ptm representation of large scale objects: ball detection, highlight detection, lp/hlt file generation, empirical provenance logging and ptm fitting. the empirical provenance data, intrinsic to the hrti approach, allows the scientific validation of the ptm representation of the artifact, and its affordable cost accounts for the increasing number of supporters among scholars and institutions.

when the size of the object is large compared to the dome radius, however, the assumption used for light direction estimation may introduce critical inaccuracies, both in the estimated direction of each light source and in its distance. inaccuracies in the light source estimate may be due to the ball placement (it may be far away from the image center), and pixels at opposite edges of the image may have considerably different light source directions. if the lighting is not placed at a fixed distance from all pixels in all the images, fitting inaccuracies may also occur, requiring pixel intensity adjustments in each image. to overcome these limitations, critical for large objects, two glossy balls can be used as suggested in [bsp ]. through geometric triangulation the precise light position can be computed and light intensities adjusted for each pixel in the images. the next ptmbuilder version will be available soon with these improvements.

. acknowledgements

this work was supported by a scholarship from the computer science and technology center (cctc), university of minho. the authors acknowledge and thank the access to artifacts at the parque arqueologico do vale do coa (pavc), and at museu d. diogo de sousa, braga, both in portugal.

references

[bsp ] barbosa j., sobral j. l., proença a. j.: imaging techniques to simplify the ptm generation of a bas-relief. in vast’ : proc. th int. symp. virtual reality, archaeology and cultural heritage (uk, ), pp. – .
[cdmr ] cula o. g., dana k. j., murphy f. p., rao b. k.: bidirectional imaging and modeling of skin texture. in texture : proc. rd int. workshop on texture analysis and synthesis (france, ), pp. – .
[dccs ] dellepiane m., corsini m., callieri m., scopigno r.: high quality ptm acquisition: reflection transformation imaging for large objects. in vast’ : proc. th int. symp. virtual reality, archaeology and cultural heritage (cyprus, ), pp. – .
[hpl] hplabs: research hplr. hp labs site highlight based ptms.
http://www.hpl.hp.com/research/ptm/highlightbasedptms/index.html
[mmsl ] mudge m., malzbender t., schroer c., lum m.: new reflection transformation imaging methods for rock art and multiple viewpoint display. in vast’ : proc. th int. symp. virtual reality, archaeology and cultural heritage (cyprus, ), pp. – .
[tse_ ] tchou c., stumpfel j., einarsson p., fajardo m., debevec p.: unlighting the parthenon. in siggraph’ : acm siggraph sketches (usa, ), acm press, p. .

opportunities and challenges of emerging technologies in higher education: future directions

p. thomas. int. j. innov. digit. econ.

recent unprecedented advances in digital technologies and their concomitant affordances in education seem to be a great opportunity to adequately address burgeoning demand for high quality higher education (he) and the changing educational preferences. it is increasingly being recognised that using new technology effectively in he is essential to prepare students for its increasing demand.
e-learning is an integral component of the university of botswana’s teaching and learning culture…

marketing libraries journal vol. , issue , august

from the trenches

send in the crowds: planning and benefiting from large-scale academic library events

michelle demeter, florida state university
rachel besara, missouri state university
gloria colvin, florida state university
bridgett birmingham, florida state university

abstract: academic libraries produce a range of events. while large-scale events can be a lot of fun, the planning process can seem more daunting than the process for programming that targets smaller audiences. planning and executing large-scale events (ones that attract one hundred or more attendees, involve partners, and meet the social and academic needs of students) can be very worthwhile in terms of marketing the library and networking. in this article, the authors detail four different events that can be replicated in an effort to show how easy and beneficial large-scale events can be within the academic library community.

keywords: academic libraries, programming, large events, outreach, networking

introduction

in academic libraries, outreach can have many meanings.
regardless of its multivalent nature, outreach ultimately means fostering connections with others, whether they be students, faculty, or the academic community at large. most of these interactions and partnerships are on a small scale: in small groups such as presentations, instruction courses, workshops, and campus orientation sessions, or one-on-one interactions during tabling events. florida state university (fsu) libraries found that a variety of outreach efforts yields better results in both visitor counts to the physical buildings and the website, as well as better familiarity with library resources. one area where the libraries increasingly expanded outreach efforts was in the development of large-scale events. for an event to count as large-scale, it should reach more than one hundred people, include a partnership with a group or individual on campus, and advance interdisciplinary interactions among its participants.

at fsu, there is a designated outreach unit within the larger research and learning services division. staff throughout the division approach the outreach unit for assistance in planning, staffing, and budgeting all events in strozier library (the main library) and the campus’s stem libraries. in this article, four librarians at fsu libraries detail four large-scale outreach events that effectively accomplished the above goals and that proved to be sustainable and popular among each of their core target groups of undergraduates, graduate students, and faculty. in each section, one event is closely examined to detail its ideation, planning, execution, and lessons learned so that each can be replicated at other academic libraries and modified for their unique user populations. as noted above, fsu libraries’ large-scale events must meet the needs of both social and academic pursuits while also serving one hundred attendees or more.
attracting a large number of participants to a purely academic event is a difficult hurdle. in general, an event that is purely social does not automatically meet the goals of the library or its financial supporters and can be hard for some librarians to justify to their stakeholders. however, showcasing the library as a social center on campus is one way to contribute to the university’s retention goals and make students feel like welcome participants in campus culture. large-scale events require more resources and planning than smaller interactions, which is why fsu libraries often collaborate with other campus departments or groups. research symposia in order to better align library services with faculty needs, liaison librarians in conducted interviews with fsu faculty across academic departments to learn more about how they worked and to better understand what supported or hindered their research and teaching. one issue that arose repeatedly in faculty responses was a desire for opportunities to get to know colleagues across campus and to be aware of research being done in departments other than their own. their interests in knowing about what others were researching, making connections with colleagues, and identifying potential opportunities for collaboration were the impetus for the library’s efforts to position itself as a place where faculty from different disciplines could come together to share their research and discuss ideas. one of the most successful efforts in accomplishing these goals has been hosting interdisciplinary research symposia that involve faculty in the planning and presentations. typically, fsu libraries host two symposia each year—one symposium during the fall and one during the spring—and invite faculty to talk about their research or publications related to a broad, central theme. most library symposia are organized by the scholars commons, the libraries’ graduate and faculty division. 
every effort is made to include speakers from a wide range of disciplines and to attract attendees from across the campus. interdisciplinary themes included genius, composing, coffee, ethnography, the persistence of evil, academic publishing, digital scholarship, social media research and, most recently, water. occasionally, symposia based on a specific topic, such as a newly published book or the anniversary of a classic, are used as springboards for discussion of larger issues. the books academically adrift and silent spring, for example, led to discussions of undergraduate education and of environmental issues, respectively. librarians usually select the theme for a symposium, but in some cases, faculty approach librarians with ideas and ask to organize a symposium on a particular topic.

an individual librarian usually takes the lead in planning the program with faculty and works with a small group of librarians and outreach staff in coordinating the logistics for the symposium, with assistance from colleagues and student workers on the day of the event. the libraries’ communications staff is instrumental in designing materials to publicize the symposia. in addition, the provost and the office of faculty development and advancement partner with the libraries to promote the symposia.

most symposia are full-day events, though some have been half-day programs, with the format varying depending on the topic. the most common format is a series of talks by individual faculty, with each talk followed by time for questions or discussion. some symposia include panel discussions and occasionally administrators or guest speakers participate in the program.
the full-day symposium on academic publishing, for example, included a panel discussion on journal publishing with faculty who edit a variety of journals and a journal publisher; talks on book publishing by representatives of a university press and a book vendor; displays and activities related to open access publishing and the university’s research repository; a conversation with the provost and a faculty member on publishing expectations for promotion and tenure; and a guest speaker who looked at the future of publishing.

attendees come and go throughout the day, some attending a single presentation and others staying for the entire program. anywhere from to more than attendees is typical, and now that the talks are streamed online, attendance rises with to , remote viewers. time for refreshments and socializing is built into the day, with coffee and pastries available in the morning and a light lunch in the middle of the day. a budget of $ to $ per symposium covers the cost of the food.

symposia are held in the bradley reading room, a library space designated for graduate student and faculty study. the room can be reconfigured for lecture-style seating and can accommodate individual presentations or panel discussions. with the help of library technology staff, presenters may use internet connectivity and a ceiling-mounted projector if they wish to incorporate media in their presentations.

the symposia have helped establish the library as a focal point on campus for academic programs and conversations with colleagues and the wider university community. they provide opportunities for faculty to connect across disciplines, as well as for students and librarians to learn more about research taking place on a particular topic. finally, they led to collaboration with other departments in the library and with other campus organizations and units.

marathon reading

in fall , an english faculty member approached the outreach librarian about hosting a spring event aimed at generating interest among her students in reading herman melville’s moby dick. this faculty member proposed a marathon reading, which involves the reading aloud of a lengthy book for hours or longer in an effort to engage the community and expose people to literature they might not typically approach on their own. the library staff thought this might be a fun and unique way to engage with students and classic literature, and thus a partnership was born.

the faculty member reached out to athletics and the student government association to recruit readers, while outreach staff coordinated the production of a flyer and social media advertisements. athletics proved to be a strong partner throughout all of the readings. fsu is well known for its athletic programs, and the new director of the student athletes was especially interested in emphasizing the students’ academic involvement. in later years, student government continued to participate, and the libraries also determined that the readings might be a unique way to include fsu’s distance student population. the libraries’ outreach budget provided funds to purchase snacks for the duration of the reading.

individuals can sign up for - or -minute blocks using the signup.com website and can share their block as they please with one, two, or more readers. readers are recruited prior to the event, but empty slots are available for audience members who are moved to participate on the spot. the reading is hosted on the main floor of the library, outside the special collections exhibit room. the space offers a very visible location near the entrance of the main library and its busiest checkout desk. often special collections makes a small accompanying display highlighting items from its collection that align with the chosen topic.
about thirty chairs are arranged facing a podium equipped with a microphone and speakers (see figure ). the readings also accommodate remote participation. the libraries’ dean had prioritized outreach to and inclusion of fsu distance students in on-campus activities and, as a result, the reading organizers worked with the libraries’ technology department to include fsu’s extended campuses. in the third year, readers joined the event via skype from florence, italy; london, england; valencia, spain; panama city, panama; and panama city beach, florida. when readers participate via skype, they are projected on a large mondopad tv so the audience can see and hear them. including the extended campuses helps make fsu’s students and faculty abroad feel more connected to the main campus, which is an overarching goal of the university as well as the library. each year, when it comes time to choose a new book for the marathon reading, the directors from fsu’s extended campuses and the english department are included in the decision-making. titles read thus far include dickens’ bleak house, tolstoy’s anna karenina, tolkien’s fellowship of the ring, garcía márquez’ one hundred years of solitude and love in the time of cholera and, most recently, a variety of folk tales, legends, and myths from around the world. numerous copies of the chosen book are on hand for readers at all locations. the libraries purchase the same edition to make following along easier throughout the event. food is also provided to entice and reward readers and passers-by who participate. the budget for this event is usually $ to $ , which includes breakfast pastries and other snacks that have a long shelf-life, since they are out for more than hours. in , the event was live-streamed online for the first time, which added virtual participants to the attendees in the library.
audience numbers shift throughout the day, as people are encouraged to come and go as they please. the libraries plan to continue the readings, possibly increasing them to twice a year. with the folk tales and legends reading, it became clear that the audience might like an opportunity to discuss the history and origins of the readings as well as share personal stories about why the selections were meaningful to them. the organizers seek out opportunities like these to keep the readings fresh from year to year and encourage audiences locally and abroad to engage with literature in new and meaningful ways.

graduate social

since , fsu libraries have hosted a social for graduate students and postdoctoral scholars. the social is primarily an opportunity for graduate and postdoctoral students from departments across campus to meet, talk, and make connections in an informal environment, but librarians also promote library services, spaces, and programs designed for the students. the idea for the social came from the libraries’ graduate advisory council, whose members lamented that there were no opportunities to meet other graduate students outside of their own departments. the librarians in the scholars commons seized on this idea and received support from senior administration to move forward. with the help of members of the graduate advisory council, librarians planned the first social for two hours on a late friday afternoon early in the spring semester. they chose this time because the graduate students indicated that they did not teach or attend class at that time of day, but they tended to be on campus grading papers and catching up on other work. the bradley reading room, which is restricted to graduate students and faculty, was a natural site for the event. a budget of $ , covered appetizers, fruit, vegetables, desserts, and beverages.
librarians successfully requested permission from the university administration to serve wine in addition to soft drinks. to keep costs manageable, especially when purchasing wine for – attendees, food is bought on sale or wholesale at sam’s club. student workers plate food items ahead of the event and keep everything replenished throughout the event. the libraries’ graphic designer produced an invitation that was widely distributed to graduate students by liaison librarians and a digital sign that was posted on monitors throughout strozier library and dirac science library. announcements appeared on the library and graduate school websites. on the day of the event, librarians took turns staffing a welcome table outside the reading room where attendees could sign in, get name tags, and pick up brochures and flyers about the libraries. the libraries hired a bartender, but did not card students, because the social was taking place in a self-contained room and students signed in before entering (see figure ). this first event attracted to graduate students, who quickly filled the room. there were many animated conversations. liaison librarians also mingled with the students, answered questions, and shared information about the libraries. later socials attracted a similar number of attendees; attendance peaked at about , excluding librarians. based on the success of the first graduate social, librarians decided to host another one in the fall semester, an optimal time to reach out to new graduate and postdoctoral students. librarians planning the event kept a similar format but made several adjustments. because the libraries did not have access to a conventional oven, preparing warm appetizers proved difficult, so they were no longer served. during the first social, music played in the background but was drowned out by conversation, so planners eliminated it.
to encourage students to use the space outside the reading room, outreach staff placed the desserts there. students came out to the desserts but took them back to the reading room, preferring to be where most of the activity was taking place. the graduate school partnered with the libraries in promoting the socials, and administrators from that office dropped by to talk with students. each spring, the graduate school conducted a survey of first-year graduate students to learn more about their experiences and their views of their programs and various support services. the libraries consistently received high ratings, and quite a few comments referenced the libraries’ socials. as a result of the positive comments about the socials in these surveys, in summer , the dean of the graduate school proposed that the graduate school partner with the libraries to host the socials and match the libraries’ funding so that two socials could be held each year, one in the fall in the main library and one in the spring in dirac science library. the success of the graduate socials was noticed by other partners on campus, and in fall fsu’s center for the advancement of teaching approached the libraries and asked to cohost monthly faculty socials. graduate students continue to be enthusiastic about the socials, and attendance is consistently around . some postdoctoral students also come, but they are much fewer in number. librarians have tried to communicate information about the libraries at the socials. however, while the students do get to see the library spaces and talk with librarians, the students are mostly interested in talking with each other and enjoying the food and drink. the most positive outcome for the libraries is in the goodwill that the event creates among graduate and postdoctoral students.

hackathon

in , hackfsu, a registered student organization at fsu, planned its first hackathon, which is a technology-based inventor marathon.
it is an intense, fun, immersion experience wherein experienced computer programmers and those brand new to coding gather, form teams, and create technology-based solutions to problems. the teams work together on self-selected projects and develop them as far as possible for hours, with the support of mentors and workshops. the projects are then judged in a showcase exhibit involving all of the hacker teams at the end of the event. the hackfsu hackathons have all been a part of major league hacking, the official national student hackathon league. the first hackathon was a bit chaotic, but each subsequent event has been much smoother. the formation of the libraries’ initial partnership with hackfsu was an intense experience with little time to prepare. hundreds of participants from more than a dozen universities had registered for the hackathon and, with less than a week to go, hackfsu had not yet secured a venue large enough and flexible enough to house the event. the provost’s office heard of the situation and asked the libraries’ administration whether the dirac science library would be willing to host the -hour event. this library was the ideal space, with plenty of gathering spaces, work spaces, power outlets, and wireless internet access, and it was already accustomed to providing services and facilities to hundreds of students. the library staff and the hackfsu organizers spent more than hours working closely together in the week of the hackathon to make the event successful. the combination of amenities and the library staff’s enjoyment of the concentrated collaborative work continues to make dirac science library hackfsu’s venue choice year after year. while the partnership began in name when the libraries’ administration approved the first event, it began in earnest when the librarians and staff started working out the logistics of the unfamiliar event.
this involved many constructive discussions between the hackfsu organizers, student leaders (who were unfamiliar with many campus operating policies), and the libraries’ managers. libraries’ staff created a physical layout of the building to identify the best locations to serve meals and set up stress-busting games, sponsor booths, storage areas, and nap stations. because the participants included fsu students, corporate sponsors with expensive equipment, students from other universities, and high school students from the region, the libraries and campus police worked together to identify any potential safety issues. finally, the local news media and social media covered the hackathon, and the dirac science library was mentioned in all the stories. coverage emphasized that it was one of the first hackathons of its size to be held in a library space versus other campus locations, something even the vendors found unique and intriguing (see figure ). the first event led to an ongoing relationship between the library, hackfsu, and other technology-focused registered student organizations. four hackathons have been hosted by dirac science library to date. since hackathons by their nature are student-organized and students come and go throughout, the event has posed some unique challenges. the libraries charge anywhere from $ , to $ , to cover costs associated with additional staffing, security, and facilities required for the event; dirac is usually closed during half of the event hours and, during its normal open hours, its usual patrons are displaced. throughout the planning process, the libraries and hackfsu meet to discuss dates and hours of the event, permitted activities, logistics, deliveries, staffing, security, and equipment. the students are responsible for arranging all deliveries of food, technology, and furniture.
librarians also help coach each hackathon leadership team through the local logistics of holding the event. all fundraising is conducted by the students. they work with major league hacking, potential sponsors, and other donors to secure the funding for building use, meals, security, equipment rental, t-shirts, and other affiliated costs. during the event, all meals and snacks are provided for participants. in addition, extra furniture is ordered so students have nap areas and the organizers can set up a sort of command center on the floor. hackfsu is responsible for these orders, deliveries, and payments. the librarians often assist with setting up food distribution and identifying locations that allow traffic to flow easily. when furniture is delivered, it is stacked in the bottom of the library stairwell and flagged with notes advising regular patrons not to distribute it across the floor. librarians, staff, and security are on hand at all times during the event to ensure the building is being used properly, to let students in and out of the building as needed, and to make sure things stay civil. students stay awake for more than hours while trying to manage the logistics of running meals and various activities (such as a photo booth, cup-stacking contests, dj dance parties to reenergize the hackers, and interviews for jobs and internships with sponsors) as well as providing crash-course trainings on coding throughout the event. tensions therefore can run high as sleep deprivation increases. often the librarians serve as mentors and sounding boards, stepping in to help when students are running a bit low on energy or need advice on how previous years’ coordinators handled certain issues. the hackathon’s benefit to the campus community is worth the extra work. since the libraries are a partner in the event, hackfsu is only charged the direct staffing costs of hours the library is open beyond the regular schedule. 
in just one event, some students from almost universities, as well as local community members, can learn about coding, gain concrete skills, and have the opportunity to be noticed by corporate sponsors and recruiters in attendance, such as apple, mailchimp, and state farm. while the students complete the bulk of the work, library staff coordination is essential. the event continues to bring campus and local-media attention to the exploratory teaching-and-learning role of the library, which is invaluable.

conclusion

large-scale events offer unique opportunities to reach out and connect with a wide range of students and faculty across disciplines. they can allow for lengthy discussions and interactions that do not occur at smaller, shorter events. however, the benefits gained from conducting such events should not detract from the value of smaller events. while fsu libraries have hosted many large-scale events, including resource fairs and undergraduate research symposia, the majority of its programs continue to be smaller events because of their ease and flexibility of planning and execution. the agility afforded by small-scale tabling, book discussions, finals events, and other activities is not to be discounted. each type of outreach event is meant not only to encourage visits to the libraries but also to educate potential and existing users about the various resources the library has to offer.

copyright: © demeter, besara, colvin, and birmingham. this is an open access article distributed under the terms of the creative commons attribution-noncommercial-sharealike license (cc by-nc-sa), which permits unrestricted non-commercial use, sharing, adapting, distribution, and reproduction in any medium, provided the original author and source are credited. http://creativecommons.org/licenses/by-nc-sa/ . 
/ introduction bernard of clairvaux and nicholas of montiéramey: tracing the secretarial trail with computational stylistics bernard of clairvaux and nicholas of montiéramey: tracing the secretarial trail with computational stylistics by jeroen de gussem although the past few decades of medieval studies have witnessed some renewed interest in the collaborative process by which medieval latin prose was composed, such interest has nevertheless remained all too scant, and only few solutions have been offered to cope with the difficulties that rise in cases of dubious authorship. whereas it has been rightly acknowledged that scribes, notaries, and secretaries should be regarded not merely as instrumental in the literary process, but as active participants in the composition process who have left a considerable impact on the image and style of the dictator and on the materialization and dissemination of the text, this acknowledgement has nevertheless been accompanied by difficulties speculum /s (october ). © by the medieval academy of america. all rights reserved. this work is licensed under a creative commons attribution-noncommercial . international license (cc by-nc . ), which permits non-commercial reuse of the work with attribution. for commercial use, contact journalpermissions@press.uchicago.edu. doi: . / , - / / s - $ . . this article is a result of the research project “collaborative authorship in twelfth-century latin lit- erature: a stylometric approach to gender, synergy and authority,” funded by the ghent university special research fund (bof). 
its execution rests on a close collaboration between the henri pirenne institute for medieval studies at ghent university, the clips computational linguistics group at the university of antwerp, and the centre traditio litterarum occidentalium division for computer-assisted research into latin language and literature housed in the corpus christianorum library and knowledge centre of brepols publishers in turnhout (belgium). i am much indebted to the wisdom and continuous and patient guidance of jeroen deploige, wim verbaal, and mike kestemont, who—each in their re- spective fields of expertise (medieval cultural history, latin medieval literature, and computational sty- listics)—have tremendously inspired and challenged me in writing this piece. their voices inevitably re- sound from this text, so much so that i cannot solely take credit for the whole. i also warmly thank my colleagues from the latin and history department in ghent who have gone through the trouble of read- ing my preliminary drafts. in particular, dinah wouters, micol long, and theo lap have my sincerest gratitude for personally sending me their valuable feedback. in conclusion, my gratitude goes out to paul de jongh, bart janssens, jeroen lauwers, and luc jocqué of brepols for their commitment to this project. 
a more general introduction to twelfth-century notaries, especially in respect to epistolography, can be found in giles constable, “dictators and diplomats in the eleventh and twelfth centuries: medieval epistolography and the birth of modern bureaucracy,” dumbarton oaks papers ( ): – , at , where he stresses that “[notaries] took on a new importance in the eleventh and twelfth centuries, when they formed a distinct group of recognizable personalities whose activities extended beyond the scriptorium.” lynn staley johnson, albeit focusing mainly on late medieval female authors, has drawn attention to how “scribes not only left their marks upon the manuscripts they copied, they also functioned as interpreters, editing and consequently altering the meaning of texts. writers, however, did not simply employ scribes as copyists; they elaborated upon the figurative language associated with the book as a symbol and incorporated scribes into their texts as tropes,” in “the trope of the scribe and the question of literary authority in the works of julian of norwich and margery kempe,” speculum ( ): . see bernard cerquiglini, in praise of the variant: a critical history of philology, trans. betsy wing (baltimore, ); stephen g. nichols, “introduction: philology in a manuscript culture,” speculum

in defining the exact extent and sphere of influence of such secretarial mediation. this challenge is especially marked in the oeuvre of bernard of clairvaux ( – ).
by , the abbot’s acclaim as the icon and figurehead of the cistercian movement had brought along such a considerable administrative workload that the assistance of a group of secretaries (which, it could be argued, amounted to a kind of chancery) was indispensable. these secretaries acted as bernard’s stand-ins and spared him the time and effort it would have cost if he had had to take up the quill himself at every single occasion. the reportatio, as it was called, entailed that the contents of bernard’s letters or sermons be engraved on wax tablets in a tachygraphic fashion. the cues, keywords, and biblical references that bernard had spoken aloud provided a framework that captured the gist of his diction. afterwards, the scribe reconstructed what he had heard as a text on parchment, which could pass for bernard of clairvaux’s in its literary allure. among these amanuenses, nicholas of montiéramey († / ) was a focal figure and a highly skilled imitator of his master’s writing style. the influence of nicholas’s mediation on several particular texts within bernard’s corpus, and more generally on his entire oeuvre, has been subject to much debate. this article revisits the authorship of a selection of texts from bernard’s corpus. a detailed listing of the texts under scrutiny can be consulted in the appendix of tables (tables – ). generally, the corpus comprises nicholas of montiéramey’s letters and sermons and bernard of clairvaux’s letter corpus

some could argue that the word “chancery” is inappropriately used of bernard’s scriptorium, as it was not primarily a formal or institutional body of administration charged with the composition and dispatch of official documents. the workings of clairvaux’s scriptorium are extensively investigated in peter rassow, “die kanzlei st.
bernhards von clairvaux,” studien und mitteilungen zur geschichte des benediktiner-ordens ( ); and jean leclercq, “saint bernard et ses secrétaires,” in leclercq, recueil d’études sur saint ber- nard et ses écrits (hereafter recueil d’études), vols. (rome, – ), : – . constable also com- mented on the difficulty of the redaction process: “aside from a few outlines dictated by bernard or based on sermons he gave, most of the surviving texts are later compositions drawn up by either himself or his secretaries, and they bear little resemblance to what he actually preached, if they were ever delivered,” in giles constable, “the language of preaching in the twelfth century,” viator ( ): – , at . stenography, or shorthand, systems had been forgotten by the twelfth century, making place for ta- chygraphy, a rapid form of writing: see malcolm beckwith parkes, “tachygraphy in the middle ages: writing techniques employed for reportationes of lectures and sermons,” in scribes, scripts and read- ers: studies in the communication, presentation and dissemination of medieval texts (london, ), – . nicholas of montiéramey’s letters can be found under pl : a– d. the sermons have been identified by jean leclercq in “les collections de sermons de nicolas de clairvaux,” recueil d’études, : – . they are collected among those of peter damian in pl , more specifically “sermo in na- tivitate s. ioannis baptistae” ( ), “sermo in natali apostolorum petri et pauli” ( ), “sermo in natali s. benedicti de evangelio” ( ), “sermo in festivitate s. mariae magdalenae” ( ), “sermo in festi- vitate s. petri ad vincula” ( ), “sermo in assumptione b. mariae” ( ), “sermo in nativitate b. mariae” ( ), “sermo in exaltatione s. crucis” ( ), “sermo in festivitate angelorum” ( ), “sermo in dedicatione ecclesiae” ( ), “sermo in festivate s. victoris” ( ), “sermo in festivitate omnium sanctorum” ( ), “sermo in festivitate s. martini” ( ), “sermo in festivitate s. andreae” ( ), “sermo in festivitate b. 
nicholai” ( ), “sermo in festivitate b. mariae” ( ), “sermo in vigilia nativitatis” ( ), “sermo in nativitate domini” ( ), and “sermo in festivitate b. stephani” ( ). ( ): – ; eric h. reiter, “the reader as author of the user-produced manuscript: reading and writing popular latin theology in the late middle ages,” viator ( ): – .

(corpus epistolarum), sermones de diversis (hereafter de diversis), and sermones super cantica canticorum. as a means of determining the extent of nicholas’s stylistic presence in the aforementioned works, this article advocates computational stylistics (or stylometry), which is a method that detects stable and recurring patterns of writing style in texts that have been reduced to a range of marked style features. these features are consequently quantified to numerical data in order to gain objective and measurable ground for distinguishing among works of varying authorship. computational stylistics claims to offer a scientific and objective “distant reading” of literature, as opposed to human expert-based methods, which are often liable to intersubjectivity. a stylometric approach to the case of bernard of clairvaux and nicholas of montiéramey will prove rewarding on two levels. on the first level, we will determine the authorship of the aforementioned texts, some of which have been subject to long-standing dispute.
this will address an elementary concern within bernardian studies, which gillian rosemary evans has aptly put forward as such: “a turn of phrase can bear only so heavy a load of interpretation as it exactly reflects the au- thor’s thought.” on a second level, however, this study should also raise aware- ness of how the (mis)attributions of disputed medieval texts, as they have occurred in earlier studies, are often guided by personal intuitions or theoretical convictions of medieval authorship. this implies that, as we begin to challenge the established these works are edited in the sancti bernardi opera (hereafter sbo), ed. jean leclercq et al., vols. (rome, – ): the corpus epistolarum (vols. – ), sermones de diversis (vol. ), and sermones su- per cantica canticorum (vols. – ). the term “distant reading” was coined by franco moretti, “conjectures on world literature,” new left review ( ): – , and was further developed in his monograph graphs, maps, trees: abstract models for literary history (brooklyn, ). it was frederick mosteller and david l. wallace’s influential study on the disputed authorship of the eighteenth-century federalist papers in the early s that would launch statistical approaches as tools by which to objectively determine authorship, currently known as nontraditional authorship attribution: see frederick mosteller and david l. wallace, applied bayesian and classical inference: the case of the federalist papers (new york, ). 
for four excellent state-of-the-art surveys on the history of nontraditional authorship attribution and the current debate within the field, see patrick juola, “author- ship attribution,” foundations and trends in information retrieval ( ): – ; efstathios stamatatos, “a survey of modern authorship attribution methods,” journal of the association for in- formation science and technology ( ): – ; moshe koppel, jonathan schler, and shlomo argamon, “computational methods in authorship attribution,” journal of the association for infor- mation science and technology ( ): – ; and walter daelemans, “explanation in computa- tional stylometry,” in computational linguistics and intelligent text processing , ed. alexander gelbukh (berlin, ), – . gillian rosemary evans, bernard of clairvaux, great medieval thinkers (new york, ), . the bibliography of scholarship on medieval authorship is extensive and cannot be listed here in full. the following titles should nevertheless point any reader who is interested in the theory of medieval au- thorship in the right direction. a major work of reference is by alastair j. minnis, medieval theory of authorship: scholastic literary attitudes in the later middle ages (philadelphia, ), whose ap- proach to medieval authorship is largely based on study of the prologues of commentaries and exegetical works especially from the later middle ages (the scholastic period). in this later period he describes how the schools defined for themselves a framework of literary theory from the newly translated aristotelian logic, through which they could approach the biblical texts and patristic auctores more literally, and therefore more literarily, as “a new type of exegesis emerged, in which the focus had shifted from the di- vine auctor to the human auctor of scripture,” . some other indispensable publications on medieval au- speculum /s (october ) this content downloaded from . . . 
on october , : : am all use subject to university of chicago press terms and conditions (http://www.journals.uchicago.edu/t-and-c).

authorship of some of these texts, we also may offer metahistorical reflections on how earlier, more intuitive scholarly approaches have both enriched and yet predetermined our current understanding of medieval authorship. the case of bernard and nicholas is particularly suitable to demonstrate how computational stylistics provides new answers to old questions, and raises new questions about old problems.

nicholas of montiéramey

the daily routines and workings of clairvaux’s chancery are rather poorly documented. we rarely know any of the scribes by name, and for those whom we do (a select group of six), only three give us a faint clue of their specific tasks and responsibilities. nicholas began serving bernard as an emissary around – , carrying letters concerning abelard’s heresy to rome. at this time he was still chaplain of hato, bishop of troyes, and peter the venerable’s friend and secretary, but he must already have been collaborating with bernard from onwards. he
jaeger, “charismatic body, charismatic text,” exemplaria ( ), – ; michel zimmermann, ed., auctor et auctoritas: invention et conformisme dans l’écriture médiévale; actes du colloque de saint-quentin-en-yvelines ( – juin ) (paris, ); edith wenzel, “der text als realie? auf der suche nach dem text und seinem autor,” in text als realie: internationaler kongress krems und der donau . bis . oktober , Österreichische akademie der wissenschaften, philosophischen-historische klasse, sitzungsberichte (vienna, ), – ; jeroen deploige, “anonymat et paternité littéraire dans l’hagiographie des pays-bas méridionaux (ca. –ca. ): autour du discours sur l’‘original’ et la ‘copie’ hagiographique au moyen Âge,” in scribere sanctorum gesta: recueil d’études d’hagiographie médiévale offert à guy philippart, ed. Étienne renard, michel trigalet, xavier hermand, and paul bertrand (turnhout, ), – ; jan m. ziolkowski, “cultures of authority in the long twelfth century,” journal of english and ger- manic philology ( ), – ; stephen partridge and erik kwakkel, author, reader, book: me- dieval authorship in theory and practice (toronto, ). we know that bernard’s earliest secretary was william of rievaulx. he must have been active from until , before traveling to northern england to establish the monastery of rievaulx in the di- ocese of york, a daughter house for clairvaux, to become its first abbot: see rassow, “die kanzlei st. bernhards von clairvaux,” . william’s intimate bond with bernard must have established a solid base upon which clairvaux and rievaulx were able to cooperate, communicate, and exchange recruits: see brian patrick mcguire, “introduction,” in a companion to bernard of clairvaux, ed. mcguire, brill’s companions to the christian tradition (leiden, ), . three other names that have come down to us are balduin of pisa, gerard of peronne, and raynaud of foigny, but none of these seems to have had much significance: see evans, bernard of clairvaux, . 
a more important personality was geoffrey of auxerre, who was a former student of peter abelard and allegedly renounced the parisian schools in favor of the monastery after having witnessed bernard’s genius and eloquence in preaching: “continuo tres ex illis compuncti sunt et conversi ab inanibus studiis ad verae sapientiae cultum, abrenuntiantes saeculo et dei famulo adhaerentes,” in geoffrey of auxerre, liber quartus sancti bernardi abbatis clarae-vallensis vita (pl : ). he entered clairvaux in and became the abbot’s secretary in , a time when the administrative obligations in clairvaux reached their peak and an official chancery had been established. he would become abbot of clairvaux himself in but had to abdicate his leadership after two years, presumably as a consequence of an internal dispute over the papal schism between alexander iii and victor iv: see ferruccio gastaldelli, “introduzione,” in goffredo di auxerre, super apocalypsim, ed. gastaldelli, temi e testi (rome, ), – .
there is scholarly debate over when exactly nicholas initiated his collaboration with bernard, but recent research tends to agree that it must have been earlier than his accession in – . see constable, “dictators and diplomats,” – : “nicholas was at clairvaux probably from the early s

would officially become a monk at clairvaux around the end of . his literary qualities, likely to have been acquired through his education in the benedictine abbey of montiéramey, enabled him to enter the scriptorium immediately and officially become bernard’s closest secretary. he appears to have been responsible for supervising the workings of the chancery, and he may have been the monastery’s librarian.
but their friendship came to an abrupt and painful end in the final years of bernard’s life, around – , when nicholas must have severely breached his master’s trust. in a letter to pope eugene iii, we find bernard disconcerted over the fact that letters had been sent out under his name and seal by “false brethren” without his permission. later, bernard would identify nicholas as the culprit among these brethren, although the exact reasons why the latter deserved this accusation are nowhere explicitly disclosed. in any case one can assume from his correspondence and his own words that nicholas’s talent as a writer and his “versatility” ingratiated him with the greatest men of his time. equally, nicholas appears to have had, perhaps through this flamboyance and self-confidence, a talent for making enemies as well.

to and assisted bernard with his sermons as well as his letters, but he continued to visit cluny and to serve peter the venerable, one of whose letters, we have seen, he presented to bernard orally”; and anne-marie turcan-verkerk, “l’introduction de l’ars dictaminis en france: nicholas de montiéramey, un professionnel du dictamen entre et ,” in le dictamen dans tous ses états: perspectives de recherche sur la théorie et la pratique de l’ars dictaminis (xie–xve siècles), ed. benoît grévin and turcan-verkerk (turnhout, ), : “ami de pierre le venerable, [nicolas] avait déjà servi les intérêts de bernard en portant au pape, en – , des lettres concernant abelard, à la rédaction desquelles il avait peut-être déjà participé, comme le suggère le manuscrit phillipps . . . . trois billets de recommandation envoyés par bernard à innocent ii entre et semblent le concerner [epp. – ], et montrent que s’il servait hatton, il le faisait en obéissant à bernard.”
“nicolas fit ses études à l’abbaye bénédictine de montiéramey, près de troyes en champagne.
on parle souvent de lui comme d’un magister”: see john benton, “nicolas de clairvaux,” in dictionnaire de spiritualité, vols. (paris, ), : – .
leclercq, “lettres de s. bernard: histoire ou littérature?,” recueil d’études, : .
giles constable, “nicholas of montiéramey and peter the venerable,” in constable, the letters of peter the venerable, vols. (london, ), : .
“periclitati sumus in falsis fratribus,” bernard, ep. , sbo : – .
ep. , likewise addressed to eugene iii (sbo : ).
constable, the letters of peter the venerable, : .
nicholas, ep. (pl : ), “ab ineunte aetate mea placui magnis et summis principibus hujus mundi.”
the word is jean mabillon’s in pl : : “vir fuit ingenii facilis, versatilis, facile in aliorum affectus influens.”
an example can be found in nicholas’s dispute with peter of celle. the two “were at odds over a substantive matter, a theological point about how to treat the attributes of god, and abbot peter took offense that nicholas, who should have possessed the power to triumph with his own (verbal) arms, had used against him the authority and words of great philosophers:” see john van engen, “letters, schools, and written culture in the eleventh and twelfth centuries,” in dialektik und rhetorik im früheren und hohen mittelalter: rezeption, Überlieferung und gesellschaftliche wirkung antiker gelehrsamkeit vornehmlich im . und . jahrhundert, ed. johannes fried (munich, ), : ; and julian haseldine, “peter of celle and nicholas of clairvaux’s debate on the nature of the body, the soul, and god,” in the letters of peter of celle, ed. haseldine (oxford, ), – . the “versatile” aspect of nicholas’s personality, as noted by mabillon (above, n. ), therefore also seems to display itself on the level of language. nicholas was accused of “inverting words and their meaning,” a
the scandal at clairvaux and the breach of bernard’s trust have for a long time upheld the portrayal of nicholas as a disreputable judas by bernard’s side, an analogy for which bernard himself was responsible. conversely and simultaneously, bernard’s status as a saint continued to grow during the intense process of canonization and idealization following his death. these respective caricatural depictions, in which nicholas was deplored as the mistrusted secretary and bernard praised as the saint who had become victim of textual theft, show through on an academic level as well. dom jean leclercq, one of the most prominent bernard scholars of the twentieth century, was as relentless as bernard in accusing nicholas of deceit, shamelessness, and plagiarism. the most striking example of nicholas’s seeming textual theft presents itself in his letter to henry the liberal, count of champagne, to whom he humbly offered his services as a secretary shortly after his expulsion from clairvaux. accompanying the letter we find nineteen sermons originally attributed to peter damian, nine sermons attributed to bernard of clairvaux, and seventy-four short commentaries on the psalms that are ascribed to hugh of st. victor. in the letter, nicholas asserts that these writings are “of my invention, of my style, aside from what i have taken from others in a few places.” we know this assertion to be true of the nineteen sermons also found among those of peter damian, which have been identified by leclercq as stemming from nicholas. bernard’s and hugh’s writings, on the other hand, appear to have been copied almost literally, not merely rearranged or paraphrased “in a few places” (paucis in locis), as nicholas seems to suggest. most of the nine sermons can be found in bernard’s de diversis.
it is striking that, months

characteristic that peter interpreted as equal to a falsification of language: “verba quoque et sensus verborum praesumis quandoque invertere”: see peter of celle, ep. (pl : ).
bernard literally made the analogy with judas, which he significantly did not make often in his letters: see brian patrick mcguire, “loyalty and betrayal in bernard of clairvaux,” in loyalty in the middle ages: ideal and practice of a cross-social value, ed. jörg sonntag and coralie zermatten (turnhout, ), – .
as it was first initiated by his biographer william of st. thierry († ). he wrote the vita prima bernardi, a biography with a hagiographical, panegyrical slant. he shares the authorship of the entire vita with bernard’s secretary geoffrey of auxerre and the benedictine abbot arnaud de bonneval. geoffrey was a strong advocate for bernard’s canonization, in which william’s texts played a fundamental role.
jean leclercq, “les collections de sermons de nicolas de clairvaux,” recueil d’études, : – . leclercq cannot but express his dislike for nicholas in phrases such as “cet homme sans caractère, mais lettré, doué de mémoire, habile à manier les fiches, prompt à entrer ‘dans le personnage’ d’un autre, aurait pu être pour s. bernard un parfait secrétaire, si seulement il avait été honnête,” or “or la suite du recueil prouve qu’il était sans scrupules en ce domaine comme en d’autres,” or “ainsi les témoignages les plus formels de nicolas lui-même sont trompeurs, car il ment”; or, in “deux épîtres de saint bernard,” recueil d’études, : , “on sait combien cet esprit peu original aime se citer lui-même, reprendre, en les modifiant à peine, des expressions qu’il a déjà employées en d’autres écrits.”
see n. .
bernard, sermones de diversis, , , , , , , and (sbo / ). on the sermons, see leclercq, “les collections de sermons de nicolas de clairvaux,” .
hugh’s commentaries or chapters, the adnotationes elucidatoriae in quosdam psalmos david (the second book of the miscellanea), are collected under pl : .
“meo sensu inventos, meo stylo dictatos, nisi quod paucis in locis de sensibus alienis accepi” (my translation), nicholas of montiéramey, in the prefatory letter in ms harley , ed. leclercq, recueil d’études, : – .

after his banishment from clairvaux, nicholas seemingly betrays his former abbot again with what appears to be a willful appropriation of bernard’s texts. such incriminating evidence contributed to his reputation as a plagiarist, this reputation in its turn provoking prejudicial conclusions in other attribution issues. henri m. rochais, for instance, in a codicological approach to the question of determining the disputed authorship of three other lengthy sermons in bernard’s de diversis corpus (de diversis , , and ), pointed out these sermons’ close similarities to two of nicholas’s works and to chapter of hugh of st. victor’s sixth book of miscellanea (the well-known writer somehow seems to be involved again), yet he stated with confidence that nicholas stole the texts from bernard under false pretenses. this hypothesis rochais sees corroborated in “the secretary’s unscrupulous personality.” at the same time rochais casts aside jean mabillon’s belief that the literary style of these sermons hardly seems that of bernard as an all-too-subjective and unscientific argument.
in our view, rochais’ own subjective mistake was that, despite being fully aware of bernard’s collaboration with his secretaries, he treated codicological unity as identical to stylistic or authorial unity: “cette tradition manuscrite ne donne donc aucun motif de doute sur l’authenticité bernardine des trois sermons étudiés, et, au contraire, elle constitue une telle probabilité en faveur de cette authenticité, qu’il faudrait des arguments incontestables pour dénier à bernard leur composition.”

leclercq’s and rochais’ attributions still stand in their editions, widely used today, although medievalists have seriously contested their highly subjective and speculative approach towards authorship attribution and their prejudiced view of nicholas of montiéramey’s alleged deceitfulness and falsification, as we will show below. moreover, the temptation for scholars to draw lines between imitation and plagiarism in order to categorize writings and collate them in attributed editions, valuable as it is, can also be rather anachronistic or even unbefitting in a medieval context. a fundamental rationale of the new philology, in the wake of poststructuralist approaches to authorship and texts, is that in a medieval culture there is no place for the idea of an original author, a logic leading to the conclusion that leclercq’s and rochais’ quest for such an author only takes us further from the truth. medieval literature depended strongly on its oral component (dictare had supplanted scribere to designate the act of authorship); it “circulated through networks” and formed part of a “shared culture [that was] characterized by a knowledge of the same erudite language as well as a common foundation of texts and memories.” giles constable touched on the core of the issue by noting that in the middle ages “there are infinite shadings between correction, revision, imitation, and falsification and, in works of art, between repair, restoration, reproduction, and copying.” in this light, nicholas’s appropriation of some of his former master’s works in his letter to henry the liberal is rather the continuation of a dialogue, not a spiteful act of revenge. stephen jaeger has similarly argued that nicholas indulges in the kind of imitatio that would have made little distinction between “honest” and dishonest intentions. like any distinguished writer of his time, nicholas carefully applied for a new position by showcasing his complete immersion in a prevalent literary network.

the juxtaposition of leclercq’s and rochais’ historical positivism with the more recent new philology lays bare the dilemma that has arisen in medieval text studies. although both practices have contributed immensely to the field, neither of the two stances is entirely satisfactory, leaving most scholars to agree to a compromise in cases of doubtful authorship. the first, rather positivist, approach acknowledges that the act of textual appropriation is suitable and possible. it presupposes that personal authorship is a retrievable aspect of the text, whose idealized state can be reconstructed from a hierarchical stemma. the disadvantage of this approach is that it

bernard, “sermo : de viis vitae quae sunt confessio et oboedientia”; “sermo : de via oboedientiae;” and “sermo : de quinque negotiationibus, et quinque regionibus,” sbo / : – .
the specific text referred to is hugh of st. victor, “de septem gradibus confessionis,” pl : – : see henri m. rochais, “saint bernard est-il l’auteur des sermons , et de diversis?,” revue bénédictine ( ): – , at . there is a lack of clarity as to how exactly hugh of st. victor’s miscellanea was constituted: whether the collection was assembled by hugh himself or whether it is a compilation assembled from his writings by others.
“le caractère de ce secrétaire peu scrupuleux rend assez vraisemblable l’hypothèse d’un nouveau plagiat de nicolas aux dépens de son ancien abbé,” rochais, ibid.
“dom j. leclercq a dit justement ce qu’il faut penser de cette sorte d’argument trop subjectif pour avoir, à lui seul, une valeur réellement probante,” rochais, ibid., . jean mabillon’s argument for attributing the sermons to nicholas can be found in a note to pl : – : “hic sermo sequensque in editione lugdunensi anni , in qua primum prodiere, extra classem genuinorum bernardi sermonum locati sunt; nec stylum ejus plene assequi videntur.”
rochais, “saint bernard est-il l’auteur des sermons , et ?,” .
nichols and cerquiglini, two prominent figures of the new philological approach, have already been cited above (see n. ). the heritage of poststructuralists such as roland barthes, “the death of the author,” aspen magazine – ( ); and michel foucault, bulletin de la société française de philosophie ( ): – ; or in dits et écrits, vol. , – , ed.
daniel defert, françois ewald, and jacques lagrange (paris, ), – , is apparent in this new philological approach towards medieval authorship. also see virginie greene, “what happened to medievalists after the death of the author?,” in the medieval author in medieval french literature, ed. greene (new york, ): – .
paul saenger, “silent reading: its impact on late medieval script and society,” viator ( ): .
pascale bourgain, “the circulation of texts in manuscript culture,” in the medieval manuscript book: cultural approaches, ed. michael johnston and michael van dussen (cambridge, uk, ), . also see rebecca moore howard, standing in the shadow of giants: plagiarists, authors, collaborators (stamford, ct, ), : “in the middle ages, mimesis was the means of establishing one’s authority, as well as being an expression of humility. the notion of the individual author, autonomous, original, and proprietary, played only the smallest role in this economy of authorship. with those textual values so much in decline, plagiarism was hardly an issue.”
giles constable, “forgery and plagiarism in the middle ages,” in constable, culture and spirituality in medieval europe (aldershot, ), – , at .
moreover, the assertion that bernard never heard of nicholas again after he left clairvaux is far from certain: see constable, the letters of peter the venerable, : .
“here then is a case in which a ‘skilled student of the ars dictaminis’ with alleged inclinations to forgery imitated a near-contemporary model, and we can assume that there would have been little difference between the ‘honest’ and dishonest imitation of bernard’s style,” in stephen jaeger, “the prologue to the historia calamitatum and the ‘authenticity question,’” euphorion ( ): .
after all, nicholas was applying for a position as henry’s new secretary, and a familiarity with the greats of the twelfth century would have been one of the prerequisites.
leclercq’s assertion that henry the liberal must not have noticed nicholas’s blatant plagiarism because he was a layperson unfamiliar with clerical texts seems unlikely: “il était moins facile à un laïc comme henri le libéral qu’à un clerc de déceler le plagiat,” in leclercq, recueil d’études, : . henry’s recognition of the extent to which nicholas’s compositions were indebted to other authorities would have been the whole point and is likely not to have been conceived of as problematic.

reduces authorship debates to anachronistic, binary classification problems, whose conclusions often lack substantial evidence and set into motion more unsubstantiated debate. therefore it both originates in bias and establishes a circulus in probando in which only additional factual evidence can provide closure. the second and more recent approach, on the other hand, comes to terms with the impossibility of closure through its recognition that, in a medieval context, knowing a text’s authorship was subordinate either, on the one hand, to acknowledging its implied authority (for example, the authority of a writer’s predecessors, such as the church fathers or god); or, on the other, to acknowledging the authority of its unique, materialized appearance: “there are as many texts . . . as there are scribal redactions; and there are as many authors of medieval texts as there were scribes composing new works in the act of writing them down in their [manuscripts].” by embracing the variance, this approach evades the impasse. yet one is wary of where this might lead.
lena wahlgren-smith, who is preparing a critical edition of nicholas of montiéramey’s letters, has quite rightly expressed her concern regarding a “wholesale adoption” of the new philological approach, which “assumes that all medieval literature, in all languages, all genres, and all periods, operates in the same way.” such an approach is counterintuitive to those medieval attestations where value is attached to titled authority, where there is an outspoken preference for unviolated text, or where personal literary style is cultivated. bernard’s denunciation of nicholas for sending out texts without his consent serves as a firsthand example. correspondingly, nicholas’s bold statement that bernard’s texts are in fact his own (“meo sensu inventos, meo stylo dictatos”) also suggests that an explicit appropriation of texts was not unknown in the twelfth century. it is important not to underestimate the degree to which the middle ages was a “charismatic culture” in which texts were regarded in relation to “the body and the physical presence [that were] the mediators of cultural values.” from this perspective, leclercq and rochais had justifiable reasons to care about the interdependence of text and physical author (or performer). constable has referred to a “rising tide of concern” over textual theft in the late twelfth and thirteenth centuries, possibly instigated by rapidly changing approaches to “literary individuality.” in the midst of such an impasse, historians and

bernadette a. masters, “the distribution, destruction and dislocation of authority in medieval literature and its modern derivatives,” romanic review ( ): – , at .
lena wahlgren-smith, “editing a medieval text: the case of nicholas of clairvaux,” in challenging the boundaries of medieval history: the legacy of timothy reuter, ed. patricia skinner, studies in the early middle ages (turnhout, ), – .
for the upcoming critical edition, see wahlgren-smith, the letter collections of nicholas of clairvaux (oxford, forthcoming).
minnis, medieval theory of authorship, – .
nicholas of montiéramey, in the prefatory letter in ms harley , ed. leclercq, recueil d’études, : – .
jaeger, “charismatic body, charismatic text,” .
constable, “forgery and plagiarism,” , . he also noted that nicholas “may have inspired the cistercian legislation of defining the punishments for the falsifiers of charters and seals,” . however, the idea of literary individuality in the twelfth-century renaissance that constable here addresses is a subject of immense debate: see caroline walker bynum, “did the twelfth century discover the individual?,” journal of ecclesiastical history ( ): – ; colin morris, the discovery of the individual, – (new york, ); and aron iakovlevič gurevič and katharine judelson, the origins of european individualism (oxford, ).

philologists should hope to find more solid ground on which to establish the authorship of dubious texts. authorship attribution (a seemingly trivial question concerning who wrote which text) forms a vital stepping-stone towards a more scientifically responsible understanding of how the medieval author perceived the proportional relationship between one’s personal authority over a text and one’s personal contribution to its composition.
computational stylistics

instead of treating the medieval text as collective and impersonal, the methodology of computational stylistics traces the linguistic traits of a text that originate from a highly individual stylome, a set of features that betray personal writing style (note here that mapping out individual stylistics should not eliminate the possibility of collective authorship). moreover, its ambitions, whether realistic or not, may extend to extracting information concerning the author’s sex or age or to measuring a text’s genre or degree of “literariness.” although experiments and debates as to which textual features best capture stylistic difference are still ongoing, many state-of-the-art studies employ function words, which still prove to be the most robust discriminators for writing styles. function words are usually short and insignificant words that pass unnoticed (such as pronouns, auxiliary verbs, articles, conjunctions, and particles), whose main advantages are their frequent occurrence, their less conscious use by authors, and their content- or genre-independent character. their benefit and success for the study of stylometry in latin prose have been convincingly demonstrated before, although the methodology still raises acute

this is the so-called human stylome hypothesis: see hans van halteren et al., “new machine learning methods demonstrate the existence of a human stylome,” journal of quantitative linguistics ( ): – .
for author-profiling studies, see shlomo argamon et al., “automatically profiling the author of an anonymous text,” communications of the association for computing machinery (acm) ( ): – .
karina van dalen-oskam, “a literary rat race,” in digital humanities : conference abstracts (kraków, ), – .
see mike kestemont, sara moens, and jeroen deploige, “collaborative authorship in the twelfth century: a stylometric study of hildegard of bingen and guibert of gembloux,” digital scholarship in the humanities ( ): – ; and jeroen deploige and sara moens, “visiones hildegardis a guiberto gemblacensi exaratae,” in hildegardis bingensis opera minora, vol. , ed. jeroen deploige et al., cccm a (turnhout, ), – ; or the numerous investigations on the authorship of the scriptores historiae augustae, of which the latest is mike kestemont and justin a. stover, “the authorship of the ‘historia augusta’: two new computational studies,” bulletin of the institute of classical studies of the university of london ( ): – . also see penelope j. gurney and lyman w. gurney, “authorship attribution of the scriptores historiae augustae,” literary and linguistic computing ( ): – ; richard s. forsyth, david i. holmes, and emily k. tse, “cicero, sigonio, and burrows: investigating the authenticity of the consolatio,” literary and linguistic computing ( ): – ; fiona j. tweedie, david i. holmes, and thomas n. corns, “the provenance of de doctrina christiana, attributed to john milton: a statistical investigation,” literary and linguistic computing ( ): – ; and earl jeffrey richards, david joseph wrisley, and liliane dulac, “the different styles of christine de pizan: an initial stylometric analysis,” le moyen français – ( ): – .
questions, which keep stylometrists on the lookout for alternatives. before returning to our case study of nicholas of montiéramey and his alleged falsification, we will expound on the availability of the data, on the preprocessing steps involved, and on the statistical technicalities of computational stylistics.

for our subsequent analysis, we relied upon the digitized texts of bernard of clairvaux’s corpus epistolarum, sermones de diversis, and sermones super cantica canticorum as they appear in the state-of-the-art scholarly edition of the sancti bernardi opera by leclercq et al. included in the online brepols library of latin texts. for nicholas of montiéramey’s letters we are provisionally still reliant on the digitally available patrologia latina.
all text data are available in an online github repository for experimental replication, yet in a camouflaged form so that the copyright protection on the original text editions is respected. only the texts’ function words were retained in their original form, whereas all content-loaded words were filtered out and replaced by dummy words. since leclercq’s editions and the patrologia latina make use of different orthographical conventions, and since latin is a synthetic language with a high degree of inflection, bernard and nicholas’s texts required some preprocessing for the sake of data alignment and feature culling. the result of the latter is that texts are more easily mined for information: thus, the lexemes are lemmatized (which means that a specific instance of the word is referred to its headword) and a text’s words (tokens) are classified according to grammatical categories (parts of speech). for this purpose we applied the pandora lemmatizer tagger on the texts, a piece of software developed to achieve specifically this.

token    lemma    pos-tag (simplex)
harum    hic      pro
imo      immo     adv

an advantage of pandora’s design is that it normalizes orthographical variants to a classicized headword if necessary. both imo and immo were categorized under immo and identified accordingly as one and the same word. the part-of-speech-tag (pos) displayed in the third column of the table above allowed us to restrict the culling of the most frequent words to those word categories that make up the collection of function words: conjunctions (con), prepositions (ap), pronouns (pro), and adverbs (adv).

mike kestemont, “function words in authorship attribution: from black magic to theory?,” in proceedings of the rd workshop on computational linguistics for literature (clfl) at the th conference of the european chapter of the association for computational linguistics (eacl), ed. anna feldman, anna kazantseva, and stan szpakowicz (gothenburg, ), – .
see n. for details on the edition of bernard of clairvaux’s texts by leclercq et al. the digitized text files of these editions have been generously provided for our experiments by our project partner, brepols publishers. for brepols’s online library of latin texts, see http://www.brepolis.net.
pl : a– b. the original edition of nicholas’s letters was first published in by jean picard, who was believed to have had access to the original manuscripts. the text would later be republished in lyon in and ultimately reprinted by jacques-paul migne in the patrologia latina: see luanne meagher, “the letters of nicolas of clairvaux,” in heaven on earth, studies in medieval cistercian history , cistercian studies series , ed. ellen rozanne elder (kalamazoo, ), . the forthcoming critical edition is edited by wahlgren-smith: see n. .
see https://github.com/jedgusse/bernard.
pandora was developed by mike kestemont and the author of this article. see mike kestemont and jeroen de gussem, “integrated sequence tagging for medieval latin using deep representation learning,” journal of data mining and digital humanities. special issue on computer-aided processing of intertextuality in ancient languages, ed. marco büchler and laurence mellerin ( ), https://arxiv.org/pdf/ . v .pdf.
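The restriction of the frequency count to function-word categories can be sketched in a few lines of Python. This is a minimal illustration and not the project's actual pipeline: the (token, lemma, pos) triple format, the function name, and every sample row other than harum and imo are assumptions made for the example. Only the four tag names (con, ap, pro, adv) and the imo/immo normalization come from the text.

```python
from collections import Counter

# POS categories that the article treats as function words.
FUNCTION_POS = {"con", "ap", "pro", "adv"}


def most_frequent_function_lemmas(tagged_tokens, n=10):
    """Count lemmas whose POS tag falls in a function-word category.

    tagged_tokens: iterable of (token, lemma, pos) triples. This triple
    format is a hypothetical stand-in for real tagger output.
    """
    counts = Counter(
        lemma
        for _token, lemma, pos in tagged_tokens
        if pos.lower() in FUNCTION_POS
    )
    return counts.most_common(n)


# Toy input. The harum and imo rows follow the table in the text; the
# other rows and the tags "adj" and "n" are invented for illustration.
sample = [
    ("harum", "hic", "pro"),
    ("imo", "immo", "adv"),           # variant spelling, same lemma
    ("immo", "immo", "adv"),
    ("secundum", "secundum", "ap"),   # preposition: kept
    ("secundum", "secundus", "adj"),  # adjective homonym: dropped
    ("verba", "verbum", "n"),         # content word: dropped
]

print(most_frequent_function_lemmas(sample))
```

Counting on the lemma rather than the token is what lets the two spellings imo and immo accumulate under one headword, and filtering on the tag is what keeps the adjectival secundum out of the prepositional count.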
This likewise filtered out some noise caused by ambiguities or homonyms like secundum, which can be either a preposition or the accusative case of the adjective secundus. Afterwards, some lemmata in the list that did not qualify as style markers were culled, such as tu, tuus, vos, and vester. Vos and vester betray a vernacular influence in Bernard’s unpublished letters as formal, polite forms similar to the French vous. Bernard was known to adapt his sermons to his audience, and in a literary or classicizing text he would maintain tu and tuus when addressing his correspondent. Therefore these pronouns were not regarded as suitable features for stylistic difference but as content-dependent, conscious authorial choices linked to register. Aside from lemmatization, smaller interventions were undertaken, such as separating the enclitics -que and -ve from the token so that each could be recognized as a feature. Once procedures of this sort were carried out in full, we arrived at a list of the most frequent function words (MFFW) of the corpus examined in our experiment. Tables and of the most frequent function words correspond to the two experiments (and their two respective corpora) described in this article and are listed in the appendix.

The corpora under scrutiny were subsequently segmented into parts with a fixed size, in other words, text samples. Sampling yields the advantage of “effectively [assessing] the internal stylistic coherence of works,” as it also allows for a more fine-grained comparative analysis with segments from external works. The sample sizes, however, can differ depending on the requirements of the experiment. As will become apparent in the appendices and the figures, Bernard’s letters were segmented into , -word samples, whereas his sermons were segmented into , -word samples. The decrease of sample size in the second experiment was necessary because it treats shorter texts.
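The culling and sampling steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the project’s actual code: the function names, the (token, lemma, PoS) triple format, and the toy input are assumptions made for the example, and only the four tag labels and the four excluded pronouns are taken from the text. The splitting of enclitics is left out.

```python
from collections import Counter

# PoS tags that the article treats as function-word categories.
FUNCTION_POS = {"CON", "AP", "PRO", "ADV"}
# Lemmata the article excludes as register-dependent authorial choices.
EXCLUDED_LEMMATA = {"tu", "tuus", "vos", "vester"}


def function_word_lemmata(tagged_tokens):
    """Keep the lemmata of tokens tagged as function words.

    `tagged_tokens` is assumed to be an iterable of
    (token, lemma, pos) triples, e.g. the output of a lemmatizer-tagger.
    """
    return [
        lemma
        for _token, lemma, pos in tagged_tokens
        if pos.upper() in FUNCTION_POS and lemma not in EXCLUDED_LEMMATA
    ]


def most_frequent(lemmata, n):
    """Return the n most frequent lemmata, most frequent first."""
    return [lemma for lemma, _count in Counter(lemmata).most_common(n)]


def segment(tokens, sample_size):
    """Split a token list into consecutive fixed-size samples.

    A final remainder shorter than `sample_size` is dropped.
    """
    return [
        tokens[i:i + sample_size]
        for i in range(0, len(tokens) - sample_size + 1, sample_size)
    ]


tagged = [
    ("harum", "hic", "PRO"),
    ("imo", "immo", "ADV"),
    ("tu", "tu", "PRO"),               # excluded pronoun
    ("secundum", "secundus", "ADJ"),   # adjective reading, filtered out by its tag
    ("et", "et", "CON"),
    ("et", "et", "CON"),
]
lemmata = function_word_lemmata(tagged)
checklist = most_frequent(lemmata, 3)
samples = segment(list(range(7)), 3)
```

Filtering on the tag rather than on the word form is what removes the adjectival reading of secundum while keeping the prepositional one.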
It should be noted that whereas , -word samples correspond to a state-of-the-art norm, , -word samples run the risk of increased imprecision, a consideration that should nuance any interpretation of the results.

Other pairs include tanquam and tamquam, quoties and quotiens, nunquid and numquid, quanquam and quamquam, nunquam and numquam, etc.

Constable, “The Language of Preaching in the Twelfth Century,” – .

See Leclercq, “Notes sur la tradition des épitres de S. Bernard,” Recueil d’études, : . It should be noted that our decision to disregard these features, or in other words to succumb to manual feature selection, implies a degree of supervision and subjectivity. We do not really see this as a problem, firstly since we have supplied evidence that including these features could only distort the results for obvious historical reasons, and secondly since this is in line with our approach that was already strongly determined by how we set limitations to the culling of function words (only prepositions, conjunctions, adverbs, and pronouns were taken into consideration). The omission of too-characteristic corpus features considerably improves precision and historical validity. David L. Hoover has demonstrated this for personal pronouns as well: see Hoover, “Testing Burrows’s Delta,” Literary and Linguistic Computing ( ): – .

On the culling of the most frequent words, see the pivotal work of John F. Burrows, “‘Delta’: A Measure of Stylistic Difference and a Guide to Likely Authorship,” Literary and Linguistic Computing ( ): – . Its workings have been considerably elucidated (and its formula simplified) by the publication of Shlomo Argamon, “Interpreting Burrows’s Delta: Geometric and Probabilistic Foundations,” Literary and Linguistic Computing ( ): – .

Maciej Eder, Jan Rybicki, and Mike Kestemont, “Stylometry with R: A Package for Computational Text Analysis,” R Journal ( ): – , at .

Once we divided our corpus, each of the text samples needed to be translated to a format that is also readable to computers, namely document vectors. A text sample is represented as an array of tallies for each of the one hundred and fifty function words on the checklist, as follows:

            et     in     qui    ...
Sample      ...    ...    ...    ...
Sample      ...    ...    ...    ...
...         ...    ...    ...    ...
Sample n    ...    ...    ...    ...

These raw counts were tf-idf normalized, a procedure that divides the function word frequencies by the number of text samples that respective function word appears in. As a consequence, less common function words received a higher weight, which prevents them from sinking away (and losing statistical significance) in between very common function words. Once the data was preprocessed and regulated, two statistical techniques were applied to visualize its dynamics. The first is k nearest neighbors (hereafter k-NN); the second is principal component analysis (hereafter PCA). Their respective results will prove to be similar in a general sense, yet crucially different in the details. We argue that such an additional statistical validation provides for a more accurate, nuanced interpretation and a better intuition of the data.

In figs. and , the k-NN networks, we first calculated the five closest text samples to each text sample by applying k-NN on the frequency vectors. Accordingly, for each text the five most similar, or closest, texts were calculated, weighted in rank of smallest pairwise distance and consequently mapped in space through force-directed graph drawing. What a k-NN network ultimately captures is which texts are most akin, or have the closest connection, when it comes to writing style (as defined by the distribution of function words). It should be noted that k-NN nearly always finds relationships, as it is very much a closed game. It is designed to link candidates to one another in terms of distance (every text sample needs to find its five neighbors) and can presuppose ties that are rather coincidental or nonexistent (for example, in the case of outliers). The network visualization can therefore be biased by a misleading directionality.

Secondly, PCA is a technique that allows us to reduce a multivariate or multidimensional data set of many features, such as our function word frequencies, to merely two or three principal components, which disregard inconsequential information, or noise, in the data set and reveal its important dynamics (figs. and ).

There is debate over the adequate sample length to capture a stylistic signal. The risk has been addressed for twelfth-century Latin in Kestemont, Moens, and Deploige, “Collaborative Authorship in the Twelfth Century,” – . Also see Maciej Eder, “Does Size Matter? Authorship Attribution, Small Samples, Big Problem,” Literary and Linguistic Computing ( ): – ; and Kim Luyckx and Walter Daelemans, “The Effect of Author Set Size and Data Size in Authorship Attribution,” Literary and Linguistic Computing ( ): – .

The tf-idf vectorizer is therefore a normalization procedure that penalizes more frequent function words in favor of rare function words: see Christopher D. Manning et al., Introduction to Information Retrieval (New York, ), – . The tf-idf vectorizer is not an obvious choice for feature extraction in authorship-attribution studies, since it readjusts Burrows’s Delta presupposition (see above) that the high-ranked most frequent words generate better distinctions between authors. We argue that observing less common function words can yield interesting authorial preferences, which might go unnoticed in the standard Delta approach. “In many ways, this model can be contrasted with the assumption that low-frequency items are bad predictors of authorial style. Nevertheless, a few studies suggest that it might be useful. Arguably, this model captures the intuition that if a highly rare feature is present in two documents, this increases the likelihood that the two documents were authored by the same individual. While the method might therefore be sensitive to overfitting on low-frequency properties, this might be an attractive characteristic in certain (e.g., single-domain) authorship problems,” Kestemont et al., “Authenticating the Writings of Julius Caesar,” Expert Systems with Applications ( ): .

The similarity metric applied for the pairwise distances is the Minkowski metric, a Euclidean metric that is “a very general metric that can be used in a k-NN classifier for any data that is represented as a feature vector”: see Pádraig Cunningham and Sarah Jane Delany, “k-Nearest Neighbour Classifiers,” Multiple Classifier Systems ( ): – , at .
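The tf-idf weighting and the weighted nearest-neighbor step described above can be sketched as follows. This is a minimal, standard-library illustration under stated assumptions, not the code used for the article’s figures. The function names and the toy count matrix are hypothetical. The tf-idf step follows the article’s verbal description literally (count divided by the number of samples containing the word), whereas library implementations typically use a logarithmic inverse document frequency. The distance is the Euclidean case of the Minkowski metric mentioned in the notes. The inverted min-max scaling of distances into edge weights is only one plausible reading of the article’s note on weights.

```python
import math


def tfidf_normalize(counts):
    """Divide each raw count by the number of samples containing that word.

    `counts` is a list of rows (one per text sample); each row holds the
    tallies for the function words on the checklist, in a fixed order.
    """
    n_features = len(counts[0])
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(n_features)]
    return [
        [row[j] / doc_freq[j] if doc_freq[j] else 0.0 for j in range(n_features)]
        for row in counts
    ]


def euclidean(u, v):
    """Euclidean distance (Minkowski metric with p = 2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbors(vectors, k):
    """Map each sample index to its k closest other samples as (index, distance)."""
    neighbors = {}
    for i, u in enumerate(vectors):
        dists = [(j, euclidean(u, v)) for j, v in enumerate(vectors) if j != i]
        dists.sort(key=lambda pair: pair[1])
        neighbors[i] = dists[:k]
    return neighbors


def edge_weights(neighbors):
    """Scale distances so that the smallest distance gets the largest weight.

    Assumption: inverted min-max scaling, giving weights between 1
    (closest pair) and 0 (most distant pair among the retained edges).
    """
    all_d = [d for pairs in neighbors.values() for _j, d in pairs]
    lo, hi = min(all_d), max(all_d)
    span = hi - lo
    return {
        (i, j): 1.0 if span == 0 else 1.0 - (d - lo) / span
        for i, pairs in neighbors.items()
        for j, d in pairs
    }


# Toy example: three samples, two function words (hypothetical counts).
counts = [[4, 0], [4, 2], [0, 2]]
vectors = tfidf_normalize(counts)
knn = nearest_neighbors(vectors, k=1)   # the article uses five neighbors per sample
weights = edge_weights(knn)
```

The resulting (source, target, weight) triples are the kind of edge list that a network tool with a force-directed layout can draw. Because every sample is forced to pick its k neighbors, an outlier still receives edges, which is the “closed game” caveat raised in the text.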
The assumption is that the main principal components, our axes in the plot, point in the direction of the most significant change in our data, so that clustering and outliers become clearly visible. Each word in our feature vector is assigned a weighting, or loading, which reflects whether or not a word correlates highly with a PC and therefore gains importance as a discriminator in writing style. In a plot, the loadings or function words that overlap with the clustered texts of a particular author are the preferred function words of that author (see figs. – ). PCA is built to find the most meaningful variance of observations along the axes of its principal components. In this sense it is not always interested in finding links between candidates, as k-NN is, but rather in finding links between variables. Disadvantages are that PCA can never explain all the variance of the data, since it purposefully disregards many features and dimensions that it finds insignificant. It also has the tendency to produce somewhat nebulous scatter plots when texts are stylistically entangled (as is the case for Bernard and Nicholas).

The weights were derived directly from the calculated distances (see n. for specifications on the metric). The intuition is then that the distances should be normalized to a ( , ) range. Note that this is not a ( , ) range, since smaller distances correspond to greater similarities and therefore require greater weighting: distances distances min(distances) min(distances) max(distances).

See, for instance, Maciej Eder, “Visualization in Stylometry: Cluster Analysis Using Networks,” Digital Scholarship in the Humanities ( ): – . The algorithm used for the graphs in this article was Force Atlas , embedded in Gephi, an open-source tool for network manipulation and visualization: see Mathieu Bastian et al., “Gephi: An Open Source Software for Exploring and Manipulating Networks,” Proceedings of the Third International Conference on Weblogs and Social Media (ICWSM), ed. Eytan Adar et al. (San Jose, CA, ): – . It must be noted that Eder’s network iteratively runs over an increasing number of features ( – MFW) to establish the consensus between different text samples (likewise by means of nearest neighbors). I have somewhat adjusted the method for the purpose of this article, since running up to MFW would surely overfit on the strong content-dependent connections that exist between Nicholas and Bernard’s texts. Matthew Lee Jockers likewise applied Gephi networks to detect stylistic differences in his seminal work Macroanalysis: Digital Methods and Literary History (Champaign, ). The difference with his approach and the one maintained in this article is that Jockers not only generated his networks through stylistic variables, but combined these with thematic linkages to discover trends of literary evolution on a far larger and diachronic scale than in this article (one could say that Jockers, as a quantitative formalist, demonstrated the popular formalist concept of the “defamiliarization” of literary language). As argued earlier, thematic variables should be disregarded in this case study. Nevertheless, I concur with Jockers’s view that networks are a powerful tool to “demonstrate literary imitation, intertextuality, and influence” ( ).

For an elaborate explanation of PCA and its applicability to stylometry, see José Nilo G. Binongo and M. Wilfrid A. Smith, “The Application of Principal Components Analysis to Stylometry,” Literary and Linguistic Computing ( ): – . The PCA plots were drawn with the Matplotlib package available for Python: see John D. Hunter, “Matplotlib: A D Graphics Environment,” Computing in Science and Engineering ( ): – .

The Letters

Bernard’s epistolary corpus is very complex, but a coherent structure has been recognized thanks to Jean Leclercq’s editorial achievements. Leclercq, whose terminology will be adopted here, has divided the corpus into a literary (intra corpus) and a nonliterary section (extra corpus). Bernard intended the letters in the intra corpus to circulate as a literary collection and kept refining them intensely throughout his life, whereas the second group of letters is scattered across time and manuscript traditions. Then, within the first section, the literary or intra corpus, we can make another division. Manuscript transmission allows us to distinguish between letters written before Nicholas’s arrival in the scriptorium and letters inserted later. The letters that date from before in an earlier first appearance are found in the Brevis manuscripts, whereas those added to Bernard’s literary corpus afterwards can be found in the Perfectum manuscripts. The Perfectum corpus was assembled after Bernard’s death in , possibly by Geoffrey of Auxerre, and contains, aside from the Brevis letters, many new additions (tables and give a detailed overview of which letters are included in either the Brevis or Perfectum samples). It is important to note that those letters which were already found in the Brevis manuscripts and reoccur in the Perfectum corpus have sometimes considerably changed in the eight years between the two appearances. After all, Bernard’s aim was to compose a unified piece of literature. He corrected, rearranged, and selected throughout his life. Importantly, Leclercq’s edition of Bernard’s literary letter corpus, which we use in these experiments, is almost entirely based on the Perfectum transmission, which enjoyed the most popular circulation. We therefore do not work strictly with the Brevis corpus in its original pre- form, but with a group of letters that was collectively reworked and jointly disseminated.

Leclercq separated the intra corpus (epp. – ) from the extra corpus (epp. – ) in the SBO.

Leclercq, SBO : – .

Leclercq, “Lettres de S. Bernard: histoire où littérature?,” Recueil d’études, : .

The main difference between the Brevis and Perfectum manuscripts is that the latter contain versions of these letters that were clearly amended and lengthened. In the introduction to the edition of the intra corpus, Leclercq gives a full account of the arrangement of the different transmissions, in which he distinguishes three more or less homogeneous collections, two of which are the Brevis and Perfectum cycles, whose names have served as inspiration to how we labeled our chronologically ordered data. It should be noted that Leclercq mentions a third intermediary publication that we have decided to exclude from our main argument, namely the Longior corpus, which was compiled by Geoffrey of Auxerre and was presumably published in . The Longior corpus already contains quite a few of the Perfectum additions. However, this corpus is hard to date or reconstruct, making it less interesting for us to include in this study: see Leclercq, SBO :xv. We decided to make a distinction between the early Brevis publication, when Nicholas of Montiéramey was certainly absent from Clairvaux, and the later publication, when both of them, or an even more developed chancery, could have exerted influence on Bernard’s style. More specifically, the letters falling under the heading of Brevis (written before and not to be confused with Leclercq’s Brevis manuscript collection) can be consulted under the following indices: epp. , , , , , , , , , , – , , , , , , , , – , , , , , , , , , , , , – , – , – , , , , , , , , , , , , , , and . The letters that fall under the heading of Perfectum (not to be confused with the Perfectum cycle as it is described by Leclercq, but corresponding to all letters that were written after and remained unmentioned in the Brevis data) are to be found under these indices: epp. , – , , , – , – , – , , , – , , , , , , – , , – , , , – , , , , – , , , , , , , , – , , – , , – , – , – , – , and – . However, these letters were categorized according to their date of publication (Brevis and Perfectum) and split up into samples. A full inventory of which letters can be found under which sample in the figures is provided in the appendix of tables (see tables – ).

Fig. . Network visualization of Bernard of Clairvaux’ epistolary corpus compared to Nicholas of Montiéramey’s letters and sermons.

Fig. . Principal components analysis of Bernard of Clairvaux’ epistolary corpus compared to Nicholas of Montiéramey’s letters and sermons.
This condition of the texts is a flaw in the experiment that should be kept in mind during the analysis. Moreover, as mentioned earlier, Bernard’s extra letters have not known a homogeneous transmission but have been handed down to us under divergent circumstances. The corpus therefore required some reorganization. Since Leclercq’s edition allows us to assign individual letters to discrete periods, we decided to divide the extra corpus into three time-bound parts to see if Nicholas’s arrival came with a stylistic impact: the first part dates from before , the second between and , and the third from onwards. Those extra letters that are of questionable dating and addressee have been left out of our experiments, for they cannot contribute to a study of Bernard’s stylistic evolution through the influence of his secretaries (table in the appendix gives a full overview of which extra letters were included or excluded).

In fig. (k-NN) and fig. (PCA) we have applied the statistical methods described above to calculate and visualize the stylistic differences between Bernard’s letter corpus and Nicholas’s authentic sermons and letters. For each of these techniques, we have provided three additional subplots, which highlight how the different corpora are positioned within the clusters. There appear to be two general, observable dynamics, confirmed both by k-NN and PCA. Firstly, the writing style in Bernard of Clairvaux’s letters is fairly coherent and forms a distinguishable cluster separated from Nicholas’s works. Nevertheless (and this is the second, more hidden dynamic), our chronological and codicological rearrangements in the corpus have laid bare a gradual, subtle disturbance in Bernard’s stylistic signal from onwards, corresponding to the approximate time of Nicholas’s arrival in Clairvaux and seemingly moving towards the latter’s cluster. Yet, two major remarks are in order.
Firstly, although the Perfectum additions were indeed inserted into the literary corpus from onwards, some of them must have been first composed at a time before Nicholas’s arrival. For example, sample in_ of the Perfectum additions, which draws closest to Nicholas’s cluster of all literary samples (only in the PCA, not in the k-NN network), contains letters that revolve around the schism between Antipope Anacletus and Pope Innocentius II, a series of events that occurred between and . Although Nicholas was not yet part of Bernard’s entourage during these events and was therefore likely not involved in their first redaction, he was nevertheless present when they were first sent out collectively with the other Perfectum letters. However, if any refinement was imposed on these letters after , it seems likely that Bernard, as the author, would have been the one to do so, not Nicholas, although technically the latter’s interference is possible.

The second remark ties in to the problem we have just raised. Although both figures show a diachronic stylistic shift, PCA slightly adjusts k-NN’s inference that this shift has a determined direction towards Nicholas. The ex_ samples rather “float” around Nicholas’s vicinity but never fully coincide. This suggests that the disturbances in Bernard’s stylistic signal should not necessarily be as “monocausal” or “directional” as the k-NN network suggests, a nuance that echoes the historical skepticism raised in our first remark.

Leclercq, SBO :xvi.

See n. for a listing of the sermons under consideration. For Nicholas’s letters, see PL : – .

Turcan-Verkerk, “L’introduction de l’ars dictaminis en France,” .

The sample (in_ ) contains epp. , , , , , , , , , , (SBO : – ).
countless other variables aside from nicholas’s interference could have contributed to the subtle stylistic change in ber- nard’s letter corpus. one factor could be the lapse of time and bernard’s personal de- velopment. another is the respective corpus’s divergent transmission history. but perhaps the most crucial reason for pca’s less outspoken directionality is that ber- nard did not have just one secretary. although nicholas was the scriptorium’s head- man, this experiment undoubtedly simplifies or fragmentizes its diversity of styles and personalities. we might even be surprised that bernard’s letters—considering the circumstances under which they were conceived—still display this amount of sty- listic coherence (although there might have been a more outspoken divergence if we had been able to oppose the very original brevis corpus to the published versions). this does not alter the fact that the plots’ gravitation towards nicholas’s latin style, which was of a very schooled nature, might hold some historical ground. as bernard more and more became a public figure, he increasingly began requiring the there is a considerable amount of literature that argues that such an evolution of personal style through time can be captured computationally by applying so-called stylochronometrical methods. for a concise overview of this subfield in computational stylistics, see constantina stamou, “stylochronom- etry: stylistic development, sequence of composition, and relative dating,” literary and linguistic computing ( ): – . 
nicholas’s style has often been deemed schooled and unoriginal: see constable, the letters of pe- ter the venerable, : ; and leclercq, “les collections de sermons de nicolas de clairvaux,” : : “ses exposés superficiels se développent selon un plan scolaire, en un style artificiel.” likewise, dorette sabersky argued that “the syntactical structure of his sentences is similar to bernard’s, but often clum- sier, less clear, less elegant, and rhythmically less balanced. his frequent use of word plays is at times rather superficial and, in opposition to bernard’s use, of little importance to the development of the contents. repetitions of certain phrases and topics occur every so often. he favors rather unusual words and likes to quote classical authors. his literary exertions are only too obvious. all these aspects evidence nicholas’ lack of bernard’s creative spontaneity and mastery of language,” in “the style of nicholas of clairvaux’s letters,” in erudition at god’s service, studies in medieval cistercian history , cistercian studies series , ed. john r. sommerfeldt (kalamazoo, ), . however, it is all the more peculiar and contradictory to these former statements that—even until very recently—attempts were made to at- tribute texts to nicholas on the grounds of phrasing tics and certain lexical preferences (e.g., neologisms): see patricia stirnemann and dominique poirel, “nicolas de montiéramey, jean de salisbury et deux florilèges d’auteurs antiques,” revue d’histoire des textes ( ): – . either such attribution methods should be challenged (perhaps rightly so; their word and phrase concordances form particularly dangerous grounds for attributing authorship in a twelfth-century context that boasts such a high degree of “plagiarism”) or the statement that nicholas has no style of his own should be withdrawn. i am con- vinced of the latter. 
our computational experiments show that nicholas, despite being an imitator, has a very controlled and rather clean authorial signal. speculum /s (october ) this content downloaded from . . . on october , : : am all use subject to university of chicago press terms and conditions (http://www.journals.uchicago.edu/t-and-c). http://www.journals.uchicago.edu/action/showlinks?doi= . % f &crossref= . % fllc% ffqm &citationid=p_n_ http://www.journals.uchicago.edu/action/showlinks?doi= . % f &crossref= . % fllc% ffqm &citationid=p_n_ bernard of clairvaux and nicholas of montiéramey s support of scribes to take on administrative tasks, be it in clairvaux or on excep- tional occasions elsewhere. these scribes would have received a similar training or education. we can assume that most were under nicholas’s supervision, which meant that they departed from a common framework or set of rules from which they set out to imitate bernard. this had become the nature of the epistolary writing art, or ars dictaminis. letters were constructed on the basis of similar formulas, abounded in clever wordplay, and the rhythms of their prose pulsated under com- parable cadences. diplomats, ambassadors, and secretaries would inspire one an- other in a network of correspondence or share these rhetorical devices within their scriptoria. these practices might have considerably reshaped the stylistic homoge- neity that is evident in the writings of bernard from his earlier days, when he relied on a far smaller number of secretaries and had more time at his disposal so that he bernard would not necessarily have found help only in clairvaux. it is conceivable that when he was occupied with the turbulent matters of the schism and was traveling through italy he called for the assistance of papal scribes to whom he could dictate his messages. 
the papal notaries, educated in the ars dictaminis, would in fact have been schooled in a similar tradition as nicholas, who had visited rome and moreover corresponded with at least three popes during his lifetime: see constable, “dic- tators and diplomats,” . this could also explain why brevis samples in_ and in_ , which like- wise have the schism as their subject, somewhat pair with letters that were added to the corpus later and not with the other brevis letters, which cling more closely together. the samples in_ and con- tain epp. , , , , , , , , , , and (sbo : – ). for a concise overview of bernard’s interference in the papal schism and its importance for his public career, see the subchapter “a leading figure in the papal schism – ,” in brian patrick mcguire, “bernard’s life and works,” in mcguire, a companion to bernard of clairvaux, – . for information on the papal chancery, see christopher robert cheney, the study of the medieval papal chancery: the second edwards lecture delivered within the university of glasgow on th december (glasgow, ), – . “instructions about the framing of papal letters may be found in chancery ordinances and in guide-books for chancery clerks; these help to elucidate the legal principles which underly the phraseology.” the writing style of the papacy’s chancery must have served as an important model to all clerks and diplomats both in ecclesiastical and worldly contexts. see ronald witt, “medieval ‘ars dictaminis’ and the beginnings of humanism: a new construc- tion of the problem,” renaissance quarterly ( ): – . with “comparable cadences” i am here referring to rhetorical devices such as the cursus: see tore janson, prose rhythm in medieval latin from the th to the th century (stockholm, ). 
the cursus has also been tested as a feature for authorship attribution: see linda spinazzè, “‘cursus in clausula,’ an online analysis tool of latin prose,” proceedings of the third aiucd annual confer- ence on humanities and their methods in the digital ecosystem, ed. francesca tomasi, roberto rosselli del turco, and anna maria tammaro, association for computing machinery (acm) inter- national conference proceedings series (icps), (new york, ), : – . on the subject of the ars dictaminis, see giles constable, letters and letter-collections (turn- hout, ), – : “this tendency towards a personalization of style and contents in eleventh- and twelfth-century epistolography was paralleled by a tendency, which was in some respects contra- dictory, towards formalization, which was represented by the emergence of the discipline known as the dictamen or ars dictandi, with teachers (dictatores), text-books (artes or summae dictaminis), and col- lections of model letters (formularies). although dictamen now emerged for the first time as a discipline with clearly formulated rules, it had roots deep in the past and was connected in ways which are still not fully understood with the epistolographical rules and traditions which went back to antiquity. . . . in the course of the twelfth century the number both of teachers and of text-books of dictamen spread rapidly, first in italy and later, in the second half of the century, north of the alps. various schools developed with different styles, as at bologna and orleans; and although in the earlier twelfth century a certain number of writers, like st. bernard and peter the venerable, who knew about dictamen, did not observe its rules, its influence was all but universal by the end of the century.” speculum /s (october ) this content downloaded from . . . on october , : : am all use subject to university of chicago press terms and conditions (http://www.journals.uchicago.edu/t-and-c). 
could be present during the various phases of composition. we know of bernard’s increasing discomfort concerning the fact that he felt obliged to delegate the writing of his letters and sermons to assistants, and his dissatisfaction with some of them when it came to grasping the sensus of his message. perhaps to his own frustration, bernard was increasingly forced to have faith in the reliability of such scribes as nicholas to reformulate his initial dictation in a letter that conformed to the style and content bernard had intended. the extra letters would have received far less revision, resulting in the kind of hybrids that float towards middle ground in the figure.

the sermons

in this second visualization we put nicholas’s word to the test. firstly, assuming that the secretary speaks the “truth” in his letter to henry the liberal, which leclercq cited as the most striking example of his plagiarism, we expect that a small number of sermons that occur in bernard’s de diversis, namely , , , , , , and , could be attributed to him instead. on the side, we test if his claims to hugh’s commentaries on the psalms, which were also mentioned in the letter, hold any ground. in a second phase, we follow up on henri rochais’ conclusions that bernard, not nicholas, wrote de diversis , , and . in fact, the de diversis collection in its entirety is worth testing here, as it suffers from some considerable issues of authenticity, provenance, and dating and might contain other traces of nicholas’s presence. the corpus comprises an assembly of unpolished and rudimentary sermons found in various, heterogeneous manuscripts, conceivably written down by secretaries and granted little revision by bernard (unless they were reused elsewhere). bernard never disseminated the de diversis sermons himself.
they were gathered after his death and passed on for several centuries until jean mabillon enumerated and published them in the seventeenth century. leclercq and rochais maintained mabillon’s structure in their edition. secondly, we have included the sermones super cantica canticorum, bernard’s literary masterpiece, as the cleanest possible specimen of bernard’s literary style to benchmark against these texts.

“multitudo negotiorum in culpa est, quia dum scriptores nostri non bene retinent sensum nostrum, ultra modum acuunt stilum suum, nec videre possum quae scribi praecepi,” bernard, ep. (sbo : – ).
“the medieval idea of truth . . . was subjective and personal rather than, as today, objective and impersonal”: see constable, “forgery and plagiarism,” .
bernard, de diversis , , , , , , (sbo / : – , – , – , , , , – ). nicholas used the phrase “aliosque sermones” in the prefatory letter in ms harley , recueil d’études, : , referring to the aforementioned sermons, a few other texts by bernard, and, finally, hugh of st. victor’s chapters on the psalms gathered in the second book of his miscellanea (pl : ).
rochais, “saint bernard est-il l’auteur des sermons , et ?,” – .
leclercq, sbo / : – .
françoise callerot, “introduction,” in bernard of clairvaux, sermons divers, ed. jean leclercq, henri rochais, and charles h. talbot, vols. (paris, ), : .
bernard must have started composing its beginnings around the end of , but never commentated the entire song of songs. they are, nevertheless, regarded as his life’s work and greatest literary achievement: see leclercq, sbo :xv–xvi. leclercq argues bernard must have passed away before he

fig. .
network visualization of bernard of clairvaux’ sermones super cantica canticorum and sermones de diversis compared to nicholas of montiéramey’s letters and sermons and hugh of st victor’s commentaries on the psalms.

fig. . principal components analysis of bernard of clairvaux’ sermones super cantica canticorum and sermones de diversis compared to nicholas of montiéramey’s letters and sermons and hugh of st victor’s commentaries on the psalms.

fig. (k-nn) and fig. (pca) feature the results of matching up these texts. firstly, when examining the visualizations, it is striking how the diversity of bernard’s de diversis is indeed captured. pca, especially, demonstrates a discernible stylistic incoherence, as the samples burst open all over the plot (especially along the vertical axis of the second principal component), at times suggesting the interference of writers other than nicholas or bernard in their composition. other samples gravitate in between nicholas and bernard, and in some cases nicholas’s influence on the style is undeniable. before discussing some contingent subjects of interest, let us focus on the primary questions at hand. de diversis , , , , , , and , which nicholas included in the letter to count henry the liberal (they are split up in two red samples labeled with le_ of leclercq), do not betray an obvious affinity to nicholas’s style (although le_ is not far off).
neither are they unambiguously bernard’s. both samples diverge strongly from bernard’s cluster and seem too hybrid in nature to be restrained to either of the authors’ clusters. the case rather demonstrates how difficult it is to defend such concepts as “single authorship” and “textual theft” in a medieval context: the le_ samples are clearly not of a “singular” style (neither nicholas’s style nor bernard’s) but defy classification. in fact, if we compare both k-nn and pca, nicholas’s influence in sample le_ seems considerably larger than bernard’s. it has by now become an untenable simplification to argue that nicholas has stolen these sermons, especially if we review the results of our second case, that of de diversis , , and (four red samples labeled with ro_ of rochais): although the sermons emanate from bernardian thought, k-nn and pca unambiguously cluster all three sermons together with those written by nicholas, not bernard.

there are some less straightforward developments on the side. hugh of st. victor’s presence in both attribution problems remains somewhat unclear. nicholas included hugh’s commentaries on the psalms in his collection, yet figs. and show that he was unlikely to have been the (only) author of this incohesive text (see the purple hu_ samples, of which hu_ comes closest to nicholas). vice versa, de diversis (first part of the dubious ro_ samples) is collected in hugh of st. victor’s miscellanea. would nicholas have known hugh well, and would they have collaborated before the latter’s death in ? there is no proof of a direct acquaintance. nicholas’s musical sequences seem largely based on those of adam of st. victor, hugh’s choirmaster, but these texts enjoyed a popular circulation, so the similarity does not necessarily presuppose a personal tie. for bernard and hugh, however, the connections are less far-fetched. we know they corresponded. hugh incorporated an entire letter he received from bernard in his acclaimed masterpiece, the de sacramentis. likewise, figs. and show that samples di_ and di_ of bernard’s de diversis bear some affinity with hugh’s commentaries. these samples comprise the very last additions to bernard’s corpus, de diversis – . they are shorter texts, which have not always been accompanied by the preceding sermons but must have circulated as a separate unit in manuscript transmission. mabillon has argued in a footnote that their provenance differs from that of the other de diversis sermons, thereby perhaps showing some wariness as to the authenticity of the works.

manuscript studies have argued that they can only be of hugh’s hand: see joseph de ghellinck, “hugues de saint-victor,” in dictionnaire de théologie catholique, vols. (paris, – ), : . although he admits that the miscellanea is a confluence of the apocryphal and the authentic, de ghellinck based his findings on the indiculum of hugh’s writings. the commentaries on the psalms often occur among hugh’s authentic works in the manuscript transmission. this has been confirmed in the exhaustive study of the dissemination of hugh’s oeuvre in rudolf goy, die Überlieferung der werke hugos von st. viktor: ein beitrag zur kommunikationsgeschichte des mittelalters, monographien zur geschichte des mittelalters (stuttgart, ), – .
had the chance to finish his work, but it is more likely that bernard never had the intention of discussing all the canticles and has delivered us a finished work of literature: see wim verbaal, “les sermons sur le cantique de saint bernard: un chef d’oeuvre achevé?,” collectanea cisterciensia ( ): – .
“since nicolas is known for his plagiarism and incorporated the work of hugh of st. victor in the collection of his own opera dedicated to count henry, the suspicion arises that nicolas modeled his work directly on that of hugh’s colleague, adam of st. victor”: john f. benton, “nicolas of clairvaux and the twelfth-century sequence,” traditio ( ): – , at .
bernard, ep. , “ad magistrum hugonem de sancto victore,” sbo : – ; also see hugh feiss, “bernardus scholasticus: the correspondence of bernard of clairvaux and hugh of saint victor on baptism,” in bernardus magister: papers presented at the nonacentenary celebration of the birth of saint bernard of clairvaux, ed. john r. sommerfeldt, commentarii cistercienses (spencer, ma, ), – .
“adding to the complications of de sacramentis as a text is hugh’s incorporation of passages not only from his own prior works but also from other theologians, patristic and contemporary, sometimes named but often without any attribution at all. in this respect, hugh nicely represents the overall concern of twelfth-century authors to synthesize their sources”: paul rorem, hugh of st. victor, great medieval thinkers (new york, ), .
the earliest editions of bernard’s opera omnia did not yet include these sermons. it was not until its publication by printer johann herwagen of basel in that de diversis – found its place among the sermones de diversis: see gerhard b. winkler, trans., bernhard von clairvaux: sämtliche werke, vols. (innsbruck, – ), : lh, to. jean mabillon relied on jacobus pamelius’s edition of these sermons; see note to pl : .
jean mabillon was aware of the fact that herwagen’s and pamelius’s editions were to be approached with great caution when it comes to attribution. herwagen’s and pamelius’s collections of bede’s works are examples of how these editors “ignored and altered rubrics, expurgated passages, disregarded section breaks, and lied outright about the bedan origins of their material:” nathan j. ristuccia, “the herwagen preacher and his homiliary,” sacris erudiri ( ): .
jean leclercq, “modern psychology and the interpretation of medieval texts,” speculum ( ): – , at .

although they might be hugh’s, we find that the textual style of both bernard’s de diversis and hugh’s commentaries is too unreliable to provide closure. the case for the triangular writing relationship between these authors is compelling, but there is insufficient historical proof to corroborate speculations of a collaboration between nicholas and hugh.

conclusion

jean leclercq, in aspiring to discern the psychological personality of the author behind any given historical text, conceded the difficulty of infiltrating the “screen of rhetoric” so characteristic of twelfth-century literature, referring to its predilections for imitation and formal rigidness. the surface of the medieval text can strike one as impenetrable. in a similar vein, giles constable has argued for medieval epistolography (and he may well have found the statement applicable to all twelfth-century texts) that “style alone is not a reliable guide to authorship,” and that “even today some of the works of nicholas of montieramey, who was clearly an accomplished mimic, are not easy to distinguish from those of bernard and other writers.” yet this trait of medieval texts, which is primarily qualitative and open to subjective interpretation, is elusive only in a close-reading approach and seems not to present a problem when form is quantified in a distant-reading approach.
computational stylistics disables the distracting semantics in which nicholas’s style is embedded, and patterns the turns of phrase that reveal his presence (or that of a chancery working under his lead). it can only follow, then, that nicholas’s reputation of being bernard’s pale shadow is a construction by readers who have undoubtedly experienced the difficulty of peering through the curtain of imitation, citation, and formalization when it comes to recognizing the author behind the text. it turns out that, if nicholas’s style is not “distinguished,” in the sense that it can be judged as of a high literary value, it is nonetheless distinguishable. this does not simply mean that the application of computational stylistics results merely in giving an individualized coloration to the question of authorship. a glance at each of the figures in this article demonstrates the interconnectedness (or “infinite shadings,” in constable’s words) laid out as networks between these two authors. computational stylistics therefore does not simply force us to choose a side in the medieval authorship dilemma, which is infinitely fought out along the axes of the “individual” and the “distributional.” it rather becomes these axes and reenacts the tension field as is. neither is nicholas’s and bernard’s collaboration depicted as a hierarchical “author-scribe” relationship in one-sided text classifications, nor must we seek refuge in a stopgap conception of infinite authority and authorship. this approach embraces both an acknowledgement that the practice of cooperative medieval authorship is complex, and a refusal to believe that medieval authorship is interminably diffuse. therefore, computational stylistics provides valuable tools with which to validate or contradict contrasting theories with objective material, taking the voices from the past at face value and opening up avenues to rethink our approach to medieval texts in literary theory, text editing, and historical studies.
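
The notion of "quantifying form" can be illustrated with a minimal sketch. It is not the pipeline used in the article (which relies on k-NN networks and PCA over a long function-word list); it is a simplified stand-in that builds relative-frequency profiles over a few function words from the appendix table and assigns a sample to its nearest reference by Manhattan distance. The names `profile`, `manhattan`, `nearest_label`, and the toy strings are assumptions introduced for illustration only.

```python
from collections import Counter

# A handful of function words that appear in the article's appendix table;
# the study itself uses a much longer list.
FUNCTION_WORDS = ["et", "in", "qui", "non", "sed", "ut"]


def profile(text, vocabulary=FUNCTION_WORDS):
    """Return the relative frequency of each function word in `text`."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocabulary]


def manhattan(a, b):
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_label(sample, labelled_texts):
    """Return the label whose profile lies closest to the sample (1-NN)."""
    target = profile(sample)
    return min(
        labelled_texts,
        key=lambda label: manhattan(target, profile(labelled_texts[label])),
    )


# Toy strings invented for illustration; they are NOT data from the study.
references = {
    "author_a": "et in et qui et non et in",
    "author_b": "sed ut sed non ut sed ut qui",
}
print(nearest_label("et et in qui sed et", references))
```

Because only function words are counted, the content words of a sample play no role in the comparison, which is the sense in which the semantics are set aside.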
constable, letters and letter-collections, .
nicholas has often been accused of having a style filled with platitudes or even of having no style of his own at all; see n. .
constable, “forgery and plagiarism,” .

fig. . principal components analysis loadings for fig. . function words that swarm around author’s cluster are most distinctive for that author.

fig. . principal components analysis loadings for fig. . function words that swarm around author’s cluster are most distinctive for that author.

appendix of tables

table most frequent function words for figs. – (the letters)

–          –          –          –          –          –
et         que        sine       tamquam    iuxta      donec
in         sibi       nam        ante       verum      cito
qui        pro        ita        utique     itaque     nimirum
non        enim       magis      contra     pote       numquam
is         vel        vero       nullus     secundum   cur
hic        ex         apud       igitur     multum     plane
quod       autem      bene       certe      quando     absque
ego        ne         tantus     aliquis    alter      quatenus
sed        per        inter      dum        ibi        proinde
de         aut        immo       semper     tunc       ceterum
ut         tamen      propter    videlicet  sane       longe
ad         iam        quippe     quidam     uterque    pariter
ille       quo        quoque     quisquis   nemo       facile
si         quidem     idem       siquidem   omnino     at
ab         sic        ac         sub        sive       inde
cum        iste       solum      satis      profecto   simul
quis       alius      denique    usque      nonne      ubique
suus       nisi       talis      quantum    prius      tandem
ipse       super      quoniam    numquid    porro      ideo
quam       etiam      adhuc      neque      ample      coram
meus       tam        atque      an         alioqui    huiusmodi
quia       ergo       quantus    post       vere       iterum
nec        ubi        quomodo    unquam     utinam     rursus
nos        sicut      etsi       quasi      libenter   quisque
noster     nunc       unde       minus      interim    parve

table most frequent function words for figs. – (the sermons)

–          –          –          –          –          –
et         nos        nam        uterque    iuxta      seipse
in         per        quoniam    aliquis    quisquis   item
qui        ex         inter      tunc       videlicet  quicumque
non        autem      denique    solum      apud       an
hic        noster     magis      sane       profecto   donec
is         que        nunc       quando     scilicet   certe
sed        vel        unde       igitur     prius      vere
ad         ergo       quidam     ante       nemo       quisque
ille       quidem     sine       talis      parve      absque
quod       tamen      propter    post       porro      interim
ut         iste       quasi      bene       plane      unquam
de         pro        tam        nullus     ibi        numquam
ego        iam        atque      sub        contra     quantum
cum        alius      quomodo    omnino     immo       pote
suus       ne         quoque     usque      nonne      prorsus
table (continued)

–          –          –          –          –          –
ab         etiam      tamquam    semper     at         semetipse
si         aut        ac         quippe     nimirum    pariter
ipse       sic        tantus     sive       nihilominus amen
quis       sicut      idem       alter      primum     proinde
quia       quo        neque      minus      propterea  satis
sibi       nisi       utique     etsi       verum      huiusmodi
meus       vero       adhuc      inde       nempe      numquid
enim       super      dum        siquidem   una        hinc
nec        ita        quantus    itaque     multum     aliquando
quam       ubi        secundum   ideo       longe      prae

table description of sample contents ( , words) for bernard’s intra corpus (brevis publication) in figs. –
sample ( , words): sample_n; contents: sbo index and paragraph
in_ epp. . –
in_ epp. . ff., , , , , , , , . –
in_ epp. . ff., . –
in_ epp. . ff., , . –
in_ epp. . ff., , . –
in_ epp. . ff., , , , , , , .
in_ epp. . ff., , , , , , . –
in_ epp. . ff., . –
in_ epp. . ff., , , , . –
in_ epp. . ff., , , , , , , , , . –
in_ epp. . ff., , . –
in_ epp. . ff., , , , , , , , . –
in_ epp. . ff., , , , , . –
in_ epp. . ff., , . –

table description of sample contents ( , words) for bernard’s intra corpus (perfectum publication) in figs. –
sample ( , words): sample_n; contents: sbo index and paragraph
in_ epp. , , , . –
in_ epp. . ff., , , , , , , , .
in_ epp. . ff., ,
in_ epp. . ff., , , , , , , , , , ,
in_ epp. , , , , , , , , , , . –
in_ epp. , , , , , , , , , , , , .
in_ epp. . ff., , , , , , , , , , , .
in_ epp. . ff., , , , , ,
in_ epp. . ff., , , , , , , , , , ,
in_ epp. , , , , , , , , .
in_ epp. , , , , , , , , , , , , , , , ,
in_ epp. . ff., , , , .
in_ epp. , , , , , , , , , , , , , .
in_ epp. . ff., , , , .
in_ epp. . ff., , , , , , .
in_ epp. . ff., , , , , , , , , .
in_ epp. . ff., , , , , , , , , ,
in_ epp. . ff., , , , ,
in_ epp. , , , , , , , , , . –
in_ epp. , , , , , , , , , , , , , , ,
in_ epp. . ff., , , , , , , , , , , , , , .
in_ epp. , , , , , , , , , ,
in_ epp. . ff., , , , , , ,
in_ epp. , , , , , , , , , , , , , , , . –
in_ epp. , , , , , , , , , , , .

table description of sample contents ( words) for bernard’s extra corpus (pre- ) in figs. –
sample ( , words): sample_n; contents: sbo index and paragraph
ex_ epp. , , , , , , , , , .
ex_ epp. . ff., , , , , , , , , , , ,
ex_ epp. . ff., , , , , , , . –

table description of sample contents ( , words) for bernard’s extra corpus ( – ) in figs. –
sample ( , words): sample_n; contents: sbo index and paragraph
ex_ epp. , , , , , , , , , , , .
ex_ epp. , , , , , , ,
ex_ epp. . ff., , , , , , , , , , , , ,
ex_ epp. , , , , , , , , , , , ,

table description of sample contents ( , words) for bernard’s extra corpus (post- ) in figs. –
sample ( , words): sample_n; contents: sbo index and paragraph
ex_ epp. , , , , , , . –
ex_ epp. . ff., , , , , , , , , , , , , . –
ex_ epp. . ff., , , , , , , , , , , . –
ex_ epp. . ff., , , , , , , , , , , , , ,

table description of sample contents ( , words) for nicholas’s sermons and letters in figs. –
sample ( , words): sample_n; contents: pl (vol:col.)
ep_ ep.
( : a– b) ep. ( : a– c) ep. ( : b– a) ep. ( : c– c) ep. ( : b– b) ep. ( : d– c) ep. ( : b– c) ep_ ep. ( : d– a) ep. ( : d– a) ep. ( : b– c) ep. ( : b– b) ep. ( : c– b) ep. ( : c– d) ep_ ep. ( : b– c) ep_ ep. ( : d– a) ep. ( : d– d) ep. ( : b– a) ep. ( : a– b) ep. ( : b– d) ep. ( : c– c) ep. ( : a– d) ep. ( : c– b) ep. ( : a– c) ep_ ep. ( : b– a) ep. ( : c– a) ep. ( : a– a) ep. ( : b– a) ep. ( : b– d) ep_ ep. ( : a– c) ep. ( : a– c) ep. ( : d– c) ep. ( : d– a) ep. ( : d– a) ep. ( : c– c) ep. ( : b– c) ep. ( : a– d) ep. ( : d– a) ep_ ep. ( : c– a) sm_ sm. ( : c– a) ep. ( : c– a) sm. ( : a– b) ep. ( : b– c) sm_ sm. ( : c– c) ep. ( : d– d) . . hom. ( : c– a) sm_ sm. ( : c– b) sm_ . . hom. ( : b) sm. ( : b– b) sm. ( : b– a) sm_ sm. ( : c– b) sm. ( : b– d) sm. ( : c– c) sm_ sm. ( : a– a) sm. ( : d– b) sm. ( : a– c) sm_ sm. ( : c– d) sm. ( : b– b) sm. ( : d– a) sm_ sm. ( : c– d) sm_ sm. ( : b– c) sm. ( : c– c) sm. ( : d– d) sm. ( : a– a) sm_ sm. ( : b– a) sm. ( : b– d) sm_ sm. ( : a– a) sm. anonym. ( : b– d) sm_ sm. anonym. ( : a– b) sm. ( : b– c) sm. ( : b– b) all use sub s this content downloaded from . ject to university of chicago press term . . on o s and conditions ctober , : : am (http://www.journals.uchicago.edu/t-and-c). table description of sample contents ( , words) for bernard’s sermones de diversis in figs. – sample ( , words) contents sample ( , words) contents sample_n sbo index and paragraph sample_n sbo index and paragraph di_ sm. . – di_ sm. . ff., . – di_ sm. . ff., . – di_ sm. . ff., . – di_ sm. . ff., . – di_ sm. . ff., , , , . – di_ sm. . ff., . – di_ sm. . ff., , , , di_ sm. . ff., . – di_ sm. , , . di_ sm. . ff., . di_ sm. . ff., , , , . di_ sm. . – di_ sm. . ff., , , , , di_ sm. . ff., . – di_ sm. , , , . – di_ sm. . ff., , . – di_ sm. . ff., , , , , di_ sm. . ff., , . – di_ sm. , , , , , , di_ sm. . ff., . – di_ sm. , , . di_ sm. . ff., . – di_ sm. . ff., , . – di_ sm. . ff., . – di_ sm. . ff., . – di_ sm. . ff., , . di_ sm. . ff., , , . di_ sm. 
. ff., . – di_ sm. . ff., , . – di_ sm. . ff., . – di_ sm. . ff., , di_ sm. . ff., . – di_ sm. , , , . – di_ sm. . ff., . – di_ sm. . ff., , , . – di_ sm. . ff., . – di_ sm. . ff., , , , . – di_ sm. . ff., , . – di_ sm. . ff., , , , , , di_ sm. . ff., . – di_ sm. , , , , , . – di_ sm. . ff., . – di_ sm. . ff., . – di_ sm. . ff., , . – di_ sm. . ff., , . – all use subject this content downloaded from to university of chicago press t s . . . on erms and conditio table description of sample contents ( , words) for bernard’s sermones super cantica canticorum in figs. – sample ( , words) contents sample ( , words) contents sample_n sbo index and paragraph sample_n sbo index and paragraph sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . octob ns (ht er , : : am tp://www.journals.uchicago.edu/t-and-c). table (continued) sample ( , words) contents sample ( , words) contents sample_n sbo index and paragraph sample_n sbo index and paragraph sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . 
– . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . sc_ sm. . – . table (continued) all use subjec this content downloaded from . t to university of chicago press term . . on oct s and conditions ( ober , : : am http://www.journals.uchicago.edu/t-and-c). bernard of clairvaux and nicholas of montiéramey s table description of sample contents ( , words) for nicholas’s sermons and letters in figs. – sample ( , words) contents sample ( , words) contents sample_n pl (vol:col.) sample_n pl (vol:col.) ep_ ep. ( : a– b) ep_ ep. ( : c– d) ep. ( : b– a) ep. ( : a– c) ep. ( : b– b) ep. ( : c– c) ep_ ep. ( : b– c) ep. ( : d– c) ep. ( : d– a) ep_ ep. ( : d– a) ep. ( : b– b) ep. ( : b– c) ep. ( : c– d) ep_ ep. ( : c– c) ep_ ep. ( : d– a) ep. ( : c– b) ep. ( : b– a) ep_ ep. ( : b– c) ep. ( : b– d) ep. ( : d– a) ep_ ep. ( : a– d) ep_ ep. ( : b– d) ep. ( : a– c) ep. ( : a– b) ep. ( : c– a) ep. ( : c– c) ep. ( : b– a) ep. ( : c– b) ep_ ep. ( : a– c) ep_ ep. ( : b– a) ep. ( : d– c) ep. ( : a– a) ep. ( : d– a) ep. ( : b– d) ep_ ep. ( : a– a) ep. ( : a– c) ep. ( : b– c) ep_ ep. ( : d– c) ep. ( : d– a) ep. ( : d– a) ep_ ep. ( : c– a) ep. ( : c– c) ep. ( : c– a) ep. ( : a– d) ep. ( : b– c) ep. ( : d– c) sm_ sm. ( : c– c) sm_ sm. ( : a– c) sm_ sm. ( : d– b) sm. ( : b– b) sm. ( : b– b) sm_ sm. ( : c– b) sm_ sm. ( : c– b) sm_ sm. ( : c– a) sm. ( : c– c) sm. ( : a– b) sm_ sm. ( : d– c) sm_ sm. ( : c– c) sm. ( : d– b) . . hom. ( : c– b) sm_ sm. ( : c– b) sm_ . . hom. ( : c– a) sm_ sm. ( : b– d) sm_ . . hom. ( : b) sm. ( : d– a) sm. ( : b– d) sm_ sm. ( : b– c) sm_ sm. ( : a– a) sm. ( : d– b) sm. ( : b– d) sm_ sm. ( : c– d) sm_ sm. 
( : a– a) sm_ sm. ( : a– a) sm. ( : a– d) sm_ sm. ( : b– d) sm_ sm. ( : a– c) sm. ( : a– a) sm. ( : b– b) sm. ( : b– d) sm_ sm. ( : c– d) sm_ sm. ( : a– d) sm. ( : c– d) sm_ sm. ( : a) sm_ sm. ( : a– c) sm. anonym. ( : b– d) sm_ sm. ( : a– b) sm. anonym. ( : a– b) sm. ( : b– b) sm_ sm. ( : b– d)

jeroen de gussem, ghent university (jedgusse.degussem@ugent.be)

university of groningen
measuring syntactical variation in germanic texts
heeringa, wilbert; swarte, femke; schüppert, anja; gooskens, charlotte
published in: digital scholarship in the humanities
doi: . /llc/fqx
document version: publisher's pdf, also known as version of record
citation for published version (apa): heeringa, w., swarte, f., schüppert, a., & gooskens, c. ( ). measuring syntactical variation in germanic texts. digital scholarship in the humanities, ( ), - . https://doi.org/ . /llc/fqx
measuring syntactical variation in germanic texts

wilbert heeringa
fryske akademy, the netherlands

femke swarte
faculty of arts, applied linguistics, university of groningen, the netherlands

anja schüppert
faculty of arts, european languages and cultures, university of groningen, the netherlands

charlotte gooskens
faculty of arts, applied linguistics, university of groningen, the netherlands and school of behavioural, cognitive and social sciences, university of new england, australia

abstract

we present two new measures of syntactic distance between languages. first, we present the ‘movement measure’, which measures the average number of words that have moved in sentences of one language compared to the corresponding sentences in another language. secondly, we introduce the ‘indel measure’, which measures the average number of words inserted or deleted in sentences of one language compared to the corresponding sentences in another language. the two measures were compared to the ‘trigram measure’ which was introduced by nerbonne & wiersma ( , a measure of aggregate syntactic distance. in nerbonne, j. and hinrichs, e. (eds.) linguistic distances workshop at the joint conference of international committee on computational linguistics and the association for computational linguistics, sydney, july, , pp. – .).
we correlated the results of the three measures and found a low correlation between the results of the movement and indel measure, indicating that the two measures represent different kinds of linguistic variation. we found a high correlation between the results of the movement measure and the trigram measure. the results of all of the three measures suggest that english is syntactically a scandinavian language. because of our unique database design we were able to detect asymmetric relationships between the languages. all three measures suggest that asymmetric syntactical distances could be part of the explanation why native speakers of dutch more easily understand german texts than native speakers of german understand dutch texts (swarte ).

correspondence: wilbert heeringa, fryske akademy, p.o. box , ab. ljouwert, the netherlands, e-mail: wheeringa@fryske-akademy.nl

digital scholarship in the humanities, vol. , no. , . © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqx advance access published on june

introduction

textometry is a discipline in which knowledge is derived from corpora without predefined information models. macmurray and leenhardt ( ) describe textometry as an approach in which ‘a text possesses its own internal structure that would be difficult to analyze by manual means alone. by applying statistical and probabilistic calculations directly to the textual units of comparable texts in a corpus it becomes possible to analyze patterns and trends that would otherwise be obscured by the quantity of the textual units’ (p. ).
and ‘textometry consists of seeing the document through a prism of numbers and figures, producing information on the frequency counts of words, otherwise known as occurrences, whereas forms are a single graphical unit corresponding to several instances in the text’ (p. , see also lebart and salem ( ) and tufféry ( )).

in this article we utilize written texts for revealing language variation. language variation at different linguistic levels becomes apparent to a large extent when comparing written texts in different languages, especially lexical, orthographic, and syntactical differences.

lexical differences are differences in vocabulary or lexicon. in the following example english and german do not have any cognates, apart from the articles:

english: the boy teased the dog.
german: der junge neckte den hund.

on the other hand, pairs of sentences can be found where for each english word a german cognate is found. cognates are words which have a common etymological origin and normally a similar shape. example:

english: the man saw a house.
german: der mann sah ein haus.

in this example the differences are orthographic differences. orthographic differences may reflect historical developments of the pronunciation, for example, english saw versus german sah. however, orthographic differences do not always reflect linguistic differences; they may also be the result of differences in spelling conventions, for example, english house versus german haus.

syntax is ‘the study of the principles and processes by which sentences are constructed in particular languages’ (chomsky , p. ). between germanic languages like english and german relatively large syntactical differences can be found, for example:

english: then she said that she will come tomorrow
german: dann sagte sie dass sie morgen kommen wird

several studies have proposed how to measure lexical, orthographic, and syntactic distances using parallel corpora.
for example, van bezooijen and gooskens ( ) measured lexical distances between dutch, afrikaans, and frisian on the basis of written texts. they also measured orthographic distances using the same material. zulu, botha, and barnard ( ) measured orthographic distance between eleven south african languages. a procedure for measuring syntactical distances between language varieties was introduced by nerbonne and wiersma ( ), who provided a foundation for measuring syntactic differences between corpora. their method uses part-of-speech (pos) trigrams as an approximation to syntactic structure. the frequencies of the trigrams of two corpora are compared for statistically significant differences.

in this article we focus on the measurement of syntactical distances between a small set of five germanic languages. we will apply the method of nerbonne and wiersma ( ) and refer to this as the ‘trigram measure’ throughout this article. in addition, we introduce two new methods for measuring syntactical variation. using the first method, we measure the average number of word positions that a word in a sentence in language a has moved compared to the corresponding sentence in language b. we call this the ‘movement measure’. the second method measures the average number of words found in a sentence in language a that are missing in the corresponding sentence in language b, and the number of words in a sentence in language b that are missing in the sentence in language a.
in other words, the number of words which are inserted or deleted in a sentence in language a compared to the corresponding sentence in language b is measured. we call this the ‘indel’ measure.

we will compare the results of the two methods to results of the trigram method to answer the following questions:

(1) do the movement measure and the indel measure yield different results?
(2) does the trigram method resemble one of the other methods in particular?

we focus on the germanic language group, more specifically on danish, dutch, english, german, and swedish. in section we give a brief overview of related research concerning syntactical measurements. section describes the data source and the way in which syntactical distances are measured. the results of the distance measurements are presented in section . in section the research questions are addressed. finally, general conclusions will be drawn in section . in this section we will also discuss how the methods can be validated.

previous research

to measure syntactical distances between languages we explored the literature to find a suitable distance measure. we found two kinds of approaches dominating. one is based on categorical syntactical features. this approach is typically used when material from dialect atlases is used. another is based on counting and comparing frequencies of trigrams of pos tags. this approach works well when large corpora are available with the words being tagged. the two approaches are discussed in sections . and . , respectively. in section . we will motivate our choice.

categorical syntactic variables

spruit ( ) measured syntactic distances between local dutch varieties, using data from two volumes of the syntactic atlas of the dutch dialects. the atlas volumes contain a large number of maps showing the geographic distribution of syntactic phenomena.
the maps in the first volume represent binary syntactic features, and those in the second volume represent syntactic features, in all , features. an example concerns the complementizer of the comparative if-clause in the dutch sentence het lijkt wel alsof er iemand in de tuin staat, ‘it looks as if there is someone in the garden’. four examples of binary features are complementizer=of, complementizer=of dat, complementizer=dat, complementizer=alsof. each feature is either true or false, and therefore binary. the distance between two dialects was equal to the total number of shared features, and therefore, the distance will vary between and , .

szmrecsanyi ( ) investigated variation in british english dialects by using the freiburg english dialect corpus (fred), a naturalistic speech corpus sampling interview material from different locations in thirty-eight different counties all over the british isles, excluding ireland. fred consists of texts, which total about . million words of text. the corpus was analysed to obtain text frequencies of sixty-two morphosyntactic features, yielding a structured database that provided a sixty-two-dimensional frequency vector per locality. the feature frequencies were subsequently normalized to frequency per , words (because textual coverage in fred varies across localities) and log-transformed to deemphasize large frequency differentials and to alleviate the effect of frequency outliers. the resulting 38 × 62 table (on the county level, that is, thirty-eight counties characterized by sixty-two feature frequencies each for the full data set) was converted into a 38 × 38 distance matrix using euclidean distance (the square root of the sum of all squared frequency differentials) as an interval measure. this distance matrix was subjected to cluster analysis to find dialect groups.

grieve ( ) analysed a word corpus representing the letter to the editor register as written between and in cities from across the usa.
the letters were downloaded from the online archives of one or more newspapers published in cities. a total of grammatical alternation variables were measured and mapped across the city sub-corpora. an alternation variable is ‘a set of distinct linguistic forms that have the same referential meaning’ (p. ). the percentage of each variant is calculated as the quotient of the total number of tokens of that variant in the corpus and the total number of tokens of all the variants of that alternation variable in the corpus, multiplied by 100 (see also grieve ).

spruit ( ), szmrecsanyi ( ), and grieve ( , ) used syntactic alternation variables (or linguistic variables) which were found in a dialect atlas (spruit, ) or derived from written text corpora (szmrecsanyi, ; grieve, , ).

frequencies of pos categories

hirst and feiguina ( ) presented a method for authorship discrimination that is based on the frequency of bigrams of syntactic labels that arise from partial parsing of the text. with this method the authors obtained a high accuracy on discrimination of the work of anne and charlotte brontë (brontë, , , ), both alone and combined with other classification features. high accuracies are achieved even on fragments of short texts of little more than words long.
while hirst and feiguina ( ) focussed on determining the authorship of texts, nerbonne and wiersma ( ), lauttamus et al. ( ), wiersma et al. ( ), and nerbonne et al. ( ) measured the impact of l1 on l2 syntax in second language acquisition on the basis of corpora of english of finnish australians. they presented an application of a technique from language technology to tag a corpus automatically and to detect syntactic differences between two varieties of finnish australian english, one spoken by the first generation and the other by the second generation. the technique compares frequencies of trigrams of pos categories as indicators of syntactic distance between the varieties and then examines potential effects of language contact. the frequency vectors were compared and analysed by using a permutation test, which resulted in both a general measure of difference and a list with the n-grams that are most responsible for the difference. the findings showed syntactic ‘contamination’ from finnish in the english of the adult first-generation speakers of finnish ethnic origin. the results show that we can attribute some interlanguage features in the first generation to finnish substratum transfer.

sanders ( ) extended the method and its application. he extended the method by using leaf-path ancestors of sampson ( ) instead of trigrams, which capture internal syntactic structure: every leaf in a parse tree records the path back to the root. the corpus used for testing is the international corpus of english, great britain (nelson et al., ), which contains syntactically annotated speech of great britain. the speakers were grouped into geographical regions based on place of birth. sanders showed that dialectal variation in eleven british regions from the international corpus of english, great britain (ice-gb) is detectable by the algorithm, using both leaf-ancestor paths and trigrams.
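the trigram idea described in this section can be illustrated with a minimal sketch. it assumes each corpus is already available as a list of pos-tag sequences (one per sentence); the function names are hypothetical. the distance used here, a sum of absolute differences between relative trigram frequencies, is a deliberately simple stand-in and not the permutation test used by nerbonne and wiersma.

```python
from collections import Counter
from itertools import islice


def pos_trigrams(tags):
    """Return the consecutive POS-tag trigrams of one tagged sentence."""
    return list(zip(tags, islice(tags, 1, None), islice(tags, 2, None)))


def trigram_counts(corpus):
    """Count POS trigrams over a corpus given as a list of tag sequences."""
    counts = Counter()
    for tags in corpus:
        counts.update(pos_trigrams(tags))
    return counts


def relative_frequency_difference(corpus_a, corpus_b):
    """Sum of absolute differences between relative trigram frequencies.

    Assumption: this simple distance only illustrates the idea of comparing
    trigram frequency vectors; it is not a significance test.
    """
    counts_a = trigram_counts(corpus_a)
    counts_b = trigram_counts(corpus_b)
    total_a = sum(counts_a.values()) or 1  # avoid division by zero
    total_b = sum(counts_b.values()) or 1
    trigrams = set(counts_a) | set(counts_b)
    # Counter returns 0 for trigrams that are absent from a corpus.
    return sum(
        abs(counts_a[t] / total_a - counts_b[t] / total_b) for t in trigrams
    )


# Hypothetical tag sequences, for illustration only.
corpus_a = [["DET", "NOUN", "VERB", "DET", "NOUN"]]
corpus_b = [["ADV", "VERB", "PRON"]]
distance = relative_frequency_difference(corpus_a, corpus_b)
```

two corpora with identical trigram distributions get a distance of zero under this sketch, and two corpora that share no trigram get the maximum value of two.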
our approach

spruit ( ), szmrecsanyi ( ), and grieve ( ) quantified syntactical language variation by using alternation variables. when using corpora as in the case of szmrecsanyi ( ) and grieve ( ), a set of features needs to be chosen. the choice of features may partly depend on the data, but will easily be subjective.

given the fact that we use corpora (see section . ) we prefer not to choose a set of features, but simply measure syntactical distances in terms of differences of sentence structure, regardless of what features are represented by those differences. we will introduce two new measures. the first one measures the average number of word positions that a word in a sentence in language a has moved compared to the corresponding sentence in language b. the second measure measures the number of words which are inserted or deleted in a sentence in language a compared to the corresponding sentence in language b.

the methodology of hirst and feiguina ( ) and nerbonne and wiersma ( ) likewise does not require the choice of a feature set and excels in simplicity. we will also consider their methodology and compare the results of our measures with their trigram measure.

nerbonne and wiersma’s ( ) method is sensitive only to sequential order, not to insertions, deletions, or phrase structure. sanders ( ) clearly increased the sensitivity of the measure he developed a great deal with respect to phrase structure. it might be argued that the movement and indel measures are potentially sensitive to higher levels of syntactic structure, perhaps even transformational structure (chomsky, ).
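the two measures introduced above can be made concrete with a small sketch. it assumes that a one-to-one word alignment between a sentence in language a and the corresponding sentence in language b is already given as pairs of zero-based positions; how such an alignment is obtained is not covered here. the function name, the alignment, and the choice to compare positions among aligned words only (so that inserted or deleted words do not count as movement) are assumptions made for illustration, not necessarily the authors' exact procedure.

```python
def movement_and_indel(len_a, len_b, alignment):
    """Return (movement, indel) for one sentence pair.

    len_a, len_b: number of words in the sentence in language A and B.
    alignment: list of (i, j) pairs linking word i in A to word j in B
               (zero-based, one-to-one); a hypothetical input format.
    """
    aligned_a = sorted(i for i, _ in alignment)
    aligned_b = sorted(j for _, j in alignment)
    # Rank each aligned word among the aligned words of its own sentence,
    # so that unaligned (inserted or deleted) words do not shift positions.
    rank_a = {pos: rank for rank, pos in enumerate(aligned_a)}
    rank_b = {pos: rank for rank, pos in enumerate(aligned_b)}
    shifts = [abs(rank_a[i] - rank_b[j]) for i, j in alignment]
    movement = sum(shifts) / len(shifts) if shifts else 0.0
    # Words without a counterpart in the other sentence.
    indel = (len_a - len(aligned_a)) + (len_b - len(aligned_b))
    return movement, indel


# english: then she said that she will come tomorrow
# german:  dann sagte sie dass sie morgen kommen wird
# The alignment below is an illustrative assumption.
alignment = [(0, 0), (1, 2), (2, 1), (3, 3), (4, 4), (5, 7), (6, 6), (7, 5)]
print(movement_and_indel(8, 8, alignment))  # prints (0.75, 0)
```

averaging these per-sentence values over all sentence pairs of a language pair would give corpus-level scores. because the sketch takes a sentence in a and its counterpart in b as separate inputs, it can be applied in both directions when the paired texts differ by direction.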
data source and measurement techniques

the data used in this article were collected in the context of a research programme which aims at finding linguistic and non-linguistic determinants of mutual intelligibility within the germanic, romance, and slavic language families. within this research programme, web-based intelligibility tests were performed and linguistic distances between the languages were measured (golubović, ; swarte, ).

data source

the basis of our analyses is a set of four english texts at the b /b level according to the common european framework of reference for languages. the texts were used as preparation exercises for the preliminary english test. the diploma is offered by university of cambridge esol examinations in england. the texts we use are obtained at englishaula.com. the texts are translated into each of the other four languages (dutch, danish, german, swedish) by native speakers of those languages. the translations are subsequently corrected by two other native speakers. all of the native speakers had completed a university education or were still studying at a university. they were aged between and years. just like the english texts, the translated texts consist of sixty-six sentences (approximately words) in total.

given five languages, we will analyse (5 × 4)/2 = 10 language pairs. our initial thought was to calculate the syntactic distance of a language pair by directly comparing the texts of the two languages to each other. however, by doing this, we would introduce a lot of noise in our data. we will illustrate this by an example. in the text child athletes we find the following sentences in english and german:

english: some doctors agree that young muscles may be damaged by training before they are properly developed.
german: einige ärzte behaupten, dass junge muskeln die noch nicht ausreichend entwickelt sind während des trainings beschädigt werden können.
the two sentences have about the same meaning, but syntactically they strongly differ. however, given the english sentence, it is possible to get a more literal german translation:

english: some doctors agree that young muscles may be damaged by training before they are properly developed.
german: einige ärzte denken, dass junge muskeln durch training geschädigt werden können bevor sie ausreichend entwickelt sind.

on the other hand, given the german sentence, a more literal translation in english is possible:

german: einige ärzte behaupten, dass junge muskeln die noch nicht ausreichend entwickelt sind während des trainings beschädigt werden können.
english: some doctors claim that young muscles which are still not properly developed can be damaged during the training.

since we want to model intelligibility (see section ), we should not calculate syntactic distances which are unnecessarily large. a reader who reads a sentence in a closely related language will likely try to match the sentence with the most literal translation in his/her own language.

therefore, to obtain the data set that our analysis will be based on, each of the available texts in danish, dutch, english, german, and swedish is ‘translated back’ into each of the other languages as literally as possible. importantly, the texts are translated as literally as possible with respect to syntax, but not necessarily with respect to lexicon, as this is not within the scope of this article. however, the translations are made so that the sentences are still grammatically correct. these translations are language specific, i.e. the danish text is translated in a different way from swedish than from german, for example. note that we modified only the targets
using models of lexical style to quantify free indirect discourse in modernist fiction

doi: . /llc/fqv corpus id:

@article{brooke usingmo, title={using models of lexical style to quantify free indirect discourse in modernist fiction}, author={j. brooke and a. hammond and graeme hirst}, journal={digit. scholarsh. humanit.}, year={ }, volume={ }, pages={ - } }

j. brooke, a. hammond, graeme hirst. published in digit. scholarsh. humanit. fields: art, computer science.

modernist authors such as virginia woolf and james joyce greatly expanded the use of ‘free indirect discourse’, a form of third-person narration that is strongly influenced by the language of a viewpoint character. unlike traditional approaches to analyzing characterization using common words, such as those based on burrows ( ), the nature of free indirect discourse and the sparseness of our data require that we understand the stylistic connotations of rarer words and expressions which…

topics from this paper: collocation, joyce, neural coding, lexicon

citations

modeling modernist dialogism: close reading with big data. a. hammond, j. brooke, graeme hirst. art.
detecting direct speech in multilingual collection of th-century novels. joanna byszuk, m. wozniak, ... m. eder. computer science. lt hala.
are fictional voices distinguishable? classifying character voices in modern drama. krishnapriya vishnubhotla, a. hammond, graeme hirst. computer science. latech@naacl-hlt.
a dutch coreference resolution system with an evaluation on literary fiction. andreas van cranenburgh. computer science.
computational text analysis within the humanities: how to combine working practices from the contributing fields? j. kuhn. computer science. lang. resour. evaluation.
who’s afraid of virginia woolf? readers’ responses to experimental techniques of speech, thought and consciousness presentation in woolf’s to the lighthouse and mrs dalloway. giulia grisot, kathy conklin, v. sotirova. art.
the double bind of validation: distant reading and the digital humanities' “trough of disillusionment”. a. hammond. engineering.
the multivalent moment in jean-pierre de caussade’s l’abandon à la providence divine and virginia woolf’s mrs. dalloway. d. j. barclay. philosophy.
gutentag: a user-friendly, open-access, open-source system for reproducible large-scale computational literary analysis. a. hammond, j. brooke. computer science. dh.
study of linguistic features incorporated in a literary book recommender system. h. alharthi, d. inkpen. computer science. sac.

references

a corpus linguistic approach to literary language and characterization: virginia woolf's the waves. giuseppina balossi. psychology.
performing gender: automatic stylistic analysis of shakespeare's characters. s. r. hota, s. argamon, moshe koppel, iris zigdon. art.
a bayesian mixed effects model of literary character. david bamman, t. underwood, noah a. smith. computer science. acl.
burrowing into translation: character idiolects in henryk sienkiewicz's trilogy and its two english translations. jan rybicki. philosophy, computer science. lit. linguistic comput.
computation into criticism: a study of jane austen's novels and an experiment in method. j. burrows. mathematics.
‘a few simple words’ of interior monologue in ulysses: reconfiguring the evidence. w. mckenna, a. antonia. philosophy.
a computational analysis of style, affect, and imagery in contemporary poetry. justine t. kao, dan jurafsky. computer science. clfl@naacl-hlt.
automatic recognition of speech, thought, and writing representation in german narrative texts. a. brunner. sociology, computer science. lit. linguistic comput.
tracking point of view in narrative. j. wiebe. computer science. comput. linguistics.
mimesis: the representation of reality in western literature - new and expanded edition. e. auerbach, edward w. said, willard r. trask. art.

bicentennial bits and bytes
the pittsburgh digital frankenstein project
rikk mulligan | elisa beshero-bondar | matt lavin | jon klancher
@critrikk | @epyllia | @mjlavin | @jklancher
mla : saturday jan. @ : pm; sheraton riverside suite
link to these slides: http://bit.ly/bicfrankmla

hello. thank you for sharing your time with us, especially late on a saturday afternoon. just to make sure you are in the right place, this is session : bits and bytes: the pittsburgh digital frankenstein project. i and my fellow panelists will be describing several aspects of our project over the next forty minutes or so to set up our roundtable discussion with you. i’m rikk mulligan; my fellow panelists are elisa beshero-bondar, matthew lavin, and jon klancher. i’ll begin by introducing our project and explaining how we came together, as well as what each of us are bringing to the project and how this helped us define our initial project plans. my co-panelists will go in depth into our current phase and we'll end with some discussion of our next steps and may milestone.
speaker notes

a patchwork team
elisa beshero-bondar, director, center for the digital text, university of pittsburgh at greensburg
jon klancher, english department, carnegie mellon university
matt lavin, director, digital media lab, university of pittsburgh
rikk mulligan, university libraries, carnegie mellon university
raff viglianti, maryland institute for technology in the humanities (mith), university of maryland
scott weingart, program director of digital humanities, university libraries, carnegie mellon university

most humanities research is done by individuals, but digital scholarship tends to require a team, often directed by a principal investigator with a well-defined research agenda. our team is somewhat different--we formed around the opportunity to contribute to a project focused on the bicentennial anniversary of the publication of mary shelley’s novel frankenstein. we came together organically through online and face-to-face relationships because of the possibilities of working together rather than as part of a specific goal or design.

the pittsburgh digital frankenstein project began to coalesce during october , but its beginnings go back to august when neil fraistat of mith--the maryland institute for technology in the humanities--contacted scott weingart to ask if he might be interested in doing some visualizations on the materials in the shelley godwin archive. a couple of weeks later, while scott and i were discussing an unrelated science fiction project, he asked if i had any interest in frankenstein. i’d just learned that cmu has a copy of the first edition in our special collections and thought this would be a great opportunity to explore new dh methods and highlight our special collections through a dh project. scott then video chatted with neil, raff viglianti, david rettenmaier, and purdom lindblad at mith in september to begin scoping out the project.
he then sent out an email to the four of us in the pittsburgh area (elisa, matt, jon, and myself), introducing us and asking if we might explore potential goals.

although raff was not involved in our first meetings, elisa was already in contact with him and had previously arranged for him to speak at pitt-oakland and pitt-greensburg on music encoding and its applications. elisa and raff had also been in touch with neil about contributing an update to the edition of frankenstein on the romantic circles website, one that would interweave with the shelley-godwin archive’s edition of the frankenstein manuscript notebooks. because elisa and raff work together as members of the tei technical council, neil hoped they might work together on the interconnection of those editions, as part of the “pittsburgh group” collaboration.

jon accepted scott’s invitation because of his involvement in the romantic bicentennials as co-director of networked events, which has several upcoming frankenstein projects. as he is the only romanticist at cmu, we were hoping he would participate, and he saw this as a chance to work with a dh team. scott and i are members of cmu’s new digital center, dsharp, and matt is one of those who attended our open office/open consulting hours last year. after hearing about the project after our first couple of meetings he opted to join in the late fall of .

speaker notes

what we contribute
elisa beshero-bondar: romanticist; textual scholar; tei architecture and collation
jon klancher: romanticist; book historian; annotations
matt lavin: th century americanist; textual analysis, stylometry
rikk mulligan: th century americanist; web coding, interface design
raffaele viglianti: research programmer; shelley-godwin archive encoding; tei pointers to s-ga notebooks
scott weingart: early modernist, history of science; textual analysis, stylometry

illustrations by bernie wrightson. frankenstein. mary wollstonecraft shelley. marvel comics .
images from dark horse comics reprint.

our team is eclectic--a patchwork of institutions, roles, and levels of technical expertise and dh experience. this is our strength. even as victor’s “patchwork” construct is greater than the sum of its parts (especially in modern reimaginings), so is our goal to construct a digital edition with features that bridge print and digital resources. we also approach the project from very different perspectives and can learn a lot from each other.

as romanticists jon and elisa are our subject matter experts on shelley’s era and the text; both have several years experience teaching a variety of frankenstein editions. scott's expertise in the history and philosophy of science will also contribute to jon’s annotations of the text. elisa’s extensive experience with the text encoding initiative technologies, including tei xml and xslt transformations, is providing the framework for our online texts. she has worked with raff to architect our collations and consulted with david birnbaum at the university of pittsburgh on some of the thornier coding issues. raff is also contributing from mith to help us integrate the shelley-godwin archive tags and annotations, as well as later helping with the tei pointers for our future interface efforts.

both matt and scott are dh generalists with extensive experience with a range of tools, methods, and projects. they began working on the textual analysis of our materials once elisa and raff got them in the shape they required. they are currently working on the stylometrics. [in this case, programmatic approaches to the study of measurable features of (literary) style, such as sentence length, vocabulary richness and various frequencies (of words, word lengths, word forms, etc.) with practical applications in authorship attribution research.]

i only had a brief flirtation with tei xml before this project.
i’ve been working closely with elisa to create the clean text files of the print editions and to integrate the notebooks into our corpus. my expertise lies more in web and interface design, which will come more into play as we evolve the presentation of our results.

speaker notes
print publications
edition ( volumes)
edition ( volumes)
edition ( / of a volume; bound with friedrich von schiller's the ghost seer in bentley's standard series of novels)
known/authorized by mws

but i’m getting ahead of myself. before we could decide what directions we might go in for our digital project, we needed to survey the current frankenscape. although elisa, jon, and raff were well acquainted with frankenstein, the rest of us were only passingly familiar with an edition of the novel. we began by looking at the three print editions to see what we might contribute to the digital scholarship. it is estimated that copies of frankenstein were published anonymously on january , , in three small volumes. the current romantic circles edition attributes the preface to percy shelley, although neither the dedication nor the preface is signed or initialed in the actual copy. as a number of scholars have pointed out (charles robinson, susan tyler hitchcock), the novel quickly inspired a number of stage adaptations. these proved so successful that mary’s father, william godwin, supervised the editing and republication of a two volume edition in . the changes in this edition are minor, which may explain why we could find no digital edition and had to digitize it ourselves from a photo facsimile. the last edition released during mary shelley’s lifetime appeared in .
it is important to note that the changes from the and to the are fairly extensive, and that everyone involved in the ghost story contest and the earlier text, other than mary, had died by the time she released this edition, with her story of its genesis in the introduction.

speaker notes
digital sources
pennsylvania electronic edition: start (early html, frame-based)
romantic circles: update from html to tei xml
manuscript notebooks: bodleian abinger c , c , c ; shelley-godwin archive; tei xml
http://knarf.english.upenn.edu/
https://www.rc.umd.edu/editions/frankenstein
http://shelleygodwinarchive.org/contents/frankenstein/

we started with a goal to prepare an updated and improved digital edition of mary shelley's frankenstein that conforms to the tei p standard. elisa and raff knew that much of this material already existed online. the earliest work dates back to the s, in stuart curran and jack lynch’s university of pennsylvania electronic edition. this website uses frames to display the and editions, as well as a “variorum” frame to show differences. this site also hosts hundreds of additional files as hypertext annotations and connections. the romantic circles website is a refereed scholarly resource; it published versions of the and editions upcoded from the paee html to tei in . this site also offers a visual comparison of the texts, hosted by the juxta commons. finally, the shelley-godwin archive has the transcribed abinger manuscript notebooks, c , c and c , currently in the bodleian library of oxford. the archive focuses on the manuscripts, providing access to the extant fair copy of frankenstein in three forms and a range of critical materials. [( ) the physical order of bodleian ms abinger c.
; ( ) the virtually reconstituted order of notebooks c and c ; and ( ) the linear chapter sequence of the three-volume fair copy.]

speaker notes
rieger: inline collation of "thomas" w/ , variants in endnotes
curran and lynch: pa electronic edition ( paee) , collation of and : html
crook crit. ed of , variants of "thomas", , and in endnotes (p&c mws collected works)
romantic circles tei conversion of paee ; separates the texts of and ; collation via juxta
~mid- s c. robinson, the frankenstein notebooks (garland): print facsimile of ms drafts
shelley-godwin archive publishes diplomatic edition of ms drafts
legend: print edition | digital edition
critical and diplomatic editions leading to the pgh frankenstein project
pittsburgh bicentennial frankenstein project begins: assembly/proof-correcting of paee files; ocr/proof-correcting ; "bridge" tei edition of s-ga notebook files; automated collation; incorporating "thomas" copy text

to develop our “improved and updated” digital edition we needed to ask the question: “what are the authoritative scholarly and critical editions of frankenstein?” we've gone back to looking at james rieger’s university of chicago press annotated edition from . it is one of the first to emphasize the text; it also included the “thomas” edition of (mary shelley's annotated copy) in its analysis. the first online collated edition, by curran and lynch, compared only the and editions. nora crook and charles robinson made path-breaking scholarly editions--one critical, and one of the abinger ms notebooks--in print, not connected to curran's work. romantic circles might be considered an update if not an upgrade of the paee, using programmatic tools in juxta commons rather than hand-collation. the shelley-godwin archive went online with the notebooks digital edition in , making a diplomatic edition publicly available.
our work is a bridge: we bring all of these nodes of scholarship together by considering the abinger manuscript notebooks, the , , and editions, and mary shelley’s annotations in the “thomas” edition. we do not seek to replace the scholarly apparatus created by others, but to build upon, correct, update, and connect this fine work. we are cognizant that some of the work has errors we can fix, and that some was built with almost-obsolete web technologies and needs a more stable digital architecture.

speaker notes
evolving project
returning to the original texts to produce:
clean text files for each edition ( , , )
tei xml files for each edition
comprehensive collations from ms through to
bridge and build on previous critical editions (print and digital)
variorum interface to show changes over time
stylometric analysis
annotations
https://github.com/ebeshero/pittsburgh_frankenstein

we have completed several steps toward preparing a new, scholarly digital edition of frankenstein in tei . we have also started the work to offer additional scholarly resources online as part of this project. we maintain a github repository to track and share our work and render our efforts as transparent as possible. as of now we have completed plain text editions of the , , and editions on our github. we have added comments about our initial attempts to use the paee and rc materials, and explained the process we used to produce the clean text files. elisa will go into more detail regarding our efforts to prepare the tei xml editions and collation files. she will also present our work to date on developing a comprehensive collation of all five sources, and sketch out our goals in developing a visual variorum interface to display the differences between them. matt will discuss our current work on the stylometric analysis.
for those who are unfamiliar, stylometrics in this case uses programmatic approaches to study the measurable features of (literary) style; this can include sentence length, vocabulary richness, and the various frequencies of words, word lengths, and word forms as part of authorship attribution research. jon will talk to you about how we hope to augment the annotations of previous scholarly editions and of the online resources with something different.

speaker notes
variorum project
manuscript: (notebooks: abinger c , c , c )
"thomas copy" edition ( edition with hand annotations by mary shelley)
edition ( volumes)
edition ( volumes)
edition ( / of a volume)

building a digital variorum
elisa beshero-bondar @epyllia

motivating questions
can we make an edition that conveniently compares the manuscripts to the print publications?
can we make a comprehensive collation to show changes to the novel over time, from to ?
how many versions? ( and a bit?)
which editorial interventions persist from to ?
mws in the "thomas" copy: how much of this persists into ?
pbs's additions: which/how many of these persist to ?
what parts of the novel were most mutable?

rieger: inline collation of "thomas" w/ , variants in endnotes
curran and lynch: pa electronic edition ( paee) , collation of and : html
crook crit. ed of , variants of "thomas", , and in endnotes (p&c mws collected works)
romantic circles tei conversion of paee ; separates the texts of and ; collation via juxta
~mid- s c. robinson, the frankenstein notebooks (garland): print facsimile of ms drafts
shelley-godwin archive publishes diplomatic edition of ms drafts .
print edition digital edition legend: our project genealogy: critical and diplomatic editions leading to the pgh frankenstein project
pittsburgh bicentennial frankenstein project begins: assembly/proof-correcting of paee files; ocr/proof-correcting ; "bridge" tei edition of s-ga notebook files; automated collation; incorporating "thomas" copy text

"we stand on the shoulders of giants." as rikk mentioned, we're building on a lineage of frankenstein edition work that illuminates the novel's gestation and transformation.

rieger: inline collation: wants to show mws's process, as she handwrote alterations on a copy of the frankenstein. as you read rieger, you see a system of markup distinguishing mws's hand from the print, showing her insertions and additions. it's important for foregrounding the act of comparing texts in the line of the reader's sight--precedent for curran's pa electronic edition.

s: burst of activity! frankenstein was a major early experiment to build a scholarly apparatus in hypertext in the early years of the world wide web. the web and print editions improve the scope of comparison--beyond just vs . charlie robinson's facsimile edition of the abinger manuscript notebooks illuminates the writing process and hands at work over the novel. much more is now available to the scholar to document change over time.

speaker notes
the dream of the s. . .
hypertext / hypercard books and the paee
accessing (reading, writing, editing) texts in nonlinear ways
multiplying and individualizing points of access
roughly contemporary with the paee + mid ' s scholarly edition efforts
what if the female creature survived and had a chance to create her own story with lots of options?
experimental nonlinear navigation...hundreds of hypercards...plot your own course
the dream of the ' s: frankenstein's inspiration for hypertext experiment
i've included this slide to emphasize how the s marked a moment of experiments with frankenstein as a body of text--the building of a creature and the assemblage of a textual body become continuous activities--the reader participates. it isn't just that we theorized with stanley fish about readers assembling texts for themselves; the experimenters with frankenstein like shelley jackson and stuart curran invited readers to engage actively in remixing, juxtaposing, and exploring options.

speaker notes
paee: hypertext collation experiment
hundreds of small html files, juxtaposed in frames

discuss how this works as collation and what's so exciting about it.

speaker notes
the dream of the ' s is alive... (in pittsburgh)
digital collation for a "variorum" interface
select a text from what version the reader chooses: ms | | "thomas" | |
compare that text to what version the reader chooses
view the "molten" portions of the novel in context with the stable portions
navigate multiple texts in context with one another
make the critical apparatus a vantage point: see how the novel changed over time without having to find the fine-print endnotes

the creature of collation?
we make newly formed text "bodies" from disparately formed source materials.
source: article on "frankenstein" malware, i programmer
http://www.i-programmer.info/news/ -security/ -frankenstein-stitching-code-bodies-together-to-hide-malware.html

tei: the text encoding initiative
a community-maintained standard
@ vassar: draft of poughkeepsie principles
“ provide a standard format for data interchange in humanities research.”
guidelines for the encoding and interchange of machine readable texts: first drafted ; published on the web by (p ) standards for encoding texts co-evolve with standards for developing human and machine-readable markup languages ​ html (w c) || (early) sgml and xml tei xml tree structure: meant to store a stable format not subject to commercial processing requirements possible to publish tei directly or convert to html; pdf; tex; other document formats. . http://www.tei-c.org/index.xml sgml = standard generalized markup language xml = extensible markup language speaker notes . small pieces are optimal for collation. . there is no single "complete" edition. . each output (plain text, xml, tei collation) = viable edition on its own. . interface invites the user to play: put the pieces together. from paee to pgh variorum... values in common image source: a friend's lego set . reconcile multiple kinds of text encoding: old ' s html ( , ) not-so-plain ocr-generated text ( ) tei xml for manuscripts: (s-ga diplomatic edition) pittsburgh's bridges ( ) source: newscastic.com a bridge-building challenge construct "bridge" xml for collation markup-assisted machine collation (collatex): "flattened" xml hierarchies for even collation units ms metadata markup (e.g. "hands") to ignore in collation, but preserve in the output pointers outward to manuscript editions (s-ga, morgan library) . https://www.newscastic.com/news/ -decades-of-pittsburgh-skyline- / collation "stitchery" can be done by hand in tei automated: via collatex algorithms for locating union and "delta" points in "streams" of text inputs in a variety of formats (xml/tei, plain text, json) output / visualization options: text table (above); svg flow chart; xml juxtacommons on the web develop a custom web interface (via xml output) . . image source: s-ga . a running text stream...? or an architecture of bridges? (collatex svg output) . xml collation: flagging variants and percy's hand . 
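the idea of locating union and "delta" points between two text streams can be sketched in a few lines. this is a minimal illustration only: it uses python's standard-library `difflib` rather than collatex, it compares just two witnesses at word level, and the function names and the two sample strings are invented for the example (they are not readings from any frankenstein edition or from the project's repository).

```python
from difflib import SequenceMatcher


def tokenize(text):
    """Split a witness into lowercase word tokens (a deliberately crude tokenizer)."""
    return text.lower().split()


def align(witness_a, witness_b):
    """Return (tag, a_tokens, b_tokens) rows for two witnesses.

    'equal' rows are the shared ("union") stretches; 'replace', 'delete',
    and 'insert' rows are the points where the witnesses diverge ("deltas").
    """
    a, b = tokenize(witness_a), tokenize(witness_b)
    # autojunk=False so frequent words are never silently ignored
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[a0:a1]), " ".join(b[b0:b1]))
        for tag, a0, a1, b0, b1 in matcher.get_opcodes()
    ]


if __name__ == "__main__":
    # Hypothetical sample readings, invented for illustration.
    first = "the creature opened his dull yellow eye"
    second = "the creature opened its yellow eye"
    for row in align(first, second):
        print(row)
```

a real collation of five sources would need multi-witness alignment, normalization rules, and markup handling, which is the work collatex is designed for; the sketch only shows the shape of the output, a table of stable and variant segments.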
stylometry and digital frankenstein matthew lavin university of pittsburgh @mjlavin . research questions . how does frankenstein change stylistically across different expressions/manifestations? . how can those changes be attributed and/or characterized? in whose authorial voice is frankenstein? direct analysis of percy and mary . do stylistic changes affect how frankenstein reads in relation to cultural categories like genre, “modernness,” linguistic register generally, and scientific vocabulary? if so, how? . outline of exploratory measures in notebooks, term counts/relative frequencies of: mary’s hand initial mary’s hand strikethrough percy’s hand suggested vs. adopted mary’s hand revised (sometimes mary ver , ver , final, etc.) across our three print editions: term counts/relative frequencies of each text term frequencies weighted against frequency across all documents (tf-idf) . term counts absolute values of term count differences across editions, to (left) and to (right) . relative term frequencies absolute value of weighted term frequency differences across editions (tf-idf), to (left) and to (right) . collational alignment types of changes: spelling normalizations punctuation word insertions, substitutions, deletions word to phrase or phrase to word reordering . shelley-godwin notebooks image courtesy of shelleygodwinarchive.org . punctuation matters … but not for all measures image courtesy of dailywritingtips.com . dynamics of dh collaboration the workflows and analytical paradigms of “machine learning dh” and “scholarly editing dh” are not factory fitted to one another, but they can be adapted to work in tandem. the gains are more valuable than the cost of the retrofit. . dynamics of dh collaboration how can a single, carefully curated edition or set of editions be worked into a “macroanalysis” model where many uncorrected, dirty ocr texts are being compared to one another? 
what kinds of questions can we ask with hand-corrected editions that we cannot ask with htrc corpora?

open data and reproducibility
i have argued elsewhere that openness invites open discussion and collaboration. it doesn't guarantee that these things will happen, but closed data practices all but guarantee that these practices will be difficult or impossible.

next steps: questions/methods
how can we characterize changes by trends established in analysis of each person’s hand?
how can we think about changes as moving closer or further away from a genre baseline?
how do we index quantifications such as “how modern” or “how scientific” each version of the text is?
how do we account for “modern” and “scientific” as rapidly changing ideas?

as scott weingart’s work has shown, the computational determination of “who wrote what” has become less central to dh inquiry. i believe strongly, however, that these approaches will have a second life as we turn increasingly toward the use of authorship attribution techniques to study authorship as an historical and social construction.

speaker notes
jon klancher @jklancher
source: (w c recommendation of feb ) web annotation data model
https://www.w .org/tr/annotation-model/

annotated print editions of frankenstein
leonard wolf, ed., the essential frankenstein: the definitive, annotated edition of mary shelley’s classic novel (new york: plume). ( st edition as the annotated frankenstein, )
susan j. wolfson and ronald l. levao, the annotated frankenstein (cambridge, ma: belknap press of harvard university).
leslie s. klinger, ed., the new annotated frankenstein (new york: liveright/norton).
david g. guston, ed., frankenstein: annotated for scientists, engineers, and creators of all kinds (cambridge, ma: mit press).

susan j. wolfson and ronald l. levao, the annotated frankenstein (cambridge, ma: belknap press, ) . this annotation is the verbatim altered text. leslie s.
klinger, ed., the new annotated frankenstein (new york: liveright/norton).

wolfson annotation: published in , in the wake of the french revolution (volney was part of the revolutionary government), les ruines; ou meditation sur les revolutions des empires appeared in english as ruins, or meditations on the revolutions of empires, in .

klinger annotation: more properly, the ruins, or, meditation on the revolutions of empires; and the laws of nature, by constantin-françois chasseboeuf, who took the name volney, published in in french. it was translated in into english. the book is described by frankenstein scholar pamela clemit as a “powerful enlightenment critique of ancient and modern governments as tyrannical and supported by religious fraud” (“frankenstein, matilda, and the legacies of godwin and wollstonecraft,” in the cambridge companion to mary shelley, ed. esther schor [cambridge: cambridge university press, ] .) in light of the date of translation, the book in question must have been the french edition, and safie and the creature learned french….

our frankenstein variorum annotation: of the books the creature hears read aloud in the forest, volney's the ruins; or, a survey of the revolutions of empires ( ) was the most closely associated with europe's radical enlightenment. (it was first published in french as les ruines: ou meditation sur les revolutions des empires in .) the creature learns an illuminating critique of imperialism and exploitation from volney, even as he also absorbs some of the enlightenment's own prejudices ("slothful asiatics"). the effect on the creature is to give him a sense of the social or structural and not only a personal framework for understanding virtue and suffering. on volney’s role in the novel, see also ian balfour, "allegories of origins: frankenstein after the enlightenment," sel: studies in english literature ( ): - .
using the tool for digital annotation with tags
hypothes.is
hypothes.is: all tags so far...
https://web.hypothes.is/

annotations that tunnel through the texts (not only pointing outside)
domestic affection (walton - margaret saville)
domestic affection (delaceys and safie)
domestic affection (frankenstein family)
travel/expedition: walton
travel/expedition: victor
travel/expedition: clerval
travel/expedition: creature
law / judicial system (justine)
law / judicial system felix delacey
law / judicial system victor/kirwin

annotations in the variorum interface
tags shown: domestic affection (delaceys and safie); domestic affection (frankenstein family); travel/expedition: creature; law / judicial system felix delacey; law / judicial system victor/kirwin
source: ( ) sqft: "see new york city's subway lines superimposed over an aerial photo of the city"
https://www. sqft.com/see-nycs-subway-lines-superimposed-over-an-aerial-photo-of-the-city/

annotations and intertextuality
"tunneling" through the texts
distinct from external pointers to context
affected by collation "bridges"
goal: available in each reading view
id markers signal textual locations: book | chapter | paragraph
collation alignment shows corresponding locations in other versions
portable hypothes.is annotations: "pinnable" by text string and id position markers

speaker notes
frankenstein's invitation/challenge:
build digitally: experiment with human and machine reading
build a strong, sustainable bridge:
update romantic circles edition
interlinks to shelley-godwin archive notebooks: point to ms pages
morgan library "thomas copy"
centralize the critical apparatus
a tool for scholars
a metanarrative? a remixing of the reading process for all who care about frankenstein
make all the texts available to all the readers

the work continues...
collation
annotation
stylometry
visualization and variorum interface

this document is downloaded from dr‑ntu (https://dr.ntu.edu.sg) nanyang technological university, singapore.

padls: supporting digital scholarship in digital libraries

goh, d. h. ( ). padls: supporting digital scholarship in digital libraries. aslib proceedings, ( ), ‑ .
https://hdl.handle.net/ /
https://doi.org/ . /eum

author information
dion goh
division of information studies, school of computer engineering
nanyang technological university
blk n , # a- , nanyang avenue
singapore
tel: ( ) -
fax: ( ) -
email: ashlgoh@ntu.edu.sg

abstract
this paper introduces digital scholarship, a process in which individuals perform all scholarly work electronically, working entirely with digital media. a proposal is made for patron-augmented digital libraries (padls), a class of digital libraries designed to support the digital scholarship of its patrons. padls not only provide facilities for search and retrieval of library artifacts, but also allow patrons to augment the library’s collection with new artifacts such as annotations, original compositions and organizational structures. finally, a prototype padl (called synchrony) providing access to digitized video segments and associated textual transcripts is described. synchrony allows patrons to search its collection for artifacts, create annotations/original compositions, integrate these artifacts to form synchronized mixed text and video presentations and, after suitable review, publish these presentations into the digital library if desired.
introduction digital library research is mostly focused on the development of large collections of multimedia resources and advanced tools for their indexing and retrieval. while these efforts are essential, it is important to recognize that the ultimate goal of a library, whether physical or digital, is to serve the scholarly needs of its users – whose objectives are not solely the retrieval of library artifacts. users instead seek these artifacts (the items that constitute a library’s holdings) in order to manipulate and combine them to produce new artifacts. this observation is especially evident in scholarly (work-oriented) settings in which patrons peruse existing artifacts in order to produce new ones. consider as examples, three commonly occurring scenarios: ( ) a faculty member of a university would invariably seek library artifacts (such as books or journals) for the purposes of composing a journal article; ( ) an information analyst working for a privately owned organization must acquire various artifacts in order to produce a report for a staff meeting; ( ) a student assigned to produce a term paper must acquire and peruse library artifacts for its successful completion. the use of library artifacts while these examples portray users of library artifacts in various situations, two common themes are apparent: ( ) library artifacts are sought in order to complete a task – typically the production of a new information artifact such as a journal article, a staff report or a term paper ( ) these new information artifacts are disseminated – through a formal publication process for the journal article, through handouts and a presentation in the case of the staff report, or through the submission of the paper to an instructor in the case of the student’s term paper. studies of library artifact use support these observations. 
for example, levy and marshall [ ] observed and interviewed a group of information analysts, their managers, information assistants, and technology providers in two organizations in order to gain insights into the use of libraries. while acquiring documents (artifacts) was a crucial component, this represented only an initial step in the analysts’ task. once completed, these analysts would then annotate the documents as a means of interpreting them, produce new artifacts, and finally disseminate them. in addition, analysts would commonly share documents and other interpretative structures of documents with other analysts, as well as establish and maintain “reading rooms” which serve as collections of reference materials for the benefit of others involved in similar work. likewise, stone [ ] studied humanities scholars and identified five steps that scholars performed in their studies: ( ) thinking and talking to others, ( ) reading existing material on a topic, ( ) studying original sources of information and making observations and notes, ( ) drafting a document on what has been found, and ( ) producing a final document based on the draft. if library use indeed extends beyond search and retrieval, what types of activities do patrons perform? in a study of library use by o’hara et al. [ ], phd students in the arts and humanities at cambridge university were asked to complete a diary of their document-related research activities during a working day. information recorded included the nature of the research activity, time taken, documents used, support activities performed (such as annotating), and place where the activity was conducted. at the end of the working day, the subjects were interviewed for approximately an hour. they were asked to elaborate on the information recorded in their respective diaries. using the data collected, a model for document-related activities by library users was developed. 
the model characterized scholarly research as a complex process involving searching, information retrieval, reading, information extraction, annotation, review and writing new compositions. these processes were iterative in nature and occurred over varying periods of time. these activities are similar to those found by case [ ] in interviews with historians to determine their use of information. after searching from a variety of information sources, historians would make annotations and copies of the material, arrange and index the material according to their needs, and then produce an original work using the information gathered. case also found that these activities were often performed concurrently within and across projects. four activities that occur over library artifacts may thus be identified. firstly, they are acquired from a library’s collection to solve some specific problem. this is typically performed through an iterative searching and/or browsing process [ , ]. a second activity involves organizing the acquired artifacts to make better sense of the information at hand within the context of the prescribed task. also known as information triage [ ], patrons filter the artifacts to determine the relevancy of each artifact, as well as create various organizations for the artifacts to allow them to be used efficiently and effectively. the third activity involves the authoring of new artifacts using the artifacts already acquired. artifact types are varied and may range from annotations, to documents and organizational structures (such as indexes). finally, artifacts are published, that is, the newly authored artifacts are disseminated. the audience of this artifact may be personal (for private use), public (for use by interested parties), or a selected group of users. methods of publication are also varied and may occur through formal print channels (such as books and journals), presentations, informal handouts, and through the world-wide web. 
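the four activities identified above (acquiring, organizing, authoring, publishing) and the three audiences can be written down as a small data model. this is only an illustrative sketch: the class, field, and function names are hypothetical and do not come from the paper or from the synchrony prototype, and searching is reduced to a simple title match.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Audience(Enum):
    """Who a published artifact is meant for (the three audiences named in the text)."""
    PERSONAL = "personal"
    GROUP = "selected group"
    PUBLIC = "public"


@dataclass
class Artifact:
    """Any information-bearing object; names and fields here are hypothetical."""
    title: str
    kind: str  # e.g. "document", "annotation", "index"
    sources: List["Artifact"] = field(default_factory=list)
    audience: Optional[Audience] = None  # stays None until the artifact is published


def acquire(collection: List[Artifact], query: str) -> List[Artifact]:
    """Activity 1: find artifacts in the collection (here, a naive title search)."""
    return [a for a in collection if query.lower() in a.title.lower()]


def organize(artifacts: List[Artifact],
             is_relevant: Callable[[Artifact], bool]) -> List[Artifact]:
    """Activity 2: information triage, keeping only what suits the task."""
    return [a for a in artifacts if is_relevant(a)]


def author(title: str, kind: str, sources: List[Artifact]) -> Artifact:
    """Activity 3: create a new artifact that records what it was built from."""
    return Artifact(title=title, kind=kind, sources=list(sources))


def publish(artifact: Artifact, audience: Audience,
            collection: List[Artifact]) -> Artifact:
    """Activity 4: disseminate the artifact to an audience and add it to the collection."""
    artifact.audience = audience
    collection.append(artifact)
    return artifact
```

keeping `sources` on each new artifact is a design choice made for the sketch: it preserves the link between an authored work and the library artifacts it drew on, which the surveyed studies describe as central to scholarly use.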
digital scholarship

traditional (as opposed to digital) libraries, with the majority of their holdings in physical form, typically promote a form of scholarship termed in this paper paper-based scholarship. here, physical media, predominantly paper, play a major role in the scholarly use of library artifacts. for example, although patrons may use electronic databases to search for artifacts, the resulting metadata records point to both physical and digital artifacts, requiring patrons to switch between digital and physical domains in order to accomplish their tasks. figure depicts paper-based scholarship as a cyclic set of transitions occurring in both the physical and digital domains. artifacts (physical and digital) are located electronically through their metadata records. since scholarship is (mostly) paper-based, copies of physical artifacts (or their proxies) are made for incorporation into the work process. digital artifacts must also be converted to physical form before they are used [ ]. these copies are then organized, and used to author and ultimately publish new artifacts, which again may be either physical or digital. the work cycle is completed when the artifacts are incorporated into the library and metadata records are generated for them.

figure . paper-based scholarship in traditional libraries

digital libraries, however, provide new service opportunities to patrons as well as an expanded set of informational data types [ ], and thus have the ability to promote digital scholarship. as shown in figure , patrons can now perform their scholarly work electronically, working entirely with digital media. using tools that interface with the digital library, patrons are able to search and acquire library artifacts, organize them to form coherent structures suitable to the task at hand, author new artifacts, and publish them electronically for future use. digital scholarship offers several advantages over paper-based scholarship.
these include:

( ) a single access point for library artifacts. patrons are able to acquire all library artifacts in one location – at the computer. there is no longer a need for a two-step acquisition process in which patrons first search electronic records for artifacts of interest and then physically locate them.

( ) new data types and new ways of access and manipulation. digital media provide new opportunities for patrons to interact with library artifacts not previously possible with paper-based artifacts. data types such as audio and video can now be used directly in the scholarly process. patrons can search within artifacts, combine and edit portions of existing artifacts to form new ones, create links/associations between artifacts, and so on.

( ) shorter publication times. paper-based artifacts typically take between to months from submission to publication excluding actual authoring time [ ]. the digital medium has the potential to shorten such times by supporting online layout/formatting/editing, and electronic refereeing services, as well as removing the transitions between physical and digital media.

figure . digital scholarship in digital libraries

patron-augmented digital libraries

there is no doubt that traditional library models, in which searching is the main service provided to patrons and scholarly work is mostly paper-based, have utility. however, we postulate that in many instances, an expanded model of digital library services would benefit patrons.
that is, digital libraries should provide services that encompass not only searching, browsing and retrieval, but an entire range of services that support patrons’ digital scholarship from task inception to task completion. the question then becomes which types of services should be supported. returning to the discussion of artifact use, a plausible starting point would include services for acquiring and organizing library artifacts, together with services for authoring and publishing new artifacts. hence, we propose patron-augmented digital libraries as a class of digital libraries that provide acquiring, organizing, authoring and publishing services to patrons.

a patron-augmented digital library (padl) is one whose holdings are enhanced by the digital scholarship of its users – both librarians and patrons contribute to the evolution of a library’s holdings. in the padl model of use, librarians populate the digital library with artifacts that meet the goals of the library. at the same time, patrons may augment the padl’s holdings to meet specific needs through new artifacts such as documents, annotations or other organizational structures over the existing holdings of the library via the support services offered by the padl. often, the results of a patron’s task (the newly authored artifacts) are deemed useful to the community at large. when this happens, the patron may want to publish the artifacts for the benefit of others.

the term “artifact” used in our research refers to any information-bearing object that is accessible by a patron. two major classes of artifacts are distinguished. information artifacts are artifacts that contain information about a topic and are obtained either by librarians for the purpose of populating the library or by patrons who create and publish them into the library. examples include electronic books, journals, and so on.
patron-augmented artifacts, on the other hand, refer to artifacts produced by patrons and incorporated into the digital library after a review process. these may fall into three categories: ( ) structuring artifacts which are used to organize other artifacts, ( ) annotations which provide commentary and context to other artifacts, and ( ) original compositions created by patrons. patron-augmented artifacts become reusable information artifacts through the publication process.

it is important to note that while a padl is designed as an environment for digital scholarship where patrons author and publish artifacts, a system of checks and balances must be in place to ensure the quality of the artifacts produced. for this reason, padls must include support for publishing policies that determine if an artifact considered for publication meets the goals and standards of the padl. in other words, artifacts are subject to reviews, and these may be as stringent or flexible as necessary depending on the stakeholders of the padl.

padl services

the facilities provided by a padl are based on a model of digital scholarship termed asap [ ]. this model suggests the need for tools that allow patrons to acquire artifacts from the padl, organize these to put them into the context of the task, create new artifacts, and finally publish these new artifacts back into the padl for future use. hence, the minimal requirements for establishing a padl would include the following services.

storage and retrieval

a fundamental requirement in all digital libraries is the support for services to store and retrieve artifacts, and in the case of padls these artifacts would encompass both information and patron-augmented artifacts. two important features of a storage and retrieval service are: ( ) the ability to accommodate artifacts of different multimedia types, and ( ) the ability to deliver artifacts to patrons through browsing and searching modes.
publishing

the publishing service functions as an intermediary between a patron who wishes to publish an artifact and the storage and retrieval service responsible for incorporating it into the padl. a typical publishing service would acquire the artifact from the patron, obtain the necessary metadata for it (from the patron and/or analysis of the artifact), forward it for review, and upon acceptance, communicate with the storage and retrieval service for the purposes of storing the artifact in the padl. once again, it must be stressed that publishing policies must be implemented to ensure that published artifacts meet the standards and needs of the padl.

manipulation

the manipulation service is responsible for delivering the model of digital scholarship to the user, and provides the interface through which the patron interacts with the padl. all user requests come from this subsystem and all results are returned to this subsystem. the tasks supported by this service include searching/browsing of artifacts, organizing/structuring of acquired artifacts, and authoring and publishing of new artifacts. while these tasks may be provided by separate tools, one advantage of a single tool functioning as the access point to the entire padl is the lower cognitive overhead required by patrons in learning and using the padl facilities.

security/privacy

a padl may be utilized by a large number of patrons, and as such, mechanisms must be present to ensure that artifacts, service requests, and personal information about patrons are secured from unauthorized access [ ]. for example, a system for enforcing access rights is necessary to determine if a patron is able to manipulate (read/write/annotate/reference) published artifacts. likewise, mechanisms are necessary to ensure that only a patron has access to his/her own personal artifacts and work areas in the padl. figure shows the conceptual architecture underpinning padls.
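as an illustration of how the publishing, storage and retrieval, and security/privacy services described above might cooperate, the following python sketch models a submission passing an access check and a set of publishing policies before it is stored. all names here (`Artifact`, `StorageService`, `SecurityService`, `PublishingService`, the `"publish"` right) are hypothetical and chosen for this example only; the paper describes the services conceptually and does not specify an api.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


# all classes below are illustrative assumptions, not the padl's real interfaces
@dataclass
class Artifact:
    title: str
    author: str
    kind: str  # e.g. "annotation", "composition", "structuring"
    metadata: Dict[str, str] = field(default_factory=dict)


# a publishing policy is any predicate that accepts or rejects a submission
Policy = Callable[[Artifact], bool]


class StorageService:
    """stores accepted artifacts and offers a trivial title search."""

    def __init__(self) -> None:
        self._holdings: List[Artifact] = []

    def store(self, artifact: Artifact) -> None:
        self._holdings.append(artifact)

    def search(self, term: str) -> List[Artifact]:
        return [a for a in self._holdings if term.lower() in a.title.lower()]


class SecurityService:
    """maps each patron to the set of operations they may perform."""

    def __init__(self, rights: Dict[str, Set[str]]) -> None:
        self._rights = rights

    def allowed(self, patron: str, operation: str) -> bool:
        return operation in self._rights.get(patron, set())


class PublishingService:
    """intermediary between a patron and the storage service."""

    def __init__(self, storage: StorageService, security: SecurityService,
                 policies: List[Policy]) -> None:
        self._storage = storage
        self._security = security
        self._policies = policies

    def submit(self, artifact: Artifact) -> bool:
        # 1. access check: may this patron publish at all?
        if not self._security.allowed(artifact.author, "publish"):
            return False
        # 2. gather metadata (here only the title is derived automatically)
        artifact.metadata.setdefault("title", artifact.title)
        # 3. review: every policy must accept the artifact
        if not all(policy(artifact) for policy in self._policies):
            return False
        # 4. acceptance: hand the artifact to storage and retrieval
        self._storage.store(artifact)
        return True
```

in this sketch the review step is reduced to automatic predicates; a human reviewer, as used in synchrony, could be modelled as a policy that blocks until a decision is recorded.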
each service is supported by a separate subsystem that interacts with other subsystems in response to users’ requests or actions.

figure . a conceptual architecture for padls

synchrony

synchrony [ , ] is a prototype padl system that is designed for the purposes of digital scholarship. it allows patrons to search and retrieve artifacts from the library’s collection, organize these artifacts to meet the needs of their tasks, author new artifacts, and publish these new artifacts into the digital library. synchrony’s collection of artifacts consists of digitized videos of speeches given by former president george bush (senior) and their corresponding textual transcripts acquired in collaboration with archivists at the george bush presidential library and museum. the transcripts are full-text indexed at the paragraph-level and made available to patrons via standard query operations. in addition, each paragraph is associated with its streaming video segment, allowing patrons to view search results in text-only, video-only, or synchronized text and video formats.

the collection also contains artifacts authored by patrons and these fall into three classes: original compositions, annotations and structuring artifacts. original compositions are text-based documents that patrons author and publish into the digital library. annotations are also text-based documents, but are designed to provide commentary and context to other artifacts. presentations serve as structuring artifacts in synchrony.
these composite entities consist of sequences of artifacts, each of which may contain a video segment of a speech, its corresponding textual transcript and an annotation/original composition displayed in synchrony. associated with each presentation is a table of contents that allows patrons to navigate to any sequence within the presentation. artifacts contained within the presentations are referenced, not copied. this allows modifications made to individual artifacts to automatically propagate to presentations if desired. synchrony is so named because it allows patrons to author and publish synchronized text and video presentations.

the user interface

synchrony’s user interface is patterned on a spatial metaphor and represents a large, / dimensional direct manipulation workspace in which patrons manipulate and organize objects of different types such as text and presentations. the interface is depicted in figure and consists of two major entities: the workspace and library objects.

figure . synchrony’s user interface

the workspace forms the background of the interface and functions much like a physical desktop on which items are placed and a patron’s tasks are performed. library objects, that is, the information and patron-augmented artifacts in use by the patron, are positioned on this workspace. objects may be arranged (by selecting and dragging an object on the workspace), resized (by selecting and dragging an object’s borders) and visually altered (by modifying an object’s properties such as color) by the patron to create information structures suitable to the current task. in addition, scrolling and panning are supported to allow patrons to view different portions of the workspace.

library objects are the means with which a patron accomplishes his/her digital scholarship. they represent the information and patron-augmented artifacts as well as the results of a patron’s tasks in the padl.
library objects fall into four basic categories:

( ) query objects represent the results of a search, with each query object representing one result set. queries are performed against information artifacts (speeches) and/or patron-augmented artifacts (original compositions, annotations and presentations) depending on the search options selected by the patron.

( ) text objects represent text-based information and may be of two content types: information artifacts (speeches) and patron-augmented artifacts (original compositions and annotations). text objects allow editing if their underlying content types are editable. in synchrony, published artifacts (those that are part of a padl’s collection) are not editable while unpublished patron-augmented artifacts are editable by those having the appropriate access rights. for editable text objects, text is typed directly on the objects themselves.

( ) presentation objects contain presentations authored by patrons and consist of sequences of artifacts each of which may contain a video segment of a speech, its corresponding textual transcript and/or an annotation/original composition displayed in synchrony. the contents of a presentation are displayed in tabular form, with each row corresponding to a single sequence in the presentation while columns contain the types of artifacts in use within each sequence.

( ) container objects are workspaces within the main workspace and may contain query, document, presentation or even other container objects. while positioning may be used to divide a workspace, containers provide a more formal means of doing so, and are thus typically used to organize a workspace into various tasks and subtasks.

synchrony shares common goals with digital library interfaces such as artemis [ ], dlite [ ] and navique [ ] in its support for an integrated, direct-manipulation environment for library-related tasks.
in terms of design philosophy however, synchrony is similar to viki [ ] in that both systems derive their interfaces from the branch of hypertext/hypermedia systems known as spatial hypertext [ ]. spatial hypertext is characterized by the use of space in the creation and perception of structure. whereas traditional hypertext systems employ explicit linking mechanisms to associate objects (e.g. unidirectional links between html documents) to create information structures, spatial hypertext systems describe associations among objects through space, that is, by geometrical relationships (e.g. proximity), visual characteristics (e.g. font size, color, shape), and recurrence (e.g. relative positioning of an object within a group of objects). studies have demonstrated the utility of such systems. for example, an analysis of the use of aquanet (a collaborative hypertext tool) [ ] found that, for drawing relationships between objects, users preferred to communicate structure through the spatial positioning of objects rather than through predefined schemas (a collection of objects and relationship types). further, in the walden's paths project [ ], the spatial hypertext system viki has been used with some success in the authoring of paths - linear presentations of existing and new web pages.

a scenario of use

the following scenario illustrates how users may potentially use synchrony and highlights the operation of the system. an educator is preparing a lesson about the bush presidency and the soviet union for his history class and decides to prepare a multimedia presentation of speeches and press conferences given by george bush on the subject from synchrony’s collection as a resource for his students. after logging onto synchrony, the educator is presented with an empty workspace. as this will be a new presentation, his first task is to locate relevant information by querying the padl collection.
he thus right-clicks at any point on the workspace to display a list of padl services, and after selecting the query service, he enters the query (together with any options) in the dialog box that appears on the workspace. when the query has been processed by synchrony, a query object appears at the click location showing the results of the query. to view an artifact, the educator selects it from the query object, drags it onto the workspace and drops it at a desired location. depending on the artifact type, a text object or a presentation object appears at the drop location. figure depicts the results of these actions.

figure . selecting and viewing artifacts

after enough information has been retrieved, the educator's next step is to author the presentation. synchrony simplifies the authoring process through a technique known as incremental formalization [ ] which attempts to make a system understand informally represented information. this feature allows users to rapidly create presentations by first positioning document objects linearly within the workspace and then later specifying which objects to include into the presentation.

returning to the scenario, the educator uses familiar drag-and-drop operations to assemble the text objects (which may include his annotations) to form two vertical adjacent list structures as depicted in figure . he then invokes the presentation building service, causing synchrony to automatically map these list structures to presentation sequences. in the current version, synchrony assumes that the leftmost list contains video segments of speeches and their textual transcripts, while the adjacent list to its right is assumed to contain the corresponding annotation/original composition. in other words, sequences are mapped to the rows in the lists in a top-to-bottom manner while content is mapped to the columns. (synchrony also supports a left-to-right mapping.)
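the mapping from two adjacent vertical lists to presentation sequences can be sketched in a few lines of python. this is only an illustration under stated assumptions: the paper does not describe how synchrony detects lists, so the grouping rule (objects whose left edges lie within a `tolerance` of each other belong to the same list), the `TextObject` class and the function names are all hypothetical.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import List, Optional, Tuple


@dataclass
class TextObject:
    label: str
    x: float  # left edge on the workspace
    y: float  # top edge on the workspace (smaller means higher up)


def group_into_columns(objects: List[TextObject],
                       tolerance: float = 20.0) -> List[List[TextObject]]:
    """group objects into vertical lists by the proximity of their left edges."""
    columns: List[List[TextObject]] = []
    for obj in sorted(objects, key=lambda o: o.x):
        # compare against the first (leftmost) member of the current column
        if columns and abs(columns[-1][0].x - obj.x) <= tolerance:
            columns[-1].append(obj)
        else:
            columns.append([obj])
    for column in columns:
        column.sort(key=lambda o: o.y)  # top-to-bottom order within a list
    return columns


def build_sequences(objects: List[TextObject],
                    tolerance: float = 20.0
                    ) -> List[Tuple[Optional[str], Optional[str]]]:
    """map two adjacent lists to (speech, annotation) rows, top to bottom."""
    columns = group_into_columns(objects, tolerance)
    if len(columns) != 2:
        raise ValueError("expected exactly two adjacent vertical lists")
    speeches, notes = columns  # leftmost list is assumed to hold the speeches
    return [(s.label if s else None, n.label if n else None)
            for s, n in zip_longest(speeches, notes)]
```

rows are paired by their position in each list rather than by vertical alignment, which is a simplification; a sequence without an annotation is represented here with `None` in the second column.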
when synchrony completes the mapping, a presentation object is displayed depicting the contents of the presentation in a tabular format (see figure ). in addition to providing a formalized representation of a presentation, the presentation object also allows patrons to modify its contents. patrons are able to add/remove sequences, add/move/remove content in any sequence, and shift the display order of sequences.

figure . authoring a presentation

when the educator is ready to view the presentation, he clicks a button on the presentation object. this causes synchrony to assemble the sequences into a smil (synchronized multimedia integration language) [ ] presentation and invoke a presentation viewer to display it. figure shows the presentation viewer. the viewer provides playback controls to allow patrons to play, pause, stop and seek. each presentation sequence consists of three regions – a content region for displaying the text of a speech segment, a video region for presenting the associated video segment, and an annotation region for displaying associated annotations/original compositions.

figure . viewing a presentation

when the educator has finished authoring the presentation, he forwards it for review and possible publication by completing a form provided within synchrony. here, the educator provides the title of the presentation, a description, and an explanation of why the presentation should be published. synchrony then uploads the completed form and presentation to the publication subsystem which stores them in a temporary holding area pending review. at this point, the educator’s task is complete. he will later be notified through electronic mail about the outcome of his submission. in the current version of synchrony, all presentations submitted for publication are routed to a designated person (a librarian, a reviewer, an editor, etc.).
using a submissions viewing facility, reviewers may accept or reject submissions, or reroute submissions to other reviewers if necessary. when a submission is accepted or rejected, the author is informed via electronic mail. further, if the submission is accepted, it is indexed and incorporated into the padl. to conclude the scenario, the author, upon receiving the acceptance message from the reviewer, informs his students about the presentation. the students may then begin their own synchrony sessions, retrieve the presentation, and view and interact with it.

note that for clarity, this scenario portrays the authoring process as a fixed sequence of tasks, that is, querying, organizing, viewing and publishing. in reality, synchrony provides an environment in which these tasks may be performed in a fluid, iterative process. patrons would move effortlessly among these activities depending upon the need at hand.

implementation

synchrony consists of a suite of client-server tools implemented mainly in java together with two third-party applications. mg [ ], a public domain full-text indexing and retrieval system, is used for the storage and retrieval of the textual content of speeches. synchrony also utilizes realnetworks’ video server [ ] for the delivery of streaming video and its implementation of the java media framework [ ] for the rendering of video segments and smil presentations.

conclusion

digital libraries must offer more than advanced collection maintenance and retrieval services since patrons often do not solely retrieve library artifacts for their own sake. in scholarly settings, patrons instead seek these artifacts to manipulate and integrate them to produce new artifacts. traditionally, these activities have occurred mainly in physical media (predominantly paper), and as such, may be classified as paper-based scholarship.
digital libraries however provide new service opportunities as well as an expanded set of informational data types, and when combined, have the ability to promote digital scholarship. patrons are now able to perform their scholarly work electronically, working entirely with digital media. we thus propose patron-augmented digital libraries as a class of digital libraries that support the digital scholarship of its patrons. a patron-augmented digital library (padl) is one in which librarians and patrons both contribute to the evolution of the library’s holdings. librarians provide the seed material (information artifacts) to form an initial collection and maintain the collection while patrons augment the library with patron-augmented artifacts over the existing collection. to support this new role, a padl departs from the traditional library model of service provision and supports authoring, structuring and publishing services in addition to search and retrieval. synchrony was developed to determine the feasibility of the padl concept. the system provides access to a collection of digitized videos of speeches given by former president george bush and their corresponding textual transcripts together with artifacts authored by patrons. a pilot study was also conducted on synchrony and results were encouraging [ ]. in particular, the study found that attitudes toward padls in general were positive, provided that appropriate security and quality control (through publication policies) mechanisms were employed. however, because this study was performed in a laboratory setting with a small number of subjects, these results cannot be generalized. consequently, in the next phase of synchrony’s development, we envision a larger-scale longitudinal study that will require participants to use synchrony to author and publish presentations for actual homework assignments over a semester. 
these results will be used to guide future work in the development of synchrony and patron-augmented digital libraries. references . adam, n. and yesha y. strategic directions in electronic commerce and digital libraries: towards a digital agora. acm computing surveys, ( ), , - . . belkin, j., oddy, r. and brooks, h. ask for information retrieval: part i. background and theory. journal of documentation, ( ), , - . . case, d. the collection and use of information by some american historians: a study of motives and methods. library quarterly, ( ), , - . . cousins, s., paepcke, a., winograd, t., bier, e. and pier, k. the digital library integrated task environment (dlite). in: proceedings of the nd acm international conference on digital libraries, philadelphia, - july . new york: acm press, , - . . denning, p. and rous, b. the acm electronic publishing plan. communications of the acm, ( ), , - . . furnas, g. and rauch, s. considerations for information environments and the navique workspace. in: proceedings of the third acm conference on digital libraries, pittsburgh, - june . new york: acm press, , - . . goh, d. and leggett, j. a spatial approach for the access, manipulation and publication of digital library artifacts. in: museums and the web , new orleans, - march . pittsburgh: archives & museum informatics, , - . . goh, d. and leggett, j. patron augmented digital libraries. in: proceedings of the fifth acm conference on digital libraries, san antonio, - june . new york: acm press, , - . . hoschka, p. synchronized multimedia integration language. . available via http://www.w .org/tr/rec-smil/. . javasoft. jmf home page. . available via http://www.javasoft.com/products/java- media/jmf/index.html. . levy, d. and marshall, c. going digital: a look at assumptions underlying digital libraries. communications of the acm, ( ), , - . . marchionini, g. information seeking in electronic environments. new york: cambridge university press, . . marshall, c. and rogers, r. 
two years before the mist: experiences with aquanet. in: proceedings of the acm conference on hypertext , milan, november- december . new york: acm press, , - . . marshall, c. and shipman, f. searching for the missing link: discovering implicit structure in spatial hypertext. in: proceedings of the fifth acm conference on hypertext, seattle, - november . new york: acm press, , - . . marshall, c., and shipman, f. spatial hypertext and the practice of information triage. in: proceedings of the eighth acm conference on hypertext, southampton, - april, . new york: acm press, , - . . marshall, c., shipman, f. and coombs, j. viki: spatial hypertext supporting emergent structure. in: proceedings of the acm european conference on hypermedia technology, edinburgh, - september . new york: acm press, , - . . nürnberg, p., furuta, r., leggett, j., marshall, c. and shipman, f. digital libraries: issues and architectures. in: proceedings of the second annual conference on the theory and practice of digital libraries, austin, june . college station: hypermedia research laboratory, , - . . o’hara, k., smith, f., newman, w. and sellen, a. student readers’ use of library documents: implications for library technologies. in: conference proceedings on human factors in computing systems chi ’ , los angeles, - april . new york: acm press, , - . . realnetworks. realnetworks.com - the home of digital media. . available via http://www.realnetworks.com/. . shipman, f., furuta, s., brenner, d., chung, c. and hsieh, h. using paths in the classroom: experiences and adaptations. in: proceedings of the ninth acm conference on hypertext and hypermedia, pittsburgh, - june . new york: acm press, , - . . shipman, f. and mccall, r. supporting knowledge-based evolution with incremental formalization. in: conference proceedings on human factors in computing systems chi , boston, - april . new york: acm press, , - . . stone, s. humanities scholars: information needs and uses. 
journal of documentation, ( ), , - . . wallace, r., soloway, e., krajcik, j., bos, n., hoffman, j., eccleston, h., kiskis, d., klann, e., peters, g., richardson, d. and ronen o. artemis: learner-centered design of an information seeking environment for k- education. in: conference proceedings on human factors in computing systems chi ’ , los angeles, - april . new york: acm press, , - . . witten, i., moffat, a. and bell, t. managing gigabytes. second edition. san francisco: morgan kaufmann publishers, inc., .

the semantics of poetry: a distributional reading

aurélie herbelot
university of cambridge, computer laboratory
j.j. thomson avenue, cambridge cb az
united kingdom
aurelie.herbelot@cantab.net

february ,

abstract

poetry is rarely a focus of linguistic investigation. this is far from surprising, as poetic language, especially in modern and contemporary literature, seems to defy the general rules of syntax and semantics. this paper assumes, however, that linguistic theories should ideally be able to account for creative uses of language, down to their most difficult incarnations. it proposes that at the semantic level, what distinguishes poetry from other uses of language may be its ability to trace conceptual patterns which do not belong to everyday discourse but are latent in our shared language structure. distributional semantics provides a theoretical and experimental basis for this exploration. first, the notion of a specific ‘semantics of poetry’ is discussed, with some help from literary criticism and philosophy. then, distributionalism is introduced as a theory supporting the notion that the meaning of poetry comes from the meaning of ordinary language.
in the second part of the paper, experimental results are provided showing that a) distributional representations can model the link between ordinary and poetic language, b) a distributional model can experimentally distinguish between poetic and randomised textual output, regardless of the complexity of the poetry involved, c) there is a stable, but not immediately transparent, layer of meaning in poetry, which can be captured distributionally, across different levels of poetic complexity.

introduction

poetry is not a genre commonly discussed in the linguistics literature. this is not entirely surprising, as poetical language – especially in its contemporary form – is expected to defy the accepted rules of ordinary language and thus, is not a particularly good example of the efficient medium we use to communicate in everyday life. still, it would seem misguided to argue that poetry does not belong to the subject of linguistics. it is, after all, made of a certain type of language which, at least at some level, relates to ordinary language (it is for instance rare – although not impossible – to read the word tree in a poem and find that it has nothing to do with the concept(s) we normally refer to as tree). if we accept that poetical language is not completely dissociated from our everyday utterances, then an ideal linguistic theory should be able to explain how the former can be interpreted in terms of the latter. that is, it should have a model of how poetry uses and upsets our linguistic expectations to produce texts which, however difficult, can be recognised as human and make sense.

in this paper, i assume that poetry is a form of language which can be linguistically analysed along the usual dimensions of prosody, syntax, semantics, etc. focusing on semantics, i investigate whether a particular theory, distributional semantics (ds), is fit to model meaning in modern and contemporary poetry.
ds is explicitly based on ordinary language use: the theory assumes that meaning is given by usage, in a wittgensteinian tradition. so by building a computational model of word meaning over an ordinary language corpus and applying it to poetic text, it is possible to explore in which ways, if any, poetry builds on everyday discourse.

the paper is structured in two parts, one theoretical, one experimental. starting off with questions surrounding the nature of meaning in everyday language and poetry, i introduce standard semantic theories and some of their historical relationships with particular approaches to poetics. focusing on distributionalist theories, i then present modern computational work in distributional semantics as a (rough) implementation of the wittgensteinian account of meaning, and relate this work to experiments in computer-aided poetry dating back to the s. in this process, i argue – with the support of some work in philosophy and literary criticism – that the semantics of poetry does derive from everyday language semantics, and that computational models of ordinary language should let us uncover aspects of poetical meaning.

in the experimental part of the paper, i report on an implementation of a distributional system to compute so-called ‘topic coherence’, that is, a measure of how strongly words of a text are associated. i show that in terms of coherence, poetry can be quantitatively distinguished from both factual and random texts. more interestingly, the perceived level of difficulty of a text, according to human annotators, is not correlated with its overall coherence. that is, complex poetry, when analysed with distributional techniques, is shown to be just as coherent as more conventionally metaphorical texts. i interpret this result as evidence that poetry uses associations which are latent, but usually not explicit, in ordinary language.

the meaning of poetry
ordinary language and poetry one of the most influential theories of meaning in philosophy and linguistics is the theory of reference, as formalised in set-theoretic semantics. the core proposal of set theory is that words denote (refer to) things in the world (frege, ; tarski, ; montague, ). so the word cat, for instance, has a so-called ‘extension’ which is the set of all cats in some world. set theory is closely related to truth theory in that it is possible to compute the truth or falsity of a statement for a particular world just by looking at the extension of that statement in that world. for instance, the sentence all unicorns are black is true if and only if, in our reference world, the set of unicorns is included in the set of black things. the basic notion of extension is complemented by the concept of ‘intension’ which, under the standard account, is a mapping from possible worlds to extensions, i.e. a function which, given a word, returns the things denoted by that word in a particular world. intension allows us to make sense of the fact that evening star and morning star have different connotations, although they denote the same object in the world: they simply have different intensions. the question of whether poetry has meaning, and if so, what kind of meaning, has long been debated. an enlightening example of the issues surrounding the discussion can be found in a exchange between the philosopher philip wheelwright and the poet josephine miles in the kenyon review (wheelwright, ; miles, ). wheelwright wrote an article entitled on the semantics of poetry where he clearly distinguished the language of poetry from the language of science. according to him, meaning in scientific language was to be identified with conceptual meaning, itself guided by the principles of formal logic and propositional truth. poetry, on the other hand, was endowed with what he called ‘metalogical’ meaning, that is, with a semantics not driven by logic. 
the core of his argument was that signs in poetical language (‘plurisigns’) were highly ambiguous while words in science (‘monosigns’) must have the same meaning in all their occurrences. miles replied to this article with a short letter entitled more semantics of poetry where she argued that ambiguity could be found in all language; that it was misguided to take scientific language as mostly denotational and poetry as mostly connotational; that some stability of meaning was vital for poetry: “poetry, as formalizing of thoughts and consolidating of values, works firmly in the material of the common language of the time, limited by its own conventions” (miles, ).

it would be hard to blame wheelwright for rejecting the thesis that poetry is denotational. expressions such as music is the exquisite knocking of the blood (brooke, b), your huge mortgage of hope (hughes, ), or skeleton bells of trees (slater, ) do not have a natural interpretation in set-theoretic semantics. still, it also seems difficult to argue that poetical meaning is not related to ordinary language. without competence in the latter, it is hard to interpret the semantics (if there is any) of a poem. a position which steers clear of the poetical/ordinary language distinction is that of gerald bruns ( ) who argues that “poetry is made of language but is not a use of it”. bruns clarifies this statement by adding that:

poetry is made of words but not of what we use words to produce: meanings, concepts, propositions, descriptions, narratives, expressions of feeling, and so on. the poetry i have in mind does not exclude these forms of usage – indeed, a poem may “exhibit” different kinds of meaning in self-conscious and even theatrical ways – but what the poem is, is not to be defined by these things. (p )

in other words, poetry uses the basic building blocks of ordinary language, but with an aim radically different from the one they are normally associated with.
i will call bruns’s position the ‘pragmatic’ view of poetry, where language is at the core of the investigation, but is deeply dependent on (and playing with) context, intention and meta-linguistic factors. pushing the focus of poetry into pragmatics has the advantage that bruns’s account can cover all forms of poetry, including the less ‘linguistic’ ones (sound poetry in particular), but at first sight, it seems to also lessen the role of semantics – meaning being one of those aspects of language use which poetry is not concerned with. this move, however, is not so clearly intended. in fact, there is a natural bridge between semantics and pragmatics in a theory of meaning which bruns casually alludes to:

basically my approach is to apply to poetry the principle that wittgenstein applied to things like games, numbers, meanings of words, and even philosophy itself. the principle is that the extension of any concept cannot be closed by a frontier. (p )

the reference here is to wittgenstein’s philosophical investigations ( ) and the idea that ‘meaning is use’, i.e. that meaning comes from engaging in a set of (normative) human practices; in a word, that semantics emerges from pragmatics.

anchoring semantics in context makes meaning boundaries much less clear than they are in set theory. still, it is possible to formalise the idea in a linguistic framework. the linguistic theory of meaning closest to wittgenstein’s line of argumentation is distributionalism. in this approach, the meaning of cat is not directly linked to real cats but rather to the way people talk about cats. the collective of language users acts as a normative force by restricting meaning to a set of uses appropriate in certain pragmatic situations. the roots of distributionalism can perhaps be found in bloomfield ( ), but the theory grew to have much influence in the s (harris, ; firth, ).
some time later, in the s, the advent of very large corpora and the increase in available computing power made the claims (to some extent) testable. both psychologists and linguists started investigating the idea that a word’s meaning could be derived from its observed occurrences in text (landauer & dumais, ; grefenstette, ; schütze, ). these empirical efforts would soon lead to a very active area of research in computational linguistics called ‘distributional semantics’: a field which attempts to model lexical phenomena using ‘distributions’, i.e. patterns of word usage in large corpora.

interestingly, there are historical links between wittgenstein’s distributionalism, distributional semantics and computer-generated poetry. one of wittgenstein’s students, margaret masterman, was very influenced by his theory of meaning and by the idea that studying language use could give an insight into semantics. foreseeing the potential of applying computers to this type of philosophical and linguistic investigation, she founded the cambridge language research unit (clru), a research group which would become engaged in early computational linguistics work in the uk. in parallel, she also took an interest in the creative processes involving language and produced an early version of a computer program to support poetry generation. the program was not actually producing poetry but rather presenting word choices to the user, allowing them to fill in a preset haiku frame. masterman’s idea of using computers to produce poetry was not to replace the human poet. in fact, she clearly differentiated the ‘real’ poet from the machine: “the true poet starts with inspired fragments, emerging fully formed from his subconscious” (masterman, ).
she also didn’t believe that randomness could be a foundation for poetry:

to put a set of words on disc in the machine, program the machine to make a random choice between them, constrained only by rhyming requirements, and to do nothing else, this is to write idiot poetry. [...] in poetry, we have not as yet got the generating formulae; though who would doubt that a poem, any poem, has in fact an interior logic of its own?

masterman thought that there was a structure underlying language use. uncovering that structure formed an important part of the work at the clru. computing resources in those days were extremely limited so, instead of directly studying linguistic utterances in their natural environment, part of the clru’s research focused on producing so-called ‘semantic networks’ by analysing thesauri (spärck jones, ). still, this work prefigured what would become distributional semantics, and the automatic construction of ‘semantic spaces’ out of statistical information from real language data (see § . ). the notion of a semantic structure extractable from language data by computational means was also behind masterman’s belief that machines could support the work of poetry:

larger vocabularies and unusual connexions between the words in them, together with intricate devices hitherto unexplored forms of word-combination, all these can be inserted into the machine, and still leave the live poet, operating the console, free to choose when, how and whether they should be employed

it is possible to summarise masterman’s position as follows: poetry is not random, but the stuff of poetry, the ‘inspired fragments’ found in the subconscious of the poet, are already there, latent in language use, and an appropriate semantic theory should be able to uncover them. this is compatible with miles’s argument that poetry is anchored in ordinary language, and also with bruns’s reading of lyn hejinian’s poetics ( ): “the poet [...]
does not so much use language as interact with uses of it, playing these uses by ear in the literal sense that the poet’s position with respect to language is no longer simply that of the speaking subject but also, and perhaps mainly, that of one who listens.” (p )

fifty years after the first clru experiments on distributional semantics, computational linguistics is still working towards the perfect model of meaning that masterman wished for. further, little has been done to linguistically formalise the relation between the semantics of ordinary language and that of poetry. in what follows, i will attempt to capture this relation: first, intuitively, by discussing examples of poetry based on distributional semantics models (§ . ); and later, more formally, by giving experimental evidence that distributional models constructed from ordinary language can account for (at least) a layer of meaning in modern and contemporary poetry (§ ).

distributional semantics and poetry

the core assumption behind distributional semantics is that meaning comes from usage. a fully distributionalist picture includes both linguistic and non-linguistic features in the definition of ‘usage’. so the context in which an utterance is observed comprises not only the other utterances that surround it, but possibly also sensorial input, human activities and so on. although research on distributional semantics is slowly starting to include visual features in its study of meaning (e.g. feng & lapata, ), i will concentrate here on the bulk of the work which makes the simplifying assumption that the meaning of a word can be defined in terms of its close linguistic context.

the representation of a lexical item in the framework is a vector, or simply put, a list of numbers. so the meaning of dragon – its so-called ‘distribution’ – might look like this:

dungeon . eat . fire . knight . political . scale . very .
the numbers represent the average strength of association of the lexical item with other words appearing in its close context (say, a window of words around its occurrences in a large corpus). there are many ways to compute strength of association – for a technical introduction, i refer to turney & pantel ( ) and evert ( ). i will assume here the use of measures which give strong weights to ‘characteristic contexts’ (e.g. pointwise mutual information, pmi). in such a setting, a word which appears frequently with a lexical item t and not so frequently with other items has a strong association with t (e.g. meow with respect to cat); a word which appears frequently with t but also very frequently with other things has low association with t (e.g. the with respect to cat); a word which does not appear frequently with t also has low association with t (e.g. autobiography with respect to cat).

to come back to our example, the numbers in the dragon vector tell us that dragons are strongly related to dungeons and knights, but only moderately to eating and fire (because a lot of other animals eat and fire has a strong relation to burning houses and firemen). they also show that dragons are only moderately related to scales, although the prototypical dragon is a scaly creature, because of the polysemy of scale (meaning, for instance, a measurement range). finally, dragons are not strongly associated with very at all, due to the fact that it is such a common word. contexts with high association figures are said to be ‘characteristic’ of the lexical item under consideration.

being a vector, the distribution of dragon can be represented in a mathematical space. such a space, where dimensions correspond to possible contexts for a lexical item, is commonly called a ‘semantic space’.

[figure : an example of a semantic space with two dimensions, dangerous and political. axes: context "dangerous", context "political"; plotted words: democracy, dragon, lion]
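the behaviour of a ‘characteristic context’ measure can be illustrated with a minimal python sketch. it uses standard pointwise mutual information (with the logarithm) over a tiny table of co-occurrence counts; the counts, the three target words and the function name `pmi` are all invented for illustration and are not taken from any corpus or from the system described in this paper.

```python
import math

# Hypothetical co-occurrence counts, counts[target][context].
# All numbers are invented for illustration only.
counts = {
    "cat": {"meow": 40, "the": 900, "autobiography": 1},
    "dog": {"meow": 1, "the": 950, "autobiography": 1},
    "writer": {"meow": 1, "the": 920, "autobiography": 60},
}


def pmi(counts, target, context):
    """Pointwise mutual information: log of p(c, t) / (p(c) * p(t))."""
    total = sum(sum(row.values()) for row in counts.values())
    freq_t = sum(counts[target].values())
    freq_c = sum(row.get(context, 0) for row in counts.values())
    joint = counts[target].get(context, 0)
    if joint == 0:
        return float("-inf")  # the pair was never observed together
    return math.log((joint * total) / (freq_t * freq_c))


for context in ("meow", "the", "autobiography"):
    print(context, round(pmi(counts, "cat", context), 2))
```

with these toy counts, meow receives a clearly positive weight for cat, the stays close to zero because it is equally frequent with every target, and autobiography is negative because it is rare with cat but common with another target.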
one of the benefits of this representation is that similar words can be shown to naturally cluster in the same areas of the semantic space. typically, the vectors for dragon and lion will end up close together in the semantic space while dragon and democracy are much further apart, confirming harris’s hypothesis that ‘similar words appear in similar contexts’ ( ). fig. shows a highly simplified illustration of this effect, in a two-dimensional space where words are expressed in terms of the contexts dangerous and political.

it would be beyond the scope of this work to describe the range of phenomena which can be modelled using approaches based on the framework introduced here. but some examples will be helpful in showing its relevance to semantics as a whole. distributions, for instance, can be further analysed to reveal the various senses of a word: the vector for scale can be combined with its close context via various mathematical operations to produce a new vector distinguishing, say, scales as measurement, scales as weighing machines and dragon scales (schütze, ; erk & padó, ; thater et al., ). they also capture some inferential properties of language related to hyponymy (e.g. if molly is a cat, molly is an animal) and quantification (e.g. many cats entails some cats) (baroni et al., ). it is even possible to derive compositional frameworks which show how the lexical meaning of individual words combine to form phrasal meaning (see erk, for an overview). but this type of work is generally evaluated against human linguistic judgements over ordinary language utterances. so can it tell us anything about poetical language?

i will first approach the question intuitively and consider the features of a poetry built out of distributional representations, in the way masterman envisaged. discourse.cpp (le si, ), a little volume of poems mostly deriving from distributional techniques, will provide suitable examples for my observations.
the texts in discourse.cpp are more or less edited versions of two kinds of output: the first one consists of words that are similar to a given input (for instance, dog or horse for the input cat), while the second one is a list of ‘characteristic contexts’ for the input (for instance, meow or purr for cat). the background corpus for the system was a subset of , wikipedia pages, fairly small by the standards of , but sufficient to produce the semantic clustering effects expected from a distributional framework. context was taken to be the immediate semantic relations in which a given lexical item appeared – that is, instead of just considering single words around a target in the text, the system relied on syntactic and semantic information describing ‘who did what’ in a sentence. for instance, in the statement the black cat chased the grey mouse, the context of chase would be marked as ‘– arg cat’ (cat as first semantic argument of the verb) and ‘– arg mouse’ (mouse as second semantic argument of the verb) while the context of mouse would be ‘grey arg –’ (grey as semantic head) and ‘chase arg –’ (chase as semantic head).

illness
s/he nearly died of a psychosomatic food-borne psychotic-depressive near-fatal episodic epidemic, diagnosed as hiv-related and naturally undisclosed.
figure : illness, discourse.cpp, o.s. le si ( )

one straightforward example of the program’s output is the poem illness (fig. ), produced using some of the characteristic contexts of the lexical item illness. the editing of this poem, as reported in the appendix of discourse.cpp, involved adding coordinations, prepositions and punctuation to the raw output, together with the words s/he nearly died of a and naturally. the adjective epidemic was substantivised.

unsurprisingly, concepts which have a strong association with illness are adjectives such as psychosomatic or diagnosed. even in this simple example, however, it is clear that some aspects of ‘the discourse’ (i.e.
the way that people choose to talk about things) are reflected: given the wide variety of conditions and illnesses in medical dictionaries, it is striking that hiv-related makes it into the top contexts, and we can hypothesise that it explains the presence of the adjective undisclosed. in other words, despite the range of medical conditions experienced in everyday life, it is hiv which dominated the thoughts of the wikipedia contributors responsible for the pages constituting the discourse.cpp corpus, and not the common cold or malaria.

more distant – but fully interpretable – associations are found throughout discourse.cpp. politics, for instance, is compared to the japanese puppet theatre bunraku, probably picking up on the wide-ranging, disenchanted view of government as ‘a circus’. pride, although it does not involve any direct metaphorical association, is pointedly described as a list of ‘status symbols’: pride is your clothes, your girlfriend, a meal. less obvious connections are also found. the poem the handbag is a list of objects commonly found in women’s handbags. the last item in the list, however, is the noun household. whether there is a natural interpretation for this association can be debated, but it picks out a relationship between the handbag and the notion of a home – perhaps a sense of safety, or else a ‘realm’ over which control is exerted.

it is probably clear that discourse.cpp is not computer-generated poetry, in the sense that human input is removed. the presentation of the book, its materiality, the typesetting, and of course the editing of the poems were human choices. calling upon the notion of ‘intentionality’, emerson ( ) reminds us that the programmer who gets a computer to output data for the aim of producing poetry remains the driving creative force behind the enterprise. from a linguistic point of view, however, the intention of the programmer may be read in terms of pragmatics, as a speech act (searle, ), i.e.
as an act of communication with a particular goal. it does not preclude the semantics of the finished product – the meaning produced by the composition of particular lexical items – to come from a fully computational model of part of language.

so we may have traces of a computational semantics of ordinary language in discourse.cpp, but is it poetical semantics? i have tried so far to argue, with masterman, that the ‘structure of language’ – the distributional semantics space –, together with its ‘unusual connexions’ and ‘unexplored forms of word-combination’, can form the basis of poetry production. but is the output comparable to what an actual poet would have produced?

at this point, it may be helpful to think of semantics not as something that texts have, but as something that people do with texts. if in distributionalism, meaning is ‘the use of a word’, or ‘the things the word is associated with’, then producing/finding meaning is about producing/finding associations (see hofstadter, for the related argument that cognition is anchored in analogy). arguably, it is impossible for a speaker of a language not to associate when presented with a word sequence: whether speaking/writing or hearing/reading, they are drawn towards specific individual and cultural conceptual connections. it can be shown, in fact, that the neurological response of an individual presented with a word or word sequence includes an activation of relevant associations. molinaro et al. ( ) write:

while composing the meaning of an expression, comprehenders actively pre-activate semantic features that are likely to be expressed in the next parts of the sentence/discourse.

from this, it follows that:

1. human poetry, however complex, should always be experimentally distinguishable from randomised word sequences, where the latent structure of language is ignored;

2. a certain level of associativity should be identifiable in all human-produced poetry, regardless of complexity (i.e.
both a semantically opaque poem and a more straightforward text will make use of the underlying, shared structure of language).

the next section puts this hypothesis to the test by using a distributional model of semantics to quantify the associational strength of a range of poems, as well as random and factual texts.

semantic coherence in modern and contemporary poetry

if we are to show that the semantics of poetry uses the structure of ordinary language to produce meaning, we need to demonstrate that a computational model built on non-poetic language can account for at least some aspects of that semantics, regardless of the apparent difficulty of the text under consideration.

i will now turn to the issue of topic coherence, a measure of the semantic relatedness of the items in a given set of words. topic coherence has been studied from the point of view of so-called ‘topic modelling’ techniques, that is, computational methods that take a set of documents and classify them within particular topics (e.g. mimno et al., ). but the proposed measures can be applied to any set of words, and might for instance highlight that the set chair, table, office, team is more coherent than chair, cold, elephant, crime. as such, it is well suited to model the general strength of semantic association in a text.

i will investigate topic coherence in a number of poems written in the period - . the general idea is to compare texts of varying ‘difficulty’ (from metaphorical but transparent lyrics to opaque, contemporary poetry) and analyse how they behave in terms of coherence, using distributions extracted from ordinary language corpora as word representations. intuitively, it seems that more complex fragments such as the reaches of turning aside remind (coolidge, ) should be less coherent than transparent verses such as the grey veils of the half-light deepen (brooke, a).
as argued in the last section, however, we are looking for a stable level of associativity across all poetry. our model should capture associations of (roughly) equal strength in transparent and opaque fragments, making explicit connections which a human reader might not consciously recognise when first presented with a text.

following newman et al. ( ), i define the coherence of a set of words $w_1 \dots w_n$ as the mean of their pairwise similarities:

$$\mathrm{MeanSimScore}(w) = \mathrm{mean}\{\mathrm{sim}(w_i, w_j),\ i, j \in 1 \dots n,\ i < j\}$$

for example, if we were to calculate the coherence of the reaches of turning aside, we would calculate the similarities of reach with turn, reach with aside and turn with aside, and average over the three obtained scores, ignoring closed-class words.

the representations for single words are distributionally obtained from the british national corpus (bnc). the corpus is lemmatised and each lemma is followed by a part of speech according to the claws tagset format (leech et al., ). for the experiments reported here, parts of speech are grouped into broad classes like n for nouns or v for verbs. furthermore, i only retain words in the following categories: nouns, verbs, adjectives and adverbs (punctuation is ignored). each text/poem is converted into a -word window format, that is, context is defined by the five words preceding and the five words following the target word.
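the steps described in this section (window-based context counts, a ratio weighting p(c|t)/p(c), i.e. pmi without the log, cosine similarity, and coherence as mean pairwise similarity) can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the system used for the experiments: the function names, the toy token list and the window of two words are hypothetical, and the sketch works on raw tokens rather than on a lemmatised, pos-tagged corpus.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def cooccurrence_counts(tokens, window=5):
    """Count context words within `window` positions on each side of a target."""
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[target][tokens[j]] += 1
    return counts


def ratio_weights(counts):
    """Weight each context c of target t by p(c|t) / p(c) (PMI without the log)."""
    freq_t = {t: sum(ctx.values()) for t, ctx in counts.items()}
    freq_c = Counter()
    for ctx in counts.values():
        freq_c.update(ctx)  # adds the counts of each context word
    total = sum(freq_t.values())
    return {
        t: {c: (n * total) / (freq_t[t] * freq_c[c]) for c, n in ctx.items()}
        for t, ctx in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[c] for c, w in a.items() if c in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coherence(words, vectors):
    """Mean cosine similarity over all pairs i < j; identical words are skipped."""
    sims = [
        cosine(vectors[a], vectors[b])
        for a, b in combinations(words, 2)
        if a != b
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Toy usage on an invented token sequence (hypothetical data).
tokens = "the cat chased the mouse the dog chased the cat the poet wrote a poem".split()
vectors = ratio_weights(cooccurrence_counts(tokens, window=2))
print(coherence(["cat", "dog", "mouse"], vectors))
```

the sparse dictionary representation is a design choice made for brevity; a fixed list of the most frequent corpus words as dimensions, as used in the paper, would simply restrict which context keys are kept.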
to calculate co-occurrences, the following equations are used:

$$\mathit{freq}_{c_i} = \sum_{t} \mathit{freq}_{c_i,t} \qquad \mathit{freq}_{t} = \sum_{c_i} \mathit{freq}_{c_i,t} \qquad \mathit{freq}_{total} = \sum_{c_i,t} \mathit{freq}_{c_i,t}$$

the quantities in these equations represent the following:

$\mathit{freq}_{c_i,t}$: frequency of the context word $c_i$ with the target word $t$
$\mathit{freq}_{total}$: total count of word tokens
$\mathit{freq}_{t}$: frequency of the target word $t$
$\mathit{freq}_{c_i}$: frequency of the context word $c_i$

the weight of each context term in the distribution is given by the function suggested in mitchell & lapata ( ) (pmi without log):

$$v_i(t) = \frac{p(c_i \mid t)}{p(c_i)} = \frac{\mathit{freq}_{c_i,t} \times \mathit{freq}_{total}}{\mathit{freq}_{t} \times \mathit{freq}_{c_i}}$$

finally, the most frequent words in the corpus are taken as the dimensions of the semantic space (this figure has been shown to give good performance in other studies: see again mitchell & lapata, ). the similarity between two distributions is calculated using the cosine measure:

$$\mathrm{sim}(a, b) = \frac{\sum_{i=1}^{n} a_i \times b_i}{\sqrt{\sum_{i=1}^{n} (a_i)^2} \times \sqrt{\sum_{i=1}^{n} (b_i)^2}}$$

where a and b are vectors and i ranges over the n dimensions of the semantic space.

experimental setup

eight poems were selected (all written in modern english), intended to cover a range of ‘difficulty’. that is, some have a straightforward meaning while others require much more interpretation. two additional texts were added to the sample: one is a subset of a wikipedia article while the other is randomly-generated by piecing together words from the british national corpus and inserting random punctuation. these two texts were meant to provide an upper and lower bound on associativity (the assumption being that a factual text makes heavy use of the more common turns of phrase in language). table . gives an impression of the content of the sample by showing the beginning of each text.

the texts in the sample are fairly intuitively categorisable into various degrees of complexity. to confirm this, the author and two independent annotators attributed a ‘difficulty’ score in the range - to each text (where = very easy to understand and = very hard to understand).
to help the annotators with the task, they were first presented with simple questions regarding the topic of the text:

what is this poem about?
how confident are you of your answer? ( =not confident at all, =absolutely confident)
what is the main emotion conveyed by the poem? (e.g. anger, love, disappointment, etc)
what are the main images in the poem? (e.g. some people talking, the sun, a busy street, etc)
how did you like the poem? ( =not at all, =a lot)
how difficult did you find it to understand the poem? ( =very easy, =very difficult)

author | year | title | excerpt
rupert brooke | | day that i have loved | tenderly, day that i have loved, i close your eyes,/ and smooth your quiet brow, and fold your thin dead hands.
coolidge | | argument over, amounting | in edges, in barriers the tonal light of t/ the one thing removed overemphasizes tonally/ and you could hurry it, and it vanish and plan
carol ann duffy | | valentine | not a red rose or a satin heart./ i give you an onion./ it is a moon wrapped in brown paper.
allen ginsberg | | five a.m. | elan that lifts me above the clouds/ into pure space, timeless, yea eternal/ breath transmuted into words/ transmuted back to breath
maccormack | | at issue iii | putting shape into getting without perfect in a culture that doesn’t think, pumps up, the/ two traits go at the face of rate themselves
avery slater | | ithaca, winter. | creaking, skeleton bells of trees/ dissolve in a quilt of pale flurries.
gertrude stein | | if i told him, a completed portrait of picasso | if i told him would he like it. would he like it if i told him./ would he like it would napoleon would napoleon would would he like it.
oscar wilde | | in the gold room – a harmony | her ivory hands on the ivory keys/ strayed in a fitful fantasy,/ like the silver gleam when the poplar trees/ rustle their pale-leaves listlessly
wikipedia | ? | ‘the language poets’ | the language poets (or l=a=n=g=u=a=g=e poets, after the magazine of that name) are an avant garde group or tendency in united states poetry that emerged in the late s and early s.
random text | | | psychologist. strong. - tabard, battersea, wolf, coma, acas. hutchinson cap’n. suet. ellesmere. proportionality/ mince. outside, morey folk, cum, willoughby, belligerent, dimension

table : excerpts from the sample

the average spearman correlation between annotators was . , indicating very good agreement. table . shows individual scores for the three annotators, as well as an average of those scores. the table is sorted from the most to the least difficult text.

author | annotator | annotator | average
random .
maccormack .
coolidge .
ginsberg .
stein .
slater .
brooke .
wilde .
duffy .
wikipedia .

table : difficulty scores for each text in sample

as expected, the wikipedia article is annotated as being the most transparent text, while the random poem is considered the most difficult (on a par, however, with maccormack’s ‘at issue iii’). when told that one of the texts was randomly produced by a computer, the two independent annotators were able to identify it but also indicated that maccormack’s poem had caused some hesitation.

the poems are pos-tagged with treetagger, and the tagging is manually checked. coherence is calculated between the words in each sentence for poems which have a clear sentence structure (brooke, duffy, slater, stein, wilde, the random text and wikipedia article). the other poems are split into fragments corresponding to the average sentence length in the texts made of sentences. only content words (nouns, verbs, adjectives and some adverbs) with a frequency over in the bnc are considered: the frequency threshold ensures that good-quality distributions are extracted. for the calculation of coherence, very frequent adverbs and auxiliaries are also disregarded (e.g. so, as, anymore, be). in total, distributions are extracted from the bnc, covering around % of all content words in the sample. the average sentence length comes to content words.
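the preprocessing just described (keeping content words above a corpus frequency threshold, dropping a small stoplist, and cutting unpunctuated poems into fragments of a fixed number of content words) might look as follows in python. this is only a sketch under assumptions: the tag labels, the frequency table, the threshold of 100 and the fragment size of 2 are hypothetical values chosen for the example, not the settings used in the experiments.

```python
def content_words(tagged_tokens, corpus_freq, min_freq, keep_pos, stoplist=frozenset()):
    """Keep lemmas with an accepted tag, outside the stoplist, above the threshold."""
    return [
        lemma
        for lemma, pos in tagged_tokens
        if pos in keep_pos
        and lemma not in stoplist
        and corpus_freq.get(lemma, 0) > min_freq
    ]


def split_into_fragments(words, size):
    """Cut a word list into consecutive fragments of `size` words (last may be shorter)."""
    return [words[i:i + size] for i in range(0, len(words), size)]


# Hypothetical tagged input and corpus frequencies, for illustration only.
tagged = [("reach", "n"), ("of", "prep"), ("turn", "v"),
          ("aside", "r"), ("remind", "v"), ("so", "r")]
corpus_freq = {"reach": 500, "turn": 900, "aside": 120, "remind": 80, "so": 5000}

words = content_words(tagged, corpus_freq, min_freq=100,
                      keep_pos={"n", "v", "a", "r"}, stoplist={"so"})
print(split_into_fragments(words, size=2))
```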
when a word is repeated within a fragment, the similarity of that word with itself is not included in the coherence calculation, so that poems with a high level of repetition do not come out as being particularly coherent. once coherence figures have been calculated for all sentences/fragments in a text, the average of these figures is taken to give an overall coherence measure for the text.

results

table . shows the average semantic coherence of our ten texts, together with the mean and standard deviation for the sample. fig. shows the results as a graph. the horizontal line going through the graph shows the mean of the coherence values, while the greyed out areas highlight the points contained within the standard deviation. the figure clearly shows that the random text and the wikipedia article are outside of the standard deviation, as would be expected. randomness results in much lower coherence than for the human-produced poetry, and the factual text displays greater coherence.

poem | average coherence
random .
slater .
duffy .
wilde .
maccormack .
ginsberg .
brooke .
stein .
coolidge .
wikipedia .
mean .
stdv .

table : semantic coherence for each text in sample

[figure : semantic coherence plot for the -text sample. vertical axis: semantic coherence; points: random, slater, duffy, wilde, maccormack, ginsberg, brooke, stein, coolidge, wikipedia; mean = . , standard deviation = . ]

[figure : coherence range for poetry versus random and factual texts. vertical axis: semantic coherence; groups: random, poetry, factual; poetry mean = . , poetry stdv = . , random mean = . , random stdv = . , factual mean = . , factual stdv = . ]

to confirm that the sampled poetry could generally be distinguished from both factual and random texts, other texts were introduced ( random, factual in the form of the first paragraphs of wikipedia article related to poetry) and their average coherence computed. the effect is confirmed, with the coherence of random texts lying in the range [ . - .
] and the coherence of the wikipedia texts in the range [ . - . ]. fig. shows the means and standard deviations of the three types of text. the differences are statistically significant at the % level. these results show that, as hypothesised, human-produced poetry can clearly be differentiated from random texts, even in cases where a human reader might hesitate (e.g. maccormack’s ‘at issue iii’). but they also indicate a significant difference between factual and poetic writing. the generally lower coherence of poetry com- pared to factual prose can presumably be put down to both the linguistic creativity and the greater metaphorical content of the texts. despite journey being a conventional metaphor for life, for instance, we cannot expect their distributions to be as similar as, say, journey and travel because their overall pattern of use differs in significant ways. the creativity of the poet is also at work in that he or she might pick out unheard com- binations which, although they make use of the underlying structure of language, do not score so highly in terms of distributional similarity (see § . ). notably, the poems have a standard deviation similar to the factual texts, indicat- ing no marked difference between individual poems, despite their obvious variety in complexity. there is also no correlation between the perceived difficulty of the po- ems, as given by the annotators, and their semantic coherence. in the top range, we find coolidge and brooke close together, despite the fact that coolidge is apparently fairly opaque and brooke generally transparent. at the other end of the range, duffy and wilde, perceived to be generally ‘easy’, come slightly below the mean. this dis- proves that semantic coherence is linked to perceived complexity and thus, the thesis that ‘there is not much semantics in complex poetry’. 
when looking closer at the results, we find that maccormack’s ‘play arrived, how large in prompting’ is as coherent as wikipedia’s opening sentence ‘the language poets [...] are an avant garde group or tendency in united states poetry that emerged in the late s and early s’ (both have coherence . ). this may well seem puzzling, but again, a detailed analysis of the distributions involved shows that some semantics is clearly shared between the words of maccormack’s fragment. table . shows the word pairs involved in the coherence calculation, together with their cosine similarities and all the contexts they share in the most highly weighted subset of their distributions (the shared terms must have a weight of at least . in both distributions). i have grouped contexts by topic where possible, to make results more readable.

word pair | similarity | shared contexts
play n–arrive v | . | {ticket n scene n studio n hall n} {australia n africa n} {wednesday n minute n regularly a}
play n–large a | . | {audience n hall n} area n flat a dominate v
play n–prompt v | . | {tv n audience n performance n success n} {write v version n scene n story n} {violence n dominate v} {move n run n} united a rain n
arrive v–large a | . | {ship n station n island n} {crowd n hall n}
arrive v–prompt v | . | {police n troop n army n warning n} {finally a eventually a june n march n weekend n} {flight n visit n} {paris n germany n} news n scene n miss n william n couple n
large a–prompt v | . | {audience n gather v} {firm n organization n europe n} {coal n plastic n fruit n} complex a volume n dominate v

table : similarities and shared contexts for the fragment ‘play arrived, how large in prompting’

several topics emerge across the captured contexts. a first one covers performance arts and their audience (ticket, scene, audience, performance, etc). a second one concerns temporality (wednesday, minute, finally, june, etc.). a third one relates to policing and violence (police, troop, army, violence, dominate, etc). we also find, perhaps less evidently, a topic about news (tv, story, news). tellingly, these themes are echoed in other parts of the poem: we find gun, combat, push, violence close to the fragment under consideration; day, year, postpone, temporary, minute, hour across the text; and story, celebrity, television, radio, glamorous, camera also throughout the text. even the apparently unconnected fruit, which appears in the shared contexts of large and prompt, occurs two words after our fragment in the poem.

it is worth pointing out that, although the highlighted shared contexts are amongst the most highly weighted for the corresponding distributions (the . threshold means that we are effectively considering the top % of the play, arrive and large distributions, and the top % of prompt), they are not the most salient features overall for those words. that is, they probably do not correspond to features that a native speaker of english would readily associate with play, arrive, large or prompt. however, a closer inspection of the type usually practised by literary criticism would certainly uncover such associational threads in the poem. in other words, if meaning is not immediately present when reading the poem for the first time, it is also not closed to the reader able to disregard the broader pathways of the semantic space.

. making sense

in linguistics, the term ‘semantic transparency’ is used to refer to how easy or difficult it is for the speaker of a language to guess what a particular combination of words means. according to zwitserlood ( ), “[t]he meaning of a fully transparent compound is synchronically related to the meaning of its composite words”. so a noun phrase like vulnerable gunman might be said to be semantically transparent while sharp glue would not (vecchi et al., ). transparency is not directly related to acceptability in language.
some well-known linguistic compounds are not ‘compositional’, that is, the meaning of the compound is not given by the meaning of its parts (e.g. ivory tower), but they are usually fixed phrases which are frequent enough that their meaning is learnt in the same way as the meaning of single words. the line between semantically transparent and opaque phrases is naturally very blurred. bell & schäfer ( ) point out, for instance, that sharp glue, which is neither transparent nor fixed in english, could be glue with a ph less than , i.e., they can come up with an interpretation for a noun phrase without an obvious compositional – or previously known – meaning.

the study of poetry has consequences for general linguistics. it may be possible to say that semantic transparency is not a fixed attribute of a word combination, but rather a state in the mind of the hearer. ‘making sense’ of a text, or ‘doing semantics’, becomes the process of making the text more transparent by investigating less-travelled pathways in the semantic space. in the same way that we are aware of the similarity between cats and mongooses – even though we hardly ever encounter the utterance cats and mongooses are similar – it is likely that we capture ‘hidden’ relations in the semantic space, leading us to recognise the connection between handbag and household, between pride and girlfriend, or again between mortgage and hope (that which can be lent and taken away). the task of ‘making sense’ of poetry may then be seen as a type of disambiguation, where the dimensions of a word’s distribution are re-weighted to take context into account (see e.g. erk & padó, for a distributional model of sense disambiguation).

a last word should be reserved for the study of linguistic creativity. although a large body of work exists on the topic of modelling metaphorical language (see shutova, for an overview), the study of poetical semantics has not been a focus of investigation so far.
in spite of this, the claims that apply to metaphor and other well-studied productive phenomena can arguably be made for more complex creative processes: simply put, there is nothing in the present paper that would invalidate the claim that creativity in language can be traced back to its very ordinary use (see veale, for an extensive, computationally-based account of this).

language can be seen as the result of profoundly individual and yet ultimately collective phenomena. the semantics we ascribe to very mundane objects like cups and mugs varies widely, depending on speakers (labov, ). still, speakers of a language communicate effortlessly with each other, and general evolutionary effects can be observed in any language, be it at the phonetic, syntactic or semantic level. distributions capture the common denominator which allows communication to take place. they are in essence a model of the ‘normative’ effects of language use: the repeated utterance of a word in a particular type of context, across a wide range of speakers, fixes its meaning in a way that makes its usage predictable and fit for successful communication. each new utterance entering the language contributes to these norming effects by either reinforcing the status quo or, possibly, modifying it – thereby accounting for language change. now, if ordinary language is a collective construction, so is its underlying semantic structure and we could expect the latent conceptual associations in this structure to be roughly shared across a specific language and cultural background. the hidden, uncommon associations invoked in poetical semantics may be said to come from the very normative force of everyday speech.

conclusion

in his literary criticism, richards ( ) suggests the existence of an intuitive process which guides the poet towards particular linguistic combinations:

the poet [...] makes the reader pick out the precise particular senses required from an indefinite number of possible senses which a word, phrase or sentence may carry. the means by which he does this are many and varied. [...] [t]he way in which he uses them is the poet’s own secret, something which cannot be taught. he knows how to do it, but he does not himself necessarily know how it is done. (p. )

a possible linguistic translation of this intuition, based on distributionalism, is to say that the poet, as a speaker of a language, has access to its semantic structure. the ‘secret’ hypothesised by richards is perhaps simply the special skill of some individuals to analyse that structure. a poet’s work provides exemplars of his/her observations, where the observed data consists of many actual snippets of language use, placing the work of poetry within a collective linguistic intuition.

using insights from computational linguistics, we can model the ways in which certain types of poetical output might emerge. the implementation of such models follows some prior claims about the nature of language (wittgenstein, ), about semantic structure and poetry (masterman, ), and about the connection between ordinary language and poetical expression (miles, ). in this paper, i argued that:

1. assuming a distributional view of meaning, it is possible to show the relation between ordinary language and the ‘extraordinary’ language of poetry;
2. the distributional model clearly captures the distinction between human and randomised production, regardless of the immediate semantic transparency of the text;
3. the distributional model shows a stable layer of semantic associativity across poems, regardless of complexity.

a natural next step for the investigation presented here would be to explore the annotators’ judgements on semantic complexity. it is unclear what exactly makes a fairly straightforward text such as duffy’s ‘valentine’ comparatively less coherent than the complex ‘argument over.
amounted.’ by coolidge. a more fine-grained analysis of the results would be necessary to make any hypothesis.

as a final note, it may be worth pointing out that, although ‘big data’ has so far mostly been used for the analysis of large phenomena in the digital humanities, this paper shows that one of its incarnations (distributional representations) may have a role to play as a background linguistic model for close reading.

notes

an example output can be seen at http://www.chart.ac.uk/chart /papers/clements.html.

this must not invalidate distributional semantics techniques as essentially wittgensteinian constructs. a corpus which is coherent from the point of view of ‘speech acts’ (searle, ) can be seen as a particular language game: the meaning representations obtained from it are just specific to that language game. so for instance, the online encyclopedia wikipedia might be said to collate language games where one participant gives information about a particular topic to a hearer, in a regulated written form.

http://www.wikipedia.org/

see dowty et al. ( ) for an introduction to montague semantics and a description of verbs and adjectives as ‘functions’ taking arguments.

see emerson ( ) for a review of discourse.cpp covering this issue.

there is a subtlety here. in a distributional account, the semantics of words comes from pragmatics, that is, from an indefinite number of situations where, collectively, words are used in particular situations, with a particular intent. these situations are separate from (or more accurately, a very large superset of) the pragmatic situation and intent behind the creation of a specific poem.

available at http://www.cis.uni-muenchen.de/~schmid/tools/treetagger/

this fairly short sentence length is due to stein’s poem containing many single word sentences. however, experiments with different fragment lengths (up to content words) did not significantly change the results reported here.
references

baroni, m., bernardi, r., do, n.-q., & shan, c.-c. ( ). entailment above the word level in distributional semantics. in proceedings of the th conference of the european chapter of the association for computational linguistics (eacl ).
bell, m. j., & schäfer, m. ( ). semantic transparency: challenges for distributional semantics. in proceedings of the ‘towards a formal distributional semantics’ workshop (collocated with iwcs , potsdam, germany).
bloomfield, l. ( ). language. new york: henry holt.
brooke, r. ( a). day that i have loved. in poems. sidgwick & jackson. url http://www.rupertbrooke.com/poems/ - /day that i have loved
brooke, r. ( b). the fish. in poems. sidgwick & jackson.
bruns, g. ( ). the material of poetry: sketches for a philosophical poetics. university of georgia press.
coolidge, c. ( ). argument over, amounting. in sound as thought: poems - . green integer. url http://www.poetryfoundation.org/poem/
dowty, d. r., wall, r., & peters, s. ( ). introduction to montague semantics. springer.
duffy, c. a. ( ). valentine. in mean time. london, england: anvil press.
emerson, l. ( ). materiality, intentionality, and the computer-generated poem: reading walter benn michaels with erin mouré’s pillage laud. esc: english studies in canada, ( ), – .
emerson, l. ( ). review of discourse.cpp by os le si. computational linguistics, ( ), – .
erk, k. ( ). vector space models of word meaning and phrase meaning: a survey. language and linguistics compass, : , – .
erk, k., & padó, s. ( ). a structured vector space model for word meaning in context. in proceedings of the conference on empirical methods in natural language processing (emnlp ). honolulu, hi.
evert, s. ( ). the statistics of word cooccurrences: word pairs and collocations. ph.d. thesis, university of stuttgart.
feng, y., & lapata, m. ( ). visual information in semantic representation. in human language technologies: the annual conference of the north american chapter of the association for computational linguistics (naacl ), (pp. – ). los angeles, california.
firth, j. r. ( ). a synopsis of linguistic theory, – . oxford: philological society.
frege, g. ( ). über sinn und bedeutung. zeitschrift für philosophie und philosophische kritik, , – .
ginsberg, a. ( ). a.m. in death and fame: last poems - . harper perennial (reprint ).
grefenstette, g. ( ). explorations in automatic thesaurus discovery. springer.
harris, z. ( ). distributional structure. word, ( - ), – .
hejinian, l. ( ). the language of inquiry. berkeley: university of california press.
hofstadter, d. r. ( ). analogy as the core of cognition. in the analogical mind: perspectives from cognitive science, (pp. – ). cambridge ma: the mit press/bradford books.
hughes, t. ( ). wuthering heights. in birthday letters. faber & faber.
labov, w. ( ). the boundaries of words and their meanings. in c.-j. bailey, & r. w. shuy (eds.) new ways of analysing variation in english, (pp. – ). georgetown university press.
landauer, t. k., & dumais, s. t. ( ). a solution to plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. psychological review, (pp. – ).
le si, o. ( ). discourse.cpp. berlin: press press.
leech, g., garside, r., & bryant, m. ( ). claws : the tagging of the british national corpus. in proceedings of the th international conference on computational linguistics (coling ), (pp. – ). kyoto, japan.
masterman, m. ( ). computerized haiku. in j. reichardt (ed.) cybernetics, art and ideas, (pp. – ). london, england: studio vista.
miles, j. ( ). more semantics of poetry. the kenyon review, ( ), – .
mimno, d., wallach, h. m., talley, e., leenders, m., & mccallum, a. ( ). optimizing semantic coherence in topic models. in proceedings of the conference on empirical methods in natural language processing (emnlp ), (pp. – ).
mitchell, j., & lapata, m. ( ). composition in distributional models of semantics. cognitive science, ( ), – . molinaro, n., carreiras, m., & duñabeitia, j. a. ( ). semantic combinatorial pro- cessing of non-anomalous expressions. neuroimage, ( ), – . montague, r. ( ). the proper treatment of quantification in ordinary english. in j. hintikka, j. moravcsik, & p. suppes (eds.) approaches to natural language, (pp. – ). dordrecht. newman, d., lau, j. h., grieser, k., & baldwin, t. ( ). automatic evaluation of topic coherence. in human language technologies: the annual conference of the north american chapter of the association for computational linguistics (naacl ), (pp. – ). los angeles, ca. richards, i. a. ( ). poetries and sciences: a reissue of science and poetry ( , ) with commentary. london: routledge & kegan paul. schütze, h. ( ). automatic word sense discrimination. computational linguistics, ( ), – . searle, j. ( ). speech acts. cambridge university press. shutova, e. ( ). models of metaphor in nlp. in proceedings of the th annual meeting of the association for computational linguistics (acl ), (pp. – ). slater, a. ( ). ithaca, winter. the cortland review, . url http://www.cortlandreview.com/issue/ /slater.html spärck jones, k. ( ). synonymy and semantic classification. ph.d. thesis, univer- sity of cambridge. cambridge language research unit. stein, g. ( ). if i told him: a completed portrait of picasso. vanity fair. tarski, a. ( ). the semantic conception of truth. philosophy and phenomenologi- cal research, , – . thater, s., fürstenau, h., & pinkal, m. ( ). contextualizing semantic representa- tions using syntactically enriched vector models. in proceedings of the th annual meeting of the association for computational linguistics (acl ), (pp. – ). uppsala, sweden. turney, p. d., & pantel, p. ( ). from frequency to meaning: vector space models of semantics. journal of artificial intelligence research, , – . veale, t. ( ). 
exploding the creativity myth: the computational foundations of linguistic creativity. a&c black. vecchi, e. m., baroni, m., & zamparelli, r. ( ). (linear) maps of the impossi- ble: capturing semantic anomalies in distributional space. in proceedings of the acl disco (distributional semantics and compositionality) workshop. portland, oregon. wheelwright, p. ( ). on the semantics of poetry. the kenyon review, ( ), pp. – . wittgenstein, l. ( ). philosophical investigations. wiley-blackwell ( reprint). zwitserlood, p. ( ). the role of semantic transparency in the processing and repre- sentation of dutch compounds. language and cognitive processes, ( ), – . / / digits: two reports on new units of scholarly publication https://quod.lib.umich.edu/j/jep/ . . ?view=text;rgn=main / c u r r e n t ( h t t p : / / w w w . j o u r n a l o f e l e c t r o n i c p u b l i s h i n g . o r g / )a r c h i v e ( / / q u o d . l i b . u m i c h . e d u / j / j e p / . * )a b o u t ( h t t p : / / w w w . j o u r n a l o f e l e c t r o n i c p u b l i s h i n g . o r g / a b oe d i t o r s ( h t t p : / / w w w . j o u r n a l o f e l e c t r o n i c ps u b m i t ( h t t p s : / / j o u r n a l o f e digits two reports on new units of scholarly publication matt burton ���������� �� ���������� matthew j. lavin ���������� �� ���������� jessica otis ������ ����� ���������� scott b. weingart �������� ������ ���������� [ ] [#� ] journal of electronic publishing volume , issue , doi: https://doi.org/ . / . . [https://doi.org/ . / . . ] [http://creativecommons.org/licenses/by/ . /] table of contents table of contents [https://quod.lib.umich.edu/j/jep/ . . /--digits-two- reports-on-new-units-of-scholarly-publication? trgt=p_s zep kg;view=fulltext#p_s zep kg] introduction [https://quod.lib.umich.edu/j/jep/ . . /--digits-two- reports-on-new-units-of-scholarly-publication? trgt=p_ vso qxqqhs;view=fulltext#p_ vso qxqqhs] a new unit of publication [https://quod.lib.umich.edu/j/jep/ . . 
/--digits-two- reports-on-new-units-of-scholarly-publication? trgt=p_zfwr jh g;view=fulltext#p_zfwr jh g] technical background [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_fcy sf cogom;view=fulltext#p_fcy sf cogom] what are containers? [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_tx znfsc hya;view=fulltext#p_tx znfsc hya] containers vs. virtual machines [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ngqn f dk ;view=fulltext#p_ngqn f dk ] container technologies [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_scg b las ;view=fulltext#p_scg b las ] orchestration [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_mpp sbeamoyh;view=fulltext#p_mpp sbeamoyh] the significance of containers [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ k wleo wjo;view=fulltext#p_ k wleo wjo] container use in academia [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_mncs swgaqnl;view=fulltext#p_mncs swgaqnl] containers for software dependency management [https://quod.lib.umich.edu/j/jep/ . . /--digits- http://www.journalofelectronicpublishing.org/ https://quod.lib.umich.edu/j/jep/ .* http://www.journalofelectronicpublishing.org/about.html http://www.journalofelectronicpublishing.org/editors.html https://journalofelectronicpublishing.submittable.com/submit https://quod.lib.umich.edu/j/jep/ . ?rgn=main;view=fulltext https://quod.lib.umich.edu/j/jep/ . . *?rgn=main;view=fulltext https://doi.org/ . / . . http://creativecommons.org/licenses/by/ . / https://quod.lib.umich.edu/j/jep/ . . 
/--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_s zep kg;view=fulltext#p_s zep kg https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ vso qxqqhs;view=fulltext#p_ vso qxqqhs https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_zfwr jh g;view=fulltext#p_zfwr jh g https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_fcy sf cogom;view=fulltext#p_fcy sf cogom https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_tx znfsc hya;view=fulltext#p_tx znfsc hya https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ngqn f dk ;view=fulltext#p_ngqn f dk https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_scg b las ;view=fulltext#p_scg b las https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_mpp sbeamoyh;view=fulltext#p_mpp sbeamoyh https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ k wleo wjo;view=fulltext#p_ k wleo wjo https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_mncs swgaqnl;view=fulltext#p_mncs swgaqnl https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ a v l i g ;view=fulltext#p_ a v l i g / / digits: two reports on new units of scholarly publication https://quod.lib.umich.edu/j/jep/ . . ?view=text;rgn=main / two-reports-on-new-units-of-scholarly-publication? trgt=p_ a v l i g ;view=fulltext#p_ a v l i g ] containers for reproducible research [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? 
trgt=p_ i ozwfj h;view=fulltext#p_ i ozwfj h] containers for preservation [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_yzfyrh fz a;view=fulltext#p_yzfyrh fz a] full-stack scholarship [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_rztoqzq k ;view=fulltext#p_rztoqzq k ] conclusion: containers for digital scholarship [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ ua p r k ;view=fulltext#p_ ua p r k ] bibliography [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ikigdabqqdg ;view=fulltext#p_ikigdabqqdg ] new scholarship in the digital age [https://quod.lib.umich.edu/j/jep/ . . /--digits-two- reports-on-new-units-of-scholarly-publication? trgt=p_gjdgxs;view=fulltext#p_gjdgxs] introduction [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_z q ehpt xja;view=fulltext#p_z q ehpt xja] data collection [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_csicnzl f ;view=fulltext#p_csicnzl f ] the problems of non-traditional scholarly objects [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ esq m olpg ;view=fulltext#p_ esq m olpg ] recommendations for future intervention [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ xpjiqx avn;view=fulltext#p_ xpjiqx avn] report on interviews and recommendations [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ dy vkm;view=fulltext#p_ dy vkm] challenges making non-traditional scholarly objects [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? 
trgt=p_ t csth ihn ;view=fulltext#p_ t csth ihn ] collaborative production [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ts ymhqm;view=fulltext#p_ts ymhqm] sociotechnical challenges and limitations [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ d og ;view=fulltext#p_ d og ] funding [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ dp vu;view=fulltext#p_ dp vu] copyright and sensitive data [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_ rdcrjn;view=fulltext#p_ rdcrjn] credit models [https://quod.lib.umich.edu/j/jep/ . . /--digits- two-reports-on-new-units-of-scholarly-publication? trgt=p_lnxbz ;view=fulltext#p_lnxbz ] challenges publishing non-traditional scholarly objects [https://quod.lib.umich.edu/j/jep/ . . /--digits- https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ a v l i g ;view=fulltext#p_ a v l i g https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ i ozwfj h;view=fulltext#p_ i ozwfj h https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_yzfyrh fz a;view=fulltext#p_yzfyrh fz a https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_rztoqzq k ;view=fulltext#p_rztoqzq k https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ ua p r k ;view=fulltext#p_ ua p r k https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ikigdabqqdg ;view=fulltext#p_ikigdabqqdg https://quod.lib.umich.edu/j/jep/ . . 
/--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_gjdgxs;view=fulltext#p_gjdgxs https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_z q ehpt xja;view=fulltext#p_z q ehpt xja https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_csicnzl f ;view=fulltext#p_csicnzl f https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ esq m olpg ;view=fulltext#p_ esq m olpg https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ xpjiqx avn;view=fulltext#p_ xpjiqx avn https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ dy vkm;view=fulltext#p_ dy vkm https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ t csth ihn ;view=fulltext#p_ t csth ihn https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ts ymhqm;view=fulltext#p_ts ymhqm https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ d og ;view=fulltext#p_ d og https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ dp vu;view=fulltext#p_ dp vu https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ rdcrjn;view=fulltext#p_ rdcrjn https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_lnxbz ;view=fulltext#p_lnxbz https://quod.lib.umich.edu/j/jep/ . . /--digits-two-reports-on-new-units-of-scholarly-publication?trgt=p_ nkun ;view=fulltext#p_ nkun / / digits: two reports on new units of scholarly publication https://quod.lib.umich.edu/j/jep/ . . ?view=text;rgn=main / two-reports-on-new-units-of-scholarly-publication? 
peer review
prestige
alternative audiences
discoverability and access
financial models and licensing
challenges maintaining non-traditional scholarly objects
disparate notions of maintenance
legacy projects
technical and personnel resources
cost models
challenges preserving non-traditional scholarly objects
preservable outputs
purpose of preservation
when to preserve
continuing to institutionalize preservation
technical resources
conclusion & looking forward
bibliography
appendix a: interview protocol
appendix b: commonly referenced technologies

introduction

the digits team (matt burton, matthew j. lavin, jessica otis, and scott b. weingart) convened around the question of how we might share, preserve, and legitimize scholarship freed from the affordances of print. for the a.w. mellon-funded digits planning grant ( - ), the pis had three goals:

1. investigate the use of software containers for research in the sciences, social sciences, and humanities.
2. assess the infrastructural needs of digital humanists around publishing and preserving web-centric scholarship.
3. gather a team of experts to guide the above activities and plan how they might inform a beneficial intervention into the scholarly ecosystem.

through our investigation into the scholarly uses of containers, we discovered that the technical infrastructure needed to connect containers with digital publications is underdeveloped. we see potential for container technologies to facilitate existing digital scholarly publications and afford new forms of computational scholarship, but this process would first require a series of infrastructural bridges.
the digital scholarship needs assessment we conducted, as well as our advisory board meetings, made it clear that a targeted technological intervention alone would not be enough to welcome web-first publications into the scholarly ecosystem; in-tandem cultural and institutional changes are also necessary.

the first and second of our three tasks resulted in the two reports that comprise this article. the first report, a new unit of publication: the potential of software containers for digital scholarship, involved an environmental and secondary source scan of activities at the intersection of containerization and scholarship. the second report, new scholarship in the digital age: making, publishing, maintaining, and preserving non-traditional scholarly objects, summarizes interviews of scholars, technicians, publishers, and others who work towards the publication of digital-first scholarship.

both reports were presented to the digits advisory board, including laurie allen (penn), lauren brumfield (reclaim hosting), dan cohen (northeastern), dan evans (cmu), martin paul eve (birkbeck college, london), ilya kreymer (rhizome), alison langmead (pitt), sharon leon (msu), david newbury (getty), andrew odewahn (o’reilly media), mary shaw (cmu), ammon shepherd (uva), ed summers (umd), whitney trettien (penn), amanda visconti (uva), keith webster (cmu), and david wilkinson (pitt).

several lessons became apparent over the course of the grant period. our container study revealed myriad research-oriented use-cases of containers in academia. container adoption is much wider in the sciences than in the humanities, especially for dependency management and reproducibility. our findings also suggest that containers are used in teaching, but we didn’t fully investigate this topic. lastly, we found few if any efforts focusing specifically on containers as a unit of scholarly publication at the time of conducting our research.
in the past several years, however, some additional examples have arisen or come to our attention, including binder and nextjournal.

commenting on the use of a software container as a unit of scholarly publication, the advisory board stressed the importance of starting with templates or examples when creating digital platforms, of making working prototypes before over-theorizing, and of creating a platform that fits easily within the current publication ecosystem. the board further suggested that a project on the long-term costs associated with creating, hosting, and maintaining digital scholarly objects would be critically useful to efforts in this space moving forward. the first advisory board meeting settled on four important elements that a container-based intervention would need to encompass to be successful: setting specifications, creating a production environment, facilitating a publication platform, and designing with preservation as a top priority.

our dh infrastructure study reinforced our sense of how many factors impede digital scholarship, as well as how deeply these impediments run. the diverse ecosystem around digital publications adds friction to their existence at every stage of their lifecycle, particularly at points where a digital object is transferred from one party to another.
after conducting our study, we concluded that the best way to solve the identified problems would be to target every stage of the digital scholarly workflow in concert. although these interventions, ideally, would occur all at once, various piecemeal interventions could radically improve the ability of scholars to create, publish, preserve, and receive recognition for their digital work. some of the interventions we discuss are technical, but at least as many are entirely social. even where technical solutions are to be found, implementing them will require scholarly buy-in and a willingness to adapt existing scholarly practices.

the second advisory board meeting echoed the findings of the infrastructure report, and additionally offered some next steps:

1. constructing the technology and standards for a self-contained digital scholarly object.
2. developing plugins for pre-existing authorship, publication, and preservation platforms to allow for the easy transfer of complex digital objects from one stage to the next.
3. developing a tool that can encapsulate a given system stack and solicit metadata on the scholarly object within, in order to create a digital object conforming to the agreed upon standards.
4. creating or fostering sample publications that use the proposed technologies to act as a lightning rod to encourage wide adoption.
5. working with publishers and institutions to adopt and support these standards.
6. teaching scholars and creators to build towards these standards.
7. encapsulating or easing the encapsulation of several popular platforms for digital publication, to foster a broader adoption of these standards.

the combined expertise and experience of the advisory board stressed the difficulty of such an orchestrated effort, as important as it is. the irony is not lost on us that, at the start of this collaboration, we often spoke of orchestration, but this term had a specific and technical meaning for us related to products like docker and kubernetes.
by the time we completed our work, we were speaking almost exclusively of orchestration as a sociotechnical concept. we remain committed to the idea that containerization, or a similar lightweight encapsulation technology, has an important role to play in the future of scholarly publication. with the completion of the digits grant, however, we have come to believe that containerization will only ever be one piece of a much larger puzzle. we hope the ensuing two reports will be useful in revealing that puzzle’s ultimate picture.

we would like to thank everyone at the a.w. mellon foundation, particularly patricia hswe, michael gossett, and donald j. waters, for their generous support, guidance, and feedback. we would also like to thank the interview subjects and the members of our advisory board for their thoughtful insights.

a new unit of publication: the potential of software containers for digital scholarship

picture a web-based digital publication. whether it contains an interactive map, a network visualization, a curated collection of born-digital objects, or other multi-modal expression, chances are this project is built upon layers of technological systems or “stacks.” a stack is an assemblage of software that forms the operational infrastructure behind the project itself. industry professionals often speak of the lamp stack, the mean stack, and ruby on rails and, indeed, these technologies are the cornerstone of the web. a stack's seeming ubiquity, however, creates an illusion of monolithic, laborless setup and configuration underpinning the data modeling and public facing layers of production that most digital humanities scholars and web developers tend to focus on. anyone who has attempted to maintain, preserve, or replicate a digital project knows, however, that the deepest layers of any server stack can have a profound impact on how algorithms run and how information displays.
a given operating system will immediately enable or prohibit certain software; one’s choice of database can erase an important difference between two types of data; failing to apply a software patch on schedule can, in the words of deltron, "crash your whole computer system and revert you to papyrus."

these challenges have given rise to two widespread paradigms of support: dedicated servers with full-time, in-house system administrators or large-scale, cloud-based vendors who offer varying levels of stack flexibility and system administration support. in higher education it or on cheap commodity cloud providers, one is likely to find flexibility (say, a virtual machine with no predetermined operating system or software) with little or no support, or a well supported software stack designed for a narrow set of use-cases. within this context, playing around with ideas, creating experimental prototypes, and reproducing another’s work to interrogate it (collectively, “sandboxing”) is particularly difficult.

in the face of these challenges, an approach called software containerization is becoming increasingly popular. containerization offers a potential solution to the primary challenges of maintenance, reproducibility, and preservation for web-based digital scholarship, but also necessitates a significant departure from the current status quo.

in december , our working group received funding from the andrew w. mellon foundation for “digits: a platform to facilitate the production of digital scholarship,” an -month project to survey the use of container technologies in scholarly publication, assess the needs of researchers producing web-centric scholarship, and develop blueprints for a platform to facilitate those needs.
the first stage of our investigations has been to author an environmental scan of software containers in scholarly contexts with two focal research questions:

how are containers being discussed and adopted in academic research contexts?
which aspects of containerization have not yet been fully explored in the context of digital scholarship?

in section : technical background, we begin by providing a short introduction to container technology. this opening section attempts to introduce container-based approaches in a manner accessible to an imagined reader without deep technical or server “back-end” experience. section : containers in academia provides a review of formal and informal conversations about software containers across a variety of disciplines. as a substantive body of published scholarship demonstrates, containerization has seen more use in high-performance computing (hpc) and other scientific contexts than it has in digital humanities. section : full-stack scholarship focuses on the prevalent but as-yet-unrealized idea that containers might come to function as standalone publications (self-contained, as it were), with each software container encapsulating a new unit of publishable work comparable to an article or a monograph.

this document focuses on the use and potential of containers within the academic research community; however, the development of software containers (with some exceptions discussed below) is primarily driven by industrial needs and resources. we cannot hope to cover all of that activity here, as it is already challenging to track everything happening with containers in academia.

technical background

this section provides a semi-technical introduction to software containers, intended to give a high-level overview of the relevant technical concepts without getting too bogged down in the system-level details of any particular implementation.
the section includes a comparison between containers and virtual machines, a discussion of five relevant container technologies, and finally a brief introduction to orchestration. the technical background also includes a short consideration of the social and organizational significance of containers, drawing from their impact in industry and the implications for research workflows and infrastructure.

what are containers?

software containers are a suite of technology components for linux-based operating systems that enable the isolation of computational processes. more specifically, containers are an instance of operating-system-level virtualization [https://en.wikipedia.org/wiki/operating-system-level_virtualization]. many of the technologies for isolating resources that make containers possible (lvm [https://en.wikipedia.org/wiki/logical_volume_manager_(linux)], chroot [https://en.wikipedia.org/wiki/chroot], cgroups [https://en.wikipedia.org/wiki/cgroups]) have existed for many years, but have only recently been integrated into easy-to-use interfaces, the most popular being docker [https://en.wikipedia.org/wiki/docker_(software)]. docker is by far the most widely adopted container technology, but there are alternatives such as rkt [https://coreos.com/rkt] or singularity [http://singularity.lbl.gov/] (kurtzer et al. ).

the recent popularity of containers, namely docker, stems from the capability to “package software into standardized units for development, shipment and deployment” [https://www.docker.com/what-container]. this statement from docker’s marketing materials can be broken down to illustrate why containers have seen a dramatic rise in popularity and public discussion since june when docker . was released.
1. package software: containers make it easy to bundle or envelop software applications and services along with their software dependencies to create an executable thing with clearly defined boundaries.
2. standardized unit: the thing-ness of software containers is embodied in the container image, a file format for bundling the components of a software application including the code, compiled binaries, configuration files, data files, dependencies, or anything else needed for the execution of the contained software application. this standardization makes possible the fast and easy movement of applications across platforms.
3. development, shipment and deployment: containers have standardized interfaces for execution, which means that software contained within them will execute consistently on any system capable of running containers (assuming the same underlying container technology). this standardization enables the contained software to run in many different environments without dealing with significant overhead related to software dependencies and system configuration.

packaging and standardization allow containers to be portable across environments, with each environment requiring far fewer software dependencies: chiefly, the container runtime. importantly, the environment used to develop the application (perhaps on a developer’s laptop) can be identical to the production environment (perhaps on a commodity cloud provider like amazon web services).
the bundling of containers into standardized units makes moving the contained application from the developer’s laptop to the cloud deployment (“shipping”) easier and less likely to introduce bugs because the runtime environment is (nearly) identical.

in order to fully understand software containers, one must widen the scope of one’s idea of “software.” for many users, the term might bring to mind programs such as microsoft word or apps for smartphones. while it is theoretically possible to put client side applications with graphical user interfaces into a container, the technology was not necessarily designed for such use cases; for purposes of this paper it is most reasonable to think of a container as a mechanism for packaging a server. nearly all of the software applications being put into containers run on the linux operating system and are back-end applications like web-servers, databases, business application servers, middleware, etc.

containers’ immediate appeal in server administration and web development is due to three central features (or affordances): performance, encapsulation, and portability.

containers have fast performance because they are system processes, not separate operating systems, which leads to very fast “boot up” times and low resource (memory, cpu) overhead. such speed is possible because containers share many of the operating system resources with the host computer. empirical analysis has shown (felter et al. ; hale et al. ; le and paz ) that for cpu and memory tasks, containers performed nearly as fast as native hardware and faster than virtual machines (discussed more below).

encapsulation is the idea of circumscribing all the code an application needs to execute in a single logical unit.
all too often, the process of server setup is one of saying “i need to install application a, but to install that i need to install library b, but to install b i need to install library c.” the resulting state, “dependency hell,” causes profound frustration. detailed installation instructions can help for identical environments but may leave out assumed information on taken-for-granted dependencies. conditions like these are widespread in scholarly computing, where reproduction of others’ code/results is considered especially important. when applications are developed inside a container, the only external dependency becomes the container runtime environment itself.

portability builds on top of encapsulation, as containers are portable across different host environments. the container engine acts as an infrastructural gateway, not unlike an actual shipping container, capable of executing a wide variety of software conforming to a standard interface (egyedi ). portability is made possible through a container image. a container image is an immutable file format within which a contained application’s dependencies are bundled together. container images can be executed, creating a “live” or running instance of a particular image. as a result, one can easily have two containers running from the same image, reducing configuration overhead. for example, a developer might create an image that contains a web server and a default folder for the website that server will host. multiple websites with the same dependencies could be hosted by launching two or three or even more containers from that image. this ease of reuse is the key to a container’s portability. locally, an image can operate as a base for many containers.

containerization is not without detractors and skeptics. foremost, the sheer exuberance for containers makes some wonder how much of the attention is mere hype.
academic it tends to be especially wary of committing to a technology that could become obscure within a -to- year timescale. further, containers are only supported on linux, both for the host and the system running inside the container. in contexts where linux is already the preferred operating system (e.g., server administration and web development), containerization offers more immediate gains than losses. the small performance hit has been seen as a worthwhile price to pay for the benefits of encapsulation and portability for all except the most extreme computationally intensive tasks.

containers vs. virtual machines

how are containers different from virtual machines? virtual machines envelop more layers of the computing environment, going deeper down into the system and emulating computer hardware. virtual machines run operating systems where containers run software applications.

figure . the components of the software-hardware stack encapsulated by virtual machines. diagram by authors.

the focus on applications means containers are much lighter weight than virtual machines. containers do not need to “boot up” because they draw more resources from the host operating system and share resources with other containers running on the system; they are already “booted up”. this efficiency means the overhead of containers is much smaller than that of virtual machines. the performance and efficiency of containers (vs. virtual machines) come at the cost of reduced flexibility in the kinds of software applications that can be run inside of containers. where a virtual machine can theoretically run any operating system capable of running on the virtualized hardware, containers can only run software developed to run on linux.
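the relationship between an immutable image and the lightweight containers launched from it, described in the portability discussion above, can be pictured with a small toy model. this is only a sketch in plain python (3.7 or later); the class names, fields, and paths are hypothetical illustrations and do not correspond to docker's or any other runtime's actual api.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from itertools import count
from typing import Tuple

_next_id = count(1)  # hypothetical id generator for running containers


@dataclass(frozen=True)
class Image:
    """Toy stand-in for a container image: an immutable bundle of
    an application and its dependencies."""
    name: str
    dependencies: Tuple[str, ...]
    command: str


@dataclass
class Container:
    """Toy stand-in for a running container: a reference to its image
    plus the small amount of state that differs per instance."""
    image: Image
    site_folder: str
    container_id: int = field(default_factory=lambda: next(_next_id))


def launch(image: Image, site_folder: str) -> Container:
    # No operating system is "booted" here: launching only creates a new
    # instance that points at the shared, unchanged image.
    return Container(image=image, site_folder=site_folder)


# One image bundling a web server and its dependencies (names are made up).
web = Image("web-server", ("httpd", "openssl"), "httpd -DFOREGROUND")

# Several websites with the same dependencies, all launched from that image.
sites = [launch(web, f"/srv/{name}") for name in ("atlas", "letters", "maps")]

# The image itself cannot be modified after it is built.
image_is_immutable = False
try:
    web.command = "something-else"
except FrozenInstanceError:
    image_is_immutable = True
```

in this model every container shares one image object and carries only its own folder and identifier, which is the sense in which an image can serve as a base for many containers.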
figure . the components of the software-hardware stack encapsulated by containers. diagram by authors.

containers are not a replacement for virtual machines; they function at a different level of abstraction. in many deployments, containers are being used inside virtual machines (especially in the commodity cloud where everything is virtualized) as a means of using computational resources as efficiently as possible. for example, if an enterprise needs four redundant web servers, rather than having four heavy virtual machines each only running a web server, they can have a single virtual machine (or two for an extra layer of redundancy) running multiple containers for each web server. this means running (and paying for) two virtual machines instead of four. the efficient use of computational resources is one of the value propositions of software containers, especially for enterprise applications like web-servers that are not very computationally intensive.

a seasoned system administrator might say “i don’t need containers to run multiple web servers, i can just run them myself!” and they would be correct. however, the portability and standardization of containers, not to mention the software ecosystem that has emerged to control the creation, deployment, and management of containers, make running applications and services several orders of magnitude easier than it has ever been in the past.

container technologies

table . a short list of relevant container technologies

technology | launch date | license | comments
(“docker” enterprise . ) (runc) | june | apache . | most popular container technology, runs on % of hosts monitored, according to one study last updated in april . the annual portworx container adoption survey suggests that docker’s reputation has steadily declined since .
(“coreos rkt”) | july | apache . | under active development. appears to be growing in popularity, especially given the fact it works with kubernetes and is not docker.
singularity [http://singularity.lbl.gov/] | april | clause bsd | a container technology developed specifically for scientific computing. has the same advantages as industry containers (portability, reproducibility, environmental encapsulation) with additional security and an execution model better suited to hpc environments.
shifter [https://www.nersc.gov/research-and-development/user-defined-images/] | december | clause bsd | another container technology specifically for hpc. precursor to singularity. development may have slowed or stalled.
open container initiative [https://www.opencontainers.org/] | june | n/a | an industry-led standardization effort. current focus is a runtime specification [https://github.com/opencontainers/runtime-spec] and an image specification [https://github.com/opencontainers/image-spec] for executing and bundling containers. designed around a runtime and format donated by docker (runc).

this list of container technologies in the table above is not exhaustive but represents the most popular or most relevant technologies for this discussion (as of ). as of this writing, docker is the de facto standard for the industrial application of containers. it should be noted that docker is not simply a container technology, but rather a suite of technologies including a container runtime (runc), specification formats (dockerfile, docker-compose.yml), image server (docker hub), and orchestration system (swarm).

the academic community has developed its own alternatives to industrial software containers.
these alternatives are shifter [https://www.nersc.gov/research-and-development/user-defined-images/] and the more popular singularity [http://singularity.lbl.gov]. these technologies have emerged from the unique needs of high performance computing (hpc), where the computational workloads aren’t long-running, low overhead web services, but rather computationally intensive (but bounded) data processing or simulation jobs. in high performance computing, containers are less a means of optimizing resource utilization (hpc systems are already good at this) and more a way to manage the complexities of scientific software. furthermore, the security model of docker, running with elevated user privileges, is fundamentally incompatible with the current security model of hpc systems, where users typically have very few privileges to install software or configure the system.

the advantage of containers for hpc is the ability to support user defined images (douglas m. jacobsen and canon ; d. m. jacobsen and canon ): the work and responsibility of managing the software and its dependencies are pushed onto the researcher or scientist, who configures their software environment as they please. once they have added all of the needed content (dependencies, data, code, etc.) they can move the container image to a shared resource, most often a high performance research computing cluster, and run their computation without the need for the system administrators of the shared resource to install and configure the researcher’s software. the appeal of operating containers at a non-admin privilege level is substantial, and the demand for secure containerization should not be underestimated. the vision for academic computing proposed by singularity containers leverages the affordances of container technology (encapsulation and portability) while also integrating into existing technical infrastructure and workflows of research computing.
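the shift in responsibility that user defined images bring can be illustrated with a short sketch. it compares what the administrators of a shared cluster would have to install under a traditional model with what they would install under a container-centric one. everything here is a hypothetical illustration: the function name, the user names, the package lists, and the "container-runtime" label are assumptions made for the example, not data from any real cluster or the interface of singularity or shifter.

```python
def admin_install_burden(user_needs, containerized, runtime="container-runtime"):
    """Return the set of packages the cluster administrators must install.

    user_needs maps each (hypothetical) user to the packages their
    research software depends on.
    """
    if containerized:
        # Each researcher ships dependencies inside their own image, so
        # the shared system only needs the container runtime itself.
        return {runtime}
    # Traditional model: administrators install the union of everything
    # every user needs, including conflicting versions.
    burden = set()
    for packages in user_needs.values():
        burden |= set(packages)
    return burden


# Made-up example of three researchers sharing one cluster.
user_needs = {
    "alice": {"python-3.6", "numpy-1.14", "gdal-2.2"},
    "bob": {"r-3.4", "gdal-1.11"},
    "carol": {"python-2.7", "openmpi-2.1"},
}

traditional = admin_install_burden(user_needs, containerized=False)
container_centric = admin_install_burden(user_needs, containerized=True)
```

in the traditional case the administrators' list grows with every new user and every conflicting version, while in the container-centric case it stays fixed at the runtime, whatever the researchers place inside their images.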
figure . traditional vs. container centric hpc. diagram by authors.

additionally, singularity enables integration with existing job scheduling systems used by the research computing community, like the slurm workload manager [https://slurm.schedmd.com]. singularity containers in essence become self-contained binary applications with a single dependency, the container runtime, in contrast to the traditional hpc use case of managing the gordian knot of interdependencies for all the scientific software needed by the many users of a shared computing facility.

orchestration

while containers in and of themselves have reconfigured the landscape of system administration, their most significant impact has come from the coupling of software containers with new breeds of orchestration systems. orchestration [https://en.wikipedia.org/wiki/orchestration_(computing)], broadly, is the automated management of systems and services within some kind of computational ecosystem. while orchestration is not a new concept, container orchestration has enabled the technology industry to provide services at a scale never before possible. google’s gmail web service (and many other google services) is composed and managed in containers with orchestration (verma et al. ).
orchestration systems like kubernetes [https://kubernetes.io/] provide a set of abstractions that allow administrators to define the applications and services they want (and their redundancy) and then handle all the messy work of maintaining that environment automatically. the technological landscape of orchestration is rapidly changing, so any technical description given today would be immediately out-of-date. it is more important to know how orchestration changes the nature of the work of developing, deploying, and maintaining systems and services.

consider the difference between a chef cooking individual meals at a restaurant and the standardized meals prepared at a fast food restaurant. while the former focuses attention on each plate, the latter scales to “billions and billions served.” container orchestration shifts the attention of the system administrator away from the deployment and management of individual systems towards collections of applications and services. with orchestration, the emphasis is less upon setting up a bare metal server or virtual machine with hands-on configuration, because this approach does not scale to hundreds or thousands or tens of thousands of instances. instead, given a cluster of identical, minimally configured nodes (either bare metal or virtual machines), the system administrator leverages orchestration software and pre-configured containers to articulate an ecosystem at the scale of the data-center instead of the individual system. much like fast food, this paradigm of service emphasizes certain kinds of use cases where bulk processing is more important than individuated attention.
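the declarative pattern described above, in which an administrator states what should be running and the orchestrator works out how to get there, can be illustrated with a minimal python sketch. this is not how kubernetes or any particular orchestrator is implemented; the function name `reconcile`, the service names, and the action tuples are all hypothetical and exist only to show the idea of comparing a desired state with an observed one.

```python
from collections import Counter


def reconcile(desired, running):
    """Return the actions needed to make `running` match `desired`.

    desired: dict mapping a service name to the number of replicas wanted
    running: list of service names, one entry per container currently running
    (both structures are illustrative assumptions, not a real orchestrator API)
    """
    actual = Counter(running)
    actions = []
    # Visit every service that is either wanted or currently running.
    for service in sorted(set(desired) | set(actual)):
        diff = desired.get(service, 0) - actual.get(service, 0)
        if diff > 0:
            # Too few replicas: start the missing ones.
            actions.extend(("start", service) for _ in range(diff))
        elif diff < 0:
            # Too many replicas (or an unwanted service): stop the surplus.
            actions.extend(("stop", service) for _ in range(-diff))
    return actions


# The administrator declares the environment they want ...
desired_state = {"web": 3, "database": 1}
# ... and the orchestrator compares it with what it observes on the cluster.
running_now = ["web", "database", "database", "cache"]
plan = reconcile(desired_state, running_now)
```

in this sketch the administrator never issues an individual start or stop command; they edit `desired_state`, and the same comparison is repeated whenever the cluster changes.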
orchestration forces an infrastructure level perspective and enables the management and operation of heterogeneous applications and services at scale. installing and configuring software occurs once in the creation of a container image, which can then be replicated across a computing cluster. orchestration for enterprise or industrial applications and services is related to the job scheduling in hpc discussed above, but the technologies, capabilities, and kinds of workloads are very different (web servers and databases vs. computational modeling). commodity cloud providers like azure, amazon, and google offer containers as a service, which is possible because of orchestration systems like kubernetes (which is also offered as a service).

containers and orchestration posit radically new ways of managing computation. as such, new best practices, like the twelve factor app, are challenging traditional models for the architecture, development, and deployment of enterprise applications and services. new paradigms, like microservices [https://en.wikipedia.org/wiki/microservices] and serverless computing [https://en.wikipedia.org/wiki/serverless_computing], are radical departures from those models. for example, microsoft’s azure container instances abstract away the infrastructural boilerplate of networking, filesystems, virtual machines, and server configuration, rendering all of that work invisible to the user (invisible, but not eliminated). these systems are designed to execute (and bill) containerized applications on the order of seconds instead of days or weeks. containers move and consolidate specific forms of technical work, which has benefits in terms of labor efficiency, but may have broader implications as well.

the significance of containers

docker’s logo is a whale carrying shipping containers.
the developers and advocates of software containers liken their impact to that of shipping containers. this analogy is often used as a justification to skeptical systems or business administrators who are, rightly so, risk averse when it comes to new technology. just as shipping containers reduced the cost and simplified the logistics of moving material goods across the earth (levinson ), software containers made it easier to “ship” applications and services. when the means of executing software is standardized, infrastructure can be designed around a standard interface rather than a multitude of uniquely designed applications.

alongside these new technologies emerge new ways of working, sometimes called the “devops” philosophy (clark et al. ). the term devops originated with efforts to break down the traditional silos of development and operations. instead, engineers work “across the entire application lifecycle” and cultivate “a range of skills not limited to a single function.” devops takes the programming, scripting, and automation abilities of developers and focuses those efforts on the operation of information infrastructure. essentially, the devops approach is one that automates much of the system administration previously performed by hand. more tritely, devops is about tools-for-managing-tools, which has resulted in a cambrian explosion of new tools and techniques for managing the existing suites of tools that manage services like web servers, databases, or application servers.

containers introduce not only a new set of tools to learn, but a whole new set of concepts and a philosophy of system administration. this conceptual change is perhaps the most significant, and disruptive, aspect of software containers. the affordances of portability and encapsulation also change how particular forms of information technology work are done, and by whom. the analogy of the shipping container raises significant, and problematic, questions about labor and the visibility of work.
these are questions we must keep at the forefront of the conversations around software containers for digital scholarship in the digital humanities.

container use in academia

three topics emerged from our review of the published literature and ongoing conversations about containers in academic research contexts. first, we cover containers for software dependency management. a growing body of scholarship suggests that this is the most immediate and salient value proposition associated with containers for scholarly endeavors. second, we cover containers for reproducible research. as with software dependency management, reproducibility seems to be a promising benefit of moving to a containerization approach. third, we address the existing literature on using software containers for preservation. there is perhaps less consensus pertaining to the pros and cons of using containers in this way, but digital preservation is a crucial concern among many scholarly communities, so an assessment of how scholars have discussed this subject is warranted. overall, this section describes how these mostly separate uses of containers lay the groundwork for beginning to imagine software containers as a new unit of publication.

containers for software dependency management

much of the software dependency management conversation emerged from the hpc community, which has been dealing with the “dependency hell” [https://en.wikipedia.org/wiki/dependency_hell] problem for decades. dependency hell describes the problem of managing the menagerie of shared software packages or libraries that a particular application, especially a scientific application, requires in order to function.
software dependencies can be shared by multiple applications, which quickly creates a tangled morass of inter-dependencies. dependency hell adds costs, in terms of both time and money, to the development, deployment, and redeployment of software, and it is an especially potent problem for the management of scientific software.

the high performance computing community suffers, perhaps more than any other academic group, from challenges with software dependency management because research computing groups manage computational clusters as a service for researchers with varying computational needs and expertise. because of this variability, many research computing clusters are centrally controlled: system administrators, not the individual users, must manage the software configuration of the cluster. this workflow places system administrators squarely in the path of getting science done, which can create tensions when scientists need bespoke or highly customized software for their specific research. system administrators do not scale, and complex software dependencies coupled with poorly designed scientific software result in dependency hell.

software containers alleviate the challenges of dependency management, especially for the management of software in high-performance computing environments, by giving users the “privilege” of the administrative labor of installing software (belmann et al. ; moreews et al. ; szitenberg et al. ; chung et al. ; devisetty et al. ; hosny et al. ; hung et al. ). software containers, especially implementations like singularity, allow researchers to work with the software environment they like, as opposed to conforming to the standardized and secure software environment of the hpc cluster. through portability and encapsulation, software containers afford a more robust environment for running scientific software without the burden of an exponentially growing list of software to support or secure.
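the dependency problem described above can be made concrete with a short python sketch. it assumes a deliberately simplified picture in which each application pins exactly one version of each library; the application names, library names, and the helper `find_conflicts` are all hypothetical and are not drawn from any project discussed in this report.

```python
def find_conflicts(requirements):
    """Find libraries that different applications need in different versions.

    requirements: dict mapping an application name to a dict of
                  {library name: required version string}
    Returns a dict of {library: {version: [applications needing it]}} that
    contains only the libraries on which the applications disagree.
    """
    by_library = {}
    for app, libraries in requirements.items():
        for library, version in libraries.items():
            by_library.setdefault(library, {}).setdefault(version, []).append(app)
    return {
        library: {version: sorted(apps) for version, apps in versions.items()}
        for library, versions in by_library.items()
        if len(versions) > 1  # more than one version requested: a conflict
    }


# Hypothetical applications sharing one centrally managed cluster.
cluster_requirements = {
    "genome_assembler": {"numlib": "1.2", "plotlib": "3.0"},
    "climate_model": {"numlib": "2.0", "plotlib": "3.0"},
    "text_miner": {"numlib": "1.2"},
}
conflicts = find_conflicts(cluster_requirements)
```

on a shared system an administrator has to resolve every entry in `conflicts` by hand. when each application ships in its own container image, each image carries its own copy of `numlib`, so the conflict never reaches the host.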
in theory, the only software dependency is the container runtime itself. the portability and performance of containers allow a researcher to move quickly and easily from their laptop or desktop to larger systems as soon as their research needs have outgrown their current resources.

images can provide a complete, stable, and consistent environment that can be easily distributed to end users, thereby avoiding the difficulties that end-users commonly face with deep dependency trees. containers have the further advantage of largely abstracting away the host system, making it possible to deliver a common and consistent environment on many different platforms, be it laptop, workstation, cloud instance or supercomputer. (hale et al. , ).

examples of this approach include projects like bioboxes (belmann et al. ), which is an effort to address the difficulties of installing and maintaining software in bioinformatics using docker containers with standardized interfaces. bioinformatics relies upon complex and custom software, creating usability problems that can inhibit the progress of science. other efforts in the biosciences, such as the university of pittsburgh’s center for causal discovery (ccd), use preconfigured docker containers to relieve the difficulty of configuring their causal modeling applications. instead of battling with the complexities of the python to java software bridge, researchers can just run their ready-to-run container (personal communication). the ccd container builds on top of the jupyter docker stacks [https://github.com/jupyter/docker-stacks], which are a collection of generic docker containers with jupyter notebooks and other python libraries for data science and scientific computing. the dhbox [http://dhbox.org/] project is using docker containers to manage the complexity of installing popular digital humanities tools and create a ready-to-run “laboratory in the cloud” for research and teaching.
software containers for science are often framed as creating portable research environments or workbenches (willis et al. ). the idea here is that all of the software dependencies necessary to do science are bundled up and made available to the researcher, who connects the data needed for doing their science. this offloads the headache of each researcher compiling and configuring software in their own environment by allowing them to build on the efforts of others (perhaps more experienced) to set up scientific software. in this sense, software containers allow for standardized workbench- or lab-bench-like computing environments in which scientists do their work. this raises the question: if we can bundle the scientific software environment, can we bundle the data and scientific workflow as well?

containers for reproducible research

reproducible research is a vast and active conversation (gentleman and lang ; peng ; leveque, mitchell, and stodden / ; stodden, leisch, and peng ; stodden and miguez ; meng et al. / ; marwick ; meng and thain ) far out of scope for this paper. although containers are a boon for managing software, currently the most active and vibrant conversation around software containers for science (and other academic disciplines) revolves around containers’ potential application for reproducible research. encapsulation enables researchers to capture the full fidelity of their research environments, and portability allows sharing beyond prosaic methods sections in journal articles.
containers seem like a natural fit for reproducible research; the capability of containers to encapsulate the software dependencies of a computational research environment can be extended to include the data, metadata, code, and workflow of a specific project or publication. in such a case, a container transforms from a universal workbench into a product or publication oriented object with all of the dependencies and content in a single bundle.

the conversation around containers for reproducibility is very active, with informal workshops bringing together a variety of disciplines. in reviewing the literature, no single discipline can be singled out as leading the adoption of containers for reproducible research; this work crosses disciplinary lines. researchers from multiple disciplines are brought together by methodological commonalities (specifically a devops approach) in pursuit of reproducibility. certain kinds of computational researcher, especially those drawing on industrial data science, incorporate ideas from the software development industry, such as continuous integration [https://en.wikipedia.org/wiki/continuous_integration], to create reproducible computational workflows (beaulieu-jones and greene ). these approaches take specific technologies like docker and combine them with cutting edge software development techniques to automate reproducibility. again, what distinguishes the researchers talking about and using containers is a devops approach to their work, not membership in any individual discipline.

one early contribution that kicked off the conversation about containers and reproducibility is carl boettiger’s ( ) article, an introduction to docker for reproducible research. boettiger enumerates several technological barriers to reproducible research:

dependency hell - discussed above, and a major motivating factor for the adoption of containers in the high performance research computing community.
imprecise documentation - documentation is often poor, especially for the technical and procedural details that are left out of a publication’s methods section.
code rot - software and its dependencies are not static, but dynamic entities continually being updated with bug fixes, new features, and security patches. such changes can sometimes impact the reproducibility of computational research.
lack of adoption - there are already many existing technical solutions for creating reproducible research, but they are narrow or heavy handed solutions.

boettiger argues docker provides a solution to these technical problems by encapsulating dependencies, providing explicit documentation in dockerfiles, avoiding code rot through versioned container images, and enabling adoption by being portable, lightweight, and easy to integrate into existing workflows. however, he recognizes technical remedies alone will not solve the problems of reproducible research. there are significant cultural barriers to overcome, not least convincing researchers to be more transparent and share their code and workflow. boettiger points out that the incentive structures do not exist to reward the additional work of sharing additional materials like code and data. docker or other container technologies add another burden if researchers are not already accustomed to a devops approach to their practice.

both a benefit and a challenge for containers is the lack of standardized ways to express workflows within containers. while some argue the dockerfile provides explicit documentation of a container’s contents, the execution of processes inside a container lacks explicit semantics. software containers are somewhat agnostic to the semantic expression of their contents, which potentially makes each individual container a black box.
this makes a container a blank canvas into which researchers can pour dependencies, code, data, and metadata, which has an important advantage because many people have their own personal workflows and environments (especially in the absence of collaborators). this “blank canvas” approach may be a boon for early adoption as computational research practices change, but a thousand and one bespoke containers for a thousand and one research projects still presents problems for reproduction and preservation. there have been efforts to standardize the expression of workflows within the container (o’connor et al. ), and knoth ( ) uses docker to encapsulate a standardized workflow building on a standard software install for the genomics community.

how data and code get into containers and how a workflow is executed are unique for each container because the technology is so endlessly flexible. while this affords easier adoption because it imposes fewer constraints upon how the computational work gets done, it may present challenges to the long-term reproducibility of the encapsulated research.

containers for preservation

the question of preserving containers is a natural follow-on from containers for reproducible research. this conversation has only just begun, with only some very preliminary work specifically focused on the preservation of containers as opposed to more general conversations about scientific workflows. there has been some initial effort to propose a conceptual framework for the preservation of docker containers (emsley and de roure ), leveraging linked-open-data standards to provide a semantic expression of the workflow. such efforts are a good start, but significantly more work is needed.
the daspos [https://daspos.crc.nd.edu/] project, an nsf funded effort to address the problems of data and software preservation for science, convened a workshop [https://daspos.crc.nd.edu/index.php/workshops/container-strategies-for-data-software-preservation-that-promote-open-science] to explicitly discuss containers for software preservation. containers are seen, alongside virtual machines, as one mechanism for preserving scientific software (thain, ivie, and meng ). scientific software preservation is related to container preservation, and projects like softwarex [https://www.journals.elsevier.com/softwarex/] and reprozip [https://www.reprozip.org/] are fellow travelers in the preservation landscape. other efforts like collective knowledge [http://cknowledge.org/] and occam [https://occam.cs.pitt.edu/] use containers, but focus their attention on preserving the components and build process with richer semantics, so the encapsulated workflow could be rebuilt on whatever technology may emerge after containers.

part and parcel of preservation is standardization. for long-term preservation to succeed, some formal and agreed upon standards must be established. in industry, standardization of containers is being driven by the open container initiative (oci). the oci specifications for runtime environments and image formats just recently reached . . unfortunately, this is entirely an industry driven effort and it is unclear if the unique needs of academic applications are being addressed. singularity, the container technology developed by and for academia/hpc, has initiated its own effort towards a format specification, in part to address the lack of best practices for content (data files, code, and metadata) placed inside containers.
the harsh reality, which any archivist or digital preservation professional would be quick to point out, is that preservation is never at the forefront of a researcher’s or system administrator’s attention. archivists and librarians are often dealing with unmanaged dumps of data and information in a variety of formats that haven’t been designed with preservation in mind. this will continue to be true with software containers. researchers are already using a variety of container technologies (docker, shifter, singularity, etc.) and there will probably never be universal agreement on a standard format or set of practices. just as archives now receive email or document dumps, we can anticipate they may someday receive container image dumps. “without information on the environment, source code and other relevant metadata, we can inspect it but don’t have a can opener.” - (mooney and gerrard ). digital forensics tools like bitcurator give archivists such a can opener, but they don’t yet support software containers.

there are still many open questions related to the preservation of containers:

how to deal with proprietary, licensed software?
how to deal with proprietary hardware like gpus?
what additional environmental information is needed?
for layered container images, are the lower layers available?
does it make more sense to focus preservation efforts on the components rather than the container?
how do we preserve docker or other runtime engines?

there are not enough archivists and digital preservation professionals participating in the academic conversations around software containers. researchers need to invite archivists to help with the preservation of their work, but the archives community also needs to start actively paying attention to the rapidly moving technological landscape around them; this is a mutual failure.
computer scientists are taking up the task of digital preservation without consulting the wealth of expertise in the archives community. while this problem is far out of scope for this document, it is relevant because software containers solve some preservation problems, but also introduce new ones. furthermore, the challenges of digital preservation and reproducible research are not exclusively technological, and computer scientists are not necessarily equipped to deal with many of the social, historical, and political dynamics of preservation.

scientists, scholars, and researchers are under enormous pressure to publish their research; this is how they get credit and this is how the incentive structure of academia operates. the additional work of making data sharable, documenting code, and producing reproducible workflows is unrewarded overhead. this problem is even worse for digital scholarship and digital publications that do not have a traditional print-centric publication as the expression of the work. all of this work, whether workflows or non-traditional publications, does not enter the systems of publishing, and so it is often not given due credit and is certainly not preserved. but what if it were? what if we thought about containers as publications?

full-stack scholarship

this section of the report considers containers as publications, which is a radical idea whose promises and perils have not been fully evaluated.
this section is speculative and the ideas are very much “in beta.” the potential for containers to be publications in and of themselves is precisely what we want to open up for further discussion.

there is some existing work considering the possibility of using containers as a basis for scholarly publishing. opening reproducible research [http://o r.info/] is a promising model being developed by geoscientists at the university of münster. the project has been developing a conceptual model, a platform, and standards for executable research compendia (erc), which are publishable units that include “the actual paper, source code, the computational environment, the data set, and a definition of a user interface.” erc leverage the affordances of containers and lay out a set of standardized requirements for their contents. in this model, the container is just one part of a collection of files that get wrapped into a standard archival format familiar to librarians. the scope of erc is very modest; they are meant for small research workflows that can be run on a laptop using programming languages like r. workflows requiring big data or multi-node high performance computing are not their target.

beyond what goes inside the container, the project proposes a publication and review process that is compatible with traditional scholarly publishing. erc are meant to integrate into traditional journal publishing processes because they presuppose a print-centric paper as the main output. they are meant as a way to augment traditional publications by wrapping up all of the additional materials that are related to the production of a traditional, print-centric publication and necessary for reproducibility. erc is an interesting model for thinking about containers as publications, and its implementation could provide a basis for further work. however, the final publication in their workflow is still a static, print-centric document.
furthermore, the execution of an erc is expected to be time bound (even if it takes a long time) and have clearly defined outputs. this is at odds with some of the use cases for digital projects in the humanities that don't have such clearly delineated boundaries between data, workflow, and publication and are long-running processes. for example, a database backed web application such as infinite ulysses [http://www.infiniteulysses.com/] is a long running process and doesn't necessarily fit the erc model described above. erc is not, in its current design, supportive of multi-modal humanities (mcpherson ).

there are other efforts thinking about rich expression. bret victor’s media for thinking the unthinkable has inspired a new genre of multimodal publications that leverage the affordances of the web browser. one of the complications of web publishing is the difficulty of encapsulation and portability. websites can have porous boundaries and they belie the very idea of portability: they exist at one location and are difficult to move because of the conflation of address and identifier. there is the potential for a fruitful marriage between the affordances of the web and the affordances of containers to create executable, media rich documents that are encapsulated and portable. in “computational publishing with jupyter” [https://github.com/odewahn/computational-publishing], andrew odewahn combines jupyter notebooks with software containers as a model for computational publishing.

figure . odewahn’s model for computational publishing. used with permission from computational publishing with jupyter.
odewahn’s model outlines the basic components for thinking about software containers as publications that encapsulate jupyter notebooks as a web-based interface for accessing and interacting with the container’s contents. we can generalize this model and think more abstractly about maintaining a distinction between content and platform, yet still encapsulating both.

figure . a generalized technical model for computational publishing. diagram by authors.

drawing on this model we can think of computational publications as having the following components:

source content - this is the meat of a computational publication and also the most recognizable. source content could be narrative, data, code and/or other digital components.
a container image - container images would encapsulate the source content, but also two sets of dependencies:
  platform dependencies - computational publications will need a platform in which to provide an interface to the content. odewahn uses jupyter as the platform, but we could expand to consider a full range of expressive platforms, for example bookworm [http://bookworm.culturomics.org/], omeka [https://www.omeka.net/], scalar [http://scalar.usc.edu/scalar/], static sites, or custom web applications. these platforms each have their own set of software dependencies.
  content dependencies - source content may include a separate set of dependencies from those of the expressive platform. a researcher’s jupyter notebook may require visualization libraries like matplotlib, altair, or plotly and machine learning libraries like scikit-learn. these are not requirements of the jupyter platform, but are necessary for executing the source content.
the container engine - whereas the previous components relate to the runtime environment internal to the computational publication, the container engine is the external dependency and environment for running the publication. the container engine can be designed to execute any computable publication conforming to a standardized interface.

this rough, high-level model is provided to stimulate discussion and thinking around the technical architecture of computational publications. it may already be evident that conceptualizing digital scholarship in this way would require many in academia to shift both their thinking and their practice. although many digitally savvy practitioners in academic contexts already partition and document their software with the expectation of future use, switching from a non-containerized approach to containerization would require a shift in practice comparable to moving from a no-collaboration-expected model to an application model that assumes third-party developers. yet, if a scholar were to “rethink the entire enterprise of academic publishing” and rebuild “public scholarship from the ground up,” it would behoove that scholar to examine the things we take for granted, such as publishers, peer review, preservation, and the support structures that have evolved around the affordances of print materiality (trettien ). such a shift will have much greater ramifications than our technological choices alone: it changes the roles of, and our relationships with, the many invisible laborers of scholarly publishing, such as librarians and archivists. the potential benefits of such a reconceptualization would be manifold, but advocating such a shift raises obvious complications about power, academic prestige, and the question of who performs the material labor of maintaining new publishing infrastructures.
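the generalized model described in this section (source content, a container image carrying platform and content dependencies, and an external container engine) can be made a little more concrete with a short sketch. the python below is only an illustration under stated assumptions, not odewahn’s or the authors’ implementation: the class name, field names, file names, and the choice of `python:3.11-slim` as a base image are all hypothetical. it simply keeps the two sets of dependencies separate and renders them as a dockerfile-style recipe that a container engine could later build and run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComputationalPublication:
    """Hypothetical record of what a container image would encapsulate."""

    source_content: List[str]          # narrative, data, code, other files
    platform_dependencies: List[str]   # needed by the expressive platform
    content_dependencies: List[str]    # needed only by the source content
    base_image: str = "python:3.11-slim"  # assumed base image

    def dockerfile(self) -> str:
        """Render the publication as a Dockerfile-style build recipe."""
        lines = [f"FROM {self.base_image}"]
        # Platform and content dependencies are installed in separate
        # steps so the distinction made in the model stays visible.
        if self.platform_dependencies:
            lines.append("RUN pip install " + " ".join(self.platform_dependencies))
        if self.content_dependencies:
            lines.append("RUN pip install " + " ".join(self.content_dependencies))
        lines.append("WORKDIR /publication")
        for path in self.source_content:
            lines.append(f"COPY {path} /publication/{path}")
        return "\n".join(lines)


# Illustrative publication: a notebook essay plus one data file.
pub = ComputationalPublication(
    source_content=["essay.ipynb", "data/letters.csv"],
    platform_dependencies=["notebook"],
    content_dependencies=["matplotlib", "scikit-learn"],
)
recipe = pub.dockerfile()
print(recipe)
```

the container engine itself is deliberately left out of the sketch: in this model it is the external environment that builds and executes the image, so any engine able to read such a recipe could stand in for it.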
conclusion: containers for digital scholarship

the adoption of software containers in academic research contexts leverages three technological affordances:

encapsulation - containers encapsulate everything an application needs to execute in a single logical unit, the image.

portability - with standard image formats and runtime interfaces, containers can be easily moved and executed across different platforms and infrastructures.

performance - because containers share resources with the host operating system, they are faster and lighter than alternatives like virtual machines.

the most popular container technology, docker, has become the de facto standard for packaging and deploying enterprise applications like web servers and databases in the commercial sector. while docker is also popular amongst researchers, academically developed alternatives like singularity have emerged to address some of the problems unique to the research context, such as security.

how are containers being discussed and adopted in academic research contexts? in scanning the literature, public discourse, and conference/workshop discussions, we find that the adoption and use of software containers roughly fall into three use cases. first, containers are being used by system administrators, especially in high performance computing, to reduce the complexity of scientific software dependencies. second, beyond encapsulating software dependencies, containers can bundle data and workflows to enable reproducible research. third, the encapsulation of software, data, and workflows means they are portable not only across space but also across time, making containers an important format for the long-term preservation of research workflows and outputs.

which aspects of containerization have not yet been fully explored in the context of digital scholarship? the current discussion and application of containers have primarily focused on alleviating existing problems in the maintenance, reproduction, and preservation of computational research. we argue that containers open a new horizon of possible practices in scholarly publishing. while there are some efforts thinking about such possibilities, the conceptual models and the use cases are still conservative. replicating the publishing environments and workflows of traditional print-centric publication does not go far enough; we think containers could be an enabling format for a new breed of computational publications.

the value proposition of containers in academia is not merely that they run fast, encapsulate information, or are portable from a laptop to a supercomputer. the technological problems containers solve are a result of social and organizational differences across academia. solomon hykes, the founder of docker, has also made the point that “the real value of docker is not technology” but rather “getting people to agree on something.” regardless of whether the academic community settles upon docker or singularity as the container technology of choice, the more important point is to have standards that (mostly) everyone can agree upon. the emergence and adoption of standards are complicated social, technological, cultural, political, and economic processes with a complicated tangle of agendas, incentives, and artifacts (egyedi ; millerand and bowker ; lampland and star ; busch ). there is a lot of social and conceptual work, beyond the technical work of building platforms, as various disciplines and constituencies adopt or resist the potential changes that software containers make possible in scholarly publishing. understanding the risks and rewards of software containers is an important next step for this project.
the second stage of “digits: a platform to facilitate the production of digital scholarship” will involve creating a second report for publication. this follow-up document will focus on assessing the infrastructural needs of digital humanists around publishing and preserving web-centric digital scholarship and will evaluate the potential of software containers for specific disciplinary practices in the humanities.

bibliography

beaulieu-jones, brett k., and casey s. greene. . “reproducible computational workflows with continuous analysis.” biorxiv, august, .

belmann, peter, johannes dröge, andreas bremges, alice c. mchardy, alexander sczyrba, and michael d. barton. . “bioboxes: standardised containers for interchangeable bioinformatics software.” gigascience : .

boettiger, carl. . “an introduction to docker for reproducible research.” acm sigops operating systems review ( ): – .

busch, lawrence. . standards: recipes for reality. mit press.

chung, m. t., n. quang-hung, m. t. nguyen, and n. thoai. . “using docker in high performance computing applications.” in ieee sixth international conference on communications and electronics (icce), – .

clark, dav, aaron culich, brian hamlin, and ryan lovett. . “bce: berkeley’s common scientific compute environment for research and education.” in proceedings of the th python in science conference (scipy ). https://www.researchgate.net/profile/dav_clark/publication/ _bce_berkeley’s_ common_scientific_compute_environment_for_research_and_education/links/ c c aeeea a b .pdf

“coreos.” . accessed july . https://coreos.com/rkt.

“deltron - virus lyrics | metrolyrics.” . accessed july . http://www.metrolyrics.com/virus-lyrics-deltron- .html.

devisetty, upendra kumar, kathleen kennedy, paul sarando, nirav merchant, and eric lyons. . “bringing your tools to cyverse discovery environment using docker.” f research (december): .

“docker.” . docker. accessed july . https://www.docker.com/.

“docker alternatives and competitors | g crowd.” . g crowd. accessed july . https://www.g crowd.com/products/docker/competitors/alternatives.

egyedi, tineke. . “infrastructure flexibility created by standardized gateways: the cases of xml and the iso container.” knowledge, technology & policy ( ): – .

emsley, i., and d. de roure. . “a framework for the preservation of a docker container.” in . https://ora.ox.ac.uk/objects/uuid:f f a- efb- e-abcb- b e c ce .

felter, w., a. ferreira, r. rajamony, and j. rubio. . “an updated performance comparison of virtual machines and linux containers.” in ieee international symposium on performance analysis of systems and software (ispass), – .

gentleman, robert, and duncan temple lang. . “statistical analyses and reproducible research.” journal of computational and graphical statistics: a joint publication of american statistical association, institute of mathematical statistics, interface foundation of north america ( ): – .

hale, jack s., lizao li, chris n. richardson, and garth n. wells. . “containers for portable, productive and performant scientific computing.” arxiv: . [cs], august. http://arxiv.org/abs/ . .
holdgraf, chris, aaron culich, ariel rokem, fatma deniz, maryana alegro, and dani ushizima. . “portable learning environments for hands-on computational instruction: using container- and cloud-based technology to teach data science.” in proceedings of the practice and experience in advanced research computing on sustainability, success and impact, . acm.

hosny, abdelrahman, paola vera-licona, reinhard laubenbacher, and thibauld favre. . “algorun: a docker-based packaging system for platform-agnostic implemented algorithms.” bioinformatics ( ): – .

hung, ling-hong, daniel kristiyanto, sung bong lee, and ka yee yeung. . “guidock: using docker containers with a common graphics user interface to address the reproducibility of research.” plos one ( ): e .

jacobsen, d. m., and r. s. canon. . “shifter: containers for hpc.” in cray users group conference (cug’ ).

jacobsen, douglas m., and richard shane canon. . “contain this, unleashing docker for hpc.” proceedings of the cray user group. http://ai -s -pdfs.s .amazonaws.com/ d / e c a d fb dfd fa ca e bc.pdf.

kamvar, zhian n., margarita m. lópez-uribe, simone coughlan, niklaus j. grünwald, hilmar lapp, and stéphanie manel. / . “developing educational resources for population genetics in r: an open and collaborative approach.” molecular ecology resources ( ): – .

knoth, c., and d. nust. . “enabling reproducible obia with open-source software in docker containers.” in http://proceedings.utwente.nl/ / .

lampland, martha, and susan leigh star. . standards and their stories: how quantifying, classifying, and formalizing practices shape everyday life. cornell university press.

le, emily, and david paz. . “performance analysis of applications using singularity container on sdsc comet.” in proceedings of the practice and experience in advanced research computing on sustainability, success and impact, . acm.
leveque, randall j., ian m. mitchell, and victoria stodden. / . “reproducible research for scientific computing: tools and strategies for changing the culture.” computing in science & engineering ( ): – .

levinson, marc. . the box: how the shipping container made the world smaller and the world economy bigger. princeton university press.

marwick, ben. . “computational reproducibility in archaeological research: basic principles and a case study of their implementation.” journal of archaeological method and theory, january, – .

mcpherson, t. . “introduction: media studies and the digital humanities.” cinema journal ( ): – .

meng, haiyan, rupa kommineni, quan pham, robert gardner, tanu malik, and douglas thain. / . “an invariant framework for conducting reproducible computational science.” journal of computational science : – .

meng, haiyan, and douglas thain. . “umbrella: a portable environment creator for reproducible computing on clusters, clouds, and grids.” in proceedings of the th international workshop on virtualization technologies in distributed computing, – . vtdc ’ . new york, ny, usa: acm.

millerand, f., and g. c. bowker. . “metadata standards: trajectories and enactment in the life of an ontology.” formalizing practices: reckoning with standards, numbers and models in science and everyday life.

mooney, james, and david gerrard. . “software ‘best before’ dates: posing questions about containers and digital preservation.” presented at the docker containers for reproducible research workshop, cambridge, june . https://drive.google.com/file/d/ b jaz j aicwtlrsaknxy hndve/view.

moreews, françois, olivier sallou, hervé ménager, yvan le bras, cyril monjeaud, christophe blanchet, and olivier collin. . “bioshadock: a community driven bioinformatics shared docker-based tools registry.” f research, december. doi: . /f research. . .

nüst, daniel, markus konkol, edzer pebesma, christian kray, marc schutzeichel, holger przibytzin, and jörg lorenz. . “opening the publication process with executable research compendia.” d-lib magazine ( / ). doi: . /january -nuest.

o’connor, brian d., denis yuen, vincent chung, andrew g. duncan, xiang kun liu, janice patricia, benedict paten, lincoln stein, and vincent ferretti. . “the dockstore: enabling modular, community-focused sharing of docker-based genomics tools and workflows.” f research (january): .

peng, r. d. . “reproducible research in computational science.” science ( ): – .

portworx, container adoption survey, presented by portworx and aqua security. https://portworx.com/wp-content/uploads/ / / -container-adoption-survey.pdf

portworx, container adoption survey, presented by portworx and aqua security. https://portworx.com/wp-content/uploads/ / /portworx-container-adoption-survey-report- .pdf

Špaček, františek, radomír sohlich, and tomáš dulík. . “docker as platform for assignments evaluation.” procedia engineering : – .

stodden, victoria, friedrich leisch, and roger d. peng. . implementing reproducible research. crc press.

stodden, victoria, and sheila miguez. . “best practices for computational science: software infrastructure and environments for reproducible and extensible research.” journal of open research software ( ): .

szitenberg, amir, max john, mark l. blaxter, and david h. lunt. . “reprophylo: an environment for reproducible phylogenomics.” plos computational biology ( ): e .

thain, douglas, peter ivie, and haiyan meng. . “techniques for preserving scientific software executions: preserve the mess or encourage cleanliness?” doi: . /r cz m.

trettien, whitney. . “a feminist note on ‘publication, power, and patronage.’” medium. medium. july . https://medium.com/@whitneytrettien/a-feminist-note-on-publication-power-and-patronage-a ed a cd .

verma, abhishek, luis pedrosa, madhukar korupolu, david oppenheimer, eric tune, and john wilkes. . “large-scale cluster management at google with borg.” in proceedings of the tenth european conference on computer systems, : – : . eurosys ’ . new york, ny, usa: acm.

“what is devops? - amazon web services (aws).” . amazon web services, inc. accessed july . https://aws.amazon.com/devops/what-is-devops/.

williams, jason j., and tracy k. teal. / . “a vision for collaborative training infrastructure for bioinformatics: training infrastructure for bioinformatics.” annals of the new york academy of sciences ( ): – .

willis, craig, mike lambert, kenton mchenry, and christine kirkpatrick. . “container-based analysis environments for low-barrier access to research data.” in proceedings of the practice and experience in advanced research computing on sustainability, success and impact, . acm.

new scholarship in the digital age

making, publishing, maintaining, and preserving non-traditional scholarly objects

introduction

today’s academic ecosystem is growing beyond the culture of print that once circumscribed it.
as padmini ray murray and claire squires argue, the “publishing value chain,” from the invention of movable type through the twentieth century, remained surprisingly stable. “the human experience of how we produce, disseminate and perceive text is now, however, being irrevocably transformed by digital technologies.” murray and squires’ observations also hold true for scholarly publishing. although print-based scholarship remains the gold standard in the humanities, scholars are increasingly producing digital-first objects as part of their research, artistic endeavors, teaching, or other documentary forms. in this report, we refer to such digital artifacts as non-traditional scholarly objects (ntsos).

despite their increasing popularity, ntsos present challenges to publishing and must be reshaped or distorted to fit the social and technical structures of traditional scholarly publishing. institutions generally have a limited range of supported infrastructures, as well as varying degrees of technical expertise and capacity to adapt. practitioners still disagree about how credit and prestige are allocated, how collaboration should function, and who ought to retain responsibility for and ownership of projects that are no longer under active development. a lack of stable standards has led digital scholarship to take diverse forms, paralleling the heterogeneity of early printed books.
this variety enriches the scholarly landscape, but it comes with a price. whereas printable scholarship has a clear place in academia, ntsos struggle to thrive.

this a.w. mellon-funded report describes the myriad ways digital scholarship is being conceived, produced, distributed, and preserved in the digital humanities. given the field's long history of digital-first publication, digital humanities practitioners participate in every stage of scholarly production. we interviewed of these practitioners to learn about their processes, what drives them, what holds them back, and how their work fits into a changing academic world. through anonymized and aggregated responses, we report on the digital scholarly workflow broken into four categories: (1) making, (2) publishing, (3) maintaining, and (4) preserving digital scholarship. in each section, we report on challenges surfaced in our interviews, with particular attention to the sociotechnical intricacies of that particular phase of an ntso’s lifespan. we further identify five key stakeholder roles: (1) catalysts, (2) makers, (3) evaluators, (4) hosts, and (5) audiences. we highlight points of agreement and divergence, of values and practices, of frictions and difficulties common to each role.

there is a substantial range of opinion about the social and technical infrastructure needed to support, maintain, and preserve digital scholarship. a primary tension articulated in this report is between the expressive capacities afforded by the digital medium and the constraints of standardized scholarly production. this tension is exacerbated by limitations in even state-of-the-art technical practices, lack of institutional readiness to support such work, and unclear or opposing values with respect to how digital scholarly objects are treated.

as we processed the interviews, an unremarked-upon issue became increasingly apparent.
complex ntsos pass through many hands for many reasons, with no single stakeholder responsible for their trajectories across these spaces. each party focuses on its own needs, leading to unpredictable difficulties. a recurring result of this disjunction is the lack of capacity of publishers, libraries, and other institutions to steward ntsos, often on account of difficulty around the hand-offs of these objects from one party to the next. transferring ownership or stewardship of ntsos is one of the most significant social and technical challenges faced by today’s practitioners.

we began with the belief that software containerization offered a path towards decreasing these challenges at minimal cost, with the added advantage of creating a more standardized unit of digital publication that would be easier to collaborate on, distribute, and preserve. after conducting this study, we believe even more strongly in the importance of a single, encapsulated format for digital scholarly objects as a necessary intervention into the problems raised here. digital encapsulation and standardization could do for ntsos what pdfs did for static digital documents, and what shipping containers did for the global transportation of goods. on the pdf format, lisa gitelman writes:

the format prospers both because of its transmissiveness and because of the ways that it supports structured hierarchies of authors and readers (“workflow”) that depend on documents. one might generalize that pdfs make sense partly according to a logic of attachment and enclosure. that is, like the digital objects we ‘attach’ to and send along with e-mail messages, or the nondigital objects we still enclose in envelopes or boxes and send by snail mail, pdfs are individually bounded and distinct.
as marc levinson has pointed out, the encapsulation offered by shipping containers was transformative in its reduction of trade costs in the mid-twentieth century, particularly around the hand-offs of goods. in our study, we identify similar high-cost points at the hand-offs of ntsos from one party to the next. we believe the encapsulation of ntsos would drastically rebalance the digital scholarly value chain, reducing friction at hand-off points in ways similar to the shipping container. however, as with shipping containers, such a technology has the potential to bring harsher conditions to the already contingent laborers associated with these hand-offs.

while the current study outlines particular points of contention or difficulty that software containers might help address, we do not limit our report to the links in the digital scholarly value chain directly related to such technical infrastructure. instead, we offer an integrative view of the practices, pitfalls, and promise of digital-first scholarly publication.

ultimately, this report raises the importance of orchestrated interventions into the various stages of the digital scholarship workflow: making, publishing, maintaining, and preserving. an initial intervention could at once reduce pain points and increase the prestige of ntsos. ours is not the first group to recommend positive interventions or paths forward. out of necessity, however, most have focused on smaller subsets of the categories we report on. many of these projects have reported that problems in other areas of the digital scholarly workflow limited their efficacy. in response to this obstacle, we propose several integrative approaches toward stabilizing and supporting digital scholarship.
data collection

before drafting this report, the project team spent approximately eight months doing fieldwork. we interviewed professionals associated with the production, dissemination, and preservation of digital scholarship. we began each semi-formal interview with a consistent list of questions (see appendix a), but allowed for relevant and interesting conversational threads and themes to emerge. most interviews involved one subject and one interviewer asking questions and taking notes. we used audio recording for some responses, and a few interviews took place with pairs of interviewers. in some cases, we also interviewed project teams as a group. interviews usually lasted approximately one hour.

we interviewed a range of subjects tied to the production, publication, or preservation of non-traditional scholarly objects. we used convenience sampling to generate an interview pool of people whose roles included researchers, publishers, and librarians. the sample included graduate students, independent scholars and developers, contingent employees, tenured faculty, staff, and other established field leaders. participants' projects varied in size from solo-practitioner projects at small institutions to multi-institutional collaborations.

we strove for extensive coverage of the digital scholarship community. convenience sampling, however, does not necessarily produce an exhaustive inventory of the field. for example, we interviewed far more researchers and librarians than publishers. our pool converged on common concerns in the wider community, indicating we achieved some qualitative saturation in our data collection, but there is much still to do. our subjects most likely over-represent large universities and liberal arts colleges. in turn, we under-represent less well-funded institutions and community colleges. supplemental work in these areas would strengthen our findings.
lastly, this report is our distillation of extensive conversations. as such, it leaves out some nuance and specificity. we also heard important points, stories, and insights that were not ultimately included in this document. despite these limitations, our findings still apply to a large cross section of the digital scholarship community.

the problems of non-traditional scholarly objects

we use the term non-traditional scholarly objects (ntsos) as a subset of digital scholarship. both differ from traditional forms of print-first scholarship. in our interviews we provided a variety of examples to help clarify our idea of ntsos, including blogs, twitter bots, searchable databases, and interactive data visualization essays. these examples are all web-based. they provide a range of linear and exploratory experiences. they depend on many digital platforms, some custom and others “off-the-shelf.” their creators had various levels of expertise and technical proficiency. many of the projects were posted on a scholar's website and disseminated via social media.

we wish to distinguish ntsos from the more general term of digital scholarship. digital scholarship is a broader label that can imply ecosystems, contexts, and infrastructures. the key element, as described by christine borgman, is the intersection of digital components and scholarship. with the idea of ntsos, we want to focus attention on objects. we do not seek to bracket the broader social, institutional, and cultural contexts of digital scholarship. rather, we foreground objects and processes, especially making, publishing, maintaining, and preserving. these four distinct themes emerged from our interviews with digital humanities practitioners. by making, we refer to the practices related to the creation, conceptualization, and construction of ntsos. publishing refers to both their publication and dissemination. maintaining refers to the practice of ensuring that ntsos remain accessible and operational.
preservation, for the purposes of this report, refers to “all the activities necessary to ensure the long-term accessibility of a resource.” in the context of digital scholarship, this includes making an artifact suitable for inclusion in the long-term scholarly record. our report uses section headings to separate content by theme. in each of these four areas, our subjects discussed their biggest challenges. our report uses subsection headings to identify these challenges. at the end of each subsection, we describe potential interventions and recommendations.

recommendations for future intervention

recommendations are aimed at practitioners working in one or more of five roles: catalysts, makers, evaluators, hosts, and audiences. these categories are defined in relationship to ntsos themselves:

catalysts - those who facilitate the conception and development of ntsos. this includes funders, digital scholarship centers, university departments, scholarly organizations, etc.

makers - those who make or directly shape ntsos. this includes researchers, programmers, and other digital makers, but also in some circumstances peer reviewers, editors, etc.

evaluators - those who evaluate ntsos. this includes editors, peer reviewers, publishers, tenure committees, etc.

hosts - those who host, serve, maintain, or preserve ntsos. this includes libraries, archives, publishers, hosting companies/platforms, etc. occasionally, this includes digital makers themselves.

audiences - those who access ntsos. this includes readers, those who access ntsos via api, etc.

breaking the tasks of ntsos into discrete categories (making / publishing / maintaining / preserving) or roles (catalysts / makers / evaluators / hosts / audiences) often occurs implicitly and without intent.
among our key findings is that this natural fracturing is itself an impediment to the long-term success of ntsos. many tasks fall through the cracks between categories, and no single body can work to orchestrate success across all links in the scholarly value chain. with this in mind, we offer an additional category of recommendation, disconnected from any one role:

ntso - these recommendations are for the future of ntsos themselves: how they might act and interact, the shape they may take, and how they may evolve. recommendations aimed at ntsos are those we believe the entire community ought to consider, and cannot be mapped to specific roles.

we asked our interview subjects to comment specifically on pain points and areas of dispute. the proposals draw both from suggestions raised during interviews and from a synthesis of secondary literature. they are intended to be illustrative rather than exhaustive. on account of the variety of perspectives available, some are contradictory. some recommendations are actionable in the short term, and most point to the need for a coordinated effort from all stakeholders. many recommendations are actively being tested by a.w. mellon-funded initiatives like our own. we hope that these challenges and recommendations will guide further discussions, investigations, and interventions.

report on interviews and recommendations

challenges making non-traditional scholarly objects

ntsos typically require some combination of expertise in research content, proprietary software, and dev-ops. when team members do not have the requisite expertise, the team will often grow. in other cases, projects begin with a technically proficient maker in search of subject matter. large-scale projects may include a range of traditional and hybrid roles. our interviews were consistent with the truism that bespoke digital projects entail heavy technical labor at the start.
pre-packaged software or templates, in contrast, demand less labor upfront but afford less flexibility. we did not interview scholars focused on traditional publication tracks. previous work, however, suggests that digital humanities projects tend to involve more collaborators than print-based humanities publications. these named and unnamed participants come from a wide range of institutional and commercial settings. [ ] [#n ] in this section, we explore who creates ntsos and how, focusing on common challenges. the label “digital makers” has been used in many different contexts to describe people who create non-traditional objects, scholarly or otherwise. for example, the u.k. foundation nesta published a report titled young digital makers, which argued, “for most young people digital technology is an everyday part of life. many are avid consumers of digital media. however they often don’t understand how to manipulate the underlying technology, let alone how to create it for themselves.” [ ] [#n ] for the report’s author, oliver quinlan, “digital maker” was a broad but useful construct because it referred to a range of activities that were “distinct from simply using digital devices.” [ ] [#n ] in the context of digital humanities, some practitioners have embraced these labels. [ ] [#n ] for the purposes of this report, our use of the term maker is analogous.
collaborative production
digital makers reported working along a collaborative spectrum. in many humanities disciplines, researchers follow the sole author / individual scholar model. in this model, a scholar conducts research and claims sole credit for the outputs, which tend to be peer-reviewed articles and monographs. when others' contributions support this work, an author's acknowledgements sometimes make note of them.
our subjects named librarian consultations, collections access, help with archival materials, research assistantships, and informal peer review as important supportive labor. yet the individual scholar model does not consider this work coequal with authorship. some of our subjects, who operate as solo practitioners, showed a preference for this model, perhaps in part due to existing tenure and promotion systems. other digital makers embraced division of labor to expand their projects' scope. project partners had expertise they lacked, or helped expand output capacity. a minority of subjects rejected the credit models of traditional scholarship altogether. when collaboration does occur, some predictable problems arise. the sociotechnical aspects of collaboration, including collaborative credit models and project orchestration environments, seem particularly difficult to navigate. most humanities disciplines lack strong models for collaboration, and teamwork skills vary greatly. our research suggests that lack of training deters collaboration. administrative skills are rare, and many in the humanities consider this labor ignoble. some interview subjects reported difficult transitions between exerting control over their research process and performing the role of a principal investigator. for pis, the ability to coordinate large project teams is especially important. secondary communications skills are also crucial. humanistic knowledge (“subject matter expertise”) and technical proficiency make up two other important axes of skill reported as essential to the development of ntsos. some ntsos are developed by the rare solo digital maker who excels in multiple skillsets, resulting in particularly coherent projects that are often limited in scope. well-functioning collaborations, in contrast, can enable digital makers to work beyond what an individual could create by themselves. such successes seem to give rise to more hybrid project roles. 
such partnerships often brought skill development and increased project momentum, but the resulting projects may also show signs of fracture or discontinuity where collaborators meet. every ntso, like every piece of scholarship, must go through many hands. digital making, however, seems to bring the tension points of collaboration to the fore and to increase the likelihood of conflict. this difference between scholarship in general and ntsos in particular may relate to the speed at which roles change, or the number of changes a role is likely to undergo before stabilizing. further, a large digital project could depend on dozens or even hundreds of contributors. any web-facing digital object, at its core, rests on a complex stack of manufacturers, software developers, internet engineers, and system administrators. countless members of an ntso’s audience may also contribute to or co-construct the object, as in crowdsourcing initiatives or annotated editions. in recognition of these complexities, our report does not attempt to define or contain the idea of a “digital maker.” instead, we focus on the labor that typifies digital making. many scholars who would not claim the label “digital maker” take part in digital making. many other participants remain anonymous or uncredited. our rhetorical shift calls attention to the innumerable contributions to almost any project. we found that remaining unnamed often relates to one of three norms:
1. the labor was a service offered in exchange for some reciprocal benefit, including funding or university credit.
2. the labor was part of some standardized infrastructural apparatus.
3. so many participants performed the labor that naming all contributors was impractical.
reciprocated labor might include work by paid programmers, interns, or students. these kinds of collaborators more often appear by name in website acknowledgements than in books or articles published about digital projects.
standardized infrastructural labor includes editors, peer reviewers, librarians, and archivists in one category, and technical experts such as in-house developers, digital humanities center personnel, and system administrators in another. many such professionals receive credit for their work through internal activity reports, though they remain publicly unacknowledged. such hidden labor is typical in both digital and print mediums. according to some subjects, the erasure from mention in externally-facing publications has been a source of consternation. subjects also mentioned cases where too many people had contributed components for acknowledgement to be feasible. when digital projects use outside products, software, and digital infrastructure, the labor of these entities also often remains unacknowledged. in turn, the labor of building and maintaining open source software goes uncredited. likewise, an ntso based on audience participation (e.g., social annotation projects) or hit/micro-tasking labor (e.g., mechanical turk) may not credit its participants by name. collaboration is hard. some participants in the ntso ecosystem would prefer to avoid it. others want to collaborate but find it hazardous and a professional risk. changing models for credit and compensation can create more conflicts. as the labor of digital making becomes more institutionalized, some types of collaboration seem more appealing. at the same time, technical labor is becoming less visible. we discuss this ostensible contradiction in the next section.
recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[makers] embrace hybrid roles, flexible teams, and more diffuse definitions of collaboration and making.
[makers] acknowledge labor that is often hidden, such as that of editors, system administrators, etc.
one possible mechanism is the use of standard collaborator contribution statements.
[evaluators] value non-traditional contributions as much as recognizable writing work.
[catalysts] create space particularly for generalists or collaborators with shifting roles. though they may not hyper-specialize in any one area, these contributors often act as translators without whom an ntso cannot exist.
[makers] learn and be able to describe in some detail the contribution of every member of a project team.
[catalysts] facilitate workspaces and programs where a culture of every team member knowing each other’s contributions is the norm. this may be accomplished, in part, via standardized communication infrastructures and practices.
[makers] seek training in effective project management and collaboration.
[catalysts] create incentives and programs for makers to get trained in project management and collaboration, through funding initiatives, workplace events, etc. such programs need adequate scaffolding, with clear pathways to gain the expertise necessary in these areas.
sociotechnical challenges and limitations
ntso production leads to pain points around technology, expertise, and gaps between the two. [ ] [#n ] the variety of projects in our study makes generalization a challenge. this section attempts to identify and categorize the most pressing sociotechnical challenges and limitations we encountered. foremost, our interview subjects described tension between experimentation and maintenance. projects using off-the-shelf platforms and solutions such as omeka, scalar, and wordpress were described as less experimental. [ ] [#n ] other projects were bespoke creations, and required significant technical skills to build. some proponents of custom projects said that off-the-shelf platforms would not meet their needs. others told us that their projects exceeded the hardware or systems capacity of their home institutions.
in general, more experimental approaches were seen as less maintainable, and more maintainable platforms were seen as less fit for experimentation. these obstacles appear sociotechnical. institutions select digital architecture based on complex criteria. project teams pursuing institutional partnerships for the sake of technical support must often use off-the-shelf platforms. [ ] [#n ] such compromises constrain projects and deter experimentation, and yet the range of hosted platforms may have been selected with other priorities in mind. [ ] [#n ] even when technical needs are met, bureaucratic barriers or university policies may slow development, or make implementation more difficult than anticipated. [ ] [#n ] our interview subjects expressed frustration at this tension. our interview subjects were eager to talk about web-hosting decisions and stressed their importance. they discussed four broad categories of hosting solutions:
1. “under-the-desk” servers (any server where the scholar acts as system administrator)
2. external services (digitalocean, github, wordpress.com, reclaim hosting, etc.)
3. institutional partnerships (a university library hosts an omeka instance, a custom web application, etc.)
4. publisher partnerships (a publisher supports hosting and/or building an ntso). [ ] [#n ]
in categories 2, 3, and 4, project members tend to cede control of system administration. the work of system administration is often hidden or misunderstood. a parallel may be drawn between these models and traditional book or journal publication. authors play a substantial role in creating and publishing books and journal articles, but they tend to cede control over the afterlife of their work, including distribution, access, and preservation.
even during the production process, authors are accustomed to publishers controlling things like page design and printing. the notion of a structured hand-off (such as page proofs) is well understood. in contrast to print publications, less structured hand-offs occur more frequently in ntsos. in some cases, the work is parallel and simultaneous. points of friction can arise between the original team and the production team at almost any time in the process. our interview subjects expressed frustration that security problems, hosting changes, or other sociotechnical issues can force a team to work on a project long after their collaboration has ended or their funding has run out. such concerns affect discussions of whose job it is to build the “under-the-hood” pieces of ntsos. factors such as funding, institutional personnel, team member expertise, and hardware & software affect these decisions. negotiations can be complicated, resource-intensive, and difficult to understand. pis may feel they lack the expertise to conduct such negotiations. these considerations have a strong impact on whether a project is “off-the-shelf” or custom built. custom built ntsos tend to be more complex and idiosyncratic than “off-the-shelf” products. hand-offs, as a result, are more difficult. documentation is often absent or out of date. even with clear records, moving such ntsos between servers may be expensive and time-consuming. the implications for digital objects in the scholarly ecosystem are dire. scholars struggle to hand their digital work to peer reviewers, libraries, publishers, and other maintainers. maintainers struggle to transfer them to institutions tasked with preserving digital scholarship. the loss of a single key team member may lead to a project’s failure because that team member possesses skills or key knowledge that no other team member has.
under these conditions, libraries cannot hope to host copies of important ntsos the way many libraries keep copies of important books. [ ] [#n ] a majority of ntsos rely entirely on a few off-the-shelf digital tools or platforms. fewer require some form of customization, coding, or other expert technical labor, and fewer still are entirely bespoke, requiring as much nuance, care, and skill as the humanities work involved. however, the amount of technical effort needed to fit ntsos into the scholarly publication chain is the inverse of this. the complexity and diversity of bespoke projects means they are often the most difficult to individually accommodate. though they are in the minority, our subjects reported that bespoke ntsos often comprise the most interesting and impactful projects within their communities. recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature. [ntso] ntsos must become more easily portable to ease the burden of transferring work between makers, evaluators, and hosts. this would become significantly easier were ntsos encapsulated in or organized by a single file, particularly one with relevant scholarly metadata. [makers] [evaluators] [hosts] specify and minimize the amount of hand-offs that take place; plan the moments of hand-off carefully. [makers] [evaluators] [hosts] become more comfortable with containerization (or other encapsulation) standards. [makers] [evaluators] [hosts] use encapsulated ntsos during hand-offs. [catalysts] incentivize encapsulation, through grant requirements, institutional encapsulation standards, and investing in consortial models to construct standards of scholarly encapsulation. [makers] [hosts] use clearly articulated web hosting agreements to reduce the sense of uncertainty around, e.g., whether a university continues to host a faculty project after the faculty has switched institutions. 
[hosts] widen the range of hosting options to accommodate short-term, low-cost sandboxing and prototyping.
[makers] articulate project charters that specify a project’s longevity. depending on the choice, develop with that longevity in mind, or state clear project end dates. [ ] [#n ]
[makers] learn the hardware and software stack supporting the ntso enough to be able to describe it.
[catalysts] incentivize makers learning hardware and software stacks by offering trainings and decreasing the disconnect between makers and system administrators.
[ntsos] given that the majority of ntsos exist on one of a few platforms, the community needs to come together to agree on standard hand-off solutions for these types of objects.
[ntsos] although bespoke ntsos are in the minority, given that many described them as the most interesting and impactful within their scholarly communities, it is essential that these bespoke objects are not ignored in favor of focusing on off-the-shelf solutions. as much or more effort must be expended on standardizing the bespoke ntso scholarly value chain, which in turn will be useful for standardizing off-the-shelf solutions.
funding
our research suggests that ntso production does not fit traditional humanities funding models. the models focus on paying for books, research trips, conferences, and other initiatives with clear end-dates or end-products. digital scholarship, in contrast, tends to be either self-funded or dependent upon grants. this introduces a host of problems from the grant-funding economy into the humanities. all our subjects expressed gratitude for the support of various funding institutions. many said they were nervous about the consequences if a major grant funder, such as the a.w. mellon foundation or the national endowment for the humanities, were to stop offering grants for digital work.
some well-funded institutions would be able to support digital scholarship using internal funding, but the vast majority of digital makers do not have access to such resources. as a result, their projects would become limited to what they could self-fund. a change like this would fundamentally alter the production landscape of ntsos. the grant-funded economy also shapes the way digital makers choose to run their projects. universities typically charge overhead rates on external grants to cover indirect costs. these rates are often set with the assumption of scientific research. as a result, overheads in the humanities tend to assume costs that typical digital humanities projects don’t have. in turn, such projects are left with less funding to cover costs that scientific grants typically don’t have. in more than one interview, we were told that overhead payments take a significant part of a project’s funding. the fact that sciences have more grant funding and more funding sources than the humanities seems to aggravate these concerns, especially since overhead costs contribute more directly to scientific necessities such as lab space. this structure can also lead pis to make hard decisions based on cost models rather than project needs. for example, teams may outsource development work to avoid paying benefits and make the budget stretch farther. one subject noted that grant-funded project team members, as soon as they were trained, were often “stolen” by other projects and university libraries that could pay them higher salaries. hard funding comes with several advantages. our interview subjects described career stability as one of its most compelling features. many subjects reported university policies that make soft-funded faculty or staff ineligible to apply for grants in their own right. several faculty or staff with soft funding reported searching for a grant-eligible faculty member to serve as an in-name-only pi to circumvent such policies.
interview subjects saw hard funding as a way to avoid the negative experience of a project coming to an awkward or even ruinous conclusion. at the end of a grant-funded project, developers are often laid off, and no one remains to address inevitable bugs and security issues. one subject described being responsible for a legacy project that “falls over” at irregular intervals. this particular person, however, did not have the knowledge or expertise to do more than restart the server. to address problems of soft funding, some archives, libraries, and museums have attempted to make more permanent technical and personnel resources available to project partners. such organizational partners face a different set of challenges from lone scholars or one-off projects. most often, they are constrained by technical debt from past projects, as well as limited capacity of personnel and technical breadth. in response, many reported limiting the types of objects they are willing to work with in order to accommodate as many projects as possible. [ ] [#n ] a minority of our subjects said they take on bespoke production with only one or two projects at a time. based on our research, few organizations have secured long-term funding for bespoke production. exemplary projects produced in these cases seem, to others, impossible to imitate. even when funding is stable, interview subjects reported an absence of protocols for or commitments to ongoing maintenance and preservation. some of our interview subjects reported funding some or all of their projects out of pocket. web servers operating at home or under someone's desk were more common than we expected. self-funded services such as reclaim hosting, digitalocean, or amazon web services were even more common. the costs of these solutions varied greatly. the most common reasons for such setups were as follows:
1. the technical needs of the maker exceeded the capacities of their home institution.
2. the project's preferred software or hardware was not permitted by the institution.
3. bureaucratic hoops proved too complex or too onerous for the maker or project team.
self-funding offers the advantage of making projects easier to move between institutions. it likewise offers complete freedom to experiment and develop projects using various approaches. it also shifts the burden of tech support onto the individual or team working on the project. no subject we interviewed who went this route received compensation for the monetary costs or time debt produced by their self-hosted systems. as evidenced in our interviews, several publishers, libraries, and others are thinking about these issues and have collectively offered some solutions. [ ] [#n ] our subjects, however, reported being under-resourced, understaffed, and missing the crucial expertise to execute these ideas.
recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[catalysts] support research into the start-up costs of creating digital projects, and the long-term costs of maintaining and preserving them. include in these cost evaluations everything from a maker’s time to equipment costs and sysadmin labor.
[hosts] ensure hosting, maintenance, and preservation costs are transparent, and include the full stack of development and technical requirements.
[makers] clearly document all project activities, time spent, and costs incurred.
[makers] write librarians, technical staff, and others into grant applications to ensure these and related costs are accounted for.
[hosts] clarify costs of technical infrastructure and staff to ensure they can be accounted for in grants.
[catalysts] encourage grant applicants to include more complete sociotechnical costs in grant applications, and accept applications with such details as integral to the projects being undertaken. [evaluators] press for more clarity in method with respect to personnel time, contribution statements, and technical infrastructure. [hosts] particularly libraries, lean into the analogy between the laboratory in the sciences and the library in the humanities. [ ] [#n ] use this analogy to demand a larger cut of indirect costs levied by the university, and to help project teams secure grant funding for project infrastructure costs. [hosts] fill the role of “laboratory for the humanities” by providing makers with more flexible web hosting and cloud storage, particularly for preliminary work. copyright and sensitive data our subjects raised logistical and ethical concerns about copyright and sensitive data. those working with post- united states materials reported more problems with copyright. the new affordances of text-mining complicate questions of access, as many academic distributors only permit browsing access to copyrighted materials. others charge fees for large-scale, computational use cases (e.g., text-mining). in the united states, analysis of copyrighted materials for scholarly purposes qualifies as fair use, but the re-distribution of copyrighted materials as part of a dataset does not. some vendor licensing agreements bar any redistribution of data, even if copyright is not a factor. such complexities have led to creative solutions like the hathitrust research center’s original “walled garden,” which enabled pre-defined text analysis algorithms to run on copyrighted materials. htrc has also provided “non-consumptive” versions of texts in the form of term frequency tables. in our interviews, several other issues related to copyright came up. scenarios were complex and broad ranging. 
there was no consensus among our subjects around fairly common questions, such as the conditions necessary to call a work transformative. in cases where licensing was more of a concern than copyright, our subjects voiced similar frustration and uncertainty. many of our subjects discussed concerns related to sensitive data. humanists are often unfamiliar with institutional review board (irb) requirements and procedures. irb is often described as a baseline ethical standard and not a cure-all. [ ] [#n ] as our interview subjects pointed out, ethical considerations beyond irb are often crucial. further, irb-exempt projects can encounter ethical quandaries when attempting to create digital datasets. [ ] [#n ] one common example of this is twitter data, which is publicly available and thus, theoretically, free for scholars to analyze, yet which can include sensitive materials belonging to or associated with marginalized groups who did not consent to be collected and studied this way. another useful example is the digitization of the on our backs lesbian porn magazine archive. while the digitizers believe that they were operating within the limits of current copyright, most of the contributors to the magazine did so when it was a limited print-run magazine (many before the modern internet existed), and some later contributors explicitly withheld consent to having their images posted online for anyone with a browser to find. given the potentially catastrophic personal and professional harm that could occur to the contributors through this digitization, this may be considered an unethical digital project regardless of its legality. makers must therefore exercise caution when creating digital projects, even if they believe they have the legal right to do so.
recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature. [catalysts] foster and offer legal and ethical training for digital scholarship, similar to training required in the sciences around irb and similar issues. [makers] take time to learn about the legal and ethical issues surrounding digital scholarship before embarking on projects. if an instructor, teach students about these issues. [evaluators] keep legality and ethics in mind when evaluating data-rich research. be particularly mindful of situations that are legal but not ethical. [catalysts] [hosts] with respect to legally ambiguous but ethically clear situations, cultivate a risk-tolerant atmosphere that encourages experimentation. [ ] [#n ] [catalysts] [hosts] [evaluators] [audiences] encourage and incentivize open data practices. [makers] when ethical, adopt radically open data practices. avoid using copyright claims as a means of staving off criticism and discouraging engagement with “under-the-hood” elements of a project. credit models of the social aspects of creating a digital project, credit models were consistently mentioned as the most likely to generate friction for a project team. as discussed above, the humanities lack strong models for collaboration. existing norms obfuscate a great deal of labor and may foster resentment among team members. [ ] [#n ] the structural aspects of credit influence how some would-be digital makers approach collaboration, particularly with developers and librarians. some interview subjects reported a common attitude of de facto authority and control. they reported visitors to digital scholarship centers or libraries arriving with all project details predetermined, expecting staff to construct the project without providing any feedback or guidance. the vast majority of our subjects from digital scholarship centers and libraries objected to this model. 
they said it reinforced problematic hierarchical structures within their institutions, as well as divisions between the digital and the humanities in the community at large. some felt that categorizing technical labor as “service” devalues those contributions. in many cases, the technical aspects of a project are foundational to the project's scholarly intervention or argument. this service model may also inhibit or close off career pathways for developers within academia. [ ] [#n ] one subject argued that these paths should be analogous to computing industry career pathways, with the goal of ensuring that talented developers find intellectual fulfillment within their positions. many must fight a double battle to receive both external and internal recognition for their work. such attitudes may reinforce the idea that digital scholarship is not “real” scholarship. in turn, they may undercut broader efforts to legitimize digital humanities in the wider humanities community. in contrast, two of our subjects who began on the academic track and ended up as digital humanities developers in permanently-funded positions expressed relief and excitement over the very aspects that others found problematic. both said they had some agency to shape projects and sometimes lead their own, but spend most of their time on what their employers assign. authorship credit is possible in their positions, but not inevitable. both expressed distaste for the prestige-driven academic tenure process and said they were happy to have so-called “alt-ac” positions. the majority of our subjects agreed on the need for credit models that acknowledged the work of project team members; they also expressed that this was easier said than done. who, for example, should be included as a co-author on an article written about a digital project? tarkang et al. suggest that authorship denotes “those who deserve credit and can take responsibility for the work.” [ ] [#n ] in the sciences, “the work” includes lab research and the “writing, submission, and editing required for a paper.” in some humanities disciplines, where lab research does not take place, authorship consists solely of writing, submitting, and revising. an article based on a digital project might follow either of the two models. friction can arise when contributors to a project are not listed as authors because they did not take part in the composition of an article or book. many interview subjects discussed the difference between team credit and individual credit. they reported creating an “about us” page that lists everyone associated with a project. in one interview, we were told that this system is analogous to films where credits roll on for ages at the end. such credits, they said, only last as long as the project remains online and often do not provide readers with an adequate understanding of an individual team member’s contribution to the project. even when digital labor is made visible, it is often misunderstood. the amount of time, effort, and skill that goes into activities such as data wrangling is especially opaque. many of the people we spoke with expressed uncertainty about how to claim credit for the project work on their own cvs and portfolios. where does one put pedagogical materials, datasets, software, contract work, consulting work, funded kickstarters, or patreon donations? these problems go beyond the cv itself; they speak to uncertainty about how to describe and categorize their work. one subject expressed a desire to see these questions lead into a deeper analysis of what people are getting credit for and why. what, in other words, does tenure mean? the uncertainty of the present moment, they said, creates an opportunity to raise these questions for digital scholarship, as well as academia more broadly.
[ ] [#n ] such a reassessment, they said, raises the question of how prestige drives scholarly publication. meanwhile, most of our interview subjects focused on the value of their own work in the current credit system.

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[catalysts] [hosts] revise requirements for how credit is articulated to ensure labor does not go unrewarded. arguably, taking implicit credit for others' work is a form of plagiarism, yet the stigma of a plagiarism accusation is currently much greater than that of failing to share credit. apply social pressure to balance these credit norms.
[evaluators] require clear contribution statements when evaluating digital projects.
[makers] via scholarly organizations, coordinate norms for claiming credit in ntsos on cvs and portfolios. define contribution types and roles, but non-dogmatically. some subjects preferred a “total collaboration” model, since many particularly vibrant ntsos included contributors who were involved in every aspect of their creation.
[evaluators] accept non-traditional contributions as credit in cases of hiring, tenure, and promotion.
[catalysts] [makers] standardize the use of project charters and other formal agreements to reduce the friction around credit statements early on in projects.
[ntsos] ntso metadata standards must evolve to accommodate flexible, ambiguous, or expansive contribution statements.

challenges publishing non-traditional scholarly objects
in this study, we have separated publication and maintenance under different section headings. though they often overlap, they present distinct challenges to digital scholarship. we employ the term publishing to collapse a range of activities:
1. peer review.
2. manuscript preparation, including editing, proofreading, quote checking, and production design.
3. distribution, marketing, publicity, and indexing.
the prestige value of a specific imprint, as well as the financial models that support these activities, are also included in this section. not all publishers take on these roles, but they distinguish publishing from our other categories. publishing serves as a convenient umbrella term for this report. avenues for the publication of print materials are well established, as is the division between published and unpublished. [ ] [#n ] in contrast, the line for digital scholarship between published and not published remains amorphous, without a clear distinction between “made”, “online”, and “published.” in the current academic ecosystem, the labor of making a finalized public-facing, digital object often falls on its creators, while typesetting and finalization of “print-ready” versions of print scholarship fall on publishers or editors. traditional print scholarship benefits from a level of codification that ntsos presently lack. some have attempted to codify ntso production by creating avenues to move ntsos into the traditional credit pipeline, but the most prevalent options for such transfer struggle to accommodate digital scholarship on its own terms. for example, ntsos are often connected to traditional scholarly publications under the following circumstances:
1. digital or print publications that stamp peer-review approval on pre-made digital objects.
2. journals that publicly review digital objects (as with a book review).
3. a print publication (i.e., companion piece) authored by members of the digital project team.
4. a print publication with a digital supplement or appendix, where the publication is seen as the primary, peer reviewed object.
in three out of four of these circumstances, a journal or publisher partially accommodates an ntso, but the final product does not sit alongside its print counterpart.
in these cases, the ntso is not peer reviewed and does not earn prestige or credit equal to the print publication to which it is attached. in choosing among these “shoehorn avenues,” interview subjects said they must consider:
1. how the eventual ntso will be cited.
2. where and how it will appear in their cv.
3. the stated requirements of their chosen career path.
4. how to justify it to a dissertation director or to a hiring, tenure, or promotion committee (where applicable).
academic prestige was an abiding concern, and pessimism was abundant in our interviews. subjects described their options as scarce, lacking in prestige, and often ill-fitted to their work. most described the labor of transforming their work for the print ecosystem as difficult, with dubious benefits. [ ] [#n ] others, especially public humanists and digital artists, reported that their audience and peers were more concerned with public impact or critical reception than academic prestige, and thus did not need to “shoehorn” ntsos into pre-existing academic publication ecosystems. many of these interview subjects, however, said they work outside the research tenure stream and are less tied to traditional scholarly metrics than most of their colleagues. the perception that the experimental work of making ntsos was best suited for post-tenure faculty or alt-ac jobs was widespread.

peer review
our interview subjects identified peer review as an especially important aspect of traditional publishing, but adapting peer review for ntsos is daunting. peer reviewers must have expertise in the ntso’s technical form and its content. as a result, qualified peer reviewers can be hard to find. when a project's content and technical form can be compartmentalized, reviewers with one expertise or the other can be enlisted. ntsos built with off-the-shelf tools may help with such separation. on rare occasions, however, an ntso’s content and technical form are completely intertwined.
given the relatively small intersection between digital humanities scholars and scholars of a particular subfield, all qualified reviewers might already be attached to the project under review either as active team members or as advisors. our subjects expressed concern about any peer review system too sophisticated for an average scholar. for example, in digital exhibits a reviewer must be able to assess whether the technology choices make sense for the ntso. for digital objects with complex computational elements, a reviewer needs to be able to determine whether the source material, data, and methods work together to support the object's central argument. access to a digital object’s “front end” and source code is often necessary. if an ntso requires the installation of software or dependencies to run locally for evaluation, many potential peer reviewers would be excluded. another difficulty inherent in adopting the traditional peer-review process for ntsos comes from requested revisions. peer reviewers might consider all aspects of the ntso equally revisable, but projects that rely on standard content management systems and other off-the-shelf solutions such as omeka, scalar, or wordpress can only make changes that the platform allows and affords. a relatively simple suggestion for revision could be, in this context, difficult or impossible to accommodate from a technical perspective. bespoke projects can be similarly difficult to revise, and may cause even more difficulties. an ad hoc solution to a particular problem could be easy to program but, depending on the scaffolding of project elements, even seemingly minor revisions might require rebuilding the project from the ground up. as with other challenges, the issue here is sociotechnical. often, large projects enlist outside consultants who are available on a term-limited basis.
technical services might be funded through grants, provided as part of a course, or extended as grant-in-aid from a digital scholarship center or library. in such cases, revisions are possible from a purely technical standpoint. social barriers, instead, make revision impractical and unlikely to occur. [ ] [#n ]

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[catalysts] [makers] [hosts] start the peer review process early, perhaps during the initial design and development phases, so that ntso revisions can be incorporated before a technical point of no return.
[catalysts] [makers] [evaluators] foster education about peer review standards for ntsos. this can perhaps be combined with other teaching goals, such as technical literacy. utilize scholarly organizations such as mla, ala, and aha to host seminars and pre-conference workshops on best practices for reviewing ntsos.
[catalysts] [makers] [hosts] for moments when early integrated peer review is not possible, foster and adopt open, post-publication peer review models. [ ] [#n ]
[ntsos] ntsos must be portable between collaborators and peer reviewers, and executable such that non-technical peer reviewers are still able to run and review the object.

prestige
despite the difficulties of publishing ntsos, many of our interview subjects remained committed to adapting core elements of monograph publishing to their work. academic prestige was at the heart of this support, as was loyalty to peer review as a process. some said they were concerned with how ntsos appear on their cvs, and many said they preferred omeka and scalar because of their “monograph whiff.” a booklike object, they said, would be easier for hiring, promotion, and tenure committees to understand. [ ] [#n ] there was no consensus as to whether monographs, articles, or conference papers were most analogous to ntsos.
[ ] [#n ] some suggested that such a judgment depended on the size and scope of the project. almost all our subjects, however, sought to draw comparisons to traditional modes of publication. a few also related experiences of having ntsos evaluated as service instead of scholarship. the subjects who recalled such experiences found them objectionable. one attributed the misjudgment to an overly narrow definition of scholarship, i.e., that only a prose-like intervention articulating and defending a critical argument should count. some subjects voiced the idea that “the model is the argument,” but they also conceded that such scholarship would be less recognizable to some reviewers. [ ] [#n ] in other words, our interview subjects saw the norms of print as crucial to traditional prestige. many of our subjects were especially concerned about the stigma of self-publication. this label, they felt, would lead reviewers to dismiss the work and disqualify it from being “serious scholarship.” there was some disagreement (and even tension) about whether publishers were a healthy part of the scholarly ecosystem. some called for prestigious journals in their field to create more space for ntsos. their core idea was to extend the prestige of traditional, print-based scholarship to contexts where digital scholarship could appear. [ ] [#n ] others called for the reform of (or even an end to) prestige-based scholarship. a press's reputation, one interview subject argued, is too often relied upon to determine the importance and prestige of a scholarly work. outsourcing scholarly gatekeeping to publishers, they said, prevented scholars from reading and judging scholarship on its own merits. as stated previously, many of our subjects who chose “alt-ac” career paths expressed great relief at “being freed” from the structures of prestige, credit, and promotion. many of these subjects directly associated these structures with a publisher-centric system.
[ ] [#n ]

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[catalysts] [makers] articulate criteria for ntsos to be treated as equal to monographs, journal articles, and conference presentations via scholarly organizations, tenure guidelines, grant programs, and other initiatives. reduce the knowledge gap regarding the amount and kinds of labor that go into producing ntsos.
[makers] [evaluators] continue experimenting with radically alternative credit models.
[hosts] put resources into and create space for ntsos. treat ntsos as equally valuable and valued as print-based scholarship.
[evaluators] adopt capacious notions of scholarship that include ntsos and self-published work. work to divorce the means of distribution from the granting of prestige.

alternative audiences
in our interviews, issues of prestige were directly linked to intended audience. many digital makers have traditional academic audiences in mind for their scholarship, including disciplinary scholars in the humanities and stem, which can be further divided by level of specialization. imagined classroom use was also commonly cited as an intended context for interviewees' ntsos. several of the people we interviewed identify strongly with the public humanities and see their audiences as various segments of the public. they expressed interest in matters of social justice, cultural heritage communities, and policy-making. the great range in size of perceived audiences appears to have a strong impact on the software, tools, and platforms used to create and publish ntsos. in cases where audience response (approval, engagement, etc.) is a priority, the norms of prestige are perhaps less salient. those who said they prioritized public humanities showed little interest in attempting to emulate traditional publishing models.
instead, they argued that effectively self-publishing (that is, paying for their own project hosting and taking responsibility for the full lifecycle of the project) enabled them to reach their intended audiences. [ ] [#n ] the scholars we spoke with who were engaged in public-facing scholarship tended to work on smaller or solo projects. they either worked in contexts where these projects were sufficient to keep them employed, or worked on enough traditional-looking projects that they could rest on these when it came time for assessment, tenure, promotion, or re-appointment.

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[makers] build ntsos with user-centric design, particularly but not exclusively when building for general audiences.
[ntsos] ntso standards should align with modern user-focused web standards, including for mobile compatibility, accessibility, and minimalism. adopting these standards, conveniently, will also help ntsos become more easily accessible and preservable, and decay more gracefully.
[catalysts] [hosts] build networks, indices, databases, and other aggregators to collect ntsos and make them more discoverable by non-traditional audiences. such networks would do well to include secondary educators, community organizers, and other stakeholders. [ ] [#n ]
[makers] proactively seek inclusion of ntsos in aggregators depending on the intended audience. reaching out to non-governmental organizations and for-profit entities may be relevant, depending on the ntso's focus.
[audience] normalize paying for ntsos or ntso consortia out of respect for the labor involved. [ ] [#n ]
[makers] directly justify decisions to self-publish if going up for tenure or promotion, describing how the work was received by scholars and broader audiences.
[evaluators] accept non-traditional audiences and venues as legitimate markers for success.
discoverability and access
most of our interview subjects agreed that discovery and access are two of the largest barriers to success for ntsos. making self-published materials visible is difficult. as we have suggested, many ntsos are published on scholars’ personally-maintained websites. ntsos hosted by digital centers or other groups with institutional websites tend to be more visible. some interviewees noted they now pay increased attention to dissemination. putting something online “is not enough anymore,” one subject said. they suggested further that scholars must do more than ever before to market even traditionally published scholarship. our subjects noted that aggregators, distributors, and library catalogues do not prioritize ntsos, even those hosted by traditionally prestigious publishers including stanford university press. scholars generally know that they can find monographs, journals, and articles using library-integrated services like the mla bibliography, jstor, ebsco, proquest, and worldcat. none of the subjects we spoke with were aware of any such systems that indexed digital projects. [ ] [#n ] ntsos, as a result, are less visible and discoverable than print scholarship. according to one subject, many well established aggregators, in fact, see ntsos as a potential threat to their market share. their focus has been embedding digital assets into proprietary distribution models. in some cases, vendors have focused on adding metadata to open access or public domain materials. improved search or browse functionality ostensibly justifies re-distributing free materials for profit. some of our interview subjects had negative reactions to these attempts. they wanted to know why for-profit companies were in control of academic scholarship that they felt should be under the control of its makers or at least available without charge.
(none offered solutions for the increased burden this would place on makers.) other subjects were more focused on outcomes. one such subject said they had begun using google scholar because it indexes recent scholarship better than their university library catalog. newer aggregators generally do not provide easy ways to search for digital projects, datasets, libraries, bots, or other digital project “detritus” that might be of interest to a digital scholar. one interview subject had a particularly negative reaction to google scholar. the root of this particular critique was google's opaque metadata and indexing standards, but it suggested a broader concern. new indexes, like the aggregators they seek to supplant, do not account for ntsos. [ ] [#n ]

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[ntsos] standardization must form around ntso formats and metadata before indices/aggregators can pick them up.
[catalysts] encourage standardization around local or funded ntsos.
[catalysts] fund, create, or join consortia tasked with creating generalized indices for ntsos of particular forms.
[makers] learn, adopt, or help create standards for ntsos, particularly with respect to standard locations and structures for metadata.
[catalysts] fund studies and support projects into how to make repositories more ntso-compatible.

financial models and licensing
the publishers we interviewed spoke of financing and maintaining ntsos as two main stumbling blocks to integrating them into existing business practices. this was true for traditional publishers, higher education institutions publishing on their own platforms, and individual creators who self-published. currently, there is no business model established for ntsos that is as clear as the models for print publishing.
in a print context, publishers can reliably calculate cost-per-book upfront, and have developed industry-standard methods for cost recovery. [ ] [#n ] our interview subjects making ntsos expressed reluctance to work with for-profit publishers. these publishers might have the resources to experiment, one subject said, but their goals conflict with academia’s. university presses were seen as a more ethical option. many such presses have expressed interest in publishing ntsos, but their business models often prevent them from experimenting freely. outside grants allow some experimentation, but this does not constitute a sustainable model. offering ntsos alongside a monetizable print component can sometimes offset these costs. however, this structure can also create tension between the print and digital components, since the financial model implies that the print component is the main product. several of the press representatives we spoke to said they were hesitant to try a digital-first model, for fear of cannibalizing sales. the few interview subjects who had taken this approach said they used embargo periods, during which only the print publication would be available for sale, to assuage this fear. more often, access to a digital object would come gratis, on its own or with purchase of a print edition. publishers we spoke to said that lack of technical uniformity and clear production pipelines are the root drivers of costs. each digital object requires different labor, which creates a larger financial burden for publishers. hiring expertise in various technologies, they said, would be especially cost-prohibitive. combined with a lack of consensus around cost-recovery models, these forces push production labor to ntso project teams and their host institutions. based on our interviews, this financial model is relatively common.
however, it makes ntso publication less appealing for project teams, as they are not able to offset their labor as much as they would like. many ntso makers are not able to pursue this model, as they lack the necessary funding, expertise, or institutional support. such factors keep ntsos marginal, which further inhibits their legitimacy and potential prestige. presses may also avoid ntso publication because unclear lines of ownership can sometimes develop with large scale digital projects. in our interviews, this issue came up several times. if a project has been handed off from one lead investigator to another, it may be unclear who has the ability to offer control and ownership of the component pieces. [ ] [#n ] the concern is also partially a financial one because licensing concerns are seen as a risk to cost recovery. if licensed material is part of a project, that license is a continuing cost. the licensing party could discontinue services or abruptly change their price. in contrast, some makers cited the control they gained by self-hosting ntsos as an upshot. many may use self-hosting to sidestep issues of intellectual property, since they are not accountable to publishers’ legal teams to produce signed intellectual property agreements. self-hosting can also allow an ntso to continue to function after official financial support has been exhausted. although self-hosting can sometimes increase the lifespan of a project, interview subjects who said they preferred to maintain control of their materials seemed more comfortable with the idea of digital ephemerality. some ntsos, they said, do not need to be as durable as printed books.

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[ntsos] financial models appear to be held back by a collective action problem.
publishers seem to be risk-averse because it is difficult to assess costs upfront. outside forces can easily disrupt an ntso's ability to pay for itself long-term. normalizing ntso production could alleviate both problems. meanwhile, many scholars are dissuaded from taking on ntso production because of the marginal status of ntsos. one solution would be to produce a critical mass of non-traditional scholarly output to make it less marginal (supply side). the other would favor targeting ntso demand by appealing to new audiences (consumer side).

challenges maintaining non-traditional scholarly objects
project teams, digital scholarship centers, and host institutions are responsible for most ntso maintenance. likewise, they are the primary distributors of these objects. with print publication, scholarly presses handle production, distribution, and preparing materials for long-term access (typesetting, printing on acid-free paper, indexing, etc.). libraries are then primarily responsible for the care and preservation of these print publications. with ntsos, maintenance burdens have shifted to those who have not traditionally been responsible for such tasks, and who are often ill-prepared for their requirements. [ ] [#n ] our subjects pointed to many issues that project maintainers need to address. such labor includes scoping resource requirements, preparing projects for preservation, and retroactively dealing with “legacy projects” built on outdated platforms. there was no consensus among our subjects about who should be responsible for project maintenance. several said that their labor in making an ntso should end at the point of publication. as with a print resource, the resulting object would become someone else’s responsibility. one of our subjects said that long-term maintenance concerns prohibited faculty from considering ntso production.
if a project might only be maintained for three to five years, they felt their time and effort would be better spent writing a traditional book. others said the shifted maintenance burden made it easier to take their projects with them when they left institutions. still others viewed direct control over how long their scholarship remained available online as a benefit.

disparate notions of maintenance
many of the people we interviewed questioned the way we were using the term preservation and wondered whether the scholarly community as a whole was using it appropriately with respect to ntsos. they asked if maintenance of a digital project could be considered preservation. if not, where was the boundary between these two activities? interview subjects asked, “what is the difference between the live web site and the archived site? we can’t get rid of preservation, but can we get rid of maintenance?” collectively they raised questions about how separate these activities are. this report adopts the position that maintenance and preservation must be viewed as separate categories. conflating the two obscures much of the labor of ntso production. for the purposes of this report, we have tried to use maintain and preserve consistently. a project that is being maintained continues to be accessible via the same or similar means as originally designed. a project that is being preserved may not be rapidly or easily accessible in its original context but may continue to be accessible in the long term. [ ] [#n ] web archiving provides an illustrative example of how we distinguish between maintenance and preservation. a web project that is hosted on its original domain on the same platform or perhaps flattened into a static site is being maintained. a web archive of that site that is accessed via the wayback machine or is being stored without public access is being preserved.
the modality and availability of access is how we, for the purposes of this report, separate the two practices.

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[makers] [hosts] [catalysts] pursue education and outreach strategies to close gaps in understanding about maintenance.
[hosts] [catalysts] create professional development opportunities targeting maintenance task competencies.
[makers] clearly define maintenance tasks and document work.
[makers] [hosts] use project charters and other documents to help set clear expectations for maintenance at a project’s outset.
[catalysts] direct grant funding towards normalizing ntso maintenance activities.

legacy projects
our interviews returned often to the so-called “legacy projects” problem. [ ] [#n ] teams, centers, departments, or institutions have existing obligations to maintain certain ntsos. some digital content, likewise, is seen as too important to lose. since many of these projects originated twenty or more years ago, many use non-standard or now-out-of-date technologies. they may be dependent on legacy database engines, webservers, and operating systems. these projects can constitute major security risks for hosting institutions, especially when code with known exploits is running on public-facing servers. one common technique involves hijacking a server’s resources to create a zombie or bot. a project may continue to function while its resources are used to send spam emails, mine cryptocurrencies, or spread malware. an ntso's functionality may degrade or disappear due to changing browser or platform standards, security patches, or changes to external resources. [ ] [#n ]

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[ntsos] it would greatly reduce costs and complexities were many actively maintained legacy ntsos transferred to a static or encapsulated preservation state. this would require significant initial investment by [catalysts] and [hosts], but would reduce overall costs of long-term maintenance and preservation.
[makers] [hosts] help establish clear pathways between maintenance and preservation to ease hand-offs.
[makers] [hosts] avoid models where continuous updating and ongoing maintenance are the norm. many ntsos could adopt a model more like a scholarly monograph, with various editions in stable preservation states.
[ntsos] software containers, though not a panacea for maintenance challenges, can reduce maintenance overhead and mitigate the effects of “dependency hell.” containers are often perceived as difficult to adopt but are often praised as time-savers once adopted.
[makers] consider simplifying ntsos using approaches like minimal computing.
[makers] build graceful degradation into original objects or their metadata. [ ] [#n ]

technical and personnel resources
maintaining digital objects requires significant personnel and technical resources. both of these raise costs. some interview subjects, especially solo practitioners, said they took on maintenance and stewardship responsibilities themselves, including personally paying for hosting, renewing domains, updating software, responding to copyright issues, moderating user comments, and updating metadata for search engine indexing. with few exceptions, our subjects reported letting digital objects remain inaccessible when their projects became too difficult to maintain. they rarely put effort into long-term access or preservation beyond a reliance on, for example, github or an institutional repository (if the ir could accept the project in the first place). one solo practitioner said their projects might remain offline for months at a time before someone notices and informs them.
fixing the underlying issue could take longer still. more than anything, maintenance requires people tasked with the labor of maintenance. one interview subject said that lack of personnel was “the ultimate hurdle.” all the computers in the world, they said, aren’t enough to maintain projects if there aren’t enough people with knowledge and expertise to work on them. those with outside support (institution, publisher, etc.) sometimes shared short-term maintenance labor. more often, solo practitioners or teams reported making hard choices about how to best use limited resources. maintenance is a / enterprise. it combines ensuring an ntso maintains functionality and remains reasonably secure. as one subject reminded us, it could even include walking into a server room and moving a computer because of a leak in the roof. in our section on making ntsos, we described the trade-offs between standardization and experimentation. these decisions affect a project's eventual maintenance demands. institutions committed to maintaining multiple software stacks (each perhaps requiring its own virtual machine) must have the expertise on hand to deal with each of these stacks. hosts with standardized system stacks can maintain more projects at once, but at the expense of expressive capacity. many content management systems, though, are perpetual targets of attacks. as a result, using platforms such as omeka or scalar incurs additional maintenance costs in dealing with the security risks of standardization. in our interviews, the precise cost of accommodating multiple software stacks vs. standardized systems remained unclear. [ ] [#n ]

recommendations
based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.
[makers] [hosts] create documentation that addresses an eventual project hand-off directly.
[evaluators] help establish and defend the norm among ntsos and ntso production teams that documentation will directly address expectations for any future project hand-off.

[makers] [hosts] develop consistent editorial policies, agreements, etc. before hand-offs take place. this recommendation may make preservation easier, either by freeing hand-off teams from making curatorial decisions or by providing clear guidance for those decisions.

[makers] [hosts] allocate the necessary time and resources for maintenance teams to keep sites functional and secure. new maintenance needs, such as recently identified security risks, can arise at a moment's notice.

cost models

as noted in our interviews, maintenance costs are difficult to estimate ahead of time and can grow quickly. costs can remain low for long periods of time, with sudden increases for upgrading or short-term troubleshooting. one interview subject compared digital scholarship to venture capitalism. they said this paradigm focuses on starting projects rather than maintaining them in the long term. based on our interviews, although funders mandate maintenance and sustainability, current funding systems do not accommodate these requirements. sustainable ntsos need permanent infrastructure and personnel, which are hard to maintain with project-based expenditures. [ ] [#n ] our interview subjects were practical about the resource requirements of maintaining projects. many (in small and large institutions) pointed out the need to be realistic about what individuals, centers, or libraries can do. determining whether a project should be maintained or preserved, they said, required balancing several factors:

1. audience or community demand for the resource.
2. the level of access needed to meet current demand.
3. commitments made to grant-funders, partners, etc.
4. the value or significance of the project.

several interview subjects spoke of both intellectual and monetary value.
some suggested that digital project maintainers should not be afraid to justify and even monetize their projects to support maintenance costs. one said they wanted to “take the hubris” out of digital projects, arguing: if you let it go and it’s valuable, people will step in to ensure its survival.

recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.

[makers] [hosts] work to make maintenance costs more transparent, and incorporate these costs into budgets.

[ntsos] broadly speaking, the digital scholarship community should be open to cost models that involve monetizing to support maintenance costs (e.g., following the model of omeka.net, mla member resources, etc.).

[hosts] develop models to pass maintenance burdens from a project's creators to the people or communities who are most invested in its survival. this is how libraries maintain access to certain valued physical books or journals. even institutions that want to act as maintainers of access to ntsos, however, often lack the resources and infrastructure to do so. [ ] [#n ]

[catalysts] [hosts] normalize budget lines related to institutional infrastructure and budget for continuing costs, rather than project-based expenditures. this will require a shift in institutional culture, and may require new types of institutions built around maintaining ntsos and making them accessible. increased funding will be necessary, and might come from lines such as university indirect rates.

challenges preserving non-traditional scholarly objects

as we've noted, the people we spoke with tended to conflate maintenance and preservation. the soft edge between the two speaks to the need for more clarity about the differences between these two practices.
it may also suggest that project teams are keeping ntsos in states of active maintenance while trying to preserve them. the digital scholarship community has debated how to ethically and practically use the internet as a publication space and a repository space. [ ] [#n ] what’s more, previous scholarship has pointed out that ntsos blur the lines between these two activities. in our interviews, we focused on questions of what to preserve, why to preserve it, how to enact effective preservation strategies, and how long a preserved digital object should last. [ ] [#n ] we also asked about who should perform the labor and who should pay for it. others have pointed out that effective ntso preservation begins before an object is built. [ ] [#n ] keeping this idea in mind, we have made efforts to frame preservation in relation to making, publishing, and maintaining. preservable outputs interview subjects had strong opinions about what were their most essential outputs. some argued that their data, ideally in rawest form, were the fundamental building blocks of their work. others said metadata was an essential output that must be preserved as well. many pointed out that metadata was necessary for automated indexing systems and would enable citation. some felt that code, or to a lesser extent software-dependencies, were important digital outputs to preserve. less often mentioned digital outputs included project documentation and user interfaces. many tasked with preserving digital projects expressed frustration at the lack of explanatory documentation. this was true for experts working on custodial legacy projects and non-custodial preservation. we heard often that production was prioritized over documentation and metadata. tension between ephemerality and permanence, combined with existing incentives, may reinforce such priorities. [ ] [#n ] many of our subjects felt that digital interfaces were the least important aspects of their projects to preserve. 
this feeling may be because they did not consider the design and user experience of their ntsos to be scholarly outputs worthy of study in their own right. others said they thought preserving video recordings of user interactions was sufficient for preserving an interactive user interface. however, they recognized a loss of context when trying to preserve digital projects in this way. one interview subject in particular used the extended metaphor of preserving old nintendo games: “you can have a nintendo and you can have a mega man ii cartridge, but you can’t have the sleepovers where you played mega man ii until four in the morning.”

recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.

[makers] [hosts] [catalysts] continue education and outreach efforts. as with maintenance, the perceived burden of preservation is shaped by gaps in understanding.

[makers] develop better explanatory documentation to address how projects will be preserved, especially in post-custodial preservation contexts. this is particularly important for points of hand-off.

[catalysts] pursue grant programs to increase education, outreach, and documentation initiatives, aimed specifically at increasing ntso preservability.

purpose of preservation

our interview subjects had vastly different views on the purpose and usefulness of preservation. where ithaka’s “the state of digital preservation in ” frames digital preservation as a necessity, our subjects questioned the preservation of ntsos. [ ] [#n ] lack of consensus on this subject might reflect a digital scholarship value system in flux. our report, unlike ithaka’s, focuses specifically on digital scholarship, so many of our subjects responded to the idea of preserving argument-driven objects. one group of subjects expressed little interest in preserving ntsos. some considered their projects a form of digital ephemera.
such ntsos may be important for a time but also significant in that they are not designed to last. others articulated a tension between the freedom to explore and the demands of preservability. their experimental “flights of fancy,” one said, simply weren’t worth long-term care. the idea that not everything can or should be preserved came up in many of our interviews. most of our interview subjects did not share this view. ntsos, they said, should be preserved due to scholarly standards of evidence, especially to ensure the veracity of future arguments built on prior work. the comparison we heard most related to citing sources. as long as an ntso is making a contribution to the scholarly conversation, they said, it should be available for scholars to reference. unlike in the sciences and social sciences, where arguments and experiments might become “stale” after a few years, humanities arguments and evidence may be referenced for decades. our subjects said this time scale makes preservation more important in the humanities. ntsos that no longer contribute to contemporary debates often have value to other audiences. some of these may be cultural heritage objects in themselves. others may inform future intellectual histories. one of our interview subjects argued that self-analysis and reflection require access to the historical record. studying the historiography or legacy of digital fields requires access to the ntsos created in those fields. they asked, how much of the history of humanities computing and digital humanities has been lost due to the challenges of preserving early digital objects? [ ] [#n ] finally, and most pragmatically, our interviewees pointed to issues revolving around contractual requirements. many grant-funded digital projects have data management plans or other commitments to keeping project outputs accessible.
institutions and scholars are often bound (by law or contract) to preserve the outputs of sponsored research. others said preserving their project outputs was a moral imperative. we heard this position most often regarding public-facing resources used by a particular community, or in the classroom. the decisions around what to preserve are challenging in the face of scarce resources. meanwhile, ntsos are becoming increasingly complicated. the people we spoke with said they felt no obligation to spend time and money preserving projects of interest to “just one professor.” they did note, however, that they felt a stronger obligation to preserve social justice projects, regardless of usage statistics. this raises the issue of when and what to preserve, which we address in the next section.

recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.

[makers] [audiences] [evaluators] participate in conversations with communities of interest in determining what is preserved and why.

[makers] reach out to potential stakeholders early in the design of an ntso. expectations for such community building are likely to increase over time.

[makers] [audiences] [evaluators] clearly articulate motives and needs. different reasons for preservation can inform different preservation strategies.

[makers] [hosts] continue to develop levels of preservation in concert with users' or audiences' needs.

when to preserve

in the previous section, we raised the issue of how to determine an ntso’s value. appraisal, especially in libraries and archives, is a well-established process used to determine the value of records. these determinations inform decisions about preservation and deaccessioning. the labor of preservation is often invisible from the outside and taken for granted, as noted by several subjects. in traditional scholarly publishing, preservation takes place after the fact and the author is not involved.
the scholars we interviewed, however, had a deeper understanding of appraisal and preservation than we expected. such understanding perhaps arose from their need to play a greater role in these processes than would be necessary with printed materials. in the context of ntsos, a crucial and related concept is graceful degradation, the gradual loss of functionality over time. digital objects do not automatically degrade in this way. if neglected, they may pass from functionality to complete inaccessibility with little or no transition time. some of our subjects lamented their inability to preserve ntsos in their entirety. most, however, said they were comfortable with graceful degradation, as long as core functionality, or the content, remained. as we noted before, ephemerality was also acceptable on a case-by-case basis. some of those we spoke to said they wanted the original versions of their work to be available for as long as possible. one specifically argued for the -year “life of a laptop” to become standard. another suggested - years with a -year lifespan in special cases, followed by an explicit shut-down process. however, as we noted when discussing maintenance, most projects do not have an explicit shut-down procedure or predetermined end-of-life. this absence can give rise to what several of our interview subjects called “zombie projects,” with no clear hand-off between maintenance and preservation. it may also increase the number of projects that are still online and functional but no longer maintained, and at risk of system failure. our interview subjects were particularly emphatic about the need, sometimes, to embrace ephemerality. this was almost always expressed as a condition of digital experimentation.
knowing that their work does not, and indeed should not, exist in the long term can encourage scholars to “hack” or “play.” one person said they’d found it liberating to think of a website existing only when visited by a user’s browser. some of our interview subjects felt that preservation standards for born-digital content exceed norms for print. books and paper, one person said, are only considered preservable because of the infrastructure that we have built to protect them. a book outside on the pavement would only last days. books go out of print and journals fold, so why, with digital projects, should we have an idea of “we paid for it, it should exist forever”? some subjects said they focused on more personal short-to-medium timelines such as a semester, a graduate career, a job search, or tenure and promotion. we detected a strong sense in many of our interviews that projects in need of long-term preservation would emerge organically from the field. institutions and agencies willing to invest the time and resources necessary to maintain, reinvent, and generally preserve notable projects would presumably step forward. this perspective may be naïve or callous, but we believe it is important to note. [ ] [#n ] a significant divide may exist between the digital preservation community and people who tend to create ntsos. as we noted earlier, some of our subjects did argue for the need to preserve as much as possible. one person argued that digital scholarship should last as long as non-acidic paper. another wanted active projects to be maintained for decades. content and outputs, one said, should be preserved “forever,” or for the lifespan of the institution stewarding the digital object. other subjects said they thought that only the most ground-breaking or frequently used digital projects needed long-term preservation, but we did not speak extensively about the selection criteria.
[ ] [#n ]

recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.

[ntsos] although much work on this subject has been and continues to be undertaken, ntsos must coalesce around proven models of graceful degradation.

[makers] embrace graceful degradation methods in the development of ntsos.

[hosts] if an ntso is not built to gracefully decay, consider building in that functionality during the hand-off between institutions, or between maintenance and preservation. [ ] [#n ]

[makers] have open conversations about the lifecycle of your project with your team members, and articulate your project’s goals as they pertain to ephemerality or preservation.

[hosts] [audiences] [makers] consider graceful degradation models that include video surrogates to capture the spirit of the original work in modified form, narrated by the author to maintain the project’s original argumentative structure for the academic record.

[hosts] [audiences] [makers] consider, alternatively, shut-down processes that include “flattening” the digital object into a standardized or easier-to-preserve format, such as static html. a version like this would enable a digital project to remain accessible without excessive resource costs. [ ] [#n ]

continuing to institutionalize preservation

when asked how to establish effective digital preservation for their ntso, many of our subjects said they hadn’t thought about it. we found that almost anyone in the development chain for a project can disavow responsibility for preservation. (one might say “it isn’t my job,” or “let’s wait and see what happens.”) many reported taking one of these approaches in the past. some said they stored ntsos on a server or relied upon free, commercial services such as github despite acknowledging that these weren’t preservation solutions. anything more, some said, would exceed their available resources.
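to make the “flattening” recommendation in the preceding section more concrete, the sketch below shows one small piece of such a shut-down process: renaming dynamic page addresses to static html filenames and rewriting the links that point to them. it is only an illustration. the function names, the list of "dynamic" file extensions, and the naming scheme are assumptions made for this example, not a method described by our interview subjects, and a real flattening effort would also need to capture the rendered pages themselves.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Extensions treated as server-rendered pages (an assumption for this sketch).
DYNAMIC_EXT = re.compile(r"\.(php|aspx|asp|jsp)$")
# Site-relative href values only: targets containing ":" (absolute URLs,
# mailto links) never match, and any "#fragment" is captured separately.
HREF = re.compile(r'href="([^"#:]+)(#[^"]*)?"')


def is_dynamic(url):
    """True if the URL has a query string or a server-side file extension."""
    parts = urlsplit(url)
    return bool(parts.query) or bool(DYNAMIC_EXT.search(parts.path))


def static_filename(url):
    """Map a site-relative URL to a flat, filesystem-safe .html filename."""
    parts = urlsplit(url)
    stem = DYNAMIC_EXT.sub("", parts.path.strip("/")) or "index"
    pieces = [stem.replace("/", "-")]
    # Sort query parameters so equivalent URLs map to the same file.
    for key, value in sorted(parse_qsl(parts.query)):
        pieces.append(f"{key}-{value}")
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", "-".join(pieces))
    return safe + ".html"


def flatten_links(html):
    """Rewrite links to dynamic pages so they point at their static copies."""
    def rewrite(match):
        target, fragment = match.group(1), match.group(2) or ""
        if not is_dynamic(target):
            return match.group(0)  # stylesheets, images, static pages: unchanged
        return f'href="{static_filename(target)}{fragment}"'
    return HREF.sub(rewrite, html)
```

under this hypothetical scheme, a link to `item.php?id=12` would point to `item-id-12.html` after flattening, while external links and static assets are left alone. the regular expression only handles double-quoted `href` attributes; a production tool would more likely use a full html parser.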
we saw strong consensus that, in an ideal world, libraries and archives would maintain and preserve ntsos. many said they already ask archivists, special collections librarians, and scholarly communications librarians for guidance on digital preservation. [ ] [#n ] some added that ntsos developed with library involvement have a better chance of long-term sustainability. they said they considered well-established roles and responsibilities crucial for digital preservation. [ ] [#n ] several of our subjects stressed that universities and colleges must preserve ntsos themselves. they warned that higher education’s failure to preserve ntsos would ensure a for-profit takeover of the labor. there was particularly strong resistance to the idea of corporate, for-profit, or vendor ownership of ntsos. some subjects worried the different incentives, time-scales, and values of for-profit vendors made them ill-suited for the role of cultural steward. even some who held this opinion felt vendors were the only option, however, given a lack of local institutional expertise, resources, or technical support for dealing with digital preservation. we saw overall agreement that the current social, technical, and financial realities are obstacles. some stressed that smaller or less-funded institutional libraries had particular challenges. most of our subjects, however, said that such concerns inhibit all libraries from taking on such stewardship roles. [ ] [#n ]

recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.

[makers] [hosts] use "memorandums of understanding" between project teams and libraries to establish clear roles and responsibilities for digital preservation. these might be comparable to the practice of donation agreements associated with material and print collections.
[catalysts] [hosts] pursue consortium models for preservation to enable smaller or less-well funded institutional libraries to pool resources and take advantage of economies of scale. [ ] [#n ]

[catalysts] [hosts] ensure considerable resources are available at cultural heritage institutions to deal with digital maintenance and preservation. lack of sufficient resources appears to be the most significant reason this task gets outsourced to other organizations in ways that practitioners feel are inappropriate or inadequate.

technical resources

preservation requires ongoing technical resources, even if the project is not accessible. potential costs include:

1. server mirroring and storing backups.
2. maintaining links (e.g., between datasets and code).
3. format migration.
4. re-deploying to new platforms.

this work, especially platform migration, requires increasing technical proficiency. the digital preservation community is exploring new technologies to streamline preservation, but some of these have a steep learning curve. some newer preservation technologies, such as emulation or containerization, some subjects said, are overkill or unproven for preservation. one subject said, “we don’t need [to emulate] a whole desktop environment” to provide access to a single digital project. only a few of those we interviewed had experience with emulation and containerization. many held the view that either would be difficult to learn. even those with direct experience questioned how a workforce with the necessary expertise would be developed. learning any new skill requires training and time commitment. new competencies can also broaden job descriptions and increase expectations. already overextended project team members are understandably wary of proposed changes with such potential. attitudes about whose job it is to preserve ntsos may constitute, in themselves, a sociotechnical obstacle to effective preservation.
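as a small illustration of the first cost listed above (mirroring and storing backups), the sketch below records a checksum for every file in a project directory and later checks a mirrored copy against that record. checksum manifests of this kind are a common fixity practice in digital preservation, but the function names and the in-memory manifest layout here are assumptions made for this example rather than anything our interview subjects described.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_to_manifest(root, manifest):
    """Return (missing, changed) relative paths for the copy stored at root."""
    current = build_manifest(root)
    missing = sorted(name for name in manifest if name not in current)
    changed = sorted(
        name for name in manifest
        if name in current and current[name] != manifest[name]
    )
    return missing, changed
```

a host might build the manifest once at hand-off and rerun the comparison against each mirror on a schedule; two empty lists mean the copy still matches what was deposited. the sketch reads each file fully into memory, so very large files would call for chunked hashing.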
our subjects were sensitive to shifting responsibilities for digital preservation. as we've discussed, many we interviewed said they don't want to be responsible for long-term preservation. several raised the prospect of a scenario where a digital maker leaves the institution where they deposited a project. this contingency seemed to provoke particular concern. preservation, they said, should work like books. this is to say that preservation would remain the responsibility of libraries, archives, and other memory institutions, even after a maker has left. deep technical expertise, preservation personnel, and technological infrastructure come at significant financial cost. we heard several times that these costs need to be "baked into" project budgets. others said they felt that grant funding should not support preservation costs. most agreed that preservation was best accounted for as part of the overhead of a university or other institutional home.

recommendations based directly on interviews, our interpretation of them, or on our interpretation of surveyed literature.

[hosts] investigate the potential of emulation or containerization as mechanisms to make active preservation easier.

[makers] [catalysts] consider using software containers for maintenance, but not preservation. however, note that making an ntso more maintainable will often make it easier to preserve.

[hosts] [audiences] [catalysts] explore use cases for containers where users request access to a preserved ntso. with a software container, a project could be recovered from its preserved state and brought temporarily into some active, temporarily maintained state for consultation.
this mode of access would require technically skilled personnel who can manage the technical stack, “revive” a preserved project to make it accessible, and shut it down after the access needs have been met.

[hosts] create infrastructure for ntso creators to self-deposit their work. such a system would need to include metadata and documentation. many institutional repositories, at present, are unable to accommodate complex ntsos in a way that would allow them to be easily revived.

conclusion & looking forward

we envision a future in which scholarship that embraces its digital affordances and materiality is placed on equal footing with typeset, print-ready scholarly publications. currently, non-traditional scholarly objects (ntsos) fit poorly in the academic world. they are less prestigious, more difficult to find, and more likely to suffer neglect than their printable counterparts. the stages of and roles involved in an ntso’s life are ill-defined and contentious. the rich variety of ntsos is both a blessing and a curse, resulting in an explosion of creative, transformative scholarship that by its nature defies academic norms. challenges faced by ntsos will not disappear soon. their inevitable growth is a function of the changing environments in which scholars work. a sound fitting of ntsos within their academic world will require a series of informed, orchestrated interventions that take into account every aspect of their complex lives. our perspective on how to intervene emerged organically from the work of this report. based on hundreds of hours of interviews and a survey of secondary literature, we relay common pain points associated with digital scholarship. the ntso workflow is broken into stages (1. making, 2. publishing, 3. maintaining, and 4. preserving), with subsections organized by topic. we further identify five roles (1. catalysts, 2. makers, 3. evaluators, 4. hosts, and 5. audiences).
though the report’s structure suggests stages and roles are easily separated, the opposite is true. stages and roles blur together, rarely following a neat trajectory. this blurriness betrays a lack of standard protocol engendered by the uncertainty of the new. we do not anticipate the stages and roles organized here will be those eventually settled upon, but they offer a useful starting point to articulate challenges surrounding ntsos. across all stages, several broad and overlapping problem areas suggesting intervention strategies became apparent, including:

1. sociotechnical challenges being treated as technical challenges.
2. gaps in expectations and communication between roles leading to poorly aligned practices.
3. friction around hand-off points and periods of transition.
4. nonexistent or competing standards preventing ntsos from thriving.

(1) sociotechnical challenges are all too easily mistaken for purely technical hurdles. when dealing with ntsos, individual catalysts, makers, and hosts often try to address sociotechnical challenges with technical solutions. even this study fell into this trap, initially positing software container technology alone as a possible cure for challenges faced by ntsos. in interviews, we heard of decisions between digital platforms being driven entirely by technical capacities and limitations. the capabilities of such platforms are important, but so are factors like what the institution will support, what previous team members have used in the past, and what is the norm for a particular field. as these are sociotechnical challenges, separating the technical from the social, cultural, or institutional challenges is impossible; each dimension must be considered in tandem. interventions, similarly, must address this spectrum. clear communication of these factors at every stage is essential.

(2) gaps in expectations and communication between roles lead to poorly aligned practices.
despite the fact that development, publication, and preservation of digital scholarship depend on a shared ecosystem, conversations in that ecosystem often fall out of sync. for example, ntso makers seem more likely to blur distinctions between maintenance and preservation, which can obscure labor and shortchange resource needs. likewise, motives and incentives shape expectations. a scholar including an ntso in their tenure portfolio has specific needs for preservation and evaluation. yet, the scholarly infrastructure at their institution may be unable or unwilling to accommodate a one-off, bespoke digital project. the motivations and incentives of the individual scholar are not aligned with the institution. many problems we observed appeared related to a lack of communication or a failure to understand motivational contexts. interventions must be built with members of different roles in close conversation with one another.

(3) periods of transition and points of hand-off are the most crippling moments in the life of an ntso, and perhaps the best starting point for an orchestrated intervention. ntsos are often distributed across many files, systems, and hand-driven modes of stewardship. packaging and transferring ntsos between collaborators, roles, or stages can be intensely difficult. often, without the same group of people, technologies, and resources brought to bear on its creation, an ntso will be impossible to move from one party to another. even when it can be moved, as with any fragile object, an ntso’s transfer can require significant costs and expertise. and because of the collaborative nature of ntsos, they might need to change hands frequently. we heard frustrations like "how do i submit a digital object without self-publishing it?" or "how do i know what technical stack would work best for a particular scholarly journal?"
we argue these frictions suggest a deeper question: how might the digital scholarship ecosystem normalize transfers of ownership or stewardship for ntsos? one essential step in this process will be the encapsulation of ntsos, clearly demarcating an object and its context of functionality.

(4) nonexistent or competing standards prevent ntsos from thriving. the lack of agreed-upon ntso standards contributes significantly to the friction around hand-offs, and impedes the normalization of digital scholarship. for example, we learned about an ntso developed within a particular library context which was technically incompatible with a journal seeking different sorts of digital content. the subject could not get their ntso published, and could not even deposit it in the same library’s digital preservation system. although standards have arisen around ntsos, they often compete, or look quite different across stages. getting past these difficulties will require more orchestrated interventions. one question that underlies these challenges is: “who can take action to effect change?” we avoid suggestions of how the world ought to be, disconnected from specific actors who can bring about the change. in recognition of how easy (or arguably cheap) it is to recommend that other parties make broad, sweeping changes, we pointed our recommendations toward practitioners in each of the five identified roles. our last category of recommendation pertains to ntsos as objects, and how we might collectively shape them to suit scholarly needs. how and where to act first is a difficult subject. a well-established way to overcome collective action problems is to disrupt the balance with an orchestrated intervention. it would not be enough to make it easier for journals to accept bespoke websites. such an effort would need a community of practitioners and peer reviewers to test and use the new tool.
likewise, establishing a new approach to hosting digital scholarship without bringing in stakeholders from publishing and preservation will surely fail. some well-known failures in digital scholarship have espoused the philosophy, "if we build it they will come." orchestrated interventions, in contrast, make building coalitions and lining up beta testers co-equal priorities to prototyping. any attempt to create smoother hand-offs, in particular, must keep this in mind. solutions not building toward each other are building away from each other. after extensive research, we identified no panaceas or silver bullets. the path forward seems clearer than when we began, thanks to the continued efforts of many stakeholders. but the future we envision, in which ntsos are as prestigious, discoverable, and easy to hand-off as their print-first counterparts, is still far off. standards need better articulation and coordination. more shared tools and platforms must be developed. crucially, incentives must shift to encourage and reward experimentation. perhaps most importantly, there is an urgent need for a single format (or set of formats) for ntsos that encapsulates both the content and the structure of these complex digital objects. such a format must balance its ability to enable maximally expressive scholarship alongside its need to constrain ntsos to standard shapes. the design must support a broad range of digital projects while also reducing friction at the various points of hand-off. standardized formats could offer clear targets tied to existing reward structures and publication systems, thus addressing the challenges around incentives. like with the codex book or the pdf, we anticipate a paradigm of encapsulation for ntsos would eventually curtail the expressive diversity of scholarly objects. in the meantime, we are excited to watch the tension at the heart of this balance foster works of profound creativity and value. 
and we hope the approaches suggested here will help bring these works the legitimacy, durability, and wide audience they deserve.

appendix a: interview protocol

opening statement to set the context
we are interested in the production, publication, and preservation of non-traditional scholarly output. by non-traditional scholarly output we mean, for example:

orbis: http://orbis.stanford.edu/
ben schmidt’s “a guided tour of teaching language”: http://benschmidt.org/profcloud/
twitter bots
scholarly blogs
online databases like the trans-atlantic slave database: http://www.slavevoyages.org/
digital hadrian’s villa project: http://vwhl.squarespace.com/digital-hadrians-villa-project/

questions

(note: we did not necessarily ask these questions with this specific wording, and we occasionally modified questions when they seemed inappropriate for a particular interviewee.)

1. what is your role in the production, publication, and/or preservation of non-traditional scholarly output?
2. what is it that you do in your role? give us some details? tell us what you did on your last project?
3. who was the intended audience for the project?
4. who did you work with on your last project (if anyone)? what were your collaborator’s roles?
5. walk me through your last project from cradle to grave.
6. what were the pain points with the work you did on that project?
7. where did the project get “published?”
8. what affordances/expressive capabilities would you like in your non-traditional scholarly output (that you currently don’t have)?
9. how do you see credit functioning for your work and for non-traditional scholarly objects more generally?
10. what happened/happens when the project ends? what is the afterlife?
11. who is responsible for the project?
12. how important is/was the preservation of the project?
13. in relation to your work on non-traditional scholarly output, what are your goals for the future?
14. do you have any questions for us?

appendix b: commonly referenced technologies

amazon s
angular
arcgis
archiva
bepress
blocks
campus data centers and servers
cartodb
databases (mongodb, mysql, postgresql, etc.)
dataverse
docker
drupal
dublin core
electron
fedora
flash
fusion
general “supercomputing”
gephi
github
glacier
google drive / doc / sheets / maps
hathitrust
html/css
hydra
iframe
iiif
janeway
java
javascript/d
jekyll
jquery
jupyter notebooks
kubernetes
lamp stack, servers, other “systems administration” stuff?
manifold
mastodon
microsoft office
multispectral imaging
neatline
ner
observable
ojs
omeka (omeka s)
open refine
php
plotly
python
qgis
quark
quicktime
r
react
reclaim hosting
ruby on rails
scalar
sigma is
specific digital repositories/asset management platforms
tei
tropy
twine
twitter
ubiquity
unity
virtual reality and d
voyant
wordpress
xml
zotero

authorship order is alphabetical.
see, for example, https://stackoverflow.com/questions/ /web-stacks-listing-of-common-web-stacks-environments and https://www.coursereport.com/blog/lamp-stack-vs-mean-stack-vs-ruby-on-rails
(“deltron - virus lyrics | metrolyrics” )
this report focuses on a research context and sets aside the use of containers in teaching.
for more information about the broader ecology of containers in we recommend this informal, but informative, collection of links: https://github.com/friz-zy/awesome-linux-containers
https://en.wikipedia.org/wiki/operating-system-level_virtualization
https://en.wikipedia.org/wiki/logical_volume_manager_(linux); https://en.wikipedia.org/wiki/chroot; https://en.wikipedia.org/wiki/cgroups; https://en.wikipedia.org/wiki/docker_(software)
https://coreos.com/rkt
https://www.docker.com/what-container
the oldweb.today project from rhizome is using containers to run old web browsers on archived versions of websites in a web-based gui. http://oldweb.today/
https://cloud.google.com/containers/
https://www.docker.com/
https://www.datadoghq.com/docker-adoption/; https://portworx.com/wp-content/uploads/ / /portworx-container-adoption-survey-report- .pdf; https://portworx.com/wp-content/uploads/ / / -container-adoption-survey.pdf
https://coreos.com/rkt
http://singularity.lbl.gov
https://www.nersc.gov/research-and-development/user-defined-images/
https://www.opencontainers.org
https://github.com/opencontainers/runtime-spec; https://github.com/opencontainers/image-spec
https://github.com/friz-zy/awesome-linux-containers#containers
https://github.com/veggiemonk/awesome-docker
https://www.nersc.gov/research-and-development/user-defined-images/; http://singularity.lbl.gov/
https://slurm.schedmd.com/
https://en.wikipedia.org/wiki/orchestration_(computing)
https://kubernetes.io/
in the cloud computing industry you may often hear the analogy “cattle not pets.” that is, systems administrators need to stop thinking of and caring for servers as their pets (managing the configuration by hand, giving servers names, managing a collection of services on a single server). containerization and the cloud force administrators to think differently about their servers, as nameless cattle to be managed at scale using automated tools.
(“docker alternatives and competitors | g crowd” )
https://cloud.google.com/container-engine/
https:// factor.net/
https://en.wikipedia.org/wiki/microservices
https://en.wikipedia.org/wiki/serverless_computing
https://docs.microsoft.com/en-us/azure/container-instances/container-instances-overview
a deliberate absence in docker’s marketing material is the impact of the shipping container on labor and the loss of jobs for dockworkers and other port laborers. someone needs to study the social and organizational implications of containers.
(“what is devops? - amazon web services (aws)” )
https://en.wikipedia.org/wiki/dependency_hell
https://github.com/bd kccd/docker
https://github.com/jupyter/docker-stacks
http://dhbox.org/
software containers are also increasingly popular for teaching because they reduce the cognitive overload and time commitment of installing software, which can affect student experience (clark et al. ; Špaček, sohlich, and dulík ; holdgraf et al. ; kamvar et al. / ; williams and teal / ).
most recently, the software sustainability institute convened a docker containers for reproducible research workshop (https://www.software.ac.uk/c rr). slides are available on the agenda (https://www.software.ac.uk/c rr/agenda) and there is a short blog post with a summary (http://www.dpoc.ac.uk/ / / /c rr-containers-for-reproducible-research-conference/). also see the twitter stream (https://twitter.com/search?f=tweets&vertical=default&q=% c rr&src=typd).
https://en.wikipedia.org/wiki/continuous_integration
common workflow language (cwl) http://bcbio-nextgen.readthedocs.io/en/latest/contents/cwl.html - bcbio can take workflows described in cwl and run them in docker containers.
https://daspos.crc.nd.edu
https://www.journals.elsevier.com/softwarex/; https://www.reprozip.org
http://cknowledge.org; https://occam.cs.pitt.edu
https://coreos.com/blog/open-container-initiative-specifications-are-
http://containers-ftw.org/sci-f/
https://www.bitcurator.net/
mooney and gerrard point out that the recommended way to build docker from scratch, which has + dependencies, is with docker.
http://o r.info
see nüst et al.
https://en.wikipedia.org/wiki/bagit
see nüst et al.
http://www.infiniteulysses.com
http://worrydream.com/mediaforthinkingtheunthinkable/
https://distill.pub/about/
https://www.w .org/provider/style/uri
https://github.com/odewahn/computational-publishing
https://github.com/odewahn/computational-publishing
http://bookworm.culturomics.org; https://www.omeka.net; http://scalar.usc.edu/scalar/
as are most problems, so this isn’t necessarily a big insight.
https://blog.docker.com/ / /keynote-videos-from-dockercon /
padmini ray murray and claire squires, “the digital publishing communications circuit,” book . , no. (june , ): – , https://doi.org/ . /btwo. . . _ .
we struggled for some time with what to call these scholarly artifacts, and arrived at “non-traditional scholarly objects” (ntsos) more because we needed to name them something than because it felt like the best term for the task. “digital” or “digital first” scholarly objects would include documents intended for print or which adopt the affordances of print, like a pdf, which is not our target. “complex digital scholarly object” is not quite right, because many of these objects are simpler than a single typeset page.
“non-traditional” is a similarly poor choice, both because it contrasts against a single monolithic print tradition that does not exist, and because it is too broad, encompassing scholarly comic books, performances, and so on. we ask the readers to temporarily suspend their disbelief with respect to the term, in lieu of a more convenient or better alternative.
for a discussion of the diversity of early printed books, see whitney trettien, cut/copy/paste: fragments of history (university of minnesota press, forthcoming), https://manifold.umn.edu/projects/cut-copy-paste. a good analogous survey of diversity in the digital publishing ecosystem is available in john w. maxwell et al., “mind the gap” (pubpub: simon fraser university / mit press, july ), https://mindthegap.pubpub.org/. oclc also recently completed two surveys of the evolving scholarly record, suggesting that diversifying forms and widening distribution of custodial responsibilities are causing pressure on the scholarly publication ecosystem. brian f lavoie et al., the evolving scholarly record (dublin, ohio: oclc research, ), http://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-evolving-scholarly-record- .pdf; brian f lavoie, constance malpas, and oclc research, stewardship of the evolving scholarly record: from the invisible hand to conscious coordination, , http://www.oclc.org/content/dam/research/publications/ /oclcresearch-esr-stewardship- .pdf.
practitioners often take on multiple roles, and within those roles may not distinguish between them. for the purpose of this report, any mention of roles should be read with this in mind.
for another discussion on the “presently splintered scholarly infrastructure” on account of a publication system “optimized mainly for text publications,” see jean-christophe plantin, carl lagoze, and paul n edwards, “re-integrating scholarly infrastructure: the ambiguous role of data sharing platforms,” big data & society , no. (june ): , https://doi.org/ . / .
lisa gitelman, paper knowledge: toward a media history of documents (durham; london: duke university press books, ).
levinson explains that “by far the biggest expense in [shipping goods] was shifting the cargo from land transport to ship at the port of departure and moving it back to truck or train at the other end of the ocean voyage. [...] as container shipping became intermodal, [...]
the overall cost of transporting goods [became] little more than a footnote in a company’s cost analysis.” marc levinson, the box: how the shipping container made the world smaller and the world economy bigger (princeton, nj: princeton university press, ).
the sister study to this one, conducted by the same team, resulted in a white paper enumerating the uses of containerization in academia as of . we concluded that, although containerization is becoming widespread for uses such as reproducibility or collaboration, the technology’s use (or even experimentation with it) is still relatively rare within scholarly publishing. both this and the sister study fit into a larger a.w. mellon-funded project called “digits: a platform to facilitate the production of digital scholarship”, in which the four co-authors explore the possibility of software containers as a worthwhile technological intervention into the scholarly publication pipeline. information on both studies, as well as the resulting reports, can be found on our website at http://digits.pub.
for example, brandon butler, amanda visconti, and ammon shepherd, “archiving dh part : the long view,” scholars’ lab (blog), april , , https://scholarslab.lib.virginia.edu/blog/archiving-dh-part- -the-long-view/; claire carlin et al., “the endings project,” , https://projectendings.github.io; alison langmead et al., “a role-based model for successful collaboration in digital art history,” international journal for digital art history, no. (july , ), https://doi.org/ . /dah. . . ; trevor owens, the theory and craft of digital preservation (johns hopkins university press, ).
michael a. elliott, “the future of the monograph in the digital era: a report to the andrew w. mellon foundation,” journal of electronic publishing , no.
(december , ), https://doi.org/ . / . . ; john w. maxwell, alessandra bordini, and katie shamash, “reassembling scholarly communications: an evaluation of the andrew w. mellon foundation’s monograph initiative (final report, may ),” journal of electronic publishing , no. ( ), https://doi.org/ . / . . ; sara sikes, maria bonn, and elli mylonas, “building capacity for digital scholarship & publishing: three approaches from mellon’s - scholarly communications initiative” (digital humanities , montreal, canada, ), , https://dh .adho.org/abstracts/ / .pdf.
for related approaches see, e.g., john wilkin et al., publishing without walls, , https://publishingwithoutwalls.illinois.edu/.
audio recordings were partially transcribed by scott and nechama weingart, and partially transcribed by rev.com before being corrected by interviewers.
these people were chosen from our social networks, from people attending the same conferences/events we were attending, from geographical proximity to our travel routes, and/or by name recognition in our fields.
christine l. borgman, scholarship in the digital age: information, infrastructure, and the internet (cambridge, ma: mit press, ), – .
for broader articulations of maintenance, see, e.g., andrew russell and lee vinsel, “hail the maintainers,” aeon, , https://aeon.co/essays/innovation-is-overvalued-maintenance-often-matters-more; the information maintainers et al., “information maintenance as a practice of care,” white paper, june , , https://doi.org/ . /zenodo. .
ithaka s+r, “life cycle of a digital resource,” ithaka s+r (blog), , https://sr.ithaka.org/life-cycle-of-a-digital-resource/.
mla task force, “report of the mla task force on evaluating scholarship for tenure and promotion,” profession , no. (november , ): – , https://doi.org/ . /prof. . . . ; bethany nowviskie, “evaluating collaborative digital scholarship (or, where credit is due),” journal of digital humanities , no. (fall ).
see, for example, gabriele griffin and matt steven hayler, “collaboration in digital humanities research – persisting silences,” digital humanities quarterly , no. (april , ); alix keener, “the arrival fallacy: collaborative research relationships in the digital humanities,” digital humanities quarterly , no. (august , ); william a. kretzschmar and william gray potter, “library collaboration with large digital humanities projects,” literary and linguistic computing , no. (december , ): – , https://doi.org/ . /llc/fqq ; langmead et al., “a role-based model for successful collaboration in digital art history”; willard mccarty, “collaborative research in the digital humanities,” in collaborative research in the digital humanities (routledge, ), – ; daniel v. pitti, “designing sustainable projects and publications,” in a companion to digital humanities, ed. susan schreibman, ray siemens, and john unsworth (malden, ma, usa: blackwell publishing ltd, ), – , https://doi.org/ . / .ch ; chris alen sula, “digital humanities and libraries: a conceptual model,” journal of library administration , no. (january , ): – , https://doi.org/ . / . . .
oliver quinlan, “young digital makers: surveying attitudes and opportunities for digital creativity across the uk” (nesta, march ), , https://media.nesta.org.uk/documents/youngdigmakers.pdf.
quinlan, .
see, for example, brett oppegaard and michael rabby, “the app-maker model: an embodied expansion of mobile cyberinfrastructure,” digital humanities quarterly , no. (august , ).
for a discussion on similar challenges with respect to research needs, and infrastructural recommendations based on such needs, see melissa terras et al., “enabling complex analysis of large-scale digital collections: humanities research, high-performance computing, and transforming access to british library digital collections,” digital scholarship in the humanities, may , , https://doi.org/ . /llc/fqx .
from a technical standpoint, these concerns relate closely to decisions about whether a project will deliver static or dynamic content. most projects include dynamic content, which requires a web-hosting solution that offers, at a minimum, a database service such as mysql. some projects are built to depend upon an outside data service such as an api or a linked, open data service.
several interview subjects linked technical considerations to an early decision to adopt or eschew a particular technical stack, project template, or digital tool, such as drupal, omeka, wordpress, ruby on rails, or django.
the recently released final report for always already computational: collections as data finds, similarly, that “challenges to collections as data development are more organizational than technical” ( ). the report recommends “inclusive organizational experimentation,” which requires, according to the report, “a combination of community engagement, domain knowledge, and the capacity for infrastructure development” ( ). the report also calls for cultural heritage organizations to reconsider traditional divisions of labor. see thomas padilla et al., “final report — always already computational: collections as data,” may , , https://doi.org/ . /zenodo. .
these points of friction are particularly noticeable when makers work with university or library it groups more accustomed to supporting enterprise computing, or with high performance computing centers whose systems were developed with different uses in mind (even when hpc centers earnestly seek to support these new uses of their infrastructures, as they often do).
a series of publications by ithaka s+r aimed to scope and address these very concerns.
ithaka s+r, “life cycle of a digital resource”; nancy l maron, “the department of digital humanities (ddh) at king’s college london : cementing its status as an academic department, case study update ,” ithaka case studies in sustainability (ithaka s+r, ), https://sr.ithaka.org/publications/the-department-of-digital-humanities-ddh-at-kings-college-london- /; nancy l maron and sarah pickle, “sustainability implementation toolkit: developing an institutional strategy for supporting digital humanities resources,” ithaka s+r (blog), june , , https://sr.ithaka.org/publications/sustainability-implementation-toolkit/; nancy l maron, k. kirby smith, and matthew loy, “sustaining digital resources: an on-the-ground view of projects today” (new york: ithaka s+r, august , ), https://doi.org/ . /sr. . indeed, our results exactly replicate key findings from their report that “even on campuses with designated dh centers, there is rarely an end-to-end solution,” and that “some stages in the digital project life cycle seem not to be owned by any one unit.” see nancy l maron and sarah pickle, “sustaining the digital humanities: host institution support beyond the start-up phase” (ithaka s+r, june , ). despite the valuable recommendations from ithaka s+r, our study finds little evidence that matters have changed between and .
of the four models, publisher partnerships were the least common among our subjects.
angie kemp, lee skallerup, and kris shaffer, “what do you do with , blogs? preserving, archiving, and maintaining umw blogs - a case study,” journal of interactive technology and pedagogy, no.
( ), https://jitp.commons.gc.cuny.edu/what-do-you-do-with- -blogs-preserving-archiving-and-maintaining-umw-blogs-a-case-study/; joris van zundert, “on not writing a review about mirador: mirador, iiif, and the epistemological gains of distributed digital scholarly resources,” digital medievalist , no. (august , ): , https://doi.org/ . /dm. .
ithaka s+r points to things like understanding the digital lifecycle of a project, articulating institutional expectations, and obtaining the commitment of key stakeholders. maron and pickle, “sustainability implementation toolkit: developing an institutional strategy for supporting digital humanities resources.” “the socio-technical sustainability roadmap” emphasizes that project lifespan should be considered in tandem with things like reliable sites of production, hosting, and documentation practices. visual media workshop at the university of pittsburgh, “the socio-technical sustainability roadmap,” the socio-technical sustainability roadmap, october , http://sustainingdh.net. for a different take on designing for sustainable digital projects, see laurie gemmill arp and megan forbes, it takes a village: open source software sustainability.
a guidebook for programs serving cultural and scientific heritage (lyrasis, ).
even in cases where institutions like libraries limit their services to a few off-the-shelf platforms, the rhetoric around those tools often unrealistically hides the various costs and necessary labor associated with such platforms. see, e.g., kemp, skallerup, and shaffer, “what do you do with , blogs? preserving, archiving, and maintaining umw blogs - a case study”; paige c. morgan, “the consequences of framing digital humanities tools as easy to use,” college & undergraduate libraries , no. (july , ): – , https://doi.org/ . / . . .
see, for example, maron and pickle, “sustaining the digital humanities: host institution support beyond the start-up phase”; jennifer vinopal and monica mccormick, “supporting digital scholarship in research libraries: scalability and sustainability,” journal of library administration , no. (january , ): – , https://doi.org/ . / . . ; “supporting the digital humanities: report of a cni executive roundtable” (coalition for networked information, ), https://www.cni.org/wp-content/uploads/ / /cni-supportdh-exec-rndtbl.report.f .pdf.
see, for example, christine l. borgman, “the digital future is now: a call to action for the humanities,” digital humanities quarterly , no. (january , ), https://escholarship.org/uc/item/ fp n s.
see c. k. gunsalus et al., “mission creep in the irb world,” science , no. (june , ): – , https://doi.org/ . /science. ; richard a. shweder and richard e. nisbett, “long-sought research deregulation is upon us.
don’t squander the moment.,” the chronicle of higher education, march , , https://www.chronicle.com/article/long-sought-research/ ; richard a. shweder and richard e. nisbett, “don’t let your misunderstanding of the rules hinder your research,” the chronicle of higher education, april , , https://www.chronicle.com/article/don-t-let-your/ .
for a selection of such ethical concerns, see, e.g., elizabeth groeneveld, “remediating pornography: the on our backs digitization debate,” continuum , no. (january , ): – , https://doi.org/ . / . . ; bergis jules, ed summers, and vernon mitchell, jr., “documenting the now: ethical considerations for archiving social media content generated by contemporary social movements: challenges, opportunities, and recommendations,” white paper, april , https://www.docnow.io/docs/docnow-whitepaper- .pdf; tara robertson, “digitization: just because you can, doesn’t mean you should,” tara robertson (blog), march , , http://tararobertson.ca/ /oob/; tara robertson, “update on on our backs and reveal digital,” tara robertson (blog), august , , https://tararobertson.ca/ /oob-update/; tara robertson, “concerns about reveal digital’s statement about on our backs,” tara robertson (blog), october , , https://tararobertson.ca/ /oob-part /.
we acknowledge this recommendation could put catalysts and hosts at greater legal risk, but some of our subjects believed well-resourced institutions have an obligation to absorb these risks in order to protect boundary-pushing ntsos. at present, risk-averse institutions often have a chilling effect on otherwise ethical, important research agendas.
see, for example, cecilia mazanec, “#thanksfortyping spotlights unnamed women in literary acknowledgments,” npr.org, march , , https://www.npr.org/ / / / /-thanksfortyping-spotlights-unnamed-women-in-literary-acknowledgements.
an extended discussion of this was presented in matthew lincoln et al., “the state of digital humanities software development roundtable” (association for computers and the humanities, pittsburgh, pa, ), https://www.conftool.org/ach /index.php?page=browsesessions&form_session= &presentations=show.
elvis e. tarkang, margaret kweku, and francis b. zotor, “publication practices and responsible authorship: a review article,” journal of public health in africa , no. (june , ), https://doi.org/ . /jphia. . .
these questions of credit are more broadly addressed in the body of work by nicky agate et al., “about humetricshss,” humetricshss, september , https://humetricshss.org/about/.
such avenues are fixed enough that numerous how-to titles purport to guide would-be authors through the process of publishing academic books or articles. (see wendy laura belcher, writing your journal article in twelve weeks: a guide to academic publishing success (sage, ); william germano, getting it published, nd edition: a guide for scholars and anyone else serious about serious books (university of chicago press, ); anthony haynes, writing successful academic books (cambridge university press, ); paul j. silvia, how to write a lot: a practical guide to productive academic writing (american psychological association, ).) the preponderance of these books, however, may also suggest that traditional paths are becoming harder to navigate and/or increasingly competitive.
several scholarly journals and academic presses have established publication targets for digital-first content, including but not limited to kairos, manifold@uminnpress, and stanford digital projects.
potential solutions to these problems are offered by groups such as uconn’s greenhouse studios, which has developed a design process model. sara b. sikes, “a design process model for inquiry-driven, collaboration-first scholarly communications – dh ” (digital humanities, mexico city, ), https://dh .adho.org/en/a-design-process-model-for-inquiry-driven-collaboration-first-scholarly-communications/.
reviews in digital humanities is a recent intervention with a great deal of promise.
the journal seeks to facilitate “scholarly evaluation of digital humanities work and its outputs.” jennifer guiliano and roopika risam, “reviews in digital humanities,” reviews in digital humanities, august , , https://reviewsindh.pubpub.org/.
several important projects fall under what might be called “augmented book platforms,” including manifold, scalar, and quire.
for discussions around monographs and ntsos, see e.g., alex humphreys et al., “reimagining the digital monograph: design thinking to build new tools for researchers,” journal of electronic publishing , no. ( ), http://dx.doi.org/ . / . . ; james o’sullivan, “the equivalence of books: monographs, prestige, and the rise of edge cases,” convergence , no. (october , ): – , https://doi.org/ . / ; donald j. waters, “monograph publishing in the digital age,” the andrew w. mellon foundation, shared experiences blog (blog), july , , https://mellon.org/resources/shared-experiences-blog/monograph-publishing-digital-age/.
for discussion around this topic, and a set of recommendations, see alan galey and stan ruecker, “how a prototype argues,” literary and linguistic computing , no. (december , ): – , https://doi.org/ . /llc/fqq .
“[t]he humanities must broaden traditional definitions of scholarship,” according to susan schreibman, laura mandell, and stephen olsen, “introduction,” profession, , – .
at least one interview subject called for a credit system with little or no value placed on prestige. they said the existing system was incompatible with the types of scholarship they want to do and that they didn’t “believe in credit” in the traditional academic sense.
the use of self-publication for alternative audiences should be seen as distinct from the apparent motivation of scholars who said they lacked patience for fitting their ntsos into traditional scholarly publication ecosystems.
initiatives that have done this well include documenting the now and history harvest. jules, summers, and mitchell, jr., “documenting the now: ethical considerations for archiving social media content generated by contemporary social movements: challenges, opportunities, and recommendations”; william g. thomas and patrick d. jones, “history harvest,” university of nebraska-lincoln, , https://historyharvest.unl.edu/.
this recommendation could, in some cases, conflict with some open access models. we want to acknowledge that this is one of several paths forward.
interview subjects did mention sites like nines (http://www.nines.org/) and th century connect (http://www. thconnect.org/), but noted they did not serve the same purpose as services like jstor and worldcat. other initiatives include, e.g., google dataset search (https://toolbox.google.com/datasetsearch), but the tool is far from comprehensive, with areas of strength in natural sciences.
there are many efforts to change this.
for projects attempting to unify indexing, aggregation, and archiving, see, e.g., martin klein, “a web-centric pipeline for archiving scholarly artifacts,” (keynote, ), https://www.slideshare.net/martinklein /a-webcentric-pipeline-for-archiving-scholarly-artifacts; bryan newbold, “about fatcat,” fatcat!, , https://fatcat.wiki/about; david rosenthal, “personal pods and fatcat,” dshr’s blog (blog), april , , https://blog.dshr.org/ / /personal-pods-and-fatcat.html; the prototyping team of the los alamos national laboratory and the web science and digital library research group at old dominion university, “about,” my research institute, , https://myresearch.institute/about/.
martin paul eve, “open access publishing models and how oa can work in the humanities,” bulletin of the association for information science and technology , no. ( ): – , https://doi.org/ . /bul . . ; nancy maron et al., “the costs of publishing monographs: toward a transparent methodology,” journal of electronic publishing , no. (summer ): , https://doi.org/ . / . . ; scott smart et al., “determining the financial cost of scholarly book publishing,” journal of electronic publishing , no. (summer ), https://doi.org/ . / . . .
such complexities also create questions about who should receive credit in publications.
see e.g., katrina fenlon et al., “humanities scholars and library-based digital publishing: new forms of publication, new audiences, new publishing roles,” journal of scholarly publishing, april , , https://doi.org/ . /jsp. . . ; vinopal and mccormick, “supporting digital scholarship in research libraries.” for additional information on the changing relationship of library-based ntso collaborations, see miriam posner, “digital humanities and the library,” blog, miriam posner’s blog (blog), april , http://miriamposner.com/blog/digital-humanities-and-the-library/; tim bryson et al., spec kit : digital humanities (november ), spec kit (association of research libraries, ), https://doi.org/ . /spec. .
see the maintainers (http://themaintainers.org) for ongoing discussions about distinctions between maintenance and preservation.
“a generation of legacy projects that need maintenance but are out of funding have reached critical stages of their lifecycles, an increasingly hostile security context has made dh projects potential attack vectors into institutional networks, heterogeneous and often delicate technologies have complicated the task of maintenance, and an increasing number of emerging formats have made archiving and preservation yet more difficult.” james smithies et al., “managing digital humanities projects: digital scholarship & archiving in king’s digital lab,” digital humanities quarterly , no. (april , ).
for discussions of loss of ntsos and other web resources, see carlin et al., “the endings project”; robin camille davis, “taking care of digital efforts: a multiplanar view of project afterlives,” in proceedings of the modern languages association (modern languages association, vancouver, ), https://robincamille.com/presentations/mla /; luis meneses et al., “quantifying the degree of planned obsolescence in online digital humanities projects” (july ); bethany nowviskie and dot porter, “the graceful degradation survey: managing digital humanities projects through times of transition and decline,” in proceedings of digital humanities (digital humanities , king’s college, london, ), http://dh .cch.kcl.ac.uk/academic-programme/abstracts/papers/html/ab- .html; robert sieczkiewicz, “on the diversity of digital decay,” in proceedings of keystone dh (keystonedh, pittsburgh, pa, ), http://keystonedh.network/ /abstracts/#submission- .
graceful degradation here refers to some aspect of the content remaining accessible even if the project loses some technical functionality. we discuss graceful degradation further in part iii of this report, especially in the subsection “when to preserve.”
based on our interviews and advisory board conversations, we arrived at a rough, unproven consensus on the calculus of system maintenance costs. every additional custom system stack increases the maintenance cost as much as the last, whereas hosting increasing numbers of ntsos on a single platform does not increase maintenance costs as significantly. this rough consensus drives the tension between an institutional partner’s willingness to support fully custom ntsos and their interest in supporting the widest possible group of makers.
in discussing sophie . and sophie .
, jasmine kirby discusses the role a lack of permanent, paid staff played in the failure of the sophie project. the many other contributing factors brought up in her study are particularly relevant to many other sections of this report. jasmine simone kirby, “how not to create a digital media scholarship platform: the history of the sophie . project,” iassist quarterly , no. (february , ): – , https://doi.org/ . /iq .
we recognize that collectively maintaining access to ntsos comes with greater technical and social challenges. institutions may be reluctant to take on such tasks even for content created within their remit, let alone content created elsewhere but deemed of scholarly value to members of that institution.
the topic is too large to even summarize here, but special attention ought to be paid to, e.g., keith pendergrass et al., “toward environmentally sustainable digital preservation,” the american archivist , no. (june ): – , https://doi.org/ . / - - . . .
for a focus on this topic as part of a larger agenda, see micah altman et al., “ national agenda for digital stewardship,” report (ndsa coordinating committee, september ), http://www.digitalpreservation.gov/documents/ nationalagendaexecsummary.pdf.
“the threat of obsolescence asks us to reconsider how we teach, create, and circulate digital scholarship.” timothy lockridge, enrique paz, and cynthia johnson, “the kairos preservation project,” computers and composition (december , ): – , https://doi.org/ . /j.compcom. . . .
for a longer conversation around the importance of documentation in legacy projects, as well as many other topics that are relevant to this report, see ashley reed, “managing an established digital humanities project: principles and practices from the twentieth year of the william blake archive,” digital humanities quarterly , no. (april , ).
oya rieger, “the state of digital preservation in : a snapshot of challenges and gaps” (ithaka s+r, october , ), https://doi.org/ . /sr. .
for some notable efforts in this space, see e.g., mattie burkert, “recovering the london stage information bank: lessons from an early humanities computing project,” digital humanities quarterly , no.
(august , ); mattie burkert, “london stage database,” mattie burkert (blog), october , , https://mattieburkert.com/london-stage-project/; bradley daigle et al., “valley of the shadow,” university of virginia library digital production group, , https://dcs.library.virginia.edu/sustaining-digital-scholarship/valley-of-the-shadow/; lockridge, paz, and johnson, “the kairos preservation project”; smithies et al., “managing digital humanities projects.”
we also note that this attitude may reinforce an “infrastructural” perspective of preservation, in which preservation is treated as an invisible process that will “naturally” and “organically” occur and so doesn’t warrant significant consideration. this perspective may often be built upon another’s hidden labor.
for a state-of-field bibliographic overview of many of these discussions, see works from charles w. bailey, digital curation bibliography: preservation and stewardship of scholarly works (usa: createspace independent publishing platform, ).
this concept has also been discussed on twitter. see, for example, https://twitter.com/elotroalex/status/
although both examples would involve substantial human mediation and modification of the original project, one interview subject specifically reminded us that all “preservation is interpretive and will never not be.”
for a foundational archival perspective, see aims work group, “aims born-digital collections: an inter-institutional model for stewardship,” white paper, , http://www .lib.virginia.edu/aims/whitepaper/aims_final.pdf.
although the people we interviewed did not specifically mention preservation service providers such as portico, third-party involvement has an important role to play in this ecosystem.
this is not to suggest an overall lack of progress. ithaka’s “the state of digital preservation in ” discusses several examples of positive momentum. the report also notes, “although the interviewees described many areas of progress, they also commented on their concerns about how to provide sufficient levels of digital preservation to meet the community’s needs.” see rieger, “the state of digital preservation in .”
ideally, preservation will include strategies of multiple distribution through perhaps a regional network, redundant broadcasting, and multiplicity of access (see, for example, https://www.lockss.org).
product of michigan publishing (http://www.publishing.umich.edu), university of michigan library (http://www.lib.umich.edu/) • jep-info@umich.edu • issn -
libraries as content producers: how library publishing services address the reading experience
daniel g. tracy
daniel g.
tracy is library and information science and research services librarian and assistant professor, university library, university of illinois at urbana-champaign; e-mail: dtracy@illinois.edu. © daniel g. tracy, attribution-noncommercial (http://creativecommons.org/licenses/by-nc/ . /) cc by-nc.
this study establishes baseline information about the ways library publishing services integrate user studies of their readers, as well as common barriers to doing so. the library publishing coalition defines library publishing as “the set of activities led by college and university libraries to support the creation, dissemination, and curation of scholarly, creative, and/or educational works.” this area includes traditional as well as novel publication types. results suggest that discussions of library publishing underrepresent engagement with readers but that ample room for increased attention remains. existing reader-related efforts vary widely and may in some cases be happenstance. these efforts also face key barriers in lack of prioritization, lack of expertise, and lack of control of out-of-the-box platforms.
library publishing is a booming area of creation for digital (and often print) collections. although libraries have long published in areas such as special collections materials and catalogs, library publication of scholarly journals and monographs has become increasingly common. many library publishing services also take advantage of digital formats to create experimental forms not possible in print: multimedia and digital humanities projects facilitate interaction with content as well as reading. most of these services publish working papers, dissertations, and other traditionally “gray” literature through institutional repositories, with some focusing exclusively on these publications.
all the preceding activities have received increasing attention as a strategic area of library work under the umbrella of “library publishing,” leading to the creation of the library publishing coalition in and subsequent publication of the first library publishing directory in .
doi: . /crl. . . college & research libraries february
this study establishes baseline information about the ways library publishing services integrate user studies of their readers, as well as common barriers to doing so. it particularly focuses on how these services have investigated and acted on the needs, preferences, and behaviors of readers to improve publication design and delivery platforms. increasingly in libraries, user-centered work on digital interfaces happens under the banner of user experience (ux) research, but library and information science professionals have a long history of engaging in user studies research to understand how and why populations use information resources. while user studies and ux overlap in interesting ways that need further exploration, this article employs the term “user studies” to refer to the broader field of understanding information needs and behavior, and “ux” to refer to the emergent area of user-centered research that informs design of digital interfaces in libraries. for the purposes of distinction, it uses a third term, “reading experience,” to refer to the forms of reading engagement produced by publication interfaces designed through these efforts.
opportunities for using information about readers to better serve their needs and behaviors abound in library publishing. for example, should a journal offer a web-only interface, only downloadable formats (pdf or otherwise), or both? can the web presence be optimized so that readers can more easily and quickly navigate the table of contents or find specific functions? how can the capacity for annotation in e-books be optimized for reflective academic reading?
if the publication is experimental in form (for example, by including mixed media, annotation features for textual or other media, or options for basic text mining of content as a novel reading tool in a digital humanities project), how intuitive are new features to use? if the creator of an experimental interface seeks to create a novel reading experience as a result, is it successful?
as libraries create content that will in turn become collections, it seems natural to draw from the history of user-centered library research, including the increasing focus on ux, to inform publication design. if, as dan cohen and kathleen fitzpatrick recently noted, “[l]ibraries have the potential to become the crucial nexus for knowledge flows on campus” by producing as well as collecting scholarly output, this publishing activity offers the opportunity not just for library staff to develop new skills but to insist upon the needs of information users (particularly as readers) as fundamental to production of digital scholarship. nonetheless, as a recent analysis of library publishing discourse has shown, it has not been readily apparent that libraries see this as a strength they bring to publishing (except, to a certain extent, in a preference for open access publication), nor has it been clear to what extent they draw on expertise in, or studies of, user needs when producing their own publications.
background
academic libraries have been publishers for over a century. paul n. courant and elisabeth jones, as well as ann okerson and alex holzman, have provided recent summaries of the history of library publishing. libraries were the original homes of university presses, and they have published catalogs as well as editions of primary sources housed in the stacks. if this publishing function was secondary to the primary function of collecting and providing access to works, academic libraries have emphasized publishing as an increasingly important service beginning around the early s.
digitization has built on the history of publishing related to special collections, and libraries have collaborated with digital humanists in particular to curate scholarly collections and editions of important work. institutional repositories have blurred the line between “access to” and “publication of” electronic theses and dissertations (etds) and other traditionally gray literature produced by universities. publishing of journals, and to a lesser extent monographs (albeit usually in digital formats), increasingly also finds a home in libraries for a variety of reasons, including interest in noncorporate, open access alternatives for journals and books. perhaps the most overt recognition of publishing as a revitalized library role has emerged from the (re)integration of university presses into academic libraries at many institutions, often as a solution to the economic sustainability of the press. charles watkinson notes that as many as university presses now report administratively through their libraries as of .
this revived synergy of publishing and librarianship has garnered increasing attention through a series of reports sponsored by ithaka (in ), the association of research libraries (arl, in and ), the institute for museum and library services (imls, in ), and most recently the council for library and information resources (clir, in ). in one turning point fostering the current momentum, a group of academic libraries formed the library publishing coalition (lpc) in . the organization seeks to provide opportunities to “meet, work together, share information, and confront common issues” around the quickly evolving area of library publishing. the lpc’s subsequent yearly publication of the library publishing directory beginning in revealed the diversity of academic libraries engaged in mostly scholarly publishing.
lpc now has member institutions and had identified institutions with library publishing services between the two editions of the directory released prior to the survey reported here. an additional twelve appear in the newest directory, for a total of .
these publishing programs take a plethora of forms. university presses reporting through libraries may do so in name only or in a more systematically integrated fashion. these university-presses-in-the-library, plus other library publishing services unattached to university presses, provide a range of functions that traditionally signal scholarly publishing, including selection, peer review, workflow management, editing, distribution, and marketing. other library publishers have embraced a model that maria bonn and mike furlough refer to as “libraries and publishing,” where the library provides researchers with a publication platform (for example, open journal systems or bepress digital commons) and some training on its use and related scholarly communications issues. this more restricted set of services leaves selection, editing, and other production activities to scholars. still others focus specifically on recognizable digital library activities such as publication of digitized special collections and etds or on the challenges of publishing and sustaining unique, often experimental in form, digital humanities projects that require expertise in digital curation as well as publishing. others look to create a leadership role for libraries in the open education resources movement.
as these examples demonstrate, “publishing” in a library context has been defined broadly in an intentional effort to capture as wide a range of activities as possible.
the lpc defines library publishing as “the set of activities led by college and university libraries to support the creation, dissemination, and curation of scholarly, creative, and/or educational works.” thus, its directory includes many institutions that support deposit of etds as their primary or only publishing activity. however, not all libraries and librarians (or scholars and publishers outside libraries) agree that etds or other original items deposited in institutional repositories qualify as publications. as okerson and holzman note in the recent clir report on library publishing, “the boundary between activities that merit the name publishing and less formal and coherent enterprises is fluid and contestable.” the lpc approach of defining publishing broadly ensures a broad inclusion of activities that may signal growing movements.
this article, like the survey of library publishing activities and financial models reported by okerson and holzman, uses the broad lpc definition of library publishing. throughout this article and in the survey it reports, the term “library publishing services” signifies the full range of activities indicated by the lpc and represented in the directory, regardless of whether they originate in a solo librarian’s efforts, a centralized library unit devoted to publishing, or activities scattered across the library. while some questions pertain to publishing of particular formats, others apply regardless of publishing output.
literature review
jingfeng xia and isaac gilman have noted that much of the published research on library publishing has focused on economic and organizational models, with gilman expanding substantively into legal and ethical concerns.
the high-profile examinations of the state of scholarly publishing in general and library publishing in particular (provided by ithaka, arl, imls, and clir) have positioned libraries as publishers, providing a rationale for library publishing based on existing skill sets, mission alignment, the history of publishing in libraries, and other factors. previous analysis shows that these studies have emphasized a need for libraries to develop staff strengths in working with authors and editors, but it also shows that they have rarely noted how traditional library user studies could provide a substantial, relevant window into the lives of readers. libraries’ strengths in understanding how patrons use resources, for example, are glimpsed only briefly in an appendix bullet point in the ithaka report and have not been followed up on. likewise, analysis of library publishing service mission statements suggests that readers are rarely seen as core to the purpose; surprisingly, this holds even when considering open access. much in line with these analyses, in the digital library federation (dlf) assessment interest group user studies working group released a white paper noting an absence of research on use and usability of digital collections created by libraries, especially beyond studies of the needs of humanists as a user community. the report called for more publication in this area, as well as more coordinated user studies on common out-of-the-box platforms in particular.
the role of user studies does appear in individual case studies. rebecca kennison, neni panourgiá, and helen tartar, for example, highlight accessibility and user experience challenges in columbia university’s library publishing service. nancy l. eaton, bonnie macewan, and peter j. potter note “knowledge of user behavior and demands” as relevant library values at pennsylvania state university.
leila salisbury and patrick alexander have proposed “sharing of user and market data” as a fruitful possibility for library and press collaboration, particularly as it affects “accessibility and navigability” and “understanding user expectations in relation to technology.”
the research literature on library collections and areas like human-computer interaction offer ample opportunities for engagement in research on library publishing services. a robust literature examines use and perception of academic vendor-supplied e-books and uptake of e-reading devices among students and faculty. these studies highlight the challenges e-books face versus the relative success of electronic journals, in particular due to interface challenges for in-depth scholarly reading. work in human-computer interaction has included experiments with a variety of interfaces and reading devices to address challenges in reading digital publications, particularly for academic readers doing the kinds of annotation or extended nonlinear reading often highlighted as a particular challenge in the academic library user studies. future work on user needs and behaviors in relation to library publishing platforms could engage with and build from this literature.
more recently, reader-related questions have begun to gain traction as an area for focus in library publishing, albeit largely outside the research literature. reflecting on a summer seminar on “access/ibility in digital publishing” at west virginia university, melanie schlosser initiated a series of blog posts on accessibility in library publishing. she notes that the topic has not been prominent in the library publishing discussion to date. laurie borchard, michael biondo, stephen kutay, david morck, and andrew philip weiss suggest the same absence but provide one recent exception: a study of accessibility of the front and back ends of ojs. they found the front end to meet basic accessibility needs for readers.
beyond issues for those with disabilities, interface design and related reading issues, which were not particularly prominent at the first library publishing forum, did get scattered mention at the second.
another important thread of research features library publishing services as sites for publication of student journals, often but not always student-run as well. as amy buckland notes, these journals and related initiatives offer the opportunity for libraries to engage students not just as “consumers” but as “content creators.” this conversation has focused on the specific area of scholarly communications and information literacy due in large part to an acrl white paper on the topic and a related collection of essays edited by stephanie davis-kahl and merinda hensley. undergraduate journals have offered ways for librarians to provide instruction on issues such as peer review, author rights, and the economics of information, particularly in regard to open access. however, if undergraduate journals offer an opportunity for pedagogical focus on content creation, ux would be a natural extension of these efforts. like the lack of focus in the general library publishing literature, the absence of ux and the reading experience from the literature on undergraduate journals could stem from a lack of activity or simply a lack of centrality of ux to the current related discourse.
this article addresses this gap in the literature by examining whether and how library publishing services take into account the needs, preferences, and behaviors of readers when designing (or redesigning) publications. it identifies sources of expertise that library publishers draw from when doing so, as well as barriers that keep them from focusing more on this area. finally, because of the educational role played by libraries in a wide range of contexts, it examines where and how library publishing services educate their authors and editors about these issues.
methodology
this article reports on the findings from a survey developed to identify whether and how library publishing services collect and use information about readers to inform publication design, barriers to doing so, and education of authors and editors on related issues. a colleague with experience leading a library publishing service pretested the survey, and questions were adjusted based on the feedback received. after consultation with the irb confirmed the study as nonhuman subjects research, an e-mail invitation to participate in the survey went to all of the institutions listed in either the or edition of the library publishing directory. the individual for each institution selected to receive the e-mail was the one listed as the contact person in the directory, although library publishing service websites were reviewed beforehand to update contacts where necessary. in a small number of cases where there was no clear point of contact, the invitation went to a library publishing service’s general e-mail address. the invitation asked for only one response per institution but noted that the recipient could forward the e-mail to another person in the library more familiar with specifics related to use of reader information in publication design. the survey remained open for four weeks, with two reminders sent over the course of the period. the survey form only recorded responses upon final submission of the survey.
the survey included twenty-six questions, though it posed some only to institutions publishing particular types of material (see appendix a for full survey questionnaire). the first question (q ) asked the respondent to identify the institution for which he or she was responding. other questions asked about:
• what sources of information about readers the library publishing service collects and uses for development of particular digital publication types. the publication types included electronic journals, e-books, and experimental forms. questions listed examples of “experimental forms” as multimedia and digital humanities publications. (q –q , questions particular to each publication type, were only asked of those indicating that their institution publishes that type.)
• where the library publishing service finds the expertise it uses for understanding readers (q ).
• barriers to further addressing reader needs (q –q ).
• education of editors and authors about reader needs related to publication design (q –q ).
questions about frequency of behaviors defined a scope of the most recent two years of activity as the frame of reference. three open-ended questions at the end of the survey allowed respondents to give specific examples of education of editors and authors, how a publication was developed using information about reader needs, and general feedback on the survey (q – ).
while these questions could all be framed in relation to user studies or ux, the survey specified narrower questions to get a more defined sense of actual activities used to understand and design for the needs, preferences, and behaviors of readers. although in general library user studies focus on consumers or users of information services, it was important to specify questions in relation to readers and the reading experience for library publisher respondents. library publishing services have often seen their primary users as authors and editors, for whom they must demonstrate value in competition with traditional publishers. these authors and editors are themselves users of back-end interfaces to publishing platforms; therefore, a general survey about “user experience” that did not specify readers could easily be taken to refer to issues related to back-end interfaces for authoring or editorial workflows.
additional institution-level data were collected separately for all library publishing services receiving the survey to evaluate any variation in response rates.
these data included carnegie classification details for carnegie control type (public/private status) and carnegie basic type for united states (u.s.) institutions (non-u.s. institutions were labeled as “international” in these fields). initially carnegie level was included, but it was discarded after discovering that all u.s. institutions in the population fell into the same group of “ -year or above.” carnegie basic types were grouped into research, master’s, baccalaureate, and special (faith, medical, and technical) institutions during analysis.
finally, an additional field was added to the data set to indicate whether a library publishing service focused solely on institutional repository deposit for electronic theses and dissertations, technical reports, or other original material. while the survey invitation encouraged participation from all institutions in the directory, and while q –q applied across service types, publishing services operating only typical institutional repository services may be less likely to respond to a survey about evaluation of reader needs for publication design. such services often rely on self-deposit of items, and a reasonable outcome could mean the library does not have significant opportunity for input on ux issues (outside of the basic repository platform). similarly, the items most likely to receive some guidance on publication design or formats (dissertations, theses, and technical reports) may receive that guidance from units external to the library (such as a graduate college or the writer’s department). these conditions of repository services may relate to their status, as noted in the background section, at the contested, innovative bounds of the expansive definition of publishing used by the lpc.
basic descriptive statistics were calculated in excel. inferential statistics were not calculated due to the small population. results are valid only for responding institutions at a particular moment in time.
results
response rates
of library publishing services that received an invitation to participate, completed the survey for a response rate of . percent. library publishing services at u.s. institutions responded at a greater rate than those at non-u.s. institutions (see table ). library publishing services responded at a greater rate from baccalaureate, special, and research institutions, while those at master’s institutions responded at a lower rate (see table ). regardless of carnegie type, library publishing services at private institutions responded at higher rates than public institutions (see table ). the largest difference in response rate appears when looking at those library publishing services focused on basic institutional repository activities ( . %) versus those with other publishing activities as well ( . %) (see table ).

table: response rates by location of institution
location | frequency | response rate
united states (n= ) | | . %
outside united states (n= ) | | . %
total (n= ) | | . %

table: response rates by carnegie institution type (u.s. only)*
institution type | private | public | total
baccalaureate | / ( . %) | n/a | / ( . %)
master’s | / ( . %) | / ( . %) | / ( %)
research | / ( . %) | / ( . %) | / ( . %)
special | / ( . %) | / ( %) | / ( %)
total | / ( . %) | / ( . %) | / ( . %)
*cells show frequency over total n for that category, followed by rate in parentheses.

table: response rates by publishing service scope*
scope | frequency | response rate
basic ir functions only (n= ) | | . %
activities beyond basic ir functions (n= ) | | . %
*ir=institutional repository. basic ir functions for the purposes of classification were electronic theses and dissertations, technical reports, or other traditional gray matter deposit.

information used to understand readers
table shows sources of information about reader preferences, needs, or behaviors that library publishing services collected or used in relation to electronic journals (q –q ). services reported a nearly ubiquitous collection of usage statistics. fewer services collected information requiring a greater time investment such as usability testing and various types of questionnaires (surveys, interviews, and focus groups), but they almost always put this information to use when collected. nonetheless, the most frequently used source of information was informal feedback to the publishing service.
similar patterns appear in relation to the information collected and used by library publishing services in relation to e-books (q –q , see table ) and experimental forms (q –q , see table ). while still reported by a substantial majority, usage statistics were less consistently collected for these types of publication.
in some cases, institutions reported use, but not collection, of a particular type of information, suggesting they may have obtained information about similar populations elsewhere. therefore, table , table , and table also report the number of those who used a type of information out of those who collected it specifically. this column in the tables indicates that some institutions may not be making full use of the data they collect. the disparity between usage and collection was greatest for information in the form of transaction logs (automated records of user actions within an interface) for all three publication types.

table: sources of information collected and used by e-journal publishers*
source | collect (n= ) | rate of use by those who collect | rate of use overall
usage statistics (downloads/views) | ( . %) | / ( . %) | / ( . %)
transaction logs | ( . %) | / ( . %) | / ( . %)
usability test results | ( . %) | / ( . %) | / ( . %)
surveys, interviews, or focus groups | ( . %) | / ( . %) | / ( . %)
informal feedback | ( . %) | / ( . %) | / ( . %)
other | ( . %) | / ( . %) | / ( . %)
*the “collect” column indicates those who checked the appropriate box. the next column indicates the number of those from the “collect” column who used the information they collected. the final column indicates all reported use of the type of information: respondents could (and in some cases did) indicate use of a type of information they did not collect.

table: sources of information collected and used by e-book publishers*
source | collect (n= ) | rate of use by those who collect | rate of use overall
usage statistics (downloads/views) | ( . %) | / ( . %) | / ( . %)
transaction logs | ( %) | / ( %) | / ( %)
usability test results | ( %) | / ( . %) | / ( . %)
surveys, interviews, or focus groups | ( . %) | / ( %) | / ( . %)
informal feedback | ( . %) | / ( %) | / ( %)
other | ( %) | / ( . %) | / ( %)
*the “collect” column indicates those who checked the appropriate box. the next column indicates the number of those from the “collect” column who used the information they collected. the final column indicates all reported use of the type of information: respondents could (and in some cases did) indicate use of a type of information they did not collect.

the survey also asked how often library publishing services used information about readers when designing or redesigning publication formats or interfaces for e-journals (q ), e-books (q ), and experimental forms (q ) in the previous two years. use of this information was more frequent when designing e-books or publications using experimental forms than when designing e-journals, although the difference was not large (see figure ). notably, a small number of institutions (three for e-journals, five for e-books) reported never using this information to design or redesign publications, even though they indicated using one of the specific sources of information to develop formats and preferences in responses to previous questions.
location of expertise
the survey asked all library publishing services the remaining questions. when asked where the library publishing service finds expertise used for understanding readers (q ), . percent reported not using any such expertise at all.
most, though, had used multiple sources of this expertise, and more than half of the services indicated three or more.

[figure: how frequently format options and interfaces based on information about readers (by publication type). bars: exp forms (n= ), e-books (n= ), e-journals (n= ). response categories: never, rarely, sometimes, always.]

table: sources of information collected and used by experimental format publishers*
source | collect (n= ) | rate of use by those who collect | rate of use overall
usage statistics (downloads/views) | ( . %) | / ( . %) | / ( . %)
transaction logs | ( . %) | / ( . %) | / ( . %)
usability test results | ( . %) | / ( . %) | / ( . %)
surveys, interviews, or focus groups | ( . %) | / ( %) | / ( . %)
informal feedback | ( . %) | / ( . %) | / ( . %)
other | ( . %) | / ( %) | / ( . %)
*the “collect” column indicates those who checked the appropriate box. the next column indicates the number of those from the “collect” column who used the information they collected. the final column indicates all reported use of the type of information: respondents could (and in some cases did) indicate use of a type of information they did not collect.

the only location of expertise used by a majority of respondents was relevant research studies ( . %), although almost half consulted with their platform vendor or provider ( . %). if expertise was found at the university, it was more often found in the library, in particular the library publishing unit ( . %), than in institutional colleagues outside the library ( . %). table shows frequencies and percentages for all locations of expertise considered by respondents.
barriers
the survey likewise asked all respondents about barriers to further addressing reader needs related to digital publishing formats and interfaces, beyond any current activities (q –q ). specific options included factors both internal and external to the service (see figure ).
for a majority of library publishing services, policies beyond the service’s control at the library, campus, or other level did not pose a barrier or did so only rarely. the obstacle reported most frequently was simply a lack of priority for these activities. almost half ( . %) suggested this posed a regular barrier, with many more saying it sometimes posed a barrier ( . %). still, prioritization was far from a sole cause. all other barriers stymied more than percent of library publishing services sometimes or regularly, including limitations imposed by chosen platforms, a lack of useful information about readers, and a lack of expertise in existing personnel.

[figure: how often factors prohibit further addressing reader needs (n= ). factors: library, campus, or other policies the library publishing service cannot control; existing personnel lack expertise; lack of useful information about readers; limited format and interface options of chosen publishing platform(s); lack of priority versus other needs (lack of time for this activity). response categories: never, rarely, sometimes, regularly.]

table: where library publishing services find expertise about readers (n= )*
location of expertise | frequency | percent
none: the library publishing service does not make use of this expertise. | | . %
in a library publishing unit, if one exists | | . %
outside a library publishing unit but inside the library | | . %
outside the library but within the university | | . %
platform vendors or providers (such as bepress, pkp, and others) | | . %
relevant research studies (in other words, ithaka reports on reading behavior, library user studies) | | . %
others outside the university besides platform vendors or providers (such as colleagues at conferences or elsewhere) | | . %
*respondents chose all options that applied.

when asked to identify the primary barrier to further addressing reader needs (q ), nearly half identified lack of priority as the most significant barrier for their library publishing service (see table ). however, slightly more than a quarter of respondents cited the limited format and interface options of their chosen publishing platforms. responding institutions could also indicate an unlisted barrier as their primary obstacle, and seven ( . %) did so. one of these indicated both prioritization and chosen platform limitations as equally the most significant. the others indicated responses that are closely related to lack of priority versus other needs: lack of interest and lack of community expressed need, limited budget or other resources, and cost.

table: primary limiting factor when addressing reader needs (n= )
limited format and interface options of chosen publishing platform(s) | ( . %)
lack of useful information about readers | ( . %)
existing personnel lack expertise | ( . %)
lack of priority versus other needs (lack of time for this activity) | ( . %)
library, campus, or other policies the library publishing service cannot control | ( . %)
other | ( . %)

education of authors and editors
all respondents also provided information about the frequency with which their services educated authors and editors about reader-related issues in their publications (q –q ). a plurality reported sometimes educating authors and editors about these issues for each type, and a majority for each type said sometimes or always (see figure ). however, services reported educating faculty and other nonstudent editors more frequently than the other groups.

[figure: frequency with which the library publishing service educates author and editor populations on reader-related issues.* groups: student editors or authors (n = ), faculty and nonstudent authors (n = ), faculty and nonstudent editors (n = ). response categories: never, rarely, sometimes, regularly.]
*those responding “not applicable” due to not working with a particular group directly for any type of publication have been excluded.

the question about education did not define “education” to allow for a range of informal and formal educational settings. to get a sense of the range of specific activities, one open-ended question asked respondents to describe an example of how their library publishing service educated authors and editors (q ). thirty-one institutions responded to the prompt. most did not indicate a specific population, but six mentioned working with students, and five indicated working with faculty or other nonstudent authors and editors on these issues. educational approaches included one-on-one consultations or advice, optional or required workshops and training sessions, class visits, checklists or web guides, and working alongside authors and editors in more sustained engagements on these issues. most responses included multiple approaches. specific topics included accessibility, responsive design, format and interface options, open access, metadata and search optimization, webpage load time, and layout/page design. none of these topics, however, emerged as particularly common.
variety of activities
to get a better sense of how library publishing services use information about readers in publication design, another open-ended question asked for a brief description of the most recent example of how they did so (q ). twenty-four institutions responded to the prompt. grounded analysis of themes revealed a wide variety of activities that addressed reader needs, behaviors, and preferences, with several repeated themes.
the most common responses had to do with improvements to discovery of material (such as through search engine optimization) and the addition of new digital formats to broaden reading options. two responses highlighted the creation of new print format options for those preferring print, and two others mentioned digitizing materials or transitioning to digital production. several also indicated improvements to the journal’s interface or functionality. a list of themes can be found in table .

table: themes of how library publishing services incorporated reader issues into development or improvement of digital publications
• added digital formats
• search optimization/discovery
• interface/functionality improvement
• integration of altmetrics
• added print on demand
• digitization/transition to digital format
• usability
• general assessment of needs
• access
• platform selection
• guidance/education of creators of a project
• accessibility
• platform improvement
• other/general design issues

discussion
response rates show that institutions with etd or repository-focused library publishing services responded to the survey at a much lower rate than other institutions. united states baccalaureate and master’s institutions and international institutions had higher rates of these limited publishing services, so the lower responses for master’s and international institutions may be related to their focus. reasons for the lack of response could vary, though, including a lack of time at smaller institutions or language barriers for international institutions (although all invited had submitted a directory entry in english). however, these institutions may not have felt the topic of the survey related to their services despite the e-mail invitation encouraging participation regardless of activity.
regardless, the responses overrepresent library publishing services that publish at least some electronic journals, e-books, or experimental forms; in other words, those that have most expanded into traditional scholarly publishing or have taken on experimental publishing projects. therefore, it is not surprising that the vast majority of respondents ( . %) reported publishing electronic journals, or that a majority of respondents indicated publication of e-books ( . %) or experimental forms ( . %). the following results should be read with this imbalanced response in mind.
responses indicate that engagement with reader needs, preferences, and behaviors in format options and interface design is more prevalent among library publishing services than has been evident in the practitioner literature to date. around half of services that publish electronic journals, e-books, and experimental forms report using such information “sometimes” or “always” in their publication design, with slightly higher rates for e-books and experimental forms. the difference, though small, may derive from the much earlier move of journals to electronic format (providing a feeling that they are a solved problem) and availability of more out-of-the-box electronic journal publishing platforms. as noted in the literature review, e-book design still faces broad challenges as publishers experiment with interfaces to allow a better reading experience with features such as annotation and support for extended nonlinear reading more common in academic reading. experimental forms such as digital humanities and multimedia projects have unique design considerations that offer opportunities for user studies, such as novel ways to interact with content or how best to integrate mixed media.
while library publishing services reported more engagement with reader needs than expected, other evidence in the survey indicates that much of this activity is incidental and that there are missed opportunities.
library publishing services reported collection of a range of information types about reader needs, preferences, and behaviors, but collection was uneven. much of the collected data goes unused. although nearly all e-journal services and three quarters of book services collected usage statistics (total downloads or views of a publication), fewer than half of services collecting these data actually used them to inform publication design or format options. these services collect transaction logs less often, but similarly the collected information is used to feed back into design in fewer than half of those cases. in fact, transaction logs had the biggest gap between collection and use for all publication types. although transaction logs often require labor-intensive analysis to decipher and code the data included, this is an area of missed opportunity since these records usually include richer details of reader behavior. these can include paths of navigation, time spent reading particular content, or other features depending on the logs and overall system design.
not surprisingly, given the time-intensive nature of data collection, services almost always use collected information from usability tests and from surveys, interviews, and focus groups. however, collection of these types of data only occurs at a handful of institutions. the only source of information both collected and used by the majority of institutions was informal feedback, indicating a lack of systematic engagement overall.
in some cases, an institution reported use, but not collection, of a type of information. some respondents may have interpreted the term “collect” as systematic or long-term collection, although the high reported rate of collection of informal feedback cuts against this explanation.
it could also be the case that some services are using information collected by others, such as that reported in published research, which other parts of the survey suggest.
the survey identifies several common barriers that prevent library publishing services from using information about readers to improve publication design, or that limit action even for those that do. prioritization, given time constraints, may be the biggest barrier, and this challenge may be understood in the context of library publishing services investing significant time and resources in establishing operations: developing skills in new areas, working out an economic and organizational model, and developing outreach practices for working with researchers in their authorial and editorial roles may simply be absorbing available attention and resources.
library publishing services also face a major external barrier to responding to reader needs due to inherent limitations that come with out-of-the-box platforms. these platforms involve a trade-off: they allow quicker start-up of services that do not need to reinvent publication technologies or invest in programming time for significant customization, particularly for journals. however, adoption of these platforms also means limited control over design (or anything else) beyond standard layout and setting options such as css themes, and an inability to foster experimental formats that may not fit. the fact that almost half of services find some expertise about readers from platform vendors or providers suggests they can get some assistance with these platforms. still, one respondent wrote, in the open response field at the end of the survey, “[y]ou hit the nail on the head with the item about the limitations of existing publishing platforms.” this respondent went on to highlight the challenge of trying to balance the more easily achieved sustainability of standard platforms against projects that need more flexible, innovative solutions.
occasionally this expertise can run the other way, with the library contributing back to the platform provider: one respondent, at an institution that partners on open journal systems (ojs) development, noted, “we routinely contribute bug fixes to the ojs code base, usually based on user-reported problems.” with increasing reliance of libraries on these shared platforms, one important way forward for ux work in library publishing may be to strengthen feedback mechanisms and communities so that libraries can contribute to the overall improvement of platforms without taking on the sole responsibility.
many libraries, though, simply lack the expertise to collaborate in this fashion. while very few identified lack of expertise as their primary challenge, . percent said a lack of expertise about readers acted as a barrier to improving design “sometimes” or “always.” although most respondents identified multiple sources of expertise, further analysis shows that services ( . %) did not have any expertise for understanding readers in the library (inside or outside a publishing unit), and nearly all of those ( , . % of respondents) lack that expertise even when looking to the university as a whole. in other words, most expertise is external (from vendors, colleagues at other institutions, or previous studies), and these library publishing services have little capacity to assess reader needs directly.
the lack of expertise, then, may pose a more significant challenge than appears in responses indicating the top barrier to addressing needs of readers. lack of expertise is related to prioritization, with which it may have a circular relationship. if understanding of readers in design is not prioritized, it is unlikely that expertise will be sought or hired; if no one in this service has the expertise, readers may not have the advocate
it is unclear whether individual library publishing services reporting no expertise available at their institution simply have not seen and thus have not taken advantage of expertise that does exist (such as ux librarians or researchers), or whether they truly lack it. library publishing services report a wide variety of educational efforts related to ux, although without much consistency in degree or kind of engagement with student and nonstudent authors and editors. the range of efforts makes sense due to the different services offered across institutions. moreover, the type of educational efforts called for may vary depending on the particular area in question. accessibility standards may call for specific prescribed approaches, and thus education might focus on the purpose for them and what they are, whereas broader interface and functionality decisions may require education of authors and editors on implications of different options so that they can make a final determination. student authors and editors receive education in this area less consistently than faculty or other nonstudent authors and editors, although a small majority of services still report education of this group. this lower level of activity in work with students may surprise, given that a general purpose of these student publications is instructional. however, design issues may not have been as clearly tied to the instructional purpose, and students may have less of a say in the final design due to faculty oversight. even in cases where instruction occurs, this survey does not reveal its depth. that half of those working with students do address these issues “sometimes” or “always,” though, suggests that literature on the intersection of scholarly communications and information literacy may have missed an area of synergy in the use of undergraduate journals to teach issues about content creation. 
conclusion
library publishing services have sought to act on information about the ux of readers more than currently represented in the literature on library publishing. however, this work is by no means pervasive. existing efforts may be incidentally rather than intentionally achieved, and some library publishing services miss opportunities for action. the fact that only a handful of institutions do any usability testing of their platforms and interfaces for e-journals, e-books, or experimental forms is of particular concern. library publishing services do not always make use of information about readers that they have, facing challenges that make it hard to prioritize needs, preferences, and behaviors of readers in publication design.
library publishing services may face additional challenges not explored in the survey that influence the lack of priority they have placed on ux. as libraries turn to serve the needs of users in their authorial and editorial roles rather than their roles as readers, production-oriented tasks may seem the most pressing. back-end ux issues with workflow tools for authors may compete with front-end ux for attention of developers. long-term preservation may also draw available attention. finally, the focus on open access and how best to implement it may unintentionally obscure issues readers face once they do access a work.
a key limitation of this study and an area for further work is the lack of responses from library publishing services listed in the directory that focus solely on institutional repository services for electronic theses and dissertations, technical reports, or other original deposits. as a result, the analysis in this article mostly describes practice at library publishing services with the most traditional publishing outputs.
however, as noted above, the dlf assessment white paper that reviewed literature in the area of digital libraries, including institutional repositories, suggests a similar lack of user studies. as with the lack of focus on user studies noted in this article’s literature review, it could be that some work is being done but not reported to the broader community in the published literature. a similar survey to this one, though adapted to the specific audience, could be directed toward institutional repository managers. this would have the added benefit of capturing the many institutional repositories that have not been represented in the library publishing directory at all, which far outnumber those the directory does list. a comparative study including the broader universe of university presses beyond those reporting to libraries would likewise offer comparison to more traditional publishers also situated in the university. a study could also go further to include other academic publishers outside universities, including commercial entities. more important, this article points to the ample opportunity for growth in user studies generally, and ux work specifically, across several dimensions of the research and practice of library publishing services. undergraduate journal programs, often used to educate students on scholarly communication issues, could likewise serve as the site of education about how interface and design decisions impact readers. the dearth of usability studies in existing practice needs attention. engagement with the research on library e-book collections and human-computer interaction could help library publishing services provide solutions to e-book interface problems faced by academic readers requiring annotation functionality and an extended nonlinear reading experience.
indeed, the emphasis on open access in library publishing services offers the opportunity to start ahead of the curve; issues related to digital rights management create some of the more significant usability and reading experience problems in vendor-provided collections as reported in existing user studies.
whether or not individual libraries consider their digital collections to be publishing efforts, the dlf white paper’s emphasis on a need for further shared user studies research on digital collections interfaces resonates here. in particular, the call for a wider community of research that feeds back into shared platforms speaks to the limitations that respondents to the present survey face in relation to out-of-the-box publishing solutions. research on vendor-provided electronic collections and discovery layers has provided a way for libraries to share information about user issues with each other and those vendors, and there is a need to do this for library publishing solutions as well. libraries are creating new materials for digital collections as they publish, and librarians should not let these materials produced at home go without the same level of examination given to other resources.
one virtue of library involvement in publishing sometimes cited is that, in the words of joyce l. ogburn, “librarians are embracing their roles in the entire cycle of knowledge creation, dissemination, access, use, and preservation.” a benefit of increased library involvement in the full information cycle should be that we find ways for traditional library strengths related to the moments of consumption in that cycle to inform and be informed by increasing library expertise in production.
investigation into how patron populations use information resources has long driven service development in libraries, and there is incredible potential for this to be the case for new production-driven services in scholarly communications and publishing that will create at least some portion of tomorrow’s collections.
acknowledgements
thank you to zach claybaugh for assistance with collection of supplementary data for this project. thanks to maria bonn for pretesting the survey and reading a partial draft. colleagues suzanne chapman, cindy ingold, aaron mccollough, heather simmons, and mara thacker also provided much appreciated feedback or advice on drafts.
appendix a. survey questions
survey intro text
thank you for participating in this survey of library publishing services. the survey seeks to gather basic data about how and to what extent library publishing services address the needs of readers of their digital publications, and barriers to doing so. your institution has been invited to participate because it has appeared in either the or library publishing directory. all questions relate to the activities of the library publishing service as a whole, and you are asked to respond on behalf of your institution. if you are not the best person in your library publishing service to answer these questions, please exit the survey and forward the original e-mail to the person in your library who would be appropriate. only one person from each institution should respond.
to indicate agreement to participate, please enter the name of your institution. responses will be linked to the institution for analysis of broad trends by type of institution but not to you as an individual. any questions or feedback about this survey can be sent to dan tracy (dtracy@illinois.edu).
q : name of institution (required)
q : does your institution’s library publishing service publish (or support publication of) electronic journals?
(required)
____ yes
____ no [skips to q ]
q /q : for electronic journals, what sources of information regarding reader preferences, needs, or behaviors does the library publishing service collect, and which does it use for developing publication format options and interfaces? [check all that apply]
collect (q ) | use for developing formats and interfaces (q )
____ usage statistics (downloads/views)
____ transaction logs
____ usability test results
____ surveys, interviews, or focus groups
____ informal feedback
____ other
q : considering the past two years up through current practice, when format options and interfaces for electronic journals are designed or redesigned, this is done using information about reader needs, preferences, and/or behaviors [choose one]:
____ never
____ rarely
____ sometimes
____ always
q : does your institution’s library publishing service publish (or support publication of) electronic books (e-books)? (required)
____ yes
____ no [skips to q ]
q /q : for electronic books (e-books), what sources of information regarding reader preferences, needs, or behaviors does the library publishing service collect, and which does it use for developing publication format options and interfaces? [check all that apply]
collect (q ) | use for developing formats and interfaces (q )
____ usage statistics (downloads/views)
____ transaction logs
____ usability test results
____ surveys, interviews, or focus groups
____ informal feedback
____ [other?]
q : considering the past two years up through current practice, when format options and interfaces for electronic books (e-books) are designed or redesigned, this is done using information about reader needs, preferences, and/or behaviors [choose one]:
____ never
____ rarely
____ sometimes
____ always
q : *does your institution’s library publishing service publish (or support publication of) digital experimental forms (such as nontraditional digital humanities publications, multimedia projects)?
____ yes
____ no [skips next page]
q /q : for digital experimental forms (such as nontraditional digital humanities publications, multimedia projects), what sources of information regarding reader preferences, needs, or behaviors does the library publishing service collect, and which does it use for developing publication format options and interfaces? [check all that apply]
collect (q ) | use for developing formats and interfaces (q )
____ usage statistics (downloads/views)
____ transaction logs
____ usability test results
____ surveys, interviews, or focus groups
____ informal feedback
____ [other?]
q : considering the past two years up through current practice, when format options and interfaces for experimental forms (such as nontraditional digital humanities publications, multimedia projects) are designed or redesigned, this is done using information about reader needs, preferences, and/or behaviors [choose one]:
____ never
____ rarely
____ sometimes
____ always
q : considering the past two years up through current practice, where does the institution’s library publishing service find expertise it uses for understanding readers? [please check all that apply.]
____ none: the library publishing service does not make use of this expertise.
____ in a library publishing unit, if one exists
____ outside a library publishing unit but inside the library
____ outside the library but within the university
____ platform vendors or providers (such as bepress, pkp, and the like)
____ relevant research studies (in other words, ithaka reports on reading behavior, library user studies)
____ others outside the university besides platform vendors or providers (such as colleagues at conferences or elsewhere)
q –q : considering the past two years up through current practice, how often do the following prohibit the library publishing service from further addressing reader needs related to any digital publication formats and interfaces? (choices: never, rarely, sometimes, frequently)
q : limited format and interface options of chosen publishing platform(s)
q : lack of useful information about readers
q : existing personnel lack expertise
q : lack of priority versus other needs (lack of time for this activity)
q : library, campus, or other policies the library publishing service cannot control
q : what is currently the primary limiting factor keeping the library publishing service from further addressing reader needs related to any digital publication formats and interfaces?
[choose one]
____ limited format and interface options of chosen publishing platform(s)
____ lack of useful information about readers
____ existing personnel lack expertise
____ lack of priority versus other needs (lack of time for this activity)
____ library, campus, or other policies the library publishing service cannot control
____ other [please specify]
q : library publishing staff educate faculty or other nonstudent editors on reader-related issues such as accessibility, format preferences, or interface design as related to their digital publications: [choose best option]
____ never
____ rarely
____ sometimes
____ regularly
____ not applicable (do not work with this group directly on any publication types)
q : library publishing staff educate faculty or other nonstudent authors on reader-related issues such as accessibility, format preferences, or interface design as related to their digital publications:
____ never
____ rarely
____ sometimes
____ regularly
____ not applicable (do not work with this group directly on any publication types)
q : library publishing staff educate students editing or authoring graduate or undergraduate student publications on reader-related issues such as accessibility, format preferences, or interface design as related to their digital publications:
____ never
____ rarely
____ sometimes
____ regularly
____ not applicable (do not work with this group directly on any publication types)
q : if the library publishing service does so, please give an example of how staff educate any of the above types of users about reader-related issues such as accessibility, format preferences, or interface design as related to their publications.
[open ended]
q : please describe the most recent example, if one exists, of how the library publishing service has developed or improved a digital publication or digital publishing platform with consideration of information about reader preferences, needs, and behaviors. [open ended]
q : if you have any other comments related to the library publishing service’s activities in the areas covered by this survey, or the survey itself, please include those below. [open ended]
notes
. user studies in library and information science, in both the practitioner and nonpractitioner literature, is so pervasive that it can be observed in most new issues of any research journal in the field. relevant connections, particularly to user studies in the context of e-book collections, are highlighted in the literature review. for overviews of how ux is being incorporated into libraries, its emergence, and approaches to ux, see: robert fox and ameet doshi, spec kit : library user experience (washington, d.c.: association of research libraries, ); jean e. mclaughlin, “focus on user experience: moving from a library-centric point of view,” internet reference services quarterly , no. / ( ): – , http://dx.doi.org/ . / . . ; aaron schmidt and amanda etches, useful, usable, desirable: applying user experience design to your library (chicago: american library association, ). the new practitioner journal for libraries and ux is weave: journal of library user experience (available online at http://weaveux.org/ [accessed december ]). . dan cohen and kathleen fitzpatrick, “foreword,” in getting the word out: academic libraries as scholarly publishers, eds. maria bonn and mike furlough (chicago: acrl, ), vii–x, viii. . daniel g. tracy, “the users of library publishing services: readers and access beyond open,” journal of electronic publishing (summer ), doi:http://dx.doi.org/ . / . . . . paul n. courant and elisabeth a.
jones, “scholarly publishing as an economic public good,” in bonn and furlough ( ); ann okerson and alex holzman, the once and future publishing library (washington, d.c.: council on library and information resources, ), available online at www.clir.org/pubs/reports/pub [accessed august ]. . okerson and holzman, the once and future publishing library, . . for a good introduction to academic library roles in digital humanities with specific discussion of both digital editions and digital collections, see stewart varner and patricia hswe, “digital humanities in libraries,” american libraries (jan/feb ), available online at http://americanlibrariesmagazine.org/ / / /special-report-digital-humanities-libraries/ [accessed december ]. due to issues surrounding publication of that article, interested readers should examine the postprint and related documents available through pennsylvania state university’s scholarsphere for a fuller understanding of its context, available online at https://scholarsphere.psu.edu/files/ c wm [accessed december ]. for a more in-depth look at collaborations on dh publications based in text encoding, see harriett e. green, “facilitating communities of practice in digital humanities: librarian collaborations for research and training in text encoding,” library quarterly , no. ( ): – . . charles watkinson, “from collaboration to integration: university presses and libraries,” in bonn and furlough ( ): – . . laura brown, rebecca griffiths, matthew rascoff, and kevin guthrie, university publishing in a digital age (new york: ithaka, ), available online at www.sr.ithaka.org/research-publications/university-publishing-digital-age [accessed september ]; association of research libraries, arl: a bimonthly report on research library issues and actions from arl, cni, and sparc / ( ); karla l. hahn, research library publishing services: new options for university publishing (washington, d.c.: arl, ); james l.
mullins, catherine murray-rust, joyce l. ogburn, raym crow, and october ivins, library publishing services: strategies for success: final research report (washington, d.c.: sparc, ), available online at http://docs.lib.purdue.edu/purduepress_ebooks/ / [accessed september ]. . library publishing coalition, “background,” available online at www.librarypublishing.org/about-us/background [accessed september ]. . the directory, released after this survey, identified twelve additional library publishers not in the prior two volumes, for a total of . library publishing directory , ed. sarah lippincott (atlanta, ga.: library publishing coalition, ). . maria bonn and mike furlough, “the roots and branches of library publishing programs,” in bonn and furlough ( ): – , . . library publishing coalition, “about us,” available online at http://librarypublishing.org/about-us [accessed january ]. the definition continues: “generally, library publishing requires a production process, presents original work not previously made available, and applies a level of certification to the content published, whether through peer review or extension of the institutional brand. based on core library values, and building on the traditional skills of librarians, it is distinguished from other publishing fields by a preference for open access dissemination as well as a willingness to embrace informal and experimental forms of scholarly communication and to challenge the status quo.” while this definition is broad in some ways, it does seem to exclude digitization efforts, which are covered in other discussions of library publishing introduced in this article. . okerson and holzman, the once and future publishing library, . . the survey and study do not use “library publishers” as the default term because some institutions have been reluctant to describe their services in this way, including some of those described by bonn and furlough as offering “libraries and publishing” model.
“library publishing services” stands in as a broader umbrella term to include the full range of models suggested by lpc. . jingfeng xia, “library publishing as a new model of scholarly communication,” journal of scholarly publishing , no. ( ): – , ; isaac gilman, library scholarly communication programs: legal and ethical considerations (oxford: chandos publishing, ). . tracy, “the users of library publishing services” ( ). . brown, griffiths, rascoff, and guthrie, university publishing in a digital age, . . tracy, “the users of library publishing services” ( ). for further analysis of these statements, see daniel g. tracy, “topics and trends in library publishing mission statements,” poster at library publishing forum , portland, ore., march , , available online at http://hdl.handle.net/ / [accessed september ]. . digital library federation assessment interest group user studies working group, surveying the landscape: use and usability assessment of digital libraries, december , available online at https://docs.google.com/document/d/ i x su kwbu i i odrxh scyurrgwlxlfist nli/edit?usp=sharing [accessed january ]. . rebecca kennison, neni panourgiá, and helen tartar, “dangerous citizens online: a case study of an author-press-library partnership,” serials , no. ( ): – , doi:http://dx.doi.org/ . / . . nancy l. eaton, bonnie macewan, and peter j. potter, “learning to work together: the libraries and the university press at penn state,” journal of scholarly publishing , no. ( ), – , doi:http://dx.doi.org/ . /scp. . . . patrick alexander, james mccoy, leila salisbury, and richard brown, “mixing oil and water: recipes for press-library collaboration,” proceedings of the charleston library conference, : – , , doi:http://dx.doi.org/ . / . .
for a recent literature review of library e-book user studies that exposes the tip of the iceberg, see michael lamagna, sarah hartman-caverly, and erica swenson danowitz, “integrating e-books into academic libraries: a literature review,” internet reference services quarterly , no. / ( ): – , doi:http://dx.doi.org/ . / . . . for an extended analysis of various types of reading and the characteristics of academic reading in particular, see terje hillesund, “digital reading space: how expert readers handle books, the web, and electronic paper,” first monday , no. / ( ), available online at http://firstmonday.org/ojs/index.php/fm/article/view/ / [accessed december ]. . relevant examples include: nicholas chen, francois guimbretiere, and abigail sellen, “designing a multi-slate reading environment to support active reading activities,” acm transactions on computer-human interaction , no. ( ), doi:http://dx.doi.org/ . / . ; juliane franze, kim marriott, and michael wybrow, “does a split-view aid navigation within academic documents?” in proceedings of the acm symposium on document engineering (new york: acm, ): – , doi:http://dx.doi.org/ . / . ; jennifer pearson, george buchanan, harold thimbleby, and matt jones, “the digital reading desk: a lightweight approach to digital note-taking,” interacting with computers , no. ( ): – , doi:http://dx.doi.org/ . /j.intcom. . . ; hirohito shibata, kentaro takano, and shun’ichi tano, “text touching effects in active reading: the impact of the use of a touch-based tablet device,” in human-computer interaction—interact : th ifip tc international conference, bamberg, germany, september – , , proceedings, part i, eds. julio abascal, simone barbosa, mirko fetter, tom gross, philippe palanque, and marco winckler (cham: springer, ): – , doi:http://dx.doi.org/ . / - - - - _ . .
melanie schlosser, “access/ibility in digital publishing: summer seminar at wvu,” the lib pub: a group blog on library publishing (july , ), available online at https://librarypublishing.wordpress.com/ / / /accessibility-in-digital-publishing-summer-seminar-at-wvu/ [accessed december ]. . laurie borchard, michael biondo, stephen kutay, david morck, and andrew philip weiss, “making journals accessible front & back: examining open journal systems at csu northridge,” oclc systems & services , no. ( ): – . they did, however, note the back end for authors and editors would “pose some problems for users with disabilities” ( ). . amy buckland, “more than consumers: students as content creators,” in bonn and furlough ( ): – . . association of college and research libraries, working group on intersections of scholarly communication and information literacy, intersections of scholarly communication and information literacy: creating strategic collaborations for a changing academic environment (chicago: association of college and research libraries, ), available online at http://acrl.ala.org/intersections/ [accessed december ]; common ground at the nexus of information literacy and scholarly communication, eds. stephanie davis-kahl and merinda kaye hensley (chicago: association of college and research libraries, ). . the enhanced data set is openly available under a cc license through the illinois data bank. under the repository agreement, the data set will be preserved for five years, at which point it will be reassessed for continued need. the data set is available at https://doi.org/ . / b idb- _v . . opendoar, the directory of open access repositories, lists , institutional repositories worldwide, far more institutions than listed in the lpc’s directory regardless of extent of services. “open access repository types—worldwide” (chart), available online at www.opendoar.org/about.html#scope [accessed january ].
while it is possible that many of these repositories do not meet the lpc definition of library publishing, the large number suggests the likelihood of significant gap in the directory’s coverage of deposit-based library publishing services to begin with, a gap that could stem from lack of awareness of the directory as well as disagreement over whether the term “publishing” applies to items such as etds. . joyce ogburn, “closing the gap between information literacy and scholarly communication,” in davis-kahl and hensley, v–viii.
the promise and problems of the visual e-book: call for an alliance between authors and librarians
author(s): anne whiston spirn and ann baird whiteside
source: art documentation: journal of the art libraries society of north america, vol. , no. (fall ), pp. -
published by: the university of chicago press on behalf of the art libraries society of north america
stable url: http://www.jstor.org/stable/ . /
anne whiston spirn, massachusetts institute of technology
ann baird whiteside, harvard university
abstract: this article explores the state of libraries and authorship in response to the evolving landscape of electronic books. the authors discuss the topic through a conversation about the choice to self-publish an electronic book in the visual arts. issues such as the primacy of the image as argument for research in design and the visual arts, the availability of e-books to libraries, the influence of publishers on the e-book medium and market, and implications for libraries and collection development are considered.
introduction
research and scholarship in the visual arts and design fields requires extensive use of images in order to make arguments about theory and practice. however, the cost of publishing the products of such investigations in the form of printed books and articles is quite high, and, despite digital technology, continues to be expensive. those costs are passed on to consumers: libraries, students, faculty, and other readers. anne whiston spirn recently produced an original e-book about seeing as a way of knowing and photography as a way of thinking: the eye is a door: landscape, photography, and the art of discovery (wolftree press, ) (figure ). this publication is the result of anne’s desire to find a new way to publish heavily illustrated books, make them rich and useful to scholars, and make them affordable.
anne whiston spirn is professor of landscape architecture and planning, departments of architecture and urban studies and planning, massachusetts institute of technology, cambridge, massachusetts; spirn@mit.edu. ann baird whiteside is librarian/assistant dean for information resources, frances loeb library, harvard university graduate school of design, cambridge, massachusetts; awhiteside@gsd.harvard.edu.
art documentation: journal of the art libraries society of north america, vol. (fall ). copyright by the art libraries society of north america. all rights reserved.
figure . anne whiston spirn, the eye is a door: landscape, photography, and the art of discovery (wolftree press, ) viewed on an ipad. please see the online edition of art documentation for a color version of this image.
in october , she attended the symposium why books? held at the radcliffe institute for advanced study at harvard university, and, in , she applied for an internal grant at massachusetts institute of technology (mit) to explore and develop prototypes for richly illustrated e-books. the project was conceived in three parts: creating three prototypes for the richly illustrated electronic book; research on the dissemination of electronic books; and producing a guide for other authors. one of anne’s revelations at the why books? symposium was that she was in a room full of librarians, all discussing books, publishing, and implications for scholarship. it was then she realized that librarians are key to the future of the book, and that academic publishing needs to realign relationships.
authors and librarians should be partners. right after that conference, she reached out to ann baird whiteside for advice, which launched a series of discussions about the process of producing e-books and the implications of e-books for research and scholarship. the discussions began by considering the future of the richly illustrated e-book, how one should go about producing such books, and how to get e-books into libraries. broader conversations then arose about how libraries acquire books, the relationships of publishers with authors and with libraries, the implications of digital publishing for research and scholarship, and the reasons why authors and librarians are allies.
why e-books in design and the visual arts?
ann baird whiteside (abw): the publishing of electronic books in the visual arts is particularly interesting at this moment from the perspective of art and architecture librarians for several reasons. despite the fact that the state of development of some e-book readers is not very advanced in capabilities for showing visual content, e-books challenge our perceptions of what book collections are and will become. librarians in all disciplines are figuring out how to acquire e-books beyond the use of large aggregated packages, and e-books in general are creating a shift in the thinking of librarians because they require different management; they challenge us to think of new ways to manage our collections, and new ways to advertise what we have in our collections. why did you decide to publish your new book as an e-book?
anne whiston spirn (aws): it all came down to reaching readers. i wanted to control the price for the book so that it would be affordable to a broad audience. cost is a problem with the richly illustrated book. my students cannot afford to buy most of the books i want them to read because they may cost one hundred dollars or more when first published.
once such books go out of print, the price may climb to hundreds or even thousands of dollars. a library may buy one copy, but if that copy is lost, it is not always replaced, especially if the book’s price is too high. the cost of illustrations is a major barrier to publishing in the visual arts. and yet, books in the visual arts are about the visual content. this is a conundrum i have faced in the past, where i have had to limit the number of images in my books, but my new book with its many color photographs posed an economic problem of a different order. in fall , as i was wrestling with these issues, i attended why books?, a conference held at the radcliffe institute about the future of books in a digital age. the first ipad had been released earlier that year, and its high-resolution color screen reproduced gorgeous images. the black-and-white kindle readers had been available for some time, but they were not image-friendly. the ipad opened up a whole new world of possibilities for the richly illustrated book. that and the conference caused me to rethink how i was going to publish my book.
note: since , when i started on this venture, new models have emerged to address issues of price and open access, such as knowledge unlatched, which is supported by libraries (http://www.knowledgeunlatched.org/).
abw: you have published your previous books with the university of chicago press, yale university press, and basic books. why did you decide to publish this book yourself?
aws: i did not intend to do so originally. i would have preferred to do both an e-book and a print edition with an established publisher.
but when my editor worked out the production costs and what the book would have to sell for, he told me that i would have to contribute a $ , subsidy to help offset the cost of printing the book’s many color photographs.

abw: that is a lot of money for an author to spend to publish a book, which is something we do not always think about when the books arrive in the library. the author is personally making big sacrifices for scholarship.

aws: it is a common requirement for books with many color images. and despite that subsidy, my editor estimated that the publisher would set the price at sixty dollars in hardcover and forty to fifty dollars in paperback. at this price, the book would not have reached the wide general audience that my previous books had enjoyed. my students certainly could not afford that price.

another consideration for me was that publishers’ contracts now require authors to grant electronic rights to the publisher, which means that the publisher’s right to distribute the book might not expire during the author’s lifetime. in , i was able to retain electronic rights to the language of landscape with yale university press, but in , when i signed a contract with the university of chicago press for daring to look: dorothea lange’s photographs and reports from the field, i was unable to retain e-rights. although i retained copyright, the contract required that i grant to the publisher “all rights . . . in all media . . . that are now or hereafter available.” that means that so long as the book is available in digital form, it is not out of print, and thus the rights will never revert to me. my agent tells me that she now puts a clause in the contract that if the publisher does not exercise the e-rights within a certain number of years, those rights revert to the author. but an individual author may not be able to get that agreement or think about asking for it. many authors do not even think to retain copyright.
in the past, if a book publisher lost interest in an author’s book, if it was not selling enough copies to make it worthwhile to keep the books in the warehouse, then it would let the book go out of print, and the rights would revert to the author (if the author had negotiated an “out-of-print” clause as part of the contract). the author could then take the book to another publisher and publish a new edition. now all the publisher needs to do is scan the book and make it available in an electronic version, and set whatever price it wants. if the publisher does not want to bring out another edition, the author cannot do another edition.

that brings us back to the price of the book. many publishers charge the same price for the print and electronic editions, and you have told me that libraries are often charged more for e-books than print books. i understand why they have to charge as much as they do for print books, which are printed on real material and must be transported and stored. there are so many costs associated with the printed book that simply do not apply to the e-book. but publishers are afraid that if the e-price is too low they will not sell the print books, and they have not yet found a new model for e-books that is distinct from the model for print books.

the promise and problems of the visual e-book

that was the background for my going to the conference at radcliffe. after the symposium, i started thinking about the advantages e-books afford. if i had to raise $ , to subsidize print costs and the price of the book was going to be so high, and i could not retain the e-rights, then why not raise the money to publish the book myself? in fact, the e-book cost far less than $ , to produce.
i did not want to get into the business of arranging for the copy-editing and design and doing all the promotion and distribution. but it was worth it if it meant that i could retain the rights, explore this new medium, and set a low price for the book. i set the price at $ . , which my students assured me that they could afford.

e-books and research in design and the visual arts

abw: as i listen to you describe how you started down this path, i think about e-books in general, but especially in the visual arts, and i begin thinking about what the e-book means for research and scholarship in the visual arts.

aws: e-books have the potential to open up new flows of ideas. images are not just illustrations of an idea, they embody ideas. those who conduct visually based research, whether artists, designers, or scholars, make an argument through visual images, in a single image or a mélange or a sequence of images. but, because it is so expensive to print visual images, and because text, in comparison, is so cheap, authors have been hindered from making visual arguments in printed books. otherwise, the argument might be made almost entirely in images, with captions to help the reader move from one image to another and to follow the line of reasoning.

abw: an interesting example of a description of the visual thinking you describe is one from a society of architectural historians colleague, dietrich neumann, in which he described the use of a series of images to show design process over time. at the annual meeting of sah in april , neumann presented more than a dozen images and traced the evolution of falling water, showing that it was central to frank lloyd wright’s work as the culmination of earlier work and a turning point for wright’s future work in terms of his use of horizontal and vertical elements (figure , figure ). the visual argument made the verbal argument much more accessible.
as an illustration of the use of the visual in intellectual arguments, i can see extending this to e-books, and i think that the e-book format allows freedom to use as much visual content as you want.

aws: e-publishing has the potential to transform the field because, theoretically, it will allow an almost unlimited use of images to make and to support intellectual arguments, and the author can include a sequence of images that traces a line of reasoning, instead of just showing the conclusion. and e-books permit the free use of color. ultimately, i think that readers are going to choose color tablets because they will want to read everything from magazines to books electronically.

abw: that is a freedom available only in publishing electronically. and it is also an opportunity to transform teaching. someday, for example, color-calibrated e-readers may offer the viewer a virtual experience close to seeing a work in person. i think that we have an opportunity in e-books to give a different perspective of images as being used to provide intellectual arguments. it allows one to offer a greater context for the argument an author may want to make with different kinds of images.

dietrich neumann, paper presented at the business meeting, society of architectural historians nd annual meeting. cincinnati, ohio, april , .

aws: there are many different kinds of images that could be incorporated into books. an author may, for example, have footnotes that are visual as well as those that are text-based. you might reproduce a page of diagrams from your field notes as a piece of visual documentation. you could link back to archival collections by providing a link from a footnote or bibliography so that a reader can go directly to the archive.
abw: that creates a really interesting link between publishing and libraries as well as archival material. if i am doing research and i follow the link from an e-book to an archive i can see what the archive holds and what it does not hold without leaving the book. there is a possibility of literally transforming scholarship in the visual arts in ways that people have not quite imagined yet. it will attract a new generation of scholars who want to work with images and be visually literate. the transformation will be large, and there are a number of issues that have to be sorted out and addressed. i think getting things out on the table and beginning the conversation helps force discussion about the issues, both in scholarship and in libraries. if we can talk about it, it puts it out there for scholars, librarians, and publishers to start thinking about.

figure . dietrich neumann, falling water series ( ). eighth of twelve images. falling water is in the center. the three buildings on the lower left predate falling water, the three on the upper right are later.

the richly illustrated e-book: challenge and promise

abw: you have referred to the eye is a door as an experiment. in what ways is it an experiment?

aws: the whole process has been an experiment from conception to production, promotion, and distribution. my hypothesis is that electronic publishing is the future of the richly illustrated book. i wrote a proposal and received a grant from mit to produce three prototypes, the first being my current book, the eye is a door, and the other two, my previous books for which i hold the e-rights: the granite garden: urban nature and human design and the language of landscape.
the content of the eye is a door, as an interplay of images and words, is itself an experiment. the book’s two photographic essays (composed of my own photographs) embrace a central text of short chapters. the first visual essay explores the sense of place and introduces dialogues among natural forces and human ideas, values, and actions. the second, which concludes the book, contains more complex photographic pairings that plot a sequence of ideas, an argument for a language of landscape (figure ). eight chapters of text are a counterpoint to the photo essays. images and words correspond, but a single photograph represents more than a single idea or story, and each photographic pair and sequence of pairs has its own logic.

figure . dietrich neumann, falling water series ( ). last of twelve images, which demonstrates how wright’s use of horizontal and vertical elements evolved after falling water. the central image of falling water is in color in the original, the others are in black and white.

given the nature of the book’s ideas and structure, i wanted to experiment with the opportunities the digital medium affords to transform the reading experience itself. imagine being able to call up images referenced in the text with a simple tap on an icon. tap the screen, and the image appears, tap again, and it vanishes (figure , figure ). that function alone transforms the reading experience. your eye rests, undistracted, on the image, then returns to the text. no flicking back and forth between different pages, sticking your finger in two parts of the book when an image is referred to more than once. e-books also afford the potential for seamless movement between the book and the web.
the eye is a door cites works by other photographers, whose images appear in the e-book itself. tapping on the caption takes the reader directly to that photographer’s website; tap again and return to the book. the eye is a door website (http://www.theeyeisadoor.com) hosts features that complement the e-book, such as a journey via google earth to places depicted in the photographs, where the reader can explore the place on his/her own.

figure . the second pair in a series of pairs of photographs that, in sequence, make a visual argument for a language of landscape. from “passage,” a photo essay in spirn, the eye is a door, viewed on an ipad. many e-book readers permit the reader to switch between horizontal and vertical views. please see the online edition of art documentation for a color version of this image.

reflowable text is another feature made possible by e-books: you can change the size of the font and choose the font that is easiest on your eye. you can also select a black or white background. but reflowable text does not permit a fixed layout. for designers, control of the book’s design will be an issue. i was committed to reflowable text, so the e-book design was a challenge. i was not prepared for how difficult producing the e-book would be. the obstacles for design posed by amazon, for example, are holding back the development of the visually rich e-book since amazon has such a huge share of the e-book market.

figure . square icons indicate a linked image. icons are red in the original. from spirn, the eye is a door, viewed on an ipad. please see the online edition of art documentation for a color version of this image.
abw: can you tell me more about how amazon is holding back e-readers in relation to other companies?

aws: amazon’s kindle readers use a proprietary format, mobi or kf , rather than epub, which has emerged as the industry standard. epub permits much more control and flexibility in the design of richly illustrated books. there are things you simply cannot do in mobi/kf . currently amazon also limits the file size for e-books, which ibooks does not. as a result, many visually rich e-books are not offered on amazon. the library of congress, for example, recently published a series of media-rich books that are available only from ibooks (e.g., great photographs from the library of congress).

figure . tap on the square icon, and the image fills the screen. tap again, and return to the text. from spirn, the eye is a door, viewed on an ipad. please see the online edition of art documentation for a color version of this image.

abw: i am interested in hearing about how you rethought the construction of an e-book as opposed to printed books.

aws: originally, i structured the book with the design of the codex in mind, so i was limited by the number of pages in a signature when selecting the images for the photo essays. the signatures also influenced the structure of the book itself, which was conceived as two essays of color photographs, each printed on its own set of signatures, with one essay appearing at the beginning of the book, and the other at the end. as in many printed books, this structure diverts the reader as he/she has to leave the text to find the image, and then flip back to the text.
moving to the e-book format freed me from such constraints, but it introduced others. for example, consider the difference between browsing through a book in a bookstore and reading a sample of an e-book from an online retailer. you can hold a printed book in your hands and page through it. flipping through the eye is a door, for example, the reader would see that images compose half of the e-book. but the sample pages from an online retailer contain only a fraction of a book. amazon determines this automatically: the first ten percent of the book. i realized that, the way my e-book was set up, when readers got the sample pages they would have to go through many pages of text before they got to a single image! the eye is a door is a visually oriented book, and designers and other visual thinkers are an important audience, so i redesigned the first section of the book. right after the cover, i inserted five photographs that make a visual argument. only then does the reader come to the epigraph (“the root of the word idea is ‘to see’ in greek” and dorothea lange’s quote “the camera is a tool to see without a camera”), the table of contents, and the verbal introduction.

abw: how do you create an e-book in a design field that works for the readers of books? it is a really different problem to think about e-books for the design fields. it is about the people reading the books. how do you make reading in a digital age the best experience it can be?

aws: i have only just begun discovering what can and cannot be done given the constraints of the current e-book platforms and reading devices. in working on the eye is a door with ebook architects (http://www.ebookarchitects.com; they are e-book developers, not book designers), we had to adapt my notion of how the photo essays would be laid out and how images would be cited within the text. this could not be done the way that i had imagined. so we had to reinvent the layout of the book.
abw: do you think that it is as effective as the way you had imagined it?

aws: i think so. the eye is a door’s original design called for images referenced within the text to appear when called up, then to vanish. epub permits this, but amazon’s current version of mobi/kf does not. our solution was to treat these images as endnotes, which means that they all must appear at the back of the book in the order in which they were cited. not ideal, since some images appear more than once, and the sequence seems haphazard. and yet, appropriate, for those images are, in fact, citations, footnotes of images rather than words. an unexpected and unfortunate result, however, is that in the sample pages downloaded from an online retailer, the images referenced in the verbal essays cannot be accessed since they are at the end of the book and thus are outside the sample, which includes only the first percent of the book.

the solution for the eye is a door inspired the design for the e-editions of my books now in production: the language of landscape and the granite garden. these new e-books will consist of two parts, where the parts can be read both separately and interactively. in the first part, the reading experience will be similar to the text portion of the eye is a door. the second part will consist of all those images cited in the text, composed deliberately as sequenced essays of images and captions, where each image links back to the associated text. the reader may then choose whether to start by reading essays of text (with links to the images) or by reading essays of images (with links to the text). this is a new kind of book that serves both visual and verbal thinkers.
abw: in this case, the literal and intellectual structures of the book become different. you are creating a construct around the intellectual thought process through the use of images. this means you have been forced to think about what a book is in electronic form. you are also able, in your new e-books, to offer different perspectives for readers, from text to image or image to text. this is an expansion of the concept of the structure of the book.

aws: yet most readers still prefer the printed book. the sociologist howard s. becker, who writes about photography and visual sociology among other topics, published thinking together: an e-mail exchange and all that jazz, a book that consists of an e-mail exchange with robert faulkner, written during the process of writing another book, do you know . . . ?, about how jazz musicians who have never played with one another can come together successfully in a session. thinking together is an e-book, so every time becker and his co-author mention a musical composition (there are hundreds of tunes and specific renditions), there is a link to youtube where the actual composition is played.

abw: that in itself is an example of how an e-book can change scholarship because the book allows readers to make visual and aural connections with the text without ever leaving the book itself.

aws: now let me tell you the depressing side of the story. the book was also published in print. the print version has all the e-mail exchanges, which refer to hundreds of tunes, but with no links to the music. becker’s friends bought the e-book, but when he asked them about it, they would say, “i haven’t gotten around to looking at it yet.” but when he gave them a paper copy, they read it. as a sociologist, becker knows that you cannot force people to read e-books, they have to want to do so. but why would anyone prefer to read a printed book with no links to the tunes referenced, when they could read the e-book and listen to all the music? i am surprised that people are not as enthusiastic as i am (or as becker is at the age of eighty-six!) about the potential of the e-book to transform the reading experience. i now do most of my reading on an ipad.

howard s. becker and robert r. faulkner, thinking together: an email exchange and all that jazz (los angeles: usc annenberg press, ). print edition by aubervilliers: les laboratoires d’aubervilliers; paris: questions théoriques, .

robert r. faulkner and howard s. becker, do you know . . . ?: the jazz repertoire in action (chicago: university of chicago press, ).

dissemination of e-books

aws: given the decline of brick-and-mortar bookstores and the book review in print media, it is now more difficult for readers to find books. we still have publications like the new york review of books and bookforum, but for authors who want to reach a broad audience, reviews in popular media are important. many readers discovered my first book when it was reviewed in the new york times book review. it was not just general readers, but also scholars outside my own field, like historians and geographers. today the new york times book review is a shadow of its former self. the book review editor for a major newspaper used to think, “who would write a really perceptive review of this book?” now so few newspapers have book review sections. perhaps those that survive the shift from print to online will reinstate their book review sections. if and when they do, i hope they will review e-books and books that are independently published, too.

abw: is there something else that you see replacing the concept of the book review?

aws: not the concept itself, but the platforms for review.
there are bloggers who write about books, and there are websites where readers share their views, but discovering a good blog is not as easy as buying the new york times.

abw: blogs are not quite self-reviewing, but anyone can happen to see your book and write a review. they are a form of review, and a form of advertising, but a blog is not peer review, nor is the blogger necessarily a peer reviewer or someone with knowledge of a particular subject.

aws: there are also social media sites for readers, like goodreads and librarything, but the books discussed are mainly genres like romance, science fiction, mystery, and memoirs. there are no large groups of readers on goodreads devoted to books on the topics i write about: landscape, environment, art, photography, and design. these topics have tiny reader groups on goodreads as compared to romance, for example.

abw: we know people are reading, and publishing is certainly prolific. yet there is a gap in the space between where books and readers come together. one can troll amazon.

aws: and amazon lets you read a sample from a book. perhaps you never want to go further in many, but, in the meantime, you do find new books and new authors. the challenge remains: how do we find richly illustrated e-books on art and design? journals in design and the visual arts should publish review issues on e-books.

abw: we know e-books in our fields are being published, but we have to learn the new places to look. our former knowledge base about where to find books is shifting.

aws: new firms have sprung up to distribute e-proofs and e-books to reviewers. netgalley (www.netgalley.com), for example, contacts reviewers and invites them to download the book. according to their website, more than , librarians currently use the site to preview new titles. netgalley serves independent as well as well-established publishers and is one of many companies that are cropping up to serve the growing number of independent publishers and the growing e-book market. publishers weekly now has a section devoted to books by authors who publish independently: “pw select.” so many well-established authors are choosing to self-publish that the pejorative term “vanity” press is disappearing. there are several distributors for independent publishers (smashwords, bookbaby, and ingram spark), which distribute e-books to multiple e-book retailers, including amazon, ibooks, and many others. they collect the royalties and disburse them to the author, and they take a percentage.

abw: what are the implications of these new services for libraries? libraries have traditionally relied upon a limited number of vendors to get their books. some vendors are just beginning to offer e-books in the same way they offer print books. yet, at the same time, there are clearly more e-books being published and disseminated outside the vendor realm.

aws: i was surprised by how difficult it is going to be to get the eye is a door into libraries. all of my previous books are in libraries. it shocked me to learn that the library of congress will not catalog books (print or electronic) from publishers whose books are not already widely distributed to libraries as part of its cataloging-in-publication program (cip), nor are e-books eligible for cip. think about the barrier that throws up, not just to independent publishing, but to new publishing firms. the mit school of architecture and planning publication series, for example, is not eligible for library of congress cataloging in advance of publication. the loc’s current practice is a disservice to the dissemination of knowledge.
i always thought that library of congress was our national library.

abw: they will say that is not their role.

aws: but what does it mean to not have a national library?

abw: the united states has never had the concept of a national library, as in other countries. the library of congress was established to be a reference library for congress. its mission is “to support congress in fulfilling its constitutional duties and to further the progress of knowledge and creativity for the benefit of the american people.” the library of congress has never officially taken on the role of providing the ultimate leadership in cataloging for other libraries, though we think of loc in that role. but, going back to thinking about how libraries acquire e-books, some of our library vendors do provide e-books. i have a choice now as to whether i want a book in e-format or in print format. but librarians need new workflows for finding reviews of e-books. i am also thinking about how libraries acquire self-published e-books in the way you have been doing it versus the e-books in big packages.

aws: how do libraries obtain e-books?

abw: a vendor, ebrary (http://www.ebrary.com), for instance, offers thousands of e-books; the library pays a subscription fee, and we receive a package that we can offer to our academic community. and while there are options to license individual titles, there remains the concern that the e-book packages start to feel more like the large aggregated e-journal packages.

library of congress, “about cip,” http://www.loc.gov/publish/cip/about/.

library of congress, “about the library,” http://www.loc.gov/about.
it is an e-journal package model that is being applied to monographs. it is a new territory that librarians need to navigate, understand, and really think about.

aws: who is making the decisions about what books distributors will offer to libraries and which books they will not distribute? these gatekeepers are invisible to authors. why should a publisher or distributor be the one to select the books that a university library can buy? that is crazy!

abw: yes, i agree. i often look at a list of e-books and say, “there is little in there that our people care about, in our disciplines.” but in large universities, the model is generally that all the libraries share the cost of the large packages as we do for journals. we subscribe to the ebsco e-book package, and a search for books on landscape architecture, for instance, results in one book. there are e-books that are published in the art and design fields, but they are not found in great numbers in the large aggregated packages. it is important to note that it is the publishers who are creating the packages of books to which libraries will subscribe.

aws: librarians have long determined what books are available to their readers. they have curated their libraries’ book collections. if the publisher or distributor is making that choice, that eliminates one of the most important roles of the librarian, at least to authors and readers. as a scholar, i have a problem with that.

abw: i have a hard time wrapping my head around it, both intellectually and from a practical perspective. what do i do with this package? what does this mean for the library? you asked about challenges, and i think that refers back to this earlier point: how do libraries find, acquire, and keep e-books . . . or do they keep them? they are licensing them, so again they are more like electronic journals. in some cases, libraries can buy e-books and can obtain perpetual access to them if this is within the terms of the contract.
portico (http://www.portico.org) is working with publishers and libraries to offer an enduring access option for both e-books and e-journals. another large issue is, if an e-book is self-published and our library acquires it, how do we keep it in a way that over time retains the original intent of the author in terms of structure and the technologies used to produce the book? your e-book is a good example. the formatting issues and the construct of the book need to be retained and preserved over time for the book to be understood. what are the implications for storage and preservation? libraries have not yet taken on the task of preserving e-books, though portico is beginning to look at this. it is similar to digital scholarship projects in that we need to think about how we keep scholarship in its original formats, with the original visual and navigation intents, with all the embedded tools over time.

aws: those issues of long-term usability are the reason that i chose to produce my e-book in standard formats (mobi/kf and epub) rather than as an app. apps may provide authors with more functionality and flexibility in the short-term, but they pose a challenge for preservation. authors should keep this in mind.

abw: access is another issue. access to e-books and e-book packages is complicated because some vendors allow only one user at a time as opposed to multiple users. there are a number of e-books being published outside mainstream publishing, but the small publishers do not yet have a model for selling or licensing to libraries. strelka press (http://www.strelka.com/en/press/books), for instance, offers a series of books on architecture and design as downloads for individuals. that model does not work for libraries.
and how do we as librarians work with someone like yourself, as the author of a self-published e-book? do we work with individual authors to determine these issues? is there an intermediary who takes on this role? or do libraries have an opportunity to shape the building of new models? another issue is whether or not libraries share e-books.

aws: like interlibrary loan?

abw: yes, but with e-books, most vendors will not allow that. if it is a self-published book that i “purchase,” then we may have more options for thinking about this.

aws: what do you mean by saying that most vendors will not allow that?

abw: the publishers put restrictions on how e-books are distributed within an institution. i can license the package which will allow x number of users access to an e-book simultaneously.

aws: are you saying that another university cannot borrow it for one user?

abw: generally not.

aws: that is terrible. think about what that means for access to knowledge, about what it means for scholarship. i use interlibrary loan all the time. libraries are used to sharing resources. so now they are being told by publishers and distributors that they cannot share resources anymore? most scholars cannot afford to travel to a distant library in order to read a book. this development is part of a larger issue of who controls access to knowledge.

abw: i also want to know what this means for collection development. but first, i have to understand what is and is not in e-book packages. if we are going to collect e-books, how do we reframe collection development in libraries to not just accept packages but to look at them critically? if you get a package of the safari e-books, which is mostly technical manuals, that is kind of a no-brainer. they are more reference-like. they are not monographs on individual artists. and i think that is a distinction that we need to consider.
our conversation is bringing up all sorts of issues concerning scholarship and knowledge sharing that i am not sure libraries and scholars are addressing, either separately or together. we are addressing them in the open access arena, but not in the e-book arena. libraries tend to focus on how we deal with our physical stacks in light of e-books and e-journals, but librarians on the ground are not yet fully engaged in the conversations about the implications for scholarship of the changes you and i are discussing. but we can see some new, library-led models developing, such as the university of michigan’s library publishing program.

. university of michigan library, “michigan publishing,” http://www.publishing.umich.edu/about/.

the promise and problems of the visual e-book

is self-publishing a model for others?

abw: would you advise other scholars to take the approach that you did? what does it mean to publish digitally or independently when you are an established scholar versus a non-established scholar?

aws: i would advise caution. there is still resistance to the e-book among many readers, including academics. and i would certainly not advise a young scholar to publish a book independently. however, there are alternative publishers that have emerged, such as new academia, which have peer review. critical peer review helps authors sharpen and strengthen their arguments, and peer-reviewed publications are generally given more weight in decisions about academic promotion and tenure.
independent publishing and e-publishing have the potential to transform research and scholarship in the visual arts and design because they enable more experimentation than a commercial or academic publisher might be willing to undertake. in order to experiment with the e-book medium, in order to keep the rights to my book, and in order to keep the price low and thereby increase dissemination, i decided to publish the book independently. but i already had a presence on amazon, in libraries, and i already had tenure. this is not something that i would advise an untenured faculty member to do.

abw: as a tenured faculty member you think it is ok to do independent e-book publishing, but you would never want young scholars to do that?

aws: no, not e-publishing only, and not on their own without going through a well-established publisher. peer review is so important, and, at some universities, there is pressure on junior faculty to publish with a handful of prominent presses. the presumption is that if the book is important, then it will be published at a prestigious press.

abw: you are saying that there is a hierarchy of the presses. do you think it is going to be important in the e-publishing, digital book world to change the perceptions of what is acceptable for tenure? do you envision a time when an e-book will be viewed as the equivalent of a published monograph in print form?

aws: yes, especially if it is published by a press at the high end of the hierarchy.

abw: will you publish your next book independently?

aws: probably not. i would rather be an author than a publisher. i have had great experiences with publishers. working with the university of chicago press on daring to look, from design and production to marketing and promotion, was a pleasure. i have missed that partnership, missed working with professionals who contributed their own talents to the book. publishers do a lot for authors.
all that i have learned over the past four years will make me a better partner with a publisher (and with librarians!). however, i do want to figure out a mutually acceptable agreement on e-rights and the pricing of e-books.

this brings up another issue. there is a growing trend for schools of design to assume the role of publisher. even individual departments of architecture or landscape architecture, for example, are publishing books. what began decades ago with student-edited journals or research monographs and then expanded more recently to studio reports has become, in some schools such as harvard, the university of pennsylvania, and mit, a full-fledged publishing enterprise that produces books by faculty, including untenured faculty. there is often no peer review and little, if any, professional copy-editing. the books may have an isbn, but no library of congress catalog number since most do not qualify for the loc cataloging in publication program. i believe that this is a response to the very same conditions in print publishing that led me to publish the eye is a door as an e-book. in its current form, such publishing is a very risky route to publication for untenured faculty since many, if not most, universities will not give the same weight to such publications in the promotion and tenure process as they give to peer-reviewed articles and books published by a well-established publisher. however, with the incorporation of a strong external peer review process, there is great potential in such ventures.

abw: this is indeed a growing trend that architecture librarians are witnessing. i think the trend is partly driven by the desire to publish design research, and partly because we have the technology to self-publish.
the self-publish model allows school-based publications to be quickly produced, using many images. the model does not require the costs of peer review or editing (though in some cases, there is some form of copy editing done), and the cost of producing images is much less. some schools use a commercial distributor to sell these books. and, as you say, it allows young academics to be published, which may help with the tenure and promotion process in some universities, but may pose a problem in others. if these books are not truly peer reviewed or edited, how does this model support young academics in the tenure process? a group of architecture librarians is looking at this topic, trying to assess what the collecting implications are for our libraries. we have begun by compiling a list of known school publications, and more work will be done over the course of the next year.

aws: this is similar to an issue that has long faced libraries that serve the design and planning professions: whether and how to collect professional reports published by private firms and public agencies. such reports often represent important contributions to practice and theory, and yet they are seldom collected by libraries. when i was a young assistant professor at the graduate school of design at harvard and a member of the library committee, i raised the issue with your predecessor, angela giral. she agreed with the importance of the issue, but pointed out the many difficulties in identifying, assessing, and collecting such publications.

abw: and yet, some of this material is, in fact, critical to design and planning libraries. these materials, too, are being looked at again by libraries, especially as they are published in digital form, and no one knows if they will be maintained over time on websites.

an alliance between librarians and authors

abw: tell me what you are thinking about scholars and librarians as allies.
aws: we are allied in the desire to make knowledge freely available. authors produce knowledge, and one of the missions of librarians is to help readers get access to knowledge. both authors and librarians operate in a world where knowledge (books, articles, databases) is increasingly owned by for-profit corporations.

one strong thread in our discussion is this increasing corporate ownership and control of the dissemination of knowledge—from those who develop the reading platforms to those who distribute and sell books and journals. electronic publishing has enabled and accelerated this trend at the same time that, paradoxically, it has facilitated the sharing of knowledge and information. so, there is a dark side to digital technology. at what point will the price charged for publications stifle the advance of knowledge? authors and librarians are both getting squeezed. authors are getting squeezed at the production end, and libraries are getting squeezed at the dissemination end. the issue of rights and the new publishing contracts also needs to be considered because it is the flip side of what some publishers are doing to librarians with e-books and subscriptions to e-journals.

abw: that is also part of the knowledge-control issue that is of concern.

aws: we are talking about an alliance between librarians and authors, but we should include the reader too. authors and librarians have sometimes found themselves in adversarial positions over the royalty issue, so i think we need to educate authors about how librarians are allies. i remember a librarian in my public library who was my trusted guide. i browsed the shelves, but i would never have found so many authors had she not led me to them.
librarians still fulfill that function for readers, and authors should not be so concerned about whether they get a royalty every time the book is checked out. they should be worried about their books not getting into libraries.

abw: a major issue we have identified is who controls knowledge.

aws: yes. who controls knowledge—the flow of knowledge, the access to knowledge. that flow is blocked when the e-book cannot be shared with another library, through interlibrary loan, or even with other readers.

abw: that is a reason it gives me pause when librarians, or library systems, start saying, “we are going to go all digital.” but we have not figured out all the problems and implications, not just for libraries, but for scholarship itself. we have not had these discussions.

aws: authors and librarians are natural allies. we always have been.

abw: i think that the idea of aligning and collaborating with authors is going to be part of the evolution of the role of the librarian. we have been primarily a service profession, but i think there are librarians who see that our futures are shifting and that our roles must change as well. i think that we have an opportunity for changing roles that includes working with authors and scholars to support the publishing of their books and educating scholars and graduate students about publishing and authors’ rights.

conclusion

the journey of publishing a richly illustrated e-book and the ensuing discussions between us have transformed our thinking. e-books will indeed transform both scholarship and libraries, but there are implications for authors/scholars, librarians, and scholarship in the academy. robert darnton wrote in his recent article in the new york review of books that “authors generally have one dominant desire—for their work to circulate freely through the public: and their interest coincides with the goals of the open access movement.” darnton discusses a movement among authors to make their books available online through non-profit distributors. this may also be a moment when university libraries can step into the publishing/distribution arena, as they have for open-access repositories.

the implications of collecting e-books in libraries also require further thought, discussion, and strategy. there are a couple of models that libraries have now: to license e-books through subscriptions, or to purchase and license perpetual access. much like e-journals, we are recognizing that there are limits to our use and sharing of e-books. this alone is a huge shift in how libraries think about collections. we do not “own” an e-book, and therefore our right to disseminate an e-book is limited to what a publisher will allow according to the licensing contract. these constraints go against the open access culture of libraries and the belief in the free dissemination of knowledge.

section of the us copyright act (the first sale doctrine) allows the owner of a copy of a published work to sell or otherwise dispose of that copy without permission from the copyright owner. however, the first sale doctrine applies to “material objects,” and a copyrighted digital book is not a material, physical object. additionally, the licensing of e-books, rather than selling them, suggests that under these licenses e-books cannot be shared, or loaned, or resold. by putting use constraints in license agreements for e-books, publishers are effectively taking away the rights of libraries to share, preserve, collect, and disseminate books to other patrons, and are also therefore limiting the dissemination of knowledge. from our perspective, this is a detriment to the advancement of knowledge.
librarians and authors need to develop a new model for publishing, and we need to think about the following questions. do we collect solely through licensed subscriptions? do we broaden our collecting outside of our vendors and begin collecting individual e-books directly from authors or vendors? will libraries take on the distribution of e-books? will libraries build their own e-book platforms, as is being planned in the connecticut state libraries and douglas county libraries? the library publishing coalition is one example in which academic libraries have formed their own group focused on publishing. or is this another arena in which we pursue deeper collaboration across institutions, and share collecting of e-books distributed through other venues such as the digital public library of america and hathitrust?

there is also an opportunity for the alliance between authors and librarians to develop into deeper relationships focused on the development of research and scholarship within our institutions. libraries could take the path of library-as-publisher, which would support scholars and their work from inside our institutions, as opposed to the current model where the scholarship taking place inside the institution is published outside the institution, then sold back to us. librarians themselves have the opportunity to work more closely with authors, through shared learning about copyright, authors’ rights, publishing contracts, and the opportunity to help lead the way in thinking about different dissemination models for authors’ works. these include some of robert darnton’s ideas referenced earlier, such as expanding open-access repositories. there are many library systems creating new roles, such as mit’s scholarly publishing librarian, and harvard’s copyright advisor (in the office for scholarly communication in the harvard library).

how these transformations happen will vary from institution to institution, and from discipline to discipline. some scholarly societies may keep their roles as publishers; some libraries may take on roles as advisors to authors; others may venture more deeply into publishing. one thing we do believe is that authors and librarians need to build alliances as a starting point.

. robert darnton, “a world digital library is coming true!” the new york review of books, may , , http://www.nybooks.com/articles/archives/ /may/ /world-digital-library-coming-true.

. library publishing coalition, http://librarypublishing.org.

acknowledgments

for valuable critique and suggestions, we would like to thank two anonymous reviewers and our colleagues at mit and harvard: patsy baudoin, ellen duranceau, gregory eow, kyle courtney, and scott wicks. thanks also to alexander brady for transcription of our conversations and for research on how e-books find readers. for encouragement, support, and felicitous editing, we are grateful to judy dyki.

information discovery in ambiguous zones of research

suzana sukovic

library trends, volume , number , summer , pp. - (article)

published by the johns hopkins university press

doi: . /lib. .
library trends, vol. , no. , summer (“digital books and the impact on libraries,” edited by peter brantley), pp. – (c) the board of trustees, university of illinois

abstract

electronic environments for information discovery are considered in relation to open-ended and dynamic research practices in the humanities, but a system suitable for these scholars would have many other applications as well. considerations of flexible electronic environments that would support research are based on the holistic view of information processes and the requirements that information systems enable connections, as well as the trustworthiness and authenticity of information. the proposed electronic environment consists of flexible networks of connections between information of different granularity. strong and weak information paths are established through use, which contributes to the development and informational value of the system. organizational support, as well as new forms of information provision and services, are required to enable novel approaches to information discovery and research.

research practices in the humanities have been a challenge for information systems. scholars’ unpredictable and dynamic research paths, the use of a variety of materials in any form and from any period, and particularly the subject matter of their work—human lives, artifacts, imagination, and creativity—remain elusive for any information system to capture. this article considers the possibility of developing an electronic environment that would enable information discovery in the humanities, but any system that is suitable for these scholars would have other academic and general applications.

considerations of possible electronic environments in this paper are based on findings from the literature and from a study into the roles of electronic texts in the humanities. this paper does not report research results of the study, but it occasionally refers to examples from the study.

the paper has four main sections. the first two provide a framework for the discussion about possible information systems by considering information processes and issues related to information discovery and use. on the basis of the ideas considered in the two sections, the third section proposes an approach to designing environments for information discovery, while the fourth overviews some issues of research support.

information processes

although “information” has achieved considerable prominence in the last decades, the word can refer to a number of different meanings in various disciplinary communities. the understanding of the concepts of “information” and “data” proposed here is derived from definitions provided by bates ( ; ), spink and saracevic ( ), and buckland ( ). information means a pattern of organization, which can be contained in any physical manifestation, and it is given meaning by a human being under certain contextual conditions. the concept of information includes the physical manifestation, the process of making sense of that information or “being informed,” and contextual considerations. data means information produced, selected, and/or assembled for further processing—specifically, for further research, in the context of scholarly work.

conscious rational information-processing has been traditionally a focus of attention, but there is a need to stress that the process of “giving meaning” to information includes conscious and unconscious processes as well as rational and emotional ways of knowing. in the nonlinear research practices in the humanities where serendipity has an important role in information discovery, researchers may be seeking information all the time at an unconscious level (cole, , p. ).
investigation of a large body of materials, common in an open-ended enquiry in the humanities, may rely on unconscious processing and on the development of insight as an important aspect of understanding. insight implies an unconscious phase of processing because it often means “the sudden emergence of an idea into conscious awareness” (schooler, fallshore, & fiore, , p. ). contrary to the views that give primacy to the language in the process of understanding, schooler et al. found that verbalizing may disrupt processes leading to insight.

holistic views of information processes suggest a significant role of affect, which was rarely investigated by information studies although the information literature has acknowledged the importance of feelings in information processes (kuhlthau, , , ; brooks, ). one aspect of affect and information-processing marked by a lack of understanding is esthetic emotions (scherer, ), which are particularly relevant in disciplines that deal with creative works.

in addition to physical and intellectual aspects, unconscious information-processing and a broad spectrum of emotions are sources of insights in humanists’ individualistic and dynamic research practices. research in electronic environments may promote some forms of nonverbal, sensory, and affective ways of knowing. the study into the roles of e-texts indicated that online interactions encourage a blurring of the boundaries between different media and formats. these interactions also may have some influence on blending between academic and creative modes of expression. fast interactions with multimedia are likely to stimulate sensory experiences as well as affective and creative responses to stimuli, which can further promote fusion between sensory, rational and affective ways of knowing.

information discovery and use

users in general, and humanists and social scientists in particular, conduct evolving searches.
bates ( ) called the way in which scholars start with a query, and then move to a variety of sources, constantly adjusting the query in small increments, a “berry-picking model.” bates found that the ability to access substantial qualities of information is very important in this type of searching, which develops through the selection of bits of information. key issues in evolving discovery concern the way in which systems provide connectivity and assure the trustworthiness of retrieved information, which can be selected for use.

systems of connections

the retrieval of large amounts of dispersed information enables different configurations of information. lyotard ( ) wrote about performativity that can come from arranging the data in a new way: “this new arrangement is usually achieved by connecting together series of data that were previously held to be independent. this capacity to articulate what used to be separate can be called imagination” (p. ). discovery of analogies was seen as the basis for creative thinking by ford ( ) and cory ( ). cory argued that a support for discovering analogies was a way to support research in the humanities.

mechanisms for establishing connections in the current systems have many limitations. brockman et al. ( , p. ) suggested that libraries needed to do much more “to assemble information resources in a way that allows scholars to search across them, rather than digging down into separate, exclusive ‘silos’. . . .” the current retrieval systems often limit the discovery of connections by maintaining outdated divisions. palmer and malone ( ) showed that subject descriptions inhibited access to and isolated knowledge about women and women’s work by removing connections with a wider body of knowledge, which was replicated on the internet.
jakubowicz ( ) wrote that a fundamental problem in digital research was a separation between “a) the collection, collation, manipulation and preservation of data and information, and b) the transformation of information into knowledge through the application of human creativity and its dissemination through new global information networks” (“conclusions,” para. ).

the critique of hierarchical systems that isolate information and impose certain ways of thinking is often related to deleuze and guattari, who contrasted rhizomes and trees as metaphors for two different systems.

in models that correspond to hierarchical arborescent systems an element only receives information from a higher unit, and only receives a subjective affection along preestablished paths. this is evident in current problems in information science and computer science, which still cling to the oldest modes of thought in that they grant all power to a memory or central organ. (deleuze & guattari, , p. )

a rhizomatic system, on the other hand, does not work in hierarchical structures and allows full connectivity: “it brings into play very different regimes of signs, and even nonsign states. . . . it is composed not of units but of dimensions, or rather directions in motion” (deleuze & guattari, , p. ).

rhizomatic structures are more akin to the way the human nervous system works. bush ( ) contrasted an artificial system of indexing based on hierarchical structures and established paths with the way in which the human mind works:

it operates by association. with one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. it has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. (bush, , section , para.
)

although bush wrote during the time before personal computers, the problems with hierarchical information systems remained in new electronic environments. burnett and mckinley ( , p. ) proposed that the “rhizomorphic model of information contexts better accounts for both the richness and the chaos encountered in seeking information . . .” at the same time, hierarchical systems have a long tradition of aiding information retrieval more or less efficiently.

although flexible and open systems are needed for discovery, different levels of control over the system are still required and desirable. liu summarized the gist of objections to open nonhierarchical models of information systems: “while knowledge workers may vote for rhizomatic democracy in principle, they also want firewalls for their personal computers; and they kill bermuda grass on their lawns” (liu, , pp. – ).

in the context of scholarly research, the assurance of quality and authority is particularly important. while scholars need to work in an environment that can provide exploration and discovery, they also need a clear understanding of the provenance and quality of information that will become their research data.

reliability and authenticity

considering the meaning of authenticity, bearman and trant wrote:

at its extremes, authenticity carries with it all the philosophical problems of truth, but here we will try to confine the assertion that something is “authentic” to a number of more “provable” claims: that it is unaltered from the original; that it is what it purports to be; and/or that its representation is transparent (the rules are stated and, possibly, reversible). ( , ii asserting and assessing authenticity, para. )

the authors stressed that convincing scholarly arguments depended on judgments about authenticity of source materials—their origin, completeness, and internal integrity.
electronic documents are sometimes perceived as untrustworthy because they can be published by anyone and because forgeries are much easier. on the other hand, electronic representations of a hard copy provide minute details, which are not accessible to most people or which cannot be seen by the naked eye. the recent digitization of leonardo da vinci’s the last supper ( ) in sixteen billion pixels is an example of an electronic representation that provides details inaccessible to viewers of the original.

an obstacle in using and publicly acknowledging rigorously developed electronic sources is the lack of widely accepted criteria for evaluation. the quality of print editions varies significantly, but scholars regularly use a set of criteria to judge the quality of these editions. critically important for judgment of the authenticity of an electronic document are its provenance and a detailed declaration of transformation identifiable in metadata (gladney & bennett, ), but this information is not always available.

while systematic and exhaustive documentation would have a significant role in assuring reliability, it is unlikely to resolve all different requirements for authenticity. the electronic copy of the last supper, for example, provides remarkable detail, but it cannot replace the experience of seeing the original. authenticity of electronic editions is often judged by print editions, even when electronic editions provide unique functionality. this is not the case with electronic projects such as electra, which presented some women’s works for the first time: “in this instance it offers an unlikely route into authenticity, or rather to that earlier moment of inauthenticity which as editors and textual critics we decided to label the real thing” (sutherland, , p. ).
an important question is who actually presented not only an electronic copy but also the “real thing,” if they are not the same, and how they have been presented. the official authority of the author, publisher, or a curating institution is often important in establishing the trustworthiness of information, but it does not necessarily guarantee authenticity. as haraway pointed out, representation is rarely a reproduction. haraway used an example of the jaguar and fetus, which cannot represent themselves: “both the jaguar and the fetus are carved out of one collective entity and relocated in another, where they are reconstituted as objects of a particular kind—as the ground of a representational practice that forever authorizes the ventriloquist” (haraway, , p. ). when haraway questioned the right of a scientist to represent “the nature,” she questioned representation authorities. the question is particularly acute in the framework of electronic environments in which a variety of representations with different origins keeps open questions of who represented something, in what way and for what purposes.

environments for discovery

like any environment, the electronic information environment provides a context in which information processes happen. this environment can be seen in terms of nardi and o’day’s information ecology characterized by “a complex system of parts and relationships. it exhibits diversity and experiences continual evolution. different parts of an ecology coevolve, changing together according to the relationships in the system” (nardi & o’day, , characterizing information ecologies, para. ). the environment in this sense is local and defined by an individual circle of interactions and interests.
although an electronic environment can consist of several software programs, documents on a person’s computer, and a few online correspondents, the focus here is on larger, usually online, environments, which can include several databases and tools, or the entire internet.

the design of electronic environments for information discovery suggested here is based on the following premises:

1. anything in an information system that can be informative to a person is information, including the whole document and its various aspects.
2. design for dynamic research has to integrate multisensory experiences and different ways of working that enable rational and affective ways of knowing.
3. scholarly research requires integration of a variety of sources, formats, and media as well as a provision for information use.
4. a goal is the interaction between the researcher and information rather than interaction with the system.
5. flexibility and openness, and control and limitation are both needed.

the evolution of the proposed electronic environment is realized through flexible networks of connections that grow and change with use.

information network

information about a phenomenon or entity is dispersed and contained in many different forms. for scholars in the humanities, it is also contained in many different forms of representation, such as different editions of a text or variants of a manuscript. the image below (figure ) illustrates the complexity of establishing connections between information contained in different forms.

figure . about the network of ambiguous zones of a lemon (arakawa & gins, , p. )

for computer systems, the image above illustrates challenges of representing and connecting numerous entities and their ambiguous zones of meaning.
three main aspects of the challenge concern (a) trustworthiness of representation, (b) establishing connections between related information and metadata, and (c) identification of zones of meaning, which are nothing else but patterns of organization of information. the first challenge is to establish the meaning of the claims that something is a true representation. the last two issues relate to the identification and linking of all potentially relevant information.

trustworthiness

an essential step in establishing the trustworthiness of a representation is documenting that labels such as “photo of a lemon” or “image of a page” are true or, more often, in which way they are true. documentation about representation processes and detailed metadata are often needed to demonstrate that the representation and represented are identical in every important way, so the trustworthiness of a document can be based on the strength of evidence that the representation is what it claims to be.

a more complex issue concerns the context of representation. scholars in the study of interactions with e-texts commented on the absence of information about many nations and cultures that do not belong to the dominant few. there is also a question concerning who represents smaller and/or less powerful groups and cultures even when they are present online. the question is pertinent to evaluating artifacts from other cultures in western digital collections. trustworthiness and authenticity of representation are then cultural and political issues, and as such, significant as topics for scholarly investigation.

a comparison of different representations is a way of establishing what and how they represent. inclusion of a variety of representations with different origins is a powerful way of strengthening the trustworthiness of the system.
for example, a high-quality representation of a literary manuscript on a library website can provide accurate details of the original document. the same document on the websites of an alternative acting group and a local historical society provides insights into the cultural framing of the manuscript.

mistakes and omissions in different representations can have informational value. very often, characteristics of a particular representation can be assessed only by comparison with other representations. the meaning of a purple lemon is constructed in comparison with numerous representations of the yellow fruit.

connections among information and metadata

in order to enable investigation of the complexity of meanings and their relationships, it is necessary to establish connections between representations and related information. if the question is difficult in any situation, it is particularly complex in the humanities in which every text and its smallest part can be associated with a variety of meanings and other texts. figure illustrates the difficulty of the task, but it also suggests that a solution may exist in establishing as many connections between information as possible and naming them through extensive metadata produced by humans and machines.

a challenge is that the difference between information and metadata is not necessarily clear. data was defined earlier as information selected for further processing. metadata means “data that describes other data” or, simply, “data about data.” metadata provide secondary information about data and, in information jargon, they usually refer to distinct forms such as bibliographic records, or dublin core metadata and the tei (text encoding initiative) headers inserted in electronic records.
from the perspective of an information professional who works with electronic media, differentiation between information and different metadata is relevant, but it does not address the complexity of their relationships. the study of scholars’ interactions with e-texts suggested that bibliographic records can become an integral part of the interaction with the text or, in some situations, they can become primary data themselves, so formal metadata does not necessarily have a distinct function of secondary information.

furthermore, metadata do not have to be formal records. if a poem provides information, its bibliographic record is formal metadata, but metadata can also be anything that gives information about the poem. an essay, a song or a commentary in a blog written in response to the poem are also metadata. one of the participants in the study into the roles of e-texts talked about “poetic metadata,” indicating that creative works can provide secondary information or metadata. poetic metadata is then a special form of a secondary descriptor. a record attached to a preserved lemon in a botanical museum, and a painting of a purple lemon can both provide secondary information about the exhibit. different media, formats, and genres can play the role of metadata. the same text can be either information or metadata depending on the context (figure ).

figure . information & data, meta-information & metadata

it could be useful to distinguish between forms of primary and secondary information as well as between unselected and selected information:

• information is any pattern of organization and data is information selected for further processing.
• meta-information is any secondary information, and metadata is selected and/or processed meta-information.

the proposed distinction between information and meta-information, and between data and metadata, can assist in handling large amounts of information and their descriptors.
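the four terms above can be read as two yes/no questions: is the item primary or secondary, and has it been selected for further processing? the following python sketch is only an illustration of that reading, not part of the proposed system; the `Item` class, its `about` and `selected` fields, and the `category` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Item:
    """One piece of content in a hypothetical information system."""
    label: str
    about: Optional[str] = None  # label of the item this one describes, if any
    selected: bool = False       # True once selected and/or processed (assumption: one flag covers both)


def category(item: Item) -> str:
    """Map an item onto the four terms defined in the text."""
    secondary = item.about is not None
    if secondary:
        return "metadata" if item.selected else "meta-information"
    return "data" if item.selected else "information"


poem = Item("poem")
record = Item("bibliographic record", about="poem", selected=True)
essay_alone = Item("essay")                    # read for its own sake
essay_on_poem = Item("essay", about="poem")    # same text, linked to the poem

print(category(poem))           # information
print(category(record))         # metadata
print(category(essay_alone))    # information
print(category(essay_on_poem))  # meta-information
```

the two `essay` items show the point made above: the same text counts as information or as meta-information depending on the context in which it is used.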
one way of approaching the problem of the enormous number of possible links is through the analogy with the human nervous system, which deals with billions of possible connections by strengthening used paths. if information and meta-information refer to informative potential and possible connections, data and metadata refer to selected information and used paths. like the nervous system, the information network can promote the differentiation of potentially strong and used paths from rarely used or unestablished paths.

connections strengthened through use can serve as a constantly developing guide through the system where both strong and weak connections may be required by the user. although needed in many search situations, the used paths are not necessarily the most desirable ones. weak connections may be more relevant for research purposes. if the researcher wants to find all instances of a hidden lemon, presented as weakly connected and on the margins of figure , established paths may be used to reduce the number of options by excluding strong and central connections. well-established paths can also provide reference points so they can be used to direct searching outside strongly connected areas. all types of paths can aid the researcher’s investigation of patterns of connections.

zones of meaning and granularity of information

identification of the zones of meaning as a challenge in establishing information networks relates to the granularity of information. in order to achieve informativeness of all aspects of representation, information has to be presented on different levels: the physical document and its context as well as the content and its parts. the information profession usually deals with representations on the document level. at this stage, the informativeness of the whole document is usually described by bibliographic details. the provision of context develops through the provision of materials and links, which can contextualize documents.
this is a good beginning, but in order to study the lemon, the user has to be able to identify and access representations of its seeds. hockey ( ) referred to mccarty’s idea of morselization of information, which would identify little morsels of information, each with its own metadata. connections between a wide variety of information and meta-information of different granularity have the potential to provide powerful information retrieval and linking as well as to allow manipulation of small segments for use.

working with different levels of granularity of information poses significant challenges in retrieval and selection of vast amounts of information. bates ( ) wrote about different types of information and suggested the development of information genres. bates referred to ingarden and trosborg when she proposed that “a given genre can be seen to be an expression of, and a vehicle for, a particular kind of communication” ( , p. ). with a broadened understanding of information and metadata required for more powerful and more flexible systems, the idea of information types and genres provides a way of dealing with complexity. the distinction between information and meta-information, and between data and metadata, is a step in that direction. further differentiation between forms of metadata, such as formal versus interpretive or analytical versus poetic, may be the next step.

on a lemon trail

the personal development of understanding and meanings of the whole information system can grow together through different configurations of information. an example of a relatively simple research path may serve as an illustration of how the system could work for a researcher in the humanities. the scholar would be able to identify large bundles and small morsels of information and meta-information, and then select them for further research and manipulation. the sources would be integrated to allow the scholar to establish her/his own path.
information would be retrieved by word, shape, color, sound, and, some way down the track, by smell and touch. while searching, the scholar would apply different filtering systems to target particular types of information and follow well-established or previously rarely used paths.

if the researcher wants to study the history of the use of lemons, they can decide to start from academic digital libraries to look at digitized manuscripts of diaries, which describe past travels by ship when scurvy occurred; find references to lemons in medical treatises from the mediterranean area and china through history with parallel translations; retrieve medical information about scurvy today; browse discussions of young people about the use of lemons during self-imposed diets; combine all different information about the taste and appearance of lemons, including images, songs, and descriptions in the literature; exchange opinions with various people on the way and leave comments online. a perspective for each combination of information could be reconfigured so the researcher could look at information from a particular disciplinary point, consider a period in time or focus on one of the senses. the researcher would be able to select or exclude filters to browse information about lemons “in the wild.”

while doing the search, the researcher would establish some connections for the first time and strengthen others. comments, evaluations, discussions, and collaboration, as well as new products created by the scholar, would all contribute to the constantly developing information environment.

allowing the system to trace someone’s path, even anonymously, can be potentially problematic, so a number of issues have to be addressed for that to happen. one of them is that the system has to document its representation of strong and weak links between information.
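the idea that connections are established and strengthened through use, and that the system documents which links are strong and which are weak, can be illustrated with a small python sketch. this is a toy model under stated assumptions, not a design from the article: the `PathNetwork` class, its method names, and the use of a simple traversal count as link strength are all hypothetical.

```python
class PathNetwork:
    """Toy network whose links grow stronger each time they are followed."""

    def __init__(self):
        # weight[(a, b)] counts how often the link a -> b has been used
        self.weight = {}

    def link(self, a, b):
        """Register a possible connection without using it (strength 0)."""
        self.weight.setdefault((a, b), 0)

    def follow(self, path):
        """Record one traversal, strengthening every link along the path."""
        for a, b in zip(path, path[1:]):
            self.weight[(a, b)] = self.weight.get((a, b), 0) + 1

    def links_from(self, node, max_weight=None):
        """Return (target, strength) pairs for links leaving `node`, strongest first.

        Passing `max_weight` keeps only links used at most that often,
        which is one way to surface weak or unestablished connections.
        """
        found = [(b, w) for (a, b), w in self.weight.items() if a == node]
        if max_weight is not None:
            found = [(b, w) for b, w in found if w <= max_weight]
        return sorted(found, key=lambda pair: (-pair[1], pair[0]))


net = PathNetwork()
net.link("lemon", "still-life painting")            # possible but unused link
net.follow(["lemon", "scurvy", "ship diaries"])
net.follow(["lemon", "scurvy", "medical treatises"])

print(net.links_from("lemon"))
# [('scurvy', 2), ('still-life painting', 0)]
print(net.links_from("lemon", max_weight=0))
# [('still-life painting', 0)]
```

filtering by `max_weight` corresponds to using established paths to exclude strong, central connections when the researcher is looking for weakly connected material.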
the researcher will not want to leave any visible trace of an innovative information path if individual originality is the most important measure of scholarly achievement, but scholarship may be measured by its contribution to the information environment. in this case, scholars would want to keep records of their own information passage to learn from it and select parts that they would include in an electronic portfolio to demonstrate their own contributions to the information environment. the potential of an open dynamic system of this kind is in the user-directed growth and a degree of self-maintenance balanced by a professional involvement in ensuring some regulation and goal-oriented development of the system.

root or rhizome

the need for associative ability and flexibility of the network, as well as the need for some control and structure, suggest that both root-like and rhizomatic structures have their advantages. very importantly, they are not mutually exclusive. as the originators of the idea of rhizomatic structure suggested, a rhizome can be entered through the root-tree: “a new rhizome may form in the heart of a tree, the hollow of a root, the crook of a branch. or else it is a microscopic element of the root-tree, a radicle, that gets rhizome production going” (deleuze & guattari, , p. ).

an information system can follow the arborescent structure of the natural lemon tree as well as the rhizomatic structure of an imaginary red-leaved plant on which blue lemons grow. these two structures can complement each other or be exchanged as required. just as some computer applications allow the user to select different representation models to view data, a system could be designed to allow a hierarchical or rhizomatic approach on demand. a user-directed selection of structure in addition to various options for filtering information would be part of the system’s flexibility.
a selection of hierarchical and nonhierarchical approaches in addition to the morselization of information, the removal of artificial boundaries between information and metadata and availability of different levels of filtering would give a great deal of control to the user.

research support

significant assistance is required to ensure that scholars are able to take full advantage of electronic environments. this article cannot address the complexity of issues involved in providing recognition and support for digital scholarship, nor can it consider research education, but it points toward some aspects of support required of academic organizations and the information profession.

organizational support

in order to find novel approaches, the researcher needs time and space to experiment. however, time is a scarce resource for most scholars. participants in the study into the roles of e-texts often commented that younger generations of researchers were better suited for work with electronic media. the observation may be correct, but the reality of building an academic career makes early and mid-career researchers the least likely to spend time on exploration. job demands and criteria for evaluating scholarship influence research approaches, particularly when researchers are at earlier stages of their careers.

organizational culture may also encourage some types of research by providing conditions for certain choices. considering that feelings are part of cognition, not just an accidental part of academics’ lives that they carry with them to information processes, it is possible that the way academics feel at work has had some impact on their research. as investigations of affect indicated (chartrand, van baaren, & bargh, ; damasio, , p. ), negative emotions may not impede researchers’ ability to analyze and observe, but they are likely to have a negative effect on creative and exploratory approaches.
working conditions and managerial styles promote organizational cultures in which employees share similar feelings. relatively recent studies have confirmed what many managers of knowledge organizations already know: the way researchers feel at work is likely to have some impact on their creativity and, consequently, on the way they use information systems.

support by information professionals

information professionals in general and librarians in research libraries in particular can provide support by being involved in information provision, which includes dealing with a variety of information of different granularity, and by developing information services suitable for research in electronic environments.

information provision

a wide range of materials is of critical importance for humanities research, but the proliferation of information sources has made the task of comprehensive information provision increasingly difficult for any single collection or institution. a variety of materials has been traditionally used in scholarly research, but researchers increasingly find valuable information in nonacademic online sources, which usually do not satisfy library criteria for preservation and description. while research libraries cannot work with all online sources, they need to find novel ways in which they can aid integration of sources, and reconsider divisions on which they base their collection development and information provision.

enabling access to information with different levels of granularity requires significant professional involvement. some research projects in the humanities provided valuable sources by working with one particular text or with a thematic collection in which they identified and interpreted information of fine granularity. however, projects of this sort cannot provide access to large bundles and small morsels of information on a large scale.
this is work that has to be done systematically by information professionals from the moment of conceptual design of information systems to decisions about treatment of the document content. the involvement of research libraries in providing information in different media and formats is critically important to ensure the transfer and application of valuable library knowledge and skills to developing electronic environments.

information services

scholars require individual and highly specialized services to provide consultation about issues, resources, and tools in a particular project. these services require time and librarians’ specialization that is beyond the means of most individual research libraries. however, large cooperative initiatives in provision of online services would be able to respond to researchers’ needs for specialized individual assistance.

verbal communication, from help files to reference services that require reference interviews, has been the norm in the information field. although verbal communication will continue to have its role in service provision, new forms of support for information discovery and insight will be required. work in interactive environments with multimedia encourages nonlinguistic ways of knowing and expression, which have to be supported in similar ways. the current knowledge about information processes beyond conscious rational processing that allows verbalization is very limited. research in this area will provide the basis for much-needed innovation in information services.

conclusion

the growing recognition that different types of information do not exist in separate divisions is part of a broader interest in connections and mutual influences, characteristic of contemporary thinking. dynamic, open, and often unpredictable research in the humanities emphasizes the importance of connectedness.
at the same time, these research practices put high demands on electronic information systems, but they also highlight the nature of information processes and set goals for the development of information systems.

in order to provide integrated electronic environments with desirable aspects of ecological connectedness and growth, information professionals, academic institutions, and other actors that shape information systems have to clarify the meaning and relevance of the existing divisions as well as ways of satisfying different interests without imposing obstacles on the user. integration is necessary to allow information discovery, which is essential in academic research as well as in many other areas. an electronic environment that is rich enough to provide a sufficient variety and amount of information, flexible enough to enable individual discovery, but managed and ordered in a way that prevents chaos and accommodates changeable requirements for quality will be suitable for scholars as well as for everyone else.

references

arakawa, s., & gins, m. ( ). the mechanism of meaning ( rd ed.). new york: abbeville press.
bates, m. j. ( ). the design of browsing and berrypicking techniques for the online search interface. online review, ( ), – .
bates, m. j. ( ). information and knowledge: an evolutionary framework for information science. information research, ( ), paper . retrieved september , , from http://informationr.net/ir/ - /paper .html
bates, m. j. ( ). fundamental forms of information. journal of the american society for information science and technology, ( ), – .
bearman, d., & trant, j. ( ). authenticity of digital resources. d-lib magazine. retrieved september , , from http://dlib.anu.edu.au/dlib/june / contents.html
brockman, w. s., neumann, l., palmer, c. l., & tidline, t. j. ( ). scholarly work in the humanities and the evolving information environment (no. pub ).
washington, dc: digital library foundation, council on library and information resources. retrieved september , , from http://www.clir.org/pubs/abstract/pub abst.html
brooks, b. ( ). the foundations of information science. part i: philosophical aspects. journal of information science, , – .
buckland, m. k. ( ). information and information systems. new york: greenwood press.
burnett, k., & mckinley, g. e. ( ). modelling information seeking. interacting with computers, , – .
bush, v. ( , july). as we may think. the atlantic monthly. retrieved september , , from http://www.theatlantic.com/doc/ /bush
chartrand, t. l., van baaren, r. b., & bargh, j. a. ( ). linking automatic evaluation to mood and information processing style: consequences for experienced affect, information processing, and stereotyping. journal of experimental psychology: general, ( ), – .
cole, c. ( ). information as process: the difference between corroborating evidence and “information” in humanistic research domains. information processing & management, ( ), – .
cory, k. a. ( ). discovering hidden analogies in an online humanities database. library trends, ( ), – .
da vinci, leonardo. ( ). the last supper [digital image]: ministry of cultural heritage and activities, superintendency for architectural and natural heritages of milan; hal . retrieved september , , from http://www.haltadefinizione.com/en/
damasio, a. r. ( ). descartes’ error: emotion, reason, and the human brain (repr. ed.). new york: quill.
deleuze, g., & guattari, f. ( ). a thousand plateaus: capitalism and schizophrenia (b. massumi, trans.). london: continuum. mille plateaux, volume of capitalisme et schizophrénie, .
ford, n. ( ). information retrieval and creativity: towards support for the original thinker. journal of documentation, ( ), – .
gladney, h. m., & bennett, j. l. ( ). what do we mean by authentic?: what is the real mccoy? d-lib magazine, ( / ).
retrieved september , , from http://dlib.anu.edu.au/dlib/july /gladney/ gladney.html
haraway, d. ( ). the promises of monsters: a regenerative politics for inappropriate/d others. in l. grossberg, p. a. treichler & c. nelson (eds.), cultural studies (pp. – ). new york: routledge.
hockey, s. ( ). digital resources in the humanities: why is digital information different? on the third lecture of the series twenty-first century curation (sound recording). chadwick lecture theatre, university college london. retrieved september , , from http://www.slais.ucl.ac.uk/c /hockey/index.html
jakubowicz, a. ( ). bridging the mire between e-research and e-publishing for multimedia digital scholarship in the humanities and social sciences: an australian case study. webology, ( ). retrieved september , , from http://www.webology.ir/ /v n /a .html
kuhlthau, c. c. ( ). developing a model of the library search process: cognitive and affective aspects. rq, ( ), ( ).
kuhlthau, c. c. ( ). a principle of uncertainty for information seeking. journal of documentation, ( ), – .
kuhlthau, c. c. ( , february/march). accommodating the user’s information search process: challenges for information retrieval system designers. bulletin of the american society for information science, – .
liu, a. ( ). the laws of cool: knowledge work and the culture of information. chicago: university of chicago press.
lyotard, j.-f. ( ). the postmodern condition: a report on knowledge. manchester: manchester university press.
nardi, b. a., & o’day, v. ( ). chapter four: information ecologies. first monday, ( ). retrieved september , , from http://www.firstmonday.org/issues/issue _ / nardi_chapter .html
palmer, c. l., & malone, c. k. ( ). elaborate isolation: metastructures of knowledge about women. the information society, ( ), – .
scherer, k. r. ( ). introduction: cognitive components of emotion. in r. j. davidson (ed.), handbook of affective sciences (pp. – ).
cary, nc: oxford university press.
schooler, j. w., fallshore, m., & fiore, s. m. ( ). epilogue: putting insight into perspective. in r. j. sternberg & j. e. davidson (eds.), the nature of insight (pp. – ). cambridge, ma.: mit press.
spink, a., & saracevic, t. ( ). human-computer interaction in information retrieval: nature and manifestations of feedback. interacting with computers: the interdisciplinary journal of human-computer interaction, ( ), – .
sutherland, k. ( ). challenging assumptions: women writers and new technology. in w. chernaik, c. davis, & m. deegan (eds.), the politics of the electronic texts. oxford: office for humanities communication publications with the centre for english studies, university of london.

suzana sukovic is program coordinator in the digital innovation unit for the humanities and social sciences, the university of sydney. previously she taught at the university of technology, sydney and worked as a librarian in academic libraries, including the rare book and special collections library at the university of sydney. suzana has published journal articles and presented papers on electronic texts and research practices in the humanities. her doctoral thesis explored roles of electronic texts in projects in the humanities.

across canada, across disciplines: research data management practices and needs in the social sciences and humanities

leanne trimble, dylanne dearborn, tatiana zaraiskaya, jane burpee, eugene barsky, catie sahadath, melissa cheung, marjorie mitchell, matthew gertler

university of toronto, queen's university, mcgill university, university of british columbia, university of ottawa

iassist

context

who is involved?
methods - data collection and analysis

"clipboard" icon by annette spithoven from noun project https://thenounproject.com/term/clipboard/ /

demographics

respondents by institution n= 
breakdown by discipline n= 
respondents by rank n= 

working with research data

storage volume, by number of research projects
types of research data generated
storage media used

documentation & description of data

n= 
n= 

is there sufficient documentation and description (e.g., variable and field definitions, codebooks, data dictionaries, metadata, scripts to run) for another person outside your research team to:

how long data is kept?

sharing research data

current sharing methods vs future sharing methods

sharing restrictions

● “privacy or ethics restrictions” ( . %) was the most identified restriction
  ○ highest in education ( . %) and the social sciences (ss) ( . %)
● “needing to apply for a patent” was only selected by one researcher (ss)
● low number of responses identifying “commercial concerns”
  ○ only business/management ( . %) and law ( . %) had higher response rates
● low across board for “public safety/sensitive data” (all <= %)

perceived benefits & reasons for not sharing

funding mandates & rdm services

drafting a data management plan n= 

level of interest in services from libraries

digital humanities/ digital scholarship

digital humanities or digital scholarship (ds), can be defined as the collection and use of digital research data (either through digitization of print resources, or using born-digital resources) combined with methodologies from traditional humanities and social science scholarship.

do you feel your research falls under this definition?
yes: % (n= )
no: % (n= )
not sure: % (n= )

nature of digital scholars’ data

data topics taught in digital scholarship

summary: key findings

● data being produced:
  ○ most commonly text; small storage sizes (i.e.
requirements are not complex) ● knowledge gaps: ○ data storage: range of non-optimal storage options in use ○ over half were not confident about the quality of their documentation ● sharing: ○ plan to share more in the future than they do now ○ privacy/ethics most common reason for not sharing ○ strong awareness of the benefits of sharing ● interest in support from libraries: ○ % would like/need some support in drafting dmps ○ large proportion of respondents expressed interest in services from the library in general ● some variance in disciplinary responses - useful for targeting services lessons learned ● conducting a survey with both local and consortial goals can be a challenge ○ survey design must consider local and group needs ○ group data management planning ○ wrangling data from different institutions ○ communication with partners is key ● portage clearinghouse aims to improve this process for additional institutions joining the project humanities & social sciences engineering & science health & medical sciences future steps... survey templates, data and reports will be made available on the portage website how the data can be used ● informing institutional decisions, e.g. shaping services ● informing national initiatives, e.g portage, funding agencies data, templates and reports available: https://goo.gl/vd hty hosted by portage, https://portagenetwork.ca/ https://goo.gl/vd hty https://goo.gl/vd hty https://portagenetwork.ca/ acknowledgements - data contributors ● queen’s university ○ alexandra cooper ● university of british columbia ○ sheryl adam, megan meredith-lobay ● university of ottawa ○ jessica mcewan, patrick labelle ● university of toronto ○ leslie barnes, nadia muhe ● university of waterloo ○ kathy szigeti, sandra keys questions? digital humanities within a global context: creating borderlands of localized expression fudan journal of the humanities and social sciences issn - fudan j. hum. soc. sci. doi . 
/s - - -
original paper
digital humanities within a global context: creating borderlands of localized expression
amy e. earhart
received: february / accepted: march
© fudan university
abstract as scholars have begun the digitization of the world’s cultural materials, the understanding of what is to be digitized and how that digitization occurs remains narrowly imagined, with a distinct bias toward north american and european notions of culture, value and ownership. humanists are well aware that cultural knowledge, aesthetic value and copyright/ownership are not monolithic, yet digital humanities work often expects the replication of narrow ideas of such. drawing on the growing body of scholarship that situates the digital humanities in a broad global context, this paper points to areas of tension within the field and posits ways that digital humanities practitioners might resist such moves to homogenize the field. working within the framework of border studies, the paper considers how working across national barriers might further digital humanities work.
finally, ideas of ownership and/or copyright are specific to the country of origin and, as such, deserve careful attention. while open access is appealing in many digital humanities projects, it is not always appropriate, as work with indigenous cultural artifacts has revealed.
keywords digital humanities · global · borderlands · transnational
amy e. earhart, aearhart@tamu.edu, department of english, texas a&m university, tamu, college station, tx - , usa

as scholars have begun the digitization of the world’s cultural materials, the understanding of what is to be digitized and how that digitization occurs, of how we utilize technology, of infrastructures of academic digital humanities (dh), remains narrowly imagined, with a distinct bias toward north american and european notions of culture, value and ownership. humanists are well aware that cultural knowledge, academic infrastructures and copyright/ownership are not monolithic, yet digital humanities disciplinary structures often expect the replication of narrow ideas of such. katherine hayles predicts an entanglement of codes within a global environment, noting that ‘‘as the worldview of code assumes comparable importance to the worldviews of speech and writing, the problematics of interaction between them grow more complex and entangled’’ ( , ). the multiplicity of codes as expressed within global environments brings a largely ignored complexity to digital humanities and code studies and necessitates scholarship to interpret and critique such codes. while digital humanities is global, those of us practicing digital humanities continue to work within, and to replicate, localized academic structures.
while we might have come to terms intellectually with the notion that our scholarship is looking outward, that we are increasingly called upon to view our work within a complex web of global academic conversations, individual academics remain caught within nationally bound structures of academia, making a globalized construction of scholarship that values disparate forms of digital humanities incredibly difficult to realize. as digital humanists imagine the ways that our community of scholars across the world might engage, we have the opportunity to construct a collaborative environment that models the best of such interactions. efforts are well underway. models range from a big tent approach, an umbrella model that pulls together all such efforts, to a networked set of nodes. yet, as global interaction among digital humanists grows, it has revealed tension regarding the way in which digital humanists engage with each other. rather than initiating a one-size-fits-all global model, we need to imagine a global digital humanities that lives in the borderlands, a place of connection and contradiction and, most importantly, a place that does not try to centralize itself. recognizing that monolithic models of digital humanities are unproductive, digital humanists have begun to discuss how we might create academic infrastructures, such as organizations, conferences and journals, that fully account for the diversity of practice. early organizations such as go::dh, global outlook::digital humanities, are leaders in the expansion of such infrastructure. developed to ‘‘break down barriers that hinder communication and collaboration among researchers and students of the digital arts, humanities, and cultural heritage sectors in high, mid, and low income economies’’ (go::dh ), go::dh has become a special interest group (sig) affiliated with the largest digital humanities organization in the world, the alliance of digital humanities organizations or adho.
work by members of go::dh and others within adho has helped to make building ‘‘global digital humanities networks’’ one of the priorities of adho. adho has also been working to expand membership, constituent organizations and cultural and linguistic difference within its organization. other co-partners of adho include centernet, constructed as ‘‘an international network of digital humanities centers formed for cooperative and collaborative action to benefit digital humanities and allied fields in general, and centers as humanities cyberinfrastructure in particular.’’ emphasizing inclusivity, the organization views itself as a ‘‘big tent,’’ extending a welcome to all who self-define as digital humanities. while centernet is an international network with expansive goals, it remains limited in representation. many countries that are actively producing digital humanities work, such as india, are not included in the network. only two centers in africa are included, though excellent digital humanities work across asia is underway. clearly the largest digital humanities organizations in the world are trying to articulate the way by which they might encourage a global discussion of digital humanities, but remain limited in their success. digital humanities as a structural entity has coalesced around the adho yearly conference. since digital humanists have gathered for the annual conference, imagined as international in scope. originally the conference rotated between north america and europe, but in order to encourage international participants the conference has begun to meet in wide-ranging locations; it has moved from its original canadian/us/western europe locations to greater parts of europe and the americas, such as poland and mexico.
created under the umbrella of adho, the organization includes the european association for digital humanities (eadh); the association for computers and the humanities (ach), predominantly an americas organization; canadian society for digital humanities/société canadienne des humanités numériques (csdh/schn); centernet; australasian association for digital humanities (aadh); japanese association for digital humanities (jadh); and humanistica, l’association francophone des humanités numériques/digitales (humanistica). past conference themes have embraced a global digital humanities. the international digital humanities conference, held at the university of hamburg, had the auspicious theme of digital diversity: cultures, languages and methods. australia’s hosting of the conference focused on a theme of global digital humanities. the digital humanities conference held in mexico city asks us to consider bridges/puentes. the conference is fairly unique among academic conferences in that it is attempting to pull together such a broad group of scholars. there is no other academic conference in the literature, for example, that has the long-term goal of global outreach and has made such strides toward building a global organization. digital humanities journals are also focusing on the global digital humanities and have begun to publish papers that engage with the complex issues of how we might define digital humanities in the increasingly broad spaces and places in which the scholarship is created.
such efforts extend to journals affiliated with adho, including dsh: digital scholarship in the humanities (formerly llc: the journal of digital scholarship in the humanities), dhq (digital humanities quarterly) and digital studies/le champ numérique, which have featured global issues, such as the collections titled ‘‘digital humanities without borders’’ and ‘‘global outlook::digital humanities: global digital humanities essay prize,’’ both in digital studies/le champ numérique, and papers that consider a broader global understanding of digital humanities, such as ‘‘corpus-based studies of translational chinese in english–chinese translation’’ and ‘‘aspect marking in english and chinese: using the lancaster corpus of mandarin chinese for contrastive language study,’’ both in dsh: digital scholarship in the humanities. however, the data suggest that we still have a long way to go if we want to be a global organization. melissa terras was the first to focus attention on conference representation, finding that the conference was attended overwhelmingly by scholars from the usa, canada and the uk (see fig. ). concerned about the lack of geodiversity of conference attendance, terras has continued to track attendance, and her recent work suggests that digital humanities remains imagined as western located (see fig. ). work by roopika risam, alex gil, isabel galina, domenico fiormonte, elika ortega and padmini ray murray, among other scholars, has called interpretations such as fig. into question, suggesting that the digital humanities is centered in the americas and europe only in the western imagination, a construct that ignores the broad scope of global digital humanities. risam notes, ‘‘the distribution of dh centers suggests uneven development. the usa and, to a lesser extent, the uk and canada appear the true centers of dh, while other countries comprise the peripheries’’ ( , ).
should we want to broaden the digital humanities to a globally representative field, then we must begin not only to reimagine boundaries, but to construct organizations which decentralize. part of the difficulty is that the structures of the largest digital humanities organizations, such as adho, remain narrowly focused. a study of the conference authors from to shows that conference participation remains unequally distributed (see fig. ). while conference participation is largely shaped by the perennial question of how to define the field, with some definitions driving limited globalized membership, it may also be shaped by structural issues associated with the conference. centernet and adho offer free and reduced cost memberships for joining their entities and, while waiving membership fees does encourage participation, the actual costs associated with attending the digital humanities conference, from airfare to lodging costs, remain high. registration discounts occur by career stage, with staff and students receiving discounted rates, but the organization has not included registration differentiation by region, country or income, leaving those from low-economy countries facing a dramatic challenge. for example, at the digital humanities conference in krakow participants from poland reported that the registration costs of the conference were equivalent to a month of salary for lecturers. though the conference was in their home country, the cost was prohibitive.

fig. presenters at ach/allc by institution country. terras ( ). please note that the digital humanities conference was originally titled the ach/allc conference
fig. quantifying digital humanities. melissa terras. infographic: quantifying digital humanities. . melissa terras’ blog. http://www.ucl.ac.uk/infostudies/melissa-terras/digitalhumanitiesinfographic.pdf accessed september ,
fig. number of authors per region – . weingart and eichmann-kalwara ( )

while some have floated the idea of income-based registration, to date the conference has not responded to a key structural issue that prohibits participation from a broader digital humanities community. the conference has taken positive steps to create a less exclusionary space by holding the conference in australia and the conference in mexico. prompted by the formation of la red de humanidades digitales (redhd), the mexico city conference will be ‘‘the first time that the conference will take place in latin america & the global south.’’ the shift in locations for digital humanities signals an important moment in the history of the organization and is largely due to the hard work of organizations like go::dh and redhd. however, there remain clear structural barriers to an inclusive global digital humanities. algorithmic analysis of digital humanities’ structures points to continuing problems in developing a diverse global digital humanities. scott weingart’s analysis of the yearly adho conference has pushed digital humanities to think through how we are constituting ourselves through our conference and our field, revealing the ways that conference participation remains geographically located in the americas and europe. conference participation limitations also appear in our constituent journals, which are likewise publishing articles predominantly clustered around scholars in the americas and europe.
telling is an analysis of digital humanities quarterly: dhq examining co-author networks in the journal from to which reveals that the networks remain squarely centered in the americas, with very little representation beyond europe (see fig. ). all of this suggests that digital humanities as understood through our organizational entities, digital humanities organizations, conferences and journals, desires to be global but remains merely the imagined global. the domination of the primary modes of disciplinary construction, journals and conferences, by the americas and europe is a problem in that it is creating a field that runs counter to the described goals of global digital humanities, implying that no matter the imagined global digital humanities, a truly global understanding of an organization or a field is difficult to construct, perhaps even more difficult in the current age of nationalist tensions. there are numerous interventions underway to broaden our representation of global digital humanities, but we remain caught within tensions of an umbrella structure that enforces structures that are often not conducive to the larger representation of digital humanities. digital humanities has struggled to articulate a global organization in large part because of originating tensions within the organization's construction. digital humanities, as a field, has struggled to articulate what is included within its rubric, a struggle that remains an open academic question. tensions within the field have revolved around who’s in and who’s out, but in a localized context focused on, once again, the americas and europe. reviewing the literature that attempts to define digital humanities reveals that geography has been ignored by scholarship until recent interventions.
(note: see dh quantified for a list of scholars invested in collecting information of the community: http://scottbot.net/dh-quantified.)
such scholarly constructions of digital humanities, which view digital humanities as naturalized within a european and americas structure, have led to current limitations of the field. as o’donnell et al. make clear, our current representation of digital humanities moves along clear lines of demarcation, whether economic, linguistic or geographic ( , ). the centering of digital humanities in this manner has created an ‘‘unproductive dichotomy of center and periphery,’’ leading to a call for a resistance to such structures through a creation of a regional or local digital humanities (gil and ortega , ). for example, alex gil’s ‘‘around dh in days’’ project resists the limited centering of digital humanities, instead revealing the diversity of global digital humanities projects (see fig. ). the diversification of digital humanities, the struggle to create an organizational entity that inclusively represents a global digital humanities, will continue to occur through adho and its affiliated conference and journals, but the organizational structures currently remain resistant to a more globally imagined digital humanities. because of this, we might ask whether adho is actually the mechanism to bring about global digital humanities. as the organization has grown, there has been an almost de facto understanding that it should be the center for global dh. but the centering of digital humanities in an organization that has arisen out of western academic structures will, i argue, always struggle to imagine how to construct a truly representative field. a better question might be whether we can construct an alternative mechanism that accurately represents all the different ways that digital humanities is practiced in a global environment.
the rejection of an umbrella or big tent organization in which to coalesce a global digital humanities is born out of an analysis of the way that geographic, economic, cultural and structural approaches to academic discipline impact our interactions in the larger digital humanities. during the research and writing of traces of the old, uses of the new: the emergence of digital humanities ( ) i came to understand that providing one definition of the digital humanities was dependent upon a stable infrastructure from which the practice developed. the definition of digital humanities within the americas is dependent upon an academia that is increasingly defunded and deprofessionalized, driving a digital humanities that is interested in an entrepreneurially based startup model of digital humanities. this is not so for other localized digital humanities practices, yet dh organizations like adho continue to imagine digital humanities with a distinct bias toward north american and european notions of culture, value and ownership. o’donnell et al. rightly argue that this view of digital humanities is predicated on viewing the development of a global digital humanities ‘‘as an opportunity for transferring knowledge, experience, and access to infrastructure from a developed north to an underdeveloped south’’ ( , ). rejecting this, the authors call for an approach that ‘‘is far more about developing understanding than merging practice,’’ and they turn to ‘‘supra-networks that transcend national, linguistic, regional and economic boundaries’’ ( , ). i’d like to quibble with the use of networks as the way by which we should represent the interaction of the various global representations of digital humanities.

fig. ‘‘co-author network for digital humanities quarterly: – .’’ de la cruz et al. ( )
the notion of an overarching system that is built from nodes is not that different from how adho and its constituent conference imagine themselves, a model that ignores the very real institutional and cultural divides that are always with us. in many ways, a supra-network is a slightly shifted replication of the long understood big tent digital humanities and, ultimately, a failed model. digital humanities is an amorphous and fluid concept or practice, particularized in various disciplines, national contexts and even local environments, but the field is represented as a coherent body of practice by intact structures that include the annual digital humanities conference, the various global organizations that form adho, and even journals published by the various societies. the digital humanities, as represented by the yearly international conference, is a digital humanities that ignores the borders of practice, masks areas of dissension and normalizes the field to a particular form without contour. however, the center does not hold and recent conferences have featured ruptures, revealing the false constructedness of a coherent digital humanities. structuring the global digital humanities as a ‘‘big tent’’ hides the way that such a representation seeks ‘‘sameness’’ in practice. a counternarrative that provides a more inclusive understanding of global digital humanities is one that turns to specificity. while some may see the segmentation of digital humanities as counterproductive, i argue that digital humanities must be particularized because dh, as enacted, is so broad, diffuse and flexible that a generalized definition does not adequately address the various digital approaches currently in use nor how certain humanities fields are being altered by digital practice.

fig. ‘‘around dh in days.’’ gil ( )
a far more productive understanding of our collective histories is to identify the borders of practice and to look for disciplinary overlaps that benefit all partners. a specificity of global digital humanities’ practices is best understood in the framework of what gloria anzaldua has called the borderlands in her crucial work borderlands/la frontera ( ). anzaldua’s framework allows us to examine the impact of cultural representations of digital humanities within larger frameworks of power, including the economic, cultural and power dynamics that impact the production of scholarship. while anzaldua is writing prior to the digital turn and code studies scholarship, her work is prescient. examining the code shifting of language, anzaldua argues that language codes provide a way to examine the complexity of networked interfaces of communication and a way of understanding how cultural identity is impacted by power dynamics of such code. anzaldua’s focus on code switching, defined in her book as language switching or ‘‘the switching of ‘codes’ …from english to castillian spanish to the north mexican dialect of tex-mex to a sprinkling of nahuatl to a mixture of all of these,’’ produces great cultural upheaval. this ‘‘language of the borderlands’’ is ever shifting and changing and ‘‘there, at the juncture of cultures, languages cross-pollinate and are revitalized; they die and are born’’ ( , preface). while anzaldua situates her discussion of borderlands in the geographic specificity of the texas/mexico border, her theorization of power between multiple cultural codes might be extended to our understanding of digital humanities. roopika risam echoes such an extension of code switching when she calls for dh accents, a recognition of the multiple languages, both ‘‘linguistic and computational,’’ as the formation of dh(s) ( , ).
to risam, the multiple accents of digital humanities must be ‘‘understood in a broader ecology of ‘accents’ that inflect practices, whether geography, language, or discipline,’’ providing a model that makes sense of and values the broadness of digital humanities, rather than contains such diversity within a limited framework ( , ). key to understanding the way that localized digital humanities interact within a global framework is to evaluate the contingent power structures. anne donadey notes, ‘‘discrete fields of knowledge can be seen as being separated by disciplinary borders; the interdisciplinary and comparative areas where they meet and are brought together can be viewed as borderland zones in which new knowledge is created, sometimes remaining in the borderland, sometimes becoming institutionalized into a different field of knowledge with its own borders’’ ( , – ). the importance of borders is not in the separation, though indeed that is in play, but the meeting points, which provide productive tensions that bring forth new knowledge. focusing on resistance, as donadey puts it, avoids the flattening of ‘‘the concept of borderlands that would erase its historical and cultural grounding by turning it into a disembodied metaphor’’ ( , ). the borderlands stand in opposition to big tent representations of cultural connection. to embrace a borderlands understanding of global digital humanities is to respect localized practices and to embrace points of context rather than a homogenized centrality. as anzaldua reminds us, ‘‘a borderland is a vague and undetermined place created by the emotional residue of an unnatural boundary. it is in a constant state of transition’’ ( , ). the continual renegotiation of points of connection is productive and ever shifting.
rather than attempting to stabilize such moments, border theory seeks fluidity and destabilization as a means of new knowledge production. viewing the global digital humanities within a border theory model rather than a big tent or umbrella formulation, one journal or one conference, allows scholars to seek those points of contact while understanding how the power dynamics of digital humanities have come to create points of contention. crucial to respecting the integrity of localized digital humanities is a careful examination of our assumptions about technology use in digital humanities projects. go::dh has supported ‘‘minimal computing’’ approaches as a way to rethink the way that many western digital humanities projects center technology innovation. based on discussions in with digital humanists in cuba, those associated with go::dh, led by alex gil, recognized that computing needs in various localized environments might benefit from what ernesto oroza calls the ‘‘architecture of necessity’’ (gil and ortega , ). go::dh has defined ‘‘minimal computing’’ as that which ‘‘simultaneously capture(s) the maintenance, refurbishing, and use of machines to do dh work out of necessity along with the use of new streamlined computing hardware like the raspberry pi or the arduino micro controller to do dh work by choice. this dichotomy of choice versus necessity focuses the group on computing that is decidedly not high-performance and importantly not first-world desktop computing’’ (go::dh ). while we continue to need to explore how technologies benefit our research questions, we cannot ignore more minimal computing approaches that are often the most innovative and expansive within our field. the bias toward highly robust, often expensive, technologically centered projects as the gold standard for dh also creates a centered field that actively ignores the work occurring in some parts of global digital humanities. 
to best move forward, we need to return to a multiplicity of approaches that allows for scholarship to recenter technology, and we must resist the creation of rigid borders of academic disciplinarity that effectively shut down the possibilities of global digital humanities interchange. to proceed in a non-policed borderlands, we must resist a tyranny of technology. frames for our community interaction must be fluid and non-centralized. they must be evolving. to enable the productive friction between communities, we might begin to see our fields as less about connective nodes and networks and more focused on transnational understandings of disconnecting nodes. border theory expands our methodologies and our approaches, rejecting a narrow understanding of digital humanities. it allows us to rethink the way that our own scholarship has been colonized and limited, particularly through models of ownership. a tenet of digital humanities in the americas, for example, has focused on issues regarding ownership of scholarship, with faculty increasingly asserting control over their own labor and their ability to disseminate it freely, as open access (oa) materials, to an audience apart from or in parallel with more traditional structures of academic publishing. key to defining the digital humanities then is that our scholarship is increasingly public. matthew kirschenbaum notes that ‘‘whatever else it might be then, the digital humanities today is about a scholarship (and a pedagogy) that is publicly visible in ways to which we are generally unaccustomed, a scholarship and pedagogy that’s bound up with infrastructure in ways that are deeper and more explicit than we are generally accustomed, a scholarship and pedagogy that is collaborative and depends on networks of people and that lives an active, / life online’’ ( , ).
the public digital humanities and the accompanying push for open access are central to the way that many digital humanists situate their scholarship. however, to fully encompass all expressions of digital humanities, we must also think carefully about issues of ownership, which many in digital humanities have expressed in limited western contexts such as copyright. as we move toward a model of interchange and exchange of globalized digital scholarship, the understanding of ownership and open access must be carefully examined and complicated. the dominance of models of open access in the americas has been critiqued by a growing number of scholars, with particular attention to this issue from scholars who work with indigenous communities and knowledges. kim christen, for example, has produced scholarship and innovative digital tools to address issues of ownership and openness that are centered on indigenous knowledge structures. her work recognizes that the digital archiving process has deep roots in museum and library collections’ problematic pasts and that many indigenous communities have had their intellectual production exploited by colonizers. as christen notes, ‘‘the colonial collecting project was a destructive mechanism by which indigenous cultural materials were removed from communities and detached from local knowledge systems’’ ( , ). in response, christen has developed a content management system (cms), mukurtu, that allows for sophisticated control of the materials within the cms, demarcating the viewing of digital objects through localized understandings of what should be seen and what should not be seen and forcing the user to understand that there are certain objects or ideas that are not open to all. while christen’s work explicitly targets indigenous groups, her thinking about what should be seen and what should not be seen models best practices that we must extend into our conception of the global digital humanities.
at the montreal digital humanities meeting the “copyright, digital humanities, and global geographies of knowledge” panel considered this important issue. the discussion of copyright practices in various countries during the panel revealed the very limited understanding of the topic within the larger collective who attended the conference. isabel galina russell’s remarks focused on copyright in latin america, with her particular expertise focused on mexico. galina russell emphasized that “latin america distinguishes itself from other regions of the world in that scientific information belongs to all” ( ). recognizing that few for-profit commercial academic publishers exist in latin america, galina russell argues that “there is a generalized idea that knowledge produced in the university belongs to all, it is a common good provided to the country,” negating copyright and shifting ownership of academic production to the public ( ). this conception of ownership stands in stark contrast to the way that ownership has functioned within the types of structures set up by the western for-profit academic publishers and that many dh scholars see as central to oa initiatives. in the same panel, padmini ray murray discussed the copyright lawsuit brought against shyam singh, the owner of a small indian shop producing course packs for students at a local university, who was sued by several leading academic presses. ray murray points out that the case revealed the way that assumptions of copyright elided national boundaries and attempted to impose western understandings of ownership on scholarly work.
note: see kimberly christen, “on not looking: economies of visuality in digital museums,” in the international handbooks of museum studies: museum transformations, first edition, ed. annie e. coombes and ruth b. phillips. oxford: john wiley & sons, ltd.
at the same time that the lawsuit negated copyright rules of the indian state, it also selectively ignored us and uk copyright rules with the desire to further enforce western ideas of ownership. in response to the supposed copyright violations, the lawsuit “sought to ban all course packs, including those that observe the us definition of fair use, i.e., excerpts comprising less than % of the whole text” ( ). the legal challenge likewise ignored “section of the indian copyright act [that] permits ‘fair dealing’ with the purpose of research, as well as permitting any copyrighted work to be used for the purpose of educational instruction” ( ). situating copyright law in neither india nor the west, the lawsuit was written as nationless and boundaryless, centered only on the effort to end the exchange of information. both papers point to the complications of thinking about ownership and knowledge as equivalent forms across cultures and nations. while we might value open access in the digital humanities, not all producers of knowledge will accede to openness. instead we must, once again, develop structures that see knowledge as culturally defined and controlled. by valuing the localized understanding of knowledge and knowledge production, we situate the global digital humanities within a productive nexus of borders. instead of insisting that we encapsulate all practices of digital humanities within a big tent or a centralized structure, we should view adho and its conferences and journals as important, but not central, meeting spaces for digital humanists. rather than seeing adho as the center, we should encourage a global digital humanities that works on the borderlands, with localized expressions of scholarship that reinvigorate through exchange.
rejecting the “dualistic thinking in the individual and collective consciousness” is a struggle, as anzaldua argues, but it is the only way that we might move beyond binaries that are currently in place, whether technologically advanced/primitive, east/west, or low income/high income ( , ). resisting the homogenization of scholarly methods, questions, outcomes, production and ownership is the only way to develop a truly robust global digital humanities.

references
anzaldua, gloria. . borderlands/la frontera. san francisco: aunt lute book company.
centernet: an international network of digital humanities centers. . https://dhcenternet.org/about. accessed aug .
christen, kimberly. . tribal archives, traditional knowledge, and local contexts: why the ‘s’ matters. journal of western archives ( ): – .
de la cruz, dulce maria, jake kaupp, max kemman, kristin lewis, and teh-hn yu. . mapping cultures in the big tent: multidisciplinary networks in the digital humanities quarterly. https://jkaupp.github.io/dhq/coursework/visualizingdhq_final_paper.pdf. accessed aug .
dh : mexico city. dh (blog) . https://dh .adho.org/en/. accessed aug .
donadey, anne. . overlapping and interlocking frames for humanities literary studies: assia djebar, tsitsi dangarembga, gloria anzaldua. college literature ( ): – .
earhart, amy e. . traces of the old, uses of the new: the emergence of the digital literary studies. ann arbor: university of michigan press.
galina russell, isabel. . presentation, panel on copyright, digital humanities, and global geographies of knowledge. presented at the digital humanities , montreal, canada.
gil, alex. . around dh in days. around dh in days (blog). http://www.arounddh.org. accessed aug .
gil, alex, and elika ortega. . global outlooks in digital humanities: multilingual practices and minimal computing. in doing digital humanities: practice, training, research, ed. constance crompton, richard j. lane, and ray siemens, – .
london: routledge.
global outlook::digital humanities. . http://www.globaloutlookdh.org. accessed aug .
hayles, katherine. . my mother was a computer: digital subjects and literary texts. chicago: university of chicago press.
kirschenbaum, matthew. . what is digital humanities and what’s it doing in english departments? in debates in the digital humanities, ed. matthew gold, – . st. paul: u minnesota p.
membership. adho (blog). . https://adho.org/faq. accessed aug .
o’donnell, daniel paul, katherine l. walter, alex gil, and neil fraistat. . only connect: the globalization of the digital humanities. in a new companion to the digital humanities, ed. susan schreibman, ray siemens, and john unsworth, – . malden, ma: wiley blackwell.
pannapacker, william. . the brainstorm blog: the chronicle of higher education online.
ray murray, padmini. . presentation, panel on copyright, digital humanities, and global geographies of knowledge. presented at the digital humanities , montreal, canada.
risam, roopika. . other worlds, other dhs: notes towards a dh accent. digital scholarship in the humanities ( ): – .
sigs: adho special interest groups (sigs). . adho (blog). http://adho.org/sigs. accessed nov .
terras, melissa. . disciplined: using educational studies to analyse ‘humanities computing’. literary and linguistic computing ( ): – .
terras, melissa. . quantifying digital humanities. ucl centre for digital humanities. http://blogs.ucl.ac.uk/dh/ / / /infographic-quantifying-digital-humanities/. accessed nov .
weingart, scott b., and nickoal eichmann-kalwara. . what’s under the big tent? a study of adho conference abstracts. digital studies/le champ numerique : . https://doi.org/ . /dscn. /.

amy e. earhart is an associate professor in the department of english at texas a&m university. she is the author of traces of the old, uses of the new: the emergence of digital literary studies ( ) and co-editor of the american literature scholar in the digital age ( ).
she is the author of various books and chapters in venues including debates in digital humanities, textual cultures and the humanities and the digital, among others.

share: community-focused infrastructure and a public goods, scholarly database to advance access to research

d-lib magazine may/june volume , number /

cynthia r. hudson-vitale, washington university in st. louis, chudson [at] wustl.edu
richard p. johnson, university of notre dame, rick.johnson [at] nd.edu
judy ruttenberg, association of research libraries, judy [at] arl.org
jeffrey r. spies, center for open science, university of virginia, jeff [at] cos.io

https://doi.org/ . /may -vitale

abstract

share takes a schema-agnostic approach to aggregating diverse and distributed scholarly metadata in order to build a broadly inclusive open data set about scholarship to power innovation and discovery.
in an environment where metadata standards vary widely by discipline or domain, distributed digital assets — while intellectually linked to other objects in the ecosystem — may lack the necessary information to intuit these relationships directly, including strong identifiers for people, institutions, or sources of funding. aggregating metadata across diverse data sources and repositories is essential for making related content discoverable — especially content that may not currently have first-class status in scholarship. related contextual objects, beyond publications, support replicability, reproducibility, and reuse. it is impractical to ask each of these diverse data sources to adopt and implement a common metadata format when the incentives for doing so are low. instead, share is harvesting, normalizing, and linking dispersed assets into an aggregated, open data set of research outputs. this is producing tangible demonstrations of the power of a public goods database to provide notifications or reports of research activity and promote discovery. these demonstrations will entice institutions to enhance their metadata in share or use share to clean and augment metadata in their repositories. keywords: share, scholarly metadata, institutional repositories, metadata schemas   introduction "one researcher's metadata is another researcher's data" aggregated, diverse object-type metadata facilitates discovery, gives more exposure and credence to previously overlooked digital assets (e.g., data, code, software, patents), and contributes significantly to research involving meta-analyses and meta-scholarship. given the dispersed and specialized nature of much scholarship and research, an aggregated (meta)data set is necessary to determine links and relationships among research assets. this information is especially important to scholars, institutions, and funders, all of whom are engaged in tracking research activity for various purposes. 
share emerged within north american higher education in to address the gap in digital infrastructure by connecting research activity across disciplinary, agency, and institutional repositories in a timely and structured manner (arl, , walters and ruttenberg, ). there are and were, within research or other academic communities, a variety of schemas and extensions that developed to facilitate the aggregation of metadata from multiple repositories (riley, ). the growth over the past few decades of discipline-specific metadata elements and controlled vocabularies has allowed researchers in particular academic domains the ability to share data and exchange information about research in a structured, interoperable format. further, this development and adoption has facilitated data reuse among research domain communities by providing important technical, descriptive, and contextual information (yarmey & baker, , qin & li, ). but the proliferation of many metadata schemas and controlled vocabularies is not without its challenges. while the use of a specific metadata schema encourages sharing among researchers within a given subject domain or object type, the disciplinary focus often limits the extensibility of the metadata schema to other domains or types of objects, which limits the discoverability of the research to only those who are most familiar (willis, et.al., ). a recent presentation by cox ( ) further highlighted the challenges in the use of controlled vocabularies, when he found eleven different definitions of the word "soil," some from the same taxonomic organization. historically, the need to harmonize different metadata schemas has been addressed through cross-walking and reconciliation efforts, which are hard to scale and are labor intensive for the participating data sources/repositories. 
for institutional repositories, challenges to creating robust metadata include lack of sufficient human resources, inconsistent access to administrative changes within repository platform software, and variable sourcing of metadata, including from third-party services or by author deposit. in recent years a number of initiatives have developed in an attempt to catalog and enumerate the variety of metadata schemas available for a scholar, institution, or organization to use to describe their research. in , the digital curation centre (dcc) launched a metadata standards directory that breaks down a variety of metadata schemas by discipline and provides a short description of each schema's use (dcc metadata standards directory, ). around this same time, the research data alliance (rda), in collaboration with the dcc, developed a community-supported version of the metadata standards directory (ball, ). while containing many of the same standards, the rda directory is more highly structured in what information it is reporting, including use cases, metadata extensions, and tools. while both the dcc and rda directories are extremely useful for discovering and comparing different metadata schemas, neither is exhaustive in its listing. to address the need for an aggregated (meta)data set of research and scholarship, appropriately account for the variety of metadata standards, and lower the barrier to participation among repositories, the share initiative adopted a schema-agnostic approach to harvesting scholarly and research metadata. share, a partnership of the association of research libraries (arl) and the center for open science (cos), employs multiple strategies to harvest, ingest, map, and normalize metadata. these strategies include harvesting from oai-pmh, and non-standard application programming interfaces (apis); having sources push records to share; and in some cases web scraping. 
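as a rough illustration of the first of these strategies, the sketch below parses an oai-pmh ListRecords response carrying dublin core (oai_dc) metadata, using only the python standard library. the namespaces and element names come from the oai-pmh 2.0 and dublin core specifications; the sample record, the repository identifier, and the function name `parse_list_records` are invented for the example and do not represent share's actual harvester code.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response; a real harvester would fetch this over HTTP with
# the query parameters verb=ListRecords and metadataPrefix=oai_dc.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repo.example.edu:1234</identifier>
        <datestamp>2017-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Soil moisture observations</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:creator>Roe, Richard</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            # Dublin Core elements are repeatable, so collect them as lists.
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in rec.iterfind(".//dc:creator", NS)],
        })
    # An absent or empty token means the result set is complete.
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


records, token = parse_list_records(SAMPLE_RESPONSE)
```

a harvester would repeat the request with the returned resumption token until none comes back; network access is left out here so that the example stays self-contained.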
in this way, share has moved beyond any one protocol or any one type of repository as the exclusive target for harvesting and towards an ecosystem of evolving information resources. share is a community-based project and came to its inclusive strategy through community consultation, including working and task groups of experts in metadata, digital libraries, and publishing. share is centered around the driving goal to make a more comprehensive picture of research accessible and open. share assumes that the digital research environment is both complex and distributed. and share embraces a system-wide mission and remit — to provide exposure to research outputs and a mechanism to aggregate and connect those outputs from multiple sources while supplying an open data store for metadata enhancement and maintenance and to feed existing and new services.   share metadata harvesting and normalizing pipeline each potential metadata provider for share goes through a simple registration process where both the appropriate harvesting mechanism and proper set of records to be harvested are determined. commonly, api endpoints like oai-pmh are utilized by share for a given data provider. as the usage of metadata elements and schemas vary across each organization, the share development team customizes the harvesting process for each provider. by not requiring any specific metadata elements to be harvested and by the share team writing necessary harvesters, the efforts placed on repository staff or library it staff have thus far been minimal. upon ingest, records are fed to a data processing and normalization pipeline (see figure ). from the original request and response, the metadata is processed and normalized by mapping its schema that varies across data providers to the share schema. by performing this process at the time of ingest, share does not require all data providers to conform to one metadata format for interoperability. 
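the idea of mapping each provider's schema to a common one at ingest time can be sketched as follows. this is a minimal illustration under stated assumptions: the provider names, the source field names, and the target fields (`title`, `contributors`, `date`) are hypothetical and are not taken from the share schema.

```python
# Hypothetical per-provider mappings: source field name -> common field name.
FIELD_MAPS = {
    "provider_a": {"dc:title": "title", "dc:creator": "contributors", "dc:date": "date"},
    "provider_b": {"name": "title", "authors": "contributors", "published": "date"},
}


def normalize(provider, raw):
    """Map one raw record onto the common (hypothetical) schema.

    Fields without a mapping are kept under "extra" rather than dropped,
    so no provider has to change what it exposes.
    """
    mapping = FIELD_MAPS[provider]
    out = {"source": provider, "extra": {}}
    for key, value in raw.items():
        target = mapping.get(key)
        if target is None:
            out["extra"][key] = value
        elif target == "contributors":
            # Providers may supply a single name or a list of names.
            out[target] = [value] if isinstance(value, str) else list(value)
        else:
            out[target] = value
    return out


a = normalize("provider_a", {"dc:title": "Soil data", "dc:creator": "Doe, Jane", "dc:rights": "CC0"})
b = normalize("provider_b", {"name": "Soil data", "authors": ["Doe, Jane", "Roe, Richard"]})
```

because the mapping lives with the aggregator, adding a provider means adding one entry to the mapping table rather than asking the provider to adopt a new format.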
data is also archived at each incremental stage of the process for reference or use later, and the final set of normalized data is stored and indexed for discovery. figure : data pipeline typically, a set of metadata records about research activity (e.g., publications, data sets) is supplied to share. upon receipt, these metadata elements are then normalized and mapped within share's schema. associated institutions, co-authors, collaborators, and related works are then derived from common metadata elements across activity records supplied (see figure ). alternatively, data providers may also push records directly to share via its api. when possible, this use of the share api can streamline the normalization and linking of people, institutions, and works within share as mapping can occur at the source. figure : share object and relationship mapping within the share project, guidelines are under development to formalize best practices, especially around structure of the metadata values themselves. for example, the use of unique identifiers, authority records, or controlled vocabulary terms greatly accelerates the linking and deduplication of metadata records. furthermore, identifying which authority and controlled vocabulary schemes are being used (e.g., tgn, dcmi, lcsh) increases precision within the metadata and the confidence with which share asserts relationships between records.   share api metadata harvested, normalized, and enhanced are then made widely available through the share discovery application programming interface (api), which queries share's elasticsearch index. individuals can search across the normalized metadata to discover and make connections among the varying research outputs. 
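since the discovery api is described as querying an elasticsearch index, a search over the normalized metadata might be expressed with the standard elasticsearch query dsl. the sketch below only builds a request body; the field names (`title`, `type`, `sources`) and the helper `build_search_body` are assumptions made for illustration, and the actual share endpoint and index mapping are not shown in this article.

```python
import json


def build_search_body(text, work_type=None, source=None, size=10):
    """Build an Elasticsearch query DSL body (field names are hypothetical)."""
    filters = []
    if work_type:
        filters.append({"term": {"type": work_type}})
    if source:
        filters.append({"term": {"sources": source}})
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match contributes to relevance scoring.
                "must": [{"match": {"title": text}}],
                # Exact-value filters narrow results without affecting scores.
                "filter": filters,
            }
        },
    }


body = build_search_body("soil moisture", work_type="dataset")
print(json.dumps(body, indent=2))
```

keeping the free-text clause in `must` and the exact-value clauses in `filter` is the usual split in a `bool` query, since filters do not influence ranking.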
for institutions and researchers, the share data set may be used to: discover research and data for reuse or to assess the rigor of a research study; populate researcher profiling systems; visualize collaborative networks; support institutional open access policies; and conduct meta-analyses.

with the flexibility of the share api, specific providers, work types, subjects, and funders can be discovered using the controlled, discipline-specific vocabulary, or the normalized terms. in aggregating the dispersed sources, and making the metadata available through a common api, duplication of efforts and redundancies across institutions can also be reduced. rather than searching each individual provider independently for related materials or using different search syntaxes and controlled vocabularies, a user can access the share api and search across all at once. for developers, this also decreases the need for any one-on-one connections for application development or feeds from a source to a local repository. the share institutional dashboard being developed collaboratively between uc san diego and the center for open science (described in the "applications and tools" section below) is one such solution that allows an institution to search at once across many distributed sources. several institutions use share's push api to add metadata records to share. in some cases, this is an alternative to providing records to share via oai-pmh from their repository or database. as is the case with uc san diego and its institutional dashboard, the push api is also being used to supply additional records, aggregated or created, that are related to people or work originating from their institution and to directly assert relationships between those records.

applications and tools

one of share's greatest strengths is the platform it provides to develop tools and services for collecting, using, exchanging, and analyzing research assets.
through the share api, new tools for research discovery and aggregation are being built that further facilitate the discovery of research outputs and display them in a manner that is reflective of community needs, while leveraging a public good.   . osf preprints and registries in the last year the scholarly community has seen an explosion of pre-publication paper (pre-print) services and discussions around their potential role in a changing scholarly communication environment. for those disciplines that have a robust history of pre-publication sharing, such as math, physics, and economics, different sets of metadata elements and taxonomies are in use to gather the necessary domain terms to describe the pre-prints adequately. additionally, many of the existing pre-print services are dispersed across institutions, organizations, and technologies, which are also often limited to a specific discipline and community. the center for open science saw a need to develop infrastructure to facilitate the discovery and the aggregation of these dispersed pre-print records. by harnessing the existing infrastructure of the share data set and the open science framework (osf), cos launched a multidisciplinary preprint server and aggregated discovery service, osf preprints. powered by share, osf preprints allows an individual to search across many dispersed pre-print servers and records at once. osf preprints and its discovery layer can also be branded for particular organizations or disciplinary outreach groups; branded services exist for sociology (socarxiv), psychology (psyarxiv), engineering (engrxiv), agriculture (agrxiv), and the berkeley initiative for transparency in the social sciences (bitts). by aggregating and normalizing dispersed metadata from many sources share has similarly enabled the development of a tool that aids in the discovery of research registrations, a concept common in, for example, clinical trials. 
registrations are time-stamped, (ideally) immutable versions of a research project meant to increase transparency and accessibility to work; they often include documentation or metadata about the state of the project at a given point in time. a pre-registration is a registration created before data collection begins that captures hypotheses and documents what will be tested in a confirmatory analysis. this allows these analyses to retain validity of their statistical inferences and assists readers in separating confirmatory analyses from exploratory analyses that may lead to future confirmatory work. osf registries includes registrations from a variety of domains, including the medical sciences, government, and politics. with share, osf registries makes possible the searching of multiple registries through one interface.   . share institutional dashboard research and higher education institutions have a strong interest in understanding the research-related outputs of their faculty and staff. aggregating and discovering this information is time-consuming and expensive given the variety of research assets dispersed across applications (e.g., data sets, pre-prints, award information, patents, publications), inconsistent adoption of strong identifiers for people and institutions despite the maturity of community standards (e.g., orcid, openisni), and highly variable metadata. in collaboration with the uc san diego libraries (ucsd), an institutional dashboard of research assets is currently in development as an application layer on top of the share api. the dashboard provides faceted views and visualizations of institutionally affiliated metadata found in the share data set. built as flexible javascript widgets, the open source dashboard allows institutions to pick and choose the facets, charts, and information they are most interested in displaying on their custom institutional dashboard. 
developing the dashboard application requires metadata that is curated to include institutional affiliations for each record. unfortunately, the use of affiliation identifiers is not widespread in many metadata records. by adopting affiliation identifiers in organizational metadata, such as openisni or grid (wheeler, ), a number of institutional disambiguation issues can be reduced. for the ucsd pilot project, affiliations are made using a combination of programmatic and human-mediated efforts. as records are coming from many disparate sources, a useful technique has been to apply a controlled list of aliases to query for affiliation (e.g., ucsd, uc san diego, uc san diego library).   community the creation of high-quality metadata using any general, domain, or discipline schema requires an investment of individual time and resources. while share continues working on automatic techniques to enhance metadata records, these approaches are not infallible for all types of missing metadata values (liu, ). thus metadata practitioners, curators, and repository staff within the share community are an integral component to the curation and metadata ingest workflow. through this community, institutions and organizations can begin to collectively address many of our shared challenges and successes associated with quality metadata, curation treatments, institutional analytics, and more. through programs that build community and technical capacity, such as the share curation associates pilot program, participants learn and exchange treatments and techniques to enhance their own local, institutional metadata. as share providers, any enhancements the associates make locally on a repository or in metadata records are also fed into the share data set, which is then widely shared. similarly, data enhanced by share can be used locally. this results in a sustainable, round-trip enhancement of metadata that has both local and national impact.   
conclusion one of share's core values is openness. by creating and aggregating open, research-related metadata, software, assets, and tools, share is contributing to a larger movement towards open science and open scholarship. through these movements, scientific innovation is catalyzed and efficiencies in research funding and studies can be improved. just as the larger movement requires involvement of researchers and faculty to further move ahead, share needs the continued involvement of the community to use the data set, build applications on top of the api, and curate or enhance the metadata. in share will be transitioning to a new governance structure comprised of share's most active contributors and stakeholders. local and community needs surfaced through this group and other partners will be a driving force behind the development of new tools and services leveraging share. while the community continues to use many metadata schemas and standards — a reality that cannot be avoided — the share approach cannot be addressed fully by any one organization. share is focusing efforts on growing its community of collaborators in order to distribute curation, support, development, and maintenance. direct collaboration also enables community members to shape solutions like the share data set to best realize their own objectives now and in the future.   references [ ] association of research libraries. share notification system project plan. washington, d.c: association of research libraries, . [ ] ball, alexander, et al. building a disciplinary metadata standards directory. international journal of digital curation . ( ): - . https://doi.org/ . /ijdc.v i . [ ] cox, simon. 'what does that symbol mean? — controlled vocabularies and vocabulary services'. scidatacon . denver, colorado. . [ ] digital curation centre. disciplinary metadata | digital curation centre., n.p., n.d. [ ] liu, jiankun. classifying research activity in share with natural language processing. share. 
n.p., may . [ ] qin, jian, and kai li. how portable are the metadata standards for scientific data? a proposal for a metadata infrastructure. international conference on dublin core and metadata applications ( ): - . [ ] riley, jenn. seeing standards: a visualization of the metadata universe. - . [ ] walters, tyler, and judy ruttenberg. shared access research ecosystem. educause review march . [ ] wheeler, laura. digital science launches grid, a new, global, open database offering unique information on research organisations. digital science. n.p., oct. . [ ] willis, craig, jane greenberg, and hollie white. analysis and synthesis of metadata goals for scientific data. journal of american society for information science and technology ( ): - . print. [ ] yarmey, lynn, and karen s. baker. towards standardization: a participatory framework for scientific standard-making. international journal of digital curation . ( ): - . https://doi.org/ . /ijdc.v i .   about the authors cynthia r. hudson-vitale is the data services coordinator in data & gis services at washington university in st. louis libraries. in this position, cynthia leads research data services and curation efforts for the libraries. since coming into this role in , she has worked on faculty projects to facilitate data sharing and interoperability while meeting faculty research data needs throughout the research lifecycle. she has also worked across the university to improve research reproducibility, addressing both technical and cultural barriers. she currently serves as the visiting program officer for share with the association of research libraries.   judy ruttenberg is the program director for strategic initiatives with the association of research libraries. she is primarily responsible for managing the share initiative. while at arl, judy has also directed the transforming research libraries initiative, which included responsibility for e-research and special collections working groups. 
judy works closely with her colleagues in public policy and diversity and inclusion in advancing the agenda of accessibility and universal design within arl. prior to joining arl in , judy was a program officer at the triangle research libraries network (trln) where she coordinated the work of trln's collections groups, focusing on issues such as collections analysis, shared collections, and large-scale digitization.   richard p. johnson is the co-program director, digital initiatives and scholarship and head, data curation and digital library solutions at the university of notre dame, hesburgh libraries. he directs the design and development of the libraries' data curation and digital library solutions for research, teaching, and learning. these include curatend, the library's service to curate, preserve, and spotlight collections and research at notre dame. rick also provides oversight of data management planning services within the libraries, and supports activities in the center for digital scholarship. he currently serves as the visiting program officer for share with the association of research libraries.   jeffrey r. spies is the co-founder and chief technology officer of the center for open science (cos), a non-profit technology company missioned to increase openness, integrity, and reproducibility of scholarly research. jeff is also the co-director of share. jeff has a ph.d. in quantitative psychology from the university of virginia where he now holds a visiting assistant professor position in the department of engineering and society. his dissertation included the development of the open science framework, a free, open source scholarly commons that is now the flagship product of cos.   copyright ® cynthia r. hudson-vitale, judy ruttenberg, richard p. johnson and jeffrey r. 
spies

collaboratories and virtual safaris as research in virtual learning environments scholarship

doi: . /jgcms. corpus id:
@article{richter collaboratoriesav, title={collaboratories and virtual safaris as research in virtual learning environments scholarship}, author={j. richter}, journal={int. j. gaming comput. mediat. simulations}, year={ }, volume={ }, pages={ - } }

j. richter. published in int. j. gaming comput. mediat. simulations. fields: engineering, computer science.
copyright © , igi global. copying or distributing in print or electronic forms without written permission of igi global is prohibited.

one of the most compelling interests in virtual learning environments research is, i believe, in pursuing ways to advance this research through the incorporation of these virtual environments, themselves, as effective methods for displaying and disseminating evidence of learning. by shooting movies made of virtual learning situations and adding relevant…

view via publisher: igi-global.com

topics from this paper: machinima augmented reality virtual reality virtual world experience printing milieu intérieur

citations:
virtual learning environments. the oltecx: a study of participant attitudes and experiences. adriana d’alba, anjum najmi, j. gratch, chris bigenho. psychology, computer science. int. j. gaming comput. mediat. simulations.
measuring student perceptions: designing an evidenced centered activity model for a serious educational game development software. leonard a. annetta, s. holmes, m. cheng, elizabeth folta. computer science. int. j. gaming comput. mediat. simulations.
a test of the law of demand in a virtual world: exploring the petri dish approach to social science. edward castronova, m. bell, + authors nathan mishler. computer science. int. j. gaming comput. mediat. simulations.
green chemistry: classroom implementation of an educational board game illustrating environmental sustainable development in chemical manufacturing. m. coffey. engineering.
using recommendation systems to adapt gameplay. b. medler. computer science. int. j. gaming comput. mediat. simulations.
decoupling aspects in board game modeling. fulvio frapolli, a. brocco, a. malatras, b. hirsbrunner. computer science. int. j. gaming comput. mediat. simulations.
challenges in game design. a. ursyn. computer science.
multimedia technologies in education. l. d. paolis, egidijus vaskevicius, a. vidugiriene. computer science.
open access at ubc library
by glenn drexhage

caption: open access week puts the focus on how we can all access and share knowledge openly

should ubc and other academic institutions have to pay to access publicly funded research that benefits society and leads to a greater understanding of today’s pressing issues? this topic and many others will be discussed at ubc during the first international open access week, which takes place from october to october in the dodson room, on the third floor of the irving k. barber learning centre.

how open is open?
in a nutshell, open access (oa) is about access to information and knowledge for all. it is a growing international movement that encourages the unrestricted sharing of research that is typically taxpayer-funded. the development of this movement comes at an opportune time, given the surging costs of scholarly journals and the budgetary pressures facing academic libraries. technology – and specifically, the advent of the internet – has been a huge factor behind the growth of open access. “i think we’re aware that there’s a sea change happening that’s driven both by technology and the desire to create something different from traditional models of scholarship,” says joy kirchner, librarian for collections, licenses & digital scholarship at ubc. “we’re seeing new kinds of business models and new ways of interacting with information.” the oa movement is gaining momentum thanks in part to research funders and policy makers.
for example, there are new requirements from the canadian institutes of health research (cihr) and the national institutes of health (nih) in the u.s. to deposit grant-recipient research into an openly available repository.

ubc’s open circle
ubc library launched its own open access, online repository – called circle – more than two years ago. it serves as a digital archive of ubc’s scholarly and research output, and is led by co-ordinator hilde colenbrander. circle now features more than , ubc items – the biggest proportion of these being theses and dissertations. the library also hosts e-journals for ubc faculty members who use open journal systems software. titles include bc studies: the british columbian quarterly and the canadian journal of midwifery research and practice. also hosted is the ubc medical journal, a new student peer-reviewed publication.

caption: librarian joy kirchner (martin dee photo)

in addition, ubc library pays institutional memberships for various open access publications, entitling ubc authors to discounts on article submission fees. examples include biomed central and hindawi, which are science, technology and medicine publishers, and the public library of science journals. the library supports canada’s open medicine and the directory of open access journals, a repository of more than , open access journals.

worth attending, with an open mind
circle’s colenbrander will be one of the special guests speaking at ubc’s open access week. others include keynote speaker dr. frits pannekoek, president of athabasca university; ingrid parent, ubc’s university librarian; dr. henry yu, associate professor in ubc’s department of history; and many more.
https://circle.ubc.ca/ http://ojs.library.ubc.ca/ http://www.openmedicine.ca/ http://www.doaj.org/

topics include a national canadian study examining open access, a copyright workshop, a panel discussion, a review of undergraduate, graduate and faculty research, and a focus on academic journal publishing. for more information and to register, please visit http://www.library.ubc.ca/schol_comm/oa/start.html or contact joy kirchner at joy.kirchner@ubc.ca. also, as you prepare for open access week, check out the workshop notes from town hall . peter dauvergne, a senior advisor to ubc’s president, hosted an informal discussion with sally taylor from ubc library last june at the forest sciences centre.
-end-
http://update.estrategy.ubc.ca/ / / /town-hall-redux-workshop-reports

digitcult - scientific journal on digital cultures

current issue
vol no ( ) isbn: - - - -
cover: haroon mirza, a sleek dry yell ( )
graphic design by stefano morreale
published: - -

provocations and dialogues
la paura, il virus e la dittatura digitale / fear, the virus and the digital dictatorship
enrico pedemonte, paolo bottazzini - pdf (italiano)
il rapporto costo/beneficio dalla pratica medica alla tutela dei dati personali / the cost/benefit relationship from medical practice to data protection
alessandro vercelli - pdf (italiano)

articles
internet in everyday life: profiling individual behaviour in the field of online experience
rita fornari - pdf
l'errore della misura è la causa della crisi dei valori? / is the measurement error the cause of the crisis of values?
emiliano mandrone - pdf (italiano)
open data e risorse educative aperte / open data and open educational resources
valentina bazzarin, paolo martinelli - pdf (italiano)
museums web strategy at the covid- emergency times
sarah dominique orlandi - pdf
etica hacker? / hacker ethics?
marco ciurcina - pdf (italiano)
realtà virtuale a scuola: le parole dei ragazzi / la virtual reality in classroom and the students’ feedback
mario chiesa, chiara tomatis, stefania romaniello - pdf (italiano)

editorial management hosted by technology, communication and society department (tecos), guglielmo marconi university
ojs platform and web site hosted and managed by music informatics laboratory (lim), computer science department, university of milan
graphical project by stefano morreale. for information: info@digitcult.it

models of digital documentation: the th-century concord digital archive
amy e. earhart

i wish i could write that i recognized the possibilities of digital scholarship immediately and, with my enlightenment, proceeded to create a project that captured the potential of such scholarship. instead, the journey to my current digital work has been halting and slow, with many moments of confusion along the way. my mantra, during my early work, was taken from john unsworth: "if an electronic scholarly project can't fail and doesn't produce new ignorance, then it isn't worth a damn." ultimately, digital scholarship is in its infancy and digital practitioners are largely self-trained. missteps and failures necessarily come with experimentation. and, the primary objective of digital work, in my opinion, should be experimentation.
the work of digital scholarship is not only about production of the final product, but production of the theoretical and methodological approaches to the digital that we have only just begun to explore. the value of such work is not to be underestimated. jerome mcgann has famously predicted that in "the next fifty years the entirety of our inherited archive of cultural works will have to be reedited within a network of digital storage, access, and dissemination." as our cultural heritage is being digitized at an increasingly rapid rate we are experiencing greater access to materials, but we are also confronted with new problems of use. scholars will want digital materials to meet our particularized needs. for example, geoffrey nunberg has recently described the many problems connected to search capability that stifle scholarly work within google books. for the average user, nunberg notes, google-based searching is useful, but for the type of work that scholars imagine, "the metadata simply aren't up to it." as [...]

john unsworth, "documenting the reinvention of text: the importance of failure," journal of electronic publishing ( ). http://dx.doi.org/ . / . .
jerome mcgann, "a note on the current state of humanities scholarship," critical inquiry ( ): .
geoffrey nunberg, "google's book search: a disaster for scholars - the chronicle review - the chronicle of higher education," the chronicle of higher education: the chronicle review, august . http://dx.doi.org/ . / . .
documentary editing

[...]cord that when i first began to sketch out what the cda might become, i was a lecturer. while the position provided a low wage and high teaching load with little chance of advancement, it also allowed the freedom to experiment with a project that might have no measurable value in a tenure decision, yet interested me immensely and had, i thought, real scholarly value. during the ensuing years i effectively retrained myself to work with digital scholarship, something that would have been nearly impossible to do under the pressures of the tenure track. i found little infrastructure to support digital work on my campus, so i went to the experts. i attended a tei/xml workshop at brown university given by julia flanders and syd bauman and the first nines (networked infrastructure for nineteenth-century electronic scholarship) summer workshop, where i learned much from jerome mcgann, bethany nowviskie, laura mandell, and a small but dedicated group of scholars working on digital archives. i contacted ken price, co-founder of the whitman archive and a former professor of mine, to ask for advice. i was lucky that these pioneers were generous to a scholar interested in the field and were available for help and support. my story is not unique. digital projects are often created by scholars outside the traditional academic power structure who believe strongly in the importance of such work or, at the other extreme, leaders in the field who have used their endowed chairs and full professorships to help alter attitudes toward digital work.
if you decide to take on a digital project, people and organizations are there to help. structures are changing. universities are putting support for digital work into place, new organizations, such as nines, are emerging, and digital humanities centers are being created to support the digital work that you imagine. but, a scholar interested in digital work needs to be realistic about how current digital work is valued by the academy. some changes to tenure and promotion criteria are occurring, but many departments are slow to respond. while groups like the mla task force on evaluating scholarship have called for development of "a system of evaluation for collaborative work that is appropriate to research in the humanities and that resolves questions of credit in our discipline as in others," the same task force found that " % of departments in carnegie doctorate institutions say refereed articles in digital format either 'don't count' for tenure in their departments and institutions or that they have no experience evaluating them." imagine, then, these departments' response to non-refereed online digital scholarship. i say this not to discourage work within this field, but to caution you to be realistic and plan accordingly.

xml markup of texts is the de facto international standard for encoding texts in the humanities.
mla task force, "selected findings from the mla's survey of tenure and promotion," http://www.mla.org/pdf/taskforcereportppt.pdf
[...] advanced map interfaces that visualize the town over time. using google earth, historical and contemporary maps as well as digitized town reports, census, and literary materials, we are hoping to develop a map and connected timeline that allows users to manipulate time and place as well as sift the materials to locate textual data. another important issue that the concord digital archive seeks to explore through its interface structure is the way that transnationalism plays out within the particular literary and historical moments of the town. current work on the cda suggests that the african and irish diasporas reveal themselves in town materials and that interactions between these groups impact the literary production of concord writers and vice versa.
rather than focusing on the few authors that lived in concord for most of their lives, the cda materials invite the scholar to see those who immigrate, who traverse national boundaries, and who look outward, out of concord, massachusetts and the united states to a broader world. the mapping segment of the project is currently being built to show patterns of movement in concord by irish- and african-americans and the response of anglo-concordians to both groups by digitizing place of residence, nationality, race, and socioeconomic factors over time. in other words, while the concord project does indeed look to one particular element of literary history that has been interpreted as "american," the materials found within the archive challenge this simplistic reading.

while digital archives offer the scholar a chance to produce groundbreaking research, there remain structural difficulties in the creation of such scholarship. digital work is often immeasurably slow to produce, so glacial, in fact, that those working within the field often speak of their never-ending projects. if you wish to publish a book, there is a long history of process in place. in addition, a print project has boundaries that are fairly rigid. presses limit page numbers, contracts limit time to finished product, print publication is finished and a bound book produced. not so with the digital. changing technology, the unbounded [...] of a project, changes in copyright law, and more can create issues with completion. a spring dhq: digital humanities quarterly volume addresses the difficulty of demarcating production boundaries within digital projects from a [...] of perspectives. matthew kirschenbaum asks in his introduction, "what is the measure of 'completeness' in a medium where the prevailing wisdom is to [...] the incomplete, the open-ended, and the extensible?" or, as susan brown et al.
state of their project, "the orlando project, a large-scale and long-standing digital humanities undertaking, reveals an arbitrariness, even a fictive- [...]

kirschenbaum, "done: finishing projects in the digital humanities," dhq: digital humanities quarterly (september ). http://digitalhumanities.org/dhq/vo l / .html

shine: a secure and trustworthy guard in between every transfer, to make sure a text is transferred to an authenticated user from an authorized institute => rise
[diagram: resource providers, rise authorization, research tools, shine api]
https://rise.mpiwg-berlin.mpg.de

the rise infrastructure: a suite of software packages covering the needs of different stakeholders
• a middleware that catalogs all linked resources, and authenticates and authorizes text transfers
• management interfaces for libraries / resource providers, research institutes, and tool developers to set privileges
• a suite of javascript libraries to allow easy integration with the shine api for software developers
• a resource provider software package that allows resource providers (database owners, archival institutions, or even scholars) to share resources in a protected, shine-compatible way

challenges:
how to make the rise middleware work for libraries? how to work with libraries’ internal database management systems automatically?
• to avoid duplication of efforts!
how to make authentication and authorization seamless?
• options beyond shibboleth and rise’s own user registry?
how to technically represent the entire spectrum of licensing rights, from fully open to completely protected?

call for collaboration
this should be a network built by the community
• work together to define this network
• call for collaboration with libraries to test this concept!
rise will allow libraries to offer digital scholarship with both licensed and open resources!
check our website for detailed documentation, api, & available toolkits
• https://rise.mpiwg-berlin.mpg.de

special edition dec

digital media, technologies and scholarship: some shapes of eresearch in educational inquiry
lina markauskaite
university of sydney

abstract
this paper discusses some recent developments in digital media, research technologies and scholarly practices that are known under the umbrella term of “eresearch”. drawing on conceptual ideas of digital materialism, epistemic artefacts and epistemic tools, this paper discusses how the digital inscription of knowledge and knowing could change the nature of knowledge work in educational research and inquiry. this paper argues that eresearch challenges the conventional divide between “monological” and “dialogical” research practices and provides opportunities to create “trialogical” ways of inquiry. these trialogical practices involve not only the collaborative development of answers to research questions, but also require explicit attention and development of new digital epistemic infrastructures – digital resources, software and conceptual tools and social structures. our limited understanding about educational knowledge building practices is one of the major challenges for further advancement of educational research.

introduction
the last three decades have been marked by the gradual digitalisation of human culture, knowledge and learning. evolving digital media and technologies – such as computers, the internet and mobile devices – have been constantly generating new waves of promises and fads.
concurrently, techno-optimistic visions about the egalitarian knowledge society, lifelong learning and the digitally savvy net generation are periodically being offset by scepticism suggesting that these digital developments are just reproducing existing patterns of power, inequalities and illiteracies (cf. bennett, maton, & kervin, ; cuban, ; prensky, ; van dijk, ; wyatt et al., ). researchers in diverse fields, such as philosophy, sociology, psychology and education, have perceptively noticed this digital shift and turned their scholarly attention to the implications of digital media and technologies on the foundational notions of education, and teaching and learning practices. cross-national studies on information and communication technologies (ict) in education (law, pelgrum, & plomp, ); the learning sciences and technology research on how people learn (bransford, brown, & cocking, ; sawyer, ) and scholarly debates over the “digital literacies”, “digital natives” and “online identities” (bennett et al., ; greenhow & robelia, ; lankshear & knobel, ) are just the tip of the iceberg of scholarly knowledge produced in this field.

the australian educational researcher, volume , number , december

a similar digital shift is occurring in research practices (jankowski, ; schroeder, ). e-journal databases, internet search engines, email and other digital media and technologies have become the mainstay of scientific inquiry routines. further investments into creating special advanced-technology research infrastructures and services – often known in australia under the umbrella term eresearch – have been accompanied by a wave of grand visions and sharp debates over the potential of digital media and technologies to transform the ways in which research and scholarship are carried out (cf. atkins et al., ; borgman, ; hine, ; ncris committee, ).
as the australian eresearch vision puts it, the transformation process being enabled by advanced and innovative ict, “offers the power to undertake research on a scope previously unattainable, to work collaboratively and globally in a way not previously possible, and to improve existing research” (dest, , p. ). this initial attention on well engineered eresearch infrastructures for “big science” has been followed by the emergence of more participatory “cloud” technologies and web . applications – such as googledocs and facebook – and new hopes that eresearch could induce more open and more democratic forms of research practice (greenhow, robelia, & hughes, ; schleyer et al., ). the importance of new scientific practices in education and student learning has been generally well acknowledged (e.g., borgman et al., ; cra, ; underwood et al., ) – but admittedly this has received less attention in the australian educational context. some efforts have also been made to use eresearch for educational research (e.g., carmichael, ; romero & ventura, ). nevertheless, educational researchers, with only a few exceptions (viz. eisner, ; greenhow et al., ; smeyers & depaepe, ; voithofer, ), have been rather slow to embrace and explore new ways of doing educational research in their intellectual debates. how does the digital inscription of data, inquiry tools and interactions change the nature of knowledge and knowing? how does the digital inscription of learning change the nature of research questions and practices in educational research? finally, how do eresearch affordances change the ways in which educational research knowledge is (and could be) constructed and communicated?
the questions posed here are large and complex; and, in this paper, i expand upon several important facets only.
i argue that the shift to digital inscription of knowledge and knowing challenges the conventional division between monological and dialogical research practices and provides an opportunity to engage in – what paavola and hakkarainen ( ) labelled – trialogical knowledge creation. in educational eresearch, these trialogical practices involve not only collaborative development of shared objects of inquiry for answering research questions, but also explicit attention to, and the simultaneous development of, new digital epistemic infrastructures that consist of digital resources, software and conceptual tools and social structures. initially, i introduce key eresearch notions and show some links with challenges in educational research. next, i discuss three major eresearch affordances in detail: data and knowledge resources; data-rich and computation-intensive research methodologies; and collaborative knowledge building practices. i construct my argument on two planes. first, i draw some parallels between the nature of eresearch affordances and issues in educational research. second, i show several gaps between eresearch notions and present knowledge practices. i conclude by discussing some immanent challenges for building better epistemic infrastructures for educational research and deliberative knowledge advancement. educational research and inquiry is a broad field, and i do not intend to argue that digital media and technologies should have a similar role and place in all inquiry practices (perhaps none at all in some). i also do not claim that they could contribute to all answers and solutions. what i want to show, in the context of this discussion, is that digital media and technologies could provide possibilities to do research differently and to investigate in new ways some old and new educational issues that cannot otherwise be explored and therefore not solved.
eresearch: notions, visions and potential
in the broadest sense, eresearch refers to scholarly practices enabled or enhanced by the combination of three developments of advanced digital technologies: large integrated data repositories; high-performance computing; and high-speed computer networks (wouters, ). new research opportunities typically arise from the possibilities to consolidate distributed raw data and other knowledge resources, technological capacities and human expertise and, consequently, to work together on more global big-picture problems or conduct explorations at new levels of detail (figure ). examples of such research problems range from the modelling of climate change and the exploration of human genome structures, in the physical sciences, to the studies of large linguistic corpuses, integrated social policy analyses and forecasts of educational systems, in the humanities and social sciences (blanke, hedges, & dunn, ; dzemyda, saltenis, & tiesis, ; hey, tansley, & tolle, ). with the evolution of digital media and technologies, the distinction between eresearch practices that fundamentally rely on advanced “high-end” technologies and everyday scholarly eresearch practices that apply an ordinary desktop computer connected to the internet has become increasingly blurred (cf. anderson & kanuka, ; atkins et al., ). similarly, the role of eresearch in different inquiry practices has become more varied and its impact on the ways in which knowledge is produced has been different and sometimes highly debated. on the one hand, eresearch is perceived as an element enhancing existing methodological and theoretical research traditions and practices (hine, ; wouters, ). on the other hand, there is a strong argument that eresearch has given birth to an epistemically coherent research paradigm of data-intensive scientific discovery (hey et al., ).
furthermore, there is an ongoing debate over the impact of digital media and technologies on the nature of scholarship arguing that existing moral, cultural and organisational frames that historically operated in more self-contained real world research environments are not a good fit for the more distributed and cross-disciplinary eresearch practices of the digital world (borgman, ; greenhow et al., ). while there are some pockets of successful eresearch practices, the actual levels and ways of adoption vary both across and within disciplines (e.g., see hine, ; jankowski, ). even so, the rapid adoption of a number of eresearch affordances across domains – such as e-journal databases – and substantial embracement of digital technologies in some research areas that were historically rather distant from technologies – such as linguistics, arts and archaeology – have signalled that eresearch might have the potential to contribute to a rather broad spectrum of research questions and inquiry practices, in various fields, including education (e.g., carmichael, ; pea, ; romero & ventura, ). nevertheless, these eresearch contributions might be more carefully nuanced and specific to the issues and disciplinary practices than the initial techno-optimistic visions (e.g., atkins et al., ) tended to state.
[figure . main eresearch elements and affordances. labels: data and knowledge resources; networks; approaches, techniques and tools; people and expertise; integrated eresearch environments for collaborative distributed inquiry; data-rich and computation-intensive research methods; integrated multimodal datasets and resources; user-sensitive platforms for dissemination.]
many scholars in science and technology studies have argued that research technologies and our knowledge creation practices mutually shape each other (schroeder & fry, ; woolgar & coopmans, ; wouters et al., ).
while inscription technologies shape the ways in which data and knowledge are represented, created and shared (voithofer, ), existing research questions and practices influence the choice of technologies and ways they are used (wouters, ). the central question for education, as for other social sciences, is “how it is possible to develop novel ways of knowledge creation … by utilising and adapting e-research concepts, instruments and ways of working” (wouters, , p. ).
linking eresearch affordances and issues in educational inquiry
in recent years, there has been ongoing debate over the numerous issues in educational research, naming among many others such limitations as a lack of rigour, theoretical incoherence, irrelevance to schools, lack of involvement of teachers, inaccessibility and poor dissemination, failure to produce cumulative research findings, ideological bias and poor cost effectiveness (e.g., kaestle, ; whitty, ). while educational problems are “wicked” (conklin, ) and do not easily lend themselves to being fully understood, at least some of the limitations come from the rather fragmented nature of educational research and inquiry practices. to summarise some insights of other scholars (dede, ; mcwilliam & lee, ), heterogeneity of research traditions and methods and low compatibility of techniques result in difficulties in analysing data from multiple perspectives and, as a result, in producing more ecological evidence. even if some data could be (re)analysed from several methodological or social perspectives, limited collaboration and data sharing confine such opportunities. furthermore, solitary research cultures and processes often restrict the possibilities to integrate methodological and stakeholder perspectives and, as a result, to produce findings relevant to schools, decision-makers and consequential stakeholders.
finally, the narrow focus of scholarly dissemination on an academic audience limits the impact of research on educational practice and policy. these limitations together restrict opportunities for more cumulative and iterative use-oriented knowledge building. putting these issues and eresearch side by side, one could make a justified projection that eresearch might offer some new affordances for educational inquiry (figure ). i will discuss each of these three eresearch affordances in the next sections.
learning “data deluge”: taking data seriously
many issues in educational research pertain to data. for example, debates about the quality of “what works” analyses, the transparency of qualitative research and the usefulness of the findings for teaching practice or decision-making, at the end of the day, converge on the issues of the quality of published data or availability of raw data (cf. eisner, ; freeman et al., ; schneider, ; slavin, ). at the same time, educational practices and research studies generate increasingly larger volumes of more complex data that far exceed human capabilities for manual analysis, do not fit the linear format of printed media and exceed the limited space in printed academic journals. analysed and published data are inevitably a selective re-representation of a small amount of the raw data originally collected in the field (woolgar & coopmans, ). they “lock in” numerous methodological and pragmatic choices made at different stages of research and limit subsequent interrogation. data that is abstracted and prepared for publishing does not necessarily provide sufficient information for those who would look at the raw representation of the phenomenon from different epistemic or social perspectives, and does not allow one to zoom in from the abstraction back to the original record.
for example, synthesised results from the analysis of a classroom innovation published in an academic paper might have a limited use for a teacher who might be interested in implementing a similar innovation in her classroom and would benefit from access to the lesson resources or video record analysed in the study, yet not available in the publication. similarly, data prepared for a teacher might have little value for a parent or other stakeholder who might be more interested in students’ accounts of their experiences that were captured in the interviews rather than in pedagogical nuances.
[figure . common issues in educational research and eresearch affordances. issues in educational research: solitary research practices; lack of collaboration among policy-makers, practitioners and researchers; heterogeneity of research methods; many small datasets and knowledge resources; different audiences and needs. eresearch affordances: integrated eresearch environments for collaborative distributed inquiry; data-rich and computation-intensive research methods; integrated multimodal datasets and resources; user-sensitive platforms for dissemination.]
while educational researchers still debate whether or not there is such a thing as raw data uncontaminated by human thought or action (cf. erickson, ; freeman et al., ), to put it boldly, materials collected or recorded in the field, if shared, have a potential for generating more knowledge than one chooses to extract and enact in a publication or practice. this argument is illustrated in figure . with the proliferation of digital recording devices and elearning technologies, much of the data gets captured in digital format, and, thus, is ready for further technology-enhanced management, analysis and dissemination.
cheap digital storage creates the opportunity to store and publish almost unlimited amounts of data. furthermore, data and knowledge resources inscribed in digital media have different features from the data inscribed in physical artefacts and provide new possibilities for analysis, presentation and dissemination. voithofer ( ), referring back to the theories of digital materialism, describes five primary ways in which digital media are sites for computerisation of culture and knowledge: numerical representation, modularity, automation, variability and transcoding. to put his argument simply, a digital format allows numerous human and automated re-combinations, transformations, presentations and customisations of data and knowledge and, hence, allows one to investigate the same objects of inquiry inscribed in digital media from multiple methodological and social perspectives. the digital form affords opportunities to represent knowledge and the results of inquiry using multiple media languages (such as video and text) and human discourses (such as teacher, decision maker, student and parent); and, when needed, to backtrack from the abstracted representations to raw data.
[figure . from real world to enacted knowledge. labels: world; data; results, evidence, claims; “enacted” knowledge; epistemology, ontology, axiology, etc; methods, techniques, etc; publication, action, etc; stakeholder perspectives; academic research; practice; potential for new knowledge.]
the idea to create data clouds or shared data banks and make them accessible via multiple interfaces to different audiences is not new; and some disciplines have been very successful in pooling together large amounts of heterogeneous data resources for collaborative exploration.
for example, the astronomy portal skyserver provides free access to the sloan digital sky survey (sdss) astronomy database, which is integrated from many sources and includes over million stars, galaxies, and quasars (skyserver, ). a variety of tools for searching, visualizing and exploring sdss make this database accessible for professional and “citizen scientists”. similarly, publishing data in open peer-reviewed data repositories, interlinking them with journal publications or creating presentations of findings for different audiences are well-established practices in some disciplines. these ideas are not completely new for education. for example, data that come from the well known timss and pirls international studies are also publicly available for further interrogation (see iea, ). nevertheless, such cases are few and are typically based on a few well-structured datasets. a more challenging question is how to share and integrate heterogeneous datasets needed for cumulative research, such as a dataset for longitudinal research on teacher education envisioned in the acde’s ( ) scoping study. while one might argue that the issues of anonymity, confidentiality and security of personal and learning data make data sharing problematic in social studies (cf. bishop, ; broom, cheshire, & emmison, ; carusi & jirotka, ; kelly, ; parry & mauthner, ), these ethical questions are likely to have well thought out, sensible solutions. at least some similar fields, such as social welfare, medicine and health, have been rather successful in overcoming trust and security issues and have created shared data infrastructures of highly confidential data for scientific research and practice (burton, purvin, & garrett-peters, ; jirotka et al., ). a rather different serious challenge in education is what cole ( ) has labelled as an “indifference to data” – limited explicit attention to data and how they are produced.
for example, one common issue in understanding causal explanations of learning phenomena is the importance of the context (freeman et al., ; koro-ljungberg et al., ; maxwell, ); and, in order to make data re-usable and open for meaningful re-interpretations, the context that led to the data and the context in which these data were produced need to be recorded and shared. such data “provenance” or “pedigree” records are rarely explicitly produced in educational research. in fact, social researchers have almost no vocabulary and norms for articulating and sharing raw data or data about the data and their contexts. the integration of data, as cole ( ) has noticed, has not occurred by accident even in well structured scientific domains; and maintaining the coherence of data and outputs requires one’s willingness and explicit efforts to prepare data for sharing and to share them. in the context of massive school computerisation, one more specific type of data needs more explicit attention – “digital traces” of learning activities that are (or could be) automatically captured in learning environments (borgman et al., ; schooneboom et al., ). when much of the interaction happens in digital media and is distributed over physical settings, learning phenomena become almost inaccessible for direct observation for both teachers and researchers. nevertheless, these interactions leave an extensive digital trace; and more comprehensive understanding of such large scale political, social and cognitive phenomena – such as the australian digital education revolution (commonwealth of australia, ) – becomes incomplete, if not impossible, without making sense of these “machine observations”. data collected for technical purposes do not necessarily come in a form that is suitable for answering educational questions.
nevertheless, these traces often have been either ignored or taken as given in educational research and there has been little attention to the possibilities to get the right (or better) data for a problem. this indifference to digital traces is another indication of a larger issue – educational researchers try to produce better answers to old and new research questions, but put relatively little effort into understanding data and creating better data infrastructures and, as i subsequently argue, inquiry tools that are the key drivers of innovation in many scientific and practical fields.
data-rich research methods: shapes of “the fourth” paradigm in educational research
in the history of scientific inquiry there have been several major shifts: from empirical science, based on the description of natural phenomena thousands of years ago; to theoretical science, based on logical reasoning, measurements, experimentation and mathematical manipulation, hundreds of years ago; to computational science, based on the simulations and modelling of complex phenomena several decades ago (jackson, ). developments of new research instruments for measurement and observation and, later, for computation and modelling, have been at the core of these two scientific revolutions. eresearch gave rise to the fourth scientific paradigm of data exploration, based on the synthesis of theories, experiments and computation using large data set exploration techniques (hey et al., ). the development of software and conceptual tools for digital data management and data-intensive knowledge discovery has been at the core of this shift.
the history of educational research cannot be reduced to technical choices of method; nevertheless methodological advancements in educational inquiry at least in part have been coupled with the advancements in research instruments for empirical observation and measurement and at least broadly have mirrored the progress from the descriptive to hypothesis and theory driven quantitative and qualitative research (e.g., lagemann, ; shulman, ). while the importance of (conceptual) research tools in educational inquiry has been well acknowledged, the role of digital technologies has not been as radical and as widespread in educational research as in physical sciences; and the shape of the third and the fourth paradigms in educational research is a contentious question. for example, hard system modelling, despite its broad application in economic, policy and other social fields and increasing interest in complexity perspectives in education (e.g., jacobson & wilensky, ; radford, ), has seen very few applications in educational research. some data-intensive research approaches – such as knowledge discovery in databases, typically known as educational data and text mining – have made significant inroads, but most progress has been made in advancing mining algorithms and little in advancing educational knowledge. as an indication, all contributors to the book “data mining in elearning” (romero & ventura, ) come from computer science and related fields, while none of the papers published in the australian educational researcher over the last seven years mentions data mining. the complexity of educational systems and the recent learning data deluge do not allow one to think that education might have no issues that are suited for computational modelling and data-rich discovery. on the contrary, the more pragmatic and grounded-in-data logic of these methodological approaches appears to be well suited for a practical domain such as education.
for example, data and text mining involves an iterative process of sniffing through large amounts of data and discovering patterns and relationships (zhao & luan, ). this exploration combines interpretative investigation with scientific data-based reasoning. nevertheless, the logic that guides such knowledge discovery contrasts with both positivistic and interpretative research traditions. for example, data miners start their exploration with no a priori assumption about the existence or nature of relationships in data. in contrast to statisticians, they do not focus on establishing generalisations across samples and do not judge their findings on the basis of statistical significance; rather they try to detect different possible patterns and idiosyncratic behaviours and judge their findings on the basis of practical significance. such open opportunistic logic is essentially unrecognised in educational inquiry.
nevertheless, computation-intensive and data-rich research techniques involve one’s work with transformed data representations mediated by technology that are typically less intuitive than purposefully collected and manually processed data. moreover, explorations of social phenomena in digital spaces often require researchers to combine conventional methodological traditions with data-driven technology-mediated ways of inquiry (hine, ; markham & baym, ). for example, a virtual ethnographer, who explores a large online learning community distributed across physical settings and time, has to go beyond direct observation and authentic thick experience and engage with more fragmented and shallow ways of technology-mediated observation, such as visual representations of social networks or the exploration of digital traces. such work on the methodological boundaries of rather contrasting inquiry traditions presents a serious epistemic challenge.
ultimately, a handful of eresearch methods have challenged the social organisation of knowledge production by involving citizen scientists in legitimate scientific discovery. for example, recent astronomical discoveries about the rotation of galaxies have been made by a network of children, teachers and other lay explorers, who collaboratively over a one year period produced more than million classifications of a million galaxies that were later only summarised by professional scientists (lintott et al., ). similarly, in education, research planning and, particularly, data analysis and interpretation that have been rather exclusive areas of monologic academic work have increasingly become a collaborative practice (armstrong et al., ; carmichael, ; laterza, carmichael, & procter, ; pea, lindgren, & rosen, ; ritchie & rigano, ). collaborative research platforms and data analysis tools, such as distributed video analysis, provide possibilities to involve participants and other stakeholders in different research stages (carmichael, ; pea et al., ). such practices, however, are rare. the main challenge, however, is that computational and data-rich methods as well as socially rich forms of inquiry present rather different ways for generating knowledge that are trans-disciplinary in a deep ontological and epistemological sense. they require researchers to understand how “e” works, how it could be combined with disciplinary issues, conceptual knowledge and the social organisation of inquiry, and what kinds of knowledge these combinations could produce. this blending of professional expertise with digital technology-mediated ways for constructing knowledge inevitably encourages more collaborative configurations of social fabric for knowledge production, and new notions of scholarship.
collaborative inquiry: shapes of digital scholarship
as borgman ( ) has argued, the main purpose of eresearch is to enable new forms of scholarship that are more data and information-intensive, distributed, collaborative, and multidisciplinary. discussions about how digital media and technology-mediated ways of knowledge production affect the role and practices of academia have produced an array of notions of “digital scholarship” that vary in terms of their focus on different practices and outputs. one of the classical definitions proposed by the american council of learned societies ( ) has emphasised new types of knowledge products and included in the notion of digital scholarship the following practices:
• building a digital collection of information for further study and analysis;
• creating appropriate tools for collection-building;
• creating appropriate tools for the analysis and study of collections;
• using digital collections and analytical tools to generate new intellectual products;
• creating authoring tools for these new intellectual products, either in traditional forms or in digital form. (p. )
in short, according to this notion, digital scholarship has a strong epistemic focus, but includes a whole range of new intellectual products that are not necessarily the definitive answers or solutions, but are part of an epistemic infrastructure for building digital knowledge collaboratively. in addition, digital media and technologies have expanded scholarly dissemination opportunities, threatening the monopoly of commercial publishers and established scientific communities and making publishing quicker, more transparent and less restrained by one-way textual format (poschl, ; seringhaus & gerstein, ; see also “scholarly communication” section in hey et al., ).
for example, in addition to many self-publishing opportunities in wikis, blogs and other web spaces, open publishing and open peer-review make traditional blind processes fully visible for the audience (poschl, ). further, papers that are open for continuous peer commentary and discussion have become a viable peer-review and dissemination alternative. more recent developments in participatory technologies have generated a set of new notions of scholarship with stronger emphasis on social knowledge practices. as greenhow et al. ( ) indicate, the main qualities of social scholarship are “openness, conversation, access, sharing and transparent revision” (p. ); while “validity of knowledge in web . environments is established through peer review in an engaged community, and expertise entails offering syntheses widely accepted by the community” (p. ). roughly speaking, digital scholarship includes practices that are beyond direct contribution to conceptually new knowledge, such as engaging in debates, the repackaging of existing knowledge into new intellectual products and bringing them beyond traditional spaces of academia.
in short, these digital shifts place a different emphasis on cognitive and social values in knowledge work. educational scholars, however, have been more successful in embracing (some of) the opportunities for social scholarship than for more cognitively rigid knowledge production (see discussion in greenhow et al., ; zhang, ). as zhang ( ) comments, participatory approaches and tools – such as wikis – are, “strong in supporting knowledge sharing [italic added], but relatively vague and weak in advancing [italic original] community of knowledge” (p. ). innovation requires sustained commitment to progressive advancement of knowledge and intellectual rigour; and sharing loosely related ideas or voting for the most popular idea does not always contribute to the design of a practical solution or formulation of a higher-level idea.
present digital resources and web . tools are relatively weak in their support of epistemic commitment and smarter knowledge work of educational scholars and practitioners. in contrast, scholarly practices with a stronger epistemological focus, such as creating tools for inquiry, contributing data to repositories or creating and maintaining repositories of educational data, have not gained much attention, leaving more collaborative knowledge building practices with little support from collaborative knowledge building infrastructure.
educational eresearch as a trialogical inquiry: some reflections on missing links
the classical division between acquisition and participation (sfard, ), that was later relabelled by paavola and hakkarainen ( ) as monological and dialogical approaches to knowing, can be seen not only in educational accounts of learning, but also in our interpretations of scholarly practices, including digital scholarship. on a deep cognitive level, monological knowledge work has a strong presence in digital practices of educational researchers. as an indicator, an eresearch survey conducted in nsw found that about two thirds of educational researchers at least occasionally use spss, nvivo and some other individualistic software on which they could offload some traditional cognitive tasks, but less than % ever use collaborative virtual research environments. on the public discourse level, the attention to dialogical practices – such as open publishing, social networks and other participatory practices – has overshadowed the monological approaches (cf. dede, ; greenhow et al., ; zhang, ). in both cases the attention to how digital media and technologies could support sustainable cognitively nontrivial collaborative educational inquiry has been limited.
as paavola and hakkarainen ( ) have argued, innovative knowledge communities advance knowledge by engaging in trialogical knowledge work – collaboratively developing shared “mediated objects” and “mediated artefacts”. while they create social structures and collaborative processes that support knowledge sharing, they also maintain a strong commitment to generating ideas and conceptual knowledge. eresearch platforms and networks provide a medium for instantiating current understanding, interacting and advancing collaboratively shared conceptual objects, but they do not offer a ready-made digital infrastructure – including data and tools – for educational inquiry. shared data and knowledge resources provide the backbone for more collaborative and open knowledge building. nevertheless, research that could be useful, integrated and re-used by others requires an explicit attention to sharing, and neither data nor research processes will become more accessible or transparent just by putting it all online. one of the major (intellectual) challenges for the educational research community is to construct shared vocabularies and grammars appropriate for describing different kinds of knowledge work and different sorts of intellectual products. the actual use of these grammars in research practices is the next (social) challenge. technological artefacts and computers could be useful for assisting with or doing some traditional cognitive tasks (such as calculating or text processing); nevertheless they also embody knowledge that is not accessible to humans directly and are able to perform cognitive tasks that humans cannot do. in scientific practices, digital technologies have increasingly gained the status of cognitive partners (nersessian, ). computation-intensive modelling and data-rich research methods, in essence, represent this partner’s role of digital technologies in educational research.
for example, humans cannot make much sense of long hours of classroom video observations or online learning data without seeing and interacting with computer visualisations, nor can computers create meaningful data representations without human involvement. this partner’s role of digital technologies has been little realised and little used in educational inquiry. an important aspect of experimentation that leads to discoveries in scientific laboratories is building, improving and customising these cognitive partners. in educational research, rather differently, digital research tools and infrastructures are, by and large, taken as immutable and often constraining. the major challenge is that such cognitive partners cannot be built or tweaked to answer research questions by computer experts without the cognitive partnership of educational researchers who know the research questions and conceptual frames of the field. furthermore, shared data and technology infrastructures for educational inquiry are not bounded by one research team, laboratory or academic community, but extend to decision makers, practitioners and consequential stakeholders, such as students and parents. what kind of understanding and expertise do educational researchers need about digital media structures and computer languages in order to be able to engage in such partnerships with computers and computer experts? what does it mean to build or tweak a shared cognitive partner that crosses the boundaries of laboratories, epistemic communities and social discourses? as the nsf report (borgman et al., ) points out: humans reason differently in stem [science, technology, engineering and mathematics] domains – and learn differently – when the knowledge representational systems for expressing concepts and their relationships are embodied in interactive computing systems, rather than historically dominant text-based or static graphical media. (p.
) in contrast to the numerous studies of knowledge work in scientific laboratories (e.g., knorr-cetina, ; nersessian, ), the nature of knowledge work in educational research has been little researched, not only in the digital, but also in the physical world. how do educational researchers and others involved in inquiry make sense of, use and advance their inquiry frameworks and tools; how do they interact with them and make their decisions; in what kinds of social interactions do they engage when they create knowledge? unless we understand the “nitty gritty” of these often diverse non-digital and digital knowledge practices better, it might be difficult to think about how we could build shared (digital and conceptual) frameworks and infrastructures for collaborative knowledge building. finally, in the context of the massive digitalisation of culture, knowledge, and learning, how much educational research could still have ecological validity without researching the digital part of it? what kinds of values, knowledge and capacities should be included in a standard “epistemic toolbox” of an educational researcher or teacher-researcher who needs to understand learning in both worlds and make numerous trans-ontological, trans-epistemic and trans-cultural choices while researching them? as eisner ( ) argues, “we tend to seek what we know how to find” (p. ). one of the biggest challenges for the educational community is to create an epistemic infrastructure that is beyond this zone.

endnotes

different terms are used to refer to similar, advanced-technology enhanced research infrastructures and practices in different countries and disciplinary contexts – such as “escience”, “ehumanities” and “esocial sciences” in the uk; “cyberinfrastructure” in the us; and “einfrastructure” and “the grid” in the european union (o’brien, ).
in this paper, i do not make a distinction between scientific inquiry (i.e., research) and all other forms of scholarly, professional and technical inquiry. i use these terms to refer to different shades of principled, deliberative ways to investigate something that is not adequately understood in order to advance knowledge or produce a practical solution.

specifically, networks make it possible to interconnect distributed small and large datasets and other knowledge resources stored in different digital media (such as documents, videos and other digitalised artefacts) that could be used for integrated research, secondary analysis and evidence-based decisions. further, high-performance computing allows for the employment of new data analysis techniques that are required to perform numerous computations and data transformations, such as the modelling of complex systems, exploration of patterns in large data sets using knowledge discovery algorithms, visualisation of interactions in large social networks or painstaking video analysis of complex social interactions. finally, networks allow researchers to engage with new collaborative distributed inquiry practices, such as joint research design and writing or remote data analysis in virtual research environments.

in this paper, i use the term “eresearch” broadly to refer to a whole range of ict-enhanced research approaches and practices: from advanced high-speed computation-intensive collaborative research (e.g., atkinson, ; hey et al., ) to more conventional ict-enhanced research, such as the use of digital libraries and web . collaboration (e.g., anderson & kanuka, ; greenhow et al., ).
as an illustration of this issue, the integrity of design-based research and other qualitative inquiries is challenged by the bartlett effect: selecting for analysis and, consequently, for presentation only those data that support one’s favourite hypothesis (brown, ); traces of students’ elearning automatically captured in digital learning environments create a “learning data deluge” (borgman et al., ).

according to voithofer ( ), who cites manovich ( ), numerical representation (e.g. digitalisation) allows one to manipulate and program digital inscriptions and, in this way, provides people with possibilities to recombine and customise these representations in many ways. modularity allows one to combine and recombine media objects without losing their individual characteristics, and present them in various configurations through diverse interfaces. automation allows one to generate user-defined queries and pre-programmed interactions and, in this way, enables people to simplify complex processes of storing, searching and retrieving information. variability allows one to separate the content from the presentation and to derive the latter from human or machine manipulation and interaction. in this way, media objects and data could be presented in multiple user-sensitive ways. transcoding allows one to blend computer languages (e.g., algorithms, data structures), media languages (e.g., visual composition, genre) and other human discourses (e.g., research, educational and political discourses) in different ways. all of these features allow one to recombine data in various ways without losing the link to its original non-transformed form.

for example, before their papers get published in the highly rated plos computational biology journal (plos, ), computational biologists have to deposit data sets associated with their publication in an open biology repository, such as the worldwide protein data bank – pdb (wwptb, ).
furthermore, some authors upload to the scivee portal a “pubcast” or other media resources in which they explain and discuss their scientific findings further with professional and non-professional audiences (see scivee, ).

in contrast, educational researchers have a relatively well established vocabulary and norms of sharing information about their argumentative grammars (i.e., method) that justify their data and link evidence to the claims.

more detailed discussion of these questions and the provenance of educational data can be found in reimann and markauskaite ( ) and markauskaite and reimann ( b).

more detailed discussion about technology-mediated research methods for educational inquiry can be found in markauskaite (in press).

solid published books, annual conferences and the recently established “journal of educational data mining” are just a few manifestations of advancements in educational data mining. more information can be found on the international working group on educational data mining website (edm, ).

the australian educational researcher’s full text archive ( . )- ( . ) has been interrogated using the google query “data mining”, site:http://www.aare.edu.au/aer

further ideas about how eresearch could be used to enhance teacher-researcher innovation and inquiry can be found in markauskaite and reimann ( a).

in academic discourses, a range of other terms have been used to describe different shades of “digital scholarship”, such as “escholarship”, “social scholarship” and “scholarship . ”.

these results are based on eresearch survey data (as at january ) collected at seven nsw universities in - . see markauskaite, aditomo and hellmers ( ) for the year report.

digital knowledge and eresearch raise new questions of institutional control, power and inequity. for example, some of them have already been evident in the recent debates about the myschool website, which allows the performance of all australian schools to be compared (my school, ).
while socio-political agendas behind eresearch have been in the background of this paper only, these questions are an integral part of digital scholarship. i believe this is one of the main “trans-epistemic” and “trans-cultural” questions that those who will create “digital epistemic infrastructures” will need to answer responsibly.

acknowledgements

i wish to thank nicola johnson, shannon kennedy-clark, michael jacobson and raewyn connell for their help in shaping this paper. without their careful reading and feedback this paper would never have become what it is now. all claims and mistakes, nevertheless, are my sole responsibility.

references

australian curriculum, assessment and reporting authority (acara). ( ). my school website. retrieved august , , from australian curriculum, assessment and reporting authority web site: http://www.myschool.edu.au.
australian council of deans of education (acde). ( ). data repository for teacher education scoping study. australia: the australian council of deans of education.
american council of learned societies commission (acls). ( ). our cultural commonwealth: the final report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. retrieved august , , from acls website: http://www.acls.org/uploadedfiles/publications/programs/our_cultural_commonwealth.pdf
anderson, t., & kanuka, h. ( ). e-research: methods, strategies and issues. boston: pearson education inc.
armstrong, v., barnes, s., sutherland, r., curran, s., mills, s., & thompson, i. ( ). collaborative research methodology for investigating teaching and learning: the use of interactive whiteboard technology. educational review, , - .
atkins, d. e., droegemeier, k. k., feldman, s. i., garcia-molina, h., klein, m. l., messerschmitt, d. g., et al. ( ). revolutionizing science and engineering through cyberinfrastructure.
report of the national science foundation blue-ribbon advisory panel on cyberinfrastructure. arlington, va: directorate for computer and information science and engineering, national science foundation.
bennett, s., maton, k., & kervin, l. ( ). the “digital natives” debate: a critical review of the evidence. british journal of educational technology, ( ), - .
bishop, l. ( ). protecting respondents and enabling data sharing: reply to parry and mauthner. sociology, ( ), - .
blanke, t., hedges, m., & dunn, s. ( ). arts and humanities e-science – current practices and future challenges. future generation computer systems, ( ), - .
borgman, c. l. ( ). scholarship in the digital age: information, infrastructure, and the internet. cambridge, ma: the mit press.
borgman, c. l., abelson, h., dirks, l., johnson, r., koedinger, k. r., linn, m. c., et al. ( ). fostering learning in the networked world: the cyberlearning opportunity and challenge, a st century agenda for the national science foundation. arlington: nsf task force on cyberlearning.
broom, a., cheshire, l., & emmison, m. ( ). qualitative researchers’ understandings of their practice and the implications for data archiving and sharing. sociology, ( ), - .
bransford, j. d., brown, a. l., & cocking, r. r. (eds.). ( ). how people learn: brain, mind, experience, and school. washington, dc: national academy press.
brown, a. l. ( ). design experiments: theoretical and methodological challenges in creating complex interventions. the journal of the learning sciences, , - .
burton, l. m., purvin, d., & garrett-peters, r. ( ). longitudinal ethnography: uncovering domestic abuse in low-income women’s lives. in g. h. elder jr. & j. z. giele (eds.), the craft of life course research (pp. - ). new york, ny: guilford press.
carmichael, p. ( ). introduction: technological development, capacity building and knowledge construction in education research. technology, pedagogy and education, ( ), - .
carusi, a., & jirotka, m. ( ). from data archive to ethical labyrinth. qualitative research, ( ), - .
cole, f. t. h. ( , december - ). taking “data” (as a topic): the working policies of indifference, purification and differentiation. paper presented at the th australasian conference on information systems, christchurch, new zealand.
conklin, j. ( ). wicked problems and social complexity. new york: wiley.
commonwealth of australia. ( ). success through partnership: achieving a national vision for ict in schools. strategic plan to guide the implementation of the digital education revolution initiative and related initiatives. retrieved august , from http://www.deewr.gov.au/schooling/digitaleducationrevolution/documents/der% strategic% plan.pdf
computing research association (cra). ( ). cyberinfrastructure for education and learning for the future: a vision and research agenda. washington, dc: computing research association.
cuban, l. ( ). oversold and underused: computers in the classroom. cambridge, ma: harvard university press.
dede, c. ( ). comments on greenhow, robelia, and hughes: technologies that facilitate generating knowledge and possibly wisdom. educational researcher, ( ), - .
department of education, science and training (dest). ( ). an australian e-research strategy and implementation framework: final report of the e-research coordinating committee. commonwealth of australia: australian government, dest.
dzemyda, g., saltenis, v., & tiesis, v. ( ). forecasting models in the state education system. informatics in education, ( ), - .
edm ( ) international working group on educational data mining. retrieved april , , from: http://www.educationaldatamining.org
eisner, e. w. ( ). the promise and perils of alternative forms of data representation. educational researcher, ( ), - .
erickson, f. ( ). definition and analysis of data from videotape: some research procedures and their rationales. in j. l. green, g. camilli, p. b. elmore, a. skukauskaite & e.
grace (eds.), handbook of complementary methods in education research (pp. - ). mahwah, nj: lawrence erlbaum associates.
freeman, m., demarrais, k., preissle, j., roulston, k., & pierre, e. a. s. ( ). standards of evidence in qualitative research: an incitement to discourse. educational researcher, ( ), - .
greenhow, c., & robelia, b. ( ). informal learning and identity formation in online social networks. learning, media and technology, ( ), - .
greenhow, c., robelia, b., & hughes, j. e. ( ). learning, teaching, and scholarship in a digital age: web . and classroom research: what path should we take now? educational researcher, ( ), - .
hey, t., tansley, s., & tolle, k. (eds.). ( ). the fourth paradigm: data-intensive scientific discovery. redmond: microsoft research.
hine, c. (ed.). ( ). virtual methods: issues in social research on the internet. oxford: berg.
hine, c. (ed.). ( ). new infrastructures for knowledge production: understanding e-science. hershey: information science publishing.
iea ( ). iea online database. retrieved april , , from international association for the evaluation of educational achievement web site: http://www.ieadata.org
jackson, e. a. ( ). the unbounded vistas of science: evolutionary limitations. complexity, ( ), - .
jacobson, m. j., & wilensky, u. ( ). complex systems in education: scientific and educational importance and implications for the learning sciences. the journal of the learning sciences, ( ), - .
jankowski, n. w. (ed.). ( ). e-research: transformation in scholarly practice. new york, ny: routledge.
jirotka, m., procter, r., hartswood, m., slack, r., simpson, a., catelijne, c., et al. ( ). collaboration and trust in healthcare innovation: the ediamond case study. computer supported cooperative work, ( ), - .
kaestle, c. f. ( ). the awful reputation of educational research. educational researcher, ( ), - .
kelly, a. ( ). in defence of anonymity: rejoining the criticism.
british educational research journal, ( ), - .
knorr-cetina, k. ( ). epistemic cultures: how the sciences make knowledge. cambridge, ma: harvard university press.
koro-ljungberg, m., yendol-hoppey, d., smith, j. j., & hayes, s. b. ( ). (e)pistemological awareness, instantiation of methods, and uninformed methodological ambiguity in qualitative research projects. educational researcher, ( ), - .
lagemann, e. c. ( ). an elusive science: the troubling history of education research. chicago: university of chicago press.
lankshear, c., & knobel, m. (eds.). ( ). digital literacies: concepts, policies and practices. new york: peter lang.
laterza, v., carmichael, p., & procter, r. ( ). the doubtful guest? a virtual research environment for education. technology, pedagogy and education, ( ), - .
law, n., pelgrum, w. j., & plomp, t. (eds.). ( ). pedagogy and ict use in schools around the world: findings from the iea sites study. hong kong: cerc-springer.
lintott, c. j., schawinski, k., slosar, a., land, k., bamford, s., thomas, d., et al. ( ). galaxy zoo: morphologies derived from visual inspection of galaxies from the sloan digital sky survey. monthly notices of the royal astronomical society, ( ), - .
manovich, l. ( ). the language of new media. cambridge, massachusetts: the mit press.
markauskaite, l. (in press). digital knowledge and digital research: what does eresearch offer education and social policy? in l. markauskaite, p. freebody & j. irwin (eds.), methodological choice and design: linking scholarship, policy and practice. dordrecht: springer.
markauskaite, l., aditomo, a., & hellmers, l. ( ). co-developing eresearch infrastructure: technology-enhanced research practices, attitudes and requirements. full technical report. sydney: intersect & the university of sydney.
markauskaite, l., & reimann, p. ( a, june – july ). enabling teacher-led research and innovation: a conceptual design of an inquiry framework for ict-enhanced teacher innovation.
in proceedings of the world conference on educational multimedia, hypermedia and telecommunications. ed-media (pp. - ). austria, vienna: aace.
markauskaite, l., & reimann, p. ( b, july - ). enhancing and scaling-up design-based research: the potential of e-research. in proceedings of the international conference of learning sciences. icls . utrecht, the netherlands.
markham, a. n., & baym, n. k. (eds.). ( ). internet inquiry: conversations about method. los angeles: sage.
maxwell, j. a. ( ). causal explanation, qualitative research, and scientific inquiry in education. educational researcher, ( ), - .
mcwilliam, e., & lee, a. ( ). the problem of “the problem with educational research”. australian educational researcher, ( ), - .
national collaborative research infrastructure strategy (ncris) committee. ( ). review of the national collaborative research infrastructure strategy’s roadmap. commonwealth of australia: deewr.
nersessian, n. j. ( ). how do engineering scientists think? model-based simulation in biomedical engineering laboratories. topics in cognitive science, ( ), - .
o’brien, l. ( ). e-research: an imperative for strengthening institutional partnerships. educause review, ( ), – .
paavola, s., & hakkarainen, k. ( ). the knowledge creation metaphor – an emergent epistemological approach to learning. science & education, ( ), - .
parry, o., & mauthner, n. ( ). back to basics: who re-uses qualitative data and why? sociology, ( ), - .
pea, r., lindgren, r., & rosen, j. ( ). cognitive technologies for establishing, sharing and comparing perspectives on video over computer networks. social science information, ( ), - .
pea, r. d. ( ). video-as-data and digital video manipulation techniques for transforming learning sciences research, education and other cultural practices. in j. weiss, j. nolan & p. trifonas (eds.), international handbook of virtual learning environments (pp. - ).
dordrecht: kluwer academic publishing.
poschl, u. ( ). interactive journal concept for improved scientific publishing and quality assurance. learned publishing, , - .
prensky, m. ( ). digital natives, digital immigrants. on the horizon, ( ), - .
public library of science (plos). ( ). plos computational biology: an official journal of the international society for computational biology. retrieved april , , from http://www.ploscompbiol.org/home.action
radford, m. ( ). researching classrooms: complexity and chaos. british educational research journal, ( ), - .
reimann, p., & markauskaite, l. ( ). new learning – old methods? how e-research might change technology-enhanced learning research (pp. - ). in m. s. khine & i. m. saleh (eds.), new science of learning: cognition, computers and collaboration in education. dordrecht: springer.
ritchie, s. m., & rigano, d. l. ( ). solidarity through collaborative research. international journal of qualitative studies in education, ( ), - .
romero, a. c., & ventura, s. ( ). educational data mining: a survey from to . expert systems with applications, , - .
romero, a. c., & ventura, s. (eds.). ( ). data mining in e-learning. southampton: witpress.
sawyer, r. k. (ed.). ( ). the cambridge handbook of the learning sciences. cambridge: cambridge university press.
schleyer, t., spallek, h., butler, b. s., subramanian, s., weiss, d., poythress, l., et al. ( ). facebook for scientists: requirements and services for optimizing how scientific collaborations are established. journal of medical internet research, ( ), retrieved april , from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= .
schneider, b. ( ). building a scientific community: the need for replication. teachers college record, ( ), - .
schoonenboom, j., levene, m., heller, j., keenoy, k., & turcsanyi-szabo, m. (eds.). ( ). trails in education: technologies that support navigational learning. rotterdam/taipei: sense publishers.
schroeder, r. ( ).
rethinking science, technology and social change. stanford: stanford university press.
schroeder, r., & fry, j. ( ). social science approaches to e-science: framing an agenda. journal of computer-mediated communication, ( ), article .
scivee. ( ). scivee: making science visible. retrieved april, , from http://www.scivee.tv.
seringhaus, m., & gerstein, m. ( ). publishing perishing? towards tomorrow’s information architecture. bmc bioinformatics, ( ), .
sfard, a. ( ). on two metaphors of learning and the dangers of choosing just one. educational researcher, ( ), - .
shulman, l. ( ). disciplines of inquiry in education: an overview. educational researcher, ( ), - .
skyserver. ( ). sloan digital sky survey/skyserver. retrieved april , , from http://cas.sdss.org.
slavin, r. e. ( ). perspectives on evidence-based research in education – what works? issues in synthesizing educational program evaluations. educational researcher, ( ), - .
smeyers, p., & depaepe, m. ( ). educational research: networks and technologies. the netherlands: springer.
underwood, j., smith, h., luckin, r., & fitzpatrick, g. ( ). e-science in the classroom – towards viability. computers & education, ( ), - .
van dijk, j. a. g. m. ( ). the deepening divide: inequality in the information society. thousand oaks, ca: sage.
voithofer, r. ( ). designing new media education research: the materiality of data, representation, and dissemination. educational researcher, ( ), - .
whitty, g. ( ). education(al) research and education policy making: is conflict inevitable? british educational research journal, ( ), - .
woolgar, s., & coopmans, c. ( ). virtual witnessing in a virtual age: a prospectus for social studies of e-science. in c. hine (ed.), new infrastructures for knowledge production: understanding e-science (pp. - ). hershey: information science publishing.
wouters, p. ( , june - ). the virtual knowledge studio for the humanities and social sciences.
paper presented at the first international conference on e-social science, manchester, uk.
wouters, p., vann, k., scharnhorst, a., ratto, m., hellsten, i., fry, j., et al. ( ). messy shapes of knowledge – sts explores informatization, new media, and academic work. in e. j. hackett, o. amsterdamska, m. lynch & j. wajcman (eds.), the handbook of science and technology studies (pp. - ). cambridge, ma: mit press.
wyatt, s., henwood, f., miller, n., & senker, p. (eds.). ( ). technology and in/equality: questioning the information society. london: routledge.
wwptb. ( ). worldwide protein data bank. retrieved april , , from http://www.wwpdb.org/.
zhang, j. ( ). towards a creative social web for learners and teachers. educational researcher, ( ), - .
zhao, c.-m., & luan, j. ( ). data mining: going beyond traditional statistics. new directions for institutional research, , - .

podcasting initiatives in american research libraries

abstract

purpose: the paper examines how many american research libraries produce podcasts, on what subjects they are produced and how those podcasts are promoted.

design/methodology/approach: the researchers looked at each american research library’s website in december to determine if the library had a podcasting initiative and, if so, what topics were covered. general scanning of the website, site search and google search were used to discover podcasts. facebook and twitter pages were also explored to determine if social media was used for podcast promotion.

findings: it was found that approximately one third of american research libraries have a podcasting initiative, the subjects vary widely and social media are only used occasionally to promote the podcasts. the authors conclude that podcasting is a technology that has not yet reached its zenith and libraries have many avenues left still to explore using this technology.
originality/value: the paper explores the use of podcasts in libraries, which has not been explored in the literature.

keywords: podcast, vodcast, social media

paper type: research paper

introduction

academic libraries are leveraging new technologies and social media to engage their targeted audiences and promote valuable resources and services. podcasting is one of the more recent of these technology-driven initiatives. in the article “blogging is so last year – now podcasting is hot,” author janet balas asserts that podcasting is the next big thing in library outreach. now that academic libraries have had a few years to experiment with this newest form of content publishing, it may be a good time to revisit balas’ assertion: is podcasting still “hot”? to approach this question, the researchers examined podcasting activities by association of research libraries (arl) member libraries. of particular interest to the researchers were the percentage of arl member libraries that have produced or are producing podcasts, the type of content arl member libraries are broadcasting via podcasts, podcasting frequency, and how libraries are promoting their podcasts.

literature review

podcasting is a relatively new method of content publishing, so it is important to define the concept. common definitions can be divided into two schools of thought. the new oxford american dictionary provides an example of the first, and more inclusive, definition, stating that a podcast is “a digital recording of a radio broadcast or similar program, made available on the internet for downloading to a personal audio player.” (p. ) by this definition, and similar definitions offered by harris ( ), devoe ( ), and balleste, rosenberg, and smith-butler ( ), a podcast could be as simple as an audio file (e.g. mp ) posted to a website and made available for download.
under the second, narrower definition, a website-hosted audio file is not really a podcast unless the user can subscribe to the broadcast via really simple syndication (rss) or other push technology. thus, a podcast is not just a content package (a product), but a method of content delivery (a service) as well. lee ( ) and balleste, rosenberg, and smith-butler ( ), among others, broaden this definition to include the syndication of video files (e.g. avi and mpeg), now commonly referred to as vodcasting.

education and library literature analyzes podcasting themes from a variety of angles, foremost among them the reasons to podcast, potential podcasting applications, and current library podcasting activities. educause ( ) provides a succinct argument in favor of podcasts: “podcasting cannot replace the classroom, but it provides educators one more way to meet today’s students where they ‘live’ – on the internet and on audio players.” in their article “what students want: generation y and the changing function of the academic library,” susan gardner and susanna eng note four student attributes from their library user survey that could also be used in support of podcasting ( ):

1. they have great expectations.
2. they expect customization.
3. they are technology veterans.
4. they utilize new communication modes.

ralph and olsen ( ) cite these attributes to argue for podcasting as a means to reach tech-savvy millennials, cater to different learning styles, and improve distance education services. griffey ( ) also argues that the ubiquity of the mp , mpeg, and avi formats, and of devices capable of playing those formats, supports the delivery of content via podcast. a january national survey conducted by arbitron and edison research found that % of its respondents in the - year-old bracket and % of its respondents in the - year-old bracket owned an ipod or other portable mp player. furthermore, since personal computing devices are also mp friendly, podcasts enjoy considerable market penetration.
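The difference between the two definitions discussed in this review can be made concrete with a small check. The sketch below is illustrative only and is not drawn from any of the studies cited: it assumes RSS 2.0 feeds in which each episode is attached through an `enclosure` element, and the file extensions, function names, labels, and sample feed are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# Assumed set of media file extensions; adjust to the formats of interest.
MEDIA_EXTENSIONS = (".mp3", ".mp4", ".avi", ".mpeg", ".mpg")


def is_media_link(url: str) -> bool:
    """True if the URL (ignoring any query string) ends in a media extension."""
    path = url.split("?", 1)[0].lower()
    return path.endswith(MEDIA_EXTENSIONS)


def feed_enclosures(rss_xml: str) -> List[str]:
    """Return the URLs of all <enclosure> elements found in an RSS document."""
    root = ET.fromstring(rss_xml)
    return [e.get("url") for e in root.iter("enclosure") if e.get("url")]


def classify(page_links: List[str], rss_xml: Optional[str] = None) -> str:
    """Label a site under the inclusive and the narrower definition."""
    if rss_xml is not None and any(is_media_link(u) for u in feed_enclosures(rss_xml)):
        return "subscribable podcast"      # satisfies the narrower, RSS-based definition
    if any(is_media_link(u) for u in page_links):
        return "downloadable media only"   # satisfies only the inclusive definition
    return "no podcast content"


# Hypothetical feed used purely for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Library tours</title>
<item><title>Tour 1</title>
<enclosure url="http://example.org/tour1.mp3" length="1024" type="audio/mpeg"/>
</item></channel></rss>"""

print(classify([], SAMPLE_FEED))
print(classify(["http://example.org/lecture.mp3"]))
print(classify(["http://example.org/index.html"]))
```

Separating the two checks keeps the product (a media file) apart from the service (a feed a user can subscribe to), which is the distinction the narrower definition turns on.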
several published articles provide examples of library- or librarian-produced podcasts. balas ( ) describes the online programming for all libraries (opal) project, which began offering its archived web-based programs, e.g. book, genealogy, and health discussions, as podcasts. lee ( ) discusses lansing public library’s podcasting efforts that promote its services to the community by connecting targeted audiences to specific programs. ragon and looney ( ) describe the claude moore health sciences library’s podcasting project that provides access to university of virginia health system’s history of the health sciences lecture series. murley ( ) provides examples of law library podcasts, including check this out! by university at buffalo law school’s jim milles and kcll’s sidebar, a monthly legal news podcast from king county law library in seattle, washington. griffey ( ) and ralph and olsen ( ) both describe academic libraries’ podcasting efforts that leverage the new technology to expand instructional services. providing yet another example of a library’s implementation and application of podcasting, library journal ( ) profiles ohio university librarian chad boeninger, who is credited with a number of tech-savvy solutions, including podcast library tours.

the literature also recommends ways to initiate a podcast project, beginning with the identification of appropriate podcast content. balleste, rosenberg, and smith-butler ( ) describe how nova southeastern university (nsu) law library and technology posed several questions before embarking on its podcasting project, starting with the basics: “why should we begin a podcast?” and “what should we podcast?” nsu librarians and it staff quickly identified their audio series, legal replays, as a good starting point. by simply moving these preexisting recordings of faculty lectures into podcast format and adding rss feed capability, they could make it easier for students to stay current with new content.
with this impetus for a podcasting project, librarians identified new podcasting opportunities, including professional lectures and lectures given by visiting speakers. lee ( ) suggests several podcast applications including event promotion, library tours, and book talks. griffey ( ) describes how the university of tennessee at chattanooga produced podcasts to support its instruction program. the literature also provides some guidance to libraries for moving from podcast vision to podcast creation and implementation. harris ( ) presents a few basic resources to help libraries begin their podcasting programs, ranging from a headset with microphone to audacity software for audio recording and editing. ragon and looney ( ) go into greater detail in describing how the university of virginia claude moore health sciences library produced its podcasts to capture and disseminate course lectures. the authors provide production notes, describe the hardware and software resources used for the project, and discuss how metadata was generated to improve podcast visibility. the authors also offer a glimpse of future podcasting projects.

methodology

one hundred and twelve arl member library websites were examined for podcast content during the second and third weeks of december . twelve non-academic arl member institutions were excluded from the study. if no podcast content was found on a site, either by browsing or site search, the website was searched via google using site search functionality (e.g. site:libraries.ou.edu) in conjunction with terms such as “podcast,” “vodcast,” or “mp .” for the purpose of this study podcasts or vodcasts were loosely defined to be any library-produced mp , mp , or similarly formatted content on the library’s website available for download. streaming audio or video that was not downloadable was not included.
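the fallback search step described above can be sketched in a few lines of python. this is only an illustration of how such query strings might be assembled: the function name `build_site_queries` is hypothetical, the domain is the example given in the text, and only the two search terms that survive intact in the text are used.

```python
def build_site_queries(domain, terms=("podcast", "vodcast")):
    """Build one site-restricted search query string per search term.

    `domain` is the library website to restrict the search to; `terms`
    are the keywords to look for (hypothetical defaults taken from the
    terms named in the methodology).
    """
    return [f'site:{domain} "{term}"' for term in terms]


# Example with the domain quoted in the methodology.
for query in build_site_queries("libraries.ou.edu"):
    print(query)
```

each string can then be pasted into a search engine by hand; the sketch does not submit any queries itself.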
a more rigorous podcast definition, one that requires subscription capabilities via rss, was deemed too exclusionary for the purposes of this study. however, library sites providing an aggregated list of podcasts available from external sites were not included in the study. only podcasts produced by the libraries themselves were included. once each arl library that produced podcasts was identified, the podcast content was examined and classified by subject. these classifications included library tours, library resources, recorded lectures, interviews, library news, oral histories, scholarly publishing and art in the library. the total number of podcasts produced in each category was calculated. in addition, the total number of all podcasts was calculated, as was the podcasting frequency for each library. finally, the researchers examined the accessibility of the podcast content to determine whether podcasts were promoted by a link on a library’s homepage, or alternatively, how many clicks from the homepage were required to reach the podcast content. the researchers also looked for instances of podcast promotion on library facebook and twitter pages.

results

the researchers discovered podcast content on thirty-seven of the arl member library websites, roughly a third of the sample. the content, promotion and frequency varied greatly from library to library.

podcast categories by library

an examination of the thirty-seven arl libraries that created podcasts revealed the following: sixteen libraries created podcasts on how to use resources within the library. ten libraries provided podcasts of recorded lectures and events, and eight provided podcasts that included library tours. five libraries produced podcasts that contained recorded interviews, and three libraries produced podcasts that contained library news or oral histories. only one, the university of california, davis, produced a podcast on the works of art in the library.
the massachusetts institute of technology library was the only library to offer a set of podcasts that addressed scholarly publishing.

[figure i: podcast subject taxonomy. categories: using the library, events and lectures, tours, interviews, oral histories, library news, art in the library, scholarly publishing; axis: arl libraries]

podcast frequency

only seven arl libraries appeared to produce podcasts on a recurring basis (figure ii). arizona state university (asu) launches podcasts almost every day and yale university podcasts frequently, but not on a daily basis. the university of oklahoma (ou) produces podcasts on a weekly basis. the other libraries examined do not have a regular schedule of podcasting, and their podcast content appears to be static or changes infrequently.

[figure ii: podcast frequency. axes: number of libraries, number of total podcasts]

promotion of podcasts

podcasts were on average . clicks from the homepage (figure iii). only six of the podcasting arl member libraries provided prominent links to podcast content on their libraries’ homepages. more surprisingly, the researchers discovered podcast content on six sites that were only discoverable through site searches, and had no discernible browse-and-click pathway to the content. three library websites contained links to podcasts that were no longer functional. of the library sites studied, offered an rss feed for subscription to other library information, but only three allowed subscription to podcasts through rss feeds (figure iv); while thirteen libraries regularly provide status updates via twitter, only three libraries promoted their own podcasts in this manner. in addition, thirteen libraries maintained facebook pages, but only two promoted their podcasts on facebook.

[figure iii: podcast location on website. categories: link on main page, clicks from main page, more than clicks from main page, must search; axis: arl libraries]

[figure iv: general and podcast promotional efforts. categories: libraries using twitter, libraries using facebook, libraries using rss feeds; series: to subscribe to podcasts, to promote podcasts, generally]

discussion

the results of this study show that a significant number of arl member libraries employ podcasting as a means of communication, and that libraries are disseminating a wide variety of content via these podcasts. podcasts were most commonly used to describe and promote various library resources. for example, libraries at texas tech university, the university of connecticut, the university of illinois at urbana-champaign, washington state university and johns hopkins university all offer podcasts that function as guides to using the library, many of which offer general research tips. the university of connecticut libraries went one step further in producing an entire podcast series that provides research tips for freshman english students.

lectures and events were the second most popular type of podcast. university of california san diego libraries podcasted a small series of lectures produced in collaboration with its literature department. the university of arizona libraries’ special collections produced several book lecture podcasts, as well as the morales de escarcega lecture series, featuring three faculty lecture podcasts. case western reserve libraries initiated a similar effort in their off the shelf podcast series, which features interviews with authors and faculty. in addition to off the shelf, case western produced a second series of podcasts titled case stories, which provides the oral histories of prominent university figures. temple university libraries provides yet another example of leveraging the podcast to disseminate author interviews and visiting and resident faculty lectures.
recent projects include a guest lecture podcast with janet jakobsen, professor of women’s studies, and an interview with leslie banks, author of the vampire huntress series. tours were the third most common type of podcast. university of iowa libraries’ library tour podcasts are organized by floor while the university of washington libraries provides two podcast tours, one of the suzzallo-allen library and another for the odegaard undergraduate library. other libraries have created podcast tours that focus on special libraries, such as suny buffalo’s health sciences library tour, while other libraries, including suny alabama, have broadened access by podcasting library tours in several languages. for these libraries podcasting was seen as an effective means to help orient new users to physical library spaces. it is also interesting to note that several of the libraries that chose to focus their podcasting efforts on library tours tended to have static podcast collections. the podcasts appear to have served a niche purpose, and once completed, the podcasting initiatives were concluded. libraries with sustained podcasting efforts tended to provide a variety of content, from tours and lectures to interviews and research instruction. these efforts obviously require a greater commitment of resources, either by a dedicated podcaster-in-residence or through broader institutional participation and partnerships. in either case it is highly suggestive that organizations with sustained podcasting efforts consider podcasting to be a worthwhile investment of library resources. asu libraries, which produces five podcasts a week, is one such organization. the subject matter is varied and ranges from how to use library resources, to interviews, to what to do on a hot day. there is no list of past podcasts or vodcasts, although the user may link to itunes and view a list of the most viewed podcasts and vodcasts. 
the user also has the ability to search for podcasts by topic or browse a subject listing. indiana university-bloomington libraries (iu) is another organization with a significant podcasting initiative. the university maintains links to podcasts from the home page of the libraries’ website. this link leads to a dedicated podcast page for the university-at-large. the user can narrow a search to podcast content or browse podcasts by topic. library produced podcasts are clearly labeled. the podcasts cover a wide range of subject matter, including research instruction, resource highlights, and lectures on a variety of topics, such as open access and digital scholarship. iu libraries also takes the podcast library tour to new levels by providing their broadcasts in twenty-four languages. the university of oklahoma libraries produces podcasts approximately once per week. the ou libraries’ podcast webpage provides direct links to the four most recent podcasts or vodcasts and a link to the full podcast archive. the podcasts cover a wide range of topics, including resource spotlights, interviews with resident and visiting faculty, and current and campus event promotion. ou libraries has also produced podcasts that promote resources outside the library. ou libraries recently produced a podcast highlighting services offered by the university of oklahoma speakers service, providing one example of how a library can leverage podcasting to establish new connections within the campus network. it stands to reason that those libraries with the most ambitious podcast initiatives would put equal efforts in their promotion, beginning with content visibility. a podcast that is difficult to discover will likely be underutilized and underappreciated, regardless of the quality of its content. therefore, libraries with podcasting ambitions would do well to reserve or create conspicuous locations for their podcast content. 
interestingly, there is not always a correlation between the frequency of the podcasts a library produces and the effort expended in publicizing these podcasts. a few podcasting arl member libraries, including asu, ou, and iu, provide links to podcast content on their website homepages. these libraries, however, were exceptions. most library websites, including several with impressive podcast content, place these resources several clicks away from the homepage; as a result, the podcasts are less accessible and can be difficult to discover. the researchers grew to appreciate the severity of this issue when podcasts from thirteen arl member libraries were all but undiscoverable through the native library websites and were only found by a site search or through google.

few podcasting arl member libraries have leveraged social media to market their podcasts, and as with podcast link placement, there is not always a correlation between the effort put into podcasting and the effort to promote the podcasts through social media. louisiana state university libraries, for example, was one of the few among the arl member libraries to promote its podcasts on both twitter and facebook, yet has produced relatively few podcasts. conversely, asu and ou libraries produce podcasts on a weekly schedule, but neither promotes those resources on its facebook page, and only asu has given podcasts brief mention on twitter. not all arl member libraries have created facebook or twitter profiles (figure ), but it is surprising that so few of those libraries that have made the move into the world of social media have leveraged those channels of communication to market their podcast content. while ou libraries maintains facebook and twitter accounts, neither was used to promote the libraries’ podcasts. as a result of the awareness generated through this research, ou libraries will begin to promote its podcasts on facebook.
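the “clicks from the homepage” measure used throughout this study can be read as the shortest path between two pages in a site’s link graph. the python sketch below illustrates that reading with a breadth-first search. it is not the procedure the researchers followed (they browsed the sites by hand); the function name `click_depth` and the small `site` link map are hypothetical.

```python
from collections import deque


def click_depth(links, home, target):
    """Return the minimum number of clicks from `home` to `target`.

    `links` maps each page to the pages it links to. Returns None when
    no browse-and-click pathway exists, i.e. the target could only be
    found by searching.
    """
    if home == target:
        return 0
    seen = {home}
    queue = deque([(home, 0)])
    while queue:
        page, depth = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt == target:
                return depth + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


# Hypothetical link map for a small library website.
site = {
    "home": ["about", "services"],
    "about": ["news"],
    "services": ["instruction", "podcasts"],
}

print(click_depth(site, "home", "podcasts"))
print(click_depth(site, "home", "orphan-podcast-page"))
```

a result of `None` corresponds to the sites described above whose podcasts had no browse-and-click pathway and were found only through a site search or google.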
conclusion

the purpose of this study was to examine, in broad terms, the podcasting activities of arl member libraries. future studies in this area may focus on specific types of podcasts, the factors that go into a library’s decision to initiate or conclude a podcasting program, or may revisit podcasting activities in an effort to project its ascent or decline in the arl community. competing (or complementary) broadcasting mediums, such as youtube, offer libraries alternative channels to communicate to their audiences. similar to podcasts, youtube provides its content authors the ability to create unique channels and viewers the ability to subscribe to those channels. it will be interesting to see if and to what extent youtube and other emerging channels of communication impact arl member library podcasting activities.

many arl libraries use podcasts for education in the library and to promote library events, resources, and services, but many of those same libraries do not make that content highly visible on their websites, and fewer still use the free tools already available to them to promote the podcasts. podcast production, from brainstorming and planning to production and dissemination, can be a resource intensive process, so it was surprising to the researchers to discover such a wealth of quality content buried deep within library websites and rarely promoted on library facebook and twitter pages. arl member libraries are clearly discovering podcasting to be an effective means to present a wide range of information to their audience, and those efforts deserve to be marketed appropriately.

references

arbitron/edison research. . radio’s digital platforms: am/fm, online, satellite, hd radio™ and podcasting. the infinite dial . retrieved march , from http://www.edisonresearch.com/infinite% dial% % presentation.pdf
balas, j. . blogging is so last year—now podcasting is hot. computers in libraries : - .
balleste, r., j. rosenberg, and l. smith-butler. .
podcasting, vodcasting, and the law: how to understand the latest “it” technology and use it in your library. aall spectrum : - .
devoe, k. . innovations affecting us—podcasting, coursecasting, and the library. against the grain : , , .
educause learning initiative. . things you should know about… podcasting. http://net.educause.edu/ir/library/pdf/eli .pdf.
gardner, s. and s. eng. . what students want: generation y and the changing function of the academic library. libraries and the academy : - .
griffey, j. . podcast - - . library journal : - .
harris, c. . blogs, podcasts, and the letter j. library media connection : - .
lee, d. . ipod, you-pod, we-pod: podcasting and marketing library services. library administration & management : - .
murley, d. . podcasts and podcasting for law librarians. law library journal : - .
podcast. . in e. mckean, new oxford american dictionary, nd ed: ????. new york, n.y.: oxford university press.
ragon, b. and r. looney. . podcasting at the university of virginia claude moore health sciences library. medical reference services quarterly : - .
ralph, j. and s. olsen. . podcasting as an educational building block in academic libraries. australian academic & research libraries : - .
real-world tech. . library journal : .

drug resistance markers within an evolving efficacy of anti-malarial drugs in cameroon: a systematic review and meta-analysis ( – )

niba et al. malar j ( ) : https://doi.org/ . /s - - -

research

peter thelma ngwa niba, akindeh m. nji, marie‑solange evehe, innocent m. ali, palmer masumbe netongo, randolph ngwafor, marcel n. moyeh, lesley ngum ngum, oliva ebie ndum, fon abongwa acho, cyrille mbanwi mbu’u, dorothy a.
fosah, barbara atogho‑tiedeu, olivia achonduh‑atijegbe, rosine djokam‑dadjeu, jean paul kengne chedjou, jude d. bigoga, carole else eboumbou moukoko, anthony ajua, eric achidi, esther tallah, rose g. f. leke, alexis tourgordi, pascal ringwald, michael alifrangis and wilfred f. mbacham*

abstract

background: malaria remains highly endemic in cameroon. the rapid emergence and spread of drug resistance was responsible for the change from monotherapies to artemisinin‑based combinations. this systematic review and meta‑analysis aimed to determine the prevalence and distribution of plasmodium falciparum drug resistance markers within an evolving efficacy of anti‑malarial drugs in cameroon from january to august .

methods: the prisma‑p and prisma statements were adopted in the inclusion of studies on single nucleotide polymorphisms (snps) of p. falciparum anti‑malarial drug resistance genes (pfcrt, pfmdr , pfdhfr, pfdhps, pfatp , pfcytb and pfk ). the heterogeneity of the included studies was evaluated using the cochran’s q and i² statistics. the random effects model was used as standard in the determination of heterogeneity between studies.

results: out of the records screened, studies were included in this aggregated meta‑analysis of molecular data. a total of , snps of the anti‑malarial drug resistance genes were genotyped from , samples which yielded a pooled prevalence of . % ( % ci . – . %). between and , there was a significant decline (p < . for all) in key mutants including pfcrt t ( . %‑ . %), pfmdr y ( . %‑ . %), pfdhfr i ( . %‑ . %), pfdhfr r ( . %‑ . %), pfdhfr n ( . %‑ . %). the exceptions were pfdhps g, which increased over time ( . %‑ . %, p < . ), and pfdhps e, which remained largely unchanged ( . %‑ . %, p = . ). exploring mutant haplotypes, the study observed a significant increase in the prevalence of pfcrt cviet mixed quintuple haplotype from . % in to . % in (p < . ).
in addition, within the same study period, there was no significant change in the triple pfdhfr irn mutant haplotype ( . % to . %, p = . ). the pfk amino acid polymorphisms associated with artemisinin resistance were not detected.

conclusions: this review reported an overall decline in the prevalence of p. falciparum gene mutations conferring resistance to ‑aminoquinolines and amino alcohols for a period over two decades. resistance to artemisinins measured by the presence of snps in the pfk gene does not seem to be a problem in cameroon.

systematic review registration: prospero crd

keywords: malaria, plasmodium falciparum, anti‑malarial drug, resistance, mutations, efficacy, systematic review, cameroon

open access. malaria journal.

*correspondence: wfmbacham@yahoo.com. marcad‑deltas programme, laboratory for public health research biotechnologies, university of yaoundé i, yaoundé, cameroon. full list of author information is available at the end of the article.

© the author(s) . this article is licensed under a creative commons attribution . international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article’s creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article’s creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creativecommons.org/licenses/by/ . /. the creative commons public domain dedication waiver (http://creativecommons.org/publicdomain/zero/ . /) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

background

globally, malaria accounted for million cases and , related deaths in [ ]. malaria remains highly endemic in cameroon despite the adoption, implementation and deployment of different control measures by the government and its partners [ ]. in cameroon, the rapid emergence and spread of anti-malarial drug resistance was responsible for the replacement of chloroquine (cq) as the first-line therapy for treatment of uncomplicated plasmodium falciparum malaria in and later on amodiaquine (aq) monotherapy/sulfadoxine-pyrimethamine between and [ ]. a major drug policy change occurred in following the adoption of artesunate-amodiaquine (asaq) and later included artemether–lumefantrine (al) in as first-line treatments of uncomplicated malaria in line with world health organization (who) recommendations [ , ]. the artemisinin-based combinations, asaq and al, are distributed in the proportions of % and %, respectively, to public, faith-based and private health facilities [ ]. in the northern regions of cameroon, malaria transmission is intense and seasonal when compared to the southern regions, which are characterized by perennial malaria transmission. in , the government of cameroon implemented seasonal malaria chemoprevention (smc) in the northern regions [ ]. this prevention strategy involves the yearly administration of four doses of sulfadoxine–pyrimethamine–amodiaquine (spaq) to vulnerable children within the age group – months [ ]. additionally, sp is still being used as intermittent preventive treatment in pregnancy (iptp) from the second to the third trimester. the women receive at least doses during pregnancy, with each dose (three tablets of mg sulfadoxine and mg pyrimethamine) given at least month apart [ , ]. both spaq and sp are also subsidized by the government of cameroon. the large-scale deployment of the smc and iptp strategies is a major contributory factor to drug pressure which drives the emergence of p. falciparum resistant parasites.

furthermore, the efficacy of anti-malarial drugs is linked to the presence or absence of parasites resistant to artemisinin-based combination therapy (act) and non-act in the population. thus, the regular monitoring of drug resistance markers through molecular surveillance or clinical trials can be used by malaria control programmes in endemic regions to secure the high efficacy of the different anti-malarial drugs. the use of advanced molecular biology techniques has greatly facilitated the identification of key amino acid changes in the genes of p. falciparum chloroquine resistant transporter-pfcrt (c s, v k, m i, n e, k t, a s, q e, n s, i t, r i) [ ], p. falciparum multi-drug resistant -pfmdr (n y, y f, s c, n d, d y, copy number variation) [ ], p. falciparum dihydrofolate reductase-pfdhfr (a v, c r, n i, c r, s n/t) [ – ] and p. falciparum dihydropteroate synthase-pfdhps (i v, s a/f, a g, k e/n, a g, a s/t) [ ] associated with resistance to different anti-malarial drugs. the presence of pfcrt k t is associated with increased risk of treatment failure after administration of chloroquine whereas pfmdr n y is associated with both chloroquine and amodiaquine resistance [ ]. the haplotypes of the pfcrt gene defined by the k t codon and adjacent amino acids (numbers – ) have been used in the typing of malaria parasites [ ]. among the over fifteen haplotypes identified, three predominate namely: cvmnk among cq-sensitive isolates from all geographic regions, cviet among cq-resistant isolates from southeast asia and africa, and svmnt among cq-resistant isolates from south america, africa and some countries of asia [ – ]. for sulfadoxine–pyrimethamine the pfdhfr single (s n), triple haplotype mutants (s n, n i, c r) and pfdhfr-pfdhps quintuple haplotype mutants (s n, n i, c r, a g, k e) have been shown to increase the risk of treatment failure [ ].

it has also been documented that increased pfmdr copy number is correlated with resistance to mefloquine [ ] and reduced sensitivity to lumefantrine [ – ]. a study on al and asaq showed opposing effects for pfcrt k t and pfmdr n y [ ]. this was further confirmed by another study on the selection of pfmdr nfd haplotype for al and pfmdr yyy haplotype for asaq from samples of efficacy studies conducted in africa that led to reduced sensitivities of the two drugs [ ]. in , single nucleotide polymorphisms in the pfk propeller domain of cambodian parasite isolates were reported to be associated with delayed parasite clearance of artemisinins [ ]. the epicentres driving the emergence and dispersal of artemisinin resistance have been identified in countries within the greater mekong subregion (gms) namely, cambodia, china (yunnan province), lao people’s democratic republic, myanmar, thailand and vietnam [ ]. presently, about non-synonymous mutations in the k gene have been identified and reported [ – ]. a total of pfk non-synonymous single nucleotide polymorphisms (f i, n y, n y, y h, r t, i t, p l, r h, c y) have been validated with f i, r t, i t, p l and c y being the most common and with the highest occurrences [ , , ]. there are candidate gene polymorphisms associated with delayed parasite clearance [ , , ]. a number of mutations have also been reported outside the k propeller region notably, k t and e q [ , – ].
in africa, the pfk mutation with the highest geographical distribution is a s [ , , ] and the presence of r h mutation has recently been reported in tanzania [ ] and rwanda [ ]. hence, there are fears that act resistance may spread to other regions including sub-saharan africa where malaria is still a major burden, similar to what happened in the past with chloroquine, amodiaquine, and sulfadoxine–pyrimethamine. the rationale for the use of act relies on the rapid reduction of the parasite biomass, reduction of transmission (reducing gametocytes), protection of partner drug against resistance, and rapid fever reduction [ ].

the effect of drug policy changes on the selection of p. falciparum anti-malarial drug resistant parasites in cameroon has not been completely understood. therefore, this systematic review and meta-analysis aimed to determine the prevalence and distribution of p. falciparum drug resistance markers within an evolving efficacy of anti-malarial drugs in cameroon from january to august .

methods

registration of the systematic review and protocol development

in december , a review protocol (#crd ) was developed and registered in the international prospective register of systematic reviews (prospero: http://www.crd.york.ac.uk/prospero). the protocol was submitted for publication to a peer-reviewed journal. the preferred reporting items for systematic reviews and meta-analyses protocol (prisma-p) [ , ] was used in the development of the protocol for this systematic review and meta-analysis.

search strategy

an electronic systematic strategy based on the combination of key words was used to search articles from medline via pubmed, google scholar, and science direct databases. both interventional and observational studies were retrieved for inclusion in the review.
the following mesh search terms were combined using the boolean operators “or” and “and”: “anti-malarial”, “drug resistance”, “pfcrt”, “pfmdr ”, “pfmdr copy number”, “pfdhfr”, “pfdhps”, “pfatp ”, “pfcytb”, “pfk ”, “mutations”, “gene polymorphisms”, “amino acid changes”, “plasmodium falciparum”, “efficacy”, “artesunate-amodiaquine”, “artemether–lumefantrine”, “sulfadoxine–pyrimethamine”, “cameroon”.

additional searches

the reference lists of published articles were searched for eligible studies. authors were contacted when access to full length articles was restricted. data was also obtained from the annual reports of the cameroon national malaria control programme (nmcp), ministry of public health. in addition to published studies, unpublished medical doctor (md), master of science (msc) and doctor of philosophy (phd) theses were sourced for inclusion in the study.

eligibility criteria

inclusion criteria

the systematic review and meta-analysis included the following types of studies: studies published from january to august ; studies on human participants of all ages; original articles of studies that investigated either asymptomatic, uncomplicated or severe p. falciparum; studies that included pcr genotyping of anti-malarial drug molecular resistance markers (pfcrt, pfmdr , pfmdr copy number, pfdhfr, pfdhps, pfcytb, pfatp , pfk ); studies written in english or french; studies done within cameroon: all multi-centric studies in which cameroon was one of the sites, and studies in which malaria was imported from cameroon into other countries.

exclusion criteria

the following types of studies were not included: abstracts; studies on in vitro, ex vivo and in vivo anti-malarial drug resistance without genotyping; genetic studies on pfcg gene; studies on genetic diversity and population structure of p. falciparum without drug resistance; studies on diagnostic accuracy of methods for detection of p.
falciparum and studies on infections with mixed plasmodium species.

review process

research articles identified from searches of the electronic databases were screened for eligibility based on their titles and abstracts. ineligible articles and duplicates were eventually removed. full-length articles of the selected studies were read to confirm that they fulfilled the inclusion criteria before data extraction began. two independent reviewers (peter thelma ngwa niba, ptnn, and lesley ngum ngum, lnn) screened the titles and abstracts to identify potentially eligible studies and extracted data from full-length articles that fulfilled the inclusion criteria. discrepancies were resolved by mutual consent after discussion and independent review from the third researcher (akindeh mbuh nji, amn). the whole process was supervised by wilfred fon mbacham (wfm) and michael alifrangis (ma).

data extraction procedure

microsoft excel (microsoft corporation, redmond, washington, united states of america) was used to design the data extraction sheet. the data extraction form consisted of study identification number, author(s), study site, sample size, age group (in months and years), study design (interventional and observational), genotyping method, sequence genotyping success rate, anti-malarial drug resistance gene, total number of samples genotyped, number of samples genotyped with mutations, and prevalence of molecular markers. the database in microsoft excel was piloted and validated before completion of the process (additional file ). mixed genotypes were considered as mutants during data collation on frequency of mutations derived from different studies. studies (observational or interventional) published multiple times on similar topics by the same authors were diligently screened to avoid duplication of data.
these studies were differentiated based on primary variables (anti-malarial drug resistance markers and frequency of single nucleotide polymorphisms) containing the datasets of interest. the prisma (preferred reporting items for systematic reviews and meta-analyses) checklist was used as a guide for this study [ ].

data items

the selection and inclusion of studies was done according to the picos format: population (p), individuals infected with p. falciparum parasites in cameroon; intervention (i), use of non-artemisinin and artemisinin agents in the treatment of malaria; comparator (c), none; outcome (o), pfcrt, pfmdr , pfdhfr, pfdhps, pfk gene polymorphisms circulating in malaria endemic areas of cameroon; study design (s), observational studies (cross-sectional, case reports, cohorts) and interventional studies such as randomized controlled trials reporting on the use of p. falciparum dna infected samples collected before anti-malarial treatment (d ) and during follow-ups of study participants.

data management

the zotero standalone software package version . . (corporation for digital scholarship, vienna, virginia, usa) was used to review, import full articles and delete duplicates.

methodological quality (risk of bias) assessment of individual studies included

the quality of randomized clinical trials was assessed by the revised cochrane risk of bias tool for randomized trials (rob . ) [ ]. the rob is structured into five bias domains, namely: bias arising from the randomization process, bias due to deviations from intended interventions, bias due to missing outcome data, bias in measurement of the outcome, and bias in selection of the reported result. the overall quality of the randomized clinical trial was judged as “low risk” of bias when all the key domains in the assessment of bias were found to be of low risk.
when one of the key domains in the bias assessment was found to have some concerns, a score of “some concerns” was given. a study with at least one key domain at high risk of bias was judged to be at “high risk” of bias (additional file ). the quality of cohort studies was assessed using the newcastle–ottawa scale (nos), which included eight items related to selection, comparison, and outcome. for each item a star is awarded, except for comparison, which can receive up to two stars. the studies with six stars (maximum of nine) were classified as good quality (additional file ) [ ]. finally, the quality of included cross-sectional studies and case reports was assessed by the joanna briggs institute (jbi) critical appraisal checklists for cross-sectional studies [ ] and case reports [ ], which consist of eight yes/no/unclear questions. the overall quality of cross-sectional studies and case reports was grouped into the following categories: low risk of bias (studies that met at least % of the quality criteria), moderate risk of bias (studies that met between and % of the quality criteria) and high risk of bias (studies that met less than % of the quality criteria) (additional files and ) [ ]. two reviewers (peter thelma ngwa niba-ptnn and cyrille mbanwi mbu’u-cmm) independently assessed the risk of bias of included studies. disagreements between the reviewers at the different stages of the review were resolved by discussion.

data analysis, heterogeneity assessment and data interpretation

quantitative syntheses (meta-analyses) were done using the “metafor” and “meta” packages in the r statistical software version . . (supported by the r foundation for statistical computing, vienna, austria). the conventional meta-analysis approach from pooled patient data was adopted for the synthesis. the heterogeneity of the included studies was evaluated using the cochran’s q and i² statistics.
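The heterogeneity statistics named here can be illustrated with a minimal Python sketch (the review itself used R packages). It computes Cochran's Q, I² and a DerSimonian-Laird random-effects pooled prevalence from per-study counts. The function name `pooled_prevalence`, the choice of the DerSimonian-Laird estimator, the use of raw (untransformed) proportions and the example counts are all assumptions made for illustration; none of them is taken from the review, and the R packages may apply a transformation such as the logit before pooling.

```python
import math


def pooled_prevalence(events, totals):
    """Pool raw proportions with a DerSimonian-Laird random-effects model.

    Hypothetical helper: assumes 0 < events[i] < totals[i] for every study,
    so that each within-study variance is strictly positive.
    """
    props = [e / n for e, n in zip(events, totals)]
    # Within-study variance of a raw proportion: p(1 - p) / n.
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]
    weights = [1 / v for v in variances]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    fixed = sum(w * p for w, p in zip(weights, props)) / sum(weights)
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, props))
    df = len(props) - 1

    # I² is the share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau².
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau² to every within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * p for w, p in zip(re_weights, props)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }


# Illustrative counts only (mutant samples, genotyped samples per study).
result = pooled_prevalence([30, 55, 80], [100, 110, 120])
print(result["q"], result["i2"], result["pooled"], result["ci95"])
```

With raw proportions the confidence limits can fall outside the interval from 0 to 1 when prevalence is near either extreme, which is one reason a transformed scale is often preferred in practice.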
the random effects model was used as standard in the determination of heterogeneity between studies [ ]. the i² values were expressed in percentages. heterogeneity was classified as low, moderate and high, with upper limits of %, % and % for i², respectively [ ]. data derived from articles published by the same author or authors in a particular year were merged before presentation on forest plots. forest plots were used to present the data on pooled prevalence of mutations in anti-malarial drug resistance genes. subgroup analyses were also done to show the aggregated prevalence of pfcrt k t, pfmdr n y, pfdhfr irn haplotype, pfdhfr-pfdhps irng haplotype and pfk gene mutations in cases where the number of studies was greater than or equal to . the evolution of resistance markers and haplotypes over time was summarized in frequency tables. the pre- and post-act intervention periods were considered to be – and – , respectively. the criterion for choosing these periods was based on , the year in which the first act was adopted for use in cameroon. in the analysis to compare snps between two or more study periods, mixed infections with both the wild type and the mutant were all considered mutants. haplotypes were defined as a combination of two or more wild type alleles, mutant alleles or mixed. these haplotypes included pfcrt cvmnk, pfcrt cviet, pfdhfr irn, pfdhfr-pfdhps irng, and pfdhfr-pfdhps irnge. the pearson chi-square test in the ibm statistical package for the social sciences (ibm spss) version . (ibm corporation, armonk, new york, usa) was used to establish the evolution of drug resistance markers over time. the shapiro–wilk test was used to check for normal distribution of quantitative variable data. furthermore, the relationships between the efficacy of act medicines (asaq and al) and anti-malarial drug resistance markers (pfcrt t and pfmdr y) were represented on plots.
the pearson correlation coefficient (r) was used to assess the strength and direction of the association between the efficacy of act medicines (al and asaq) and the prevalence of pfcrt t and pfmdr y mutants over time. in addition, a trend analysis of the relationship between proportions of anti-malarial drugs (asaq, al and sp) deployed in cameroon and prevalence of drug resistance markers (pfcrt t, pfmdr y and pfdhfr irn) from to was also carried out using r. the standard range for r values is between - and + . the level of significance was set at p < . at % confidence interval and two-tailed.

assessment of publication bias across studies

the risk of publication bias in the included articles was assessed using the asymmetry of the funnel plot and egger’s regression test with p < . . the funnel plot contained the standard error on the y-axis and proportion on the x-axis (additional file ).

results

study identification, screening and selection process

the electronic searches identified a total of published articles on anti-malarial drug resistance markers in cameroon. there were three additional unpublished citations from theses of students and one article was derived through contact with a senior researcher on malaria. a total of studies were identified, after which duplicates were removed. a total of studies were screened to remove abstracts and non-malaria studies, with studies retained after the process. the studies were checked for eligibility, with studies included for both qualitative and quantitative analyses (fig. ).

characteristics of studies included in the review

participants of all age groups ranging from months to years and of both genders were included in the study. out of studies included, studies [ , , , – ] were obtained from published articles and from unpublished data. a total of ( . %) were carried out only in cameroon, ( . %) were studies of imported malaria cases from african countries including cameroon, and ( .
%) were multi-centric studies including cameroon. the studies were performed in all the geo-ecological zones constituted from the regions of cameroon, that is, sudano-sahelian, tropical, and equatorial. the majority of these studies (n = ( . %)) were conducted in yaoundé. most of the studies (n = ( . %)) were observational studies while the remaining studies were randomized controlled clinical trials. the main methods used for genotyping were nested polymerase chain reaction-restriction fragment length polymorphism (npcr-rflp), dna sequencing by sanger (sequencing by dideoxy-chain termination method) and quantitative real time polymerase chain reaction. other methods included sequence specific oligonucleotide probe, polymerase chain reaction, enzyme linked immunosorbent assay (ssop pcr elisa), dot blot, and dna sequencing using illumina hiseq platform. a total of seven p. falciparum drug resistance gene datasets were created for quantitative syntheses with the following distribution of studies: pfcrt (n = ), pfmdr (n = ), pfdhfr (n = ), pfdhps (n = ), pfcytb (n = ), pfatp (n = ) and pf (n = ) (additional file ).

heterogeneity of included studies

the assessment of heterogeneity was done for all the groups containing different studies on p. falciparum single nucleotide polymorphisms that confer resistance to anti-malarial drugs.
there was high heterogeneity across all the groups: pooled prevalence of all drug resistance markers (q(df = ) = , . , i² = %, p < . ), pfcrt k t (q(df = ) = . , i² = %, p < . ), pfcrt cviet (q(df = ) = . , i² = %, p < . ), pfmdr n y (q(df = ) = . , i² = %, p < . ), pfdhfr irn (q(df = ) = . , i² = %, p < . ), pfdhfr-pfdhps irng (q(df = ) = . , i² = %, p < . ), and pfk (q(df = ) = . , i² = %, p < . ).

fig. flow chart for studies included in the systematic review and meta-analysis on anti-malarial drug resistance markers in cameroon from – . stages shown: identification (records identified through electronic database searching of medline via pubmed, google scholar and science direct; additional records identified through author contact and theses of students), screening (records after duplicates removed; records screened; records excluded as abstracts or non-malaria studies), eligibility (full-text articles assessed; full-text articles excluded with reasons: in vitro anti-malarial test, in vivo anti-malarial test without resistance markers, genetic diversity of pf, epidemiology of malaria, drug resistance markers out of cameroon, non-validated markers, reviews/opinions, modelling of malaria, diagnosis of malaria, repeated same samples, retracted article) and included (studies included in qualitative synthesis; studies included in quantitative synthesis (meta-analysis)).

pooled prevalence of p. falciparum anti-malarial drug resistance mutations

there were , snps of anti-malarial drug resistance markers genotyped from , samples, which yielded a pooled prevalence of . % ( % ci . – . %). the dna sequence genotyping success rate varied from . % to . % while the prevalence of mutations ranged from . to . % (additional file and fig. ). the key amino acid substitutions represented in the analyses were: pfcrt (c s, v k, m i, n e, k t, a s, q e, n s, i t, r i), pfmdr (n y, y f, s c, n d, d y, copy number variation), pfdhfr (a v, c r, n i, c r, s n/t) and pfdhps (i v, s a/f, a g, k e, a g, a s/t). only two studies recorded the presence of pfdhps i v, with prevalence rates of . % and . % [ , ]. one study reported the presence of the pfdhps k n mutation, not previously documented in cameroon, with a prevalence of . % [ ]. for pfk , the amino acid polymorphisms associated with artemisinin resistance in southeast asia were not detected in any of the p. falciparum samples genotyped, and most of the pfk gene polymorphisms reported here have not been observed anywhere else in the world. the most prevalent non-validated pfk missense polymorphism was k t, reported in studies with prevalence rates of . % and . % [ , ] (additional file ). subgroup analyses revealed that the aggregated prevalences of pfcrt k t, pfcrt cviet, pfmdr n y, pfdhfr irn, pfdhfr-pfdhps irng and pfk gene mutations were . % ( / ) [ % ci . – . %], . % ( / ) [ % ci . – . %], . % ( / ) [ % ci . – . %], . % ( / ) [ % ci . – . %], . % ( / ) [ % ci . – . %] and . % ( / ) [ % ci . – . %], respectively (fig. a–f).

fig. pooled prevalence of plasmodium falciparum anti-malarial drug resistance mutations from –

temporal changes of the key gene polymorphisms conferring resistance to anti-malarial drugs prior to and after adoption of artemisinin-based combination therapies (acts) in cameroon

the pre-act and post-act interventions were considered as periods before and after , respectively. there was a significant decline in pfcrt t mutant alleles from . % in – to . % and . % in – and ≥ , respectively, with a slight increase of . % recorded between and (p < . ). similarly, during the same study periods, the prevalence of the pfmdr y mutant allele decreased significantly from . % to . % (p < . ), with the exception observed between and when . % was reported (table ). the only pfcrt haplotypes reported in cameroon were: cvmnk in studies, cviet in studies and svmnt in one study.
the highest frequencies recorded were: cvmnk- . % [ ], cviet- . % [ ], and svmnt- . % [ ] (additional file ). there was an increase in the prevalence of the pfcrt cvmnk wild type haplotype from . % in – to . % in – . however, a decrease of . % was observed between and . generally, there was a significant increase in the prevalence rate of the cvmnk haplotype from . % in to . % in (p < . ). the cviet mutant haplotype declined from . % in – to . % in – . the prevalence rate increased to . % and . % respectively in – and – . similarly, there was a significant increase in the prevalence rate of pfcrt cviet haplotype between and (p < . ) (table ). within the pfmdr gene only the triple nfd haplotype was reported in studies conducted in mutengene [ , ] with prevalence rates of . % and . %. the yfy and yyy triple haplotypes were not reported in any of the studies included (additional file ). between the two time points – and – , there was a significant drop (p < . ) in the pfdhfr ( i . – . %, r . – . %, n . – . %) mutant alleles, whereas the pfdhps ( g . – . %, p < . ; e . – . %, p = . ) mutant alleles increased over the two time points (table ). an evaluation of gene polymorphisms of pfdhfr revealed that the triple irn mutant haplotype was the most reported, in studies, with a minimum prevalence of . % [ ] and a maximum prevalence of . % [ , ]. only studies reported the quadruple irng mutant haplotype involving pfdhfr and pfdhps, with the highest prevalence of . % [ ]. moreover, the pfdhfr/pfdhps quintuple haplotype irnge was identified in studies [ , , ] with a maximum prevalence of . % [ ] (additional file ). the pfdhfr irn and pfdhfr/pfdhps irnge haplotypes remained largely unchanged, from . % to . % (p = . ) and . % to . % (p = . ), respectively, between and . conversely, a significant decrease from . % to . % was reported for pfdhfr-pfdhps irng over the same period (p < . ) (table ).
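The period-to-period comparisons reported in this subsection rest on the Pearson chi-square test. A minimal Python sketch of that test for one marker across two periods follows, assuming a 2 x 2 table of mutant versus wild-type counts and no continuity correction. The function name `chi_square_2x2` and the counts in the usage line are hypothetical; they do not come from the tables of this review, and the statistical package used by the authors may report additional corrected statistics.

```python
import math


def chi_square_2x2(mutant_a, total_a, mutant_b, total_b):
    """Pearson chi-square test (1 degree of freedom) for two periods.

    Hypothetical helper: period A has mutant_a mutants among total_a
    genotyped samples, period B has mutant_b among total_b.
    Returns (chi-square statistic, two-sided p-value).
    """
    wild_a = total_a - mutant_a
    wild_b = total_b - mutant_b
    n = total_a + total_b
    all_mutant = mutant_a + mutant_b
    all_wild = wild_a + wild_b
    if min(total_a, total_b, all_mutant, all_wild) <= 0:
        raise ValueError("every row and column total must be positive")

    # Shortcut formula for a 2 x 2 table: n(ad - bc)^2 / product of margins.
    cross = mutant_a * wild_b - mutant_b * wild_a
    chi2 = n * cross ** 2 / (total_a * total_b * all_mutant * all_wild)

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative counts only: 60/100 mutants before versus 40/100 after.
chi2, p = chi_square_2x2(60, 100, 40, 100)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

The closed-form p-value holds only for one degree of freedom; comparing more than two periods at once needs the general chi-square distribution from a statistics library.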
distribution of antifolate haplotypes across geo-ecological zones of cameroon

the forest and sudano-sahelian zones constitute the major geo-ecological zones of cameroon. the characteristics of these geo-ecological zones are reflected in the towns of yaoundé, mutengene and garoua. different studies were conducted in yaoundé, mutengene and garoua from – in order to understand the evolutionary origins of the antifolate haplotypes [ , ]. the prevalence of the pfdhfr cirn mixed haplotype in the different towns was distributed as follows: yaoundé- . %, mutengene- . % and garoua- . %. the pfdhps sgk mixed haplotype was also common in yaoundé- . % and mutengene- . %, with the least occurrence reported in garoua- . %. the sgk was associated with sp resistance at these three sites. the wild-type alleles (sak and aak) were mostly noticeable in garoua, yaoundé and mutengene, respectively. the cirn haplotype was highly prevalent in the southern part when compared to the northern part of cameroon. generally, opposing trends were observed in the haplotypes sgk, agk, aak, sak, cirn, cncs, cicn, and cnrn in malaria parasites isolated from garoua, yaoundé and mutengene. data are not available on these unique mixed haplotypes from other regions in cameroon (fig. ).

fig. a subgroup analysis for pooled prevalence of pfcrt k t mutation from to . b subgroup analysis for pooled prevalence of pfcrt cviet haplotype mutations from – . c subgroup analysis for pooled prevalence of pfmdr n y haplotype mutations from to . d subgroup analysis for pooled prevalence of pfdhfr irn haplotype mutations from to . e subgroup analysis for pooled prevalence of pfdhfr-pfdhps irng haplotype mutations from to . f subgroup analysis for pooled prevalence of pfk mutations from to
efficacy of acts (al and asaq) and prevalence of pfcrt t and pfmdr y mutants over time

a total of ( unpublished and published) studies were used to derive the data on the efficacy (pcr-corrected cure rates) of al and asaq [ – ]. the efficacy rates for al and asaq were above . % and remained relatively constant from – . on the contrary, there was a general decline in the pfcrt t and pfmdr y mutant alleles between and , which was more pronounced between and for pfmdr and between and for pfcrt t. the efficacy of al showed a positive but non-significant correlation with the pfcrt t mutant allele (r = . , p = . ) and a negative, non-significant relationship with the pfmdr y mutant allele (r = − . , p = . ). however, there was a significant negative correlation between the efficacy of asaq and the prevalence rates of the pfcrt t mutant allele (r = − . , p < . ) and the pfmdr y mutant allele (r = − . , p = . ). generally, the prevalence of pfcrt t and pfmdr y mutant alleles was below the efficacy rates of asaq and al (fig. ). this showed that an increase in mutant alleles corresponded with a decrease in efficacy of acts and vice versa.

table changes in the frequency of pfcrt and pfmdr genotypes between and
*p < . : statistically significant, n: number of amino acid substitutions, N: total number of samples genotyped
gene | mutation | allele | – (%, n/N) | – (%, n/N) | – (%, n/N) | ≥ (%, n/N) | p-value
pfcrt | k t | k | . ( / ) | . ( / ) | . ( / ) | . ( / ) | p < . *
 | | t | . ( / ) | . ( / ) | . ( / ) | . ( / ) |
pfmdr | n y | n | . ( / ) | . ( / ) | . ( / ) | . ( / ) | p < . *
 | | y | . ( / ) | . ( / ) | . ( / ) | . ( / ) |

table changes in the frequency of pfcrt haplotypes between and
*p < . : statistically significant, n: number of amino acid substitutions, N: total number of samples genotyped
gene | haplotype | – (%, n/N) | – (%, n/N) | – (%, n/N) | ≥ (%, n/N) | p-value
pfcrt | cvmnk | . ( / ) | . ( / ) | . ( / ) | . ( / ) | p < .
pfcrt | cviet | . ( / ) | . ( / ) | . ( / ) | . ( / ) | p < .
table changes in the frequency of pfdhfr and pfdhps genotypes between and
*p < . : statistically significant, n: number of amino acid substitutions, N: total number of samples genotyped
gene | mutation | allele | – (%, n/N) | – (%, n/N) | p-value
pfdhfr | n i | n | . ( / ) | . ( / ) | p < . *
 | | i | . ( / ) | . ( / ) |
 | c r | c | . ( / ) | . ( / ) | p < . *
 | | r | . ( / ) | . ( / ) |
 | s n | s | . ( / ) | . ( / ) | p < . *
 | | n | . ( / ) | . ( / ) |
pfdhps | a g | a | . ( / ) | . ( / ) | p < . *
 | | g | . ( / ) | . ( / ) |
 | k e | k | . ( / ) | . ( / ) | p = .
 | | e | . ( / ) | . ( / ) |

table changes in the frequency of pfdhfr and pfdhps haplotypes between and
*p < . : statistically significant, n: number of amino acid substitutions, N: total number of samples genotyped
gene | haplotype | – (%, n/N) | – (%, n/N) | p-value
pfdhfr | irn | . ( , / , ) | . ( , / , ) | p = .
pfdhfr-pfdhps | irng | . ( / ) | . ( / ) | p < .
pfdhfr-pfdhps | irnge | . ( / ) | . ( / , ) | p = .

fig. pfdhfr and pfdhps haplotype distribution in three major towns of cameroon.

fig. efficacy of al/asaq and prevalence of pfcrt t and pfmdr y mutant alleles from to (y-axis: efficacy (%); x-axis: year of publication; series: al, asaq, pfcrt t, pfmdr y). al, artemether–lumefantrine; asaq, artesunate-amodiaquine; pfcrt, plasmodium falciparum chloroquine resistance transporter gene; pfmdr , plasmodium falciparum multidrug resistance gene

trend analysis of proportions of anti-malarial drugs (asaq, al and sp) deployed in cameroon and prevalence of drug resistance markers from to

the proportion of asaq, al and sp was based on the observed frequency of each drug deployed to the different health facilities in cameroon by the nmcp through cename and the regional funds for health promotion. the data were derived from annual reports published by the nmcp. between and , there was an increase in the proportion of asaq ( . %- . %).
peak distributions for asaq were observed in ( . %) and ( . %) with corresponding prevalence of pfcrt t ( . %, . %) and pfmdr y ( . %, . %) mutants. the proportions of al distributed from to were low when compared with asaq. the maximum proportion of al deployed was . % in and this corresponded with a prevalence of . % and . % for pfcrt t and pfmdr y mutants, respectively. the pearson correlation coefficients revealed negative relationships between the acts and anti-malarial drug resistance markers (asaq versus pfcrt t, r = − . , p = . ; asaq versus pfmdr y, r = − . , p = . ; al versus pfcrt t, r = − . , p = . ; al versus pfmdr y, r = − . , p = . ) (fig. a). the proportion of sp deployed to the different regions of cameroon dropped from . % in to . % in , which corresponded with prevalences of . % and . % of the pfdhfr irn triple mutant haplotype. there was a negative correlation between the proportion of sp deployed and the prevalence of the pfdhfr irn triple mutant haplotype (r = − . , p = . ).

fig. a proportion of asaq and al deployed in cameroon versus prevalence of pfcrt t and pfmdr y mutants from to (y-axis: proportion (%); x-axis: year of publication). asaq, artesunate-amodiaquine; al, artemether–lumefantrine; pfcrt, plasmodium falciparum chloroquine resistance transporter gene; pfmdr , plasmodium falciparum multidrug resistance gene. b proportion of sp deployed in cameroon versus prevalence of pfdhfr irn mutant haplotype from to (y-axis: proportion (%); x-axis: year of publication). sp, sulfadoxine–pyrimethamine; pfcrt, plasmodium falciparum chloroquine resistance transporter gene; pfmdr , plasmodium falciparum multidrug resistance gene

discussion

this systematic review and meta-analysis showed the frequency and geographic distribution of anti-malarial drug resistance markers over a period of three decades in cameroon.
the present study showed that the pooled prevalence of all the amino acid changes from to was . %. subgroup analyses revealed that the aggregated prevalence of pfcrt k t, pfmdr n y, pfdhfr irn, and pfdhfr-pfdhps irng was above . %, with the exception of pfk . these analyses highlight the dominance of pfcrt k t, pfmdr n y, pfdhfr n i, pfdhfr c r, pfdhfr s n, pfdhps a g and pfk k t mutations. the rates are high and further confirm that resistant parasites are still circulating in towns such as yaoundé, garoua, mutengene, and buea. this is not surprising considering that some of these towns (yaoundé, mutengene and buea) are located within the high malaria transmission stratum and are urban settings with high variability and intensity in the use of anti-malarial drugs with insufficient regulation. it is also around these areas that the first cases of resistance to chloroquine were reported in the s and early , which eventually spread to other regions [ – ]. the dispersal of drug resistance markers could be due to human and vector population migration within the same region or between different regions. the presence of drug resistance markers has been regularly reported in the southern regions of cameroon, where malaria transmission is perennial, compared to the northern regions characterized by intense seasonal transmission. previous studies have demonstrated the association of pfcrt t and pfmdr y mutant alleles with chloroquine and amodiaquine resistance in vivo among uncomplicated falciparum malaria patients in different transmission settings [ , ]. these two drugs, chloroquine and amodiaquine, were banned and withdrawn from the market in and , respectively, for the treatment of uncomplicated falciparum malaria in cameroon. however, amodiaquine (aq) continues to be used as a partner drug in the artesunate-amodiaquine (asaq) and sulfadoxine–pyrimethamine–amodiaquine (spaq) combinations.
in , asaq combination replaced aq and sp for the treatment of uncomplicated falciparum malaria in the southern regions while spaq was introduced in as chemoprophylaxis in the context of seasonal malaria chemoprevention among children – months in the north and far north regions of cameroon. the most common quintuple haplotypes identified in the pfcrt gene were cvmnk and cviet. this agrees with previously published studies in other regions [ , ]. it is important to note that one study reported the presence of the pfcrt svmnt haplotype with a prevalence of . % [ ], which is lower than the . % [ ] and . % [ ] reported in the korogwe district, tanzania and luanda, angola, respectively. only two studies reported the triple pfmdr nfd haplotype [ , ] while the triple pfmdr yyy haplotype was not documented. a number of studies carried out in malaria endemic areas have demonstrated an opposing effect in the selection of yyy for asaq and nfd for al [ , ]. this is advantageous to cameroon since asaq and al are used as multiple first-line treatments (mfts) that can possibly slow down the emergence of drug resistance [ ]. trend analysis showed that pfcrt t, pfcrt cvmnk quintuple wild type haplotype, and pfmdr y mutant parasites declined from – . this is in agreement with previous studies carried out in other malaria endemic zones confirming the re-emergence of chloroquine sensitive parasites [ , , ]. however, there should be caution in the future use of chloroquine in the treatment of uncomplicated p. falciparum malaria because this may lead to reintroduction of resistant parasite populations. in cameroon, sulfadoxine–pyrimethamine (sp) is still being deployed as intermittent preventive treatment for malaria in pregnancy (iptp) with estimated coverage of about % in [ ]. the antifolates are also used in combination with amodiaquine for seasonal malaria chemoprevention.
the presence of mutations in pfdhfr and pfdhps genes conferring resistance to sp does not seem to threaten the continued use of this drug in the future, especially as there is a need to scale up deployment to pregnant women and young children as intermittent preventive treatment (iptp and ipti). this may also be applicable for children receiving spaq in the context of seasonal malaria chemoprevention in the sahel regions of northern cameroon. the triple pfdhfr irn and quadruple pfdhfr/pfdhps irng mutant haplotypes were the most prevalent while the quintuple pfdhfr/pfdhps irnge mutant haplotype was the least prevalent. there has been a gradual decline over the years in the prevalence of single antifolate gene polymorphisms associated with sp resistance in cameroon, with the exception of pfdhps a g and k e. however, the rates of prevalence recorded are less than the % benchmark recommended by the who to ban the continued use of sp. these findings corroborate the high prevalence of pfdhfr irn and pfdhfr/pfdhps irng recorded in bata district and bioko island, equatorial guinea [ , ]. there has been a gradual increase in the prevalence of the quintuple pfdhfr/pfdhps irnge mutant haplotype over the years, confirming the emergence of the haplotype in central africa [ ]. other underreported pfdhps haplotypes included sgk, agk, sge, aak, and sak. these haplotypes were extensively studied in isolates from different african countries including cameroon by pearce and colleagues, who sought to investigate the evolutionary origin of the mutations flanking the pfdhps gene [ ]. the authors observed that the haplotypes in the cameroonian samples were unique when compared to those from central, south-eastern and west african sites [ ]. malaria parasite resistance to sp seems to be moving in opposite directions, with high resistance recorded in the southern regions when compared to the northern regions.
the location of these sites in different malaria transmission settings may account for the variations observed. a new mutation, i v, recently identified in the pfdhps gene has been reported in yaoundé [ ] and mutengene [ ] with prevalence rates of . % and . %, respectively. these rates are lower than that reported in enugu, nigeria ( . %) in [ ], suggesting the possibility of different mutant haplotypes associated with sp treatment failure in central/west africa. this is unlike previous observations in east africa, where the quintuple pfdhfr/pfdhps irnge mutant haplotype is strongly associated with sp resistance [ ]. the key gene polymorphisms located in the pfk propeller region (f i, r t, i t, p l and c y), previously documented in the greater mekong sub-region and associated with delayed parasite clearance following drug administration, were absent. moreover, a negative or positive relationship was observed between the rate of efficacy of asaq/al and the prevalence of key mutants (pfcrt k t and pfmdr n y) that select for the partner drugs in act. these observations confirm the findings that al and asaq exert opposing selective effects on single-nucleotide polymorphisms in pfcrt and pfmdr [ ]. however, asaq and al are still efficacious, with rates of efficacy above the who minimum cut-off of %. it has been shown that some individuals infected with drug resistant parasites are still able to clear the parasites when administered non-act and act [ , ]. this may be due to the immune competence of such individuals. semi-immune individuals have an enhanced ability to clear parasites faster than non-immune people. in addition, age has also been identified as a contributory factor, with children less than years clearing parasites more slowly than children older than five years [ ]. even though immunity due to malaria infection is short-lived, certain cytokines and their receptors have been shown to be highly implicated in this process [ , ].
furthermore, there was a negative correlation between the proportions of anti-malarial drugs (asaq, al and sp) deployed to the different public and private health establishments in cameroon and anti-malarial drug resistance markers (pfcrt t, pfmdr y and pfdhfr irn). the proportion of drugs deployed may be used as a proxy for drug uptake. the decline in the proportion of drugs deployed may be contributing to less drug pressure on circulating parasites. an increase in parasite fitness as a result of less drug pressure could be responsible for the decline in the prevalence of certain gene mutations associated with anti-malarial drug resistance. the two drugs, asaq and sp, are still being subsidized by the cameroon government. asaq is highly recommended for the treatment of uncomplicated falciparum malaria in children less than years while sp is used as a preventive treatment for malaria in pregnancy. the major challenge in the fight against drug resistance in cameroon is the inability to effectively implement the legislation on the homologation and importation of unauthorized anti-malarial therapies and insufficient pharmacovigilance. in addition, there are still issues with substandard drugs and auto-medication.

strengths and limitations of the study

the major strength of the present review is that it has presented a picture of the prevalence and distribution of key anti-malarial drug resistance markers in cameroon, with a total of studies included. the data derived from this study showed that the pfmdr and pfk polymorphisms that select for act, especially asaq and al, are rare or absent. these drugs are used concurrently for the management of uncomplicated plasmodium falciparum malaria in cameroon.

however, despite the strengths of the study, it is not without limitations.
firstly, some studies enrolled few participants, which may not give a true representation of the resistant parasite population circulating in cameroon. secondly, the high heterogeneity across studies may affect the interpretation of the findings. thirdly, some anti-malarial drug resistance markers have been understudied in the northern regions of the country, which border countries such as nigeria with a high burden of malaria. furthermore, most of the studies were conducted in symptomatic individuals, and there is little or no information on the prevalence of anti-malarial drug resistance markers in asymptomatic carriers of the parasite, who have been shown to be reservoirs for malaria parasite transmission. in addition, earlier studies mostly used npcr-rflp for the detection of drug resistance markers and, therefore, were not capable of identifying novel snps. finally, the association between specific p. falciparum gene polymorphisms and treatment failures with act could not be investigated because data were not available.

conclusions
this review reported a decline in the prevalence of single plasmodium falciparum gene mutations (pfcrt k t, pfmdr n y, pfdhfr n i, pfdhfr c r, pfdhfr s n) conferring resistance to -aminoquinolines, amino alcohols and pyrimethamine over more than two decades, before and after the adoption of act in cameroon. the pfcrt k t and pfmdr n y mutations still persist at moderate frequencies despite the withdrawal of chloroquine. conversely, parasite resistance markers (pfdhps a g and pfdhps k e) linked to the sulpha drugs increased during the same study period. resistance to artemisinins, measured by the presence of snps in the pfk gene, does not seem to be a major problem in cameroon. nevertheless, these findings are a wake-up call for policy makers to design and implement strategies for the regular monitoring of delayed parasite clearance after administration of artemisinin-based combination therapy.
this will permit the early identification of factors driving the emergence and spread of anti-malarial drug resistance in cameroon.

supplementary information
the online version contains supplementary material available at https://doi.org/ . /s ‑ ‑ ‑ .
additional file . summary of the studies included in the systematic review and meta‑analysis on anti‑malarial drug resistance markers in cameroon.
additional file . methodological quality assessment of interventional studies.
additional file . methodological quality assessment of cohort studies.
additional file . methodological quality assessment of cross‑sectional studies.
additional file . methodological quality assessment of case reports.
additional file . assessment of publication bias using funnel plot and egger’s regression test.
additional file . haplotype analyses of anti‑malarial drug resistance mutant allele frequencies reported in cameroon.

abbreviations
act: artemisinin‑based combination therapy; al: artemether–lumefantrine; asaq: artesunate‑amodiaquine; nmcp: national malaria control programme; pcr: polymerase chain reaction; pfcrt: plasmodium falciparum chloroquine resistance transporter; pfmdr : plasmodium falciparum multi‑drug resistance ; pfdhfr: plasmodium falciparum dihydrofolate reductase; pfdhps: plasmodium falciparum dihydropteroate synthase; pfcytb: plasmodium falciparum cytochrome b; pfatp : plasmodium falciparum atpase ; pfk : plasmodium falciparum kelch ; prisma: preferred reporting items for systematic reviews and meta‑analyses; prisma‑p: preferred reporting items for systematic reviews and meta‑analyses protocol; r: correlation coefficient; sp: sulfadoxine–pyrimethamine; spaq: sulfadoxine–pyrimethamine–amodiaquine; who: world health organization.

acknowledgements
not applicable.

disclaimer
the views expressed in this publication are those of the author(s) and not necessarily those of aas, nepad agency, wellcome trust or the uk government or the who (geneva).
authors’ contributions
wfm conceived the research and coordinated the study. ptnn and lnn piloted the data extraction phase. amn and ptnn performed the data analysis, drafted the manuscript, critically reviewed the manuscript, and wrote the final manuscript. the authors wfm, ma, mse, ima, pmn, rn, mnm, lnn, oen, faa, cmm, daf, bat, oaa, rdd, jpkc, jdb, ceem, aa, ea, et, rgfl, at and pr proofread the manuscript. all authors read and approved the final manuscript.

funding
wfm, amn, ima and ptnn are supported by the malaria research capacity development in west and central africa (marcad) consortium through the developing excellence in leadership, training and science (deltas) africa initiative [grant # del‑ ‑ ] to the university of yaounde i. the deltas africa initiative is an independent funding scheme of the african academy of sciences (aas)’s alliance for accelerating excellence in science in africa (aesa) and supported by the new partnership for africa’s development planning and coordinating agency (nepad agency) with funding from the wellcome trust [grant # /a/ /z] and the united kingdom (uk) government.

ethics approval and consent to participate
not applicable since it is a systematic review and meta‑analysis.

consent for publication
not applicable.

competing interests
the authors declare that they have no competing interests.

author details
marcad‑deltas programme, laboratory for public health research biotechnologies, university of yaoundé i, yaoundé, cameroon. the biotechnology centre, university of yaoundé i, yaoundé, cameroon. department of biochemistry, faculty of science, university of yaoundé i, yaoundé, cameroon. department of biochemistry, faculty of science, university of dschang, dschang, cameroon. national malaria control programme, ministry of public health, yaoundé, cameroon.
department of biochemistry and molecular biology, faculty of science, university of buea, buea, cameroon. department of biochemistry, faculty of medicine and biomedical sciences, university of yaoundé i, yaoundé, cameroon. institute of medical research and medicinal plant studies, ministry of scientific research and innovation, yaoundé, cameroon. université des montagnes, banganté, west region, cameroon. department of microbiology, faculty of science, university of yaoundé i, yaoundé, cameroon. faculty of medicine and pharmaceutical sciences, university of douala, douala, cameroon. malaria research service, centre pasteur cameroon, yaoundé, cameroon. malaria consortium‑ cameroon coalition against malaria, yaoundé, cameroon. the cameroon office of the world health organization, yaoundé, cameroon. global malaria programme, world health organization, geneva, switzerland. centre for medical parasitology, department of immunology and microbiology, faculty of health and medical sciences, university of copenhagen, copenha‑ gen, denmark. department of infectious diseases, copenhagen university hospital, copenhagen, denmark. received: september accepted: december references . who. world malaria report . geneva, world health organization. https ://www.who.int/publi catio ns‑detai l/world ‑malar ia‑repor t‑ . accessed on th december, . . cameroon national malaria control programme (nmcp). annual report of activities , yaoundé, . . who. world malaria report . geneva, world health organization, . http://www.who.int/malar ia/publi catio ns/atoz/ / en/. accessed on th august, . . sayang c, gausseres m, vernazza‑licht n, malvy d, bley d, millet p. treat‑ ment of malaria from monotherapy to artemisinin‑based combination therapy by health professionals in urban health facilities in yaoundé, central province cameroon. malar j. ; : . . cameroon national malaria control programme (nmcp). annual report of activities . yaoundé, . . who. guidelines for the treatment of malaria. rd edn. 
geneva, world health organization, . https ://www.who.int/malar ia/publi catio ns/ atoz/ /en/. accessed on th july, . . kayentao k, garner p, van eijk ma, naidoo i, roper c, mulokozi a, et al. intermittent preventive therapy for malaria during pregnancy using vs or more doses of sulfadoxine–pyrimethamine and risk of low birth weight in africa: systematic review and meta‑analysis. jama. ; : – . . djimdé a, doumbo ok, cortese jf, kayentao k, doumbo s, diourté y, et al. a molecular marker for chloroquine‑resistant falciparum malaria. n engl j med. ; : – . . cowman af, morry mj, biggs ba, cross ga, foote sj. amino acid changes linked to pyrimethamine resistance in the dihydrofolate reductase‑thymidylate synthase gene of plasmodium falciparum. proc natl acad sci usa. ; : – . . foote sj, galatis d, cowman af. amino acids in the dihydrofolate reductase‑thymidylate synthase gene of plasmodium falciparum involved in cycloguanil resistance differ from those involved in pyrimethamine resistance. proc natl acad sci usa. ; : – . . peterson ds, walliker d, wellems te. evidence that a point mutation in dihydrofolate reductase‑thymidylate synthase confers resistance to pyrimethamine in falciparum malaria. proc natl acad sci usa. ; : . . triglia t, menting jgt, wilson c, cowman af. mutations in dihy‑ dropteroate synthase are responsible for sulfone and sulfonamide resistance in plasmodium falciparum. proc natl acad sci usa. ; : – . . picot s, olliaro p, de monbrison f, bienvenu a‑l, price rn, ringwald p. a systematic review and meta‑analysis of evidence for correlation between molecular markers of parasite resistance and treatment outcome in falciparum malaria. malar j. ; : . . gama be, pereira‑carvalho ga, kosi fj, de oliveira nk, fortes f, rosenthal pj, et al. plasmodium falciparum isolates from angola show the stctvmnt haplotype in the pfcrt gene. malar j. ; : . . alifrangis m, dalgaard mb, lusingu jp, vestergaard ls, staalsoe t, jensen atr, et al. 
occurrence of the southeast asian/south american svmnt haplotype of the chloroquine‑resistance transporter gene in plasmodium falciparum in tanzania. j infect dis. ; : – . . awasthi g, satya gbk, das a. pfcrt haplotypes and the evolutionary history of chloroquine‑resistant plasmodium falciparum. mem inst oswaldo cruz. ; : – . . price rn, uhlemann a‑c, brockman a, mcgready r, ashley e, phai‑ pun l, et al. mefloquine resistance in plasmodium falciparum and increased pfmdr gene copy number. lancet. ; : – . . lim p, alker ap, khim n, shah nk, incardona s, doung s, et al. pfmdr copy number and arteminisin derivatives combination therapy failure in falciparum malaria in cambodia. malar j. ; : . . sidhu abs, uhlemann a‑c, valderramos sg, valderramos j‑c, krishna s, fidock da. decreasing pfmdr copy number in plasmodium falci- parum malaria heightens susceptibility to mefloquine, lumefantrine, halofantrine, quinine, and artemisinin. j infect dis. ; : – . . simpson ja, jamsen km, anderson tjc, zaloumis s, nair s, woodrow c, et al. nonlinear mixed‑effects modelling of in vitro drug suscep‑ tibility and molecular correlates of multidrug resistant plasmodium falciparum. plos one. ; :e . . venkatesan m, gadalla nb, stepniewska k, dahal p, nsanzabana c, moriera c, et al. polymorphisms in plasmodium falciparum chlo‑ roquine resistance transporter and multidrug resistance genes: parasite risk factors that affect treatment outcomes for p. falciparum malaria after artemether–lumefantrine and artesunate‑amodiaquine. am j trop med hyg. ; : – . . okell lc, reiter lm, ebbe ls, baraka v, bisanzio d, watson oj, et al. emerging implications of policies on malaria treatment: genetic changes in the pfmdr‑ gene affecting susceptibility to artemether– lumefantrine and artesunate‑amodiaquine in africa. bmj glob health. ; :e . . ariey f, witkowski b, amaratunga c, beghain j, langlois a‑c, khim n, et al. a molecular marker of artemisinin‑resistant plasmodium falcipa- rum malaria. nature. ; : – . . 
ménard d, khim n, beghain j, adegnika aa, shafiul‑alam m, amodu o, et al. a worldwide map of plasmodium falciparum k ‑propeller polymorphisms. n engl j med. ; : – . . ocan m, akena d, nsobya s, kamya mr, senono r, kinengyere aa, et al. k ‑propeller gene polymorphisms in plasmodium falciparum parasite population in malaria affected countries: a systematic review of prevalence and risk factors. malar j. ; : . . kamau e, campino s, amenga‑etego l, drury e, ishengoma d, johnson k, et al. k ‑propeller polymorphisms in plasmodium falciparum para‑ sites from sub‑saharan africa. j infect dis. ; : – . . who. artemisinin resistance and artemisinin‑based combination therapy efficacy: status report. geneva: world health organization; . https ://www.who.int/malar ia/areas /drug_resis tance /updat es/ en/. accessed on nd december, . . safeukui i, fru‑cho j, mbengue a, suresh n, njimoh dl, bumah vv, et al. investigation of polymorphisms in the p. falciparum artemisinin resistance marker kelch in asymptomatic infections in a rural area of cameroon. biorxiv. ; . . torrentino‑madamet m, fall b, benoit n, camara c, amalvict r, fall m, et al. limited polymorphisms in k gene in plasmodium falciparum isolates from dakar, senegal in – . malar j. ; : . . apinjoh to, mugri rn, miotto o, chi hf, tata rb, anchang‑kimbi jk, et al. molecular markers for artemisinin and partner drug resistance in natural plasmodium falciparum populations following increased insecticide treated net coverage along the slope of mount cameroon: cross‑sectional study. infect dis poverty. ; : . . feng j, kong x, xu d, yan h, zhou h, tu h, et al. investigation and evaluation of genetic diversity of plasmodium falciparum kelch polymorphisms imported from southeast asia and africa in southern china. front public health. ; : . . bwire gm, ngasala b, mikomangwa wp, kilonzi m, kamuhabwa aar. 
detection of mutations associated with artemisinin resistance at k ‑propeller gene and a near complete return of chloroquine susceptible falciparum malaria in southeast of tanzania. sci rep. ; : . . uwimana a, legrand e, stokes bh, ndikumana jlm, warsame m, umulisa n, et al. emergence and clonal expansion of in vitro artemisinin‑resistant plasmodium falciparum kelch r h mutant parasites in rwanda. nat med. ; : – . . okell lc, drakeley cj, bousema t, whitty cjm, ghani ac. modelling the impact of artemisinin combination therapy and long‑acting treatments on malaria transmission intensity. plos med. ; :e . . moher d, shamseer l, clarke m, ghersi d, liberati a, petticrew m, et al. preferred reporting items for systematic review and meta‑analysis protocols (prisma‑p) statement. syst rev. ; : . . shamseer l, moher d, clarke m, ghersi d, liberati a, petticrew m, et al. preferred reporting items for systematic review and meta‑ analysis protocols (prisma‑p) : elaboration and explanation. bmj. ; : . . liberati a, altman dg, tetzlaff j, mulrow c, gotzsche pc, ioannidis jpa, et al. the prisma statement for reporting systematic reviews and meta‑ analyses of studies that evaluate healthcare interventions: explanation and elaboration. bmj. ; :b . . sterne jac, savović j, page mj, elbers rg, blencowe ns, boutron i, et al. rob : a revised tool for assessing risk of bias in randomised trials. bmj. ; :l . . ga wells, b shea, d o’connell, j peterson, v welch, m losos, p tug‑ well.
the newcastle‑ottawa scale (nos) for assessing the quality of non‑randomised studies in meta‑analyses. http://www.ohri.ca/progr ams/clini cal_epide miolo gy/oxfor d.asp. accessed on nd december, . . moola s, munn z, tufanaru c, aromataris e, sears k, sfetcu r, currie m, qureshi r, mattis p, lisy k, mu p‑f. chapter : systematic reviews of etiol‑ ogy and risk. in: aromataris e, munn z (eds). jbi manual for evidence synthesis. jbi, . https ://synth esism anual .jbi.globa l. accessed on st december, . . moola s, munn z, tufanaru c, aromataris e, sears k, sfetcu r, et al. sys‑ tematic reviews of etiology and risk. in: aromataris e, munn z (editors). joanna briggs institute reviewer’s manual. chapt . the joanna briggs institute, . https ://revie wersm anual .joann abrig gs.org/. accessed on st december, . . rossi‑fedele g, kahler b, venkateshbabu n. limited evidence suggests benefits of single visit revascularization endodontic procedures—a systematic review. braz dent j. ; : – . . ryan r. cochrane consumers and communication review group. heterogeneity and subgroup analyses in cochrane consumers and communication group reviews: planning the analysis at protocol stage. http://cccrg .cochr ane.org. december . accessed on th february, . . kontopantelis e, springate da, reeves d. a re‑analysis of the cochrane library data: the dangers of unobserved heterogeneity in meta‑analy‑ ses. plos one. ; :e . . moyeh mn, njimoh dl, evehe ms, ali im, nji am, nkafu dn, et al. effects of drug policy changes on evolution of molecular markers of plasmodium falciparum resistance to chloroquine, amodiaquine, and sulphadoxine‑pyrimethamine in the south west region of cameroon. malar res treat. ; : – . . mbacham wf, evehe m‑sb, netongo pm, ateh ia, mimche pn, ajua a, et al. 
efficacy of amodiaquine, sulphadoxine‑pyrimethamine and their combination for the treatment of uncomplicated plasmodium falcipa- rum malaria in children in cameroon at the time of policy change to artemisinin‑based combination therapy. malar j. ; : . . mccollum am, basco lk, tahar r, udhayakumar v, escalante aa. hitch‑ hiking and selective sweeps of plasmodium falciparum sulfadoxine and pyrimethamine resistance alleles in a population from central africa. antimicrob agents chemother. ; : – . . basco lk. molecular epidemiology of malaria in cameroon. xiii. analysis of pfcrt mutations and in vitro chloroquine resistance. am j trop med hyg. ; : – . . basco lk, ndounga m, ngane vf, soula g. molecular epidemiology of malaria in cameroon. xiv. plasmodium falciparum chloroquine resist‑ ance transporter (pfcrt ) gene sequences of isolates before and after chloroquine treatment. am j trop med hyg. ; : – . . basco lk. molecular epidemiology of malaria in cameroon. xvi. longi‑ tudinal surveillance of in vitro pyrimethamine resistance in plasmodium falciparum. am j trop med hyg. ; : – . . basco lk. molecular epidemiology of malaria in cameroon. xvii. baseline monitoring of atovaquone‑resistant plasmodium falciparum by in vitro drug assays and cytochrome b gene sequence analysis. am j trop med hyg. ; : – . . tahar r, basco lk. molecular epidemiology of malaria in cameroon. xxvi. twelve‑year in vitro and molecular surveillance of pyrimethamine resistance and experimental studies to modulate pyrimethamine resist‑ ance. am j trop med hyg. ; : – . . tahar r, ringwald p, basco lk. molecular epidemiology of malaria in cameroon. xxviii. in vitro activity of dihydroartemisinin against clinical isolates of plasmodium falciparum and sequence analysis of the p. falciparum atpase gene. am j trop med hyg. ; : – . . sahnouni k, menemedengue v, tahar r, basco l. molecular epidemiol‑ ogy of malaria in cameroon. xxx. 
sequence analysis of plasmodium falciparum atpase , dihydrofolate reductase, and dihydropteroate synthase resistance markers in clinical isolates from children treated with an artesunate‑sulfadoxine‑pyrimethamine combination. am j trop med hyg. ; : – . . basco lk, ringwald p. molecular epidemiology of malaria in yaounde, cameroon iv. evolution of pyrimethamine resistance between and . am j trop med hyg. ; : – . . basco lk, ringwald p. molecular epidemiology of malaria in yaounde, cameroon. vi. sequence variations in the plasmodium falciparum dihy‑ drofolate reductase‑thymidylate synthase gene and in vitro resistance to pyrimethamine and cycloguanil. am j trop med hyg. ; : – . . menard s, morlais i, tahar r, sayang c, mayengue p, iriart x, et al. molecular monitoring of plasmodium falciparum drug susceptibility at the time of the introduction of artemisinin‑based combination therapy in yaoundé, cameroon: implications for the future. malar j. ; : . . xu c, sun h, wei q, li j, xiao t, kong x, et al. mutation profile of pfdhfr and pfdhps in plasmodium falciparum among returned chi‑ nese migrant workers from africa. antimicrob agents chemother. ; :e ‑e . . ngassa mbenda hg, das a. occurrence of multiple chloroquine‑ resistant pfcrt haplotypes and emergence of the s(agt)vmnt type in cameroonian plasmodium falciparum. j antimicrob chemother. ; : – . . chauvin p, menard s, iriart x, nsango se, tchioffo mt, abate l, et al. prevalence of plasmodium falciparum parasites resistant to sulfadox‑ ine/pyrimethamine in pregnant women in yaoundé, cameroon: emergence of highly resistant pfdhfr / pfdhps alleles. j antimicrob chemother. ; : – . . severini c, menegon m, sannella ar, paglia mg, narciso p, matteelli a, et al. prevalence of pfcrt point mutations and level of chloroquine resistance in plasmodium falciparum isolates from africa. infect genet evol. ; : – . . de monbrison f, raynaud d, latour‑fondanaiche c, staal a, favre s, kaiser k, et al. 
real‑time pcr for chloroquine sensitivity assay and for pfmdr –pfcrt single nucleotide polymorphisms in plasmodium falciparum. j microbiol methods. ; : – . . ndam nt, basco lk, ngane vf, ayouba a, ngolle em, deloron p, et al. reemergence of chloroquine‑sensitive pfcrt k plasmodium falcipa- rum genotype in southeastern cameroon. malar j. ; : . . basco lk, tahar r, keundjian a, ringwald p. sequence variations in the genes encoding dihydropteroate synthase and dihydrofolate reductase and clinical response to sulfadoxine–pyrimethamine in patients with acute uncomplicated falciparum malaria. j infect dis. ; : – . . jiang j, yu c, tian c, li w, zhang t, xu x. surveillance of anti‑malarial resistance molecular markers in imported plasmodium falciparum malaria cases in anhui, china, – . am j trop med hyg. ; : – . . yao y, wu k, xu m, yang y, zhang y, yang w, et al. surveillance of genetic variations associated with anti‑malarial resistance of plasmo- dium falciparum isolates from returned migrant workers in wuhan. central china antimicrob agents chemother. ; :e ‑e . . gharbi m, flegg ja, pradines b, berenger a, ndiaye m, djimdé aa, et al. surveillance of travellers: an additional tool for tracking anti‑malarial drug resistance in endemic countries. plos one. ; :e . . djaman ja, olefongo d, ako ab, roman j, ngane vf, basco lk, et al. molecular epidemiology of malaria in cameroon and côte d’ivoire. xxxi. kelch propeller sequences in plasmodium falciparum isolates before and after implementation of artemisinin‑based combination therapy. am j trop med hyg. ; : – . . youmba j‑c, ringwald p, ngane vf, ndounga m, soula g, tejiokem m, et al. molecular epidemiology of malaria in cameroon. xi.
geographic distribution of plasmodium falciparum isolates with dihydrofolate reductase gene mutations in southern and central cameroon. am j trop med hyg. ; : – . . zhao l, pi l, qin y, lu y, zeng w, xiang z, et al. widespread resistance mutations to sulfadoxine–pyrimethamine in malaria parasites imported to china from central and western africa. int j parasitol drugs drug resist. ; : – . . achungu cr, nkuo‑akenji t, apinjoh t, wanji s. re‑emergence of chloroquine sensitive plasmodium falciparum after several years of chlo‑ roquine withdrawal in bamenda, north west cameroon. ec microbiol. : – . . basco lk, tahar r, ringwald p. molecular basis of in vivo resistance to sulfadoxine–pyrimethamine in african adult patients infected with plas- modium falciparum malaria parasites. antimicrob agents chemother. ; : – . . ringwald p, basco lk. molecular epidemiology of malaria in yaounde, cameroon i. analysis of point mutations in the dihydrofolate reductase‑ thymidylate synthase gene of plasmodium falciparum. am j trop med hyg. ; : – . . ringwald p, basco lk. molecular epidemiology of malaria in yaounde, cameroon v. analysis of the omega repetitive region of the plasmodium falciparum cg gene and chloroquine resistance. am j trop med hyg. ; : – . . basco lk, ringwald p. molecular epidemiology of malaria in cameroon. x. evaluation of pfmdr mutations as genetic markers for resistance to amino alcohols and artemisinin derivatives. am j trop med hyg. ; : – . . tahar r, basco lk. molecular epidemiology of malaria in cameroon. xxii. geographic mapping and distribution of plasmodium falciparum dihydrofolate reductase (dhfr) mutant alleles. am j trop med hyg. ; : – . . menard s, tchoufack jn, maffo cn, nsango se, iriart x, abate l, et al. insight into k ‑propeller gene polymorphism and ex vivo dha‑response profiles from cameroonian isolates. malar j. ; : . . lu f, zhang m, culleton rl, xu s, tang j, zhou h, et al. return of chloroquine sensitivity to africa? 
surveillance of african plasmodium falciparum chloroquine resistance through malaria imported to china. parasit vectors. ; : . . mbacham w, evehe ms, netongo p, ali i, nfor en, akaragwe a, et al. muta‑ tions within folate metabolising genes of plasmodium falciparum in cameroon. afr j biotechnol. ; : – . . eboumbou moukoko ce, huang f, nsango se, kojom foko lp, ebong sb, epee eboumbou p, et al. k‑ propeller gene polymorphisms isolated between and from cameroonian plasmodium falciparum malaria patients. plos one. ; :e . . witkowski b, nicolau m‑l, soh pn, iriart x, menard s, alvarez m, et al. plasmo- dium falciparum isolates with increased pfmdr copy number circulate in west africa. antimicrob agents chemother. ; : – . . basco lk, ringwald p. analysis of the key pfcrt point mutation and in vitro and in vivo response to chloroquine in yaoundé cameroon. j infect dis. ; : – . . basco lk, ringwald p. molecular epidemiology of malaria in yaoundé, cameroon. iii. analysis of chloroquine resistance and point mutations in the multidrug resistance (pfmdr ) gene of plasmodium falciparum. am j trop med hyg. ; : – . . basco lk. molecular epidemiology of malaria in cameroon. xii. in vitro drug assays and molecular surveillance of chloroquine and proguanil resist‑ ance. am j trop med hyg. ; : – . . tahar r, basco lk. molecular epidemiology of malaria in cameroon. xxvii. clinical and parasitological response to sulfadoxine–pyrimethamine treatment and plasmodium falciparum dihydrofolate reductase and dihydropteroate synthase alleles in cameroonian children. acta trop. ; : – . . pearce rj, pota h, evehe m‑sb, bâ e‑h, mombo‑ngoma g, malisa al, et al. multiple origins and regional dispersal of resistant dhps in african plasmo- dium falciparum malaria. plos med. ; : . . kimbi hk, nkuo‑akenji tk, patchong afm, ndamukong kn, nkwescheu a. the comparative efficacies of malartin, with and without amodiaquine, in the treatment of plasmodium falciparum malaria in the buea district of cameroon. 
ann trop med parasitol. ; : – . . ndiaye jla, faye b, diouf am, kuété t, cisse m, seck pa, et al. randomized, comparative study of the efficacy and safety of artesunate plus amodi‑ aquine, administered as a single daily intake versus two daily intakes in the treatment of uncomplicated falciparum malaria. malar j. ; : . . ndiaye jl, randrianarivelojosia m, sagara i, brasseur p, ndiaye i, faye b, et al. randomized, multicentre assessment of the efficacy and safety of asaq – a fixed‑dose artesunate‑amodiaquine combination therapy in the treatment of uncomplicated plasmodium falciparum malaria. malar j. ; : . . sagara i, rulisa s, mbacham w, adam i, sissoko k, maiga h, et al. efficacy and safety of a fixed dose artesunate‑sulphamethoxypyrazine‑pyrimethamine compared to artemether–lumefantrine for the treatment of uncompli‑ cated falciparum malaria across africa: a randomized multi‑centre trial. malar j. ; : . . whegang sy, tahar r, foumane vn, soula g, gwét h, thalabard j‑c, et al. efficacy of non‑artemisinin‑ and artemisinin‑based combination therapies for uncomplicated falciparum malaria in cameroon. malar j. ; : . . yavo w, faye b, kuete t, djohan v, oga sa, kassi rr, et al. multicentric assess‑ ment of the efficacy and tolerability of dihydroartemisinin‑piperaquine compared to artemether–lumefantrine in the treatment of uncom‑ plicated plasmodium falciparum malaria in sub‑saharan africa. malar j. ; : . . faye b, kuété t, kiki‑barro cp, tine rc, nkoa t, ndiaye jla, et al. multicentre study evaluating the non‑inferiority of the new paediatric formulation of artesunate/amodiaquine versus artemether/lumefantrine for the man‑ agement of uncomplicated plasmodium falciparum malaria in children in cameroon ivory coast and senegal. malar j. ; : . . nji am, ali im, moyeh mn, ngongang e‑o, ekollo am, chedjou j‑p, et al. 
ran‑ domized non‑inferiority and safety trial of dihydroartemisin‑piperaquine and artesunate‑amodiaquine versus artemether–lumefantrine in the treatment of uncomplicated plasmodium falciparum malaria in cameroo‑ nian children. malar j. ; : . . tahar r, almelli t, debue c, foumane ngane v, djaman allico j, whe‑ gang youdom s, et al. randomized trial of artesunate‑amodiaquine, atovaquone‑proguanil, and artesunate‑atovaquone‑proguanil for the treatment of uncomplicated falciparum malaria in children. j infect dis. ; : – . . apinjoh t, anchang‑kimbi j, ajonina m, njonguo e, njua‑yafi c, ngwai a, et al. in vivo efficacy of artesunate/sulphadoxine‑pyrimethamine versus artesunate/amodiaquine in the treatment of uncomplicated p falciparum malaria in children around the slope of mount cameroon: a randomized controlled trial. biomedicines. ; : . . oduola amj, moyou‑somo rs, kyle de, martin sk, gerena l, milhous wk. chloroquine resistant plasmodium falciparum in indigenous residents of cameroon. trans r soc trop med hyg. ; : – . . sansonetti pj, lebras c, verdier f, charmot g, dupont b, lapresle c. chloroquine‑resistant plasmodium falciparum in cameroon. lancet. ; : – . . claveau s. chloroquine‑resistant plasmodium falciparum malaria from cam‑ eroon. cmaj. ; : – . . titanji v, nkuo‑akenji t, ntopi w, djokam r. reduced levels of chloroquine resistant plasmodium falciparum in selected foci for the south west prov‑ ince. cameroon cent afr j med. ; : – . . wamae k, okanda d, ndwiga l, osoti v, kimenyi km, abdi ai, et al. no evidence of plasmodium falciparum k artemisinin resistance‑conferring mutations over a ‑year analysis in coastal kenya but a near complete reversion to chloroquine‑sensitive parasites. antimicrob agents chem‑ other. ; :e ‑e . . dagnogo o, ako ab, ouattara l, dago nd, coulibaly dn, touré ao, et al. towards a re‑emergence of chloroquine sensitivity in côte d’ivoire? malar j. ; : .
. sondo p, derra k, diallo nakanabo s, tarnagda z, kazienga a, zampa o, et al. artesunate‑amodiaquine and artemether–lumefantrine therapies and selection of pfcrt and pfmdr alleles in nanoro burkina faso. plos one. ; :e . . boni mf, smith dl, laxminarayan r. benefits of using multiple first‑line therapies against malaria. proc natl acad sci usa. ; : – . . duah no, matrevi sa, de souza dk, binnah dd, tamakloe mm, opoku vs, et al. increased pfmdr gene copy number and the decline in pfcrt and pfmdr resistance alleles in ghanaian plasmodium falciparum isolates after the change of anti‑malarial drug treatment policy. malar j. ; : . . cameroon: demographic health survey (dhs) ‑key indicators report. https ://dhspr ogram .com/publi catio ns/publi catio n‑pr ‑preli minar y‑repor ts‑key‑indic ators ‑repor ts.cfm. accessed on th april, . . berzosa p, esteban‑cantos a, garcía l, gonzález v, navarro m, fernández t, et al. profile of molecular mutations in pfdhfr, pfdhps, pfmdr , and pfcrt genes of plasmodium falciparum related to resistance to different anti‑ malarial drugs in the bata district (equatorial guinea). malar j. ; : . . jiang t, chen j, fu h, wu k, yao y, eyi jum, et al. high prevalence of pfdhfr‑ pfdhps quadruple mutations associated with sulfadoxine–pyrimethamine resistance in plasmodium falciparum isolates from bioko island equatorial guinea. malar j. ; : . .
oguike mc, falade co, shu e, enato ig, watila i, baba es, et al. molecular determinants of sulfadoxine–pyrimethamine resistance in plasmodium falciparum in nigeria and the regional emergence of dhps v. int j parasitol drugs drug resist. ; : – . . matondo si, temba gs, kavishe aa, kauki js, kalinga a, van zwetselaar m, et al. high levels of sulphadoxine‑pyrimethamine resistance pfdhfr-pfdhps quintuple mutations: a cross sectional survey of six regions in tanzania. malar j. ; : . . diakite m, achidi ea, achonduh o, craik r, djimde aa, evehe m‑sb, et al. host candidate gene polymorphisms and clearance of drug‑resistant plasmodium falciparum parasites. malar j. ; : . . ataide r, ashley ea, powell r, chan j‑a, malloy mj, o’flaherty k, et al. host immunity to plasmodium falciparum and the assessment of emerging artemisinin resistance in a multinational cohort. proc natl acad sci usa. ; : – . . djimdé aa, doumbo ok, traore o, guindo ab, kayentao k, diourte y, et al. clearance of drug‑resistant parasites as a model for protec‑ tive immunity in plasmodium falciparum malaria. am j trop med hyg. ; : – . publisher’s note springer nature remains neutral with regard to jurisdictional claims in pub‑ lished maps and institutional affiliations. 
white paper report
report id:
application number: hd
project director: cheryl ball (cball@ilstu.edu)
institution: illinois state university
reporting period: / / - / /
report due: / /
date submitted: / /

building a better back-end: editor, author, and reader tools for scholarly multimedia
final white paper (grant #hd- - )
cheryl e. ball (principal investigator)
s ceball@gmail.com
illinois state university
submitted march ,

final performance report: building a better back-end

project activities
the original proposal for this level ii digital humanities start-up grant was to modify the open-source, editorial-management system open journal systems (ojs) for use with scholarly multimedia. the goal was to build php-based plug-ins that would facilitate synchronous and asynchronous review of multimodal webtexts, which includes adding metadata to the author upload functions, maintaining linked file structures of webtexts through the versioning system of ojs, and capturing nondiscursive synchronous review data such as sticky notes and drawings on screencaptures of interactive webtext submissions. a second set of goals, to build remix and citation tools for readers, had to be set aside early on due to the scope of the review plug-in deliverable.

brief background on scholarly multimedia
scholarly multimedia (also called webtexts) are article- or book-length digital pieces of peer-reviewed scholarship designed using hypertextual and media-rich elements to enact an author’s argument. they incorporate interactivity, digital media, and different argumentation strategies such as visual juxtaposition and associational logic and are composed using webpages with links, animations, images, audio, video, scripts, databases, multimedia, and other design elements. these publications are unique in that each webtext is individually designed, which makes basic editorial processes such as reviewing, copy- and design-editing, publishing, and indexing significantly more complicated than print-based or linear (e.g. pdf-like) scholarship.
the oldest continuously published journal for webtexts is kairos: rhetoric, technology, and pedagogy (http://kairos.technorhetoric.net). the pi and two of the grant’s consultants are kairos editors and drew on their combined years of expertise with the journal to inform the deliverables of this project.

project purpose and original scope
editors, authors, readers, and publishers need media-specific tools to help them engage with and promote scholarly multimedia, but the unique editorial processes for scholarly multimedia (such as the infeasibility of blind review; the need for collaborative review processes; and the added layers of copy-editing that attend to usability, accessibility, sustainability, and rhetorical appropriateness of a webtext’s design) inhibit its growth. creating tools that display a webtext submission within a review system (instead of downloading it for offline review, as ojs does) allows editors to offer reviewers the opportunity to
• synchronously chat about a webtext as they interact with it,
• put sticky notes on areas of the design that may need attention,
• dis/agree with other reviewers’ comments in a similar manner to facebook’s “like” (and the much-called-for “dislike”) button,
• vote to accept/accept with revisions/revise and resubmit/reject, and
• track which reviewers receive feedback from their co-reviewers (using a game-like badge system for their logins/avatars to promote the creative play inherent in scholarly multimedia) and see which kinds of webtext content they prefer responding to, which would help editors further support reviewers’ disciplinary and technical expertise when assignments are needed.
the team’s goal with this grant was to build an a/synchronous webtext review plug-in that we would distribute through the open journal systems plug-ins gallery. (we called this the kairos-ojs plug-in.) in addition, we wanted to build plug-ins for increased implementation of metadata for media elements, better indexing and bibliography management tools (i.e., cross-support of scholarly multimedia with zotero), and citation tools for individual media elements or portions of elements (e.g., citing a -second clip from within a -minute podcast), among others.

major activities completed

fourth quarter
• pi (cheryl ball) and primary consultants (douglas eyman and kathie gossett) met several times, and once with programmer (steven potts).
• team (pi, consultants) created technical specs and wireframes for its revised version of ojs.
• team (pi, consultants) created metadata schema with crosswalk between ojs, dublin core, and kairos (the scholarly multimedia journal used as the test-case for this neh project). the metadata schema helped us to figure out what new fields we would need to build in ojs to accommodate the reader tools we had proposed.

first quarter
• pi worked with her digital publishing undergraduate class to mine metadata from all the back issues of kairos.

second quarter
• team (pi, consultants) ran user-testing with kairos editors for potential back-end changes to ojs using wireframes and interactive mock-ups.

third quarter
• team presented on wireframes at the pkp (public knowledge project) conference in berlin, and consulted with pkp developers on ojs.

fourth quarter
• team negotiated for installation of developmental server (from neh grant budget) at pi’s home institution. ojs installed.

first quarter
• pi began initial set-up to migrate kairos to ojs.
• neh grant extended for one year.

second quarter
• team conducted user-testing with kairos staff of a/synchronous multimedia review system mock-up.
• pi presented metadata schema at new media consortium conference.

third quarter
• pi and consultant eyman met in lansing, mi, to retrieve prototype from programmer.
• pi called project failed.

first quarter [end of grant period]
• pi wrote article about mining metadata as a pedagogical tool.
• pi formed advisory group for boutique data repository (see long-term impact section).
• pi and consultant (eyman) delivered presentation at networked humanities conference on infrastructures of digital media publishing (and later published an article on same).

changes in proposed project activities

scope & deliverables
the grant team changed the initial scope of the project fairly quickly after meeting the first few times, to exclude the reader tools (for remix and citation of multimedia elements and webtexts) from this project, with the hopes of returning to these goals in a follow-up grant. we removed these tools because just the editorial workflow (back-end) portions of the project proved too large to complete with the time, money, and human resources the grant provided.
basically, we would have had to totally re-write ojs to get it to do all of these things, and that was beyond the intended scope of the grant project (see technological changes, below).

we further limited the scope of the project, after our initial user-testing in the second quarter of , to the a/synchronous multimedia review plug-in. we did this because when we tested the potential changes we had planned for the author and editorial workflow tools within ojs, we discovered that with slight modifications of our own workflows, we could fit into the current ojs workflow relatively well without having to rework the system. for instance, although we would have to change some of our long-standing terminology, like “design-editing” to “layout editor,” and re-arrange the workflow pattern in ojs (since design-editing for kairos comes before copy-editing the written content), changing our terminology was potentially an easier fix than rewriting a major part of ojs to accommodate a single journal’s current workflow (even if that workflow is best practice for webtextual journals, which are not the mainstay audience for ojs).

thus, our focus for the grant project ended up being almost exclusively on writing a plug-in for ojs that would accommodate a/synchronous reviewing of webtexts. it is unknown whether this prototype was successful, as the programmer stopped responding to all grant-related communications in fall , when delivery (after a year’s delay) was intended to occur. it is rumored that the plug-in prototype was completed and did successfully run, but that it could not be made to integrate with ojs (see technological changes, below).

the team did add a deliverable, however, in the form of the metadata mining project. this unintended deliverable was created by the pi with a class of undergraduate digital publishing students at illinois state university. we mined over a million points of data from every webtext and media element (filetype) that kairos had ever published, in its then- -year history. (we have since expanded the collection to the issues published since this part of the project was completed in mid- .) this metadata was meant to be used to populate ojs so that the journal’s archives could be searchable and sharable within the new ojs reader-interface we had originally planned to build.

personnel
the project could not be completed because the programmer stopped communicating with the grant team right before delivery of the prototype was to have been made. it was too late in the project, at that point, to hire a new programmer.

technological
the team’s changing technological understanding of ojs altered the project’s original intent the most. open journal systems is an organically coded tool built up through the love and grant-getting of the public knowledge project’s architectural and programming team. it has been built on and modified over the last decade through piecemeal efforts, acknowledged by the pkp team as somewhat haphazard, and (as indicated at the pkp conference our grant team attended in berlin in ) left to its own devices in favor of the more nuanced, modular, and lessons-learned coding project that has become ojs’s next iteration: open monograph press.
while ojs functions pretty well from a non-technical viewpoint, programmers looking under the hood have repeatedly come back with very realistic evaluations that modifying the system in as radical a way as this grant project had hoped to do would be unsuccessful. several programmers we have spoken to have suggested that ojs needs to be forked or, more efficiently, rewritten from the ground up in order to implement the changes we wanted to make, which would make it an entirely new platform. doing so was outside the scope of this neh grant, as we had neither the time nor the resources to maintain a new system, nor did we want to do the current ojs users a disservice by forking and then not being able to provide a migration tool.

publicity of results (summary)
the major publicity efforts regarding the multimedia plug-in deliverable were based in conference presentations and one article. the major publicity efforts regarding the metadata-mining project were based in conference presentations, keynotes, an article, and the creation of a boutique data repository, which is also publicized in conference presentations and another article. see the grant products section for links to these publicity artifacts.

accomplishments
( ) our objective to explore whether open journal systems as a platform would be usable, with modifications via plug-ins, for multimedia publishing was accomplished. the outcome of this objective indicated that ojs is not currently viable for multimedia publishing. this is probably the most important outcome for our project, as well as for any person working with and in digital publishing platforms today.

( ) our objective to create plug-ins for multimedia-based editorial workflow with ojs was only minimally accomplished:
a. we discovered that a multimedia-based workflow based on best practices at kairos could be minimally manipulated to work within ojs’s current production workflow. this would require us to use zip files of webtexts instead of transferring files within folder structures, as we do now by hand (on our servers).
b. we were not able to deliver on our refocused objective to create an a/synchronous review plug-in for multimedia texts in ojs. although the possibility exists that such a plug-in could be created with more funding and better programming, the grant team has elected to not pursue this project due to the lack of overall viability for using ojs for multimedia publishing.

( ) our objective to create a robust reader interface for multimedia journals in ojs was removed from the project as being too large a technological task within the financial scope of the neh grant.

( ) the biggest unintended accomplishment with this grant was the set of unexpected deliverables produced by the metadata mining project, which yielded over a million points of data about the history of webtext publication in kairos, the longest-running journal of its kind. the pi has published several articles relating to this outcome and has begun a new digital humanities project, rhetoric.io (a boutique data repository), the idea for which was an outgrowth of the lack of availability of venues for distributing important, albeit small, data sets in the humanities. this new project is briefly discussed in the long-term impact section below.

audiences
the primary intended audience for the kairos-ojs plug-ins was ojs users, specifically publishers and editors who already use ojs and wanted to publish more multimedia content, as well as those who wanted to start multimedia journals from scratch. the secondary intended audience (and those who were user-tested during this grant) included editorial board and staff members from kairos, who already have a working knowledge of multimedia publishing. a third, unintended audience would have been teachers, who could use a multimedia review plug-in, like the one we had planned, for conducting peer-review workshops and multimedia analyses in their classes. however, the project had little actual impact on any of these audiences since the major deliverable (the review plug-in) could not be completed.

despite this failure, the project has allowed us to have conversations with several possible, future stakeholders who may be able to help us expand our collaborations (and our audiences) to build a new editorial-management system that is multimedia-specific.

evaluation
because the project wasn’t completed, we do not have evaluation statistics to provide.

lessons learned
instead of an evaluation, we provide the following list, written by a first-time pi of an neh grant:
• managing a grant, even a relatively “small” $ , one, takes more time than you’d imagine. it’s equivalent, at least, to teaching a new prep, if not more. do not skimp on budgeting for personnel, including the pi’s time, whether it be through a course re-assignment, summer salary, or paying for a staff person to manage the mountains of paperwork for you.
check with your institutional research office to see whether some of the administrative tasks can be wrapped into their office and the overhead you’re already paying the university.
• although it adds to the paperwork, requiring quarterly (or more frequent) reports from consultants and grant team members will assist with meeting grant project milestones. use project management software from the start, or hire someone with experience as a project manager if the pi can’t do it themselves.
• write in travel money for publicity of your project. going to conferences to present (particularly ones that are usually outside of the budget of most humanities scholars) will assist with your networking capabilities and will usually provide you with a forum to receive insightful feedback on your in-progress project.
• saving money by conducting the majority of the work offsite (and at a lower overhead rate) doesn’t make up for not having oversight of consultants. work at a distance only with people you know well and trust or have a binding contract with.
• if you don’t already have a working relationship with consultants, conduct formal interviews and/or ask for references and cvs/résumés. don’t rely on recommendations, unless those recommenders have established a formal working relationship with the consultant. also ask your institutional research office in advance whether there is a recuperation process if the consultant breaks his or her contract.
• if you do run into personnel problems, treat everyone involved humanely and communicate with them as quickly as possible, by as many means as necessary (f f, phone, email, skype, text, etc.).
if none of the above provides a successful resolution, seek advice from your research office or the neh program officer.
• be open to unexpected turns in the project that might produce interesting outcomes. be cognizant of when those turns become unproductive, though, and are taking you too far afield.
• for a high-risk grant such as the neh digital humanities start-up grants, failures still produce outcomes that are useful to you and the field, even if the deliverables you intended don’t work out.

public response
we were able to conduct two rounds of usability tests with wireframes and mock-ups, as well as present those wireframes at several conference panels. we have anecdotal evidence from both of these scenarios to indicate that, had the multimedia review plug-in been made available, people would definitely have wanted to use it. several key members of the ojs team (pkp founder john willinsky and lead ojs technical architect alec smecher, in particular) were very excited by it when we discussed it with them via skype early on in the grant as well as when we presented the wireframes at the pkp conference in berlin a year later. we also had skype calls with stanford’s highwire press, to discuss their implementation of multimedia in ojs, and they were very interested in what we were working on as they were working on a complementary project at the time.

in addition, kairos staff members and other journal editors alike thought that having both synchronous and asynchronous review possibilities was a smart idea, given the lack of time reviewers have for providing reviews.
additionally, being able to individually navigate and mark up (draw on, attach sticky notes with written text, highlight, etc.) a webtext and then share those markers with other reviewers in a synchronous space was one of the key features editors and reviewers said they liked.

we concluded from this project that editors and publishers do want a multimedia journal editing system, and while ojs cannot offer that in its current instantiation, it’s still an idea that should be pursued (just with a lot more funding and people involved).

continuation of the project
there are no plans to continue building php plug-ins for ojs to make it multimedia compatible.

long term impact
this project allowed for conversations to begin with several stakeholders at multiple, international universities and non-profit organizations about several related projects, including building a digital-media publishing infrastructure from the ground up. this infrastructure would potentially inform work on
• an (open-source) editorial-management system for digital, open-access publishers that includes print-based and multimedia publishing of article- and book-length scholarly projects as well as data-based publishing,
• a linked, boutique data repository, called rhetoric.io, which would provide searchable, visualizable data and would function as a sustainable data management storage facility (see http://rhetoric.io), and
• digital authoring and publishing institutes, held to train authors, editors, publishers, and evaluators of digital (media) scholarship how to compose, edit, publish, and assess such work using best practices.

grant products
the major grant product was the unintended deliverable of metadata, created from mining the back issues of kairos from – (with additional years, through , supplied by research assistants not affiliated with the neh grant). although we did not use it for its original intention (as data for the ojs database that would have run kairos), the metadata is important because it is a wunderkammer that showcases the history of webtext publishing over the last years. with over a million points of data categorized at both the webtext (article) level and the media-element level (for every single file associated with a webtext), this data can provide researchers with a plethora of interesting results, such as the possibility to trace the rise and fall of certain filetypes, mimetypes, and genres within webtext publishing. moreover, much of this data speaks to the web’s and web-users’ understanding of accessibility or lack thereof. it’s a rich data source that should be made public. but because there was no venue to publish the metadata by itself and the idea of just uploading it unmarked or uncommented to github seemed like asking for obsolescence, the pi, working with a cohort of other digital writing studies scholars, started a boutique data repository, called rhetoric.io. this repository is in-progress as of this writing (although the initial website is up: http://rhetoric.io).

publications
ball, cheryl e.; graban, tarez samra; & sidler, michelle. (forthcoming/under review). the boutique is open: data for writing studies. in jeff rice & brian mcnely (eds.), networked humanities. minneapolis: university of minnesota press.
pre-print: http://ceball.com/ / / /the-boutique-is-open-data-for-writing-studies/
eyman, douglas, & ball, cheryl e. (forthcoming/ ). digital humanities scholarship and electronic publication. in jim ridolfo & william hart-davidson (eds.), rhetoric and the digital humanities. chicago: university of chicago press. pre-print: http://ceball.com/ / / /digital-humanities-scholarship-and-electronic-publication/
ball, cheryl e. ( ). pirates of metadata or, the true adventures of how one editor, fifteen undergraduate publishing majors, and , media elements survived a metadata mining project. in stephanie davis-kahl & merinda hensley (eds.), extend and unify: outreach and education for scholarly communication and information literacy programs. chicago: association of college and research libraries. free copy: http://ceball.com/ / / /pirates-of-metadata-the-true-adventures-of-a-harrowing-metadata-mining-project/

presentations
ball, cheryl e. ( , december ). the mixed genres of kairos webtexts [invited lecture]. department of media and communication, university of oslo, norway.
ball, cheryl e. ( , november ). the kairos of scholarly multimedia: examining the history of webtexts through metadata [invited lecture]. blekinge museum, karlskrona, sweden.
ball, cheryl e. ( , june ). preservation & access for scholarly multimedia. computers & writing, frostburg, md.
ball, cheryl e. ( , june ). futures of computers and writing: publishing [roundtable]. computers & writing, frostburg, md.
ball, cheryl e. ( , may ). boutique data in writing studies [keynote]. technical communication and rhetoric phd maymester, texas tech university, lubbock.
ball, cheryl e. ( , march ). the mid-life (crisis?) of kairos: caring for the health and welfare of open-access digital media publishing. conference on college composition and communication, st. louis, mo.
eyman, douglas, & ball, cheryl e. ( , february ). networked humanities scholarship, or the life of kairos. networked humanities conference, university of kentucky, lexington, ky.
ball, cheryl e. ( , jan. ). the future of peer review in scholarly multimedia. modern language association, seattle, wa.
ball, cheryl e. ( , oct. ). the challenges of publishing webtexts. international society for the scholarship of teaching and learning, milwaukee, wi.
ball, cheryl e.; gossett, kathie; & eyman, douglas. ( , sept. ). kairos and multimedia digital scholarship: the need for better publishing tools. the public knowledge project (pkp) conference. berlin, germany.
ball, cheryl e. ( , july ). learning through leading: digital media scholarly publishing [poster presentation]. new media consortium, madison, wi.
ball, cheryl e. ( , april ). writing proposals and getting grants [cccc research committee roundtable]. conference on college composition and communication, atlanta, ga.
ball, cheryl e. ( , january ). digital media scholarship: innovation or insanity? modern language association, los angeles, ca.

syllabi
english : digital publishing, http:// s .ceball.com/ dr. cheryl e. ball, illinois state university, spring . course website includes + pages of instructions for mining metadata from fifteen years of kairos back issues, with metadata schema and crosswalks to ojs.
appendices

to keep filesizes down, i have elected to include links in the section above to all relevant publications and syllabi, which amount to nearly  pages of content. readers can access all pdfs for free on my website. the appendix, then, only includes screenshots of the interactive prototype for the a/synchronous reviewing system.

figure . a screenshot of the asynchronous, multimedia review prototype used in second-round user-testing. (the prototype was intended as an ojs plug-in.) this shot shows a reviewer adding a sticky note with written commentary on top of a webtext (“anna wintour”) that is located center-screen. this review system would upload a webtext to the review database, where readers could interact with it individually online during an open window of three weeks (or so, as scheduled by the editor), and add their written comments and annotated webtext screenshots through the submit button (bottom right). this would create an interactive discussion forum over the course of several weeks, which the editor could then retrieve for revision purposes.

figure . in the synchronous review system, several editorial board members could meet at the same time to review a webtext (center-screen: “anna wintour”) and chat about it using the chat feature in the right sidebar of the screen. some chat features are shown in this screenshot. all attendees in the chat are listed in the left sidebar.
the same annotation features as the asynchronous review has (note bubbles, sticky notes, highlighting, pencil/drawing, and eraser) are shown in the comment tools bar (mid-screen, below the webtext).

academics' behaviors and attitudes towards open access publishing in scholarly journals

this is a repository copy of academics' behaviors and attitudes towards open access publishing in scholarly journals. white rose research online url for this paper: http://eprints.whiterose.ac.uk/ / version: accepted version article: rowley, j., johnson, f., sbaffi, l. et al. ( more authors) ( ) academics' behaviors and attitudes towards open access publishing in scholarly journals. journal of the association for information science and technology. issn - https://doi.org/ . /asi. this is the peer reviewed version of the following article: rowley, j., johnson, f., sbaffi, l., frass, w.
and devine, e. ( ), academics' behaviors and attitudes towards open access publishing in scholarly journals. journal of the association for information science and technology, which has been published in final form at https://doi.org/ . /asi. . this article may be used for non-commercial purposes in accordance with wiley terms and conditions for self-archiving.

reuse

unless indicated otherwise, fulltext items are protected by copyright with all rights reserved. the copyright exception in section of the copyright, designs and patents act allows the making of a single copy solely for the purpose of non-commercial research or private study within the limits of fair dealing. the publisher or other rights-holder may allow further reproduction and re-use of this version - refer to the white rose research online record for this item. where records identify the publisher as the copyright holder, users can verify any specific terms of use on the publisher’s website.

takedown

if you consider content in white rose research online to be in breach of uk law, please notify us by emailing eprints@whiterose.ac.uk including the url of the record and the reason for the withdrawal request.

does discipline affect academics’ behaviour and attitudes towards open access publishing?

abstract

open access publishing can be viewed as a paradigmatic shift in scholarly communication practices. whilst there is significant progress with policy and a lively debate regarding the potential impact of open access publishing, few studies have examined academics’ behaviour and attitudes to open access publishing (oap).
this article, then, seeks to contribute to knowledge in relation to open access publishing by surveying an international and inter-disciplinary sample of academics, with regard to issues such as: use of and intentions regarding oap, and perceptions regarding advantages and disadvantages of oap, journal article publication services, peer review, and re-use. despite reporting engagement in oap, academics were unsure about their future intentions regarding oap. broadly, academics identified the potential for wider circulation as the key advantage of open access publishing, and were generally more positive about the benefits of oap than they were negative about its disadvantages. as regards services, rigorous peer review, followed by rapid publication, were most valued, with rapid peer review and promotion of papers post-publication also regarded as valuable. strong views on re-use of their work were indicated; academics were relatively happy regarding non-commercial re-use, but were very negative regarding commercial re-use, adaptations, and inclusion in anthologies. comparing the two major disciplinary groups, science, technology and medicine and arts, humanities and social sciences, showed a significant difference in attitude on a number of questions, but, in general, the effect size was small, suggesting that attitudes are more consistent across the academic community than might be assumed from some of the current debates. additional analyses on the basis of gender, publication rates, and years of experience produced similar results.

introduction

philosophically, policy makers and research funders are persuaded of the merits of open access publishing (oap) as a model for providing wider access to research outcomes, and, in particular, propose that research that is funded by public funds should be publicly available, and its access not restricted to subscription-based academic journals.
there is also the pragmatic stance that proposes oap as a panacea for what can be viewed as extortionate increases in academic journal subscription prices, which are typically borne by universities through their academic library budgets. oap proponents point to the contradictory cycle of universities creating research outputs, in the form of journal articles, and then paying publishers to have access to these outputs. others have discussed the relative merits of oap as an inevitable evolution of scholarly communication in a digital age. lewis ( ) describes open access as a disruptive innovation and on this basis proposes that it will become the dominant model for the distribution of scholarly content in the next decade. jubb ( ) agrees, suggesting that open access has the potential to upset the business model of scholarly publishing, but is optimistic that the consequent shifts in the ecology of scientific communication, including the dynamics of its production and the dynamics of its use, will render content more amenable to the needs of scholars and readers. certainly, we can look forward to a future that will depend on dynamic and interactive relationships between publishers, researchers, users, and information professionals (bennett, ); the challenge is to achieve co-creation and not conflict and competition.

in recent years, a wide range of different open access models has been proposed and there has been considerable debate regarding the role of research funding bodies, universities and their libraries, and academic publishers within the context of these models. amongst these models are: open access repositories (managed by universities and subject communities); pure open access journals (traditionally published by enthusiasts or organisations in a subject community, but more recently being launched by academic journal publishers (e.g.
cogent from taylor & francis)); and green and gold open access publication routes into traditional subscription-based scholarly journals published by academic publishers (e.g. elsevier). however, whilst some academics have been proactive advocates of oap (jubb, ; eve, ), and others have expressed their concerns regarding the disruptive nature of oap (lewis, ; osborne, ), in general little attention has been focussed on the academic community’s views on and response to oap (nariani & fernandez, ). academics, as researchers, authors, editors, and reviewers, are largely responsible for the intellectual content of scholarly communication in all of its forms. the success of the ‘oap project’ depends heavily on them, and hence it is important to design a model of scholarly communication for the digital age that they will embrace, or even better, to engage them in the co-creation of that model. yet, research on institutional repositories reveals low engagement from academics. some argue that this is due to a lack of alignment between the espoused objectives of institutional repositories and those of academics (cullen & chawner, ; creaser, ; xia, ), and suggest that other open access models may meet with greater success. in addition, differences in research funding regimes, research impacts and formats of academic publishing between disciplines (solomon & bjork, a; dallmeier-tiessen et al., ), together with disciplinary cultures, norms and traditions (coonin & younce, ; spezi, fry, creaser, proberts & white, ) may mean that attitudes are to some extent discipline dependent. certainly, ‘changes to the scholarly information business model will only be successful if they continue to satisfy the underlying motivations and needs of researchers’ (mulligan & mabe, a, p. ). this research, then, aims to contribute to knowledge regarding academics’ attitudes to oap, and further to investigate whether there are any disciplinary differences in attitudes.
more specifically, the objectives of this research are to:
1. profile academics’ oap behaviour, in terms of:
   a. recent publication activities
   b. future intentions
2. profile academics’ views on oap, in terms of:
   a. the advantages and disadvantages of oap
   b. the importance of services associated with paid oa publication
   c. their preferences regarding peer review
   d. the dissemination and re-use of their research

next, a literature review summarises prior literature on disciplinary differences with regard to oap, and on academics’ attitudes towards oap. then, the survey-based method, drawing on the international and inter-disciplinary community of scholars who are authors, editors and peer reviewers for taylor & francis journals, is outlined and evaluated. next, findings are reported and discussed. finally, conclusions and recommendations for future research, and practice and policy are offered.

literature review

factors disposing towards disciplinary differences in academics’ attitudes towards oap

there are two main mechanisms to achieve open access: the gold or ‘author pays’ route, and the green or ‘self-archiving’ route. the gold route is funded through the payment of article processing charges (apcs) to the publishers such that there are no subscription or charge barriers to access. in practice, where, as is often the case in stm subjects, the publication emerges from a funded research project, the research funder or the researchers’ university pays the apc. two major studies have shown significant differences in access to grant funding for apcs, with a divide between the bio and physical sciences and the social sciences and humanities (solomon & bjork, a; dallmeier-tiessen et al., ).
apcs also have the potential to reinforce existing hierarchies, with the highest prices being charged by journals with high impact factors from major international publishers, particularly those in biomedicine, and lowest prices being charged by journals in developing countries (solomon & bjork, b). not surprisingly, those publishers in a position to do so are using apcs to generate high levels of revenue, as a substitute for the high levels of subscription fees that they previously garnered. publishers argue that such charges are justifiable given the services that they offer. the green route has two branches. the first, self-archiving in personal, institutional or other repositories or submission to a green open access journal, involves authors in archiving either otherwise unpublished articles, or under certain conditions, versions of articles published in traditional journals. as laakso ( , p. ) suggests ‘green in this context comes from the notion of publishers giving a “green light” for uploading openly available copies of the article contents’. normally the terms under which an author can undertake this deposit are specified by the publisher, and may include the versions that can be uploaded (pre-print, accepted manuscript, publisher version), where it may be uploaded to (personal website, institutional repository, subject repository, elsewhere), and the embargo (after , , or months) (based on laakso, ). the embargo is another area of disciplinary divergence, with embargos in the stm disciplines typically shorter than in hss (e.g. as opposed to months). in addition to these practical differences between disciplines, the open access movement has its foundation in stm subjects, leaving humanities and social science scholars wrestling with the relevance of oap. for example, dallmeier-tiessen et al. ( ) found that stm accounts for % of pure and hybrid open access journals, and contributes % of articles.
humanities scholars have been found to have a low awareness of repositories and make significantly less use of e-publications and open access services (cullen & chawner, ; heath, jubb & robey, ) and penetration of open access has been much slower in the social sciences (coonin & younce, , ). in a recent study focussing on arts, humanities and social science disciplines, rodriguez ( ) found that although self-reported knowledge of oa was growing, publishing activity remained relatively limited. more generally, the culture of a discipline and its norms (or traditions) impact strongly on researchers’ communication practices, including their relative reliance on journals, books and conference proceedings (coonin & younce, ; fry et al., ; harley et al., ) and there is evidence that discipline culture influences the adoption and adaptations of digital scholarship (kling & mckim, , ).

previous research into academics’ attitudes towards oap

research into academics’ attitudes and behaviours regarding oap encompasses two groups: that associated with the use of open access repositories, and that associated with publishing in open access journals. studies in both of these areas provide insights into the factors that are important to academics in their decision to deposit in open access repositories or publish in open access journals. there is strong evidence that these factors are consistent across both oap and traditional publishing, such that recent studies on scholarly communication also offer valuable insights into academics’ attitudes to open access. early research on engagement with open access repositories, especially those established by university libraries, revealed low levels of deposits (kim, ; hendler, ). creaser ( ) suggested that only around % of eligible scholars and researchers self-archive their work in institutional repositories.
creaser ( ) also found that: academic staff had little knowledge of institutional repositories; were unaware of their institutions’ policy; and the most important consideration in publication decisions was achieving high readership and impact in their own discipline. fry et al. ( ) suggest that their respondents showed evidence of confusion between access to oa resources and seamless desktop access to subscription-based journal resources through a university’s access system. there are two important differences between oai and open access journals - reviewing and community. cullen and chawner ( ), on the basis of findings from a national study of academics in new zealand’s universities, conclude that the vision of capturing the intellectual capital of the organisation is unlikely to be realised, because, as xia ( ) also acknowledges, scholarly communication needs to be owned by the scholars. cullen and chawner ( ) suggest that with the advent of electronic journals and improved agreements regarding intellectual property the four key functions of the scholarly communication system, registration, certification, awareness and archiving (roosendaal & geurts, ), are being fulfilled more effectively. nevertheless, in a more recent study, following on from creaser ( ) and involving a survey and focus groups with a significant population of european academics, spezi, fry, creaser, proberts and white ( ) report that % of respondents had self-archived a version of their journal article in either an institutional or subject-based oar. in explaining this increase, they refer to policy developments, and mandatory deposit, as outlined in roarmap (http://roarmap.eprints.org) and opendoar (www.opendoar.org/).
spezi, fry, creaser, proberts and white ( ) offer a thorough review of green open access practices, including an interesting picture of inter-disciplinary differences as they relate to areas such as self-archiving behaviours, readers’ use of oars, and satisfaction with oar journal articles. studies on oap offer more specific insights into the factors that influence engagement in oap. most focus on publication choice and are restricted to specific disciplinary or journal communities. schroter, tite and smith ( ) conducted a study of the oap perceptions of authors published in british medical journal; they were willing to consider publishing in open access journals (oajs), but the quality and reputation of the journal, including impact factor, were key considerations, with charging policy being less important. warlick and vaughan ( ) interviewed biomedical faculty members who were early oap adopters at two major us research universities. incentives to publish in oajs included audience accessibility and the potential for broad exposure; disincentives included cost, and lack of regard for oajs. coonin and younce ( ), in a survey-based study of publishing in open access journals in the social sciences and humanities, concluded that peer review and peer acceptance are at the heart of scholarly and research endeavours. they also commented on the impact of disciplinary cultural differences, comparing psychology (‘a concise discipline’, p. ) with women’s studies (‘interdisciplinary and still relatively young’, p. ). mathematics is an interesting case, due to the longstanding use of arxiv; fowler ( ) found that a third of respondents had published in oajs, with speed of publication being viewed as a main advantage. nevertheless, tenure and promotion criteria were a major influencer of publishing decisions and there was substantial philosophical opposition to author fees.
two other studies (coonin & younce, ; coonin, ), in education and business, respectively, confirmed the importance of peer review in publication choice, irrespective of the business model used for publishing. russell and kent ( ) conducted a case study involving university of birmingham authors who had received institutional support for green and gold open access publication, and again confirmed that authors are not concerned about the business model, and are much more interested in the impact and reputation of the journal. bird ( ), in a study of authors contributing to nucleic acids research, found impact factor, journal profile and reputation, and quality and speed of the reviewing process to be key in journal choice. the study of open access publishing (soap) project, a cross-disciplinary worldwide survey conducted by a consortium of publishers, funding agencies and libraries, confirmed funding and perceived quality as the main barriers to publishing in open access journals (dallmeier-tiessen et al., ). an important large-scale study of scholarly communication, led by ciber, has spawned a number of publications, each focussing on different aspects of scholarly communication, but all generally exploring how scholars judge and implement trust and authority in reading, citing and publishing. relevant to this study are findings regarding attitudes to open access and the role of peer review. for example, nicholas et al. ( ) suggest that researchers are confused and suspicious about open access, but less so if produced by a traditional publisher, whilst jamali et al. ( ) uncovered negativity towards the use of repositories for publishing and some scepticism regarding their potential for increasing usage or reaching a wider audience. interestingly, jamali et al.
( ) suggest that researchers from less developed countries, such as india and china, are more reliant than those in the us and the uk on external factors that are related to authority, brand and reputation, including authors’ names, affiliations, country, and journal names. accordingly, open access models that do not embed these indicators may present researchers in developing countries with a greater challenge in making authority judgements. however, taking a different tack, nicholas et al. ( ), in an article that focuses on peer review, argue the case for the continuing and growing importance of peer review. they suggest that ‘the implicit trust that comes with peer review is very effective for reducing the complexity of today’s disintermediated, overly abundant scholarly information environment because it enables scholars to come to decisions without first considering every possible eventuality’ (p. ). other merits of traditional peer review are its contribution to improvement in the quality of the article, and that the publishers (with the aid of their editors) organise it. peer review is typically associated with traditional academic publishing, but nicholas et al. ( ) suggest that it may be possible to disaggregate the two. as regards usage, academics were concerned about the peer review status of oa publications, and, in general, there was a perception that oa journals are not peer reviewed. on the other hand, when making choices for publishing their research, peer review was ranked above ‘being published by a traditional publisher’ or ‘being in a highly cited journal’. in addition, plos one has demonstrated the potential for an oa journal that publishes speedily, undertakes peer reviewing, and has a good impact factor (curry, ; nicholas et al., ).
this emphasis on peer review is consistent with solomon and bjork ( a)’s finding that quality/impact, and speed of review/publication, were the most important factors, after ‘fit with the scope’, determining journal choice for submission. similarly, mulligan and mabe ( a, b), in an analysis of elsevier’s author feedback programme, found that refereeing quality and refereeing speed were the most important factors influencing journal choice.

methodology

in early , taylor & francis carried out a worldwide online survey to gather authors’ views on oap. the survey was sent via email to , authors during march . by the end of the exercise, , completed questionnaires were returned, a response rate of %. the survey was designed to gain insights on a number of aspects of oap. the large-scale and inter-disciplinary nature of the survey has generated a significant dataset that provides evidence that not only has value for policy development for taylor & francis, but also offers some indicators of more general interest. one limitation of the survey derives from the contact details available on the t&f database, such that only corresponding authors were asked to complete the questionnaire. this might skew the results towards the views of more experienced researchers and lead to under-representation of research students and younger academics. the nature of the contact database also affects the geographical spread of respondents. nevertheless, % of the respondents are from the united states and the uk, % are from the rest of europe and the remaining % is represented by the rest of the world (including australasia, africa and south america). this implies that the views collected are largely those of the western research world.
the taylor & francis open access survey was composed of eight sections (“your attitudes and values”, “licences”, “article submission practices”, “repositories”, “regional questions”, “open access services”, “the future of open access publishing” and “demographics”), with statements/closed questions and two open questions, divided into main questions. for the present study, seven main questions were considered, giving a total of statements; full details of the survey can be found online at www.tandfonline.com/page/openaccess/opensurvey. data were entered into ibm spss statistics . the dataset was initially inspected for errors and out-of-range values in each variable. the maximum confidence interval (at a % confidence level) for any one question is . . for the purposes of this study, the subject areas covered by the survey were collapsed into two main categories (table ).

table . distribution of the scientific (stm) and social (hss) subject areas.
group | disciplines | %
stm | behavioural sciences, engineering & technology, biological science, environmental science, mathematics, medicine (dentistry, nursing, pharmacy, allied health), geography, chemistry, agriculture & food science, physics, materials science, computer science | .
hss | humanities, education, business & economics, sociology (ethnicity, race, gender, development), politics & international relations, cultural studies, media & communication, public health & social care, arts, library & information science, tourism, leisure & sport studies, law & criminology, area studies | .

according to this classification, all scientific, technical and medical sciences (stm) accounted for . % of the responses, while the humanities and social sciences (hss) accounted for the remaining . %. descriptive statistics, including means and standard deviations, were calculated for each of the statements.
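the maximum confidence interval quoted for any one question can be read as a worst-case margin of error for a reported percentage. the sketch below is a minimal illustration, assuming simple random sampling and a normal approximation; the function name, the sample size of 1000 and the 95% confidence level are hypothetical placeholders, not values taken from the survey.

```python
import math
from statistics import NormalDist


def max_margin_of_error(n, confidence=0.95):
    """Worst-case margin of error, in percentage points, for a proportion
    estimated from n responses (hypothetical helper, not from the paper)."""
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # p = 0.5 maximises p * (1 - p), so it gives the widest interval
    return 100 * z * math.sqrt(0.25 / n)


# Hypothetical sample size, for illustration only
print(round(max_margin_of_error(1000), 2))
```

the margin shrinks with the square root of the number of responses, which is why a large survey can report a narrow maximum interval for every question.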
subsequently, independent samples t-tests were carried out to compare mean scores on gender and subject area, and one-way between-groups anova with post-hoc tests were performed to compare mean scores according to the years of experience of the respondent. this study reports the results from t-tests with respect to subject discipline. the analyses performed on the other variables, gender and years of experience, showed similar results and hence have not been reported here.

findings

key insights

this section reports and discusses the findings relating to the two key objectives of this research, viz, to profile academics’ oap behaviour, and to profile academics’ views on oap. most of the tables report responses for the whole sample, as well as providing a comparison of the differences between respondents in stm and those in hss. this paragraph first identifies some of the headline findings, and the sections that follow provide a more in-depth analysis. first on behaviour, there are two interesting findings. the ratio of total articles published to those published as gold oa is relatively consistent between stm and hss (table a). whilst hss scholars’ output is lower, this ratio suggests a similar level of adoption of gold oap, which is inconsistent with findings from other studies that suggest that hss scholars are slower to adopt oap (coonin & younce, , ; cullen & chawner, ). it also poses questions regarding the effect of differing levels of funding for apcs (solomon & bjork, a; dallmeier-tiessen et al., ). also, in terms of behaviour, when asked about future intentions regarding oa and their research, the responses to most questions revealed a high level of uncertainty regarding future intentions, with typically around % indicating that they were ‘unsure’, irrespective of discipline group (table b).
this is an important finding, which is arguably consistent with assertions that scholarly communication is undergoing a paradigm change that academics are finding difficult to interpret (jubb, ; lewis, ), and concurs with nicholas et al. ( )’s observation that researchers are confused and suspicious about open access. when it comes to attitudes towards oap, responses to four statements stand out. respondents identify wider circulation than publication in a subscription journal as a possible advantage of open access (table ), agreeing with the findings from warlick and vaughan ( )’s interview-based study. in terms of the service expected when they pay for oap, key are rigorous peer review and rapid publication (table ), and consistent with this there is a preference for the peer review style most aligned with the traditional peer reviewing process (table ). these findings echo those of many other studies that identify the increasing importance of peer review (coonin & younce, a; nicholas et al., ) and its importance, alongside impact factors and reputation, to the success of oajs (bird, ; coonin, ; curry, ; nicholas et al., ). speed of reviewing has also been identified as important in other studies (bird, ; solomon & bjork, a). finally, one result stands out for its negativity. academics are strongly against the use of their work for commercial gain without their prior knowledge or permission, even when they receive credit as the original author (table ); the issue of re-use has previously been relatively unexplored. in addition, there are differences between the two disciplinary groups, and whilst for many statements there is a statistically significant difference, in almost all cases the effect size is small, suggesting that the two groups are more similar than has been found or asserted by previous researchers and commentators (cullen & chawner, ; harley et al., ; rodriguez, ).
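the combination of a statistically significant difference with a small effect size is typical of large samples. the sketch below illustrates it with an independent samples t-test on two groups of agreement scores. it is an illustration under stated assumptions, not the authors' spss analysis: the response counts are invented, cohen's d is assumed as the effect size measure (the source does not name one), and the p-value uses a normal approximation to the t distribution, which is reasonable only because each hypothetical group has thousands of respondents.

```python
import math
from statistics import NormalDist, mean, stdev


def compare_groups(a, b):
    """Return (t, approximate two-sided p, Cohen's d) for two independent samples."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    s1, s2 = stdev(a), stdev(b)
    # Pooled standard deviation, as in the equal-variance t-test
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = (m1 - m2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    # Normal approximation to the t distribution (large-sample assumption)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    # Cohen's d: mean difference expressed in pooled standard deviations
    d = (m1 - m2) / pooled_sd
    return t, p, d


def expand(counts):
    """Turn {score: number of respondents} into a flat list of scores."""
    return [score for score, count in counts.items() for _ in range(count)]


# Hypothetical 1-5 agreement scores for one statement; NOT the survey data
stm = expand({1: 200, 2: 400, 3: 1000, 4: 1600, 5: 800})
hss = expand({1: 240, 2: 440, 3: 1080, 4: 1520, 5: 720})

t, p, d = compare_groups(stm, hss)
print(f"t = {t:.2f}, p = {p:.4f}, d = {d:.3f}")
```

with these invented counts the mean difference is under a tenth of a scale point, yet the test is significant at the 5% level while d stays below 0.2, the conventional threshold for a small effect. this is the pattern the findings describe: the groups differ detectably but not by much.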
academics’ oap behaviour

tables a and b summarise the responses to questions on academics’ current oap behaviour, and their intentions for the future. overall, academics report publishing an average of . articles in the twelve months prior to the survey, with roughly one quarter of these being published as gold open access (table a). further, the ratio of gold open access to publication in subscription-based journals is similar for both of the disciplinary groups. altogether this suggests either that there is considerable scope for further development of gold open access publishing, or that apcs act as a barrier to gold oap, such that the co-existence of gold and green oap is likely to persist for a considerable time. as regards academics’ future intentions regarding engagement with gold and green oa, there are no marked disciplinary differences here either, and the largest group of responses to all questions except one is in the ‘unsure’ category. the exception is the response to the statement ‘i will choose to publish more articles as green oa’, with % expecting to choose to publish more green oa articles in the future.

[insert table a here]

[insert table b here]

academics’ oap attitudes

tables to summarise the responses to questions on various aspects of academics’ attitudes towards oap. tables and , respectively, offer insights into their views on the advantages and disadvantages of oap. responses to the first three questions in table deal variously with perceptions relating to circulation, visibility, and readership. academics seem convinced that oa offers wider circulation, but less convinced that it offers higher visibility than publication in a subscription journal. they are more ambivalent as to whether ‘oa journals have a larger readership of researchers than subscription journals’.
other researchers have suggested that academics are less concerned about circulation, and more about having their work read by a community of scholars (cullen & chawner, ; warlick & vaughan, ). respondents were also ambivalent regarding whether ‘oajs were cited more heavily than subscription journals’, with hss respondents showing slightly less agreement with this than stm respondents. oajs were to some extent perceived to ‘have faster publication times than subscription journals’, but there was no overall agreement as to whether oa drives innovation in research. differences between the two discipline groups were significant for statements v , v , v and v , but effect sizes were small in all instances.

table asks about potential disadvantages of oap. the first two statements relate to the quality and production standards of oajs, respectively. overall, there was a great deal of ambivalence regarding these issues, with both having means close to . oa proponents may view this as a step in the right direction, since earlier studies have typically reported that oajs are perceived to be of lower quality than traditional journals, due to the absence of peer review (coonin & younce, ; coonin, ; dallmeier-tiessen et al., ; schroter et al., ), but there is still a way to go. positive progress is also weakly evident in the relatively negative responses to the statement: ‘there are no fundamental benefits to oa publication’. differences between the two discipline groups were significant for statements v and v , but effect sizes were small in both instances.

[insert table here]
[insert table here]

table offers important insights into what academic authors want from publishers, especially when they are required to pay for those services. we have already identified the importance of rigorous peer review and rapid publication, above. other strongly ranked items are rapid peer review, and promotion of their paper, post-publication.
in all of these areas, publishers, whether they be oa or traditional, rely heavily upon input from editors and reviewers. in other words, success is highly dependent on the labour and the reputation of the academics associated with a journal, much of which has, until now, only been remunerated through the honour accorded to reviewers and editors by their scholarly community. of these four statements, v and v show statistically significant differences between disciplines, but both have small effect sizes.

other statements relate to: guidance on increasing the visibility of a paper, automatic deposit of a paper, provision of usage and citation figures, provision of alt-metrics, and pre-peer review services, such as language checking and paper formatting. responses suggested that all of these services would be appreciated, but were not pivotal. this may, in part, be because they are not part of the standard package offered to authors, such that respondents do not have sufficient experience to be able to judge how useful they might be.

given the importance of peer review, the study sought to identify which approaches to peer review were most favoured by respondents. strongest support was evident for ‘a rigorous assessment of the merit and novelty of my article with constructive comments for its improvement, even if this takes a long time’. this suggests that academics do not just want peer review; they want a specific model of peer review. some support was also lent to ‘accelerated peer review with fewer rounds of revision’, but alternative models, such as those based on assessment of technical soundness with no judgement on novelty, or post-publication peer review, did not attract much support. of these four statements, v , v and v show statistically significant differences between disciplines, but have small effect sizes.
however, there is evidence here that it may be worth investigating further whether stm researchers may be more tolerant of alternative models of peer review than hss researchers.

[insert table here]
[insert table here]

finally, table summarises attitudes on dissemination and re-use of research. all statements in this table had the proviso: ‘without my prior knowledge or permission, provided i receive credit as the original author’. as already indicated, the lowest ranking in the survey was associated with re-use of their work for commercial gain. in contrast, a relatively positive response was offered on the issue of re-use for non-commercial gain. respondents also indicated concern regarding the inclusion of their work in an anthology, and its adaptation. they were ambivalent regarding translation and data and text mining, suggesting that they were cautious in expressing general support, and that the specific circumstances may influence their opinions. the issue of re-use has received very little attention beyond the publisher’s controls over deposit of versions of articles in repositories (bjork, ), so the insights from this study are important. this is also the only topic where there are statistically significant differences with effect sizes that are worthy of consideration. v , v , v , v and v all have statistically significant differences. for v and v , relating respectively to use for commercial gain and translation, the effect size is small (. ), with hss scholars in both instances being more resistant to the re-use of their work. for v and v , relating respectively to inclusion in anthologies and adaptation, the effect size is large (. , . ), with hss researchers being considerably more resistant to the re-use of their work.

[insert table here]

conclusion and recommendations

this article draws on data from a major international survey, based on the database of authors and reviewers of a major publisher, taylor & francis.
it offers insights into various aspects of academics’ behaviour and attitudes towards oap in oajs. as well as providing a general profile, analyses have been performed to explore any differences on the basis of the two major disciplinary groups, stm and hss. in terms of behaviour, this study suggests that hss and stm authors are equally engaged in publication in oajs, but that there is considerable progress to be made regarding the adoption of gold open access routes. indeed, respondents reported a high level of uncertainty about their future intentions regarding oap. overall, then, whilst there is some evidence of adoption of oap, especially in the arena of oajs, gold open access only accounts for around a quarter of the articles respondents published, and, coupled with this, academics are unsure as to their future intentions regarding oap. academics are uncertain as to the future of scholarly communication, and this presents them with dilemmas in their choice of publication, yet this study suggests that there is agreement that there may be some value in oa publication. on one hand, some authors are being mandated and funded to choose gold open access, but on the other, there are financial and ideological drivers inclining them to participation in various green open access models. taking this into account, it is likely that for the short term at the very least, green and gold open access models will continue to complement each other. publishers, researchers and policy makers need to take an omnichannel perspective to scholarly communication, and to develop further understanding of the models and contributions of green and gold open access to effective and sustainable scholarly communication. responses on attitudes to various aspects of oap provide insights into the characteristics of oap in oajs that are important to academics, and therefore need to be incorporated into any successful model. these are: rigorous peer review, and rapid publication.
more specifically, there is considerable support for peer review models that are aligned with the traditional model that involves pre-publication review of all aspects of the article, including technique, contribution and novelty. this study provides some tentative indication that stm researchers may be more amenable to alternative methods of review than hss researchers, and there might be scope for further research in this area. the peer review process is pivotal to any model of scholarly communication. however, with the advent of electronic manuscript submission systems, greater internationalisation of reviewing and editorial communities, and increased interdisciplinarity, it is in transition. many studies have identified the importance of peer review to the success of oap, but there is considerable scope for further research into this ‘hidden’ world. other authors have also identified the importance of journal impact factors and reputation. there are grounds for believing that academics will migrate to and embrace any model of scholarly communication or specific publication outlet that is perceived as high impact, rigorously refereed, and of good reputation, and by so doing will reinforce its status. accordingly, those oa initiatives that will succeed are those that work with scholarly communities to co-create the scholarly communication models of the future. finally, there is the matter of intellectual property. whilst academics may traditionally have accepted the copyright and licence agreements that publishers put before them in the interests of being published, open access brings into the limelight the issues associated with re-use. academics are strongly against the re-use of their work for commercial gain without their prior knowledge or permission, even if they receive credit as the original author.
they also have concerns regarding adaptation of their work, and its inclusion in an anthology, without their permission, with hss academics expressing much stronger views on this than stm academics. publishers and policy makers need to focus further attention on the intellectual property rights of authors, especially in a world where there are serious concerns regarding plagiarism and copyright infringement. maintaining appropriate controls is likely to be all the more difficult where the author deposits more than one version of an article in different oa ‘repositories’.

references

bennett, r. ( ) the changing role of the publisher in the scholarly communications process. in: d. shorley and m. jubb (eds.) the future of scholarly communication. london: facet.
bird, c. ( ) continued adventures in open access: perspective. learned publishing, ( ), - .
bjork, b-c. ( ) open access to scientific publications – an analysis of the barriers to change. information research, ( ). available online at: https://helda.helsinki.fi/handle/ / (accessed on th march )
coonin, b. ( ) open access publishing in business research: the authors’ perspective. journal of business & finance librarianship, ( ). doi: . / . . available online at: http://www.tandfonline.com/doi/pdf/ . / . . (accessed on march )
coonin, b. and younce, l. ( ) publishing in open access journals in the social sciences and humanities: who’s doing it and why? acrl fourteenth national conference, march - , seattle, washington.
coonin, b. and younce, l.m. ( ) publishing in open access educational journals: the authors’ perspectives. behavioral and social sciences librarian, ( ), - .
creaser, c. ( ) open access research outputs – institutional policies and researchers’ views: results from two complementary surveys. new review of academic librarianship, ( ), - .
cullen, r. and chawner, b. ( ) institutional repositories, open access, and scholarly communication: a study of conflicting paradigms. journal of academic librarianship, ( ), - .
curry, s. ( ) political, cultural and technological dimensions of open access: an exploration. in: n. vincent and c. wickham (eds.) debating open access. london: british academy, pp. - .
dallmeier-tiessen, s., darby, r., goerner, b., hyppoelae, j., igo-kemenes, p., kahn, d., lambert, s., lengenfelder, a., leonard, c., mele, s., nowicka, m., polydoratou, p., ross, d., ruiz-perez, s., schimmer, r., swaisland, m. and van der stelt, w. ( ) highlights from the soap project survey: what scientists think about open access publishing. arxiv: . . available online at: http://arxiv.org/ftp/arxiv/papers/ / . .pdf (accessed on march )
eve, m. ( ) before the law: open access, quality control and the future of peer review. in: n. vincent and c. wickham (eds.) debating open access. london: the british academy, - .
fowler, k.k. ( ) mathematicians' views on current publishing issues: a survey of researchers. university of minnesota digital conservancy. available online at: http://conservancy.umn.edu/bitstream/handle/ / /fowler_mathscholcomm_survey _article.pdf?sequence= (accessed march )
fry, j., oppenheim, c., creaser, c., greenwood, h., spezi, v. and white, s. ( ) peer behavioural research baseline report. available online at: http://www.stm-assoc.org/ _ _ _final_revision_behavioural_baseline_report.pdf (accessed march )
harley, d., accord, s.k., earl-novell, s., shannon, l. and king, c.j. ( ) assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines. center for studies in higher education, uc berkeley, berkeley, ca. available online at: https://escholarship.org/uc/item/ x g (accessed march )
heath, m., jubb, m. and robey, d. ( ) e-publication and open access in the arts and humanities in the uk. ariadne, . available online at: http://www.ariadne.ac.uk/issue /heath-et-al (accessed on march )
hendler, j. ( ) reinventing academic publishing – part . ieee intelligent systems, ( ), - .
jamali, h.r., nicholas, d., watkinson, a., herman, e., tenopir, c., levine, k., allard, s., christian, l., volentine, r., boehm, r. and nichols, f. ( ) how scholars implement trust in their reading, citing and publishing activities. library & information science research, ( - ), - .
jubb, m. ( ) introduction. in: d. shorley and m. jubb (eds.) the future of scholarly communication. london: facet.
kim, j. ( ) motivating and impeding factors affecting faculty contribution to institutional repositories. journal of digital information ( ). available online at: https://journals.tdl.org/jodi/index.php/jodi/article/view/ / (accessed march )
kling, r. and mckim, g. ( ) scholarly communication and the continuum of electronic publishing. journal of the american society for information science, ( ), - .
kling, r. and mckim, g. ( ) not just a matter of time: field differences and the shaping of electronic media in supporting scientific communication. journal of the american society for information science and technology, ( ), - .
laakso, m. ( ) green open access policies of scholarly journal publishers: a study of what, when, and where self-archiving is allowed. scientometrics, , - .
lewis, d.w. ( ) the inevitability of open access. college & research libraries, ( ), - .
mulligan, a. and mabe, m. ( a) the effect of the internet on research motivations, behaviour and attitudes. journal of documentation, ( ), - .
mulligan, a. and mabe, m. ( b) what journal authors want: ten years of results from elsevier’s author feedback programme. new review of information networking, ( ), - .
nariani, r. and fernandez, l. ( ) open access publishing: what authors want. college and research libraries, ( ), - .
nicholas et al. ( ) trust and authority in scholarly communications in the light of the digital transition: setting the scene for a major study. learned publishing, ( ), - .
nicholas, d., watkinson, a., jamali, h.r., herman, e., tenopir, c., volentine, r., allard, s. and levine, k. ( ) peer review: still king in the digital age. learned publishing, ( ), - .
osborne, r. ( ) why open access makes no sense. in: n. vincent and c. wickham (eds.) debating open access. london: the british academy, pp. - .
rodriguez, j.e. ( ) awareness and attitudes about open access publishing: a glance at generational differences. journal of academic librarianship, , - .
roosendaal, h.e. and geurts, p.a.t.m. ( ) forces and functions in scientific communication: an analysis of their interplay. paper presented at crisp cooperative research information systems in physics, university of oldenburg, germany.
schroter, s., tite, l. and smith, r. ( ) perceptions on open access publishing: interviews with journal authors. british medical journal, ( ), - .
shorley, d. and jubb, m. ( ) the future of scholarly communication. london: facet.
solomon, d.j. and bjork, b.-c. ( a) publication fees in open access publishing: sources of funding and factors influencing choice of journal. journal of the american society for information science and technology, ( ), - .
solomon, d.j. and bjork, b.-c. ( b) a study of open access journals using article processing charges. journal of the american society for information science and technology, ( ), - .
spezi, v., fry, j., creaser, c., probets, s. and white, s. ( ) researchers’ green open access practice: a cross-disciplinary analysis. journal of documentation, ( ), - .
warlick, s.e. and vaughan, k.t.l. ( ) factors influencing publication choice: why faculty choose open access. biomedical digital libraries, ( ). doi: . / - - - available online at: http://www.bio-diglib.com/content/ / / (accessed march )
xia, j.
( ) an anthropological emic-etic perspective on open access practices. journal of documentation, ( ), - .

table a. academics’ behaviours and intentions on oap – number of articles published.
columns: code | in the last months, how many scholarly articles have you published: | total articles per author (mean) | stm articles per author (mean) | stm ratio | hss articles per author (mean) | hss ratio
v | where a subscription is required by the reader to access the article | . . . . .
v | as gold oa, where the article is freely available to everyone | . . .

table b. academics’ behaviours and intentions on oap – future intentions.
columns: code | what are your future intentions regarding oa and your own research? | total: yes (%), no (%), unsure (%) | stm: yes (%), no (%), unsure (%) | hss: yes (%), no (%), unsure (%)
v | i will choose to publish more articles as gold oa
v | i will be mandated to publish more articles as gold oa
v | i will choose to publish more articles as green oa
v | i will be mandated to publish more articles as green oa

table . possible advantages of oap.
columns: code | statement | total: mean, s.d. | stm: mean, s.d. | hss: mean, s.d. | means diff. | sig. | effect size
v | oa offers wider circulation than publication in a subscription journal | . . . . . . - . . .
v | oa offers higher visibility than publication in a subscription journal | . . . . . . . . n/a
v | oa journals have larger readership of researchers than subscription journals | . . . . . . . . .
v | oa journals are cited more heavily than subscription journals | . . . . . . . . .
v | oa journals have faster publication times than subscription journals | . . . . . . - . . .
v | oa drives innovation in research | . . . . . . - . . .

table . possible disadvantages of oap.
columns: code | statement | total: mean, s.d. | stm: mean, s.d. | hss: mean, s.d. | means diff. | sig. | effect size
v | oa journals are lower quality than subscription journals | . . . . . . - . . .
v | oa journals have lower production standards than subscription journals | . . . . . . - . . .
v | there are no fundamental benefits to oa publication | . . . . . . . . .

table . importance of services when paying for publication in oajs.
columns: code | statement | total: mean, s.d. | stm: mean, s.d. | hss: mean, s.d. | means diff. | sig. | effect size
v | rapid peer review | . . . . . . . . .
v | rigorous peer review | . . . . . . - . . .
v | rapid publication of my paper | . . . . . . . . .
v | promotion of my paper post-publication | . . . . . . - . . .
v | detailed guidance on how i can increase the visibility of my paper | . . . . . . . . n/a
v | automated deposit of my paper (author accepted version) into a repository of my choice | . . . . . . - . . n/a
v | provision of usage and citation figures at the article level | . . . . . . - . . n/a
v | provision of alt-metrics (such as altmetric or impactstory) | . . . . . . . . .
v | pre-peer review services such as language polishing, matching my paper to a journal, and/or formatting my paper to journal style | . . . . . . . . n/a

table . views on peer review styles in oajs.
columns: code | statement | total: mean, s.d. | stm: mean, s.d. | hss: mean, s.d. | means diff. | sig. | effect size
v | a rigorous assessment of the merit and novelty of my article with constructive comments for its improvement, even if this takes a long time | . . . . . . - . . .
v | accelerated peer review that reviews the technical soundness of my research without any judgement on its novelty or interest | . . . . . . . . .
v | accelerated peer review with fewer rounds of revision | . . . . . . . . n/a
v | post-publication peer review after a basic formal check by invited reviewers that my work is scientifically sound (in the style of f research) | . . . . . . . . .

table . attitudes towards the dissemination and re-use of their research.
columns: code | statement | total: mean, s.d. | stm: mean, s.d. | hss: mean, s.d. | means diff. | sig. | effect size
v | it is acceptable for my work to be reused provided the new author applies the same reuse conditions as i applied when i published the work | . . . . . . . . n/a
v | it is acceptable for my work to be re-used for non-commercial gain | . . . . . . . . n/a
v | it is acceptable for others to use my work for commercial gain | . . . . . . . . .
v | it is acceptable for others to translate my work | . . . . . . . . .
v | it is acceptable for others to use my work in text- or data-mining | . . . . . . . . .
v | it is acceptable for others to include my work in an anthology | . . . . . . . . .
v | it is acceptable for others to adapt my work | . . . . . . . . .

türk kütüphaneciliği, , ( ), - doi: . /tkd. .
görüşler / opinion papers

digital humanities: a new approach* (dijital insanî bilimler: yeni bir yaklaşım)
sümeyye akça**

* this study draws to a great extent on the author's doctoral dissertation. see akça, s. ( ). dijital insani bilimler yaklaşımıyla kültür varlıklarının görünürlüğünün ve kullanımının artırılması: türkiye için kavramsal bir model önerisi [increasing the visibility and usage of cultural heritage objects with the digital humanities approach: a proposal of a conceptual model for turkey] (doctoral dissertation). retrieved from http://www.bby.hacettepe.edu.tr/yayinlar/dosyalar/akça.pdf
** assist. prof. dr., department of information management, ardahan university, ardahan. e-mail: sumeyyeakca@ardahan.edu.tr
received: . . accepted: . .

abstract

advances in digitization and computer technology have opened new directions for the cultural heritage sector. with the impact of globalization, it has become necessary to ease access to cultural heritage, to ensure participation and to evaluate practices from an international perspective. all these developments have contributed to the formation of a new field, or approach, known as digital humanities. through work in this field, it is now possible to learn about and interact with cultural heritage located in any part of the world. while chairs in this field are being established at the world's leading universities, in turkey the concept has only recently been heard of. although the number of cultural heritage digitization projects run by cultural memory institutions has increased in recent years, studies carried out within the scope of digital humanities are scarce. starting from this problem, this study aims to introduce the field by describing, in the context of the literature, what digital humanities is, what it stands for, its historical development and its areas of application. studies done in this field in turkey are also evaluated, and approaches and practices around the world concerning the convergence of digital humanities with information science and librarianship are discussed.

keywords: digital humanities; cultural heritage; librarianship; information and records management.

introduction

cultural heritage is a field from which states increasingly derive alternative income, which links the past to the present and even to the future, and which benefits states culturally, socially and even economically. various efforts are under way to make effective use of this field in every respect. in digital humanities, whose main source is cultural heritage, studies are carried out and applications developed to bring out the embedded value of cultural heritage and to give humanity access to that heritage across borders. digital humanities, defined as the application of computer technologies to the humanities (mccarty, , p. ), is an umbrella term covering a wide range of practices for creating, applying and interpreting digital and information technologies (presner and johanson, , p. ). bringing together cultural studies and the humanities, the field presents itself as a heterogeneous area of study (reichert, , p. ). the cultural assets of civilizations are the main sources of digital humanities research. these assets include works created in many media, from written sources to ancient stone tablets and papyri that carry messages to the people of today (jessop, ).
this cultural heritage, once transferred to the digital environment, constitutes the basic data of digital humanities (american council of learned societies (acls), ). in recent years work has been under way all over the world on the creative re-use of digitized cultural heritage. in other words, applications are being developed that build new content on top of digitized cultural heritage in order to understand the past better. the field that integrates traditional heritage management, museology, history, archaeology and literature with information and communication technology (ict) tools is called digital humanities. broadly, the field uses technological tools to address the fundamental questions of the humanities. by using the objective strategies and quantitative methods that technology provides, it also aims, in a sense, to answer the criticism of subjectivity directed at the humanities. in short, work in digital humanities brings fresh energy to the humanities and social sciences, and to the fundamental problems of those fields, through technological approaches. the term digital humanities began to appear in the literature as the internet came into wider use. it transformed terms that had been in use since the beginning of the computer age in the  s, such as computational science and humanities computing, and has been used in a more inclusive sense (svensson, ). from an epistemological standpoint, digital humanities is assessed in three periods that show the historical development and transformation of the field. the use of computer technologies and the digitization of the “primary data” of the humanities and cultural studies form the foundation of the field.
in other words, the core of this theoretical approach is computer support for deriving secondary data or results. by making evidence-based interpretation of data possible, computer technologies help the humanities to apply and understand claims to objectivity (svensson, ). alongside the text and linguistics studies carried out under the heading of humanities computing in the  s and  s, work is now also seen in disciplines such as history, literature, sociology, archaeology, art and culture, and musicology (gold, ). in the second phase, alongside large-scale digitization, work on production processes stands out, shaped by the development of the methods applied in the humanities. this period is marked by the development of digital tools for the research and methods side of humanities work with digital data. that is, the traditional research methods of the humanities were restructured in this period. to that end, new methodological approaches were sought for producing, processing and storing the data used (ramsay and rockwell, ). the final phase involves the transition from web . to web . technologies in order to build research infrastructure. the social humanities work that has developed with the use of web . technologies offers an interdisciplinary infrastructure and helps scientific knowledge to be shared socially without constraints of time or place. crowdsourcing approaches emerge in the work done in these fields during this period. this new digital infrastructure (hypertext, wiki tools, crowdsourcing software, etc.) carries the computer technologies used in the first phase of digital humanities into the broad network culture that has formed around the social sciences.
thanks to this crowdsourcing approach, which brings third parties into the process, the work gains considerably in terms of both content and time (see figure ). crowdsourcing is the performance, by an undefined and larger network of people, of tasks that once could be carried out only by an institution's or organization's own employees (howe, ). the term comes from the business and innovation sector (estelles-arolas, navarro-giner and gonzalez-ladron-de-guevara, ); in its application, technical modernization has been achieved and work is carried out on digital publications and media over peer-to-peer (p p) networks (mcpherson, , p. ). in this phase digital humanities is looking, in paradigm terms, for alternative ways of producing knowledge.

figure . funnel model in innovation and r&d with entrepreneurial companies. translated from the article “uçer, s. ( ). kurumsal inovasyon ile dijital dünyaya adaptasyon. harvard business review, ”. copyright belongs to harvard business review.

although the influence of this field grows worldwide by the day, the work done on it in turkey is both insufficient and narrow in scope. this study therefore aims to bring into the literature an account of what digital humanities, a field that collaborates with many disciplines, is, where it is applied and how it has developed over time. the final section also assesses the relationship between digital humanities and librarianship and describes existing practices.

historical development

the transformation brought about by the use of information technologies in the critical formation of theories and concepts in the humanities, the arts, the social sciences and many other disciplines has resonated all over the world (berry, ). the use of computer technologies gives these disciplines unprecedented breadth and depth in collecting and analysing data (lazer et al., ).
Over time this transformation gave rise to the digital humanities field, which brings different disciplines together. The development of the digital humanities over time shows itself in the assessment of the relationship between the social sciences and computer science. Initially seen as a supporting tool for work in the humanities (McCarty, ), the field gradually shed that role and became a discipline whose practice requires intellectual effort and which has its own standards and theoretical explanations (Hayles, ). As the body of work grew, it became clear that the digital humanities are not a tool but part of the basic purpose and substance of that work (Berry, ). In a sense, the digital humanities are not merely a subfield of computer science in which computer technologies are applied to big data and texts through statistical and quantitative techniques; they are a field that offers a perspective foregrounding human experience and the diversity of human nature (Kramer, ).

The historical account of digital humanities work is traced back to the early 's. The pioneering work of the field is Robert Busa's work in the 's on the Index Thomisticus, an index to the works of the medieval cleric Thomas Aquinas (Busa, ; ; Jones, ). Busa, who in developed the grammar of the "Index Thomisticus" together with Thomas J. Watson, the founder of IBM ( ), is regarded as the founder of the intersection between the humanities and computer technologies (Dalbello, ). The first manifestations of computer-assisted humanities emerged from the early 's onward with literary computing, the first independent research field to provide an objective analysis of information.
The first digital edition on punch cards, the computer technologies Antonio Zampolli used in literary and linguistic studies in the 's (Jessop, ), and the Modern Language Association International Bibliography (MLAIB), which could be searched over a telephone connection, marked the start of the retrospective digitization of cultural heritage (Reichert, , pp. - ). Word indexing, word frequency counting, and word grouping studies were carried out in this period. These studies, conducted at the time under the heading of humanities computing, form the basic methods used in the digital humanities today. By the - 's, computational linguistics appears as a research field institutionally established through university facilities, specialist journals (such as Journal of Literary and Linguistic Computing and Computing in the Humanities), discussion forums (Humanist), and conferences. With standards (Standard Generalized Markup Language) and text-processing tools, texts in this period were structured and transferred to databases, analytical tools were applied to them, and access to texts became easier. Many centres were also founded in this period (Hockey, , pp. - ).

Dijital İnsanî Bilimler: Yeni Bir Yaklaşım / Digital Humanities: A New Approach

With computer technology in such wide use, cultural memory institutions also tend to transfer their collections to digital media and open them to multiple uses (American Memory Project, Canada's digital heritage preservation project, etc.). In this process it is mostly the classic works of literature that are digitized.
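The word indexing and word frequency methods named above can be illustrated with a short sketch. It is only an illustration under stated assumptions, not a reconstruction of any historical system: the sample sentence, the function names, and the simple regular-expression tokenizer are all hypothetical choices made here, and real projects would need language-aware tokenization.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def word_frequencies(tokens):
    """Word frequency: count how often each token occurs."""
    return Counter(tokens)


def word_index(tokens):
    """Word indexing: map each token to the list of positions where it occurs."""
    index = defaultdict(list)
    for position, token in enumerate(tokens):
        index[token].append(position)
    return dict(index)


# Hypothetical sample text, chosen only for illustration.
sample = "In the beginning was the word, and the word was with the text."
tokens = tokenize(sample)
frequencies = word_frequencies(tokens)
index = word_index(tokens)

print(frequencies.most_common(3))
print(index["word"])
```

The index records positions rather than only counts, which is what lets a concordance show each occurrence of a word in its context.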
As different editions of original manuscripts were opened to electronic access (Nicholas, Paquet and Heutte, ), applications such as improved database search and retrieval and linguistic automation were developed (Berry, ). In this period the private archives of historical figures, or the sources of a historical period, were also digitized and opened to users. For example, the history of Abraham Lincoln was digitized and opened to use. The project comprises a large archive of Lincoln's manuscripts, letters, diaries, publications, image files, audio files, video archive, and an interactive map, together with their descriptions. It lays out Lincoln's historical and social network (VandeCreek, ). Many other projects of this kind exist (Nineteenth Century Collections Online NCCO; Nebraska's Digital History Project; Philipp Melanchthon Project; The Alexandria Digital Library, etc.).

The stage that follows the digitization of cultural heritage is marked more by a qualitative approach. Generative and interactive work such as explanation and annotation appears, which makes texts more understandable and supports inference and analysis. With this period, digital tools were integrated into the core methodology of the humanities (Schnapp and Presner, , p. ). In other words, whereas the first wave involved relatively narrow work such as encoding, text analysis, and scholarly publishing, the second wave established the paradigm of the digital humanities as a discipline and produced a literature on topics such as its convergence with other disciplines. Work using new and combined methodologies and approaches also stands out in this period (Svensson, ). The spread of the World Wide Web (WWW) changed research and practice very quickly (Reichert, ).
Today, in what is also called the Internet age, the digital humanities are steadily extending their reach and span a wide range in both application and area of influence (for example geographic visualization, three-dimensional modelling, and improving theoretical access to digital cultural objects) (Spiro, ). With the arrival of web . technology, projects not only digitized collections and opened them to use but also drew on crowdsourcing. In the Old Weather project, for example, information contributed by users about historical weather conditions, ship movements, and the lives of the people on board was visualized using data mining. Likewise, Oxford University's Oxyrhynchus papyrus project invites contributions from users with knowledge of ancient Greek (POxy: Oxyrhynchus Online, ). A programme set up by the British Library seeks volunteer support for transcribing Arabic scientific manuscripts for OCR (From the Page, ). In this period, in which social media and crowdsourcing are used heavily, cultural and scientific heritage reaches wider audiences and becomes more transparent as scholarly practice in humanities and cultural studies develops.

The spread of digital humanities work also enables access to and interaction with cultural heritage written in different alphabets. For example, The Drukpa Kagyu Heritage project digitized endangered Tibetan manuscripts and made them available to users (The Drukpa Kagyu Heritage Project, ). Another project provides online access to Tagore's works in Bengali and English and, alongside the manuscripts and their transcriptions, has developed a search engine, a system for comparing different versions of a work, and tools that help readers understand the text (Bichitra, ).
One of the most impressive projects in the field is the Electronic Cultural Atlas Initiative (ECAI - ecai.org), run by the University of California. The project aims to present cultural change over time combined with spatial information, and many people from different disciplines worked to build the cultural atlas. Through a searchable catalogue of Internet-accessible resources and a geographic information system, the project visualizes change and historical events in trade, politics, ecology, and cultural heritage (Buckland, ; Mostern, ). Another project that draws on geographic information systems built a cultural atlas of the regions held sacred by the monotheistic religions (Judaism, Christianity, and Islam): Israel, Palestine, Jordan, southern Lebanon, Syria, and the Sinai Peninsula. Archaeological sites recorded in the region from prehistory to the . century were visualized with the help of geographic tools such as Google Maps and Google Earth (Digital Archaeological Atlas of the Holyland, ).

There are also a great many projects that relate historical and literary texts to time and place. Changes in political borders and cultural concepts across time and space are visualized using geographic information systems (HyperCities, Africa Map, The American Century Geospatial Timeline, Histography, The Map of Early Modern London, Mapping St., The Digital Literary Atlas of Ireland - , etc.). Other projects aim to describe and analyse cultural heritage sites by simulating them through three-dimensional ( D) digitization. Carried out mostly in archaeology, these projects concentrate on topics such as spatial simulation and heritage preservation.
For example, in the Venice archive digitization project (Venice Time Machine), information technologies were used alongside digitization to widen access to knowledge of the past. Together with semantic encoding of the material, the city of Venice's journey through history is brought to life (Kaplan, ). In another project, documents, maps, and photographs from English archives were used to visualize the historic Pudding Lane, regarded as the starting point of the Great Fire of London in the . century (Dempsey et al., ).

The allure of these possibilities, which computer technologies provide and whose reach widens daily, also has problematic sides. The use of evidence-based, data-centred statistical and quantitative techniques in the digital humanities can cause researchers to lose, or fail to bring to bear, their own experience of their subject (Kramer, ; Rieder and Röhle, ). On the other hand, the use of computer technologies has consolidated the paradigm in cultural studies and the humanities and has supported interpreting the field with the mathematical methods employed (Reichert, ).

Digital Humanities Tools

In parallel with developments in the computing and communications sector, tools for interpreting and understanding cultural heritage are developing daily in the digital humanities as well. Although these programs are still not widely used among people working in the humanities, usage is rising day by day. For the programs to be more effective and reach a wider range of users, user expectations should be measured at frequent intervals and the programs updated accordingly (Gibbs and Owens, ). These developments, whose basic aim is to identify innovative approaches for humanities work, give researchers, students, and the public deeper and more sophisticated access to cultural assets.
This perspective, which also brings innovative approaches to education, enriches the presentation of and interaction with the materials used (Summit, ). Many programs are used and developed in the application areas of the humanities, which can be grouped under headings such as interpreting (annotating) and visualizing content, collaborative approaches and crowdsourcing, visualizing temporal and spatial uncertainty, stylistic analysis and authorship attribution, electronic publishing, and text analysis. For example, programs such as WordSeer, Textal, Voyant, CATMA (Computer Aided Textual Markup and Analysis), and WordMap are often used for text analysis, data mining, text encoding, and visualization, while annotation tools such as Annotation Studio and Juxta are used to understand texts better. Omeka, used for web publishing and storytelling, is one of the most frequently used digital humanities tools. Programs are also known to have been developed that support collaborative reading, identifying an author from writing style and content, and deriving the relationships and statistics of the words in a text (The Text Analysis Portal for Research (TAPoR), Textal, Voyant, WordSeer, etc.). There are also programs that help with the semantic annotation of digital images, which yields more accurate retrieval (Xu and Wang, ). The Digital Research Tools (DiRT) directory, supported by the Andrew W. Mellon Foundation, lists such digital research tools for scholarly use. DiRT is important because it shows researchers which tools they can use for the research they want to do.

Such as the free online program Digital Mappa (see https://digitalmappa.org).
The European Association for Digital Humanities (EADH), the Association for Computers and the Humanities (ACH), the Canadian Society for Digital Humanities (CSDH), the Australasian Association for Digital Humanities (aaDH), and the Japanese Association for Digital Humanities.

Worldwide, universities have a digital humanities department or laboratory (Akça, ).

Work also continues to meet research and methodology needs that change and diversify by the day and to make existing programs more usable. One such effort, the DigiDoc (Document Image Digitisation with Interactive Description Capability) project, aims to develop new tools for analysing historical documents and to manage these documents. The project also developed tools that classify historical documents during digitization on the basis of their form and content features (Project DigiDoc, ).

Fields of Application and Publications

The digital humanities offer a broad, interdisciplinary field of application, and many institutions and organizations worldwide support and enable work in it. Among them, the Alliance of Digital Humanities Organizations (ADHO) brings all of these organizations together under one roof. The alliance also has the journals Literary and Linguistic Computing and Digital Humanities, published by Oxford University, and Digital Humanities Quarterly. It also supports the Digital Humanities Conference, where scholars and researchers in the field meet every year.
The Digital Humanities Conference, held every year with ADHO as main sponsor, brings new work and researchers together. Related studies in which cultural heritage is interpreted with computer programs are also published in journals such as Digital Studies, Journal of Cultural Heritage, Journal of Cultural Heritage Management and Sustainable Development, Digital Scholarship in the Humanities, ACM Journal on Computing and Cultural Heritage, International Journal of Human-Computer Studies, International Journal of Humanities and Arts Computing, Evaluation, International Journal of Heritage Studies, Social Science Computer Review, Computer and Humanities, and Historical Social Research / Historische Sozialforschung. The journal Frontiers in Digital Humanities began publication in . In addition, journals such as Archaeometry and Journal of Digital Applications in Archeology and Cultural Heritage publish current work in digital archaeology.

Many universities around the world offer digital humanities education. Among them, University College London stands out for its pioneering work. Its research centre, which brings together people from different disciplines, runs many international projects. The centre offers master's and doctoral education and also gives short courses on Internet technologies, digital resources in the humanities, and XML (UCL Center for Digital Humanities, ). In a field where graduate study is more common, McGill University offers a doctoral programme with a large intake. Several universities, chiefly the University of Victoria in Canada and Oxford University, hold summer schools every year covering the core topics of the field.

See
http://www.dhsi.org, http://www.dhoxss.net

Newspapers and journals printed in Arabic and Latin script, Ottoman documents, manuscripts, maps, postcards.

Some universities teach under the name of the field itself, while others use names such as history and new media or technology in the humanities. Many universities also have digital humanities laboratories. Countries such as the Netherlands, France, Germany, Denmark, Sweden, Italy, and Austria have more than digital humanities centres. The world's leading centres, however, are in America (such as the Massachusetts Institute of Technology HyperStudio and the Harvard University Digital Arts and Humanities (DART)). America has more than centres conducting digital humanities work (Holm, Jarrick and Scott, ). There are also websites that discuss sample projects, course content, and current questions and topics (The CUNY Digital Humanities Resources Guide, etc.).

The Situation in Turkey

Digitization, the first step of digital humanities work, began in Turkey in the 's with the TÜYATOK (Türkiye Yazmaları Toplu Kataloğu) project, which aimed to digitize rare works and manuscripts (Yılmaz, ). Over the years, the rare collections of the National Library, the Süleymaniye Library, and the İBB Atatürk Library, as well as documents in the Ottoman archives, have been digitized. The National Library holds about seven million digital and printed items. The digital collection is open to remote access by users (Kültür ve Turizm Bakanlığı bütçe sunumu, ). Beyond these, projects to digitize manuscript libraries were carried out under the Ministry of Culture and Tourism at the level of institutions, ministries, and regions (Yılmaz, ). According to Aydınonat and Özlük, these digitization efforts at different institutions are far from following a common standard ( , p. ).
The Presidency of the Manuscripts Institution (Yazma Eserler Kurumu Başkanlığı) has manuscript libraries attached to it. Although a project for the preservation of manuscripts is on the agenda (T.C. Kültür ve Turizm Bakanlığı yılı bütçe sunumu, ), there is no detailed information on what has been done in terms of digitization. However, an e-book portal has been set up on the institution's website, and digital images of rare books, some Ottoman works among them, have been opened to access (Türkiye Yazma Eserler Kurumu, ).

Rare works in the collections of various universities have likewise been digitized and opened to access on the web. In the Marmara University rare works collection, a total of . . pages, comprising manuscripts, works printed in the old script, rare works in Latin script, and volumes of periodicals, can be accessed digitally. Around . Ottoman works in the Seyfettin Özege collection at Atatürk University have been digitized and offered to users in digital form. In addition, from the collection of the Türkiye Diyanet Vakfı İslâm Araştırmaları Merkezi (İSAM), the records and full texts of more than . Ottoman articles; around . Ottoman treatises on history, literature, and the religious sciences; and works comprising Ottoman state and provincial yearbooks (salnâme) and nevsâl have been opened to access in PDF format (Küpdilli Yılmaz, ). Koç University's Vehbi Koç Ankara Araştırmaları Uygulama ve Araştırma Merkezi (VEKAM), which holds a very large collection on Ankara, has digitized more than . sources from its collection and made them available to researchers (VEKAM, n.d.). Some cultural memory institutions have opened these digitized collections to access by including them in international projects.
Turkey joined the Europeana project, which aims to provide a single point of access to European cultural heritage, through AccessIT, another European Union-funded project (Ünal and Yılmaz, ). In the project, coordinated by Hacettepe University, work was done to provide broad and democratic access to Turkish works of culture and art. Infrastructure work was also carried out to transfer these cultural and artistic works and products to the European digital library (Europeana) (accessit.hacettepe.edu.tr, ). LoCloud, another European Union-funded project, aims to enhance Europeana content using cloud computing and to make the content and metadata of collections held by small and medium-sized cultural memory institutions accessible through Europeana. Turkey is one of the partners in this project (locloud.eu., ). Another EU-funded project in which Turkey is a partner is RICHES. The project, which aims to bring European cultural heritage together and raise awareness, is still ongoing (riches-project.eu, ).

Despite the increase in digitization activity in recent years, no concrete steps have been taken at the level of a common policy. Aygün ( ) compared work on cultural heritage around the world with the arrangements in Turkey and stressed that leaving cultural heritage practice in Turkey solely to the structure of the bureaucracy should now be questioned. Beyond the digitization efforts described above, there are also projects in Turkey that apply computer technologies to cultural heritage in a contemporary sense and can be counted as digital humanities work.
For example, work at Çatalhöyük that began with digitizing the records and outputs of the excavations continued with a simulated walkthrough of the excavation site, video presentations, and the recreation and storytelling of the site in virtual environments such as Second Life (Tringham, ). The finds from the Kaman-Kalehöyük excavations, where remains from the Middle Ages back to the Early Bronze Age have been unearthed, are exhibited in the museum built at the site (Kalehöyük Arkeoloji Müzesi, ). The Museum of Anatolian Civilizations, which ranks among the world's leading museums for its distinctive collection, exhibits Anatolian archaeology from the Palaeolithic Age to the present and can be toured virtually (T.C. Kültür ve Turizm Bakanlığı Anadolu Medeniyetleri Müzesi, ).

There are also notable projects in literature and history that can be counted as digital humanities work. In the project on the visual and textual analysis of and access to Ottoman texts (Ottoman Text Archive Project - OTAP), Prof. Dr. Fazlı Can of Bilkent University and his team aimed to enable the automatic analysis and retrieval of Ottoman texts. In one part of the study, the social network of the work was built from the personal names that occur in the Bitlis section of Evliya Çelebi's Seyahatname, which is inscribed on the UNESCO Memory of the World Register (Şahin, Can and Kalpaklı, ). The Baki Divanı project is likewise being carried out at Bilkent University under Prof. Dr. Mehmet Kalpaklı. The project envisages using computer technologies on Ottoman texts.

Recent work shows Turkey taking part as a partner at the international level. The Ottoman Inscriptions project, supported by the University of Tokyo and the Turkish Historical Society (Türk Tarih Kurumu), created a digital database of the inscriptions on Ottoman works. At present the database holds inscription records. The inscriptions are given with their images and their transliteration.
Information about each inscription and the structure on which it is found is also given on that inscription's card. The study was prompted by the fact that these works, written in Turkish, Arabic, and Persian in the Ottoman period, face the danger of disappearing, and it offers users images of the inscriptions of Istanbul, Bursa, and Edirne. The website also has an index and a search engine that lets users search among the inscriptions. Google Maps has also been integrated into the site to show the location of the inscriptions on a map (ottomanmanuscriptions.com, ). The project matters because it is a comprehensive piece of work built with a digital humanities approach in the contemporary sense.

Because universities in Turkey have no digital humanities departments or centres set up for this purpose, individual efforts remain idle and reach neither their aim nor their target audience. Carried out without state or grant support, these efforts run into the sustainability problem, which is cited as the most important problem of digital humanities work.

The Function of the Digital Humanities in Libraries

The opportunities offered by the web and digitization have brought change in every field. Developments such as the spread of search engines in information retrieval and the integration of library services into academic publishing platforms have pushed libraries to innovate in how they process information (Russell, ; Eberhart, ). In this context, academic libraries in particular have in recent years been drawing on digital humanities work in their services and systems. In , a meeting on digital practices and problems in libraries was held by THATCamp together with the Digital Library Federation Forum.
The Association of College and Research Libraries (ACRL), a division of the American Library Association (ALA), has a mailing list for discussion of digital humanities work. The association also has a blog covering developments, resources, case studies, and tools in the field (http://acrl.ala.org/dh).

The digital humanities reflect the spirit of libraries. Indeed, the basic functions and processes of libraries coincide with the aims of digital humanities practice. Practices such as organizing information, data management, digitization and enhancement, digital preservation, the use of technology in communication and dissemination, and producing useful tools for scholarly research are shared (Showers, ; Ramsay, ). Digital humanities practice uses computer technologies to bring the content of cultural assets to much wider audiences in a more explanatory way. The basic function of libraries, in turn, is to organize information and give users access to it. In both fields the basic aim is to democratize access to information.

On the other hand, the collections of cultural memory institutions are the primary sources of the digital humanities. The rare works and sources in libraries, museums, and archives form the field's basic working data. Because the sources being worked on are held in these institutions, work done with a digital humanities approach also affects and concerns cultural memory institutions. At present the field affects many functions, especially for academic libraries, from management and relations with other units (institutes, faculties, departments) to the training of librarians. Another important function of libraries is reliability in the provision of sources.
Before the digital world, transferring a source from one medium to another was expected to give only an approximate result, whereas today a work can be transferred to digital form in its original state. Alongside preserving the current state of the work, many enhancing and sustainable practices can also be applied (Dietrich and Sanders, ).

Digitization, digital editions, and the creation of digital archives are foremost among the work that the digital humanities and libraries do together (Vandegrift and Varner, , p. ). Work in this area takes advantage of digitized works and does things that cannot be done with the physical originals. Creating digital archives and analytical and annotation tools over the works lets academics and researchers add content directly to digital files, create copies for long-term storage, produce versions at different resolutions, extract specific information, and combine the files with other digital versions and cultural assets without compromising the original (Varner and Hswe, ). The spread of digitization and of the digital humanities practices associated with it also brings significant challenges concerning source material, metadata, and access (Poremski, ; Keener, ; Sula, ).

Digitization has also given rise to the concept of digital preservation. A digitized cultural asset may derive from a single source or from several sources held in different cultural memory institutions. Preserving all these versions and ensuring sustainability by establishing provenance should be among the functions of librarianship. The use of a needless digital medium stripped of context and meaning also makes the role of libraries vital at this point.
Here libraries should use practices that strengthen metadata and the principle of provenance to prevent the copies in circulation from being used continually and incorrectly (Dietrich and Sanders, ).

Practice in the digital humanities has both supported the basic functions of librarianship and widened the scope of these services. Librarians enable users to find new methods, take new approaches, and even create applications in their fields by drawing on technology (Poole and Garwood, ). For this reason, with this new field, integrated into library services under the name of digital librarianship, libraries aim to support users (students, faculty staff, researchers) in exploring, with the help of computer technologies, the possibilities beyond digitization (Sula, , p. ). In this context, digital humanities laboratories have been set up in many libraries around the world, academic libraries in particular. The librarians who work there, who have a command of computer technologies, help users find the technologies best suited to their collection and content needs (Poremski, , p. ). Support is also given within the library to computer-based projects that users want to create. The same unit supports the digitization and further work that libraries want to carry out on their own collections.

Integrating with the digital environment and its developments also affects the roles of cultural memory institutions. The digital humanities offer librarians a paradigm that is easy to apply and have a structure that can easily be adapted to the existing services of librarianship. Librarians should not remain spectators to the inclusive and broad vision of the digital humanities and should develop themselves in this direction.
Evaluation

Using computer technologies to uncover the hidden value of cultural assets and to increase international access to them has become a priority policy for states around the world. States now work to benefit as much as possible from the existing potential (economic, cultural, social) of the cultural assets within their borders. Here, work in the digital humanities enriches the content of digitized cultural heritage and develops applications aimed at a better understanding of the past. These efforts, which integrate computer technologies into the humanities, aim both to strengthen citizens' sense of belonging through a better understanding of the cultural heritage within their own borders and to stimulate the economy by enabling people beyond those borders to engage with that heritage. Because globalization has moved the world onto a level platform, cultural heritage is in fact the shared memory of all humanity. Digital humanities thus emerges as a field that allows cultural heritage to be understood more deeply by wider audiences. This work strengthens our ties to the past without barriers of language, borders, or time, and lets us understand our past, humanity, and the future more inclusively. The existence of this new approach and field has also revitalized cultural memory institutions. Digital humanities, whose primary data is cultural heritage, has created new needs and new solutions in the services of cultural memory institutions. Digital humanities services are being opened in academic libraries in particular.
This new field has also produced the concept of the digital librarian, and libraries have begun seeking digital librarians able to fuse computer technologies and the humanities. Despite all these developments worldwide, Turkey cannot yet be said to have taken strategic steps in this area. Although there have been a few isolated efforts, there is unfortunately no policy to speak of for education and practice in the field. For Turkey to benefit effectively from its rich cultural heritage in economic, social, and cultural terms, its technological development, the budget allocated to applying that technology to cultural heritage, and the policies that shape this budget must be brought into line with the demands of the age. The Ministry of Culture and Tourism, which is responsible for managing cultural heritage, should approach the issue from a digital humanities perspective and produce policies accordingly. In this framework, the Ministry should take into account the policies and practices developed worldwide for sustainable cultural heritage management and move toward a more participatory policy and strategy that includes all relevant institutions. In addition, in cooperation between the Ministry and the Council of Higher Education (YÖK), an interdisciplinary laboratory or a separate department should be established at universities to help the relevant departments teach and apply the subject. In departments of information and records management, new courses should be opened in light of applications in the field, and the field's core methodology should be imparted to the information professionals of the future.

Acknowledgments

I sincerely thank Müge Akbulut, who read my work and contributed her ideas.

References

accessit.hacettepe.edu.tr. ( , November). Accessit AB projesi hakkında. Retrieved from http://www.accessit.hacettepe.edu.tr/index.php?kid= &s=accessit% ab% projesi% hakk%c %b
Akça, S. ( ).
Dijital insani bilimler yaklaşımıyla kültür varlıklarının görünürlüğünün ve kullanımının artırılması: Türkiye için kavramsal bir model önerisi (Doctoral dissertation). Retrieved from http://www.bby.hacettepe.edu.tr/yayinlar/dosyalar/akça.pdf
American Council of Learned Societies (ACLS). ( ). Our cultural commonwealth: The final report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences. ACLS: New York. Retrieved from www.acls.org/cyberinfrastructure/ourculturalcommonwealth.pdf
Aydınonat, B. and Özlük, H. K. ( ). INDICATE projesi: “Uluslararası dijital kültürel miras altyapı ağı”. In . Halk Kütüphaneciliği Sempozyumu: Değişen dünyada halk kütüphaneleri - Mayıs , Bodrum: Bildiriler, posterler ve çalıştay raporları (pp. - ). Ankara: Kültür ve Turizm Bakanlığı.
Aygün, H. M. ( ). Kültürel mirası korumada katılımcılık. Vakıflar Dergisi, , - . Retrieved from http://acikerisim.fsm.edu.tr: /xmlui/bitstream/handle/ / /aygün.pdf?sequence= &isallowed=y
Berry, D. M. ( ). The computational turn: Thinking about the digital humanities. Culture Machine, , - . Retrieved from http://www.culturemachine.net/index.php/cm/article/view/ / 
bichitra.jdvu.ac. ( ). About Bichitra. Retrieved from http://bichitra.jdvu.ac.in/about_bichitra_project.php
Buckland, M. ( ). The electronic cultural atlas initiative. Proceedings of the American Society for Information Science and Technology, ( ), - .
Busa, R. ( ). The annals of humanities computing: The Index Thomisticus. Computers and the Humanities, ( ), - .
Busa, R. A. ( ). Foreword: Perspectives on the digital humanities. In Susan Schreibman, Ray Siemens and John Unsworth (Eds.), A companion to digital humanities (pp. - ). Blackwell Publishing. Retrieved from http://www.digitalhumanities.org/companion/
Dalbello, M. ( ). A genealogy of digital humanities. Journal of Documentation, ( ), - . doi: https://doi.org/ . / 
DH+Lib. (n.d.). Retrieved from https://acrl.ala.org/dh/
Dempsey, J., Lindsay, C., Hargreaves, D., Fontenoy, L., Peacock, D. and Bell, D. ( ). Pudding Lane: Recreating seventeenth-century London. Journal of Digital Humanities, ( ), - .
Dietrich, C. and Sanders, A. ( , June ). On the word, digital [Web log post]. Retrieved from http://acrl.ala.org/dh/ / / /on-the-word-digital/
Digital Archaeological Atlas of the Holy Land. ( ). Retrieved from https://daahl.ucsd.edu/daahl/
Eberhart, G. M. ( , March ). How librarians and faculty use digital humanities [Web log post]. Retrieved from https://americanlibrariesmagazine.org/ / / /how-librarians-and-faculty-use-digital-humanities/
Estelles-Arolas, E., Navarro-Giner, R. and Gonzalez-Ladron-de-Guevara, F. ( ). Crowdsourcing fundamentals: Definition and typology. In F. J. Garrigos-Simon, I. Gil-Pechuán and S. Estelles-Miguel (Eds.), Advances in crowdsourcing (pp. - ). Spain: Springer.
From the Page. ( ). Arabic scientific manuscripts of the British Library. Retrieved from https://fromthepage.com/bldigital/arabic-scientific-manuscripts
Gibbs, F. and Owens, T. ( ). Building better digital humanities tools: Toward broader audiences and user-centered designs. Digital Humanities Quarterly, ( ). Retrieved from http://www.digitalhumanities.org/dhq/vol/ / / / .html
Gold, M. (Ed.). ( ). Debates in the digital humanities. Minneapolis: University of Minnesota Press. Retrieved from http://dhdebates.gc.cuny.edu/about
Hayles, N. K. ( ). How we think: Transforming power and digital technologies. In D. M. Berry (Ed.), Understanding the digital humanities (pp. - ). London: Palgrave.
Hockey, S. ( ). The history of humanities computing. In S. Schreibman, R. Siemens and J. Unsworth (Eds.), A companion to digital humanities (pp. - ). USA: Blackwell Publishing.
Holm, P., Jarrick, A. and Scott, D. ( ). The digital humanities. In Humanities world report (pp. - ). UK: Palgrave Macmillan.
Howe, J. ( ). The rise of crowdsourcing. Wired Magazine, ( ), - . Retrieved from http://sistemas-humano-computacionais.wdfiles.com/local--files/capitulo% aredes-sociais/howe_the_rise_of_crowdsourcing.pdf
Jessop, M. ( ). Computing or humanities? Ubiquity, - . Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi= . . . . &rep=rep &type=pdf
Jones, S. E. ( ). Roberto Busa, S.J., and the emergence of humanities computing: The priest and the punched cards. Routledge.
Kaman-Kalehöyük Arkeoloji Müzesi. ( ). Müze hakkında. Retrieved from http://kalehoyukarkeolojimuzesi.gov.tr/tr/index.php/mueze-hakk-nda/oeren-yerleri/kalehoeyuek-oeren-yeri
Kaplan, F. ( ). Frederic Kaplan: How to build an information time machine [Video file]. Retrieved from https://www.ted.com/talks/frederic_kaplan_how_i_built_an_information_time_machine/transcript?language=en
Keener, A. ( ). The arrival fallacy: Collaborative research relationships in the digital humanities. Digital Humanities Quarterly, ( ). Retrieved from http://digitalhumanities.org: /dhq/vol/ / / / .html
Kramer, M. J. ( ). What does digital humanities bring to the table? [Web log post]. Retrieved from http://www.michaeljkramer.net/what-does-digital-humanities-bring-to-the-table/
Küpdilli Yılmaz, E. ( , February). [Facebook status update]. Retrieved from https://www.facebook.com/ /photos/a. . . / /?type= &theater
Lazer, D., Pentland, A. S., Adamic, L., Aral, S., Barabasi, A. L., Brewer, D., ... and Jebara, T. ( ). Life in the network: The coming age of computational social science. Science, ( ), - . doi: . /science. 
locloud.eu. ( ). About. Retrieved from http://www.locloud.eu/about
McCarty, W. ( ). What is humanities computing? Toward a definition of the field. Retrieved from http://www.dighum.kcl.ac.uk/legacy/teaching/dtrt/class /mccarty_humanities_computing.pdf
McCarty, W. ( ). Attending from and to the machine. Inaugural lecture. Center for Computing in the Humanities, King's College London. Retrieved from http://www.mccarty.org.uk
McPherson, T. ( ). Introduction: Media studies and the digital humanities. Cinema Journal, ( ), - . Retrieved from http://muse.jhu.edu/login?auth= &type=summary&url=/journals/cinema_journal/v / . .mcpherson.pdf
Mostern, R. ( ). The electronic cultural atlas initiative. Historical Geography, , - . Retrieved from https://ejournals.unm.edu/index.php/historicalgeography/article/view/ / 
Nicolas, S., Paquet, T. and Heutte, L. ( , November). Digitizing cultural heritage manuscripts: The Bovary project. In Proceedings of the ACM Symposium on Document Engineering (pp. - ). ACM. Retrieved from http://madonne.univ-lr.fr/publications/nicolas a.pdf
ottomanmanuscriptions.com. ( ). Retrieved from http://info.ottomaninscriptions.com/usingdb/
Poole, A. H. and Garwood, D. A. ( ). Natural allies: Librarians, archivists, and big data in international digital humanities project work. Journal of Documentation, ( ), - . doi: https://doi.org/ . /jd- - - 
POxy: Oxyrhynchus Online. (n.d.). Retrieved from http://www.papyrology.ox.ac.uk/poxy/
Poremski, M. D. ( ). Evaluating the landscape of digital humanities librarianship. College & Undergraduate Libraries, ( - ), - . doi: . / . . 
Presner, T. and Johanson, C. ( ). The promise of digital humanities: A whitepaper, March , - final version. Retrieved from http://www.itpb.ucla.edu/documents/ /promiseofdigitalhumanities.pdf
Project DigiDoc. ( ). Retrieved from http://digidoc.labri.fr
Ramsay, S. ( , October ). Care of the soul. Conversation held with Emory University. Retrieved from http://stephenramsay.us/text/ / / /care-of-the-soul.html
Ramsay, S. and Rockwell, G. ( ). Developing things: Notes toward an epistemology of building in the digital humanities. In Debates in the digital humanities (pp. - ). Minneapolis: University of Minnesota Press. Retrieved from http://dhdebates.gc.cuny.edu/debates/text/ 
Reichert, R. ( ). Digital humanities. Infotheca, ( ), - . Retrieved from http://infoteka.bg.ac.rs/pdf/eng/ - /eng - infotheca_xv_ _april_ - .pdf
Rieder, B. and Röhle, T. ( ). Digital methods: Five challenges. In D. M. Berry (Ed.), Understanding digital humanities (pp. - ). London: Palgrave.
riches-project.eu. ( ). RICHES: Renewal, innovation and change: Heritage and European society. Retrieved from http://www.riches-project.eu
Russell, I. G. ( ). The role of libraries in digital humanities. Retrieved from http://www.ifla.org/past-wlic/ / -russell-en.pdf
Schnapp, J. and Presner, P. ( ). Digital humanities manifesto . . Retrieved from http://www.humanitiesblast.com/manifesto/manifesto_v .pdf
Showers, B. ( ). Does the library have a role to play in digital humanities? JISC Digital Infrastructure Team. Retrieved from http://infteam.jis-cinvolve.org/wp/ / / /does-the-library-have-a-role-to-play-in-the-digital-humanities
Spiro, L. ( ). Getting started in digital humanities. Journal of Digital Humanities, ( ). Retrieved from http://journalofdigitalhumanities.org/ - /getting-started-in-digital-humanities-by-lisa-spiro/
Sula, C. A. ( ). Digital humanities and libraries: A conceptual model. Journal of Library Administration, ( ), - . doi: . / . . 
Summit on Digital Tools for the Humanities. ( ). A report on the summit on digital tools. University of Virginia. Retrieved from http://www.iath.virginia.edu/dtsummit/summittext.pdf
Svensson, P. ( ). Humanities computing as digital humanities. Digital Humanities Quarterly, ( ). Retrieved from http://www.digitalhumanities.org/dhq/vol/ / / / .html
Svensson, P. ( ). The landscape of digital humanities. Digital Humanities Quarterly, ( ). Retrieved from http://digitalhumanities.org: /dhq/vol/ / / / .html
Şahin, P. D., Can, F. and Kalpaklı, M. ( ). Osmanlı metinlerinin görsel ve yazınsal analizi ve erişimi (TÜBİTAK Proje No: E ). Ankara: Bilkent Üniversitesi.
T.C. Kültür ve Turizm Bakanlığı yılı bütçe sunumu. ( ). Retrieved from http://sgb.kulturturizm.gov.tr/eklenti/ ,butcesunumkitapcigi pdf.pdf?
T.C. Kültür ve Turizm Bakanlığı Anadolu Medeniyetleri Müzesi. ( ). Müzenin tarihçesi. Retrieved from http://www.anadolumedeniyetlerimuzesi.gov.tr/tr, /muzenin-tarihcesi.html
The Drukpa Kagyu Heritage Project. ( ). Retrieved from http://www.pktc.org/dkhp/
Tringham, R. ( ). The public face of archaeology at Çatalhöyük. In R. Tringham and M. Stevanovic (Eds.), House lives: Building, inhabiting, excavating a house at Çatalhöyük, Turkey. Reports from the BACH area, Çatalhöyük, - (pp. - ). Los Angeles: Cotsen Institute of Archaeology Publications. Retrieved from http://diva.berkeley.edu/projects/bach/bach_volume/houselivespreprint/bach_ch _ret_publicface_small.pdf
Türkiye Yazma Eserler Kurumu Başkanlığı E-Kitap Portali. ( ). Retrieved from http://www.ekitap.yek.gov.tr
UCL Center for Digital Humanities. ( ). Studying digital humanities at UCL. Retrieved from http://www.ucl.ac.uk/dh/courses
Uçer, S. ( ). Kurumsal inovasyon ile dijital dünyaya adaptasyon. Harvard Business Review. . Retrieved from https://hbrturkiye.com/blog/kurumsal-inovasyon-ile-dijital-dunyaya-adaptasyon
Ünal, Y. and Yılmaz, B. ( ). Europeana (Avrupa Dijital Kütüphanesi) ve halk kütüphaneleri. In . Halk Kütüphaneciliği Sempozyumu: Değişen dünyada halk kütüphaneleri - Mayıs , Bodrum: Bildiriler, posterler ve çalıştay raporları (pp. - ). Ankara: Kültür ve Turizm Bakanlığı.
VandeCreek, D. ( ). Web of significance: The Abraham Lincoln Historical Digitization Project, new technology, and the democratization of history. Digital Humanities Quarterly, ( ). Retrieved from http://www.digitalhumanities.org/dhq/vol/ / / / .html
Vandegrift, M. and Varner, S. ( ). Evolving in common: Creating mutually supportive relationships between libraries and the digital humanities. Journal of Library Administration, ( ), - .
Varner, S. and Hswe, P. ( , January ). Special report: Digital humanities in libraries [Web log post]. Retrieved from https://americanlibrariesmagazine.org/ / / /special-report-digital-humanities-libraries/
VEKAM. (n.d.). Koleksiyon. Retrieved from https://vekam.ku.edu.tr/tr/content/koleksiyon- 
Xu, L. and Wang, X. ( ). Semantic description of cultural digital images: Using a hierarchical model and controlled vocabulary. D-Lib Magazine, ( / ). Retrieved from http://www.dlib.org/dlib/may /xu/ xu.html
Yılmaz, B. ( ). Türkiye'de dijital kütüphanecilikle ilgili bir standart ya da politika bulunmuyor. Bilişim Dergisi, ( ), - . Retrieved from http://www.bilisimdergisi.org/s / 
Academic Journals in the Digital Age: An Editor’s Perspective

Helen Rogers

Citation: Rogers, H. ( ) Academic Journals in the Digital Age: An Editor's Perspective. Journal of Victorian Culture, ( ). pp. - . ISSN - . LJMU Research Online: http://researchonline.ljmu.ac.uk/id/eprint/ / 

The place of academic journals in the scholarly eco-system has been radically challenged since I became an editor of the Journal of Victorian Culture in . It has been an exciting time to be involved in curating an interdisciplinary periodical and experimenting in ways the journal format can adapt to the changing landscape of online publication, networking and communication. Yet though we now have a host of new resources and tools at our fingertips, the content of scholarly journals and articles, as James Mussell explains in this roundtable, remains remarkably similar to their forebears in the pre-digital, pre-social media age. As I step down as editor, here I reflect on what we could do differently and which features of traditional publication we might wish to retain.

My initial thinking about the journal format was prompted by changes in my reading practices and those of my students as, increasingly, we accessed single articles online rather than in printed volumes stacked on library shelves. How could JVC replicate the experience of dipping into an issue and browsing its back catalogue, especially when publishers’ platforms are not easy to navigate? Journal archives on these sites are still difficult to search using keywords and I frequently struggle to identify content in JVC’s previous issues relating to a particular author or theme.

Note: Dan Cohen, ‘The Ivory Tower and the Open Web: Introduction: Burritos, Browsers, and Books [Draft]’, Dan Cohen ( July ) <http://www.dancohen.org/ / / /the-ivory-tower-and-the-open-web-introduction-burritos-browsers-and-books-draft/> [accessed May ].

Note: For the pedagogical implications of this shift, see George Gosling, ‘Why Academic Journals Still Matter’, Musings ( October ) <https://gcgosling.wordpress.com/ / / /why-academic-journals-still-matter/> and Katrina Navikas, ‘Does the Form of Traditional Academic Journals Mean Anything to Students in the Age of Online Access?’, History and Today ( October ) [both accessed November ].

When I became an editor, I envied the visual appeal of the open access e-journal : Interdisciplinary Studies in the Long Nineteenth Century, founded in . Its content pages, with thumbnail images inviting clickbait by conveying, at a glance, each article’s themes, are far more attractive than online platforms for traditional journals. In setting up the Journal of Victorian Culture Online in , I hoped we could recreate the magazine feel for JVC, tempting readers to view individual articles and browse across issues while providing opportunities for interaction and discussion. However, JVC Online, with its Facebook and Twitter feeds, soon acquired its own identity in bringing together an online community of Victorianists, under the dynamic editorship of Lucinda Matthews-Jones. Allowing real-time engagement with contemporary treatments of nineteenth-century culture (exhibitions, dramatizations, the release of digital materials, and so on), it has become more than a supplement. ‘Victorians Beyond the Academy’, begun as an occasional feature in the print journal in , has effectively migrated to JVC Online where timely coverage and interactive links have welcomed a diverse readership as well as critical reflection on the public engagement agenda.
Note: Launched and run by the Centre for Victorian Studies, Birkbeck College, University of London <http://www. .bbk.ac.uk/>, the e-journal is now hosted by the mega journal platform, Open Library of Humanities <https://www.openlibhums.org/>. JVC Online: http://blogs.tandf.co.uk/jvc/; Facebook: https://www.facebook.com/jvconline/; Twitter: https://twitter.com/jofvictculture

Through our social media streams we have been able to interact, and share both blog and journal content, with independent scholars, creative writers, popular historians, curators and librarians who rarely have time or inclination to participate in academic conferences and publications. In an online survey of JVC Online’s users in , nearly % of respondents defined themselves as independent scholars, a much higher proportion than found at academic gatherings. Similarly, JVC Online has brought into conversation a diverse cross-disciplinary community, from all age groups. Though we expected postgraduates and early career researchers to feature heavily in our survey, in fact the proportion aged - ( . %) almost equaled that between - ( . %), and the largest group was aged - ( %). While literary scholars predominate in the field’s publications and conferences, they formed just over half our respondents ( %). Using the #twitterstorians hashtag has raised our profile among historians, now reflected in our contributors, readers and altmetrics; in JVC was ranked in Thomson Reuters’ list of history journals. Our activity on social media has also nurtured the growth of blogs in our field, which, few and far between in , are now a vital part of research culture. By , % of our survey respondents had blogged on their own site or on a collaborative blog. As Lucinda Matthews-Jones discusses in this roundtable, blogging can provide postgraduates and emerging scholars with an introduction to writing for a broad, public audience rather than exclusively for specialists in their field. However, to realize the potential of blogging for scholars at every career stage, we might think more inventively about how interactive media can help us reshape traditional scholarship, including the journal article, which has changed remarkably little in appearance.

Note: Helen Rogers and Lucinda Matthews-Jones, Journal of Victorian Culture Survey [accessed November ].

Note: For examples of creative approaches to scholarly blogging, see Helen Rogers, ‘Blogging Our Criminal Past: Public History, Social Media and Creative History’, Law, Crime and History . ( ): - , http://www.pbs.plymouth.ac.uk/solon/journal/vol. % issue % /rogers.pdf

The stasis of article publication is highlighted in the apparent reluctance of authors to engage explicitly with digital scholarship. In we aimed to kick-start regular discussion of the digital in Victorian studies with a special issue on ‘Searching Questions’. Under James Mussell’s editorship, the Digital Forum has become one of the most significant arenas for reviewing digital resources and approaches outside dedicated digital humanities publications. While the Forum has nurtured digital conversations in our field, also evident in well-attended conference panels, it is striking how few of our article submissions foreground active use of digital materials and practices. ‘Searching Questions’ was preceded by a call for articles that were fundamentally concerned with digital concepts and methodologies or where the research had been ‘born digital’. We looked forward to working with authors and our publisher to accommodate the interactive features of such scholarship within the journal, online and in print. But we received just one submission: Matthew Rubery’s insightful article on the history of audio books. Subsequent articles that are immersed in the digital can be counted on one hand. While it is often claimed that editors exert a conservative hold over academic scholarship, our experience at JVC points to a wider diffidence in the field in confronting the digital. At the very least, we should all highlight rather than disguise our use of online resources (including digitized books), by providing digital citations and links, and by acknowledging the search process (including keywords) and its limitations when discussing methodology. Editors may have to be more pro-active in encouraging experimentation if we are to radically re-imagine the digital article. For this reason, we plan to launch an annual competition to promote digital resources and essays, individual and collaborative blogs, as dynamic elements of contemporary research culture.

Note: Helen Rogers, Editorial, ‘Searching Questions: Digital Research and Victorian Culture’, Journal of Victorian Culture, . ( ), pp. - <http://www.tandfonline.com/doi/abs/ . /e >.

Note: Matthew Rubery, ‘Play It Again, Sam Weller: New Digital Audiobooks and Old Ways of Reading’, Journal of Victorian Culture, . ( ), pp. - .

Note: Bob Nicholson, ‘“You Kick the Bucket; We Do the Rest!”: Jokes and the Culture of Reprinting in the Transatlantic Press’, Journal of Victorian Culture, . ( ), pp. - ; Andrew Hobbs, ‘The Deleterious Dominance of The Times in Nineteenth-Century Scholarship’, Journal of Victorian Culture, . ( ), pp. - <http://www.tandfonline.com/doi/full/ . / . . >; Kelly J. Mays, ‘How the Victorians Un-Invented Themselves: Architecture, the Battle of the Styles, and the History of the Term Victorian’, Journal of Victorian Culture, . ( ), pp. - <http://www.tandfonline.com/doi/full/ . / . . >; Sally M. Foster, Alice Blackwell, Martin Goldberg, ‘The Legacy of Nineteenth-Century Replicas for Object Cultural Biographies: Lessons in Duplication from s Fife’, Journal of Victorian Culture, . ( ), pp. - ; Christopher Donaldson, Ian N. Gregory, Patricia Murrieta-Flores, ‘Mapping “Wordsworthshire”: A GIS Study of Literary Tourism in Victorian Lakeland’, Journal of Victorian Culture, . ( ), pp. - .

To embrace fully the interactive potential of hypertext and the new media, however, we may have to give up the printed journal, while continuing to give readers the option to download and print on demand. Sadly, this would deprive the falling number of individual subscribers of the pleasures, attested by some readers, of receiving and handling a print issue, but it would have numerous advantages. Fixed page budgets mean most academic journals operate tight word limits. Few accept articles over , words while others allow as few as , . Word limits encourage authors to be concise but can prohibit essays drawing on extensive archival research or interweaving several scholarly debates; precisely the reasons we consider longer article submissions at JVC. Released from the tyranny of print, however, articles could range in length from short, pithy interventions to heavily documented research essays, of the kind E.P. Thompson would now struggle to place. As more museums, galleries and libraries open their digital content, authors could display images in addition to embedding links; engage more closely with visual and material culture; and use illustrations to reinforce analysis while making the reading experience more stimulating and pleasurable. Authors could experiment with different lines of enquiry and modes of argument, offering readers alternative routes through their essay rather than always following a linear direction to a single point of conclusion.

The advantages of online publication have been championed by the open access movement, which has mounted a trenchant, though by no means unified, critique of academic journals and publishers.
in some quarters this has been coupled with calls for traditional (usually blind) peer review mechanisms to be replaced by open peer review. in this model, authors publish scholarship on an online platform where essays are open for comment and evaluation by self-selecting reviewers. the open comments system means the review process is transparent and the once hidden labour of anonymous reviewers is recorded and credited. subsequent readers can trace how an article has evolved through each re-draft and assess the author’s responses to readers’ recommendations. once peer review is crowd-sourced by the online community, editors would no longer intervene significantly in the writing process but instead select and ‘badge’ articles for their journal, which could operate without the costs and overheads of traditional publishing. but would this not make

see for example, william g. thomas, iii, ‘writing a digital history journal article from scratch: an account’, available dh project ( ) [accessed november ]. the essay recounts the experiment of creating a digital article based on the valley of the shadow project as an interactive / page article. see william g. thomas, iii, and edward l. ayers, ‘an overview: the differences slavery made: a close analysis of two american communities’, american historical review, ( ), pp. - and the differences slavery made: a close analysis of two american communities ( )
jo guldi, ‘reinventing the academic journal’, hacking the academy: new approaches to scholarship and teaching from digital humanities, ed. by daniel j.
cohen and tom scheinfeldt (michigan: michigan university press, ); tim hitchcock and jason m. kelly, ‘reinventing the academic journal: the “digital turn”, open access, & peer review’, history workshop online ( april ) [accessed may ]. hitchcock and kelly launched an open scholarship project (described in ‘reinventing the academic journal’), but few articles received comments and the site is no longer available.
katherine rowe and kathleen fitzpatrick, ‘keywords for open peer review’, logos: the journal of the world book community, . - ( ): - .
http://digitalhistory.unl.edu/essays/thomasessay.php
http://www.vcdh.virginia.edu/ahr

editors more akin to research assessors and journals little more than a ranking platform? there are a few notable examples of crowd-sourced, pre-publication peer-reviewed journal issues, edited collections and monographs, but attempts to generalize the model, at least in the humanities, have not yet proved successful. as two of its pioneers, katherine rowe and kathleen fitzpatrick, acknowledge, open peer review depends on a ‘community of trust’ that takes time and care to build. it is also notable that online journals with comment facilities receive very few comments from readers. similarly, as lucinda matthews-jones points out in this roundtable, time-pressed readers are much more likely to interact with blogs and articles with ‘thumbs-up tweets’ or in ephemeral exchanges on their own facebook timeline, rather than in sustained discussion in comment sections. but if the academy is not yet geared up for open peer review, there are also positive aspects of traditional editorial practice and peer review that we might be well-advised to retain.
http://www.digitalculture.org/books/hacking-the-academy-new-approaches-to-scholarship-and-teaching-from-digital-humanities/
http://www.historyworkshop.org.uk/reinventing-the-academic-journal-the-digital-turn-open-access-peer-review/

reviewers tend to report on specialist aspects of essays under consideration. part of the editor’s role, however, is to ensure articles not only work as stand-alone pieces but also speak to the journal’s wider readership. jvc’s editors work hard to encourage authors to show how their research can interest a broad and interdisciplinary readership in order to maximize its influence. this involves helping authors highlight their central claims, flag their argument and extend its significance beyond their specialist concerns. it also means encouraging them to write clearly and economically for an audience including students as well as experts in their field. almost all of us benefit from this kind of editorial intervention – i certainly do – but it is labour that is probably done most constructively ‘behind the scenes’. while the merits of traditional peer review may outweigh the largely unproven claims of open review, jvc has welcomed the move towards open scholarship. however, the mixed economy of open access that currently operates – at least in the uk – has important consequences for the status and reach of our contributors’ research. mandates by hefce and the research councils now ensure that ‘publicly funded’ scholarship is made either ‘gold’ open access (instant oa paid for by article processing charges) or ‘green’ open access (either made oa on publication with no charge; or the pre-print version, made oa through a repository of some kind, usually after an months embargo).
this means readers can have very different levels of access to articles in a single issue. rcuk awards cover article processing charges for its funded researchers, though only a tiny proportion of our authors receive such funding. articles supported by these grants have been among our most downloaded essays and consequently are likely to be more cited. in view of the increasing use of altmetrics by funders, employers and recruitment panels, there is a danger we create a virtuous circle where funded open access leads to more citations which leads to further funding and career enhancement. while we encourage our publisher to make other articles open access for short periods of time, we need the help of our authors and readers to maximize the circulation of all scholarship in our field and ensure continuing conversations. through social media we currently promote articles when they are published on our journal platform, when the print issue is released, and again when authors post a blog about their article at jvc online. but as lucinda matthews-jones points out, authors could give their articles another lease of life when they come out of the embargo period. in addition, our authors and readers could write blogs to coincide with upcoming anniversaries and events when relevant articles from our archives could be made open access. similarly, readers could offer to edit, with a published introduction, ‘virtual issues’ comprising archived articles on a particular topic or shared agenda. it seems likely that academic journals will keep evolving as reading and online habits continue to change. recent years have seen considerable debate over academic journals in which editors tend to be cast as gatekeepers. 
less attention has been given to the constructive ways in which editorial boards have been involved in making journals focal points for scholarly communities, dialogue and experimentation. if the current mixed market for academic publication is to continue, commercial publishers and academic presses will have to play a much larger part in promoting journal content (and at a fair price) rather than leaving it to the unpaid labour of authors and editors. but academics too will need to take responsibility for sharing online the work of their peers and introducing students to the new forms and forums of scholarly communication. the alternative is that universities will stop investing in scholarly publication. our research will be directed to institutional repositories, where it will disappear into impersonal silos, discoverable only via research management sites. that will be no place for conversation or creativity.

see for example, the virtual issue on ‘folklore and anthropology’ at past and present, with articles open access until the end of , edited with a new introduction by william pooley, ‘native to the past: history, anthropology, and folklore’, past and present ( ) . http://past.oxfordjournals.org/content/early/ / / /pastj.gtv .full http://www.oxfordjournals.org/our_journals/past/anthropology_folklore.html

keywords: academic article; academic journal; altmetrics; blogging; digital scholarship; editorship; electronic publication; open access; peer review; public engagement; print publication; social media; readership; referencing; virtual issue

abstract: this article provides an editor’s perspective on academic journals in the transition from print to online publication and the move towards open access.
it considers the challenges facing scholarly publication but contends the new media and social networking provide opportunities for radically rethinking what constitutes an academic article and a scholarly journal. while editors and publishers are frequently charged with acting as ‘gatekeepers’, the article argues that resistance to change has also come from authors, particularly evident in the failure to reference their use of digital resources. above all, it claims, experimentation is inhibited by journals retaining the traditional parameters of the printed issue, with their restrictions on length and use of multimedia. journals, it proposes, can become a focal point for academic communities, dialogue and experimentation, but this requires all scholars to be pro-active in sharing online the work of their peers and introducing students to the new forms of scholarly communication.

white paper
report id:
application number: hk- -
project director: kimberly ann christen withey
institution: washington state university
reporting period: / / - / /
report due: / /
date submitted: / /

neh digital implementation grant white paper
grant number: hk
project: mukurtu mobile: empowering knowledge circulation across cultures
project director: dr. kimberly christen, associate professor, mukurtu project director, washington state university
date submitted: . .

mukurtu mobile white paper
neh odh: digital implementation grant
mukurtu mobile: empowering knowledge circulation across cultures
pi: dr. kimberly christen, washington state university
project team: center for digital scholarship and curation (wsu), center for digital archaeology, mapp app

project summary
this project both critically evaluates the notion of knowledge sharing in the humanities and implements a mobile digital platform that extends and contextualizes the practice of knowledge sharing.
mukurtu mobile builds on work supported by the national endowment for the humanities and the institute of museum and library services to produce mukurtu cms, a content management system that meets the needs of indigenous communities globally to manage, share and preserve their digital heritage within their own cultural and ethical systems. building from this success, our team launched mukurtu mobile, an innovative iphone application that empowers indigenous communities to collect, share and preserve their cultural and environmental resources. mukurtu mobile provides a platform for individuals to bring their own knowledge base to the common concerns of local, traditional and indigenous communities worldwide. with an interface directly to mukurtu cms, mukurtu mobile links the power of a robust, culturally responsive cms to the direct collection of knowledge on the ground, facilitating curation and collection in real time. from citizen archivists to citizen scientists, activists and scholars, mukurtu mobile enables connection of local sets of knowledge and data to fuel research hubs and educational environments that unite local communities around global issues such as natural and cultural resource management, language revitalization, cultural revitalization, educational outreach, and ecological sustainability.

major project activities
in order to meet the main goals of the project, to extend and update mukurtu mobile, we undertook five main activities:
1. creation of functional and technical specification documents for the mukurtu mobile app through all phases of development
2. creation of a support hub for users with multiple access points
3. iterative public releases of the mukurtu mobile app
4. community outreach and engagement
5.
collect data metrics and analytics for the app and support sites

functional and technical specifications
• functional specs: this document describes use cases, interface requirements, and expected operations inside the app, as well as behavior in case of conflicts and errors. it was updated through iterations and app releases and is downloadable from the mukurtu mobile github repository (https://github.com/mukurtucms/mukurtu-mobile/tree/master/technicalspecs).
• technical specs: this document records the results of the feasibility studies and describes in detail what technology and tools are to be adopted in all phases of development, with special attention to changes in the framework interfacing with mukurtu cms (http://www.mukurtu.org/). it was updated through all phases of work and development and is downloadable from the mukurtu mobile github repository.
  o following the development method adopted in phase , technical specifications for release . were produced as part of the development sprint for updating the mobile app during phase based on this testing.

knowledge hub and support
the online knowledge hub includes a dedicated mukurtu mobile support page, with links to video tutorials and downloadable documents for all aspects of app use, and the mukurtu mobile website, with development updates, video tutorials and documentation.
• mukurtu mobile dedicated support page
  o a basic getting started checklist was created as a starting resource for users
• a dedicated mukurtu mobile youtube channel with tutorials
• mukurtu mobile website with sections including feature release updates, support links and tutorials, information about getting the app, and a demo page with support for testing content without a mukurtu cms site and demo content.

mukurtu mobile app releases
there were public releases of the app through all phases of development to refine the app to work across a variety of devices.
• development phases and releases . - .
the phase i release was a minor update of the original beta mukurtu mobile app. updates included:
• in-app verbiage updated
• support updates
• minor interface updates

in phase ii, updates included:
• complete update to interface
• support for upload of video and audio files
• internal audio recorder
• youtube integration for uploading your videos
• full exif support for media
• preview your content online from the app

in phase iii, the app updates focused on key features, integration with the new mukurtu cms code release, and finalization of display and language features.

feature updates in . include:
• integration with mukurtu cms . and .
• offline content collection and creation
• geopositioning
• syncs with the communities and cultural protocols on mukurtu sites
• mukurtu cms standard metadata
• internal documentation and online support page
• in-app image, audio and video collection capabilities
• full exif support for your photos
• preview your content online

https://github.com/mukurtucms/mukurtu-mobile/tree/master/technicalspecs
http://support.mukurtumobile.org/
http://support.mukurtumobile.org/customer/portal/articles/ -getting-started-with-mukurtu-mobile---all-you-need-to-know
https://www.youtube.com/user/mukurtumobileapp
http://mukurtumobile.org/

updates and changes in . :
• in april, a new release of mukurtu cms meant that the mukurtu mobile client had to undergo a new phase of code updates to ensure interoperability between the two systems. specifically, the update had to enable the client to fetch content from, and post content to, the server environment, as well as to leverage new capabilities offered by the cms, most specifically in media management.
• extensive research and testing during the development of the exhibit feature integration with the platform map app.com resulted in a test html web application with the demo content. the result of this testing phase and the concurrent code updates to mukurtu cms to version .
, resulted in the decision that mukurtu mobile will not rely on an external platform for the creation of mukurtu exhibits.
• exhibit and display integration with mukurtu cms. a new workflow allows users to create online “exhibits” using the drupal content type collections within mukurtu cms sites via the mukurtu mobile app. within mukurtu mobile, content is created, uploaded, and categorized/tagged accordingly through the mobile client, while collections can be set up in the cms, including configuration and look-and-feel of a dedicated exhibit page for each collection. the adoption of the twitter bootstrap design framework in the last update of mukurtu cms allowed greater freedom in content presentation.
• this final update and release at . ensures compatibility with both mukurtu . and . . the app fully leverages the new media management capabilities in mukurtu through integration with the drupal module scald, and offers full support for youtube videos. substantial security fixes were included with the latest release of the mobile app, which now allow the app to work safely with any mukurtu site without need for further configuration.

a release backlog document was created by the project’s subcontractor, the center for digital archaeology, for tracking the work on the app, providing a transparent set of build and release documents for further development in other stages.

community outreach and engagement
community testing, local hands-on workshops and user engagement surveys have been underway since september .
we held workshops at the association of tribal archives, libraries and museums annual conferences, the convening culture keepers gathering, the northwest archivists native american roundtable, and the alaska native language archives, and held hands-on community testing with the zuni public library, the california indian museum and cultural center, the pokagon band of potawatomi’s summer dreamcatchers kids’ camp, and in three community workshops in australia in collaboration with the traveling places local workshops in remote communities in new south wales. at each session we used pre- and post-evaluations to assess individual and organizational needs, as well as whether basic tasks using the app were readily understandable. a sample summary of three sets of workshop evaluation responses shows interest in the app for community engagement, its ease of use, and possibilities for future use (see samples below in evaluations section).

(left) fourth and fifth graders at paschal sherman indian school on the colville reservation use mukurtu mobile to record narratives about water health and stream resources in their community. (right) pokagon children use mukurtu mobile to document traditional practices related to their environment during the annual summer dreamcatchers camp.

dreamcatchers camp summary: jason wesaw (pokagon), lotus norton-wisla (wsu team), and michael wynne (wsu team) co-led a set of workshops with approximately - year olds during a day of the pokagon band of potawatomi dreamcatchers camp. during each . hour workshop session throughout the day, the three taught kids how to use the mukurtu mobile app on ipads to collect pictures, audio, and video to create stories and memories in mukurtu mobile and upload to the pokagon mukurtu site.
kids created mukurtu mobile items based on interviews with each other, teachings from camp instructors and kids, nature, documenting camp, cultural arts, and other topics. the content that the kids created was then uploaded to the wiwkwe'bjgen mukurtu cms site under a dreamcatchers camp community only cultural protocol. a total of items were created on the wiwkwe'bjgen website. most items included more than one media asset, and many included more than media assets. most multi-asset items were a mix of images, audio, and/or video, chosen by the kids to best tell their stories.
• items total
• items with image assets
• items with video assets
• items with audio assets

metrics
we captured three sets of metrics:
• mukurtu mobile app downloads and use through the itunes store
• mukurtu mobile website traffic
• mukurtu support page traffic

for the (extended) three-year grant period, during approximately two years of which we had an active app for updates and downloads, we show unique downloads across the active release period for the ios app in the itunes store. this number is consistent with the uptake of mukurtu cms and shows in particular the higher number of downloads close to the release dates and in the months after the mukurtu cms . release. website traffic was , unique users over the . years, which matches the numbers of mukurtu cms sites and users and shows a positive and growing user base as well as high traffic from those interested in using the app. support traffic was steady over the granting period and shows spikes around the time of launch releases and coinciding with mukurtu mobile workshops and hands-on sessions with the mukurtu team.

accomplishments and changes
original objectives
• update the original beta version of mukurtu mobile to a stable release
• allow for audio and video capture and upload in app.
• app availability on a wide range of platforms (ios and android)
• responsive online support, educational tutorials
• seamless integration with the latest release of mukurtu cms

accomplishments
• mukurtu mobile is now at a . release and includes in-app audio and video capture. video still requires a youtube account but the experience for users is seamless within the provided workflow.
• online support is fully updated and current with documentation including text, screencasts and video tutorials.

changes in development
• our original objective was to have mukurtu mobile compatible with ios and android platforms. the complete code update of mukurtu cms created unforeseen interoperability issues past the mukurtu mobile . release. mukurtu cms is now at a . . release. the updated cms provides a secure, simple, and streamlined interface with mukurtu mobile. the remaining challenge will be the next integration of mukurtu mobile . with android. our user analysis showed less than % of mukurtu mobile users on android; given the cost and timeframe, we elected to wait for a mobile update at this time. android is still compatible with mukurtu cms . however, we will be adding android support to our next development cycle post-grant.

audiences
the primary audience for mukurtu mobile is mukurtu cms users. mukurtu mobile was developed as part of mukurtu cms’s community software development model, where features and functions are determined by the communities we serve. each new feature or function within (or attached to) mukurtu cms is born from the direct needs and suggestions of the community of people using mukurtu cms: indigenous communities, archives, libraries and museums, and non-indigenous collecting institutions with indigenous collections and/or who collaborate with indigenous communities. our team first encountered the need for a mobile collecting tool during our mukurtu . workshops in australia and new zealand in .
it was crucial here that the focus for the app was not a recreation of mukurtu cms or a display tool; rather, it was clear from this early feedback that the need was to empower community members, particularly kids, to collect: to document, narrate and add content to their community mukurtu cms sites. indigenous communities around the world, while diverse in languages, cultural practices and social structures, share histories of colonialism that have left many disenfranchised and with shocking rates of adolescent suicide and school drop-out. part of the need for empowerment that we first heard in our workshops came from young adults working in their communities to stave off these issues and using technology as one part of that set of solutions. as we took this first feedback to communities in the united states and canada we heard an overwhelming need to “get the kids involved” and to build a collection tool that would provide that engagement. indeed, many of our workshops have included native youth school groups and summer camps. given our community software development model for mukurtu cms, the audience for mukurtu mobile is current mukurtu cms users. the primary audience is indigenous communities using mukurtu cms; at present, counting the downloads we can track and self-reported use numbers, there are just over + installs worldwide, mainly in the united states, canada, australia and new zealand. with the new one-click install hosted package on reclaim hosting we have also seen over + new sites since the release in mid . secondary audiences are the non-indigenous groups who use mukurtu cms either to engage with indigenous communities through the return and sharing of digital content, or as organizations or scholars who wish to implement mukurtu cms for their own projects that rely on specific protocols for access, use and sharing of digital content.
for example, we have worked with the national museum of the american indian’s conservation program to test an instance of mukurtu cms as a forum where communities can share information about the correct and culturally appropriate forms of care for the materials held by nmai.

evaluations

1. development evaluations and assessment
alongside development and integration, we conducted user testing and user experience sessions that allowed communities and our staff to work together to upgrade and define the use of mukurtu mobile for their needs. usability and quality assurance tests were conducted at each testing phase of the project. evaluation took place largely in person; the few online sessions were hard to control and yielded little usable feedback. in-person evaluations combined user surveys, cognitive walkthroughs, and contextual inquiry to assess accessibility, usability, and interface design. the cognitive walkthrough method relies on a variable set of tests and multiple user groups to gain the most insight into the usability of the system. during the walkthrough, the user is asked to “think aloud” as they perform certain tasks and maneuver through the site. the value of the cognitive walkthrough is that it yields significant qualitative data about the human-computer interaction. following this phase of testing, contextual inquiry interviews are used to gather specific data about the interface, design, and experience of the user.
contextual inquiry provides the following information about the product: 1) constraints on use/environment, 2) the process that the user goes through in their navigation, 3) specific steps to gain positive results, and 4) where users have difficulty. the in-person sessions, however, yielded usable results and helped us at every stage to refine development, training and support.

2. training evaluations and assessments
[sample evaluation responses compiled from workshops in alaska, washington and california]

what features of the mukurtu mobile demo and using mukurtu mobile yourself did you find most relevant to your needs?
summary: easy, fun, less intimidating, easily used by community
• it is less intimidating then mukurtu like the mobility and it gives you instead a feeling of success. i think this will be good with the kids in my outdoor classroom.
• immediacy. haven’t really spent much time with it.
• user friendly, versatile, quick upload.
• easy, fresh & fun, envision it being used easily when attending community gatherings.
• small, portable, assume nice quality images possible, like the speed; steps are pretty intuitive.
• adding existing photos from my phone/cloud.

what additional features or capabilities would you like to see in mukurtu mobile?
summary: interactive games, language app
• i am kind of set with this for now i might have other ideas later.
• interactive games for end users. not really sure yet.
• language app :)
• not sure?
• want to use it to create an environmental awareness, tek game; really though it could be great to engage the kids in so many projects that involve culture.

how do you envision using mukurtu mobile at your own institution?
summary: kids, educational, elders, collaboration
• the kids in the outdoor classroom. to view tribal future, interact with elders, etc.
• perhaps museum’s exhibits info for visitors. student educational---not sure.
• quick upload of images/stories by kids, elders, etc. collaborations on so many levels.
• see above
• i like the idea of bringing elders and children together to work on collecting photos, videos of traditional/historical & genealogical knowledge.
• recording language and culture class.

what other departments in your tribe or organization could use mukurtu mobile? what types of projects might they want to do?
summary: all respondents had ideas, language, food security, education, age, domestic violence prevention
• language enrollment for update as tribal people. library, age, cultural graphics, tribal general council, historic info museum, harvesting. i could go on forever. wow!
• food security
• all of them: education, weavers, language, tribal members.
• dnr, language and culture, education, media services.
• natural resources; i have to think education would love it. human resources, perhaps. domestic violence prevention.
• language documentation and teaching, collection

what was the most valuable part of the workshop?
summary: all respondents found the hands-on work the most helpful, along with the ability to work with their fellow participants at collecting and uploading content.
• all of it!
• what it is, and how we use it. all well explained.
• the hands-on aspect was the most helpful to me.
• how user friendly mukurtu mobile is
• face to face support and hands on experience creating dh items for cimcc, and the ability to share mukurtu with interested folks
• i took a fair amount of notes, all valuable! the most valuable part was getting to work with a mukurtu site. i've read so much and really wanted to see how it works. also i gained clarity on hosting in general, and also gained additional clarity on what i am trying to do in my role at graton, in addition to the need for a community archive.
• the overview and group walk-throughs
• i enjoyed the hands on mobile demonstration. it is valuable to planning our activities. the background info is relevant too.
continuation of the project

mukurtu mobile ongoing development and institutional support
the project will continue as an integral part of mukurtu cms and its ongoing development. in washington state university created its center for digital scholarship and curation (cdsc), with the pi, dr. kimberly christen, as one of the co-directors of the center. the mission of the cdsc is to facilitate and sustain digital scholarship and teaching in support of the university’s strategic plan to foster exceptional research, innovation, and creativity. the cdsc is committed to upholding wsu’s land-grant heritage and tradition of service to society by collaborating with and providing support to a wide range of constituents, with a focus on ethical curation and the production of digital tools that support social justice, diversity and sustainability. the center provides not only a physical space, but dedicated faculty, resources, and programming that build from current projects to provide a base for research, scholarship, and tool-building, with a focus on ethical, cultural, sustainable projects and practices and a continued emphasis on reaching out to underserved populations. with dedicated development staff, the cdsc now fully manages mukurtu cms and mukurtu mobile and will continue their development, regardless of grant funds. this level of institutional support fosters trust, provides stability for the community of users we serve now, and opens our digital tools to use by other communities engaged in collaborative models of curation. key areas of growth for mukurtu mobile will be extended during our next phase of development with a recent imls national leadership grant and through programming from a mellon-funded planning grant ( - ).
specifically, we will be building out the dictionary and language collections functions within mukurtu cms, and the connection to those features will be expanded in mukurtu mobile. we have also built into our roadmap for future development updates to the android app that will be compatible with mukurtu cms . and future releases.

collaborative partnerships

building from another neh-funded project, the plateau peoples’ web portal, we are applying for nsf funds to build out the mobile application to include educational modules for stem that facilitate the creation of land-based, culturally responsive curriculum that brings together western science and traditional knowledge. this new project brings together eight tribal nations in the region, the college of education at wsu and the center for digital scholarship and curation to produce an update and secondary app that allows teachers to design specific culturally responsive curricula and integrate primary sources from the portal along with collections of content gathered by students, teachers, community members and other scholars.

long term impact

the impact of the mukurtu mobile app has been increased community engagement within indigenous communities and non-indigenous collecting institutions using mukurtu cms. mukurtu mobile provides an avenue for individual and group collaborations that respect difference and provide culturally responsive and ethical protocols to guide content collection, creation and curation. the long-term impact for our institution will be ongoing and increased collaboration with indigenous communities globally, increased use of both mukurtu cms and mukurtu mobile that contributes to the diversity of voices and collections at all institutions, and increased grant funding. to date we have a new imls grant and a pending nsf grant, and we will be submitting a mellon foundation implementation grant following our planning grant phase in .
the long-term institutional impact will be not just to increase grant funds, but to sustain our mission to develop digital tools, projects and partnerships around broad-based ethical concerns and with a social justice foundation. at institutions such as wsu it is increasingly clear that we must lead by example to promote inclusion and diversity and to ensure that a wide range of stakeholder needs are not just represented, but become part of the fabric of the university and the center for digital scholarship and curation as we seek to build, design and produce more digital tools and platforms.

grant products

the main product of the grant was the open source mukurtu mobile app with the corresponding source code, documentation and support sites.
• mukurtu mobile app releases on google play and itunes
• github source code
• mukurtu mobile website
• mukurtu mobile support site and youtube channel support videos

hawthorne, w., & lingna nafafe, j. ( ). the historical roots of multicultural unity along the upper guinea coast and in guinea-bissau. social dynamics, ( ), - . https://doi.org/ . / . .
peer reviewed version. this is the accepted author manuscript (aam). the final published version (version of record) is available online via taylor & francis at http://dx.doi.org/ . / . . . please refer to any applicable terms of use of the publisher.
the historical roots of multicultural unity along the upper guinea coast and in guinea-bissau

walter hawthorne and josé lingna nafafé*

professor walter hawthorne, chairperson, department of history, old horticulture building, e. circle dr, michigan state university, east lansing, mi, usa. email: walterh@msu.edu

dr. josé lingna nafafé, lecturer in portuguese and lusophone studies, school of modern languages, department of hispanic, portuguese and latin american studies, woodland road, bristol, uk. email: jose.lingnanafafe@bristol.ac.uk

* corresponding author. email: walterh@msu.edu; jose.lingnanafafe@bristol.ac.uk

abstract: lusofonia or lusophony is often defined as an identity shared by people in areas that were once colonised by portugal, which in africa include angola, cabo verde, guinea-bissau, mozambique and são tomé and príncipe. lusofonia assumes that in these places people share something – a language, certainly, but also a history and culture rooted in the iberian peninsula. in some ways, it is a re-articulation of gilberto freyre’s lusotropicalismo, the idea that portuguese were more adaptable than other europeans to tropical climates and cultures and created more multicultural colonial communities. those who espouse lusofonia often have a political agenda – the strengthening of the community of portuguese speaking countries (cplp). in this article, we argue that like lusotropicalismo, lusofonia is a dream; it is not rooted in a historical reality. it is luso-centric in that it ignores the power and persistence of local cultures and gives undue weight to portuguese influence. with regard to africa, lusofonia’s agenda is elite driven and assumes the inevitability of modernity and globalisation. and we demonstrate that it was through upper guinean institutions and languages, and not colonial ones, that community and fellowship were most commonly fostered in the past, as they are fostered today. those seeking the roots of lusofonia cannot, then, look to this period of portuguese-african engagement in upper guinea. there portuguese embraced “black ways.” they operated in a peculiar multicultural space in which people possessed fluid and flexible identities. portugal did not create that space. lusofonia has not been the foundation for cultural unity. rather, unity has been found in localised institutions and in crioulo. in guinea-bissau, lusofonia is not an indigenous movement. if it is anything, it is the stuff of elites and foreigners and is not rooted in any historical reality.

keywords: lusofonia, upper guinea coast, guinea-bissau, portuguese, multicultural

lusofonia or lusophony is often defined as an identity shared by people in areas that were once colonised by portugal, which in africa include angola, cabo verde, guinea-bissau, mozambique and são tomé and príncipe.i lusofonia assumes that in these places people share something – a language, certainly, but also a history and culture rooted in the iberian peninsula. in defining lusofonia, many defer to the portuguese philosopher eduardo lourenço, who described it as a “community and the fellowship inherent in a fragmented cultural space” (lourenço , ).ii in other words, lusofonia is multiculturalism portuguese-style. as michel cahen puts it, lusofonia is most often conceptualised as a “peculiar area of intersection with other identities (european, indian, bantu, muslim, christian, jewish, etc.)” in which there exists a “certain ‘weight’ of portuguese expansion” (cahen b, – ).
in some ways it is a re-articulation of gilberto freyre’s lusotropicalismo, the idea that portuguese were more adaptable than other europeans to tropical climates and cultures and created more multicultural colonial communities. those who espouse lusofonia often have a political agenda – the strengthening of the community of portuguese speaking countries (cplp).iii as summed up by victor marques dos santos:

the idea of a community of portuguese speaking countries... is over a century old, and translates into today’s reality as an expression of political will of eight sovereign states. the idea stemmed from the acknowledged existence of shared cultural elements, namely the common use of the portuguese spoken and written language, as the means of expression of over million people…. portuguese speaking cplp people and the portuguese speaking communities spread around the world, define a geographical space of cultural expression that transcends the territorial frontiers of lusofonia as a potential factor of strategic projection. in this context, cplp stands as the institutional framework that meets the needs for the defense of lusofonia and the development of the portuguese language both as a cultural heritage element and a factor of strategic projection, whose fostering is in the interest of portugal as well as of all the other cplp member states (dos santos , ).

in this article, we argue that like lusotropicalismo, lusofonia is a dream; it is not rooted in a historical reality. it is luso-centric in that it ignores the power and persistence of local cultures and gives undue weight to portuguese influence. with regard to africa, lusofonia’s agenda is elite driven and assumes the inevitability of modernity and globalisation. it envisions the existence of a global community of portuguese speakers, and it aims to shape identities accordingly.
how then is it possible that community and fellowship have existed in the culturally fragmented space that is known today as guinea-bissau? what other than the legacy of european colonialism fosters multiculturalism? we answer these questions through a look at the history of the upper guinea coast, a region stretching from southern senegal through sierra leone, which includes guinea-bissau. we examine how the portuguese colonised the space and how identities, languages and religions changed within it (nafafé , – ). and we demonstrate that it was through upper guinean institutions and languages, and not colonial ones, that community and fellowship were most commonly fostered in the past, as they are fostered today. this is not to say that guinea-bissau has rejected broader alliances – political and cultural connections – with the world beyond its borders. but when seeking alliances, guinea-bissau has embraced regional partnerships within africa – partnerships that have often excluded portugal and cplp member states.

the first centuries

the upper guinea coast stretches from the gambia river through to sierra leone. as early as the sixteenth century, small numbers of portuguese men began to settle there, concentrating around bissau and cacheu and other port towns. there and elsewhere along the upper guinea coast, portuguese settlers and merchants encountered people from a vast number of ethnic groups among which were baga, balanta, banhun, biafada, bijago, cassanga, floup, fula, jola, nalu, papel, sape, jolonke, and mandinka. some of these groups, and particularly those close to the coast, were divided into small-scale settlements that had relatively decentralised or stateless political structures. others, and particularly those beyond the immediate coastal strip such as the mandinka, had more hierarchical structures. their rulers exercised control over people in large sections of territory (hawthorne ; brooks , ; horta ).
oral traditions from many of the decentralised groups speak of ethnolinguistic territories, which people in guinea-bissau refer to as chão (tchon in the singular) in a widely spoken creole language called crioulo (nafafé , – ). in a study of written sources from the years to , p. e. h. hair shows how chão have been relatively unchanging over centuries (hair ; lüpke , ). in other words, ethnolinguistic groups have been established in about the same locations for a considerable time (lüpke ). the reason for this settlement pattern is rooted in the nature of coastal agriculture. farming methods, soil types, and the unique qualities of the crops people have chosen for planting have permitted coastal groups to remain rooted in the same places for generations. fixed settlement patterns combined with great competition between relatively small-scale communities encouraged people to define themselves in particular ways (mark ; lüpke ). within chão, walls, called tabancas in crioulo, often protected communities. as the frequency of slave raiding and overall volume of the external trade in slaves increased in the sixteenth century, walls became so commonplace that the word tabanca came to mean “village” or “community.” tabancas continued to have importance through the seventeenth century and especially in the second half of the eighteenth century when the volume of the slave trade from upper guinea reached its apex (hawthorne ). to some extent, slave raiding and trading encouraged the hardening of very localised identities. people looked inward to “their own” – to people in their tchon and tabanca – for protection during periods of uncertainty and insecurity. among the most important local identities were what might be called ethnic identities or those defined by linguistic affiliation. but clearly ethnic identities did not, as western intellectuals have often thought, set limits on human interactions. people in upper guinea were multilingual.
they married people from outside their ethnic groups. some were mobile, shifting from tchon to tchon. some settled among those from other ethnic groups, becoming in time part of a new group. “there were,” boubacar barry informs, “toures, originally manding, who became tukulor or wolof; jallos, originally peul, became khaasonke; moors turned into naari kajor; mane and sane, originally joola, surnames were taken by the manding royalty of kabu” (barry , ). moreover, ethnicity was not all that defined who people were. upper guineans had multiple and overlapping identities, some of which were often more important than ethnic identities. a man who sometimes identified himself as balanta might at other times identify himself as a resident of a rural tabanca and at other times as a grumete (canoe-hand) labouring beside men from other ethnic groups for a merchant in a port town. as a grumete, he could work daily among papel, fula, bijago, and mandinka, joining with them in common cause to defend an employer’s interests or to protest mistreatment by the same employer. in addition to the language of the balanta, he might have spoken crioulo, which was a language that developed on the cape verde islands before spreading to the coast in the fifteenth century and was a mixture of coastal mande languages and portuguese (nafafé , – ; barros ; nafafé ). he could wear, like all people in upper guinea, protective amulets acquired from muslim priests. but this did not make him muslim – or only muslim. he could visit shrines to balanta ancestors and shrines to a natural spirit located in papel and beafada villages. further, he could attend multi-ethnic masses when catholic priests were on the coast. he could have a broad range of identities linked to local, catholic and islamic religious practices; to his profession; to his village; and to his ethnicity (hawthorne ). all of this is to say that upper guineans defined themselves in many ways – some broad and some narrow.
they lived in a fragmented space yet shared a sense of community and fellowship with many. upper guinea was a “peculiar area of intersection” of multiple identities (cahen , - ). but it was not portugal that made it that way. there was an existing shared cultural space prior to the portuguese arrival. europeans, and in particular the portuguese, stepped into this cultural confluence. before the twentieth century, only small numbers of portuguese and other europeans settled in upper guinea and few survived for long. those who survived did so by overcoming tropical diseases and being integrated into a guinean cultural system (nafafé , – ). in written sources, these settlers were called lançados since they had been “lanced” or thrown among africans. on the coast, they fostered trade connections with atlantic ship captains (hawthorne , ). many learned local languages.iv some married and produced offspring.v and all operated within the context of a cultural system that was not their creation. as lemos coelho observed, lançados “live in this freedom because the king allows it and defends them” (lemos coelho ). other historians have made similar arguments. for example, green cites fernandes, who wrote in about conversations with people in the region who spoke of earlier times: “the casamance river is a great trading river... in the kingdom people of all nations are mixed together, mandinkas, floups, balantas and others.” green then observes that by the time of the portuguese arrival, the casamance area was “a multi-cultural zone, where peoples from different kinship lines co-existed.” green provides other examples of this co-existence long before the arrival of portuguese merchants and outcast traders. moving on to the first hundred years of portuguese settlement and trade, he shows how “pre-existing political configurations determined patterns of settlement for europeans and the shaping of… early mixed communities” (green , ).
those seeking the roots of lusofonia cannot, then, look to this period of portuguese-african engagement in upper guinea.vi there portuguese embraced “black ways.” they operated in a peculiar multicultural space in which people possessed fluid and flexible identities. portugal did not create that space. a few portuguese were integrated into it. to be sure, official portugal established itself on the upper guinea coast in some coastal towns where it constructed fortified areas known as praças. the most important were ziguinchor, cacheu, farim, bissau and geba. by the eighteenth century, portuguese and african-born christian residents of these praças were known as moradores and, no matter where they had been born – on the african coast or in portugal – they called themselves portuguese. most were brown skinned, the descendants of relationships among portuguese men and coastal women. others had black skin, had been baptised and claimed a christian-portuguese identity. but few who had been born on the coast spoke the portuguese language. most knew african languages, including crioulo, which was a language born in africa and not on the iberian peninsula. moreover, the number of christians was never large. from the seventeenth through the early nineteenth centuries, priestly accounts and official portuguese censuses never counted more than several thousand in praças (hawthorne , ). and their “christian portugueseness” was always questioned. this is best demonstrated with a look at records from the inquisition. in , inquisitors arrested crispina peres, genebra lopes, and izabel lopes in cacheu. each was “brown” in appearance and was a descendant of a relationship between an african woman and european man. each had been baptised christian. nonetheless, each visited chinas or local shrines.
lopes was said to take “palm wine and the blood of chickens to one of these shrines which is only a gunshot away from this settlement, which she has heathen negroes and negresses pour over it.” catholic priests were concerned that shrines played a large part in the lives of most “christian” coastal residents. as records from inquisitors state, “most of the blacks and some of the whites of this settlement keep these idols and other wrongs in their houses, in which they have more faith than in god.” crispina had a white portuguese husband, and both of them consulted mandinka healers, as did other portuguese. among them was ambrósio gomes, one of the wealthiest merchants in the area and a man whom portugal would appoint governor on the coast. gomes employed a mandinka woman who made him amulets to keep him healthy. similar practices are documented well into the eighteenth century. hence in , a portuguese official wrote that moradores carried out rituals at “pagan” shrines “with more willingness than they carry out the work of divine cult” of christianity (hawthorne , – ). to be sure, praças were areas of intersection among people possessing multiple identities. but their logic was a very local one and was not something imported from lands to the north. as we have seen, such spaces were commonplace in upper guinea. upper guineans had long mixed and mingled in a great variety of spaces. and they had long embraced some of the linguistic, cultural and religious elements of people who came into their midst. portugal did not invent areas of cultural intersection in upper guinea. portuguese who settled in upper guinea before the nineteenth century adapted to local customs and engaged in local cultural practices. they did not introduce something new. they became part of something with a deep upper guinean history.

from the nineteenth century

so what of later periods?
in the early nineteenth century, the legal export trade in slaves from upper guinea ended and, threatened by advances from britain and france, portugal moved to shore up claims it had long made to having a place in the region. however, as r. j. hammond writes of the whole of the continent, “the portuguese dominions on the african mainland were quite limited in extent so far as direct sovereignty was concerned, whatever their claims might have been under the vaguer headings of suzerainty or sphere of influence” (hammond , ). hence, throughout the nineteenth century, representatives of the portuguese state would try in vain to regulate and tax commerce. descriptions of portuguese “strongholds” make clear why portugal failed. with the permission of local chiefs, portugal finished the walled fort named praça de josé de bissau in . it housed ragtag portuguese troops who were at the mercy of their african neighbours. troops needed to leave the fort for food and water, and when tensions flared between local papel and portuguese soldiers, access to these things was denied (valdez , ; mollien , – ). conditions in this and in other praças were so horrendous that portugal had to rely on convicts and other undesirables (degredados) to man them. some survived and through relationships with african women integrated into local societies and found homes for themselves. but many died from malaria or succumbed to dysentery or one of the myriad diseases that ran rampant due to poor sanitary conditions. for this reason, over the course of the nineteenth century, troops from portugal were increasingly replaced with guinea-born and cape verdean recruits. soldiers of all colors, lacking shoes and uniforms, “most of them… clothed in rags” and “some nude,” suffered mightily in praças (hawthorne , ). in , gaspard mollien described one – the praça of geba.
geba is a village entirely of mud houses; there is no fort; some black soldiers cause respect to be paid to the government, which is supported by mildness rather than by actual force. bounded on the south by a marshy river, and on the east by mountains, it is perhaps one of the most unhealthy spots on the face of the globe. we saw but three europeans there, but their faces were so emaciated by the pernicious influence of the climate that they might have been taken for spectres returned from the tomb. (mollien , ) and in , an american missionary described the praça at bissau as being little better. [the soldiers] received from the portuguese government a miserable monthly allowance of tobacco, rum, and other articles suitable to barter with the natives for yams, rice and fish…. the whole number of convicts, all of whom are enrolled on the garrison books, and compelled to do the duty of soldiers, attached to bissao and its dependnts, is about . half of these are from lisbon—the balance, coloured people and negroes, from the cape verde islands. the whites… are perhaps of all the human race, the most depressed, spiritless and refuse. considered as animals…. ignorant, despairing, unprincipled, if they have not energy to commit crimes, they have scarce a restraining motive remaining to save them from wallowing in the most swinish vice. (quoted in brooks , ) are the historical roots of lusofonia in these praças in the nineteenth century? to be sure, areas around praças saw, as they had in previous centuries, a great deal of multicultural mixing. but mixing took place in the context of coastal cultural norms, which had long fostered it. as januario correia de almeida described in , near bissau a daily market at bandim, which was controlled by a papel king, attracted “papels, balantas, bijagos” who competed among themselves and with grumetes to attract buyers for their goods (almeida , ). 
much to portugal’s dismay, europeans from a variety of countries were often welcome in coastal markets. they “cast anchor and negotiate directly with the blacks” (monteiro , ). and thus continued the pattern throughout most of the nineteenth century. portugal had little influence over events on the coast. they could not keep out rivals or control regional trade. despite centuries of portuguese interaction with locals, almost none spoke portuguese. crioulo was the language of choice among africans in praças; local languages were spoken elsewhere. and everywhere across the coastal strip local beliefs along with some elements drawn from islam, catholicism and judaism informed people’s religious practices.vii but the late nineteenth century brought a change. it was then that european competition for territory in africa increased greatly during what has been called the “scramble for africa.” as portugal, britain, france, germany and belgium moved to compel africans across the continent to sign treaties, some local leaders conceded and others chose to resist. and in the politically decentralised coastal strip of upper guinea, there were many who resisted. thus, around bissau, portugal began to launch attacks on areas of major concentration of people. between and , they struck at felup and manjaco areas. from to , they turned their attention to beafada. the next decade saw military expeditions against balanta and papel (lobban , ). all the while portuguese officials attempted to force africans to produce goods and generate revenues to benefit portugal itself. but leis de trabalho and impostos de palhota proved unpopular and people’s resistance effective – at least through the first two decades of the twentieth century. 
hence, in the sociedade de geographia de lisboa would lament that it could “boldly say that… some of the richest regions of the province, like oio, basserel, the coasta de baixo, the bijagos islands, and the areas of the balanta” remained “completely unsubdued.” it continued, “almost all of these populations have been at times defeated by our forces, but even with the victories the state of rebellion continues” (boletim da sociedade da geographia de lisboa , ). quelling this state of rebellion was tasked to portuguese commander joão teixeira pinto. relying heavily on a mercenary named abdul injai to recruit african troops and to direct strikes on area tabancas, pinto launched a brutal campaign of “pacification” in . during this campaign, as pinto himself noted, coastal people “united” so that they could “defend themselves against the government” (pinto , ). being ignorant of the region’s history, pinto said that in earlier times, the region’s people had been “constantly in war” with one another and that this unity was something new. but, as we have seen, this was not the case. many indigenous cross-cutting institutions had long brought together the people of upper guinea’s multicultural landscape. and so they did again in the midst of portuguese military aggressions. ultimately, of course, coastal groups could not stop the portuguese advance. as historian rené pélissier writes, the injai-pinto strategy was to cause the “destruction of the maximum number of tabancas” and “to kill the maximum number of men” (pélissier , ). similarly, joshua forrest argues, “crucial to the success of the injai-pinto expedition was the unbridled use of state terror.” and he documents the “systematic killings of unarmed civilians, the massive theft of village property, the destruction of livestock, and the capturing of young men and forced conscription as colonial auxiliaries” (forrest , ). thus was born the colonial state. 
taxes and forced labour followed in an area that was dubbed portuguese guinea. forrest aptly describes portuguese colonialism in the area as both fragile and violent. the state drafted some locals into its service and used them effectively to quell resistance. through them, it succeeded in conscripting labour for public works projects, and it succeeded in some areas in collecting taxes. but its ability to reshape and co-opt coastal social, cultural and political structures was limited. most coastal people saw the colonial state as illegitimate, so violence was the only way to move locals to act in the service of the state. following forrest, portugal relied on a “terrorist mode of repression” (forrest , ). the voices of coastal people hawthorne interviewed in the s tell the story well. many remembered cipaios or africans who worked as police for the colonial regime, rounding up labour for projects. one man told me that cipaios oversaw the construction of roads but often found it difficult to gather workers. “thus, they arranged their own representatives [appointed chiefs of tabancas] to aid in recruitment. for the tabanca that did not follow through, the fula [cipaios] arrived to seize their livestock or to carry the representative to the post where he was beaten.” another informant explained that if cipaios arrived to recruit someone for a forced labour project and the person fled, “the cipaios… would carry away an elder man or woman of the household and whip him or her with a chicote and put him or her to work in forced labor in place of the person who had fled.” and a woman said, “balanta women participated in forced labor. during the labor, no woman had the courage to stop or sit to breast feed her child, even when he or she was crying on her back.” another woman dropped her head as she resurrected memories of cipaios carrying away “many women to the post at nhacra. there, they always tried to rape them. those who resisted were beaten a great deal.
but if there was one who consented, she was not beaten” (hawthorne , – ). despite the fact that it applied systematic violence in an attempt to subdue the population, the state could never break local social, political and religious structures that had long held sway in tabancas, united people across chão, and were a means of resistance to colonial oppression. as an example of this, we turn to eve crowley’s influential study of coastal spirit shrines. throughout the colonial period, religious leaders who controlled local spirit shrines, crowley shows, gained considerable influence. and as they did spirit societies became important political forces and sources of social unity, particularly in areas north of bissau and the geba river. this unity, crowley argues, was multiethnic and offered people an alternative to the broad power of the colonial state. of course, shrines had long attracted people from a broad range of tabancas and a large number of chão. shrines and markets had long operated as sites of conviviality in bringing people together despite existing rivalry. and from the s through the s, shrines served a new purpose – uniting people in opposition to colonial oppression (crowley , ). as in centuries past, african institutions provided the mechanisms for fostering multiethnic unity. but portugal did not recognise the strength of local institutions. the colonial gaze saw only a fragmented space, a space that possessed no institutions that were familiar, a space that needed portugal to bring its brand of “civilization” to it. as anthropologist joanna davidson explains, the state professed a “rhetoric of multiracial unity.” she continues, “portugal perceived its colonizing mission as a way to unite people in a grand lusotropical culture regardless of geography, race, or ethnicity” (davidson ). 
this discourse became most intense when independence movements intensified in the s; certain ethnic groups were afforded privileges by the portuguese to the detriment of others. it was then that officials like the portuguese overseas minister, adriano moreira, spoke of his country’s policy of “multi-racial integration.” perhaps portugal believed that its colonies were part of “one lusophone nation” and that in them was found the “equal dignity of all men.” however, portugal, as a colonial power, was not able to create a multi-racial nation that spanned continents and possessed a population that saw itself as one. its rhetoric did not match reality. but did that portuguese rhetoric have any long-term impact in guinea-bissau? in our view, local institutions continued to play a predominant role in shaping bissau guinean culture and politics in the post-colonial as in the colonial period. that said, we welcome davidson’s call for studies of “how portuguese integrationist and colorblind colonial rhetoric worked itself out on the ground” (davidson , – ). upper guinean institutions were, after all, integrationist themselves. they were not fixed in time. they allowed for the incorporation of ideas from the outside and adapted (nafafé ). by the early s, an anticolonial uprising that stretched broadly across portuguese guinea challenged portuguese rule. the movement generated effective, widespread interethnic solidarity that had historic roots stretching back through the colonial and deep into the precolonial period. organization took place under the paigc and its charismatic leader cabral, who believed that ethnic heterogeneity was not a barrier to national unity (nafafé , – ). and, to be sure, during the war “ethnic logic” did not determine who participated, who remained neutral, and who sided with the portuguese. the paigc recruited for, and managed well, a multi-ethnic movement.
its armed struggle succeeded with people from all ethnic groups working in coordination, including cape verdeans, among whom were key figures who played an important role in shaping the ideologies of the party, such as cabral himself, aristides pereira, pedro pires and carmen pereira.viii importantly, the principal language of communication for those involved in the armed struggle was not portuguese. it was crioulo, especially, but also balanta and other local languages. and after the war, the portuguese language did not see an upsurge in use. rather, crioulo did. unlike in most of the rest of africa, where english and french have provided a means for inter-ethnic communication, in what emerged as guinea-bissau an indigenous language has been embraced. this is something in which bissau guineans today take great pride, the past few generations learning crioulo in their communities as their first or second language (after an indigenous language) and studying portuguese, and increasingly french, as a third language in school (kohl ; davidson , – ; scantamburlo ). this has prompted some observers to dub portuguese “a foreign language” and given rise to experiments to teach it as such in schools, children initially receiving instruction in crioulo and later being introduced to portuguese as they continue their education (benson ). it should be emphasised, following data collected by carol benson, that in guinea-bissau “women were overwhelmingly monolingual or bilingual in two guinean languages, while most men reported being bilingual in the mother tongue and the creole, and many of the latter also claimed to know some portuguese” (benson , ). this is to say that there is a gendered dimension to language acquisition: portuguese is the language of a few, and most of those who speak it are male. the embrace of a common local language, crioulo, in part explains why in post-colonial guinea-bissau, relative interethnic harmony has continued.
interethnic communication has been the norm. and crioulo has made great gains. christoph kohl shows that it was understood by % of people, even in the countryside, in . compare this to figures of % in and % in (kohl ; davidson ; knörr and filho ). other factors are also important for relative interethnic harmony. guinea-bissau’s military has effectively integrated people from multiple ethnic groups. guineans have felt equally poorly served by their own national elites, their multiethnic government, and the international community. and the post-colonial state itself has proven to be as weak and ineffectual as the colonial state at centralising power. local authorities wield power at the tabanca level and do not serve as links between the state and the people. this means that the state has not been able to penetrate into the rural political arena, so the pre-existing and multicultural institutions that have long linked people in the region continue to shape people’s lives on a day-to-day basis. for many in guinea-bissau, and particularly the youth, crioulo is an enabling language, which should be seen as just as rich and expressive as portuguese. and thus much of the literature and music produced in the country comes in crioulo form. take for example the work of odete semedo, a bissau guinean writer. semedo, without employing the term lusofonia, questions the validity of the portuguese language in retaining cultural values for future generations of bissau guineans (afolabi ; semedo and ribeiro , ). and why not? lusofonia has not been the foundation for cultural unity. rather, unity has been found in localised institutions and in crioulo. thus the bible society has produced a new testament version of the bible in crioulo (nobu testamentu-crioulo biblia). and the songs that fill nightclubs in bissau and in europe when bissau guinean singers such as anastacio djéns and kid charles perform are in guinea-informed rhythms and crioulo.
these performers follow ernesto dabo, super mama djombo and kaba mané, who shaped the bissau guinean music scene in the s and s. portugal, however, still harbours the view that guinea-bissau, like other former colonies, reflects its image through lusofonia. portuguese politicians often use cabo verde as an example of lusofonia that the rest of the lusophone african countries should follow. they claim that cabo verde has maintained its cultural, political and economic ties with portugal and because of this has been able to avoid economic downturns such as those that have impacted guinea-bissau. portugal sees itself as a gateway for lusophone countries to the european economy. lusophone countries will be able to usher in economic development, the argument goes, if they stay loyal to the ideology of lusofonia. however, such claims fail to acknowledge the geographical position of cabo verde in relation to guinea-bissau and other lusophone african countries. first, cabo verde has benefited from its geographical location as a trade hub in the atlantic that links it to mainland africa, europe and the americas. second, cabo verde continues to have blood-tie privileges with portugal that other lusophone african countries do not have. as a result, cabo verde gets preferential treatment compared to the rest of the lusophone african countries. third, cabo verde did not suffer from the colonial wars that angola, guinea-bissau and mozambique did. all of these factors contribute to making cabo verde unique among the former portuguese colonies in africa (castles and miller ; ishemo ; jørgen ). today, guinea-bissau has chosen to de-emphasise its economic ties with portugal and to focus on new partnerships (cabral ). the country has reconfigured its relationship with portugal by becoming a member of the economic community of west african states (ecowas), which in french is known as communauté économique des etats de l’afrique de l’ouest (cedeao).
it also joined the financial community of africa (communauté financière d’afrique), adopting the currency used in most francophone west african countries. in so doing, guinea-bissau shifted its economic focus from portugal to france (nafafé , – ). guinea-bissau has also questioned its ties with the cplp. this was precipitated by angolan actions during a coup in guinea-bissau in april . then, many bissau guineans resented angolan intervention on the side of the government. many saw angola as attempting to determine events rather than letting locals determine their own future. and they understood angola’s justification as being rooted in the fact that both countries were former portuguese colonies and were part of a lusophone alliance, one that many in guinea-bissau did not embrace.ix and in the aftermath of the coup, guinea-bissau turned increasingly from the cplp to other alliances, and especially ecowas, which includes its immediate neighbours, senegal, guinea conakry and the gambia. conclusion in guinea-bissau, lusofonia is not an indigenous movement. it is not a rallying cry for people in rural or urban areas. it is not a consideration for people who work daily to put food on their communities’ tables. if it is anything, it is the stuff of elites and foreigners and is not rooted in any historical reality. it arises from what toby green calls official “portuguese perceptions regarding the superiority of ‘imperial’ peoples” (green , ). green makes this observation in a discussion of the concept of mandinguisation – the supposed spread of mandinka (or mandinga) language, influence and customs throughout the whole of upper guinea. as early as the sixteenth century, portuguese writers such as pacheco pereira described this process. for pereira and the portuguese who wrote for centuries after him, mandinka were superior to others in upper guinea. why? mandinka possessed a highly stratified society with identifiable elites, a formidable military, and a slave class.
and mandinka produced a considerable number of items for trade. they appeared to control an empire and through it, the portuguese thought, advanced the region economically, socially and politically (green , , ).x in the twentieth century, portugal, propaganda had it, did the same – elevated the region by making it part of its empire. as gilberto freyre wrote in , “from the th century onward, a new type of civilization commenced, for which a characterization as lusotropical is suggested.” freyre applied the term lusotropicalism or lusotropicology because, in his words, “the highest and most complex human knowledge… is that which for centuries has been expressed in european language.” “lusotropical civilization,” he continued, “is no more than this: a common culture and social order to which men and groups of diverse ethnic and cultural origins contribute by interpenetration and by accommodation to a certain number of behavioral uniformities of the european and his descendant and successor in the tropics” (freyre quoted in chilcote , – ). others wrote in the same vein. “we believe, therefore,” adriano moreira, said in , “that africa gained when we implanted there the ideas of state and of nation, which were alien to its people. we think it was of incalculable benefit to it that some of its territories were integrated within one political unit together with european peoples who could supply africa with what its peoples lacked and could not have obtained by themselves for a long time” (moreira quoted in chilcote , ). mandinguisation, lusotropicalism, lusophony. central to each of these is the idea that local knowledge, ways of doing things, and modes of living are inferior and need to be changed; and that societies can be elevated if they become part of something larger – an empire, a global society. 
core to them is a telling of history that assumes indigenous institutions have been unchanging, do not allow people to maintain peaceful relations with one another, and do not engage effectively with the outside world. that is, the tenets of each are rooted in a telling of history that people in guinea-bissau do not subscribe to. of course, freyre and moreira’s praise of past and present benefits of empire was challenged by a more persuasive argument – one that gave a different telling of history and a different account of the conditions in colonies. “your colonialist ancestors conquered guiné by force,” cabral wrote to portuguese living in portuguese guinea in . “they enslaved, they sold, they massacred, they dominated, and they exploited the people of guiné for five centuries. today in defense of certain portuguese and non-portuguese enterprises, the colonists persecute, arrest, torture, and massacre the people of guiné and cabo verde, who are fighting to reclaim the liberty and dignity of the people of guiné” (cabral quoted in chilcote , ). like lusotropicalismo, lusofonia is a dream; it is not rooted in a historical reality that people in guinea-bissau subscribe to. it credits portuguese influence in africa for the unity of people, and ignores the power and persistence of local cultures. its agenda is elite driven and assumes the inevitability of modernity and globalisation. with regard to guinea-bissau, it does not recognise that community and fellowship have long been fostered in a “fragmented cultural space” possessing many ethnic groups. in guinea-bissau, the story of the intersection of multiple cultures is told today, as it long has been, in languages born in africa. acknowledgements the authors thank antonio tomas and an anonymous reviewer for comments. notes i for more on how lusofonia has been conceptualised in scholarship and popular discourse, see michel cahen ( a). ii eduardo lourenço ( , ). for further debate on lusofonia, see c.a. faraco ( ); c.
cunha ( ); m. l. de carvalho armando ( ); f. dos santos neves ( ). iii cplp is also known as palop (países africanos da língua oficial portuguesa). iv a. Á. almada ( , ch. , fol. v), . “chamado pellos negros ho ganagoga q querdizer na lingua dos beafares homë q falla todas as linguas como de feito as fallam. e pode este homē atravesar todo o sertao do nosso guine de quaes quer negros que seja” [ganagoga in the beafada’s language means a man who speaks all languages, as they do, he can cross the whole of hinterland of our guinea and [talk] to whatever negroes there may be]. v l. silveira ( , ), “e me respondeu que hera filho de portugal, e que passaua de uinte annos q moraua ao pê daquella serra, em sitio tão escondido, e ratirado, pera q nimguem soubesse delle, o qual ueuia a ley dos gentios da terra, e tinha noue molheres e muitos filhos” [and he said to me that he was a son of portugal who had been living at the foot of that sierra, in a place closely concealed and very remote for more than twenty years, in order that no one should know about him. he lived there according to the heathens’ law of the land and had nine wives and many children]. vi for studies african-european cultural exchange in the period, see walter rodney ( ),; j. l. nafafé ( ); p. j. havik ( ). vii on the influence of judaism on the coast, peter mark and josé silva da horta ( ). viii forrest ( – ). the conflict between bissau guinean and caboverdean emerged as a result of constitutional dispute, in particular on the penal system which guinea-bissau has and cabo verde does not. see carlos lopes ( ). ix doka internacional, n.d.; intelectuais balantas na diáspora, n.d.. x ibid., , notes on contributors walter hawthorne is a professor of african history and chair of the history department. his areas of research specialization are upper guinea, the atlantic, and brazil. he is particularly interested in the history of slavery and the slave trade. 
much of his research has focused on african agricultural practices, religious beliefs, and family structures in the old and new worlds. his first book, planting rice and harvesting slaves: transformations along the guinea-bissau coast, – (heinemann: ), explores the impact of interactions with the atlantic, and particularly slave trading, on small-scale, decentralized societies. his most recent book, from africa to brazil: culture, identity, and an atlantic slave trade - (cambridge: ), examines the slave trade from upper guinea to amazonia brazil. he has published in a range of scholarly journals such as journal of african history, luso-brazilian review, slavery and abolition, africa, journal of global history, and american historical review. he is involved in digital scholarship and has partnered with matrix, msu’s digital humanities center, for a number of projects. he completed work on a british library-funded archival digitization project in the gambia. he has an ongoing national endowment for the arts project titled slave biographies: the atlantic database network, which is an online database with information about the identities of enslaved people in the atlantic world and is sponsored by the national endowment for the arts. in the works is another neh-sponsored project titled islam and modernity, for which he is developing a site for the publication of texts, images, interviews, and interpretive essays examining the practice of islam in west africa. dr lingna nafafé is a lecturer in portuguese and lusophone studies at the university of bristol, in the department of hispanic, portuguese and latin american studies.
his academic interests embrace a number of inter-related areas, linked by the overarching themes of: lusophone atlantic african diaspora, seventeenth and eighteenth century portuguese and brazilian history; slavery and wage-labour, - ; race, religion and ethnicity; luso-african migrants’ culture and integration in the northern (england) and southern europe (portugal and spain); ‘europe in africa’ and ‘africa in europe’; and the relationship between postcolonial theory and the lusophone atlantic. dr lingna nafafé joined the university of bristol in october from university of nottingham. he completed his phd at the university of birmingham. he lectured in the department of political science and international studies at the university of birmingham. he has previously taught in the school of education, the theology department, the european research institute, the sociology department and in the portuguese studies department at the university of birmingham. dr lingna nafafé has been awarded a two-year british academy small grant to undertake a research project on the integration of lusophone african migrants in northern and southern europe. the project investigates how migrants from angola, cape verde and guinea-bissau use bonding, bridging and linking social capital to find employment and achieve social mobility in the labour market. references afolabi, n. . the golden cage: regeneration in lusophone african literature and culture. trenton: africa world press. almada, a. Ás. . “tratadosbreue dosrejnos deguine docaboverde.” lisboa: biblioteca nacional de lisboa. almeida, j. c.a de. . um mez na guiné. lisbon: typographia universal. barreira, b. . “letter (january , ).” in jesuit documents, edited by p. e. h. hair. liverpool:university of liverpool. barros, m. . litteratura dos negros: contos, cantigas e parabolas. lisbon: typographia do commercio. barry, b. . senegambia and the atlantic slave trade. new york: cambridge university press. benson, c.. . 
“bilingual education in africa: an exploration of encouraging connections between language and girls’ schooling.” in education – a way out of poverty? edited by m. melin. stockholm: sida. benson, c.. . “trilingualism in guinea-bissau and the question of instructional language.” in trilingualism in family, school, and community, edited by c. hoffmann, and j. ytsma. clevedon: multilingual matters. boletim da sociedade da geographia de lisboa. . ( ): . brooks, g. e. . “a nhara of the guinea-bissau region: mãe aurélia correia.” in women in slavery in africa, edited by c. c. robertson, and m. a. klein. madison: the university of wisconsin press. brooks, g. e. . landlords and strangers: ecology, society, and trade in western africa, – . boulder, co: westview. brooks, g. e. . eurafricans in western africa: commerce, social status, gender, and religious observance from the sixteenth to the eighteenth century. athens: ohio university press. cabral, a. . unity & struggle, speeches and writings, texts selected by the paigc. london: heinemann educational. cahen, m. a. “‘portugal is in the sky’: conceptual considerations on communities, lusitanity, and lusophony.” in imperial migrations: colonial communities and diaspora in the portuguese world, edited by e. morier-genoud, and m. cahen, – . new york: palgrave macmillan. cahen, m. b. “is ‘portuguese-speaking africa comparable to ‘latin america’? voyaging in the midst of colonialities of power.” history in africa : - . carvalho armando, m. l. de. . “a perspectiva da lusofonia,” oraganon ( ): – . castles, s., and m. j. miller. . the age of migration. new york: macmillan. chilcote, r. h. . emerging nationalism in portuguese africa, documents. stanford: hoover institution press. coelho, f. a. lemos. . duas descrições seiscentistas da guiné, introdução e anotações históricas pelo damião peres. lisboa: edição da academia portuguesa de história. crowley, e. . 
“contracts with the spirits: religion, asylum and ethnic identity in the cacheu region of guinea-bissau.” phd. diss., yale university. crowley, e. . “institutions, identities, and the incorporation of immigrants within local frontiers of the upper guinea coast.” in migrations anciennes et peuplement actuel des côtes guinéenes, edited by g. gaillard. paris: l’harmattan. cunha, c. . uma política do idioma. rio de janeiro: templo brasileiro. davidson, j. . “plural society and interethnic relations in guinea-bissau.” in engaging cultural differences: the multicultural challenge in liberal democracies, edited by r. shweder, m. minow, and h. r. markus. new york: russell sage foundation. - . doka internacional n.d. http://dokainternacionaldenunciante.blogspot.co.uk/ accessed, september . faraco, c. a. . “a lusofonia: impasses e perspectives.” sociolinguistic studies ( ): – . forrest, j. . lineages of state fragility: rural civil society in guinea-bissau. athens: ohio university press. green, t. . the rise of the trans-atlantic slave trade in western africa, – . new york: cambridge university press. jørgen, c. . “migration in the age of involuntary immobility: theoretical reflections and cape verdean experiences.” journal of ethnic and migration studies ( ): – . hair, p. e. h. . “ethnolinguistic continuity on the guinea coast.” journal of african history ( ): - . hammond, r. j. . portugal and africa. stanford: stanford university press. havik, p. j. . silences and soundbites: the gendered dynamics of trade and brokerage in the pre-colonial guinea bissau region. munster: lit. hawthorne, w. . “the interior past of an acephalous society: institutional change among the balanta of guinea-bissau, c. – .” phd. diss., stanford university. hawthorne, w. . planting rice and harvesting slaves: transformations along the upper guinea coast, – . portsmouth, nh: heinemann. hawthorne, w. .
from africa to brazil: culture, identity and an atlantic slave trade, to . cambridge: cambridge university press. horta, j.a. n. da silva. . “evidence for a luso-african identity in ‘portuguese’ accounts on ‘guinea of cape verde’ (sixteenth-seventeenth centuries).” history in africa : – . intelectuais balantas na diáspora. n.d. http://tchogue.blogspot.co.uk/ accessed, july . ishemo, s. . “forced labour and migration in portugal’s african colonies.” in the cambridge survey of world migration, edited by r. cohen. cambridge: cambridge university press. knörr, j. and w. t. filho, eds. . the powerful presence of the past: integration and conflict along the upper guinea coast. leiden: brill. kohl, c. . “national integration in guinea-bissau since independence.” cadernos de estudos africanos : – . lemos coelho, f. a. . as duas descrições seiscentistas da guiné, introdução e anotações históricas pelo damião peres, lisboa: edição da academia portuguesa de história. lobban, richard. . historical dictionary of the republic of guinea-bissau and cape verde. metuchen, new jersey: londres. lopes, c. . guinea bissau, from liberation struggle to independent statehood. london: zed books. lourenço, e. . “errância e busca num imaginário lusófono.” in a nau de ícaro seguido de imagem e miragem da lusofonia. lisbon: gradiva. lüpke, friederike, . “multiple choice: language use and cultural practice in rural casamance between convergence and divergence.” in knörr, jaqueline and trajano filho, wilson, (eds.), creole languages and postcolonial diversity. oxford; new york: berghahn. mollien, g. . travels in the interior of africa. london: frank cass & co. monteiro, j.m. de souza. . “estudos sobre a guiné de cabo verde.” o panorama : - . mark, p. . the wild bull and the sacred forest: form, meaning, and change in senegambian initiation masks. cambridge: cambridge university press. mark, p., and j. s. da horta. .
the forgotten diaspora: jewish communities in west africa and the making of the atlantic world. cambridge: cambridge university press. nafafé, j.l.. . “guinea bissau: language situation.” in encyclopaedia of language and linguistics, edited by k. brown, - . london: elsevier. nafafé, j. l. . colonial encounters: issues of culture, hybridity and creolisation, portuguese mercantile settlers in west africa. frankfurt: peter lang. nafafé, j.l. . “african orality in iberian space: critique of barros and myth of racial discourse.” portuguese studies journal ( ): – . nafafé, j. l. . “flora gomes’s postcolonial engagement and redefinition of amílcar cabral's politics of national culture in nha fala.” hispanic research journal: iberian and latin american studies ( ): – . pélissier, r. . história da guiné ( – ). lisbon: imprensa universitária. pinto, j. t. . a ocupação militar da guiné. lisbon: agencia geral das colonias. rodney, w. . a history of the upper guinea coast. oxford: clarendon. santos, v. m.dos. . “lusofonia e projecção estratégica. portugal e a cplp.” nação defesa ( ): - . santos neves, f. dos. . para uma crítica da razão lusófona: onze teses sobre cplp e a lusofia. lisboa: edições universitárias lusófonas. scantamburlo, l. . “dicionário do guineense i. lisbon: faspebi. semedo, o. c., and m. c. ribeiro. . literaturas da guiné-bissau: cantando os escritos da história. lisboa: afrontamento. silveira, l., ed. . peregrinação de andré de faro à terra dos gentios. lisboa: officina tipographia portugal. valdez, f. t. . six years of a traveller’s life in western africa. london: hurst and blackett, publishers. ecrm -proceedings.pdf academic dishonesty: a preliminary researchers view shawren singh and john mendy university of south africa, florida, south africa university of lincoln, uk singhs@unisa.ac.za jmendy@lincoln.ac.uk doi: . /rm. . abstract: increasingly academe is facing the challenge of dealing with allegations of plagiarism and academic dishonesty. 
academic dishonesty plagues both the degree acquisition process and the publishing process. academic dishonesty within the university space has been shrouded in mystery, as many universities are not willing to break the code of silence. however, within the academic publishing space, several respectable journals have had to withdraw published papers citing academic dishonesty as a concern. at the core of academic dishonesty is the researcher and their perceptions of issues affecting academic dishonesty. the purpose of this research is to develop a better understanding of researchers’ attitudes to issues of academic dishonesty. this study is quantitative in nature and primary data in the form of responses to likert-scale questions were collected from developing researchers. the questionnaire data were statistically analysed, and a framework was developed to outline emerging researchers’ perceptions of academic dishonesty. key findings included that academic dishonesty is influenced by several issues such as academic pressure, electronic deterrents, writing challenges, outsourcing, data challenges, plagiarism, database challenges, and electronic sources. this is important because by better understanding researchers’ perceptions of academic dishonesty, ( ) appropriate training interventions can be implemented, ( ) higher quality research will be produced and ( ) research funding will not be wasted. keywords: perceptions of plagiarism, cheating, academic integrity, ghost writing, academic ethics, academic dishonesty . introduction academic dishonesty in the form of plagiarism, ghost-writing, or data fabrication has an indelible impact on the image of a university. for example, duke university recently agreed to pay back the us government $ . million to settle claims that the university’s researchers used fabricated data to attract several government grants (casadevall, ). it is not uncommon to find sensationalist media coverage of academic dishonesty (see exhibit a).
merely by being associated with a university that has been involved with academic dishonesty, all the academic staff appear to be guilty by association (molet et al., ). casadevall ( ) aptly points out, “this is a communal punishment for an institution where the overwhelming majority of scientists are honest, hard-working individuals seeking knowledge for the good of humanity.” with the increasing acceptance of digital scholarship (remenyi and susan, ), universities that are involved with or appear to be involved with less than acceptable practice are named and shamed. the internet is unforgiving, as these naming and shaming events stay on the internet in perpetuity, leaving a digital scar against the good name of the university. . background academic research is the process of adding something of value to the existing body of theoretical and practical knowledge in response to a question or series of questions. academic research follows a formal process which includes the establishment of an auditable research methodology to answer the research questions (remenyi, ). the methodological approach adopted by a researcher is sometimes prone to abuse: some researchers have used flawed research methods (w , ; w , ) or sophisticated data dredging techniques (head et al., ) to make their research appear more relevant. in the pursuit of relevance, the research becomes dishonest. an important characteristic of academic research is that the research needs to be presented in a manner that demonstrates a respectable level of scholarship on the part of the researcher(s) (remenyi, ). scholarship is displayed in two forms. these are academic writing and the appropriate use of research methodology, neither of which is a trivial task. the scholarship enterprise can fall victim to academic dishonesty. academic dishonesty can broadly be described as a form of cheating that occurs within the academic space.
academic dishonesty could include (but is not limited to): fabrication, deception, sabotage, bribery, collusion, improper use of information, communication and technology, and plagiarism (w , ; w , ). as a subset of academic dishonesty, plagiarism refers to the use of other people’s ideas and words without giving the original author appropriate acknowledgment (randall, ; clarke, ). if ideas are used in an essay or dissertation that have been found in the published work of another author(s), it is academic dishonesty not to specifically acknowledge the original source(s). it is important that the acknowledgment follows the rules of the referencing system employed in the work (singh and remenyi, ). interestingly, some point out that there is the issue of unintentional plagiarism when the researcher disregards accepted scholarly procedures (w , ). although the use of ideas without acknowledging them is an offense, it is even worse if the actual words of other authors are copied without acknowledgment (singh and remenyi, ). there are several grey areas that constitute academic dishonesty but are not adequately understood. academic dishonesty is influenced by several factors (see figure ), including: academic pressure, electronic deterrents, writing challenges, outsourcing, data challenges, plagiarism, database challenges, and electronic sources. each of these factors will be briefly discussed. figure : factors affecting academic dishonesty there are gaps in the literature about how researchers feel about plagiarism (lei and hu, ; mouton, ). there is a disproportionate number of pages published about students’ perceptions of and involvement with academic dishonesty. it is understandable that universities and academics approach the issue of academic dishonesty within their ranks cautiously. there is increasing academic pressure on individuals to “publish or perish” (dinis-oliveira and magalhães, ; grimes, bauch and ioannidis, ).
academics who are under-resourced find themselves under pressure to effectively manage tuition, research, academic citizenship and community engagement (cawood et al., ; santoso and cahaya, ). due to limited funding from governments and abused subsidy models, academics are treated as units of production in order to claim government subsidies (hedding, ). these ongoing sources of pressure have an impact on the quality of research that universities produce.

it has been argued that electronic deterrents can be used as a tool to reduce academic dishonesty. publishers (supak smolcic and simundic, ; kalnins, halm and castillo, ) and academics are using electronic deterrents to curb academic dishonesty. in a recent conversation with a senior professor, the professor erroneously claimed that “…we have solved the plagiarism problems, we use turn-it-in”. software deterrents to plagiarism are one tool in the academic’s arsenal; however, it must be noted that a tool like turn-it-in or ithenticate “does not detect plagiarism, but it does highlight matches in text between the article that has been uploaded” (lammey, ) and articles within the data repository. software can be used to reduce gross plagiarism (santoso and cahaya, ). however, any reasonable attempt to reduce academic dishonesty would require a joint initiative between academic publishers, editors (jarić, ) and researchers.

researchers, who are the custodians of the knowledge-generation process, may at times in their research careers face challenges when it comes to writing. it has been said that ‘writing is a full body contact sport’, and within the research writing space inadequate attention is paid to formal training for writing (aitchison, ). for example, research writing retreats require a high initial investment and many universities are shackled by limited resources, which results in academics taking longer to develop the required academic writing competence (kornhaber et al., ).
a further concern is that international journals are predominantly in english, posing a barrier for second-language english research writers (jeyaraj, ).

in a desperate effort to bridge some of the writing challenges, some researchers have attempted to outsource aspects of their writing, in some cases by using ghost-writers to assist with the writing of their research (singh and remenyi, ; sarwar and idris, ). ghost-writing is the practice of hiring a writer (or writers) to produce a piece of work that follows a predefined style, with none of the original writing credit attributed to the ghost-writer(s). detecting ghost-writing is difficult because the peer reviewer is not acquainted with the author’s writing style (singh and remenyi, ).

a further challenge for researchers relates to data. there are two issues here: data overload and false information. data overload comes in the form of the scientific and pseudo-scientific academic articles being published; some researchers argue that only a small fraction of these papers represents a contribution to the scientific body of knowledge. false information is represented by predatory and counterfeit journals (singh, ). researchers need to navigate the different data repositories to find respectable scientific papers.

plagiarism and its consequences are becoming increasingly complex (robinson-zañartu et al., ) and difficult to identify. there are gaps in the literature regarding the factors that force some researchers to commit acts of plagiarism, partly due to the disproportionate level of research focusing on student perceptions of plagiarism (husain, al-shaibani and mahfoodh, ) rather than researcher perceptions.
it may be argued that researchers understand the consequences of plagiarism and therefore there is no need for research in this area, or that a plagiarist has no reason to further expose their universities and/or themselves.

like any type of technology, academic databases are constantly changing. to adequately search the different databases, researchers are required to understand their interfaces. this is not an easy task, as each database has a distinct vocabulary and interface (singh, ). the complexity of the database interface affects the literature review journey.

increasingly, the extant literature has become electronic. the search for literature takes the researcher along two paths: the traditional academic publishing path and the open access academic publishing path. within these spaces, it is estimated that there are over million published academic articles (jinha, ), and this number is growing. these articles are housed in special databases. ulrichsweb is a library directory that provides information on active academic journals, and there appear to be databases and online databases. the gale directory of databases claims to cover more than databases. this large amount of data poses a challenge to researchers (singh, ) and to emerging researchers, who can easily be overwhelmed by the vastness of the literature.

. methodology

when undertaking any research, it is prudent to have an acceptable research strategy (myers, ; yin, ). figure outlines the strategy adopted in this research. there are three phases in this research: phase , understanding aspects of the literature; phase , exploring researcher perceptions; and phase , future data collection and analysis. only phase and phase will be reported upon in this paper. phase of the research is qualitative in nature.
it was important to use a qualitative approach in this phase of the research because the researchers wanted to develop a better quantitative understanding of researchers’ perceptions of issues affecting academic dishonesty. phase constituted a review of the extant literature and a brainstorming session, in order to develop a questionnaire focused on issues that affect academic dishonesty. in this research, a -point likert scale questionnaire was used for data collection, which is an acceptable approach (sachdev and verma, ; bouranta, chitiris and paravantis, ). likert scales were used because the literature suggests that -point scales are less confusing to understand, can increase response rates and are easy for respondents to use (babakus and mangold, ; devlin, dong and brown, ). the -point likert response format ranged from “strongly agree = ” to “strongly disagree = ”. the questionnaire was piloted and refined accordingly. the final version of the instrument was a one-page questionnaire comprising sections: a section for demographic data, items on a -point likert scale and a section for comments. in phase of this research, the questionnaire was administered, and the data was collected and then analysed.

figure : research approach

. the data collection instrument

the questionnaire comprised items (see table ), classified as follows: research pressures, electronic deterrent, writing challenges, outsourcing, data challenges, plagiarism, database, and electronic sources. for each question, respondents had the options strongly agree, agree, undecided, disagree and strongly disagree.

table : questionnaire instrument

question    categories            likert scale items
question .  pressure              i feel pressured to publish.
question .                        i feel pressured to publish within a shorter time frame.
question .                        i feel pressured by my line manager to publish.
question .                        i feel pressured to do other university activities such as administration or community engagement
question .                        i do not understand the academic review process.
question .  electronic deterrent  turn-it-in is a useful tool for me to avoid plagiarism.
question .                        ithenticate is a useful tool for me to avoid plagiarism.
question .  writing challenges    i find academic writing challenging.
question .                        i have received insufficient training in academic writing.
question .  outsourcing           it is ok to hire a third party to collect my data.
question .                        it is ok to hire a professional to write aspects of my research.
question .                        it is ok to crowdsource aspects of my literature.
question .                        it is ok to hire professional academic writing services to assist me write.
question .  data challenges       i feel overwhelmed with the amount of data that i must manage.
question .                        librarians are key academic resources.
question .                        i find it difficult to identify false information.
question .  plagiarism            i have received insufficient training in anti-plagiarism.
question .                        copying others’ work without citing them constitutes plagiarism.
question .                        there are serious consequences if i violate plagiarism policy.
question .                        copying my own submitted work does not constitute plagiarism.
question .  database              i do not understand how to use academic databases.
question .                        the language used to search academic databases is hard to learn.
question .                        the academic database interface is complicated.
question .  electronic sources    google scholar is a legitimate academic resource.
question .                        i do not trust open access journals.
question .                        i only trust the established academic publishing companies i.e. elsevier, springer, wiley-blackwell, taylor & francis and sage

the scores for questions , , , , , and were reversed as they were stated in the negative.

. the sample

the selection of appropriate informants for any academic research is a challenging and time-consuming task for a researcher.
the informants were selected only from public higher education institutions in south africa. a total of informants provided data for this study. table provides a summary of the characteristics of the informants that participated in this research.

table : characteristics of the sample

                                                                total
age                    at most years
                       - years
                       > years
                       did not answer the question
gender                 male
                       female
                       i prefer not to answer this question
                       did not answer the question
type of employment     fulltime
                       not fulltime
                       did not answer the question
years of experience    at most years
                       - years
                       > years
                       did not answer the question
researcher experience  emerging
                       intermediate
                       developed
                       established
                       did not answer the question

the research population for the study was academics who are involved in research activities. an anonymous paper-based questionnaire was distributed to academics within the lead researcher’s community of practice who are involved with research and supervision.

. data analysis

the purpose of this research was to develop a better understanding of the issues that affect academics’ perceptions towards academic dishonesty. the first step was to test the reliability of the questionnaire. a reliability analysis was carried out on the instrument comprising items. the cronbach’s alpha showed the instrument to reach acceptable reliability, α = . . the statements were then ranked by the mean value.

table : statements ranked by mean value

rank  statement  minimum  maximum  mean  std. deviation
. q . copying others’ work without citing them constitutes plagiarism. . .
. q . there are serious consequences if i violate plagiarism policy. . .
. q . librarians are key academic resources. . .
. q . i feel pressured to do other university activities such as administration or community engagement . .
. q . google scholar is a legitimate academic resource. . .
. q . i find academic writing challenging. . .
. q . turn-it-in is a useful tool for me to avoid plagiarism. . .
. q . i feel pressured to publish. . .
. q . i feel pressured to publish within a shorter time frame. . .
. q . i only trust the established academic publishing companies i.e. elsevier, springer, wiley-blackwell, taylor & francis and sage . .
. q . i feel overwhelmed with the amount of data that i must manage. . .
. q . i feel pressured by my line manager to publish. . .
. q . ithenticate is a useful tool for me to avoid plagiarism. . .
. q . i find it difficult to identify false information. . .
. q . i have received insufficient training in academic writing. . .
. q . i have received insufficient training in anti-plagiarism. . .
. q . i do not understand the academic review process. . .
. q . it is ok to crowdsource aspects of my literature. . .
. q . the academic database interface is complicated. . .
. q . i do not trust open access journals. . .
. q . it is ok to hire a third party to collect my data. . .
. q . the language used to search academic databases is hard to learn. . .
. q . copying my own submitted work does not constitute plagiarism. . .
. q . it is ok to hire professional academic writing services to assist me write. . .
. q . i do not understand how to use academic databases. . .
. q . it is ok to hire a professional to write aspects of my research. . .

the next step in the analysis was to apply levene’s statistic to test homogeneity of variance for the different categories, as illustrated in table . in the context of this study, the researchers wanted to investigate whether the respondents had the same attitudes to issues affecting plagiarism. because all p values are > . , the variance can be assumed to be homogeneous.

table : levene's test for equality of variances (p values)

   category              gender  type of employment  type of researcher
.  pressure              .       .                   .
.  electronic deterrent  .       .                   .
.  writing challenges    .       .                   .
.  outsourcing           .       .                   .
.  data challenges       .       .                   .
.  plagiarism            .       .                   .
.  database              .       .                   .
.  electronic sources    .       .                   .
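The reliability and homogeneity checks described in this section can be illustrated with a short sketch. It is not the authors' analysis code (the paper does not say which software was used); it is a minimal standard-library Python example under stated assumptions: responses are coded 1 to 5 (inferred from the five response options offered for each question), negatively stated items are flipped with `scale_max + 1 - score`, and Levene's W is computed from absolute deviations around each group mean (median-centred variants also exist). The data values and the `NEGATIVE_ITEMS` position are made up for illustration only.

```python
from statistics import mean, variance

SCALE_MAX = 5  # assumption: five response options coded 1..5


def reverse_score(score, scale_max=SCALE_MAX):
    """Flip a negatively worded item so that 1 <-> 5, 2 <-> 4, 3 stays 3."""
    return scale_max + 1 - score


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one complete list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def levene_w(groups):
    """Mean-centred Levene W statistic for two or more groups of scores."""
    # Absolute deviation of every score from its own group mean.
    devs = [[abs(x - mean(g)) for x in g] for g in groups]
    n = sum(len(d) for d in devs)
    k = len(devs)
    grand = sum(sum(d) for d in devs) / n
    between = sum(len(d) * (mean(d) - grand) ** 2 for d in devs)
    within = sum((z - mean(d)) ** 2 for d in devs for z in d)
    return (n - k) / (k - 1) * between / within


# Made-up responses: four respondents, three items.
responses = [[4, 2, 5], [5, 1, 4], [3, 3, 3], [4, 2, 4]]
NEGATIVE_ITEMS = {1}  # hypothetical position of a negatively stated item

scored = [
    [reverse_score(s) if i in NEGATIVE_ITEMS else s for i, s in enumerate(row)]
    for row in responses
]
alpha = cronbach_alpha(scored)

# Made-up category scores split by a grouping variable such as gender.
w = levene_w([[1, 2, 3], [2, 4, 6]])
```

Turning W into a p value requires the F distribution with (k - 1, N - k) degrees of freedom, which the standard library does not provide; a statistics package would normally handle that step (for example, SciPy offers `scipy.stats.levene`).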
finally, a one-way anova test was conducted using age as the grouping variable to investigate whether there is a statistically significant difference between group means. for all categories the significance values are greater than . , except for database, where p = . , which is below . . therefore, there is a statistically significant difference in the mean for the category of database.

table : summary of one-way anova – age

anova                 source          sum of squares  df  mean square  f  sig.
pressure              between groups  .                   .            .  .
                      within groups   .                   .
                      total           .
electronic deterrent  between groups  .                   .            .  .
                      within groups   .                   .
                      total           .
writing challenges    between groups  .                   .            .  .
                      within groups   .                   .
                      total           .
outsourcing           between groups  .                   .            .  .
                      within groups   .                   .
                      total           .
data challenges       between groups  .                   .            .  .
                      within groups   .                   .
                      total           .
plagiarism            between groups  .                   .            .  .
                      within groups   .                   .
                      total           .
database              between groups  .                   .            .  .
                      within groups   .                   .
                      total           .
electronic sources    between groups  .                   .            .  .
                      within groups   .                   .
                      total           .

. discussion

when the statements were ranked by mean value, it is interesting to note that academics are aware of gross plagiarism and the consequences of plagiarism (rank and ). academics agree that librarians are key assets in the academic enterprise (rank ). academics feel pressured to be involved with administration or community engagement (rank ), and academics see google scholar as a legitimate academic resource (rank ). academics seeing google scholar as a legitimate academic resource is a concern because google scholar only indexes academic papers; it does not test the veracity of the peer review process or the credibility of the claims made in these papers. levene’s statistic indicated that the variance can be assumed to be homogeneous; that is, the spread of responses was similar across the groups compared. finally, the one-way anova by age indicated that the database category differs between age groups.
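The quantities in the one-way ANOVA summary (sum of squares, df, mean square and F for the between-groups, within-groups and total rows) can be sketched as follows. This is a minimal standard-library Python illustration, not the authors' procedure; the function name `one_way_anova`, the three age bands and all scores are assumptions made up for the example.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the rows of a one-way ANOVA table for a list of score groups."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "between": (ss_between, df_between, ms_between),
        "within": (ss_within, df_within, ms_within),
        "total": (ss_between + ss_within, n - 1),
        "f": ms_between / ms_within,
    }


# Made-up category scores for three hypothetical age bands.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The significance value is then read from the F distribution with the between-groups and within-groups degrees of freedom; that lookup is outside the standard library, so in practice a statistics package would be used (SciPy's `scipy.stats.f_oneway` is one option).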
further investigation is required as to the nature of these differences between the groups. this preliminary research confirms ongoing concerns about academic dishonesty (singh, , ; singh and remenyi, ; casadevall, ).

. limitations of this study

this study has two limitations. the first is that data was only collected from informants in the public sector higher education space in south africa; no special effort was made to collect data from private higher education institutes in south africa. the second limitation is the sample size of : it is not possible to conduct sophisticated statistical analysis, such as factor analysis (maccallum et al., ; mundfrom, shaw and ke, ), with a sample of informants. however, this preliminary data does give us insight into how these researchers perceive the issues related to plagiarism.

. conclusion

the initial findings of this research indicate that academic dishonesty is complex and affected by several factors. the surveyed informants are aware of the issues related to academic dishonesty. in summary, as outlined in figure , the surveyed researchers are aware of ( ) the negative effects of plagiarism and ( ) the value of key stakeholders in managing the data challenge issues. researchers acknowledge that they feel increasing ( ) academic pressure and follow the path of least resistance when it comes to sourcing academic literature by using ( ) electronic sources. researchers find it difficult to ( ) write and have unrealistic faith in ( ) electronic deterrents to protect them from plagiarism. researchers acknowledge that ( ) outsourcing aspects of their research to outside parties is a form of academic dishonesty and, finally, researchers have a respectable understanding of ( ) academic databases.
figure : factors affecting academic dishonesty revised

it is interesting to note that one of the anonymous reviewers pointed out that “perhaps the paper should address types of policies that a university could put in place to ensure that dishonesty is minimised”. universities have moved from a self-policing system to a policy-driven system to discourage academic dishonesty. however, the same reviewer pointed out that “of course there is the problem inherent in the system which i face some years ago when i asked for a plagiarism check on some work that i was examining but i was told that my request for this plagiarism test could be interpreted as impugning the integrity of the student.” policies are only as good as people’s acceptance of these policies. academic dishonesty is a challenge that cannot be driven away solely by policy but, probably, by a combination of academic attitudes and policy. respectable research is generally recognised by peers. research methodology and academic writing cannot be regulated: they can be used either to honestly support the research endeavour or to dishonestly prop up research.

references

aitchison, c. ( ) ‘writing the practice/practise the writing: writing challenges and pedagogies for creative practice supervisors and researchers’, educational philosophy and theory. routledge, ( ), pp. – . doi: . / . . .
babakus, e. and mangold, w. g. ( ) ‘adapting the servqual scale to hospital services: an empirical investigation’, health services research, ( ), pp. – . available at: http://www.ncbi.nlm.nih.gov/pubmed/ and http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=pmc .
bouranta, n., chitiris, l. and paravantis, j. ( ) ‘the relationship between internal and external service quality’, international journal of contemporary hospitality management, ( ), pp. – . doi: . / .
casadevall, a. ( ) ‘huge misconduct fine is a reminder to reward rigour’, nature, (april), p. .
cawood, f. et al.
( ) ‘a perspective on university academic workload measurement’, ( ), pp. – .
clarke, r. ( ) ‘plagiarism by academics: more complex than it seems’, journal of the association for information systems, ( ).
devlin, s. j., dong, h. k. and brown, m. ( ) ‘selecting a scale for measuring quality’, marketing research, ( ), pp. – .
dinis-oliveira, r. j. and magalhães, t. ( ) ‘the inherent drawbacks of the pressure to publish in health sciences: good or bad science’, f research, ( ), p. . doi: . /f research. . .
grimes, d. r., bauch, c. t. and ioannidis, j. p. a. ( ) ‘modelling science trustworthiness under publish or perish pressure’, royal society open science, ( ). available at: http://dx.doi.org/ . /rsos. .
head, m. l. et al. ( ) ‘the extent and consequences of p-hacking in science’, plos biology, ( ), pp. – . doi: . /journal.pbio. .
hedding, d. w. ( ) ‘payouts push professors towards predatory journals’, nature, (january), p. .
husain, f. m., al-shaibani, g. k. s. and mahfoodh, o. h. a. ( ) ‘perceptions of and attitudes toward plagiarism and factors contributing to plagiarism: a review of studies’, journal of academic ethics, ( ), pp. – . doi: . /s - - - .
jarić, i. ( ) ‘high time for a common plagiarism detection system’, scientometrics, ( ), pp. – . doi: . /s - - - .
jeyaraj, j. j. ( ) ‘social sciences & humanities improving academic writing standard: a challenge for universities’, journal of language studies, ( ), pp. – . available at: http://doi.org/ . /gema- - - .
jinha, a. e. ( ) ‘learned publishing’, learned publishing, ( ), pp. – .
kalnins, a. u., halm, k. and castillo, m. ( ) ‘screening for self-plagiarism in a subspecialty-versus-general imaging journal using ithenticate’, american journal of neuroradiology, ( ), pp. – . doi: . /ajnr.a .
kornhaber, r. et al.
( ) ‘the benefits and challenges of academic writing retreats: an integrative review’, higher education research and development. taylor & francis, ( ), pp. – . doi: . / . . .
lammey, r. ( ) ‘crossref developments and initiatives: an update on services for the scholarly publishing community from crossref’, science editing, ( ), pp. – . doi: . /kcse. . . .
lei, j. and hu, g. ( ) ‘chinese esol lecturers’ stance on plagiarism: does knowledge’, (january), pp. – . doi: . /elt/cct .
maccallum, r. c. et al. ( ) ‘sample size in factor analysis’, psychological methods, ( ), pp. – . doi: . / - x. . . .
molet, m. et al. ( ) ‘guilt by association and honor by association: the role of acquired equivalence’, pp. – . doi: . /s - - - .
mouton, j. ( ) ‘the extent of south african authored articles in predatory journals’, ( ), pp. – .
mundfrom, d. j., shaw, d. g. and ke, t. l. ( ) ‘minimum sample size recommendations for conducting factor analyses’, international journal of testing, ( ), pp. – . doi: . /s ijt .
myers, m. d. ( ) ‘qualitative research in business & management’. los angeles: sage.
randall, m. ( ) pragmatic plagiarism: authorship, profit, and power. university of toronto press.
remenyi, d. ( ) dictionary of research concepts and issues. reading, uk: academic conferences and journals international limited.
remenyi, d. and susan, g. ( ) social media and digital scholarship for academic research: a user’s guide. reading, uk: academic conferences and publishing international.
robinson-zañartu, c. et al. ( ) ‘academic crime and punishment: faculty members’ perceptions of and responses to plagiarism’, school psychology quarterly, ( ), pp. – . doi: . /scpq. . . . .
sachdev, s. b. and verma, h. v. ( ) ‘relative importance of service quality’, journal of services research, ( ), pp. – .
santoso, a. and cahaya, f. r. ( ) ‘factors influencing plagiarism by accounting lecturers’, accounting education, pp. – . doi: . / . . .
sarwar, s. and idris, z. m.
( ) ‘paid academic writing services: a perceptional study of business students’, (july). doi: . /ijelcs.v i . .
singh, s. ( ) ‘the zombie doctorate’, in ecrm -proceedings of the th european conference on research methods : ecrm . academic conferences limited, p. .
singh, s. ( ) ‘the art of being a polycephalic researcher’, in ecrm th european conference on research methods in business and management. academic conferences and publishing limited, p. .
singh, s. and remenyi, d. ( ) ‘plagiarism and ghostwriting: the rise in academic misconduct’, south african journal of science. academy of science of south africa, ( – ), pp. – .
supak smolcic, v. and simundic, a.-m. ( ) ‘biochemia medica has started using the crosscheck plagiarism detection software powered by ithenticate’, biochemia medica, ( ), pp. – . doi: . /bm. . .
w ( ) what is academic dishonesty?, berkeley city college. available at: https://www.berkeleycitycollege.edu/wp/de/what-is-academic-dishonesty/ (accessed: may ).
w ( ) plagiarism & academic integrity: types of academic dishonesty, st. petersburg college. available at: https://spcollege.libguides.com/c.php?g= &p= #disruptivebehavior (accessed: may ).
w ( ) plagiarism tutorial, duke university. available at: https://plagiarism.duke.edu/unintent/ (accessed: may ).
w ( ) flawed research methods exaggerate the prevalence of depression, canadian medical association journal, sciencedaily. available at: www.sciencedaily.com/releases/ / / .htm (accessed: may ).
w ( ) flaws in popular research method exposed, university of leicester, sciencedaily. available at: www.sciencedaily.com/releases/ / / .htm (accessed: may ).
yin, r. k. ( ) ‘qualitative research from start to finish’. new york: the guilford press.

peachtree street, suite
atlanta, ga
www.librarypublishing.org
sarah@educopia.org

library publishing directory
edited by sarah k.
lippincott

cc by . by library publishing coalition
 - - - - (epdf)

contents

foreword
introduction
library publishing coalition subcommittees
reading an entry
libraries in the united states and canada
  arizona state university
  auburn university
  boston college
  brigham young university
  brock university
  cal poly, san luis obispo
  california institute of technology
  california state university san marcos
  carnegie mellon university
  claremont university consortium
  colby college
  college at brockport, suny
  college of wooster
  columbia university
  connecticut college
  cornell university
  dartmouth college
  duke university
  emory university
  florida atlantic university
  florida state university
  georgetown university
  georgia state university
  grand valley state university
  gustavus adolphus college
  hamilton college
  illinois wesleyan university
  indiana university
  johns hopkins university
  kansas state university
  loyola university chicago
  macalester college
  mcgill university
  miami university
  mount saint vincent university
  northeastern university
  northwestern university
  oberlin college
  ohio state university
  oregon state university
  pacific university
  pennsylvania state university
  pepperdine university
  portland state university
  purdue university
  rochester institute of technology
  rutgers, the state university of new jersey
  simon fraser university
  state university of new york at buffalo
  state university of new york at geneseo
  syracuse university
  temple university
  texas tech university
  thomas jefferson university
  trinity university
  tulane university
  université de montréal
  university of alberta
  university of arizona
  university of british columbia
  university of calgary
  university of california, berkeley
  university of california system
  university of central florida
  university of colorado anschutz medical campus
  university of colorado denver
  university of florida
  university of georgia
  university of guelph
  university of hawaii at manoa
  university of idaho
  university of illinois at chicago
  university of iowa
  university of kansas
  university of kentucky
  university of maryland college park
  university of massachusetts amherst
  university of massachusetts medical school
  university of michigan
  university of minnesota
  university of nebraska-lincoln
  university of north carolina at chapel hill
  university of north carolina at charlotte
  university of north carolina at greensboro
  university of north texas
  university of oregon
  university of pittsburgh
  university of san diego
  university of south florida
  university of tennessee
  university of texas at san antonio
  university of toronto
  university of utah
  university of victoria
  university of washington
  university of waterloo
  university of windsor
  university of wisconsin–madison
  utah state university
  valparaiso university
  vanderbilt university
  villanova university
  virginia commonwealth university
  virginia tech
  wake forest university
  washington university in st. louis
  wayne state university
  western university
libraries outside the united states and canada
  australian national university
  edith cowan university
  humboldt-universität zu berlin
  monash university
  swinburne university of technology
  university of hong kong
  university of south australia
library publishing coalition strategic affiliates
platforms, tools, and service providers
personnel index

foreword

martin halbert (university of north texas), james mullins (purdue university), and tyler walters (virginia tech)

in january , we officially launched the library publishing coalition (lpc) project, a collaborative initiative that now involves academic libraries committed to advancing the emerging field of library publishing. as this new service area matures and expands, we have seen a clear need for knowledge sharing, collaboration, and development of common practices.
the lpc is helping this field move forward in a number of key ways, but we are particularly proud to publish the first edition of the library publishing directory, a guide to the publishing activities of academic libraries. in documenting the breadth and depth of activities in this field, this resource aims to articulate the unique value of library publishing; to establish it as a significant and growing community of practice; and to raise its visibility within a number of stakeholder communities, including administrators, funding agencies, other scholarly publishers, librarians, and content creators. collecting this rich set of data from libraries across the united states and canada allows us to identify themes, challenges, and trends; make predictions about future directions; and position the library publishing coalition to better meet the needs of this community. the directory also advances one of the central goals of the library publishing coalition, to facilitate and encourage collaboration among libraries as well as among libraries and publishers that share their values, especially university presses and learned societies. we hope that libraries will use the directory to learn about their peers, find mutually beneficial ways to work together, and ultimately improve their practices and enhance the value they provide to their campuses. we hope that presses will see opportunities to initiate new partnerships or expand existing ones. the lpc is a community-driven project that relies on the hard work and expertise of representatives from our participating institutions. we could not have produced the directory without the support of the directory subcommittee. 
we would like to thank marilyn billings (university of massachusetts-amherst), stephanie davis-kahl (illinois wesleyan university), adrian ho (university of kentucky), holly mercer (university of tennessee), elizabeth smart (brigham young university), shan sutton (oregon state university), allegra swift (claremont university consortium), beth turtle (kansas state university), and charles watkinson (purdue university) for their invaluable contributions.

we also are grateful for the generous support of purdue university libraries’ scholarly publishing services unit, which donated resources, staff time, and the expertise of alexandra hoff and managing editor katherine purple; lightning source, who donated print-on-demand services; and the charlesworth group for conversion to ebook formats.

finally, we would like to thank the libraries that took the time to help us to better understand, promote, and assert the significance of library publishing initiatives by providing information for this directory. the libraries listed in this first edition demonstrate the tremendous interest and energy in this field. we look forward to continuing to watch and document library publishing services as they evolve and progress in the coming years.

introduction

sarah k. lippincott, katherine skinner, and charles watkinson

we are so pleased to share with our readership this first library publishing directory, produced by the library publishing coalition (lpc) in our organization’s inaugural year of work. this directory intends to make visible the innovation, support, and services offered today by a broad range of academic libraries in the area of scholarly communications. herein, we begin to document the strategic investments university libraries around the world are making in the area of academic publishing.
once believed to be a one-off activity subsidized by a small number of libraries, “library publishing” today is evolving into a dynamic subfield in the academic publishing ecosystem.
why publish a directory?
for more than two decades, faculty, researchers, and students have come to their college and university libraries to gain technical support and staffing for early experiments in digital scholarship. from hosting ejournals and electronic theses and dissertations (etds) to collaborating with teams of researchers to construct multimedia experiences, these libraries have been willing and able partners in this academic mission of creating and disseminating scholarship. by , these library-based activities began to formalize, as documented in two key reports: ithaka s&r’s university publishing in a digital age and arl’s research library publishing services: new options for university publishing. subsequent studies reinforced the importance of these emerging library-based publishing endeavors. as demonstrated by the seminal library publishing services: strategies for success report, publishing services are now thriving across the whole range of academic libraries, from small liberal arts colleges to premier research institutions. this growth of library publishing activities provided the impetus and rationale for creating the lpc to help advance this subfield for u.s. and canadian academic libraries. hosted by the educopia institute, and driven by academic libraries, the lpc project ( – ) is now founding this new organization. its mission is to promote the development of innovative, sustainable publishing services in academic and research libraries to support scholars as they create, advance, and disseminate knowledge. as a key part of this work, the lpc seeks to document practices and services in the field, and to foster strategic alliances and connections both across and between libraries and other academic publishers.
the lpc created this directory to begin to answer the many questions the project team had about the publishing activities currently underway in libraries. how many libraries define their scholarly communications activities as “publishing”? how long have they been doing this work? with whom do they partner? what types of publications are they producing? are libraries offering specific products and/or services to their campuses? what percentage of their publications are peer reviewed? how many staff members are working on this activity, and how are they funding their activities? are there identifiable models and trends in this subfield of publishing today? with these and other questions in mind, the lpc directory subcommittee built and disseminated an internet-based survey in spring , targeting north american listservs for academic libraries. we focused on north america for scoping reasons: we knew we could not hope to chronicle global work in full, and so began with this smaller-but-significant subset of activity. we intentionally structured this directory to encompass institutions beyond the lpc itself, inviting any institution engaged in library publishing to participate. we received more than responses to this survey. in the following pages, we include directory listings for all institutions that responded, grouping the north american institutions first (our primary target) and programs outside the u.s. and canada next. using the survey data, the lpc directory subcommittee assembled the directory entries, shared each one with its institutional representative for editing and approval, and then published it herein. we greatly appreciate all those who gave their time and energy to help us document the efforts of their individual libraries. notably, the only institutions listed here are those that responded to our survey. undoubtedly, many important programs have been missed in this first edition.
we hope that those we have missed will contact lpc (sarah@educopia.org) so we can ensure these institutions are included in future editions. the library publishing directory contributes directly to the lpc’s goal of encouraging collaboration by allowing library publishing staff, who have traditionally had relatively little contact with each other, to identify colleagues producing scholarly work in similar disciplines or using the same technology platform. the directory also is intended to open the way to collaboration with other publishers, especially mission-driven non-profit university presses and learned societies, by introducing and articulating the unique and complementary approach that libraries take to the publishing function. finally, it is hoped that the directory can help scholarly authors to become more aware of the opportunities that may exist on their own campuses or in their disciplines to experiment with new publication formats or business models. we highlight below some of the exciting library publishing trends and models we see emerging in this first directory of activity. together the answers provide a rich picture of what types of product libraries are creating and what technological, financial, and human resources they are using.
library publishing today
individually, the directory entries reveal much about local practices, including the mission driving an institution’s activities, the funding models and staffing supporting its work, the relationship between publication and preservation, and the type and quantity of publications produced. collectively, these entries say far, far more. last year, the libraries profiled in this directory published faculty-driven journals, student-driven journals, monographs, at least , conference papers and proceedings, and nearly , each of etds and technical/research reports.
these publications covered an array of disciplines, including law, agriculture, history, education, computer science, and many, many others. thirty-three libraries report disciplinary specialties in the social sciences and area, ethnic, cultural, and gender studies (a broad classification that includes a range of interdisciplinary specialties). education ( libraries), health and clinical sciences ( ), and the general humanities ( ) are also particularly well-represented areas. faculty-driven journals were the most common publication reported by these libraries. over % of the libraries in this directory published at least one in and over half ( %) published at least one student-driven journal. thirty-six percent produced at least one monograph, and more than three-quarters published etds. more than half reported publishing data, audio, and video, in addition to text and images. currently, there is no single, dominant model for the organization of publishing services. in many institutions, services are distributed across multiple library units or across campus. the lead unit varies across libraries (e.g., scholarly communications, technical services, and even special collections). library publishing programs featured in this directory range from small, experimental endeavors to large, more mature operations with several dedicated staff members. libraries reported between . and eight full-time equivalents (fte) in library staff, and many also reported employing graduate ( %) and undergraduate ( %) students. across these libraries, the most prominent services are building, implementing, maintaining, and supporting publishing platforms for authors. in this work many report using full-service digital platforms, including public knowledge project’s ojs/ocs/omp suite ( %), bepress’s digital commons platform ( %), and dspace ( %)—the top three for respondents.
however, many also report developing software locally ( %) and/or using a content management system like wordpress ( %) for dissemination and delivery. more than three-quarters of respondents said that they provide a broader range of services, including metadata ( %), analytics ( %), outreach ( %), doi assignment ( %), audio/video streaming ( %), and issn registration ( %). a substantial number of these libraries also provide support for editorial and production processes. these include peer review management ( %), copyediting ( %), and print-on-demand ( %). finally, some libraries support business model development ( %), budget preparation ( %), and contract and license preparation ( %). other services offered, such as author advisory on copyright ( %), build upon librarians’ strengths as educators and advocates. a hallmark of library publishing, as is repeatedly highlighted in individual directory entries, has been the building of partnerships with content creators and other publishers on and off campus. faculty, students, and other authors typically provide the editorial leadership for library publications. over % of libraries featured herein report that they have relationships with campus departments or programs; % partner with individual faculty; and over half work with graduate and undergraduate students. many of the libraries in this directory report that they work with or have administrative ties to university presses. off-campus partners include scholarly societies, non-profit organizations, museums, library networks and consortia, and individual faculty at other institutions.
despite the different forms library publishing activities have assumed to date, the directory demonstrates that these programs share a growing commonality of philosophy and approach combining traditional library values and skills (such as a concern with long-term preservation, expertise in the organization of information, and commitment to widening access) with lightweight digital workflows to create a distinctive “field” of publishing activity. the libraries in this directory overwhelmingly prefer open access publication ( % focus mostly or completely on open access). and although % of libraries rely in part or completely on their library’s operating budget to support publishing services, notably, % do not. among those libraries that are subsidizing these activities, the operating budget is contributing an average of % of the publishing budget. looking toward the future, many of the libraries in this directory report that they plan to increase the numbers and types of publications they produce, support an expanded suite of services (particularly in areas like data management), identify new partners within and beyond campus, and make improvements to software and workflows.
the future of library publishing
as libraries undertake the improvement and expansion of services, they will continue to confront a difficult and rapidly changing landscape. building capacity, sustaining services, and securing funding will require concerted efforts to demonstrate value and improve business models. raising credibility and visibility on campus and within the broader scholarly communications community will also require individual and collective efforts. libraries will need to convince campus administrators, university presses, librarians, commercial publishers, and content creators that library publishing is an important, strategic, purposeful service area that adds value to the publishing ecosystem.
perhaps most important, libraries will need to cultivate and strengthen their relationships with other scholarly publishers—including university presses, scholarly societies, and commercial publishers—to build our collective capacity, extend the reach of scholarship, and ensure that the scholarly communication apparatus continues to evolve in pace with the research and knowledge produced across academia. this library publishing directory tells a compelling story, one that we believe needs dissemination in its own right. we look forward to seeing these networks continue to build upon the work they have done. we hope the directory will help existing and prospective library publishers identify new partners and learn from the experiences of their colleagues. and, of course, we hope to see the nexus of activity represented here continue to expand in the years ahead.
library publishing coalition subcommittees
the following subcommittee members have donated their time and expertise to advancing the library publishing coalition’s mission and producing its most significant resources.
program subcommittee
the program subcommittee bears primary responsibility for planning and implementing the library publishing forum.
sarah beaubien (grand valley state university); dan lee (university of arizona); mark newton (columbia university); melanie schlosser (ohio state university); marcia stockham (kansas state university); allegra swift (claremont university consortium); evviva weinraub (oregon state university)
directory subcommittee
the directory subcommittee provides support for the design and creation of the library publishing directory.
marilyn billings (university of massachusetts-amherst); stephanie davis-kahl (illinois wesleyan university); adrian ho (university of kentucky); holly mercer (university of tennessee); elizabeth smart (brigham young university); shan sutton (oregon state university); allegra swift (claremont university consortium); beth turtle (kansas state university); charles watkinson (purdue university)
research subcommittee
the research subcommittee coordinates library publishing coalition roundtable discussions and manages the organization’s research agenda.
donna beck (carnegie mellon university); marilyn billings (university of massachusetts-amherst); brad eden (valparaiso university); isaac gilman (pacific university); dan lee (university of arizona); gail mcmillan (virginia tech); catherine mitchell (california digital library); jane morris (boston college); melanie schlosser (ohio state university); mary beth thompson (university of kentucky)
reading an entry: some “health warnings”
the field of library publishing is rapidly evolving, and its boundaries have not yet been clearly defined. we have attempted to produce a directory that is readable and cohesive and that allows for cross-institutional comparison. in some cases, this means we have used terminology and categories that do not fully reflect the complex and experimental nature of activities that libraries are undertaking. we hope that through this directory, and through input from the library publishing community, we will start to establish common language as this field matures. in some cases, as described below, questions in the questionnaire on which the entries are based were not specific enough and respondents reported numbers in different ways. revised questions will be sent in future years. while “staff in support of publishing activities” are consistently reported for salaried employees, the way in which respondents reported student percentages may vary.
for example, undergraduate employees usually work a maximum of hours per week rather than the expected of a full-time staff member. therefore, it is open to question whether a . undergraduate works or hours per week. under “types of publication,” we have noticed that conference proceedings have been reported in various ways. in some cases, respondents report the number of series, while in other cases, they report the total number of faculty papers. some high estimates of total number of publications raise an issue of whether the instructions to just record activity in the last full calendar year, , were followed. readers will notice the presence of “seals” next to the title of some entries. these acknowledge the support of the institutions that fund the library publishing coalition by their generous two-year pledges. “contributing institutions” have pledged to support the foundation of the lpc with an annual contribution of $ , . “founding institutions” receive the highest honor, having pledged $ , a year to the project. to recognize their exceptional contributions, we include profiles of specific publications that founding institutions have nominated. these also give a practical sense of the wide range of types of publications produced.
[seals: founding institution, library publishing coalition; contributing institution, library publishing coalition]
libraries in the united states and canada
arizona state university hayden library primary unit: informatics and cyberinfrastructure services primary contact: mimmo bonanni digital projects manager - - digitalrepository@asu.edu website: repository.asu.edu social media: @asulibraries; facebook.com/asulibraries program overview mission/description: arizona state university libraries created the asu digital repository to support asu’s commitment to excellence, access, and impact.
the asu digital repository advances the new american university by providing a central place to collect, preserve, and discover the creative and scholarly output from asu faculty, research partners, staff, and students. providing free, online access to asu scholarship benefits our local community, encourages transdisciplinary research, and engages scholars and researchers worldwide, increasing impact globally through the rapid dissemination of knowledge. the asu digital repository improves the visibility of content by exposing it to commercial search engines such as google, the asu libraries’ one search, as well as the asu digital repository search portal. the asu digital repository helps meet public access policies and archival requirements specified by many federal grants. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); technical/research reports ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations disciplinary specialties: state and local documents (government publications); music; dance top publications: journal of surrealism (journal) campus partners: campus departments or programs; individual faculty publishing platform(s): contentdm; locally developed software digital preservation strategy: digital preservation services under discussion additional services: outreach; metadata; doi assignment/allocation of identifiers; author copyright advisory; digitization; audio/video streaming plans for expansion/future directions: improving marketing and outreach (involving subject librarians and e-research staff), expanding data management
support, and exploring the addition of learning objects.
auburn university auburn university libraries primary contact: aaron trehub assistant dean for technology and technical services - - trehuaj@auburn.edu program overview mission/description: to support the university’s outreach mission by making original research and scholarship by auburn university faculty and students more accessible to alabama residents and the world at large. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); charitable contributions/friends of the library organizations ( ) publishing activities types of publications: etds ( ) media formats: text; images top publications: etds campus partners: campus departments or programs; individual faculty publishing platform(s): dspace digital preservation strategy: digital preservation services under discussion. auburn university libraries is a founding member of two private lockss networks (metaarchive cooperative; adpnet), but does not currently use these distributed digital preservation networks to preserve etds or materials in the ir. additional services: graphic design (print or web); outreach; training; cataloging; metadata; open url support; author copyright advisory; digitization; audio/video streaming plans for expansion/future directions: currently building and populating an institutional repository.
boston college boston college university libraries primary unit: scholarly communications primary contact: jane morris head of scholarly communications and research - - jane.morris@bc.edu website: www.bc.edu/libraries/collections/escholarshiphome program overview mission/description: our goal is to showcase and preserve boston college’s scholarly output and to maximize research visibility and influence.
escholarship@bc encourages community contributors to archive and disseminate scholarly work, peer-reviewed publications, books, chapters, conference proceedings, and small datasets in an online open access environment. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); grey literature from centers/institutes; datasets media formats: text; video; data disciplinary specialties: theology; education; the middle east; libraries top publications: catholic education (journal); studies in christian-jewish relations (journal); information technology and libraries (journal); levantine review (journal); proceedings of the catholic theological society of america (conference proceedings) [seal: contributing institution, library publishing coalition] percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: catholic theological society of america; ala library and information technology association; council of centers on christian jewish relations; seminar on jesuit spirituality publishing platform(s): ojs/ocs/omp; digitool digital preservation strategy: hathitrust; lockss; metaarchive additional services: marketing; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; dataset management; contract/license preparation; author copyright advisory; digitization; audio/video streaming plans for expansion/future
directions: hosting more data and developing more open access journals. brigham young university harold b. lee library primary unit: scholarly communication unit scholarsarchive@byu.edu primary contact: elizabeth smart scholarly communication librarian - - elizabeth_smart@byu.edu website: sites.lib.byu.edu/scholarsarchive program overview mission/description: the harold b. lee library’s primary publishing resources include an institutional repository and digital publishing services for faculty- and student-edited journals. combined, these resources are called scholarsarchive. scholarsarchive is designed to make original scholarly and creative work—such as research, publications, journals, and data—freely and persistently available. the library’s publishing efforts are targeted at supporting broader academic and public discovery and use of university scholarship. scholarsarchive may also house items of historic interest to the university. the library supports content partners with software support, digitizing, metadata creation, journal management, and free hosting services. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . 
) funding sources (%): library operating budget ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); student conference papers and proceedings ( ); databases ( ); etds ( ) media formats: text; images [seal: founding institution, library publishing coalition] disciplinary specialties: religion; natural history of the american west; children’s literature top publications: western north american naturalist (journal); byu studies (journal); children’s book and play review (journal); pacific studies (journal); tesl reporter (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: international society for the comparative study of civilizations (iscsc); association of mormon counselors and psychotherapists (amcap); council on east asian libraries (ceal) publishing platform(s): contentdm; ojs/ocs/omp digital preservation strategy: rosetta (moving from beta to full implementation in ) additional services: analytics; cataloging; metadata; peer review management; digitization; hosting of supplemental content plans for expansion/future directions: areas of future exploration and possible expansion include monograph publishing, print on demand, doi support, hosting streaming media, and data management. highlighted publication: the western north american naturalist (formerly great basin naturalist) has published peer-reviewed experimental and descriptive research pertaining to the biological natural history of western north america for more than years. ojs.lib.byu.edu/spc/index.php/wnan
brock university james a.
gibson library primary contact: elizabeth yates liaison / scholarly communication librarian - - ext. eyates@brocku.ca website: www.brocku.ca/library/about-us-lib/openaccess program overview mission/description: the library’s publishing initiatives provide technology, expertise, and promotional support for researchers, students, and staff at brock university seeking to make their research universally accessible via open access. the library currently publishes/hosts five scholarly oa journals in partnership with scholars portal and the ontario council of university libraries. we use open journal systems (ojs) software. the library manages an open access publishing fund to help brock authors cover the costs of publishing with fully oa journals or monograph publishers. a minimum of four awards of up to $ , are granted; total funding is $ , . the library also hosts and disseminates brock scholarship through our digital repository, which collects graduate theses, major research projects, and subject- or department-based research collections and materials from our special collections and archives. we also raise awareness of open access through open access week activities, information resources, and other venues. 
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ); technical/research reports ( ); etds ( ) media formats: text; images disciplinary specialties: humanities; french language; arts education; teaching and learning percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: ontario council of university libraries/scholars portal publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: scholars portal additional services: copy-editing; training; analytics; notification of a&i sources; issn registration; digitization plans for expansion/future directions: launching a journal showcasing undergraduate student research in the faculty of applied health sciences; launching an open monograph publishing system in partnership with scholars portal and the ontario council of university libraries.
cal poly, san luis obispo robert e. kennedy library primary unit: digital scholarship services primary contact: marisa ramirez digital scholarship services librarian - - mramir @calpoly.edu website: digitalcommons.calpoly.edu/; lib.calpoly.edu/scholarship program overview mission/description: the robert e. kennedy library provides digital services to assist the campus community with the creation, publication, sharing, and preservation of research, scholarship, and campus history.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( ) funding sources (%): library operating budget ( ); endowment income ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ); graduate internship reports media formats: text; images; audio; video; data; multimedia/interactive content disciplinary specialties: history; philosophy; sustainability top publications: senior undergraduate projects; master’s theses; between the species (journal); california climate action planning (conference proceedings) percentage of journals that are peer reviewed: [seal: contributing institution, library publishing coalition] campus partners: individual faculty; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: digital preservation services under discussion. we are in the process of joining lockss and metaarchive.
additional services: typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; notification of a&i sources; issn registration; peer review management; business model development; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: hiring an endowed digital scholarship services student assistant through the digital scholarship services student assistantship program, which provides paid, experiential learning opportunities for cal poly students who are interested in the various facets of the changing digital publishing landscape.
california institute of technology caltech library primary unit: metadata services group primary contact: kathy johnson repository librarian - - kjohnson@library.caltech.edu program overview mission/description: caltechthesis is part of coda, the caltech collection of open digital archives, managed by caltech library services. the mission of coda is to collect, manage, preserve, and provide global access over time to the scholarly output of the institute and the publications of campus units. caltechthesis contains phd, engineer’s, master’s, and bachelor’s/senior theses authored by caltech students. most items in caltechthesis are textual dissertations, but some may also contain software programs, maps, videos, etc. the etd is the version of record for the institute and deposit of doctoral dissertations is required for graduation.
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities media formats: text; images; audio; video; data; simple websites disciplinary specialties: biology; chemistry and chemical engineering; engineering and applied science; geology and planetary science; physics; mathematics; astronomy campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): eprints digital preservation strategy: digital preservation services under discussion additional services: marketing; outreach; training; analytics; cataloging; metadata; author copyright advisory; digitization plans for expansion/future directions: undergoing gradual move of platforms to islandora/fedora, including preservation activity.
california state university san marcos kellogg library primary contact: carmen mitchell institutional repository librarian - - cmitchell@csusm.edu website: csusm-dspace.calstate.edu; scholarworks.csusm.edu social media: @csusm_library program overview mission/description: the purpose of the california state university san marcos institutional repository (scholarworks) is to collect, organize, preserve, and disseminate csusm research, creative works, and other academic content in a web-based environment. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( .
) funding sources (%): library operating budget ( ); other ( ) publishing activities types of publications: etds ( ); library exhibits media formats: text; images; audio; video disciplinary specialties: student work/research; library exhibits top publications: “going paperless: student and parent perceptions of ipads in the classroom” (thesis); “lateral violence in nursing” (thesis); “nurses’ technique and site selection in subcutaneous insulin injection” (thesis); “individual differences in working memory and levels of processing” (thesis); “wounded hearts: a journey through grief” (thesis) campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace digital preservation strategy: in-house; digital preservation services under discussion additional services: marketing; outreach; training; analytics; cataloging; metadata; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: planning to include faculty publications and datasets within the next year, working with other csu campuses on an undergraduate journal, and currently working to publish digital surrogates of items from the university archives. 
carnegie mellon university carnegie mellon university libraries primary unit: archives and digital library initiatives primary contact: gabrielle michalek head of archives and digital library initiatives - - gabrielle@cmu.edu website: repository.cmu.edu program overview mission/description: carnegie mellon university libraries’ publishing program aims to promote open access to scholarly resources; to support online journals and conference management, from article submission through peer review to open access and long-term preservation; and to publish grey literature, including theses, dissertations, and technical reports. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text disciplinary specialties: social and behavioral sciences; engineering; physical and life sciences; arts and humanities; security top publications: journal of privacy and confidentiality (journal); dietrich college honors theses (contributing institution, library publishing coalition) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: lockss; metaarchive additional services: marketing; outreach; training; analytics; cataloging; metadata; peer review management; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: hosting more open access journals; supporting conference 
proceedings; and publishing more theses, dissertations, and technical reports. claremont university consortium claremont colleges library primary unit: center for digital initiatives scholarship@cuc.claremont.edu primary contact: allegra swift digital initiatives librarian - - allegra_swift@cuc.claremont.edu website: scholarship.claremont.edu; ccdl.libraries.claremont.edu social media: @ccdiglib; facebook.com/honnoldlibrary; facebook.com/claremontcollegesdigitallibrary; flickr.com/photos/claremontcollegesdigitallibrary program overview mission/description: the center for digital initiatives facilitates the dissemination of knowledge by providing publishing platforms, consulting, and technical services to enable the creation and distribution of teaching and research resources to the scholarly community. year publishing activities began: (first journal); (in earnest) organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); digital collections; lectures and symposia media formats: text; images; video; multimedia/interactive content disciplinary specialties: arts and humanities; social and behavioral sciences; physical and mathematical sciences; life sciences; business top publications: cmc senior theses; journal of humanistic mathematics (journal); steam (journal); scripps senior theses; lux (journal); performance practice review (journal) (contributing institution, library publishing coalition) percentage of journals that are peer reviewed: campus partners: campus departments or 
programs; individual faculty; graduate students; undergraduate students other partners: rancho santa ana botanical gardens publishing platform(s): bepress (digital commons); contentdm digital preservation strategy: amazon glacier; amazon s ; looking into clockss, lockss, and some others additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: possible expansion into areas of education and alternative/non-traditional publishing. colby college colby college libraries primary unit: digital and special collections primary contact: marty kelly assistant director for digital collections - - mfkelly@colby.edu program overview mission/description: the publishing mission of colby college libraries digital and special collections is to showcase the scholarly work of colby’s faculty and students, make the college’s unique collections more broadly available, and contribute to open intellectual discourse. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); newsletters ( ); undergraduate capstone/honors theses ( ); alumni magazine media formats: text; images; audio; video disciplinary specialties: humanities; environmental science; jewish studies; economics top publications: colby quarterly (journal); colby honors theses and senior scholars papers; colby undergraduate research symposium (conference proceedings); atlas of maine (journal); colby magazine (magazine) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students (founding institution, library publishing coalition) publishing platform(s): bepress (digital commons); wordpress digital preservation strategy: digital preservation services under discussion additional services: outreach; training; analytics; cataloging; metadata; open url support; peer review management; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: planning to support two new major publishing initiatives with colby’s center for the arts and humanities this coming academic year: the relaunch of the colby quarterly ( – ) and the development of a new undergraduate research journal. highlighted publication: the colby environmental assessment team collection of student-produced watershed studies on maine’s belgrade lakes is widely used by local lake associations, town officials, and the department of environmental protection. 
digitalcommons.colby.edu/lakesproject college at brockport, suny drake memorial library primary unit: library technology primary contact: kim myers digital repository specialist - - kmyers@brockport.edu program overview year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data disciplinary specialties: kinesiology; sports science; physical education; education; counselor education; philosophy; english top publications: counselor education master’s theses; education master’s theses; technical reports from the water research community; dissenting voices (journal); journal of literary onomastic studies (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons) digital preservation strategy: lockss additional services: copy-editing; marketing; training; cataloging; metadata; issn registration; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: expanding in the etd arena; working with the graduate school to automate the publication of our master’s theses as they are produced. 
college of wooster college of wooster libraries primary unit: digital scholarship and services primary contact: stephen flynn emerging technologies librarian - - sflynn@wooster.edu program overview mission/description: our goal is to digitally preserve and promote the original scholarship of our faculty and students. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data campus partners: campus departments or programs; individual faculty; undergraduate students publishing platform(s): bepress (digital commons); dspace; wordpress digital preservation strategy: in-house; digital preservation services under discussion additional services: metadata; digitization; hosting of supplemental content plans for expansion/future directions: migrating from dspace to bepress, which may enable us to promote the publishing of new undergraduate journals. columbia university columbia university libraries/information services primary unit: center for digital research and scholarship info@cdrs.columbia.edu primary contact: mark newton production manager - - mnewton@columbia.edu website: cdrs.columbia.edu social media: @columbiacdrs; @researchatcu; @dataatcu; @scholarlycomm; facebook.com/pages/center-for-digital-research-and-scholarship-columbia-university/ program overview mission/description: the center for digital research and scholarship (cdrs) serves the digital research and scholarly communications needs of the faculty, students, and staff of columbia university and its affiliates. 
our mission is to increase the utility and impact of research produced at columbia by creating, adapting, implementing, supporting, and sustaining innovative digital tools and publishing platforms for content delivery, discovery, analysis, data curation, and preservation. in pursuit of that mission, we also engage in extensive outreach, education, and advocacy to ensure that the scholarly work produced at columbia university has a global reach and accelerates the pace of research across disciplines. year publishing activities began: (columbia university libraries); (cdrs) organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ); non-library campus budget ( ); grants ( ); licensing ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); conference papers and proceedings ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) (founding institution, library publishing coalition) media formats: text; images; audio; video; data; software disciplinary specialties: law; humanities; public health; global studies; interdisciplinary studies top publications: tremor and other hyperkinetic movements (journal); dangerous citizens (website); academic commons (digital research repository); women film pioneers project (website); columbia business law review (journal) percentage of journals that are peer reviewed: campus partners: columbia university press; campus departments or programs; individual faculty; graduate students; undergraduate students other partners: modern languages association; fordham university press; new york 
university; ecological society of america. informal partners include california digital library; cornell university; purdue university. publishing platform(s): fedora; ojs/ocs/omp; wordpress; locally developed software; drupal digital preservation strategy: aptrust; archive-it; duracloud/dspace; dpn; in-house; digital preservation services under discussion. content is also backed up to nysernet, to two on-site locations, and off-site to tape with iron mountain. additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming; preservation; repository deposit to pmc; seo; application development; content and platform migration; workshops and consultation; social media and journal publishing best practices workshops; informal scholarly communication events; open access week events; campus oa fund management; collaboration spaces plans for expansion/future directions: planning to continue integration of the publishing program with the digital research repository, academic commons (academiccommons.columbia.edu), as well as to pursue new publishing partnerships with scholarly societies through members affiliated with the university. further plans include expansion into unique identifier support (such as with orcid and through ezid) as well as work in support of federal and funder mandates for access to funded research. highlighted publication: academic commons is a digital publication platform that brings global visibility to the research and scholarship of columbia university and its affiliates. academiccommons.columbia.edu 
connecticut college charles e. shain library primary unit: special collections primary contact: benjamin panciera director of special collections - - bpancier@conncoll.edu website: digitalcommons.conncoll.edu program overview mission/description: connecticut college seeks to make the products of student and faculty research and campus resources as widely available as possible through its institutional repository. mandatory electronic submission of student honors theses began in . the faculty overwhelmingly passed an open access policy in , and the library has supported this by retrospectively making faculty research available through the institutional repository. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( ) media formats: text; audio campus partners: campus departments or programs; individual faculty; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: digital preservation services under discussion; no digital preservation services provided additional services: cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: seeking to optimize faculty participation and maximize the amount of available research in the institutional repository and inform faculty of the possibility of using the repository to make available unpublished material like conference papers and datasets. 
cornell university cornell university library primary unit: digital scholarship and preservation services primary contact: david ruddy director, scholarly communications services - - dwr @cornell.edu program overview mission/description: separate operations have their own mission statements (project euclid, arxiv, ecommons, cip). in general, we wish to promote sustainable models of scholarly communications with an emphasis on access and affordability. year publishing activities began: organization: services are primarily distributed across library units. a few projects involve the cornell university press. staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . ) funding sources (%): library operating budget ( ); sales revenue ( ); other ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( ); case studies media formats: text; audio; video; data disciplinary specialties: mathematics; physics; statistics; computer science percentage of journals that are peer reviewed: campus partners: cornell university press; campus departments or programs; individual faculty; graduate students other partners: duke university press; scholarly societies; scholars worldwide (contributing institution, library publishing coalition) publishing platform(s): dpubs; dspace; locally developed software digital preservation strategy: in-house additional services: graphic design (print or web); metadata; doi assignment/allocation of identifiers; open url support; budget preparation; digitization; hosting of supplemental content; audio/video streaming additional information: “publishing” activities at cornell are complex and include at least four fairly distinct operations: project euclid, arxiv.org, ecommons (an 
institutional repository), and cornell initiatives in publishing (cornell-related journals and books). each of these operations arguably fits the provided criteria for “library publishing” activities. dartmouth college dartmouth college library primary unit: digital library program library.dartmouth.edu/mail/send.php?to=askalib primary contact: elizabeth kirk associate librarian for information resources - - elizabeth.e.kirk@dartmouth.edu website: www.dartmouth.edu/~library/digital program overview mission/description: the dartmouth college library’s digital publishing program supports faculty publication of original scholarly content in a digital environment. our digital publications include journals, monographs, and scholarly editions. all content is available online without charge. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ); endowment income ( ); other ( ) publishing activities types of publications: faculty-driven journals ( ); monographs ( ); etds ( ); digital, scholarly editions of manuscripts, letters, etc. 
( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: environment; linguistics; electronic or “new” media; native american history; history of arctic exploration top publications: elementa (journal); linguistic discovery (journal); journal of e-media studies (journal); occom circle project (digital collection); artistry of the homeric simile (monograph) (founding institution, library publishing coalition) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: university press of new england; bioone publishing platform(s): contentdm; locally developed software; ambra digital preservation strategy: dpn; hathitrust; lockss; portico; in-house; digital preservation services under discussion additional services: marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; budget preparation; other author advisory; digitization; audio/video streaming; xml consultation in jats . and tei additional information: the partnership with the publisher bioone is enabling us to increase our technological capacity for journal publishing. bioone is a significant contributor to the staffing for elementa. the partnership with the university press of new england is enabling us to increase knowledge and capacity for monograph publishing. plans for expansion/future directions: publishing more monographs in conjunction with the university press of new england, further developing technical capacity for journals, increasing the number of digital editions, working with student journals. 
highlighted publication: through elementa: science of the anthropocene, we aim to facilitate scientific solutions to the challenges presented by this era of accelerated human impact with timely, technically sound, peer-reviewed articles that address interactions between human and natural systems and behaviors. home.elementascience.org duke university duke university libraries primary unit: office of copyright and scholarly communications open-access@duke.edu primary contact: paolo mangiafico coordinator of scholarly communications technology - - paolo.mangiafico@duke.edu website: library.duke.edu/openaccess program overview mission/description: duke university libraries partners with members of the duke community to publish and disseminate scholarship in new and creative ways, including helping to publish scholarly journals on an open access digital platform, archiving previously published and original works, and consulting on new forms of scholarly dissemination. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); technical/research reports ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; data; multimedia/interactive content disciplinary specialties: greek, roman, and byzantine studies; transatlantic german studies; th-century russian studies; cultural anthropology; scholarly communications top publications: cultural anthropology (journal); etds; greek, roman, and byzantine studies (journal); scholarly communications @ duke (blog); andererseits (journal) (founding institution, library publishing coalition) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: society for cultural anthropology; editors of particular journals and their organizations publishing platform(s): dspace; ojs; wordpress; symplectic elements digital preservation strategy: depends on the journal and type of content, primarily in-house, but exploring archiving with portico additional services: outreach; training; analytics; metadata; open url support; dataset management; business model development; contract/license preparation; author copyright advisory; other author advisory; hosting of supplemental content plans for expansion/future directions: working with more datasets, digital projects, and forms other than linear text; exploring platforms that support new publishing models, not just digital versions of old journal models. highlighted publication: cultural anthropology is the journal of the society for cultural anthropology, a section of the american anthropological association (aaa). 
it is one of journals published by the aaa, and it is widely regarded as one of the flagship journals of its discipline. culanth.org emory university robert w. woodruff library primary unit: emory center for digital scholarship allen.tullos@emory.edu primary contact: stewart varner digital scholarship coordinator - - stewart.varner@emory.edu program overview mission/description: the enduring goal of a university is to create and disseminate knowledge. changes in technology offer opportunities for new forms of both creation and dissemination of scholarship through open access (oa). open access publishing also offers opportunities for emory university to fulfill its mission of creating and preserving knowledge in a way that opens disciplinary boundaries and facilitates sharing that knowledge more freely with the world. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); databases ( ); etds ( ) media formats: text; images; audio; video; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: southern studies; religion/theology top publications: southern spaces (journal); molecular vision (journal); methodist review (journal); practical matters (journal) (contributing institution, library publishing coalition) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): fedora; ojs/ocs/omp; wordpress; drupal digital preservation strategy: digital preservation services under discussion additional services: typesetting; copy-editing; metadata; peer 
review management; contract/license preparation; author copyright advisory; digitization; audio/video streaming additional information: on june , , emory university announced the launch of the emory center for digital scholarship (ecds). the ecds brings together four units currently housed in the robert w. woodruff library: the digital scholarship commons (disc), the electronic data center, the lewis h. beck center for electronic collections, and the emory center for interactive teaching (ecit). these units have each collaborated with emory scholars who wish to incorporate technology into their teaching and research. the formation of the ecds will break down barriers between these functions and simplify the process of establishing partnerships with scholars. expanding and strengthening support for open access, digital publishing is a top priority for the ecds. plans for expansion/future directions: reexamining the expansion of library publishing services following the recent launch of the emory center for digital scholarship. florida atlantic university se wimberly library primary unit: digital library lydig@fau.edu primary contact: joanne parandjuk digital initiatives librarian - - jparandj@fau.edu website: www.library.fau.edu/depts/digital_library/about.htm program overview mission/description: recognizing the publishing needs of campus members and local partners, an open access publishing service was initiated by the fau digital library in support of scholarly communications across campus and the wider dissemination of fau research and creative content. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( . 
) funding sources (%): library operating budget ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video disciplinary specialties: geosciences; undergraduate research; communications; local history top publications: the florida geographer (journal); democratic communique (journal); fau undergraduate research journal (journal); journal of coastal research (journal backfile); broward legacy (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students other partners: broward county historical society publishing platform(s): islandora (migration underway); ojs/ocs/omp digital preservation strategy: florida digital archive member additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; digitization; audio/video streaming plans for expansion/future directions: we have just launched the second issue of the first volume of our undergraduate research journal to instill scholarly inquiry and practices among undergraduates, and we hope to see a rise in the research activity of our students. florida state university robert manning strozier library primary unit: technology and digital scholarship primary contact: micah vandegrift scholarly communication librarian - - mvandegrift@fsu.edu website: diginole.lib.fsu.edu program overview mission/description: scholarly communications is a developing area of librarianship that deals with the production, dissemination, promotion, and preservation of scholarly research and creative works. 
the scholarly communication initiative will find, assess, and provide tools and services for representing scholarship in a digital environment. our vision is to support a variety of new modes and models of dissemination for academic work (open access, digital publishing, project-based digital scholarship, etc.). areas of focus include our institutional repository (technical management, outreach, collection development); open access (education and programs on access options for scholarly work); author rights (information and resources on negotiating copyright transfer contracts); copyrights and fair use (information and resources on copyright as it pertains to academic publishing); research and writing (keeping abreast of the many changes and developments in this area, and contributing to the professional literature); and outreach (creating partnerships with campus offices, faculty, and administrators to further the scholarly communications initiative). year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ) (contributing institution, library publishing coalition) media formats: text disciplinary specialties: arts and literature; art education and therapy; law top publications: heal: humanism evolving through arts and literature (journal); journal of art for life (journal); the owl (journal); fsu law review (journal) campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: digital 
preservation services under discussion additional services: marketing; outreach; training; analytics; metadata; issn registration; peer review management; contract/license preparation; author copyright advisory; hosting of supplemental content plans for expansion/future directions: piloting an open access fund, finding a sustainable model and including it as an ongoing resource for moving scholarship and prestige to open access; growing scholcomm office to include repository manager and host research fellows (clir, mellon); coordinating with the school of library and information studies and the history of text technologies to integrate scholcomm initiatives into curriculum; providing training and investment in fsu lis students’ skills and knowledge in this area; reworking open access policy with faculty senate to make our policy more effective and more in line with the scholarly communication push internationally.

georgetown university georgetown university libraries primary unit: library information technologies digitalscholarship@georgetown.edu primary contact: kate dohe digital services librarian - - kd @georgetown.edu website: www.library.georgetown.edu/digitalgeorgetown social media: @gtownlibrary program overview mission/description: digitalgeorgetown supports the advancement of education and scholarship at georgetown and contributes to the expansion of research initiatives, both nationally and internationally. by providing the infrastructure, resources, and services, digitalgeorgetown sustains the evolution from the traditional research models of today to the enriched scholarly communication environment of tomorrow, and it provides context and leadership in developing collaborative opportunities with partners across the campus and around the world. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: student-driven journals ( ); monographs ( ); technical/research reports ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); faculty papers; video interviews; citations; syllabi media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: linguistics; communications; international relations/foreign policy; bioethics top publications: georgetown university round tables on language and linguistics (monograph); the human cloning debate (monograph); the genocide in cambodia (monograph) percentage of journals that are peer reviewed: campus partners: georgetown university press; campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace digital preservation strategy: digital preservation services under discussion additional services: marketing; outreach; training; analytics; cataloging; metadata; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: continuing to enhance and expand our initiative to include more open access materials, different forms and formats of etds, and other scholarly publications.
georgia state university georgia state university library primary unit: digital initiatives digitalarchive@gsu.edu primary contact: sean lind digital initiatives librarian - - slind @gsu.edu website: digitalarchive.gsu.edu program overview mission/description: the mission of the institutional repository at georgia state university is to give free and open access to the impactful scholarly and creative works, research, publications, reports, and data contributed by faculty, students, staff, and administrative units of georgia state university. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( ) funding sources (%): library materials budget ( ) publishing activities types of publications: student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text disciplinary specialties: law review; undergraduate honors research top publications: georgia state university law review (journal); colonial academic alliance undergraduate research journal (journal); discovery: georgia state university undergraduate honors research journal (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: duracloud/dspace additional services: marketing; outreach; training; analytics; cataloging; metadata; issn registration; open url support; author copyright advisory; other author advisory; digitization plans for expansion/future directions: increasing the number and variety of georgia state university faculty scholarly publications openly available for download on the internet.
grand valley state university grand valley state university libraries primary unit: collections and scholarly communications scholarworks@gvsu.edu primary contact: sarah beaubien scholarly communications outreach coordinator - - beaubisa@gvsu.edu program overview year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data top publications: online readings in psychology and culture (digital collection); foundation review (journal); fishladder (journal); language arts journal of michigan (journal); journal of tourism insights (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: michigan council of teachers of english; resort and commercial recreation association; international association for cross-cultural psychology; johnson center for philanthropy publishing platform(s): bepress (digital commons) [library publishing coalition: founding institution] digital preservation strategy: lockss; portico; digital preservation services under discussion additional services: outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; peer review management; author copyright advisory; digitization; hosting of supplemental content highlighted publication: the foundation review is the first peer-reviewed journal of philanthropy, written by and for foundation staff
and boards, and those who work with them implementing programs. it provides rigorous research and writing, presented in an accessible style. scholarworks.gvsu.edu/tfr

gustavus adolphus college folke bernadotte memorial library primary contact: barbara fister professor and academic librarian - - fister@gac.edu program overview mission/description: we want to support the shift from closed, licensed access to information to open, shareable, and sustainable scholarship. year publishing activities began: organization: entrepreneurial, experimental, more or less a sandbox in which librarians help other faculty consider alternatives staff in support of publishing activities (fte): library staff ( . ) funding sources (%): other ( ) publishing activities types of publications: monographs ( ) media formats: text campus partners: individual faculty publishing platform(s): contentdm; wordpress; pressbooks digital preservation strategy: in-house additional services: author copyright advisory; other author advisory; digitization additional information: we have published one monograph using pressbooks: an anthology based on faculty statements about teaching, scholarship, and service submitted for tenure and promotion. we wanted it to be lightweight and without cost other than time. it worked. we also have shared platform advice with faculty interested in publishing. it is all very much at the beginning and is without much in the way of technical or financial support, but we expect the resource commitment to grow. plans for expansion/future directions: working with similar libraries to study the possible launch of a press.
hamilton college burke library primary unit: department of special collections and archives cgoodwil@hamilton.edu primary contact: randall ericson editor - - rericson@hamilton.edu website: couperpress.org program overview mission/description: the couper press was established in by couper librarian randall ericson of the burke library at hamilton college in clinton, new york. the press is named in honor of the late richard w. couper ‘ , an alumnus, life trustee of hamilton, and benefactor of the burke library. the press publishes a quarterly journal of scholarship, american communal societies quarterly (acsq), which showcases the communal societies collections of burke library. american communal societies series, a monograph series, presents new scholarship pertaining to american intentional communities as well as reprints of, and critical introductions to, important historical works that may be difficult to find or are out of print. shaker studies are short monographs on the shakers. occasional publications are published on topics that highlight the special collections of the burke library. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): endowment income ( ) publishing activities types of publications: faculty-driven journals ( ); monographs ( ) media formats: text; images disciplinary specialties: communal studies; religious studies; sociology; american history; musicology top publications: prison diary and letters of chester gillette (monograph); visiting the shakers, – : watervliet, hancock, tyringham, new lebanon (monograph); encyclopedic guide to american intentional communities (monograph); a promising venture: shaker photographs from the wpa (monograph); demographic directory of the harmony society (monograph) campus partners: individual faculty; undergraduate students other partners: museums; libraries; private collectors digital preservation strategy: digital preservation services under discussion additional services: graphic design (print or web); copy-editing plans for expansion/future directions: considering making the american communal societies quarterly available through an institutional repository.

illinois wesleyan university the ames library primary unit: scholarly communications primary contact: stephanie davis-kahl scholarly communications librarian - - sdaviska@iwu.edu program overview mission/description: the ames library publishing program focuses on disseminating excellent student-authored research, scholarship, and creative works, with an emphasis on providing education and outreach on issues related to publishing such as open access, author rights, and copyright.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( ) funding sources (%): library operating budget ( ); non-library campus budget ( ) publishing activities types of publications: student-driven journals ( ); textbooks ( ); student conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video disciplinary specialties: economics; political science; history top publications: undergraduate economic review (journal); constructing history (journal); res publica (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students [library publishing coalition: contributing institution] publishing platform(s): bepress (digital commons) digital preservation strategy: in-house; digital preservation services under discussion additional services: training; analytics; metadata; peer review management; author copyright advisory; other author advisory; hosting of supplemental content; audio/video streaming additional information: regarding our funding model, percent of the cost of our bepress implementation is covered by the library, while the remaining percent is generously provided by the office of the president, office of the provost, and mellon center for faculty and curriculum development. faculty advisors for our student journals donate their time. plans for expansion/future directions: considering how to best position the program to become a publishing outlet for faculty.
indiana university indiana university libraries primary unit: iuscholarworks iusw@indiana.edu primary contact: jennifer laherty digital publishing librarian - - jlaherty@indiana.edu website: scholarworks.iu.edu program overview mission/description: iuscholarworks is a set of services from the indiana university libraries to make the work of iu scholars freely available and to ensure that these resources are preserved and organized for the future. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ); graduate students ( ) funding sources (%): library materials budget ( ); library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); newsletters ( ); etds ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: folklore percentage of journals that are peer reviewed: campus partners: iu press; campus departments or programs; individual faculty; graduate students; undergraduate students other partners: american folklore society [library publishing coalition: contributing institution] publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: archive-it; clockss; duracloud/dspace; hathitrust additional services: outreach; training; analytics; cataloging; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming; metadata consultation.
plans for expansion/future directions: incorporating the libraries’ open access publishing activities into the development of a new campus office, the office of scholarly publishing, which includes the university press and an etextbook initiative.

johns hopkins university sheridan libraries primary unit: scholarly resources and special collections dissertations@jhu.edu primary contact: david reynolds manager of scholarly digital initiatives - - davidr@jhu.edu program overview mission/description: to provide a publishing platform for required etds and journals for the johns hopkins academic community. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images disciplinary specialties: education; business top publications: international journal of interdisciplinary education (journal); new horizons for education (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: in-house; digital preservation services under discussion additional services: training; analytics; metadata; peer review management; author copyright advisory additional information: we have only done an etd pilot so far, but mandatory submission was required as of september , . we are working with the school of education to publish two new oa journals. we expect the inaugural issues to appear by the second quarter of . 
plans for expansion/future directions: publishing journals for the school of education; looking into providing a monograph publishing service for academic departments; revisiting the question of publishing student journals.

kansas state university kansas state university libraries primary unit: scholarly communications and publishing info@newprairiepress.org primary contact: char simser coordinator of electronic publishing, new prairie press - - info@newprairiepress.org website: newprairiepress.org social media: @newprairiepress program overview mission/description: to host peer-reviewed scholarly journals, monographs, conference proceedings, and other series primarily in the humanities and social sciences; make the content freely available worldwide; and contribute to and support evolving scholarly publishing models. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ) media formats: text; images; video disciplinary specialties: financial therapy; rural research and policy; library science; cognitive sciences and semantics; analytical philosophy top publications: gdr bulletin (journal); baltic international yearbook (journal); journal of financial therapy (journal); online journal of rural research & policy (journal); kansas library association college and university libraries section proceedings (conference proceedings) [library publishing coalition: founding institution] percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty publishing platform(s): bepress (digital commons) digital preservation strategy: clockss additional services: graphic design (print or web); marketing; training; notification of a&i sources; doi assignment/allocation of identifiers; digitization; hosting of supplemental content plans for expansion/future directions: publishing open access monographs and conference proceedings and publishing two undergraduate research journals; setting up an advisory board to help set direction and policy and recommend new titles for npp. highlighted publication: since , the journal of financial therapy has been the leading forum dedicated to clinical, experimental, and qualitative research in the emerging field of financial therapy.
jftonline.org

loyola university chicago loyola university chicago libraries primary unit: library systems primary contact: margaret heller digital services librarian - - mheller @luc.edu program overview mission/description: loyola ecommons is an open-access, sustainable, and secure resource created to preserve and provide access to research, scholarship, and creative works created by the university community for the benefit of loyola students, faculty, staff, and the larger academic community. sponsored by the university libraries, loyola ecommons is a suite of online resources, services, and people working in concert to facilitate a wide range of scholarly and archival activities, including collaboration, resource sharing, author rights management, digitization, preservation, and access by a global academic audience. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ); non-library campus budget ( ) publishing activities types of publications: faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ) media formats: text; images; data disciplinary specialties: criminal justice; economics; social work campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: digital preservation services under discussion additional services: outreach; training; analytics; metadata; digitization; hosting of supplemental content plans for expansion/future directions: hosting conference proceedings and journals.
macalester college dewitt wallace library primary unit: digital scholarship and services primary contact: johan oberg digital scholarship and services librarian - - joberg@macalester.edu website: www.macalester.edu/library/digitalinitiatives/index.html program overview mission/description: the digital publishing unit of the dewitt wallace library supports the creation, management, and dissemination of local digital-born scholarship in various formats. essential to supporting this mission is the continuing exploration of evolving creation, collaboration, and publication tools; encoding methods; and development of staff skills and facility resources. the unit serves the digital scholarship and electronic publishing needs through development of digital scholarship projects as well as open access online distribution of journals, articles, and conference proceedings. the library is committed to playing an active role in the changing landscape of scholarly publishing and supports the ideals of the open access movement.
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); undergraduate capstone/honors theses ( ); college alumni magazine; conference proceedings; oral histories media formats: text; images; audio; video; data disciplinary specialties: natural sciences; social sciences; fine arts; humanities; interdisciplinary studies top publications: “an analysis of the career length of professional basketball players” (thesis); “the cultural omnivore in its natural habitat: music taste at a liberal arts college” (thesis); “what are the effects of mergers in the u.s. airline industry? an econometric analysis on delta-northwest merger” (thesis); “the mirror’s reflection: virgil’s aeneid in english translation” (thesis); “fat teen trouble: a sociological perspective of obesity in adolescents” (thesis) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students other partners: association for nepal and himalayan studies (anhs) publishing platform(s): bepress (digital commons); contentdm digital preservation strategy: in-house additional services: typesetting; cataloging; metadata; issn registration; dataset management; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: working with faculty to develop data management curation and preservation.
mcgill university mcgill university library primary unit: escholarship, epublishing and digitization primary contact: amy buckland escholarship, epublishing & digitization coordinator - - amy.buckland@mcgill.ca program overview mission/description: mcgill university library showcases the research done by the mcgill community to the world via publishing initiatives such as electronic theses and dissertations, open access journals and monographs, and by partnering with others to develop new methods to disseminate research. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( ); working papers media formats: text; images; audio; video disciplinary specialties: education; food cultures; library history top publications: mcgill journal of education (journal); cuizine (journal); fontanus (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: public knowledge project; erudit; thesescanada [library publishing coalition: contributing institution] publishing platform(s): ojs/ocs/omp; locally developed software; digitool digital preservation strategy: in-house; digital preservation services under discussion additional services: training; analytics; notification of a&i sources; issn registration; author copyright advisory; other author advisory; digitization; hosting of supplemental content plans for expansion/future directions: etd program and fontanus series are well established, but ojs journals are still in a developmental stage; looking to pair with the digital humanities community on
campus to look at new ways of publishing, beyond the journal/monograph binary.

miami university university libraries primary unit: center for digital scholarship primary contact: john millard head, center for digital scholarship - - millarj@miamioh.edu program overview mission/description: we want to serve as a collaborative partner with faculty, students, and staff by providing infrastructure and expertise to support open access journals with or without peer review. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ) media formats: text disciplinary specialties: computer science and engineering; psychology percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): ojs/ocs/omp digital preservation strategy: in-house additional services: cataloging; metadata; author copyright advisory; digitization

mount saint vincent university mount saint vincent university library primary unit: archives and scholarly communication ojs@msvu.ca primary contact: roger gillis scholarly communications and archives librarian - - roger.gillis@gmail.com website: journals.msvu.ca program overview mission/description: journals at the mount is a hosting service provided by the mount saint vincent university library for the mount community and/or affiliated partners. the service employs open journal systems (ojs) as the hosting platform for scholarly journals and includes training, support, and guidance for the development of new and existing publications of the mount community.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ); other ( ) publishing activities types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images disciplinary specialties: women’s/gender studies; adult education top publications: atlantis: critical studies in gender, culture & social justice (journal); canadian journal for the study of adult education (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: canadian association for the study of adult education; public knowledge project publishing platform(s): ojs/ocs/omp digital preservation strategy: digital preservation services under discussion additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; hosting of supplemental content plans for expansion/future directions: digitizing back issues, developing student journals, and discussing with faculty the development of new journals/migrating existing journals to the ojs platform.

northeastern university university libraries primary unit: scholarly communication primary contact: hillary corbett scholarly communication librarian - - h.corbett@neu.edu program overview mission/description: the university libraries offer a growing suite of publishing services in response to the needs of faculty, students, and staff. the libraries provide an online platform for journal publishing and the opportunity to produce innovative online collections and e-books through its digital repository service.
through the repository service, the libraries also provide open access to the university’s electronic theses and dissertations, scholarly research output, and university-produced objects. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons); fedora; omeka; issuu digital preservation strategy: in-house; digital preservation services under discussion [library publishing coalition: founding institution] additional services: graphic design (print or web); typesetting; copy-editing; outreach; training; metadata; compiling indexes and/or tocs; notification of a&i sources; doi assignment/allocation of identifiers; dataset management; author copyright advisory; other author advisory; digitization; hosting of supplemental content plans for expansion/future directions: working to expand the capabilities of our digital repository, in response to users’ needs for space that can accommodate new kinds of projects; bringing another faculty journal online in the coming year. highlighted publication: annals of environmental science publishes original, peer-reviewed research in the environmental sciences, broadly defined. it has been published open-access at northeastern university since . 
www.aes.neu.edu northwestern university northwestern university library primary unit: center for scholarly communication and digital curation cscdc@northwestern.edu primary contact: claire stewart head, digital collections and scholarly communication services - - claire-stewart@northwestern.edu website: cscdc.northwestern.edu social media: @nu_cscdc program overview mission/description: we are engaged in planning activities to identify tools and support models that enable distributed, preservable publishing projects across the entire university. in initial phases, we anticipate the emphasis will be heavier on non-traditional products, transitioning to open theses, open journals, and open books as the key stakeholders, including our press, move into closer technical and mission alignment. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ); charitable contributions/friends of the library organizations ( ); grants ( ) publishing activities types of publications: databases ( ); scholarly websites that are heavily content-driven media formats: text; images disciplinary specialties: classics; history top publications: classicizing chicago (digital collection) campus partners: campus departments or programs; individual faculty (library publishing coalition contributing institution) publishing platform(s): fedora; wordpress; drupal digital preservation strategy: duracloud/dspace; dpn; in-house; digital preservation services under discussion additional services: graphic design (print or web); training; metadata; dataset management; author copyright advisory; digitization; hosting of supplemental content additional information: we are working in many areas that blur into “library
publishing,” so it is sometimes hard to isolate the people, tasks, and funding that contribute to library publishing services. it is an area that we see as a growing component of our library’s scholarly and digital programs. the fact that the university press also reports to the dean of libraries opens up avenues for fruitful discussion, but to date the press’ publishing is quite separate from the library’s. plans for expansion/future directions: developing a consulting service for faculty seeking to establish new publications and engaging in conversations with partners on campus around a shared investment in a cloud-based wordpress service, with plans to build and extend custom plugins for publishing projects and to integrate cms-based publishing projects with the library’s digital repository; exploring possible collaborations with the university press, especially related to policy and infrastructure. oberlin college oberlin college library primary unit: oberlin college library alan.boyd@oberlin.edu primary contact: alan boyd associate director of libraries - - alan.boyd@oberlin.edu program overview mission/description: publish all current and retrospective honors papers and master’s theses with concurrence of the faculty department. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: undergraduate capstone/honors theses ( ) media formats: text; images; multimedia/interactive content campus partners: campus departments or programs publishing platform(s): ohiolink etd center digital preservation strategy: no digital preservation services provided additional services: outreach; cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content ohio state university university libraries primary unit: digital content services schlosser. @osu.edu primary contact: melanie schlosser digital publishing librarian - - schlosser. @osu.edu website: library.osu.edu/projects-initiatives/knowledge-bank program overview mission/description: our mission is to engage with partners across the university to increase the amount, value, and impact of osu-produced digital content. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); student conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( ); conference and event lectures and presentations ( ); graduate student culminating papers and projects ( ); graduate student research forum papers and symposia posters ( ); undergraduate research forum presentations and posters ( ) media formats: text; images; audio; video; data percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: society for disability studies; the ohio academy of science (library publishing coalition founding institution) publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: digital preservation services under discussion additional services: graphic design (print or web); typesetting; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; digitization; hosting of supplemental content; consulting and educational programming additional information: although an etd program is considered library publishing for this survey, we did not include etds. our dissertations and theses are submitted by students to the ohiolink consortial etd database. since autumn term , dissertations have been produced by the student in electronic format and submitted to the ohiolink etd center.
beginning calendar , all master’s theses have been produced by the student in electronic format and submitted to the ohiolink etd center. we do not host our dissertations and theses separately. plans for expansion/future directions: formalizing policies and procedures, recruiting new publishing partners, and adding new services. highlighted publication: disability studies quarterly, the journal of the society for disability studies, is a multidisciplinary, international publication that covers all aspects of disability studies. dsq-sds.org oregon state university oregon state university libraries and press primary unit: center for digital scholarship and services primary contact: michael boock head of the center for digital scholarship and services - - michael.boock@oregonstate.edu website: cdss.library.oregonstate.edu program overview mission/description: oregon state university libraries’ publishing activities are primarily focused on the dissemination of scholarship produced by osu faculty and students. this is achieved largely through the institutional repository scholarsarchive@osu, which includes previously unpublished material such as electronic theses and dissertations, agricultural extension reports, and faculty datasets. osu libraries also hosts open access journals that include articles by osu faculty. the libraries’ center for digital scholarship and services digitizes selected out-of-print osu press publications, and provides open access to excerpts from press books and supplementary materials such as maps and datasets. other publishing activities involve the development of online resources that present and interpret unique holdings of osu libraries. examples include extensive documentary histories and online exhibits on the linus pauling papers and related archival collections in the history of science and other areas.
osu libraries has also developed digital resources in conjunction with books published by the osu press. examples include a mobile application for touring historic buildings that is based on a book about portland architecture, and a website that supports nature exploration related to a children’s book published by the press. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); datasets (library publishing coalition founding institution) media formats: text; images; audio; video; data disciplinary specialties: forestry; agriculture; history of science; water studies top publications: growing your own (technical report); forest phytophthoras (journal); international institute for fisheries economics and trade conference proceedings (conference proceedings); journal of the transportation research forum (journal); reducing fire risk on your forest property (technical report) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: transportation research forum; international institute for fisheries economics and trade; western dry kiln association; oregon institute for natural resources publishing platform(s): contentdm; dspace; fedora; ojs/ocs/omp; wordpress; omeka digital preservation strategy: archive-it; lockss; metaarchive additional services: graphic design (print or web); training; analytics; 
cataloging; metadata; issn registration; doi assignment/allocation of identifiers; dataset management; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming highlighted publication: for more than a century, oregon state university’s extension service and agricultural experiment station publications have covered everything from winemaking techniques to marine economics. ir.library.oregonstate.edu/xmlui/handle/ / additional information: it should be noted that while the osu press is part of the osu libraries organization, the press’ publishing program, which results in the publication of approximately twenty-five books per year on the pacific northwest, has mostly operated independently from the libraries’ publishing activities. therefore, the descriptions of “library publishing” have not included the press’ current print publishing output. in the future, the publishing programs of the libraries and press will be increasingly integrated. plans for expansion/future directions: our plans for the future largely focus on open access student journals, digital humanities, and open textbooks. student journals will publish research from osu undergraduate and graduate students, as well as students from around the world in specific disciplines. digital humanities projects will incorporate platforms that emphasize multimedia elements in presenting scholarship by osu faculty. open textbooks will involve a new partnership between the osu libraries and press and the osu extended campus open educational resources unit to support development of open textbooks by osu faculty. the osu libraries’ gray family chair for innovative library services will focus on digital publishing for at least the next three years, with a new incumbent providing vision and direction for innovation and sustainability in digital publishing.
pacific university pacific university libraries primary unit: local collections and publication services primary contact: isaac gilman scholarly communications and research services librarian - - gilmani@pacificu.edu website: www.pacificu.edu/library/services/lcps/index.cfm program overview mission/description: pacific university libraries’ publishing services exist to disseminate diverse and significant scholarly and creative work, regardless of a work’s economic potential. through flexible open access publishing models and author services, pacific university libraries will contribute to the discovery of new ideas (from scholars within and outside the pacific community) and to the sustainability of the publishing system. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); monographs ( ); etds ( ) media formats: text; images; audio disciplinary specialties: health care; philosophy; undergraduate research; librarianship top publications: essays in philosophy (journal); journal of librarianship and scholarly communication (journal); health & interprofessional practice (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students (library publishing coalition contributing institution) publishing platform(s): bepress (digital commons) digital preservation strategy: digital preservation services under discussion additional services: typesetting; copy-editing; training; analytics; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; author copyright advisory; digitization pennsylvania state university penn state university libraries 
primary unit: publishing & curation services primary contact: linda friend head, scholarly publishing services - - lxf @psu.edu website: www.libraries.psu.edu/psul/pubcur.html program overview mission/description: our mission is to provide authors and researchers with consultation on publishing options and practical, alternative ways for penn state faculty and students to publish and disseminate research in many formats. in addition, we provide assistance to scholarly journals and societies in disseminating their publications and proceedings electronically. we subscribe to the principles of open access to research information. doctoral dissertations and master’s theses for most academic programs are submitted digitally and are disseminated through the libraries, and there is an active program of collecting and making student research available. the three primary research journals in the field of pennsylvania history are part of our digitized collections. we are currently investigating the need and feasibility of offering an enhanced program of tiered publishing services, particularly for research journals, data, conference proceedings, and student-initiated work. year publishing activities began: organization: centralized library publishing unit/department. some operations and publishing workflow responsibilities are distributed among several library units/departments including technology support, cataloging, preservation, etc. staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ); sales revenue ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); graduate student research exhibition posters; undergraduate student research exhibition posters (library publishing coalition founding institution) media formats: text; images; audio; video; data disciplinary specialties: pennsylvania history and culture top publications: pennsylvania history journal (journal); pennsylvania magazine of history and biography (magazine); western pennsylvania history (journal); wepan conference proceedings (conference proceedings) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: women in engineering proactive network (wepan); historical society of pennsylvania; heinz history center; pennsylvania history association publishing platform(s): contentdm; ojs/ocs/omp; wordpress digital preservation strategy: digital preservation services under discussion; digital preservation special team is currently working on a long range plan. additional services: marketing; outreach; metadata; dataset management; other author advisory; digitization; hosting of supplemental content plans for expansion/future directions: redescribing the program, with expansion of services in the near future. highlighted publication: western pennsylvania history from the heinz history center is a colorful regional quarterly of interest to scholars and history buffs alike.
ojs.libraries.psu.edu/index.php/wph pepperdine university pepperdine university libraries primary unit: office of the dean of libraries primary contact: mark roosa dean of libraries - - mark.roosa@pepperdine.edu website: digitalcommons.pepperdine.edu program overview mission/description: the pepperdine libraries provide a global gateway to knowledge, serving the diverse and changing needs of our learning community through personalized service at our campus locations and rich computer-based resources. at the academic heart of our educational environment, our libraries are sanctuaries for study, learning, and research, encouraging discovery, contemplation, social discourse, and creative expression. as the information universe continues to evolve, our goal is to remain responsive to users’ needs by providing seamless access to both print and digital resources essential for learning, teaching, and research. the libraries, through digital commons@pepperdine, offer a wide array of digital publications that are openly available for study, research, and learning.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; data disciplinary specialties: religion; business; public policy; psychology; law (library publishing coalition contributing institution) top publications: pepperdine law review (journal); leaven (journal); pepperdine dispute resolution law journal (journal); the journal of business, entrepreneurship and the law (journal); journal of the national association of administrative law judiciary (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons); contentdm digital preservation strategy: lockss; portico; preservica; in-house additional services: marketing; outreach; training; cataloging; metadata; dataset management; digitization; audio/video streaming plans for expansion/future directions: publishing additional undergraduate research; creating a line of monographic publications; publishing rich media content (e.g., video presentations); implementing an enterprise digital preservation solution; identifying new ways of participating in the editorial processes generally associated with publishing.
portland state university portland state university library primary unit: digital initiatives primary contact: sarah beasley scholarly communication coordinator - - bvsb@pdx.edu program overview mission/description: portland state university (psu) library provides the infrastructure and a suite of services to offer a publishing platform that facilitates open access distribution; enhanced web search engine discovery through standards-based metadata and file formatting; permanent urls; file formatting and format migration; copyright advisory for authors; and outreach for and promotion of psu faculty or psu departmentally sponsored content. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: student-driven journals ( ); monographs ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data disciplinary specialties: physics; environmental sciences; engineering and computer science; urban studies and planning; education percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty publishing platform(s): bepress (digital commons) digital preservation strategy: in-house additional services: marketing; outreach; analytics; cataloging; metadata; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: hosting journals, archiving monographs, and producing open access textbooks.
purdue university purdue university libraries primary unit: purdue scholarly publishing services primary contact: charles watkinson head, scholarly publishing services - - ctwatkin@purdue.edu website: www.lib.purdue.edu/publishing social media: @publishpurdue program overview mission/description: purdue scholarly publishing services focuses on supporting the publication efforts of various centers and departments within the purdue system. the primary publishing platform used is purdue e-pubs (www.purdue.edu/epubs), and the majority of products created are openly accessible, free-of-charge, to readers. open access is made possible by the financial support of partners, foundations, and purdue university libraries. major initiatives include the production of the journal of purdue undergraduate research, the publication of technical reports on behalf of the joint transportation research program (jtrp), and the project management of habri central, a major bibliographic reference database for researchers in the area of human-animal bond studies, produced in partnership with the purdue college of veterinary medicine. purdue scholarly publishing services and purdue university press, which publishes more formal books and journals, together constitute the publishing division of purdue libraries. our diverse publishing activities are supported by a single group of staff members with assistance from undergraduate and graduate students. by harnessing the skills of both librarians and publishers, and leveraging a common infrastructure, we believe we can better serve the needs of scholars in the digital age and enhance the impact of purdue scholarship by developing information products aligned with the university’s strengths. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ); undergraduate students ( . 
) (library publishing coalition founding institution) funding sources (%): library operating budget ( ); non-library campus budget ( ); grants ( ); sales revenue ( ); licensing ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); habri central (an information hub for human-animal bond studies built on the hubzero platform for scientific collaboration); the data curation profiles directory media formats: text; images; audio; video; data; multimedia/interactive content disciplinary specialties: engineering (civil engineering); education (stem); library and information science; public policy; comparative literature top publications: joint transportation research program technical reports (technical reports); jpur: journal of purdue undergraduate research (journal); habri central (website); clcweb: comparative literature and culture (journal); interdisciplinary journal of problem-based learning (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: indiana department of transportation (indot); habri foundation; charleston conference/against the grain press; international association of scientific and technological university libraries (iatul) highlighted publication: the journal of purdue undergraduate research (jpur) has been established to publish outstanding research papers written by purdue undergraduates from all disciplines who have completed faculty-mentored research projects.
docs.lib.purdue.edu/jpur publishing platform(s): bepress (digital commons); hubzero for habri central digital preservation strategy: clockss and portico for most important journals; metaarchive for habri central additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; business model development; budget preparation; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming; developmental editing; project management additional information: data publication is handled by the purdue university research repository (purr), which is a collaborative project of the libraries, information technology at purdue (itap), and the office of the vice president for research. we have classified all open access journals as being products of scholarly publishing services because of the types of workflow adopted, but five of these use the purdue university press imprint. plans for expansion/future directions: working to expand the number of centers and departments we serve on campus, particularly in the area of conference proceedings and technical reports; creating better linkages between publications and materials in purdue’s data and archival repositories; developing better capacity to handle multimedia and “new form” publications; developing a clearer sustainability plan across the libraries publishing division that balances earned revenue with internal support.
rochester institute of technology the wallace center primary unit: scholarly publishing studio primary contact: nick paulus manager of scholarly publishing - - njpwml@rit.edu website: wallacecenter.rit.edu/scholarly-publishing-studio program overview mission/description: we connect stakeholders’ scholarship efforts with our comprehensive publishing services, ensuring that faculty and student research is made available to readers faster and disseminated in a way that meets their academic objectives. our approach is collaborative. we offer help with design and layout, copy-editing outsourcing, open access publishing, and pre-publishing consultation. at sps, we are committed to advancing the dissemination of scholarship. organization: centralized library publishing unit/department publishing activities campus partners: campus departments or programs; individual faculty publishing platform(s): bepress (digital commons); ojs/ocs/omp additional services: graphic design (print or web); copy-editing rutgers, the state university of new jersey rutgers university libraries primary unit: scholarly communication center primary contact: rhonda marker rucore collection manager/head, scholarly communications center - - rmarker@rutgers.edu website: rucore.libraries.rutgers.edu/services program overview mission/description: the goal of the rutgers university community repository is to advance research and learning at rutgers, to foster interdisciplinary collaboration, and to contribute to the development of new knowledge through the archiving, preservation, and presentation of digital resources. original research products and papers of the faculty and administrators and the unique resources of the libraries will be permanently preserved and made accessible with tools developed to facilitate and encourage their continued use.
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library materials budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); databases ( ); etds ( ); research/interview videos ( ); a new scholarly communication form, the published video analytic, currently in use in our nsf-funded mathematics education collection, the video mosaic (www.videomosaic.org) media formats: text; video; data; multimedia/interactive content disciplinary specialties: mathematics education; psychology; jazz music; new jersey history; classical studies (library publishing coalition contributing institution) top publications: video mosaic collaborative (website); pragmatic case studies in psychotherapy (journal); journal of jazz studies (journal); journal of rutgers university libraries (journal); new jersey history (journal) percentage of journals that are peer reviewed: campus partners: individual faculty; graduate students publishing platform(s): fedora; ojs/ocs/omp; locally developed software digital preservation strategy: in-house additional services: graphic design (print or web); training; analytics; cataloging; metadata; notification of a&i sources; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; audio/video streaming additional information: our focus is on the unique scholarship and resources of rutgers university and on the research and education needs of our community.
we develop new tools and services, including new modes of scholarly communication, in response to faculty and student needs, often through collaboration in research grants. plans for expansion/future directions: expanding our publishing of original research and scholarship, with a particular focus on research data and digital video, including video of conferences and lectures held in the alexander library; exploring the publishing of undergraduate research in open access journals and new modes of scholarly communication, particularly in the humanities and social sciences. simon fraser university simon fraser university library 1. theses/dissertations primary unit: thesis office primary contact: nicole white head, research commons - - ngjertse@sfu.ca website: www.lib.sfu.ca/help/writing/thesis program overview mission/description: responsible for accepting formatted theses and dissertations, and depositing them in the library’s institutional repository, summit. summit also acts as a publication platform for university authors (e.g., conference papers, technical reports). conforms to oai-pmh. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( %) publishing activities types of publications: technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; multimedia/interactive content campus partners: campus departments and programs; individual faculty; graduate students publishing platform(s): drupal digital preservation strategy: archivematica; in-house. work is underway to allow archivematica to store aips (archival information packages) in a coppul private lockss network.
contributing institution, library publishing coalition
additional services: copy-editing; training; analytics; compiling indexes and/or tocs; author copyright advisory; hosting of supplemental content
plans for expansion/future directions: working toward becoming a trusted digital repository. moved to exclusively digital thesis submission.

. scholarly journals and conferences
primary unit: public knowledge project publishing services (pkp|ps) pkp-hosting@sfu.ca
primary contact: brian owen associate university librarian/pkp managing director - - brian_owen@sfu.ca
website: http://www.lib.sfu.ca/collections/scholarly-publishing; https://pkpservices.sfu.ca
program overview
mission/description: provide online hosting and related technical support at no charge for scholarly journals and conferences that have a significant sfu faculty connection (e.g., a managing editor) or to support sfu-based teaching and research initiatives.
year publishing activities began:
organization: the sfu library provides the administrative and technical home for pkp and its related activities, such as pkp publishing services. in return, pkp|ps provides the technical expertise and infrastructure support for the sfu library’s scholarly communication services. pkp|ps staff work closely with the library’s liaison librarians.
staff in support of publishing activities (fte): library staff (. )
funding sources (%): library operating budget ( ); pkp|ps in-kind ( )
publishing activities
types of publications: faculty-driven and graduate student journals ( ); scholarly conferences ( )
media formats: text; images; audio; video; data; multimedia/interactive content
other partners: sfu’s canadian centre for studies in publishing
publishing platform(s): ojs/ocs
digital preservation strategy: coppul; lockss
additional services: digitization; software customization/development
additional information: pkp publishing services is not a typical library publishing operation. by virtue of being the developers of ojs and other pkp software, we are able to offer technical support that may not be feasible for other library publishing services.
plans for expansion/future directions: hosting and related support for open monograph press (omp).

state university of new york at buffalo
e. h. butler library
primary unit: scholarly communication librarian
primary contact: marc d. bayer scholarly communication librarian - - bayermd@buffalostate.edu
website: digitalcommons.buffalostate.edu/submit_research.html
program overview
mission/description: the e. h. butler library publishes monographs and periodicals that feature the research, applied, and artistic works of the buffalo state community. in addition to a print publishing program, the library administers the campus institutional repository.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( )
publishing activities
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: monroe fordham regional history center of suny buffalo state
plans for expansion/future directions: including more student research.

state university of new york at geneseo
milne library
primary unit: technical services milne@geneseo.edu
primary contact: allison brown editor and production manager - - browna@geneseo.edu
website: publishing.geneseo.edu
program overview
mission/description: the mission of milne library publishing services is based on a core value of libraries: knowledge sharing and literacy are an essential public good. the goal of milne publishing is to inspire authors and creators to share their works with a sustainable publishing model that rewards both authors and readers, libraries and learning. milne publishing will help transform scholarly communications and library publishing.
year publishing activities began:
organization: distributed across library units; open suny textbooks and individual journals are distributed among various institutions
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( ); grants ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); student conference papers and proceedings ( ); newsletters ( ); tei digital humanities projects; omeka digital collections; best practices toolkits
media formats: text; images.
disciplinary specialties: education; library and information science; local history; humanities/liberal arts
top publications: digitalthoreau.org (website); reprints and new monographs on amazon.com; educational change (journal); reprints on open monograph press; workflow toolkit (website)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
other partners: thoreau society; thoreau institute; walden woods project; new york state foundations of education association
publishing platform(s): contentdm; ojs/ocs/omp; wordpress; commons in a box
digital preservation strategy: no digital preservation services provided; server backup as appropriate
additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; budget preparation; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
additional information: library publishing toolkit available at: www.publishingtoolkit.org. developing interactives with video, multiple choice feedback, etc.
plans for expansion/future directions: expanding the use of open monograph press for textbook, reprints, and new monograph publishing; developing network hosting and training models for open journal systems and open monograph press; expanding the role of digital scholarship publishing with social reading in digital thoreau and the use of omeka.

syracuse university
syracuse university libraries
primary unit: scholarly communication
primary contact: yuan li scholarly communication librarian - - yli @syr.edu
website: surface.syr.edu
program overview
mission/description: to provide syracuse university (su) faculty with an alternative to commercial publishing venues, and to provide the campus community support for open access publishing models.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library materials budget ( ); library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); working papers; journal articles; images; video; and presentations
media formats: text; video
disciplinary specialties: law and commerce; public diplomacy; writing and rhetoric; disability and popular culture
top publications: intertext (journal)
percentage of journals that are peer reviewed:
campus partners: syracuse university press; campus departments or programs; individual faculty; graduate students; undergraduate students
founding institution, library publishing coalition
publishing platform(s): bepress (digital commons); ojs/ocs/omp
digital preservation strategy: aptrust; dpn; lockss
additional services: graphic design (print or web); typesetting; copy-editing; marketing; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; peer review management; business model development; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming
plans for expansion/future directions: launching a joint imprint (syracuse unbound) with syracuse university press and two new open access journals in the coming months; forming a new unit that brings together several units involved in digital scholarship activities, including digital publishing; formalizing a menu of publishing services for the campus community.
highlighted publication: intertext aims to represent the writing of syracuse university students through publishing exemplary works submitted from any writing program undergraduate course. wrt-intertext.syr.edu

temple university
temple university libraries
primary unit: digital library initiatives diglib@temple.edu
primary contact: delphine khanna head of digital library initiatives - - delphine@temple.edu
website: digital.library.temple.edu
program overview
mission/description: the goal of our program is to provide free and open access to digital scholarship produced by temple university students. currently, we focus on the publishing of doctoral dissertations, master’s theses, and the winning essays of the temple university library prize for undergraduate research in general topics and in topics related to sustainability and the environment. in the future, we plan to greatly expand our publishing program to include scholarly journals and books.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( .
)
funding sources (%): library operating budget ( )
publishing activities
types of publications: etds ( ); winning essays for the temple university library prize for undergraduate research ( )
media formats: text; images; data
disciplinary specialties: full range of academic subjects in etds
top publications: “the digitalization of music culture: a case study examining the musician/listener relationship with digital technology” (thesis); “profitability ratio analysis for professional service firms” (thesis); “naskh al-qur’an: a theological and juridical reconsideration of the theory of abrogation and its impact on qur’anic exegesis” (thesis); “pcaob international inspection and audit quality” (thesis); “mother of god, cease sorrow!: the significance of movement in a late byzantine icon” (thesis)
campus partners: campus departments or programs
publishing platform(s): contentdm
digital preservation strategy: in-house. digital preservation services under discussion; our contentdm instance is hosted at oclc and they have backup procedures. we are also now considering membership in hathitrust.
additional services: analytics; cataloging; metadata; hosting of supplemental content
plans for expansion/future directions: planning significant expansion of services, such as the inclusion of books and journals.

texas tech university
texas tech university libraries
primary unit: digital resources library unit
primary contact: christopher starcher digital services librarian - - christopher.starcher@ttu.edu
program overview
mission/description: to publish and archive the scholarship of texas tech university by its faculty, researchers, and students.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); graduate students ( . ); undergraduate students ( . )
funding sources (%): library operating budget ( )
publishing activities
types of publications: textbooks ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( )
media formats: text; images
top publications: etds; honors theses
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): dspace
digital preservation strategy: duracloud/dspace; lockss; scholars portal; in-house; digital preservation services under discussion. everything is housed at the university data center and then backed up to an out-of-town remote storage facility.
additional services: outreach; training; analytics; metadata; doi assignment/allocation of identifiers; author copyright advisory; other author advisory; digitization

thomas jefferson university
scott memorial library
primary unit: academic and instructional support & resources
primary contact: dan kipnis senior education services librarian and editor of jefferson digital commons - - dan.kipnis@jefferson.edu
program overview
mission/description: to provide an open access institutional repository of the work being produced by the jefferson community to a global audience.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( .
)
funding sources (%): library materials budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); videos; grand round presentations; conference posters
media formats: text; images; audio; video
disciplinary specialties: historical psychiatry; internal medicine; population studies; integrative medicine
top publications: jefferson journal of psychiatry (journal); the medicine forum (journal); on the anatomy of the breast (monograph); a manual of military surgery (monograph); legend and lore: jefferson medical college (monograph)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty
other partners: special library association
publishing platform(s): bepress (digital commons)
digital preservation strategy: lockss; in-house
additional services: marketing; outreach; training; analytics; metadata; issn registration; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming
additional information: we are encouraged by our grass roots effort to get materials in our ir and to continue our publishing efforts.
plans for expansion/future directions: continuing to add journals, newsletters, and additional grey literature materials to our institutional repository.

trinity university
coates library
primary unit: discovery services
primary contact: jane costanza head of discovery services - - jcostanz@trinity.edu
website: digitalcommons.trinity.edu
program overview
mission/description: the trinity university open access policy encourages faculty authors to retain non-commercial copyright for their scholarly publications and provides them with the means to negotiate those rights with their publishers.
additionally, open access facilitates the sharing of peer-reviewed research through trinity’s digital repository (digital commons @ trinity), which provides broad, free access to a faculty author’s scholarly work. the open access policy at trinity depends for its effectiveness on faculty authors granting to the university permission to upload digital copies of their scholarly publications to trinity’s digital repository.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . )
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); student conference papers and proceedings ( ); undergraduate capstone/honors theses ( ); administrative reports
media formats: text; images; video; data
disciplinary specialties: teacher education; anthropology; psychology; mathematics; biology
top publications: “cognitive bias modification: past perspectives, current findings, and future applications” (thesis); “cognitive bias modification: induced interpretive biases affect memory” (thesis); “a survey of psychologists’ attitudes towards and utilization of exposure therapy for ptsd” (thesis); “islamophobia, euro-islam, islamism and post-islamism: changing patterns of secularism in europe” (thesis); tipiti (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; undergraduate students
other partners: society for the anthropology of lowland south america
publishing platform(s): bepress (digital commons)
digital preservation strategy: clockss
additional services: analytics; cataloging; metadata; open url support; author copyright advisory
additional information: we also support selectedworks.
plans for expansion/future directions: continuing to help faculty members understand the issues around the economics of scholarly publishing and the benefits of providing open access to their scholarly output.

tulane university
howard-tilton memorial library
primary unit: digital initiatives
primary contact: jeff rubin digital initiatives and publishing coordinator - - jrubin @tulane.edu
website: library.tulane.edu/repository
program overview
mission/description: tulane university journal publishing is an open access journal publishing service that provides a web-based platform for scholarly and academic publishing to the tulane community.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); etds ( )
media formats: text; images; audio; video
disciplinary specialties: zoology; botany; international affairs; literary
top publications: tulane studies in zoology and botany (journal); tulane review (journal); tulane journal of international affairs (journal); second line: an undergraduate journal of literary conversation (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): ojs/ocs/omp
contributing institution, library publishing coalition
digital preservation strategy: dpn; centralized storage and backup through tulane technology services
additional services: training; metadata; issn registration; author copyright advisory; other author advisory; hosting of supplemental content; audio/video streaming

université de montréal
université de montréal libraries
primary unit: teaching, learning and research support
primary contact: diane sauvé director, teaching, learning and research support - - ext. diane.sauve@umontreal.ca
website: www.bib.umontreal.ca/papyrus
program overview
mission/description: the université de montréal institutional repository, papyrus, provides access to the university theses and dissertations, as well as to some publications and other forms of intellectual output from the university.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( )
funding sources (%): library operating budget ( )
publishing activities
types of publications: etds ( )
media formats: text; images; audio; video
campus partners: campus departments or programs
publishing platform(s): dspace
digital preservation strategy: no digital preservation services provided

university of alberta
university of alberta libraries
primary unit: digital initiatives
primary contact: leah vanderjagt digital repository services librarian - - leah.vanderjagt@ualberta.ca
website: guides.library.ualberta.ca/oa
social media: listed at www.library.ualberta.ca
program overview
mission/description: the university of alberta libraries provides support to community members who want to publish in oa formats (e.g., providing journal hosting and institutional repository services).
year publishing activities began:
staff in support of publishing activities (fte): library staff ( . ); graduate students ( . )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); videos
media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content
disciplinary specialties: library and information studies; education; pharmaceutical sciences; sociology; environmental studies (particularly oil sands)
top publications: canadian journal of sociology (journal); international journal of qualitative methods (journal); journal of pharmacy & pharmaceutical sciences (journal); evidence based library and information practice (journal); canadian review of comparative literature (journal)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: public knowledge project; research teams/projects (e.g., oil sands research and information network; canadian writing research collaboratory); local non-profit organizations (e.g., edmonton social planning council).
publishing platform(s): fedora; ojs/ocs/omp; wordpress; locally developed software
digital preservation strategy: archive-it; archivematica; clockss; coppul; hathitrust; lockss; portico; in-house
additional services: outreach; cataloging; metadata; doi assignment/allocation of identifiers; open url support; dataset management; author copyright advisory; other author advisory; digitization; hosting of supplemental content
additional information: funding for publishing services comes out of the library operations budget. however, we do not have a fixed breakdown.
we do not charge users for our publishing services and only publish open access content.
plans for expansion/future directions: supporting the growth of our institutional repository and journal hosting services; facilitating the development of campus-wide scholarly publishing initiatives (e.g., establishing an open monograph publishing service, research data “publication” and curation), open educational resources (oer), etc.

university of arizona
university of arizona libraries
primary unit: scholarly publishing and data management team repository@u.library.arizona.edu
primary contact: dan lee director, office of copyright management and scholarly communication - - leed@email.arizona.edu
website: journals.uair.arizona.edu; arizona.openrepository.com/arizona
program overview
mission/description: the scholarly publishing and data management team provides tools, services, and expertise that enable the creation, distribution, and preservation of scholarly works and research data in support of the mission of the university of arizona.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . ); graduate students ( ); undergraduate students ( )
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); periodicals ( )
media formats: text; images; audio; video; data
disciplinary specialties: agriculture; life sciences; dendrochronology; archaeology; geosciences
top publications: radiocarbon (journal); journal of ancient egyptian interconnections (journal); etds; coyote papers (working papers); arizona anthropologist (journal)
founding institution, library publishing coalition
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
other partners: international society of lymphology; society for range management; tree ring society
publishing platform(s): contentdm; dspace; ojs/ocs/omp; locally developed software
digital preservation strategy: digital preservation services under discussion
additional services: training; analytics; cataloging; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content
plans for expansion/future directions: discussing collaborative efforts with the university press.
highlighted publication: radiocarbon is the main international journal of record for research articles and date lists relevant to c and other radioisotopes and techniques used in archaeological, geophysical, oceanographic, and related dating. www.radiocarbon.org

university of british columbia
university of british columbia library
primary unit: digital initiatives and scholarly communications
primary contact: allan bell director, digital initiatives and scholarly communications - - allan.bell@ubc.ca
website: circle.ubc.ca
program overview
mission/description: digital initiatives and scholarly communication services supports new models of scholarly communications, copyright services, the showcasing of ubc’s intellectual output via open access repository services, as well as the digitization of unique historical materials. digital initiatives and scholarly communication services is a key part of the library’s strategy to support the evolving needs of faculty and students and to support teaching, research and learning at ubc. our goal is to create sustainable, world-class programs and processes that promote digital scholarship, make ubc research and digital collections openly available to the world, and ensure the long-term preservation of ubc’s digital collections.
year publishing activities began:
organization: services are distributed across library units/departments
staff in support of publishing activities (fte): library staff ( ); graduate students ( . )
funding sources (%): library operating budget ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); non-thesis graduate student research ( )
media formats: text; images; audio; video; data
disciplinary specialties: mining engineering; forestry; education; sustainability; earth and ocean sciences
top publications: “guidelines for mine haul road design” (technical report); “comparison of limit states design” (technical report); “pain-enduring eccentric exercise” (technical report); “portable science: podcasting as an outreach tool for a large academic science and engineering library” (technical report); “wet-bulb temperature” (technical report)
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students
publishing platform(s): dspace; ojs/ocs/omp
digital preservation strategy: archivematica; coppul; lockss; in-house. we participate in the coppul lockss pln.
additional services: marketing; outreach; training; analytics; metadata; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming
additional information: we also publish lower-level undergraduate work in our repository, for example, the science one program: circle.ubc.ca/handle/ / . development partner on the public knowledge project (pkp), including the creation and maintenance of user documentation and related training materials, offering hosting and related support, performing testing, participating on pkp’s advisory and technical committees, and seeking further areas for cooperation.

university of calgary
libraries and cultural resources
primary unit: centre for scholarly communication
primary contact: tim au yeung coordinator, digital repository technologies - - ytau@ucalgary.ca
program overview
mission/description: the centre for scholarly communication provides innovative solutions for the creation, evaluation, dissemination, and preservation of the research output of the academy. a priority for libraries and cultural resources, the centre enables scholars through: sustainable electronic publishing using a variety of platforms; robust dissemination of digital collections in multiple formats; a platform for partnerships and discussion of trends and ideas; and solutions for longer term preservation of digital collections.
year publishing activities began:
organization: centralized library publishing unit/department
staff in support of publishing activities (fte): library staff ( . )
funding sources (%): library operating budget ( ); grants ( )
publishing activities
types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); occasional papers
media formats: text; images; audio; video; data
top publications: arctic (journal); ariel: a review of international english literature (journal); journal of military and strategic studies (journal); etds
percentage of journals that are peer reviewed:
campus partners: campus departments or programs; individual faculty
other partners: scholarly societies (e.g., canadian evaluation society); research institutes (e.g., arctic institute of north america); individual faculty at other canadian universities (e.g., university of saskatchewan)
publishing platform(s): contentdm; dspace; ojs/ocs/omp; locally developed software
digital preservation strategy: archivematica; coppul; duracloud/dspace; synergies; in-house; digital preservation services under discussion
additional services: graphic design (print or web); outreach; training; analytics; cataloging; metadata; doi assignment/allocation of identifiers; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming
additional information: the university press is a unit within the library. working collaboratively, the library and press share expertise and technologies to support and extend scholarly publishing services. changes resulting from the integration include transition of press journals to library-hosted online journals (most now open access) and the initiation of open access book publishing. for this survey, activities associated with books under our press imprint were not included.

university of california, berkeley
institute for research on labor and employment library
primary unit: the irle library web team
primary contact: terence k. huwe director of library and information resources - - thuwe@library.berkeley.edu
website: www.irle.berkeley.edu
program overview
mission/description: the irle library uses digital technologies to promote the scholarly content created by the institute for research on labor and employment as well as its affiliated faculty, students, and visiting scholars.
year publishing activities began:
organization: individual units create their own library publishing services, but take care to work with the campus-wide and system-wide resources
staff in support of publishing activities (fte): library staff ( ); graduate students ( ); undergraduate students ( .
) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); in addition to working papers; conference papers and policy reports; gis web resources media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: employment and wage studies; employment in the “green economy”; public sector labor relations; sociology; management of organizations/organizational behavior top publications: “hidden cost of wal-mart jobs: use of safety net programs by wal-mart workers in california” (technical report); “ california establishment survey: preliminary findings on employer based healthcare reform” (technical report); “the impact of san francisco’s employer health spending requirement: initial findings from the labor and product markets” (technical report); “impact of sb on health coverage” (technical report) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: california studies association publishing platform(s): bepress (digital commons); dspace; wordpress; locally developed software digital preservation strategy: uc merritt; in-house; digital preservation services under discussion additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; issn registration; doi assignment/allocation of identifiers; dataset management; business model development; budget preparation; contract/license preparation; other author advisory; digitization; hosting of supplemental content plans for expansion/future
directions: beginning to use the epub format for full-length ebook sales by third party outlets. university of california system california digital library primary unit: access and publishing group primary contact: catherine mitchell director, access and publishing group - - catherine.mitchell@ucop.edu website: www.escholarship.org social media: @escholarship; facebook.com/escholarship program overview mission/description: escholarship provides a suite of open access, scholarly publishing services and research tools that enable departments, research units, publishing programs, and individual scholars associated with the university of california to have direct control over the creation and dissemination of the full range of their scholarship. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data disciplinary specialties: law; romance languages/classics; environmental studies; architecture/urban planning; linguistics/literary studies (library publishing coalition contributing institution) top publications: “assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines” (technical/research report); dermatology online journal (journal); journal of transnational american studies (journal); western journal of emergency medicine (journal); the traffic in praise: pindar and the poetics of social economy (monograph) percentage of journals that are peer
reviewed: campus partners: uc press; campus departments or programs; individual faculty; graduate students; undergraduate students other partners: public knowledge project; uc campus libraries; pubmed; biomed central publishing platform(s): ojs; locally developed software digital preservation strategy: uc merritt additional services: outreach; training; analytics; cataloging; doi assignment/ allocation of identifiers; open url support; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: identify opportunities to support new modes of research by investigating the needs of digital humanities scholars; explore to what extent altmetrics and commenting/annotation provide utility to researchers in different disciplines by experimenting with the provision of related tools and technologies; improve the quality of escholarship journals by providing baseline standards and guidance regarding best practices for oa publications; empower escholarship contributors to better understand and manage their copyright and publishing choices; improve the ability of escholarship research units to more robustly interact with escholarship by completing an administrative interface project (begun in – ) that provides them with expanded capabilities to control their publication environment within escholarship; continue to build relationships with and contribute to the broader digital library publishing community via our major development partnership with the public knowledge project; develop and formalize user community engagement processes for access and publishing services in order to leverage super-user knowledge/ practices, better align development priorities with user needs, raise awareness of new features/development agenda, work more directly with campus contacts and increase outreach opportunities to new 
users. university of central florida john c. hitt library primary unit: information technology and digital initiatives primary contact: lee dotson digital initiatives librarian - - lee.dotson@ucf.edu program overview mission/description: the ucf libraries currently provides publishing support for honors theses, graduate etds, and ucf affiliated or ucf faculty-edited open access e-journals. efforts to support broader dissemination of scholarship include enabling access to a wide audience through freely accessible databases and using open journal systems (ojs) open source publishing software to publish electronic journals from scratch and host electronic journals in florida oj. the ucf libraries collaborates with the florida virtual campus to provide these services. year publishing activities began: organization: services are distributed across library units/departments publishing activities types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text campus partners: campus departments or programs; individual faculty other partners: florida virtual campus publishing platform(s): ojs/ocs/omp; locally developed software digital preservation strategy: fcla daitss additional services: outreach; training; analytics; cataloging; metadata; hosting of supplemental content (library publishing coalition contributing institution) university of colorado anschutz medical campus health sciences library primary contact: heidi zuniga electronic resources librarian - - heidi.zuniga@ucdenver.edu program overview mission/description: the university of colorado anschutz medical campus digital repository will reflect the university’s excellence; support the rapid dissemination of research; foster at all levels understanding and appreciation of the value of research, learning, and teaching at cu anschutz medical campus; ensure future, persistent, and reliable access to intellectual assets.
year publishing activities began: organization: services are distributed across several campuses staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video; data; multimedia/interactive content disciplinary specialties: health sciences top publications: etds percentage of journals that are peer reviewed: campus partners: individual faculty publishing platform(s): digitool by exlibris digital preservation strategy: digital preservation services under discussion additional services: marketing; outreach; cataloging; metadata; author copyright advisory; digitization additional information: we don’t consider ourselves to be a “library as publisher” institution at this point, but we certainly do disseminate etds and other resources. plans for expansion/future directions: publishing works from recipients of an open access journal fund program, also administered by our library, which helps authors pay for oa costs; seeing growth in research datasets, and other material that doesn’t normally get published but may be of value to researchers; monitoring the publication output of our researchers and trying to direct those articles toward the repository.
university of colorado denver auraria library primary unit: special collections and digital initiatives primary contact: matthew mariner head of special collections and digital initiatives - - matthew.mariner@ucdenver.edu website: digitool.library.colostate.edu/r/?func=collections&collection_id= program overview mission/description: the mission of the auraria digital library program is to securely host, faithfully present, and freely distribute cultural, historical, educational, and scholarly content to auraria campus constituents and the interested public. the curation of scholarly publications, or the intellectual output of auraria campus staff, faculty, and students is of particular importance as it serves to promote and legitimize the activities of our institutions amongst our peers. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: etds ( ) media formats: text; images; audio; video campus partners: campus departments or programs other partners: university of colorado denver graduate school publishing platform(s): digitool digital preservation strategy: amazon glacier additional services: author copyright advisory; other author advisory; digitization; audio/video streaming additional information: auraria library actually serves three unaffiliated schools on one campus (cu denver; metropolitan state university of denver; and community college of denver). currently, only cu denver grants graduate degrees requiring a thesis or dissertation, but said school recently made etds mandatory. these are submitted to proquest, but co-delivered to the library, where they are hosted and made publicly available.
we hope to add more capacity for inclusion of undergraduate works (capstones, undergrad research) that would be published solely in our repository (unlike etds, which are technically also held by proquest). in addition to these activities, our scholarly communications librarian jeffrey beall offers advice to faculty regarding publishing, but he is currently forming plans to offer these services more concretely and publicly (i.e., campus-wide). plans for expansion/future directions: offering a space for unpublished undergraduate works, which are often ignored, but given auraria’s diverse and undergraduate-focused constituency, demand emphasis. university of florida george a. smathers libraries primary unit: digital library center ufdc@uflib.ufl.edu primary contact: judy russell dean of university libraries - - jcrussell@ufl.edu website: digital.uflib.ufl.edu; ufdc.ufl.edu program overview organization: services are distributed across library units/departments publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); newsletters ( ); etds ( ); databases ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/ visualizations disciplinary specialties: caribbean studies; entomology; african studies; psychology; physical therapy top publications: arl pd bank (database); vodou archive (digital scholarship database and archive); african studies quarterly (journal); interamerican journal of psychology (journal); florida entomologist (journal); journal of undergraduate research (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: florida virtual campus (flvc); internet archive; digital library of the caribbean (dloc); university press of florida; florida museum of natural history publishing platform(s): ojs/ocs/omp; locally developed software (sobekcm) digital 
preservation strategy: fcla daitss; in-house additional services: outreach; analytics; cataloging; metadata; dataset management; author copyright advisory; digitization; hosting of supplemental content (library publishing coalition contributing institution) university of georgia university of georgia libraries primary unit: digital library of georgia primary contact: andy carter digital projects archivist - - cartera@uga.edu program overview mission/description: our general objectives are to identify valuable, but overlooked, work from faculty and students, and increase the amount of uga’s scholarly output that is available via open access. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ) media formats: text; images; audio disciplinary specialties: higher education campus partners: campus departments or programs; individual faculty publishing platform(s): dspace; ojs/ocs/omp additional services: cataloging; metadata; doi assignment/allocation of identifiers; author copyright advisory; digitization plans for expansion/future directions: fine-tuning etd platform; expanding journal hosting efforts using ojs, depending on need and interest on campus. (library publishing coalition contributing institution) university of guelph university of guelph library primary unit: research enterprise and scholarly communication primary contact: wayne johnston head, research enterprise and scholarly communication - - ext. wajohnst@uoguelph.ca program overview mission/description: we seek to disseminate and preserve the scholarly output of the university. we believe open access, both green (self-archiving) and gold (open access journals), is critical to this objective.
more broadly, we also seek to promote the digitization and dissemination of canadian scholarly journal content. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ) media formats: text; images; audio; video; data disciplinary specialties: agriculture; veterinary sciences; arts; history; international development top publications: critical studies in improvisation (journal); international review of scottish studies (journal); partnership: the canadian journal of library and information practice and research (journal); synergies canada (journal); studies by undergraduate researchers at guelph (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: scholarly societies; national organizations; provincial consortia publishing platform(s): dspace; fedora; ojs/ocs/omp; dataverse digital preservation strategy: duracloud/dspace; scholars portal; synergies additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; dataset management; author copyright advisory; other author advisory; audio/video streaming university of hawaii at manoa university of hawaii at manoa libraries primary unit: desktop network services primary contact: beth tillinghast web support librarian, institutional repositories manager - - betht@hawaii.edu program overview mission/description: though the university of hawaii at manoa
currently does not have a formal library publishing program, our library is involved in providing publishing services through the various collections hosted in our institutional repository, scholarspace. we provide the hosting services for numerous department journal publications, conference proceedings, technical reports, department newsletters, as well as open access to some dissertations and theses. the publishing activities are consistent with our mission of acquiring, organizing, preserving, and providing access to information resources vital to the learning, teaching, and research mission of the university of hawaii at manoa. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ) funding sources (%): library operating budget ( ); non-library campus budget ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ); datasets media formats: text; images; audio; video; data disciplinary specialties: language documentation; social work; entomology; pacific islands culture; southeast asian culture (library publishing coalition contributing institution) top publications: language documentation and conservation (journal); ethnobotany research and applications (journal); the contemporary pacific (journal); journal of indigenous social development (journal); explorations (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): dspace digital preservation strategy: archive-it; portico; in-house additional services: doi assignment/allocation of identifiers; dataset management; author
copyright advisory; digitization; hosting of supplemental content university of idaho university of idaho library primary unit: digital initiatives primary contact: devin becker digital initiatives librarian - - dbecker@uidaho.edu website: www.lib.uidaho.edu/digital; journals.lib.uidaho.edu program overview mission/description: the digital initiatives department works to preserve and make accessible publications and other research products from researchers and affiliates of the university of idaho via its open access publishing capabilities. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: student-driven journals ( ); databases ( ) media formats: text; images; audio; video; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: rangeland ecology and management; creative writing top publications: fugue (journal); journal of rangeland applications (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs publishing platform(s): contentdm; ojs/ocs/omp digital preservation strategy: in-house additional services: graphic design (print or web); typesetting; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; doi assignment/allocation of identifiers; open url support; digitization; hosting of supplemental content additional information: we also publish a number of digital collections of historical images and documents. plans for expansion/future directions: bringing etds online; using etds to start developing a more robust (and visible) institutional repository.
university of illinois at chicago university library primary unit: scholarly communications escholarship@uic.edu primary contact: sandy de groote scholarly communications librarian - - sgroote@uic.edu website: library.uic.edu/home/services/escholarship program overview mission/description: the objective/mission of the uic university library publishing program is to advance scholarly knowledge in a cost-effective manner. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); undergraduate students ( . ) funding sources (%): library operating budget ( ); charge backs ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); newsletters ( ); etds ( ) media formats: text; images; data disciplinary specialties: social work; internet studies; public health informatics top publications: first monday (journal); online journal of public health informatics (journal); behavior and social issues (journal); uncommon culture (journal); journal of biomedical discovery and collaboration (journal) percentage of journals that are peer reviewed: campus partners: individual faculty (library publishing coalition founding institution) publishing platform(s): contentdm; dspace; ojs/ocs/omp; inera extyles digital preservation strategy: hathitrust; lockss additional services: graphic design (print or web); typesetting; marketing; training; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; author copyright advisory; digitization; word to xml conversion plans for expansion/future directions: exploring monograph publishing.
highlighted publication: first monday is one of the first openly accessible, peer-reviewed journals on the internet, solely devoted to the internet. firstmonday.org/index university of iowa university of iowa libraries primary unit: digital research and publishing lib-ir@uiowa.edu primary contact: wendy robertson digital scholarship librarian - - wendy-robertson@uiowa.edu website: www.lib.uiowa.edu/drp/publishing social media: @iowareso program overview mission/description: digital research and publishing explores ways that academic libraries can best leverage digital collections, resources, and expertise to support faculty and student scholars by: collaborating on interdisciplinary scholarship built upon digital collections; offering publishing services to support sustainable scholarly communication; engaging the community through participatory digital initiatives; promoting widespread use and reuse of locally built repositories and archives; and advancing new technologies that support digital research and publishing. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ); graduate students ( .
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ) media formats: text (library publishing coalition contributing institution) top publications: walt whitman quarterly review (journal); medieval feminist forum (journal); proceedings in obstetrics & gynecology (journal); iowa journal of cultural studies (journal); poroi (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: society for medieval feminist scholarship publishing platform(s): bepress (digital commons); contentdm; wordpress digital preservation strategy: archive-it; lockss; in-house; digital preservation services under discussion additional services: cataloging; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; peer review management; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: working on adding additional services, such as dois, html versions of articles, and possibly some formatting of content; assessing campus needs for datasets. university of kansas ku libraries primary unit: center for faculty initiatives and engagement kuscholarworks@ku.edu primary contact: marianne reed digital information specialist - - mreed@ku.edu website: journals.ku.edu program overview mission/description: digital publishing services provides support to the ku community for the design, management, and distribution of online publications, including journals, conference proceedings, monographs, and other scholarly content.
we help scholars explore new and emerging publishing models in our changing scholarly communication environment, and we help monitor and address campus concerns and questions about electronic publishing. these services are intended to enable online publishing for campus publications, and help make their content available in a manner that promotes increased visibility and access, and ensures long-term stewardship of the materials. year publishing activities began: organization: centralized library publishing unit/department; transitioning to distribution across library units staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); occasional lectures; oral histories and interviews media formats: text; audio; video (library publishing coalition contributing institution) disciplinary specialties: philosophy; natural science; humanities; oral history and interviews; linguistics top publications: biodiversity informatics (journal); american studies (journal); latin american theater review (journal); kansas working papers in linguistics (working papers); treatise online (preprints) percentage of journals that are peer reviewed: campus partners: individual faculty; graduate students publishing platform(s): dspace; ojs/ocs/omp; xtf digital preservation strategy: portico; digital preservation services under discussion additional services: outreach; training; analytics; cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming; isbns; consulting on publishing models and issues plans for expansion/future directions: some services are ongoing.
a strategic initiative to expand the program is pending. university of kentucky university of kentucky libraries primary unit: department of digital scholarship uknowledge@lsv.uky.edu primary contact: adrian k. ho director of digital scholarship - - adrian.ho@uky.edu website: uknowledge.uky.edu program overview mission/description: the university of kentucky (uk) libraries launched an institutional repository (uknowledge) in late to champion the integration and transformation of scholarly communication within the uk community. the initiative sought to improve access by students, faculty, and researchers to appropriate resources for maximizing the dissemination of their research and scholarship in an open and digital environment. a crucial component of uknowledge is providing publishing services to broadly disseminate scholarship created or sponsored by the uk community. we provide a flexible platform to publish a variety of scholarly content and to expand the discoverability of the published works. additionally, we are establishing a separate digital repository for the long-term preservation of the published content and research datasets. using state-of-the-art technologies, we are able to offer campus constituents sought-after services in different stages of the scholarly communication life cycle to help them thrive and succeed. we also inform them of scholarly communication issues such as open access, author rights, and the economics of journal publishing. providing library publishing services is one avenue through which we are making significant contributions to the fulfillment of uk’s mission. 
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library materials budget ( ); charge backs ( ) (library publishing coalition founding institution) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); image galleries/virtual exhibits ( ) media formats: text; images disciplinary specialties: higher education; hispanic studies; public health; undergraduate research (multidisciplinary) top publications: kentucky journal of higher education policy and practice (journal); nomenclatura: aproximaciones a los estudios hispánicos (journal); frontiers in public health services and systems research (journal); kaleidoscope: the university of kentucky journal of undergraduate scholarship (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house additional services: graphic design (print or web); training; analytics; cataloging; metadata; notification of a&i sources; issn registration; open url support; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content highlighted publication: frontiers in public health services and systems research provides quick, open access to actionable public health infrastructure research to improve public health practices.
uknowledge.uky.edu/frontiersinphssr plans for expansion/future directions: strengthening existing library publishing partnerships; bringing more campus constituents on board; building upon our current library publishing services (e.g., partnering with the uk graduate school to complete the integration of our library publishing services into the workflow as they implement an electronic thesis and dissertation mandate); pursuing additional opportunities to collaborate with various campus units in support of undergraduate research as we celebrate uk students’ academic achievements by making them visible and accessible worldwide; assisting uk-based print journals to create their online presence and extend their reach beyond academia; exploring data publishing in partnership with uk researchers; continuing to advocate open access and open licensing as well as inform the uk community of new scholarly communication practices such as alternative metrics, open peer review, and researcher identity management; making uknowledge the primary online publishing avenue for uk-based research and scholarship. university of maryland college park mckeldin library primary unit: digital stewardship primary contact: terry m. owen drum coordinator - - towen@umd.edu website: publish.lib.umd.edu; drum.lib.umd.edu program overview mission/description: capture, preserve, and provide access to the output of university of maryland faculty, researchers, centers, and labs. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( .
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ) campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: in-house; digital preservation services under discussion additional services: analytics; metadata; issn registration; author copyright advisory; hosting of supplemental content plans for expansion/future directions: expanding into epublishing in , including faculty and student-produced e-publications. (library publishing coalition contributing institution) university of massachusetts amherst w.e.b. du bois library primary unit: office of scholarly communication scholarworks@library.umass.edu primary contact: marilyn s. billings scholarly communication & special initiatives librarian - - mbillings@library.umass.edu website: scholarworks.umass.edu program overview mission/description: scholarworks@umass amherst, an open access digital repository service, was established in to provide a digital showcase of the unique research and scholarly outputs of members of the university of massachusetts amherst community. it provides a platform for the distribution of content such as electronic dissertations, master’s theses, and capstone projects as well as scholarly output of academic departments, research centers, and institutes. scholarworks provides a wide variety of scholarly publishing services including: online journal publishing and conference management system; collaboration with scholarly presses to provide permanent location and urls for supplementary content for scholarly monographs, texts, and other scholarly materials. 
scholarworks provides many services for research support that can be used in conjunction with grant applications, which now require applicants to detail how the results of the funded research will be showcased and disseminated. the scholarworks service can be included as part of the overall data management strategy for research results, reports, new journal services, conference proceedings, etc. these value-added services enhance the professional visibility for faculty and researchers and provide excellent search and retrieval facilities and broader dissemination as well as increased use of materials through services such as google scholar and other internet search engines. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . ) (library publishing coalition founding institution) funding sources (%): library materials budget ( ); library operating budget ( ); non-library campus budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); graduate student capstones and practicums media formats: text; images; audio; video; data disciplinary specialties: anthropology; engineering; community engagement; nursing; hospitality and tourism top publications: “how to do case study research” (technical report); “the impact of language barrier & cultural differences on restaurant experiences: a grounded theory approach” (conference proceedings); “theme park development costs: initial investment cost per first year attendee” (conference proceedings); “the form of the preludes to bach’s 
unaccompanied cello suites” (thesis); “ratio analysis for the hospitality industry: a cross sector comparison of financial trends in the lodging, restaurant, airline and amusement sectors” (journal article) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons); eprints; fedora digital preservation strategy: lockss; in-house; digital preservation services under discussion additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content additional information: we are looking into the possibilities of coordinating more closely with our university press on a variety of services. they have expertise but not the additional time to assist with the types of publishing services faculty are starting to ask for, such as copy-editing and proofing. we are also members of the networked digital library of theses and dissertations (ndltd). plans for expansion/future directions: exploring additional publication services in collaboration with other groups on campus (copy-editing, proofing, graphic design, referral services); engaging in more extensive collaboration with the office of research on data management, intellectual property/copyright; and expanding into capturing undergraduate student work/projects. highlighted publication: communication + provides an open forum for exploring and sharing ideas about communication across modes of inquiry and perspectives. its primary objective is to push the theoretical frontiers of communication as an autonomous and distinct field of research. 
scholarworks.umass.edu/cpo university of massachusetts medical school lamar soutter library primary unit: research and scholarly communication services primary contact: rebecca reznik-zellen head of research & scholarly communication services - - rebecca.reznik-zellen@umassmed.edu website: escholarship.umassmed.edu/about.html program overview mission/description: escholarship@umms is a digital repository offering worldwide access to the research and scholarly work of the university of massachusetts medical school community. the goal is to bring together the university’s scholarly output in order to enhance its visibility and accessibility. we help individual researchers and departments organize and publicize their research beyond the walls of the medical school, archiving publications, posters, presentations, and other materials they produce in their scholarly pursuits. our publishing services (including the journal of escience librarianship and two other open access peer-reviewed electronic journals, student dissertations and theses, and conference proceedings) highlight the works of university of massachusetts medical school authors and others. 
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); monographs ( ); textbooks ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); finding aids ( ); book reviews ( ) media formats: text; images; audio; video; multimedia/interactive content (library publishing coalition contributing institution) disciplinary specialties: library science; psychiatry/mental health research; neurology; clinical and translational science; life sciences top publications: journal of escience librarianship (journal); etds; psychiatry information in brief (journal); neurological bulletin (journal); a history of the university of massachusetts medical school (e-book) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house; digital preservation services under discussion additional services: copy-editing; marketing; outreach; training; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; author copyright advisory; hosting of supplemental content; audio/video streaming; altmetrics data plans for expansion/future directions: expanding our publishing services to additional departments within the medical school, incorporating more multimedia, and enhancing publications with altmetrics data. 
university of michigan university library primary unit: michigan publishing mpublishing@umich.edu website: www.publishing.umich.edu social media: @m_publishing program overview mission/description: michigan publishing is the hub of scholarly publishing at the university of michigan, and is a part of its dynamic and innovative university library. our mission as publishers, librarians, copyright experts, and technologists is to support the communications needs of scholars, and to publish, promote, and preserve the scholarly record. year publishing activities began: organization: centralized library publishing unit/department funding sources (%): library operating budget ( ); sales revenue ( ) publishing activities campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: open humanities press; american council of learned societies publishing platform(s): dspace; wordpress; locally developed software digital preservation strategy: hathitrust; in-house additional services: typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; contract/license preparation; author copyright advisory; other author advisory (library publishing coalition contributing institution) university of minnesota university of minnesota libraries primary unit: content and collections division jkirchne@umn.edu primary contact: joy kirchner aul for content & collections - - jkirchne@umn.edu program overview year publishing activities began: organization: services are distributed across library units/departments funding sources (%): library operating budget ( ); endowment income ( ); grants ( ) publishing activities types of publications: faculty conference papers and proceedings ( ); etds ( ); working papers; blogs; online dictionary media formats: text; images; audio; video; 
data campus partners: individual faculty publishing platform(s): contentdm; dspace; movabletype; drupal digital preservation strategy: clockss; duracloud/dspace; hathitrust; portico; omeka additional services: training; analytics; metadata; open url support; dataset management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: currently developing a program in a new division. (library publishing coalition contributing institution) university of nebraska-lincoln university of nebraska-lincoln libraries primary unit: zea books/office of scholarly communications proyster@unl.edu primary contact: paul royster publisher, zea books - - proyster@unl.edu website: digitalcommons.unl.edu/zea program overview mission/description: zea books is the digital and on-demand publishing operation of the university of nebraska-lincoln libraries. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; data; concept maps/modeling maps/visualizations; multimedia/interactive content campus partners: individual faculty other partners: nebraska academy of sciences; center for great plains studies; textile society of america; lester a. 
larsen tractor and power museum; center for systemic entomology; nebraska ornithological union publishing platform(s): bepress (digital commons) additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; open url support; peer review management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content university of north carolina at chapel hill university library primary unit: library administration primary contact: will owen associate university librarian for technical services and systems - - owen@email.unc.edu program overview mission/description: the library has historically published, in print, specialized monographs on topics related to the university or library. we publish etds electronically and provide digital editions and original scholarly interpretations in support of research and instruction with a special emphasis on the american south. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: etds ( ); digital humanities research projects media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations disciplinary specialties: the american south top publications: documenting the american south (digital collection) campus partners: unc press; campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): contentdm; fedora; locally developed software digital preservation strategy: archive-it; hathitrust; in-house (carolina digital repository); internet archive additional services: training; cataloging; metadata; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: collaborating with researchers on archiving, preserving, and publishing research data; collaborating with unc press for print-on-demand publications. university of north carolina at charlotte atkins library primary unit: digital scholarship lab atkins-dsl@uncc.edu primary contact: somaly kim wu digital scholarship librarian - - skimwu@uncc.edu website: journals.uncc.edu; dsl.uncc.edu/dsl/services/publication program overview mission/description: we support the publication of scholarly journals online and assist journal editors with the management, editorial work, and production of their scholarly journal. the dsl offers journal hosting support services to unc charlotte faculty. our services are built on the open journal system (ojs) journal management software that facilitates the publication of online peer-reviewed journals. dsl services include platform software hosting, updates, and copyright consulting. 
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ) media formats: text disciplinary specialties: education; psychology; urban education top publications: nhsa dialog (journal); urban education research and policy annuals (journal); undergraduate journal of psychology (journal) percentage of journals that are peer reviewed: (library publishing coalition contributing institution) campus partners: individual faculty publishing platform(s): ojs/ocs/omp digital preservation strategy: digital preservation services under discussion additional services: graphic design (print or web); training; issn registration; dataset management; author copyright advisory plans for expansion/future directions: building an institutional repository that is planned to be online within the year. university of north carolina at greensboro university libraries primary unit: collections and scholarly communications primary contact: beth bernhardt assistant dean for collection management and scholarly communications - - brbernha@uncg.edu program overview mission/description: still in development year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): other ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); databases ( ); etds ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: public health; education; nursing; sociology top publications: international journal of nurse practitioner educators (journal); the international journal of critical pedagogy (journal); journal of backcountry studies (journal); journal of learning spaces (journal); partnerships: a journal of service-learning and civic engagement (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students (library publishing coalition founding institution) publishing platform(s): contentdm; ojs/ocs/omp; locally developed software digital preservation strategy: hathitrust; in-house; digital preservation services under discussion additional services: training; analytics; cataloging; metadata; author copyright advisory; other author advisory; digitization; hosting of supplemental content plans for expansion/future directions: hosting ojs for other regional libraries; supporting faculty in new scholarly media, such as database and ui design, web pages, and usability. highlighted publication: a peer-reviewed, open-access journal published biannually, the journal of learning spaces provides a scholarly, multidisciplinary forum for research articles, case studies, book reviews, and position pieces related to all aspects of learning space design, operation, pedagogy, and assessment in higher education. 
partnershipsjournal.org/index.php/jls university of north texas university of north texas libraries primary unit: scholarly publishing services primary contact: martin halbert dean of libraries - - martin.halbert@unt.edu program overview mission/description: the unt libraries scholarly publishing services are a collaborative program between faculty and the library to develop new and innovative forms of scholarly publications, especially using digital technologies. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ) media formats: text; images; audio; video; data; multimedia/interactive content disciplinary specialties: electronic arts top publications: möbius journal (journal); the eagle feather (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: texas state historical association publishing platform(s): locally developed software (library publishing coalition founding institution) digital preservation strategy: digital preservation services under discussion additional services: graphic design (print or web); metadata plans for expansion/future directions: cultivate new ideas for collaborative scholarly publications. highlighted publication: möbius is a journal of the iarta (initiative for advanced research in technology and the arts) research group at the university of north texas. 
moebiusjournal.org university of oregon university of oregon libraries primary unit: digital scholarship center primary contact: john russell scholarly communications librarian - - johnruss@uoregon.edu website: library.uoregon.edu/digitalscholarship program overview mission/description: the digital scholarship center (dsc) collaborates with faculty and students to transform research and scholarly communication using new media and digital technologies. based on a foundation of access, sharing, and preservation, the dsc provides digital asset management, digital preservation, training, consultations, and tools for digital scholarship. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ); graduate students ( . ); undergraduate students ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: humanities; gender studies top publications: ada: a journal of gender, new media, and technology (journal); konturen (journal); oregon undergraduate research journal (journal); humanist studies & the digital age (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: fembot collective publishing platform(s): contentdm; dspace; ojs/ocs/omp; wordpress digital preservation strategy: in-house additional services: graphic design (print or web); copy-editing; training; analytics; cataloging; metadata; issn 
registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming plans for expansion/future directions: increasing quality control over publications. university of pittsburgh university library system primary unit: office of scholarly communication and publishing oscp@mail.pitt.edu primary contact: timothy s. deliyannides director, office of scholarly communication and publishing - - tsd@pitt.edu website: www.library.pitt.edu/dscribe social media: @oscp_pitt program overview mission/description: the university library system, university of pittsburgh offers a full range of publishing services for a variety of content types, specializing in scholarly journals and subject-based open access repositories. because we are committed to helping research communities share knowledge and ideas through open and responsible collaboration, we subsidize the costs of electronic publishing and provide incentives to promote open access to scholarly research. our program promotes open access journal publishing at a very low cost; eliminates the high cost of print journal publication and distribution; allows easy collaboration among authors, editors, and reviewers regardless of location; enhances the visibility, searchability, and navigation of publications; and incorporates innovative and sustainable technologies to speed and facilitate scholarly publishing. we are seeking partners around the world who share our commitment to open access to scholarly research information. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ); non-library campus budget ( ); charge backs ( ) (library publishing coalition founding institution) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); government documents ( ); unpublished article manuscripts ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: latin american studies; european studies; history and philosophy of science; law; health sciences top publications: revista iberoamericana (journal); university of pittsburgh law review (journal); international journal of telerehabilitation (journal); archive of european integration (digital collection); philsci-archive (preprints) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: american forensic association; brunel university; consortium of indonesian universities–pittsburgh (kptip); fonds ricoeur; grupo biblios: international network of the development of library and information science; institute for linguistic evidence; institute of integrative omics and applied biotechnology; institute of public health, bangalore, india; instituto internacional de literatura iberoamericana; kadir has university; laps/ensp highlighted publication: the international journal of telerehabilitation (ijt) is a biannual journal dedicated to advancing telerehabilitation by disseminating information about current research and practices. 
lawreview.law.pitt.edu oswaldo cruz foundation laps; motivational interviewing network of trainers (mint); pennsylvania library association; société américaine de philosophie de langue française; society for ricoeur studies; tale: the association for linguistic evidence; university of chapeco, department of anthropology; university of kingston centre for modern european philosophy publishing platform(s): eprints; fedora; islandora; wordpress; locally developed software digital preservation strategy: discoverygarden; hathitrust; lockss; in-house additional services: graphic design (print or web); marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; dataset management; business model development; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming university of san diego copley library primary unit: special collections and archives primary contact: kelly riddle digital initiatives librarian - - kriddle@sandiego.edu program overview mission/description: digital publishing at the university of san diego’s copley library offers the university community the opportunity to share research, scholarly works, and other unique resources of historical or intellectual value. the library’s digital publishing program will serve to advance faculty and student success and will foster intellectual collaboration both locally and globally. the library is dedicated to developing publishing services that will support and disseminate knowledge created or sponsored by the university so that it is readily discoverable, openly accessible, preserved, and sustainable. a goal of digital publishing will be to introduce faculty to a variety of new publishing models. 
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( ) funding sources (%): library operating budget ( ) publishing activities media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons); contentpro digital preservation strategy: digital preservation services under discussion (library publishing coalition founding institution) additional services: outreach; training; cataloging; metadata; open url support; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming university of south florida tampa library primary unit: academic resources scholarcommons@usf.edu primary contact: rebel cummings-sauls library operations coordinator - - rebelcs@usf.edu website: scholarcommons.usf.edu program overview mission/description: the usf tampa library strives to develop and encourage research collaboration and initiatives throughout all areas of campus. members of the usf community are encouraged to deposit their research with scholar commons. we commit to assisting faculty, staff, and students in all stages of the deposit process, to managing their work to optimize access/readership, and to ensuring long-term preservation. long-term preservation and increasing accessibility will increase citation rates and highlight the research accomplishments of this campus. scholar commons will have a direct impact on the university’s four strategic goals: student success, research innovation, sound financial management, and creating new partnerships. 
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( . ) funding sources (%): library operating budget ( ); endowment income ( ) publishing activities types of publications: journals produced under contract/mou for external groups ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ); oral histories; events and lectures; course material; grey/white works media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: geology and karst; holocaust and genocide; environmental sustainability; literature; math/quantitative literature top publications: etds; social science research: principle, methods, and practices (journal); international journal of speleology (journal); journal of strategic security (journal); studia ubb geologia (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: national cave and karst research institute (nckri); aphra behn society; union internationale de spéléologie; center for conflict management (ccm) of the national university of rwanda (nur); henley-putnam university; national numeracy network (nnn); iavcei commission on statistics in volcanology (cosiv); babeş-bolyai university; national center for suburban studies at hofstra university publishing platform(s): bepress (digital commons) digital preservation strategy: lockss; portico; in-house; digital preservation services under discussion. pln is being discussed. bepress also offers preservation and backups. 
additional services: graphic design (print or web); typesetting; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming; add dois to references; suggest pod services plans for expansion/future directions: adding a coordinator role; expanding all content areas; launching three new journals that are currently in process. university of tennessee university of tennessee libraries primary unit: digital production and publishing/newfound press primary contact: holly mercer associate dean for scholarly communication & research services - - hollymercer@utk.edu website: www.newfoundpress.utk.edu; trace.tennessee.edu program overview mission/description: the university of tennessee libraries has developed a framework to make scholarly and specialized works available worldwide. newfound press, the university libraries digital imprint, advances the community of learning by experimenting with effective and open systems of scholarly communication. drawing on the resources that the university has invested in digital library development, newfound press collaborates with authors and researchers to bring new forms of publication to an expanding scholarly universe. ut libraries provides open access publishing services, copyright education, and services to help scholars meet new data management and sharing requirements. in addition, we create digital collections of regional and global importance to support research and teaching. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); databases ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; multimedia/interactive content library publishing coalition: founding institution disciplinary specialties: east tennessee; great smoky mountains; anthropology; sociology; law top publications: the fishes of tennessee (monograph); building bridges in anthropology (monograph); to advance their opportunities: federal policies toward african american workers from world war i to the civil rights act of (monograph); goodness gracious, miss agnes: patchwork of country living (monograph); “why we don’t vote: low voter turnout in u.s. presidential elections” (thesis) percentage of journals that are peer reviewed: campus partners: ut press; campus departments or programs; individual faculty; graduate students; undergraduate students other partners: southern anthropological society; music theory society of the mid-atlantic publishing platform(s): bepress (digital commons); locally developed software digital preservation strategy: duracloud/dspace; metaarchive additional services: graphic design (print or web); typesetting; copy-editing; marketing; analytics; cataloging; metadata; doi assignment/allocation of identifiers; dataset management; peer review management; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming; assignment of isbns plans for expansion/future directions: exploring how to cultivate data publishing and how to support digital humanities on campus.
highlighted publication: the wondrous bird’s nest i & ii (das wunderbarliche vogelnest) is the only complete english translation of the fourth of the five simplician novels by seventeenth-century german-language novelist grimmelshausen. newfoundpress.utk.edu/pubs/hiller university of texas at san antonio university of texas at san antonio libraries primary unit: learning technology primary contact: posie aagaard assistant dean for collections and curriculum support - - posie.aagaard@utsa.edu program overview mission/description: the utsa libraries collaborate with faculty to disseminate original scholarly content using a variety of platforms, ensuring open access while simultaneously acknowledging reader preferences. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty conference papers and proceedings ( ) media formats: text; images; video; concept maps/modeling maps/visualizations disciplinary specialties: astronomy top publications: torus workshop (conference proceedings) campus partners: individual faculty; graduate students other partners: science organizing committee publishing platform(s): contentdm; worldcat.org digital preservation strategy: in-house. master copy is retained in a preferred file format; copies of the files are kept on local server (which has security, disaster recovery, and backup features) and also with oclc; metadata has been created to support ongoing longevity.
additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; contract/license preparation; author copyright advisory; hosting of supplemental content; audio/video streaming additional information: for our pilot publishing project, we collaborated with faculty who expressed a strong preference for using ibooks/itunes as a publishing platform because the primary audience for the material (astronomy scholars) prefers to consume content on ipads. in addition to producing an ibook, we produced a multimedia-pdf, converting the content to a more open format for wider access and preservation purposes. plans for expansion/future directions: actively seeking new opportunities to collaborate with faculty on publishing projects. university of toronto university of toronto libraries primary unit: information technology services primary contact: sian meikle interim director, its - - sian.meikle@utoronto.ca website: jps.library.utoronto.ca; tspace.library.utoronto.ca program overview mission/description: the university of toronto libraries maintains both open journal systems (ojs) and t-space, the university’s research repository, with the aim of preserving and making available the university’s scholarly contributions. we provide leadership and actively support scholarly communication needs by developing alternative forms of publication and viability models for the future that ensure the production and capture of research output. year publishing activities began: organization: services are distributed across several campuses staff in support of publishing activities (fte): library staff ( . 
); graduate students ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: medicine/health sciences; humanities; social sciences; physical/natural sciences percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: university of toronto press; ontario council of university libraries (ocul); canadian association of research libraries (carl) publishing platform(s): contentdm; dspace; fedora; islandora; ojs/ocs/omp; wordpress; bibapp digital preservation strategy: archive-it; duracloud/dspace; lockss; scholars portal; synergies; internet archive additional services: graphic design (print or web); outreach; training; cataloging; metadata; business model development; contract/license preparation; author copyright advisory; digitization; audio/video streaming plans for expansion/future directions: aligning focus.library.utoronto.ca (a more outwardly facing system for faculty profiling) with t-space, the repository; working on copyright issues with our recently hired scholarly communication/copyright librarian. university of utah j. 
willard marriott library primary unit: information technology primary contact: john herbert head, digital ventures - - john.herbert@utah.edu program overview year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); etds ( ) media formats: text; images; audio; video; concept maps/modeling maps/visualizations disciplinary specialties: law; environmental studies; foreign languages; political science top publications: utah law review (journal); hinckley journal of politics (journal); utah foreign language review (journal); utah environmental law review (journal) percentage of journals that are peer reviewed: campus partners: individual faculty; graduate students; undergraduate students publishing platform(s): contentdm; ojs/ocs/omp; wordpress digital preservation strategy: rosetta library publishing coalition: founding institution additional services: graphic design (print or web); outreach; metadata; doi assignment/allocation of identifiers; author copyright advisory; digitization; hosting of supplemental content; audio/video streaming highlighted publication: the utah historical review is the journal of student history published by the alpha rho chapter of phi alpha theta (national history honor society) at the university of utah.
utahhistoricalreview.com university of victoria university of victoria libraries primary unit: scholarly publishing office press@uvic.ca primary contact: inba kehoe scholarly communications librarian - - press@uvic.ca website: journals@uvic.ca; dspace.library.uvic.ca: program overview mission/description: uvic press represents the scholarly publishing expertise for the university of victoria and its partner institutions and associations. we are dedicated to the online dissemination of knowledge and research through open access of journals, monographs, and other forms of publication. uvic press offers an imprint to scholarship of a high quality, determined through peer review. we will work with emerging writers and researchers to promote success in scholarly publishing. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); etds ( ) media formats: text; images disciplinary specialties: humanities; social sciences; disability services; writing; creative fiction top publications: philosophy in review (journal); working papers of the linguistics circle (journal); international journal of child, youth and family studies (journal); canadian zooarchaeology (journal); appeal: review of current law and law reform (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: public knowledge project; canadian association of learned journals; universities art association of canada; association for borderlands studies publishing platform(s): dspace; ojs/ocs/omp digital preservation strategy: 
coppul; lockss; synergies additional services: copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; issn registration; contract/license preparation; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: developing a fully functional university publishing program that will include publishing of journals, conference proceedings, and books. the program will include various imprints under the university press umbrella. university of washington university of washington libraries primary unit: digital initiatives primary contact: ann lally head, digital initiatives - - alally@uw.edu website: researchworks.lib.washington.edu program overview mission/description: the university of washington libraries researchworks service provides faculty, researchers, and students with tools to archive and/or publish the products of research including datasets, monographs, images, journal articles, and technical reports. year publishing activities began: organization: services are distributed across several campuses staff in support of publishing activities (fte): library staff ( . ); graduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ); research notebooks media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations disciplinary specialties: information studies; anthropology; fisheries; native american studies percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students library publishing coalition: contributing institution other partners: indo-pacific prehistory association; society for slovene studies publishing platform(s): contentdm; dspace; ojs/ocs/omp digital preservation strategy: university escience dark archive additional services: graphic design (print or web); training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; peer review management; contract/license preparation; author copyright advisory; digitization; hosting of supplemental content university of waterloo university of waterloo library primary unit: digital initiatives primary contact: pascal calarco aul, research & digital discovery services - - ext. pvcalarco@uwaterloo.ca program overview mission/description: enabling original scholarly research at the university of waterloo from faculty, students, and staff. year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: student-driven journals ( ); journals produced under contract/mou for external groups ( ); newsletters ( ); databases ( ); etds ( ) media formats: text; images; audio; video; multimedia/interactive content disciplinary specialties: disability studies; mechanical engineering; sociology and criminology; food science top publications: engine: pre-print server for ieee society for vehicular technology (preprints); canadian journal of disability studies (journal); canadian graduate journal of sociology and criminology (journal); canadian journal of food safety (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students other partners: theses canada; canadian disability studies association; canadian association of food safety publishing platform(s): dspace; ojs/ocs/omp; locally developed software digital preservation strategy: archive-it; scholars portal; in-house; digital preservation services under discussion; theses canada additional services: analytics; cataloging; metadata; issn registration; business model development; author copyright advisory; digitization; hosting of supplemental content additional information: we have also participated in the networked digital library of theses and dissertations since . plans for expansion/future directions: extending to working papers, pre-prints, senior undergraduate work, and other original efforts. university of windsor leddy library primary unit: information services primary contact: dave johnston information services librarian, scholarly communications coordinator - - ext. djohnst@uwindsor.ca website: scholar.uwindsor.ca; ojs.uwindsor.ca/ojs/leddy/index.php; ocs.uwindsor.ca/ocs/index.php/pc/virtues
social media: facebook.com/leddy.library program overview mission/description: the leddy library supports the dissemination of new scholarship by graduate, faculty, and staff researchers at the university of windsor in a variety of forms. through the scholarship at uwindsor repository, we are able to support the dissemination of theses and dissertations and thus provide increased visibility to the work of our graduate students. we also use the repository to support conferences run on our campus by helping the organizers manage the submission workflow and publication process. as a longstanding supporter of open journal systems, the library helps to publish and maintain several journals run from our campus, and we are currently in the process of using the new open monograph press software to help support electronic monograph publishing. providing support for open access is a central concern in all of our publishing endeavors. we seek to educate our users about the value of open access and to encourage various forms of open access publication.
organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); faculty conference papers and proceedings ( ); etds ( ) media formats: text; images disciplinary specialties: philosophy (informal logic); social justice; scholarship of teaching and learning; philosophy (phenomenology); multivariate statistical techniques top publications: informal logic (journal); collected essays in teaching and learning (journal); applied multivariate research (journal); studies in social justice (journal); phaenex (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons); ojs/ocs/omp digital preservation strategy: lockss additional services: marketing; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; author copyright advisory; digitization; hosting of supplemental content plans for expansion/future directions: extending use of existing systems to support the publication of more journals and conferences; launching an open monograph series with the philosophy department.
university of wisconsin–madison university of wisconsin–madison libraries primary unit: general library system primary contact: elisabeth owens special assistant to the vice provost for libraries - - eowens@library.wisc.edu website: parallelpress.library.wisc.edu; uwdc.library.wisc.edu program overview mission/description: the general library system publishes print and digital works featuring new works of scholars, researchers, and poets, and important scholarly and historical materials that are available for study in both print and digital formats. these publications are the result of collaborations with the scholarly community and represent an ongoing commitment by the libraries to scholarly communication as a contribution to the wisconsin idea and in support of the outreach mission of the university. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ); graduate students ( . ); undergraduate students ( . 
) funding sources (%): library operating budget ( ); non-library campus budget ( ); endowment income ( ); charitable contributions/friends of the library organizations ( ); sales revenue ( ); other ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); reformatted works media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations disciplinary specialties: university of wisconsin; state of wisconsin; african studies; ecology and natural resources; decorative arts and material culture top publications: wi land survey records (digital collection); foreign relations of the united states (digital collection); icelandic online (digital collection); africa focus (digital collection); decorative arts library (digital collection) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty publishing platform(s): dspace; fedora; ojs/ocs/omp; wordpress; locally developed software digital preservation strategy: clockss; hathitrust; lockss; in-house; digital preservation services under discussion additional services: graphic design (print or web); copy-editing; marketing; outreach; training; analytics; cataloging; metadata; issn registration; doi assignment/allocation of identifiers; open url support; dataset management; peer review management; budget preparation; contract/license preparation; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: the majority of the libraries’ publishing activities involve the reformatting and dissemination of new versions of existing resources.
we do publish new material, and our responses are primarily reflective of these activities (as opposed to our digital collections and repository services). plans for expansion/future directions: increasing emphasis on open access publications and unique archival and special collections materials. utah state university merrill-cazier library primary unit: digital initiatives primary contact: becky thoms copyright librarian - - becky.thoms@usu.edu website: digitalcommons.usu.edu program overview mission/description: usu libraries is committed to the open dissemination of knowledge, as well as its delivery in new forms. our publishing efforts emphasize open access and a commitment to look beyond traditional monographs and scholarly articles to disseminate dynamic scholarly works that can incorporate multimedia and social communications-style input. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ); undergraduate students ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); faculty conference papers and proceedings ( ); newsletters ( ); etds ( ); faculty and student posters media formats: text; images; audio; video; data top publications: journal of indigenous research (journal); journal of mormon history (journal); journal of western archives (journal); foundations of wave phenomena (journal); an introduction to editing manuscripts for medievalists (monograph) percentage of journals that are peer reviewed: library publishing coalition: founding institution campus partners: campus departments or programs; individual faculty; graduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house. our digital publishing content is archived on bepress servers in several geographic locations; we also archive copies on an in-house server. some titles are preserved in hathitrust, and we are investigating dpn. additional services: graphic design (print or web); cataloging; metadata; author copyright advisory; digitization plans for expansion/future directions: building on existing collaborative relationship with the usu press to connect authors with freelance providers of traditional publisher services such as peer review management, copy-editing, and typesetting. highlighted publication: folklore and the internet is a pioneering examination of the folkloric qualities of the world wide web, e-mail, and related digital media. it shows that folk culture, sustained by a new and evolving vernacular, has been a key to language, practice, and interaction online.
digitalcommons.usu.edu/usupress_pubs/ valparaiso university christopher center for library and information resources primary unit: christopher center library services scholar@valpo.edu primary contact: jonathan bull scholarly communication services librarian - - jon.bull@valpo.edu website: scholar.valpo.edu program overview mission/description: valposcholar, a service of the christopher center library and the valparaiso university law library, is a digital repository and publication platform designed to collect, preserve, and make accessible the academic output of valpo faculty, students, staff, and affiliates. year publishing activities began: organization: services are distributed across two libraries, the christopher center and the law library staff in support of publishing activities (fte): library staff ( ); graduate students ( ); undergraduate students ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); etds ( ); undergraduate capstone/honors theses ( ); other conference proceedings media formats: text; images; audio; video; data disciplinary specialties: business and leadership ethics; creative writing (fiction); law top publications: valparaiso law review (journal); valparaiso fiction review (journal); the journal of values-based leadership (journal); third world legal studies (journal) percentage of journals that are peer reviewed: library publishing coalition: contributing institution campus partners: campus departments or programs; individual faculty; undergraduate students publishing platform(s): bepress (digital commons); contentdm digital preservation strategy: no digital preservation services provided additional services: typesetting; marketing; 
outreach; training; analytics; metadata; issn registration; open url support; dataset management; peer review management; author copyright advisory; other author advisory; digitization; hosting of supplemental content; audio/video streaming additional information: this is a growing service that appears to be needed and well-received on campus. we expect only growth in the future, along with external partnerships and more faculty-student collaboration. vanderbilt university jean and alexander heard library primary unit: scholarly communications primary contact: clifford b. anderson director, scholarly communications - - clifford.anderson@vanderbilt.edu website: library.vanderbilt.edu/scholarly program overview mission/description: the jean and alexander heard library fosters emerging modes of open access publishing by providing scholarly, technical, and financial support for the digital dissemination of faculty, student, and staff publications. the library maintains several publishing initiatives through its scholarly communication program. currently, it publishes four peer-reviewed, open access journals—ameriquests, homiletic, vanderbilt e-journal of luso-hispanic studies, and the vanderbilt undergraduate research journal—using open journal systems software. it also hosts a database of electronic theses and dissertations in cooperation with the graduate school. additionally, the library distributes undergraduate capstone projects through its institutional repository. 
year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images disciplinary specialties: american studies; homiletics; luso-hispanic studies top publications: ameriquests (journal); homiletic (journal); vanderbilt e-journal of luso-hispanic studies (journal); vanderbilt undergraduate research journal (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: academy of homiletics publishing platform(s): dspace; ojs/ocs/omp; etd-db digital preservation strategy: in-house; lockss-etd additional services: outreach; training; cataloging; author copyright advisory plans for expansion/future directions: strengthening support for the publication of scientific datasets as well as projects in the digital humanities. villanova university falvey memorial library primary unit: falvey memorial library primary contact: darren g. poley interim library director - - darren.poley@villanova.edu program overview mission/description: in support of villanova university’s academic mission, the library is committed to the creation and dissemination of scholarship; utilizing digital modes and exploring new media for scholarly communication; and whenever possible, fostering open and public access to the intellectual contributions it publishes. organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); undergraduate capstone/honors theses ( ) media formats: text; images disciplinary specialties: american catholic studies; catholic higher education; theater; humanities; liberal arts and sciences top publications: journal of catholic higher education (journal); american catholic studies (journal); expositions (journal); praxis (journal); concept (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: american catholic historical society; association of catholic colleges and universities library publishing coalition: contributing institution publishing platform(s): ojs/ocs/omp digital preservation strategy: in-house additional services: graphic design (print or web); digitization virginia commonwealth university vcu libraries primary unit: information management and processing primary contact: john duke senior associate university librarian - - jkduke@vcu.edu website: digarchive.library.vcu.edu program overview mission statement: vcu’s digital press provides the tools, infrastructure, and support for unique digital scholarly expressions from the vcu community of faculty and students from all disciplines. year publishing activities began: organization: services are distributed across library units/departments total fte in support of publishing activities: library staff ( . 
) funding sources (%): library operating budget ( ) publishing activities types of publications: monographs ( ); student conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: history top publications: british virginia (monograph); “information technology outsourcing in u.s. hospital systems” (thesis); “a computational biology approach to the analysis of complex physiology” (thesis); “the effects of the handwriting without tears program” (thesis); “psychology and the theater” (thesis) internal partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): contentdm; dspace digital preservation strategy: in-house; digital preservation services under discussion additional services: marketing; outreach; training; cataloging; metadata; digitization additional information: vcu libraries recruited for a new professional position to advance research data management in the first quarter of academic year - ; it expects to launch a library publishing program and a full institutional repository later this year. plans for expansion/future directions: expanding the institutional repository to become a full partner in the share initiative; creating a publishing platform for existing journals published by vcu faculty and for new scholarly journals and output from the entire vcu community.
virginia tech university libraries primary unit: center for digital research and scholarship primary contact: gail mcmillan director, center for digital research and scholarship services - - gailmac@vt.edu website: scholar.lib.vt.edu; ejournals.lib.vt.edu; vtechworks.lib.vt.edu program overview mission/description: the libraries support the virginia tech community’s needs (e.g., conference, journal, and book publishing; rights management and open access consulting) through digital publishing services. virginia tech has been hosting, providing access to, and preserving ejournals since , but we are new to supporting the full workflow from article submission to peer review, editing, and production. we launched ojs in december . year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); faculty conference papers and proceedings ( ); etds ( ); yearbooks; annual reports media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content disciplinary specialties: technology education top publications: etds; journal of technology education (journal); alan review (journal); journal of industrial teacher education (journal); journal of technology studies (journal) (founding institution, library publishing coalition) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: while our ejournal editors largely work on behalf of scholarly societies, we do not work directly with the societies.
publishing platform(s): dspace; ojs/ocs/omp; locally developed software digital preservation strategy: lockss; metaarchive additional services: analytics; cataloging; metadata; doi assignment/allocation of identifiers; dataset management; contract/license preparation; author copyright advisory; hosting of supplemental content; audio/video streaming plans for expansion/future directions: consulting with editors about using ojs through cdrs services; inviting hosted ejournal editors to consider using ojs; launching ocs; and collaborating with our university community to consider other publishing services. highlighted publication: the journal of research in music performance is a peer-reviewed journal designed to provide presentation of a broad range of research that represents the breadth of an emerging field of study. ejournals.lib.vt.edu/jrmp

wake forest university z. smith reynolds library primary unit: digital publishing kanewp@wfu.edu primary contact: william kane digital publishing - - kanewp@wfu.edu website: digitalpublishing.wfu.edu program overview mission/description: digital publishing at wake forest university helps faculty, staff, and students create, collect, and convert previously or otherwise unpublished works into digitally distributed books, journals, articles, and the like.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); non-library campus budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); textbooks ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); newsletters ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video; data; concept maps/modeling maps/visualizations; multimedia/interactive content percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace; scalar; wordpress; tizra (contributing institution, library publishing coalition) digital preservation strategy: amazon glacier; amazon s ; hathitrust; in-house; digital preservation services under discussion additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; compiling indexes and/or tocs; open url support; business model development; budget preparation; contract/license preparation; author copyright advisory; digitization; audio/video streaming plans for expansion/future directions: doubling the number of pages published year to year.

washington university in st. louis university libraries primary unit: digital library services digital@wumail.wustl.edu primary contact: emily stenberg digital publishing and preservation librarian - - emily.stenberg@wustl.edu website: openscholarship.wustl.edu program overview mission/description: the mission of the washington university in st.
louis libraries publishing program is twofold: to provide alternatives to traditional publishing avenues, and to promote and disseminate original scholarly work of the washington university community. washington university libraries began publishing etds in , and in , we launched the open scholarship repository to continue etd publication, to provide a platform for the open access re-publication of faculty articles, and to provide for original publication of online journals and monographs. since the launch of open scholarship, we have expanded into undergraduate honors theses and presentations, and have begun publishing monographs. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) funding sources (%): endowment income ( ) publishing activities types of publications: monographs ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images top publications: “edith wharton: vision and perception in her short stories” (thesis); “added-tone sonorities in the choral music of eric whitacre” (thesis); “fashioning women under totalitarian regimes: ‘new women’ of nazi germany and soviet russia” (thesis); “computational fluid dynamics (cfd) modeling of mixed convection flows in building enclosures” (thesis); “sentimental ideology, women’s pedagogy, and american indian women’s writing: - ” (thesis) (founding institution, library publishing coalition) campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: in-house additional services: graphic design (print or web); copy-editing; metadata; doi assignment/allocation of identifiers; other author advisory plans for expansion/future directions: bringing a small number of
journals (currently in development) online in the coming year.

wayne state university wayne state university library system primary unit: digital publishing unit primary contact: joshua neds-fox coordinator for digital publishing - - jnf@wayne.edu program overview mission/description: wayne state’s digital publishing unit works to make unique, important, or institutionally relevant scholarly content available to the world at large, in the context of the wsu library system’s digital platforms. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( . ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text percentage of journals that are peer reviewed: campus partners: wayne state university press; campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): bepress (digital commons); fedora digital preservation strategy: in-house; digital preservation services under discussion (founding institution, library publishing coalition) additional services: graphic design (print or web); typesetting; copy-editing; marketing; outreach; training; analytics; cataloging; metadata; author copyright advisory; other author advisory; digitization; hosting of supplemental content highlighted publication: jmasm is an independent, peer-reviewed, open access journal providing a scholarly outlet for applied (non)parametric statisticians, data analysts, researchers, psychometricians, quantitative or qualitative evaluators, and methodologists.
digitalcommons.wayne.edu/jmasm

western university western libraries primary unit: library information resources management wlscholcomm@uwo.ca primary contact: karen marshall assistant university librarian - - ext. karen.marshall@uwo.ca website: ir.lib.uwo.ca program overview mission/description: scholarship@western is a multi-functional portal that collects, showcases, archives, and preserves a variety of materials created or sponsored by the university of western ontario community. it aims to facilitate knowledge sharing and broaden the international recognition of western’s academic excellence by providing open access to western’s intellectual output and professional achievements. it also serves as a platform to support western’s scholarly communication needs and provides an avenue for compliance with research funding agencies’ open access policies. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); journals produced under contract/mou for external groups ( ); monographs ( ); technical/research reports ( ); faculty conference papers and proceedings ( ); student conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video; data percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students other partners: scholarly societies; conferences

libraries outside the united states and canada

australian national university australian national university library primary contact: lorena kanellopoulos manager, anu e press + - - - lorena.kanellopoulos@anu.edu.au website: epress.anu.edu.au; digitalcollections.anu.edu.au; anulib.anu.edu.au program
overview mission/description: the library aims to support anu by ’s goals of excellence in research and education and the university’s role as a national policy resource. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: student-driven journals ( ); monographs ( ); faculty conference papers and proceedings ( ); etds ( ) media formats: text; images; audio; video percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty publishing platform(s): dspace; wordpress digital preservation strategy: no digital preservation services provided additional services: graphic design (print or web); cataloging; author copyright advisory plans for expansion/future directions: for key strategic directions, see anulib.anu.edu.au/_resources/reports-and-publications/publications/library_operational_plan_draft_ .pdf.
edith cowan university edith cowan university library primary unit: research services researchonline@ecu.edu.au primary contact: julia gross senior librarian, research services + - - - j.gross@ecu.edu.au website: ro.ecu.edu.au program overview year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library materials budget ( ); library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); faculty conference papers and proceedings ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; concept maps/modeling maps/visualizations disciplinary specialties: education; business; social and behavioral sciences; medicine and health sciences; arts and humanities top publications: australian journal of teacher education (journal); landscapes (journal); eculture (journal); journal of emergency primary health care (journal); research journalism (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; undergraduate students publishing platform(s): bepress (digital commons) digital preservation strategy: digital preservation services under discussion additional services: marketing; outreach; training; analytics; cataloging; metadata; notification of a&i sources; issn registration; doi assignment/allocation of identifiers; author copyright advisory; digitization plans for expansion/future directions: increasing
numbers of journals published; investigating ebook publication.

humboldt-universität zu berlin universitätsbibliothek primary unit: arbeitsgruppe elektronisches publizieren primary contact: niels fromm head electronic publishing group + - - fromm@ub.hu-berlin.de website: edoc.hu-berlin.de program overview mission/description: the edoc-server is the institutional repository of humboldt university. on this server every member of the university is able to publish his or her electronic theses and/or any documents as open access. we accept anything from single articles or volumes to series of open access publications. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ); graduate students ( ) funding sources (%): library operating budget ( ); other ( ) publishing activities types of publications: faculty-driven journals ( ); student-driven journals ( ); monographs ( ); technical/research reports ( ); etds ( ) media formats: text percentage of journals that are peer reviewed: campus partners: campus departments or programs publishing platform(s): locally developed software digital preservation strategy: clockss; lockss; in-house additional services: cataloging; metadata; digitization; document templates for ms-office; styles for endnote / citavi plans for expansion/future directions: developing a concept and a workflow for the publication of research data in addition to electronic theses.

monash university monash university library primary unit: research infrastructure division primary contact: andrew harrison research repository librarian + - - - andrew.harrison@monash.edu program overview mission/description: publishing at monash university is carried out by monash university research repository and monash university publishing, both of which are parts of the university library.
monash university research repository is a digital archive of selected content representing monash’s research activity. the repository provides staff and students a place to deposit their research collections, data, or publications so they are centrally stored and managed, with the content easily discoverable online by their peers globally and by the broader community. the university requires that successful phd theses are submitted to the repository for online publication. the repository is intended to be primarily an open access repository but does contain restricted access content on a case-by-case basis (e.g., embargoed theses). monash university publishing focuses on peer-reviewed monographs, which are published in both online open access and traditional print forms; as such, it is not included here. it seeks to publish scholarly work of the highest quality, ensured by rigorous peer review; maximise the impact of those titles; represent the breadth and energy of monash university research interests (while not excluding contributors from anywhere); promote the free exchange of knowledge; play a coordinating role in the production and dissemination of monash’s scholarly publications; and provide a body of publishing expertise within the university.
year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ) publishing activities types of publications: faculty-driven journals ( ); etds ( ); research datasets such as images and sound files media formats: text; images; audio; data disciplinary specialties: geographic information systems (gis); comparative literature and cultural studies; social/community work top publications: pan: philosophy activism nature (journal); practice reflexions (journal); applied gis (journal) percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty other partners: australian community workers association publishing platform(s): fedora; vital digital preservation strategy: in-house additional services: doi assignment/allocation of identifiers additional information: journal publishing is largely a legacy service. our future focus is on theses and research data. we also think that separating the university press from the repository is unhelpful: we see them as complementary, and both are exploring new ways for libraries to be involved in publishing going forward. plans for expansion/future directions: expanding theses program to include master’s and phd candidates from disciplines previously exempt from the compulsory submission process; expanding the range of research data included in the repository. changes to australian funding council rules will increase the amount of open access journal material we hold.
swinburne university of technology swinburne library primary unit: information resources primary contact: nyssa parkes online projects librarian + - - - nparkes@swin.edu.au website: www.swinburne.edu.au/lib/ir/onlinejournals; commons.swinburne.edu.au program overview mission/description: the swinburne online journals service provides publishing support to swinburne faculties and research centres who publish online open access journals. we provide hosting software and technical assistance as well as help and advice on general online publishing and copyright issues. swinburne commons is the centralized service for the management and distribution of digital media content produced across swinburne. the commons draws together quality digital media content from across the university to highlight the research strengths, teaching excellence, student accomplishments, and unique aspects of swinburne. year publishing activities began: organization: services are distributed across library units/departments staff in support of publishing activities (fte): library staff ( . ) funding sources (%): other ( ) publishing activities types of publications: faculty-driven journals ( ); journals produced under contract/mou for external groups ( ); video and audio publishing; videos created at the university are disseminated centrally through the library’s service media formats: text; images; audio; video; multimedia/interactive content disciplinary specialties: mathematics (videos); telecommunications; psychology; settler colonial studies percentage of journals that are peer reviewed: campus partners: individual faculty other partners: telecommunications society of australia publishing platform(s): ojs/ocs/omp; locally developed software digital preservation strategy: digital preservation services under discussion.
additional services: graphic design (print or web); marketing; training; analytics; metadata; compiling indexes and/or tocs; issn registration; doi assignment/allocation of identifiers; contract/license preparation; audio/video streaming; copyright and permissions advice; technical advice (video and audio); accessibility advice additional information: www.swinburne.edu.au/lib/ir/onlinejournals/support.html; commons.swinburne.edu.au/toolkit.php plans for expansion/future directions: investigating monograph publishing; implementing software upgrades.

university of hong kong university libraries primary unit: technical services primary contact: david t. palmer associate university librarian + - - dtpalmer@hku.hk website: hub.hku.hk program overview mission/description: we make highly visible the research and researchers of our university through our efforts, in the expectation that new offers of collaboration, contract research, employment, and so forth will be received from the government, industry, and society.
year publishing activities began: organization: services are distributed across campus staff in support of publishing activities (fte): library staff ( ) funding sources (%): library operating budget ( ); grants ( ) publishing activities types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ) media formats: text; images; audio; video disciplinary specialties: social sciences percentage of journals that are peer reviewed: campus partners: campus departments or programs; individual faculty; graduate students; undergraduate students publishing platform(s): dspace digital preservation strategy: in-house; digital preservation services under discussion additional services: cataloging; metadata; doi assignment/allocation of identifiers

university of south australia library primary unit: information resources and technology unisa-research-archive@unisa.edu.au primary contact: kate sergeant coordinator, repository & archive metadata services + - - - kate.sergeant@unisa.edu.au website: ura.unisa.edu.au program overview mission/description: the university of south australia has mandated that research degree students deposit a digital copy of their thesis in the university’s institutional repository, the unisa research archive. theses are made available under an open access publishing model with the application of a non-exclusive creative commons licence; however, copyright remains with the author. whilst it is the aim of the university that theses (and other research outputs) be made open access where possible, authors do have the option to restrict access to their thesis for two years. in order to capture previous research in an electronic format, the library embarked on a process of digitizing theses back to the foundation of the university of south australia in . these were loaded to the unisa research archive in .
as a result, the university has today over digitized research degree theses available in its repository. year publishing activities began: organization: centralized library publishing unit/department staff in support of publishing activities (fte): library staff ( . ) publishing activities types of publications: faculty-driven journals ( ); etds ( ); undergraduate capstone/honors theses ( ); digitized content media formats: text; images; audio; video disciplinary specialties: health; education; engineering; business; information technology top publications: graduation booklets; “innovation, globalisation and performance in smes” (thesis); “factors affecting online shopping behaviour” (thesis); calendars and handbooks; “service quality improvement in the hotel industry” (thesis) percentage of journals that are peer reviewed: campus partners: campus departments or programs; graduate students publishing platform(s): ojs/ocs/omp; digitool digital preservation strategy: university server additional services: graphic design (print or web); cataloging; metadata; author copyright advisory; digitization; handle publishing (persistent link)

library publishing coalition strategic affiliates

strategic affiliates are entities (including service providers, library networks and consortia, non-profit organizations, and others) that share a common interest in this emerging field. to become a strategic affiliate, contact the library publishing coalition’s program manager, sarah k. lippincott (sarah@educopia.org).
anvil academic; association of research libraries (arl); bepress; bibliolabs; boston library consortium (blc); coalition for networked information (cni); council of australian university librarians (caul); digital public library of america (dpla); five colleges librarians council; hastac; knowledge unlatched; oapen; open access scholarly publishers association (oaspa); public knowledge project (pkp); sparc; society for scholarly publishing (ssp); tizra

platforms, tools, and service providers

libraries work with a range of external software, tools, and service providers to support preservation, markup, conversion, hosting, allocation of identifiers, and other processes related to the publishing workflow. the following list compiles the names and websites of tools, software, and service providers employed by the libraries in this directory.

editorial/production: amazon createspace www.createspace.com; backstage library works www.bslw.com; bookcomp www.bookcomp.com; calibre www.calibre-ebook.com; charlesworth www.charlesworth-group.com; data conversion laboratory, inc.
www.dclab.com; inera extyles www.inera.com; ingram lightning source www.lightningsource.com; media preserve www.themediapreserve.com; oxygen www.oxygenxml.com; scene savers www.scenesavers.com; sigil www.github.com/user-none/sigil; submittable www.submittable.com; tips technical publishing www.technicalpublishing.com; trigonix www.trigonix.com/english; versioning machine www.v-machine.org

platform/hosting/infrastructure: @mire www.atmire.com/website; ambra www.ambraproject.org; bepress www.bepress.com; commons in a box www.commonsinabox.org; connexions www.cnx.org; contentdm www.contentdm.org; dataverse www.thedata.org; digitool by exlibris www.exlibrisgroup.com/category/digitoolovervie; django web framework www.djangoproject.com; dpubs dpubs.org; drupal www.drupal.org; dspace www.dspace.org; ensemble www.ensemblevideo.com; eprints www.eprints.org/us; etd-db scholar.lib.vt.edu/etd-db/index.shtml; xtf (extensible text framework) xtf.cdlib.org; fedora www.fedora-commons.org; hubzero www.hubzero.org; issuu www.issuu.com; kaltura www.corp.kaltura.com; omeka www.omeka.org; ojs/ocs/omp pkp.sfu.ca/ojs pkp.sfu.ca/ocs pkp.sfu.ca/omp; panopto www.panopto.com; pressbooks www.pressbooks.com; scalar scalar.usc.edu; tizra www.tizra.com; wordpress www.wordpress.org; vitalsource www.vitalsource.com
discovery/marketing: altmetric.com www.altmetric.com; bibapp www.bibapp.org; bowker www.bowker.com/en-us; crossref www.crossref.org; datacite www.datacite.org; doaj www.doaj.org; ebsco www.ebscohost.com; ezid www.n t.net/ezid; loc issn registry www.loc.gov/issn; marcive home.marcive.com; proquest www.proquest.com; serials solutions www.serialssolutions.com

digital preservation: adpnet www.adpnet.org; amazon glacier www.aws.amazon.com/glacier; amazon s www.aws.amazon.com/s; aptrust www.aptrust.org; archive-it www.archive-it.org; archivematica www.archivematica.org; artefactual www.artefactual.com; chronopolis chronopolis.sdsc.edu; clockss www.clockss.org/clockss/home; dark archive in the sunshine state (daitss) daitss.fcla.edu; digital preservation network (dpn) www.dpn.org; discoverygarden www.discoverygarden.ca; duracloud www.duracloud.org; hathitrust www.hathitrust.org; hydra www.projecthydra.org; internet archive www.archive.org/index.php; islandora www.islandora.ca; lockss www.lockss.org; metaarchive www.metaarchive.org; portico www.portico.org/digital-preservation; preservica www.preservica.com; rosetta www.exlibrisgroup.com/category/rosettaoverview; safety deposit box
www.digital-preservation.com/ solution/safety-deposit-box scholars portal spotdocs.scholarsportal.info/display/sp/ home synergies www.synergiescanada.org uc merritt merritt.cdlib.org library networks and consortia networked digital library of theses and dissertations (ndltd) www.ndltd.org ohiolink etd center etd.ohiolink.edu texas digital library www.tdl.org theses canada www.collectionscanada.gc.ca/ thesescanada/index-e.html www.preservica.com www.exlibrisgroup.com/category/rosettaoverview http://www.digital-preservation.com spotdocs.scholarsportal.info/display/sp/home spotdocs.scholarsportal.info/display/sp/home www.synergiescanada.org merritt.cdlib.org www.ndltd.org etd.ohiolink.edu www.tdl.org www.collectionscanada.gc.ca index-e.html personnel index aagaard, posie, anderson, clifford b., bayer, marc d., beasley, sarah, beaubien, sarah, xiii, beck, donna, xiii becker, devin, bell, allan, bernhardt, beth, billings, marilyn, vi, xiii, bonanni, mimmo, boock, michael, boyd, alan, brown, allison, buckland, amy, bull, jonathan, calarco, pascal, carter, andy, corbett, hillary, costanza, jane, cummings-sauls, rebel, davis-kahl, stephanie, vi, xiii, de groote, sandy, deliyannides, timothy s., dohe, kate, dotson, lee, duke, john, eden, brad, xiii ericson, randall, fister, barbara, flynn, stephen, friend, linda, fromm, niels, gillis, roger, gilman, isaac, xiii, gross, julia, halbert, martin, vi, harrison, andrew, heller, margaret, herbert, john, ho, adrian k., vi, xiii, huwe, terence k., johnson, kathy, johnston, dave, johnston, wayne, kane, william, kanellopoulos, lorena, kehoe, inba, kelly, marty, khanna, delphine, kim wu, somaly, kipnis, dan, kirchner, joy, kirk, elizabeth, laherty, jennifer, lally, ann, lee, dan, xiii, li, yuan, lind, sean, lippincott, sarah k., viii, mangiafico, paolo, mariner, matthew, marker, rhonda, marshall, karen, mcmillan, gail, xiii, meikle, sian, mercer, holly, vi, xiii, michalek, gabrielle, millard, john, mitchell, carmen, mitchell, 
catherine, xiii, morris, jane, xiii, mullins, james, vi myers, kim, neds-fox, joshua, newton, mark, xiii, oberg, johan, owen, brian, owen, terry m., owen, will, owens, elisabeth, palmer, david t., panciera, benjamin, parandjuk, joanne, parkes, nyssa, paulus, nick, poley, darren g., ramirez, marisa, reed, marianne, reynolds, david, reznik-zellen, rebecca, riddle, kelly, robertson, wendy, roosa, mark, royster, paul, rubin, jeff, ruddy, david, russell, john, russell, judy, sauvé, diane, schlosser, melanie, xiii, sergeant, kate, simser, char, skinner, katherine, viii smart, elizabeth, vi, xiii, starcher, christopher, stenberg, emily, stewart, claire, stockham, marcia, xiii sutton, shan, vi, xiii swift, allegra, vi, xiii, thompson, mary beth, xiii thoms, becky, tillinghast, beth, trehub, aaron, turtle, beth, vi, xiii vandegrift, micah, vanderjagt, leah, varner, stewart, walters, tyler, vi watkinson, charles, vi, xiii, weinraub, evviva, xiii white, nicole, yates, elizabeth, yeung, tim au, zuniga, heidi, www.librarypublishing.org participating in the library publishing coalition means joining a robust network of libraries committed to enhancing, promoting, and exploring this emerging field. our participating libraries are designing and building this organization from the ground up: making decisions about governance and services, producing resources that benefit the community, and engaging with colleagues. north american academic libraries with an interest in participating may do so at any point during the two-year project period (january –december ) as a contributing institution. in january , the lpc will launch as a membership organization. for more information, please contact sarah k. lippincott, library publishing coalition program manager (sarah@educopia.org). 
www.librarypublishing.org mailto:sarah@educopia.org http://www.librarypublishing.org fc title copyright contents foreword introduction library publishing coalition subcommittees reading an entry libraries in the united states and canada arizona state university auburn university boston college brigham young university brock university cal poly, san luis obispo california institute of technology california state university san marcos carnegie mellon university claremont university consortium colby college college at brockport, suny college of wooster columbia university connecticut college cornell university dartmouth college duke university emory university florida atlantic university florida state university georgetown university georgia state university grand valley state university gustavus adolphus college hamilton college illinois wesleyan university indiana university johns hopkins university kansas state university loyola university chicago macalester college mcgill university miami university mount saint vincent university northeastern university northwestern university oberlin college ohio state university oregon state university pacific university pennsylvania state university pepperdine university portland state university purdue university rochester institute of technology rutgers, the state university of new jersey simon fraser university state university of new york at buffalo state university of new york at geneseo syracuse university temple university texas tech university thomas jefferson university trinity university tulane university université de montréal university of alberta university of arizona university of british columbia university of calgary university of california, berkeley university of california system university of central florida university of colorado anschutz medical campus university of colorado denver university of florida university of georgia university of guelph university of hawaii at manoa university of idaho university 
of illinois at chicago university of iowa university of kansas university of kentucky university of maryland college park university of massachusetts amherst university of massachusetts medical school university of michigan university of minnesota university of nebraska-lincoln university of north carolina at chapel hill university of north carolina at charlotte university of north carolina at greensboro university of north texas university of oregon university of pittsburgh university of san diego university of south florida university of tennessee university of texas at san antonio university of toronto university of utah university of victoria university of washington university of waterloo university of windsor university of wisconsin–madison utah state university valparaiso university vanderbilt university villanova university virginia commonwealth university virginia tech wake forest university washington university in st. louis wayne state university western university libraries outside the united states and canada australian national university edith cowan university humboldt-universität zu berlin monash university swinburne university of technology university of hong kong university of south australia library publishing coalition strategic affiliates platforms, tools, and service providers personnel index back cover

cinemetrics – a bibliography
mike baxter , lady bay road, west bridgford, nottingham, ng bj, uk (e-mail: michaelj.baxter@btconnect.com)
october

introduction
this is work in progress and will be added to from time to time and updated on my website, http://www.mikemetrics.com/. the original idea was to list cinemetric publications with a strong statistical component on the website, but the number is steadily increasing so it is going to be more efficient to make this available as a pdf file.
i’ve written in more detail about the statistical aspects of cinemetrics elsewhere, in notes on cinemetric data analysis, which is effectively a book that can be accessed for free from my website and academia.edu page, so i will keep this introduction fairly short.

the cinemetrics website established by yuri tsivian in is important for many reasons. it draws some of its inspiration from the work of barry salt, particularly his paper listed below. this was well ahead of its time. a lot of the ideas it embodies couldn’t easily be implemented with the computing power available at the time. this has now changed and cinemetrics is one manifestation of this. among other things it eases the problem of collecting data and interrogating it. there are issues about accuracy – the source material, frame-accurate measurement, the actual analysis of the data, and so on. the issues are being addressed.

one thing that interests me is the nature of publication. this is changing rapidly. what you might call ‘conventional’ journals that deal with film studies are frightened to publish anything involving quantitative ideas that the editors judge might scare, or be incomprehensible to, their readers. this is an understandable, if frustrating, point of view if you want to get ideas of the quantitative analysis of ‘filmic analysis’ into the public domain. i’ve experienced this and know i am not alone. a lot of what’s listed below, for the reasons i’ve outlined above, is on the web rather than in journals.

apart from the people name-checked above there is an interesting body of work by nick redfern, and by james cutting and his colleagues. this will be obvious from the bibliography. the various people involved do not necessarily agree with each other’s ideas (myself included) and there is a certain amount of what you might call ‘combative’ debate about this, a lot of which can be viewed on the cinemetrics website.
in some ways this ‘internal dissent’ about quantitative methodology distracts from the possibility that the quantitative study of film can add to your knowledge about film. it does; it’s just one way of looking at film; and how you rate it depends on what your interests are. the study of montage in early (the s) film has benefited from quantitative analysis through the work of yuri tsivian and colleagues. if you want to see what some of this is, there are video presentations of talks at a conference sponsored by the neubauer collegium at the university of chicago. so i think this is all interesting; it is what academic study is about; you do things, sit back, and see what happens.

the present bibliography is intended as a resource. a few references that are not strictly ‘cinemetric’ but involve the analysis of ‘filmic’ data are included.

(author’s affiliation: emeritus professor of statistical archaeology, nottingham trent university, uk)

bibliography
[ ] adams b., dorai c. and venkatesh s. ( ) towards automatic extraction of expressive elements from motion pictures: tempo. ieee international conference on multimedia and expo, , vol. ii, - .
[ ] adams b., dorai c. and venkatesh s. ( ) formulating film tempo: the computational media aesthetics methodology in practice. in c dorai and s venkatesh (eds.) media computing: computational media aesthetics. norwell, ma: kluwer academic publishers, - .
[ ] adams b., venkatesh s., bui h.h. and dorai c. ( ) a probabilistic framework for extracting narrative act boundaries and semantics in motion pictures. multimedia tools and applications , - .
[ ] rubio alcover, a. and samit, a. t. ( ) three neoclassicisms. exploring the possibilities of a comparative average shot length through clint eastwood, brian de palma and woody allen. icono , - .
[ ] baxter m. ( a) film statistics: some observations, http://www.cinemetrics.lv/dev/on statistics.php
[ ] baxter m. ( b) film statistics: further observations.
http://www.cinemetrics.lv/dev/on statistics.php
[ ] baxter m. ( c) picturing the pictures: hitchcock, statistics and film. significance , - .
[ ] baxter m. ( a) lines, damned lines and statistics. http://www.cinemetrics.lv/dev/on statistics.php
[ ] baxter m. ( b) comparing cutting patterns a working paper. http://www.cinemetrics.lv/dev/on statistics.php
(footnote: videos of the neubauer collegium conference talks are at http://neubauercollegium.uchicago.edu/events/uc/cinemetrics-conference/)
[ ] baxter m. ( b) cutting patterns in d.w. griffith’s biographs: an experimental statistical study. http://www.cinemetrics.lv/dev/on statistics.php
[ ] baxter m. ( d) on the distributional regularity of shot lengths in film. literary and linguistic computing, doi: . /llc/fqt
[ ] baxter m. ( e) evolution in hollywood editing patterns? http://www.cinemetrics.lv/dev/evolution paper for cinemetrics.pdf
[ ] baxter m. ( a) cutting patterns in d.w. griffith’s silent feature films. http://www.academia.edu/ /cutting patterns in d.w. griffiths silent feature films
[ ] baxter m. ( a) on the graphical comparison of cutting-rates across bodies of films: with applications to the films of mack sennett and charlie chaplin. http://www.academia.edu/ /on the graphical comparison of cutting-rates across bodies of films with applications to the films of mack sennett and charlie chaplin
[ ] baxter m. ( c) further comments on evolution in hollywood film: the role of models. http://www.cinemetrics.lv/dev/baxter cutting and cinemetrics.pdf
[ ] buckland w. ( ) what does the statistical style analysis of film involve? a review of ‘moving into pictures. more on film history, style, and analysis’. literary and linguistic computing , - .
[ ] buckland w. ( ) ghost director. in digital tools in media studies, m. ross, m. grauer and b. freisleben (eds.), transcript verlag, bielefeld, germany, - .
[ ] cutting, j.e. ( ) more on the evolution of popular film editing. http://www.cinemetrics.lv/dev/cuttingcinemetricx .pdf
[ ] cutting j.e., delong j.e. and nothelfer c.e.
( ) attention and the evolution of hollywood film. psychological science , - .
[ ] cutting j.e., brunick k.l. and delong j.e. ( a) the changing poetics of the dissolve in hollywood film. empirical studies of the arts , - .
[ ] cutting j.e., brunick k.l. and delong j.e. ( b) how act structure sculpts shot lengths and shot transitions in hollywood film. projections , - .
[ ] cutting j.e., brunick k.l. and delong j.e. ( ) on shot lengths and film acts: a revised view. projections , - .
[ ] cutting j.e., brunick k.l., delong j.e., iricinschi c. and candan a. ( ) quicker, faster, darker: changes in hollywood film over years. i-perception , - .
[ ] cutting j.e., delong j.e. and brunick k.l. ( ) visual activity in hollywood film: to and beyond. psychology of aesthetics, creativity, and the arts , - .
[ ] cutting, j. e. and candan, a. ( ). movies, evolution, and mind: from fragmentation to continuity. the evolutionary review , - .
[ ] cutting, j.e., iricinschi c. and brunick, k.l. ( ) mapping narrative space in hollywood film. projections , - .
[ ] delong j.e., brunick k.l. and cutting j.e. ( ) film through the human visual system: finding patterns and limits. in the social science of cinema, j.c. kaufman and d.k. simonton (eds.), oxford university press, new york, in press.
[ ] delong, j.e. ( ) horseshoes, handgrenades, and model fitting: the lognormal distribution is a pretty good model for shot-length distribution of hollywood films. literary and linguistic computing, doi: . /llc/fqt
[ ] grzybek p. and koch v. ( ) shot length: random or rigid, choice or chance? an analysis of lev kulešov’s po zakonu [by the law]. in sign culture. zeichen kultur, e.w.b. hess-lüttich, ed., königshausen & neumann: würzburg, - .
[ ] han x., small s.d., foster d.p. and patel v. ( ) the effect of winning an oscar award on survival: correcting for healthy performer survivor bias with a rank preserving structural accelerated failure time model. the annals of applied statistics , .
[ ] manovich, l. ( ) visualizing vertov. http://softwarestudies.com/cultural analytics/manovich.visualizing vertov. .pdf
[ ] murtagh f., ganz a. and mckie s. ( ) the structure of narrative: the case of film scripts. pattern recognition , - .
[ ] o’brien c. ( ) cinema’s conversion to sound. bloomington, in: indiana university press.
[ ] redelmeier d.a. and singh s.m. ( ) survival in academy award-winning actors and actresses. annals of internal medicine , - .
[ ] redelmeier d.a. and singh s.m. ( ) reanalysis of survival of oscar winners. annals of internal medicine , .
[ ] redfern n. ( a) shot length distributions in the chaplin keystones, http://nickredfern.files.wordpress.com/ / /nick-redfern-shot-length-distributions-in-the-chaplin-keystones .pdf
[ ] redfern n. ( b) the impact of sound technology on the distribution of shot lengths in motion pictures, http://nickredfern.files.wordpress.com/ / /nick-redfern-the-impact-of-sound-technology-on-hollywood-film-style .pdf
[ ] redfern n. ( a) shot length distributions in the early films of charles chaplin, http://nickredfern.files.wordpress.com/ / /nick-redfern-shot-length-distributions-in-the-early-films-of-charles-chaplin.pdf
[ ] redfern n. ( b) shot length distributions in the films of alfred hitchcock, to , http://nickredfern.files.wordpress.com/ / /nick-redfern-shot-length-distributions-in-the-films-of-alfred-hitchcock- -to- .pdf
[ ] redfern n. ( c) robust measures of scale for shot length distributions, http://nickredfern.files.wordpress.com/ / /nick-redfern-robust-measures-of-scale-for-shot-length-distributions.pdf
[ ] redfern n. ( d) shot length distributions in the short films of laurel and hardy, to , http://nickredfern.files.wordpress.com/ / /nick-redfern-shot-length-distributions-in-the-short-films-of-laurel-and-hardy.pdf
[ ] redfern n.
( e) statistical analysis of shot types in the films of alfred hitchcock, http://nickredfern.files.wordpress.com/ / /nick-redfern-statistical-analysis-of-shot-types-in-the-films-of-alfred-hitchcock.pdf
[ ] redfern n. ( a) time series analysis of bbc news bulletins using running mann-whitney z statistics, http://nickredfern.files.wordpress.com/ / /nick-redfern-time-series-analysis-of-bbc-news-bulletins .pdf
[ ] redfern n. ( a) the lognormal distribution is not an appropriate parametric model for shot length distributions of hollywood films. literary and linguistic computing, doi: . /llc/fqs
[ ] redfern n. ( c) exploratory data analysis and film form: the editing structure of slasher films, http://nickredfern.files.wordpress.com/ / /nick-redfern-the-editing-structure-of-slasher-films.pdf
[ ] redfern, n. ( d) robust time series analysis of itv news bulletins, http://nickredfern.files.wordpress.com/ / /nick-redfern-robust-time-series-analysis-of-itv-news-bulletins.pdf
[ ] redfern, n. ( d) the average shot length as a statistic of film style, http://www.cinemetrics.lv/dev/on statistics.php
[ ] redfern, n. ( e) robust estimation of the modified autoregressive index for high grossing films at the us box office, - . http://nickredfern.files.wordpress.com/ / /nick-redfern-the-mar-index-for-hollywood-films .pdf
[ ] redfern, n. ( f) correspondence analysis of genre preferences in uk film audiences. participations , - .
[ ] redfern, n. ( a) an introduction to using graphical displays for analyzing the editing structure of motion pictures. http://www.cinemetrics.lv/dev/on statistics.php
[ ] redfern, n. ( b) time series clustering and the analysis of film style. http://www.cinemetrics.lv/dev/on statistics.php
[ ] redfern, n. ( c) film style and narration in rashomon. journal of japanese and korean cinema , - .
[ ] redfern, n. ( d) film studies and statistical literacy. media education research journal , - .
[ ] redfern, n.
( a) the structure of itv news bulletins. international journal of communication ( ), .
[ ] redfern, n. ( b) quantitative methods and the study of film. http://nickredfern.files.wordpress.com/ / /nick-redfern-quantitative-methods-and-the-study-of-film.pdf
[ ] redfern, n. ( c) comparing the shot length distributions of motion pictures using dominance statistics. empirical studies of the arts , - .
[ ] salt b. ( ) statistical style analysis of motion pictures. film quarterly , - .
[ ] salt b. ( ) film style and technology in the thirties. film quarterly , - .
[ ] salt b. ( ) film style and technology in the forties. film quarterly , - .
[ ] salt, b. ( ) early german film: the stylistics in comparative context. in a second life: german cinema’s first decades, t. elsaesser (ed), amsterdam university press, amsterdam, - .
[ ] salt b. ( ) the shape of : the stylistics of american movies at the end of the century. new review of film and television studies , - .
[ ] salt b. ( ) moving into pictures. starword, london.
[ ] salt b. ( a) film style & technology: history & analysis, rd edition. starword, london, .
[ ] salt b. ( b) the shape of . new review of film and television studies , - .
[ ] salt, b. ( ) speeding up and slowing down. http://www.cinemetrics.lv/salt speeding up down.php
[ ] salt, b. ( a) the metrics in cinemetrics. http://www.cinemetrics.lv/metrics in cinemetrics.php
[ ] salt, b. ( b) reaction time: how to edit movies. new review of film and television studies , - .
[ ] salt b. ( ) graphs and numbers. http://www.cinemetrics.lv/dev/on statistics.php
[ ] salt b. ( ) lines and graphs. http://www.cinemetrics.lv/dev/on statistics.php
[ ] salt b. ( ) salt on baxter on cutting. http://www.cinemetrics.lv/dev/cutthoughtc.pdf
[ ] schaefer r.j. and martinez t. ( ) trends in network news editing strategies from through . journal of broadcasting and electronic media , - .
[ ] starace s. ( ) per un’analisi stilometrica del cinema di lattuada. in alberto lattuada.
il cinema e i film, ed. a. aprà, venice: marsilio, -
[ ] starace s. ( a) da realista a falsario. appunti stilometrici sul cinema di lizzani. in carlo lizzani. un lungo viaggio nel cinema, ed. v. zagarrio, venice: marsilio, -
[ ] starace s. ( b) introduzione alla televisione di cottafavi. in ai poeti non si spara. vittorio cottafavi fra cinema e televisione, ed. a. aprà, g. bursi, s. starace, bologna: cineteca di bologna, -
[ ] starace s. ( a) per un’analisi stilometrica di camerini. cabiria , -
[ ] starace s. ( b) metrica e poesia. per un’analisi stilometrica del cinema di bertolucci. in bernardo bertolucci. il cinema e i film, ed. a. aprà, venice: marsilio, -
[ ] starace s. ( ) the evolution of film editing. in titanus. in family diary of italian cinema, ed. s.m. germani, s. starace, r. turigliatto, rome: centro sperimentale di cinematografia, sabinae, -
[ ] sylvestre m-p., huszti e. and hanley j.a. ( ) do oscar winners live longer than less successful peers? a reanalysis of the evidence. annals of internal medicine , - .
[ ] taskiran, c. and delp, e. ( ) a study on the distribution of shot lengths for video analysis. spie conference on storage and retrieval for media databases http://www.ctaskiran.com/papers/ ei shotlen.pdf
[ ] tsivian y. ( ) editing in intolerance. in the griffith project, volume ( - ), ed. p. cherchi usai, london: bfi publishing, - .
[ ] tsivian y. ( ) what is cinema? an agnostic answer. critical inquiry , - .
[ ] tsivian y. ( ) cinemetrics, part of the humanities’ cyberstructure. in digital tools in media studies: analysis and research: an overview, b. freisleben, j. garncarz and m. grauer (eds.), transcript verlag, bielefeld, - .
[ ] tsivian y. ( ) talking to miriam: soviet americanitis and the vernacular modernism thesis. new german critique , - .
[ ] vasconcelos n. and lippman a. ( ) statistical models of video structure for content analysis and characterization. ieee transactions on image processing , - .
[ ] wolkewitz m., allignol a., schumacher m. and beyersmann j. ( ) two pitfalls in survival analyses of time-dependent exposure: a case study in a cohort of oscar nominees. the american statistician , - .

pemnetwork: barriers and enablers to collaboration and multimedia education in the digital age
angela lumba-brown, s. tat, m. auerbach, d. kessler, michelle j alletag, p. grover, d. schnadower, c. macias and t. chang
pediatric emergency care, pages – . doi: . /pec.

abstract
in january , pemfellows.com was created to unify fellows in pediatric emergency medicine. since then, the website has expanded, contracted, and focused to adapt to the interests of the pediatric emergency medicine practitioner during the internet boom. this review details the innovation of the pemnetwork, from the inception of the initial website and its evolution into a needs-based, user-directed educational hub.
barriers and enablers to success are detailed with unique examples…
computing for cultural heritage

during the / autumn & spring terms, we held an institute of coding funded trial at birkbeck, university of london, of a new one-year part-time master’s level postgraduate certificate framework in computing aimed at information professionals working in the cultural heritage sector. from deploying simple scripts for everyday tasks, to developing tools for analysing collections data, the british library and the national archives are exploring different ways to meet demand for such skills, arising particularly from colleagues in curatorial and collections based roles. this trial explored a model whereby cultural heritage professionals could gain crucial computational skills, immediately relevant to their roles, while earning a formal qualification in computer science, with the express support of their institution.

nora mcgregor, digital curator, british library, digitalresearch@bl.uk
www.bl.uk/projects/computingculturalheritage

outcomes
• final student projects and coursework have been submitted, with results expected end of june .
• a full evaluation of the trial is currently in progress, and will be published alongside the framework in a project report due early .
• birkbeck university of london has moved ahead with the launch of applied data science postgraduate certificate this autumn / , leveraging the same framework developed in the trial while expanding it to include information professionals from a range of domains.

work-based project module
students completed this module over weeks working independently at their institution.
each student developed their project proposal with the input and approval of their manager, the aim being for projects to directly benefit the individual’s current role and/or the needs of the home institution. this also allowed time to be negotiated on an individual basis so that projects could be completed during work hours. details of projects undertaken will be included, with student permission, in the final report, as well as on the project website. the following two examples give a good sense of the type of projects undertaken:
• automated text extraction from colonial-era maps of eastern africa
• distant reading descriptions and grouping topics from general board of health and home office, local government act office: correspondence

acknowledgments
many thanks to the entire cohort for participating in the trial, and to jo pugh of the national archives for supporting it. generous funding was provided by the institute of coding, and special thanks go to the project team at birkbeck university including stelios sotiriadis, mark levene, martin harris, and peter wood.

demystifying computing with python
two hr lecture/lab sessions were held one day a week : - : , over the course of five weeks at birkbeck university. managers were asked to consider attendance of their staff at these lectures to be part of a normal working day, rather than taken as special leave or holiday. in this module, the lecturer aimed to incorporate datasets and contextual references from the cultural heritage sector where possible into the general python computing lessons and lectures.

the trial
a cohort of staff from british library ( ) and the national archives in the uk ( ) were selected through a formal application process at each institution (a total of and applications received respectively).
students undertook two newly designed modules at birkbeck university:
- demystifying computing with python
- work-based project: digital project design and development
a final module, analytic tools for information professionals, is currently under development and will be launched as part of the full applied data science postgraduate certificate.

http://www.dcs.bbk.ac.uk/study/postgraduate/pgcert-in-applied-data-science/
http://www.dcs.bbk.ac.uk/study/modules/work-based-project-for-information-professionals/
http://www.bl.uk/projects/computingculturalheritage
https://blogs.bl.uk/magnificentmaps/ / /automated-text-extraction-from-colonial-era-maps-of-eastern-africa.html
https://twitter.com/dentiloquy/status/ ?s=
http://www.dcs.bbk.ac.uk/study/modules/demystifying-computing-with-python/
http://www.dcs.bbk.ac.uk/study/modules/analytic-tools-for-information-professionals/

digital corpora and scholarly editions of latin texts: features and requirements of textual criticism
by franz fischer

introduction
digital philology has produced a wide range of new methods and formats for editing and analyzing medieval texts. the provision of digital facsimiles has put the manuscripts, the very material base of any editorial endeavor, into focus again.
several editions have been created that engage primarily with individual manuscripts; others have posited a wide range of variance as a central characteristic of medieval literature instead of relegating variants to the footnotes of ahistorically normalized and regularized texts or speculative reconstructions of archetypes and authorities. nevertheless, the idea of a critical text, especially of nonvernacular medieval works, does not yet seem to be obsolete. quite the opposite: the number of digital facsimiles of manuscripts and early print books and the quantity of document-oriented transcriptions available online is growing continually, and with it the need for critically examined and edited texts increases. like a medieval reader having little choice but to rely on the only manuscript copy available at her or his library, without a critical text the modern reader is at a loss to adjudicate on the quality of the textual version picked up randomly on the internet. moreover, digital technologies, methods, and standards have steadily improved, creating possibilities for digital critical editions the quality of which former generations of editors could only imagine. as of yet only a relatively small number of born-digital critical editions of greek and latin texts exists.

speculum /s (october ). © by the medieval academy of america. all rights reserved. this work is licensed under a creative commons attribution-noncommercial . international license (cc by-nc . ), which permits non-commercial reuse of the work with attribution. for commercial use, contact journalpermissions@press.uchicago.edu. doi: . / , - / / s - $ . .

this article stems from a specialized seminar at the university of oklahoma on “latin textual criticism in the digital age” organized by the digital latin library (dll), a joint project of the society for classical studies, the medieval academy of america, and the renaissance society of america funded by the andrew w.
mellon foundation’s scholarly communications program.
e.g., editions of parzival (http://www.parzival.unibe.ch), the canterbury tales, dante’s divina commedia and monarchia (http://www.sd-editions.com), or the vercelli book (http://www.collane.unito.it/oa/items/show/ ), to name just a few. all urls have been verified and the referenced websites have been archived as far as possible in the internet archive (https://archive.org/) on june .
franz fischer, “all texts are equal, but . . . textual plurality and the critical text in digital scholarly editions,” variants ( ): – ; online: http://kups.ub.uni-koeln.de/ ; caroline macé and jost gippert, oxford handbook of greek and latin textual criticism, ed. wolfgang de melo and scott scullion (oxford, forthcoming), ch. , “textual criticism and editing in the digital age.”
paolo monella, “why are there no comprehensively digital scholarly editions of classical texts?” (paper first published online april ; revised version [april ] online at http://www .unipa.it/paolo.monella/lincei/files/why/why_paper.pdf).

this content downloaded from . . . on july , : : am all use subject to university of chicago press terms and conditions (http://www.journals.uchicago.edu/t-and-c).
even so, the (albeit slowly) growing number of digital critical editions increases the demand for assembling and providing critical texts in the form of a textual corpus, because only collections or corpora of texts that are otherwise dispersed on various websites allow for a systematic analysis and for efficient research across the works of a specific author, genre, subject, period, or language as a whole. in this article, some features and requirements for a digital corpus of critical texts are proposed and discussed in order to realize the heuristic, explorative, and interpretative potential of integrated historical texts from the classicist and postclassicist tradition of greek and latin works.

generally speaking, when corpora of classical or medieval latin or greek texts are compiled and published, they are stripped of their critical features, namely the accompanying introduction, commentary, and apparatus notes. one reason for this omission might be economic: if the texts are published by a traditional publishing house (such as brepols, with its library of latin texts), the digital text versions of the corpus are considered an additional means of entry to the printed version in order to give access to a large variety of texts and promote the canonical print products, which remain indispensable for accurate citation and reference.
if the texts are published by an academic institution not primarily driven by economic interests (such as, most notably, the perseus digital library or the digital library of late-antique latin texts), the reason for skipping the critical features of a printed scholarly edition might be more practical in nature. while it is rather easy to digitize plain texts, it is very hard to encode the complex and often idiosyncratic reference system of apparatus notes (lines, lemmata, variant readings, sigla, etc.). this task requires both a lot of time and a high degree of skill on the part of the digitizing person.

on the general aspects and purposes of digital corpora see the catalog of “criteria for reviewing digital text collections,” by ulrike henny and frederike neuber in collaboration with the members of the institut für dokumentologie und editorik (ide), version . , february , http://www.i-d-e.de/publikationen/weitereschriften/criteria-text-collections-version- - /: “a few examples for collection design principles are completeness (e.g. if the corpus aims to represent the work of an author as a whole), representativeness (if the corpus claims to be representative for a specific subject domain and functions as a reference for that domain) and balance (e.g. if the corpus is built to allow for contrastive analyses between its components such as different text genres or regional language varieties).”
library of latin texts–online (llt-o, ), online: http://www.brepols.net/pages/browsebyseries.aspx?treeseries=llt-o.
there are some exceptions to the rule of stripping away features of textual criticism, for example, in the edition of cicero’s speeches, m. tulli ciceronis orationes. see, for example, against catiline, work uri: http://data.perseus.org/texts/urn:cts:latinlit:phi .phi ; there you also find commentary notes, a translation, a vocabulary tool, and a search tool.
for the time being, the only digital corpus of latin texts providing (mostly retrodigitized) critical editions is “musisque deoque: a digital archive of latin poetry, from its origins to the italian renaissance,” http://www.mqdq.it/public/.
digital library of late-antique latin texts (digiliblt): http://digiliblt.lett.unipmn.it.
for a semiautomated method of mapping apparatus entries on the annotated section of the main text, see federico boschetti, “methods to extend greek and latin corpora with variants and conjectures: mapping critical apparatuses onto reference text,” in proceedings of the corpus linguistics conference (birmingham, ), online: http://ucrel.lancs.ac.uk/publications/cl /paper/ _paper.pdf.

there are other causes for the omission of text-critical features, such as copyright issues or a predominant interest in simple text analytics and computational methods, such as stylometry, topic modeling, computational semantics, text mining, or search and retrieval applied to plain text versions.
be that as it may, one might ask whether it would be sufficient simply to add the information as given in the apparatus criticus and in the philological introduction to make these texts “truly” digital critical editions. a “truly” and fully fledged digital scholarly edition is surely something more than, or at least something different from, a traditional scholarly edition in a digital format. but if that is the case, how does this fit into a corpus of digital scholarly editions?

the copyright status of edited ancient or medieval texts varies according to national legislation. for instance, under german law, a critical text of an edition (created by an author deceased centuries ago) might not be copyrighted, while the introduction, commentary, and apparatus are. otherwise there is legal uncertainty, and uniform international guidelines or legal assistance are missing. see a recent article by wout dillen and vincent neyt, “digital scholarly editing within the boundaries of copyright restrictions,” in digital scholarship in the humanities / ( ): – , doi: . /llc/fqw , on the possibilities and limitations when working with modern manuscripts.
good examples for advanced corpora of latin texts created for this purpose are the corpus corporum, a “latin text (meta-)repository and tool” developed at the university of zurich (http://www.mlat.uzh.ch/mls/); and the computational historical semantics (comphistsem) latin text database and lexicon created at the goethe-university frankfurt (http://www.comphistsem.org). for a discussion about the gap between digital scholarly editions and text analysis see the panel discussion “text analysis meets text encoding” at the dh conference in hamburg: http://www.dh .uni-hamburg.de/conference/programme/abstracts/text-analysis-meets-text-encoding. .html.
for a definition see, most recently, patrick sahle, “what is a scholarly digital edition (sde)?,” in digital scholarly editing: theory, practice and future perspectives, ed. matthew driscoll and elena pierazzo (cambridge, uk, ), – ; online: http://www.openbookpublishers.com//download/book/ , doi: . /obp. .

digital critical editions: six case studies
in the following analysis, six editions will be presented. they are all critical and digital editions of latin or greek works. they have been or are being created in connection with my personal and institutional involvement under very specific conditions, at a certain place and time, with very specific aims and scope. they serve here as case studies to identify some general characteristics of digital critical editions. on the basis of these examples, four proposals will be made for how to create a digital corpus of critical editions.

first study: historians from late antiquity
the collection and edition of fragments and testimonies of historians from late antiquity is a long-term project carried out at the university of düsseldorf. it has been conceived as a traditional critical print edition with a parallel online presence. the edition comprises a critical text furnished with an apparatus criticus and a philological introduction. a commentary, german translation, and bibliography are planned to be published exclusively in print, as a concession to the business model of the publisher. the online version is being realized by the cologne center for ehumanities (cceh) of the university of cologne. the critical texts are edited with classical text editor (cte), a software tool widely used by traditional philologists for creating multiple apparatus in printable format, namely pdf. the tool also provides an html and even tei-xml output, marking up all relevant layout information of the print version: sections, fonts, italics, borders, spaces, and so on. semantic information (such as readings, witnesses, lemmata, quotes, sigla, and references) is not marked up explicitly. as a consequence, the digital version is a mere reproduction of the print, lacking any additional features except for basic browse and search. for this reason, it can be labeled a critical edition, as it provides a philological introduction and critical annotations (even if based on the work of previous editors), descriptive information, and indices, as well as (after a so-called moving wall, that is, after a certain period of time) commentary and translation. in essence, the edition follows the print paradigm. digital methods or functionalities have not been applied. its usability does not significantly differ from the usability of a printed book. even if critically annotated and digitally presented, from a technological perspective the established texts are plain and single-dimensional (fig. ).
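the contrast drawn in the first study between layout markup and semantic markup can be made concrete with a small sketch. the python code below is illustrative only and is not taken from cte or from the edition itself: the readings, the sigla a, b, and c, and the helper names `make_app_entry` and `witnesses_reading` are hypothetical. it builds one apparatus entry with the tei elements `app`, `lem`, and `rdg`, so that lemma, variant reading, and witnesses are stated explicitly rather than implied by fonts and spacing.

```python
import xml.etree.ElementTree as ET


def make_app_entry(lemma, lemma_wits, variants):
    """Build a TEI-style <app> element.

    lemma      -- the reading adopted in the critical text
    lemma_wits -- sigla of the witnesses supporting the lemma
    variants   -- list of (reading, [sigla]) pairs for rejected readings
    """
    app = ET.Element("app")
    lem = ET.SubElement(app, "lem", wit=" ".join("#" + s for s in lemma_wits))
    lem.text = lemma
    for reading, wits in variants:
        rdg = ET.SubElement(app, "rdg", wit=" ".join("#" + s for s in wits))
        rdg.text = reading
    return app


def witnesses_reading(app, reading):
    """Return the sigla attesting a given reading (lemma or variant)."""
    for el in app:
        if el.text == reading:
            return [w.lstrip("#") for w in el.get("wit", "").split()]
    return []


# Hypothetical data: the sigla A, B, C and both readings are invented.
entry = make_app_entry("scripsit", ["A", "B"], [("scribit", ["C"])])
print(ET.tostring(entry, encoding="unicode"))
print(witnesses_reading(entry, "scribit"))
```

because each reading carries its witnesses as an attribute, a question such as "which witnesses read scribit?" becomes a simple lookup. layout-only markup (italics, borders, spacing) would require the apparatus to be re-parsed before such a query could be answered.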
second study: saint patrick’s “confessio”
the digital edition of saint patrick’s confessio, a fifth-century open letter by ireland’s patron saint, is based on a critical print edition from including critical apparatus, apparatus fontium, apparatus biblicus, and commentary, but also adding various text layers (facsimiles, translations) and features (paratexts, bibliography, scholarly articles, fiction, and more), all of which are closely interlinked and furnished with user-friendly functionalities (hyperlinks from sigla to facsimile, from lemma to text, from reference to bibliography, and so on). the realization of the edition entailed a wide range of tasks and actions: ocr cleanup; the acquisition of facsimiles; copyright negotiations; encoding of the canonical work structure and alignment with the structure of manuscript witnesses, prints, and translations; and, last but not least, a detailed encoding of the apparatus entries and the editor’s commentary. the presentation of various textual layers, versions, and annotations relies heavily on the application of hypertext technology and is suitably labeled a hypertext stack edition (fig. ).

classical text editor, version . ( ): http://cte.oeaw.ac.at/?id =main. a similarly “flat” edition (from the technological point of view) is donald j. mastronardo’s digital edition of the scholia on euripides: http://euripidesscholia.org/.
saint patrick’s confessio, ed. anthony harvey and franz fischer (dublin, ); online: http://confessio.ie; franz fischer, “who is patrick?—answers from the saint patrick’s confessio hyperstack,” in conference proceedings: supporting digital humanities (copenhagen, ); online: http://kups.ub.uni-koeln.de/id/eprint/ ; fischer, “all texts are equal.” a comparable edition (if on a slightly smaller scale) is the edition of the schedula diversarum artium (http://schedula.uni-koeln.de/), providing all relevant texts and documents to assess and analyze the complex stages of editorial revision and textual transmission. in the form of a digital collection of three critical print editions, that edition might even be labeled a metaedition.

third study: guillelmus autissiodorensis
the digital editio princeps of william of auxerre’s treatise on liturgy, the summa de officiis ecclesiasticis, has been generated from a detailed transcription of the principal manuscript witness, includes variant readings from a selection of other witnesses, and is enriched with critical editorial markup. published in , it is the first of its kind in medieval latin philology, as it follows a pluralistic textual paradigm and provides a critical text with a threefold apparatus, links to all facsimiles on the page level, extensive descriptions of the manuscripts, a detailed transcript of the principal manuscript witness, a reading text of an almost-contemporary revision of the text, an introduction, indices, and so forth. applying a digital methodology and addressing a wide range of notions of text, this edition might be labeled a born-digital, multi-dimensional, or pluralistic scholarly edition (fig. ).

magistri guillelmi autissiodorensis summa de officiis ecclesiasticis, ed. franz fischer (cologne, – ); online: http://guillelmus.uni-koeln.de; franz fischer, “the pluralistic approach—the first scholarly edition of william of auxerre’s treatise on liturgy,” jahrbuch für computerphilologie ( ): – ; online: http://computerphilologie.tu-darmstadt.de/jg /fischer.html; fischer, “all texts are equal.”

fig. . a critical text version of testimonia on asinius quadratus (preview of the kfhist beta version).

fourth study: carolingian capitularies
the capitularia project provides transcriptions of important law texts from the carolingian era: collections of decrees of frankish rulers regulating political, military, ecclesiastical, social, economic, and cultural matters, usually drawn up and issued during the course of royal assemblies and distributed by so-called missi, counts and bishops. previous critical editions published in print all failed to reflect adequately the diversity and complexity of the textual transmission. in a new editorial approach, all manuscript witnesses are being transcribed with a focus on structural information, such as rubrics, initials, and the order of chapters and capitularies. this serves the twofold aim of respecting the individual and regional characteristics of each of these historical documents and enabling a semiautomated comparison for detecting and highlighting differences and commonalities among the witnesses (fig. ).
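the semiautomated comparison of witnesses described for the capitularia project can be illustrated with a minimal sketch. it does not use collatex and does not reproduce the project's workflow: python's standard `difflib` module serves here as a simple stand-in for a collation tool, and the two witness texts, the sigla p and v, and the function name `collate_pair` are invented for illustration. the code aligns two transcriptions word by word and reports where they agree and where they differ.

```python
from difflib import SequenceMatcher


def collate_pair(base, other):
    """Align two witnesses token by token.

    Returns a list of (status, base_tokens, other_tokens) rows, where
    status is one of difflib's opcodes: 'equal', 'replace', 'delete',
    or 'insert'.
    """
    a, b = base.split(), other.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()]


# Hypothetical witnesses: the text and the sigla P and V are invented.
witness_p = "de ecclesiis et monasteriis restaurandis"
witness_v = "de ecclesiis vel monasteriis bene restaurandis"

for status, p_tokens, v_tokens in collate_pair(witness_p, witness_v):
    print(f"{status:8} P: {' '.join(p_tokens):24} V: {' '.join(v_tokens)}")
```

a dedicated collation tool can align more than two witnesses at once; the sketch only shows that consistently structured transcriptions make the detection of differences and commonalities a mechanical step, leaving their critical assessment to the editor.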
these automated collations, made using the collation tool collatex, constitute the basis for a critical assessment of the textual tradition and for establishing a critical text version to be published both in print and online as part of the monumenta germaniae historica (mgh and dmgh, respectively). aiming to document both the full textual transmission and a critical text and following a twofold publication strategy, this edition might be labeled a multiwitness hybrid edition (fig. ).

collatex, software for collating textual sources: http://collatex.net/.
see gioele barabucci and franz fischer, “the formalization of textual criticism: bridging the gap between automated collation and edited critical texts,” in advances in digital scholarly editing, ed. peter boot et al. (leiden, forthcoming).

fig. . the first paragraph of saint patrick’s confessio, with interlinked entries of the threefold apparatus and links to manuscript facsimiles, previously relevant print editions, and translations.
fig. . the chapter on the third hour in william of auxerre’s summa de officiis ecclesiasticis, critical text with threefold apparatus and links to manuscript facsimiles and other text versions.

fifth study: monasterium.net
monasterium.net is a collaborative and virtual digital archive, presently providing access to facsimiles and descriptions of more than six hundred thousand medieval and early modern charters from more than one hundred and fifty archives.
the online platform allows for digital editing of the charters at all scholarly levels: in some instances, scans are provided, along with the most basic metadata, such as repository and shelf marks; in others, short descriptions and abstracts are included and, if available, retrodigitized print editions; whereas in others, veritable born-digital diplomatic editions are produced that include introductions or prefaces, diplomatic transcripts encoded according to the standard of the charters encoding initiative (cei), a diplomatic analysis, and bibliographies. since charters usually survive as single documents, there is no critical annotation in the form of critical apparatus entries. the nature of these charter editions varies and ranges from digital diplomatic editions in their original sense (that is, focusing on dating, proof of authenticity, and the analysis of the content structure of a charter), through digital documentary editions, focusing on external features of the documents, to data-enriched editions, with information on historical persons, places, events, or decoration (for example, in the art historical subcollection of illuminated charters) (fig. ).

according to the definition given in the vocabulaire international de la diplomatique, ed. maria milagros cárcel ortí, nd ed. (valéncia, ), ; online: http://www.cei.lmu.de/vid/vid#vid_ : “une édition diplomatique est la publication d’un document, après établissement critique de son texte compte tenu de la tradition de celui-ci et d’un examen critique de sa sincérité et de sa datation.” the term was established in the seventeenth century during the historical debate between the maurist scholar jean mabillon and the bollandist hagiographer daniel van papenbroeck: see paul bertrand, “du de re diplomatica au nouveau traité de diplomatique: la réception des textes fondateurs d’une discipline,” in dom jean mabillon, figure majeure de l’europe des lettres: actes des deux colloques du tricentenaire de la mort de dom mabillon, ed. jean leclant, andré vauchez, and daniel-odon hurel (paris, ), – . nowadays the term “diplomatic” is usually applied to very detailed transcriptions of any type of document: see lexicon of scholarly editing, ed. wout dillen et al., s.v. “transcription (diplomatic),” http://uahost.uantwerpen.be/lse/index.php/lexicon/diplomatic-transcription/.
see http://www.monasterium.net/mom/illuminierteurkunden/collection; http://www.monasterium.net/mom/glossar.

fig. . a collation table of various witnesses generated by the collation tool collatex, implemented into the capitularia website (internal).
fig. . online edition of the frankish capitularies, transcription of the parisian manuscript witness bibliothèque nationale de france, ms lat. .
fig. . facsimile and transcription of a medieval serbian charter on monasterium.net: bari, archivio di s. nicola periodo angioino l. ( august , skopje).

sixth study: digital averroes research environment (dare)
the digital averroes research environment (dare) collects and edits the works of the andalusian philosopher averroes (abū l-walīd muḥammad ibn aḥmad ibn rušd), born in cordoba in , died in marrakesh in . through the portal, images of as many textual witnesses as possible, that is, manuscripts, incunabula, and early printed editions, are provided online. at present, dare includes only a small number of edited texts, most of these being textual versions that have not yet been critically annotated. however, the portal is already a key resource for a long-term editorial project to create critical editions of the works of averroes that reflects and analyzes their extremely complex transmission back and forth through latin, greek, arabic, and hebrew, an enterprise that would have been considered impossible without digital methods and resources. the established critical-text versions will eventually be integrated into the dare platform in order to complement a digital resource that can be labeled a knowledge site (fig. ).

variety of editions versus homogeneity of a corpus
we have just presented six examples of critical approaches towards (mostly) latin texts in a digital editorial format. they show a great variety with respect to the content and the notion of what the text is and what the respective edition actually should do. some digital editions ( ) provide a critical text following the lachmannian paradigm, reconstructing some archetypal text version by following a strict methodology of recensio (transcription, collation, establishment of a stemma codicum), selectio, and emendatio. others ( ) abide by the leithandschrift principle and follow a principal manuscript witness. accurate transcriptions ( ) might focus on very different details and characteristics before being enriched with critical annotations.
nowadays most digital editions provide digital facsimiles of manuscripts and prints, all of which may vary in the quality of the digital scans and in the degree to which they are integrated into and interlinked with the critical text. some editions are multidimensional, providing various versions or layers of text, parallel texts, and translations. all digital editions are labeled according to the material and the editorial method applied: critical, diplomatic, semidiplomatic, documentary, multiwitness, archive edition, and so on. moreover, even editions with similar labels feature various differing functionalities and presentational modes, all of which are based on a large variety of encoding, since even within the de facto standard for text encoding, as provided by the guidelines of the text encoding initiative (tei), there are various ways of modeling textual variance. more generally speaking, digital scholarly editions all differ with respect to the application and degree of both textual criticism and digitality (that is, the degree to which they employ and integrate digital technologies).

for an overview of texts available, see http://dare.uni-koeln.de/?q=node/ .
peter shillingsburg, “how literary works exist: convenient scholarly editions,” digital humanities quarterly / ( ), par. ; thomas stäcker, “creating the knowledge site—elektronische editionen als aufgabe einer forschungsbibliothek,” bibliothek und wissenschaft ( ): – .
martin litchfield west, textual criticism and editorial technique applicable to greek and latin texts (stuttgart, ).
guidelines of the text encoding initiative (p ): http://tei-c.org/release/doc/tei-p -doc/en/html/index.html.

but if textual, or rather editorial, plurality seems to be one of the main characteristics of digital editions, how is a coherent digital corpus of scholarly editions to be constructed? how does such diversity fit into a corpus if the usefulness of a corpus is based largely on the homogeneity and representativeness of the texts that it includes? these texts are expected to be homogeneous in order to be detectable, comparable, and analyzable across the whole corpus. texts that are part of a corpus are supposed to be representative for a specific work, genre, or period. having a variety of versions or textual layers of one specific work clearly does not suit the idea of a corpus of texts. even if it were possible to integrate complex digital resources into one portal, maintaining a resource of such exponentially increased complexity would seem impracticable, given the amount of work and expertise needed and the pace of ongoing technological and methodological innovations.

fig. . averroes’s commentary on aristotle’s physics, translated into latin by michael scotus, and a manuscript witness from assisi (biblioteca communale, ms , fol. v).

four proposals to achieve a compromise
in the following four proposals we shall explore how the two conflicting concepts and practices of idiosyncratic digital critical editing on the one hand and creating a homogeneous textual corpus on the other can be reconciled despite the apparent contradictions.
first proposal: digital in a wide sense, critical in a narrow sense

the first proposal to resolve the conflict between variety of editions and homogeneity within a corpus is to create and provide editions that are both digital in a wider sense and scholarly in a narrow sense. this proposal can be divided into two strategic approaches: the first approach starts from the definition of “digital,” the second from the definition of “critical.”

1. digital in a wide sense

as part of a digital corpus, each individual scholarly edition does not necessarily need to be digital in a strict sense. what does “digital edition in a strict sense” mean? according to the “catalogue of criteria for reviewing scholarly digital editions” as issued by the institute for documentology and scholarly editing (ide), a scholarly edition is “an information resource which offers a critical representation of (normally) historical documents or texts. scholarly digital editions are not merely publications in digital form; rather, they are information systems which follow a methodology determined by a digital paradigm, just as traditional print editions follow a methodology determined by the paradigms of print culture. given this narrow understanding of sdes, many digital resources cannot be considered digital editions in this strict sense.” and in an even more apodictic manner, in his most recent article on the subject, sahle states what can be regarded as common sense among today’s digital humanities scholars:

• “a digitized edition is not a digital edition.”
• “a digital edition cannot be given in print without a significant loss of content and functionality.”
• “a digital edition is guided by a digital paradigm in its theory, method, and practice.”

given these definitions, the point here is exactly the opposite: individual critical editions as part of a corpus need not strictly follow a digital paradigm, which, although desirable, is not a requirement. as demonstrated above, textual plurality and the complexity of the editorial approach towards an edited work are main characteristics of a fully fledged digital scholarly edition. in contrast, the purpose of a corpus lies in its capacity to provide a large number of homogeneously edited texts, not only to ensure a high degree of usability but also to guarantee its feasibility and long-term maintainability. therefore in principle these editions can be digitized critical editions. on the level of the individual text as part of a corpus, content and functionalities do not have to significantly exceed those of the print edition, although even here a certain minimum of requirements should be met (see below). however, additional digital value does need to be realized on the level of the entire corpus. what additional digital value across the entire corpus can mean will be discussed under the fourth proposal below.

patrick sahle et al., “criteria for reviewing scholarly digital editions,” (version . ), http://www.i-d-e.de/publikationen/weitereschriften/criteria-version- - /.
sahle, “what is a scholarly digital edition?” according to tara andrews a scholarly digital edition is something “beyond a feature-rich electronic book”: “it is the practice of deep and/or large-scale text analysis, rather than that of textual criticism itself, which must drive the development of digital editions in all their potential.” see tara l. andrews, “the third way: philology and critical edition in the digital age,” variants ( ): – ; postprint online version: http://boris.unibe.ch/ /.

2. critical in a narrow sense: four manifestations of textual criticism

the other half of the first proposal needs to be clarified: create and provide editions that are scholarly in a narrow sense. the term “critical” (even though often used as a synonym for “scholarly”) qualifies the meaning of scholarly, but what precisely does critical mean?

peter robinson, with his notorious six essential aspects of electronic digital editions, refers with the first three criteria to an essential philological methodology and scholarly rigor. according to robinson, a digital critical edition is anchored in a historical analysis of the materials; presents hypotheses about creation and change; and supplies a record and classification of difference over time, in many dimensions and in appropriate detail. these points are widely accepted. this definition and others brought forward by renowned scholars are supported by the wide range of digital scholarly editions currently seen.
be this as it may, and whatever the material, methodology, or requirements of a community, in order to make critical editions fit into a digital corpus of homogeneous texts representing works of latin literature, the various aspects of textual criticism can be broken down into four basic manifestations of criticism: (1) critical annotation, (2) markup, (3) metadata, and (4) documentation. these essential features of a critical text must be accommodated by any model of a digital corpus, a model defining indispensable requisites and requirements for a text to be incorporated into the corpus.

(1) the first manifestation of textual criticism is critical annotation to the text, more specifically, the presence of an apparatus criticus or other means of recording textual variants and all justifications for the state of the edited text. in addition, critical annotation might include an apparatus fontium, giving references to sources and paratexts; an apparatus biblicus, as a typical feature of patristic or medieval texts; a commentary with explanatory notes or historical and philological notes; and discursive notes with present-day relevance, such as references to gender issues and sociopolitical subject matter.

(2) the second manifestation comprises the potentially very deep and extensive markup of the text: structural markup (including identifiers); markup of internal and external references or named entities; linguistic and semantic markup, such as part-of-speech tagging; lemmatization or syntactical markup; markup of typical features of an apparatus entry, such as sigla, references, or quotes and readings. it might also include markup of the types of apparatus entries according to categories such as textual, intertextual, exegetical, rhetorical, and metrical.

(3) the third manifestation of textual criticism comprises all kinds of metadata and structured information on the author, the work, and the edition itself, that is, bibliographical information concerning the work itself, including its genre, dates, appropriate keywords, and so forth; as well as imaging parameters, responsibilities, licenses, and so on in regard to the edition; and contextual information in the form of a “critical bibliography.” ideally, all this information is given in a standardized format (such as tei, mets, dublin core, or some other bibliographic standard) with references to authority files (such as gnd, viaf, getty thesaurus) for named entities and using taxonomies and ontologies (skos, cidoc crm) that are relevant for the respective field of research.

(4) the fourth manifestation comprises information traditionally provided in a philological introduction, paratexts, and other kinds of accompanying texts and materials, which can all be subsumed under the term “documentation.” ideally, the material basis of the edited text is documented by digital facsimiles of manuscript witnesses and relevant printed editions. these surrogates should be the result of what has been labeled “critical digitization” in the sense that information is provided about the decisions involved in setting up the parameters for digitizing. the manuscripts should then be described thoroughly according to scholarly practice. where transcriptions have been created, these should be included as well as the source code of all manuscript descriptions, transcripts, and the critical text itself. moreover, it is essential to present a historical analysis, hypotheses about the creation of the text, and a record and classification of differences over time. most importantly, however, the editorial principles need to be made explicit.

the fourth criterion mentions the presentation of an “edited” text (only) as an option; the fifth and sixth criteria refer to digital usability: see peter robinson, “what is an electronic critical edition?,” variants ( ): – .
daniel apollon and claire bélisle, “the digital fate of the critical apparatus,” in digital critical editions, ed. daniel apollon, claire bélisle, and philippe régnier (urbana, ), – , here esp. ; elena pierazzo, digital scholarly editing: theories, models and method (farnham, surrey, ); patrick sahle, digitale editionsformen: zum umgang mit der Überlieferung unter den bedingungen des medienwandels, vols. (norderstedt, ), here esp. : – .
for a discussion on types and categories (and respective taxonomies), see michael hendry’s blog post on “categories of adversaria” at http://curculio.org/?p= ( march ); paola italia, fabio vitali, and angelo di iorio, “variants and versioning between textual bibliography and computer science,” in aiucd ‘ - proceedings of the third aiucd annual conference on humanities and their methods in the digital ecosystem, ed. francesca tomasi, roberto rosselli del turco, and anna maria tammaro (new york, ); doi: . / . ; see also tei-l thread on “types of edits” started by christof schöch ( may ).
e.g., variants (substantive, orthographic), conjectures, deletions, obelizations, transpositions, lacunae, (marginal or interlinear) additions, punctuation, speaker attribution, structure (e.g., boundaries between books, chapters, paragraphs, poems, stanzas, verses, etc.).
e.g., sources, parallels, later usage, reception and nachleben (modern allusions and imitations).
e.g., figures of speech, tropes, style.
cf. “pede certo - metrica latina digitale,” software developed by the university of udine for the automatic analysis of latin verses: http://www.pedecerto.eu/.
a metadata model needs to take into account the various levels of possible entities like those represented in the functional requirements for bibliographic records (frbr) model, such as work, expression, manifestation, and item.
mats dahlström, “critical editing and critical digitization,” in text comparison and digital creativity: the production of presence and meaning in digital text scholarship, ed. e. thoutenhoofd, a. van der weel, and w. th. van peursen (amsterdam, ), – ; mats dahlström, “critical transmission,” in between humanities and the digital, ed. p. svensson and d. t. goldberg (cambridge, ma, ), – .
robinson, “what is an electronic critical edition?,” – .

again, the viability and success of a digital corpus of critical texts depend on finding an appropriate and functional overarching data model that is able to accommodate these forms of critical annotation and information.
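
as an illustration only, the four manifestations named above (critical annotation, markup, metadata, documentation) can be sketched as one small record type. the python sketch below is not the data model of any existing corpus: the class and field names, the sample verse, the sigla, and the category label are hypothetical, and the only external assumption is the tei encoding of an apparatus entry with app, lem, and rdg elements placed inline in the text.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample: one verse whose second word varies between witnesses.
SAMPLE_TEXT = (
    '<l xmlns="http://www.tei-c.org/ns/1.0" n="1">arma '
    '<app type="textual">'
    '<lem wit="#A #B">uirumque</lem>'
    '<rdg wit="#C">uirumqui</rdg>'
    '</app> cano</l>'
)


@dataclass
class Reading:
    text: str
    witnesses: list      # sigla such as "#A"
    is_lemma: bool       # True for the reading adopted in the edited text


@dataclass
class ApparatusEntry:
    category: str        # e.g. textual, intertextual, exegetical
    readings: list


@dataclass
class Edition:
    text_xml: str                                       # (2) marked-up critical text
    metadata: dict = field(default_factory=dict)        # (3) author, work, edition
    documentation: dict = field(default_factory=dict)   # (4) introduction, witnesses

    def apparatus(self):
        """(1) Collect every apparatus entry found in the marked-up text."""
        if not self.text_xml.strip():
            return []
        root = ET.fromstring(self.text_xml)
        lem_tag, rdg_tag = f"{{{TEI_NS}}}lem", f"{{{TEI_NS}}}rdg"
        entries = []
        for app in root.iter(f"{{{TEI_NS}}}app"):
            readings = [
                Reading(
                    text="".join(child.itertext()).strip(),
                    witnesses=child.get("wit", "").split(),
                    is_lemma=child.tag == lem_tag,
                )
                for child in app
                if child.tag in (lem_tag, rdg_tag)
            ]
            entries.append(ApparatusEntry(app.get("type", "unspecified"), readings))
        return entries

    def missing_components(self):
        """Name the manifestations for which this record holds no data."""
        present = {
            "critical annotation": bool(self.apparatus()),
            "markup": bool(self.text_xml.strip()),
            "metadata": bool(self.metadata),
            "documentation": bool(self.documentation),
        }
        return [name for name, ok in present.items() if not ok]


edition = Edition(
    text_xml=SAMPLE_TEXT,
    metadata={"author": "hypothetical author", "work": "hypothetical work"},
)
for entry in edition.apparatus():
    for reading in entry.readings:
        print(entry.category, reading.is_lemma, reading.text, reading.witnesses)
print(edition.missing_components())
```

in this sketch the apparatus is derived from the markup rather than stored separately, so text and annotation cannot drift apart; a corpus that accepts stand-off apparatus files would need a different arrangement.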
to arrive at such a data model, it may be useful to reduce the force of the term “critical” to a rather prosaic meaning and to define an absolute minimum of requirements for the incorporation of a critical text into a digital corpus. referring to the four manifestations of textual criticism described above, this minimum of requirements could be:

(ad 1) the critically constituted text bears all critical information (for example, in the traditional annotation format of an apparatus) required to justify the linguistic or philological form of the edited text.
(ad 2) the work structure is clearly defined: entities such as book, chapter, paragraph, and so on are marked up accordingly in order to fit in with a corpus-wide schema for addresses and the citation of the respective text entities.
(ad 3) metadata is provided on the author, work, and the edition itself.
(ad 4) the text has sufficient material documentation (manuscript descriptions and facsimiles) and a philological introduction specifying the editorial principles.

defining the texts that are to be included in the corpus as “digital in the wider sense” (that is, not necessarily following a digital paradigm) and as “critical in a narrow sense” (fulfilling the minimal requirements of critical textual scholarship) would allow for the inclusion of (a) printed critical editions created with a digitizing process that is not too demanding; (b) existing born-digital critical editions with a transformation or spin-off process that is not too complicated; and (c) new born-digital critical editions created within the editorial framework provided by the corpus portal (as it is currently planned for the digital latin library).

second proposal: works rather than documents

the second proposal to resolve the conflict between variety of editions and homogeneity within a corpus is to focus on works rather than documents. a text corpus is not an archive. digital editions tend to start from or grow into some sort of digital archive.
in order to provide texts that are to some extent homogeneous, the editorial features within a corpus should not focus on contingent and individual material aspects of the text or on paleographic or codicological details. instead of accumulating textual evidence and transcriptions of witnesses, they should focus on critical value, i.e. critical annotation, deep markup, and the establishment of some kind of representative text version with a canonical work structure. this does not mean that transcriptions, facsimiles, and the like should be excluded; they should be included in some way. it is just a matter of prioritizing when creating a digital corpus. individual scholarly editions will always have to define their own priorities and tend to emphasize particularities of the textual material and specificities of the individual research perspective. the challenge here for future corpora of critical texts is to establish a basic and interchangeable data format to which a required set of data components of complex editions as described above can be translated, transformed or downgraded.

the catalogs of existing digital scholarly editions prepared by patrick sahle, “a catalog of digital scholarly editions,” version . , snapshot ff, http://www.digitale-edition.de/; greta franzini, “a catalogue of digital editions,” https://github.com/gfranzini/digeds_cat (with a list of further catalogs at https://github.com/gfranzini/digeds_cat/wiki).
digital latin library (dll): http://digitallatin.org/.
patrick sahle, “digitales archiv und digitale edition: anmerkungen zur begriffsklärung,” in literatur und literaturwissenschaft auf dem weg zu den neuen medien, ed. michael stolz (zürich, ), – ; online: http://www.germanistik.ch/scripts/download.php?id=digitales_archiv_und_digitale_edition; kenneth price, “edition, project, database, archive, thematic research collection: what’s in a name?,” digital humanities quarterly / ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html; dirk van hulle, “editie en/of archief: moderne manuscripten in een digitale architectuur,” in verslagen en mededelingen van de koninklijke academie voor nederlandse taal- en letterkunde / ( ): – .

third proposal: leave to others what others do better

digital editions as part of a corpus cannot and should not be all inclusive. on the contrary: a characteristic of digital editions is the overcoming of the limitations of the publication itself through integration of or, here even more importantly, through linkage to external resources. the theory of digital scholarly editing envisions an all-encompassing model of highly complex, layered, rich information resources. individual digital editions, however, do not need to provide and maintain the full range of possible modules, such as high-resolution facsimiles, translations in various languages, all sorts of visualizations, additional contextual material, and user-friendly tools within one clearly delimited and self-contained publication.
all these features and information enriching the reading experience and supporting individual research can hardly be provided and maintained within a single corpus. rather, any additional feature that is not required according to the criteria of the corpus should be outsourced and either referred to via hyperlink or, if possible, embedded from external resources. this is especially reasonable with regard to authority files; encyclopedic knowledge, as part of online reference works and compendia; paratexts, as part of other digital corpora; and facsimiles. as for the latter, ideally cultural heritage institutions, such as archives and libraries, take care of their own material and provide descriptions, high quality reproductions, and tools to engage with material in a standardized way so that it can be embedded and used by users and editors alike.

the embedding of external resources can be realized in two different ways, both of which have advantages and disadvantages. the easiest method from a technical point of view is simply to include a link out of the edition that targets the external resource. an example of the application of this method is the digital edition of the st. gall priscian, which links to manuscript images at the codices electronici sangallenses (cesg) virtual library (figs. and ).

this according to patrick sahle is one aspect of overcoming the limitations of print editions (“die entgrenzung der publikation”) both quantitatively (with no restrictions on space) and qualitatively (by inclusion of texts, images, audio, video): see “zwischen mediengebundenheit und transmedialisierung: anmerkungen zum verhältnis von edition und medien,” in editio ( ): – ; doi: . /edit . . .
cf. joris van zundert and peter boot, “the digital edition . and the digital library: services, not resources,” in bibliothek und wissenschaft ( ): – ; online: http://peterboot.nl/pub/vanzundert-boot-services-not-resources- .pdf.
st. gall priscian glosses, ed. pádraic moran, http://www.stgallpriscian.ie/; codices electronici sangallenses (cesg) - virtual library, http://www.cesg.unifr.ch/en/index.htm.

fig. . st. gallen, stiftsbibliothek, ms cod. sang. , fol. r. the digital edition of st. gall priscian glosses (on the left), with links to the manuscript images and descriptions at the codices electronici sangallenses (cesg) virtual library (on the right).

fig. . st. gallen, stiftsbibliothek, ms cod. sang. , fol. r. the digital edition of st. gall priscian glosses (on the left), with links to the manuscript images and descriptions at the codices electronici sangallenses (cesg) virtual library (on the right).

the integration of external information into the edition itself might be more user-friendly. images or texts can be either included from the external server or, if restrictions relating to technical infrastructure or copyrights do not prevent it, mirrored onto a dedicated server. a technically advanced publishing framework has been developed by jeffrey c. witt: the lombardpress web application is designed to understand and consume common interfaces (so-called iiif application programming interfaces) as adopted by a growing number of leading research libraries with manuscript collections in order to allow for the possibility of querying images of manuscript folios directly from library servers across the world (fig. ).

see http://lombardpress.org/web.
international image interoperability framework (iiif): see http://iiif.io.

fourth proposal: create additional value across the corpus

as pointed out under the first proposal, critical editions as part of a corpus need not be “truly digital” in the sense that they follow a digital paradigm and that they are created applying digital methods. rather, the fourth proposal advocates the creation of additional value across the whole range of texts through the features and the technical framework of a “truly digital” corpus, based on an elementary data model for metadata, text, annotation, and paratexts.

as soon as a suitable and robust data model has been found to accommodate the various forms of textual criticism, additional value can be generated by enabling a full exploration of the data captured across the entire corpus. this additional value cannot be provided in print editions, and it is characteristic of both individual digital editions and digital text corpora in general.

a set of generic and corpus-wide tools, features, and functionalities should address researchers’ needs and expectations.

(1) first, the search function is of the highest importance for any digital corpus. it should not only provide a full-text search over all textual material included in the corpus (edited texts, apparatus, introductions, etc.), but also advanced search options, such as searching by logical operators and connectors and allowing for truncation and wildcards.
needless to say, a fuzzy-search function is indispensable for finding words and strings with orthographic variance within one and the same text as well as across various texts. ideally, each and every word of the corpus is lemmatized to allow queries to match different forms of words, which may include even synonyms. in addition to this, metadata allows for faceted searching of all kinds. it could be used to search by geographical regions or places of origin or provenance; by specific centuries, decades, or years of creation; by genres (like the thesaurus linguae graecae categories of historici, poetae, philosophi, theologi, oratores, etc.), or by a specific meter. based on the markup, searches could be limited to a certain type or content of apparatus entries (see above).

lombardpress-web builds on the “scholastic commentaries and texts archive” (scta: see http://scta.info/). the scta database first points to the id of a respective codex surface. if the holding library’s image repository is iiif compliant, the scta database will link out further to the id of the iiif canvas and from there to the url of the image itself. for a draft proposal of this scta data model see http://lombardpress.org/ / / /surfaces-canvases-and-zones/; about lombardpress in general, see http://lombardpress.org/about/.
in the area of linguistic corpora there have been attempts to address the issue of reconciling different formats. see, for example, salt and pepper at http://corpus-tools.org/. salt and pepper are not just methodological recommendations; they are functioning, extensible open source tools that support the integration of linguistic corpora created according to different principles into a larger framework.
cf. henny and neuber, “criteria for reviewing digital text collections.”
there should also be a set of tools, features, and functionalities for the wider public in order to extend the usability of critical editions beyond a scholarly audience. this, however, lies beyond the scope of this article.
for an automated form analysis and translation, most advanced digital corpora of latin texts, such as perseus, corpus corporum, and computational historical semantics (comphistsem), use specific treetagger software as developed and maintained by the perseus project: the ancient greek and latin dependency treebank (agldt, http://perseusdl.github.io/treebank_data/). for an overview of current tools for lemmatization and morphological analysis, see the digital classicist wiki: https://wiki.digitalclassicist.org/morphological_parsing_or_lemmatising_greek_and_latin (last modified on december ).
cf. above, n. , on “pede certo.”

fig. . the scholastic commentaries and texts archive (scta): first distinction of book of the sentences commentary by william of rothwell, edited by jeffrey c. witt and published through lombardpress, here in a diplomatic transcription of a manuscript from aarau (aargauer kantonsbibliothek, ms wettf ), displaying at the bottom the same paragraph in a manuscript from copenhagen (danish royal library, ms gks ).
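
the search options listed under the first functionality (wildcards, fuzzy matching of orthographic variants, lemma-based queries, and metadata facets) can be illustrated with a minimal python sketch. everything in it is an assumption made for illustration: the five tokens, the work identifiers, the facet values, and the similarity threshold are invented, and a real corpus would use a proper search index and a lemmatizer rather than in-memory lists.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Token:
    form: str    # spelling as it stands in the edited text
    lemma: str   # dictionary headword assigned by a lemmatizer
    work: str    # identifier of the work the token belongs to


# Hypothetical per-work metadata used as search facets.
FACETS = {
    "w1": {"genre": "poetae", "century": 1},
    "w2": {"genre": "historici", "century": 12},
}

# Hypothetical corpus, including medieval spellings of mihi and nihil.
TOKENS = [
    Token("michi", "ego", "w2"),
    Token("mihi", "ego", "w1"),
    Token("amabat", "amo", "w1"),
    Token("amor", "amor", "w1"),
    Token("nichil", "nihil", "w2"),
]


def in_facets(token, **wanted):
    """True if the token's work matches every requested facet value."""
    meta = FACETS[token.work]
    return all(meta.get(key) == value for key, value in wanted.items())


def wildcard(pattern, **facets):
    """Truncation and wildcards: '*' matches any run, '?' one character."""
    return [t for t in TOKENS
            if fnmatchcase(t.form, pattern) and in_facets(t, **facets)]


def fuzzy(query, threshold=0.8, **facets):
    """Orthographic variants: keep forms whose similarity ratio is high enough."""
    return [t for t in TOKENS
            if SequenceMatcher(None, query, t.form).ratio() >= threshold
            and in_facets(t, **facets)]


def by_lemma(lemma, **facets):
    """Match every inflected or variant form that shares a headword."""
    return [t for t in TOKENS if t.lemma == lemma and in_facets(t, **facets)]


print([t.form for t in wildcard("am*", genre="poetae")])
print([t.form for t in fuzzy("mihi")])
print([t.form for t in by_lemma("ego", century=12)])
```

keeping the facet filter separate from the three matching strategies means that any of them can be restricted by genre, century, or another metadata field without duplicating logic; restricting a search to a type of apparatus entry would be one more facet of the same kind.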
(2) another essential feature of a text corpus is an elaborate index function. indices should be generated and interlinked both work-wide and corpus-wide from the metadata (as regards authors, works, genres, periods, keywords, etc.) and from the markup (depending on the encoding schema with respect to named entities, that is, marked-up persons, places, dates, events, etc.), and where the texts are lemmatized, word indices could be provided. lists of manuscripts should be created according to the structured information given in the documentation.

(3) the third fundamental functionality of a digital corpus is the provision of hyperlinks generated from explicit references, pointers, and identifiers in the markup and metadata. internal links are to be realized as text-wide (especially connecting text and critical annotations), as work-wide (connecting text, manuscript witnesses, translations, and accompanying material) and as corpus-wide (connecting intertextual references, dictionary entries, registers, and indices). external links might point to digital archives (providing manuscript facsimiles, catalog entries and descriptions, etc.), digital corpora (providing relevant texts and contextual material), digital encyclopedias and dictionaries, and to any outsourced or externalized material (forums, audio, video, blogs, etc.; see above).

(4) the aptitude of a digital corpus for scholarly use then completely depends on addressability and citability of all its parts and components, namely of the critical text (according to books, chapters, paragraphs, stanzas, verses, lines, words, and the respective critical annotations) and of the documentation (manuscript descriptions, transcripts, and introduction) as well as on the addressability and citability of versions, in case changes have been carried out or a progressive publication mode has been established.
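
one convention for such addresses that the article itself goes on to name is the cts urn. as a hedged illustration, the python sketch below splits a urn of that kind into namespace, work hierarchy, version, and passage, assuming the colon- and dot-delimited syntax described in the cite architecture documentation cited in the notes. the class and function names are hypothetical, the sample urn and its version label are invented for illustration, and subreferences within a passage are not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CtsUrn:
    namespace: str            # collection-level namespace
    textgroup: str            # usually an author or text group
    work: Optional[str]       # the work within the text group
    version: Optional[str]    # a specific edition or translation
    passage: Optional[str]    # citation such as "1.1" or a range "1.1-1.7"

    def passage_bounds(self):
        """Return (start, end) as tuples of citation levels, or None."""
        if self.passage is None:
            return None
        start, _, end = self.passage.partition("-")
        return tuple(start.split(".")), tuple((end or start).split("."))


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split 'urn:cts:NAMESPACE:WORK[:PASSAGE]' into its components."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a cts urn: {urn!r}")
    namespace, work_component = parts[2], parts[3]
    # A trailing colon with nothing after it means "no passage".
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    levels = work_component.split(".")
    if not namespace or not levels[0] or len(levels) > 4:
        raise ValueError(f"malformed work component in {urn!r}")
    return CtsUrn(
        namespace=namespace,
        textgroup=levels[0],
        work=levels[1] if len(levels) > 1 else None,
        version=levels[2] if len(levels) > 2 else None,
        passage=passage,
    )


# Invented example: the version label "example-lat1" is hypothetical.
urn = parse_cts_urn("urn:cts:latinLit:phi0690.phi003.example-lat1:1.1-1.7")
print(urn.textgroup, urn.work, urn.version, urn.passage_bounds())
```

because the version is part of the address, a citation made against one state of a progressively published text keeps pointing at that state, while a urn without a version component addresses the work in the abstract.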
if the editorial framework allows for progressive publications, updates, additions, corrections, and so on (which in open software development and in digital humanities research is generally recommended), this would have an enormous impact on all areas of the corpus. keeping track of versions is an extremely challenging task, especially if the corpus is supposed to provide canonical text versions that do not change. be that as it may, the data model and publication framework need to make sure that every part, layer, and format of the critical edition is clearly addressable, according to a urn-naming convention as specified, for instance, by the canonical text services (cts) and used by the perseus project and the homer multitext project; or by something similar to the documents, entities, and texts (det) system as recently presented by peter robinson in his widely discussed draft article on academia.edu.

the “release early, release often” policy was originally applied in the linux development community. following the publication of the essay “the cathedral and the bazaar: musings on linux and open source by an accidental revolutionary,” by eric s. raymond (beijing and cambridge, ma, ); online: http://www.catb.org/~esr/writings/cathedral-bazaar/, this policy became increasingly popular among digital humanities scholars and has been adapted to publication strategies not only for tool development but also for the creation of digital scholarly editions (“progressive editions”) in order to create a tight feedback loop between the editor and expert scholars in their respective fields of research: see gunther vashold, “progressive editionen als multidimensionale informationsräume,” in digital diplomatics: the computer as a tool for the diplomatist?, ed. antonella ambrosio, sébastien barret, and georg vogeler (böhlau, ), – ; andrew dunning, “rethinking the publication of premodern sources: petrus plaoul on the sentences,” ride (a review journal for digital editions and resources, published by the ide [institut für dokumentologie und editorik]) ( ); doi: . /ride.a. . , esp. pars. – .
possible negative effects of updating editions have been described by gabriel bodard, “the inscriptions of aphrodisias as electronic publication: a user’s perspective and a proposed paradigm,” digital medievalist ( ), doi: . /dm. , pars. – .

(5) no matter how user-friendly the interface of an edition or corpus may be, user scenarios and research questions cannot be anticipated always and everywhere. for this reason, it is imperative to provide as much raw data and material as possible via interfaces (apis) and downloads in order to enable scholars to access and collect the data directly. the editorial framework should allow for an import of various formats (such as tei/xml, plain text, docx, pdf, tiff, and jpg) specified by the editorial guidelines. ingested text files would be converted into corpus-specific xml, ideally customized tei, in order to be stored and provided in the same format as the files created within the framework directly.

(6) in connection with downloads and apis there is the question of copyright and licenses.
digital humanities scholars and open-knowledge activists commonly agree today that a creative commons attribution sharealike (cc by-sa) license is the best way to make sure the editor’s work is appropriately credited and to en- sure that the data is openly accessible and remains open data. conclusion creating a digital corpus of critical editions is a complex task. it involves a wide range of strategic decisions to harmonize the heterogeneity of digital scholarly edi- tions with the core feature of a corpus residing mainly in the homogeneity of the way the texts are prepared and presented. several suggestions have been proposed to convey a maximum of textual criticism with a minimum of formal requirements in order to provide a suitable data model, a practical editing environment, and a maintainable publishing framework that is attractive to both critical editors and scholarly users. a technical and institutional framework for integrating and explor- ing critical editions on a large scale is a great desideratum. it also seems to be a pos- sibility worth the effort to attain. for canonical text services (cts), see the information at sourceforge: http://cts .sourceforge.net/; and, especially on cts urns, “the cite architecture technology‐independent, machine-actionable ci- tation of scholarly resources”: http://cite-architecture.github.io/ctsurn/. the article is soon to be published in digital humanities quarterly: see peter robinson, “some prin- ciples for the making of collaborative scholarly editions in digital form”; a draft is on academia.edu at https://www.academia.edu/ /some_principles_for_the_making_of_collaborative_scholarly _editions_in_digital_form; see here esp. – (with n. ). 
note: material published under a cc by-sa license can be copied and redistributed in any format and adapted for any purpose, even commercially, as long as the original creator is appropriately credited and the adapted material is distributed under the same license as the original; see https://creativecommons.org/licenses/by-sa/ . /.

franz fischer, university of cologne (franz.fischer@uni-koeln.de)

speculum /s (october )

introduction: digital humanities as dissonant

research

james o’sullivan
university college cork, ie
james.osullivan@ucc.ie

how to cite: o’sullivan, james. . “introduction: digital humanities as dissonant.” digital studies/le champ numérique ( ): , pp. – , doi: https://doi.org/ . /dscn.

published: january

peer review: this is a peer-reviewed article in digital studies/le champ numérique, a journal published by the open library of humanities.

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

open access: digital studies/le champ numérique is a peer-reviewed open access journal.

digital preservation: the open library of humanities and all its journals are digitally preserved in the clockss scholarly archive service.

the digital humanities summer institute gives students and scholars a chance to broaden their knowledge of the digital humanities within a feasible timeframe. the dhsi colloquium was founded by diane jakacki and cara leitch to act as a means of supporting graduates who wanted to be a part of such a gathering. the colloquium has grown in recent years, to the point where it is now seen as an important part of the field’s conference calendar for emerging and established scholars alike, but it remains a non-threatening space in which students, scholars, and practitioners can share their ideas. this issue is testament to that diversity, as well as the strength of the research being presented at the colloquium. it includes contributions by scott b. weingart and nickoal eichmann-kalwara, mary borgo, william b. kurtz, and john barber. weingart and eichmann-kalwara’s “what’s under the big tent?: a study of adho conference abstracts” portrays the discipline as one which is dominated by specific groups and practices. using the victorian women writers project as a case-study, mary borgo treats models for the sustainable growth of tei-based digital resources. william b. kurtz details his experiences working on a digital initiative, in this instance, founders online: early access, and engages with the need for such projects to hold broader public appeal. john barber’s “radio nouspace: sound, radio, digital humanities” describes the curation of sound within the context of radio, and how such activity connects to creative digital scholarship. together, these articles represent the purpose of facilitating a community comprised of divergent interests and perspectives, a community which can often be positively dissonant.
keywords: dhsi; digital humanities summer institute; colloquium; colloque

le digital humanities summer institute (dhsi) offre une chance aux étudiants et érudits d’étoffer leurs connaissances en humanités numériques pendant un délai réalisable. diane jakacki et cara leitch ont établi le premier colloque du dhsi pour soutenir des diplômés qui voulaient participer à un tel rassemblement. ces dernières années, le colloque s’est développé jusqu’au point d’être considéré maintenant comme une conférence importante sur le calendrier non seulement pour les érudits émergeants mais aussi pour les érudits établis dans le domaine. le colloque continue cependant à être un espace non menaçant où les étudiants, les érudits et les professionnels peuvent échanger leurs idées. ce numéro est un témoignage de cette diversité et de la qualité de la recherche présentée au colloque. le numéro inclut l’article « what’s under the big tent?: a study of adho conference abstracts » par scott b. weingart et nickoal eichmann-kalwara, ce qui présente les humanités numériques comme une discipline dominée par des groupes et pratiques spécifiques. en se servant du victorian women writers project comme étude de cas, mary borgo traite des maquettes pour la croissance durable des ressources numériques basées sur la tei. william b. kurtz détaille les expériences qu’il a acquises en travaillant sur l’initiative numérique founders online: early access ainsi que l’importance que de tels projets constituent un facteur attractif pour un plus large public. dans le texte de john barber, « radio nouspace: sound, radio, digital humanities », il s’agit du traitement de sons radiophoniques et du lien entre cette activité et l’érudition numérique créative. tous ces articles correspondent au but de faciliter une communauté composée des intérêts et perspectives divergents qui peut souvent être véritablement dissonante.
mots-clés: digital humanities; dhsi special issue; digital humanities summer

three years ago, diane jakacki passed control of the university of victoria’s dhsi colloquium to mary galvin and me. our task was to continue to develop what diane, alongside cara leitch, had started in . initially, the colloquium was intended as a means of giving graduates an opportunity to present their research to the burgeoning community of digital humanities scholars. it was an opportunity for students to discuss their research with a large, international, and interdisciplinary audience, and furthermore, it enabled them to take advantage of institutional mechanisms designed to support participation at conferences.

note: for more on the colloquium, see the event’s dedicated website, http://dhsicolloquium.org.

at the present phase in the development of the digital humanities, there is a marked emphasis on the acquisition of technical skills—emerging and established scholars alike are under intense pressure to develop their expertise in this domain. here is not the most appropriate venue to discuss the positive and negative consequences of this reality, but it is the reality, one which is largely compelled by the demands of employers, funders, and the broader socio-cultural climates in which our institutes of education reside. community-driven learning opportunities like the digital humanities summer institute are vital in such a context, helping us to learn, and further build our community, in a fashion that is suited to the hyper-demands of present-day academia.

truly wonderful is the scholar who can specialise in medieval studies while becoming equally adept in french, python, statistics, and 3d modelling—perhaps i speak for myself, but this isn’t most of us. mastery, of the true kind, comes from a lifetime of repetition, of focusing on that one little thing and questioning it and yourself for decades on end.
hiring committees, promotion boards—they often expect the former, the academic swiss army knife capable of achieving excellence in disciplinary discord. through its broad range of foundational and intensive programs, dhsi gives students and scholars a chance to broaden their knowledge within a feasible timeframe. dhsi does not make masters, but it does allow the curious to recognise the ways in which they might re-imagine their intellectual practice. mastery can always be pursued in the aftermath of victoria, but we should also be content to progress with a valuable measure of fluency—one doesn’t need to be an adept programmer to interact with computer scientists, a certain level of proficiency is sufficient to enable the conversations that make meaning happen. this fluency, and the vibrant community that emerges out of its exchange, is what dhsi offers—the colloquium was invented as a means of supporting graduates who wanted to be a part of such a gathering.

note: i am of course referencing last year’s opening ceremony, wherein instructors are tasked with describing their courses. in keeping with tradition, offerings are outlined through something of a pun-off.

in , the colloquium’s leadership agreed that there was sufficient demand to broaden the scope of the event beyond graduate submissions. concurrently, dhsi continued to attract an increasing number of students, resulting in significant growth for the colloquium and its audience—it is not unusual for participants to find themselves addressing an auditorium housing several hundred of their peers. this growth has continued in recent years, and as the colloquium remains an addendum to the course-based pedagogical mission of dhsi, a measure of invention has been required to satisfy the increased volume of submission. in addition to more traditional presentations—though the current cap stands at minutes—submissions are now welcome across a number of high-impact formats, such as lightning talks.
in , mary galvin initiated the colloquium’s first poster session, which has become increasingly popular amongst participants. at dhsi , we were proud to host a joint session with the concurrent electronic literature organization conference and festival, while at dhsi , posters and demonstrations were incorporated from the society for the history of authorship, reading and publishing’s annual conference. developing the colloquium is about continuing to respond to the needs of the community, finding ways to assist scholars and practitioners at various junctures in their careers to disseminate their research, ideas, and projects. a book of abstracts has been circulated since , while a select number of presentations from dhsi were transformed into the colloquium’s first special issue, published in digital humanities quarterly. at the forthcoming gathering, our hope is to incorporate more audio-visual approaches to the capture of contributions. such has been the growth of the colloquium that last year saw a number of registrations from scholars not participating in courses. there was also a need to appoint the first program assistant, lindsey seatter, who has since succeeded mary galvin as co-chair. mary committed much of her time to the development of this event, and, as with many of our field’s instigators, our community is all the better for her efforts. despite its growth, the ethos of the colloquium remains consistent: it is a non-threatening space in which students, scholars, and practitioners can share their ideas. to this end, we operate a peer-review policy wherein all reviewers are instructed to offer collegial feedback—constructive criticism is a requirement, not a recommendation. unlike some other conferences, we have the luxury of accepting submissions if they meet a minimum threshold in terms of scholarly value. 
those submissions that are considered to have fallen short of this standard are finessed through reviewer feedback so that they improve to a point where they are ready to be presented. i say this is a luxury because all we have to do as organisers and reviewers is to improve and accept submissions—accommodating the rising number of presentations is a task that falls to daniel sondheim, assistant director of the electronic textual cultures lab at the university of victoria, and ray siemens, director of dhsi. dan, ray, and the university of victoria are yet to deny any of the colloquium’s scheduling requirements, and the product of that facilitation is a diverse and inclusive final program.

note: o’sullivan, james, mary galvin, and diane jakacki. . dhsi colloquium special issue, in digital humanities quarterly . . web. http://www.digitalhumanities.org/dhq/vol/ / /index.html

this issue is testament to that diversity, as well as the strength of the research being presented at the colloquium. while there are only four papers, they each represent a significant contribution to the field, spanning a range of subjects that includes radio, metadata standards, victorian women writers, and macro-level explorations of the wider digital humanities. one of the peculiarities of our realm’s interdisciplinary nature is that community gatherings draw a seemingly discordant group of individuals—is there value in conferences and publications comprised of historians, linguists, programmers, archivists, artists, and statisticians? is the dh mix simply too broad to have meaning? i was disappointed to see literary and linguistic computing become digital scholarship in the humanities for this very reason—i liked having a journal that was entirely focused on my particular interests, and wasn’t overly enthused at the prospect of a publication that would meld an array of research on all kinds of everything.
but, if the digital humanities are truly meant to be disruptive, then disciplinarity—which has a great many merits—should not be isolated from this process of disruption. in , we stopped clustering colloquium sessions into themes—the argument mary advanced was that themes divided audiences, and as we aren’t forced to schedule parallel sessions, we should follow in the footsteps of the discipline’s pioneers and use the opportunity to encourage dissonance. dissonance is at the very heart of the digital humanities, and we should embrace it, because dissonance is what gave us computational approaches to literary criticism, it is what compelled us to try and think beyond the codex, and most importantly, it is what shows us the failings in our techniques and approaches to scholarship. the colloquium, and this special issue, like other journals and gatherings in this field, seeks to embrace dissonance as a valuable means of producing knowledge through the exchange of ideas and expertise that seemingly lack harmony, while simultaneously maintaining the utmost respect for the principles of differing disciplines. such collaborative principles are what dhsi is founded on, and its colloquium is merely an opportunity to encourage curiosity, and breed inter- and transdisciplinary creativity. in this respect, it is perhaps fitting that this issue includes scott b. weingart’s and nickoal eichmann-kalwara’s “what’s under the big tent?: a study of adho conference abstracts.” while one can believe in dissonance, diversity, and interdisciplinarity, the reality does not always reflect the mantra. quantifying submissions to our field’s flagship digital humanities conference, weingart and eichmann-kalwara portray the discipline as one which is dominated by specific groups and practices. these findings, they argue, are at odds with anecdotal experiences, and they suggest a number of ways through which we might respond to such failings.
using the victorian women writers project as a case-study, mary borgo treats models for the sustainable growth of tei-based digital resources. discussing some of the most salient issues in the development of a digital edition—technical barriers, student involvement, ethics—this essay demonstrates the value of the colloquium through the dissemination of those lessons that have been learned by its author as a consequence of her involvement in this project. william b. kurtz also details his experiences working on a digital initiative, in this instance, founders online: early access. kurtz’s examination is more specific to large-scale digital humanities work, and engages with the need for such projects to hold broader public appeal. john barber’s “radio nouspace: sound, radio, digital humanities” is something of a departure from the other contributions, in that it describes the curation of sound within the context of radio, and how such activity connects to creative digital scholarship, reflecting on digital storytelling, sound-based narrative, and practice-based research. in isolation, each of these essays offers insight from which interested readers will benefit—together, they represent the purpose of facilitating a community comprised of divergent interests and perspectives.

note: i would like to thank a number of editors from digital studies/le champ numérique, particularly daniel o’donnell, paul esau, vanja spiric, and virgil grandfield for their tireless efforts in bringing this special issue to fruition.

competing interests

the author has no competing interests to declare.

submitted: november accepted: november published: january

preservation in practice: a survey of new york city digital humanities researchers

in the library with the lead pipe

may malina thiede

in brief

digital humanities (dh) describes the emerging practice of interpreting humanities content through computing methods to enhance data gathering, analysis, and visualization. due to factors including scale, complexity, and uniqueness, the products of dh research present unique challenges in the area of preservation. this study collected data with a survey and targeted interviews given to new york city metro area dh researchers intended to sketch a picture of the methods and philosophies that govern the preservation efforts of these researchers and their institutions.
due to their familiarity with evolving preservation principles and practices, librarians are poised to offer expertise in supporting the preservation efforts of digital humanists. the data and interviews described in this report help explore some of the current practices in this area of preservation, and suggest inroads for librarians as preservation experts. by malina thiede (with significant contributions from allison piazza, hannah silverman, and nik dragovic) introduction if you want a definition of digital humanities (dh), there are hundreds to choose from. in fact, jason heppler’s whatisdigitalhumanities.com alone offers rotating definitions of the digital humanities, pulled from participants from the day of dh between - . a few of these definitions are listed below: digital humanities is the application of computer technology to make intellectual inquiries in the humanities that either could not be made using traditional methods or are made significantly faster and easier with computer technology. it can include both using digital tools to make these inquiries or developing these tools for others to use. –matthew zimmerman dh is the study, exploration, and preservation of, as well as education about human cultures, events, languages, people, and material production in the past and present in a digital environment through the creation and use of dynamic tools to visualize and analyze data, share and annotate primary sources, discuss and publish findings, collaborate on research and teaching, for scholars, students, and the general public. –ashley sanders for the purposes of this article, digital humanities will be defined as an emerging, cross-disciplinary field in academic research that combines traditional humanities content with technology focused methods of display and interpretation. most dh projects are collaborative in nature with researchers from a variety of disciplines working together to bring these complex works to fruition. 
dh projects can range from fairly traditional research papers enhanced with computing techniques, such as text mining, to large scale digital archives of content that include specialized software and functionality. due to the range of complexity in this field and the challenges of maintaining certain types of digital content, long-term preservation of dh projects has become a major concern of scholars, institutions, and libraries in recent years. while in the sciences, large scale collaborative projects are the norm and can expect to be well funded, dh projects are comparatively lacking in established channels for financial and institutional support over the long term, which can add another layer of difficulty for researchers. as librarians at academic institutions take on responsibility for preserving digital materials, they certainly have a role in ensuring that these dh projects are maintained and not lost. for the purposes of this paper, a digital humanities project will be broadly defined as cross-disciplinary collaboration that manifests itself online (i.e. via a website) as both scholarly research and pedagogical resource using digital method(s). methods can include, but are not limited to, digital mapping, data mining, text analysis, visualization, network analysis, and modeling. literature review the library of congress’s (n.d.) catchall definition of digital preservation is “the active management of digital content over time to ensure ongoing access.” hedstrom ( ) offers a more specific definition of digital preservation as “the planning, resource allocation, and application of preservation methods and technologies necessary to ensure that digital information of continuing value remains accessible and usable.” digital preservation is a complex undertaking under the most favorable conditions, requiring administrative support, funding, personnel, and often specialized software and technology expertise. 
kretzschmar and potter ( ) note that digital preservation, and, in particular, digital humanities preservation, faces a “stand-still-and-die problem” because it is necessary to “continually…change media and operating environments just to keep our information alive and accessible.” this is true of preserving most digital objects, but the complex, multi-faceted nature of many dh projects adds further layers of complexity to the already challenging digital preservation process. zorich ( ) lists other components of the “digital ecosystem” that must be preserved in addition to the actual content itself: “software functionality, data structures, access guidelines, metadata, and other…components to the resource.” kretzschmar and potter ( ) lay out three seemingly simple questions about preserving digital projects, the answers to which are often difficult to pin down: “how will we deal with changing media and operating environments? who will pay for it? and who will do the work?” when working with dh projects, ‘what exactly are we preserving?’ may also be an important question because, as smith ( ) notes, “there are…nagging issues about persistence that scholars and researchers need to resolve, such as…deciding which iteration of a dynamic and changing resource should be captured and curated for preservation.” in , digital humanities quarterly published a cluster of articles dedicated to the question of “doneness” in dh projects. kirschenbaum ( ) notes in the introduction to the cluster that “digital humanities…[is] used to deriving considerable rhetorical mileage and the occasional moral high-ground by contrasting [its] radical flexibility and mutability with the glacial nature of scholarly communication in the fixed and frozen world of print-based publication.” unlike some digital assets that undergo preservation, dh projects and the components thereof are often in a state of flux and, indeed, may never truly be finished.
this feature of dh projects makes their preservation a moving target. kretzschmar ( ) detailed the preservation process for the linguistic atlas project, a large scale dh project that spanned decades, explaining “we need to make new editions all the time, since our idea of how to make the best edition changes as trends in scholarship change, especially now in the digital age when new technical possibilities keep emerging.” another example of a dh project that has undergone and continues to undergo significant revisions is described in profile # below. in addition to the particular technological challenges of preserving often iterative and ever-evolving dh projects, there are structural and administrative difficulties in supporting their preservation as well. maron and pickle ( ) identified preservation as a particular risk factor for dh projects with faculty naming a wide range of entities on campus as being responsible for supporting their projects’ preservation needs, which suggested “that what preservation entails may not be clear.” bryson, posner, st. pierre, and varner ( ) also note that “the general lack of policies, protocols, and procedures has resulted in a slow and, at times, frustrating experience for both library staff and scholars.” established workflows and procedures are still not easily found in the field of dh preservation, leading scholars, librarians, and other support staff to often attempt to reinvent the wheel with each new project. other difficult to avoid problems noted across the literature are those of staff attrition and siloing. although rife with challenges, the preservation of dh projects is far from a lost cause, and libraries have a crucial role to play in ensuring that, to some degree, projects are successfully maintained. the data and interviews summarized in this paper reveal how some of these projects are being preserved as well as their particular difficulties. 
there are certainly opportunities for librarians to step in and offer their preservation expertise to help scholars formulate and achieve their preservation goals.

methodology

the methodology for this project was influenced by time frame and logistics. initially the project was slated to be completed within five months, but the deadline was later extended to nine months. because it would have been difficult to interview multiple individuals across new york city within the original time frame, we decided on a two-phase approach to conducting the study, similar to zorich’s methodology, where an information gathering phase was followed by interviews (zorich, ). the study involved ( ) conducting an online survey of nyc faculty members engaged in digital humanities, and ( ) performing in-person or phone interviews with those who agreed to additional questioning. the survey provided a broad, big-picture overview of the practices of our target group, and the interviews supplemented that data with anecdotes about specific projects and their preservation challenges. the interviews also provided more detailed insight into the thoughts of some dh scholars about the preservation of their projects and digital preservation in general.

the subjects of our survey and interviews were self-selected faculty members and phd candidates engaged in digital humanities research and affiliated with an academic institution within the new york city area. this population of academics was specifically targeted to reach members of the dh community that had access to an institutional library and its resources. we limited our scope to the new york city area for geographic convenience. we targeted survey respondents using the nyc digital humanities website as a starting point. as of october , when the selection process for this project was underway, there were members listed in the nyc digital humanities online directory.
an initial message was sent to the nycdh listserv on june , , and individual emails were sent to a subset of members in june , . we approached additional potential survey respondents that we knew fit our criteria via email and twitter.

figure : nyc digital humanities logo

survey

the survey tool was a -item online qualtrics questionnaire asking multiple choice and short answer questions about the researchers’ work and their preservation strategies and efforts to date. the survey questions were developed around specific areas: background information about the projects and their settings, tools used, staff/management of preservation efforts, future goals, and a query about their availability for follow up interviews. as all dh projects are unique, respondents were asked to answer the questions as they pertain to one particular project for which they were the principal investigator (pi).

interviews

interviewees were located for the second phase of the research by asking survey respondents to indicate if they were willing to participate in a more in-depth interview about their work. interested parties were contacted to set up in-person or conference call interviews. the interviews were less formal and standardized than the survey, allowing for interviewees to elaborate on the particular issues related to the preservation of their projects. each interview was recorded but not fully transcribed. team members reviewed the recordings and took detailed notes for the purpose of comparing and analyzing the results.

limitations

although the scope of this project was limited to a particular geographic area with a large population base, the sample size of the survey respondents was fairly small. the institutions of all but three respondents are classified as moderate to high research activity institutions according to the carnegie classifications.
these types of institutions are by no means the only ones involved in dh work, but the high concentration of respondents from research institutions may indicate that there is greater support for dh projects at these types of institutions. as a result, this paper does not provide much discussion of dh preservation practices at smaller baccalaureate or masters institutions with a stronger emphasis on undergraduate education. a note about confidentiality individuals who participated in the online survey were asked to provide their names and contact information so we could follow-up with them if they chose to participate in the interview. individuals who took part in the interviews were guaranteed confidentiality to encourage open discussion. all findings are reported here anonymously. survey results the survey was live from june , to july , . in total, respondents completed the survey. demographics of the faculty engaged in digital humanities our survey respondents represented new york city academic institutions, with the most responses coming from columbia university. department affiliations and professional titles are listed below (figure ). figure . institutional affiliations of survey respondents (n= ) institutional affiliation # of respondents columbia university cuny graduate center new york university bard graduate center hofstra university jozef pilsudski institute of america new york city college of technology queensborough community college st. john’s university the new school departmental affiliations of survey respondents department affiliation # of respondents library/digital scholarship lab english history art history linguistics unreported academic titles of survey respondents academic titles # of respondents professor assistant professor associate professor adjunct/lecturer digital scholarship coordinator or specialist phd candidate director chief librarian we asked respondents where they received funding for their projects (figure ). 
responses were split, with some respondents utilizing two funding sources. figure . funding source funding source # of respondents institutional funding % grant funding % personal funds % institutional and grant funding % no funding % institutional and personal funds % dh project characteristics as previously mentioned, respondents were asked to choose one digital humanities project in which to answer the survey questions. questions were asked to determine the number of people collaborating on the project and the techniques and software used. the majority of respondents ( %) were working collaboratively with one or more colleagues (figure ). figure . collaborators involved in dh project (n= ) # of collaborators # of respondents - collaborators % + collaborators % collaborators % - collaborators % the techniques utilized are listed in figure , with % of projects utilizing more than one of these techniques. figure . techniques used in dh project (n= ) technique # of projects data visualizations % other* % data mining and text analysis % geospatial information systems (gis) % network analysis % text encoding % -d modeling % *maps, interactive digital museum exhibition, audio ( ), software code analysis, data analysis tools, ohms (oral history metadata synchronizer) the techniques mentioned above are created with software or code, which can be proprietary, open-source, or custom. respondents utilized a mix of these software types, with % of respondents saying that they used proprietary software in their projects, % report using open-source software, and % used custom software. a list of software examples can be found in figure . figure . 
software utilized by respondents proprietary software open-source software adobe photoshop ( ) wordpress ( ) adobe dreamweaver omeka ( ) adobe lightroom python ( ) google maps mysql ( ) textlab timeline.js ( ) sketchup qgis ( ) weebly dspace knowledge of preservation % of respondents reported that they had formal training in digital preservation, which the authors intended to mean academic coursework or continuing education credit. informally, respondents have consulted numerous resources to inform preservation of their project (figure ). figure . sources consulted to inform preservation source percent published scholarly research % colleagues or informal community resources % digital humanities center, library/librarian, archivist % grey literature % professional or scholarly association sponsored events % conferences % campus workshops or events % none % project preservation considerations preservation of their dh project was considered by the majority ( %) of respondents. when asked who first mentioned preservation of their project, % of those who had considered preservation said either they or one of their collaborators brought up the issue. in only one instance did a librarian first suggest preservation, and there were no first mentions by either funder or host department. the majority of initial preservation discussions ( %) took place during the project, with % taking place before the project began, and % after project completion. when asked to consider how many years into the future they see their project being usable and accessible, the majority ( %) said + years, followed by - years ( %), and % were unsure. one respondent noted they were not interested in preservation of the project. preservation strategy version control, migration, metadata creation, emulation, durable persistent media, and bit stream preservation are just a few strategies for preserving digital materials. we asked respondents to rate each strategy by importance (figure ). 
figure : preservation strategies by importance

all respondents reported that they back up their work in some capacity. the largest share of respondents ( %) use cloud services. half report the use of institutional servers, and % use home computers. github was mentioned by two respondents as a safe storage solution for their projects. the majority of respondents ( %) use more than one method of backing up their work.

interview findings

through follow-up interviews with five respondents, we delved into several of these projects in greater detail. interviewees gave us more information about their projects and their partnerships, processes, and policies for preserving the work.

profile # : dh coordinator

interview conducted and summarized by nik dragovic

respondent was a coordinator in a digital humanities center at their institution and had undertaken the work in collaboration with librarian colleagues because the library works closely with researchers on dh projects at this particular institution. this initiative was unique in that no preservation measures were being undertaken, a strategy that resulted from discussion during the conception of the project. the resulting life expectancy for the project, comprising a geography-focused, map-intensive historical resource incorporating additional digital content, was three to four years. the reason for the de-emphasis of preservation stemmed from a shared impression that the complexity of preservation planning acts as a barrier to initiating a project. because the team intended to create a library-produced exemplar rather than a traditional faculty portfolio piece, the initiative was well suited to this approach. the technical infrastructure of the project included a php stack used to dynamically render the contents of a mysql database. the general strategy incorporated elements of custom software and open source technologies including neatline and omeka.
the unique perspective of the respondent as an institutional dh liaison as well as a practitioner made the interview more amenable to a general discussion of the issues facing a broad set of digital humanists and their interaction with library services. the overriding sentiment of the respondent echoed, to a large extent, existing literature’s assertion that dh preservation is nascent and widely variable. specifically, the interviewee opined that no one framework, process, or solution exists for those seeking to preserve dh outputs, and that every project must have its own unique elements taken into account. this requires an individual consultation with any project stakeholder concerned with the persistence of their work. a primary element of such conversations is expectation management. in the respondent’s experience, many practitioners have the intention of preserving a fully functional interface in perpetuity. in most cases, the time, cost, and effort required to undertake such preservation measures are untenable. the varied and constantly changing code stack environments currently underpinning dh projects are a leading obstacle to permanent maintenance of the original environment of a dh project.

as a result, the respondent advocated for a “minimal computing” approach to preservation, in which more stable formats such as html are used to render project elements in a static format, predicated on a data store instead of a database, with languages like javascript as a method for coordinating the front-end presentation. this technique not only allows for a simpler and more stable preservation format, but also enables storage on github or apache servers, which are generally within institutional resources. another preservation solution the respondent explained was the dismantling of a dh project into media components.
instead of migrating the system into a static representation, one leverages an institutional repository to store elements such as text, images, sound, video, and data tables separately. the resulting elements would then require a manifest to be created, perhaps in the format of a tar file, to explain the technology stack and how the elements can be reassembled. an internet archive snapshot is also a wise element to help depict the user interface and further contextualize the assets.

in the experience of the respondent, helping digital humanists understand strategic and scaled approaches to preservation is one of the greatest challenges of acting as a library services liaison. students and faculty have an astute understanding of the techniques underpinning the basic functionality of their work, but not of the landscape of current preservation methodologies. not only is the learning curve steep for these more library-oriented topics, but the ambitions of the library and the practitioner often diverge. whereas the scholar’s ambition is often to generate and maintain a body of their own work, the library focuses more on standardization and interoperability. this creates a potential point of contention between library staff and those they attempt to counsel. often the liaison must exercise sensitivity in their approach to users, who themselves are experts in their field of inquiry.

the broader picture also includes emerging funding considerations for national grants. when asked about the intentions of the national endowment for the humanities to incorporate preservation and reusability into funding requirements, the respondent expressed skepticism of the agency’s conceptualization of preservation, stating that a reconsideration and reworking of the term’s definition was in order. to apply too exhaustive a standard would encourage a reductive focus on the resource-intensive preservation methods that the respondent generally avoids.
like most facets of the dh preservation question, this warrants further inquiry from practical and administrative standpoints. in a general sense, realistic expectations and practical measures ruled the overall logic of the respondent, as opposed to adherence to any given emerging standard presently available. profile # : library director the impetus behind respondent ’s project was not to advance scholarship in a particular subject, so the preservation strategy and goals differed from projects that had a more explicitly scholarly purpose. the idea was hatched by a team of librarians as a means to help librarians learn and develop new skills in working with digital research with the ultimate goal of enhancing their ability to collaborate and consult with researchers on their projects. the learning and training focus of this project informed the team’s preservation strategy. a number of tools were used to plan, document, and build out this project, and some levels of the production were designed to be preserved where others were intended to be built out, but then left alone, instead of migrated as updates become available. the process was documented on a wordpress blog, and the ultimate product was built on omeka. the team did preservation and versioning of code on github, but they do not intend to update the code even if that means the website will ultimately become unusable. what was very important to this team was to preserve the “intellectual work” and the research that went into the project. to accomplish that, they decided to use software, such as microsoft word and excel, that creates easy to preserve files, and they are looking into ways to bundle the research files together and upload them to the institution’s repository. 
respondent expressed that an early problem they had with the technology team was that they “wanted everything to be as well thought out as our bigger digital library projects, and we said that dh is a space for learning, and sometimes i could imagine faculty projects where we don’t keep them going. we don’t keep them alive. we don’t have to preserve them because what was important was what happened in the process of working out things.” this team encountered some challenges working with omeka. at one point they had not updated their version of omeka and ended up losing quite a bit of work which was frustrating. “we need to be thinking about preservation all along the way” to guard against these kinds of losses of data. working with the it department also posed challenges because “technology teams are about security and about control” and are not always flexible enough to support the evolving technology needs of a dh project. the project had to be developed on an outside server and moved to the institutional server where the code could not be changed. profile # : art professor respondent ’s institution has set up a dh center with an institutional commitment to preserving the materials for the projects in perpetuity. the center relies on an institutional server and has a broad policy to download and maintain files in order to maintain them indefinitely on the back end. front end production of the project was outsourced to another institution, and the preservation of that element of the project had not been considered at the time of the interview. this researcher’s main challenge was that although many of the artworks that are examples in the project are quite old and not subject to copyright, certain materials (namely photographs of d objects) are copyrighted and can only be licensed for a period of years. 
the front-end developer expressed that years was a long time in the lifetime of a website (which would make that limitation of little concern), but being able to only license items for a decade at a time clashes with the institutional policy of maintaining materials indefinitely on the server and raises questions about who will be responsible for this content over the long term if the original pi were to move on or retire. profile # : archivist interview conducted and summarized by hannah silverman respondent , who has developed a comprehensive set of open source tools for the purpose of archiving documents and resources related to a specific historical era, sees their work within the sphere of digital humanities. the sense that their archival work was essentially related to the digital humanities came about over a period of time as their technical needs required them to connect with a larger set of people, first with the librarians and archives community through the metropolitan new york library council (metro), then as a dh activity introduced at a metro event. “i myself am writing a [dh] blog which originally was a blog by archivists and librarians…so, the way i met people who are doing similar things is at metro. we are essentially doing dh because we are on the cross of digital technologies and archives. it is just a label, we never knew we were doing dh, but it is exactly that.” the respondent goes on to describe the value of developing tools that can read across the archive, allowing researchers to experience a more contextual feel for a person described within the material – adding dimensionality and a vividness to the memory of that person: what i am struggling with is essentially one major way of presenting the data and that is the library way. the libraries see everything as an object, a book is an object, and everything else is as an object. so they see objects. 
and if you look at the ny public library…you can search and you can find the objects which can be a page of an archive but it is very difficult to see the whole archive, the whole collection; it’s not working this way. if you search for an object you will find something that is much in the object but it is not conducive to see the context and the archives are the context, so what i am trying to see if we can expand this context space presentation. we spent very little money on this project product which we use to display the data. there is a software designer…who built it for us, but if we could get more funding i would work on [creating] a better view for visualizing the data. several projects [like this] are waiting in line for funding here…we collect records, records are not people. records are just names. we would like to put the records in such a way that all the people are listed and then give the information about this person who was in this list because he was doing something, and in this list because he was doing something else, and in this document because he traveled from here to here and so on. that would be another way of sort of putting all the soldiers and all the people involved in these three (volunteer) uprisings for which we have complete records of in part of the archive. we have complete records of all the people in such a way that you could follow a story of a person and also maybe his comrades in arms. it may be the unit in which he worked, and so on.

the respondent has addressed preservation with multiple arrays of hard drives that are configured with redundancy schemes and daily scrubbing programs that detect and repair corrupted bits. copies stored on tape are also routinely managed in multiple offsite locations, and quality assurance checks are carried out through both analog and digital processes.

profile # : english and digital humanities professor

interview conducted by hannah silverman and summarized by malina thiede.
the project discussed in this interview began as a printed text for which an interactive, online platform was later created. the online platform includes data visualizations from user feedback (such as highlights) and a crowdsourced index, as no index was included in the original print text. the code for the project is preserved and shared on github which the interviewee sees as a good thing. the visualizations of the data are not being preserved, but the data itself is. there is an intent to create and preserve new visualizations, but the preservation plan was not set at the time of the interview. the initial project was conceived and executed in a partnership between an academic institution and a university press on a very short timeline (one year from call for submissions to a printed volume) with very rigid deadlines. due to the rapid and inflexible timeline, preservation was not considered from the outset of the project, but a data curation specialist was brought in between the launch of the site and the first round of revisions to review the site and give advice on issues of preservation and sustainability. the institution supporting the project has strong support for digital initiatives; however, an informal report from the data curation specialist tasked with reviewing the project indicated that “precarity in the institutional support for the project could result in its sudden disappearance.” the interviewee stated that “we are less focused on preservation than we should be” because “we’re looking towards the next iteration. our focus has been less on preserving and curating and sustaining what we have” than on expanding the project in new directions. at the time of the interview, this project was entering a new phase in which the online platform was going to be adapted into a digital publishing platform that would support regular publications. 
the interviewee indicated several times that more of a focus on preservation would be ideal but that the digital elements of this project are experimental and iterative. the priority for this project is moving ahead with the next iteration rather than using resources to preserve current iterations. analysis & conclusion through this survey of nyc librarians, scholars, and faculty, our aim was to capture a sample of the work being done in the digital humanities, paying close attention to this population’s preservation concerns, beliefs, and practices. through this research, we offer the following observations regarding dh content creators and preservation: . preservation is important to the researchers working on these projects, but it is often not their main focus. . scholars working on dh projects are looking for advice and support for their projects (including their project’s preservation). . librarians and archivists are already embedded in teams working on dh projects. preservation challenges we noticed through textual responses and follow-up interviews that preservation rarely came up in the earliest stages of the project – sometimes due to tight deadlines, and other times simply because preservation is not generally in the conversation during the onset of a project. researchers are typically not accustomed to thinking about how their work will be preserved. the workflows for traditional published research leave preservation in the hands of the consumer of the research, which is often the library. however, dh and other digital projects often have less clearly defined workflows and audiences, making it less obvious who should be responsible for preservation and when the preservation process should begin. our data indicates that most planning about preservation occurs sometime during the course of the project or after its completion, rather than at the beginning. 
best practices for digital projects state that preservation should be a consideration as close to the beginning of the project as possible, but researchers may not be aware of that until they have done significant work on a project. it is also noteworthy that just over half of our survey respondents set a goal of preserving their work for five or more years, and significant percentages ( and , respectively) set goals of three to four years or were unsure of how long they wanted their work to be preserved. this indicates that not all projects are intended to be preserved for the long term, but that does not mean that preservation planning and methods should be disregarded for such projects. as these projects go forward, respondents who do want their projects to be available long term grapple with the difficulties that surround preservation of digital content and the added time commitment it demands. the following survey respondent illustrates this potential for complexity: unlike many digital humanities projects this project exists/existed in textual book format, online, and in an exhibition space simultaneously. all utilize different aspects of digital technologies and are ideally experienced together. this poses much more complicated preservation problems since preserving a book is different from preserving an exhibition which is different from preserving an online portion of a project. what is most difficult to preserve is the unified experience (something i am well aware of being a theatre scholar who has studied similar issues of ephemerality and vestigial artifacts) and is something that we have not considered seriously up to this point. 
however, because books have an established preservation history, the exhibition was designed to tour and last longer than its initial five-month run, and the online component will remain available to accompany the tour and hopefully even beyond, the duration of the project as a whole has yet to be truly determined and i am sure that considerations of preservation and version migration will come up in the near future for both the physical materials and the digital instantiations of the project. it promises to provide some interesting conundrums as well as fascinating revelations. and another survey respondent: i feel like i should unpack the perpetuity question. our project is text (and) images (and) data visualizations on a website. the text (and) images i’d hope would be accessible for a long time, the data (visualization) relies on specific wordpress plugins/map applications and may not be accessible for a long time. since we’re self-administering everything we will take things forward with updates as long as we can, but… roles for librarians and archivists as one librarian interviewee explained, preservation is a process that needs to be considered as a project is developed and built out, not a final step to be taken after a project is completed. hedstrom noted as far back as that preservation is often only considered at a project’s conclusion or after a “sensational loss,” and this remains a common problem nearly years later. therefore, librarians and archivists should try to provide preservation support starting at the inception of a project. considering preservation at an early stage can inform the process of selecting tools and platforms; prevent data loss as the project progresses; and help to clarify the ultimate goals and products of a project. 
nowviskie ( ) posed the question: “is [digital humanities] about preservation, conservation, and recovery—or about understanding ephemerality and embracing change?” humanists have to grapple with this question as it regards their own work, but librarians and archivists can provide support and pragmatic advice to practitioners as they navigate these decisions. sometimes this may mean that information professionals have to resist their natural urge to advocate for maximal preservation and instead to focus on a level of preservation that will be sustainable using the resources at hand. librarians and archivists would do well to consider this advice from nowviskie ( ): we need to acknowledge the imperatives of graceful degradation, so we run fewer geriatric teen-aged projects that have blithely denied their own mortality and failed to plan for altered or diminished futures. but alongside that, and particularly in libraries, we require more a robust discourse around ephemerality—in part, to license the experimental works we absolutely want and need, which never mean to live long, get serious, or grow up. profiles # and # exemplified the ‘graceful degradation’ approach to dh preservation by building a website that was intended to be ephemeral with the idea that the content created for the site could be packaged in stable formats and deposited in an institutional repository for permanent preservation. the project discussed in profile # , while not explicitly designed as an ephemeral project, has a fast moving, future focused orientation, such that any one particular iteration of the project may not exist indefinitely, or even for very long. of course, an ephemeral final product may not be an acceptable outcome in some cases, but advice from librarians can inform the decision making process about what exactly will be preserved from any project and how to achieve the level of preservation desired. 
due to variations in the scale and aims of individual dh projects and the resources available in different libraries, it would be virtually impossible to dictate a single procedure that librarians should follow in order to provide preservation support for dh projects, but based on our data and interviews, librarians who want to support preservation of dh research can take the following steps: . keep up with existing, new, or potential dh research projects on campus. depending on the type of institution, those projects may be anything from large scale projects like the linguistic atlas mentioned above to undergraduate student work. . offer to meet with people doing dh on campus to talk about their projects. begin a discussion of preservation at an early stage even if long term preservation is not a goal of the researchers. establishing good preservation practices early can help to prevent painful data losses like the one mentioned in profile # as the project progresses. . work with the researchers to develop preservation plans for their projects that will help them meet their goals and that will be attainable given the resources available at your institution/library. – in developing a plan, some of the questions from our survey (see appendix i) may be helpful, particularly questions about the nature of the project and the intended timeline for preservation. – also keep in mind what resources are available at your library or institution. kretzschmar and potter ( ) took advantage of a large, extant media archive at their library to support preservation of the linguistic atlas. the interviewees in profiles # and # also mentioned the institutional repository (ir) as a possible asset in preserving some of the components of their work. (while useful for providing access, irs are not a comprehensive preservation solution, especially at institutions that use a hosting service.) 
– coordinate with other librarians/staff that may have expertise to help with preservation such as technology or intellectual property experts. as discussed in profile # , copyright can pose some challenges for dh projects, especially those that include images. many libraries have staff members that are knowledgeable about copyright who could help find solutions to copyright related problems. – for doing preservation work with limited resources, the library of congress digital preservation site has a lot of information about file formats and digitization. another good, frequently updated source from the library of congress is the digital preservation blog the signal. although created in and not updated, the powrr tool grid could be a useful resource for learning about digital preservation software and tools. conclusion dh projects are well on their way to becoming commonplace at all types of institutions and among scholars at all levels from undergraduates to full professors. the data and interviews presented here provide a snapshot of how some digital humanists are preserving their work and about their attitudes toward preservation of dh projects in general. they show that there are opportunities for librarians to help define the preservation goals of dh projects and work with researchers on developing preservation plans to ensure that those goals are met, whether the goal is long term preservation or allowing a project to fade over time. acknowledgements although this article is published under a single author’s name, the survey and interviews were created and conducted by a team of four that also included allison piazza, nik dragovic, and hannah silverman. allison, nik, hannah, and i all worked together to write and conduct the survey, analyze the results, and present our findings in an ala poster session and to the metropolitan new york library council (metro). 
writing and conducting the interviews was likewise a group effort, and all of them contributed to writing our initial report although it was never fully completed. the contributions of these team members were so substantial that they should really be listed as authors of this paper alongside me, but they declined when i offered. this project was initially sponsored by the metropolitan new york library council (metro). tom nielsen was instrumental in shepherding this project through its early phases. special thanks also to the pratt institute school of information for funding the poster of our initial results that was displayed at the ala annual conference. additional thanks to chris alen sula, jennifer vinopal, and monica mccormick for their advice and guidance during the early stages of this research. finally, thanks to publishing editor ian beilin, and to reviewers ryan randall and miriam neptune. their suggestions were immensely helpful in bringing this paper into its final form.

references

bryson, t., posner, m., st. pierre, a., & varner, s. ( , november). spec kit : digital humanities. retrieved from http://www.arl.org/storage/documents/publications/spec- -web.pdf carnegie classifications | basic classification. (n.d.). retrieved from http://carnegieclassifications.iu.edu/classification_descriptions/basic.php hedstrom, m. ( ). digital preservation: a time bomb for digital libraries. computers and the humanities, ( ), – . kirschenbaum, m. g. ( ). done: finishing projects in the digital humanities. digital humanities quarterly, ( ). retrieved from http://www.digitalhumanities.org/dhq/vol/ / / / .html kretzschmar, w. a. ( ). large-scale humanities computing projects: snakes eating tails, or every end is a new beginning? digital humanities quarterly, ( ). retrieved from http://www.digitalhumanities.org/dhq/vol/ / / / .html kretzschmar, w. a., & potter, w. g. ( ). library collaboration with large digital humanities projects. literary & linguistic computing, ( ), – .
library of congress. (n.d.). about – digital preservation. retrieved from http://www.digitalpreservation.gov/about/ maron, n. l., & pickle, s. ( , june ). sustaining the digital humanities: host institution support beyond the start-up phase. retrieved from http://www.sr.ithaka.org/publications/sustaining-the-digital-humanities/ nowviskie, b. ( ). digital humanities in the anthropocene. digital scholarship in the humanities, (suppl_ ), i –i . https://doi.org/ . /llc/fqv smith, a. ( ). preservation. in s. schreibman, r. siemens, & j. unsworth (eds.). a companion to digital humanities. oxford: blackwell. retrieved from http://www.digitalhumanities.org/companion/view?docid=blackwell/ / .xml&chunk.id=ss - - &toc.depth= &toc.id=ss - - &branddefault walters, t., & skinner, k. ( , march). new roles for new times: digital curation for preservation. retrieved from http://www.arl.org/storage/documents/publications/nrnt_digital_curation mar pdf what is digital humanities? ( , january). retrieved from http://whatisdigitalhumanities.com/ zorich, d. m. ( , november). a survey of digital humanities centers in the us. retrieved from http://f-origin.hypotheses.org/wp-content/blogs.dir/ /files/ / /zorich_ _asurveyofdigitalhumanitiescentersintheus .pdf appendix: survey preservation in practice: a survey of nyc academics engaged in digital humanities thanks for clicking on our survey link! we are a group of four information professionals affiliated with the metropolitan new york library council (metro) researching the digital preservation of dh projects. contextual information is available at the mymetro researchers page. our target group is new york city digital humanists working in academia (such as professors or phd candidates) who have completed or done a significant amount of work on a dh project. if you meet this criteria, we’d appreciate your input. the survey will take less than minutes. 
the information we gather from this survey will be presented at a metro meeting, displayed on a poster at the annual conference of the american library association, and possibly included as part of a research paper. published data and results will be de-identified unless prior approval is granted. please note that your participation is completely voluntary. you are free to skip any question or stop at any time. you can reach the survey administrators with any questions or comments: nik dragovic, new york university, nikdragovic@gmail.com allison piazza, weill cornell medical college, allisonpiazza.nyc@gmail.com hannah silverman, jdc archives, hannahwillbe@gmail.com malina thiede, teachers college, columbia university, malina.thiede@gmail.com is your project affiliated with a new york city-area institution or being conducted in the new york city area? yes no title or working title of your dh project: does your project have an online component? yes (please provide link, if available): to be determined no what techniques or content types have you used or will you use in your project? select all that apply. data visualizations data mining and text analysis text encoding network analysis gis (geospatial information systems) -d modeling timelines what date did you begin work on this project (mm/yy) approximately how many people are working on this project? - - + i am working on this project alone has preservation been discussed in relation to this project? yes no who first mentioned the preservation of your project? self librarian dh center staff project member funder host department other: at what stage in the project was preservation first discussed? before the project began during the project after project completion who is/will be responsible for preserving this project? select up to two that best apply. 
self (pi) library host department another team member institution person or host to be determined campus it another institution how important are each of these processes to your overall preservation strategy for this project? bit-stream preservation or replication (making backup copies of your work) durable persistent media (storing data on tapes, discs, or another physical medium) emulation (using software and hardware to replicate an environment in which a program from a previous generation of hardware or software can run) metadata creation migration (to copy or convert data from one form to another) version control are there any other preservation strategies essential to your work that are not listed in the above question? if so, please list them here. do you have defined member roles/responsibilities for your project? yes no not applicable, i am working on this project alone. what is your main contribution to this project team? select all that apply. technical ability subject expertise project management skills is there a specific member of your team that is responsible for preservation of the technical infrastructure and/or display of results? yes no is there a dh center at your institution? yes no how often have you consulted with the dh center for your project? never once a few times many times dh center staff member is a collaborator on this project my institution does not have a dh center how is this project funded? select all that apply institutional funding grant funding personal funds were you required to create a preservation plan for a funding application? yes no what kinds of resources have you consulted to inform the preservation of your project? select all that apply. 
published scholarly research (such as books or journal articles) guides, reports, white papers and other grey literature professional or scholarly association sponsored events or resources (such as webinars) conferences campus workshops or events colleagues or informal community resources none dh center, library/librarian, archivist have you had any training in digital preservation? yes no how many years into the future do you see your project being usable/accessible? - years - years + years not sure is your resource hosted at your own institution? yes no if no, where is it hosted? how are you backing up your work? select all that apply. cloud service institutional server home computer dam tools not currently backing up work other which of the following types of software have you used to create your project? select all that apply. proprietary software (please list examples) open-source software (please list examples) custom software if you would like to add any perspectives not captured by the previous questions, or clarify your answers, please use the comment box below: your full name email address institutional affiliation primary department affiliation academic title if applicable, when did/will you complete your phd? would you be willing to be the subject of an approximately -minute interview with a member of our team to talk more in-depth about your project and preservation concerns? this work is licensed under a cc attribution . license.
belfast group poetry|networks: about the data
rdf generation for belfast group data
rebecca sutton koeser, rebecca.s.koeser@princeton.edu, https://orcid.org/ - - -
june
https://belfastgroup.digitalscholarship.emory.edu/network/about/

this document describes the steps performed by the “prep_dataset” script, which harvests and builds the rdf dataset for the website; that dataset is used in part as the basis for the network graphs and chord diagrams. before the script could be run, significant work was required (1) to tag names in the ead and tei and (2) to expose the tagged information as rdf so it could be harvested, but this work is documented elsewhere.

1. harvest and collect rdf
1.1. harvest rdf from specific collections we are interested in on the emory finding aids website. this includes logic to harvest “related” web pages using related links, which allows for automatically harvesting all parts of a multipart finding aid (e.g., a large collection with multiple series/subseries and an index) and also allows us to automatically harvest data from collections that are labeled as related collections in the ead via the “related materials in this repository” section (see the michael longley papers for an example).
1.2. harvest rdf for the tei group sheets included on belfast group poetry|networks, for inclusion in the list of group sheets and so that data about names and places mentioned in the poetry can be included in the dataset.
1.3. generate rdf from two sets of local “fixture” sources that are not available as rdf elsewhere:
1.3.1. a list of the group sheets based on the collection at queen’s university belfast (converted to html and cleaned up to make some corrections and improvements for consistency with our other data)
1.3.2. four html documents: one with brief biographies pulled from the original belfast group website (annotated in rdfa and tagged with viaf identifiers), to provide profile information for persons associated with the group that was otherwise not being included in our dataset; one with the edna longley biography (from the michael longley finding aid, but not harvestable); one with the hannah hobsbaum biography (from international who’s who in poetry); and one with information about the one group sheet known to be in private hands (pakenham).

2. identify and “smush” group sheets
the group sheets in the rdf data do not have unique identifiers (other than the ones available as digital editions), so for convenience the script includes work to identify group sheet content and to de-duplicate the group sheets where multiple copies are present in different collections. this work consists of going through the data and identifying manuscripts associated with the belfast group, tagging them as a belfast group sheet (using a custom rdf type), and then “smushing” them in the rdf, generating a new, distinct identifier so that copies of the same group sheet in different archival collections or in a tei digital edition can all be connected to each other.

3. annotate the graph with related data
because we are using identifiers from other rdf data sources (specifically viaf, http://viaf.org/; dbpedia, http://wiki.dbpedia.org/; and geonames.org, http://geonames.org/), preparing the data includes a step to harvest a minimum set of useful data from those external systems for use in the website (e.g., names for people in viaf, descriptions and wikipedia links for people in dbpedia, and geographical coordinates for places in geonames.org).
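the “smushing” step described above can be sketched roughly as follows. this is a minimal illustration and not the prep_dataset script itself: it represents triples as plain python tuples instead of using an rdf library, the predicate names, the `ex:` type, and the `http://example.org/` base uri are all hypothetical, and the matching rule (same authors plus same normalized title) is an assumption, since the text does not say how copies of a group sheet are recognized.

```python
import hashlib
from collections import defaultdict

# hypothetical names, used only for this sketch
RDF_TYPE = "rdf:type"
GROUP_SHEET = "ex:BelfastGroupSheet"  # stands in for the custom rdf type
TITLE = "dc:title"
AUTHOR = "dc:creator"
BASE = "http://example.org/groupsheet/"


def smush_group_sheets(triples):
    """return a copy of `triples` in which copies of one group sheet share an id."""
    sheets = set()
    titles = defaultdict(set)
    authors = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE and o == GROUP_SHEET:
            sheets.add(s)
        elif p == TITLE:
            # normalize case and whitespace so trivial variants still match
            titles[s].add(" ".join(o.lower().split()))
        elif p == AUTHOR:
            authors[s].add(o)

    # mint one new identifier per (authors, titles) key
    new_id = {}
    for node in sheets:
        if not titles[node]:
            continue  # nothing to match on, so leave this node untouched
        key = repr((sorted(authors[node]), sorted(titles[node])))
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]
        new_id[node] = BASE + digest

    # rewrite every subject and object that refers to a smushed node
    return {(new_id.get(s, s), p, new_id.get(o, o)) for s, p, o in triples}


if __name__ == "__main__":
    # two made-up copies of the same sheet, listed in two made-up collections
    copies = {
        ("_:copy1", RDF_TYPE, GROUP_SHEET),
        ("_:copy1", TITLE, "Example Poem"),
        ("_:copy1", AUTHOR, "viaf:example-author"),
        ("coll:one", "ex:contains", "_:copy1"),
        ("_:copy2", RDF_TYPE, GROUP_SHEET),
        ("_:copy2", TITLE, "example  poem"),
        ("_:copy2", AUTHOR, "viaf:example-author"),
        ("coll:two", "ex:contains", "_:copy2"),
    }
    for triple in sorted(smush_group_sheets(copies)):
        print(triple)
```

deriving the new identifier from a hash of the matching key keeps the result stable across runs, so a re-harvest would produce the same identifiers for the same sheets; a real implementation would need a more careful matching rule than this one.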
4. make inferences about the data and add them to the dataset
some information that is necessary or useful for displaying parts of the site (e.g., the list of group sheets) or the network graphs is present in implicit ways within the rdf dataset at this point, but not present in a way that can be easily queried for use. this step in the process makes a few specific inferences about the data and then adds that information back to the data in the form of new rdf triples that can then be used by the site or for generating network graphs.
4.1. time period: infer whether a group sheet belongs to the first or second period of the group, and add coverage information to the rdf graph for that group sheet:
4.1.1. if there is a date associated with any copy of that group sheet, use that date to determine which time period it belongs in
4.1.2. if there is a tei copy of the group sheet, use the coverage information from the tei
4.1.3. otherwise, assume that if a group sheet is in the hobsbaum collection at queen’s, it belongs to the first time period (since hobsbaum was only involved in the first period)
4.2. ownership: infer ownership of a group sheet based on the archival collection(s) where it can be found (e.g., if a group sheet is listed in the longley papers, add a triple stating that longley owned that group sheet)
4.3. associate authors and owners of group sheets with the belfast group: in order to easily identify people associated with the group when generating profile pages and network graphs for the site, go through each group sheet and add a triple (if not already present) to indicate that anyone who owns or authored a group sheet should be considered affiliated with the group.
4.4. mentions of people/places in the digitized group sheets: for convenience, and to allow explicit connections to be included in the network graphs, go through all poems in the rdf data (only harvested from the tei) and add a direct relation between the author and the entities mentioned in the text.

5. generate network graphs
for ease of processing, network analysis, and display on certain portions of the website, the rdf data is converted into network graph format using the python networkx library (https://networkx.github.io/).
5.1. full network: all triples in the rdf (with the exception of rdf sequences) are converted into a network where each subject and object is a node and each predicate is an edge between them, weighted based on some sense of the strength of that connection (see the current weights used in the code: https://github.com/emory-libraries-ecds/belfast-group-site/blob/master/belfast/rdf/nx.py#l - ).
5.2. specialized network based on the group sheets by time period: nodes are the group in period one ( – ) and period two ( – ), and the authors and owners of group sheets; edges are added between authors and owners of group sheets that belong to the group in one period or the other, and between co-authors, co-owners, and owners and authors of the same group sheet; edges are weighted based on the number of group sheets.

figure 1. network graph of people associated with the belfast group. (interactive version: https://belfastgroup.ecds.emory.edu/network/belfast-group/)

the “people associated with the belfast group” graph (figure 1) is generated from the full network graph (described above in 5.1): an egograph centered on the belfast group, filtered to only include persons and organizations and restricted to nodes that are only one or two degrees away from the belfast group (in the case of the two-degree graph, nodes are further filtered by a minimum degree of five, because otherwise there is too much data to be presented or made sense of). similar logic is used to generate the egographs on individual profile pages, except that when filtering nodes, places are also included in addition to persons and organizations.

figure 2. network graph of belfast group authors by time period. (interactive version: https://belfastgroup.ecds.emory.edu/network/belfast-group/groupsheets/)

the “belfast group authors by period” graph (figure 2) is generated from the specialized network described above in 5.2.

emory center for digital scholarship (http://digitalscholarship.emory.edu/); manuscript, archives, and rare books library (http://marbl.library.emory.edu/); emory university (http://www.emory.edu/)

[note: the following is the full text of an essay published in differences .
( ) as part of a special issue entitled in the shadows of the digital humanities, edited by ellen rooney and elizabeth weed. duke up’s publishing agreements allow authors to post the final version of their own work, but not using the publisher’s pdf. the essay as you see it here is thus a standard pdf distinct from that created by duke up. subscribers, of course, can also read it in the press’s published form direct from the duke up site. other than accidentals of formatting and pagination, this text should not differ significantly from the published one. if there are discrepancies, they are likely the result of final copy edits and the exchange between the differences style guide and our standardized format. this article is copyright © duke university press.]

citation: volume , number . doi . / - . © by brown university and differences: a journal of feminist cultural studies

working the digital humanities: uncovering shadows between the dark and the light
wendy hui kyong chun and lisa marie rhody

the following is an exchange between the two authors in response to a paper given by chun at the “dark side of the digital humanities” panel at the modern language association (mla) annual convention. this panel, designed to provoke controversy and debate, succeeded in doing so. however, in order to create a more rigorous conversation focused on the many issues raised and elided and on the possibilities and limitations of digital humanities as they currently exist, we have produced this collaborative text. common themes in rhody’s and chun’s responses are: the need to frame digital humanities within larger changes to university funding and structure, the importance of engaging with uncertainty and the ways in which digital humanities can elucidate “shadows” in the archive, and the need for and difficulty of creating alliances across diverse disciplines.
we hope that this text provokes more ruminations on the future of the university (rather than simply on the humanities) and leads to more wary, creative, and fruitful engagements with digital technologies that are increasingly shaping the ways and means by which we think.

part 1: the digital humanities: a case of cruel optimism? (chun)

what follows is the talk given by wendy chun on january , , at the mla convention in boston. it focuses on a paradox between the institutional hype surrounding dh and the material work conditions that frequently support it (adjunct/soft money positions, the constant drive to raise funds, the lack of scholarly recognition of dh work for promotions). chun calls for scholars across all fields to work together to create a university that is fair and just for all involved (teachers, students, researchers). she also urges us to find value in what is often discarded as “useless” in order to take on the really hard problems that face us.

i want to start by thanking richard grusin for organizing this roundtable. i’m excited to be a part of it. i also want to start by warning you that we’ve been asked to be provocative, so i’ll use my eight minutes here today to provoke: to agitate and perhaps aggravate, excite and perhaps incite. for today, i want to propose that the dark side of the digital humanities is its bright side, its alleged promise: its alleged promise to save the humanities by making them and their graduates relevant, by giving their graduates technical skills that will allow them to thrive in a difficult and precarious job market.
speaking partly as a former engineer, this promise strikes me as bull: knowing gis (geographic information systems) or basic statistics or basic scripting (or even server-side scripting) is not going to make english majors competitive with engineers or cs (computer science) geeks trained here or increasingly abroad. (*straight up programming jobs are becoming increasingly less lucrative.*) but let me be clear: my critique is not directed at dh per se. dh projects have extended and renewed the humanities and revealed that the kinds of critical thinking (close textual analysis) that the humanities have always been engaged in are and have always been central to crafting technology and society. dh projects such as feminist dialogues in technology, a distributed online cooperative course that will be taught in fifteen universities across the globe, and other similar courses that use technology not simply to disseminate but also to cooperatively rethink and regenerate education on a global scale: these projects are central. in addition, the humanities should play a big role in big data, not simply because we’re good at pattern recognition (because we can read narratives embedded in data) but also, and more importantly, because we can see what big data ignores. we can see the ways in which so many big data projects, by restricting themselves to certain databases and terms, shine a flashlight under a streetlamp.

i also want to stress that my sympathetic critique is not aimed at the humanities, but at the general euphoria surrounding technology and education. that is, it takes aim at the larger project of rewriting political and pedagogical problems into technological ones, into problems that technology can fix.
this rewriting ranges from the idea that moocs (massive open online courses), rather than a serious public commitment to education, can solve the problem of the spiraling costs of education (moocs that enroll but don’t graduate; moocs that miss the point of what we do, for when lectures work, they work because they create communities, because they are, to use benedict anderson’s phrase, “extraordinary mass ceremonies”) to the blind embrace of technical skills. to put it as plainly as possible: there are a lot of unemployed engineers out there, from forty-something assembly programmers in silicon valley to young kids graduating from community colleges with cs degrees and no jobs. also, there’s a huge gap between industrial skills and university training. every good engineer has to be retaught how to program; every film graduate, retaught to make films.

my main argument is this: the vapid embrace of the digital is a form of what lauren berlant has called “cruel optimism.” berlant argues, “[a] relation of cruel optimism exists when something you desire is actually an obstacle to your flourishing” ( ). she emphasizes that optimistic relations are not inherently cruel, but become so when “the object that draws your attachment actively impedes the aim that brought you to it initially.” crucially, this attachment is doubly cruel “insofar as the very pleasures of being inside a relation have become sustaining regardless of the content of the relation, such that a person or world finds itself bound to a situation of profound threat that is, at the same time, profoundly confirming” ( ). so, the blind embrace of dh (*think here of stanley fish’s “the old order changeth”*) allows us to believe that this time (once again) graduate students will get jobs. it allows us to believe that the problem facing our students and our profession is a lack of technical savvy rather than an economic system that undermines the future of our students.
as berlant points out, the hardest thing about cruel optimism is that, even as it destroys us in the long term, it sustains us in the short term. dh allows us to tread water: to survive, if not thrive. (*think here of the ways in which so many dh projects and jobs depend on soft money and the ways in which dh projects are often—and very unfairly—not counted toward tenure or promotion.*) it allows us to sustain ourselves and to justify our existence in an academy that is increasingly a sinking ship.     the humanities are sinking—if they are—not because of their earlier embrace of theory or multiculturalism, but because they have capitulated to a bureaucratic technocratic logic. they have conceded to a logic, an enframing (*to use heidegger’s term*), that has made publishing a question of quantity rather than quality, so that we spew forth mpus or minimum publishable units; a logic, an enframing, that can make teaching a burden rather than a mission, so that professors and students are increasingly at odds; a logic, an enframing, that has divided the profession and made us our own worst enemies, so that those who have jobs for life deny jobs to others—others who have often accomplished more than they (than we) have. the academy is a sinking ship—if it is—because it sinks our students into debt, and this debt, generated by this optimistic belief that a university degree automatically guarantees a job, is what both sustains and kills us. this residual belief/hope stems from another time, when most of us couldn’t go to university, another time, when young adults with degrees received good jobs not necessarily because of what they learned, but because of the society in which they lived. now, if the bright side of the digital humanities is the dark side, let me suggest that the dark side—what is now considered to be the dark side—may be where we need to be. the dark side, after all, is the side of passion. 
the dark side, or what has been made dark, is what all that bright talk has been turning away from (critical theory, critical race studies—all that fabulous work that #transformdh is doing). this dark side also entails taking on our fears and biases to create deeper collaborations with the sciences and engineering. it entails forging joint (frictional and sometimes fractious) coalitions to take on problems such as education, global change, and so on. it means realizing that the humanities don’t have a lock on creative or critical thinking and that research in the sciences can be as useless as research in the humanities—and that this is a good thing. it’s called basic research. it also entails realizing that what’s most interesting about the digital in general is perhaps not what has been touted as its promise, but rather, what’s been discarded or decried as its trash. (*think here of all those failed dh tools, which have still opened up new directions.*) it entails realizing that what’s most interesting is what has been discarded or decried as inhuman: rampant publicity, anonymity, the ways in which the internet vexes the relationship between public and private, the ways it compromises our autonomy and involves us with others and other machines in ways we don’t entirely know and control. (*think here of the constant and promiscuous exchange of information that drives the internet, something that is usually hidden from us.*) as natalia cecire has argued, dh is best when it takes on the     humanities, as well as the digital. maybe, just maybe, by taking on the inhumanities, we’ll transform the digital as well. thank you. the sections in asterisks are either points implied in my visuals or in the talk, which i have elaborated upon in this written version. 
part 2: the digital humanities as chiaroscuro (rhody)

taking as a point of departure your thoughtful inversion of the “bright” and “dark” sides of the digital humanities, i want to begin by revisiting the origin of those terms as they are born out of rhetoric surrounding the mla annual convention, when academic and popular news outlets seemed first to recognize digital humanities scholarship and, in turn, to celebrate it against a dreary backdrop of economic recession and university restructuring. most frequently, such language refers to william pannapacker’s chronicle of higher education blog post on december , , in which he writes:

amid all the doom and gloom of the mla convention, one field seems to be alive and well: the digital humanities. more than that: among all the contending subfields, the digital humanities seem like the first “next big thing” in a long time, because the implications of digital technology affect every field. i think we are now realizing that resistance is futile. one convention attendee complained that this mla seems more like a conference on technology than one on literature. i saw the complaint on twitter. (“mla”)

of course, pannapacker’s relationship to digital humanities has changed since his first post. in a later chronicle blog entry regarding the mla annual convention, pannapacker walked back his earlier characterization of the digital humanities, explaining: “i regret that my claim about dh as the nbt, which i meant in a serious way, has become a basis for a rhetoric that presents it as some passing fad that most faculty members can dismiss or even block when dh’ers come up for tenure” (“come-to-dh”). unfortunately for the public’s perception of digital humanities, the provocativeness of pannapacker’s earlier rhetoric continues to receive much more attention than the retractions he has written since.
in , though, pannapacker was reacting to the “doom and gloom” with which a december new york times article set the stage for the mla annual convention by citing dismal job prospects for phd graduates. the times article begins with a sobering statistic: “faculty positions will decline percent, the biggest drop since the group began tracking its job listings years ago” (lewin). pannapacker, though, wasn’t the first one who called digital humanities a “bright spot.” that person was laura mandell, in her post on the armstrong institute for interactive media studies (aims) blog on january , , just following the conference: “digital humanities made the news: these panels were considered to be the one bright spot amid ‘the doom and gloom’ of a fallen economy, a severely depressed job market, and the specter of university-restructuring that will inevitably limit the scope and sway of departments of english and other literatures and languages” (“digital”). in neither her aims post nor in her mla paper does mandell support a “vapid embrace of the digital” or champion digital humanities as a solution to the sense of doom and gloom in the academy. rather, in both, mandell candidly and openly contends with one of the greatest challenges to digital humanities work: collaboration. the “brightness” surrounding digital humanities at the mla convention was based on the observation that dh and media studies panels drew such high attendance because they focused on long-standing, unresolved issues not just for digital humanities but for the study of literature and language at large. for example, in mandell’s session, “links and kinks in the chain: collaboration in the digital humanities”—a session presided over by tanya clement (university of maryland, college park) and that also included jason b. 
jones (central connecticut state university), bethany nowviskie (neatline, university of virginia), timothy powell (ojibwe archives, university of pennsylvania), and jason rhody (national endowment for the humanities [neh])—presenters addressed the challenges and cautious optimism that scholarly collaboration in the context of digital humanities projects requires. liz losh’s reflections on the panel recall a perceived consensus that collaboration is hard enough that one might be tempted to write it off as a fool’s errand, as nowviskie’s tongue-in-cheek use of an image titled “the ministry of silly walks” (borrowed from a monty python skit) implied. but neither nowviskie’s nor mandell’s point was to stop trying; quite the opposite, their message was that collaboration takes hard work, patience, revisions to existing assumptions about academic status, and a willingness to compromise when the stakes feel high. as mandell recalls in her post: “[m]y deep sense of it is that we came to some conclusions (provisional, of course). digital     humanists, we decided, are concerned to protect the openness of collaboration and intellectual equality of participants in various projects while insuring the professional benefits for those contributors whose positions within academia are not equal (grad students, salaried employees, professors)” (“digital”). that is a tall order, especially because digital humanities scholarship unsettles deeply rooted institutional beliefs about how humanists do research. if the digital humanities in seemed “bright,” it was in large part because it refocused collective attention around issues that vexed not just digital humanists but their inter-/ trans-/ multi-disciplinary peers, those julia flanders is noted for having called “hybrid scholars,” a term not limited to digital humanists. 
furthermore, across the twenty-seven sessions at the conference that might be considered digital humanities or media studies related, most addressed, at least in a tangential way, issues related to working across institutional barriers. in other words, the bright optimism of for digital humanists was not that of economic recovery, employment solutions, and technological determinism, but of consensus building and renewed attention to long-standing institutional barriers. one takeaway from the mla panels is also a collective sense of strangeness in claiming “digital humanities” as a name when it draws together such a diversity of humanities scholars with so many different research agendas under a common title—an unease that, perhaps, may be attributed to the chosen theme of the digital humanities conference, “big tent digital humanities.” what the four years since the “links and kinks” panel have proven is that its participants were right: collaboration, digital scholarship, and intellectual equality are really hard, and no, we haven’t come up with solutions to those challenges yet. reorienting the bright side/dark side debate away from the provocativeness of its media hype and back toward the spirit of creating consensus around long-standing humanities concerns, i would like to suggest that the “dark side” of digital humanities is that we are still struggling with issues that we began calling attention to even earlier than : effectively collaborating within and between disciplines, institutions, and national boundaries; reorienting a deeply entrenched academic class structure; recovering archival silences; and building a freer, more open scholarly discourse.
consequently, a distorted narrative that touts digital humanities as a “bright hope” for overcoming institutional, social, cultural, and economic challenges has actually made it harder for digital humanities to continue acting as a galvanizing force among hybrid scholar peers and to keep the focus on shared interests because such rhetoric falsely positions digital humanities and the “rest” of humanities as if they’re in opposition to one another.

dh and technological determinism

moving beyond the “bright/dark” dichotomy is in part complicated by the popular complaint first leveled against digital humanities at the mla conference that “resistance is futile” and that the convention seemed to be more about technology than literature (see pannapacker, “mla,” above). setting aside the problematic opposition between “technology” and “literature” that pannapacker’s unnamed source makes, the early euphoria over digital humanities that you call attention to in your talk is frequently linked to a sense that digital humanists have fallen victim to a pervasive technological determinism. the rhetoric of technological determinism, however, more often comes from those who consciously position themselves as digital humanities skeptics—which is in stark contrast to how early adopters in the humanities approached technology. in , early technology adopters like dan cohen, neil fraistat, alan liu, allen renear, roy rosenzweig, susan schreibman, martha nell smith, john unsworth, and others didn’t encourage students to learn html (hypertext markup language), sgml (standard generalized markup language), or tei (text encoding initiative) so they could get jobs. they did it, in large part, so students could understand the precarious opportunity that the world wide web afforded scholarly production and communication.
open, shared standards could ensure a freer exchange of ideas than proprietary standards, and students developed webpages to meet multiple browser specifications so that they could more fully appreciate how delicate, how rewarding, and how uncertain publishing on the web could be in an environment where netscape and microsoft internet explorer sought to corner the market on web browsing. reading lists and bibliographies in those early courses drew heavily from the textual studies scholarship of other early adopters such as johanna drucker, jerome mcgann, morris eaves, and joseph viscomi, whose work had likewise long considered the material economies of knowledge production in both print and digital media.

consider the cautious optimism that characterizes roy rosenzweig and dan cohen’s introduction to digital history, which begins with a chapter titled “promises and perils of digital history”:

we obviously believe that we gain something from doing digital history, making use of the new computer-based technologies. yet although we are wary of the conclusions of techno-skeptics, we are not entirely enthusiastic about the views of the cyber-enthusiasts either. rather, we believe that we need to critically and soberly assess where computer networks and digital media are and aren’t useful for historians—a category that we define broadly to include amateur enthusiasts, research scholars, museum curators, documentary filmmakers, historical society administrators, classroom teachers, and history students at all levels [. . .]. doing digital history well entails being aware of technology’s advantages and disadvantages, and how to maximize the former while minimizing the latter.
( )

in other words, digital history, and by extension digital humanities, grew out of a thoughtful and reflective awareness of technology’s potential, as well as its dangers, and not a “vapid embrace of the digital.” moreover, the earliest convergence between scholars of disparate humanities backgrounds coalesced most effectively and openly in resistance to naive technological determinism. anxiety, however, creeps into conversations about digital humanities with phrases like “soon it won’t be the digital humanities [. . .] it will just be the humanities.” used often enough that citing every occasion would be impossible, such a phrase demonstrates and fuels a fear that methods attributed to digital humanities will soon be the only viable methods in the field, and that’s simply not true. and yet, unless there is a core contingent of faculty who continue to distribute their work in typed manuscripts and consult print indexes of periodicals that i don’t know about, everyone is already a digital humanist insofar as it is a condition of contemporary research that we must ask questions about the values, technologies, and economies that organize and redistribute scholarly communication—and that is and always has been a fundamental concern within the field of digital humanities since before it adopted that moniker and was called merely “humanities computing.”

dh and moocs

related to concerns over technological determinism is an indictment that digital humanities has given way to a “vapid embrace of the digital” as exemplified by universities’ recent love affair with moocs. you describe the moocification of higher education very well as the desire to “rewrit[e] political and pedagogical problems into technological ones, into problems that technology can fix. this rewriting ranges from the idea that moocs, rather than a serious public commitment to education, can solve the problem of the spiraling cost of education [. . .]
to the blind embrace of technological skills.” digital humanists who have dared to tread on this issue most often do so with highly qualified claims that higher education, too, requires change. for example, edward ayers’s article in the chronicle, “a more-radical online revolution,” contends that if an effective online course is possible, it is only so when the course reorients its relationship to what knowledge production and learning really are. he points out that technology won’t solve the problem, but learning to teach better with technology might help. those two arguments are not the same. the latter acknowledges that we have to make fundamental changes in the way we approach learning in higher education—changes that most institutions celebrating and embracing moocs are unwilling to commit to by investing in human labor. in solidarity with ayers’s cautious optimism are those like cathy davidson, who has often made the point that moocs are popular with university administrators because they are the least disruptive to education models that find their roots in the industrial revolution—and conversely this is why most digital humanists oppose them.

dh and funding

another challenge presented by the specter of media attention to the field of digital humanities has been the perception that it draws on large sums of money otherwise inaccessible to the rest of humanities researchers. encapsulating the “cruel optimism” you identify as described by lauren berlant, hopeful academic administrations may once have seen digital humanities research as having access to seemingly limitless pools of money— an assumption that creates department and college resentments. but there’s a reality check that needs to happen, both on the part of hopeful administrations and on the part of frustrated scholars: funding overall is scarce. period. humanists are not in competition with digital humanists for funding: humanists are in competition with everyone for more funding.
for example, since , the national endowment for the humanities (neh) budget has been reduced by percent. in its appropriations request for fiscal year , the neh lists the office of digital humanities (odh) actual budget at $ , , . in other words, odh—the neh division charged with funding digital research in the humanities—controls the smallest budget of any division in the agency by a margin of $ to million (national endowment ; see table at the end of this article). since most grants from odh are institutional grants as opposed to individual grants (such as fellowships or summer stipends), a substantive portion of each odh award is absorbed by the sponsoring institution in order to offset “indirect costs.” when digital humanities centers and their institutions send out celebratory announcements about how they just received a grant for a digital humanities project for x number of dollars, only a fraction of that money actually goes to directly support the project in question. anywhere between to percent of digital humanities grant funds are absorbed by the institution to “offset” what are also referred to as facilities and administrative—f&a—costs, or overhead. indirect cost rates are usually negotiated once each year between the individual academic institutions and a larger federal agency (think department of defense, environmental protection agency, national institutes of health, national aeronautics and space administration, or department of the navy), and they are presumably used to support lab environments for stem-related disciplines (science, technology, engineering, and mathematics). whatever the negotiated cost rate at each institution, that same rate is then applied to all other grant recipients from the same institution who receive federal funds regardless of discipline.
while specialized maintenance personnel, clean rooms, security, and hazard insurance might be necessary to offset costs to the institution to support a stem-related research project, it is unclear to what extent digital humanities projects benefit from these funds. thus, while institutions are excited to promote, publicize, and even support digital humanities grant applications (bright side), that publicity simultaneously casts long shadows obscuring from public view the reality that the actual dollar amount that goes directly to support dh projects is significantly reduced. if we really wanted to get serious about exploring the shadows of digital humanities research, we might begin by asking probative questions about where those indirect costs go and how they are used. in fact, as christopher newfield points out in “ending the budget wars: funding the humanities during a crisis in higher education,” more of us humanists should be engaging in a healthy scrutiny of our institution’s budgets. newfield points out that academic administrations have been milking humanities departments for quite a long time without clear indication of where income from humanities general education courses actually goes:

first we must understand that though the humanities in general and literary studies in particular are poor and struggling, we are not naturally poor and struggling. we are not on a permanent austerity budget because we don’t have the intrinsic earning power of the science and engineering fields and aren’t fit enough to survive in the modern university. i suggest, on the basis of a case study, that the humanities fields are poor and struggling because they are being milked like cash cows by their university administrations. the money that departments generate through teaching enrollments that the humanists do not spend on their almost completely unfunded research is routinely skimmed and sent elsewhere in the university.
as the current university funding model continues to unravel, the humanities’ survival as national fields will depend on changing it. ( )

lack of clarity about where money absorbed by academic institutions as indirect costs ends up is linked to a much wider concern about whether or not humanities departments really should be as poor and struggling as they are. here is an opportunity in which we could use the so-called celebrity status of digital humanities to cast new light on the accounting, budgeting, and administrating of humanities colleges in general to the benefit of faculty and researchers regardless of their research methods.

dh and collaboration

the topic of money, however, returns us to the complicated constellation of issues that accompany collaboration. barriers to collaboration, as mandell, nowviskie, powell, jones, and rhody discussed in , are less a matter of fear or bias against collaborating with the sciences or engineering than they might have been in the past. as it turns out, though, collaboration across institutional boundaries is hard because financing it is surprisingly complex and often insufficient. in , the digging into data challenge announced its first slate of awardees. combining the funds and efforts of four granting agencies (jisc [joint information systems committee], neh, nsf [national science foundation], and sshrc [social sciences and humanities research council]), digging into data grants focused on culling resources, emphasizing collaboration, and privileging interdisciplinary research efforts—all valuable and laudable goals. in a follow-up report (unfortunately named) one culture: computationally intensive research in the humanities and social sciences: a report on the experiences of first respondents to the digging into data challenge, however, participants identify four significant challenges to their work: funding, time, communication, and data (williford and henry).
in other words, just about everything it takes to collaborate presents challenges. the question is, though, what have we been able to do to change this? how well have we articulated these issues to those who don’t call themselves digital humanists in ways that make us come together to advocate for better funding for all kinds of humanities research, rather than constantly competing with one another to grab a bigger piece of a disappearing pie? the frustrating part in all of this is that we know collaboration is hard. we want to bridge communities within the humanities, across to social science and stem disciplines, and even across international, cultural, and economic divides. unless we really set to work on deeper issues like revising budgets, asking pointed questions about indirect cost rates, and figuring out how to communicate across disciplines, share data, and organize our collective time, four years from now we will still be asking the same questions.

dh and labor

finally, there are other “shadows” in the academy where digital humanists have been hard at work. while no one in the digital humanities really believes that technical skills alone will prepare anyone for a job, important work by digital humanists has helped reshape the discourse around labor and employment in academia. for example, tanya clement and dave lester’s neh-funded white paper “off the tracks: laying new lines for digital humanities scholars” brought together digital humanities practitioners to consider career trajectories for humanities phds employed to do academic work in nontenure, often contingent university positions. similarly, groups such as dh commons, an initiative supported by a coalition of digital humanities centers called centernet, put those interested in technology and the humanities in contact with other digital humanities practitioners through shared interests and needs.
“alt-academy,” a mediacommons project, invites, publishes, and fosters dialogue about the opportunities and risks of working in academic posts other than traditional tenure-track jobs.

while none of these projects could be credited with “finding jobs” for phds, per se, they are demonstrations of the ways digital humanities practitioners have made academic labor a central issue to the field. worth noting: all of these projects have come to fruition since and in response to concerns about labor issues, recognition, and credit in a stratified academic class structure. and yet, none of these approaches on their own are solutions. there are still more people in digital humanities who are in contingent, nontenure-track positions than there are in tenure-track posts. a heavy reliance on soft funding continues to fuel an academic class structure in which divisions persist between tenure-track and contract faculty and staff— divisions that seem to be reinscribed along lines of gender and race difference. as long as these divisions of labor remain unsatisfactorily addressed, they promise to dim the light of a field that espouses the value of “intellectual equality” (mandell). even though recent efforts by the scholarly communication institute (sci) (an andrew w. mellon foundation–supported initiative) have not answered long-standing questions of contingent academic labor and placement of recent phds in the humanities, efforts to survey current alternative academic (alt-ac) professionals and to build a network of digital humanities graduate programs through the praxis network constitute important steps toward addressing these widely acknowledged problems across a spectrum of humanities disciplines.
as a field, digital humanities has not promised direct avenues to tenure-track jobs or even alt-ac ones; however, digital humanities is a community of practice that, born out of an era of decreasing tenure-track job openings and rhetoric about the humanities in crisis, has worked publicly to raise awareness and improve dialogue that identifies, recognizes, and rewards intellectual work by scholars operating outside traditional tenure-track placements.

dh silences and shadows

i agree that what is truly bright about the digital humanities is that it has drawn from passion in its critical, creative, and innovative approaches to persistent humanities questions. for example, i look at the work of lauren klein, whose mla paper was one of four that addressed the archival silences caused by slavery. klein’s paper responded directly to alan liu’s call to “reinscribe cultural criticism at the center of digital humanities work” (“where is?”). her computational methods explore the silent presence of james hemings in the archived letters of thomas jefferson:

to be quite certain, the ghost of james hemings means enough. but what we can do is examine the contours that his shadow casts on the jefferson archive, and ask ourselves what is illuminated and what remains concealed. in the case of the life—and death—of james hemings, even as we consider the information disclosed to us through jefferson’s correspondence, and the conversations they record—we realize just how little about the life of james hemings we will ever truly know. (“report”)

klein proposes one possible way in which we might integrate race, gender, and postcolonial theory with computer learning to develop methodologies for performing research in bias-laden archives, whereby we can expose and address absences.
still, while we have become more adept at engaging critical theory and computation in our scholarship, we have spent little of that effort constructing an inclusive, multivalent, diverse, and self-conscious archive of our own field as it has grown and changed. the shadows and variegated terrain of the digital humanities, this odd collection of “hybrid scholars,” is much more complicated, as one might expect, than the bright/dark binary by which it is too often characterized. recovering the histories of dh has proven complicated. jacqueline wernimont made this point famously well in a paper she delivered at dh and in a forthcoming article in digital humanities quarterly (dhq). wernimont explains that characterizing any particular project as feminist is difficult to do: “the challenges arise not from a lack of feminist engagement in digital humanities work, quite the opposite is true, but rather in the difficulty tracing political, ideological, and theoretical commitments in work that involves so many layers of production.” put simply: the systems and networks from which dh projects arise are wickedly complex. perhaps a bit more contentiously: the complexity of those networks has enabled narratives of digital humanities to evolve that elide feminist work that has been foundational to the field. wernimont’s claim runs contrary to the impulse to address through provocation the sobering challenges that confront the digital humanities. rather than claiming that “no feminist work has been done in dh,” wernimont engages productively with the multifaceted work conditions that have led to our understanding of the field. as you suggest at the tail end of your talk, we often claim to “celebrate failures,” but it is unclear to what extent we follow through on that intent. despite john unsworth’s insistence in “documenting the reinvention of text: the importance of failure” that we make embracing failure a disciplinary value, we very rarely do it. 
consequently, we have riddled our discipline’s own archive with silences about our work process, our labor practices, our funding models, our collaborative challenges, and even our critical theory. as a result, we have allowed the false light of a thriving field alive with job opportunities, research successes, and technological determinism to seep into those holes. in other words, we have not done what we as humanists should know better than to do: we have not told our own story faithfully. even so, recent events have demonstrated important steps to improving transparency in digital humanities. this summer at the dh conference, quinn dombrowski did what few scholars are willing or bold enough to do. she exposed a project’s failure in a talk titled, “whatever happened to project bamboo?” dombrowski recounted the challenges faced by an andrew w. mellon–funded cyberinfrastructure project between and . tellingly, when you go to the project’s website, there is no discussion of what happened to it—whether or not it met its goals, or why, or even what institutions participated in it. there is a “documentation wiki” where visitors might review the archived project files, an “issue tracker,” and a “code repository.” there is even a link to the “archive” copy of the website as it existed during its funding cycle. that is it. in the face of this silence, dombrowski provided a voice for what might be seen as the project’s failure to begin hashing through the difficulties of collaboration and the dangers of assuming what humanists want before asking them. dombrowski’s paper was welcomed by the community and celebrated as a necessary contribution to our scholarly communication practices.
significantly, many dh projects, particularly those that receive federal funding, do have outlets for discussing their processes, management, and decisions; however, these scholarly and reflective documents are often published in places where those starting out in digital humanities are unlikely to find them. white papers, grant narratives, and project histories— informally published scholarship called gray literature—discuss significant aspects of digital humanities research, such as rationales for staffing decisions, technology choices, and even the critical theories that are foundational to a project’s development. still, gray literature is often stored or published on funders’ websites or in institutional repositories. occasionally, though less frequently, white papers may be published on a project’s website. since these publications reside outside a humanist’s usual research purview, they are less likely to be found or used by scholars new to the field. in her essay “let the grant do the talking,” sheila brennan suggests that wider circulation of these materials would prove an important contribution to scholarship: “one way to present digital humanities work could be to let grant proposals and related reports or white papers do some of the talking for us, because those forms of writing already provide intellectual rationales behind digital projects and illustrate the theory in practice.” brennan continues by explaining that grant proposals are often heavily scrutinized by peer reviewers and provide detailed surveys of existing resources. most federal funders require white papers that reflect upon the nature of the work performed during the grant when the grant period is over, all of which are made available to the public. while the nature of the writing differs from what one might find in a typical journal article, grant proposals and white papers address general humanities audiences.
that means a body of scholarly writing already exists that addresses the history, composition, and development of a sizeable portion of digital humanities work. the challenge resides in making this writing more visible to a broader humanities audience. although we still have work to do to continue filling in the archival silences of digital humanities, i believe that it is a project worth the work involved. eschewing the impulse to draw stark contrasts between digital humanities and the rest of the humanities, choosing instead to delve into the complex social, economic, and institutional pressures that a “technological euphoria” obscures represents a promising way ahead for humanists—digital and otherwise.

part shadows in the archive (chun)

first, thank you for an excellent and insightful response, for the ways you historicize the “bright side” rhetoric, take on the challenges of funding, and elaborate on what you find to be dh’s dark side: your points about the silences about dh’s work process, its labor practices, funding models, collaborative challenges, and critical theory are all profound. further, your move from bright/dark to shadows is inspiring. by elaborating on the work done by early adopters and younger scholars, you show how digital humanists do not engage in a “vapid embrace of the digital.” you show that the technological determinists rather than the practicing digital humanists are the detractors (and i would also insert here supporters). indeed, if any group would know the ways in which the digital humanities do not guarantee everything they are hyped to do, it is those who have for many years worked under the rubric of “humanities computing.” as liu has so pointedly argued, they have been viewed for years as servants rather than masters (“where is”).
they know intimately the precariousness of soft money projects, the difficulty of being granted tenure for preparing rather than interpreting texts, and the ways in which teaching students mark-up languages hardly guarantees them jobs. for all these reasons, the “bright side” rhetoric is truly baffling—unless, of course, one considers the institutional framework within which the digital humanities has been embraced. as you point out, it has not given institutions the access to the limitless pools of money they once hoped for, but it has given them access to indirect cost recovery—something that very few humanities projects provide. it also gives them a link to the future. as william gibson, who coined the term “cyberspace” before he had ever used a computer, once quipped, “[t]he future is already here—it’s just not evenly distributed.” the cruel optimism i describe is thus a “vapid embrace of the digital” writ large, rather than simply an embrace of the digital humanities. one need only think back to the mid- s when the internet became a mass medium after its backbone was sold to private corporations and to the rhetoric that surrounded it as the solution to all our problems, from racial discrimination to inequalities in the capitalist marketplace, from government oversight to the barriers of physical location. and as you note, this embrace is most pointed among those on the outside: soon after most americans were on the internet, the television commercials declaring the internet the great equalizer disappeared. 
stanley fish’s “the old order changeth” compares dh to theory, stating, “[o]nce again, as in the early theory days, a new language is confidently and prophetically spoken by those in the know, while those who are not are made to feel ignorant, passed by, left behind, old.” yet, your discussion of what you see as the dark side—that, because of dhers’ silences, “[w]e have allowed the false light of a thriving field alive with job opportunities, research successes, and technological determinism to seep into those holes”—made me revisit berlant again and in particular her insistence that cruel optimism is doubly cruel because it allows us to be “bound to a situation of profound threat that is, at the same time, profoundly confirming” ( ). it is the confirmation—the modes of survival—that generate pleasure and make cruel optimism so cruel. also, as berlant emphasizes, optimism is not stupid or simple, for “often the risk of attachment taken in its throes manifests an intelligence beyond rational calculation” ( ). given the institutional structures under which we work, i find your call for dhers to tell their own story faithfully to be incredibly important and, i think also, incredibly difficult.

rather than focus on dh, though, i want to return to the broadness of my initial analysis and your response. i was serious when i stated that my comments were not directed toward dh per se, but rather toward the technological euphoria surrounding the digital, a euphoria that makes political problems into ones that technology can solve. here, i think the problem we face is not the “crisis in the humanities” or the divide between humanists and digital humanists, but rather the defunding of universities, a defunding to which universities have responded badly. i remember a former administrator at brown once saying: “[w]e are in the business of two things: teaching and research.
both lose money.” his point was that viewing research simply as a way to generate revenue (“indirect costs”) overlooks the costs of doing “big” research; his point was also that the university was in the business not of making money, but of educating folk. grasping for ever-diminishing sums of grant money to keep universities going—a grasping that also entails a vast expenditure in start-up funds, costs for facilities, and so on, arguably available to only a small number of already elite universities—is a way to tread water for a while but is unsustainable. we see the unsustainability of this clearly in the recent euphoria around moocs, which are not, as you point out, embraced by the dh community even as they are increasingly defining dh in the minds of many. they are sexy in a way that zotero is not and bamboo was not. moocs are attractive for many reasons, not least in terms of their promise (and i want to stress here that it is only a promise—and that promises and threats, as derrida has argued, have the same structure) to alleviate the costs of getting a college degree. but why and how have we gotten here? and would students such as my younger self, educated in canada in the s, have found moocs so attractive? as i stressed at the mla, the problem is debt: the level of student debt is unsustainable, as are the ways universities are approaching the problem of debt by acquiring more of it (a problem, i realize, that affects most institutions and businesses in the era of neoliberalism). the problem is also the strained relationship between education and employment. to repeat a few paragraphs from that talk:

the humanities are sinking—if they are—not because of their earlier embrace of theory or multiculturalism, but because they have capitulated to a bureaucratic technocratic logic.
they have conceded to a logic, an enframing (to use heidegger’s term), that has made publishing a question of quantity rather than quality, so that we spew forth mpus or minimum publishable units; a logic, an enframing, that can make teaching a burden rather than a mission, so that professors and students are increasingly at odds; a logic, an enframing, that has divided the profession and made us our own worst enemies, so that those who have jobs for life deny jobs to others—others who have often accomplished more than they (than we) have. the academy is a sinking ship—if it is—because it sinks our students into debt, and this debt, generated by this optimistic belief that a university degree automatically guarantees a job, is what both sustains and kills us. this residual belief/hope stems from another time, when most of us couldn’t go to university, another time, when young adults with degrees received good jobs not necessarily because of what they learned, but because of the society in which they lived. we—and i mean this “we” broadly—have not been good at explaining the difference between being educated and getting a job. a college degree does not guarantee a job; if it did in the past, it was because of demographics and discrimination (in the broadest sense of the term). one thing we can do is to explain to students this difference and to tell them that they need to put the same effort into getting a job that they did into getting into college. to help them, we have not only to alert them to internships and job fairs but also to encourage them to take risks, to expand the courses they take in university and to view challenging courses as rewarding. i cannot emphasize enough how much i learned—even unintentionally—from doing both systems design engineering and english literature as an undergraduate: combined, they opened up new paths of thinking and analyzing with which i’m still grappling.
another thing we can do is address, as you so rightly underscore, how the university spends money. most importantly, we need to take on detractors of higher education not by conceding to the rhetoric of “employability,” but by arguing that the good (rather than goods) of the university comes from what lies outside of immediate applicability: basic research that no industrial research center would engage in, the cultivation of critical practices and thinking that make us better users and producers of digital technologies and better citizens. i want to emphasize that this entails building a broad coalition across all disciplines within the university. the sciences can not only be as useless as the humanities, they can also be as invested in remaining silent and bathing in the false glow of employability and success as some in the dh. as i mentioned in the mla talk, there are students who graduate from the sciences and cannot find jobs; the sciences are creative and critical; the sciences, of all the disciplines, are most threatened by moocs. we need to build coalitions, rather than let some disciplines be portrayed as “in crisis,” so that ours, we hope, can remain unscathed. to live by the rhetoric of usefulness and practicality—of technological efficiency—is also to die by it. think of the endlessness of debates around global climate change, debates that are so endless in part because the probabilistic nature of science can never match its sure rhetoric. what i also want to emphasize is that these coalitions will be fractious. there will be no consensus, but, inspired by the work of anna tsing, i see friction as grounding, not detracting from, political action. these coalitions are also necessary to take on challenges facing the world today, such as the rise of big data.
again, not because they are inherently practical, but rather, because they can take on the large questions raised by it, such as: given that almost any correlation can be found, what is the relationship between correlation and causality? between what’s empirically observable and what’s true?

i want to end by thinking again of berlant’s call for “ambient citizenship” as a response to cruel optimism and lauren klein’s really brilliant work, which you cite and which i—along with my coeditors tara mcpherson and patrick jagoda—am honored to publish as part of a special issue of american literature on new media and american literature (“image”). berlant ends cruel optimism by asking to what extent attending to ambient noise could create forms of affective attachment that can displace those that are cruelly optimistic. these small gestures would attend to noises and daily gestures that surround us rather than to dramatic gestures that too quickly become the site of new promises (although she does acknowledge that ambient citizenship resonates disturbingly with george w. bush’s desire to “get rid of the filter”). ambient citizenship would mean attending to things like teaching: teaching, which is often accomplished not by simply relaying information (this is the mooc model), but through careful attention to the noises in and dynamics of the classroom. i also wonder how this notion of ambient citizenship can be linked to klein’s remarkable work discovering the contours of james hemings in the letters of thomas jefferson. jefferson, as klein notes, was meticulous about documentation and was very much aware of leaving an archive for history. searching for “information” about hemings, his former slave and chef, though, is extremely difficult, and reducing the lives of slaves to lists and accounts—to the signals that remain—is unethical. drawing from the work of saidiya hartman and stephen best, klein uses dh tools to trace the ghost, the lingering presence, of hemings.
she uses these tools to draw out the complexity of relations between individuals across social groups. resisting the logic of and ethic of recovery, she makes the unrecorded story of hemings “expand with meaning and motion.” she also, even as she uses these tools, critiques visualization as “the answer,” linking the logic of visualization to jefferson’s uses of it to justify slavery. klein’s work epitomizes how dh can be used to grapple with the impossible, rather than simply usher in the possible. i think that her work—and some other work in dh—by refusing the light and the dark, reveals the ways in which the work done by the union of the digital and the humanities (a union that is not new, but rich in history) will not be in the clearing (to refer to heidegger), but rather, as you suggest, in the shadows.

[table: fy appropriation request ($ in thousands). columns: fy approp.; fy estimate*; fy request. rows: bridging cultures; education programs; federal/state partnership; preservation and access; public programs; research programs; digital humanities; we the people (no amount in the request column); program development; subtotal; challenge grants; treasury funds; subtotal; administration; total.]
*this column reflects fy annualized funding, including a . % increase as provided by the fy continuing appropriations resolution, p.l. - .
neh.gov

wendy hui kyong chun is professor and chair of modern culture and media at brown university. she has studied both systems design engineering and english literature, which she combines and mutates in her current work on digital media. she is the author of programmed visions: software and memory (massachusetts institute of technology press, ) and control and freedom: power and paranoia in the age of fiber optics (massachusetts institute of technology press, ). she is working on a monograph titled “habitual new media.”

lisa marie rhody is research assistant professor at the roy rosenzweig center for history and new media at george mason university. her research employs advanced computational methods such as topic modeling to revise existing theories of ekphrasis—poetry to, for, and about the visual arts. she is editor of the journal of digital humanities and project manager for the institute of museum and library services’ (imls) signature conference, webwise.

notes

see “links and kinks in the chain: collaboration in the digital humanities” for an abstract of the mla convention panel.
for a list of the twenty-seven digital humanities and media studies sessions presented at the mla convention, see sample.
at the time, much media attention was devoted to the united states v. microsoft corporation antitrust case initiated in and settled by the united states department of justice in , which created a backdrop for ensuing conversations about open standards in humanities computing.
see john unsworth’s talk, “what is humanities computing and what is not?” for more along these lines.
indirect cost recovery started during world war ii and the era of big science: the government agreed to pay for the physical infrastructure needed for funded projects; private grant agencies—still a large source of funding for the humanities, often in the form of fellowships—routinely refuse to pay for these offsets.

works cited

anderson, benedict. imagined communities: reflections on the origin and the spread of nationalism. london: verso, .
ayers, edward l. “a more-radical online revolution.” chronicle of higher education feb. . http://chronicle.com/article/a-more-radical-online/ /.
berlant, lauren. cruel optimism. durham: duke up, .
brennan, sheila. “let the grant do the talking.” journal of digital humanities . (fall ). http://journalofdigitalhumanities.org/ - /let-the-grant-do-the-talking-by-sheila-brennan/ (accessed july ).
cecire, natalia. “theory and the virtues of digital humanities.” introduction. journal of digital humanities . (winter ). http://journalofdigitalhumanities.org/ - /introduction-theory-and-the-virtues-of-digital-humanities-by-natalia-cecire/ (accessed july ).
clement, tanya, and dave lester. “off the tracks: laying new lines for digital humanities scholars.” http://mith.umd.edu/wp-content/uploads/whitepaper_offthetracks.pdf (accessed july ).
davidson, cathy. “humanities . : promise, perils, predictions.” pmla . ( ): – .
dombrowski, quinn. “whatever happened to project bamboo?” conference paper. dh conference. july . university of nebraska–lincoln.
fish, stanley.
“the old order changeth.” new york times dec. . http://opinionator.blogs.nytimes.com/ / / /the-old-order-changeth/.
flanders, julia. “the productive unease of st-century digital scholarship.” digital humanities quarterly . ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html.
gibson, william. “the science in science fiction.” talk of the nation. npr nov. . http://www.npr.org/templates/story/story.php?storyid= .
klein, lauren f. “the image of absence: archival silence, data visualization, and james hemings.” american literature and new media. spec. issue of american literature . (dec. ): – .
klein, lauren f. “a report has come here.” lauren f. klein (blog). jan. . http://lmc.gatech.edu/~lklein / / / /a-report-has-come-here-social-network-analysis-in-the-papers-of-thomas-jefferson/.
lewin, tamar. “at colleges, humanities job outlook gets bleaker.” new york times dec. .
“links and kinks in the chain: collaboration in the digital humanities.” panel. modern languages association program archive dec. . http://www.mla.org/conv_listings_detail?prog_id= &year= .
liu, alan. “digital humanities and academic change.” english language notes (spring ): – . ebsco host (accessed dec. ).
liu, alan. “where is cultural criticism in the digital humanities?” alan liu. webpage. http://liu.english.ucsb.edu/where-is-cultural-criticism-in-the-digital-humanities/ (accessed july ).
losh, liz. “the ministry of silly walks.” virtualpolitik dec. . http://networkedblogs.com/p .
mandell, laura. “digital humanities: the bright spot.” aims jan. .
http://aims.muohio.edu/ / / /digital-humanities-the-bright-spot/.
national endowment for the humanities appropriations request for fiscal year . washington, dc. national endowment for the humanities, . http://www.neh.gov/files/neh_request_fy .pdf (accessed july ).
newfield, christopher. “ending the budget wars: funding the humanities during a crisis in higher education.” profession ( ): – . http://www.mlajournals.org/doi/pdf/ . /prof. . . . (accessed july ).
pannapacker, william. “the mla and the digital humanities.” chronicle of higher education dec. . http://chronicle.com/blogpost/the-mlathe-digital/ /.
pannapacker, william. “pannapacker at mla: the come-to-dh moment.” chronicle of higher education jan. . http://chronicle.com/blogs/brainstorm/pannapacker-at-the-mla- –the-come-to-dh-moment/ .
rosenzweig, roy, and dan cohen. digital history: a guide to gathering, preserving, and presenting the past on the web. philadelphia: u of pennsylvania p, .
sample, mark. “digital humanities sessions at the mla.” sample reality (blog). nov. . http://www.samplereality.com/ / / /digital-humanities-sessions-at-the- -mla/.
tsing, anna l. friction: an ethnography of global connection. new jersey: princeton up, .
unsworth, john. “documenting the reinvention of text: the importance of failure.” journal of electronic publishing . (dec. ). http://dx.doi.org/ . / . . (accessed july ).
unsworth, john. “what is humanities computing, and what is not?” http://computerphilologie.tu-darmstadt.de/jg /unsworth.html (accessed july ).
wernimont, jacqueline. “not (re)covering feminist methods in digital humanities.” jacqueline wernimont (blog). july .
http://jwernimont.wordpress.com/ / / /not-recovering-feminist-methods-in-digital-humanities.
williford, christa, and charles henry. one culture: computationally intensive research in the humanities and social sciences: a report on the experiences of first respondents to the digging into data challenge. washington, dc: clir, . http://www.clir.org/pubs/reports/pub (accessed july ).

association of community sanitation usage with soil-transmitted helminth infections among school-aged children in amhara region, ethiopia

research, open access

william e. oswald*, aisha e. p. stewart, michael r. kramer, tekola endeshaw, mulat zerihun, berhanu melak, eshetu sata, demelash gessese, tesfaye teferi, zerihun tadesse, birhan guadie, jonathan d. king, paul m. emerson, elizabeth k. callahan, matthew c. freeman, w. dana flanders, thomas f. clasen and christine l. moe

abstract

background: globally, in , approximately . billion people were infected with at least one species of soil-transmitted helminth (sth), ascaris lumbricoides, trichuris trichiura, hookworm (ancylostoma duodenale and necator americanus). infection occurs through ingestion or contact (hookworm) with eggs or larvae in the environment from fecal contamination. to control these infections, the world health organization recommends periodic mass treatment of at-risk populations with deworming drugs. prevention of these infections typically relies on improved excreta containment and disposal. most evidence of the relationship between sanitation and sth has focused on household-level access or usage, rather than community-level sanitation usage.
we examined the association between the proportion of households in a community with latrines in use and prevalence of sth infections among school-aged children.

methods: data on sth prevalence and household latrine usage were obtained during four population-based, cross-sectional surveys conducted between and in amhara, ethiopia. multilevel regression was used to estimate the association between the proportion of households in the community with latrines in use and presence of sth infection, indicated by > eggs in stool samples from children – years old.

results: prevalence of sth infection was estimated as % ( % ci: – %), % ( % ci: – %), and % ( % ci: – %) for hookworm, a. lumbricoides, and t. trichiura, respectively. adjusting for individual, household, and community characteristics, hookworm prevalence was not associated with community sanitation usage. trichuris trichiura prevalence was higher in communities with sanitation usage ≥ % versus sanitation usage < %. association of community sanitation usage with a. lumbricoides prevalence depended on household sanitation. community sanitation usage was not associated with a. lumbricoides prevalence among households with latrines in use. among households without latrines in use, a. lumbricoides prevalence was higher comparing communities with sanitation usage ≥ % versus < %. households with a latrine in use had lower prevalence of a. lumbricoides compared to households without latrines in use only in communities where sanitation usage was ≥ %.

conclusions: we found no evidence of a protective association between community sanitation usage and sth infection. the relationship between sth infection and community sanitation usage may be complex and requires further study.
keywords: sanitation, soil-transmitted helminths, ethiopia

* correspondence: william.oswald@lshtm.ac.uk
department of disease control, london school of hygiene and tropical medicine, london, uk
department of epidemiology, emory university, atlanta, ga, usa
full list of author information is available at the end of the article

© the author(s). open access this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the creative commons public domain dedication waiver (http://creativecommons.org/publicdomain/zero/ . /) applies to the data made available in this article, unless otherwise stated.

oswald et al. parasites & vectors ( ) : doi . /s - - -

background

globally, in , approximately . billion people were infected with at least one species of soil-transmitted helminth (sth) [ ]. the four most common nematode worms that infect humans are: the roundworm, ascaris lumbricoides; the whipworm, trichuris trichiura; and, the hookworms, ancylostoma duodenale and necator americanus [ , ]. these intestinal parasites infect humans through exposure to eggs or larvae that develop in the environment after being deposited in feces [ ]. eggs and larvae thrive in warm, moist soils of the tropics and subtropics, particularly in poorer areas with inadequate access to sanitation [ , ]. recent estimates suggest that among million people in sub-saharan africa, million are infected with hookworm, million with a. lumbricoides, and million people with t. trichiura [ ].
sth infections infrequently lead to mortality, but chronic infection results in several detrimental outcomes, including impaired physical and cognitive development, school absenteeism and poor performance, reduced work productivity among adults, adverse pregnancy outcomes, anemia, and possibly increased susceptibility to malaria, tuberculosis, and hiv [ , ]. the extent of morbidity is related to the burden of infection, the number of worms residing within the host, and the health of the host [ , ].

current strategies for control of sth in low-income countries focus on large-scale provision of anthelmintic drugs to prevent the consequences of chronic infection [ , ]. the world health organization (who) recommends periodic administration of albendazole and mebendazole to at-risk populations, including preschool-aged children, school-aged children, women of reproductive age, including pregnant women and lactating mothers, and other groups with high exposure [ ]. it is recognized that long-term control based on deworming efforts through mass treatment needs to be complemented with concurrent improvements in sanitation and excreta disposal behaviors [ , – ].

using data from population-based surveys and associated stool samples collected between and , we estimated the association between the proportion of households in a community with latrines in use and prevalence of sth infections among children, aged to years, in amhara, ethiopia. we hypothesized that higher community sanitation usage would be associated with lower prevalence of these sth infections.

methods

study participants and overview

for this analysis, data were combined from four population-based, cross-sectional surveys conducted in distinct areas of amhara between and . data from an additional survey conducted in north gondar and west gojjam zones between may and june were excluded because of data quality concerns (fig. ).
the methods and results of these surveys have been described previously [ – ]. briefly, surveys used a multi-stage cluster random sampling methodology and were powered to estimate zonal prevalences of sth infections, including a. lumbricoides (al), t. trichiura (tt), and hookworm (hw). ‘woreda’ (ethiopian administrative units equivalent to districts) became eligible for surveying when at least five rounds of annual azithromycin mass drug administration for trachoma control had occurred. the smallest administrative units with population data available are ‘gott’ (villages) and were primary sampling units. within each eligible district, villages were listed by geographical distribution and systematically selected with probability proportional to population size (median village size: households; iqr: – ). within villages, smaller administrative units of approximately households, called development teams (dt), were used as segments for a modified segment survey design [ , ]. development teams were listed upon arrival in the community with an appropriate village representative, who then drew numbers from a hat to select dts to be surveyed. in villages of households or less, the entire village was surveyed. for the current study, selected dts were considered clusters, the immediate geographic area of residence of participants.

in selected clusters, village leaders were interviewed for community information. heads of all households were interviewed for demographic and socioeconomic information and knowledge and practices regarding trachoma, water, sanitation, and hygiene. visual inspections were made of household latrines and handwashing stations. within each cluster, one child aged to years old ( to years in survey) was randomly selected in each household, after enumerating all residents, and asked to provide a single stool sample.
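the village selection step above (systematic selection with probability proportional to size from an ordered list) can be illustrated with a short python sketch. this is only an illustration under stated assumptions, not the surveys' actual procedure: the function name, the household counts, and the number of villages selected are hypothetical, and the original text does not say how the random start was drawn.

```python
import random
from itertools import accumulate


def pps_systematic_sample(sizes, n_select, rng=None):
    """Systematic sampling with probability proportional to size.

    sizes    -- household counts per village, in listing order (hypothetical data)
    n_select -- number of villages to draw
    Returns the list indices of the selected villages. A village larger than
    the sampling interval can be hit more than once.
    """
    if n_select <= 0 or not sizes or any(s <= 0 for s in sizes):
        raise ValueError("need a positive n_select and positive sizes")
    rng = rng or random.Random()
    total = sum(sizes)
    interval = total / n_select           # sampling interval on the size scale
    start = rng.random() * interval       # random start in [0, interval)
    cumulative = list(accumulate(sizes))  # running total of households
    selected = []
    unit = 0
    for k in range(n_select):
        target = start + k * interval
        # advance to the village whose cumulative range contains the target
        while unit < len(sizes) - 1 and cumulative[unit] <= target:
            unit += 1
        selected.append(unit)
    return selected


if __name__ == "__main__":
    village_households = [100, 300, 50, 250, 300]  # hypothetical village sizes
    print(pps_systematic_sample(village_households, 2, random.Random(0)))
```

because the targets are spaced one interval apart along the cumulative household count, a village's chance of selection is proportional to its size, and keeping the list in geographical order spreads the sample across the district.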
interviews were conducted with selected children about school attendance, use of latrines for defecation, receipt of anthelmintic treatment, recent infection with worms, and shoe wearing. responses were recorded electronically using tablet computers operating swift insights software (the carter center, atlanta, ga, usa) [ ].

exposure and outcome measures

the exposure, community sanitation usage, was calculated as the proportion of households within the cluster with a latrine to which there was a defined path and feces were observed in the pit [ ]. stool sampling methods, training, and quality control have been described previously [ ]. an ether-concentration method was used to enumerate the number of helminth eggs per gram of stool, fixed in ml of sodium acetate-acetic acid-formalin (saf) solution, counting from to eggs and then recording ≥ if higher [ , ]. outcomes were dichotomous indicators for presence of > eggs of each species (al, hw, tt) in stool samples. frequencies of infection intensity were tabulated across categories of community sanitation usage.

covariates

individual measures included child’s age in years (centered at ), sex, and reported usually attending school. between and subsequent surveys, some questions were asked differently, so to avoid missing values the following approaches were used to combine responses. in , reported wearing of shoes was recorded as: (i) always; (ii) sometimes; or (iii) never. subsequent surveys recorded whether the child was observed to be currently wearing shoes. for an indicator of shoe wearing, responses from of always wearing shoes were combined with positive observations of shoe wearing in subsequent surveys. in , the child and parent/guardian were asked whether the child had received and taken albendazole or mebendazole: (i) in the past month; (ii) between month and year; or (iii) year ago.
the first two responses for either medication were combined to indicate receipt of medication within the last year. for a measure of recent anthelmintic treatment, responses to the question were combined with responses to the question from subsequent surveys of whether the child had taken medicine for worms in the last year. in , reported use of a latrine by the child was recorded as: (i) always; (ii) sometimes; or (iii) never. in subsequent surveys, children were asked if they last defecated in a school latrine, family latrine, open field, or backyard. for an indicator of latrine usage, responses from of always using a latrine were combined with responses in subsequent surveys of last defecating in a school or the family’s latrine. children were also asked if they had worms in the last year.

household access to water was dichotomized < min or not, based on asking how long it took to fetch water for bathing. reported type of drinking water source was dichotomized as improved or not according to who/unicef joint monitoring programme classification [ ]. indicators were created for presence of a pit latrine and presence of a pit latrine in use (defined above). household wealth was indicated by ownership of radio, television, mobile phone, metal roof, and access to electricity. a categorical variable was created for the highest level of education completed by respondents in or by any household member in subsequent surveys.

fig. location of clusters and districts, by survey and year, amhara region, ethiopia, –

cluster wealth was calculated as the mean total of reported wealth indicators per household. mean elevation in meters was calculated for each cluster from household measurements and evaluated as a continuous measure. population density (km- ) in was generated using the oak ridge national laboratory’s landscan as an unprojected map in wgs with . × − ° resolution [ ].
annual average volumetric soil moisture (m /m ) measures for , produced by the european space agency climate change initiative (esa cci), were obtained as a grid file in wgs with a lambert azimuthal equal area projection and . ° resolution [ ]. population density and soil moisture values were extracted for each cluster using geographic coordinates in arcmap . (esri, redlands, ca, usa). population density was evaluated as a continuous measure, natural log transformed, and dichotomized at people km- . soil moisture was evaluated as a continuous measure. soil moisture measures for clusters were unavailable because of their proximity to lake tana. each cluster was assigned the nearest neighbor’s value. presence of a health post, health center, or hospital was dichotomized as presence of any health facility.

analyses

means and frequencies were estimated with confidence intervals across categories of community sanitation usage, accounting for study design and sampling weights, based on inverse total selection probability for clusters (village and dt) and individuals. f-statistics were calculated using an adjusted wald test for categorical variables and analysis of variance for continuous measures. multilevel poisson regression with robust variance was used to estimate the association between proportion of households in each cluster with a latrine in use and infection with each of three species of soil-transmitted helminths among children aged to years. modified poisson regression uses robust error variance to correct overestimation of error when applied to binomial data and allows direct estimation of prevalence ratios (pr) [ , ]. potential confounders, among measures recorded in all surveys, were identified based on literature review. reported measures for child’s school attendance, location of last defecation, and having worms in past year were not modeled.
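to make the modified poisson idea above concrete, the sketch below covers only the simplest case: one binary exposure, no covariates, no sampling weights, and no random intercept. in that case the poisson estimate with a log link is the ratio of the two prevalences, and the robust (sandwich) variance of the log ratio reduces to (1 - p1)/(n1 p1) + (1 - p0)/(n0 p0). the counts are hypothetical, and this is not the weighted multilevel model fitted in stata for the study.

```python
import math


def prevalence_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Prevalence ratio with a robust Wald confidence interval.

    Simplified, unweighted, single-binary-exposure case only. All counts
    passed in are illustrative inputs, not study data.
    """
    p1 = cases_exposed / n_exposed      # prevalence in the exposed group
    p0 = cases_unexposed / n_unexposed  # prevalence in the reference group
    if not (0 < p1 and 0 < p0):
        raise ValueError("both groups need at least one case")
    pr = p1 / p0
    # robust variance of log(PR) for binary outcomes
    se_log = math.sqrt((1 - p1) / (n_exposed * p1) + (1 - p0) / (n_unexposed * p0))
    lower = math.exp(math.log(pr) - z * se_log)
    upper = math.exp(math.log(pr) + z * se_log)
    return pr, lower, upper


if __name__ == "__main__":
    # hypothetical counts: 30/100 infected in one category, 20/100 in the reference
    pr, lower, upper = prevalence_ratio(30, 100, 20, 100)
    print(f"PR = {pr:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

a model-based poisson variance would treat the binary outcome as a count and give a wider interval than warranted, which is the overestimation of error that the robust variance corrects.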
an evaluation of directed acyclic graphs (dags) identified the same minimal sufficient set of covariates to estimate associations of community sanitation usage with each sth infection [ , ]. a sequential modeling approach, removing covariates at each level from fully-adjusted models, was also used to identify confounders based on changes in exposure estimates. all models controlled for survey round to account for year and possible differences. results are presented from crude, dag-based, and fully-adjusted models for comparison.

generalized linear mixed models were fit, specifying a random intercept for cluster and incorporating sampling weights. robust standard errors were requested to account for clustering within districts, and adaptive quadrature with eight integration points was used. results are reported for individual weights scaled to sum to the cluster sample size, though weights were also scaled to effective cluster sample size for comparison [ ]. operationalization of exposure as a categorical measure, versus linear or quadratic, was based on a preliminary assessment considering fit and interpretability. participants missing covariates in any survey were excluded from models. effect modification on the multiplicative scale of the association of community sanitation usage with sth infection by household latrine use, anthelmintic treatment, and by wearing of shoes (for hw infection) was evaluated with wald tests. measures of association were presented for community sanitation usage within strata of each potential effect modifier, as stratified prevalence ratios with a single reference category, and for household sanitation within strata of community sanitation [ ]. an analysis to further examine robustness of estimates of the association of household sanitation with al infection within strata of community sanitation usage is described in additional file : supplementary information.
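the two weight scalings mentioned above are not written out in the text. the sketch below assumes the two scalings commonly used for level-1 weights in multilevel models: rescaling so the weights in a cluster sum to the cluster sample size, and rescaling so they sum to the effective cluster sample size, (sum of weights)^2 / (sum of squared weights). function names and the example weights are hypothetical.

```python
def scale_to_cluster_size(weights):
    """Rescale one cluster's individual weights to sum to the cluster sample size."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


def scale_to_effective_size(weights):
    """Rescale one cluster's weights to sum to the effective sample size,
    (sum w)^2 / (sum w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return [w * total / total_sq for w in weights]


if __name__ == "__main__":
    cluster_weights = [1.0, 2.0, 3.0]  # hypothetical raw individual weights
    print(scale_to_cluster_size(cluster_weights))
    print(scale_to_effective_size(cluster_weights))
```

both scalings leave equal weights at 1.0. they differ only when weights vary within a cluster, which is why fitting the model under each and comparing estimates is a reasonable sensitivity check.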
individual and cluster mean shoe wearing were assessed as negative control exposures a posteriori to detect uncontrolled confounding of the association of community sanitation usage with al and tt infection [ ]. all described analyses were conducted using stata . (statacorp lp, college station, tx, usa).

results

characteristics of the study population

of , children selected, stool sample results were obtained for , children ( %). the combined dataset linked community, household, and individual information and complete parasitological results for , ( %) children aged to years in clusters in districts (fig. ). the analysis included ( %) observations with complete results for al and tt and ( %) observations for hw.

oswald et al. parasites & vectors ( ) : page of

table individual, household, and community characteristics of , children aged to years by community proportion of households with latrines in use in amhara region, ethiopia, –
[table values not recoverable from the extracted text]
columns: % households with latrines in use ( – < %, – < %, – < %, – < %, – %) and total, each with n, mean (%), and % ci; p (a)
rows: children, aged – years (n); communities (n); hookworm; t. trichiura; a. lumbricoides; age, years; male sex; reported usually attending school; reported always wearing shoes ( ); any observed shoes (no ); any reported or observed shoes (all); anthelmintics in past year ( ); anthelmintics in past year (no ); anthelmintics in past year (all); reported always use latrine ( ); reported last use of latrine (no ); reported use of latrine (all); worms in past year; household owns latrine; household owns latrine in use; bathing water access < min; improved drinking water source; household has radio; tv; electricity; mobile phone; iron roof; highest education of an adult (none; religious; primary school; junior secondary; senior secondary; college/university; non-formal education); mean total of wealth indicators per household; population density (km⁻²); has a health facility; elevation (m); soil moisture (m³/m³)
abbreviation: ci confidence interval
a p-values from wald adjusted f-test for categorical variables or anova f-test for difference in continuous means

table describes individual, household, and community characteristics of children aged to years, overall and by community sanitation usage category. children in communities with lower sanitation usage had indicators of less household education and access to health facilities, worse access to water for bathing and drinking, and more impoverished and less densely-populated living conditions, compared to children in communities with higher sanitation usage. among school-aged children, % ( % ci: – %) and % ( % ci: – %) reported attending school in communities with lowest and highest sanitation usage, respectively. of children’s households, % ( % ci: – %) and % ( % ci: – %) reported an improved source of drinking water, comparing communities with lowest and highest sanitation usage respectively. households in communities with lowest sanitation usage had a mean of . wealth indicator items ( % ci: . – . ) compared to . items ( % ci . – . ) in communities with highest sanitation usage. children’s shoe wearing and treatment with anthelmintics were not associated with community sanitation usage (f( . , . ) = . , p = .
and f( . , . ) = . , p = . , respectively). compared to communities with higher sanitation usage, communities with lower sanitation usage were in areas with lower population density (f( , ) = . , p = . ) and lower elevation (f( , ) = . , p < . ). soil moisture was significantly lower in communities with lower sanitation usage compared to communities with higher sanitation usage (f( , ) = . , p < . ), but the magnitude of difference may not reflect meaningful change.

in clusters, mean community sanitation usage was % ( % ci: – %) and ranged from % in clusters ( %) to % in clusters ( %). hw was the most prevalent of these sth across surveyed areas of amhara, infecting almost a quarter of school-aged children (table : %, % ci: – %). tt was least prevalent, infecting % of school-aged children (table : % ci: – %).

table presents results from crude and adjusted models of the association of community sanitation usage with prevalence of each sth, controlling for selected covariates and survey. results were generally robust to sampling weight scaling method (data not shown), but potentially meaningful identified differences between weighted and unweighted results are discussed (additional file : tables s and s ).

hookworm infection

community sanitation usage ≥ % was associated with lower hw prevalence, compared to usage of < %, adjusting only for survey. based on the crude model, the difference was statistically significant only where usage was between – < % (pr . , % ci: . – . ). adjusting for potential confounders in both dag-based and full models attenuated the association towards or past the null across usage categories. in the full model, adjusting for community sanitation usage and other factors, household ownership of a latrine in use was not associated with hookworm prevalence (pr . , % ci: . – . ).

trichuris trichiura infection

tt prevalence in communities with sanitation usage ≥ % was more than double the prevalence in communities with sanitation usage of < %. in the crude model, community sanitation usage was significantly associated with elevated prevalence of tt at usage ≥ %, compared to usage < %. estimates from dag-based and full model were not meaningfully different. after adjusting for all potential confounders in the full model, community sanitation usage ≥ % was significantly associated with higher prevalence of tt, compared to usage < % ( – < %, pr . , % ci: . – . ; ≥ %, pr . , % ci: . – . ). adjusting for community sanitation usage and other factors, household ownership of a latrine in use was not significantly associated with tt prevalence (pr . , % ci: . – . ). when included in full models as negative control exposures, individual shoe wearing was associated with tt infection, but the association was not statistically significant (pr . , % ci: . – . ). cluster mean shoe wearing was not associated with tt infection (pr . , % ci: . – . ).

ascaris lumbricoides infection

based on the crude model, community sanitation usage of ≥ % was associated with higher prevalences of al compared to usage of < %. estimates from dag-based and full model were not meaningfully different. adjusting for all potential confounders moderately attenuated estimated associations, and community sanitation usage ≥ % was significantly associated with higher prevalence of al, compared to usage < % (full: – < %, pr . , % ci: . – . ; ≥ %, pr . , % ci: . – . ). adjusting for community sanitation usage and other factors, household ownership of a latrine in use was not associated with al prevalence (pr . , % ci: . – . ). variables for shoe wearing, individually and aggregated to cluster, were not associated with al infection when included in full models as negative control exposures (individual: pr . , % ci: . – . ; cluster mean reported/observed, pr . , % ci: . – . ).

effect modification

reported receipt of deworming treatment in the past year did not significantly modify the association of community sanitation usage with any of the infections (data not shown). measures of shoe wearing did not significantly modify the association between community sanitation usage and hookworm infection (data not shown).

table shows prevalence ratios comparing children in respective strata of community and household sanitation usage. no significant modification by household latrine usage of the association of community sanitation usage with hw (p = . ) or tt (p = . ) prevalence was detected. the association between community sanitation usage and al prevalence was significantly modified by household latrine usage, adjusting for all covariates (p < . ). the first two groups of columns compare prevalences of al between communities with sanitation usage ≥ % to those with usage < %, among children from households with and without latrines in use. community sanitation usage was not associated with al prevalence among children from households with latrines in use. children from households without latrines in use had increasingly higher prevalences of al when comparing communities with higher sanitation usage to communities with usage < % (pr range: . – . ). examining the joint association of increased community sanitation and a household latrine in use, children in households with a latrine in use in communities with any level of sanitation usage had higher prevalences of al compared to children in households without latrines in use in communities with sanitation usage < % (pr range: . – . ). the last column in table compares prevalences between children from households with and without latrines in use by community sanitation usage.
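the three kinds of comparison described above (joint associations against a single reference category, household latrine use within strata of community usage, and modification on the multiplicative scale) can be sketched from stratum-specific prevalences. this is an unadjusted illustration with two community strata rather than the five categories used in the study; the function name and all prevalences in the usage note are hypothetical.

```python
def stratified_prevalence_ratios(prev):
    """prev maps (community_usage_high, household_latrine_in_use) -> prevalence.

    The single reference category is low community usage, no household latrine.
    """
    reference = prev[(False, False)]
    # Joint association: every stratum compared with the single reference.
    joint = {key: p / reference for key, p in prev.items()}
    # Household latrine in use versus not, within each community stratum.
    within = {c: prev[(c, True)] / prev[(c, False)] for c in (False, True)}
    # Ratio of the stratum-specific PRs; 1.0 means no effect modification
    # on the multiplicative scale.
    interaction = within[True] / within[False]
    return joint, within, interaction
```

with hypothetical prevalences of 0.10 and 0.15 (no latrine, latrine) in low-usage communities and 0.30 and 0.12 in high-usage communities, the within-stratum ratios fall on opposite sides of 1 and their ratio is well below 1, which is the qualitative pattern of modification on the multiplicative scale that a wald test of the interaction terms is meant to detect.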
in communities with sanitation usage ≥ %, children in households with a latrine in use had significantly lower prevalence of al compared to children in households without a latrine in use (≥ %, pr . , % ci: . – . ); while in communities with sanitation usage < %, children in households with a latrine in use had higher prevalence of al compared to children in households without a latrine in use (< %, pr . , % ci: . – . ).

table association of infection with hookworm, trichuris trichiura and ascaris lumbricoides with community proportion of households with latrines in use and household ownership of latrine in use among children aged to years in amhara region, ethiopia, –
[table values not recoverable from the extracted text]
columns: infection; sanitation measure; crude (a), dag-based (b), and full (c) models, each with apr and % ci
rows, for each of hookworm, t. trichiura, and a. lumbricoides: community % households with latrines in use per cluster (≥ %, – < %, – < %, – < %, and < % as reference); household latrine in use versus no latrine in use as reference (full model only)
models for a. lumbricoides and t. trichiura included data on children in communities. models for hookworm included data on children in communities
abbreviations: apr adjusted prevalence ratio, ci confidence interval. results weighted to account for unequal probabilities of selection
a crude model only controlled for survey round
b dag-based model was adjusted for elevation; population density; community mean total of wealth indicators per household; soil moisture; and survey round
c full model for a. lumbricoides and t. trichiura adjusted for age; sex; anthelmintic treatment; bathing water source < min; improved drinking water source; household owns: radio, television, mobile phone, iron roof, and has access to electricity; household education; elevation; soil moisture; community mean total of wealth indicators per household; population density; and survey round. full model for hookworm adjusted for the same covariates in addition to shoe wearing

infection intensity

intensities of infection with each worm as measured by eggs per gram in stool samples were very low in this population and showed a similar pattern as prevalence (table ). only intensity of infection with tt was significantly associated with community sanitation usage (f( . , . ) = . , p = . ).

discussion

our findings show no evidence that increased community sanitation usage was protective against the three most common sth infections among children aged to years in amhara region, ethiopia. these findings contrast with current understanding of the relationship between sanitation and sth infection, and the relationship between community sanitation and these sth remains unclear.

much of the evidence of the relationship between sanitation and sth infection has focused on household sanitation access or usage, rather than community sanitation. two recent meta-analyses examined accumulated evidence of the relationship between sanitation and sth infection and found protective associations of household sanitation access with lower odds of any sth, al, tt, and hw infection [ , ]. in their systematic review, ziegelbauer et al.
[ ] identified only six studies that examined community sanitation and sth infection. two recent studies from tanzania found that higher community sanitation coverage was associated with lower prevalence odds of al and weakly associated with higher prevalence odds of hw, controlling for individual, household, and environmental measures [ , ].

we observed no association of community or household sanitation with hw prevalence, after controlling for individual, household, and community characteristics. infection intensity (represented by egg counts) directly represents transmission rate because no sth reproduction occurs within the host [ ]. as an indicator of transmission, frequencies of hw infection intensities did not significantly differ across categories of community sanitation usage (table ). hookworm may live up to seven years in the gut [ ]. in the absence of deworming, which was infrequent in this population, it is perhaps not unusual that a reduction in prevalence was not observed for hw within the time the latrines had been in place, which was less than years on average (data not shown) [ ].

table association of hookworm, trichuris trichiura and ascaris lumbricoides with community sanitation usage by household latrine use among children aged to years in amhara region, ethiopia, –
[table values not recoverable from the extracted text]
columns: % households with latrines in use per cluster; community sanitation by household sanitation (latrine in use: +/−, pr, % ci; no latrine in use: +/−, pr, % ci); joint association (latrine in use: pr, % ci); household sanitation by community sanitation (pr, % ci)
rows, for each of hookworm, t. trichiura, and a. lumbricoides: ≥ %, – < %, – < %, – < %, and < % as reference; pinteraction = . for hookworm, = . for t. trichiura, and < . for a. lumbricoides
results weighted to account for unequal probabilities of selection. models for a. lumbricoides and t. trichiura included data on children in communities and adjusted for age; sex; anthelmintic treatment; bathing water source < min; improved drinking water source; household owns: radio, television, mobile phone, iron roof, and has access to electricity; household education; elevation; soil moisture; community mean total of wealth indicators per household; population density; and survey round. model for hookworm included data on children in communities and adjusted for the same covariates in addition to shoe wearing
abbreviations: +/−, number with/without infection; pr, prevalence ratio; ci, confidence interval
pr, prevalence ratio for household latrine in use versus household latrine not in use, within strata of community sanitation usage; pinteraction, global wald

there was little relative difference in al prevalence by community sanitation usage among children in households with latrines in use. it is understood that most al transmission clusters within households and families [ , ]. a study from bangladesh found that household-related exposures explained % of clustering of al worm burden at the household level, indicating the importance of the domestic domain in transmission [ ]. therefore, not finding a significant association between community sanitation usage and al prevalence in this population subset is less surprising [ ]. among children from households without latrines in use, al prevalence increased with greater community sanitation usage relative to communities with lowest sanitation usage.
this subset of children resided in households that were last to adopt household sanitation in their communities, which might indicate an increased likelihood of worse hygiene conditions or practices related to other al transmission routes.

a significant protective association of sanitation with al prevalence was observed among children from households with latrines in use compared to children from households without latrines in use among communities with sanitation usage ≥ %. this result corresponds with odds ratios observed in recent meta-analyses of . ( % ci: . – . ) and . ( % ci: . – . ), representing reductions in the odds of al infection with household sanitation use [ , ]. our finding could indicate that household latrines may only be protective against al at specific levels of community sanitation usage. a study in tanzania found a non-significant protective association of household latrine ownership when community latrine coverage was included in the model, but each % increase in latrine coverage was associated with a reduction in al prevalence odds [ ]. community sanitation usage is not frequently reported in studies of household sanitation, so further studies are warranted to confirm this finding.

among communities with low sanitation usage, children in households with a latrine in use had significantly higher al prevalence compared to children in households without a latrine in use. the magnitude of the association was smaller with exclusion of sampling weights (additional file : table s ), so this result should be interpreted with caution. a plausible explanation for the finding may be that in communities with fewer latrines overall, there is increased reliance on sharing sanitation infrastructure between families. a recent systematic review found a consistent pattern of elevated risk of helminth infection among those relying on shared sanitation facilities [ ]. curtale et al. [ ] and tshikuka et al.
[ ] found that increased numbers of users and sharing increased intensity of al infections. shared sanitation is not currently included in the definition of improved sanitation because facilities may not be accessible at all times and poor cleanliness may not fully separate users from contact with human waste [ ]. information on latrine cleanliness and maintenance was not collected, so further exploration of the mechanism behind this possible transmission was not possible. future studies should collect information on latrine sharing, particularly in contexts with limited sanitation availability, and indicators of latrine construction, maintenance, and cleanliness to explore these possible transmission pathways.

table intensity (eggs per gram) of infection by community proportion of households with latrines in use among children aged to years in amhara region, ethiopia, –
[table values not recoverable from the extracted text]
columns: community sanitation usage (< %, – < %, – < %, – < %, ≥ %) and total, each with n and %; p (a)
rows, for each of hookworm, t. trichiura, and a. lumbricoides: four eggs/gram categories (category boundaries not recoverable)
a p-values from wald adjusted f-test for categorical variables

our dataset allowed for characterization of each child’s immediate and community environment. as an evaluation activity, however, limited information could be collected during household surveys. our outcome measure was based on a single, small sample of stool, so prevalence may have been underestimated because egg excretion varies by day and egg distribution is not uniform in stool [ ]. our indicator of household latrine usage balanced standard recommendations with the logistical realities of program evaluation, but the aggregated measure for community sanitation usage may not sufficiently reflect levels of fecal contamination in the environment. for example, there was no actual measure of consistent latrine usage by all household members or measures of child feces disposal and hand hygiene. the difficulty with accurately measuring sanitation usage has been acknowledged [ ]. furthermore, because this was a cross-sectional study, the possibility that latrine promotion activities were targeted to areas with higher sth prevalences cannot be ruled out. models controlled for potential confounders among available measures and other possible differences between survey rounds. additional unmeasured factors were controlled through application of remote-sensing information, but residual confounding is possible with any observational study.

tt prevalence and infection intensity were observed to increase with increasing community sanitation usage (tables and ), but household ownership of a latrine in use was not associated with lower prevalence of tt, adjusting for other factors. overall prevalences of al were higher in communities with highest sanitation usage. community sanitation usage may reflect unmeasured factors related to urbanization that were not completely controlled by included measures. urban areas are generally believed to have higher prevalences of al and tt compared to rural areas [ ]. in their review, brooker et al. [ ] found no consistent pattern of differences between urban and rural communities for the prevalence of al and tt among a limited number of studies, but concluded that hookworm appeared equally prevalent in rural and urban settings. our statistical models adjusted for population density using a remote-sensing derived measure. this measure of population density, along with our other included measures, may not have adequately controlled for confounding related to urbanization. to identify residual confounding, individual and cluster mean shoe wearing were included in dag-based and fully-adjusted models for al and tt infection as negative control exposures [ ].
if these control exposures do not cause al and tt infection and have a set of confounders comparable to that of community sanitation usage, then any detected association of these exposures with the outcomes would indicate bias in the main association of interest [ ]. under the necessary assumptions of comparability between these measures of shoe wearing and community sanitation, our results did not strongly indicate the presence of any residual confounding with al. there was some indication of residual confounding of the association of community sanitation usage and tt based on the indicator for individual shoe wearing.

conclusions

in the current study, we found no evidence of a protective association between community sanitation usage and sth infection, and evidence of a protective association with household sanitation only for al under conditions of high community sanitation usage. sanitation may convey other private and public benefits, including convenience, dignity, privacy, and safety [ ]. the extent of sanitation usage in this study reflects promising uptake of sanitation in the amhara region, but reductions in sth prevalence may still require additional improvements in sanitation-related behaviors to substantially reduce exposure to fecal contamination.

additional file

additional file : supplementary information. table s . estimated measures of association of household ownership of a latrine in use with prevalence of ascaris lumbricoides infection, using conditional logistic regression and mixed regression with and without individual sampling weights. table s . number of clusters (n) contributing to each stratum-specific analysis out of the total number of clusters within that stratum of community sanitation usage (n) and correlation between cluster-specific odds ratios of association of household ownership of a latrine in use with prevalence of ascaris lumbricoides infection and respective sampling weight by strata of community sanitation usage.
(docx kb)

acknowledgments

we would like to thank the amhara national regional health bureau and health offices and the carter center support staff, field teams, and study supervisors. we particularly thank dr. rob o’reilly and the emory center for digital scholarship for help with data management. finally, we are especially grateful to the residents of selected communities, who gave freely of their time to participate.

funding

this work was supported by the lions-carter center sight-first initiative; emory university laney graduate school; and arcs foundation atlanta. this study was made possible thanks to the generous support of the american people through the united states agency for international development (usaid) and the envision project led by rti international in partnership with the carter center. the contents of this article are the responsibility of the authors and do not necessarily reflect the views of usaid or the united states government.

availability of data and materials

the data that support the findings of this study are available from the carter center trachoma control program but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. data are however available from the authors upon reasonable request and with permission of the carter center trachoma control program and the amhara regional health bureau, amhara, ethiopia. this product was made utilizing the landscan ™ high resolution global population data set copyrighted by ut-battelle, llc, operator of oak ridge national laboratory under contract no. de-ac - or with the united states department of energy. the united states government has certain rights in this data set.
neither ut-battelle, llc nor the united states department of energy, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of the data set.

authors’ contributions

conceived the study: weo, tfc, clm. designed and conducted data collection: aeps, te, mz, bm, es, dg, tt, zt, bg, jdk, pme. contributed to data assembly: weo, aeps, jdk. contributed to analysis and interpretation of results: weo, mrk, wdf, tfc. contributed to writing of the manuscript: weo, aeps, mrk, ekc, mcf, tfc, clm. all authors read and approved the final manuscript.

competing interests

the authors declare that they have no competing interests.

consent for publication

not applicable.

ethics approval and consent to participate

study protocols were approved by emory university institutional review board and amhara regional health bureau. participant informed consent processes involved explaining study purpose and methods and obtaining verbal consent from parents or guardians of children under years of age and verbal assent from children aged years or older, described previously [ ]. regardless of participation, selected children were offered a single dose of albendazole ( mg) during household visits. this secondary analysis was exempt from additional review (emory university irb, protocol no. – ).

author details

department of disease control, london school of hygiene and tropical medicine, london, uk. department of epidemiology, emory university, atlanta, ga, usa. the carter center, atlanta, ga, usa. the carter center, addis ababa, ethiopia. amhara regional health bureau, bahir dar, ethiopia. world health organization, geneva, switzerland. international trachoma initiative, atlanta, ga, usa. department of environmental health, emory university, atlanta, ga, usa.
hubert department of global health and center for global safe water, sanitation, and hygiene, emory university, atlanta, ga, usa. received: december accepted: february references . pullan rl, smith jl, jasrasaria r, brooker sj. global numbers of infection and disease burden of soil-transmitted helminth infections in . parasit vectors. ; : . doi: . / - - - . . bethony j, brooker s, albonico m, geiger sm, loukas a, diemert d, hotez pj. soil-transmitted helminth infections: ascariasis, trichuriasis, and hookworm. lancet. . doi: . /s - ( ) - . . hotez pj, bundy dap, beegle k, brooker s, drake l, de silva n, et al. helminth infections: soil-transmitted helminth infections and schistosomiasis. in: jamison dt, breman jg, measham ar, alleyne g, claeson m, evans db, jha p, mills a, musgrove p, editors. disease control priorities in developing countries. washington dc: the international bank for reconstruction and development/ the world bank group; . p. – . . brooker s, clements ac, bundy da. global epidemiology, ecology and control of soil-transmitted helminth infections. adv parasitol. ; : – . . karagiannis-voules da, biedermann p, ekpo uf, garba a, langer e, mathieu e, et al. spatial and temporal distribution of soil-transmitted helminth infection in sub-saharan africa: a systematic review and geostatistical meta-analysis. lancet infect dis. ; ( ): – . . feachem rg, bradley dj, garelick h, mara dd. sanitation and disease: health aspects of excreta and wastewater management. new york: wiley; . . who. eliminating soil-transmitted helminthiases as a public health problem in children: progress report – and strategic plan – . geneva: world health organization; . . asaolu so, ofoezie ie. the role of health education and sanitation in the control of helminth infections. acta trop. . doi: . /s - x( ) - . . barker wh. perspectives on acute enteric disease epidemiology and control. bull pan am health organ. ; : – . . mara d, lane j, scott b, trouba d. sanitation and health. 
plos med. . doi: . /journal.pmed. . . okun da. the value of water supply and sanitation in development: an assessment. am j public health. ; : – . . prichard rk, basáñez mg, boatin ba, mccarthy js, garcia hh, yang gj, et al. a research agenda for helminth diseases of humans: intervention for control and elimination. plos negl trop dis. ; ( ):e . . king jd, endeshaw t, escher e, alemtaye g, melaku s, gelaye w, et al. intestinal parasite prevalence in an area of ethiopia after implementing the safe strategy, enhanced outreach services, and health extension program. plos negl trop dis. ; ( ):e . . king jd, teferi t, cromwell ea, zerihun m, ngondi jm, damte m, et al. prevalence of trachoma at sub-district level in ethiopia: determining when to stop mass azithromycin distribution. plos negl trop dis. ; ( ):e . . oswald we, stewart ae, kramer mr, endeshaw t, zerihun m, melaku b, et al. active trachoma and community use of sanitation, ethiopia. bull world health organ. (in press). . turner ag, magnani rj, shuaib m. a not quite as quick but much cleaner alternative to the expanded programme on immunization (epi) cluster survey design. int j epidemiol. ; : – . . unicef. monitoring the situation of children and women. multiple indicator cluster survey manual . new york: unicef; . . king jd, buolamwini j, cromwell ea, panfel a, teferi t, zerihun m, et al. a novel electronic data collection system for large-scale surveys of neglected tropical diseases. plos one. ; ( ):e . . ngondi j, teferi t, gebre t, shargie eb, zerihun m, ayele b, et al. effect of a community intervention with pit latrines in five districts of amhara, ethiopia. trop med int health. ; ( ): – . . marti h, escher e. saf - an alternative fixation solution for parasitological stool specimens. schweiz med wochenschr. ; : – . . utzinger j, botero-kleiven s, castelli f, chiodini pl, edwards h, kohler n, et al. 
microscopic diagnosis of sodium acetate-acetic acid-formalin-fixed stool samples for helminths and intestinal protozoa: a comparison among european reference laboratories. clin microbiol infect. ; ( ): – . . who/unicef. progress on drinking water and sanitation - update. new york: unicef; . . bright ea, coleman pr, rose an, urban ml. landscan . oak ridge national laboratory, oak ridge, tn. http://web.ornl.gov/sci/landscan/. accessed apr . . afsis climate collection: essential climate variable (ecv) soil moisture annual averages, release. africa soil information service (afsis), the earth institute, columbia university, new york, ny. ftp://africagrids.net/ m/ecvsm/. accessed nov . . spiegelman d, hertzmark e. easy sas calculations for risk or prevalence ratios and differences. am j epidemiol. . doi: . /aje/kwi . zou g. a modified poisson regression approach to prospective studies with binary data. am j epidemiol. . doi: . /aje/kwh . . textor j, hardt j, knuppel s. dagitty: a graphical tool for analyzing causal diagrams. epidemiology. ; ( ): . . rothman kj, greenland s, lash tl. modern epidemiology. rd ed. philadelphia: lippincott, williams, & wilkins; . . carle ac. fitting multilevel models in complex survey data with design weights: recommendations. bmc med res methodol. ; : . . knol mj, vanderweele tj. recommendations for presenting analyses of effect modification and interaction. int j epidemiol. ; ( ): – . . lipsitch m, tchetgen tchetgen e, cohen t. negative controls: a tool for detecting confounding and bias in observational studies. epidemiol. . doi: . /ede. b e d eeb. . strunz ec, addiss dg, stocks me, ogden s, utzinger j, freeman mc. water, sanitation, hygiene, and soil-transmitted helminth infection: a systematic review and meta-analysis. plos med. ; ( ):e . . ziegelbauer k, speich b, mausezahl d, bos r, keiser j, utzinger j. effect of sanitation on soil-transmitted helminth infection: systematic review and meta- analysis. plos med. ; ( ):e . . 
riess h, clowes p, kroidl i, kowuor do, nsojo a, mangu c, et al. hookworm infection and environmental factors in mbeya region, tanzania: a cross-sectional, population-based study. plos negl trop dis. ; ( ):e . . schule sa, clowes p, kroidl i, kowuor do, nsojo a, mangu c, et al. ascaris lumbricoides infection and its relation to environmental factors in the mbeya region of tanzania, a cross-sectional, population-based study. plos one. ; ( ):e . . brooker s, bethony j, hotez pj. human hookworm infection in the st century. adv parasitol. ; : – . . cairncross s, blumenthal u, kolsky p, moraes l, tayeh a. the public and domestic domains in the transmission of disease. trop med int health. . doi: . /j. - . .d - .x. . walker m, hall a, basanez mg. individual predisposition, household clustering and risk factors for human infection with ascaris lumbricoides: new epidemiological insights. plos negl trop dis. ; ( ):e . . heijnen m, cumming o, peletz r, chan gk, brown j, baker k, clasen t. shared sanitation versus individual household latrines: a systematic review of health outcomes. plos one. ; ( ):e . . curtale f, shamy my, zaki a, abdel-fattah m, rocchi g. different patterns of intestinal helminth infection among young workers in urban and rural areas of alexandria governorate. egypt parassitologia. ; : – . . tshikuka jg, scott me, gray-donald k. ascaris lumbricoides infection and environmental risk factors in an urban african setting. ann trop med parasitol. ; : – . . knopp s, mgeni af, khamis is, steinmann p, stothard jr, rollinson d, et al.
diagnosis of soil-transmitted helminths in the era of preventive chemotherapy: effect of multiple stool sampling and use of different diagnostic techniques. plos negl trop dis. ; ( ):e . . clasen t, fabiani d, boisson s, taneja j, song j, aichinger e, et al. making sanitation count: developing and testing a device for assessing latrine use in low-income settings. environ sci technol. ; ( ): – . . crompton dw, savioli l. intestinal parasitic infections and urbanization. bull world health organ. ; : – . . jenkins mw, sugden s. rethinking sanitation: lessons and innovation for sustainability and success in the new millennium. in: human development report . new york: united nations development programme; .
analyzing and visualizing ancient maya hieroglyphics using shape: from computer vision to digital humanities

doi: . /llc/fqx corpus id:

@article{hu analyzingav, title={analyzing and visualizing ancient maya hieroglyphics using shape: from computer vision to digital humanities}, author={r. hu and c. pallan and j. odobez and d. gatica-perez}, journal={digit. scholarsh. humanit.}, year={ }, volume={ }, pages={ii -ii } }

r. hu, c. pallan, + author d. gatica-perez. published in digit. scholarsh. humanit. fields: art, computer science.

maya hieroglyphic analysis requires epigraphers to spend a significant amount of time browsing existing catalogs to identify individual glyphs. automatic maya glyph analysis provides an efficient way to assist scholars’ daily work. we introduce the histogram of orientation shape context (hoosc) shape descriptor to the digital humanities community. we discuss key issues for practitioners and study the effect that certain parameters have on the performance of the descriptor. different hoosc…

topics from this paper: glyph; autodesk maya; digital humanities; computer vision; directed graph; information visualization; force-directed graph drawing; prototype.

citations
improved hieroglyph representation for image retrieval. laura alejandra pinilla-buitrago, j. a. carrasco-ochoa, j. martínez-trinidad, edgar román-rangel. computer science. jocch.
a probe into patentometrics in digital humanities. guirong hao, f. ye. engineering, computer science. libr. trends.

references
analyzing ancient maya glyph collections with contextual shape descriptors. edgar román-rangel, c. pallan, j. odobez, d. gatica-perez. computer science. international journal of computer vision.
assessing a shape descriptor for analysis of mesoamerican hieroglyphics: a view towards practice in digital humanities. r. hu, j. odobez, d. gatica-perez. geography, computer science. dh.
multimedia analysis and access of ancient maya epigraphy: tools to support scholars on maya hieroglyphics. r. hu, gulcan can, + authors d. gatica-perez. computer science. ieee signal processing magazine.
reading maya art: a hieroglyphic guide to ancient maya painting and sculpture. andrea j. stone, m. zender. art.
statistical shape descriptors for ancient maya hieroglyphs analysis. e. rangel. geography.
automatic egyptian hieroglyph recognition by retrieving images as texts. morris franken, j. v. gemert. computer science. mm '.
a catalog of the maya hieroglyphs. j. e. thompson. art, history.
sketch-based shape retrieval. m. eitz, ronald richter, t. boubekeur, k. hildebrand, m. alexa. computer science. acm trans. graph.
learning hatching for pen-and-ink illustration of surfaces. e. kalogerakis, derek nowrouzezahrai, simon breslav, aaron hertzmann. computer science. togs.
shape matching and object recognition. a. berg, jitendra malik. computer science. toward category-level object recognition.

© the research society for victorian periodicals

technologies of serendipity
paul fyfe

in reckoning with the digital restructuring of the scholarly discourse network circa , patrick leary begins with a story.
it is a story about how, thanks to web discovery and email contacts, scholarship on letitia elizabeth landon took a major turn. this happened because of the “fortuitous electronic connections” of people and documents facilitated by the internet. and making sense of this experience, rather than detailing specific resources for digital scholarship, becomes leary’s abiding concern in “googling the victorians.” his essay ponders a “profound shift” towards casual discovery, “a serendipity of unexpected connections to both information and people that is becoming increasingly central to the progress of victorian research.” if, in the subtitle to their volume the victorian periodical press, joanne shattock and michael wolff nominate “samplings and soundings” as our only reasonable approach, leary begins to clarify how such casual discoveries should not merely be viewed as symptoms of trying to find specifics amid superabundance, whether in terms of the victorian archive or networked digital information. instead, that characteristic research experience has been absorbed into the technological routines of how we work now. in other words, chance discovery is not a bug; it is a feature. it is the very condition of “googling the victorians,” as leary calls it. a decade later, we find ourselves deeper in the networked experience of such unexpected connections, with more perspective that allows us to acknowledge, critique, and perhaps even credit serendipity as scholarly technique. “googling the victorians” also reveals the scholar’s reflex to enfold fortuitous discoveries within descriptive explanation. the essay shows a consistent dynamic between the item of scholarly interest, serendipitously found, and the narrative in which the researcher governs the unexpected. in drawing this connection, leary makes a crucial distinction between serendipity and randomness.
if we all have random encounters all the time, serendipity requires recognizing such an encounter for its meaning, requiring an interpretive context to place the unexpected within an explanatory framework. digital materials have proliferated since “googling the victorians” but so too have the contexts in which we encounter them, including the very tools and platforms which circumscribe the digital objects they serve. such is the lament of many teachers: students privilege quick access (“googling”) over research methods contextualized by institutional structures (in other words, libraries), which have served print scholarship for decades and longer. as leary noted, such is the boon of many researchers, now awash in volumes of materials at scales unseen since the nineteenth century’s own profusion of printed objects, periodicals especially. contexts, or how to reconcile new information within a horizon of knowledge or questioning, have become as crucial for digital scholarship as access to digital objects themselves (if not more so). what we need to teach now and what researchers themselves need to cultivate is the “curatorial intelligence” to assess and recontextualize digital objects discovered through techniques now including, even privileging, serendipity. the remainder of this short essay will sketch out how, since leary’s article, serendipity has been “operationalized,” or built into, the research platforms which reflexively shape our methods and become part of the ordinariness of scholarly practice. while many early sites on the web featured random discovery, only lately have serendipitous machines been purpose-built for academic work.
certain scholarly sites now spotlight random access as a valid entry point to their contents, such as the oxford english dictionary’s “lost for words” feature, previously subtitled “get a random entry.” apparently there are fans of randomness at oxford university press, as the dictionary of national biography (leary’s own ultimate example in ) now also offers a “get a life at random” feature, or “get a life” for short. the information science and library community has also driven these developments. for example, the trove project of the national library of australia offers a “discovery experience” that includes random sampling of its digital collections. built by tim sherratt, the trovenewsbot program pulls random content from digitized australian newspapers from to the mid- s, posting headlines to twitter and illustrations to an associated tumblr account. sherratt has also built experimental tools like “headline roulette” and “the future of the past,” each of which randomly harvests from trove’s archive and delivers a visual discovery interface. the paired twitter and tumblr accounts featuring “historical book images,” built by mark sample, take a similar approach to randomly sampling historical book illustrations within the internet archive. at the british library, the digital scholarship department created “the mechanical curator” to push “randomly selected small illustrations and ornamentations” from seventeenth- to nineteenth-century books to its own tumblr site. as one of its developers suggests, the project offers one way of opening digital collections to the perennial research desk question: “how do you find things that you cannot begin to describe?” to put this differently, how do you discover things you did not know to look for? these machines of serendipity sometimes offer simple shifts of perspective. for instance, the “stackview” browsing tool developed at harvard’s library lab shows search results by virtual book spines.
the project underscores the importance of the “visual layout of [the] interface on browsing outcomes” and tries to find what gabrielle dean calls “that sweet spot of ‘unexpected but relevant.’” an in-development project called stak (serendipitous tool for augmenting knowledge), by kim martin, aims to create an application for mobile devices that will suggest proximate resources based on users’ physical locations within a given library. the bohemian bookshelf project offers “five interlinked visualizations” keyed to other aspects of library collections, including the “tangible qualities of books such as cover colour and page count, temporal aspects such as publication year and content era, as well as content data such as keywords and books’ author.” these visual discovery interfaces may be especially relevant for nineteenth-century texts and periodicals which, because of their abiding ordinariness, are targets for de-accessioning and are disappearing from library shelves. victorianist dan cohen, now director of the digital public library of america (dpla), has also considered how “to create environments in which serendipity is able to flourish.” cohen points out the governing paradox: how do you actually plot the unexpected or program serendipity without prearranging the results? for its part, the dpla offers an application programming interface (api) which lets users and their machines grab, play with, and sort the dpla’s data. seeking just such a way to “query for random items” in the dpla, mark sample built the twitter “dpla bot” to post “random finds” and “add what we all love about libraries—serendipity.” at the “one week | one tool” project, a number of digital humanists, designers, and programmers convened to develop a “serendipity engine” that harvests results from the apis of multiple digital libraries.
this engine drives “serendip-o-matic,” an experimental research interface which neither offers a field for directed searching (in other words, googling) nor returns random or unsupervised results. instead, it processes chunks of text or citation lists and uses that information to query multiple collections according to its own algorithm. “let your sources surprise you,” reads the tagline. so let us. what happens when leary’s own essay about electronic serendipity gets fed into the serendip-o-matic? the results include an edition of eminent victorians from hathitrust; a s poster for the victorian railways from trove; several pictorial examples of architectural restorations of victorian houses from the us national archives; an piece of ephemera printed in london advertising a “new religious movement started by charles whitmore stokes”; and a grayscale photograph of a young girl searching the rubble of a home damaged in the great timber yard fire in hartlepool, . these documents depict a snapshot historiography of the victorian past; they testify to victorian afterlives, to the sifting and renovation of historical debris. if fortuitously discovered, what do these results mean? their stories have yet to be written. but their electronic encounter echoes, in ways that leary well understood, the unique methodological challenges facing victorianists and especially researchers of periodicals, who seek coherence from forms necessarily fragmentary and networked, miscellaneous and serialized. perhaps the field of periodicals research should especially embrace serendipity in its technologies and scholarly techniques. for victorian periodicals were already a technology of serendipity in print. they allowed and rewarded a full spectrum of programmatic and random ways of discovering their contents. for scholars or other readers, discovery results less from directed searching than from all the tangents encountered on the way.
thus, sources which are plural, redundant, and tangent-rich help promote discovery by the proliferating contingencies of their usage. by developing techniques of serendipity in digital scholarship, we remediate perhaps the most unique feature of the victorians’ own machines of discovery.

north carolina state university

notes
patrick leary, “googling the victorians,” journal of victorian culture , no. ( ): .
ibid., , . i mean “casual” in its historical sense, which the oxford english dictionary defines as “subject to, depending on, or produced by chance; accidental, fortuitous” and “occurring or brought about without design or premeditation.”
joanne shattock and michael wolff, eds., the victorian periodical press: samplings and soundings (leicester: leicester university press, ), x. for more on the history of chance discovery in victorian periodicals, see paul fyfe, “the random selection of victorian new media,” victorian periodicals review , no. (spring ): – .
i am grateful to the attendees of the “periodical method” panel at the rsvp conference, university of delaware, who contributed to a rewarding discussion of this paper and helped sharpen many of its guiding insights, including this one from troy bassett.
this includes google books, which somewhat amazingly came after leary’s piece. officially launched in december , it was only promoted the following year as “google print” and the “google library project” before being renamed by late as “google book search.” see “google books history,” google books, accessed october , , http://www.google.com/googlebooks/about/history.html; “our history in depth,” google, accessed october , , http://www.google.com/about/company/history; and “google books,” wikipedia, accessed september , , http://en.wikipedia.org/w/index.php?title=google_books&oldid= . see also the wayback machine’s snapshot of the google print page on the last day of : http://web.archive.org/web/ /http://print.google.com.
that lovely phrase “curatorial intelligence” was furnished by a member of our panel’s audience in delaware. i want to express my gratitude to this person and my apologies for, amid the discussion, not getting her name.
as leary also suggested, the ordinary techniques of scholarship, that invisible middle ground between tool and method ungraciously known as “workflow,” are the best measures of our digital transformations. leary, “googling the victorians,” .
for example, see google’s own “i’m feeling lucky” button. for a snapshot of google’s front page in december , see the wayback machine: http://web.archive.org/web/ /http://www.google.com.
“oxford english dictionary,” oxford english dictionary, , http://www.oed.com; “oxford dictionary of national biography,” oxford dictionary of national biography, – , http://www.oxforddnb.com.
“trovenewsbot (@trovenewsbot),” twitter, , https://twitter.com/trovenewsbot; “trovenewsbot selects,” tumblr, , http://trovenewsbot.tumblr.com; tim sherratt, “headline roulette,” wragge labs, , http://wraggelabs.com/shed/headline-roulette; tim sherratt, “the future of the past,” wragge labs, , http://newspapers.wraggelabs.com/fotp.
mark sample, “old book pics (@bookimages),” twitter, , https://twitter.com/bookimages; mark sample, “historical book images,” tumblr, , http://historicalbookimages.tumblr.com.
james baker, “the mechanical curator,” digital scholarship blog, september , , http://britishlibrary.typepad.co.uk/digital-scholarship/ / /the-mechanical-curator.html. see the corresponding tumblr site at http://mechanicalcurator.tumblr.com.
ben o’steen, “peeking behind the curtain of the mechanical curator,” digital scholarship blog, october , , http://britishlibrary.typepad.co.uk/digital-scholarship/ / /peeking-behind-the-curtain-of-the-mechanical-curator.html.
gabrielle dean, “browsing, serendipity, and virtual discovery,” the sheridan libraries blog, october , , http://blogs.library.jhu.edu/wordpress/ / /browsing-serendipity-and-virtual-discovery; “stack view,” harvard library innovation lab, , http://librarylab.law.harvard.edu/blog/stack-view.
kim martin and john simpson, “serendipity nouveau,” colloquium presentation, digital humanities summer institute, university of victoria, bc, june , ; kim martin, question answering, serendipity, and the research process of scholars in the humanities, access conference, youtube, , https://www.youtube.com/watch?v=mvpb n mwrs&feature=youtube_gdata_player.
alice thudt, uta hinrichs, and sheelagh carpendale, “the bohemian bookshelf: supporting serendipitous book discoveries through information visualization,” the bohemian bookshelf, accessed january , , http://www.alicethudt.de/bohemianbookshelf.
andrew stauffer, “the troubled future of the nineteenth-century book,” the hoarding (blog), march , , http://thehoarding.wordpress.com/ / / /the-troubled-future-of-the-nineteenth-century-book-essay.
dan cohen, “planning for serendipity,” digital public library of america, february , , http://dp.la/info/ / / /planning-for-serendipity.
mark sample, “the @dpla api (@samplereality),” twitter, july , . https://twitter.com/samplereality/status/ .
“serendip-o-matic: let your sources surprise you,” serendip-o-matic, , http://serendipomatic.org.
ibid.
see also jim mussell, the nineteenth-century press in the digital age (new york: palgrave macmillan, ).
“how do simple questions lead to big discoveries?” ted radio hour, september , , http://www.npr.org/ / / / /how-do-simple-questions-lead-to-big-discoveries.
the integration of libraries and academic computing at columbia: new opportunities for internal and external collaboration

journal of library administration, : – , copyright © taylor & francis group, llc issn: - print / - online doi: . / . .

patricia renfro
columbia university, new york, ny, usa (retired)

james g. neal
columbia university, new york, ny, usa

abstract. over the past decade, the libraries and academic computing units at columbia have been brought together to form a new information services organization. this article will trace the history and current state of it and library relationships at columbia university. the expanding collaboration among academic computing services, their deeper partnership and integration with library programs, the working relationship with campus administrative computing, and participation in national and international projects will be described and evaluated.

keywords libraries, academic computing, research instruction

the academic library is being driven by five fundamental shifts. primal innovation: creativity as an essential component of our organizational and individual dna. radical collaboration: new, drastic, sweeping and energetic combinations across and outside libraries. deconstruction: taking apart traditional axioms and norms, removing the incoherence of current concepts and models, and evolving new approaches and styles. survival: persistence and adaptation which focuses more on the “human” objectives of our users, that is success, productivity, progress, relationships, experiences, and impact. particularism: deep specialization and niche responsibilities in the face of rampant shared and open resources. how do we respond to these revolutionary trends through our shifting geography, our essential expertise, and our advocacy of the public interest?
address correspondence to patricia renfro, east princeton rd., bala cynwyd, pa , usa. e-mail: pr @columbia.edu

the columbia university libraries/information services are responding to these extraordinary challenges through strategic investments in content, technology, tools, space, and people. and one important component of change has been the evolution of a hybrid organizational structure, integrating key elements of the research library and university technology services. the basis of any organization is individuals and groups carrying out roles and working together to achieve shared objectives within a formal structure and with established processes. organizations define the systems through which goals and priorities are established, decisions are made, resources are allocated, power is wielded, and plans are accomplished. they determine the degree to which administrative responsibility and authority are distributed and shared, operations and procedures are integrated and flexible, and policies and standards are designed and enforced. organizational models focus on a set of parameters defined by: centralization and decentralization, hierarchy and adhocracy, bureaucracy and distribution, simplicity and complexity, formality and informality, administration and entrepreneurship, authority and collaboration. they can be viewed, among many characteristics, in terms of layers and rigidity of structure, direction and effectiveness of information flow, sources and impact of leadership, participation in decision making, freedom of action, and levels of ambiguity. particularly important are the health of the industry, the level of competition, the speed of technological change, the extent of globalization, the degree of professionalization in the field, and the rapidity of new knowledge creation.
these have been critical considerations as the libraries’ organization has expanded. libraries have struggled to distribute authority, integrate key operations, break down bureaucratic processes, achieve less rigidity in structure, promote more cooperation across units, and build more matrix-type approaches to the work. as a result, centralized planning and resource allocation systems coexist with broadly distributed and loosely coupled structures and an expanding array of maverick units like research centers and entrepreneurial enterprises. columbia university was an early integrator of information technology and information services functions under a single administrator. by the early s, all computing and telecommunications areas were administered by a cio, reporting to the executive vice president for academic affairs. by the mid s, this structure was expanded to include the university libraries, but now under the direction of a vice president and university librarian. by , the pendulum had swung back in the other direction, with administrative computing and telephone services moving to an administrative vice president. academic computing, network services, electronic mail, electronic classroom support, computer labs, videoconferencing, security authorization and authentication, the data center, technology training, helpdesk and the libraries continued to report to the librarian, and the unit was expanded over time to include electronic publishing, digital library, instructional technology, and research computing units. in , another reorganization pushed all administrative, network, and telecommunications technologies under a new cio position. academic computing and the libraries were assigned to the university librarian.
in the columbia libraries and information services implemented a rationalized organizational structure, creating four administrative groups under the direction of the vice president for information services and university librarian. one group is responsible for administrative services, including budget and finance, facilities and building projects, and human resources. a second group, bibliographic services and collection development, includes units responsible for serials acquisitions, electronic resources, monograph acquisitions, copy and original cataloging, database maintenance, metadata services, collection development, and the offsite shelving facility. a third group, collections and services, includes three multidisciplinary library divisions for history and humanities, social sciences, and sciences and engineering; the global and distinctive collections in the starr east asian library, the avery architecture and fine arts library, the rare book and manuscript library, the global resources program, and the burke theology library; and access services including circulation, interlibrary loan/document delivery, collection management, reserves, and building security. the fourth group, digital programs and technology services, includes the libraries digital program, the library information technology office, the preservation and digital conversion division, the center for new media teaching and learning, the center for digital research and scholarship, and the copyright advisory office. the latter two centers were created in at the time of the reorganization in order to focus on unmet needs in the university.

in the past decade the way faculty and students access information, the way they in fact use information and the way they need to manage that information, have been transformed by technology.
at the columbia libraries it became increasingly clear that academic computing would be the foundation of new library services and that the integration of academic computing units and skills into the traditional library organization was critical to our ongoing success. consequently the development of the digital programs group, and its working relationships with the more traditional parts of the columbia libraries, became a key strategic focus. in total digital programs has a staff of approximately ftes who together are breaking new ground, experimenting with new technologies and piloting new services, while at the same time addressing the very traditional mission of libraries to preserve information for the use of current and future generations.

critical to the success of the new digital programs group was a major change in mission for the library information technology office (lito). the libraries continue to have a close working relationship with the central campus computing organization and rely on them for management of the network, for capacity in the central data center and for the administration of the course management systems, as well as for the administration of enterprise systems such as e-mail. however, in recent years central computing found itself increasingly focused on support for administrative computing and it became clear that the libraries needed more control of their technology infrastructure. consequently lito’s role and staffing were expanded starting in in order to provide full technology services and support across the organization. lito runs the libraries’ long-term digital storage system, which underpins the work of the library’s digital program and is also critical to the university’s institutional repository, academic commons.
lito also manages novell storage to support internal digital production activity and maintains a number of online services such as a wordpress platform, used by all digital programs and many library units. the center for digital scholarship was established with a clear understanding that most of its technical support would come from lito.

the new library organization is by nature highly collaborative. where possible the group aims to select platforms that will be of broad use and will therefore provide a shared and efficient infrastructure. consequently the implementation of the libraries’ fedora system was a high priority and a joint effort with programmers working across units to share skills and expertise. some units, by the nature of their programs, work in tandem: the center for digital research and the copyright advisory office together educate the university community about scholarly communication and copyright issues; the libraries digital programs division and preservation’s digital conversion group plan and conduct digital conversion projects together, calling also on the expertise of metadata specialists in technical services; the library information technology office works with the libraries digital program and the center for digital research on digital archiving. collaborations between these groups and library staff continue to expand and develop: curators frequently work with the center for new media staff to develop online environments that allow students to explore special collections resources; public services staff work with library technology staff to offer subject-focused digital commons facilities to undergraduate and graduate students; library liaisons introduce researchers to the value of deposit in the repository, and may tell them about journal, conference and video services offered by the center for digital scholarship.
collaborations within the university are also common and growing as the digital programs group develops and strengthens programs: researchers now regularly look to the center for new media to partner on large federal grant applications; schools and academic departments are increasingly aware of the value of the institutional repository for long term archiving; and the university’s office of research has called on the libraries for help in addressing new nih and nsf requirements. the list of collaborations is lengthy and perhaps best illustrated by an overview of the services developed during the past decade. these fall into three major categories: student computing services, instructional support, and research support.

the need to repurpose library space to meet the changing practices of students has been apparent at columbia for well over a decade, and general university-wide computing labs were installed in the butler (humanities) and lehman (social sciences) libraries well before the information commons movement took hold in research libraries across the country. rather than replicate the standard, broad-based information commons model, we decided that columbia would be better served by a series of subject-focused digital commons. these would serve upper level undergraduates and graduate students, and provide high-end hardware and software targeted at the needs of specific disciplines. today three major digital centers are in operation: the digital humanities center, focusing on textual analysis, image scanning and editing, digital video editing, etc.; the digital social science center focusing on the use of numeric and geospatial data; and the digital science center focusing on scientific analysis and visualization.
the centers are staffed by public services librarians with appropriate skills and subject knowledge, supported by the library technology office which plans the installations and upgrades and fully manages two of the centers and shares support of one with the central it organization. indications so far are that these subject-focused centers, with the critical combination of staff expertise and hardware and software, are meeting a critical campus need. other specialized academic computing centers are planned and will follow the successful model of the music library where workstations are outfitted with high-end audio editing, music notation and related applications, and music librarians provide expert consultation services.

columbia’s center for new media teaching and learning (ccnmtl) was established in “to enhance teaching and learning through the purposeful use of new media and technology.” the word “purposeful” is key here: new learning environments designed by the center are in a continuous loop of evaluation, assessment, and improvement. ccnmtl partners with faculty to provide a range of services from basic course web site management to advanced project development. columbia’s central it organization runs the hardware and software for columbia’s legacy course management system and its in-process sakai implementation, but all faculty support, and training for departmental administrative staff, is provided by ccnmtl. using the course management systems as a base for managing courses and registrations, ccnmtl promotes complementary third party best-of-breed collaborative learning tools such as a multi-author course blogging service called edblogs, built on the library information technology office’s implementation of wordpress. faculty can also enable a wiki for any course using the columbia wikispaces services, and can upload audio and video to the columbia itunes u and youtube edu media platforms, both managed by the center.
workshops, a drop-in faculty support center located in butler library, and the “tools” and “enhanced” sections of the center’s web site, combine to offer a variety of avenues for assistance and support.

the other major focus of the center’s work, advanced project management, partners center staff with faculty across campus to develop online teaching environments for specific courses or to fulfill a component of a research grant. a few examples here can only touch on the range of projects, which are best understood by visiting the web site: http://ccnmtl.columbia.edu/portfolio/.

mysmilebuddy is an ipad tool supported by nih and developed with columbia’s school of dental medicine to allow social workers and dental professionals to work with families on preventive oral health care. this is one of a number of projects in the center’s triangle initiative, which aims to develop digital tools to extend faculty research, transform the education of human service professionals, and provide benefits to communities in need. using the triangle approach ccnmtl has become a sought-after partner in research grants across the university, and external funding of this kind now makes up a significant percentage of the center’s operating budget.

the global master’s degree in development practice, funded by the macarthur foundation, matriculated its first class of students at columbia in fall . ccnmtl supports this master’s degree program with a range of customized tools that enable synchronous and asynchronous collaboration among the participants in shared courses taught simultaneously at approximately universities around the world.
mapping the african american past is a public web site created by ccnmtl to enhance the appreciation and study of significant sites and moments in the history of african americans in new york from the early th-century through the recent past. the web site is a geographic learning environment, developed with the assistance of the library’s geographic information services librarians. it enables students, teachers, and visitors to browse a multitude of locations in new york and read encyclopedic profiles of historical people and events associated with these locations.

the black radical archive, an online repository for a course on black radicalism, represents a unique platform for engaging students in the curation of rare archival materials. it houses images of archival materials that the professor and his students selected from the library’s rare book and manuscript library. archivists worked closely with students in this course to enable them to explore unprocessed as well as processed black history collections, and select items to describe and record themselves with digital images. in addition, selected audio recordings, and manuscript items, critical to the course, were digitized on request by the preservation and digital conversion division.

both mapping the african american past and the black radical archive are projects developed as part of ccnmtl’s digital bridges initiative. this strategic initiative explores the creation of learning environments using persistent high-quality digital resources. it is particularly in the digital bridges projects that the center’s organizational base within the libraries can be seen to be most valuable.
working closely with curators and librarian liaisons in humanities, social sciences and sciences, the center has been able to identify new opportunities to work with faculty and unique collections and to use innovative technology to create ways to look at and integrate information resources into classroom teaching. curators and librarians bring their knowledge of the collections, their understanding of the metadata necessary for good digital retrieval, and their familiarity with the work and interests of faculty. the connections here have often opened doors and resulted in new ways of exposing collections to students.

although the bulk of technical support for instruction at columbia comes from the center for new media, the libraries digital program and preservation and digital conversion groups also work directly to support specific teaching needs. a faculty member who had regularly used the new york real estate record and builders’ guide in his historic conservation course asked that it be digitized to provide better access and searchability for his students. as a result dense volumes of real estate transactions are now accessible online to the general public and researchers worldwide. in the case of a fragile and at risk tibetan newspaper, the tibet mirror, the primary motivation for digitization was conservation, but faculty also noted the value of making the digitized text accessible for students to examine, read, and translate since this paper chronicles dramatic social and political transformations in tibet. through the efforts of the tibetan studies librarian the columbia issues have been significantly augmented with equally rare issues from three other collections, so that % of the full run of the paper is now online. faculty interest in using digitized material for instruction or research is an important factor in helping to set priorities for digitization.
the success of the center for new media and its impact on instruction at columbia was so clearly evident by that the libraries decided to create a center for digital research and scholarship (cdrs), to provide a comparable set of services in support of research. at the same time, recognizing that intellectual property issues are fundamental to the dissemination of research, the libraries established a copyright advisory office (cao) to work closely with all digital programs and libraries units.

following discussions with scholars and researchers across campus it became apparent that it would be possible to develop a suite of scalable services to meet a common set of needs. the robust list of services now described on the center’s web site represents an intense period of experimentation and development during the past four years (http://cdrs.columbia.edu/cdrsmain/). the center administers the university’s institutional repository, offers a number of types of publication services, conference and video services, and provides collaboration tools in the form of the wikischolars service. through its web site and regular presentations across campus, the copyright advisory office actively promotes an understanding of copyright issues.

in contrast to the free instructional support services offered by the center for new media, the center for digital research has found it necessary to develop a range of free and for-fee services. the group’s video team, as the primary video service on campus, charges for capture, editing, etc. although this is not a cost recovery service, the income helps to provide a stable and robust service. other services, such as the journal hosting service and conference support service offer graduated levels of support from a free barebones set up to a premier offering, which includes significant amounts of web design.
the journals hosting service has been particularly attractive to the law school where the advantages of having a consistently administered publication framework that can be easily managed through annual changes of student editorial staff have been very clear. workshops held by center staff and the director of the copyright advisory office have provided critical education in intellectual property issues for student journal editors. although most of the journals currently in production are student run, the center is now also publishing its first peer-reviewed, faculty-edited journal, tremors. working with the editor and founder of the journal, the center’s director was able to frame an open access business model for this title, and will be monitoring and analyzing this model for the benefit of colleagues in the scholarly publishing community. the scholarly communication program, which educates the columbia community about changes taking place within the scholarly communication system, is a major focus of the center for digital research and works in tandem with the director of the copyright advisory office.

other innovative publishing projects have included creating a digital version of a scholarly print monograph simultaneously published by fordham university press in a traditional paper version. the center also developed a solution for an encyclopedic work on film that was initially planned as a print publication but proved to be a better fit for a wikipedia-like solution. the author heard about the center from her library liaison who now serves on the project’s advisory group. this faculty member also took advantage of the work of the center for new media to provide a course web site that would facilitate her students in capturing and incorporating information resources into their papers.
the decision to locate responsibility for the development and management of the university’s institutional repository, academic commons, in the center for digital research, proved to be very effective. journals, conference proceedings, and videos of presentations are natural candidates for the repository. interactions with faculty about archiving research often lead to interest in cdrs’s other services. liaison librarians can offer faculty a wide range of services that go well beyond the traditional options of collection building and library instruction. a conversation in the history department raised the question of how to provide access to senior history theses (academic commons is a natural home); the slavic studies librarian applied for and received an neh grant to run a summer institute using cdrs’s conference services; a humanities liaison librarian set up endnote training for a summer seminar program, and was then able to connect the seminar leader with the video services unit so that a series of high interest seminar talks could be broadcast on youtube. the seminar also asked for a collaboration site for sharing drafts of papers and the center for digital research was able to work with the library information technology office to set up an alfresco instance for their use.

academic commons is the most visible part of the library’s long term, fedora-based digital archive. using a sam-fs storage environment designed and managed by the library information technology office, but housed in the university’s central data center, the long term archive has become key to the library’s management of its own digital collections and its ability to serve some of the broader digital archiving needs of the university.
researchers developing data management plans for nsf grant applications are able to call on the academic commons long term digital archive to meet some aspects of their data archiving and sharing plans.

national and international engagement continue to be high priorities for the digital programs group and for the columbia libraries generally, both in terms of giving back to the professional community and in sharing in the work of others to address key technology challenges. the libraries belong to major technology organizations such as hathitrust, duraspace, the national digital stewardship alliance, the new media consortium and orcid; directors participate in national steering committees and advisory boards, and are frequently invited to consult and present at meetings. programmers are committers to open source projects such as fedora and blacklight.

what have we learned from our work over the past several years? certainly that investments in technical staff and in scalable infrastructure are critical to the libraries’ ability to provide the services that faculty and students want and need. that a critical mass of technology savvy staff with a variety of areas of expertise creates a stimulating and creative work environment where staff can develop and learn from each other. that scalable solutions and common systems are critical to our ability to accomplish a lot with a little. the new organization has worked because staff at all levels are flexible and not territorial. working together across unit boundaries has been a major learning experience for all staff, but particularly for the directors of each group who have developed a deeper understanding of each other’s goals and approaches. for example, the fast-paced technology framework of the center for new media, with its focus on the immediate needs of this semester’s classrooms, is quite different than the technology environment that must be built for long-term digital preservation.
directors today better understand these necessary differences and the work cultures that they engender.

the columbia libraries are a distinctive example of collaboration and convergence, attracting and integrating independent and disparate programs and projects from across the university to advance an enhanced and innovative merging of information service ventures. by bringing together academic computing and research library and entrepreneurial vision, columbia has created a valuable and powerful organizational strategy that significantly enhances productivity, service, and impact.

tayler, f. & jafary, shifting horizons: a literature review of research data management train-the-trainer models for library and campus-wide research support staff
this is a preprint of an article accepted for publication in eblip . ( )

shifting horizons: a literature review of research data management train-the-trainer models for library and campus-wide research support staff in canadian institutions

dr. felicity tayler; research data management librarian, university of ottawa, ottawa, ontario, ftayler@uottawa.ca
maziar jafary; phd candidate and part-time professor school of sociological and anthropological studies, university of ottawa, ottawa, ontario emails: mjafary@uottawa.ca & mjafa @uottawa.ca

abstract

question: in consideration of emerging national research data management policy and infrastructure, what is the most effective way for a canadian research university to build capacity of library and campus-wide research support staff, with a view towards providing coordinated research data management (rdm) support services for our researcher community? what international training models and course offerings are available and appropriate to our local context?
what national guidelines and best practices for pedagogical design and delivery can be adapted to a local context? methods: this literature review synthesizes a total of thirteen sources: nine articles, two book chapters and two whitepapers selected for a narrative literature review due to their focus on case studies detailing train-the-trainer models. within the thirteen sources, we found fourteen key case studies. the articles reviewed here supplement the carl portage training expert group white paper, “research data management training landscape in canada,” the focus of which was to identify rdm training gaps in order to recommend a coordinated approach to rdm training in a national environment. results: three thematic areas emerged from the narrative review of case studies: pedagogical challenges were identified, including the need to target training to rdm support staff such as librarians and researchers, as they comprise distinct groups of trainees with divergent disciplinary vocabularies and incentives for training. our case studies cover a broad range of pedagogical models including single or multiple sessions, self-directed or instructor-led, in-person or online instruction, and a hybrid of the two. rdm training also emerged as a key factor in community building within library staff units, among service units on campus, and with campus research communities. conclusion: training programs at local institutions should be guided by a set of principles aligned with the training methods, modes of assessment, and infrastructure development timeline outlined in a national training strategy. 
when adapting principles and training strategies to a local context, the following trends in the literature should be considered: librarians (and researchers) must have meaningful incentives to undertake training in rdm or to join a community of practice; disciplinary-specific instruction is preferable over general instruction; a librarian’s own training opportunities will influence their ability to provide discipline-specific rdm instruction to researchers; in-person training opportunities improve learning retention and produce beneficial secondary effects; online instruction is most effective when paired with an in-person component; generalized third-party rdm training should be adapted to local context to be meaningful. future directions for rdm training will integrate into open access and digital scholarship training, and into cross-disciplinary, open science communities of practice.

acknowledgements: the authors would like to thank chantal ripp, jane fry, james doiron, lindsey sikora and kim powroz for valuable feedback on drafts of this review.

introduction

this literature review was undertaken to help the research services division of the university of ottawa library determine effective training methods for library and campus-wide research support staff, with a view towards providing coordinated rdm support services for the researcher community. for the last four years, university of ottawa has held an annual, in-person, campus-wide rdm training event, welcomed by researchers and a wider general audience.
the event was also attended by rdm-curious librarians and researchers from other universities, and by , had gained national attention as the shifting horizons training series. the edition presented a national training program, developed through canada’s carl portage network, with the goal of establishing a common, minimum service level of rdm skills for librarians and campus-wide research support staff in the research office, labs, faculty departments, and central it. despite the event’s success, the library’s research services division needed to evaluate whether a single annual event was the most effective way to achieve the vision of campus-wide rdm awareness and a coordinated service model. as was observed in the event’s follow-up rdm readiness report, the stakeholders of coordinated rdm services at university of ottawa continue to face the same challenges as those identified in a study of librarians in the us and other international locations by tang and hu ( ): “capacity/bandwidth; limited staffing,” “marketing and outreach of rdm service,” “collaborative understanding among campus departments,” “upskilling staff,” “providing consistent service in terms of quality and options,” and “connecting with the researcher and faculty.”

aims/objectives

the articles reviewed here supplement the carl portage training expert group white paper, “research data management training landscape in canada” (fry et al., ). the purpose of this white paper was to identify “significant issues and gaps in rdm training in canada,” to recommend a national, coordinated approach to rdm training, where expertise in data stewardship is unevenly distributed across higher education institutions, and is often isolated within disciplinary areas. in contrast to rdm infrastructures elsewhere, which cohere around disciplinary or national service centres, a critical mass of rdm expertise in canada is organized within the academic library community.
to date, this report’s holistic multi-platform vision of a coordinated national training curriculum, to “level the playing field” has been articulated in a modest capacity through best practices, data primers and ad-hoc webinar training, supplemented by single-day, in-person sessions reflecting the individual expertise of members of the portage training expert group. the day-long training event at university of ottawa, led by james doiron (who is both an author of the training landscape white paper and the rdm services coordinator at the university of alberta libraries), is an example of the in-person sessions currently offered through carl portage. once an institution has participated in the training, the next steps are unknown: there is no clear, national direction, recommended strategies, or coordinated curriculum resources to support the long-term development of highly qualified personnel (hqp) providing research data management services, if an institution has chosen to place its library in a leadership position to do so.

methods

this literature review synthesizes a total of thirteen sources: nine articles, two book chapters and two whitepapers from a larger sample of texts published within the last ten years ( - ). seven additional supporting sources have been cited in our analysis to provide the contextual framing for the thematic approach of this narrative review. keyword searches such as “research data management (and) training” were undertaken in databases including lista and library and information science source.
because rdm training is an emerging field, contingent upon variable jurisdictional challenges, policy, and funding environments, our aim was not to be exhaustive, nor systematic in our searches; rather, we supplemented these keyword database searches with a “snowball” approach to searching for key articles, white papers and reports shared by colleagues on rdm-themed listservs such as canlib-data, or iassist, or referenced at annual rda plenaries. in addition to the snowball searching, the authors contacted various content experts to review the abstracts collected, to ensure they were not missing any important sources. though the number of sources reviewed is sparse, this is an indicator that rdm is an emerging area of librarianship, which is also interdisciplinary in nature. there are simply not that many articles out there yet, and this literature review aims to fill this gap while recognizing that there is further work to do in this area.

in the thirteen sources selected for synthesis, we found fourteen key cases for analysis. out of the thirteen sources selected for synthesis, as outlined above, nine of the selected sources had a single case study focus (baker et al., ; grootveld & verbakel, ; haddow, ; helbig, ; papadopoulou & miller in clare et al., ; papadopoulou & grabauskiene in clare et al., ; wittenberg et al., ; southall & scutt, ; read et al., ). two of the selected sources covered multiple case studies (bryant et al., ; surkis & read, ; ). two of the sources dealt with the same case study (tang & hu, ; shipman & tang, ). in choosing the case studies, we prioritized european, north american, and australian examples as their social and academic contexts are comparable to those of canada. we recognize that this geographic limitation and focus on english-language sources introduces a bias to this review.
however, this bias in our selection does not reflect a deliberate exclusion of other regional models; rather, it echoes a trend to build canadian digital research infrastructure on existing models such as the european open science cloud (eosc), or to look to best practices in rdm established by the digital curation centre in the uk or to rdm service models as outlined by oclc in the us.

https://www.eosc-portal.eu/about/eosc https://www.dcc.ac.uk/ https://www.oclc.org/research/publications/ /oclcresearch-research-data-management.html

results

we have divided our review of the literature into three sections, reflecting themes within the articles and case studies. we will first look at the challenges and opportunities for rdm training in universities; outreach and pedagogical issues were identified by several authors, including the development of targeted rdm training for two distinct groups of trainees: rdm support staff, such as librarians, and researchers. these groups differ in their incentives for training participation and in their use of discipline-specific language and vocabulary. with these challenges in mind, the evaluation of training models for success and areas of improvement will be discussed. second, we will explain different approaches to curriculum and pedagogical design in research data management training. our case studies cover a range of pedagogical models and, whenever possible, we sought out evaluations of these training methods and formats of pedagogical engagement for rdm training. third, we will look at how rdm training operates as a means of community building within library staff units, between service units on campus, and with campus research communities.
this section also covers internal and external partnerships that are necessary to develop rdm training.

discussion

challenges and opportunities for rdm training in universities

while many of the texts that were retrieved in our searches addressed developing rdm services around best practices, or outlined approaches for broader data literacy training strategies, this literature review focuses on train-the-trainer models as a unique subset of the rdm training landscape. because the literature in this area is emerging, we have combined conclusions drawn from train-the-trainer models with approaches to training researchers. in a train-the-trainer model, the targeted audience of trainees is librarians and other research support staff; in the researcher trainer model, the targeted audience is typically faculty, student research assistants and other affiliates of disciplinary research projects. however, in practice the line between these roles is blurry, as trainers often become a secondary audience of the training for researchers, and researchers can also benefit from train-the-trainer sessions as they can perform a trainer role in their own research team. furthermore, as this review shows, there is a correlation between the pedagogical model applied to train-the-trainer sessions and the effectiveness of these trainers in shaping learning experiences for researchers. by outlining the challenges to providing rdm training to researchers in this section, the recommended best practices can inform approaches to train-the-trainer models. we begin with the principle that rdm is not generic; rather, librarians and other research support staff need a fundamental understanding of how data flows and data management differ between disciplinary research methods, and how to recommend relevant engagement with local, national and international infrastructure contexts.

rdm training for librarians and other research support staff will have an impact on the success of rdm services delivered. both tang and hu ( ) and surkis and read ( ) identify significant barriers and pedagogical challenges of rdm training for librarians and other research support staff, beyond the administrative concerns of budget and capacity. for example, librarian language and vocabulary do not translate well to the disciplinary environment of researchers and other stakeholders; such specialized rdm vocabulary might not be well received or even understood by researchers. another challenge could be a lack of training for librarians and research support staff on different approaches to research data management within the field of study, as defined by the researchers’ peers and funding bodies (tang & hu, ; surkis & read, , p. - ). while tang and hu’s needs assessment highlighted the need for key training in strategic communication of rdm service models to library and university administration, surkis and read instead stress that when the goal is the improvement of training offerings for researchers, instructors from the library sector (and related fields), as part of their own training, should engage in interviews with researchers in different fields to better understand their needs and expectations from rdm services (surkis & read, , p. ). a later study further explored this lack of disciplinary knowledge as a high barrier to librarian engagement with rdm services in biomedical fields, due to a “lack of comfort engaging with researchers” (read et al., , p. ). read’s study noted that a double gap in the training landscape, a “lack of satisfactory curricula” to train both librarians and researchers in rdm (read et al., , p.
), further contributed to the lack of rdm service offerings in biomedical fields. the anthology engaging researchers with data management, edited by connie clare et al. ( ), includes several case studies of rdm engagement and collaborations among researchers, to demonstrate how librarians and other research support staff with disciplinary awareness can encourage researchers to consider research data management practices and services as an extension of their disciplinary peer communities. in one of the chapters focusing particularly on rdm training, papadopoulou and grabauskiene evaluate the format of training “mini-events” for their impact on building a community of rdm supports and data management best practices at the vilnius university library in lithuania. each of these mini-events (delivered either as half-day or full-day workshops) consisted of three incremental phases: familiarizing the participants with rdm support services; learning how to use various available tools; and sharing research data in practice (in clare et al., , pp. – ). papadopoulou and grabauskiene specify that one of the challenges faced by these rdm training sessions is reaching out to, and persuading, the “uninterested” researchers to attend. one proposed strategy is completing outreach to researchers via their peers, rather than through a generic unit, such as information services (papadopoulou & grabauskiene in clare et al., , pp. – ). second, based on their study of a conference at the university of edinburgh, papadopoulou and miller propose that the events should include presentations by researchers from multiple university faculties. such presentations might discuss rdm best practices and their impact on researchers’ work, thereby encouraging their disciplinary peers to participate. thus, the presentations can also be interactive sessions among the researcher peers themselves (papadopoulou & miller in clare et al., , pp. – ).
approaches to research data management training

in the above section we outlined challenges of rdm training such as a gap between the terminology used by research support staff and that of the researcher community they support, and researchers’ lack of interest in engaging with rdm if it is perceived to be beyond the scope of methodologies shared by their disciplinary community. the challenges outlined above support the carl portage white paper finding that the matching of pedagogical design to trainee needs is a necessary learning objective for librarians and other research support service providers (fry et al., , p. ). our literature review has revealed multiple approaches to rdm training, specific to the trainee contexts. although we focus here on librarians and other research support staff as “trainees,” it is with an understanding that their training opportunities have an impact on the quality of rdm training and service provision available to researchers. further, we note several approaches to pedagogical design for rdm training, which can be loosely categorized as: generalized or discipline-specific instruction, single or multiple sessions, self-directed or instructor-led, in-person or online instruction (and most often, a hybrid of the two). the literature shows that there are significant advantages to delivering discipline-specific, or targeted, rdm training; however, a generalized approach to rdm training may be favoured due to perceived scalability. as mentioned, read et al.
note that available online training for librarians is inadequate to build rdm service capacity in biomedical fields, as none of it has the necessary disciplinary focus; this focus on general rdm training for librarians further contributes to a gap in disciplinary-specific training curricula for researchers (read et al., , p. ). after reviewing humboldt university of berlin’s rdm initiative, launched as a joint venture between computer and media service, research service centre, university library, and vice president for research, helbig similarly concludes, “although general workshops on research data management are more scalable in comparison to discipline-specific workshops, the advantages of a tailored approach outweighed this concern. through the provision of pinpointed guidance, participants receive more practically useful information for their individual data management” (helbig, , p. ). humboldt university’s rdm training initiative consisted of one-day workshops aimed at helping phd students and researchers in the geography department. groups of six to eight trainees were formed in order to facilitate the learning process. rdm specialists at the university felt that a targeted approach would be advantageous; through a priori surveys and interviews with researchers and graduate students, the workshops were designed for the specific needs of that department. by understanding the nature of rdm in geography, specialists were able to provide an interactive session encouraging the full participation of the trainees. other universities, such as monash university in australia, university of edinburgh in the united kingdom, and university of illinois in the united states, offer courses to targeted campus groups based on their needs. such needs are identified through consultation with strategic research management services at these universities, as well as in-person discussions with individual researchers around the campus. bryant et al.
explain that the integrated instruction model in a semester-long course is a preferable method because it is sustainable; as they observe, “the most resource-intensive approach to supporting rdm education is through in-person, instructor-led workshops” (bryant et al., , p. ). however, if a workshop approach is taken over a course integration approach, bryant et al. ( ) argue that rdm educational services should strategically align their workshops with course content and with broader institutional policies of the respective university (such as conforming to the requirements of data management plans).

within the literature, the choice between disciplinary focus and generalized curriculum models is paralleled by the choice of delivery mode: online modules, in-person sessions, or a hybrid of the two. online training modules are among the formats most desired by rdm professionals because they are perceived to allow flexibility in accommodating working schedules (tang & hu, ); nonetheless, read et al. note that the required time commitment is a strain on working librarians and there is a significant rate of non-completion of online training (read et al., , p. - ). read et al. also showed that while online modules improve the “understanding of and comfort level with rdm,” in-person instruction resulted in “improved rdm practices” (read et al., ). the differing experiences between online and in-person learning led read et al. to develop a hybrid, or “two-tier,” coordinated approach to rdm training for health sciences librarians, and for biomedical researchers that the librarians will, in turn, train and support (read et al., , p. ).
seven self-paced, multi-media, online modules were produced to train librarians. the modules covered general rdm topics and applications of rdm in health science methodologies and discipline-specific data standards. an evaluation form embedded at the end of each module facilitated a self-assessment. once a librarian indicated comfort with the content, they received a teaching toolkit which included a lesson plan and related materials to teach rdm to biomedical researchers via a - minute in-person session (read et al., , p. ). this hybrid, coordinated model improved the librarian’s ability to deliver an rdm session for researchers; as read et al. observe, “the online modules were concise and directly tied to the teaching toolkit, a curriculum specifically created for use by the librarians to teach rdm locally, thus addressing the time constraints of working professionals…” (read et al., , p. ). the learning objectives of online training options are improved when paired with in-person instruction. bryant et al. ( ) explain that the mantra research data management training modules, promoted on the website as “a free online course for those who manage digital data as part of their research project,” is a series of eight generic self-paced modules and tutorials that are supplemented by in-person training courses by rdm professionals, at the university of edinburgh (bryant et al., , p. ). the online modules, initially built for researchers and graduate students, have influenced pedagogical design of rdm training for librarians and research support staff, not only at the host institution, but also for researchers and staff of other institutions. in , mantra launched a diy training kit for librarians to facilitate the remote training modules. 
built for the uk research and funding environment, the course can be adapted locally to include online and in-person instruction, covering data management planning, organizing and documenting data, data storage, data sharing and ethics questions around data management. haddow ( ) writes of the experience of adapting and delivering the mantra diy training kit for librarians at the sterling university of edinburgh: the subject librarian members of a dedicated local rdm task force “found it beneficial to set time aside as a team to look at this issue;” however, they noted challenges and a significant time investment for the local facilitator to adapt the course content. as haddow ( ) explains: “the instructions were sometimes not clear but by the end i figured out that i just needed to look at the manual.”

the “data intelligence librarians course” was released in by tu.datacentrum, a partnership among three universities in the netherlands (the partnership was later called tu.researchdata ( )); this course provides another example of a learning platform targeted to digital preservation professionals and included two in-person sessions, at the beginning and the end of the training period. during the in-person sessions, coaches would teach the trainees, while during the online sessions, trainees were expected to be prepared for each unit and complete assignments by themselves or in pairs. throughout the online portion, trainees could reach out to their respective coaches through an established online platform. later, the course was transformed into “essentials data support,” whose target group was a more widely defined group of professionals identified as data supporters.
trainees from multiple institutions attended and worked mostly in pairs, learning how to write research data plans for fictional scenarios. participant surveys and networking through online forums following the training were completed (grootveld & verbakel, ). feedback indicated that homework assignments were the most valuable element of the course, as the pairing of trainees led to enjoyable discussions; participants also appreciated learning from researchers, including how they deal with data management issues and about differences between disciplines (grootveld & verbakel, , p. - ). trainees reported that the use of audio-visual elements was helpful for their learning experience. current versions of tu.researchdata consist of three variants: a combination of in-person sessions and online training platforms, supervised by coaches and open to online discussion forums; a self-directed, online course, open to online discussion forums; and a self-directed, online course with no access to coaches or discussion forums.

a recent example of generalized, online rdm training is the research data management librarian academy (rdmla), for librarians from multiple institutions around the globe (shipman & tang, ). the curriculum was based on needs gathered from interviews and a survey conducted by tang and hu ( ), as previously discussed, and its intent was to fill gaps in training for librarians in higher education through online training. although its success cannot be confirmed at this time, the online-only format of rdmla should be assessed in terms of the ability of librarian trainees to translate their knowledge into researcher training, in consideration of completion rates and the findings of studies on hybrid or in-person models.
it is important to note that the rdmla training is underwritten by the publisher elsevier, with modules promoting tools in which elsevier has a vested interest, while the other training programs reviewed were developed through public or local institutional funding streams.

despite available online solutions to local training gaps, in-person instruction remains a popular approach, as it catalyzes communities of practice around complex skillsets. wittenberg et al. ( ) discuss workshops launched by a research data management team at the university of california in berkeley, and show that in-person, ongoing, and discipline-based consultations on rdm by specialized liaison librarians are among the most successful methods of rdm support by university libraries. as they mention, “participants, on average, were more satisfied with domain-based rdm training than they were with general rdm training” (wittenberg et al., p. ). at the same time, the success of discipline-based training depends on a scientific community built around rdm, which is mainly based on continuous connections between liaison librarians and researchers (wittenberg et al., ). likewise, the library carpentry workshops with rdm-focused content, as discussed by baker et al. ( ), are a worthwhile comparison to the online or hybrid teaching models available to librarians, because they place a strong emphasis on in-person skill sharing and long-term community building.

as grootveld & verbakel ( ) mention, “data supporters are people who support researchers in storing, managing, archiving and sharing their research data” (p. ).
the multi-session workshop took place in the fall of over four three-hour weekly evening sessions at the city university london centre for information science. the workshops had three aims: to blend non-library specific software skills training with existing library specific programs; to collect data on software skills in university libraries; and to build the foundations of a distributed community model for embracing and sustaining software skills in the library (baker et al., , p. - ). prior to the sessions, attendees were asked to make a name badge that also identified their level of knowledge of rdm and related software, so that presenters could better guide the attendees. participants were also encouraged to note the level of knowledge of others to better assist them during the workshop, if needed. in this way, peer-to-peer collaborations were built into the workshop design (baker et al., , pp. - ). participants shaped workshop content: session one began with an introduction to basic programming concepts and attendees were asked to reflect on words and phrases associated with programming, code, and software from which they could benefit. baker et al. ( ) note that many universities around the world use “data carpentry workshops” formats and materials adapted to their local needs, which demonstrates the success of the project; still, they recognize the need to develop a set of resources to enable workshop attendees to share software skills in their home libraries (p. - ). it is anticipated that these resources would be predicated on the idea that the best way to reinforce one’s own software skills is through teaching others.

rdm training as a means of community building

the carl portage white paper outlined eight principles for developing a coordinated national training curriculum and several of these foreground the community of practice approach adopted by the librarian-led portage network rdm expert groups.
the notion of rdm as a set of skills and practices shared by a community, whether disciplinary, institutional, professional, or otherwise, is consistent with several of the articles reviewed above, as well as the “data communities” model of researcher behaviour in data sharing described by danielle cooper and rebecca springer ( ). however, while communities of practice may be wrapped in a myth of informal organizing, in reality they require leadership and intentional cultivation, particularly, as etienne and beverly wenger-trayner ( ) observe, if they are used for developing the “strategic capability” of an organization or its personnel. indeed, the strategy of nurturing national rdm infrastructure, training and support by “building partnerships in the face of complexity” has been carefully crafted by portage since its early stages (humphrey, , p. ). from this perspective, rdm librarians and other research support staff have a key role in training, as universities develop capacity to comply with rdm requirements of national and international funding agencies. for this reason, we will conclude this literature review with the seven principles of rdm training developed at tu delft ( ), as well as new approaches to librarian rdm training that build upon the intersections of research data management with the workflows, best practices and scholarly communities of open science.

the tu delft ( ) principles provide a framework whereby rdm training becomes the mechanism for cultivating a community of practice that is both campus-wide and disciplinary-focused, while reaching beyond the campus into the information circuits of scholarly community.
significantly, these principles encourage a researcher-focused rdm vocabulary; they foster collaboration between faculty and research support staff across multiple university departments and service providers; and furthermore, there is recognition that the university must provide meaningful incentives that motivate trainees, whether they are administrators, librarians, research support staff, researchers, or students, to join the community of practice. the tu delft “open working” website ( ) outlines some principles including: “whenever possible, data and software management training should be built upon the existing faculty-specific courses”; “building and delivering such training must be a collaborative effort between faculties, the library, graduate school and other university services”; and, “library and graduate schools should continuously engage in consultation processes with phd students and researchers.” at the same time, the principles recommend engagement with organizations outside universities as vital in making training resources sustainable. in order to successfully implement this vision, the tu delft principles recognize that researchers must receive the proper incentives to participate and contribute to the training. the library should also solicit feedback from researchers to iteratively improve and update the training content. finally, the principles reinforce that courses should be accompanied by clear learning objectives, a lesson plan and a description of the methods selected for the training (tu delft, ).

looking forward, one can imagine integrated training for librarians and researchers that establishes rdm as the foundation for data-sharing workflows and other best practices of open science scholarly communications. the international fair data principles (findability, accessibility, interoperability and reuse) can serve as a shared method across cross-disciplinary open scholarship practices due to a common engagement with digital assets.
as higman et al. argue, “researchers often want to be fair, and sometimes open; they are noble aspirations... by using the language of fair and open, we can engage people in data management too” (higman et al., , p. ). the bodleian libraries at the university of oxford offer a model of how the integration of rdm training with other areas of open scholarship might be achieved for librarians. library rdm services are led by one specialist, who has developed an rdm training series for researchers addressing key issues, such as working with confidential data, secondary use of data, and data deposit and preservation. this training series is often team-taught with it representatives or library staff with complementary expertise, highlighting the need for researchers to first contact their subject librarians with queries. rdm platforms are also supported by multiple members of library staff, not only the rdm specialist. the collaborative approach to rdm training for researchers, together with a distributed technical rdm service, “serves to reinforce the message of the training aimed at library staff, namely that rdm is an area that library staff across the board can support to some extent” (southall & scutt, , p. ). rdm training for librarians and library staff mirrors the content of training for researchers. two workshops cover basic principles of rdm, trends in scholarly communications, and concrete examples of data management, with an emphasis placed on an “increased understanding of digital scholarship, rdm issues and where these sit in relation to the work of the academic library and new areas of scholarly activity such as open access” (southall & scutt, , p. ).

https://www.go-fair.org/fair-principles/

conclusion

the aim of this literature review of thirteen sources, containing fourteen case studies, was to survey a range of rdm training and capacity-building approaches, to determine the next steps for our own local context of the university of ottawa. we looked to international training models in order to supplement a gap in the emerging national rdm policy, infrastructure, and training environment. for instance, a notable challenge in the canadian rdm training space is that many institutions have not yet developed the rdm institutional policies that are anticipated by the draft tri-agency research data management policy ( ). this layer of institutional strategy will enable the building of rdm into graduate-level curricula for both researchers and librarians. in the meantime, the next steps for building a training program at our local institution will begin with establishing a set of principles based on the findings of this literature review. we will align these principles with the training methods, modes of assessment, and infrastructure development timeline outlined in a national training strategy anticipated for release in fall by the portage training expert group as a follow-up to the white paper, tentatively titled: building a portage network training strategy: a canadian approach to research data management.

the following trends emerged through this literature review, which have informed the national training strategy, and will be taken into consideration when building our own local training options for librarians and other research support staff, and for researchers. librarians and researchers must have sufficient incentive to undertake training in rdm or to join a community of practice.
training requires a significant investment of time, whether online or in-person, and librarians are unlikely to take on additional training, or to complete the training once enrolled, without a perceived benefit or reinforcement through regular rdm service provision. disciplinary-specific instruction is preferable to general instruction for both librarians and researchers; however, a librarian’s own training opportunities will influence their ability to provide discipline-specific rdm instruction to researchers. there is a double gap in the training landscape, as the lack of disciplinary-specific training opportunities for librarians further contributes to a lack of training options and service offerings for distinct research areas. the range of pedagogical designs reflected in the case studies makes it difficult to draw conclusions as to whether intensive events, or a series of shorter time commitments over a longer time period, are preferable for learning outcomes. in-person training opportunities emerged as the preferred option for learning retention and for the secondary effects of building a community of practice. for the same reasons, online instruction was found to be most effective when paired with an in-person component. the sources in this literature review predate the global covid- pandemic, which has shifted higher education into online delivery in historically unprecedented ways; this context may present an opportunity to apply the best practices of online learning design to close the gap between the benefits of in-person training and the low retention in online learning environments. initiatives such as the university of british columbia rdm fall series are early responses to virtual rdm instruction in pandemic times, demonstrating, for example, the importance of adaptation to local contexts.
in the literature review, we already saw a recommendation that generalized rdm training offered by third parties must be adapted to local contexts to be meaningful. discipline-specific training, in-person training, and adaptation to local contexts are all resource-intensive activities, but are worth the investment. librarians and other research support staff with disciplinary awareness will more successfully engage with researchers and help them to adopt research data management practices as an extension of their disciplinary peer communities. finally, future directions for rdm training will be integrated into open access and digital scholarship and awareness training, as well as cross-disciplinary, open science communities of practice that reach beyond local campuses.
references
tu research data. ( ). https://researchdata. tu.nl/en/
baker, j., moore, c., priego, e., alegre, r., cope, j., price, l., stephens, o., strien, d. van, & wilson, g. ( ). library carpentry: software skills training for library professionals. liber quarterly, ( ), – . https://doi.org/ . /lq.
bryant, r., lavoie, b., & malpas, c. ( ). sourcing and scaling university rdm services (the realities of research data management, part ). dublin, oh: oclc research. https://www.oclc.org/research/publications/ /oclcresearch-rdm-part-four-sourcing-scaling.html
clare, c., cruz, m., papadopoulou, e., savage, j., teperek, m., wang, y., witkowska, i., & yeomans, j. ( ). engaging researchers with data management: the cookbook. open book publishers.
cooper, d., & springer, r.
( ). data communities: a new model for supporting stem data sharing. us national library of medicine.
grootveld, m. j., & verbakel, e. ( ). essentials for data support: training the front office. international journal of digital curation, ( ), urn:issn: – . https://doi.org/ . /ijdc.v i .
haddow, l. ( ). training subject librarians in rdm. research data blog. http://datablog.is.ed.ac.uk/ / / /training-subject-librarians-in-rdm/
helbig, k. ( ). research data management training for geographers: first impressions. isprs international journal of geo-information, ( ), . https://doi.org/ . /ijgi
higman, r., bangert, d., & jones, s. ( ). three camps, one destination: the intersections of research data management, fair and open. insights, ( ), . https://doi.org/ . /uksg.
humphrey, c. ( ). the carl portage partnership story. partnership: the canadian journal of library and information practice and research, ( ). https://doi.org/ . /partnership.v i .
fry, j., doiron, j., létourneau, d., perrier, l., perry, c., & watkins, w. ( ). research data management training landscape in canada: a white paper. doi:http://dx.doi.org/ . / .
papadopoulou, e. & miller, k. ( ). ‘dealing with data’ conference at university of edinburgh. in clare, c., cruz, m., papadopoulou, e., savage, j., teperek, m., wang, y., witkowska, i., & yeomans, j. ( ). engaging researchers with data management: the cookbook (pp. - ). open book publishers.
papadopoulou, e. & grabauskiene, r. ( ). duodi: the ‘days of data’ at vilnius university. in clare, c., cruz, m., papadopoulou, e., savage, j., teperek, m., wang, y., witkowska, i., & yeomans, j. ( ).
engaging researchers with data management: the cookbook (pp. - ). open book publishers.
read, k. b., larson, c., gillespie, c., oh, s. y., & surkis, a. ( ). a two-tiered curriculum to improve data management practices for researchers. plos one, ( ). https://doi.org/ . /journal.pone.
shipman, j. p., & tang, r. ( ). the collaborative creation of a research data management librarian academy (rdmla). information services & use, ( ), – . https://doi.org/ . /isu-
southall, j., & scutt, c. ( ). training for research data management at the bodleian libraries: national contexts and local implementation for researchers and librarians. new review of academic librarianship, ( – ), – . https://doi.org/ . / . .
surkis, a., & read, k. ( ). research data management. journal of the medical library association, ( ), – . https://doi.org/ . / - . . .
tang, r., & hu, z. ( ). providing research data management (rdm) services in libraries: preparedness, roles, challenges, and training for rdm practice. data and information management, ( ), – . https://doi.org/ . /dim- -
tu delft. ( , october ). vision for research data management training at tu delft. open working. https://openworking.wordpress.com/ / / /vision-for-research-data-management-training-at-tu-delft/
wenger-trayner, e., & wenger-trayner, b. ( ). introduction to communities of practice | wenger-trayner. https://wenger-trayner.com/introduction-to-communities-of-practice/
wittenberg, j., sackmann, a., & jaffe, r. ( ). situating expertise in practice: domain-based data management training for liaison librarians. the journal of academic librarianship, ( ), – . https://doi.org/ . /j.acalib. . .
shifting horizons: a literature review of research data management train-the-trainer models for library and campus-wide research support staff in canadian institutions
abstract
introduction
this literature review was undertaken to help the research services division of the university of ottawa library, to determine effective training methods for library and campus-wide research support staff, with a view towards providing coordinated rdm...
aims/objectives
the articles reviewed here supplement the carl portage training expert group white paper, “research data management training landscape in canada” (fry et al., ). the purpose of this white paper was to identify “significant issues and gaps in rdm t...
methods
results
we have divided our review of the literature into three sections, reflecting themes within the articles and case studies. we will first look at the challenges and opportunities for rdm training in universities; outreach and pedagogical issues were ide...
discussion
challenges and opportunities for rdm training in universities
while many of the texts that were retrieved in our searches addressed developing rdm services around best practices, or outlined approaches for broader data literacy training strategies, this literature review focuses on train-the-trainer models as a ...
rdm training for librarians and other research support staff will have an impact on the success of rdm services delivered. both tang and hu ( ) and surkis and read ( ) identify significant barriers and pedagogical challenges of rdm training for ...
the anthology, engaging researchers with data management edited by connie clare et al. ( ), includes several case studies of rdm engagement and collaborations among researchers, to demonstrate how librarians and other research support staff with d...
approaches to research data management training
in the above section we outlined challenges of rdm training such as a gap in terminology shared by research support staff and the researcher community that they support, and the lack of interest of researchers to engage with rdm, if it is perceived to...
the literature shows that there are significant advantages to delivering discipline-specific, or targeted rdm training; however, a generalized approach to rdm training may be favoured due to perceived scalability. as mentioned, read et al. note that a...
within the literature, the choice between disciplinary focus or generalized curriculum models, is paralleled by the choice of delivery mode through online modules or in-person sessions (and a hybrid of the two). online training modules are among the m...
rdm training as a means of community building
conclusion
references
libraries as research partner in digital humanities, dh , adho libraries and digital humanities special interest group, pre-conference workshop, national library of the netherlands, the hague, july .
growing an international cultural heritage labs community
sally chambers (ghent centre for digital humanities, belgium); mahendra mahey (british library labs, british library, london, united kingdom); katrine gasser (royal danish library, denmark); milena dobreva-mcpherson (ucl qatar, doha, qatar); kristy kokegei (history trust of south australia, adelaide, australia); abigail potter (library of congress, washington d.c., usa); meghan ferriter (library of congress, washington d.c., usa); rania osman (bibliotheca alexandrina, alexandria, egypt).
‘cultural heritage labs’ in galleries, libraries, archives and museums around the world help researchers, artists, entrepreneurs, educators and innovators to work on, experiment, incubate and develop their ideas of working with digital content through competitions, awards, projects, exhibitions and other engagement activities.
they do this by providing services and infrastructure to enable, facilitate and give access to their data both openly online and onsite for research, inspiration and enjoyment. in september , the british library labs team organised a ‘building library labs’ international workshop. the event provided the opportunity for colleagues who are planning or already have digital experimental ‘labs’ to share knowledge, experiences and lessons learned. the workshop, which attracted over institutions from north america, europe, the middle east, asia and africa, demonstrated a clear need and enthusiasm for establishing an international support network. within months, a second international workshop was organised at the royal danish library in copenhagen in march . in total we have brought together some participants and an even wider community of around people online. some have been sharing their experiences in setting up, using and running innovation labs, but there was a sizeable group of attendees who are planning to set up such labs and need advice and support in how to do this. the aim of this short paper is to present the journey and development of the international labs community and outline our future activities. the principle of the network is that by fair sharing and ‘paying forward’ our expertise, knowledge and experiences, the group hopes to ensure that organisations don’t have to ‘re-invent the wheel’. organisations can learn from each other and enable collaboration across borders through their digital collections, data, services, infrastructure and practice. we hope this will result in building better digital ‘labs’ for their organisations and their users and help to further open up data and services for everyone.
https://adholibdh.github.io/dh -preconference/
people are the essence of the international labs network.
from the results of an initial global building library labs survey, including responses from countries, there was significant interest from the wider cultural heritage sector, beyond libraries. with currently people, from over institutions, based in over countries affiliated with the network, a solid set of communication tools was needed. the network has a shared google drive, a mailing list and a wiki, as well as an active whatsapp group and a slack channel, and meets regularly via zoom. with two successful events behind us, and plenty of enthusiasm and willingness to continue activities further, we are now looking to the future. planned activities include: a booksprint to capture significant knowledge and expertise within the labs network, serving as a reference guide for people wanting to build their own lab; populating our wiki; and creating a global directory of cultural heritage labs. further regional and international events have been and are being organised. in less than a year, the labs network has come a long way, and this is only the beginning!
british library labs started out as an andrew w. mellon foundation funded project in , to support and inspire the use of the british library’s digital collections and data in exciting and innovative ways, through competitions, events and collaborative projects. from spring , the british library labs is funded by the british library. for further information, see: https://www.bl.uk/projects/british-library-labs
https://blogs.bl.uk/digital-scholarship/ / /building-library-labs-around-the-world.html
https://blogs.bl.uk/digital-scholarship/ / /the-world-wide-lab-building-library-labs-part-ii.html
https://en.wikipedia.org/wiki/pay_it_forward
https://www.surveymonkey.co.uk/r/building-library-labs we would like to particularly acknowledge the work of nora mcgregor from the british library’s digital scholarship team here.
labs network google drive: https://goo.gl/s zc - as of . . , there are contributors to this collaborative folder.
http://www.jiscmail.ac.uk/lists/librarylabs.html - as of . . there are people subscribed to the mailing list.
labs network wiki: https://wikis.fu-berlin.de/x/xqaxnw this wiki was kindly set up by martin lee and is hosted by his institution, the freie universität berlin. access can be requested by emailing mahendra.mahey@bl.uk
to be added to the labs network whatsapp group, send your international mobile phone number to mahendra mahey at + .
https://buildingglamlabs.slack.com
since the first labs network event in september , there have been ‘virtual’ meetings of the network october , november and february . the next meeting is being scheduled for late may , followed by regular meetings every months.
at the time of writing, a possible source of funding and a location for the labs network booksprint is being finalised. if secured, it is expected that the labs network booksprint would be facilitated by booksprints (https://www.booksprints.net/), a new zealand company with their headquarters in berlin. a booksprint is based on the concept of a team of people having days to create a book from scratch. while the writing team are sleeping, another team in new zealand and australia would be undertaking the editing, proof-reading and illustrating the fruits of the writing efforts. at this stage, it is anticipated that the booksprint would take place near the end of .
associate professor milena dobreva-mcpherson, research assistant somia salim and research fellow fidelity phiri are the team at university college london qatar working on the international labs network in close collaboration with mahendra mahey, british library labs manager.
in march , a dedicated session about the labs network was held as part of the arts, humanities and cultural data summit and dariah beyond europe international workshop at the national library of australia in canberra: https://www.humanities.org.au/special-event- /. a key outcome of this event was to instigate an australasian regional node of the labs network.
in may , a second labs event on experimenting in the digital research lab: a hands-on introduction to using digital collections for digital humanities research will be held at the royal library of belgium as a pre-conference workshop to the international digital access to textual cultural heritage (datech) conference: http://datech.digitisation.eu/programme/workshop/.
publishing and reusing linked open data in galleries, libraries, archives and museums (glams): opportunities and challenges by maría dolores sáez, pilar escobar, gustavo candela, manuel marco-such and mahendra mahey for 'musaccess-madrid ', april, , http://www.musacces.es/, presentation slides here: https://docs.google.com/presentation/d/ ijmddnhwuuci-odok f dwuptxz jyqpwa o-_q zkg/edit#slide=id.p
a 3rd international labs event is currently in the early stages of planning and will potentially be held at the library of congress, washington dc in late spring .
benefits and limitations of three-dimensional printing technology for ecological research
behm et al. bmc ecol ( ) : https://doi.org/ . /s - - -z
methodology article
jocelyn e. behm*, brenna r. waite, s. tonia hsieh and matthew r. helmus
abstract
background: ecological research often involves sampling and manipulating non-model organisms that reside in heterogeneous environments. as such, ecologists often adapt techniques and ideas from industry and other scientific fields to design and build equipment, tools, and experimental contraptions custom-made for the ecological systems under study. three-dimensional (3d) printing provides a way to rapidly produce identical and novel objects that could be used in ecological studies, yet ecologists have been slow to adopt this new technology. here, we provide ecologists with an introduction to 3d printing.
results: first, we give an overview of the ecological research areas in which 3d printing is predicted to be the most impactful and review current studies that have already used 3d printed objects.
we then outline a methodological workflow for integrating 3d printing into an ecological research program and give a detailed example of a successful implementation of our 3d printing workflow for 3d printed models of the brown anole, anolis sagrei, for a field predation study. after testing two print media in the field, we show that the models printed from the less expensive and more sustainable material (blend of % plastic and % recycled wood fiber) were just as durable as, and had predator attack rates equal to, those printed from the more expensive material ( % virgin plastic).
conclusions: overall, 3d printing can provide time and cost savings to ecologists, and with recent advances in less toxic, biodegradable, and recyclable print materials, ecologists can choose to minimize social and environmental impacts associated with 3d printing. the main hurdles for implementing 3d printing (availability of resources like printers, scanners, and software, as well as reaching proficiency in using 3d image software) may be easier to overcome at institutions with digital imaging centers run by knowledgeable staff. as with any new technology, the benefits of 3d printing are specific to a particular project, and ecologists must consider the investments of developing usable 3d materials for research versus other methods of generating those materials.
keywords: 3d models, additive manufacturing, anolis sagrei, clay model, curaçao, maya autodesk, sustainability
© the author(s) . this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the creative commons public domain dedication waiver (http://creativecommons.org/publicdomain/zero/ .
/) applies to the data made available in this article, unless otherwise stated.
background
ecologists exhibit exceptional creativity and ingenuity in designing new tools and equipment for their studies, often incorporating and repurposing technology from other fields. for example, unique solutions have been devised for tracking animals (backpack-mounted radio transmitters [ ]), tracking seeds (fluorescent pigments [ ]; seed tags [ ]), catching animals (pit-less pitfall traps [ ]), containing or restraining difficult-to-hold specimens (squeeze box for venomous snakes [ ], ovagram for amphibian eggs [ ]), and remotely collecting data or samples (frog logger [ ]; hair trap [ ]), among countless others. because many ecological studies require customized equipment, ecologists are no strangers to building the contraptions necessary for conducting their research, and the weeks leading up to and during field seasons and lab experiments often involve multiple trips to hardware stores and craft shops.
open access bmc ecology *correspondence: jebehm@temple.edu integrative ecology lab, center for biodiversity, department of biology, temple university, philadelphia, pa, usa full list of author information is available at the end of the article
despite the high level of creativity and adaptability exhibited by ecologists, there is one technology that ecologists have been slower to adopt relative to other fields: three-dimensional (3d) printing. additive layer manufacturing, or 3d printing, is the layering of material by a computer-controlled machine tool to create an object from a digital file that defines its geometry [ ].
most objects are printed in plastic, but newer print materials such as metal, wood, or other composites are increasingly common in consumer applications. in the recent past (i.e., before ), 3d printing was cost-prohibitive and limited in availability, but it is now affordable and accessible to budget-conscious ecologists. many research institutions have at least one 3d printing center and 3d printing services are available to all online. other fields, such as the health sciences, have readily adopted 3d printing into their research (e.g., [ ]), but it is as of yet an untapped technology that ecologists can exploit to their advantage [ ]. recent studies have highlighted the benefits of 3d printing in terms of cost and time efficiency [ , ], yet ecologists wanting to implement 3d printing for the first time must still traverse a steep learning curve. our goal here is to flatten the curve and provide ecologists with a general but sufficient background in 3d printing technology to know what considerations are important when approaching a 3d printing project. in this article, we provide an overview of how 3d printing has been adopted by fields related to ecology. we highlight areas of ecological research where we think 3d printing has the promise to be most effective and provide a methodological workflow for integrating 3d printing into ecological studies. we illustrate this workflow using an example from our own work, which includes the obstacles we encountered and the solutions we devised. finally, we conclude with important environmental sustainability considerations.
overview of 3d printing in fields related to ecology
two disciplines that were early adopters of 3d printing technology and have strong connections to ecology are biomechanics and natural history curation. below we provide examples of 3d printing implementations in these fields to provide ecologists with ideas of what is possible.
the aim of biomechanics is to understand the movement and structure of living organisms by integrating across physics, engineering, physiology, and ecology. in biomechanics, 3d printing is used to test how the shapes of particular appendages or biological structures function in the physical environment without having to use live organisms. for example, 3d printed models of the sand-burrowing sandfish lizard’s (scincus scincus) respiratory system made it possible to study why it does not inhale sand in ways that are impossible with a living lizard’s respiratory system [ ]. in studies of fluid dynamics, 3d printed models of swift (apus apus) wings and bodies of echolocating bat species permitted tests in water and wind tunnels respectively to understand how morphology influences species’ movements [ , ]. in other applications, biomechanical theory is tested by attaching 3d printed structures to robots. in a study of underwater burrowing mimetics in bivalves, germann et al. [ ] used mathematical models to design a bivalve shell which was 3d printed and incorporated into a burrowing robot. in other studies, evolutionary optimization models are used to design the shape of anatomical structures. then, 3d prints of the modeled and naturally occurring structures are compared in performance tests to understand the evolutionary limitations species face in structural adaptation, in examples such as station keeping in aquatic environments, morphological optimization of balance and efficiency in fish, and seahorse tail shape morphology [ – ]. for these studies, 3d models enabled scientific inquiry, as manipulating live animals would have been challenging or impossible.
in the field of natural history curation, 3d printing increases the speed at which discoveries are made, and the rate at which data and resources are shared across natural history collections [ ].
in paleontology, the reconstruction of complete skeletons is often impaired by the recovery of incomplete remains at dig sites. mitsopoulou et al. [ ] used mathematical allometric scaling models to calculate the dimensions of bones missing from the remains of a dwarf elephant (paleoloxodon tiliensis) recovered from charkadio cave on tilos island, greece. from these analyses, a 3d model was printed to allow the complete skeleton to be assembled. in addition, 3d technology also facilitates the sharing of museum material without having to loan valuable specimens, making it possible to construct complete skeletons using partial skeletons from multiple separate collections [ ]. in fact, museums have been quick to adopt 3d technology because it vastly improves the rate at which collections are shared. the exchange of 3d-printed specimens facilitates crowd sourcing for specimen identification and access to high-quality replicas of endangered, extinct, or otherwise valuable and/or fragile specimens, and printed specimens can even be used in a field setting for species identification [ , ]. museums are increasingly accepting deposits of 3d printed material for rare and/or difficult to access specimens. lak et al. [ ] employed 3d technology to describe two new damselfly species that were preserved in amber. because it is difficult to physically extract amber-encased specimens without damaging them, the team used phase contrast x-ray synchrotron microradiography to make 3d images of the specimens and deposited the 3d prints in several museums. finally, 3d technology also accelerates the flow of information for education and outreach. for example, bokor et al. [ ] developed a classroom exercise where students print fossilized horse teeth and examine how the teeth changed over time with respect to changing climate.
integration of 3d printing in ecology
while ecologists have used 3d printing in a variety of applications (table ), there are four areas where we view 3d printing to be the most impactful: behavioral ecology, thermal ecology, building customized equipment, and enhancing collaboration.
the main goal of behavioral ecology is to understand how ecological and evolutionary forces shape behavior. in addition to observational studies, behavioral ecology research can involve manipulations of environmental conditions to test hypotheses. for testing hypotheses in both lab and field conditions, 3d printing may be incredibly useful for making precise, repeatable models. three-dimensional printing has already been used to create precise models of bird eggs to test egg rejection behavior in the context of brood parasitism [ ], zebrafish shoals to test the effect of body size on zebrafish shoaling preferences [ ], artificial flower corollas to test the effect of floral traits on pollinator visitation [ – ], and female turtle decoys to test the effect of body size on mate choice [ ] (table ). in these studies, 3d printing was chosen for its ability to create identical experimental stimuli because alternative methods, such as constructing models by hand, could introduce unintentional variation that makes it difficult to determine whether study subjects are responding to intentional or unintentional variation in experimental stimuli. in addition, 3d printing is often faster than alternative methods of creating models [ ]. there may be scenarios where 3d printing will not produce more biologically accurate models than other methods, but in many cases, 3d printing will increase the types of behavioral questions that can be asked [ ].
for example, northern map turtles (graptemys geographica) are sensitive to captivity, and using 3d printed decoys of females permitted field studies of male mating behavior, whereas using live females for the same study would have been detrimental to their survival [ ]. within the field of behavioral ecology research, 3d printing can be used to test myriad behaviors including predation (see “workflow application”), reproduction, foraging, social interactions, and defense in both aquatic and terrestrial habitats.

thermal ecology is focused on understanding how organisms are influenced by the temperature profile of their environment. a major challenge of thermal ecology research is constructing models that accurately replicate the thermal properties of a study organism. copper models are often used; however, recent work demonstrated that 3d printed plastic models were cheaper and faster to construct and exhibited no difference in thermal properties compared to standard copper models (table ) [ ]. this, as well as the need for high numbers of identical models, suggests that 3d printed models may make thermal ecology research more accessible.

perhaps 3d printing will be the most helpful to the widest number of ecologists because it provides a method for constructing customized equipment such as tools and experimental habitats or mesocosms. in the field of soil ecology, 3d printing has been used to print artificial soil structures which accurately replicate the macropore structure of soil (table ) [ , ]. these artificial soils are ideal replicate experimental mesocosms for soil macro- and/or microorganisms. structures designed for other studies could be repurposed by ecologists as experimental habitats, such as artificial gravel beds originally designed for testing water flow patterns [ ] and artificial oyster shell reefs used to test how habitat complexity influences predation rates [ ].
opportunities for printing tools are limited primarily by the ecologists’ imagination and range from simple structures to complex moving machines [ ]. on the low-complexity end of the spectrum, 3d printing has been used to sample two difficult-to-catch, invasive, tree-boring beetle species that cause significant damage. three-dimensional printed emergence traps make it possible to effectively trap and census invasive ambrosia beetles (euwallacea fornicatus) as they emerge from trees [ ], while 3d printed decoys placed on standard beetle traps enhanced capture rates of invasive emerald ash borer beetles (agrilus planipennis) [ ]. in a more complex application, whale researchers used 3d printing to build an unmanned surface vehicle named snotbot which allows scientists to get close enough to whales to collect biological samples (table ) [ ]. there are ample opportunities for ecologists to design tools to aid in data collection, sample processing, organism containment, and even organization of field or lab spaces.

table . ecological studies that have used 3d printing (nr = not reported)

research topic | taxa | objects printed | print medium | sample size | references

behavioral ecology
egg rejection behavior in context of brood parasitism | brown-headed cowbird (molothrus ater) | cowbird eggs that varied in size/shape, then painted different colors | “white strong and flexible plastic, polished” | | [ ]
effect of corolla shape on pollinator behavior | hawkmoth (manduca sexta) | flowers that varied in corolla shape based on specific mathematical parameters | acrylonitrile butadiene styrene (abs) plastic | nr | [ ]
effects of visual and olfactory floral traits in attracting pollinators | mushroom-mimicking orchid (dracula lafleurii) | molds to make silicone flowers | cyanoacrylate impregnated gypsum | nr | [ ]
effect of nectar caffeine concentrations on pollination service | bumble bees (bombus impatiens) | structures that functioned like corollas over glass jars containing artificial nectar | plastic (type non-specified) | min. | [ ]
social behavior of zebrafish in response to varying stimuli | zebrafish (danio rerio) | predatory fish model robot | abs plastic | | [ ]
social behavior of zebrafish in response to varying stimuli | zebrafish (danio rerio) | shoals comprising zebrafish that varied in body size plus anchoring materials | abs plastic | shoals | [ ]
social behavior of zebrafish in response to varying stimuli | zebrafish (danio rerio) | biologically-inspired zebrafish replica | abs plastic | | [ ]
influence of female body size on mate choice by males | northern map turtles (graptemys geographica) | replicas of female turtles that differed in body size | abs plastic | | [ ]
evaluation of 3d printing as suitable method for field predation model studies | brown anole (anolis sagrei) | lizard models using print media, covered in clay, and field-tested for predation | abs plastic, plastic-wood hybrid filament | | this study

thermal ecology
comparing thermodynamics of 3d printed and copper lizard models | texas horned lizard (phrynosoma cornutum) | thermal models of lizards | abs plastic | | [ ]

tools: experimental areas
evaluation of 3d printed soil as suitable for fungal colonization | plant pathogenic fungus (rhizoctonia solani) | artificial soil from 3d scans of soil with varying micropore structure | nylon | | [ ]
comparing hydraulic properties of 3d printed soil relative to real soil | soil | artificial soil from 3d scans of soil | resin (visijet crystal ex plastic material) | | [ ]
microscale bacterial cell–cell interactions | pseudomonas aeruginosa and staphylococcus aureus | “designer” bacterial ecosystems that vary in size, geometry and spatial distance with exact starting quantities of p. aeruginosa and s. aureus | gelatin | nr | [ , ]
effect of interstitial space on predator–prey interactions | blue crab (callinectes sapidus) and mud crab (eurypanopeus depressus) | oyster shells aggregated into artificial reefs that varied in interstitial space configuration | polylactic or abs plastic | nr | [ ]

tools: sampling equipment
collecting unobtrusive biological samples from whales | southern right, humpback and sperm whales | components to build an unmanned surface vehicle for oceanographic research (snotbot) | abs plastic and nylon | | [ ]
tools for studying the impact of ambrosia beetles on trees | shot hole borer beetle (euwallacea fornicatus) | components for entry devices and emergence traps | abs plastic | | [ ]
testing decoys vs real beetles to enhance trap capture rates | emerald ash borer beetle (agrilus planipennis) | beetle decoy to use on traps | abs plastic | | [ ]

from the examples provided above, designing custom materials certainly benefits scientists within the context of a particular study. however, the use of 3d technology also provides a mechanism for collaboration that extends beyond the limits of a single study. ecological studies that are replicated across systems, geographic boundaries, latitudinal gradients, etc., are a powerful method for testing ecological theory [ ]. the use of 3d technology facilitates these broad-scale studies through the sharing of identical tools, models, and/or equipment that can be used in multiple systems. for example, 3d printed models of brown-headed cowbird (molothrus ater) eggs [ ] and texas horned lizards (phrynosoma cornutum) [ ] can be used to test patterns of brood parasitism and thermal tolerances, respectively, across their geographic ranges. similarly, for widespread invasive species like the emerald ash borer, sharing effective trap methodology [ ] among scientists and agencies can potentially accelerate the rate at which the impact of the species is mitigated. in addition, 3d technology provides a useful platform for ecologists who would like to incorporate citizen scientists into a research program.
indeed, effective sampling technologies that can be disseminated electronically are ideal for citizen science, and increase the speed at which consistent data can be collected [ ].

workflow methodology

below we describe a general workflow to use when incorporating 3d printing into ecological research. essentially, once an ecologist has identified the object to be printed, the 3d printing process involves creating a printable 3d digital image file of the object, selecting an appropriate print medium, and then printing draft and final versions of the object (fig. ). to be clear, details specific to each project and available resources will need to be explored and fine-tuned along the way. however, our workflow highlights the major steps and aspects to consider at the outset.

make a digital object file

the first step is to generate a digital file of the object to be printed, which can be accomplished by creating a digital file of the image from scratch, converting a 2d image (e.g., photograph) into a 3d image, scanning an existing 3d object, or using an existing 3d file. all digital 3d files require use of software specifically for editing 3d images (additional file ). the most common 3d image file format is the stl file, which is used by many software packages. depending on the image generating methods used and the types of modifications needed, there may be a significant learning curve to attain the necessary level of proficiency with the software. this is especially true for creating a 3d image completely from scratch (see below). in our experience, however, we scanned an existing object and an undergraduate student was able to work together with the printing center staff to learn the software and manipulate the image within months.

before trying to create the image from scratch or scan an existing object, it may be worthwhile first to check the many libraries of 3d imagery that are available online (additional file ).
it is possible that a digital 3d file of a similar object has already been created and can be downloaded, potentially for free, ready to be printed. even if the file in an online library is not exactly perfect, it can be manipulated using 3d software (additional file ), which, depending on the modifications needed, may be a more efficient use of time than scanning an object or trying to draft an image from scratch.

if a suitable digital 3d file is not available, but the object to be printed is in the ecologist’s possession, it is possible to use a 3d scanner to make a digital 3d image of the object, similar to how a flatbed scanner makes a digital 2d image of an object. there are various types of scanners, and it is necessary to choose a scanner that can accurately capture the level of detail needed for the project from the object being scanned. laser scanners, structured light scanners, and even smart phone apps can be used to create lower resolution scans of an object’s external features. laser scanners were used to scan texas horned lizards that were frozen in realistic positions for a thermal ecology study (makerbot digitizer 3d, makerbot, new york, usa) [ ], and oyster shells for a biomechanical predation study (vivid i, konica minolta inc., tokyo, japan) [ ]. for more complex and fine scale objects with both internal and external features, like soil micropore structure or seahorse tail skeletal structure, methods like x-ray microtomography (hmx , nikon corp., tokyo, japan) [ ] or micro-computed tomography scanning (skyscan , kontich, belgium) [ ] may be more appropriate.

fig. steps of workflow for integrating 3d printing in ecological research

if the object to be printed is not in the ecologist’s possession, it is possible to design the object using 3d drafting software (additional file ), with the time investment being proportional to the researcher’s proficiency with the software and the complexity of the object.
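to make the stl format mentioned in this section more concrete, the following minimal sketch writes a small triangle mesh as ascii stl text. the tetrahedron, the solid name, and the function names are illustrative assumptions and are not taken from the study; real scans contain many thousands of facets and are usually exported by scanning or drafting software rather than written by hand.

```python
def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c), following the right-hand rule."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    return (nx / length, ny / length, nz / length)


def to_ascii_stl(name, vertices, faces):
    """Return ASCII STL text for a mesh given as vertices and index triples."""
    lines = [f"solid {name}"]
    for i, j, k in faces:
        a, b, c = vertices[i], vertices[j], vertices[k]
        lines.append("  facet normal {:e} {:e} {:e}".format(*facet_normal(a, b, c)))
        lines.append("    outer loop")
        for vertex in (a, b, c):
            lines.append("      vertex {:e} {:e} {:e}".format(*vertex))
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# hypothetical example object: a tetrahedron with 10-unit edges along the axes.
# faces are listed counter-clockwise when viewed from outside the solid.
VERTICES = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
FACES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

stl_text = to_ascii_stl("tetra", VERTICES, FACES)
with open("tetra.stl", "w") as handle:
    handle.write(stl_text)
```

an stl file stores only triangles and their normals; it does not record units, colors, or materials, so the intended scale needs to be agreed on with the printing facility.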
using photogrammetry, photos can be digitized and 2d x,y coordinates from the photo converted into a 3d image [ , ]. photogrammetry may be the easiest and most cost-effective method, especially if a scanner is not available. in addition, photogrammetry can be used to augment an image produced by 3d scanning: in the creation of 3d printed northern map turtle decoys, the carapace and legs of a dried specimen were scanned and the head was digitally rendered using photographs [ ]. alternatively, mathematical formulae may be used to generate different shapes, such as the surface of a bird egg [ ] or the curvature of a flower corolla [ ]. finally, it is possible to draft the object completely from scratch (e.g., [ ]), although a higher proficiency with the appropriate drafting software is necessary (additional file ).

once a digital 3d image file is in hand, it will likely need to be edited and customized for the particular study. for example, in the brood parasitism study, the 3d image of the bird egg was edited to make it hollow so that the printed versions could be filled with water, making their weight and thermal properties more closely match those of a real bird egg [ ]. similarly, in the thermal ecology study, the 3d image of the texas horned lizard was edited to include a well in the underside that fit a small environmental sensor (ibutton) for measuring temperature [ ]. object size can also be manipulated and various polygons added to include additional structures.

depending on the type of printer and material used, the image may need to be edited to make printing possible and to efficiently use printing material. non-manifold geometry errors (i.e., geometry that cannot exist in the real world) can be common in scans made of biological objects and must be corrected to avoid fatal printing errors. most 3d file manipulation software allows for these corrections (additional file ).
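two of the edits described above, resizing an object and screening for non-manifold geometry, can be sketched on a plain triangle mesh. the check below uses one common necessary condition for a printable closed surface: every edge must be shared by exactly two triangles. it is a simplified illustration under that assumption, not the procedure used by any particular software package, and it does not catch every class of error (for example, non-manifold vertices or flipped normals). all names and the example mesh are hypothetical.

```python
from collections import Counter


def edge_counts(faces):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for i, j, k in faces:
        for u, v in ((i, j), (j, k), (k, i)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def problem_edges(faces):
    """Return edges that are not shared by exactly two triangles.

    A count of 1 indicates a hole (open boundary); a count above 2
    indicates a non-manifold edge.
    """
    return sorted(edge for edge, n in edge_counts(faces).items() if n != 2)


def scale_vertices(vertices, factor):
    """Uniformly scale a mesh about the origin, e.g. 1.5 for 50 % larger."""
    return [(x * factor, y * factor, z * factor) for x, y, z in vertices]


# hypothetical closed mesh (tetrahedron) and the same mesh with one face missing
vertices = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
closed_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
open_faces = closed_faces[:3]

print(problem_edges(closed_faces))  # no problem edges for the closed mesh
print(problem_edges(open_faces))    # the three edges bordering the hole
larger = scale_vertices(vertices, 1.5)
```

in practice, mesh editing software reports and repairs such edges automatically; a check like this is mainly useful for confirming that a file is clean before it is sent to a printing facility.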
because most printers print the object from the bottom up layer-by-layer, any appendages or protrusions that extend out much wider than the bottom layer may need added scaffolding to make the print possible. this scaffolding is removed after printing is completed, with varying degrees of effort depending on the design and print material. in addition, if the object is not flat, it will likely need a flat base added to make it printable. if multiple copies of the object are to be printed, it may be possible to rotate or stack them so that several copies can be printed simultaneously. this method ensures efficient use of printing platform space and materials.

printer and printing material

there is a wide range of 3d printers that use various printing technologies and materials, and a comprehensive review of all printer types is beyond the scope of this article. for a technical review of various 3d printing technologies, we refer the reader to [ , ]. here, we focus on the printers and materials likely to be most useful to ecologists. many factors must be weighed when choosing a printer and printing material for a project, such as cost, material durability, printed surface quality, timeframe for printing, and color. the most ubiquitous printers, which are common on university campuses and also available through commercial online printing services, typically use either plastic-based filament or resin as the print material. filament is hard plastic stored on spools that is melted and deposited as beads or streams during printing, which quickly re-harden into layers to form the object. resin is a polymer liquid that is layered and solidified with uv light. both come in a range of colors. filament is often cheaper but leads to a lower resolution print with printed bands more prominent on the finished object; if needed, surface finishing methods such as acetone vapor may be used to smooth out these bands.
filament may also be less durable for some applications, and cracks can form between layers if the object is subjected to physical stress. finished resin products are generally smoother, can be printed at higher resolution, are more durable, and have the surface quality of a store-bought plastic item. filament and resin have been used for printing low and high resolution ecological models, respectively. for example, acrylonitrile butadiene styrene (abs), a type of filament, was used for printing artificial flowers [ ], artificial zebrafish [ , ], and models of lizards [ ], while resin was used for printing artificial soils with fine-scale pore structure in a hydrology study [ ].

it is also worth considering the type of scaffolding involved with a specific printer/print material combination. for some printing set-ups, the scaffolding is the same material as the printed object, which means the scaffolding must be physically cut off, creating opportunities to damage the printed object. other printers are capable of dual or multi-extrusion, meaning they can print using different materials simultaneously. in this case, the scaffold material differs from the print material and can be dissolved after printing in a chemical solvent solution.

more high-tech printers capable of printing even finer-scale and more detailed objects use a powder-based print material which is converted into a solid plastic with a laser. an advantage of this print material is that little scaffolding is needed and extra powder can quickly be removed by shaking or brushing. this medium was used to print soil pore microstructure at the scale of micrometers [ ]. these artificial soils were printed using nylon , a material that can be autoclaved, which makes it possible to reuse the soils for multiple experiments [ ].
although most standard printing materials are various types of plastic, there are a handful of products that include other materials like wood, rubber, and metal. at least one biodegradable plastic filament also exists: a polylactic acid (pla) made from corn starch [ , ].

there are two exceptionally technical printing applications that are not yet readily available to ecologists but may provide exciting opportunities soon. in one application, designer bacterial ecosystems that varied in geometry and spatial structure were printed using a gelatin-based material in order to study cell-to-cell interactions ([ , ]; table ). in a second application, nano-scale 3d printing technology was used to print replicas of abdominal scales from rainbow peacock spiders (maratus robinsoni and m. chrysomelas) and specialized hairs from blue tarantulas (poecilotheria metallica and lampropelma violaceopes) with comparable visual properties to the actual structures [ , ]. although these technologies are still under development, they could provide novel methods for testing community ecology theory and visual signaling hypotheses, respectively.

printing

once the 3d image has been drafted and edited, and the printer and print materials have been selected, a test round of printing is necessary before moving to the final round. printing a test object makes it possible to identify errors with the 3d image file, compare print materials and confirm the material choice, and gain an estimate of the amount of time required for printing en masse. after all aspects of the printing project have been approved, the final prints can proceed.

post-processing

following printing, various post-processing stages will likely need to occur, such as removing scaffolding, painting, adding clay, and/or assembling pieces. it is particularly important to consider the sensory modality of the organism(s) under study with respect to how they will perceive and interact with the 3d printed object.
while these considerations are important for any study using artificial models generated by 3d printing or otherwise, 3d printed materials may differ from other commonly used materials in their hardness, roughness, visual, and odor-related properties. through post-processing methods, ecologists can ensure that the 3d printing material does not interfere with their study.

workflow application: 3d printed anolis lizards

here we provide an example of a successful attempt to integrate 3d printing into an ecological project following the workflow outlined above. we include the obstacles encountered along the way as a useful case study for other ecologists. note that we used equipment (scanners and printers) and expertise from two (out of the four) 3d printing centers at our institution. for ecologists with fewer onsite resources, online resources and resources at collaborating institutions may be useful.

clay animal models have long been used in ecological field research to infer predation rates by free-ranging predators on prey. in this methodology, animal models are constructed from plasticine modeling clay and then placed in the field for a fixed time period. because the clay does not harden, predation attempts leave marks in the clay, making it possible to score models for evidence of predation. early work used this method to study how body coloration affected predation rates in snakes [ , ]. since then, clay models have been used in predation studies to represent a wide range of taxa including frogs [ ], salamanders [ ], lizards [ ], and insect larvae [ ]. in many of these studies, models are constructed by hand either completely or nearly completely from clay (e.g., [ , , , ]). in other studies, silicone molds are made from preserved specimens, which are then used to make models either directly out of clay [ ], or out of plaster which is then covered with clay [ ].
these methods clearly produce models that elicit responses in predators; however, producing the models in this manner can be time consuming as studies may use upwards of models. in addition, modifying the models in a precise manner to test the effects of prey traits on predation is difficult. the repeatability, speed, and precision of 3d printing make it highly applicable to field studies of predation using models. we first explored the ease of creating a 3d scan of a preserved lizard specimen, and then used software to modify its body size. we then tested the durability of two print materials and two model sizes in a field predation study.

making the lizard model

we used two methods, a structured light scanner (david sls- 3d scanner, hp inc., palo alto, ca, usa) and a laser scanner (nextengine , nextengine, inc., santa monica, ca, usa), to make 3d scans of a preserved male anolis sagrei lizard. structured light scanners operate by projecting light patterns onto the object being scanned and analyzing the pattern’s deformation with a camera. the laser scanner we used relies on newer technology consisting of more sophisticated algorithms and multiple lasers which scan in parallel, yielding more data points and an overall more accurate scan. both scanners are designed to scan 3d objects, but because they use different technologies to do so, one scanner may be more effective for scanning a particular object. regardless of the number of scans or angle of rotation, the structured light scanner’s software was not able to merge the multiple scans into a single image of our anole, likely due to the complexity and high reflectance of the preserved specimen’s skin. the laser scanner, however, was able to produce a digital 3d image of the specimen within about min, and we used this file going forward.
the laser scanner was most successful when the lizard specimen was positioned vertically rather than flat, using an extra part gripper (nextengine, inc., santa monica, ca, usa; fig. a).

we used maya software (autodesk, san rafael, ca, usa; additional file ) to edit the scanned image (fig. b) of the lizard specimen to attain three goals. first, to make the lizard scan possible to print, we had to correct the non-manifold geometry errors that arose due to the scanning process. second, we manipulated the size of the lizard to test whether different printing materials were durable for both large and small prints. the large lizard was % larger than the original (snout vent length = mm). finally, we added a hollow horseshoe-shaped tube in the ventral side of the body cavity for looping a small wire through in order to anchor the models to branches in the field. the final file we used to print the lizards is included in additional file .

print material and printing

we tested two types of filament print media as bases for our clay models: plastic (abs-p plastic in ivory, stratasys, eden prairie, mn, usa) and plastic-wood hybrid (woodfill by colorfabb, belfeld, the netherlands). abs exceeded the woodfill in cost and perceived durability, yet woodfill was a more sustainable option as it is made of % recycled wood fibers. during our test print stage, we learned we needed to add a base to our digital 3d image file for the woodfill prints because the scanned image was not flat, which made it difficult to print. we did not need to edit it for the abs print because the scaffold base dissolved.

fig. construction of a 3d printed lizard predation model. a successful laser scanning setup of preserved brown anole (anolis sagrei) specimen in vertical orientation; b 3d image of scanned anole viewed in meshmixer software and later edited in maya; c 3d printed plastic-wood hybrid (left) and abs plastic (right) anole models; d clay covered model on a branch in the field with bite marks likely from a lizard predator (cnemidophorus murinus murinus)

after we finalized our 3d image files from the test print stage, we printed abs models on a dimension elite printer (stratasys, eden prairie, mn, usa) and seven plastic-wood hybrid models on a bigbox 3d printer (chalgrove, uk) (fig. c). we had intended to print equal numbers of each; however, the printer using the woodfill kept jamming and starting over, and seven was all we could print in the timeframe we had available. the printer jamming was due in part to the print material and in part to errors in the file geometry that were not adequately resolved during the editing stage. in total, it took about h to print the abs lizards plus an additional h to dissolve the scaffolding. it took nearly days to print the seven plastic-wood hybrid models (due to the printer jamming), and the scaffolding needed to be cut off by hand using an x-acto knife, which took about an hour for all seven models. if the printer had not jammed, it would have taken h per model to print.

it was quite difficult to thread the narrow floral wire ( gauge, panacea products, columbus, oh, usa) through the ventral holes in both woodfill and abs models. the tube we made was curved, and in hindsight it should have been straight through the lizard midsection. instead, we wrapped the wire around the midsection of the bodies with two long ends hanging off the ventral side. we then dipped all abs and woodfill models in melted plasticine clay (craft smart, irving, tx, usa) to completely cover all parts of the body and the wire wrapped around the midsection. after the clay solidified (about min), we folded the wire and wrapped each lizard in aluminum foil for transport to the field.

in total, our time investment from scanning to printing was relatively low: it took h from scanning the specimen to our first test print.
additional manipulations to the image took an additional h (an undergraduate working h/week for months). although we had to troubleshoot issues with our image and printing, the process was relatively easy due to the resources available at the 3d print centers (namely staff to mentor the undergraduate on image software and troubleshoot printing issues), and because we did not need the surface to be an exact biological replica, as we covered all models with clay.

field testing lizard models

to test the effectiveness of both printing materials as bases for clay-covered models in the field, all clay-covered abs and woodfill lizard models were deployed in natural and developed habitats on the island of curaçao (dutch antilles) for – h and then scored for predation. in both habitat types, models were anchored to tree branches, bushes, or rocks on the ground using the floral wire. we recorded evidence of predation from likely lizard and avian predators based on marks left in the soft clay (fig. d). we considered two components of effectiveness: (1) do predators perceive and interact with the two print materials in the same manner (indicated by equal predation rates); (2) are both print materials durable under field conditions? while there was much higher predation in natural compared to developed sites (f , = . , p < . ), predators exhibited equal attack rates on abs and woodfill models (f , = . , p = . ) as well as on small and large models (f , = . , p = . ) (fig. ).

fig. results from testing abs and woodfill print materials as bases for clay-covered lizard models in field predation experiments. there was no difference in predation rates on models with respect to print material or model size; however, models in natural habitats had higher predation rates (* indicates p < . ). bars represent ± standard error of the mean
both 3d print material types were durable under the field conditions, and none of our models experienced any structural problems during the experiment. we concluded that both abs and woodfill were effective print materials to use as bases for clay-covered lizard models in field predation studies.

discussion

recommendations for using 3d printed models for field predation studies

because the woodfill models were cheaper and just as durable as the less sustainable abs models, we would recommend using woodfill, or other similar plastics, in comparable future studies, provided that the jamming issues we encountered during printing can be attributed to geometry errors in our file and not the woodfill material itself. it should be noted that although we tested the models in extremely hot (> °c) field conditions, we cannot comment on the durability of the two materials in rainy or very cold conditions. initially, we believed the woodfill would crumble more on the smaller model with narrower appendages, but this was not the case. finally, our study took place over a -week period. it is possible that over longer time periods, the woodfill would not be as durable as the abs plastic.

reduce, reuse, recycle

while 3d printing can facilitate ecological research, the use of this technology must be weighed against its environmental and social costs. in general, 3d printing can reduce co2 emissions and lead to more sustainable practices in the consumer manufacturing industry [ ], yet there are many less sustainable aspects to consider. three-dimensional printing is energy intensive and often uses fossil fuel derived virgin plastics, which can persist in the environment long after disposal and can be toxic to aquatic organisms, especially resin-based printed objects [ ].
the printing process itself generates waste due to printers jamming, misprinted models, and scaffolding necessary for more complex 3d objects, as well as harmful emissions in the form of ultra-fine particles and volatile organic compounds [ , ], which is especially worrisome as most 3d printers are housed in indoor office settings [ ]. as with the manufacturing of any plastic item, these negative aspects are not entirely unique to 3d printing; they just become more obvious when one is directly involved in the manufacturing process. in our specific case, we chose 3d printed models for the speed at which they could be produced and for their durability, as we intend to use them in future experiments. ecologists planning to incorporate 3d printing in research should strongly consider the negative impacts associated with 3d printing compared to the impacts of creating objects via other methods or not at all.

there are promising advances in the sustainability of 3d printing materials. materials scientists are developing a range of filaments that are biodegradable, compostable, and made from recycled materials. for example, eco-filaments, such as willowflex (bioinspiration, eberswalde, germany), are made from plant-based resources and are completely compostable, even in residential compost bins. other filament choices are made from recycled plastics like car dashboards, pet bottles, and potato chip bags (3d brooklyn, brooklyn, ny, usa; refil, rotterdam, the netherlands). in fact, the cost of generating recycled plastic filament is often less than making filament from raw materials, prompting the establishment of a fair trade market for used plastic collected by waste pickers in the developing world (e.g., protoprint solutions, pune, india) [ ]. non-plastic recycled filament options exist, such as filament made from the waste products of beer, coffee, and hemp production processes (3dfuel, fargo, nd, usa) as well as wood pulp [ ].
finally, because common print materials such as abs plastic are not biodegradable or recyclable in municipal recycling centers, machines have been developed to recycle these plastics directly at the printing site [ ]. these machines grind old prints and melt them into new filament that can be reused for printing (e.g., filastruder, snellville, ga, usa). across sustainable options for print materials, we can attest to the durability of woodfill for applications comparable to ours. for ecologists considering other sustainable print materials, most of these companies readily provide information about the durability of their products.

we stress that all 3d printing projects in ecological research should reduce, reuse, and recycle: reduce the amount printed and the use of toxic print materials; reuse printed objects and use materials made from post-consumer waste materials; and recycle printed objects by choosing materials that can be easily recycled, composted, or that are biodegradable. planning a print job (fig.  ) requires both careful estimation of the minimum number of replicates to print and smart design of geometry that minimizes or eliminates scaffolding, as scaffolding is usually discarded. printing should be performed in well-ventilated environments where airborne toxins do not accumulate and harm personnel. the environmental toxicity of objects should be reduced by choosing materials with low toxic potential and reducing the toxicity of materials post-print. for example, exposure of resin-based printed objects to intense uv light can reduce their toxicity to aquatic organisms [ ]. printed objects should be reused in research as much as possible to avoid repeat printing, and print materials made from recycled material or materials that are recyclable or compostable should be used when possible.

page of behm et al. bmc ecol ( ) :
while most ecologists will not invest in their own 3d printing equipment and instead employ general-use academic (e.g., library) or commercial facilities, these environmental concerns can be communicated to the printing facilities so that they might adopt sustainable practices in their 3d printing for research.

conclusions

in conclusion, 3d printing technology has the promise to reduce the time and cost invested in creating custom materials used in ecological research, while at the same time increasing the ease with which collaborations occur within and outside the scientific community. although there is a learning curve for developing 3d image files, there are ample online libraries of 3d files, and tech-savvy students and 3d printing center staff can be extremely helpful. recent advances in print materials may reduce the footprint associated with this new technology. overall, as with any new technology, ecologists must weigh the costs in terms of time and monetary investments into developing usable 3d materials for research versus other methods of generating those materials. if ecologists are in the position to commit the initial investment in securing printing resources and navigating the technological learning curve, the resulting ability to implement 3d printing into future studies could save time and money in the long term.

additional files

additional file  . software for designing, modifying, and analyzing 3d files.
additional file  . online libraries of 3d imagery relevant for ecological research (as of ).
additional file  . 3d image file (stl format) of anolis sagrei lizard we made.

authors’ contributions

jb and mh conceived of and designed the study; jb, bw, and mh developed the workflow; bw tested the workflow and designed the 3d model; jb and mh conducted the field experiment; jb, bw, sth, and mh wrote the manuscript and provided editorial advice. all authors read and approved the final manuscript.
author details

integrative ecology lab, center for biodiversity, department of biology, temple university, philadelphia, pa, usa. department of ecological science-animal ecology, vu university amsterdam, amsterdam, the netherlands. school of biological sciences, university of western australia, perth, wa, australia. department of biology, temple university, philadelphia, pa, usa.

acknowledgements

we are grateful to two anonymous reviewers who provided useful comments that improved the quality of this manuscript. we thank j. hample from the digital scholarship center, s. campbell from the digital fabrication studio, and c. denison from the health sciences library print center, all at temple university, for assistance with scanning and printing the lizard models. we thank m. vermeij and s. berendse from the carmabi foundation for logistical support in curaçao. finally, we thank s.b. hedges for access to the preserved anolis sagrei specimen.

competing interests

the authors declare that they have no competing interests.

availability of data and materials

the datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

consent for publication

not applicable.

ethics approval and consent to participate

all work conducted involving live animals was in accordance with the institutional animal care and use committee at temple university (iacuc protocol # ).

funding

this work was supported by funds from the netherlands organization for scientific research ( . . ) and temple university.

publisher’s note

springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

received: december accepted: september

references

small rj, rusch dh. backpacks vs. ponchos: survival and movements of radio-marked ruffed grouse. wildlife soc bull ( – ). ; : – .
reiter j, curio e, tacud b, urbina h, geronimo f. tracking bat-dispersed seeds using fluorescent pigment.
biotropica. ; : – .
xiao z, jansen pa, zhang z. using seed-tagging methods for assessing post-dispersal seed fate in rodent-dispersed trees. for ecol manage. ; : – .
patrick lb, hansen a. comparing ramp and pitfall traps for capturing wandering spiders. j arachnol. ; : – .
quinn h, jones jp. squeeze box technique for measuring snakes. herpetol rev. ; : .
karraker ne. a new method for estimating clutch sizes of ambystomatid salamanders and ranid frogs: introducing the ovagram. herpetol rev. ; : – .
peterson cr, dorcas me. automated data acquisition. in: heyer rw, donnelly ma, mcdiarmid rw, hayek lc, foster ms, editors. measuring and monitoring biological diversity: standard methods for amphibians. washington, dc: smithsonian institution; . p. – .
pauli jn, hamilton mb, crain eb, buskirk sw. a single-sampling hair trap for mesocarnivores. j wildlife manag. ; : – .
conner bp, manogharan gp, martof an, rodomsky lm, rodomsky cm, jordan dc, et al. making sense of 3-d printing: creating a map of additive manufacturing products and services. addit manuf. ; – : – .
mironov v, kasyanov v, drake c, markwald rr. organ printing: promises and challenges. regen med. ; : – .
allan bm, nimmo dg, ierodiaconou d, vanderwal j, koh lp, ritchie eg. futurecasting ecological research: the rise of technoecology. ecosphere. ; :e .
domingue mj, pulsifer dp, lakhtakia a, berkebile j, steiner kc, lelito jp, et al. detecting emerald ash borers (agrilus planipennis) using branch traps baited with 3d-printed beetle decoys. j pest sci. ; : – .
watson cm, francis gr. three dimensional printing as an effective method of producing anatomically accurate models for studies in thermal ecology. j therm biol. ; : – .
stadler at, vihar b, günther m, huemer m, riedl m, shamiyeh s, et al.
adaptation to life in aeolian sand: how the sandfish lizard, scincus scincus, prevents sand particles from entering its lungs. j exp biol. ; : – .
van bokhorst e, de kat r, elsinga ge, lentink d. feather roughness reduces flow separation during low reynolds number glides of swifts. j exp biol. ; : – .
vanderelst d, peremans h, razak na, verstraelen e, dimitriadis g. the aerodynamic cost of head morphology in bats: maybe not as bad as it seems. plos one. ; :e .
germann dp, schatz w, hotz pe. artificial bivalves—the biomimetics of underwater burrowing. procedia comput sci. ; : – .
moore jm, clark aj, mckinley pk. evolution of station keeping as a response to flows in an aquatic robot. in: proceedings of the th annual conference on genetic and evolutionary computation. acm; . p. – . http://dl.acm.org/citation.cfm?id= . accessed nov .
clark aj, wang j, tan x, mckinley pk. balancing performance and efficiency in a robotic fish with evolutionary multiobjective optimization. in: ieee international conference on evolvable systems (ices). . p. – .
porter mm, adriaens d, hatton rl, meyers ma, mckittrick j. why the seahorse tail is square. science. ; :aaa .
ziegler a, menze b. accelerated acquisition, visualization, and analysis of zoo-anatomical data. in: computation for humanity. boca raton: crc press; . p. – . http://www.crcnetbase.com/doi/abs/ . /b - . accessed nov .
mitsopoulou v, michailidis d, theodorou e, isidorou s, roussiakis s, vasilopoulos t, et al. digitizing, modelling and 3d printing of skeletal digital models of palaeoloxodon tiliensis (tilos, dodecanese, greece). quatern int. ; : – .
niven l, steele te, finke h, gernat t, hublin j-j. virtual skeletons: using a structured light scanner to create a 3d faunal comparative collection. j archaeol sci. ; : – .
raupach mj, amann r, wheeler qd, roos c. the application of “-omics” technologies for the classification and identification of animals. org divers evol. ; : – .
lak m, fleck g, azar d, engel fls ms, kaddumi hf, neraudeau d, et al. phase contrast x-ray synchrotron microtomography and the oldest damselflies in amber (odonata: zygoptera: hemiphlebiidae). zool j linn soc. ; : – .
bokor j, broo j, mahoney j. using fossil teeth to study the evolution of horses in response to a changing climate. am biol teach. ; : – .
igic b, nunez v, voss hu, croston r, aidala z, lópez av, et al. using 3d printed eggs to examine the egg-rejection behaviour of wild birds. peerj. ; :e .
bartolini t, mwaffo v, showler a, macrì s, butail s, porfiri m. zebrafish response to 3d printed shoals of conspecifics: the effect of body size. bioinspir biomim. ; : .
thomson jd, draguleasa ma, tan mg. flowers with caffeinated nectar receive more pollination. arthropod plant interact. ; : – .
campos eo, bradshaw hd, daniel tl. shape matters: corolla curvature improves nectar discovery in the hawkmoth manduca sexta. funct ecol. ; : – .
policha t, davis a, barnadas m, dentinger btm, raguso ra, roy ba. disentangling visual and olfactory signals in mushroom-mimicking dracula orchids using realistic three-dimensional printed flowers. new phytol. ; : – .
bulté g, chlebak rj, dawson jw, blouin-demers g. studying mate choice in the wild using 3d printed decoys and action cameras: a case of study of male choice in the northern map turtle. anim behav. ; : – .
otten w, pajor r, schmidt s, baveye pc, hague r, falconer re. combining x-ray ct and 3d printing technology to produce microcosms with replicable, complex pore geometries. soil biol biochem. ; : – .
dal ferro n, morari f. from real soils to 3d-printed soils: reproduction of complex pore network at the real size in a silty-loam soil. soil sci soc am j. ; : – .
bertin s, friedrich h, delmas p, chan e. gimel’farb g. dem quality assessment with a 3d printed gravel bed applied to stereo photogrammetry. photogram rec. ; : – .
hesterberg s.
three-dimensional interstitial space mediates predator foraging success in different spatial arrangements. masters thesis. university of south florida; . http://scholarcommons.usf.edu/etd/ . accessed nov .
mohammed js. applications of 3d printing technologies in oceanography. methods oceanogr. ; : – .
berry d, selby rd, horvath jc, cameron rh, porqueras d, stouthamer r. a modular system of 3d printed emergence traps for studying the biology of shot hole borers and other scolytinae. j econ entomol. ; : – .
bennett a, barrett d, preston v, woo j, chandra s, diggins d, et al. autonomous vehicles for remote sample collection enabling marine research. in: proc. ieee/mts oceans. genova; . p. .
borer et, harpole ws, adler pb, lind em, orrock jl, seabloom ew, et al. finding generality in ecology: a model for globally distributed experiments. methods ecol evol. ; : – .
newman g, wiggins a, crall a, graham e, newman s, crowston k. the future of citizen science: emerging technologies and shifting paradigms. front ecol environ. ; : – .
rochman d, luna ed. prototyping the complex biological form of the beetle deltochilum lobipes via 2d geometric morphometrics landmarks and descriptive geometry for 3d printing. comput aided des appl. ; : – .
garcia j, yang z, mongrain r, leask rl, lachapelle k. 3d printing materials and their use in medical education: a review of current technology and trends for the future. bmj simul technol enhanc learn. ; : – .
pucci ju, christophe br, sisti ja, connolly es. three-dimensional printing: technologies, applications, and limitations in neurosurgery. biotechnol adv. ; : – .
ruberto t, polverino g, porfiri m. how different is a 3d-printed replica from a conspecific in the eyes of a zebrafish?: how does a zebrafish see a replica? j exp anal behav. ; : – .
tabone md, cregg jj, beckman ej, landis ae. sustainability metrics: life cycle assessment and green design in polymers. environ sci technol. ; : – .
connell jl, ritschdorff et, whiteley m, shear jb. 3d printing of microscopic bacterial communities. pnas. ; : – .
connell jl, kim j, shear jb, bard aj, whiteley m. real-time monitoring of quorum sensing in 3d-printed bacterial aggregates using scanning electrochemical microscopy. pnas. ; : – .
hsiung b-k, siddique rh, stavenga dg, otto jc, allen mc, liu y, et al. rainbow peacock spiders inspire miniature super-iridescent optics. nat commun. ; : . https://doi.org/ . /s - - -x.
hsiung b-k, siddique rh, jiang l, liu y, lu y, shawkey md, et al. tarantula-inspired noniridescent photonics with long-range order. adv opt mater. ; : .
madsen t. are juvenile grass snakes, natrix-natrix, aposematically colored. oikos. ; : – .
brodie e. differential avoidance of coral snake banded patterns by free-ranging avian predators in costa rica. evolution. ; : – .
flores ee, stevens m, moore aj, rowland hm, blount jd. body size but not warning signal luminance influences predation risk in recently metamorphosed poison frogs. ecol evol. ; : – .
kraemer ac, serb jm, adams dc. both novelty and conspicuousness influence selection by mammalian predators on the colour pattern of plethodon cinereus (urodela: plethodontidae). biol j lin soc. ; : – .
sato cf, wood jt, schroder m, green k, osborne ws, michael dr, et al. an experiment to test key hypotheses of the drivers of reptile distribution in subalpine ski resorts. j appl ecol. ; : – .
peisley rk, saunders me, luck gw. cost-benefit trade-offs of bird activity in apple orchards. peerj. ; :e .
vazquez b, hilje b. how habitat type, sex, and body region influence predatory attacks on norops lizards in a pre-montane wet forest in costa rica: an approach using clay models. herpetol notes. ; : – .
yeager j, wooten c, summers k. a new technique for the production of large numbers of clay models for field studies of predation. herpetol rev. ; : – .
gifford me, herrel a, mahler dl. the evolution of locomotor morphology, performance, and anti-predator behaviour among populations of leiocephalus lizards from the dominican republic. biol j lin soc. ; : – .
gebler m, schoot uiterkamp ajm, visser c. a global sustainability perspective on 3d printing technologies. energy policy. ; : – .
oskui sm, diamante g, liao c, shi w, gan j, schlenk d, et al. assessing and reducing the toxicity of 3d-printed parts. environ sci technol lett. ; : – .
azimi p, zhao d, pouzet c, crain ne, stephens b. emissions of ultrafine particles and volatile organic compounds from commercially available desktop three-dimensional printers with multiple filaments. environ sci technol. ; : – .
yi j, lebouf rf, duling mg, nurkiewicz t, chen bt, schwegler-berry d, et al. emission of particulate matter from a desktop three-dimensional (3d) printer. j toxicol environ health part a. ; : – .
steinle p. characterization of emissions from a desktop 3d printer and indoor air measurements in office settings. j occup environ hyg. ; : – .
feeley s, wijnen b, pearce jm. evaluation of potential fair trade standards for an ethical 3-d printing filament. j sustain dev. ; : – .
gardan j, roucoules l. 3d printing device for numerical control machine and wood deposition. julien gardan int j eng res appl. ; : – .
baechler c.
matthew devuono, joshua m. pearce. distributed recycling of waste polymer into reprap feedstock. rapid prototyp j. ; : – .
cianca v, bartolini t, porfiri m, macrì s. a robotics-based behavioral paradigm to measure anxiety-related responses in zebrafish. plos one. ; :e .

benefits and limitations of three-dimensional printing technology for ecological research
abstract
  background
  results
  conclusions
background
  overview of 3d printing in fields related to ecology
  integration of 3d printing in ecology
workflow methodology
  make a digital object file
  printer and printing material
  printing
  post-processing
workflow application: 3d printed anolis lizards
  making the lizard model
  print material and printing
  field testing lizard models
discussion
  recommendations for using 3d printed models for field predation studies
  reduce, reuse, recycle
conclusions
authors’ contributions
references

the kingdom has been digitized: electronic editions of renaissance drama and the long shadows of shakespeare and print

brett d. hirsch*
university of western australia

abstract

this article considers the challenges and opportunities associated with the production and reception of electronic editions of renaissance drama. chief amongst these challenges are the long shadows cast by the cultural, scholarly, and economic investments in shakespeare, and the institutions, conventions, and scholarly status of print publishing. this article argues that electronic editions force us to rethink existing publishing models and notions of scholarship, to recognize that digitizing primary materials alone is no substitute for critical editions, and to acknowledge that, despite the challenges associated with them, electronic editions will play a far greater role in expanding the canon of renaissance drama as taught, studied, and performed than their print counterparts.
on the then-recent arguments for the editorial de-conflation of king lear into two texts, jonathan goldberg wryly commented, ‘the kingdom has been divided, but shakespeare reigns supreme, author now of two sovereign texts’ ( ). in the decades following goldberg’s article, the shakespearean landscape has expanded to include two lears, three hamlets, edward iii, and a shrew, not to mention ‘reconstructed’ texts of pericles and cardenio. moreover, despite increasing critical awareness that collaborative authorship was the norm and not the exception in the early modern theatre, critical editions of shakespeare – particularly in terms of marketing, presentation, and publicity – tend to understate (if not ignore) the contributions of other playwrights. for example, the arden edition of timon of athens, the most recent single-volume edition of the play to accept that it ‘was written by two playwrights’ (dawson & minton ), does not signal middleton’s collaboration on its cover, spine, or title-page. the catalogue listing for the edition on the publisher’s website similarly relegates all mention of middleton’s involvement to the blurb, situated beneath the attribution, ‘by: william shakespeare, anthony dawson, gretchen minton’. the british library catalogue entry for the volume makes no mention of middleton at all, listing shakespeare as the sole author. we might forgive an unsuspecting browser or casual reader for thinking timon of athens was shakespeare’s alone.

this is, of course, not the fault of the arden editors; as reviewers have noted, anthony dawson and gretchen minton have produced a work of meticulous scholarship. one rightly assumes that, as editors, they had little to no input in terms of the marketing of their work.
nor is the arden unique in this respect: john jowett’s oxford shakespeare edition, touted as ‘the first to locate the play firmly within a context of collaboration’ ( ), similarly makes no mention of middleton on the cover, although the title-page attribution follows the oxford complete works in presenting the play as ‘by william shakespeare and thomas middleton’ (iii). moreover, the publisher’s uk, canada, australia and new zealand catalogues likewise privilege shakespeare as author. the marketing and packaging of other modern shakespeare editions similarly diminish the contributions of his collaborators, absent from the covers, spines, title pages, sales, and library catalogues.

consider the horror and bewilderment of the new york shakespeare festival press office staff when d. c. greetham inquired ‘whether any of the plays in its shakespeare marathon had been listed as “by william shakespeare and someone else” ’ ( ). for greetham, this is evidence of ‘our current culture’s preference for the original, the solitary, and the socially unsullied against the collaborative and the cumulative’, which leads him to conclude ‘the capital invested in shakespeare might decline in value if, say, macbeth were marketed as a play by william shakespeare and thomas middleton’ ( ). scholars are no less guilty of this reductive tendency and preference for the ‘socially unsullied’ in works of criticism, as evidenced by the frequency with which collaborative plays tacitly become shakespeare’s alone: a google books search for the phrase ‘shakespeare and middleton’s timon of athens’ returns a total of results, whereas the phrase ‘shakespeare’s timon of athens’ returns , . tellingly, the phrase ‘middleton and shakespeare’s timon of athens’ does not return any matches.

literature compass / ( ): – , . /j. - . . .x © the author literature compass © blackwell publishing ltd
to return to goldberg’s metaphor, the result of these practices, however subtle, is to surrender ever more textual real estate to shakespeare, leaving his collaborators with the editorial equivalent of squatters’ rights. as gary taylor has observed, ‘every edition, every textual investigation, represents an assertion of value’ (‘the renaissance’ ), and we owe it to these playwrights to pay more than lip service to their contributions. in light of ongoing critical efforts to divest and fragment shakespeare’s authority and the emergence of new technologies and models of publication, our continued editorial and scholarly neglect of the full panoply of renaissance dramatists, whether as collaborators or as individual playwrights, would seem inexcusable. yet, as this article will show, the vast majority of critical editions of renaissance plays published every year are by (or at least marketed as by) shakespeare and, with few notable exceptions, in print and in print alone. why then, in a critical climate so invested in decentring shakespearean authority, and in which there are more opportunities to redraw and expand the canon than ever before, are so few editions of (or at least partially crediting) other dramatists produced? why, with the theoretically boundless possibilities of the electronic medium in mind, do we continue to limit ourselves to editions in print?

to expand the canon of renaissance drama as it is taught, studied, and performed, more critical editions are needed. to accomplish this, our profession has a duty to foster new editors and to support and value the work of textual scholars. we must also rethink the current demand-driven model for the production of critical editions, because it clearly functions to sustain the canon. a new model is required, untethered to the canon (or “what gets taught,” according to roland barthes’ aphorism), and free from the restraints imposed by the institutions of print publishing.
while the precise shape of this new model remains to be seen, digital publishing is certain to play a pivotal role, such that the production of critical editions is supported and maintained by flexible institutional partnerships and collaborations, and autonomy is not surrendered to the presses.

editions of renaissance drama therefore face many challenges. chief amongst these are the long shadows cast by the cultural, scholarly, and economic investments in shakespeare, and the institutions, conventions, and scholarly status of print publishing. electronic editions of renaissance drama face particular difficulties in terms of their production, distribution, usability, preservation, evaluation, and scholarly status. in the space allowed, i want to briefly consider these challenges and discuss a number of practical and theoretical opportunities and benefits – many yet unrealized – offered by the electronic medium.

surveying the kingdom

‘amidst lively debate over the decentring of shakespeare’s authority’, tanya hagen ( ) has observed, ‘the larger part of the early english dramatic canon languishes – undervalued, underexamined, and underedited’ ( ). this is an all-too-familiar lament, often rehearsed without recourse to empirical evidence: as scholars, we instinctively know just how big the critical divide between shakespeare and his contemporaries is, right? an empirical study was clearly wanting.

for the purposes of this article, i conducted a survey of critical editions published since of renaissance plays printed, performed, or simply written (in the case of manuscript and closet drama) between and , arranged by author(s) based on modern attribution.
the survey excluded editions of latin plays, facsimiles and reprints (such as the malone society reprints, the tudor facsimile texts, and the electronic facsimiles and transcriptions produced by early english books online [eebo] and the eebo text creation project [eebo-tcp] respectively), unpublished theses, plays of anonymous or unattributed authorship, and edited excerpts (such as one might find in an anthology). as my primary concern was with english-language scholarship – and as an attempt to rein in the number of shakespeare entries – the survey also excluded foreign-language editions, editions for children, and adaptations in prose, verse, graphic, or ‘simplified’ forms. as expected, these last exclusions only affected the total number of editions counted for shakespeare; unsurprisingly, i failed to locate editions of other renaissance dramatists in any shape or hue for children and young adults.

faced with the grim prospect of typing up to or so entries for each play contained in every collected works of shakespeare, for the purposes of the survey i decided that individual volumes of collected or complete works would constitute a single entry in the resulting list, rather than itemizing individual plays within each volume as a separate entry. by this logic, both the single-volume oxford middleton and the single-volume riverside shakespeare would only count as a single entry for middleton and shakespeare respectively, just as the revels edition of the duchess of malfi would constitute a single entry for john webster. this makes practical sense as well, since you cannot purchase or access play-chapters of these works individually. moreover, as long as the editions formally identified the individual collaborators of a play as co-authors, the list would apportion credit equally.
thus, the oxford shakespeare and arden editions of timon of athens added to both shakespeare and middleton’s tallies; volume of the bowers edition of the dramatic works in the beaumont and fletcher canon counted towards totals for john fletcher as well as his collaborators george chapman, nathan field, john ford, philip massinger, and webster; and so on. given the usual practice of publishing the same editions in paperback and hardcover, i decided that these titles merited only a single entry in the survey.

after the limits of the survey were set as outlined above, i compiled a list of authors in consultation with the database of early english playbooks. with this list, i searched google books, worldcat, and the libraries australia catalogues, and occasionally ventured into the library stacks to check editions available in situ. other sources included the immensely helpful chronological appendix to andrew murphy’s shakespeare in print and suzanne gossett’s survey of ‘recent studies in the english masque’. as the description of my methodology above suggests, it should be clear that the resulting list could not be as exhaustive as it is representative. to allow others to check my findings against the data, to conduct their own analysis, or (as i hope) to take up the challenge of producing a more comprehensive survey (perhaps by author/play rather than author/volume), i include the list in its entirety as an appendix to this essay (appendix s ). the tally of results, in which dramatists with fewer than critical editions to their name are combined into an ‘other’ category, is tabulated (table ).

according to the criteria of the survey outlined above, there have been , critical editions of renaissance drama published in the last years. out of these, , editions or . % were of shakespeare.
ben jonson, with editions (soon to be with the publication of the long-awaited seven-volume cambridge works) or . %, is hardly a close second by any stretch of the imagination. even the amalgamated tally of individual dramatists with fewer than editions to their name only counts for editions or . %. this disparity is even more readily apparent when the data is represented graphically as an exploded pie chart (fig. ). like a ravenous pac-man, shakespeare looms large and threatens to consume the editorial dots standing for the other dramatists included in the survey.

an archaic arcade simile aside, the scope of the imbalance is clear: there are far too few critical editions of renaissance dramatists other than shakespeare. even with the dictum ‘correlation does not imply causation’ in mind, it is hard to avoid the conclusion that the editorial neglect of renaissance dramatists in relation to shakespeare is both a contributing factor to, and a reflection of, the relative (and relatively limited) critical interest in these dramatists and their works. notwithstanding arguments about quality over quantity, the results of this survey should give us pause.

table . table of critical editions of renaissance plays since arranged by dramatist.

dramatist        critical editions since
shakespeare, william
other (< eds)
jonson, ben
middleton, thomas
marlowe, christopher
webster, john
fletcher, john
rowley, william
dekker, thomas
milton, john
ford, john
massinger, philip
marston, john
beaumont, francis
chapman, george
heywood, thomas
brome, richard
kyd, thomas
shirley, james
tourneur, cyril
heywood, john
lyly, john
munday, anthony
greene, robert
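the counting rule described for the survey (one entry per volume, equal credit for every formally credited collaborator, and a pooled ‘other’ row for sparsely edited dramatists) can be sketched in a few lines of python. this is only an illustrative sketch, not the procedure actually used to build the table: the function names, the `other_threshold` cut-off, and the choice to express each row as a share of all volumes are assumptions, and the five sample entries are volumes named in the text rather than the appendix data.

```python
from collections import Counter


def tally_editions(editions, other_threshold):
    """Tally one entry per volume for each formally credited dramatist.

    `editions` is a list of (title, [dramatists]) pairs; `other_threshold`
    is a hypothetical cut-off below which dramatists are pooled as "other".
    """
    per_dramatist = Counter()
    for _title, dramatists in editions:
        # A volume counts once for every credited collaborator,
        # however many plays it contains.
        for name in set(dramatists):
            per_dramatist[name] += 1

    tally = Counter()
    for name, count in per_dramatist.items():
        # Pool sparsely edited dramatists into a single "other" row.
        key = name if count >= other_threshold else "other"
        tally[key] += count
    return tally


def share_of_volumes(tally, total_volumes):
    """Express each row as a percentage of all volumes surveyed (assumed basis)."""
    return {name: round(100 * count / total_volumes, 1)
            for name, count in tally.items()}


# Illustrative sample only: volumes named in the text, not the appendix data.
sample_editions = [
    ("riverside shakespeare", ["shakespeare"]),
    ("oxford middleton", ["middleton"]),
    ("timon of athens (arden)", ["shakespeare", "middleton"]),
    ("timon of athens (oxford)", ["shakespeare", "middleton"]),
    ("the duchess of malfi (revels)", ["webster"]),
]

sample_tally = tally_editions(sample_editions, other_threshold=2)
for name, count in sample_tally.most_common():
    print(name, count)
```

because a co-authored volume adds to more than one tally, the rows can sum to more than the number of volumes, so percentages computed this way need not add up to 100.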
book, bard, and canon

whether the charts and figures presented above evoke shock or grim amusement, they do not explain the immense gap reflected in the editorial work and scholarly research between shakespeare and other renaissance dramatists. questions of literary value and the scholarly debates over the origins, consequences, and appropriateness of the so-called ‘canon’ of western literature are far too complex to adequately address in this essay; suffice it to say, shakespeare still enjoys pride of place in the canon and, as a result, critical editions of his works will continue to be produced and made readily available. shakespeare’s unique position in the canon also means that all of his works receive canonical status, as opposed to other renaissance dramatists, for whom only a slim selection of plays is accorded the same significance.

demand drives the production of critical editions, and the vast majority of this demand is for plays that have already secured a place in the canon. the desire, however infrequent, to capitalize on current critical trends may also prompt the production of non-canonical critical editions, such as editions of massinger’s the renegado and william percy’s mahomet and his heaven in the context of post-colonial studies. even then, these texts are commonly marketed in relation to ‘safe’ canonical plays. for example, the recent arden early modern drama edition of the renegado is marketed as a text to be ‘studied alongside more familiar plays such as othello and the merchant of venice’, while the first critical edition of mahomet and his heaven to appear in print is advertised as ‘roughly contemporary with shakespeare’s othello’.

fig. . chart of critical editions of renaissance plays since arranged by dramatist.

the ‘tradition of editing shakespeare is largely maintained by pedagogy’ (mcleod ), and what is true for shakespeare is also (for the most part) true for his contemporaries: because the biggest demand for critical editions is for classroom use, ‘the market continues to be driven by students, not scholars’ (giddens ). in print, the present demand for critical editions of renaissance drama is met by a handful of dedicated series and large author-centric editorial projects. the revels plays (manchester university press), as well as selected volumes in the revels plays companion library series, publish modern-spelling critical editions of a wide variety of early modern dramatic works of the highest calibre and textual sophistication. however, since these series appeal (or are marketed) to a smaller audience of textual scholars (in the case of the revels plays) or area specialists (in the case of the revels plays companion library), their volumes tend to reside in academic libraries, too expensive for classroom (or even personal) use. the new mermaids (methuen drama), norton critical editions (w. w. norton), and the revels student editions (manchester university press) publish modern-spelling critical editions designed for classroom use. whilst affordably priced, the available titles in these series tend to be limited to canonical works that already sit comfortably on syllabus lists. the newly launched arden early modern drama (methuen drama) seeks to mediate between these two extremes and promises to publish modern-spelling critical editions of both well- and lesser-known plays priced similarly to the arden shakespeare, which the series is designed to accompany and complement.
however, of the original sixteen titles commissioned by the series to date (of which four are currently available), the majority are still canonical works regularly taught in the undergraduate and graduate classroom. in addition to the revels, mermaids, norton, and arden series, recent decades have witnessed the inauguration of large ongoing editorial projects focused on the works of a particular author. these include editions of the works of middleton (in modern spelling), ford (in old spelling), thomas heywood (in old spelling), and james shirley (in modern spelling) for oxford university press, and of jonson (in modern spelling) and webster (in old spelling) for cambridge university press.

the medium and market of print limit these critical editions in terms of their cost, format, flexibility, and usability. only a handful of presses publish critical editions ‘and even then infrequently’, such that ‘there is often very little choice when reading the works of minor dramatists’ (giddens ). unlike those of shakespeare, whose collected works are readily available in both old and modern spelling (such as the oxford shakespeare), critical editions of other renaissance dramatists, if they exist at all, are typically available in either old or modern spelling, but not both. with the exception of canonical titles designed for classroom use, limited print runs ensure that critical editions of renaissance drama are ‘too expensive to be purchased by individual scholars and students and therefore they remain unused’, relegated to the status of ‘library-only editions’ (giddens ). the revision of these editions to incorporate new research, moreover, is usually economically unfeasible. all of this typically guarantees not only their virtual absence from the classroom, but also that these critical editions are ‘rarely cited in academic critical work on those dramatists’ (giddens ).
limited page numbers and the limited textual real estate imposed by the dimensions of the printed page further restrict the type and amount of content that can be included in these editions.

many of the restraints associated with publishing in print – the word counts, page numbers, print runs, and the size and layout of the printed page itself – do not apply to the electronic medium. electronic editions can present multiple and interlinked versions of the same texts and textual witnesses, alongside relevant sources, analogues, and adaptations, in both old and modern spelling, all with multiple levels of editorial annotation and commentary. electronic editions can also incorporate rich multimedia content, such as digitized facsimile images of the textual witnesses and performance documents, audio recordings of staged readings and radio performances, music, and still and moving images of performances for stage and screen. unlike print editions, in which the contents are bound and static, electronic editions are able to facilitate dynamic interaction with their contents by and between users through customization, annotation, discussion, and play. electronic editions can also facilitate computer-aided research more readily than print materials.

in theory, electronic editions can accomplish all of this and more; and yet, as peter robinson has noted, ‘with a few exceptions, almost every scholarly edition published in the last decade has been published in print, and in print only’ ( ). this is certainly true for editions of non-shakespearean renaissance drama: some projects ‘originally conceived as primarily digital’, such as the cambridge jonson and oxford shirley, have now ‘become primarily print or exclusively print’ (robinson ).
a number of the large editorial projects mentioned above have promised to produce electronic editions in the future, but the form they will take – if indeed they appear at all – remains open to speculation. for example, the electronic edition of the oxford middleton, announced by john lavagnino and gary taylor as forthcoming in , remains in preparation, while the original launch of the electronic edition of the cambridge jonson, under the direction of david gants, is now projected for , but may still suffer from further delays. john jowett’s remark, that at ‘the end of the twentieth century’ the role of ‘scholarly electronic editions’ remained, ‘at most, supplementary to the print edition’ (‘editing’ ), seemed embarrassingly accurate when it was published in .

until the launch of richard brome online in , there were no electronic critical editions of non-shakespearean renaissance drama available; electronic editions of shakespeare, on the other hand, had been available in some form or another since the s. nonetheless, other electronic resources relevant to the study of renaissance drama were then, and still are, available. chief amongst these are literature online (lion), eebo, and the eebo-tcp, which offer access (via institutional subscription) to digital facsimile images or electronic transcriptions of a number of early textual witnesses of renaissance printed drama. although still relatively new, lion, eebo, and eebo-tcp have quietly revolutionized the study of renaissance drama, allowing scholars to access ‘what were once elite and inaccessible international resources’ on their desktops, and to examine ‘some of the rarest and most impressive works of a global collection by a few clicks of the mouse’ (champion).

however, the digitized primary materials offered by lion, eebo, and eebo-tcp are seriously limited in terms of their accuracy, reliability, and application.
experts familiar with early modern print were not responsible for the preparation of texts for lion and eebo-tcp; rather, these texts are the product of large-scale, industrial-style data entry and, despite efforts to ensure accuracy through use of a double-keyboarding transcription system, errors are frequent. scholars have noted that texts offered by both databases habitually confuse the letter ‘f’ with the long-stem ‘s’, so that words like ‘saye’ and ‘misse’ in the original are reproduced as ‘faye’ and ‘miffe’. the transcriptions also typically omit other conventional elements of early modern print, such as the use of a tilde or macron to indicate missing letters, contractions, and abbreviations. the search functions of these databases are also open to ‘indeterminacy’: one researcher had to retract a published count for the article ‘an’, made using a search of lion, because the count turned out to include instances of the speech prefix ‘an.’, as used to indicate speeches by characters whose names begin with these letters (egan ‘impalpable hits’). on a more technical note, these texts employ only very basic encoding schemes, with minimal metadata, which seriously limits their direct application for advanced computer-assisted research. scholars have expressed similar frustration with the limitations of the facsimile images offered by eebo: derived from scanned microfilm, these facsimiles are low-resolution, poor-quality, black-and-white images.

digitized primary materials alone, such as those provided by lion, eebo, and eebo-tcp, are clearly no substitute for critical editions.
this is not to dismiss the importance of these electronic resources or to ignore the immense impact they have had on the depth and scope of research in the field, but to recognize that these materials do not offer the scholarly apparatus necessary to encourage readers to consider them as discrete, unified wholes: as mike pincombe and cathy shrank have noted, users of databases such as lion and eebo-tcp do not read the texts they offer in a linear fashion, but rather rely on the search function to source sections ‘to be mined for useful “quote-bytes”’, at the risk of taking quotations out of context and distorting them ( ). likewise, the navigational options available to eebo users limit their traversal of the facsimiles to image number, illustration, or thumbnail, rather than allowing navigation by physical (e.g. book, volume, page, column, paragraph, line) or logical (e.g. play, act, scene, speech) division. the design of these interfaces and texts simply does not facilitate the practices associated with ‘professional reading’ (guillory - ) or ‘[careful] reading’ (marshall ), as opposed to ‘seeking’ (marshall ).

critical editions not only encourage readers to consider play-texts as cohesive wholes but also, by situating them in wider historical and intellectual contexts, encourage readers to consider play-texts as constituent parts of larger, richer systems of cultural exchange, then and now. as renaissance drama is ‘constantly being reinterpreted in relation to the concerns of our society’, these ‘new insights demand new editions with new critical introductions’ (foakes ). critical editions also benefit from the latest findings in textual studies, as editors ‘continually and conscientiously readdress the problems of textual interpretation in terms of contemporary values and language’ (bevington, ).
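The search ‘indeterminacy’ described in this section, where a count for the article ‘an’ is inflated by the speech prefix ‘an.’, can be reproduced in a few lines. The following is only a minimal sketch: the two transcription lines are invented for illustration (they are not taken from lion or from any play-text), and the helper names are hypothetical. It contrasts a plain full-text search with one that knows where the speech prefix ends.

```python
import re

# Hypothetical transcription lines, invented for illustration:
# each begins with a speech prefix, followed by the spoken words.
LINES = [
    "an. give me an answer.",
    "bo. an honest man?",
]

# Matches "an" only as a whole word.
ARTICLE = re.compile(r"\ban\b")


def naive_count(lines):
    """Count 'an' over the raw text, as a plain full-text search would."""
    return sum(len(ARTICLE.findall(line)) for line in lines)


def split_prefix(line):
    """Separate the speech prefix (up to the first full stop) from the dialogue."""
    prefix, _, speech = line.partition(". ")
    return prefix + ".", speech


def structured_count(lines):
    """Count 'an' in the dialogue only, ignoring speech prefixes."""
    return sum(len(ARTICLE.findall(split_prefix(line)[1])) for line in lines)


print(naive_count(LINES))       # 3: the prefix "an." is counted as an article
print(structured_count(LINES))  # 2: only the dialogue is searched
```

The second count is only possible because the sketch can tell prefix from dialogue, which is precisely the kind of logical division the databases discussed here do not expose to their users.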
the long shadows of shakespeare and print

the current demand-driven model for the production of critical editions, whether in print or electronic form, will never facilitate meaningful expansion of the canon, for the simple reason that we cannot demand editions of plays we do not know. in his essay, ‘the renaissance and the end of editing’, gary taylor argued that a series of interrelated cycles, fostered by academic, cultural, and economic structures, promotes the editing of shakespeare and discourages the editing of other renaissance dramatists:

1. scholars producing editions of shakespeare have less time to work on other dramatists ( ).
2. the vast population of shakespeare editions reinforces a notion of literary and cultural superiority, which simultaneously encourages the production of more editions of shakespeare and discourages the editing of other renaissance playwrights ( - ).
3. the extensive body of editorial and textual work on shakespeare already in existence means that producing a minimally competent edition of shakespeare is far easier than producing the equivalent edition of another renaissance dramatist ( ).
4. editions of shakespeare tend to be aimed at a wide range of readers, whereas editions of other playwrights tend to be ‘largely or even entirely bibliographical’ and ‘aimed at a small readership particularly interested in technical issues’ ( ). the disparity in market share and demand results in the production of cheap editions of shakespeare, and expensive editions of other playwrights.
5. the greater availability of inexpensive shakespeare editions facilitates flexible teaching, whereas the unavailability of affordable editions ‘makes it correspondingly difficult to teach other dramatists at all, let alone flexibly’ ( ).
when he wrote this article in , taylor noted the ‘widespread disregard, even derision, of textual studies’ ( ), the marginalization or exclusion of discussions of editorial and textual issues from mainstream scholarship ( ), and the declining numbers of graduate courses in bibliography ( ). the situation has not markedly improved in the intervening years. the devaluation of editorial work, especially in the context of tenure and promotion, remains an all-too-familiar complaint. the report of the mla task force on evaluating scholarship for tenure and promotion, for example, found that an average of . % of the departments surveyed rated bibliographic scholarship ‘not important’, and an average of % rated scholarly editions ‘not important’ (‘report’ – ). the ‘tyranny of the monograph’ (waters b ) for tenure and promotion has consequences for teaching and supervision as well, reflected in the dwindling numbers of graduate courses in bibliography and of critical editions submitted as doctoral theses. with this in mind, i would add an additional cycle to taylor’s list: the professional devaluation of bibliographical and editorial work, and the diminishing opportunities for graduate students to pursue training in these areas, simultaneously limit the production of new editors and discourage the activities of existing textual scholars.

taylor argues that ‘all these cycles, and others, reinforce one another, by inflating the incentives for the production of more shakespeare editions, and depressing the production of editions of other dramatists’, and ‘this entire system of interrelated vicious cycles not only reflects, but [also] actively deepens, the canonical class system’ (‘the renaissance’ ). even taylor’s impassioned plea, ‘we should not be editing shakespeare, because we should be editing someone else’, is inevitably couched in terms of shakespearean investment and return.
in addition to expanding our understanding and appreciation of these other playwrights, taylor argues, editions of non-shakespearean drama will ‘radically change our perceptions of shakespeare more than any new edition of shakespeare could’, because they ‘will change our perceptions of the renaissance, of the textual space to which shakespeare belonged, and of his place in it’ (‘the renaissance’ ). this is not an indictment of taylor or his argument; rather, it highlights the peculiar challenge facing editions of non-canonical, non-shakespearean, renaissance drama in the demand-driven model of production: generating demand in the first place. for taylor, the answer to the questions, ‘how can you love a work, if you don’t know it? how can you know it, if you can’t get near it? how can you get near it, without editors?’ (‘the renaissance’ ) appears to be to produce more critical editions of renaissance drama, but to justify their existence in starkly shakespearean terms.

we cannot wait to respond to a demand that will never come. we should therefore produce critical editions of renaissance drama in the absence of initial demand, if only on the untested belief that their availability will stimulate the demand that might have justified their creation in the first place – in other words, ‘if we build it, they will come’. the present model, in which demand justifies the production of critical editions, is clearly unsuitable for the project of expanding the canon of renaissance drama as taught, studied, and performed. a new model is required: a model in which editorial effort does not simply sustain the existing canon or respond to prevailing critical trends; a model situated outside of the traditional restraints imposed by the institutions of print publishing; a model in which autonomy is not surrendered to the presses, but is distributed across
flexible institutional partnerships and collaborations. although the precise shape of this new model remains to be seen, digital publishing is certain to play a pivotal role: the internet, hailed by overzealous critics in the s as ushering in the last age of print, promises to be the most viable medium for the production and distribution of critical editions of non-canonical, non-shakespearean, renaissance drama.

behind the curtain: challenges to production and reception

peter shillingsburg reminds us that ‘print editions benefit from a five-hundred-year tradition of craft, skill, equipment, design, production, marketing, and dissemination’, with ‘personnel and infrastructure in place to bolster the scholarly effort’ ( ); digital publishing, still very much in its infancy, does not. despite calls for change, and efforts to establish mechanisms for its peer review, the profession continues to view the status of scholarly work in electronic form with scepticism. michael best’s suggestion, ‘that the way to scholarly credibility for the electronic medium is not to try to placate tradition through a slavish attempt to recreate the page and the mechanisms of judging the page’, but rather ‘through pushing the edges of scholarship’ and its dissemination ‘and through celebrating joyously those things the page cannot do’, is instructive (‘forswearing thin potations’ ). however, even if the profession were to recognize digital scholarship as radically different from, yet as valid and valuable as, its print counterparts, the electronic medium presents unique challenges for its production and its reception.

of the many ‘hurdles that have limited the creation of a truly innovative online edition’, christie carson has argued that ‘the key hurdle is copyright’ ( ).
while copyright presents a valid challenge, particularly in terms of securing permission to reproduce images, video, and audio of performances, it is certainly not the ‘key hurdle’. it may be difficult to obtain the required permissions – from actors, directors, theatre designers, photographers, musicians, composers, distribution companies, and so forth – but a growing number of electronic editions and performance archives prove that it is possible to do so. the shakespeare performance in asia and the global shakespeares projects offer impressive video collections of shakespeare performance from around the world; the shakespeare in performance database of the internet shakespeare editions similarly offers a growing archive of performance materials submitted by theatre companies in north america and abroad. other projects have not only procured permissions for multimedia content, but have actively created it. richard brome online commissioned actors drawn from the alumni lists of the royal shakespeare company to act out selected sequences, recorded and made available as video clips, to serve as performance footnotes for visualizing the theatrical and staging potential of the sequences. the newly launched queen’s men editions similarly incorporates video clips of live performances of the plays as part of its electronic editions.

it is virtually impossible to obtain permissions to use materials from hollywood blockbusters and video-recordings of performances by major theatre companies, since their exclusive dissemination – long after the initial theatrical run – provides an ongoing source of revenue. rather than being a hindrance, this practical reality should spur the inclusion of performance materials from non-canonical stage and screen productions in electronic editions.
all performances offer a valuable contribution, whether by professionals, amateurs, or students; electronic editions of renaissance drama, therefore, have an opportunity (and arguably a duty) not only to expand the canon of plays as taught, studied, and performed, but also to extend the range of productions surveyed beyond the usual suspects in any discussion of performance history.

the most pressing issue facing electronic editions of renaissance drama therefore is not copyright, but sustainability: it matters little how difficult rights and permissions are to obtain for multimedia content if the electronic edition for which they are acquired becomes obsolete. since hardware and software technologies are constantly changing, the task of ensuring that an electronic edition remains usable is akin to hitting a moving target. as joseph a. dane has wryly commented, ‘several early electronic databases are usable today only to the degree that, say, my rpm record collection is playable’ ( ). while the technological progression from vinyl to mp3 has been striking, it pales in comparison to the advances (many of them invisible) made in computer hardware and software over the last decade. the rate at which electronic editions, dependent on specific hardware or software requirements, can become obsolete is steadily increasing. the cost of preserving, maintaining, and updating electronic editions in order to stave off technological obsolescence far exceeds the cost of publishing a print edition which, once it has been published, requires no further action (short of preserving it in a library) to ensure it remains usable.
‘though the ephemerality of some editions derives from the fragility and novelty of the infrastructure’, john lavagnino has recently noted, ‘it is also a reflection of grant-funding priorities, in which long projects are prohibited and nothing matters once the deliverables are completed’ ( ). current grant schemes in the humanities, in which the monograph remains the deliverable par excellence, are incapable of supporting the ongoing costs of sustaining an electronic edition. this is effectively the message of the and ithaka reports, which stress the need for creative revenue models in order to ‘generate or gain access to the resources – financial or otherwise – needed to protect and increase the value of the content or service for those who use it’ (maron et al. ). proposed models of collaboration to ensure the long-term sustainability of digital projects and electronic editions include formal partnerships between researchers and their institutional libraries (krezschmar & potter) and flexible partnerships across institutions (reside), but these remain to be tested. as the ithaka reports have repeatedly noted, ‘there is no magic “rule book” for online projects’, and ‘experimentation is often the only way to see what works best’ (maron et al. ).

the task of continually preserving, maintaining, and updating digital projects and electronic editions to ensure that they remain usable, as well as revising their content to ensure that they remain relevant, has given rise to what julia flanders has characterized as a ‘culture of perpetual prototype’, in which finality and completion are resisted. the iterative nature of electronic editions presents a peculiar challenge to conventional mechanisms of evaluation and peer review, designed for assessing scholarship in ‘complete’ and ‘final’ form in the static medium of print.
recent calls to stop treating digital projects as products and to conceive of them instead as processes may yet offer a solution. ‘digital artifacts themselves’, alan galey and stan ruecker have argued, and ‘not just their surrogate project reports’ should ‘stand as peer-reviewable forms of research, worthy of professional credit and contestable as forms of argument’ ( ). whether such proposals gather support from the profession, particularly in terms of tenure and promotion review, remains to be seen.

in addition to a viable model for sustainability, the production of electronic editions requires technical expertise in programming, textual encoding, interface design, digitization of analogue sources, and digital content management, amongst others. while renaissance scholars typically do not have these skills, it has been argued that they are becoming, if they are not already, necessary: ‘we need to become electronically expert ourselves if we hope to produce workable [electronic] editions’, urges leah marcus, since ‘to leave the niceties of encoding to outside experts is to court disaster’ ( ). more recently, alan galey has offered an outline of the technical knowledge and expertise that a digital scholarly editor should possess (‘mechanick exercises’). while it is certainly beneficial for a textual scholar to become familiar with the technical workings behind the curtain of an electronic edition, it is impractical to insist on a level of expertise and proficiency equivalent to that of a computer scientist.
‘to insist that all editors who want to make digital editions should understand these things’, writes peter robinson, seems ‘as short-sighted and narrowly limiting as the requirement of oxford and cambridge universities (maintained until ) that incoming undergraduates in all subjects have a basic qualification in latin or greek’ ( ). as peter shillingsburg reminds us, ‘creating an electronic edition is not a one-person operation; it requires skills rarely if ever found in any one person’ ( ).

at the same time, the dangers of over-reliance on outside technical experts are real. in an essay reflecting on the then still unpublished oxford middleton, gary taylor confessed that he was ‘powerless to finish the edition’ despite the delays, and that ‘when people ask me whether or when the edition will appear, i say, “it depends upon john lavagnino”’, the project’s digital editor (‘c:\wp\file.txt’ ). some of the delays associated with the cambridge jonson may also have resulted from the project’s reliance on a single digital editor, david gants, who, like lavagnino, has relocated internationally since the project began. this is not to fault lavagnino, gants, or anyone else involved with the jonson and middleton projects, but rather to suggest that large editorial projects such as these consider a level of flexibility in their personnel to ensure that progress and completion rest on more than a single pair of shoulders. forward-planning is not an issue unique to electronic editorial projects: for example, fredson bowers thankfully arranged for robert k. turner jr. to take charge of the edition of the works in the beaumont and fletcher canon prior to his death in , before the final volumes were completed.
the conventions of presenting the text and apparatus of a scholarly edition of renaissance drama in print – that is, the layout and the typography, such as bold and italic type, small caps, strike-throughs, paragraph alignment and justification, and the bewildering array of brackets, sigla, endnotes and footnotes famously derided by edmund wilson and lewis mumford as ‘barbed wire’ distancing the reader from the text – and the processes by which these features are produced are established industry and community standards recognized by editors, publishers, and readers alike. the same cannot be said for electronic editions, where the display of text and apparatus – and the interaction between them – varies from project to project: as alan galey has remarked, ‘the digital scholarly edition still does not have a stable, repeatable exemplar that can bear the weight of the critical speculation that preceded it’ (‘signal to noise’ ).

the problem of standardization extends beyond the rendering and layout of the text and apparatus of an electronic edition. when we look at the oxford shakespeare edition of timon of athens, for example, we instinctively distinguish between the functions performed by the same word ‘timon’ as it appears in different contexts, such as in the play’s title (‘timon of athens’ and ‘the life of timon of athens’), as part of the running title in the header of the play-text (‘the life of timon of athens’), as a speech prefix (‘timon’), as an instruction in stage directions (e.g. ‘they greet timon’), and as a reference to the character in dialogue (e.g. ‘thou art going to lord timon’s feast?’).
in order to formalize these distinctions and make them machine-readable (and thus enable a computer to display, interact with, and search the text intelligently), an electronic text of the play needs to be structured with textual encoding, by which the various elements of the text – its content and form – are explicitly described and defined or ‘tagged’ or ‘marked up’. although the text encoding initiative (tei) consortium has been developing a standard (based upon xml, the extensible markup language, a w3c standard derived from the iso standard sgml) for encoding texts of the sort studied by humanities scholars, editors of renaissance texts have been more reluctant than most to accept and use it: as ian lancashire has observed, ‘embedded in tei tags are modern assumptions of language, text, and genre partly incompatible with renaissance thought’ ( ). for these and other reasons, a number of electronic editorial projects develop their own textual encoding schemas rather than follow the tei guidelines.

the upside-down ‘tree’ structure of tei and xml documents – in which all documents contain a single ‘root’ element from which all other elements branch out – presents a further problem for the electronic edition of renaissance drama, because it requires a text to be encoded as a single hierarchy of elements. a typical renaissance printed play-text has both a physical structure (divisible by book, gatherings or quires, formes, leaves, pages, columns, sections, paragraphs, and lines) and a literary or conceptual structure (divisible by play, acts, scenes, and lines), and these frequently overlap. a number of solutions (of varying degrees of effectiveness and inelegance) have been proposed, but the search for a single, accepted community standard for textual encoding continues.
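To make the two points above concrete, the following is a minimal sketch, not a valid or complete tei document. The element names (`head`, `stage`, `sp`, `speaker`, `l`, `pb`) are borrowed from the tei guidelines, and the two quoted phrases come from the examples above; the speaker attribution (‘lord’), the line numbers, the `[...]` placeholder lines, and the position of the page break are invented for illustration. The first function shows how markup lets a program tell the different roles of the word ‘timon’ apart; the second shows one commonly used workaround for overlapping hierarchies, in which the literary structure forms the tree and a physical boundary is recorded as an empty ‘milestone’ element (`pb`) that can fall in the middle of a speech.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative fragment only: speaker label, line numbers, placeholder
# lines, and page-break position are assumptions made for this sketch.
SAMPLE = """
<div type="scene">
  <head>the life of timon of athens</head>
  <stage>they greet timon</stage>
  <sp>
    <speaker>lord</speaker>
    <l n="1">thou art going to lord timon's feast?</l>
    <pb n="2"/>
    <l n="2">[...]</l>
  </sp>
  <sp>
    <speaker>timon</speaker>
    <l n="3">[...]</l>
  </sp>
</div>
"""


def roles_of(word, root):
    """Tally which kinds of element contain the word in their own text."""
    return Counter(el.tag for el in root.iter() if el.text and word in el.text)


def page_of_each_line(root, first_page="1"):
    """Walk the tree in document order, remembering the last page break seen."""
    page, result = first_page, {}
    for el in root.iter():
        if el.tag == "pb":
            page = el.get("n")
        elif el.tag == "l":
            result[el.get("n")] = page
    return result


root = ET.fromstring(SAMPLE.strip())
print(roles_of("timon", root))   # one hit each in head, stage, l, and speaker
print(page_of_each_line(root))   # the first speech straddles pages 1 and 2
```

The milestone approach keeps the document a single tree, at the price of making the physical structure second-class: a page is no longer an element with contents, only a point in the text, which is one reason the solutions mentioned above are described as inelegant.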
just as textual encoding or markup is used to provide a machine-readable structure for an electronic text, character encoding is used to assign distinct machine-readable codes to the individual characters (letters and punctuation) made visible on the screen and to translate between them. character encoding presents a particular problem for electronic editions of renaissance drama, since many of the characters and typographical symbols commonly found in early modern printed play-texts are not supported by existing character encoding standards. the unicode standard utf- , to take the most ubiquitous example, can encode the long-stem ‘s’ and a limited set of the ligatures, digraphs, abbreviations, and macron letters typically found in renaissance texts, including the capital and lowercase ae and oe digraphs, the lowercase ff, fi, fl, ffi, ffl, ij, st, and long st ligatures, and the macron letters a, e, i, o, and u. unicode does not, at present, support other commonly employed ligatures – ct, sh, si, sl, sp, ss, ssi, and ssl – or swash characters. an electronic edition cannot render and display all of the characters that appear in the original renaissance play-texts using unicode or any other existing character encoding standard, which raises issues of textual fidelity: should electronic editions render as much of the early modern orthographical and typographical elements as technologically possible, or simply normalize the characters to avoid inconsistencies and exclusions? the display of all necessary characters can be accomplished by the creation of a custom-built font, which users need to install in order for the characters to render correctly, but this raises further issues of software dependency and accessibility.
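the choice between fidelity and normalization can be made concrete with a short python sketch using the standard unicodedata module. it inspects a few of the characters named above and then folds them to plain letters. the sample labels and function names are invented for illustration, utf-8 is assumed as the byte encoding, and the use of nfkc compatibility normalization is an assumption of this sketch: it is only one of several ways an edition might normalize, and it is lossy, since the distinction between a long s and a round s cannot be recovered afterwards.

```python
import unicodedata

# A few characters the passage mentions; the code points come from the
# Unicode character database, the labels are informal.
SAMPLES = {
    "long s": "\u017f",
    "fi ligature": "\ufb01",
    "long s + t ligature": "\ufb05",
    "a with macron": "\u0101",
}


def describe(ch: str) -> dict:
    """Report code point, official name, UTF-8 byte length, and NFKC-normalized form."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "utf8_bytes": len(ch.encode("utf-8")),
        "nfkc": unicodedata.normalize("NFKC", ch),
    }


def normalize_transcription(text: str) -> str:
    """One possible (lossy) normalization: fold compatibility characters to plain letters."""
    return unicodedata.normalize("NFKC", text)


for label, ch in SAMPLES.items():
    print(label, describe(ch))

# A diplomatic spelling with long s and an ffi ligature, folded to modern letters.
print(normalize_transcription("\u017fu\ufb03ce"))  # suffice
```

a diplomatic transcription would store the original code points and leave any folding to the display or search layer, so that both the early modern spelling and a normalized form remain available; ligatures with no code point of their own would still need markup or a custom font, as the paragraph above notes.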
beyond the facsimile: current practices and future possibilities

in a review published in the athenaeum in , an anonymous critic remarked:

we are about to be inundated with new editions of shakespeare… as the demand increases for the plays of shakespeare, so new editors will arise – all with notions and new readings of their own, – till it will end perhaps by every intelligent man turning editor for himself.

with the emergence of digital publishing, the notion of the reader-as-editor – so peevishly derided in – is fast becoming a reality, with more and more electronic editions supporting user-generated content, customization, and social networking applications. studies have already demonstrated the pedagogical benefits of producing an electronic ‘social edition’, allowing a community of readers to create and share annotations and comments (kaplan & chisik, ‘in the company of readers’ and ‘reading together alone’). lukas erne has also recently argued that ‘students have much to gain from their own hands-on editorial experience,’ that ‘those who produce their own edition of even a short passage of a shakespeare play, with their own modernized spelling and punctuation, emendations, added or altered stage directions, lineation, annotation, collation, and perhaps even introduction’ will be ‘uniquely placed to engage with the complexities of the shakespearean text and its editorial constructiveness’ ( ). electronic editions of renaissance drama could therefore perform a useful pedagogical service, simultaneously providing a reliable, rigorously edited and citable text for scholarship as well as making the digital tools and materials available for readers to engage in editorial practice themselves.
the ability to encourage and incorporate user- and community-generated content has also radically expanded the range of resources provided by an electronic edition. for example, the shakespeare in performance database, developed by the internet shakespeare editions, allows users not only to browse and search through digitized performance materials (such as audio clips, costume and poster designs, photographs, press releases, production notes, reviews, scripts, and so on) from over a thousand film and stage productions of shakespeare’s plays, but also to submit their own content to the collection. similarly, the internet shakespeare editions has also developed a performance chronicle, which offers a searchable blog-style database of reviews of contemporary shakespeare productions, penned and submitted by the general public and by scholars alike, as well as pre- and post-publication reviews from select scholarly journals. though newly launched, the performance chronicle promises a level of dynamic interaction – from searching, posting, and commenting on submitted reviews, to subscribing for email updates when a new review of a particular play is posted – that is simply impossible to accomplish in print. editions of individual plays for the internet shakespeare editions currently do not incorporate content from the shakespeare in performance or performance chronicle databases, but it is clear that some level of directed interaction between the editions and these user- and community-generated materials is planned for the near future.

scholarly editions of renaissance drama find their way into print either as an individual play, as part of the canonical works of a particular playwright, as an exemplar of a particular genre, or as part of a thematic or generic grouping.
richard proudfoot, reflecting on the canon of shakespeare, noted and questioned ‘the reluctance of publishers and readers to contemplate other criteria than the sometimes slippery one of authorship for constructing collections of plays’, when these other criteria – such as the particular playhouse, acting company, or moment in theatrical history associated with the play – might ‘reflect other, equally significant, common characteristics of the plays’ ( ). tanya hagen has also persuasively argued for the production of ‘repertory-based’ editions ‘as a potential and stimulating alternative to current models for editing english drama’ ( ). despite growing interest in repertory studies and increasing recognition of its importance as a corrective to the author- and canon-centred model of literary history, at present there are no plans to publish repertory-based editions in print. online, however, is a different story. in , the queen’s men editions were launched, offering the first repertory-based edition of renaissance drama. the project, which uses the publication platform developed by the internet shakespeare editions, offers scholarly editions of the plays associated with the queen’s men as part of a rich multimedia environment through which users can explore the theatrical, historical, and scholarly contents and contexts of the plays.

electronic editions of renaissance drama need not be restricted to groupings by playwright or acting company alone.
unlike print, in which plays are literally bound in place by what michael best has called ‘the determined physicality of a book’ (‘standing in rich place’ ), the flexibility of the electronic medium is capable of supporting the grouping of the same plays by any given criteria, from the traditional categories of acting company, playwright, playhouse, year of publication, year of first performance, genre and theme, through to more innovative and abstract criteria, such as the use of particular props, character names, stage directions, or even individual words. electronic editions of renaissance drama therefore promise to make proudfoot’s and hagen’s vision of a flexibly bound corpus of plays an achievable reality.

in addition to offering unparalleled flexibility in grouping and arranging editions of renaissance drama, the electronic medium also offers tantalizing possibilities for experimentation with the interface through which an edition is accessed, displayed, and interacted with by the reader. in her discussion of the conventional presentation of stage directions in print editions of renaissance drama, margaret jane kidnie urges editors to ‘begin experimenting more freely with the layout of the edited page’ in order to ‘make readers aware of textual indeterminacy’ and to ‘develop conventions with which we might guide users, not to a ‘proper’ choice, but rather to an awareness of choice and an imaginative interaction with the drama’ ( - ). as an example, kidnie presents an edited passage of troilus and cressida using ‘typographical arrows to indicate a time span for possible entries’ in cases of ambiguity (fig. ). with print editions, as kidnie notes, ‘it seems even general editors are reluctant to play around too much with presentation’ ( ).
while it seems unlikely that kidnie’s experimental print layout will ever be incorporated into an edition in the medium of print (for which it was originally conceived), there are plans to trial its use in an electronic edition. electronic editions are also able to ‘take advantage of the capacity of the medium for animation’ as a possible solution for highlighting textual indeterminacy in renaissance play-texts ‘by recreating a semantic field where the text dances between variant readings’ (best ‘standing in rich place’ ). for example, variant readings (such as ‘wayward’, ‘weyard’ and ‘weird’ in macbeth) or ambiguities in dialogue and stage directions (such as hamlet’s entry before giving his ‘to be, or not to be’ speech) might be animated to interchange randomly at given intervals, or otherwise presented to the reader as selectable options, so the ‘text becomes visibly variant, teasingly slippery, as it makes manifest its actual instability, hidden by our meticulously edited print texts’ (best ‘standing in rich place’ ).

experimentation with virtual environments and electronic gaming has further tested and expanded the boundaries of the scholarly edition. by rendering percy shelley’s sonnet ‘ozymandias’ as a webmoo (a multimedia web-based form of the object-oriented mud or multi-user dungeon ⁄ dimension), neil freistat and steven e. jones have explored the creation of an immersive environment through which the text is ‘reflexively interpreted’ or ‘embodied and experienced architecturally, spatially, at the cognitive intersection of its linguistic and graphic codes’ ( - ). the introduction of ‘online collaborative playspace[s]’, such as the ivanhoe game constructed by jerome mcgann and johanna drucker to allow users to dynamically interact with, alter, and comment upon walter scott’s romance and its ongoing reception history, has similarly exposed ‘the indeterminacy of humanities texts to role-play and performative intervention’ (ivanhoe ‘about’).
with more direct application to renaissance drama is the simulated environment for theatre, a virtual environment for reading, visualizing, and directing plays in scaled three-dimensional models of real or imagined performance spaces (roberts-smith et al., ). although still in development, a preview version of the software application is available, with the text of shakespeare’s julius caesar included for experimentation (fig. ). whether through animation, immersion, visualization, or play, or by giving users the opportunity to make their own decision about variant readings, electronic editions may highlight the richness and fluidity of renaissance play-texts.

fig. . experimental print layout of troilus and cressida by margaret jane kidnie to highlight textual indeterminacy in stage directions (‘the staging of shakespeare’s drama in print editions’ ). used with kind permission of the author.

conclusion

electronic editions of renaissance drama challenge us to rethink existing models of scholarly publishing and collaboration, to expand our notions of what constitutes a ‘critical edition’, and to reconsider the products and processes we deem to be meaningful and assessable research as our discipline embraces the new medium. the challenges to their production, maintenance, and scholarly reception are many; but so too are the opportunities for not only creating new ways of looking at old texts, but new ways of conceiving the texts as dynamic, interactive, and unstable.
perhaps one of the most important contributions that electronic editions of renaissance drama have to offer is their spirit of experimentation and play, the capacity of the medium to extend beyond the facsimile, beyond the surrogate, and beyond the restrictions and conventions of print. the kingdom has been digitized, and yet shakespeare continues to reign supreme. if we wish to pay more than lip service to the ongoing project of expanding the canon of renaissance drama as it is taught, studied, and performed, digitizing primary materials is not enough – we need more critical editions. only by developing a new model for the production of these critical editions, a model not driven by demand, can we hope to escape the ‘increasingly sterile reiteration’ of the canon (leslie ). once the kingdom has been edited as well as digitized, shakespeare might have to fight to retain his crown.

fig. . screenshot of the simulated environment for theatre while running an interactive three-dimensional simulated performance of shakespeare’s julius caesar. used with kind permission of the developers.

acknowledgement

i wish to thank michael best, richard allen cave, eugene giddens, and helen ostovich for sharing their experiences of editing renaissance drama, online and offline, with me. jenna mead, jo mcewan, and kate riley kindly read over draft versions. a research development award from the university of western australia funded this research.

supporting information

additional supporting information may be found in the online version of this article.

appendix s . critical editions of english playtexts (printed, performed, or written – ), published – , excluding latin plays, facsimiles, reprints, fragments, unpublished theses, and private printings.
please note: wiley-blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. any queries (other than missing mate- rial) should be directed to the corresponding author for the article. short biography brett d. hirsch is university postdoctoral research fellow in medieval and early modern studies at the university of western australia. he is one of the editors of the routledge journal shakespeare and coordinating editor of the digital renaissance editions. he has pub- lished articles in the ben jonson journal, early modern literary studies, early theatre and parer- gon, and is a contributor to the forthcoming cambridge world shakespeare encyclopedia. he is currently working on a monograph on the transmission and adaptation of animal narra- tives in late medieval and early modern england, an electronic edition of fair em with kevin quarmby, and, with hugh craig, a computational stylistics study of early modern dramatic genre, repertory, and authorial style. notes *correspondence: centre for medieval and early modern studies (m ), university of western australia, stir- ling highway, crawley, wa , australia. email: brett.hirsch@uwa.edu.au representative studies include craig and kinney, hope, jackson, masten, and vickers. ‘a & c black : timon of athens – william shakespeare’. a & c black, methuen drama & the arden shakespeare. catalogue. [online]. retrieved on december from: http://www.acblack.com/drama/books/details.asp- x?isbn= . at time of writing, the integrated catalogue of the british library does not provide a stable uri for individual entries. the system number for the arden volume in question is . 
the attribution of the title in the oxford university press online catalogue seems to depend on region: the us catalogue gives both shakespeare and middleton, whereas the uk, australia ⁄ new zealand, and canada catalogues give only shakespeare; compare: http://www.oup.com/us/catalog/general/subject/literatureenglish/drama/shakespeare/?view=usa&ci= (usa), http://ukcatalogue.oup.com/product/ .do (uk), http://www.oupcanada.com/catalog/ .html (canada), and http://www.oup.com.au/titles/higher_ed/oxford_worlds_classics/shakespeare/ (australia ⁄ new zealand) [accessed december ]. richard proudfoot provides a similarly telling example: reflecting on his edition of the two noble kinsmen in which, ‘following the precedent of the first edition of the play in ’, he gave the names of its authors as john fletcher and william shakespeare (in that order), proudfoot remarks, ‘i have a suspicion that it might have earned more royalties over the years if i had been less scrupulous and put the big name first’ ( ). accessed december . see also peter holland’s discussion of the various critical responses to the oxford shakespeare and its treatment of collaboration (‘authorship and collaboration’). in addition to scholarly translations, this exclusion extended to individual titles in the webster’s thesaurus editions, which includes separate volumes for every shakespeare play with french, german, spanish, chinese, and korean apparatus.
such as editions for the following series: longman school shakespeare; oxford school shakespeare; cambridge school shakespeare; insight shakespeare; no fear shakespeare; interfact shakespeare; ntc shakespeare; falcon shakespeare; shakespeare parallel text; access to shakespeare; heinemann ⁄ shakespeare library; harcourt shake- speare; simply shakespeare; nelson thornes shakespeare; and, the red reader shakespeare. such as the shakespeare retold series, and shakespeare titles in the lake classics, white wolves, real reads, and the anthropomorphic phakespeare series. such as shakespeare titles in the saddleback classics and pacemaker classics editions. such as editions for the following series: manga shakespeare; comic book shakespeare; livewire shakespeare; illustrated readers shakespeare; picture this! shakespeare; campfire graphic novels shakespeare; and, the graphic shakespeare. such as titles in the shakespeare on the double! series. ‘a & c black : the renegado – philip massinger’. a & c black, methuen drama & the arden shakespeare. cat- alogue. [online]. retrieved on december from: http://www.acblack.com/drama/books/details.aspx? isbn= . ‘william percy’s mahomet and his heaven by matthew dimmock’. ashgate publishing. catalogue. [online]. retrieved on december from: http://www.ashgate.com/isbn/ . for a more detailed discussion of richard brome online, see hirsch (‘bringing richard brome online’). representative early examples include the electronic editions of the oxford shakespeare complete works (released in as a machine-readable version of the text), larry friedlander’s shakespeare project (developed during the s using apple hypercard), and the voyager macbeth cd-rom ( , incorporating a. r. braun- muller’s new cambridge edition of the text). for a more detailed discussion of early electronic editions of shake- speare, see bolton (‘the bard in bits’), best (‘shakespeare and the electronic text - ’), and hirsch, arneil & newton (‘mark the play’ - ). 
for an historical overview of lion, eebo, and eebo-tcp, see chadwyck-healey (‘the new textual technologies’). for a more detailed discussion of these and other issues, see foster (‘a romance of electronic scholarship’), gadd (‘the use and abuse of eebo’), gants and hailey (‘renaissance studies and new technologies’), and steg- gle (‘knowledge will be multiplied’). scholarly demand for high-resolution, high quality full-colour images of early printed shakespeare witnesses resulted in the creation of the shakespeare quartos archive. currently in prototype, the shakespeare quartos archive makes freely available ‘full cover-to-cover digital reproductions and transcriptions of copies of the five earliest editions of the play hamlet’. the equivalent for non-shakespearean renaissance playbooks currently does not exist. guillory and marshall offer different models of critical reading practices. guillory proposes a model of sustained engagement with text that distinguishes between academic or ‘professional reading’ and ‘lay reading’. professional reading is characterized by four features: it is a form of ‘work … requiring large amounts of time and resources’; it is a ‘disciplinary activity … governed by conventions of interpretation and protocols of research’; it is ‘vigilant’ insofar as it ‘stands back from the experience of pleasure in reading’ and ‘gives rise to a certain sustained reflection’; and it is ‘a communal practice’ ( - ). lay reading, by contrast, is ‘practiced at the site of leisure’, governed by different conventions, ‘motivated by the experience of pleasure’ and ‘is largely a solitary practice’ ( ). the model proposed by marshall, however, does not distinguish between reading practices within and without the academy. 
instead, marshall offers a system of reading practices distinguished by level of sustained engagement and characteristic benefits and losses in comprehension and speed: ‘reading’, ‘skimming’, ‘scanning’, ‘glancing’, and ‘seeking’, with the meta-type of ‘rereading’ to reinforce the fact that multiple practices might be engaged by the same reader of the same text ( ). in this model, reading is defined as ‘canonical careful reading’, in which ‘the reader traverses the text linearly’ with the aim of achieving full comprehension ( ). skimming is ‘faster than canonical reading’, in which ‘traversal is still linear, but comprehension is sacrificed for speed’ and ‘the aim is to get the gist of the text’ ( ). scanning is faster still as ‘traversal becomes non-linear’ and ‘the reader looks ahead and back’ in the text with the aim of ‘triage or to decide on further action’ ( ). when glancing, the reader turns the pages or scrolls through electronic text very quickly, spending almost as much time turning ⁄ scrolling through the text as looking at it. the aim is to detect important structural elements (in the early printed drama, this might include prefatory or paratextual matter, section divisions, headings) ‘until something holds sufficient interest to transition to another type of reading’ ( ). finally, seeking involves the reader scanning quickly for a particular textual, structural, or linguistic element (such as speech prefixes or proper nouns) ‘with an aim orthogonal to full comprehension’ ( ). see also ulrich (‘tenure’) for a response to the recommendations of the report in terms of textual scholarship.
a cursory search of the proquest dissertations and theses (pqdt) database, which indexes masters and doctoral theses submitted in the united states, canada, and the united kingdom, lists only critical editions of renaissance drama (including stage jigs and tudor manuscript plays) submitted as phd theses since . the british library’s electronic theses online service (ethos), which indexes masters and doctoral theses produced at higher education institutions in the united kingdom, also only lists critical editions submitted as phd theses during this period. however, as with the other databases consulted in this study, these indexes are not exhaustive; indeed, ethos is still a beta release. the figures cited here serve only as estimates for the purposes of illustration and comparison. on the issue of the ‘tyranny of the monograph’ and its relation to the doctoral thesis, i agree with leslie monkman that ‘we continue to ignore the relation of the doctoral dissertation to the much talked of and multiple crises in the humanities’ ( ). as joy palmer notes, this oft-quoted reference is to the film field of dreams, ‘in which the protagonist takes an enormous leap of faith by building a baseball field … because the mystical voice in his head assures him, ‘‘if you build it, they will come’’ ’ (‘archives . ’). for recent discussion of possible models of collaboration, see nichols, kretzschmar & potter, and reside. in particular, see the reports submitted to the humanities and social sciences federation of canada (siemens et al., ), the uk arts and humanities research council (bates et al., ), and the american council of learned societies (‘our cultural commonwealth’). as kenneth price notes, ‘various experiments with peer review mechanisms are now underway’ (‘digital scholarship’ ), pointing to the national endowment for the humanities (neh) edsitement portal and to nines as examples.
the ephemerality of the electronic medium is perhaps the most pressing challenge to overcome in order to raise the scholarly profile and status of digital projects. scholars, comforted by the apparent permanence of print, are understandably cautious about publishing online. for example, renaissance forum, one of the earliest online journals in the field ( – ), has only recently re-emerged after years of being offline and unavailable, accessible only by trawling through the cache of the site stored by the internet archive’s wayback machine service. for further discussion of these issues, see especially shillingsburg ( – ). the special cluster of articles on the notion of ‘completion’ in the context of digital humanities projects in the spring issue of digital humanities quarterly, guest-edited by matthew g. kirschenbaum, is pertinent to this discussion; see http://digitalhumanities.org/dhq/vol/ / /index.html. as one of the anonymous reviewers of this article reminds me, with any long-term, sustained, collaborative enterprise, tensions may arise between the competing demands and desires of those involved. with electronic edi- tions, for example, academic decisions about the re-presentation of the text may be at odds with the practical demands of the developers, and perhaps even the technical limitations of the medium itself. the task of reconciling these differences can be a major challenge facing the collaborators in any editing project, print and electronic. see mumford and wilson; for a thoughtful discussion of these objections, see greetham (‘rights to copy’ – ) and garber (academic instincts – ). for an overview of textual encoding as it applies to digital humanities projects, see renear (‘text encoding’). 
representative examples include the internet shakespeare editions, which has developed its own encoding guide- lines; the queen’s men editions and the digital renaissance editions, which have adapted the guidelines developed by the ise; the renaissance electronic texts project, which rejected the tei guidelines as unsuitable and developed its own; and, the lexicons of early modern english, which is instead built upon a database of lemmata. representative discussions of the problem of overlapping hierarchies and proposed solutions include renear, mylonas and durand (‘refining’), sperberg-mcqueen and huitfeldt (‘goddag’), barnard et al., (‘hierar- chical encoding of text’), eggert (‘text-encoding’), and liu and smith (‘a relational database model’). for an overview of character encoding as it applies to digital humanities projects, see wittern (‘character encoding’). for a discussion of the issues associated with character encoding as they apply to an existing electronic edition of renaissance drama (richard brome online), see hirsch (‘bringing richard brome online’ – ). the athenaeum no. , mar. : – . the journals currently providing permission for pre- and ⁄ or post-publication reviews to be included on the site include cahiers élisabéthains, early modern literary studies, shakespeare, and shakespeare bulletin. for a more detailed discussion of electronic editions of shakespeare and the inclusion of materials relevant to performance criticism, see hirsch, arneil, and newton (‘mark the play’ – ). representative recent discussions of repertory studies include munro (‘early modern drama and the repertory approach’) and rutter (‘repertory studies’). the digital renaissance editions project, which ambitiously aims to publish electronic scholarly editions of all early modern plays from tudor interludes to caroline drama, will experiment with providing this level of flexibility in grouping the plays by various criteria. 
the edition of fair em i am preparing, with kevin quarmby, for the digital renaissance editions will experiment with kidnie’s model for displaying stage directions. for an extensive account of the ivanhoe game, see mcgann – .

works cited

american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. ‘our cultural commonwealth’. final report, american council of learned societies, . [online]. http://www.acls.org/cyberinfrastructure/ourculturalcommonwealth.pdf. barnard, david, et al. ‘hierarchical encoding of text: technical problems and sgml solutions.’ computers and the humanities ( ): – . bates, david, et al. ‘peer review and evaluation of digital resources for the arts and humanities.’ . final report, ict strategy projects, arts and humanities research council, uk. [online]. http://www.history.ac.uk/resources/digitisation/peer-review. best, michael. ‘ ‘‘forswearing thin potations’’: the creation of rich texts online.’ mind technologies: humanities computing and the canadian academic community. eds. raymond siemens and david moorman. calgary: u of calgary p, . – . ——. ‘shakespeare and the electronic text.’ a concise companion to shakespeare and the text. ed. andrew murphy. malden: blackwell, . – . ——. ‘standing in rich place: electrifying the multiple-text edition or, every text is multiple.’ college literature . ( ): – . bevington, david. ‘editing renaissance drama in paperback.’ renaissance drama ( ): – . carson, christie. ‘the evolution of online editing: where will it end?’ shakespeare survey ( ): – . chadwyck-healey, charles. ‘the new textual technologies.’ a companion to the history of the book. eds. simon eliot and jonathan rose. malden: blackwell, . – . champion, justin. ‘discovering the past online.’ jisc inform ( ): n. p. [online].
retrieved on january from: http://www.jisc.org.uk/publications/jiscinform/ /pub_inform .aspx#dtpo. craig, hugh and arthur f. kinney, eds. shakespeare, computers, and the mystery of authorship. cambridge: cambridge up, . dane, joseph a. out of sorts: on typography and print culture. philadelphia: u of pennsylvania p, . database of early english playbooks. eds. alan b. farmer and zachary lesser. –. university of pennsylvania. [online]. retrieved on december from: http://deep.sas.upenn.edu/. dawson, anthony b., and gretchen e. minton, eds. timon of athens. london: the arden shakespeare, . digital renaissance editions. coordinating ed. brett d. hirsch. –. university of western australia and the inter- net shakespeare editions. [online]. retrieved on december from: http://digitalrenaissance.arts.uwa. edu.au/. early english books online [eebo]. –. chadwyck-healey (proquest llc). [online]. retrieved on decem- ber from: http://eebo.chadwyck.com/. early english books online text creation partnership [eebo-tcp]. –. proquest llc, university of oxford libraries, university of michigan libraries, and council on library and information resources (clir). [online]. retrieved on december from: http://quod.lib.umich.edu/e/eebo/. egan, gabriel. ‘impalpable hits: indeterminacy in the searching of tagged shakespearean texts.’ saa meeting. bermuda. march . eggert, paul. ‘text-encoding, theories of the text, and the ‘‘work-site’’.’ literary and linguistic computing . ( ): – . electronic theses online service [ethos]. –. british library. [online]. retrieved on december from: http://ethos.bl.uk/. erne, lukas. shakespeare’s modern collaborators. london: continuum, . foakes, r. a. ‘the need for editions of shakespeare: a response to marvin spevack.’ connotations . ( – ): – . ford, john. the complete works of john ford [oxford ford]. gen. ed. brian vickers. vols. oxford: oxford up, forthcoming. foster, donald. 
‘a romance of electronic scholarship; with the true and lamentable tragedies of hamlet, prince of denmark. part : the words.’ early modern literary studies . ( ): . – . [online]. http://purl.oclc.org/emls/ - /fostshak.html.
fraistat, neil, and steven e. jones. ‘immersive textuality: the editing of virtual spaces.’ text ( ): – .
gadd, ian. ‘the use and misuse of early english books online.’ literature compass . ( ): – .
galey, alan. ‘mechanick exercises: the question of technical competence in digital scholarly editing.’ electronic publishing: politics and pragmatics. ed. gabriel egan. tempe: medieval and renaissance texts and studies, . – .
——. ‘signal to noise: designing a digital edition of the taming of a shrew.’ college literature . ( ): – .
——, and stan ruecker. ‘how a prototype argues.’ literary and linguistic computing . ( ): – .
gants, david, and r. carter hailey. ‘renaissance studies and new technologies: a collection of “electronic texts”.’ new technologies and renaissance studies. eds. william r. bowen and raymond g. siemens. tempe: medieval and renaissance texts and studies, . – .
giddens, eugene. how to read a shakespearean playtext. cambridge: cambridge up, .
global shakespeares video and performance archive. eds. peter s. donaldson and alexander c. y. huang. –. mit and global shakespeare project. [online]. retrieved on december from: http://globalshakespeares.org/.
goldberg, jonathan. ‘textual properties.’ shakespeare quarterly . ( ): – .
gossett, suzanne. ‘recent studies in the english masque.’ english literary renaissance . ( ): – .
greetham, d. c. the pleasures of contamination: evidence, text, and voice in textual studies. bloomington: indiana up, .
guillory, john. ‘the ethical practice of modernity: the example of reading.’ the turn to ethics. eds. marjorie garber, beatrice hanssen and rebecca l.
walkowitz. new york: routledge, . – .
guthrie, kevin, et al. ‘sustainability and revenue models for online academic resources’. ithaka report, [online]. retrieved on may from: http://www.ithaka.org/publications/sustainability.
hagen, tanya. ‘thinking outside the bard: reed, repertory canons, and editing early english drama.’ reed in review: essays in celebration of the first twenty-five years. eds. audrey douglas and sally-beth maclean. toronto: u of toronto p, . – .
heywood, thomas. the complete works [oxford heywood]. gen. ed. grace ioppolo. vols. oxford: oxford up, forthcoming.
hirsch, brett d. ‘bringing richard brome online.’ early theatre . ( ): – .
——, stewart arneil, and greg newton. ‘“mark the play”: electronic editions of shakespeare and video content.’ new knowledge networks . ( ): n.p. [online]. http://journals.uvic.ca/index.php/inke/article/view/ .
holland, peter. ‘authorship and collaboration: the problem of editing shakespeare.’ the politics of the electronic text. eds. warren chernaik, caroline davis and marilyn deegan. oxford: office for humanities communication, . – .
hope, jonathan. the authorship of shakespeare’s plays: a socio-linguistic study. cambridge: cambridge up, .
internet shakespeare editions (ise). coordinating ed. michael best. –. university of victoria. [online]. retrieved on december from: http://internetshakespeare.uvic.ca/.
internet shakespeare editions performance chronicle. ed. paul prescott. –. university of victoria. [online]. retrieved on december from: http://isechronicle.uvic.ca/.
ivanhoe game [ivanhoe]. dir. johanna drucker and jerome j. mcgann. –. arp (university of virginia). [online]. retrieved on december from: http://www.ivanhoegame.org/.
jackson, macdonald p. defining shakespeare: pericles as test case. oxford: oxford up, .
jonson, ben. the works [cambridge jonson]. gen. eds. david bevington, martin butler and ian donaldson. cambridge: cambridge up, forthcoming.
jowett, john.
‘editing shakespeare’s plays in the twentieth century.’ shakespeare survey ( ): – .
——, ed. timon of athens. oxford: oxford up, .
kaplan, nancy, and yoram chisik. ‘in the company of readers: the digital library book as “practiced place”.’ proceedings of the th acm/ieee joint conference on digital libraries (jcdl ‘ ). new york: acm press, . – .
——, and ——. ‘reading alone together: creating sociable digital library books.’ proceedings of the conference on interaction design and children (idc ‘ ). new york: acm press, . – .
kidnie, margaret jane. ‘the staging of shakespeare’s drama in print editions.’ textual performances: the modern reproduction of shakespeare’s drama. eds. lukas erne and margaret jane kidnie. cambridge: cambridge up, . – .
kretzschmar, william a. jr, and william gray potter. ‘library collaboration with large digital humanities projects.’ literary and linguistic computing . ( ): – .
lancashire, ian. ‘encoding renaissance electronic texts.’ new technologies and renaissance studies. eds. william r. bowen and raymond g. siemens. tempe: medieval and renaissance texts and studies, . – .
lavagnino, john. ‘afterword.’ electronic publishing: politics and pragmatics. ed. gabriel egan. tempe: medieval and renaissance texts and studies, . – .
leslie, michael. ‘electronic editions and the hierarchy of texts.’ the politics of the electronic text. eds. warren chernaik, caroline davis and marilyn deegan. oxford: office for humanities communication, . – .
literature online [lion]. –. chadwyck-healey (proquest llc). [online]. retrieved on december from: http://lion.chadwyck.com/.
liu, yin and jeff smith. ‘a relational database model for text encoding.’ computing in the humanities working papers a. ( ): n. p. [online]. retrieved on july from: http://projects.chass.utoronto.ca/chwp/chc /liu_smith/liu_smith.htm.
marcus, leah s. unediting the renaissance: shakespeare, marlowe, milton. new york: routledge, .
maron, nancy l., et al. ‘sustaining digital resources: an on-the-ground view of projects today.’ ithaka case studies in sustainability. [online]. retrieved on july from: http://www.ithaka.org/ithaka-s-r/research/ithaka-case-studies-in-sustainability.
marshall, catherine c. reading and writing the electronic book. san rafael: morgan & claypool, .
masten, jeffrey. textual intercourse: collaboration, authorship, and sexualities in renaissance drama. cambridge: cambridge up, .
mcgann, jerome. radiant textuality: literature after the world wide web. new york: palgrave macmillan, .
mcleod, randall. ‘un “editing” shak-speare.’ substance - ( ): – .
middleton, thomas. the collected works [oxford middleton]. gen. eds. gary taylor and john lavagnino. oxford: oxford up, .
mla task force on evaluating scholarship for tenure and promotion. ‘report.’ profession ( ): – .
monkman, leslie. ‘confronting change.’ esc: english studies in canada . – ( ): – .
mumford, lewis. ‘emerson behind barbed wire.’ new york review of books. january. . – , .
munro, lucy. ‘early modern drama and the repertory approach.’ research opportunities in renaissance drama ( ): – .
murphy, andrew. shakespeare in print: a history and chronology of shakespeare publishing. cambridge: cambridge up, .
nichols, stephen g. ‘time to change our thinking: dismantling the silo model of digital scholarship.’ ariadne ( ): n. p. web. [online]. retrieved on january from: http://www.ariadne.ac.uk/issue /nichols/.
palmer, joy. ‘archives . : if we build it, will they come?’ ariadne ( ): n. p. [online]. retrieved on july from: http://www.ariadne.ac.uk/issue /palmer/.
pincombe, mike and cathy shrank. ‘doing away with the drab age: research opportunities in mid-tudor literature.’ literature compass . ( ): – .
price, kenneth m.
‘digital scholarship, economics, and the american literary canon.’ literature compass . ( ): – .
proquest dissertations and theses [pqdt]. –. continues proquest digital dissertations ( – ). proquest/umi dissertation publishing. [online]. retrieved on december from: http://proquest.umi.com/.
proudfoot, richard. shakespeare: text, stage and canon. london: thomson learning, .
queen’s men editions. gen. ed. helen ostovich. –. mcmaster university, university of toronto, and the internet shakespeare editions. [online]. retrieved on december from: http://qme.internetshakespeare.uvic.ca/.
renaissance electronic texts. gen. ed. ian lancashire. –. web development group, university of toronto library. [online]. retrieved on december from: http://www.library.utoronto.ca/utel/ret/ret.html.
renear, allen h. ‘text encoding.’ a companion to digital humanities. eds. susan schreibman, ray siemens and john unsworth. malden: blackwell, . – .
——, elli mylonas, and david durand. ‘refining our notion of what text really is: the problem of overlapping hierarchies.’ research in humanities computing . eds. nancy ide and susan hockey. oxford: oxford up, . – .
reside, doug. ‘a technical framework for publishing electronic editions’. presentation. hastac : grand challenges and global innovations. [online]. retrieved on april from: http://www.ichass.illinois.edu/hastac /hastac_ /presentations/entries/ / / _technical_framework_for_publishing_electronic_editions.html.
richard brome online. gen. ed. richard allen cave. –. royal holloway, university of london, and the humanities research institute, university of sheffield. [online]. retrieved on december from: http://www.hrionline.ac.uk/brome/.
roberts-smith, jennifer, sandra gabriele, stan ruecker, and stéfan sinclair. ‘the text and the line of action: re-conceiving watching the script.’ new knowledge environments . ( ): n. p. [online]. http://journals.uvic.ca/index.php/inke/article/view/ .
robinson, peter.
‘how we have been publishing the wrong way, and how we might publish a better way.’ electronic publishing: politics and pragmatics. ed. gabriel egan. tempe: medieval and renaissance texts and studies, . – .
rutter, tom. ‘repertory studies: an overview.’ shakespeare . ( ): – .
shakespeare performance in asia. eds. peter s. donaldson, alexander c. y. huang. –. mit and global shakespeares video and performance archive. [online]. retrieved on december from: http://web.mit.edu/shakespeare/asia/.
shakespeare quartos archive. dir. stephen c. enniss. –. folger shakespeare library, maryland institute for technology in the humanities, university of oxford, and the british library. [online]. retrieved on december from: http://www.quartos.org/.
shillingsburg, peter l. ‘the impact of computers on the art of scholarly editing.’ electronic publishing: politics and pragmatics. ed. gabriel egan. tempe: medieval and renaissance texts and studies, . – .
shirley, james. the complete works of james shirley [oxford shirley]. gen. eds. eugene giddens, teresa grant and barbara ravelhofer. vols. oxford: oxford up, forthcoming.
siemens, raymond g., et al. ‘the credibility of electronic publishing: a report to the humanities and social sciences federation of canada.’ text technology, . ( ): – .
simulated environment for theatre. dev. jennifer roberts-smith, teresa dobson, sandra gabriele, stan ruecker, stéfan sinclair, annmarie akong, shawn desouza-coelho, marcelo hong, sally fung, andrew macdonald, and omar rodriguez. –. university of alberta, university of british columbia, mcmaster university, york university, and the university of waterloo. [online]. retrieved on december from: http://www.humviz.org/set/.
sperberg-mcqueen, c. m., and claus huitfeldt.
‘goddag: a data structure for overlapping hierarchies.’ digital documents: systems and principles (ddep/poddp ). eds. peter king and ethan v. munson. berlin: springer-verlag, . – .
steggle, matthew. ‘“knowledge will be multiplied”: digital literary studies and early modern literature.’ a companion to digital literary studies. eds. ray siemens and susan schreibman. malden: blackwell, . – .
taylor, gary. ‘c:\wp\file.txt : – – .’ the renaissance text: theory, editing, textuality. ed. andrew murphy. manchester: manchester university press, . – .
——. ‘the renaissance and the end of editing.’ palimpsest: editorial theory in the humanities. eds. george bornstein and ralph g. williams. ann arbor: u of michigan p, . – .
ulrich, john m. ‘tenure, promotion, and textual scholarship at the teaching institution.’ profession ( ): – .
vickers, brian. shakespeare, co-author: a historical study of five collaborative plays. oxford: oxford up, .
waters, lindsay. ‘rescue tenure from the tyranny of the monograph.’ chronicle of higher education april. ( ): b – .
webster, john. the works [cambridge webster]. gen. eds. david gunby and david carnegie. vols. cambridge: cambridge up, –.
wilson, edmund. the fruits of the mla. new york: new york review of books, .
wittern, christian. ‘character encoding.’ a companion to digital literary studies. eds. ray siemens and susan schreibman. malden: blackwell, . – .

edinburgh research explorer

accessing russian culture online

citation for published version: kizhner, i, terras, m, rumyantsev, m, sycheva, k & rudov, i , 'accessing russian culture online: the scope of digitization in museums across russia', digital scholarship in the humanities, vol. , no. , pp. - . https://doi.org/ . /llc/fqy
digital object identifier (doi): . /llc/fqy
document version: publisher's pdf, also known as version of record
published in: digital scholarship in the humanities

accessing russian culture online: the scope of digitization in museums across russia

inna kizhner
siberian federal university, krasnoyarsk, krasnoyarskiy kray, russia

melissa terras
college of arts, humanities, and social sciences, university of edinburgh, edinburgh, midlothian, scotland

maxim rumyantsev, kristina sycheva and ivan rudov
siberian federal university, krasnoyarsk, krasnoyarskiy kray, russia
correspondence: inna kizhner, siberian federal university, svobodny, krasnoyarsk, , russia. e-mail: inna.kizhner@gmail.com

digital scholarship in the humanities © the author(s) . published by oxford university press on behalf of eadh. this is an open access article distributed under the terms of the creative commons attribution non-commercial license (http://creativecommons.org/licenses/by-nc/ . /), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. for commercial re-use, please contact journals.permissions@oup.com. doi: . /llc/fqy

abstract

we compare the scope of museum digitization in the russian federation, a country with diverse cultural heritage and over , museums, with the scope of digitization in europe as measured by the enumerate survey of museums from twenty european countries initiated by the collections trust, uk, in . our article shows that the reach and scope of digitization in russia are smaller than those of european museums. digitization in russia is mainly done for inventory purposes. the share of digitized objects published online is comparable to that in europe if we consider images published on museum websites; however, much content from russia is not licensed as reusable, partly because of the different legal framework that exists there. the article challenges the perception that global heritage collections are becoming more visible and accessible. it shows that future digital analysis of cultural heritage may be possible only with corpora of images provided by museums that publish numerous images from their digital collections online while pursuing policies of free image reuse alongside open licensing. such corpora may not be found beyond a limited number of western collections, which may result in excluding many cultures from humanities research.

introduction

the rate and coverage of digitization throughout europe and the western world are monitored and understood (navarette, ; europeana, ; minerva ec, ). the reach and scope of digitization across russia, a huge country with diverse heritage, are almost unknown. in this article, we build on previous work (kizhner et al., a) by using russian ministry of culture statistics to calculate the percentage of museum collections that have been digitized across russia. we identify countrywide patterns showing that there are large regional variations in the scope of digitization and in the quantity of digital images produced, and that few images are posted online. our analysis clearly demonstrates that, despite numerous local efforts and statewide programmes to build a national aggregator of museum images, outcomes are few, and russian cultural heritage is significantly absent online compared to the average results for european museums. we suggest that studying non-european digitization practices can lead to further understanding of the digital canon upon which analysis of culture is based (limb, ; price, ; earhart, ; warwick et al., ), allowing us to question the biases and online-premium experienced by the cultures which are digitized and made available, either for online viewing or for further open licensing.

analysing the representation of heritage collections in the online medium is the first step to understanding how they contribute to international perceptions of culture in the digital age.
we monitor various characteristics in order to understand the complex status of digitization in russia: the history of digitization in the country, the number of images available in museum databases and online, the licences and legal frameworks that govern any reuse, the importance of multilingual interfaces and metadata, and the differences between digitization in city centre and provincial collections. we discuss russian digitization as an example of complex, bottom-up, unstructured data creation, distinct from western approaches to content reuse, open data, linked data, and repurposing (robinson, ; kizhner et al., b). we show that an incomplete understanding of digitization as a technology and a social force (gooding et al., ) can lead to a lag in undertaking digitization at scale, and we ask how a potential change in digitization practices, inclusive of russian culture and approaches, could broaden the digital canon available to international researchers.

this article provides, for the first time, data on russian digital cultural heritage collections, which are generated from museums scattered across a huge country with diverse collections representing european and national heritage. by using established methods from monitoring european collections, we highlight difficulties, opportunities, and ramifications for online cultural heritage in a wider european context. we clearly demonstrate that future analysis of cultures for humanities research may be biased towards the corpora of digitized images published online and licensed for free reuse, which may have complex ramifications for the study of russian cultural heritage, and beyond.

digital collections in russian museums

historical background

it is never easy to build a single narrative of museum computing (parry, ).
conflicting forces of building inventories, providing access, managing idiosyncrasies of museum descriptions, and introducing standards of machine-readable metadata mean that the field did not develop in a straightforward mode or a single direction (parry, ). however, this article will demonstrate that russian museum computing has been more about building inventories than about developing digital collections that can be accessed as large-scale digital image repositories, or about reusing and extending digital images to provide more advanced digital resources in the humanities, such as digital scholarly editions.

although digitization has a long history in russia, covering the early days of museum computing in the country (sher, ; sher, ; nol, ; mikhailova, ) and the creation of the first russian collection management systems (brakker, ; brakker, ; kamis, ; loshak, ), we do not have a consistent discussion of the current status of digitization of russian cultural heritage within institutional settings.

from the s, the rationale for museum digitization practices in russia was quite similar to that in many other countries, being informed by a need for information and collection management so that museum objects would be catalogued and properly conserved (aseev and sher, ; chenhall and vance, ; williams, ; navarette, ). the synergy (or conflict) of keeping inventories and providing access continued in the late s and early s.
an important initiative to provide access to russian museum collections stems from when the state hermitage museum and international business machines (ibm), a computational industry partner, launched an important collaboration programme. ibm provided a scanner (then a rare and expensive peripheral) as well as software, a web application, design, and user interface design for the museum website (fig. ), which was launched in (ibm, ). the state hermitage museum was unique in developing its digitization programme and publishing collections on its website, as the museum combined the advantages of having dedicated curators to provide metadata, the ability to use high-quality digitization technology provided by a commercial company, and ibm technology to develop its website. the interaction of this major museum with large commercial companies was quite typical of the rise of digitization observed in many countries in the s, when museums benefited from large-scale applications of technologies and companies could experiment and build their reputation on the achievements (terras, ).

the balance between keeping inventory databases and providing access to collections resulted in building the national catalogue of the russian federation (rf) museum collections. russian government policy related to the need to preserve collections from onwards (federal law number -fz, ) was aimed at building the resource (fig. ), first as an offline catalogue for inventory purposes and later as a comprehensive open database posted online (ministry of culture of the russian federation, b). the catalogue is supposed to be completed by , when metadata and images for all objects from the rf museum collections will be included in the registry and posted online (ministry of culture of the russian federation, b).
uploading the data is mandatory for all public museums, and the planning/timeline is supposed to be controlled by the ministry of culture at the federal level for the most important museums (ministry of culture of the russian federation, c), and at the regional level for regional and local museums.

the national catalogue includes three registries. the offline registry of russian public and corporate museums is maintained as a mandatory list, and private museums can be included on a voluntary basis. the second registry is an offline registry of museum objects for managing acquisition and accession and for controlling location and movement. the third registry is the online database mentioned above (fig. ). it was developed for research in the humanities and for the general public. the guidelines available on the website of the national catalogue inform museum professionals that the mandatory data to upload are an image, title (or object type), period, dimensions, accession numbers, classification field from a guideline, property type for a museum object (e.g. federal property), and credit line. this means that the collection management system will not allow the uploading of records without images (ministry of culture of the russian federation, a).

the online catalogue is not yet a comprehensive database, as it only includes images for % of museum objects in the rf museum collections so far. this indicates that, to meet legislative requirements from the rf ministry of culture, a mass programme of digitization will need to happen across russia. consolidated museum activities may result in providing images and metadata to be published in the national catalogue for the total number of museum objects by , but the quality of images and metadata may suffer (pravdina and loshak, ).
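the mandatory fields listed in the catalogue guidelines can be pictured as a simple completeness check. the sketch below is only an illustration in python: the class name, the field names, and the `can_upload` helper are hypothetical and are not taken from the national catalogue's actual software. it assumes that a record is rejected whenever any mandatory field, including the image, is empty.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CatalogueRecord:
    """hypothetical record holding the mandatory fields named in the guidelines."""
    image: Optional[str] = None              # path or url of the object image
    title_or_object_type: Optional[str] = None
    period: Optional[str] = None
    dimensions: Optional[str] = None
    accession_numbers: Optional[str] = None
    classification: Optional[str] = None     # value taken from the classification guideline
    property_type: Optional[str] = None      # e.g. "federal property"
    credit_line: Optional[str] = None


def missing_fields(record: CatalogueRecord) -> List[str]:
    """return the names of mandatory fields that are empty or absent."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def can_upload(record: CatalogueRecord) -> bool:
    """a record is accepted only when every mandatory field is filled in."""
    return not missing_fields(record)


# invented example: full metadata, but no image attached
record = CatalogueRecord(
    title_or_object_type="icon",
    period="17th century",
    dimensions="30 x 25 cm",
    accession_numbers="km-0001",
    classification="painting",
    property_type="federal property",
    credit_line="hypothetical regional museum",
)
print(missing_fields(record))  # ['image']
print(can_upload(record))      # False
```

under this assumption, a museum with complete object descriptions but no photographs cannot contribute records at all, which is why the upload requirement implies a mass digitization programme.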
beyond the rf catalogue, we analysed the representation of russian digital collections through international aggregators of content, but relatively little russian content was available via these mechanisms, given the overall number of objects that these aggregators contain. in – , five russian museums expressed their interest in contributing metadata of objects from their online collections to europeana (brakker, ). between and , these museums submitted metadata for , objects (brakker and kuibyshev, ). metadata for more objects was added between and , and their number was , at the time of writing this article (europeana collections, ). google arts and culture provides access to the images and metadata for , museum objects from russian collections.

fig. the interface developed in included the options of viewing collection highlights and browsing the state hermitage museum’s digital collection. the museum website with a new interface was launched in . courtesy of state hermitage museum
during the course of the digitization of russian museum collections, we have observed dedicated work aimed at providing metadata standards and descriptions (the early years of museum informatics at the state hermitage museum, the development of the first russian collection management system, and the contribution of metadata to europeana collections). we have seen exciting efforts to provide access to russian cultural heritage at the beginning of cultural heritage digitization (the state hermitage museum website). further research is needed to understand the various drivers of digitization in russian history, considering that, despite obvious advances, we observe a low involvement in providing access at national (national catalogue of the rf museum collections) and international (europeana collections) levels. the following sections will demonstrate that access to images and metadata from separate museum websites is low at the moment of writing this article. this means that russian cultural heritage does not have a significant potential to be used for enjoyment, education, and research before , when museum efforts are supposed to be consolidated to provide access to a major part of collections through the national catalogue of the rf museum collections (ministry of culture of the russian federation, b). this is important when we consider how the humanities develop and what collections inform scholarly results and international perceptions.

assessing the spread of digitization across russian museums

methodology

the national catalogue of the rf museum collections (ministry of culture of the russian federation, a) is an initial access point for finding out the scale of museum digitization in various parts of the country, including its remote regions. our previous article (kizhner et al., a) demonstrated preliminary results of a survey estimating the percentage of digital images for russian museum collections. the study also included website exploration results on the percentage of museum collections posted online. however, we asked only . % of museums in the country for the percentage of digitized images and explored % of museums for the images posted online. the results gave initial estimates, indicating that the uptake of digitization for russia is lower than that in europe: % of analogue collections compared to % for european museums (nauta and van den heuvel, , p. ), and that the percentage of images published online is low ( . %) but comparable to that published in europe ( %) (nauta and van den heuvel, ). we studied the scope of digitization across a diverse country with huge cultural and ethnic heritage. the limitation of our study was that, being based on a small sample, it did not look at the quality of collections, the importance of museum objects for humanities research, or the quality of digitized images.

fig. at the time of writing, the national catalogue of the rf museum collections includes images and metadata for , , objects, % of russian analogue museum collections

the present article studies the uptake of digitization in russian museums through the statistical reports (form nk) submitted to the ministry of culture from , museums in . the annual statistical reports are mandatory for all museums reporting to local municipalities, regional administrations, and the rf ministry of culture, in fact for all non-private and non-corporate museums.
from these, we can generate the average results for the country and the average results for its eight major geographical regions. this will show the distribution of digitization activities and content across russia. we aim to contrast the data available with that from the enumerate project, a study of the uptake of digitization across europe between and , funded by the european union (europeana, ). this will allow us to ascertain whether russian digitization efforts are equivalent to those being undertaken elsewhere. we used the data from the enumerate survey of (nauta and van den heuvel, ), including museums from european countries.

we obtained the data of the rf museums’ statistical reports for from the rf ministry of culture in summer , after an enquiry submitted via email by the office of provost, siberian federal university, to the rf ministry of culture. the complete data, received as an aggregated excel spreadsheet for the filled form nk (rf ministry of culture statistics, ), relate to , museums from every region of the rf. to the best of our knowledge, these data have not been previously used to study the scope of digitization, either at a regional or at a national level. we redacted the spreadsheet, removing information which did not relate to the digitization of museum objects or which contained data on galleries that were for temporary display; this data cleaning resulted in , museums. the data in the spreadsheet were analysed to give, for every museum, the total number of objects, the number of database records with digital images, and the number of images posted online. the availability of english interfaces was counted manually at a later stage, as the data on english interfaces were not included in the spreadsheet. the table received included data for over , museums and was too large to be added to this article as an appendix, so we chose to present the results of the analysis.
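the regional figures described above reduce to two ratios per group of museums. the following is a minimal sketch in python, not the procedure actually used for the study: it assumes each spreadsheet row has been reduced to a region name, a total object count, a count of records with digital images, and a count of images posted online. the field names and sample values are hypothetical, and pooling the counts within a region (rather than averaging per-museum percentages) is an assumption about how the averages were formed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MuseumRecord:
    # Hypothetical field names; the real spreadsheet columns are not given in the article.
    region: str
    total_objects: int
    digitized: int  # database records with digital images
    online: int     # images posted online


def regional_percentages(records):
    """Return {region: (pct_digitized, pct_online)}, both relative to total objects."""
    totals = defaultdict(lambda: [0, 0, 0])
    for r in records:
        t = totals[r.region]
        t[0] += r.total_objects
        t[1] += r.digitized
        t[2] += r.online
    result = {}
    for region, (objects, digitized, online) in totals.items():
        if objects == 0:
            continue  # skip regions with no reported objects to avoid dividing by zero
        result[region] = (100 * digitized / objects, 100 * online / objects)
    return result


# Illustrative values only, not data from the study.
sample = [
    MuseumRecord("region A", 1000, 300, 10),
    MuseumRecord("region A", 3000, 500, 30),
    MuseumRecord("region B", 2000, 100, 40),
]
percentages = regional_percentages(sample)
```

pooling weights large museums more heavily than small ones; averaging per-museum percentages instead would give each museum equal weight and can produce noticeably different regional values.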
results

the percentage of digital images as related to the total number of museum objects across russia was %. this is a low uptake compared to the average numbers for europe, as the survey report on digitization in europe for shows % digital images as compared to analogue objects in museum collections (nauta and van den heuvel, ). the scope of digitization varied across geographical regions (fig. , table ), declining relatively steeply in the far east (the lowest scope), volga federal district, and caucasus. the greatest level of museum digitization, which exceeded the european level, was observed in saint petersburg. the scale of digitization across major geographical regions varied between the minimum of % in the far east and the maximum of % in the regions adjacent to saint petersburg (fig. , table ). this means that online scholarly access to, and promotion of, the cultural heritage of russian provinces is going to be more difficult even when (if) images are available online via the national catalogue, because the museum objects necessary to study the cultural heritage of the country have not been digitized.

the survey report on digitization in europe (ibid.) demonstrates the perceptions of museum staff regarding the necessity to digitize museum objects. curators think that % of museum collections have to be digitized.
this means that historical and cultural information has been digitally reproduced for a third of european museum collections, for the same share of collections in saint petersburg, and for a much smaller share of collections in siberia, the russian far east, and volga district, where ethnographic and historical museum repositories are of great interest.

an interesting and unexpected result was the difference between the scale of digitization in two major cities, moscow and saint petersburg. the percentage of analogue objects with digital images was much higher in saint petersburg than the average across russia and much higher than that in moscow. the ibm/hermitage project started in (see above), which triggered digitization activity in the museum community in saint petersburg, may be a partial explanation. in addition, the strong uptake of digitization in this region relates to the interaction of the museum community in saint petersburg and the russian academy of sciences in the s, followed by collaboration with national and international commercial companies, including ibm, at a major scale, followed by kamis: museum collections (see above) working in the region.

fig. the percentage of images in the digital collections (databases) of russian museums as related to the number of analogue objects in a museum (the average value across russia is %). this clearly shows a difference between the advanced regions in the north-west, with the scope of digitization almost reaching the european level of %, and the rest of the country
we can see that digital collections do exist across the country, but their scope varies, and the level of digitization beyond the northwestern federal district is much lower than the average european level of digitization.

it is especially important to understand the combination of digitally reproduced images and the scope of images posted online (fig. , table ). for example, saint petersburg, with the record level of digitization at %, makes only . % of the city’s analogue collections visible online (fig. , table ). the ural federal district, with the level of digitization at %, the second highest in the country, provides digital access to . % of its analogue collections. cultural heritage in this part of the country is the most accessible to online users, while museum collections in siberian federal district are least accessible (fig. , table ). the invisibility of siberian museum collections may result in an inadequate impression of siberian cultural heritage. the question ‘do siberian museums exist as data for the researchers in the humanities’ may indeed be asked in this context.

we can see that digital collections of russian museums mostly exist for inventory purposes. visibility of russian digital collections, consequent access to images for scholarly studies, and introduction of russian cultural heritage to the international cultural discourse depend on the combination of digitally reproduced images and images published online. with numerous international cultural collections available online, a major part of russia’s cultural heritage may be at risk of staying inaccessible for public use and scholarly analysis at national and international levels.

we analysed whether the information on russian digital collections is provided in english.
we compare moscow, saint petersburg, and adjacent regions with the provinces, demonstrating that digital collections for museums in siberia, far east, and the caucasus are least accessible to international online users. as shown in table , museums in moscow, saint petersburg, and adjacent regions in northwestern federal district indeed provide english interfaces. almost a half of museums in moscow provide english interfaces, but only a half of them (sixteen museums of twenty-eight) provide several images of museum objects linked to an english interface. fifteen museums across russia ( . % of the total museum number) provide metadata in english. in moscow, metadata in english is present on the websites of the pushkin state museum of fine arts, the state tretyakov gallery, the polytechnic museum, and moscow kremlin museums.

table. the percentage of the analogue collections digitally reproduced and available online in the museums of saint petersburg, moscow, and across russia
columns: places; the percentage of the analogue collections digitally reproduced as related to the total number of objects, %; the percentage of digital images posted online as related to the total number of analogue objects, %
the average across russia .
saint petersburg .
northwest (northwestern federal district) .
ural federal district .
southern federal district .
centre (central federal district) .
siberian federal district .
moscow .
caucasus (north caucasian federal district) .
volga federal district .
far eastern federal district .
a similar situation of attracting physical visitors and obvious difficulties in accessing online collections is characteristic of museums in saint petersburg. while twenty-five museums in saint petersburg provide english interfaces, only three major museums (the hermitage museum, museum of the history of saint petersburg, and the state russian museum) present metadata in english so that individual museum objects can be retrieved by non-russian-speaking users.

russian museums understand digitization of their collections as the necessary tool of maintaining museum registries for inventory purposes. this is demonstrated by a dramatic difference between the percentage of digitally reproduced images and images posted online, especially in the advanced region of saint petersburg and the northwestern federal district.

fig. the percentage of digital images posted online as related to the total number of analogue objects. the lowest percentage is observed in siberia, far east, and saint petersburg. images of analogue museum objects are underrepresented online even in the case they have been digitized. this shows that digitization is mainly conducted for inventory purposes.

closed collections

‘permissions culture’ (bielstein, ; whalen, ; petri, ; aufderheide et al., ) is a situation in which society expects users to ask for permissions or licences when interacting with visual art in a digital environment. the degree of freedom for this interaction varies in different countries (for example, aufderheide et al., discussing the limitations of ‘fair use’ implementation in the usa and wallace and deazley, for real-life examples from museums in a number of countries). in russia, the ‘permissions culture’ is maintained by the legislation of the rf. this means that museums are supported by federal or local ministries of culture, and they can claim their right to be asked for permissions. the state hermitage museum allows image reuse for student projects, educational handouts, doctoral theses, and presenting research results at conferences. publishing your conference slides online will involve asking the museum for permission, as if it were a research publication or a commercial product for which a permission or licence is required (the state hermitage museum, ). previously, we demonstrated that moving images across platforms and outputs for different research projects, for example to develop scholarship or digital resources in the humanities, may not be possible in russia, as a permission from a museum tends to relate to a single project, and changing its use will require a new licence (kizhner et al., b).

russian museums are not an exception in keeping their collections ‘closed’. a recent study demonstrates that about % of museums in a sample of institutions in english-speaking countries (the usa, the uk, canada, australia, and new zealand) allow image (re)use only on the condition of requesting permissions (esalieva, ). a study of museum reputation (van riel and heijndijk, ) features eighteen famous art museums and relates their rankings to the awareness of their existence. when we manually checked the museum websites for the documents on image policies, we found that two-thirds of the museums do not pursue an open access policy (table ). this shows that russian museums are not the only institutions which prevent their images from being circulated for humanities research or contribution to a new online visual canon (price, ).
however, the complex legal framework within the russian context effectively precludes involvement in the ‘open glam’ movement, where individual institutions within other legal cultural contexts may have a choice whether to engage and prioritize open licensing and online access to digitized content.

limitations

russian museum collections tend to consist of two parts: the main collection of objects and a smaller ‘research collection’, including analogue copies of objects, supporting documentation, museum library books, plans, and maps (ministry of culture of the union of soviet socialist republics, ).

table. accessibility of online museum collections to international users
columns: place; number of museums in the data set; absolute number of museums with english interfaces/metadata in english; english interfaces (% as related to the total number of museums); metadata in english (% as related to the total number of museums)
saint petersburg / . .
north-west (northwestern federal district) / . .
ural federal district / . .
southern federal district / . .
centre (central federal district) / . .
siberian federal district / .
moscow / . .
caucasus (north caucasian federal district) / .
volga federal district / . .
far eastern federal district / .
total across russia , /
the average across russia . .
note: geographical distribution of museums where websites include an english interface and metadata in english as related to the total number of museums in a region.
while the total number of objects in russian museum collections slightly exceeds million objects, the number of original objects (including duplicates) is actually million objects. the aggregated results of the statistical surveys (rf ministry of culture statistics, ) obtained for the study reported the number of digitized objects as related to the total number of objects in a museum, including their ‘research collections’. this did not create a methodological problem when comparing the results with those from the enumerate project, where the survey report on digitization provided the percentage of digital images for museums’ analogue collections (nauta and van den heuvel, , p. ), but the research collection aspect should be borne in mind when looking at the statistics provided here. we cannot tell which objects were digitized in a given museum, or whether museums preferred to include or exclude the ‘research collection’ from the reported data set. if they did exclude the research collection (which is logically justified), the scope of digitization would be higher. if they did not (which is quite feasible, because they may have preferred to report all objects with images), the scope of digitization is equal to that reported in the results section (for the data on the percentage of digitized objects and objects published online as related to the number of original objects, see table ).

another limitation of this study is that we do not consider what digitized content has been ‘cherry-picked’ for online presentation (besser, ), what influences the decision-making of what is being digitized or posted online, and what impact it has on culture perception.
we do not consider the quality of images published online, either, leaving aside the question of how quality (high resolution or effective colour management procedures, for example) influences image perception and contributes to maintaining a balance between keeping images under control and providing access that matches users’ expectations given the current online environment.

discussion

our findings demonstrate that digital collections in russian museums do exist across the country, in both metadata and digitized content, but we cannot say that their online display is representative enough to cover the culture, considering the variety in geography and ethnography. we can roughly confirm our previous results on the percentage of museum objects with corresponding digitized images across the country (kizhner et al., a) to be in the region of %, as our present data show the level of digitization is on average % in each museum. however, our previous results might have a sampling bias, as the museums answering the questions of the survey could be interested in digitization per se and work towards obtaining more financial and administrative support to sustain this activity.

table. a list of eighteen famous museums from a recent study of what influences museum reputation (van riel and heijndijk, ) and their reuse policy types
columns: policy type; museums
open access (commercial reuse allowed) for images in the public domain: metropolitan museum of art, national gallery of art, and rijksmuseum
non-commercial reuse allowed for images in the public domain or where copyright is cleared by a museum: the louvre, british museum, and van gogh museum
personal and educational use, otherwise permitted use only (a fee may apply): state hermitage museum, musée d’orsay, museo del prado
permitted use upon request (a fee may apply): national gallery, vatican museums, tate modern, musée national d’art moderne, reina sofia, and museum of modern art
requests to provide images (no fee is applied): national art centre, japan
no information on policy type: centro cultural banco do brasil, and shanghai museum
note: two-thirds of museums in the study do not pursue open access policy.

comparing our data with those from the enumerate project ‘which aimed to survey the extent of digitization across europe’ (europeana, ), where some survey questions were about the percentage of the analogue collection digitally reproduced (nauta and van den heuvel, , p. ), we can say that the average results of the present study at % are much lower than the results of the enumerate project for , when the percentage of digitized collections in european museums was %. the enumerate project allows comparing data across museums, libraries, and archives, and its survey report demonstrates a higher percentage of analogue objects with digital reproductions for museums compared to libraries at % and archives at % (nauta and van den heuvel, ). we cannot make a similar comparison across sectors to get a full understanding of digitization activities for russian cultural heritage due to the lack of data on russian digital collections in libraries and archives. the results for saint petersburg museum collections are higher than the european average (fig. , table ). the percentage of images available online across russia as related to the analogue collection is . %, which is lower than the percentage reported by the enumerate project ( % of digital collections and . % of european analogue collections). however, the enumerate results included digital collections and digitally born objects available online, which complicates the comparison (europeana, ).
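the two ways of quoting an online share above (as a percentage of the digital collection and as a percentage of the analogue collection) are related by a simple product, provided that every online image reproduces an analogue object. a minimal sketch of that conversion follows; the function name and the sample values are hypothetical and are not figures from either survey. as noted above, born-digital objects break the underlying assumption, which is one reason the comparison is only approximate.

```python
def share_of_analogue_online(pct_digitized, pct_of_digital_online):
    """Convert 'percent of the digital collection online' into
    'percent of the analogue collection online'.

    Assumes every online image corresponds to a digitized analogue object,
    i.e. no born-digital material is counted.
    """
    return pct_digitized * pct_of_digital_online / 100


# Illustrative values only, not figures from either survey.
example = share_of_analogue_online(pct_digitized=40.0, pct_of_digital_online=25.0)
```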
a clear dominance of digital collections in the northwestern part of the country may be partially explained by the existence of a skilled labour pool in this region, the historical links to technical companies, infrastructure, and western influences. the historical influence of museum professionals from saint petersburg, the centre of the northwestern district, including their links to major international and national companies, such as ibm and kamis: museum systems, is also important.

it would indeed be tempting to position the northwestern federal district as an island of digitization efforts. what is strikingly incompatible with this argument is the ratio of images of museum objects posted online. the figure is . % for the northwestern federal district and even lower ( . %) for saint petersburg, almost twice as low as the average across russia at . %. the figure is equal to the percentage of images posted online in the far east (fig. , table ). while the objects are being digitized, those images are not being posted online, contrary to the open data principles that we see being taken up across europe and america (boyle, ; borgman, ; terras, ; european commission, ). a possible explanation could be that major museums in moscow and saint petersburg have huge collections with millions of objects. another explanation might be the aim of attracting visitors to physical museums. this is quite consistent with a high number of websites with english interfaces: museum administrators might want an english interface to attract the international public to a physical museum. the websites with metadata in english are available for some of the most important museums with famous collections featured in printed international sources (the state tretyakov gallery, the state russian museum, and moscow kremlin museums), european paintings from the hermitage museum, and the state museum of fine arts in moscow.

table. the percentage of the analogue collections digitally reproduced and available online in the museums of saint petersburg, moscow, and across russia (for collections without supporting documentation and museum library books)
columns: places; % analogue museum objects with digital images for the main collection (without library books and supporting documentation); % for the digitized objects published online (without library books and supporting documentation)
the average across russia .
saint petersburg .
moscow .
centre (central federal district) .
north-west (northwestern federal district) .
southern federal district .
caucasus (north caucasian federal district) .
volga federal district .
ural federal district .
siberian federal district .
far eastern federal district .

starting from the s, influencing content selection for what can be digitized and included in a database was an issue that significantly affected this early work. the hermitage museum’s senior management was much interested in building a collection management system for the museum’s collection of european paintings (sher, ). their intention to transfer famous works from printed materials to digital collections can be easily explained and understood in terms of promoting the state hermitage museum as an institution that keeps and maintains european core values. another possible explanation of keeping online museum images within a printed canon may be the feeling of control, a concept discussed in the context of licensing images by american museums in the early twenty-first century (kelly, ). the feeling may be quite common all over the world, and russian museums may not be an exception.
challenging ‘permissions culture’ in visual art (bielstein, ) and relying on public domain images to be published without restrictions (petri, ), as happens in several museums across the world (aufderheide et al., ), have been complicated by strong opposition from museum gatekeepers when museums assume that ‘permissions are inevitably required’ (aufderheide et al., , p. ). russian museums are supported in these assumptions by the rf legislation (kizhner et al., b).

the national catalogue of the rf museum collections is supposed to include records with images from all museum collections in the rf except private museums by (ministry of culture of the russian federation, b). we can only hope that the catalogue can meet its planned target figures within a reasonable period. if it does so and if russian digital policies change to allow openly licensed content and content repurposing, then russian cultural heritage will be accessible to a wider national and international user base. if it does not, then russian cultural heritage will not have adequate representation in online cultural heritage resources, and this could lead to insufficient knowledge about the country’s cultural heritage on a global scale in an age when countries compete for better visibility through digital media.

conclusion

our novel contribution is in comparing the scope of museum digitization in russia with the scale of digitization in europe (using nauta and van den heuvel, , as an example). our findings clearly demonstrate that the scope of digitization is lower than in europe: the number of images posted online does not contribute to building a clear picture of russian cultural heritage, and the information on russian museum collections is not accessible to the international audience, as few museums publish metadata in english or have english interfaces beyond a few famous museums.
this is the case despite important historical developments and significant initiatives in museum computing scattered across the country. our results challenge the perception of museum collections across the world as ‘visible and easily accessible’ (salamon-cindori et al., ). increased access at a european level, prevented only by technical or copyright issues (taylor and gibson, ), does not mean that such access has been achieved worldwide. although much is known about a group of museums with a large share of their collections published online (aufderheide et al., ) or european museums that have digital collections (nauta and van den heuvel, ), further research is needed to find out the share of museums at an international scale that are indeed able to contribute to disseminating the information on cultural heritage through their digital platforms.

if non-western collections continue to stay invisible and inaccessible, building an art historical corpus (drucker, ) and applying ‘data science’ to visual analysis in art history (manovich, ) will be restricted to western museum data. further steps of data simulation, dimension reduction, and extracting new, unexpected dimensions from large sets of visual data (manovich, ) will be limited by accessible data sets, and the analysis will be biased towards the represented heritage characteristics of western culture. the sheer magnitude of digitization efforts in creating open archives, a road taken in europe and elsewhere, demands intertwining digitization efforts and research on artistic canon evolution in a digital era.
eventually, the cultural biases of the twentieth century that are rooted in the colonial and political attitude of the nineteenth century (said, ) will be substituted by the attitudes of the generations from the twenty-first century. harnessing the culture of remix (lessig, ) and introducing careful attitudes to what is used and reused to build a new perception of culture suggest that further research is needed on how a future digital canon is created or how it may differ from printed publications. who decides what is being digitized, posted online, easily retrieved, and linked to further knowledge is an important research question to arm further studies. indeed, it would be useful to carry out equivalent studies comparing the results of the enumerate study to museum digitization activity in other geographical areas, to be able to assess the predicted dominance of european and north american digital culture online.

this article presents the first view on the state of russian digital collections on national and regional scales, reporting on the scale of digitization for major geographical regions within russia. by doing so, we can challenge the concept of the digital canon and claim that the printed canon should be essentially extended within the digital space. our research supports recent criticism of digitization that is not accompanied by thematic context strong enough to generate added knowledge in the humanities (hitchcock, ; gregory et al., ). in the russian context, the delay of digitization and online publishing may be exploited to build a network of historically meaningful context that gradually introduces masterpieces and artworks from a variety of regional/social contexts and links them together.
national programmes are needed to introduce recommendations on how russian museum websites and/or the national catalogue of the rf museum collections should host images for searching and browsing to provide infrastructure that can assist humanities research, and what the ramifications of not meeting the deadlines for pro- viding a russian-wide catalogue of museum objects will be, given no mass digitization programme exists, or is resourced, there. future research may be also needed to find out the scope and reach of digitization in the library and archive sector in the rf to further understand how the national cultural heritage may be accessed by a wider audience. the task of building inventory databases to get rid of the burden of clerical chores may be just an initial step towards reaching significant economic, social, and cultural impact (drucker, , gooding et al., ). only by extending the scope and reach of digitization of cultural and heritage collections in russia, can they become accessible to both national and international audiences. acknowledgements the authors would like to thank the ministry of culture of the russian federation for providing the data on russian museum collections for ana- lysis. the authors are also grateful to itzhak benenson, nadezhda brakker, margarita kovaleva, xenia pushnitskaya, and jakob sher for valuable discussions. references aseev, y. and sher, j. ( ). preface from the editors of the russian edition. in chenhall, r. (ed.), museum cataloging in the computing age. moscow: mir, pp. – . i. kizhner et al. of digital scholarship in the humanities, d ow nloaded from https://academ ic.oup.com /dsh/advance-article-abstract/doi/ . /llc/fqy / by guest on o ctober deleted text: deleted text: , deleted text: - deleted text: s deleted text: paper deleted text: , deleted text: programs deleted text: deleted text: program deleted text: russian federation al-eroud, a. f., al-ramahi, m. a., al-kabi, m. n., alsmadi, i. m., and al-shawakfa, e. 
m. ( ). evaluating google queries based on language preferences. journal of information science, ( ): – .
ammon, u. (ed.). ( ). the dominance of english as a language of science: effects on other languages and language communities. berlin: mouton de gruyter.
aufderheide, p., milosevic, t., and bello, b. ( ). the impact of copyright permissions culture on the us visual arts community: the consequences of fear of fair use. new media and society, ( ): – .
besser, h. ( ). the changing role of photographic collections with the advent of digitization. in katherine, j. g. (ed.), the wired museum. washington: american association of museums, pp. – .
bielstein, s. ( ). permissions, a survival guide: blunt talk about art as intellectual property. chicago, il: the university of chicago press.
brakker, n. ( ). russian cultural institutions in athena project. paper presented at eva moscow, november– december , moscow (in russian). http://conf.evarussia.ru/eva /rus/reports/report_ .html (accessed may ).
brakker, n. ( ). how to join the projects of europeana group. paper presented at automation development in museums and information technology (adit) conference, - may , khanty-mansiysk (in russian).
brakker, n. ( ). europeana collections, may [email] (in russian).
brakker, n. and kuibyshev, l. ( ). europeana and russian cultural institutions (in russian). ifla journal, ( ): – .
borgman, c. ( ). big data, little data, no data. cambridge, ma: mit press.
boyle, j. ( ). the public domain, enclosing the commons of the mind. new haven, ct: yale university press.
chenhall, r. and vance, d. ( ). the world of (almost) unique objects. in parry, r. (ed.), museums in a digital age. london; new york: routledge, pp. – .
crystal, d. ( ). english as a global language. cambridge university press.
drucker, j. ( ). is there a ‘digital’ art history? visual resources, ( – ): – .
drucker, p. ( ). the manager and the moron. mckinsey quarterly, december , https://www.mckinsey.com/business-functions/organization/our-insights/the-manager-and-the-moron (accessed december ).
earhart, a. ( ). can information be unfettered? race and the new digital humanities canon. in debates in the digital humanities. minneapolis, mn: university of minnesota, pp. – .
esalieva, s. ( ). disseminating digital copies of images from museum collections: legal restrictions. bachelor’s dissertation, siberian federal university.
european commission. ( ). the european cloud initiative. digital single market. https://ec.europa.eu/digital-single-market/en/% european-cloud-initiative (accessed may ).
europeana. ( ). enumerate. https://pro.europeana.eu/tags/enumerate (accessed december ).
europeana collections. ( ). https://www.europeana.eu/portal/en (accessed december ).
europeana collections. ( ). providing country: russian federation. https://www.europeana.eu/portal/en/search?f% bcountry% d% b% d=russia&q= (accessed december ).
federal law number -fz. ( ). on the rf museum collections and museums in the rf, passed april, (in russian).
franzini, g., mahony, s., and terras, m. ( ). a catalogue of digital editions. in driscoll, m. j. and pierazzo, e. (eds), scholarly digital editions: theory, practice and future perspectives. cambridge: open book publishers.
gregory, i., atkinson, p., hardie, a., joulain-jay, a., kershaw, d., porter, c., rayson, p., and rupp, c. j. ( ). from digital resources to historical scholarship with the british library th century newspaper collection. journal of siberian federal university. humanities and social sciences, ( ): – .
gooding, p., terras, m., and warwick, c. ( ). the myth of the new: mass digitization, distant reading and the future of the book. literary and linguistic computing, ( ): – .
hitchcock, t. ( ). confronting the digital or how academic history writing lost the plot. cultural and social history, : – .
ibm. ( ). hermitage museum project.
https://www.research.ibm.com/haifa/projects/software/hermitage/ (accessed december ).
kamis. ( ). about kamis. http://www.kamis.ru/o-kompanii/istoriya/ (accessed december ) (in russian).
kelly, k. ( ). images of works of art in museum collections: the experience of open access. prepared for the andrew w. mellon foundation. http://archiv.ub.uni-heidelberg.de/artdok/ / /kelly_images_of_works_of_art_in_museum_collections_ .pdf (accessed december ).
kizhner, i., terras, m., and rumyantsev, m. ( a). museum digitization practices across russia: survey and web site exploration results. in digital humanities : conference abstracts. jagiellonian university & pedagogical university, kraków, pp. – .
kizhner, i., stankevich, j., rumyantsev, m., makarchuk, i.
( b). licensing images from russian museums for an academic project. journal of siberian federal university. humanities and social sciences, ( ), – . http://elib.sfu-kras.ru/bitstream/handle/ / / _kizhner.pdf?sequence= (accessed may ).
lessig, l. ( ). remix: making art and commerce thrive in the hybrid economy. new york: penguin press.
limb, p. ( ). the politics of digital ‘reform and revolution’: towards mainstreaming and african control of african digitisation. innovation, .
loshak, y. ( ). kamis and museum digitization, may , [email] (in russian).
mcgann, j. ( ). information technology and the troubled humanities. in terras, m., nyhan, j., and vanhoutte, e. (eds), defining digital humanities. a reader. farnham: ashgate publishing limited.
manovich, l. ( ). data science and digital art history. international journal for digital art history, , pp. – .
manovich, l. ( ). cultural analytics, social computing and digital humanities. in schafer, m. t. and van es, k. (eds), the datafied society: studying culture through data. amsterdam: amsterdam university press, pp. – .
mikhailova, a. ( ). collection management computerization in the uk and russia: a comparative history. ma thesis, school of museum studies, university of leicester.
minerva ec. ( ). http://www.minervaeurope.org/home.htm
ministry of culture of the russian federation. ( ). the national catalogue of the rf museum collections. http://goskatalog.ru/portal/#/ (accessed december ).
ministry of culture of the russian federation. ( b). corporate museums will be included in the catalogue of the rf museum collections. ministry news of january (in russian).
ministry of culture of the russian federation. ( c). on timelines for registering museum objects in the national catalogue of the rf museum collections, february, . http://goskatalog.ru/portal/#/for-museums/docs?id= (accessed december ).
ministry of culture of the union of soviet socialist republics. ( ).
guidelines on documenting and care of museum collections from the public museums of the union of soviet socialist republics, directive no. of . . . https://museumlaw.ru/ .html (accessed march ).
nauta, j. g. and van den heuvel, w. ( ). survey report on digitization in european cultural institutions . europeana pro. http://pro.europeana.eu/enumerate/statistics/results (accessed may ).
navarette, t. ( ). a history of digitization: dutch museums. ph.d. thesis, university of amsterdam. http://catalogus.boekman.nl/pub/p - .pdf (accessed december ).
nol, l. ( ). information technology in museums. textbook for museum studies programmes. moscow: russian state university for the humanities press (in russian).
parry, r. ( ). recoding the museum: digital heritage and the technologies of change. london: routledge.
petri, g. ( ). the public domain vs the museum: the limits of copyright and reproductions of two-dimensional works. journal of conservation and museum studies, ( ). http://www.jcms-journal.com/articles/ . /jcms. /print/ (accessed may ).
pravdina, m. and loshak, y. ( ). collection management systems support museum development (in russian).
price, k. m. ( ). digital scholarship, economics, and the american literary canon. literature compass, ( ): – .
rf ministry of culture statistics. ( ). forms for - statistical surveys. http://mkstat.ru/forms/ (accessed march ) (in russian).
robinson, p. ( ). five desiderata for scholarly editions in digital form. in digital humanities : conference abstracts. university of nebraska-lincoln, - july .
said, e. w. ( ). culture and imperialism. new york: random house.
salamon-cindori, b., tot, m., and zivkovic, d. ( ). digitization: challenges for croatian museums. qualitative and quantitative methods in libraries, : – .
sher, j. ( ). the use of computers in museums: present situation and problems. museum, , – . http://unesdoc.unesco.org/images/ / / eo.pdf (accessed december ).
sher, j. ( ). department of museum informatics at the hermitage museum ( - ). information technology for museums, no , saint petersburg. http://kronk.spb.ru/library/sher-yaa- .htm (accessed december ) (in russian).
taylor, j. and gibson, l. k. ( ). digitization, digital interaction and social media: embedded barriers to democratic heritage.
international journal of heritage studies, ( ), – . http://www.tandfonline.com/doi/full/ . / . . (accessed may ).
terras, m. ( ). the rise of digitization. in rikowski, r. (ed.), digitization perspectives. rotterdam, boston, taipei: sense publishers, pp. – .
terras, m. ( ). opening access to collections: the making and using of open digitized cultural content. online information review, ( ): – .
the state hermitage museum. ( ). image use policy. the state hermitage museum, https://www.hermitagemuseum.org/wps/portal/hermitage/about/image_usage_policy?lng=en (accessed december ).
van riel, c. and heijndijk, p. ( ). why people love art museums: a reputation study about the most famous museums among visitors in countries. rotterdam school of management, erasmus university.
wallace, a. and deazley, r. ( ). display at your own risk: an experimental exhibition of digital cultural heritage. http://displayatyourownrisk.org/publications/ (accessed december ).
warwick, c., terras, m., and nyhan, j. ( ). digital humanities in practice. london: facet publishing in association with ucl digital humanities centre.
whalen, m. ( ). what’s wrong with this picture? an examination of art historians’ attitudes about electronic publishing opportunities and the consequences of their continuing love affair with print. art documentation: bulletin of the art libraries society of north america, ( ): – .
williams, d. ( ). a brief history of museum computerization. in parry, r. (ed.), museums in a digital age. london, new york: routledge, pp. – .

notes
a complicated task that has been rarely achieved for textual materials and requires sophisticated training in editing skills and knowledge of the history of the book (mcgann, ). a recent study shows that there are only about digital scholarly editions worldwide (franzini et al., ).
https://www.hermitagemuseum.org/wps/portal/hermitage/
https://www.ibm.com/us-en/
at the time of writing, the catalogue is available in russian at http://goskatalog.ru/portal/#/
at the time of writing, there are million objects in europeana collections (europeana collections, ).
the state tretyakov gallery https://www.tretyakovgallery.ru/en/, saratov state museum of fine art http://artkatalog.radmuseumart.ru/en/, rybinsk museum (near yaroslavl) http://www.rybmuseum.ru/en/, chuvash state museum of fine art http://www.artmuseum.ru/museumexpo/, and kazan university museum http://kpfu.ru/eng/about-the-university/museums-and-library/the-museum-of-history-of-kazan-university/exhibition-halls. it should be noted that four museums on the list provide interfaces in the english language and are obviously interested in visibility/access to their collections at an international level.
https://www.google.com/culturalinstitute/beta/?hl=ru google arts and culture is a digital collection of museum objects initiated by google and launched in as an online platform to provide access to high-resolution images of artworks.
the rf ministry of culture introduced national statistics related to museums (form nk) in . form nk for – is available on the website of the rf ministry of culture statistics (rf ministry of culture statistics, ). the form includes thirty-six fields, and the data are annually submitted to the rf ministry of culture. the fields cover the information on the type of museum (public or private), the type of museum object property (federal, regional, or municipal), the number of objects exhibited in the museum space, the number of objects that can be physically accessed by the blind and visually impaired, the number of museum objects requiring conservation, the number of objects cleaned, repaired, and stabilized in the reported year, the number of museums with electronic inventories, the number of museums with internet access, etc.
english has long been considered a global language (crystal, ) or ‘today’s dominant language of science’ (ammon, , p. v).
there is some evidence supporting the claim that search engines favour pages in english giving them a priority in rankings (al-eroud et al., ).
http://www.arts-museum.ru/?lang=en
https://www.tretyakovgallery.ru/en/
https://polymus.ru/eng/
http://www.kreml.ru/en-us/museums-moscow-kremlin/
federal law number -fz, may on museums and museum collections in the rf, amended in , , , , , , , and . article number states that copying museum products is impossible without a written permission from museum administration. the second law regulating, in particular, image reuse is ‘basic legislation of the rf on culture’ number - , october , amended in . article number states that companies and public institutions can use the images of cultural heritage objects only with the permission of an object owner. because the owner is either the rf or a region within the rf in the case of public museums, the owners’ rights are looked after by either federal or regional ministries of culture (federal law number -fz, may , article number ).
https://openglam.org
of course, major british and us galleries, libraries, archives, and museums do not provide interfaces in languages other than english. see, for example, the website of the metropolitan museum https://www.metmuseum.org or tate britain http://www.tate.org.uk/visit/tate-britain
federal law number -fz, may on museums and museum collections in the rf, amended in , , , , , , , and .

federated geospatial data discovery for canada - geodisy
eugene barsky and evan thornberry, ubc
january

the good people of data...
● amber leahey, data services metadata librarian, scholars portal
● marcel fortin, head, map and data library, university of toronto
● jason brodeur, associate director, digital scholarship services, mcmaster university library
● jason hlady, manager, research computing, university of saskatchewan
● eugene barsky, research data librarian, university of british columbia
● paul lesack, gis/data analyst, university of british columbia library
● evan thornberry, gis librarian, university of british columbia library
● mark goodwin, metadata coordinator, university of british columbia library
● paul dante, software developer, university of british columbia library
● lee wilson, service manager, portage

outline
● general overview of the project
● including:
○ the problem
○ our suggested solution
○ steps
○ timelines
○ your feedback

the problem
● how do i find, for example, the migration paths of humpback whales, the distribution of maple-syrup yields, infrared satellite imagery, distribution of artifacts in an archaeological site or the flow routes of water due to sea level rise?
● text-based searches don’t always work well with spatial data
● location is a key

the problem
● most repositories lack a map-based interface
● how do i find data about mining in northern bc?

potential solution
extend existing software to find and display research data in a search interface which is both map and text based, combining research data with the functionality of a product such as google maps.

steps: step 1
software will query the canadian dataverse repositories (scholars portal, ubc, uofa, dal, unb, uofm) to determine if geospatial information is present within the digital object (e.g., a study, or a data deposit)

steps: step 2
● software will harvest any geospatial metadata in the primary record (e.g., main record page)
● more importantly, the software harvester will query and harvest any geospatially relevant file objects (satellite imagery, geospatial vector files, etc.)

steps: step 3
once the data have been harvested, the software will create and normalize relevant geospatial data from (a) the primary record and (b) any associated digital objects, and extract all relevant metadata

steps: step 4
the extracted, cleaned and normalized (iso ) data is deposited by the software pipeline into a geospatial data server, such as geoserver, capable of distributing geospatial data in a wide variety of formats to various services

steps: step 5
● data will then be harvested by a geospatial search interface such as geoblacklight, an open source geospatial search tool
● the user interface will be customized to the needs of the federated research data repository project (frdr), providing a unified map-based search interface for research data in canada.

suggested solution

potential partners and synergies
● big ten academic alliance - geospatial data discovery project - https://geo.btaa.org/
● canadian historical geographic information systems partnership (chgis) - http://geohist.ca/
● and hopefully some of you!

fair principles
● enhance findability and accessibility of geospatial research data in canada - metadata clean-up and crosswalks, guidance, and tools for the description of research data (e.g. dublin core, ddi, iso )
● interoperability - open geospatial metadata exchange and open apis
● re-usability - open, fit-for-purpose interface for discovering and exploring geospatial data

national services
● dataverse - we are working with our dataverse north partners, a potential national repository solution
● frdr - we are working with our portage and compute canada colleagues - incorporating an open source geoblacklight application into frdr as another search option, a map-based search

questions?
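
The five steps above (query the repositories, harvest geospatial metadata, normalize it, deposit it, and expose it through a combined map and text search) can be illustrated with a minimal Python sketch. It is not Geodisy's implementation and it does not call Dataverse, GeoServer, or GeoBlacklight: the record layout, the `bbox` field names (`west`, `south`, `east`, `north`), and the helpers `normalize_bbox` and `search` are hypothetical, chosen only for illustration. The sketch assumes bounding boxes in decimal degrees and leaves out boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BBox:
    """Bounding box in decimal degrees (an assumption of this sketch)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.east < other.west
            or other.east < self.west
            or self.north < other.south
            or other.north < self.south
        )


def normalize_bbox(raw: object) -> Optional[BBox]:
    """Turn harvested metadata into a BBox, or None if it is unusable."""
    try:
        west, south, east, north = (
            float(raw[key]) for key in ("west", "south", "east", "north")
        )
    except (KeyError, TypeError, ValueError):
        return None  # no usable geospatial information in this record
    if not (-180 <= west <= east <= 180):
        return None  # out of range, or antimeridian-crossing (not handled here)
    if not (-90 <= south <= north <= 90):
        return None
    return BBox(west, south, east, north)


def search(records: list, area: BBox, text: str = "") -> list:
    """Return titles whose box overlaps `area` and whose title contains `text`."""
    hits = []
    for record in records:
        box = normalize_bbox(record.get("bbox"))
        title = record.get("title", "")
        if box is not None and box.intersects(area) and text.lower() in title.lower():
            hits.append(title)
    return hits


if __name__ == "__main__":
    # Hypothetical harvested records; the coordinates are rough illustrative values.
    records = [
        {"title": "Mining claims in northern BC",
         "bbox": {"west": "-135", "south": "54", "east": "-120", "north": "60"}},
        {"title": "Maple syrup yields",
         "bbox": {"west": -80, "south": 43, "east": -64, "north": 49}},
        {"title": "Interview transcripts"},  # no geospatial metadata at all
    ]
    northern_bc = BBox(west=-139.06, south=54.0, east=-120.0, north=60.0)
    print(search(records, northern_bc, "mining"))
```

Records without a usable bounding box are skipped, which mirrors the first step's check for whether geospatial information is present at all. A production pipeline would presumably hand the normalized boxes to a spatial index rather than scan a list.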
palgrave communications – connecting research in the humanities, social sciences and business
editorial
received dec | accepted dec | published jan
iain hrynaszkiewicz and michele acuto

here we introduce the first edition of palgrave communications, a high-quality peer-reviewed open access journal for research in all areas of the humanities, the social sciences (hss) and business. the scope of the journal, the first multidisciplinary title from palgrave macmillan, reflects the publisher’s strengths in these areas. in addition to our multidisciplinary position, palgrave communications particularly welcomes interdisciplinary research, which fosters interaction, creativity and reflection between disciplines. palgrave communications aspires to be the definitive peer-reviewed outlet for open access academic research in and between our subjects. we discuss the need for a journal like palgrave communications in academic research, our editorial standards, our aims and scope, and present the journal’s first articles.

championing interdisciplinary research
global problems do not come in neat packages: the stresses of transnational migrations present questions for international lawyers, transport experts and conflict analysts alike, and the impacts of water scarcity equally call on civil engineers, anthropologists and natural hazard specialists. the disciplinary boundaries that are cemented in the academic world are regularly questioned by the “real world”. while not necessarily antidisciplinary, interdisciplinary research is today crucial in helping to solve global social, environmental and economic problems.
yet traditional academic research assessment practices can incentivize approaches to research that lack the interdisciplinary flexibility to engage with challenges like migration, water scarcity and many others. this is exemplified, for instance, by the greater value and attention commonly placed on publications in discipline-specific journals or on particular orders of authorship on a paper, rather than on a paper’s content and the real-world value of its arguments.

nevertheless, the orientation of some research might be changing. there are today many emerging examples of interdisciplinary research. professor nikolas rose of king’s college london, a member of our editorial board, launched the urban brain lab (http://www.kcl.ac.uk/sspp/departments/sshm/research/research-groups/biomedicine-ethics-and-social-justice/besj-projects/urban-brain-lab.aspx), which looks at the relations between sociological and neurobiological sciences, with a focus on mental health. likewise, the university of oxford’s bioproperty project (http://www.bioproperty.ox.ac.uk), led by dr javier lezaun, has been working across property rights, biomedical research and science and technology studies to unpack the dynamics of tropical disease, human/animal interactions and medical patenting.

indeed, “interdisciplinary” has been an increasingly common buzzword for academia and the broader ecosystem of science-policy institutions. governmental research councils such as the esrc and the epsrc in the united kingdom push towards greater “exploratory” methods and “cross-domain” interactions to integrate various modes of scholarly enquiry. international and regional institutions linking universities and inter-state cooperation bodies, like the european research council, promote grant funding and research initiatives based on collaborative modes of engagement between disciplines and subdisciplines.
the challenges of interdisciplinary research are, nonetheless, momentous and certainly more and more pressing as this demand grows. typically, interdisciplinary research is confronted by a challenge of effective and productive communication. even in the twenty-first century, academic disciplines remain siloed into relatively different linguistic styles and terminologies, presenting substantial barriers to direct and productive cross-disciplinary discussions. further, there are pressures from resource scarcity and changing funding models for higher education, along with the continued push for publishing in “high-impact” journals. hybridization of academic research with policy and corporate research can also result in quick and superficial interdisciplinary collaborations mostly aimed at attracting funding rather than exploring true and long-term innovation. practically, we need stable, innovative and courageous steps towards developing a more systematic and widely recognizable interdisciplinary agenda.

as well as being a multi-disciplinary journal, palgrave communications is seeking to offer a space for more in-depth and professionalized interdisciplinarity to flourish. the journal offers a venue for different scholarly arenas to connect. developing truly collaborative research takes time (something that can have little appreciation in funding and policy demands) and dialogue, but is something we hope palgrave communications can help with. journals and publishing are just part of the answer, of course, and education should follow suit. the more extensive comment by jacobs, in this first edition of palgrave communications, further discusses interdisciplinary research trends in higher education (jacobs, ). jacobs’ article is the first in a series of articles that will explore the need and drive for interdisciplinary research from different perspectives.

doi: . /palcomms. . open
head of data and hss publishing, open research, nature publishing group & palgrave macmillan, the macmillan campus, trematon walk, wharfdale road, london, n fn, uk
department of science, technology, engineering and public policy (steapp), university college london, boston house, – fitzroy square, london, w t ey, uk (e-mail: m.acuto@ucl.ac.uk)
correspondence: (e-mail: iain.hrynaszkiewicz@nature.com)
palgrave communications | : | doi: . /palcomms. . | www.palgrave-journals.com/palcomms

a multidisciplinary open access journal with impact
palgrave communications provides immediate, free online access to and dissemination of all articles, which are published under a creative commons attribution licence (cc by) by default (with other licences available on request). to provide immediate open access to all articles without charging readers a subscription, authors of accepted papers are asked to pay an article-processing charge (apc; http://www.palgrave-journals.com/palcomms/about/openaccess). costs are involved in all stages of the publication process and the apc includes coordination of peer review, typesetting, web hosting, copy editing, production, archiving and promotion of content. the apc is a flat, one-off charge and authors are not faced with additional charges for longer articles or particular numbers of pages, tables or figures.
while we focus here on the journal’s content, scope and goals, we cannot ignore the ongoing debate about the role of open access journals in hss and business. access to funding for apcs, and who pays these in the long term, is still to be determined in hss and we will undoubtedly be part of the ongoing debate.

palgrave communications is not the first peer-reviewed, fully open access journal for researchers in hss and business. with our multidisciplinary scope we may be compared with open access “megajournals”, a phrase established in the s with the rise of broad scope open access journals that judge research on methodological rigour but ignore impact and importance of works (of which there are now more than ; solomon, ). palgrave communications differs from most of these journals as our criteria for publication require peer reviewers to assess novelty and importance of works, as well as checking they are methodologically sound (http://www.palgrave-journals.com/palcomms/referees). in general, to be acceptable, a paper should represent an advance in understanding likely to influence thinking in the field.

this publication policy and our scope are a response to the opinions of hss researchers, which suggested that a high-quality open access journal was missing from the literature. a survey by palgrave macmillan of hss researchers found the majority ( %) of respondents ( ) would publish in open access journals if open access were offered by the best or most appropriate journal (npg, a). this perceived lack of a high-quality open access option is echoed in the large ( , responses) european commission study of open access publishing (dallmeier-tiessen et al., ). now, with the publication of the first articles, palgrave communications will likely be judged on the quality of its content and its relevance to its audience (npg, b).

being born a digital, as well as open access, journal provides numerous opportunities.
palgrave communications is committed to providing an efficient service for authors, reviewers and readers. an online peer review and manuscript submission system, together with the support of a large and diverse editorial board, enables us to make rapid and fair publication decisions. we use continuous online publication to promptly disseminate accepted papers—to palgrave macmillan’s wide readership and beyond. published manuscripts are enhanced by innovative web technologies, including a modern article template for reading on a variety of devices and platforms. rich information about the readership, reuse and discussion of each article is provided—article-level metrics. this enables readers and authors to rapidly assess who is talking about research online, where and in what fora, as well as measuring citation impact. the journal’s use of this technology reflects a broader movement in both research assessment at research institutions and in publishing to assess the impact of research at the individual—articles and authors—rather than journal level (neylon and wu, ). studies in various academic disciplines, including in economics (wohlrabe and birkmeier, ), have shown that open access articles may be more highly cited than similar articles only available to subscribers (for a bibliography of studies, see http://opcit.eprints.org/oacitation-biblio.html).
beyond open access
the digital format of palgrave communications offers a chance to help tackle other problems in research communication, such as the reproducibility of results. the journal has strong editorial and ethical policies which include sharing of research data and materials as a condition of publication. we also strongly encourage data citation—an important driver of cultural change in scholarly research to give more credit for transparency and reproducibility (hahnel, ).
the need to increase reproducibility applies to all areas of research—and has been hotly debated in the social (miguel et al., ) and political (http://datacommunity.icpsr.umich.edu/da-rt-workshop) sciences in the past year. we believe the journal is well placed to promote and facilitate digital scholarship more generally—which is as important in the humanities as it is in the sciences. lack of recognition of digital publications, from data to articles, has been seen as a barrier to recognition and adoption of digital approaches in the humanities (holm et al., ). also in support of transparency and recognition in digital scholarship, palgrave communications does not consider advance sharing of abstracts and preprints to compromise novelty. by establishing ourselves as a credible peer-reviewed outlet for open access academic research, in and between our subjects, we hope to contribute to a shift in perceptions about digital scholarship—as well as online only, open access journals.
our first articles
our first articles include articles from the research fields of development and international political economy (shaw, ), literature (bennett, ), political science and international studies (tsang, ) and operational research (spyridonis et al., ). we have received a wide variety of submissions, from the majority of disciplines within our scope. we will also, in the coming months, be announcing several article collections (known as special issues, in traditional publications) in . these will focus in detail on new developments in a specific topic in a discipline within our scope, as we seek to grow our presence and relevance to the wide range of research communities we serve. palgrave communications is open to all theoretical and methodological perspectives and we welcome proposals for article collections and article presubmission enquiries by e-mail to palcomms@palgrave.com.
references
bennett m ( ) theatrical names and reference.
palgrave communications; , article number: .
dallmeier-tiessen s et al ( ) highlights from the soap project survey. what scientists think about open access publishing, http://arxiv.org/abs/ . .
hahnel m ( ) referencing: the reuse factor. nature; ( ): .
holm p, jarrick a and scott d ( ) the digital humanities. in: humanities world report . palgrave macmillan: pp – .
jacobs wj ( ) interdisciplinary trends in higher education. palgrave communications; , article number: .
miguel e et al ( ) social science: promoting transparency in social science research. science; ( ): – .
neylon c and wu s ( ) article-level metrics and the evolution of scientific impact. plos biology; ( ): e .
npg ( a) npg—open access survey raw data. figshare. http://dx.doi.org/ . /m .figshare. .
npg ( b) author insights . figshare. http://dx.doi.org/ . /m .figshare. .
shaw t ( ) from post-brics’ decade to post- : insights from global governance & comparative regionalisms. palgrave communications; , article number: .
solomon d j ( ) a survey of authors publishing in four megajournals. peerj; : e .
spyridonis f et al ( ) a study on the current state-of-the-art of e-infrastructures uptake in africa: current landscape and future prospects for e-infrastructure development. palgrave communications; , article number: .
tsang s ( ) the xian incident and the start of the sino-japanese war.
palgrave communications; , article number: .
wohlrabe k and birkmeier d ( ) do open access articles in economics have a citation advantage? http://mpra.ub.uni-muenchen.de/ / /mpra_paper_ .pdf.
acknowledgements
some of the text of this editorial was previously published on a blog, co-written by michele acuto and first published in may (http://blogs.lse.ac.uk/impactofsocialsciences/ / / /finding-a-home-for-interdisciplinary-research/).
additional information
competing interests: iain hrynaszkiewicz is employed by nature publishing group/palgrave macmillan, which publishes palgrave communications.
reprints and permission information is available at http://www.palgrave-journals.com/pal/authors/rights_and_permissions.html
how to cite this article: hrynaszkiewicz i and acuto m ( ) palgrave communications – connecting research in the humanities, social sciences and business. palgrave communications : doi: . /palcomms. . .
this work is licensed under a creative commons attribution . international license. the images or other third party material in this article are included in the article’s creative commons license, unless indicated otherwise in the credit line; if the material is not included under the creative commons license, users will need to obtain permission from the license holder to reproduce the material. to view a copy of this license, visit http://creativecommons.org/licenses/by/ . /
where are we now? delivering content in academic libraries
changes in scholarly communications and new business models are presenting academic libraries with challenges and opportunities which impact the way they approach the delivery of content. libraries are re-evaluating internal processes and structures, enhancing and developing the skills within their teams, and embracing new possibilities for strengthening and enhancing partnerships with publishers and the academic community. this will not only enable them to manage in the current unstable and unpredictable environment, but empower them to influence and drive forward changes in the development of content and models.
under pressure
the transformation of academic content presents both a direct challenge and new opportunities to the role of libraries. the content that we deliver is evolving and diversifying, not only from print to digital, but also from the traditional formats of book and journal article to new media such as research data and learning materials. much of it is not even purchased or licensed by us, given the increase in what is available in the public domain or as open access (oa).
libraries are no longer defined by the collections within our buildings; the scope of the content we deliver and who we deliver it to is becoming difficult to define. our focus is just as much about exposing and delivering university-generated outputs to the outside world as providing licensed and purchased content to our own user communities. how we deliver that content is also changing: the traditional purchase and subscription models we have worked with comfortably for many years have been overtaken by a shifting landscape of demand-driven and e-textbook services. the impact of these new models should not be underestimated: they present a fundamental challenge to our professional ethics and require a shift in our thinking. librarians have always worked on the premise that all the information we acquire is available to all our students and researchers. licensing and delivering course-specific content to a subset of our community, though of course satisfying user need, does not sit comfortably with our principle of equitable access. external pressures also impact on how we approach the delivery of content. the introduction of student fees has raised expectations of information provision, and universities are under pressure to avoid additional costs for students, such as library fines or the purchase of personal copies of textbooks or core readings. students expect access to content in both print and digital formats, as they satisfy different needs. many librarians are also working within the context of a parent institution’s growth agenda, needing to deliver more and diverse content to increasing student numbers.
golden years
libraries are grappling with the question of who drives the delivery of our content.
in the past ten years, we have transitioned from developing and building collections on behalf of our users, through the hazy early days of demand-driven acquisition (dda) when we handed over portions of our budget to allow our users to determine what content they needed.
insights – ( ), july where are we now? delivering content | joanna ball
joanna ball, head of library content delivery and digital strategy, university of sussex
we have reached a point now where we need to lead on the establishment of a middle way: users determine the content, but we are developing tools and systems which we use alongside professional judgement to determine the most cost-effective and suitable method of delivery, whether through document supply, rental or purchase. we know now that print as a format will not die completely: what we are striving for is the perfect balance of formats within our new hybrid environment. we require analytical tools to enable us to fully understand and compare the use of our digital and print resources and answer some difficult questions, not only about the preservation (or not) of our legacy print collections, but also about our current practices in delivering content. how do these new models compare in usage? what advantages do they provide our users, and how can this data inform our future practices and policies? even ten years ago, students needed to make regular physical visits to the library building to access materials, and library resources were still seen as an add-on to a course. at my own institution, library content now increasingly forms part of a complex web of information and resources delivered seamlessly to students through the virtual learning environment (vle).
this is made possible through the integration of the library reading list system with other relevant campus systems, and partnership with teams within the institution responsible for supporting the delivery of different types of content.
let’s dance
publisher-librarian relationships are evolving in tandem with developments in content, from customer-seller to collaborative partnerships enabling us to learn from each other and develop models for the delivery of content that suit us and our users. libraries are now negotiating with publishers and academics to provide textbooks directly through vles, and we have an important role in ensuring that this method of delivery continues to provide value for money for our institutions. the boundaries between libraries and publishers appear to be blurring, as our focus shifts towards facilitating the creation and delivery of content created by our own academic community. libraries are taking advantage of these opportunities to develop new models of service. many of us now act as distributors by providing the infrastructure to publish oa journals and books, and there is an increase in the number of libraries setting up full-blown publishing services. this unique position that libraries have as a trusted partner of students, researchers and publishers puts us in an excellent position to exploit these collaborations to make innovations in the development of content formats. together we can drive the evolution of the academic book, the traditional vehicle for university teaching and research, into something dynamic that takes full advantage of current technology and reflects the complexity of the research process today, rather than merely recreating print in digital form. we have an opportunity to innovate and create new formats that are more flexible, as well as shape new business models more closely aligned to the evolving needs of our institutions and our users.
the development of digital scholarship, and in particular digital humanities, presents further opportunities for working with our user community, exploiting the traditional professional skills of library content delivery teams in different ways. we can now add metadata to research data and new forms of digital outputs to ensure that university research is available for discovery and reuse. our preservation focus is also becoming more digital, ensuring long-term access to our special collections, our own online content and the research outputs of the institution. we can work with our academic community on the copyright implications of sharing their digital content. these new roles present us with an opportunity to become full partners in the research process rather than the suppliers of a service.
it ain’t easy
these changes create practical challenges for library structures and processes. our workflows and staffing have been based on an individual-purchase, print model and although we have adapted well, we have had to supplement our ill-equipped library management systems with add-on modules and makeshift tools to cope effectively. the transition to new library services platforms capable of dealing with a hybrid environment is enabling us to completely review internal processes and structures. how should we best divide up our teams? traditional divisions of work, for example print/digital and one-off purchase/subscription, are no longer helpful or relevant. how should we manage our budgets? our approach of allocating and accounting for expenditure on a subject basis, collaboratively with our academic community, does not fit with e-book packages and dda, both of which cut across subject boundaries.
as libraries, we are trying to find answers to these two key questions. emerging models do not necessarily sit comfortably alongside traditional ones. while, in theory, evidence-based acquisition supplements our selection of items to resource teaching, the practicalities of combining these models within my own institution present us with challenges when we are trying to streamline our processes. there is also a question of how delivery of our own institutional content fits alongside licensed or purchased content. can this be handled from within our existing structures or do we need to create new scholarly communications teams to manage this area which is growing in scope and importance?
changes
more than ever, we are creating flexible, agile and outward-looking teams who are well equipped to deal with this changing environment. our focus should be on where we can provide most value in delivering content: the increase in evidence-based and data-driven decision-making around content weakens our traditional role as selectors. many of us are no longer crafting print collections, and should avoid the temptation to merely recreate them in digital form. our value is now in delivering content, not acquiring it, and this has implications for the skills we need. many of us are now negotiating on a publisher-by-publisher basis for individual e-textbooks in an arena where there are no existing pricing models. our new library services platforms promise to enable us to streamline our workflows as much as possible, eliminating the need for some of our clerical and back-office tasks, and freeing up teams to develop expertise in new areas where they can provide greater value.
for example, metadata teams need to be able to focus on implementing new initiatives to make our bibliographic data more useful and visible to the outside world, and on developing skills and capacity for the institutional repository to support the organization in meeting the oa requirements of the next research excellence framework. these tools are beginning to open up possibilities for actionable analytics and superior management information for evidence-based decision-making. we need data science and library carpentry software skills to manipulate, clean, organize and interpret these quantities of data and make sense of analytics and benchmarking.
loving the alien
one of the biggest challenges may be bringing our user community along with us. there is a disparity between what we as librarians believe the role of the library is in supporting teaching and research within the institution and the perceptions of our users. as a librarian, i no longer make value judgements about other libraries based purely on the scope and scale of their collections; as a profession, we are confident that the value we have to offer our institutions is much more profound. my experience of working with academic staff and students has shown me that this vision is out of sync with many of theirs. we are navigating between traditional views that our role is to acquire and preserve static print collections representing broad areas of knowledge, and the demands of students who expect the convenience and flexibility of accessing digital resources (as long as print is there as an alternative).
this creates a conflict between expectations of what a library should deliver and what user needs actually are, as well as a lack of understanding about the potential librarians have to enhance learning through both content and skills. we need alignment and understanding between library teams that deliver content and manage relationships to ensure that we do not alienate our users, and are able to bring them along with us as we evolve.
heroes
i am tired of hearing that librarians are concerned about their survival and relevance in the digital age, and that we somehow need to reinvent ourselves. we have been quietly (and not so quietly) evolving over the past decades. our focus is not on building up our own collections, but collaborating on a national level to influence the development of the market for content – including delivery models and usability – as well as price. we must not lose sight of our goal: providing access to content for students and researchers when and where they need it. by remaining an integral part of the teaching and research process, libraries can ensure that the way that content is delivered to our users is not shaped purely by budgets, technology and business models, but by innovation, agility and skill.
abbreviations and acronyms
a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘abbreviations and acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa
competing interests
the author has declared no competing interests.
references
cilip’s ethical principles for library and information professionals: http://www.cilip.org.uk/about/ethics/ethical-principles (accessed may ).
estelle, l, what students told us about their experiences and expectations of print and e-books, insights, , ( ), – ; doi: http://doi.org/ . /uksg. (accessed may - ).
emery, j and stone, g, library as scholarly publisher, oawal blog: https://library .hud.ac.uk/blogs/oawal/library-as-publisher/ (accessed may ).
levine-clark, m, access to everything: building the future academic library collection, portal: libraries and the academy, , ( ), - : https://muse.jhu.edu/article/ (accessed may ).
hefce, policy for open access in the post- research excellence framework: updated july : http://www.hefce.ac.uk/pubs/year/ / / (accessed may ).
chad, k, library management system to library services platform. resource management for libraries: a new perspective. higher education library technology briefing paper, ( ): http://doi.org/ . /rg. . . . (accessed may ).
library carpentry: https://librarycarpentry.github.io/ (accessed may ).
article copyright: © joanna ball. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.
joanna ball head of library content delivery and digital strategy university of sussex library, university of sussex, falmer, brighton bn ql, uk tel: + ( ) | e-mail: j.e.ball@sussex.ac.uk orcid id: http://orcid.org/ - - -
to cite this article: ball, j, where are we now? delivering content in academic libraries, insights, , ( ), – ; doi: http://dx.doi.org/ . /uksg.
published by uksg in association with ubiquity press on july
warwick.ac.uk/lib-publications
original citation: franks, matthew ( ) laboratory, library, database: london’s avant-garde drama societies and ephemeral repertoire. modernism/modernity, ( ). pp. - . doi: . /mod. .
permanent wrap url: http://wrap.warwick.ac.uk/
copyright and reuse: the warwick research archive portal (wrap) makes this work by researchers of the university of warwick available open access under the following conditions. copyright © and all moral rights to the version of the paper presented here belong to the individual author(s) and/or other copyright owners. to the extent reasonable and practicable the material made available in wrap has been checked for eligibility before being made available. copies of full items can be used for personal research or study, educational, or not-for-profit purposes without prior permission or charge, provided that the authors, title and full bibliographic details are credited, a hyperlink and/or url is given for the original metadata page and the content is not changed in any way.
publisher’s statement: © copyright johns hopkins university press
published version: https://doi.org/ . /mod. .
a note on versions: the version presented here may differ from the published version, or version of record. if you wish to cite this item, you are advised to consult the publisher’s version. please see the ‘permanent wrap url’ above for details on accessing the published version and note that access may require a subscription.
for more information, please contact the wrap team at: wrap@warwick.ac.uk
laboratory, library, database: london’s avant-garde drama societies and ephemeral repertoire
[draft] forthcoming in modernism/modernity, september
matthew franks
in november , a london play-producing organization known as the stage society sent circulars to its members announcing one sunday evening and one monday matinée performance of mrs. warren’s profession. the lord chamberlain had banned george bernard shaw’s play three years earlier, and although the stage society’s members-only performances technically were exempt from both the pre-performance licensing requirement and the longstanding prohibition on sunday theatrics, managers feared the loss of their operating licenses. by the time the play premiered at the new lyric club in january, the stage society had been forced to change venues three times, after approaching at least twelve theaters, two music halls, three hotels, and two galleries. the society also had postponed the production once due to an actress’s last-minute scheduling conflict. with each change, the society printed new sets of circulars, programs, and tickets—sometimes, only a day apart. dedicated to the discovery of new or sometimes very old drama, subscription societies were experimental coterie clubs composed of members whose annual fees financed, and secured tickets to, a season of private productions. in , j. t. grein founded the first such group in britain, the independent theatre society, in order to stage a performance of henrik ibsen’s ghosts, which the lord chamberlain had banned from the public stage. over subscription societies followed; the stage society ( – ) ran longest and most successfully. though extreme, the case of mrs. warren’s profession demonstrates the extent to which subscription societies lacked actors and theaters of their own, and relied on printed ephemera to constitute, as much as to communicate, their performances.
compared to bound books, ephemera—from the greek for things lasting no more than a day—better approximated the transience of live performance. but ephemera also could virtually assemble repertoires and audiences beyond a single theater or performer. the stage society’s annual report meticulously recounted the mrs. warren saga and boasted of the speedy production of ephemera: “tickets and programmes and a circular to members were printed and ready within twenty-four hours.” the curtain would go up after the letterpress had come down: when the theater changed five days before another performance, members “[suffered] no further inconvenience than a late receipt of programmes and tickets consequent on the delay due to reprinting.” subscription societies produced more ephemera than plays, such that shaw received a prospectus from the fictitious “pornographic play society (limited),” which stated that the success of mrs. warren’s profession “encourages the committee of the p. p. society to follow it up by a series of performances suitable to the taste of supersensuous audiences.” the prospectus satirized the tastes of subscription society members and the plays promised to them by committees. it also mocked the “limited” nature of such societies, conflating legal registration with limits on influence. how did these avant-garde societies shape the performance repertoire? in this article, i quantitatively analyze a database of over , london productions from to in order to determine the extent to which subscription societies introduced a modern dramatic repertoire to the public stage, otherwise known as the commercial theater. i further argue that subscription societies virtually assembled the very idea of a modern dramatic repertoire using ephemera such as prospectuses, programs, annual reports, and tickets. 
my methodological aims with respect to the study of repertoire are twofold: to demonstrate the potentials and limitations of digital databases and to make a case for integrating them with book history. as debra caplan has observed, databases “tackle a recurring and significant challenge in [theater and performance studies]—the ephemerality of our medium and the dispersal of theatrical ephemera that may shed light on a performance event.” in this article, i follow through on caplan’s pun by tracking the relationship between theatrical ephemera and performance databases in the era of modernity, when britain’s professional not-for-profit theater sector first emerged, and with it, a quantifiable avant-garde. by combining book-historical and digital-quantitative methods, i propose a new model for integrating modernist studies with theater and performance research. while artist-centered analyses by lawrence switzky and toril moi have evaluated shaw’s and ibsen’s modernist credentials on aesthetic grounds, theater-historical accounts by tracy davis and claire cochrane have readily used the adjectives “modernist” and “avant-garde” to describe the societies that premiered these dramatists’ plays. from a historical perspective, the term “avant-garde” could not be more appropriate, since it was introduced into french dramatic criticism to describe the repertoire of andré antoine’s théâtre libre, the parisian subscription society that inspired grein’s independent theatre; in their own heyday, british subscription societies were considered “advanced.” nevertheless, modernism’s contentious relationship with the stage exceeds historical definitions. olga taxidou has written of “the impasse created by a critical tradition that views textuality (literary or otherwise) and materiality (stage, bodily or otherwise) as mutually exclusive discourses”—a bifurcation that further maps onto anglophone literary modernism and continental theatrical avant-gardism.
the tension between textual page and material stage has been especially generative for william worthen, martin puchner, and jennifer buckley, who have argued for the importance of the published play, the closet drama, and the performance text to the formation of modern drama, modernism, and the avant-garde, respectively. i am less concerned here with parsing those categories in terms of individual artists’ aesthetics, since subscription societies mounted naturalist, symbolist, and expressionist plays alike, and playgoers saw each style as new and experimental; rather, i pivot away from the textual page and toward material ephemera—an under-theorized print genre, but one essential to structuring collectivity for any institution, particularly theater. scholars of modernist little magazines and private presses have long recognized that print, if not strictly ephemera, conditions collectivity. in addition to convening coteries, subscription societies were similar to subscription publishers in that both assembled stables of writers like fantasy baseball teams. often, overlapping teams: one of the stage society’s less-remembered plays was a one-act called one day more ( ) by joseph conrad, adapted from his short story “to-morrow”; the society later was responsible for the london premieres of james joyce’s exiles and d. h. lawrence’s the widowing of mrs. holroyd (both ). in , art critic huntly carter observed that subscription societies “strongly resembled the new so-called advanced journals which are springing up to-day, and which serve as a dust-hole for literary and moral outpouring.” more recently, elizabeth miller has compared societies to the “slow print dynamic of the radical press”; she dubs them the theatrical counterpart to socialist magazines like to-day, which published ibsen’s plays before societies staged them. 
yet if little magazines—along with literary archives, museums, art collections, and even encyclopedias—have been characterized as institutions of modernist collectivity, theater archives have more often been juxtaposed to a theatrical collectivity predicated on liveness. performance studies has habituated us to recognizing theatrical ephemera like playbills, posters, press clippings, and picture postcards as mere traces of irreproducible happenings, like breadcrumbs leading to people and places we’ll never reach. when such documents contradict each other, upending w. b. yeats’s account of the riotous ubu roi ( ) premiere (to choose a performance event that has become crucial to the story we tell about modernism), the archive only further “performs the institution of disappearance,” to borrow rebecca schneider’s haunting formulation. sarah bay-cheng recently has proposed situating theater and performance history within a new and old media ecology, thereby transforming personal and institutional archives alike into “networks” in which “performance does not disappear.” in this article, i acknowledge the validity of both perspectives: our encounters with media, whether in theaters or rare books libraries, are as embodied as any performance; ephemera can be discarded (or deleted) as well as saved. what remains, so to speak, is to imagine ephemera in the hands of playgoers before, during, and after the performance event. ephemera’s affordances were clear to turn-of-the-century theater reformers. in , one theater manager observed that a subsidized play-going public existed, “but it wants organising and circularising, and that is the work for [subscription] societies to take in hand.” other observers compared societies to legal bodies like corporations and syndicates.
in the inaugural issue of the times literary supplement, critic arthur bingham walkley declared: like nearly everything else in the modern world the new theatrical demand has of late years been worked by corporations and syndicates, with the usual apparatus of prospectuses, pamphleteering, and, above all, subscription lists. in this kind the independent theatre society begat the new century theatre society, and the new century theatre society begat the stage society, and by-and-by—say, at the coming of the cocqcigrues—the stage society may beget that new theatrical supply which ought to meet the new theatrical demand, but, somehow, never does. as walkley anticipated, the stage society became a limited company two years later, changing its name to the incorporated stage society to suggest a wider membership and influence. with incorporation, the society halved the annual fee to one guinea, and membership doubled from to , . but that a vital list of modern drama seemed as likely as a mythical monster to appear throws into relief the astounding accomplishments of the next decade, during which the stage society launched the playwriting careers of shaw, harley granville-barker, st. john hankin, and john masefield—and, over a longer period, the less-successful bids of conrad, lawrence, and joyce. the stage society continued the work of earlier societies by further popularizing ibsen, as well as introducing maeterlinck, chekhov, strindberg, pirandello, and cocteau to the english stage. in other words, the society’s playlists knitted together modern dramatists, literary modernists, and theatrical avant-gardists. 
and, perhaps more surprisingly, box-office successes: new media analysis reveals that after passing the subscription test, many of these playwrights successfully crossed over into the (retrospectively-constructed) commercial repertoire, subtending the gap that penny farfan has identified between “hegemonic modernism and mainstream theatre practice.” rather than evaluate subscription plays in a vacuum, this approach takes stock of the entire professional london stage, placing man and superman and hedda gabler alongside peter pan and charley’s aunt. what’s more, old media analysis suggests that theatrical ephemera like prospectuses, pamphlets, and subscription lists played an important role in self-consciously fashioning the concept of a modern dramatic repertoire in the first place. prospectuses, by looking forward to an imagined series of future performances, and annual reports, by looking backward to take stock of successes and failures, trained audiences to think of plays not as individual works, but as parts of a repertoire that could be compiled, catalogued, and chosen from at will. even as their ephemera communicated practical administrative information, assembling this repertoire was subscription societies’ raison d’être. repertoire even took the place of a permanent theater building; as the journal the new age reported in : “in london the only permanent home the drama we want possesses is in those pioneer dramatic societies which are financed by the subscriptions of members.” the stage society’s membership never exceeded , and only a fraction of theatergoing audiences attended subscription performances, but the society’s productions were reviewed in newspapers and revived in commercial theaters throughout britain. subscription lists and reports of stage society audiences in the public press gendered play-going as female and playwriting as male; both were thought to influence the repertoire.
even as print brought subscription societies into existence, ephemera orbited around the live performance event, with the distance of the prospectus and the annual report, and the proximity of the ticket and program. ephemera virtualized repertoire nearly a century before the advent of digital databases. database; repertoire; list—as kenneth price asks, “what’s in a name?” price distinguishes between the technical term “database” and a looser metaphorical collective. although i make use here of a modern-day database for quantitative analysis and locate the emergence of the technical term “repertoire” in the nineteenth century, i recognize that the more telling moments in both time periods emerge from metaphor: when a repertoire is compared to a library or a storehouse, say, or when a database is compared to a cloud or an internment camp (as when donald trump recently suggested a database for american muslims). moreover, i argue, as neil postman once did, that the material form of information shapes our metaphorical perception of it. material and metaphor meet in virtuality, which has gained new currency in the digital age. but as david saltz reminds us, artaud claimed the term “virtual reality” for the theater over fifty years before jaron lanier did for the computer. 
taking an even longer view, sue-ellen case proposes that we conceptualize the literal theater as a space of virtual representation akin to the medieval cathedral, which “purported to provide an architecture of the virtual space of heaven.” yet from the new media end of the timeline, steve dixon comes to a seemingly opposite conclusion, identifying “the inherent tensions at play between the live ontology of performance arts and the mediatized, non-live, and simulacral nature of virtual technologies.” the difference between these approaches to virtuality inheres in whether we take performance as our subject or our object: does performance imagine something else, as in the former; or are we asked to imagine performance itself, as in the latter? or are we asked to imagine a performance repertoire, like the lists of plays embedded in subscription ephemera? whether celestial, cybernetic, or canonical, each approach accesses virtuality from the point of a representational platform, be it stage, screen, or page. all, in the words of n. katherine hayles, “[play] off the duality at the heart of the condition of virtuality—materiality on the one hand, information on the other.” in this article, the concept of the “virtual” serves as a bridge between media that are still too frequently considered in binary terms: live/non-live, unmediated/mediated, ephemeral/permanent.

from laboratory to library

turn-of-the-century theater reformers compared subscription societies to different storage facilities: laboratories, museums, storehouses, libraries. each analogy had something to say about the nature of the repertoire, be it experimental, esoteric, explosive, or classical. these analogies conceptualized plays as discrete objects that could be arranged on a shelf, in a mental shift hastened by the late-victorian renaissance in dramatic publishing, which helped to literalize the metaphor. ephemera’s institutional associations further inspired such comparisons.
i begin by weighing the various trade-offs of these comparisons before moving into a quantitative analysis that evaluates their accuracy. to what extent did subscription societies discover drama for the commercial repertoire? the majority of subscription productions were performed only once or twice; in this respect, societies resembled laboratories. william archer imagined a “test performance society” which would operate as a “safety-valve” for plays that might upset the censor. in , the shelley society (generally not considered a play-producing society) staged a subscription performance of shelley’s unlicensed play the cenci ( ). this established a precedent for future play-producing societies as far as censorship was concerned. theater historians have long observed that dramatic publishing returned to being integral to a play’s literary value at the end of the nineteenth century. as henry arthur jones proclaimed after the passage of the american copyright act, which ostensibly protected english playwrights from unauthorized trans-atlantic performances: “[if] a playwright does not publish within a reasonable time after the theatrical production of his piece, it will be an open confession that his work was a thing of the theatre merely, needing its garish artificial light and surroundings, and not daring to face the calm air and cold daylight of print.” apparently, play-going was for the evening and reading, the daytime. copyright law newly defined performance through print: in order to secure copyright before publication, the play had to be “publicly performed,” which meant that a playbill had to be exhibited outside the venue and the performance advertised in two newspapers. (subscription performances did not count.) reading editions of shaw and other “advanced” dramatists followed, spurred by publisher involvement in societies. 
this included william heinemann, who asked john lane to publish heinemann’s banned play the first step ( ) after the independent theatre society decided not to stage it; gerald duckworth, who was secretary of the new century theatre society and later published all of galsworthy’s plays; and grant richards, who published shaw’s plays pleasant and unpleasant ( ) and was listed as a signatory on the stage society’s invitational circular. by edward’s reign, critics had inverted the print/performance paradigm. one lamented that the stage society had gone the way of “other experimental dramatic societies” by performing mrs. warren’s profession, “which one could have been content to read.” with less ambivalence, the stage society’s secretary allan wade recalled that richards’s shaw volumes “were very amusing . . . to read. the thought that they might be acted did not seem to occur to anybody” (memories of the london theatre, ). societies devoted themselves to testing the so-called “great unacted,” the iceberg of which shaw was assumed to be only the tip. they may have wanted for quality plays, but they were never short of submissions. the stage society’s reading and advising committee received an average of three plays a week (most of which had never been published), and the society’s ten-year jubilee celebrations included a special midnight burlesque that depicted a strike of great unacted dramatists who compel the “ultra-drama society” to stage a gloomy play. laboratory-like, societies engineered the rise of modern drama by creating a controlled environment where theater would not be subject to the blunt forces of commercialism. although jones and arthur wing pinero penned a number of popular yet high-quality society plays, george sims, sydney grundy, and f. c. burnand hacked out melodramas, comedies, farces, and musical comedies that enjoyed long runs but rare revivals. 
of the advanced drama printed in the s, wade recalled: “i must have taken it for granted that one could not expect to see these tender plants exposed to the ordeal of performance at a west end theatre” (memories of the london theatre, ). this anathema toward the commercial west end earned societies a reputation for producing seedy plays. closely related to the analogy of the laboratory was that of the museum. as one critic observed: there is a medical museum in london—from which the frivolous are excluded by the fact that admission can only be obtained by a card from a doctor—where, ranged on shelves, are exhibited all the various disease to which the interior of man—and, for aught we know, his exterior also—is liable. . . . the stage society performs somewhat the same salutary and scientific function. this critic underscored the self-seriousness of stage society members and emphasized the subscription card by likening it to the institutional medical card. (with all these tender plants and medical cards, one can’t help but compare societies to today’s cannabis clubs: like cannabis clubs, subscription societies provided a loophole for accessing illicit and supposedly dangerous wares.) the use of the word “liable” also connected this intrapersonal conflict to the shared, or limited, liability of the stage society’s members. the salutary and scientific function came not only from a shared investment in humankind’s private pathologies, such as venereal disease or drug addiction (presented in plays such as ibsen’s ghosts [ ] and w. l. courtney’s on the side of the angels [ ]), but also from arrangement and exhibition. as time went on, arrangement and exhibition came to include the dramatic experiments of anglophone literary modernists like conrad, lawrence, and joyce. david kurnick has described theatrical failure as a driving engine behind the modernist novel; that conrad and lawrence both adapted their short stories into plays suggests further cross-genre exchanges. 
by extension, reviews of these productions tended to affirm that sterling novelists made poor dramatists. the observer’s critic noted that exiles “left me with the impression that i had strayed into the consulting-room of a psycho-pathologist.” the stage society mounted literary modernists rather like hunting trophies: lawrence’s and joyce’s plays were accepted only after the authors had bolstered their reputations with women in love and ulysses, each play having been rejected approximately a decade before. yet societies by no means shunned commercial success. in , j. t. grein both repeated and refuted the museum analogy: in our theatre the stage society, in spite of its not having a fixed abode, has cemented its own place; and it is, perhaps, not presumptuous to express the hope that henceforth it will be looked upon by the regular managers not merely as a kind of freakish museum, an intellectual refuge of the destitute, but as a splendid auxiliary channel to increase the répertoire of the commercial theatre. the lack of a physical theater turned repertoire itself into both medium and destination. grein hoped that societies ultimately would contribute to the mainstream. indeed, quantitative analysis of over , london productions from to demonstrates that many of the stage society’s plays crossed over into the commercial repertoire. shaw’s man and superman, which the society premiered in , was revived seventeen times on the public stage between and . to put this in perspective: when we remove operas, ballets, musicals, pantomimes, and the data-skewing shakespeare, the most-produced play from to was j. m. barrie’s peter pan ( , revived times); any play revived more than seven times, including subscription and non-subscription performances as well as charity matinée and touring productions, numbers among the top hundred (around percent) of the corpus (fig. ).
man and superman ties with james bernard fagan’s adaptation of treasure island for the eleventh most-produced play. the stage society’s first production was the premiere of shaw’s you never can tell ( , revived fourteen times), and other frequently revived plays include stanley houghton’s hindle wakes ( , revived seven times on the public stage) and r. c. sherriff’s journey’s end ( , revived five times on the public stage). this list suggests that domestic commercial crossovers were primarily shavian. as grein recalled: “practically from the beginning ‘g.b.s.’ lent his storehouse for the society, and whenever shaw was on the programme up went membership, interest, and prestige” (grein, the world of the theatre, ). this formulation figured shaw’s plays as a hoard of weapons that might explode the theater—rather than arcane specimens that would put it to sleep—and metonymically substituted the program for the live performance event. shaw’s crossover appeal also stabilized the famous court theatre seasons ( – ) organized by j. e. vedrenne and harley granville-barker, who sought commercially viable ways to stage plays on the repertory, or short run, model. the stage society premiered first plays by granville-barker, hankin, and maugham; none was much revived, but each dramatist went on to write plays that were among the edwardian theater’s most popular. no other society produced english-language playwrights with such broad appeal. the society record for introducing new translations of foreign plays to the commercial repertoire was even more substantial. between and , ninety-six of new translations (or percent) were subscription productions. what’s most striking about these plays is the way that they move from the avant-garde to the commercial theater.
ibsen’s controversial ghosts was revived sixteen times after the independent theatre society production—of the next productions, the first two were by other societies, but the play was revived thirteen times on the public stage after the lord chamberlain removed the ban in , and ties with sheridan’s the rivals as the thirteenth most-revived play in the corpus. a doll’s house and hedda gabler also top the list. although chekhov’s plays never ran afoul of the censor, the stage society premiered the cherry orchard ( ) and uncle vanya ( ), which were revived on the public stage nine and eight times, respectively, including internationally touring productions. societies premiered a number of banned works that have been foundational to modern dramatic criticism but that exerted much less influence on the commercial repertoire of the time, including strindberg’s miss julie ( ), pirandello’s six characters in search of an author ( ), and cocteau’s the infernal machine ( ); because the database does not extend beyond the abolishment of theater censorship in , we are less able to determine whether these plays subsequently figured in the commercial repertoire. but although censorship electrified the society movement, of the , subscription productions to be staged in theaters, only twenty-four (less than percent) were of banned plays. this number is somewhat lower than the total because the database does not include productions in non-theater venues such as galleries and clubs. still, it reflects the reality that the lord chamberlain historically banned only a minority of plays. between and the censor banned thirty out of , plays, though he wielded his blue-pencil to strike lines from a great many more. certain societies did not concern themselves with new or banned plays, focusing instead on unearthing older dramas that subsequently were reintroduced to the commercial repertoire. 
william poel’s elizabethan stage society ( – ) produced everyman in after poel’s own revival a year before; the play was produced fourteen more times on the public stage before , and ties with leopold lewis’s sensational the bells as the fifteenth most-produced play in the corpus. an outgrowth of the stage society, the aptly-named phoenix society ( – ) specialized in elizabethan and restoration plays, the most popular of which was wycherley’s the country wife ( ); after the society revived it in , it was produced five times on the public stage before . other societies attempted to revive classical greek tragedy in the style becoming popular at oxford, including the very short-lived greek play society ( ). the most important discovery was euripides’s hippolytus, which the new century theatre ( – ) briefly resuscitated in ; the vedrenne-barker court theatre produced the tragedy later that year, and it was produced three more times before . (even if that doesn’t sound like a large number of revivals, it’s still among the top percent.) a handful of societies specialized in the performance of shakespeare, including the elizabethan stage society, the british empire shakespeare society ( – ), and the fellowship of players ( – ), but they tended to produce oft-revived plays such as hamlet and the merchant of venice. though they sometimes revived lesser-produced history plays, none of these plays subsequently re-entered the commercial repertoire. the influence of societies on shakespeare staging was significant, particularly poel’s vigorous attempts to recreate the boards of elizabethan england. granville-barker, who began directing with the stage society, went on to direct a handful of symbolist shakespeare productions in the years before the war. this assessment of london societies’ influence on the commercial repertoire has a number of shortcomings.
an obvious one is location: many subscription plays subsequently were revived in the allied repertory theaters of manchester, glasgow, liverpool, and birmingham, and reducing british repertoire to the london stage underplays the provinces as well as the numerical success of these plays. from the opposite direction, the stage society’s world-premiere of houghton’s hindle wakes was performed by annie horniman’s manchester repertory theatre company; in general, though, new plays from the provinces did not figure into london’s commercial repertoire to anywhere near the extent that subscription plays did. the data further exclude the activities of amateur groups, which were important for spreading the new theatrical movement beyond the metropolis (nicoll, english drama, ). another limit is periodization: – covers a little more than shaw’s lifetime of theatergoing, and we do not yet have data for how plays by him, ibsen, and chekhov fared once bertolt brecht, samuel beckett, and harold pinter began to influence the british stage. however, in britain granted a royal charter to the arts council, thus ending the era when subscription was the only collective, not-for-profit method for counteracting commercialism. (and from which point it becomes necessary to define what i have called the “commercial repertoire” as the open-to-the-public repertoire.) government-subsidized theaters such as the english stage company at the royal court theatre took up the laboratory role that had been filled by societies, and the abolishment of theatrical censorship in further diminished the need for subscription performances. it’s also worth bearing in mind that there are other ways of determining a play’s significance to repertoire than the number of times it has been revived. 
if measured by number of performances rather than productions, far fewer subscription plays would top the list, though with over performances, man and superman would come closest as among the top fifty most-performed plays in the corpus. what’s most interesting from this vantage is how infrequently the most-performed plays get revived: though , or more total performances signal that a play numbers among the top twenty, the only such plays that also appear on the most-produced list are peter pan, charley’s aunt, and when knights were bold; in other words, a high number of total performances often indicates that a play was revived infrequently if at all. so although audiences flocked to see , performances of edward sheldon’s opera-prima-donna play romance ( ) as opposed to performances of everyman over the same half century, it matters that everyman was revived fourteen times after its subscription performance, and romance only once, in . that interested theatergoers were able to see a particular play is at least as significant as whether crowds actually did; this was the very paradigm shift advocated by subscription societies. short runs also conform to the repertory ideal, which trades momentary popularity for a chance at posterity. in any case, the data do not take into account theater capacity or audience size, only revivals and performances. what this analysis does offer is a means of evaluating the societies’ successes in discovering or testing plays that might then get placed on the shelf not of a laboratory but of a library. 
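The difference between ranking by productions and ranking by performances can be made concrete with a short sketch. What follows is only an illustration under stated assumptions, not the schema or the query logic of the database used in this article: the `Production` record, its field names, the placeholder titles, and every number in `SAMPLE` are hypothetical. The one convention taken from the discussion above is that a play's revivals are its productions after the first.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    """One staging of a play (hypothetical record layout)."""
    title: str
    year: int
    performances: int
    subscription: bool  # True if staged by a subscription society


def summarize(productions):
    """Map each title to (revivals, total_performances).

    Revivals are counted as productions after the first one.
    """
    counts = defaultdict(lambda: [0, 0])
    for p in productions:
        counts[p.title][0] += 1
        counts[p.title][1] += p.performances
    return {title: (n - 1, total) for title, (n, total) in counts.items()}


def rank(summary, index):
    """Titles sorted from highest to lowest on the chosen measure.

    index 0 ranks by revivals, index 1 by total performances.
    """
    return sorted(summary, key=lambda t: summary[t][index], reverse=True)


# Invented sample rows: a short-run play revived twice, a long-run play
# never revived, and a single subscription performance.
SAMPLE = [
    Production("Play A", 1900, 2, True),
    Production("Play A", 1905, 30, False),
    Production("Play A", 1910, 25, False),
    Production("Play B", 1902, 400, False),
    Production("Play C", 1903, 1, True),
]

summary = summarize(SAMPLE)
by_revivals = rank(summary, 0)
by_performances = rank(summary, 1)
```

In this invented sample the two rankings disagree at the top: the play that was revived twice leads the revivals list, while the long-running play that was never revived leads the performances list. That is the pattern described above, in which a high performance total often goes with few or no revivals.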
this mission informed granville-barker’s analogy of a repertory playhouse that would keep plays “on the shelf of a theatre, so that, as from time to time a reasonable number of people is likely to want to see it, it can be taken down without overwhelming trouble and expense.” the government-subsidized royal national theatre that granville-barker envisioned ultimately found its feet in , and it has since revived a great many subscription plays. moreover, the influence of subscription can be counted throughout the database: of all the non-shakespearean plays produced more than once between and , almost one in five were produced by subscription. subscription plays had a percent chance of being revived; plays produced only in the commercial theater had a percent chance. acting in subscription productions, which required memorizing many lines for only one or two performances with little to no pay, could have an even greater effect on one’s career: although percent of the actors who performed in societies never performed on the public stage and might be called “amateurs,” actors who performed in societies averaged twelve productions on the public stage from to ; actors who never performed in societies averaged three. in these respects, societies did, in fact, serve as a splendid auxiliary channel to increase the commercial theater repertoire, slotting modern drama in among a list of frequently-revived popular plays like mrs. hilary regrets, david garrick, and treasure island, and integrating a consciously-created avant-garde repertoire into a broader commercial repertoire that we only now are able to construct retrospectively.

reporting the repertoire

just as important as the data of play premieres and revivals is the very idea of repertoire. after all, few if any playgoers actually went to see all eighteen productions of man and superman between and .
the oed dates “repertoire” to the early nineteenth century, when it emerged as an alternative to “stock” as a way to describe the list of “dramatic or musical pieces which a company or performer has prepared or is accustomed to play.” this best applied to the stock companies that toured the provinces of victorian england, as articulated by the actor jerome k. jerome in : “i got hold of the répertoire and studied up all the parts i knew i should have to play.” for jerome, repertoire meant a collection of sides or pages containing a character’s lines preceded by cue words. how did the idea of a modern dramatic repertoire emerge? the concept of a theatrical canon that was independent of a company or performer originated with other stage genres: the most frequently revived works are not plays, but operas and ballets. in london, the number of ballet productions was minuscule before the visits of the ballets russes in the years leading up to world war i, and it was not until the s when marie rambert formed the ballet club (later the ballet rambert) and ninette de valois started the vic-wells ballet (later the royal ballet company) that the number of ballets rapidly escalated to match other stage genres. opera, however, emerged as a major performance genre in the late eighteenth century. as jennifer hall-witt observes, a local operatic repertoire developed at king’s theatre in the early nineteenth century. hall-witt credits the value increasingly attributed to original (though not necessarily new) works and the romantic cult of the artistic genius for audiences’ willingness to pay to see revivals of operas by popular composers. mid-century copyright laws encouraged managers to stage older operas, as well as to perform the same few works by a particular composer (fashionable acts, – ). that the oed dates “repertoire opera” to and “repertoire plays and operas” to further suggests this teleology.
in practice, operatic repertoire exerted (and still exerts) far greater control than does dramatic repertoire. while the percentage of one-off operas per decade decreases from to , the percentage of one-off non-musical plays increases (fig. ). the idea of a modern dramatic repertoire first circulated in subscription ephemera. grein’s prospectus for the independent theatre society proclaimed the object “to give special performances of plays which have a literary and artistic, rather than commercial value. . . . the following plays will form the repertoire.” grein believed he would reform the commercial theater by nurturing plays that opposed its values; even if much of his repertoire never made it onto the public stage, he would later boast that the best work of mainstream dramatists like pinero and jones dated from the society. indeed, much of grein’s proposed repertoire never even made it onto the subscription stage, but his mixed list of english and foreign plays, both original and classical, influenced all subsequent attempts to define the modern dramatic repertoire in britain. circulars further contributed to the repertoire ideal, but they (like the productions they marked) lacked regularity. as grein’s widow recalled: “announcements of future productions were made and then altered. dates were given out, later to be postponed” (j. t. grein, , ). the invitational circular announcing the formation of the stage society suggested that the group “should meet regularly once a month, and should give at least six performances during the year.” this introduced periodicity to the subscription theater, which the society reinforced through routine prospectuses, annual reports, and, for a time, a bimonthly newsletter edited by st. john hankin. the society also settled on sunday evening performances, which had not taken place since charles i (grein, j. t. grein, ).
sunday performances were both practical, since this was the day theater managers could afford to let their theaters, and “just a little naughty,” in the words of the playwright herbert swears (though a sunday matinée would have been naughtier still). after the first season, the society also offered monday matinée performances, to which the press was expressly invited. the stage society continued the self-conscious construction of a modern dramatic repertoire through its prospectuses and programs. sent to members at the beginning of each season, prospectuses listed the managing committee, the productions of all previous seasons, and the first several plays of the coming season. performance dates and venues were not listed for past or future productions, with the proviso that arrangements for the coming season would be announced by circular. though this probably was due to the difficulty of securing venues and actors in advance, it implied that the thoughtful selection of plays was more important than performance details, which were liable to change at a moment’s notice. subscribers would know which plays were coming long before they knew where and when, and often these details were stripped from subsequent lists. programs for individual performances, called “meetings,” replicated this forward-and-backward-looking structure by reserving the back page of the folio for a list of the season’s “previous meetings” and “further arrangements,” as appropriate (fig. ). here, we see that programs further divided the plays from their performance details by listing only the venue, date, time of performance, and sequence in season on the front cover, with the title, genre, and author inside. although this might suggest a desire to hide the name of a controversial work from prying eyes, the back cover listed plays liberally; the perhaps unintentional effect was to separate performance details from repertoire. 
playgoers would have been fully aware that they were attending one play (or occasionally two or three shorter plays on a single bill, as with the one-acts by maeterlinck and “fiona macleod” [william sharp]) from a growing library. simple typefaces and a conspicuous lack of the advertisements with which programs were traditionally crowded further separated the avant-garde from the commercial theater. the society cemented the idea of a modern dramatic repertoire through its annual reports. these reports included lists of all previously produced plays, along with extracts from the society’s rules, an account of the year’s activities, and membership statistics reflecting the society’s finances. starting with the second annual report, the society adopted the practice of publishing complete membership lists. the annual reports further listed the repertoires of other london societies (such as the pioneer players, who took up the stage society’s practice of publishing annual reports and membership lists), provincial repertory theaters (such as those in manchester, liverpool, and birmingham), and london repertory seasons (such as lillah mccarthy and granville-barker’s season at the st. james), later publishing a complete list—or database—of “plays for repertory theatres” (fig. ). here, we see the importance of compiling figures such as number of performances (shaw already dominates) and act structures (as does the one-act). although stage society membership topped out at , in , newspapers throughout britain had long reviewed the society’s annual reports; in the year of incorporation, the annual report was reviewed in at least the referee, times, sunday times, era, clarion, stage, derby telegraph, bristol mercury, and nottingham guardian. 
reviewers fetishized the report’s materiality: the pall mall gazette ironically praised the report as “a lordly document of twenty-six pages, beautifully printed, and enclosed in a stiff cover.” the press took care to report the repertoire, including the names and numbers of english and foreign plays since . as annual reports recounted, in the society established a small library of theatrical literature for its members, meaning that members had access to a permanent library but not a theater. in addition to plays by english and foreign dramatists, the library included books and magazines (among them edward gordon craig’s the mask) dealing with both contemporary theater and theater history. one could argue that the forward-and-backward-looking dynamic created by prospectuses and annual reports rendered the ideal of an annual season as much as of a modern dramatic repertoire. but by listing all past productions, rather than just those of the past season, the ephemera were used to evoke marble rather than ice sculpture—a dramatic repertoire based at least in part on plays that would stand the test of time, even if they were not staged in artistically unified seasons. the critical consensus was, however, that as much as the stage society managed to produce an important cumulative list of plays, the society’s democratic organization actually prevented cohesive seasons. reviewing the season, critic ashley dukes declared: the stage society, with a large membership, has the defect of being ruled by a council, a committee, and a democratic constitution. this results, of course, in confusion and compromise. . . . it was a typical season, creditable enough as regards each individual performance, but lacking in direction and continuity. a hotch-potch, in brief . . . the stage society would perform a great service by converting itself into a literary theatre, under a dictatorship. 
yet dukes’s assessment indicates that by , the stage society had succeeded in changing “repertoire” from jerome’s handful of stock sides to a modern dramatic library from which a hodge-podge selection would no longer be adequate. dukes’s assessment was also sexist: the stage society’s membership had an increasingly female majority, whose efforts he implicitly judged as incompetent. subscription lists and reports of the stage society’s membership in the public press diagramed a division of labor, where women were the majority of the playgoers, and men were the majority of the playwrights; both were thought to sculpt the repertoire. the notion that both playgoers and playwrights shaped the theater was not new, but the sense of a specific, intellectual coterie was. when the stage society publicly campaigned for funds to establish a permanent repertory theater in , archer published a letter in the morning leader advising against it: “a popular playhouse is the last thing [members] ask for or care about. they love the coterie sensation. they want to have their own ideas, and no others, mirrored for them by the stage.” far from merely connoting feminine vanity, archer’s use of the word “mirror” imputed to the stage society’s subscribers a considerable amount of control over the works that appeared on stage. in a manner typical of the public press, archer’s hyperbolic concerns both reflected and distorted the stage society’s own virtual assembly of audience. this virtual assembly was perhaps best exemplified by the society’s subscription cards: subscribers wishing to be balloted together for the purpose of securing adjoining seats were requested to send in their cards securely pinned together, suggesting the extent to which the society’s collectivity was conditioned by print. 
a shared sense of collectivity also emerged from, or was reinforced by, the society’s subscription lists, in which the number of last names followed by “miss” and “mrs.” increasingly outnumbered those without. these lists were alphabetized by last name and included the year that members had been elected, and whether they were regular, honorary, or associate members, or part of the managing committee. the society soon abandoned the honorary and associate schemes, but began to include the numerical order in which members had joined; under this scheme, all levels of membership were equal, save for any prestige accorded to having joined the society earlier. the lists did not distinguish between playwrights, actors, production staff, and patrons, suggesting that the so-called “earnest students” of the drama were as important as the theatrical personnel who were listed alongside them. though such lists might have radiated exclusivity, the society was open to anyone who could afford the one guinea annual subscription fee. guineas were the traditional fee of doctors and lawyers, and the new theater intended to be a similarly professional service. the fee also echoed that at mudie’s circulating library; like most readers of fiction, most subscription theatergoers were middle-class women, many of them unmarried—ironically, the demographic the lord chamberlain most sought to protect. the invitational circular’s proposed membership limit of was abandoned quickly, and although the society raised the limit to and, with incorporation, , , both of these limits were provisional (and, given the precedent, extremely optimistic); the legal articles of association declared the number of members to be unlimited. the stage society constructed its coterie status both privately and publicly by sending materials to both members and the press. 
in the first season, a later annual report recounted, consent from skittish theater managers “could only be secured by placing special stress on the character of the society as a club producing plays exclusively for its members and their guests. to establish this principle a circular was issued to the dramatic critics (many of whom were members of the society), and all forms of advertisement were carefully avoided” (“third annual report,” ). this special stress was relaxed in the second season, when monday matinée performances were added to which the press was now officially invited. from the beginning, however, the implication was that the stage society could be both selective and open to all interested theatergoers. although subscription forms required two nominations from members, this was little different from the referral system at institutions such as the british library, which to this day requires a letter of reference for entry. but to say that the stage society was open to anyone in london would be a stretch. recounting his years as an aspiring actor, allan wade illustrated the tension between public and insider knowledge: “it was doubtless because i had read some press notices of these performances that i became fired with a desire to become a member of the stage society, and happening to meet one day at a friendly house a brother of frederick whelen, the originator of the society, i asked him to propose me for membership” (memories of the london theatre, ). the stage society could have its coterie and eat it, too. the lists were circulated privately in the society’s annual reports, but their contents were reviewed in the public press. in , one critic observed: “i was afraid the stage society had done for itself when i heard not long ago that it had saved a lot of money, and when i saw by the latest membership list what a number of ‘influential’ people had joined it. 
to become rich and respectable is as fatal to a society as it is to an individual.” this critic recognized the power subscribers wielded over the society’s artistic product: the repertoire. critics inevitably characterized the membership as either too fashionable, or not fashionable enough. ladies’ journals commented on the habiliments of the baronesses and captains’ wives with the breathlessness of red carpet reporters. some columnists remarked on an overabundance of green, apparently due to the natural vegetable dyes favored by socialist dress reformers (cockin, women and theatre, ). (though the stage society chairman and several dramatists served on the fabian executive committee, a comparison of lists from suggests that only around percent of members were registered fabians.) in , the society created a minor fashion scandal by instituting a policy that asked ladies to remove their matinée hats because they disrupted audience sightlines. more importantly, fashion was seen both to reflect and dictate the repertoire. in a review, the scots pictorial wondered: why the faculty . . . of seeing beauty only in the hideous and the unclean side of writing and acting, should also have taken away all nice taste in the matter of clothes. the majority of the playgoers were women, but there were not a dozen well-dressed women in the theatre. the remainder were drab and dingy, and every second woman among them seemed to be wearing spectacles. women playgoers had become dramatis personae. although members were allowed to bring a guest (subject to availability), the popular press amplified the collectivizing gesture of the subscription lists, and reported on subscribers as a unified coterie, whether fashionable or unfashionable, serious or unserious. one such guest included the impressionable, if fictional, heroine of h. g. wells’s novel ann veronica, who attends the stage society’s monday afternoon performance of mrs. 
warren’s profession as the companion of her “advanced” friend hetty widgett, and disastrously decides to model her behavior on vivie warren. the stage society’s mostly-female subscribers dictated and reflected a repertoire that the society’s mostly-male dramatists wrote: of plays, only fourteen were by women. when we remember the frequency with which subscription plays migrated onto the public stage, where the ratio of female to male playwrights was no less dismal, we are better able to appreciate the role played by women subscribers in shaping the commercial theater repertoire. subscription ephemera structured critics’ awareness of this role, which meant that the newspaper-reading public knew of it, too. coda: performative codes the two approaches to repertoire spotlighted in this article—what literally gets performed, and how we imagine or represent what gets performed—are stuck in a perpetual feedback loop. so, too, are old and new media. tara mcpherson’s legitimate concerns about converting archives into “post-archival” databases might be even further contextualized by recognizing that the former already contain the latter; any database whose subject is more than a decade or so old once was paper-based. today, databases sometimes promote an anti-materialist tendency precisely opposite to that which led turn-of-the-century theater reformers to compare repertoires to laboratories, museums, and storehouses. it’s worth remembering that scholars have been using reference books—databases avant la lettre, as lev manovich has pointed out—for millennia; like calculators, the digital kinds enable us to count much more quickly. just as a reference book is not yet an argument, neither is a database; both are starting points for posing provocative questions whose answers require the rigorous connecting of dots. like fashion magazines or twitter feeds, databases announce trends easily but have trouble explaining them (figs. , , ). 
why, for example, does the one-act replace the three-act as the dominant play structure just before world war i? though they correspond at the end of the nineteenth century, why over the next sixty years does the number of works that self-describe as “drama” plummet while “play” skyrockets? why are original works at best one-third and at worst one-fifth or less of all works produced on the london stage each year from to ? in short, tracing the influence of subscription societies through the database is merely one of many lines of inquiry, all of which need to be balanced with archival research. to put it another way: quantitative methods yield relative, some might say obvious, observations. they confirm that operas and ballets are revived much more frequently than plays; that musicals and pantomimes run longest; that shakespeare dominates the dramatic repertoire. rather than sketch a history parallel to the rise of so-called “literary” and “artistic” plays based on an alternate performance canon (welcome and necessary though such a history would be), the findings presented here dramatize how quickly avant-garde turned old-guard and how frequently artistic risk returned commercial reward. perhaps repertoire isn’t a representative way of discussing theater history at large: of approximately , unique stage works, nearly , (or percent) were never revived; of those, around , (or percent) were performed just one time. in this way, franco moretti’s “slaughterhouse” of eighteenth- and nineteenth-century novels equally applies to the modern theater. but while databases might seem to privilege the long-running or the most-revived, they also make it easier to find needles in the play-stack: the handful of plays that feature a pregnant woman, or the thousand more that feature a domestic servant. 
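The revival counts above come down to grouping production records by work and counting. A minimal sketch of that kind of query follows, assuming a hypothetical `productions` table with one row per production (`work_id`, `year`, `performances`) and using Python's built-in `sqlite3` module as a stand-in for the project's MySQL database. The table, the column names, and the sample rows are illustrative assumptions, not the actual schema or data.

```python
import sqlite3


def revival_summary(conn):
    """Return (total works, never-revived works, works performed exactly once)."""
    row = conn.execute(
        """
        SELECT COUNT(*),
               COALESCE(SUM(n_productions = 1), 0),
               COALESCE(SUM(n_productions = 1 AND total_performances = 1), 0)
        FROM (
            SELECT work_id,
                   COUNT(*) AS n_productions,
                   SUM(performances) AS total_performances
            FROM productions
            GROUP BY work_id
        )
        """
    ).fetchone()
    return tuple(row)


# Hypothetical sample data: four works, one of them revived twice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE productions (work_id INTEGER, year INTEGER, performances INTEGER)"
)
conn.executemany(
    "INSERT INTO productions VALUES (?, ?, ?)",
    [
        (1, 1905, 2), (1, 1907, 176), (1, 1911, 30),  # three productions: revived
        (2, 1908, 1),   # single production, single performance
        (3, 1909, 12),  # single production, short run
        (4, 1910, 1),   # single production, single performance
    ],
)

total, never_revived, one_off = revival_summary(conn)
print(total, never_revived, one_off)
print(f"{never_revived / total:.0%} of works never revived")
```

Treating "never revived" as "exactly one production on record" is itself an assumption; a real query would need to decide how to handle transfers, touring returns, and retitled works.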
lists of familiar plays encompass lists of unfamiliar players: databases cast their net beyond , production titles to the over , persons who brought them to life, none more promiscuous than william clarkson, for example, who provided the wigs for more than , productions. and then there are the playgoers: this article has tried to suggest that any discussion of repertoire ultimately leads to a discussion of audience, whose names might not figure in a london stage database, but whose imprint can’t help getting counted. for modernist studies more generally, quantitative methods could help to further expand the relatively small canon of artists who have traditionally anchored the field by shifting from discourses of autonomous production to those of collective reception. that the most-performed plays are rarely the most revived suggests trade-offs inherent to competing kinds of ephemerality determined by the audience: long runs over a relatively short period of time, or short runs over a relatively long period of time. my approach to repertoire, focalized through plays that were introduced by a self-consciously literary avant-garde and that also are most likely to show up in twenty-first-century drama anthologies, might seem antithetical to diana taylor’s widely recognized definition. she distinguishes “between the archive of supposedly enduring materials (i.e., texts, documents, buildings, bones) and the so-called ephemeral repertoire of embodied practice/knowledge (i.e., spoken language, dance, sports, ritual).” taylor’s approach would remind us rightly, for instance, that man and superman’s first performance did not include the third act, don juan in hell, which received four stand-alone productions between and ; the play was not performed with all four acts until , and after that only occasionally, so it is not quite accurate to say that the play was produced eighteen times before . 
even so, here i also invest in ephemeral repertoire: the repertoire performatively assembled by material ephemera. though theater researchers have long mined archives for textual nuggets (the proper nouns of the event; the pearled strings of a future digital database), we have thought much less about how theatergoers interacted with such fugitive print matter. this kind of approach would mean dusting off ephemera in order to process what book historians call bibliographic and what we might well call performative codes, asking how layout, typography, ink color, and paper weight, along with distribution and circulation, condition the sociability of theatergoing. it might include studying theater tickets that were embossed to resemble wedding invitations or playbills that were printed with blank spaces for the “name of play, the friend or friends you were with, and where you dined after the performance”; it could also include studying the scraps of paper that circulated in the theater with dialogue that had been censored by the lord chamberlain or programs that listed the times of the last trains in order to help provincial playgoers return home. such an approach would recognize the extent to which the performance event, and the process by which we virtually store that event in our mental repertoire, has been and continues to be conditioned by interactive media. research like this should be made easier by the cutting-edge efforts of the abbey theatre dublin and bam to digitize their ephemera; fortunately for scholars, uploading is only the beginning of analyzing. if we’re now ready to count live-tweeting, blogging, and digital images under the umbrella of performance, as sarah bay-cheng has suggested, then why not count old media, too?

figure . the most-produced plays (excluding shakespeare, musical, pantomime) in london, – . asterisk (*) indicates play was performed by a subscription society. data source: wearing, .

figure . while the percentage of one-off operas per decade decreases from – , the percentage of one-off non-musical plays increases. data source: wearing, .

figure . the back and front of a stage society program with previous and future productions listed, . courtesy of houghton library, harvard university.

figure . two pages detailing repertoire from the incorporated stage society annual report, – . courtesy of robert b. haas family arts library, yale university.

figure . why does the one-act replace the three-act as the dominant work structure just before world war i? data source: wearing, .

figure . though they correspond at the end of the nineteenth century, why over the next sixty years does the number of works that self-describe as “drama” plummet while “play” skyrockets? data source: wearing, .

figure . why are original works at best one-third and at worst one-fifth or less of all works produced on the london stage each year from – ? data source: wearing, .

notes

using python and mysql, michael fountaine and i converted j. p. wearing’s multi-volume reference series the london stage into a relational database that can be queried and graphed. wearing lists “play-producing societies”; many self-describe as “association,” “league,” “club,” “guild,” “group,” and “circle.” wearing published the first volume of his reference series in ; an eight-volume second edition was published in . see j. p. wearing, the london stage – : accumulated indexes (lanham: rowman and littlefield, ), – . stage society, “third annual report, – ,” , incorporated stage society archive, gb thm/ / , victoria and albert museum department of theatre and performance, london. quoted in l. w. conolly, “mrs warren’s profession and the lord chamberlain,” shaw: the annual of bernard shaw studies , no. ( ): – , . 
following allardyce nicoll, i use the term “commercial” to encompass all performances open to the paying public. this includes productions at west end establishments such as drury lane and the haymarket, as well as at more self-consciously experimental theaters such as the court, the hampstead everyman, and the lyric in hammersmith. as nicoll writes: “all of [these playhouses] were, in their own ways, commercial” (english drama, – : the beginnings of the modern period [cambridge: cambridge university press, ], – ). debra caplan, “notes from the frontier: digital scholarship and the future of theatre studies,” theatre journal , no. ( ): – , . tracy davis considers societies to be “not-for-profit schemes” that were, “with one exception [a subscription theater from ] limited to the latter part of the victorian period and the edwardian era” (the economics of the british stage, – [cambridge: cambridge university press, ], ). lawrence switzky, “shaw among the modernists,” shaw: the annual of bernard shaw studies , no. ( ): – ; toril moi, henrik ibsen and the birth of modernism: art, theatre, philosophy (oxford: oxford university press, ); davis, the economics of the british stage, , – ; claire cochrane, twentieth-century british theatre: industry, art and empire (cambridge: cambridge university press, ), – , , . jean chothia, andré antoine (cambridge: cambridge university press, ), xv. olga taxidou, modernism and performance: jarry to brecht (new york: palgrave macmillan, ), . see also claire warden, british avant-garde theatre (london: palgrave macmillan, ). w. b. 
worthen, print and the poetics of modern drama (cambridge: cambridge university press, ); martin puchner, stage fright: modernism, anti-theatricality, and drama (baltimore: johns hopkins university press, ); jennifer buckley, “the bühnenkunstwerk and the book: lothar schreyer’s theater notation,” modernism/modernity , no. ( ): – . for a longer history of the page-stage conflict, see julie stone peters, theatre of the book: – (oxford: oxford university press, ), – . to the extent that this article focuses on institutions, i am indebted to lawrence rainey’s pioneering work in institutions of modernism: literary elites and public culture (new haven: yale university press, ). any study of avant-garde reception owes a debt to mark morrisson, who in an extended footnote ethnographically assesses the egoist’s “solidly middle class” subscription lists (the public face of modernism: little magazines, audiences, and reception, – [madison: university of wisconsin press, ], – ). huntly carter, the theatre of max reinhardt (london: palmer, ), ; elizabeth miller, slow print: literary radicalism and late victorian print culture (stanford: stanford university press, ), . see ruth hoberman, museum trouble: edwardian fiction and the emergence of modernism (charlottesville: university of virginia press, ); jeremy braddock, collecting as modernist practice (baltimore: johns hopkins university press, ); paul k. saint-amour, tense future: modernism, total war, encyclopedic form (oxford: oxford university press, ). 
christopher balme has called for performance scholars to reevaluate ephemera on the grounds “that theatre is dependent on forms of communication beyond the exchange of libidinal energies between performers and spectators” (the theatrical public sphere [cambridge: cambridge university press, ], ). jacky bratton has remarked that theatrical ephemera such as playbills are “a very unimaginatively used resource” (new readings in theatre history [cambridge: cambridge university press, ], ). tiffany stern has argued for the importance of theatrical ephemera within an early modern context in documents of performance in early modern england (cambridge: cambridge university press, ). rebecca schneider, performing remains: art and war in times of theatrical reenactment (london: routledge, ), . for a case study of ubu, see thomas postlewait, “cultural histories: the case of alfred jarry’s ubu roi,” in the cambridge introduction to theatre historiography (cambridge: cambridge university press, ), – . sarah bay-cheng, “theater is media: some principles for a digital historiography of performance,” theater , no. ( ): – , . see sharon marcus, “the theatrical scrapbook,” theatre survey , no. ( ): – . c. g. compton, “a subsidised theatre,” this week’s survey, april , , . “cocqcigrues” are mythical french monsters; the expression “the coming of the cocqcigrues” is akin to “when pigs fly.” see arthur bingham walkley, “new theatrical demands,” times literary supplement, january , , . penny farfan, women, modernism, and performance (cambridge: cambridge university press, ), . l. haden guest, “towards a dramatic renascence ii,” the new age , no. 
( ): – , . kenneth m. price, “edition, project, database, archive, thematic research collection: what’s in a name?,” digital humanities quarterly , no. ( ): np, digitalhumanities.org. maggie haberman and richard pérez-peña, “donald trump sets off a furor with call to register muslims in the u.s.,” the new york times, november , , nytimes.com/ / / /us/politics/donald-trump-sets-off-a-furor-with-call-to-register-muslims-in-the-us.html. on the materiality of new media, see matthew g. kirschenbaum, mechanisms: new media and the forensic imagination (cambridge, ma: mit press, ), – . neil postman, amusing ourselves to death: public discourse in the age of show business (new york: penguin books, ). david z. saltz, “performing arts,” in a companion to digital humanities, ed. susan schreibman, ray siemens, and john unsworth (malden: blackwell, ), – , . sue-ellen case, performing science and the virtual (new york: routledge, ), . steve dixon, digital performance: a history of new media in theater, dance, performance art, and installation (cambridge: mit press, ), . n. katherine hayles, how we became posthuman: virtual bodies in cybernetics, literature, and informatics (chicago: university of chicago press, ), . although most new english plays staged after were published, the rise of cheap acting editions in the victorian era decoupled literary value from dramatic publishing. see john russell stephens, the profession of the playwright: british theatre – ( ; rpt., cambridge: cambridge university press, ), – . william archer, “about the theatre. the censorship: rejected remedies,” tribune, november , . 
henry arthur jones, “preface to saints and sinners,” in the renascence of the english drama (london: macmillan, ), . this comes from allan wade, who was repeating one of many purported requirements for a public performance, none of which courts ever explicitly stated. see memories of the london theatre, – , ed. alan andrews (london: society for theatre research, ), – . for more on copyright performances, see derek miller, “performative performances: a history and theory of the ‘copyright performance,’” theatre journal , no. ( ): – . “the stage society,” court circular, october , . a full summary of the burlesque (dull monotony by gilbert cannan), which took for its structure the plot of john galsworthy’s miners’ strike drama strife ( ), can be found in “a midnight play,” evening standard, may , . j. p. wearing observes that “the percentage of contemporary dramas produced in – is greatly inflated by largely ephemeral, short pieces produced for special occasions, whereas that percentage in the s is derived from plays (both short and full-length) which ran for a substantial number of performances. what we see in the s is the firm establishment of the modern practice of staging a long run of a new play” (“the london west end theatre in the s,” educational theatre journal , no. [ ]: – , ). “the stage society,” era, september , . the independent theatre society produced ghosts in ; the pioneers produced on the side of angels in . david kurnick, empty houses: theatrical failure and the novel (princeton: princeton university press, ). “‘exiles’ by james joyce,” observer, february , , . exiles was rejected in and the widowing of mrs. 
holroyd in . for a thorough discussion of the theatrical output of conrad, see richard j. hand, the theatre of joseph conrad: reconstructed fictions (new york: palgrave, ). for lawrence, see james moran, the theatre of d. h. lawrence: dramatic modernist and theatrical innovator (london: bloomsbury publishing, ). and for joyce, see john macnicholas, “the stage history of ‘exiles,’” james joyce quarterly , no. ( ): – . j. t. grein, the world of the theatre: impressions and memoirs, march – (london: william heinemann, ), . the stage society premiere ultimately was extended to the public by barker-vedrenne as part of their matinée series. eight of these man and superman revivals were by the macdona players, who specialized in shaw. plays like hindle wakes and journey’s end were also revived in private theater clubs. like play-producing societies, private theater clubs were not subject to pre-performance                                                                                                                                                                                                                                                                                                                                                           censorship; unlike subscription societies, private theater clubs had permanent venues. private theater clubs emerged in the s and included the gate theatre studio, the arts theatre club, the new lindsey theatre club, the watergate club, the torch, and the new lyric club. of these clubs, wearing’s main calendar includes only the arts theatre club. for more on private theater clubs, see david thomas, david carlton, and anne etienne, theatre censorship: from walpole to wilson (oxford: oxford university press, ), – . the use of the term “programme” to mean a plan of proceedings that may or may not have been printed dates to the middle of the nineteenth century (oed online, march , s.v., “programme, n.,” ). 
the famous barker-vedrenne court seasons owed their existence to the stage society. as ashley dukes observed: “the stage society, now in its eleventh year, has a finer record than any other society of its kind in europe. by giving new dramatists a hearing it made the court theatre under the vedrenne-barker management possible” (“drama,” the new age , no. [ )]: – ). for more, see desmond maccarthy, the court theatre, – (london: a. h. bullen, ). james woodfield, english theatre in transition, – (london: croom helm, ), . eleven of these everyman revivals were at the old vic. as a case in point, the glasgow repertory theatre is better remembered for the first british production of anton chekhov’s the seagull ( ) than for j. a. ferguson’s campbell of kilmohr ( ); the company also synchronized with london to premiere john galsworthy’s strife ( ), coming just under the wire, over the wire—as founder alfred wareing claimed: “at the end of every act i telegraphed to mr. galsworthy in london the reception the play received in glasgow, so that he knew it was a big success in scotland before the prolonged                                                                                                                                                                                                                                                                                                                                                           cheering which greeted it in london confirmed the judgment” (“state of the drama,” the globe, july , ). some of this is attributable to a postwar lengthening of successful runs for new plays. even more extreme, we would not want to characterize the repertoire of today’s london stage as dominated by agatha christie’s mousetrap ( ), with over , performances since . quoted in george rowell and tony jackson, the repertory movement: a history of regional theatre in britain (cambridge: cambridge university press, ), . 
stage society actors were paid one to three guineas for one to three weeks of rehearsal and two performances (woodfield, english theatre in transition, ). jerome klapka jerome, on the stage—and off: the brief career of a would-be actor (london: field and tuer, ), . tracy davis has argued for an “associational, polytextual, intertheatrically citational” conception of repertoire in the nineteenth-century theater, and she observes that this “transmitted least well on the page” (“introduction: repertoire,” in the broadview anthology of nineteenth-century british performance, ed. tracy c. davis [peterborough, on: broadview press, ], – , ). ballet had been a regular feature of the opera in england since the eighteenth century, but does not appear as a stand-alone genre in the database until the production of les deux pigeons with music by andré messager and choreography by f. ambrosiny. though there was an eighteenth-century opera canon in england due to the importation of italian opera, according to emanuele senici “[w]hereas during the decade – three- quarters of the operas were performed for one season only, forty years later ( – ) the number was down to about half” (quoted in jennifer hall-witt, fashionable acts: opera and                                                                                                                                                                                                                                                                                                                                                           elite culture in london, – [durham: university of new hampshire press, ], – ). grein modeled his repertoire on andré antoine’s théâtre libre ( – ) and otto brahm’s freie bühne ( – ); essentially, he sought to introduce new english plays into the continental repertoire. grein even included “(théâtre libre)” in small type beneath the prospectus title. prospectus reproduced in alice grein [michael orme], j. t. 
grein: the story of a pioneer, – (london: j. murray, ), . “stage society invitational circular,” july , , incorporated stage society archive, gb thm/ / , victoria and albert museum department of theatre and performance, london. early productions were carried out without costumes or scenery. quoted in katharine cockin, women and theatre in the age of suffrage: the pioneer players, – (new york: palgrave, ), . a prospectus is a document that advertises or describes an enterprise in order to attract investors. the prospectus first emerged among publishers marketing books, since john minsheu’s ductor in lingua ( ). see maurice rickards and michael twyman, “prospectus,” in the encyclopedia of ephemera: a guide to the fragmentary documents of everyday life for the collector, curator, and historian (new york: routledge, ), . dennis kennedy has remarked on the programs’ “seriousness of purpose,” which contrasted with cluttered commercial-theater programs (“the new drama and the new audience,” in the edwardian theatre: essays on performance and the stage, ed. michael r. booth and joel h. kaplan [cambridge: cambridge university press, ], – , ). the printed annual account first appeared in the late eighteenth century with the rise of organized charities, followed shortly by local authority institutions such as poor-law                                                                                                                                                                                                                                                                                                                                                           “unions,” schools, workhouses, lunatic asylums, prisons, and hospitals. in the middle of the nineteenth century in britain and the united states, it became legally binding on all public companies to publish formally edited accounts; see rickards and twyman, the encyclopedia of ephemera, s.v. 
“accounts, institutional, n.” (new york: routledge, ), . with an accountant as treasurer, only in – did the society show a deficit, when the income was £ , . . and the expenditure £ , . . . for more on the pioneer players, see cockin, suffrage, . “theatrical notes,” pall mall gazette, september , , . ashley dukes, “the repertory theatres,” poetry and drama ( ): . william archer, “study and stage,” morning leader, january , . “the ballot for seats,” the stage society news ( ): . a guinea was £ , s; £ was approximately a quarter of a lower clerk or shopkeeper’s weekly income. see helen c. long, the edwardian house: the middle-class home in britain – (manchester: manchester university press, ), . for example, during the joint select committee hearings on theater censorship, the liberal mp lord ribbesdale remarked: “my point is that because [the public] know that there is a censorship they know that plays will be of a kind that they can take their young ladies to see” (report from the joint select committee of the house of lords and the house of commons on the stage plays [censorship] [london: wyman and sons, ], ). incorporated stage society articles of association, july , board of trade: companies registration office: files of dissolved companies, bt / / , the national archives, kew. h. hamilton fyfe, “the stage society’s decline—‘hannele’ by the play actors,” the world, april , .                                                                                                                                                                                                                                                                                                                                                           
stage society sixth annual report, , incorporated stage society archive, gb thm/ / , victoria and albert museum department of theatre and performance, london; private list of members of the fabian society, september , fabian society archive, gb fabian society/c/ / , item , british library of political and economic science, london school of economics, london. “the stage society: an impression,” scots pictorial, march , . tara mcpherson, “post-archive: the humanities, the archive, and the database,” in between humanities and the digital, ed. patrik svensson and david theo goldberg (cambridge, ma: mit press, ), – . lev manovich, the language of new media (cambridge, ma: mit press, ), . franco moretti, “the slaughterhouse of literature,” mlq: modern language quarterly , no. ( ): – . to describe two participant-generated queries when i demoed the database at the modernist studies association conference. diana taylor, the archive and the repertoire: performing cultural memory in the americas (durham: duke university press, ), . andré antoine printed the invitational tickets to his théâtre libre as wedding invitations (chothia, andré antoine, ). william archer printed the censored passages of his translation of edvard brandes a visit ( ) and distributed them in the theater. see j. p. wearing, the london stage – : a calendar of productions, performers, and personnel, nd ed. (lanham: rowman and littlefield, ), . the abbey theatre programs included the schedule of the last trains and trams. 
see irish national theatre society, “programme,” december , –january , , george roberts papers concerning the abbey theatre and the irish national theatre society, – , ms thr , harvard theatre collection, houghton library, harvard university. sarah bay-cheng, “pixelated memories: theatre history and digital historiography,” academia.edu/ /pixelated_memories_theatre_history_and_digital_historiography/.

development and evaluation of a nutrition transition-ffq for adolescents in south india

nida i shaikh*, jennifer k frediani, usha ramakrishnan, shailaja s patil, kathryn m yount, reynaldo martorell, km venkat narayan and solveig a cunningham

doctoral program in nutrition and health sciences, laney graduate school, emory university, clifton road ne, -j, atlanta, ga , usa
emory college, center for the study of human health, emory university, atlanta, ga, usa
hubert department of global health, emory university, atlanta, ga, usa
department of community medicine, shri. b.m. patil medical college, blde university, vijayapura, india
department of sociology, emory university, atlanta, ga, usa

submitted april : final revision received october : accepted october : first published online january

abstract

objective: to develop and evaluate a nutrition transition-ffq (nt-ffq) to measure nutrition transition among adolescents in south india.

design: we developed an interviewer-administered nt-ffq comprising a -item semi-quantitative ffq and a twenty-seven-item eating behaviour survey.
the reproducibility and validity of the nt-ffq were assessed using spearman correlations, intra-class correlation coefficients (icc), and levels of agreement using bland–altman and cross-classification over months (nt-ffq and nt-ffq ). validity of foods was evaluated against three -h dietary recalls ( -hr). face validity of eating behaviours was evaluated through semi-structured cognitive interviews. the reproducibility of eating behaviours was assessed using weighted kappa (κw) and cross-classification analyses.

setting: vijayapura, india.

subjects: a representative sample of adolescents aged – years.

results: reproducibility of nt-ffq: spearman correlations ranged from · (pulses) to · (red meat) and icc from · (fruits) to · (tea). on average, concordance (agreement) was % and discordance was % for food groups. for eating behaviours, κw ranged from · (eating snacks while watching television) to · (eating lunch at home) with a mean of · . validity of nt-ffq: spearman correlations ranged from · (fried traditional foods) to · (tea) and icc ranged from · (healthy global foods) to · (grains). the concordance and discordance were % and %, respectively. bland–altman plots showed acceptable agreement between nt-ffq and -hr. the eating behaviours had acceptable face validity.

conclusions: the nt-ffq has good reproducibility and acceptable validity for food intake and eating behaviours. the nt-ffq can quantify the nutrition transition among indian adolescents.

keywords: ffq, adolescents, reproducibility, validity, nutrition transition

globalization, urbanization and economic growth in low- and middle-income countries including india are associated with the nutrition transition – the phenomenon hypothesized to encompass shifts in dietary patterns, eating behaviours and physical activity patterns( , ). concomitant with the nutrition transition, obesity and other chronic diseases have emerged among adults, adolescents and children across social strata( – ).
validated dietary assessment instruments could measure the extent of nutrition transition through food intake and eating behaviours associated with it. typical dietary assessment instruments such as ffq and h dietary recalls ( -hr) have been used to determine food consumption and nutritional status( ). however, there is no ffq developed to assess the nutrition transition. in addition, with the availability of and accessibility to global or non-local foods and beverages through globalizing food markets, the diets of individuals are likely to include both global and traditional items. studies have reported behaviours that may be part of the nutrition transition, including eating outside the home( ) and watching television while eating meals( ). although studies indicate that these trends may be becoming more common, especially in adolescents aged – years( ), currently there is no dietary instrument that quantifies these and other nutrition transition-related food consumption and eating behaviours.

public health nutrition: ( ), – doi: . /s
*corresponding author: email nida.shaikh@emory.edu © the authors
downloaded from https://www.cambridge.org/core. apr at : : , subject to the cambridge core terms of use.

at the forefront of social change and global trends( ), adolescents in low- and middle-income countries may be experiencing nutrition transition-related shifts in food consumption and eating behaviours. in india, adolescents comprise one-fifth of the population (~ million)( ), of which % are underweight and % are overweight or obese( ). a few ffq have been used in epidemiological studies among adolescents in india( – ), but the validity and reproducibility of these ffq have not been documented. the -hr method has been used by the national nutrition monitoring bureau of india to assess periodically the nutritional status of adolescents in selected states( ).
unlike ffq, -hr are not representative of long-term food intake( ) and are also not as practical and cost-effective( , ). the lack of validated dietary instruments limits the information known about the nature of dietary changes that may be occurring among adolescents in india. assessing the dietary changes that are part of the nutrition transition requires validated dietary instruments to measure long-term trends and changes, not only in food intake but also in eating behaviours.

the objective of the present study was to develop and evaluate the validity and reproducibility of a nutrition transition-ffq (nt-ffq) to measure nutrition transition-related food consumption and eating behaviours among adolescents aged – years in south india.

methods

setting

the study was carried out from june to january in vijayapura in karnataka, india. vijayapura is a mid-sized city (population ) located in karnataka in a district which is categorized as economically underdeveloped but is urbanizing as a result of the major economic growth of its small-scale industries, including agriculture, and its large-scale industries, including sugar and textiles( ). vijayapura serves as a prototype mid-sized indian city that is underdeveloped but undergoing urbanization and experiencing exposure to non-local and global trends( ). the institutional review board at emory university, atlanta, ga, usa and the institutional ethical committee at blde university, vijayapura, india approved the study.

interviewer recruitment and training

twelve field interviewers proficient in english and the local language, kannada, were recruited and trained to administer the nt-ffq and -hr and to obtain the written informed consent from the adolescents’ caregivers and assent from participants. mock interview sessions were conducted in kannada prior to field testing of the instrument to ensure that interviewers were familiar with food items and to ensure uniformity in the data collection techniques.
development of the nt-ffq

qualitative fieldwork to identify food items

the nt-ffq was developed using a sequenced mixed-methods approach including formative qualitative fieldwork. the nt-ffq comprised a -item semi-quantitative ffq section that measured food consumption over a month and a twenty-seven-item eating behaviour section that quantified eating behaviours over a week. the food items listed in the nt-ffq were built using two methods: (i) identification of food items, including the commonly consumed local, regional and global foods, from our previous -hr from adolescents; and (ii) written freelists of the most commonly available foods and beverages in stores identified by a purposive sample of adolescents (n ) aged – years. freelisting, an elicitation technique, involves asking individuals to list all the items that they can think of for a given cultural domain( ). a cultural domain is a collection of items related in the minds of informants, which helps them label, interpret and understand items in their lives. through these two methods, the food list was built and categorized into ten food groups based on their ingredients and preparation methods: (i) global foods; (ii) snack foods; (iii) non-vegetarian foods; (iv) sweets and desserts; (v) dairy; (vi) beverages; (vii) fruits and seasonal fruits; (viii) vegetables; (ix) traditional foods; and (x) miscellaneous foods.

for the eating behaviour section of the nt-ffq instrument, questions were developed on eating behaviours associated in the literature with the nutrition transition( , ). the eating behaviours included the adolescent’s practice of eating at friends’ homes or vice versa, the frequency of eating meals at home v. away from home (e.g. at restaurants and at the home of a friend or family member) and the frequency of watching television while eating meals.
frequency response section of the -item nt-ffq

the nt-ffq had eleven frequency categories for food consumption over month, from several times per day to never (see online supplementary material, table s ). the intake of seasonal fruits was asked over a -month period, which is the length of a typical season for most seasonal fruits in india. an additional column beside these frequency categories was included to record whether the participant believed that his or her consumption of each food had increased, decreased or remained the same in the past months. this additional column was tailored from a previous study where participants were asked to report if the intake of a food item had greatly increased or decreased during the past years( ). this information was added to the traditional ffq format to capture dietary changes that may be occurring as part of the nutrition transition.

frequency response section of the twenty-seven-item eating behaviour section

of the twenty-seven eating behaviour questions, eighteen questions had the response categories of daily, – d/week, – d/week and never. one question asked participants if they consumed dietary supplements in the past year. the remaining eight questions asked participants about their perceptions of the novelty of eating habits using a -point likert scale with the options ‘totally traditional’, ‘somewhat traditional’, ‘somewhat modern’ and ‘totally modern’. these eight questions included the perception of a family with a working mother and the perceptions of eating home-cooked food, outside food, eggs, meat and bread, and drinking milk and fruit juice.

nt-ffq portion size

we specified a standard serving size for each food (e.g.
slices of bread, glass of milk) and included an additional column for participants to report the portion size they consumed if different from the one specified. the standard serving size was determined using those listed in the dietary guidelines for indians( ) and those listed most frequently in our previous -hr. portion size estimates were based on household utensils including cups, spoons and natural units (e.g. small, medium and large size for fruits, etc.). circular food models were constructed to measure traditional indian breads including chapati and puri that are typically consumed in varying sizes including small, medium or large (see online supplementary material, fig. s ). given that packaged foods including chips, chocolates and popcorn were available in several sizes, participants were asked to report both the quantity consumed over month and the cost of one unit of that item (e.g. cost of bar of chocolate). a database of packaged foods was developed to include the items’ cost, weight and nutritional information from the food label. the weight of food in grams was determined according to the cost of the packaged foods and quantity eaten (e.g. bar of chocolate).

pre-testing the nt-ffq

the initial nt-ffq included items and was rigorously pre-tested using cognitive interviews among a purposive sample of five adolescents aged – years attending a private school in vijayapura, india. in the pre-test, adolescents were asked to report their usual intake of foods and beverages over a -month period. using established guidelines( ), the nt-ffq was tested for content including the clarity of the meaning of food names, portion-size descriptions, unfamiliar food items, length of the instrument and ease of administration. cognitive interviews have been shown to help identify cognitive problems in dietary questionnaires and improve the accuracy of ffq( , ).
we observed that adolescents found it difficult to recall food consumption over a -month period and reported that it would be easier to recall food intake over a -month period, except for seasonal fruits. in addition, adolescents found twenty-nine items unfamiliar (e.g. tofu). using this feedback, the nt-ffq was revised to items with a -month reference period. the revised -item nt-ffq was pre-tested again in english and kannada among a convenience sample of sixteen adolescents of the same age attending a private school and a public school, without encountering additional problems. the nt-ffq was finalized at items to assess food intake over a -month period. on average, it took – min to administer the instrument.

evaluation of the nt-ffq

study population

the reproducibility and validity of the nt-ffq were evaluated among adolescents aged – years. these adolescents were interviewed as a part of the follow-up of a longitudinal study of adolescents who participated in the home environment and adolescent body weight study in vijayapura( ). in the baseline study, a representative stratified random school-based sample of adolescents was drawn from three public and three private schools in vijayapura.

design of the evaluation study

the nt-ffq was administered during home visits at baseline (nt-ffq ) and months later (nt-ffq ) to adolescents. two adolescents were lost to follow-up during the second administration of nt-ffq, yielding a final sample of participants. a sub-sample of ninety-seven adolescents completed three additional interviewer-administered -hr over the -month period as shown in fig. . the interviewers used the multi-pass method for all -hr. for each participant, two of the three -hr were taken on a weekday and one was taken on the weekend.

statistical methods

data from the nt-ffq were transformed into daily intake of each food (g/d) and beverage (ml/d).
the daily intake was calculated by multiplying the specified portion unit by the frequency of intake, using the following values for reported frequencies: more than times/d = ; twice daily = ; once daily = ; – times/week = · ; – times/week = · ; once weekly = · ; – times/month = · ; monthly = · ; less than once monthly = · ; and never eaten or don’t know the food = . the foods in the nt-ffq were collapsed to twenty-one meaningful food groups based on nutrient content as shown in the online supplementary material, table s .

the reproducibility and validity of the nt-ffq were assessed using a food-based approach, as done elsewhere( , – ). the reproducibility of the nt-ffq for foods and food groups was assessed at baseline (nt-ffq ) and after months (nt-ffq ) using spearman and pearson correlation coefficients, intra-class correlation coefficients (icc) and cross-classification of food group intakes into tertiles. the measures of agreement or cross-classification were calculated using the percentage of participants in the same (concordance), adjacent and extreme (discordance) tertiles of food intake by both nt-ffq. the reproducibility of the eating behaviour questions was assessed using weighted kappa (κw) statistics and cross-classification analyses.

the validity of the ffq portion of the nt-ffq was assessed, in a sub-sample of ninety-seven adolescents, by comparing the intakes of twenty-one food groups from the nt-ffq with the average intakes from the three -hr. for each individual in the validation study, the daily intakes of foods consumed during each of the three -hr were computed and used to calculate the mean daily intakes of foods and food groups from the three -hr.
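The frequency-to-daily-intake conversion described in the statistical methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the dictionary `TIMES_PER_DAY` are hypothetical, and only frequency categories whose daily value follows directly from the label are included. The weights the study used for range categories are not recoverable from this text, so they are deliberately left out rather than guessed.

```python
# Illustrative weights in "times per day". Assumption: only categories whose
# value follows from the label itself are listed; the study's weights for
# range categories (e.g. several times per week) are not reproduced here.
TIMES_PER_DAY = {
    "twice daily": 2.0,
    "once daily": 1.0,
    "once weekly": 1.0 / 7.0,
    "never": 0.0,
}


def daily_intake(portion_size, n_portions, frequency):
    """Return daily intake in the unit of portion_size (g/d or ml/d)."""
    if frequency not in TIMES_PER_DAY:
        raise ValueError(f"unknown frequency category: {frequency!r}")
    return portion_size * n_portions * TIMES_PER_DAY[frequency]


def food_group_intake(records):
    """Sum daily intakes over (portion_size, n_portions, frequency) records."""
    return sum(daily_intake(*record) for record in records)


# Hypothetical example: one 200 ml glass daily plus two 70 ml servings weekly.
dairy_ml_per_day = food_group_intake([
    (200.0, 1, "once daily"),
    (70.0, 2, "once weekly"),
])
```

Summing item-level intakes within a group mirrors the step in which individual foods are collapsed into food groups before correlations and cross-classification are computed.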
the mixed dishes from the -hr were divided into their components and allocated to the appropriate food items of the questionnaire as would routinely be done in the analysis of mixed dishes( ). spearman and pearson correlation coefficients were used to measure the strength of the relationship between food and food group intakes estimated by nt-ffq and the -hr. the relative agreement between nt-ffq and the average of the three -hr was tested by cross-classification of the food group intakes and estimation of the proportion of participants who were classified by the two methods into the same tertile (concordance) and extreme tertiles (discordance). to assess the ‘limits of agreement’ between nt-ffq and the average of three -hr, the bland–altman method was performed for each of the food groups. the differences in intake between the two methods were plotted against the mean intakes of the two instruments for each food group. these estimates were analysed using the statistical software package sas® version . . p values are two-sided and deemed significant at · .

the face validity of the twenty-seven-item eating behaviour section in the nt-ffq was evaluated through semi-structured cognitive interviews using paraphrasing and response latency. to evaluate the face validity of the eating behaviour questions, a convenience sample of thirty adolescents aged – years were selected from one public school and one private school in vijayapura. trained interviewers administered the semi-structured cognitive interviews at the home of the participant. to assess paraphrasing, the interviewers elicited a response from the participant and probed for the meaning of each question to ensure consistency with the intent of the question. response latency was assessed through the time taken to answer each question. in addition, at the end of the interview, the participants were asked if they preferred reporting frequency of eating behaviours over a month instead of over a week.
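The two agreement measures used in the methods, tertile cross-classification and bland–altman limits of agreement, can be sketched in plain Python. This is an illustrative sketch only: the study reports using sas, the function names are hypothetical, tertiles are assigned by rank with ties broken by input order, and the limits are taken as the mean difference plus or minus 1.96 standard deviations, which is the conventional definition and an assumption here because the text does not state the multiplier.

```python
from statistics import mean, stdev


def tertile_labels(values):
    """Assign each value to tertile 0, 1 or 2 by rank (ties broken by order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(2, rank * 3 // n)
    return labels


def cross_classify(a, b):
    """Percentage of participants in the same, adjacent and extreme tertiles."""
    ta, tb = tertile_labels(a), tertile_labels(b)
    n = len(a)
    gaps = [abs(x - y) for x, y in zip(ta, tb)]
    return {
        "same": 100.0 * gaps.count(0) / n,      # concordance
        "adjacent": 100.0 * gaps.count(1) / n,
        "extreme": 100.0 * gaps.count(2) / n,   # discordance
    }


def bland_altman(a, b):
    """Mean difference and assumed 95% limits of agreement (mean +/- 1.96 SD)."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical intakes (g/d) for six participants from two instruments.
ffq = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
recall = [12.0, 18.0, 55.0, 41.0, 35.0, 70.0]
agreement = cross_classify(ffq, recall)
bias, lower, upper = bland_altman(ffq, recall)
```

In a bland–altman plot, each participant's difference between the two methods is drawn against the mean of the two methods, with horizontal lines at `bias`, `lower` and `upper`.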
results

demographic characteristics

the demographic characteristics of the participants in the nt-ffq evaluation study are given in table . of the school-going adolescents eligible for analysis, the mean age was · years, % were female and % attended public (government-funded) schools. the intake of each food group based on both administrations of the nt-ffq and the average intake from the three -hr are shown in table . the mean daily intake of most food groups was overestimated by the nt-ffq when compared with the mean daily intake of food groups estimated from the -hr. however, intakes were higher when estimated by -hr for the fried traditional food group, red meat food group, lean meat food group and sugar food group.

[figure: study timeline with two tracks, reproducibility of nt-ffq (n ) and validity of nt-ffq (n ), showing the nt-ffq administrations and the three -hr over the study duration]

fig. design of the reproducibility and validity study to evaluate the nutrition transition-ffq (nt-ffq) among adolescents in south india. data were collected in november –january . the nt-ffq was administered by trained interviewers at homes of adolescents aged – years at baseline (nt-ffq ) and after months (nt-ffq ). a sub-sample of ninety-seven adolescents also completed three interviewer-administered h dietary recalls ( -hr) during the months between nt-ffq and nt-ffq

reproducibility

estimates of the reproducibility and validity of the nt-ffq are given in table . the spearman correlation coefficients for foods ranged from · (cooked lentils) to · (red meat; mean = · ). for food groups, the spearman correlations ranged from · (pulses and nuts) to · (red meat; mean = · ) and icc ranged from · (fruits) to · (tea and coffee; mean = · ).
of the twenty-one food groups, spearman correlation coefficients were ≥ · for sixteen food groups and ≥ · for five food groups. on average, concordance was % and discordance was %. very good concordance (≥ %) was found for lean meat, ghee (clarified butter) and the healthy global food group comprising oats, cereal and multigrain biscuits. the analysis showed good concordance ( – %) for the food groups dairy, tea and coffee, red meat, sugar, soda and energy drinks, unhealthy global foods, eggs, fried snacks, grains, fried traditional foods, breads, processed foods, fruit juices, vegetables, snacks, and sweets and desserts; and fair concordance ( – %) for the food groups fruits and pulses and nuts. the discordance was less than % for all food groups except for fruit juices ( %) and snacks ( %).

for eating behaviours, κw ranged from · (eating snacks while watching television) to · (eating lunch at home) with a mean of · , suggesting moderate agreement (table ). on average, the concordance (exact agreement) was % and discordance (opposite agreement) was %.
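A weighted kappa for a single eating behaviour question answered at both administrations can be sketched as below. The text does not say which weighting scheme was used, so linear disagreement weights are assumed here; the function name and the coding of responses as integers 0 to k-1 are also hypothetical.

```python
from collections import Counter


def weighted_kappa(first, second, n_categories):
    """Linear-weighted Cohen's kappa for two sets of ordinal responses.

    Responses are assumed to be coded as integers 0 .. n_categories - 1.
    """
    n = len(first)
    if n == 0 or n != len(second) or n_categories < 2:
        raise ValueError("need equal-length, non-empty responses and >= 2 categories")

    def weight(i, j):
        # Disagreement weight: 0 for exact agreement, 1 for opposite ends.
        return abs(i - j) / (n_categories - 1)

    pairs = Counter(zip(first, second))
    row_totals = Counter(first)
    col_totals = Counter(second)

    observed = sum(weight(i, j) * c for (i, j), c in pairs.items()) / n
    expected = sum(
        weight(i, j) * row_totals[i] * col_totals[j]
        for i in range(n_categories)
        for j in range(n_categories)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when all responses fall in one category")
    return 1 - observed / expected
```

With this definition, identical responses at both administrations give a kappa of 1, and values near 0 indicate agreement no better than chance.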
concordance ranged from % (practice of eating sweets prepared outside the home and eating sweets prepared at home) to % (practice of eating dinner at home). discordance was ≤ % for twenty-four of the twenty-seven eating behaviour questions. a maximum discordance of % was found for the practice of eating breakfast outside home and for the perception question, from totally traditional to totally modern, of a family with a working mother.

table comparison of food group intakes estimated from the nutrition transition-ffq (nt-ffq) and the average of the three h dietary recalls ( -hr) among adolescents in vijayapura, india
columns: food group†; nt-ffq (g/d)‡, nt-ffq (g/d)‡ and -hr (g/d)§, each reported as mean, se, median and iqr║
energy-dense foods: breads; unhealthy global foods; healthy global foods; processed foods; snacks; fried snacks; fried traditional foods; sweets and desserts
animal-source foods: red meat; lean meat; eggs; dairy
drinks: soda and energy drinks¶; tea and coffee¶; fruit juices¶
traditional foods: fruits; vegetables; pulses and nuts; grains; sugar; ghee
nt-ffq , first administration of the nt-ffq; nt-ffq , second administration of the nt-ffq; iqr, interquartile range. data were collected in november –january .
†for analysis, the items in the nt-ffq were reduced to twenty-one meaningful food groups.
‡a total of adolescents aged – years were in the reproducibility study.
§a sub-sample of ninety-seven adolescents aged – years were in the validity study.
║iqr of – % of the population.
¶data presented as ml/d.

table characteristics of the adolescents (n ) in the evaluation study of the nutrition transition-ffq (nt-ffq) in vijayapura, india†
columns: characteristic; mean or n; se or %
rows: age (years)‡; boy; public school; grade in which studying (x; xi; xii; short-term or diploma course; school or college dropout)
data were collected in november –january .
†data are from the second administration of the nt-ffq (nt-ffq ).
‡age presented as mean and its standard error; all other data presented as number and percentage.

validity

for the -item nt-ffq, spearman correlation coefficients for foods ranged from · (buns) to · (chocolate milk powder; mean = · ) and for food groups ranged from · (fried traditional foods) to · (tea or coffee; mean = · ). the icc for food groups ranged from · (healthy global foods) to · (grains; mean = · ). of the twenty-one food groups, spearman correlation coefficients were ≥ · for five food groups and ≥ · for eleven food groups. comparing the intakes of food groups between the nt-ffq and the -hr, concordance was % and discordance was %.
the agreement analysis revealed very good concordance (≥ %) for the red meat and healthy global food groups; good concordance ( – %) for the food groups tea and coffee, ghee, breads, grains, pulses and nuts, processed foods and eggs; fair concordance ( – %) for vegetables, fruit juices, sweets and desserts, dairy, unhealthy global foods, snacks, fruits, fried snacks, soda and energy drinks, and fried traditional foods; and low concordance ( %) for sugar. the discordance was less than % for all food groups except for the fried snacks ( %) and fried traditional foods ( %). the bland–altman plots showed acceptable agreement for food groups between the nt-ffq and the -hr as shown in fig. .

the twenty-seven-item eating behaviour questions in the nt-ffq were found to have acceptable face validity. through the assessment of paraphrasing, participants were found to be able to understand, explain and repeat the questions in their own words. additionally, all participants reported that it was easier to report eating behaviours over a week as opposed to over a month.

table reproducibility and validity of the nutrition transition-ffq (nt-ffq) among adolescents in vijayapura, india
columns: food group (g/d); for reproducibility† and for validity‡ separately: pearson correlation, spearman correlation, icc, and cross-classification by tertiles (%) as same, adjacent and opposite
energy-dense foods: breads; unhealthy global foods; healthy global foods; processed foods; snacks; fried snacks; fried traditional foods; sweets and desserts
animal-source foods: red meat; lean meat; eggs; dairy
drinks: soda and energy drinks§; tea and coffee§; fruit juices§
traditional foods: fruits; vegetables; pulses and nuts; grains; sugar; ghee
icc, intra-class correlation coefficient. data were collected in november –january .
*p < · , **p < · , ***p < · .
†reproducibility of the nt-ffq is the comparison of nt-ffq (first administration of the nt-ffq) v. nt-ffq (second administration of the nt-ffq). there were adolescents aged – years in the reproducibility study.
‡validity of the nt-ffq is the comparison of nt-ffq v. the average of the three h dietary recalls ( -hr). a sub-sample of ninety-seven adolescents were in the validity study.
§data presented as ml/d.
║spearman and pearson correlations cannot be computed for healthy global foods (validity columns) as there is no reported intake of the global healthy foods in the three -hr.

discussion

changing dietary patterns from the ongoing nutrition transition among adolescents in india have drawn attention to the lack of validated dietary instruments to assess the nutrition transition among this population. to address this gap, we followed a sequenced mixed-methods approach to develop and to evaluate a dietary instrument, the nt-ffq, which can be used to assess the nutrition transition among adolescents in south india. this nt-ffq provides reasonably reproducible and valid estimates for most foods, food groups and eating behaviours in india.
to our knowledge, it is the first validated ffq for adolescents in india and the first validated dietary instrument to assess nutrition transition-related food consumption and eating behaviours. as seen in other studies among adults( , – ) and adolescents( , ), the ffq overestimated intakes relative to the reference method for most food groups. in western settings, correlations in the range of · – · for food intakes are considered acceptable( ). the direct comparison of our study with similar ffq evaluation studies is complicated by the fact that the food groups chosen were not similar across studies. reference instruments also differed between studies( , ).

to the extent that comparisons can be made for evaluating the reproducibility of ffq for individual foods and food groups, spearman correlation coefficients for our study are comparable to those described for other studies( , , , ). in our study, spearman correlation coefficients were ≥ · for five food groups and ≥ · for the remaining sixteen of twenty-one food groups, suggesting that the reproducibility of the nt-ffq was good. the spearman correlation coefficients for the reproducibility of the nt-ffq tended to be higher (≥ · ) for commonly consumed foods than for infrequently consumed foods (< · ), as reported elsewhere( ). salvini et al. reported pearson correlation coefficients > · for % of the foods and > · for % of the foods on a fifty-five-item self-administered ffq completed months apart among women in the nurses’ health study( ). even though pearson correlation coefficients were reported, these were found to be very similar to the spearman correlation coefficients( ). another study among women that compared two nurses’ health study ffq ( ffq version v. ffq version) found that spearman correlation coefficients ranged between · (readymade pie) and · (tea)( ).
in a third study, high reproducibility using spearman correlations (r ≥ · ) was reported for half of the food groups and moderate reproducibility (r < · ) was reported for the other half of the food groups( ). the validation study was carried out among german adults aged – years who completed a -item ffq administered at two intervals, months apart, and -hr at monthly intervals( ). ocke et al. reported spearman correlations for foods and food groups in the range from · to · (median r = · ) on a -item self-administered ffq completed thrice during -month intervals among adults( ). in our study, discordance (extreme tertiles) between the intakes in nt-ffq and nt-ffq was < % (range: · – %) for most food groups except for fruit juices ( %) and snacks ( %). fruit intakes had a low icc of · with % being misclassified. in an ffq validation study among ninety-nine participants interviewed within · years, the concordance (exact agreement) ranged from % (coleslaw) to % (vodka)( ). in another study where an ffq was administered to sixty-three participants at the beginning and the end of months, the concordance (same or adjacent category) was %( ).
table reproducibility of the twenty-seven eating behaviour questions in the nutrition transition-ffq for adolescents in vijayapura, india
columns: eating behaviour†; κw; cross-classification (%)‡ as concordance and discordance
general: eating with friends at home; eating outside food with friends; eating outside food brought home; eating indian sweets made outside home; eating indian sweets made at home; eating fried foods made at home
at home: eating breakfast; eating lunch; eating evening snack; eating dinner
outside home: eating breakfast; eating lunch; eating evening snack; eating dinner
watching television: while eating breakfast; while eating lunch; while eating evening snack; while eating dinner
eating habits traditional or modern§: eating home cooked food; eating outside food; eating eggs; eating meat; eating bread; drinking milk; drinking fruit juice; family with working mother
final row: mean
κw, weighted kappa. data were collected in november –january .
†a total of adolescents aged – years were in the reproducibility study.
‡cross-classification: concordance (exact agreement) and discordance (opposite agreement).
§perceptions of the novelty of eating habits were asked using a -point likert scale with the options ‘totally traditional’, ‘somewhat traditional’, ‘somewhat modern’ and ‘totally modern’.

the validity of the nt-ffq to measure nutrition transition was assessed against the -hr. in the validation of ffq, mean correlation coefficients of · are indicative of good validity between the ffq and the reference methods( ), whereas correlations in the range of · – · are desired between the study and reference methods( ). in the present study, the mean spearman correlation coefficient was · with the correlations ≥ · for seventeen of the twenty-one food groups, indicating fair agreement.
other ffq studies, validated for foods and food groups, have reported correlation coefficients ranging from · to · ( , , , ). in the validation of a -item ffq among middle-aged german adults, spearman correlations for foods and food groups between ffq and -hr showed values between · and · , with most between · and · ( ). four food groups yielded correlations > · , eleven groups showed values between · and · , and the remaining nine food groups yielded correlations < · ( ). in another study that validated a fifty-three-item ffq among german adults aged – years, spearman rank correlations between the ffq and two -hr ranged from · (pizza) to · (tea), with two-thirds of the spearman correlations > · ( ).

the cross-classification of intakes (concordance and discordance) reported in nt-ffq and the three -hr in our study is similar to that reported in other studies( , , ). in a study that evaluated food group intakes from an ffq against a d diet record among flemish children, the concordance (same or adjacent category) was % (meat products) to % (fruit juices) and discordance (opposite category) was < % for all food groups( ). in the paper by bohlscheid-thomas et al., the concordance (exact quintile) ranged from · % for legumes to · % for alcoholic drinks, with most values lying between and %, and discordance (extreme quintile) was < · %( ). as reported by haftenberger et al., the concordance (same or adjacent quartile) ranged between % (cooked vegetables) and % (coffee)( ). similar to our study, other studies reported that discordance was < % for most food groups( , ). the low concordance between nt-ffq and nt-ffq for lentils and pulses may stem from the large number of foods in the group, some of which are eaten more frequently than others; this may make it challenging to estimate the average monthly intake. for instance, in indian households, pigeon pea (tur dal) is typically consumed more frequently than kidney beans (rajma).
fig. bland–altman plots assessing the relative validity of the nutrition transition-ffq (nt-ffq) among adolescents (n ) in vijayapura, india, data collected in november –january . the difference in intake between the second administration of the nt-ffq (nt-ffq ) and the average of the three h dietary recalls ( -hr) is plotted v. the mean intake from the two methods for: (a) processed foods, (b) breads, (c) grains and (d) tea and coffee. —— represents the mean difference (bias) and – – – – – represent the limits of agreement (± sd). [axes: difference in intake (nt-ffq – -hr) against mean intake [(nt-ffq + -hr)/ ], in g/d for panels (a) to (c) and ml/d for panel (d)]

bland–altman plots were generated to visually examine the agreement between the nt-ffq and -hr across the range of intake of food groups. the bland–altman method also allows one to identify bias between the administration of the questionnaires and to see the nature of the bias across the range of intakes( ). as given in fig. , the bland–altman plots of food groups demonstrated good agreement between the nt-ffq and -hr. for all food groups, fewer than % of the participants were outside the limits of agreement.

the present study has certain limitations that need to be taken into account.
first, dietary assessment of children has been shown to have methodological problems relating to their limited knowledge of foods, difficulty in the estimation of frequency of consumed foods and potential response bias( – ). as participants were interviewed at home, often in the presence of one or more family members, a potential response bias may have resulted. however, the reliability of dietary recalls from – -year-olds should not be a concern as numerous studies have reported that by age – years, children can report their food intake as reliably as their parents( , , ). second, the -item nt-ffq took – min to complete and might be considered lengthy, but a response rate of % shows that participants were motivated to participate in the study. the length of the nt-ffq also falls within the acceptable -item limit at which adolescents have been found willing to complete long questionnaires( ). third, the sources of error in ffq have been reported as due to the restriction imposed by a fixed list of foods, seasonal and regional variations in the availability of foods, memory, perception of portion sizes and interpretation of questions( ). however, given the vast variety of foods and beverages and their variations across india, the nt-ffq was developed to capture the salient intakes of global, national and regional items as relevant to adolescents in south india. we expect minimum error in the measurement of seasonal foods, given that the nt-ffq and -hr were administered within the same winter season (november–january), and expect minimum error in the measurement of seasonal fruits as their intake was recorded over the length of a typical season. across the three recorded -hr, participants did not consume twenty-nine of foods listed on the nt-ffq. of the twenty-nine foods, seven were seasonal foods that were not typically consumed during the period of the validation study and the remaining foods were infrequently consumed (once monthly or less) by adolescents.
these twenty-nine foods were likely to be consumed at festivals, in summer or may be relatively new to this region. fourth, the low icc ( · ) for the intake of healthy global foods between the nt-ffq and -hr may be attributed to their infrequent intake ( · g/d or less than once monthly). the healthy global food group includes oats, breakfast cereals and multigrain biscuits, which are not commonly consumed by adolescents in our study. however, some infrequently consumed foods were included in the nt-ffq to capture foods that may be relatively new to this region but may soon become a part of the local food environment. this phenomenon is corroborated in a similar study that reported low probability of eating rarely consumed foods( , ). in order to validate the intake of infrequently consumed foods with better accuracy we would require large calibration studies, as reported elsewhere( ). the documentation of the increasing availability of global, national and regional foods not only in urban regions but also in remote but urbanizing regions is a large part of our work. further research could explore how adolescents’ access to and selection of foods may be influenced by the food marketplace in india.

the present study offers several strengths. first, the estimates of reproducibility and validity of the foods in the nt-ffq were evaluated with a comprehensive range of tests, including correlation coefficients and cross-classification in conjunction with the bland–altman method. the bland–altman method has been preferred over correlation analysis as a method to evaluate the reproducibility and validity of an ffq( ). furthermore, the sample size of the present study was large enough to allow for the estimation of the limits of agreement from the bland–altman analysis as a component of the evaluation of the validity of the nt-ffq. second, the reproducibility of the eating behaviour questions was assessed using cross-classification and κw.
the use of semi-structured cognitive interviews strengthened the validation of the eating behaviour questions. third, the validated nt-ffq can serve as a useful instrument in ranking adolescents according to both food intake and eating behaviours, and can be used in epidemiological research. lastly, the acceptable measures of agreement between the nt-ffq and -hr may be a result of the nt-ffq’s flexibility wherein participants were able to describe a portion size if they did not find a suitable portion size on the questionnaire. this method, where participants are able to describe their own portion size, has been found to provide the highest estimates of correlation coefficients ( · – · ) compared with the method where portion size is specified on the questionnaire (correlation coefficients = · – · ) or when no portion size is specified but the average portion weights are used to compute intakes (correlation coefficients = · – · )( ).

conclusion

globalization and nutrition transition in india have drawn attention to the lack of existing validated dietary instruments to assess the food consumption and eating behaviours of adolescents. to address this gap, we developed and evaluated an nt-ffq for adolescents; this nt-ffq has good reproducibility and acceptable validity for most food groups and eating behaviours. the nt-ffq can be used in epidemiological studies to assess food intakes and eating behaviours associated with the nutrition transition among adolescents in south india. our team is evaluating the nt-ffq among adolescents residing in an urban region in south india. the development of the nt-ffq represents an important and much needed first step that allows us to measure dietary changes and eating behaviours among adolescents in a globalizing society.
acknowledgements

acknowledgements: the authors would like to thank dr veena algur at blde university for her assistance with translation from english to kannada and dr rob o’reilly at the emory center for digital scholarship for his assistance with data merging. the authors would like to thank the field research team and the adolescents for their participation in this study. financial support: this work was supported by the eunice kennedy shriver national institute of child health and human development (nichd; grant number d hd - s ). n.i.s. was supported by the national institutes of health (nih; grant number r tw - ) funded by the fogarty international center and the amy joye memorial research award from the academy of nutrition and dietetics foundation. disclosure: this work is solely the responsibility of the authors and does not necessarily represent the official views of the fogarty international center, nih or the academy of nutrition and dietetics foundation. the fogarty international center, nih and the academy of nutrition and dietetics foundation had no role in the design, analysis or writing of this article. conflict of interest: none. authorship: n.i.s. and s.a.c. formulated the research question; n.i.s., j.k.f., u.r. and s.a.c. designed the study; n.i.s. and s.s.p. carried it out; n.i.s. analysed the data, with interpretative input from all authors; n.i.s. drafted the manuscript; all authors helped to revise the manuscript and approved the final version. ethics of human subject participation: the institutional review board at emory university, atlanta, ga, usa and the institutional ethical committee at blde university, vijayapura, india approved the study.

supplementary material

to view supplementary material for this article, please visit http://dx.doi.org/ . /s

references

. popkin bm, horton s, kim s et al.
( ) trends in diet, nutritional status, and diet-related noncommunicable diseases in china and india: the economic costs of the nutrition transition. nutr rev , – . . popkin bm ( ) the nutrition transition in low-income countries: an emerging crisis. nutr rev , – . . drewnowski a & popkin bm ( ) the nutrition transition: new trends in the global diet. nutr rev , – . . satia ja ( ) dietary acculturation and the nutrition transition: an overview. appl physiol nutr metab , – . . popkin bm & gordon-larsen p ( ) the nutrition transition: worldwide obesity dynamics and their determinants. int j obes relat metab disord , suppl. , s –s . . zingoni c, norris sa, griffiths pl et al. ( ) studying a population undergoing nutrition transition: a practical case study of dietary assessment in urban south african adolescents. ecol food nutr , – . . singhal s, goyle a & gupta r ( ) quantitative food frequency questionnaire and assessment of dietary intake. natl med j india , – . . adair ls & popkin bm ( ) are child eating patterns being transformed globally? obes res , – . . kuriyan r, bhat s, thomas t et al. ( ) television viewing and sleep are associated with overweight among urban and semi-urban south indian children. nutr j , . . dasen p ( ) rapid social change and the turmoil of adolescence: a cross-cultural perspective. int j group tensions , – . . unicef ( ) the state of the world’s children . adolescence: an age of opportunity. new york: unicef. . patel sa, narayan km & cunningham sa ( ) unhealthy weight among children and adults in india: urbanicity and the crossover in underweight and overweight. ann epidemiol , – .e . . swaminathan s, thomas t, kurpad av et al. ( ) dietary patterns in urban school children in south india. indian pediatr , – . . vijayapushpam t, menon kk, raghunatha rao d et al. ( ) a qualitative assessment of nutrition knowledge levels and dietary intake of schoolchildren in hyderabad. public health nutr , – . .
raghunatha rao d, antony gm, sarma kvr et al. ( ) dietary habits and effect of two different educational tools on nutrition knowledge of school going adolescent girls in hyderabad, india. eur j clin nutr , – . . national institute of nutrition ( ) diet and nutritional status of adolescents. in national nutrition monitoring bureau, special report. nnmb technical report no. , pp. – . hyderabad: nin. . rockett hr, berkey cs & colditz ga ( ) evaluation of dietary assessment instruments in adolescents. curr opin clin nutr metab care , – . . slater b, enes cc, lopez rv et al. ( ) validation of a food frequency questionnaire to assess the consumption of carotenoids, fruits and vegetables among adolescents: the method of triads. cad saude publica , – . . jayawardena r, swaminathan s, byrne nm et al. ( ) development of a food frequency questionnaire for sri lankan adults. nutr j , . . government of india, ministry of micro small and medium enterprises ( ) brief industrial profile of bijapur district. bijapur: government of karnataka. . shaikh ni, patil ss, halli s et al. ( ) going global: indian adolescents’ eating patterns. public health nutr , – . . borgatti s ( ) elicitation techniques for cultural domain analysis. in ethnographer’s toolkit. vol. : enhanced ethnographic methods: audiovisual techniques, focused group interviews, and elicitation techniques, pp. – [jj schensul, md lecompte, bk natasi et al., editors]. walnut creek, ca: altamira. . wang z, zhai f, du s et al. ( ) dynamic shifts in chinese eating behaviors. asia pac j clin nutr , – . . salvini s, hunter dj, sampson l et al. ( ) food-based validation of a dietary questionnaire: the effects of week-to-week variation in food consumption. int j epidemiol , – . .
national institute of nutrition, indian council of medical research ( ) dietary guidelines for indians, edition. hyderabad: nin. . cade je, burley vj, warm dl et al. ( ) food-frequency questionnaires: a review of their design, validation and utilisation. nutr res rev , – . . subar af, thompson fe, smith af et al. ( ) improving food frequency questionnaires: a qualitative approach using cognitive interviewing. j am diet assoc , – . . thompson fe, subar af, brown cc et al. ( ) cognitive research enhances accuracy of food frequency questionnaire reports: results of an experimental validation study. j am diet assoc , – . . staab em, cunningham sa, thorpe s et al. ( ) a ‘snapshot’ of nutrition and physical activity among private school adolescents in rural india. childhood , – . . millen ae, midthune d, thompson fe et al. ( ) the national cancer institute diet history questionnaire: validation of pyramid food servings. am j epidemiol , – . . bohlscheid-thomas s, hoting i, boeing h et al. ( ) reproducibility and relative validity of food group intake in a food frequency questionnaire developed for the german part of the epic project. european prospective investigation into cancer and nutrition. int j epidemiol , suppl. , s –s . . midthune d, schatzkin a, subar af et al. ( ) validating an ffq for intake of episodically consumed foods: application to the national institutes of health–aarp diet and health study. public health nutr , – . . huybrechts i, de backer g, de bacquer d et al. ( ) relative validity and reproducibility of a food-frequency questionnaire for estimating food intakes among flemish preschoolers. int j environ res public health , – . . ocke mc, bueno-de-mesquita hb, goddijn he et al. ( ) the dutch epic food frequency questionnaire. i. description of the questionnaire, and relative validity and reproducibility for food groups. int j epidemiol , suppl. , s –s . . haftenberger m, heuer t, heidemann c et al.
( ) relative validation of a food frequency questionnaire for national health and nutrition monitoring. nutr j , . . zhuang m, yuan z, lin l et al. ( ) reproducibility and relative validity of a food frequency questionnaire developed for adults in taizhou, china. plos one , e . . bowen l, bharathi av, kinra s et al. ( ) development and evaluation of a semi-quantitative food frequency questionnaire for use in urban and rural india. asia pac j clin nutr , – . . mahajan r, malik m, bharathi av et al. ( ) reproducibility and validity of a quantitative food frequency questionnaire in an urban and rural area of northern india. natl med j india , – . . araujo mc, yokoo em & pereira ra ( ) validation and calibration of a semiquantitative food frequency questionnaire designed for adolescents. j am diet assoc , – . . tabacchi g, filippi ar, breda j et al. ( ) comparative validity of the asso-food frequency questionnaire for the web-based assessment of food and nutrients intake in adolescents. food nutr res , . . willett w ( ) nutritional epidemiology, rd ed. oxford: oxford university press. . colditz ga, willett wc, stampfer mj et al. ( ) the influence of age, relative weight, smoking, and alcohol intake on the reproducibility of a dietary questionnaire. int j epidemiol , – . . wiecha jm, hebert jr & lim m ( ) diet measurement in vietnamese youth: concurrent reliability of a self-administered food frequency questionnaire. j community health , – . . graham s, lilienfeld am & tidings je ( ) dietary and purgation factors in the epidemiology of gastric cancer. cancer , – . . acheson ed & doll r ( ) dietary factors in carcinoma of the stomach: a study of cases and controls. gut , – . . samaras k, kelly pj, chiano mn et al. ( ) genes versus environment. the relationship between dietary fat and total and central abdominal fat. diabetes care , – . . bland jm & altman dg ( ) statistical methods for assessing agreement between two methods of clinical measurement. lancet , – .
. rockett hr, breitenbach m, frazier al et al. ( ) vali- dation of a youth/adolescent food frequency questionnaire. prev med , – . . livingstone mb, robson pj & wallace jm ( ) issues in dietary intake assessment of children and adolescents. br j nutr , suppl. , s –s . . hebert jr, gupta pc, bhonsle rb et al. ( ) dietary exposures and oral precancerous lesions in srikakulam district, andhra pradesh, india. public health nutr , – . . emmons l & hayes m ( ) accuracy of -hr. recalls of young children. j am diet assoc , – . . jenner da, neylon k, croft s et al. ( ) a comparison of methods of dietary assessment in australian children aged – years. eur j clin nutr , – . . cade j, thompson r, burley v et al. ( ) development, validation and utilisation of food-frequency questionnaires – a review. public health nutr , – . ni shaikh et al. downloaded from https://www.cambridge.org/core. apr at : : , subject to the cambridge core terms of use. https://www.cambridge.org/core development and evaluation of a nutrition transition-ffq for adolescents in south india methods setting interviewer recruitment and training development of the nt-ffq qualitative fieldwork to identify food items frequency response section of the -item nt-ffq frequency response section of the twenty-seven-item eating behaviour section nt-ffq portion size pre-testing the nt-ffq evaluation of the nt-ffq study population design of the evaluation study statistical methods results demographic characteristics fig. design of the reproducibility and validity study to evaluate the nutrition transition-ffq (nt-ffq) among adolescents in south india. data were collected in november &#x ;january . 
sharing oregon’s cultural heritage: harvesting oregon digital’s collections into the digital public library of america
ola quarterly, volume , number : digital repositories and data harvests
julia simic, university of oregon
ryan wick, oregon state university
recommended citation: simic, j., & wick, r. ( ). sharing oregon’s cultural heritage: harvesting oregon digital’s collections into the digital public library of america. ola quarterly, ( ), - . https://doi.org/ . / - .
© by the author(s).
ola quarterly is an official publication of the oregon library association | issn - | https://commons.pacificu.edu/olaq

oregon digital, the library digital collections platform of oregon state university and the university of oregon, joined the mountain west digital library (mwdl) and the digital public library of america (dpla) in to increase the visibility of our collections. this article discusses the process of becoming participants in the hub-network structure of the two organizations, remediating metadata in compliance with best practices, and modifying the digital collections platforms, both locally and at mwdl, to successfully harvest over , items into dpla.

background

oregon state university and the university of oregon have a longstanding and successful collaboration in providing access to unique digitized cultural heritage materials. utilizing expertise from both institutions, oregon digital (n.d.) was launched in as a joint project on contentdm, and later migrated to the samvera (formerly hydra) (n.d.) platform. oregon digital provides a single point of access for over , items in discrete collections.
by julia simic, assistant head of digital scholarship services, digital production and preservation, university of oregon libraries, jsimic@uoregon.edu, and ryan wick, analyst programmer, oregon state university libraries and press, the valley library, ryan.wick@oregonstate.edu

julia is the assistant head of digital scholarship services at the university of oregon libraries. her primary areas of responsibility include the management of all stages of the digital lifecycle, and participation in initiatives related to digital collections and scholarship projects. julia assisted in developing training for the orbis cascade alliance’s lsta grant to become a dpla service hub and is the institutional representative to the unique & local content team. she holds a ba and mls from indiana university.

ryan is an analyst programmer at oregon state university libraries and press, with both the special collections and archives research center and the emerging technologies and services department. beginning as a student worker in , he has been involved in many digitization and digital collections projects, including publication of the pauling catalogue, osu sesquicentennial oral history project, scholarsarchive@osu, and oregon digital. he is also active in the samvera and code lib communities.

among our regional partners are the greater western library alliance, the oregon state historic preservation office, the oregon arts commission, the oregon historical society, and many others. in , about a year after we migrated our collections to samvera, we began to explore participation in the digital public library of america (dpla) (n.d.) as a way to share and promote our collections beyond the state of oregon. dpla member repositories make their digital collections metadata available for oai-pmh harvesting. dpla aggregates that metadata and makes it public and searchable through its interface.
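As an illustration of what "available for OAI-PMH harvesting" means in practice, a harvester sends HTTP requests carrying a `verb` parameter to the repository's OAI endpoint. The sketch below only builds a `ListRecords` request URL. The endpoint and set name are hypothetical placeholders, not Oregon Digital's actual values, and this is not the code used by MWDL or DPLA.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url and set_spec are illustrative assumptions; substitute the
    values published by the repository being harvested.
    """
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: it is sent
        # with the verb only, to fetch the next page of a large result.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            # Restrict the harvest to a single set (collection).
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and set name, for illustration only.
url = list_records_url("https://repository.example.org/oai",
                       set_spec="example-set")
print(url)
```

A harvester would fetch this URL, read the returned XML records, and repeat the request with any `resumptionToken` it receives until the list is complete.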
actual content, such as images and documents, is not harvested; users clicking on an item in the dpla interface are redirected to the original repository item. partnering with the mountain west digital library (mwdl) (n.d.), a service hub for dpla, was a natural extension of the collaborative spirit of both institutions.

preparing collections for harvest

although we began reviewing our digital collections for compliance with dpla and mwdl content and metadata standards in late , oregon digital officially joined mwdl as a single member repository in . mwdl, based at the university of utah, has a long relationship with dpla, participating in the foundational digital hubs pilot project between and , and has built partnerships with over sixty cultural heritage institutions. their expertise as an established metadata harvester for dpla was invaluable in assisting the oregon digital team through the technical challenge of making mwdl’s primo-based harvester work with our samvera oai-pmh output. oregon digital was, in fact, their first attempt at providing service to the samvera platform, as most of their member repositories used contentdm for delivering digital collections.

v o l n o • w i n t e r

figure: mwdl search results showing content from both uo and osu.

in preparation for harvesting, metadata specialists at osu and uo identified collections (or sets in oregon digital) that could be contributed to dpla and compared the oregon digital metadata dictionary (n.d.) to mwdl’s dublin core application profile (mountain west digital library, ), each documenting the metadata standards and fields that would be used in harvesting. metadata remediation was necessary to meet mwdl/dpla requirements. some fields, such as description and subject, were required by mwdl but were not always used in oregon digital. other fields had incompatible data formats.

building oregon (oregon digital.
building oregon, n.d.), one of the most popular collections from uo, contains over images photographed by former dean of the uo school of architecture marion dean ross. the images were scanned from mm slide film, and the only metadata we had about them was what was written on the slide mount itself and what could be gleaned from its filing position in the physical collection. the records lacked information appropriate for inclusion in the description and subject fields necessary for harvest into mwdl. to address this and similar complications in other collections, we had to take into account the subject matter, the availability of staff who could add missing metadata, and the needs of the users of oregon digital and how they would discover the items through searching and browsing. in the end, most items were given “boilerplate,” or generally applicable, descriptions and subjects that took minimal staff time and required little quality assurance.

inconsistent metadata, particularly in the date field, also needed to be addressed. no single input standard had been agreed on between collections, even within institutions. once we decided on the machine-readable extended date/time format (edtf) specification ( ) and the level of support we would provide for it, scripts were written to search out and correct the formatting with little human intervention. several collections had items that used separate earliest date and latest date fields with values specifying a date range. for oai output, these were collapsed into a single range value. edtf date ranges gave us more flexibility, and mwdl agreed to adjust primo to handle the ranges and parse them out for date values and facets. mwdl also knew that other partners were interested in using edtf and that oregon digital could serve as a pilot effort. this proved to be more involved than first anticipated, partly due to staff transitions, but ultimately was resolved with data normalization rules.
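The collapse of separate earliest and latest date fields into one range value can be sketched as follows. This is a minimal illustration, not the scripts the authors wrote. It assumes the inputs are already simple `YYYY`, `YYYY-MM`, or `YYYY-MM-DD` strings, and relies only on the EDTF convention that an interval is written as two dates joined by `/`. The function name and the choice to raise an error on anything else are assumptions.

```python
import re

# Simple calendar dates only: YYYY, YYYY-MM, or YYYY-MM-DD.
_SIMPLE_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def collapse_date_range(earliest, latest):
    """Collapse earliest/latest date fields into a single EDTF value.

    Returns an interval "earliest/latest", a single date when only one
    value is present (or both are equal), or None when both are empty.
    """
    values = [v.strip() for v in (earliest, latest) if v and v.strip()]
    for value in values:
        if not _SIMPLE_DATE.match(value):
            # Leave anything unusual for human review rather than guessing.
            raise ValueError(f"not a simple date: {value!r}")
    if not values:
        return None
    if len(values) == 1 or values[0] == values[1]:
        return values[0]
    return f"{values[0]}/{values[1]}"


print(collapse_date_range("1950", "1965"))
print(collapse_date_range("1950-06", None))
```

Rejecting free-text values such as "circa 1950" keeps the automated pass conservative, so that only records needing judgment are routed to staff.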
compound or complex objects were also a challenge. these were used heavily by uo to represent physical archival folders, and at osu to display items such as oral histories, audiovisual materials, and sometimes documents such as scrapbooks that have individual page descriptions. they manifested in oregon digital as parent metadata records to which child item records with content files were related. in early harvest tests both the parent and the child records were taken, resulting in some confusion in mwdl’s primo instance and their public search interface. after conversations with mwdl, we decided to make only parent records available for harvest by adding a metadata field, primary set, which functioned directly as the oai set and was applied to records selected for harvesting.

oregon digital makes heavy use of rdf and linked data. fields such as type and rights were recorded in oregon digital metadata records as uniform resource identifiers (uris) that needed to be translated into text for the mwdl harvester. record text labels are not stored in fedora, so they were instead pulled from the solr index and returned in oai records. in a few cases, our label formatting was different from what mwdl expected. our region and location labels, built from geonames (n.d.), separated the hierarchical levels with ‘>>’ (e.g., corvallis >> benton county >> oregon >> pacific northwest), but mwdl wanted commas as separators to match dpla’s metadata requirements. in our oai-specific code we could adjust the labels after they came out of solr and leave the solr data as it was, so the main oregon digital site was not affected.

o r e g o n l i b r a r y a s s o c i a t i o n

test load

five test collections were submitted to mwdl’s required data checker (mountain west digital library, ) after metadata remediation.
this tool, first provided as part of the dpla oai aggregation tools and modified to meet the requirements of mwdl’s dublin core application profile, gives item-level feedback on the presence of metadata in required fields. we used this feedback as the final step of quality assurance for metadata remediation of the test collections, and cleaned up anything we had missed earlier. when that was completed, an initial harvest of these collections was performed, and technical difficulties, including with the oai provider response, could be addressed. simultaneously, remediation began on more collections for harvest.

figure: mwdl’s required data checker reviewing one of our collections.

oai support was provided by adding the ruby oai gem (code lib, ) to the oregon digital ruby on rails application, integrating oai commands and responses. a few small parts of code from the gem were overridden in our application based on mwdl/dpla metadata requirements. one instance of this was modifying the oai record identifier code to return a value that included the collection or set identifier; this is used by primo to determine the oai set in the item record. another example was modifying the oai xml result to not include any empty metadata fields.

our first implementation of oai in oregon digital did a full lookup of items from our fedora backend when requested, in order to return all of an item’s metadata for processing. an oai listrecords request to show items could take a minute or more to return a response. this would obviously not scale to a full harvest: harvesting a single collection would take several hours. we changed the code to pull metadata and labels out of our solr index instead, as solr already powered the oregon digital public user interface and was much more performant.

providing thumbnail images for harvest also required configuration.
our images are stored on disk in folders that are organized based on parts of the item pid (permanent identifier). while this is consistent and reproducible, it didn’t make sense to try to implement the folder rules in an external program. we built a new rails controller in the oregon digital application to handle thumbnails when another system only had the item pid value. an image request with the pid value resolves and returns the correct thumbnail url. this allowed mwdl’s primo to use a thumbnail template for any item harvested.

figure: dpla search results showing osu publications.

conclusions

preparing and configuring our oai endpoint and results took more work and time than was initially expected, but we knew it was important and necessary to get right. furthermore, because we had full control of our application, we could make all of the changes that were needed, including making our legacy content better. our initial goal was getting content into dpla and mwdl, but there are other aggregators, including the orbis cascade alliance, that we have worked with in the past and may again in the future.

figure: dpla item view with thumbnail and metadata.

our partnership with mwdl has also led to participation in community efforts beyond the orbis cascade alliance. metadata librarians participated in the western name authority file project (myntti & neatrour, ), a pilot for creating linked open data name authorities for regionally significant people, and the bulk digitization interest group, a place to share standards and technical infrastructure for large-scale digitization projects. our technical work has also contributed to the samvera open source community. developers actively participate in the samvera metadata interest group, the applied linked data interest group, and in samvera application development.
as members of the orbis cascade alliance, osu and uo have worked with the digital collections in primo group and the dublin core best practices standing group of the unique and local content team, assisting in preparing the alliance itself to become a dpla service hub. the experience we gained from participation in mwdl and dpla has greatly benefited us and our sister institutions, and provided a valuable opportunity to grow our knowledge and practice of digital collection building.

references

code lib. ( , may ). code lib/ruby-oai. retrieved june , , from https://github.com/code lib/ruby-oai/
digital public library of america. (n.d.). retrieved june , , from https://dp.la/
extended date/time format (edtf) specification. ( , october ). retrieved june , , from https://www.loc.gov/standards/datetime/edtf.html
geonames. (n.d.). retrieved june , , from https://www.geonames.org/
metadata task force of the digitization committee of the utah academic library consortium. ( , july ). mountain west digital library dublin core application profile. retrieved june , , from https://mwdl.org/docs/mwdl_dc_profile_version_ . .pdf
mountain west digital library. ( , july ). mountain west digital library dublin core metadata application profile. retrieved june , , from https://mwdl.org/docs/mwdl_dc_profile_version_ . .pdf
mountain west digital library. ( , may). dpla oai aggregation tools . . required data checker - simple dublin core. retrieved june , , from http://dpla-aggregation.sandbox.lib.utah.edu/reqdata_checker/index_oai_dc.php
mountain west digital library. (n.d.). retrieved june , , from https://mwdl.org/
myntti, j., & neatrour, a. ( , may ). western name authority file project. retrieved june , , from https://sites.google.com/site/westernnameauthorityfile/
oregon digital. (n.d.). retrieved june , , from https://oregondigital.org/
oregon digital. building oregon. (n.d.).
retrieved june , , from https://oregondigital.org/sets/building-or/
oregon digital metadata dictionary. (n.d.). retrieved june , , from https://tinyurl.com/y zypmpr
samvera: an open source repository solution for digital content. (n.d.). retrieved june , , from https://samvera.org/

ucsf
uc san francisco previously published works
title: global emergency medicine journal club: a social media discussion about the lack of association between press ganey scores and emergency department analgesia.
permalink: https://escholarship.org/uc/item/ x x m
journal: annals of emergency medicine, ( )
issn: -
authors: westafer, lauren; hensley, justin; shaikh, sameed; et al.
publication date
doi: . /j.annemergmed. . .
peer reviewed
escholarship.org, powered by the california digital library, university of california

journal club/special contribution

global emergency medicine journal club: a social media discussion about the lack of association between press ganey scores and emergency department analgesia

lauren westafer, do, mph*; justin hensley, md; sameed shaikh, do; michelle lin, md†
*corresponding author. e-mail: westafer@gmail.com, twitter: @lwestafer.
†all participants are listed in the appendix.
volume -, no. - : -

annals of emergency medicine collaborated with an educational web site, academic life in emergency medicine (aliem), to host a public discussion featuring the annals article on the association between press ganey scores and emergency department (ed) analgesia by schwartz et al. the objective was to curate a -day (december through , ) worldwide academic dialogue among clinicians in regard to preselected questions about the article. five online facilitators hosted the multimodal discussion on the aliem web site, twitter, and google hangout. comments across the social media platforms were curated for this report, as framed by the preselected questions. engagement was tracked through web analytic tools and analysis of tweets. blog comments, tweets, and video expert commentary involving the featured article are summarized and reported. the dialogue resulted in page views from cities in countries on the aliem web site, , twitter impressions, and views of the video interview with experts. of the unique identified tweets, discussion ( . %) and learning points ( . %) were the most common categories of tweets identified. common themes that arose in the open-access multimedia discussions included press ganey data validity and the utility of patient satisfaction in determining pain treatment efficacy. this educational approach using social media technologies demonstrates a free, asynchronous means to engage in worldwide scholarly discourse. [ann emerg med. ;-: - .]

- /$-see front matter
copyright © by the american college of emergency physicians.
http://dx.doi.org/ . /j.annemergmed. . .

introduction

in , annals of emergency medicine and academic life in emergency medicine (aliem) launched a shared initiative to increase awareness of key emergency medicine literature and highlight critical appraisal skills. the ultimate goal was to increase the speed of knowledge translation into clinical practices because this gap is often longer than years. the use of open-access multimedia digital platforms, termed free open access medical education, is increasingly common in medical education, and aliem is a central resource for us emergency medicine residents. this joint online journal club, the global emergency medicine journal club, paired aliem’s social media capabilities with the evidence-based medicine expertise of annals.

in this installment of the global emergency medicine journal club, we discussed an article by schwartz et al on the association of press ganey scores with analgesia administered in the emergency department (ed). in this retrospective review, the authors linked ed visit information from hospitals in rhode island, totaling , press ganey patient satisfaction survey scores for patients discharged between october and september . the authors found no association between analgesics prescribed in the ed and patient satisfaction scores, measured by the press ganey instrument.

on december , , aliem published the blog post, which served as a central resource unifying conversations from the other social media platforms, including twitter and the google hangout on air video. the objective of this article is to curate the proceedings of the global emergency medicine journal club through collection, organization, and review, as well as to report objective engagement analytics of the social media modalities.

materials and methods

the annals editors selected the article for this edition of the global emergency medicine journal club collaboration with aliem. the facilitators were chosen for their expertise in curating open-access multimedia resources. three were experienced bloggers (l.w., j.h., and m.l.) and all have active twitter accounts with greater than (s.s., @synthshaikh), greater than (j.h., @ebmgonewild), greater than , (l.w., @lwestafer), and greater than , (m.l., @m_lin) followers at the time of the discussion.

annals of emergency medicine
global emergency medicine journal club westafer et al

the aliem team prespecified times for promotion (november to , ), a -day discussion period (december to , ), and the live discussion (december , ). in accordance with an inventory of previous online discussions in other open-access resources and the paired annals journal club article, authors (l.w. and j.h.) formulated discussion questions.

pre-event promotion

promotion of the global emergency medicine journal club was multimodal. the unique identifying hashtag, #aliemjc, was prospectively registered with symplur.com (a hashtag-based twitter analytics web site) as a result of previous global emergency medicine journal club projects. facilitators publicized the initiative on twitter leading up to the live journal club event, using the hashtag #aliemjc, through their individual twitter accounts, as well as the aliem twitter account. facilitators also e-mailed potentially interested colleagues about the journal club event.

journal club event

a blog post on the aliem web site was published on december , , to global emergency medicine journal club, featuring the discussion questions. on the same day, announcements about the event were made on twitter, as well as on the aliem google+ and facebook pages. the facilitators monitored all channels and encouraged continued scholarly discussions.

live interview on google hangout

on december , , a live interview was conducted with the authors of the featured article, tayler schwartz, bs, (brown university) and kavita babu, md, (university of massachusetts), using google hangout on air, free multiperson videoconferencing software, which was automatically streamed live onto the aliem youtube channel (http://youtu.be/g zmclcfco). concurrently, an off-screen facilitator (s.s.) live-tweeted quotes from the interview with the #aliemjc hashtag. this video could be viewed asynchronously on youtube directly or on the aliem blog post.
discussion analysis

written transcripts from the aliem blog, twitter, and the google hangout interview were reviewed and curated to create a discussion summary using the featured questions as a framework. content curation was conducted by one author (l.w.) and independently checked by another (j.h.). a full transcript of the blog discussion is archived at http://www.aliem.com/?p= , all tweets with the #aliemjc hashtag are archived at http://aliem.link/ av ttv, and the google hangout video can be accessed on youtube at http://youtu.be/g zmclcfco.

social media web analytics

free, prepackaged analytic tools were used to measure engagement and reach during the -day global emergency medicine journal club period. google analytics measured web traffic for the aliem blog, symplur measured metrics for #aliemjc-related tweets, and youtube measured analytics for the recorded live discussion. viewership was measured by the number of visits to a web page from a single user, identified by internet protocol (ip) address, and the duration a viewer stayed on a page or allowed the youtube video to play in a single viewing was recorded. table provides descriptions for these tools.

a post hoc analysis of the #aliemjc tweets, aggregated by symplur during the -day period, was conducted to attempt to categorize the purpose of each tweet. additionally, because the #aliemjc hashtag was not used consistently by participants, one author (l.w.) performed a manual search of twitter to capture tweets that were part of the conversation regardless of hashtag use. only original tweets or replies were included in this analysis. modified tweets and retweets were included only if they contained additional original comments or thoughts. tweets were classified into categories, which were informed by a literature review and selected before analysis.
these included promotion (of the project), learning points, discussion, support, literature, and other. descriptions of these can be found in table . a study by mckendrick et al categorized tweets related to a medical conference and provided the foundation for establishing content categories. of these categories, promotion, learning points, and discussion were incorporated into our analysis because these had the most relevance. the categories “encouraging speakers” and “social” by mckendrick et al were merged into a single category called support because these original categories have similar aims and the asynchronous nature of our project did not allow traditional, in-person social events. the literature category was created according to previous studies showing that more than half of tweets from some medical conferences may share links to journal articles and news stories, which could also aid in knowledge dissemination. last, an “other” category was included to capture tangential conversations or tweets that did not fit into the other established categories.

table . aggregate analytic data from various social media–based discussions for the first days of the event (december to , ).
columns: social media analytic aggregator; metric; metric definition; count

google analytics: a free online service to track page views and other blog metrics
  page views: number of times the web page containing the post was viewed
  users: number of times individuals from different internet protocol (ip) addresses viewed the site
  number of cities: number of unique jurisdictions by city, as registered by google analytics
  number of countries: number of unique jurisdictions by country, as registered by google analytics
  average time on page: average amount of time spent by a viewer on the page (count: min s)

aliem social media post widget: a web-based tool embedded into each blog post that tracks engagement metrics for multiple social media platforms
  number of tweets from page: number of unique -character notifications sent directly from the blog post by twitter to raise awareness of the post
  number of facebook likes: number of times viewers “liked” the post through facebook
  number of google+ shares: number of times viewers shared the post through google+

aliem comments section
  number of site comments: comments made directly on the web site in the blog comments section
  average word count per blog comment (excluding citations)

symplur analytics: a free online service to track metrics for twitter engagement of health-related hashtags; used to track twitter hashtag #aliemjc
  number of tweets: number of tweets containing the hashtag #aliemjc
  number of twitter participants: number of unique twitter participants using the hashtag #aliemjc
  twitter impressions: how many impressions or potential views of #aliemjc tweets appear in users’ twitter streams, calculated by multiplying the number of tweets per participant by the number of followers that participant has (count: , )

youtube analytics: a free online service to track youtube video viewing statistics
  length of video interview: total duration of recorded google hangout videoconference session (count: min s)
  number of views: number of times the youtube video was viewed
  average duration of viewing: average length of time the youtube video was played in a single viewing (count: min s)

a tweet could be classified into more than one category. tweets were independently categorized by authors (l.w. and j.h.). we calculated the cohen unweighted κ statistic to assess for interrater agreement beyond chance alone. disputes were settled by discussion between authors (l.w. and j.h.).

results

social media analytics

the -day analytic data for the global emergency medicine journal club discussion about the association between patient satisfaction and analgesia during december through , , are summarized in table . figure summarizes a global geographic distribution of participants who read the blog post. a total of viewers from countries saw the blog post on the aliem web site during the discussion period, with an average of minutes seconds spent on the page.

according to the symplur analytic database, the #aliemjc-tracked discussions garnered total tweets (includes retweets) and , twitter impressions. through a manual search of twitter during the same -day period, we identified unique tweets (excluded retweets), of which included the #aliemjc hashtag and did not. these tweets thus were missed by the symplur analytics mechanism. table shows the distribution of tweets divided by category and hashtag use. overall agreement of the categorization of tweets between raters was excellent (κ = . ; % confidence interval . to . ). in descending order of frequency, the tweet categories included discussion ( . %), learning point ( . %), promotion ( . %), support ( . %), literature ( . %), and other ( . %).

table . classification of unique tweets during december to , , identified by the symplur #aliemjc-tracked transcript and a manual search.*
columns: category; description; number of tweets in each category (with #aliemjc hashtag; without #aliemjc hashtag; total (% of all tweets))
  discussion: tweets discussing academic matters directly with one another or posts of controversial points ( . )
  learning point: directly answering a journal club question, a summary from the live hangout, or an objective pearl not otherwise classified as discussion ( . )
  promotion: a tweet advertising or promoting the project ( . )
  support: tweet demonstrating support of the project or individual or a display of camaraderie/nonacademic discussion ( . )
  literature: reference or link to literature other than the article associated with the project ( . )
  other: links to other discussions on a similar but not directly related topic ( . )
*the total tweet count exceeds the unique tweets count because a single tweet could be classified into multiple categories.

summary of the online discussion

the global emergency medicine journal club attracted participants from around the globe, who contributed asynchronously to the open-access conversation by commenting on the aliem web site, posting tweets, and watching the google hangout on air. the curated summary of the conversation on the preselected journal club questions is presented below.

question

this study evaluated the association between analgesics provided in the ed and patient satisfaction scores. do you think analgesia in the department can be extrapolated to satisfaction of analgesic prescriptions dispensed at discharge from the ed?

this question was sparked by a recent aliem-annals residents’ perspective discussion about the opioid prescription epidemic (http://www.aliem.com/opioid-prescription-epidemic-annals-em-resident-perspectives-article/). patient satisfaction was mentioned as a possible driving force for unnecessary opioid prescriptions at discharge.
the global emergency medicine journal club consensus was that the findings of schwartz et al, which showed no association between patient satisfaction and ed analgesia, should not be extrapolated to analgesic prescriptions dispensed on discharge. study author babu noted that the press ganey scores evaluate only the experience in the ed and are not intended to reflect postdischarge care. on the blog, avi giladi, md, (plastic and reconstructive surgery, university of michigan) commented further:

“looking at the data en masse is a statistical exercise but doesn’t look at the complexities of the issue. the real question is whether pg [press ganey] scores are changed by giving opioids to patients that want them—specifically, to patients that want them and might not need them. without a comparison arm (patients with the same diagnoses who didn’t get meds), and without even filtering for diagnoses in the model, i don’t think this paper answers the real question about opioid delivery and pg scores that goes through the mind of the under-pressure provider.”

question
the press ganey instrument was used to measure patient satisfaction in this study. what are the limitations to using this instrument? is there another way to measure patient satisfaction?

the limitation most commonly identified by discussants focused on the undisclosed proprietary sampling algorithm used in the press ganey surveys. in addition to a nontransparent patient selection process, concerns were raised about the nonrandom process. furthermore, these data do not capture admitted patients, thereby eliminating the sickest population. on twitter, participants exchanged numerous studies and articles identifying poor response rates associated with press ganey surveys. ultimately, response bias was identified as a significant limitation to this instrument and is further discussed in question .
ryan radecki, md, (university of texas–houston), author of an emlitofnote.com blog post also covering this article, called for real-time assessment of patient-oriented pain management effectiveness, rather than using the poor recall-based surrogate marker of patient satisfaction (figure ). zack repanshek, md, (temple university) added that satisfaction scores are subjective and should not be treated as a reliably objective measure (figure ).

figure . geographic distribution of readers who viewed the global emergency medicine journal club between december and , .

question
the authors studied only patients who returned the survey. how might that group differ from the others? what might these authors have done, given that they had treatment data on all patients, to explore the potential for response bias?

in the discussion, author schwartz hypothesized that those returning the surveys are likely to be at one end of the spectrum of satisfaction: either highly satisfied or dissatisfied. this bias would thereby skew the results. repanshek suggested that in an ideal world, a secondary analysis of both respondents and nonrespondents to the press ganey survey should be conducted to determine their demographic, patient profile, and satisfaction differences. giladi also commented on the blog that the article suffers from a lack of comparison group, not including diagnoses in the model, and not knowing whether discharge prescriptions were given.
babu suggested on the google hangout live-stream video that, given a theoretical scenario of unlimited funds and resources to answer the question of patient satisfaction and in-ed analgesic administration, “[y]ou could survey patients about their satisfaction with the discharge prior to them receiving their discharge prescriptions. that way you would be able to get all comers before they know their discharge prescription.” this would minimize such confounders as recall bias, response bias, and inclusion of discharge prescription experiences.

figure . tweet by ryan radecki, md, about measuring patient satisfaction.

question
based on your own clinical experience, do you think that there exists an association between positive satisfaction scores and discharge opioid prescriptions? is pain control the largest component of satisfaction, or is it merely a small player? what other aspects of the patient experience can affect patient satisfaction?

in the google hangout, babu discussed that she undertook this study after asking community emergency physicians at opioid-prescribing conferences to curtail prescription of opioids for acute exacerbations of chronic pain. she found that they often inquired how this change would affect their patient satisfaction scores because bonus compensation ( % to % of base salary) was tied to patient satisfaction for some. on twitter, many providers echoed this sentiment, acknowledging that patient satisfaction had altered their prescribing practice. ari kestler, md, (san jose regional medical center) suggested that it may not be patient satisfaction but rather “prescriber fatigue” that has driven an increase in opioid prescriptions in the ed. on a chaotic shift, the provider may not have the time or energy to explain why opioids are not being prescribed to the potential drug seeker, and so he or she opts to just prescribe the opioid.
ultimately, however, babu stated that she did not think that a correlation would be evident between opioid prescriptions and patient satisfaction if a study directly assessing this were to be conducted. she reasoned that “the opioid fixated population is [a] relatively small” proportion of the sample population and thus would be unlikely to affect overall patient satisfaction trends. overall, there was good agreement that patient satisfaction is multifactorial and very difficult to measure in a reliable and meaningful way. communication, demeanor of staff, and length of stay were identified as some of the most important components associated with patient satisfaction in the global emergency medicine journal club.

figure . tweet by zack repanshek, md, about measuring patient satisfaction.

limitations

the primary limitations of sampling and response bias have been addressed in previous articles in this series. our project may also have an english-language bias because this was the only language used in the materials. although the global emergency medicine journal club harnessed open-access multimedia platforms, the featured article was not open access. this may have decreased participation and engagement by some parties. despite use of symplur analytics and a manual search for twitter messages and conversations, it is still possible that some tweets were missed and thus underreported.

discussion

in this aliem-annals global emergency medicine journal club, we report a summary of social media–based discussions critically appraising the association between analgesics administered in the ed and patient satisfaction. the discussion focused on the shortcomings of the press ganey instrument currently used to measure patient satisfaction.
this endeavor provided a forum for participants to share their perspectives on methodological flaws, evidentiary concerns, systematic issues, and personal experiences about the topic in a virtual and transparent space.

measuring value and impact of digital scholarship in medical education is a continually evolving point of discussion. the value of available metrics of community engagement and content dissemination, such as page views or twitter impressions, remains under debate. however, our reported social media analytic data provide early insight into the behavior and usage patterns of global emergency medicine journal club participants. for example, youtube analytics demonstrate that viewers, on average, watched % of the video. this value is consistent with that of previous editions of the global emergency medicine journal club, in which viewers trended toward watching small portions of the video. this suggests limited effectiveness of this platform for the sharing and dissemination of scholarly content.

although the social media analytics in table generally report passive consumption by a global audience, the thematic analysis of symplur and manually searched twitter content demonstrates some degree of active engagement. tweets often contained active academic discussion and contribution of links to affiliated literature. this suggests that it is feasible to engender thoughtful discussion from geographically disparate parties on open-access multimedia platforms. during the search for global emergency medicine journal club–related tweets, we determined that symplur failed to capture of tweets ( %) from this journal club despite this serving as a standard hashtag-based reporting tool in academic conferences and tweet chats.
because twitter plays an expanding role in academic discussions and is increasingly used to measure engagement at conferences, our results suggest that future social media–based discussion will require ongoing participant education about the significance of the hashtag. in the meantime, our analysis demonstrates that tracking hashtag mentions on twitter alone would miss a substantial portion of participant discussions. alternative approaches to capturing meaningful discussions on twitter and reporting engagement metrics are needed.

the benefit of social media such as blogs and twitter extends beyond being able to virtually and transparently connect clinicians and scholars in a free, worldwide platform. they also affect article-level metrics, which are becoming increasingly valued by academic institutions and used as a comparative metric within and between journals. in contrast to the journal impact factor score, these metrics measure impact at the article level. the alternative metrics provider altmetric.com tracks the attention that scholarly articles receive from multiple media platforms, including news outlets, social media such as twitter, blogs, and facebook, and research highlight platforms. although this metric reports only mentions (as an indirect measure of value and quality), it does incorporate both traditional and social media–based data. this featured global emergency medicine journal club article had a high score of as of february , . this score placed the article in the th percentile among all annals articles ( th of , articles).

collaborations between social media outlets such as aliem and traditional academic journals such as annals may be intrinsically beneficial by enabling a scholarly discourse on a global level through the establishment of an online community of practice. additionally, academic journals may benefit as well, as demonstrated by the high altmetric score for the featured article.
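the comparison between hashtag-tracked and manually identified tweets discussed in this section can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' or symplur's method: the `Tweet` record, its field names, the whole-token hashtag match, and the sample data are all hypothetical, and the impressions function assumes that the per-participant products (tweets multiplied by followers) are summed across participants.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    """One tweet; the field names are hypothetical, not a Twitter or Symplur schema."""
    author: str
    text: str
    is_retweet: bool = False


def has_hashtag(tweet: Tweet, hashtag: str = "#aliemjc") -> bool:
    """True if the tweet text contains the hashtag (case-insensitive, not a longer tag)."""
    pattern = re.compile(re.escape(hashtag) + r"\b", re.IGNORECASE)
    return pattern.search(tweet.text) is not None


def capture_summary(tweets: list[Tweet], hashtag: str = "#aliemjc") -> dict:
    """Count unique (non-retweet) tweets and the share a hashtag-only tracker would miss."""
    unique = [t for t in tweets if not t.is_retweet]
    tagged = sum(1 for t in unique if has_hashtag(t, hashtag))
    missed = len(unique) - tagged
    share = missed / len(unique) if unique else 0.0
    return {"unique": len(unique), "tagged": tagged,
            "missed": missed, "missed_share": share}


def impressions(tweets_per_participant: dict[str, int],
                followers: dict[str, int]) -> int:
    """Sum over participants of (tweets sent) x (followers); summing is an assumption."""
    return sum(count * followers.get(name, 0)
               for name, count in tweets_per_participant.items())


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        Tweet("a", "pain scores are subjective #ALiEMJC"),
        Tweet("b", "response bias matters here"),
        Tweet("c", "rt: pain scores are subjective #aliemjc", is_retweet=True),
    ]
    print(capture_summary(sample))
    print(impressions({"a": 2, "b": 1}, {"a": 100, "b": 50}))
```

retweets are excluded before counting because the manual search in this report counted unique tweets only; a tracker that follows a single hashtag sees only the `tagged` subset.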
conclusions

in this aliem-annals global emergency medicine journal club, we report the perspectives of clinicians on the association between patient satisfaction scores and analgesia in the ed. most participants believed that patient satisfaction is multifactorial and may be more heavily influenced by communication and systems processes than by analgesic administration itself. all agreed that the ways in which we measure patient satisfaction are inadequate and should not be used to drive medical treatment decisions.

this educational initiative to promote scholarly dialogue in the digital community attracted a global audience by using various social media platforms, including a blog, twitter, and google hangout. we hope that this curated report from the global emergency medicine journal club virtual community of practice will encourage more to participate and engage with scholarly dialogues in the social media domain.

supervising editors: n. seth trueger, md, mph; michael l. callaham, md

author affiliations: from the department of emergency medicine, baystate medical center/tufts university, springfield, ma (westafer); the department of emergency medicine, texas a&m health science center, corpus christi, tx (hensley); the department of emergency medicine, sinai-grace hospital/wayne state university, detroit, mi (shaikh); and the department of emergency medicine, university of california–san francisco, san francisco, ca (lin).

funding and support: by annals policy, all authors are required to disclose any and all commercial, financial, and other relationships in any way related to the subject of this article as per icmje conflict of interest guidelines (see www.icmje.org). the authors have stated that no such relationships exist. this section is provided as a reader service; there is no financial relationship between annals and aliem.

references

davis d, evans m, jadad a, et al. the case for knowledge translation: shortening the journey from evidence to effect. bmj. ; : - .
mallin m, schlein s, doctor s, et al. a survey of the current utilization of asynchronous education among emergency medicine residents in the united states. acad med. ; : - .
schwartz tm, tai m, babu km, et al. lack of association between press ganey emergency department patient satisfaction scores and emergency department administration of analgesic medications. ann emerg med. ; : - .
barrett tw, schriger dl. do survey results reflect the truth or a biased opinion on emergency department care? ann emerg med. ; : - .
mckendrick dr, cumming gp, lee aj. increased use of twitter at a medical conference: a report and a review of the educational opportunities. j med internet res. ; :e .
sinclair c. how to use twitter at your next medical conference. kevinmd.com. . available at: http://www.kevinmd.com/blog/ / /twitter-medical-conference.html. accessed january , .
sinclair c. . lessons learned using twitter at a medical conference. pallimed. available at: http://www.pallimed.org/ / /lessons-learned-using-twitter-at.html. accessed january , .
mcgowan bs. . the great asco tweetup. meetingsnet. available at: http://meetingsnet.com/social-media/ -great-asco-tweetup/. accessed january , .
hawkins cm, duszak r, rawson jv. social media in radiology: early trends in twitter microblogging at radiology’s largest international meeting. j am coll radiol. ; : - .
radecki rp, rezaie sr, lin m. annals of emergency medicine journal club. global emergency medicine journal club: social media responses to the november annals of emergency medicine journal club. ann emerg med. ; : - .
chan tm, rosenberg h, lin m. global emergency medicine journal club: social media responses to the january online emergency medicine journal club on subarachnoid hemorrhage. ann emerg med. ; : - .
thoma b, rolston d, lin m. global emergency medicine journal club: social media responses to the march annals of emergency medicine journal club on targeted temperature management. ann emerg med. ; : - .
ferguson c, inglis sc, newton pj, et al. social media: a tool to spread information: a case study analysis of twitter conversation at the cardiac society of australia & new zealand st annual scientific meeting . collegian. ; : - .
loeb s, bayne ce, frey c, et al. american urological association social media work group. use of social media in urology: data from the american urological association (aua). bju int. ; : - .
altmetric. about altmetric and the altmetric score. available at: http://support.altmetric.com/knowledgebase/articles/ -about-altmetric-and-the-altmetric-score. accessed january , .

appendix

participants

the #aliemjc twitter participants: @aliemteam, @alittlemedic, @apathetic_cynic, @bodymender_n_ed, @ccemrp, @debhourycdc, @docamyewalsh, @docertrauma, @doconskis, @drsamko, @ebmgonewild, @emcases, @emtogether, @glazier_scott, @himmelhimmel, @k_scottmd, @kavitababu, @kestlermd, @lwestafer, @m_lin, @maggiemahar, @matthew b, @mdaware, @meetsdeadlines, @nomadicgp, @peterrchai, @rocknicepac, @shannonomac, @signaturedoc, @synthshaikh, @thesgem, @toxtalk, @zackrepem

the aliem blog discussion participants: kavita babu, avi giladi, anton helman, ronald hirsch, michelle lin, tom logue, sabrina poon, zack repanshek, seth trueger, lauren westafer
a multifaceted approach to promote a university repository: the university of kansas’ experience.

holly mercer, brian rosenblum, ada emmett

holly mercer: university of kansas libraries
hmercer@ku.edu
anschutz library room k hoch auditoria dr. lawrence kansas
holly mercer is the coordinator of digital content development at the university of kansas. a member of the digital initiatives program since , she is a consultant on metadata, digitization standards, project management, and scholarly communication issues for the ku campus, and administrator of the ku scholarworks repository.

brian rosenblum: university of kansas libraries
brianlee@ku.edu
anschutz library room a hoch auditoria dr. lawrence kansas
brian rosenblum has served as scholarly digital initiatives librarian at the university of kansas since , where he helps promote ku scholarworks and develop other digital projects. previously, from - , brian worked at the scholarly publishing office at the university of michigan library. his professional interests include scholarly communication issues, electronic publishing, and library development in central and eastern europe.

ada emmett: university of kansas libraries
aemmett@ku.edu
anschutz library room hoch auditoria dr. lawrence kansas
ada emmett has worked at the university of kansas since . she serves as the subject specialist for chemistry and molecular biosciences and has been involved in projects to foster greater awareness and use of ku’s institutional repository, ku scholarworks. she has been interested in the complex issues currently facing the system of dissemination and access of scholarship since graduate school.

article type: case study

purpose of this paper
to describe the history of ku scholarworks, the university of kansas’ institutional repository, and the various strategies used to promote and populate it.

design/methodology/approach
this paper describes how ku scholarworks came into being, and discusses the variety of activities employed to publicize the repository and encourage faculty to deposit their work. in addition, the paper discusses some of the concerns expressed by faculty members, and some of the obstacles encountered in getting them to use the repository. the paper concludes with some observations about ku’s efforts, an assessment of the success of the program to date, and suggests some next steps the program may take.

findings
ku scholarworks has relied on a "self-archiving" model, which requires regular communication with faculty and long-term community building. repository content continues to grow at a steady pace, but uptake among faculty has been slow. in the absence of mandates requiring faculty to deposit work, organizations running institutional repositories must continue to aggressively pursue a variety of strategies to promote repositories to faculty and encourage them to deposit their scholarship.

originality/value
ku’s experience will help other institutions develop institutional repositories by providing examples of marketing strategies, and by promoting a greater understanding of faculty behavior and concerns with regard to institutional repositories.
a multifaceted approach to populate a university repository: the university of kansas’ experience.

introduction

in a september article assessing institutional repository deployment in the united states, clifford lynch and joan lippincott conclude that "institutional repositories are now clearly and broadly being recognized as essential infrastructure for scholarship in the digital world" and that they are "being positioned decisively as general-purpose infrastructure within the context of changing scholarly practice, within e-research and cyberinfrastructure, and in visions of the university in the digital age" ( ). however, although repositories may be "recognized as essential infrastructure," it is not necessarily faculty-authors doing the recognizing, and persuading faculty to fill institutional repositories (irs) through self-archiving remains a challenge.

the university of kansas (ku) established its institutional repository, ku scholarworks, in spring , early in the ir movement, with solid support from the provost, who was instrumental in helping launch the repository as part of a broader scholarly communications program. since its introduction, library staff have employed a variety of strategies and approaches, none of which are unique to ku, to market ku scholarworks to the ku community. however, despite active building and promotion for nearly three years, making the campus aware of its existence and purpose has not been easy, and uptake among faculty has been slow, though content has continued to grow steadily.

ku’s experience is typical. a report on the state of institutional repositories asserts, “the biggest problem facing those setting up irs is persuading faculty to use them. outside a few disciplines (e.g. physics, computer science, and economics) there is little tradition of preprints or working papers and apparently still little interest in self-archiving.
academics may be radical in their thought but they are conservative in their behavior, and there is a good deal of inertia in the current publishing systems….the data quoted in this report shows that take-up rates for irs have to date been very patchy, especially where the deposit of materials depends on the decision by individuals to self-archive their material” (ware, ). “archivangelist” stevan harnad states that encouragement to deposit items “is not sufficient to raise the self-archiving rate appreciably above the % baseline for spontaneous self-archiving” ( ). he argues forcefully for institutions to require faculty to self-archive all research. in the absence of those mandates (and perhaps as a necessary preliminary to them) institutions operating irs will continue to employ a variety of small- and large-scale, labor-intensive methods to reach out to faculty, solicit their material, and further engage them in applying alternative methods to disseminate their research. this can be "a slow, incremental, somewhat piecemeal process" (lynch and lippincott, ) which has been compared elsewhere to throwing spaghetti at a wall and seeing what sticks (salo, ). this kind of advocacy and grass-roots activism may be part of the preliminary groundwork needed to create an environment in which such mandates will be possible. jones, andrew, and maccoll ( ) place these advocacy efforts in a theoretical framework that relates everett rogers’ diffusion of innovation concepts to issues of faculty adoption of irs, the challenges of getting widespread use of an innovation, and the time and efforts involved. they describe a social-system of repository use where innovators introduce irs and advocacy builds support for irs, but wholesale adoption does not occur until use is mandated. 
success for institutional repositories is usually defined by the number of items held in relation to the number of faculty, and, though less often articulated, by how often the archived items are downloaded by others (use by authors and readers) (shearer, ). jones, andrew, and maccoll compare an ir to a library and ask, “…who in their right mind would want to visit a library without books?” ( ). the more items deposited that are representative of the faculty output the better. but this definition of success, if solely based on numbers, belies one purpose of irs, which is to create opportunities for change in the system and its stakeholders, such as authors, publishers, and readers. in essence, universities are widely adopting institutional repositories as dissemination engines because the successful ir will create an opportunity for behavior changes in both authors and readers, two key stakeholders in the system. thus, gauging the success of ku’s repository (and other repositories) is not simply a numbers game, especially not at this early stage when irs are still largely in embryonic form. although the number of items in ku scholarworks is modest, the repository has several very active communities and contributors, and has generated interest among faculty in a variety of departments on campus. moreover, the early establishment of an institutional repository has given ku librarians a great deal of feedback and knowledge about the campus environment and faculty members’ perceptions and needs with regard to scholarly communication. ku has gained valuable experience in the policy and technical requirements of setting up and maintaining a repository, and librarians have established relationships with academic units that will likely prove beneficial in the long term as more faculty are persuaded to use ku scholarworks. this paper discusses the history of ku scholarworks to date, including the strategies used to populate it. 
part one describes how ku scholarworks came into being, and discusses the variety of activities employed to publicize the repository and encourage faculty to deposit their work. part two explains how some ku scholarworks communities have evolved, and includes several observations and an assessment of ku efforts. the paper concludes with thoughts about measuring the “success” of ku’s repository, and suggests next steps for the program.

part one / birth and growth of ku scholarworks

figure : ir development at the university of kansas

the ku environment

the university of kansas (ku) is a comprehensive educational and research institution with over , students and , faculty members. ku includes the main campus in lawrence; the medical center in kansas city, kansas; the edwards campus in overland park; a clinical campus of the school of medicine in wichita; and educational and research facilities throughout the state. ku offers more than fields of study and has a research budget of more than $ million.

the ku scholarworks repository includes scholarship created primarily by faculty, staff, and students at the lawrence and edwards campuses. this repository service is offered and maintained by ku digital initiatives, a program of information services (is). the vice provost of information services oversees the libraries, information technology (it), and networking and telecommunication services divisions. staff from both it and the libraries take part in providing technical and administrative support for ku scholarworks.

laying the groundwork

discussion of scholarly communication issues on campus preceded the launch of ku scholarworks as a pilot project. david shulenburger, an early advocate for scholarly communication reform, was provost and chief operating officer at ku until june .
shulenburger, an economist, proposed developing a national eprint archive, the national electronic article repository (near) ( ), and wrote and spoke on the topic extensively while ku provost. he provided a campus forum for discussion of scholarly communication issues through the provost's seminar on scholarly communication, sponsored by the office of the provost and the university libraries.

following on the heels of national efforts to manage the rising costs of library subscriptions to scholarly journals, such as the enumeration of the tempe principles for emerging systems of scholarly communication (association of research libraries, ) and the formation of the scholarly publishing and academic resources coalition (sparc), the first provost's seminar, "from crisis to reform: scholarly communication and the tempe principles," was held on november , . the primary focus of the seminar was engaging faculty in discussing the tempe principles for emerging systems of scholarly communication. speakers addressed ku's role in the scholarly communication movement, reactions to the tempe principles, and discipline-based solutions to the serials crisis. while this seminar did not focus on establishing an institutional repository at ku, it laid the groundwork for development of a ku repository by raising awareness of issues that a repository might help address.

launch of a pilot repository

efforts to establish an institutional repository at the university of kansas began in earnest in when the libraries hosted a forum for ku librarians to discuss scholarly communication issues and the open access movement. provost shulenburger focused on changing scholarly practices, and information services leadership focused on establishing a repository for preservation and dissemination. a white paper explains, “…scholarly works scattered across a variety of web sites can be difficult for other researchers to locate.
opportunities for effective exchange may be lost in the chaotic sprawl of the world wide web…. institutional repositories—digital collections that organize, preserve, and make accessible the intellectual output of a single institution—are emerging at leading universities as one response to this new environment” (fyffe and warner, ). information services leadership developed a repository implementation plan that called for a series of working groups to address various aspects of establishing and maintaining an institutional repository. these working groups were organized in the spring of and each group was to complete its charges and submit a report by summer . the system selection group recommended installing dspace, then in beta test (version . was released by the time ku was ready to proceed with the installation), and ku began with a "proof-of-concept" test repository to build further administrative and faculty support. in all, staff members from the library and it units on the lawrence and medical center campuses participated in the working groups. ku scholarworks launched as a pilot repository in september . early and ongoing faculty involvement ku scholarworks was conceived as a service for faculty, and ku libraries sought ongoing faculty involvement from the earliest stages of planning and development. one of the ir working groups, the early adopters group, identified faculty from across ku who might learn to use the system, submit some items, and provide feedback to refine the ir. some early adopters were faculty who had previously expressed an interest in digital scholarship. richard fyffe, associate dean for scholarly communication and holly mercer, coordinator for digital content development, met with each early adopter at least once to demonstrate system functionality, discuss policies and procedures, and assist in uploading documents. 
the early adopters group submitted items to the test repository, then met together in january for a focus group discussion on policy issues as well as system functionality. feedback received from these focus groups influenced subsequent decisions in the planning and development process. early adopters believed that ku scholarworks communities should reflect epistemic communities rather than administrative campus units (such as schools and departments). therefore, ku scholarworks supports three community types: formal communities, associated with academic departments and research units; informal communities, for individuals to contribute without a formalized community structure; and communities of practice, for interdisciplinary groups that lack a formalized administrative structure. interestingly, while early adopters stressed the need for communities of practice, none have been requested yet. while responses from the focus groups were generally positive, few of the "early adopters" in fact became users of the repository. however, one early adopter did establish ku scholarworks' first formal community, the policy research institute community, and several others became members of a ku scholarworks advisory committee. while ku scholarworks’ policies are ultimately the decision of information services leadership, this advisory group brings an important faculty and user perspective to the planning process. staff working on repository development had hoped that members of the advisory committee would also act as "ambassadors" who would advocate the use of ku scholarworks to faculty peers, but to date the group has not yielded dramatic results in terms of advocacy or activism. in fact, few members of the committee are associated with departments or research centers with ku scholarworks communities or have actually submitted items themselves. 
future plans may call for an expanded or altered membership so that actual ku scholarworks participants will have a greater voice in developing and refining the service. in addition, the ku libraries held a separate focus group in conjunction with ku continuing education (kuce) in february to learn more about principal investigators' needs for meeting grant dissemination requirements. the libraries and kuce invited recent federal grant recipients in various disciplines to participate. the participants stated that dissemination of research was often only considered as an afterthought, because by the time there were results to report they had already moved on to the next project. they indicated an interest in having boilerplate language to describe how ku scholarworks meets preservation and dissemination requirements for inclusion on grant applications. consequently, ku libraries added a section to the "about ku scholarworks" web site titled "support for grant applicants" which includes a link to text that grant applicants can copy and paste into their grant proposals (http://www .ku.edu/~scholar/docs/grantsupport.shtml). romeo green (i) ku scholarworks launched with the expectation that faculty would self-archive their work—that is, they would decide to upload their work themselves or submit via a departmental proxy. however, it was clear there would be a number of barriers to immediate faculty participation, ranging from complex copyright clearance issues, to confusion about appropriate content for the repository, to simply getting the attention of busy faculty and researchers who may not pay much attention to a new service whose benefits are not immediately clear to them. library staff believed that departments would be more likely to join as communities if faculty could see high quality content already in the repository, and therefore launched a project to populate ku scholarworks. 
ku libraries launched the romeo green project in september to explore some of these issues. phase one of romeo green (named after the romeo/sherpa project from which much of the initial publisher policy data was derived) focused on alternative, staff-mediated strategies to populate the repository. by combining ku faculty citation data with “green” publisher policy data (publishers that allow their authors to post versions of their articles on web sites or in repositories), staff determined which papers by ku authors might be deposited in ku scholarworks. staff then contacted those authors and asked permission to deposit the articles on their behalf. this initiative was based in part on a similar initiative undertaken at the university of glasgow (mackie, ). the romeo green project goals were to add content to ku scholarworks, explore services that might be offered to faculty to support their use of ku scholarworks, and create interest in an institutional repository at ku. staff identified and requested articles from faculty. ninety-two articles, about % of the total requested, were deposited. the percentage is low, but this was the first time many faculty had heard of ku scholarworks. it is also consistent with the compliance rate in the initial eight-month period after the national institutes of health (nih) implemented its public access policy requesting and encouraging (but not requiring) that nih-funded investigators submit their final, peer-reviewed manuscripts to the national library of medicine’s pubmed central database upon acceptance for publication in a journal (zerhouni, ). at ku, in addition to the articles added to the repository, the romeo green project did provide several, perhaps less quantifiable, benefits. 
it provided a way for the libraries to continue to reach out to faculty about scholarly communication issues; staff received feedback about faculty behavior and attitudes, and gained a better understanding of the complexity of working around publishers’ self-archiving policies; and it helped ku libraries form relationships with some faculty members who later deposited more material in the repository. this is important because, as will be discussed later, one of the ways communities in ku scholarworks become active submitters is through long-term relationship building with individual faculty members and departments. the library hopes that getting an early start in developing these relationships will pay off later. (for a full description of ku’s romeo green project, its methods, and its findings, see mercer and emmett, ) faculty resolution and second scholarly communication seminar in march , the ku university council, the governance body for faculty and professional and academic staff of the university, passed a broad “resolution on access to scholarly information.” ku was the first member of the association of american universities (aau) to pass a resolution calling on its faculty to self-archive (suber, ). the resolution, a result of strong advocacy and involvement from provost shulenburger and assistant dean fyffe, addresses current issues in scholarly communication, and calls on faculty to take such actions as amending their copyright transfer statements to allow them to deposit their work in ku scholarworks, and to become familiar with the publishing and business practices of journals and support those that permit dissemination through university repositories and other open access models. 
the resolution also calls on the academy (university, professional and scholarly associations and administrators) to establish clear “guidelines for merit and salary review…and promotion and tenure…that will allow the assessment of and the attribution of appropriate credit for works published in such venues” as ku scholarworks (university of kansas university council, ). it calls on ku libraries to provide resources to help faculty better understand the business practices of journal publishers and their impact on the scholarly communication system. passage of the resolution was timed to coincide both with the second provost’s seminar on scholarly communication (http://www.lib.ku.edu/scholcommseminar.shtml), held in early march , and with the official launch of the ku scholarworks repository. the second provost’s seminar focused specifically on the role of digital repositories in the scholarly communication system, and brought leaders in the scholarly communication movement to the ku campus. the seminar also included a demonstration of ku scholarworks. ku is not alone in choosing to announce its ir at a scholarly communication seminar; the university of new mexico, for example, planned a similar event to announce its repository, also in march (phillips et al., ). while librarians had been talking informally about ku scholarworks, and giving formal presentations to academic departments, research centers, and governance bodies for some time, there was a noticeable spike in interest in ku scholarworks following the provost’s seminar and passage of the university council resolution. some academic departments requested that library subject liaisons attend a departmental meeting to discuss ku scholarworks, and individual faculty contacted ku scholarworks administrators to inquire about the submission process and items accepted for deposit. 
the university council resolution is a significant accomplishment and is an indication of the importance of this issue to ku leadership and their commitment to addressing it, but the lasting impact of the resolution on the ku scholarworks repository is still unclear. when the summer break approached, direct inquiries from faculty declined. clearly, there is a need for a continued and sustained effort at keeping faculty aware of these issues, as they seem to respond when the opportunities are presented to them. ongoing outreach and education since the events and publicity surrounding the official launch of ku scholarworks in march , ku libraries has continued to promote the repository on a smaller scale. library staff have been communicating formally and informally with academic departments, making presentations at departmental meetings, working with individual faculty members to deposit their materials, taking advantage of personal connections, and generally looking for opportunities to discuss the repository program. the combination of education and outreach efforts has resulted in small but growing ku scholarworks communities. staff are also increasing outreach to and involvement of library subject liaisons. subject liaisons have more regular contact with faculty members in their subject areas than ku digital initiatives staff do, and it is clear that their participation and support will be crucial for a successful repository program (bell et al., ). the libraries currently offer workshops on ku scholarworks to subject librarians so that they can become more familiar with the program and better able to discuss it with faculty. recently, usage statistics have been sent out monthly to library liaisons with data on the most-downloaded items of the month. liaisons can then send this information on to their faculty colleagues if they feel it is appropriate. 
an “about ku scholarworks” web site (http://www .ku.edu/~scholar/) provides information about the repository service. the web site includes a detailed faq, policy documents, text for grant applicants, and links to other pages about scholarly communication issues. a section on “working with publishers” is intended to help educate users about intellectual property issues and give them some guidance in retaining or obtaining rights for their work. this section includes links to the securing a hybrid environment for research preservation and access (sherpa) web site so that faculty may determine the policies of particular journals in which they publish, letter templates they can use when seeking permission from publishers to post articles in the repository, and an “author’s addendum” that authors can use to modify their copyright transfer agreement with their publisher. (this addendum is based on the addendum created by sparc, and was reviewed and approved by ku general counsel.) romeo green (ii) in early , ku libraries continued gathering faculty input by following up with a second phase of the romeo green project. this phase focused on assessing faculty perceptions of ku scholarworks, and identifying what conditions would encourage ku faculty to adopt greater use of the repository. faculty who had responded favorably to requests to participate in the first phase of romeo green (by granting permission to have some of their published articles posted in ku scholarworks) were invited to attend focus groups. during the focus groups, they discussed their knowledge and impressions of ku scholarworks, the submission process, departmental and disciplinary concerns about the repository, and any barriers to depositing their work. the twelve faculty who participated offered enthusiastic support for ku scholarworks. some, though not all, regularly submitted their work to the repository. several broad issues emerged from the focus groups. financial and administrative support. 
faculty feel overburdened as it is and feel that they and their departments do not have the time or infrastructure to take on new responsibilities, to become familiar with copyright issues, or to learn the archiving policies of different publishers. they think that centralization of these activities would be more efficient. policy and community issues. staff detected some tension between the desire to set submission and content policies at the community level, and the need to understand and be assured of the consistency and quality of content in the repository across the entire institution. this suggests a possible need to illuminate more clearly the distinction between the access and preservation functions of the repository, and the peer-review functions of formal publication. staff working with the ir need to better articulate to faculty that ku scholarworks is not intended to displace the traditional peer-review process. technological barriers. there were several suggestions for repository software changes or technology add-ons that would increase efficiency or lower technology barriers to participation (for example, the ability to automatically create pdf files as part of the submission process). marketing and education. there is a need for continued and more aggressive marketing about ku scholarworks and scholarly communication issues. participants offered many suggestions for ways to publicize these issues. they also suggested that library staff make discussions of scholarly communication issues more concrete, rather than presenting abstract and formulaic explanations about the scholarly communication system. the libraries would be more effective if they “told success stories.” faculty want to hear concrete examples of real benefits of participating in these programs, in terms they understand. a report was made to the ku libraries dean’s council with recommendations for future actions based on ir user feedback. 
the recommendations included providing greater support for teaching faculty through staff-mediated projects, developing and implementing detailed marketing and education campaigns, and providing technology support to simplify the submission process. the report was well received, and the dean of libraries presented the report to is leadership. information services is currently involved in strategic planning, and it is expected that many ideas will be implemented in support of the planning process. part two / observations and assessment successful ku scholarworks communities ku has adopted a somewhat labor intensive approach to encourage submissions to ku scholarworks that relies on building relationships with individual faculty authors, but more importantly, with potential ku scholarworks communities. informal communities include those communities established as part of the romeo green project, as well as those that were created at the request of an individual faculty member or researcher, without departmental support. thirty-one ( %) of the forty-three ku scholarworks communities are informal, and lack a designated community administrator or signed memorandum of agreement. formal communities have an identified community administrator who acts as a point of contact, and is empowered to make decisions on behalf of the community. a memorandum of agreement outlines the formal relationship between a community and ku scholarworks (http://www .ku.edu/~scholar/docs/memorandum.shtml). although informal communities make up % of the total number of communities, they account for only % of total items deposited. formal commitments with campus units seem to build stronger relationships and provide structure for ongoing community development, content recruitment, and faculty support. 
gibbons noted that understanding the needs of faculty is necessary to build a repository program, and implementers must create "a tailored and personalized impression" to which faculty can relate ( ). communities also have their own personalities, needs and uses for a repository, and it is important to develop relationships with them to understand those needs. ku scholarworks communities have come into being and grown in a variety of ways. the following three examples of successful communities will illustrate this process: author advocacy. the personal communications established through the romeo green project increased many participants’ awareness of their rights as authors. when one faculty member was asked to supply the author final draft of his work, he initially declined, but did express an interest in understanding why he was not asked to supply the publisher's versions. he preferred to have the final published version available, rather than the author final draft. this professor had served as editor of a scholarly society journal, and he used those professional connections to gain permission for ku to post in ku scholarworks the publisher versions of all articles, present and future, authored by ku faculty in that society's journals. in this case, staff efforts did not result in one of the desired outcomes of the project (for faculty to deposit their own work), but it did lead to one author's better understanding of publisher policies and author rights, and additional articles posted. perhaps even more importantly, a faculty member became an agent of change. as rogers states in his work diffusion of innovations, a "change agent's position is often midway between the change agency" (in this case, the university), and the client system (the scholarly society). the faculty member was able to effect change because he was an effective "linker" between the interests of the university and its faculty, and the scholarly society as publisher ( ). 
department-mediated submissions: the school of law and the department of public administration have adopted a mediated process whereby an appointee from the academic unit submits all work on behalf of authors. while to date only two items have been submitted to the public administration community using this method, the school of law has over items in its community. public administration and law have experienced different outcomes based on this model, and romeo green faculty focus groups expressed doubts that all departments would have the resources to take on such a task. still, a centralized approach to community development may prove an effective submission method for other campus units. graduate student project submissions: a final example demonstrates how the first student content was deposited into ku scholarworks. the school of engineering offers professionals employed in engineering firms the opportunity to pursue an advanced degree in engineering management at ku's edwards campus. the engineering management program does not have a thesis requirement, but instead requires students to submit a field project. the field projects were submitted in print to the program and retained in the program offices, and a second copy was placed on reserve in the library on the edwards campus. after the library director at the edwards campus attended a ku scholarworks information session, she determined ku scholarworks would be a more efficient method to disseminate and store the field projects. she approached the engineering management program director, and he supported adoption of a new procedure using ku scholarworks. students continue to submit field projects to the engineering management program, and edwards campus library staff then deposit an electronic copy in ku scholarworks. 
by the numbers in an earlier paper describing efforts to populate ku scholarworks, mercer and emmett stated, "ku scholarworks will fill its role as an institutional repository when its contents are representative of the vast research output from the many disciplines at ku" ( ). as of september , , there are items in forty-three ku scholarworks communities or, on average, . items per community. while the number of items available in ku scholarworks continues to increase, it hardly represents the depth or breadth of scholarship produced by ku faculty. in addition, the number of items available in ku scholarworks is far fewer than the median for association of research libraries (arl) members with repositories (university of houston libraries' institutional repository task force, ). this is despite the extensive promotion of the repository over the course of several years. why are the numbers lower than expected at this stage, and what can staff learn from this? first, one must be careful not to read too much into these numbers. lynch and lippincott, in their survey of u.s. repositories, recognized that comparing repositories by size is problematic because ...no two institutions are counting the same things. we received reports of the number of objects ranging from hundreds of thousands to, at the low end, a few dozen. the diversity in both the definition of what constitutes an "object" and in the nature of the objects being stored (massive videos or groups of datasets as opposed to individual articles or images) makes repository size very hard to interpret, or to relate to space measurements ( ). in addition, a count of total items in a repository does not take into account factors such as whether items were archived by authors or by proxies. the libraries have not been proactive in identifying for submission items such as working papers and technical reports that are already available on departmental web sites. 
ku has taken an approach that relies on building relationships with individual faculty authors and potential communities, and encourages self-archiving. most of the content in ku scholarworks has been self-archived by individuals or submitted through their community administrator, as opposed to a library staff-mediated model. another metric for measuring the success and impact of a repository is usage, which can be measured by the number of searches performed and number of items downloaded from the repository (shearer, ). the dspace usage logs at ku show that the repository is searched regularly and items are frequently accessed. conclusions ku has employed a variety of methods to encourage its faculty to take more control of the intellectual rights of their future works using the ir as a dissemination tool. as outlined in this paper, staff’s multifaceted approach has utilized the efforts of university and library top administrators, ir staff, library subject specialists, early adopters, and advisory board members to populate the repository. ku scholarworks continues to grow at a slow but steady pace, with several successful and active communities. still, ku libraries are striving for higher participation, and can make some general observations and conclusions about its approach so far. based on the experiences at ku and those reported by colleagues at other institutions, library staff know there is work yet to do to increase the rate of adoption of the ir. ku scholarworks has relied heavily on the "self-archiving" model for institutional repositories, where authors deposit their own works with little assistance from their academic units or the libraries. this model assumes faculty have made, or are willing to make, the behavioral change required to deposit their published and unpublished scholarship. 
while ultimately this behavioral shift is a desired outcome, the reality may well be that faculty will be more willing to self-archive when there is more content available in the repository. indeed, faculty stated as much during the romeo green focus groups. more content in the ir can serve as indirect evidence that current practice is shifting. until contributing to an ir is an integral part of the scholars’ social system (and hence normal practice), they are not likely to use a repository (jones et al., ). institutional repositories are still in the early stages of development. everett rogers’ innovation diffusion model defines five stages of progression: knowledge, persuasion, decision, implementation, and confirmation ( ). ku is firmly in the decision stage, with some enthusiastic early adopters and departments committed to using the repository. the university of kansas is an early adopter of an institutional repository, although individual faculty are at various stages along the adoption continuum. a handful of authors regularly submit their work to ku scholarworks, but they are not yet activists who encourage and persuade their peers to submit. the challenge will be to continue developing methods to encourage uptake so that ku scholarworks will move through the implementation phase and become part of the fabric of faculty practice at ku. while mandates may eventually be the best way to ensure comprehensive capture of the output of an institution, those running irs must continue to pursue other means of applying social and administrative pressure to persuade faculty to deposit their works. other institutions, such as the massachusetts institute of technology, have found that identifying and working with an "insider advocate" is a more effective means of increasing deposits (baudoin and branschofsky, ). a respected member of the faculty might influence behavior more than administrative encouragement. 
identifying more insider advocates or activists, who will promote ku scholarworks, is a logical next step for continued development of ku's institutional repository program. ku has experienced several changes in leadership in . with a new provost and several new deans (including a new dean of libraries), the libraries have an opportunity to work with these new campus leaders to market the ku scholarworks service and spark changes in faculty behavior. staff are hopeful that is leadership will act on faculty recommendations outlined in the romeo green report. the report calls for increased support for library-mediated submissions, and enhancements that will make faculty self-archiving easier, such as conversion to the pdf format as part of the submission process. expanding ku scholarworks to include more graduate student work is a priority for digital initiatives. during focus groups, faculty expressed strong support for inclusion of theses and dissertations in ku scholarworks. inclusion of electronic theses and dissertations (etds) will increase total submissions to the repository, but will also provide greater exposure for graduate student work. staff will also expand the number of ku scholarworks contributors by offering to host papers and presentations given at conferences and symposia sponsored by ku. ku will continue a personalized approach to encouraging use of ku scholarworks. while staff will continue to work with individual faculty, more energy will be directed toward establishing formal communities, where the most significant growth in items has occurred. as the number of ku scholarworks communities continues to rise, staff will work even more closely with library subject specialists, so that they can effectively market the repository service. staff will continue to sponsor periodic focus groups with ku scholarworks users, and others engaged in alternative methods for research dissemination. 
ku scholarworks community practices will be documented by “telling stories,” so that faculty understand how ku scholarworks reflects their own disciplinary work practices. ir administrators and advocates have the responsibility and challenge to continue to make faculty aware of the repository and related scholarly communication issues. this can be done by promoting the repository and engaging in dialogue with faculty as much as possible. use of the repository by ku faculty is tied in part to larger trends in the academic world. as self-archiving becomes an increasingly accepted part of academic practice, ku faculty will wish to participate in that practice, and ku libraries must position ku scholarworks to meet their needs as well as the needs of the institution as a whole. bibliography: association of research libraries ( ), "principles for emerging systems of scholarly publishing", available at: http://www.arl.org/scomm/tempe.html (accessed september , ). baudoin, p. and branschofsky, m. ( ), "implementing an institutional repository: the dspace experience at mit", science & technology libraries, vol. no. / , pp. - . bell, s., foster, n. f. and gibbons, s. ( ), "reference librarians and the success of institutional repositories", reference services review, vol. no. , pp. - . fyffe, r. and warner, b. f. ( ), "scholarly communication in a digital world: the role of an institutional repository", available at: http://hdl.handle.net/ / (accessed september , ). gibbons, s. ( ), "establishing an institutional repository", library technology reports, vol. no. , pp. - . harnad, s. ( ), "maximizing research impact through institutional and national open-access self-archiving mandates", proceedings cris . current research information systems: open access institutional repositories bergen, norway, available at: http://cogprints.org/ / (accessed september ). jones, r., andrew, t. and maccoll, j. ( ), "advocacy", the institutional repository. chandos publishing, oxford. 
lynch, c. a. and lippincott, j. k. ( ), "institutional repository deployment in the united states as of early ", d-lib magazine, vol. no. , available at: http://www.dlib.org/dlib/september /lynch/ lynch.html (accessed august ). mackie, m. ( ), "filling institutional repositories: practical strategies from the daedalus project", ariadne, vol. , available at: http://www.ariadne.ac.uk/issue /mackie/ (accessed august , ). mercer, h. and emmett, a. ( ), "romeo green project at the university of kansas: an experiment to encourage interest and participation among faculty and jumpstart populating the ku scholarworks repository. " proceedings of the th annual meeting of the american society for information science and technology (asist), pp. - . new orleans, available at: http://hdl.handle.net/ / (accessed september , ). phillips, h., carr, r. and teal, j. ( ), "leading roles for reference librarians in institutional repositories: one library's experience", reference services review, vol. no. , pp. - . rogers, e. m. ( ), diffusion of innovations, free press, new york. salo, d. ( ), "a messy metaphor", caveat lector, available at: http://cavlec.yarinareth.net/archives/ / / /a-messy-metaphor/ (accessed september , ). shearer, m. k. ( ), "institutional repositories: towards the identification of critical success factors", canadian journal of information and library science-revue canadienne des sciences de l information et de bibliotheconomie, vol. no. , pp. - . shulenburger, d. ( ), "moving with dispatch to resolve the scholarly communication crisis: from here to near", association of research libraries proceedings of the rd membership meeting arl washington, dc, available at: http://www.arl.org/arl/proceedings/ /shulenburger.html (accessed september , ). suber, p. ( ), "more on the kansas oa policy", available at: http://www.earlham.edu/~peters/fos/ _ _ _fosblogarchive.html (accessed september , ). 
university of houston libraries' institutional repository task force ( ), "spec kit : institutional repositories: executive summary", available at: http://www.arl.org/spec/spec web.pdf (accessed september , ). university of kansas university council ( ), "resolution on access to scholarly information: passed by the ku university council / / ", available at: http://www.provost.ku.edu/policy/scholarly_information/scholarly_resolution.htm (accessed september , ). ware, m. ( ), "pathfinder research on web-based repositories: final report", bristol, uk, publisher and library/learning systems (pals) available at: http://www.palsgroup.org.uk/palsweb/palsweb.nsf/ b d e a cb ae a e / c ce a c cd e e a/$file/pals% report% on% institutional% repositories.pdf (accessed september , ). zerhouni, e. ( ), "report on the nih public access policy", available at: http://publicaccess.nih.gov/final_report_ .pdf (accessed september , ).
ifla journal: volume number october contents special issue: cultural heritage guest editors: douwe drijfhout and tanja de boer guest editorial ifla journal special issue on cultural heritage douwe drijfhout and tanja de boer articles indigenous cultural heritage preservation: a review essay with ideas for the future loriene roy the digital library in the re-inscription of african cultural heritage dale peters, matthias brenzinger, renate meyer, amanda noble and niklas zimmer storing and sharing wisdom and traditional knowledge in the library brooke m. shannon and jenny s. bossaller the challenges of reconstructing cultural heritage: an international digital collaboration rachel heuberger, laura e. leone and renate evers born fi’ dead?
special collections and born digital heritage, jamaica cherry-ann smart digitization of indian manuscripts heritage: role of the national mission for manuscripts jyotshna sahoo and basudev mohanty preserving digital heritage: at the crossroads of trust and linked open data iryna solodovnik and paolo budroni the universal procedure for library assessment: a statistical model for condition surveys of special collections in libraries sam capiau, marijn de valk and eva wuyts cultural heritage digitization projects in algeria: case study of the national library nadjia ghamouh and meriem boulahlib abstracts aims and scope ifla journal is an international journal publishing peer reviewed articles on library and information services and the social, political and economic issues that impact access to information through libraries. the journal publishes research, case studies and essays that reflect the broad spectrum of the profession internationally. to submit an article to ifla journal please visit: http://ifl.sagepub.com ifla journal official journal of the international federation of library associations and institutions issn - [print] - [online] published times a year in march, june, october and december editor steve witt, university of illinois at urbana-champaign, main library, mc – w. gregory drive, urbana, il, usa. email: swwitt@illinois.edu editorial committee rafael ball, eth-bibliothek, zurich, switzerland. email: rafael.ball@library.ethz.ch barbara combes, school of information studies, charles sturt university, wagga wagga, nsw australia. email: bcombes@csu.edu.au maría del cármen díez hoyo, spain. email: carmen.diez-hoyo@aecid.es ben gu, national library of china, beijing, people’s republic of china. email: bgu@nlc.cn dinesh gupta, vardhaman mahaveer open university, kota, india. email: dineshkg.in@gmail.com/dineshkumargupta@vmou.ac.in mahmood khosrowjerdi, allameh tabataba’i university, tehran, iran. email: mkhosro@gmail.com/mkhosro@atu.ac.ir jerry w.
mansfield (chair) congressional research service, library of congress, washington, dc. email: jmansfield@crs.loc.gov ellen ndeshi namhila (governing board liaison) university of namibia, windhoek, namibia. email: enamhila@unam.na seamus ross, faculty of information, university of toronto, toronto, canada. email: seamus.ross@utoronto.ca shali zhang, university of montana, missoula, montana, united states. email: shali.zhang@mso.umt.edu publisher sage, los angeles, london, new delhi, singapore and washington dc. copyright © international federation of library associations and institutions. uk: apart from fair dealing for the purposes of research or private study, or criticism or review, and only as permitted under the copyright, designs and patents acts , this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the copyright licensing agency (www.cla.co.uk/). us: authorization to photocopy journal material may be obtained directly from sage publications or through a licence from the copyright clearance center, inc. (www.copyright.com/). inquiries concerning reproduction outside those terms should be sent to the publishers at the address below. annual subscription ( issues, ) free to ifla members. non-members: full rate (includes electronic version) £ /$ . prices include postage. full rate subscriptions include the right for members of the subscribing institution to access the electronic content of the journal at no extra charge from sage. the content can be accessed online through a number of electronic journal intermediaries, who may charge for access. free e-mail alerts of contents listings are also available. 
for full details visit the sage website: www.sagepublications.com. student discounts, single issue rates and advertising details are available from sage, oliver’s yard, city road, london ec y sp, uk. tel: + ( ) ; e-mail: subscriptions@sagepub.co.uk; website: www.sagepublications.com. in north america from sage publications, teller road, thousand oaks, ca , usa. periodicals postage paid at rahway, nj. postmaster: send address corrections to ifla journal, c/o mercury airfreight international ltd, blair road, avenel, nj , usa. please visit http://ifl.sagepub.com and click on more about this journal, then abstracting/indexing, to view a full list of databases in which this journal is indexed. printed by henry ling ltd, dorset, dorchester, uk.
guest editorial
ifla journal special issue on cultural heritage
douwe drijfhout, national library of south africa, pretoria, south africa
tanja de boer, koninklijke bibliotheek, the hague, the netherlands
cultural heritage (ch) consists of tangible and intangible, natural and cultural, movable and immovable assets inherited from the past. it is of extremely high value for the present and the future of a country. access, preservation, and education around cultural heritage are essential for the evolution of people and their culture. ch preservation and conservation management present unique practices and challenges worldwide. in africa these are amplified by the number of languages and indigenous knowledge systems, the range of economic conditions, varying climates, and histories that encompass ancient civilizations and post-colonial realities. this special issue aims to contribute to a deeper understanding of ch preservation and highlights case studies and practices from within the cultural heritage community and context.
in particular, the main goal of this issue is to gather interdisciplinary and inter-professional research on ch in african libraries, but not excluding other continents; the use of new technologies in protecting, restoring, and preserving ch; the use of digitization, documentation, and preventive conservation to make ch content accessible; and the impact of natural disasters and conflict on preserving ch (african case studies). with the st ifla general conference and assembly taking place in africa this year, the need for dynamic libraries is expressed. the congress theme ‘dynamic libraries: access, development and transformation’ is of critical importance to strengthen democracy on the african continent and to eradicate poverty, illiteracy, and unemployment. the preservation and restoration of ch has always been a priority for ifla. it is essential to monitor areas at risk, to advocate for and raise awareness about conflict and disaster prevention. with an increase in ch being abused for political propaganda and destroyed to serve certain agendas, the protection of ch has never been more important. the editors compiled a range of articles that represent contributions from algeria, india, flanders (belgium), south africa, jamaica, germany and the united states. the review article provides some thought-provoking recommendations on the role of libraries and librarians in preserving, promoting, and advancing indigenous ch. through this special issue, the editors hope to highlight the case studies and current research on cultural heritage from the perspectives of libraries and archives.
corresponding author: tanja de boer, koninklijke bibliotheek, prins willem-alexanderhof , po box , the hague, lk, the netherlands. email: tanja.deboer@kb.nl
international federation of library associations and institutions , vol. ( ) © the author(s) reprints and permission: sagepub.co.uk/journalspermissions.nav doi: .
/ ifl.sagepub.com
article
indigenous cultural heritage preservation: a review essay with ideas for the future
loriene roy, university of texas at austin, usa
abstract
this literature review shows the realm of indigenous cultural heritage preservation within libraries is an area still ripe for meaningful exploration and achievement. yet this field is also still sensitive and potentially harmful for the cultural communities who have entrusted these institutions with their living treasures. opportunities abound to make a difference, but they may need to evolve from changes in generational attitudes and approaches.
keywords
libraries and society/culture, cultural heritage management, indigenous knowledge systems, principles of library and information science, preservation and conservation, collection development
introduction
according to the united nations educational, scientific, and cultural organization (unesco, – , § ), “cultural heritage is the legacy of physical artefacts and intangible attributes of a group or society that are inherited from past generations, maintained in the present and bestowed for the benefit of future generations”. notably, this definition addresses a cultural heritage’s physical characteristics, history or provenance, and importance or potential over time. the preservation of cultural heritage is therefore concerned with safeguarding both the tangible representations of culture (including everyday objects such as clothing and dwellings, as well as art in its many representations, from pottery and beadwork to painting and sculpture) and the other, less physical but equally important, aspects of traditional lifeways such as language, oral stories, customs, and beliefs. as information settings, libraries are concerned with cultural heritage preservation from several vantage points.
firstly, they collect and house cultural heritage in numerous formats, from print to media to digital. secondly, they create and organize records of cultural heritage, as reflected through the processes of cataloging and classification. thirdly, they provide access to these records through specific policies and practices (such as employing digitization as a way to document our collective memory) and, thus, assist and shape users’ understanding of the nature of that cultural heritage. fourthly, libraries provide a location for cultural heritage to be expressed, shared, and continued by serving both as the venue for their study and as a space for holding programs and events to celebrate and contemplate heritages’ meanings. lastly, libraries themselves provide laboratories for creating ongoing cultural heritage by providing education, equipment, and training to the wider community. while these vantage points may differ in some regards, all of these activities fundamentally call on librarians to balance their professional standards of behavior with the protocols of the originating communities. so how do information professionals learn their roles in these different processes? librarians turn to their professional associations, such as the international federation of library associations and institutions (ifla), for guidance and education.
corresponding author: loriene roy, university of texas at austin, austin, tx, usa. email: loriene@ischool.utexas.edu
ifla’s concern with cultural heritage preservation is apparent in its structure, statements and publications, and special initiatives. key among ifla’s structural units is the preservation and conservation section, which has organized events including workshops and programs, as well as producing publications (ifla, ). in , ifla released a joint statement with the international publishers association (ipa) that addressed the archiving and preservation of digital information by calling on national libraries, in particular, to “take the lead responsibility for long-term archiving of digital publications” (ifla/ipa steering group, : § ). ifla’s commitment to cultural heritage preservation is also seen in its role as a signatory to the lyon declaration on access to information and development, which affirms the role of libraries and archives in “preserving and ensuring ongoing access to cultural heritage” (ifla, ). within ifla, there is evidence that the association has considered, and is still considering, its stance specifically regarding indigenous cultural heritage preservation. ifla’s statement on indigenous traditional knowledge recommends that libraries and archives “implement programs to collect, preserve, and disseminate indigenous and local traditional knowledge resources” (ifla, : § ). additionally, several ifla presidents have chosen to highlight indigenous cultural knowledge and services for indigenous people as part of their initiatives. this includes presidents kay raseroka, alex byrne, and ingrid parent. raseroka launched and supported the initial discussions that drew attention to indigenous knowledge. byrne formed a presidential committee on indigenous matters and sponsored several forums at the world library and information congresses in and , as well as a program for the presidential commission on indigenous issues in .
this attention led to the approval, in december , of an ifla special interest group (sig) on indigenous matters under the library services to multicultural populations section. parent chose the theme of “indigenous knowledges: local priorities, global contexts” for one of her two ifla presidential programmes held at the university of british columbia in the spring of ; archived copies of selected slides and recordings from this programme are available free online (ifla, b). my own background in the area of cultural heritage preservation concerns the place of the information center in supporting living indigenous cultures. therefore, while this review article highlights past work on cultural heritage preservation in general, it does so from the perspective of work in and consideration of indigenous cultural expressions. in addition to citing and summarizing key publications, the article provides advice for those wishing to continue to follow this issue and/or to attend events where the topic is explored, discussed, and advanced. note that, while i refer throughout this article to the first peoples of the land as indigenous or native, i acknowledge that there may be other local naming preferences. my comments are offered from the place of personal choice and preference and, as one person, my life experience is limited. i say this in my place as an indigenous person and with respect to other indigenous peoples across the globe: i am anishinabe, enrolled or an official member of the white earth reservation, a member of the minnesota (usa) chippewa tribe. for more information about me, see a recent article in the journal alternative (roy, b). while several writings specifically in this area of cultural heritage preservation are cited here, it is important to note that those interested in the topic should broaden their perspective by engaging with writings that address the philosophies, worldviews, and epistemology of the originating cultures.
for example, a true collection of writings on indigenous cultural heritage would include a broad range of publications, from tribal newspapers to fiction and poetry. i must also note that a literature review in and of itself may not be reflective of an indigenous approach to gathering, presenting, and sharing information. martin ( ) points out that an indigenous approach to writing a literature review would involve seeking information from primary sources by indigenous peoples first, followed by reviewing both primary and secondary sources by non-indigenous peoples. additionally, anderson summarizes the place of the library or archive as a colonial space, while also proposing what might happen when the power dynamic shifts so that “the people traditionally subjected to archives gain a recognized voice and question not only status within the archive, but the authority of the archive as a centre of interpretation” (anderson, : ). acknowledging the limits of the printed word in reflecting indigenous culture and of the presence of libraries and archives as colonizing structures, the following essay holds such guidance in mind while introducing sources of information that might be more easily located by any reader. at a minimum, published accounts are one way to trace the extent to which the topic is discussed within the library and information science professional literature: this is just one step in understanding indigenous cultural heritage. note that this review essay is in no way comprehensive, and that the titles chosen are examples of research sources. due to my own experience and personal collection, the bias is toward english language materials produced in north america. this literature review is organized into the following sections.
the first section is a literature review essay, introducing publications that can answer user and librarian questions such as:
• i need an encyclopedia article on native governments and organizations.
• are there any special issues of journals on indigenous cultural heritage preservation?
• are there any entire books on indigenous librarianship?
• what conferences might i attend to hear the latest about indigenous cultural heritage preservation?
the second section summarizes literature addressing specific topics including policy, research methodologies, and practical guidance. the final section is a discussion for librarians on how to contribute to advancing awareness and continued discussion of this topic. the article then closes with suggestions for further research and of how information workers might advocate for greater attention to this issue.
indigenous cultural heritage literature: the general literature
while indigenous peoples have existed since long before librarianship developed as a formal profession during the late th century, the topic of indigenous cultural heritage preservation is still emerging in the professional literature. this literature review summarizes the existing threads of this emerging thinking. printed accounts of the roles of libraries, archives, and museums in preserving cultural heritage can be found as encyclopedia articles, monographs, book chapters, special issues of journals, journal articles, and conference proceedings. several organizations additionally exist that host deep discussions, or serve as platforms for those working in cultural heritage settings to share experiences and to seek support and advice; such events often exist only in the memories of attendees. each of these sources of knowledge is explored below. if we were to create a path through the indigenous cultural heritage literature, we might first want to locate a survey or overview article in an encyclopedia.
there are two main types of encyclopedias: general encyclopedias and subject, or specialized, encyclopedias (cassell and hiremath, ). since indigenous cultural heritage preservation is in itself a special topic, a first search step might be to examine subject encyclopedias such as national encyclopedias, discipline-specific encyclopedias, and then subject encyclopedias focused on indigenous cultures. national encyclopedias, in particular, may provide a majority perspective on indigenous issues and can set the stage for considering indigenous cultural heritage. for example, the canadian encyclopedia includes an entry on “aboriginal cultural landscape” (buggey, ), while the entry on “te tāpoi māori – māori tourism” in te ara encyclopedia of new zealand includes content on “preserving culture” (diamond, ). additionally, the gale virtual reference library (gvrl) is an index that points to content held in several reference sources, including encyclopedias, handbooks, and biographical sources. a search in gvrl under the terms “indigenous” and “culture” and “preservation” will yield hundreds of results, providing insight into the wide interdisciplinary interest in this topic. for example, content on cultural heritage preservation can thus be found in the encyclopedia of public health (durie, ), encyclopedia of environment and society (robbins, ), encyclopedia of language and education (hamel, ), world history encyclopedia (andrea and neel, ), and encyclopedia of law and society: american and global perspectives (clark, ), among other sources. a more focused coverage of indigenous cultural heritage can be found in encyclopedias specifically on indigenous culture. the handbook of north american indians is an incomplete encyclopedia set that was planned in the s with volume , the final volume, scheduled to be in-press in .
volume , published in , addresses relevant topics in the chapters grouped under “social and cultural revitalization”, including coverage of subjects such as repatriation (mckeown, ), native museums and cultural centers (watt and laurie-beaumont, ), and languages and language programs (hinton, ). finally, turning to our own discipline, the article on “indigenous librarianship” in the encyclopedia of library and information science introduces particular issues that impact indigenous cultural heritage preservation such as access, protocols, and intellectual and cultural property rights (burns et al., ). several monographs have been published on the place of libraries and archives in supporting indigenous cultural heritage, especially by hills ( ), rockefeller-macarthur ( ), and roy et al. ( ). in , roy (who served as the first convener for the ifla sig on indigenous matters from to ) and frydman edited a free online book, library services to indigenous populations: case studies (roy and frydman, ). in addition, books on museums may provide insight into the attitudes and actions regarding handling, describing, and exhibiting indigenous cultural material. some titles focus on the history of one or more specific museums (force, ; lonetree, ; spruce and thrasher, ), while others may discuss work with specific tribal communities (clavir, ; mccarthy , ). in addition to entire books, book chapters, especially in texts about social justice (roy and hogan, ) or access to knowledge (roy et al., ), might also be of use. only a few publications address the role of the library or archives in language recovery; reznowski and joseph’s book chapter ( ) addresses the potential for archivists to work with tribal members in such efforts. however, libraries and museums can play critical roles in working with indigenous communities in the protection and recovery of indigenous languages.
many of these languages are sleeping, largely as a result of colonization practices that aimed to force or encourage native peoples to cast off their ethnic identities and to assume the language, beliefs, and behaviors of western cultures. many resources are now available to libraries for supporting such indigenous language study, including print and online dictionaries, language courses on dvd, and other publications, such as board books for young readers. several such resources are published locally and with limited distribution. some publishers may specialize in indigenous content and even produce bi-lingual publications: in the united states this includes salina bookshelf ( ), a publisher of high-quality children’s board books and picture books, as well as of navajo language materials. similarly, birchbark books ( ) publishes literature supporting anishinabemowin, the language of the anishinabe of the great lakes region of the united states, through wiigwaas press. in addition to encyclopedia articles and topical books, a number of special issues of journals have been published that focus on indigenous cultural heritage. this list includes an issue of world libraries ( ), d-lib magazine ( ), several issues of the electronic library ( a, b), an issue of international preservation news ( ), and a special issue on native american archives in the journal of western archives ( ). ongoing coverage of indigenous issues also appears in the newsletters of professional associations, such as the american indian libraries newsletter of the american indian library association, which is one of five ethnic library organizations affiliated with the american library association (ala). such journal articles may focus on materials, collections, or services, or may even present cases of activities at one site, thereby providing greater context in the discussion of indigenous cultural heritage.
roy ( ), for example, discusses providing readers’ advisory services for indigenous patrons who live far from their homelands, while danowitz and videon ( ) introduce online resources on american indians. ifla’s own journal has published a number of articles that illustrate the organization’s support of professional literature on traditional heritage preservation, such as chakravarty’s ( ) article on the traditional knowledge digital library of india and greyling and zulu’s ( ) article on the ulwazi website that includes a database of indigenous knowledge in south africa. as we begin to look beyond the traditional written venues for developing the indigenous cultural heritage literature, the proceedings of conferences provide a useful means to read print versions of past presentations. some conferences to watch include the annual ifla world library and information congress (wlic), where meetings of the sig on indigenous matters have been held annually since august . the sig often sponsors a program at ifla and presentations on this topic are often included in programs at the wlic, especially those hosted by the ifla section on library services to multicultural populations. the section sometimes additionally organizes an ifla satellite meeting prior to or after the ifla world library and information congress, and issues calls for presentations on topics such as services for indigenous communities. not all conference presentations on indigenous cultural heritage are published in print or online, however. such events include presentations on indigenous matters that take place at national, state, or regional conferences that are organized by units or individuals interested in the topic.
for example, the association of tribal libraries, archives, and museums (atalm) organizes the annual international conference of indigenous archives, libraries, and museums; this conference is held in various locations within the united states and is open to any registrant interested in the work of tribal museums, libraries, archives, and native language recovery. atalm was founded in and, while it does not publish proceedings, it has supported reports that would be of use to anyone interested in the topic (jorgensen, ; jorgensen et al., ). another relevant setting for the discussion of indigenous matters in educational settings, including libraries, is the world indigenous peoples conference on education (wipc: e); wipc: e takes place every three years in locations around the globe and is open to any delegate interested in matters of indigenous education. relevant presentations are additionally often found at the annual conferences of ala, as well as at those of the australian library and information association, canadian library association, and library and information association new zealand aotearoa. interdisciplinary meetings such as those of naisa (native american and indigenous studies association) and the popular culture association/american culture association provide options for presenters from all disciplines, including librarianship, to share papers. some meetings are developed for a specific constituency and not open to nonmembers. such events include the annual tribal college librarians professional development institute held for a week in june (usually in bozeman, montana, usa), and the convening culture keepers meetings held for those working at tribal libraries, archives, and museums in wisconsin and minnesota, usa. in , the first international indigenous librarians forum (iilf) took place in auckland, aotearoa/new zealand (roy, ).
since then, iilf has taken place every other year in rotating locations, including venues in sweden, the united states, canada, australia, aotearoa/ new zealand, and norway. one tradition of this meeting is for indigenous delegates to meet as a council and to create some product that particularly represents the gathering, such as a mission statement or a plan of action. the iilf proceedings are sometimes published, but usually with distribution limited to forum attendees. while some portion of iilf is open to non-indigenous attendees, each forum usually organizes time for indigenous-only deliberation. additionally, some gatherings are one-off events where participants produce statements or agendas with wide-reaching potential. for example, article of the declaration of principles building the information society: a global challenge in the new millennium was the result of work conducted at the world summit on the information society (wsis) and stated that “in the evolution of the information society, particular attention must be given to the special situation of indigenous peoples, as well as to the preservation of their heritage and their cultural legacy” (world summit on the information society : article ). other such noteworthy events include the salzburg global seminar on “connecting to the world’s collections: making the case for the conservation and preservation of our cultural heritage” (stoner, n.d.) and the ifla presidential programme on “indigenous knowledges: local priorities, global contexts”. my discussion here has demonstrated that the academic literature on the role of libraries in preserving and promoting indigenous cultural heritage is still rather sparse and underdeveloped. the next section presents a brief introduction to supportive policy and protocol documents that are providing a broader context.
indigenous cultural heritage preservation: selected relevant policy documents

the increasing importance of digital cultural heritage expands the discussion surrounding indigenous cultural heritage to legal and policy issues, bringing in topics such as e-publishing, e-lending, and access, including the notion of the right to be forgotten. lor and britz ( ), for example, examine the ethical considerations underlying preservation of digital content. their table on the “information rights of moral agents involved in, or affected by digital presentation” summarizes the various rights involved in such preservation (from personal autonomy to own intellectual property) that might be attributed to places or people, which they refer to as moral agents. they define moral agents as individuals, groups, or institutions including authors/creators, originating communities, rights holders, holding institutions, persons depicted, digitizing/acquiring institutions, and users (lor and britz : ). some rights are supported by more than one moral agent and, thus, create potential conflict areas. some of these potential conflict areas are explored in a variety of policy documents that set the stage for working with indigenous cultural heritage.

recent writings have described how accommodations need to be made to the professional standards and processes learned by librarians and archivists (ogden, ). the most influential and controversial publications in the area of cultural heritage are those created for the use and access of representations of traditional cultural expression. the first statement was the mataatua declaration on cultural and intellectual property rights of indigenous people (mataatua, ). this brief document outlines recommendations for indigenous peoples to define, develop, maintain, and protect their traditional practices and calls on states, nations, and organizations for support.
in , a group of educators in alaska collaborated on guidelines for working with indigenous communities, resulting in a series of documents that provide guidance to tribal elders, educators, parents, and authors, among others, who have roles in tribal cultural heritage (assembly of alaska native educators, ).

within the library and archives communities, the primary groundbreaking document was the set of protocols developed by and distributed through the aboriginal and torres strait islander library and information resource network (atsilirn) ( ); these protocols start with the recognition that the indigenous peoples of australia are the owners of their traditional knowledge. the atsilirn protocols stimulated the development of a similar document in the united states by the first archivists circle, known as the protocols for native american archival materials. although these protocols have not been formally accepted or endorsed by the society of american archivists (saa), they have prompted much discussion on topics such as access, especially with regard to non-tribal access to materials that might be culturally sensitive (first archivists circle, ).

on a global scale, the united nations declaration on the rights of indigenous peoples (undrip) (united nations general assembly, ) serves as a critical model for work with indigenous peoples. specifically, article affirms that “indigenous peoples have the right to practice and revitalize their cultural traditions and customs. this includes the right to maintain, protect and develop the past, present and future manifestations of their cultures”.
article additionally states that “indigenous peoples have the right to revitalize, use, develop and transmit to future generations their histories, languages, oral traditions, philosophies, writing systems and literatures”, while article asserts that “indigenous peoples have the right to maintain, control, protect and develop their cultural heritage, traditional knowledge and traditional cultural expressions” (united nations general assembly, ). adopted by the united nations general assembly in , these three articles affirm the leadership role that indigenous people may take in their own cultural heritage preservation. similarly, springer ( ) provides an overview of unesco’s activities related to the preservation of indigenous knowledge.

before and after such policy documents are written, their recommendations must be developed, considered, and tested. research methods and theory provide a broader context within which the topic of indigenous cultural heritage preservation might be viewed. those are the topics of the next section.

indigenous cultural heritage literature by issue: research methodologies and theory

writing on indigenous cultural heritage preservation might require researchers to adopt a non-western orientation that is more reflective of indigenous worldviews. adopting a new perspective may be challenging since the topic is not well covered in library and information science curricula and little addressed in professional statements within librarianship. thankfully, inroads have been made within several literatures that call on researchers to challenge their methods while proposing alternatives. decolonizing methodologies, tuhiwai smith’s ( ) ground-breaking text, argues that there are other valid research methodologies besides those based on the scientific method and that these are better suited for work with indigenous peoples.
her work has stimulated other publications, including wilson’s research is ceremony ( ), the decolonizing handbook (denzin et al., ), and writings on decolonizing research in other disciplines such as social work (gray et al., ). martin ( ) calls for an indigenist approach to conducting research that recognizes indigenous worldviews, honors their social values, emphasizes the contexts in which they lived, and privileges the indigenous voice and experience. both martin ( ) and wilson ( ) call on indigenous scholars to base their research approaches on their indigenous ontology, or views on reality.

within the library and archives literature, lonetree’s ( ) decolonizing museums specifically examines how the museum setting might be decolonized, and illustrates this concept through detailed case studies of three museums in the united states. nakata’s ( ) concept of the cultural interface provides the best model for the interaction between information workers and indigenous peoples and their cultural representations. indigenous peoples live in this interface, the place where their indigenous lifeways and western viewpoints come together, and “a place of tension that requires constant negotiation” (nakata, : ). within this space indigenous living may either flourish or be repressed, and it is here that cultural heritage institutions reside. thus, the role of these institutions, including libraries and archives, within this space should not be underestimated. respectful and supportive work within the cultural interface can be assisted, however, by a mindful attention to practice, the details of which are introduced in the next section.

indigenous cultural heritage literature by issue: practical guidance

the literature review section of this article thus far has largely provided a summary of the print literature available on the preservation of indigenous cultural heritage, as well as highlighting important policy documents.
the topic has been further supported by a discussion of adopting non-western research methods with an underlying model or theory that specifically places indigenous cultural heritage in the realm of the world’s knowledge and that describes the intersection between this knowledge and the mission of cultural heritage institutions, including libraries. this section now presents some of the practice-oriented literature on how to accomplish preservation of indigenous cultural heritage.

as graham ( : ) points out, “without preservation there will be no long-term access to heritage materials”. byrne ( ) adds a caveat to the rush to equate digitization with access, noting that barriers to access might emerge:

but the access is not without limitations. it is limited by the availability of reliable and affordable information and communication technologies. it is limited to those scholars and students who are affiliated to organisations which have the money and skills to provide access. it is limited to those who are literate, information-literate, and have a command of the major languages of commerce and scholarship and, of course, english in particular. in addition, contractual and other bounds imposed by vendors exclude many potential users. (byrne, : )

thus, even with broad-scale support for digital cultural heritage, there are still barriers to advancing these topics and any resultant products. while digitization is often regarded as the de facto process for preservation, christen ( : ) warns that:

digital technologies and the internet have combined to produce both the possibility of greater indigenous access to collections, as well as a new set of tensions for communities who wish to gain some control over the classification of, access to, and cultural protocols for the circulation of those materials.

for example, boamah et al.
( ) summarize the hindrances to digital preservation of cultural heritage (dpch) in ghana and point to the lack of interest among key stakeholders as the primary reason why dpch has not been a priority within the country.

although such barriers are yet to be overcome, a number of existing publications can provide guidance and practical advice in cultural heritage preservation initiatives. trails ( ) is a free notebook available on the website of ala’s office for diversity, literacy and outreach services. it includes sections on developing library collections and how to care for them, including book repair, emergency planning, and planning digital projects. the guide to building support for your tribal library toolkit is available free on the ala website to anyone wishing to advocate for their tribal library (american library association, ). cooper and sandoval ( ) additionally provide essays with considerable practical advice for starting local community museums, advice that can also be applied to library project management. one of ifla’s key publications on cultural heritage is the ifla disaster preparedness and planning manual that is available for free online and in multiple languages (mcilwain, ). ogden’s ( ) caring for american indian objects: a practical and cultural guide has both general advice on topics such as handling and housekeeping and specialized advice for objects specific to indigenous cultural materials, such as birch bark, quills, shells, skin and skin products, and glass beads.

central to most library literature is the topic of collection development. hogan ( : ) found that “a survey of the library literature on collection development reveals few articles or books addressing collection development specifically in tribal libraries”.
those writings that do address publications for and about native peoples often introduce titles and advice for non-native serving libraries on building collections specifically for children and youth. such publications do not address the needs of adult library patrons. advocacy for books by writers of color that depict people from many cultural backgrounds in contemporary settings has culminated in the “we need diverse books” ( ) campaign in the united states. an overview of the current status of indigenous children’s literature of the americas can be found in the oxford handbook of indigenous american literature (roy, a). local efforts have also developed to increase this literature, including the publication of material by indigenous communities in the local languages. to keep up to date on new indigenous literature for youth, readers can follow book award winners, such as those recognized through the american indian library association youth literature awards. another key source to follow for updated information on indigenous children’s literature and related issues is reese’s ( ) popular blog, “american indians in children’s literature”.

recommendations, further research, and advocacy

in their literature review, ekwelem et al. ( ) offer several recommendations for advancing the preservation of cultural heritage. these recommendations focus specifically on the “( ) training of librarians . . . ( ) provision of infrastructure . . . ( ) adequate funding . . . ( ) environmental conditions . . . ( ) provision of internet infrastructure . . . [and] ( ) building in incentives for the local population” (ekwelem et al., : ).
touching on many of these issues, jorgensen ( : ) found that tribal archives, libraries, and museums were ill-prepared in the particular areas of “conservation, preservation and emergency preparedness”, with one-third of these institutions reporting that they had no single person on their staff with these responsibilities.

all of these more general recommendations are also relevant specifically for advancing indigenous cultural heritage preservation. i will examine one of these recommendations, that of training, in some depth, and then briefly mention several others. overall, these recommendations point to the need for one or more organizations to coordinate and support the needs of the individual librarian, so that they may acquire the skills (with adequate support) to both convince and collaborate with their local communities.

as indigenous communities recover and build their economies, they reach out to libraries, archives, and museums as settings with staff knowledgeable in cultural preservation (roy et al., ). there is a need for education and training for all library, archives, and museum staff in order to acquire some degree of cultural competency as well as the specific techniques and processes involved in the preservation of indigenous cultural heritage. since people of color are still underrepresented in graduate programs of library and information science, those working with indigenous cultural expressions and their creators may not be tribal community members. overall ( ) provides an additional rationale for acquiring cultural competence, arguing that “knowledge about diverse cultures begins a lifelong process of learning about cultural differences to effectively reach those who would benefit the most from library services” (p. ).
it therefore behooves all library staff, librarians and archivists as well as their educators, to be open to acquiring training on understanding indigenous ways and working with representations of indigenous cultural views.

ifla’s guidelines for professional library/information educational programs introduces curricular areas that ifla considers critical for the education of librarians. while cultural heritage preservation might be addressed under a number of areas, from “information resource management” to “management of information agencies”, it is core element eleven, “awareness of indigenous knowledge paradigms”, that affirms a global responsibility for educating librarians to understand their place in serving these audiences (ifla a: ). these curricular areas were based on the professional development scheme in new zealand/aotearoa. librarians who seek to achieve or retain professional registration within new zealand must show continuing competency in the content areas that together constitute the body of knowledge, the last of which is “awareness of indigenous knowledge paradigms” (library and information association new zealand aotearoa, ).

at the national level in the united states, master’s programs seeking accreditation through ala report on how they meet the particular standards for accreditation whereby “the nature of a demonstrably diverse society is referenced throughout the standards because of the desire to recognize diversity, defined in the broadest terms, when framing goals and objectives, designing curricula, and selecting and retaining faculty and students” (american library association, ). still, more tailored preservation training options for those working specifically with indigenous cultural heritage are needed.
such national training opportunities could replicate those offered in new zealand, where workshops on training and care of indigenous cultural material have been brought directly to the source communities (graham, ). these workshops were predicated on the development of specialized indigenous staff, who then lead such localized training efforts. a similar suggestion evolved from the salzburg global seminar, which recommended resources training for the non-specialist, starting with an understanding of “why preserve?” (stoner, n.d.).

within the domain of further education is the need for continuing discussions on what it means to hold and care for indigenous cultural heritage. such discussions would engage indigenous and non-indigenous cultural heritage workers in sharing their views of objects and content, use of materials, and the cultural context of such use. additionally, the results of such discussions and deliberations should be disseminated more broadly, and key discussions should not be limited only to those with the resources to travel and the professional connections that endow participants with elite status.

these changes involve critically challenging professional values, in order to ensure that the library and archives profession no longer reflects western colonialized views of interactions with indigenous peoples. a cultural heritage representative must not only learn to advocate for the security and well-being of the cultural material they house; they must also shift their ways of thinking in order to make sure that their use and access place traditional lifeviews as primary. each individual information worker should “think critically about their practice in relation to indigenous peoples” (mccarthy, : ). roy and trace (forthcoming) argue that simply consulting with indigenous communities about their cultural heritage is insufficient; negotiation and power sharing are needed.
cultural heritage institutions should review their policies in order to make sure that they are welcoming to indigenous source communities, and librarians “must engage with how indigenous peoples choose to be cultural” (roy et al., : ). librarians must also embrace a new protocol and worldview that values the records of the past through the eyes of their creators’ descendants, as we are reminded in the maori phrase, “me hoki whakamuri, kia ahu whakamua, ka neke”, or “our future lies in the past” (heikell, ). together, these communities and the libraries that serve them can imagine work settings with policies and other directives that welcome indigenous peoples.

in addition to changing such local policies, national professional associations should follow the lead of lianza and enter into a contract with their indigenous information workers. individual libraries and information settings should look to the actions and activities of other cultural heritage institutions. for example, mccarthy ( : ) describes how “increasingly, the relationship between the museum and source communities has moved beyond consultation and collaboration to explore new ways of working that ask ‘for partnership rather than superficial involvement’, in which both parties share power”. this model can easily be applied to libraries, as well.

new theoretical models for implementing such changes will continue to emerge, as many young scholars are concentrating on services for indigenous peoples by doing extensive reading, opening discussions, writing, and connecting indigenous knowledge paradigms to what librarianship has to offer. new tools are being developed, while frameworks brought in from other disciplines are offering different ways of thinking. as this introductory literature review has shown, the realm of indigenous cultural heritage preservation within libraries is still ripe for meaningful exploration and achievement.
yet this field is also still sensitive and potentially harmful for the cultural communities who have entrusted these institutions with their living treasures. opportunities abound to make a difference, but they may need to evolve from changes in generational attitudes and approaches.

funding

this research received no specific grant from any agency in the public, commercial or not-for-profit sectors.

references

aboriginal and torres strait islander library and information resource network ( ) aboriginal and torres strait islander protocols for libraries, archives, and information services. available at: http://atsilirn.aiatsis.gov.au/protocols.php (accessed may ).
american library association ( ) standards of accreditation of master’s programs in library and information studies. available at: http://www.ala.org/accreditedprograms/sites/ala.org.accreditedprograms/files/content/standards/standards_ _adopted_ - - .pdf (accessed may ).
american library association. office for diversity, literacy, and outreach services ( ) guide to building support for your tribal library toolkit. available at: http://www.ala.org/offices/olos/toolkits/triballibrary (accessed may ).
anderson j ( ) access and control of indigenous knowledge in libraries and archives: ownership and future use. in: correcting course: rebalancing copyright for libraries in the national and international arena, columbia university, ny, usa, – may . available at: http://ccnmtl.columbia.edu/projects/alaconf /paper_anderson.pdf (accessed may ).
andrea aj and neel c (eds) ( ) indigenous people of the caribbean since . in: world history encyclopedia. santa barbara, ca: abc-clio, pp. – . gale virtual reference library. available at: http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= c be f a a fcb b fb (accessed may ).
assembly of alaska native educators ( ) guidelines for respecting cultural knowledge.
available at: http://www.ankn.uaf.edu/publications/knowledge.pdf (accessed may ).
boamah e, dorner dg and oliver g ( ) stakeholders’ attitudes towards the management and preservation of digital cultural heritage resources in ghana. australian academic & research libraries ( ): – .
birchbark books ( ) available at: http://birchbarkbooks.com/wiigwaas-press (accessed may ).
buggey s ( ) aboriginal cultural landscape. in: the canadian encyclopedia. available at: http://www.thecanadianencyclopedia.com/en/article/aboriginal-cultural-landscape/ (accessed may ).
burns k, doyle a, joseph g, et al. ( ) indigenous librarianship. in: encyclopedia of library and information sciences. rd edn. florence, ky: taylor & francis, pp. – .
byrne a ( ) digital libraries: barriers or gateways to scholarly information? the electronic library ( ): – .
cassell ka and hiremath u ( ) reference and information services: an introduction. rd edn. chicago, il: american library association.
chakravarty r ( ) preserving traditional knowledge: initiatives in india. ifla journal ( ): – .
christen k ( ) opening archives: respectful repatriation. the american archivist ( ): – .
clark d s (ed.) ( ) aboriginal and indigenous peoples, legal systems of. in: encyclopedia of law and society: american and global perspectives. thousand oaks, ca: sage, pp. – . gale virtual reference library. available at: http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= a d e f eac b d b e (accessed may ).
clavir m ( ) preserving what is valued: museums, conservation, and first nations. vancouver, british columbia: ubc press.
cooper kc and sandoval ni (eds) ( ) living homes for cultural expression: north american native perspectives on creating community museums. washington, dc: national museum of the american indian, smithsonian institution.
danowitz es and videon c ( ) native american resources: sites for online research. college & research libraries news ( ): – .
denzin nk, lincoln ys and tuhiwai smith l (eds) ( ) handbook of critical and indigenous methodologies. los angeles, ca: sage.
diamond p ( ) te tāpoi māori – māori tourism – preserving culture. in: te ara – the encyclopedia of new zealand. available at: http://www.teara.govt.nz/en/te-tapoi-maori-maori-tourism/page- (accessed may ).
d-lib magazine ( ) digital technology and indigenous communities. special issue, ed. atkins de and holland mp, ( ).
durie m ( ) cultural preservation and protection. in: kirch w (ed.) encyclopedia of public health. new york: springer, pp. – .
gale virtual reference library. available at: http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid=adc db d a d f (accessed may ).
ekwelem vo, okafor vn and ukwoma sc ( ) preservation of cultural heritage: the strategic role of the library and information science professionals in south east nigeria. library philosophy and practice: – .
the electronic library ( a) winds of change: libraries in the twenty-first century. special issue, ed. raitt d, ( ).
the electronic library ( b) the impact of it on indigenous peoples. special issue, ed. roy l and raitt d, ( ).
first archivists circle ( ) protocols for native archival materials. available at: http://www .nau.edu/libnap-p/protocols.html (accessed may ).
force rw ( ) politics and the museum of the american indian: the heye & the mighty. honolulu: mechas.
graham t ( ) electronic access to and the preservation of heritage materials. the electronic library ( ): – .
gray m, coates j, yellow bird m, et al. ( ) decolonizing social work. farnham, england; burlington, vt: ashgate.
greyling e and zulu s ( ) content development in an indigenous digital library: a case study in community participation. ifla journal ( ): – .
hamel re ( ) indigenous language policy and education in mexico. in: encyclopedia of language and education. nd edn. new york: springer, pp. – . gale virtual reference library. available at: http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid=b a c cada fd e d (accessed may ).
heikell v-a ( ) our future lies in the past: me hoki whakamuri, kia ahu whakamua, ka neke. international preservation news : – .
hills g ( ) native libraries: cross-cultural conditions in the circumpolar north. lanham, md; london: scarecrow.
hinton l ( ) languages and language programs. in: bailey ga (ed.) handbook of north american indians. vol. : indians in contemporary society. washington, dc: smithsonian institution, pp. – .
hogan k ( ) tribal libraries as the future of librarianship: independent collection development as a tool for social justice. in: roy l, bhasin a and arriaga sk (eds) tribal libraries, archives, and museums: preserving our language, memory and lifeways. lanham, md: scarecrow, pp. – .
ifla ( ) ifla statement on indigenous traditional knowledge. available at: http://www.ifla.org/publications/ifla-statement-on-indigenous-traditional-knowledge (accessed may ).
ifla ( a) guidelines for professional library/information educational programs. available at: http://www.ifla.org/files/assets/set/publications/guidelines/guidelines-for-professional-library-information-educational-programs.pdf (accessed may ).
ifla ( b) indigenous knowledges: local priorities, global contexts. available at: http://iflaindigenousknowledges .ok.ubc.ca/ (accessed may ).
ifla ( ) the lyon declaration on access to information and development. available at: http://www.lyondeclaration.org/contact/ (accessed may ).
ifla ( ) preservation and conservation section. available at: http://www.ifla.org/preservation-and-conservation (accessed may ).
ifla/international publishers association steering group ( ) preserving the memory of the world in perpetuity: a joint statement on the archiving and preserving of digital information. available at: http://www.ifla.org/publications/preserving-the-memory-of-the-world-in-perpetuity-a-joint-statement-on-the-archiving-and (accessed may ).
international preservation news ( ) strategies of conservation and cultural identities. special issue . available at: http://www.ifla.org/node/ (accessed july ).
jorgensen mj ( ) sustaining indigenous culture: the structure, activities, and needs of tribal archives, libraries, and museums. oklahoma city, ok: association of tribal archives, libraries, and museums.
jorgensen m, morris t and feller s ( ) digital inclusion in native communities: the role of tribal libraries. oklahoma city, ok: association of tribal archives, libraries, and museums.
journal of western archives ( ) native american archives. special issue, ed. o’neal j and lewis dg, ( ).
library and information association new zealand aotearoa ( ) bok .
available at: http://www. lianza.org.nz/bok- (accessed may ). lonetree a ( ) decolonizing museums: representing native america in national and tribal museums. cha- pel hill, nc: university of north carolina press. lor pj and britz jj ( ) an ethical perspective on political-economic issues in the long-term preservation of digital heritage. journal of the american society for information science and technology ( ): – . mccarthy c ( ) exhibiting maori: a history of colo- nial cultures of display. wellington, new zealand: te papa press. mccarthy c ( ) museums and maori: heritage profes- sionals, indigenous collections, current practice. wellington, new zealand: te papa press. mcilwain j ( ) ifla disaster preparedness and plan- ning: a brief manual. available at: http://www.ifla.org/ publications/ifla-disaster-preparedness-and-planning–a- brief-manual?og¼ (accessed may ). mckeown ct ( ) repatriation. in: bailey ga (ed.) handbook of north american indians. washington, dc: smithsonian institution. pp. – . martin k ( ) ways of knowing, ways of being and ways of doing: a theoretical framework and methods for indigenous re-search and indigenist research. journal of australian studies (special issue: voicing dissent) ( ): – . mataatua declaration on cultural and intellectual prop- erty rights of indigenous people ( ) available at: http://ankn.uaf.edu/iks/mataatua.html (accessed may ). nakata m ( ) indigenous knowledge and the cultural interface: underlying issues at the intersection of knowledge and information systems. ifla journal ( / ): – . ogden s (ed) ( ) caring for american indian objects: a practical and cultural guide. st paul, mn: minne- sota historical society press. ogden s ( ) understanding, respect, and collaboration in cultural heritage preservation: a conservator’s devel- oping perspective. library trends ( ): – . overall pm ( ) cultural competence: a conceptual framework for library and information science profes- sionals. library quarterly ( ): – . 
reese d ( ) in: american indians in children’s litera- ture. available at: http://americanindiansinchildrensli- terature.blogspot.com/ (accessed may ). reznowski g and joseph na ( ) out of the archives: fostering collaborative environments for language revi- talization. in: roy l, bhasin a and arriaga sk (eds) tribal libraries, archives and museums: preserving our language, memory, and lifeways. lanham, md: scarecrow, pp. – . robbins p ( ) knowledge. in: robbins p (ed.) encyclo- pedia of environment and society. thousand oaks, ca: sage, pp. – . gale virtual reference library. available at: http://go.galegroup.com/ps/i.do?id¼gale % ccx &v¼ . &u¼txshracd &it¼ r&p¼gvrl&sw¼w&asid¼ d e a cecf f fbff - f af (accessed may ). rockefeller-macarthur e ( ) american indian library services in perspective: from petroglyphs to hypertext. jefferson, nc: mcfarland. roy l ( ) recovering native identity: readers’ advi- sory services for non-reservation native americans. collection building ( / ): – . roy l ( ) the international indigenous librarians’ forum: a professional life-affirming event. world libraries ( / ): – . roy l ( a) indigenous children’s literature. in: cox jh and justice dh (eds) the oxford handbook of indigen- ous american literature. oxford: oxford university press, pp. – . roy l ( b) leading a fulfilled life as an indigenous academic. alternativ, ( ): – . roy l and frydman a ( ) library services to indigenous peoples: case studies. available at: http://www.ifla.org/ publications/library-services-to-indigenous-populations- case-studies (accessed may ). roy l and hogan k ( ) we collect, organize, pre- serve, and provide access, with respect: indigenous peoples’ cultural life in libraries. in: edwards jb and edwards sp (eds) beyond article : libraries and social, and cultural rights. duluth, mn: library juice, pp. – . roy l and trace cb (forthcoming) beyond stewardship and consultation: use, care, and protection of indigen- ous cultural heritage. 
in: salvatore cl (ed.) cultural heritage management in libraries, archives, and museums: global perspectives. roy l, bhasin a and arriaga sk (eds) ( ) tribal libraries, archives, and museums: preserving our lan- guage, memory and lifeways. lanham, md: scarecrow. roy l, hogan k and lilley s ( ) balancing access to knowledge and respect for cultural knowledge: librarian advocacy with indigenous peoples’ self-determination in access to knowledge. in: lau j, tammaro am and bothma t (eds) libraries driving access to knowledge. the hague: international federation of library associa- tions and institutions, pp. – . salina bookshelf ( ) available at: http://www.salina- bookshelf.com/ (accessed may ). springer j ( ) unesco’s contribution to preserving traditional and indigenous knowledge. international preservation news : – . ifla journal ( ) http://www.lianza.org.nz/bok- http://www.lianza.org.nz/bok- http://www.ifla.org/publications/ifla-disaster-preparedness-and-planning--a-brief-manual?og= http://www.ifla.org/publications/ifla-disaster-preparedness-and-planning--a-brief-manual?og= http://www.ifla.org/publications/ifla-disaster-preparedness-and-planning--a-brief-manual?og= http://www.ifla.org/publications/ifla-disaster-preparedness-and-planning--a-brief-manual?og= http://ankn.uaf.edu/iks/mataatua.html http://americanindiansinchildrensliterature.blogspot.com/ http://americanindiansinchildrensliterature.blogspot.com/ http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . 
&u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://go.galegroup.com/ps/i.do?id=gale% ccx &v= . &u=txshracd &it=r&p=gvrl&sw=w&asid= d e a cecf f fbff f af http://www.ifla.org/publications/library-services-to-indigenous-populations-case-studies http://www.ifla.org/publications/library-services-to-indigenous-populations-case-studies http://www.ifla.org/publications/library-services-to-indigenous-populations-case-studies http://www.salinabookshelf.com/ http://www.salinabookshelf.com/ spruce db and thrasher t (eds) ( ) the land has memory: indigenous knowledge, native landscapes and the national museum of the american indian. washington, dc: national museum of the american indian, smithsonian institution; chapel hill, nc: uni- versity of north carolina press. stoner jh (n.d.) connecting to the world’s collections: making the case for the conservation and preservation of our cultural heritage. washington, dc: institute of museum and library services. available at: http:// www.imls.gov/assets/ /assetmanager/sgs_report.pdf (accessed may ). trails: tribal library procedures manual ( ) rd edn. available at: http://www.ala.org/offices/olos/toolk- its/trails (accessed may ). tuhiwai smith l ( ) decolonizing methodologies: research and indigenous peoples. london; new york: zed books; dunedin, new zealand: university of otago press. united nations general assembly ( ) united nations declaration on the rights of indigenous peoples. avail- able at: http://www.un.org/esa/socdev/unpfii/documents/ drips_en.pdf (accessed may ). unesco. 
unesco office in cairo ( – ) tangible cultural heritage. available at: http://www.unesco.org/ new/en/cairo/culture/tangible-cultural-heritage/ (accessed april ). watt lj and laurie-beaumont bl ( ) native museums and cultural centers. in: bailey ga (ed.) handbook of north american indians. vol. : indians in contempo- rary society. washington, dc: smithsonian institution, pp. – . we need diverse books official campaign site ( ) available at: http://weneeddiversebooks.org/# (accessed may ). wilson s ( ) research is ceremony: indigenous research methods. halifax, winnipeg: fernwood. world libraries. ( ) indigenous librarianship. special issue, ed. roy l and saari p, ( ). world summit on the information society ( ) declara- tion of principles. building the information society: a global challenge in the new millennium. available at: http://www.itu.int/wsis/docs/geneva/official/dop.html (accessed may ). author biography loriene roy is professor in the school of information, the university of texas at austin. she is anishinabe, enrolled on the white earth reservation, a member of the minnesota chippewa tribe. she served as the – president of the american library association and as the first convener of the ifla special interest group on indigenous matters. 
article

the digital library in the re-inscription of african cultural heritage

dale peters, university of cape town, south africa
matthias brenzinger, university of cape town, south africa
renate meyer, university of cape town, south africa
amanda noble, university of cape town, south africa
niklas zimmer, university of cape town, south africa

abstract

african digital libraries have evolved beyond the ‘preservation or access’ debate of the s, and the concomitant compulsion to (un-)systematically convert cultural heritage collections from analogue to digital formats. the challenge now lies in the agility to respond to user needs, to match the selection for digitisation with a more strategic approach towards research relevance and potential research outputs. this paper will examine the symbiotic relationship between preservation, cultural heritage and scholarship in a case study on the description and documentation of extinct african languages. it proposes that the new point of focus lies in digital scholarship, enabling both technical innovation and more intellectual engagement in revisiting the digital library to review, correct and augment transitory records through a new scholarly interpretation of african cultural heritage.
keywords

african digital libraries, digital scholarship, african cultural heritage, digital humanities

corresponding author: dale peters, university of cape town, private bag x , rondebosch, cape town , south africa. email: dale.peters@uct.ac.za

international federation of library associations and institutions , vol. ( ) – . © the author(s). reprints and permission: sagepub.co.uk/journalspermissions.nav. doi: . / . ifla.sagepub.com

introduction

the digitisation of cultural heritage was embraced by the heritage sector in south africa, comprising museums, libraries, galleries, archives and relevant departments of higher education institutions (heis), as a deliberate act of social cohesion after the transition to democratic government in . national collaborative projects such as disa: digital innovation south africa ( ) served as incubators for capacity building in making cultural heritage content accessible online (peters and pickover, ). however, the recommendation by sula ( ) that cultural heritage institutions should do everything they can to digitise material as quickly as possible was never attainable in the developing world.

instead, the intervening decades represent a period of intense reflection on the role of cultural heritage institutions to act not simply as storehouses of the memory of great men and women of the past, but increasingly to capture our stories as they are unfolding today. a recent student protest at the university of cape town (uct), calling for the removal from the campus of a statue of cecil john rhodes, reminds us of the ongoing contestation surrounding public monuments and cultural symbols of a despised colonial heritage. cultural heritage institutions in south africa are challenged to reflect the multiple and often marginalised histories, heritages and collective memories of all its people.
this challenge continues to demand an engagement in constructive and creative ways of mediating diverse and sometimes conflicting histories and traditions, and aligning contrasting memories with deep emotional attachments to historic and cultural events, figures and symbols.

established in , the ifla strategic programme on preservation and conservation (pac) ( ) follows the guiding principle that preservation is essential to the survival and development of culture and scholarship. conversely, the use of new technologies is key to that act of preservation in drawing cultural heritage into the focus of scholarship to examine and revisit our story as it unfolds. the symbiotic relationship between preservation, cultural heritage and scholarship is framed in the role of the digital library at the university of cape town libraries in leading the emerging field of digital scholarship.

digital scholarship

digital libraries in south africa have seen uneven growth in scale from early grant-funded projects to institutionally based organisational units. the university of cape town libraries has successfully traversed the divide from grant-funded project to internal programme by engaging the academic researcher in critical collaboration through digital scholarship programmes.

while libraries and archives feature significantly in the search for primary research sources, digital humanities researchers increasingly seek to assemble mixed competencies drawn from computer science, design, online publication, digital curation and library science. the digital library is particularly well suited to meeting the needs of digital scholarship by providing technical services across diverse disciplines, thus facilitating dialogue, and by promoting ideals such as open access and digital preservation and championing scholarly and pedagogical innovation.
lisa shapiro ( : ) defines the scope of digital humanities in mediating diverse and sometimes conflicting histories and traditions, rather than technology for the sake of technology:

it can encompass a wide range of work, such as building digital collections, constructing geo-temporal visualizations, analyzing large collections of data, creating d models, re-imagining scholarly communication, facilitating participatory scholarship, developing theoretical approaches to the artifacts of digital culture, practicing innovative digital pedagogy, and more.

thanks to a new emphasis by funders on research data management, the role of the digital library now involves integrating data curation into digital scholarship, enabling a fluency with topics such as repositories, web publication and information-sharing practices, descriptive standards, metadata formats and the plethora of different file formats that characterise digital data. the digital library now encompasses conceptual frameworks for the nature of digital objects, types of metadata and data curation, for understanding collections management as curation, and for data sharing and its legal limitations. in transitioning the digital library from a reprographic service to a research partnership, and with the experience of projects such as the westphal collection of non-bantu click languages, we perceive the need to develop new collections stewardship policies, new collaborative governance structures and interactive workflows to guide digital librarians and data curators in working alongside researchers in the digital humanities. in , through the vice chancellor’s strategic grant, humanitec was established to address these divides.
the project envisioned the creation of an institution-wide digital repository, to be used to (1) curate and showcase the many important collections of works of art, music, rare documents and other artifacts that are in the possession of the university, and (2) showcase the intellectual and scholarly activities of members of the academic staff. in doing this, the project aimed to connect uct in important ways with scholars in other countries, by making important african collections available to them for study and by enhancing awareness of areas of research, expertise and collections at uct.

voices from the past: a case study of the westphal collection

uct libraries special collections and archives has an outstanding collection focusing on the african continent. emphasis is placed on african imprints, collected in conjunction with material directly about africa from other continents. the collection comprises holdings of , african studies titles, as well as pamphlets, current journals, rare books, an historical collection of maps, archival manuscripts, sound recordings, posters, and african film and photographic records.

the westphal holdings in uct’s special collections comprise some audio tapes, handwritten field notes and sketch maps as well as unpublished manuscripts on indigenous languages of southern africa. created by ernst westphal, head of african languages at uct between and , the westphal sound files are precious because they include recordings of some languages which are no longer spoken and which have not been documented in greater detail. in – two humanitec grants supported the digitisation of the audio recordings of southern bantu and non-bantu click languages which westphal recorded in south africa, namibia, botswana and angola between the early s and the beginning of the s.
the study of the westphal collection is led by dr matthias brenzinger, director of the centre for african language diversity (caldi) and head of linguistics at uct. one of his main interests lies in the documentation of african languages, especially those that are vanishing (brenzinger, , ). studies of this kind have become one of the most vibrant fields in linguistics over the past two decades. research on actual speech and its variation has regained considerable recognition in academia. while theoretical, universalist approaches to the study of languages had dominated the discipline for almost half a century, scholars working on african languages always had a strong focus on language description and documentation. linguistic analysis requires a thorough description and a language corpus, which does not exist for most african languages.

ernst westphal was a true pioneer in establishing the study of southern african linguistics and has contributed significantly to our understanding of the non-bantu click languages. westphal rejected the term ‘khoisan’ (levin, ), as he claimed that the non-bantu click languages spoken by former hunter-gatherers and some of the pastoralists in southern africa belong to several unrelated language families. although currently discredited, joseph greenberg’s postulation of the khoisan language family was widely accepted at the time, while westphal’s rebuttal was ignored outside of south africa (westphal, ). over the last years, however, research on non-bantu languages has produced evidence that westphal was in fact right in rejecting the single-family hypothesis for these languages (güldemann, ).

the audio tracks digitised in this project include recordings of languages no longer spoken. westphal recorded the last speakers of kwadi, //xegwi and n/amani, and his recordings of n/uu are equally precious, as the last three remaining speakers are no longer fully competent in this language.
westphal’s audio files of job (jopi) mabinda are of special importance. mabinda spoke //xegwi, one of the non-bantu click languages of south africa, and was first recorded by leonard lanham and desmond hallowes on a farm in the lake chrissie district (mpumalanga province) in (lanham and hallowes, a, b). accompanied by hallowes, westphal worked with mabinda in , when the audio tapes in uct’s westphal holding were recorded. oswin koehler, from the university of cologne, germany, visited mabinda several times between and , and audio recordings are believed to exist in the koehler archive at the university of frankfurt. professor tony traill, chair of linguistics at the university of the witwatersrand and an authority on non-bantu click languages, consulted mabinda in the s and worked with him on //xegwi until . according to traill, most //xegwi community members had abandoned their language and spoke isizulu as mother tongue as far back as the s. job mabinda, however, not only maintained a strong identity as a //xegwi man, but retained a thorough knowledge of //xegwi, of which he was the last speaker.

kwadi used to be spoken on the southern coast of angola, and westphal’s recordings of the language in the s were made with the last speakers, who by then were already semi-speakers. very little is known about this language, and the data that has now become available through the digitisation project might allow for a better understanding of the history of this language and the genetic relationships of its speakers. in , the german linguist anne-maria fehn conducted fieldwork in the area where kwadi used to be spoken. when audio clips of kwadi from the uct westphal holdings were played to elderly people who claimed to have kwadi ancestors, some recalled words from the language they had heard when they were young.
these digitised language recordings not only enable researchers to analyse the language by applying present-day technologies and theoretical frameworks, but also allow audio documents recorded some years ago to be returned to family and community members of the recorded speakers. furthermore, the conversion from these reel-to-reel tapes to high-resolution digital masters ensures the material will remain audible in perpetuity.

evolution of metadata practice in response to digital scholarship

in , uct formally approved a metadata and information architecture policy, jointly owned by the information and communications technology services and the uct libraries. the aim of this policy is to ensure that all content collections, both physical and digital, generated and managed by uct have sufficient metadata that meets international standards and is consistently applied to ensure that the content is discoverable online. in , the metadata working group was established to implement the policy and provide metadata guidelines and assistance to content collection owners. one of the benefits of strategic institutional support has been the growing awareness across campus of the importance of assigning accurate metadata that meets international standards to content collections. the humanitec project has afforded the library, and specifically the librarians in the cataloguing and metadata management section, the opportunity to work closely with researchers and project owners in metadata creation. as a result, there is now collaboration between researchers, who with their subject expertise can create the metadata that best describes the collections, and librarians who ensure that international standards are applied consistently.

the creation of metadata for the digitised tapes of the ernst westphal collection presented unique challenges due to the highly specialised nature of the subject matter.
initial project work preceded the structured workflow initiated by the humanitec project. as a result, the metadata was created by librarians with no subject specialisation, working from the physical object, such as the tape box, rather than from the recording itself. the limitations of this method became very obvious when the opportunity to work with specialists did arise in subsequent project phases. these later phases were funded, allowing for the employment of linguistic specialists, who were able to listen to the tapes and correctly identify the languages spoken, as well as to follow the subject content of the discussions. in some instances, the identities of westphal’s interviewees were revealed. therefore, beyond the essential level of descriptive metadata, these collaborations with researchers enabled further valuable enhancement opportunities.

with specialist input, greater consistency in the spelling of the languages could be achieved, in keeping with internationally accepted taxonomies. the standardised spellings often differed from the terminology used by westphal to describe the content of the tape boxes. for example, he used the term s|hu for the n|huki language, and yei and yeyi interchangeably. other challenges where subject specialisation was essential lay in the identification of the special characters used by many of the non-bantu click languages to indicate clicks. some of these characters are standard latin ones, for example in !xuun, whereas others require the charis sil font to provide the additional characters, such as ¼j aka-s|ous.

in addition to ensuring consistency in spelling and terminology, metadata librarians provided additional value in adherence to international standards, including dublin core (unqualified) and library of congress subject headings. further linguistic standards were also applied, including the codes developed by malcolm guthrie for the classification of the bantu languages (maho, ).
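the combination of a standardised language name with westphal's own term can be illustrated with a small unqualified dublin core record. the following is a minimal sketch, not the workflow used at uct: the function name `build_record`, the wrapper element `record`, the example title and the placeholder subject heading are all hypothetical, and placing the box term in the description element is an assumption made for illustration. only the language pair s|hu / n|huki is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(title, standard_language, term_on_box, subjects):
    """Build a simple Dublin Core record for one digitised tape.

    The wrapper element name "record" is a hypothetical choice;
    real repositories define their own container format.
    """
    record = ET.Element("record")

    def add(element, text):
        child = ET.SubElement(record, "{%s}%s" % (DC_NS, element))
        child.text = text

    add("title", title)
    add("type", "Sound")  # term from the DCMI Type Vocabulary
    add("language", standard_language)  # standardised spelling
    for heading in subjects:
        add("subject", heading)  # e.g. a controlled subject heading
    # Assumption: the collector's own wording is kept, not overwritten.
    add("description", "Term on box: %s" % term_on_box)
    return record


if __name__ == "__main__":
    example = build_record(
        title="example tape (hypothetical)",
        standard_language="n|huki",
        term_on_box="s|hu",
        subjects=["placeholder subject heading"],
    )
    print(ET.tostring(example, encoding="unicode"))
```

keeping both forms in one record means a search on the standardised name finds the tape, while the collector's original wording remains visible to researchers.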
as mentioned above, westphal himself was not always consistent in his spelling and terminology, and it was considered important to capture these idiosyncrasies in the metadata. considerable value was added to the resource by capturing the annotations on the tape boxes in the description field, for example: title on box: kwisi words (capelopopo).

the close engagement and collaboration between librarians, with their expertise in the application of international standards, and researchers, with their extensive subject knowledge, have resulted in very rich metadata, which ensures that these rare and valuable recordings will be discoverable to future researchers.

the challenge of digitisation: technical service or digital scholarship?

the linguistics scholar matthias brenzinger first consulted the centre for popular memory (cpm) at uct on the digital preservation of a handful of field recordings from the s on quarter-inch tape reels. at the time, the project presented a welcome change from the systematic digital preservation of several thousands of cassette tapes of oral history interviews. the project was later transferred to the digitisation and digital services section of the uct libraries, as the scholarly engagement was channelled through the humanitec project.

an early challenge of digitisation was in sourcing a professional reel-to-reel recorder to add to the cpm’s collection of legacy format playback equipment and having it serviced by a rare enthusiast, who proved to be very forward thinking in terms of backwards compatibility, and had been collecting disused equipment for many years. once the new machine had been installed and a test run had been performed with a non-critical tape, the first originals from the westphal collection arrived. as alluded to in the previous section, basic metadata had been provided by the cataloguing and metadata section, with an accession number in small pencil lettering on each box and reel.
it was an exciting moment, threading the first end of tape onto the take-up reel, setting the audio software on the computer to start recording, and hitting play on the revox. would the old tape tear? what would we hear? would we be able to capture the full audio spectrum adequately? from the very first tape, the clarity of the sound was astounding. the playback head was perfectly calibrated, with no need for adjusting the azimuth, and the tapes had obviously been stored correctly, showing no signs of stretching, brittleness, mould or excessive dust.

but of course there were other interesting challenges. westphal had recorded in mono, forwards and backwards. he had also quite regularly changed the speed at which he was recording, obviously trading off, whenever necessary, between higher quality (high speed, i.e. ½) and more recording time (low speed, i.e. = ). in practice, this meant that during playback in stereo, one was listening to a cacophony of voices: some sounding backwards and/or pitched twice as high and fast as intended, or too low and slow, mixed with some at the right settings, via the left and right speakers respectively. nevertheless, in order not to stress the tapes unnecessarily, it was decided to digitise the recordings as they were, capturing both channels at once in bit/ khz, focusing only on keeping the signal going through the high-quality lucid ad-converter trimmed as hot as possible without clipping. thereafter, it was relatively easy to digitally split the dual-mono file, reverse the separate channels, change their speed and pitch, and export new, unfiltered derivative files. these would be almost indistinguishable from ones that would have necessitated a second recording process, i.e. rewinding the tape completely, turning the spool over, rethreading the tape and replaying it at whichever initial speed the first take suggests.
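the three digital steps just described (split, reverse, change speed and pitch) can be sketched on plain lists of samples. this is an illustration under stated assumptions, not the audio software used in the project: the function names are hypothetical, the capture rate of 96000 hz and the tape speeds of 7.5 and 3.75 are example values only (the figures in the source are not legible), and speed is corrected here by declaring a different sample rate for the derivative, which shifts speed and pitch together, rather than by resampling.

```python
from typing import List, Sequence, Tuple


def split_dual_mono(interleaved: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Separate an interleaved two-channel stream into two mono tracks.

    Samples are assumed to alternate: first channel, second channel, ...
    """
    if len(interleaved) % 2 != 0:
        raise ValueError("expected an even number of interleaved samples")
    return list(interleaved[0::2]), list(interleaved[1::2])


def reverse_track(samples: Sequence[int]) -> List[int]:
    """Return the track in reverse order, for a side recorded backwards."""
    return list(reversed(samples))


def corrected_rate(capture_rate_hz: int,
                   playback_speed: float,
                   recorded_speed: float) -> int:
    """Sample rate to declare so a derivative plays at its original speed.

    A tape recorded at `recorded_speed` but captured while running at
    `playback_speed` sounds playback_speed / recorded_speed times too
    fast; declaring a proportionally lower rate undoes that.
    """
    if playback_speed <= 0 or recorded_speed <= 0:
        raise ValueError("tape speeds must be positive")
    return round(capture_rate_hz * recorded_speed / playback_speed)


if __name__ == "__main__":
    # Hypothetical capture: six interleaved samples from one pass.
    first, second = split_dual_mono([1, 10, 2, 20, 3, 30])
    second = reverse_track(second)  # this side ran backwards
    # Hypothetical speeds: captured at 7.5, originally recorded at 3.75.
    rate = corrected_rate(96000, playback_speed=7.5, recorded_speed=3.75)
    print(first, second, rate)
```

because every derivative is computed from the single captured file, the tape itself only has to pass the playback head once, which is the point of the approach described above.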
from a digital asset management (dam) point of view, this way of proceeding also meant that we were creating one single archival (.aif) audio file to match a single physical object in the special collections archive. several more challenging details emerged in the digital scholarship engagement, some of which were purely mechanical, while others pertained to digital workflow and research data management (rdm) issues.

on the mechanical side, two to three tapes were encountered that contain more than the usual two (e.g. stereo or dual-mono) channels of audio. the reason for this, it was surmised, could be that westphal used a different kind of field recorder (capable of simultaneously recording more than two tracks) to record onto these specific tapes, or it could mean that a misalignment of recording heads led to one channel (that was actually meant to be recorded over) remaining audible. this issue remains to be addressed in future investigation. another challenge lay in correcting inconsistent speed changes on certain recordings that westphal made while the battery power supply of the field recorder was deep-draining, resulting in a gradual decrease of recording speed. the correction proved possible, but these enhancements on the derivatives were not as perfect as the general filtering and processing that was achievable in order to reduce tape hiss and rumbling wind. lastly, there were also some reels with up to four separate sections of tape wound onto them, but without these necessarily being torn sections of the same recording. this led to the creation of distinct archival audio files, which became one of the subjects of consideration in data and metadata management.
in fact the biggest challenge for this collection of audio files was only to become entirely clear later: digitisation, digital storage and metadata creation and enhancement had taken place in four successive phases, leading to small inconsistencies in the file naming convention, for instance in the suffixes of derivatives.

an important part of the scholarly engagement identified by douglas ( ) was the activity of listening itself, which in the context of audio-visual archiving tends often to be overlooked, between technological considerations on the one hand and academic outputs on the other. while the project did require high-resolution original audio files for linguistic analysis with specialised software, such as praat (boersma and weenink, ), filtered derivatives were also created for maximum ‘naked ear’ usability, such as transcription and translation. this process of intense listening to the recordings revealed much more than non-bantu click sounds, which are of course unintelligible to a non-expert; many further impressions emerged. for example, an immensely charming interviewer’s voice (namely westphal’s), speaking a large variety of languages himself, at times to a second translator in the room; ways of speaking with each other that always involve a rich palette of aural signals that apparently need no translation for any listener: empathic sounds, sounds of understanding, interest and also joy; a whole tape side of what may be young relatives of westphal’s: two boys making up a radio play in the queen’s english; documentary excerpts of ritualistic singing and drumming that were clearly just a small window into a musical event that stretched for days and nights; two young women singing what sounds like a nursery rhyme – an intricately interwoven polyphony of such startling beauty that one is almost reduced to tears.
suffice to say: there is always so much else contained in any recording, however specific to a discipline its provenance may be – details, backgrounds and ‘ways there’ that reach far beyond the immediate research interests of any academic project. furthermore, with the passing of time the manners in which we can (and indeed need to) engage with these mediated events also change. in the case of the undeniable affective quality and fascinating layering of the westphal recordings, one can only wish to provide them with more listeners, at the very least in appreciation of the cultural heritage they represent.

while there is no doubt that a high degree of specialisation is unavoidable in meeting the wide range of systemic demands that are put on digitisation and research data management, it is equally true that the very outputs of this work necessitate the more broadly skilled function of the digital curator. in order to create the kind of digital repositories that the digital library of the future requires, it becomes more necessary than in any traditional archival context for researchers to work in close collaboration with digital knowledge brokers: staff who can intelligently engage with their materials in order to provide a variety of relevant accessibility and interoperability solutions.

the various digitised media formats can always be further interlinked and enriched. recently, a small-scale, departmentally driven initiative to make reproduction photographs of westphal’s handwriting on the tape boxes (and some reels) led to the creation of an auxiliary archive of visual material to complement the relevant audio files and text documents. further efforts could include time-based anchoring of relevant transcripts, meta tags, commentaries and other hyperlinks on the audio files, perhaps via an interactive web interface with specific access rights put in place for research experts and general audiences.
in this sense, the future potential for the westphal collection is to open it up to new audiences outside of its immediate field of specialisation, and to enable new ways of listening that will no doubt generate interdisciplinary insights into what its richly layered materials provide, in this case particularly to the african humanities.

conclusion

the case study mapped out in this article reminds us of the many and varied entanglements embedded in library collection items. these extend from the gathering and maintenance of collections in archival conditions, right through to engagement and access of the derivatives by researchers and community users, and back again to the metadata enhancement within the archive after such specialist interactions.

the engagement of the linguistic experts and librarians in the process of metadata creation speaks to the intellectual process of revisiting the library to review, correct and augment the transitory record in a new scholarly interpretation of african cultural heritage. through the digitisation of old tapes, the derivative digital files can be taken back to the relevant first language speakers and their communities for further interpretation or corrections, and these enhancements then fed back into the bibliographic record. the high quality digital master can also be played over and over by the linguistic specialists for further analysis, which is also fed back into the bibliographic record. these two levels of enhancement are bound neither by the physical location of the original collection nor by the fragile nature of the original recording.

these enhancements go further than merely a richer articulation of materials in the library. the added value in the specialist interactions of first language speakers, linguistic specialists and librarians enables deeper (and broader) research to be generated out of the material; this research can then be digitally curated, creating virtual portals of related material.
as archivist barbara craig ( ) reminds us, compared to a long history of inquiry into memory, discussions of its specific manifestations in archives are recent, and much of the discussion is by people with no first-hand experience with archival work. in many ways, archives and libraries of the 21st century represent cross-sectors of societies. they often intersect around issues/themes rather than institutions, around the visceral rather than just the physical.

such engagement in the library is imperative to the building of collections that resonate with the environments from which they originate and add value to the research that extends from use of such material. as such, libraries play a role in ensuring that the legacies we hold are not only preserved behind physical (and virtual) walls, but also curated and disseminated beyond the boundaries within which we work. through the advancement of digital and interactive media technologies, and the innovative collaborations of academic research, the symbiotic relationship between preservation, cultural heritage and scholarship has defined the emerging role of the digital library in a synchronic circle of re-inscription of african cultural heritage.

declaration of conflicting interests

the author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

funding

the author(s) received no financial support for the research, authorship, and/or publication of this article.

references

boersma p and weenink d ( ) praat, a system for doing phonetics by computer. available at: http://www.fon.hum.uva.nl/praat/ (accessed august ).
brenzinger m (ed.) ( ) language death: factual and theoretical explorations with special reference to east africa. berlin: walter de gruyter.
brenzinger m ( ) language diversity endangered. berlin: walter de gruyter.
craig b ( ) selected themes in the literature on memory and their pertinence to archives. american archivist ( ): – .
disa ( ) disa: digital innovation south africa. available at: http://www.disa.ukzn.ac.za/ (accessed august ).
douglas k ( ) digital archiving in the context of cultural change. serials review ( ): – .
güldemann t ( ) greenberg’s ‘case’ for khoisan: the morphological evidence. in: ibriszimow d (ed.) problems of linguistic-historical reconstruction in africa. köln: rüdiger köppe, pp. – .
ifla ( ) strategic programme on preservation and conservation (pac). available at: http://www.ifla.org/about-pac (accessed august ).
lanham lw and hallowes dp ( a) linguistic relationships and contacts expressed in the vocabulary of eastern bushman. african studies ( ): – .
lanham lw and hallowes dp ( b) an outline of the structure of eastern bushman. african studies ( ): – .
levin a ( ) the khoisan. available at: http://www.khoisan.org/ (accessed august ).
maho jf ( ) nugl online: the online version of the new updated guthrie list, a referential classification of the bantu languages. available at: goto.glocalnet.net/mahopapers/nuglonline.pdf (accessed march ).
peters d and pickover m ( ) insights of an african model for digital library development. d-lib magazine ( ): – .
shapiro l ( ) getting started in digital humanities. journal of digital humanities ( ). available at: http://journalofdigitalhumanities.org/ - / (accessed august ).
sula ca ( ) digital humanities and digital cultural heritage (alt-history and future directions). in: ruthven i and chowdhury gg (eds) cultural heritage information: access and management. london: facet, pp. – .
westphal e ( ) the languages of africa, by joseph h greenberg. american anthropologist ( ): – .

author biographies

dale peters is deputy director for technical services at the university of cape town libraries.
she served as project manager for disa: digital innovation south africa, and has previous experience of e-infrastructure projects in the fp programme of the european union, aimed to enhance the visibility of scientific research outputs through global networks of digital repositories.

matthias brenzinger holds the mellon research chair: african language diversity in the linguistics section of the school of african & gender studies, anthropology and linguistics at the university of cape town. he is director of caldi – centre for african language diversity, curator of tala – the african language archive. brenzinger is a member of the institute of african studies (institut für afrikanistik) at the university of cologne. the main fields of his academic interest are language classification, cognitive linguistics, ethno-botany, language documentation, applied linguistics, bimodal communication and sociolinguistics.

renate meyer is head of special collections at university of cape town libraries. having qualified as a fine artist and with a research ma exploring ‘generations of meaning: archival collections in african contexts’, meyer has worked in museology and archival curatorship for the past years. she is particularly interested in how multifaceted distinctive collections are gathered ethically, curated intelligently and accessed dynamically.

amanda noble is manager of cataloguing and metadata at the university of cape town libraries. she has a professional interest in metadata, metadata standards, linked open data and library technical services workflows.

niklas zimmer is a digitisation manager at the university of cape town libraries, with previous experience at the centre for popular memory at uct.
niklas has lectured in theory and discourse of art, critical studies, and given workshops in video, sound and photography at tertiary institutions in cape town and stellenbosch and is a research associate at the archive and public culture research initiative at uct. furthermore, he has performed, exhibited and published in the fields of visual art, photography and sound.

storing and sharing wisdom and traditional knowledge in the library

brooke m. shannon
tiffin university, usa

jenny s. bossaller
university of missouri, usa

abstract

traditional library practice focuses on print collections and developing collections of materials that have been published, which means the documents have gone through some kind of review or vetting process. this practice leaves a wide swath of potential knowledge out of the collection. for example, indigenous knowledge, beliefs, and experience are different, in that they do not undergo the same review or vetting process; we might refer to these types of content as wisdom. non-print collections, such as collections of recorded oral histories, represent less traditional forms of knowledge. human libraries push the boundaries further in the quest to integrate wisdom and lived experience into library collections. this paper delineates the relationship between wisdom and knowledge that arose during a phenomenological study of the everyday information practices of kenyan university women. the women were asked to photograph everyday events from their life and describe what they saw. one finding was a divergent presentation of wisdom and knowledge.
because the women were describing this in relation to their education, we assert that this demonstrates a need to reconsider positivist assumptions in library science, bringing what the women called wisdom into the stacks. how, though, can wisdom be stored and shared?

keywords

wisdom, knowledge, oral history, living libraries, traditional knowledge

introduction

how can libraries store the information that their users need when that information takes a non-traditional, changeable, or living form? there are certainly many kinds of knowledge that are not textual or that fall outside of the realm of regular library collection practices. traditional knowledge falls into this category. this paper explores possibilities of expanding the limits of texts and knowledge in the library stacks.

libraries serve to connect people to the information that they need, which might include a wide range of material such as fiction, nonfiction, or persuasive literature; “information needs” are therefore quite subjective. librarians work within boundaries that have been established by practice and practicality formed by the types of documents that they house. librarians are bounded by space, scope of the collection (what is available and desirable), and limitations on form (i.e. books, periodicals, audio-visual items). we can also see how ‘normal library practice’ is changing drastically in all of these areas, especially as digital technologies have changed space usage, limitations on knowledge, and the form that the knowledge takes. these changes have come about directly and indirectly as a result of technology. for instance, reference areas have been largely reconfigured into “commons areas” where people can meet with laptops, and the spaces of libraries have been reconfigured because of users’ technology needs (morrone and workman, ; turner et al., ). while library spaces have been reconfigured, space is certainly still very much a practical limitation.
corresponding author: jenny s. bossaller, school of information science & learning technologies, university of missouri, townsend hall, columbia, mo , usa. email: bossallerj@missouri.edu

international federation of library associations and institutions , vol. ( ) – . © the author(s) . reprints and permission: sagepub.co.uk/journalspermissions.nav. doi: . / . ifla.sagepub.com

librarians continually negotiate and reconfigure spaces in the library for the various objects and activities that they want to and are expected to include in the library: periodicals, reference collections, book stacks, computer areas, and meeting spaces. the areas of a library still define its various purposes, though commons areas are designed to be much more fluid, ensuring that users have the ability to move chairs, tables, etc. according to their needs.

several authors have explained that librarians run into limitations on which texts they include in the library’s collection. they work within established parameters of knowledge and form. budd ( : ) points out that knowledge “has been subjected to some kind of filtering through society’s value system . . . has been deemed meaningful through some societally agreed-upon process.” radford ( ) critically explains that librarians serve as facilitators, guardians, and brokers of a modern, positivist view of knowledge; here, information is presented as a commodity. this viewpoint lays the groundwork for analyzing the limits of library collections in terms of value. we can say that library practices and research give precedence (and possibly privilege, though that might insinuate a deliberate exclusion) to texts or to graphic records that can be easily stored, understood, and analyzed for content. collections of textual and visual documents are the physical components of “normal library practice” (with texts being most usual).
this leads us to the conclusion that normal library practice, by definition, marginalizes some kinds of information, such as communication between people – libraries do not typically organize unrecorded communication. they might organize space where communication (between books and people, between computers and people, between people and people) will occur, but they do not organize people. that falls outside of the scope of librarianship. people embody living knowledge, but are much more difficult to organize, store, and disseminate; that differentiation marks “alternative views of knowledge” that the librarian community could give serious consideration, were it not for the practical problems posed by actually carrying it out.

another barrier to connecting the reader to the information sought is the method used by librarians to obtain books, including limitations of the publishing industry. the ‘normal’ method for libraries to obtain texts is through publishers, jobbers, and small presses; that is how they fill the shelves. librarians select from catalogs of books to meet the scholarly and/or recreational needs of their patrons. however, the publishing industry censors itself. publishers publish what is publishable: that which has been vetted, is acceptable, and salable. this means that certain kinds of knowledge will make it through the publishing process, and other kinds will not. this might be considered a product of market forces, or, more ominously, censorship. for instance, durrani ( ) writes about librarianship in kenya during the s, and distinguishes between the publications that were on the library shelves and the underground literature of liberation.
liberation literature was censored by the government and so was unavailable in public libraries except through what he calls “guerilla librarianship.”

information suppression is linked to discussions of traditional knowledge (tk), whose proponents claim that local knowledge is often superior in finding solutions to local problems. tk presents a calling for librarians to expand the information ecosphere and somehow recognize the value of wisdom and experience. a roadblock, though, is that some of that local information is not actually written down; “normal library practices” focus on texts. there is a lot of wisdom that is not found in the library stacks. finding a way to include this in the library’s collection is not only a way to connect library users to the information that they are seeking, but also an act of guerilla librarianship. it liberates knowledge and can connect users to tk.

because librarians concentrate their efforts on developing collections of texts (or at least recorded material), it makes sense to define a text, explore textual boundaries, and identify how expanding the collection might enhance a library user’s access to information. ricœur ( ) offers one way to think about this by describing how people interact with information in dialog or conversation. he defines text as “every utterance or set of utterances fixed by writing” (p. ). in dialog, two people are engaged in mutual discourse; they are rooted in a present reality and have the opportunity to negotiate meanings. that exchange offers a psychological advantage that the written text cannot share. written texts, though, offer a sociological advantage over conversation. texts are fixed in time, and can be preserved and studied. written texts can become part of the collective memory (p. ), but they are located in a quasi-world where reality is “intercepted” (p. ).
similarly, traditional reading theory has put the reader as recipient of knowledge (the reader as “empty vessel”); however, more modern reading theory has put the reader in command through interpretation. thus, reading a text might be a two-way or intersubjective experience, but ricœur explains that it should never be confused with speech itself (p. ). likewise, the reader can never leave an imprint on the text for the next reader without writing in the book, which is highly discouraged in libraries (although social tagging the catalog might be an exception worth noting).

written texts are only one form of communication – they are essentially one-way or limiting because the speaker (the writer) can never clarify for the reader what he is saying should there be questions (although journal clubs and written interpretations offer the reader a way to engage with the text in the presence of another). alone, the reader has limited options for interpretation. providing a way for the seeker of information to interact personally with the person who holds that information, then, can be seen as an important step in increasing understanding and knowledge, and helping people find what they are looking for.

when is a text not a text?

budd ( : ) claims that “if text can be taken as a formally constructed system of signs, then many things, and certainly libraries, qualify as texts.” budd’s interpretation is influenced, in part, by brown’s ( ) view of society as text, as a social structural system of speech acts, acts that can be reproduced, interpreted, and reinterpreted. each individual has the power to analyze and act upon this text, first through interpretation and then using their own experiences to reinterpret. turner ( ) explains that even an individual utterance can be a document.
turner refers to frohmann ( : ) who defined documents according to either “their materiality; their institutional sites; the ways in which they are socially disciplined; and their historical contingency;” they are both socially constructed and contextual. turner, studying institutional communication, found that oral statements should be considered documents because an utterance is “an artifact that conveys evidence” ( : ). thus, even utterances are something that should be of interest to documentarians and librarians.

defining knowledge and wisdom

during the study that is described in this paper, kenyan university women distinguished between knowledge, which they said is stored in books and libraries, and wisdom, which they associated with people, especially elders. the first objective, then, in explaining “wisdom” as an alternate path to knowledge is in establishing the relationship between knowledge and wisdom. one way to conceive of their relationship is as a hierarchy. haeckel and nolan ( ) have proposed a multi-level information hierarchy of variables and processes by which information becomes knowledge. facts, at the base of the hierarchy, become information given a particular context. information becomes intelligence through inference. certitude transforms intelligence into knowledge. finally, knowledge becomes wisdom through the process of synthesis. arguably, the moments of distinction are just as vague as the states themselves. however, the tendency to place knowledge as antecedent to wisdom is common.

in an effort to develop an explicit theory of wisdom to explain why people are labeled wise rather than knowledgeable, intelligent, or creative, sternberg ( ) has also found that knowledge precedes wisdom. he has not operationally defined knowledge but alludes to knowledge as, simply, what is known.
in this case, knowledge is rooted in the positivist tradition and is transferable and, for example, written in books and passed along in schools. bluck and gluck ( ) have described wisdom as something that necessitates transmitted knowledge but also requires demonstration or possession of a wider set of what they call life-tools, such as empathy, support, self-determination and assertion. also linking values to wisdom is kekes ( ), who has defined wisdom as “a character-trait intimately connected with self-direction . . . [and] the possession of wisdom shows itself in reliable, sound, reasonable, in a word, good, judgment” (p. ). in other words, wisdom is the manifestation of value-mediated and reasonable judgment. such judgment, including an understanding of differences in values and priorities, is culturally based and necessarily relies upon the mastery of various social and institutionalized norms and beliefs.

the role that experience and expertise play in wisdom has also been explored. rowley ( : ) claims that “knowledge, experience and action are key aspects of wisdom.” similarly, baltes and smith ( : ) have explained wisdom as “an expert knowledge system”, or expertise. they have defined wisdom as “a highly developed body of factual and procedural knowledge and judgment dealing with what we call the ‘fundamental pragmatics of life’” (p. ).

importantly, many studies of wisdom extend beyond educational settings. as baltes and smith ( : ) have suggested, in studies of wisdom and intelligence, the “primary focus on school-related knowledge and skills has been questioned, and new sectors of life . . . have been singled out as domains within which factual and procedural forms of intelligence can be properly studied.” bodies of knowledge outside the sets of core curricula such as “the knowledge of everyday routines (e.g. knowledge of common activities and events, social norms, available human services, and social institutions)” (p.
) come into play during the everyday pragmatics of life.

lloyd ( ) has also studied information practices beyond the academic setting, specifically in emergency response situations. she has found that written knowledge is only one source of information. crisis responders also rely on social sources of information, such as experts in their field, and also on environmental cues.

another aspect of wisdom is that it enables people to function in a state of uncertainty. baltes and smith ( ) have suggested that experts have not only knowledge of their domain but also knowledge of how to manage uncertainty, such as being aware of probabilities or the relative success or failure of relevant decisions. this tolerance of uncertainty is what bluck and gluck ( ) have called flexibility.

wisdom might be conceived of as a component of tradition, or as a means to bolster traditional social systems. wisdom is a type of learned judgment that comes from and complements the social system of which a person is part. this presents a structuralist-functional view on the purpose of wisdom; people who learn how to become a functional part of society are thought of as wise.

howard ( ) has stressed the importance of conversation between people involved with western development, coming in as technical experts, and the people who live in that nation’s traditional society. the “quick fixes” imposed upon the traditional society are a reflection of a western knowledge system, and the process of development is imbalanced, favoring the developed nation as expert. howard explains that “the legitimacy of the authority of the technical experts is based on the assumption of the superiority of science” (para. ). science is presented as an “objective, impersonal, rational, and universal knowledge system” (para. ).
there are many advantages to having scientists and developers who have adopted some degree of humility, and who recognize the legitimacy of local or traditional knowledge; people who know the land and understand the intricate balance of their ecology are a source of real knowledge.

western bias, colonialism, and positivism

there are different views on the value of traditional knowledge and how it might work in modern society. outside of science and development, education and access to information, and laws (as representations of wisdom) demonstrate tensions that might arise.

many librarians have noted that students from other countries do not have the same conception of the library as domestic students. northern or western hemisphere libraries have different traditions from southern and eastern hemisphere libraries. in some areas of the world, libraries are only available to scholars. the materials might be available only in the dominant language, as relics, reminders, and reinforcements of a colonial system. in the least developed areas of the world, people who hold the most power, the tribal elders, might be marked not by knowledge that is learned in books at all, but by their ability to make good judgments based on wisdom, knowing stories, and having life experience. we might say that in the west, book knowledge is privileged over life experience. one of the relics of colonialism is the tendency to discount non-textual knowledge. we can say, therefore, that books hold more power in western societies than they do in societies with a strong oral tradition. while we will not argue here that books are oppressive, there is a long line of thinking that books have the capacity to destroy the capabilities of memory; in western tradition, this was found in socrates’ conversations with plato, though his second-generation student (aristotle) wrote books and even established a library.
what are the results of establishing western-style education, imposing a western tradition of learning from books on non-western societies? what might that mean for the self-determination of a nation with different values and processes of legitimation and authority? falgout ( ) described a case of the social effects of implementing a standardized american schooling system in pohnpei, one of four municipalities in the island nation of the federated states of micronesia. rewards for completion were ‘‘primarily western ones—jobs, money, and manufactured goods’’ (p. ). the coeducational system improved women’s workforce prospects, but the new academic rites of passage also unintentionally shifted sociopolitical relations between the ‘‘new elite’’ and the old (p. ). the western democratic ideal of equality upended traditional norms in both well-received and problematic ways (depending on who was asked and how those consequences are weighed). in this case study, the intended state of knowledge and, perhaps, wisdom (i.e. democracy) were significantly different from the actual outcome.

ifla journal ( )

similar cultural clashes occur in the united states, where comparable problems or ethical questions have been raised involving native americans and religious people who want to remain faithful to tradition-based laws. the indian civil rights act ( ) established that united states courts have no jurisdiction over tribal decisions. one tribal practice is banishment, described by swift ( : ) as ‘‘an ancient punishment used by tribes to preserve order and rehabilitate tribal members . . . [which] helped maintain tribal cohesion, essential to cultural identity and protection.’’ there have been appeals to us courts, though, from banished members, illustrating the difficulty in maintaining a distinction between tribal and federal jurisdictions. similarly, wolfe ( ) has described the sovereignty of faith-based laws and religious courts’ relationship with secular (i.e.
us and canadian) courts. wolfe claims that most of the time the courts have been happy to let religious courts carry out their own arbitration because self-regulation has been good for the courts. it has lightened their load, and it has been good for the people, who have been more comfortable having their disputes resolved according to their religious values. however, in-group cultural disputes might cross jurisdictions. wolfe ( : ) explains, ‘‘many cultural groups wish to preserve their distinctive cultures and resist ‘state-promoted assimilation’.’’ the tension occurs when state dominance interferes with religious values; however, sometimes, traditional practices violate human rights, or in-group disagreements call for state intervention.

the two disparate views of colonialism and control (or suppression) of non-western knowledge in knowledge institutions are brought to the library discussion in durrani ( , ). durrani describes how, in kenya’s post-colonial society, the english established libraries and the educational system in order to maintain some modicum of control over the people. in the s, activists managed to establish their own libraries for liberation purposes (durrani, ); they incorporated theater and art spaces into the library as a form of engagement, and also as a method of resistance.

have people in post-colonial nations managed to wrest control over the educational system from the oppressors now that they are in charge of libraries and education? is there still a dichotomy in relation to books, education, and libraries that stands between western culture and non-western culture? what does globalization mean for traditional knowledge, and what do young people who live in societies that are transitioning into an interconnected, globalized society think about the difference?
this paper discusses how knowledge and wisdom were described by a group of women in kenya, a country that has moved beyond its colonial past, but also one in which many worlds exist simultaneously. in kenya, tribal and indigenous knowledge exists alongside schools built on western traditions. how can these two be effectively embodied within the context of the library? the women’s comments about the concepts of knowledge and wisdom prompted the question regarding the place of libraries and, ultimately, what a library is and what it does. this brings up a discussion about how knowledge and wisdom are perceived, in what context each concept is used, and how each is accessed and shared. aside from the theoretical interest in such delineations, the findings and discussions lend credence to human libraries and other non-traditional or developing conceptualizations of the library.

methodology

data was gathered during a larger qualitative study about what information people describe as relevant to their everyday lives. a combination of content, phenomenological, and hermeneutical methods was used to explore kenyan women university students’ interactions with information in their everyday life. basic content analysis was used to gain an understanding of the occurrence and co-occurrence of words. the relationship among words and how participants experienced these concepts was explored using hycner’s ( ) -step phenomenological method. finally, a hermeneutic method inspired by gadamer ( ) was used to understand the sociocultural and historical context of experiences.

the study focused on women’s information practices: the set of institutionalized, or recurrent, information-related activities (i.e. seeking, searching, use, evaluation, production, and sharing) of a particular group or community (savolainen, ). in an information practice, the unit of analysis is the individual-with-context.
this can be contrasted with information behavior research in which the unit of analysis is the person, regardless of context. the value of this interpretive approach (compared to a more positivist approach) is that it takes into consideration information activities as they are embedded in a greater social, cultural, and historical context.

context, in this case, is related to what knowledge is possible. in a foucauldian sense, knowledge is more akin to a system in which claims make sense, or not (foucault, ). the organizing factor is discourse, or a certain way of knowing and the validating statements and claims that make knowing possible. looking at the discourse makes it possible to understand how and why information activities, rather than individual behavior, are legitimized and gain value.

method

participants in this study included women, aged – , who were students at a private university in nairobi, kenya. eight tribes were represented, and all but one identified as christian. over the span of eight weeks, participants were asked to photograph objects they deemed relevant to their everyday life, write a description of what they intended to capture in the photograph, and then meet in a group to discuss why the object in the photograph was relevant to their everyday life.

data included participants’ spoken words during group discussion, participants’ written descriptions of their photographs, and participants’ collection of photographs. a total of principal documents were collected, including corresponding photographs, written descriptions, and transcribed group discussions. only written and transcribed documents were coded for content analysis.
additional clarification was collected through face-to-face interviews, informal interactions with participants and non-participants, and through researching local events, places, and objects to which participants referred.

importantly, participants were not asked to talk about either knowledge or wisdom, but these concepts emerged as relevant in their everyday lives. twenty-three documents, including seven photographs, written descriptions, and five discussions, were collected that specifically pertained to and were coded as knowledge. five documents, including two written descriptions and three discussions, explicitly addressed their conceptions of wisdom.

findings

the students expressed a distinct difference between wisdom and knowledge. when discussing everyday life events and sources of information, participants discussed knowledge as it related to the education setting. in contrast, wisdom was related to life experience.

on knowledge

students associated knowledge with education. knowledge was experienced in their identity role as a student % of the time and was expressed as a thing gained through going to school, going to class, and learning in class. for example, one participant explained that she was:

doing classes that are teaching me about my own country which i didn’t know at all. and, i’m gaining new knowledge and loving it.

knowledge was associated with the classroom, ‘‘a place where you can gain knowledge, discover yourself, and learn new things.’’ they spoke of school as the place where one acquires knowledge for empowerment and preparation for the future. for example, one woman said:

i believe . . . all the schools i’ve gone to have contributed to my knowledge. they might not be directly but through the education. through wanting to gain knowledge. through the education i have received i have managed to turn most of it into valuable knowledge that i am using in my everyday life.
they said that knowledge was something that was transmitted from the institution of the school. the library was important for the students as a place to do schoolwork and as having ‘‘all the knowledge you want.’’ in some cases the women also associated knowledge with culture, as a result of the diversity experienced by going to the university and by integration of cultural topics into the curriculum. books were described as ‘‘full of knowledge’’ or as comprising a ‘‘shelf of knowledge.’’ another student said that books actually ‘‘represent knowledge’’ and ‘‘without them you can get no education.’’

expert knowledge was also identified. for example, people who actually worked in their field of study were described as having expert knowledge. this type of knowledge, which was similar to wisdom, was based on direct and active involvement in the topic of knowledge and had an affective (i.e. motivational, inspirational) impact.

on wisdom

in contrast, the women linked wisdom to connections with elders and religious institutions. they associated wisdom with life experience. they said that elders were wise regarding tradition, normative issues, and money because they had life experience. one student described her parents as sources of both knowledge and wisdom, saying that they are:

very important sources of knowledge and information and . . . they’re like the wisest people on earth. even though . . . everyone says that we always repeat our parents’ mistakes . . . i guess that whatever, that i kind of . . . learned through them so i don’t have to fully repeat them.

another woman said that it is through people, not books, that she has learned the most:

you know how people say books are knowledge but i tend to believe people have more knowledge than books . . . there were these guys from several departments of the un who talked about poverty and hiv . . . and, some of this material you don’t usually get in books. you get it from the people and it has . . .
more impact on you when you hear it from somebody else than when we read it in our books.

another woman said that the bible was a source of wisdom because of the stories about the people who went through difficult situations and solved problems. of course, the bible is a book; however, she emphasizes the personal connection to the characters in the bible as a source of strength:

when i read it, it gives me wisdom [and new understanding] because we all need wisdom in whatever we do, in every decision we make . . . the bible encourages me whenever i feel that a situation is hard for me . . . so many people in the bible have gone through hard things. but, they have overcome them they have overcome them and gotten solutions to different impossible things. so, my bible is my guiding thing, and i respect the bible.

protocol or ritual played a role in sharing wisdom. for example, one woman discussed how rituals and the accompanying symbols during church service were central to gaining wisdom. the bible, the only text-based source of information related to wisdom apart from biblical inscriptions on church walls, was integrated into church protocol, in which the vested authorities would read or lead readings from the bible, and the house of worship was filled with various symbols of their faith.

in these cases, wisdom was contained in stories and based on people’s lived experiences, while knowledge was found in books or related to education. furthermore, knowledge was usually given; it was unidirectional. wisdom required much more situational interpretation. knowledge and wisdom overlapped in the domain of what can be called expert knowledge. this type of knowledge was related to the student’s area of study in school, but it was different because it was based on the expert’s lived experience and shared face-to-face.

the relevance of stories in wisdom is not a surprising link.
for example, pellowski ( ) explains how storytelling is universal and used as a way to entertain, transmit culture, and make sense of the world. myths, stories, and proverbs, therefore, are embodiments or packages of shared cultural knowledge, but they might require situational interpretation. for example, in buddhist and hindu cultures, storytelling was used to transmit knowledge and wisdom and was even seen as superior to other forms of communication. mbiti ( : ) explains that in many african societies, philosophies are ‘‘found in the religions, proverbs, oral traditions, ethics and morals of the society concerned.’’ he goes on to explain the unique value of proverbs, which are distinct from other traditions in that ‘‘their philosophical content is mainly situational’’ (p. ). the power of myths is in their orality, in ritual retelling.

discussion

the critical thinking skills taught in school as a part of a positivist tradition are certainly important for global participation and development, but they do not necessarily lead to situational or cultural competency. wisdom, enacted through experience, requires flexibility for situational variability. knowledge stored and shared in books and facilitated by trained professionals is an integral aspect of understanding. knowledge stored and shared in this way can be measured, recorded, and compared for future reference. the practice of wisdom is stored in the individual and shared through time-binding but flexible containers that cannot be measured, recorded, and compared in the same way. wisdom cannot be standardized like positivistic knowledge, but that is precisely what is lacking in the knowledge society.

findings support the idea that wisdom is a practical approach to life that is both situational and requiring action. in this study, life experience was an indicator of wisdom. for instance, parents were referred to as wise, giving their daughter the ability to learn from their past mistakes.
this confirms rowley’s ( ) claims regarding experience and action as important aspects of wisdom. however, the student also suggests that the benefits of wisdom can be passed on. because the receiver cannot possibly relive the same experience, such an acquisition would be an intentional and reflexive activity made relevant to current situational conditions.

knowledge and wisdom were mentioned in different contexts, suggesting an implicit divergence between the two concepts. ‘‘wisdom’’ points toward everyday life rather than the school setting, confirming baltes and smith’s ( ) suggestion that wisdom is comprised of bodies of knowledge pertinent to sectors of life other than school. participants did not denote a hierarchical relationship between knowledge and wisdom that would directly support the theoretical definitions given by haeckel and nolan (as cited in eisenberg et al., ) and sternberg ( ; sternberg and jordon, ). however, the findings from this small sample of indicators do not necessarily negate such a relationship.

bluck and gluck ( : ) found that ‘‘people are perceived as wise when they have helped others to solve a problem in a way that went beyond what they had been able to see and do before’’, which was also found in this study. age was certainly associated with wisdom. most people who were identified as ‘‘wise’’ were older than the students, though one student described looking up to her little brother because of his ability to achieve goals. (she emphasized how strange that was because she was his elder.) when people are able to meet their goals, it indicates a certain wisdom, or lifespan contextualism. while she was not speaking of wisdom in this situation, findings support bluck and gluck’s idea that perceived wisdom might be more about age groups than just age.
one participant confirmed this when she discussed how the bible imparted wisdom; its characters faced challenges, and she used those as lessons in her own life. she both read the bible and listened to it, and explained that both of those experiences helped her understand the meaning of the text. she used the stories as instruction in her own life, which is an intentional and reflexive act. the text is the container of the words, but the reading and interpretation by the preacher or rector helped her understand the meaning of the text; thus, the text was embodied by the interpreter.

so while learning from other people’s life experience is an important aspect of wisdom, there are protocols that facilitate wisdom, especially through ritual. the ritual and the interpretation legitimize and deepen understanding of the message. schmit ( ) described how various ritual expressions such as gestures, repetition, rhythm, and music, in religion give legitimacy to texts, claiming that ‘‘elements of ritual are symbols that speak in ways discourse cannot’’ (p. ). the idea of ritual language, beyond discourse, is that it ‘‘wholly involves us’’ (p. ). in other words, interpersonal acts can be more powerful than books; that is to say, this supports the knowledge–wisdom hierarchy. information is at the lowest level, below interpersonal communication, and embodied in ritual, which leads to action (or embodied wisdom). is it possible to facilitate this in the stacks of the library, or are libraries simply storehouses for dead words?

brophy ( ) describes the new library as a hybrid – a shift from the traditional library of the past that housed printed matter, to a new library that facilitates learning by push technologies (pushing electronic documents to users). recent reconceptions of the library, though, as a community center or as a learning commons, point toward yet another type of library, or another hybrid.
this is the library that recognizes people as containers of knowledge or wisdom. co-production and cross-disciplinarity have become buzzwords in academia; the lone scholar toiling away with books is no longer the norm. likewise, group assignments are more common, and libraries have responded with more group workspace (steiner and holley, ) in order to provide students with the print and online resources that they need. the library has also been reconceptualized as a place for production – of video, of writing. another interesting manifestation of the concept of ‘‘library as space’’ is the human library, in which people are ‘‘checked out’’ for library use – becoming a non-recorded human source for information (malin, ). the fact that it is not recorded is a difficult hurdle for inclusion in libraries: that person is not a text.

the large university where the current study took place offers a glimpse into the difference between young women’s conceptions of wisdom and knowledge, especially their view of knowledge in an education setting and the library, which houses graphic documents. is there a way to house wisdom, as well, in the library? hendon ( ) has explored the relationship between the storage of material objects and a community’s social and moral practices. she has suggested that communities develop an ‘‘ethic of storage’’ that ‘‘varies in conjunction with the need to define and validate social status, reflecting how people in different kinds of society interpret social relations and enact social values’’ (p. ). essentially, storage practices ‘‘raise issues of secrecy, memory, prestige, and knowledge’’ and provide insight into what a society values. storage practices might manifest in a variety of ways. for example, accumulation of certain material objects might represent status. physical location of material objects might have social significance.
an actual person might also be a container of knowledge and, in such a case, might be spatially or materially identified. studying sources of wisdom and incorporating traditional storage values into the library’s collection provides a bridge between wisdom and knowledge and creates possibilities for a richer and deeper ethic of storage. it validates a wider range of knowledge and potentially gives students a way to conceive of other paths of knowing.

the women said that when something was written in a book, it was true. they also said, though, that things their teacher said were true. their expression of ‘‘truth’’ relates to the idea that what we know, or what we respect, is found in books – that there is a definite way of knowing that can be found ‘‘in the stacks’’ through the lens of positivism. while this is not inherently problematic, their education would be enriched by looking toward grandparents and others who are outside of the center of these institutionalized paths to knowledge. however, such inclusion forces librarians, as well as students, to rethink authority – to expand their conception of truth, to provide academic legitimacy to that wisdom of lived experience.

implications and recommendations

libraries have been positioned, culturally and socially, to store and allow for potential sharing of both knowledge and wisdom. the place itself is dynamic and growing: the library is more than a place of storage; it is one for creation and community interaction. the ‘‘library as place’’ movement or the growth of academic commons within libraries is testament to this evolving role of libraries, especially on the university campus. rethinking the role of libraries as information providers, in terms of what they provide, is compelling; it is relatively easy for libraries to collect printed materials, and convenient for students to access them.
it might be more difficult for students to access people, especially elders or people who are outside of their social circle. we are often limited in our interactions to people in our various social circles, so providing access to people outside of those circles gives researchers a way to access knowledge that is not in books and that they would not otherwise encounter. we can imagine that in countries that have experienced war or social upheaval or where there are strict social divisions this connection could be especially helpful in academic fields, from medicine and law to the social sciences.

in this section, we describe human libraries, oral history, and narrative interviews in order to reimagine how these might be incorporated into a university library setting. human libraries show us a way to deconstruct the legitimacy of power and status manifested in static containers ‘‘in the stacks’’. the interplay between people in conversation is truly a subjective event. we do not propose to replace recorded information with human libraries, but to supplement it with voices that have not yet been recorded, to find ways to let students interact with tribal elders under the same roof (mental and physical) along with books and traditional academic sources. this gives them an opportunity to actually access wisdom, and in turn increase their understanding of why people act as they do, to understand socially embedded actions and reactions, and to use that wisdom to enrich their understanding of how to act in the world.

human or living libraries

situational knowledge or awareness is something that human libraries epitomize. they represent knowledge that is embodied in interaction rather than that which is fixed in print. this is not to imply that print is ‘‘dead’’, but rather that some elements of interaction are missing when the message is mediated. print allows people to temper or to strengthen arguments, and they are presented as a complete idea.
interaction, on the other hand, gives both sides a chance to clarify ideas and respond to questions.

the first official human library appeared in denmark in , at a stop the violence festival. the idea quickly spread to hungary, norway, iceland, sweden, finland, and into the rest of europe (human library organization, ). abergel ( ) has helped establish living libraries across europe to help break stereotypes and sensitize people to various issues. now, the concept is alive and thriving in north america, australia, asia, and south america – within libraries, on campuses (university of missouri, ) and even within companies (aaker and hammond, ). the trend has recently been introduced in south africa.

in a human library, originally called a living library, the concept of a book is expanded to pertain to a human, usually a person with a particular story to tell or exemplifying a social stigma; it is a tool for combating stereotypes. a patron checks out a human book in a similar fashion to other documents. the concept has been successful in facilitating important discussion in areas with diverse populations where stereotypes and prejudices prevent cohesion, and in extending the opportunity for people to interact with others with whom they may not otherwise interact.

the human library program has been especially effective in australia, where a successful living library event at lismore public library held in november of became established as a monthly event (pearse, ). the lismore living library was the first permanent living library to be established anywhere in the world, and the popularity and impact of the trend has led australia to formulate a national strategy for development of living libraries (human library organization, ). lismore also has outreach programs such as link up, a program that connects students with living library books in the community, and another program that brings human books to aged care facilities (kinsley, ; pearse, ).
in a sydney library, the concept was introduced in response to media reports suggesting a growing sense of intolerance for diversity among the growing population (crawshaw, ). the program is funded by a grant but still relies on volunteers to ‘‘act as books’’ and ‘‘through conversation and sharing their personal experiences . . . help break down prejudices and address misunderstandings’’ (p. ). in australia’s northern territory, the human library program brings together patrons and people from other countries to discuss the issues and difficulties faced in the book’s home country or as a result of moving to australia (hilder, ). the impact has helped foster harmony and community development in the territory’s multicultural communities. the human library movement has been so successful in australia that they boast more than local human library organizers, which is more than any of the more than other countries that participate in the human library movement (human library organization, ).

interacting with human texts would expose students to people who have past experiences in various aspects of life, such as a particular illness, religious background, sexual orientation, or traumatic experience. garbutt ( : ) has offered further explanation of the goal of a living library:

whatever the differences being worked with, for example, whether multi-cultural, multi-abled, multi-sexed, multi-sexual, or multi-faith, the intended outcome is not assimilation of less-powerful positions in society but of finding ways of coexisting in our differences. through the practice of conversation, living library participants and organizers are seeking a form of integration that does not leave hegemonic positions undisturbed and unchanged, nor one in which all values are necessarily shared.
in this sense, living libraries are ‘laboratories’ of multicultural cosmopolitan practice worthy of greater study and research.

the human library movement is not a consolidated, integrated effort, and concerns and challenges for the program include inconsistency in cataloging or standardizing entries, the voluntary nature of human books, and possible emotional distress. some facilitators of human libraries such as lismore human library have cataloged their human books (kinsley, ). in lismore’s case, the human book creates its title and the catalog description. while this preserves authenticity, the ways in which the book can be described and connected with a potential reader might also be limited. books are also volunteers, which may arouse concerns about the type of people most likely to volunteer, longitudinal factors, and other affective factors. the human library organization and human libraries australia have both created resource kits to guide organizers in their efforts to recruit quality books, inspire books, and create a safe space for interaction. interviews and meetings prior to acquiring and cataloging a book have been recommended to discuss the book title and proposed description and, importantly, understand the book’s motivation. the goal does not have to end at cultural exposure. living libraries potentially share knowledge in multiple domains.

oral history

another way libraries can facilitate less mainstream exposure to wisdom is through providing a space for oral history. kargbo ( : ), working in sierra leone, has suggested that librarians can provide academic validity for oral histories by selecting recordings and transcripts and then ‘‘develop directories to facilitate access to these vital data and organize and process oral traditions in a similar way they do for printed matter . . .
for the continued existence of their cultural heritage.’’ oral histories have been and can be archived and stored digitally using an array of ict, including audio and video recordings. oral history is democratic; it is roots-up rather than top-down.

some examples of oral history projects that might help us to conceptualize the inclusion of oral history are actually easy to find. the library of congress has embraced it through the storycorps project, which is a public service that helps bring personal histories to the public (storycorps, ). in , the library of congress’s american folklife center also started the veterans history project, an oral history project to make available to the public the personal accounts of american war veterans (american folklife center, ). these projects are important because narrative interviews can teach us what a person feels and thinks, and get to the truth rather than a mediated version of the truth.

riessman ( ) notes that narrative interviews have been used across many academic disciplines, from folklore to law and occupational therapy; ‘‘narrative analysis . . . is appropriate for studies of social movements, political change, and macrolevel phenomena’’ (p. ). for the purposes of this paper, we can imagine that the interactive experience of such an interview cannot be packaged; it can be transcribed and analyzed, but each time a person is interviewed, different aspects of experience will arise based on memories that were triggered in the interaction and that particular context.

conclusion

the students in this study identified a disjuncture between knowledge and wisdom that was wrapped in statements of value. they clearly valued both the wisdom held by elders and religious figures and also the education that they were receiving at school, but we ask if there are methods to better integrate these two domains of knowledge.
this would give the students a way to bring traditional knowledge into their own studies and at the same time legitimize the values of traditional social systems in the educational setting. integrating human libraries, storytelling, and personal interaction within the walls of the library is a way to bring wisdom into the stacks. it could be used to increase an understanding of diverse voices, especially in the post-colonial world. this will not only add to the informational ecology within the library, but also provide a means for students to integrate diverse viewpoints, including what has been identified as wisdom, into their studies.

funding

this research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

references

aaker d and hammond g ( ) the human library. marketing news ( ): .
abergel r ( ) don’t judge a book by its cover! the living library organiser’s guide. budapest, hungary: council of europe.
american folklife center ( ) about the veterans history project. available at: http://www.loc.gov/vets/about.html (accessed march ).
baltes pb and smith j ( ) toward a psychology of wisdom and its ontogenesis. in: sternberg rj (ed.) wisdom: its nature, origins, and development. new york: cambridge university press, pp. – .
bluck s and gluck j ( ) from the inside out: people’s implicit theories of wisdom. in: sternberg rj and jordon j (eds) a handbook of wisdom: psychological perspectives. new york: cambridge university press, pp. – .
brophy p ( ) towards a generic model of information and library services in the information age. journal of documentation ( ): – .
brown rh ( ) society as text: essays on rhetoric, reason, and reality. chicago, il: university of chicago.
budd jm ( ) an epistemological foundation for library and information science. library quarterly ( ): – .
budd jm ( ) jesse shera, social epistemology and praxis. social epistemology ( ): – .
crawshaw b ( ) camden living library is part of community harmony strategy. incite ( ): .
durrani s ( ) information & liberation: writings on the politics of information & librarianship. duluth, mn: library juice press.
durrani s ( ) progressive librarianship: perspectives from kenya and britain, – . london: vita books.
eisenberg m, lowe c and spitzer k ( ) information literacy: essential skills for the information age. nd edn. westport, ct: libraries unlimited.
falgout s ( ) hierarchy vs. democracy: two strategies for the management of knowledge in pohnpei. anthropology & education quarterly ( ): – .
frohmann b ( ) documentation redux: prolegomenon to (another) philosophy of information. library trends ( ): – .
foucault m ( ) the archaeology of knowledge and the discourse on language (ams smith, trans.) new york: pantheon books. (originally published in french, )
gadamer hg ( ) classical and philosophical hermeneutics. theory, culture & society ( ): – .
garbutt rg ( ) the living library: some theoretical approaches to a strategy for activating human rights and peace. in: activating human rights and peace: universal responsibility conference conference proceedings (ed. rg garbutt), byron bay, nsw, australia, – july . lismore, nsw: centre for peace and social justice, southern cross university, pp. – . available at: http://epubs.scu.edu.au/cgi/viewcontent.cgi?article= &context=cpsj_pubs (accessed march ).
haeckel sh and nolan rl ( ) the role of technology in an information age: transforming symbols into action. in: the knowledge economy: the nature of information in the st century, – annual review of the institute for information studies. queenstown, md: aspen institute, pp. – .
hendon ja ( ) having and holding: storage, memory, knowledge, and social relations. american anthropologist ( ): – .
hilder c ( ) public libraries serving multicultural communities across australia: best practice examples.
australian public libraries and information services ( ): – .
howard p ( ) the confrontation of modern and traditional knowledge systems in development. canadian journal of communication ( ).
human library organization ( ) the history of the human library. available at: http://humanlibrary.org/the-history.htm (accessed march ).
hycner rh ( ) some guidelines for the phenomenological analysis of interview data. human studies ( ): – .
indian civil rights act ( ) us code §§ – . available at: http://www.tribal-institute.org/lists/icra .htm (accessed july ).
kargbo ja ( ) oral traditions and libraries. library review ( ): – .
kekes j ( ) wisdom. american philosophical quarterly : – .
kinsley l ( ) lismore’s living library: connecting communities through conversation. australasian public libraries and information services ( ): – .
lloyd a ( ) informing practice: information experiences of ambulance officers in training and on-road practice. journal of documentation ( ): – .
malin sc ( ) what if? exploring how libraries can embody trends of the twenty-first century. young adult library services ( ): – .
mbiti js ( ) african religions and philosophy. new york: praeger.
morrone a and workman sb ( ) keeping pace with the rapid evolution of learning spaces. in: frasher k (ed.)
the future of learning and teaching in next generation learning spaces (international perspectives on higher education research vol. ). bingley: emerald, pp. – .
pearse t ( ) living libraries australia: a national strategy for bringing communities together. incite ( ): .
pellowski a ( ) the world of storytelling. rev. edn. new york: hw wilson.
radford gp ( ) positivism, foucault, and the fantasia of the library: conceptions of knowledge and the modern library experience. library quarterly ( ): – . doi: . / .
ricœur p ( ) from text to action. evanston, il: northwestern university press.
riessman ch ( ) analysis of personal narratives. in: gubrium j, et al. (eds) the sage handbook of interview research: the complexity of the craft. thousand oaks, ca: sage, pp. – .
rowley j ( ) conceptions of wisdom. journal of information science ( ): – .
savolainen r ( ) information behavior and information practice: reviewing the ‘umbrella concepts’ of information-seeking studies. library quarterly ( ): – . doi: . / .
schmit cj ( ) too deep for words: a theology of liturgical expression. louisville, ky: westminster john knox press.
steiner hm and holley rp ( ) the past, present, and possibilities of commons in the academic library. the reference librarian ( ): – .
sternberg rj (ed.) ( ) wisdom: its nature, origins, and development.
new york: cambridge university press.
sternberg rj and jordon j (eds) ( ) a handbook of wisdom: psychological perspectives. new york: cambridge university press.
storycorps ( ) about us. available at: http://storycorps.org/about/ (accessed march ).
swift m ( ) banishing habeas jurisdiction: why federal courts lack jurisdiction to hear tribal banishment actions. washington law review ( ): – .
turner a, welch b and reynolds s ( ) learning spaces in academic libraries: a review of the evolving trends. australian academic & research libraries ( ): – . doi: . / . . .
turner d ( ) oral documents in concept and in situ, part i: grounding an exploration of orality and information behavior. journal of documentation ( ): – .
university of missouri ( ) mizzoudiversity. available at: http://diversity.missouri.edu/summit/human-library.php (accessed march ).
wolfe cl ( ) faith-based arbitration: friend or foe: an evaluation of religious arbitration systems and their interaction with secular courts. fordham law review : – .

author biographies

brooke m. shannon is an assistant professor of global security and intelligence studies at embry-riddle aeronautical university in prescott, az. she earned her phd in information science and learning technologies from the university of missouri in . her research interests include information practices, qualitative methodologies, and east african affairs.

jenny s. bossaller is an assistant professor of library and information science at the university of missouri in columbia, mo. her research interests are in information policies, and social and technological constraints on information flow and provision, public libraries, and lis education.
article

the challenges of reconstructing cultural heritage: an international digital collaboration

rachel heuberger
goethe university library, frankfurt am main, germany

laura e. leone
center for jewish history, new york, usa

renate evers
leo baeck institute, new york, usa

abstract

the digitization of the freimann collection, unique works belonging to the wissenschaft des judentums (academic study of judaism), was a collaborative, international initiative to virtually reconstruct a cultural jewish heritage collection that suffered losses during world war ii. these works comprised the first engagement with pre-modern jewish religious texts using modern research methods of academia. building off a pre-war published catalog, the project brought together remnants of the original library collection in germany and collections that were gathered in one of the main exile locations of german-speaking jews in the united states. the freely accessible texts ensure the enhancement of scholarship by providing long-term discovery to an unlimited audience. digitization and virtual reconstruction are not only crucial from a digital preservation standpoint, but also allow researchers to envision the works in the context of their intellectual and historical significance. the project also generated models for international collaboration and large-scale digitization workflows.

keywords

jews, europe, german-jewish history, cultural heritage, digital projects

the wissenschaft des judentums (wdj) project continues to be a fruitful collaboration among three international institutions in the united states and in europe: center for jewish history (the center), leo baeck institute (lbi), both in new york, usa, and the judaica division of the university library (jsf) in frankfurt am main, germany.
the – stage of the project (dolnick, ) virtually re-created a renowned library collection whose originals were partially destroyed during world war ii. the collection is also known as the freimann collection, named after the frankfurt librarian aron freimann ( – ), who built the largest comprehensive judaica collection on the european continent (leo baeck institute, ) and published a printed catalog in (freimann, ). based on this published catalog, the jsf had started the virtual reconstruction of this prewar judaica collection in with funding from the deutsche forschungsgemeinschaft (dfg) under the name freimann-sammlung/the freimann collection. during the course of that project (heuberger, ) jsf estimated that it was missing % of the , titles that once constituted its world-renowned collection. the center identified approximately of these missing books within the holdings of its partner organizations, primarily in the collections of the leo baeck institute, a library dedicated to german-jewish history which was founded after the war in new york, and plans were made to digitize the missing volumes for inclusion in the freimann portal. subsequently, a new collaborative project (center for jewish history, ) was envisioned and funded by a joint grant of the national endowment for the humanities (neh) and the deutsche forschungsgemeinschaft (dfg). it took place between and and produced well over , digital surrogates with accompanying metadata, which were ingested into both the freimann portal (goethe universität frankfurt am main, ) and the center/lbi’s opac (center for jewish history, ).

corresponding author: laura e. leone, center for jewish history, w th street, new york, ny , usa. email: lleone@cjh.org
international federation of library associations and institutions , vol. ( ) – © the author(s) reprints and permission: sagepub.co.uk/journalspermissions.nav doi: . / ifla.sagepub.com
the importance of this project is based on the fact that the majority of the books from the freimann collection belong to the academic field of wissenschaft des judentums (academic study of judaism) (leo baeck institute, ), an important scholarly movement among european jews in the th and the beginning of the th century that comprised the first engagement with pre-modern jewish religious texts using the modern research methods of academia. providing free access to a unique collection of books that suffered losses during a worldwide war ensures the enhancement of scholarship and discovery to an unlimited audience. while at the time of their writing, these books were often read only by a network of scholars themselves, they comprise today an important primary resource for historical research in a diversity of disciplines of the humanities and social sciences. given the comprehensive nature of the wdj collection, the documents provide insight into a broad range of religious and political movements worldwide. it was recognized that creating digital surrogates of these materials was not only prudent from a digital preservation standpoint, but a crucial step in making them discoverable and accessible. it was vital to recreate this historically significant collection, much of which no longer existed in physical form since most european judaica collections had been destroyed or dispersed during world war ii. the collaboration among the three institutions combined the remnants of one of the historically richest european judaica collections (judaica collection frankfurt) with a collection that was rebuilt after the war in one of the outposts of the exiled german-jewish community (leo baeck institute new york). the center for jewish history provided digital services and logistical support.
as a result, the most comprehensive digital library of wdj in existence was produced, and the academic and cultural resources from the pre-holocaust era were recreated using st-century technology and practice.

basis of the project

a key element of this cultural reconstruction project was the existence of a printed library catalog for the frankfurt judaica collection that was published in (freimann, ). the curator, professor aron freimann, who looked after the judaica collection from until , made it the largest and most significant judaica collection on the european continent before world war ii. the collection, with its approximately , titles, was created at the end of the th century through generous donations of frankfurt jewish philanthropists. as aron freimann noted in his introduction, the publication of this judaica catalog was a bibliographical milestone, since catalogs of judaica collections had only been published partially, unlike the more language-focused hebraica collections. first plans for this printed catalog had already been announced in . it cannot be emphasized enough how fortunate the existence and survival of this catalog is; it was published in , before the beginning of the third reich. the years of nazi rule between and led to the annihilation of jewish life and cultural heritage, including the systematic destruction of jewish libraries and collections. without the existence of this catalog it would have been very difficult to gather comprehensive bibliographical data for judaica publications before . the importance of this catalog can also be seen in the fact that the catalog was reprinted after the war, in . in this edition, the old call numbers are no longer listed, and it is meant to be used as a bibliography, not a catalog of a collection.
using freimann’s printed catalog to cross-reference and isolate the works missing from the wdj collection for inclusion on the project was an integral part of the process, and enabled the realization of the project.

wissenschaft des judentums

the freimann catalog is especially important since it also includes the major historical literature on the science of judaism (wdj) until . wdj is rooted in the enlightenment movement (haskalah) and combines the ideals of emancipation and freedom brought about in the french revolution with critical engagement with the classical sources of judaism. founded in germany in the early s, wdj soon became the legacy of important jewish communities worldwide, and one of judaism’s outstanding manifestations of modern times. the wdj was the key instrument in the transformation of judaism during the th century through the use of new methods of textual study, especially philology and history, in the study of jewish texts and the history of judaism. scholars of classical studies, greek, and latin had first developed these methods for exploring classical texts and the history of the ancient world. other scholars then applied this methodology to the study of christianity, stirring major controversies by treating sacred texts as historical, human creations. the new scholars set out to understand how judaism had changed and developed over the millennia by posing questions that were historical and scientific. this approach bore the potential for dissolving the sacred traditions and timeless nature of the texts and corroding the timeless revelations and traditions. yet scholars could also employ wdj to support revelation and tradition. wdj was a malleable tool in the hands of its practitioners. the academic study of judaism and jewish history played a significant role in the formation of a new jewish identity that encompassed both jewish and german components.
the scholarship of the wissenschaft movement formed intellectual directions such as conservative and reform judaism and inspired the foundation of scholarly institutions such as the leo baeck institute at the center for jewish history, hebrew university in jerusalem, and jewish theological seminary in new york. today the legacy of wdj is still alive and relevant as jewish studies programs around the world continue to grow and flourish.

audiences

due to the rarity and fragile condition of historic resources for jewish studies, neither the center nor the jsf allows for their lending, requiring researchers to travel to their reading rooms to view the works onsite or have photocopies, microfilms, or individual scans sent abroad. researchers at both institutions as well as worldwide have benefited from easy digital access to these materials. this collaboration has produced the most comprehensive digital library of wdj in existence, and because of this, users can closely examine these works without having the materials in hand. the easy accessibility and the compact and defined format can reach large and diverse audiences beyond the core groups of academic users.

the project partners and their collaborative roles

three primary partners participated in the wdj project: judaica division frankfurt am main, the leo baeck institute, and the center for jewish history. the judaica division frankfurt (judaica abteilung frankfurt, jsf) is part of the frankfurt university library (universitätsbibliothek johann christian senckenberg frankfurt am main), one of the central scholarly libraries in germany. as of , its comprehensive holdings included special collections of about . million titles. the library performs . million book loans each year and was responsible for special collections financed by the deutsche forschungsgemeinschaft (dfg) for many years.
jsf owns the largest collection of jewish studies in germany, as well as literature about israel, and is one of the world’s great judaica library collections. it was founded at the end of the th century through generous donations of frankfurt jewish philanthropists. professor dr aron freimann, the librarian until , formed it into the most significant judaica collection of the european continent before world war ii. the collection suffered partial losses during national socialism and the war.

the leo baeck institute (lbi) in new york is devoted to the history of german-speaking jews. it was founded in by leading german-jewish émigré intellectuals including martin buber, max grunewald, hannah arendt and robert weltsch, who were determined to preserve the vibrant cultural heritage of german-speaking jewry that was nearly destroyed in the holocaust. the founders named the institute for rabbi leo baeck, the last leader of germany’s jewish community under the nazi regime. centers were established in new york, london, and jerusalem, as the places with the largest numbers of exiled german-speaking jews. today, the , -volume library and extensive archival and art collections in new york represent the most significant repository of primary source material and scholarship on the jewish communities of central europe over the past five centuries. lbi-new york is a founding member of the center for jewish history in manhattan and also maintains an office in berlin, and a branch of its archives at the jewish museum berlin.

the center for jewish history in new york is home to five partner organizations: american jewish historical society, american sephardi federation, leo baeck institute, yeshiva university museum, and yivo institute for jewish research.
the partners hold archival collections that span approximately years of history and include more than , volumes, million archival documents, and tens of thousands of textiles, ritual objects, recordings, films, photographs, and works of art. the center’s online public access catalog (opac) gives users the opportunity to search across the partner collections via a single-search portal, and users in more than countries have accessed the center’s resources online. preservation, access and scholarship form the core of the center’s work. as a public research institution, the center promotes open access to information, and illuminates the jewish experience through education and research, scholarship, programming, and exhibits.

the partners on the wdj project were a natural fit for collaboration, given their overlapping collection scopes and core missions. at the conception of the project, it was important to: identify the partners whose collaboration would be the most logical and successful; determine the needed technical tools, capabilities and expertise; ascertain how to build on and use existing resources; and generate a road map to illustrate how all the moving parts would come together.

the judaica division in frankfurt laid the groundwork in . with the remnants of the collection, jsf started the digitization project and set up the freimann portal. in , the center and the lbi joined the effort and undertook the digitization of additional wdj materials, found mainly in the lbi collections; some works were also found in the collections of three other center partners. the roles of each partner were clearly delineated, and each institution focused on a particular aspect of the dynamic process.
lbi library staff handled the selection, preparation, cataloging and physical transfer of materials based on the missing lists in frankfurt, while the digital lab staff at the cjh handled the digitization and transfer of digital content to jsf. jsf oversaw the project’s main objectives, and incorporated the new digital content into the existing freimann portal. while the preparation and selection of materials was an intensive process and required dedicated time prior to the project’s start date, many of the steps in the project plan were undertaken concurrently.

selection, preparation and cataloging of materials at lbi library

the project required detailed bibliographical preparation, selection, verification, and intense tracking of materials. lbi catalogers, working from the original freimann library catalog, worked through the titles that were identified as missing at jsf and compared them with the holdings at lbi and the other partner organizations at cjh. this process ensured not only the selection of materials but also their full descriptive cataloging, which was required before digitization took place. lbi catalogers created a shared project management spreadsheet with all the titles to manage the project. the majority ( %) of the materials came from the lbi collections. however, some wdj works were also found in the collections of the center’s other partners: american sephardi federation, american jewish historical society, and yivo institute for jewish research. credit lines were added to descriptive information at jsf to specify the name of the contributing partner. each book (or each volume in the case of multi-volumes) was assigned a unique digital identifier from a list of urns monitored by jsf. when this initial preparation was completed, the books underwent digitization and quality control in the center’s lab in batches of to .
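the comparison of the missing-title list against partner holdings described above can be pictured with a small sketch. this is an illustration under stated assumptions, not the project's actual tooling (the article only mentions a shared spreadsheet): the record fields, the normalization rule, and the sample entries are all hypothetical, and matching records that follow different descriptive standards would in practice need much more careful rules plus manual review.

```python
import unicodedata


def normalize(title: str) -> str:
    """Reduce a title to a crude comparison key (hypothetical rule)."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep letters and digits only, lowercased, separated by single spaces.
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in stripped)
    return " ".join(cleaned.split())


def match_missing(missing, holdings):
    """Split the missing list into titles found in holdings and titles still unmatched."""
    index = {}
    for record in holdings:
        index.setdefault(normalize(record["title"]), []).append(record)
    matched, unmatched = [], []
    for title in missing:
        candidates = index.get(normalize(title))
        if candidates:
            matched.append((title, candidates))
        else:
            unmatched.append(title)
    return matched, unmatched


# Invented sample data, for illustration only.
missing_titles = ["Geschichte der Juden in Frankfurt", "Über die jüdische Literatur"]
partner_holdings = [
    {"title": "Uber die judische Literatur.", "partner": "LBI"},
    {"title": "Another title", "partner": "YIVO"},
]
matched, unmatched = match_missing(missing_titles, partner_holdings)
```

keeping the contributing partner on each candidate record mirrors the credit lines mentioned above; titles left in `unmatched` would go back to a cataloger rather than being decided automatically.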
a challenging part of the project was the matching of the entries from the german freimann catalog with the bibliographical records in the current american online library system, both following different bibliographical descriptive standards. the arduous nature of the bibliographical analysis of the freimann catalog entries was only rivaled by the book tracking system that was put into place, which also included an intricate post-production process to resolve problem cases.

digitization at the center’s digital lab

the center’s gruss lipper digital lab was founded in to serve the center community on ad hoc and small-scale projects. the wdj project was the first large-scale digitization project done in the lab, and it served as a model for future large-scale initiatives. during the project’s first months (september –august ), the center coordinated efforts with jsf to generate and test workflow protocols for the transfer of materials between the two institutions, hired lab staff dedicated to the project, and purchased an atiz bookpro, a dedicated book cradle which uses two mm digital cameras for image capture. staffing included a dedicated photographer, a dedicated quality assurance technician, and a metadata librarian. the first year of the wdj project also was spent creating and coordinating communication protocols among the center, lbi, and jsf.

in the second year, additional staff were assigned to the project for both image capture and quality assurance functions. regarding equipment, while the bookpro remained the primary camera for the entire project and completed the majority of the work, the lab’s large format better light camera was used to digitize selected fragile and oversize materials. photographers discovered elements of the physical books that, while not particularly vulnerable overall as objects, made them more vulnerable for photography in particular.
for example, condition issues such as brittleness and tight margins necessitated the removal of a selection of books from the original list. these books were then replaced with alternate titles. the center’s conservator assessed and stabilized selected objects prior to shooting. in addition, the quality assurance process was reviewed and adjusted. following image capture and quality assurance, the batches of digital surrogates were sent to jsf via external hard drive shipments. simultaneously, lbi sent bibliographic records of the digitized books to jsf. this workflow was built in to ensure the quality and comprehensiveness of the process.

the role of jsf

throughout the project, the center’s digital lab and lbi worked with jsf to complete the delivery of all digital assets and related metadata. jsf then ingested the digital assets into the freimann portal and enhanced the metadata by connecting it to authority databases, including standardized headings according to the virtual international authority file (viaf) and other thesauri. the digitized images were checked carefully page by page and the books were structured according to their content, marking title and final pages, chapter entries, and so on. subject headings and systematic indexing were added, culminating in accessibility to the materials via the portal. browsing was enabled using the indexing of relevant freimann catalog chapters, thus allowing the discovery of unknown titles relevant to the subject. the quality of the metadata facilitates the discovery of the material via well-known search engines such as google, thus bringing hundreds of visitors to the site and its documents daily. in addition, enhanced metadata linked to authority headings enables the use of this data as linked open data sets. jsf used the visual library data management system to manage all digital content, including descriptive metadata and the digital assets themselves.
since , jsf has supported and maintained their online judaica databases, and the institution has procedures in place to ensure the long-term management of and access to the freimann portal and its wdj online digital content.

long-term sustainability

all project partners are committed to the long-term care of digital assets. building an infrastructure that will sustain not only the assets themselves but the access provided to them is crucial. in addition to access via the freimann portal at jsf, the digital assets were also ingested into the center’s digital assets management system, providing an additional access point through which researchers discover and view materials. bringing the wdj collection together in one portal and situating it in context is one way the wdj project ensured its long-term viability. adhering to best practices in both digitization and accessibility plays a vital role in sustainability for the long term. the implementation of a digital preservation program and exploration of next generation digital assets management tools is currently underway at the center/lbi, and is a priority going forward. fundraising, budgeting, and projecting costs for digital storage have also been key elements in planning for future sustainability of digital assets. a combination of cloud options and the augmentation of local infrastructure ensures the safety and security of the digital assets produced in the lab, and the configuration will continue to evolve.

accomplishments

this project’s importance cannot be overstated. the reconstruction of the largest and most significant judaica collection on the european continent before world war ii made it possible for researchers to again discover and access this cultural heritage over six decades later. this was the impetus for the project, and clearly the most resonant.
from a digitization standpoint, the project has served as a model for large-scale digitization projects, as well as for international collaboration. in addition to fostering enhanced workflows for digitization and access, new ways of communication and data transfer were established.

continuation of the project

due to the success of this project, both the center and jsf aim to continue this important work together. as the original freimann project as well as this project focused on books or monographs, a natural next step could be to digitize the periodicals in the wdj collection. since their development in the th century, jewish periodicals were an integral part of the corpus of jewish publications, having their own specific characteristics. from the beginning they were created as a temporary product only, meant to serve for daily, weekly or monthly use. today, after the extinction policy of the nazi regime and the second world war, it is unusual to find complete sets of jewish periodicals. it would be of value to integrate compact memory (heuberger, ), an existing portal of th-century german-jewish periodicals at jsf and at the germania judaica in cologne, with a digital periodicals portal at lbi. additional titles would be added, and the shared portal enhanced and further developed.

long-term impact

this international project is an excellent case study for the reconstruction of cultural heritage by digitally unifying resources that have been physically scattered across the globe due to world events. remnants of collections that were partially destroyed during a war and collections that were collected in exile were reunited. important elements of this collaborative project are as follows: the evaluation of the importance of a collection; the identification of logical partners; the development of a road map; and technical, administrative, and international communication approaches and solutions.
it is especially noteworthy that the identifica- tion, discovery, use, and incorporation of existing old bibliographical sources were first and important steps. collaborative, themed digital portals are on the one hand important technical tools to reconstruct collec- tions and on the other hand important research tools to promote humanities and cultural content in context to existing and new audiences. acknowledgement the authors wish to acknowledge the national endowment for the humanities and the deutsche forschungsge- meinschaft for their generous support of the wissenschaft des judentums project. references center for jewish history ( ) science of judaism (wis- senschaft des judentums). available at: http:// thstreet.tumblr.com/post/ /science-of- judaism-wissenschaft-des-judentums (accessed june ). center for jewish history ( ) single-search portal. available at: http://search.cjh.org (accessed june ). dolnick s ( ) jewish texts lost in war are surfacing in new york. new york times, march. available at: http://www.nytimes.com/ / / /nyregion/ books. html?_r¼ (accessed april ). freimann a ( , reprinted ) katalog der judaica und hebraica, erster band: judaica. frankfurt am main: lehrberger. goethe universität frankfurt am main, universitätsbi- bliothek, freimann-sammlung ( ) the freimann collection. available at: http://sammlungen.ub.uni-frank- furt.de/freimann?lang¼en (accessed june ). heuberger r ( ) arche noah der erinnerung–jüdisches kulturerbe online. in: gelber mh (ed.) integration und ausgrenzung: studien zur deutsch-jüdischen literatur- und kulturgeschichte von der frühen neuzeit bis zur gegenwart; festschrift für hans otto horch zum . geburtstag. tübingen: niemeyer, pp. – . heuberger r ( ) cultural heritage reconstructed: compact memory and the frankfurt digital judaica collection. in: ifla world library and information congress: th ifla general conference and assem- bly, lyon, france, – august . 
available at: http://www.ifla.org/files/assets/newspapers/gen- eva_ /s -heuberger-en.pdf (accessed april ). leo baeck institute ( ) wissenschaft des judentums: the freimann collection. available at: http://www.lbi. org/collections/library/highlights-of-lbi-library-collec- tion/wissenschaft-des-judentums-freimann-collection/ (accessed june ). leo baeck institute ( ) wissenschaft des judentums: jewish studies and the shaping of jewish identity. exhi- bition at the leo baeck institute, march – august . available at: http://www.lbi.org/ / /wissenschaft-judentum-jewish-studies-jewish-iden- tity-exhibition/ (accessed april ). author biographies rachel heuberger, phd, is head of the jewish division of the universitätsbibliothek frankfurt am main (university library). she studied history, jewish history, and education at the hebrew university in jerusalem, is a lecturer for jew- ish studies at the university of frankfurt and has published extensively on modern german-jewish history, especially the history of the jews in frankfurt, on wissenschaft des judentums, hebrew bibliography and the gender question in judaism. within the library she initiated the digitization of the historic hebraic and judaica collections, and is responsible for the online portal ‘‘digital collections judaica’’. in – , rachel heuberger was the coordi- nator of judaica europeana, an ec-financed project that ingested millions of digital images of jewish cultural heri- tage into europeana, the european digital library. today she serves as chairperson of the judaica europeana consortium. laura e. leone is the director of archive and library ser- vices at the center for jewish history. she has oversight of the institution’s archival, digital, preservation, reference and library systems services, and manages the exhibition space in the david berg rare book room at the center. 
in addition to sharing oversight of the neh/dfg-funded wissenschaft des judentum initiative, she has managed other large-scale initiatives including a council of library and information resources-funded project to make discoverable over linear feet of hidden archival collections. she also serves on the palmer school special collections advisory group, which convenes biannually to review the rare books and archives curriculum at liu’s palmer school of library & information science. she has presented at conferences/workshops in the us, uk and israel.
renate evers is the head librarian of the library of the leo baeck institute new york. she initiated the digitization program for the lbi library collections in focusing on rare books, gray literature, and periodicals. she holds an mls from frankfurt (germany), an mis from konstanz (germany), and an mcis from rutgers university (new brunswick, usa). she has more than years of international professional experience in university libraries, special collections, and archives and has published and presented on digitization as well as topics related to german-jewish history.

article

born fi dead? special collections and born digital heritage, jamaica

cherry-ann smart
university of the west indies, jamaica

abstract
cultural heritage print items in special collections hold distinct positions in libraries. yet their potential for pedagogy and learning may often be under-explored due to issues related to preservation management. these challenges underpin their exclusivity, a perception which may impact donor relations and unconsciously impede research access. conversely, electronic publishing has enhanced access to born digital cultural heritage products of scholars and creative expressionists. without the hassle of intermediaries, these born digital producers represent a ‘‘new world order’’ for libraries charged with dual responsibility for access and posterity. some challenges are infrastructural, such as internet penetration, others are human related arising from the need for capacity building in areas such as publishing and effective preservation strategies. these skills are essential to fully conceptualize potential for loss of cultural heritage products and the need for viable mechanisms to manage content; anything less would suggest cultural heritage products are literally ‘‘born fi dead’’.
keywords cultural heritage management, principles of library and information science, special collections/rare books, collection development, caribbean, latin america and the caribbean, information and society/culture, access to knowledge/access to information, preservation and conservation introduction the west indies cultural identity developed from watersheds in its history. it began with the re- discovery of the islands in the th century, the ship- ment of millions of people from the african continent to this new world through the middle passage, inden- tureship, post-colonialism, and independence. in the st century the islands are now subject to the vag- aries of globalization as small island developing states. the more popular term ‘‘caribbean’’ came into being in the th century during the period of united states expansion, according to puerto rican historian antonia gazambide-giegel (girvan, ). the group of islands situated in the caribbean sea consists of a range of english, french, spanish, and dutch- speaking countries, representative of the nationality of the invasive dominant power over the centuries. jamaica is located at the northwestern portion of the caribbean sea and is the third largest island in the greater antilles. home to approximately three million people, this anglophone caribbean nation has a cultural heritage of tangible cultural products con- sisting of literature, artwork, and artifacts; and intan- gible products which include expressions of oral tales, dances, and music. reggae and dancehall music are two of the more eminent intangible exports of the island. the recent focus on creative industries in the car- ibbean, in particular music and film (boxill, ) has triggered a renewed tangential examination of cul- tural heritage products. the renaissance is timely as mcdonald ( : ) had long since critiqued the caribbean as being ‘‘distressingly inferior in the effort and time and resources and money’’ put into preserva- tion efforts. 
corresponding author: cherry-ann smart, west indies & special collections, the university of the west indies, mona campus, kingston , jamaica, wi. email: cherryann.smart @uwimona.edu.jm

international federation of library associations and institutions , vol. ( ) – © the author(s) reprints and permission: sagepub.co.uk/journalspermissions.nav doi: . / ifl.sagepub.com

like most other developing countries, jamaica’s cultural heritage is now largely influenced by the internet. multiple intellectual property breaches by industrialized countries in the past have since prompted the jamaican government’s counter measures to attempt retribution by introducing a retroactive extension of copyright to years (copyright amendment bill, ). this move will have a significant impact on libraries, which have since countered government’s proposal by recommending a -year extension with no retroactivity. while the publishing industry is still on the verge of transformation, quite a large body of cultural works is now being ‘‘born digital’’. creative and cultural scholars have embraced the freedom of electronic publishing, presenting a copious, inventive, unrestrained bevy of expressions of works which fall within the bercelonette of digital birth. here, born digital resources are defined as items created and managed in a digital form (erway, ), distinct from digitized items which were originally committed to print. the unesco ( : ) charter on the preservation of the digital heritage identifies such digital materials as ‘‘texts, databases, still and moving images, audio, graphics, software and webpages’’. special collections in the academic library have also been impacted by the internet. in the midst of a generic template of information available on electronic databases, these mainly print collections of the country’s antiquity are the library’s single most distinguishing feature.
in this new dispensation, libraries must consider ways to provide more access to special collections while simultaneously applying defensive techniques for posterity (traister, ). in jamaica, preservation of print special collections must contend with intemperate climate, insufficient resources, and fickle gatekeepers who include stake- holders that are under- and un-educated on the inher- ent value of preservation and conservation. for libraries, archives and museums (lam), pre- servation management of cultural heritage is contin- gent on the institution’s mandate. while lams may superficially be classified as information units with interlocking roles, their focus, priorities, staff compe- tencies, and image are quite dissimilar (lidman, ), thus collaborative ventures are at times willing but weak, leading to significant challenges in the har- vesting, preservation, and provision of access to cul- tural heritage material in jamaica. some caribbean libraries have embraced digitiza- tion as the seeming panacea for preservation efforts, but this is idealistic as renwick ( : ) counters, ‘‘intermittent power, low levels of computer, internet and information literacy, limited bandwidth and access to computers’’, and the inefficient use of the technologies are chronic issues with which caribbean libraries must contend. indeed, temper ( ) asserts that digitization is not yet a long-term preservation method. in truth, much of the focus on digitization is on data management with less attention attributed to technical issues related to the production and long-term storage of e-data (lawrence et al., ; temper, ). the situation is volatile at best, with invisible strategies and opaque directions by library leaders, leaving some to wonder about the preserva- tion of tangible cultural heritage, and in the current scenario, whether the creation of born digital items is literally ‘‘born fi dead’’. 
this paper looks at preservation challenges of cultural heritage products in jamaica with especial emphasis on materials held in special collections and those born digital.

special collections as cultural heritage within the research library

academic libraries are not often featured as purveyors of cultural heritage in jamaica. that role is often reserved for the national library or the institute of jamaica because of their more intimate interaction with independent researchers, members of the public, and their nationalistic agenda. their relationship with cultural products thus affords them the opportunity to have the government’s ear on issues of rights management and preservation. the more research-oriented focus of the university of the west indies (uwi) mona library, despite its mandate to collect all things west indian, makes it difficult for some, even librarians, to see the value of special collections as cultural heritage. indeed, as traister ( : ) submits, some professionals even see them as ‘‘a waste of time’’. at the west indies and special collections (wi&sc) of the uwi, mona library, there are significant collections which chronicle the cultural heritage of jamaica. the unit was created in the s through the efforts of then university librarian, kenneth e. n. ingram, who was also an historian, bibliographer, and scholar. apart from archiving the institutional memory of the university, the collection was grown with purchased works, although in the last years this has not been possible and so there has been heavy dependence on donations. the global recession has significantly impacted the library’s purchasing power. unfortunately the heavy reliance on gifts, not unwelcomed, has taken much of the evaluatory decisions out of the hands of librarians and into the hands of benefactors or university administrators. it becomes
difficult then to pre-determine whether gifted content aligns with more than the fundamental collecting interests. too, it becomes difficult to ascribe potential use or access restrictions, cultural or research poten- tial or sometimes even the scope and condition of a body of works until the items are actually received and a preliminary scan is conducted in-house. as a result, librarians must adopt a form of ‘‘push’’ promo- tion of the special collections, taking the cultural products to faculty and students, highlighting poten- tial research areas rather than the ‘‘pull’’ concept, which is the reverse and traditional concept. the peril with the push approach is that, like marketing, it does not build brand loyalty and is a short-term fix to an ongoing chronic problem of the loss of the uniqueness and focus on special collections from the mass effect of electronic databases. this is also attributed, in part, to the changing emphasis of societal needs and per- ceptions of cultural heritage. despite this serendipitous approach to building special collections, they still form the single distin- guishing features in the library’s holdings. an incon- clusive list of the growing collection includes public and private papers of two former prime ministers; the honourable edward seaga and pj patterson; nove- lists, john hearne and vic reid, celebrated play- wright, trevor rhone; former vice chancellor emeritus, scholar and noted cultural icon, professor emeritus rex nettleford, and dr olive lewin, world-renowned jamaican ethnomusicologist. this is in addition to tainos artifacts and the traditional rare books and manuscripts. several collections, such as the trevor rhone and rex nettleford collections are still being processed and so are not formally accessible to the public via the online catalogue or finding aids. this does not mean that ‘‘access’’ is not available. 
jones ( ) notes that references in the literature indicating the existence of a body of works, and word of mouth, as through the liaison librarian function are alternative means of pro- viding access. the latter method, for example, sparked faculty’s interest in the trevor rhone collec- tion. consequently, an assessment of the collection’s viability for research and teaching in a life writing class in the uwi’s literatures in english programme is being explored. anasi et al. ( ) noted the use of cultural heritage in research and learning as instru- mental in protecting rights, and serving as instruments of public accountability. despite this affirmative, schmiesing and hollis ( ) had observed the use of special collections for teaching was a rarely researched topic in the library literature. their argu- ment that faculty and librarians overlooked the ‘‘ped- agogical advantages of using rare materials and book history to further students’ understanding of subjects within a variety of disciplines’’ (p. ) can be further extended to the use of cultural heritage items. through the capture of cultural heritage, the library not only ensures these items’ posterity for future gen- erations; but the objects and records demonstrate the achievements and progress (manaf, ) jamaica has made since it gained its independence in . such expressions should provide fruitful opportuni- ties for collaboration between the library and faculty in generating new knowledge and pedagogy. preservation and conservation of the special collections the issues which surround preservation and conserva- tions efforts in libraries in jamaica are undoubtedly similar to those of other libraries in developing coun- tries. smith ( ) in articulating the need for public policies and economic models which support preser- vation attempted to establish a framework for value within the united states context. 
in support of the value claim, however, libraries must be conscious of accusations of exclusive collections, not necessarily a good thing for donors who want their works to be used, and reclusive gatekeepers who may be reluctant to establish a balance between access and preservation (traister, ). understandably though, all collections and users are not created equal and much of the reticence of librarians to fully open the collection is premised on prior bad acts of university students and some faculty members. preservation and conservation efforts are also hampered by several factors, which include: donor relations. collections are often partially-gifted. the process of having donors sign off on donations is sometimes cumbersome and long-winded, and fraught with uncertainties especially when the items hold emotional memories. the absence of a strong and clear policy on gifts contributes to the challenge as donors with caveats and specifications further complicate bequests (ballestro and howze, ). as a result the library is reasonably reluctant to advance limited funds sorting, accessioning, and cataloguing collections which may be recaptured by the donor. potential donors also want their collections to be used (retired staff, , personal communication), and the restricted access provided by the library is a real deterrent for sought after gifts. space constraints. insufficient spacing hinders the proper storage of unprocessed collections resulting in potential physical damage from unstable temperature and humidity. carrico ( cited in ballestro and howze, ) had suggested one of the questionable benefits of unsolicited gifts was the processing time and storage space required to host them. as the situation stands currently, gift items are deposited into the wi&sc unit while arrangements are made to have them fumigated.
their deposit into the unit serves as a safeguard measure especially if no inventory of the items was conducted at arrival. there is, however, the risk of transferring infestations from untreated gifts to treated library items. top staff dependence. access to special collections in the unit are too staff dependent. the head of the unit and one clerical are two long-serving staff members with institutional memory. occasionally, access is left to their recall; a position which became transparently untenable given their imminent retirement in the upcoming year. fortunately attempts are now on- going to make the collection more accessible to staff and users by updating the location of items in the cat- alogue, and inputting finding aids in records of pro- cessed material into the archivist toolkit software. there is still need for a more efficient, inclusive sys- tem of documentation which needs to be inbuilt into the unit’s workflow. thus far, the rate of implementa- tion of this aspect of access provision has been slow. backlog. the centralized cataloguing department has been understaffed for several years with three contin- uous cataloguers. in addition to the main library which caters to the social sciences, humanities and education students, the cataloguing unit is also responsible for servicing the medical, law, science, and off-site campus branch libraries. understandably, focus has been on satisfying the needs of the student population seeking degrees with global equivalency. consequently special collection items are ascribed low priority. but, according to jones ( ) there is high risk of uncatalogued items being lost or stolen. too, their inaccessibility could hinder potential research activities. education and training. there is insufficient profes- sional expertise, poor sharing and collaboration for special collection development, especially as it relates to cultural heritage material. 
while there is an acknowledgement that an approach is needed to deal with and handle gifts, there appears to be insufficient ‘‘how to’’ knowledge. consequently there is a need to build capacity among special collections librarians to properly care for, market, and promote the collection. according to grieder ( cited in ballestro and howze, : ) other important librarian attributes are ‘‘knowledge of bibliographical procedure, diplo- macy, and a capacity for decision making’’. born digital items as cultural heritage items the manifestation of born digital expressions presents a simultaneous or alternating emotion of love and hate for libraries in jamaica. the internet creates an outlet for nouvelle forms of expressions and creators thus allowing for a more interactive process by users. con- sequently there has been an upsurge in the number of cultural works represented in e-books, cds, youtube clips, artwork, etc. unfortunately, the challenges related to born digi- tal material can be as diverse as the items themselves. apart from the lack of funds to purchase the items and the technology on which they are stored or fully sub- scribe to the multiple aggregators for access, the insti- tutions also may not have the trained personnel to effect substantive changes. the following are some other challenges associated with born digital cultural heritage products. loss of access to born digital cultural products the internet infrastructure in jamaica is severely underdeveloped. the global it report (bilbao-osorio et al., ) reported little progress was being made in bridging the digital divide between technology-savvy nations and others, citing the coun- try as being at risk of missing out on global competi- tiveness. intermittent internet access associated with digitization is also attributable to born digital items whose accessibility rests upon the platform on which the items are published. 
in the last few years, local authors have taken to publishing on amazon with access available via the kindle. while amazon has admittedly revolutionized publishing opportunities for local independent creators, publishing in e-format only should be of concern to information institutions. the public, national, and academic libraries do not currently possess kindle or digital distributor systems such as overdrive to facilitate customer borrowing of e-books or the sharing capabilities. consequently, born digital cultural e-books, unlike their print equivalents, cannot be easily shared with and among nationals. in addition to not being in possession of electronic devices for reading books, most government and quasi-government agencies follow a procurement process which may not support credit card purchases. additionally, even if the material, with device, is gifted by creators there is the real threat of obsolescence of the particular equipment. some persons argue that there is a lot of born digital reading material available free for download on the internet; but these items seldom represent the indigenous jamaican culture which is important for cultural transmission. in an interview with an award-winning local author, on her choice to publish a solely electronic children’s book (diane brown, , personal communication), brown admitted her motivation stemmed from publishing in a new format which was less costly and which she assumed would reach a wider global audience. understandably, an author’s focus might not be posterity and non-commercial access. too, legal deposit, while mandatory, may not be of high concern to authors. the traditional publishing process in the caribbean is expensive and authors are not always satisfied with binding options or colour palettes provided, especially with heavily illustrated children’s books (alison latchman, , personal communication).
too, much of the support for profes- sionally produced self-published materials such as book editors, copy editors, etc. are not commonly available. many established authors were provided with their first break through uk presses such as mac- millan and peepal tree press. quality of cultural heritage items born digital curdella forbes ( ) acknowledges the increase in ‘‘homegrown’’ fiction in the last decade, facilitated mainly by digital technology, the internet and desk top publishing. unfortunately, much of the ‘‘home- grown’’ material lacks the editorial controls with the results being works filled with grammatical errors, other editorial mishaps, and inappropriate use of copyrighted materials. joanne johnson, another noted children’s author implied the lack of standards of some of these publications was not advantageous to caribbean works (edwards, ). she argued that a writer’s ability to accept criticism and/or rejection of their work guaranteed a more worthwhile product. a similar argument could be ascribed to scholarly publishing and substantiation for the peer review process. unfortunately libraries have little control over the quality of the works, and if presented, they must be accepted as legal deposit, and form part of the cultural heritage of the country. loss of born digital scholarly works with cultural content the profusion of grey literature has been a chronic challenge for developing countries. works with rele- vant caribbean content are hidden in eclectic compi- lation of materials (ramchand, ). some researchers and government ministries have opted to post reports online with the intent to improve access to other scholars and the interested public. the prob- lem with some of these works is that most times they lack proper bibliographic information which red flags their credibility and usability in knowledge genera- tion. too, there is little consistency. 
as librarians struggle to assist students to become ethically respon- sible researchers, and avoid plagiarism, citing and assigning credit to thoughts and ideas is extremely important, especially as globalization takes roots in all aspects of our existence. unfortunately some works lack elemental signatures such as the date of creation, which makes it hard to establish whether it is a final version or a draft, author affiliation or accreditation, organization affiliation, etc. so while, there may be reasonably argued content or important cultural information which may not be accessible elsewhere, librarians must caution use of such mate- rial for knowledge generation and research. one response could be to teach students how to read and ‘‘interrogate’’ the url to find the required biblio- graphic information. the creation of an electronic cultural heritage divide as mentioned previously, internet bandwidth is not widespread and so internet access, for those who can afford it, is concentrated in the urban areas. for decades, jamaica has struggled with ‘‘low growth, high public debt and many external shocks which have further weakened the economy’’ (world bank, ). in rural areas internet connection is extremely slow or non-existent, as penetration has not kept pace with government policy, political promises, and prag- matic business decisions. accordingly children who are located in these areas are denied equal access to the cultural heritage of jamaica when these products exist solely in a born digital format. the situation has the potential for the creation of an additional divide – the possessors of the technology with the funds and credit cards to access and download culturally related works and those without these accoutrements. resource knowledge and description for access ramchand ( : ) notes ‘‘cultural confidence is knowing who you are and why you are in the midst of all the convulsions that are changing your life’’. 
he echoed sentiments similar to those of sancho ( : ), who critiqued the dearth of caribbean creative endeavours used in the schools as west indians not ‘‘sufficiently rooted in the beauties of our [their] vernacular and the knowledge of our [their] literature’’; and with educators themselves ‘‘undernourished’’ in this area so unable to feed their charges. so in addition to contemporary librarians lacking knowledge of the new cache of west indian writers, there also lies the challenge of description of the works. in the changing e-environment, especially with born digital and digitized items, the terminology used to make these items discoverable does not always gel with formal cataloguing rules. therefore while jamaican nationals can easily discover items on their cultural heritage by performing a natural search of terms via google, the more restrictive vocabulary used by cataloguers may pose a problem for discovery. for example, a search for ‘‘caribbean folk tales’’, ‘‘caribbean fairy stories’’ and ‘‘caribbean legends’’ produced hopelessly limited hits in the library’s catalogue, even though a physical inspection found a fair number of works on the shelves. conversely, a similar search on google produced a mass of hits which included not only links to blogs and websites which address indigenous jamaican children’s tales but also some material which had since fallen out of copyright and was subsequently digitized by project gutenberg, hathitrust or google books.

conclusion

preservation efforts of special collections and born digital items present an interesting challenge for libraries in jamaica. the ideals of cultural heritage in special collections can assume an exclusive or inclusive approach. librarians must be decisive on the way forward, ever mindful of the changing demographics and focus of faculty, students, and their research agenda.
a revisit of policies which indicate how much access is provided may be the first step towards promoting cultural heritage items for pedagogical purposes. similarly, the paradigms for born digital items for research and cultural expressions may need to be explored. in the changing environment of open access, libraries may need to demonstrate a stronger interest in born digital items in their different spheres, to ensure standards are upheld and access is made available to all citizens. many of the transgressions in electronic self-publishing can be attributed to a lack of awareness of standard publishing procedures (edward, ). consequently born digital items, despite the novelty of ideas reframed without the restrictions of external editorial advisories, are in a fight for legitimacy. this plea for acceptance is reflected in a dub poem composed by the author, the preservation librarian and three paraprofessional staff members of the uwi, mona library and presented at the th symposium of the archaeological society of jamaica. dub poetry evolved from the dub music of the reggae rhythms in jamaica in the s. accompanied by the beats of a wooden drum, the poem is read with a staccato beat in the patwa or local dialect. the drum is further strong evidence of the survival of africa in this new world (whylie, ). the poem “born digital” is recited as follows:

born digital

mi born digital but it not ah easy trad from mi enter dis father land mi secluded, dem say mi naa provide equality cause me quality nah match de uppity but mi wah be loved, by mi linki like google cud de cataloga spred di words, feh mek me known?
cud de bigger heads, uplift de i show me de way - or – mi - jus - born – fi – dead

english translation

i was born digital but it is not a simple matter from the time i was born, i was isolated, not considered good enough because i do not conform to industry standards but i want to be accepted in the same way you have accepted google could cataloguers please change their manner of cataloguing so i can be found, acquire legitimacy or i might just become grey literature.

funding

this research received no specific grant from any funding agency in the public, commercial or not-for-profit sector.

references

anasi sn, ibegwam a and oyediran-tidings s ( ) preservation and dissemination of women’s cultural heritage in nigerian university libraries. library review ( / ): – . available at: http://dx.doi.org/ . /lr- - - (accessed october ).
ballestro j and howze pc ( ) when a gift is not a gift: collection assessment using cost-benefit analysis. collection management ( ): – . doi: . / j v n _
bilbao-osorio b, dutta s and lanvin d (eds) ( ) the global information technology report : rewards and risks of big data. available at: www .weforum.org/docs/wef_globalinformationtechnology_report_ .pdf (accessed april ).
boxill i ( ) leveraging the creative industries for development in the caribbean. feature address in: third symposium in the alphonsus ‘arrow’ cassell memorial lecture series, montserrat, november .
copyright amendment bill ( ) available at: http://www.japarliament.gov.jm (accessed april ).
edward s ( ) interview: joanne gail johnson’s window into caribbean children’s publishing.
available at: www.summeredward.com/ / /interview-joanne-gail-johnsons-window.html (accessed february ).
erway r ( ) defining ‘born digital’: an essay. available at: www.oclc.org/research/activities/hiddencollections/bornditgital.pdf (accessed april ).
forbes c ( ) jamaican children reading: a reflection. small axe. available at: http://smallaxe.net/wordpress /discussions/ / / /jamaican-children-reading (accessed october ).
girvan n ( ) creating and recreating the caribbean. in: hall k and benn d (eds) contending with destiny: the caribbean in the st century. kingston, jamaica: ian randle, pp. – .
jones bm ( ) hidden collections, scholarly barriers: creating access to unprocessed special collections materials in north america’s research libraries. a white paper for the association of research libraries task force on special collections, june. available at: http://www.arl.org/storage/documents/publications/hidden-colls-white-paper-jun .pdf (accessed april ).
lawrence gg, kehoe wr, rieger oy, et al. ( ) risk management of digital information: a file format investigation. washington, dc: the digital library federation. available at: www.clir.org/pubs/reports/pub /contents.html (accessed february ).
lidman t ( ) our cultural heritage and its main protagonists: libraries and archives, a comparative study. liber quarterly ( / ). available at: http://liber.library.uu.nl/index.php/lq/article/view/ / (accessed march ).
mcdonald i ( ) caribbean creative achievement: preserving the record/extending the influence. in: hall k and benn d (eds) contending with destiny: the caribbean in the st century. kingston, jamaica: ian randle, pp. – .
manaf za ( ) the state of digitisation initiatives by cultural institutions in malaysia: an exploratory survey. library review ( ): – . available at: http://dx.doi.org/ . / (accessed april ).
ramchand k ( ) the lost literature of the west indies.
in: hall k and benn d (eds) contending with destiny: the caribbean in the st century. kingston, jamaica: ian randle, pp. – .
renwick c ( ) caribbean digital library initiatives in the twenty-first century: the digital library of the caribbean (dloc). alexandria ( ): – .
sancho ta ( ) the importance of our literature in caribbean secondary education. kaie : – .
schmiesing a and hollis dr ( ) the role of special collections departments in humanities undergraduate and graduate teaching: a case study. libraries and the academy ( ): – . doi: . /pla. . .
smith a ( ) valuing preservation. library trends ( ): – .
temper th ( ) challenges for the future of library and archival presentation. library resources and technical service ( ): – .
traister d ( ) is there a future for special collections? and should there be? a polemical essay. rbm: a journal of rare books, manuscripts and cultural heritage ( ): – .
unesco ( ) charter on the preservation of the digital heritage. available from: portal.unesco.org (accessed may ).
whylie m ( ) jamaican drumming styles. caribbean quarterly ( / ): – . available at: www.jstor.org/stable/ (accessed april ).
world bank ( ) jamaica overview. available at: www.worldbank.org/en/country/jamaica/overview (accessed may, ).

author biography

cherry-ann smart is a special collections librarian at the university of the west indies, mona campus main library, west indies and special collection (wi & sc) unit. prior to this appointment she worked as chief librarian at the montserrat public library and in the field of health and law. she holds a ba in history, an ma in library and information science and is currently reading for a dphil in lis. her research interests include cultural heritage informatics, public access to information, and internationalisation of higher education.
ms smart maintains membership in the mixed methods international research association (mmira), the association for information science and technology (asist), and the library association of jamaica. she publishes mainly on issues which affect caribbean libraries.

ifla journal ( )

article

digitization of indian manuscripts heritage: role of the national mission for manuscripts

jyotshna sahoo
sambalpur university, sambalpur, india

basudev mohanty
indian institute of technology, bhubaneswar, india

abstract

india has the distinction of having one of the most ancient, richest and largest collections of manuscripts in the world. these manuscripts, which are available in different forms, languages and scripts and cover a wide range of subjects, are a powerful medium for the preservation of indian cultural heritage.
but the preservation of these manuscripts is a serious problem for their custodians because of the hot and humid climate of the country. in this context the present paper gives an account of the commendable efforts rendered by the national mission for manuscripts since its inception in by establishing and strengthening manuscript resource centres, manuscript conservation centres and developing a national database of manuscripts. it also presents the current status of digitization of indian cultural heritage in the form of manuscripts, from collection to the development of a digital manuscript library for global access.

keywords

manuscripts heritage, national mission for manuscripts, manuscript resource centres, manuscript conservation centres, digitization, digital manuscript library

introduction

india has sustained a glorious tradition of preserving knowledge through oral and written communication since time immemorial. a variety of manuscripts in different forms have been in use since ancient days, ranging from clay tablets to copper plates and from leaves of trees to prepared skins of animals. a good number of manuscripts relating to art and architecture, astronomy, mathematics, purana, vyakarana, tantra, yoga, philosophy and medicine date back several hundred years and are still available for reference today. it is amazing to discover how scholars packed so much information into what they wrote on these manuscripts. as indian ancient cultural heritage is preserved in manuscripts, these are regarded as valuable sources of information for the reconstruction of the history and culture of the country.
composed in different indian languages, these manuscripts are spread all over the country in different institutions, libraries, monasteries, temples and in several private collections. the manuscripts, being organic in nature, are quite susceptible to deterioration caused by changes in climatic conditions, bio-deterioration and also by constant handling. but the advent of information and communication technologies has brought unprecedented changes in the entire process of information generation, organization, and retrieval as well as in the process of preservation. digitization, an offspring of technological innovation, has emerged as a viable tool for long-term access to the documentary heritage. digitization of manuscripts promises documentation and preservation of original texts and at the same time facilitates greater access for scholars and researchers. with this backdrop, this paper discusses the digitization of indian manuscripts, emphasizing the efforts of the national mission for manuscripts (nmm) in digitizing the manuscripts heritage and thereby developing and maintaining a knowledge base available in manuscript form for generations to come. the nmm is the first national level comprehensive initiative in the world that caters to the need of conserving manuscripts and disseminating the knowledge contained therein.

corresponding author: basudev mohanty, central library, indian institute of technology, bhubaneswar, odisha, india. email: basudev_mohanty@rediffmail.com

international federation of library associations and institutions , vol. ( ) – © the author(s) reprints and permission: sagepub.co.uk/journalspermissions.nav doi: . / ifla.sagepub.com

manuscript: the concept

the term ‘manuscript’ is derived from the latin word manuscriptus which is a combination of two words, that is manu meaning by hand and scriptus meaning written.
so etymologically manuscript means written by hand. in the classical sense the term manuscript refers to a document handwritten by an author. manuscripts are found in every part of the world where human beings put their thoughts and experience into written form. in archaeological terms a manuscript is defined as any early writing made on stone, metal, wood, clay, linen, bark, leaves of trees and prepared skins of animals. handwriting of any kind, whether on paper or any other material, as opposed to printed material, is called a manuscript (cornish, ). in general, the term manuscript refers to handwritten materials including ancient inscriptions on clay tablets and stone, medieval and renaissance manuscripts of books and codices and modern manuscripts such as literary manuscripts, historical manuscripts and personal papers. there are also no restrictions on the form of writing, whether it is phonetic, pictorial or ideographic.

according to the chambers dictionary, ‘a manuscript is a book or document written by hand before the invention of printing’. according to the encyclopedia of information and library science, a manuscript is defined as ‘a written document that is put down by hand, in contrast to being printed or reproduced by some other way’ (corea, ). the anglo-american cataloguing rules nd edition (aacr ) defines manuscripts as ‘materials of all kinds including manuscript books, dissertations, letters, speeches, legal papers and collection of such manuscripts’ (gorman and winkler, ). however, the definition given in aacr seems to be more illustrative of the types of manuscripts. for the present paper the term manuscripts includes a variety of writing media such as palm leaf, bamboo leaf, sanchipat, birch-bark, stone, wood and paper, evidence of which is found in various manuscript resource centres in india. the digitization section of this paper includes in its ambit only palm leaf manuscripts.
efforts to preserve and catalogue manuscripts in india prior to independence

manuscripts were the sole medium for the transmission of knowledge, and were the predominant writing medium before the advent of paper. as such, in classical and medieval india, the house of every teacher had a good collection of manuscripts. manuscripts were also collected by the rulers of different states, including the mughal emperors, religious institutions, monasteries (mathas) of different sects and the jain bhandaras. the credit for listing the manuscripts in india for the first time goes to a jain monk who compiled the manuscripts of patan, cambey and bharauch in the year under the title brihattipanika, which is still preserved in the shantinatha bhandara, patan. recognizing the works of kavindracharya of varanasi, who compiled the subject-wise classified catalogue of manuscripts between and , mughal emperor shahjahan conferred on him the title of ‘sarvavidyanidhana’. king tipu sultan of mysore had built up a library of oriental manuscripts in arabic, persian and hindustani languages. the manuscripts from tipu’s library were studied and catalogued by general charles stewart and the catalogue was published by cambridge university press under the title a descriptive catalogue of the oriental library of the late tipu sultan of mysore (stewart, ).

with the establishment of the east india company’s rule in india, the systematic survey of manuscripts, their collection, preservation, and cataloguing gained further momentum. the british rulers, who took upon themselves the cause of education and of patronizing indian traditional knowledge systems, directed their attention towards the indian literary heritage preserved in manuscripts. during the british regime the asiatic society in calcutta, established in , undertook the work on manuscripts collection and documentation.
several government collections gradually came into existence in calcutta, varanasi, pune and madras. the work of sir william jones, lady jones and sir charles wilkins in the cataloguing of manuscripts is also praiseworthy and was published in the philosophical transactions of the royal society of london during and respectively. another noteworthy contribution is by pandit ramagovinda tarkaratna who, under the instructions of james prinsep, compiled in a -page catalogue of manuscripts available in the holdings of the college of fort william library, the college of the asiatic society of bengal and banaras sanskrit colleges. in the th and th centuries, survey, search and cataloguing of manuscripts were carried on by both indian and european experts in various regions of the country, particularly in western, central and northern regions. the names of g buhler, f kielhorn, peter peterson, rg bhandarkar and sr bhandarkar, who listed many manuscripts between and ad, deserve special mention in the field of cataloguing of manuscripts. raja rajendralala mitra also did a great job in cataloguing manuscripts between and , and after his death his incomplete work was taken over by mm harprasad sastri and published between and in six volumes.

national agencies working for preservation of manuscripts

recognizing the richness of indian literary heritage preserved in manuscripts, the government of india has taken the initiative to strengthen a number of national level institutions that are particularly devoted to the preservation of indian manuscripts. those institutions are:

• the national archives of india (nai), new delhi
• national library of india, kolkata
• indira gandhi national centre for the arts (ignca), new delhi

nai is the repository of the non-current records of the government of india and its predecessor, where records are preserved for the use of the administration and scholars.
the aims of nai are to conserve records from all over the country; to encourage the scientific management and greater liberalization of access to archival holdings; to develop greater professionalism and a scientific temper among creators, custodians and users; and to aid in spreading a feeling of national pride in the documentary cultural heritage of india and in ensuring its preservation for posterity. nai is making earnest efforts to ensure the longevity of the documents in its custody through preventive, curative and restorative processes, for which the department set up the conservation research laboratory in . since its inception, the laboratory has been engaged in research and development work such as developing indigenous techniques for restoration and testing materials required for restoration and storage (www.nationalarchives.nic.in).

the national library of india, kolkata was established in with the passing of the imperial library act, and has the status of an institution of national importance. it is engaged in the task of acquisition and conservation of all significant production of printed material. it has a rich collection of persian, sanskrit, arabic and tamil manuscripts and also rare books. it is the recipient library under the delivery of books and newspapers (public libraries) act, and the repository library for south asia. it holds more than volumes of paper as well as palm leaf manuscripts written in different languages and scripts. the arabic and persian manuscripts bear beautiful illustrations and fine calligraphy (www.nationallinrary.gov.in). the library also undertakes the conservation and digitization of manuscripts of national importance as well as its own holdings.

ignca was established in as an autonomous institution under the ministry of culture, government of india. it is the national information system and databank in the fields of arts, humanities and cultural heritage.
the indian cultural heritage resource centre of ignca, which is known as the ‘kala nidhi’ division, is chiefly responsible for the compilation of unpublished manuscripts from indian and foreign collections and from private and public libraries. this unit has taken an initiative to bring under one roof primary sources of indian tradition that lie scattered, fragmented, inaccessible and in danger of extinction. this division of ignca has collected sizable numbers of manuscripts from the east, west, north, north-east, south and central regions of india and has begun microfilming them. to date this division has more than , film rolls of manuscripts (approx. , , folios with , digitized rolls) in its possession (www.ignca.nic.in).

national mission for manuscripts

the national mission for manuscripts (namami) is an autonomous organization under the ministry of culture, government of india. the mission was initiated in february by the ministry of tourism and culture, government of india, and the indira gandhi national centre for the arts (ignca), new delhi is the nodal agency for the execution of this project. the indian manuscripts heritage covers a variety of themes, textures, scripts, languages, calligraphies, illuminations and illustrations. together, they constitute the ‘memory’ of india’s history, heritage and thought. namami aims to locate, document, preserve and digitize indian manuscripts and make these accessible, to connect india’s past with its future and to create a national resource base for manuscripts for enhancing their access, awareness and use for educational purposes. the present study is basically confined to the activities of nmm with special reference to the digitization of indian manuscripts.
methodology of the study

the present study is primarily based on the activities of the mission with regard to the growth of manuscript resource centres, conservation centres, documentation of manuscripts, and the diverse forms and nature of indian manuscripts, with special reference to the digitization aspect of manuscripts. the required data have been collected from the annual reports of the mission accessible through the website (www.namami.org) and also available in hard copy. data related to the types of manuscripts, languages, scripts and subject areas have been collected from the respective web pages of the resource centres under nmm. keeping in view the objectives of the study, the data obtained have been transferred to tables and figures and finally analyzed to get the results.

objectives of the study

the study is primarily designed to focus on the following objectives:

• to show the growth and distribution of manuscript resource centres (mrcs) and manuscript conservation centres (mccs), along with manuscript collections, across various zones and states of india.
• to show the diverse nature of indian manuscripts available in different forms, languages, scripts and subjects.
• to focus on the national database of manuscripts and the national electronic catalogue of manuscripts.
• to make an exclusive assessment of the activities of nmm in respect of the following two aspects:
  1. digitization of manuscripts and its status
  2. development of a digital manuscripts library.

literature review

a number of studies regarding the digitization of manuscripts have been conducted in different settings, different times and for different manuscript libraries.
for this paper some significant studies in the field that focus on different aspects of manuscript digitization in the indian context have been thoroughly reviewed. kumar and shah ( ) have discussed in detail the scindia oriental research institute (sori), a pioneer manuscript library of india. some manuscripts of importance have been microfilmed by ignca at sori and it has been recognized as one of the mrcs for accessioning, cataloguing and launching of an awareness programme in madhya pradesh. kumar and shah ( ) also discussed unesco’s digitization project ‘the memory of the world’ initiated in and the manuscript digitization pilot project ‘down memory lane’ at the national library of india. majumdar ( ) has described the history of artistic heritage, the history of literary heritage and the recorded knowledge of india, observed that the past literary heritage survives in the form of manuscripts on palm leaves, cotton, silk, wood, bamboo and copper plates, and discussed the initiatives taken by the indian government in introducing the nmm towards preserving and digitizing these culturally significant works. ramana ( ) has given a brief overview of india’s largest and ancient manuscript collections, the forms and places of availability of these manuscripts. he also described some indigenous methods of preserving palm leaf manuscripts and has highlighted the important benefits of digital preservation in dissemination of information, the manuscript collections of the nli and the process of digitization of manuscripts at nli. nair ( ) has depicted the valuable recorded knowledge housed in different museums, archives, art galleries and manuscript libraries that are affiliated to kerala university and has pointed out that development of a campus-wide information system and opting for digitization of the valuable content would help their wider accessibility. maltesh et al.
( ) have discussed digitization of cultural heritage, particularly manuscripts of india and other parts of the world including the unesco project ‘memory of the world’, czech national library, national library of australia, etc. this paper also highlights the organizational role of metadata for information retrieval and access as regards manuscripts. kumar and sharma ( ) pointed out that digitization of manuscripts in the indian set up is a bigger challenge than it appears. however, in the area of manuscripts, the department of culture, goi made an ambitious plan in by constituting the national mission for manuscripts to preserve, conserve and digitize manuscripts for posterity, and the authors described how punjab university, chandigarh is utilizing nmm guidelines to digitize its multilingual holdings. devi ( ) has described the importance of the manipur manuscripts collection and the necessity to preserve the collection in digitized form for future generations. mazumdar ( ) has described the manuscript collection in assam as well as initiatives for digital preservation in assam with reference to the krishna kanta handique central library of gauhati university which has about valuable manuscripts written on sanchipat, tulapat and paper. gaur and chakraborty ( ) have asserted that the glorious past of indian culture lies in the ancient manuscripts which represent the basic historical evidence with great research value. it is estimated that india possesses more than five million manuscripts, making her the largest repository of manuscript wealth in the world. in order to preserve this knowledge resource and to make these accessible to scholars, ignca initiated the most important manuscript microfilming programme in .
gaur and chakraborty ( ) also discussed topics like the tradition of preservation and access in india, institutional efforts in the fields of preservation and access, initiatives taken by ignca and nmm and challenges of manuscript preservation in the st century. saikia and kalita ( ) have highlighted the digitization process of manuscript collections in the krishna kanta handiqui library, guwahati, assam which has copies of manuscripts on important branches of knowledge written in assamese, sanskrit, bengali, nepali and tibetan scripts. the study also describes digitizing tools like scanners, digital cameras, image-processing software, file compression and ocr software along with digital library software like gsdl, dspace and eprints as well as the workflow of digitizing manuscripts. londhe et al. ( ) have focused on the technical know-how required for digitization of manuscripts, discussed the digitization process of manuscripts adopted in the jayakar library, university of pune in india and also evaluated the digitization software used in this project. singh ( ) has depicted cultural heritage as the symbolic presence that integrates the history, traditions and culture of a country and examined the viability of preserving india’s cultural heritage resources in a digital world to make it globally accessible.

observation and analysis

setting up manuscript resource centres

the nmm works with the help of manuscript resource centres (mrcs) spread across the country. these mrcs are well-established indological institutes, museums, libraries, universities and non-government organizations and function as the mission’s coordinating agencies in their respective regions. it is observed from figure that the highest number of mrcs ( ) function under the north zone. mrcs under this zone are distributed over six states, and two further mrcs function in the national capital territory of delhi.
the south zone covers mrcs, whereas east zone covers mrcs, west zone covers mrcs and the central zone covers mrcs. the zone-wise distribution of mrcs is listed in appendix and shows the number of states included in each zone along with the number of manuscripts available in each mrc that functions under each zone. figure shows the zone-wise distribution of manuscripts and it is found that the highest number of manuscripts are available in north zone, i.e. , ( %), followed by south zone , ( %), east zone , ( %), west zone , ( %) and central zone , ( %) respectively. so it can be interpreted that both in terms of number of manuscripts and mrcs north zone is ahead of the other four zones.

figure . distribution of mrcs across various zones (north, south, east, west, central; series: no. of mrcs, number of states present).

figure . distribution of manuscript collections across various zones (north, south, east, west, central; in percent).

setting up manuscript conservation centres

the mission has identified manuscript conservation centres (mccs) across the country for the conservation of manuscripts. these mccs are the nodal centres for all preservation and conservation work relating to manuscripts and work towards fulfilling the mission’s motto ‘conserving the past for the future’. these centres provide services such as training in preservation and conservation, and workshops on preventive and curative conservation of manuscripts in different institutions and private collections. for this purpose a standard methodology comprising the positive aspects of both traditional indian practices and modern scientific methods is followed. table provides the number and percentage-wise availability of mccs as well as mrcs in the different states as well as union territories of india.
out of a total of states in india, mrcs function in states and in union territories, namely delhi and puducherry, whereas mccs are distributed over states and two mccs function in the national capital territory of delhi. it is observed that both mrcs and mccs are distributed over most of the states of india under the ambit of nmm for furthering its activities relating to manuscripts. uttar pradesh is the state in which the highest percentage ( . %) of mrcs function whereas bihar and karnataka jointly occupy the second position with . % of resource centres. similarly in the case of mccs, uttar pradesh (up) and karnataka occupy the first position with % of mccs followed by kerala with % of mccs.

table . state-wise distribution of mrcs & mccs.
columns: sl. no.; name of the states; number of mrcs; % of mrcs; number of mccs; % of mccs.
rows: andhra pradesh; arunachal pradesh; assam; bihar; chhattisgarh; delhi (nct); gujarat; haryana; himachal pradesh; jammu & kashmir; karnataka; kerala; madhya pradesh; maharashtra; manipur; odisha; puducherry (ut); punjab; rajasthan; tamil nadu; tripura; uttar pradesh; uttarakhand; west bengal; total.

form-wise distribution of manuscripts across mrcs under nmm

figure gives an idea of the various forms of manuscripts that are available in the mrcs and these are bamboo leaf, birch bark, cloth, hand-made paper, palm leaf, stone, terracotta and wood. it is observed that, out of mrcs, palm leaf manuscripts are available in the largest number of resource centres ( ), accounting for % of the total forms of manuscripts. so it is interpreted that though there were other forms of writing materials, palm leaf was the predominant one. the abundance of palm trees in different parts of the country is a possible reason why palm leaves were used more plentifully than other writing materials.

chart categories: bamboo leaf, birch bark, cloth, handmade paper, palm leaf, sanchipat, stone, terracotta, wood.
figure .
percentage-wise distribution of various forms of manuscripts across mrcs. ifla journal ( ) language-wise distribution of manuscripts across mrcs under nmm figure shows the language-wise distribution of manuscripts under nmm at various mrcs. it is observed that manuscripts are available in impor- tant languages such as arabic, bengali, bhojpuri, english, gujarati, hindi, kannada, maithili, malaya- lam, marathi, odia, pali, punjabi, persian, prakrit, rajasthani, sanskrit, tamil, telugu, tibetan, turk- ish, and urdu. out of the total manuscripts under nmm covering all the mrcs the majority of the manuscripts are available in sanskrit and hindi lan- guages, contributing to ( . %) and ( . %) respectively out of the total percentage for all the lan- guages. in languages like bhojpuri, gujarati, maithili, punjabi, rajasthani and turkish, much fewer ( . % in each language) numbers of manuscripts are seen in various mrcs. script-wise distribution of manuscripts across mrcs under nmm scripts denote the writing systems employed by lan- guages to represent the sounds which form the pho- netic base of the language. each language has its own representation for the sounds and thus has its own script, whereas some of the languages have a common script. very often it is found that manuscripts are written in one language using the script of another language; for example manuscripts are seen to be written in the odia language using devanagari script. from figure , it can be observed that manuscripts have been written using many scripts such as bengali, devanagari, english, grantha, gaudi, gujarati new- ari, odia, sharada, telugu, tamil and tibetan scripts. out of all the scripts, the percentage of manuscripts written in devnagari script are highest ( . %) in comparison to other scripts because devanagari is the common script used both for hindi and sanskrit languages. 
subject-wise distribution of manuscripts across mrcs under nmm

from the subject-wise analysis of the manuscripts (figure ) it can be observed that manuscripts were written on a variety of subjects. it indicates that authors of manuscripts had profound knowledge of different subject aspects, starting from veda/vedanta to literature and linguistics. the study of the content of the manuscripts shows that the highest percentages of mrcs cover manuscripts on dharma shastra ( . %) followed by arts ( . %), ayurveda, culture and literature ( . %), linguistics ( . %), veda ( . %), grammar and history ( . %), ecology, philosophy and mathematics ( . %), astrology, purana, vedanta and anthropology ( . %), upanishad ( . %) respectively.

figure . language-wise distribution of manuscripts across mrcs in percentage.

figure . script-wise distribution of manuscripts across mrcs in percentage.

growth of manuscript documentation under nmm

one of the significant contributions of the nmm is the detailed documentation of manuscripts in india for creating a national electronic database of manuscripts to provide scholars with a common portal for reference. for this purpose the mission receives data on manuscripts from three different sources:
• national survey followed by post-survey
• manuscript resource centres
• manuscript partner centres (mpcs) or private collections

national survey is an intensive state-wide programme with the aim to locate every manuscript in the country with a special emphasis on undocumented private collections.
in post-survey each and every repository unearthed during the national survey is revisited to document every individual manuscript contained therein. it provides an overview of the number of manuscript repositories in a district to document each manuscript in each repository, in every district, every state and eventually the country. the manuscripts are documented through the mission’s datasheet known as manus data sheet that covers detailed bibliographic information such as title, author, commentary, language, script, subject, name of repository, number of folios and other relevant details. after the collection of such information, these data are entered into the manus granthavali software at the mrcs or mpcs and finally the detailed information is sent to the mission. under this scheme it is observed that the highest number of manuscripts has been documented during the year – ( , ) and the total number of manuscripts received for documentation is , , (table ). the data processing status as on march is presented below:
• total data received in electronic format = , ,
• total data received in hard copy = ,
• total data edited = , ,
• total data released on website = , , (www.namami.org as on march )

table . year-wise growth of documentation of manuscripts.
year | data documented (no., %) | cumulative/progressive data (no., %) | growth rate, log (gr) | doubling time (dt)

figure . subject-wise distribution of manuscripts across mrcs in percentage.

the national electronic database of manuscripts is the first online catalogue of indian manuscripts, where a particular manuscript can be searched on the basis of its title, author, subject or material.
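the growth-rate and doubling-time columns of the year-wise table can be illustrated with a short sketch. this section does not spell out the formulas behind those columns, so the code below assumes the relative growth rate commonly used in bibliometric studies, r = (ln w2 − ln w1) / (t2 − t1), with doubling time dt = ln 2 / r. the function names and the cumulative counts are hypothetical placeholders, not the nmm figures.

```python
import math


def relative_growth_rate(w1, w2, years=1.0):
    """Mean relative growth rate between two cumulative counts (assumed formula)."""
    if w1 <= 0 or w2 <= 0 or years <= 0:
        raise ValueError("counts and interval must be positive")
    return (math.log(w2) - math.log(w1)) / years


def doubling_time(rgr):
    """Years needed for the cumulative count to double at a given growth rate."""
    if rgr <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / rgr


# Hypothetical cumulative totals at the end of three consecutive years.
cumulative = [100_000, 200_000, 300_000]

for previous, current in zip(cumulative, cumulative[1:]):
    rgr = relative_growth_rate(previous, current)
    print(f"{previous} -> {current}: rgr={rgr:.3f}, dt={doubling_time(rgr):.2f} years")
```

under this assumption, a falling growth rate from one year to the next shows up directly as a longer doubling time, which is how such tables are usually read.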
a particular repository can also be searched on the basis of name of district and state.

nmm and digitization

in order to create a digital resource base for manuscripts, the nmm initiated a pilot project in and developed a consistent policy for digitization. since there is a large corpus of manuscripts available in the country, nmm selects manuscripts for digitization according to the following parameters:
• manuscripts that are unique and of rare heritage value (where it is possible that, without preservation, they would be lost);
• manuscripts that deal with disciplines relating to ancient knowledge systems and belong to a relatively antique period;
• material where the users are widespread geographically and temporally;
• material where the retrieval of information is cumbersome and copies of such material cannot be supplied to the users quickly and easily.

along with the above parameters for selection of manuscripts, nmm adopts the following procedure for digitization of manuscripts.

benchmarking. benchmarking is the process undertaken at the beginning of a digitization project that attempts to set the levels used in the capture process to ensure that the most significant information is captured. the mission has developed a guideline known as ‘guidelines for digitization of manuscripts’ which covers the detailed guidelines for scanning, such as image quality, resolution, bit depth, image enhancement process, compression, output specification, etc.

naming convention and image formats for scanned images. the naming of images is an important issue that is handled by the mission. each manuscript digitized is already documented on the mission’s electronic database and the metadata information for each manuscript scanned is identified by giving it a manuscript identification number (manus id) which is generated by the mission’s manus granthavali software.
the manus id, the accession number (from the catalogue of the institute/repository where the manuscript is kept) and the place where the digitization is taking place form the basis of naming the digitized images of each manuscript page. similarly, four image formats, namely master image (tiff format), clean image (tiff format), access image (jpeg format), and thumbnail image (jpeg format), are considered for all the scanned images.

quality assurance. quality assurance refers to the series of quality control analyses of the manuscripts during the process of digitization. it is a method of verifying that every digital reproduction of a manuscript is up to the prescribed standard defined by nmm. ideally, quality assurance is performed on all master images and their derivatives with regard to size and resolution of image, file format, image mode, bit depth, tonal values, brightness, contrast, sharpness, interference, orientation, missing lines or pixels, text legibility, cropped and border areas, etc.

metadata creation. for each digitized manuscript two sets of metadata are created, namely subject metadata and technical metadata. while subject metadata are generated according to the specific manus data record using manus granthavali that covers meta-elements, technical metadata describe the features of the digital file. technical metadata are automatically generated and assigned to the image file at the time of creation, and the data elements covered are: file name, date created, date modified, equipment used, image format, width, height, colour mode, etc.

the illustration in chart depicts the sequential view of the complete digitization process maintained by nmm, starting from material selection to the retrieval of manuscripts.

manuscript digitization status of various institutions

table shows the institutions covered under nmm’s digitization project.
it is observed that nmm has taken up the digitization work of mrcs distributed over states and two mrcs of delhi – the national capital territory of india. the total number of digitized images of manuscripts that are available with the mission is , , up to march . the highest number of pages ( , , ) ( . %) of manuscripts have been digitized from allahabad sanskrit sansthan, varanasi, up, followed by odisha state museum, odisha ( . %) and bharat itihas sanshodhan mandal, pune, from the state of maharashtra ( . %) respectively.

figure shows the percentage of digitized manuscripts over the states. it is observed that the highest percentage of manuscripts has been digitized from up ( . %) followed by maharashtra ( . %) and rajasthan ( . %) respectively. the state of up occupies the first position in terms of both the number of manuscripts and the number of pages digitized, which indicates that more of the mrcs included for digitization are in up.

digital manuscripts library

for the first time in history, the mission has taken significant steps to preserve digitally and make easily available almost all literary, artistic, and scientific works in india for research, education, and also for future generations. the mission aims to set up a digital manuscripts library of india which will foster creativity and give easy access, at one place, to all ancient and medieval indian knowledge held in the form of manuscripts in this country. this digital library will also become an aggregator of all the knowledge and digital contents created by other digital library initiatives in india. very soon this library would provide a gateway to indian digital manuscripts libraries in science, arts, culture, music, traditional medicine, vedas, tantras and many more disciplines. nmm has collected hard disks containing digital images of , , pages of manuscripts as of march and more will be received in future as the work of digitization progresses.
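as an illustration of the naming convention and technical metadata described in the digitization procedure above, the following minimal sketch builds a page-image name from the manus id, the accession number and the digitization site, and fills a technical metadata record with the listed elements. the file-name pattern, the function and class names, and all sample values are assumptions made for illustration; the actual pattern prescribed by the nmm guidelines is not given in this section.

```python
from dataclasses import dataclass, asdict
from datetime import date

# The four image derivatives named in the text, mapped to file extensions.
DERIVATIVES = {"master": "tif", "clean": "tif", "access": "jpg", "thumbnail": "jpg"}


def image_file_name(manus_id, accession_no, site, page, derivative):
    """Build a page-image name from the identifiers (hypothetical pattern)."""
    if derivative not in DERIVATIVES:
        raise ValueError(f"unknown derivative: {derivative}")
    if page < 1:
        raise ValueError("page numbers start at 1")
    stem = f"{manus_id}_{accession_no}_{site}_{page:04d}_{derivative}"
    return f"{stem}.{DERIVATIVES[derivative]}"


@dataclass
class TechnicalMetadata:
    """Data elements listed in the text for the technical metadata set."""
    file_name: str
    date_created: date
    date_modified: date
    equipment_used: str
    image_format: str
    width: int
    height: int
    colour_mode: str


# All values below are invented sample data.
name = image_file_name("MANUS000123", "ACC45", "OSM", 7, "master")
record = TechnicalMetadata(
    file_name=name,
    date_created=date(2015, 3, 1),
    date_modified=date(2015, 3, 1),
    equipment_used="hypothetical overhead scanner",
    image_format="TIFF",
    width=4000,
    height=6000,
    colour_mode="RGB",
)
print(name)
print(asdict(record))
```

keeping the derivative type in the file name lets the master, clean, access and thumbnail copies of one page sort together, though a real deployment would follow whatever pattern the mission’s guidelines prescribe.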
chart – . digitization process chart.

table . manuscript digitization status of various institutions.
sl. no. | name of institution | state | manuscripts digitized (no., %) | pages digitized (no., %)
akhilbharatiya sanskrit parishad, lucknow | uttar pradesh (up)
allahabad sanskrit sansthan, varanasi | uttar pradesh (up)
allama iqbal library | jammu & kashmir
anandashramsanstha, pune | maharashtra
bhandarkar oriental research institute, pune | maharashtra
bharat itihas sanshodhan mandal, pune | maharashtra
bhogilalleherchand institute of indology, delhi | delhi
french institute of pondicherry, puducherry | tamil nadu
dr harisingh gaur university, sagar | madhya pradesh (mp)
himachal academy of arts, shimla | himachal pradesh (hp)
institute of asian studies, chennai | tamil nadu
jain manuscripts, lucknow | uttar pradesh
jamiahamdard, new delhi | delhi
krishnakanthandiqui library, guwahati | assam
kundakundajnanapith, indore | madhya pradesh (mp)
kutiyattam manuscripts | kerala
nmm collection, new delhi | delhi
oriental research library, srinagar | jammu & kashmir (j&k)
odisha state museum, bhubaneswar | odisha
rajasthan oriental research institute | rajasthan
rashtriya sanskrit sansthan, allahabad | uttar pradesh
siddha manuscripts, chennai | tamil nadu
sri pratap singh library | jammu & kashmir (j&k)
vrindavan research institute, vrindavan | uttar pradesh (up)
vvbis & is, hosairpur | punjab
total

figure . state-wise contributions in percentage (series: % of manuscripts digitized, % of pages digitized).

conclusion

in india, the national mission for manuscripts (nmm) is the national level comprehensive initiative that caters to the need of preserving the knowledge held in millions of indian manuscripts. the present study draws the following conclusions on the basis of the above observations in regard to the selected activities included for the present study:
• the national mission for manuscripts (nmm) is the first consolidated national effort devoted to the survey, documentation, preservation and digitization of manuscripts.
• the manuscript heritage of india contains the accumulated knowledge of indian culture in diverse fields of study.
• the manuscript heritage of india is unique in terms of quantity, quality, variety, language, script, subject matter and calligraphy.
• the nmm chiefly functions through mrcs and mccs and it is found that at present there are mrcs and mccs working under nmm.
• the mission has developed a national electronic database of manuscripts which is the first online catalogue of indian manuscripts that provides information on every manuscript that has been documented through the mission’s datasheets, and the catalogue covers various aspects of manuscripts such as title, commentary, language, script, subject, place of availability, number of pages, illustrations, date of writing, etc.
• the electronic data available in the nmm website stands at around , , as of march .
• digitization process, benchmarking and quality control parameters are well defined by nmm.
• the mission has successfully digitized , , pages of manuscripts from leading mrcs under nmm.
• establishing a digital library of manuscripts and linking the library with the manuscripts database for research purpose of the scholars is in progress.

appendix i: zone-wise distribution of mrcs
sl. no. | name of the resource centre | zone | state | no. of manuscripts | url
central institute of buddhist studies, leh | north | jammu & kashmir | www.cibsleh.in
directorate of state archaeology, archives and museum, srinagar | north | jammu & kashmir | http://jktourism.org
himachal academy of arts, culture and languages | north | himachal pradesh | https://coral.uchicago.edu
library of tibetan works and archives, dharamsala | north | himachal pradesh | http://www.ltwa.net
kurukshetra university, kurukshetra | north | haryana | www.kuk.ac.in
visweshvaranandabiswabandhu institute of sanskrit and indological studies | north | punjab | www.vvbisis.puchd.ac.in
uttaranchal sanskrit academy, haridwar | north | uttarakhand | http://www.euttaranchal.com
rampur raza library, rampur | north | uttar pradesh | www.razalibrary.com
sampurnanand sanskrit visvavidyalaya | north | uttar pradesh | www.ssvv.ac.in
akhilbhartiya sanskrit parishad, lucknow | north | lucknow |
h. n. b. garhwal university, paurigarhwal | north | uttaranchal | http://www.srinagargarhwal.com
vrindavan research institute, vrindavan | north | uttaranchal | www.vrindavanresearchinstitute.org
km hindi institute of hindi studies and linguistics, agra | north | uttar pradesh | www.dbrau.ac.in
bhai vir singh sahityasadan, new delhi | north | delhi | www.bvss.org
institute of tai studies and research moranhat, assam | north | assam | http://wikimapia.org
bl institute of indology, delhi | north | delhi | www.blinstitute.org
mazahar memorial museum, bahariabad, ghazipur (up) | north | uttar pradesh |
oriental research institute, sri venkateswara university, tirupati | south | andhra pradesh | www.svuniversity.in
andhra pradesh government oriental manuscripts library and research institute, hyd. | south | andhra pradesh | www.manuscriptslibrary.ap.nic.in
french institute of indology, pondicherry | south | puducherry | www.ifpindia.org
oriental research institute mysore | south | karnataka | www.uni-mysore.ac.in
department of manuscriptology, kannada university, hampi | south | karnataka | www.kannadauniversity.org
national institute of prakrit studies and research, shravanabelagola | south | karnataka | www.jainmanuscripts.nic.in
keladi museum and historical research bureau, shimoga | south | karnataka | http://www.craftrevival.org
mahabharata samshodhanapratishthanam, bangalore | south | karnataka | http://www.poornaprajna.com
thanjavur maharaja serfoj’ssaraswati mahal library, thanjavur | south | tamilnadu | www.sarasvatimahallibrary.asp
central library university of madras, chennai | south | tamilnadu | www.unom.ac.in
sri chandra sekharendrasaraswathiviswamahavidyalaya, kanchipuram | south | tamilnadu | www.kanchiuniv.ac.in
oriental research institute and manuscripts library, university of kerala | south | kerala | www.keralauniversity.ac.in
thunchan memorial trust, tirur | south | kerala | www.thunchanmemorial.org
dg centre for heritage studies, thripunithura | south | kerala | http://www.kerala.gov.in
goml & research centre university of madras library campus, chennai | south | tamilnadu | www.tnarch.gov.in
khuda bakhsh oriental public library, patna | east | bihar | www.kblibrary.bih.nic,in
kameshware singh darbhanga sanskrit university, darbhanga | east | bihar | www.ksdsu.edu.in
nava nalandamahavihara, nalanda | east | bihar | www.navanalandamahavihara.org
sri dk jain oriental research institute, arrah | east | bihar |
calcutta university manuscripts library kolkata | east | kolkata | http://www.caluniv.ac.in
odisha state museum bhubaneswar | east | odisha | www.odishamuseum.nic.in
sarasvati, bhadrak | east | odisha | http://pincode.net.in
krishna kantahandiqui library gauhati university of gauhati | east | assam | www.gauhati.ac.in
manipur state archives imphal, manipur | east | assam | http://archivesmanipur.nic.in/
bc gupta central library gurucharan college silchar, assam | east | assam | http://www.gccollege.ac.in/
tripura university, tripura | east | tripura | http://www.tripurauniv.in
patna museum, patna | east | bihar | –
culture and archiology, raipur | east | chhattisgarh | –
rajasthan oriental research institute, jodhpur, rajasthan | west | rajasthan | http://www.rori.nic.in
lalbhaidalpatbhai institute of indology, ahmedabad | west | gujarat | http://beta.ldindology.org
bhandarkar oriental research institute, pune | west | maharashtra | http://www.bori.ac.in/
kavikulagurukalidasa sanskrit university, ramket | west | maharashtra | http://www.sanskrituni.net
institute for oriental studies (shivashakti) thane | west | maharashtra | http://www.orientalthane.com
sat shrutprabhavana trust, bhavnagar | west | rajasthan | http://www.satshrut.org
anandasharmsanstha, pune | west | maharashtra | http://sanskritbhavan.blogspot.in
shivaji university kolhapur | west | maharashtra | http://www.unishivaji.ac.in/
shri dwarakadhish sanskrit academy and indological research institute | west | gujarat |
scindia oriental research institute vikram university ujjain | central | madhya pradesh | http://www.indianetzone.com
dr hs gaur university, sagar | central | madhya pradesh | http://www.dhsgsu.ac.in
kundakundajnanapitha, indore | central | madhya pradesh | www.kundakunda@sancharnet.in

declaration of conflicting interests
the author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

funding
the author(s) received no financial support for the research, authorship, and/or publication of this article.

references
corea i ( ) encyclopedia of information and library science. new delhi: akashdeep.
cornish ga ( ) the encyclopedia americana international edition. new york: americana corporation.
devi ts ( ) impact of information technology on the societal archive: a case study of manipuri manuscripts. international information and library review ( ): – .
gaur rc and chakraborty m ( ) preservation and access to indian manuscripts: a knowledge base of indian cultural heritage resources for academic libraries. in: ical – vision and roles of the future academic libraries, new delhi, india, – october , pp. – . new delhi: university of delhi.
gorman m and winkler pw (eds) ( ) anglo american cataloguing rules. nd edn. chicago, il: ala.
government of india. department of culture ( ) national mission for manuscripts. project report. new delhi: department of culture, india.
government of india. department of culture ( ) national mission for manuscripts. available at: http://www.india.gov.in/knowindia/national_mission.php (accessed december ).
indira gandhi national centre for the arts (ignca). available at: www.ignca.nic.in (accessed december ).
indira gandhi national centre for the arts ( ) national mission for manuscripts. report, iv (july–august ). available at: http://www.ignca.nic.in/nl . htm (accessed november ).
kumar m and sharma n ( ) digitization of manuscripts and rare literature: initiatives of archival cell, panjab university, chandigarh (india). in: th international caliber , punjab university, chandigarh, india, – february , pp. – .
kumar s and shah l ( ) digital preservation of manuscripts. in: nd convention planner – , manipur university, imphal, india, – november , pp. – . ahmedabad: inflibnet.
londhe nl, sanjay kd and suresh kp ( ) development of a digital library of manuscripts: a case study at the university of pune, india. program ( ): – .
majumdar s ( ) preservation and conservation of literary heritage: a case study of india. international information & library review ( ): – .
maltesh m, lakhar n and gajakose s ( ) digitization of culture. in: th convention planner – , gauhati university, india, – december , pp. – . ahmedabad: inflibnet.
mazumdar nr ( ) digital preservation of rare manuscripts in assam. in: th international caliber, pondicherry university, puducherry, india, – february , pp. – .
nair rr ( ) digitization of indigenous materials: problems and solutions in the context of kerala university. in: rajan gd (ed.) library and information studies in the digital age: prof. k. a. isaac commemoration volume. new delhi: ess publications, pp. – .
national archives of india. available at: www.nationalarchives.nic.in (accessed january ).
national library of india, kolkata. available at: www.nationallinrary.gov.in (accessed january ).
national mission for manuscripts. available at: http://www.namami.nic.in (accessed november ).
national mission for manuscripts ( ) guidelines for digitization of manuscripts. available at: http://www.namami.nic.in (accessed november ).
national mission for manuscripts ( ) report of the eleventh year – . available at: http://www.namami.nic.in (accessed november ).
ramana yv ( ) digital preservation of indian manuscripts – an overview. in: rd international caliber , cochin, india, – february , pp. – . ahmedabad: inflibnet.
saikia rr and kalita b ( ) prospects of digitizing manuscript collection in kkh library: a model. in: th international caliber – , goa university, goa, india, – march , pp. – . ahmedabad: inflibnet.
singh a ( ) digital preservation of cultural heritage resources and manuscripts: an indian government initiative. ifla journal ( ): – .
stewart c ( ) a descriptive catalogue of the oriental library of the late tipu sultan of mysore. cambridge: cambridge university press.

author biographies

jyotshna sahoo is currently a lecturer in the pg department of library and information science, sambalpur university, odisha, india and has been engaged in teaching at both masters and mphil levels since september . before that she had served as the assistant librarian of the odisha state museum for more than a decade. preservation and conservation of arts and artifacts, especially organic material, fall into her area of interest. she has authored two books and research papers published in both national and international journals. she was awarded ugc-net in library and information science in ; a junior research fellowship from the indian department of culture in ; and icssr doctoral fellowship for her phd work in . she was director of the icssr project ‘research productivity in the fields of social sciences in orissa: a bibliometric appraisal’ in and is a life member of professional bodies such as ila, iaslic.

baudev mohanty has been working as assistant librarian at the indian institute of technology (iit) bhubaneswar, odisha, india since . prior to joining iit bhubaneswar he was at infosys ltd. for years in different roles, namely assistant librarian, librarian and lead librarian. he also worked as a programmer-cum-training officer in dpep under the department of school and mass education, government of orissa. he has published more than research papers and presented papers at many seminars and conferences. he has received many accolades for his philanthropic and professional activities.

article

preserving digital heritage: at the crossroads of trust and linked open data

iryna solodovnik
fao, opcc, rome, italy

paolo budroni
university of vienna, austria

abstract
regardless of current or future technologies, accessing digitally preserved information resources will always pose challenges. there is a plethora of models, standards and best practices addressing the different facets for the preservation of digital objects. the management of digital objects requires well-defined policies and data management plans that include all processes within their specific lifecycle.
to achieve high levels of data sharing and long-term re-use of data, aparsen recommends developing an interoperable framework for persistent identifiers, paving the way for a ‘ring of trusted persistent identifiers for linked open data’. to enable semantic interoperability of such a ring, this article proposes to map lode-bd metadata with the framework’s ontology. the ring can be further enriched with lod technology stack to tackle the problem of trustworthiness of linked data lifecycle while addressing the issue of big data. to be trusted, digital libraries need to be audited and certified in compliance with the european framework for audit and certification.

keywords
digital libraries, digital preservation, european framework for audit and certification, interoperability, linked data, lode-bd, lod stack, trusted digital repositories

curators at important institutions had been making heroic efforts against the loss of shared cultural heritage (miller and ogbuji, : ).

digital preservation: context

in a special issue of the journal isq information standards quarterly ( : ) dedicated to digital preservation (dp), it was stressed that our rapidly changing digital world suffers from an over-abundance of unstructured digital information, rapid obsolescence of hardware and software, and increasingly restrictive intellectual property regimes. to ensure continued, sustainable and authentic long-term access to digital information, a vibrant international community of digital information specialists is continuously developing and implementing standards and best practices in the areas of digital curation and dp, taking into account that technological means for storage of digital information will change over time. this means that choices made early in the life of a digital project will certainly have an impact on digital posterity (holdsworth, : ).
although issues regarding dp will continue to be pressing in the digital universe, and despite dp policies that differ greatly across countries, the fundamental challenges regarding information resources’ availability over time are universal (henneken, ; unesco, ). these challenges concern the whole curation lifecycle of digital resources and are largely addressed by the central methodological problems of research in science and technology at the intersection of digital libraries. the aim of a digital library that may host a number of digital repositories is to facilitate communication between libraries, museums and archives at a cross-cultural level, in order for these institutions to work together to a greater extent, making their digital collections and objects available on the web for a large audience through one central (multifocal) access point (ifla, ).

corresponding authors:
iryna solodovnik, fao of the un, opcc, editorial services, rome, italy. email: iryna.solodovnik@gmail.com
paolo budroni, phaidra department, university of vienna, austria. email: paolo.budroni@univie.ac.at

international federation of library associations and institutions , vol. ( ) –
© the author(s)
reprints and permission: sagepub.co.uk/journalspermissions.nav
doi: . /
ifla.sagepub.com

pursuing preservation research

to forward the national preservation research agenda, the library of congress, in consultation with leading scientific laboratories, has developed a matrix of preservation science projects undertaken by libraries, archives and museums worldwide, illustrating the wide spectrum of preservation research from scientific and forensic studies to the development of preservation treatment (library of congress. national preservation research agenda).
the qualitative shift from good research to good practice requires cutting-edge strategies, as in the implementation of methods underpinning the storage of digital memories comprising both short-term and long-term preservation of digital objects (dos) (coar, ; cornell university). preservation of dos in the long term is not limited to storage and backup; rather, it involves multifaceted strategies aimed at providing a trusted environment (covering authenticity, integrity, long-term access and security issues) where dos can evolve along with changes in technology, hardware and software (interpares trust; w3c, ). long-term dp, together with the principle of open access to research data (and their metadata), offers broad opportunities for the scientific community. in particular, more and more universities and research centres are starting to build research data repositories allowing permanent and open access to data sets in a trustworthy environment (swan et al., ; zenodo, : ). in this context it should be underlined that only a few recognize the importance of preserving the so-called negative results, or the inconclusive results deriving from the processing of raw data; usually only positive results are preserved and remain accessible over the long term.

the digital curation centre (dcc) places dp at the centre of digital curation activities (maintaining, preserving and adding value to digital data throughout its lifecycle). these activities are of vital importance to ensure quality access management and content re-usability by means of well-established digital curation workflow models (e.g. taverna) and tools (weidner and alemneh, ) supporting the complex set of actions needed to maintain the authenticity, reliability, usability and integrity – measured in terms of content, fixity, reference, provenance, and context – of dos in a long-term perspective. the do is the heart of what dp management is all about.
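as an illustration of the fixity property mentioned above, the following minimal python sketch records a sha-256 digest for the bitstream of a do and later checks that the bitstream is unchanged. the function names and the sample payload are hypothetical and are not taken from any of the projects cited here; it is only one possible way to express a fixity check.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a bitstream as a hex string."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the digest recorded at ingest."""
    return sha256_of(data) == recorded_digest


# Hypothetical payload standing in for the content file of a digital object.
original = b"digital object payload"
recorded = sha256_of(original)  # stored alongside the object at ingest time

print(fixity_ok(original, recorded))         # True: bitstream unchanged
print(fixity_ok(original + b"!", recorded))  # False: bitstream altered
```

in practice the recorded digest would be kept with the preservation metadata of the do, so that the check can be repeated after every migration or transfer.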
the premis (preservation metadata implementation strategies) data dictionary defines a do as ‘a discrete unit of information in digital form. a do can be a representation, file, bitstream, or filestream’ (library of congress, : ). the annual international conference ipres, dedicated to different aspects of dp, endorses dos under the aegis of articles, datasets, images, stream of data (ipres, : ). the california digital library glossary identifies a do as an entity with one or more content files united (physically and/or logically, through the use of a digital wrapper) with their corresponding metadata, while the glossary of archival and records terminology refers to a do as an information resource ‘that has been digitally encoded and integrated with metadata to support discovery, use, and storage of those objects’ (society of american archivists, ). metadata are essential elements for managing, accessing, reusing, retrieving and preserving huge amounts of information resources (liber, ). together with metadata, certain significant properties (inspect project) of dos need to be preserved so that the dos are deemed authentic over time.

digital preservation: between quality, sustainability and planning

according to the widely-accepted iso definition, quality is ‘the totality of features and characteristics of a product or service that bear on its ability to satisfy a given need’ (iso ). an important prerequisite for every sustainable dp management system is to continuously assure compliance with the specific quality requirements (technical and non-technical) adopted by its outsourcing and/or hosting organization (american society for quality (asq)).
the iso ( ) system and software quality model – often adopted by organizations to set up dp plans based on classificatory decision criteria for technical requirements (hamm and becker, ) – defines a hierarchy of quality attributes by combining characteristics related to the outcome of interaction of the software product (quality in use) and those related to static properties of software and dynamic properties of the computer system (product quality).

ifla journal ( )

quality and sustainability of dp management systems are terms very much ‘in vogue’ (doorn, ; hey, ). the recent document coar roadmap for future directions for repository interoperability includes the concept ‘sustainability’ among six topics regarding interoperability and groups it with goals such as:
• improving platform stability;
• supporting long-term preservation and archiving;
• exposing persistent identifiers;
• integrating different persistent identifiers (coar, : ).

to provide sustainability of activities and workflows in dp services (aparsen, a), a structured, systematic process – based on well-defined strategies (e.g. digital preservation strategy of the british library), interoperable policies (innocenti et al., ) (e.g. digital preservation policy of the national library of australia; portico trust archive preservation policies) and comprehensive data and process management plans (dmps) – is essential (budroni et al., ; icpsr; rda). a core set of controlled vocabulary elements can be ‘instantiated to connect preservation planning, preservation watch, and experimentation with preservation policies’ (kulovits et al., ; plato). a dmp should be an integral part of every project involving data management, and it should formalize in detail all technical and non-technical elements – including processes (e.g. workflows performing complex operations involving identification, migration and conversion tools, as well as the comprehension of visualization issues) and context – accompanying a do’s lifecycle in conjunction with a repository environment.

the dmp section devoted to ‘preservation’ should comprise and relate all necessary features and requirements, clarifying issues on:
• technical registries, i.e. information about file formats: tiff, pdf/a, alto, tei, bwf, aiff, mxf, avi, etc. (california digital library, ; preforma) and their conversion (holdsworth, ); software products to access the information; migration paths and platforms; persistent identifiers (pis) unambiguously locating and accessing dos;
• digital rights and access management (drm) within the context of long-term dp, as well as the related risks and challenges arising in connection with long-term dp, ongoing accessibility of drm-protected objects, and the safeguarding of associated rights (aparsen, b);
• standards concerning preservation and workflows for collecting actionable representation metadata and administrative metadata (which can overlap with technical and preservation metadata) (digital preservation coalition, );
• costs in time and effort.

the overall structure of a dmp, considering also processes for data curation – as stressed by rauber ( ) – should:
• demonstrate that resources and systems will enable the data to be curated effectively beyond the lifetime (dcc, );
• describe all contingent processes, their implementation and the data used and produced by processes;
• provide preservation history (long-term storage and funding);
• highlight conditions for sharing, reuse, verification, legal aspects;
• demonstrate monitoring and external dependencies;
• be machine-readable and machine-actionable to automate (most of) the activity in creating and maintaining that dmp.
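the last requirement in the list above, a machine-readable and machine-actionable dmp, can be illustrated with a minimal python sketch. the class name, the section names and the sample values are assumptions made for illustration only; they loosely follow the preservation topics listed above and do not reproduce any published dmp schema.

```python
from dataclasses import dataclass, field

# Hypothetical section names, loosely derived from the preservation
# topics listed in the text (formats, PIs, rights, standards, costs).
REQUIRED_SECTIONS = (
    "file_formats",
    "persistent_identifiers",
    "rights",
    "metadata_standards",
    "costs",
)


@dataclass
class DataManagementPlan:
    """A DMP held as structured data so that software can check it."""
    project: str
    sections: dict = field(default_factory=dict)

    def missing_sections(self) -> list:
        """Return required sections that are absent or left empty."""
        return [name for name in REQUIRED_SECTIONS
                if not self.sections.get(name)]


plan = DataManagementPlan(
    project="example project",
    sections={
        "file_formats": ["tiff", "pdf/a"],
        "persistent_identifiers": ["handle"],
        "rights": "open access",
    },
)

# The plan can now be validated automatically instead of by reading it.
print(plan.missing_sections())  # ['metadata_standards', 'costs']
```

holding the plan as structured data is what allows a repository to check completeness on every update, rather than relying on a manual review of a text document.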
the main building blocks of process management plans (pmps) are:
• metadata frameworks;
• preservation plans;
• process context models;
• preservation actions;
• approaches for validation documentation;
• policies.

these building blocks should be carefully analyzed, and the elements establishing the context of interoperable process activities should be mined and described in a context model, combining features with ground truth into a specific file format. pmps extending dmps should automatically enable the following processes: capture processes, workflows and their dependencies; verify correctness of re-execution and re-use of data and workflows; identify subsets of data in large and dynamic databases; assign pis to time-stamped queries; capture all elements of a research process; cite data, etc. needless to say, the development of a dmp requires a certain degree of cooperation between a number of agents responsible for a wide range of digital (data) curation phases (dcc, ; ganguly, ; ifla, ; tammaro and casarosa, ; uc curation center).

solodovnik and budroni: preserving digital heritage

framing digital objects’ preservation: models and initiatives

preservation metadata have been identified as essential for the long-term management of dos. the core of dp metadata is premis, which specifies the semantic units/classes (intellectual entities, objects, rights, events, agents) designed to support the long-term accessibility of a do by providing information about its content, technical attributes, dependencies, management, designated communities and change history. premis interoperable units convey detailed and complex information about digital content through administrative metadata, technical metadata and specification of structural relationships relevant for preservation functions (isq information standards quarterly, ).
for institutions getting started with dp, the metadata standards able to support the quality and sustainability of dos over the long term can be overwhelming. making smart choices about what constitutes ‘good enough’ can enable repository managers to move forward more quickly. michael day published a short paper in ariadne on the implications of metadata for dp from the point of view of responsibilities. the author addressed five important issues which still represent challenges today:
• who will define what preservation metadata are needed?
• who will decide what needs to be preserved?
• who will curate the preserved information?
• who will create the metadata?
• who will pay for it? (white, )

dp clearly depends on both machines and humans, and there should be a common framework regulating responsibilities and interactions between humans and systems and accepting the responsibility to preserve information and make it available for a designated community. such a common framework is presented by the widely-endorsed open archival information system (oais) model (lavoie, ), published as standard is ( ), which provides a first-rate overview of the role of preservation metadata in the management of digital resources over time and contains a set of preservation policies. when it comes to the long-term perspective of the digital library project (ifla, ), a strategy for long-term digital preservation (ltdp) is required, and oais provides a well-suited reference model for this context. the oais vision has been specifically tailored for the purposes of lifecycle management in rome at the sapienza digital library, particularly for building a consistent set of data, covering all information needs, required by the different oais functional scenarios: ingestion (submission ip), archiving (archival ip) and access (dissemination ip).
the digital library and dp services should be based on data conveyed by the aforementioned ips and enriched by a number of components supporting the management of the information infrastructure (catarci et al., ).

over the last decade, great practical progress has been achieved in support of dos’ expressivity and long-term sustainability. in particular, a series of methodologies, models and implementation guidelines have been developed by a number of projects (e.g. aparsen, preforma, scape, scidip-es, timbus, wf ever, keep, dp lib, prestoprime, persid, chronopolis, parseinsight, preserv, shaman, spar, planets, caspar), every one of which has come up with a number of personalized (strongly community-driven and ‘by design’) frameworks, tools and systems to solve distinct problems in the dp domain, accelerating long-wave preservation trends with cross-disciplinary strategies. moreover, ‘an essential step in the data preservation process is to convince people to invest time and effort in depositing their data in repositories specifically designated for data preservation’ (henneken, : , ), like phaidra (phaidra, the ten commandments for policy), dans, dataverse network, zenodo, etc.

several important preservation issues addressing the maintenance and preservation of cultural heritage (ch) resources in the long term, as well as their persistent accessibility to the global community, have been the focus of the europeana digital library. europeana is at the core of the european commission recommendation on the digitization and online accessibility of cultural material and digital preservation (european commission, ), which has challenged member states to develop solid plans and build partnerships to place all public domain masterpieces in europeana by and, by , all of europe’s cultural heritage.
the recommendation also invites all interested stakeholders to adapt national legislation and strategies to ensure the long-term dp of more in-copyright and out-of-commerce, i.e. open data, ch material online, conveyed in non-proprietary (open) formats, as proprietary ones make preservation risky. to support collaborative creative endeavours in sharing, re-use and enrichment of ch data by adding new value, europeana cloud (europeana professional) will change the way that data (content and metadata) are sent to and stored in europeana, and will give researchers new tools to support their engagement in a trusted, efficient cloud-based infrastructure forging connections with new communities exploiting potential synergies.

the ongoing european project pericles (promoting and enhancing reuse of information throughout the content lifecycle taking account of evolving semantics) – besides addressing a number of challenges in ensuring that dos remain accessible in a digital environment subject to continuous technological change – stresses that changes in semantics (i.e. semantic obsolescence), academic or professional practice, or society itself can also influence the attitudes and interests of the various stakeholders that interact with digital content (pericles). among the range of conceptual and formal models, tools, policies and architectural approaches developed to support a range of preservation requirements and to be used independently in different environments, it is worth citing the pet modular toolkit for extracting significant environment information (sei) and the linked resource model (pericles), based on linked data (ld) principles, for representing dynamic preservation ecosystems. it is already well known that:

linked data provides a global environment for describing the objects and their significant properties. this environment reduces duplication of effort when describing resources and their attributes, and fosters the creation of a global information graph encompassing all the information needed to perform complex queries and actions. (w3c, )

another project worth citing in the context of ld and preservation of ch is the linked heritage project, which a few years ago published a document entitled state of the art report on persistent identifier standards and management tools, stressing the importance of creating digital identifiers (uniform resource identifiers/uris) which are reasonably persistent (pis), such as doi names (adns, ; doi, ; linked heritage, ). the document addresses the following issues: ch institution requirements for pis; pi service requirements for pis; pi policy; ld and pis.

to tackle the key issues affecting the preservation and long-term accessibility of digital ch, unesco organized an international conference entitled ‘the memory of the world in the digital age: digitization and preservation’ and published the vancouver declaration, including a number of the main recommendations on trusted dp frameworks and practices for collaborative management and preservation (dch-rp project; unesco/ubc, ). to support dp activities on a regular basis among different stakeholders, a two-year coordination action ‘digital cultural heritage roadmap for preservation’ (dch-rp) was launched by the european commission ( ). this initiative presented an action framework appropriate for advancing outstanding case studies, practices and efforts in facilitating, promoting, advocating, raising awareness of and disseminating harmonized data storage and preservation policies developed by different communities (ch organizations and e-infrastructure providers) aiming to improve access to information and ch resources.
the main outcome of this action is a dch-rp roadmap supporting the implementation of federated, interoperable and collaborative dp e-infrastructures, supported by common standards, practical tools, approaches and business models for decision makers. the dch-rp roadmap makes it possible for each ch entity to define its own practical action plan with a realistic timeframe for the implementation of its stages. the roadmap also provides practical steps to design a trust model appropriate for use in collaborative e-infrastructures, including recommendations for user authentication and access control system(s). above all, collaboration with a diverse set of stakeholders means that libraries can stake their place in the common vision for dp, thus ensuring that the issues surrounding the preservation of digital ch are represented in this vision (reilly, ).

a common thread among all projects and initiatives focusing their efforts on dp activities is that they highlight the need to contribute qualitatively to the lifecycle of interoperable dos in a trusted digital environment (one complying with specific requirements of quality and sustainability). so what are the facets of such an environment, and is there any common practical framework to assess its quality and sustainability? the next sections present some issues of interoperability and trust that can be replicated in any do management environment. the initiatives presented below that address these issues are the already cited aparsen, lode-bd, lod and iso ( ), which tackle a range of topics focused on persistent interoperability and trust of do management systems.
interoperability framework for persistent identifiers systems enhanced by lode-bd and lod

one of the main goals of the european aparsen project was to combine and integrate european dp efforts into a shared enterprise and thus to build a long-lived virtual centre of excellence (vcoe) to share a common vision (the aparsen roadmap underpinned by the revised oais reference model) of expertise, tools and resources for dp clustered in two hierarchical groups (research silos and integrated topics) with common agreement on terminology, evidence standards, dp services, access and re-use of data holdings over the whole life-cycle. the common aparsen topics ‘access’, ‘usability’, ‘sustainability’ and ‘trust’ are permeated by issues such as interoperability in connection with pis.

the concept of ‘interoperability’ promoted by aparsen ( b) is conceived in terms of a common way to access data in the same format even if these data belong to heterogeneous pi domains. considering that different identification schemes will never speak with each other (e.g. doi does not speak with nbn), aparsen provides a persistent identifiers (pis) interoperability framework (if), commonly known as ‘if for pi systems’ (aparsen, a), underpinning interoperability, persistent access, reuse and exchange of information through the use of existing pis and associated objects across different systems, locations and services. the basic idea of if for pi systems is that a common conceptual representation is the main condition for designing added-value interoperability services, which can exploit the value of a scheme of representation agreed and shared across trusted systems in order to facilitate exchange, re-use and integration of dos identified in these systems by different pis.
different repositories, for example phaidra (permanent hosting, archiving and indexing of digital resources and assets) and the repositories working within the frame of the phaidra.org network, provide their own identifier system, which is applied to the objects generated through the repository. moreover, phaidra objects will in the future be assigned more than one pi, namely handle and urn, according to the needs of the owners of the objects.

with the increasing exchange of metadata, different identifier systems will clash in a repository environment. any type of additional pi (e.g. pmid/pmcid, issn, do) is useful to fetch more contextual information (coar, : ).

in compliance with the linked content coalition ( ) framework, any unique pi should be resolvable to a single object such as a web page or file, or to both object and metadata, or to multiple objects, such as different formats of the same object or different content types, through the same pi (multiple resolution). resolution is the key mechanism enabling a system to locate and access the identified object, or information related to it, on the web.

no digital system can be functional and interoperable without metadata and explicit linkages between metadata and the resources identified by pis (e.g. the relation existing between a resource and the collection of which it is part). a common conceptual representation of metadata in different services represents an added value that can speed up the implementation of their interoperability. in this respect, the if is mapped to the incoming dublin core and marc information on the aparsen entities through the flexible frbroo ontology (bekiari et al., ) bridging entities representing library and museum ch resources.
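the multiple resolution mechanism described above can be sketched in a few lines of python. the class, the identifier and the locations are hypothetical and serve only to show how one pi may resolve to several targets (metadata, or different formats of the same object); this is not the resolver of any real pi system.

```python
from collections import defaultdict


class PIResolver:
    """Toy resolver: one persistent identifier, many typed targets."""

    def __init__(self):
        self._targets = defaultdict(list)

    def register(self, pi: str, kind: str, location: str) -> None:
        """Attach a target (e.g. metadata, pdf, tiff) to a PI."""
        self._targets[pi].append((kind, location))

    def resolve(self, pi: str, kind: str = None) -> list:
        """Return all locations for a PI, or only those of one kind."""
        targets = self._targets.get(pi, [])
        if kind is None:
            return [location for _, location in targets]
        return [location for k, location in targets if k == kind]


# Hypothetical identifier and locations, for illustration only.
resolver = PIResolver()
pi = "hdl:0000/example-1"
resolver.register(pi, "metadata", "https://repository.example.org/o/1/metadata")
resolver.register(pi, "pdf", "https://repository.example.org/o/1/file.pdf")
resolver.register(pi, "tiff", "https://repository.example.org/o/1/file.tiff")

print(resolver.resolve(pi))         # all three locations
print(resolver.resolve(pi, "pdf"))  # only the pdf rendition
```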
metadata normalization could be accomplished on top of the nine metadata groups of common properties recommended by lode-bd (linked open data enabled bibliographical data) (subirats and zeng, ), which are: title information; responsible body; physical characteristics; location; subject; description of content; intellectual property; usage; relation. these nine clusters are consistent in both types of entities and relationships between entities in the treatment of the work, expression, and manifestation concepts used in frbr (functional requirements for bibliographic records) (ifla, ). being mapped to dc (simple and qualified) and to other metadata and schemes also designed to support bibliographical data on the web, lode-bd metadata can be seen as a one-size-fits-all approach for encoding meaningful lod-ready bibliographical data, concentrated on the data rather than on the scheme.

as a reference tool, lode-bd provides assistance on how to make decisions on metadata modelling (in both depth and detail), encoding and implementation (with better response to specific needs via design-time/run-time strategies) (subirats and zeng, ) by providing all necessary paths on how to create meaningful and comprehensive (both to humans and web engines) bibliographic data and to share (subirats et al., ) them among different systems and with the lod universe (an unbound, global data space containing more than billion triples) (aloe; getty).

content/data providers aiming to communicate and to discover knowledge via a common ‘if for pi systems’ can directly create rdf triples using lode-bd metadata properties encoded with non-literal (uri) data values of lod-ready schemes. in this way, content providers will be aligned on the backbone of a common conceptual representation of data.
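to make the idea of triples with non-literal (uri) values concrete, the following python sketch serializes a few statements in the n-triples syntax. the choice of dublin core terms as properties is an assumption made for illustration, and the record, person and concept uris are hypothetical; the sketch does not claim to reproduce the lode-bd recommendations themselves.

```python
# Namespace of the Dublin Core terms vocabulary.
DCT = "http://purl.org/dc/terms/"


def n_triple(subject: str, predicate: str, obj: str, obj_is_uri: bool) -> str:
    """Serialize one statement as an N-Triples line."""
    if obj_is_uri:
        o = f"<{obj}>"
    else:
        # Escape backslashes and double quotes inside plain literals.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        o = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {o} ."


# Hypothetical record; creator and subject use URIs rather than strings.
record = "https://repository.example.org/object/1"
statements = [
    (record, DCT + "title", "preserving digital heritage", False),
    (record, DCT + "creator", "https://authority.example.org/person/42", True),
    (record, DCT + "subject",
     "https://vocabulary.example.org/concept/digital-preservation", True),
]

for s, p, o, is_uri in statements:
    print(n_triple(s, p, o, is_uri))
```

using a uri for the creator and the subject, instead of a name string, is what lets an aggregator link the record to the same person or concept described elsewhere.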
in the ideal draft scenario, these data should be aggregated by a central ‘if for pi systems’ (service provider) – exploiting powerful crosswalks and an ontology including lode-bd metadata – with no delays, failures, errors, omissions or loss of transmitted information. ‘since publishing as lod in any case means interlinking the data with external sources by means of typed relations, it would foster the topic of data interoperability’ (coar, : ). after the metadata normalization in the aparsen if comes the stage of co-reference generation among resources, through a relation indicating that two uris refer to the same entity (i.e. digital objects/authors/institutions have the same ‘identity’). the programming of a technical infrastructure based on the aparsen if should foresee all standardized relationships between the identified entities, their pis, the corresponding resolution services and related information (metadata). finally, a common interoperability layer – where meaningful information from independent systems is integrated, re-used and exploited to enable added-value interoperability services (aparsen, ) – can be created.

the aparsen if for pi systems stresses the importance of registering alternative identifiers for the same entity, because this guarantees multiple ways to access the resource and related information, making the resolution process really persistent.

the first prototype of the aparsen if for pi systems demonstrator was presented at the workshop ‘interoperability of persistent identifiers systems – learning how to bring them together’ (aparsen, b). this demonstrator aggregated some metadata provided by several aparsen partners on a single machine implementing the if (frbroo) ontology in an rdf triple store and exposing these metadata through a sparql endpoint.
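the co-reference generation step described above can be illustrated with a small python sketch that groups identifiers declared to denote the same entity. the union-find approach and the sample identifiers are assumptions made for illustration; the sketch does not describe how the aparsen demonstrator was actually implemented.

```python
def coreference_groups(same_as_pairs):
    """Group identifiers that are linked, directly or transitively,
    by 'refers to the same entity' statements (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for identifier in list(parent):
        groups.setdefault(find(identifier), set()).add(identifier)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical identifiers from three different PI domains.
pairs = [
    ("doi:example-a", "urn:nbn:example-b"),
    ("urn:nbn:example-b", "hdl:example-c"),
    ("doi:example-x", "hdl:example-y"),
]

for group in coreference_groups(pairs):
    print(group)
```

each resulting group collects the alternative identifiers of one entity, so that a lookup on any of them can return the information registered under the others.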
the prototype exposes co-references among related entities in the knowledge base using information provided by content providers. ‘if the if is widely implemented it can become a reference model for any future development for pi systems and it could create a ‘‘ring of trusted pi for linked open data (lod)’’’ (aparsen, b: ).

extending the challenges of the preserving linked data project, the first diahron workshop (hosted by eswc ), entitled ‘managing the evolution and preservation of the data web’ (diahron workshop, ), stressed that it is of particular relevance for different stakeholders to raise awareness of how openly available ld sets could be used to achieve their full potential. a traditional view of digitally preserving ld sets by pickling them and locking them away for future use, like groceries, would conflict with their evolution. to provide some solutions to this problem, the european lod project proposed the lod approach (i.e. lod stack) to plan and manage the full life-cycle of ld. in particular, the lod project was launched to deal with the following issues:
• how to improve coherence and quality of data published on the web?
• how to close the performance gap between relational and rdf data management?
• how to establish trust on the ld web and generally lower the entrance barrier for data publishers and users?

these questions have been answered by providing:
1. tools and methodologies for exposing and managing very large amounts of structured information (big data) on the data web (h project; oai workshop, ; or , );
2. a testbed and bootstrap network of high-quality multi-domain, multi-lingual ontologies from sources such as wikipedia and openstreetmap;
3. algorithms for automatically interlinking and fusing data from the web;
4. standards and methods for consistently tracking trust and trustworthiness of information as well as for assessing its quality (gladney, ; hartig, ; semantic web company, );
5. adaptive tools for searching, browsing, and authoring of ld.

the lod stack provides a series of mechanisms to manage the full life-cycle of ld, by tackling:
• the synchronisation problem (i.e. how to monitor changes);
• the curation problem (i.e. how to repair data imperfections);
• the appraisal problem (i.e. how to assess the quality of a dataset);
• the citation problem (i.e. how to cite a particular version of a linked dataset);
• the archiving problem (i.e. how to retrieve the most recent or a particular version of a dataset);
• the sustainability problem (i.e. how to spread preservation ensuring long-term access).

the lod stack is a valuable tool to support creators and publishers of ld and is a likely candidate to be integrated in the ‘ring of trusted pi for lod’. in particular, engaging lode-bd and the lod stack in the if for pi systems will empower its interoperability, create all necessary conditions for ‘creating knowledge out of interlinked data’ (auer et al., ) and enhance ‘proof and trust’ (jaques et al., ).

the next section will introduce the reader to the concept of trust, on which aparsen and the trusted digital repository framework have focused their main endeavours.

vision of trust

so how can ch organizations collaborate to address unique practices and challenges worldwide related to dp and to the management of trusted systems, aiming at ensuring persistent access to digital resources worldwide?

one of the ways is to be engaged with the dp community as a whole. the previously mentioned aparsen network, by extending its virtual centres of excellence (centro di eccellenza italiano sulla conservazione digitale), invites different stakeholders to take part in its network, contributing to and sharing a common dp vision.
collaboration with a diverse set of practitioners (public and private), exchanging their experience and expertise, means that the ch sector can gain its place in the common cross-referenced vision for dp, ensuring that the issues surrounding the preservation and management of digital ch are represented in this common vision too.

in recent years, there have been multiple efforts to assess repositories with the objective of making their practices and procedures transparent, while assuring that their valuable digital assets are protected.

a few years ago, aparsen presented a unified european vision of trust in dp (aparsen, c), in particular when it comes to unfamiliar digitally encoded information, especially when it has passed through several hands over a long period of time. the report collected, evaluated and provided key answers to the following issues:
• has the digitally encoded information been preserved properly?
• is it of high quality?
• has it been changed in some way?
• does the pointer or link take the user to the right object?

the unified vision of trust refers to three levels for the evaluation of trusted digital repositories (tdr). these levels constitute the tdr framework and are recognized as the ‘european framework for audit and certification of digital repositories’ underpinned by a memorandum of understanding (mou) (trusteddigitalrepositories.eu, ). the relevance of the tdr framework is also stressed by the dcc in the context of lifecycle planning for successful dc.

the integrated multilevel framework for evaluation of a tdr assembles:
1. the data seal of approval (dsa) assessment initiative;
2. standard din ( ) – information and documentation. criteria for trusted digital repositories;
3. standard for tdr – iso ( ) – space data and information transfer systems – audit and certification of trustworthy digital repositories.

by implementing this framework, the digital world may become more reliable.
the ‘audit and certification of digital repositories are fundamental in guaranteeing the trustworthiness of research infrastructures as a whole’ (dillo, : ). the first (basic certification) level, an entry point for the self-assessment of repository quality and sustainability, requires a few days’ effort from the repositories. the last two (extended and formal certification) levels present auditing standards for tdr and require several person-months, both to collect much more detailed information than the dsa and to take part in the audits that assess the trustworthiness of digital repositories. certification is also ‘not a one-time accomplishment that you achieve and then forget’ (dillo, : ).
the definition of a tdr starts with a mission to provide reliable, long-term access to managed digital resources for its designated community or communities, via an articulated framework of attributes (administrative responsibility, organizational viability, financial sustainability, and procedural accountability) and responsibilities for trusted, reliable, sustainable digital infrastructures capable of handling the plethora of materials held by large and small ch and research institutions. the nestor working group defines a trusted, long-term digital repository as a complex and interrelated system. in determining trustworthiness, one should look at the quality of the entire digital infrastructure ‘in which the digital information is managed, including the organization running the repository’ (trac, : , , ).
the dsa sets forth guidelines related to trustworthy data management and stewardship (data seal of approval, ). digital repositories awarded the dsa include: icpsr; the archaeology data service (united kingdom); the dans electronic archiving system (netherlands); the platform for archiving cines (france); the language archive of the max planck institute for psycholinguistics (netherlands); and the uk data archive (icpsr. trusted digital repositories).
the standard din consists of requirements structured in three parts: (1) organization; (2) management of intellectual entities and their representations; (3) infrastructure and security. it includes appendices with examples of digital repositories and best practices for each requirement. the iso, based upon the trusted digital repositories and audit checklist (trac) tracing the story (‘led to’, ‘developed into’, ‘adopted as’, ‘informed’, ‘referenced by’) (wikipedia) of all digital repository standards, can be used as a basis for formal certification and assessment of digital repositories. trac describes the metrics of an oais-compliant digital repository, developed from work done by the oclc/rlg programs and the national archives and records administration (nara) task force initiative (giaretta, ). the center for research libraries certification advisory panel (center for research libraries) ensures that the certification process addresses the interests of different stakeholders, including managers in collection development, preservation and library information technology.
the following high-quality aspects are provided by both (1) the tdr framework (organizational infrastructure; do management and infrastructure; security risk management, etc.) and (2) the virtual centres of excellence constituted by aparsen:
• repository policies compliant with tdr criteria can be defined (e.g. comparison of trac checklist and pledge policy list);
• preservation prototypes, as well as a portfolio of models, services and tools for innovative support of lifecycle management, monitoring risks and opportunities connected with dp components and quality measures, can be developed;
• preservation ecosystems (shifting from a collaborative approach towards distributed dp to open scalable preservation ecosystems) can be achieved (kulovits et al., ; skinner and halbert, ); and
• a broader take-up of the dp projects’ results can be encouraged by providing guidance that others can use in their own preservation efforts when determining their own institutional dp needs, including interactive ‘on-the-spot’ research on current dp trends.
final thoughts and outlook
the push for the long-term dp of valuable information resources is both a challenge (ensuring that it is carried out in the most cost-effective and efficient manner, methodologically and in implementation) and an opportunity for different stakeholders, including ch organizations. the accurate selection and application of models and technologies promoted by a wide range of initiatives and projects, as well as replication of core elements of best practices underpinning the many facets of dp, will support persistent access to content and its interoperability in the long term, paving a stable way for re-use of data for research and innovation.
the aparsen network of excellence in dp has launched the long-life collaborative virtual centres of excellence, where different stakeholders can interact, sharing their models and practices and developing a common vision for dp.
by means of the aparsen interoperability framework for persistent identifier systems, empowered by lode-bd and lod stack, semantically enhanced content can be pushed in an interoperable and trustworthy manner out of its dc ecosystem into the lod universe, facilitating communities’ participation through data and knowledge re-use, re-distribution and sharing on the frontline of linked data.
trust and trustworthiness of dp notably affect the quality and sustainability of dc, which focuses its main efforts on the creation of long-life value-added services, where users can undertake innovative exploration and analysis of digital contents over a long span of time (aparsen, a). digital repositories that comply with organizational policies and procedures, focus clearly on preservation goals and are assessed according to the european framework for audit and certification of digital repositories are trusted and trustworthy, and thus sustain opportunities for long-term data sharing.
to empower collaborative endeavours of trusted dp communities, a set of interrelated technical and non-technical requirements, objectives and components for preservation quality should be programmed in human-machine friendly, scalable pmps that dynamically connect cross-referenced elements (on request and with respect to updates) and retrieve answers to queries, helping to monitor and assess different preservation contexts with the goal of developing shared solutions for the optimization of dp services.
to enhance community-driven dp activities supported by virtual centres of excellence, dp services should collaboratively focus their efforts on extending already existing ‘friendly human-machine’ controlled vocabulary elements for preservation quality, enabling interoperability among the building blocks of the preservation ecosystem (kulovits et al., ). the semantics of such vocabularies should be optimized for rdf-aware environments, aligned and automatically updatable on the frontline of linked open data (haag, ) and big data, thus contributing notably to the interoperability features defined in the recent coar ( ) roadmap for future directions for repository interoperability. in an ideal scenario, such a common controlled vocabulary supporting dp should connect dp systems around the globe, merging the concepts of policy-aware operations, planning, technical and monitoring components of (complex) digital objects. the ultimate goal of such an endeavour is to collaboratively ensure that all necessary exchangeable information is leveraged to develop a global, scalable, trusted preservation ecosystem.
declaration of conflicting interests
the author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
funding
the author(s) received no financial support for the research, authorship, and/or publication of this article.
references
aloe: assisted linked data consumption engine, aksw research group. available at: http://aksw.org/projects/aloe.html (accessed june ).
ands (australian national data service) ( ) the digital object identifier system & doi names. available at: http://www.ands.org.au/guides/doi.html (accessed june ).
american society for quality (asq). available at: http://asq.org/index.aspx (accessed june ).
aparsen. about aparsen. available at: http://www.alliancepermanentaccess.org/index.php/aparsen/ (accessed june ).
aparsen. virtual centre of excellence (vcoe). available at: http://www.alliancepermanentaccess.org/index.php/community/virtual-centre-of-excellence/ (accessed june ).
aparsen. the aparsen roadmap to a common vision.
available at: http://www.alliancepermanentaccess.org/index.php/aparsen/ (accessed june ).
aparsen ( a) persistent identifiers interoperability framework. report. rutherford appleton laboratory chilton. available at: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/ / /aparsen-rep-d _ - - _ _incurn.pdf; http://www.alliancepermanentaccess.org/wp-content/plugins/download-monitor/download.php?id=interoperability+framework+for+pi+systems (accessed june ).
aparsen ( b) workshop on pi system interoperability, auditorium ente cassa di risparmio di firenze, italy, december . available at: http://www.alliancepermanentaccess.org/wp-content/uploads/ / /interoperability-framework-for-pi-systems-aparsen.pdf (accessed june ).
aparsen ( c) trust. report. available at: http://libereurope.eu/blog/ / / /trust-in-digital-preservation/ (accessed june ).
aparsen ( a) d . overview of preservation services. report, rutherford appleton laboratory chilton. available at: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/ / /aparsen-rep-d _ - - _ .pdf (accessed june ).
aparsen ( b) wp citability and identification, interoperability framework for pi systems. report. available at: http://www.alliancepermanentaccess.org/index.php/aparsen/aparsen-research/wp -identifiers-and-citability/ (accessed june ).
aparsen ( a) d . the interoperability framework implementation with added value services. report. rutherford appleton laboratory chilton. available at: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/ / /aparsen-rep-d _ - - _ .pdf (accessed june ).
aparsen ( b) d . report drm preservation. ict - . – digital libraries and digital preservation. report. available at: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/ / /aparsen-rep-d _ - - _ _incurn.pdf (accessed june ).
auer s, bryl v and tramp s ( ) linked open data: creating knowledge out of interlinked data. results of the lod project.
lecture notes in computer science . berlin: springer.
bekiari c, doerr m, boeuf pl, et al. ( ) frbr object-oriented definition and mapping from frbrer, frad and frsad. report. version . . available at: http://www.ifla.org/files/assets/cataloguing/frbr/frbroo_v . .pdf (accessed august ).
budroni p, miksa t and rauber a ( ) data management plans: how to treat digital sources. the imminent future for repositories and their management. lab phaidra (slides). available at: https://www.coar-repositories.org/files/ _dmp_vienna.pdf (accessed april ).
bulletin of the association for information science and technology ( ) linked data and the charm of weak semantics. special issue, ( ).
california digital library ( ) digital file format recommendations: master production files. report. regents of the university of california. available at: http://www.cdlib.org/gateways/docs/cdl_dffr.pdf (accessed june ).
california digital library. glossary. available at: http://www.cdlib.org/gateways/technology/glossary.html#d (accessed june ).
catarci t, di iorio a and schaerf m ( ) the sapienza digital library from the holistic vision to the actual implementation. procedia computer science ( ) – .
center for research libraries. certification & assessment of digital repositories. available at: http://www.crl.edu/archiving-preservation/digital-archives/certification-assessment (accessed june ).
centro di eccellenza italiano sulla conservazione digitale. available at: http://www.conservazionedigitale.org/ (accessed june ).
coar ( ) coar roadmap. future directions for repository interoperability.
comparison of trac checklist and pledge policy list. available at: http://pledge.mit.edu/images/ / /tdrppltraccompv .pdf (accessed june ).
cornell university. digital preservation management: implementing short-term strategies for long-term problems. tutorial. available at: http://www.dpworkshop.org/dpm-eng/eng_index.html (accessed june ).
data seal of approval ( ) guidelines, version . available at: https://assessment.datasealofapproval.org/guidelines_ /html/ (accessed june ).
dataverse network. available at: http://thedata.org (accessed june ).
dcc. dcc curation lifecycle model. available at: http://www.dcc.ac.uk/resources/curation-lifecycle-model (accessed june ).
dcc. what is digital curation? available at: http://www.dcc.ac.uk/digital-curation/what-digital-curation (accessed april ).
dcc. lifecycle planning for successful digital curation. available at: http://www.dcc.ac.uk/resources/curation-reference-manual/chapters-production/lifecycle-planning (accessed june ).
dcc ( ) checklist for a data management plan. version . . available at: http://www.dcc.ac.uk/resources/data-management-plans/checklist (accessed august ).
european commission ( ) dch-rp project. dch-rp digital cultural heritage roadmap for preservation. available at: http://www.dch-rp.eu/ (accessed june ).
diachron workshop ( ) managing the evolution and preservation of the data web. hosted by eswc . available at: http://www.diachron-fp .eu/workshops.html (accessed june ).
digital preservation coalition ( ) preservation metadata. technology watch report. available at: http://www.dpconline.org/newsroom/not-so-new/ -new-preservation-metadata-second-edition-technology-watch-report-released-to-dpc-members (accessed august ).
dillo i ( ) certification as a means of providing trust. in: fondazione rinascimento digitale, florence. available at: http:// . . . : /dspace/bitstream/ / / /dillo-ingrid-ch -certification% as% a% means% of% providing% trust_cc.pdf (accessed august ).
din ( ) information and documentation. criteria for trusted digital repositories. available at: http://www.beuth.de/de/norm/din- / (accessed august ).
doi ( ) international doi foundation. available at: http://www.doi.org/doi_handbook/ _idf.html (accessed june ).
doorn p ( ) towards sustainable data sharing. in: open access week, october , groningen, the netherlands. available at: http://www.rug.nl/bibliotheek/services/openaccess/peter-doorn- ?lang=en (accessed june ).
european commission ( ) recommendation on the digitisation and online accessibility of cultural material and digital preservation. official journal of the european union, october. available at: http://eur-lex.europa.eu/lexuriserv/lexuriserv.do?uri=oj:l: : : : :en:pdf (accessed june ).
europeana professional. europeana cloud. available at: http://pro.europeana.eu/structure/europeana-cloud (accessed june ).
ganguly r ( ) roles of e-infrastructure, phaidra. available at: http://phaidra.univie.ac.at/o: (accessed june ).
getty. what is lod? available at: http://www.getty.edu/research/tools/vocabularies/lod/#definition (accessed june ).
giaretta d ( ) advanced digital preservation. berlin: springer.
gladney h m ( ) long-term preservation of digital records: trustworthy digital objects. american archivist ( ): – .
h project big data europe. available at: http://big-data-europe.eu (accessed june ).
haag d ( ) persistent object identifier – linked open data manifesto. in: persistent object identifiers seminar, the hague, the netherlands, – june .
hamm m and becker c ( ) impact assessment of decision criteria in preservation planning. in: th international conference on digital preservation of digital objects, singapore, – november , pp. – .
hartig o ( ) provenance information in the web of data. in: ldow , madrid, spain, april . available at: http://www.dbis.informatik.hu-berlin.de/fileadmin/research/papers/conferences/ -ldow-hartig.pdf (accessed august ).
henneken e ( ) unlocking and sharing data in astronomy, linked data and the charm of weak semantics. special bulletin of the asis&t ( ): – .
hey t ( ) the fourth paradigm – data-intensive scientific discovery, e-science and information management. communications in computer and information science ( ).
holdsworth d ( ) instalment on preservation strategies for digital libraries. digital curation manual. version . . available at: http://www.rin.ac.uk/system/files/attachments/preservation_strategies.pdf (accessed june ).
icpsr data management and curation. data management plan resources and examples. available at: http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/resources.html (accessed june ).
icpsr trusted digital repositories. available at: https://www.icpsr.umich.edu/icpsrweb/content/datamanagement/preservation/trust.html (accessed june ).
ifla ( ) the role of libraries in data curation, access, and preservation: an international perspective. session – science and technology libraries, august . available at: http://www.ifla.org/news/the-role-of-libraries-in-data-curation-access-and-preservation-an-international-perspective (accessed june ).
ifla ( ) about digital libraries. available at: http://www.ifla.org/about-digital-libraries (accessed june ).
ifla study group on the functional requirements for bibliographic records ( ) functional requirements for bibliographic records report. munich: kg saur verlag.
innocenti p, smith m, ashley k, et al. ( ) towards a holistic approach to policy interoperability in digital libraries and digital repositories. international journal of digital curation ( ): – .
inspect project. available at: http://www.significantproperties.org.uk/ (accessed june ).
interpares trust. available at: https://interparestrust.org/trust/about_research/studies (accessed june ).
ipres ( ) th international conference on preservation of digital objects (eds j borbinha, m nelson and s knight), lisbon, portugal, – september .
iso family – quality management. available at: http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= (accessed april ).
iso ( ) space data and information transfer systems – open archival information system (oais) – reference model. available at: http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= (accessed april ).
iso ( ) space data and information transfer systems – audit and certification of trustworthy digital repositories. available at: http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= (accessed april ).
iso/iec ( ) systems and software engineering – systems and software quality requirements and evaluation (square) – system and software quality models. available at: http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= (accessed april ).
isq information standards quarterly ( ) digital preservation. special issue ( ).
jaques y, anibaldi s, celli f, et al. ( ) proof and trust in the openagris implementation. in: proceedings of the international conference on dublin core and metadata applications, kuching, sarawak, malaysia, september – , pp. – .
kulovits h, kraxner m, plangg m, et al. ( ) open preservation data: controlled vocabularies and ontologies for preservation ecosystems. in: th international conference on preservation of digital objects, ipres (eds j borbinha, m nelson and s knight), lisbon, portugal, – september , pp. – .
lavoie b ( ) oais introductory guide. dpc technology watch report. nd edn. available at: http://www.dpconline.org/newsroom/not-so-new/ -digital-preservation-coalition-publishes-oais-introductory-guide- nd-edition-technology-watch-report (accessed august ).
liber ( ) rd liber workshop on digital curation. keeping data: the process of data curation, vienna, austria, – may .
library of congress. national preservation research agenda for the human record: scientific research focuses of the library of congress preservation directorate. available at: http://www.loc.gov/preservation/scientists/projects/agenda.html (accessed april ).
library of congress. premis. preservation metadata maintenance activity. available at: http://www.loc.gov/standards/premis/ (accessed june ).
library of congress. preservation metadata: implementation strategies (premis) ontology. available at: http://id.loc.gov/ontologies/premis.html (accessed june ).
library of congress ( ) premis data dictionary for preservation metadata, version . . premis ed. committee. available at: www.loc.gov/standards/premis/v /premis- - .pdf (accessed august ).
linked content coalition ( ) lcc identifiers workstream identifiers specification. version . . available at: file:///c:/users/solodovnik/downloads/identification% -% llc% specifications% identifiers% v . a.pdf (accessed june ).
linked heritage project ( ) d . state of the art report on persistent identifier standards and management tools. report. available at: http://www.linkedheritage.eu/index.php?en/ /persistent-identifiers (accessed june ).
lod project. lod stack. available at: http://stack.lod .eu (accessed june ).
metaarchive cooperative ( ) trac audit checklist. educopia institute.
miller e and ogbuji u ( ) linked data design for the visible library. linked data and the charm of weak semantics, special section, bulletin of the asis&t ( ): – .
oai ( ) workshop on innovations in scholarly communication. digital curation and preservation of large and complex scientific objects. cern, geneva, switzerland, – june . available at: https://indico.cern.ch/event/ / (accessed june ).
or ( ) th international conference on open repositories, indianapolis, usa, – june . available at: http://www.or .net/ (accessed june ).
pericles project. available at: http://pericles-project.eu/page/about (accessed june ).
pericles project. deliverable .
initial version of environment information extraction tools. report. pericles consortium. available at: http://www. pericles-project.eu/uploads/files/pericles_wp _d _ -initial_version_environment_information_extraction_ tools-v _ .pdf (accessed june ). pericles project. linked resource model (lrm). available at: http://pericles-project.eu/tagsearch/tag/lrm (accessed june ). phaidra. the ten commandments for policy. available at: https://phaidraservice.univie.ac.at/en/phaidra/policy/ (accessed june ) plato planning tool. available at: http://www.ifs. tuwien.ac.at/dp/plato/intro/ (accessed june ). preforma project. preservation formats for culture information/e-archives. available at: http://www.pre- forma-project.eu/ (accessed june ). rauber a ( ) data management plans: a good idea, but not sufficient. in: rd liber workshop on digital cura- tion. keeping data: the process of data curation, vienna, austria, – may , available at: http:// liber .univie.ac.at/fileadmin/user_upload/doevl_ events/liber _pics/ _liber_processpreserva- tion.ppt (accessed august ). rda. active data management plans rda ig. available at: https://rd-alliance.org/groups/active-data-manage- ment-plans.html (accessed june ). reilly sk ( ) positioning libraries in the digital pre- servation landscape. available at: http://www.unesco. org/new/fileadmin/multimedia/hq/ci/ci/pdf/mow/ vc_skreilly_ _a_ .pdf (accessed june ). semantic web company ( ) the semantic puzzle. the lod cloud is dead, long live the trusted lod cloud. available at: http://blog.semantic-web.at/ / / / the-lod-cloud-is-dead-long-live-the-trusted-lod-cloud/ (accessed june ). skinner k and halbert m ( ) metaarchive: a coopera- tive approach to distributed digital preservation. against the grain ( ): art. . society of american archivists ( ) a glossary of archi- val and records terminology. available at: http:// www .archivists.org/glossary (accessed june ). subirats i and zeng ml ( ) lode-bd recommenda- tions . 
: how to select appropriate encoding strate- gies for producing linked open data (lod)-enabled bibliographic data. report. rome: food and agricul- ture organization of united nations. available at: http://aims.fao.org/lode/bd (accessed june ). subirats i, zeng ml and keizer j ( ) metadata approaches for shareable and lod-enabled biblio- graphic data from open repositories: in: proceedings of the international conference on dublin core and metadata applications, the hague, the netherlands, , pp. – . swan a, gargouri y, hunt m, et al. ( ) open access policy: numbers, analysis, effectiveness. pasteur oa work package report. available at: http://eprints. soton.ac.uk/ / (accessed august ). tammaro am and casarosa v ( ) research data man- agement in the curriculum: an interdisciplinary approach. procedia computer science : – . taverna workflow management system. available at: http://www.taverna.org.uk/ (accessed june ). trusted digital repositories and audit checklist (trac) ( ) oclc. version . . available at: https://www. crl.edu/sites/default/files/d /attachments/pages/trac_ .pdf (accessed april ). trusteddigitalrepositories.eu ( ) european frame- work for audit and certification of digital repositories. available at: http://www.trusteddigitalrepository.eu/ trusted% digital% repository.html (accessed june ). uc curation center ( ) digital curation foundations, version . . available at: https://wiki.ucop.edu/down- load/attachments/ /uc -curation-grundlagen- v . .pdf?version¼ &modificationdate¼ (accessed june ). unesco/ubc ( ) the memory of the world in the digital age: digitization and preservation. vancouver declaration. canada, – september . unesco/ubc ( ) vancouver declaration on digiti- zation and preservation. available at: http://www. unesco.org/new/en/communication-and-information/ resources/news-and-in-focus-articles/all-news/news/ unesco_releases_vancouver_declaration_on_digitization_ and_preservation/#.vzfmgvntmko (accessed june ). 
w c ( ) library linked data incubator group: use cases. report. october. available at: http://www. w .org/ /incubator/lld/xgr-lld-usecase- / (accessed april ). weidner aj and alemneh dg ( ) workflow tools for digital curation. code lib journal . white m ( ) mining the archives: metadata develop- ment and implementation. ariadne . wikipedia. digital repository standards development. available at: http://en.wikipedia.org/wiki/trustworthy_ repositories_audit_% _certification#/media/file:digi- talrepositorystandards.png (accessed june ). zenodo. available at: http://zenodo.org (accessed june ). zenodo ( ) schema for the description of research data repositories – rfc. version . . available at: http://zenodo.org/record/ #.vzgcrfntmko (accessed june ). author biographies iryna solodovnik holds a phd in philosophy of commu- nication (scientific area archiving, library and librarian- ship) from the university of calabria. in she started her career as a research fellow at the institute of informatics and telematics of cnr of rende (iit-uos cnr) focusing on the analysis and development of conceptual models underpinning digital library. 
since april she has been a member of the editorial services of fao/aims (agricultural information management standards). from to she undertook her phd period at the library of the university of vienna/phaidra department, focusing on research into digital object lifecycle management.

paolo budroni has worked at the university of vienna since , initially heading the office of science documentation at the liaison office. he managed the creation of the first research documentation of the university of vienna (donkey). from – he headed the project ‘research strategies at the university of vienna’ at the logistical centre of the university. since he has worked at the library of the university of vienna and since has been the project director of phaidra, a long-term digital archiving platform. he now also coordinates the project e-infrastructures austria.
ifla journal ( )
article

the universal procedure for library assessment: a statistical model for condition surveys of special collections in libraries

sam capiau
flanders heritage library, antwerp, belgium

marijn de valk
boekrestauratie de valk, middelburg, the netherlands

eva wuyts
flanders heritage library, antwerp, belgium

abstract
to project the needs for conserving and preserving special collections in libraries, the flanders heritage library foundation developed the universal procedure for library assessment (upla), a model for damage assessment. this tool allows libraries to independently survey the physical condition of their collections. it also provides flanders heritage library with the opportunity to collect much-needed statistics for policy making. upla describes the condition of the collection as a whole, based on a random sample of items. each of these items is assessed on types of damage. because of the systematic approach of the method, the results can be used for benchmarking: both over time and between libraries.

keywords
cultural heritage management, principles of library and information science, preservation and conservation, collection development, special collections/rare books

heritage libraries in flanders
for centuries, flanders (the northern part of belgium) has been playing a significant role as a documentary heritage production centre. manuscripts, books, newspapers, magazines and ‘grey literature’ are held in many heritage institutions within this region. these special or, as we call them, heritage collections are held in a variety of institutions, including research and public libraries, archival departments, documentation centres, museums, abbeys and convents. the volume of the collections held and the (financial and staff) resources available to these institutions for the care of their heritage collections differ considerably.
however, they all share a common goal: the long-term conservation and preservation of our written and printed cultural heritage. see figure .

until recently, libraries in flanders holding collections of historical importance were not considered a sector of their own. despite the volume and cultural value of their collections, which include many masterpieces, heritage libraries continued to remain in the background. their relative invisibility, combined with a lack of continuity and the absence of structural support made it impossible for these libraries to adequately respond to the many challenges faced today. to change this situation and to provide an impetus for a more structured policy framework for heritage libraries, the flanders heritage library foundation was established at the end of . flanders heritage library is a network comprising six major libraries, some of which are inherently anchored in the academic world: the hendrik conscience heritage library in antwerp, the bruges public library, the limburg provincial library and the university libraries of antwerp, ghent and leuven. together, they are committed to building and sharing expertise in the preservation, bibliographic description, digitisation and accessibility of heritage library collections throughout the sector as a whole.

corresponding author: sam capiau, flanders heritage library, hendrik conscienceplein , b- antwerp, belgium. email: sam@vlaamse-erfgoedbibliotheek.be
international federation of library associations and institutions , vol. ( ) – © the author(s). reprints and permission: sagepub.co.uk/journalspermissions.nav. doi: . / . ifla.sagepub.com

the objectives of flanders heritage library are clearly outlined in the cultural heritage decree of the flemish community.
these same goals are included in the articles of association of the flanders heritage library foundation:

• to harmonise collection development policies between partner libraries, based on collection subject profiles.
• to develop a common collection policy for ‘flandrica’, particularly for publications of importance to flanders from a cultural or historical point of view or because of their heritage value; in relation to the former, investigate the possibilities for organizing a supplemental regional legal deposit.
• to develop and spread expertise in the area of preservation of heritage collections.
• to catalogue heritage collections and to develop expertise around metadata and standards for heritage collections.
• to digitise heritage collections.
• to create a solution for the long-term preservation of and access to digitised and born-digital heritage collections.
• to organise and participate in communication initiatives to raise public awareness of heritage libraries in flanders and to develop competence in the area of promoting collection use and dissemination.

caring for books
in , the university of antwerp conducted a survey among heritage libraries on behalf of flanders heritage library. the purpose of the survey was to establish the level of preservation and conservation of library collections and the extent to which these collections have been digitised and made accessible through (online or offline) cataloguing. the outcome of the survey was published in a report with a telling name: de wet van de remmende achterstand [the law of the inhibiting arrears] (capiau et al., ). the survey paints a worrying picture of the sector of heritage libraries in flanders. the situation is particularly precarious when it comes to the physical care of the collections. the majority of libraries lack insight into the physical condition of their collections, making it impossible to develop professional programmes for collection care and preservation.
the meagre financial resources are usually directed towards the restoration of a handful of (master)pieces and rarely towards a more comprehensive approach that involves the complete collection. most institutions lack the time and expertise to remedy this situation.

to provide heritage libraries with the insight and tools needed for establishing strong policies regarding damage prevention and remediation, flanders heritage library developed a damage assessment model in – . the model uses a reliable sampling method to efficiently and pragmatically gain insight into the physical condition of library collections and into the level of accessibility. the results of these random checks are then used to assess the ‘need for conservation’ (i.e. the amount of work required to execute the necessary preservation and conservation actions), allowing institutions to draw up a tailor-made preservation policy and a collection care plan.

at the end of , flanders heritage library assigned the task of developing the model to book restoration company boekrestauratie de valk from middelburg, the netherlands, in partnership with paper conservation company hoogduin papierrestauratoren, also based in the netherlands. the project was co-managed by eva wuyts and sam capiau from flanders heritage library. in addition, a group of conservation experts and librarians from flanders and the netherlands acted as a sounding board. this way, the initiators of the project were able to ensure that the methodology was subscribed to by the field.

considering the limited resources of most heritage libraries, flanders heritage library determined that the new model had to meet two requirements:

• first of all, the model ought to collect data at the collection level, based on a representative sampling method. this would heavily reduce the time needed for a collection screening. assessing damage for each individual item in a collection would be too time-consuming and would mainly serve an object-driven approach.
• secondly, heritage library staff ought to be able to independently execute the damage assessment. in other words, the model could not be solely aimed at book restorers; it had to be suitable also for use by ‘laymen’ who received some basic training. this would reduce costs and – not unimportantly – it would increase knowledge and awareness about collection care in heritage institutions. all library staff involved would have to complete a special training programme. the expertise gained during these training sessions, as well as the (renewed) introduction to the collections held in their own library, were to benefit the staff even after the damage assessment was completed.

figure . marijn de valk assessing damages according to the upla method. photo: j. goedemé.

from upaa to upla
rather than establish an entirely new methodology, we adopted a proven methodology and modified it to suit the specific requirements of library collections. the methodology selected was the universal procedure for archive assessment (upaa), an instrument developed in the early s by the national archives of the netherlands for the purpose of assessing archival collections. upaa has more than earned its status: in addition to the netherlands, it is used in croatia, the nordic countries, indonesia, romania, russia, south africa, sri lanka and the uk.

the sampling method used in the upaa and upla models is based on the principles of french statistician pierre gy. to select a sample that is representative of the collection, the shelf capacity (in linear metres) and the average number of books per metre have to be determined first. as a guiding principle, the sample size is set to items per collection. based on the total capacity, a constant interval is computed. this interval is then used to single out discrete areas of one linear metre each.
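as a rough illustration only, the interval step described above can be sketched in python. the names plan_sample and SamplePoint, the formula interval = capacity / sample size, the random starting offset and the way the within-metre pick is drawn are all assumptions made for this sketch; they are not taken from the published upaa or upla procedure.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplePoint:
    metre: int      # 1-based number of the linear metre to visit
    countdown: int  # take the n-th item counted from the start of that metre


def plan_sample(shelf_metres, books_per_metre, sample_size, rng=None):
    """Spread sample_size one-metre areas evenly over the shelving.

    Hypothetical helper: the interval is assumed to be capacity / sample size,
    and the first area sits at a random offset inside the first interval.
    """
    if sample_size < 1 or shelf_metres < sample_size:
        raise ValueError("need at least one linear metre per sample item")
    if rng is None:
        rng = random.Random()
    interval = shelf_metres / sample_size   # constant spacing in metres
    start = rng.random() * interval         # random offset in [0, interval)
    last_metre = math.ceil(shelf_metres)
    points = []
    for k in range(sample_size):
        position = start + k * interval     # distance from the first shelf
        metre = min(int(position) + 1, last_metre)
        # Random "countdown": which item within the chosen metre to pull.
        countdown = rng.randint(1, max(1, round(books_per_metre)))
        points.append(SamplePoint(metre, countdown))
    return points
```

for example, plan_sample(shelf_metres=120, books_per_metre=30, sample_size=10) yields ten (metre, countdown) pairs, one in each 12-metre stretch of shelving. these figures are placeholders chosen for the example, not the sample size prescribed by the model.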
within every one of these areas, a randomly generated countdown value decides which item will serve as the actual sample. this method has been tested in various archives and produces a % accuracy rate, signifying that the results may vary by a margin of % above or below the final figure. see figure .

upla versus upaa
despite the fact that the upaa and upla models are closely related, there are quite a number of distinct differences and even points for improvement. firstly, only some of the types of damage applicable to archives also apply to library collections. the upaa model mainly focuses on the condition of the paper, with little attention given to the binding. the upla model – designed more specifically for traditional library collections – takes into account a wider range of materials that may be encountered, such as linen, leather and parchment. photos, charters, maps and modern media such as cds and tapes are excluded under the upla model. on the other hand, newspapers and unbound (archival) materials are taken into account.

a second, more fundamental difference lies in the fact that the upaa model considers accessibility – the ability to actually handle the document without causing further damage to its content – to be the main criterion for assessment. this is quite logical, given that archives must be preserved for legal purposes and that their sources must remain accessible to future generations. for this reason, the severity of damage to an object assessed under the upaa model will depend on its impact on the object’s accessibility. naturally, the accessibility of a source is a primary concern for heritage libraries as well. however, to book historians and other examiners, the physical appearance of books is equally important. the scientific and cultural value of books and their bindings is not only derived from their content; the object itself holds value as well.

this results in a different method for damage assessment.
the upaa method simply determines the severity of damage by assessing the impact of user handling on the physical condition of the document. the upla model makes it a two-step process. first, the extent of the damage is assessed. then, the method examines whether normal use of the item is likely to worsen the damage. this two-step approach facilitates the overall assessment process, rendering it more reliable and nuanced at the same time. for instance, where a volume of archival documents may be classified under the upaa model as ‘not accessible’ and thus ‘seriously damaged’ due to containing one single sheet of heavily felted paper, the same object would be classified under the upla model as ‘moderate damage’ (only one sheet) but ‘not accessible’.

figure . a special die is used to identify the linear metre in which the first sample item is selected. photo: marijn de valk.

finally, the upla model identifies several types of damage that can render a book’s physical condition unstable. there are three main groups: intrinsic decay, biological damage and red rot. intrinsic decay includes damage from acidification of the end papers, acidification of the text block, and ink and copper corrosion, similar to the definition of this term used by metamorfoze, the dutch national programme for the preservation of paper heritage. biological damage includes mould and rodent damage. in all cases, the presence of mould or any signs of insect damage are considered urgent, unless recent tests have confirmed the contrary. all types of damage mentioned above will worsen over time, even if the item is never used. that is why the upla model also examines the (in)stability of a library collection in addition to the accessibility of the objects. these types of damage require urgent preservation measures.
an upla report that pinpoints areas that require immediate action will give policy makers of the affected institutions the backup they require.

twenty-two types of damage
the focus on books as objects is clearly illustrated in the upla assessment method. the exterior of the book, i.e. the cover, is examined first. then, the book is opened to verify that the essential parts are structurally solid. after all, books are objects that move as they are opened and closed. next, the text block is assessed for damage. see figure . a total of types of damage have been identified for the damage assessment process, classified in four major groups:

damage to the cover:
• dust and surface dirt
• outer lining in poor condition
• red rot
• harmful tapes and repairs
• loose fragments
• missing fragments
• damaged cover cores
• damaged fasteners and fittings

damage to the construction:
• warping of covers and text block
• damaged cover binding and joints
• damaged sheet and signature joints

damage to the text block:
• dust and surface dirt
• harmful tapes and repairs
• gaps, tears and folds
• felting
• stuck sheets
• acidification of end papers
• acidification of text block
• foxing
• ink corrosion and copper corrosion

biological damage:
• mould damage
• rodent damage

an important tool for performing an upla assessment is the ‘damage atlas’, which provides insight into the types of damage and their urgency, based on images and clear definitions. in , the metamorfoze programme issued schadeatlas archieven, a tool for assessing and classifying damage to archival documents. its english translation as archives damages atlas followed in . in partnership with metamorfoze, flanders heritage library then published schadeatlas bibliotheken [libraries damage atlas] at the end of (de valk, ). the book is a useful instrument even separate from an upla assessment and is freely available in print format or online. the atlas will be made available in english in due course.
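to make the shape of an assessment record concrete, the sketch below encodes the four groups and their damage types as listed above, together with the two-step judgement (extent of damage, then whether normal use would worsen it) and the three groups of unstable damage. the names Finding, Severity and summarise, and the idea of reducing an item to three flags, are assumptions made for illustration; they do not describe the actual upla database or spreadsheet.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    MODERATE = 1
    SERIOUS = 2


# Groups and damage types exactly as enumerated in the text.
DAMAGE_TYPES = {
    "cover": [
        "dust and surface dirt", "outer lining in poor condition", "red rot",
        "harmful tapes and repairs", "loose fragments", "missing fragments",
        "damaged cover cores", "damaged fasteners and fittings",
    ],
    "construction": [
        "warping of covers and text block",
        "damaged cover binding and joints",
        "damaged sheet and signature joints",
    ],
    "text block": [
        "dust and surface dirt", "harmful tapes and repairs",
        "gaps, tears and folds", "felting", "stuck sheets",
        "acidification of end papers", "acidification of text block",
        "foxing", "ink corrosion and copper corrosion",
    ],
    "biological": ["mould damage", "rodent damage"],
}

# Types the text says keep worsening even if the item is never used:
# intrinsic decay, biological damage and red rot.
UNSTABLE = {
    ("cover", "red rot"),
    ("text block", "acidification of end papers"),
    ("text block", "acidification of text block"),
    ("text block", "ink corrosion and copper corrosion"),
    ("biological", "mould damage"),
    ("biological", "rodent damage"),
}


@dataclass(frozen=True)
class Finding:
    group: str
    damage_type: str
    severity: Severity       # step one: extent of the damage
    worsens_with_use: bool   # step two: does normal handling make it worse?

    def __post_init__(self):
        if self.damage_type not in DAMAGE_TYPES.get(self.group, ()):
            raise ValueError(f"unknown damage type: {self.group}/{self.damage_type}")


def summarise(findings):
    """Reduce all findings for one sampled item to three flags (hypothetical)."""
    findings = list(findings)
    return {
        "worst_severity": max((f.severity for f in findings), default=None),
        "worsens_with_use": any(f.worsens_with_use for f in findings),
        "unstable": any((f.group, f.damage_type) in UNSTABLE for f in findings),
    }
```

under these assumptions, the felted-sheet example given earlier would be recorded as Finding("text block", "felting", Severity.MODERATE, True), which summarise reports as moderate damage that worsens with use and is not unstable.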
to measure is to know
using the damage atlas as a guideline, selected items from a collection are meticulously screened, and their various types of damage are recorded in a database. results are automatically bundled and presented in a number of tables across four worksheets in an excel spreadsheet. the first worksheet details the accessibility of the collection as a whole in percentages, as well as the degree of instability detected. the second worksheet provides a list of all the metadata elements, while the third worksheet lists the damaged objects that are not accessible, or in other words, the percentage of books that will get damaged even further through normal use. naturally, the automatic data processing method offers an in-depth analysis as well. the fourth table, on the last worksheet, is more comprehensive and lists all types of damage followed by the corresponding number of books from the sample. four different categories of damage can be distinguished:

• serious damage – handling the item will worsen the damage
• serious damage – no risk of additional damage when handling the item
• moderate damage – handling the item will worsen the damage
• moderate damage – no risk of additional damage when handling the item

figure . damaged books. photo: marijn de valk.

the default statistical analysis for each library consists of these four worksheets. however, the data collected can be used for further analysis. specific damage type combinations and details retrieved from metadata may prove very useful to some libraries. for instance, when we know the percentage of books containing a metal lock or clasp and whether these books have been boxed, we may deduce the number of books with metal fittings that may cause damage to adjacent books. or we can determine the number of books that are seriously damaged due to dust or dirt and that also show signs of mould damage.
such information can provide a starting point for a cleaning programme. however, further analysis and comparison of upla statistics against each other and against storage conditions will require some custom reporting.

as each upla assessment is performed in the same uniform manner, the results of separate assessments can be compared against one another. flanders heritage library plans to bundle the (anonymised) results of all upla assessments in flanders in a single registry. this umbrella database will allow for benchmarking of individual institutions and will facilitate policy recommendations regarding the preservation of flemish heritage library collections. after some time, libraries may opt to repeat the assessment process, allowing them to monitor certain (intrinsic) types of damage or even their own conservation policies.

upla put to the test
we decided to test the upla model prior to making the instrument generally available. the upla model was tested mid on the collection of the ruusbroecgenootschap library in antwerp. this library proved extremely well suited for the test as it encompasses a diverse collection that includes incunabula and a vast range of ancient prints, but also many modern works, magazines and brochures held across several stack rooms. furthermore, the library is used intensively, resulting in a wide range of damage to its collections. a total of samples were selected from metres of shelving filled with books. a detailed form to record the damage was completed for each selected item. in this case, the test was carried out by two teams of assessors. team a consisted of so-called ‘laymen’ with little to no knowledge about damage assessment. team b consisted of book restorers. the latter were able to more easily identify damage, listing more types of damage across all samples. nevertheless, the final results of the screening process produced by both teams were very similar.
this demonstrates that ‘laymen’ are capable of performing these assessments autonomously, especially if they receive proper training. see figure .

the test week proved to be a positive experience for everyone involved. the team managed to screen the entire collection in eight days, thanks to the input from the book restorers and the good teamwork with the library’s own staff, who were of course also familiar with the collection and the layout of the stack rooms. once the test was completed, the head librarian of the ruusbroecgenootschap library stated that the assessment was hugely beneficial and crucial for gaining valuable insight. this indicates that the model was successful. it collected valuable policy-relevant statistics about the library collection as a whole, and increased awareness among the library’s own staff. meanwhile, the process of implementing the resulting recommendations regarding the storage and placement of books inside the ruusbroecgenootschap library has already been initiated.

figure . sam capiau (flanders heritage library) and hilde schalkx (hoogduin papierrestauratoren) testing the upla model on the collection of the ruusbroecgenootschap library in antwerp. photo: marijn de valk.

getting started with upla
flanders heritage library has made the upla model freely available to the entire heritage sector. the organisation has also set up the necessary support framework to improve the overall quality of the screening process and to ensure that the method is implemented correctly. that helps ensure reliable results that can be incorporated in the joint registry for library collection damage.

institutions wishing to perform a damage assessment using the upla model can register their library staff for training. a two-day training workshop provides the perfect starting point for an upla assessment.
it focuses on knowledge of materials, book terminology and damage identification as defined in the model. participants are assisted by the libraries damage atlas, which provides an abundance of illustrated examples.

in addition, flanders heritage library provides the opportunity to consult with a book restorer who is familiar with the upla model, thus guaranteeing efficiency and quality throughout the assessment process. essentially, these experts could be put in charge of an upla project from start to finish. however, to avoid missing out on a great opportunity to build internal expertise, we recommend that heritage libraries perform these screening tasks autonomously with guidance and support from an expert. for instance, the expert could initially assist library staff with the assessment of books. after completing the appropriate training and with the libraries damage atlas in hand, library staff would then perform the remainder of this task autonomously. an upla assessment is estimated to take up to three full days, but these do not need to be consecutive. it is generally easier to schedule in six half-days. a phased assessment process including occasional breaks will also benefit the overall quality of the project. during the last half day of the assessment, the expert could once again be available to answer questions.

impetus for policy reform
at the time of writing this article, the first upla assessment is yet to commence. however, we firmly believe that libraries that decide to execute such an assessment are taking a first important step towards improving collection care plans and preservation policies for their heritage collections. the reports they receive upon completion of their assessments will provide a bird’s-eye view of the current state of their collections. the upla model is not a rigorous screening of each and every item in a library’s collection, nor will it provide the library with a ready-made remedial approach.
however, it is the preferred tool for those heritage libraries wishing to get to grips with their collections and with the associated challenges that are sometimes unfamiliar or overwhelming. the upla model is equally recommended for small institutions, as they too will benefit from the results. after all, the guiding principles behind an upla assessment can be summarised in three words: pragmatic, efficient and uniform.

• pragmatic – because the upla assessment can be executed by the library’s own staff, even by ‘laymen’. if needed, external experts can be consulted to help guide the process. the types of damage to be recorded are kept to a limited number to avoid collection of non-relevant data.
• efficient – because the amount of time invested in the assessment is relatively minor in comparison to the vast amount of knowledge gained throughout the process. it takes approximately seven working days for two library staff members to fully complete an upla assessment.
• uniform – to ensure that assessment results can be compared against one another in the future. the guidelines and the libraries damage atlas are unambiguous and not too complex. this guarantees that everyone performs these assessments in the same manner and that there is little room for variation.

every upla assessment results in a scientifically based report which can be used by librarians as a tool for developing custom conservation policies and for advocating to policy makers regarding the need for investments. they should do this at their own pace, because putting upla recommendations into action and taking concrete measures may require additional (internal) assessments. unfortunately, libraries usually only have limited resources for such projects. let us hope the joint registry of library collection damage will help the government reflect on the issue, thus providing an impetus for more systematic support of flemish heritage libraries in the future.
ifla journal ( )

declaration of conflicting interests

the author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

notes

1. http://www.flandersheritagelibrary.be
2. the sounding board group included: pierre delsaerdt (information and library studies – university of antwerp), guy de witte (de zilveren passer), leen breyne and griet kockelkoren (faro – flemish interface centre for cultural heritage), imke neels (restoration studio, royal library of the netherlands), marleen vandenreyt (limburg provincial library), ludo vandamme (bruges public library), elke van herck (department of collection management/museum collection care, city of antwerp), serafien hulpiau (restoration studio, university of ghent), hilde schalkx (hoogduin papierrestauratoren), dorrit van camp (hendrik conscience heritage library), ellen storms (antwerp university library), lieve watteeuw (illuminare – university of leuven), eva wuyts and sam capiau (flanders heritage library).

references

capiau s, delsaerdt p, coppoolse d, et al. ( ) de wet van de remmende achterstand. preservering, conservering, ontsluiting en digitalisering in vlaamse erfgoedbibliotheken. armarium. publicaties voor erfgoedbibliotheken. antwerp: vlaamse erfgoedbibliotheek vzw. available at: http://www.vlaamse-erfgoedbibliotheek.be/bron/ (english executive summary included) (accessed july ).
de valk m ( ) schadeatlas bibliotheken. hulpmiddel bij het uitvoeren van een schade-inventarisatie. armarium. publicaties voor erfgoedbibliotheken. antwerp: vlaamse erfgoedbibliotheek vzw. available at: http://www.vlaamse-erfgoedbibliotheek.be/bron/ (currently in dutch only) (accessed july ).

author biographies

sam capiau is a historian and studied information and library studies at the university of antwerp. he is currently working as a project officer at flanders heritage library, implementing the upla model (among other things).
marijn de valk is a self-employed book conservator-restorer in middelburg, the netherlands. she developed the upla model on behalf of flanders heritage library.

eva wuyts studied history at the university of ghent and completed a postgraduate course in culture management at the university of antwerp. she has been active in the cultural heritage sector in flanders for quite some time and has been working as a coordinator at flanders heritage library since it was established.

article

cultural heritage digitization projects in algeria: case study of the national library

nadjia ghamouh
university of constantine - abdelhamid mehri, algeria

meriem boulahlib
university of constantine - abdelhamid mehri, algeria

abstract

currently, the algerian national library is striving to digitize algerian cultural heritage. this exercise became imperative due to physical damage to manuscripts when they were handled during reading. this case study aims to shed light on the challenges of manuscripts and rare books digitization in the algerian context. in addition, this paper clarifies the algerian national library’s aspirations and plans to make manuscripts and rare books digitization a thriving endeavor.

keywords

digitization, human resources, manuscripts and rare books collections, algerian national library, planning digitization

introduction

the algerian national library was defined in executive decree no. - of june which identified the algerian national library as:

a public institution under the direction of the minister of culture; it aims to collect, store, and diffuse the national cultural heritage and ensure openness to the universal heritage.
in this context, the national library is responsible for gathering and cataloging manuscripts collections, coins, medals and rare books of national interest, in addition to launching projects and programs related to its activities. (general secretariat of government, )

this legal definition presents the most important activities of the algerian national library and explains the director general’s announcement of a new digitization project on the recommendation of the minister of culture, who described the project as ‘a bet that should be earned’. following this policy announcement, digitization became one of the algerian national library’s priorities and attracted both human and material resources to make it successful. these resources included budgets for equipment, research and a training program led by an international expert to develop the local project team’s digitization skills.

the manuscripts and rare books selected to be digitized were the most requested by users; more than manuscripts had been digitized from a list of between and . at the end of , digitization was suspended after the project faced a range of technical and organizational issues such as lack of qualified staff, workflow problems, and the need for adequate equipment.

since this initial experience, the algerian national library has been attempting to overcome its digitization issues and make the project viable through an updated work plan and new equipment.

objective of the study

this study aims to shed light on the challenges that led to the suspension of manuscripts and rare books digitization in the algerian national library. in addition, the study clarifies the library’s future aspirations to create a successful digitization program.

corresponding author: meriem boulahlib, université constantine - abdelhamid mehri, nouvelle ville ali mendjeli, bp: a, constantine , algeria. email: meryam.boulahlib @gmail.com

international federation of library associations and institutions , vol. ( ) – © the author(s). reprints and permission: sagepub.co.uk/journalspermissions.nav. doi: . / . ifl.sagepub.com

the algerian national library: background

the algerian national library was established in on the initiative of the civil administrator of the regency of algiers, genty de bussy. the library is the country’s oldest cultural institution. its first home was a state-owned building, but it was soon moved into an ex-janissary barracks in bab-azoun in (wedgeworth, : ), which was a middle school at that time. two classrooms were allocated for the national library and the museum until . in the algerian national library relocated to the palace of the dey mustapha pasha (a janissary commander) as its official residence (venis, ). according to the history of algeria, this palace was a grandiose moorish building, which the algerian national library occupied from until (le soir d’algerie, ).

in algeria’s growing population and economic development prompted the construction of a new permanent home for its national library (lebel, ). however, the accidental burning of the university library of algeria in obliged the national library to accommodate the university collections and support university students. since this accident, the national library has become the preferred place for students, academics and researchers (bouderbane and boukerzaza, ). the new building of the algerian national library covers an area of , m², consisting of floors with a capacity of readers (ali, : ).

currently, the national library is the most active cultural institution in algeria.
according to executive decree no - of june (general secretariat of government, ), the library aims to collect, store, and diffuse the national cultural heritage, and ensure openness to the universal heritage, by:

• collecting every printed book, magazine and newspaper published in algeria;
• preparing and publishing general catalogs and the national bibliographies;
• enabling access to every researcher;
• supporting library and information sciences as well as providing training to university students;
• enriching the cultural life of algerian citizens and ensuring global cultural openness;
• taking part in scientific research through periodical publications.

according to the executive decree mentioned above, the algerian national library must be managed by a director general, who is responsible for the financial, technical and administrative aspects of the library. in addition, the director general represents the algerian national library as a cultural institution in international fora and cultural events.

furthermore, the executive decree indicates that the algerian national library is guided by two main councils. the first is the library council, whose members provide oversight of library projects and represent different ministries. the second is the scientific council, a multi-disciplinary advisory board that provides advice related to the scientific activities of the national library. unfortunately, these councils have never met because their members have not yet been nominated. figure illustrates the library’s structure.

manuscripts and rare books collection in the algerian national library: preservation and accessibility

despite its recent creation as a modern state, algeria is a deeply rooted civilization in northern africa where archaeological excavations have established early human activity (sahnouni and de heizelin, ).
the national library of algeria’s manuscripts and rare books collections tell the stories of subsequent civilizations, and represent the wealth of algerian history. see pictures and for examples.

the collecting of those manuscripts and rare books began in when adrien berbrugger, a member of the french campaign in the city of constantine, witnessed the destruction and burning of the local libraries (fagnan, : ). during this campaign berbrugger succeeded in collecting manuscripts. unfortunately he could transport only manuscripts, because of the long distance between constantine and algiers and the lack of transport caused by the war (bounfikha, : ). on the other side, in western algeria, m. le baron de slane gathered multidisciplinary manuscripts from tlemcen city and carried them from oran to algiers by steamboat.

the collection of manuscripts continues today. table illustrates the historical development of the collection. according to the annual reports of the manuscripts and rare books department in the algerian national library from , the library has collected manuscripts from different sources and now owns manuscripts and rare books in fields as varied as: religions, otophone, science of logic, philosophy, history, geography, literature, algebra, geometry, medicine, geriatrics, pharmacy, mathematics, astronomy, music, and black magic, in many different languages such as: arabic, tamazight, aljamiado, persian, turkish, latin, greek, spanish, tibetan, italian, and syriac.

the algerian national library’s manuscripts collection, which was first described in the library catalog of manuscripts in , included a bibliographic description and illustration of rare manuscripts (ben mukadem and ben yahiya, : ). the second version of the catalog is in the process of being published.
their rarity and preciousness make the manuscripts and rare books in the algerian national library difficult to preserve. for this reason, the manuscripts and rare books department relies upon the arabic union catalog to accelerate cataloging and make the manuscripts ready for digitization. the department cataloged manuscripts according to the unified local manuscripts cataloging standards authenticated by a local expert in .

besides cataloging, the manuscripts and rare books department is now working on manuscript sterilization for restoration and preservation purposes. after using an autoclave for a long period of time, the department discovered that this equipment caused health issues for both workers and users. the department therefore suspended the sterilization procedures using the autoclave and has begun to study deep freezing as an alternative.

figure . the organizational structure of the national library of algeria. source: field data, march .

picture . the oldest manuscript in the national library of algeria is a chapter from the quran written on gazelle skin parchment in the third century of hegira ( – ad). source: field data, march .

picture . one of the national library of algeria’s manuscripts: the canon of medicine: part four, by avicenne, written in the th century with commentary by the doctor of sultan saladin, assad bin elias bin gerges mutran. source: field data, march .

manuscripts and rare books digitization initiative in the algerian national library

the announcement of the manuscripts and rare books digitization project was made in by the director general of the national library; however, this was not the first attempt. in , after the experience gained and equipment received from the juma almajid center for culture and heritage, the manuscripts and rare books department began to collaborate with the reprography department to digitize frequently requested manuscripts in order to protect them from the damage that can be caused by regular use. picture shows a scanner from this time.

moreover, since , patrons have been forbidden from physically browsing manuscripts and rare books, so users are required to apply for a digital copy to be made. through this process, the reprography department brings materials from a safe and documents the physical state of manuscripts before and after digitization.

after the minister of culture ordered the launch of a digitization project to preserve and disseminate algerian manuscript heritage, the initiative was supposed to turn into a long-term project, but unfortunately the national library was unable to achieve this. in the national agency for management of culture major project execution announced a call for bids to study and manage the digitization of the algerian national library’s heritage. kirtas, an international company which uses robotic bound document scanning systems, obtained the tender and used the kabis™ scanner to digitize the manuscripts. after a while, the scanner started to shred parts of the most fragile manuscripts. to avoid further damage, the manuscripts and rare books department suspended the digitization procedures in march . the only exception is for urgent researcher requests approved by the national library director general. in these cases, staff use a copybook™ onyx rgb scanner (see picture ), which was purchased in to digitize other library resources such as periodicals and the collection of the maghreb.
this failed program, which was caused by a lack of understanding of the physical characteristics and fragility of manuscripts and rare books, was an unfortunate episode in the history of the algerian national library. the library came out of this experience with digitized manuscripts and rare books, stored on dvds in the reprography department without the minimum preservation requirements or standardized bibliographic description.

the dilemma facing the manuscripts and rare books department and the reprography department is deciding which department should take charge of the digitization procedures, from metadata creation through to long-term preservation and networked access. this conflict is caused by two main factors: the first is the inadequacy of the national library’s organizational structure to support such a project; the second is the instability of the general administration because of the continual change in the director general post (see table ).

table . the development of the national library of algeria’s manuscripts collection through the years.
year | number
source: field data, march .

picture . book eye a scanner used in manuscripts and rare books digitization in . source: field data, march .

picture . copybook™ onyx rgb scanner used in manuscripts and rare books digitization for urgent requests since .

planning for the future

in order to move the algerian national library’s digitization program forward, the authors solicited the opinions of staff involved in digitization. this small study was conducted in the manuscripts and rare books department and the reprography department of the library. both questionnaires and interviews were used to collect information from the working group in charge of digitizing manuscripts and rare books.

three questionnaires were distributed to the digitization work team members.
the first questionnaire was designed to measure the planning level of the digitization project, to understand the issues faced in transforming the manuscripts and rare books digitization from an initiative into a sustainable project. the second questionnaire sought responses about the skills and qualifications of the digitization team in the reprography department concerning the standardized procedures of manuscripts and rare books digitization. the third questionnaire was designed to gather responses about problems displaying digitized manuscripts through the library’s network.

discussion of findings

planning for the digitization project

according to the guidelines for planning the digitization of rare books and manuscripts collection (ifla, ), digitization projects need to begin with a set of questions regarding the goals of the project, funding, staffing, equipment, and legal issues:

• what is the vision for the project? what are the goals and objectives?
• who will use it? how will they use it?
• who should be involved in the planning?
• are there external funding opportunities?
• what level of complexity can be achieved?
• what do you want to digitize and why?
• are there any copyright issues regarding the materials?
• should the digitization be accomplished in house or by external service providers? do you have the space, money, equipment, and expertise? what can an external vendor provide?
• what is the final format of the project? do you have the means to achieve it?
• is a social networking component envisioned, such as crowd-sourced transcription or metadata enhancement?
• how will you incorporate quality management into all stages of the project?

the first questionnaire was designed to understand the extent to which the manuscripts and rare books digitization plan in the national library of algeria provided a shared understanding of such steps.
the main results are summarized in table , which shows the responses of the digitization team members concerning planning.

the vision and goals of the manuscripts and rare books digitization were clear to the project members: all of them stated that the plan for digitization is clear and understandable, and five team members confirmed that its purposes are both access and preservation. however, there was no consensus among the project members about their involvement in project planning. the digitization team was split into two equal groups: three members confirmed their participation in the planning of the digitization project, while the others affirmed their exclusion from the planning procedures. this split seems to be related to the instability within the institution, with the program management shifting in from a group of workers nominated by the director general to a new project plan ordered by the minister of culture.

in regard to external funding, four respondents reported that there was none, for reasons unknown to them. in addition, there was disagreement regarding selection and copyright. two responses confirm that the selection of manuscripts and rare books for digitization was based on specific criteria that included copyright, while two other responses denied the existence of any specific selection criteria. the possible reasons for the varied perceptions of the program are the lack of information sharing between team members and unfamiliarity with manuscript and rare books copyright issues.

table . the directors general of the national library of algeria between and .
name | period
mahmoud agha bouayad | –
abdelkrim bajdajda | –
abdulatif rahal | –
aissa moussa mouhamed | –
miloud abbes | –
abdelallah besseriani | –
amine zaoui | –
sid ahmed oussadit | –
azzedine mihoubi | –
madjid dahman | –march
bengana yasser | march –now
source: field data, march .
there are also conflicting responses concerning the equipment used in the digitization process, suggesting gaps in knowledge about the appropriate equipment for digitizing manuscripts and rare books.

digitization skills

according to staff interviews, a contributing factor to the failure of the digitization project was the lack of qualified staff ( january , personal communication). to better understand the skill level of the project team, a questionnaire was used to determine the familiarity of staff with the standardized digitization procedures outlined in the ifla guidelines for planning the digitization of rare books and manuscript collections (see table ).

the most important feature observed in the responses listed in table is the presence of uncertainty regarding these procedures, as seen in the negative answers. this suggests a disparity in training level within the digitization team.

this variance in understanding of standardized procedures is also evident in knowledge of the care and preparation of materials for digitization. in addition, there seems to be a lack of understanding regarding metadata procedures. on the other hand, the team appears to have a strong grasp of standardized methods of ensuring image quality to achieve proper resolution, colour, depth and lighting.

based on this evidence, it is clear that the digitization team would benefit from further training with an emphasis on the preparation of manuscripts and metadata.

access to digital collections

there is broad consensus among team members that the algerian national library is digitizing for both access and preservation. however, there have been no attempts to make the collections available on the library’s network. the digitized manuscripts are now stored on dvds in the reprography department, waiting to be displayed. a third questionnaire sought to discover the prospects for making manuscripts available online. the main findings are summarized in table .

responses from the work team suggest that lack of equipment is the largest impediment to providing online access to manuscripts. there is a vital need for network equipment such as servers and file storage. in addition, the lack of metadata knowledge is a contributing factor in the failure to provide access.

interviews confirm that an ambiguous digitization policy can directly affect the access initiative, since department heads in charge of digitization do not share a vision regarding the conditions necessary to provide access.

table . the planning level of the manuscripts and rare books digitization project.
indicators | agree | disagree | uncertain
vision and goals
planning involvement
external funding
selection and copyright issues
appropriate equipment
need for training
quality management
source: field data, march .

table . digitization skills and qualifications level.
indicators | agree | disagree | uncertain
preparing the materials
metadata
image quality
image processing
source: field data, march .

table . network access to digital collections.
reasons for lack of network access to manuscripts | responses
lack of required equipment
lack of metadata
lack of qualified staff
other
source: field data, march .

conclusion

clearly, the recent failure of digitization initiatives within the algerian national library does not stem from a lack of initiative or of vision regarding the need to make these treasures available to scholars and the public. interviews with staff and questionnaires suggest that problems with the planning process and a lack of shared understanding among team members were the largest contributing factors. as projects such as this falter, library administrators need to reconsider which units and institutions have the ability to organize and implement digitization projects. in addition, intensive training programs on the physical needs of manuscripts and rare books, metadata standards, and equipment for access are essential.

in conclusion, this paper provides a unique case study and attempts to provide formative evidence toward the improvement of a stalled digitization project in the algerian national library. the study shows how interrelated issues, ranging from planning and communication to training and the availability of appropriate equipment, contribute to the success of a large digitization program. overall, the study suggests that a functioning organizational structure and stability are essential to providing the environment required to avoid conflicting views and visions related to a program’s mission.

declaration of conflicting interests

the author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

funding

the author(s) received no financial support for the research, authorship, and/or publication of this article.

notes

1. french archaeologist and philologist ( – ). founder and creator of the museum and library of algiers. graduated from the school of charters. secretary to marshal bertrand clauzel ( – ). member of several scientific committees in algeria. founded the algerian historical society.
2. french orientalist of irish descent ( – ). member of the academy of inscriptions and letters (elected in ). principal interpreter for the french army of africa. professor of arabic at the special school of oriental languages in paris.
3. berber language.
4. use of arabic script to transcribe european languages.
5.
in , the digitization team of juma almajid center for culture and heritage came to the national library of algeria with algerian presidential permission to digitize the manuscripts collections. unfortunately, the digital copies left behind were lost and there is no record of which manuscripts were digitized, not a catalog or even a list. all that remains is the digital camera used at that time and some digitization skills passed on to local team members, who left the national library many years ago.
6. a public institution under the direction of the ministry of culture, with a mission to study and manage major cultural infrastructure.

references

ali s ( ) preservation and accessibility of manuscripts in the national library of algeria. masters thesis, university of constantine , algeria.
ben mukadem r and ben yahiya f ( ) from the most rare and precious manuscripts in the national library of algeria. algiers: national library of algeria.
bouderbane a and boukerzaza k ( ) les manuscrits de la bibliothèque national d’algérie: la numérisation et la recherche scientifique. in: nd melcom international conference, cordoba, spain, – april .
bounfikha f ( ) algerian manuscript intellectual production at the national library of algeria: analytic study of bibliographic description excluded manuscripts. masters thesis, university of algiers, algeria.
fagnan e ( ) manuscripts general catalog of the national library of algeria. algiers: national library of algeria.
general secretariat of government ( ) journal officiel: executive decree no. - containing the basic law of the national library. available at: http://www.joradp.dz/har/index.htm (accessed august ).
international federation of library associations and institutions ( ) guidelines for planning the digitization of rare books and manuscripts collection.
available at: http://www.ifla.org/files/assets/rare-books-and-manuscripts/rbms-guidelines/ifla_guidelines_for_planning_the_digitization_of_rare_book_and_manuscripts_collections_january_ .pdf (accessed august ).
lebel g ( ) la nouvelle bibliothèque nationale d’alger. bulletin des bibliothèques de france. available at: http://bbf.enssib.fr/consulter/bbf- - - - (accessed august ).
sahnouni m and de heizelin j ( ) the site of ain hanech revisited: new investigations of this lower pleistocene site in northern algeria. journal of archaeological science. doi: . /jasc. . .
le soir d’algérie ( ) dar mustapha pacha un riche patrimoine. available at: www.lesoirdalgerie.com/articles/ / / /article.php?sid= &cid= (accessed july ).
venis b ( ) bibliothèque nationale d’alger ancienne résidence des deys d’el djezaïr. available at: algerroi.fr/alger/bibliotheque_nationale/textes/ _biblio_algeria .htm (accessed july ).
wedgeworth r ( ) world encyclopedia of library and information services. chicago, il: american library association.

author biographies

nadjia ghamouh is a professor at the institute of library science and documentation at the university of constantine - abdelhamid mehri, algeria.

meriem boulahlib is a phd candidate at the institute of library science and documentation at the university of constantine - abdelhamid mehri, algeria.
abstracts تافطتق review article: indigenous cultural heritage preservation: a review essay with ideas for the future راكفأبةيدقنةعجارُم:ةيلصألابوعشلليفاقثلاثارتلاظفح:لاقم :ةيلبقتسُم loriene roy :ةصصختُملاالفإلاةلجمنم ، مقرددعلا صاخلايفاقثلاثارتلاظفحتاربخةيدقنلاةعجارُملاهذهضرعت دعبهفاشكتساوهقيقحتنكمُييفامو،تابتكملايفةيلصألابوعشلاب ًالاجماًضيأدعُييذلاو،مادقألانمريثكلاهأطتمليذلالاجملااذهيف تاسسؤملاهذهتنمتئايتلاةيفاقثلاتاعمتجُملابررضلاقحلُيدقاًكئاش يتلاورييغتثادحإل؛ةريفولاصرفلالاقملاضرعيامك،اهتاورثىلع .لايجألابيلاسأوتايكولسيفتارييغتىلإجاتحت the digital library in the re-inscription of african cultural heritage
dale peters, matthias brenzinger, renate meyer, mandy noble, niklas zimmer
from ifla journal, issue no. :
african digital libraries have developed beyond the nineties debate over 'preservation and access' and the need to convert cultural heritage (un)systematically into digital collections. the challenge now lies in responding nimbly to user needs and in following a more systematic approach to selecting material for digitization. this paper examines the symbiotic relationship between preservation, cultural heritage and scholarship in a case study on the description and documentation of extinct african languages. it proposes that the new focus should lie in digital scholarship, creating more room for technical innovation and intellectual engagement in revisiting the digital library and in correcting ephemeral records through a new scholarly interpretation of african cultural heritage.

storing and sharing wisdom and traditional knowledge in the library
jenny bossaller, brooke shannon
from ifla journal, issue no. :
traditional libraries focus on print collections and on developing collections of published material, that is, material that has passed through some stage of review. this leaves room for collections to miss certain kinds of knowledge, such as the knowledge, beliefs and experience of indigenous peoples. human libraries and other non-print collections represent less traditional forms of knowledge. this paper describes the relationship between wisdom and knowledge in the context of a study of the everyday information practices of kenyan university women. the women photographed everyday events and described what they saw. the work showed wide variation in what was presented as wisdom and as knowledge. because the women described this in relation to their education, it underlines the need to reconsider positivist assumptions in library science and to bring out what the women called stored knowledge. but how can wisdom be stored and shared?

the challenges of reconstructing cultural heritage: an international digital collaboration
rachel heuberger, laura e.
leone, renate evers
from ifla journal, issue no. :
the digitization of the freimann collection, made up of unique works belonging to the academic study of judaism, was an international collaborative initiative to recreate online a collection of jewish cultural heritage that suffered heavy losses during the second world war. these works include the first engagement with pre-modern jewish texts using modern research methods. working from a catalogue published before the war, the project brought together what remained of the original library in germany and holdings collected in the places to which german-speaking jews were exiled in the united states. making the texts freely available advances scholarship by opening this material to a wide and unlimited audience. digitization is not only essential from the perspective of digital preservation; it also allows researchers to view the works in their intellectual and historical context. the project produced models for international collaboration and for large-scale digitization work.

international federation of library associations and institutions , vol. ( ) – © the author(s). reprints and permission: sagepub.co.uk/journalspermissions.nav. doi: . / . ifla.sagepub.com

born fi dead? special collections and born digital heritage, jamaica
cherry-ann smart
from ifla journal, issue no. :
cultural heritage materials hold a distinctive place in library special collections, but they have not been used sufficiently in teaching because of issues in managing their preservation. this challenge reinforces the idea of confining them within set boundaries, a perception that may affect donor relations and consequently hinder research access. conversely, electronic publishing has increased access to cultural heritage produced by scholars and creators without recourse to intermediaries. this born-digital content represents a new world for libraries, laden with responsibility for making it available now and to future generations. some challenges lie in infrastructure, such as the internet, and others in the human element, which needs further capacity building in areas such as publishing and effective preservation strategies. these skills are essential for fully conceiving the potential loss of cultural heritage and the need for workable mechanisms to manage content; anything less will lead to cultural heritage being lost for ever.

digitization of indian manuscripts heritage: role of national mission for manuscripts
jyotshna
sahoo, basudev mohanty
from ifla journal, issue no. :
india owns one of the oldest, richest and largest collections of manuscripts in the world, which serve as a powerful guardian of india's cultural heritage in its various forms, languages, scripts and subjects. preserving these manuscripts, however, poses a real problem for their custodians because of the country's hot and humid climate. in this context, the paper reports on the commendable efforts made by the national mission for manuscripts (nmm) since , in establishing and supporting manuscript resource centres and manuscript conservation centres and in building a national database of manuscripts. it also presents the current state of digitization of indian manuscript heritage, from digitizing the library's own manuscript collection to creating a digital manuscript library whose holdings are accessible to the whole world.

preserving digital heritage: at the crossroads of trust and linked open data
iryna solodovnik, paolo budroni
from ifla journal, issue no. :
access to digitally preserved resources will remain a challenge however current or future technologies develop. there are many standards and best practices addressing the various aspects of preserving digital material, which calls for sound, well-defined policies and data management plans covering every stage of the material's life cycle. to achieve high levels of data sharing and long-term reuse, the aparsen network recommends establishing an interoperable framework to pave the way for a trusted ring of linked data. to give participants in this ring semantic interoperability, the paper proposes mapping lode-bd metadata. the ring can be further enriched with lod technology to address the problem of trust; libraries should be audited and certified within the european framework for audit and certification.

the universal procedure for library assessment: a statistical model for condition surveys of special collections in libraries
sam capiau, marijn de valk, eva wuyts
from ifla journal, issue no. :
the flanders heritage library organization created a model for determining the extent of damage to library collections and how accessible they are, as part of a universal procedure for library assessment (upla), in order to safeguard the need to preserve and conserve special collections. the model describes a statistical assessment of the damage sustained by a library collection. it determines the extent of
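the damage and the accessibility of the collection. library staff can apply the model themselves after initial training, which ensures that knowledge about caring for library materials is gathered and strengthened within the institution. the statistics produced form a basis for developing preservation policies and collection care programmes, and they can be compared with data gathered by other institutions. the flanders organization is a large body comprising six heritage libraries in the northern region of belgium; it developed this assessment model in - .

cultural heritage digitization projects in algeria case study of the national library
meriem boulahlib, nadjia ghamouh
from ifla journal, issue no. :
the national library of algeria is currently working to digitize algerian cultural heritage, which has become a necessity because of the damage manuscripts have suffered from being handled and read. this study aims to shed light on the challenges facing the digitization of manuscripts and rare books in algeria. it also sets out the national library's aspirations and plans for rapid progress in digitizing these materials.

abstracts (translated from the chinese)

review article: indigenous cultural heritage preservation: a review essay with ideas for the future
loriene roy
ifla journal, - , -
this literature review presents the rich and valuable exploration and achievements of libraries in preserving indigenous cultural heritage. yet the field remains sensitive and potentially harmful for the cultural communities that have entrusted their living treasures to these institutions. opportunities to change this are plentiful, but they need to evolve gradually through changes in the attitudes and approaches of successive generations.

the digital library in the re-inscription of african cultural heritage
dale peters, matthias brenzinger, renate meyer, mandy noble, niklas zimmer
ifla journal, - , -
the development of african digital libraries has moved beyond the 'preservation or access' debate of the -s and the accompanying impulse to convert cultural heritage collections (un)systematically from analogue to digital formats. the challenge now is to respond quickly to user needs and to match digitization selection with a more strategic approach. through a case study on the description and documentation of extinct african languages, this paper investigates the symbiotic relationship between preservation, cultural heritage and scholarship. it argues that the new focus lies in digital scholarship: a new scholarly understanding of african cultural heritage allows technical innovation and intellectual effort to play a greater part in the digital library's evaluation, editing and dissemination of ephemeral records.

storing and sharing wisdom and traditional knowledge in the library
jenny bossaller, brooke shannon
ifla journal, - , -
traditional library practice focuses on developing collections of printed or published material, which means that the documents have gone through evaluation or some degree of vetting. this practice leaves a large body of potential knowledge, such as indigenous knowledge, beliefs and experience, outside library collections. human libraries and other non-print collections represent only a small number of forms of knowledge. this paper describes the relationship between wisdom and knowledge as it emerged from a study of the everyday information practices of kenyan university women. they photographed things that happened in their daily lives and described what they saw. one finding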
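was that wisdom and knowledge were presented differently. because the women's descriptions were tied to their education, we conclude that this shows a need to reconsider positivist assumptions in library science, or, as the women put it, to move wisdom into the stacks. but can wisdom be stored and shared?

the challenges of reconstructing cultural heritage: an international digital collaboration
rachel heuberger, laura e. leone, renate evers
ifla journal, - , -
the digitization of the freimann collection of jewish academic studies is an international collaboration intended to reconstruct virtually a collection of jewish cultural heritage that suffered heavy losses in the second world war. the works include pre-modern jewish religious texts to which modern scholarly research methods were applied for the first time. working from a catalogue published before the war, the project brought together the surviving holdings of the original german library and the collections of german-speaking jews in one of the main places of jewish exile, the united states. the freely accessible texts give readers unrestricted access and thereby strengthen scholarship. digitization and virtual reconstruction are important not only from the standpoint of digital preservation; they also let researchers envisage the works in terms of their cultural and historical significance. the project also provides models for future international collaboration and large-scale digitization workflows.

born fi dead? special collections and born digital heritage, jamaica
cherry-ann smart
ifla journal, - , -
printed special collections of cultural heritage occupy a unique position in libraries. yet the teaching and learning potential of these materials often remains underexplored because of preservation management issues. these challenges reinforce their exclusivity, which can affect donor relations and unintentionally impede research access. electronic publishing, by contrast, has increased access to born-digital cultural heritage products. free of intermediaries, these born-digital producers represent a 'new world order' in which libraries carry the dual responsibility of access and posterity. some challenges are infrastructural, such as internet penetration; others are human, arising from the need to build capacity in areas such as publishing and effective preservation strategies. these skills are essential for fully conceptualizing the potential loss of cultural heritage products and workable mechanisms for managing content; failing that, cultural heritage products will literally be 'born fi dead'.

digitization of indian manuscripts heritage: role of national mission for manuscripts
jyotshna sahoo, basudev mohanty
ifla journal, - , -
india is one of the countries holding the oldest, richest and largest collections of manuscripts in the world. these manuscripts are a powerful medium for preserving india's cultural heritage in many forms, languages, scripts and subjects. for their custodians, the country's hot and humid climate is a serious problem. this paper acknowledges the efforts made by the national mission for manuscripts (nmm) since its launch in , including establishing manuscript resource centres (mrcs) and manuscript conservation centres (mccs) and developing a national database of manuscripts. it also presents the current state of digitization of indian cultural heritage, from the collection of manuscripts to the creation of a globally accessible digital manuscript library (dml).

preserving digital heritage: at the crossroads of trust and linked open data
iryna solodovnik, paolo budroni
ifla journal, - , -
whatever the technology, present or future, access to digitally preserved information resources will always be a challenge. there is a plethora of models, standards and best practices for preserving digital objects. managing digital objects requires well-defined policies and data management plans covering all the workflows within their specific life cycle. to achieve high levels of data sharing and long-term data reuse, aparsen recommends developing an interoperable framework for persistent identifiers, paving the way for a 'ring of trusted persistent identifiers for linked open data'. to enable semantic interoperability in this ring, the paper proposes mapping lode-bd metadata with the framework's ontology. the ring can be further enriched with the lod technology stack, which addresses the trustworthiness of the linked data life cycle while also tackling big data. to earn trust, digital libraries also need to be audited and certified in line with the european framework for audit and certification.

the universal procedure for library assessment: a statistical model for condition surveys of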
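special collections in libraries
sam capiau, marijn de valk, eva wuyts
ifla journal, - , -
to meet the preservation and conservation needs of special collections in libraries, the flanders heritage library foundation developed a damage assessment model, the universal procedure for library assessment (upla). the model describes a statistical assessment of damage to library collections. it determines the extent of the damage and the accessibility of library materials. the model can be carried out by the library's own staff after basic training. this ensures that, through the upla assessment, knowledge about the care of library materials is gathered and strengthened within the organization. the statistics gathered through the screening are an important basis for developing a preservation policy and a collection care programme. they can also be compared with data gathered by other institutions. the flanders heritage library is a network organization made up of six heritage libraries in flanders (the northern region of belgium). the foundation developed the universal procedure for library assessment (upla) in - .

cultural heritage digitization projects in algeria case study of the national library
meriem boulahlib, nadjia ghamouh
ifla journal, - , -
the national library of algeria is currently working to digitize the country's cultural heritage. the project has become essential given the physical damage caused by handling manuscripts during reading. this case study aims to give a full understanding of the challenges of digitizing the country's manuscripts and rare books. the paper also sets out the national library's aspirations and plans for making the digitization of manuscripts and rare books succeed.

abstracts (translated from the french)

review article: indigenous cultural heritage preservation: a review essay with ideas for the future
loriene roy
ifla journal, - , -
this critical review presents the preservation of indigenous cultural heritage in libraries as a field that still offers great potential for exploration and achievement. it nevertheless remains a sensitive area, and a potentially damaging one for the cultural communities that have entrusted their living treasures to these institutions. opportunities to make a difference are many, but they may need to evolve through changes in generational attitudes and approaches.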
the digital library in the re-inscription of african cultural heritage
dale peters, matthias brenzinger, renate meyer, mandy noble, niklas zimmer
ifla journal, - , -
african digital libraries have moved beyond the 'preservation or access' debate of the -s and the accompanying drive to convert analogue cultural heritage collections (un)systematically into digital formats. the challenge now is to respond flexibly to user needs, matching digitization selection with a more strategic approach to research relevance and potential research outcomes. this paper examines the symbiotic relationship between preservation, cultural heritage and scholarship in a case study on the description and documentation of extinct african languages. it suggests that the emphasis should now fall on digital scholarship, allowing both technological innovation and greater intellectual engagement in revisiting the digital library so as to revise, correct and augment ephemeral records through a new scholarly interpretation of african cultural heritage.

storing and sharing wisdom and traditional knowledge in the library
jenny bossaller, brooke shannon
ifla journal, - , -
traditional library practice centres on print collections and on developing collections of published material, which means that the documents have undergone some form of review or vetting. this practice leaves a vast range of potential knowledge out of the collection, including indigenous knowledge, beliefs and experience.
'human libraries' and other non-print collections represent less traditional forms of knowledge. this paper discusses the relationship between wisdom and knowledge as it emerged in a study of the everyday information practices of kenyan university women. the women photographed events from their daily lives and described what they saw. one finding was that wisdom and knowledge were presented differently. because the women described them in relation to their education, we argue that this shows a need to reconsider the positivist assumptions of library science and to take account of what the women call 'wisdom'. but how can wisdom be stored and shared?

the challenges of reconstructing cultural heritage: an international digital collaboration
rachel heuberger, laura e. leone, renate evers
ifla journal, - , -
the digitization of the freimann collection, made up of unique works belonging to the wissenschaft des judentums (the science of judaism), was an international collaborative initiative to reconstruct virtually a jewish cultural heritage collection that was partly lost during the second world war. these works include the first studies of pre-modern jewish religious texts to use academic research methods. working from a catalogue published before the war, the project brought together remnants of the original library collection in germany and collections gathered in one of the main places of exile of german jews, the united states. the open-access texts strengthen scholarship by giving an unlimited audience the means of discovery over the long term.
digitization and virtual reconstruction are not only crucial from the standpoint of digital preservation; they also allow researchers to approach the works in the context of their intellectual and historical significance. the project also yields models for international collaboration and for large-scale digitization work.

born fi dead? special collections and born digital heritage, jamaica
cherry-ann smart
ifla journal, - , -
printed cultural heritage documents held in special collections occupy a distinctive position in libraries. yet their potential for teaching and learning often remains underexploited because of preservation issues. these issues reinforce their exclusivity, a perception that can affect relations with donors and unintentionally impede access for research. electronic publishing, by contrast, has improved access to the born-digital cultural heritage products of scholars and creators. without having to go through intermediaries, these producers of born-digital content represent a 'new world order' for libraries charged with the dual responsibility of access and posterity. some challenges are infrastructural, such as internet penetration; others are human, arising from the need to build capacity in areas such as effective publishing and preservation strategies.
these skills are essential for fully grasping the potential loss of cultural heritage products and the need for workable mechanisms to manage content; anything less would suggest that cultural heritage products are literally 'born fi dead', that is, born to die.

digitization of indian manuscripts heritage: role of national mission for manuscripts
jyotshna sahoo, basudev mohanty
ifla journal, - , -
india has the distinction of holding the oldest, richest and largest manuscript collections in the world. these manuscripts play an important part in preserving india's cultural heritage in different forms, languages, scripts and subjects. but preserving them poses a serious problem for curators because of the country's hot and humid climate. in this context, the paper reports on the commendable efforts made by the national mission for manuscripts (nmm) since it was set up in , through the creation and strengthening of manuscript resource centres and manuscript conservation centres and the development of a national manuscript database. it also presents the current state of digitization of indian manuscript heritage, from the collection of manuscripts to the development of a digital manuscript library accessible to all.
preserving digital heritage: at the crossroads of trust and linked open data
iryna solodovnik, paolo budroni
ifla journal, - , -
whatever the technologies, present or future, access to digitally preserved information resources will always be a challenge. there is a plethora of models, standards and best practices addressing the various facets of preserving digital objects. managing digital objects requires well-defined policies and management plans covering all the procedures in their specific life cycle. to achieve high levels of data sharing and allow long-term data reuse, aparsen recommends setting up an interoperable framework for persistent identifiers, paving the way for a 'ring of trusted persistent identifiers for linked open data'. to enable semantic interoperability within this ring, the paper proposes mapping lode-bd metadata to the ontology of the interoperable framework. the ring can be extended with the lod technology stack to address the reliability of the linked data life cycle while also taking account of big data. to inspire trust, digital libraries must be audited and certified in accordance with the european framework for audit and certification of digital archives.
the universal procedure for library assessment: a statistical model for condition surveys of special collections in libraries
sam capiau, marijn de valk, eva wuyts
ifla journal, - , -
to safeguard the preservation and conservation needs of library special collections, the flanders heritage library foundation devised the universal procedure for library assessment (upla), a model for assessing damage. the model consists of a statistical assessment of the damage sustained by library collections. it determines the extent of the damage and the accessibility of library materials. it can be carried out by the library's own staff after basic training. this allows knowledge about the care of the library collection to be gathered and consolidated within the organization throughout the upla procedure. the statistics gathered during the survey serve as the basis for developing a preservation policy and a collection care programme. they can also be compared with data gathered by other institutions. the flanders heritage library foundation brings together six heritage libraries in flanders (the northern part of belgium). the foundation devised upla in - .

cultural heritage digitization projects in algeria case study of the national library
meriem boulahlib, nadjia ghamouh
ifla journal, - , -
the national library of algeria is currently working to digitize algerian cultural heritage.
this has become imperative because of the physical damage caused by handling manuscripts during reading. this case study aims to shed light on the challenges of digitizing manuscripts and rare books in the algerian context. the paper also presents the national library's aspirations and plans for making this effort to digitize manuscripts and rare books succeed.

abstracts (translated from the german)

review article: indigenous cultural heritage preservation: a review essay with ideas for the future
loriene roy
ifla journal, - , -
this literature review presents the preservation of indigenous cultural heritage in libraries as a field that is still likely to offer meaningful exploration and achievement. for the local communities, however, it is a very sensitive area with potentially harmful consequences, because they have entrusted their living treasures to these institutions. there are countless opportunities to make a difference, but they must evolve through changes in the attitudes and approaches of generations.

the digital library in the re-inscription of african cultural heritage
dale peters, matthias brenzinger, renate meyer, mandy noble, niklas zimmer
ifla journal, - , -
african libraries have moved on since the 'preservation or access' debate of the -s, while the (un)systematic conversion of cultural heritage collections from analogue to digital formats went ahead at the same time.
the challenge today is to adapt flexibly to users' needs, so that selection for digitization strategically reflects both research relevance and potential research outcomes. in a case study on the description and documentation of extinct african languages, this paper examines the symbiotic relationship between preservation, cultural heritage and scholarship. it shows that the new focus lies in digital scholarship, through which a new scholarly interpretation of african cultural heritage enables both technical innovation and greater intellectual engagement in revisiting the digital library to review, correct and augment ephemeral records.

storing and sharing wisdom and traditional knowledge in the library
jenny bossaller, brooke shannon
ifla journal, - , -
in a traditional library everything revolves around print collections and around building collections from published materials, which means that the documents have undergone some kind of review or vetting. in practice, a wide field of potential knowledge remains outside such collections, including indigenous knowledge, beliefs and experience. human libraries and other non-print collections stand for less traditional forms of knowledge. this paper outlines the relationship between wisdom and knowledge, drawing on a study of how kenyan university women handle information in everyday life. the women photographed aspects of their daily lives and described what they saw. one finding was that wisdom and knowledge were presented differently.
because the women described this in relation to their own education, we take it to show a need to reassess positivist assumptions in library science, bringing wisdom, as the women put it, onto the shelves. but how can wisdom be stored and shared?

the challenges of reconstructing cultural heritage: an international digital collaboration
rachel heuberger, laura e. leone, renate evers
ifla journal, - , -
the digitization of the freimann collection, which comprises unique works of the wissenschaft des judentums, rests on a joint international initiative to reconstruct virtually a collection of jewish cultural heritage that was decimated in the second world war. these works were the first to engage with pre-modern jewish religious texts using the modern research methods of the academic world. working from a catalogue published before the second world war, the project brought together remnants of the original library collection in germany and collections held in one of the most important places of exile of german-speaking jews, the united states. the freely accessible texts advance scholarship by making long-term discovery possible for an unlimited audience. digitization and virtual reconstruction are essential not only from the standpoint of digital preservation; they also allow researchers to present the works in the context of their intellectual and historical significance. the project also produced models for international collaboration and for large-scale digitization workflows.

born fi dead? special collections and born digital heritage, jamaica
cherry-ann smart
ifla journal, - , -
printed cultural heritage works in special collections hold a special place in libraries, but their potential for teaching and learning is often underused because of questions about their preservation. these challenges underline their exclusivity, a perception that can affect relations with donors and subtly impede access for research. electronic publishing, by contrast, gives scholars and creators access to born-digital cultural heritage products. unhindered by intermediaries, these creators of digital works represent a new world order for libraries, which face the dual responsibility of access and posterity. some challenges, such as internet penetration, are infrastructural, while others are human and stem from the need to build new capacity in publishing and effective preservation. these skills are crucial for conceptualizing the potential loss of cultural heritage products and the need for future-proof mechanisms for managing content, because anything less would mean that items of cultural heritage are quite literally 'born fi' dead' (born to die, after the title of laurie gunst's book about ghettos in jamaica).
digitization of indian manuscripts heritage: role of national mission for manuscripts
jyotshna sahoo, basudev mohanty
ifla journal, - , -
india is distinguished by having one of the oldest, most comprehensive and largest collections of manuscripts in the world. these manuscripts, in different forms, languages, scripts and subjects, are a valuable means of preserving india's cultural heritage, but because of the country's hot and humid climate their preservation poses a serious problem for their custodians. in this context, the paper shows what the national mission for manuscripts (nmm) has achieved since its founding by setting up and supporting manuscript resource centres (mrc) and manuscript conservation centres (mcc) and by building a national database of manuscripts. it also shows the current state of digitization of indian manuscript heritage, from the collection of manuscripts to the creation of a digital manuscript library (dml) for global access.

preserving digital heritage: at the crossroads of trust and linked open data
iryna solodovnik, paolo budroni
ifla journal, - , -
regardless of today's or tomorrow's technologies, access to digitally preserved information resources will always pose challenges. there is a wealth of models, standards and best practices for the various aspects of preserving digital objects.
managing digital objects requires carefully considered policies and data management plans covering all the processes in their own life cycle. to achieve the best possible data sharing and long-term reuse, aparsen recommends developing a fully interoperable framework for persistent identifiers, paving the way for a 'ring of trusted persistent identifiers for linked open data'. to enable semantic interoperability in such a ring, this article proposes mapping lode-bd metadata to the ontology of the framework. the ring can be further strengthened by the lod technology stack, addressing the reliability of the linked data life cycle while also confronting the question of big data. to be trusted, digital libraries must be audited and certified under the european framework for audit and certification.

the universal procedure for library assessment: a statistical model for condition surveys of special collections in libraries
sam capiau, marijn de valk, eva wuyts
ifla journal, - , -
to safeguard the preservation and conservation needs of special collections in libraries, the flanders heritage library developed the universal procedure for library assessment (upla) as a model for assessing damage. the model describes damage to library collections in statistical terms, showing both the extent of the damage and the accessibility of the library materials.
After completing a basic course, the library's own staff can implement the model. This ensures that, through the UPLA assessment, knowledge about caring for a collection is gathered and consolidated within the organization. The data captured by the assessment then become the building blocks for developing a preservation policy and a collection care programme. These data can also be compared with data collected by other institutions.
The Flanders Heritage Library is a network organization consisting of six heritage libraries in Flanders (the northern part of Belgium). In - , the Flanders Heritage Library developed the Universal Procedure for Library Assessment (UPLA).

Cultural heritage digitization projects in Algeria: case study of the national library
Meriem Boulahlib, Nadjia Ghamouh
IFLA Journal, - , -
The National Library of Algeria is currently working on digitizing Algeria's cultural heritage. These efforts proved necessary because manuscripts were being physically damaged when they were read. This case study aims to shed more light on the challenges that arise when digitizing manuscripts and rare books in the Algerian context. The paper also describes the national library's objectives and plans for making the digitization of rare books a successful undertaking.
IFLA Journal ( )
Abstracts (Russian)

Review article: indigenous cultural heritage preservation: a review essay with ideas for the future
Loriene Roy
IFLA Journal, - , -
This literature review presents indigenous cultural heritage in libraries as a field that is still ripe for meaningful exploration and new achievements. At the same time, the field remains sensitive and potentially harmful to the cultural communities that have entrusted their living treasures to these institutions. There are many ways to change the situation for the better, but they must come from a change in generational viewpoints and approaches.

The digital library in the re-inscription of African cultural heritage
Dale Peters, Matthias Brenzinger, Renate Meyer, Mandy Noble, Niklas Zimmer
IFLA Journal, - , -
African digital libraries have outgrown the "preservation or access" debates of the -s, along with the accompanying rush to convert cultural heritage collections (un)systematically from analogue to digital format. The main challenge now is the agility to respond to user needs, combining selection for digitization with a more strategic approach to the relevance of research and its potential outcomes. This paper examines the symbiotic relationship between preservation, cultural heritage and education through a case study on the description and documentation of endangered African languages.
The central idea of the paper is that, under current conditions, the main focus should be on digital learning, so that on revisiting the digital library, with both technical innovation and more intellectual means, records that are not yet final can be reviewed, corrected and augmented in line with a new scholarly interpretation of African cultural heritage.

Storing and sharing wisdom and traditional knowledge in the library
Jenny Bossaller, Brooke Shannon
IFLA Journal, - , -
Traditionally, libraries focus on print collections and on building holdings from published materials, which means that the documents have undergone some degree of review and analysis. This approach weeds out of the collection a substantial body of potential knowledge, such as indigenous knowledge, beliefs and lived experience. "Living libraries" and other non-commercial collections represent less traditional forms of knowledge. This paper outlines the relationship between wisdom and knowledge that emerged from a study of the information experience of women at a Kenyan university. They photographed events from their everyday lives and described what they saw. One of the findings was a difference in how wisdom and knowledge were presented. Because the women framed their descriptions in relation to their own education, we argue that this points to the need to reconsider positivist assumptions in library science and to place on the shelves what the women called wisdom. But how can wisdom be stored and shared?
The challenges of reconstructing cultural heritage: an international digital collaboration
Rachel Heuberger, Laura E. Leone, Renate Evers
IFLA Journal, - , -
The digitization of the Freimann Collection, unique works belonging to the academic study of Judaism [Wissenschaft des Judentums], was a joint international initiative to reconstruct in virtual form a collection of Jewish cultural heritage that suffered losses during the Second World War. These works included the first engagement with pre-modern Jewish religious texts using the modern research methods accepted in the scholarly community. Starting from a catalogue printed before the war, the project brought together the remnants of the original library in Germany and collections assembled in one of the main places of refuge for exiled German-speaking Jews in the United States. Making the texts freely available widens opportunities for education, because it gives an unlimited readership access to material of lasting significance. Digitization and virtual reconstruction are not only crucial from the standpoint of digital preservation, but also allow researchers to present these works in the context of their intellectual and historical significance. The project also produced models for international collaboration and for workflow organization in large-scale digitization projects.

Born fi dead? Special collections and born digital heritage, Jamaica
Cherry-Ann Smart
IFLA Journal, - , -
Special collections of printed cultural heritage items occupy a distinctive place in libraries. Yet their pedagogical and teaching potential can often remain under-explored because of factors related to preservation. These factors underscore the exclusivity of such items, a perception that can affect relations with donors and inadvertently impede access for research. On the other hand, electronic publishing has widened the access of scholars and creative individuals to born-digital cultural heritage. With no intermediaries in the way, these born-digital producers represent a "new world order" for libraries, which bear a dual responsibility: to provide access to materials and to preserve them for future generations. Some of the pressing challenges are infrastructural, such as Internet penetration; others are human and arise from the need to build capacity in areas such as publishing and effective preservation strategies. These skills are essential for fully grasping the potential loss of cultural heritage items and the need for viable content management mechanisms; anything less would suggest that cultural heritage items are literally "born to die".

Digitization of Indian manuscripts heritage: role of National Mission for Manuscripts
Jyotshna Sahoo, Basudev Mohanty
IFLA Journal, - , -
India is distinguished by having one of the oldest, richest and largest collections of manuscripts in the world.
These manuscripts are a powerful means of preserving India's cultural heritage in a variety of forms, languages, scripts and subjects. However, preserving the manuscripts is a serious problem for their custodians because of the country's hot and humid climate. In this context, the paper reports on the commendable efforts made by the National Mission for Manuscripts (NMM) since its founding in , namely establishing and strengthening Manuscript Resource Centers (MRC) and Manuscript Conservation Centers (MCC), and developing a National Database of Manuscripts. The paper also presents the current state of digitization of India's manuscript heritage, following a manuscript from acquisition through to the creation of a Digital Manuscript Library (DML) open to worldwide access.

Preserving digital heritage: at the crossroads of trust and linked open data
Iryna Solodovnik, Paolo Budroni
IFLA Journal, - , -
Regardless of current or future technologies, access to digitally preserved information resources will always be a source of challenges. There is a huge number of models, standards and best practices covering different areas of digital object preservation. Managing digital objects requires well-defined data management policies and plans that cover all the processes in their specific life cycle. To achieve high levels of data exchange and long-term data reuse, APARSEN (Alliance for Permanent Access to the Records of Science in Europe Network) recommends developing an interoperable framework for persistent identifiers, paving the way for a "ring of trusted persistent identifiers for linked open data".
To ensure the semantic interoperability of such a ring, this paper proposes mapping LODE-BD metadata to the ontology of the framework. The ring can then be extended with the LOD technology stack to address the problem of trust across the linked data life cycle, while the question of big data will also need to be tackled. To earn users' trust, digital libraries must be audited and certified in accordance with the European Framework for Audit and Certification.

The universal procedure for library assessment: a statistical model for condition surveys of special collections in libraries
Sam Capiau, Marijn De Valk, Eva Wuyts
IFLA Journal, - , -
To meet the conservation and preservation needs of special collections in libraries, the Flanders Heritage Library developed the Universal Procedure for Library Assessment (UPLA), a model for damage assessment. The model describes a statistical assessment of the damage to library collections. It determines the extent of the damage as well as the accessibility of the library materials. The model can be implemented by the library's own staff after basic training. Carrying out an assessment under the UPLA procedure allows knowledge about caring for library materials to be accumulated and anchored within the organization. The statistics obtained during the assessment are the building blocks for developing a conservation policy and a collection care programme. They can also be compared with data obtained by other institutions.
The Flanders Heritage Library is a network organization comprising six heritage libraries in Flanders (the northern part of Belgium).
In - , the Flanders Heritage Library developed the Universal Procedure for Library Assessment (UPLA).

Cultural heritage digitization projects in Algeria: case study of the national library
Meriem Boulahlib, Nadjia Ghamouh
IFLA Journal, - , -
The National Library of Algeria is currently making great efforts to digitize Algeria's cultural heritage. This has become an urgent task because manuscripts were being physically damaged when handled for reading. This case study aims to shed light on the main difficulties of digitizing manuscripts and rare books in the Algerian context. The paper also explains the national library's aspirations and plans for turning the digitization of manuscripts and rare books into a thriving undertaking.

Abstracts (Spanish)

Review article: indigenous cultural heritage preservation: a review essay with ideas for the future
Loriene Roy
IFLA Journal, - , -
This literature review presents indigenous cultural heritage preservation in libraries as an area ripe for meaningful exploration and significant achievement. However, the field is still sensitive and potentially harmful to the cultural communities that have entrusted their living treasures to these institutions. It is full of opportunities to make a difference, but generational attitudes and approaches may need to change.
The digital library in the re-inscription of African cultural heritage
Dale Peters, Matthias Brenzinger, Renate Meyer, Mandy Noble, Niklas Zimmer
IFLA Journal, - , -
African digital libraries have moved beyond the "preservation or access" debate of , and the ensuing urge to convert cultural heritage collections from analogue to digital format in an (un)systematic way. The challenge now lies in the agility to respond to user needs, combining selection for digitization with a more strategic approach to the relevance of research and its possible outcomes. This paper analyses the symbiotic relationship between preservation, cultural heritage and research in a case study on the description and documentation of extinct African languages. It proposes that the new focus should be on digital research, enabling both technical innovation and a more intellectual engagement when revisiting the digital library, correcting it and augmenting transitory records through a new scholarly interpretation of African cultural heritage.

Storing and sharing wisdom and traditional knowledge in the library
Jenny Bossaller, Brooke Shannon
IFLA Journal, - , -
Traditional library practice focuses on print collections and on developing collections of published materials, which means that the documents have undergone some process of review or selection. This practice leaves a great deal of potential knowledge out of the collection, such as indigenous knowledge, beliefs and experiences. Human libraries and other non-print collections represent less traditional forms of knowledge.
This paper defines the relationship between wisdom and knowledge that emerged from a study of the everyday information practices of Kenyan university women. The women photographed everyday events in their lives and described what they saw. One of the findings was the divergent presentation of wisdom and knowledge. Because the women described them in relation to their education, we argue that this demonstrates the need to reconsider positivist assumptions in library science, bringing to the shelves what the women called wisdom. So how can wisdom be stored and shared?

The challenges of reconstructing cultural heritage: an international digital collaboration
Rachel Heuberger, Laura E. Leone, Renate Evers
IFLA Journal, - , -
The digitization of the Freimann Collection, unique works belonging to the Wissenschaft des Judentums [Jewish studies], was an international collaborative initiative to virtually reconstruct a Jewish cultural heritage collection that suffered losses during the Second World War. These works included the first approach to pre-modern Jewish religious texts using modern academic research methods. Starting from a catalogue published before the war, the project brought together the remnants of the original library collection in Germany and collections gathered in one of the main places of exile for German-speaking Jews in the United States. Freely accessible texts ensure improved research, since they offer long-term discoverability to an unlimited audience.
Digitization and virtual reconstruction are not only crucial from the standpoint of digital preservation, but also allow researchers to view the works in the context of their intellectual and historical importance. The project also produced models for international collaboration and for large-scale digitization workflows.

Born fi dead? Special collections and born digital heritage, Jamaica
Cherry-Ann Smart
IFLA Journal, - , -
Printed cultural heritage items held in special collections occupy distinct positions in libraries. Their pedagogical and teaching potential may be under-explored because of problems related to preservation management. These challenges reinforce their exclusivity, a perception that can affect relations with donors and unwittingly impede access for research. By contrast, electronic publishing has broadened the access of researchers and creative expressionists to born-digital cultural heritage products. Without intermediaries, these born-digital producers represent a "new world order" for libraries, with a dual responsibility for access and posterity. Some challenges are infrastructural, such as Internet penetration; others are human and arise from the need to build capacity in areas such as publishing and effective preservation strategies. These skills are essential for fully conceptualizing the potential loss of cultural heritage products and the need for viable content management mechanisms; anything less would mean that cultural heritage products are literally born dead.
Digitization of Indian manuscripts heritage: role of National Mission for Manuscripts
Jyotshna Sahoo, Basudev Mohanty
IFLA Journal, - , -
India has the distinction of holding one of the oldest, richest and largest manuscript collections in the world. These manuscripts are a powerful medium for preserving India's cultural heritage in different forms, languages, scripts and subjects. But preserving these manuscripts poses a serious problem for their custodians because of the country's hot and humid climate. In this context, the paper reports on the commendable efforts made by the National Mission for Manuscripts (NMM) since its inception in , establishing and strengthening Manuscript Resource Centers (MRCs) and Manuscript Conservation Centers (MCCs) and developing a national database of manuscripts. The paper also presents the current state of digitization of India's manuscript heritage, from the collection of manuscripts through to the development of a Digital Manuscript Library (DML) for global access.

Preserving digital heritage: at the crossroads of trust and linked open data
Iryna Solodovnik, Paolo Budroni
IFLA Journal, - , -
Regardless of current or future technologies, digital access to preserved information resources will always be a challenge. There are many models, standards and best practices that address different aspects of digital object preservation.
Managing digital objects requires well-defined data management policies and plans that cover all processes within their specific life cycle. To achieve high levels of data exchange and long-term data reuse, APARSEN recommends developing an interoperable framework for persistent identifiers, paving the way for a "ring of trusted persistent identifiers for linked open data". To enable the semantic interoperability of this ring, this article proposes mapping LODE-BD metadata to the ontology of the framework. The ring can be further enriched with the LOD technology stack to tackle the trust problem of the linked data life cycle and the problem posed by big data. To be trusted, digital libraries must be audited and certified in compliance with the European Framework for Audit and Certification.

The universal procedure for library assessment: a statistical model for condition surveys of special collections in libraries
Sam Capiau, Marijn De Valk, Eva Wuyts
IFLA Journal, - , -
To safeguard the conservation and preservation needs of special collections in libraries, the Flanders Heritage Library foundation developed the Universal Procedure for Library Assessment (UPLA), a model for damage assessment. This model describes a statistical damage assessment of library collections and determines the extent of the damage and the accessibility of the library materials. The model can be implemented by the library's own staff after basic training.
This ensures that knowledge about caring for the materials in the library collection is accumulated and consolidated over the course of the UPLA assessment. The statistics gathered in the survey are building blocks for developing a preservation policy and a collection care programme. These statistics can also be compared with data gathered by other institutions.
Flanders Heritage Library is a network organization of six heritage libraries in Flanders (the northern part of Belgium). In - , Flanders Heritage Library developed the Universal Procedure for Library Assessment (UPLA).

Cultural heritage digitization projects in Algeria: case study of the national library
Meriem Boulahlib, Nadjia Ghamouh
IFLA Journal, - , -
The National Library of Algeria is currently striving to digitize Algeria's cultural heritage. This practice became a priority because of the physical damage that manuscripts suffered when handled for reading. This case study attempts to shed light on the challenges of digitizing manuscripts and rare books in the Algerian context. The paper also sets out the national library's aspirations and plans for the successful digitization of manuscripts and rare books.
Deborah Estrin

Deborah Estrin is a Professor of Computer Science at Cornell Tech in New York City, where she holds the Robert V. Tishman Founder's Chair, serves as the Associate Dean for Impact, and is an affiliate faculty member at Weill Cornell Medicine. Estrin's research activities include technologies for caregiving, immersive health, small data, participatory sensing, and public interest technology. Before joining Cornell University, Estrin was the founding director of the NSF Center for Embedded Networked Sensing (CENS) at UCLA, pioneering the development of mobile and wireless systems to collect and analyze real-time data about the physical world. Estrin co-founded the non-profit startup Open mHealth, and has served on several scientific advisory boards for early-stage mobile health startups and as an Amazon Scholar.
Estrin's honors include: ACM Athena Lecture ( ), Anita Borg Institute's Women of Vision Award for Innovation ( ), the IEEE Internet Award ( ), and MacArthur Fellowship ( ). She is an elected member of the American Academy of Arts and Sciences ( ), the National Academy of Engineering ( ), and the National Academy of Medicine ( ). She was awarded honorary doctorates from EPFL ( ) and Uppsala ( ).

Current funders include National Science Foundation (award # and # ), Atlantic Philanthropies, MacArthur Foundation, Cornell University.
Current collaborators include Tanzeem Choudhury, Nicki Dell, Helen Nissenbaum, JP Pollak.
Email: destrin@cornell.edu, Twitter: @deborahestrin, longform CV.

Current students
Hongyi Wen, Eugene Bagdasaryan, Andrea Cuadra, Emily Tseng

Postdocs
Michael Sobolev, Katrin Hansel

Graduated PhD students
Faisal Alquaddoomi, Lee Breslau, Nirupama Bulusu, Alberto Cerpa, Ron Cocchi, Jeremy Elson, Hossein Falaki, Deepak Ganesan, Lewis Girod, Benjamin Greenstein, Ahmed Helmy, Shai Herzog, Andy Hsieh, Bau-Yi Polly Huang, Josh Hyman, Chalermek Intanagonwiwat, Donnie Kim, Teresa Ko, Kanna Satish Kumar, Charley Ching-Gung Liu, Brent Longstaff, Martin Lukac, Danny Mitzel, Minyoung (Min) Mun, Fabian Okeke, Andrew Parker, Graham Phillips, Pavlin Radoslavov, Nithya Ramanathan, Anoop Reddy, Sasank Reddy, Reza Rejaie, Vids Samanta, Thomas Schoellhammer, Puneet Sharma, Thanos Stathopoulos, Gene Tsudik, Kannan Varadhan, Hanbiao Wang, Liming Wei, Ya Xu, Longqi Yang, Haobo Yu, Yan Yu, Daniel Zappala, Jerry Zhao.

Teaching
INFO : MS Specialization Project course

Conference publications (to journals)
Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., & Shmatikov, V. How to backdoor federated learning. In AISTATS. June .
Bagdasaryan, E., Berlstein, G., Waterman, J., Birrell, E., Foster, N., Schneider, F. B., Estrin, D. Ancile: enhancing privacy for ubiquitous computing with use-based privacy. In Proceedings of the th ACM Workshop on Privacy in the Electronic Society (pp. - ).
ACM. November .
Wen, H., Yang, L., Estrin, D. Leveraging post-click feedback for content recommendations. In Thirteenth ACM Conference on Recommender Systems (RecSys ' ), September .
Yang, L., Sobolev, M., Wang, Y., Chen, J., Dunne, D., Tsangouri, C., Dell, N., Naaman, M., Estrin, D. How intention informed recommendations modulate choices: a field study of spoken word content. In The World Wide Web Conference (The Web Conference) (WWW ' ), April .
Chawla, M., Singh, K., Yang, L., Estrin, D. RecBoard: a web-based platform for recommendation system research and development. In The World Wide Web Conference (The Web Conference) (WWW ' ), April .
Yang, L., Wang, Y., Dunne, D., Sobolev, M., Naaman, M., C., Estrin, D. More than just words: modeling non-textual characteristics of podcasts. In Twelfth ACM Conference on Web Search and Data Mining (WSDM ' ), February .
Yang, L., Sobolev, M., Tsangouri, C., Estrin, D. Understanding user interactions with podcast recommendations delivered via voice. In Twelfth ACM Conference on Recommender Systems (RecSys ' ), October .
Wen, H., Yang, L., Sobolev, M., Estrin, D. Exploring recommendations under user-controlled data filtering. In Twelfth ACM Conference on Recommender Systems (RecSys ' ), October .
Yang, L., Cui, Y., Xuan, Y., Wang, C., Belongie, S., Estrin, D. Unbiased offline recommender evaluation for missing-not-at-random implicit feedback. In Twelfth ACM Conference on Recommender Systems (RecSys ' ), October .
Okeke, F., Sobolev, M., Dell, N., Estrin, D. Good vibrations: can a digital nudge reduce digital overload? International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI ), September .
Yang, L., Fang, C., Jin, H., Hoffman, M.D., Estrin, D. Characterizing user skills from application usage traces with hierarchical attention recurrent networks. ACM Transactions on Intelligent Systems and Technology (TIST). June .
Hsieh, C.K., Alquaddoomi, F., Okeke, F., Pollak, JP, Gunasekara, L., Estrin, D.
small data: applications and architecture. proceedings of the fourth international conference on big data, small data, linked data and open data, april . okeke, f., sobolev, m., estrin, d. towards a framework for mobile behavior change research. proceedings of the technology, mind, and society, april . berman, f., rutenbar, r., hailpern, b., christensen, h., davidson, s., franklin, m., martonosi, m., raghavan, p., stodden, v., szalay, a.s., estrin, d. realizing the potential of data science communications of the acm april alquaddoomi, f., estrin, d. ranking subreddits by classifier indistinguishability in the reddit corpus proceedings of the tenth international conference on information, process, and knowledge management, march . yang, l., bagdasaryan, e., gruenstein, j., hsieh, c., estrin, d., openrec: a modular framework for extensible and adaptable recommendation algorithms proceedings of the th acm international conference on web search and data mining (wsdm ' ), february . to appear. openrec.ai yang, l., fang, c., jin, h., hoffman, m.d., estrin, d. . personalizing software and web services by integrating unstructured application usage traces. in proceedings of th international world wide web conference (www), perth, australia, april . hsieh, c.h., yang, l., cui, y., lin, t.y., belongie, s., estrin, d., . collaborative metric learning. in proceedings of th international world wide web conference (www), perth, australia, april kizer, j., sahuguet, a., lakin, n., carroll, m., pollak, jp and estrin, d. . internet scale research studies using sdl-rx. presented at the data for good exchange, september, . yang, l., freed, d., wu, a., wu, j., pollak, jp. and estrin, d. , to appear. your activities of daily living (yadl): an image-based survey technique for patients with arthritis. in proceedings of the th international conference on pervasive computing technologies for healthcare, cancun, mexico, may , pervasive health. hsieh, c.k., yang, l., wei, h., naaman, m. 
and estrin, d. . immersive recommendation: news and event recommendations using personal digital traces. in proceedings of the th international world wide web conference, montréal, québec, canada, april , www . wei, h., hsieh, c., yang, l., estrin, d. . grouplink: group event recommendations using personal digital traces. . in the th acm conference on computer supported cooperative work and social computing, . say, p.r., stein, d., ancker j.s., hsieh a, pollak jp. and estrin, d. . smartphone data in rheumatoid arthritis - what do rheumatologists want? in proceedings of the amia annual symposium, san francisco, ca, november , amia . yang, l., cui, y., zhang, f., pollak, jp., belongie, s. and estrin, d. . plateclick: bootstrapping food preferences through an adaptive visual interface. in proceedings of the th acm international conference on information and knowledge management, melbourne, australia, october , cikm . yang, l., hsieh, c. and estrin, d. . beyond classification: latent user interests profiling from visual contents analysis. in proceedings of ieee international conference on communications smart city and smart world, london, uk, june , data mining workshop (icdmw), ieee international conference. baum, a., carroll, m., estrin, d., gunasekara, l. and pollak, jp. . pushcart: supporting and scaling nutritionist-client relationships. in proceedings of cscw : workshop on moving beyond e-health and the quantified self, vancouver, canada, march, , cscw . alquaddoomi, f., ketcham, c. and estrin, d. . the email analysis framework: aiding the analysis of personal natural language texts. in workshop on linking the quantified self (linkqs), santiago, chile. hsieh, c.k., tangmunarunkit, h., alquaddoomi, f., jenkins, j., kang, j., ketcham, c., longstaff, b., selsky, j., swendeman, d., estrin, d. and ramanathan, n. . lifestreams: a modular sense-making toolset for identifying important patterns from everyday life. 
in proceedings of the th acm conference on embedded networked sensor systems (sensys ), rome italy, november . casillas, j., doose-peña, m., alquaddoomi, f., millán, r., jacobson, a., ganz, p., and estrin, d. . sms to promote risk-based late effect screenings? a technical pilot for young adult cancer survivors. in proceedings of the annual conference of critical mass, the young adult cancer alliance, atlanta, ga, november . hsieh, c.k., falaki, h., ramanathan, n., tangmunarunkit, h. and estrin, d. . hotmobile poster: performance evaluation of android ipc for continuous sensing applications, acm sigmobile mobile computing and communications review. khalapyan, z., tangmunarunkit, h., selsky, j., sakabu, e., rocchio, r. and estrin, d. . hybrid approaches in designing cost efficient and user friendly mobile interface. in proceedings of the th international conference on human-computer interaction with mobile devices and services, workshop on mobility and web behavior, san francisco, september , mobilehci workshop. ramanathan, n., swendeman, d., comulada, s., dawson, b., rotheram-borus, m.j. and estrin, d. . identifying preferences for mobile health applications for self-monitoring and self-management: focus group findings from hiv-positive persons and young mothers. in proceedings of the th world congress on social media. mobile apps. internet/web . , boston, ma, september , medicine . . tangmunarunkit, h., ramanathan, n., falaki, h., jenkins, j., ketcham, c., longstaff, b., monibi, m., ooms, j., parameswaran, k., selsky, j. and estrin, d. . measures and real-time feedback of diet, activity, and stress using gps and accelerometer enabled smartphones. in proceedings of the th world congress on social media. mobile apps. internet/web . , boston, ma, september , medicine . . ramanathan, n., alquaddoomi, f., falaki, h., george, d., hsieh, c.k., jenkins, j., ketcham, c., longstaff, b., ooms, j., selsky, j., tangmunarunkit, h. and estrin, d. . 
ohmage: an open mobile system for activity and experience sampling. in proceedings of the th international conference on pervasive computing technologies for healthcare, may , pervasivehealth. kim, d., han, k. and estrin, d., . employing user feedback for semantic location services. in proceedings of the th international conference on ubiquitous computing (ubicomp' ), beijing, china, september , ubicomp' . estrin, d., acker, a., lukac, m. and gracian, i. . engaging residents in community data gathering using smartphone technology: a proof-of-concept pilot in boyle heights. in proceedings of improvements that work: how to create efficiency in clinical research management, bethesda, md, august , th annual national ctsa community engagement conference (poster). falaki, h., mahajan, r. and estrin, d. . systemsens: a tool for monitoring usage in smartphone research deployments. in proceedings of the th acm international workshop on mobility in the evolving architecture (acmmobiarch ), washington d.c., june . kyungsik, h., graham, e., vassallo, d. and estrin, d. . enhancing motivation in a mobile participatory sensing project through gaming. ieee international conference on social computing (socialcom ): workshop in social connection in the urban spaces socialurb- : - . falaki, h., lymberopoulos, d., mahajan, r., kandula, s. and estrin, d. . a first look at traffic on smartphones. in proceedings of the internet measurement conference (imc), barcelona, spain, november . mun, m., hao, s., mishra, n., shilton, k., burke, j., estrin, d., hansen, m. and govindan, r. . personal data vaults: a locus of control for personal data streams. in proceedings of the th international conference on emerging networking experiments and technologies (conext), philadelphia, pa, november . kim, d., kim, y., estrin, d. and srivastava, m. . sensloc: sensing everyday places and paths using less energy. 
in proceedings of the th acm conference on embedded networked sensor systems (sensys ), zurich, switzerland, november . hicks, j., ramanathan, n., kim, d., monibi, m., selsky, j., hansen, m. and estrin, d. . andwellness: an open mobile system for activity and experience sampling. in proceedings of the wireless health : academic and research conference, la jolla, ca, october . reddy, s., estrin, d., hansen, m. and srivastava, m. . examining micro-payments for participatory sensing data collections. in proceedings of the international conference on ubiquitous computing (ubicomp), copenhagen, denmark, september . ko, t., soatto, s. and estrin, d. . warping background subtraction. in proceedings of the rd ieee conference on computer vision pattern recognition (cvpr), san francisco, ca, june . falaki, h., mahajan, r., kandula, s., lymberopoulos, d., govindan, r. and estrin, d. . diversity in smartphone usage. in proceedings of the th international conference on mobile systems, application, and services (mobisys' ), san francisco, ca, june . estrin, d., participatory sensing: applications and architecture. . in proceedings of the th international conference on mobile systems, applications and services (mobisys' ), san francisco, ca june . reddy, s., estrin, d. and srivastava, m. . recruitment framework for participatory sensing data collections. in proceedings of the th international conference on pervasive computing, helsinki, finland, may . reddy, s., shilton, k., denisov, g., cenizal, c., estrin, d. and srivastava, m. . biketastic: sensing and mapping for better biking. in proceedings of the th acm conference on human factors in computing systems, atlanta, georgia, april - , , chi . longstaff, b., reddy, s. and estrin, d. improving activity classification for health applications on mobile devices using active and semi-supervised learning. in proceedings of the th international conference on pervasive computing technologies for healthcare , munich, germany, march . 
ramanathan, n., schoellhammer, t., kohler, e., whitehouse, k., harmon, t. and estrin, d. . suelo: human-assisted sensing for exploratory soil monitoring studies. in proceedings of the th acm conference on embedded networked sensor systems (sensys ), berkeley, ca, november . kim, d.h., hightower, j., govindan, r. and estrin, d. . discovering semantically meaningful places from pervasive rf-beacons. in proceedings of the th international conference on ubiquitous computing (ubicomp ), orlando, fl, . shilton, k., burke, j., estrin, d., govindan, r. and kang, j. . designing the personal data stream: enabling participatory privacy in mobile personal sensing. in proceedings of the th research conference on communication, information and internet policy (tprc), arlington, va, september . ryder, j., longstaff, b., reddy, s. and estrin, d. . ambulation: a tool for monitoring mobility patterns over time using mobile phones. ieee international conference on social computing: workshop on social computing with mobile phones and sensors: modeling, sensing and sharing, vancouver, canada, august . ko, t., soatto, s. and estrin, d. . categorization in natural time-varying image sequences. computer vision pattern recognition: visual interpretation and understanding workshop, san francisco, ca, june, . mun, m., reddy, s., shilton, k., yau, n., burke, j., estrin, d., hansen, m., howard, e., west, r. and boda, p. . pier, the personal environmental impact report, as a platform for participatory sensing systems research. in proceedings of the th annual international conference on mobile systems, applications and services, acm mobisys , krakow, poland, june . reddy, s., shilton, k., burke, j., estrin, d., hansen, m. and srivastava, m. . using context annotated mobility profiles to recruit data collectors in participatory sensing. in proceedings of the th international symposium on location and context awareness, loca, tokyo, japan, may . lukac, m., davis, p., clayton, r. and estrin, d. . 
recovering temporal integrity with data driven time synchronization. in proceedings of the th acm/ieee international conference on information processing in sensor networks, ipsn , san francisco, ca, april . whitesell, k., kutler, b., ramanathan, n. and estrin, d. . a system for determining indoor air quality from images of an air sensor captured on cell phones. in proceedings of the workshop on applications, systems, and algorithms for image sensing (imagesense' ), raleigh, north carolina, november . hyman, j., hansen, m., graham, e. and estrin, d. . estimating the spectral reflectance of natural imagery using color image features. in proceedings of the workshop on applications, systems, and algorithms for image sensing (imagesense' ), raleigh, north carolina, november . hicks, j., paek, j., coe, s., govindan, r. and estrin, d. . an easily deployable wireless imaging system. in proceedings of the workshop on applications, systems and algorithms for image sensing, (imagesense' ), raleigh, north carolina, november . reddy, s., shilton, k., burke, j., estrin, d., hansen, m. and srivastava, m. . evaluating participation and performance in participatory sensing. in proceedings of international workshop on urban, community, and social applications of networked sensing systems - urbansense , raleigh, north carolina, november . shilton, k., ramanathan, n., samanta, v., burke, j., estrin, d. hansen, m. and srivastava, m. . participatory design of urban sensing networks: strengths and challenges. in proceedings of the participatory design conference, bloomington, indiana, october . reddy, s., burke, j., estrin, d., hansen, m., srivastava, m. . determining transportation mode on mobile phones. in proceedings of the th ieee international symposium on wearable computers (iswc), pittsburgh, pennsylvania, september – october . mun, m., estrin, d., burke, j. and hansen, m. parsimonious mobility classification using gsm and wifi traces. 
in proceedings of the th workshop on embedded networked sensors (hotemnets ), charlottesville, va, june . allen, m., girod, l., newton, r., madden, s., blumstein, d. and estrin, d. . voxnet: an interactive, rapidly-deployable acoustic monitoring platform. in proceedings of the information processing in sensor networks (ipsn ), st. louis, missouri, april - , . shilton, k., burke, j., estrin, d., hansen, m. and srivastava, m. . participatory privacy in urban sensing. in proceedings of the international workshop on mobile device and urban sensing (modus ), st. louis, missouri, april , . allen, m., graham, e., ahmadian, s., ko, t., yuen, e., girod, l., hamilton, m. and estrin, d. . interactive environmental sensing: signal and image processing challenges. in proceedings of the icassp : ieee international conference on acoustics, speech and signal processing, las vegas, nv, usa, march - april . agapie, e., chen, g., houston, d., howard, e., kim, j., mun, m., mondschein, a., reddy, s., rosario, r., ryder, j., steiner, a., burke, j., estrin, d., hansen, m. and rahimi, m. . seeing our signals: combining location traces and web-based models for personal discovery. in proceedings of the th ieee workshop on mobile computing systems and applications (hotmobile ), napa valley, ca, february . ko, t., charbiwala, z., ahmadian, s., rahimi, m., srivastava, m., soatto, s. and estrin, d. exploring tradeoffs in accuracy energy and latency of scale invariant feature transform in wireless camera networks. first acm/ieee international conference on distributed smart cameras (ic dsc ), vienna, austria, september . hyman, j., graham, e., hansen, m. and estrin, d. . imagers as sensors: correlating plant co uptake with digital visible-light imagery. th international workshop on data management for sensor networks (dmsn ), vienna, austria, september . reddy, s., parker, a., hyman, j., burke, j., estrin, d. and hansen, m. . 
image browsing, processing, and clustering for participatory sensing: lessons from a dietsense prototype. in proceedings of the th workshop on embedded networked sensors (emnets ), cork, ireland, june . allen, m., girod, l. and estrin, d. . acoustic laptops as a research enabler. in proceedings of the th workshop on embedded networked sensors (emnets ), cork, ireland, june - , . stathopoulos, t., lukac, m., mcintire, d., heidemann, j., estrin, d. and kaiser, w. . end-to-end routing for dual-radio sensor networks. ieee infocom, , anchorage, alaska, may . estrin, d. . reflections on wireless sensing systems: from ecosystems to human systems ieee radio and wireless symposium. long beach, ca, january . rahimi, m., ahmadian, s., zats, d., garcia, j., srivastava, m. and estrin, d. . magic of numbers in networks of wireless image sensors. workshop on distributed smart cameras (dsc), boulder, colorado, . burke, j., estrin, d., hansen, m., parker, a., ramanathan, n., reddy, s. and srivastava, m. . participatory sensing. world sensor web workshop, acm sensys , boulder, colorado, . gnawali, o., greenstein, b., jang, k., joki, a., paek, j., vieira, m., estrin, d., govindan, r. and kohler, e. . the tenet architecture for tiered sensor networks. acm sensys, november . greenstein, b., mar, c., pesterev, a., farshchi, s., kohler, e., judy, j. and estrin, d. . capturing high-frequency phenomena using a bandwidth-limited sensor network, acm sensys, november . parker, a., reddy, s., schmid, t., saurabh, g., chang, d., burke, j., hansen, m., srivastava, m., estrin, d., paxson, v. and allman, m. . network system challenges in selective sharing and verification for personal, social, and urban-scale sensing applications. in proceedings of the th workshop on hot topics in networks (hotnets-v), irvine, california, november . lee, d., kim, h., tu, s., rahimi, m., estrin, d. and villasenor j.d. . energy-optimized image communication on resource-constrained sensor platforms. 
ieee/acm international conference on information processing in sensor networks: special track on sensor platforms, tools and design methods (spots), november . kohler, m., heaton, t.h., govindan, r., davis, p. and estrin, d. . using embedded wired and wireless seismic networks in the moment-resisting steel frame factor building for damage identification. in proceedings of the th china-japan-u.s. symposium on structural control and monitoring, hangzhou, china, october . girod, l., lukac, m., trifa, v. and estrin, d. . the design and implementation of a self-calibrating distributed acoustic sensing platform. in proceedings of the th acm conference on embedded networked sensor systems (sensys ), october . lukac, m., girod, l. and estrin, d. . disruption tolerant shell. acm sigcomm workshop on challenged networks, pisa, italy, september . chang, k., yau, n., hansen, m. and estrin, d. . sensorbase.org-a centralized repository to slog sensor network data. in proceedings of the international conference on distributed computing in sensor network (dcoss)/euro-american workshop on middleware for sensor networks (eawms), san francisco, ca, june . ramanathan, n., balzano, l., estrin, d., harmon, t., hansen, m., jay, j., kaiser, w. and sukhatme, g. . designing wireless sensor networks as a shared resource for sustainable development. in proceedings of the first international conference on information and communication technologies and development (ictd), berkeley, ca, may . pon, r., batalin, m., chen, v., kansal, a., liu, d., rahimi, m., shirachi, l., somasundara, a., yu, y., hansen, m., kaiser, w., srivastava, m., sukhatme, g. and estrin, d. . coordinated static and mobile sensing for environmental monitoring. ieee international conference on distributed computing in sensor systems (dcoss), . pon, r., batalin, m., gordon, j., kansal, a., liu, d., rahimi, m., shirachi, l., yu, y., hansen, m., kaiser, w., srivastava, m., sukhatme, g. and estrin, d. . 
networked infomechanical systems: a mobile embedded networked sensor platform. ipsn , special track on platform tools and design methods for network embedded sensors (spots), . ramanathan, n., chang, k., kapur, r., girod, l., kohler, e. and estrin, d. . sympathy for the sensor network debugger. in proceedings of the rd acm conference on embedded networked sensor systems (sensys), san diego, ca, november . rahimi, m., baer, r., iroezi, o., garcia, j., warrior, j., estrin, d. and srivastava, m. . cyclops: in situ image sensing and interpretation. in proceedings of the rd acm conference on embedded networked sensor systems (sensys), san diego, ca, november . rahimi, m., hansen, m., kaiser, w., sukhatme, g. and estrin, d. . adaptive sampling for environmental field estimation using robotic sensors. ieee/rsj international conference on intelligent robots and systems, edmonton, canada, august . batalin, m., kaiser, w., pon, r., sukhatme, g., pottie, g., yu, y., gordon, j., rahimi, m. and estrin, d. . task allocation for event-aware spatiotemporal sampling of environmental variables. ieee/rsj international conference on intelligent robots and systems, edmonton, canada, august . wang, h., chen, c.e., ali, a., asgari, s., hudson, r.e., yao, k., estrin, d. and taylor, c. . acoustic sensor networks for woodpecker localization. in proceedings of spie conference on advanced signal processing algorithms, architectures and implementation, san diego, ca, august . ganesan, d., greenstein, b., perelyubskiy, d., estrin, d. and heidemann, j. . multi-resolution storage and search in sensor networks. acm transactions on storage , , - , august . ramanathan, n., yarvis, m., chhabra, j., kurshalnagar, n., krishnamurthy, l. and estrin, d. . a stream-oriented power management protocol for low duty cycle sensor network applications. in proceedings of the nd ieee workshop on embedded sensor networks (emnets), sydney, australia, june . cerpa, a., wong, j., potkonjak, m. and estrin, d. . 
temporal properties of low power wireless links: modeling and implications on multi-hop routing. in proceedings of the th acm international symposium on mobile ad hoc networking and computing (mobihoc ' ), urbana, champaign, illinois, may . cerpa, a., wong, j., kuang, l., potkonjak, m. and estrin, d. . statistical model of lossy links in wireless sensor networks. in proceedings of the acm/ieee th international conference on information processing in sensor networks (ipsn ), los angeles, california, april . pon, r., batalin, m., gordon, j., kansal, a., liu, d., rahimi, m., shirachi, l., yu, y., hansen, m., kaiser, w.j., srivastava m., sukhatme, g. and estrin, d. . networked infomechanical systems: a mobile wireless sensor network platform. ieee/acm fourth international conference on information processing in sensor networks, - . xu, n., rangwala, s., chintalapudi, k., ganesan, d., broad, a., govindan, r. and estrin, d. . a wireless sensor network for structural monitoring. in proceedings of the acm conference on embedded networked sensor systems (sensys ), november . ramanathan, n., kohler, e., girod, l. and estrin, d. . sympathy: a debugging system for sensor networks. workshop record of the st ieee workshop on embedded networked sensors (emnets-i), tampa, florida, november . batalin, m., rahimi, m., yu, y., liu, d., kansal, a., sukhatme, g., kaiser, w., hansen, m., pottie, g., srivastava, m. and estrin, d. . call and response: experiments in sampling the environment. in proceedings of the nd acm conference on embedded networked sensor systems (sensys), - , . girod, l., stathopoulos, t., ramanathan, n., elson, j., estrin, d., osterweil, e. and schoellhammer, t. . a system for simulation, emulation, and deployment of heterogeneous sensor networks. in proceedings of the nd acm conference on embedded networked sensor systems (sensys), . greenstein, b., kohler, e. and estrin, d. . a sensor network application construction kit (snack). 
in proceedings of the nd acm conference on embedded networked sensor systems (sensys), . yu, y., estrin, d., govindan, r. and rahimi, m. . using more realistic data models to evaluate sensor network data processing algorithms. st ieee workshop on embedded networked sensors . stathopoulos, t., kapur, r., estrin, d., heidemann, j. and zhang, l. . application-based collision avoidance in wireless sensor networks. in ieee workshop of embedded networked sensors (emnet) . schoellhammer, t., greenstein, b., osterweil, e., wimbrow, m. and estrin, d. . lightweight temporal compression of microclimate datasets. emnets-i . greenstein, b., kohler, e., culler, d. and estrin, d. . distributed techniques for area computation in sensor networks. the st ieee workshop on embedded networked sensors, emnets-i . kansal, a., rahimi, m., kaiser, w., srivastava, m., pottie, g. and estrin, d. . controlled mobility for sustainable wireless networks. in proceedings of ieee sensor and ad hoc communications and networks (secon), . rahimi, m., hansen, m., kaiser, w., sukhatme, g. and estrin, d. . adaptive sampling for environmental field estimation using robotic sensors. in proceedings of ieee/rsj international conference on intelligent robots and systems, edmonton, canada, august . kansal, a., somasundare, a., jea, d., srivastava, m.b. and estrin, d. . intelligent fluid infrastructure for embedded networking. in proceedings of acm mobisys, june . girod, l., elson, j., cerpa, a., stathopoulos, t., ramanathan, n. and estrin, d. . emstar: a software environment for developing and deploying wireless sensor networks. in proceedings of the usenix technical conference, june . (also available as cens technical report , march , .) pon, r., batalin, m., yu, y., estrin, d., pottie, g., srivastava, m., sukhatme, g. and kaiser, w. . self-aware distributed embedded systems. in proceedings of the th ieee international workshop of future trends of distributed computing systems, - . wang, h., yip, l., yao, k. 
and estrin, d. . lower bounds of localization uncertainty in sensor networks. in proceedings of ieee international conference on acoustics, speech, may . rahimi, m., pon, r., kaiser, w., sukhatme, g., estrin, d. and srivastava, m. . adaptive sampling for environmental robotics. in proceedings of ieee international conference on robotics and automation, - . wang, h., yao, k., pottie, g. and estrin, d. . entropy-based sensor selection in localization. in proceedings of the symposium on information processing in sensor networks (ipsn ' ), berkeley, california, april . ganesan, d., ratnasamy, s., wang, h. and estrin, d. . coping with irregular spatio-temporal sampling in sensor networks. nd workshop on hot topics in networks, (hotnets-ii), november . heidemann, j., silva, f. and estrin, d. . matching data dissemination algorithms to application requirements. in proceedings of the st acm conference on embedded networked sensor systems (sensys ), los angeles, ca, . ganesan, d., greenstein, b., perelyubskiy, d., estrin, d. and heidemann j. . an evaluation of multi-resolution storage for sensor networks. in proceedings of the st acm conference on embedded networked sensor systems (sensys ), los angeles, ca, . rahimi, m., shah, h., sukhatme, g., heidemann, j. and estrin, d. . studying the feasibility of energy harvesting in a mobile sensor network. in proceedings of the ieee international conference on robotics and automation (icra- ), taiwan, september . greenstein, b., estrin, d., govindan, r., ratnasamy, s. and shenker, s. . difs: a distributed index for features in sensor networks. in proceedings of the st ieee international workshop on sensor network protocols and applications, anchorage, ak. may . rahimi, m., shah, h., sukhatme, g., heidemann, j. and estrin, d. . energy harvesting in mobile sensor networks. in proceedings of the ieee international conference on robotics and automation, taipei, taiwan, may . zhao, j., govindan, r. and estrin, d. . 
computing aggregates for monitoring wireless sensor networks. in proceedings of st international workshop on sensor network protocols and applications, anchorage, ak, may . wang, h., elson, j., girod, l., estrin, d. and yao, k. . target classification and localization in habitat monitoring. in proceedings of ieee international conference on acoustics, speech, and signal processing (icassp ), hong kong, china, april . chen, j.c., yip, l., wang, h., maniezzo, d., hudson, r.e., elson, j., yao, k. and estrin, d. . dsp implementation of a distributed acoustical beamformer on a wireless sensor platform. in proceedings of ieee international conference on acoustics, speech, and signal processing (icassp ), hong kong, china. april . bychkovskiy, v., megerian, s., estrin, d. and potkonjak, m. . a collaborative approach to in-place sensor. in proceedings of the nd international workshop on information processing in sensor networks (ipsn ' ), , - . elson, j., girod, l. and estrin, d. . fine-grained network time synchronization using reference broadcasts. in proceedings of the th symposium on operating systems design and implementation (osdi ), boston, ma, - . ganesan, d., estrin, d. and heidemann, j. . dimensions: why do we need a new data handling architecture for sensor networks? in proceedings of the st workshop on hot topics in networks (hotnets-i), princeton, new jersey, - . bulusu, n., bychkovskiy, v., estrin, d. and heidemann, j. . scalable, ad hoc deployable rf-based localization. in proceedings of the grace hopper conference on celebration of women in computing, vancouver, canada, - . braginsky, d. and estrin, d. . rumor routing algorithm for sensor networks. in proceedings of the st acm international workshop on wireless sensor networks and applications (wsna ), - . ratnasamy, s., karp, b., yin, l., yu, f., estrin, d., govindan, r. and shenker, s. . ght: a geographic hash table for data-centric storage. 
in proceedings of the st acm international workshop on wireless sensor networks and applications (wsna ), atlanta, georgia, september , . ratnasamy, s., estrin, d., govindan, r., karp, b., shenker, s., yin, l. and yu, f. . data-centric storage in sensornets. in proceedings of the st workshop on sensor networks and applications (wsna), atlanta, ga, - . girod, l., bychkovskiy, v., elson, j. and estrin, d. . locating tiny sensors in time and space: a case study. in proceedings of the international conference on computer design (iccd ), freiburg, germany, - . elson, j., girod, l. and estrin, d. . short paper: a wireless time-synchronized cots sensor platform part i: system architecture. in proceedings of the ieee cas workshop on wireless communications and networking, pasadena, ca, - . krishnamachari, b., estrin, d. and wicker, s. . the impact of data aggregation in wireless sensor networks. in proceedings of the international workshop on distributed event based systems (debs), vienna, austria, - . intanagonwiwat, c., estrin, d., govindan, r. and heidemann, j.s. . impact of network density on data aggregation in wireless sensor networks. in proceedings of the nd international conference on distributed computing systems (icdcs ' ), vienna, austria, - . ye, w., heidemann, j. and estrin, d. . an energy-efficient mac protocol for wireless sensor networks. in proceedings of the st international annual joint conference of the ieee computer and communications societies (infocom ), new york, ny, - . cerpa, a. and estrin, d. . ascent: adaptive self-configuring sensor networks topologies. in proceedings of the st international annual joint conference of the ieee computer and communications societies (infocom ), new york, ny, - . zhao, y., govindan, r. and estrin, d. . residual energy scans for monitoring wireless sensor networks. in proceedings of the ieee wireless communications and networking conference (wcnc ' ), orlando, fl, - . ganesan, d., govindan, r., shenker, s. 
and estrin, d. . highly-resilient, energy-efficient extended abstract multipath routing in wireless sensor networks. in proceedings of acm symposium on mobile ad hoc networking and computing (mobihoc ), long beach, ca, - . [one of the best poster papers from mobihoc ]. ganesan, d., krishnamachari, b., woo, a., culler, d., estrin, d. and wicker, s. . extended abstract, large-scale network discovery: design tradeoffs in wireless sensor systems. in proceedings of the symposium on operating systems principles (sosp ), lake louise, banff, canada, poster, - . bulusu, n., estrin, d. and heidemann, j. . tradeoffs in location support systems: the case for quality-expressive location models for applications. in proceedings of the ubicomp workshop on location modeling, atlanta, ga, - . elson, j. and estrin, d. . random, ephemeral transaction identifiers in dynamic sensor networks. in proceedings of the st international conference on distributed computing systems (icdcs- ), phoenix, az, - . girod, l. and estrin, d. . robust range estimation using acoustic and multimodal sensing. in proceedings of the ieee/rsj international conference on intelligent robots and systems (iros ). maui, hawaii, - . heidemann, j., silva, f., intanagonwiwat, c., govindan, r., estrin, d. and ganesan, d. . building efficient wireless sensor networks with low-level naming. in proceedings of the symposium on operating systems principles (sosp), lake louise, banff, canada, - . ya, x., heidemann, j. and estrin, d. . geography-informed energy conservation for ad hoc routing. in proceedings of the acm sigmobile th international conference on mobile computing, rome, italy, july . bulusu, n., estrin, d., girod, l. and heidemann, j. . scalable coordination for wireless sensor networks: self-configuring localization systems. in proceedings of the th international symposium on communication theory and applications (iscta ), ambleside, lake district, uk, - . estrin, d., girod, l., pottie, g. and srivastava, m. . 
instrumenting the world with wireless sensor networks. in proceedings of the international conference on acoustics, speech, and signal processing (icassp ), salt lake city, utah, - . ye, w., vaughan, r.t., sukhatme, g.s., heidemann, j., estrin, d. and mataric, m.j. . evaluating control strategies for wireless-networked robots using an integrated robot and network simulation. in proceedings of the ieee international conference on robotics and automation (icra ), seoul, korea, may . radoslavov, p., papadopoulos, c., govindan, r. and estrin, d. . a comparison of application-level and router-assisted hierarchical schemes for reliable multicast. in proceedings of the infocom conference, anchorage, alaska, - . elson, j. and estrin, d. . time synchronization for wireless sensor networks. in proceedings of the international parallel and distributed processing symposium (ipdps), workshop on parallel and distributed computing issues in wireless networks and mobile computing, san francisco, ca, - . cerpa, a., elson, j., estrin, d., girod, l., hamilton, m. and zhao, j. . habitat monitoring: application driver for wireless communications technology. in proceedings of the st acm sigcomm workshop on data communications in latin america and the caribbean, san jose, costa rica, - . bulusu, n., heidemann, j. and estrin, d. . adaptive beacon placement. in proceedings of the st international conference on distributed computing systems (icdcs- ), phoenix, arizona, - . heidemann, j., bulusu, n., elson, j., intanagonwiwat, c., lan, k.c., xu, y., ye, w., estrin, d. and govindan, r. . effects of detail in wireless network simulation. in proceedings of the scs communication networks and distributed systems modeling and simulation conference, phoenix, az, - . kumar, s., alaettinoglu, c. and estrin, d. . scalable object-tracking through unattended techniques (scout). in proceedings of the th ieee international conference on network protocols [ieee icnp- ], osaka, japan, - . 
sukhatme, g.s., estrin, d., caron, d., mataric m. and requicha, a. . proposed approach for combining distributed sensing, robotic sampling, and offline analysis for in situ marine monitoring. in proceedings of advanced environmental and chemical sensing technology (spie ), . helmy, a., gupta, s.k.s., estrin, d., cerpa, a., and yu, y. . systematic performance evaluation of multipoint protocols. in proceedings of the forte conference, pisa, italy, - . reddy, a., estrin, d. and govindan, r. . fault isolation in multicast trees. in proceedings of the acm sigcomm conference, stockholm, sweden, - . intanagonwiwat, c., govindan, r. and estrin, d. . directed diffusion: a scalable and robust communication paradigm for sensor networks. in proceedings of the th annual international conference on mobile computing and networking (mobicom ), boston, ma, - . rejaie, r., yu, h., handley, m., estrin, d. . multimedia proxy caching mechanism for quality adaptive streaming applications in the internet. in proceedings of the ieee infocom conference on computer communications, tel aviv, israel, march . liu, c.g., estrin, d., shenker, s. and zhang, l., . recovery timer adaptation in srm. in proceedings of the international conference in computer communications, iccc' , tokyo, japan, - . rejaie, r., handley, m. and estrin, d. . quality adaptation for congestion controlled video playback over the internet. in proceedings of acm sigcomm ' , cambridge, ma, august-september . estrin, d., govindan, r., heidemann, j. and kumar, s. . next century challenges: scalable coordination in sensor networks. in proceedings of the th annual international conference on mobile computing and networks (mobicom' ), seattle, washington, - . govindan, r., faber, t., heidemann, j. and estrin, d. . ad-hoc smart environments. in proceedings of the darpa/nist workshop on smart environments, atlanta, georgia, - . rejaie, r., handley, m., yu, h. and estrin, d. . 
proxy caching mechanism for multimedia playback streams in the internet. in proceedings of the th international web cache workshop, san diego, ca, march-april, . rejaie, r., handley, m. and estrin, d. . rap: an end-to-end rate-based congestion control mechanism for realtime streams in the internet. in proceedings of the ieee infocom conference, new york, n.y., march . estrin, d., handley, m., helmy, a., huang, p. and thaler, d. . a dynamic bootstrap mechanism for rendezvous-based multicast routing. in proceedings of the ieee infocom conference, new york, n.y., march - , . helmy, a., estrin, d. and gupta, s. . fault-oriented test generation for multicast routing. in proceedings of the forte/pstv' conference, paris, france, november . kumar, s., radoslavov, p., thaler, d., alaettinoglu, c., estrin, d. and handley, m. the masc/bgmp architecture for inter-domain multicast routing. in proceedings of the acm sigcomm conference, vancouver, british columbia, canada, august -september, . helmy, a. and estrin, d. . simulation-based stress testing case study: a multicast routing protocol. in proceedings of the sixth international symposium on modeling analysis and simulation of computer and telecommunications systems (mascots' ), montreal, canada, july . huang, p., estrin, d. and heidemann, j. . enabling large-scale simulations: selective abstraction approach to the study of multicast protocols. in proceedings of the sixth international symposium on modeling analysis and simulation of computer and telecommunications systems (mascots ), montreal, canada, july . heidemann, j., govindan r. and estrin, d. . configuration challenges for smart spaces. in proceedings of the darpa/nist smart spaces workshop, july . varadhan, k., estrin, d. and floyd, s. . impact of network dynamics on end-to-end protocols: case studies in reliable multicast. in proceedings of the rd ieee symposium on computers and communications (iscc' ) conference, athens, greece, june -july, . helmy, a. 
and estrin, d. . stress testing applied to a multicast routing protocol. in proceedings of the sixth international symposium on modeling, analysis, and simulation, ottawa, canada, december . huang, p., estrin, d. and heidemann, j. . enabling large-scale simulations: selective abstraction approach to the study of multicast protocols. in proceedings of the sixth international symposium on modeling, analysis, and simulation, ottawa, canada, december . sharma, p., estrin, d., floyd, s. and jacobson, v. . scalable timers for soft state protocols. in proceedings of ieee infocom ' , kobe, japan, april - , . mitzel, d., estrin, d., shenker, s. and zhang, l. . a study of reservation dynamics in integrated services packet networks. in proceedings of ieee infocom ' , san francisco, ca, march . wei, l. and estrin, d. . multicast routing in dense and sparse modes: simulation study of tradeoffs and dynamics. in proceedings of the international conference on computer communications and networks (icccn), las vegas, nevada, september . wei, l. and estrin, d. . the trade-offs of multicast trees and algorithms. in proceedings of the international conference on computer communications and networks (icccn), september . deering, s., estrin, d., farinacci, d., jacobson, v., liu, c. and wei, l. . an architecture for wide-area multicast routing. in proceedings of acm sigcomm ' , london, england, - , august -september, . mitzel, d., estrin, d., shenker, s. and zhang, l. . an architectural comparison of st-ii and rsvp. in proceedings of ieee infocom ' , toronto, canada, june . wei, l., liaw, f., estrin, d., romanow, a. and lyon, t. . analysis of resequencer model for multicast over atm networks. in proceedings of the rd international workshop on network and operating systems, november . estrin, d., rekhter, y. and hotz, s. . scalable inter-domain routing architecture. in proceedings of acm (sigcomm ' ), baltimore, maryland, - . estrin, d. and mitzel, d. . 
an assessment of state and lookup overhead in routers. in proceedings of ieee infocom ' , florence, italy, may . meyers, k. and estrin, d. . a network management tool for inter-domain policy routing. in proceedings of ieee noms ' network operations and management symposium, memphis, tn, april . cocchi, r., estrin, d., shenker, s. and zhang, l. . a study of priority pricing in multiple service class networks. in proceedings of the acm sigcomm conference, zurich, switzerland, - . estrin, d. and zhang, l. . design considerations for usage accounting and feedback in internetworks. in proceedings of ifip international conference on integrated network management, april . estrin, d. and obraczka, k. . connectivity database overhead for inter-domain policy routing. in proceedings of ieee infocom ' , miami, florida, april . estrin, d. and tsudik, g. . secure policy enforcement in internetworks. in proceedings of dimacs ( ) (series in discrete mathematics and theoretical computer science) workshop, october, , acm/ams dimacs distributed computing and cryptography, . hart, p. and estrin, d. . computer integration: a co-requirement for effective inter-organization computer network implementation. in proceedings of the conference on computer supported cooperative work, october . breslau, l. and estrin, d. . design of inter-administrative domain routing protocols. in proceedings of acm sigcomm ' , , , philadelphia, pa, - . hart, p. and estrin, d. . inter-organization computer networks: indications of shifts in interdependence. in proceedings of acm conference on office information systems, published as sigois bulletin, , - . estrin, d. and tsudik, g. . security issues in policy routing. in proceedings of ieee symposium on research in security and privacy, may . estrin, d. and tsudik, g. . visa scheme for inter-organization network security. in proceedings of ieee symposium on security and privacy, april . estrin, d. . 
the organizational consequences of inter-organization computer networks. in proceedings of acm conference on office information systems published as sigois bulletin, , - . estrin, d. . inter-organization networks: implications of access control requirements for interconnection protocols. in proceedings of acm sigcomm ' , august . estrin, d. . interconnection of private networks: a link between industrial and telecommunications policy. in proceedings of the fourteenth annual telecommunications policy research conference, april . estrin, d. . non-discretionary controls for inter-organization networks. in proceedings of ieee symposium on security and privacy, april . journal publications (to conferences) wen h., sobolev m., vitale r., kizer j., pollak jp, muench f, estrin d., mpulse mobile sensing model for passive detection of impulsive behavior: exploratory prediction study j jmir ment health. jan ; ( ):e . doi: . / . pmid: . sobolev m., vitale r., wen h., kizer j., leeman r., pollak jp, baumel a., vadhan np, estrin d., muench f., the digital marshmallow test (dmt) diagnostic and monitoring mobile health app for impulsive behavior: development and validation study j med internet res mhealth uhealth, ( ):e , january . cole c., sengupta s., rossetti s., vawdrey d.k., halaas m., maddox t.m., gordon g., dave t., payne p.r.o., williams a.e., estrin d. ten principles for data sharing and commercialization j american medical informatics association, ocaa , november . yin a.l., gheissari p., lin i.w., sobolev m., pokkaj j.p., cole c., estrin d. role of technology in self-assessment and feedback among hospitalist physicians: semistructured interviews and thematic analysis j med internet res, ( ):e , november . birnbaum, m.l., wen, h., van meter, a., ernala, s.k., rizvi, a.f., arenare, e., estrin, d., de choudhury, m., kane, j.m. identifying emerging mental illness utilizing search engine activity: a feasibility study plos one, ( ): e , october . 
yin a.l., hachuel d., pollak j.p., scherl e.j., estrin d., cole c. digital health apps in the clinical care of inflammatory bowel disease: scoping review j med internet res, ( ):e , august . casillas, j.n., schwartz, l.f., crespi, c.m., ganz, p.a., kahn, k.l., stuber, m.l., bastani, r., alquaddomi, f. and estrin, d. the use of mobile technology and peer navigation to promote adolescent and young adult (aya) cancer survivorship care: results of a randomized controlled trial. journal of cancer survivorship, ( ), pp. - , august . dodge, h. h., & estrin, d. making sense of aging with data big and small. the bridge, ( ), pp. - , march . searcy, r. p., summapund, j., estrin, d., pollak, j. p., schoenthaler, a., troxel, a. b., & dodson, j. a. mobile health technologies for older adults with cardiovascular disease: current evidence and future directions. current geriatrics reports. doi: . /s - - - , january . swendeman, d., comulada, w.s., koussa, m., worthman, c.m., estrin, d., rotheram-borus, m., ramanathan, n. longitudinal validity and reliability of brief smartphone self-monitoring of diet, stress, and physical activity in a diverse sample of mothers jmir mhealth and uhealth september selter, a., tsangouri, c., ali, s., freed, d., vatchinsky, a., kizer, j., sahuguet, a., vojta, d., vad, v., pollak, j., estrin, d. an mhealth app for self-management of chronic lower back pain (limbr): pilot study jmir mhealth uhealth september . okeke, f., sobolev, m., estrin, d. towards a framework for mobile behavior change research in technology, mind, and society: apascience, washington dc, usa, april . comulada, w.s., swendeman, d., koussa, m.k., mindry, d., medich, m., estrin, d., mercer, n., ramanathan, n. 
adherence to self-monitoring healthy lifestyle behaviours through mobile phone-based ecological momentary assessments … public health nutrition december yang, l., hsieh, c., yang, h., pollak, jp, dell, n., belongie, s., cole, c., estrin, d., yum-me: a personalized nutrient-based meal recommender system. acm transactions on information systems (tois), july . patrick, k., hekler, e.b, estrin, d., mohr, d.c., riper, h., crane, d., godino, j., riley, w.t., . the pace of technologic change: implications for digital health behavior intervention research. american journal of preventive medicine. nov; ( ): - . aung, m. s. h., alquaddoomi, f., hsieh, a., rabbi, m., yang, l., pollak, j.p., estrin, d. and choudhury, t. , to appear. leveraging multi-modal sensing for mobile health: a case review in chronic pain. ieee journal of selected topics in signal processing. estrin, d. and juels, a. winter. reassembling our digital selves. daedalus, , , - (doi: . /daed_a_ ). kumar, s., abowd, g.d., abraham, w.t., al'absi, m., beck, j. g., chau, d. h., condie, t., conroy, d.e., ertin, e., estrin, d., ganesan, d., lam, c., marlin, b., marsh, c. b., murphy, s. a., nahum-shani, i., patrick, k., rehg, j.m., sharmin, m., shetty, v., sim, i., spring, b., srivastava m. and wetter, d.w. . center of excellence for mobile sensor data-to-knowledge (md k). journal of the american medical informatics association. tangmunarunkit, h., hsieh, c.k., longstaff, b., nolen, s., jenkins, j., ketcham, c., selsky, j., alquaddoomi, f., george, d., kang, j., khalapyan, z., ooms, j., ramanathan, n. and estrin, d. . ohmage: a general and extensible end-to-end participatory sensing platform. acm transactions on intelligent systems and technology (tist), , . swendeman, d., comulada, s., worthman, c. and estrin, d. . 
mary jane rotheram-borus, nithya ramanathan, smartphone self-monitoring to support self-management among people living with hiv: perceived benefits and theory of change from a mixed-methods, randomized pilot study. journal of acquired immune deficiency syndromes, jaids. estrin, d. . small data, where n=me. cacm, viewpoint column, communications of the acm, , , - . mun, m.y., kim, d.h., shilton, k., estrin, d., hansen, m., govindan, r. . pdvloc: a personal data vault for controlled location data sharing. acm transactions on sensor networks (tosn), , . chen, c., haddad, d., selsky, j., hoffman, j.e., kravitz, r.l., estrin, d. and sim i. . making sense of mobile health data: an open architecture to improve individual- and population-level health. journal of medical internet research, , . ramanathan, n., swendeman, d., comulada, s., estrin, d. and rotheram-borus, m.j. . identifying preferences for mobile health applications for self-monitoring and self-management: focus group findings from hiv-positive persons and young mothers. international journal of medical informatics. shilton, k. and estrin, d. . ethical issues in participatory sensing. journal of professional and research ethics. kang, j., shilton, k., burke, j. a., estrin, d. and hansen, m. . self-surveillance privacy. iowa law review, , . arab, l., estrin, d., kim, d.h., burke, j. and goldman, j. . feasibility testing of an automated image-capture method to aid dietary recall. european journal of clinical nutrition. estrin, d. and sim, i. . open mhealth architecture: an engine for health care innovation. science magazine, aaas, , , - . ko. t., hyman, j., graham, e., estrin, d. and soatto, s. . embedded imagers: detecting, localizing and recognizing objects and events in natural habitats. sensor networks and applications. ko. t., ahmadian, s., hicks, j., rahimi, m., estrin, d., soatto, s., coe, s. and hamilton, m.p. . heartbeat of a nest: using imagers as biological sensors. 
acm transactions on sensor networks (tosn), , . reddy, s., mun, m., burke, j., estrin, d., hansen, m. and srivastava, m. . using mobile phones to determine transportation modes. acm transactions on sensor networks (tosn), , . estrin, d. . participatory sensing: applications and architecture. internet computing, ieee, , , - . lee, d., kim, h., rahimi, m., estrin, d. and villasenor, j.d. . energy-efficient image compression for resource-constrained platforms. ieee transactions on image processing, , , - . samanta, v., knowles, c., burke, j., wagmister, f., estrin, d. . metropolitan wi-fi research network in the communities of the los angeles state historic park. journal of community informatics, field note, special issue: wireless networking for communities, citizens and the public interest. , . goldman, j., shilton, k., burke, j., estrin, d., hansen, m., ramanathan, n., reddy, s., samanta, v. and srivastava, m. . participatory sensing: a citizen-powered approach to illuminating the patterns that shape our world. white paper published by woodrow wilson international center for scholars. kim, h., rahimi, m., lee, d., estrin, d. and villasenor, j. d. . energy-aware high resolution image acquisition via heterogeneous image sensors. ieee journal of selected topics in signal processing. girod, l., ramanathan, n., elson, j., stathopoulos, t., lukac, m. and estrin, d. . emstar: a software environment for developing and deploying heterogeneous sensor-actuator networks. acm transactions on sensor networks, , , - . abdelzaher, t., anokwa, y., boda, p., burke, j., estrin, d., guibas, l., kansal, a., madden, s. and reich, j. . mobiscopes for human spaces. ieee pervasive computing - mobile and ubiquitous systems, , . hamilton, m. p., graham, e., rundel, p., allen, m., kaiser, w., hansen, m. and estrin, d. new approaches in embedded networked sensing for terrestrial ecological observatories. environmental engineering science, , , - . 
goldman, j., ramanathan, n., ambrose, r., caron, d., estrin, d., fisher, j., gilbert, r., hansen, m., harmon, t., jay, j., kaiser, w., sukhatme, g. and tai, y. . distributed sensing systems for water quality assessment and management. white paper published and prepared by the foresight and governance project at the woodrow wilson international center for scholars. wang, h., yao, k. and estrin, d. . information-theoretic approaches for sensor selection and placement in sensor networks for target localization and tracking. journal of communications and networks, , , - . ramanathan, n., kohler, e. and estrin, d. . towards a debugging system for sensor networks. international journal for network management. wang, h., chen, c.e., ali, a., asgari, s., hudson, r.e., yao, k., estrin, d. and taylor, c. . acoustic sensor networks for woodpecker localization. in proceedings of spie conference on advanced signal processing algorithms, architectures, and implementations. ganesan, d., greenstein, b., estrin, d., heidemann, j. and govindan, r. . multi-resolution storage and search in sensor networks. acm transactions on storage. cerpa, a., wong, j., potkonjak, m. and estrin, d. . temporal properties of low power wireless links: modeling and implications on multi-hop routing. ieee transactions on mobile computing (tmc). elson, j., girod, l. and estrin, d. . emstar: development with high system visibility. ieee wireless communication magazine. bergamo, p., asgari, s., wang, h., maniezzo, d., yip, l., hudson, r., yao, k. and estrin, d. . collaborative sensor networking towards real-time acoustical beamforming in free space and limited reverberance. ieee transactions on mobile computing, , . cerpa a. and estrin, d. . ascent: adaptive self-configuring sensor networks topologies. ieee transactions on mobile computing (tmc), special issue on mission-oriented sensor networks, , , - . ganesan, d., cerpa, a., yu, y., ye, w., zhao, j. and estrin, d. . networking issues in sensor networks. 
journal of parallel and distributed computing (jpdc), special issue on frontiers in distributed sensor networks, elsevier publishers, , , - . radoslavov, p., papadopoulos, c., govindan, r. and estrin, d. . a comparison of application-level and router-assisted hierarchical schemes for reliable multicast. ieee/acm transactions on networking, , , - . szewczyk, r., osterweil, e., polastre, j., hamilton, m., mainwaring, a. and estrin, d. . habitat monitoring with sensor networks. communications of the acm, , , - . helmy, a., gupta, s. and estrin, d. . the stress method for boundary-point performance analysis of end-to-end multicast timer suppression mechanisms. ieee/acm transactions on networking, , . bulusu, n., heidemann, j., estrin, d. and tran, t. . self-configuring localization systems: design and experimental evaluation. acm transactions on embedded computing systems (acm tecs), special issue on networked embedded systems, , , - . greenstein, b., estrin, d., govindan, r., ratnasamy, s. and shenker, s. . difs: a distributed index for features in sensor networks. elsevier journal of ad hoc networks. chen, j.c., yip, l., elson, j., wang, h., maniezzo, d., hudson, r.e., yao, k. and estrin, d. . coherent acoustic array processing and localization on wireless sensor network. in proceedings of the ieee, , . ratnasamy, s., karp, b., shenker, s., estrin, d., govindan, r., yin, l. and yu, f. . data-centric storage in sensornets with ght, a geographic hash table, mobile networks and applications (monet). journal of special issues on mobility of systems, users, data, and computing: special issue on algorithmic solutions for wireless, mobile, ad hoc and sensor networks, kluwer. wang, h., estrin, d. and girod, l. . preprocessing in a tiered sensor network for habitat monitoring. eurasip journal on applied signal processing, , , - . intanagonwiwat, c., govindan, r., estrin, d., heidemann, j. and silva, f. . directed diffusion for wireless sensor networking. 
ieee/acm transactions on networking, , , - . ganesan, d., govindan, r., shenker, s. and estrin, d. . highly-resilient, energy-efficient multipath routing in wireless sensor networks. mobile computing and communications review, , , - . estrin, d., handley, m., heidemann, j., mccanne, s., xu, y. and yu, h. . network visualization with the vint network animator nam. ieee computer magazine, , , - . bulusu, n., heidemann, j. and estrin, d. . gps-less low-cost outdoor localization for very small devices. ieee personal communications, , , - . reddy, a., estrin, d. and govindan, r. . large-scale fault isolation. ieee journal on selected areas in communications (may ) special issue on network management, , , - . breslau, l., estrin, d., fall, k., floyd, s., heidemann, j., helmy, a., huang, p., mccanne, s., varadhan, k., xu, y. and yu, h. . advances in network simulation. ieee computer magazine, , , - . estrin, d., govindan, r. and heidemann, j.s. . embedding the internet: introduction. communications of the acm journal, , , - . varadhan, k., govindan, r. and estrin, d. . persistent route oscillations in inter-domain routing. computer networks and isdn systems journal, , , - . liu, c., estrin, d., shenker, s. and zhang, l. . local error recovery in srm: comparison of two approaches. acm/ieee transactions on networks, , , - . govindan, r., alaettinoglu, c., varadhan, k. and estrin, d. . route servers for inter-domain routing. computer networks and isdn systems, , - . herzog, s., shenker, s. and estrin, d. . sharing the cost of multicast trees: an axiomatic analysis. ieee/acm transactions on networking, , , - . deering, s., estrin, d., farinacci, d., jacobson, v., liu, c. and wei, l. . the pim architecture for wide-area multicast routing. ieee/acm transactions on networks, , , - . shenker, s., clark, d., estrin, d. and herzog, s. . pricing in computer networks: reshaping the research agenda. acm computer communications review journal, , , - . cocchi, r., estrin, d., shenker, s. 
and zhang, l. . pricing the computer networks: motivation, formulation and example. ieee/acm transactions on networks, , , - . estrin, d., steenstrup, m. and tsudik, g. . protocols for route establishment and packet forwarding across multi-domain internets. ieee/acm transactions on networks. zhang, l, deering, s., estrin, d., shenker, s. and zappala, d. . rsvp: a new resource reservation protocol. ieee network magazine, , , - . danzig, p., jamin, s., caceres, r., mitzel, d. and estrin, d. . an empirical workload model for driving wide-area tcp/ip network simulations. journal of internetworking research and experience, , . breslau, l. and estrin, d. . design and evaluation of inter-domain policy routing protocols. internetworking research and experience, , . estrin, d. and tsudik, g. . secure control of transit internetwork traffic. computer networks and isdn systems, , - . hart, p. and estrin, d. . inter-organization networks, computer integration, and shifts in interdependence: the case of the semiconductor industry. acm transactions on information systems, , . estrin, d. and tsudik, g. . an end-to-end argument for network layer, inter-domain access controls. journal of internetworking research and experience, , - . estrin, d. . policy requirements for inter administrative domain routing. computer networks and isdn systems, , - . estrin, d. and steenstrup, m. . inter-domain policy routing: overview of architecture and protocols. acm sigcomm computer communication review, , , - . estrin, d., mogul, j. and tsudik, g. . visa protocols for controlling inter-organizational datagram flow. ieee journal on selected areas in communications, , . estrin, d. . interconnection protocols for interorganization networks. ieee journal on selected areas in communications, sac- , , - . estrin, d. . interconnection of private networks: a link between industrial and telecommunications policy. telecommunications policy, - . estrin, d. . controls for interorganization networks. 
ieee transactions on software engineering, se- , , - . estrin, d. inter-organizational networking: stringing wires across administrative boundaries. computer networks and isdn systems, , - . estrin, d. and sirbu, m. . cable television networks as an alternative to the local loop. journal of telecommunications networks, , , - . presentations national academy of medicine, health technology interest group, annual meeting' , invited speaker: from patient generated data to digital biomarkers and therapeutics, invited speaker, october , (virtual) md sg' , keynote: technologies for caregiving (video), invited speaker, august national academies workshop, keynote: an examination of emerging bioethical issues in biomedical research, invited speaker, washington dc, february barnard college distinguished lecture series, participatory sensing: from ecosystems to human system, computer science, barnard college, february three decades of dimacs, participatory sensing: from ecosystems to human system, rutgers university, november ats women panel, december . node health, invited speaker nyc, december weill women's health symposium, invited speaker, "how data is changing the future of personal health", nyc, october . mayo transform, invited speaker, "personalizing care with small data", mayo clinic, september . open science matters, panelist, sage bionetworks, seattle, july . net@ , invited speaker, mit, boston, july . scientific meeting on big data for better science , keynote, "small data for better health: technologies for personalizing assessments and interventions", royal society of london. february - , . uc berkeley computer science commencement address, berkeley, may . scientific meeting on big data for better science, keynote: "scientific meeting on big data for better science: technologies for measuring behavior", royal society of london. 
february - , undergraduate research summer institute symposium, keynote: "small data and the future of personal health" vassar university, ny. september , national institute of health technology showcase , panelist, "mhealth technology for individualized, n-of- , assessments", washington, dc, june , the national academies of sciences, engineering, and medicine artificial intelligence (ai) and the future of health and society workshop, "i and patient generated data", washington, dc may th, pccw symposium, keynote: "how data will shape the future of personal health", new york city, ny, april , cornell silicon valley annual meeting, keynote: "how data will shape the future of personal health", mountain view, ca, march , th annual centre for behavioral change conference, behaviour change for health: digital & beyond, keynote: "using small data to personalize, sustain, and study health behaviour", university college london, london, england, february , grand rounds, weill cornell medical college department of surgery, "technology and healthcare: experience at cornell tech to date", new york, ny, february , momentum , medstartr/health . nyc. partner presentation, "building healthcare entrepreneurs: health tech at cornell tech," november , grace hopper distinguished lecture, "in pursuit of digital biomarkers", university of pennsylvania, philadelphia, pa november , grand rounds, columbia university, department of psychiatry,"leveraging mobile technologies for health research and intervention," november , future of care conference, invited speaker. "using small data to personalize, sustain and study patient care." 
rockefeller university, new york, new york, october , grand rounds, northwell health's the zucker hillside hospital, "using small data to personalize, sustain, and study health behavior," queens, new york, october , grand rounds, weill cornell medical college department of medicine, "technology and healthcare: experience at cornell tech to date," new york, new york, october th, hackny, invited speaker, "leveraging small (n=me) data", new york, ny, july , rsf summer institute in computational social science, invited speaker, leveraging small data to personalize, sustain, and study (health) behavior, princeton, nj, june , itdothealth: smart decisions, invited panelist, harvard medical school, boston, ma, june - , nih workshop icampam, special presenter, "using small data to personalize, sustain and study health behavior", washington d.c., june , sage assembly, invited speaker, seattle wa, april th, nlm ga biomedical informatics course, invited keynote, "mobile health (mhealth): leveraging mobile technologies for health research and intervention", augusta university, augusta, ga, april israhci, invited keynote, "mobile health and small data". herzliya, israel, january , data transparency lab, columbia university, panel speaker, new york, november , amia, panel speaker, "building a research ecosystem", chicago il, november , mrwjf workshop, ignite talk, " digital marshmallow mobile research study" with fred muench, new york, ny, october th mit idss launch, invited panelist: analyzing our health, "small data for health", cambridge ma, september rd, tnc , keynote, "building the internet of people", prague czech, june , m. 
, google nyc, june , healthtech:apps, gadgets, gizmos conference, nyu, may , biomedical engineering department, cornell university, ithaca, may , uhg innovation summit, minneapolis mn, april , isrii keynote, seattle, april , joint summit on translational research keynote, san francisco, march , mozilla science labs, march , yale university, nyquist distinguished lecture, march , netflix workshop on recommender systems, feburary , nesta health lab: the future of people powered health, invited speaker, "mobile research study platforms and challenges", london, england, february , nsf workshop, invited presentation, "small data fueled applications and services", washington dc, january , aol-technion workshop, "immersive recommendation", haifa, israel, january , global health grand rounds, "small data, big impact: using small data to fuel, personalize, sustain and study health behavior", wcmc, new york, december , pfizer medical leaders summit, "invited panelist", november , internet of things, vint cerf keynote speaker, "invited panelist", november , mount sinainnovations session, mount sinai medical college, with jp pollak, "consumer health and small data", new york, october , cornell food systems global summit, invited speaker. "personalizing nutritional recommendations with small data", cornell university, ithaca ny, october , united states white house special session on citizen science. citizen science communities and health. invited speaker and session moderator. washington dc, september , predictive analytics world for healthcare. invited speaker. boston, september , nsf workshop on future technology to preserve college student health and foster wellbeing. invited speaker and participant. 
northwestern university, chicago, july - , nyu grand rounds, department of public health, june , stanford-presidents council on fitness workshop, invited keynote, june , trust center program for women, invited lecture, berkeley california, june , ddd , invited keynote, florence italy, may , www , invited keynote, florence italy, may hxrefactored, invited keynote, boston, ma, usa, april , future tense conference, invited speaker, washington dc, march , cstb workshop, washington dc, march , beth israel invited seminar, boston, ma, usa, march , lake nona impact forum, invited speaker, "quantified self", lake nona, florida, february , distinguished lecture, school of computer science, georgia tech, inaugural mary jean harrold memorial lecture, atlanta, georgia, november , nyc health tech food forum, invited speaker,"using small data to fuel, personalize, sustain and study incentives", hunter college, nycnovember , techvision speaker series, thompson reuters, nyc, october , bbc future world changing ideas summit, invited speaker, nyc, october , grace hopper celebration for women, invited speaker, phoenix arizona, october , robert wood johnson foundation, invited lunch speaker, "small data as a tool for health innovation", princeton, new jersey, july , university waterloo, distinguished lecture, "mobile health: as a tool for personal and clinical management of chronic disease", canada, june , project mac th anniversary, invited speaker, "small, n=me, data", mit, cambridge, ma, may , institute of medicine special meeting on lake nona, invited speaker, "small, n=me, data", washington dc, may , personal genomics conference, invited speaker, "small, n=me, data", cambridge, ma, april , games change, invited speaker, "small data", nyu, april , nih workshop, keynote, "harnessing "small data" for personalized health promotion", bethesda md, april , health research association annual meeting, keynote, "small data", chicago iii, april , brown university computer science 
department, distinguished lecture, "small, n=me, data", providence, ri, march , south by southwest (sxsw), invited panelist, "nano size me", austin tx, march , privacy and health, invited guest, brian lehrer show, march , simon's rock, invited speaker, hudson new york, jan , mobile, social, cloud meets medicine, conference organizer and session leader, technion university, haifa, israel, december , big data and health, panel member, world international summit on health, doha, qatar, december , nips, keynote, lake tahoe, nv, december , wcmc genomics systems bio seminar, invited seminar, , new york, ny, december , wcmc rogers health policy colloquium, invited seminar, new york, ny, november , at kearney digital business forum, invited speaker, november , aamc annual conference, keynote speaker, philadelphia, pa, november , wireless health conference, keynote speaker, baltimore, md, november , new york city: a data science mecca, panel member, strata conference, new york, ny, october , stanford university, "from internet architecture to mobile health", stanford, ca, october , two sigma, seminar speaker, new york, ny, october , information sciences, distinguished lecture, univ maryland, college park, september , genomics and quantified health, panel member, new york genome center, new york, ny, september , nsf, distinguished lecture, september , nyu govlab, invited seminar, new york, ny, august , dagstuhl workshop on a life shared, keynote, dagstuhl, germany, july , mhealth zone live radio show, invited interview, july , nsf workshop on smart and connected health, keynote, washington, dc, june , hong kong university, distinguished lecture, may , panel member on mobile health at aps annual meeting. 
washington, dc, may , invited presentation to wcmc th year medical students, new york, ny, may , verisign, distinguished lecture, reston, virginia, may , north carolina environmental conference center, "leveraging pervasive mobile technology for environmental and personal health", may , university wisconsin, madison, may , nyu centre for technology and economic development, new york, ny, april , tedmed, washington, dc, april , aaas annual meeting, "transforming health care through mobile platforms...for patients", boston, ma, february nyu-onc privacy workshop, "mhealth data streams, personal health portfolios, and privacy", nyu, february , wcmc board of trustees, "toward mobile behavioral biomarkers", february , weill cornell medical school (wcmc) integrative medicine colloquium, "mobile health (mhealth): from smart phone apps and sensor streams to behavioral biomarkers", january information sciences department seminar, "mobile health (mhealth): from smart phone apps and sensor streams to behavioral biomarkers", cornell university, january nih obssr webinar, "mobile health (mhealth): from smart phone apps and sensor streams to behavioral biomarkers", january

medicine

educational scholarship in the digital age: a scoping review and analysis of scholarly products

brent thoma, teresa chan, javier benitez, michelle lin

mededlife research collaborative
emergency medicine residency program, university of saskatchewan
simulation fellowship program, massachusetts general hospital
department of medicine, division of emergency medicine, mcmaster university
department of emergency medicine, university of california san francisco

abstract
boyer’s framework of scholarship was published before significant growth in digital technology. as more digital products are produced by medical educators, determining their scholarly value is of increasing importance. this scoping systematic review developed a taxonomy of digital products and determined their fit within boyer’s framework of scholarship. we conducted a broad literature search for descriptions of digital products in the medical literature in july using medline, embase, eric, psycinfo, and google scholar. a framework analysis categorized each product using boyer’s model of scholarship, while a thematic analysis defined a taxonomy of digital products. abstracts were found and met inclusion criteria. digital products mapped primarily to the scholarship of teaching ( . %) followed by integration ( . %), application ( . %), and discovery ( . %). a taxonomy of categories was defined. web-based or computer assisted learning ( %) was described most frequently. we found that digital products are well described in medical literature and fit into boyer’s framework of scholarship and proposed a taxonomy of digital products that parallel traditional forms of the scholarship of teaching and learning. this research should inform the development of tools to examine the impact and quality of digital products.

correspondence: brent.thoma@usask.ca
date received: june ,
doi: . /winn. .
archived: december ,
keywords: tenure and promotion, academic merit, digital teaching, digital resources, scholarship
citation: brent thoma, teresa chan, javier benitez, michelle lin, educational scholarship in the digital age: a scoping review and analysis of scholarly products, the winnower :e . , , doi: . /winn. .
© thoma et al. this article is distributed under the terms of the creative commons attribution . international license, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.

introduction
in boyer redefined the scope of scholarship in higher education with the definition of four overlapping subtypes of scholarship (discovery, integration, application, and teaching) (boyer ). prior to this redefinition, scholarship was largely considered to consist only of the discovery subtype. boyer’s influential definition paved the way for the recognition of a broader definition of scholarship that included teaching in addition to research. the explosive growth of digital products (resources used for the dissemination of information that exist primarily in digital formats) that has occurred since the internet was democratized in could not be predicted at that time (leiner et al. ). social media, online courses, blogs, podcasts and other digital products have since changed the way we teach, disseminate, and discuss scholarly ideas. their exclusion from traditional scholarly frameworks, combined with a lack of standards to ensure their quality, may explain why they are generally not viewed as scholarship by members of the academic establishment (brabazon ; hendricks ; kirkup ; savage ).

scholars and educators are turning to digital methods for disseminating knowledge and reaching students (priem ). this has resulted in the creation of online communities of practice with benefits including: increased collaboration, enhanced knowledge dissemination, instantaneous scholarly discussion, and the generation of scholarly identity (kirkup ; gruzd, staves, and wilk ; maitzen ; shema, bar-ilan, and thelwall ). arguments against digital products note that they have not proven to be superior and that they require more time to develop (cooke ). the increasing prominence of digital products in medical education and the time being devoted to their development make determining their scholarly value extremely important (cadogan et al. ; matava et al. ; bahner et al. ).

in this scoping review paper, we quantify the increasing prevalence of digital products in the medical literature, develop a taxonomy of digital products, and compare the products in the taxonomy to traditional forms of the scholarship of teaching and learning. we hope that this will increase the awareness of this growing area of educational scholarship and classify digital products so that their value can be understood within the context of their traditional parallels.

methods
in concert with an expert librarian, an expert search strategy was developed using the medline, embase, eric, and psycinfo databases, as they were deemed to be the most likely to provide literature on digital products used in medical education. the search was not limited by year or language, and used the keywords and keyword variations of: (student, medical or medical student or “internship and residency” or intern or resident) and (education, medical or education, medical, graduate or education, medical, undergraduate or “medical education”) and (blog or weblog or microblog or social media or social network or “health . ” or “web . ” or video or youtube or podcast or vodcast or webcast or screencast or wiki or widget or new media or new technology or mobile app or app, collaborative or cooperative behavior or conferencing or crowdsource or rss or “really simple syndication” or computer-assisted instruction or web-based instruction or “access to information” or open access or free access). in addition to this traditional literature search, a previously described google scholar search methodology (chan et al.
) was conducted for five sets of keywords: “blogging and scholarship,” “digital scholarship medicine medical,” “free open access medical education,” “medical blogging” and “tenure and promotion blogging.” the first results for each keyword set were reviewed and relevant results were added to the findings. a title review of the abstracts was performed by one author (bt). abstracts were excluded if (1) there was no english-language abstract, (2) they were duplicates, or (3) they clearly did not address the use of digital products in medicine. the abstracts were coded and classified with a detailed abstract review conducted by two authors (bt, jb). upon abstract review, articles were excluded if (1) no particular digital product was described, (2) the digital product did not meet the criteria for scholarship based on boyer’s model, or (3) upon closer inspection they met the initial exclusion criteria. during the abstract review, two authors (bt, jb) performed both a framework analysis and thematic analysis of the digital products described in the abstracts. two reviewers (bt, jb) classified the digital products described in the first abstracts collaboratively to develop an initial taxonomy and set of definitions for the thematic analysis and to calibrate the coding schemes for the thematic and framework analyses. subsequently, a constant comparator technique was used to perform both analyses, whereby classifications were made independently in batches of approximately abstracts and then compared. the frequent comparisons allowed the reviewers to ensure consistency within the analyses and to refine a consensus definition for each type of digital product in the thematic analysis. when available and necessary, full manuscripts were reviewed to accurately classify the digital products and their form of scholarship. discordant classifications were discussed by the reviewers and resolved by consensus when possible.
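the batch-and-compare step described above can be sketched in python. this is only an illustrative sketch, not the authors' procedure: the names `Coding`, `batches`, and `discordant`, the abstract ids, and the example codes are all hypothetical, and the batch size of 2 is chosen only to keep the example small.

```python
from typing import Dict, FrozenSet, Iterator, List

# Hypothetical representation: abstract id -> set of codes given by one reviewer.
Coding = Dict[str, FrozenSet[str]]


def batches(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of abstract ids for independent coding."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def discordant(first: Coding, second: Coding, batch: List[str]) -> List[str]:
    """Return the ids in a batch where the two reviewers' codes differ."""
    return [aid for aid in batch if first.get(aid) != second.get(aid)]


# Made-up codes for four abstracts (not data from the review).
reviewer_one: Coding = {
    "a1": frozenset({"teaching"}),
    "a2": frozenset({"teaching", "integration"}),
    "a3": frozenset({"application"}),
    "a4": frozenset({"discovery"}),
}
reviewer_two: Coding = {
    "a1": frozenset({"teaching"}),
    "a2": frozenset({"teaching"}),
    "a3": frozenset({"application"}),
    "a4": frozenset({"integration"}),
}

# Compare after each batch; the collected ids are the ones left for
# discussion (and, failing consensus, for a third reviewer).
to_discuss = [
    aid
    for batch in batches(sorted(reviewer_one), size=2)
    for aid in discordant(reviewer_one, reviewer_two, batch)
]
```

comparing after every batch, rather than once at the end, is what lets coding definitions be refined while the review is still in progress.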
when consensus was not reached, a third reviewer (tc) arbitrated disagreements. the third reviewer also audited the excluded abstracts to ensure that they met the review’s exclusion criteria. the year of publication of each abstract was also recorded to demonstrate the prevalence of digital products described each year. while they were conducted concurrently, the two analyses were functionally independent. the thematic analysis was used to derive a taxonomy that described all of the digital products found in the literature. additional items were added to the taxonomy as they were found and the definitions were frequently refined to accurately describe all of the digital products. the purpose of the framework analysis was to determine if and how digital products fit into boyer’s four types of scholarship (boyer, ). digital products were classified as one or more of boyer’s types of scholarship: discovery (original research for the advancement of knowledge), integration (contextualizing information across disciplines or into larger intellectual patterns), application (applying knowledge dynamically to inform and test new theories in an engaged fashion), and/or teaching (systematic study of teaching and learning in the presence of learners) (gale et al. ; boyer ). the intraclass correlation coefficient was calculated to determine a measure of agreement. the definitions resulting from the thematic analysis were assessed to determine if there were traditional scholarly products used for the same purpose. this comparison, while inherently subjective, was conducted to further contextualize the role of each type of digital product.

results
the flow diagram for the literature search, title review, and abstract review is presented in figure .
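because the framework analysis in the methods allows a product to carry more than one of boyer’s types, the number of classifications can exceed the number of products. the python sketch below illustrates that bookkeeping. it is a sketch under stated assumptions: the function name `tally`, the product ids, and the example codes are hypothetical, and expressing each type as a share of all classifications is an assumed denominator, not one confirmed by the source.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

BOYER_TYPES = ("discovery", "integration", "application", "teaching")


def tally(codings: Dict[str, Iterable[str]]) -> Tuple[Counter, Dict[str, float]]:
    """Count classifications per Boyer type and each type's percentage share.

    Assumption: shares are computed over all classifications, so a product
    coded with two types contributes two classifications.
    """
    counts: Counter = Counter()
    for product, types in codings.items():
        unique = set(types)
        unknown = unique - set(BOYER_TYPES)
        if unknown:
            raise ValueError(f"{product}: unknown type(s) {sorted(unknown)}")
        counts.update(unique)
    total = sum(counts.values())
    shares = {
        t: (100.0 * counts[t] / total if total else 0.0) for t in BOYER_TYPES
    }
    return counts, shares


# Made-up codes for three products (not data from the review).
example = {
    "p1": ["teaching"],
    "p2": ["teaching", "integration"],
    "p3": ["application"],
}
counts, shares = tally(example)
# Three products yield four classifications because p2 carries two types.
```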
the thematic and framework analyses were conducted on digital products described by the abstracts that met the inclusion criteria. an abstract published in described the oldest digital product.

figure . diagram illustrating the number of articles excluded through the title and abstract reviews.

the number of digital products described in the published medical literature between and july is illustrated in figure . the number of digital products for was projected to double because our literature search only included articles published through july .

figure . the number of digital products described in the medical literature over time.

framework analysis
table presents the results of the analysis mapping published digital products to boyer’s framework of scholarship (boyer ). the intraclass correlation between the raters was . , but disagreements were ultimately discussed and resolved by consensus. most products ( . %) were categorized under the scholarship of teaching. the scholarship of integration ( . %), application ( . %), and discovery ( . %) were described much less frequently. this table further stratifies these scholarship models based on the categories of digital products, as derived by our thematic analysis. of note, there were some products that could be classified as more than one type of scholarship.

table : types and numbers of digital products mentioned in the literature and classified using boyer's framework of scholarship
digital product | discovery (%) | integration (%) | application (%) | teaching (%) | total
web-based or computer assisted learning | ( ) | ( . ) | ( . ) | ( . )* |
multi-modal products | ( ) | ( . ) | ( . ) | ( . )* |
social network | ( ) | ( . )* | ( . ) | ( . )* |
instructional video | ( ) | ( ) | ( . ) | ( . ) |
online repository | ( ) | ( . ) | ( . ) | ( . )* |
podcast | ( ) | ( ) | ( ) | ( ) |
online course | ( ) | ( ) | ( ) | ( )* |
video podcast | ( ) | ( ) | ( . ) | ( . ) |
blog | ( ) | ( . ) | ( . ) | ( . )* |
open access journal | ( )* | ( . ) | ( ) | ( . ) |
wiki | ( ) | ( ) | ( ) | ( )* |
website | ( ) | ( . ) | ( . ) | ( )* |
online discussion board | ( ) | ( ) | ( ) | ( )* |
e-mail | ( ) | ( ) | ( ) | ( )* |
application ("app") | ( ) | ( ) | ( ) | ( ) |
online textbook | ( ) | ( ) | ( ) | ( )* |
virtual reality | ( ) | ( ) | ( ) | ( )* |
search engine | ( ) | ( ) | ( ) | ( ) |
serious game | ( ) | ( ) | ( ) | ( )* |
total | ( . ) | ( . ) | ( . ) | ( . ) |

in table , the starred numbers represent the most popular type of scholarship for each product. the table includes abstracts that were classified as multiple forms of scholarship, resulting in totals ( ) greater than the number of abstracts reviewed ( ).

thematic analysis
table provides a taxonomy of the digital products described in the literature and derived from the thematic analysis. each of the categories is defined with an example provided. together, web-based learning and computer assisted learning ( %) were the most prevalent forms of digital product. a single category was created for these two types of digital products because prior to the democratization and widespread accessibility of the internet, web-based learning products were classified under the umbrella term of computer assisted learning. the significant overlap between these two terms necessitated their amalgamation into one category in our taxonomy. social networks, instructional videos, online repositories, podcasts, online courses, video podcasts (also known as screencasts or vodcasts), and blogs had roughly similar prevalence and collectively comprised another % of the publications.

table : definitions and examples of digital products.
digital product | definition | example
applications (‘apps’) | a resource downloaded to a smartphone. | irash is an application that allows users to search and learn about various rashes (deveau and chilukuri )
blog | a website used to publish information in periodic posts that are primarily text-based. | a blog was created to host synopses of ‘morning report’ sessions run by chief medical residents (bogoch et al. )
e-mail | a common form of direct electronic messaging between a sender and one or more recipients. | e-mail was used to send questions to teach residents about pediatric emergency medicine (komoroski )
instructional video | a video demonstrating a skill (ie procedure, physical exam finding, ecg or x-ray interpretation, etc). | instructional video used to teach chest tube insertion (davis et al. )
multi-modal products | a product that consists of multiple digital products. | an online course on evidence based medicine and critical appraisal that used video podcasts, a wiki and blogs (tam and eastwood )
online course | a complete curriculum delivered using multiple online modalities. differs from multi-modal products in that it is organized into a formal curriculum. | the online genetic testing curriculum is a course about the ethical, legal, and social implications of genetic testing and counseling (metcalf, tanner, and buchanan )
online discussion board | an online forum that allows users to post and respond to other participants. | a clinical discussion board for learners to describe their rural medicine experiences (baker, eley, and lasserre )
online repository | an online database that resources can be drawn from and added to. | a repository of images of dermatologic findings in darker-skinned patients (ezzedine et al. )
online textbook | a textbook published online. | oditeb (open distributed text book), an online textbook that describes the diagnosis of gastrointestinal tumours (horsch et al. )
open access journal | a journal only available online that publishes articles without access restrictions. | various online journals have been created to decrease cost and allow open-access publication of scientific materials (davis and walters )
podcast | audio recordings that are published periodically with the intent of disseminating knowledge. | surgery podcasts are used to teach core principles to clinical clerks on their surgical rotation (white, sharma, and boora )
search engine | search engines used to find information online. | google, yahoo, dogpile, altavista, metacrawlers and ask were used to find information on scleroderma renal crisis (akbar and yacyshyn )
serious game | an online game designed to educate the players. | emedoffice, a serious game to teach practice management (hannig et al. )
social network | an online platform that allows synchronous and asynchronous communication between individuals. | twitter used to connect teachers with learners (forgie, duff, and ross )
video podcast | videos with embedded audio that are published periodically. differs from instructional videos because it focuses on knowledge rather than skill. | video podcasts used to teach embryology (evans )
virtual reality | a virtual environment used to present learning material. | a virtual reality simulator was used to simulate medical cases (alverson et al. )
web based learning or computer assisted learning | educational modules that may make use of multiple modalities. web-based learning is based online while computer assisted learning is not. these modalities were combined due to substantial overlap. | a web based module on pediatric pain management (ameringer et al. ); a computer based application about occupational lung disease (bresnitz, gracely, and rubenstein )
website | an online webpage that cannot be classified as any other digital product. | case based pediatrics is a website with a list of teaching cases for medical students and residents (falagas, karveli, and panos )
wiki | a website that can be openly edited by end-users. utilizes crowd-sourcing as a method for improving and revising the content. | a wiki site for orthopedic cases, utilizes a scoreboard to encourage participation (ma et al. )

historical parallels
as demonstrated by our framework analysis, digital products can be classified within the types of scholarship described by boyer (boyer ) and most fall under teaching and learning. following the completion of our thematic analysis, the definitions of the digital products were compared with traditional forms of the scholarship of teaching and learning. table outlines the parallels between traditional products and of the digital products described in the thematic analysis. no product was found that was comparable to the digital product ‘virtual reality.’

table : comparing traditional products used for the scholarship of teaching and learning to digital products that are used for this purpose
types of teaching and learning resources | examples of traditional products | examples of digital products
interactive resources | small groups; workshops | online discussion board; social network; wiki
independent study resources | assignments; discussions with tutors; group work; laboratory work | e-mail; online course; serious game; virtual reality; web based and computer assisted learning
audiovisual resources | lecture; skill demonstration | podcast; video podcast; instructional video
point-of-care resources | guidebooks; pocketbooks | applications (‘apps’)
written resources | textbook; printed journals; medical journalism | online textbook; blog; open access journal; website
resource repository | library; library classification system | online repository; search engine

discussion
the growing number of digital
products documented in the literature (figure and ) suggests that medical educators are increasingly using technology to engage in various forms of scholarship. while educators have discussed applying boyer’s traditional definitions of scholarship to digital products (heap and minocha ; pearce et al. ), we provide the first comprehensive framework analysis of these products. our framework analysis found that, following teaching and learning, integration ( . %), application ( . %), and discovery ( . %) were the most frequent types of scholarship found in digital products. we suspect that the digital products were predominantly consistent with scholarship of teaching and learning because, despite boyer’s reclassification of scholarship, educators have traditionally not had their scholarly contributions recognized. literature that assesses their innovations is one way to receive academic recognition for their work. educators should keep in mind that digital products can be scholarly outside of their traditional realm of teaching. for example, boyer’s concept of application was demonstrated by the various ‘apps’ that allow translation of concepts at the point of care (graber, tompkins, and holland ), integration was illustrated by an online textbook that synthesized multiple resources into a single resource (horsch et al. ), and discovery was exemplified by open access online journals that fostered new scientific works (p. m. davis and walters ). social networks were the most versatile product with multiple examples of their use in teaching, application, and integration. the thematic analysis described the diversity of digital products (table ). notably, web-based and computer assisted learning programs were prominently featured in the literature and there has been a recent uptake of social media (nickson and cadogan ; cadogan et al.
). social networks, in particular, seem to have impacted medical education by allowing scholars to share their digital products (boulos, maramba, and wheeler ). a traditional parallel was found for nearly every digital product defined in the thematic analysis. the use of digital products was particularly prominent for the scholarship of teaching and learning. this may be because of their reach, customization, and updatability. whereas scholarly teaching was historically a fleeting event offered to a defined group (i.e. an address that was given in a lecture hall), digital products extend their reach to large numbers of learners who can access them at their convenience. this asynchrony allows learners to customize their experience (i.e. by speeding up or slowing down a lecture) and educators to update their products as needed. that said, there is no compelling evidence that digital products are more effective for learning and they may take more time and resources to develop than traditional products (cooke ). they have also been criticized for their lack of editorial oversight and review (brabazon ; kirkup ). these limitations may limit their widespread endorsement and utilization. further research will be required to determine when and how they should be used. while our results suggest that this research is increasingly being conducted, the role and value of digital products in our current academic schema for scholarship remains poorly defined, and hence, poorly acknowledged. institutions that do acknowledge digital products as scholarship for the purpose of promotion and tenure decisions have difficulty classifying them and quantifying their value relative to other scholarly pursuits (gruzd, staves, and wilk ; cheverie, boettcher, and buschman ; rockwell ; ruiz, mintzer, and leipzig ). 
novel ways to recognize digital products include publishing them on a platform with peer review and publication processes such as mededportal (ruiz, mintzer, and leipzig ; reynolds and candler, christopher ) or conducting educational research to evaluate their efficacy (cheston, flickinger, and chisolm ). regardless, the amount of academic recognition for digital products is relatively low compared to the effort expended to build and maintain them, and this may limit their growth in the future (anderson et al. ; profhacker ).

limitations
while our literature search was intended to be as broad as possible, it is still likely that some digital products were missed since they may not have been reported in the literature. a broader review of grey and non-english literature would not have been feasible given the sheer volume of unreported products. for example, a recent report found that there were english-language blogs and podcasts in emergency medicine alone (cadogan et al. ). additionally, we may have missed digital products of historic significance that were described using terms that are not applicable today. for example, cd-roms were likely to have been considered digital products in the past but were not included in our literature search. missing resources would change the number of products per year represented in figure and make our taxonomy of digital products incomplete. the exclusion of the mededportal database could also be considered a limitation as it publishes many digital products. however, our search explicitly attempted to quantify and describe the digital products described in the literature. mededportal’s publications are digital products, rather than descriptions of them, and for this reason they were considered to be outside of the scope of this review.
finally, our quantification of the rapidly increasing number of digital products described annually in the literature fails to account for the increase in literature that has been published in general (larsen and von ins ). unfortunately, we were unable to accurately quantify this growth for the body of literature that our review assessed. as the amount of research published annually is increasing (larsen and von ins ), the increase in descriptions of digital products would have been less spectacular had we been able to take this into account.
future directions
since the digital products described in the medical literature fit within boyer’s framework, we feel strongly that they should be considered alongside other forms of scholarship. however, given the ease with which some products can be created, better evaluation tools will need to be developed to determine their quality, value, and relative impact. educator portfolios are becoming accepted as a way to provide additional detail to the traditional curriculum vitae, which sub-optimally captures the scholarly efforts of educators (simpson et al. ; baldwin, chandran, and gusic ). in showing that digital products fall within boyer’s framework of scholarship, our findings suggest that we should look to apply other conceptual frameworks of educational scholarship to digital products or online educational resources. frequently, educators lean towards the criteria for assessing scholarship developed by glassick. assessment frameworks such as glassick’s criteria of scholarship are manifest in the aamc toolbox for evaluating educators and could be used to evaluate these portfolios (glassick ; gusic et al. ). table suggests multiple parallels between traditional and digital projects for teaching and learning that could guide how digital products should fit into these portfolios. 
developing a standardized approach would allow promotion committees and administrative leadership to evaluate digital and traditional educational efforts more rigorously. together, boyer and glassick’s respective frameworks provide a roadmap for educators interested in scholarship. digital scholars must take care to ensure that their digital products warrant scholarly respect by ensuring that they stand up to the scrutiny of these recognized conceptual frameworks.
conclusion
digital products are increasingly being described in the medical literature. they are likely to have a substantial impact on medical education and can readily fit into boyer’s established framework of scholarship. our taxonomy shows clear parallels between digital and traditional products and can hopefully provide a framework for further research on digital scholarship.
references
akbar, s, and e yacyshyn. . “is there relevant information about scleroderma renal crisis on most frequently visited internet search engines?” journal of rheumatology ( ): – . alverson, dale c, stanley m saiki, summers kalishman, marlene lindberg, stewart mennin, jan mines, lisa serna, et al. . “medical students learn over distance using virtual reality simulation.” simulation in healthcare : journal of the society for simulation in healthcare ( ): – . doi: . /sih. b e f d . ameringer, suzanne, deborah fisher, sue sreedhar, jessica m ketchum, and leanne yanni. . “pediatric pain management education in medical students: impact of a web-based module.” journal of palliative medicine ( ): – . doi: . /jpm. . . anderson, michael g, donna d alessandro, dawn quelle, rick axelson, lois j geist, and donald w black. . “recognizing diverse forms of scholarship in the modern medical college”, – . doi: . /ijme. b . c. bahner, david p, eric adkins, nilesh patel, chad donley, rollin nagel, and nicholas e kman. . 
“how we use social media to supplement a novel curriculum in medical education.” medical teacher ( ): – . doi: . / x. . . baker, peter g, diann s eley, and kaye e lasserre. . “tradition and technology: teaching rural medicine using an internet discussion board.” rural and remote health ( ): . http://www.ncbi.nlm.nih.gov/pubmed/ . baldwin, constance, latha chandran, and maryellen gusic. . “guidelines for evaluating the educational performance of medical school faculty: priming a national conversation.” teaching and learning in medicine ( ): – . doi: . / . . . bogoch, isaac i, david w frost, suzanne bridge, todd c lee, wayne l gold, daniel m panisko, and rodrigo b cavalcanti. . “morning report blog: a web-based tool to enhance case-based learning.” teaching and learning in medicine ( ): – . doi: . / . . . boulos, maged n kamel, inocencio maramba, and steve wheeler. . “wikis, blogs and podcasts: a new generation of web-based tools for virtual collaborative clinical practice and education.” bmc medical education (january): . doi: . / - - - . boyer, e. . “scholarship reconsidered: priorities of the professoriate” the carnegie foundation for the advancement of teaching: princeton, nj. brabazon, t. ( ). the google effect: googling, blogging, wikis and the flattening of expertise. libri, ( ), - . doi: . /libr. . bresnitz, eddy a, edward j gracely, and harriet l rubenstein. . “a randomized trial to evaluate a computer-based learning program in occupational lung disease.” journal of occupational and environmental medicine ( ). http://journals.lww.com/joem/fulltext/ / /a_randomized_trial_to_evaluate_a_computer_based. .aspx. cadogan, m., b. thoma, t. m. chan, and m. lin. . 
“free open access meducation (foam): the rise of emergency medicine and critical care blogs and podcasts ( - ).” emergency medicine journal, (february). doi: . /emermed- - . chan, teresa m, clare wallner, thomas k swoboda, katrina a leone, and chad kessler. . “assessing interpersonal and communication skills in emergency medicine.” academic emergency medicine ( ): – . doi: . /acem. . cheston, christine c, tabor e flickinger, and margaret s chisolm. . “social media use in medical education: a systematic review.” academic medicine : journal of the association of american medical colleges ( ): – . doi: . /acm. b e ffc . cheverie, joan f., jennifer boettcher, and john buschman. . “digital scholarship in the university tenure and promotion process: a report on the sixth scholarly communication symposium at georgetown university library.” journal of scholarly publishing ( ): – . doi: . /scp. . . cooke, david. . “futurecasting in education technologies: fun new toys and a reality check.” international conference on residency education, plenary session. available at: https://www.youtube.com/watch?v=xcodjukpuec&list=uu z-vvzoq cvwmvdzsh a. retrieved november , . davis, james s, george d garcia, mary m wyckoff, salman alsafran, jill m graygo, kelly f withum, and carl i schulman. . “use of mobile learning module improves skills in chest tube insertion.” the journal of surgical research ( ). elsevier ltd: – . doi: . /j.jss. . . . davis, philip m, and william h walters. . “the impact of free access to the scientific literature: a review of recent research.” journal of the medical library association : jmla ( ): – . doi: . / - . . . . deveau, michael, and suneel chilukuri. . “mobile applications for dermatology.” seminars in cutaneous medicine and surgery ( ). elsevier inc. – . doi: . /j.sder. . . . 
evans, darrell j r. . “using embryology screencasts: a useful addition to the student learning experience?” anatomical sciences education ( ). wiley subscription services, inc., a wiley company: – . doi: . /ase. . ezzedine, k, a amiel, p vereecken, t simonart, b schietse, k seymons, b s ndiaye, et al. . “black skin dermatology online, from the project to the website: a needed collaboration between north and south.” journal of the european academy of dermatology and venereology : jeadv ( ): – . doi: . /j. - . . .x. falagas, matthew e, efthymia a karveli, and george panos. . “infectious disease cases for educational purposes: open-access resources on the internet.” clinical infectious diseases : an official publication of the infectious diseases society of america ( ): – . doi: . / . forgie, sarah edith, jon p duff, and shelley ross. . “twelve tips for using twitter as a learning tool in medical education.” medical teacher ( ): – . doi: . / x. . . gale, nicola k, gemma heath, elaine cameron, sabina rashid, and sabi redwood. . “using the framework method for the analysis of qualitative data in multi-disciplinary health research.” bmc medical research methodology ( ). bmc medical research methodology: . doi: . / - - - . glassick, charles e. . “boyer’s expanded definitions of scholarship, the standards for assessing scholarship, and the elusiveness of the scholarship of teaching.” academic medicine ( ), – . doi: . / - - graber, mark l, david tompkins, and joanne j holland. . 
“resources medical students use to derive a differential diagnosis.” medical teacher : – . doi: . / . gruzd, anatoliy, kathleen staves, and amanda wilk. . “tenure and promotion in the age of online social media.” proceedings of the american society for information science and technology ( ): – . doi: . /meet. . . gusic, m, j amiel, c baldwin, l chandran, r fincher, b mavis, p o’sullivan, et al. . “using the aamc toolbox for evaluating educators: you be the judge!” mededportal. doi: . / ​ mep_ - . . hannig, andreas, nicole kuth, monika Özman, stephan jonas, and cord spreckelsen. . “emedoffice: a web-based collaborative serious game for teaching optimal design of a medical practice.” bmc medical education (january): . doi: . / - - - . heap, tania, and shailey minocha. . “an empirically grounded framework to guide blogging for digital scholarship.” research in learning technology (august). doi: . /rlt.v i . . hendricks, arthur. . “bloggership, or is publishing a blog scholarship? a survey of academic librarians.” library hi tech ( ): – . doi: . / . horsch, a., p. hellerhoff, m. hogg, h. ahlbrink, t. balbacha, liss. t., k. minov, and p. gerhardt. . “concepts of a web-based open distributed textbook for the multimodal diagnostics of gastrointestinal tumours with mri, ct and video-endoscopy addressing students of medicine and students of medical informatics as two different target groups.” studies in health technology and informatics ( ): – . kirkup, gill. . “academic blogging: academic practice and academic identity.” london review of education ( ): – . doi: . / . komoroski, e m. . “use of e-mail to teach residents pediatric emergency medicine.” archives of pediatrics & adolescent medicine ( ): – . doi: . /archpedi. . . larsen, p. o., & von ins, m. ( ). the rate of growth in scientific publication and the decline in coverage provided by science citation index. scientometrics, ( ), - . doi: . /s - - -z. 
leiner, barry m, david d clark, robert e kahn, leonard kleinrock, daniel c lynch, jon postel, larry g roberts, and stephen wolff. . “a brief history of the internet.” computer communication review ( ): – . doi: . / . ma, zhen-sheng, hong-ju zhang, tao yu, gang ren, guo-sheng du, and yong-hua wang. . “orthochina.org: case-based orthopaedic wiki project in china.” clinical orthopaedics and related research ( ): – . doi: . /s - - -z. maitzen, rohan. . “scholarship . : blogging and/as academic practice.” journal of victorian culture ( ): – . doi: . / . . . matava, clyde t, derek rosen, eric siu, and dylan m bould. . “elearning among canadian anesthesia residents: a survey of podcast use and content needs.” bmc medical education (january): . doi: . / - - - . metcalf, mary p, t bradley tanner, and amanda buchanan. . “effectiveness of an online curriculum for medical students on genetics, genetic testing and counseling.” medical education online (january): – . doi: . /meo.v i . . nickson, christopher p, and michael d cadogan. . “free open access medical education (foam) for the emergency physician.” emergency medicine australasia ( ): – . doi: . / - . . pearce, nick, martin weller, eileen scanlon, and melanie ashleigh. . “digital scholarship considered: how new technologies could transform academic work.” in education ( ). http://ineducation.couros.ca/index.php/ineducation/article/view/ / . priem, jason. 
. “beyond the paper.” nature : – . doi: . / a. koh, adeline. . “the challenges of digital scholarship.” chronicle of higher education. retrieved from http://chronicle.com/blogs/profhacker/the-challenges-of-digital-scholarship/ on december , . reynolds, robby j., and christopher s. candler. . “mededportal : educational scholarship for teaching.” journal of continuing education in the health professions ( ): – . doi: . /chp. rockwell, geoffrey. . “on the evaluation of digital media as scholarship.” profession : – . doi: . /prof. . . . . ruiz, jorge g, michael j mintzer, and rosanne m leipzig. . “the impact of e-learning in medical education.” academic medicine ( ): – . doi: . / - - . savage, william w. . “the transom: you can’t spill mustard on a blog.” journal of scholarly publishing ( ): – . doi: . /scp. . . shema, hadas, judit bar-ilan, and mike thelwall. . “research blogs and the discussion of scholarly information.” plos one ( ): e . doi: . /journal.pone. . simpson, deborah, ruth-marie e fincher, janet p hafler, david m irby, boyd f richards, gary c rosenfeld, and thomas r viggiano. . “advancing educators and education by defining the components and evidence associated with educational scholarship.” medical education ( ): – . doi: . /j. - . . .x. tam, chun wah michael, and anne eastwood. . “available, intuitive and free! building e-learning modules using web . services.” medical teacher ( ): – . doi: . / x. . . white, j s, n sharma, and p boora. . “surgery : evaluating the use of podcasting in a general surgery clerkship.” medical teacher ( ): – . doi: . / x. . . 
d-lib magazine september/october volume , number /
enduring access to rich media content: understanding use and usability requirements
madeleine casad, oya y. rieger and desiree alexander
cornell university library
{mir , oyr , dca }@cornell.edu
doi: . /september -casad
abstract
through an neh-funded initiative, cornell university library is creating a technical, curatorial, and managerial framework for preserving access to complex born-digital new media objects. the library's rose goldsen archive of new media art provides the testbed for this project. this collection of complex interactive born-digital artworks is used by students, faculty, and artists from various disciplines. interactive digital assets are far more complex to preserve and manage than single uniform digital media files. the preservation model developed will apply not merely to new media artworks, but to other rich digital media environments. 
this article describes the project's findings and discoveries, focusing on a user survey conducted with the aim of creating user profiles and use cases for born-digital assets like those in the testbed collection. the project's ultimate goal is to create a preservation and access practice grounded in thorough and practical understanding of the characteristics of digital objects and their access requirements, seen from the perspectives of collection curators and users alike. we discuss how the survey findings informed the development of an artist questionnaire to support creation of user-centric and cost-efficient preservation strategies. although this project focuses on new media art, our methodologies and findings will inform other kinds of complex born-digital collections.   introduction despite its "new" label, new media art has a rich -year history, making obsolescence and loss of cultural history an imminent risk. as a range of new media are integrated in art works, these creative objects are becoming increasingly complex and vulnerable due to dependence on many technical and contextual factors (delve, et al., ). the phrase "new media art" denotes a range of creative works that are influenced or enabled by technological affordances. the term also signifies a departure from traditional visual arts (e.g., paintings, drawings, sculpture, etc.). another characteristic of new media art that adds further complications to the preservation process is its interactive nature. works in this genre often entail, and indeed rely on, interactions between artists and viewers/observers. in , cornell university library received a research and development grant from the national endowment for the humanities to design a framework for preserving access to digital art objects. 
the preservation and access frameworks for digital art objects (pafdao) was undertaken in collaboration with cornell university's society for the humanities and the rose goldsen archive of new media art, a collection of media artworks housed in the library's division of rare and manuscript collections. the project aims to develop scalable technical frameworks and associated tools to facilitate enduring access to complex, born-digital media objects, working primarily with a test bed of nearly optical discs from the holdings of the goldsen archive. the preservation model developed will apply not merely to new media artworks, but to other rich digital media environments (for instance see kirschenbaum, et al., ). many of the issues we have been addressing within the framework of this project apply to other rich digital contents, not limited to artistic productions. from the beginning, the project team has recognized that both metadata frameworks and access strategies would need to address the needs of future as well as current media art researchers. toward that end, we developed a survey targeting researchers, artists, and curators to expand our understanding of users and use cases. this article summarizes key findings of the survey and describes their impact on our current preservation and access frameworks and future plans.   about the collection the ultimate aim of the pafdao project is to create generalizable new media preservation and access practices that will be applicable for different media environments and institutional types. the nature of the project's test collection, a set of cd-rom artworks from cornell's rose goldsen archive of new media art , has meant that the project provides a case study in new media preservation that may be informative to library and museum contexts alike. rose kohn goldsen ( - ) was a professor of sociology at cornell university and an early critic of commercial mass media's impact on social and ethical imagination. 
named in her honor, the rose goldsen archive of new media art was founded in by professor timothy murray (director, society for the humanities, cornell university) in the cornell library division of rare and manuscript collections as an international research collection for scholars of new media and media art history (murray, ). since its founding, the goldsen archive has grown to achieve global recognition as a prominent research collection that documents more than years of the history of aesthetic experimentation with electronic communications media. these collections span the two most crucial decades in the emergence of digital media art, from to the present, tracing the historical shift in emphasis within media culture from disc-based to networked and web-based applications. they also mark the early stirrings of a networked, interactive digital culture that has subsequently become the global norm. the goldsen archive constitutes a vital record of our cultural and aesthetic history as a digital society. the pafdao project focused on a subset of born-digital media artworks on cd-rom. these artworks were created for small-screen, single-user experience, and date back as far as the early s. the cultural significance of such artworks is great. among other things, they represent the early development of interactive interfaces that are now a major part of our everyday life, and artists' exploration of the expressive possibilities that these new multimedia interfaces have to offer. despite their cultural value, and their relatively recent production, such artifacts present serious preservation challenges and obsolescence risks. to begin with, no archival best practices yet exist for preserving such assets. many are stored on fragile storage media like optical discs, meaning that physical damage as well as data degradation or "bit rot" pose serious dangers to the integrity of the information. 
in the case of the pafdao project's test collection, many of these discs were artist-produced and irreplaceable. interactive digital assets are, furthermore, far more complex to preserve and manage than single, uniform digital media files. a single interactive work can comprise an entire range of digital objects and dependencies, including media files in different types and formats, applications to coordinate the files, and operating systems to run the applications. if any part of this complex system fails, the entire asset can become unreadable. this danger is especially acute in the case of artworks. in most cases, interactive digital artworks are designed to create unique, multimedia experiences for users. an even relatively minor problem with an artwork's rendering—for example, an obsolete media player that no longer operates as expected—has the potential to significantly compromise an artwork's "meaning." simply migrating information files to another storage medium is not enough to preserve their most important cultural content. when the pafdao project began, approximately percent of the artworks in the test collection could not be accessed at all without using legacy hardware—a specialized computer terminal that runs obsolete software and operating systems. the project's objective was to provide "best-feasible" access to artworks, and document the distance between "feasible" and "ideal," as well as we could understand it. very soon after beginning pafdao, the project team realized that, contrary to our initial assumptions, operating system emulation would be a viable access strategy at scale for our complex digital media holdings (for information about emulation, see lange, ). embracing emulation as an access strategy meant that the team could provide better access more easily to more artworks in the collection. 
though increasingly feasible, however, emulation is not always an ideal access strategy: emulation platforms can introduce rendering problems of their own, and emulation usually means that users will experience technologically out-of-date artworks with up-to-date hardware. this made it all the more important for the team to survey media art researchers, curators, and artists, in order to gain a better sense of the relative importance of the artworks' most important characteristics for different kinds of media archives patrons.   about the survey we developed a questionnaire that presented users of media archives with a number of open-ended, largely qualitative and non-restrictive questions about their needs, goals, and preferences. in january , we circulated the questionnaire on several preservation, art, and digital humanities mailing lists. the pafdao team initially hoped that survey results would support the identification of "personas," or broad profiles of media archives users who shared similar needs and preferences. we hoped that these profiles would direct both metadata framework and access provisions. as it happened, no such clear classifications emerged, yet questionnaire results were still vastly informative, and shaped the development of the pafdao project in integral ways. in the remainder of this paper, we offer an overview of noteworthy trends and comments, then discuss the conclusions we draw from these results and their impact on the pafdao workplan and preservation framework.   survey results a total of people responded to the questionnaire. respondents came from disparate geographical locations, including the us, germany, france, uk, australia, and argentina. of respondents, responded as an individual researcher or practitioner and responded on behalf of an archive, museum, or other cultural heritage institution. 
we did not observe any significant differences in the responses of these two groups (personal and institutional responses), possibly due to the fact that even at an institutional level, new media projects and collections are led by small, specialized teams of committed individuals. respondents often held multiple roles and characterized themselves non-exclusively as artists ( %), researchers ( %), educators ( %), curators ( %), collection specialists ( %). the scope of digital media art collections respondents worked with was also broad, and included digital installation, video and images, interactive multimedia, audio, -d visualization, and websites. the key impetus behind the survey was to understand what kind of research questions and needs were motivating users to search for and use media works. this information is critical for the research team to identify and assess the nature and extent of viewing experience that needs to be preserved. in aggregate, respondents gave almost equal weight to artistic, social, historical, cultural, aesthetic, and technical research frameworks. several described pedagogical uses and how they use media works in teaching and learning. some sample research questions include: how are technologies assisting the exploration of political issues by artists? how do you bring the work to the viewer through the interactive power of technologies? do digital works explore something further than the analog approaches can do? how do technologies support and stimulate community engagement? how are access issues for individuals with lower economic backgrounds being addressed? what are the possible implications of gender in digital media artworks? what does it mean to view an art work that is designed for an old tv set in a larger installation? the respondents cited a number of serious impediments they had encountered in conducting research involving new media art. 
for example, they mentioned the lack or insufficiency of documentation and metadata, discovery and access provisions, and technical support. ones who use new media collections in support of teaching and learning listed several impediments such as vanishing webpages, link rot, poor indexing, gap for works from the s and s, and the lack of quality documentation. also often underscored were the complexity of legal issues and access rights. one respondent pointed out that, due to a widespread "disinterest in preserving the cultural artifacts of the digital age," there is a lack of understanding of the importance of these objects for cultural history. another comment noted infrequent access requests and therefore difficulties in justifying institutional investment in preservation efforts for future use. one of the respondents wrote, "in a society that is rushing headlong into the future, it is vital that we preserve the efforts of those who have early works in this new culture." another one commented that as technologies evolve, some works become very easy to create and therefore some users don't understand the significance of a work and how it was a complicated piece to produce at the time. such sentiments underscore the importance of documenting cultural context to situate the work from artistic, historic, and technical perspectives. for practicing artists, there were several concerns about the longevity of their creative work. some expressed concern about the difficulty of selling works that may become obsolete within a year. many worried that it was difficult to store or archive immersive installations, interactive pieces, and work with dependency on external files. they also mentioned copyright issues as a significant challenge. many emphasized the importance of historical contexts, usability, and discovery. 
one of them pointed out that archiving has become a part of his practice and he feels the pressure to consider future uses as he is going through a creative process. for curators of new media art, many indicated that they don't include born-digital interactive media in their holdings because either such materials fall outside of collecting scope or the procedures for providing access are too complex or unsustainable. for those who collect this genre, the biggest concerns were trying to identify which aspects of interaction experiences to preserve and how to capture as much information as possible to assist future users. out of the twenty survey respondents who answered on behalf of an educational or cultural institution, only one organization could claim a sophisticated and integrated web-based discovery, access, and preservation framework. the others indicated that access needed to be arranged through a special arrangement such as setting an appointment. they cited a range of preservation strategies they rely on, including migration, creation of search and discovery metadata, maintaining a media preservation lab, providing climate controlled storage, and collecting documentation from the artists.
content authenticity and authentic user experience
as mentioned above, the pafdao survey of users of media archives did not, as we had hoped, result in the definition of clear user profiles or personas. however, it had several important effects on the pafdao project. first, we noted a significant concern among our respondents for "authenticity," understood as a cultural rather than technical concept. the international research on permanent authentic records in electronic systems (interpares) project defines an authentic record as "a record that is what it purports to be and is free from tampering or corruption" (macneil, et al., , referenced in dietrich & adelstein, ). 
verifying the bit-level self-identity of a digital object over time can be accomplished relatively easily with checksums, automated fixity checks, and collection audits. when working with cultural artifacts, however, "authenticity" becomes a more nebulous and controversial concept. conservation measures undertaken to restore an artwork to some approximation of its original appearance may, in fact, alter its original form in ways that can affect its meaning. this is especially true in the case of artworks conceived to be ephemeral or experiential, or works that involve "contemporary" technologies that become obsolete, even obscure, over time. our questionnaire respondents seemed to respect this difficulty. reading across the complete pool of responses, we noted that the desired sense of "authenticity" derived not from some naïve sense of the object's pristine originality, but rather from a sense that the archiving institution has made a good-faith commitment to ensuring that the artist's creative vision has been respected, and providing necessary context of interpretation for understanding that vision—and any unavoidable deviations from it. we had excellent models for addressing these concerns. within the last ten to fifteen years, many arts organizations have joined forces to develop shared practices for the conservation of technology-based media, but also difficult-to-document arts such as performance, video art and multi-media installations. examples include independent media arts preservation (imap); the variable media network; matters in media art (a collaborative project between the tate, the new art trust (nat) and its partner museums—the museum of modern art (moma), the san francisco museum of modern art (sfmoma)); and incca (international network for the conservation of contemporary art). the most significant commonality of these initiatives is their shared emphasis on appropriate documentation. 
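the bit-level side of authenticity mentioned above (checksums, automated fixity checks, and collection audits) is the part that is easy to automate, in contrast to the cultural sense of authenticity the respondents cared about. as a minimal sketch only, and not a description of the pafdao project's actual tooling, a fixity audit can be expressed in a few lines of python. the function names (`sha256_of`, `build_manifest`, `audit`) and the choice of sha-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root against a previously stored manifest."""
    current = build_manifest(root)
    shared = manifest.keys() & current.keys()
    return {
        "missing": sorted(set(manifest) - set(current)),      # recorded, now gone
        "unexpected": sorted(set(current) - set(manifest)),   # present, never recorded
        "changed": sorted(k for k in shared if manifest[k] != current[k]),
    }
```

a manifest would be built once at ingest and stored apart from the collection; re-running `audit` on a schedule then reports files that have gone missing, appeared, or changed. a clean report of this kind establishes bit-level identity only and does not address the interpretive questions discussed in this section.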
while some complex time-based artworks can never be authentically replicated, it is generally agreed that, with proper documentation, many can be reinterpreted, adapted and revived for modern audiences. in cultural heritage organizations, this documentation can take the form of technical and descriptive metadata tailored for the breadth and specificity of new media, detailed installation instructions, detailed exhibition histories, and so forth. above all, practices for working directly with artists have been especially important conservation tools, and the initiatives cited above provide excellent models for how artist interviews can aid efforts to preserve complex artworks; see, for example, the variable media questionnaire (depocas, et al., ). in response to these considerations raised by our user survey, we developed a conservation-oriented artist questionnaire and interview process, pushing the integration of archival protocols as far upstream as possible, to the point of content creation and initial curation. enlisting the help of our project advisors, we worked with existing models, but adapted these models significantly. we streamlined and simplified our artist questionnaire to address specific aspects of our emerging preservation and access framework. we were particularly concerned about communicating with artists and enlisting their input about our decision to rely on operating system emulation as a default access strategy. though easy and readily scalable, emulation introduces variations into the rendering of artworks that artists might not have anticipated; it was clear that we would need to work with artists wherever possible to ensure that artworks' most significant properties and interpretive contexts were preserved, and not obscured, by our access measures.   
artist questionnaire

the pafdao questionnaire is designed to be a first step in a two-part process, gathering essential information but also laying the groundwork for a more conversational interview process where possible. first and foremost, the questionnaire elicits artists' input in identifying the most significant properties of individual media artworks by asking about the artists' initial vision for the work, and by posing open-ended questions about the relationship between artistic vision, technology, and historical contexts. the questionnaire also asks fundamental technological questions (e.g., "what software or programming language was used to create this artwork?" "what hardware and software were optimal for running this artwork when it was new?"). we inquire as to whether artists still have the working files they used in creating the artwork, including source code; these would constitute a deep technological and historical context for the works, and also an invaluable resource for future conservation work (engel & wharton, ). we also ask about related artworks or websites, and whether any of these materials may have been archived by another person or institution. networks of collaboration between archiving institutions will become more and more important in preserving cultural, historical, and technological contexts of reference that will be essential to understanding these artworks. the questionnaire also discloses foreseeable problems in our chosen access frameworks, including specific rendering issues that might come about with different emulation platforms:

we have found virtual machine emulation to be an effective strategy for providing research access to interactive digital artworks. running older artworks in an emulation environment may involve changes to the look and feel of the original artwork. our default access strategy is likely to involve: current, commercial-grade hardware and peripherals (mouse, screen, keyboard, etc.); color shift associated with the change from crt to led monitor screens; possible alterations to the speed of animation and interactive responsiveness; possible changes to audio quality; and presentation of digital surrogates rather than original physical materials that may have accompanied the artwork (discs, booklets, cases, etc.).

we ask artists to describe how such changes might affect their initial vision for the work. we also request permission to provide works in emulation, outline the kinds of documentation we expect to provide archive users, and invite artists to work with us on supplementary or alternate forms of documentation if they choose:

we expect to present users with a general statement about the effects of our emulation environments on the rendering of an artwork. if you would like to author or co-author a more specific statement about how these changes may affect your work, we can provide researchers with this information as well. in some cases, we may be able to provide additional documentation of original rendering conditions. please let us know if you would like to discuss these possibilities further.

finally, the questionnaire provides us with an opportunity to revisit rights agreements, which must be updated in light of new access technologies, and an opportunity to invite further conversation (a follow-up interview) and collaboration with the artist.

concluding remarks

a recurring theme in our findings involved the difficulties associated with capturing sufficient information about a digital art object to enable an authentic user experience. this challenge cannot and should not be reduced to the goal of ensuring bit-level fixity checks or even providing technically accurate renderings of an artwork's contents as understood on the level of individual files. as rinehart & ippolito ( ) argue, the key to digital media preservation is variability, not fixity. 
the trick is finding ways to capture the experience—or a modest proxy of it—so that future generations will get a glimpse of how early digital artworks were created, experienced, and interpreted. so much of new media works' cultural meaning derives from users' spontaneous and contextual interactions with the art objects. espenschied, et al. ( ) point out that digital artworks relay digital culture and "history is comprehended as the understanding of how and in which contexts a certain artifact was created and manipulated and how it affected its users and surrounding objects." for a work to be understood and appreciated, it is essential for the archiving institution to communicate a cultural and technological framework for interpretation. as one user survey respondent noted, some works that come across as mundane now may have been among the highly innovative trailblazers of yesterday. given the speed of technological advances, it will be essential to capture these historical moments to help future users understand and appreciate such creative works. the pafdao survey of users of media archives affirmed the importance of institutions like the rose goldsen archive, which is able to provide a breadth of media technological, historical, and cultural contexts to researchers and educators through its extensive and accessible collections. it also underscored the need for archiving institutions to be in contact with one another, and to be conscious of the need for greater integration of discovery and access frameworks across multiple institutions as they move forward in developing new preservation plans and access strategies for their collections. 
providing appropriate cultural and historical contexts for understanding and interpreting new media art is part of each institution's individual mission, but also a matter of collective importance, given the rarity of such collections, the numerous challenges of establishing preservation protocols, and the overall scarcity of resources. as we conclude, we must emphasize that, as artists have increasing access to ubiquitous tools and methodologies for creating complex art exhibits and objects, we should expect to see an increasing flow of such creative works to archives, museums, and libraries. it is nearly impossible to preserve these works through generations of technology and context changes. therefore, diligent curation practices are going to be more essential than ever in order to identify unique or exemplary works, project future use scenarios, assess obsolescence and loss risks, and implement cost-efficient strategies.

acknowledgements

we would like to express our gratitude to the national endowment for the humanities for supporting this project, to the project advisory board, to consultants chris lacinak, kara vanmalssen, and alex duryee of avpreserve, and to the pafdao project team, including timothy murray (co-pi), dianne dietrich, desiree alexander, jason kovari, danielle mericle, liz muller, and michelle paolillo.

notes

an early version of this report is available at dsps press: the blog of cornell university library's division of digital scholarship and preservation services. see "interactive digital media art survey: key findings and observations: dsps press".
the goldsen archive's holdings include media formats such as reel-to-reel videotape, floppy disk, database artworks housed on external hard drives, and works of net.art. all of these formats pose unique and significant preservation challenges. for more information, please see the goldsen archive website. 
out of respondents, fully and partially completed the survey, and took a quick look without responding. we suspect that the incomplete survey indicates a combination of curiosity and unfamiliarity with the program area, as media art research, curation, and practice still constitute fairly specialized fields.
only twenty-four respondents indicated that their institutions include born-digital interactive media artworks and artifacts in their holdings. several respondents who identified as curators indicated that born-digital interactive media would fall outside the scope of their collections. in some cases, they also noted that procedures for providing access to such materials are prohibitively complex or unsustainable.
for further information and documentation please see http://imappreserve.org/, http://variablemedia.net/, http://www.tate.org.uk/about/projects/matters-media-art, and http://www.incca.org/
cornell university library's commitment to provide broad and democratic access to its special collections was a key reason why founding goldsen archive curator timothy murray located the goldsen collections within the library. cornell's division of rare and manuscript collections has notably open policies for user access; see http://rmc.library.cornell.edu/ for more information.

references

[ ] delve, j. et al. ( ). the preservation of complex objects. volume one: visualizations and simulations.
[ ] depocas, a., ippolito, j., jones, c., eds. ( ). permanence through change: the variable media approach. guggenheim museum publications, ny & daniel langlois foundation, montreal.
[ ] dietrich, d., adelstein, f. archival science, digital forensics, and new media art. volume , supplement , august , proceedings of the fifteenth annual dfrws conference. http://doi.org/ . /j.diin. . .
[ ] espenschied, d., rechert, k., valizada, i., von suchodoletz, d., russler, n. ( ). "large-scale curation and presentation of cd-rom art", ipres .
[ ] engel, deena, and glenn wharton ( ). reading between the lines: source code documentation as a conservation strategy for software-based art. studies in conservation ( ): — . http://doi.org/ . / y.
[ ] kirschenbaum, m. et al. ( ). digital forensics and born-digital content in cultural heritage collections. clir.
[ ] lange, a. ( ). keep strategy paper.
[ ] macneil, h., wei, c., duranti, l. authenticity task force report. interpares.
[ ] murray, t. ( ). thinking electronic art via cornell's goldsen archive of new media art. neme: the archival event.
[ ] rinehart, richard, and jon ippolito ( ). re-collection: art, new media, and social memory. leonardo. cambridge, massachusetts: the mit press.

about the authors

madeleine casad is curator for digital scholarship at cornell university library. as associate curator of the rose goldsen archive of new media art, she manages an exciting collection of media objects that present a wide range of preservation and access challenges. she coordinates many of the library's digital humanities initiatives, and plays a leading role in education and outreach programs to promote the innovative use of digital collections in humanities scholarship. she holds a phd in comparative literature from cornell university.

oya y. rieger is associate university librarian for scholarly resources and preservation services at cornell university library. she provides leadership for full lifecycle management of scholarly content, including selection, creation, design, maintenance, preservation, and conservation. she is interested in current trends in scholarly communication with a focus on needs assessment, requirements analysis, business modeling, and information policy development. she holds a phd in human-computer interaction (hci) from cornell university.

desiree alexander is the pafdao collections analysis assistant and has worked with the goldsen archive since , assisting with the goldsen's experimental video and digital media preservation projects. 
she is also co-lead in surveying cornell's a/v assets to locate at-risk materials campus-wide in an effort to develop preservation and access strategies. she holds an ms in information studies and an ma in public history from suny albany, and an undergraduate degree in art history from ithaca college.

copyright © madeleine casad, oya y. rieger and desiree alexander

revisiting critical gis

this work is licensed under a creative commons attribution-noncommercial-noderivatives . international licence

newcastle university eprints - eprint.ncl.ac.uk

thatcher j, bergmann l, ricker b, rose-redwood r, o’sullivan d, barnes tj, barnesmoore lr, imaoka lb, burns r, cinnamon j, dalton cm, davis c, dunn s, harvey f, jung j, kersten e, knigge l, lally n, lin w, mahmoudi d, martin m, payne w, sheikh a, shelton t, sheppard e, strother cw, tarr a, wilson mw, young jc. revisiting critical gis. environment and planning a

copyright: © sage publications. sage publishing has allowed for the accepted version of this article to be deposited in an institutional repository.

doi link to article: http://dx.doi.org/ . / x

date deposited: / /

revisiting critical gis

a commentary submitted for review on november , .

jim thatcher, university of washington - tacoma, division of urban studies
luke bergmann, university of washington, department of geography
britta ricker, university of washington - tacoma, division of urban studies
reuben rose-redwood (landscapes of injustice research collective), university of victoria, department of geography
david o’sullivan, university of california, berkeley, department of geography
trevor j barnes, university of british columbia, department of geography
luke r. barnesmoore, university of british columbia, department of geography
laura beltz imaoka, university of california, irvine, program in visual studies
ryan burns, university of washington, department of geography
jonathan cinnamon, university of exeter, department of geography
craig m. dalton, hofstra university, department of global studies and geography
clinton davis, temple university, department of geography and urban studies
stuart dunn, king’s college london, department of digital humanities
francis harvey, leibniz institute for regional geography and university of leipzig
jin-kyu jung, university of washington-bothell, school of interdisciplinary arts & sciences
ellen kersten, university of california, berkeley, department of environmental science, policy, and management
ladona knigge, california state university chico, department of geography & planning
nick lally, university of wisconsin–madison, department of geography
wen lin, newcastle university, school of geography, politics and sociology
dillon mahmoudi, portland state university, urban studies and planning
michael martin, simon fraser university, department of geography
will payne, university of california, berkeley, department of geography
amir sheikh, university of washington, department of urban design and planning
taylor shelton, clark university, graduate school of geography
eric sheppard, ucla, department of geography
chris w. strother, university of georgia, department of geography
alexander tarr, university of california, berkeley, department of geography
matthew w. wilson, university of kentucky, department of geography
jason c young, university of washington, department of geography

revisiting critical gis . 
introduction

from late afternoon, october th, , until early on the th, thirty researchers met at the university of washington’s friday harbor laboratories to revisit the spirit of ‘critical gis’ in approaching questions both emerging and enduring around the intersection of the spatial and the digital. while the gathering at friday harbor, like much early work in critical gis, can be read as ‘peace talks’ brokered between warring factions, with wary giscientists and cautious human geographers on opposite sides of the table (schuurman ), more than a decade into the twenty-first century, our meeting drew an open field of scholar-practitioners bursting with questions, varied experiences, and profound concerns. even as the meeting ‘revisited’ critical gis, it offered neither recapitulation nor reification of a fixed field, but repetition with difference. neither at the meeting nor here do we aspire to write histories of critical gis, which have been taken up elsewhere. in the strictest sense, one might define gis as a set of tools and technologies through which spatial data are encoded, analyzed, and communicated. yet any strict definition of gis, critical or otherwise, is necessarily delimiting, carving out ontologically privileged status that necessarily silences one set of voices in favor of another. instead, see poiker ; schuurman ; sheppard ; o’sullivan ; and wilson for overviews. other points of entry into critical gis and critical cartography, inter alia, may also be found: pickles ( , ); harvey and chrisman ( ); curry ( ); kwan ( ); crampton and krygier ( ); harvey, kwan and pavlovskaya ( ); goodchild ( ); cope and elwood ( ); and rose-redwood ( ). 
we suggest that both ‘critical’ and ‘gis’ evolve in unresolved tension, as geospatial technology and information becomes ever more present in daily life (greenfield , kitchin and dodge , dourish and bell ), as new fields both claim and extend spatial inquiry and visualization (drucker ), and as the academy itself grapples with its role in a neoliberalized world (wyly ). critical gis offers trading zones (barnes and sheppard ) for discussion of these and other issues, for building alliances and interrogating tensions, and for a constant dialectical process of critique and renewal. notwithstanding the contemporary ubiquity of digital maps, ‘i want to be a gis researcher when i grow up’, remains a rare aspiration, rarer still when the qualifier ‘critical’ is added. but, what critical means, how it might itself be critiqued, and what work it enables depends on the disciplinary background of individual scholars. for some, predominantly from earth science backgrounds, the groundwork for critical gis is found in practitioners using the geospatial toolkit not only to inventory the natural world in quantitative terms, but also to spatially document its qualitative features. for others, mathematical models and economic analyses that engage critical social theory while retaining a focus on the spatial organization of the world define critical gis (sheppard and barnes ). still others produce critical gis work through engagements with critical cartography (crampton ), science and technology studies (harvey and chrisman ), a politics of reflexivity (dunn , schuurman and pratt ), and increasingly, the digital humanities (drucker ). as such, this commentary is meant as much for those who self-identify as critical gis practitioners as it is for giscientists; it is meant for those in the digital humanities, those in physical geography, and more. it is a constant tacking between old and new, between expert and novice as we seek new allies to ask new questions. 
as spatial data and its analysis seeps into ever more facets of modern life, we ‘revisit’ critical gis seeking new connections, new concerns, and new paths forward. in this commentary we sketch some of those uncovered at our meeting. ‘critical gis’ operates as an affiliation, one with a variety of resonances and tensions to be explored, rather than resolved. one tension revolves around how the spatial and digital function in relation to issues of social justice. another revolves around ‘hybrid’ strategies, such as critical quantification and the digital humanities, and their relationship to critical gis. despite some progress, particularly around geospatial data, we find that a political economy of geospatial technologies remains largely undeveloped. we thus revisit critical gis not as a historical body of scholarship, but as a set of living, diverse, dynamic endeavors necessary in the present and invested in transforming the future.

. social justice and gis

one such unresolved tension running through critical gis is the contradictory role gis has played in addressing questions of social justice (warren ). on the one hand, critics have questioned the complicity of geospatial technologies, and mapping more generally, in supporting the interests of corporate and governmental power, not to mention the military applications of gis and its role as part of the broader apparatus of geosurveillance (smith ; pickles , ; crampton ). on the other, a growing body of literature draws on gis techniques to document systematic patterns of spatial inequity, such as the disproportionate risks that socially marginalized groups face in exposure to air pollution and toxic waste (margai ; buzzelli et al. ; higgs and langford ; raddatz and mennis ). in some cases, gis use has been instrumental in legal decisions resulting in millions of dollars in damages being paid to affected residents (e.g., kennedy v. city of zanesville; see parnell ; monger ). 
the strategy of using gis mapping and spatial analysis as part of a legal defense shows some promise in challenging social and environmental injustices via law and due process. however, critical gis must also ask whether social and environmental justice is reducible to ‘justice’ as conceived by juridical systems alone, particularly in the context of settler societies where the colonial state has been one of the primary agents of oppression and the dispossession of indigenous lands. how gis have been used to reinforce or challenge social injustices demands serious theoretical and empirical consideration, often as questions needing to be posed, not as foregone conclusions. such questions might include: how should we conceptualize the notion of ‘justice,’ in procedural, distributional, or other terms? are we drawing on ‘passive’ or ‘active’ conceptions of equality (may ) as we theorize the role of gis in exposing and challenging social and environmental injustices? has the analysis of some spatial inequities been privileged over others? to what extent does the availability of particular types of data influence which injustices are addressed? how can marginalized populations be digitally empowered in the contemporary geoweb era? what tools and theories are most relevant to our work and with what political commitments do they come? additionally, what mechanisms of inclusion and exclusion are at work in geospatial communities? critical gis must not only pose these questions of others, but continue to be reflexive in proactively questioning its own inclusivity, especially given the centrality of feminist interventions in constituting critical gis (kwan ; cope and elwood ; leszczynski and elwood ; schuurman ). one approach involves continually asking, ‘who is missing? 
how would their presence alter not only our internal conversations, but also the social roles of critical gis?’ by posing such questions, we seek to broaden the scope of what a ‘social justice and gis’ research agenda might entail by reconsidering how critical engagements with political theories of justice and equality can enrich our critiques of gis as a political technology as well as how gis itself can more productively be employed as a means of intervening within struggles for social justice.

. two hybrid strategies, among others: critical quantifications and digital humanities

as an affinity always becoming and engaging others, critical gis is necessarily hybrid. in this section, we briefly explore critical quantification and the digital humanities as two hybrid approaches -- one historically more associated with critical gis practitioners and one just entering into conversation with them -- and suggest they offer productive paths cognizant of critiques of mainstream computation and positivist quantification. in these mindful transgressions of what are often seen as epistemological and ontological barriers between the qualitative and the quantitative, or between the social-theoretic and the mathematical, we suggest there are lessons that critical gis is well positioned to articulate, that offer insight into how and when critical hybridities may emerge and become productive. ‘critical quantification’ suggests a variety of stances and practices. given that scholars generally aspire to think critically, it can seem unproductive to distinguish a specifically ‘critical’ quantification of objects and phenomena. nevertheless, it is important to recognize the particular intellectual charge of efforts to re-appropriate and refashion mathematical, statistical, and computational practices using theoretical insights stemming from a serious engagement with the methodological, ontological, and political commitments of social and cultural theory. 
geographers have pursued a variety of ‘mixed’ method approaches including interweaving of narrative and simulation practices (bergmann, sheppard, and plummer ; millington, o’sullivan and perry ). as the explosion in the construction and commodification of spatial data systems continues, with upwards of sixty percent of all data now containing a spatial component (hahmann and burghardt ), scholars have begun to highlight moments of resistance and explore alternatives to capitalist quantification (thatcher forthcoming; wilson ; further elaborated upon below). qualitative methods are being increasingly integrated into gis practice (cope and elwood ; knigge and cope ), supporting arguments that the qualitative-quantitative ‘divide’ was a contingent construct, especially in the social sciences (wyly ). such engagements suggest the value of an engaged pluralism among gis and ‘non-gis’ approaches (barnes and sheppard ). critical quantification has been closely associated with critical gis, whether interacting within the same project or co-existing within the oeuvres of scholars (bergmann ; o’sullivan ; schwanen and kwan ; sheppard ; ). such a description of the relationship between the digital humanities and critical gis might be premature, although there is considerable potential for synergies (bodenhamer, corrigan, and harris ). while examples retrospectively understood as digital humanities stretch back decades or even centuries, to the work of those such as roberto busa and ada lovelace, it is only in recent years that digital scholarship has become widespread in the humanities and recognized more broadly. whereas the digital humanities are even more open-ended in their remit than critical gis, and also involve many researchers who see less relevance in the theoretical humanities for their work than most critical gis scholars find in social-theoretic and critical geography, considerable intersections and opportunities for cross-fertilization exist. 
of particular interest to critical gis, the digital humanities have grappled directly with the contradictions between interpretative approaches to scholarship that characterize many humanistic ways of knowing, and analytical computing paradigms largely designed by engineers to serve the interests of capital accumulation and state power. projects in ‘speculative computing’ have attempted to rework visualization, data, interfaces, and analysis for the theoretical commitments of humanistic scholarship (drucker , ; burdick et al. ). in this, they have much in common with efforts in critical gis to theoretically reconstruct geospatial practices (from software to concepts to applications) to be in greater sympathy with the commitments of social-theoretic and critical geography (curry ; kwan ; sieber ; sheppard ; cope and elwood ). bringing critical gis and the digital humanities into conversation around the efforts of both in ‘speculative computing’ holds great promise—not only for critical gis, but also for the digital humanities, where critical geographical perspectives on absolute and relative spaces as well as on cartography have much to offer.

. the political economy of gis

nearly a decade ago, o’sullivan ( ) noted the incomplete and partial nature of studies charting the political economy of spatial technologies. while recent work has explored the political economy of new spatial and mapping technologies, situating them within a larger framework of neoliberalism (leszczynski ) or as ‘fixes’ for capital as loci for speculative investment (wilson ), a comprehensive political economy of spatial technologies remains distant. we see several avenues for furthering such research along lines we designate as questions of scope, historical pathways, and expanding reach. the title of this section points to a political economy of gis, but the previous paragraph refers to ‘new spatial and mapping technologies’. 
this slippage is not a mistake, but rather the crux of an ongoing debate: what exactly is the scope for critical gis? should we, as scholars, dedicate our inquiry towards a political economy of gis, of spatial technologies more broadly, or of an entirely different set of questions? what can an interrogation of gis tell us about broader political economies? accompanying each of these terms is a particular commingling of state, economy, society, and specific pathways of technological development. the answers to such questions feed into any political economy of gis and into how the development of a diverse set of spatial technologies is shaping economic and societal futures at multiple scales.

against mythic accounts of the sui generis technical solutions offered by new spatial technologies, a means of justifying their value in and of itself (leszczynski ), critical gis must situate these new technologies in the older traditions from which they emerged. this involves parsing the long histories behind where, when, and how specific geospatial technologies were produced. we must chart the paths that have shaped and continue to shape this technological form and its role in the world, paying attention to where, how, and when actors such as the state, and in particular, the military-industrial complex, have influenced their development. we must continue the work begun by scholars like clarke and cloud ( ) that foregrounds the relationship between gis and the military, but we must also push further. we must recognize the recursive relations between ideology and technology, discussing how any technological orientation both results from and shapes subsequent epistemological and ontological orientations to the world.
work by barnes and wilson ( ) and dalton and thatcher ( ) attends to this historical excavation, tying the present myth of ‘big data’ to earlier movements in social physics and geodemographics, respectively; however, these concerns extend well beyond ‘big data.’ a critical spatial history of gis must also pay heed to other processes of governmentality that have implicated spatial rationalities and political technologies in the reconfiguration of geographical spaces (rose-redwood ).

a political economy of gis should be cognizant of the slippage between traditional gis and spatial information more broadly. leaving behind the desktops of state workers, academic researchers, and private sector analysts, the tools of gis (of spatial information, visualization, and analysis) have become prime sites of speculative investment (wilson ) and a core means by which individuals navigate and understand the world around them (sui ; elwood, goodchild, and sui ; leszczynski and wilson ; thatcher ). just as the move from mainframe to desktops in gis raised concerns over a ‘hidden technocracy’ (obermeyer ), similar concerns must be raised concerning the advent of ‘big’ spatial information and analysis.

a political economy of gis should be forward looking, examining not only the historical paths that led to the present moment (see, for example, mchaffie ), but also those paths opened and foreclosed toward possible futures (sheppard ). critical gis must remain attentive to the specific functions of traditional gis within society, but engaged scholars must also not lose sight of the widening import of ‘big’ spatial information. from the vantage point of , this includes growing economies of surveillance, consumer location-based services, data speculation, and other economies of control (dalton and thatcher ).
repetition with difference: future directions, present entanglements

in this commentary, we have attempted to outline a critical gis of unresolved tensions and of hopeful affiliations. stemming from conversations and dialogues that took place at the ‘revisiting critical gis’ meeting, these suggestions can only reflect diversity of a particular kind: that of those already interested in identifying with critical gis. whereas the original friday harbor meeting in has been portrayed as an important moment of detente in a previously uncomfortable relationship between ‘gisers’ and more skeptical human geographers (schuurman ), it would be impossible to put such a spin on the meeting. with few exceptions, both mainstream gis (or should that be giscience?) and important strands in contemporary human geography were notable by their absence from this meeting. this is a concern.

first, it suggests that giscience might now be beyond the reach of skeptical questioning, even as, only a few years ago (ten years on from pickles’s [ ] ground truth), mike goodchild suggested that ‘giscience would never again be quite the comfortable retreat for the technically minded that it had been in the past’ ( , ). if that claim was true then, it seems less so now, as monolithic desktop gis mutates into a much more varied array of spatial technologies well beyond geography’s purview, and as what was once ‘academic gis’ has become ‘giscience’. second, it highlights what appears to be a neglect by critical human geographers more widely to seriously interrogate geospatial technologies and their implications following up on significant works from the s (although, see sheppard, ; rose-redwood, , ; wilson , ). ‘critical gis’, in the form of intricately interwoven affinities advocated above, can help us constructively engage not only mainstream giscience and the ever-proliferating intersections of computation with space and place but also critical human geography.
despite the scale of that challenge, our mood is one of optimism. we regard critical gis as less of a field and fixed basis for identity and more as a multitude of intellectual banners, lacking fixed essence, raised through calls that repeat with difference, ever rediscovered and reclaimed. as intersections of the geographical, the technological and the digital proliferate and raise new questions, we will offer many responses. we are continually revisiting critical gis. join us.

references

barnes, trevor j., and eric sheppard. . “‘nothing includes everything’: towards engaged pluralism in anglophone economic geography.” progress in human geography ( ): – .
barnes, trevor j., and matthew w. wilson. . “big data, social physics, and spatial analysis: the early years.” big data & society ( ): - . doi: . / .
bergmann, luke r. . “bound by chains of carbon: ecological-economic geographies of globalization.” annals of the association of american geographers ( ): – . doi: . / . . .
bergmann, luke r., eric sheppard, and paul s. plummer. . “capitalism beyond harmonious equilibrium: mathematics as if human agency mattered.” environment and planning a ( ): - .
bodenhamer, david j., john corrigan, and trevor m. harris, eds. . the spatial humanities: gis and the future of humanities scholarship. bloomington, in: indiana university press.
burdick, anne, johanna drucker, peter lunenfeld, todd presner, and jeffrey schnapp. . digital_humanities. cambridge, ma: mit press.
buzzelli, michael, michael jerrett, richard burnett, and norm finklestein. . “spatiotemporal perspectives on air pollution and environmental justice in hamilton, canada, - .” annals of the association of american geographers ( ): - .
clarke, keith c., and john g. cloud. . “on the origins of analytical cartography.” cartography and geographic information science. jan; ( ): – .
cope, meghan s., and sarah elwood, eds. . qualitative gis: a mixed methods approach. thousand oaks, ca: sage publications ltd.
crampton, jeremy w. . mapping: a critical introduction to cartography and gis. malden, mass: wiley-blackwell.
crampton, jeremy w. . “the role of geosurveillance and security in the politics of fear.” in geospatial technologies and homeland security: research frontiers and future challenges, edited by daniel sui, - . dordrecht, netherlands: springer.
crampton, jeremy w., and john krygier. . “an introduction to critical cartography.” acme: an international e-journal for critical geographies ( ): - .
curry, michael r. . digital places: living with geographic information technologies. london: routledge.
dalton, craig m., and jim thatcher. . “inflated granularity: spatial ‘big data’ and geodemographics.” big data & society. available at: http://bds.sagepub.com/content/ / / (accessed october ).
dalton, craig m., and jim thatcher. . “what does a critical data studies look like, and why do we care? seven points for a critical approach to ‘big data’.” available at: http://societyandspace.com/material/commentaries/craig-dalton-and-jim-thatcher-what-does-a-critical-data-studies-look-like-and-why-do-we-care-seven-points-for-a-critical-approach-to-big-data/ (accessed april ).
dourish, p., and g. bell. . divining a digital future: mess and mythology in ubiquitous computing. cambridge, ma: mit press.
drucker, johanna. . “humanistic theory and digital scholarship.” in debates in the digital humanities, edited by matthew k. gold, - . minneapolis: university of minnesota press.
drucker, johanna. . speclab: digital aesthetics and projects in speculative computing. chicago: university of chicago press.
dunn, christine e. . “participatory gis a people’s gis?” progress in human geography ( ): – .
elwood, sarah, michael f. goodchild, and daniel z. sui. . “researching volunteered geographic information: spatial data, geographic research, and new social practice.” annals of the association of american geographers ( ): - . doi: . / . . .
goodchild, michael f. .
“giscience ten years after ground truth.” transactions in gis ( ): - .
greenfield, adam. . everyware: the dawning age of ubiquitous computing. berkeley, ca: new riders publishing.
hahmann, steve, and dirk burghardt. . “how much information is geospatially referenced? networks and cognition.” international journal of geographical information science ( ): - .
harvey, francis, and nicholas r. chrisman. . “boundary objects and the social construction of gis technology.” environment and planning a ( ): - .
harvey, francis, mei-po kwan, and marianna pavlovskaya. . “introduction: critical gis.” cartographica ( ): - .
higgs, gary, and mitch langford. . “giscience, environmental justice, and estimating populations at risk: the case of landfills in wales.” applied geography ( ): - .
knigge, ladona, and meghan s. cope. . “grounded visualization and scale: a recursive analysis of community spaces.” in qualitative gis: a mixed methods approach, edited by meghan cope and sarah elwood, - . thousand oaks, ca: sage publications ltd.
kitchin, r., and m. dodge. . code/space: software and everyday life. cambridge, ma: mit press.
kwan, mei-po. . “feminist visualization: re-envisioning gis as a method in feminist geographic research.” annals of the association of american geographers ( ): - .
leszczynski, agnieszka. . “on the neo in neogeography.” annals of the association of american geographers ( ): - . doi: . / . . .
leszczynski, agnieszka. . “situating the geoweb in political economy.” progress in human geography ( ): - . doi: . / .
leszczynski, agnieszka, and sarah elwood. . “feminist geographies of new spatial media.” the canadian geographer ( ): - .
leszczynski, agnieszka, and matthew w. wilson. . “guest editorial: theorizing the geoweb.” geojournal ( ): - . doi: . /s - - - .
margai, florence lansana. . “health risks and environmental inequity: a geographical analysis of accidental releases of hazardous materials.” the professional geographer ( ): - .
may, todd. .
the political philosophy of jacques rancière: creating equality. university park, pa: pennsylvania state university press.
mchaffie, patrick. . “towards the automated map factory: early automation at the u.s. geological survey.” cartography and geographic information science ( ): - .
millington, james d. a., david o’sullivan, and george l. w. perry. . “model histories: narrative explanation in generative simulation modelling.” geoforum ( ): - . doi: . /j.geoforum. . . .
monger, jon. . “thirsting for equal protection: the legal implications of municipal water access in kennedy v. city of zanesville and the need for federal oversight of governments practicing unlawful race discrimination.” catholic university law review ( ): - .
obermeyer, nancy j. . “the hidden gis technocracy.” cartography and geographic information systems ( ): - .
o’sullivan, david. . “geographical information science: critical gis.” progress in human geography ( ): - .
parnell, allan. . “maps used in support of the plaintiff’s argument in kennedy et al. v. city of zanesville, et al.” race equity project, legal services of northern california (access date april , ).
pickles, john. . a history of spaces: cartographic reason, mapping and the geo-coded world. london: routledge.
pickles, john, ed. . ground truth: the social implications of geographic information systems. new york: the guilford press.
poiker, tom. . “preface to special issue.” cartography and geographic information systems ( ): - .
raddatz, liv, and jeremy l. mennis. . “environmental justice in hamburg, germany.” the professional geographer ( ): - .
rose-redwood, reuben. . “introduction: the limits to deconstructing the map.” cartographica: the international journal for geographic information and geovisualization ( ): - .
rose-redwood, reuben. . “with numbers in place: security, territory, and the production of calculable space.” annals of the association of american geographers ( ): - .
rose-redwood, reuben. .
“governmentality, geography, and the geo-coded world.” progress in human geography ( ): - .
schuurman, nadine. . “trouble in the heartland: gis and its critics in the s.” progress in human geography ( ): - .
schuurman, n., and g. pratt. . “care of the subject: feminism and critiques of gis.” gender, place & culture ( ): – .
schwanen, tim, and mei-po kwan. . “‘doing’ critical geographies with numbers.” the professional geographer ( ): - .
sheppard, eric. . “knowledge production through critical gis: genealogy and prospects.” cartographica ( ): - .
sheppard, eric. . “quantitative geography: representations, practices, and possibilities.” environment and planning d: society and space ( ): - . doi: . /d .
sheppard, eric. . “gis and society: towards a research agenda.” cartography and geographic information systems ( ): - .
sheppard, eric, and trevor j. barnes. . the capitalist space economy: geographical analysis after ricardo, marx and sraffa. london: unwin hyman.
sieber, renee e. . “rewiring for a gis/ .” cartographica ( ): - . doi: . /t u - m- w- r.
smith, neil. . “history and philosophy of geography: real wars, theory wars.” progress in human geography ( ): - .
sui, daniel z. . “the wikification of gis and its consequences: or angelina jolie’s new tattoo and the future of gis.” computers, environment and urban systems ( ): - . doi: . /j.compenvurbsys. . . .
thatcher, jim. . “avoiding the ghetto through hope and fear: an analysis of immanent technology using ideal types.” geojournal ( ): - . doi: . /s - - - .
thatcher, jim. forthcoming. “understanding spatial media: locative and sousveillant media.” in understanding spatial media, edited by rob kitchin, tracey p. lauriault, and matthew w. wilson. sage press.
warren, stacy. . “the utopian potential of gis.” cartographica ( ): - . doi: . /jw l- - j -v q .
wilson, matthew w. . “new lines? enacting a social history of gis.” the canadian geographer / le géographe canadien ( ): - . doi: . /cag. .
wilson, matthew w. .
“location-based services, conspicuous mobility, and the location-aware future.” geoforum ( ): - . doi: . /j.geoforum. . . .
wilson, matthew w. . “‘training the eye’: formation of the geocoding subject.” social & cultural geography ( ): - .
wilson, matthew w. . “towards a genealogy of qualitative gis.” in qualitative gis: a mixed methods approach, edited by meghan cope and sarah elwood, – . thousand oaks, ca: sage.
wyly, elvin. . “where is an author?” city ( ): - . doi: . / . . .
wyly, elvin. . “strategic positivism.” the professional geographer ( ): - . doi: . / .

forgotten genealogies: what is digital art history?

forgotten genealogies: brief reflections on the history of digital art history

benjamin zweig

abstract: the past five years have witnessed a growing interest amongst art historians in the potential of digital projects to impact, if not transform, the discipline. a steep rise in conferences and institutes dedicated to digital art history, along with funding opportunities and institutional support, has accelerated the rate at which art historians are now engaging with digital techniques. with this new visibility, art historians have criticized themselves for lagging behind other disciplines such as history and archaeology. this article questions the assumption that art historians have been slow to embrace digital tools and methods through a brief historical examination of projects undertaken by institutions and scholars during the infancy of art history computing: the early s through the early s. using johanna drucker's distinction of the "digitized" and "digital" iterations of art history, this essay traces the genealogies of both categories, arguing that scholars have been more active in theorizing, practicing and creating digital methods than is often seen to be the case. ultimately, this essay is an attempt to help define from a historical perspective what "digital art history" is and how it has been practiced.
keywords: historiography, databases, art history, methodology, museum, digital, digitized

peer-reviewed

forgotten genealogies dah-journal, issue ,

introduction

in her report for the kress foundation transitioning to a digital world: art history, its research centers, and digital scholarship, diane zorich summarizes both the consternation that art historians have been left behind by the digital turn in the humanities and the skepticism that it is going to change the practices of the discipline in any meaningful way. in her estimation, "there is a pervasive sense that the discipline is too cautious, moves too slowly, and has to "catch up" in the digital arena." this perception is not a new one. in his article "computer applications in the history of art," anthony hamber argues that "information technology within the world of the history of art has, until recently, lagged somewhat behind [other disciplines]."

figure : the vasari scanner. date unknown. (photo: kirk martinez. reproduced with permission)

such attitudes have continued to circulate throughout art history. in the book a companion to digital humanities, michael greenhalgh, a longtime supporter of art history computing, laments how it is "the human element [rather than the technological] that restricts obvious developments in the discipline." the announcement for a conference on "digital art history" held at the institute of fine arts, new york university, at the end of proclaims, "in the context of art history the integration of digital tools and processes has lagged, in varying degrees, in comparison to other disciplines like archaeology and literary studies."
and in a paper delivered at the conference "the digital world of art history : from theory to practice" at the index of christian art, zorich argues that art history has been "slow at adopting the computational methodologies and analytic techniques that are enabled by new technologies," singling out as examples visualization, network analysis, and topic modeling.

rather than embracing the methodological innovations or challenges presented by computational practices, the argument goes, art historians have simply lapsed into using technology as ever-expanding slide libraries. johanna drucker makes this point in a article in the journal visual resources, in which she distinguishes between art historians who practice digitized art history and those who practice digital art history. according to drucker, "[a] clear distinction has to be made between the use of online repositories and images, which is digitized [emphasis in original] art history, and the use of analytic techniques enabled by computational technology that is the proper domain of digital [emphasis in original] art history." in drucker's view, the "digitized" iteration of art history propels traditional practices, exemplified by the online publication of image collections and born-digital periodicals such as nineteenth-century art worldwide. this iteration gives scholars quicker access to more materials without challenging the practices under which they work. in contrast, the "digital" is "the use of analytic techniques enabled by computational technology," including structured metadata, network analysis, discourse analysis, virtual modeling, simulation, and the aggregation of materials from disparate geographic locations.

with the steep rise of scholarly interest in using, theorizing, and funding the creation of digital tools and methodologies, it seems as though art historians are indeed playing catch-up.
but art historians' engagement with both the digitized and the digital versions of art historical practice, as per drucker, is more historically complex than current debates suggest. for instance, as early as the getty art history information program (ahip), an antecedent of the getty research institute, set out to facilitate the creation of sets of linked "data banks" by the getty and a group of international partner institutions that included the national gallery of art, washington, and the witt library. in , the group computers and the history of art (chart) was founded in london in order to bring together academics, museum professionals, and information technology specialists who were interested in pursuing computational practices, such as database creation and quantitative analysis, as well as developing new software and hardware with which to examine works of art. chart began publishing a newsletter in , a book in , and an eponymous journal in . indeed, also witnessed the first “electronic visualization and the arts” (eva) conference at the imperial college, london. in , hubertus kohle published the volume kunstgeschichte digital: eine einführung für praktiker und studierende, a collection of essays exploring a diverse array of projects and theoretical positions on the relationship between art history and computers. that same year, two unrelated articles were published exploring the intersection of art history and emerging technologies: "digital art history: a new field for collaboration" by sally promey and miriam stewart in american art, and "digital culture and the practices of art and art history" by kathleen cohen et al. in the art bulletin. and in , chart published the volume digital art history: a subject in transition.
in this short essay, i want to question the assumption that art history has lagged behind other humanities disciplines in its engagement with digital tools and techniques. i approach the ontology of "digital art history" from a historical perspective rather than a technical or methodological one. i want to sketch out the genealogies of "digital art history" itself to better understand how the practices and debates subsumed under this concept have taken shape. i do not attempt to tell the complete story. indeed, i limit my chronological scope to roughly the early s through the mid s, and have selected just a few examples from a rich body of material. ultimately, this essay is an attempt to help define from a historical perspective what "digital art history" is and how it has been practiced.

a genealogy of "digitized" art history

drucker's distinction between digitized and digital art history, though its categories are imperfect, affords us a good point of entry from which to understand the history of doing art history digitally. let us begin with the digitized, the creation of electronic databases and the digitization of works of art and image collections.

the earliest projects integrating computers with art history primarily emerged from museums and libraries in the late s and the early s. as computers enabled cultural organizations to better organize large and sometimes poorly documented collections, museums and libraries from the united states and europe saw the potential for collaboration and the cross-referencing of their collections. but there were complications. while computers allowed for the unprecedented exchange of information, disparate standards of cataloging practices made communication difficult.

several ambitious initiatives and groups sought to tackle this problem. for instance, in the international architectural
for instance, in the international architectural d forgotten genealogies dah-journal, issue , drawings advisory group (adag) first convened at the center for advanced study in the visual arts (casva) in washington in order to systematize cata- loging standards that would ensure for scholars "a consistent set of research information across repositories, perhaps eventually, through an electronic network [emphasis added]." in , a sub-group of four adag repositories and the getty trust, the foundation for documents of architec- ture (fda), was created for the purpose of addressing disparate cataloging prac- tices for closely related drawings. in - , and housed at the national gallery in washington, the fda project staff was tasked with experimenting on a new cataloging system devised by ahip that "would allow scholars to manipulate catalogue information in ways that would yield new views of the material itself [emphasis added]." the ideal goal was not simply to reconcile cataloging practices through computers, but to use them as a means to find new research questions. they sought "to define what an electronic research environment might be." while the fda eventually concluded that the development of a computer network was beyond its reach, the ambition to develop such a project, and the foresight regarding its possibili- ties, was at the cutting edge of conceptu- alizing the intersection of art history with information technology. smaller institutions began inde- pendently testing the ideas floated by the adag and ahip from an early date. in , janet barnes, keeper of the ruskin gallery, sheffield, england, considered implementing a database that would function as both the first accurate catalog of the gallery's collection and as a multi- faceted image retrieval system for users rather than a standard commercial inven- tory system. 
the logic behind creating such a system was to follow the intentions of the art critic john ruskin, who compiled the museum's collection, so that visitors could easily make connections between ostensibly unrelated artworks – effectively an early user-oriented and visually-constructed relational database. the ultimate fate of the project is sadly unclear.

also witnessed the initiation of the ambitious and well-documented image-oriented database vasari project, both a reference to giorgio vasari and an acronym for visual arts system for archiving and retrieval of images. vasari was an international collaborative, bringing together scientific departments from the national gallery, london, the doerner institute of the bavarian state galleries, telecom paris, the louvre, and the department of the history of art, birkbeck college, university of london, which handled much of the art historical and computer science aspects of the project. the goal of vasari was to create digital images of sufficiently high resolution that could replace photographs as the preferred recording system for artworks. vasari did not rely on scanning existing images or transparencies into a database. rather, it sought to create new colorimetric images taken directly from paintings, which involved the creation of a new type of scanner that recorded paintings frame by frame (or pixel by pixel) through seven simultaneous color filters, and then "mosaiced" them together using custom software (fig. ). these images were to be far more accurate in terms of their color reproduction and color monitoring than analog photography. most interestingly, the vasari project was envisioned as "machine independent," able to be transported from computer to computer and, ideally, over a network, rather than tied to a single workstation.
in this way, vasari was conceived as a web-based project before the "web" was in the public consciousness – indeed, conceived of at the same moment as tim berners-lee's revolutionary work at cern.

in , ahip published humanities and arts on the information highways, one of the earliest "state of the field" reports for what would become better known as the "digital humanities." the report extolled the possibilities presented by the exchange of information electronically, while also highlighting its many challenges, such as technological barriers, political apathy, and the undercapitalization of projects. the report lists many art history projects in its survey of important computer-based projects in the humanities and the arts (a number of which still function), including the mit museum architecture project, the bibliography of the history of art, the save outdoor sculpture project, the witt computer index of print works, and the census of antique art and architecture known to the renaissance.

the above projects are electronic databases or iterations of mostly pre-electronic initiatives. but the report goes deeper than summarizing then-current electronic projects. it enumerates a series of recommendations for the practice of creating and maintaining digital projects, such as enabling the "highest fidelity of representation of originals" and preserving object integrity through "technical methods such as color matching and compensation." moreover, the report encourages the development of new tools for humanities and arts computing, including building authoring tools that "exploit networked resources," "capture text, image, and sound in its editing and mark-up while capturing the history of different versions," "annotate videoclips, images, oral interviews, music, dance, and other cultural heritage information," and "support annotation systems that allow not only for personal commentary, but also for additions to the cumulative scholarly record."
ahip was highly conscious of the impact that the digitization of source material could have on scholarly exchange while being equally aware of how electronic formats presented a host of particular challenges and possibilities.

there are many other notable examples of art historical projects that began testing the limits of technology's impact on image databasing in the s and s, such as the visual arts network for the exchange of cultural knowledge (van eyck) project, a european international collaborative that sought to exchange text and image information between different art historical databases that could be searched simultaneously from remote terminals – a precursor of aggregator sites like europeana or the getty research portal. the point to be taken from the above survey is that art historians have not simply been interested in creating a better slide library. for many years scholars have recognized the potential that the digitized iteration of art history held both for organizing and working with the objects of study and for scholarly collaboration, something that is becoming increasingly important with the move towards linked open data and the semantic web.

a genealogy of "digital" art history

what, then, about the digital iteration of art historical practice that art historians are criticized for not practicing? can this charge hold up to a scrutiny of the historical record?

let us begin answering this question by examining one of the earliest projects that sought to use computational techniques for art historical research: the pioneering morelli project, named after the physician and connoisseur giovanni morelli and initiated in the mid s by william vaughan, professor of art history at birkbeck college. in short, morelli was a pattern recognition tool that automatically classified and analyzed the formal qualities of pictures.
vaughan conceptualized the project as "a simple matching process…the visual equivalent of the 'word search' [feature]…" but morelli did not rely on metadata as its organizing principle, as would be the case with a traditional database. instead, features such as compositional configuration and tonality were to be derived directly from the process of digitization, which would then be compared across a base data set of , images. moreover, it used a monochrome low-resolution digital image of kb rather than large files, and was able to recognize within "reasonable limits" different copies of the same picture and differentiate formally similar pictures without confusion. according to vaughan, the ultimate ambition of the project was to enable a new methodology in order "to make such visual sorting and selecting…. something that could genuinely be the basis of structured pictorial analysis." because the system relied on visual matching and sorting, in a fully implemented system the user could sift through an enormous visual archive, one beyond the capacity of human memorization, to find patterns and anomalies in the historical record; that is, to find if a particular type of composition is unique to one artist or one period, and, most importantly, to "link images together that cannot be found by means of textual reference." morelli was thus envisioned as enabling a "visual syntax of forms" from which complex visual arguments could be made, and stands as an unheralded antecedent to contemporary projects like image plot.

vaughan's morelli project had a cognate in ibm almaden's query by image and video content system (qbic). like morelli, qbic retrieved data from images not based on subject matter, as art historians might understand "content" to mean, but on the visual qualities of the image – line, color, patterns, textures, and shapes.
in theory, the system allowed a user to conduct queries such as "find images with a red, round object," "find images that have approximately - percent red and -percent blue colors," or "find images that have percent red and contain a blue textured object." in , the department of art and art history at the university of california, davis, put these ideas into practice and launched a pilot database using qbic as a means of enabling better searching through the department's collection of , slides. after the completion of initial testing using a data set of , images, the department concluded that qbic's chief strength resided in its ability to sort artworks by aesthetic values rather than search for them. the value of applying the qbic system to an image collection was to allow a user to sift quickly through large datasets to find hidden trends, relationships, or themes; the visual equivalent to computational methodologies such as text mining and topic modeling.

during the late s and early s, a number of art historians were also working on smaller-scale digital projects. for instance, around , marilyn lavin began planning an interactive three-dimensional recreation of piero della francesca's legend of the true cross at arezzo. as she saw it, formats such as slides gave uniform scale to all images, unintentionally eliminating important aesthetic and experiential differences. the aim of the piero project was to "present an electronic surrogate for the configuration of the fresco paintings as they appear to a visitor in the church," which would incorporate natural color, relative scale, and physical environment. lavin's project sought to use the digital environment to re-create one of the most persistent concerns of the history of art – understanding a work of art in its physical and historical context.
the central problem tackled by lavin's project was by no means a radical one; in fact, it was a rather conservative one. but the virtual modeling approach allowed for an "analytic flexibility" that still photography could not equal.

one of the more interesting early digital projects (c. ) was gilbert herbert and ita heinze-greenberg's statistical analysis of the profession of the architect in palestine during the british mandate of the s and s. in contrast to the biographical approach (understandably) favored by most scholars, herbert and heinze-greenberg organized a databank of persons who had lived and worked in palestine as architects between and , of which contained enough information to use in their study. the authors organized their data by the years of immigration of architects into palestine, the countries from which they emigrated, the country of education of architects born in palestine, and the country of education of architects who qualified for the profession after immigration. some of the conclusions they reached by quantitative analysis included the large number of german-born and german-educated architects, many of whom studied at the bauhaus; that while the number of british-born architects was small, a large group of russian and polish-born architects trained in the united kingdom; and that during the first decade of the mandate, % of immigrant architects had been in the country less than ten years.

the value of such quantitative studies as herbert and heinze-greenberg's for art history is that they can problematize the weighty claims put forth by scholars based upon very small data sets. by displacing the centrality of exceptional works of art or individual biographies into larger networks, this approach can function as a research method that raises new questions about historical events and as a potential mode of historiographic critique.
as the foundation for methods such as topic modeling and data mining, the quantitative analysis of art historical data can be both a challenge and a complement to the case-study model of practice.

conclusion

this brief enumerative trip into the historical record shows how art historians have been engaged in theorizing and using computational technologies and techniques since the s. as noted earlier, the projects outlined here merely scratch the surface of a much richer history. while working digitally has been a small subset of disciplinary practice, it has by no means been absent. many of the challenges these early forays in the digital world faced and that sadly could not be addressed here – funding, sustainability, archiving, copyright, technological obsolescence, documentation, tenure consideration, peer evaluation – will remain issues that art historians must tackle as the field moves forward. by gazing at the recent past, the field can recognize these pioneering contributions and learn from their ambitions. technology has reached a point where it is now easier (but by no means easy) to experiment with digital tools and methods, from using content management systems, to analyzing collection metadata released by museums, to employing open-source programs such as the visualization tool gephi and the mapping program qgis. but as digital art history continues to grow, as the problems it addresses become more sophisticated, as we work to define the tenets under which it functions, as it occupies a more central place in the discipline, and as scholars become more active in the creation of digital tools, we should be careful not to forget that the digital itself has formed part of the larger history of art history.
notes

for their most valuable input and support, i would like to thank deans elizabeth cropper, peter lukehart, and therese o'malley at the center for advanced study in the visual arts, paul jaskot, susan siegfried, kirk martinez, the journal's editors, and the two anonymous reviewers.
diane zorich, transitioning to a digital world: art history, its research centers, and digital scholarship (new york: samuel h. kress foundation, ), esp. , , . available at: http://www.kressfoundation.org/uploadedfiles/sponsored_research/research/zorich_transitioningdigitalworld.pdf
ibid, .
anthony hamber, "computer applications in the history of art: a perspective from birkbeck college, university of london," extrait de la revue informatique et statistique dans les sciences humaines, vol. , no. ( ), - .
michael greenhalgh, "art history," a companion to digital humanities (oxford: blackwell, ), . book available at: http://www.digitalhumanities.org/companion/
http://www.nyu.edu/gsas/dept/fineart/research/mellon/mellon-digital.htm
diane zorich, "the "art" of digital art history," presented at the index of christian art, princeton university, june , . available at: http://ica.princeton.edu/digitalbooks/digitalworldofarthistory / .d.zorich.pdf
johanna drucker, "is there a "digital" art history?" visual resources: an international journal of documentation, vol. , no. - ( ), .
ibid.
ibid.
a brief overview available at: http://socialarchive.iath.virginia.edu/xtf/view?docid=getty-art-history-information-program-cr.xml
for a brief discussion of the founding of chart, see jean miles, "introduction," computers and the history of art, vol. , no. ( ), - ; also hamber, "computer applications," . a full list of chart programs and publications available at: www.chart.ac.uk
kunstgeschichte digital: eine einführung für praktiker und studierende, ed. hubertus kohle (berlin: dietrich reimer verlag, ).
sally m.
promey and miriam stewart, "digital art history: a new field for collaboration," american art, vol. , no. ( ), - ; kathleen cohen, james elkins, marilyn aronberg lavin, nancy macko, gary schwartz, susan l. siegfried and barbara maria stafford, "digital culture and the practices of art and art history," the art bulletin, vol. , no. ( ), - .
digital art history: a subject in transition, ed. anna bentkowska-kafel, trish cashen, and hazel gardiner (bristol: intellect books, ).
elizabeth cropper, dean of the center for advanced study in the visual arts (casva), made a similar point in her introductory remarks to the conference "new projects in digital art history," held at the national gallery of art, washington d.c., november , .
indeed, the history of the digital humanities has received little attention. see julianne nyhan, andrew flinn, and anne welsh, "oral history and the hidden histories project: towards histories of computing in the humanities," digital scholarship in the humanities, vol. , no. ( ), - . thanks to paul jaskot for alerting me to this essay. available at: http://dsh.oxfordjournals.org/content / / /
the history of museums and technology deserves a much fuller investigation than can be done here.
vicki porter and robin thomas, a guide to the description of architectural drawings (new york: g.k. hall & co., ), xvii.
ibid, xix.
ibid.
janet barnes and alan griffiths, "creating an image database for the collection of the guild of st. george ruskin gallery, sheffield, uk," computers and the history of art, vol. , no. ( ), - .
anthony hamber, "the vasari project," computers and the history of art, vol. , no. ( ), - .
for the entire list of partners, see ibid, - .
ibid.
ibid, - .
ibid, .
humanities and arts on the information highways: a profile. final report. (santa monica: getty art history information program, ). available at: http://www.cni.org/resources/historical-resources/humanities-and-art-on-the-information-highways
ibid, - .
ibid, .
ibid, .
colum hourihane and john sunderland, "the van eyck project, information exchange in art libraries," computers and the history of art, vol. , no. ( ), - .
william vaughan, "the automated connoisseur: image analysis and art history," history and computing, ed. peter denley and deian hopkin (manchester: manchester university press, ), - ; idem, "automated picture referencing: a further look at 'morelli'," computers and the history of art, vol. , no. ( ), - ; idem, "computergestützte bildrecherche und bildanalyse," kunstgeschichte digital, - .
vaughan, "automated picture referencing," .
ibid, .
hamber, "computer applications," .
vaughan, "automated picture referencing," - .
ibid, .
image plot: http://lab.softwarestudies.com/p/imageplot.html
it is unclear to me when exactly ibm began developing qbic. it was well underway by - . see note .
myron flickner, harpeet sawhney, wayne niblack, jonathan ashley, qian huang, byron dom, monika gorkani, jim hafner, denis lee, dragutin petkovic, david steele, and peter yanker, "query by image and video content: the qbic system," ieee computer, vol. , no. ( ), - .
ibid, .
bonnie holt, ken weiss, wayne niblack, myron flickner, and dragutin petkovic, "the qbic project in the department of art and art history at uc davis," proceedings of the annual asis meeting, vol. ( ), - .
marilyn aronberg lavin, "researching visual images with computer graphics," computers and the history of art, vol. , no. ( ), - . the project, albeit in a very different form than the original, is available at: http://projects.ias.edu/pierotruecross/
ibid, .
drucker makes a good point of highlighting the "analytical flexibility" of virtual reconstructions. drucker, "is there a "digital" art history?" .
gilbert herbert and ita heinze-greenberg, "the anatomy of a profession: architects in palestine during the british mandate," computers and the history of art, vol. , no. ( ), - .
ibid, - .
bibliography

barnes, janet and alan griffiths. "creating an image database for the collection of the guild of st. george ruskin gallery, sheffield, uk." computers and the history of art , no. ( ): - .
bentkowska-kafel, anna, trish cashen, and hazel gardiner, ed. digital art history: a subject in transition. bristol: intellect books, .
cohen, kathleen, james elkins, marilyn aronberg lavin, nancy macko, gary schwartz, susan l. siegfried and barbara maria stafford. "digital culture and the practices of art and art history." the art bulletin , no. ( ): - .
drucker, johanna. "is there a digital art history?" visual resources: an international journal of documentation , no. ( ): - .
flickner, myron, harpeet sawhney, wayne niblack, jonathan ashley, qian huang, byron dom, monika gorkani, jim hafner, denis lee, dragutin petkovic, david steele, and peter yanker. "query by image and video content: the qbic system." ieee computer , no. ( ): - .
greenhalgh, michael. "art history." in a companion to digital humanities, edited by susan schreibman, ray siemens, and john unsworth, - . oxford: blackwell publishing, .
hamber, anthony. "the vasari project." computers and the history of art , no. ( ): - .
--- "computer applications in the history of art: a perspective from birkbeck college, university of london." extrait de la revue informatique et statistique dans les sciences humaines , no. ( ): - .
herbert, gilbert and ita heinze-greenberg. "the anatomy of a profession: architects in palestine during the british mandate." computers and the history of art , no. ( ): - .
holt, bonnie, ken weiss, wayne niblack, myron flickner, and dragutin petkovic. "the qbic project in the department of art and art history at uc davis." proceedings of the annual asis meeting ( ): - .
hourihane, colum, and john sunderland. "the van eyck project, information exchange in art libraries." computers and the history of art , no. ( ): - .
humanities and arts on the information highways: a profile.
final report. santa monica: getty art history information program, .
kohle, hubertus, ed. kunstgeschichte digital: eine einführung für praktiker und studierende. berlin: dietrich reimer verlag, .
lavin, marilyn aronberg. "researching visual images with computer graphics." computers and the history of art , no. ( ): - .
miles, jean. "introduction." computers and the history of art , no. ( ): - .
nyhan, julianne, andrew flinn, and anne welsh. "oral history and the hidden histories project: towards histories of computing in the humanities." digital scholarship in the humanities , no. ( ): - .
porter, vicki and robin thomas. a guide to the description of architectural drawings. new york: g.k. hall & co., .
promey, sally m. and miriam stewart. "digital art history: a new field for collaboration." american art , no. ( ): - .
vaughan, william. "the automated connoisseur: image analysis and art history." in history and computing, edited by peter denley and deian hopkin, - . manchester: manchester university press, .
--- "automated picture referencing: a further look at 'morelli'." computers and the history of art , no. ( ): - .
--- "computergestützte bildrecherche und bildanalyse." in kunstgeschichte digital: eine einführung für praktiker und studierende, edited by hubertus kohle, - . berlin: dietrich reimer verlag, .
zorich, diane. transitioning to a digital world: art history, its research centers, and digital scholarship. new york: samuel h. kress foundation, .
--- "the "art" of digital art history." paper presented at the index of christian art, princeton university, june , .

benjamin zweig, ph.d., is the robert h. smith postdoctoral research associate for digital art history at the center for advanced study in the visual arts (casva), national gallery of art, washington dc. he received his ph.d. in art history from boston university.
he is a medievalist by training, with a particular interest in digital mapping and developing/writing tools useful for art historical research.

correspondence e-mail: b-zweig@nga.gov

national digital heritage in the age of artificial intelligence (le patrimoine numérique national à l’heure de l’intelligence artificielle)
the corpus research program as a space for experimentation in the digital humanities

emmanuelle bermès, eleonora moiraghi
bibliothèque nationale de france, quai françois mauriac, paris cedex , france. e-mails: emmanuelle.bermes@bnf.fr, eleonoramoiraghi@gmail.com

revue ouverte d’intelligence artificielle, volume , no ( ), p. - . http://roia.centre-mersenne.org/item?id=roia_ __ _ _ _
© association pour la diffusion de la recherche francophone en intelligence artificielle and the authors, , some rights reserved. this article is distributed under the creative commons attribution . international license. http://creativecommons.org/licenses/by/ . /
the revue ouverte d’intelligence artificielle is a member of the centre mersenne pour l’édition scientifique ouverte. www.centre-mersenne.org

abstract. against a background of growing data volumes and shrinking processing times, the bibliothèque nationale de france faces several challenges and changes. in order to collect, preserve, describe and enable the study of massive and heterogeneous data sets, it draws not only on methods from information science but also on techniques from computer science, increasingly those developed in the field of artificial intelligence.
this growing need to bring together complementary skills, combined with the opportunities that digital collections open up for research, particularly in the humanities and social sciences, leads the library to define a space for the development of the digital humanities.

keywords. digital heritage, data mining, deep learning, artificial intelligence, digital humanities, information science, big data.

how can the dialogue between heritage institutions and the research community be renewed? how can a library's practices and services be adapted to the needs and habits of homo numericus and to the infosphere that characterizes the twenty-first century? at a time when the world is progressively being turned into data, and even networked, and when humans and intelligent systems increasingly complement one another, a heritage institution such as the bibliothèque nationale de france (bnf) faces numerous changes at several stages of its activity, from the collection, description, classification, storage, preservation and cataloguing of its digital collections to the retrieval, analysis and communication of information.

this article first offers a preliminary definition, from a library's perspective, of the fundamental elements invoked in the body of the article – information science, artificial intelligence and digital humanities – and then narrows its focus to three examples of projects conducted at the bnf to build and analyze digital corpora. these examples illustrate the objectives of the corpus program, which aims to build a range of services for researchers around the library's digital collections.
the conclusion addresses the possibilities arising from renewed collaboration between libraries and academia, as well as the prospects opened up by the experiments carried out within the corpus research program, particularly with regard to artificial intelligence.

the library and the digital: scope and definitions

an interdisciplinary approach to the analysis and study of digital corpora

"information science," "digital humanities" and "artificial intelligence" are three concepts that have developed since the second half of the twentieth century. the term "artificial intelligence" was coined in by the american john mccarthy in his grant application to the nsf (national science foundation) for the summer school at dartmouth college. the expression "information science" has likewise been in use since the mid- s in the united states. despite this chronological proximity, the two fields long remained separate.

information science, which studies information in its dimensions of production, management, use and communication, provides the library with methods and techniques for organizing and administering data. library science, a pillar of any library's expertise, is a concrete application of it. more precisely, it draws on information retrieval, which denotes the methods and techniques used to find information in a set of documents or data, as well as on the structuring and description of information through the design and implementation of data and metadata models. libraries have thus developed a number of standards, formats and access protocols suited to managing their specific domain.
in computing terms, these elements have taken the form of infrastructures and interfaces for collecting, preserving and accessing collections. digital technology has made the omnipresence of information and communication technologies in library professions even more pronounced, but at first mainly to facilitate uses that remained relatively unchanged: access to documents one at a time, on-site or remote consultation, and dissemination of research results mainly through publications in which the library was not involved.

humanities computing emerged as early as the second half of the twentieth century, thanks to the work of pioneers such as josephine miles and roberto busa. with the advent of the web, the notion of digital humanities appeared and gradually gained currency from the mid- s. as a set of research practices in the humanities and social sciences, arts and letters "mobilizing the tools and distinctive perspectives of the digital field" (cf. thatcamp paris, ), the digital humanities have profoundly changed research practices relating to heritage collections, relying in particular on the availability of massive digital collections that libraries had organized themselves to collect and produce for more than ten years. the digital humanities address questions that previously belonged strictly to information science, such as the description, management and analysis of digital objects, as well as new ways of communicating, mediating and promoting heritage collections and the research devoted to them.
but, certainly even more importantly, they raise new research questions, directly linked to the potential that computing offers for large-scale, quantitative analysis of digital collections.

in this context, artificial intelligence, as a branch of computer science that aims to build machines or tools simulating cognitive functions, increasingly offers the library technical possibilities both for automating document processing and for giving humanities researchers new ways of exploring, analyzing and managing collections or coherent sets of massive data. machine learning, through the development and implementation of statistical and algorithmic methods that allow a computer to learn to perform tasks, presents essential use cases for libraries on the one hand and for humanities researchers on the other. three fields are particularly concerned:

• image analysis processes that produce structured, usable content, in particular textual content (ocr or optical character recognition, olr or automatic layout recognition, hcr or automatic handwriting recognition, omr or automatic recognition of music notation...) or that give access to images by their content (shape recognition...);
• text and data mining, which brings out trends or patterns from large masses of data, notably by way of a data visualization step;
• semantic processing, which makes it possible to automatically match similar data (data alignment...)
or similar documents (clustering...), or to extract semantic information from raw information (annotation of text, still or moving images...).

as part of its digital transition, begun as early as , the bnf has set itself the goal of exploring, by taking part in digital humanities projects, the areas of overlap between research, particularly in the humanities and social sciences; computer science, particularly machine learning and deep learning techniques; and the library's own expertise, particularly in systematizing processing and in structuring, standardizing and preserving data. information science, digital humanities and artificial intelligence thus intersect, influence one another, intermingle and contribute jointly to research projects in which the library is a stakeholder.

a growing need for automation to analyze and manage the national digital heritage

increasingly faced with levels of volume and velocity typical of big data, the bnf's digital collections, which today occupy about six petabytes, are characterized by considerable variety. digitized documents, such as the books and manuscripts available in gallica, the bnf's digital library; born-digital documents such as video artworks, software, databases and internet archives; bibliographic metadata and authority data describing persons, places, organizations, concepts... these are all data sets that differ in structure, format, quality, production context, function and content. these sets have different histories, resulting from changes in media and from the multiple layers of documentary practice accumulated over time.
their heterogeneity demands specific processing and therefore particular skills and methods, whether to preserve them, make them available or analyze them (cf. [ ]). this heterogeneity, which stems from the chronological breadth and encyclopedic vocation characteristic of national libraries, comes on top of the growth in the quantity of incoming data and the resulting acceleration of processing times. the traditional tendency of libraries to systematize procedures must therefore find a balance with the specific nature of the data and of the research questions particular to the projects that use them.

in this context of tension between growing data volumes and shrinking analysis and processing times, the bnf is developing advanced expertise in document computing and digital collection management. examples include digital preservation in its spar system, internet archiving, which represents nearly one petabyte of data for billions of urls, and the publication of its bibliographic metadata on the web of data, which opens the door to semi-automatic alignment with other data sets (cf. [ ]). however, innovating in information science sometimes requires drawing on new skills or unexplored fields of knowledge. to that end, the bnf experiments, often in partnership with research teams and in projects of varying scale, with techniques from computer science and increasingly from artificial intelligence, in order to automate the management, communication and analysis of its digital heritage.
the automatic generation of textual content from digital images was the first technology derived from artificial intelligence to join the computing tools regularly used by the bnf to manage its digital collections. eighty-nine years after gustav tauschek's "reading machine" and sixty-one years after the bulky, sprawling perceptron( ), the library performs optical character recognition (ocr) on most of the printed documents it holds, so that their contents can be searched and used as text. in this field, it does not confine itself to the state of the art but enlists research partnerships to push back the limits of the technique and obtain ever better results, as in the europeana newspapers project, which dealt with the automatic extraction of page layout (olr, optical layout recognition) and of the logical structure of documents (cf. [ ]). more recently, it has also encouraged research in handwriting recognition (hwr), for example by supporting the european project himanis, which sets out to understand the reality of french royal government from the registers of the royal chancery of the fourteenth and fifteenth centuries held at the archives nationales and the bnf.

other applications of artificial intelligence have since joined the bnf's toolbox for exploiting this textual content: it is experimenting with automatic indexing of textual content through named-entity recognition (ner) with the exalead semantic engine, used in its digital library gallica.
Beyond text, it aims by   to study the feasibility of applying deep learning solutions to image indexing and to new interfaces for searching and analysing iconographic documents (cf. [ ]). Finally, so that content can be explored and analysed no longer one item at a time (by document) but as a whole (by corpus), with digital tools and with methods drawn notably from data mining, it is building a range of services around its digital collections.

The development of these increasingly intelligent tools for analysing digital collections opens up unprecedented opportunities for research, particularly in the humanities and social sciences. Since  , the BnF has observed an increase in the number of research projects that deal with digital corpora and involve not only researchers' expertise in their own field but also skills in information science and computer science, including artificial intelligence. The next part illustrates, through three research projects conducted at the BnF, how scientific questions arising from the digital humanities can lead to experiments with artificial intelligence tools for exploring and processing the library's digital data, and what prospects these experiments open for the evolution of its information system. All three are research projects initiated by the library. It did not confine itself to the role of sponsor or data provider but allowed itself, as an experiment, to be a stakeholder in the research process. It is thus at the intersection of the various skills mobilised by the library around its collections that the main lines of new digital uses emerge.

( ) Tauschek's reading machine ( ) and the Perceptron ( ) can be regarded as the forerunners of OCR.

Three examples of digital humanities projects conducted at the BnF

The three projects briefly presented here, from the library's point of view, each deal with a different dataset: a corpus on the Great War extracted from the web archives, the connection data (logs) of the digital library Gallica, and iconographic resources from all the collections in Gallica covering the period  - . They adopt three approaches to exploring, analysing and exploiting digital corpora, each relying on different methods and techniques because of the nature of the data explored and the scientific aims of each project. They nevertheless share the mobilisation, at different levels, of skills and methods from information science, computer science and the humanities and social sciences. Finally, the first two projects in particular show that no actor working alone could have reached the same results, which came, not without difficulty, from collective and interdisciplinary work.

A founding project: "Le devenir du patrimoine numérisé en ligne : l'exemple de la Grande Guerre"

The project "Le devenir du patrimoine numérisé en ligne : l'exemple de la Grande Guerre" (the fate of digitised heritage online: the example of the Great War) was launched in   within the LabEx "Les passés dans le présent" and led by the BnF, the Department of Economic and Social Sciences of Télécom ParisTech and the Bibliothèque de documentation internationale contemporaine (BDIC).
Lasting three years ( - ), the project had several aims: first, to study "the online social practices that seek to construct a representation of the past and to perpetuate the memory of the Great War" (cf. [ ]); second, to measure the impact of heritage institutions on the circulation and appropriation of documents digitised en masse and put online; and third, using the BnF's web archives, to analyse automatically the network of French websites about the Great War and to map the links between them. In parallel, the project was intended to develop tools and propose reproducible methods for analysing a corpus drawn from the BnF's web archives or from the web in general.

A collection from the BnF's web archives at the origin of the second phase of the research project

Studying web archives was not at first a central concern of the research team: it was the researchers' stated need for a web corpus that was reliable, documented, legal and reproducible to process that led them to turn to this heritage collection built by the BnF. The "live" web( ) did not seem to allow a corpus with these characteristics to be defined, so an archived web was needed. Since  , the BnF had had a legal framework, legal deposit, authorising it to copy and archive websites and to make them available to accredited researchers. Moreover, for the library, working on the web archives offered a testing ground around a heritage collection that was still little known and little used, and the prospect of developing tools that could eventually be reused in other projects.
Studying a corpus of web archives thus became one of the fundamental stages of the project and required the recruitment of a computer engineer. Her work, carried out jointly with the sociologist who led the scientific side of the project, covered on the one hand the creation of a graph visualising the links between the websites making up the corpus, and on the other hand the extraction (mining) of data from an online forum, the forum  - , in order to analyse the methods amateurs use to identify and share digitised cultural content. At the BnF, this was the first time such tools had been used to grasp the content of a web archive corpus.

The dialogue between multiple skills at the heart of building and analysing the digital corpus

The report by Beaudouin and Pehlivan [ ] details the particular challenges posed by this choice. The web archives, which came under legal deposit in  , are collected at the BnF in two ways: the first is an annual crawl of a very large number of websites ( , million domains in  ) identified from the lists of domain registrars; the second consists of targeted crawls, more frequent and/or deeper, of a smaller number of websites (around  ) selected by librarians or partners according to several themes. The "Grande Guerre" collection had belonged to this second category since November  . It grew over the course of the project through the selections made by the BnF and its partners, which caused substantial variation in the coverage of the corpus between the start of the project and the final months of analysis.
In addition, the collection method, which relies on crawlers, produced a significant "noise" effect that hampered the interpretation of the data visualisations. These difficulties disrupted the ideal process, which in theory should have worked as a chain: subject experts first select and describe the sources, technical experts then collect and archive the data, and finally the corpus is handed to the research teams for study and analysis. In reality, a constant dialogue was established throughout the research, leading to many iterations between collection experts, format experts, IT staff and research teams. The interpretation of the graphs generated by the processing evolved with the gradual understanding of the collection and archiving methods set up by the BnF, and with the implementation of suitable solutions to correct the biases inherent in the source material. As a stakeholder in the project, the library therefore did not only contribute to identifying the sources (the "Grande Guerre" collection) and delimiting the corpus to answer the project's scientific questions: it also provided its expertise on file formats (ARC/WARC( ), DAT/WAT( )), and it made available or used its tools, such as the "BCWeb" database (BnF Collecte du Web), and its procedures (crawl logs( )) to build, clean and enrich the corpus.

( ) Tools such as Hyphe or Webrecorder are often used by social science researchers to build a corpus from the live web.
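The link graph between sites described above can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the function names, the `min_weight` noise threshold and the example URLs are all hypothetical, and the input is assumed to be a list of (source page, target page) pairs already extracted from the archived pages.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Reduce a URL to its lowercase host name, dropping a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def build_site_graph(page_links, min_weight=1):
    """Aggregate page-level links into weighted site-to-site edges.

    Self-links are ignored, and edges seen fewer than `min_weight` times are
    dropped, which is a crude (assumed) way of filtering crawler noise.
    """
    edges = Counter()
    for source_url, target_url in page_links:
        src, dst = site_of(source_url), site_of(target_url)
        if src and dst and src != dst:
            edges[(src, dst)] += 1
    return {edge: weight for edge, weight in edges.items() if weight >= min_weight}


# Invented example links, for illustration only.
links = [
    ("http://www.example-forum.org/topic/1", "http://example-archive.org/doc/a"),
    ("http://example-forum.org/topic/2", "http://example-archive.org/doc/b"),
    ("http://example-forum.org/topic/2", "http://example-forum.org/topic/1"),
    ("http://example-blog.net/post", "http://example-archive.org/doc/a"),
]
graph = build_site_graph(links, min_weight=2)
```

With `min_weight=2`, only the forum-to-archive edge survives, because it is the only pair of sites linked at least twice. In practice the threshold and the definition of a "site" would have to be tuned against knowledge of how the crawl was run.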
The conclusions and results of a collaborative research effort

In conclusion, this project, in addition to leading to the creation of tools and the development of methods for analysing corpora drawn from web archives, helped demonstrate the value of librarians, computer scientists and researchers working together on these new objects in order to ground their epistemological approach. It also showed that a linear, compartmentalised approach was not sufficient, and that the success of a digital humanities project dealing with massive and complex digital heritage collections required an appropriate organisation, with iterative steps, in order to produce a reliable analysis. The book "Le web français de la Grande Guerre. Réseaux amateurs et institutionnels" (cf. [ ]) retraces this multidisciplinary approach and summarises the conclusions drawn from this collaborative research. By combining quantitative and qualitative approaches, sociology and data science, the project shed light on how digitised documentary sources circulate and how networks organise themselves on the web from or around these sources. It showed the contribution of amateur discussion spaces on the web as vehicles for promoting individual research, and also for acquiring skills and developing a collective awareness and new knowledge. On the library's side, the project initiated the creation of tools that subsequently served other projects and changed the overall documentary approach to these collections (see chapter  ).
A technical approach based on artificial intelligence: analysing Gallica usage traces

Whereas the project "Le devenir du patrimoine numérisé en ligne : l'exemple de la Grande Guerre" had taken up the BnF's digital collections, through the web archives, because they offered an opportunity to answer the scientific question posed by the project, this second example was specifically intended to experiment with computational methods from data mining and artificial intelligence, alongside other methods also aimed at understanding the uses of Gallica (interviews, an online questionnaire administered to more than   Gallica users, known as "gallicanautes", and an ethnographic video observation setup). The idea was to assess what automated methods contribute to usage studies of digital libraries by relying on a particular type of data: "logs", or usage traces. The overall scientific programme, implemented in   within the Bibli-Lab( ), a research partnership between the BnF and the engineering school Télécom ParisTech, was a wide-ranging study of Gallica usage with many complementary facets, whose results were presented on   May   at the study day "Quels usages aujourd'hui des bibliothèques numériques ? Enseignements et perspectives à partir de Gallica" (cf. [ ]).

( ) http://bibnum.bnf.fr/warc/
( ) https://webarchive.jira.com/wiki/display/ars/wat+overview+and+technical+details
( ) Files containing the traces of crawler activity during the crawl, or capture, of websites.
The BnF in dialogue with other actors and skills to study the behaviour of its users

The part of the study entitled "Analyse des traces d'usage de Gallica" proposed a novel approach to analysing the typical paths of users of Gallica, the BnF's digital library, based on machine learning methods. Using the connection log files of the Gallica servers, the main objective of the project was to identify typical sessions, that is, paths that are similar in their sequence of actions and in the documents consulted in the digital library. The fifteen-month project was conducted by Adrien Nouvellet, a signal-processing researcher on a postdoctoral contract at Télécom ParisTech, supervised by two faculty members in economic and social sciences (Valérie Beaudouin and Christophe Prieur) and two faculty members in signal and image processing (Florence d'Alché-Buc and François Roueff) at the same school. The cross-cutting nature of the project, which brought together two distinct teams at Télécom ParisTech, and the involvement of a computer science researcher from outside the cultural sector, were among its methodologically interesting aspects.

Collaborative and iterative progress in preparing and processing the data

To discover trends in the use of Gallica by applying data mining methods, the choice was made to use the connection data of the digital library's servers. These data are commonly called "connection logs" and contain the requests made from an IP address (which generally identifies a user). An example of a Gallica connection log line is given below.

( ) Bibli-Lab is a research partnership initiated in   between the BnF and Télécom ParisTech. It aims to study online uses of the digital heritage of libraries. URL: http://www.bnf.fr/fr/la_bnf/pro_publics_sur_place_et_distance/a.bibli-lab.html; https://c.bnf.fr/hlc
Figure  . Example of a Gallica connection log line, taken from Nouvellet et al.  .

Besides HTML requests, the connection logs also contain SRU( ) requests and ARK( ) identifiers. These two kinds of information made it possible, respectively, to identify the various actions of a user on the website and to identify the resources consulted. The logs were cleaned and enriched with other data according to the research objectives. This step is crucial and preliminary to any analysis, and the scientific objectives depend on it, all the more so for data such as connection logs, which were not designed for studying and analysing user behaviour.
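The preparation step described above (reading log lines, spotting ARK identifiers and grouping requests into sessions) can be sketched in Python as follows. This is an illustration under stated assumptions, not the processing used in the study: the log layout is assumed to follow the common Apache-style access log format, the 30-minute inactivity threshold is an arbitrary choice, and the sample lines, paths and identifiers are invented.

```python
import re
from datetime import datetime, timedelta

# Assumed Apache-style layout: ip - - [time] "METHOD path PROTOCOL" ...
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*"'
)
# An ARK has the form ark:/NAAN/name; only a simple name is matched here.
ARK_PATTERN = re.compile(r"ark:/\d+/[0-9a-z]+")


def parse_line(line):
    """Return a dict describing one request, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    ark = ARK_PATTERN.search(match.group("path"))
    return {
        "ip": match.group("ip"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "path": match.group("path"),
        "ark": ark.group(0) if ark else None,
    }


def split_sessions(records, gap=timedelta(minutes=30)):
    """Group requests by IP, starting a new session after `gap` of inactivity."""
    sessions = []
    last_seen = {}  # ip -> (time of last request, index of its session)
    for record in sorted(records, key=lambda r: r["time"]):
        previous = last_seen.get(record["ip"])
        if previous is None or record["time"] - previous[0] > gap:
            sessions.append([record])
            index = len(sessions) - 1
        else:
            index = previous[1]
            sessions[index].append(record)
        last_seen[record["ip"]] = (record["time"], index)
    return sessions


# Invented sample lines (hypothetical paths and identifiers).
sample = [
    '10.0.0.1 - - [12/Jan/2016:10:00:00 +0100] "GET /ark:/12148/bpt6k0000001/f1.item HTTP/1.1" 200 512',
    '10.0.0.1 - - [12/Jan/2016:10:05:00 +0100] "GET /SRU?operation=searchRetrieve&query=verdun HTTP/1.1" 200 2048',
    '10.0.0.1 - - [12/Jan/2016:11:30:00 +0100] "GET /ark:/12148/bpt6k0000002 HTTP/1.1" 200 512',
]
records = [record for record in map(parse_line, sample) if record]
sessions = split_sessions(records)
```

Here the first two requests fall into one session and the third, which arrives 85 minutes later, opens a second one. Treating an IP address as a user is itself an approximation, as the text notes.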
On several occasions the BnF's IT department (DSI, Département des systèmes d'information) modified or supplemented the data to meet the needs of the study: anonymisation, fixes, addition of fields needed for the research, and so on. The logs were then linked to the metadata of the documents consulted, through the link between the unique ARK identifier, which identifies a document, and the corresponding bibliographic record harvested via the OAI-PMH( ) protocol. The standardisation of bibliographic data and the use of a unique ARK identifier for each document allowed the data to be used and enriched quickly, which helps demonstrate the value of structured, high-quality data for research. Finally, another type of data was used in the project: all the links to Gallica extracted from the content of the blog and of the Facebook page devoted to the digital library. These data were analysed in the last phase of the project to determine the impact of outreach activities on the consultation of documents in Gallica.

( ) SRU: Search/Retrieve via URL is a REST-style (Representational State Transfer) protocol used to formulate searches, notably in the context of library data, and to obtain results. It is a standard recognised by the OASIS consortium and maintained by the Library of Congress in the United States.
( ) ARK: Archival Resource Key is an identification standard used by the BnF and maintained by the California Digital Library.
( ) OAI-PMH: Open Archives Initiative Protocol for Metadata Harvesting. A standard protocol used in the cultural and scientific sectors to collect and make available, asynchronously, metadata from several silos.

To achieve the main objective of identifying similarities or trends among sessions, and hence among user behaviours, an unsupervised classification (clustering) algorithm based on a mixture of Markov models was used. The Markov model and the algorithm employed are described in detail in chapter  . of the project report (cf. [ ]). Through data visualisations, this processing made it possible to represent recurring types of behaviour, grouped into clusters, which served as a basis for the sociological analysis of usage.
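To make the idea of clustering sessions with a mixture of Markov models more concrete, here is a minimal Python sketch that fits a mixture of first-order Markov chains with the EM algorithm. It is a generic textbook formulation, not the model or algorithm of the project report: the action labels, the smoothing constant `alpha`, the random initialisation and the toy sessions are all assumptions made for illustration.

```python
import math
import random


def fit_markov_mixture(sequences, states, n_clusters, n_iter=30, alpha=0.1, seed=0):
    """Fit a mixture of first-order Markov chains with EM (n_iter >= 1).

    sequences: non-empty lists of state labels, one list per session.
    Returns (weights, resp), where resp[i][k] is the posterior probability
    that session i belongs to cluster k.
    """
    index = {state: j for j, state in enumerate(states)}
    seqs = [[index[label] for label in seq] for seq in sequences]
    n, rng = len(states), random.Random(seed)

    # Start from random soft assignments of sessions to clusters.
    resp = []
    for _ in seqs:
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        resp.append([value / sum(row) for value in row])

    for _ in range(n_iter):
        # M-step: re-estimate each chain from responsibility-weighted counts,
        # with additive smoothing (alpha) so that no probability is zero.
        mass = [sum(row[k] for row in resp) for k in range(n_clusters)]
        weights = [(m + alpha) / (len(seqs) + n_clusters * alpha) for m in mass]
        start = [[alpha] * n for _ in range(n_clusters)]
        trans = [[[alpha] * n for _ in range(n)] for _ in range(n_clusters)]
        for seq, row in zip(seqs, resp):
            for k in range(n_clusters):
                start[k][seq[0]] += row[k]
                for a, b in zip(seq, seq[1:]):
                    trans[k][a][b] += row[k]

        # E-step: posterior of each cluster for each session (log domain).
        resp = []
        for seq in seqs:
            scores = []
            for k in range(n_clusters):
                score = math.log(weights[k])
                score += math.log(start[k][seq[0]] / sum(start[k]))
                for a, b in zip(seq, seq[1:]):
                    score += math.log(trans[k][a][b] / sum(trans[k][a]))
                scores.append(score)
            top = max(scores)
            probs = [math.exp(score - top) for score in scores]
            resp.append([p / sum(probs) for p in probs])
    return weights, resp


# Toy sessions with invented action labels.
toy_sessions = [
    ["search", "view", "search", "view"],
    ["view", "view", "download"],
    ["search", "view", "search", "view"],
    ["view", "view", "download"],
]
weights, resp = fit_markov_mixture(
    toy_sessions, ["search", "view", "download"], n_clusters=2
)
labels = [max(range(2), key=lambda k: row[k]) for row in resp]
```

Each session ends up with a probability of belonging to each cluster, and `labels` keeps the most likely one. Like any EM procedure, the result depends on the initialisation and on the number of clusters chosen, so several runs would normally be compared before interpreting the clusters.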
Figure  . Example of a visualisation of the clusters representing user paths, taken from [ ]

As in the Great War project, the corpora analysed consisted of massive, natively digital data that could not be grasped by the human eye or brain alone. The opportunity such corpora offer research lies, in most cases, not in a reading of individual documents but in a distant, global reading of the whole. In this sense, this augmented reading, this hyper-reading of linked data, by acting as the prosthesis of an augmented eye, a hyper-eye added to the organic eye, also increases the capacity of research to extract knowledge.
This technical approach to studying Gallica usage confirmed the siloed nature of that usage, showing that most sessions of the digital library involved a single document and that users who opened more than five documents consulted documents of only one or two types. Regarding promotion on social networks, audience analysis of posts on the Facebook page showed that a link illustrated with a thumbnail generated twenty-five times more visits than a plain text link, a finding that immediately led the BnF to change its communication and outreach strategy around Gallica. However, such results only make sense through an iterative application of these methods, during which the different skills brought together by the project confront one another: information science, with the standards and formats used by the BnF; machine learning, with the processing operations; and the humanities and social sciences, when it comes to formulating hypotheses and interpreting the results and the data visualisations. As in the previous project, this interaction between three sets of skills was not linear but went through successive iterations throughout the project. The result of the analysis, subsequently compared with the other methods of the fuller study (questionnaire, interviews and video ethnography), is the fruit of this cross-cutting approach at the intersection of the three fields that are the subject of this article.
An experiment putting deep learning to the test: the iconographic search engine GallicaPix

Unlike the two projects presented above, "GallicaPix" is not a research project strictly speaking but a demonstrator, developed to propose new ways of searching the iconographic documents in the BnF's collections. This proof of concept (PoC) was carried out in   by Jean-Philippe Moreux, currently Gallica's scientific expert, in collaboration with Guillaume Chiron (L i, Université de La Rochelle). It implements a semantic indexing approach on a corpus of Gallica iconographic resources contemporary with the First World War.

At the origin of the project, an independent researcher with varied skills

However, this prototype is not, like the two previous projects, an institutional research project in which the library is a stakeholder. Conducted independently by a researcher then employed at the library, it demonstrates that projects exploiting BnF data can be built using its APIs (application programming interfaces) without the library taking part in the work or producing ad hoc tools. It could just as well have been carried out by a researcher outside the BnF, with the same data and the same tools. In this respect, "GallicaPix" is also a demonstration of what researchers could build with BnF data without necessarily setting up a partnership: this is why it is included in Gallica Studio( ) and used here as an example of the prospects offered by the Corpus project.
( ) http://gallicastudio.bnf.fr

The three phases of implementation: from data extraction to the design of the search interface

Building this demonstrator essentially involved three phases of research and experimentation: the first consisted in locating and extracting the illustrations using the APIs provided by the BnF; the second aimed to enrich them with metadata enabling them to be searched; the last dealt with designing a web search interface querying an XML database. The first phase showed once again the importance of complete bibliographic metadata and standardised vocabularies, and the challenge of handling heterogeneity when extracting data from documents, as with an advertising illustration in a newspaper.

In the second phase, metadata enrichment, several methods and techniques were tried on the image corpus extracted from Gallica's servers via the IIIF( ) (International Image Interoperability Framework) Image API: Inception-v ( ), an open-source convolutional neural network from Google, which was retrained on the iconographic genres present in the corpus (photograph, engraving, map, press cartoon, etc.); the open-source library OpenCV/dnn( ); and the IBM Watson Visual Recognition( ) and Google Cloud Vision( ) APIs for semantic indexing.
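Extracting an illustration through the IIIF Image API amounts to building a URL of the form `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`, which is the syntax defined by the IIIF Image API specification. The Python sketch below only assembles such a URL. The helper name, the `example.org` base URL and the placeholder identifier are hypothetical; the exact base URL, identifier form and supported quality keyword (for example `native` on older API versions, `default` on later ones) depend on the server and should be checked against its documentation.

```python
def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API URL from its path components.

    region: "full" or "x,y,w,h" in pixels; size: "full" (API 2.x) or e.g. "400,"
    to request a width of 400 pixels. The identifier is inserted as given; the
    specification asks for percent-encoding of reserved characters, which some
    servers relax.
    """
    return f"{base.rstrip('/')}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier: crop an 800x600 illustration whose
# top-left corner is at (100, 200), scaled to a width of 400 pixels.
url = iiif_image_url(
    "https://example.org/iiif/",
    "ark:/12148/EXAMPLE/f1",
    region="100,200,800,600",
    size="400,",
)
```

Because the crop is expressed in the URL itself, the coordinates of an illustration detected on a page are enough to retrieve it later, without storing a separate image file.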
Among the main technical obstacles encountered in the genre classification task were the confusion between visually similar genres, such as photogravure and engraving, and the difficulty of identifying illustrated advertisements in the daily press, which are a mode of communication rather than a visually homogeneous graphic form. The neural networks offered by commercial services are trained on mostly contemporary iconographic resources, which proves ill-suited to the needs of heritage institutions, since this approach shows its limits even on a twentieth-century corpus. Close collaboration with research teams would make it possible to address the specific nature of historical iconographic resources. Finally, the large volume of metadata generated poses further challenges in terms of technical architecture, computing power and storage space, as well as the standardisation and interoperability of the classification metadata produced by visual recognition APIs. Despite these limits, the "GallicaPix" web application, by querying bibliographic metadata, visual recognition metadata and the OCR of the documents all at once, and by drawing on artificial intelligence techniques, can satisfy many use cases involving iconographic resources.

What a prototype contributes to exploring new uses

This experiment demonstrates the growing maturity of techniques based on artificial intelligence for image processing, especially for contemporary images. It opens new prospects, such as the creation of iconographic datasets for researchers or engineers working on deep learning projects, and it confirms the value of standard protocols such as IIIF.
This web application project again draws on skills of different kinds. By using artificial neural networks for the automatic recognition and indexing of shapes and images, it relies on methods and techniques from artificial intelligence. By employing standard protocols and tools, such as the SRU( ) and IIIF protocols developed in the library community, it also relies on methods and techniques from information science. By designing a consultation interface that offers an improved and enriched search experience, it provides a platform belonging to the digital humanities, which could give rise to new scientific questions. However, for the effectiveness of the prototype to be tested and improved, the involvement of experts in the content of the collections and of researchers in the humanities and social sciences seems indispensable.

( ) https://iiif.io
( ) https://www.tensorflow.org/tutorials/images/image_recognition
( ) https://docs.opencv.org/ . . /d /d /tutorial_table_of_content_dnn.html
( ) https://www.ibm.com/watson/services/visual-recognition
( ) https://cloud.google.com/vision/
( ) http://www.bnf.fr/fr/professionnels/recuperation_donnees_bnf_boite_outils/a.service_sru.html
The development of a technical platform of this kind can be seen as a tool in the service of researchers and not as an end in itself; it can reveal its usefulness and relevance only through the lens of proven scientific expertise.

The Corpus research programme led by the BnF

The three projects above illustrate the interaction between three fields and three sets of skills: information science, computer science (including artificial intelligence), and the humanities and social sciences. Although the motivations for working on the BnF's digital collections and the forms this collaboration took differ, none of these three projects could have been completed without this combination of specific skills. At the crossroads of three fields, each with its own contributions, the digital humanities thus open up a range of possibilities for the library: exploring its collections with new techniques, allowing original scientific questions to be formulated and new knowledge to emerge. The projects also show that the existence of digital data available in bulk, and of tools able to process them, is an opportunity for the development of new research, which the BnF has seen emerge over several years and in which it has taken part. Several distinct motivations have led the library to get involved: sometimes the hope of improving its own production, management and access tools, sometimes the wish to increase the visibility and study of its collections. Faced with a context that is less a revolution than an evolution notable for its speed, the library decided to consider building a new range of services for researchers, adopting an approach based on experimental projects before generalising the processes.
Launched as part of the BnF's four-year research plan, the Corpus project set itself the goal of building a corpus supply service enabling text and data mining for research purposes. Proceeding in an experimental, iterative, collaborative and cross-cutting manner, this four-year research programme focuses on corpora drawn from three main coherent sets of digital data: the internet archives, digitised documents and metadata.

A new dimension for access to the internet archives

In direct continuity with the experiments carried out for the project "Le devenir en ligne du patrimoine numérisé : l'exemple de la Grande Guerre", another research project on the internet archives was conducted within this programme. The partnership with the team from the CNRS Institut des sciences de la communication in charge of the ANR Web project focused on developing an experimental application named "Archives du Web Labs" and on the full-text indexing of two corpora: the "Incunables du Web" and the "Attentats" collection. This application continues the work undertaken to build the interface for the Grande Guerre project: in addition to extracting corpus metadata, it makes it possible to search for any word present in the pages of the corpus and offers several personalisation functions, such as saving queries and exporting processing results. Implementing full-text search was a technical challenge for the BnF, which did not have the means to deploy it across the whole of the internet archive collections.
The corpus-based approach, thanks to close collaboration with the research teams interested in these data, is therefore a way of removing this obstacle. It makes it possible to envisage scaling up and offering BnF readers value-added services: most of the features of the "Archives Web Labs" application have been deployed on all the workstations giving access to digital resources in the research library, whereas the application had previously been available only to the partner research team, on a single dedicated workstation( ). A new corpus, the "Actualités" collection, was also added and indexed in full text to serve the needs of another research project, "Neonaute"( ).

( ) Some features, such as the ability to access metadata, were removed in the second version, which is accessible to the public, access to these metadata being restricted to researchers working under an agreement. This contractual framework sets conditions for complying with the provisions of the Code du patrimoine, as well as with legislation on intellectual property and data protection.
( ) UMR – LIPN – Université Paris Sorbonne Paris Cité. The "Neonaute" project was selected under the "Langue et numérique" call for projects of the Délégation générale à la langue française et aux langues de France (DGLFLF). It involves building a search and terminological study engine based on the "Actualités" corpus drawn from the BnF's web legal deposit.

Workshops to explore new dimensions of the digital humanities and artificial intelligence

In the second year of the Corpus programme, the "Giranium" project, led by a team from GRIPIC at CELSA (the information and communication sciences laboratory of Sorbonne Université), brought the digital exploitation of digitised corpora into the discussion. In connection with the research carried out by Jean-Philippe Moreux [ ] on innovative approaches to the study of digitised historical newspapers, the "Giranium" project aimed to better understand the emergence of the first cultural and media industries in France through the lens of Émile de Girardin, an emblematic figure of nineteenth-century French journalism, while putting digital humanities practices to work. Beyond digitising a nineteenth-century press corpus and running OCR on it, the project required the library to explore other aspects of a potential service offering around digital collections, such as the need for dedicated workspaces within the library's physical premises (for group work) and for methodological workshops on the digital humanities (covering formats, standards, and practices for structuring, normalising, preserving and linking information).

One of these workshops, entitled "Explorer des corpus d'images. L'IA au service du patrimoine" (cf. [ ]), was an opportunity to bring artificial intelligence very explicitly into the reflection carried out within the Corpus programme. Nine digital humanities projects involving, at various levels, the automatic recognition of writing or content-based image recognition were presented over the course of that afternoon, in a spirit of sharing experience between heritage institutions and academia.
For the library, learning about, studying and experimenting with these techniques for the automatic recognition of text or images, which notably involve neural networks, is both a new opportunity and a considerable challenge for reducing collection processing times and improving research work. For research teams, the BnF's collections are an ideal testing ground for assessing the effectiveness of tools and measuring the maturity of technologies on historical materials.

In the third year of the project, other experiments were carried out, notably on exploring and reusing the bibliographic metadata that the library collects or creates in the course of its activity. These data, which have been placed under the French State's Open Licence, are essential for managing and retrieving information, but they can also be an object of investigation in their own right for research projects. They can be queried, for example, for demographic studies of authors (cf. [ ]). As in the second year, a research team contributed to the progress of the programme. The ANR project "Foucault fiches de lecture" aims to digitise, publish online, index, describe and enrich Michel Foucault's handwritten reading notes, using a digital platform for collaborative work. This platform gives access to the digitised reading notes, allows metadata to be enriched through a mashup and alignment with the bibliographic and biographical data of data.bnf.fr, and provides a transcription of each note.
This semi-automatic transcription is produced with the Transkribus software, which is based on artificial intelligence technology and, after a training phase using neural networks, enables handwriting recognition as well as keyword search. Despite the need for meticulous line-by-line work, the team observed an average handwriting recognition success rate of %, once training had been completed. The exchanges with the team, notably during the workshop "Penser, classer, modéliser. L'exemple du projet Foucault fiches de lecture" [ ], confirmed the growing effectiveness of this type of approach and help to anticipate the issues that will arise from the Corpus project with regard to automatic handwriting recognition.

A commitment to exchange and dialogue with potential audiences

Alongside the experiments carried out in collaboration with researchers, a study was conducted in the second year of the Corpus programme to better identify the needs of research teams, particularly in terms of dedicated spaces. The methodology combined a qualitative interview survey, informal observations made during two workshops on digital humanities themes, and a participatory workshop using the persona method from UX (user experience) design. The study notes that research teams need digital collections to be available remotely and on the move, but it also explores the potential added value of a physical space in the library devoted to the study and analysis of digital corpora.
Beyond the need for such a space for consulting and analysing in-copyright corpora (under the Code du patrimoine, in-copyright documents from legal deposit are accessible only on the institution's physical premises), the possibility of immediate access to the library's various forms of expertise is seen as the main added value of a physical place. In the spirit of a renewed dialogue between the research community and libraries, the model taking shape in this space would allow research teams to put scientific and technical questions to library staff, who would contribute expertise on the holdings, on legal questions and on technical aspects, notably formats and tools. Infrastructure and software tools, notably for data mining, would be deployed there. This model would therefore bring together the three elements (information science, computer science or artificial intelligence, and the humanities and social sciences) mentioned earlier and illustrated by the three example research projects.

Another factor, pragmatic but notable, identified as encouraging the use of a physical space at the BnF is the current shortage of premises in Parisian universities. As defined in the study report (cf. [ ]), this future space in the library is envisioned as easy to access, welcoming, able to host training sessions, events and presentations of research work, and able to evolve at the pace of innovation and technological progress.
Following this report, the Corpus programme is moving forward along several lines of research: designing and setting up the on-site service offering around digital collections; continuing and improving existing arrangements under the policy of disseminating data remotely and online; developing a secure infrastructure for building and analysing digital corpora; coordinating these two complementary research service offerings (online and on site); mapping skills; systematising processes and procedures; drawing up a roadmap for artificial intelligence; and positioning the institution strategically within the research ecosystem, both French and international.

In conclusion: towards a place and a model of scientific collaboration for knowledge of the national digital heritage

Traditionally, a library's mission is to collect, preserve, describe and communicate the objects that together come to constitute a heritage. Because of legal deposit, the Bibliothèque nationale de France makes no value judgement, whether moral, aesthetic or social, in selecting the documents that will become part of the national collections, unlike a conventional collection development policy as practised by other types of libraries (university or public). The consequence of this distinctive feature is that the BnF's collections contain considerable masses of documents that cannot be found elsewhere, which reflect the spirit of their time and can serve as sources for studying the society that produced them. With digital technology, this potential for study and knowledge is multiplied, because it becomes possible to apply distant reading methods to these digital materials (cf. [ ]).
These hyper-reading approaches, grounded in automation and quantification, have continued to arouse a degree of scepticism, particularly in academic circles linked to the humanities, ever since the first experiments in quantitative history in the second half of the twentieth century. Whether they lead to the design of interfaces, the development and improvement of algorithms, statistics or visualisations, these approaches are often accused of arriving at findings that are already known, or of depending too heavily on the biases present in the algorithms or in the corpora used for the analyses. Aware of these limits, the library has seen requests for access to digital corpora increase and has decided to take part in research projects, treating these approaches not as research objects in themselves but as means, among others, of addressing scientific questions.

Through these many collaborations with research teams, the library has identified a threefold benefit in developing an activity around the digital humanities and, more generally, around all uses of digital collections. First, this new field holds the promise of renewing, or even winning back, the researcher audience for the study of the collections and knowledge of the heritage. Second, pooling and building on methods, tools and techniques, notably those based on artificial intelligence, opens up prospects both for knowledge of the digital heritage and for its management by the library.
Finally, using methods drawn from artificial intelligence to enrich digital content (OCR, for example) or for the mediation and promotion of heritage makes it possible to imagine a new framework for scientific work on the collections, in which the skills of librarians, computer scientists and researchers combine and complement one another to make the national heritage better known.

By promoting complementarity between quantitative and qualitative methods, between humans and intelligent systems, and between information science, computer engineering, artificial intelligence and the digital humanities, the service offering that will result from the Corpus research programme is intended to be a place and a model of scientific collaboration rooted in multidisciplinarity. This new model is a promising way for the library to improve and secure the building and communication of its digital heritage. Embodied in a physical place, the model will make it possible to renew the dialogue between heritage institutions and the research community. Also virtual, in the form of a secure infrastructure, it will be able to meet the needs for mobility, dynamism and speed of homo numericus. In an international context in which relations between libraries and the research community are being redefined, and in which the issues and ethical questions surrounding the GAFA, as well as artificial intelligence skills within libraries, are being defined, this service offering is shaped in favour of open data, open science, scholarly sociability and a knowledge economy based on sharing and cooperation.
Through the development of intelligent procedures and tools, such as automatic content indexing engines, for managing massive data flows, it is intended to perpetuate in the infosphere the balance between the microscopic view of research, which lies in the specificity of scientific questions, and the macroscopic view of a library with an encyclopaedic and universal vocation.

Bibliography

[ ] V. Beaudouin, « Forums en ligne : des espaces de co-production de la connaissance et du lien social », in L'ordinaire d'internet (O. Martin & É. Dagiral, éds.), Armand Colin, Paris, , p. - .
[ ] V. Beaudouin, P. Chevallier & L. Maurel, Le web français de la Grande Guerre. Réseaux amateurs et institutionnels, Presses universitaires de Paris Nanterre, .
[ ] V. Beaudouin & L. Maurel, « La commémoration de la Grande Guerre sur le web : présence et diffusion du patrimoine numérisé », Matériaux pour l'histoire de notre temps - ( ), p. - .
[ ] V. Beaudouin & Z. Pehlivan, « Cartographie de la Grande Guerre sur le web : rapport final de la phase du projet "Le devenir en ligne du patrimoine numérisé : l'exemple de la Grande Guerre" », research report, Bibliothèque nationale de France; Bibliothèque de documentation internationale contemporaine; Télécom ParisTech, , https://hal.archives-ouvertes.fr/hal- .
[ ] E. Bermès, « Préfiguration d'un service de fourniture de corpus numériques à destination de la recherche », , http://c.bnf.fr/fom.
[ ] E. Bermès, « Text, data and link-mining in digital libraries : looking for the heritage gold », in IFLA Satellite Meeting – Digital Humanities – Opportunities and Risks : Connecting Libraries and Research (Berlin, Allemagne), , https://hal.inria.fr/hal- .
[ ] E. Bermès, « Text, data and link-mining in digital libraries : looking for the heritage gold », , Library Science Talks, https://indico.cern.ch/event/ /attachments/ / /lstalks- -bermes_en_v .pdf.
[ ] E. Bermès, « Quand le dépôt légal devient numérique : épistémologie d'un nouvel objet patrimonial », Quaderni ( ), p. - .
[ ] Bibliothèque nationale de France, « Contrat d'objectifs et de performance - », http://www.bnf.fr/documents/contrat_performance.pdf, .
[ ] Bibliothèque nationale de France, « Il était une fois dans le web : ans d'archives de l'internet en France », http://c.bnf.fr/fse, .
[ ] Bibliothèque nationale de France, « Quels usages aujourd'hui des bibliothèques numériques? Enseignements et perspectives à partir de Gallica », http://c.bnf.fr/fuz, .
[ ] A. Bouchard, « Présentation du projet Corpus à la BnF », https://webcorpora.hypotheses.org/ , .
[ ] P. Chevallier, « Web de la mémoire et mémoire du web », Revue de la BnF ( ), no , p. - .
[ ] F. Glorieux, « Femmes de lettres, démographie (data.bnf.fr ) », https://resultats.hypotheses.org/ , .
[ ] G. Illien, P. Sanz, S. Sepetjan & P. Stirling, « La situation du dépôt légal de l'internet en France : retour sur cette nouvelle législation, sur sa mise en pratique depuis cinq ans, et perspectives pour le futur », in Actes du e congrès de la Fédération internationale des associations de bibliothécaires et d'institutions (IFLA) (San Juan, Porto Rico), , http://conference.ifla.org/past-wlic/ / -stirling-fr.pdf.
[ ] O. Jacquot, « Stratégie de recherche de la Bibliothèque nationale de France », Revue Patrimoines. Enjeux contemporains de la recherche ( ), p. - .
[ ] A. Le Follic, P. Stirling & B. Wendland, « Putting it all together : creating a unified web harvesting workflow at the Bibliothèque nationale de France », http://netpreserve.org/wp-content/uploads/iipc_project-putting_it_all_together-web_harversting_workflow_at_bnf.pdf, .
[ ] E. Moiraghi, « Décrire, transcrire et diffuser un corpus documentaire hétérogène : méthodes, formats, outils », https://bnf.hypotheses.org/ , .
[ ] E. Moiraghi, « Géolocalisation et spatialisation de documents patrimoniaux : trois heures de partage autour de la cartographie numérique », https://bnf.hypotheses.org/ , .
[ ] E. Moiraghi, « Données liées et données à lier : quels outils pour quels alignements? », https://bnf.hypotheses.org/ , .
[ ] E. Moiraghi, « Le projet Corpus et ses publics potentiels : une étude prospective sur les besoins et les attentes des futurs usagers », https://hal-bnf.archives-ouvertes.fr/hal- , .
[ ] E. Moiraghi, « Penser, classer, modéliser. L'exemple du projet Foucault fiches de lecture », https://bnf.hypotheses.org/ , .
[ ] E. Moiraghi & J.-P. Moreux, « Explorer des corpus d'images. L'IA au service du patrimoine », https://bnf.hypotheses.org/ , .
[ ] F. Moretti, Distant reading, Verso, .
[ ] J.-P. Moreux, « Approches innovantes pour la presse ancienne numérisée : fouille et visualisation de données », https://bnf.hypotheses.org/ , .
[ ] J.-P. Moreux, « Data mining historical newspaper metadata – old news teaches history », in IFLA News Media Section Conference (Hamburg), .
[ ] J.-P. Moreux, « Plongez dans les images de - avec notre nouveau moteur de recherche iconographique GallicaPix », https://c.bnf.fr/gxs, .
[ ] J.-P. Moreux, « Recherche d'images dans les bibliothèques numériques patrimoniales – expérimentation de techniques d'apprentissage profond », Documentation et bibliothèques ( ), no , p. - .
[ ] J.-P. Moreux & G. Chiron, « Hybrid image retrieval in digital libraries : a large scale multicollection experimentation of deep learning techniques », in Digital Libraries for Open Knowledge (Cham), Springer International Publishing, , nd International Conference on Theory and Practice of Digital Libraries , Porto, p. - .
[ ] A. Nouvellet, V. Beaudouin, D. Florence, C. Prieur & F. Roueff, « Analyse des traces d'usage de Gallica : une étude à partir des logs de connexions au site Gallica », https://hal.archives-ouvertes.fr/hal- , .
[ ] T. Pardé & J. Olivier, « Les humanités numériques à la Bibliothèque nationale de France », Revue Patrimoines. Enjeux contemporains de la recherche ( ), no , p. - .
[ ] S. Peter, « Le dépôt légal de l'internet dans le projet Corpus », https://webcorpora.hypotheses.org/ , .
[ ] V. Vincent, « Atelier BnF Corpus (II) – Penser, classer, modéliser », https://ffl.hypotheses.org/ , .

Abstract. In a context of increasing volumes of data and reduced processing times, the National Library of France is facing several challenges and developments.
In order to collect, preserve, describe and enable the study of massive and heterogeneous data sets, the library uses not only methods of information sciences but also techniques developed in the field of computer science, especially in artificial intelligence. This growing need to bring together complementary skills, combined with the research opportunities opened by these digital collections, has led the library to create a space for supporting digital humanities.

Keywords. Digital heritage, digital corpora, data mining, artificial intelligence, machine learning, deep learning, digital humanities, information science, digital scholarship, big data.

Resumen. En un contexto de aumento de los volúmenes de datos y de reducción de los tiempos de procesamiento, la Biblioteca Nacional de Francia se enfrenta a varios retos y evoluciones. Con el objetivo de colectar, preservar, describir y permitir el estudio de conjuntos de datos masivos y heterogéneos, ésta no sólo recurre a los métodos de la ciencia de la información, sino que también utiliza técnicas informáticas cada vez más desarrolladas en el ámbito de la inteligencia artificial. Esta creciente necesidad de convocar competencias complementarias, además de las oportunidades que ofrecen las colecciones digitales para la investigación, en particular en las ciencias humanas y sociales, induce a la biblioteca a la definición de un espacio para el desarrollo de las humanidades digitales.

Palabras claves. Patrimonio digital, text and data mining, deep learning, inteligencia artificial, humanidades digitales, ciencias de la información, big data.

Manuscript received in August, accepted in March.

Politecnico di Torino, Repository Istituzionale

Semantic enrichment for recommendation of primary studies in a systematic literature review / Rizzo, Giuseppe; Tomassetti, Federico; Vetrò, Antonio; Ardito, Luca; Torchiano, Marco; Morisio, Maurizio; Troncy, Raphael. In: Digital Scholarship in the Humanities (print).
Publisher: Oxford University Press. Published DOI: . /llc/fqv. Terms of use: openAccess. This article is made available under terms and conditions as specified in the corresponding bibliographic description in the repository.

Semantic enrichment for recommendation of primary studies in a systematic literature review

Giuseppe Rizzo*, Federico Tomassetti, Antonio Vetrò, Luca Ardito, Marco Torchiano, Maurizio Morisio, Raphaël Troncy

EURECOM, Sophia Antipolis, France, [giuseppe.rizzo, raphael.troncy]@eurecom.fr
Politecnico di Torino, Turin, Italy, [federico.tomassetti, luca.ardito, marco.torchiano, maurizio.morisio]@polito.it
Technische Universität München, Germany, vetro@in.tum.de

Abstract. A systematic literature review (SLR) identifies, evaluates and synthesizes the literature available for a given topic. This generally requires a significant human workload and is subject to a subjectivity bias that could affect the results of such a review. Automated document classification can be a valuable tool for recommending the selection of studies. In this paper, we propose an automated pre-selection approach based on text mining and semantic enrichment techniques. Each document is first processed by a named entity extractor. The DBpedia URIs coming from the entity linking process are used as external sources of information. Our system collects the bag of words of those sources and adds it to the initial document. A multinomial naive Bayes classifier determines whether the enriched document belongs to the positive example set or not. We used an existing, manually performed SLR as the benchmark dataset. We trained our system with different configurations of relevant documents and tested the quality of our approach with an empirical assessment.
Results show a reduction of % in the manual workload that a human researcher has to spend, while holding a remarkable % of recall, an important condition given the very nature of SLRs. We measured the effect of the enrichment process on the precision of the classifier and observed a gain of up to %.

* Corresponding author: Giuseppe Rizzo.

Introduction

A systematic literature review (SLR) is a research methodology used to identify, analyze and interpret all available evidence related to a specific research question in a way that is unbiased and (to a degree) repeatable (Kitchenham ). An SLR has to be performed according to a pre-defined protocol describing how primary studies are selected and categorized, reducing subjectivity bias as much as possible. The protocol changes depending on the research field in which it is applied. In this paper, we focus on an SLR applied to the field of software engineering, where the protocol can be summarized by the following steps (Kitchenham ): (i) identification of research, (ii) selection of primary studies, (iii) study quality assessment, (iv) data extraction and monitoring progress, (v) data synthesis. The first step defines the search space, i.e. the set of documents from which researchers select papers. A small sample set of relevant documents is used to define the search space. The second step identifies and analyses all potentially useful studies, among the papers contained in the search space, that can help to answer the research questions. In the third step, the quality of the collected studies is assessed, while in the fourth step the data extraction forms are delivered according to the review under evaluation. The last step delivers the data synthesis methods. Although these steps appear sequential, they are better regarded as iterative, so their outputs may evolve as the topics evolve.
the entire process is supervised and guided by researchers who summarize all existing information about some phenomena in a thorough and, potentially, unbiased manner. the final goal is to draw more general conclusions about some phenomena derived from individual studies, or as a prelude to further research activities.

a slr has a crucial importance in all research fields but it is extremely time-consuming, requiring a significant human workload which is costly and error prone. even though full automation of a slr is not possible, because human reasoning is needed for the aggregation and interpretation of scientific results, we believe that tool support in the selection of the primary studies can reduce the human workload necessary in that phase, without losing knowledge (which is a particularly important condition given the very nature of slrs). therefore, the objective of this paper is to reduce the human workload in a slr by semi-automating the selection of primary studies (i.e. the second step of the slr process). the benefit depends on the dimensions of the search space: the larger the search space is, the more effective our proposed approach will be. our method focuses on a filter strategy that resorts to semantic enrichment and text mining techniques to reduce the number of papers that researchers who perform a slr should read. we use a text classifier to filter potentially interesting documents within the search space. the classifier produces a reduced set which contains a higher percentage of interesting documents than the initial set. afterwards, this reduced set is manually examined by researchers. in this way, we reduce the workload required of all researchers, limiting the human error rate. such errors usually occur when a set is sparse and searching through it requires more effort than in a clean set, where the noise is smaller.

(note: a primary study is, in the context of evidence, an empirical study investigating a specific research question (kitchenham ).)
rq  does the automatic selection process based on the multinomial naive bayes classifier and semantic enrichment (enriched process) reduce the amount of manual work of a slr with respect to the original process?
rq  does the automatic selection process based on the multinomial naive bayes classifier and semantic enrichment (enriched process) reduce the amount of manual work with respect to the alternative version of the process that uses only the multinomial naive bayes classifier (non-enriched process)? in other words, we aim to validate the idea behind the use of enriched papers as test samples instead of using original papers as test samples.

the approach presented in this paper is based on a previous work (tomassetti et al. ). the following improvements are proposed. while previously the automatic classification was planned to fully automate the entire selection step, in this paper we propose a semi-supervised approach. this is because papers selected by the automatic classifiers could be immediately discarded by a human researcher just by looking at the title and the abstract, and do not necessarily need to be fully read. in addition, we perform an evaluation on a much larger dataset, extending the benchmark dataset size from the previous papers to the current papers (almost times larger). finally, we present an exhaustive task-based evaluation.

the remainder of this paper is organized as follows. section compares our approach with the state of the art in the slr domain. section details the steps of selecting primary studies and section presents our approach to improve this step. section describes the use case we use to validate our approach. in section , we report and discuss the results we obtained. finally, we give our conclusions and outline future work in section .

related work

the automatic text classification applied to a systematic review is more challenging than the typical classification task.
this is basically due to the dynamic nature of a slr, which is a supervised and iterative process where the initial scope of the slr often evolves during the review process. numerous research efforts have been spent to reduce the human workload when a slr is performed. we focus on two different types of studies: i) machine learning based, and ii) ontology based.

cohen et al. proposed a first attempt to reduce the human workload in the slr field (cohen et al. ). they used automatic classification to discard non-interesting papers in fifteen different medical systematic literature reviews, each one considering the validity of a particular drug. their classification model uses a reduced set of the features gathered from the paper such as author name, journal name, journal references, abstract, introduction, and conclusion. the classification model is built using negative examples as well as positive examples, where negative examples are selected from the pool of papers which do not adhere to the chosen slr. finally, this model is used to create a perceptron modified vector for each feature in the feature set. negative examples bias the model. in order to limit this phenomenon, they introduced a perceptron learning adjustment that evaluates only the false negatives and false positives, monitoring them according to the false negative linear rate (fnlr). a test article is classified by taking the scalar product of the document feature vector with the perceptron vector and comparing the output values. considering a recall of %, the reduction of workload ranges from % to % according to the slr they took under evaluation. similarly to cohen et al.'s work, in our approach we evaluate the reduction of human workload, while holding a % of recall for the classifier. the experiment we conduct is inspired by this work, but we differ in terms of feature selection and the classifier used.
for the former, we use a bag of words model enriched with further descriptions available in an external knowledge base; for the latter, we use a multinomial naive bayes classifier. the human workload and the precision we achieve are of the same order of magnitude as those observed by cohen et al. (above the average) on fifteen medical literature reviews. however, due to the difference of the slr domains (medical for cohen et al., software engineering in this paper), we cannot exhaustively compare the two approaches. among the findings, cohen et al. suggested that automatic classification may be useful to regularly monitor new relevant journal issues in order to identify interesting primary studies, easing the task of keeping a slr constantly updated. according to this result, it is crucial to consider the classification problem in the slr field as a semi-supervised approach in which a human being supervises the inclusion or exclusion of possibly relevant studies selected by the classifier.

another attempt to reduce the human workload in selecting relevant primary studies was performed by (matwin et al. ). they proposed an approach mainly based on the naive bayes classifier with some optimizations which are based on the complement naive bayes (cnb) (rennie et al. ). the results they achieved outperform what is detailed in (cohen et al. ), but using different configuration parameters (they consider only title and abstract for each document instead of the large set of features considered by cohen).

leveraging natural language processing (nlp) techniques, cohen et al. tackle the problem of paper handling once the review starts (cohen ). this is practically done to allow the reviewer to first analyze the documents which are labelled as potentially relevant, leaving the evaluation of the remaining ones to the end.
they combined the approach of unigram and medical subject headings (mesh) to create the histogram of documents which potentially fit the scope of the review.

in (ruttenberg et al. ), the authors proposed a hybrid approach for automating scientific literature search by means of data aggregation and text mining algorithms to ease the search process. the key point of their work was to find a way to represent and share, by means of an ontology, the knowledge learned by human beings reading relevant papers. through it, it was possible to combine the outcomes of each single document and to represent them in a graph, which is mapped to the ontology. the first step of this process consists of identifying the key phrases of the document (outcomes). then, key phrases are used to link different concepts in the graph. following this process, concepts are linked together, obtaining a chain of relationships. this work is usually done by human beings who are experts of the domain. ideally, they should be objective, but the authors assessed that the graph mapping is strongly affected by the expert's subjectivity. they then proposed a mechanism based on text mining algorithms to navigate and cluster inferences. this work represents the first attempt to introduce the concept of knowledge representation in a slr and, among the findings, they stated that pre-clustering and linking of documents limit the human subjectivity, improving the overall result.

selection of primary studies

in this section, we detail the selection step of the slr process, analyzing its strengths and weaknesses according to the guidelines described in (kitchenham ). this step takes as input the set of primary studies w gathered from a collection assumed to be the universe of all scientific papers in the domain of interest of the review.
w results from the first step of the process and it is obtained as the output of the search process performed by human beings using keywords on dedicated sources. for instance, w could be composed of all papers published by a given set of journals or of all papers that a digital library provided as result of the search with keywords. the selection of primary studies is divided into two sub-steps: the former operates a selection based on reading titles and abstracts (first selection), the latter is the decision based on the human analysis of the full text (second selection). both sub-steps are governed by the same basic criterion: does the study fit the research field? we define c (candidate studies) as the set of studies that successfully passed the first selection and are eligible to be processed by researchers in the second selection step. the second selection has the goal of splitting c into i (included studies) and e (excluded studies), where:
– i is the set of studies in c which successfully passed the second manual selection and will contribute to the systematic review. the following relation holds: i ⊆ c.
– e is the set of studies in c which did not pass the second manual selection and will not contribute to the systematic review and synthesis. hence, e ⊆ c and e ∩ i = ∅.
figure illustrates the selection of primary studies step. as introduced in the previous section, the selection of primary studies is performed by human beings who usually apply selection criteria. however, the application of those criteria can rarely be completely objective, and it is frequently affected by the subjective opinions of the involved researchers. a semi-supervised approach aims to reduce this potential bias.

fig. . selection of primary studies in a systematic literature review

approach

the proposed approach relies on text mining techniques and semantic enrichment to reduce the set of interesting papers a researcher has to evaluate.
the approach consists of a semi-supervised iterative process built on top of the following assumption: w ≠ ∅ (as a result of the applied search strategy) and i ≠ ∅ at the beginning (the set of relevant documents already known is not empty when the systematic review starts). the output of this approach is the set of most interesting papers w gathered from a larger set of unread papers w .

. i construction

the initial set of sources contained in i is named i and it is composed of primary studies already classified as relevant for the review: this is the first step of our process and it is needed to start the iterative part of the algorithm. i can be built in two different ways. the first way is to ask researchers to use their previous knowledge to indicate the most well known and fundamental papers in the field of interest. this strategy considers that, often, systematic reviews are undertaken by experts in the field. the second way is to explore a portion of the search space using the basic process, e.g. searching on digital libraries or selecting the issues of (a) given journal(s). this portion is marked as i and the enriched process is used to explore the remaining search space.

. model building

the second step of our approach consists in automatically computing a model m from i . the idea is to build a bag of words (bow) model starting from the primary studies in i . for each study, we considered the words from the abstract and introduction. according to (cohen et al. ), words which appear at the beginning and at the end of a document (such as title, abstract, introduction and conclusion) are more significant. we empirically assessed that using a reduced set of words, coming only from abstract and introduction, provides the same results as considering the extended set of words (i.e. set of words coming from the title,
section . ) compensates a reduced cardinality of the bow through linking external sources and gathering from them textual data. finally, we perform stop words elimination and stemming process, using the porter algorithm (porter ). the model built is used to train a multinomial naive bayes classifier which computes the weight for each word according to the tf-idf normalized approach (kibriya et al. ). . semantic enrichment we define wi a document composed by the bow collected from the abstract and the introduction of one paper wi w . each wi is processed to get a bag of named entities n which features wi. a named entity is a name of a person or an or- ganization, a location, a brand, a product, a numeric expression including time, date, money and percent found in a sentence (grishman & sundheim ). basically, it is an information unit described by a set of classes (e.g. person, location, organization) which may be further disambiguated by an entry in a knowledge base such as dbpedia or freebase. in this work we disambiguate entities to dbpedia (bizer et al. ), with the rationale of linking them to external knowledge base entries. we then will fetch the abstract description of those entries and we join the existing textual content with the retrieved tex- tual data. the encyclopedic nature of this dataset is appropriate to enrich the content of each wi. once we have extracted the bag of named entities n, we link each ni n to the corresponding dbpedia resource (when it is available). the extraction of named entities is performed using opencalais . opencalais provides a classification for each named entity and suggests a uri of an external source where the information is disambiguated. relying on it, we point to a db- pedia resource defined by the owl:sameas property. 
since not all the instances in the opencalais knowledge base (http://www.opencalais.com) have the owl:sameas property, to minimize the loss we used a lookup step that searches dbpedia for entries matching the labels of the extracted entities (e.g. an occurrence of systematic literature review is mapped to http://dbpedia.org/resource/systematic_review). once the resource is found, we collect all words contained in the description field (dbpedia-owl:abstract property). the abstract property is one of the descriptive properties, whose usage is consistent across the entire dbpedia dataset. after collecting these descriptions, we add them to the bag of words natively taken from the document wi. we call this the enrichment process; the resulting document is defined as w+i, and with bow+ we refer to the bag of words extracted from w+i. finally, it is compared with the trained model m using a naive bayes classifier which is described below.

. classification

we used a multinomial naive bayes (mnb) classifier and we implement the tf-idf weight normalization. the choice of the multinomial naive bayes classifier was based on two criteria: ( ) the characteristics of the specific data and classification problem, and ( ) the focus of the approach:

. a first characteristic in this use case is the small training set, which is a peculiarity of the problem under study (i.e. the common situation is that the initial set of available papers is not large at the beginning of a literature search). usually, a specific configuration of the classification algorithm parameters can improve the performance of a classifier (forman & cohen ). however, this is not a task that we expect from a normal user, given that we address a very transversal and general problem. instead, naive bayes models are more robust towards shifts in the training distribution (elkan ).
another characteristic is the data heterogeneity, because every word is interpreted as a feature, thus leading to the well known problems of sparsity (which produces the so-called curse of dimensionality). common text classifiers such as support vector machines (svms), which are more often used for text classification purposes (murphy ), suffer particularly from it, leading to overfitting issues (cawley & talbot ). in such fuzzy contexts, naive bayes (nb) approaches corrected with tf-idf are competitive (rennie et al. ). we then opt for the mnb setting since it has been shown to give the best results compared with other nb variants for such a context (kibriya et al. ). finally, slrs produce highly imbalanced datasets. as a matter of fact, in our case study only articles over are interesting (cfr. section . ). typical solutions to this type of problem are resampling techniques or hybrid algorithms (chawla et al. , chawla ). while the first type of solution is not applicable to the case of systematic literature reviews, the second one carries the risk of a too specific implementation, which is not the focus of our study.

. the classification task in our case is subordinate to the enrichment process. for this reason our focus is to show that even with a very simple classifier, such as the mnb, the enrichment process is worthwhile: in fact, we show that using the bow+ produces better results than using the original bow in terms of saved manual work (from % to % reduction), preserving the recall beyond %, which is a very high value for all types of classification.

we use the classifier to compare w+i with the model m and we determine whether the conditional probability that w+i belongs to i is significant or not. this preserves the context of the initial documents from which the entities are extracted, so that the classifier decides according to the entire bag of words rather than the extracted named entities alone.
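as an illustration of the decision described above, the following is a minimal sketch of a multinomial naive bayes classifier with laplace smoothing and a posterior threshold. it makes several assumptions that are not in the paper: raw term counts are used instead of tf-idf weights, the two class labels, the toy training documents and the value of `THRESHOLD` are invented, and tokens unseen in training are simply ignored.

```python
import math
from collections import Counter

def train_mnb(docs_by_class, alpha=1.0):
    """Fit a multinomial naive Bayes model from {label: [token lists]}."""
    vocab = {tok for docs in docs_by_class.values() for doc in docs for tok in doc}
    total_docs = sum(len(docs) for docs in docs_by_class.values())
    model = {}
    for label, docs in docs_by_class.items():
        counts = Counter(tok for doc in docs for tok in doc)
        denom = sum(counts.values()) + alpha * len(vocab)  # Laplace smoothing
        model[label] = {
            "log_prior": math.log(len(docs) / total_docs),
            "log_lik": {t: math.log((counts[t] + alpha) / denom) for t in vocab},
        }
    return model

def posterior(model, tokens, label):
    """Posterior probability of `label`; tokens outside the vocabulary are ignored."""
    log_scores = {
        lab: p["log_prior"] + sum(p["log_lik"][t] for t in tokens if t in p["log_lik"])
        for lab, p in model.items()
    }
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    exp_scores = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    return exp_scores[label] / sum(exp_scores.values())

# Invented toy training data (not the paper's dataset).
training = {
    "interesting": [["cost", "estimation", "model"], ["effort", "estimation", "regression"]],
    "other": [["compiler", "optimization"], ["network", "protocol", "latency"]],
}
model = train_mnb(training)

THRESHOLD = 0.5  # hypothetical value; the paper tunes it to reach a target recall
p = posterior(model, ["software", "cost", "estimation"], "interesting")
recommended = p >= THRESHOLD
```

a paper whose posterior reaches the threshold is handed to the researchers; the others stay in the unread set.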
adopting boolean logic, we assume that all papers which do not belong to i belong to e. the comparison is done for each w+i ∈ w : papers with p[w+i ∈ i] ≥ threshold are moved to w and they are manually analyzed by researchers. finally, all the papers whose p[w+i ∈ i] < threshold remain in w .

. iteration

the papers with p[w+i ∈ i] ≥ threshold are moved to w to be manually processed, whilst the remaining ones still remain in w . it is likely that some of the papers moved to w will pass the manual selection and will go to i, while the others will go to e. when i is modified, m becomes obsolete and it is necessary to re-build the model and repeat the classification step for all papers w+i ∈ w . again, if p[w+i ∈ i] ≥ threshold, w+i is moved to w to be manually analyzed. if no w+i goes to w , i.e. w = ∅ after a classification, the iteration stops. papers that remain in w after the last iteration are finally discarded and not considered by researchers. the exclusion of these papers represents the reduction in workload for the human researchers. at each iteration, the model is progressively tailored to the domain of interest, which refines the selection of primary studies.

algorithm  enriched selection process algorithm
define i
init i with i
repeat
    /* automatic recommendation of primary studies */
    train classifier with i
    extract model m
    for all wi in w  do
        enrich wi obtaining w+i
        compare w+i with model m:
        if p[w+i in i] ≥ threshold then
            move wi to w
        end if
    end for
    /* first selection */
    for all w i ∈ w  do
        manually read title and abstract
        (w i ∈ i ) ? move w i to c : discard w i
    end for
    /* second selection */
    for all ci ∈ c do
        manually read full paper
        (ci ∈ i ) ? move ci to i : move ci to e
    end for
until c = ∅
discard wi ∈ w

we provide in algorithm the synopsis of the whole study selection process proposed in this paper and in figure its complementary graphical representation.
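the iterative part of the algorithm can also be sketched in a few lines of runnable code. this is a simplified illustration, not the authors' tool: the first and second manual selections are merged into one hypothetical callable `human_accepts`, the classifier is replaced by a hypothetical `score` function (here a jaccard overlap of invented keyword sets), and the loop stops when no paper is recommended.

```python
def iterative_selection(unread, seed, score, human_accepts, threshold):
    """Run the recommend / review loop until no paper is recommended.

    unread:        ids of papers not yet examined
    seed:          ids of papers already known to be relevant
    score:         hypothetical callable (paper, included) -> estimated P[paper in I]
    human_accepts: hypothetical callable standing in for the manual selection
    Returns (included, excluded, discarded).
    """
    included, excluded, unread = set(seed), set(), set(unread)
    while True:
        # The "model" is rebuilt implicitly: score() sees the current included set.
        recommended = {p for p in unread if score(p, included) >= threshold}
        if not recommended:
            break
        unread -= recommended
        for paper in recommended:
            (included if human_accepts(paper) else excluded).add(paper)
    return included, excluded, unread

# Invented toy data: each paper is described by a keyword set.
keywords = {
    "a": {"cost", "estimation"},
    "b": {"estimation", "effort"},
    "c": {"effort", "regression"},
    "d": {"compiler"},
}

def jaccard_score(paper, included):
    """Best Jaccard overlap between the paper and any already included paper."""
    return max(
        len(keywords[paper] & keywords[q]) / len(keywords[paper] | keywords[q])
        for q in included
    )

truly_relevant = {"a", "b", "c"}
included, excluded, discarded = iterative_selection(
    unread={"b", "c", "d"},
    seed={"a"},
    score=jaccard_score,
    human_accepts=lambda p: p in truly_relevant,
    threshold=0.3,
)
```

in this toy run paper "c" is only reached in the second iteration, after "b" has been added to the included set, which is the progressive tailoring of the model that the iteration step describes.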
comparing the figure of the enriched process with figure , which represents the selection process provided by the guidelines (kitchenham ), we observe that the original process is not changed, but we have added a selection of primary studies that recommends papers similar to the model at each iteration. we also reported in figure the steps of the new process described in subsections . to . : the use of a model of bag of words (b) derived from i or i (a), the enrichment of papers through semantic enrichment (c) and the comparison of the model m with the studies through a multinomial naive bayes classifier (d).

experimental settings

the proposed approach has been implemented in the semantic systematic review tool, which is publicly available at https://github.com/ftomassetti/semreview. the tool allows the loading of an already performed slr for which both the set of interesting papers and the set of non-interesting ones are known. this enables experiments to be run to assess the effectiveness of our approach. the tool creates the initial set of relevant papers i (papers which belong to the i set) by randomly selecting a sub-set of the interesting papers defined by the slr. in doing so, the tool simulates the operation performed by human researchers at the beginning of the slr. the other interesting papers, together with the non-interesting ones, end up in w . this set is used for assessing the performance of the approach. from i , the tool extracts the corresponding bow and initializes the model m. then, for all the papers in w , the tool automatically performs the recommendation of the primary studies (the second step in the slr process), implementing the approach described in section . finally, the tool reports the performance of the approach using as ground truth the slr taken as reference. the performance is measured as the amount of the saved manual work.
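the measurement just described can be sketched as follows, under simplifying assumptions: manual work is counted as the number of papers a human has to read, the paper identifiers and set sizes are invented, and the function name `evaluate` is hypothetical rather than part of the released tool.

```python
def evaluate(recommended, relevant, search_space_size):
    """Return (recall, saved_fraction) for one simulated run.

    recommended:       ids of the papers the classifier asks a human to read
    relevant:          ids of the papers the reference SLR marked as interesting
    search_space_size: number of papers a fully manual review would read
    """
    recall = len(recommended & relevant) / len(relevant)
    saved_fraction = 1 - len(recommended) / search_space_size
    return recall, saved_fraction

# Invented toy run: 10 papers in the search space, 3 of them relevant.
relevant = {"p1", "p2", "p3"}
recommended = {"p1", "p2", "p3", "p7", "p9"}
recall, saved = evaluate(recommended, relevant, search_space_size=10)
```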
the baseline in the experiment is given by the semi-supervised automatic approach without the semantic enrichment mechanism.

fig. . the enriched study selection process and its principal steps: model extraction (b) after i is built (a), enrichment of papers through semantic enrichment (c) and comparison with the model through a multinomial naive bayes classifier (d).

(note: the released version of the tool is a research prototype. it does not include some of the additional scripts used to run the experiments.)

. benchmark dataset

as a case study we selected a slr on software cost estimation done by (jorgensen & shepperd ) and we limit the ground truth to all the papers mentioned in the slr coming from the ieee transactions on software engineering (ieee tse) journal. they cover a timeframe ranging from to april . we had to exclude the first volume of ieee tse because it is not accessible from the ieeexplore portal. the resulting set contains candidates, all of them evaluated in the slr taken as reference. the original slr contains interesting papers. however, only of them are actually present in the set of the candidates available from ieeexplore, the missing one having been published in the first volume of ieee tse. our benchmark dataset is therefore composed of papers, of which belong to the i set. the others are considered as non-interesting papers, i.e. they do not pass the selection criteria defined at the beginning of the performed study and they belong to the e set.

. variable selection

the main outcome under measurement is the manual work, consisting of reading primary studies either entirely or only title and abstract, to select the interesting ones for the subject of the slr. we measure the manual work as the number of papers that are read, assuming this number as a proxy for the actual time that would be spent reading the articles. the minimum manual work ideally required is the total number of interesting papers.
however, this minimum could reasonably never be reached in a slr. indeed, the relation i ⊂ w holds, where i is the set of relevant papers and w is the set of papers defined by the search criterion. the choice of this measure is motivated by the fact that the slr selected as subject of the case study reports neither the time spent for paper selection nor which papers were read entirely and which partially (only title and abstract). as a consequence, we define the following two metrics:
– mw is the manual work. more specifically, mwo is the manual work performed in the original slr, i.e. manually selecting and reading all papers; mwne is the manual work obtained applying the selection based on the multinomial naive bayes classifier using original papers (non-enriched process); mwe is the manual work obtained applying the selection based on the multinomial naive bayes classifier using enriched papers (enriched process).
– t is the applied task. three levels are possible: manual, non-enriched, enriched.

. hypothesis formulation

the last step of the design is the hypothesis formulation. we formulate a pair of null and alternative hypotheses for each of the two research questions. the goal of the experiment is to reject the null hypothesis h by monitoring the p-value (hubbard & lindsay ). in other words, we discard the null hypothesis and we accept the alternative one ha if the p-value is lower than . . in that case, when choosing the alternative hypothesis ha, the probability of committing an error is lower than . .

. h : mwo ≤ mwe, recall = .
  h a : mwo > mwe, recall = .
. h : mwne ≤ mwe, recall = .
  h a : mwne > mwe, recall = .

(ieeexplore portal: http://ieeexplore.ieee.org/xpl/recentissue.jsp?punumber= )

. parameter configuration

we decided to assess the validity of our process with different sizes of i ranging between and .
in order to limit the bias introduced by a particular configuration of selected papers, for each size we built different i sets, choosing them randomly among relevant papers. we used each generated i to kick off the two variants of the process: enriched and non-enriched. moreover, we replicated the experiment varying the classification threshold between and with steps of . . the classifier threshold represents the posterior probability for a sample to belong to i (interesting set). overall, we executed the complete algorithm , times = (number of i sizes) x (number of i sets for each size) x (variants of the algorithm) x (thresholds). a preliminary step consisted of defining the best classifier threshold t which maximizes the recall for the two variants. according to (cohen et al. ), we decided to aim at a recall of %. although this recall value is a strong constraint, we adopted it to limit as much as possible the elimination of interesting papers. in table , we report the distribution of the maximum classifier threshold which makes it possible to obtain the target recall using the different i sets. we chose the maximum threshold because it is the one which minimizes the workload while still satisfying the requirement of a recall equal to or greater than %. we select the median values to set the classifier, that means . for the enriched process and . for the non-enriched one.

              min.   1st qu.   median   mean   3rd qu.   max.
non-enriched  .      .         .        .      .         .
enriched      .      .         .        .      .         .

table . analysis of the best classifier threshold for both the enriched and the non-enriched process across different i sets. the first and last columns show the minimum and maximum values, the second and fifth columns respectively the first and third quartile of the distribution, and the middle columns show its median and mean.

. analysis methodology

the goal of data analysis is to apply proper statistical tests to reject the null hypotheses we formulated. since the values are not normally distributed (according
since the values are not normally distributed (according semantic systematic literature review to the shapiro test), we adopt a non parametric test. in particular, we select the mann-whitney test (hollander & wolfe ) that compares the medians of the vectors of mw. to do that, we considered all papers extracted from the dataset except those papers used to build the i . results and discussion figure shows the comparison distributions for di↵erent settings of i according to the two di↵erent types of recommendation approaches proposed: enriched process or non-enriched process. on the y-axis, the workload needed for a human being after both processes (enriched e and non-enriched ne) is reported. on the x-axis, we indicate the number of papers used for training the i set and the process used (e.g. .e means an i composed of paper and the process has been performed using the enrichment mechanism). we observe a reduction of the workload in both approaches. comparing the semantic enrichment with the baseline, we observe a greater reduction of the workload. this increment ranges from . % to % for all i settings, except for the i composed of paper ( .e in figure ) where the increment is lower then % with respect to the not-enriched (e.g. .ne in figure ). fig. . number of papers to read for di↵erent i sizes and tasks applied: e (with enrichment) and ne (without). rizzo et al. we present below the results according to the two research questions ad- dressed in this paper (see section ): evaluating whether the semantic automatic process classification reduce the amount of work of a slr or not (rq ) and evaluating if the semantic enrichment increases the performance of the simple classification process (rq ). . rq : reduction of the human workload the results from the mann-whitney test are shown in table . 
the table reports the i size (column ), the manual work in the original slr process (column ), the manual work obtained with our enriched process (column ), the estimated percentage of manual work to be performed with our enriched approach with respect to the total work required using the common approach (column ) and the p-value obtained from the mann-whitney test. the p-value for all the configurations indicates that the null hypothesis can be rejected and we accept the alternative, which motivates the choice to use the semantic enrichment approach. in addition, we notice that the workload reduction increases as the size of i grows.

        workload        manual workload vs enriched workload
|i |    mwo     mwe     median    p-value
                        . %       < .
                        . %       < .
                        . %       < .
                        . %       < .
                        . %       < .

table . for each i configuration, we first compare the workload required of a human being in the original slr and the workload mean if our process is performed. to verify the goodness of our process, we compute the mann-whitney test and we reject the hypothesis mwo ≤ mwe with a recall = . .

. rq : assessing the performance of the enrichment process

we used the mann-whitney test to reject the null hypothesis by which we state that mwne ≤ mwe. table reports the i size (column ), the estimated difference of manual workload between the two processes (column ), and the p-value of the mann-whitney test (column ). while we can observe that the enriched process requires less workload for every size of i , we can affirm it with p < . only when the size of i is .

|i |    workload median pairwise difference    p-value
        .                                       .
        .                                       .
        .                                       .
        .                                       .
        .                                       .

table . for each i configuration, we performed the mann-whitney test, evaluating median pairwise difference and p-value to estimate the minimum workload using both processes: enriched and non-enriched. as for rq , the minimum recall is . .
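to make the one-sided comparisons reported in the two tables above more concrete, the following minimal sketch computes the mann-whitney u statistic and an exact one-sided p-value by enumerating all regroupings of the pooled values. the workload numbers are invented for illustration (they are not the paper's measurements), and the paper does not state which statistical software was used; the exhaustive enumeration is only practical for very small samples.

```python
from itertools import combinations

def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y, ties counted as 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)

def exact_one_sided_p(xs, ys):
    """Exact p-value for the alternative 'xs tends to be larger than ys'."""
    pooled = list(xs) + list(ys)
    observed = mann_whitney_u(xs, ys)
    hits = total = 0
    # Enumerate every way of choosing which pooled values form the first group.
    for idx in combinations(range(len(pooled)), len(xs)):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if mann_whitney_u(group_x, group_y) >= observed:
            hits += 1
    return hits / total

# Invented workloads (papers to read) for four runs of each process.
mw_ne = [120, 115, 130, 125]  # non-enriched process
mw_e = [100, 110, 118, 105]   # enriched process

u_ne = mann_whitney_u(mw_ne, mw_e)
p_value = exact_one_sided_p(mw_ne, mw_e)
```

a small p-value here would support the alternative hypothesis that the non-enriched process requires more manual work than the enriched one.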
discussion

the results show that our approach actually reduces the human workload required to perform an slr, while aiming to maintain a high level of completeness. indeed, by limiting the recall to %, we adhere to the state of the art in the automation of slr field, maintaining its high quality. however, relying only on positive papers, this approach introduces one more configuration step for defining the threshold. the threshold can change according to the field of the slr. in our test, we empirically observed that the probability threshold is almost consistent in different test scenarios. for this reason, we consider it as a baseline value for further investigations. in addition, we observed that the enriched process performs better than the variant without enrichment up to %. there are still two shortcomings: i) the extracted entities from opencalais sometimes point to resources in the opencalais knowledge base which do not contain sameas links to dbpedia resources. we observe that the enrichment process fails in around % of the cases. the fallback strategy, which relies on another interlinking step using the named entity labels and a lookup in dbpedia, partially fills the gap, since we observe that . % of resources can be located, leaving a loss of . % of matched resources. however, this does not entirely close the semantic gap, since the interlinking step used as fallback does not consider the context from which the named entity has been extracted (raising an ambiguity issue which should be further analyzed with domain adaptive techniques). ii) a massive use of encyclopedic sources can bias the content of the enriched paper, penalizing words which do not appear often in the linked source but that are frequent in the initial document. contrary to what we expected, the i configuration does not affect the recall. indeed, our results suggest that the number of papers in i is not relevant.
its composition in terms of which papers are used to create it may play a more important role. for instance, consider an initialization of i with papers that are not strictly related or that represent just a niche of the research field, or one with papers that are completely off topic and carry a different meaning. in the latter case, a wrong initialization affects the whole process and requires the initial set to be redefined, while in the former case the enrichment process enlarges i, escaping from the niche. experiments show that the subjective bias in the composition of i is reduced when we use the semantic enrichment approach. while we do not have statistical evidence for that, the size of i seems to play a role in workload reduction. an important positive consequence of the use of automatic classification is the possibility to operate on larger search spaces because the effort of exploring w is reduced by means of partial automation. as a consequence, the search strategies can also explore potentially interesting sources. for example, using the standard approach, a search over a high number of journals and conferences is commonly quite expensive. by resorting to partially automatic classification instead, this search becomes more affordable. moreover, using an external knowledge base we are able to capture not just papers we recognize as similar to the ones already selected, but also papers that have conceptual relations (named entities) to the content expressed in the already selected papers. this strategy makes it possible to deal with an incomplete description of the field of interest, which cannot be completely described by the set of already selected papers. therefore the proposed approach makes it possible, as reported by the results, to use an i set which is relatively small and not representative of the whole field and still to obtain results which outperform the classification process using only original sources.
in addition, the experimental results show that these improvements are obtained with a still high recall (above %), which means losing a negligible amount of relevant information, an essential condition given the very nature of slrs.

conclusion and future work

in this paper, we presented a semantic enrichment recommendation of primary studies in an slr. resorting to text mining techniques and semantic enrichment, we improved the second step of the slr process in order to filter the set of possible studies a researcher should read, automatically discarding the non-relevant papers. our approach has two main advantages: i) reduction of the workload required to classify sources and ii) reduction of subjectivity in the overall process. we tested our approach using a real slr (jorgensen & shepperd ) which is used as benchmark dataset. keeping a recall of % (i.e. we expected to discard papers only when the system is at least % sure that the paper is out of scope) we saved % of the workload when i is composed of papers. in addition, we demonstrated that the enrichment process outperforms up to % the automatic recommendation process without enrichment which is used as baseline. as future work, we plan to improve the classification step, using negative examples in addition to positive ones. we believe that with negative examples the process may obtain a more accurate estimate of the probability that a sample belongs to the interesting set. the first idea is to use some of the papers not included in the slr for training negative examples. although this may be intuitive, we may face the problem of a short distance between positives and negatives, due to the cross topics which these papers may report. a further evaluation of the distance among papers from different journal issues may give a better idea about the use of negative examples.
therefore a deep analysis of which studies may be considered as negative is needed. in addition, we have planned to extract one paper i at a time from the set of relevant papers i, and to use the remaining papers i to train the classifier and, then, to evaluate if it recognizes i as similar to the others. in this way, the classifier is used to give a “second opinion” on the selection process, potentially reducing the number of researchers necessary to undertake this step. in the presented approach, we rely on the mnb classifier. it is considered the baseline for text classification, but its results are often comparable to the state of the art in text classification, such as svm and markov chain (rennie et al. ), as shown in section . . we plan to validate the use of the semantic enrichment with other classifiers to investigate the changes in performance. the experiments addressed an important weakness in the named entity extraction task. the disambiguation mechanism provided by opencalais often links, via the sameas link, to dbpedia resources. the loss of this process is recovered by an in-house interlinking logic which disambiguates the entity to dbpedia only considering the name of the entity. currently we are investigating the effect of nerd (rizzo et al. ) which disambiguates to dbpedia considering the surroundings of the text where the entity has been spotted, hence preserving the semantics. finally, the semantic enrichment mechanism has been validated using one slr. we plan to validate it also using other slrs, especially ones coming from other fields of research. we believe that our approach could be adopted by scientific content providers such as journal portals, to index sources and to automatically classify and cluster the papers they publish. this approach may be used to propose a faceted view of sources queried by a user. the challenge will be to compute this operation in real-time to limit human efforts.
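the leave-one-out “second opinion” check proposed in the future work above can be sketched as follows. this is only an illustration under stated assumptions: the paper relies on an mnb classifier, whereas the sketch swaps in a plain cosine similarity over word counts as a stand-in scorer, and the names `bag_of_words`, `cosine`, `leave_one_out` and the threshold value are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokenization into word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leave_one_out(papers, threshold):
    """Hold out each paper in turn and score it against all the others.

    Returns a list of (index, score, accepted) tuples, where `accepted`
    says whether the stand-in scorer recognizes the held-out paper as
    similar to the remaining ones.
    """
    verdicts = []
    for i, held_out in enumerate(papers):
        rest = Counter()
        for j, other in enumerate(papers):
            if j != i:
                rest.update(bag_of_words(other))
        score = cosine(bag_of_words(held_out), rest)
        verdicts.append((i, score, score >= threshold))
    return verdicts
```

a held-out paper that the scorer rejects is a candidate for a second look by a human reviewer. replacing `cosine` with the probability returned by a trained classifier would bring the sketch closer to the setup described in the text.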
acknowledgments

this work was partially supported by the european union’s th framework programme via the projects linkedtv (ga ).

references

bizer c, lehmann j, kobilarov g, auer s, becker c, cyganiak r & hellmann s. dbpedia - a crystallization point for the web of data. web semantics: science, services and agents on the world wide web ( ), – .
cawley g c & talbot n l. on over-fitting in model selection and subsequent selection bias in performance evaluation. the journal of machine learning research , – .
chawla n v. data mining for imbalanced datasets: an overview. data mining and knowledge discovery handbook pp. – .
chawla n v, japkowicz n & kotcz a. editorial: special issue on learning from imbalanced data sets. acm sigkdd explorations newsletter ( ), – .
cohen a m. optimizing feature representation for automated systematic review work prioritization. in ‘annual symposium of the american medical informatics association (amia)’ pp. – .
cohen a m, hersh w r, peterson k & yen p y. reducing workload in systematic review preparation using automated citation classification. journal of the american medical informatics association (jamia) ( ), – .
elkan c. the foundations of cost-sensitive learning. in ‘ th international joint conference on artificial intelligence’ ijcai’ .
forman g & cohen i. learning from little: comparison of classifiers given little training. knowledge discovery in databases: pkdd .
grishman r & sundheim b. message understanding conference- : a brief history. in ‘ th international conference on computational linguistics (coling’ )’ pp. – .
hollander m & wolfe d a. nonparametric statistical methods. john wiley and sons, new york.
hubbard r & lindsay r m. why p values are not a useful measure of evidence in statistical significance testing. theory & psychology ( ), – .
jorgensen m & shepperd m. a systematic review of software development cost estimation studies. ieee transactions on software engineering ( ), – .
kibriya a, frank e, pfahringer b & holmes g. multinomial naive bayes for text categorization revisited. in ‘ th australian joint conference on advances in artificial intelligence (ai’ )’.
kitchenham b. procedures for performing systematic reviews. technical report tr/se- , software engineering group, department of computer science, keele university.
kitchenham b. guidelines for performing systematic literature reviews in software engineering. technical report ebse- - .
matwin s, kouznetsov a, inkpen d, frunza o & o’blenis p. a new algorithm for reducing the workload of experts in performing systematic reviews. journal of the american medical informatics association (jamia) ( ), – .
murphy k p. machine learning: a probabilistic perspective. the mit press.
porter m. an algorithm for suffix stripping. program ( ), – . url: http://www.emeraldinsight.com/doi/abs/ . /eb
rennie j d m, shih l, teevan j & karger d r. tackling the poor assumptions of naive bayes text classifiers. in ‘ th international conference on machine learning (icml’ )’.
rizzo g, van erp m & troncy r. benchmarking the extraction and disambiguation of named entities on the semantic web. in ‘ th edition of the language resources and evaluation conference (lrec’ )’.
ruttenberg a, rees j a, samwald m & marshall m s. life sciences on the semantic web: the neurocommons and beyond. briefings in bioinformatics ( ), – .
tomassetti f, rizzo g, vetro a, ardito l, torchiano m & morisio m. linked data approach for selection process automation in systematic reviews. in ‘evaluation and assessment in software engineering (ease’ )’.

the open access divide
publications , , - ; doi: . /publications
publications issn - www.mdpi.com/journal/publications
article
jingfeng xia school of informatics and computing, indiana university, w.
michigan st, ul b, indianapolis, in , usa; e-mail: xiaji@iupui.edu; tel.: + - - - received: august ; in revised form: october / accepted: october / published: october abstract: this paper is an attempt to review various aspects of the open access divide regarding the difference between those academics who support free sharing of data and scholarly output and those academics who do not. it provides a structured description by adopting the ws doctrines emphasizing such questions as who, what, when, where and why for information-gathering. using measurable variables to define a common expression of the open access divide, this study collects aggregated data from existing open access as well as non-open access publications including journal articles and extensive reports. the definition of the open access divide is integrated into the discussion of scholarship on a larger scale. keywords: scholarly information sharing; self-archiving; data; publications; open access . introduction the term “open access divide” describes the split between those academics who support free sharing of scientific data and intellectual output including scholarly publications and instructional materials and those academics who do not. stimulated by an ever-growing cost of periodical subscriptions and facilitated by the new information technologies, particularly the internet, open access (oa) has experienced dramatic progress in the past two decades, providing a digital outlet for scholarly communication. during its course of development, oa has also encountered many challenges. an open access divide (oad) has constantly permeated every aspect of the oa movement, despite notable oa efforts to increase participation. the academic community can be readily sorted into distinct groups. 
to a great extent, this divide reflects differences in how individual scholars perceive and participate in oa initiatives, which are influenced by their disciplinary norms, thematic research concentrations, roles in the oa undertaking, and cultural traditions and regional backgrounds. open access efforts to bridge the gap of divergent oa practices have not been as effective as many have expected. this is most likely because of “the importance of faculty values and the vital role of peer review in faculty attitudes and actual publishing practices” and “the myriad divides that obstruct communication within the networks of the internet” [ , ]. a better understanding of the diverse practices will help advocates to better configure oa strategies to promote the involvement of every type of stakeholder. this requires appropriate measures of various oad dimensions in order to shed light on the divergence and critical differences among key oa concepts and practices and grasp the essence of the open access divide. this is the purpose of the present research.

background

defined as having “unrestricted access and unrestricted reuse” by the plos, open access has many characteristics [ ]. the oa users are primarily in the academic community, in which the major constituencies are either researchers or those that work on supporting research activities, e.g., institutional administrators, academic librarians, and information professionals. the oad is illustrated by comparing variations in behavioral patterns of researchers within various disciplines, considering the availability and usage of oa resources can vary greatly. also, because the internet is the sole platform for the practice of oa publishing, self-archiving, content retrieval, and use, the oad is not affected by various types of technologies (e.g., phone, computer, and digital tv) that are central to a digital divide. oa has had many enthusiastic proponents.
the berlin declaration, the budapest initiative, and the bethesda statement in the early s symbolized international promotion of oa, with signatories from leading international research, scientific, and cultural institutions lending their support to the open access paradigm. today, the berlin declaration alone has been signed by over institutions, libraries, archives, museums, funding agencies, and governments from around the world [ ]. the awareness of oa as both a concept and a mechanism for making scientific data, knowledge, and cultural heritage reachable to everyone has dramatically increased among scholars over the past two decades [ ]. nonetheless, there is still a gap between the oa awareness of scholars and the number of contributions they make to oa repositories. additionally, mandate policies as the new campaign strategy have only proven effective in accruing valuable free digital content in certain areas. oa proponents are cautiously optimistic that continued implementation of mandates will eventually restructure the scholarly landscape and overcome the divide [ , ]. it has been recognized in the oa literature that dichotomies exist in many areas of the practice. thatcher highlights an oa publishing divide between books and journals, and predicts that the divide will only become wider due to pre-published journal content that “is made available in oa while only a trickle of the former gets into that mode” [ – ]. he finds no evidence that universities have made or will start making necessary steps toward subsidizing oa publishing in book format as they did for oa journal publishing. xia, wilhoite and myers outline a divide between librarians and lis (library and information science) faculties in regard to the number of oa publications and citations [ ]. 
librarians are found to have not taken part in oa self-archiving more than the teaching faculty in lis even though the former have played various roles in open access and are more knowledgeable about the impact of oa. similarly, many researchers examine other representations of an open access divide between scholars in developed and developing countries; between junior and senior faculty; between differences of raw data use; between the willingness expressed by scholars to participate in oa self-archiving and their actual oa contributions, and the like [ – ]. the open access divide is readily detectable. a structured description of the open access divide will help provide an insight into oa progress and challenges. this paper is an attempt to conceptualize oad by following and modifying the constructs of the ws doctrine and its revisions. oad is streamlined along several distinct dimensions with the purpose of creating a common framework to address the questions of who, e.g., the divide between librarians and faculty, with which characteristics, e.g., the divide in academic rankings, subject area and geography, connects how, e.g., self-archiving activities and oa journal publishing, and to what, e.g., oa awareness, attitudes and actions [ ]. in addition to combining these variables to form a collection of choices to define the divide, an attempt is also made to stress the separation between scholars’ attitudes and actions concerning open access and to integrate the definition of oad into the discussion of scholarship on a larger scale. it is hoped that the conceptualization of oad will help policymakers regulate their efforts in advancing free information access and exchange, and help information professionals and librarians adapt better strategies of dealing with scholars’ resistance to self-archiving.
data utilized for the analysis are drawn from existing oa publications including journal articles and extensive reports, most of which are freely available online. this study relies on measurable variables to define a common expression of oad, thereby requiring selection of dependable data sources that are scientifically acquired, verified, and reported. to this end, a list of important oa data sources organized in a chronological order is consulted, which is then supplemented by available up-to-date numbers [ ]. whenever necessary, non-oa sources are also referred. however, since this study aims to construct a conceptual framework instead of performing a pure quantitative evaluation, aggregated numbers are generally calculated and incorporated whenever possible. . the ws doctrines . . “quis, quid, quando, ubi, cur, quem ad modum, quibus adminiculis”–augustine [ ] the origin of the ws doctrines could be traced back to the thirteenth century when a mnemonic verse was developed as a result of the need to help priests question confessors about their sins characteristic of the penitentials. later, st. augustine categorized the questions into seven circumstances, namely quis, quid, quando, ubi, cur, quem ad modum, quibus adminiculis (who, what, when, where, why, in what way, which supports) [ ]. a further development of this form of questions as an analytical model to examine bible studies was made by william wilkinson, a professor of theology, poetry, and literary figure, in the s, known as the “three ws” (what? why? what of it?). trumbull described wilkinson’s “three ws” as a plan of study of alliterative methods for the teacher. to wilkinson, this model was “an almost immemorial orator’s analysis, first the facts, next the proof of the facts, then the consequences of the facts” [ ]. the usefulness of this method in research and professional practice has been increasingly noticed later and the analysis was often expanded into the “five ws” (when? where? whom? what? 
why?). it provides a radical way of thinking and is applicable to various types of scholarly as well as individual projects. among others, journalism, communications, and political science adopted the constructs as a preeminent device in the early s to regulate their professional undertakings, e.g., in newspaper writing.

“who says what, in which channel, to whom, and with what effect?”–lasswell [ ]

in the field of journalism, students have long been taught the importance of answering six basic questions to complete a story. in addition to the “five ws” an h was included to ask how it began or operates. it had become the standard that when a press release was written, one would need to follow the six questions: who is the story about (referring to the people involved)? what is it about (denoting the problems, things, and ideas)? when will it happen (verifying past, present, and future of the topic)? where will it happen (involving the locations)? how will it take place (concerning history or function)? why is it happening (regarding the causes, reasons, results, and conditions)? though some believed that the “five ws” and an h were characterized as old-fashioned and fallacious in the s, this staple of questions has been revived in digital journalism with the popularity of multimedia and virtual interactivity [ ]. the constructs have been bestowed with new substances as to focus on asking: who can we connect with (social networking)? what did the journalist read to write this (social bookmarking)? where did this happen (mapping)? when are events coming up (calendars)? why should we care (databases)? how can we make a difference (automation)? [ ] harold lasswell borrowed the concept of the ws doctrines to orient a simple structure of analysis for the studies of communication.
after his expansion of the model, a series of basic questions was posed: “who says what, in which channel, to whom, and with what effect?” [ ] his prototype aimed at identifying various elements of communication in a political sphere where “who” represents people involved in the political body or agency communicating, “what” contains the essence of the message or idea, “channel” is of the method of communication, “whom” refers to the target audience, and “effect” signifies the outcome. lasswell published his recognized book politics: who gets what, when, how with the title itself later serving as the standard lay definition of politics [ ]. his refinement of the ws has helped inspire systematic thinking about political communication and characterize the psychological and policy implications of different systems of communication. . . “who, with which characteristics, connects how, to what?”–hilbert [ ] in a recent study on the intricacy of the digital divide, hilbert crafted a four-category structure to accommodate the most relevant studies, approaches and definitions by illustrating its major characteristics and dynamic connections. grounded in the theory of diffusionism through the social network schema, a common framework was constructed to steer the multi-dimensional analysis that encapsulates every possible variable by focusing the examination on subject (the level of units to be engaged), function (the nature of attributes to be affected), mode (the style of immersions to be conducted), and channel (the type of media to be observed). 
by his own elaboration, the framework is carefully designed to refocus questions about the digital divide from asking “who is the subject?” and “which attributes matter?” to “how to connect?” and “what kind of technology?” the latter two groups of variables may present the value of “haves” and “have-nots,” famous in the discussions of the digital divide, which aligns the two variables on either side of a dichotomy, which, in turn, produces the gap of the digital divide. it is mentioned by the author that different combinations of these four groups of variables can bring about a sizeable collection of combined choices and may lead to contradictory arguments delineating the complexity of the digital divide. this four-category structure can be summarized as:
– for whom (level of analysis): the digital divide existed among individuals, households, groups, organizations, communities, societies, countries, and world regions;
– with which characteristics (attribute of node and tie): the digital divide is affected by a great many factors including age, autonomy, education, ethnicity and race, gender, geography, income, language, occupation, profitability, religion, skill, type of computer and website ownership, etc.;
– connects how (level of digital sophistication): measure of the divide taken on internet access, actual usage, and impact;
– to what (type of technology): connection accomplished via laptop, workstation, e-reader, digital tv, phone, gps, internet, etc.
each dimension consists of a different number of variables which may change along with technology, e.g., the recent popularity of gps technologies; the adjustment to individual situations, e.g., adopters’ attitudes toward innovations; and changing interest from research groups, e.g., adding sub-groups of the target subjects. yet, even without a dynamic change of the variables, a combination of selected variables across all dimensions will yield a great variety of possible definitions.
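the size of the definition space implied by the four dimensions summarized above can be sketched as a cartesian product. this is an illustrative sketch, not part of hilbert's framework: the variable lists are copied from the summary above (two of them end with “etc.” in the text, so the real lists are open-ended), “type of computer and website ownership” is treated as a single attribute, and the names `DIMENSIONS`, `count_definitions` and `enumerate_definitions` are hypothetical.

```python
from itertools import product

# Variables listed in the text for each dimension; lists ending in "etc."
# in the source are longer in practice, so the count is a lower bound.
DIMENSIONS = {
    "for whom": [
        "individuals", "households", "groups", "organizations",
        "communities", "societies", "countries", "world regions",
    ],
    "with which characteristics": [
        "age", "autonomy", "education", "ethnicity and race", "gender",
        "geography", "income", "language", "occupation", "profitability",
        "religion", "skill", "type of computer and website ownership",
    ],
    "connects how": ["internet access", "actual usage", "impact"],
    "to what": [
        "laptop", "workstation", "e-reader", "digital tv",
        "phone", "gps", "internet",
    ],
}


def count_definitions(dimensions):
    """Number of definitions obtained by picking one variable per dimension."""
    total = 1
    for variables in dimensions.values():
        total *= len(variables)
    return total


def enumerate_definitions(dimensions):
    """Lazily yield every (who, characteristic, how, what) combination."""
    return product(*dimensions.values())
```

with the lists as written, the product is 8 × 13 × 3 × 7 = 2184 candidate definitions, and every variable added to one dimension multiplies the total by the sizes of the other three.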
for instance, starting with only three different choices of the subject units (e.g., households, communities and countries), each being evaluated by using five attributes (age, gender, geography, income, and occupation), differentiating between three levels of digital adoption (access, actual usage and effective adoption), and with five types of technologies (phone, e-reader, laptop computer, digital tv, and general internet), a combination of 225 choices (3 × 5 × 3 × 5) has already been made for an investigation of the digital divide. there is no threshold of variable numbers to be set for an analysis; yet, the formula indicates that for each additional variable being added, the matrix will be substantially amplified. while one may have been overwhelmed by the vast number of the digital divide elements, it is the framework proposed that demonstrates the value of hilbert’s effort for rationalizing complex analyses of the digital divide.

the open access divide

the ws doctrines, especially hilbert’s four-category structure, can be adopted to discuss the open access divide, although the unique characteristics of oad require some variations. among other changes, hilbert’s subject-level analysis can be replaced by the type of subject, e.g., librarians vs. faculty, because necessary data is absent for distinguishing analytical levels between individuals and institutions. the attributes of each node are narrowed down to several characteristics of the oa practice that can be supported by measurable indexes and have shown noticeable dichotomies. for a description of the how factor, an effort is made to measure various oa activities, notably authors’ efforts to perform self-archiving, publish in open journals, and contribute and reuse free data.
the most visible change to hilbert’s constructs is the separation of the what factor, which is no longer about the levels of technology for the oa practice, but instead depicts a larger picture of the open access process, focusing on the discrepancy between scholars’ expressed intention to participate in oa and their actual contributions. the structured description also highlights the applicability of the diffusionist model in the technological as well as cultural context. however, before getting into the actual discussion of oad using the ws theory, let us examine how academics become involved in open access.

open access: from awareness to action

in a general sense, oa may be viewed as a sequence of several consecutive phases, i.e., awareness, attitude, action and allusion, throughout which advocacy pushes the process forward and actor (agent) is the subject who performs every oa task (figure ). of these “a” categories, “action” refers to a participant’s work to ( ) self-archive intellectual outcomes in the form of article pre-prints or post-prints, reports or other types of written work in a digital repository or on a personal or institutional webpage; ( ) contribute raw scientific data and data definitions to a free data repository; ( ) post instructional content using open courseware; ( ) publish peer-reviewed articles in an oa journal; and ( ) make open source programs with original code available to the public. similarly, the “actor” category can include institutional or library administrators, government officials, association personnel, funding agencies, or renowned scientists among those who advocate for oa. “allusion” represents the actor’s reuse and repurpose of raw data or open source code, which involves providing necessary credit to the original contributors.
as with any analysis of the digital divide, most of these phases of the open access divide contain multiple variables, leading to a wide range of combinations and making the analysis more complicated.

figure . a conceptual relationship among various types of open access (oa) activities.

an oa activity starts from one’s awareness of the urgency and consequence of the digital means of scholarly communication. at the beginning of the oa movement, scholars’ indifference to oa journal publishing or self-archiving using open access technology was considered to be the major reason for their unfamiliarity with the innovative approach [ ]. oa advocates have since then undertaken a persistent effort to raise the rate of awareness among scholars. a time-series research reveals that in a period of about a decade since the late s, the rate of oa awareness among scholars continued to increase, and is projected to follow the same trend [ ]. it is worth noting that awareness has multiple degrees. the fact that one knows of the existence of oa does not guarantee his/her familiarity with the practice. in several author surveys on self-archiving, it has been found that many respondents could not differentiate a free scholarly resource from a subscription-based resource, probably because they have substantial access to journal databases through their institutional subscriptions [ , , ]. few studies have made an effort to focus upon oa concepts and practices as understood by scholars, which demonstrates a critical research need that requires further attention. scholars’ attitudes toward oa are thought to determine their behaviors in oa activities [ ]. like awareness, attitudes can also be multifaceted. one popular index to measure attitudes in most author surveys is willingness of scholar respondents to comply with a policy mandating participation in oa journal publishing or self-archiving [ , , ].
a large percentage of scholars surveyed, in many cases more than %, show their enthusiasm about contributing to oa. it is reasonable that the ratio of willingness to participation does not match at any single reporting time because it usually takes time for intention to be translated into actual work. nevertheless, it is unfortunate that over time individual surveys have continued to report a discrepancy in both oa journal publishing and self-archiving (see figures and ). these figures show a larger gap in oa journal publishing than in repository self-archiving. specifically, scholars hold a more positive attitude in favor of submitting articles to an oa journal, but act differently later. comparatively, if they agree to make contributions to a repository, particularly an institutional repository, they are much more likely to follow through on that commitment.

figure . the divide between awareness and action for open access journal publishing [ – ].

figure . the divide between awareness and action for open access self-archiving [ , , ].

although there are many reasons attributed to scholars’ unresponsiveness to oa, it is generally argued that academic emphasis on impact may “override the perceived ‘opportunities’ afforded by new technologies” [ , , , ]. high quality and high impact research is the necessary ticket to tenure and other forms of career success in all research-oriented institutions. tenure-track faculty perform in the same way across many fields by spending more time on publishing articles in the right venues than on anything else, while established scholars may follow the primary modes of scholarly dissemination in their own field, particularly in the humanities, where monographs are heavily valued, and in physical sciences, where traditional publishing is highly regarded [ – ].
non-traditional dissemination, including oa self-archiving and publishing, has not yet been weighted as highly in the system, especially when systems for peer review have not been well structured. even though oa mandate policies have changed the culture of digital scholarship to a great extent in some fields, such as the life sciences, academia is still dominated by the value faculty place on traditional journal publishing.

the consistently high rate of scholars’ willingness to participate in oa does reflect their interest in alternative scholarship. there are concerns about the restrictions of current publication practice among scholars who have experienced the slow publishing cycle and the limited dissemination mechanism. with the potential of the internet, people are expecting to observe changes, which have fortunately occurred in many areas as a result of the continuous oa advocacy in the past decades. some institutions have started providing credit toward tenure and promotion to faculty who make oa contributions in the form of data curation, although the amount varies significantly. the implementation of oa mandates has been a positive step for raising awareness among various stakeholders, including institutional administrators and faculty. the future of oa advocacy may need to focus more upon effecting change in the academic evaluation system as a whole, instead of targeting individual scholars as it did in the past. by incorporating a rigorous peer-review process into data curation and self-archiving, and by publishing high-quality scholarly journals in oa, real change in the effectiveness of oa implementation is possible. only if the system has been optimized to better accommodate open access will the divide between willingness and action be diminished.

. . subjects: the divide between librarians and faculty

we now go into details of various oad dimensions following hilbert’s framework.
since the oa literature is mostly scholarly in nature and the purpose of the movement is to restructure scholarly communication, this study reviews only oa practice in the academic community, even though the general public also benefits from the effort, e.g., those who are suffering from a disease may get free access to information about new treatments that are released online at the time of, or even before, their formal publication [ ]. scholars are the foremost constituency as oa contributors and beneficiaries. in most cases, scholars refer to faculty in research institutions and universities, and therefore, the two terms are used interchangeably in this paper. many other people in the community are also involved in the digital efforts with varying responsibilities, including academic librarians. they have been working on building websites and repository databases, coordinating with faculty to acquire materials, and creating and maintaining metadata to facilitate digital preservation and information retrieval. at the same time, an increasing number of advocates have delved into the promotion of an oa consciousness among faculty, most significantly through using their influence to provide incentives for faculty or by endorsing the implementation of mandate policies at various levels, such as institutional and funding agency, to direct a constructive change of culture in scholarly communication.

scholars are known for their reluctance to self-archive raw data and publications in digital repositories, with exceptions for disciplines where a culture of information sharing has long been in existence, such as physics and economics. early reports found that the numbers of items in many repositories were low [ , – ]. in the following years, the slow growth of repository content has not reflected the aggressive advocacy among scholars and their increasing awareness of the importance of open access [ , ].
similarly, the rate of oa journal publishing by scholars started very low in the mid- s, and was not able to reach a high level by the late s in spite of great continual improvement over the course of the decade [ ].

oa advocates have developed some strategies to boost the collection of free digital repositories, one of which is the implementation of mandate policies [ , – ]. some types of mandatory policies are more effective than others. for example, policies implemented by journals and funding agencies have been able to increase the number of researchers who are depositing raw data by making it a condition of funding or publication, while institutional policies have not resulted in as many e-prints as repository managers expected [ ]. although sale and others did report an increase of institutional repository items after the implementation of oa mandates in australia, such an increase is more the result of mediated archiving by repository workers than of deposits by faculty authors, as well as the result of other types of oa advocacy launched by individual repositories [ – , , ].

in comparison to faculty who are late adopters of the oa practice, librarians are supposed to present more positive behaviors in oa publishing and self-archiving because of their widely assumed roles in preserving and disseminating scholarly records, although only a small portion of librarians are exclusively responsible for repository management. most academic libraries in the united states evaluate the performance of their librarians on scholarship, and hence publishing is often a required part of academic librarianship. it is therefore surprising to find that librarian authors in the united states have authored significantly fewer oa publications and have not participated any more in article self-archiving than faculty in lis, even though faculty themselves are not active in oa. counting the numbers of articles available through oa yields a rate of . % for librarian authors as opposed to . % for faculty authors [ ]. the same study finds that the odds of increasing oa citation counts for faculty’s publications are higher by a multiplicative factor of . . similar results have also been found by other studies which include faculty in other academic fields [ ]. figure gives a simple comparison of librarians and faculty in making their own articles available in all types of repositories and websites, which shows a visible divide with regard to their supply and consumption of oa articles.

figure . the oa divide between librarians and faculty in self-archiving [ , , , ].

academic librarians as a group of digital facilitators also retain a different view of oa achievements from that of oa advocates. while oa advocates are enthusiastically and unanimously cheerful about the development of the scholarship reforms, librarians and involved information professionals are relatively cautious and pay more attention to challenges faced in practice [ – ]. this disparity of vision in oa assessment may be caused by their different roles in the campaign: advocates have more access to diverse resources and thus are able to draw a larger picture and consider scholarly communication as an entire system, while librarian practitioners form their opinions generally out of their own experience in individual projects. with regard to the discovered librarian-faculty divide, a disciplinary culture in information exchange may have played an essential role in influencing librarian authors’ response to oa advocacy, which will be discussed below. the fact that librarian authors have authored much less oa literature may be caused by their familiarity with the capability of, and easier access to, subscribed database searching, which inclines them to depend less on general web search engines to acquire free articles for their own studies [ ].

. . attributes: dichotomies in geography, discipline, and academic status

there are numerous interrelated elements affecting the presentation of the open access divide. this paper focuses on several of these that are both mutable and quantifiable in study, namely academic ranking, subject affiliation, and geographic location of scholars. other elements are either not applicable to open access, such as computer skills, educational background, occupational conditions, and religious connections, or lack sufficient analytical data, e.g., age classifications, gender differences, racial groups, and language practices. also, types of technology and levels of digital sophistication are not as applicable for the discussion of the open access divide. with the selection of several of the most pertinent attributes, we further limit this study to one particular group of subjects because of the unique position of scholars in the academic community. oa investigations on the subject of librarians and information professionals have been too limited in number to distinguish among different types of digital responsibilities. similarly, oa advocates do not represent an exclusive group of people; some of them may also be scholars. of course, this does not prevent future studies from exploring different practices within each of these groups, which are rather interesting topics.

. . . the divide between developed and developing countries

oa is basically a western phenomenon, which was initiated in the united states and western europe prior to and was soon accepted by other developed countries such as australia, canada, and many other european countries [ ]. it was not until the mid- s that oa endeavors started spreading all over the world, which can be credited to persistent oa advocacy and expanding access to the internet [ ].
even today, many developing countries, particularly those in africa and central asia, are still struggling with developing a healthy infrastructure to facilitate free information sharing. a trans-cultural and trans-national diffusion of the digital scholarly system has been shaped by regional adoption strategies to suit uniquely local traditions. oa researchers have already paid attention to the spatial characteristics of the innovation and adoption by analyzing oa geography at the global scale, synthesizing models to understand diverse discoveries and using a chronological approach to reconstruct the history of oa spatial expansions [ ]. a recent study has examined the conditions of mandate policy implementations across countries, and other surveys have attempted to differentiate scholars’ oa behaviors between developed and developing countries by collecting information about how many scholars are self-archiving and reusing oa data [ ].

multiple reports all show a developed-developing split in several oa areas [ , , ]. specifically, scholars from africa, latin and south america, and most asian countries have a high awareness rate of both oa journal publishing and e-print repositories; in some cases the rate is even slightly higher than that of european and north american scholars. however, when data on actual oa actions by these scholars are collected, a geographic developed-developing disparity stands out: scholars in developing countries have made far fewer oa attempts than scholars in developed countries. when surveyed for their willingness to comply with a mandate policy, if applicable, respondents from developing countries expressed their interest in open access more often than their counterparts in the west.
this may be explained by the fact that scholars in developing countries are more inclined toward free sharing of scholarly materials than their counterparts in developed countries because of their limited access to subscription-based journals. their responses represent direct evidence indicating an inconsistency between scholars’ expressed compliance with mandates and their actual level of oa contributions. figure presents a geographic distribution of authors whose publications appear in oa journals by major country or region, in which the united states, united kingdom and canada are singled out due to their strong performance and all developing countries are summed up for simplicity of analysis.

figure . geographic distribution of oa authors in lis journals by major country or region [ ].

this visualization of the international divide in oa progress is further supported by the numbers of oa journals initiated in developed and developing countries as shown in figure . data for these figures are gathered from the directory of open access journals (doaj), which is a web service that includes most of the open access scientific and scholarly journals that apply a quality control system to guarantee the content [ ]. doaj offers the most comprehensive list of oa journals with necessary links and metadata for each journal, which can be sorted by country and initiation year. to reveal a geographic pattern of the journals, we selected the top countries with the most journals and aggregated the numbers by continent for easy data analysis and visualization. the country designation of a journal is based on the location of its editorial office, rather than the site of its publisher, as this is how data are acquired by doaj. also, the continent classification may provide a biased result for the developed-developing country separation.
for example, mexico, classified as a developing country based on the un’s human development index, was counted toward the total journal number for north america, which has considerably affected the continental rate because this total is averaged together with those of two developed countries [ ]. such exceptions are, of course, rare and can be ignored at the continent level.

figure . number of oa journals by continent [ ].

the number of oa journals in each country will become more meaningful if it is compared to the number of total journals published in that country. we checked ulrichsweb, a global serials directory of more than , periodicals, for the latter data and limited our search by selecting journals only. to make a precise comparison, the same countries are examined, and the numbers of total journals are summed up by continent and then divided by the number of countries in that continent (table ). a pearson’s correlation coefficient (r = . ) implies a perfect linear relationship between the two variables. this may help explain the reluctance of most developing countries to commit to oa, rather than attributing it to a lack of funds for scholarly pursuits in these countries.

table . numbers of total journals and oa journals by continent (averaged by country in each continent) [ , ].
columns: total journals | oa journals
rows: n. america | pacific | europe | asia | s. america | africa | l. america

an understanding of the oad between geographic locations may be taken from a diffusionist perspective since oa is also “the process by which an innovation is communicated through certain channels over time among the members of a social system” [ ]. following rogers’ model of diffusion of innovations, the oa movement originated in a handful of core countries as an initiative to respond both to a sluggish publishing cycle and an ever-increasing subscription price for scholarly publications [ , ].
because the early oa adopters shared similar attributes with the western-originated innovators, they did not encounter major cultural or technological obstacles. as adoption spread, dissimilar systems across the globe started showing strength to block or slow down the channel of diffusion. it is easy to observe that the late adopters and non-adopters represent those countries and regions most affected by technological factors (e.g., the scarce availability of the internet in poor countries), and/or by cultural norms, economic conditions, and political structures, e.g., the tradition of unwillingness for free information sharing in some areas of east asia. in a paper on oa geography, the author presented evidence to verify the assumption that the oa distribution has not corresponded well to the expansion of the information and communication technology infrastructure in some regions, where oa has not harmonized with existing customs [ ].

. . . the disciplinary divide

it has been commonly recognized that scholars in different disciplines have varying attitudes and practices concerning self-archiving [ – ]. this view is supported by the history of the oa movement. the earliest subject repositories, e.g., arxiv for physics and repec for economics, are also the most successful repositories with active contributions by scholars [ , ]. these repositories were developed in fields in which there was a preexisting culture of free information exchange, and scholars had been familiar with sharing research among peers [ ]. for example, prior to the invention of the internet, physicists exchanged their research in the form of pre-prints by mail or fax. by contrast, scholars in other disciplines, mostly in the humanities and social sciences, are not acquainted with a preprint tradition and therefore are unenthusiastic about making their research publicly available [ ].
as a result, subject repositories have not been able to fully develop even as efforts have been made to promote oa in these fields. most oa surveys have not provided useful data to validate this disciplinary divide because their classification of “subjects” is not specific enough to reveal self-archiving disparities. the success of repec in economics does not represent the condition of subject repositories in all other social sciences. there is a difference between some scientific fields and many fields in social sciences and the humanities, if we use the size of full-text deposits as a factor to measure the success of self-archiving in subject repositories (table ). this is for the purpose of demonstration only, and one needs to be cautious about interpreting the sizes because the total number of available articles will become meaningful only if it is divided by the total number of researchers in the field(s) that a subject repository serves. this piece of data is absent. another reason that this comparison is only suggestive is that content size may not be the only factor of assessment, as argued by carr and brody in response to xia and sun [ , ]. also, there is no evidence that these deposits are the result of self-archiving that reflects scholars’ personal involvement in oa, as the acquisition of repository items can be accomplished through mediated archiving, done by someone else such as students or librarians, or by applying particular computer programs for loading files automatically. at the same time, mandate policies also cause polarization of content volumes among repositories. nonetheless, the magnitude of subject repositories, for a quick view, can at least provide a rough idea about the position of the oa promotion among academic disciplines. this table also shows that most subject repositories have expanded their coverage to multiple disciplines.

table . major subject repositories and their content size as of october .
repository | full-text articles | subject areas
pubmed central | , , | biomedical and life sciences
repec | , , | economics
arxiv | , | mathematics, computer science, quantitative biology, quantitative finance, statistics
ssrn | , | accounting, cognitive science, corporate governance, economics, entrepreneurship, financial economics, health economics, information systems & ebusiness, legal, management, political science, social insurance, sustainability, humanities
cogprints | | biology, computer science, electronic publishing, journals, linguistics, neuroscience, philosophy, psychology

the recent discussion of the statement by the american historical association (aha) on policies concerning the embargoing of completed history phd dissertations highlights a strong disciplinary culture on open access [ – ]. at its june meeting, the aha council provided a statement that strongly suggests the embargoing of newly defended dissertations in digital form for six years. the statement argues that unlimited access to these dissertations will put history graduates at a disadvantage in their effort to turn their work into book format because publishers may be unwilling to accept book drafts that have been freely available online. it further argues that history has been and still remains a book-based discipline, and making dissertations open access presents a tangible threat to the interests and careers of junior scholars in particular. critics fault the aha statement for not attempting to change the process of granting tenure to junior academics by raising the value of citations rather than the format of publications. they point to the fact that “manuscripts that are revisions of openly accessible etds are always welcome for submission or considered on a case-by-case basis by . percent of journal editors and . percent of university press directors polled” [ ].
on the other hand, supporters of the aha statement insist that an embargo will be beneficial to new phds who can have more time to fine-tune their graduate work. debate on the embargo has been sparked on social media and academic blogs.

oa journal publishing provides another piece of evidence showing different practices between various disciplines. the most successful oa campaign to date occurred in the life sciences where the national institutes of health (nih) has played a supportive role by launching a series of mandate initiatives to require the sharing of raw data and publications [ ]. in addition to nih’s reputable repositories, e.g., pubmed central and gene expression omnibus (geo), a large number of journals are either created as oa or converted into oa, many of which are highly-regarded scholarly publications. researchers in this inter-disciplinary area have integrated open access publishing into regular scholarship. recently, many small academic libraries have published their own institutional oa journals to support research in humanities and social science fields [ – ]. the average number of such journals per institution is . for faculty and . for student authors according to an examination of small institutions in the united states [ ]. it may still be too early to judge these attempts, but concerns have already been raised about the quality and sustainability of these journals [ ].

disciplinary culture in scholarly communication is largely influenced by a dichotomy of epistemic features between convergent disciplines, which bear uniform standards and a relatively stable elite, and divergent disciplines, which have shifting standards, resulting in more intellectual results and a higher deviance from the norm [ , ]. convergent disciplines support research that is carried out based on the approaches of others as well as shared by others.
economics, engineering and physics are several examples of convergence, where “…the exact methods and the hard convergent nature of the disciplinary knowledge seem to provide clearer guidelines for management and academic work,” and where oa is logically encouraged [ ]. on the other hand, in divergent disciplines, knowledge sharing occurs only at limited levels and within a restricted pool of projects, which is represented by diverse research data, interests and schemas. most disciplines in the humanities and social sciences fall into the category of divergence, e.g., the weak linkages and frequent barriers between sociological works, where subject repositories and oa publishing lag significantly behind the oa movement [ ]. it is believed that a strong correlation exists between a disciplinary culture and the health of its subject repository, as well as the self-archiving rate of its scholars.

. . . the divide between senior and junior scholars

for most faculty, oa is an experiment. very few of them fully appreciate the advantage of information openness in scholarly communication, although many may be aware of the practice. among other concerns, faculty usually do not know how to handle the copyright, version control, and many other related issues of an article when depositing it to a repository, e.g., which rules are applicable in which disciplines and which journals. or faculty simply do not have time to make the contribution no matter how easy a self-archiving process is [ – ]. early career faculty members are particularly concerned with tenure and promotion and cannot envision a logical connection between participation in oa and assessment of scholarship, as there is not an intrinsic reward in the existing academic structure to accommodate their efforts in the experiment [ ].
the tenure clock keeps impelling junior faculty to prioritize only proposals for research grants, projects for high quality studies, and publications in prestigious journals, if teaching and service are not taken into consideration [ – ]. there is no evidence that a mandate policy has changed the perceptions and behaviors of these faculty members, unless the policy is implemented by funding agencies or top-ranked scholarly journals as can be seen in data sharing policies and their consequences in life science [ , ].

faculty members who are later in their career are relatively independent of the tenure and promotion restrictions and thus are “the most fertile targets for innovation in scholarly communication” [ ]. with tenure, senior faculty members are more willing to take part in various types of experiments than their junior counterparts. an example is senior researchers’ quick recognition of the value of online information sharing, considering the rate of downloads “a more credible measure of the usefulness of research than traditional citations” [ ]. it is not surprising that most oa advocates, in addition to administrators and librarians, are prominent scholars. a strategy to help recruit more content for digital repositories is to use senior faculty as role models for junior ones, as adopted by repository managers for the cream of science project in the netherlands [ ]. seniority is not limited to tenure status, e.g., it is found that scholars who have accumulated more than sixteen publications tend to participate more in oa self-archiving regardless of their academic ranking [ ]. another important motivation for attaining senior faculty endorsement in open access lies in the fact that they are involved in academic policy-making and their interests in innovation are likely to have broader influence within their academic areas.

. . connects how? measure of activities

. . . the gap between journal and monograph publishing

one of the earliest open access efforts was the publishing of electronic scientific journals in order to deliver research results to the general public free of charge as early as in the s [ ]. it was not until the late s when psycoloquy was published that oa journal publishing started gaining its momentum [ , ]. since then, diverse business models have been adopted in support of a sustainable operation. some established journal titles were transferred from subscription-based to open access with sponsorship from governments [ ]. in the past decade, academic libraries stepped in to launch peer-reviewed oa journals as a promising alternative to institutional repositories [ , ]. as many as % of academic libraries have been found either to have delivered oa publishing services or to be planning to deliver them, which does not count journals published by small-sized universities or colleges [ , , ]. the major players, however, are professional associations and some professional publishers that have managed the publication of high-quality scholarly journals in many academic fields, such as the american library association’s support for college & research libraries and biomed central’s series of oa journals [ ].

some oa journals charge a fee to authors or research sponsors for each article they publish in order to cover part of the expenses for a peer-review process, journal production, and online hosting and archiving. among many others, plos journals are known for following this business model [ ]. this author-pay-to-publish style may be appropriate for those fields where research projects are typically supported by grants, such as life science and engineering, but could be a hindrance to increasing oa content in social sciences and the humanities. when an oa journal is managed by an academic library, it usually serves scholarship in the latter fields, and a publication charge is not typically implemented.
however, many library-sponsored journals are still in the experimental stage and may lack a rigorous peer-review system, particularly journals designed for student authors [ ]. also, this toll-free-publishing model relies solely on financial support from funding institutions and/or grant agencies. an extensive discussion about the applicability, sustainability and scalability of providing oa journal publishing services has been recently undertaken [ – ]. with regard to peer-reviewed journals in general, doaj listed a total of registered oa journals with more than , articles as of october in comparison with about journals in .

unlike libraries and professional associations, many other types of publishers, e.g., university presses, set monograph publishing as their core mission. after the golden age of scholarly book publishing in the s, when lavish government funding underwrote scholarly activity, all university presses were faced with challenges [ , ]. the first challenge was and still is financial. when libraries are struggling with steadily declining budgets, their reaction to rising subscription costs is to reduce book acquisitions in order to optimize their use of available funds for exorbitantly overpriced journals. as early as the s, it was found that the ratio of monograph to journal expenditures in some major academic libraries had fallen from more than : to . : over a period of five years [ ]. this condition has only deteriorated since then. the second major challenge facing monograph publishers is competition from digitization projects and the internet. an escalating digital tidal wave starting from the mid- s has dramatically changed the publishing landscape.
today, scholars and students assume that “a google search is a first stop for doing research, that multimedia is an integral part of narrative text, and that content will be available in a variety of formats and devices, with the accompanying tools and functionality to enhance its use” [ ]. monographs seem increasingly under siege. the book publishing business has to seek short-term and long-term innovations for financial self-sustainability. publishing e-books, particularly e-textbooks, is one of the possibilities; so is the strategy of working on collaboratively productive actions. some institutional publishers have already extended their responsibilities incrementally to implement new initiatives. the key is to balance the reconstruction of a self-supportive business model and the necessity of focusing exclusively on helping scholars create a new means of scholarly communication. esposito has recently proposed a five-stage book publishing model in which he describes “the arc as publishers move from the traditional model (where print books were sold mostly in bookstores and to libraries) through a range of developments using online media, culminating in new forms of subscription marketing” [ ]. publishers are urged to progressively look for direct relationships with their readers, to become experts in metadata creation, and to create customer databases and become concerned about the life cycles of their customers. before workable strategies are successfully adopted, we will still observe a gap between journal and book publishing. however, the question remains whether this gap will keep widening as thatcher predicts, or if it will instead lead to multimodal communication in the scholarly ecosystem [ , – ].

. . . oa version disparities

in oa practice, an article is a pre-print before it receives peer review and a post-print after it is peer-reviewed and accepted but before it is formatted by a journal.
an e-print is a digital file of any research document, which may include a pre-print, a post-print, or both. after an article is accepted for publication, the journal formats it in its own layout, adding contextual branding such as the publisher’s logo and pagination, typically in pdf format. most publishers, both academic and commercial, have set explicit policies to regulate self-archiving, including the specific version(s) allowed for oa. the sherpa-romeo database collects information on publishers’ copyright policies regarding self-archiving of journal articles; a total of , journals are color-coded according to the level of their self-archiving policies as of october [ ]. the romeo colors include green (allowing pre-print and post-print or publisher’s version/pdf to be archived), blue (allowing post-print, i.e., final draft post-refereeing, or publisher’s version/pdf to be archived), yellow (allowing only pre-print, i.e., pre-refereeing), and white (archiving not formally supported). the database provides an effective online location for scholars to clarify which publishers grant which levels of copyright to allow authors to post their research results online for free access. however, one cannot expect individual scholars to check the database before every self-archiving activity [ , ]. several years ago, antelman examined the self-archiving behavior of authors publishing in leading journals in six social science disciplines and found that publishers’ policies have little influence on author self-archiving practice: “the overall self-archiving rate for the white journals examined in this study is significantly higher than the self-archiving rate for the green journals” [ ]. scholars have in general made a significant number of articles available as post-prints or publisher pdf versions, which may not be allowed by the relevant policies, and the confusion does not seem to have lessened since then [ ].
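the romeo color scheme described above can be read as a simple lookup from a journal's color to the article versions an author may archive. the python sketch below is illustrative only: the names `ROMEO_POLICY` and `may_archive` are hypothetical, the version labels paraphrase the descriptions in this section (green is assumed to permit all three versions), and nothing here reproduces sherpa-romeo's actual data or interface.

```python
# Illustrative lookup of the romeo colors as described in the text.
# All names are hypothetical; this is not the sherpa-romeo API.
ROMEO_POLICY = {
    "green": {"pre-print", "post-print", "publisher-pdf"},  # assumed: all versions
    "blue": {"post-print", "publisher-pdf"},
    "yellow": {"pre-print"},
    "white": set(),  # archiving not formally supported
}


def may_archive(color: str, version: str) -> bool:
    """Return True if `version` may be self-archived under a journal of `color`."""
    return version in ROMEO_POLICY[color.lower()]
```

under this reading, an author who posts a publisher pdf from a yellow or white journal would be acting outside the stated policy, which is the kind of mismatch antelman's findings point to.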
this mix of various versions of scholarly articles may raise legal issues on the one hand and, on the other hand, makes it difficult for scholars to reuse the data and results with regard to research quality control [ ]. in another early study, cave found that “only % of academics and . % of information professionals surveyed found it easy to identify different versions of digital objects within institutional repositories with the figure being even worse across multiple repositories” [ ]. this in particular is still a problem today. in , the national information standards organization in partnership with the association of learned and professional society publishers recommended a classification of journal article versions (jav) [ ]. the recommended terms and definitions for jav define journal articles at seven stages: ( ) author’s original; ( ) submitted manuscript under review; ( ) accepted manuscript; ( ) proof; ( ) version of record; ( ) corrected version of record; and ( ) enhanced version of record. these stages are broadly comparable to the pre-print and post-print distinction.
. conclusion: narrowing or widening?
will the varying outlooks of the open access divide be narrowed or eventually be overcome? the answer is that it is too early to tell. because of the heterogeneous sources of variables in practice, it is likely that some aspects of the divide may become even worse at certain stages of the movement. cultural, economic and political influences in the trans-national environment may give the transition a different look even though the trend of internationalization has penetrated the scholarly system, and differing professional roles will still have dissimilar demands and concerns. alternatively, gaps such as the one between oa journals and book publishing and the problem of alternate file versions may become less detectable after the necessary infrastructure has been fine-tuned and appropriate policies have been created.
on the road to closing the divide, immense effort is needed to develop a consensus that academic culture can change in a way that is beneficial to the movement. in analyzing the open access divide, differing definitions and perspectives may also help align our understanding of the oa challenges and accomplishments, thereby directing the oa efforts of advocates and policy-makers. it is therefore important for researchers to develop a comprehensive theoretical framework for systematic exploration of oad from many different perspectives. this article is a systematic examination of oad based on the four-category model of a social network approach, covering the most visible gaps in oa activities. in order to summarize the findings, we draw a diagram to show the conceptual relationship among the primary variables, demonstrating the divide in geography, oa players, and types of activities, where the degree of various gaps is signified by the number of plus and minus signs (figure ). comparisons within any single category across either horizontal or vertical lines will demonstrate measurable disparities between one’s awareness of an oa activity and their actual contributions. this matrix corresponds to hilbert’s conceptual combinations for the complexity of the digital divide and also fits into a diffusion analysis. specifically, the higher level of awareness and oa participation in archiving of data from various subject areas in the developed countries correlates with these countries’ status as early adopters and/or members of the majority, while all groups of subjects in the developing countries may be viewed as the adoption laggards in all types of oa activity [ ]. although it is currently impossible to quantify the turning point threshold between minority and majority within rogers’ diffusion model, such an undertaking would be a useful area for future studies to address. figure .
diagram of a conceptual relationship among the primary variables in the open access divide [ ].
the purpose of analyzing the oad is to determine impediments to widespread oa adoption and action so as to prioritize possible solutions [ – ]. among other considerations, it is most useful to reiterate here the significance of changing the existing structure of faculty promotion and assessment for minimizing the divide. while this may appear at first to be a difficult undertaking, oa advocates have already exercised an increasing influence on digital scholarship in theory and practice. faculty members have been conscious of the problems of current scholarly communication as well as of the potential of widespread, free information sharing in the transformation of e-science, and librarians have been consistently providing innovative services in support of high-quality research activities. the recent berlin open access conference featured many high-profile presenters, including top federal government officials, university presidents, nobel laureates and major foundation directors, a positive sign of how far oa has come in just a decade. hopefully, more research projects can be initiated to explore the open access divide in greater depth and to address additional issues beyond the scope of this discussion, further raising awareness of the disparities of open access, including possible gaps in the form of inequalities of resources among different types and sizes of institutions, or among different age and gender groups.
there are limitations in this study. it attempted to cover many aspects of the open access divide in one paper, which made it impossible to discuss in depth some work already published by others on the subject of open access. in addition, this study did not address some other issues in open access, such as the green and gold oa models, and links between the various openness movements.
also, the different practices of open data and publications could have been examined more adequately. conflicts of interest the author declares no conflict of interest. references . king, c.j.; harley, d.; earl-novell, s.; arter, j.; lawrence, s.; perciali, i. scholarly communication: academic values and sustainable models; center for studies in higher education: berkeley, ca, usa, ; p. . . graham, m. time machines and virtual portals: the spatialities of the digital divide. progr. dev. stud. , , . . plos: open access. available online: http://www.plos.org/about/open-access/ (accessed on october ). . berlin . conference. available online: http://www.berlin .org/call-to-action.html (accessed on october ). . xia, j. a longitudinal study of scholars attitudes and behaviors in open access publishing. j. am. soc. inf. sci. technol. , , – . . xia, j. an anthropological emic-etic perspective of open access practices. j. doc. , , – . . xia, j.; gilchrist, s.b.; smith, n.x.p.; kingery, j.a.; radecki, j.r.; wilhelm, m.l.; harrison, k.c.; ashby, m.l.; mahn, a.j. a review of open access self-archiving mandate policies. port. libr. acad. , , – . . thatcher, s.g. from the university presses–open access and the future of scholarly communication. against grain , , – . . thatcher, s.g. from the university presses–what university presses think about open access? against grain , , – . . thatcher, s.g. back to the future: old models for new challenges. against grain , , – . . xia, j.; wilhoite, s.k.; myers, r.l. a ‘librarian-lis faculty’ divide in open access practice. j. doc. , , – . . feijen, m.; van der kuil, a. a recipe for cream of science: special content recruitment for dutch institutional repositories. ariadne , . available online: http://www.ariadne.ac.uk/issue /vanderkuil (accessed on october ). . morris, s.; thorn, s. learned society members and open access. learn. publ. , , – . . rowlands, i.; nicholas, d. 
new journal publishing models: an international survey of senior researchers; ciber: london, uk, . . swan, a.; brown, s. open access self-archiving: an author study; key perspectives ltd.: truro, uk, . publications , . xia, j.; liu, y. usage patterns of open genomic data. coll. res. libr. , , – . . hilbert, m. the end justifies the definition: the manifold outlooks on the digital divide and their practical usefulness for policy-making. telecommun. policy , , – . . augustine, a. liber de rhetorica. in rhetores latini minores; halm, k., ed.; aedibus b.g. teubneri: leipzig, germany, , p. . . robertson, d.w. a note on the classical origin of ‘circumstances’ in the medieval confessional. stud. philol. , , – . . trumbull, h.c. teaching and teachers; john, d., wattles: philadelphia, pa, usa, ; p. . . lasswell, h.d. the structure and function of communication in society. in the communication of ideas; bryson, l., ed.; harper and row: new york, ny, usa, ; p. . . griffin, p.f. the correlation of english and journalism. engl. j. , , – . . bradshaw, p. five w’s and an h that should come *after* every story. online j. blog , nov . available online: http://recoveringjournalist.typepad.com/recovering_journalist/ / / - ws-an-h- .html (accessed on october ). . lasswell, h.d. politics who gets what, when and how; whittlesey house: new york, ny, usa, . . mackie, m. filling institutional repositories: practical strategies from the daedalus project. ariadne , . available online: http://www.ariadne.ac.uk/issue /mackie (accessed on october ). . xia, j.; dalbello, m. self-archiving as an emergent scholarly practice. in proceedings of the research showcase of school of communication, information, and library studies, rutgers university, new brunswick, nj, usa, may . . austin, a.; heffernan, m.; david, n. academic authorship, publishing agreements and open access: survey results; queensland university of technology: brisbane, australia, . . hess, t.; wigand, r.t.; mann, f.; von walter, b. 
open access & science publishing; ludwig- maximilians-universitat: munich, germany, . . boufarss, m. if we build it, will they come? in proceedings of the sla-arabian gulf chapter th annual conference, abu dhabi, united arab emirates, march, . . harley, d.; acord, s.k.; earl-novell, s.; lawrence, s.; king, c.j. assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines; center for studies in higher education: berkeley, ca, usa, ; p. i. . warlick, s.e.; vaughan, k.t.l. factors influencing publication choice: why faculty choose open access. biomed. dig. libr. , . available online: http://www.bio-diglib.com/content/ / / (accessed on october ). . becher, t. the significance of disciplinary differences. stud. high. educ. , , – . . damrosch, d. we scholars: changing the culture of the university; harvard university press: cambridge, ma, usa, . . tierney, w.g. organizational culture in higher education: defining the essentials. j. high. educ. , , – . . suber, p. open access overview. available online: http://www.earlham.edu/~peters/fos/ overview.htm (accessed on october ). publications , . rowlands, i.; nicholas, d.; huntingdon, p. scholarly communication in the digital environment: what do authors wants? ciber: london, uk, . . ware, m. pathfinder research on web-based repositories–final report; publisher and library learning solutions: bristol, uk, . . ware, m. institutional repositories and scholarly publishing. learn. publ. , , – . . university of california, faculty attitudes and behaviors regarding scholarly communication: survey findings, . available online: http://osc.universityofcalifornia.edu/responses/materials/ osc-survey-full- .pdf (accessed on october ). . xia, j.; opperman, d. current trends in institutional repositories of master’s and baccalaureate institutions. ser. rev. , , – . . sale, a. the acquisition of open access research articles. first monday , . 
available online: http://firstmonday.org/ojs/index.php/fm/article/view/ (accessed october ). . sale, a. the impact of mandatory policies on etd acquisition. d-lib. mag. , . available online: http://www.dlib.org/dlib/april /sale/ sale.html (accessed on october ). . sale, a. comparison of content policies for institutional repositories in australia. first monday , . available online: http://firstmonday.org/ojs/index.php/fm/article/view/ (accessed october ). . piwowar, h.a. who shares? who doesn’t? factors associated with openly archiving raw research data. plos one , , e . . kennan, m.a. learning to share: mandates and open access. libr. manag. , , – . . kennan, m.a.; kingsley, d. the state of the nation: a snapshot of australian institutional repositories. first monday , . available online: http://firstmonday.org/ojs/index.php/fm/article/view/ / (accessed on october ). . kurtz, m.j.; eichhorn, g.; accomazzi, a.; grant, c.; demleitner, m.; henneken, e.; murray, s.s. the effect of use and access on citations. inf. process. manag. , , – . . carter, h.; snyder, c.a.; imre, a. library faculty publishing and intellectual property issues. port. libr. acad. , , – . . mercer, h. almost halfway there: an analysis of the open access behaviors of academic librarians. coll. res. libr. , , – . . harnad, s. open access to research: changing researcher behavior through university and funder mandates. j. democr. open gov. , , – . . lewis, d.w. the inevitability of open access. coll. res. libr. , , – . . poynder, r. suber: leader of a leaderless revolution. inf. today , . available online: http://www.infotoday.com/it/jul /suber-leader-of-a-leaderless-revolution.shtml (accessed on october ). . peterson, w.k. open access to digital information: opportunities and challenges identified during the electronic geophysical year. data sci. j. , , s –s . . shin, e. the challenges of open access for korea's national repositories. interlend. doc. supply , , – . . young, p. 
open access dissemination challenges: a case study. oclc syst. serv. , , – . . palmer, k.l.; dill, e.; christie, c. where there’s a will there’s a way? survey of academic librarian attitudes about open access. coll. res. libr. , , – . publications , . suber, p. timeline of the open access movement, . available online: http://www.earlham.edu/ ~peters/fos/timeline.htm (accessed on october ). . evans, j.a.; reimer, j. open access and global participation in science. science , , . . xia, j. diffusionism and open access practices. j. doc. , , – . . rowlands, i.; nicholas, d. the changing scholarly communication landscape: an international survey of senior researchers. learn. publish. , , – . . liu, z.; wan, g. scholarly journal articles on open access in lis literature: a content analysis, . available online: http://www.white-clouds.com/iclc/cliej/cl liuwan.htm (accessed on october ). . directory of open access journals. available online: http://www.doaj.org/ (accessed on october ). . hdi index. available online: http://hdr.undp.org/en/media/lets-talk-hd-hdi_ .pdf (accessed on october ). . ulrichsweb global serials directory. available online: http://www.ulrichsweb.com/ulrichsweb/faqs.asp (accessed on october ). . rogers, e.m. diffusion of innovations, th ed.; free press: new york, ny, usa, . . rogers, e.m. communication technology: the new media in society; free press: new york, ny, usa, . . rogers, e.m. diffusion of innovations; free press: new york, ny, usa, . . andrew, t. trends in self-posting of research material online by academic staff. ariadne , . available online: http://www.ariadne.ac.uk/issue /andrew (accessed on october ). . jenkins, b.; breakstone, e.; hixson, c. content in, content out: the dual roles of reference librarian in institutional repositories. ref. serv. rev. , , – . . pinfield, s. self-archiving publications. 
in international yearbook of library and information management – : scholarly publishing in an electronic era; gorman, g.e., rowland, r., eds.; facet publishing: london, uk, ; pp. – . . shearer, m.k. institutional repositories: towards the identification of critical success factors. can. j. inf. libr. sci. , , – . . arxiv. available online: http://arxiv.org (accessed on october ). . repec. available online: http://repec.org/ (accessed on october ). . hubbard, b. sherpa and institutional repositories. serials , , – . . lynch, c.d. institutional repositories: essential infrastructure for scholarship in the digital age. port. libr. acad. , , – . . carr, l.; brody, t. size isn’t everything: sustainable repositories as evidenced by sustainable deposit profiles. d-lib. mag. , . available online: http://www.dlib.org/dlib/july /carr/ carr.html (accessed on october ). . xia, j.; sun, l. assessment of self-archiving in institutional repositories: depositorship and full-text availability. ser. rev. , , – . . aha. statement on polices regarding the embargoing of completed history phd dissertations. aha today . available online: http://blog.historians.org/ / /american-historical- association-statement-on-policies-regarding-the-embargoing-of-completed-history-phd- dissertations/ (accessed on october ). publications , . jaschik, s. embargoes for dissertations? inside high. ed , july . available online: http://www.insidehighered.com/news/ / / /historians-association-faces-criticism- proposal-embargo-dissertations (accessed on october ). . patten, s. scholarly group seeks up to -year embargoes on digital dissertations. chron. high. educ. , july . available online: http://chronicle.com/article/scholarly-group-seeks-up- to/ / (accessed on october ). . ramirez, m.l.; dalton, j.t.; mcmillan, g.; read, m.; seamans, n.h. do open access electronic theses and dissertations diminish publishing opportunities in the social sciences and humanities? 
findings from a survey of academic publishers. coll. res. libr. , , p. . . suber, p. an open access mandate for the national institutes of health. open med. , . available online: http://www.openmedicine.ca/article/view/ / (accessed on october ). . bankier, j.g.; smith, c. establishing library publishing: best practices for creating successful journal editors. in proceedings of the elpub conference on electronic publishing, toronto, on, canada, june ; pp. – . . royster, p. publishing original content in an institutional repository. ser. rev. , , – . . xia, j. library publishing as a new model of scholarly communication. j. sch. publ. , , – . . pinfield, s. journals and repositories: an evolving relationship? learn. publ. , , – . . becher, t.; trowler, p.r. academic tribes and territories: intellectual enquiry and the cultures of discipline; open university press: buckingham, uk, . . kreber, c. the university and its disciplines: teaching and learning within and beyond disciplinary boundaries; routledge: london, uk, . . kekäle, j. preferred patterns of academic leadership in different disciplinary (sub)culture. high. educ. , , – . . demartini, j.r. basic and applied sociological work: divergence, convergence, or peaceful co-existence. j. appl. behav. sci. , , – . . gadd, e.; oppenheim, c.; probets, s. romeo studies : the impact of copyright ownership on academic author self-archiving. j. doc. , , – . . gadd, e.; oppenheim, c.; probets, s. romeo studies : an analysis of journal publishers’ copyright agreements. learn. publ. , , – . . carr, l.; harnad, s. keystroke economy: a study of the time and effort involved in self-archiving. available online: http://www.eprints.ecs.soton.ac.uk/ (accessed on october ). . ozek, y.h. lund virtual medical journal makes self-archiving attractive and easy for authors. d-lib. mag. , . available online: http://www.dlib.org/dlib/october /ozek/ ozek.html (accessed on october ). . austin, a.e.; rice, r.e. 
making tenure viable: listening to early career faculty. am. behav. sci. , , – . . olsen, d. work satisfaction and stress in the first and third year of academic appointment. j. high. educ. , , – . . developing new and junior faculty; sorcinelli, m.d., austin, a.e., eds.; jossey-bass: san francisco, fl, usa, . publications , . mccain, k. mandating sharing: journal policies in the natural sciences. sci. commun. , , – . . piwowar, h.a.; chapman, w. a review of journal policies for sharing research data. telpub . available online: http://precedings.nature.com/documents/ /version/ (accessed on october ). . ellingford, l. education scholars’ motivations, approaches and practices toward open access publishing. in proceedings of the berlin open access conference, washington, dc, usa, – november . . senders, j. an on-line scientific journal. inf. sci. , , – . . harnad, s. scholarly skywriting and the prepublication continuum of scientific inquiry. psychol. sci. , , – . . schauder, d. electronic publishing of professional articles: attitudes of academics and implications for the scholarly communication industry. j. am. soc. inf. sci. , , – . . devakos, r.; turko, k. synergies: building national infrastructure for canadian scholarly publishing. arl bi- mon. , / , – . . maron, n.l.; smith, k.k. current models of digital scholarly communication: results of an investigation conducted by ithaka for the association of research libraries; association of research libraries: washington, dc, usa, . . nicholas, d.; huntington, p.; jamali, h.r. the impact of open access publishing (and other access initiatives) on use and users of digital scholarly journals. learn. publ. , , – . . hahn, k.l. research library publishing services: new options for university publishing; association of research libraries: washington, dc, usa, . . koh, a. what is publishing? a report from thatcamp publishing. chron. high. educ. , november . 
available online: http://chronicle.com/blogs/profhacker/what-is-publishing-a- report-from-thatcamp-publishing/ (accessed on october ). . suber, p. the sparc open access newsletter. , . available online: http://www.earlham.edu/~peters/fos/newsletter/ - - .htm (accessed on october ). . plos journals. available online: http://www.plos.org/publications/journals/ (accessed on october ). . bankier, j.g.; foster, c.; wiley, g. institutional repositories–strategies for the present and future. ser. libr. , , – . . king, d.w. the cost of journal publishing: a literature review and commentary. learn. publ. , , – . . reische, j. who bears the cost? comment on s. jaschik, ‘abandoning print, not peer review’, inside high. ed. , february . available online: http://www.insidehighered.com/news/ / / /open (accessed on october ). . givler, p. university press publishing in the united states. in scholarly publishing: books, journals, publishers and libraries in the twentieth century; abel, r.e., newman, l.w., eds.; wiley: hoboken, nj, usa, . . pochoda, p. editor’s note for reimagining the university press. j. electron. publ. , . available online: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext (accessed on october ). publications , . fry, b.m.; white, h.s. economics and interaction of the publisher-library relationship in the production and use of scholarly and research journals; national science foundation: washington, dc, usa, ; p. . . wittenberg, k. reimagining the university press. j. electron. publ. , . available online: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext (accessed on october ). . esposito, j.j. stage five book publishing. j. electron. publ. , . available online: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext (accessed on october ). . jensen, m.j. university presses in the ecosystem of . j. electron. publ. , . available online: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext (accessed on october ). . lynch, c.d. 
imagining a university press system to support scholarship in the digital age. j. electron. publ. , . available online: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext (accessed on october ). . mcpherson, t. scaling vectors: thoughts on the future of scholarly communication. j. electron. publ. , . available online: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext (accessed on october ). . sherpa/rome. available online: http://www.sherpa.ac.uk/romeo (accessed on october ). . antelman, k. self-archiving practice and the influence of publisher policies in the social sciences. learn. publ. , , . . pinfield, s. how do physicists use an e-print archive? implications for institutional e-print services. d-lib. mag. , . available online: http://www.dlib.org/dlib/december /pinfield/ pinfield.html (accessed on october ). . probets, s.; jenkins, c. documentation for institutional repositories. learn. publ. , , – . . cave, p. work package : requirements exercise–report of a survey of academics and information professionals; university of leeds: leeds, uk, ; p. . . niso/alpsp journal article versions (jav) technical working group. journal article versions (jav): recommendations of the niso/alpsp jav technical working group; national information standards organization: baltimore, md, usa. . harper, f.m.; raban, d.; rafaeli, s.; konstan, j.a. predictors of answer quality in online q&a sites. in proceedings of the th annual sigchi conference on human factors in computing systems; acm: new york, ny, usa. . goodman, d. the criteria for open access. ser. rev. , , – . . guédon, j. the “green” and “gold” roads to open access. ser. rev. , , – . . harnad, s.; brody, t.; vallières, f.; carr, l.; hitchcock, s.; gingras, y.; oppenheim, c.; hajjem, c.; hilf, e.r. the access/impact problem and the green and gold roads to open access: an update. ser. rev. , , – . publications , . guédon, j. mixing and matching the green and gold roads to open access—take . ser. rev. 
, , – .
© by the authors; licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /).

on providing semantic alignment and unified access to music library metadata
int j digit libr ( ) : – https://doi.org/ . /s - - -
david m. weigl · david lewis · tim crawford · ian knopke · kevin r. page
received: march / revised: june / accepted: june / published online: august
© the author(s) . this article is an open access publication

abstract
a variety of digital data sources, including institutional and formal digital libraries, crowd-sourced community resources, and data feeds provided by media organisations such as the bbc, expose information of musicological interest, describing works, composers, performers, and wider historical and cultural contexts. aggregated access across such datasets is desirable as these sources provide complementary information on shared real-world entities. where datasets do not share identifiers, an alignment process is required, but this process is fraught with ambiguity and difficult to automate, whereas manual alignment may be time-consuming and error-prone. we address this problem through the application of a linked data model and framework to assist domain experts in this process. candidate alignment suggestions are generated automatically based on textual and on contextual similarity. the latter is determined according to user-configurable weighted graph traversals. match decisions confirming or disputing the candidate suggestions are obtained in conjunction with user insight and expertise. these decisions are integrated into the knowledge base, enabling further iterative alignment, and simplifying the creation of unified viewing interfaces.
provenance of the musicologist’s judgement is captured and published, supporting scholarly discourse and counter-proposals. we present our implementation and evaluation of this framework, conducting a user study with eight musicologists. we further demonstrate the value of our approach through a case study providing aligned access to catalogue metadata and digitised score images from the british library and other sources, and broadcast data from the bbc radio early music show.

keywords linked data · metadata · semantic alignment · contextual matching · musicology · early music

david m. weigl, david.weigl@oerc.ox.ac.uk
oxford e-research centre, university of oxford, oxford, uk
department of computing, goldsmiths, university of london, london, uk
british broadcasting corporation, london, uk

introduction
the reconciliation of corpora providing access to historical catalogue data in a digital libraries context is made difficult by a range of challenges, from ambiguities concerning the names of individuals to disputed or erroneous attribution (e.g. [ ]). relevant sources include digital libraries of institutions such as the british library, formal digital library resources provided by organisations such as the oclc (http://www.oclc.org; e.g. viaf, http://viaf.org), data feeds provided by commercial and media industry institutions such as the bbc (the uk’s national public service broadcaster), and community resources such as musicbrainz (http://musicbrainz.org). such datasets provide complementary information concerning the same historical entities (e.g. composers, works), but the corresponding records may not share identifiers across the datasets. this greatly complicates convenient access to the entirety of the information available on a given entity. automated approaches employing heuristics to identify similarly named individuals or similarly titled works are insufficient to resolve ambiguous cases such as a name being shared by multiple individuals or an individual being known by multiple names. further complications arise from potential differences in language between datasets (e.g. via anglicisation of latin names), non-standardised spellings (a prominent issue with historical catalogue data), and simple errors. the manual resolution of these concerns by domain specialists is tedious and error-prone, as the number of potential alignment candidates grows exponentially with the size of the datasets in question.
in this paper, we address this issue by combining the expert guidance of the domain specialist with the efficiency and tractability of an automated solution, which uses surface similarity and contextual semantics to generate match candidates for confirmation or disputation by the musicologist. by providing computational assistance, we ensure that the user’s attention is required only for those aspects of the alignment task providing or demanding insight. by tracking provenance information regarding the user’s alignment activities, we explicitly capture their judgement on contested attributions and similar topics of scholarly dispute.
we adopt a linked data approach [ ] employing the resource description framework (rdf) [ ], a standard model for online data exchange. linked data extends the linking structure of the world wide web by employing uris to specify directed relationships between data instances. these data instances may themselves be encoded by uris or represented by literal values. a set of two such instances, linked by such a relationship, is referred to as a triple. collections of triples may be stored in flat files on a server, accessed via http, or housed within specialised rdf databases, known as triplestores.
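as a rough illustration of the two ideas above, the python sketch below stores triples as plain (subject, predicate, object) tuples and ranks candidate matches between two small datasets by the surface similarity of their labels. it is a minimal sketch under stated assumptions, not the authors' implementation: the `example.org` uris, the `LABEL` predicate, the sample records, and the helper `match_candidates` are all hypothetical, and the standard-library `difflib.SequenceMatcher` merely stands in for whatever string similarity measure a real system would use. contextual similarity is not modelled here.

```python
from difflib import SequenceMatcher

# Hypothetical triples: (subject URI, predicate URI, object URI or literal).
LABEL = "http://example.org/vocab/label"
dataset_a = [
    ("http://example.org/a/person/1", LABEL, "Orlando di Lasso"),
    ("http://example.org/a/person/2", LABEL, "John Dowland"),
]
dataset_b = [
    ("http://example.org/b/artist/7", LABEL, "Orlande de Lassus"),
    ("http://example.org/b/artist/9", LABEL, "Dowland, John"),
]


def labels(triples):
    """Map each subject URI to its label literal."""
    return {s: o for s, p, o in triples if p == LABEL}


def match_candidates(a, b, threshold=0.5):
    """Return (score, uri_a, uri_b) tuples, best first, for expert review."""
    out = []
    for uri_a, name_a in labels(a).items():
        for uri_b, name_b in labels(b).items():
            # Surface similarity of the two labels, ignoring case.
            score = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
            if score >= threshold:
                out.append((round(score, 2), uri_a, uri_b))
    return sorted(out, reverse=True)


candidates = match_candidates(dataset_a, dataset_b)
```

pairs at or above the threshold would then be presented to the domain expert for confirmation or dispute; the default threshold of 0.5 is arbitrary and chosen only for the example.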
(in the parlance of linked data, a subject is related by a predicate to an object.)

by employing this linked data approach, the meanings of the relationships between the data are made explicit, allowing them to be understood by both humans and machines. information represented in this way may be linked to external datasets and can in turn be linked to from external datasets, embedding the information within a wider web of knowledge and making it discoverable and reusable in other contexts [ ].

a key strength of the rdf model compared to the traditional relational model is that it is robust to underspecification and mutability in the data schema, and thus facilitates the merging of disparate datasets that may each employ radically different schemas or ontologies. since it involves publishing alignment outcomes as interlinked rdf triples, our solution facilitates the creation of combined views of the unified data within the now-reconciled corpora, while supporting reuse of the musicologist’s decisions to drive further scholarly activity.

the remainder of this article is organised as follows: sect. describes the musicological motivations that have led to the development of the work presented here and introduces a specific motivating case in early music. section details related work in the matching of disparate datasets. section presents the design of the data model and framework underlying our approach towards addressing this problem. section discusses the implementation of a semantic alignment and linking tool (salt) that builds on this model. section reports on a case study applying salt to link academic and media industry datasets focusing on our early music scenario. section presents a user evaluation of the salt data model and user interface, employing eight musicologist participants with expertise in early music. section presents the implementation of a unified viewing interface providing access to datasets linked by a domain specialist using salt.
finally, sect. concludes this discussion and presents our plans for future work.

musicological motivation and case study in early music

exploratory investigations have been conducted into the information needs and information seeking behaviours of (potential) users of music information systems [ , ], and to a lesser degree, of musicologists [ , , ]. however, research in the field of music information retrieval has predominantly focused on system-centric concerns [ ], and the needs and behaviours of musicologists in particular remain relatively underexplored [ ].

in the course of our work, we have observed that musicologists gather the information they require about music and its contexts from a variety of sources. an academic may find literature through tools including répertoire international de littérature musicale (rilm) and jstor, through general-purpose resources such as wikipedia or google scholar, using topic-specific encyclopaedias, for example the new grove dictionary of music and musicians (new grove) [ ] and musik in geschichte und gegenwart [ ], and through library catalogues. resources about notated music are more diverse, including répertoire international des sources musicales (rism) [ ] and more specialised catalogues such as brown’s instrumental music printed before : a bibliography [ ], and even more varied still are the sources of information on performances, recordings, and broadcasts.

http://github.com/oerc-music/salt. http://github.com/oerc-music/slobr-ui. http://rilm.org. http://jstor.org.

historically, the task of coordinated study of these areas of musical understanding has necessarily been a manual one. many of the resources are or have been published in book or serial form, and the reader can carefully assemble a narrative from the separate parts. online versions of these, along with the new digital-only materials, provide opportunities
for faster, more comprehensive study, along with larger-scale apprehension of a topic, but they also carry greater risks of misinterpretation.

musicologists are aware of this tension between physical and digital methods. they are frustrated when the materials they seek are not available in physical libraries [ ], and acknowledge the benefits of the digital in terms of breadth and immediacy of access, and of increased workflow efficiency, while recognising the difficulties inherent in the excessively abundant information, making it difficult to “separate the wheat from the chaff” [ ].

the ability to navigate easily between large stores of musicological knowledge is immensely valuable, provided the navigation is reliable. even where it is occasionally incorrect, perhaps mistakenly linking the twentieth-century john tavener with the sixteenth-century john taverner, these errors are usually easily spotted and ignored. where such connections are used as part of an automated process that summarises a large amount of gathered information, errors are harder to spot and can make nonsense of the results. in a commercial or otherwise public-facing environment, such mistakes can affect user confidence in the data at large.

our motivation is the belief that a combined (or rather, linked) dataset better serves the research needs of musicologists and musicians, as well as acting as a means for enriching the experience of a general user, for example exploring music on a broadcaster’s website with supplemental insight generated by the musicologist. we are clearly not alone in this belief.
the implicit linking provided by library authority files is progressively being turned into explicit linking through viaf [ ] in ever more bibliographic resources; the rism dataset is now published as linked data; and broadcasters such as the bbc publish some of their musical information with musicbrainz artist links.

use case

the work presented here employs, as a use case, an early music topic involving the combination of datasets generated in academic projects and in the media industry. both sets of stakeholders (academic and industrial) are intended beneficiaries of the technology rather than merely providers of data.

the semantic linking of information, content, knowledge and metadata in early music (slickmem) project [ ] produced a linked data resource combining the early music online (emo) dataset [ ] of digitised images and metadata on a collection of music books from the british library, with the machine-readable electronic corpus of lute music (ecolm) [ ].

radio is the music and arts station of the bbc. the station broadcasts the early music show (ems) weekly, as an “exploration of early music, looking at early developments in musical performance and composition both in britain and abroad”. it plays almost exclusively european classical music from the eighteenth century and earlier. like all regular bbc programmes, ems has a dedicated area on the broadcaster’s website with clips, podcasts, and supporting information about current and past editions. the bbc exposes structured broadcast data about its programmes, including ems, encompassing a list of featured works for each episode, and information on the contributors associated with these works (performers, composers, singers, arrangers, etc).

slickmem resources intersect with a subset of the repertory broadcast on ems, both in historical period and genre, with emo consisting of primarily sixteenth-century vocal music and ecolm featuring music for the solo lute extending into the eighteenth century.
through the creative-commons licensed slickmem resources, , pages of digitised score representing works by over composers are available.

none of these data sources are born as linked data publications. in the case of the ems data, basic alignment and export to rdf is performed by translating programme data exposed as json by the bbc into rdf using json-ld [ ], a simple extension of the json format that enables the incorporation of semantic context within the more widely familiar json syntax.

in the case of emo, the dataset is a version of the british library catalogue further enriched as part of the emo project itself. as a part of the library’s catalogue, the data are represented in marc [ ]. this means that while associating information with books is reasonably well catered for, the same is not true for musical items contained in them, and the relationship between composers and arrangers and the precise musical items they worked on is seldom clearly specified. since the first phase of emo concentrated on books with multiple composers represented, this is a more significant issue than it might otherwise have been.

ecolm uses a bespoke relational data model for its catalogue, implemented with a mysql database. decisions about authority lists were made by project members at launch and so, for example, although multiple names for personal entries are permitted, the primary name form and spelling is taken as authoritative from the new grove [ ]. the intricate model is carefully designed to avoid misrepresenting subtle historical data (and the uncertainty and provenance associated with it), so harmonising it with other models is far from trivial.

http://www.bbc.co.uk/programmes/b tn .

problem statement

in linking collections such as those from the bbc, the british library, and musicologists, we aim to maintain the separation
of concerns between the actors providing the data (the bbc, the british library, and the musicologist), recognising that each will have distinct requirements, while still reaping the benefits of intersecting interests that are manifested by the links between the corpora.

to do so requires us to semantically align the ems programme data published by the bbc with data available through slickmem. further, we integrate links to complementary datasets from sources including linkedbrainz [ ] and dbpedia [ ], projects that publish structured content, extracted respectively from the open online music database musicbrainz and from wikipedia, as linked data.

it is difficult to automate the alignment process as each dataset uses its own distinct unique identifiers to address particular data instances. nevertheless, the datasets overlap significantly in terms of describing the same real-world entities, such as particular composers of early music and their works. valuable alignment cues are provided by indicators such as the similarity of textual labels (e.g. composer name or work titles) or shared contextual information (e.g. birth place or publication date), but these can be too ambiguous to be reliable without manual verification. further, they may fail to capture valid matches, e.g. when the same composer is known by two different names. a knowledgeable musicologist is able to resolve many of these issues by drawing on personal domain expertise, but manually aligning datasets with thousands or tens of thousands of entities is likely to be prohibitively time-consuming.

the act of alignment is made more difficult, and more reliant on domain expertise, by the nature of the information the historical music catalogues are modelling. not all people named in the catalogue may have entries, even in the authority lists used by the originating organisation. the information available is often insufficient to disambiguate between candidate matches.
an example is music by both domenico and alfonso ferrabosco in slickmem, often with the simple attribution string ‘ferrabosco’ (variously spelled). with the piece titles and a list of works by each composer, these can be disambiguated given sufficient domain knowledge, but not without. in cases where a work is untitled, anonymous or both, disambiguation is impossible without access to the music in some form. more difficult still are works for which attribution is disputed or erroneous, i.e. works subject to ongoing scholarly disagreement.

http://linkedbrainz.org. http://www.dbpedia.org.

the ferrabosco family of musicians and composers included three alfonso ferraboscos, along with a domenico, henry and john. all but domenico were active in the english court, held similar employment and wrote in similar musical genres. the alfonso ferraboscos are now usually disambiguated using roman numerals, but are seldom distinguished textually in contemporary documents. one rare exception is the british library manuscript add. , which contains works by both alfonso ferrabosco ii and iii, where attribution is clarified by the epithets ‘senior’ and ‘junior’.

the approach presented here makes use of the contextual and string similarity-based alignment cues discussed above in order to generate candidate match suggestions for confirmation or disputation by a musicologist user. in doing so, we simultaneously address the issues of ambiguity and the lack of reliability inherent in a fully automated approach, while minimising the musicologist’s workload to require interaction only where human insight is required, ensuring that the alignment task remains tractable.
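to make the string-similarity cue concrete, the sketch below ranks candidate labels by a normalised edit-distance score. the choice of levenshtein distance, the normalisation by the longer label, the 0.7 threshold, and the function names are assumptions made for illustration; the article does not state which measure salt applies.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical labels."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def candidates(label, others, threshold=0.7):
    """Labels from `others` scoring at least `threshold`, best first."""
    scored = [(similarity(label, other), other) for other in others]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


suggestions = candidates("ferrabosco", ["ferabosco", "alfonso ferrabosco", "byrd"])
names = [name for _, name in suggestions]
```

with these assumed settings a variant spelling is suggested, while the fuller form "alfonso ferrabosco" falls below the threshold, and "john tavener" against "john taverner" would score highly despite naming different people. both outcomes illustrate why such suggestions are offered for confirmation or disputation rather than accepted automatically.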
related work

identification of multiple data instances corresponding to shared real-world entities has a long history of research in the literature on relational databases, where related issues are framed in a wide variety of terms including deduplication, record linkage, instance identification, coreference resolution, and reference reconciliation [ ]. related instances are typically detected using heuristics operating on the similarity of strings (so-called “fuzzy matching”) contained within the fields of the records in question; this similarity may be determined by various means [ ], including variations of levenshtein edit distance [ ] or phonetic similarity (e.g. via the metaphone family of search algorithms [ ]). where differences in data schema must be overcome, this can be achieved by using shared instance values as cues that differently named fields may refer to the same kinds of entities [ ].

the problem is no less widespread in the linked data world. instance matching here is complicated by the high degree of schema variability between data sources, and the potentially widely distributed nature of related data instances contained within these sources (see reviews in [ , , ]). a common approach is to focus on alignment of ontologies, rather than individual instances. in their recent review paper, shvaiko and euzenat [ ] presented a comprehensive discussion of the challenges and applications of ontology matching. such approaches may make use of a number of different techniques, including matches based on shared terminology (i.e. string similarity of predicate labels), and on structural similarity (based on is-a or part-of hierarchies relative to already matched concepts).

one of the challenges outlined is that of designing ways for users to be involved in the matching process without becoming lost in the huge number of results inherent in the merging of large datasets. the need for matching tools to be user-configurable and customisable is emphasised.
shvaiko and euzenat note that existing matching tools tend to lack graphical user interfaces and emphasise the utility of enabling the customisation and configuration of such tools by users who are not ontology matching specialists.

tools that do exhibit graphical interfaces for user interaction include the system for aligning and merging biomedical ontologies [ ], a matching tool developed at linköping university, sweden; agreementmaker [ ], a general-purpose ontology matching system developed at the university of illinois at chicago; and the silk linking framework [ ] developed at the university of mannheim, germany. the silk framework comprises a link discovery engine determining matches based on predefined heuristic rules, a user interaction component enabling rapid evaluation of matching outcomes and tweaking of heuristics in order to improve results, and a protocol for maintaining links in conditions of potentially mutable data.

each of these tools is capable of matching based on measures of string distance and structural considerations of data schemas: the linköping tool makes structural alignment recommendations by considering class and sub-class relationship hierarchies relative to previously matched concepts, making use of biomedical domain-specific knowledge bases; agreementmaker propagates similarity measures determined for ancestors and siblings in the hierarchy; and silk provides measures of taxonomic distance, as well as a bespoke selector language that allows the description of arbitrary structural relationships. each tool combines the outcomes from these measures in order to determine final alignment proposals, based on some notion of relative weighting.
the linköping tool creates one-to-one alignments between concepts and relations, whereas agreementmaker is also capable of generating one-to-many and many-to-many alignments on the schema level. both tools use string similarity between instance labels to inform higher-level schema alignment, but neither is targeted at the creation of links between data instances (instance matching). silk is concerned with the creation of links at the instance level, but takes a ‘top-down’ approach; the user interacts with the system to calibrate heuristic rules on the schema level until the resulting instance match outcomes are deemed acceptable.

a further tool situated broadly within our area of interest is openrefine, an open-source tool formerly known as google refine. this tool provides a user interface supporting the exploration, tidying, and reconciliation of large datasets, with a focus on creating new datasets derived from the original source data; in contrast, alignment tools generate auxiliary metadata linked to the original datasets in a hyperstructure. the openrefine interface enables the user to merge entities deemed to be identical, based on entity value matching; the procedure does not make use of contextual cues.

http://openrefine.org.

in the domain of digital musicology, the musicnet tool [ ] takes a ‘bottom-up’ approach, allowing the user to create matches through interactions on the instance level. however, the tool only assists the user by generating alignment cues based on string similarity; the underlying schematic structure of the data is not taken into account.

the evaluation of interactive tools has received increasing interest in the ontology alignment community in recent years, with an interactive matching evaluation track running as part of the annual ontology alignment evaluation initiative campaign since . paulheim et al.
[ ] detailed the evaluation strategy involved in this track, outlining quality measures including generic cost per user action, which is defined according to specific task context (e.g. time consumed, number of interactions required, or the money paid to the domain expert user); and the f-measure, the traditional measure of classification accuracy corresponding to the harmonic mean of precision and recall. paulheim et al. noted that while it is relatively easy to optimise for either the cost measure (by relying on a fully automated alignment solution) or the f-measure (by making the domain expert perform all the work manually), achieving a reasonable trade-off between the two is more challenging.

as discussed in sect. . , portions of the source data factoring into the use case presented in this article are rooted in the library catalogue, comprising metadata in the marcxml format, a serialisation of standard marc records. the use of rdf to supplement, or even replace, catalogue records with bibliographic ontologies remains a topic of active research and ongoing discussion both in libraries [ ] and in the digital humanities [ ]. available ontologies include bibframe, a conceptual bibliographic description model; rdf ontologies expressing the metadata object and metadata authority description standards (mods/rdf and mads/rdf), as well as the frbr-aligned bibliographic ontology (fabio), among others [ ].

as we will see in sect. . , the approach presented in this article simply requires data to be expressible as rdf, remaining agnostic as to ontological and vocabulary choices aside from a few very basic requirements. this maximises the applicability of our tool to a broad range of datasets; however, careful consideration should be given to appropriate data modelling choices, for example employing the ontologies listed above, if the datasets are to be adopted for use with digital library systems.

http://oaei.ontologymatching.org.
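the f-measure mentioned above, the harmonic mean of precision and recall, can be computed for a set of proposed alignments against a reference alignment as in the following sketch. representing an alignment as a set of entity pairs, and the function name `f_measure`, are assumptions for illustration and are not taken from the evaluation campaign’s tooling.

```python
def f_measure(proposed: set, reference: set) -> float:
    """Harmonic mean of precision and recall for a set of proposed matches."""
    true_positives = len(proposed & reference)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(proposed)   # share of proposals that are correct
    recall = true_positives / len(reference)     # share of correct matches that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical alignments between entities of two datasets.
proposed = {("a", "x"), ("b", "y"), ("c", "z")}
reference = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}
score = f_measure(proposed, reference)  # precision 2/3, recall 1/2
```

because the harmonic mean is dominated by the smaller of its two inputs, a tool cannot reach a high f-measure by proposing very few safe matches or by proposing everything.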
http://www.loc.gov/bibframe. http://www.loc.gov/standards/mods/rdf. http://www.loc.gov/standards/mads/rdf. http://vocab.ox.ac.uk/fabio.

data model and framework

linked data compatibility

the principal aim of the work reported here was to design a model and framework supporting domain experts in the semantic alignment of complementary datasets through linking structures published as linked data. these published outcomes may then form the semantic scaffolding for a unified view of the data.

our design stipulates minimal requirements upon the datasets to be aligned, permitting the framework to be cross-applicable to corpora from a variety of domains. these requirements are:

1. the data can be expressed as rdf triples.
2. each entity in the data that is subject to alignment decisions is addressable using a persistent uri.
3. each such entity exposes a human-comprehensible label.
4. it is possible for entities to be linked to additional sources of contextual information.

the first requirement relates to the data model underlying our design. it should be noted that the data does not have to be expressed as rdf at source; it is relatively trivial to convert legacy data stored in tabular spreadsheets or relational databases into an rdf format using open-source tools [ ]. examples include the d rq platform [ ], which produces semantic structures by mapping from a particular derivative of the relational structure describing the tables in a database, and web-karma, a tool that supports the interactive definition of a graph of relationships between the columns of a tabular dataset, using a graphical user interface.
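the following sketch shows, under simplifying assumptions, how tabular legacy data might be turned into triples that satisfy the first three requirements: each row receives a uri, a type, and an `rdfs:label`. it is not how the conversion tools named above work internally; the base uri, the column names, and the `rows_to_triples` helper are hypothetical.

```python
import csv
import io
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/"  # hypothetical namespace for minted URIs


def rows_to_triples(csv_text, entity_type, id_column, label_column):
    """Convert each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Requirement 2: a URI derived from a stable identifier column.
        uri = f"{BASE}{entity_type}/{quote(row[id_column])}"
        triples.append((uri, RDF_TYPE, BASE + entity_type))
        # Requirement 3: a human-comprehensible label.
        triples.append((uri, RDFS_LABEL, row[label_column]))
        # Remaining non-empty columns become simple literal-valued properties.
        for column, value in row.items():
            if column not in (id_column, label_column) and value:
                triples.append((uri, BASE + "vocab/" + quote(column), value))
    return triples


table = "id,name,birth_place\n1,Composer A,Place X\n2,Composer B,\n"
triples = rows_to_triples(table, "composer", "id", "name")
```

the uris stay persistent only as long as the identifier column is stable, which is why the sketch keys on an explicit id column and not on row position.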
http://d rq.org. http://usc-isi-i .github.io/karma.

the second requirement relates to the mechanism by which the alignment decisions of the domain expert (confirmations or disputations of a match between pairs of specific entities across two datasets) are asserted and stored. a persistent uri that uniquely identifies each entity is required in order to serve as a handle to which data representing individual alignment decisions may be attached. queries and browsing interfaces that make use of the outcomes of the alignment process are then able to address specific entities on either side of the dataset divide, and may easily discover all matches that have been asserted across the gap (sect. . ).

the third requirement is necessary in order to make the system useful to human users. assigning comprehensible labels to all specific data entities, typically by asserting triples encoding rdfs:label relationships, is considered good linked data practice [ ] and in this case is required in order to give the user an indication of the specific data instances available. while the system currently focuses on textual labels, multimedia and multimodal applications may be envisioned for future development (sect. ).

finally, while recommendations on potential alignment candidates can be made based on surface similarity of the labels of the entities to be compared (i.e. string distance), these entities may be linked to additional sources of contextual information in order to make fuller use of the underlying semantic capabilities of the system. all that is required is that some graph relationship can be described between the entities on either side of the alignment and the same, shared contextual item. each such contextual item represents a potential alignment point, upon which a match of two entities from either side may be suggested for user confirmation or disputation.
the exact schematic relationship between the entity and the contextual item may take an entirely different form within either dataset. as an illustrative example drawn from our case study in early music, the slickmem data encodes people both in their capacity as composers of works, and as authors of books that compile works. while the relationship “composer composes work” is schematically different from “author creates book; book contains work”, we can nevertheless connect composer and author instances via work using these relationships, in order to aid the alignment task (fig. ).

datasets and saltsets

it is a strength of rdf that the same dataset may incorporate diverse types of entities. however, in order to simplify the alignment task, it is desirable to present entities of a consistent type with direct user relevance for the given task: the focal points that will anchor one side of an alignment decision. referring again to our use case, the slickmem dataset encodes entities representing people, works, books, publications, and places and relates them within a shared graph structure; however, in terms of the alignment task, it is more useful to operate on sub-graphs relating directly to the entities that act as alignment anchors, e.g. slickmem composers, slickmem authors, or slickmem works. by virtue of their different positions within the overall graph structure of the dataset, different types of entities also have distinct schematic relationships with contextual items of interest.

to disambiguate between the shared graph structure incorporating all data from a particular source and the sub-graphs of this structure that are directly relevant to a given alignment task, we refer to the former as dataset and to the latter as saltset.
a saltset is thus a subset of a dataset, consisting specifically of those entities (and their labels) that will form the focal points of an alignment task, combined with a collection of templates that define potential contextual alignment points. each such point is an entity in the dataset, a contextual item which stands in some defined relation (a contextual path) to such an anchoring entity.

http://d rq.org http://usc-isi-i .github.io/karma

fig. a saltset configuration used in the alignment of programme data from the bbc radio early music show with metadata on early music from the british library and the electronic corpus of lute music, comprising five source datasets: bbc ems broadcast data, slickmem, linkedbrainz, dbpedia, & viaf

contextual paths

a contextual path is specified as a graph traversal (a walk through the graph structure) that begins at the focal point of a given saltset, and ends at a particular graph node expected to offer significant alignment cues to the user. these nodes of significance, the contextual items associated with a particular saltset, are selected by the domain expert during the configuration of the tool (sect. . ). examples are illustrated in fig. , where contextual paths are represented as patterned lines according to their associated saltset.

in a particular alignment task, any contextual items associated with saltsets on either side of the alignment gap form explicit connections between the focal entities of the saltsets being compared. these contextual items are made accessible to the user in order to serve as cues to alignment decisions. by associating weights with shared contextual items in a given alignment task, a score is calculated to provide the user with a view of potential alignment candidates sorted according to contextual relevance. the magnitudes of the associated weights are determined by the user.
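the interplay of contextual paths and weights can be sketched as follows. each saltset names its own path (a sequence of predicates) to each kind of contextual item, and a candidate pair scores the weight of every item kind on which the two walks meet. the `ex:` identifiers, the path and weight tables, and the additive scoring rule are assumptions for illustration; the article specifies that weights are user-defined but not the exact formula salt uses.

```python
def follow_path(triples, start, path):
    """Walk a sequence of predicates from `start`; return the set of nodes reached."""
    frontier = {start}
    for predicate in path:
        frontier = {o for s, p, o in triples if p == predicate and s in frontier}
    return frontier


def context_score(triples, left, left_paths, right, right_paths, weights):
    """Sum the weights of the contextual item kinds shared by both entities."""
    total = 0.0
    for kind, weight in weights.items():
        shared = (follow_path(triples, left, left_paths[kind])
                  & follow_path(triples, right, right_paths[kind]))
        if shared:
            total += weight
    return total


# Hypothetical data: composers reach works directly, authors reach them via books.
triples = {
    ("ex:composer/1", "ex:composes", "ex:work/7"),
    ("ex:author/4", "ex:creates", "ex:book/2"),
    ("ex:book/2", "ex:contains", "ex:work/7"),
    ("ex:author/5", "ex:creates", "ex:book/3"),
    ("ex:book/3", "ex:contains", "ex:work/8"),
    ("ex:composer/1", "ex:authorityId", "authority:123"),
    ("ex:author/4", "ex:authorityId", "authority:123"),
}
COMPOSER_PATHS = {"work": ["ex:composes"], "authority": ["ex:authorityId"]}
AUTHOR_PATHS = {"work": ["ex:creates", "ex:contains"], "authority": ["ex:authorityId"]}
WEIGHTS = {"work": 1.0, "authority": 5.0}  # user-chosen magnitudes

ranked = sorted(
    ["ex:author/4", "ex:author/5"],
    key=lambda a: context_score(triples, "ex:composer/1", COMPOSER_PATHS,
                                a, AUTHOR_PATHS, WEIGHTS),
    reverse=True,
)
```

the two saltsets reach the same work through differently shaped paths, which is the point of defining paths per saltset and not per dataset schema.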
this allows for fine-grained differentiation of significance between the association of particular alignment anchors with different contextual items. for instance, consider a scenario where item a is of small but non-negligible interest in terms of offering identification cues for the alignment of two particular saltsets; item b may be of somewhat greater interest, but item c, perhaps an identifier in an external authority file, may trump the presence of both items a and b combined. this situation is easily accommodated by simply ensuring that item b has a greater weighting than item a, and that item c has a greater weighting than the aggregated weightings of items a and b.

the alignment process is iterative in nature. in constructing matches between entities, changes in the significance of contextual items may become apparent, and new nodes of contextual significance can be discovered as a corpus of match decisions takes shape. this may prompt the user to reconfigure the contextual paths and weightings associated with the saltsets, which may in turn support the discovery of further match instances.

match decisions

user match decisions are represented by rdf triples stored in a dedicated area (or named graph) within the triplestore.

fig. structure of match decisions produced by salt to reflect decisions made by a user. all match decisions produced by a given user are published to a dedicated named graph specific to that user in the triplestore

a match decision ties together two match participants, each of which is the focal topic of alignment for their respective saltset. these match decisions, which may encode confirmations or disputations of a match between the two match participants, are represented as sub-graphs within the named graph. the provenance of each decision is captured by storing associated metadata identifying the user, the date and time, and the reason provided for the decision (fig.
), thus accommodating further scholarly activity in the form of replication, agreement, and dispute between different domain experts analysing the same data. referencing the corpus of match decisions created by a specific individual is simplified by storing decisions in named graphs specific to each user. this storage strategy has two important benefits: first, the corpus of match decisions generated in the musicologist’s alignment activities is represented as a coherent object of scholarly output, addressable via the uri associated with the named graph, facilitating subsequent scholarly exchange; second, the named graphs function as a rudimentary trust model, whereby an application building upon the match decision structures can be configured to accept the decisions of users deemed “trustworthy” by the application administrator as valid, while preferring to neglect untrusted match decisions.

as a consequence of the abstract nature of the match decision mechanism, instance alignments are transitive within and across saltsets. for saltsets $X$, $Y$, and $Z$, and the match decision relation $R$, let $U = X \cup Y \cup Z$. then

$$\forall x, y, z \in U : (x \, R \, y \land y \, R \, z) \Rightarrow x \, R \, z$$

thus, when instances are matched between saltsets $X$ and $Y$, and further matches are specified between saltsets $Y$ and $Z$, implicit instance associations may be inferred between saltsets $X$ and $Z$. such match decisions may themselves be configured to function as contextual items in further alignment activities, bootstrapping the task of aligning saltsets $X$ and $Z$.

fig. linked match decisions forming match chains. entities contained within the shaded area are connected by match decisions generated by a trusted user and are therefore included in the match chain. entities outside this area are connected by decisions generated by an untrusted user and are thus excluded. match decisions are stored within named graphs specific to each user, denoted here by direct connection to the dashed nodes
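the combination of transitivity and per-user trust described above can be sketched as a graph search over confirmed decisions. the `MatchDecision` record, its field names, and the sample users and entities are hypothetical; in particular, treating a confirmed match as symmetric is an assumption of this sketch, since the text states only transitivity.

```python
from collections import defaultdict, namedtuple

# One record per decision, carrying the provenance fields named in the text.
MatchDecision = namedtuple("MatchDecision", "left right confirmed user when reason")


def match_chain(decisions, start, trusted_users):
    """Entities reachable from `start` via confirmed decisions made by trusted users."""
    neighbours = defaultdict(set)
    for d in decisions:
        if d.confirmed and d.user in trusted_users:
            neighbours[d.left].add(d.right)
            neighbours[d.right].add(d.left)  # assumption: matches are symmetric
    chain, stack = {start}, [start]
    while stack:
        for other in neighbours[stack.pop()]:
            if other not in chain:
                chain.add(other)
                stack.append(other)
    return chain


decisions = [
    MatchDecision("ems:composer/1", "slickmem:author/4", True, "alice", "2016-01-01T10:00", "same name and works"),
    MatchDecision("slickmem:author/4", "slickmem:composer/9", True, "alice", "2016-01-01T10:05", "shared work"),
    MatchDecision("ems:composer/1", "slickmem:author/5", False, "alice", "2016-01-01T10:07", "different person"),
    MatchDecision("slickmem:composer/9", "other:person/3", True, "mallory", "2016-01-02T09:00", "unverified"),
]
chain = match_chain(decisions, "ems:composer/1", trusted_users={"alice"})
```

filtering on the deciding user before the search mirrors the named-graph trust model: an untrusted decision never contributes an edge, so it cannot extend a chain even indirectly.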
Providing unified access to the matched corpora using the data model and framework

The process of exploiting the published alignment outcomes in order to provide a unified view of the underlying data revolves around the concept of a match chain. This consists of a series of entities linked by match decisions generated by a trusted source. These entities may be included in any saltset subject to alignment activity. Taking the example of our case study in early music, a particular match chain may consist of an EMS composer entity linked to a corresponding SLICKMEM author entity, which in turn is linked to a number of different SLICKMEM composer entities, each associated with a work composed by the person being described (Fig. ).

The person setting up a unified view can easily limit the graph returned based on an assessment of the trustworthiness of the users making the matches. As match decisions are stored within separate named graphs based on the user responsible for decisions, this is simply a case of requiring the match decision nodes linking the entities participating in a match chain to be included within a set of named graphs corresponding to users considered reliable.

On providing semantic alignment and unified access to music library metadata

One of the goals of the work presented here is to harness the power of semantic technologies without requiring the user to be proficient in their use. To support this, our design allows the delivery of a simple JSON object that can serve as the basis of a website presenting the unified view. This JSON object is generated by extracting the information associated with all entities related by a given match chain and retains this information stripped of its semantic context.
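As a rough illustration of what such a JSON object might look like, the sketch below flattens `(uri, predicate, object)` rows for the entities of one match chain into a plain structure keyed by short property names. The row layout, the example URIs and values, and the choice to keep only the local name of each predicate are assumptions made for the example; the source does not specify the shape of the delivered JSON.

```python
import json


def local_name(uri):
    """Keep only the fragment or last path segment of a URI."""
    return uri.rstrip("/#").rsplit("#", 1)[-1].rsplit("/", 1)[-1]


def to_plain_json(rows):
    """Group (uri, predicate, object) rows per entity and drop the
    semantic context of the predicates, keeping short names only."""
    entities = {}
    for uri, predicate, obj in rows:
        entities.setdefault(uri, {}).setdefault(local_name(predicate), []).append(obj)
    return json.dumps(entities, indent=2, sort_keys=True)


# Hypothetical rows for two entities that share a match chain.
ROWS = [
    ("http://example.org/ems/composer/1",
     "http://www.w3.org/2000/01/rdf-schema#label", "Composer A"),
    ("http://example.org/slickmem/author/7",
     "http://www.w3.org/2000/01/rdf-schema#label", "A, Composer"),
    ("http://example.org/slickmem/author/7",
     "http://example.org/vocab/birthYear", "1470"),
]

unified_view = to_plain_json(ROWS)
```

A web page template can then iterate over `json.loads(unified_view)` without any knowledge of RDF or of the fact that the two entities originate from separate datasets.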
A web designer wishing to build on a corpus of match decisions is thus able to craft a website presenting data associated with the various entities described in the various datasets, linked by a trusted domain expert, without having to worry about the underlying semantic structure, or indeed about the fact that the datasets were separate in the first place.

Semantic alignment and linking tool

Architecture

We now present a semantic alignment and linking tool (SALT) that implements the model and design introduced in the previous section. Our tool comprises several interacting software components (Fig. ):

• A web client presenting the alignment tool’s user interface, implemented in HTML/CSS, JavaScript, and the jQuery library.
• An application server implemented in Python using the Flask web application framework (http://flask.pocoo.org).
• An RDF triplestore hosted on the OpenLink Virtuoso database engine platform.

Fig. Semantic alignment and linking tool (SALT) system architecture

Communication between the client and server uses the HTTP and WebSocket protocols. The triplestore is accessed and updated using SPARQL [ ], an RDF query language analogous to SQL in relational databases that enables the retrieval and manipulation of data by specifying patterns of interlinked triples. In our implementation, these queries are performed via the SPARQLWrapper Python module (http://rdflib.github.io/sparqlwrapper). A public SPARQL endpoint enabling direct querying of the data is also available. SALT accesses the outcomes of the alignment process via the collection of named graphs containing match decisions that, in turn, link entities from the source datasets. The access control layer provided by Virtuoso can grant the back-end process serving the user interface privileged access to the data (e.g. if SPARQL updates are to be made available through this interface), or restrict public access to sensitive sections of the data at the SPARQL endpoint.
A unified user interface offering a combined view of the underlying data, making use of the same SPARQL endpoint, is described in Sect. .

Configuration

Four steps are required to prepare an RDF dataset for use in SALT. These steps define the entities in the data that are to be aligned, and augment the initial set of RDF triples with additional metadata.

1. Ingest dataset into the triplestore.
2. Assign entities in the dataset to a particular saltset.
3. Calculate string similarity measures between the entities in the saltsets to be aligned.
4. Specify contextual paths between the entities in the saltset and any contextual items in the dataset.

First, the datasets must be loaded into the triplestore. Each dataset is loaded into its own named graph; this is done to facilitate updates and additions to the data. For instance, when new EMS data becomes available (e.g. after a new episode is aired), it is easy to accommodate this by simply reloading the EMS graph with the latest version of the dataset, without affecting any other data housed in the triplestore. This separation into distinct named graphs also facilitates permission management when controlling public access to sensitive data (e.g. due to copyright issues).

Once ingested into the triplestore, entities within the dataset are assigned to a saltset. These assignments are performed by inserting a triple asserting that a given entity is in a particular saltset, e.g.

    bbc:p fkxm salt:in_saltset saltset:ems_composers .

where bbc:p fkxm is a unique identifier for a particular composer: Bartolomeo Tromboncino, the murderous trombonist. Note that the assignments are performed on the schema level, so that, for instance, only a single action is required to add all composers from the EMS dataset to saltset:ems_composers.
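The schema-level assignment can be pictured as one operation that emits a `salt:in_saltset` triple for every entity of a given class. The sketch below works on plain Python tuples rather than a triplestore; the class name `ex:Composer`, the entity identifiers, and the helper `assign_to_saltset` are hypothetical and only illustrate the idea of a single action covering all composers.

```python
RDF_TYPE = "rdf:type"


def assign_to_saltset(triples, entity_class, saltset):
    """Emit one salt:in_saltset triple per entity of `entity_class`."""
    members = sorted({s for s, p, o in triples if p == RDF_TYPE and o == entity_class})
    return [(entity, "salt:in_saltset", saltset) for entity in members]


# Hypothetical content of the EMS named graph.
EMS_GRAPH = [
    ("ems:composer_1", "rdf:type", "ex:Composer"),
    ("ems:composer_2", "rdf:type", "ex:Composer"),
    ("ems:work_1", "rdf:type", "ex:Work"),
]

assignments = assign_to_saltset(EMS_GRAPH, "ex:Composer", "saltset:ems_composers")
```

Because the selection is driven by the class rather than by individual identifiers, reloading the EMS graph with new episodes only requires running the same assignment again.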
EMS episode: http://www.bbc.co.uk/programmes/b zddcm.

In order to enable match suggestions by string similarity of entity labels, string distances are precalculated between each label of any two datasets to be compared. This calculation is performed automatically using a script that applies different variations of the Levenshtein edit distance metric [ ] as implemented by the fuzzywuzzy Python module (http://github.com/seatgeek/fuzzywuzzy). The script then reifies the fuzzy string matches as distinct fuzzymatch entities with two match participants (the URIs of the two entities whose labels are being compared), a match algorithm (indicating the variation of the edit distance used in the comparison), and a match score. The resulting triples expressing string similarity are then ingested into a dedicated named graph in the triplestore.

Specifying contextual information

Contextual information is used by SALT in two different ways: as visual hints to the user when a particular entity is selected and as a relevance criterion that affects display order when the user requests candidates for alignment sorted by contextual proximity. The process of contextual configuration required is summarised in Fig. .

Fig. Process of contextual configuration, from user specification to generation of context-ordered alignment candidates

We have adopted JSON-LD for use in the configuration of contextual information in order to minimise the degree of expertise in semantic web technologies required from the SALT administrator. For each saltset, a list of context paths specifying the schematic relationship between the entity to be aligned and a potential alignment point is specified. SALT generates a SPARQL query from the specified configuration by serialising the JSON-LD into RDF Turtle [ ] syntax and applying textual transformations on the resulting triples. The outcomes of this query are used to retrieve instances of the specified contextual relationship across the two saltsets to be compared, in order to generate candidate alignment suggestions.

For the purpose of sorting these alignment candidates by contextual proximity, it is important to differentiate between the relative significance of particular kinds of contextual items. For instance, two composer entities sharing a year or place of birth is of minor, but non-negligible, interest, whereas two entities sharing a MusicBrainz ID is a very strong cue that they are likely representations of the same real-world target. In order to address these relative differences in significance, a user-specified weighting is provided for the significance of each contextual item in the context of aligning two specified saltsets during configuration. These weightings are aggregated for each potential combination of cross-saltset entities so that, for instance, an alignment candidate combining entities that share both a year and a place of birth trumps another candidate with entities sharing merely the birthplace, whereas another candidate with entities that share neither birth year nor place, but do share a MusicBrainz ID, trumps both.

User interface

Matching modes

The web client front-end comprises two scrollable lists corresponding to the two saltsets involved in a given alignment context. Depending on the presentation mode specified by the user in a drop-down menu (labelled . in Fig. ), these lists are either independent of one another, presenting the saltsets in their entirety with entities sorted alphabetically by their labels (unmatched lists mode), or the lists are related, so that corresponding rows across lists present a suggested match.
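The weighted aggregation of contextual items that orders contextually matched candidates can be sketched as follows. The weights, the item type names, and the candidate pairs are invented for illustration; the only property carried over from the text is that the MusicBrainz ID weight exceeds the combined weight of birth year and birth place, and that candidates below a configurable threshold are not shown.

```python
def contextual_score(shared_items, weights):
    """Sum the configured weights of the contextual items two entities share."""
    return sum(weights.get(item_type, 0) for item_type in shared_items)


def rank_candidates(candidates, weights, threshold=1):
    """Score every candidate pair, drop those below the threshold,
    and sort the remainder by descending contextual score."""
    scored = [(contextual_score(shared, weights), pair)
              for pair, shared in candidates.items()]
    kept = [(score, pair) for score, pair in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[0], reverse=True)


# Hypothetical weighting: the identifier outweighs both birth items combined.
WEIGHTS = {"birth_year": 1, "birth_place": 2, "musicbrainz_id": 10}

# Hypothetical cross-saltset candidates and the item types they share.
CANDIDATES = {
    ("ems:a", "slickmem:a"): {"birth_place"},
    ("ems:b", "slickmem:b"): {"birth_year", "birth_place"},
    ("ems:c", "slickmem:c"): {"musicbrainz_id"},
    ("ems:d", "slickmem:d"): set(),
}

ranking = rank_candidates(CANDIDATES, WEIGHTS)
```

In this configuration the pair sharing only a MusicBrainz ID is listed first, the pair sharing both birth items second, the pair sharing only a birthplace third, and the pair sharing nothing is filtered out by the threshold.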
These suggestions are made either by similarity of textual labels ( .), in the exact string match and fuzzy string match modes, or according to contextual similarity ( .) (contextual match mode). Scores based on string distance measures or on weighted contextual significance are displayed next to each suggested match. The user may scroll through the lists row-by-row (locked scrolling), or unlock the lists and scroll them independently. In order to avoid a combinatorial explosion and to exclude irrelevant information, only those alignment combinations passing a threshold configured separately for textual and contextual similarity are presented to the user.

Fig. SALT user interface: exact string matches (left) and contextual matches (right)

Searching, filtering, and contextual hinting. Regardless of the currently selected matching mode, the user has several options to filter information: textual search, string match filtering, and contextual item filtering.

Textual searches may be performed via a search box that accepts regular expression inputs. The constraint formulated in the search box may be targeted to either or both lists using a radio button selection. String match filtering is performed by double-clicking a label of interest in either list; upon this action, entities in both lists are filtered to show labels that exactly match that of the target entity. Upon selecting an entity in either list by single-click, all contextual items associated with the entity are displayed in a UI component situated beneath the two lists ( .). For each associated contextual item, a count is displayed indicating how many entities in the other saltset share this specific contextual item.
Further contextual hinting is applied by highlighting the labels of context-sharing entities within the list, and by indicating via the appearance of up and down arrows next to the scroll bar when contextual matches exist above or below the current scroll position. Finally, contextual filtering may be applied by clicking on any item in the contextual display component, constraining the lists to saltset entities sharing the particular item.

Confirming, disputing, and bulk matches. Any combination of inter-saltset entities may be selected in order to confirm or dispute a match between the two entities ( .), generating one-to-one alignment decisions. Additionally, in any matching mode (i.e. all modes except unmatched lists), row-wise and targeted bulk confirmations may be performed. Row-wise bulk confirmations ( .) may be performed when no item is selected and result in confirmations of matches between every pair of entities sharing a row across the lists (i.e. many-to-many alignment). Targeted bulk confirmations may be performed when one entity is selected in either list (but not both) and produce confirmations of matches between that entity on one side, and every entity on the opposing list (i.e. one-to-many alignment). Prior to either of these actions, individual entities may be unlisted in order to withhold them from the bulk confirmation process. Entities that have been subject to a prior match decision are unlisted automatically. The labels of unlisted entities are visually de-emphasised, see Bartolomeo Tromboncino in ( .), in order to clearly demarcate their status while remaining accessible to the user (e.g. in order to re-list them).

By combining searching, filtering, and unlisting, it is possible to rapidly assert a large number of match decisions. A more fine-grained approach using confirmations and disputations of individual items across the saltsets may then be applied in order to cover more abstruse cases of alignment.
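The two bulk confirmation variants and the effect of unlisting can be summarised in a few lines. This is a simplified model of the behaviour described above, not the client code: the function names, the list-of-pairs representation of rows, and the entity labels are hypothetical.

```python
def rowwise_confirmations(rows, unlisted=frozenset()):
    """Confirm every pair sharing a row, skipping rows that contain
    an unlisted entity on either side."""
    return [(left, right) for left, right in rows
            if left not in unlisted and right not in unlisted]


def targeted_confirmations(selected, opposing_list, unlisted=frozenset()):
    """Confirm the selected entity against every listed entity
    on the opposing side (one-to-many)."""
    return [(selected, entity) for entity in opposing_list if entity not in unlisted]


# Hypothetical filtered lists as they might appear after a search.
ROWS = [("author:1", "composer:1"), ("author:2", "composer:2"), ("author:3", "composer:3")]
UNLISTED = {"composer:2"}

rowwise = rowwise_confirmations(ROWS, UNLISTED)
targeted = targeted_confirmations("author:1", ["composer:1", "composer:2", "composer:3"], UNLISTED)
```

Unlisting `composer:2` withholds it from both operations, which is how a user can exclude a doubtful entry before confirming everything else in one action.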
Any entities that have been subject to at least one match decision are subtly indicated by a translucent check-mark, in order to help the user “keep their place” during alignment. When a match decision is made via any confirming or disputing action, the user is prompted to enter a reason for the decision. The match decision is then reified with its own persistent URI, according to the specification in Sect. . . These triples are then passed to the application server and persistently stored in the triplestore. The triples are transmitted from the web client via the WebSocket protocol in order to allow the user to continue their activities without being interrupted by page reloads.

Application to case study: alignment of early music corpora

A musicologist used SALT to perform corpus-scale alignments of the early music datasets described in Sect. . . As a first step, this involved aligning entities within the SLICKMEM dataset. These data were sourced to a large degree from a traditional library context (the British Library catalogue) in which the basic unit of description was the book; as such, the names of persons contributing to the creation of the book (“book authors”) were subject to a tightly controlled vocabulary enforced by reference to authority files, whereas names associated with composers were less carefully controlled in the source data and thus more variable and ambiguous. As a consequence, the SLICKMEM dataset publishes book authors as distinct entities, each author with their own persistent URI, whereas composers exist “merely” as name strings attached to works using rdfs:label. This limitation had to be addressed if a robust link from a work presented on the BBC radio programme to its representation in SLICKMEM and thus to the corresponding digitised score was to be established.
As such, we have bootstrapped distinct composer entities, minting a new persistent URI to represent the composer of each individual work, and associating the composer’s name with the new entity, rather than with the work directly. The musicologist then aligned these new SLICKMEM composer entities with the more tightly controlled SLICKMEM authors using SALT, in order to address the ambiguity inherent in representing composers with a distinct entity for each of their works.

For the reason discussed above, the digitised resources available through SLICKMEM are book-centric, rather than work-centric. Thus, the musicologist’s next task was to align the composers represented in the EMS data with the authors represented in the SLICKMEM data. The combined outcomes of both alignment activities enabled the robust linking from items of EMS programme data to SLICKMEM resources.

In terms of the within-SLICKMEM alignment, the musicologist confirmed matches between SLICKMEM composers and authors, involving a total of distinct authors (i.e. . works per author on average). The musicologist created match decisions between EMS composers and SLICKMEM authors. Thus, out of distinct composers ( %) featured on the EMS programme are mapped to digitised resources and further metadata via SLICKMEM. At least one of these composers features in of the episodes available at the time of analysis ( %). Thus, just under two thirds of all EMS episodes can be augmented with additional metadata in a unified view of the datasets (Sect. ), a respectable number given the narrow chronological scope of two centuries in the SLICKMEM dataset, compared with the broader musical timeline, from mediaeval times to the baroque and beyond, presented on EMS.
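The bootstrapping of composer entities described in this section, one new entity per work with the name string moved onto it, can be sketched as below. The URI scheme, the hash-based identifier, and the predicate `ex:composer` are assumptions for the example; the source states only that a persistent URI is minted for the composer of each individual work.

```python
import hashlib


def mint_composer_uri(work_uri, base="http://example.org/composer/"):
    """Derive a stable, hypothetical composer URI from the work URI."""
    digest = hashlib.sha1(work_uri.encode("utf-8")).hexdigest()[:12]
    return base + digest


def bootstrap_composers(work_names):
    """For each (work URI, composer name string) pair, link the work to a
    newly minted composer entity and attach the name to that entity."""
    triples = []
    for work_uri, name in work_names:
        composer = mint_composer_uri(work_uri)
        triples.append((work_uri, "ex:composer", composer))
        triples.append((composer, "rdfs:label", name))
    return triples


# Two hypothetical works carrying the same ambiguous name string.
WORKS = [("http://example.org/work/1", "Anon."), ("http://example.org/work/2", "Anon.")]
bootstrapped = bootstrap_composers(WORKS)
```

Both works receive their own composer entity even though the name strings are identical, so deciding whether the two entities denote the same person is left to the later alignment against the controlled author entities.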
User evaluation

A user study was conducted in order to evaluate the data model, alignment framework, and SALT user interface beyond the context of the musicologist performing the alignments detailed in Sect. .

Sampling frame

Eight academic musicologists, including one based at an international music library institution, one from a mediaeval manuscript archive, and another with formal background in library and information sciences, participated in the user study. All participants possessed extensive domain knowledge in early music. They were recruited using snowball sampling [ ], whereby initially contacted participants were asked to recommend others with similar expertise.

In a post-evaluation questionnaire, each participant self-reported considerable expertise in early music: on a scale of (“I have never heard of early music”) to (“I am an expert in early music”), the median response was . (minimum: ). Further, each participant indicated a high degree of familiarity with the author and composer names encountered during the evaluation: on a scale of (“I did not recognise any of the names”) to (“I recognised almost all of the names”), the median response was also . (minimum: ). In terms of technical background, responses were slightly more varied: rating their technical expertise in using computers, on a scale of (“I am a novice computer user”) to (“I am an expert computer user”), the median response was (minimum: ); and rating their familiarity with digital musicology on a scale of (“I have never heard of digital musicology”) to (“I am very familiar with digital musicology”), the median response was , with two individuals indicating and , respectively. Our sample thus consisted of individuals with strong expertise in early music, varying in terms of their technical background.

Design and procedure

The evaluation consisted of a practice session and two evaluation tasks, each presenting a subset of the SLICKMEM author-to-composer alignment task.
This was followed by a questionnaire investigating participants’ familiarity with digital musicology and with early music, and posing several questions relating to their user experience of the evaluation and of the alignment tool.

Participation took place remotely using participants’ own computers. Evaluation sessions were scheduled to ensure the researcher overseeing the evaluation was available for immediate clarification of questions about the task via e-mail or through video conferencing. Having indicated their consent for voluntary participation in the evaluation study, participants first read an instructions page, detailing the task objectives and the functionality of the alignment tool. Participants then completed a practice task presenting a subset comprising SLICKMEM authors against the complete list of SLICKMEM composers. The practice task lasted for min, with a timer indicating the remaining time at the top of the interface. During the practice task, participants were encouraged to try out the various functionalities of the tool, as described in the instructions page.

After the practice session, participants completed two experimental tasks, in sessions lasting min each. In both tasks, participants were presented with a distinct subset of SLICKMEM authors (i.e. authors across both sessions, all distinct from those presented during the practice session). In both cases, as in the practice session, the authors were presented against the complete list of SLICKMEM composers. All SLICKMEM authors had been previously matched by the musicologist involved in the development of the tool (see Sect. ), ensuring the existence of matching composers.

In task , participants completed the evaluation with a limited subset of the tool’s modes, working with either: only unmatched lists mode; suggestions based on string similarity, plus unmatched lists mode; or suggestions based on contextual similarity, plus unmatched lists mode.
Participants were randomly assigned to one of these conditions. During the practice session and in task , intended as our control condition, participants each had access to the full capabilities of the tool.

Analysis and results

An analysis of the generated match decisions was conducted, employing the evaluation principles outlined by Paulheim et al. [ ] (see Sect. ). The F-measure was determined by calculating precision and recall as defined against a reference set of match decisions generated by the musicologist performing the SLICKMEM author-to-composer alignment in the case study (Sect. ). Two cost functions were calculated: matches per interaction, where the number of distinct match confirmation actions (clicks on a single instance or bulk confirmation button) was considered in terms of the match decisions generated, and matches per second, where the average time required for each generated match decision was considered.

Over the course of the user evaluation, our participants discovered out of a possible valid author–composer matches (recall: . ), generating a further “erroneous” matches (according to the judgement of the musicologist responsible for the early music use case alignments; precision: . ), giving an overall F-measure of . ; this corresponds to distinct author–composer matches identified by the combined efforts of our participants.

Fig. Precision, recall, and F-measure for each evaluation task

The distribution of individuals’ performances in terms of precision, recall, and F-measure is summarised in Fig. . Precision was high throughout, as would be expected given participants’ domain expertise in early music. Encouragingly, precision remained consistently strong regardless of matching modes employed, suggesting that a greater use of match suggestions and bulk confirmation does not negatively affect the accuracy of the alignment activity.
Recall varied consistently among individuals, corresponding to the variation in alignment efficiency (cost per match decision) discussed above; note that participants were limited to min per interaction task and thus that less efficient use of the tool necessarily resulted in a lower recall score.

The number of matches generated per interaction, and per second, are visualised in Fig. . There is a considerable degree of variability between participants, ranging from to matches per interaction, and . – . matches per second. This variability is expected for task , given the differences in modes available to participants. The retention of this variability into task was due to a tendency of several participants to remain in the unmatched lists mode that reduced opportunities for greater efficiency via match candidate suggestions and row-wise bulk confirmation. The variation may also reflect differences in the analytical approach between participants, perhaps due to certain participants being more thorough in their confirmation of match candidates.

Fig. Cost measures for each evaluation task. Top: number of matches generated per confirmation interaction. Bottom: number of matches generated per second

One participant’s performance particularly demonstrates the value of our approach. In task , the participant only had access to the unmatched lists mode and thus could not benefit from the advanced functionalities of the tool. During this task, the participant generated matches, at matches per interaction, and . matches per second. In task , the participant made extensive use of the full functionality of the tool, generating matches, at matches per interaction, and . matches per second. The participant was thus able to roughly double alignment efficiency, while decreasing sixfold the number of interactions required, demonstrating the value of this approach when the capabilities afforded by the tool are used to the full.

User experience

Paulheim et al. explicitly placed assessment of the user experience out of scope in their evaluation guidelines, as the aim is to fully automate the evaluation procedure for the interactive matching track of the Ontology Alignment Evaluation Initiative, making measurements of the user experience difficult. However, they note that, for interactive matching tools providing a user interface, measuring user experience is a useful complement to the measures they outline. We attempted to capture these aspects by asking participants to reflect on their user experience in the post-evaluation questionnaire. Participants were asked to rate their perception of the clarity of the task, and usability of the tool, by responding to the statements “I found the instructions and objectives for this task easy to understand” and “I found this tool to be easy to use” on five-point Likert scales, ranging from “strongly disagree” via “neutral” to “strongly agree”. Participants responded to the statement, “I could usefully incorporate such a tool into my research”, by choosing a response from “no”, “uncertain”, or “yes”. Participants were able to elaborate their responses to each question via free-text fields and were given the opportunity to include any further comments at the end of the questionnaire.

As one of the goals of this evaluation, we intended to investigate the relative utility of the different matching modes, hence the distinction between the different modes available, according to experimental condition. Unfortunately, participants tended to remain in the default unmatched lists mode, thus not benefiting from SALT’s ability to suggest match candidates, even when other modes would have been available to them; one participant explicitly stated in the post-evaluation questionnaire that the other modes would have been explored if there had been more time available.
This tendency to remain within the default mode was particularly common among participants with lower self-reported computer literacy and familiarity with digital musicology. Cases where participants fully utilised the SALT functionality did indeed result in the greatest alignment efficiency (Sect. . ).

Responses regarding task clarity were mixed, with five participants indicating agreement that the instructions and objectives were easy to understand, one remaining neutral, one participant disagreeing, and a further one disagreeing strongly. Similarly, views on the tool’s usability were variable, with five participants agreeing that the tool was easy to use, two disagreeing, and one disagreeing strongly. It is possible that some of this variability may relate to differences in the participants’ technical backgrounds and experience with digital scholarship; the participant strongly disagreeing in both cases also reported a lack of familiarity with digital musicology and indicated in comments a confusion about the tool’s purpose. This participant only completed the practice task of the evaluation, successfully generating match decisions that all correctly matched our “ground-truth” set.

Four participants indicated that they could see a role for this sort of tool in their own research. Their elaborating responses detailed applicability in other alignment contexts, arising when building digital resources from original sources and when mapping potentially noisy user input against authority records, as well as a means of handling attribution questions. The remaining participants indicated concerns about scalability, or simply stated that they saw no applicability to their own work.
Although all participants were able to generate match decisions with the tool, most had suggestions for improved usability. These included requests for increased font size (currently, the contextual item view panes use very small font sizes in order to fit more information on-screen); the inclusion of keyboard shortcuts, to reduce reliance on mouse clicks; the ability to explicitly select multiple items in either list for one-to-many and many-to-many instance matches (currently, this type of functionality is achieved implicitly, by a combination of filtering, unlisting, and bulk confirmation); and an explicit undo function for single and bulk confirmation operations. It is clear from these responses that there is a learning curve to the current user interface that must be overcome, particularly if the tool is to target users lacking technical expertise. These insights will provide useful guidance to future development work on the user interface (Sect. ).

Limitations

Our use case in early music necessarily narrowed the pool of domain experts available for the evaluation of our system. A larger-scale evaluation on a broader knowledge domain, employing a correspondingly greater number of participants in order to obtain a more fine-grained understanding of the utility of the system and its different matching modes, is envisaged for future work. Nevertheless, the present evaluation serves to demonstrate the value of the data model and design underlying our approach; when the tool’s functionalities are fully exploited, highly efficient and precise alignment progress can be achieved.

Providing unified access to the matched corpora

As the match decisions generated by SALT users are published as linked data, the combined information available within the aligned datasets can be queried using SPARQL.
However, given a target audience of musicology scholars and laypersons with interests in early music, a more familiar means of access that does not assume knowledge of linked data technologies as a prerequisite is clearly required. Accordingly, we now present the Semantic Linking of BBC Radio (SLoBR) demonstrator, a web application inspired by the look and feel of the existing EMS web resource while providing access to biographical information, bibliographical catalogue data, and digitised musical score available via alignment to the SLICKMEM dataset, and via further external datasets made available by this alignment. In designing the architecture to support this demonstrator, we aim to divorce aspects catering to this particular use case from the generic aspects involved in the outcomes of any application of SALT, regardless of alignment context. To support this, we have developed tooling that facilitates the creation of unified views across any saltset combinations linked by match decision structures generated by SALT. This tooling serves to demonstrate the flexibility of our model (Sect. ), and the reusability of the data it produces, within and beyond the domain of our present use case.

Implementation

The demonstrator consists of an HTML/CSS front-end with a design loosely based on the BBC GEL website design specification (http://www.bbc.co.uk/gel), and a back-end server using the Flask web application framework. The Flask server handles requests by performing template filling using the results of parameterised SPARQL queries to generate the front-end views.

Basic navigational vectors are supported by a generic match chain walking query (SPARQL query ). This query returns all information associated with the URI of a specified “source” entity, as well as all information associated with any entities linked to the source entity by a chain of trusted match decisions (see Sect. . ).
in the early music demonstrator, these entities may each represent a person (ems composer, slickmem author or composer) or a work (ems work or slickmem work). the query stitches the datasets together according to the salt user’s alignment activity, enabling all related information to be displayed in aggregated, unified views.

we now step through sparql query to explain the process in detail. block a sets up the query parameters, specifying input and output variables, as well as the relevant datasets the query will be confined to in order to improve efficiency. line describes the variable bindings produced by the successful execution of this query, i.e. the shape of the results set. ?uri refers to the unique identifier of a particular entity in the match chain; this entity is retrieved from the named graphs listed as possible values of the ?contentgraphs variable in lines – . ?p and ?o refer to the predicates and objects associated with these entities; that is, the directly related information that we wish to retrieve. the {sourceuri} parameter encased in curly brackets on line specifies the entity uri that serves as an entry point to the match chain; it is filled by the flask server in response to the user’s actions on the web front-end using standard python string formatting prior to query execution and bound to the ?source variable when the query is run.

block b performs the match chain walking operation. the {trustedgraph} parameter encased in curly brackets on line specifies the named graph of trusted match decisions, as configured on the server and supplied prior to query execution. line retrieves all entities that share a match chain with the specified ?source uri.
this is achieved using a sparql property path (http://www.w .org/tr/sparql -query/#propertypaths) that constrains the query to patterns where the relationship between ?source and ?uri is such that ?source is a match participant in a match decision that also has a match participant ?uri, or which has an intermediary node that is involved in match decisions with both ?source and ?uri. the * operator allows for an arbitrary number of repetitions of this pattern, including zero, in which case ?uri simply takes the value of ?source, as a property path of length zero connects a node to itself.

note: while the subsection on arbitrary length path matching in the property path specification section of the w c recommendation on the sparql . query language states that “connectivity matching is defined so that matching cycles does not lead to undefined or infinite results”, complex alignment contexts involving very long match chains, or erroneous matches resulting in longer than expected chains, may significantly impact query performance; in such situations, the number of hops can be constrained using the property path syntax.

block c now extracts all information directly associated with the entities contained in the match chain, i.e. all properties (?p) and objects (?o) of any ?uri retrieved in block b. this part of the query is constrained to the graphs containing the datasets specified in block a, supporting efficient performance by avoiding the need to search the entire triplestore.

    # block a: specify datasets and source uri
    select distinct ?uri ?p ?o where {
      bind({sourceuri} as ?source) .
      values ?datasets { :ems :slickmem }
      # block b: find all uris in the match chain
      graph {trustedgraph} {
        ?source (:matchparticipant/^:matchparticipant)* ?uri .
      }
      # block c: retrieve all associated information
      graph ?datasets {
        ?uri ?p ?o .
      }
    }

sparql query : retrieve all information associated with a given entity (e.g., an author, a composer, a work) by unified query of the aligned datasets, via match chain walking. block a: supply source entity uri, and specify dataset graphs. block b: specify graph containing trusted match decisions, and retrieve the uri of any entity sharing a trusted match chain with the source entity. block c: capture information directly associated with each uri in the match chain.

an example results set is provided in table . here, the ems uri for the composer orlande de lassus is supplied as the starting point of the match chain walk. the query returns information on this ems composer, as well as on the matched slickmem author, orlando di lasso, and on distinct slickmem composers (one associated with each slickmem work attributed to the composer) with labels exhibiting variant spellings of his name. as each ?uri in the results set is part of the chain of match decisions, any one of them could serve as the input ?source variable to generate identical results.

table match chain walking: example results set produced by sparql query

?source | ?uri | ?p | ?o
ems:p dzzq | ems:p dzzq | mo:musicbrainz_guid | mbz: f c a- b - - c -c a d a d ef
ems:p dzzq | ems:p dzzq | salt:in_saltset | saltsets:ems_composers
ems:p dzzq | ems:p dzzq | slobr:contributor_role | composer
ems:p dzzq | ems:p dzzq | rdfs:label | orlande de lassus
ems:p dzzq | ems:p dzzq | rdf:type | dct:agent
ems:p dzzq | slickmem: - a aa a e fd cbfbf ec | salt:in_saltset | saltsets:slickmem_authors
ems:p dzzq | slickmem: - a aa a e fd cbfbf ec | rdfs:label | orlando di lasso
ems:p dzzq | slickmem: - a aa a e fd cbfbf ec | rdf:type | dbpedia:person
ems:p dzzq | slickmem: _creator | salt:in_saltset | saltsets:slickmem_composers
ems:p dzzq | slickmem: _creator | rdfs:label | orlan. di lassus
ems:p dzzq | slickmem: _creator | salt:in_saltset | saltsets:slickmem_composers
ems:p dzzq | slickmem: _creator | rdfs:label | orlandi di lassus
ems:p dzzq | slickmem: _creator | salt:in_saltset | saltsets:slickmem_composers
ems:p dzzq | slickmem: _creator | rdfs:label | orlando di lasso
ems:p dzzq | ... further entries associated with other saltsets:slickmem_composers ...

?source is an input variable, included here for illustrative purposes. any of the resulting ?uri values could equally serve as the ?source to produce the same results set, as they are all part of the same match chain.

the set of resulting triples is stored as a simple json object storing the predicates (?p) and objects (?o) associated with the entities in the match chain (?p as keys, and ?o as values). where there are multiple instances of a certain predicate, potentially with different values (e.g. an ems composer, slickmem author, and various slickmem composers may form a match chain, each with their own rdfs:label), all distinct values are stored against the predicate as an array. this representation strips out semantic context inherent in the result set, but makes the development of web interfaces significantly simpler. where the association of the predicates and objects to their source subject (i.e. the entity bound to ?uri in the result set) must be retained, for example if only the rdfs:label of the slickmem author is to be displayed as the authoritative name, a secondary json object, keyed first according to ?uri and then by ?p and ?o, is also available. information about the saltset membership of each entity in the result set is made available through these objects using the salt:in_saltset property, further facilitating entity class-specific view decisions.

. early music demonstrator interface

a web interface developed for the slobr early music demonstrator provides access to ems programme data, slickmem catalogue data and digitised score images, as well as further biographical data obtained by federated querying of dbpedia via linkedbrainz. the interface presents four interlinked views: episode view, episode listing, contributor view, and work view.

the episode view (fig. ) provides access to full details for a particular episode, as available from the ems programme resource’s json feed. this includes a synopsis of the content of the episode, an illustrative image (generally of the episode’s presenter, or of the featured composer, performer, location, or musical instrument), as well as a listing of the works and composers featured. the items in this listing link to the work and composer views, respectively. this linking makes use of the ems composer and work uris retrieved from the bbc’s feed; the combined data then becomes available using the match chain walking technique detailed in sect. . .

the episode listing (fig. ) provides a short summary of multiple ems episodes, ordered chronologically. by default, all episodes are summarised. the list may also be filtered from links situated on the episode, contributor, and work views, to “all episodes featuring” these contributors, this composer, or this work.
these links lead to filtered instances of the episode listing showing only those ems episodes featuring at least one of the indicated composer(s) or work(s), a navigational means unavailable from the bbc’s ems programme resource.

fig. episode view. . full episode details, including image associated with the episode in the bbc ems programme data. . list of works (with composers) featured in the episode. work names and composer names are clickable links that reference the corresponding work/contributor view pages using the match chain walking sparql query. . bbc broadcast data and inter-episode navigation. all episodes featuring these contributors links to the episode listing view, filtered to only show episodes presenting works by composers featured in the current episode (fig. )

the contributor view (fig. ) provides access to biographical data and a depiction of particular composers, as extracted from structured information of wikipedia articles via dbpedia. it is worth noting that neither the ems nor the slickmem datasets include dbpedia identifiers directly. however, these can be obtained via the musicbrainz database of crowd-sourced music metadata, accessible as linked data via the linkedbrainz project. this information is obtained on page load via a federated query, ensuring that the presented data reflect the latest versions of the corresponding information; alternatively, the data could be cached locally in the triplestore, in order to increase robustness against potential downtime of the external services’ sparql endpoints. by comparing the life and death dates of the composer, obtained from dbpedia, with the publication dates associated with the books described within the slickmem dataset, we arrive at a rough conception of the composer’s contemporaries: people who were involved in the creation of music books published during the composer’s lifetime.
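the date comparison just described can be sketched in a few lines of python. this is an illustration only: in the demonstrator the comparison is driven by sparql templates, and the function name, the (name, year) input format, the inclusive treatment of the birth and death years, and the example values below are all assumptions.

```python
def find_contemporaries(birth_year, death_year, publications):
    """Return the names of people with a book published within the lifetime.

    `publications` is a hypothetical list of (person_name, publication_year)
    pairs standing in for catalogue data; bounds are treated as inclusive.
    """
    names = {
        name
        for name, year in publications
        if birth_year <= year <= death_year
    }
    return sorted(names)


# Placeholder values, not real biographical or catalogue data.
example_publications = [
    ("author a", 1530),
    ("author b", 1570),
    ("author a", 1545),
    ("author c", 1560),
]
contemporaries = find_contemporaries(1500, 1560, example_publications)
```

a person appears once in the result however many of their books fall inside the lifetime, which matches the use of the result as a list of names to link to.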
the list of corresponding names is displayed as part of the contributor view, with links to the respective contemporary’s contributor view page that enable a novel navigation vector according to temporal proximity. future work could usefully include a geographical element, defining “contemporariness” along spatial as well as temporal dimensions. finally, the contributor view also includes a list of works by the composer that have been featured on ems, linking to the respective work views by virtue of the ems to slickmem works alignment, as well as the broadcast dates associated with each work’s appearance on the show, linking to the corresponding ems episode view.

fig. episode listing, displaying a multiepisode view; either all ems episodes, or a subset determined by user interaction context (here, the three episodes of the ems to have featured works by jacques arcadelt at time of writing)

the work view (fig. ) revolves around the display of digitised musical score pages from the book containing the respective work, obtained by following links in the slickmem dataset to images hosted by the emo digital repository at royal holloway university of london. a lazy loading technique is used to only load images for pages that currently need to be visible to the user, as well as the next few pages down the scroll list, in order to minimise server load; further images are loaded dynamically as the user scrolls down the list. navigation to the contributor view for the work’s composer, as well as to various filtered episode list views, is supported.
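the views above all consume the simplified json representation described in the implementation section. the following python sketch shows one way the (?uri, ?p, ?o) rows of a result set might be folded into the two objects mentioned there. it is an assumption-laden illustration rather than the project's code: the function names and example uris are hypothetical, and for simplicity every predicate maps to a list, even when only one distinct value is present.

```python
import json


def aggregate_by_predicate(rows):
    """Flatten (uri, p, o) rows into {p: [distinct o values]}.

    The subject is discarded, so semantic context is lost by design.
    """
    flat = {}
    for _uri, p, o in rows:
        values = flat.setdefault(p, [])
        if o not in values:  # keep distinct values, preserving first-seen order
            values.append(o)
    return flat


def aggregate_by_uri(rows):
    """Retain the subject: {uri: {p: [distinct o values]}}."""
    nested = {}
    for uri, p, o in rows:
        values = nested.setdefault(uri, {}).setdefault(p, [])
        if o not in values:
            values.append(o)
    return nested


# Hypothetical rows modelled loosely on the example results table.
rows = [
    ("ems:composer", "rdfs:label", "orlande de lassus"),
    ("ems:composer", "salt:in_saltset", "saltsets:ems_composers"),
    ("slickmem:author", "rdfs:label", "orlando di lasso"),
    ("slickmem:author", "salt:in_saltset", "saltsets:slickmem_authors"),
    ("slickmem:creator_1", "rdfs:label", "orlando di lasso"),
]
flat = aggregate_by_predicate(rows)
nested = aggregate_by_uri(rows)
flat_json = json.dumps(flat, sort_keys=True)
```

a view that needs only the slickmem author's label as the authoritative name would read it from the nested object, using the salt:in_saltset value to identify which entity is the author.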
certain functionalities presented here (the determination of contemporaries, and the retrieval of supplementary detail describing the composer from dbpedia) are driven by parameterising specialised sparql templates on the server, and thus, their implementation requires a degree of familiarity with semantic technologies; these functionalities are included in the demonstrator as illustrations of the kind of added value that is made available by interlinking with external linked data resources such as dbpedia. however, the navigational hyperstructure enabling the exploration of the unified corpus, from an ems episode, to the composers and works featured on that episode, to digitised images of the pages of books featuring those works, is entirely based around applying the match chain walking query (sparql query ) operating over a collection of match decisions generated by a domain expert using salt. this process is generic and abstracted from the types of entities involved in the particular alignment context: the process is the same operating over persons as it is operating over works. its implementation allows users to benefit from the advantages of linked data without requiring proficiency in semantic web technologies.

conclusions and future work

in this paper, we have detailed the design of a data model and framework to align and provide unified access to complementary datasets lacking common identifiers. tackling the ambiguity inherent in automatic alignment processes, and issues of scalability in fully manual alignment, our approach takes a middle path: automatically generating candidate match suggestions based on textual and contextual alignment cues, which are confirmed or disputed by manual application of human insight and domain expertise. this is accomplished by the definition of saltsets comprising sub-graphs of the available rdf datasets.
these structures describe alignment anchor entities whose textual labels are of relevance for match candidate generation based on textual similarity. we associate these entities via contextual paths to user-configurable contextual items, forming weighted graph traversals that provide the cues informing contextual match candidate generation. user match decisions, realised through additional rdf structures incorporated into the knowledge graph, confirm or dispute the generated match candidates, capturing provenance information from the responsible user, including their reasoning behind the decision. these match decisions may themselves serve as contextual items, driving iterative alignment activity; further, their transitive nature may be exploited in match chain walking to provide unified views of the aligned data.

we have presented the semantic alignment and linking tool (salt) and slobr (semantic linking of bbc radio) toolsets that implement this design, motivated by a use case in early music combining catalogue metadata and digitised score images from the british library and other sources with programme data from the bbc early music show. we have evaluated our approach in a user study employing eight musicologists with expertise in early music, determining highly significant increases to the efficiency of the alignment process when taking full advantage of the semantic affordances of our model.

the domain expert-verified linked data generated by salt can form the basis of novel music digital library systems with user interfaces presenting the underlying datasets as one union corpus, demonstrated here by the slobr web application. the immediate value of such views is in the availability of new connections, e.g. between works presented during an episode of the radio programme, pages of corresponding digitised musical score from the british library, and biographical information about the composers of the works extracted from sources such as dbpedia. by publishing these connections as linked data, we expect further value to accrue as reuse of the data in other contexts is facilitated.

fig. contributor view. . composer name labels associated with the entities within this match chain: the bbc use jacques arcadelt, the british library (via slickmem) use jacob arcadelt. . where any entity in the match chain is associated with a musicbrainz id, we can query linkedbrainz to retrieve a dbpedia id. this enables the retrieval of a depiction, birth and death dates, and a biographical blurb. . link to the episode listing, filtered to show only episodes featuring this composer (fig. ). . list of work titles and broadcast dates retrieved from bbc programme. titles link to corresponding work view (fig. ) via match chain walking; dates link to corresponding episode view (fig. ). . links to contemporaries’ contributor view pages. contemporaries are authors that have published books with publication dates that fall within this composer’s lifetime

for all the benefits of linked data, there are significant barriers to uptake: two of the greatest are the difficulties of publishing pre-existing data in a usefully linked way and, on the other hand, the complexity of exploring a semantic web dataset. in the latter case, the problem is often that, where the representation is rich enough to reflect the data meaningfully, the graph generated is complex and full of indirect paths that limit the use of generic browsers. we believe that both of these barriers can be greatly reduced by the use of shared semantics implicit in the underlying graph structure, used to perform data reduction offered as contextual views to the user.

while our tooling makes profound use of semantic technologies, it is desirable to minimise or eliminate any obligation on the user’s familiarity with such techniques.
the match chain walking mechanism employed by our model to create unified views of the data accomplishes this goal by reducing the required technical knowledge to the much more widespread json syntax. technical expertise at this level is sufficient for the configuration of salt, which is achieved through the use of json-ld; however, a basic knowledge of the semantic schema underlying the data is currently required in order to appropriately set up the saltsets and their associated contextual paths. algorithms for efficient path finding along a directed graph (the topological form of rdf data) are well studied in computer science [ ]. in future development, we plan to make use of such an algorithm in order to automate the configuration of contextual paths over the graph structure of the dataset, given two endpoints (i.e. the focal entity of the saltset, and a contextual item). further, we will tie specification of the endpoints and the weighting of contextual paths into the ui, facilitating iterative refinement by simplifying the interactive reconfiguration of the system as the alignment process unfolds.

fig. work view. . titles associated with work entities in the match chain. contributor name and recorded as link as per associated ems programme data. . score images (click for full-screen viewer) served by early music online via slickmem data associated with this match chain. . navigational links to work composer’s contributor view (fig. ), and to filtered episode listings (fig. )

several improvements to the user interface have been identified based on feedback during the user evaluation (sect. . ). we plan to address these concerns in future development in order to ease the learning curve of the interface, and address the current absence of convenient features including multiple selection and undo functionalities.
these developments will be guided by further iterative user evaluation sessions in order to ensure the tool’s usefulness to our target audience of domain expert users, while minimising requirements for additional technical expertise. additionally, some optimisations are required to handle datasets of significantly greater size than those in the early music deployment, including on-demand loading of data (e.g. using web sockets) rather than a complete load on client initialisation, and dataset segmentation or indexing when calculating string distances to avoid a combinatorial explosion in computation.

further, non-textual representations may be envisioned. we anticipate that multimodal information representations will be of particular interest in the context of digital musicology, for instance in the alignment of audio recordings with musical score. further plans involve the incorporation of feature vectors obtained from symbolic or audio representations of musical works, using techniques from music information retrieval and available from linked data sources such as the computational analysis of the live music archive (calma) [ ] project, to serve as contextual cues in the alignment task.

our work provides an illustration of the power of tooling that assists, rather than fully automates, the process of digital scholarship, respecting that alignment involves both groundwork in gathering and structuring data, combined with judgement which must always be elevated beyond the groundwork to the purview of the musicologist. as digital resources continue to expand in scope and quantity, the development of tools such as salt is imperative to overcome the increasing scale and complexity of this groundwork to ensure that the resources within remain accessible to the insight of scholarship.
in doing so, we accept and reinforce the observation that the act of study is iterative and ongoing; our data model can provide a means both of capturing the provenance of judgements over complex information structures, and of incorporating these judgements in new and dynamic data structures that can, in turn, provide the foundation for further insight.

for musicologists interrogating the aligned datasets, the simplest benefit comes in the form of clearer, richer explorations. with composers and places linked to external resources, it becomes possible to construct lines of enquiry based on chronology and geography without the information having been entered separately into each database. by accessing the linked ems and slickmem datasets and similar resources, scholars investigating recent performance history and practice can explore a wider variety of research questions, for instance the extent to which music programmes by london-based ensembles shape their repertory to the sources that are readily available in the british library. consumers and interested laypersons are provided with simplified access and novel navigation vectors that support exploratory browsing and serendipitous discovery. to generalise, linked datasets facilitate the study of the spread of music in historical and contemporary periods with far greater detail and depth than would be possible without the models and tooling presented here.

epilogue: linked data (re)use

when first designing the rdf structures representing the user’s match decisions, our considerations revolved around encapsulating inter-entity links and match decision provenances, in order to drive the iterative alignment process, facilitate unified access to the combined data, and provide an addressable handle to allow the collection of match decisions generated by a particular musicologist to function as a coherent object of scholarly output.
this last property of our approach ended up greatly facilitating the analysis of the user evaluation of the alignment tool, reported in sect. . using the same sparql endpoint that drives the various tools presented in this paper, along with some simple set relationship logic, it was easy to determine evaluation measures including precision and recall by determining the differences and overlaps between the collections of match decisions generated by each of our participants against the “ground-truth” set created by our resident domain expert responsible for the alignments in our case study (sect. ). the cost measures were also determined with the aid of sparql by reference to the captured provenance information associated with each match decision, allowing us to trivially compute the number of match decisions generated per second and per confirmation interaction. pleasingly, the flexibility and utility of the linked data approach in promoting and facilitating data reuse was thus reaffirmed in the process of writing this paper.

acknowledgements

this work was undertaken through the semantic linking of bbc radio (slobr) project, a subaward of the epsrc funded semantic media network (ep/j / ), with additional support from the ahrc transforming musicology project (ah/l / ), part of the digital transformations theme, and continued as part of the epsrc fusing semantic and audio technologies for intelligent music production and consumption (fast impact) project (ep/l / ). we gratefully acknowledge the support of our colleagues within these projects and our institutions, particularly graham klyne for his advice on rdf and sparql, and terhi nurmikko-fuller for user feedback during the development of the salt tool. we thank the musicologist participants in the user evaluation of salt for generously volunteering their time, and the anonymous reviewers for their thoughtful comments and suggestions on this article.
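the set relationship logic mentioned in the epilogue can be illustrated with a short python sketch. it assumes, purely for illustration, that each match decision is reduced to an unordered pair of entity identifiers; the function name and the example identifiers are hypothetical, and the actual analysis was carried out over the sparql endpoint.

```python
def precision_recall(participant_matches, ground_truth):
    """Compare one participant's match decisions with the ground-truth set.

    Both arguments are iterables of hashable match decisions, here
    frozensets of the two matched entity identifiers.
    """
    participant = set(participant_matches)
    truth = set(ground_truth)
    true_positives = participant & truth  # overlap between the two collections
    precision = len(true_positives) / len(participant) if participant else 0.0
    recall = len(true_positives) / len(truth) if truth else 0.0
    return precision, recall


def match(a, b):
    """Build an order-independent match decision from two identifiers."""
    return frozenset({a, b})


# Placeholder identifiers, not data from the study.
truth = {match("ems:a", "slk:x"), match("ems:b", "slk:y"),
         match("ems:d", "slk:w"), match("ems:e", "slk:v")}
participant = {match("slk:x", "ems:a"), match("ems:b", "slk:y"),
               match("ems:c", "slk:z")}
precision, recall = precision_recall(participant, truth)
```

in this example two of the participant's three decisions appear in the ground truth, and two of the four ground-truth decisions were found.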
open access this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made.

references

bacciagaluppi, c.: classifying misattributions in pergolesi’s sacred music. eighteenth century music ( ), – ( )
barthet, m., dixon, s.: ethnographic observations of musicologists at the british library: implications for music information retrieval. in: proceedings of the th international society for music information retrieval conference, pp. – . citeseer ( )
beckett, d., berners-lee, t., prud’hommeaux, e., carothers, g.: rdf . turtle: terse rdf triple language. recommendation, w c, feb. . http://www.w .org/tr/turtle/
bennett, r., hengel-dittrich, c., o’neill, e., tillett, b.b.: viaf (virtual international authority file): linking die deutsche bibliothek and library of congress name authority files. in: world library and information congress: nd ifla general conference and council. citeseer ( )
berners-lee, t.: linked data. personal statement: design issues, w c, july . http://www.w .org/designissues/linkeddata.html
bizer, c., cyganiak, r.: d r server - publishing relational databases on the semantic web. in: th international semantic web conference, pp. – ( )
bizer, c., heath, t., berners-lee, t.: linked data - the story so far. in: semantic services, interoperability and web applications: emerging concepts, pp. – ( )
blume, f., finscher, l.: die musik in geschichte und gegenwart: allgemeine enzyklopädie der musik. bärenreiter ( )
bretherton, d., smith, d.a., lambert, j., schraefel, m.c.: musicnet: aligning musicology’s metadata. in: music linked data workshop, may
brown, h.m.: instrumental music printed before : a bibliography. harvard university press, cambridge ( )
castano, s., ferrara, a., montanelli, s., varese, g.: ontology and instance matching. in: knowledge-driven multimedia information extraction and ontology evolution, pp. – . springer, berlin ( )
cherkassky, b.v., goldberg, a.v., radzik, t.: shortest paths algorithms: theory and experimental evaluation. math. program. ( ), – ( )
crawford, t., fields, b., lewis, d., page, k.: explorations in linked data practice for early music corpora. in: digital libraries (jcdl), , ieee, pp. – ( )
crawford, t., gale, m., lewis, d.: an electronic corpus of lute music (ecolm): technological challenges and musicological possibilities. in: conference on interdisciplinary musicology, graz, pp. – ( )
cruz, i.f., antonelli, f.p., stroe, c.: agreementmaker: efficient matching for large real-world schemas and ontologies. proc. vldb endow. ( ), – ( )
cyganiak, r., wood, d., lanthaler, m.: rdf . concepts and abstract syntax. recommendation, w c, feb. . http://www.w .org/tr/rdf -concepts/
dodds, l., davis, i.: linked data patterns: a pattern catalogue for modelling, publishing, and consuming linked data, chapter label everything ( )
elmagarmid, a.k., ipeirotis, p.g., verykios, v.s.: duplicate record detection: a survey. ieee trans. knowl. data eng. ( ), – ( )
inskip, c., wiering, f.: in their own words: using text analysis to identify musicologists’ attitudes towards technology. in: proceedings of the th international society for music information retrieval conference ( )
jacobson, k., dixon, s., sandler, m.: linked-brainz: providing the musicbrainz next generation schema as linked data. in: late-breaking demo session at the th international society for music information retrieval conference ( )
jett, j., nurmikko-fuller, t., cole, t.w., page, k.r., downie, j.s.: enhancing scholarly use of digital libraries: a comparative survey and review of bibliographic metadata ontologies. in: proceedings of the th acm/ieee-cs on joint conference on digital libraries, pp. – . acm, new york ( )
kroeger, a.: the road to bibframe: the evolution of the idea of bibliographic transition into a post-marc future. cat. classif. q. ( ), – ( )
lambrix, p., tan, h.: a system for aligning and merging biomedical ontologies. web semant. sci. serv. agents world wide web ( ), – ( )
lee, j.h., cunningham, s.j.: toward an understanding of the history and impact of user studies in music information retrieval. j. intel. inf. syst. ( ), – ( )
lehmann, j., isele, r., jakob, m., jentzsch, a., kontokostas, d., mendes, p.n., hellmann, s., morsey, m., van kleef, p., auer, s., et al.: dbpedia - a large-scale, multilingual knowledge base extracted from wikipedia. semant. web ( ), – ( )
leme, l.a.p., brauner, d.f., breitman, k.k., casanova, m.a., gazola, a.: matching object catalogues. innov. syst. softw. eng. ( ), – ( )
levenshtein, v.i.: binary codes capable of correcting deletions, insertions, and reversals. soviet phys. dokl. , – ( )
liew, c.l., ng, s.n.: beyond the notes: a qualitative study of the information-seeking behavior of ethnomusicologists. j. acad. librariansh. ( ), – ( )
nurmikko-fuller, t., dix, a., weigl, d.m., page, k.r.: in collaboration with in concert: reflecting a digital library as linked data for performance ephemera. in: proceedings of the rd international workshop on digital libraries for musicology, dlfm , pp. – . acm, new york ( )
nurmikko-fuller, t., jett, j., cole, t., maden, c., page, k.r., downie, j.s.: a comparative analysis of bibliographic ontologies: implications for digital humanities. in: digital humanities : conference abstracts, pp. – ( )
patton, m.q.: qualitative evaluation and research methods. sage, thousand oaks ( )
paulheim, h., hertling, s., ritze, d.: towards evaluating interactive ontology matching tools. in: the semantic web: semantics and big data, pp. – . springer, berlin ( )
philips, l.: the double metaphone search algorithm. c/c++ users j. ( ), – ( )
rism-zentralredaktion: rism: an overview. brochure, apr. . http://www.rism.info/fileadmin/content/community-content/zentralredaktion/ _rism_broschuere_neu- _final.pdf
rose, s.: early music performer. early music online , – ( )
sadie, s.e.: the new grove dictionary of music and musicians. groves dictionaries inc, oxford ( )
shvaiko, p., euzenat, j.: a survey of schema-based matching approaches. j. data semant. iv, – ( )
shvaiko, p., euzenat, j.: ontology matching: state of the art and future challenges. ieee trans. knowl. data eng. ( ), – ( )
sporny, m., longley, d., kellogg, g., lanthaler, m.: json-ld . : a json-based serialization for linked data. recommendation, w c, jan. . http://www.w .org/tr/json-ld/
the marc formats: background and principles. approved statement, ala machine-readable bibliographic information committee & network development and marc standards office, loc, nov. . http://www.loc.gov/marc/ principl.html
volz, j., bizer, c., gaedke, m., kobilarov, g.: discovering and maintaining links on the web of data. springer, berlin ( )
weigl, d., guastavino, c.: user studies in the music information retrieval literature. in: proceedings of the th international society for music information retrieval conference, pp. – ( )
wilmering, t., page, k., fazekas, g., dixon, s., bechhofer, s.: automating annotation of media with linked data workflows. in: proceedings of the th international conference on world wide web companion. international world wide web conferences steering committee, pp. – ( )
w.s.w. group: sparql . overview. recommendation, w c, mar. .
http://www.w .org/tr/sparql -overview/

monteith h, et al. bmj open ; :e . doi: . /bmjopen- - open access

protocol for a scoping review of the qualitative literature on indigenous infant feeding experiences

hiliary monteith, tracey galloway, anthony j hanley

to cite: monteith h, galloway t, hanley aj. protocol for a scoping review of the qualitative literature on indigenous infant feeding experiences. bmj open ; :e . doi: .
/bmjopen- -

► prepublication history and additional materials for this paper are available online. to view these files, please visit the journal online (http://dx.doi.org/ . /bmjopen- - ).

received august
revised january
accepted january

nutritional sciences, university of toronto, toronto, ontario, canada
anthropology, university of toronto, mississauga, ontario, canada

correspondence to dr anthony j hanley; anthony.hanley@utoronto.ca

protocol

© author(s) (or their employer(s)) . re-use permitted under cc by-nc. no commercial re-use. see rights and permissions. published by bmj.

abstract

introduction prudent infant nutrition, including exclusive breastfeeding to months, is essential for optimal short-term and long-term health. quantitative research to date has documented that many indigenous communities have lower breastfeeding rates than the general population and that this gap in breastfeeding initiation and maintenance may have an important impact on chronic disease risk later in life. however, there are critical knowledge gaps in the literature regarding factors that influence infant feeding decisions. qualitative research on infant feeding experiences provides a broader understanding of the challenges that indigenous caregivers encounter, and insights provided by this approach are essential to identify research gaps, community engagement strategies, and programme and policy development. the objective of this review is to summarise the qualitative literature that describes breastfeeding and other infant feeding experiences of indigenous caregivers.

methods and analysis this scoping review will follow guidelines from preferred reporting items for systematic reviews and meta-analyses extension for scoping reviews, the joanna briggs institute and the methodological framework from arksey and o’malley.
in october , we will conduct an electronic database search using medline, embase, the cumulative index to nursing & allied health literature (cinahl), psycinfo, and scopus, and will focus on qualitative studies. publications that have a focus on infant feeding in canada, the usa, australia and new zealand, and the indigenous caregiver experience from the caregiver perspective, will be included. we will conduct a grey literature search using indigenous studies portal, country-specific browser searches, and known government, association, and community websites/reports. we will map themes and concepts of the publications, including study results and methodologies, to identify research gaps, future directions, challenges and best practices in this topic area.

ethics and dissemination ethical approval is not required for this review as no unpublished primary data will be included. the results of this review will be shared through peer-reviewed publications and conference presentations. this protocol is registered through the open science framework (osf.io/ su ).

introduction

indigenous peoples living in canada, the usa, australia and new zealand are disproportionately affected by chronic diseases, including type diabetes mellitus. their heavy disease burden is compounded by socioecological factors, such as food insecurity, poverty, housing and water sanitation issues. these adverse environments and funding limitations are a direct result of the legacy of colonisation and they restrict the ease with which indigenous communities can improve their health and well-being. in recent years, there has been an emphasis in research inquiry and public health programming on the contribution of these complex interconnected factors to health disparities, and increased recognition of the need for multidimensional and culturally safe approaches to support indigenous communities into the future.
the health and well-being of indigenous infants and children are priorities for improved health outcomes overall, as maternal and early life risk factors are known to have long-term effects on health later in life. importantly, a focus on infants and children also aligns with indigenous ways of knowing, where intergenerational relationships hold particular significance.

strengths and limitations of this study
► this protocol describes a rigorous search strategy and methodological framework for summarising the literature that align with the research question and include peer-reviewed sources, as well as a grey literature search.
► selection of publications that meet the inclusion and exclusion criteria will be completed by two independent reviewers and most inclusion/exclusion criteria are only applied at screening, not at the search, augmenting the comprehensiveness of this review.
► this review will map important findings and methodologies to provide an overview of work in this area, guiding best practices for future projects.
► the topic of this review is broad and interdisciplinary; therefore, it is possible that publications only available in subject-specific databases or websites may be omitted.

bmj open: first published as . /bmjopen- - on january . downloaded from http://bmjopen.bmj.com/ on april , by guest. protected by copyright.

it has been well documented that indigenous infants and children disproportionately experience risk factors that are associated with chronic diseases later in life, including high rates of overweight and obesity, food insecurity, poverty and limited quality of education.
optimal nutrition during infancy and childhood is an important factor that contributes broadly to health and well-being across the lifespan. a limited number of previous studies of indigenous infants have reported that breastfeeding initiation and duration have important protective effects on subsequent risk for type diabetes and adiposity; however, breastfeeding rates are often low among indigenous mothers in developed countries. although quantitative studies have reported descriptive statistics and basic epidemiological features of breastfeeding among indigenous mothers, many important knowledge gaps remain. qualitative research decontextualises and recontextualises the deeper meanings and reasons for infant feeding experiences as perceived by indigenous caregivers, providing clarity of the phenomena of interest, uncovering new understanding and possibly illuminating areas for further inquiry. infant feeding initiatives must be informed by these experiences to better address community-specific concerns to effectively promote healthy infant feeding behaviours, including breastfeeding, within indigenous communities.

rationale
to date, there are no published summaries, scoping or systematic reviews on infant feeding experiences among indigenous caregivers that include qualitative descriptions of barriers, stories, supports and initiatives or related topics. a scoping review in this area will assist researchers in understanding the current state of the existing literature, research gaps and key research priorities for future work. this review will also summarise the qualitative research methodologies used in indigenous infant feeding studies with the potential to clarify best research practices, and additional methodological applications to address gaps in the literature. this information may also result in further clarification of clinical best practices for breast and alternative forms of infant feeding among indigenous populations.
objectives
the primary aim of this work is to summarise the literature available to date that incorporates qualitative approaches to describe the breastfeeding and other infant feeding experiences of indigenous women residing in developed nations impacted by colonisation. this review will include research addressing indigenous women’s experiences from their own perspectives, as well as the perspectives of other caregivers, including but not limited to grandmothers and fathers. in addition to literature on breastfeeding experiences, this review will also include literature on alternative infant feeding options, including formula feeding, complementary feeding, weaning and other forms of milk feeding if the work describes the alternative in relation to breastfeeding.

methods
protocol and registration
the protocol for this scoping review follows preferred reporting items for systematic reviews and meta-analyses guidelines adapted for scoping reviews, as well as guidelines from the joanna briggs institute reviewer’s manual and guidelines published in by arksey and o’malley. the protocol is registered with the open science framework.

eligibility criteria
table provides an overview of the inclusion and exclusion criteria for this scoping review. the population of focus is indigenous peoples living in canada, the usa, new zealand and australia. these countries are included as they are developed nations that have similar legacies of colonisation, where western worldview is dominant, and in which indigenous peoples have similar health outcomes. infant feeding experiences are the main focus for this review. breastfeeding, as well as alternative forms of infant feeding, such as formula and cow’s milk, are included; however, we will exclude works that only focus on the introduction of solid foods.
breastfeeding compared with not breastfeeding within the same population is the comparison considered in this review, when applicable, where we consider caregiver experiences of breastfeeding compared with other infant feeding strategies. the qualitative outcomes specific to experiences, perspectives and practices (including themes, descriptions, open-ended survey responses or any answers pertaining to experience) as described from the caregiver or others involved in caregiving will be included; work that only describes an outsider perspective will be excluded. additional inclusion criteria include: works published in the english language, grey literature and peer-reviewed journal articles, work with a focus on indigenous groups within australia, canada, new zealand and the usa as the primary population, research using qualitative or mixed methods where the infant feeding experience is described, and works published after . works published prior to are likely to include literature that has been archived and is therefore not feasible to review in detail for this review.

table eligibility criteria overview
inclusion criteria | exclusion criteria
indigenous populations in canada, us, new zealand and australia | work not describing experiences from a caregiver’s perspective
explores breastfeeding and alternative infant feeding options | work only about the introduction to solid foods
published in english | no english version
published after | published before
qualitative or mixed methods data | presentation of only quantitative and numerical data that do not describe infant feeding experiences
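The eligibility criteria above can be read as a simple screening rule. The following is a minimal sketch, not the authors' procedure: the `Record` fields, the reason strings and the `cutoff_year` parameter are hypothetical names introduced for illustration. The cutoff year is left to the caller because it is not stated in this text, and the sketch assumes the cutoff year itself counts as eligible.

```python
from dataclasses import dataclass

# Countries named in the eligibility criteria.
ELIGIBLE_COUNTRIES = {"canada", "usa", "australia", "new zealand"}


@dataclass
class Record:
    """Hypothetical screening record; field names are illustrative only."""
    year: int
    language: str
    country: str
    indigenous_population: bool
    design: str                  # "qualitative", "mixed" or "quantitative"
    caregiver_perspective: bool
    solid_foods_only: bool


def screen(record: Record, cutoff_year: int) -> list[str]:
    """Return the exclusion reasons for a record; an empty list means include."""
    reasons = []
    if not record.indigenous_population:
        reasons.append("does not involve indigenous populations")
    if record.country.lower() not in ELIGIBLE_COUNTRIES:
        reasons.append("outside canada, usa, australia and new zealand")
    if record.language.lower() != "english":
        reasons.append("no english version")
    if record.year < cutoff_year:  # assumption: the cutoff year itself is eligible
        reasons.append("published before cutoff year")
    if record.design not in {"qualitative", "mixed"}:
        reasons.append("only quantitative data")
    if not record.caregiver_perspective:
        reasons.append("no caregiver perspective")
    if record.solid_foods_only:
        reasons.append("only about introduction of solid foods")
    return reasons
```

Returning the list of reasons rather than a bare yes/no keeps a record of why each work was excluded, which is the kind of detail reviewers need when they meet to discuss borderline cases.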
should there be works that do not clearly fit within these criteria, the two reviewers will meet to discuss until consensus is reached. we will report where clarity was and was not achieved and disclose why. publications that do not involve indigenous populations will also be excluded.

information sources
databases included in the initial search for this review will be medline, embase, cinahl, psycinfo, and scopus. these databases are selected for this review to include a broad range of research as our topic overlaps with various fields, including anthropology, health sciences, sociology and indigenous studies. following this initial search, the grey literature will be explored for additional relevant documents. the grey literature search will concentrate on resources and publications available from indigenous studies portal and a variety of indigenous focused websites, governments, organisations and book chapters. a thorough google search will be conducted with each of the country-specific google versions (eg, google au) and the first pages of results will be included in the search. indigenous scholars and non-indigenous scholars who work in this area of study in canada, australia, usa and new zealand will be contacted with the aim of including as many applicable grey literature sources as needed to be as sensitive in our search as possible. the canadian agency for drugs and technologies in health’s ‘grey matters’ checklist will also be consulted.

given the limitations to reproducibility and comprehensiveness in a grey literature search, transparency is particularly important. therefore, the reporting strategy used for the grey literature search will include all websites (url and title) visited, the dates of searches, the search terms used to reach such websites and used within those websites, and the number of items screened.
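One way to keep such a search log consistent is to record every website visit in the same structure. The sketch below is an assumption about how that could look, not a tool named in the protocol: the class `GreySearchEntry`, its field names and the function `log_as_csv` are hypothetical, chosen to mirror the reporting items listed above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class GreySearchEntry:
    """One website visit in the grey literature search (illustrative fields)."""
    url: str
    title: str
    searched_on: date
    terms_to_reach_site: str
    terms_within_site: str
    items_screened: int


def log_as_csv(entries: list[GreySearchEntry]) -> str:
    """Serialise the search log as CSV text, one row per website visited."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(GreySearchEntry)]
    )
    writer.writeheader()
    for entry in entries:
        row = asdict(entry)
        # Store dates in ISO format so the log sorts and compares cleanly.
        row["searched_on"] = entry.searched_on.isoformat()
        writer.writerow(row)
    return buffer.getvalue()
```

A visit that screens zero items is still a row in the log, so searches that turn up nothing remain visible in the report.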
we will report both the sources of relevant content, as well as when no relevant content is found on a website or from a specific search. we will also search book chapters and conference proceedings in the following databases: medline, embase, cinahl, psycinfo, scopus and indigenous studies portal. the initial database search and exporting of abstracts and references will take place from to october . the detailed proposed search strategy can be found as online supplemental appendix .

selection of sources of evidence
all literature references will be exported to zotero software (corporation for digital scholarship, virginia) and saved. the titles, abstracts and references will then be transferred to covidence, where duplicates will be removed and data will be managed for the duration of the scoping review. this will enable independent review of the literature for the reviewers/authors. a minimum of two independent reviewers will be involved in this work from screening to inclusion. at minimum, an additional author will assist in summarising the included literature and in writing the scoping review.

data charting process
all data will be collected and shared using covidence software to enable an independent review process. the software facilitates the review process through organised management of the sources, and identification of publications where reviewers are not concordant and discussion may be required. if or when review decisions differ, an author other than the primary two reviewers will provide a third vote to achieve a review decision. all titles and abstracts will be screened by two reviewers at the screening stage and all eligible sources will move to a full-text review, also completed by two independent reviewers.

data items
literature included in this scoping review must be from qualitative or mixed-method studies.
articles that report on survey data will be included if the questions reported on are based on infant feeding experience (perspectives, perceptions and practices), whether the survey questions were open ended or not. data are considered as any information, such as quotations, codes, themes and open-ended survey responses as first, second and/or third order constructs, describing infant feeding experiences.

experience is defined as ‘practical knowledge, skill, or practice derived from direct observation of or participation in events or in a particular activity (merriam-webster.com)’. in this work, experience refers to the reported knowledge, skill or practice from direct observation or participation in infant feeding.

the term “indigenous peoples” has not officially been defined by the united nations given the importance of enabling indigenous peoples to self-determine their identity and that a specific definition is not required for the protection of indigenous rights. a working definition is provided by the josé r. martínez cobo study and is as follows:

indigenous communities, peoples and nations are those which, having a historical continuity with pre-invasion and pre-colonial societies that developed on their territories, consider themselves distinct from other sectors of the societies now prevailing on those territories, or parts of them. they form at present non-dominant sectors of society and are determined to preserve, develop and transmit to future generations their ancestral territories, and their ethnic identity, as the basis of their continued existence as peoples, in accordance with their own cultural patterns, social institutions and legal system.

in canada, indigenous groups include inuit, métis, and first nations, including any of the over recognised first nations. in australia, this includes aboriginal and torres strait islanders; in new zealand, the maori people; in the usa, native american peoples and alaska natives.

breastfeeding is a form of infant/early childhood nutrition using breast milk. in this scoping review, breastfeeding as well as any other form of infant feeding such as formula feeding, cow’s milk administration, bottle feeding, expressed milk feeding, milk bank feeding, wet nurse feeding and others are included so long as the work describes the breastfeeding or absence of breastfeeding experience. for those works that only describe an alternative method to breastfeeding, the work must describe that method in relation to breastfeeding (ie, why breastfeeding was not engaged etc).

synthesis of results
as previously mentioned, covidence software will be used to manage the literature and selection process. once the records have been screened and full-text articles have been reviewed, studies that meet the inclusion criteria will be retained. this literature will be synthesised based on charting of results and thematic analysis. this process of synthesising the results will be completed by the primary author with the feedback and review of the second and third authors. the results will focus on the themes, quotations, conclusions and other interpretations related to infant feeding experiences of indigenous caregivers, as well as a synopsis of the methodologies and theories used to support the work. two steps will be used to present the findings: (1) a figure highlighting the number of studies at each stage of the search and (2) a written analysis of the primary outcomes.
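The numbers for a figure of this kind can be derived from one final status per record. The sketch below is only an illustration under assumed labels: the four status strings and the function name `flow_counts` are hypothetical and are not taken from covidence or from the protocol.

```python
from collections import Counter

# Assumed final status labels, one per record retrieved by the search.
KNOWN_STATUSES = {
    "duplicate",
    "excluded_at_screening",
    "excluded_at_full_text",
    "included",
}


def flow_counts(statuses: list[str]) -> dict[str, int]:
    """Count how many records remain at each stage of the selection process."""
    unknown = set(statuses) - KNOWN_STATUSES
    if unknown:
        raise ValueError(f"unknown status labels: {sorted(unknown)}")
    tally = Counter(statuses)
    identified = len(statuses)
    screened = identified - tally["duplicate"]
    full_text_assessed = screened - tally["excluded_at_screening"]
    included = full_text_assessed - tally["excluded_at_full_text"]
    return {
        "identified": identified,
        "screened": screened,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }
```

Because each stage is computed by subtraction from the previous one, the counts in the figure always add up, and a mislabelled record raises an error instead of silently inflating the included total.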
qualitative evidence synthesis can have several challenges; therefore, detailed documentation will be important for the analytical process, including decision-making rationale through mind mapping and/or charting. the final results will be validated by a researcher in the field and an indigenous community member with lived experience.

validation of sources will be conducted using a test set of preidentified relevant publications that are expected to be captured using the database search terms. after the first database search is complete, we will check to see if these publications are captured by our search. if they are, this will indicate that our search was likely comprehensive; if they are not, we will investigate why, report this information, and make the appropriate changes to the search strategy prior to searching the remaining databases.

this paper describes the protocol for a scoping review of peer reviewed journal articles and grey literature on the topic of qualitative research on infant feeding experiences of indigenous caregivers living in canada, australia, the usa and new zealand. there is a need to explore and understand the literature related to indigenous people’s experiences with infant feeding practices as no such review exists to date. this is an important knowledge gap given the significant role that infant feeding plays in indigenous health and well-being, and disease prevention. this scoping review will summarise the literature to date and highlight any important gaps that exist to guide research priorities in the future. it is anticipated that this review will also summarise the methodologies used to date, providing guidance for future research, highlighting best practices and/or gaps in how data have been collected.

acknowledgements the authors would like to thank our research librarian, glyneva bradley-ridout, at the university of toronto, for her guidance in the scoping review process.
contributors hm contributed to the drafting and editing of the protocol and oversaw revisions. tg provided feedback on the structure of the manuscript and the search strategy. ah contributed to the introduction and methods and was extensively involved in editing the manuscript. all authors approved the final manuscript.

funding this research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors. tg is supported by the canadian institutes of health research, fund number . hm is supported by an ontario graduate scholarship and the following university of toronto (u of t) scholarships: a banting and best diabetes centre scholarship, a department of nutritional sciences loblaw food as medicine award, the dr. bernard lau memorial scholarship (b): graduate bursary, the al and hannah perly graduate student scholarship, and the peterborough k.m. hunter graduate scholarship.

competing interests none declared.

patient and public involvement statement patients or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

patient consent for publication not required.

ethics approval as this review involves material previously published or in the public domain, ethical approval is not required; however, this review is specific to indigenous groups and therefore, it is important to consider data sovereignty and ethics in the analysis and interpretation of results. the methodologies of the included works will be considered within this context, and the reviewers will validate results with an indigenous scholar and/or community member prior to publication.

provenance and peer review not commissioned; externally peer reviewed.

supplemental material this content has been supplied by the author(s). it has not been vetted by bmj publishing group limited (bmj) and may not have been peer-reviewed.
any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by bmj. bmj disclaims all liability and responsibility arising from any reliance placed on the content. where the content includes any translated material, bmj does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

open access this is an open access article distributed in accordance with the creative commons attribution non commercial (cc by-nc . ) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. see: http://creativecommons.org/licenses/by-nc/ . /.

orcid id
hiliary monteith http://orcid.org/ - - -

references
united nations, department of economic and social affairs. indigenous peoples: health [online]. available: https://www.un.org/development/desa/indigenouspeoples/mandated-areas /health.html
knibbs ld, sly pd. indigenous health and environmental risk factors: an australian problem with global analogues? glob health action ; : .
united nations.
state of the world’s indigenous peoples: indigenous peoples’ access to health services [online], . available: https://www.un.org/esa/socdev/unpfii/documents/ /docs-updates/sowip_health.pdf
asia pacific forum, united nations. the united nations declaration on the rights of indigenous peoples: a manual for national human rights institutions [online], . available: https://www.ohchr.org/documents/issues/ipeoples/undripmanualfornhris.pdf
sheikh ma, islam r. cultural and socio-economic factors in health, health services and prevention for indigenous people. antrocom online j anthropol ; : – .
dalang r, carino j. indigenous peoples and the human rights-based approach to development: engaging in dialogue [online]. united nations development programme (undp), . available: https://www.undp.org/content/dam/rbap/docs/research% &% publications/democratic_governance/rbap-dg- -indigenous-peoples-approach-to-development.pdf
mcnamara bj, gubhaju l, chamberlain c, et al. early life influences on cardio-metabolic disease risk in aboriginal populations--what is the evidence? a systematic review of longitudinal and case-control studies. int j epidemiol ; : – .
martens pj, shafer la, dean hj, et al. breastfeeding initiation associated with reduced incidence of diabetes in mothers and offspring. obstet gynecol ; : – .
greenwood m. children as citizens of first nations: linking indigenous health to early childhood development. paediatr child health ; : – .
first nations information governance centre. national report of the first nations regional health survey phase : volume two. first nations information governance centre .
government of canada sc. dietary habits of aboriginal children [online], . available: https://www .statcan.gc.ca/n /pub/ - -x/ /article/ -eng.htm
willows nd, johnson ms, ball gdc.
prevalence estimates of overweight and obesity in cree preschool children in northern quebec according to international and us reference criteria. am j public health ; : – .
ip s, chung m, raman g, et al. breastfeeding and maternal and infant health outcomes in developed countries. evid rep technol assess ; : – .
dieterich cm, felice jp, o’sullivan e, et al. breastfeeding and health outcomes for the mother-infant dyad. pediatr clin north am ; : – .
pettitt dj, forman mr, hanson rl, et al. breastfeeding and incidence of non-insulin-dependent diabetes mellitus in pima indians. lancet ; : – .
willows nd, morel j, gray-donald k. prevalence of anemia among james bay cree infants of northern quebec. cmaj ; : – .
kuperberg k, evers s. feeding patterns and weight among first nations children. can j diet pract res ; : – .
chamberlain cr, wilson an, amir lh, et al. low rates of predominant breastfeeding in hospital after gestational diabetes, particularly among indigenous women in australia. aust n z j public health ; : – .
mcisaac ke, lou w, sellen d, et al. exclusive breastfeeding among canadian inuit: results from the nunavut inuit child health survey. j hum lact ; : – .
becker h. chapter : concepts. in: tricks of the trade: how to think about your research while you’re doing it. university of chicago press, : – .
dodgson j, struthers r. traditional breastfeeding practices of the ojibwe of northern minnesota. health care women int ; : – .
eni r, phillips-beck w, mehta p. at the edges of embodiment: determinants of breastfeeding for first nations women. breastfeed med ; : – .
moffitt p, dickinson r. creating exclusive breastfeeding knowledge translation tools with first nations mothers in northwest territories, canada. int j circumpolar health ; : .
houghtaling b, byker shanks c, ahmed s, et al. grandmother and health care professional breastfeeding perspectives provide opportunities for health promotion in an american indian community. soc sci med ; : – .
tricco ac, lillie e, zarin w, et al. prisma extension for scoping reviews (prisma-scr): checklist and explanation. ann intern med ; : – .
peters m, godfrey c, mcinerney p. chapter : scoping reviews ( version). in: aromataris e, munn z, eds. joanna briggs institute reviewer’s manual [online]. jbi, . https://reviewersmanual.joannabriggs.org/
arksey h, o'malley l. scoping studies: towards a methodological framework. int j soc res methodol ; : – .
smylie j, crengle s, freemantle j. indigenous birth outcomes in australia, canada, new zealand and the united states – an overview. open womens health j ; : – .
yeates ke, cass a, sequist td, et al. indigenous people in australia, canada, new zealand and the united states are less likely to receive renal transplantation. kidney int ; : – .
canadian agency for drugs and technologies in health (cadth). grey matters: a practical tool for searching health-related grey literature, .
covidence systematic review software [online]. melbourne au: veritas health innovation. available: www.covidence.org
merriam-webster. experience [online], . available: https://www.merriam-webster.com/dictionary/experience
martinez cobo j. problem of discrimination against indigenous populations, e/cn. /sub. / / /add. , para. . united nations, .
soilemezi d, linceviciute s. synthesizing qualitative research: reflections and lessons learnt by two new reviewers. int j qual methods ; .
institutional repositories: an analysis of trends and a proposed collaborative future

author: leila sterman

leila sterman ( ) institutional repositories: an analysis of trends and a proposed collaborative future, college & undergraduate libraries, : - , - , doi: . / . .

made available through montana state university’s scholarworks scholarworks.montana.edu

leila sterman, montana state university

this study seeks to give libraries a plan for interinstitutional cooperation for institutional repositories that will benefit all involved: researchers, institutions, and, ultimately, global scholarship. this research uses repository studies, interviews with existing repository managers, and the input of libraries considering a repository to inform the exploration of the opportunities for collaboration in ir development and maintenance. this article proposes opportunities for collaboration between institutions in order to convince libraries that it is possible and effective to work together toward a common goal: highlighting existing working groups or alliances, sharing technology and hardware, building separate interinstitutional bodies to house repositories, and sharing the work of specialists.
introduction

this article seeks to help libraries join together to form more efficient and useful collaborations for the sustainable success of institutional repositories (irs). collaborative efforts can help form more efficient and systematic management of ir services. repository managers may then have more time and resources to put toward solving a number of problems or challenges. these include: increasing indexing rates by search services such as google scholar (arlitsch and o’brien ); integrating unique identifiers like open researcher and contributor id (orcid) and the international standard name identifier (isni) system; engaging in outreach to increase support, understanding and funding for irs; and soliciting scholarship to populate the ir.

the state of repositories

background

as the budgets shrink in many libraries, there has been a shift toward e-resources in an attempt to focus resources where they will provide the greatest return on investment within institutions (oder ). in an article describing the institutional repository at pacific university, the authors note that “in taking on a much more active role in the creation, dissemination and preservation of internally produced scholarship, the library has demonstrated its value to faculty and administrators and has opened the door to new partnerships which will not only strengthen the university, but also the library’s place within it” (gilman and kunkel , ). gilman and kunkel ( ) further suggest that the academic library move from being an access point and archive of information to a place to increase engagement in the scholarly communication life cycle. bankier and smith ( ) speak to the wave of new technology-supported initiatives that are possible in an academic library.
they state that “by seeking out a variety of content types, the library is able to initiate, renew, or redefine its relationship with faculty, departments, and administration, generating critical support for scholarly communication and repository initiatives” ( ). librarians have a responsibility to be aware of and advocate for new technology, especially technology such as irs, which can bring a great deal of information to a broad range of people.

although an institutional repository could become a useful and important tool for universities, not every institution has the means to create and maintain its own repository. repositories are expensive and time consuming, and they demand specific knowledge of programming, content management, metadata applications, publicity, and internal marketing to researchers.

in the past few years, irs have proliferated within the context of the open access (oa) movement, shrinking library budgets, the increasing costs of traditional scholarly journals and publications, and the increased development of software platforms for irs. in , the ranking web of world repositories lists repositories in the usa, and , repositories in the world (cybermetrics lab ). since arxiv, one of the first repositories and the top-ranked repository in the ranking web, began in august , there has been a relative explosion of digital repositories in the academic world (cornell university library ). the mission of most of these digital repositories is to “enable better access, searchability, usability, and visibility of their research output by those with internet access” (ocholla ). although they share a similar mission, there are many avenues to this goal, and repositories vary greatly in their size, content, scope, and successful realization of that mission.
as there are many ways in which an ir can engage with the research community, this article focuses on five areas: interoperability and visibility, engagement and dissemination, researcher participation, education, and information stability.

interoperability and visibility

the association of research libraries (arl) defines repositories as: “institutionally defined, scholarly, cumulative and perpetual, and open and interoperable” (crow a, ). most repositories are consistent with those standards if they follow basic open archives initiative protocol for metadata harvesting (oai-pmh; open archives initiative ) standards for interoperability of metadata. to be truly interoperable, however, repositories need more than comparable metadata. to truly work together, repositories need to be linked to each other in some more meaningful way. for example, irs should apply more specific metadata to each item so that items are more reliably linked to search engines and can be more accurately indexed and more easily found through the basic searches that are the beginning of so many academic research endeavors (arlitsch and o’brien ). libraries should work to make their sites visible so that the scholarship held in irs can be indexed by search engines and read and cited in the academic community. that requires time and a specific set of skills. “the goals motivating an institution to create and maintain a digital repository—whether paninstitutional, as a component in the changing structure of scholarly communication, or institution-centric—require that users beyond the institution’s community gain access to the content” (crow a). it is not enough to post objects to the internet; the metadata of those objects must be optimized for search engines if these objects are to be found by users.

engagement and dissemination

increasing numbers of academic institutions have realized the clear and immediate benefits of having an ir.
the “sparc institutional repository checklist & resource guide” states that “institutional repositories offer a strategic response to systemic problems in the existing scholarly journal system—and the response can be applied immediately, reaping both short-term and ongoing benefits for universities and their faculty and advancing the positive transformation of scholarly communication over the long term” (crow b). this is an alluring and impressive draw for libraries to offer scholarly life cycle literacy education. scholars see a clear desire for maximum readership, access to, and preservation of articles (chan ). an ir could be a good starting place to transform the library into an active participant in the promotion and dissemination of scholarly materials. ir advocates believe that repositories will make huge changes in the public image of a university’s research output and a library’s image. crow explains, “the rationale for universities and colleges implementing institutional repositories rests on two interrelated propositions: one that supports a broad, pan-institutional effort and another that offers direct and immediate benefits to each institution that implements a repository” (crow a).

researcher participation

it is difficult to get researchers to voluntarily submit their own research. it takes time to explain what the repository is and how it might help a researcher. researchers need compelling reasons to take time from their busy schedules to assign basic metadata, figure out the copyright status of an article, and then find the pdf of the appropriate version of their work. education and outreach, ease of use, and implied benefits are often not enough to foster a culture of participation that encourages researchers to deposit. harvard university has implemented a mandate to ensure that its repository is populated with its own scholarly output.
“the mandate, which resembles a publishing contract, has been instituted to combat rising serials costs that are forcing subscription cuts and restricting intellectual exchange” (albanese , ). even with a mandate, it still takes time and energy to deposit works into a repository.

education

irs provide practical opportunities to increase awareness about multiple scholarly communication issues and systems. they also promote “awareness of the increasing importance of virtual, informal, and global mechanisms of communication” (nabe , ). they can be a practical platform for education on publishing, copyright, plagiarism, digital preservation, and information literacy. elsevier’s early takedown notices did more than show the power of the legal department of a major publishing house; they made public the ignorance or disinterest that many published authors have about their copyright holdings (culter ). researchers share pdfs in an informal network of phone calls, emails, and social media (moriano et al. ). as mass dissemination of electronic resources becomes easier, the practice of illegally sharing documents to which authors have signed away the copyright, including distribution rights, becomes easier. now, instead of illegally sharing an article with a small number of colleagues, an author can share with anyone in the world. this brings two issues to light: one, publishers often have a legal right to these articles, as most publishers still hold the copyright to the final version of an article; and two, authors do not behave as if they care about that legal distinction. authors often do not adhere to the copyrights that they agreed to, as seen in the proliferation of illegal copies on services like researchgate. irs can help educate authors about those rights so that they do not sign them away without considering the implications, and when they do sign restrictive author’s contracts, they understand what has transpired.
researchers and students should be educated about their rights as authors and the effects that publication agreements have on publications so that they can make more informed choices about their work.

information stability

“link rot” is a term used to describe the permanent unavailability of linked content on web pages due to out-of-date hyperlinks. the loss of citation information is a real and serious impediment to research. as links deteriorate they no longer point to sources for quotes, maps, data sets, and any other digital object that may be deeply important to an article. for example, before publishing a book on the internet, the authors of interpersonal divide: the search for community in a technological age attempted to recheck their sources, only to find, years after starting, that percent of their digital references were gone. “this study continues to challenge advocates of online scholarship to stop touting the convenience of easy access and start resolving issues of later retrieval” (bugeja and dimitrova ). online scholarship, they assert, is initially about access to information, but continued retrieval is just as important. institutional repositories can serve as a permanent source of stable information through intentional preservation architecture and stable urls.

irs begin to answer the problems in current digital scholarship, although they are not the whole story for most institutions. it is now a question of how to meet this need most efficiently and for the longest amount of time.

survey analysis

in an attempt to understand the current ecosystem of irs in the united states, the author emailed a questionnaire (see the appendix) to managers of the irs listed on opendoar’s list of repositories in the united states. the survey pool was not limited by type or material collected. twenty-six ir managers responded to the survey, giving a percent response rate.
respondents reported the use of several platforms, including dspace, digital commons (bepress), contentdm, a variety of homegrown software, hydra, fedora, sobekcm, and debian. just under half of the responding irs run on dspace, a third have contracts with bepress to run digital commons, and the remainder are spread among the other platforms. several types of institutions responded to the survey, including research universities, state archives, a subject repository, teaching-focused colleges, museum libraries, large state schools, and a research institute.

staffing

not every library that maintains a repository has an explicitly titled ir manager. many of the surveyed libraries had a librarian who manages the ir in addition to normal duties. in the most extreme case, a reference librarian was relieved of one reference desk shift in order to run the repository. some libraries have a team of librarians who work together to run the repository, and many have an additional staff person or graduate student. the designation of a single staff person for the ir is a large commitment of resources that not all libraries are able to make to their irs, but this is not essential for success. some librarians were concerned that there was someone who ran the ir, yet that person’s title did not reflect that role. some respondents claim “no real defined leadership,” as it is shared among multiple departments. it would benefit the decision-making process to have a formal plan in place for the staffing and budget of an ir if it is to survive budget cuts, new initiatives, personnel turnover, and the annual assessments of academic institutions.

most of the institutions surveyed allocated at least . fte to the repository, mostly to education and item deposit. for those libraries that run self-supported repository software (mostly dspace), technical server and software development and maintenance are not counted in the . fte.
when making a case for an ir, it would also be prudent to note all the time spent on the project, regardless of departmental affiliation. this would help administrators and managers see the full extent of work done on and for an ir. additionally, treating all personnel involved with the ir as valuable players may increase effective communication and simplify management of the project as a whole.

scope of collection

most repositories collect electronic theses and dissertations (etds), peer-reviewed articles (pre- or postpublication) published in traditional journals, monographs, or book chapters. additionally, some collect grey literature, technical reports, working papers, white papers, conference presentations, archival university papers, college or departmental newsletters, course catalogs, audio samples, books, journals, accreditation reports, funded grant proposals, artistry and performance materials, video of recorded lectures and talks, university ephemera, conference proceedings, posters, undergraduate work, videos, curricular materials (teaching tools designed by faculty), datasets, images, and maps. one ir currently will add any type of content that supports the university’s curriculum and research needs: they collect “everything and anything currently.”

some respondents note that there are negative aspects to such a broad collection scope. one of these respondents states: “it’s becoming clear that a repository with such a broad scope may be of limited value. enforcement of metadata best practices is extremely difficult when responsibility for description is distributed to anyone who requests a collection. not to mention that lowest-common-denominator description and structure are limiting for projects who want to do more.” when asked about aspects of the ir that have been a struggle, this respondent continues:

the biggest struggle has been defining and maintaining the scope of the repository.
this scope has shifted multiple times since starting and new collections/partnerships are given vague expectations about our services. the collection policy is so broad and the scope so varied that providing specialized services per collection is impossible. another struggle has been running the project without a dedicated budget or business plan. currently, there are no plans in place to recoup costs of the resources it takes to run the repository (storage, transcoding, streaming, staff, etc.). this makes designing a system with appropriate boundaries and features challenging.

this response clearly illustrates the need for a plan, a collection development policy, and a dedicated, or at least noted, budget. the repository manager knows that these things would help. the struggle is advocating for resources while fighting to keep up with daily operations. the respondent continues:

resources that would help: a dedicated budget. a leader with a vision concerning repository services and how they fit into the larger goals of the library/university. project manager to oversee the development of both systems and digital collections production. more software engineers to work on the platform, new features, and improvements. a repository manager to oversee routine day-to-day queries, data analysis, and technical maintenance. digital archivist to support media collections (standards, policy). more digital lab student staff to grow digitization to support requests. student staff to support ingestion and description.

one good reason to limit the scope of the collection of an ir to scholarly materials or faculty publications is that it limits the possibilities of mission creep and the repository, its staff, and budget being pulled in too many directions to be effective at any one task.

age of institutional repositories

the irs represented by survey respondents had start dates ranging from to .
although the repositories started at different times, there were similar issues and successes across all irs, with the younger programs more concerned with faculty engagement and increasing content.

budget

the survey asked about both the startup budget and estimated annual budget for irs. the answers varied between hard dollar numbers and guesses. respondents spent up to $ , for startup with repository services. these repository services average around $ , a year for the survey population. some respondents reported “there was no dedicated budget” or that “it just came out of the library budget.” it is indisputable that in order to store information, a container is needed in which to house it. these places could be local servers, cloud-based storage, or an external service. in each case there is a cost associated. additionally, there are personnel costs; these vary from a reassigned percent fte to positions that each spend an average of percent of their time on the ir at one institution. the respondents vary in how they categorized staff who work with the ir, from deposit only to the full range of cataloging, development, server maintenance, liaisons in the institution, marketing, deposits, repository instruction, digitization, and policy development.

“you can’t manage what you don’t measure,” as paraphrased by bill hewlett, has become a management adage that has been applied to information technology, small businesses, and corporations, and should also be applied to irs (symons ). the library, an integral part of the research function of a university, should not be driven by the bottom line, like a business, but most libraries run on a limited budget, and it would be prudent to analyze current practices in order to make a case for them if they are ever threatened. libraries cannot become complacent and assume that because the graduate school (or similar external funding) gave $ , as startup funds to help maintain the etds, it will fund the ir forever.
likewise, when committing to electronic preservation, part of the commitment must be the securing of, or at least planning for, continuing funds. it is difficult to maintain a budget and personnel time without first knowing the amount of time spent currently. “it is self-defeating to advocate for adoption of the service and then be unable to meet the requests of the contributor pool” (nabe , ). it is in a repository’s best interest to keep good records and maintain a number of metrics to assess the work performed for its upkeep and the benefits enjoyed as a result of that work.

successes

the survey asked respondents to comment on their biggest accomplishment. although responses varied from institution to institution, they fell into these categories: etd work, increasing item counts, publishing journals and student journals, increased outreach, digitization of existing collections, migration to open source software, broadening repository holdings, improved workflows, creating the repository itself, and becoming part of the culture at a home institution. institutions are increasing the size of their collections and the acceptance and use of those collections. it is difficult to draw conclusions beyond this from the survey responses. still, it is useful to know what other repositories consider a success.

difficulties

difficulties reported in the survey fell into a few categories: lack of time, self-deposit, lack of staffing or a dedicated manager, copyright clearance, obtaining content (especially faculty publications), making time to set policies, it support, getting better statistics about use, defining the scope of the ir, funding (especially for storage), managing workflows, tailoring messages to specific faculty, helping other librarians to provide meaningful outreach, and communication and marketing.
though it is difficult to say why these are problematic points for repository managers, the issues can be further broken down into a lack of funding, staffing time, it support, policies, and outreach. while these are not uncommon for any organization, it is worth noting that these are the places irs feel pressure.

software issues

one solution to facilitate the process of building and maintaining an ir was framed as “just purchasing a [commercial product].” libraries in this survey that have purchased repository services report similar concerns to their open source counterparts. both groups spend a great deal of time on outreach, item ingest, and increasing faculty awareness of the tool and service. purchasing repository software maintained by a for-profit company could be an easy solution to issues such as technical repository maintenance and gaps in librarians’ skill sets, yet purchasing a product from a for-profit company is at odds with the sustainable continuation of a repository built for the promotion and preservation of open access research.

in an era of shrinking library budgets, committing to preserve the scholarly record without being able to confidently fund the project forever seems tenuous. many of the surveyed managers reported a current push to have the ir service paid for outside the library. if the library remains in control of the ir yet not its budget, we take both the infrastructure and the assurance of its continued existence out of our hands.

libraries without repositories

of the libraries surveyed, the few that did not have a repository had all considered an ir at some time. the major reason that they had not pursued an ir was budget concerns. one library reported that they hoped to join an ir hosted or facilitated by the consortium where they already have membership and working relationships. this is one option that provides fertile ground for collaboration.
another library reported that they had the desire to preserve materials that were produced at the university that could not be supported in their current digital archives. as a small library without a large staff, this library does not have the on-site staff time to create and support an open source product like dspace for a repository. they also did not have the budget this year to purchase a service like bepress or dspace direct. it is this type of issue, caused by limited budget, that collaboration most effectively addresses.

existing collaboration

there are a few common existing collaborations within institutions. just over percent of the libraries that responded to the survey noted a content or financial relationship with the graduate school, law school, or individual department. the partnerships include monetary support, outreach and education, and personnel time. support from departments in the form of items to deposit, mandates, or active enthusiasm makes a large difference in the awareness and population of an ir. these are valuable partnerships that should be fostered on campuses and maintained through mutual benefits. many graduate schools support the ir based on the deposit and preservation of electronic theses and dissertations (etds). two repository managers responded that they had collaborations with other institutions, and many noted that they used services like the dspace user community to aid their work. these communities and support networks not only distribute work among more people and budgets, but they also foster engagement around an ir, which is invaluable to its success.

additionally, sharing the work of an ir within an individual library not only helps the persons responsible for the repository, but as some committees work on things like “tough questions about copyright,” the discussions act as valuable learning tools for all involved.
potential new collaboration

proposal summary

the knowledge and resources to run an ir may not be available to all institutions. as librarians we need to think critically about the state of repositories and realize that many situations may not be scalable or sustainable. although it may be easier for some libraries to divert staff time and resources from the library budget to create a repository, this should not limit the libraries that would like to build a repository yet do not have the resources. for those institutions that may not be able to support that activity but want the benefit of an ir, how can they build a repository that makes sense, is sustainable, and promotes their mission as a library and as an institution?

five options for collaboration between libraries that would like to establish a new ir are:

1. form a consortium to produce a single repository that has more visibility than any one institution has on its own;
2. increase communication between partner institutions so that we do not all solve the same problems or work in a vacuum;
3. establish a separate center for site-neutral collaboration, such as ohiolink, orbiscascade, and oclc do (ohiolink );
4. share the work of specialists; and
5. use metrics to evaluate and assess irs so that they may have a productive future.

consortium

for some institutions, even if repositories become less expensive and easier to set up, they are still difficult to maintain and to populate. it may be the case that even in the best circumstances it is advantageous for libraries to join together in this facet of their services. to work with others to produce a single repository that represents many institutions, and thus has more visibility than any single ir on its own, could be the best option for many potential irs.
consortia could be based on any number of attributes of an institution, for example: institutions whose values are aligned, who are similarly sized or focused, who are geographically close, or who are very different and stand to benefit from each other’s strengths. existing partnerships or groups (e.g., ohiolink, the oberlin group) could strengthen their partnerships and create a shared digital repository. just as libraries could learn to give up another piece of cataloging control to a body or elected group, they could learn to make decisions about preserving the scholarly record together by working to develop and maintain the repository as a group.

branding and politics are two issues that institutions may face. even if the libraries want to work together, it is possible that someone at the institution will fear the lack of recognition and control of research output. the university of rochester’s ( ) ur research is a good model for the potential success of group repositories. the numerous schools that make up the university of rochester each has a unique and important identity. some are famous within their field, e.g., the eastman school of music, while some are not; yet, they all share the same homepage. each body within the university is given space, a logo, and an individual search page within the interface. this allows each school to have its own space and to benefit from the site architecture, metadata, cataloging, and maintenance that is specialized and centralized within the library. ur runs the repository on institutional repository software (http://code.google.com/p/irplus/), and it serves as a good example of how a consortial repository might break up its content. with more specific design, each school could have its own colors, logo, etc., and make sure that its work was branded. users could also search by discipline instead of by institutional facet.
in a well-branded consortial ir the pages would still be branded; for example, while on oberlin college’s page a user would still see a yeoman or the coat of arms at the top of the page and be able to easily navigate to other articles in american studies from the consortium. this system could benefit all participating institutions. if an institution is well regarded in the field, it need not worry that joining the open digital research community would detract from it. instead, the sum of all cooperating institutions would be much greater than any one alone. even a large research institution would gain site traffic and visibility if allied with another organization. it could be that where individual schools see branding, they are really just increasing fragmentation in the digital research world.

although the costs would be spread over a wide pool of participants, a community repository would still need a more specific funding model. one model would be that each institution pay an even share of costs as yearly membership dues and meet annually to discuss options, changes, and problems as a board. further, institutions could be contractually obligated to remain in the repository for a minimum of ten years, barring extreme circumstances. this would ensure that no institution had a greater say in running the repository based on funding and that institutions would not be able to withdraw from the collaboration without considerable penalty. the political aspects of hiring repository managers and the physical site of their workspace would most likely be a tough decision, but one that should be made at the time of hire and setup. the repository could be housed in an existing work space at a participating institution to keep costs down, or the consortium could build a new space on neutral ground (see suggestion three). this allows institutions to learn from each other and put pride aside as much as possible for the pursuit of the preservation of digitized knowledge.
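the even-share funding model described above comes down to simple arithmetic. the python sketch below is one possible illustration, not a model taken from the survey: the function names, the budget figure, and the member count are all hypothetical, and the ten-year term is used only because the text proposes it as a minimum commitment.

```python
from decimal import Decimal, ROUND_HALF_UP


def equal_annual_dues(total_annual_cost, member_count):
    """Split a yearly repository budget evenly across member institutions.

    Hypothetical helper: every member pays the same dues, so funding
    gives no institution a greater say than any other.
    """
    if member_count <= 0:
        raise ValueError("member_count must be positive")
    share = Decimal(str(total_annual_cost)) / Decimal(member_count)
    # Round to whole cents.
    return share.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def minimum_commitment(total_annual_cost, member_count, years=10):
    """Total dues one member owes over the minimum membership term."""
    return equal_annual_dues(total_annual_cost, member_count) * years


if __name__ == "__main__":
    # Illustrative figures only; they do not come from the survey.
    print(equal_annual_dues(120000, 8))
    print(minimum_commitment(120000, 8))
```

a consortium that preferred dues weighted by enrollment or deposit volume could swap out the division step, although that would reintroduce the funding-based influence the even-share model is meant to avoid.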
increased communication

at each individual repository, workflows, engagement strategies, budgets, policies, justifications, collection development policies, promotional materials, and metadata practices are all duplicated. this does not have to be the case. working in real time with another person to solve problems and make decisions can be extremely helpful. especially for those ir managers who do not have a committee or working group on site or do not have the time to dig through the literature and forums to find answers to questions, a small body of like-positioned professionals acting as a support network would greatly reduce the redundancy of effort and leave time to do the work of running an ir instead of figuring out how.

site neutrality

while it may be tempting for institutions to house their resources independently out of a sense of school pride, it is difficult to ignore the benefits of collaboration if it makes these materials more easily found and cited. increased site activity, improved search index ranking through more robust linking, and a better overall repository experience that results from collaboration can be a good incentive to convince institutions to join together and share the work of a repository. digital scholarship can be a point of pride, a section of interest on the open web, and uniquely branded, but it does not necessarily need to be uniquely maintained. ultimately, a digital repository need not exist in the physical space of an institution. as long as the object pages are appropriately and clearly branded, the information is visible on the open web. with appropriate metadata attached to each item, a server that houses a repository could exist anywhere. one of the benefits of depositing work into a repository is to release knowledge to the global community. based on this global dispersal, there are few reasons that make local storage important for institutions.
one measure of success for a repository is to have documents downloaded as many times as possible. we celebrate our papers making their way to china and brazil, to ecuador and france. items that are ready for that sort of dissemination do not have the sensitivity issues of some data and can easily be stored on servers without concern for geographic location. an externally based repository could have much the same structure as the consortial system described previously, yet this system would be more autonomous. this would set the repository outside political embattlement and give managers a neutral view over the management of the repository.

shared expertise

many repositories are run on less than fte, with one person trying to do the work of catalogers, website designers, outreach coordinators, liaisons, coders, and copyright experts and still do the rest of their job on top of that. a collaboration allows institutions to share the work of specialists across geographic space so that we do not all have to be experts on each task. if a group of several managers formed together and shared their work and the work of their colleagues, each institution would not have to have a copyright expert, dedicated cataloger, software developer, etc. the work of these specialized employees could be done remotely in exchange for the work of cooperating specialists. additionally, these specialists could be point people to field questions, design workflows, and troubleshoot problems. going beyond information sharing, this model suggests that the actual workloads are shared between institutions.

metrics

repositories that have a clear budget, a dedicated manager, and a clear scope of collection development use their resources more efficiently. they are better able to move forward strategically. from an administrative perspective, statistics based on clear metrics help justify budgets (and budget increases when available).
in a data-driven world, repositories cannot afford to neglect collecting data about the services they provide for their institutions. thinking seriously about the worth of a repository requires measures and indicators that guide that process year to year. use is often a key indicator of the success of an institutional repository. use is measured in downloads, views, and citations. these statistics measure only one aspect of a repository. the following metrics represent some additional indicators of the success of irs.

number of items. one of the major metrics that gives meaning to a repository is its contents. many repositories set a scope of collection to define the types and kinds of items that they may ingest. it is prudent to measure not only the quantity and quality of work in the repository, but also the growth over time. the size of a collection should be measured in comparison with the total output of possible items.

visibility and indexing. another important metric for irs is visibility to search engines. if appropriate metadata are not applied, our content will not be visible from the various search engines and open access aggregators and directories that work to make this information more prominent on the web. according to the jisc infonet repositories toolkit, a good deal of oa ir content can be found through basic internet searches. this is only true, however, if the metadata are exposed correctly (for example, through the open archives initiative protocol for metadata harvesting [oai-pmh]), the content is held on a reliable server, and the html or microdata are optimized so that search engines can more easily index our ir content. a number of services have been developed to enable more specific content searching and thus more relevant results than are possible from general online search engines. these specific searches are not, however, what the majority of people use when searching for information.
it is important to also optimize metadata for general search engines, like google scholar, using additional meta tags and markup, so that these services can index information and provide accurate citation data to the reader. a metric to measure this would be the ratio of indexed items in an ir to the total number of items in the ir.

awareness. if more people know what an ir is and that it exists at an institution, then more conversation about the particulars of copyright, logistics, and item deposit can occur. as awareness grows, there will be increased conversation on campus about scholarly communication issues, which will increase information seeking and understanding of the topic. awareness on campus can lead to appreciation at an institution, which is valuable for long-term administration of a service. this can be measured in polls, in site visits, in reference questions on the topic, or through pulse checks when visiting departmental meetings.

understanding. in order to measure the impact of a repository as an educational tool, it is useful to define a population, a set of skills or knowledge, and a measure of that knowledge. the groups that should be educated include, but are not limited to, faculty (especially those who publish), graduate students, undergraduate students, and administrators. each group has a specific subset of scholarly communication education that would be the most valuable to their work. repository managers should administer surveys, run brief interviews, and listen to the conversations around campus to gauge the knowledge base at their institutions. once there is a baseline of knowledge on campus, ir managers can reach out to the community in a strategic way and develop targeted, measurable education activities about authors’ rights, contracts, and copyright. the number, type, and impact of each teaching situation, if measured, can become a valuable metric when making a case for the tangible use of a repository beyond storage.
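the two proportions named above, the indexing ratio (indexed items over total items) and the collection-size comparison (items held over the total output of possible items), can be computed directly once the counts are known. the following is a minimal sketch in python, assuming a repository manager has already gathered the counts by hand or from reports; the function names and the sample numbers are hypothetical and are not taken from the survey.

```python
def indexing_ratio(indexed_items: int, total_items: int) -> float:
    """Share of repository items that a search service has indexed."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= indexed_items <= total_items:
        raise ValueError("indexed_items must be between 0 and total_items")
    return indexed_items / total_items


def collection_coverage(items_held: int, eligible_output: int) -> float:
    """Share of the institution's eligible output that the repository holds."""
    if eligible_output <= 0:
        raise ValueError("eligible_output must be positive")
    if items_held < 0:
        raise ValueError("items_held cannot be negative")
    return items_held / eligible_output


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(f"indexing ratio: {indexing_ratio(300, 1200):.0%}")
    print(f"collection coverage: {collection_coverage(450, 1500):.0%}")
```

tracking both values year over year, rather than as single snapshots, matches the recommendation above to measure growth over time.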
user experience. user experience can also be used as a metric to judge the success of a repository. the user end of the repository should be easily accessible and provide a clear search and retrieval experience. the institution should plan for at least one initial usability study, with the potential to complete additional usability studies for the duration of the project.

conclusion

based on this survey of ir managers, there is a wide range of issues and barriers that inhibit repositories from optimal performance or planning for a sustainable future. although . percent of repository managers surveyed reported some existing collaboration, many of the partnerships are within a single campus. these partnerships exist mostly between institutional bodies that have a research and scholarship interest, i.e., the library, the graduate school, the research office, a law school, or an academic department. while a few interinstitutional collaborations already exist, there are many more connections that can be developed to benefit repository managers and the ecosystem of scholarly work housed within repositories. new collaborations could be centered on a shared repository, shared workflows and increased communication, shared facilities and infrastructure, shared personnel and expertise, or a shared understanding of how to best measure the successes of an institutional repository. there is a healthy and growing ecosystem of repositories in the united states and globally upon which to build beneficial collaborations. by working together, we lessen the burden on each institution. repositories that are built and maintained using a sustainable and collaborative model will benefit more libraries in the present and more users in the future.

references

albanese, andrew. . “harvard mandates open access.” library journal ( ): – . http://search.ebscohost.com/login.aspx?direct=true&db=llf&an = &site=ehost-live.
arlitsch, kenning, and patrick s. o’brien. . “invisible institutional repositories: addressing the low indexing ratios of irs in google scholar.” library hi tech ( ): – .
bankier, jean-gabriel, and courtney smith. . “repository collection policies: is a liberal and inclusive policy helpful or harmful?” australian academic & research libraries ( ): – .
bugeja, michael j., and daniela v. dimitrova. . vanishing act: the erosion of online footnotes and implications for scholarship in the digital age. sacramento, ca: litwin books.
chan, leslie. . “supporting and enhancing scholarship in the digital age: the role of open access institutional repository.” canadian journal of communication ( ). http://cjc-online.ca/index.php/journal/article/view/ / .
cornell university library. . “arxiv membership model.” http://arxiv.org/help/support/faq.
crow, raym. a. “the case for institutional repositories: a sparc position paper.” arl: a bimonthly report on research library issues & actions : – . http://sparc.arl.org/sites/default/files/media_files/instrepo.pdf.
———. b. sparc institutional repository checklist and resource guide. the scholarly publishing & academic resources coalition. http://sparc.arl.org/sites/default/files/presentation_files/ir_guide__checklist_v .pdf.
cutler, kim-mai. . “elsevier’s research takedown notices fan out to startups, harvard, individual academics.” techcrunch. http://techcrunch.com/ / / /elsevier/.
cybermetrics lab. . the ranking web of world repositories—north america. http://repositories.webometrics.info.
gilman, isaac k., gilman, m. i., and marita kunkel. . “from passive to pervasive: changing perceptions of the library’s role through intra-campus partnerships.” collaborative librarianship ( ): – .
moriano, pablo, emilio ferrara, alessandro flammini, and filippo menczer. . “dissemination of scholarly literature in social media.” figshare. http://dx.doi.org/ . /m .figshare. .
nabe, jonathan a. . starting, strengthening, and managing institutional repositories. new york: neal-schuman publishers.
ocholla, dennis n. . “an overview of issues, challenges and opportunities of scholarly publishing in information studies in africa.” african journal of library, archives & information science ( ): – .
oder, norman. . “study: in downturn, academic libraries to focus on value, roi.” library journal. http://lj.libraryjournal.com/ / /managing-libraries/study-in-downturn-academic-libraries-to-focus-on-value-roi/.
ohiolink. . “welcome to ohiolink.” http://www.ohiolink.edu/.
open archives initiative. . “open archives initiative protocol for metadata harvesting.” http://www.openarchives.org/pmh/.
symons, craig. . “it strategy maps: a tool for strategic alignment visual representation of strategy makes it-business communication easier.” forrester research. http://cendoc.esan.edu.pe/fulltext/e-documents/itstrategymaps.pdf
university of rochester. . “ur research.” https://urresearch.rochester.edu

appendix

questionnaire

1. what is the name of your repository?
2. what platform do you run your repository on?
3. who leads the work of your ir, and what is their title?
4. is there a committee or advisory group in place to support the work of your ir?
5. what kind of materials do you collect? (scholarly, archival, audio, visual materials, etc.)
6. when was your repository started?
7. what was the startup budget?
8. what is your estimated annual budget?
9. what percent of personnel assignments and time are spent on the repository?
10. what do you and your colleagues spend the most of your time on?
11. what is your biggest accomplishment? what helped that to happen?
12. what has been a struggle? what resources could help solve that?
13. have you had any support from other departments, universities, or other outside organizations? if so, how were you supported?
14. is there anything else you would like to share about your repository or the work you do in support of it?
a case study of librarian outreach to scientists

title: a case study of librarian outreach to scientists: collaborative research and scholarly communication in conservation biology

keywords: scholarly communication, altmetrics, data sharing, research collaboration, social networks, open access

contact info:
tina m. adams, tadams @gmu.edu, distance education librarian, george mason university libraries, university dr. fairfax, va
kristen a. bullard, bullardk@si.edu, conservation biology librarian, smithsonian libraries, p.o. box , mrc washington, dc -

tina m. adams would like to acknowledge george mason university libraries for providing the author research leave to complete this manuscript. the authors would like to acknowledge the course collaborators: katherine christen, phd, brian gratwicke, phd & scott loss, phd (smithsonian conservation biology institute); and jennifer hammock, phd & katja schulz, phd (encyclopedia of life).

abstract

global collaboration is increasingly important across universities and conservation biology organizations. in this example, a partnership resulted in the creation of a short-course aimed at exploring communication forms and digital tools that facilitate scholarly communication in conservation biology. questions the authors hoped to answer in the course were: what are the benefits and limitations of these tools? how can researchers in conservation biology use these methods to share data, show impact and connect to colleagues and stakeholders? why is open access even more important in this highly collaborative scholarly environment? how can communities of interest benefit scientists and the public?
introduction

global collaboration is increasingly important across universities and conservation biology organizations, and becomes more so as the scholarly communication environment evolves. while this evolution requires increased technical savvy, it allows for more creative collaboration that crosses institutional boundaries more easily and frequently than ever before. in this example, a partnership between the distance education librarian at george mason university libraries and the conservation biology librarian at the smithsonian libraries resulted in the creation of a short-course aimed at exploring communication forms and digital tools that facilitate scholarly communication in conservation biology. the authors collaborated with a smithsonian conservation biology institute (scbi) training manager who teaches courses for the smithsonian-mason school of conservation (smsc) and who was interested in developing a course on this topic. to learn about these emerging scholarly communication methods from the perspective of practitioners, the authors invited technology-savvy scbi researchers as well as two project coordinators from the encyclopedia of life (eol) to serve as guest speakers. questions the authors hoped to answer in the course were: what are the benefits and limitations of these tools? how can researchers in conservation biology use these methods to share data, show impact and connect to colleagues and stakeholders? why is open access even more important in this highly collaborative scholarly environment? how can communities of interest benefit scientists and the public? the authors highlight the course content and share results from the survey of course participants.
literature review

most of the literature on academic collaborations between faculty and librarians has focused on traditional relationships, such as librarians collaborating with faculty to provide information literacy instruction to students (gallegos and wright ). more recently, initiatives have evolved to embed librarians in online courses (schulte ) or to incorporate library resources and services into courses and collaborate on library assignments (bhatti ). in fact, the literature is replete with examples of faculty and librarian partnerships to teach information literacy skills and improve student outcomes in courses (massis ; mavodza ; mounce ; rader ), with one study attributing % of librarian/faculty collaborations to information literacy goals (gallegos and wright , ). collaborations among librarians and scientists or researchers in non-university settings are not as well documented in the literature. galloway, pease and rauh ( ) discuss the benefits of altmetrics to scholars and how librarians can assist scholars with new ways of tracking research impact by engaging faculty entering the tenure track. many see librarians’ roles as evolving to include educating researchers about new scholarly communication tools, and advise that future efforts should focus on the development of library services and librarian roles in relation to support of scholarly communication and social media. as librarians, we should be communicating with the researchers we support regarding new tools and methods of scholarly communication that can advance their research and scholarship, including offering training and services around these new applications (o’dell ; malenfant ; gu and widén-wulff ).
while o’dell ( , ) argues that “the most significant obstacle encountered thus far has been the culture change that has had to happen for faculty to accept support from the library beyond information gathering,” malenfant ( , ) contends that there is a culture change necessary for librarians too, who must not only develop a new knowledge base--understanding scholarly communication--but must develop new skills, namely advocacy and persuasion. he notes, “if we want people to work in new ways, they have to be comfortable hearing ‘no’ and feel that it’s ok” (malenfant , ). in special libraries, where embedded librarianship is a common collaboration model, librarians serve as collaborators on research projects or as part of a research team. these project-based partnerships have a start and an end date, and at the end of the project the librarian’s commitment is over (carlson and kneale ; brandt ; reynolds, smith and d’silva ). an additional mode of embedded librarianship is the programmatic collaboration, where librarians are hired by an organization on a full-time, ongoing basis. this librarian will have defined functions and responsibilities to support the research activities of the organization. unlike the project-based approach, this librarian supports multiple projects within the organization (carlson and kneale ).

cross-institutional collaboration

of the collaboration models discussed, this article aligns most closely with the project-based model and is a case study of a unique partnership that involved librarians, faculty and scientists from across a number of institutions, including mason libraries, smithsonian libraries, the smsc, the scbi, and the eol. the original goal of this collaboration was to develop the curriculum and co-teach an smsc course for conservation biologists about emerging scholarly communication methods.
the endeavor began when the smithsonian libraries conservation biology librarian had a research consultation with an smsc instructor, which led to an invitation to collaborate on the course. due to the smithsonian-mason partnership, the mason libraries distance education librarian was invited to participate. the faculty member was initially surprised by how much librarians know about these topics and by our willingness to collaborate. the full course has not yet been offered by smsc, but after several meetings we decided that there was interest in moving forward and that this would be a great topic for the upcoming international congress for conservation biology (iccb) conference, which would be held locally in baltimore, md. a pre-conference short-course would provide an opportunity to pilot the course content with conservation biologists. although conversant on many of the topics related to scholarly communication, the librarians knew they had more to learn. after conducting research on emerging tools and digital scholarship issues, the librarians met again with the faculty member to plan the course curriculum. the librarians, in partnership with the faculty member, developed the course description and learning outcomes and then devised a course outline that would meet the learning outcomes. each participant took the lead on specific topics. for many of the content areas, we decided that it would be worthwhile to partner with scientists actually using these tools and grappling with many of the issues, such as data sharing and citizen science. the smithsonian libraries’ librarian and the smsc instructor approached scientists in their organizations who they knew were using some of these tools and asked them to serve as guest speakers in the short course.
in designing the course, we knew our participants would be coming from all over the world due to the international scope of the conference, so we purposefully highlighted resources that did not require an institutional affiliation. topics covered in the course included the researcher personae, professional communities, public communities, citation collaboration, data sharing, alternative metrics and open access publishing. below is a synopsis of these tools and how they can benefit conservation biologists. there are many more tools and communities than we will discuss here. for additional tools and sites, see roemer and borchardt ( ), mcmahon et al. ( ), and macmillan et al. ( ).

researcher personae

the online face of a researcher is the most visible and findable variable of new media and can have implications for collaboration, employment, and professional reputation. for this reason, we began the course by discussing the importance of creating an online researcher presence, how tools can help researchers disambiguate their work from that of researchers with similar names, and how these tools can help researchers track their research impact beyond traditional citation metrics. posner ( ) suggests that when creating an effective online presence, “it’s important to carry the same voice, image, and persona across multiple social networking platforms” and that you should choose a profile picture that is consistent and professional. highlighted during the course were orcid, impact story, and google scholar citations profile. all have benefits and help researchers track their research impact and cultivate a professional persona.

orcid http://orcid.org/ is an open, non-profit, community-based effort to provide a registry of unique researcher identifiers and a transparent method of linking research activities and outputs to these identifiers.
orcid is a platform-independent identifier unique in its ability to reach across disciplines, research sectors, and national boundaries and in its cooperation with other identifier systems (such as researcher id, impact story and google scholar).

impact story http://impactstory.org/ is an open-source, web-based tool that helps researchers explore and share the diverse impacts of all their research products--traditional ones like journal articles, but also alternative products like blog posts, datasets, and software. by helping researchers tell data-driven stories about their impacts, impact story aims to help build a reward system that values and encourages new forms of web-native scholarship.

google scholar citations http://scholar.google.com/citations allows scholars to track citations to their publications, set up alerts to track who is citing their publications, graph their citations over time and compute citation metrics. google scholar profiles will show up in google scholar search results when someone searches for the researcher’s name.

professional communities

conservation biologists can develop connections with other professionals through shared online communities, using a variety of tools, methods and practices. these communities of shared interests advance conservation biology. in the course we experimented with researchgate by setting up a project area for our short-course, where we started online discussions after the course and shared links and papers.

researchgate http://www.researchgate.net is one of the largest online communities for researchers, with more than million users. scholars can not only find potential research partners, but can “follow” researchers and set up groups called “projects”. researchgate is aimed at scientists and can search pubmed, citeseer, the institute of electrical and electronics engineers and arxiv.
it offers a search engine to help researchers find similar papers and gives users the opportunity to rank and comment on papers. this tool also allows users to track their research metrics if the user creates a profile.

facebook http://www.facebook.com is the largest social network with . billion users (kiss ). the ability to join or create special interest groups has proved to be valuable for communication among scholars. some facebook groups are open (e.g., the faceplant open facebook group), but other groups are by invitation only, so there are barriers to use.

while it is generally accepted that there have been privacy issues with facebook, the authors continue to advocate its use because the sheer scale of facebook brings the power of crowd-sourcing to bear, which can help with real-life research needs. in one such example we highlighted in the course, a team of ichthyologists sponsored by the smithsonian institution national museum of natural history performed the first survey of the fish diversity in the cuyuni river of guyana. the researchers needed to identify over , specimens in less than a week’s time in order to obtain an export permit. faced with insufficient time and inadequate resources, they posted a catalog of specimen images to facebook, hoping their colleagues would assist (sidlauskas ). in less than hours, the facebook participants had identified approximately % of the posted specimens to at least the level of genus, and revealed the presence of two newly discovered species. the majority of commenters held a phd in ichthyology or a related field, and represented countries from all over the world (sidlauskas ).
in another example, brian gratwicke, one of our guest speaker collaborators, recorded a video detailing how his small community of researchers uses facebook to share and comment on each other’s work on amphibian conservation research, specifically on mitigating the effects of amphibian disease. many in the facebook group share preliminary results and data and elicit feedback from their colleagues as part of their research process (brian gratwicke, unpublished video). in the course evaluation survey, one participant noted, “i appreciate the videos with brian, but what i really wanted was to be able to do a q and a interaction with a scientist who actively used these tools.” brian and scott, the scbi collaborators, were unable to attend our short-course session, so we recorded interviews with them instead, but it is clear there was a preference for having all the guest speakers in person.

public communities

public engagement, such as citizen science projects, can leverage online tools and communities to raise public awareness, share news and discoveries, collect data, foster feedback, and generate funding for organizations. the most successful citizen science endeavors focus on a scientific question or environmental issue that is best addressed by analyzing large amounts of data that are collected across a wide area, or over a long period of time (bonney et al. ). in the course, we looked at contributory projects, which are generally designed by scientists and for which members of the public primarily contribute data (bonney et al. ). two organizations that we spotlighted were encyclopedia of life (eol) and inaturalist. our guest speakers jennifer hammock and katja schulz are both project coordinators for eol, and brian gratwicke is an inaturalist contributor. this first-hand knowledge provides a unique perspective on these organizations and how they enable many individuals to work in concert, collecting and contributing information and data.
encyclopedia of life (eol) www.eol.org is a free, online collaborative encyclopedia intended to document all of the . million living species known to science. it is compiled from existing databases and from contributions by experts and non-experts throughout the world and has a goal of building one page for each species that will include video, sound, images, graphics, and text.

inaturalist www.inaturalist.org is a site for the interested public where users can record what they see in nature, meet other nature lovers, and learn about the natural world from scientists.

one question we hoped to answer was how communities of interest can benefit scientists and the public. scientists sometimes have reservations about the utility of citizen science projects due to the challenge of managing the data and verifying accuracy. data verification is a real concern, and researchers using citizen science approaches need to integrate a method to validate data. wiggins et al. ( ) conducted a sampling of citizen science data verification methods and found that the primary mechanism for validating data submitted by the public is still expert review. photo submissions were the next most used method, but can be a challenge for large-scale projects unless they provide an infrastructure for online identification and classification. regardless of the challenges, citizen science has advantages: it allows researchers to interact and gather data in a scalable way, leverage existing networks for their work, contribute to the global conservation community, and educate the public about conservation issues. for many researchers, it is therefore a worthwhile partnership.

citation collaboration

the ability to share and discuss the research literature is vital in the world of digital scholarly communication.
often both project-based and cross-institutional, this type of collaboration can include sharing and annotating articles for literature reviews.

mendeley http://www.mendeley.com/ is a free reference manager and academic social network that can help you organize your research, generate bibliographies, collaborate with others online, discover and share the latest research, and manage and annotate your pdf documents. mendeley has a web browser plugin that extracts and saves complete bibliographic references from web resources, and a word processor add-on that incorporates citations into documents as in-text citations and reference lists (zaugg et al. ).

zotero www.zotero.org is a free research tool that helps you collect, organize, and annotate your references and share them with other zotero users. additional functionality includes the zotero web browser plugin, which extracts and saves complete bibliographic references from web resources (chnm ).

of the two, mendeley has the advantage of incorporating research statistics. mendeley’s version of altmetrics reports how often articles are saved by different users and how articles are being tagged, enabling researchers to see how often different articles are being read, or at least accessed (zaugg et al. ). in fact, mendeley downloads and reads are now reported in most altmetric tracking tools. both tools offer significant advantages to researchers who would like to collaborate by sharing references and annotations, and both serve as discovery tools for finding related research.

data sharing

data sharing and reuse have become increasingly important, especially in the sciences. funding agencies have begun requiring data release. in , the national institutes of health (nih) added a data management plan requirement for grants over $ , (nih ).
the national science foundation (nsf) has a statement requiring data sharing in its grant contracts, but has not enforced the requirement consistently; however, in the nsf announced that all future grant proposals would require a data management plan and would be subject to peer review (nsf ). increasingly, journals like science and nature have begun to expect researchers to submit data as part of the article submission process. for example, nature will be launching scientific data in may , a new open-access, online-only publication for descriptions of scientifically valuable datasets. scientific data exists to help researchers publish, discover and reuse research data (npg ). beyond being a requirement, data sharing is the ideal, as it accelerates the process of discovery and advancement in conservation research.

in some content areas, like data sharing, the librarians relied heavily on the expertise and experience of our invited guests to take the lead for the course. we approached two project coordinators from the eol, who did an excellent job of explaining current issues and practices in data sharing, showing some of the many data sharing repositories, and explaining the differences among primary data repositories, data aggregators and citizen science projects (hammock and schulz figure ). two primary data repositories highlighted in the course were dryad and pangaea.

dryad digital repository datadryad.org is a curated general-purpose primary data repository that makes the data underlying scientific publications discoverable, freely reusable, and citable. dryad has integrated data submission for a growing list of journals. ecology and evolution is the latest journal to integrate submission of manuscripts with data to dryad (schaeffer ).

pangaea http://www.pangaea.de is a primary data repository operated as an open access library aimed at archiving, publishing and distributing earth and environmental science data.
the system guarantees long-term availability of its content through a commitment of the operating institutions. authors submitting data to the pangaea data library for archiving agree that all data are provided under a creative commons license.

the rationale for sharing data is, first, to reproduce or verify research and, second, to make the results of publicly funded research available to the public. sharing data also enables others to ask new questions of existing data and to advance the state of research and innovation. even though data sharing is the ideal, it comes with challenges. the reasons why researchers may not share data include incompatible or non-transferable data formats, missing or insufficient descriptive metadata, ethical issues, a lack of incentives, and concern over re-use and ownership.

“if the rewards of the data deluge are to be reaped, then researchers who produce those data must share them, and do so in such a way that the data are interpretable and reusable by others. underlying this simple statement are thick layers of complexity about the nature of data, research, innovation, and scholarship, incentives and rewards, economics and intellectual property, and public policy. sharing research data is thus an intricate and difficult problem - in other words, a conundrum.” (borgman , )

at this point, there are no easy answers, but some have suggested ways that data can be shared, and both dryad and pangaea have incorporated elements of these models. essentially, an attribution system would be built into all data banks to make sure the original researcher retains credit for, or ownership of, their data. an issue facing journals and data banks is how to ensure proper citations for data sets.
attribution is very important, especially in science. without a way of assigning credit for original data, scientists will be reluctant to share, as they will not get recognition for their work or their data could be “poached” or “scooped,” in the parlance of researchers. william michener, director of e-science initiatives for university libraries at the university of new mexico, albuquerque, and a leader of dataone, sees an additional challenge: “changing the culture of science from one where publications [are] viewed as the primary product of the scientific enterprise to one that also equally values data” (quoted in nelson ). the discussions that arose in the course surrounding both the positive and negative aspects of data sharing revealed the complexity of this new scholarly environment.

alternative metrics

traditional citation-based metrics of a researcher’s published work have been the basis of decisions like promotions and budgets for decades. increasingly, scholars are becoming aware of the limitations of traditional scholarly metrics, such as the lag time between publication and any visible impact from the publication being cited. these limitations can provide librarians with opportunities to initiate discussions with researchers by showing how alternative metrics, or altmetrics, can capture a more immediate and accurate picture of scholarly influence. altmetrics are an attempt to measure web-driven scholarly interactions. galloway, pease and rauh ( ) point out the limitations of new altmetrics tools, including the challenge of disambiguating authors, note how tools like orcid are endeavoring to solve this issue by providing unique identifiers for authors, and point to the need for faculty to establish a unified digital profile. applying metrics to contributions to scientific and public communities using altmetrics has been difficult, yet progress is beginning to be made.
altmetric explorer http://www.altmetric.com/ is a fee-based service that captures hundreds of thousands of tweets, blog posts, news stories and other pieces of content each week that mention scholarly articles and allows users to browse, search and filter this data.

altmetric it! http://www.altmetric.com/bookmarklet.php is a bookmarklet available for chrome, firefox and safari browsers (not internet explorer) that tracks altmetrics at the article level. once on a journal article page, click "altmetric it!" to see the altmetrics for that article.

plum-x (plum analytics) http://www.plumanalytics.com/about.html is a fee-based product that tracks more than different types of artifacts, including journal articles, books, videos, presentations, conference proceedings, datasets, source code, cases, and more. its suite of tools helps answer questions about research impact.

there have been attempts to create a statistical methodology that defines different types of altmetrics. priem et al. reported finding five patterns of usage: highly rated by experts and highly cited; highly cited; highly shared; highly bookmarked, but rarely cited; and uncited (priem et al., “altmetrics in the wild,” ). plum analytics categorizes altmetrics into five separate types: 1) usage; 2) captures; 3) mentions; 4) social media; and 5) citations (plum analytics ). for a brief chart showing the different types of altmetrics that can be tracked, see figure “examples of altmetrics.”

in the example we used in the course, our guest speaker, scott loss from the smithsonian conservation biology institute, co-published a paper examining the impact of free-ranging domestic cats on wildlife of the united states, which was one of the first data-driven systematic reviews on the issue.
as scott notes, “we expected there to be a decent response to the paper given the contentious nature of the topic.” within a day of publication, the article was picked up by most press agencies, including the new york times, and was tweeted by readers at these news outlets as well as by organizations like the nature conservancy. “the article took on a life of its own,” said scott. a recent review of the altmetrics of this article, which was only published in , paints an interesting picture. the article was shared in tweets times, posted times on facebook pages, mentioned in google+ post, picked up by news outlets, posted to scientific blogs, and recommended in reddit libraries (nature communications ). while one may argue that altmetrics are not necessarily an indication of the quality of an article, in this case, in addition to altmetrics, the article has web of science citations, cross-ref citations and scopus citations, for a total of “traditional” citations in less than a year’s time. the article is in the percentile (ranked st) of the , tracked articles of a similar age in all journals and is in the percentile (ranked st) of the tracked articles of a similar age in nature communications (nature communications ).

unfortunately, the news agencies sensationalized the topic by reporting the conclusions of the study without the accompanying qualifications, and ran with a “killer cats” theme. the smithsonian press office estimated that there were about million unique viewers of the press coverage of the article (scott loss, unpublished video). this is a great example of how internet coverage can lead to greater visibility for science research and can inform the public, but can also lead to sensationalized coverage. in the future, altmetrics may be consulted in promotion and tenure decisions.
this will be especially important as studies are beginning to find that some articles may be heavily read and saved by scholars but seldom cited (priem et al., “altmetrics in the wild,” ). for the time being, altmetrics can serve as a “reader’s advisory” for scholars to stay abreast of research in a given field of study and to gauge the impact of their research (galloway, pease and rauh ). for a more complete overview of a wide range of specific altmetrics tools, see roemer and borchardt ( ).

open access publishing

closely related to the issues raised in research collaboration and altmetrics is the issue of open access (oa) publishing. oa literature is online, free of charge, and free of most copyright and licensing restrictions. oa removes price barriers (subscriptions, licensing fees, pay-per-view fees) and permission barriers (most copyright and licensing restrictions) (suber ). the course discussion of oa publishing included how it can intersect with research collaboration and altmetrics. a distinct challenge for researchers not affiliated with large, well-funded research libraries is that electronic access to the literature is not assured. this is due to unprecedented increases in the price of journal subscriptions, as well as practices like “bundling” that do not allow cancellation of individual titles in the bundle, leading libraries to subscribe to some titles they might not otherwise have chosen and often requiring cuts in non-bundled journals to maintain bundled subscriptions (willinsky ). laakso notes (“background” ), “open access (oa) has expanded the possibilities for disseminating one's own research and accessing that of others.” the benefit to scholars is greater research impact. so how should scholars who would like to publish their research on oa platforms proceed?
it is important to understand that oa can be accomplished either by publishing in an oa journal (called “gold oa”) or by self-archiving in an online digital or institutional repository (called “green oa”) (suber ). many of our course participants were surprised to learn that being the author of a work did not mean they retained the right to deposit a copy of it in either a digital or institutional repository (green oa). one thing librarians can begin doing is educating faculty on their rights as authors and encouraging them to assert those rights when signing publishing contracts. here is a synopsis of three tools helpful for navigating the landscape of open access journals.

directory of open access journals (doaj) http://www.doaj.org/ aims to increase the visibility and ease of use of open access scientific and scholarly journals, thereby promoting their increased usage and impact. the doaj aims to be comprehensive and cover all open access scientific and scholarly journals that use a quality control system to guarantee the content. in short, the doaj aims to be a clearinghouse for users of open access journals.

beall’s list http://scholarlyoa.com/publishers/ is a list of potential, possible, or probable predatory scholarly open-access publishers based on evaluation criteria such as the credentials of the journal’s editorial staff; the journal’s publisher and publishing model; integrity issues related to impact factors, journal mission, indexing claims and peer review; author experiences; and adherence to codes of conduct, specifically the open access scholarly publishers association (oaspa) code of conduct, the committee on publication ethics (cope) code of conduct for journal publishers, and the international association of scientific, technical & medical publishers (stm) code of conduct.
sherpa/romeo http://www.sherpa.ac.uk/romeo/ is a great resource for checking copyright and self-archiving policies by journal. use this site to find a summary of the permissions that are normally given as part of each publisher's copyright transfer agreement.

though these resources are a step in the right direction, science’s recent investigation, in which the author targeted dozens of oa journals in an elaborate sting and these journals accepted a spoof cancer research article, raises questions about peer-review practices in oa journals. bohannon notes, “some say that the open-access model itself is not to blame for the poor quality control revealed by science’s investigation. but open access has multiplied that underclass of journals, and the number of papers they publish” ( ). this study points out the need to educate scholars about predatory journals and the need for more rigor regarding what makes the doaj list, especially since the study uncovered that some journals beall labelled predatory made it into doaj. for a more complete treatment of open access, including history and statistics, see willinsky ( ), hitchcock ( ) and suber ( ).

course outcomes

overall the course was very successful. we had attendees from many countries, including australia, canada, cameroon, england, guyana, new zealand, thailand and the united states. the attendees represented many kinds of institutions, including universities, government agencies, and ngos such as the international union for conservation of nature (iucn) and the national land resource center (nlrc). ten attendees responded to the survey. of the responses, all “strongly agreed” ( ) or “agreed” ( ) that the short-course was valuable.
all “strongly agreed” that the instructors were prepared and knowledgeable, most “strongly agreed” ( ) or “agreed” ( ) that the instructors were approachable/accessible, and the majority “agreed” ( ) or “strongly agreed” ( ) that the hands-on portion of the course helped them better understand the concepts presented. all participants would recommend the short-course to a friend or colleague.

the response to the course was positive, and one of our presenters, brian gratwicke, reported being approached during the conference by course attendees with great things to say about it. even so, the comments revealed some confusion over what the participants thought the course was going to be about. the mention of social media in the course description seemed to confuse some participants, who expected a different focus. a couple of participants noted:

“i expected to focus on social media (facebook and twitter) and how to grow followers and get out a conservation message to different audiences” [but] “this was really about how to make yourself known as a researcher and how to use online tools to advertise yourself and work with collaborators. that said, it was still super helpful and something that i think is unique among the usual social media classes. you have so many topics; this could certainly be a one week or longer course.”

“while i did find this short course to be valuable, i admit i signed up because i was most interested in what was discussed in the latter part of the course: new media communication skills for public discourse/sharing about science. i had thought that was what the course would be focused on due to the description.”

another suggestion was to have everyone in the room introduce themselves at the beginning of the session.
we had discussed doing this beforehand but decided against it because of time constraints. some participants, however, would have valued the networking opportunity and the opportunity, as one participant put it, [to meet the] … “interesting people in the room.” if this content is further developed into a course for the smsc, it will be important to continue to include librarians and scientists as guest speakers. in addition, more content on organizational marketing and outreach for ngos and research institutes should probably be included in the curriculum plan.

conclusion

collaboration outside of traditional library roles is an important new arena for librarians and can result in a rewarding experience, professional growth and beneficial outcomes. to be successful in these partnerships, librarians must embrace change and be willing to learn and take risks. it is also important for librarians to get buy-in from their organization and/or direct supervisors, since this is a new role requiring time to learn new content areas and time to build trust and relationships with potential conservation biology research project teams.

in any collaboration, patience is important, and this example is no exception. staying on track as a group could be challenging. in addition to occasional travel to each other’s sites, the authors used the internet video phone service vidyo to meet virtually from our different institutions and locations, the document sharing tool dropbox to share outlines, powerpoints and videos, and the research tool zotero to save, annotate, share and format references. we also created a project space in researchgate for the short-course, which attendees were encouraged to join and where we shared articles and posted discussion threads after the short-course to continue conversations. these tools kept us on track and served as a “practice what you preach” approach.
the discussions that arose in the course surrounding both the positive and negative aspects of these tools revealed the complexity of this new scholarly environment.

the authors benefited greatly from collaborating with others, as this enabled us to learn from one another and gain new expertise. the librarians agreed that having the scientists involved not only lent a sense of legitimacy to the course, but also helped the librarians relinquish parts of the course content (like data sharing and citizen science) for which we have no first-hand experience. at the same time, the scientists graciously ceded the floor when it came to “library stuff” like altmetrics, oa publishing and citation management tools. respecting each other’s expertise is the key to a successful collaboration.

references

bhatti, rubina. . "teacher-librarian collaboration in university libraries: a selective review." pakistan library & information science journal ( ): - .

bohannon, john. . “who’s afraid of peer review?” science ( ): – . doi: . /science. . . .

bonney, rick, heidi ballard, rebecca jordan, ellen mccallie, tina phillips, jennifer shirk, and candie c. wilderman. “public participation in scientific research: defining the field and assessing its potential for informal science education.” center for advancement of informal science education, last modified june . http://informalscience.org/documents/legacy_newsletters/caise% newsletter% jun- jul% .pdf

borgman, christine l. . “the conundrum of sharing research data.” journal of the american society for information science and technology ( ): – . doi: . /asi. .

brandt, scott d. . "librarians as partners in e-research: purdue university libraries promote collaboration." college & research libraries news ( ): - .

carlson, jake, and ruth kneale. . "embedded librarianship in the research context: navigating new waters." college & research libraries news ( ): - .
chnm (roy rosenzweig center for history and new media). “zotero | about.” zotero, accessed march , . https://www.zotero.org/about/.

gallegos, bee, and thomas wright. . "collaborations in the field: examples from a survey." in the collaborative imperative: librarians and faculty working together in the information universe, edited by richard raspa and dane ward, - . chicago: association of college and research libraries.

galloway, linda m., janet l. pease, and anne e. rauh. . “introduction to altmetrics for science, technology, engineering, and mathematics (stem) librarians.” science & technology libraries ( ): – . doi: . / x. . .

gu, feng, and gunilla widén-wulff. . "scholarly communication and possible changes in the context of social media: a finnish case study." the electronic library ( ): - .

hitchcock, steve. "the effect of open access and downloads ('hits') on citation impact: a bibliography of studies." the open citation project, last modified june , . http://opcit.eprints.org/oacitation-biblio.html

kiss, jemima. . “facebook’s th birthday: from college dorm to . billion users.” the guardian, last modified february , sec. technology. http://www.theguardian.com/technology/ /feb/ /facebook- -years-mark-zuckerberg.

macmillan, don. . "mendeley: teaching scholarly communication and collaboration through social networking." library management ( / ): - .

malenfant, kara j. . “leading change in the system of scholarly communication: a case study of engaging liaison librarians for outreach to faculty.” college & research libraries ( ): – .

massis, bruce e. . "librarians and faculty collaboration – partners in student success." new library world ( / ): - .

mavodza, judith. . "the academic librarian and the academe." new library world ( / ): - .
mcmahon, tamara m., james e. powell, matthew hopkins, daniel a. alcazar, laniece e. miller, linn marks collins, and ketan k. mane. . "social awareness tools for science research." d-lib magazine ( / ). http://www.dlib.org/dlib/march /mcmahon/ mcmahon.html

mounce, michael. . "working together: academic librarians and faculty collaborating to improve students' information literacy skills: a literature review - ." reference librarian ( ): - .

nature communications. . “article metrics for: the impact of free-ranging domestic cats on wildlife of the united states.” nature communications, accessed march , . http://www.nature.com.mutex.gmu.edu/ncomms/journal/v /n /ncomms /metrics.

nelson, bryn. . “data sharing: empty archives.” nature news ( ): – . doi: . / a.

nih (nih office of extramural research). “data sharing regulations/policy/guidance chart for nih awards.” last modified august , . http://grants.nih.gov/grants/policy/data_sharing/data_sharing_chart.doc

npg (nature publishing group). “scientific data | about.” scientific data is now open for submissions! accessed march , . http://www.nature.com.mutex.gmu.edu/scientificdata/about/.

nsf (national science foundation). . “dissemination and sharing of research results.” national science foundation, last modified november , . http://www.nsf.gov/bfa/dias/policy/dmp.jsp

o'dell, sue. . "opportunities and obligations for libraries in a social networking age: a survey of web . and networking sites." journal of library administration ( ): - .

“plum analytics | metrics.” . plum analytics, december , . http://www.plumanalytics.com/metrics.html.

posner, miriam. . “creating your web presence: a primer for academics.”
the chronicle of higher education: profhacker column. february , . http://chronicle.com.mutex.gmu.edu/blogs/profhacker/creating-your-web-presence-a-primer-for-academics/ .

priem, jason, heather a. piwowar, and bradley m. hemminger. . “altmetrics in the wild: using social media to explore scholarly impact.” arxiv: . [cs], http://arxiv.org/abs/ . .

rader, hannelore b. . "information literacy - : a selected literature review." library trends ( ): - .

reynolds, latisha m., siobhan e. smith, and margaret u. d'silva. . "the search for elusive social media data: an evolving librarian-faculty collaboration." journal of academic librarianship ( ): - .

roemer, robin chin, and rachel borchardt. . “from bibliometrics to altmetrics: a changing scholarly landscape.” college & research libraries news ( ): – .

schaeffer, peggy. . “ecology and evolution integrates with dryad.” dryad news and views, january , . http://blog.datadryad.org/ / / /ecology-and-evolution-integrates-with-dryad/.

schulte, stephanie j. . "embedded academic librarianship: a review of the literature." evidence based library & information practice ( ): - .

sidlauskas, brian. . “crowdsourcing via social media allows rapid remote taxonomic identification - national museum of natural history unearthed.” national museum of natural history unearthed, last modified march , . http://nmnh.typepad.com/ years/ / /crowdsourcing-via-social-media-allows-rapid-remote-taxonomic-identification-.html.

smith, anne-marie. . "understanding the relationship between the librarian and the academic." new review of academic librarianship ( ): - .

suber, peter. “open access overview.” earlham college, last modified december , . http://bit.ly/oa-overview.

wiggins, andrea, greg newman, robert d. stevenson, and kevin crowston. . "mechanisms for data quality and validation in citizen science." in e-science workshops (esciencew), ieee seventh international conference on, pp. - .
willinsky, john. “the access principle: the case for open access to research and scholarship.” mit press, accessed july , http://mitpress.mit.edu/sites/default/files/titles/content/ _download_the_full_text.pdf.

zaugg, holt, richard west, isaku tateishi, and daniel l. randall. . “mendeley: creating communities of scholarly inquiry through research collaboration.” techtrends ( ): - .

figure data sharing repositories

figure examples of altmetrics

appendix a

short course evaluation

short course: new media matters: communicating conservation research & ideas

please respond to the following questions using the rating scale below. feel free to add comments in the comments field.

1. i found the information in this short course to be valuable
strongly agree | agree | disagree | strongly disagree | no opinion

2. the instructors were prepared and knowledgeable
strongly agree | agree | disagree | strongly disagree | no opinion

3. the instructors answered questions appropriately and were accessible.
strongly agree | agree | disagree | strongly disagree | no opinion

4. utilizing hands-on exercises in class helped me to understand the concepts.
strongly agree | agree | disagree | strongly disagree | no opinion

5. would you recommend this workshop to others?
yes | no

comments:

poster presentation of ongoing work for a white paper by the operas working group on best practices

best practices in open access scholarly publishing

|transition to oa

the term ‘transition to oa’ is understood differently in different contexts: from the perspective of publishers, librarians, funders, researchers, and bibliometrics. from the perspective of established publishers, it means the transition from a subscription-based model to a fully or partially oa model. for libraries, it means making the institution’s research output openly available through an institutional repository and, increasingly, negotiating with publishers to achieve oa within the framework of existing agreements. for researchers, it means looking for an oa publication channel or depositing their work in institutional or thematic repositories.

emerging practices: the fairoa alliance for journal editors; the oa initiative for libraries and consortia; knowledge unlatched for libraries and publishers.

|authors

authors who want (or need) to publish their article in open access are confronted with a plethora of choices. there is an increasing range of models besides gold and green oa, with a variety of open licenses and embargo periods. prices range from no fee to a publication fee of over € per article. funders may require oa and be willing to pay publication charges, but the terms and conditions for payment vary with each individual funder. last but not least, the emergence of predatory or rogue publishers and their journals complicates things even further.
best practices: doaj, a journal accreditation service for pure oa journals; doab for oa books; qoam (quality oa market), a marketplace for all kinds of journals with oa options that promotes transparency and provides quality indicators; sherpa romeo, which collects publisher policies on copyright and self-archiving; think. check. submit., a collaborative initiative to help authors select an appropriate journal.

|publishing agreements
open access publishing models require a different approach to the relationship between authors and publishers. new factors in the drafting of publishing agreements include the role of institutional subventions and funder involvement, as well as the rights and responsibilities of publishers under this new model. these may include a requirement to deposit content for preservation or access via a repository, guidelines for iterative updates, or language for describing non-textual objects. through negotiation with the publisher, authors may retain rights to reuse and further develop their work, increase access for research and educational purposes, and secure proper attribution for reuse.
supporting resources: the sparc author addendum modifies the publisher agreement and allows authors to keep key rights to their articles; the model publishing agreement is a sample agreement for long-form digital scholarship and open access publications.

|peer review
peer review is one of the founding pillars of scholarly publishing, ensuring the reliability and validity of the research presented. in the transition to oa, peer review is considered a key element in creating trust in new publishing models. the growth of science and the advent of e-publishing have exposed various flaws in the peer review process, and in recent years new practices have emerged in which online techniques and the standardization of research information make it possible to open up the review process to scrutiny by making it more public.
best practices: cope (the committee on publication ethics) produced widely used guidelines for reviewers and editors; aup (the association of university presses) has developed best practices for peer review.

introduction
publishing is a composite activity that includes several components, and the adoption of best practices in academic publishing should address all aspects: service provision to authors, publishing agreements, peer reviewing, editing, usage of open access licenses, dissemination, metrics and digital preservation. on most of these topics, best practices have been developed by different academic and professional networks, gaining enough consensus to be adopted by the operas consortium. our objective is to identify the most accepted practices for each area and plan specific actions for their implementation by operas partners.

wg contact info: eelco ferwerda (oapen) - e.ferwerda@oapen.org
partners list: oapen (contact point); association of european university presses (aeup); hypothesis; linguistics in open access (lingoa); openedition; open library of humanities (olh); quality open access market (qoam); lexis; stockholm university press; ubiquity press; university of milan; university of zadar

|editing
in general, the editors’ role varies with the discipline (stm and ssh disciplines) and the type of output (journals and books). editors have a central role in the publication process, and in highly specialized fields within ssh and when developing monographs, their contribution to the final publication is crucial. that said, the role and responsibility of editors have been investigated in detail mainly in the biomedical science journals sector, but the same guidelines for best practices can be effectively adapted for ssh.
best practices: cope (the committee on publication ethics) developed the code of conduct; icmje (the international committee of medical journal editors) has detailed roles & responsibilities.
|usage of open access licenses
the most commonly used oa licenses are the creative commons set of licenses. the most open of these is the cc by license, which allows all types of re-use provided there is proper attribution to the copyright holders (in particular the authors of the work). although cc by is widely considered to be the default license for oa articles in stm disciplines, there is no consensus within the ssh community, and this is particularly true for long-form publications. most guides insist on transparency: clear explanations, the license stated in every file format (xml, html, pdf, epub) and in every form (human-readable, legal, machine-readable), and in addition the license on included materials (figures, tables, data) from third parties.
best practices: how open is it? a guide to identify the level of openness in multiple dimensions; the fair principles are used for sharing open data; the oapen-uk team produced the guide to creative commons for humanities and social science monograph authors.

|dissemination
dissemination is a wide and crucial area in publishing, and this is true for oa publishing as well. it consists of a combination of activities to ensure the distribution and discovery of publications. these activities are carried out in a complex interplay within the industry with a wide range of service providers: vendors and distributors (ebsco, proquest, project muse, jstor), search engines (google and google scholar), indexing and discovery services, metadata systems (crossref, orcid, marc and onix), library service providers (oclc, exlibris), various types of institutional and subject repositories and hosting platforms (pubmed central and europe pmc), and preprint servers (arxiv).
selection of oa infrastructure resources: doaj, doab, base, openaire, pubmed, ku online services, jstor open, openedition, oapen.

|metrics
the gathering and evaluation of traditional academic publication metrics is more developed in journal publishing, and therefore also in stem subjects.
this has focused in particular on the journal impact factor. however, journal-based citation rates are increasingly considered inadequate as a measure of the quality of an individual article, and as technology has improved, alternative article-level metrics have been developed, based on views/downloads, social media mentions, and other metrics in addition to a more comprehensive list of citations.
emerging practices: in general, transparency is important, and this should include how usage metrics are aggregated, how chapter-level metrics are rolled up into book-level metrics, and the mechanism used to count downloads and views. counter is a standard for counting views/downloads; crossref event tracker provides doi event data; the operas project hirmeos is developing a service for oa books.

|digital preservation
as content is increasingly born digital and accessed online by researchers, students, and readers, ensuring preservation of that content is critical. regardless of the business model behind a publication, the publisher should take responsibility for preserving the scholarly record through participation in trusted preservation initiatives. digital preservation initiatives exist to ensure continued access to content in the event that a publisher is no longer able to provide access.
best practices: clockss (controlled lots of copies keep stuff safe) is a preservation initiative run out of stanford university; portico is part of ithaka, a non-profit serving the academic community; the keepers registry acts as a global monitor of where (and if) content is being preserved.

federated geospatial data discovery for canada - geodisy
eugene barsky and evan thornberry, ubc
january
image - https://www.flickr.com/photos/double-m

the good people of data...
● amber leahey, data services metadata librarian, scholars portal
● marcel fortin, head, map and data library, university of toronto
● jason brodeur, associate director, digital scholarship services, mcmaster university library
● jason hlady, manager, research computing, university of saskatchewan
● eugene barsky, research data librarian, university of british columbia
● paul lesack, gis/data analyst, university of british columbia library
● evan thornberry, gis librarian, university of british columbia library
● mark goodwin, metadata coordinator, university of british columbia library
● paul dante, software developer, university of british columbia library
● lee wilson, service manager, portage

outline
● general overview of the project
● including:
○ the problem
○ our suggested solution
○ steps
○ timelines
○ your feedback
image - https://www.flickr.com/photos/ @n /

the problem
● how do i find, for example, the migration paths of humpback whales, the distribution of maple-syrup yields, infrared satellite imagery, distribution of artifacts in an archaeological site or the flow routes of water due to sea level rise?
● text-based searches don’t always work well with spatial data
● location is a key
image - https://www.flickr.com/photos/blprnt/

the problem
● most repositories lack a map-based interface
● how do i find data about mining in northern bc?

potential solution
extend existing software to find and display research data in a search interface which is both map and text based, combining research data with the functionality of a product such as google maps.
image by https://www.flickr.com/photos/ @n /

steps: step :
software will query the canadian dataverse repositories (scholar portal, ubc, uofa, dal, unb, uofm) to determine if geospatial information is present within the digital object (e.g., a study, or a data deposit)
image - https://www.flickr.com/photos/jdhancock/

steps: step :
● software will harvest any geospatial metadata in the primary record (e.g., main record page)
● more importantly, the software harvester will query and harvest any geospatially relevant file objects (satellite imagery, geospatial vector files, etc.)
image - https://www.flickr.com/photos/a-g/

steps: step :
once the data have been harvested, the software will create and normalize relevant geospatial data from (a) the primary record and (b) any associated digital objects, and extract all relevant metadata
image - https://www.flickr.com/photos/wakingtiger/

steps: step :
the extracted, cleaned and normalized (iso ) data is deposited by the software pipeline into a geospatial data server, such as geoserver, capable of distributing geospatial data in a wide variety of formats to various services
image - https://www.flickr.com/photos/centralasian/

steps: step :
● data will then be harvested by a geospatial search interface such as geoblacklight, an open source geospatial search tool
● the user interface will be customized to the needs of the federated research data repository project (frdr), providing a unified map-based search interface for research data in canada.
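
the five steps above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the geodisy implementation: the record layout (`id`, `title`, and a `bbox` with `west`/`east`/`south`/`north` keys), the sample records, and the function names `parse_bbox`, `to_search_record`, and `map_search` are all hypothetical. it shows only the core idea: detect whether a harvested record carries geospatial metadata, normalize the bounding box, and keep it in a form that a map-based search can filter by intersection. the `ENVELOPE(west, east, north, south)` string follows the ordering used by solr's spatial syntax; whether the project's pipeline emits exactly this form is an assumption.

```python
from typing import Optional

# Hypothetical harvested records. The keys are illustrative only and are not
# the actual Dataverse metadata field names.
RECORDS = [
    {"id": "example-1", "title": "mining claims in northern bc",
     "bbox": {"west": "-135.0", "east": "-120.0", "south": "54.0", "north": "60.0"}},
    {"id": "example-2", "title": "survey of reading habits"},  # no geospatial metadata
    {"id": "example-3", "title": "humpback whale migration paths",
     "bbox": {"west": "-160", "east": "-120", "south": "20", "north": "60"}},
]


def parse_bbox(record: dict) -> Optional[tuple]:
    """Detect and normalize a bounding box; return None if absent or invalid."""
    raw = record.get("bbox")
    if not raw:
        return None
    try:
        west, east, south, north = (
            float(raw[key]) for key in ("west", "east", "south", "north")
        )
    except (KeyError, TypeError, ValueError):
        return None
    in_range = -180 <= west <= 180 and -180 <= east <= 180
    if not (in_range and -90 <= south <= north <= 90):
        return None
    return west, east, south, north


def to_search_record(record: dict) -> Optional[dict]:
    """Build an index-ready record, or None when no usable geometry exists."""
    bbox = parse_bbox(record)
    if bbox is None:
        return None
    west, east, south, north = bbox
    return {
        "id": record["id"],
        "title": record["title"],
        "bbox": bbox,
        # Envelope ordering is (minX, maxX, maxY, minY).
        "envelope": f"ENVELOPE({west:g}, {east:g}, {north:g}, {south:g})",
    }


def intersects(bbox: tuple, query: tuple) -> bool:
    """True when two (west, east, south, north) boxes overlap.

    Boxes crossing the antimeridian are not handled in this sketch.
    """
    w1, e1, s1, n1 = bbox
    w2, e2, s2, n2 = query
    return w1 <= e2 and w2 <= e1 and s1 <= n2 and s2 <= n1


def map_search(records: list, query: tuple) -> list:
    """Return ids of records whose bounding box overlaps the query box."""
    indexed = [r for r in map(to_search_record, records) if r is not None]
    return [r["id"] for r in indexed if intersects(r["bbox"], query)]


if __name__ == "__main__":
    # A query box roughly over northern british columbia.
    print(map_search(RECORDS, (-130.0, -125.0, 55.0, 58.0)))
```

records without usable coordinates are skipped rather than guessed, which mirrors the first step's check for whether geospatial information is present at all. a real harvester would also need to handle boxes that cross the antimeridian and file-level geometry, which this sketch leaves out.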
image - https://www.flickr.com/photos/pamilne/

suggested solution

potential partners and synergies
● big ten academic alliance - geospatial data discovery project - https://geo.btaa.org/
● canadian historical geographic information systems partnership (chgis) - http://geohist.ca/
● and hopefully some of you!
image - https://www.flickr.com/photos/derpunk/

fair principles
● enhance findability and accessibility of geospatial research data in canada - metadata clean-up and crosswalks, guidance, and tools for the description of research data (e.g. dublin core, ddi, iso )
● interoperability - open geospatial metadata exchange and open apis
● re-usability - open, fit-for-purpose interface for discovering and exploring geospatial data
image - https://www.flickr.com/photos/wwworks/

national services
● dataverse - we are working with our dataverse north partners, a potential national repository solution
● frdr - we are working with our portage and compute canada colleagues - incorporating an open source geoblacklight application into frdr as another search option, a map-based search
image - https://www.flickr.com/photos/crysb/

questions?
image - https://www.flickr.com/photos/debord/

resilient scholarship in the digital age

. technology and scholarship
once scholarship might be characterised in terms of the lone scholar in an ivory tower, toiling in libraries, reading, writing and communicating their research through conferences, journals and books, networking in person with small ‘elite’ disciplinary groups, and teaching small numbers of students (e.g. pausé & russell : ).
the move to digital scholarship sees scholars acquiring information online and communicating with colleagues via email, video and social media, blogging and networking about research, analysing and archiving data online, submitting and reviewing papers and grant applications via the web, and producing a wider range of outputs including grey literature and podcasts, for example (holliman : ). the association of new technologies with scholarly activity

“… marks a new shift in academic practice from a formal, one-dimensional type of communication to different forms of engagement with academic knowledge within and beyond the academy … [which] … has given rise to a digital scholarship culture that is epitomised by a perceived liberation of the academic as consumer, producer and publisher of knowledge for the public good.” (costa & murphy : – )

archaeological scholarship is frequently situated beyond the academy with a high proportion of archaeological employment in government agencies and commercial organisations (e.g. aitchison : – ). however, the focus of this paper is specifically on the experience of the digital scholar within a university environment from a largely uk perspective. european universities are currently less committed to the levels of unbundling and efficiencies experienced in the uk, but strong parallels exist across north america and australasia (e.g. muellerleile & lewis : ).

but what is meant by digital scholarship? weller ( : ) has proposed that digital scholarship entails engagement, experimentation, reflection, and sharing, with the digital ideally seen to support and extend the existing functions of scholarship, even breaking down the boundaries between them. a precise balance will be found differently by different scholars: for instance, grand et al. ( ) define three categories of digitally engaged researchers, recognising that these sit on a spectrum (table ). tensions are inherent in this model of scholarship.
for example, there is a tendency for approaches to digital scholarship to focus on future trends and developments. in doing so, important practices and values may be lost as a result of commercial and cultural pressures while at the same time what remains may be rigidly ingrained and not necessarily beneficial for scholars or for scholarship more generally (weller : – ). the strike in uk universities in over pension arrangements brought many of these pressures to the fore and revealed how mainstreamed aspects such as commercialisation and commoditisation had become.

huggett, j. . resilient scholarship in the digital age. journal of computer applications in archaeology, ( ), pp. – . doi: https://doi.org/ . /jcaa.
university of glasgow, gb
jeremy.huggett@glasgow.ac.uk
research article
resilient scholarship in the digital age
jeremy huggett
this paper addresses the nature of digital scholarship and discusses the challenges for digitally engaged researchers in archaeology and elsewhere who find that the move to digital scholarship alters the terms of engagement in both the institutional and the personal context. for example, digital methods can counterintuitively lead to increased workloads and expectations of availability, and they are frequently linked to managerialism and marketisation of scholarship. paradoxically, digital scholarship can entail both a tightening of control through forms of surveillance and an increase in freedom to work in places and at times of choice. this gives rise to a heightened experience of stress and insecurity, and so this paper will argue for the need for resilience in scholarship, not at the institutional level where business resilience approaches are already applied, but at the community and individual level, to benefit most those who experience the risks and downsides associated with digital scholarship.
keywords: digital scholarship; resilience; open scholarship; neoliberal university; sociable scholarship
huggett: resilient scholarship in the digital age

. the landscape of digital scholarship
one outcome of this commodification of the university is that “the use-value of knowledge diminishes, and academic time is increasingly devoted to establishing the exchange-value of the knowledge we produce” (schwarz & knowles : ). consequently

“the ideals of digital scholarship are tempered by the realities of academia, with its powerful prestige economy alongside the pressures of a diversified workload … taking advantage of the digital revolution should come with an advisory sticker attached.” (costa & murphy : ).

digital technologies are not the cause of the commodification and commercialisation of universities; nevertheless, they enable the characteristics of the neoliberal university which can

“… be seen as forming a critical new terrain inside which digital technology is used to control labour-power … cybernetics is a means of controlling, deconstructing and reimagining academic labour-power for value production, such that academic autonomy is unimaginable.” (hall : ).

accordingly, “understanding the complicated landscape of what it means to be a digital scholar now requires a more sophisticated appreciation of both the shift from legacy to digital scholarship and the struggle between the forces of commercialization and democratization” (daniels & thistlethwaite : ).

however, debates concerning digital scholarship and the neoliberal, commercialised, commoditised university are frequently disconnected. discussions about digital scholarship focus primarily on techniques and technologies and their mainstreaming in practice (e.g.
borgman ; cohen & scheinfelt ; weller ). changes introduced through digital scholarship have become “co-opted into broader agendas around commercialisation, commodification and massification of education” (weller ). conversely, despite the digitalisation of university practice, critiques of the modern university (e.g. berg & seeber ; brink ) make little reference to digital scholarship beyond passing reference to aspects such as the impact of email and virtual learning environments. in some accounts, the scholar is absent entirely (e.g. sperlinger, mclellan & pettigrew ). this makes sophisticated appreciation of the transition to digital forms of scholarship alongside the evolution of the commercialised, commoditised university difficult, and as a result the outcomes may be unforeseen, hidden, unexpected, and largely unrecognised.

for the individual scholar who is caught amidst the collision and hybridisation of digital technology and the academy (suiter : ) this can be a profoundly unsettling experience, characterised as precarious and fearful:

“widespread redundancies, growing levels of casual employment, unrelenting pressures from an increasingly global marketplace, new forms of professional surveillance and mounting institutional ‘productivity’ demands see increasingly apprehensive scholars in perilous professional positions” (hay : ).

since all scholars are now digital to some degree (tables and ), digital technologies are implicated in the transformation and reconfiguration of universities: whether it is to teach more students, to publish more and higher quality research, or to engage with professional, commercial and lay communities (bacevic : ; woodcock : ). however, this relationship between the scholar and digital technology is little researched in terms of its socio-cultural and political contexts (lupton, mewburn & thomson : ).
for example, administrative, bureaucratic, and surveillance functions are not normally considered as part of digital scholarship, but they are increasingly part of the scholarly experience (table ). the definition and scope of digital scholarship therefore needs to be expanded in the face of function creep: the experience of a digital scholar extends beyond the core activities of research and teaching that are the focus of most attention to date.

table : features of scholarly digital engagement ‘types’ (grand et al. : ).
‘highly wired’
  online persona: well-developed
  engagement: highly collaborative, works with multiple stakeholders
  digital tool use: multiple tools; strategic; sustains partnerships
  digital practice: originally personal but extends to projects
‘dabbler’
  online persona: at an early stage of development; patchy or unfocused; spread across multiple tools
  engagement: mixed; partial; collaborative; cooperative; multi-disciplinary within academic contexts
  digital tool use: some experience with multiple tools; strategic use at early stage; draws on colleagues’ skills
  digital practice: originates in project demands but extends to personal
‘unconvinced’
  online persona: non-existent or meeting minimal institutional demands
  engagement: minimal; discipline-focused within academia
  digital tool use: uses communication tools e.g. email
  digital practice: low-level or non-existent

. institutional digital scholarship
considering digital scholarship in these broader terms encompasses areas most closely associated with the management of the academic institution: specifically, surveillance, audit and metrics, administration of research and teaching, and the management of workloads, although little unambiguously differentiates archaeological practice at this level. these are features of the modern business world, and their introduction into the university brings scholarship into the realms of a service industry, where, for example,

“… the professoriate is expected to treat students as customers … serving the customer (student) means extending office hours, being on-call via email and social media for emergencies, counseling the wayward and grief stricken, becoming a graduate admissions counselor, self-disclosing personal information, and exuding warmth and approachability.” (lawless : ).

. . metrification of scholarship
recasting universities as corporations has led to the creation of digitally managed audits and surveillance metrics purporting to measure the quality of teaching and research which feed into league tables and income streams (e.g. feldman & sandoval ; morrish a). these place demands on scholars to focus on areas which enhance such metrics and generate income (e.g. dyson : – ; rustin : – ), emphasising what is measurable at the expense of other equally valid areas of scholarship. research is judged as much on its economic and social worth as its academic value (ylijoki : ) and shifted away from curiosity-driven research. audits quickly become a managerial device which increasingly bears down upon scholars and scholarship (e.g. holmwood : ). for example, metrics-based management practices were introduced across uk universities under the research excellence framework (ref) (e.g. macdonald ) with individuals required to meet targets for the number of internationally excellent or world-leading publications, the number of research students, impact, and income generation (e.g. morrish b: – ). failure to achieve targets requires closely monitored individual action plans for improvement, with ‘capability procedures’ applied if progress is deemed inadequate, leading potentially to demotion or dismissal (e.g. baker ). the introduction of the teaching excellence framework (tef) in england and wales expanded this approach into the management of teaching (e.g.
morrish a) while the proposed knowledge exchange framework (kef) promises to do the same for knowledge transfer. inevitably, this has led to precisely the atmosphere of precariousness and fear identified by hay ( : ) and fundamentally shaped research and teaching outcomes.

introducing digital technologies to monitor performance, seeking to render everything auditable, knowable, and calculable (gill : ) has reinforced the corporate model of university management. however, the knowledge captured is often poorly related to the realities of scholarship and does not measure what was intended. for example, gill ( : ) highlights the transparent approach to costing (trac) methodology applied in the uk since , introduced to calculate the direct and indirect costs of academic and professional staff activities. this employs a model of academic employment predicated on proportions of notional contracted hours rather than actual hours worked, so consistently under-represents the cost of scholarly work by disguising the actual levels of staff contribution.

although responsibility for these audit processes might be laid at the door of national policy, they are implemented at local level and most institutions operate additional ‘shadow’ audits to predict the potential outcome of the periodic national audit for planning purposes (holmwood ). as a result, an intermittent process becomes continuous evaluation of individual performance, creating a comparative, competitive and anxious workplace in which “another’s success becomes a possible sign of one’s own failure” (grealy & laurie : ).

. . administrative scholarship
most digital systems at institutional level are concerned with business activities. promising to support scholars and provide relief from administrative work, their reality is often the reverse.
of course, the introduction of complex computer systems across government agencies, health services, and other large organisations has a long history of problematic implementation. for instance, in a review of the epic computer system introduced in us hospitals, gawande ( : ) describes how “a system that promised to increase my mastery over my work has, instead, increased my work’s mastery over me”, with staff trapped in the system,

“all of us hunched over our screens, spending more time dealing with constraints on how we do our jobs and less time simply doing them. and the only choice we seem to have is to adapt to this reality or become crushed by it.” (gawande : ).

table : the digitalisation of the academic labour process (after woodcock : ).
the activity of work
  academic work: research, teaching, administration
  impact of digitalisation: acceleration of activities, linked to management strategies of control
the objects of work
  academic work: research outputs (journal articles, books, publicity). teaching materials
  impact of digitalisation: online media outputs and new metrics for research success. email, online materials, and lecture capture for teaching. new methods of control
the instruments of work
  academic work: tools for researching, writing, and teaching
  impact of digitalisation: new skill requirements; the university becoming more like a digital platform

such a description is familiar to scholars who have experienced the introduction of a large-scale computer system across a university. for example, student lifecycle management systems have been widely implemented, designed to manage the student record from application through registration, course selection and enrolment, timetabling and room allocation, assessment and examination, progression and graduation, and thereafter as alumni. these systems are sold to institutions as a means of enhancing their competitive edge (e.g. oracle ), and staff are persuaded that their administrative burden will be reduced.
however, such off-the-shelf commercial systems do not fit local procedures without considerable modification (different terminologies, degree structures, academic year structures, assessment models, etc.), leading to a choice between expensive customisation or altering well-established processes to fit the system. the inexperience of academics and administrators in specifying, testing, and understanding complex computer systems, as well as the need to populate those systems with large amounts of previously under-formalised course details and regulations, can lead to faulty process modelling, overly complex and cumbersome procedures, poor configuration, and inadequate implementation. developers frequently fail to understand local circumstances, are inadequately informed as to the procedures in operation, and often seem unresponsive to requests for even minor modifications to the system to better reflect the needs of staff and students.

the reliance on students to correctly enter details into unfamiliar and often forbidding-looking systems leads to student (and parent) frustration, anxiety and anger which is projected onto the academic advisers and administrators providing their personal interface with the institution. the rigid implementation and sequencing of embedded rules can seem illogical and deliberately obstructive. for example, students may be required to complete financial registration before academic registration can commence, causing problems for international students and students with non-standard financial circumstances. resolution of laboratory or tutorial timetable clashes may require a student to un-enrol and re-enrol on a course and hence risk loss of access to oversubscribed courses. courses may simply not be available because no room has yet been allocated to them.
the demands on scholars and administrators to assist students, even to the extent of completing registration and course enrolment on their behalf, result in considerable unrecognised time commitments outside of normal working hours as well as the emotional cost of dealing with stressed and distressed students.

bedding in such complex systems successfully can take several years, and the burden placed on staff in the interim to manage the imperfections in the system is often unrecognised. yet such systems are claimed to be a means of relieving academics from arduous administrative paperwork and reducing perceptions of overload (e.g. zábrodská et al. : ).

. . intensification of scholarship
in conforming to the broader digital economy, universities have changed the nature of scholarly labour. alongside the digitalisation and informatisation of scholarly activities, there have been significant temporal consequences in relation to the technological acceleration of the tempo and rhythm of academic life (ylijoki : ) and seemingly constant restructuring, reorganisation and change. gill ( : ) describes a punishing intensification of work as endemic to academic life alongside an extensification of work across time and space, facilitated by digital technologies that render it possible to be ‘always on’ (gill : ). in the process, universities have “exploited and normalised anxiety-driven overwork as a culturally-acceptable self-harming activity” (hall & bowles : ).

for example, the university and college union calculated that uk staff worked the equivalent of two unpaid days per week on average, and some considerably more (ucu a: ). further, the uk’s trades union congress found that academics and teachers were the most likely occupational group (other than chief executives) to do unpaid overtime (tuc ). gill ( : ) suggests that institutional awareness of this lies behind the use of proportional rather than actual time in the trac methodology.
institutional responses to workload issues can seem detached: time management courses which academics have no time to attend, for example, and workload models which are ineffective in the face of demonstrably high workloads or, worse, treat an inability to complete the workload within ‘normal’ hours as a personal performance failure.

intensification is also experienced in the use of teaching technologies:

“… it is no longer enough to give a lecture and run some seminars, we are also expected to produce a set of resources for use on the new online communications platforms … the pressure that is produced by such constant exhortations to be more creative, teach more innovatively, be at the cutting edge (etc) is undeniable – particularly because it meets an already existing set of desires and ethics around being professional and wanting to do a good job.” (gill : ).

these digital resources become commodified as the institution assumes ownership, leading to situations such as during the uk strike when some universities reportedly sought to appropriate recorded lectures and deliver them to students in order to offset the effects of strike action. suspicions remain that lecture capture can be used as a means of surveillance of academic performance, even removing academics from the teaching process (woodcock : ).

intensification is further revealed in the overflow of work beyond core hours. most of the functionality of the workplace is easily reproduced in the home through networked access to institutional data systems, library catalogues, and online journals, books, databases, and archives. these are commonly seen as ‘free goods’ but their costs are absorbed by scholars in terms of free hours worked, the substituted labour of their partners or others (jarvis & pratt : ), and lost time with family and friends. scholars for whom circumstances limit such free work are disadvantaged, with consequences for their promotion and advancement.
institutional building programmes which move academics into smaller and/or shared offices, even hot-desking, can appear to encourage scholars to rely on home resources more than ever. as gill ( : ) observes, working in universities has become literally academia without walls.

huggett: resilient scholarship in the digital age

that so-called ‘sacrificial labour’ is long-established in the form of scholarly service on editorial boards, review panels, conference committees, community organisations, charitable trusts and the like does not detract from the criticism that digital scholarship is complicit in supporting this intensification and extensification of academic labour. however, much academic labour is affective in nature, in that it “… does not result in direct financial profit or exchange value, but rather produces a sense of community, esteem, and/or belonging for those who share a common interest” (gregg a: ; see also lupton, mewburn & thomson : ff). in short, scholarship is something that is enjoyed by the individual academic, as it generates personal gratification, passionate attachment, builds personal reputation and profile, develops personal networks and social contacts, and is effectively a form of self-aggrandisement. this exposes scholars to manipulation: “for academics in particular, affective labour explains how the university draws on the psychological lives of staff to both exploit and disguise the ‘immaterial’ dimensions of working life. productivity demands placed on academics rarely acknowledge the human factors that complicate the tasks of thinking, writing and delivering the timely outcomes crucial to individual and institutional success. meanwhile, the language of campus mission statements and marketing campaigns promote ‘creativity’ and ‘innovation’ as the university’s asset base, emptying out the discursive terrain in which employees may have once expressed admiration or commitment to the institution.” (gregg a: ).
a sense of alienation from the institution is a typical consequence of the metrification of scholarship, increased administrative loads, and the intensification and extensification of work.

the individual digital scholar

the affective nature of scholarship – even if it may be manipulated for coercive ends – means that many of the more traditional aspects of scholarship can be seen as spaces over which the individual exerts more control. areas such as publication and social engagement, for example, primarily entail risk-taking and decision-making by the individual scholar, although institutional and other external constraints may remain influential in practice.

open scholarship

digital scholarship is not the same as open scholarship (although weller ( ) points to an increasingly close relationship between the two), but the venues associated with openness are frequently digital and the emergence of open scholarship sits alongside broader technological advances (veletsianos & kimmons : – ). digital scholarship is frequently linked with the democratisation of knowledge production and consumption as a result of changes in how scholars engage with their materials and their audiences (e.g. daniels & thistlethwaite ). however, free, open scholarship can effectively devalue the intellectual labour of academics, and humanities scholars are especially disadvantaged since scientists and engineers have more opportunities to profit from their research (golumbia : ). furthermore, while open access presents the scholarly product as ‘free’ to consumers, it takes little account of the unpaid academic labour beyond authorship: peer review, editorial and advisory roles, and so on are not covered by the author processing charges made by commercial publishers (e.g. eve : – ). nevertheless, open access is increasingly mandated by government and institutional policy despite significant outstanding problems.
open scholarship can seem to conflict with traditional expectations of quality and prestige that focus on elite high-ranking journals operated by commercial publishers in a ‘market’ that results in the products of research being costly to publish and/or costly to access. institutions require papers placed in high-impact journals of international standing and monographs with long-standing, eminent academic publishing houses. however, many independent open access journals are frequently digital only and are not widely recognised or considered high-ranking (e.g. mišík ), and it can be challenging to differentiate them from ‘predatory’ journals. mišík proposes that established academics should initiate change by providing legitimacy to emerging journals by publishing in them and by serving as editors and reviewers for them. a similar call has been made in archaeology by costopoulos ( ) who argues for disengagement from the current journal system but recognises that “as established scholars and administrators, we have a duty to protect the most vulnerable from the most disruptive consequences of this transition”.

sociable scholarship

relationships with external audiences, increasingly conducted online through channels such as twitter, blogs and other forms of digital media, represent a significant challenge for scholars. engagement is seen as ‘good’, but the nature of the audience is ill-defined, and participants can move between different channels, be a member of different audiences, and occupy different roles, often simultaneously. this has been debated extensively in archaeology (see bonacchi ; bonacchi ; morgan & winters ; perry ; perry & beale ; richardson ; and contributors in rocks-macqueen & webster amongst others). this ‘sociable scholarship’ (pausé & russell ) is not without risk.
for example, the long memory of the internet means that past statements can be resurrected and reused, often out of context, and missteps may have a much wider audience compared to, say, a poorly presented conference paper (pausé & russell : ). sociable scholarship offers a range of potential benefits – enhanced visibility, recognition, reputation, public engagement, participation, influence, and networking across disciplinary lines (e.g. stewart : ) – but unmasking the sacred and subverting authority through posting positions, opinions, and discussions (mclean & wallace : ) can pose risks. engagement with social media exposes scholars to different audiences and frames of reference in a medium characterised by ‘fake news’, ‘alternative facts’, a rejection of ‘experts’, the ‘weaponisation’ of information, and the blurring of boundaries between public and private. consequently, it can be both help and hindrance at the same time, a conflict that stewart has described as “… the messy business of being truly open to multiple publics at once [which] forces scholars to navigate the cognitive dissonance between orality-based expectations of sociality and print-based interpretations of speech” (stewart : ). traditional means of protecting academic freedom are of limited value online, and “… cannot protect scholars from cyber-vigilantes who take every post or tweet as an indelible marker of character … there exists a profound risk, then, that the climate of digital culture, where identity is perceived not as shifting or context dependent, but rather as an expression of a core self, may lead academics to self-censor and in turn bring out a silencing of important conversations.” (hildebrandt & couros : ). to compound the problem, institutions are often poorly equipped to support staff facing online abuse (e.g.
cook : – ; perry ) while at the same time requiring public engagement even if much digital engagement remains largely unrecognised as ‘academic’ work.

always-on scholarship

digital information and communication technologies have enabled the expansion of the scholarly workplace beyond the institution in a process gregg ( : ) defines as “presence bleed”. this is not an aspect of digital scholarship that has received much attention. for example, an outline of the changing habits of digital scholars (daniels & thistlethwaite : – ) refers to the ability to share scholarly resources via networks across large distances, but not to the way in which work itself can be digitally transferred from place to place, from the workplace to the daily commute to the hotel room to the conference floor to the home office, in an expansion, intensification, and extensification of the scholarly experience. removing boundaries between work and home life results in a lack of downtime, but to be disconnected smacks of a lack of commitment (e.g. agger : ). the growth in flexible working hours and working from home is facilitated by digital technologies, as is demonstrated throughout a study of ‘hyperprofessional’ academic work (gornall & salisbury ), for example. flexible working arrangements are seen by employers as advantageous for staff – not least for women, traditionally associated with child-rearing and domestic labour – and scholars can undoubtedly take advantage of some of the most flexible working arrangements around, outside their scheduled teaching and meeting commitments. however, presence bleed means that flexible work easily becomes the kind of sacrificial out-of-hours work discussed above. although ylijoki ( ) sees this as ‘boundary work’, situated between work time and private time, any boundary blurs as flexible working arrangements encroach into the non-work side of life.
issues of ‘work-life balance’ become complex when personal identity, pleasure, and sense of accomplishment are closely related to scholarly work. this may be one reason why work-life balance initiatives are largely ineffective, limited to relatively minor changes such as attempting to ban out-of-hours email or introducing relaxation and massage sessions. there is also a fundamental, and often gendered, inequity in the failure of academic work-life balance: scholars with young children, caring responsibilities, health problems, etc. are disadvantaged in a working environment which normalises – even expects – the bleeding of work hours into private time. scholars who are seen as not prioritising work, who resist the encroachment of work into their personal life, who are unwilling to engage in sacrificial labour, and who are motivated to switch off their digital presence, will inevitably appear less committed and less productive, with consequences for performance evaluations and promotion prospects. in this way, the technologies experienced and exploited by digital scholars sustain what can be argued to be a corrupted and demoralising form of scholarship. ylijoki argues that this is ultimately a question of morality: “the question arises whether the current high-speed university is a generous and benevolent alma mater proving space and time to cultivate the human mind and strive for the truth, or a greedy and ruthless organisation eager to exhaust its inhabitants?” ( : – ). the problem is worse in a field discipline such as archaeology since research, teaching, and professional development are frequently linked to fieldwork undertaken at some distance from both office and family home. indeed, a study of anthropologists and fieldwork found higher levels of stress due to an imbalance between career and family alongside gender inequities and intersectionality (lynn, howells & stein ).
clearly a balance must be struck, but the quantity of invisible, sacrificial labour currently undertaken and the inequities as well as associated occupational stress are issues that need to be addressed in digital scholarship. as gill suggests, “we … need urgently to think about how some of the pleasures of academic work (or at least a deep love for the ‘myth’ of what we thought being an intellectual would be like, but often seems at far remove from it) bind us more tightly into a neoliberal regime with ever-growing costs, not least to ourselves.” ( : ). in many respects, therefore, scholars are complicit in their own abuse (cederström & hoedemaekers ), aided in that complicity by access to digital technologies, and their compliance facilitated by a seemingly romantic view of academic labour (clarke, knights & jarvis : – ).

introducing resilience

it may be that the academic workplace is digitised to an extent that makes it difficult to challenge (lupton, mewburn & thomson , ), but the impact on scholars demands that an attempt should be made. resilience thinking has been applied to areas including ecological systems, sustainability studies, climate change, urban planning, organisational studies, economics, and defence studies, focusing on learning and adaptation, the mitigation of risks, predicting and resolving problems, responding to risks as they are realised, and recovering from disruptions. resilience typically concerns organisations and is frequently seen as desirable in students (e.g. berg & seeber : ) but is rarely discussed relative to scholars. however, weller and anderson ( : ) define resilience in digital scholarship as using technology to change practices where this is desirable but retaining the underlying function and identity that the existing practices represent, if they are still considered necessary.
for example, they suggest that current peer review practice is not the only way to achieve the desired end, so might be changed while preserving the essential function. they suggest that resilience in digital scholarship is best seen at the institutional level, but if – as argued here – there is a degree of alienation between scholars and institutions, the success of such an approach will be limited, or at least treated with suspicion. while scholars might take advantage of institutional support where appropriate and available, the affective nature of much scholarly labour could imply that success or failure will ultimately depend on the individual. many of the risks and pitfalls are encountered personally by the individual digital scholar, not the institution, and indeed, institutions often place the responsibility for resilience and adaptability on individuals. individual resilience therefore refers to the individual scholar’s capacity to persist and to develop within the changing institutional environment, and consequently a bottom-up approach would seem logical, while recognising that long-term success will likely depend on the institutionalisation and disciplinification of practice. ultimately, normalisation of practice and its subsequent acceptance and adoption by institutions should primarily be driven by resilient individuals rather than imposed upon them. at the same time, it is important to avoid a framing within the neoliberal discourse on individualisation and knowledge production (e.g. feldman & sandoval ). resilience thinking has been criticised for its incorporation of dominant social values (e.g. cretney : ) and its co-option in neoliberal discourses (cretney & bond : ; see also welsh ). resilience is seen to be conservative in the way it emphasises the stability of a system and its resistance to interference (mackinnon & derickson : ), and it is criticised for lacking human agency (davidson : ; olsson et al. : ).
cretney ( : ) points to the neoliberal resilience discourse as “encouraging and, in some cases, mandating that communities, departments and projects become increasingly adaptable, flexible and open to change through disruption”. this characterises much institutional change across universities of late – disruptive changes have been introduced through restructuring, reorganisation, performance management, voluntary severance, curriculum change, casualisation of employment etc., while at the same time emphasising the responsibility of staff to be flexible and adaptable. as a result, scholars experience ‘responsibility without power’ in the way that the institution restricts their actions by retaining power and resources while itself enjoying reduced responsibilities (cretney & bond : ; peck & tickell : ). however, resilience can be turned against the dominant discourse by articulating and practicing it through “… transformative, alternative counter-neoliberal discourses of self, community and society” (cretney : ). alternative approaches to resilience include ‘community resilience’ (e.g. bonanno, romero & klein ; mulligan et al. ), ‘grassroots activism’ (cretney & bond ), ‘resourcefulness’ (mackinnon & derickson ), and ‘equitable resilience’ (matin, forrester & ensor ). all share a community approach and provide reassurance that resilience can subvert established power rather than reinforce it. for instance, equitable resilience is defined as dealing with “… issues of social vulnerability and differential access to power, knowledge, and resources; it requires starting from people’s own perception of their position … and it accounts for their realities and for their need for a change of circumstance to avoid imbalances of power into the future.” (matin, forrester & ensor : ). a community focus emphasises collegiality, altruism, and mutual support networks, and hence fits the scholarly situation, at least in its idealised form.
the potential of such approaches for genuine resistance was demonstrated by the uk university pension-related strike action in where collective action overturned the decision to close the defined benefit element (e.g. hillman , ff). one of the challenges of community resilience approaches, however, is the need to retain sight of the individual. what applies to a community does not necessarily work for all individuals within it – equally, what works for an individual does not necessarily work for the group. something that is good for a community may be bad for an individual, and vice versa (mcnally : ). mcnally ( : ) argues that since a community cannot literally possess resilience as such, “we might deem a community resilient if its resources render its members emotionally robust against the effects of traumatic stressors”. this reinforces the importance of developing individual resilience alongside community resilience, as argued here. critically, individual resilience is not static – it can be developed but it can also be lost, and resilience may be different at different stages in life so that an individual demonstrating resilience at one time may be less resilient when confronted by later adversity. indeed, an accumulation of adversity over time may eventually exceed an individual’s capacity to cope. individual resilience is a state of mind which enables the person to readjust and continue their life in the face of adversity (kimhi & eshel : ), but unsurprisingly this is not dependent on any single factor; instead there is a set of unique predictors, each exerting relatively small effects on the outcome (bonanno, romero & klein : ). examples of such predictors include a sense of commitment, engagement of support, close and secure attachments, self-efficacy, sense of control, action orientation, flexibility, optimism, and being goal directed (hobfoll, stevens & zalta : ).

a crisis of resilience?
severe damage has been inflicted on the ideals of scholarly vocation and collegiality – however mythical they might be (hall & bowles : ) – by the encroachment of marketisation, surveillance, and competitive practices, supported by digital technologies. for example, a report by the university and college union (ucu b) found over half of academics employed in uk universities were on a mixture of hourly-paid atypical and short fixed-term contracts. this is coupled with a culture of long working hours with young academics working an average of hours per week and one in six estimated to work hours per week when adjusted to their full-time equivalent (ucu a: ). in the face of precarious employment, often limited rights, and very heavy workloads, it is difficult to see how such individuals can develop resilience. worse, they may be exploited by those in more secure positions as many precarious staff provide teaching support to allow valuable research time for established staff. indeed, a criticism of ‘slow professorship’ (berg & seeber ) is precisely that decelerating professors might presume that junior staff will accelerate to pick up the slack (carrigan & vostal ). such precarity can last for years and it can also raise diversity and discrimination issues in terms of gender and race (jones & oakley ). so-called para-academics or alt-ac scholars are most affected by gender- and race-related issues of lower pay and lack of security and by the misogyny and harassment that can characterise digital network platforms. indeed, the digital may make their situation worse. for instance, digital networks and online identities can break down institutional walls (weller : ), but while an established scholar might benefit from this, one seeking to develop their academic identity and build their reputation is likely to find it disadvantageous (e.g. richardson ).
similarly, there may be clear differences in scholarly blogging between established scholars in secure employment and those at an earlier stage in their career (gregg b: ), with the former able to afford more risks than the latter. in turn this might suggest that there are digital scholars who – for reasons of age, seniority, gender – have less need for resilience and for whom risk-taking is not especially audacious because of the relative security of their position (e.g. haven et al. ). however, the crisis in resilience is not limited to casualised (and generally young) scholars. for example, a study by the guardian newspaper revealed that two-thirds of staff who had suffered mental health problems saw it as a direct result of workload, and senior lecturers and those aged between – felt most strongly about this link (shaw ). the deaths of stefan grimm in and malcolm anderson in , associated with workload and the level of expectations placed upon them (morrish b: ), highlighted the pressures felt by mid-career and senior staff. while a synthetic study found that younger and newer scholars were more vulnerable to burnout, the risk of selection bias was identified whereby mid- to late-career academics with high levels of burnout had already quit, leaving those who remained as the most successful in coping with demands and stressors (sabagh, hall & saroyan : ). studies of academic burnout suggest a range of inter-related causes, including lack of support and influence, time constraints (kinman : ), student numbers (sabagh, hall & saroyan : ; watts & robertson : ), the indirect effects of administrative paperwork (zábrodská et al. : ), value conflict and workload (morrish b: ff; sabagh, hall & saroyan : ). however, the most consistent factor identified across numerous studies is conflict between work and family/leisure time (e.g. kinman : ; padilla & thompson : ; sabagh, hall & saroyan : ; zábrodská et al. : ).
academics generally indicate that they have little choice in working long hours: “as one lecturer remarked: ‘if everybody worked strictly on a – basis, the institution simply could not function’. another commented: ‘the number of hours i work represents stress avoidance; it enables me to maintain an acceptable standard of work and meet deadlines and targets most of the time’.” (kinman & jones : ). the intensification, extensification, and affective nature of scholarship therefore present a particularly toxic combination for scholarly wellbeing. despite this, there is at present no national measure of staff wellbeing within uk universities (hewitt : ). amongst individual universities morrish ( b: ) identifies a ‘turn to wellbeing’, with staff offered enhanced access to support services as a means of mitigating institutional liability, and her survey of uk universities found a % increase in demand for counselling services between and while referrals to occupational health services increased by % over the same period (morrish b: – ). however, “this is not a case of employers admitting that structural problems are the source of employees’ distress. on the contrary, both students and staff have been accused of lacking resilience. as a partial solution, some universities have become advocates of resilience training, along with stress management and mindfulness … however, many of the proposed beneficiaries are unconvinced about the legitimacy of a solution which seems to place the onus for recovery squarely on the employee.” (morrish b: ). as a result, the introduction of mentoring, coaching, mindfulness and resilience training “recompose a terrain of subordination and conditioning against which there is limited defence” (hall : – ) and such approaches are therefore treated with scepticism by alienated staff.
similarly, collegiality – sometimes recast as ‘citizenship’ – is beginning to appear in academic promotion criteria, allowing institutions to claim that they are actively encouraging collegiality, but in doing so collegiality is appropriated and becomes metricised as a set of behavioural criteria against which to evaluate an individual.

supporting the resilient digital scholar

clearly not all the ills of modern scholarly experience can be laid at the digital door; it is simply that those digital technologies accelerate, sustain and are otherwise complicit in many of the challenges facing the modern scholar (e.g. bacevic : ), even if they may also be capable of contributing to the solutions. alternative approaches tend to focus either on the resolution of organisational issues or on individual action. for example, morrish ( b: – ) proposes a series of tactics: reducing workloads, adopting a responsible approach to metrics, taking a longer-term view of performance management, and addressing precarity and developing sustainable academic careers. however, such worthy aims are beyond the control of individual scholars, and even those in management positions may be limited to at best frustrating the worst excesses. alternatively, the principles of the ‘slow’ movement may be adopted, whereby an individual exerts personal agency to slow down the pace of their academic life (berg & seeber : ). however, believing that changing the self will change the institution and offering individual interventions to what are structural problems is itself a neoliberal trap (brady : ; edwards : ). the privilege associated with a slow approach that is impossible for early career and precarious staff has also been criticised (e.g. edmonds : ; reed ; scott : ).
other models suffer from similar drawbacks: for example, rolfe seeks subversion of the corporate university through the creation of the ‘paraversity’, entailing individual responses such as “being good” (reconciling conflicting agendas by doing things in the right ways for the right reasons) (rolfe : – ), being collegiate ( : – ), and being radical ( : – ), in the process developing new approaches to scholarship ( : ff). pursuit of either organisational change or individual action on their own is equally problematic. institutions have become increasingly dependent on the anxious and precarious scholar and their remedies largely fail to deal with the root causes of the problem (e.g. hall & bowles : ). meanwhile individual agency, if feasible, is often evidenced in disengagement and absence, with implications for those left behind. this underlines the importance of incorporating both community and individual resilience, addressing challenges at both institutional and individual level, protecting and supporting the individual whilst at the same time taking collective action to bring about change within the organisation and avoiding the (re)appropriation of resilience by the institution. such a combined approach is not something that has been widely debated, although it is embedded within hall’s ( ) marxist critique of the ‘alienated academic’, for example. it is also hinted at in the ‘slow professor’ where alongside recommendations for individual action the affective aspects of collegiality are called upon to help develop a culture of social and emotional support (berg & seeber : – ). this seeks to balance the risk that a focus on self-care alone may damage the very collegiality that is sought. one approach is to support community and individual resilience through nurturing social capital and social networks, facilitating co-operation and sustainability, and establishing practical projects for mutual support and constructive change.
crucially, such activities based around resilience can challenge the dominant values and norms (cretney & bond : ). archaeological digital scholarship has not addressed these issues, although there are several parallel debates which provide insights into managing the scholarly condition and thereby add a particularly archaeological as well as digital perspective. the earliest of these concern aspects of a ‘punk’ archaeology, defined as including a reflective mode of organising archaeological experiences and a celebration of diy practices (caraher : ). although explicitly rejecting the traditional academy as “committed to a culture of privileged, solipsistic navel gazing” (schultz : ), punk emphasises a strongly individual, self-sufficient, resistant, do-it-yourself ethos, akin to the individualistic approaches to academic labour described above. furthermore, a critique of punk archaeology recasts it as being primarily concerned with the creation of an equitable and politically aware archaeology, a participatory practice (richardson : ) which draws in an explicitly collaborative, community aspect. this finds parallels in an emancipatory political archaeology which “is truthful about its political content and confronts power and oppression” (mcguire : ), advocating a socially responsible scholarship embedded in practice ( : ), and an emancipatory digital archaeology defined as a reflexive, politically engaged, activist approach (morgan : ). the ‘slow’ movement has also been debated within digital archaeology, with ‘slow archaeology’ resisting an emphasis on efficiency, economy and standardisation in digital practice (caraher : ). the parallels with ‘slow professorship’ (berg & seeber ) extend to its critique: it “stands as a privileged indulgence of the white, male, tenured, grant-funded, and secure faculty member” (caraher : ).
caraher’s ( : ) response is to relocate ‘slow’ into a conversation “that emphasizes a more human, humane, reflexive, and inclusive discipline” which he describes as ‘the archaeology of care’. this is “a natural result of sincere and caring people working with other people in difficult circumstances” (caraher & rothaus : ), and an explicit parallel with university scholarship is drawn (caraher : ). separately and together, these debates concerning a different practice ethos underline the importance of a focus on both community and the individual in supporting a more humane, care-full and inclusive approach to archaeological digital scholarship. alongside these debates digital archaeology scholars have also begun to identify how this might be developed and supported through the construction of digital communities and platforms (e.g. cook ; watrall ). both cook and watrall emphasise aspects of creative making as an objective of a successful community, developing practical outputs as a means of encouraging progress and debate (watrall : ), and, with parallels in maker or hacker cultures, supporting activism through shared resources, experiences, memories, heritage and trauma (cook : ; see also morgan : – ). cook ( : – ) points to the strategic application of technology and media to confront present identities and authority which resonates with the scholarly situation, and notes that “it often emerges most strongly in the face of work action and concerns over equity, inclusivity, and security in the workplace” ( : ).
watrall’s ( : – ) framework for a community of ‘thoughtful praxis’ may be adapted to the scholarly situation through fostering an environment that builds confidence in its members, recognising that community members are at different stages and have different needs, understanding the positive value of failure, and creating a culture of generosity, making time to listen, learn, and contribute knowledge and expertise. creating such a structure to build resilience at a community level and amongst constituent members is not a trivial enterprise: in particular, energy, effort and commitment are demanded of individuals, which means that they will require support in providing it, which cannot be taken for granted (cook : ). a further challenge is situating such communities for greatest effect: within disciplines and hence crossing organisational boundaries, or within organisations and hence inter-disciplinary, or in some combination. for example, we might visualise communities sitting within each of the four scholarly scenarios characterised by papadopoulos and reilly ( , ff), across two or more, or across all four, and the membership and focus of those resilient communities would change accordingly. there is also a danger that individual communities could become artificially isolated, with members effectively operating within a filter bubble, a limited shared worldview that could become quite negative. watrall’s ( : – ) framework could aid in establishing a positive and constructive outlook and emphasising the importance of communication within and between communities – sharing opportunities, lessons learned, common activities etc. – as well as engagement with the wider institutional environment and/or discipline. ultimately, a community of resilience and the resilient individuals within it practise a form of ‘affirmative disruption’ (adema & hall ; hall : ff).
this is distinct from the kind of digital disruption pursued and practised across industries and institutions. affirmative disruption is not about emphasising the potential of technology to disrupt practice; instead it seeks to address the human-scale problems experienced because of the ways in which the philosophies and practices of digital technologies have been inserted into the scholarly environment. affirmative disruption seeks a positive realignment which enables individuals to rebuild their commitment and engagement, regain their sense of control, and recover their optimism and thereby create a new approach to scholarship. this requires the investment of those who find by virtue of their situation that resilience is less of a present necessity, as well as those for whom resilience is a daily requirement. as cook powerfully argues, the strongly independent do-it-yourself mentality that characterises digital archaeology – and digital scholarship – needs to become a do-it-collectively priority (cook , ).
acknowledgements
this paper is written in part from personal experience, and i should like to thank my glasgow colleagues – in particular lynn abrams, nyree finlay, and michael given – for their support in recent years. several people have provided constructive advice on the content of this paper, and i should particularly like to thank sara perry, lorna-jane richardson, the two anonymous referees, and my co-editors – eleftheria paliou, costas papadopoulos, and isto huvila. the participants of the cost arkwork writing workshop on scholarship in cologne helped focus my early efforts, and i am also grateful to katherine cook for sharing her embodiying disruption paper with me pre-publication. this article is based upon work from cost action arkwork, supported by cost (european cooperation in science and technology). www.cost.eu. funded by the horizon framework programme of the european union.
competing interests
the author has no competing interests to declare.
references
adema, j and hall, g. . posthumanities: the dark side of “the dark side of the digital”. journal of electronic publishing, ( ). doi: https://doi.org/ . / . .
agger, b. . itime: labor and life in a smartphone era. time and society, ( ): – . doi: https://doi.org/ . / x
aitchison, k. . state of the archaeological market : archaeological market survey – . london: landward research. available at https://www.archaeologists.net/profession/profiling [last accessed june ].
bacevic, j. . with or without u? assemblage theory and (de)territorialising the university. globalisation, societies and education, ( ): – . doi: https://doi.org/ . / . .
baker, b. . campaigning against metrics-based management. in: kelly, s, freedman, d and ismail, f (eds.), the university is ours: how to build an activist union branch, – . london: branch solidarity network. available at https://ucubranchsolidaritynetwork.wordpress.com/branch-activists-handbook/ [last accessed june ].
berg, m and seeber, b. . the slow professor: challenging the culture of speed in the academy. toronto: university of toronto press. doi: https://doi.org/ . /
bonacchi, c. (ed.) . archaeology and digital communication: towards strategies of public engagement. london: archetype publications.
bonacchi, c. . digital media in public archaeology.
in: moshenska, g (ed.), key concepts in public archaeology, – . london: ucl press. doi: https://doi.org/ . / .
bonanno, g, romero, s and klein, s. . the temporal elements of psychological resilience: an integrative framework for the study of individuals, families, and communities. psychological inquiry, ( ): – . doi: https://doi.org/ . / x. .
borgman, c. . scholarship in the digital age: information, infrastructure, and the internet. cambridge: mit press. doi: https://doi.org/ . /mitpress/ . .
brady, j. . review. the slow professor: challenging the culture of speed in the academy. by maggie berg and barbara k. seeber. radical teacher, : – . doi: https://doi.org/ . /rt. .
brink, c. . the soul of a university: why excellence is not enough. bristol: bristol university press. doi: https://doi.org/ . /j.ctv fgwf
caraher, w. . toward a definition of punk archaeology. in: caraher, w, kourelis, k and reinhard, a (eds.), punk archaeology, – . grand forks, nd: digital press at the university of north dakota.
caraher, w. . slow archaeology: technology, efficiency, and archaeological work. in: averett, e, gordon, j and counts, d (eds.), mobilizing the past for a digital future: the potential of digital archaeology, – . grand forks nd: the digital press at the university of north dakota.
caraher, w. . slow archaeology, punk archaeology, and the ‘archaeology of care’. european journal of archaeology. doi: https://doi.org/ . /eaa. .
caraher, w and rothaus, r. . an archaeology of care. on second thought: magazine of the north dakota humanities council, – . spring.
carrigan, m and vostal, f. . not so fast! a critique of the ‘slow professor’. university affairs/affaires universitaires, april . available at https://www.universityaffairs.ca/opinion/in-my-opinion/not-so-fast-a-critique-of-the-slow-professor/ [last accessed: june ].
cederström, c and hoedemaekers, c. . on dead dogs and unwritten jokes: life in the university today.
scandinavian journal of management, : – . doi: https://doi.org/ . /j.scaman. . .
clarke, c, knights, d and jarvis, c. . a labour of love? academics in business schools. scandinavian journal of management, : – . doi: https://doi.org/ . /j.scaman. . .
cohen, d and scheinfeldt, t. (eds.) . hacking the academy: new approaches to scholarship and teaching from digital humanities. ann arbor: university of michigan press. doi: https://doi.org/ . /dh. . .
cook, k. . embodiying disruption: queer, feminist and inclusive digital archaeologies. european journal of archaeology. doi: https://doi.org/ . /eaa. .
costa, c and murphy, m. . theorising digital scholarship – introducing the new edition. journal of applied social theory, ( ): – . https://socialtheoryapplied.com/journal/jast/article/view/ / .
costopoulos, a. . limiting the damage of disengaging from the journal system. archeothoughts, may . available at https://archeothoughts.wordpress.com/ / / /limiting-the-damage-of-disengaging-from-the-journal-system/ [last accessed june ].
cretney, r. . resilience for whom? emerging critical geographies of socio-ecological resilience. geography compass, ( ): – . doi: https://doi.org/ . /gec .
cretney, r and bond, s. . ‘bouncing back’ to capitalism? grass-roots autonomous activism in shaping discourses of resilience and transformation following disaster. resilience, ( ): – . doi: https://doi.org/ . / . .
daniels, j and thistlethwaite, p. . being a scholar in the digital era. bristol: policy press. doi: https://doi.org/ . /policypress/ . .
davidson, d. . the applicability of the concept of resilience to social systems: some sources of optimism and nagging doubts. society & natural resources, ( ): – . doi: https://doi.org/ . /
dyson, l. . the knowledge market. soundings: a journal of politics and culture, : – . doi: https://doi.org/ . /
edmonds, j. .
book review symposium: maggie berg and barbara k seeber, the slow professor: challenging the culture of speed in the academy. sociology, ( ): – . doi: https://doi.org/ . /
edwards, m. . review: maggie berg and barbara k seeber, the slow professor: challenging the culture of speed in the academy. humboldt journal of social relations, : – . http://www.jstor.org/stable/ .
eve, m. . open access and the humanities: contexts, controversies and the future. cambridge: cambridge university press. doi: https://doi.org/ . /cbo
feldman, z and sandoval, m. . metric power and the academic self: neoliberalism, knowledge and resistance in the british university. triplec, ( ): – . doi: https://doi.org/ . /triplec.v i .
gawande, a. . the upgrade: why doctors hate their computers. the new yorker, november , – . available at https://www.newyorker.com/magazine/ / / /why-doctors-hate-their-computers [last accessed june ].
gill, r. . breaking the silence: the hidden injuries of neo-liberal academia. feministische studien, ( ): – . doi: https://doi.org/ . /fs- -
golumbia, d. . marxism and open access in the humanities: turning academic labor against itself. workplace: a journal for academic labor, : – . doi: https://doi.org/ . /workplace.v i .
gornall, l and salisbury, j. . compulsive working, ‘hyperprofessionality’ and the unseen pleasures of academic work. higher education quarterly, ( ): – . doi: https://doi.org/ . /j. - . . .x
grand, a, holliman, r, collins, t and adams, a. . “we muddle our way through”: shared and distributed expertise in digital engagement with research. journal of science communication, ( ): a . doi: https://doi.org/ . / .
grealy, l and laurie, t. . higher degree research by numbers: beyond the critiques of neo-liberalism. higher education research & development, ( ): – . doi: https://doi.org/ . / . .
gregg, m. a. learning to (love) labour: production cultures and the affective turn. communication and critical/cultural studies, ( ): – . doi: https://doi.org/ . /
gregg, m. b. banal bohemia: blogging from the ivory tower hot-desk. convergence: the international journal of research into new media technologies, ( ): – . doi: https://doi.org/ . /
gregg, m. . work’s intimacy. cambridge: polity press.
hall, g. . the uberfication of the university. minneapolis: university of minnesota press. doi: https://doi.org/ . /
hall, r. . the alienated academic. the struggle for autonomy inside the university. london: palgrave macmillan. doi: https://doi.org/ . / - - - -
hall, r and bowles, k. . re-engineering higher education: the subsumption of academic labour and the exploitation of anxiety. workplace: a journal for academic labor, : – . doi: https://doi.org/ . /workplace.v i .
haven, t, tijdink, j, martinson, b and bouter, l. . perceptions of research integrity climate differ between academic ranks and disciplinary fields – results from a survey among academic researchers in amsterdam. psyarxiv. doi: https://doi.org/ . /osf.io/ hw
hay, i. . how to be an academic superhero: establishing and sustaining a successful career in the social sciences, arts and humanities. cheltenham: edward elgar publishing.
hewitt, r. . measuring well-being in higher education. higher education policy institute policy note . available at https://www.hepi.ac.uk/ / / /measuring-well-being-in-higher-education/ [last accessed june ].
hildebrandt, k and couros, a. . digital selves, digital scholars: theorising academic identity in online spaces. journal of applied social theory, ( ): – . http://socialtheoryapplied.com/journal/jast/article/view/ /
hillman, n. . the uss: how did it come to this?
higher education policy institute report . available at https://www.hepi.ac.uk/ / / /the-uss-how-did-it-come-to-this/ [last accessed june ].
hobfoll, s, stevens, n and zalta, a. . expanding the science of resilience: conserving resources in the aid of adaptation. psychological inquiry, ( ): – . doi: https://doi.org/ . / x. .
holliman, r. . from analogue to digital scholarship: implications for science communication researchers. journal of science communication, ( ). doi: https://doi.org/ . / .
holmwood, j. . damned metrics. in: kelly, s, freedman, d and ismail, f (eds.), the university is ours: how to build an activist union branch, – . london: branch solidarity network. available at https://ucubranchsolidaritynetwork.wordpress.com/branch-activists-handbook/ [last accessed june ].
jarvis, h and pratt, a. . bringing it all back home: the extensification and ‘overflowing’ of work. the case of san francisco’s new media households. geoforum, ( ): – . doi: https://doi.org/ . /j.geoforum. . .
jones, s and oakley, c. . the precarious postdoc: interdisciplinary research and casualised labour in the humanities and social sciences. durham: working knowledge/hearing the voice, durham university. available at http://www.workingknowledgeps.com/wp-content/uploads/ / /wkps_precariouspostdoc_pdf_interactive.pdf [last accessed june ].
kimhi, s and eshel, y. . the missing link in resilience research. psychological inquiry, ( ): – . doi: https://doi.org/ . / x. .
kinman, g. . work stressors, health and sense of coherence in uk academic employees. educational psychology, ( ): – . doi: https://doi.org/ . /
kinman, g and jones, f. . ‘running up the down escalator’: stressors and strains in uk academics. quality in higher education, ( ): – . doi: https://doi.org/ . /
lawless, b. . documenting a labor of love: emotional labor as academic labor. review of communication, ( ): – . doi: https://doi.org/ . / . .
lupton, d, mewburn, i and thomson, p. .
the digital academic: identities, contexts and politics. in: lupton, d, mewburn, i and thomson, p (eds.), the digital academic: critical perspectives on digital technologies in higher education, – . london: routledge.
lynn, c, howells, m and stein, m. .
family and the field: expectations of a field-based research career affect researcher family planning decisions. plosone, ( ): e . doi: https://doi.org/ . /journal.pone.
macdonald, r. . ‘impact’, research and slaying zombies: the pressures and possibilities of the ref. international journal of sociology and social policy, ( – ): – . doi: https://doi.org/ . /ijssp- - -
mackinnon, d and derickson, k. . from resilience to resourcefulness: a critique of resilience policy and activism. progress in human geography, ( ): – . doi: https://doi.org/ . /
matin, n, forrester, j and ensor, j. . what is equitable resilience? world development, : – . doi: https://doi.org/ . /j.worlddev. . .
mcguire, r. . archaeology as political action. berkeley: university of california press.
mclean, p and wallace, d. . blogging the unspeakable: racial politics, bakhtin, and the carnivalesque. international journal of communication, : – . http://ijoc.org/index.php/ijoc/article/view/
mcnally, r. . people can be resilient, but can communities? psychological inquiry, ( ): – . doi: https://doi.org/ . / x. .
mišík, m. . a call to arms for established researchers. the research whisperer, august . available at https://theresearchwhisperer.wordpress.com/ / / /a-call-to-arms-for-established-researchers/ [last accessed june ].
morgan, c. . emancipatory digital archaeology. unpublished thesis (phd), university of california, berkeley.
morgan, c. . punk, diy, and anarchy in archaeological thought and practice. ap: online journal in public archaeology, : – . doi: https://doi.org/ . /ap.v i
morgan, c and winters, j. . introduction: critical blogging in archaeology. internet archaeology, . doi: https://doi.org/ . /ia. .
morrish, l. a. the accident of accessibility: how the data of the tef creates neoliberal subjects. social epistemology. doi: https://doi.org/ . / . .
morrish, l. b. pressure vessels: the epidemic of poor mental health among higher education staff.
higher education policy institute occasional paper , may . available at https://www.hepi.ac.uk/ / / /pressure-vessels-the-epidemic-of-poor-mental-health-among-higher-education-staff/ [last accessed june ].
muellerleile, c and lewis, n. . reassembling knowledge production with(out) the university. globalisation, societies and education, ( ): – . doi: https://doi.org/ . / . .
mulligan, m, steele, w, rickards, l and fünfgeld, h. . keywords in planning: what do we mean by ‘community resilience’? international planning studies, ( ): – . doi: https://doi.org/ . / . .
olsson, l, jerneck, a, thoren, h, persson, j and o’byrne, d. . why resilience is unappealing to social science: theoretical and empirical investigations of the scientific use of resilience. science advances, ( ): e . doi: https://doi.org/ . /sciadv.
oracle. . campus solutions. available at https://docs.oracle.com/cd/e _ /infoportal/cs.html [last accessed june ].
padilla, m and thompson, j. . burning out faculty at doctoral research universities. stress and health, : – . doi: https://doi.org/ . /smi.
papadopoulos, c and reilly, p. . the digital humanist: contested status within contesting futures. digital scholarship in the humanities. doi: https://doi.org/ . /llc/fqy
pausé, c and russell, d. . sociable scholarship: the use of social media in the st century academy. journal of applied social theory, ( ): – . http://socialtheoryapplied.com/journal/jast/article/view/ /
peck, j and tickell, a. . neoliberalizing space. antipode, ( ): – . doi: https://doi.org/ . / - .
perry, s. . digital media and everyday abuse. anthropology now, ( ): – . doi: https://doi.org/ . / . .
perry, s. . changing the way archaeologists work: blogging and the development of expertise. internet archaeology, . doi: https://doi.org/ . /ia. .
perry, s and beale, n. . the social web and archaeology’s restructuring: impact, exploitation, disciplinary change. open archaeology, ( ): – . doi: https://doi.org/ . /opar- -
reed, p. . review: maggie berg and barbara seeber, the slow professor: challenging the culture of speed in the academy. logos, ( ). http://logosjournal.com/ /review-maggie-berg-and-barbara-seeber-the-slow-professor-challenging-the-culture-of-speed-in-the-academy/.
richardson, lj. . micro-blogging and online community. internet archaeology, . doi: https://doi.org/ . /ia. .
richardson, lj. . i’ll give you ‘punk’ archaeology, sunshine. world archaeology, ( ): – . doi: https://doi.org/ . / . .
rocks-macqueen, d and webster, c. (eds.) . blogging archaeology. landward research. http://www.landward.eu/publications.
rolfe, g. . the university in dissent: scholarship in the corporate university. abingdon: routledge. doi: https://doi.org/ . /
rustin, m. . the neoliberal university and its alternatives. soundings: a journal of politics and culture, ( ): – . doi: https://doi.org/ . /
sabagh, z, hall, n and saroyan, a. . antecedents, correlates and consequences of faculty burnout. educational research, ( ): – . doi: https://doi.org/ . / . .
schultz, p. . collingwood’s goo. in: caraher, w, kourelis, k and reinhard, a (eds.), punk archaeology, – . grand forks nd: the digital press, university of north dakota.
schwarz, b and knowles, c. .
the scandal of contemporary universities. soundings: a journal of politics and culture, : – . doi: https://doi.org/ . /soun: .editorial.
scott, s. . book review symposium: maggie berg and barbara k seeber, the slow professor: challenging the culture of speed in the academy. sociology, ( ): – . doi: https://doi.org/ . /
shaw, c. . overworked and isolated – work pressure fuels mental illness in academia. the guardian, may . [online access at https://www.theguardian.com/higher-education-network/blog/ /may/ /work-pressure-fuels-academic-mental-illness-guardian-study-health last accessed june ].
sperlinger, t, mclellan, j and pettigrew, r. . who are universities for? re-making higher education. bristol: bristol university press. doi: https://doi.org/ . /j.ctv fgxx
stewart, b. . collapsed publics: orality, literacy, and vulnerability in academic twitter. journal of applied social theory, ( ): – . http://socialtheoryapplied.com/journal/jast/article/view/ /
suiter, t. . why ‘hacking’? in: cohen, d and scheinfeldt, t (eds.), hacking the academy: new approaches to scholarship and teaching from digital humanities, – . ann arbor: university of michigan press. doi: https://doi.org/ . /j.ctv swj .
tuc. . workers in the uk put in £ . billion worth of unpaid overtime a year. press release, february . available at https://www.tuc.org.uk/news/workers-uk-put-£ -billion-worth-unpaid-overtime-year [last accessed june ].
ucu. a. workload is an education issue. ucu workload survey report . available at https://www.ucu.org.uk/media/ /workload-is-an-education-issue-ucu-workload-survey-report- /pdf/ucu_workloadsurvey_fullreport_jun .pdf [last accessed june ].
ucu. b. precarious work in higher education. november . available at https://www.ucu.org.uk/article/ /precarious-contracts-in-he---institution-snapshot [last accessed june ].
veletsianos, g and kimmons, r. . assumptions and challenges of open scholarship.
international review of research in open and distributed learning, ( ): – . doi: https://doi.org/ . /irrodl.v i .
watrall, e. . building scholars and communities of practice in digital heritage and archaeology. advances in archaeological practice, ( ): – . doi: https://doi.org/ . /aap. .
watts, j and robertson, n. . burnout in university teaching staff: a systematic literature review. educational research, ( ): – . doi: https://doi.org/ . / . .
weller, m. . the digital scholar: how technology is transforming scholarly practice. london: bloomsbury. doi: https://doi.org/ . /
weller, m. . the digital scholar revisited. the ed techie blog, december . available at http://blog.edtechie.net/digital-scholarship/the-digital-scholar-revisited/ [last accessed june ].
weller, m and anderson, t. . digital resilience in higher education. european journal of open, distance and e-learning, ( ): – . http://www.eurodl.org/?p=archives&year= &halfyear= &abstract= .
welsh, m. . resilience and responsibility: governing uncertainty in a complex world. the geographical journal, ( ): – . doi: https://doi.org/ . /geoj.
woodcock, j. . digital labour in the university: understanding the transformations of academic work in the uk. triplec, ( ): – . doi: https://doi.org/ . /triplec.v i .
ylijoki, oh. . boundary-work between work and life in the high-speed university. studies in higher education, ( ): – . doi: https://doi.org/ . / . .
zábrodská, k, mudrák, j, Šolcová, i, květon, p, blatný, m and machovcová, k. . burnout among university faculty: the central role of work-family conflict. educational psychology, ( ): – . doi: https://doi.org/ . / . .
how to cite this article: huggett, j. . resilient scholarship in the digital age. journal of computer applications in archaeology, ( ), pp. – . doi: https://doi.org/ . /jcaa.

submitted: december accepted: june published: august

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

journal of computer applications in archaeology is a peer-reviewed open access journal published by ubiquity press.

provided by the author(s) and nui galway in accordance with publisher policies. please cite the published version when available. downloaded - - t : : z some rights reserved.
for more information, please see the item record link above.

title: the abbey theatre digital archive: a digitization project with dramatic impact
author(s): cox, john
publication date: - -
publication information: cox, j., ( ). the abbey theatre digital archive: a digitization project with dramatic impact. insights. ( ), pp. – . doi: http://doi.org/ . /uksg.
publisher: united kingdom serials group
link to publisher's version: http://doi.org/ . /uksg.
item record: http://hdl.handle.net/ /
doi: http://dx.doi.org/ . /uksg.

https://aran.library.nuigalway.ie
http://creativecommons.org/licenses/by-nc-nd/ . /ie/

national university of ireland galway digitized the archive of the abbey theatre between and . this was the largest theatre archive digitization project worldwide and it has had a major impact on the university and its library. the scale of the digitization project presented a series of challenges, including fragile material, limited time, and the need for streamlined workflows, complex digital rights management and effective systems. the project was completed on time and on budget in , using a ‘more product, less process’ approach. access to the abbey theatre digital archive has delivered strong academic impact for the university, generating new research income and international connections as well as contributing to improved institutional ranking. the digital archive enables new types of research, including text and data mining, and has reshaped undergraduate curricula. it has also had a transformative effect on the library as leader of the project. the role of the archivist has changed and partnerships with the academic community have strengthened. a growing emphasis on digital publication has been a catalyst for a function- rather than subject-based organizational structure which promotes participation in digital scholarship initiatives, with archives and special collections occupying a new position of prominence.
the abbey theatre digital archive: a digitization project with dramatic impact

john cox
university librarian, national university of ireland galway

insights – ( ), november

introduction

projects come and go but occasionally there is one that turns everything upside down. this is how it has been with the digitization of the abbey theatre archive at national university of ireland (nui) galway. i immediately had a sense that this might be the case when the president of the university rang me in november to charge the library with leading the project. the prospect was daunting and on a far greater scale than anything we had undertaken before. saying no was not an option, however, and i was excited at the opportunity to position the library at the head of a high-profile digital humanities project of national and international as well as institutional significance. digitizing this major archive has presented many challenges but also reshaped teaching, enabled new forms of research and transformed the library agenda.

a significant archive

founded in , the abbey is ireland’s national theatre. w b yeats and lady gregory established it to ‘bring upon the stage the deeper emotions of ireland’ and it has influenced the country’s history. some of its players participated in the easter rising of and it has staged controversial plays on themes ranging from the northern ireland troubles to child abuse. the abbey has attracted the attention of censors, as shown in figure . its archive is extensive, encompassing almost two million pages, hours of video and , hours of audio recordings.

figure .
censor’s note, the playboy of the western world, (abbey theatre)

this collection of production, publicity, administrative and financial records represents a massive research resource and was the centrepiece of the institutional partnership established between the abbey theatre and nui galway and formally agreed in april . the president of ireland, michael d higgins, launched the partnership between the two institutions in october (see photograph).

president of ireland, michael d higgins, at the launch of the archive partnership (leon farrell/photocall ireland)

digitization was attractive to the abbey both as a way of opening the archive to a wider audience and of securing its content against losses such as those sustained in a major fire in . the focus of the partnership for nui galway was to build on research and teaching strengths in theatre and drama. this subject area had been prioritized for further development at the university and a process of further staff and student recruitment was already under way, with the intention of expanding the existing range of taught programmes at undergraduate and postgraduate level.

there is a strong performing arts tradition in the west of ireland, home to the founders of the abbey theatre, w b yeats and lady gregory. the university had already developed local partnerships with the galway international arts festival and the druid theatre. a partnership with the abbey theatre offered the opportunity for mutual benefit at national level. the university president recognized access to unique archives as vital for the humanities, stimulating research as well as research-led teaching.

there was already a focus on theatre archives in the library’s collections. these include the archives of institutions such as the druid, taibhdhearc and lyric theatres, of playwrights such as thomas kilroy and john arden, and of the actors arthur shields and siobhán mckenna.
exclusive access to the archive of the abbey theatre had the potential to increase the international reputation of the university in this field.

digitization challenges

the scale of the abbey archive, already outlined, would have presented a major challenge on its own but the stakes were raised by the need to complete the digitization in three years. time was of the essence as the institutional partnership is for years in the first instance.

it was clear from an early stage that digitization on this scale and to this timescale could not be achieved through existing library resources. this meant outsourcing a large proportion of the digitization and we were fortunate to select an excellent contractor, an archivist who understood the needs of all parties and employed qualified archivists to process a large amount of difficult material appropriately. the material in question included fragile documents damaged in the fire, an array of formats and sizes from press cuttings to stage designs, a mix of handwritten correspondence and typescript records, and audio or video recordings in legacy formats and delicate condition.

efficient workflows were key to rapid throughput and library staff gained from the contractor’s expertise in this regard. it was important to share learning between contracted staff and archivists in the library who digitized part of the archive, including the programmes.

other library staff engaged closely with the contractor to establish the systems infrastructure. this was a vital element in enabling large-scale digitization and meeting complex rights management requirements. components included a range of digitization equipment to handle different formats, a productions database already created by the abbey theatre which reduced cataloguing effort by providing a metadata ‘spine’ for much of the material, and a bespoke digital asset management (dam) system designed by aetopia limited in belfast.
the dam was key to managing digital rights, enabling automatic redaction based on the occurrence of certain words, withholding of sections rather than whole documents and automatic release of documents after the expiry of agreed embargo periods for certain categories, e.g. years for board minutes. cloud-based computing and storage infrastructures have been successfully deployed throughout, with amazon’s simple storage service (s3) selected for this purpose.

two aspects of the project are particularly noteworthy. firstly, access to the digital archive is limited to designated workstations in the archives reading room at nui galway’s library. this was specified in the partnership agreement with the abbey theatre which had concerns about publishing the digitized archive on the open web due to rights management issues and relationship management with living actors. librarians generally favour open access (oa) and it is frustrating to limit the availability of the digital archive in this way. the reading room model is, however, advantageous in that exclusive access helps to recruit academic staff and students to the university in addition to attracting visitors from around the world. the minute books from the period – have been published on an oa basis and it is hoped that further content can be released.

the second area of interest concerns metadata. a streamlined methodology, characterized by the ‘more product, less process’ approach, has been adopted to enable full digitization to be completed in three years. the integration of the abbey’s own productions database has enabled a lot of material, such as scripts or programmes, to be linked to specific plays.
this associates the material with relevant cast and venues and brings all of the different document types relating to that play or production together (figure ). it is not possible to link all documents to a play, of course. for such material a brief descriptive record has been provided and ocr used where possible to maximize full-text retrieval. it is recognized that the online environment opens up different ways of locating material and that new approaches to processing archives are available, potentially saving time otherwise spent on detailed arrangement. the dam offers powerful search facilities and users have reported positive experiences with the digital archive.

figure . launch page for the plough and the stars (nui galway abbey theatre digital archive)

expert archivists, efficient workflows and a robust systems infrastructure were all key to making rapid progress in digitizing the archive. the initial version of the archive became available in late , just a year after the project commenced in september , and the full digitization was completed on time and on budget at the end of august . this was an excellent outcome. a rigorous deduplication exercise meant that the number of pages digitized was around , rather than the two million originally calculated. table shows the categories of material involved.

a project steering group, comprising staff from the abbey theatre and nui galway along with the contractor, has played an important role throughout in monitoring progress, addressing issues and keeping stakeholders updated. annual reports for the period – accounted for progress on digitization, rights management, user experience, academic uptake and public engagement. three presentations to the university’s governing authority highlighted the strategic importance of the project and its impact at and beyond nui galway, further elaborated in the remaining sections of this article.
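the rights rules described earlier for the dam (word-based redaction and release after an embargo period) can be illustrated with a short python sketch. it is not the aetopia system's implementation: the record fields, the function names, the `[redacted]` marker and every value in the usage note are hypothetical, chosen only to make the logic concrete. withholding of individual sections is not modelled.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Set


@dataclass
class Record:
    """A digitized item; the field names are hypothetical."""
    category: str
    created: date
    text: str


def release_date(created: date, embargo_years: int) -> date:
    """First day on which an embargoed record may be shown."""
    try:
        return created.replace(year=created.year + embargo_years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return created.replace(year=created.year + embargo_years, day=28)


def redact(text: str, blocked_words: Set[str]) -> str:
    """Mask whole-word, case-insensitive occurrences of the blocked words."""
    if not blocked_words:
        return text
    alternatives = "|".join(re.escape(w) for w in sorted(blocked_words))
    pattern = r"\b(?:" + alternatives + r")\b"
    return re.sub(pattern, "[redacted]", text, flags=re.IGNORECASE)


def view(record: Record, embargo_years: Dict[str, int],
         blocked_words: Set[str], today: date) -> Optional[str]:
    """Return the text a reader may see, or None while the embargo runs."""
    years = embargo_years.get(record.category, 0)  # no rule means no embargo
    if today < release_date(record.created, years):
        return None
    return redact(record.text, blocked_words)
```

with an illustrative ten-year embargo on a hypothetical `board_minutes` category, a record dated 1 january 2000 would be withheld when viewed in 2005 and shown with any listed word masked from 1 january 2010. the embargo length here is a placeholder, not the period agreed for the project.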
category                  volume
administrative files      , pages
scripts                   , pages
prompt scripts            , pages
programmes                , pages
photographs               , items
press cuttings            , items
stage management files    , pages
audio                     , recordings
set designs               , pages
posters                   items
lighting designs          items
video                     recordings
venue designs             items
handbills                 items

table . abbey theatre archive: volumes of digitized material per category

academic impact

the success of a project is often judged on its impact and in the academic world this is closely linked to teaching, research and reputation. the abbey theatre digital archive has had a very positive influence in each of these areas. the headline most often associated with it is that it has generated more than € , in research funding and student scholarships.

money is one measure, but reputational gain is at least as important. unique access to such a major resource for research into different aspects of ireland’s history has attracted scholars from around the world to the campus. new international connections with other universities have developed and boosted nui galway’s profile in theatre and drama, making it a frequently referenced institution in this field. this is significant since mentions of an institution play an important part in university ranking systems, helping nui galway into the top bracket in a number of tables. patrick lonergan, professor of drama and theatre studies at nui galway, has commented that ‘digital access to the abbey archive has been vital in strengthening academic participation in international research networks in digital humanities and other fields’. this heightened level of external engagement has included multi-partner funding bids and the hosting of major conferences, as described later.

publications are still the major currency in terms of academic impact and the digital archive has underpinned a lot of published research.
the best example is the oxford handbook of modern irish theatre, a landmark publication with more than chapters. most of the contributors visited nui galway to use the abbey and related archives. as a result, this publication contains more than citations of archives held by the university, along with images reproduced from those collections. nui galway authors feature prominently in the handbook and academic staff in the centre for drama, theatre and performance are committed to using the digital archive and related collections in their publications. the head of the centre will publish a monograph on theatre and digital archives in , and this reflects the development of a very close relationship with the library, based on the archives.

the abbey theatre digital archive has strongly influenced teaching and learning. it underpinned the shaping of a new undergraduate curriculum for theatre and drama in and has generated a series of new modules, some at master’s level, including one focused specifically on the archive and another on irish theatre and archives more generally. existing and new programmes require intensive use of the archives by students, often in very active ways, for instance to develop their own playwriting skills. some students are also able to get work experience at the abbey theatre as part of the internships programme included in the partnership agreement. a recent innovation has been the appointment of a teaching fellow in the centre for drama, theatre and performance who will focus on creating archives-related teaching and learning materials. this is a very positive development, unique to this discipline at nui galway, and a further stimulus to close collaboration between the library and the centre.
digital archives make new types of engagement with teaching, learning and research possible and chris morash, seamus heaney professor of irish writing at trinity college dublin, has observed that the archive ‘is really transforming the way in which we do theatre history research in ireland’. in the first instance, the whole collection is accessible for searching, rather than only its metadata as for print archives, and users can view any document, not just those requested for consultation. the range of search facilities available for the digital archive has enabled new connections to be made. text and data mining offer interesting possibilities and studies to date include analyses of advertisements published in programmes over the decades, the use of profanities, and gender representation in the language of plays and the personnel involved. this work has generated interdisciplinary collaborations on campus, bringing together humanities scholars with academic staff from the insight centre for data analytics.

fintan o’toole, one of ireland’s leading commentators, highlighted the value of digital archives for theatre research in an irish times review of the oxford handbook of modern irish theatre in late . he noted that ‘reconstructing or evoking what actually happened is far tougher. it has, though, become possible, not just with changes of attitudes but with the availability of archives, many of them digitized by institutions, notably the james hardiman library at nui galway’.

the changed role of the archivist

the abbey theatre project represented the largest theatre archive digitization worldwide and has had a profound impact on the library at nui galway. change initially concentrated itself in the archives team. one member was designated to digitize the programmes and some other materials from the archive, working closely with the contractor’s staff.
this has proved invaluable in developing expertise around workflows, quality control and metadata requirements to identify specific plays and tours. there can be a tendency to think of digitization as a simple process, but things can go wrong and we learned that distributing work among groups of students will only work with clear instructions, robust processes and an eye for complexity. the post in question has been redesignated as digital archivist and now underpins the creation of other digital collections, as well as managing the addition of new content to the digital archive annually.

the role of the archivist as advisor and mediator of collections has really advanced in a digital environment. a digital archive of this size presents challenges of navigation and discovery. the collection is large, running to about , pages and containing a wide range of material, some of it unfamiliar to users of traditional archives. video recordings of live performances, for example, are sources not usually available to users. learning how to search and, importantly, why to search for set designers, costume designers, even voice coaches and choreographers, is a new approach for users, and it leads to dynamic material with which to undertake new studies. users also value an explanation of the scope and context of the archive, notably what is included or excluded, what embargo periods apply, how redaction impacts access and what the conditions are for reuse of material. feedback on the user interface is positive and users value features such as a self-generating citation for every individual item and zooming facilities to enhance viewing of material.

all of our archivists provide training and guidance on the use of the digital archive to groups and to individuals.
they have also developed a new ‘discovering the archives’ module whose uptake is increasing at both undergraduate and postgraduate levels. there is no need to deliver original material from the stores to users of the digital archive and this saving of archivist time enables a greater emphasis on providing expert mediation of detailed queries, often by connecting users with material in other collections. as noted earlier, the library had specialized in theatre archives previously, including those of the druid and lyric theatres, of the playwright thomas kilroy, and the actor arthur shields. archivists are taking every opportunity to link the use of the digital archive to these and other collections which offer valuable contextual linkages to enhance the scholarly understanding of the abbey theatre’s own history.

closer relationships between archivists and academic staff have developed, as evidenced by the inclusion of archivists as presenters in programme modules and at locally organized conferences or seminars. curation of exhibitions, such as an exhibition of costumes from the abbey theatre wardrobe, has developed as a key area of collaboration. the archivists have taken a very entrepreneurial role, not only in promoting the digital archive at international events but also in collaborating with academics to co-host a major conference titled ‘performing the archive’ in july , which attracted an international audience of more than to the university. since then, one of the archivists has led a successful bid to host the sibmas (international association of libraries, museums, archives and documentation centres of the performing arts) conference in at nui galway. archivist participation in academic funding bids has had positive outcomes, including a recent award of € , for a project related to the study of the gate theatre.
a further manifestation of the outward-facing approach of the archivists and their embedding into the academic community is the editing of a monograph on irish theatre archives by one of the team.

library engagement with digital scholarship

digitizing the abbey theatre archive has shaped new roles for nui galway’s library in general. perceptions of the library have been changed by leadership of a project of this scale, and academic staff have turned in our direction for advice on their own digital projects. library staff have engaged in collaborations on a range of projects in digital humanities and, more broadly, digital scholarship. outputs have included new digital archive collections such as those of the cartographer tim robinson, who has documented the landscape of the west of ireland extensively, or the peacemaker brendan duddy, who played a key role in northern ireland. a major data set for eighteenth-century irish trade has been published as a result of a collaboration with an academic in economics. the emphasis on theatre has resulted in the publication of a historical database of shakespearean productions in ireland and a transcription of the early abbey theatre minute books, mentioned previously (figure ). the abbey project has undoubtedly raised ambitions around archives, helping to attract high-quality collections and to take on their digitization. examples are the archives of the gate theatre, second only to the abbey in ireland, and of mary robinson, former president of ireland and united nations high commissioner for human rights.

figure . abbey theatre minute books, –

academic libraries are increasingly vital to enabling digital scholarship but this brings real challenges.
effective participation needs the right infrastructures in place, particularly in terms of technology and human support. at nui galway the need to consolidate a number of disparate digital preservation and publishing platforms soon became evident, triggering a move towards systems such as islandora, which are commonly used worldwide, and adherence to international metadata standards. the publication of a digital scholarship enablement strategy in guided a more joined-up approach and helped to position the library as a key player in digital projects on campus.

most important of all has been the establishment of a new digital publishing and innovation team to bring together a number of staff with an appropriate mix of skills, including programming, metadata, web publishing and data management. members of this team have backgrounds in libraries, it and archives, and are key to continued strong engagement with the many aspects of digital scholarship which at nui galway have ranged from complex rights management to the development of a new technology-rich makerspace to facilitate digital projects.

five new teams in total were created as part of an extensive process of analysis and consultation among staff and users about the future role of the library, stimulated at least in part by the challenges and opportunities arising from the abbey digital archive project. this process resulted in the publication in of a library strategy to in which archives and special collections featured prominently and high-impact publication of research, data and digital content emerged as one of six priorities. the new team structure has a number of distinguishing features, including a move from a subject librarian team to a functional approach to organizing staff, described elsewhere.
that change reflects a need to meet an expanding range of user expectations, many of them arising from the digital scholarship activities stimulated by the abbey project and from an increased profile for archives and special collections, whose functions are now distributed across three teams.

conclusion

the project to digitize the abbey theatre archive at nui galway has certainly provided its share of drama. the task appeared sisyphean at the outset but the digitization was completed on schedule thanks to the expertise and teamwork of the contractor, archivists, librarians and academics involved. access to the digital archive has yielded many benefits for the university, beyond even the high expectations expressed from an early stage, generating significant research funding, reshaping curricula, enabling new modes of enquiry and advancing institutional reputation.

digitized archives do not always have the impact expected of them and the limitation of access to the archives reading room at nui galway might have had a negative effect in this instance. a combination of factors has, however, delivered a favourable outcome. the most important has been the support and enthusiasm of the university president for the project. that support has generated the necessary funding and given the digital archive a positive profile within and beyond the institution. investment in a major initiative for the humanities has also added a level of scrutiny which has helped to maintain momentum and to ensure accountability among the different partners involved, resulting in annual reports and presentations about the project to the university’s governing authority.

the inter-institutional partnership has been important but a local partnership within nui galway has made the greatest difference.
the closeness of the relationship developed through the project between the library and the centre for drama, theatre and performance has been unique in my experience and both departments have helped to differentiate the university. as a discipline, drama has flourished at nui galway in recent years through innovative teaching, local partnerships, a strong publications record and an award-winning building. archives have similarly thrived, with strong university support enabling increases in staffing and a new building with excellent facilities to attract, use and exhibit collections. archivists and academic staff in drama have matched each other in bringing great energy and complementary resources to the project.

another significant factor has been the somewhat restricted access afforded to the archive in paper format. the abbey theatre has only a small consultation area and a single archivist. as a result cataloguing was limited, making it difficult to identify material of interest. in addition, the small amount of space for researchers meant that the waiting list was long and a significant level of unsatisfied demand had built up over a period of time. multi-user access to the digital archive was therefore much welcomed by scholars. the timing has also been favourable. research and publication activity in theatre and drama has been strong recently, with the oxford handbook of modern irish theatre, published in , a leading example.

initially daunted by the project, the library has emerged stronger from it, reorganized, refocused and perceived differently in the institution as a partner in the academic mission, not simply a service to support it. a few lessons have emerged along the way. it is usually better to take on a challenge than to sidestep it and this project has been a case in point. ambition is not a word associated frequently enough with libraries and archives but it is often a force for good.
archives occupy a sweet spot in the university strategy at nui galway right now and strong institutional support has raised their profile, attracting a series of prestigious collections. the digitization of the gate theatre archive emerged as a successor project and others in a range of disciplines are in the pipeline, showing the university’s continuing intent to advance its academic mission through distinctive collections. the positive experience and outcomes associated with the abbey theatre archive digitization have stimulated these developments. inputs are vital but impact is ultimately a big differentiator for any project. the ‘more product, less process’ mantra challenges our thinking but its focus on outputs, backed by streamlined workflows, has much to commend it, increasing impact and liberating time for new areas of contribution and recognition. abbreviations and acronyms a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘abbreviations and acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa competing interests the author has declared no competing interests. ‘the library has emerged stronger from it, reorganized, refocused and perceived differently in the institution’ http://www.uksg.org/publications#aa references . abbey theatre: https://www.abbeytheatre.ie/about/ (accessed september ). . bradley m and keane a, the abbey theatre digitization project in nui galway, new review of information networking, , ( – ), – ; doi: https://doi.org/ . / . . (accessed september ). . nui galway digital collections, abbey theatre minute books jan –may : https://digital.library.nuigalway.ie/islandora/object/nuigalway% aabbey-theatre-minute-books (accessed september ). . greene m a and meissner d, more product, less process: revamping traditional archival processing, the american archivist, , ( ) fall/winter, – ; doi: https://doi.org/ . 
/aarc. . .c k (accessed september ). . nui galway library, abbey theatre digital archive: http://library.nuigalway.ie/collections/archives/depositedcollections/featuredcollections/abbeytheatredigitalarchive/ (accessed september ). . grene, n and morash, c, eds., the oxford handbook of modern irish theatre, , oxford, oxford university press. doi: https://doi.org/ . /oxfordhb/ . . . lonergan p, irish theatre since , , london, bloomsbury press. . o’toole f, the oxford handbook of modern irish theatre review: the best single volume on the subject, the irish times, november : http://www.irishtimes.com/culture/books/the-oxford-handbook-of-modern-irish-theatre-review-the-best-single-volume-on-the- subject- . (accessed september ). . nui galway library, archives catalogue: http://archivesearch.library.nuigalway.ie/nuig/calmview/default.aspx (accessed september ). . houlihan b, ed., negotiating ireland’s theatre archive: theory, practice, performance, , oxford, peter lang. . nui galway digital collections, tim robinson’s townland index for connemara and the aran islands: https://digital.library.nuigalway.ie/islandora/object/nuigalway% arobinson (accessed september ). . nui galway digital collections, brendan duddy papers: https://digital.library.nuigalway.ie/islandora/object/nuigalway% aduddy (accessed september ). . duanaire: a treasury of digital data for irish economic history: http://duanaire.ie/ (accessed september ). . shakespeare’s plays in dublin, – : http://www.nuigalway.ie/drama/shakespeare/ (accessed september ). . abbey theatre minute books, ref. . . cox j, communicating new library roles to enable digital scholarship, new review of academic librarianship, , , – ; doi: https://doi.org/ . / . . (accessed september ). . nui galway library, digital scholarship enablement strategy: http://library.nuigalway.ie/media/jameshardimanlibrary/digital-scholarship-enablement-strategy.pdf (accessed september ). . 
nui galway library, library strategy: the journey to : http://library.nuigalway.ie/media/jameshardimanlibrary/library-strategy---the-journey-to- .pdf (accessed september ). . cox j, new directions for academic libraries in research staffing: a case study at national university of ireland galway, new review of academic librarianship, , in press; doi: https://doi.org/ . / . . (accessed september ). . grene n and morash c, eds., ref. . . nui galway, gate theatre: https://www.nuigalway.ie/gatetheatre/ (accessed september ). article copyright: © john cox. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited. 
john cox university librarian james hardiman library, national university of ireland galway, galway h rew , ie e-mail: john.cox@nuigalway.ie | twitter: http://twitter.com/johncoxnuig orcid id: http://orcid.org/ - - - to cite this article: cox j, the abbey theatre digital archive: a digitization project with dramatic impact, insights, , ( ), – ; doi: https://doi.org/ . /uksg. published by uksg in association with ubiquity press on november
the oregon digital newspaper program’s commitment to open access ola quarterly volume number digital repositories and data harvests - - sarah seymore university of oregon recommended citation seymore, s. ( ). the oregon digital newspaper program’s commitment to open access. ola quarterly, ( ), - . https://doi.org/ . / - . © by the author(s). ola quarterly is an official publication of the oregon library association | issn - https://commons.pacificu.edu/olaq 
the oregon digital newspaper program’s commitment to open access by sarah seymore digital collections metadata librarian, university of oregon libraries sseymore@uoregon.edu sarah is the digital collections metadata librarian at the university of oregon libraries and program manager of the oregon digital newspaper program (odnp). she supervises non-marc cataloging of digitized and born-digital materials for the digital collections repository, oregon digital, and digital scholarship projects. in her role as program manager of odnp, she assists patrons with fundraising for newspaper digitization, manages the digitization workflow, and supports outreach to newspaper publishers across the state for born-digital preservation. she holds an undergraduate degree in art history, and a master’s degree in library and information science. the oregon digital newspaper program (odnp) at the university of oregon libraries is an initiative to digitize historic and current oregon newspapers, making them freely available to the public through a keyword-searchable online database. the odnp is committed to open access and has included collaboration and data sharing with larger programs like the library of congress’ chronicling america historic newspaper website. since , the odnp has increased its open access mission by archiving and hosting born-digital newspaper content, as well as continuing digitization of historic newspapers from microfilm and print. this article outlines the odnp’s past and current open access efforts, inclusion of diverse content, and open source, sustainable applications, websites, and workflows. 
background founded in with a combination of grant funding from library services & technology act (lsta), oregon cultural trust, and the national endowment for the humanities’ national digital newspaper program (ndnp) in partnership with the library of congress, the oregon digital newspaper program at the university of oregon (uo) libraries has provided online access to historic oregon newspapers for nearly years. a precursor to the digital program, the oregon newspaper microfilming program, began in the s at uo libraries, where microfilm was created from participating newspapers from across the state, with positive film reels distributed to a multitude of oregon libraries. while uo libraries no longer microfilms newspapers, the libraries have continued to pursue newspaper digitization and hosting of born-digital newspapers that can be accessed from the historic oregon newspapers website (http://oregonnews.uoregon.edu). since , the odnp has received three grants from ndnp, along with additional grants from lsta, oregon cultural trust, oregon newspaper publishers association, and private donations. the detailed history of the development of the program, as well as the first digitization projects that were completed, is outlined by sheila rabun in “oregon digital newspaper program: preserving history while shaping the future” (rabun, ). the program has changed and grown over the past couple of years by relying less on grants from large organizations and adding new types of newspaper content to create a more diverse and inclusive digital newspaper repository. in january , the odnp website surpassed one million pages of newspaper content, and the program has several digitization projects in the active and upcoming digitization queues. 
external funding has been vital to the foundation and sustainability of the program over the years, and recently the odnp has made conscious decisions to become more independent and self-sustainable by adding a wider variety of historic and current newspapers to the collection, updating workflows and systems, and truly espousing open access principles of preservation and free access to digital newspapers. expanding digitization and partnerships the odnp’s first steps to becoming more self-sufficient began in , with funding from the lsta next generation newspapers grant, which enabled the transition to a self-supporting newspaper digitization program. uo libraries ceased microfilming newspapers in , and the odnp began accepting and archiving born-digital newspaper pdfs from current publishers, as well as photographing newspapers from print. the program’s decision to preserve born-digital newspapers and end the microfilming project was due not only to the change in direction of newspaper publishing and preservation, but also to the limitations of grant-based programs like ndnp, which limits how many newspapers can be digitized and at what quality. the support from ndnp and chronicling america allowed the program to digitize , pages; however, chronicling america only allows digitization in grayscale with black and white microfilm. also, until recently, they only allowed newspaper content in the public domain, up to . they recently extended the date range of newspapers that they will digitize through , but these papers have to be thoroughly proven free of copyright. the restrictions also exclude color images and current newspapers that are not microfilmed, including the plethora of born-digital news. during this transitional phase, newspapers.com (https://www.newspapers.com/) also approached the program to digitize the university of oregon’s large collection of master negative microfilm to add to their online database. 
institutions that have a subscription to newspapers.com, including uo libraries, have access to this content, but it will not be openly accessible to all until it is uploaded to the odnp website in , at the conclusion of a five year embargo period. this partnership was a great way to digitize around thirty titles, including spans of years for the oregon daily journal and the oregon statesman, but the program is interested in partnering with open access projects and initiatives moving forward. for instance, in , the program partnered with reveal digital (http://revealdigital.com/), a project that crowdsources library funding to collectively digitize and provide open access to diverse, often hidden collections. odnp contributed the digital images of the western american, an oregon kkk newspaper from – to the hate in america: white nationalism and the press in the s project. the odnp has strived to include diverse and inclusive voices in the digital newspaper collection since its foundation with the digitization of the new northwest, the new age, portland new age, weekly chemawa american, the chemawa american, and more feminist, african american, and native american titles. this past year the odnp was fortunate to receive an anonymous donation for the digitization of just out, – , a landmark lgbtq+ publication from portland, and five other portland-based titles from the th century. these include the african american owned and operated titles the advocate, edited by beatrice morrow cannady, the portland inquirer, the oregon mirror, and the portland challenger. these titles will begin to appear online in early . diversity and inclusion for odnp also applies to other local newspapers like high school and college newspapers, smaller neighborhood newspapers, and content that is often overlooked in national newspaper digitization programs. the amplifier of west linn high school was digitized in , and select issues of the grantonian, the student newspaper of u.s. 
grant high school in portland will be digitized this year. born-digital news collection and preservation of born-digital, currently publishing newspapers has been the greatest change to odnp in the past few years. over , of the million pages online are from current publishers from across the state. this program has several benefits apart from preservation and access. compared to images scanned from microfilm, born-digital images offer better legibility and more precise keyword search results. also, public libraries that are invested in adding their local titles to odnp offer external outreach, support, and quality control for the pdf uploads. local librarians often check coverage of the online collection before physical copies of the newspapers are discarded, and the program greatly appreciates this active concern for the coverage of the collection.
figure . title page of the asian reporter, a participant in the current newspapers program.
current publishers are often hesitant to join the program in order to maintain revenue control of their digital archives. with embargos, the program is working on more flexible and appealing submission options for publishers. odnp is constantly trying to add more current newspapers to the website with outreach and frequent communication with publishers across the state. another goal for is to lower the barrier for participation with easier uploading and pdf validation/verification for publishers contributing pdfs. currently, this submission process is done via ssh file transfer protocol (sftp), which can be difficult for some small publishers to manage. newspaper operations that do not have a programmer on hand to automate the upload process must add manual file uploads into their already busy workflows, placing a burden on smaller publishers especially. 
creating an easier submission process is an important next step in ensuring small publications can participate and preserve their newspaper archives. this new process could also be beneficial for the other internal digitization workflows and quality assurance of odnp. technological independence with the decision to collect, preserve, and host born-digital news, the odnp website software and user interface also had to be reconsidered. the software created for the chronicling america project and ndnp partners, chronam, was not as customizable as other state-wide digital newspaper programs wanted it to be. in , staff from uo libraries, the university of nebraska libraries, and penn state libraries met to develop the open online newspaper initiative (open-oni) (https://github.com/open-oni), which is open source, collaboratively developed newspaper-hosting software. open-oni’s goal is “to lower the entrance bar for libraries, archives, historical societies, and other cultural heritage institutions to display digital newspaper content. [the team] was formed in response to a need for free, easily deployable, flexible, plug-and-play software that is useful for collections large and small, local and national.” (dussault et al., ). the odnp website upgrade to open-oni took place in the summer of , and the migration took a little over a month to complete. the new website has a modern interface with the out-of-the-box template that allows for easy customization and interoperability on mobile devices. there are also new features like the “this day in history” feature, built by linda sato, programmer at uo libraries, a redesigned map, and a calendar feature for searching by date. because the software is developed and maintained by the community of open-oni partners, upkeep is easier and code can be integrated with the core repository. 
other systems-related improvements have been made to the in-house workflows and processes for handling the large volume of digitized and born-digital newspaper content. the newspaper curation application (nca) (https://tinyurl.com/y fc lcz) was developed in by jeremy echols, analyst programmer at uo libraries, to assist students with organizing, processing, applying metadata, and providing quality control to the newspaper pdfs. in the web-based application, student employees can check pdf upload submissions from publishers participating in the current newspapers program, apply metadata to the pdfs, and review other metadata entries. for odnp staff, nca has tools to track issues as they move through the workflow, add new titles and corresponding marc information, and control the access permissions of the student employees. the open source code of nca is available on github.
figure and . the old and new website homepage of the oregon digital newspaper program.
looking forward, there is another hurdle the program has encountered and will need to address: archiving of oregon-based news websites that do not have print or pdf editions. this year, the program had to reject one newspaper’s online issues that could not be transformed into legible pdfs with searchable ocr text for the website. in , web archiving solutions will be investigated with a focus on low barrier technologies, open source tools, and automated processes for collecting and preserving these websites. there is also the possibility that investigation of these technologies could assist with the overall pdf submission process from publishers and libraries, as well.
fig. . metadata entry workflow page for students to input newspaper metadata in nca. 
future partnerships and projects uo libraries primarily supports odnp by funding the staff time, equipment, and digital storage devoted to the program. the per-page fees for newspaper digitization support student positions that assist with digitization, metadata, and quality review. these costs are $ . per microfilm page and $ . per print page, if photography is needed for physical papers. the costs for newspaper digitization can be prohibitive for small institutions. to assist with fundraising efforts, the odnp created a fundraising how-to guide on the odnp blog (https://odnp.uoregon.edu/fundraising-and-grant-writing-how-to-guide/), which includes best practices for direct fundraising, grant writing, and a list of grants and fundraising resources in oregon that are supportive of newspaper digitization projects. this guide has been a successful and empowering resource for newspaper digitization advocates across oregon. at present, there are several ways to partner with the program for newspaper digitization. local public libraries, historical societies, museums, and newspaper enthusiasts have been vital to adding to the contemporary and historical coverage on the website by advocating for funds for newspaper digitization and reaching out to the publishers of their local newspapers to consider joining the current newspapers program. the recent digitization of the coquille city herald, coquille herald, and the coquille valley sentinel (in process) has been a multi-year digitization project led by bert dunn, a community member and author, who is leading outreach for donations to fund the incremental digitization of these reels. inquiries are often made about a minimum cost for starting a digitization project; there is none! reels can be added as funding allows, which hopefully provides more flexibility for local budgets and supporters of oregon cultural heritage. 
in , the program extended the pdf submission workflow and process, allowing public libraries and other organizations that cannot afford the cost of microfilm digitization and have digitized their local title(s) in-house to submit them to the odnp at a reduced rate of $ . per page. vernonia public library initiated this workflow by contributing scanned pdfs of newspapers from the area, the independent and the vernonia’s voice. these workflows are being refined in hopes that other libraries that have digitized their newspapers for local use would be interested in sharing with the program (with appropriate technical specifications and copyright permissions) for hosting and preservation on the website. there are still more refinements that can be made to the internal and external workflows, more newspapers to add to the website, and more partnerships to be initiated. open access standards for the program have extended and will continue to extend beyond access to the newspapers; they shape the day-to-day processes, supported systems and tools, and collaborations and partnerships. the recent changes in the past few years have not changed the core mission of the program: our commitment to providing free online access to historic oregon newspapers.
references
rabun, s. j. ( ). oregon digital newspaper program: preserving history while shaping the future. ola quarterly, ( ), – . http://dx.doi.org/ . / - .
dussault, j. et al. ( ). introducing the open online newspaper initiative, presented at dh , montreal, canada, . retrieved from https://tinyurl.com/y r jjb
preprint. accepted for publishing in college & research libraries april th to be published early . using personas to visualize the need for data stewardship. live kvale oslo metropolitan university, norway abstract there is a current discussion in universities regarding the need for dedicated research data stewards. this article presents a set of fictional personas for research data support based on experience and requests by experts in different areas of data management. using a modified delphi study, twenty-four participants from different stakeholder groups have contributed to the skills and backgrounds necessary to fulfill the needs for data stewardship. inspired by user experience (ux) methodology, different data personas are developed to illustrate the range of skills required to support data management within universities. further, the development of a research data support center is proposed as a competency hub for data stewards. introduction data are the entities researchers draw conclusions from, and essential for fellow researchers to examine and criticize results. 
transparency and access to data, the analysis applied, and the conclusions drawn are part of what defines research. data sharing and data archiving are expected to resolve the reproducibility crisis in research and provide new insight. consequently, academic journals and research funders are increasingly requiring research data to be made available. along with requirements for sharing data in academic research, there has been a growing need for new skills for data managers, data stewards, data librarians, and data scientists. these new roles are filled by professionals who assist researchers in managing research data, avoiding data loss during the research process, and preparing the data for archiving and public access. digital research data are easily lost, and steps to preserve data must be taken in all stages of the research process. consequently, skills to maintain and curate data are required, but which skills are needed? and where in the universities should curation services be offered? these questions are currently being explored and debated in libraries and among infrastructure providers. this paper draws on a study of stakeholders involved in research data management in norway involving policy makers [i], national infrastructure providers [ii], and researchers and research support staff [iii] from the four oldest universities in norway. by using persona templates adapted from user experience (ux) methodology, this paper explores how the data stewards are described by different stakeholders. the aim in making the personas has been to visualize how a data steward team could cover the various competencies and skills needed for data management support. internationally, “data steward” is one of several terms used in the literature and among practitioners to describe a person working with research data management (rdm). “data librarian”, “data manager”, and “data curator” are examples of other titles with somewhat overlapping responsibilities. 
the term data steward is used in this article, as it is less domain specific than “librarian,” “curator,” or “scientist.” the usage of “data steward” is intended to include all the different requirements for data management. the research question investigated is: who are the data stewards in the universities?
a. what roles should data stewards play?
b. what services should data stewards provide as part of these roles?
c. what skills do data stewards need to carry out these services?
notes: [i] representatives from the norwegian ministry of education and research, the research council of norway and the rectorate of one of the included universities. [ii] in europe and norway there is a strong tradition for national custodians of research data. [iii] university it, library and research office.
by developing a set of data personas, it becomes possible to illustrate and exemplify one possible response to each research question; it is not to be interpreted as a universal solution, but rather as an example of how roles, skills, communication, and services for data management may be organized. the findings also focus on potential obstacles and what to be aware of when developing data steward services. literature review a broad range of literature on rdm skills was identified through searches for “data steward”, “data librarian”, “data manager”, and “data curator” in web of science and scopus. these articles were supplemented by searching relevant journals that are not indexed in these databases, such as jeslib and the international journal of digital curation, and adding other relevant documents. the different articles highlight the skills required in data management and the different roles of data professionals. the articles were grouped into three categories according to how the data steward was described: (1) new responsibilities of the librarian, (2) the embedded data steward in the research environment, and (3) 
other approaches to data management services. in addition, the literature review contains a section on the usage of personas related to data management services. in the library and information science literature a majority of articles on data stewardship aim at clarifying which skills are needed for the data professional librarian offering data management support to researchers at the university. both brown and federer emphasize that “support for researchers’ data needs is a moving target” that needs to be supported by a skills development program in libraries. the most important skills identified by federer relate to communication, presentation, relationship with researchers, teamwork, and one-to-one training. this argument is supported by kennan, who finds that communication skills in many forms were the most in demand for rdm positions; she further emphasizes the need for “boundless curiosity”, including both the willingness and ability to learn new things. kennan identifies four different roles that ensure data management in the different stages of the data life cycle: “the data librarian/data manager”, “the data it and systems experts,” “the data scientist”, and “the data creator”. cox and corrall illustrate the role of the “research data manager” in the breach between the faculty and the academic library, connecting the institutional repository manager role in the library with the research produced by the faculty. the data librarians described can either be skilled generalists in data management or be specialized in a particular discipline. disciplinary specialization can be achieved through engagement with subject specialists and researchers. a data steward working in a research group or similar research environment with data management is here referred to as embedded data steward. 
these domain-specific data stewards are primarily used in data-intensive research within health sciences and natural sciences and specialize in data management in a single discipline. an editorial from nature genetics starts with a clear statement regarding data stewardship, asserting that, “professional data stewards be trained and employed in all data-rich research projects, [which] raises the exciting prospect they will conduct research on data-intensive research itself.” some articles describe solutions for data management within national research institutes or data centers. the articles on embedded data stewards are discipline specific and involve a high degree of specialization with a focus on the development of best practices and domain specific standards. this illustrates how embedded data stewards need to understand the methods and data they are working with in addition to preservation and metadata. none of the articles describing data stewards in research environments are from the humanities or the social sciences. these disciplines have traditionally been less data intensive and research is often conducted without data sharing among collaborating researchers. also, the humanities and social sciences cater to needs differently, such as through the trend of digital scholarship centers run by university libraries that explicitly serve the field of the digital humanities. these are some possible explanations for why experiences with embedded data stewards in the humanities and social sciences are fewer and newer, which in turn could explain why examples of embedded data stewards in the humanities and social sciences have not yet reached the literature. while embedded and library-centric were the two large categories to be found in the literature, there are other approaches to data management services. 
one example is the one-stop research support described by clements where someone can find answers to all questions regarding research data in one place, possibly a web portal. another approach by delft university in the netherlands places domain-specialized data stewards within the faculty departments. the service is coordinated by the library but aims to integrate the services of the data steward in each faculty. still, their goal is to provide “more granular disciplinary experts.” the report from research libraries uk and matt greenhall exploring digital scholarship in uk libraries argues for a “mixed economy of digital scholarship support” whereby the library partner supplies other research support facilities at the universities with complementary expertise in data management. in the literature on data scientists, the term “data unicorns” is used to mean an unrealistic skill set for one person. kennan transfers this to the idea of the data steward. what the literature on data stewards has in common is the exploration of professional domains and services new to librarianship. the primary challenges described include finding the right level of specialization versus general knowledge of data management, and communication and collaboration between the different levels within the organization.

in the context of rdm, there are three examples of the usage of personas within the literature. lage builds on the usage of personas to improve institutional repositories for publication. crowston includes “abby the science data librarian” in the group of users of a research data repository. both focus on the researcher, presenting five and eight researcher personas with different needs in regard to data management and different interests in using institutional archives for research data.
a recent report on education for data stewards from denmark presents the use of personas to illustrate the different needs and skill sets requested for data stewards in both the corporate and research sectors.

methodology

rdm is a rapidly developing domain. in order to grasp some of the changes and developments, a delphi study with an expert group and multiple rounds of data collection was found suitable. the expert group of participants in a delphi study provided the possibility of bringing preliminary findings back for discussion, which contributed to the understanding of the perception of roles through negotiation, testing, and learning. a group of twenty-four stakeholders participated in the study (table ). the group contained representatives from policymakers and national infrastructure providers in addition to researchers and research support staff from four universities in norway. participants were recruited from different stakeholder groups in order to include different aspects of the development of the sociotechnical infrastructure for research data and potentially uncover gaps or disagreements. the data steward was one element highlighted by multiple stakeholders as a gap or missing link.

the four universities are the oldest in norway, are all multidisciplinary and have well-established collaborations on administrative and technical infrastructure. from the policy makers, rectors of research at the four universities were invited to participate [iv] in addition to representatives from the norwegian ministry of knowledge and research and the research council of norway. the infrastructure providers represent different organizations that offer data archiving services to universities in norway [v]. the researchers were invited based on the receipt of european union (eu) funding. the eu requires data management plans from the projects they fund.
researchers were identified through the cordis web page [vi]; of the invited researchers, eight participated in the study. this way of identifying and recruiting researchers was chosen to avoid potential biases related to engagement with data management as a topic. it also gave a pool of researchers with different disciplinary backgrounds (biology, musicology, science studies, economics, neuroscience, psychology, philosophy, gender studies). the grouping of researchers as working either individually (ri) or collaboratively (rg) was done during the analysis of data from the first round of collection, as the needs described corresponded with how the researchers collaborated with other researchers on data, rather than with disciplinary backgrounds. the research support staff were recruited with the aim of including representatives from three types of research support services (library, it, and the research office). five of the research support participants also had previous experience as researchers, and two provided it services that were offered both locally and nationally. in some cases, the participants did not want their statements to be identified with them; these have been marked as “off-record”.

table . the participants organized according to role
role/stakeholder category (code): individual participant codes
researchers working individually (ri): riz, rij, ril, rib
researchers working in groups (rg): rgv, rgd, rga, rgw
policymakers (po): pou, pos, pok
infrastructure service providers (in): inh, ino, inr
research support, it (it): ite, ity, iti
research support, research office (ro): roc, rox, rot
research support, library (l): lm, lp, lg, ln

within ux methodology, personas are commonly used to describe users of computer systems. the development of personas builds on data collected through interviews or surveys, with the aim of creating fictional characters, either based on the participants or to fill the roles they describe.
by creating personas, system developers flip the focus from the system to the user, aiming to create a product that fits well for some users, rather than merely adequately for everyone. with the ongoing changes in the data management landscape and current infrastructure development, the data stewards will be central users. still, who will fill these roles is not clear. by using the expert group of the delphi study to develop data stewardship personas, this study is not claiming to offer a universal solution but provides an illustration of how the different roles could be distributed. data steward personas can be useful both to system developers and to the universities that employ data stewards and develop data management services.

[iv] unfortunately, only one of the four invited rectors agreed to participate.
[v] these three are all publicly funded and offer different archive services.
[vi] https://cordis.europa.eu/en

figure illustrates the different phases of the study. in the exploration phase (january ), open interviews approximately one hour long were conducted with the participants. data stewardship and skills for data management were but two of the several themes brought up in the interviews.

figure: delphi-inspired multiphase method

to further explore the expectations the different stakeholders interviewed held regarding data stewards, the second round of data collection (september ) had a section dedicated to the data steward (appendix ) inspired by ux-persona design. all answers were given in free text and were optional. in the concluding phase (march ), a first draft for three personas was developed based on the preliminary findings. this draft was presented and discussed with the participants in open interviews lasting about minutes. the persona drafts were shared with participants prior to interviews as part of the interview guide.
the findings presented in this article are from all three rounds of data collection and include an integrated analysis of the results. quotes presented in this paper are marked with a participant code and or , referring to first or second interview, e.g. “rga ”. data from the survey are not linked to participants. the collected data were, for the most part, qualitatively coded and analyzed thematically, initially using the software nvivo and later using xml for thematic coding and a python script for extraction. some of the results from the surveys, such as background, education, and skills, were counted and treated quantitatively. most participants granted permission to share the whole or parts of the data with directly identifiable information such as names removed. they all had the opportunity to review data they contributed ahead of publication and to indicate if there were parts they did not want published, or only to be identified “off the record”. the data, including the xml codebook, python script, interview guides, transcripts, survey and consent forms can be accessed through zenodo.

findings

the findings first present the need for data stewardship before exploring in greater detail the skills and background requested for data stewards, which are used in the development of the personas.

the need for data stewards

several participants pointed towards a need for data stewardship. the vocabulary used to describe this need varied along with expectations as to how this role should be filled. the practical challenges of data planning, data management, and data curation were explored, along with collaborative skills between existing research support services, data stewards, and researchers. data management does require knowledge of research.
respondent ite (research support, university it; cf. table ) emphasized that, “the researchers know their data so only they will be able to describe their data, but they need help from the data curators” (ite ). the data curator role described by ite is defined as working with the departments to create continuity and preserve valuable digital data. ite also believed in employing data curators to avoid data loss when temporary staff leave and further described the need for data stewardship and data management as a consequence of data-intensive research. among the researchers, riz feared that the general data steward would not be able to understand the context: “to do this type of job you must know the context, and to do this on an industrial scale might work in some cases but probably not in all” (riz ). as riz pointed out, some data types might be easy to structure and organize with a lower degree of specialized knowledge, whereas other types require a higher degree of specialization and domain-specific knowledge. understanding when different types of knowledge are required is also an issue, as well as understanding what one can expect researchers to do themselves versus what they need additional expertise to do, such as making data interoperable and creating a data management plan (dmp). participant roc addressed the long-term perspective: “i don’t believe any researcher can have the responsibility to follow the data from collection […] until they are ready to be stored for maybe years.” deciding what should be selected for long-term storage itself requires expertise, in addition to the expertise needed to perform the actual data preservation.

two of the researchers working in larger collaborations have hired, or are in the process of hiring, data stewards. rga works on a multidisciplinary project that generates large volumes of data from a variety of sources, while rgd collects social science data from previous projects for reuse in a new context.
both agreed on the need for data management: “the largest need is for human resources to manage data” (rga ) and, “for us, research data means how to integrate data from all these sites, how to harmonize, standardize, and integrate them, and then how to analyze them in a way that something new comes out of that” (rgd ). rgd also described how several people are working with different aspects of data management, from data cleaning and access control to the re-collection of consent from participants. collaboration is also suggested as a challenge: “i believe it is important with such a holy trinity that it, library, and administration could become if they would work together” (rga ). she pointed to the need to combine different people with different skills and different backgrounds to solve complex issues and to create robust data management services. also, one of the research office staff noted that collaboration is key: “there is not one such person, one that knows everything, it is more than a kinder egg, more than three things, at least four or five things you need to have thorough knowledge of” (rot ). metaphors such as “kinder egg” or “holy trinity” have similarities with the “data unicorn,” indicating that research support services for research data need to be collaborative to deliver the complexity of skills required. in the survey, one participant from the library explained that, rather than a person, she saw as the best solution a team of people with different competencies complementing each other.

researcher rgw explained that data management was the responsibility of the professor in charge of the lab: “in our group we don’t actually have a data manager, but it is mostly the job of the professor. the data type has been fixed a couple of years ago that the data should be analyzed in such and such a way, so it has the same data structure.
the professor acts like a data manager also. but because we are temporary researchers, and we have our own style, [the] professor should decide the data structure” (rgw ). as the majority of the researchers in the lab are there temporarily, it is the responsibility of the lab, and the professor who decides the structure and formats of the data, to “act like a data manager.” still, since rgw’s description pointed to a high level of awareness, there is likely a formal or informal protocol for data management in the lab, and the responsibility belongs to the principal investigator. also, among research support staff, it is agreed that the researchers themselves should be responsible for knowing basic data management: “i think that as a researcher, i would not say you are obliged to, but you should know basic data management” (lm ). this does not, however, exclude the need for dedicated data managers and further highlights the need for available training.

the participants described the following needs for data stewardship:
• as research is becoming increasingly data intensive, larger research groups may need to hire data managers.
• data loss when phds, postdocs, and other temporary staff leave the university is a challenge.
• finding the right balance between the generalist and the specialist is important in terms of playing the right data management support role.
• a closer collaboration between it, library, and research offices is needed.
• not all researchers can be expected to do data management on their own, yet it is the responsibility of the researcher to ensure good data management in his or her research.

collaboration and communication between the different support levels

in the survey, the responses to the question regarding the workplace of the data steward showed a general agreement that closeness to the research environment is essential. in particular, the researchers emphasized that the employee should work in the research groups.
the research group and/or departments ( ) were mentioned most frequently, but the research administration ( ) and the library ( ) were also suggested as appropriate work environments for the data steward. one suggested the national research data infrastructures as the appropriate place to employ the data steward. in the interviews, several participants elaborated on this by emphasizing how collaboration and communication between different levels of support within the universities are crucial:

you need to create a system where these people actually work together and are able to interact in a good way. […] there is a pulverization of responsibilities absolutely everywhere, and with such a research data center it might be possible to avoid this, given the entry points and the information flow and such. (rga )

both responses show how the workplace of the data steward is one issue to consider, but many challenges are related to organizational culture, organization, and information flow. one of the researchers suggested coordination between the universities to ensure standardized and high-quality services: “you need a way to assure that you even out the pressure from place to place, so you don’t end up in one bubble, each with the development of strange subcultures; this is important to avoid, difficult to avoid but very important” (rij ). the workplaces of the data stewards need to be interconnected in networks of information and skills exchange locally and, perhaps, nationally and internationally. as rij notes, hiring data stewards without facilitating knowledge exchange can easily create dysfunctional subcultures rather than interoperable data.

speaking the same language

the respondents ( ) mentioned several educational backgrounds in different combinations; the results have been split and grouped in figure .
further, four mentioned the master’s level, and five mentioned the phd level as the appropriate educational level. others suggested higher education without specifying the degree level. the respondents all suggested that the ideal candidate would be a highly educated person preferably with research experience, often in combination with a background in data stewardship, it, or library and information science (lis).

figure: preferred background for data stewards

as one of the staff members at the university explained, a phd degree can be a gateway to communication with the researchers to “speak their language” and create trust:

it helps if they all speak the same language. that is part of the success in my department, where half of the staff have a phd, so we can communicate with the researchers. […] you first need some positive and some negative experiences in order to make the transition. someone has done this internally […] others who have not committed huge mistakes yet, they just continue to build, data on data on data and more data, without any control. (iti )

the notion that experience of research can be one way of creating trust and knowledge of the data types and methods used in the field is another point of entry. however, iti believed that the need for data management must often be experienced by the researchers before it can be taken seriously.

similar backgrounds help in creating relational bonds and trust between researchers and data stewards:

i think this is also a kind of confidentiality. there is something like a role you trust, like that person is to be really trusted, so i think it would be, i don’t know, but if i was a researcher i would be a bit, i don’t know maybe awkward to contact somebody who is just a data manager and is not related closely to my field.
(rgd )

one of the researchers described a fear that the data stewards operate using their own agendas:

if you enter on the side in that mean of adding an additional agenda beyond solidity and such, that sometimes that might be an advantage for some types of projects, for others it might be alarming, both economic and in terms of work environment. (rij )

research experience among data stewards or similar disciplinary backgrounds are possible strategies to create common ground between data stewards and researchers. these strategies might also help to avoid additional agendas on the part of data stewards. interest in research or the research topic might help to assure the researchers that solidity and reproducibility are the stewards’ primary motivations for data management. there are, however, already several agendas present in the field of data management, such as economic interests and an interest in exploring existing data in new ways through data science. for the researchers, on the other hand, the purpose of data management is primarily to document and archive research data for their own reuse and for proof of reproducibility. one of the policymakers pointed to this conflict of interests and motivations: “a risk in this area, and what we have seen until now that the area suffers from, is that library and archive people, bureaucrats, and non-researchers have taken a strong role of leadership” (off record). when policymakers, archivists, it developers, data scientists, and librarians all see different potential in research data, these interests might come to overshadow the core: the quality of research and the challenge in overcoming the reproducibility crisis.

the question of what motivates the data steward in doing their job becomes important for building relations with the researchers. twelve participants answered this question.
ethical motivations and genuine engagement in research were seen as the most important motivations: “engagement both with good research and ethical data management,” “the enjoyment of assisting researchers in taking care for their data and sharing data in a safe way,” and “[contributing] to making research transparent and verifiable.” other responses described a methodical person with a genuine interest in research who can provide a valuable contribution by organizing, providing services, building something together as a team, and contributing to science.

dividing tasks but maintaining responsibility

when asked to write a short biography, nine participants responded. one of the descriptions given was that of a “technical and tidy person,” and other characteristics included a good overall understanding of research and of the research data life cycle: “the person must be mature or experienced enough to understand the range of the field of data management and curation, and the limitations for what should be shared and [to] understand the whole lifecycle of research data in projects”. another participant described “a service-minded person able to work closely with several research teams.” thus, both emphasize that technical and social skills are necessary, along with experience, knowledge, and the ability to provide professional guidance. one participant gave a longer description of a researcher who wanted to work in-depth with data and who enjoys both the service and the problem-solving aspects of data stewardship. it is important to balance the interest in the research with the motivation to keep the data structured and documented, so as to enhance the quality of the research results without adding additional agendas.
still, the involvement of the data stewards must be balanced in such a way that the responsibility of the research data is not completely transferred away from the researchers:

you hope that data management should become embedded in normal research practice, for much of this can, with fairly simple means, become part of existing routines. […] because the problem, if you get a data manager in the group, is that the others might not take as much responsibility for the data management. (lm )

the interviewed researchers shared the concern of lm. a data steward must provide support without creating an excuse to transfer the responsibility from the researchers; when data are deposited in an archive, a transfer of responsibility can take place:

the researchers themselves must be responsible […] i realize this myself in part of these discussions, that one thinks ‘yes, we create this role, and then everything is solved,’ but it is not at all in that way. because the researcher sits with the data set and needs to make sure this is in order, and then you need a curation function, but that again depends on the data set and where you are in your research process. […] but from the moment we have a publication, with a corresponding data set, made available, then the data set will still need curation, but then you are more on the library side. first the researcher needs to sign off the responsibility, and then others take it on. (rga )

another option proposed by riz is not to create data stewardship positions, but to distribute responsibility among existing researchers in a group:

i would say that the competency should be in the group, and not in an extra position; i believe there are other positions more important to prioritize, so i guess i am against all these, but the nearer the better.
(riz )

rga and riz work in extremely different research environments: while rga works in a collaborative and data-intensive environment, which employs its own data managers, riz is a theorist and collaborates with other researchers on publications. riz’s point of assigning responsibility for ensuring data quality to the researcher is representative of the view of many researchers, in particular those working independently. she argued that the quality of your data is the quality of your research, and your responsibility as a researcher.

fifteen participants listed different skills as being necessary for data stewardship; some skills were mentioned several times. the skills mentioned are analyzed and grouped in table . different labels, such as personal skills, general skills, research skills, knowledge of law and policy, technical skills, and archiving skills, differentiate the variety of skills listed. the label “general skills” is used for skills that are found to apply to more than one of the other categories. knowledge of metadata is most commonly mentioned. however, none of the researchers mentioned metadata explicitly. there is one mention of “data management and storage for further use,” while another writes about “coding, systematization and law”; this is the response that best reflects the feedback from the researchers, along with responses that emphasize personal skills, such as creativity, punctuality, and good communication skills.

table . data stewardship skills (times mentioned in parentheses)
columns: personal skills; general skills; research skills; law and policy; technical skills; archiving skills
rows (in the order given; the later rows do not fill every column):
structured and organized ( ); knowledge of research ( ); knowledge of discipline specific terminology ( ); understanding and interpretation of policies ( ); programming, coding, scripting ( ); metadata related ( ) (here under: metadata demands, standards, documentation, descriptive metadata)
accurate ( ); research ethics ( ); ability to understand discipline specific needs ( ); knowledge of law and juridical aspects ( ); technical aspect of data management ( ); familiarity with organizing and planning for different types of research data ( )
dialog with end user/communication ( ); knowledge of the fair [vii] principles ( ); statistics and methodology ( ); define policies ( ); ability to work with large databases and lims ( ); systematization ( )
creative ( ); data management and storage for further use ( ); personal privacy ( ); digitization ( )
flexible ( ); ability to work with guidelines and documentation ( ); ip-law ( ); user interface ( ); search ( )
a problem solver able to think outside of the box ( ); familiar with dmp procedures ( ); data transformation ( ); data archives ( )
good listener ( ); archival standard for curation and secure long-term archival storage ( )

the personas

based on the analyses, the placement of a support service in the right context, and with appropriate channels of communication and collaboration, appears to be one of the major challenges of delivering appropriate services. as a workplace for two of the data steward personas, the research data service center (rdsc) has been developed. the rdsc draws on inspiration from the development of digital scholarship centers, but with a multidisciplinary approach and with the emphasis on strengthening collaboration between the different research support services within a university.
several participants requested better collaboration in order to provide better data management support; the suggested rdsc is one response to this. in the rdsc, the library, it, and the research administration are aligned in a partnership for coordinated research data support. further, three different data steward personas filling different roles and levels of support are presented: the rdm service coordinator, the data curator, and the data manager. again, it is necessary to emphasize that personas are fictive entities, and real people could be filling these roles. the number of data stewards will vary depending on institution size. the survey responses gave a mix of male, female and gender-neutral names, and the personas have been carefully constructed to reflect this. the author selected illustration photos to give the personas more of an identity by providing them with a face; care has been taken to avoid stereotyping. the names and photos were presented to the participants in the final interview. none of the participants presented any opinions on either; they focused on the roles and skills embedded in each persona while referring to each by name.

[vii] findable, accessible, interoperable and reusable: fair guiding principles for scientific data management and stewardship

the research data service center

the rdsc is run collaboratively by it, the library, and the research office at the university. the rdsc has been established to solve issues of rdm support and training but also supports other related research skills, such as data visualization, data analysis software, support on statistics, etc. the services they offer are divided into core services provided by rdsc staff and coordinated services where the rdsc is the host for related networks and courses.
rdsc is designed to be user-centered and responsive to current needs among researchers who are testing and offering the latest in technologies for research data. by having an approval function for data management plans and by coordinating network meetings of data managers, they map and respond to the knowledge level and needs of their local environment. the rdsc is up to date on challenges and needs in their community. further, they collaborate closely with different departments at the university to ensure that data management training is offered to researchers and graduate students.

core services
• dmp review and consultancy
• one-to-one data management support for phds and researchers
• courses in data management
• coordination of the “peer-support network” of data managers

coordinated services
• hosting courses focusing on skills for research (python, poster design, r and other courses provided by the carpentry community)
• hosting other peer support networks (carpentry study group, r-ladies etc.)
• fair training courses

there are three groups of staff at the center: permanent staff, student staff and associated staff. in addition, they collaborate closely with the data protection officer and with a network of data managers hired by research groups. the permanent staff includes one rdm service coordinator, kim, and data curators, of whom david is one. based on requests, student staff are hired from a pool of data science students and phd candidates. this offers students interested in data management an opportunity to practice and brings new expertise into the center. some of these students end up being hired as data managers in data-intensive research groups upon graduation. associated staff work at the research office, library, and in it but have some tasks at the rdsc. typically, expertise on data analysis software and statistics is offered by it staff along with support on writing.
dmps and grant fulfillment are offered by the research office, and metadata and data archiving are offered by library staff. in addition, each individual brings their own skills: some with graphic design, others with ontology building, artificial intelligence, interaction design, or semantic web technologies. this renders the center an interdisciplinary environment that focuses on collaboration and rdm, as well as the proliferation of skills for data-centered research.

the rdm service coordinator – kim smith

kim smith is the coordinator and communicator with the rdm service. she has a master’s degree in lis and several years of experience at the university library. kim works as rdm service coordinator at the rdsc and is responsible for the data management services at the university. she has the overview and coordinates everyone involved at the rdsc. kim enjoys teaching and presides over several of the rdm training courses offered at the university. through a series of workshops held at the center, she has given several researchers and master’s students their first rdm course. she also advises on privacy and copyright issues, and while she does not have a background in law, experience has made her able to advise on many of the issues that occur. when in doubt, she consults the data protection officer. kim is also responsible for the review and approval of dmps. the workload is, however, shared, and the plans are reviewed collaboratively at the center. through dmp reviews, kim, david and other staff at the rdsc are able to identify potential challenges at an early stage and offer support.
in addition, kim is active in the international coordination work done with the research data alliance:
• communication and interpretation
• policy expertise
• research ethics and personal privacy
• intellectual property law
• data management plans
• metadata

motivation: contribute to making research transparent and verifiable and build new knowledge in the organization

kim believes that proper data management can solve the reproducibility crisis and help rebuild society's trust in research. with a background as a librarian, she is focused on data quality and long-term curation. kim is also concerned about maintaining the legacy of prominent researchers at her university. her colleagues describe her as structured and strategic.

photo: kim smith, ill. from colourbox

the data curator – david carpenter

david holds a phd in computational linguistics and has many years of experience with data-intensive research. recently, he has taken a course in data stewardship. david has a scientifically oriented, analytical mindset. he was engaged for several years in data-driven research, but over time he became more interested in the challenges related to ontologies and metadata definitions, and less interested in scientific topics and final publications. david is good at convincing researchers that a by-product of proper data management is an increased number of citations, leading to more accreditations.
• systematization
• making data fair
• metadata, documentation, and provenance
• data archives and archiving
• coding
• data mining
• formatting and data transformation

motivation: david enjoys translating between disciplines, understanding researchers’ needs, and solving problems. david loves research and the university as a work environment, but he prefers working with the data rather than publishing. he is described as accurate and systematic.
the data manager – kari anderson

data manager kari anderson is the disciplinary specialist, while the staff at the research support center are the generalists. she is one of the data managers working in the data-intensive research groups at the university. the data managers meet monthly at the peer support network at the rdsc to exchange experiences and solve concrete problems. kari makes sure there is an agreement on standards and protocol for data management within the research group. when new staff are hired or students participate, she makes sure they are briefed in data management before touching anything. kari identifies with the other researchers in the group. she is good at picking up on potential issues at an early stage, and if someone has problems with conversions, transfer, or the merging of data, she loves the challenge. she also focuses on deleting what is obsolete, rather than keeping every version of everything.

kari has a phd in neuroscience and is fascinated by classification. through statistical classification, she has developed an interest in ai. she worked closely with a research group during her master’s and was later hired as a phd candidate. during her phd period, her role gradually became more that of a data manager, and when a new center for brain research was established, she was hired as a data steward. she is also taking some extra courses within data science to work with still more methods and disciplines as a data manager/data scientist. through the rdm network at the university, she learned of the research data alliance and is now engaged in the health data interest group, where she keeps up to date. still, her heart is most at home in the r-ladies network.

photo: david carpenter, ill. from colourbox
• documentation
• working with large databases
• coding
• systematization
• data transformation
• metadata standards
• interoperability

motivation: she loves working in the creative environment of research while still clocking office hours. at the lab, she is described as the right hand of the professor, the go-to person for the people working there, and a creative and hard-working part of the team.

persona summary

the aim of creating the personas kim smith, david carpenter, and kari anderson has been to visualize and concretize one example of how both a team providing general support and a data steward working within a research group can function. what is crucial is that the data stewards have a genuine interest in contributing to research and combine the right soft skills and knowledge of research with technical, law and policy, or archival skills. the personas can be applied both in the development of software solutions and as inspiration when creating better research data support at the institutions.

conclusion

the findings from this study show that outreach, education, and problem-solving are only some of the keys to the creation of a functional service for data management. there are several concerns that must be taken into account as a service is developed. four primary challenges for providing data stewardship at universities are identified:

1. placement of responsibility: researchers must retain their responsibility for data throughout the research cycle. when depositing to a data archive, responsibility can be transferred if the selected archive offers curation services.
2. communication: lines of communication between support levels must be established to avoid closed subcultures and to exchange best practices between domains.
3. knowledge of data and methods: there is a need for local and specialized expertise within an increasing number of domains. it is necessary to find the appropriate degree of disciplinary knowledge to provide support.
knowledge of research is essential; however, the researchers are responsible for data management in their projects.
4. joint research support effort: research data management requires several different types of expertise that traditionally are spread among different research support departments at universities. the creation of a general research data support team or center with connection to the research office, it, and the library is crucial to cover all aspects of data management.

one solution can never fit all and, while a general team will be able to solve and support a wide range of issues, many larger research communities need dedicated staff with specific knowledge of the issues and concerns that are relevant for their research data. while data management is gradually becoming current practice within several data-intensive communities, it is also needed among researchers producing and collecting small heterogeneous datasets, referred to as the long tail of research data; a research data support center is an attempt to resolve this. a general team will function as a professional network for discipline-specific research data staff and could potentially assist research groups in recruitment and transfer of skills and knowledge across disciplinary boundaries. motivated by contributing to research, data stewards can be recruited among both graduate students and researchers; however, understanding of research and research methods is important.

photo: kari anderson, ill. from colourbox

references

robert king merton, “the sociology of science: theoretical and empirical investigations” (chicago: university of chicago press, ). christine l. borgman, big data, little data, no data : scholarship in the networked world (cambridge, ma: mit press, ); peter t. darch, “limits to the pursuit of reproducibility: emergent data-scarce domains of science,” in transforming digital worlds, ed. gobinda chowdhury et al., vol.
(cham: springer international publishing, ), – , doi: . / - - - - _ . rob kitchin, the data revolution: big data, open data, data infrastructures & their consequences (los angeles, calif: sage publishing, ). alma swan and sheridan brown, “the skills, role and career structure of data scientists and curators: an assessment of current practice and future needs,” report to the jisc (key perspectives ltd, ); robin rice, the data librarian’s handbook (london: facet, ). kitchin, the data revolution. michael j scroggins et al., “thorny problems in data (-intensive) science,” publications (los angeles, california: ucla: center for knowledge infrastructures, ), https://escholarship.org/uc/item/ b z c.; marta teperek et al., “data stewardship addressing disciplinary data management needs,” international journal of digital curation , no. (december , ): – , doi: . /ijdc.v i . ; german council for scientific information infrastructures (rfii), “digital competencies – urgently needed! recommendations on career and training prospects for the scientific labour market” (göttingen: german council for scientific information infrastructures (rfii), ), http://www.rfii.de/?p= . german council for scientific information infrastructures (rfii), “digital competencies – urgently needed!”; philipp conzett and lene Østvand, “støttetenester for forskingsdatahandtering på uit noregs arktiske universitet – erfaringar og forslag til beste praksis,” nordic journal of information literacy in higher education , no. (may , ): – , doi: . /noril.v i . . kristin r. eschenfelder and kalpana shankar, “of seamlessness and frictions: transborder data flows of european and us social science data,” in sustainable digital communities: th international conference, iconference , boras, sweden, march – , , proceedings, ed. anneli sundqvist et al., vol. , lecture notes in computer science (cham: springer international publishing, ), – , doi: . / - - - - .
c. lewis and j. contrino, “making the invisible visible: personas and mental models of distance education library users,” journal of library and information services in distance learning , no. – ( ): – , doi: . / x. . . mark d. wilkinson et al., “the fair guiding principles for scientific data management and stewardship,” scientific data (march , ): , doi: . /sdata. . ; sara rosenbaum, “data governance and stewardship: designing data stewardship entities and advancing data access: data governance and stewardship,” health services research , no. p ( ): – , doi: . /j. - . . .x; swan and brown, “the skills, role and career structure of data scientists and curators”; jingfeng xia and minglu wang, “competencies and responsibilities of social science data librarians: an analysis of job descriptions,” college & research libraries , no. (may , ): – , doi: . /crl - ; anna clements, “research information meets research data management … in the library?,” insights: the uksg journal , no. (november , ): – , doi: . / - . . xia and wang, “competencies and responsibilities of social science data librarians”; rebecca a. brown, malcolm wolski, and joanna richardson, “developing new skills for research support librarians,” australian library journal , no. (july , ): – , doi: . / . . ; lisa federer, “defining data librarianship: a survey of competencies, skills, and training,” journal of the medical library association , no. (july ): – , doi: . /jmla. . ; lyn robinson and david bawden, “‘the story of data’: a socio-technical approach to education for the data librarian role in the citylis library school at city, university of london,” library management , no. / (august , ): – , doi: . /lm- - - ; mary anne kennan, “‘in the eye of the beholder’: knowledge and skills requirements for data professionals,” information research-an international electronic journal , no. 
(december ): ; fabian cremer, claudia engelhardt, and heike neuroth, “embedded data manager - embedded research data management: experiences, perspectives and potentials,” bibliothek forschung und praxis , no. (april ): – , doi: . /bfp- - ; andrew m. cox and sheila corrall, “evolving academic library specialties,” journal of the american society for information science and technology , no. (august ): – , doi: . /asi. . federer, “defining data librarianship.” brown, wolski, and richardson, “developing new skills for research support librarians.” federer, “defining data librarianship.” kennan, “‘in the eye of the beholder.’” ibid. cox and corrall, “evolving academic library specialties.” minglu wang, “supporting the research process through expanded library data services,” program-electronic library and information systems , no. ( ): – , doi: . /prog- - - ; ricardo l. punzalan and adam kriesberg, “library-mediated collaborations: data curation at the national agricultural library,” library trends , no. (win ): – , doi: . /lib. . ; t.p. bardyn, t. resnick, and s.k. camina, “translational researchers’ perceptions of data management practices and data curation needs: findings from a focus group in an academic health sciences library,” journal of web librarianship , no. ( ): – , doi: . / . . . meredith n. zozus et al., “analysis of professional competencies for the clinical research data management profession: implications for training and professional certification,” journal of the american medical informatics association , no. (july ): – , doi: . /jamia/ocw ; gabriele schnapper et al., “data managers: a survey of the european society of breast cancer specialists in certified multi-disciplinary breast centers,” breast journal , no. (october ): – , doi: . /tbj.
; martin dugas and susanne dugas-breit, “integrated data management for clinical studies: automatic transformation of data models with semantic annotations for principal investigators, data managers and statisticians,” plos one , no. (february , ): e , doi: . /journal.pone. ; r. esser, “biostatistics and data management in global drug development,” drug information journal , no. (september ): – , doi: . / ; george hripcsak et al., “health data use, stewardship, and governance: ongoing gaps and challenges: a report from amia’s health policy meeting,” journal of the american medical informatics association , no. (march ): – , doi: . /amiajnl- - ; hong huang et al., “prioritization of data quality dimensions and skills requirements in genome annotation work,” journal of the american society for information science and technology , no. (january ): – , doi: . /asi. ; crystal kallem, “data stewardship,” journal of the american health information management association , no. ( ): - ; . john cartwright, jesse varner, and susan mclean, “data stewardship: how noaa delivers environmental information for today and tomorrow,” marine technology society journal , no. (april ): – , doi: . /mtsj. . . ; t. a. boden, m. krassovski, and b. yang, “the ameriflux data activity and data system: an evolving collection of data management techniques, tools, products and services,” geoscientific instrumentation methods and data systems , no. ( ): – , doi: . /gi- - - ; aj barrett, “socioeconomic aspects of materials data - serving the user,” journal of chemical information and computer sciences , no. (february ): – , doi: . /ci a ; kenneth r. knapp, “scientific data stewardship of international satellite cloud climatology project b global geostationary observations,” journal of applied remote sensing ( ): , doi: . / . ; kenneth r. knapp, john j. 
bates, and bruce barkstrom, “scientific data stewardship - lessons learned from a satellite-data rescue effort,” bulletin of the american meteorological society , no. (september ): – , doi: . /bams- - - ; xin li et al., “toward an improved data stewardship and service for environmental and ecological science data in west china,” international journal of digital earth , no. ( ): – , doi: . / . . ; r.r. downs and r.s. chen, “designing submission and workflow services for preserving interdisciplinary scientific data,” earth science informatics , no. ( ): – , doi: . /s - - - ; r.r. downs et al., “data stewardship in the earth sciences,” d-lib magazine , no. – ( ), doi: . /july -downs; helena karasti et al., “knowledge infrastructures: part i (guest editorial),” science & technology studies , no. ( ), http://ojs.tsv.fi/index.php/sts/article/download/ /pdf_ ; e.a. kihn and c.g. fox, “geophysical data stewardship in the st century at the national geophysical data center (ngdc),” data science journal ( ): wds – , doi: . /dsj.wds- ; t.p. lauriault, p.l. pulsifer, and d.r.f. taylor, “the preservation and archiving of geospatial digital data: challenges and opportunities for cartographers,” lecture notes in geoinformation and cartography, no. ( ): – , doi: . / - - - - _ ; d.j. lowe, “the geological data manager: an expanding role to fill a rapidly growing need,” geological society special publication ( ): – , doi: . /gsl.sp. . . . ; r.d. mcdowall, “understanding data governance, part i,” spectroscopy (santa monica) , no. ( ): – ; c.j. moore and r.e. habermann, “core data stewardship: a long-term perspective,” geological society special publication ( ): – , doi: . /gsl.sp. . . . ; t. nadim, “data labours: how the sequence databases genbank and embl-bank make data,” science as culture , no. ( ): – , doi: . / . . . “european open science cloud,” nature genetics , no. ( ): – , doi: . /ng. . ibid.
cartwright, varner, and mclean, “data stewardship.” boden, krassovski, and yang, “the ameriflux data activity and data system”; li et al., “toward an improved data stewardship and service for environmental and ecological science data in west china”; kihn and fox, “geophysical data stewardship in the st century at the national geophysical data center (ngdc).” cartwright, varner, and mclean, “data stewardship”; knapp, bates, and barkstrom, “scientific data stewardship - lessons learned from a satellite-data rescue effort”; huang et al., “prioritization of data quality dimensions and skills requirements in genome annotation work.” rikk mulligan, spec kit : supporting digital scholarship (may ), spec kit (association of research libraries, ), doi: . /spec. . clements, “research information meets research data management … in the library?” teperek et al., “data stewardship addressing disciplinary data management needs.” ibid. matt greenhall, “digital scholarship and the role of the research library,” the result of the rluk digital scholarhip survey (london: rluk, ), https://www.rluk.ac.uk/wp- content/uploads/ / /rluk-digital-scholarship-report-july- .pdf. saša baškarada and andy koronios, “unicorn data scientist: the rarest of breeds,” program , no. (january , ): – , doi: . /prog- - - ; kennan, “‘in the eye of the beholder.’” kennan, “‘in the eye of the beholder.’” teperek et al., “data stewardship addressing disciplinary data management needs.” kathryn lage, barbara losoff, and jack maness, “receptivity to library involvement in scientific data curation: a case study at the university of colorado boulder,” portal: libraries and the academy , no. 
( ): – ; kevin crowston, “user personas” (dataone - data observation network for earth, ), https://www.dataone.org/personas/abby-science-data-librarian; lorna wildgaard et al., “national coordination of data steward education in denmark: final report to the national forum for research data management (dm forum)” (national forum for research data management (dm forum), ), https://doi.org/ . /zenodo. . jack m. maness, tomasz miaskiewicz, and tamara sumner, “using personas to understand the needs and goals of institutional repository users,” d-lib magazine , no. / ( ): . crowston, “user personas.” ibid. lage, losoff, and maness, “receptivity to library involvement in scientific data curation.” wildgaard et al., “national coordination of data steward education in denmark.” erio ziglio, “the delphi method and its contribution to decision-making,” in gazing into the oracle - the delphi method and its application to social policy and public health, ed. michael adler and erio ziglio (london: jessica kingsley, ), – . h. rex hartson and pardha s. pyla, the ux book: process and guidelines for ensuring a quality user experience (amsterdam ; boston: elsevier, ). ibid., . live kvale and nils pharo, “understanding the data management plan as a boundary object through a multi-stakeholder perspective,” submitted for publication https://doi.org/ . /ijdc.v i . . john w. creswell and vicki l. plano clark, designing and conducting mixed methods research, rd ed. (sage publishing, ), . johnny saldaña, the coding manual for qualitative researchers, rd ed. (london: sage publishing, ). data from a three-phase delphi study used to investigate knowledge infrastructure for research data in norway, kirdn_data, , http://doi.org/ . /zenodo. .
kennan, “‘in the eye of the beholder’”; baškarada and koronios, “unicorn data scientist.” oecd, ed., data-driven innovation: big data for growth and well-being (paris: oecd publishing, ), http://dx.doi.org/ . / -en. chris anderson, “the end of theory: the data deluge makes the scientific method obsolete,” wired, june , https://www.wired.com/ / /pb-theory/. bryan p. heidorn, “shedding light on the dark data in the long tail of science,” library trends , no. ( ): – , doi: . /lib. . .

appendix . questions describing the data steward in the survey.

your ideal data person

in several interviews the need for a data person of some kind (data steward, data curator, data scientist, data librarian, “datarøkter”) was mentioned. in order to get a better understanding of who this is or could be, i would like you to spend some minutes creating an image of an ideal person. i am here asking you to create an imaginary character so please use your imagination.

a. position/job title
if you do not see the need for such a position, please give a short explanation on why there is no need for this.
b. name
c. workplace - where does this person work and who are they employed by?
d. background – brief description of work experience and educational background
e. bio - please provide a short description of who this person is.
f. skills - please add minimum three words that describes what this person is particularly good at.
g. motivations – please describe what makes this person enjoy their work
h. other things - feel free to add additional information about this person

“the structure of scholarly communications within academic libraries”
wm. joseph thomas, thomasw@ecu.edu
head of collection development, joyner library, east carolina university

abstract: academic libraries often define their administrative structure according to services they offer, including research services, acquisitions, cataloging and metadata, and so on.
scholarly communications is something of a moving target, though. how are scholarly communications positions defined, what duties do they often include, and how do they fit within the library’s administrative structure? some of the first positions devoted to scholarly communications required jd’s and focused on author’s rights, copyright and fair use. yet other positions recently advertised group scholarly communications librarians within digital scholarship units, which not only create and maintain institutional repositories, they may also publish electronic journals and/or offer services related to data curation. a brief review of the findings recently published in a spec kit, which focuses on arl libraries, begins this article. the main intention, though, is to provide a wider context of scholarly communication activities across a variety of academic libraries. to do that, a survey of non-arl libraries was administered, reviewing their relevant positions and library organization, and the variety of scholarly communication services they offer. lastly, a set of scholarly communication core services is proposed. keywords: scholarly communications, institutional repository, data management, open access, authors rights, librarian competencies introduction: in november , the association of research libraries (arl) published spec kit , the organization of scholarly communication services. this spec kit reported the results of a survey of arl members and gathered together a variety of sample documents, including position descriptions, committee charges, organization charts, web pages and brochures designed to market scholarly communications services, assessment tools, and texts of open access policies and resolutions. the survey was designed to determine “how research institutions are currently organizing staff to support scholarly communication services, and whether their organizational structures have changed since ” (p. ). 
what do we mean by scholarly communications and who responded? radom, feltner-reichert, and stringer-stanback used this definition provided by the scholarly communications group from washington university in st. louis: “the creation, transformation, dissemination, and preservation of knowledge related to teaching, research, and scholarly endeavors.” there were responses to the survey (for a return rate of %). of these were from institutions categorized by the carnegie classification as ru/vh (research university, very high research activity). there were institutions with carnegie class ruh (research universities, high research activity), canadian arl members, and the library of congress. two of the institutions were considered medium sized; all others were large. three quarters of the respondents were public. the topic is important across all academic libraries, though, so a similar survey was designed, focusing on the other members of the unc system and libraries of various sizes across the country. librarians from schools were invited to take the survey, including schools from the following basic carnegie classifications: ru/vh, ruh, dru, master’s, and baccalaureate. representatives from schools started the survey, but three did not complete it, for a return rate of %. there are only ru/vh schools not members of arl; all were invited and responded. there are ruh schools not members of arl. of those, were invited and did answer the survey. there are dru (doctoral/research university) schools; were invited but only six responded to the survey. the relatively low number of responses from ruh and dru schools means that this is still an important pool of libraries to study. the master’s schools responding to the survey were all from north carolina; seven are public and seven are private. all eight baccalaureate schools are from nc, two public and six private.
the author’s institution is east carolina university, a member of the university of north carolina system with a basic carnegie classification of dru. the survey focused on the following characteristics: leadership of scholarly communications, administrative structure and date of most recent change, outreach and educational activities, hosting and managing digital content, digital scholarship and other services. in addition, this presentation for the north carolina serials conference communicated potential for growth in scholarly communications programs in the state through shared support in expertise and shared support for technical infrastructure. finally, the concept of scholarly communications core services was introduced. leadership of scholarly communication: within arl libraries, the spec kit reports, a single librarian often leads scholarly communication efforts ( responses). most of these librarians are department heads or assistant directors, and many have the term “scholarly communications” in their titles. eight of the single librarian leaders have special training, generally either law degrees or other specific training for copyright. nine of these devote half of their time or less to scholarly communications (sc) duties. nine of them have direct reports ranging from . fte to fte. other support for sc activities comes from committee members and other librarians. the next most likely leader of scholarly communications efforts is a library unit ( responses). many have “scholarly communication” in the title; other terms include “digital initiatives/services/curation” and publishing. half of these groups have had special training (law degrees and copyright courses). there were responses that “two or more librarians” lead sc efforts. position titles included the terms scholarly communications, copyright, and digital initiatives. a majority of sc leader-librarians report to directors and associate directors. 
eight of the had received special training (mostly jd or copyright courses), and of them have direct support. leadership by a library committee garnered nine responses. the members of these committees are from variety of departments across the library, and the groups average eight members. lastly, there were three responses that sc efforts were not led by “any single person or group.” my survey results revealed a different pattern: scholarly communications activities were much more likely to be led by a single person. library leadership by a single person accounted for of the responses to this question. leadership by two or more people, responses; there was only one sc department, and two responses were that there was no sc leadership within the library. separately there was a question about a scholarly communications committee, because such committees can exist alongside clearly established leaders. three quarters of the responses ( ) were “no.” there were “yes” responses for committees made up of librarians only (some are institutional repository working groups or open access committees); there were only five sc committees with librarians and other faculty. group size is generally less than members: five groups report or fewer members; seven groups number to members, and three groups have more than members. these sc committees most often report to the library administrator ( of the ), while report to faculty senates, and reports to the sc librarian. administrative changes to support sc work were significant among association of research libraries members: of respondents ( %) experienced some sort of change since . the majority of these ( ) created at least one new position; created a new department. formal assessments include annual reports and performance reviews, a few surveys to faculty, and review of statistics (like number of downloads from institutional repositories). 
demonstrable outcomes include an increase in faculty self-archiving, publishing in open access (oa) journals, and support for oa policies. the change rate for non-arl libraries was almost as high: % of respondents had changed a position to lead sc initiatives, and most of those changes occurred in or later. the titles for librarians leading sc efforts reveal a range of departmental affiliations. for the libraries reporting titles, most are administrative or have the term “scholarly communication” in the title ( ). another dozen refer to the director of the library and a half dozen were assistant or associate directors. ten have the term “reference” or “research” in the title, and other terms included in position titles were collection development, digital collections, and systems. while the library directors report to the provost, the majority of other respondents report to the director or ad ( ), and another five report to a department head. staffing support, where it exists, is generally parts of people’s time, in particular, liaisons and those doing work on an ir (metadata, systems, programming). assessment is varied and still in its infancy. only some respondents are counting things, mostly the number of items added to the repository, while others are counting number of attendees at events. a few are recording other measures, such as tracking recipients of oa publishing fund grants, but most are concentrating on building programs and on creating support across campus (for instance, faculty backing an oa policy). scholarly communications services: outreach and education scholarly communications services may be generally divided between outreach and educational activities and those services related to hosting and managing digital content. all arl libraries answering the questions about outreach and education offer services related to authors’ rights, and all but one consult with faculty on sc issues. 
most consult with graduate students ( ) and most advise authors on meeting funding mandates ( ). funding requirements consultations and authors’ rights discussions (which inevitably include copyright) are also seen as offered elsewhere on campus, most likely a research office and university legal counsel—suggesting partnerships for the libraries. a large number of arl libraries, of them, also plan campus-wide events; consult with undergrads about sc issues; and prepare sc-related documents for faculty discussion. it is important to note that the spec kit survey permitted librarians to mark that the service was provided both by the library and in another unit on campus, while my survey did not. for the non-arl libraries, authors’ rights education is still a significant activity: of respondents are engaged in it, across a variety of school types. there are libraries that advise authors on how to make their research open access, and as might be expected, there is a high degree of overlap between schools offering both services. only libraries plan group events related to scholarly communications. sample group events include recent presentations to faculty on journal publishing in oa and traditional publishers, and open access week talks. only of schools advise researchers on their data management plans—but of these also engage in data management activities. advising graduate students about electronic theses and dissertations (etds) takes place at schools; other schools said this activity is done by another unit, most likely the graduate school or faculty advisors. schools of varying sizes are indeed participating in scholarly communications activities, just at rates that differ from those by arls. libraries should look for potential partners within their institutions in order to increase the range and audience for their sc efforts. 
for several of these outreach and educational activities, the graduate school, university research office, and/or university legal offices make natural partners. scholarly communications services: hosting and managing digital content there were responses to questions about hosting and managing digital content recorded in the spec kit. the number of libraries offering each service is somewhat lower than the outreach and education services, though. highest numbers are for supporting campus etds ( of ), providing an ir ( ), data management ( ) and digitization ( ). more of these activities are also provided by other campus units. identifying those other units and clarifying whether the library should be involved or in what way would be very important. libraries that i surveyed are also engaged in hosting and managing digital content, and the two services most often offered are the provision of an ir and digitization. note that the ir and digitization are not offered elsewhere on campus, and that digitization (which includes everything from scanning old college yearbooks to participating in hathi trust) is the most offered service ( of responses). irs are offered by schools across the span of carnegie classes, but in decreasing frequency: only two baccalaureate schools have one, and another indicated that they are planning for one. in contrast, only two ru/vh schools reported that they did not have an ir. a little over half as many libraries ( ) have begun publishing journals compared to the arl’s, but there were two master’s colleges and a dru in addition to the ru/vh and ruh schools. a few more libraries report involvement with data management ( ), and these also included schools from across a variety of carnegie classes. what campus partners are available here, for example, to publish e-journals? campus it, various departments on campus? 
maybe even if another unit is already providing the basic service, the library can add value to those e-journals with services related to indexing, registering for issns, crafting a preservation plan, etc. scholarly communications services: other digital publishing and support the spec kit survey combined digital humanities, e-science, and “e-scholarship initiatives” without defining any of these three. a large number of the responses ( ) indicated support, and noted other campus units also offering support. this number compares well with number of libraries offering an ir and data management. there were libraries that said they are working with faculty to develop new forms of publishing, and schools noted that other units on campus are doing this too. there are libraries publishing e-journals, and who said that other units are providing this service. only of respondents indicated the library administers an oa publishing fund, and said that other units offer such a fund. who paid page charges or other publishing fees in past? likely a research office or dean’s office paid these fees, and maybe these offices would make good partners for a campus oa fund. non-arl library support for new forms of publication included smaller numbers than arl schools ( compared to ), but these were spread across ru/vh, ruh, dru, and masters schools. the surveyed schools also were less likely to offer an open access publishing fund—only out of respondents (all ru/vh or ruh)—although other schools indicated that they are looking for opportunities to offer a fund. this compares to arl schools offering an oa fund. other services mentioned related to reserves, e-reserves, and fair use consultations, new faculty orientations and graduate student orientations. one library director talked about watching nih grant-funded research projects through the campus office of sponsored programs process and tracking public access policy compliance. 
all of these activities, too, offer potential campus partners, including campus research and legal offices. potential for growth: exploring options and planning growth in scholarly communication will be easier if libraries can take advantage of shared support for expertise and shared support for technical infrastructure. shared support for expertise for north carolina libraries includes several web resources, a working group, and a new resource person. web resources highlighted were acrl’s scholarly communication toolkit and the arl’s “developing a scholarly communication program in your library.” the university library advisory council (ulac) recently formed a scholarly communication working group and charged it with investigating oa publishing and archiving resources available to member institutions of the university of north carolina. the new resource person is christine fruin, the visiting program officer for scholarly communication for the association of southeastern research libraries. ms. fruin is the scholarly communication librarian for the university of florida, and in her capacity as vpo will work with sc and oa leaders within aserl on a series of articles in order to highlight sc work done in our region and to identify common themes and best practices. these are only some of the external expert sources available to libraries. in addition to other external experts, libraries should seek expertise in partners such as the university legal counsel, research office, and/or graduate school. shared support for technical infrastructure presupposes libraries working together on any of several different software packages designed to offer the following services: institutional repositories, e-journal publishing, and data management. there are several well-known institutional repository software options, including dspace and bepress.
at least two regional consortia also offer shared repositories using dspace: lasr (liberal arts shared repository) and the nitle network (national institute for technology in liberal education). unc greensboro has also created an ir system (ncdocks) that is currently shared by seven unc system schools. open journal systems is one of the best known software packages for publishing e-journals, and several unc schools are already utilizing it. a shared ojs would defray costs for other schools. some libraries publish e-journals in their dspace repositories, and bepress can also host e-journals. data storage and management is an important and growing need, so libraries are scrambling to evaluate what they can provide. dspace can store data, as can dataverse and project redcap, and there are other free repository software packages, but libraries must be careful, because this software is “free” as in “free puppies.” reflections: the scholarly communications landscape has changed rapidly in the last few years, and the pace of change continues to increase. within the past few weeks, there has been a flurry of activity: aserl announced the vpo, ulac created their task force, an oa fund was initiated at northern illinois, and positions have been posted at virginia commonwealth university, butler university, montana state university, and others. can libraries avoid being left out of the loop? more space for working in the scholarly communications arena will definitely be opened up by the recent office of science and technology policy directive for more agencies to make their funded publications oa and better manage the underlying data. libraries must ask themselves what services to offer, strategically and sustainably, while the library community at large should also consider how to bridge gaps in service across such a wide variety of library sizes. 
a basic takeaway from the survey data is that schools of all sizes are already offering scholarly communications services, so any of our libraries can engage in this work. the libraries still have to decide carefully what services to offer, and who their partners should be. perhaps a set of scholarly communication core services could offer direction for planning training, bridge gaps across institutions of varying sizes, and lead to effective assessment of scholarly communications programs. scholarly communication core services: one of the first questions to address when considering a set of core services for scholarly communications is whether they would be program oriented or whether they would be written as librarian competencies. after all, one possibility for describing a set of core services is to consider sc as a program. acrl has guidelines for instruction programs in academic libraries that might serve as a good model. these guidelines address such functions as program design, support, key components of advanced programs, and benchmarks. there might be more flexibility, though, in concentrating on librarian competencies. these newly developed competencies could stand alone like the information literacy competency standards, or librarians could recommend that sc competencies be integrated into other competency standards. and there are certainly lots of competency sets out there: rusa’s professional competencies for reference and user services librarians has a very good structure; nasig lists draft competencies for electronic resources librarianship; there are competencies for art librarians, music librarians, and medical librarians, among others. consider the following proposal for scholarly communication core services.
related to each broad topic, librarians will:

- open access:
  - help authors make their works open access
  - understand a variety of publishing models
- copyright and publishing agreements:
  - help patrons use copyrighted materials fairly and legally
  - consult with authors on their publishing agreements
- research support:
  - help users evaluate oa resources among their lit reviews
  - help authors comply with funding mandates

in order to meet the goal to help authors make their works open access, librarians will have to be familiar with a variety of publishing models and a variety of types of open access. this competency would include the librarian being able to deposit a permissible copy of a work into an appropriate repository. (see s. potvin, , p. .) this repository might be an ir, a data repository, pubmed central, or a subject repository. copyright and publishing agreements are critical features of the scholarly communication landscape, so understanding them must be a basic competency among librarians doing sc work. consistent among comments in my survey and on the spec kit survey were remarks about the library’s role as a resource for the use of copyrighted materials: reserves were mentioned a lot, as was digitization of physical formats (like vhs), but coursepacks are another area where the library’s licenses can make a big difference to students. working with authors to understand their publishing agreements and to retain the rights they want to keep is an important proactive service that will have a direct impact downstream on the availability of research for future library users. research support services refer to a wide range of library users, from students needing resources to write their papers to faculty conducting a literature review for a grant. complying with funding mandates will create more demands on librarians as the funding mandates increase.
librarian help in writing successful data management plans might be one indicator of success, as might the verification of public access policy compliance. overall, these scholarly communication core services are generally framed so that any member of the library can offer them. they are also intended to be flexible, to address variances of need whether the audience member is a student, faculty member, or other library user. initially, at least, the core services would focus on outreach and educational objectives, since such activities could precede the technological infrastructure necessary for hosting and managing digital content. feedback received during the north carolina serials conference was generally positive, with encouragement to focus on consulting and advocacy roles, to be respectful of different approaches to scholarly communication issues required by disciplinary differences, and to be sure that scholarly communication expertise is disseminated throughout the organization rather than concentrated only in one person. conclusion: librarians from a wide variety of schools were surveyed to discover their scholarly communication leadership, administrative structure, and services offered. outreach and educational activities most offered include authors’ rights and open access, and digitization and hosting the ir top the list of digital content services. these results compare favorably to the types of activities offered by arl members, although not at the same rate of adoption. in addition to suggesting potential for growth through shared expertise, the author also encourages librarians to consider implementing a set of scholarly communications core services because they might provide useful benchmarks against which to plan and evaluate locally offered services.
since the north carolina serials conference in mid-march , two publications and a presentation reveal widespread interest in incorporating scholarly communications educational activities into information literacy. acrl’s intersections of scholarly communication and information literacy ( ), and common ground at the nexus of information literacy and scholarly communication (s. davis-kahl and m. k. hensley, eds., ) were both published, and davis-kahl, kim duckett, julia gelfand, and cathy palmer presented “information literacy & scholarly communication: mutually exclusive or naturally symbiotic?” to the acrl conference in indianapolis ( ). incorporating sc activities into information literacy will provide excellent benchmarks for engaging students. hopefully while this effort is underway, librarians will come up with strategies for defining sc competencies with respect to faculty members, researchers complying with mandates, and other campus partners. librarians might also consider whether there are other preexisting competencies into which sc could be incorporated. references: association of college and research libraries. ( ). intersections of scholarly communication and information literacy: creating strategic collaborations for a changing academic environment. chicago, il: association of college and research libraries. retrieved from http://acrl.ala.org/intersections/. davis-kahl, s., duckett, k., gelfand, j., and c. palmer. ( ). information literacy & scholarly communication: mutually exclusive or naturally symbiotic? paper presented at the association of college & research libraries conference, indianapolis, in. apr. . retrieved from http://works.bepress.com/stephanie_davis_kahl/ /. davis-kahl, s., & hensley, m. k. ( ). common ground at the nexus of information literacy and scholarly communication. chicago: association of college and research libraries, a division of the american library association. potvin, s. ( ). 
the principle and the pragmatist: on conflict and coalescence for librarian engagement with open access initiatives. the journal of academic librarianship, ( ), – . doi: . /j.acalib. . . radom, r., feltner-reichert, m., and k. stringer-stanback. ( ). organization of scholarly communication resources, spec kit . washington, dc: association of research libraries. thomas, w. j. ( ). the structure of scholarly communications within academic libraries. paper presented at the nd north carolina serials conference, chapel hill, nc. mar. . retrieved from http://thescholarship.ecu.edu/handle/ / .

this project has received funding from the european union's horizon research and innovation programme under grant agreement no

https://www.newseye.eu/ | @newseyeeu | newseye-communication(at)ml.univ-lr.fr

july

introduction: valuing digitised newspapers as data

newspapers collect information about cultural, political and social events in a more detailed way than any other public record. since their beginnings in the th century, they record billions of events, stories and names, in almost every language, every country, every day. newspapers have always been an important medium for the dissemination of public and political opinions, literary works, essays and art. this thematic wealth sets them at the centre stage for anyone interested in european cultural heritage. the importance of newspapers as cultural heritage is thus irrefutable, but whilst some progress has been made concerning digitising newspapers, it is their potential as data which opens up new possibilities for their exploration and analysis using digital methods. the newseye project has prototyped an integrated platform of data, digital tools and methods for the exploration and analysis of digitised historical newspapers. it thus demonstrates the potential of developing a european-wide ecosystem for large-scale, transnational and multilingual analysis of digital cultural heritage.
newseye offers a unique combination of state-of-the-art artificial intelligence applied to a rich and linguistically diverse historical newspaper data corpus with a set of humanities-friendly tools. this opens up the possibility for the use of digital methods by anyone seeking to use historical evidence to understand major social and cultural debates (policy, industry, civil society) and significantly deepens our understanding of european culture and history. newseye paves the way for future research to be undertaken in the european commission’s horizon europe and digital europe programmes, bridging the gap between computer science, cultural heritage and digital humanities (and their funding streams).

european policybrief

what digital cultural heritage and digital humanities now need is in-depth interdisciplinary collaboration with state-of-the-art computer science. this is no longer a nice-to-have, but essential. the development of the newseye project has proven the value and necessity of progressing toward opening the utility of historical newspaper data as a concerted effort combining expertise in digital cultural heritage, digital humanities and computer science.

policy recommendations

the achievements of the newseye project in its first two years clearly demonstrate the strength of its cutting-edge interdisciplinary research collaborations, a model according to which europe can remain a leader in digital scholarship in the humanities and computer science alike.
yet these new developments also show how such approaches are only in their infancy and require funding for new (and coordination of existing) structural research programmes to progress towards maturity. based on newseye’s research outcomes to date, the following strategic policy directions are recommended in the sections below.

policy area : cultural heritage digitisation: d still matters

digitisation is not a ‘once and for all’ process. ‘first generation’ or ‘legacy’ digitisation is no longer of a high enough quality for analysis using advanced digital humanities methods. in addition to this, with over % of europe’s cultural heritage still to be digitised, the european commission’s ambitious goal of all european cultural heritage being digitised by remains a distant target. before moving on to strategic priorities which focus on d technologies and tools (which we agree is extremely important) we need to first revalue and reassess where we are with d digitisation. text is a privileged data format: written records give us access to the actual provenance of ideas and actions, allowing us to get beyond the technological and interpretive layers added when data is recreated. the newseye project has already demonstrated that we can do much better with historical newspapers; and thus it can be inferred that digitisation of other types of d heritage can be significantly improved as well. there is an urgent need to re-think the european digitisation strategy for the coming years, i.e. first reassessing what we have and then improving the quantity and quality of that. moreover, the current state of digitisation in europe is still not sufficiently quantified. in previous years, the numeric ( – ) and enumerate ( – ) projects were set up ‘to create a reliable baseline of statistical data about digitisation, digital preservation and online access to cultural heritage in europe’. europeana subsequently continued this work, with the latest report published in summer .
furthermore, the european commission is currently undertaking an evaluation of recommendation ( / /eu) digitisation and online access of cultural material and digital preservation, the results of which are expected in late .

newseye contribution (evidence & analysis): newseye has been working with the digital newspaper collections of three national libraries as a test bed for where we are with historical text corpora in three different european countries, four different languages and a range of levels of resource and technology to demonstrate how advanced digitisation can be applied. our work has proven both the value of this approach and the urgent need to sustainably prolong it. whilst initiatives and projects like newseye are already taking place in several countries (for example in the uk, where work is under way on building a national collection), what is lacking on a european level is a transnational corpus of historical texts and adequate methods to deal with them. essentially, we need a european platform for analysing european historical cultural data.

https://pro.europeana.eu/post/charting-trends-in-digitisation-of-heritage-collections-read-the-enumerate-survey-results
https://ahrc.ukri.org/research/fundedthemesandprogrammes/tanc-opening-uk-heritage-to-the-world/

recommendation: work on historical text is not finished. if europe is going to stay ahead in digital humanities, we still need to do work in regard to textual resources. ongoing sustainable investment in digitisation and the use of digitised material is needed, and we urge that a survey be undertaken to provide important knowledge for evidence-based policy development for digitisation in europe. more structural support is required to enable research which will push forward digitisation and digital literacy to the level needed.
finally, the technology we have developed within the newseye project should be scaled up and applied to all national libraries as well as other cultural heritage institutions in europe, contributing to a european historical cultural data platform.

policy area : accelerating access to and use of cultural heritage through artificial intelligence (ai)

the application and advancement of ai built to work with the complexities of cultural data has an immense, unrecognised (and therefore largely untapped) innovation potential which could act as a catalyst for the digital transformation of europe’s cultural heritage. it is also a source of technological innovation in itself by creating challenging and complex use cases. europeana is currently conducting a survey on the application of ai in relation to galleries, libraries, archives, and museums (glams), which we believe is very necessary. the european commission’s white paper on artificial intelligence (february ) describes a basis for developments in ai which we wholeheartedly support. however, this report omits cultural heritage and digital humanities as an application and innovation area with significant potential. due to the richness, diversity and multilingual nature of europe’s cultural heritage, its complexity as data and richness in context, investment in ai research and innovation related to this application area will boost europe’s competitive advantage and significantly contribute to ensuring that europe is fit for the digital age.

newseye contribution (evidence & analysis): newseye has shown the advantages of bringing machine learning, ai and state-of-the-art computer science closer to the digital humanities and cultural heritage. however, recent advances in machine learning and knowledge extraction have widened the gap between the technological possibilities available and their practical implementation.
the europeana project has been essential for stimulating the opening up of digital collections of europe’s rich cultural heritage. this valuable work is only a first, yet important, step towards extracting and analysing the knowledge still buried deep inside our cultural artefacts. digitised newspapers, while an excellent use case to explore the potential of data science to unlock these embedded semantic layers, are only the tip of the iceberg when it comes to extraction of knowledge from our historical documentary heritage, whether it be written on clay tablets, papyrus, parchment or most recently on paper. in particular, machine readability of textual documents is an area where significant additional development is required: newseye has proven this case through its ground-breaking work starting with historical newspapers, but additional challenges and application areas yet to be tackled are numerous, including ones beyond textual documents.

recommendation: we strongly urge the consideration and inclusion of the unique challenges and opportunities provided by cultural data in ai research and policy. this will bring cultural heritage up to a level of technological advancement that is required to bolster the semantic knowledge we talk about in the next section, and to ensure ai can meet the kinds of nuanced requirements these artefacts of the human record present.

https://pro.europeana.eu/project/ai-in-relation-to-glams
https://ec.europa.eu/info/sites/info/files/commission-white-paper-artificial-intelligence-feb _en.pdf

policy area : deepening cultural understanding through semantic knowledge extraction

in order to fully use the potential of digital technologies, we need to move on from basic digital text analysis of digital cultural heritage resources (e.g.
newspapers, periodicals, literary works, etc.) towards enabling a deeper semantic understanding of european culture and its history (e.g. language, emotions, discourses and memory) through the application of advanced analytic methods such as historical sentiment mining, temporal-spatial analysis and linguistic processing, discourse mapping, event detection and big data visualisations.

newseye contribution (evidence & analysis): digitisation does not stop at the scanning of documents. the digitised historical documents need to be ‘pre-processed’ before any interpretation of them as social or cultural data can be undertaken. newseye has demonstrated significant results in the area of automatic text recognition, layout analysis and article separation, which enrich the digitised newspapers with a foundational layer, including full text transcripts, along with text block and article separation information. beyond this, newseye has significantly advanced the state of the art in semantic text enrichment for historical documents, producing semantic annotations such as the cross-lingual identification, disambiguation and linking of mentions of named entities (e.g., persons, locations and organisations), and the detection of events. this is used to improve document access and allow the advanced systematic analysis of the newspaper collections. in addition to this, the project’s work on dynamic text analysis has shown how we can produce methods to automatically find topics, trends, viewpoints, and themes in the corpus being studied, both within a specified context and in comparison between contexts.

recommendation: we call for the move beyond static digital libraries towards a new way of creating computational analytical tools for digital data exploration. more research and practice are needed in the domain of semantic knowledge exploration in order to develop our understanding of cultural heritage content more deeply.
policy area : democratising digital literacy for digital transformation

until now, the digital transformation of european cultural heritage has been more of an evolution than a revolution. however, the latest advancements in artificial intelligence and data science will require to an ever-increasing extent that discrete disciplines and areas of expertise be combined to deliver a coherent and multidisciplinary approach to digital transformation. therefore, newseye very much welcomes the joining up of innovation, research and culture in the portfolio of commissioner mariya gabriel as a necessary and comprehensive three-pronged approach.

newseye contribution (evidence & analysis): transnational research is constantly being undertaken and advocated for by newseye, bringing humanities and computer science groups together to do cross-lingual, transnational comparative research. the newseye project has championed inter- and cross-disciplinary methods in several ways, which has bolstered our understanding of the importance of such approaches. firstly, with the incorporation of digital humanities hackathons, we have changed the way humanities scholars and computer scientists come together to co-create knowledge using digital tools. these innovative ‘lab-like’ environments, which can and should increasingly be opened up within europe, model an effective workflow for building digital literacy across cultural actors to mutually enrich results. glam labs have started to develop all over the continent, and across glam sectors, a development that is crucial for our work in this field for cross learning, skills and knowledge transfer. glam labs generally present a strong methodology for the digital transformation of cultural heritage and digital humanities.
secondly, we are creating jupyter notebooks within the project on coding for the analysis of newspaper data, which act as a scientific and pedagogical tool, even within disciplines where coding is not yet an established methodological norm. this, in addition to cross-institutional user workshops and study trips between partners to understand each other's work, has proved crucial for the sharing of knowledge and skills.
recommendation: in order to deliver the inter- and trans-disciplinary capacity that the emerging digital society will require, and which has been tested within the newseye project, europe will need to increase its investment in broad cross-disciplinary capacity building, ideally via innovative environments such as glam labs. funding that crosses sectoral, disciplinary and other traditionally siloed approaches to knowledge can be a cornerstone for this development. on the one hand, we need more capacity for implementing leading-edge computer science in cultural heritage institutions; on the other hand, we need a hybrid approach where the ‘humanistic’ and the ‘digital’ meet on the same level.
policy area: innovative open science: sharing ‘fair’ research data
the data that results from the digital historical text analysis pipeline is the final step in closing the virtuous circle of feedback and improvement that connects digital humanities and computer science. we need to make sure that cultural heritage data can be made available for research, in spite of the many necessary restrictions required to ensure its provenance and protect the human subjects that created or were the subject of it.
this complex challenge is beginning to be approached by initiatives such as the social sciences and humanities open cloud project (sshoc, https://sshopencloud.eu/), which provides humanist-friendly access to the european open science cloud (eosc, https://www.eosc-portal.eu/). there is still significant work to be done, however. in terms of humanities research data, how ‘open’ can open science be? while cultural heritage metadata is generally open for reuse, the underlying historical textual data may still be protected under copyright and other legal restrictions. we should be striving for cultural data to become findable, accessible, interoperable, and reusable (fair) throughout this continuum, but within the limits that its hybrid nature as both cultural heritage and research data requires. to take an example, the association of european research libraries (liber) recently published (may ) a paper on supporting text and data mining (https://zenodo.org/record/). liber is actively advocating for researchers to be able to use text and data mining for their research, without fear of legal ramification. similarly, cultural heritage institutions need to be confident as to what is legally possible with their data.
“coming from a traditionally discourse-driven research area, the collaboration in the project forced me to think closely about what the digital transformation means in our fields of study. the newseye project has allowed me to understand how important the humanities’ participation in the development of ai is, and how important it is to have tools that show how machines are working in a very simple and understandable way so that these can be made available for wider research and usage.” prof. dr. eva pfanzelter, university of innsbruck
newseye contribution (evidence & analysis): all tools, services and datasets developed within newseye are made available on the following sites and will be sustained beyond the project duration:
• project website: newseye.eu/
• github repository: github.com/newseye
• publications and data sets: zenodo.org/communities/newseye/
• newseye demonstrator platform: platform.newseye.eu/
• social media and interactive platforms: podcasts, youtube and twitter
in addition to this, newseye team members have embedded their work firmly within the landscape of european and international cultures to make digital cultural heritage visible and usable: this is an effort that needs to be scaled up. the newseye project is already linked with europeana newspapers, impresso, living with machines, oceanic exchange, impact, read-coop, ocr-d and embeddia, amongst others. the project also seeks to draw up associated partnerships with national libraries and research communities therein to develop its work and make it sustainable and accessible. for these leads to be built upon, we need an effective platform to coordinate programmes of work and to facilitate more efficient knowledge sharing and cooperation. this would be a mechanism with which to address gaps between the needs of researchers and institutions with regard to the production and use of cultural heritage data. communication between these groups and links to ongoing and upcoming research are key, for instance with the eosc, digital europe, and the european research infrastructure consortiums (eric) for language resources and technology (clarin) and for digital research for arts and humanities (dariah). ensuring that these initiatives are joined up is the next step in a european-wide, data access-friendly approach. we need an effective platform to reduce inefficiencies in programmes of work and to facilitate more effective knowledge sharing and cooperation between them.
this would be an effective mechanism with which to address the many gaps between researcher and institutional needs with regards to the production and use of cultural heritage data. recommendation: although a few european countries already have copyright exceptions for text data mining, many still do not. the currently tricky labyrinth of legalities of copyright still needs to be made clear for cultural heritage institutions and digital humanities professionals. we call for exceptions for research of digitised newspaper data or clarity on what is possible within those limitations. additionally, we need training in how to deposit such materials in open access platforms and then continued investment in such a resource to make data available for reuse. finally, funding is needed for cross sharing and ‘fairing’ of data between the various strands and national efforts of european cultural heritage open access initiatives. we suggest setting up a platform as a room for exchange and to coordinate and resolve issues of shared concern regarding access. conclusion newseye’s contributions to the progress of european cultural heritage initiatives and the application of advanced digital methods in humanities research through interdisciplinary collaboration have given the team behind this project a privileged position from which to observe both the current state of the art and the potential futures for this research. continued investment on european level and policy support, as detailed in this brief, are now needed more than ever in order to make our work impactful; we owe that to european cultural heritage. 
https://www.newseye.eu/ https://github.com/newseye https://zenodo.org/communities/newseye/ https://platform.newseye.eu/ https://twitter.com/newseyeeu https://www.youtube.com/channel/ucweqok jrfbjebyv-zkpcta/playlists http://www.europeana-newspapers.eu/ https://impresso-project.ch/ https://livingwithmachines.ac.uk/ https://oceanicexchanges.org/ http://impacteurope.eu/ https://readcoop.eu/ https://ocr-d.de/ http://embeddia.eu/ https://www.clarin.eu/ https://www.dariah.eu/
project identity
project name: newseye: a digital investigator for historical newspapers
project number/grant agreement id:
coordinator: university of la rochelle, france
primary contact: antoine doucet, antoine.doucet@univ-lr.fr
consortium:
• university of la rochelle – ulr – france
• austrian national library – onb – austria
• university of helsinki – uh – finland (department of computer science; helsinki centre for digital humanities; national library of finland)
• university of innsbruck – uibk – austria (institute of contemporary history; digitisation and digital preservation group)
• bibliothèque nationale de france – bnf – france
• university of rostock – uros – germany
• university montpellier iii paul valery – upvm – france
• university of vienna – univie – austria
funding scheme: horizon
type of action: research and innovation action
call: understanding europe, promoting the european public and cultural space
topic: european cultural heritage, access and analysis for a richer interpretation of the past
call/topic ids: h -sc -cult-coop- /cult-coop- -
duration: may – april ( months)
budget: eu contribution: €
website: https://www.newseye.eu/
this project has received funding from the european union's horizon research and innovation programme under grant agreement no

the library-press partnership: an overview and two case studies
this is the authors’ final manuscript of the article. the article is published in library trends, vol , no. , . (“the role and impact of commercial and noncommercial publishers in scholarly publishing on academic libraries,” edited by lewis g. liu), pp. - .
yuan li, sarah kalikman lippincott, sarah hare, jamie wittenberg, suzanne m. preate, amanda page, suzanne e. guiod
abstract
this article provides an overview of the changing role of the library in scholarly publishing and the rising phenomenon of library-press collaboration. it examines, through a literature review and two case studies, how and why the library has taken on this new role in scholarly publishing and created partnerships with university presses. the case studies describe current library-press partnerships from the perspective of institutional context, publishing services, and respective roles and responsibilities. the authors also briefly discuss the possible future of the library-press partnership in scholarly publishing.
introduction
creation, publication, and dissemination of new knowledge lie at the heart of scholarly communication. while these functions have changed little over the past several decades, the emerging affordances of information technology are shifting nearly every established mechanism for scholarly communication. scholars, publishers, and libraries are all re-evaluating their historic roles in scholarly publishing in light of transformative technologies and changing attitudes toward scholarship (hahn ).
over the past few decades, technological advances have created both opportunities and challenges in scholarly communication. while scholars have access to an unprecedented wealth of information, tools, and services that enable exciting new possibilities in scholarly inquiry and knowledge production, they struggle to find publishing venues for new research outputs, particularly works that incorporate nontraditional components, such as multimedia elements or - d models. meanwhile, nonprofit and mission-driven publishers—especially university presses and small professional societies—are confronting challenges to their traditional business model and processes. the transition from print to electronic publishing can be expensive and complex, and it can be difficult to find willing publishing partners and new revenue streams. the same issue is facing scholars seeking venues to launch new publications in niche research areas or new media formats. libraries, too, face opportunities and challenges in this new environment. they have been actively crafting their services to catch up with the ever-changing information needs of their community, such as digitization projects that made previously published or unpublished works in library collections available electronically. evolving repository services that collect, store, publish, and disseminate scholarly works demonstrate the new capabilities of the library in information management and dissemination. as scholars and researchers confront gaps in traditional publishing systems, libraries are a natural service provider. collaboration between scholars and libraries in academic publishing is a true partnership, with scholars taking the lead on the editorial process and marketing activities, and libraries providing services related to technical infrastructure, copyright advisement, and information organization (e.g., metadata, indexing, etc.). 
further, partnerships with university presses add another potential avenue for libraries that wish to offer scholarly publishing services, due to their complementary skills and assets. libraries are among the best consumers of university presses’ content, and academic libraries and university presses have a long tradition of collaboration (neal ), though in the past this has predominantly taken the form of bilateral knowledge-sharing. as the case studies in this article demonstrate, many recent library-press collaborations highlight the potential of deep, ongoing collaboration to produce innovative services and publications. despite their differences in financial goals, organizational culture, and even size, university presses are an obvious resource for publishing expertise as well as legitimacy (butler ) when libraries experiment with a new role in scholarly publishing. literature review long before library publishing became mainstream, librarians and publishing professionals have written about the need for library–university press collaboration. day ( ) pointed this out as early as in his article “the need for library and university press collaboration.” neal ( ) and wittenberg ( ) described publishing initiatives at columbia university, a pioneer in library-press collaboration. both authors exhorted other libraries to take the lead in the inevitable reinvention of the scholarly publishing system. more recently, okerson and holzman ( ), who synthesized the most comprehensive report to-date of the history of publishing in libraries, highlighted library-press collaboration, writing that one of the overarching themes of their research “is the possibility and desirability of increasing collaborations between libraries and university presses” ( ). okerson and holzman are not the only authors to forecast a promising future for library-press collaboration. 
in , ivins and luther advocated a role for libraries in sustaining small mission-driven publishers, such as scholarly societies, which have become increasingly keen on oa and digital publication. walters ( ) employed a scenario-planning approach to describe potential high-level trajectories and evolving roles for library publishers, predicting that “cooperative digital publishing services established between several universities, their libraries, scholarly societies, and/or university presses” will become a predominant model ( ). an influential report by mullins et al. ( ) also advanced the pressing need for interdepartmental and interinstitutional collaboration in order to facilitate library publishing at scale. according to the aaup’s library-press collaborations survey report ( ), “collaboration between university presses and libraries is growing, and helps to point the way towards some best practices in developing these relationships.” the early literature on library publishing positions it as complementary to the scholarly publishing activities of commercial and university presses. the s saw a proliferation of articles and case studies advocating the use of the institutional repository to publish gray literature, electronic theses and dissertations (etds), and other original research alongside faculty preprints. case and john ( ) and royster ( ) further developed the case for leveraging the institutional repository to publish original scholarly and creative work that does not fit within traditional publishing models, laying the groundwork for library publishing as a distinct subfield with its own identity. a seminal report published by griffiths et al.
( ), which examined the future of university-based publishing writ large, emphasized the potential value of libraries as publishers, but cautioned them against the peril of institutional repositories that turn into “‘attics’ (and often fairly empty ones), with random assortments of content of questionable importance” ( ). the authors cited the need for cross-institutional collaboration to build economies of scale and develop a critical mass of content to attract authors and readers ( ). in many cases, library publishers have adopted a complementary role to university presses, publishing content that traditional publishers would not ordinarily disseminate. library publishing initiatives upend traditional definitions of publishing and the boundaries between institutional repository programs and publishing programs. with a few notable exceptions, library publishers operate as fully subsidized units of the library, freeing them from the obligation to generate revenue. this model aligns well with libraries’ role as oa advocates, and also allows them to pursue more logistically complicated projects and publications that appeal only to a very niche audience. whyte appleby et al. ( ) noted that many libraries characterize their publishing activities as “hosting services,” particularly those that predominantly deal in gray literature, data, etds and other informal content. case studies also abound on libraries as oa journal publishers (de groote and case ; sondervan and stigter ; perry et al. ; georgiou and tsakonas ). these journal publishing services offer alternatives to scholars looking for rapid publication solutions, permissive licensing, and the incorporation of multimedia. even libraries that have decided not to launch full-fledged publishing initiatives find they may have other related services to offer that complement the services provided by other publishers. 
bains ( ), for example, described a research study undertaken at the university of manchester to determine the feasibility and desirability of launching a journals publishing program. the results of that study convinced the library that providing training and support for authors and editors, rather than creating and publishing its own portfolio, was the better course of action. collaborations or coordination between libraries and university presses can take many forms. some libraries have partnered externally with university presses on specific projects that would benefit readers both on and off campus. one example is the collaboration between the university of utah library and the oxford university press on the ethics of suicide digital archive. other libraries are exploring opportunities with their own university presses for their mutual benefit, such as libraries providing more open access books and presses having increased print sales. the university of pittsburgh presented a clear example of this kind of collaboration between libraries and presses, in which five hundred out-of-print press books were revived with online and print-on-demand access (murray ). while many library-press collaborations are initiated by anticipated economic benefits, the partners increasingly find social, political, and technological advantages (watkinson ). several university presses have now come under the administration of their university libraries; by , about percent of university presses, according to the educopia institute, are situated within or report directly to university libraries (straumsheim ). a successful partnership between libraries and presses, however, entails much more than establishing reporting lines. library-press relationships have met with a good deal of skepticism from the scholarly publishing community, and building a successful partnership, one that equally engages and benefits both parties, has proven difficult (anderson ).
as esposito ( ) contended, “every way you look at the relationship between a press and a library, you come away with little or nothing to support an organizational marriage. presses are great things, libraries are great things, but they are not better things by virtue of having been put into the same organization.” healthy and effective collaborations require mutual understanding of not only the shared goals and values that unite libraries and presses but also the very different drivers, cultures, and expectations that have developed in each field over decades (roh ). brown noted that collaboration is hard as presses don’t see the world through the same lens as librarians. but that does not mean research libraries and scholarly presses cannot acknowledge these different lenses and work together to put some of their aims and interests into a common focus ( ). despite the challenges, numerous published case studies demonstrate that successful partnerships are not only possible, but desirable. examples include purdue university press (watkinson et al. ), penn state university press (eaton, macewan, and potter ), and the university of michigan (courant ). the outcomes of successful partnerships are diverse, from intangible benefits like better communication and knowledge-sharing to concrete publications that could not have come to life without contributions from both the library and the press. anderson ( ) noted that, at utah state university, having the university press situated in the library makes the university a better place for students and scholars and makes the larger scholarly community a richer source of knowledge. overview of library-press partnership the changing landscape of scholarly communication and the advent of digital publishing have pushed academic libraries and the university presses to rethink their roles and to cooperate in creating new digital publishing models that better serve the emerging publishing needs from their campus and beyond. 
in june , a summit on the library and the press as partners in the enterprise of scholarly publishing was convened by the california digital library, the university of california press, the university of michigan libraries, and the university of michigan press. libraries and presses participating in the summit discussed how they might collaborate to forge new publishing structures that support existing and emerging forms of scholarly communication (crow ). library-press partnerships vary in form, size, and services based on the individual institutional context, the actual project, and unique needs. from the literature review and current practices, we observe that collaborations between libraries and presses may include, but are not limited to, the following: the library digitizing the press’s backlist, the library hosting supplementary files for press books, jointly providing scholarly journal/book publishing programs, and jointly developing a publishing platform. there are many benefits to both libraries and presses in each type of collaboration. here, we briefly review some of these categories.
backlist digitization
many library-press partnerships start from digitizing a subset of the press backlist or out-of-print books and making the digital version available online through the library’s existing digital collections infrastructure, such as an institutional repository or digital collections management system. this type of collaboration leverages the skills and serves the individual interests of both partners. libraries increasingly possess the technical infrastructure and skills for large-scale digitization and for hosting digital content, an interest in expanding their role in collecting and disseminating digital scholarship online, and a commitment to promoting open access publishing models. presses, historically print-oriented, are looking for opportunities to test the water in digital and open access publishing.
at the university of pittsburgh, for example, the university press and library system worked together to revive five hundred out-of-print titles. the books were made available online through the library system for users to read and search the full text, and paperback editions were offered for purchase via print-on-demand through the chicago digital distribution center. according to murray ( ), each partner had a distinct role in the project: “the press would clear the rights for books (the press generally had the rights to publish in paper, but not digital) while the libraries would digitize the books, mount them on library servers, and do the graphic design.” this joint effort not only brought new use to out-of-print books but also resulted in increased print sales. the effort closely aligned with the campus’s desire to promote open access publishing, and the support of the university presses gave more credibility to the digital initiatives (murray ). other examples of this type of collaboration include projects sponsored by the humanities open book program, a joint program of the national endowment for the humanities and the andrew w. mellon foundation. both fordham university press and libraries and wayne state university press and library system received the grant in and to digitize out-of-print books and make them available on the library’s servers. supplementary content hosting another type of collaboration between libraries and presses also positions the library as a host for digital content. instead of hosting the digitized backlist from the press, the library helps the press host supplemental digital files for their current publications. the press sometimes encounters challenges in dealing with extensive supplemental files, which provide important contextualizing material but cannot be included in print due to format or volume considerations. 
through collaboration with the library, the supplemental files are hosted online by the library in digital format and linked to the publication page on the press website or in the text of the publication. in the print version, the press only needs to include a link to the supplemental files on the library server. in some cases, the library also provides enhanced functionalities to the hosted content, such as cross-linking to the press website or other related resources and full text searching. one example of this type of collaboration is a joint initiative from the umass amherst library and the umass press. for the book meetinghouses of early new england, the library hosts over two hundred supplemental pages of appendixes and a bibliography. for the print edition of tidal wetlands primer, the library hosts eighty-five high-resolution color figures and images and enables zoom functionality.
journal/book publishing services
an increasing number of libraries and presses have launched collaborative publishing services. library publishing services launched solely by the library rarely provide the time-intensive services that represent the hallmarks of traditional scholarly publishing, including typesetting, marketing, graphic design, and print production and distribution. the level of service provision varies widely across the field, a trend that has given rise to questions about the distinction between hosting and publishing, or what whyte appleby et al. ( ) termed the “publishing-hosting spectrum” ( ). on the other hand, publishing services launched jointly by libraries and presses lend legitimacy to the initiative and provide a more robust suite of services. this type of collaboration may be represented through the creation of a library-press imprint or a joint program, such as a scholarly publishing office.
the services of the imprint or joint program include those offered by libraries, such as infrastructure, guidance on metadata and copyright best practices, indexing, provision of unique identifiers, and preservation services, alongside traditional publishing services from university presses, such as copyediting, graphic design, marketing, and print production and distribution. this type of collaboration generally has a focus on open access, sometimes with an option of print-on-demand. it helps the library to move forward its agenda in open access and allows the press to fulfill an important role in disseminating high quality scholarly content regardless of its market potential. there are many examples of this type of collaboration, including the two case studies elaborated in this article.
development of publishing platforms
in recent years, library-press partnerships have gone beyond developing publishing services to the development of publishing platforms, designed to have a broader impact and benefit the overall library/press publishing practice and community. in , through the support of the andrew w. mellon foundation, the california digital library and the university of california press partnered to develop a new open source, digital-first book production platform. “the project, called editoria, will support a robust book production system for academic publishers and library publishing programs that seek a low-cost and efficient mechanism for streamlining their book-publishing activities. the platform will be open source and able to be configured for many different publishing workflows” (mitchell ). another example is the mellon-funded project from the university of michigan press and library.
the joint effort is “to create a shareable, open-source solution for born-digital complementary monograph materials as well as a working model that maximizes the publishing strengths of university presses and the preservation expertise of libraries to meet the growing needs of authors to durably connect their publications to related datasets, interactive information, video and other non-text based online content” (university of michigan press ). benefits of collaboration the benefits of library-press collaboration are manifold. at their core, these partnerships are an acknowledgement that securing a robust future for libraries and publishers requires a broader set of skills, a deeper pool of resources, and a more diverse set of perspectives than any one player can bring to the table. as crow ( ) observed, “a mutuality of interests is critical to creating a strong alliance. in many cases, a library and a press will partner because each needs the other to advance its individual interests” ( ). although university libraries and presses are different in many ways, including their respective missions, one centered on the research and teaching needs of the institution and another on serving academics as a whole, it is still appealing for them to collaborate as they share an institutional culture, a commitment to serving the emerging needs in scholarly publishing of their faculty and students, and the understanding of the problems in the current system of scholarly publishing. by collaborating, university presses are allowed to pursue, experiment, and expand the digital publishing program that would otherwise go beyond their resources. having the ability to pursue a new digital publishing model or develop new services can help presses cope with the changing market and shifting environment, manage innovation, and upgrade the competencies. libraries also have their own motivation for collaborating with their universities’ presses. 
the most obvious benefit is the integration of traditional publishing expertise and skills into the library publishing services or program. in addition, partnering with presses brings reputation and validation to the library publishing program. collaborating with presses also helps the library advance its open access agenda.

case study: indiana university bloomington

the library-press relationship takes many forms; on many campuses they provide complementary services or collaborate on innovative projects that leverage each partner’s skills. indiana university provides one such example of library-press collaboration. the indiana university office of scholarly publishing is a collaboration between the libraries and the university press (iu press), established in by indiana university provost lauren robel “to strengthen iu's central missions of scholarship and teaching and create a model of effective, sustainable st-century academic publishing” (indiana university ). prior to the establishment of the office of scholarly publishing, the iu libraries scholarly communication department was operating an open access journal publishing program, iuscholarworks journals. the first journal to be published as part of the iuscholarworks program, which used the public knowledge project’s open journal systems platform, was museum anthropology review. the first issue was published on february , . following the establishment of the office of scholarly publishing, the scholarly communication department worked collaboratively with the iu press to develop new publishing services for the open access journals that they support. as part of this collaboration, the thirty existing iuscholarworks journals were assessed based upon the iu press's criteria for academic rigor, review practices, and consistency.
of the thirty journals evaluated, sixteen were found to meet the established criteria and were invited to join a new publishing program: the office of scholarly publishing journals (osp journals). of these sixteen journals, thirteen accepted the invitation. these osp journals were offered a range of enhanced publishing services free of charge, which are detailed below. since the launch of the osp journals program in , the number of journals participating has increased by nearly percent. as of april , there are eighteen osp journals, with several slated to come onboard in the coming months.

osp publishing services

the office of scholarly publishing frames its mission around the research needs of the university. the program is currently designed to support indiana university and only accepts proposals from journals with an indiana university affiliation. part of the reason for this is the program’s provision of a full range of operational publishing services at no cost to the journals, with the exception of copyediting and print on demand. all osp journals have access to the following services:

• publishing project management
• copyediting and proofreading
• composition and design
• advertising, marketing, and promotion
• indexing and discovery assistance
• print on demand (pod)
• fulfillment services
• epub conversions

iu open journals

the counterpart to the office of scholarly publishing (osp) journal publishing program is indiana university’s iu open journals publishing program. this part of the program is designed to lower barriers to journal publishing and provide system-wide support for serial publication. anyone affiliated with iu bloomington or one of iu’s regional campuses can participate, including undergraduate and graduate students. the program supports several nontraditional publications, with unique content and review models. additionally, as of spring , there are ten iu open journals led by students at iu bloomington or a regional campus.
with its emphasis on access, this branch of the program provides an incredible opportunity to educate students on publishing topics and open access. as an example, the scholarly communication librarian partnered with the office of the vice provost for undergraduate education to teach a one-credit-hour course to the editorial board of an iu open journal, the indiana university journal of undergraduate research (iujur) in fall . this provided an immersive opportunity for students to learn about the ojs publishing platform and their journal’s review process, as well as broader concepts, including copyright, open access funding models, and the labor and resources required to operationalize publishing innovations.

respective roles and responsibilities

generally, the respective roles and responsibilities of osp partners are somewhat traditional. iu press staff oversee several service offerings, including brokering print-on-demand and copyediting, providing graphic design advice, and managing monograph/book subventions and consultations. the library spearheads conversations about open access, hybrid models, and the open source publishing platform open journal systems. however, at its best, the office of scholarly publishing goes beyond centralizing disparate publishing resources into a single unit in order to increase efficiencies. it also provides a space for cross-pollination in order to shape each respective partner’s approach. in short, the best work within the osp happens when library roles and responsibilities blur with press roles and responsibilities (and vice versa). an important example of this is when the osp works as a team to manage complex negotiations with new journal candidates. several journal candidates are considering flipping to open access and often have unique service needs for print-on-demand, copyediting, doi creation, or maintaining a subscription list and/or their back issues.
conversations with candidates have prompted the osp to reflect on what kind of open access we are committed to (and why) and on the value of the services we offer. these conversations often also empower us to share expertise about copyright, oa models, and general publishing philosophy with each other.

discussion of possible future trends

the indiana university libraries/press partnership has engendered several experimental projects that are emblematic of global twenty-first century publishing trends. the osp journals program has been piloting xml-first publishing for journals that can benefit from access to full text. one example is studies in digital heritage, a digital archeology journal that embeds time-based media and d models into its articles. publishing the articles in xml enables readers to interact with embedded media. publishing xml-first is substantially less resource-intensive when the articles are already encoded using the journal article tag suite (jats). for journals that take advantage of the osp journals program’s print-on-demand service, articles are encoded at no additional cost. the office of scholarly publishing has also been exploring the possibility of a new program to support open access books and monographs, tentatively called osp editions. this nascent service has leveraged the university’s license for the pressbooks platform and published affordable textbooks in collaboration with indiana university faculty and the central it unit, uits. these digital textbooks can be made available through canvas, the university’s learning management system. the osp plans to continue piloting digital publishing platforms to support open and affordable course materials. the office of scholarly publishing’s commitment to both serving and educating the iu community about publishing issues is unique.
in addition to bringing together disparate publishing expertise on campus, understanding publishing as an educational imperative is an important framing for the group; it informs the services, initiatives, and programming the group creates and provides.

case study: syracuse university

institutional context and business model

syracuse university libraries launched its institutional repository, surface, in to highlight and enable broad access to the university’s extensive array of scholarly output. this venture provided natural opportunities for open access (oa) education and fueled discussions about how authors and researchers produce, distribute, and consume information. with institutional repository deposits underway, the libraries began to explore a sustainable oa service model that would support campus publishing needs and offer an alternative to commercial vendors. our approach followed a clear trend in higher education: leveraging skills and services distributed across campus units and combining them formally and informally on a case-by-case basis to enable publishing activities. syracuse’s initial dive into such a model pooled staff expertise from the libraries, syracuse university press, information technology services (its), and faculty from several departments, and prompted the adoption of an open source publishing platform (open journal systems) to support two pilot projects. as a vehicle for these services, the libraries and press jointly launched an open access imprint, syracuse unbound, in . today, the syracuse unbound imprint is an active alliance between syracuse university libraries and syracuse university press in fostering open access endeavors through publishing workflows, platforms, and the institutional repository. syracuse unbound is loosely organized around a few goals: collaboration, broadening the definition of open scholarship, and providing opportunities for oa publishing.
in , the libraries’ department of research and scholarship (drs) reorganized its scholarly communication unit; the new open publishing services (ops) offers a menu of services to support campus scholarly communication needs as part of the libraries’ vision for its nascent digital library program (dlp). currently, open publishing projects intended for inclusion in syracuse unbound are triaged and selected thoughtfully, in collaboration with the press, though more publishing services may evolve over time as our capacity increases. services currently vary by project, but may include project management and consultation on the following: general oa education, best practices in oa publishing, platform recommendations and technical infrastructures (ojs, wordpress, digital commons, and other tools), peer review, copyediting, proofreading, design and layout, metadata, cataloging, copyright and licensing, marketing, identifier registration (e.g., issn, eissn, doi), accessibility production and compliance, and preservation considerations.

collaborative project example

the first oa project to publish under the syracuse unbound imprint was a complex peer-reviewed multimedia journal, launched in , and focused on the humanities, art, and design in public life. public: a journal of imagining america continues to be edited by su faculty and makes use of submission protocols through open journal systems and the front-end graphic design capabilities of wordpress supported by staff from both the press and libraries. the next project published under the syracuse unbound imprint was a book titled triple triumph: three women in medicine, highlighting the path-breaking careers of three women medical pioneers in upstate new york. initiated in , this book project is likewise edited by syracuse university faculty and housed in the institutional repository, and presents a strong example of successful collaboration.
an initial contact by a faculty member for copyright advisement expanded into a full syracuse unbound publishing project. participants worked closely to provide the following support and infrastructure to the book’s editors: project management, graphic design, editorial guidance, and eisbn and isbns on the part of su press; and project management, digital file creation, discovery workflows (including doi creation), metadata, accessibility, copyright, open-access licenses, and preservation on the part of su libraries. all parties worked together on marketing. while the stories of the careers of brangham, numann, and weinstock were a motivating factor in selecting triple triumph for the syracuse unbound imprint, the global impact of the publication has surpassed expectations. triple triumph was published in print and digitally (in pdf, accessible pdf, epub . , and kindle formats) under a cc-by-nc-nd . license. with , downloads since august , , access to the book spans twenty-six countries and includes downloads of the accessible (ada) files, in addition to two print runs. the project ran smoothly and provided a learning opportunity for planning future library-press collaborations. key participants responsible for that success include the publishing librarian, su press’s director and design editor, the libraries’ digital initiatives librarian, its accessibility unit, information technology services, director of communications, subject librarians, and principal cataloger. our takeaways were these: collaboration in project management is key; clarity of roles, responsibilities, timelines, and project management workflows is essential; and publishing projects that are meaningful and match our shared mission are invaluable.
reflections on the collaboration/conclusions

collaborative work on both the journal and book projects provided positive, educational opportunities for both partners, offering exposure to and understanding of our respective cultures, philosophies, business models, challenges, and strengths. we discovered that developing the most natural, least forced partnership arrangements and interactions should happen, ideally, at a project level rather than at the program level. while university presses and libraries serve similar constituencies and share similar missions to disseminate scholarship and increase accessibility, our day-to-day activities, those that absorb the majority of our time and focus, are quite different. further policy refinement is needed to define our scope and capacity to customers. while redundancy between the partners is acceptable, it remains important to understand the roles and responsibilities of both partners so that we offer a realistic menu of services that we can genuinely support. these exercises likewise underscored the extant importance of aligning our program and services with the strategic planning goals of the libraries and the university. we are also learning to refine selection criteria and are simultaneously expanding our understanding of what constitutes scholarship through a value-based analysis. further, we found that opportunities for collaboration with or outreach to more untapped “markets” (digital humanities practitioners and others) become more apparent through discussion. given our overlapping missions and desire to make common cause with institutional partners, syracuse university libraries and press look forward to future opportunities to expand our oa services and to collaborate on successful projects.

looking forward

library publishing is one of the notable transformations that the library is making in light of the changing landscape of scholarly communication.
there is an emerging consensus that basic publishing capabilities will become a core service for research libraries (hahn ). by partnering with the university press, libraries can leverage complementary contributions to provide better, more comprehensive, and transformative publishing models. libraries bring new models to the table to fill gaps, such as nontraditional publishing in data, gray literature, and digital humanities projects, and fulfill the library mission of access and stewardship. libraries provide a home for scholarship that would not otherwise be available to the world and address critical service needs in publishing by providing alternatives that offer less restrictive terms that can accommodate new forms of scholarship and complement existing services to support teaching and learning (li ). library publishing represents just one manifestation of libraries’ transformation from service providers to research partners, from knowledge keepers to knowledge creators. however, publishing services will require broader institutional support to thrive. libraries have taken the lead in launching new services, but will require new and ongoing resources from institutional leadership to build effective capacity to grow in scale. robust institutional funding forms a cornerstone of library publishing’s identity, allowing libraries to adopt platinum oa business models, take on experimental or logistically complicated projects, and fulfill their mission of providing broad, unfettered access to knowledge. the library-university press relationship represents one of the most promising avenues forward for scholarly communication as it leverages the library’s strengths in infrastructure, campus relationships, and knowledge management, with the press’s expertise in acquisitions and editorial work, marketing, and its existing reputation and prestige. 
over the long term and at scale, library-press collaborations can result in a landscape where high-quality scholarly content is available to all in a range of forms and with different levels of curation and review. it is our hope that in the future, libraries and university presses, as publishing agents and partners with scholars and academic societies with the support of institutions and funders, will help to create a more sustainable, open, transparent, and effective scholarly communication system.

acknowledgement

the literature review sections of this article greatly benefited from the bibliography of library publishing (https://librarypublishing.org/resources/) assembled by the library publishing coalition.

references

aaup (association of american university presses) library relations committee. . library-press collaborations survey report. http://www.aaupnet.org/images/stories/data/librarypresscollaboration_report_corrected.pdf.
anderson, rick. . “another perspective on library-press ‘partnerships.’” scholarly kitchen (blog). july , . https://scholarlykitchen.sspnet.org/ / / /another-perspective-on-library-press-partnerships/.
bains, simon. . “the role of the library in scholarly publishing: the university of manchester experience.” insights ( ): – . https://doi.org/ . /uksg. .
billings, marilyn s., sarah c. hutton, jay schafer, charles m. schweik, and matt sheridan. . “open educational resources as learning materials: prospects and strategies for university libraries.” research library issues: a quarterly report from arl, cni, and sparc, no. : – . https://doi.org/ . /rli. . .
brown, richard. .
“six characteristics of success press-library collaboration.” contribution to panel at association of university presses annual conference, charleston, sc, november - , . http://www.aupresses.org/news-a-publications/aaup-publications/the-exchange/current-issue/ -press-library-collaboration.
butler, declan. . “the dark side of publishing: the explosion in open-access publishing has fuelled the rise of questionable operators.” nature : - . http://dx.doi.org/ . / a.
case, mary, and nancy r. john. . “publishing journals @uic.” research library issues: a quarterly report from arl, cni, and sparc, nos. / . http://old.arl.org/bm~doc/arl-br- - -uic.pdf.
courant, paul n. . “what might be in store for universities’ presses.” journal of electronic publishing ( ). http://dx.doi.org/ . / . . .
crow, raym. . “campus-based publishing partnerships: a guide to critical issues.” washington, d.c.: sparc. http://sparc.arl.org/sites/default/files/pub_partnerships_v .pdf.
day, colin. . “the need for library and university press collaboration.” collection management ( - ): - . http://dx.doi.org/ . /j v n _ .
de groote, sandra l., and mary m. case. . “what to expect when you are not expecting to be a publisher.” oclc systems & services: international digital library perspectives ( ): – . https://doi.org/ . /oclc- - - .
eaton, nancy, bonnie macewan, and peter potter. . “learning to work together: the libraries and the university press at penn state.” journal of scholarly publishing ( ): - . http://dx.doi.org/ . /jsp. . . .
esposito, joseph. . “having relations with the library: a guide for university press.” scholarly kitchen (blog). july , . https://scholarlykitchen.sspnet.org/ / / /having-relations-with-the-library-a-guide-for-university-presses/.
georgiou, panos, and giannis tsakonas. . “digital scholarly publishing and archiving services by academic libraries: case study of the university of patras.” liber quarterly ( ): – . http://doi.org/ . /lq. .
griffiths, rebecca j., matthew rascoff, laura brown, and kevin m. guthrie. . university publishing in a digital age. new york: ithaka s+r. https://doi.org/ . /sr. .
hahn, karla l. . “research library publishing services: new options for university publishing.” washington, d.c.: association of research libraries. http://www.arl.org/about/ -research-library-publishing-services-new-options-for-university-publishing#.wqqaxhpwaax.
indiana university. . “iu to establish new office of scholarly publishing.” iu newsroom. http://newsinfo.iu.edu/news/page/normal/ .html.
ivins, october, and judy luther. . “publishing support for small print-based publishers: options for arl libraries.” washington, d.c.: association of research libraries. http://www.arl.org/focus-areas/research-collections/preservation/ -publishing-support-for-small-print-based-publishers-options-for-arl-libraries.
li, yuan. . “is it really publishing: the why and how of library publishing initiatives.” against the grain ( ): .
mitchell, catherine. . “university of california press and california digital library partner with collaborative knowledge foundation to build open source monograph publishing platform.” university of california, california digital library. press release, may , .
https://www.cdlib.org/cdlinfo/ / / /university-of-california-press-and-california-digital-library-partner-with-collaborative-knowledge-foundation-to-build-open-source-monograph-publishing-platform/.
mullins, james l., catherine murray-rust, joyce l. ogburn, raym crow, october ivins, allyson mower, daureen nesdill, mark p. newton, julie speer, and charles watkinson. . library publishing services: strategies for success: final research report. washington, dc: sparc. https://docs.lib.purdue.edu/purduepress_ebooks/ /.
murray, peter. . “online editions of out-of-print books result from library/press partnership at univ of pittsburgh.” disruptive library technology jester (blog). last updated may , . https://dltj.org/article/upitt-library-press/.
neal, james. . “symbiosis or alienation: advancing the university press/research library relationship through electronic scholarly communication.” journal of library administration ( / ): - .
okerson, ann, and alex holzman. . “the once and future publishing library.” council on library and information resources. https://www.clir.org/wp-content/uploads/sites/ /pub .pdf.
perry, anali maughan, carol ann borchert, timothy s. deliyannides, andrea kosavic, and rebecca r. kennison. . “libraries as journal publishers.” serials review ( ): – . https://doi.org/ . /j.serrev. . . .
roh, charlotte. . “library-press collaborations: a study taken on behalf of the university of arizona.” journal of librarianship and scholarly communication ( ). http://doi.org/ . / - . .
royster, paul. . “publishing original content in an institutional repository.” serials review ( ): – . https://doi.org/ . / . . .
sondervan, jeroen, and fleur stigter. . “sustainable open access for scholarly journals in years: the incubator model at utrecht university library open access journals.” learned publishing. http://dx.doi.org/ . /leap. .
straumsheim, carl. . “paper on library-university press partnerships.” inside higher ed. december , , quick takes. https://www.insidehighered.com/quicktakes/ / / /paper-library-university-press-partnerships.
university of michigan press. . “mellon grant funds u-m press collaboration on digital scholarship.” michigan publishing, university of michigan library. april , .
https://www.publishing.umich.edu/ / / /mellon-grant-funds-u-m-press-collaboration-on-digital-scholarship/.
walters, tyler. . “the future role of publishing services in university libraries.” portal: libraries and the academy ( ): – . http://doi.org/ . /pla. . .
watkinson, charles. . “why marriage matters: a north american perspective on press/library partnerships.” learned publishing : - . https://doi.org/ . /leap. .
watkinson, charles, catherine murray-rust, daureen nesdill, and allyson mower. . “library publishing services: strategies for success.” panel presentation at the association of university presses annual conference, charleston, sc, november - , . http://dx.doi.org/ . / .
whyte appleby, jacqueline, jeanette hatherill, andrea kosavic, and karen meijerkline. . “what’s in a name? exploring identity in the field of library journal publishing.” journal of librarianship and scholarly communication ( ). http://doi.org/ . / - . .
wittenberg, kate. . “the electronic publishing initiative at columbia (epic): a university-based collaboration in digital scholarly communication.” learned publishing : - . https://doi.org/ . / .

yuan li is the scholarly communications librarian at princeton university, where she manages the princeton university library's efforts to support scholarly publication innovations and reforms and supervises and coordinates activities related to the princeton open access policy and the princeton institutional repository. prior to joining princeton, she served as scholarly communication librarian at syracuse university, digital initiatives librarian at the university of rhode island, and digital repository resident librarian at the university of massachusetts amherst. her research interests focus on the changing landscape of scholarly communication, open access, digital preservation, data management and curation, new models of digital publishing, and digital scholarship. yuan has an mls from the university of rhode island, an me in applied computer science from the national computer system engineering research institute of china, and a bs in computer science and technology from yanshan university (china).

sarah kalikman lippincott is the assessment and planning librarian at the university of massachusetts amherst. she served as the inaugural program director of the library publishing coalition, an international nonprofit membership association for academic libraries. sarah has consulted on a range of digital publishing and scholarly communications projects for libraries and other cultural heritage organizations. her research and consulting focuses on the information behavior of scholars who use archival materials and on the research and teaching practices of digital humanists. she has an mlis from the university of north carolina at chapel hill and a ba in french and comparative literature from wesleyan university.

sarah hare (formerly crissinger) is the scholarly communication librarian at indiana university bloomington. in this role, she collaboratively leads several open access publishing initiatives. sarah’s research focuses on scholarly communication outreach to undergraduate students, open educational resources (oer), and the intersections of information literacy and scholarly communication. she currently teaches an eight-week library juice academy course, “introduction to open educational resources,” for library and information professionals worldwide. prior to joining the iu libraries in , sarah served as information literacy librarian at davidson college, where she created open access programming and led two open educational resource (oer) initiatives. she holds a master of science in library and information science from the university of illinois at urbana-champaign and a bachelor of arts in english from wright state university in dayton, ohio.

jamie wittenberg is head of the scholarly communication department and research data management librarian at the indiana university libraries. her work focuses on enabling open access to scholarship, facilitating reuse, and advocating for transparent research practices. jamie and her team run an institutional repository for the iu research community as well as an open journal publishing platform. jamie is working in collaboration with library developers to operationalize workflows from the iu faculty reporting system to the institutional repository. jamie’s current research includes work on preprint deposit pipelines, pedagogical models for data services, personal digital archiving methods, and nsf and sloan-funded research on publishing digital d objects. prior to joining the iu libraries in , jamie served as research data management service design analyst at the university of california–berkeley.
she received a ba in literary studies from bard college at simon’s rock, master of british studies from humboldt university of berlin, and an mslis from the university of illinois at urbana- champaign. suzanne m. preate is the digital initiatives librarian and manager of the digital production unit at syracuse university libraries. her interests include all aspects of the digital scholarship lifecycle, including digitization, asset management, metadata, open access publishing, digital preservation, and project management. suzanne previously held positions as a reference and instruction librarian, research instructor, and web developer. she received her mls from syracuse university’s school of information studies and is a phase one certified cultural heritage specialist. amanda page is the open publishing/copyright librarian at syracuse university libraries in syracuse, ny, where she leads a team and oversees the instituional repository and other open publishing services serving research and instructional support. previous roles included serving as the head of extended collections and scholarly communications at northern kentucky university, and at the harvard open access project (hoap) at the berkman klein center for internet & society at harvard university, and for the countway library of medicine. her research focuses on author rights and permissions, privacy, publishing ethics, and open access. she has an ms in library and information science from simmons college in boston, ma, and is an associate editor for the doaj. suzanne e. guiod is editor-in-chief at syracuse university press. she previously served as editorial director of the university of rochester press and managing editor of the encyclopedia of new england (yale university press ). she holds an ma in english literature from the university of new hampshire. i-schools and archival studies richard j. cox and ronald l. larsen school of information sciences university of pittsburgh n. 
bellefield avenue pittsburgh, pa rjcox @comcast.net rlarsen@mail.sis.pitt.edu
abstract
whispers and rumors about the ischool movement lead some to fear that this represents yet another shift away from the valued traditions of library schools, threatening something far different than what library science pioneers ever envisioned. predating the ischool movement, however, were other programmatic shifts such as those that led to the formalization of graduate archival education. this essay argues that such evolution is essential to our future, as ischools tackle the increasingly complex issues confronting a digital society. we consider the mission and history of ischools and of archival studies, the basic elements and concepts of archival studies that are critical to ischools, and the relationship between ischools and the changing nature of personal and institutional archives.
keywords
ischools, archival studies, archives, library and information science
introduction
american graduate archives programs have been connected to library schools and then library and information science schools for more than a half-century, competing for a while with history departments but emerging as fully embedded in the former by the s (some would argue even before then). how are graduate programs in archival studies affected by the transition of many of the traditional library and library and information science (lis) schools to the newly emerging information or ischools? what is the place of archival studies programs in ischools? such questions might have interesting precedents if we bear in mind that many of the varying definitions of information, some in use in the newer ischools, stem from the traditional variants of these schools (for example, bates ; buckland , ; shera , ).
more importantly, what new possibilities open for enhancing the archival studies programs in a time when archivists increasingly are facing working with digitized or digitally-born documents? when we originally proposed this paper for the iconference, the primary motivation behind it was the sense by some graduate archival educators that their role and that of the archival profession was being somehow lost in or neglected by the ischool movement. however, after due consideration, we are seeing how a stronger connection between archivists and the archival profession and ischools could deal with many of the challenges presented by the transition to the digital age. there are new and emerging interdisciplinary avenues for those in archival studies programs to follow, such as what seamus ross is doing at the university of glasgow with the humanities advanced technology and information institute or what anne gilliland is doing at ucla with the center for information as evidence. both ross and gilliland come from the archives community, and the kind of collaborative work they are doing may suggest the future for what archival studies programs become. we emphasize that this essay is a preliminary exploration, intended to start conversation about a relationship (given the early formative stage of both archival education and ischools) that is in a nascent developmental stage. this paper takes a snapshot of the evolving role of archival studies in an increasingly digital world and considers, in particular, the convergence of this evolution with the emergence of ischools. it reflects on the societal and technological context that is driving this symbiotic relationship, in the interest of stimulating discussion, debate, and further analysis. we begin the discussion by reviewing several foundational definitions, some of which remain in a state of flux reflecting the transitional character of the disciplines involved. 
following the section on definitions, we discuss the historic roots and contemporary trends in archival education, building to the dominant theme of the paper: strengthening archival studies in ischools.
setting the scene: basic definitions
discussing an issue such as archival studies, and all the variation of terms represented by the archiving function, can become confusing when we discuss it in the arena of information studies. it is important to provide some basic definitions up front so that we are all on the same page. in the transitional era from print to digital, from paper to electronic, some basic concepts -- such as archives or archive or archiving -- can get confused. and, as well, in the shifting from library to library and information science to ischools as the past, present, and future home for the education of information professionals such as archivists, professional missions, identities, and partnerships may be changing in radically new ways. in this transitional era, even when friendly and like-minded professionals, educators, and scholars sit around the table to discuss issues of mutual concern and interest, care often must be taken to ensure that everyone understands what is being discussed. ironically, we often need to be more precise in our definitions (such as with records or documents) and broader in how we define the scope of our responsibilities (such as in our appraisal work and in the ethical ramifications of such work) (see cox, , , ). the first thing to understand is that when we write or speak of archives we are not referring to backed-up data or old records and information with no other value than as some reminder of the past. archives encompass organizational, governmental, personal, and family records maintained because of continuing or enduring values to their creators, particular research clienteles, and society.
these documents are preserved because of evidence, information, accountability, and corporate or public memory values. and archives exist in every kind of organization – government agencies, corporations, cultural agencies such as libraries and museums, universities, and community groups; they are also created and maintained by individuals and families. the most comprehensive basic glossary definition for archives is as follows: . materials created or received by a person, family, or organization, public or private, in the conduct of their affairs and preserved because of the enduring value contained in the information they contain or as evidence of the functions and responsibilities of their creator, especially those materials maintained using the principles of provenance, original order, and collective control; permanent records. – . the division within an organization responsible for maintaining the organization's records of enduring value. – . an organization that collects the records of individuals, families, or other organizations; a collecting archives. – . the professional discipline of administering such collections and organizations. – . the building (or portion thereof) housing archival collections. – . a published collection of scholarly papers, especially as a periodical (pearce-moses, ). while this definition covers all the bases, at least as traditionally seen within the modern archival profession of the past century or so, it also generates some questions. like library science education, the education of archivists emerged from a world of paper records, information systems and technologies generating paper records (typewriters and carbon paper to early personal computers and word processing), traditional bureaucratic structures characterized by the thinking of max weber and frederick taylor, and compliance systems and information policies geared to paper records (such as represented by the fourth amendment notion of privacy).
all this is being challenged by the networked world of the web and the post- / world of security, transforming notions of government intrusion and control, personal privacy, and portable digital information systems – just to consider some aspects. how do traditional principles of archives administration hold up in our emerging digital era? what is the timetable for the complete shift from paper to digital and the implications of this for the education of a new generation of archivists? are archivists part of the information professions, or part of the historical or cultural heritage fields, or all of these and more? what is the nature of the knowledge domain of the archivist, and how does it intersect with the information sciences? how is the mission and work of the archivist evolving in light of digital recordkeeping and information systems? for many outside of the archives profession, archival work and the mission archivists and their programs are associated with is preservation, but even preservation management and conservation are also distinct fields, with their own educational issues and standards. here is a standard definition of preservation as noun and verb: n. ~ . the professional discipline of protecting materials by minimizing chemical and physical deterioration and damage to minimize the loss of information and to extend the life of cultural property. – . the act of keeping from harm, injury, decay, or destruction, especially through noninvasive treatment. – . law · the obligation to protect records and other materials potentially relevant to litigation and subject to discovery. v. ~ . to keep for some period of time; to set aside for future use. – . conservation · to take action to prevent deterioration or loss. – . law · to protect from spoliation (pearce-moses, ). 
what this translates into is the idea that preservation is really a commitment to maintain information, evidence, or an artifact over time whatever it is made of or how it is originally created; while this has often been seen as synonymous with the concept of permanence, archivists themselves have debated about whether it implies continuing (meaning as long as there is some reason for keeping) or enduring (meaning as long as possible) (o'toole, ). such debates have only accelerated in intensity as we have moved from paper to digital sources (considering such issues as record reliability, authenticity, and other traditional concerns expressed by archivists about records and recordkeeping). preservation also encompasses the function of conservation and restoration (including hands-on treatment and repair), but the focus is on preservation management with responsibilities ranging from facilities conditions to proper storage and handling procedures and to making decisions about reformatting (digitizing, microfilming, and migrating or emulating). preservation is generally seen to be the crux or end result of archival work (although archivists destroy more than they save – a fact that surprises many outside of the field, as well as a good number within), and it is a focus archivists share with librarians and museum curators. preservation is a reality-check against all the hype of the wonders of creating, harnessing, and using more information than any other era in world history. there has been a tension between this reality and the possibility, promoted by futurists and pundits, of saving everything that is produced digitally. that possibility is usually based on the increasing power and capability of information technology and the decreasing costs of the technology, while ignoring social, political, cultural, and other issues.
however, it is certainly the case that what archivists have traditionally worked with is shifting from paper systems (and an emphasis on records as artifacts) to the digital (and an emphasis on the virtual). there will always be a need for conservators, for example, to work with historical documents and other artifacts. but the increasing efforts to digitize traditional holdings, to lessen wear on originals and to increase remote access, call for knowledge of digital technologies, new research and experimentation on issues like appraisal and selection, and new approaches to ensure reliability and authenticity of both digitized and digitally-born records. all of this suggests the need for continuous revamping of graduate archival education and perhaps hints at why such education in new ischools has great promise. the digital era has brought with it all sorts of new questions and challenges for those interested in preservation matters. how has the concept of preservation been challenged or transformed with the growing use of and dependence on digital systems? are digital advocates still arguing that all information sources can be saved and effectively used? what is the ideal weighting between traditional and digital preservation in educating archivists (and preservation administrators)? christine borgman, in her important new book on digital scholarship, casts it in this manner: “preservation and management of digital content are probably the most difficult challenges to be addressed in building an advanced information infrastructure for scholarly applications” (borgman, , p. ). her use of “curation” may not be necessary as a replacement for preservation, but at least it serves as a useful mechanism for representing preservation as a function extending from traditional documentary and artifactual sources to their digital surrogates.
the digital curation conference held at the university of north carolina school of information and library science in april and its ongoing project to build a digital curation curriculum may be another example of how traditional lis schools are shifting to support new archives education venues (for information, see http://www.ils.unc.edu/digccurr /). even archivists have tended to be fairly loose in their definitions. the increasing creation, maintenance, and use of records in electronic information systems have pushed archivists to try to be more precise. however, at the same time, these systems and the internet/world wide web have introduced more complex record genres pushing standard definitions or concepts derived from best practices and new needs. the work of the archivist has always been centered about the identification, preservation, and providing access to “records” possessing archival value, but there has been a growing recognition that the notion of records has shifted and expanded. a record has been defined as follows: n. ~ . a written or printed work of a legal or official nature that may be used as evidence or proof; a document. – . data or information that has been fixed on some medium; that has content, context, and structure; and that is used as an extension of human memory or to demonstrate accountability. – . data or information in a fixed form that is created or received in the course of individual or institutional activity and set aside (preserved) as evidence of that activity for future reference. – . an instrument filed for public notice (constructive notice); see recordation. – . audio · a phonograph record. – . computing · a collection of related data elements treated as a unit, such as the fields in a row in a database table. – . description · an entry describing a work in a catalog; a catalog record (pearce-moses, ).
some archivists adhere to a notion of archival science, based on the seventeenth century emergence of diplomatics, derived from jean mabillon's de re diplomatica ( ) and mostly fixated on determining whether a document is authentic or a forgery or a copy by examining internal and external characteristics. in north american practice, the notion of records was largely taken for granted, following general definitions created in government laws or best practices in corporate and other organizational settings. however, the increasing use of information technology led to the need to revisit basic definitions and to re-engineer the uses of older archival sciences such as “diplomatics” (see, for example, duranti, ). after a generation of largely ignoring the implications of the computer for the creation and maintenance of archival sources, archivists found themselves engaged in defining more precisely the notion of a record, the elements of recordkeeping systems, the concept of evidence, and other such matters. some major research projects, and a considerable amount of debate within the archival community, generated a large literature on the nature of the record. however, the establishment of the world wide web, other concepts of information documents, postmodern scholarship on the idea of the “archive,” and high profile legal cases all seemed to broaden the idea of the record far beyond what anyone could have imagined. cell phones, digital cameras, and other portable devices contributed to a broadening notion of how records could be used and what records represented. 
such changes and their implications for archives and recordkeeping, and the educational and scholarly reactions to these changes, may reflect some of the differences between the notion of archival studies (mostly seen as an all encompassing term for the knowledge supporting basic – some might say traditional - archival functions and practices) and archival science (based on the centuries-old concepts deriving from diplomatics and the reliability and authenticity of texts, now directed at digital systems). with many disciplines studying archives, and applying new theories and models to archives and recordkeeping, it may be that neither umbrella term is completely useful or meaningful at the present time (see, for example, cook, and ) – and this may be yet another reason for the potential of archival programs located in ischools (where other useful sciences reside and where additional research, reflection, and reformulation may occur). even those involved in some of the research projects have questioned some of their presuppositions and assumptions, while still remaining committed to the notion that records are important to society, institutions, and citizens. david bearman recently revisited the university of pittsburgh project of the early s on the functional requirements for evidence in recordkeeping and concluded that the basic structure for preserving essential evidence in digital systems is sound but not implemented by any archives (bearman, ). heather macneill has shifted away from some of the authoritarian perspectives reflected in the interpares project, and in one essay she considers the strengths and weaknesses of modern diplomatics, concluding that the diplomatics approach does not reflect the reality of electronic recordkeeping but provides a useful conceptual model for evaluating such recordkeeping. 
in her opinion, the projects utilizing diplomatics suggest that the reality of these electronic systems is that they are “too complex and diffuse for any one method to capture.” as a result, the archival community is left with lots of questions to ponder. are new digital forms of records still functioning as transactions of business with the elements of warrant, structure, content, and context still relevant? are researchers and others needing access to records still concerned about matters of authenticity and reliability as they once used to be? are new means of providing access to more complex digital information sources trumping issues of definition and maintenance? have the continuously emerging digital documentary forms eased the way for more postmodern notions of evidence and information? although practitioners may wring their hands over such matters, they represent wonderfully engaging and challenging issues to theorize about, conduct research about, and speculate about solutions in the future (such as the predictions about the emergence of the paperless office) (anderson, ). this brings us to the definitional issues surrounding ischools. just what are they and how do they differ from library and information science schools? while the emergence of ischools as a consortium is relatively recent, their origins reflect a more sustained dialogue among faculty and deans of a number of library and information science and related programs around the broader implications of information technologies on their curricula, their institutions, and the information professions. a summary of this dialogue (larsen, ) concluded: “informed by decades of debate and responding to exceptionally rapid changes in technology and uncertainty in public policy, ischools foster the development of an intellectual space where true interdisciplinarity plays out. 
in so doing, they introduce a range of challenges to traditional university structures and practices … as they create an environment where issues of information are addressed systematically, regardless of disciplinary heritage or presumed 'ownership'. in this way, ischools respond to the salient issues of the time by stressing the production of strong results. they are in a constant state of adaptation within their core competencies, while building necessary bridges among disciplines.” archival studies is clearly a vital participant in this interdisciplinary dialogue.
education and the formation of archival knowledge
it is easy for professional schools, often burdened with immediate concerns such as practitioner competencies and the sometimes political matters of credentialing and program accreditation, to ignore their own histories (labaree, ; khurana, ). archival studies or science programs are no exception. archives are ancient, and there were formal training programs for scribes in the ancient world. the modern archives profession is about a century old, dating to the late th century in europe and slightly younger in north america. the formal education of archivists emerged slowly, also grew slowly, and today it has a finger hold in library and information science schools and, to a lesser extent, in history departments. where are these programs going? the evolution of the education of archivists has followed a pretty clear path. initially, in the early twentieth century, individuals entered the field basically through a kind of informal apprenticeship or on-the-job training; some still enter the field in this manner. single graduate courses began to appear in history departments and library schools in the s, and this remained the prevalent avenue for any graduate education until the s.
in the s, a three course sequence appeared, mostly situated in what had become library and information science schools; this set of courses – usually an introductory course, an issues seminar of some sort, and a fieldwork or practicum – was endorsed by the first society of american archivists education guidelines in . also in the middle part of the twentieth century, we witnessed a proliferation of institutes, probably a reflection of the lack of comprehensive graduate programs and the preference by the field for skills training. the emergence and decline of public history programs, in the s to early s, including some coursework on archival studies, both enriched the discussion about the education of archivists and provided a distraction from ramping up the quality of graduate archival education programs. it is rather difficult even to argue that there was anything approaching what could be termed a comprehensive education “program” in this period. all of this began to change in the s, when universities, mostly in lis schools, began to hire regular, tenure stream faculty to teach in the archival studies area. soon, the saa guidelines began to concern more comprehensive education. within a decade, there were schools, again mostly in lis programs, hosting multiple faculty specializing in archives and related disciplines such as preservation and records management; this represented a remarkable shift from just the decade before when few thought there would ever be schools supporting one such faculty member. even more remarkable has been the growth of programs supporting doctoral students in the archives field; in , when this essay was first written, for example, richard cox had eight such students and anne gilliland at ucla had thirteen, more between these individuals than the entire field could boast two decades before. 
this is a very impressionistic sense of the evolution of graduate archival education programs, but there are some obvious characteristics we can point to in where we are today. while we have a number of programs with impressive clusters of courses and faculty, we have only a couple of separate masters degree programs, the preparation of new faculty members is not keeping pace with demand, and archival studies or science is seen as an uncertain appendage of information sciences or historical studies. even when new archival masters degrees have been announced, the focus seems to be more on teaching and professional mentoring than on research and knowledge creation (such as the recent creation of an online masters in archives and records administration at the san jose state university school of library and information science). support for graduate education from the professional associations is unsteady; they seem as much oriented to apprenticeship training and lowest-common-denominator concerns (as reflected in certification programs in saa and arma). with the exception of a few programs, preservation education is even more tenuous. how lis programs or ischools can proceed with educating the next generation of information professionals without some attention to the long-term maintenance of sources deemed to possess archival value and requiring preservation seems questionable if not foolhardy. will we digitize other materials only to see these digital surrogates disappear relatively quickly (when compared to how long older formats lasted)? will we continue to build information systems without being able to preserve records and their evidence or information needed over the long haul? it is not incorrect to suggest that most graduate archives programs are small, conservative affairs doing the best they can to orient students to the field.
when you are limited in faculty and the number of courses, you face challenges in dealing with the fast-paced change of digital information technologies. this is doubly difficult given the interests many students bring with them based on their exposure to archives as undergraduates often working with older records in museums, university special collections, and historical societies or historic sites. this is changing as students are learning about various technologies or growing up with them. however, it is a great leap we are still facing to get into newer areas of digital scholarship, electronic records management, and other such areas, partly because of strides such traditional repositories are making in dealing with digital systems. for example, a student interested in museums must know or may be quickly exposed to the uses of information technologies by these repositories. paul marty hints at this, writing, “museum informatics is the study of the sociotechnical interactions that take place at the intersection of people, information, and technology in museums” (marty, , p. ). in fact, the various authors in this compilation of essays argue that information science and technology “have changed the very nature of museums, both what it is to work in one, and what it is to visit one” (marty and jones, , p. xii). these technologies are providing new ways to study documents and artifacts as well as the means to provide different and more compelling interpretations both in the institution and by remote access. we see the same trends in archives and in other institutions – corporate, museum, and library – employing archivists.
the very nature of archival work is changing, and we need individuals who are intellectually engaged by the challenges the digital technologies are bringing to records and information systems; graduate archival programs situated in ischools might attract such individuals tomorrow where the traditional lis school tended to attract individuals interested in traditional records forms and the cultural and historical aspects of recordkeeping. in the past, these graduate archival education programs have been severely limited in their scope and flexibility. they have been generally focused on traditional records systems and archival principles built on or deriving from such systems, usually because of limited resources and faculties stretched often to teach in other areas as well as to try to provide service to the professional community. the traditional focus also occurs because so many of the incoming students have developed interests in archives and preservation through their orientation to cultural organizations such as historical societies, museums, and historic sites, such interests often prompted by their own undergraduate careers primarily in the humanities. obviously, we can detect a shift in this as well as these younger students grow up and mature with more sophisticated knowledge about and experience with digital information technologies and their undergraduate disciplines and the cultural institutions they visit reflect more involvement with a greater array of technologies. just as the quest for an understanding of the past (even if it is the most antiquarian of interests) engages these individuals, a growing preoccupation with the nature of information technologies and their potential use in harvesting historical data or re-creating the look, feel, and sound of the past also will cause them to demand a greater presence of technologies in the archives and preservation curriculum. 
we may ultimately see the same kind of emotional attachment to digital systems that we have been accustomed to seeing with the look of printed books, the feel of paper documents, and the touch of artifacts – sentiments that have often attracted certain people to the archives and preservation management programs in the lis schools or history departments. while alberto manguel gushes, “my books hold between their covers every story i’ve ever known and still remember, or have now forgotten, or may one day read; they fill the space around me with ancient and new voices,” (manguel, , p. ) there is no reason to think that we couldn’t say the same about the computers we carry with us or surround ourselves with. there are, of course, still challenges in developing an archives and preservation curriculum that fully integrates digital technology. while there has been increasing attention to electronic records management issues, usually presented either in a dedicated course or integrated throughout the curriculum, this has proved to be only one of many such issues needing to be confronted. there is also the need to teach about the historical evolution of records and recordkeeping systems and all the other core functional or knowledge areas (and their principles and applications) of reference and access, preservation, public programming and outreach, management, legal issues – just to provide a sample of such other concerns. understanding records and recordkeeping systems and technologies requires an understanding of nearly all the cultural, economic, political, historical, and other factors affecting the nature of these information or evidence systems. perhaps the greatest problem in dealing with such matters derives from the limitations posed by small faculties, adjunct reliance, the nature of archives in the immediate area of the university offering these courses, and other similar factors.
it is truly difficult to build comprehensive archival education programs when there are only one or two specialized faculty with regular appointments (who have a greater array of responsibilities than just teaching) or when archives and preservation programs in the immediate geographic area of the school are sparse or limited in their own scope of activities (how many graduate archival education programs have the opportunity to work with an archives program supporting a full-fledged electronic records operation?). there is no question that the archival community missed the boat in establishing archival education programs in an earlier era when there were more resources and a greater willingness to establish and populate such programs. and, to a certain extent, the identity of the existing programs is mostly shaped by their affiliation with a history department or library and information science school rather than their own sense of professional mission or disciplinary scope. such issues prompt even more self-reflection about what the future holds as lis programs evolve into ischools. it is not as simple as just worrying about how to orient traditional archival studies to new and emerging digital document and information forms. the notion of archives and the “archive” is becoming far more complex than how we used to imagine it. scholars from a wide range of disciplines -- literary and cultural studies, anthropology, history, sociology, political science, and other fields -- are studying archives or the “archive” and adding new understanding to what ought to be included in archival studies (some of this is reflected in some of the present graduate archival education programs, but there is reason to expect that the emerging interdisciplinary ischools also will encourage such research and scholarship). 
we have new and challenging notions of what a document represents and of how archives create and sustain public or collective memory; teaching in such an interdisciplinary way also pressures archives faculty to expand their own horizons of scholarly endeavor or to build new partnerships for collaborative research and teaching. to educate the next generation of working archivists requires more than merely teaching from basic practice manuals or assigning articles from the half-dozen or so leading archival journals. we need to immerse our students into a very large and deep ocean of interdisciplinary studies on the archive, ranging from academically-trendy cultural studies to the generally more staid information sciences. this broad and expanding scholarship represents a great range of notions about archives, archival documents, and archivists. while some archivists ignore this literature, or dispute its relevance for their own work, it is clear that this scholarly work is enriching our knowledge of the records archivists work with; it is easy for individuals working closely with personal papers, literary manuscripts, family records, and institutional documentation to take for granted the veracity, reliability, and usefulness of the materials (reading scholarly and other accounts about the nature and use of such documentation provides other useful perspectives enriching how we read and interpret these sources). this literature is also beginning to study archives and archivists in new ways, such as with the rich and deep literature on the idea of public or collective memory, an area where scholars of all sorts are studying not just museums, libraries, and historic sites, but archives (the records, the building, the institution, and the discipline) as well. for example, for several generations archivists clung to concepts of objectivity in their tasks of appraising and describing records. 
now, many archivists are far more aware of the ways in which they deliberately or inadvertently shape the documentary heritage. new insights, from literary and cultural studies scholars, have made archivists (at least some of them) more open to new forms of collaboration with both records creators and records users. new forms of scholarship -- embracing digital means of collaboration and access -- are also suggesting new uses of archives (both digitally born and digitized). recordkeeping, and the scholarship on it, represents, according to alistair tough and michael moss, a “relatively new field of study. the boundaries of the field are poorly defined and porous. this is characteristic of emerging disciplines and need not be a cause of professional insecurity” (tough and moss, , p. ix). but it is even more complicated than merely an emerging discipline. maria economou suggests the differences in considering real rather than virtual sources, arguing, “although viewing the digital version will never replace the experience of examining the original, in certain cases this is the only way to provide access to important objects that would have otherwise remained known only to a few scholars . . . . in this way, new technologies offer a medium which circumvents often-arbitrary limitations and boundaries imposed by the history of the collections, the vision of academic disciplines, practical consideration of space, or just chance” (economou, , p. ). integrating traditional, emerging, and new records or archival technologies is a difficult, but necessary, task for all archival educators. it requires them not only to contend with the problems of the present, but also to grapple with what has happened in the past and to examine comfortably the possibilities of the future. we have conflicting views (probably many conflicting views) of our present information or digital age, both within the archival community and outside of it.
for the moment, let’s just consider some dramatically contrasting perspectives. mark herring writes, “if we define knowledge as any bit of datum, right or wrong, factual or not, fraudulent or accurate,” then the digital world is fine, but “if this is the definition of information that we want, then, yes, the web should replace all libraries. on the other hand, if knowledge includes something about accuracy, appropriateness, balance and value then the web cannot arrogate to itself a place of preeminence to knowledge-seekers” (herring, , p. ). this captures a huge literature of speculation about the perverse effects of the digital universe on reading, publishing, and knowledge, or, and maybe more accurately, a growing nostalgia for the printed book and other traditional information sources. what gets lost in the position espoused here, however, is a basic understanding of what the web is, vs. a website, or an institutional repository, or a digital library. jeff gomez, in his discussion about the future of the book, strikes a somewhat different chord: “and so to expect future generations to be satisfied with printed books is like expecting the blackberry users of today to start communicating by writing letters, stuffing envelopes and licking stamps” (gomez, , p. ). gomez makes a good point, one that many would attest to today, including the authors of this essay. not a day passes that we don’t read from print, search on the web, and receive and respond to e-mail. it is even more complex than a belief or lack of faith in technology. well-known cultural historian anthony grafton suggests how we are in a complicated transitional area, a road with many wrong turns and misleading signage. “for now and for the foreseeable future,” grafton argues, “any serious reader will have to know how to travel down two very different roads simultaneously.
no one should avoid the broad, smooth, and open road that leads through the screen.” grafton also believes we need to be able to continue to examine original documents, taking what he calls the “narrow path”: “the narrow path still leads, as it must, to crowded public rooms where the sunlight gleams on varnished tables, and knowledge is embodied in millions of dusty, crumbling, smelly, irreplaceable documents and books” (grafton, , p. ). in other words, there will always be some of us who want to touch as well as see, to experience as well as ingest, what they read. this has interesting implications for how we think about archives and, certainly, how we educate the next generation of archivists. a quarter century ago, leading archivist f. gerald ham, hinted at the relationship between what archivists do and what they work with: “i subscribe also to the notion that our work, and indeed our behavior as archivists, is determined by the nature of the material we deal with: we are what we accession and process” (this is the theme of ham, ). at the moment the majority of archivists seem inclined to deal with traditional paper records, but there is a decided shift (and need) for working with digital records. fortunately, while the need is real, we may have some time to build the kinds of educational programs we need. christine borgman, considering the emerging area of cyberscholarship, writes, “we are currently in the early stages of inventing an e-research infrastructure for scholarship in the digital age. it may take twenty, forty, or sixty years to realize that vision, by which time the technology and tools will be quite different from today” (borgman, , p. ). while we must resist lulling ourselves into complacency, we can afford to understand that we have ample room for experimentation and exploration. 
nevertheless, archivists have struggled, over the past couple of decades, with the implications and products of new electronic information systems influencing the creation of records. in a recent survey about electronic records management, robert williams and lori j. ashley conclude, “most organizations have serious operational shortfalls regarding the processes by which they manage electronic records, one of their most important assets” (williams and ashley, , p. ). richard pearce-moses, while he was president of the society of american archivists, declared, “as we face the challenges of electronic records, we must also face our need for new knowledge. we need new tools for new materials. where to begin?” (pearce-moses, , p. ). ken thibodeau, of the u.s. national archives, added, “while we are still at the dawn of the digital era, before too many cultural assets are lost, and before the technology has raced utterly beyond our ability to catch up, we need to construct concepts, methods and operational systems that can preserve and provide access to digital information” (thibodeau, , p. ). these sentiments reflect a consistent notion that archivists are always, somehow, behind the 8-ball when it comes to dealing with electronic records and recordkeeping systems. however, archivists may be climbing out of this pit, as joanna sassoon suggests in describing the emergence of a “new culture within the archival profession”: “this culture would acknowledge that all formats in archival custody have specific needs which require specialist knowledge. these new specialists would be educated and trained using a new range of texts which build format specific understandings of archival material, their research potential and their requirements to preserve their ‘recordness’.
this approach may be embedded into our professional culture through creating an understanding that, like the new archival format of electronic records, all archival formats require specialist knowledge and skills” (sassoon, , p. ). what better way to help jump-start the creation of this new culture than by embedding archival studies programs in the emerging ischools? will it happen, actually, if we don’t work to make sure archives programs are within ischools, new ones or ones emerging from older traditional forms?

strengthening archival studies in ischools

as we have tried to demonstrate, graduate archival programs have been traditionally located in history departments and library and information science schools. over the past two decades especially, these programs have mostly shifted to the lis programs where some have developed fairly expansive curricular offerings and employed two or more regular faculty with the expectations of this faculty contributing to the broader research, teaching, and service missions of these schools (cox, yakel, wallace, bastian, and marshall, ). however, as some lis schools evolve into ischools, what does this suggest about what prospective archivists ought to be learning? given that students presently preparing to be archivists may be working far more with digitally-born documentary sources or making digitization decisions about traditional records (or, for some, exclusively working with digital materials within the next decade or so), it stands to reason that present students ought to be more fully grounded in the electronic information and recordkeeping systems while still learning about critical archival principles and where, why, and how these principles may be challenged by the new digital documents. this rationale correlates with the early motivations that led to the formation of ischools.
many of the founding ischools (see www.ischools.org) originated as schools of library and information science, for which the dominant focus had been on information and how people use information, while other ischools came from a tradition more closely aligned with computer science, in which the dominant focus was on technology and how technology serves human needs and interests. the ischools evolved in response to students’, employers’ and society’s needs becoming increasingly holistic in relation to information and information technologies. the curricula, the research, and, indeed, the schools’ missions, were expanding to address more explicitly the relationship between information, technology, and people. schools from both historic traditions recognized their convergence through a mutual commitment to learning and understanding the role of information supported by advancing technology in human endeavors. central to the evolutionary development of ischools has been the conviction that expertise in the management and use of all forms of information is required for progress in virtually any endeavor in science, business, education, or culture. information professionals’ core competencies must include both a sophisticated understanding of how humanity uses information (from the individual through society in the large) as well as proficiency in the enabling technologies and their applications. in other words, there is nothing in the new ischools that suggests exclusion of the archival realm; indeed, the kinds of elements being defined for these schools suggest an exciting new way to deal with the challenges of electronic records issues that have long challenged the archival community. the focus by archivists on evidence can be seen as merely a component of the information and information systems these schools are interested in. there is another promise here.
as ischools evolve and their partnerships grow by encompassing other schools far removed from the traditional lis realm, there may be new opportunities to expand the archival area into other sectors. archivists have long expressed the desire, captured in the writings by individuals like david bearman, terry cook, and margaret hedstrom (see, for example, bearman, ), to influence software designers and vendors, corporate entities, government regulatory agencies, and other creators and sustainers of records and information systems; could ischools represent a better venue for accomplishing this goal by equipping a group of new archivists well-versed in both archival principles and information technologies? what we might be seeking is the regaining of the ancient status of scribes, as models for archivists functioning as scholars of both recordkeeping and digital records and information systems. karel van der toorn contends that in the ancient world, the “scribes were not merely penman and copyists but intellectuals,” the “academics of their time.” in ancient israel, scribes were part of an exclusive group: “the skills of the scribes – of reading, understanding, and interpreting – commanded general respect. the scribes held the key to the symbolic capital of the nation” (van der toorn, pp. , ). philip brooks, more than three decades before this study of ancient scribes, provides a glimpse into how many archivists hoped to see their professional community function in a way that is much more vital to society and scholarly disciplines: “a competent archivist is to be looked upon as a scholarly colleague of the researcher, far more than solely a preserver and a caretaker. his knowledge of the sources can contribute materially to the user’s evaluation and understanding of them” (brooks, , p. ). at present, some in the archival world have lost this sense of the archivist in society or the archival mission.
most archivists complain either that they are invisible to society or that society and its organizations hold images of archivists as low-level clerks. some of this derives from misperceptions of records and recordkeeping as simply fodder for bureaucratic inertia or obstacles to be overcome. records as important safeguards for accountability, vessels of essential evidence, and foundations for social and corporate memory have been lost because archivists sometimes seem to portray the notion that they are merely antiquarians concerned in preserving documentary debris for the use of a few scholars, genealogists, and local historians. might this also be the result of how lis schools have been traditionally seen by many, and why library science has been supplemented by information science and why ischools have emerged with an even broader agenda and mission? teaching (and researching) about archival studies may provide a kind of liberating perspective for what we have had over the past half century or so as reflected in history departments and library and information science programs. seamus ross, as one example, suggests that, “digital archives combined with new technologies will liberalize scholarship. they will enable simultaneous access to a range of sources (both local and distant) and facilitate the use of research methods not possible with conventionally printed or hand written records.” ross perceives “digital information” as a “cultural product. as we think of physical products of culture as artifacts, so we should also be thinking of digital and electronic products as d-facts (or e-facts). these new products form an essential fragment of our cultural record” (ross, , pp. , ). and this can only occur in a new collaborative environment, as diane zorich argues: “no one can work in isolation on digital preservation and access issues because the needs and requirements are too great. 
we all benefit from (and generate) economies of scale, pooled expertise, larger funding, and more robust infrastructure when we collaborate. and collaboration means not just crossing over our museum/library/archives divisions, but entering whole new communities such as science, engineering, and the commercial sector.” zorich continues, “we cannot preserve a digital object or a digital collection in isolation: we must preserve the entire digital ecosystem where the object or collection is found” (zorich, ). this is where the ischools become such a relevant part of the solution. archival scholars and ischools’ academics may be independently converging on a synergistic set of needs and objectives. the ischools proponents advocate a holistic perspective inclusive of society, information and technology. this is built on a foundation of principles, traditions, and values that are the product of more than a century of practice in librarianship and, perhaps, half that in the advancement of computing and communications technologies. a broad base of technologies, standards, and policies has emerged, from marc, aacr and z39.50 supporting traditional library operations to tcp/ip, xml, and oai/ore enabling broader network-based access to information. the cyberinfrastructure program (to which the ischools have contributed substantial intellectual substance) is largely a federal acknowledgement of the emerging synergies that necessitate the development of information infrastructure on behalf of society in the large, of regional and disciplinary communities, and of individuals. the ischools arguably provide the one forum on campus where interdisciplinary scholarship can engage disciplinary scholars (e.g., biology, chemistry, history, humanities, social sciences) with information scholars (ischool faculty and researchers) in a coherent and scalable manner.
the ischools enable scholarly attention to the issues of information selection, curation, retention, and preservation that are of lesser interest to most disciplinary scholars, while also advancing the state of knowledge in these areas, fueled by the diversity of issues, traditions, and requirements of the separate disciplines. these interdisciplinary projects could easily evolve into an array of joint degree programs, minors, and related interdisciplinary educational opportunities that have been barely envisioned, but could redefine the image of information-intensive, multi-disciplinary scholarship. so why should archival educators care about the ischools, beyond the fact that many of them have evolved from more traditional lis schools? as has become clear, archives in a digital world introduce an entire new range of questions, challenges, and opportunities. but the challenges are not ones of mission or role, but ones of instantiation… what does it mean in the 21st century to preserve the “records” of a digital society? the ischools are the only places in academia that are prepared to approach these questions from a holistic perspective; indeed, this is the basic mission of the ischools – to explore, interpret, and advance society’s understanding and use of information as a “record” of its achievement. but just as information is meaningless without structure, organization, and context, archives needs a disciplinary context. are ischools a more logical venue for archival studies than lis schools, as they extend their reach through interdisciplinary relationships with other disciplines? lis schools that retain a focus on the centrality of the library as a service organization, while a valuable societal construct, are likely to be less relevant to archival studies that must engage each of the disciplines directly (especially as so much scholarship about archives or the archive has come from other disciplines or in a true interdisciplinary format).
the ischools’ efforts to not only develop a new image, but to also transform themselves into organizations that illuminate the future for information-intensive institutions (like our universities) are responding to the same motives and forces that are impacting the archival community, but they may be a bit ahead of the archival community, increasing the value returned to the archival community. could this be a natural alliance in which the total is, indeed, greater than the sum of the parts? the perspective adopted by the ischools in reflecting on their mission expands on the historic traditions of lis schools by thinking more broadly about society’s use of technology to generate, disseminate, utilize, and manage information. peter lyman’s report (lyman and varian, ) estimated the world’s information output as exabytes. a related study conducted four years later (gantz, et al, ) estimated the output to be exabytes, suggesting a growth rate approaching % per year by in humanity’s generation of information of all sorts. to place this in context, if you were to read one book a day for years, it would total about gigabytes (one ten-billionth of the information generated in ). and if the estimates of the san diego supercomputer center are applied (moore, et al, ), the cost of saving one online copy of all the information generated in plus three tape backups, using contemporary storage and server technology, would approach the national debt. clearly there is an ongoing need for curation and some careful consideration given to what is worth saving in an increasingly digital information society. many institutions anticipated that institutional repositories could provide a sufficient solution to the problem of preserving the intellectual output of their organizations, and eagerly installed popular open-source repository software packages such as fedora or dspace.
many of these same institutions were subsequently disappointed when such efforts were not rewarded by faculty enthusiastically depositing all of their papers, data sets, and related scholarly materials. despite the fact that research and scholarly communication is increasingly dependent on datasets so large that they evade human understanding and must be analyzed by machine, the infrastructure to support such communication through space and time remains to be developed. is this not a challenge made to order for the ischools and the archival profession? and here the archival profession offers something to ischools. the concept of archival appraisal, the identification of documentary sources with enough continuing value to merit their ongoing maintenance, may offer lots of value for grappling with the information glut. archivists can demonstrate that the challenge is not saving everything but saving the right stuff. in some cases, where data can be entirely regenerated, it may be preferable to avoid saving it in the first place. for archivists the challenge mostly in recent years has been the business of figuring out how to save the new digital documents and information systems. however, for information scientists and other professionals, the challenge may have been trying to figure out how to maintain everything. indeed, several prominent researchers have suggested that the cost of manual metadata generation makes it cheaper to save everything than to curate and catalog it. a partnership seems in order, and ischools perhaps provide the vehicle for this. for example, archivists are well aware that their legacy holdings in traditional formats can’t all be digitized due to issues of resources and other responsibilities. do information scientists really understand that they probably can’t save everything? even the internet archive is only taking periodic snapshots of the web and not even capturing the largest portion of the web, the deep web (see arms and larsen, ). 
the evolving demands of escience and other data-intensive domains clearly require disciplined attention to the development of curation and preservation strategies appropriate to the time. irreproducible primary data and evidence, for example, should be routinely captured at the source through an infrastructure that can be tailored to specific needs, interests, and preferences, but does not require subsequent overt attention by its users. metadata should, to the greatest extent possible, be generated automatically at the point of data capture. in addition, though, social networking experiences have demonstrated the value of enriching data through the annotations of users (including their profiles). the intention here goes beyond organizing the vast and growing collection of digital content for access and usage by humans, to include the even more challenging, and potentially more valuable, access and analysis by computers. as rick luce observes (see arms and larsen, ), we need “applications that support not just links between authors and papers but relationships between users, data and information repositories, and communities. what is required is a mechanism to support these relationships that leads to information exchange, adaptation, and recombination.” rather than debate or delay the inevitable necessity of dealing with pervasive digitization, ubiquitous access to information by both humans and computers, and at-risk digital content, archivists working with (or through) ischools can proactively help society not only understand the urgency and importance of these issues, but also to develop long term solutions. these solutions, while enabled by technology, must go far beyond the technical infrastructure to also address issues of policy, human needs and motivations, intellectual property rights, economics, privacy, security, and a host of related concerns. 
dealing with challenges such as these relates to how archivists have played around with the life-cycle concept of records. the life-cycle concept developed as a means of visualizing how and when archivists might work with the records. at its earliest point, the concept suggested that archivists deal with records at the end of their life and that their colleagues (the records managers) deal with the records at earlier stages. with the growing use of electronic records, many archivists began to advocate for archivists to be much farther up in the cycle, even helping with the design of records systems to ensure that archival records could be captured. some even thought that the records life cycle was obsolete and suggested the records continuum concept allowing for systems to capture archival records from beginning to preservation, even suggesting that many electronic records do not go into an inactive stage but are always active. anyway, the issues outlined here suggest that ischools could enable a new kind of curriculum for archival studies whereby a good deal of focus could be placed on such design issues and with how to work with designers, vendors, and other information professionals. the curriculum might build on the notion of “content” becoming a recognized component of “infrastructure,” as described in the nsf/jisc cyberscholarship report (see arms and larsen, , p. ). given this broad construct, one can then identify a range of value-added services to which users could subscribe. gregory crane (see arms and larsen, , p.
) identified a family of such services that would be of particular value to the humanities, including services (1) to automatically catalog discrete objects within collections, (2) to recognize semantically significant elements embedded within collection objects, (3) to customize the selection and presentation of materials to the needs and interests of a particular user, and (4) to support structured user contributions such as those emerging in social networking websites. the curriculum would also need to reconsider curation itself, moving into realms beyond the physical artifact. the uk joint information systems committee (jisc) has taken some initial steps in this direction by fostering the development of “data journals” as a new form of scholarly publication (overlay journal project, -present). a data journal is a peer-reviewed, reputable vehicle for scholarly communication that explicitly recognizes the intellectual challenges and value in creating credible sources of high quality data. the andrew w. mellon foundation has supported similar efforts in the humanities (nowviskie and mcgann, ) and archaeology (see save). these pioneering projects and others like them are fundamental to developing an understanding of the challenges in developing large-scale, coherent and consistent collections operating on robust and reliable systems, providing access and services to a large and distributed clientele. there are opportunities for leverage here. as we have seen, the issues confronting ischools, in general, and archival studies, in particular, share much in common, and each has a lot to do with the overwhelming impact of digital technologies. these broader sets of issues have attracted much attention, from the nsf’s blue ribbon panel on cyberinfrastructure and the acls’s study of cyberinfrastructure for the humanities and social sciences to the nsf’s formation of the office of cyberinfrastructure.
if anything, the european emphasis has been even stronger through their framework programmes in escience. these considerations have also led to fundamental questions regarding what constitutes the scholarly record, a question that by now you will recognize as one that has occurred to archivists before. as research increasingly draws on (and generates) vast quantities of data, we have seen that data, itself, become part of the archival record of scholarly accomplishment. while well-known pioneering projects are forging new paths and new forms of scholarship, we have yet to reflect on these projects from the perspective of archival requirements. who will do this? genomists? astronomers? physicists or chemists? not likely on their own, and not likely archivists on their own. the necessary partnerships are yet to be forged through teams that include discipline specialists and archivists. is this not a natural direction for the ischools with archival programs? might it even be a reason for others to develop them? the challenges inherent in this venture are multifold, spanning issues that are purely technical to ones that impact directly on public policy, economics, and the traditions of various scholarly communities. in the technical arena, for example, the variety reflected in the scale, structure and internal complexity of materials as diverse as digitized books, scientific data, web pages, courseware, and annotated greek manuscripts can too easily lead to a perceived need for custom approaches that fall short of being considered “infrastructure.” on the other hand, this same variety effectively precludes a single approach for all categories of content. some middle ground must be found that accommodates a wide variety of content through a manageably small set of approaches. 
the magnitude of the transformation that seems inevitable to some of us will likely impact directly on our most-cherished human organizations, their traditions, motivations, incentives, economics, and legal frameworks. how will we sort out the nature of this transformation, if not through a colloquy between those most knowledgeable about the core issues and those most knowledgeable about the disciplinary cultures? are not the ischools and their archival scholars placed well to consider the core issues? there are alternative models to consider in managing the growing scale and complexity of the scholarly record. where should the locus of responsibility fall? will the traditional model of scholarly publishing, led by a few industry giants, adapt to the competing interests of profitability and more open access? might the role of supercomputer centers, which were initially established in response to the accelerating need for computational power, expand their mission to become superdata centers in response to the accelerating growth of information? how will scholars, students, and the general public be assured of access to not only the publications that have traditionally supported creativity, entrepreneurship, and intellectual advancement, but also to the multimedia resources, models, simulations, software, primary data, statistical records, and other diverse information resources that are now part of these endeavors? increasingly restrictive intellectual property rights (ipr) provisions and aggressive business practices suggest this will continue to be a difficult and complex challenge. whatever approaches ultimately prevail will need to include consideration of stability and sustainability. an infrastructure, by definition, must satisfy this attribute, and it must apply not only to the technology, but also to the content (the data) and to the organizations engaged. 
These are formidable but not necessarily overwhelming challenges that could benefit from the long-term, sustained attention of iSchools and archivists. If anything, as the challenges grow more and more complex, they increasingly move into areas that archivists (not to mention information scientists and librarians) have not had to spend too much time worrying about in the past. But now the variety of issues is growing quite complex, from the technical issues of managing immense volumes of data with intricate structures and complex interactions, to legal issues that impact directly on individual use of information resources, to the economic interests that arise around the commercial potential (real or imagined) of information resources. Then there are the differing traditions among disciplines regarding their information: those for whom the monograph is dominant, for example, versus those for whom immediate additions to a shared database represent valued scholarly contributions, and those where new media are the venue for establishing records of creativity.

Few of our institutions, organizations, policies, and traditions welcome and adapt quickly to fundamental change. Resistance is natural, if not futile (to recall an aphorism from the not too distant past). The landscape of scholarly communication is being transformed by digital media, though, and we need to get ahead of this trend and position our iSchools as true thought leaders. We may need, for example, to be less sanguine about the industrial takeover (by Google, for example, or perhaps you prefer Elsevier) of our creative outputs. While the current focus may be on documents, copyright, and fair use, one can easily imagine this debate growing to include models, simulations, and data. When the nation felt challenged by international competitors in high-end computing, the federal government saw fit to compete head-on by establishing supercomputing centers and investing in high-end computing research.
Now that research is becoming increasingly dependent on voluminous data resources, should we be building superdata centers? If so, does this not suggest a role for a new generation of archivists and new archival theory? Might it not be the case that the staid (some might say stodgy) discipline known as archival studies might, in fact, provide a window to our future? Having matured beyond the fantasies of storing everything, it is the archivists who have thought the most rigorously about clearing out our attics, about preserving the necessary evidence of our existence, and about representing the essence of our disciplines through appropriate models. It is the archivists who have clarified our understanding of both the best (the "hero stories") and the worst (the "horror stories") through illustrative and analytical case studies. Despite the magnitude of the transformation brought about by digital technologies, it is the archivists (and, yes, the librarians) who have made a career out of understanding that, whether analog or digital, it is all information, and that there is a set of principles and practices that transcends the medium.

Closing thoughts

So we have spent some time exploring the domain of archival studies and the changing landscape of scholarly communication, all with an eye toward the iSchools. And if we come away somewhat persuaded that the iSchools are a reasonable (if not logical) home for archival studies, do the archival studies bring a larger value proposition to the iSchools? It may well be the case that the values and vision that have developed in archival studies over the past century can inform our broader path in the 21st century. The difficult issues of digital preservation have been recognized in the iSchool community for some time, but perhaps we need to pay greater attention to related issues of selection and curation.
We may find case studies buried in the archival experience that provide dramatic insight into choices yet to be made regarding digital archives. At the very least, there is value in recognizing and appreciating the perspective and foundations of one of our niche sub-disciplines that may well become of greater significance than many would expect, perhaps even contributing to the transformation of our digital futures and (who can say) maybe even elevating the practitioners of that sub-discipline back to the status they enjoyed in the ancient world.

Discussions - Description and delimitation of the field

The origins (recent, remote, very remote) of informatica umanistica (humanities computing) are intertwined with the study of texts, first through printed books and then through 'digital books'.
The received account, widespread in Italy and abroad, places the origins of informatica umanistica (humanities computing) in Roberto Busa's work on the Index Thomisticus, which entailed building a digital library ante litteram, because the texts of the Thomistic corpus were transcribed in full onto punched cards so that they could be read and then processed by computers. In the second half of the last century, however, there were at least two other founding projects for the field, both for the influence they had in shaping it and for the role or effects they continue to have: the studies on the Greek Bible of the Septuagint and the Thesaurus Linguae Graecae. In both cases the digitization of the works (the books of the Greek Bible in the first case; the works of archaic and classical Greek literature, later extended to the Byzantine period, in the second) gave rise to collections of digitized texts, at the time often called corpora or «textual databases». The primary purpose of the users of these corpora was, and still is, the search for information within texts, that is, an activity central to library and information science. That search is driven by a new way of reading the literary text, one based on concordances, in which one looks for significant or obscure words and studies the words on the basis of their contexts and the contexts on the basis of the words they contain. It is a kind of reading that does not require digital technology and can be practised on printed texts or even on manuscripts. Indeed the first concordances, which are a type of scholarly publication, were produced in thirteenth-century Paris by Hugues de Saint Cher and took the Latin Bible as their object, while the first studies to infuse the results of concordances with mathematics and statistics are located in Eastern Europe at the beginning of the nineteenth century and are likewise based on collections of texts, for example the works of Plato.

It is, however, a kind of reading rarely practised by scholars, despite Gianfranco Contini's famous invitation: «All we need do is start preparing some punched cards for our philological 'robot': compiling excerpts, or rather complete glossaries, of as many texts or authors as possible, even short ones. As I have had occasion to suggest before, I urge you to concordances». One reason is simply that printed concordances of an author were a rather rare editorial product, given the creative and production effort they required (it is no accident that Contini's invitation came in a digital context, speaking of «punched cards for our philological 'robot'»; but even so it was no simple matter in operational terms).

Note: Intersezioni. Digital humanities e biblioteche (Digital humanities and libraries), by Maurizio Lana. AIB studi, vol. n. - (January/August ), p. - . DOI . /aibstudi- ISSN: - , e-ISSN: - .

Note: Maurizio Lana, Università degli Studi del Piemonte Orientale "Amedeo Avogadro", Dipartimento di Studi Umanistici, Vercelli, e-mail maurizio.lana@uniupo.it. This research was carried out with the support of funds provided by the Università degli Studi del Piemonte Orientale "Amedeo Avogadro". Websites last consulted: May .

Note: On this topic see the reflections that Hjørland develops over some twenty years, from Birger Hjørland, Library and information science: practice, theory, and philosophical basis, «Information Processing & Management», ( ), n. , p. - , doi: . /s - ( ) - ; to Id., Library and information science (LIS), part , «Knowledge Organization», ( ), n. , p. - , doi: . / - - - - , with ample space given to how the semantic and thematic constellation of information seeking characterizes the disciplinary field of LIS.
Printed concordances were available only for the cornerstones of culture and thought: the Bible, Virgil, Seneca, I promessi sposi, to give just a few examples. The availability of texts in digital format, and of a variety of programs for generating concordances or, more generally, for searching text within texts, has greatly facilitated this mode of study, which nevertheless remains uncommon.

Investigating the relationship between the disciplinary field of library and information science and the digital humanities brings out a variety of themes that help both to outline what place library and information science can have in the current development of the digital humanities and to understand the past of the digital humanities better. Because the context of study is the digital world, the libraries discussed here are themselves digital libraries, that is, libraries whose content is in digital format, in that the text of the works is available to readers/users in one of many text formats such as TXT, EPUB, XML, PDF and others; but they also rest on a precise and innovative conception of the organization of knowledge. The libraries that we have here called digital (in English, digital libraries) are also called electronic libraries or virtual libraries. The three expressions are not neutral: they have specific meanings, partly tied to historical phases, and so they are not equivalent. A similar analysis applies to the expression digital humanities: years ago in Italy this field of study was called «informatica umanistica», but that name has gradually been supplanted by the broader Anglo-American one.
Investigating, concisely, the history of these names helps to grasp the evolution of fields in which the presence of technology generally ends up flattening everything onto an eternal present (this happens in all technological fields, but it is even more evident in those that use information technologies), which hinders in-depth knowledge. From a more multifaceted and better-grounded knowledge of the digital humanities one can then move toward a reflection on what the relationship between digital humanities and libraries might be in the future.

Note: Gianfranco Contini, Esperienze di un antologista del Duecento poetico italiano. In: Studi e problemi di critica testuale: convegno di studi di filologia italiana nel centenario della Commissione per i testi in lingua, Bologna, - April . Bologna: Commissione per i testi di lingua, , p. .

Note: Alberto Salarelli; Anna Maria Tammaro, La biblioteca digitale. Milano: Editrice Bibliografica, .

Note: Ben Showers, Does the library have a role to play in the digital humanities?, «JISC - Digital Infrastructure Team», February , .

The libraries referred to in this contribution are essentially those of the research circuit (academic, special, research libraries, etc.). This might seem to lead to a discussion of no interest to public libraries (civic libraries, local community libraries), but that is not necessarily so:

"I think librarianship can go further by incorporating digital humanities computing techniques into our systems and services. For example, why not provide concordance services against all of the full text items in our collections. Why not allow readers to create small corpuses of library content and then provide n-gram services, entity-recognition services, or parts-of-speech extraction service against the result."
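To make the services named in the quotation above more concrete, here is a minimal sketch of a keyword-in-context (KWIC) concordance and an n-gram counter. It is only an illustration under stated assumptions: the tokenizer is deliberately naive, and the function names (`tokenize`, `kwic`, `ngram_counts`) and the sample sentence are hypothetical, not taken from the article or from any library system.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive, for illustration)."""
    return re.findall(r"\w+", text.lower())


def kwic(tokens, keyword, width=3):
    """Return one (left context, keyword, right context) row per occurrence."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


def ngram_counts(tokens, n=2):
    """Count contiguous n-grams (as tuples of tokens) in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sample text standing in for a small corpus of library content.
sample = "The library keeps the text. The reader searches the text for a word."
tokens = tokenize(sample)

# Concordance: every occurrence of "text" with two words of context per side.
for left, kw, right in kwic(tokens, "text", width=2):
    print(f"{left:>15} | {kw} | {right}")

# N-gram service: the most frequent bigram in the sample.
print(ngram_counts(tokens, 2).most_common(1))
```

The concordance rows show each occurrence of a word together with its neighbours, which is the reading practice the article describes: studying words through their contexts and contexts through their words. A real service would need language-aware tokenization and an index rather than a linear scan.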
What Showers envisages is certainly a complex and scientifically high-level outcome of integrating digital humanities and library and information science, but providing «concordance services against all of the full text items in [the] collections», that is, making full-text searches possible within the collections, would help all readers, in any type of library, and above all the less expert readers, to find the publications of interest to them. A similar consideration holds for providing an entity-recognition service (the step that follows a concordance service): it too has a dimension of practical usefulness for all readers, although at first sight it appears to be focused on scholarly research. Naturally the picture outlined by Showers is not simple to realize in the short term, here and now, but its significance is prospective: it indicates an overall aim that can inform the development of activities and services.

Finally, let us recall, obvious though it is, that the digital in the library does not mean only the digital library but also the creation of a communication infrastructure aimed at fostering the engagement and participation of new audiences through specific strategies of communication and digital storytelling.

Electronic, virtual, digital libraries

Rather than relying on personal interpretations, it seems appropriate to look as far as possible for documentary evidence, or at least clues, about the three adjectives applied to the library: electronic, virtual, digital. Google Books' text-analysis tool «Ngram Viewer», although it does not operate on texts later than , provides interesting indications on how the presence of the three expressions digital library, virtual library, electronic library in English-language printed texts has evolved over time.
Note: A named-entity recognition service identifies (and as far as possible disambiguates) personal names, place names, units of measurement, distances and dates in a text.

Note: Maria Cassella, Comunicare con gli utenti: Facebook nella biblioteca accademica, «Biblioteche oggi», ( ), n. , p. - ; Juliana Mazzocchi, Blog e social network in biblioteca: strumenti complementari o antagonisti?, «Biblioteche oggi», ( ), n. , p. ; Gino Roncaglia, Social network e riconquista della complessità: il ruolo della biblioteche, «Biblioteche oggi», ( ), n. , p. . The Stelline conference of was entitled "La biblioteca connessa: come cambiano le strategie di servizio al tempo del social network".

Figure – Frequency of electronic library, virtual library, digital library in the English-language books of Google Books

As Figure shows, the expression electronic library arises first, and its presence shows a long wave peaking around , which corresponds (cf. Figure ) to the time when the adjective electronic was used to characterize aspects of the computing world: electronic journals, electronic mail, and so on.

Figure – Frequency of electronic library, electronic journals, electronic mail in the English-language books of Google Books

Around , virtual library is the first to appear; it reaches its peak of use at the end of the 1990s, in correspondence (as Figure shows) with the spread of the theme of virtual reality in public discourse. Immediately afterwards (cf. again Figure ) comes digital library, whose presence is quantitatively very significant, much more so than that of the preceding expressions, signalling a progressive spread and consolidation both of the thing denoted by the expression and of the discourse about it.

Figure – Frequency of virtual library, virtual reality in the English-language books of Google Books
The year , in which the beginnings of the two expressions virtual library and digital library fall, is a key year for the digital world: it is the year in which Tim Berners-Lee invented the HTTP communication protocol and the digital space it defines, that is, the web. In itself, and at its birth, the web is nothing more than one of the various pieces of software that define environments for communication and exchange (among those born in those years, almost all now obsolete, one may recall FTP, Gopher, WAIS, Archie, Veronica, Netscape), but thanks to its versatility it rapidly becomes the environment in which every other activity can take place, so much so that today it has supplanted the other environments and ends up being identified with the Internet (and vice versa). It is therefore not surprising that it is at the beginning of the 1990s that expressions denoting digital objects of a new kind (virtual library, digital library) begin to come into use, digital objects that indicate the existence of innovative environments attracting the interest of an ever larger number of actors.

In Italy the situation is similar (or rather, the books written in Italian that are available in Google Books outline a similar situation), as can be seen in Figure .

Figure – Frequency of biblioteca digitale, biblioteca virtuale, biblioteca elettronica in the Italian-language books of Google Books

Here too (although less clearly than in the books written in English) one notes the peak of the curve for «biblioteca elettronica» around , the peak for «biblioteca virtuale» towards the end of the 1990s, and then the marked quantitative growth of «biblioteca digitale».
For the Italian sources, too, the temporal coincidences already shown for the Anglo-American context hold: the use of «biblioteca elettronica» is contemporary with that of «calcolatore elettronico» and «riviste elettroniche», and the use of «biblioteca virtuale» peaks together with that of «realtà virtuale».

What emerges from this? That as the perception and representation of the relationship between computers and society moves and evolves, so does the name of the library that connects and relates to that world. What is most interesting is that this is a datum, not a hypothesis, and it shows that «the library», often conceived both from inside and from outside as a very stable entity (static: after all, few cultural institutions have so long a history and are so self-identical as libraries), is in reality also capable of changing to follow the transformations of the times and of society. At the same time one notes that the expressions «biblioteca digitale» and «biblioteca virtuale» are both in use today, although in different proportions. Here we shall use the expression «digital library» because the adjective 'digital' correctly indicates a relevant characteristic of the library's content and refers to the fact that the library exists in the digital world; 'virtual', by contrast, speaks of its form, and does so improperly, because virtual denotes what exists in potency but not in act, whereas libraries, however much they are called virtual, exist in act, in the digital world.
The beginnings of the digital humanities

Matthew Kirschenbaum places the birth of the expression digital humanities, which is currently used to denote the field of humanistic studies in which information technologies are used, at two mutually independent events of : the publication of the handbook entitled A Companion to Digital Humanities, and the union of the American Association for Computers and the Humanities and the European Association for Literary and Linguistic Computing in a new federal entity named the Alliance of Digital Humanities Organizations. To these was added, in , the establishment within the US National Endowment for the Humanities of a permanent programme named «Digital Humanities». At the beginning of the 2000s the field now called digital humanities was called in English (as can also be seen from the names of the two American and European associations just mentioned) computers and humanities, humanities computing, literary computing, while in Italian one spoke of «informatica umanistica» and «linguistica computazionale».

Writings on topics relating to digital libraries and digital humanities, but also on digital humanities topics in a broad sense, often open with a brief description of what is to be understood by digital humanities. This implicitly indicates that it is a field of study whose content and methods are not so well known, and that they must therefore be to some extent declared and explained to those outside the field. The definitions and descriptions not only vary in content but also differ substantially from one another. Some that make this high variability evident are collected below.

"The term digital humanities is being referred to more and more, as the crossroad of information technologies and traditional humanities research. In my short definition, it is the application of information technologies to analyzing humanities as well as many interdisciplinary subjects."

"The fields of humanities computing and digital humanities have been evolving over several decades. Our working definition is "application of digital resources and methods to humanistic inquiry" […]. Some consider the "process" of DH to be part of the scholarship, while others see published outcomes as the only true coins of the realm. The unit of DH is the project, which often requires a one-off approach."

"As most studies on the subject agree, the date of origin of the tradition of informatica umanistica is , the year in which Father Busa's Index Thomisticus project saw the light. The idea of the avant-garde Jesuit from Gallarate was precisely to produce an index of lemmatized concordances of all the words in the textual corpus of Thomas Aquinas and other related works."

"By "digital humanities" we mean not only philological applications but any support of cultural-historical research using computer science."

Note: The operation of transferring content from the physical world to the world 'of computers' is properly called digitization. As an effect of the format of its content, the digital library then has a series of ways of working and of opportunities for presence in society that are specific to it: «From digital libraries as 'resource centres' to libraries as 'community centres'! […] The digital library is not what is commonly understood, namely a repository of digital content with associated search services. The central idea of the concept of the digital library is that the facilitation of knowledge and social action must go together: there are many possible social constructions of the world, and each of them leads to a different action for different communities.» (Anna Maria Tammaro, Biblioteca digitale partecipata: le sfide per i bibliotecari, «AIB studi», ( ), n. , p. , doi: . /aibstudi- ).

Note: The name digital humanities is recent. Previously, in the English-speaking world, the dominant expression was humanities computing. In Italy too digital humanities is currently in use and has supplanted informatica umanistica. In these pages, for simplicity, we shall generally (and in some cases anachronistically) use the expression digital humanities, because that is how this field of study, which nonetheless already existed earlier, is named today. Changes of name obviously entail shifts of perspective and are therefore not irrelevant, and this will be taken into account in the following pages.

Note: Matthew G. Kirschenbaum, What is digital humanities and what's it doing in English departments?, «ADE Bulletin», ( ), p. - , doi: . /ade. . .

Note: A Companion to Digital Humanities, edited by Susan Schreibman, Raymond George Siemens and John Unsworth. Malden, MA: Blackwell, , .

Note: It is interesting to note that the very first attestations of the expression digital humanities are found in and in publications from the library field. In Dennis Dillion, The changing role of humanities collection development, «The Acquisitions Librarian», ( ), n. - , p. , doi: . /j v n _ , one reads: «As we have already seen, a good number of the currently available digital humanities resources are simply a reformatting of materials which the typical library already owns»; and in David Green, The National Initiative for a Networked Cultural Heritage, «Information Technology and Libraries», ( ), n. , p. , one reads: «Two early offshoots of its "Computing & Humanities" initiative, cosponsored with the National Academy of Sciences, have been an internationally distributed database of digital humanities projects...». We call them very first attestations because they did not change the context and did not enter current discourse.
Note: Hitoshi Kamada, Digital humanities: roles for libraries?, «College & Research Libraries News», ( ), n. , p. - , doi: . /crln. . . .

Note: Jennifer Schaffner; Ricky Erway, Does every research library need a digital humanities center?. Dublin, Ohio: OCLC Research, , p. , .

Note: Federica Perazzini, Words, bytes and numbers: le digital humanities "viste da vicino", «Status Quaestionis», ( ), n. , p. - .

Note: Dominic Oldman; Martin Doerr; Gerald de Jong, Realizing lessons of the last years: a manifesto for data provisioning and aggregation services for the digital humanities (a position paper), «D-Lib Magazine», ( ), n. - , note , doi: . /july -oldman.

"Developed in the late s, the digital humanities primarily focused on designing standards to represent cultural heritage data such as the Text Encoding Initiative (TEI) for texts, and to aggregate, digitize and deliver data."

"Given recent large investments in projects such as Bamboo, DARIAH, and CLARIN, there seems to be a certain consensus among funders and policymakers that there is a real need for the humanities to shift its methodology into the digital realm. The report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences, for example, heralds digital and computational approaches as drivers of methodological innovation in humanities."

The variability of the descriptions is linked on the one hand to the objective complexity and ramification of the field of study, but on the other also to the fact that this complexity seems somehow to justify anyone giving a personal description of the field, which generally does not happen in other disciplinary or scientific areas.
There is nonetheless a shared and widespread account of the beginnings of the digital humanities, which are usually located in the work of Roberto Busa, based in Gallarate, on the Index Thomisticus (the concordance of the works of Thomas Aquinas), a project that involved the use of IBM computers at a time when nobody thought a computer could process anything other than numbers.

Unlike many other interdisciplinary experiments, humanities computing has a very well-known beginning. In […], an Italian Jesuit priest, Father Roberto Busa, began what even to this day is a monumental task: to make an index of all the words in the works of St Thomas Aquinas and related authors.

Hockey is an authoritative scholar but, in her case as in that of many who deal with this topic, this is an explanation after the fact: the impact of Busa's work, from its computational beginnings in […] until about the 1980s, work that was extremely focused on Thomas Aquinas, on the then nascent context of Italian humanities computing is hard to delineate. On the one hand, for example, he contributed to the Almanacco Bompiani of […] devoted to the applications of electronic computers to the moral sciences and literature; Busa was also among the supporters and collaborators of the Lessico Intellettuale Europeo, born in […] out of experiences of some years earlier; and Antonio Zampolli, who later founded the Istituto di Linguistica Computazionale of the CNR in […], trained at Busa's Centro per l'Automazione dell'Analisi Linguistica in Gallarate in the years following his degree, obtained in […]. On the other hand, fewer than ten of Busa's scientific articles published between […] and […] and devoted to the 'computational' aspects of the Index project can be found online. We mean that for many years Busa's work proceeded with limited circulation and little public scientific sharing, for a series of obvious reasons: the time in which he began, the vast distance of his project from the practice of humanistic studies at that time, and the way journals were published and circulated. As a result, for a long time Busa's way of working was not widely known and therefore did not set (was not able to set) anything else in motion. So Busa was not an initiator in the sense of an individual who gathers and catalyses diffuse energies and manages to set them in motion: years passed before one could say that a field that could be called humanities computing existed in Italy. Then, at a certain point, when humanities computing took hold, Busa found a space that recognized him. Indeed, the first edition of the Index dates from […], that is, from a time when some people were certainly carrying out embryonic literary work with computers; but these were isolated activities, not the expression of a broad field of activity and study.

Stefan Jänicke; Greta Franzini; Muhammad Faisal Cheema, On close and distant reading in digital humanities: a survey and future challenges. In: Eurographics Conference on Visualization (EuroVis) - STARs, edited by R. Borgo, F. Ganovelli, I. Viola. [Geneve]: The Eurographics Association.
Joris van Zundert, If you build it, will we come? Large scale digital infrastructures as a dead end for digital humanities, «Historical Social Research / Historische Sozialforschung».
Susan Hockey, The history of humanities computing. In: A companion to digital humanities cit.
Date of Busa's first meeting with the president of IBM, Thomas Watson.
Almanacco letterario Bompiani: le applicazioni dei calcolatori elettronici alle scienze morali e alla letteratura, edited by Sergio Morando. Milano: Bompiani.
When the Internet and e-mail were invented, by contrast, some of the obstacles to communication fell away and, for example, information about CATSS and the TLG circulated among scholars far more widely, though only in part through journals, which were still printed. Busa was instead an initiator in the sense of being first: for many years he found nobody to follow him with comparable projects in the philosophical, linguistic and literary context, because he was too far ahead and nobody around him even knew what he was doing, so to speak. Part of his greatness for the field of the digital humanities lies in this solitude of a pioneer who long went unrecognized.

So to maintain that the digital humanities began there, meaning that Busa's project was the first to bring together a vision, computers and a large number of female workers focused exclusively on digitizing texts and managing them, is undoubtedly true; it is less simple to state that, and how, the project had a 'seminal' significance for the whole field later called humanities computing, that is, that 'other projects were born directly from it' and continued and developed its experience and knowledge.

The 'question of beginnings' is in itself complex in many disciplines, and the digital humanities are no exception. In the opinion of the present writer, the digital humanities had a polycentric beginning dispersed over time: choosing which beginning counts depends on what one considers most relevant. In our time there were at least two other beginnings in the United States, independent of Busa's. The first is the so-called Septuagint studies, the study of the Septuagint, the Hellenistic-era Greek translation of the Hebrew Bible. For the birth and development of the Computer Assisted Tools for Septuagint Studies (CATSS) project in […]-[…] within the Computer Center for the Analysis of Texts (CCAT), co-directed by Robert Kraft (University of Pennsylvania) and Emanuel Tov (University of Jerusalem), the location at the University of Pennsylvania was decisive: it was there that ENIAC, the first general-purpose digital computer in history, had been created in […]. Indeed, whereas the Index Thomisticus was a research initiative whose main innovative feature was the amount of data to be acquired (the kind of study tool to be built, the concordance, had long been known), CATSS stood out for the variety and novelty of the computational processing to be carried out on the texts: from handling works written in languages (Greek and Hebrew) that use non-Western characters, to automatic morphological analysis of the texts, to the creation of an aligned Greek-Hebrew parallel text with management of textual variants. The project involved students, doctoral candidates, university staff and external scholars, as well as, as already mentioned, collaboration between the University of Pennsylvania and the University of Jerusalem.

The republication of a corpus of Busa's writings in the volume One origin of digital humanities: Fr. Roberto Busa in his own words, editors Julianne Nyhan, Marco Passarotti. New York: Springer Nature, is therefore welcome and important.
Index Thomisticus: Sancti Thomae Aquinatis operum omnium indices et concordantiae in quibus verborum omnium et singulorum formae et lemmata cum suis frequentiis et contextibus variis modis referuntur quaeque, auspice Paulo VI Summo Pontifice consociata plurium opera atque electronico IBM automato usus digessit Robertus Busa. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
A large part of the project's informal scientific communication took place on the Humanist mailing list, that is, in a public international setting.

It is worth noting that the projects of both Busa and Tov and Kraft rested on the digitization of texts, the production of collections of texts in digital format, that is, the creation of environments that can be interpreted, in a general sense, as 'embryonic' digital libraries. It is well known that today's dominant conception of the library is built on the co-presence of collections and services; in the projects cited it is easy to see that collections come first and services are present only marginally, at least in quantitative terms. Here, however, we wish to stress that the creation of 'embryonic' digital libraries lies at the origin of some founding experiences of the digital humanities. This interpretation has two bases: one concerning content and one historical. As regards content, one can observe that the digital collections of the Index, of CATSS and of the TLG were accompanied by services. In the case of digital collections, services may be very different from those a librarian offers in person to readers asking for help: they may consist in managing the technical means and integrating the computing and network resources that allow the collection to exist and to be accessed; or in designing the content of the collection and maintaining and enlarging it over time; or, as Borgman argues, wherever there is activity of selecting, collecting, organizing, preserving and providing access to information in the interest of a community of users, there is a library.
The theme of services associated with the collection also shows through in the definition found in official European Union documents, «digital libraries are organised collections of digital content made available to the public»: collections that are organized and made available to the public imply an intentionality that operates constantly, with a precise purpose, with respect to a target public. It is no accident that, as we come to years closer to the present, the criteria become more stringent and the view takes hold that 'if there are no services besides the texts, one should not speak of a digital library', because the variety of origins has gradually been channelled into a set of consolidated forms. Borgman also points out another axis of the question:

In general, researchers view digital libraries as content collected on behalf of user communities, while practicing librarians view digital libraries as institutions or services. Tensions exist between these communities over the scope and concept of the term ‘library’.

See the project report for the NEH.
Robert Kraft writes online: «Around […] I was […] years old […]. Before that I knew, somewhat vaguely, about the use of computers in Father Busa’s Aquinas project»: vaguely, because communication about the Index project was limited.
Gary Cleveland, Digital libraries: definitions, issues and challenges, IFLA Universal Dataflow and Telecommunications Core Programme, Occasional Papers.
Carl Lagoze; David Fielding, Defining collections in distributed digital libraries, «D-Lib Magazine», November.
Christine L. Borgman, What are digital libraries? Competing visions, «Information Processing and Management».
From this one can grasp the discriminating significance of one specific service, reference: it matters less in research libraries and more in those used by the general public. Indeed, from the specific perspective adopted here, the projects mentioned were not aimed at the general public, and so the reference activity they carried out was directed exclusively at specialist interpretive communities. These corpora were not created by librarians or with their involvement. They were created by scholars, in response to their own specific research needs, and this tendency has not changed, at least in Italy, if one considers for example the digital libraries of classical Latin such as ALIM, digilibLT and Musisque Deoque.

In any case CATSS is almost never mentioned, not even by North American scholars, when the beginnings of the digital humanities are discussed (the fact that, in purely chronological terms, it began about […] years after the Index Thomisticus is not relevant, because its beginning was independent). The fate of these forerunner projects is thus complex and intricate: they are either mentioned in a stereotyped way (Index Thomisticus) or not known even within their own linguistic and cultural context (CATSS). From the start the CATSS project involved David Packard, creator and inventor of Ibycus (a computer specifically designed for displaying and studying Greek texts, equipped with a display system that was very advanced for its time), on which the Greek text of the Septuagint from the Thesaurus Linguae Graecae (TLG) was used.

The TLG itself, which started in […], constitutes a second line of autonomous and original development of the digital humanities in North America. The project, directed by Theodore Brunner and initially based at the University of California at Irvine, aimed to create a digital library of all Greek literature, from the archaic to the Byzantine era. The expression used at the time was textual database, and indeed at first sight the TLG consisted of a pure and simple collection of files corresponding to the works of the various authors. But from […] the textual database was accompanied by a very dense printed volume, the TLG: Canon of Greek Authors and Works, which gave, for each author and work in the TLG, information such as the dating, the bibliographic references of the printed edition that had been digitized, and the name of the work's file in the collection. The TLG: Canon thus constituted a sort of catalogue of a library characterized by its hybrid, mixed, digital and physical nature.

European Union, Communication from the Commission of […] September […] to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions: i[…]: digital libraries, «Official Journal of the European Union», Communication.
See for example Howard Besser, The next stage: moving from isolated digital collections to interoperable digital libraries, «First Monday»; Anna Maria Tammaro, Che cos’è una biblioteca digitale?, «Digitalia».
C.L. Borgman, What are digital libraries? Competing visions cit.
From the same years comes the 'textual database' known as the Packard Humanities Institute (PHI) CD-ROM, which, following the model of the TLG and adopting its text encoding formats, offered a collection of many Latin texts from the origins to the classical era. Useful and significant as it is for digital classical studies, it never had the driving force of the TLG.
The creation of the TLG, which still exists and operates, prompted the development of a series of software tools specifically for reading and using the TLG CD-ROM, mostly programs for the Mac and Windows platforms, the most recent of which date from a few years ago (Diogenes and Musaios), not to mention the Ibycus already discussed, which was a dedicated workstation. We mention these aspects, which at first sight seem strictly technical and secondary, because this digital library, though it has operated without interruption since […], is never mentioned in discussions of how the digital humanities began. Yet it has been a force of primary importance for the idea and the spread of philological and literary study with computational tools, and the people involved in the project in various capacities have been and still are active members of the international community of digital humanities scholars. One could argue that CATSS and the TLG are less innovative than the Index because in the United States the use of the computer in the study of texts was already known and practised (one need only recall John Abercrombie's manual of […]), and that is certainly true. But they raised the use of the computer in the study of texts to an incomparably higher level: CATSS because it coordinated a range of disparate skills to solve what was then an extremely difficult problem, that of attempting critical philological work on texts in scripts that could not be handled within the then dominant ASCII character set; and the TLG because it produced a resource that remains fundamental for anyone studying Greek literary texts up to the late Byzantine period.
We have thus seen so far that, over roughly the last seventy years, the beginnings of the digital humanities can be placed in at least three different and mutually independent contexts: Busa's Index Thomisticus, the Computer Assisted Tools for Septuagint Studies of Emanuel Tov and Robert Kraft, and the Thesaurus Linguae Graecae of Theodore Brunner. They differ in geographical location and in their capacity to 'found a school', but all of them centred, in various forms, on the creation of collections of digitized texts. With qualifications and limitations (more serious for the Index Thomisticus, whose digitized texts were for internal use; less so for the texts on which the Computer Assisted Tools for Septuagint Studies operated, which were available to scholars able to master the texts and the study tools), and above all with the TLG, these collections prefigured what digital libraries would later become, because services gathered 'germinally', 'embryonically', around the texts: the Canon, which served as the bibliographic catalogue of the works collected in the TLG, and the series of programs from SNS Greek to Diogenes that made it possible to run textual searches within the TLG.

Luci Berkowitz; Karl A. Squitier, Thesaurus Linguae Graecae: Canon of Greek authors and works, with technical assistance from William A. Johnson. New York: Oxford University Press.
Ibycus ceased to be produced at the end of the 1990s. In an age of computers with character-based interfaces, limited to displaying Western characters, one of its fundamental features had been that it displayed Greek characters correctly. With the advent of Windows and the spread of Macs, the display of non-Western characters became common and made the costly Ibycus obsolete.
John R. Abercrombie, Computer programs for literary analysis. Philadelphia: University of Pennsylvania Press.
But reflection on origins can go further, if one considers that in every case these research projects involved a 'destructured' reading of texts, in which one searches for, analyses, counts and studies individual phrases, words or character sequences as expressions of linguistic, phonic, phonetic, grammatical or syntactic phenomena that the scholar considers useful for studying and understanding the text that contains them. All this is easier to do in practice if the text under study is in digital format and sits within a purpose-built environment equipped with specific tools; but nothing prevents it from being conceived and carried out without a digital environment. Going back in time, then, some precursors of this way of studying texts can be identified.

Wincenty Lutoslawski, a Pole, investigated the chronology of Plato's dialogues and the authenticity of some of his letters at the end of the nineteenth century, in the essay The Origin and Growth of Plato’s Logic; with an Account of Plato’s Style and of the Chronology of his Writings and in the article Principes de stylométrie appliqués à la chronologie des œuvres de Platon. He held that Plato's style could be studied by measuring (counting) a series of syntactic features.
'Destructured reading of texts' is not only a description of the modus operandi of digital textual studies; it is also one of the main grounds on which scholars in the humanities reject the digital humanities: the text of the work is taken apart and studied even without the reading, and we might say the rumination, that traditionally characterizes the study of printed texts.
Wincenty Lutoslawski, The origin and growth of Plato’s logic; with an account of Plato’s style and of the chronology of his writings. London, New York and Bombay: Longmans, Green, and Co.
Id., Principes de stylométrie appliqués à la chronologie des œuvres de Platon, «Revue des études grecques».
By way of example, one may mention these markers of style (listed in Anthony Kenny, The computation of style: an introduction to statistics for students of literature and humanities. Oxford [Oxfordshire], New York: Pergamon Press): answers denoting subjective assent fewer than […] time in […] answers; superlative adjectives in affirmative answers with frequencies above half those of positive-degree adjectives, but not prevailing over the positives; interrogative clauses with ara making up between […] and […]% of all interrogatives; the preposition perì placed after the word it refers to, making up more than […]% of all occurrences of perì.

With a more markedly mathematical and statistical bent, Thomas Corwin Mendenhall, a physicist, worked in the United States in the same years as Lutoslawski. In The Characteristic Curves of Composition: Word Lengths in the Writings of Dickens, Thackeray and Others he first studied how the frequency of words of a given length could serve as an indicator of an author's style (the so-called 'characteristic curve'); later, in A Mechanical Solution of a Literary Problem, he used that indicator to study the attribution of Shakespeare's works by comparing them with the works of Marlowe and Bacon.

Some years earlier still, in […], Viktor Jakovlevič Bunjakovskij, an eminent Russian mathematician, had published an article entitled On the possibility to apply determining measures of confidence to the results of some observing sciences, particularly statistics, in which he envisaged «the application of probability analysis, to which obviously no-one has ever before drawn the attention [to] grammatical and etymological studies of a language, as well as comparative philology» (there is no evidence that Lutoslawski knew this article by Bunjakovskij when he undertook his studies of linguistic statistics on the works of Plato).

To sum up, we have seen that the digital humanities had a polycentric beginning in the years between this century and the last years of the previous one, in which the connection between research projects and the creation of basic resources that would today be called digital libraries stands out. But the aims and methodological approaches of those who study texts within the digital humanities today strongly resemble the work of the attribution scholars active in the nineteenth century: in both cases a destructured reading of the texts under study is at the centre of attention. One last, vertiginous step toward the deep past remains to be taken. In our cultural world, the study of works through comparison of the words present in the text was born in the Middle Ages, in Paris, at the Dominican house of Saint-Jacques in […]. There, through the work of Hugues de Saint-Cher, the first concordance of the Vulgate was conceived and produced: for every word of the text, the passages containing it were listed.
The central idea of the concordance is to understand and study the meaning of a word on the basis of the set of passages that agree in using that word. In figure […] one can see the lemma abba pater: the citations are on the left, the corresponding concise passages on the right:

Mc. xiiii.d    omnia possibilia sunt tibi
Ro. viii.c     clamantes abba pater
Gal. iiii.d    clamantes abba pater

Thomas Corwin Mendenhall, The characteristic curves of composition: word lengths in the writings of Dickens, Thackeray and others. New York: Science Co.
Id., A mechanical solution of a literary problem, «The Popular Science Monthly».
Viktor Jakovlevič Bunjakovskij, On the possibility to apply determining measures of confidence to the results of some observing sciences, particularly statistics, «Sovremennik». The English translation of the Russian original is in Peter Grzybek, History of quantitative linguistics - I. Viktor Jakovlevič Bunjakovskij, «Glottometrics».
Martin Morard, Les concordances bibliques d’Hugues de St-Cher - Sacra Pagina, «Sacra Pagina: gloses et commentaires de la Bible latine au Moyen Âge», October; Janos Bartko, Un instrument de travail dominicain pour les prédicateurs du XIIIe siècle: les Sermones de evangeliis dominicalibus de Hugues de Saint-Cher: édition et étude. Lyon: Lyon - Lumière.
The oddity of the passage that apparently does not contain the lemma is explained by the fact that the sentence referred to reads «et dixit abba pater omnia possibilia sunt tibi».

Figure […]. Concordance entry «abba pater» in ms. […], f. […], Bibliothèque municipale de Saint-Omer

The now customary division of chapters into verses is not found here, because it was conceived and carried out for the first time in […] by R. Stephanus; Hugues de Saint-Cher instead divided each chapter into equal parts identified by the letters a to g.
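The principle just described (for every word, list the passages that agree in using it) can be illustrated with a minimal Python sketch. It is not a reconstruction of Hugues de Saint-Cher's method or of any project's software: the function name `build_concordance` is hypothetical, the word-splitting rule (runs of Latin letters, lowercased) is a simplifying assumption, and the three sample passages are the ones quoted above, with the Mark passage given in the fuller form reported in the note.

```python
import re
from collections import defaultdict


def build_concordance(passages):
    """Map each word form to the (reference, passage) pairs that contain it.

    `passages` is a list of (reference, text) pairs. Word forms are taken
    naively as lowercase runs of letters: an assumption for illustration only.
    """
    concordance = defaultdict(list)
    for reference, text in passages:
        # Record each word form once per passage, however often it recurs there.
        for word in sorted(set(re.findall(r"[a-z]+", text.lower()))):
            concordance[word].append((reference, text))
    return dict(concordance)


# Sample data: the three passages of the «abba pater» entry.
passages = [
    ("Mc. xiiii.d", "et dixit abba pater omnia possibilia sunt tibi"),
    ("Ro. viii.c", "clamantes abba pater"),
    ("Gal. iiii.d", "clamantes abba pater"),
]

concordance = build_concordance(passages)

# The 'secondary text' for one word: every passage that uses it.
for reference, text in concordance["abba"]:
    print(reference, "|", text)

# Number of passages per word form, usable for later statistical analysis.
frequencies = {word: len(entries) for word, entries in concordance.items()}
```

Looking up a single key returns the set of passages that becomes the object of reading, while `frequencies` gives the kind of numerical data that later quantitative studies of style rely on.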
Using a concordance to study a text also produces that destructured reading of the text (or a reading re-structured according to the intention of the reader, who chooses the word of interest), because from the 'main' text one extracts (by analysing and searching for forms) a 'secondary' text, made up of all the passages that agree in the use of a given word, and this secondary text becomes the object of reading. It is a way of «studying the text with the text» that conceives the text as a universe whose internal rules must be known in order to arrive at its meaning. The concordance, the tool that makes all this possible, is a very particular kind of publication because, besides requiring an imposing amount of preparatory work, it creates a second-level text that presupposes the existence of libraries in which the first-level reference texts are catalogued and accessible.

This approach to the text, also called textual analysis, is attested in seminal form by the concordance and later developed in various forms until it came to characterize a hard core of humanities computing. It consists essentially in searching for information within texts, a search that takes the most diverse forms depending on whether it concerns textual elements in the strict sense (generally words or character sequences) or metatextual ones (for example grammatical or syntactic features that are either inferred from the text itself or inserted and formally described in the text beforehand so that they can be searched for and retrieved). The desired outcomes are not only the passages containing the phenomena sought but also numerical data on frequencies, which can later be subjected to statistical analysis; and searches can be run on textual data stored locally or on remote textual data by means of online search tools. But the search for information (understood broadly) is recognized by various studies as a constitutive element of the specific core of library and information science. According to the classification of library and information science topics in the field's scientific journals drawn up by Järvelin and Vakkari in […], the theme «information retrieval» is constitutive; according to Borup Larsen's survey of […], the core subject «information seeking and information retrieval» is present in all the syllabi of the library and information science courses she examined; in […] Figuerola, García Marco and Pinto identified, by topic modelling, the recurring themes of the scientific publications indexed in Library and Information Science Abstracts […]-[…], and among them are advanced statistics application; automatic information processing; online search services. In this inquiry into the beginnings of the digital humanities we have therefore gone back from our own years to the Middle Ages, always following the thread of methods for studying texts held in libraries: libraries created specifically for the research to be conducted, as we saw for the more recent projects, or pre-existing libraries. The tools and the entities analysed may change, but the concepts persist, indicating that we remain within one and the same field of study 'of an eminently textual nature'. This may seem an obvious conclusion on the basis of what has been set out so far, but it has a number of non-trivial implications for the rest of the discussion.

This is the case, for example, of WebCorp.
Maurizio Vivarelli gives an account of this in Dai frattali alle reti: un punto di vista olistico per la lettura. In: La biblioteca che cresce: contenuti e servizi tra frammentazione e integrazione. Milano: Editrice Bibliografica.
And the working tool within this field, namely textual analysis, which is the analysis of the information conveyed by the text, is a key theme that helps define the identity of library and information science.

Several voices have spoken in recent years on the question of the beginnings of the digital humanities as located in Busa's Index Thomisticus project. Steven Jones has published a broad historical reconstruction of the project, precisely in order to free it from the simplified received account, show its complexity and thus confirm, through in-depth analysis, its significance as the beginning of the digital humanities. Fabio Ciotti has recently published an article in which, while recalling the beginnings of the digital humanities with Busa, he stresses the role and value of the Roman school (Orlandi first of all, then Gigliozzi and Mordenti) within the 'Italian way' documented from the publication in […] of the Almanacco letterario already mentioned, published by Bompiani and devoted to the applications of electronic computers to the moral sciences and literature. The theme of origins is also treated by Edward Vanhoutte in a chapter of the volume Defining Digital Humanities, which mainly shows, with reference to Busa and to other scholars and projects, the great variety of forms that digital humanities studies took even in the first years of the field's development. Julianne Nyhan proceeds in a similar way in the introduction to the volume Computation and the Humanities: Towards an Oral History of Digital Humanities.

Two things are decisive for the perspective on the origins of the digital humanities that we have concisely outlined above: on the one hand the central role given to libraries (digital libraries ante litteram), on the other the attention paid to methods of study rather than to the specific objects and products of research. The latter greatly broadens the historical dimension of the reflection and, above all, allows the formation and development of the digital humanities to be read as a coherent (though not primary or dominant) expression of the culture of the book.

The 'extended field' of the digital humanities

The problem of methodology has come up at several points in the preceding pages: which entities are studied? With which tools? How are the data obtained evaluated? It is precisely the recognition of the relevance of these methodological aspects that makes it possible to (re)construct a path toward the origins that leads, without forcing the argument, all the way to the Middle Ages.

Kalervo Järvelin; Pertti Vakkari, The evolution of library and information science: a content analysis of journal articles, «Information Processing & Management».
Jeannie Borup Larsen, Survey of library & information science schools in Europe. In: European curriculum reflections on library and information science education. Copenhagen: The Royal School of Library and Information Science.
In the article by Carlos G. Figuerola; Francisco Javier García Marco; María Pinto, Mapping the evolution of library and information science using topic modeling on LISA, «Scientometrics», the themes characterizing the LIS field, studied with topic modelling techniques, include advanced statistics applications, automatic information processing and online search services.
Steven Jones, Roberto Busa, S.J., and the emergence of humanities computing: the priest and the punched cards. London: Routledge.
Fabio Ciotti, From informatica umanistica to digital humanities and return: a conceptual history of Italian DH, «Testo e Senso».
Almanacco letterario Bompiani cit.
It is usual, however, to conceive of the digital humanities (as several of the definitions quoted above show) by focusing on the application of information technologies to the study of content from the humanities, leaving the methodological question in the background. Partly as a result of this way of conceiving what is specific to the digital humanities, an all-inclusive conception of the field has spread, mostly expressed by the words big tent. These words have not entered the Italian vocabulary of the digital humanities, unlike many other Anglo-American expressions from the world of computing and information technology. The expression big tent arose and spread in the American and Canadian context around the 2010s, and the international conference “DH ” took “Big Tent Digital Humanities” as its very theme. This shows that the concept was already sufficiently well known and widespread, even though it does not appear in the scholarly articles or essays immediately preceding the conference, which probably indicates colloquial circulation or non-formalized use (the expressions expanded field and trading zone are less well known but similar in meaning). The conference presentation specified the concept of big tent as follows:

With the Big Tent theme in mind, we especially invite submissions from Latin American scholars, scholars in the digital arts and music, in spatial history, and in the public humanities.

Edward Vanhoutte, The gates of hell: history and definition of digital | humanities | computing. In: Defining digital humanities, edited by Melissa Terras, Julianne Nyhan, Edward Vanhoutte. Farnham: Ashgate, , p. - .
Julianne Nyhan; Andrew Flinn, Computation and the humanities: towards an oral history of digital humanities. Cham: Springer International Publishing, . The open-access digital edition of the volume, distributed by Springer Open, has no page numbering.
As in many fields of study that recognize themselves in a worldwide association, the digital humanities have long held an annual conference named “DH[year]”. The “DH ” conference was part of this series and was therefore not an isolated event.

Here a systemic element of multiculturalism (the invitation specifically addressed to Latin American scholars) appears alongside matters of content. The digital humanities topics specifically invited to the conference were accordingly described in the call for papers as follows:

data mining, information design and modelling, software studies, and humanities research enabled through the digital medium;
computer-based research and computer applications in literary, linguistic, cultural and historical studies, including electronic literature, public humanities, and interdisciplinary aspects of modern scholarship. Some examples might be text analysis, corpora, corpus linguistics, language processing, language learning, and endangered languages;
the digital arts, architecture, music, film, theater, new media, and related areas;
the creation and curation of humanities digital resources;
the role of digital humanities in academic curricula.

The topics that most specifically gave substance to the big tent concept were the digital arts, architecture, music, film, theater, new media and related areas, and the creation and curation of digital resources, together with the opening toward Latin American scholars and hence toward what is called «the Global South».

To give substance to the analysis of the big tent theme, one can measure, as far as the documentary evidence allows, the presence of the theme (and of the two other fairly frequent metaphors, expanded field and trading zone) in scholarly publications.
The figure shows, year by year, the number of results for the query («big tent» OR «expanded field» OR «trading zone») AND «digital humanities» run in Google Scholar.

Figure – Results of the query («big tent» OR «expanded field» OR «trading zone») «digital humanities» in Google Scholar

As can be seen, the presence of the big tent theme in English-language digital humanities publications expands precisely from the year of the “DH ” conference, whose theme was “Big Tent Digital Humanities”, and then grows clearly and steadily. The absolute frequencies of these metaphors must, however, be set against the total number of publications on digital humanities, as the following figure shows.

Figure – Number of results for the queries «digital humanities» and («big tent» OR «expanded field» OR «trading zone») AND «digital humanities» in Google Scholar: relative to the frequencies of «digital humanities», those of the other expressions are quantitatively negligible

As can be seen, the big tent discourse occupies a very small space in publications on digital humanities topics. This clearly indicates that the digital humanities are a field strongly centered on research itself, and that the internal, self-reflective debate on the meaning of the discipline remains limited. It could be interesting to check, in the same way as done above for the digital humanities, whether other disciplines have a similar meta-debate, that is, a debate not on the discipline itself but on its reason for being.

The apparent drop in frequencies in the last year shown reflects the fact that the search was run in early December of that year.
In North America, moreover, the call for papers of the “DH ” conference on “Big Tent Digital Humanities” was judged insufficiently inclusive (that is, the tent was not big enough, or at least not as big as claimed):

The call as a whole is definitely more inclusive than the cfp, which had a more pronounced instrumental and textual focus; but, even so, there can be no doubt that there is a particular scholarly tradition underlying the call. This may not be surprising given the history of the conference series, but the current state of the field and the theme would seem to call for a more clearly inclusive stance. Again, it is important to consider inside and outside perspectives. It may be that the call under discussion seems inclusive to the organizers of the conference, whereas it is seen as exclusionary by “outsiders” or newcomers to the field. For instance, most of the aspects listed could be said to represent tool-oriented and text-based research.

In short: what from inside the tent could look like a very inclusive set of topics was perceived from outside as exclusionary, partly because the conference topics 'still represented text-based research'. This suggests that, according to Svensson, the people asking to 'enter the big tent of the digital humanities' were those whose studies were not text-based.

Here too, the drop in frequencies in the last year shown is due to the search having been run in December of that year.
This refers to the call for papers of the “DH ” conference.
Patrik Svensson, Beyond the big tent. In: Debates in the digital humanities, edited by Matthew K. Gold. Minneapolis: University of Minnesota Press, , p. - .
In the following years the big tent theme progressively took on ideological and political aspects and content, of which the opening toward Latin American scholars in the “DH ” call was a first sign. This is evident from the following list of topics, focused on «cultural, political and ultimately epistemological diversity», which make up the call for papers for those wishing to contribute to an edition of the aforementioned volume Debates in the Digital Humanities:

DH has been described through various metaphors – “big tent”, “trading zone”, “expanded field”, etc. – lacking perhaps one further step: the idea of digital pluralism linked to new geographical and geopolitical dimension. Our aim in this project is therefore to build a different representation of DH based on cultural, political and ultimately epistemological diversity.

DH and the epistemologies of the South
DH and theory from the South
DH and Southern critical perspectives
DH and cultural criticism
Critique of DH
Postcolonial DH
Decolonial computing
Alternative histories of DH
Geopolitics of DH
Digital hegemonies
DH and alternative methodologies
Geopolitics of code
Technical challenges of DH with non-Anglophone and non-Latin material
DH and alternative technologies
Open humanities
DH and public policy
DH and local communities
DH and intercultural problems
DH and multilingualism
DH and indigenous knowledge orders
DH and digital divides
DH and political debates
DH and social change in the Global South
DH and citizen-driven innovation from the South
DH and social complexity
DH and surveillance studies
DH and big data from the South

As a sign of how far this perspective has taken hold in the English-speaking world, the content description that the publisher Routledge gives for its recent Routledge Companion to Media Studies and Digital Humanities (edited by Jentery Sayers. New York, London: Routledge, ) is «humanities, cultural studies, media & film studies»: this creates an identification (a short circuit) between the digital humanities of the title and cultural and media studies.
Available here: João Fernandes, Global debates in the digital humanities, September .

That a field of humanities research such as DH should be so strongly marked by a variety of topics with clearly political connotations or characteristics is something new. Outside the digital humanities one would probably not think of an «Italian philology and intercultural problems» or a «Byzantine literature and multilingualism», yet this is what characterizes the digital humanities, mainly but not exclusively in the North and South American cultural context. The volume in question, moreover, has three non-US editors (Domenico Fiormonte, Italy; Paola Ricaurte, Mexico; Sukanta Chaudhuri, India), which points to an even more complex situation: DH is traversed, beneath the surface, by an anti-colonialist, anti-capitalist, anti-Western polemic, a significant part of which is the fight against the predominance of English in communication and against the Anglo- and North-American coloring of many aspects of the life of the digital humanities scholarly community. These themes are unquestionably important and grounded in the reality of today's digital humanities, but the intensity with which they are promoted and defended, and with which some seek to impose them as the agenda of the whole digital humanities world, seems to contradict the very principles of multiculturalism and respect for diversity that they are meant to affirm.

Digital humanities in Italy

In Italy the big tent concept of the digital humanities has neither spread nor taken hold, although scholars are well aware of it.
The reasons (possible ones, since there is no counter-proof) are probably historical: informatica umanistica first, and the digital humanities later, took shape in Italy around the study of texts (or, more broadly, around textual studies, that is, studies in which texts play a significant part). The relevant factors are the aforementioned Index Thomisticus project; the presence of classical studies from the very beginnings of the field; the influence of the writings and teaching of Orlandi, which stress the methodological and therefore scientific value of the digital humanities (which Orlandi, not by chance, calls informatica umanistica); Buzzetti's philosophically oriented reflections on the characteristics of textuality and on the operations of study in that context; and the existence and activity of a CNR institute of computational linguistics in Pisa, whose history can be traced back to .

Consistently with this overall imprint, an important part of the internal debate in Italy concerns the formal disciplinary placement of the digital humanities within the aree concorsuali (academic recruitment areas) that cover the humanities in the broad sense. So far, however, the digital humanities have not entered the disciplinary sectors of the Italian university system, neither as a discipline of their own nor as specific content within the official descriptions (declaratorie) of the various sectors. As a consequence they are practiced and developed 'incognito', so to speak, in a wide variety of areas: library science, engineering, computer science, law, archaeology, art history, linguistics, music and musicology, education, and so on.

This variety of disciplinary areas does not, however, amount to a big tent of the Italian digital humanities. It reflects instead the distinctive feature of a multidisciplinary approach to the text and its 'ramifications': the very text-centeredness that Svensson and the North American context regard as a sign of closure (see above, where he states that a call dominated by «text-based research» is seen «as exclusionary by “outsiders” or newcomers to the field») is in Italy the meeting point of very different disciplines. The Italian digital humanities show in practice that in the digital world textuality and the text are the connective tissue of a very wide range of disciplines, including ones that might be considered remote, such as engineering and computer science.

The inclusive process through which AIUCD, the Italian digital humanities association, was formed certainly helped define this feature. The association was founded in October on the initiative of Anna Maria Tammaro and the Fondazione Rinascimento Digitale: those working in informatica umanistica in Italy were invited to a series of founding assemblies, at the end of which the scholars interested in establishing an association for informatica umanistica brought it into being.

Dino Buzzetti, Digital representation and the text model, «New Literary History», ( ), n. , p. - , doi: . /nlh. . ; Jerome McGann; Dino Buzzetti, Critical editing in a digital horizon. In: Electronic textual editing. New York: The Modern Language Association of America, , p. - ; Dino Buzzetti, Digital editions and text processing. In: Text editing, print and the digital world, edited by Marilyn Deegan, Kathryn Sutherland. Farnham: Ashgate, , p. - .
Antonio Zampolli, Introduction to the special section on machine translation, «Literary and Linguistic Computing», ( ), n. , p. - , doi: . /llc/ . . .
The key element was therefore that the association's decision-making process and founding core were not tied to any one discipline. They consisted of scholars who belonged (and belong) natively and formally to very varied disciplines and areas and who at the same time share an interest in one horizon of study, what we called above the text and its 'ramifications': the content of libraries and archives, that is, literary texts, historical sources, and legal sources, together with the formal annotation of textual and visual sources, including by means of formal ontologies. All this makes AIUCD unique among the world's digital humanities associations: in it the 'expanded field' of the digital humanities is realized far more through the disciplinary variety of its members' affiliations (AIUCD includes engineers, philosophers, linguists, literary scholars, historians, library scientists, art historians, and others) than through the multiplication of objects of study (from the text toward cultural studies, which can take any topic as their object). It is also worth recalling that the name «Associazione italiana per l'informatica umanistica e la cultura digitale» (AIUCD) on the one hand avoids an English expression; on the other, in the double description informatica umanistica / cultura digitale, informatica umanistica corresponds more directly to the characteristic imprint of this field in Italy, while cultura digitale takes account of the international context, where the horizon of study is more broadly the humanities.

The exceptions are very brief mentions in the declaratorie of book and document sciences, glottology and linguistics, and Italian linguistics and philology.
By contrast, many national digital humanities associations in non-English-speaking countries use the English expression or a calque of it: Humanistica, l'association francophone des humanités numériques/digitales; Red de Humanidades Digitales (Mexico); Asociación Argentina de Humanidades Digitales; Czech Digital Humanities Initiative; Russian Association for Digital Humanities; Digital Humaniora i Norden (Scandinavia); Japanese Association for Digital Humanities; Digital Humanities Association of Southern Africa; Taiwanese Association for Digital Humanities.

Forms of interaction between digital humanities and libraries

The discussion so far has reconstructed, as a historical inquiry, the relationship between libraries and digital humanities that Daquino and Tomasi formulated in theoretical terms as part of a reflection on LIS:

Libraries are in fact defined by some of the functions that by definition also characterize DH. The classification, management, and dissemination of the information in their own domain (which we can place within the broad spectrum of knowledge organization) are among the oldest functions that libraries are devoted to performing and that in turn identify a fundamental part of the methodology of the digital humanist.

The historical inquiry of the preceding pages has shown how the past of the digital humanities, both remote and recent, has helped define and shape the present in which we work. Hence the question of what characterizes the present and what might be outlined for the future of the complex interaction between library science and digital humanities.
From this perspective, the classification, management, and dissemination of information can be understood either as a founding core whose established theory and practice are not to be abandoned, or as concepts that must be rethought at every evolutionary turn of culture and science. We first present the results of several studies from the United States (two surveys), the United Kingdom, and Europe (no Italian libraries took part in the latter), and then develop an analytical reflection on specific aspects.

Two surveys on the relationship between libraries and digital humanities have been published in the United States: Tim Bryson et al., Digital Humanities, and Rikk Mulligan, Supporting Digital Scholarship. Both were published by the Association of Research Libraries (ARL) and gathered information from and university libraries respectively. The Digital Humanities report identified as emerging trends at the time the need for libraries to develop guidelines and staffing models suited to working with digital humanities projects, and the fact that many libraries, in order to meet the demands of digital humanities projects, acted as a hub for resources coming from different departments. For the purposes of the survey, digital humanities denoted:

an emerging field which employs computer-based technologies with the aim of exploring new areas of inquiry in the humanities. Practitioners in the digital humanities draw not only upon traditional writing and research skills associated with the humanities, but also upon technical skills and infrastructure.

Only four libraries ( % of the total) stated that they offered no services for digital scholarship. Shortly afterwards, commenting on that very survey, Miriam Posner, one of its co-authors, pointed out that librarians who chose to get involved in digital humanities activities ended up having to make up for the structural limits and shortcomings of their institutions:

Digital humanities has reached new levels of popularity, piquing the interests of a great many institutions that have little previous experience with it. […] The result is that the success of library DH efforts often depends on the energy, creativity, and goodwill of a few overextended library professionals and the services they can cobble together. […] So there are very good reasons why individual librarians may choose to eschew digital humanities work, and they have to do with the lag between libraries' enthusiasm for DH and institutions' ability to support it in meaningful ways.

Marilena Daquino; Francesca Tomasi, Digital humanities e library and information science: through the lens of knowledge organization, «Bibliothecae.it», ( ), n. , p. , doi: . /issn. - / .
Tim Bryson [et al.], Digital humanities: SPEC kit . Washington, DC: Association of Research Libraries, ; Rikk Mulligan, Supporting digital scholarship: SPEC kit . Washington, DC: Association of Research Libraries, .
The Supporting Digital Scholarship survey defined the digital humanities as «use of digital evidence and method, digital authoring, digital publishing, digital curation and preservation, and digital use and reuse of scholarship» and its own purpose as

to gather data on how the librarians, faculty, and professional staff in research libraries support a great variety of multimodal research as collaborative scholarship, as collaborators, services, and in partnership with other units within and beyond the library

It focused on types of activity that fall within the digital humanities: GIS and digital mapping; digitization of analog sources; creation of digital collections; metadata creation; digital preservation; data curation and management; 3D modeling and printing; statistical analysis and support; digital exhibits; project planning; project management; digital publishing; computational text analysis and support; interface design and/or usability; visualization; database development; content encoding (for example TEI annotation); technical upgrading of products and projects; and software development for digital humanities research. The conclusion was that all these types of activity were supported, to varying degrees, in the responding libraries. Compared with what the Digital Humanities survey observed, support for digital humanities initiatives is more systematic and often organized from within the library, partly because scholars often ask for support across the whole life cycle of a research project, which often depends mainly on special or digital collections. In line with this, the library acts as a center of both 'research' and 'dissemination', which raises the problem both of making collections accessible to the general public and (partly for that reason) of equipping the library with storage and management systems that interact as well as possible with digital tools and methods. The broad context is therefore what in Italy is called the terza missione (third mission):

sharing research with the public as a foundational stakeholder – by better supporting public history, public scholarship, and becoming a conduit for lifelong learning and active citizen scholarship.

The UK survey by Christina Kamposiori, The Role of Research Libraries in the Creation, Archiving, Curation, and Preservation of Tools for the Digital Humanities, is based on responses from UK libraries and states that

based on the results […] there is a role for libraries in the creation, archiving, curation and preservation of tools for digital humanities research, mainly as a collaborative activity between library professionals and researchers in the field

almost as if answering an unstated preliminary doubt: «But is collaboration between libraries and digital humanities research possible at all?».

T. Bryson [et al.], Digital humanities cit., p. .
Miriam Posner, No half measures: overcoming common challenges to doing digital humanities in the library, «Journal of Library Administration», ( ), n. , p. - .
The expression used in the survey is «digital scholarship in the humanities», signalling that the humanities are, strictly speaking, neither digital nor non-digital; what is digital are the methods of study and research under examination.
R. Mulligan, Supporting digital scholarship cit., p. .
Ibid., p. .
There are also delicate issues. They mainly concern the ability to ensure long-term maintenance and preservation of what has been built for research and teaching; the lack of shared models for choosing and using the resources that digital humanities projects need; the fact that such projects, once taken on, widen librarians' responsibilities; and the fact that, if projects are to benefit the institutions involved, the sharing of knowledge and good practice must be planned for.

The European survey by Lotte Wilms, A Mini Survey of Digital Humanities in European Research Libraries, carried out within the LIBER network, notes from its opening lines that in Europe collaboration between digital humanities projects and libraries is only just beginning:

of the libraries who responded have been running a DH activity for under a year. been active between - years, only libraries have had a DH activity for more than years.

Almost all of them ( in total), however, have dedicated staff: in of them the dedicated staff ranges from to people, while in it ranges from to people. In of them this digital humanities activity derives from explicit programmatic choices and, consistently, of them have specific funds allocated to it. As for faculty awareness of the library's digital humanities activity, it is described as «vague» in cases out of , while in others it is absent, even though librarians actively work to spread this awareness.

Ibid.
Christina Kamposiori, The role of research libraries in the creation, archiving, curation, and preservation of tools for the digital humanities. London: Research Libraries UK, .
Ibid., p. .
Ibid.
In the following pages we dwell on specific issues that we consider particularly important and that do not appear in the surveys described, or remain marginal there, perhaps also because they are difficult to address in survey form.

Information asymmetry

First of all, one must take into account an asymmetry between the physical and the digital world from the point of view of information: the digital world abounds in information describing the physical world, whereas the converse does not hold, since information describing the digital world is scarce in the physical world. Confirming this, library catalogs have been available online for decades, and with evident determination and shared intent: libraries began giving online access to their catalogs in the early 1990s, when reaching them required a computer, considerable technical skill (Telnet access, with configuration of the terminal parameters), and very specific information for each catalog. As a result, for many libraries the OPAC quickly became a standard way of meeting readers in the digital world, so much so that non-expert users who discover the OPAC often believe that it, understood as the library's presence in the digital world (!), gives access to the text of the works the library holds. Two tendencies show themselves here. The first is that the digital world is perceived as a pervasive context of access to information. The second, closely tied to the first, is that the library is in any case conceived (even by those unfamiliar with the IFLA manifestos!) as a place of access to knowledge, so that anyone who meets it online assumes (regardless of technical matters such as the difference between an OPAC and a digital library) that its collections can be accessed there.
In citizens' lives the information asymmetry is evident every day: one picks up a smartphone or opens a computer to look for information on the state of the physical world, or to 'act' in it: checking transport timetables, buying a train ticket, checking the weather to decide whether to go ahead with some activity, checking a museum's opening days and hours to decide when to visit, comparing and buying glasses, shoes, and clothes, and more. The same asymmetry governs citizens' relations with public institutions: communication passes through the digital world in order to settle actions and choices that will take effect in people's lives in the physical world. Not to mention research, which increasingly relies on information resources and sources located in the digital world.

If all this happens in the ordinary contexts of reading, citizenship, study, and research, it holds 'all the more' for those who practice informatica umanistica or digital humanities: the objects and tools they work with are digital, and so are the products of their research. Libraries that wish to connect with this effervescent and tumultuous world of the digital humanities therefore need a strong presence in the digital world, with a specific and innovative planning capacity expressed in both services and content. Naturally this may be unsettling or worrying. On the one hand, this is because libraries have behind them a very long period in which being a place of access to knowledge meant managing its physical carriers, to the point that carrier and content seemed interchangeable, whereas ahead lie a present and a future in which information and knowledge appear dematerialized, detached from any physical carrier.

Lotte Wilms, A mini survey of digital humanities in European research libraries. LIBER, , p. .
But libraries have always been places of connections more than of collections: places of encounters and of actions through media; hives of activity where what is alive sits alongside what is dead, as well as, of course, alongside what is alive; in short, places where this sharing is generative because it can preserve inherited forms of knowledge while producing new ones. The change of forms, which looks like a destabilizing change, is therefore rather the rediscovery or reaffirmation of a constitutive feature.

On the other hand, it is because, in hard times when libraries, like culture as a whole, are losing resources (money and people), anything that points to innovative paths seems to demand precisely the resources already lacking for ordinary operations. Yet giving up on having lines of action is not an effective response, because it risks leaving libraries unprepared to seize the opportunities that will arise. We therefore set out the considerations in the following pages with respect for the history that has shaped libraries and with awareness of the problematic aspects.

Content

Scholars working in the digital humanities operate on sources in digital format. These sources can differ greatly: from texts of any kind, to audio-visual materials (photographic collections, films, pieces of music, recordings, etc.), to collections of tangible or intangible cultural heritage, and so on.
As seen in the preceding pages, over time the original imprint of informatica umanistica/humanities computing, centred on the study of texts, has shifted towards what are broadly called cultural studies, whose object is any form of expression of human cultures. To some extent this finds a pragmatic counterpart in the transformation of the library from a place of access to knowledge conveyed by print (collections made up of monographs and periodicals) into a place of access to the expressions of human creativity and of cultures; hence the evolution that has led to play areas for children, video libraries, film libraries, music collections, computer rooms, and hospitality for makers. So the expansion of interests from the textual studies of humanities computing to the cultural studies of the digital humanities can also find full correspondence in the evolution of library collections.
In the context of the digital humanities, however, sources are essentially always deconstructed, taken apart, read transversally, by means of dedicated tools and methods. 'This is possible because the sources are digital and in an open format.' The point is only apparently simple and obvious: very often the sources available in the library are in closed/protected formats that do not allow any use other than the one envisaged by the author and the publisher, simply because this is how books, journals, films and music are normally sold (and legally managed) in the physical world. That is to say, closed formats, other problems aside, are perfect when the mode of use is the one intended by the publisher, which is generally sequential over time: reading the book, watching the film, listening to the music. And so the digital humanist who digitises a work for personal use in order to 'carry out analytical operations on it privately' nonetheless infringes copyright law, because the work is being reproduced in full even if it will never be shared with anyone; but there is no alternative, because working on sources by deconstructing, dismantling and reconstructing them is an essential characteristic of the digital humanities, and this is possible only if the sources are digitised. In other words, the digital scholar wants to choose the approach to the content independently, to choose which reading to perform (reading in the semiotic sense; we are not referring only to written sources), and almost always this presupposes the availability of the study material in digital form. If the content falls under copyright law, the issues connected with its digitisation are obviously too complex to be discussed here. But a large quantity of textual sources is available for digitisation because it is out of copyright, and today there exist both excellent (and inexpensive) digitisation devices and text recognition programs that benefit from high-quality digitisation of the pages. The same applies to photographic or audio/video sources, with the sole difference that the digitisation operations are more complex and the necessary equipment more expensive.
Note: Jeffrey T. Schnapp, La biblioteca oltre il libro. In: La biblioteca che cresce cit., p. .
If the library, in full continuity with the physical/analogue mode of working on sources, offers scholars readers for studying sources available on microfilm, the library that wants to occupy a space in the digital world should today offer scholars a space for digital work on sources (once again, we stress, of whatever nature: text, audio, video, image) and for digitising out-of-copyright works.
Mere digitisation of sources (which for printed ones involves two steps: image acquisition and text recognition) does not complete the preparatory work, because today the study of a digitised source often implies enriching it with formal annotation of its content. In the case of text, this generally uses the XML language according to the TEI standard to speak of content expressed in terms of formal ontologies. But one can draw on a wide spectrum of linguistic resources of increasing complexity and refinement: from a simple list of terms, to a glossary, a taxonomy, a thesaurus, up to an ontology. The aim is to describe, wholly or in part, the source, its content and/or its formal structure in a formalised way that is both understandable by scholars and usable by computers (formal ontologies are precisely bridge-descriptions of specific domains of knowledge, written so as to be understandable to human beings and usable by machines), and to make it possible to insert commentary notes. One might think of a geographical ontology such as GeoNames or GO! to enrich the subject indexing of collections, making it possible to select works concerning a given geographical area; or of annotating the text of a work to highlight its grammatical/syntactic features, or, in a literary text, to flag the interpretation of a difficult passage with a reference to the source, if any, on which the interpretation is based; for an image, one might think of identifying a depicted subject by creating a link to an authority file such as VIAF if the subject is a person. The purpose here is not a subsequent reproduction of the source but to enable its analytical reading and study: for example, in a play, the lines of character X that contain an apostrophe in the second person plural; in a collection of images, those in which a given figure is depicted in a given setting.
On the one hand, then, we see that
as a result of their research and teaching activities, many scholars in the humanities have become creators of digital content. For these scholars the need for certain basic technical and methodological knowledge is increasingly widespread
On the other hand, whatever the variety and complexity of research design (from the conceptual model to tools, methods, etc.), the digitisation of sources with its various steps remains a fixed point: if the sources are not digitised, research in the digital humanities essentially cannot develop. The discussion on theorisation and models in the digital humanities is lively and has an undoubted formative significance for the disciplinary field, also because it seeks to structure it strongly in relation to other strong interlocutors (computer science, linguistics, theory of knowledge); but this work of theoretical structuring nonetheless has as its beginning and its end the actual, real research activities that operate on/with digitised sources.
Clearly, from the point of view of the operating modes of LIS, the interest of the matter is that the library should (once again) become, in collaboration with scholars, a place not only of use but also of co-conception/definition of forms, and of co-production of tools for reading, of texts. A digitised text is content, but the digitised form is the tool that enables analytical readings for study that would otherwise be impossible. Both digitisation and annotation of texts hark back to ways of working already known in the past: the former recalls the scriptorium, the latter the gloss. Between a digitisation laboratory and a scriptorium there are both important differences and elements of continuity, but what matters here is that these are activities of work on the text that were born in libraries and now present themselves in the specific modes of the digital present. If today they are often carried out outside libraries (in specialised laboratories or in the individual scholar's room), this is due more to a combination of circumstances than to any irreducible extraneousness to them. And should there be any suspicion that digitisation is a disguised and banal activity of mere reproduction, it must be stressed that text annotation is an authorial activity in its own right, based on broad disciplinary expertise whose deployment requires a considerable investment of time.
Note: or, after legal and practical training, of the out-of-copyright parts of a complex work such as a critical edition: copyright on it ceases in Italy […] years after publication.
Note: «This technological infrastructure consists of a series of shared tools for terminological control and semantic disambiguation, which make it possible to describe data unambiguously and to express their formal semantics: essentially languages, metalanguages, controlled vocabularies and ontologies» (Gianfranco Crupi, Universo bibliografico e semantic web, «Quaderni DigiLab», ( ), n. , p. - ).
Note: a clear example of image annotation is in Fabio Cusimano, Il digitale in biblioteca: preziosa opportunità di crescita e integrazione, o deriva verso la frammentazione?. In: La biblioteca che cresce cit. p. . In March […] the Humanist mailing list hosted a lively debate on the distinctive features (the limits!) of formal text annotation and on how well (or how poorly) it is suited to achieving the aims just mentioned; but the discussion itself indicates that annotation is considered appropriate to these aims, despite wide disagreement on which type of annotation is best.
Note: Anna Maria Tammaro, Biblioteca digitale per l'informatica umanistica. In: E-laborare il sapere nell'era digitale: strumenti e tecniche per la gestione, la conservazione e la valorizzazione del patrimonio culturale in ambiente digitale, Montevarchi, - novembre , p. ; .
Note: one may recall D. Buzzetti, Digital representation and the text model cit.; Arianna Ciula; Øyvind Eide, Modelling in digital humanities: signs in context, «Digital Scholarship in the Humanities», ( ), suppl. , p. i -i ; Steven E. Jones, Turning practice inside out: digital humanities and the eversion. In: The Routledge companion to media studies and digital humanities cit., p. - ; Michael Gavin, Vector semantics, William Empson, and the study of ambiguity, «Critical Inquiry», ( ), n. , p. - , doi: . / .
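As an illustration of the analytical reading that annotation makes possible (the lines of character X that address a plural 'you'), here is a minimal sketch in Python. It assumes a TEI-style encoding in which each speech is an `sp` element with a `who` attribute and its verse lines are `l` elements. The sample text, the identifiers `#x` and `#y` and the helper names are invented for the example, and the word-list test for the second person plural is a deliberately crude stand-in for real linguistic annotation.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample: three speeches by two hypothetical characters.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <sp who="#x"><speaker>X</speaker><l>Voi tutti, ascoltate la mia storia.</l></sp>
    <sp who="#y"><speaker>Y</speaker><l>Io non ti credo.</l></sp>
    <sp who="#x"><speaker>X</speaker><l>Me ne vado da solo.</l></sp>
  </body></text>
</TEI>"""


def speeches_by(root, who):
    """Return the text of every speech whose @who matches the given id."""
    result = []
    for sp in root.iterfind(".//tei:sp", TEI_NS):
        if sp.get("who") == who:
            lines = [" ".join((l.text or "").split())
                     for l in sp.iterfind("tei:l", TEI_NS)]
            result.append(" ".join(lines))
    return result


def addresses_plural_you(speech, markers=("voi",)):
    """Crude heuristic (an assumption): look for a plural 'you' pronoun."""
    words = {w.strip(".,;:!?").lower() for w in speech.split()}
    return any(marker in words for marker in markers)


root = ET.fromstring(SAMPLE)
x_speeches = speeches_by(root, "#x")
hits = [s for s in x_speeches if addresses_plural_you(s)]
print(hits)
```

The selection by character relies only on the markup, which is the point made above: once the structure is encoded, this kind of transversal reading becomes a query rather than a manual search.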
The range of activities that can lead libraries to become (once again) places not only of use but of production of forms and tools for reading texts, whether autonomously or in interaction and collaboration with scholars working in the digital humanities, is broad and allows various choices: from acquiring the necessary equipment and making it available to interested scholars, to providing not only equipment but also training in its use, to working autonomously on the whole process, from page acquisition to semantic annotation. Along this path, whatever form is chosen, questions of formats and those connected with copyright and open licences come forcefully into play: digital products need to be both written in formats that guarantee durability over time as far as possible and distributed in open access (publishing a text in open access at one point in time in a proprietary format would be a contradiction, because the evolution of the format and of the software could lead, at a later time, to a text that is in open access but technologically inaccessible). These are issues that librarians know well and that many, especially in research libraries, deal with every day.
Content management
For libraries that decide to be incisively present in the digital humanities and in the digitisation of sources for research purposes, the most evident area of work is the management of locally produced digital products: cataloguing, preservation, management. It is not enough, for example, that n years of local press have been digitised for a public history project: they must be adequately catalogued and preserved and become discoverable and usable outside the public institution that produced them as well. In other words, cataloguing multiplies the social value of the work of acquisition and digitisation, because all the parties (people and institutions) who share an interest in the resource meet around it; that is, the accessible existence of the resource creates the occasion for those parties to come to know one another. From here it is a short step to more complex questions such as long-term preservation, because in a sense research exists as long as its products exist; sustainability, because a digital humanities project requires (consumes!) resources in terms of time, skills and money; and finally the possibility of initiatives shared between libraries and other institutions, for which visibility and sustainability are two key elements.
As regards the cataloguing of locally produced digital products, it is true that libraries have long catalogued, preserved and distributed editorial products in digital form, both books (e-books) and journals (e-journals); but it should be remembered that this generally happens within contracts with intermediaries that operate on the product (suppliers of journal and/or e-book packages) and/or on the computing resources (suppliers of services for access to the publication files and/or for search), and so managing locally produced digital products entails the need to broaden and/or deepen competences in managing digital products. The question of a cataloguing that makes a product created locally in an academic research context findable, and therefore available, globally cannot be solved simply with the institutional research repository IRIS: because IRIS does not provide for the entry of content whose author is outside the academic context, because it operates in separate units corresponding to universities and research centres, and finally because the digitisation of sources does not always extend into the work of formal annotation that makes it an authorial product.
The responses to this need for cataloguing and access can be of essentially two kinds. If the problem is approached in highly structured terms, the appropriate tool could be a meta-catalogue that allows integrated querying of all the individual catalogues that collect locally digitised sources, which ideally should be provided with DOIs. An excellent example of both the structural complexity and the opportunities offered by this model is the SHARE Catalogue, the OPAC that allows integrated searching in the catalogues of the universities of Basilicata; of Naples Federico II, Parthenope, Orientale; of Salento; of Salerno; of Sannio; of Campania Vanvitelli.
Note: a page of the digital library “Digital Latin Library” reads: «Markup as scholarship. [The] semantic markup with XML must be considered part of the original, scholarly contribution of a digital critical edition, which in turn means that there must be a way of evaluating markup as scholarship. Accordingly, the DLL project is developing a rubric for assessing the quality of scholarly markup in editions submitted for publication in the Library of Digital Latin Texts. This rubric takes several standards into consideration, including not only adherence to the long-standing best practices of textual criticism, but also the guidelines established by the Text Encoding Initiative for using XML in scholarly editions».
Note: see for example Harriett E. Green, Facilitating communities of practice in digital humanities: librarian collaborations for research and training in text encoding, «The Library Quarterly», ( ), n. , p. - , doi: . / .
Note: the simplest conceptual scheme is 'content, services': the scholar produces content and the library provides the services of cataloguing, preservation and access («A significant portion of the responses seem to assume that when we are talking about “doing digital humanities” in libraries, we are talking about some kind of service libraries might provide»; Trevor Muñoz, Digital humanities in the library isn't a service, «Trevor Muñoz», August , ). All this, however, Muñoz continues, holds back the involvement of libraries in the digital humanities instead of promoting it: «Framing digital humanities in libraries as a service to be provided and consequently centering the focus of the discussion on faculty members or others outside the library seem likely to stall rather than foster libraries engagement with digital humanities. Digital humanities in libraries isn't a service and libraries will be more successful at generating engagement with digital humanities if they focus on helping librarians lead their own DH initiatives and projects. Digital humanities involves research and teaching and building things and participating in communities both online and off». But see also M. Posner, No half measures cit.
Note: J. Schaffner; R. Erway, Does every research library need a digital humanities center? cit., p. .
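The meta-catalogue idea can be made concrete with a small sketch in Python. It is only an illustration under stated assumptions, not a description of how the SHARE Catalogue works: the local catalogues are plain in-memory lists, the `Record` type and the `search_all` function are hypothetical, and the DOIs are invented placeholders. What it shows is why a shared persistent identifier matters: it lets records for the same digitised source, held in different catalogues, be brought together in one answer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    doi: str      # persistent identifier of the digitised source (placeholder values)
    title: str
    holder: str   # institution whose local catalogue holds the record


def search_all(catalogues, term):
    """Query every local catalogue and group matching records by DOI."""
    needle = term.lower()
    hits = defaultdict(list)
    for records in catalogues.values():
        for record in records:
            if needle in record.title.lower():
                hits[record.doi].append(record)
    return dict(hits)


# Invented sample data: two local catalogues describing the same source.
catalogues = {
    "library_a": [
        Record("10.0000/example.1", "Gazzetta locale, annata 1901", "Library A"),
        Record("10.0000/example.2", "Statuti comunali", "Library A"),
    ],
    "library_b": [
        Record("10.0000/example.1", "Gazzetta locale, annata 1901", "Library B"),
    ],
}

results = search_all(catalogues, "gazzetta")
for doi, records in results.items():
    print(doi, "held by", sorted(r.holder for r in records))
```

A real meta-catalogue would query remote systems and reconcile heterogeneous metadata, which is where the structural complexity mentioned above comes from; the grouping step is the simple part.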
In the SHARE case, a central will uses advanced technological means to connect catalogues distributed across the digital spaces of the participating institutions. The concept is clear but the implementation is complex:
The projects included in the SHARE family are promoted by libraries to establish procedures for the identification and reconciliation of entities, the conversion of data into linked data, and the creation of a virtual discovery environment based on the three-level structure of the BIBFRAME data model. From a technological point of view these projects are mostly based on the Linked Open Data Platform, an innovative technological system for managing bibliographic, archival and museum data and transforming them into linked data.
If instead a low-technology tool is considered appropriate (no discovery, no linked data, and the like), the exemplary case, not only for its characteristics but also for its history, is the Oxford Text Archive (OTA), founded in […] by Lou Burnard and Susan Hockey under the aegis of the Oxford University Computing Services. Those were pre-web times: texts in electronic format were put on floppy disks and sent by ordinary mail, at a cost that covered the medium and the postage. Today it presents itself as a digital library containing about […] texts annotated in TEI, […] in other formats, and corpora (in […] the formats were divided into txt, SGML and HTML). As for collection building, OTA states that it relies «upon deposits from the wider community as the primary source of high-quality materials»: those who know OTA contribute the rights-free digitised texts they have produced. An important part in all this is obviously played by the fact that OTA, over more than […] years of activity, has earned renown and authority.
The model is not technologically and organisationally intensive and is therefore more sustainable than others, since in essence it is an interface for browsing, selecting and downloading files from a server (in the early days of the web, OTA allowed direct download via FTP), lighter and simpler to maintain. This lightness and simplicity are certainly the outcome of explicit design choices and not of inertia in the face of technological evolution, because (thanks to the Internet Archive) a constant evolution of OTA over time can in any case be observed.
Another vast digital area in which significant library and information science actions could be undertaken is the preservation of and access to the «memory of scholarship»:
Preserving the collective and personal memories of recent decades over the long term is an undertaking made particularly complex by the need to integrate competences from considerably different fields: literary disciplines, archival techniques, information technology, legal questions, administrative aspects. Moreover, managing the digital archive presupposes constantly updating data models, standards and procedures to cope with the growing variety of documentary sources.
As long as the context of publication coincided with print, on a scholar's death his or her personal library often became part, as a special collection, of an academic or research library, and the same could happen for the personal archive of letters.
Note: SHARE Catalogue, s.d., .
Note: Tiziana Possemato; Claudio Forziati, Riuso, interoperabilità, influenza: la cooperazione virtuosa tra i progetti SHARE e Wikidata. In: La biblioteca che cresce: contenuti e servizi tra frammentazione e integrazione cit., p. .
Note: Oxford Text Archive, The OTA public FTP service, , .
The meaning and purpose of the acquisition are obviously to make it possible to know and study the scholar's modus operandi and interests. The situation that arises today, paradigmatically, on the death of a digital humanist is such as to make it impossible to recover the digital part of the memory of scholarship unless precise and specific protocols have been designed and put in place to deal with problems such as passwords for access to devices, external hard disks, subscription services, and so on. As regards access to unpublished content that is the scholar's intellectual property, such protocols require only the will of the parties involved and a good deal of technical competence on the part of the institution receiving the bequest; but they run up against legal issues specific to the digital world, for example concerning the transfer of any paid-access works for which the scholar held a licence, because the licence is personal and often expires if payment is not renewed. The result is that the scholar (or the heirs) could leave the collection to a library only for the open-access digital part and for the printed part. Email too increasingly contributes to completing the scholar's working context, but to date there are few archival approaches to its preservation. It presents specific technical problems of its own (formats, mail programs, passwords, attachments, possible presence of viruses and malware, etc.) in addition to the usual ones (essentially defining the boundaries between scholarly activity and private life).
Part of the problem is complexity. Email is not one thing, but a complicated interaction of technical subsystems for composition, transport, viewing, and storage. Archiving email involves multiple processes. Archivists must build trust with donors, appraise collections, capture them from many locations, process email records, meet privacy and legal considerations, preserve messages and attachments, and facilitate access. […] Email preservation is doable, but not yet done by enough archives to achieve our shared community goal to preserve correspondence, as we did for the paper-based archives that have facilitated untold historical insights.
Note: Paul Gabriele Weston; Emmanuela Carbé; Primo Baldini, Se i bit non bastano: pratiche di conservazione del contesto di origine per gli archivi letterari nativi digitali, «Bibliothecae.it», ( ), n. , p. - , doi: . /issn. - / ; Stefano Allegrezza, Le criticità nella conservazione degli archivi di persona tra passato, presente e futuro. In: Gli archivi di persona nell'era digitale: il caso dell'archivio di Massimo Vannucci. Bologna: Il Mulino, , p. - .
Note: in reality the situation arises for every scholar, because ever fewer are those who have never written a draft of an article or essay on a computer or never exchanged emails on work matters.
Note: Task Force on Technical Approaches for Email Archives, The future of email archives: a report from the Task Force on Technical Approaches to Email Archives, August . Washington, DC: Council on Library and Information Resources, , vol. , p. , . This recent report of the Council on Library and Information Resources is entirely devoted to the question of email preservation, and presents both an overall picture of the potential and the problems and a series of software tools for the archival management of email, but above all it aims to build «a working agenda for the community to improve and refine this technical framework, to adjust existing tools to work within this framework, and to begin filling in the missing elements».
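To give a sense of what "processing email records" and "preserving messages and attachments" can involve at the smallest scale, here is a minimal sketch in Python using only the standard library. It is not drawn from the report cited above: the `inventory` function, the addresses and the sample message are invented for illustration, and a real workflow would also have to handle mailbox formats, malware scanning and privacy review. The sketch parses one message, records basic descriptive metadata, and computes a SHA-256 checksum for each attachment so that its integrity can be verified later.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def inventory(raw_message):
    """Return basic descriptive metadata and attachment checksums."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    attachments = []
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        attachments.append({
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "sha256": hashlib.sha256(payload).hexdigest(),
        })
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "attachments": attachments,
    }


# Invented sample message standing in for one captured from a donor's mailbox.
sample = EmailMessage()
sample["From"] = "scholar@example.org"
sample["To"] = "colleague@example.org"
sample["Subject"] = "Draft of the article"
sample.set_content("Please find the draft attached.")
sample.add_attachment(b"first draft", maintype="application",
                      subtype="octet-stream", filename="draft.bin")

record = inventory(sample.as_bytes())
print(record["subject"], [a["filename"] for a in record["attachments"]])
```

Even this toy case touches several of the subsystems the quotation lists (composition, storage, attachments), which helps explain why archival practice for email is still uncommon.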
The central point is that the unified preservation (printed products and digital products) of scholars' library collections and the archival preservation of email are mutually connected: the one makes little sense without the other. The PAD initiative, Pavia Archivi Digitali, directed in Pavia by Paul Gabriele Weston, may appear similar to what is outlined here, but it focuses on living authors of published works who sign a contract entrusting PAD with the preservation of the digital materials relating to those works and of the digital context in which they developed. What we propose, in a way complementary to PAD, is to redefine more broadly than in the past the transfer of personal book and archival collections to libraries by scholars' heirs, taking into account the changed working methods of scholars themselves, which are increasingly a mixture of analogue and digital. A decisive part of this redefinition is the definition of operating protocols for managing and solving the technical problems specific to the digital, which once again call into play that primary component of library and information science already mentioned, consisting of information management, knowledge organisation and subsequent access, including through search.
The next complex question linked to the production or possession of digital/digitised sources is, as mentioned, long-term preservation: preservation that protects them both from failures and from changes in the contexts that produced them (a library closes, a website changes, a software supplier no longer supports the product, etc.).
The long-term preservation initiative Magazzini Digitali, currently being run in an experimental phase by the national central libraries, refers to works deposited in […], where the admission criterion is currently that the publication be either a doctoral thesis or the product of a publisher (and so, at the time of writing, any digitisation and annotation of printed sources produced in a research or preservation context that does not reach editorial publication cannot follow that route). On the other hand, the very name 'legal deposit' implies that the overall frame of reference is that of the activities of legal entities operating in publishing, not that of research initiatives by individuals or groups. One would therefore wish for an extension that allows entry into long-term preservation through Magazzini Digitali, if not yet for all (simple) electronic documents (defined by law […]/[…] as «documents disseminated via computer network»), then at least also for rights-free works held in digital libraries.
Note: P.G. Weston; E. Carbé; P. Baldini, Se i bit non bastano: pratiche di conservazione del contesto di origine per gli archivi letterari nativi digitali cit., p. - . An author may choose not to renew the contract and withdraw all of his or her writings from PAD (ivi, p. ).
Note: to the point that the digital management of the products provides for the behaviour to adopt should viruses or malware be detected (ivi, p. ).
Note: some web sources cited in this article (in notes […], […], […] and […]), for which reference is made to the Internet Archive, clearly illustrate the problem.
Note: Giovanni Bergamin; Maurizio Messina, Magazzini digitali: dal prototipo al servizio, «DigItalia», ( ), p. - , .
Web archiving (which often takes the form of self-archiving) made possible by resources such as arXiv or the Internet Archive, to mention just two famous, very different and meritorious ones, obviously does not meet the need for systematic and organic coverage that lies at the heart of a digital library.
The sustainability of digital humanities projects proves to be of fundamental importance not so much in the short term as in the medium-long term: if projects have high running costs for technical means (licences, space in server farms, and so on) and/or for staff competences (for example, the key role of one participant), the end of funding at the close of the project (or the departure of a highly qualified person!) can reduce to almost zero the activities of a project that can no longer develop. This aspect highlights, perhaps better than others, the character of advanced research that is proper to the digital humanities: their ways and procedures are not (yet) so widespread, known and shared that they can stand without great, very conscious and very focused efforts. On the other hand, if one bears in mind that
what DH has to offer […] is a heritage of practices and reasoning that could transform the planning born within a traditional discipline into new research questions, potentially reaching results that were not foreseen and could not otherwise be determined. If we wanted to summarise a vision of the role of DH, surely the prospect of revealing the unexpected and bringing out the unknown would represent the strong objective of this field of research
one can conclude that sustainability cannot be a decisive criterion: highly sustainable projects might fail to «reveal the unexpected and bring out the unknown», because they would probably not venture into the 'risky zones' that require uncommon skills, complex methodologies and non-ordinary technical means. In any case, once the project has come to an end, what it has produced (data and outputs) still needs to be preserved, catalogued and made accessible. For this need the figure of the data librarian is fundamental: someone who is part of the project from the start, to watch over and work so that data and outputs are designed and managed in the best possible way, in view of the needs, present (those of the project itself) and future, for accessibility and dissemination.
Another possible type of collaboration between libraries and digital humanities is that of shared initiatives on specific lines of action that highlight the usefulness and necessity of library and information science competences in developing and managing digital humanities projects, as Schaffner and Erway wrote in a report produced for OCLC precisely on the relationship between libraries and digital humanities:
There are many ways to respond to the needs of digital humanists, and a digital humanities (DH) center is appropriate in relatively few circumstances. Library leadership can choose from a range of possible directions:
- package existing services as a “virtual DH center”
- advocate coordinated DH support across the institution
- help scholars plan for preservation needs
- extend the institutional repository to accommodate DH digital objects
- work internationally to spur co-investment in DH across institutions
- create avenues for scholarly use and enhancement of metadata
- consult DH scholars at the beginning of digitization projects
- get involved in DH project planning for sustainability from the beginning
- commit to a DH center.
A DH center does not always meet the needs of DH researchers. When warranted, a DH center is not necessarily best located in the library. Library culture may need to evolve in order for librarians to be seen as effective DH partners.
Note: «One half was setting web crawlers upon NOAA web pages that could be easily copied and sent to the Internet Archive.» (Zoë Schlanger, Rogue scientists race to save climate data from Trump, «Wired», January , ). We recall this episode of the American climatologists (who, with the cuts to their activities decided by the Trump administration imminent, and the consequent impossibility of continuing to pay for cloud space, saved (moved) part of the data to the Internet Archive) because it shows well that data rescue actions are responses to complex and unpredictable conditions: sometimes one can plan for the medium-long term, sometimes one is forced to act in the very short term without planning.
Note: M. Daquino; F. Tomasi, Digital humanities e library and information science cit., p. .
The most interesting aspect of this picture probably lies in the opening sentence of the quoted passage. The article's title is the question: Does every research library need a digital humanities center?
The authors essentially answer this question where they write «a digital humanities center is appropriate in relatively few circumstances»: in their view, specific and pertinent actions are generally more appropriate than complex project and institutional planning of the kind needed to set up a digital humanities centre. (In the background there is also the complicated debate, partly philosophical and partly economic, about what the reasons for a digital humanities centre are, how to keep it alive, whether it has a foreseeable lifespan, and so on.) Their proposals therefore outline a progression of increasing complexity. It runs from reinterpreting existing library services («package existing services as a "virtual DH center"») all the way to, indeed, creating a digital humanities centre («commit to a DH center»). In between come liaison actions aimed at scholars («help scholars plan for preservation needs»; «consult DH scholars at the beginning of digitization projects»), others aimed at institutions («advocate coordinated DH support across the institution»; «extend the institutional repository to accommodate DH digital objects»), and still others that bring into play specific competences at local and international level («get involved in DH project planning for sustainability from the beginning»; «create avenues for scholarly use and enhancement of metadata»; «work internationally to spur co-investment in DH across institutions»). Since public and private funding for collection-centred projects has essentially disappeared, the lines of action suggested by Schaffner and Erway fit the present situation well, because they contain or imply, in various ways and degrees, an infrastructural component for which funding is still possible.

Note: J. Schaffner; R. Erway, Does every research library need a digital humanities center? cit., p. .

Note: Purely by way of example, one can mention the debate in RRCHNM : The future of digital humanities centers, Roy Rosenzweig Center for History and New Media, [ ], with contributions by Edward Ayers (President, University of Richmond), Bethany Nowviskie (Director of Digital Research & Scholarship at the University of Virginia Library), Brett Bobley (Office of Digital Humanities, National Endowment for the Humanities) and Stephen Robertson (Director, Roy Rosenzweig Center for History and New Media); or recall Ying Zhang; Shu Liu; Emilee Mathews, Convergence of digital humanities and digital libraries, «Library management», ( ), n. - , p. - , DOI: . /lm- - - , who write «DH remains uncertain about how to ensure successful projects with long-lasting impact».
Conclusion

The sources mentioned in the discussion so far are, as we have seen, largely foreign, and this could in some way justify the observation that

«the library models proposed in the Italian literature, with reference to the public library, are partly derived from experiences carried out abroad and therefore prove poorly suited to describing the phenomenal reality of Italian libraries»

an observation that is indisputably well founded in methodological terms, because there is no doubt that the replicability of experiences and models is conditioned by differences in cultures and in legal and administrative contexts. In this article, however (whose focus is research libraries), foreign experiences are reported as catalysts for reflection and not as models to be implemented slavishly. The article has also shown the underlying connection between Italian humanities computing, born from and still strongly connected with textual studies, and a proposal for a strong positioning of libraries in the universe of activities connected with digital sources (digitisation, preservation, cataloguing, distribution, etc.), of which the SHARE Catalogue, Magazzini digitali and PAD are points of reference. Nor should it be forgotten that the Italian world of digital humanities, beyond its constitutive and current specificities, is closely interconnected with experience and reflection in the rest of the English-speaking world, and that the many Italian digital humanists working abroad create a continuous osmosis between Italy, Europe and the rest of the world.
As for the specific content, it is evident that conceiving and outlining the relationship between library and information science and digital humanities (which in practice means between librarians and digital humanists) along the lines set out in the preceding pages is not trivial, since it requires a strong evolution from both parties, and moreover at a time when resources are scarce and shrinking. Digital humanists usually do not seek the help of libraries and work on sources on their own, partly because they are often struggling to learn new tools and refine their methods. By working together, however, librarians and digital humanists will be able to reclaim responsibility for, and the practice of, the production process, which in the digital realm often seems remote and impossible to manage («seems» because that is certainly how things ordinarily appear, but this does not mean that it is impossible to operate differently).

Note: Anna Galluzzi; Alberto Salarelli, Dialogando sui modelli, «Biblioteche oggi trends», ( ), n. , p. - .

Note: «Yet despite this ongoing engagement, libraries are often unsure how they should respond as DH attracts more and more practitioners and its definition evolves to cover an everexpanding range of techniques and methods» (Stewart Varner; Patricia Hswe, Special report: digital humanities in libraries, «American libraries magazine», January , ).

The first space of relationship and collaboration is the digital scriptorium: the library that hosts digitisation activities, to the point of eventually becoming a centre for them.
We have discussed it here in relation to scholars, but the discourse on learning commons applies to them as well (in itself it is oriented, through its focus on learning, towards students; but it is characteristic of the digital humanities that scholar and student are in the same condition of discovery through learning):

«Putting the learner at the center of library space planning is a return to the first paradigm, with the critical differences that information is now superabundant rather than scarce and now increasingly resident in virtual rather than in physical space.»

In learning commons, unlike information commons, knowledge is not only consumed: these centres are in fact designed to stimulate the creation of new knowledge:

«The learning commons more readily reflects the understanding that students, as learners, are not merely information consumers but actively participate with information in order to create meaningful knowledge and wisdom.»

The second space of collaboration is the management of content in the forms customary for libraries (cataloguing, preservation, access, etc.). It can take shape along the complex line exemplified by the SHARE Catalogue, along the agile line of a digital library such as the OTA (where complex and agile refer to the underlying technological, information and computing structures), or along the line, initiated by PAD, of defining protocols for managing the library and correspondence collections of contemporary scholars who to varying degrees, even if not primarily, worked in the digital world or with digital tools.
The third area consists of the collaborative activities suggested by Schaffner and Erway, in which librarians contribute their skills in the contexts where digital humanities research is defined and developed. That research rereads and reinterprets for the present the oldest function of the library, namely the organisation of knowledge:

«Of all scholarly pursuits, digital humanities most clearly represents the spirit that animated the ancient foundations at Alexandria, Pergamum, and Memphis, the great monastic libraries of the Middle Ages, and even the first research libraries of the German Enlightenment. It is obsessed with varieties of representation, the organization of knowledge, the technology of communication and dissemination, and the production of useful tools for scholarly inquiry»

within the framework of an uninterrupted relationship between humanists and libraries:

«It is in libraries that humanists have always found their basic and essential instrumentation. Libraries can be described as the humanist's lab. Obviously, this applies also to digital humanists, who deal with digital objects for research purposes, and to digital libraries that store collections in digital form.»

Note: Maria Cassella, Terza missione e modelli biblioteconomici: come evolve il profilo della biblioteca accademica. In: La biblioteca che cresce: contenuti e servizi tra frammentazione e integrazione cit., p. - ; the two quotations are from Scott Bennett, Libraries and learning: a history of paradigm change, «Portal: libraries and the academy», ( ), n. , p. .

Note: Stephen Ramsay, Care of the soul, «Literatura mundana», October , . The original links to the source are lost and only the one offered by the Internet Archive remains.

Article submitted in January and accepted in July.

Abstract

AIB studi, n. - (January/August ), p. - . DOI . /aibstudi- . ISSN: - , e-ISSN: - .

Maurizio Lana, Università degli Studi del Piemonte Orientale "Amedeo Avogadro", Dipartimento di Studi Umanistici, Vercelli, e-mail maurizio.lana@uniupo.it.

Digital humanities and libraries

The beginnings of digital humanities are hard to trace, but they are in any case connected with the use of existing libraries (medieval or nineteenth-century) or with the creation of new libraries at the end of the last century, in the projects of the Index Thomisticus, the Computer Assisted Tools for Septuagint Studies, and the Thesaurus Linguae Graecae. The underlying methodological imprint, across this multi-temporal and multi-centred beginning, is the study of texts, around which very different and even apparently (or really) distant disciplines meet.
In the international context this imprint, though present, is questioned as exclusionary with respect to a variety of themes ranging from cultural studies to media studies to the geopolitical inclusion of the Global South. The Italian situation, also through AIUCD, the Associazione di Informatica Umanistica e Cultura Digitale, is instead characterised by the ability to recognise, in constantly renewed forms, the vital capacity of text and textuality to act as the connective tissue for a variety of contents and contexts.

Note: Dino Buzzetti, Where do humanities computing and digital libraries meet? In: Digital libraries and archives: th Italian research conference, IRCDL , Bari, Italy, February - , : revised selected papers, Maristella Agosti [et al.] (eds). Berlin, Heidelberg: Springer, , p. , DOI: . / - - - - _ .

The Digital Native - Myth and Reality
Neil Selwyn. Aslib Proc. Fields: Sociology, Computer Science.

Purpose: The purpose of this paper is to develop and promote a realistic understanding of young people and digital technology with a view to supporting information professionals in playing useful and meaningful roles in supporting current generations of young people.
In particular the paper aims to offer a critical perspective on popular and political understandings of young people and digital technologies, characterised by notions of "digital natives", the "net generation" and other…
Stylometric Techniques for Multiple Author Clustering
Shakespeare's Authorship in The Passionate Pilgrim

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. , No. , www.ijacsa.thesai.org

David Kernot, Joint and Operations Analysis Division, Defence Science Technology Group, Edinburgh, SA, Australia
Terry Bossomaier, The Centre for Research in Complex Systems, Charles Sturt University, Bathurst, NSW, Australia
Roger Bradbury, National Security College, The Australian National University, Canberra, ACT, Australia

Abstract: In - , the printer William Jaggard named Shakespeare as the sole author of The Passionate Pilgrim, even though Jaggard had chosen a number of non-Shakespearian poems for the volume. Using a neurolinguistics approach to authorship identification, a four-feature technique, RPAS, is used to convert the poems in The Passionate Pilgrim into a multi-dimensional vector. Three complementary analytical techniques are applied to cluster the data and reduce single-technique bias before an alternate method, seriation, is used to measure the distances between clusters and test the strength of the connections. The multivariate techniques are found to be robust and able to allocate nine of the unknown poems to Shakespeare. The authorship of one of the Barnfield poems is questioned, and the analysis suggests that others are collaborations or works of poets yet to be acknowledged. It is possible that as many as poems were Shakespeare's and at least five poets were not acknowledged.

Keywords: authorship identification; principal component analysis; linear discriminant analysis; vector space method; seriation

I. Introduction

William Jaggard first printed The Passionate Pilgrim in - , and the authorship of the poems within it was attributed to William Shakespeare [ ]. However, Bartholomew Griffin's , Fidessa More Chaste than Kind, already contained poem [ ].
Another, poem , appeared anonymously in Anne Cornwallis' personal notebook alongside works from Sir Philip Sidney, Sir Walter Raleigh, Sir Edward Dyer and Edward de Vere, th Earl of Oxford [ ]. The list grows: in , Jaggard's brother John printed Richard Barnfield's The Encomion of Lady Pecunia, containing poems and [ ]. By , only five had been confirmed as Shakespeare's (poems , , , , and ), having appeared in the Sonnets or in his play Love's Labour's Lost [ ]. Then England's Helicon also printed a version of poem , attributing it to Christopher Marlowe, although its reply (signed Ignato) was later said to be by Sir Walter Raleigh [ ]. Jaggard persisted with his claim, and in the third edition added a number of poems from Thomas Heywood; after complaints, however, Jaggard removed Shakespeare's name from the title [ ]. By then, the authorship of unknown poems lay in doubt, and it has remained so for over years.

Modern scholars are divided on the authorship of the remaining unknown twelve. Reference [ ] suggests Jaggard used Shakespeare's name because the majority of the poems were Shakespeare's, including unidentified poems in The Passionate Pilgrim said to be his earlier-quality work and never meant for publishing. She also adds that there is some doubt surrounding the authorship of the Barnfield and Griffin poems. Reference [ ] disputes Shakespeare's authorship, while [ ] suggest eight, not of the anonymous poems are Shakespeare's. However, [ ] suggest poems , , , , , , and use a six-line stanza format similar to Shakespeare's Venus and Adonis, and that poems , , and are about Venus and Adonis and have Shakespearian similarities, but [ ] says poems and resemble Robert Greene's poems. It is interesting to note that unknown poem gets little attention, even though it appears in Thomas Delaney's The Garland of Goodwill and was entered into the Stationers' Register ledger during - [ ].
When chosen by Jaggard, Delaney was living with an arrest warrant over his head because of his insightful writing during the London riots and was in no position to complain [ ]; what is strange is how few references there were in the literature to Delaney as the author until recently. Either way, Jaggard cannot be asked about the true authorship of the poems, and today the poems, for the most part, remain unidentified.

Stylometric analysis, the quantitative analysis of a text's linguistic features, has been used extensively to determine the authorship of the undocumented collaborations of playwrights from the Elizabethan period, including Shakespeare [ ]. There appears to be dissension among leading Shakespearean authorship attribution scholars about an agreed method [ ], but the most successful and robust methods are based on low-level information such as the frequencies of character n-grams or auxiliary words (function words, or stop words such as articles and prepositions) [ ]. The premier work in evaluating authorship in the th to mid- th centuries includes that of MacDonald P. Jackson, Brian Vickers, and Hugh Craig and Arthur Kinney [ ]. Jackson [ ] uses common low-frequency word phrases, repetition of phrases, collocation, and images to link word groups to other works. Vickers [ ] uses a tri-gram, or n-gram, approach, while Hirsch and Craig [ ] use function word frequency and other methods, including ones based on word probabilities, the information-theoretic measure Jensen-Shannon divergence (JSD), and unsupervised graph partitioning clustering algorithms [ ]. There are also other techniques used in this period of Shakespearean analysis, including simple function words [ , ] and word adjacency networks (WANs) [ ].
However, the meaning-extracting method (MEM) from the field of psychology, which extracts themes from commonly used adjectives and describes a person through their personality, or self, is very different [ , ]. The authors offer a new and alternative approach to authorship identification using personality.

A. An Approach Using RPAS

In this paper, a methodology is employed that adopts a multi-faceted approach to text analysis to reveal details about a person's personality, their sense of self, from subtle characteristics hidden in their writing style [ - ]. The techniques draw on biomarkers for creativity and known psychological states [ - ] to identify characteristics within The Passionate Pilgrim poems. The approach uses a series of four indicators (RPAS) identified in [ ] to create a stylistic signature from a person's writing:

Richness (R) [ ]: the number of unique words used by an author.
Personal pronouns (P) [ - ]: the pronouns used, closely aligned to gender and self.
Referential activity power (A) [ - ]: based on function words, or word particles, derived from clinical depression studies.
Sensory (S) [ - ]: five sensory measures (V: visual, A: auditory, H: haptic, O: olfactory, G: gustatory) corresponding to the senses.

RPAS is used to create individual stylistic signatures of The Passionate Pilgrim poems, and the known works of William Shakespeare, Christopher Marlowe and Sir Walter Raleigh, Richard Barnfield, and Bartholomew Griffin are labelled. Three clustering techniques are then applied to identify the likely authorship of the unknown poems within The Passionate Pilgrim.

II. Methodology

The text of The Passionate Pilgrim contained within The Complete Works of Shakespeare [ ] is used. The data are processed with the Stanford part-of-speech tagger [ ] to remove all punctuation and symbols, and the works are then aggregated by word frequency.
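As a rough illustration of the first two RPAS indicators and of the word-frequency aggregation step, the sketch below tokenises a text, counts word types, and derives a richness score and a pronoun share. It is a minimal sketch under stated assumptions, not the authors' implementation: the regular-expression tokeniser stands in for the Stanford tagger, the pronoun lists are hypothetical, and normalising richness by the total token count is an assumption (the text defines richness only as the number of unique words).

```python
import re
from collections import Counter

# Hypothetical pronoun lists, for illustration only; the paper's lists are not given here.
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}


def word_frequencies(text):
    """Lowercase the text, drop punctuation and symbols, and count each word type."""
    tokens = (t.strip("'") for t in re.findall(r"[a-z']+", text.lower()))
    return Counter(t for t in tokens if t)


def richness(freq):
    """Assumed normalisation: distinct word types divided by total tokens."""
    total = sum(freq.values())
    return len(freq) / total if total else 0.0


def pronoun_share(freq, pronouns):
    """Share of all tokens that belong to the given pronoun set."""
    total = sum(freq.values())
    return sum(freq[p] for p in pronouns) / total if total else 0.0


if __name__ == "__main__":
    freq = word_frequencies("She loves him, and he loves her; she sings.")
    print(richness(freq), pronoun_share(freq, FEMININE), pronoun_share(freq, MASCULINE))
```

Applying the same functions to each poem chunk would give one row of indicator values per poem; the referential activity and sensory indicators would need their own word lists, which are not reproduced here.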
The Passionate Pilgrim is further broken down into chunks that represent each known poem, and a decision was made to follow the modern approach by editors [ ] and divide poem into two poems (labelled as and ), with a subsequent renumbering of the remaining poems, so that there are twenty-one and not twenty poem chunks (refer to Table ). The , -word data ends up as an aggregated matrix of , distinct word types across poems, and the size of each varies between and words (average = ). Putting this into perspective, they are slightly larger than a Shakespearian sonnet, which varies between and words (average = ).

TABLE I. The list of the poems by Shakespeare, Barnfield, Griffin and Marlowe, including the unknown authored poems in The Passionate Pilgrim (poems by author and abbreviated ID, in poem order)

ID (abbreviated)   Author
S                  William Shakespeare
S                  William Shakespeare
S                  William Shakespeare
U                  Unknown
S                  William Shakespeare
U                  Unknown
U                  Unknown
B                  Richard Barnfield
U                  Unknown
U                  Unknown
G                  Bartholomew Griffin
U                  Unknown (Thomas Delaney)
U                  Unknown
U                  Unknown
U                  Unknown
U                  Unknown
S                  William Shakespeare
U                  Unknown
U                  Unknown
M                  Christopher Marlowe and Walter Raleigh
B                  Richard Barnfield

A play written after Shakespeare ceased writing is used to provide an independent author perspective and clustering technique. The Tragedy of Mariam, the Fair Queen of Jewry, by the English poet and dramatist Elizabeth Cary [ ], was published years after The Passionate Pilgrim and is stylistically very different to Shakespeare's work.

A nine-dimensional array is created from the data using RPAS. Three complementary techniques are then applied to reduce any single-technique bias, and the results are overlaid against richness (R) and personal pronoun (P) to determine the possible authorship of the unknown poems.
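The vector space method used in this methodology compares each poem's vector with that of the Cary play using cosine and minmax similarity. The sketch below shows the two measures on plain Python lists. It is a minimal sketch, assuming non-negative feature vectors of equal length and taking minmax in its usual form (sum of element-wise minima over sum of element-wise maxima); the function names and the example vectors are hypothetical and are not values from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def minmax_similarity(a, b):
    """Sum of element-wise minima divided by sum of element-wise maxima.

    Assumes non-negative features such as frequencies or proportions.
    """
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def rank_by_similarity(poems, imposter, measure):
    """Order poem labels from most to least similar to the imposter vector."""
    return sorted(poems, key=lambda label: measure(poems[label], imposter), reverse=True)


if __name__ == "__main__":
    # Made-up three-feature vectors, purely to exercise the functions.
    poems = {"poem_a": [1.0, 2.0, 0.0], "poem_b": [0.0, 1.0, 3.0]}
    imposter = [2.0, 4.0, 0.0]
    print(rank_by_similarity(poems, imposter, cosine_similarity))
    print(rank_by_similarity(poems, imposter, minmax_similarity))
```

Cosine similarity ignores vector length, whereas minmax is sensitive to magnitude, which is one reason the two can be read side by side.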
As a final measure, seriation, an exploratory combinatorial data analysis technique, is used to visualise the nine-dimensional array as a one-dimensional continuum and to test the strength of the co-located cluster edges by adding random noise to the data vector.

A. Three Complementary Techniques

Principal component analysis (PCA) of the poems (threshold set to . to ignore any non-significant contributions) determines the variance explained through eigenvalues and identifies any significant factors, known as components, within the data. Four components are then aggregated to examine the clusters.

Linear discriminant analysis (LDA) is used as an alternate classification technique to PCA [ - ]. The unknown works are removed, and all of the individual known authors' poems are numbered from to before training the model and reintroducing the unknown poems. Using the resultant coefficients from the three canonical discriminant functions, functions - and - are aggregated to visually compare the clusters.

The vector space method (VSM) technique [ - ] is used with Elizabeth Cary's The Tragedy of Mariam, the Fair Queen of Jewry as an imposter [ ]. Pair-wise comparisons of each of the Passionate Pilgrim poems are made against Elizabeth Cary's play ( pair-wise comparisons) using both cosine and minmax similarity detection, to highlight the clusters that form based on their distance from Cary's play.

B. Seriation

According to [ ], "seriation is an exploratory combinatorial data analysis technique to reorder objects into a sequence along a one-dimensional continuum so that it best reveals regularity and patterning among the whole series." Seriation is the process of placing a linear ordering on a set of n multi-dimensional quantities. The total number of possible orderings is n! (n factorial). This grows extremely quickly with n: ! = , ! = . million and !
= . x , or . billion billion (or quintillion). Thus, even for quite small n, it is not possible to find the shortest path by calculating all possible paths, and a heuristic or approximation is needed. Inevitably, any given approximation will work better with some data than others. For a robust estimation of the shortest path, it might therefore be necessary to try a range of different estimators and look for consistency among them.

The analysis uses R, the free software environment for statistical computing and graphics, and its seriation package [ ]. The seriation package is provided with the x matrix consisting of the nine RPAS values for each of the poems of The Passionate Pilgrim. Using the Euclidean distance option, seriation attempts to minimise the Hamiltonian path length (a Hamiltonian path on a graph is a path which visits every node exactly once). The results of the six Hamiltonian path-length calculations produced by the seriation package are evaluated (TSP: travelling salesperson; Chen: rank-two ellipse seriation; ARSA: anti-Robinson simulated annealing; HC: hierarchical clustering; GW: hierarchical clustering (Gruvaeus-Wainer heuristic); and OLO: hierarchical clustering (optimal leaf ordering)). While seriation gives a one-dimensional continuum, dendrogram branch and leaf visualisations are also provided, and clusters can be separated by their Hamiltonian path distances [ ]. The technique that provides the shortest Hamiltonian path is selected, and noise is introduced into the matrix, using the jitter function in R, to examine the strength of the connected groups. The function adds random noise to the vector by drawing samples from the uniform distribution of the original data [ ].

III. Analysis Using RPAS

Personal pronouns (P) is plotted against richness (R) (PtoR) for The Passionate Pilgrim poems (see Fig. ).
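The seriation step described in Section II-B can be illustrated with a small sketch: the Hamiltonian path length of an ordering under Euclidean distance, an exhaustive search over all n! orderings (only workable for tiny n), a simple heuristic, and uniform noise for a robustness check. This is a minimal sketch under stated assumptions and not the R seriation package: the nearest-neighbour heuristic is a stand-in chosen for brevity and is not one of the six methods listed above, and the `amount` parameter of the hypothetical `jitter` helper is an assumed noise half-width.

```python
import itertools
import math
import random


def path_length(points, order):
    """Hamiltonian path length: summed Euclidean distance between consecutive points."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def brute_force_order(points):
    """Evaluate all n! orderings and return the shortest; feasible only for very small n."""
    return min(itertools.permutations(range(len(points))),
               key=lambda order: path_length(points, order))


def nearest_neighbour_order(points, start=0):
    """Greedy heuristic (a stand-in): always step to the closest unvisited point."""
    order = [start]
    remaining = set(range(len(points))) - {start}
    while remaining:
        last = points[order[-1]]
        nxt = min(remaining, key=lambda j: math.dist(last, points[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def jitter(points, amount, rng):
    """Add uniform noise in [-amount, amount] to every coordinate."""
    return [[x + rng.uniform(-amount, amount) for x in row] for row in points]


if __name__ == "__main__":
    points = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
    best = brute_force_order(points)
    print(best, path_length(points, best))
    print(nearest_neighbour_order(points))
    print(math.factorial(10), math.factorial(20))  # number of orderings for n = 10 and n = 20
    noisy = jitter(points, 0.01, random.Random(0))
    print(nearest_neighbour_order(noisy))
```

If the ordering found on the jittered data matches the ordering on the original data, the neighbouring groups are less likely to be an artefact of small numerical differences, which is the purpose of the noise test described above.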
ptor discriminates the unknown poems and with shakespeare (poems and ), and they have a low feminine gendered style (p > ), while all of shakespeare's known poems have a lower feminine gendered style (p > ). contrasting this is the group consisting of the cluster with unknown poems and that are similar in style to griffin (poem ) and barnfield (poem ), all of which have a higher masculine style (p > ). the shakespeare (poem ) and the marlowe and walter raleigh (poem ) are similar, as are barnfield (poem ) and shakespeare (poem ). the unknown poem (from delaney) has a low richness score and is separate from the main body of poems.

a. principal component analysis (pca)

the findings show that many pca correlations are in excess of . . a visual indication of the correlation matrix highlights that coefficients are around . or higher, with some as high as . , and bartlett's test is significant (p = . ), meaning there is some correlation between variables and indicating that pca is worthwhile. four components are extracted and account for . % of the variance.

fig. . in this the passionate pilgrim gendered personal pronouns (p) versus richness (r) diagram, the overlays of the results of lda, vsm, and pca analysis highlight the consistency of other results. a barnfield / griffin series of poems can be seen ( , , , and ) with greater than % gendered personal pronouns. this is supported by lda, vsm and pca analysis. a shakespeare series of poems can be observed ( , , , , and ), also supported by lda and vsm analysis. a shakespeare / marlowe / raleigh series is observed ( and ) to have less than % gendered personal pronouns, supported by lda analysis. clearly, delaney's poem is supported by lda and pca analysis as a standalone work and also has the lowest richness.
in the range of - % gendered personal pronouns are the shakespeare / barnfield poems ( , , and ), supported by lda and vsm analysis, and alongside these are the unknown poems ( , , ) (and , , supported by lda analysis). further, the ellipses are a visual clustering assignment.

in fig. , the two common clusters are overlaid. a barnfield / griffin group ( and ) is found to sit with unknown poems and . while unknown poem (thomas delaney) was close to shakespeare ( ) and marlowe and raleigh ( ), it is the furthest poem from the shakespeare cluster on the factor and scale that accounts for ~ % of the variance. additionally, the results highlight that all of the known shakespeare poems cluster (poems , , , , with , , , and ). poem is close to barnfield ( ), and poems , , , and are close to shakespeare ( ).

b. linear discriminant analysis (lda)

three functions were extracted, and the first two accounted for . % of the variance ( = . and = . ). the wilks' lambda test of functions through was significant (p = . ), which indicates that the null hypothesis can be rejected and suggests that all three functions together have a discriminating ability. the second and third functions together are not significant (p = . ), and neither is function on its own (p = . ). functions - and functions - are plotted to generate six common clustering results (see fig. ). it is found that the unknown poems and are again close to shakespeare ( ) and barnfield ( ), as is . unknown poems and are closer to griffin ( ) this time and further from barnfield ( ). unknown poem (thomas delaney) is again closest to shakespeare ( ) and marlowe and raleigh ( ) but stands alone. poem is again close to shakespeare ( and ). while poem is also close to shakespeare ( , , and ), poem is far from all the poems but closest to griffin ( ). poem is closest to shakespeare ( ). poem is closest to shakespeare ( ), and poem is in the middle of shakespeare ( ), barnfield ( ) and griffin ( ).
again, there is some consistency with these results, but there seems to be a lack of clarity with poems , , and .

c. the vector space method (vsm)

pair-wise comparisons of each of the passionate pilgrim poems against elizabeth cary's play, the tragedy of mariam, the fair queen of jewry ( pair-wise comparisons), using both cosine and minmax similarity detection, highlight the clusters that form based on their distance from cary's play. fig. indicates the three common clustering results. here, unknown poems and are in a cluster with griffin ( ). unknown poem is in a cluster with shakespeare ( , , and ) and marlowe / raleigh ( ) and poems and , and is closest to shakespeare ( ), while delaney's poem and are closest to shakespeare ( ), but furthest away. unknown poems , , , , , , and are in a cluster with shakespeare ( and ) and barnfield ( ). in this cluster barnfield ( ) is very close to shakespeare ( ), and poems and have an almost identical score.

throughout these different analysis techniques, there is a consistency in three to four clusters forming with common poems in them, but many of the techniques have been dependent on an arbitrary visual clustering size. therefore, to add further reliability to the results, the data is clustered using seriation to measure cluster distances.

d. seriation

the r seriation package is fed a x matrix of the data and, using euclidean distance, seriation of the data minimizes the hamiltonian path length. results of the six seriation techniques available highlight that hierarchical clustering with optimal leaf ordering (olo) outperforms the travelling salesperson technique (path lengths . vs. . ). incorporating the clustering of the olo dendrogram at a height of , the order of the chunks with clusters highlighted is [ ] [ ] [ ] [ ], and it highlights some susceptibility between poems - , - , and - .
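The seriation step described above can be illustrated with a small sketch. It is not the R seriation package and does not reproduce any of its six methods; it is a plain Python illustration, on a made-up toy matrix, of the quantity being minimised (the Hamiltonian path length under Euclidean distance) and of why a heuristic is needed once exhaustive search over n! orderings becomes impractical. The function names, the toy data, and the nearest-neighbour heuristic are assumptions chosen for illustration only.

```python
import itertools
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def path_length(rows, order):
    """Length of the path that visits rows in the given order (no return leg)."""
    return sum(euclidean(rows[i], rows[j]) for i, j in zip(order, order[1:]))


def brute_force_order(rows):
    """Try every ordering; feasible only for very small n (there are n! orderings)."""
    candidates = itertools.permutations(range(len(rows)))
    return list(min(candidates, key=lambda order: path_length(rows, order)))


def nearest_neighbour_order(rows, start=0):
    """Greedy heuristic (an assumption for illustration, not a method from the paper)."""
    order = [start]
    remaining = set(range(len(rows))) - {start}
    while remaining:
        last = rows[order[-1]]
        # Ties are broken by index so the result is deterministic.
        nxt = min(remaining, key=lambda j: (euclidean(last, rows[j]), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical toy matrix: 6 items x 3 features (not the paper's RPAS data).
toy = [
    (0.0, 0.0, 0.0),
    (3.0, 4.0, 0.0),
    (6.0, 8.0, 0.0),
    (1.0, 1.0, 1.0),
    (5.0, 5.0, 2.0),
    (9.0, 9.0, 9.0),
]

exact = brute_force_order(toy)
greedy = nearest_neighbour_order(toy)
print("orderings searched exhaustively:", math.factorial(len(toy)))
print("exact order:", exact, "length:", round(path_length(toy, exact), 3))
print("greedy order:", greedy, "length:", round(path_length(toy, greedy), 3))
```

Exhaustive search is only possible here because the toy matrix has six rows. For larger inputs the number of orderings makes this infeasible, which is why the paper compares the path lengths returned by several heuristics and keeps the shortest.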
when the distances between each poem are compared on either side of poems - ( - - - ), - ( - - - ), and - ( - - - ), the ordering sequence and distance information is important (refer to table ).

table ii. hamiltonian path distances between the passionate pilgrim poems (columns: poem edges, path length).

further, examining the olo dendrogram edge clusters that form at a dendrogram height of shows consistency in two of the three separation points. in the cluster split at poems - , it can be seen that - and - are closer than - ( . versus . and . ). in the cluster split at poems - , - and - are closer than - ( . versus . and . ), but in the - cluster split, while - and - are closer than - , the differences between - and - are marginal ( . and . versus . ).

to see how stable the results are, in particular the stability of the clusters connected at the poems - split, noise is inserted into the initial x rpas-poem matrix and euclidean distances are recalculated with various amounts of noise (noise – ). an examination of the scene chunk order after seriation (refer to table ) highlights the high level of stability within the seriation and olo clustering results. the different olo seriation results show changes in order when noise is added to the rpas poem matrix. at noise levels of around , poems and switch positions, but then revert with further noise. at noise levels and above, the poems of the barnfield – griffin cluster ( , , , and ) move internally within the cluster but no poems leave.
at noise levels and higher, the poems of the shakespeare – marlowe cluster ( , , , , , , and ) move internally, and at no point does poem move out of the cluster and join with poem .

table iii. the different olo seriation results show changes in order when noise is added to the rpas poem matrix (columns: noise, order). this suggests a high level of stability in the seriation olo order and olo clustering results ([ ] [ ] [ ] [ ]).

iv. discussion

overall, the techniques were generally consistent, and seriation was useful because it was able to provide clustering and distance measures that appeared stable even with a relatively high level of introduced noise. therefore, the basis of these findings lies in a rigorous multivariate approach to analysis and not a single technique.

however, one of the biggest concerns is the influence of the publisher. while jaggard or his associates cannot be discounted from having a hand in adding their own touches to some of these unknown poems, blending them as it were so they appear as part collaborations, it is an unknown factor. it is known that jaggard was able to get hold of some of shakespeare's unpublished work, and both he and his brother john had access to a large number of elizabethan works. what cannot be known is how much of this was early unpublished work.

of the anonymous poems, two are likely shakespeare's, possibly from his earlier unpublished works (poems and are similar to shakespeare's poems and and, to a lesser extent, poem ).
however, if they were not earlier shakespearian poems, then they are from another poet entirely, one that has not been examined. two other poems ( and ) have a blended style similar to griffin ( ) and barnfield ( ); there is more of griffin's style (similar to poem ) in them than barnfield's, and they are more likely to be griffin's unpublished work. again, if they are not unpublished griffin poems, then they too are by a poet that has not been examined in this paper. poem ( ) has a blended style similar to shakespeare ( ) and marlowe / raleigh ( ) but consistently shows itself to be different enough to be by an independent poet, and to be the work of thomas delaney, whose other poems were outside of this analysis.

the remaining seven unknown poems ( , , , , , , and ) are all similar in style to a blended shakespeare ( and ) and barnfield ( ). all of these, as are all of shakespeare's poems here, have a richness score over %. they all have a personal pronoun score below %, which would be deemed a feminine writing style, which fits shakespeare. poems , , and are very similar in style to each other and closer to shakespeare's ( ) style than barnfield's ( ). poems , and are closer to barnfield's ( ) style than shakespeare's ( , ). poems and have a higher shakespeare ( ) style than barnfield's ( ) and are higher overall from the shakespeare poems ( and ). this close style of barnfield's poem ( ) to shakespeare's ( ) is an anomaly, and if it were not for the work sitting in the shakespeare cluster between and , then it could easily be said that all the poems ( , , , , , , and ) are shakespeare's.

the literature around richard barnfield is examined more closely. while barnfield and shakespeare were certainly friends [ ] and could have collaborated, these poems are likely to be shakespeare's because the style of barnfield's poem ( ) is very similar to shakespeare's poem ( ).
it has been suggested that the version of barnfield's manuscript obtained by william jaggard's brother john was of insufficient length (indicated by the sparse printing layout), and that william jaggard provided his brother with two poems from the yet unpublished the passionate pilgrim to extend barnfield's lady pecunia publication. in the reprint of richard barnfield's lady pecunia, the two poems from the first edition (poems and from the passionate pilgrim) were not included [ - ]. according to [ ], barnfield is said to have claimed authorship of only one of the two poems (stylistically likely poem ). if this is true, then it explains the striking similarities between the shakespeare and barnfield poems ( and ), and it is a good indication that shakespeare wrote both and , and therefore that poems , , , , , , and are shakespeare's poems. while it further reinforces jaggard's approach to borrowing from other authors' works, from the analysis it is believed that shakespeare wrote nine of the twelve unknown poems ( , , , , , , , , and ) including , , , , , and .

v. conclusion

given shakespeare's signature in almost three-quarters of the poems, jaggard may have adopted shrewd marketing tactics in using shakespeare's name as the sole author. indeed, when he expanded the third edition with a collection of nine of heywood's poems, he did not remove shakespeare's name from the title, nor did he add heywood as co-author. but in his collection of assorted verses, jaggard merely adopted what was a standard convention among publishers of the day [ ]. the analysis would suggest that the five authors, barnfield, delaney, griffin, marlowe, and raleigh, were not acknowledged, and several poems may well be collaborative works between shakespeare and others, but this also was common [ ]. it is also possible that several poems ( , , , ) are not early work or collaborations, but other writers' poems not studied here.
this failing to acknowledge all author‘s poems would seem, at least by today's standards, to be an injustice. however, as it can be seen with jaggard's publication of the passionate pilgrim and his later publication of shakespeare's first folio, jaggard focussed on promoting shakespeare's work above all others. in this paper, authors have demonstrated an alternate stylometric technique that can identify self and cluster multiple authors using rpas. it includes the use of sensory- based adjectives and words that are strong in concreteness and imageability that reflect known psychological states in an individual's personality. they believe that further research is warranted to see if rpas can identify changes in an individual's stylometric fingerprint over time. acknowledgment the authors thank d. crone and c. van antwerpen for critical discussions and reading of the manuscript. this research supported by the defence science technology group, the australian government‘s lead agency dedicated to providing science and technology support for the country‘s defence and security needs. references [ ] erne, l. ( ). shakespeare and the book trade. cambridge university press. pp. - . [ ] devington, d. ( ) the poems by william shakespeare. bantam books, . new york. [ ] woudhuysen, h. r. ( ). sir philip sidney and the circulation of manuscripts, - . oxford university press. [ ] connor, f. x. ( ). shakespeare, poetic collaboration and the passionate pilgrim. pp - , in holland, p. (ed.). ( ). shakespeare survey: volume , shakespeare's collaborative work (vol. ). cambridge university press. [ ] chiljan, k. ( ). reclaiming the passionate pilgrim for shakespeare. oxfordian , , vol. , p - [ ] bednarz, j.p. ( ) "canonizing shakespeare: the passionate pilgrim, england's helicon and the question of authenticity," shakespeare survey ( ): - , , . [ ] elliott, w. e., & valenza, r. j. ( ). a touchstone for the bard. computers and the humanities, ( ), - . [ ] korp, c. ( ). 
shoemakers, clowns, and saints: the narrative afterlife of thomas deloney. available at: http://escholarship.org/uc/item/ hk (ijacsa) international journal of advanced computer science and applications, vol. , no. , | p a g e www.ijacsa.thesai.org [ ] segarra, s., eisen, m., egan, g., & ribeiro, a. ( ). stylometric analysis of early modern period english plays. digital scholarship in the humanities, vol.(submitted). [ ] rudman, j. ( ). non-traditional authorship attribution studies of william shakespeare‘s canon: some caveats. journal of early modern studies, , - . [ ] stamatatos, e. ( ). a survey of modern authorship attribution methods. journal of the american society for information science and technology, ( ), - . [ ] jackson, m. p. ( ). shakespeare and the quarrel scene in arden of faversham. shakespeare quarterly, ( ), - . [ ] vickers, b. ( ). shakespeare and authorship studies in the twenty- first century. shakespeare quarterly, ( ), - . [ ] hirsch, b. d., & craig, h. ( ). " mingled yarn": the state of computing in shakespeare . . [ ] arefin, a. s., vimieiro, r., riveros, c., craig, h., & moscato, p. ( ). an information theoretic clustering approach for unveiling authorship affinities in shakespearean era plays and poems. plos one, ( ), e . [ ] matthews, r. a., & merriam, t. v. ( ). neural computation in stylometry i: an application to the works of shakespeare and fletcher. literary and linguistic computing, ( ), - . [ ] merriam, t. v., & matthews, r. a. ( ). neural computation in stylometry ii: an application to the works of shakespeare and marlowe. literary and linguistic computing, ( ), - . [ ] boyd, r. l., & pennebaker, j. w. ( ). did shakespeare write double falsehood? identifying individuals by creating psychological signatures with text analysis. psychological science, . [ ] chung, c. k., & pennebaker, j. w. ( ). revealing dimensions of thinking in open-ended self-descriptions: an automated meaning extraction method for natural language. 
journal of research in personality, ( ), - . [ ] argamon, s., koppel, m., pennebaker, j. w., & schler, j. ( ). automatically profiling the author of an anonymous text. communications of the acm, ( ), - . [ ] iqbal, f., binsalleeh, h., fung, b., & debbabi, m. ( ). a unified data mining solution for authorship analysis in anonymous textual communications. information sciences, , - . [ ] northoff, g., heinzel, a., de greck, m., bermpohl, f., dobrowolny, h., & panksepp, j. ( ). self-referential processing in our brain—a meta- analysis of imaging studies on the self. neuroimage, ( ), - . [ ] rosenstein, m., foltz, p. w., delisi, l. e., & elvevåg, b. ( ). language as a biomarker in those at high-risk for psychosis. schizophrenia research. [ ] zabelina, d. l., o‘leary, d., pornpattananangkul, n., nusslock, r., & beeman, m. ( ). creativity and sensory gating indexed by the p : selective versus leaky sensory gating in divergent thinkers and creative achievers. neuropsychologia, , - . [ ] kernot, d., bossomaier, t., & bradbury, r. ( ). novel text analysis for investigating personality: identifying the dark lady in shakespeare's sonnets, journal of quantitative linguistics (accepted jan, ). [ ] tweedie, f. j., & baayen, r. h. ( ). how variable may a constant be? measures of lexical richness in perspective. computers and the humanities, ( ), - [ ] argamon, s., koppel, m., fine, j., shimoni, a.r. ( ). gender, genre, and writing style in formal written texts. text, volume , number , august . [ ] kernot, d. ( ) can three pronouns discriminate identity in writing in data. in sarker, r., abbas, h., dunstall, s., kilby, p., davis, r. young, l. (eds) data and decision sciences in action: proceedings of the australian society for operations research conference , springer. [ ] pennebaker, j. w. ( ). the secret life of pronouns. new scientist, ( ), - . [ ] pennebaker, j. w., mehl, m. r., & niederhoffer, k. g. ( ). psychological aspects of natural language use: our words, our selves. 
annual review of psychology, ( ), - . [ ] bucci, w. ( ). the referential process, consciousness, and the sense of self. psychoanalytic inquiry, ( ), - . [ ] bucci, w., & maskit, b. ( ). building a weighted dictionary for referential activity. in spring symposium of the american association for artificial intelligence in palo alto, ca, march. [ ] kernot, d. the identification of authors using cross document co- referencing. the university of new south wales. nov . available at: http://www.unsworks.unsw.edu.au/primo_library/libweb/action/dldispla y.do?vid=unsworks&docid=unsworks_ [ ] lynott, d., & connell, l. ( ). modality exclusivity norms for object properties. behavior research methods, ( ), - . [ ] miller, g. a. ( ). the science of words. new york: scientific american library. [ ] van dantzig, s., cowell, r. a., zeelenberg, r., & pecher, d. ( ). a sharp image or a sharp knife: norms for the modality-exclusivity of concept-property items. behavior research methods, ( ), - [ ] farrow, j. m. ( ) the collected works of shakespeare. http://sydney.edu.au/engineering/it/~matty/shakespeare/ [ ] toutanova, k., & manning, c. d. ( , october). enriching the knowledge sources used in a maximum entropy part-of-speech tagger. in proceedings of the joint sigdat conference on empirical methods in natural language processing and very large corpora: held in conjunction with the th annual meeting of the association for computational linguistics-volume (pp. - ). association for computational linguistics. [ ] mark, m. ( ) a celebration of women writers. available at: http://digital.libttrary.upenn.edu/women/cary/mariam/mariam.html accessed october . [ ] balakrishnama, s., & ganapathiraju, a. ( ). linear discriminant analysis-a brief tutorial. institute for signal and information processing. [ ] ye, j., janardan, r., & li, q. ( ). two-dimensional linear discriminant analysis. in advances in neural information processing systems (pp. - ). [ ] koppel, m., & winter, y. ( ). 
determining if two documents are written by the same author. journal of the association for information science and technology, ( ), - . [ ] voorhees, e. m. ( ). using wordnet for text retrieval. fellbaum (fellbaum, ), - . [ ] seidman, s. ( ). authorship verification using the impostors method. in clef evaluation labs and workshop-online working notes. [ ] liiv, i. ( ). seriation and matrix reordering methods: an historical overview. statistical analysis and data mining, ( ), - . [ ] buchta, c., hornik, k., & hahsler, m. ( ). getting things in order: an introduction to the r package seriation. journal of statistical software, ( ), - . [ ] earle, d., & hurley, c. b. ( ). advances in dendrogram seriation for application to visualization. journal of computational and graphical statistics, ( ), - . [ ] stahel, w., maechler, m. ( ). ‗jitter‘ (add noise) to numbers. r documentation ( – ) available at: http://stat.ethz.ch/r- manual/r-devel/library/base/html/jitter.html. accessed: august . [ ] sauer, m. m. ( ). the facts on file companion to british poetry before . infobase publishing. [ ] barnfield, r. ( ). lady pecunia, or, the praise of money: also a combat betwixt conscience and covetousnesse ; together with the complaint of poetry for the death of liberality. in volume , issue of illustrations of old english literature. pp - . digitized oct . available at: https://books.google.com.au/books?id= j taaaacaaj. accessed on: nov . [ ] barnfield, r. ( ). lady pecunia, or, the praise of money: also a combat betwixt conscience and covetousnesse ; together with the complaint of poetry for the death of liberality. in volume , issue of illustrations of old english literature. pp - . digitized oct . available at: https://books.google.com.au/books?id=y taaaacaaj. accessed on: nov . (ijacsa) international journal of advanced computer science and applications, vol. , no. , | p a g e www.ijacsa.thesai.org [ ] britannica, e. ( ). 
richard barnfield. the project gutenberg ebook of encyclopaedia britannica, th edition, volume , part , slice . published december, . page .
[ ] reid, l. a. ( ). "certaine amorous sonnets, betweene venus and adonis": fictive acts of writing in the passionate pilgrime of . Études Épistémè. revue de littérature et de civilisation (xvie–xviiie siècles).
[ ] thomas, m. w. ( ). eschewing credit: heywood, shakespeare, and plagiarism before copyright. new literary history, ( ), - .

white paper report
report id:
application number: hd- -
project director: david chinitz (dchinit@luc.edu)
institution: loyola university, chicago
reporting period: / / - / /
report due: / /
date submitted: / /

metadata schema for modernist networks
level digital humanities start-up grant (hd- - )
white paper
november
pamela l. caughie and david e.
chinitz loyola university chicago contact: dchinit@luc.edu level start-up funding ($ , ) supported a one-day workshop leading toward the launch of modernist networks (“modnets”), a federation of digital projects in the field of modernist literary and cultural studies. the workshop, which was held in chicago on august , focused on the adaptation of the arc (advanced research consortium) metadata form to digital projects in modernist studies. background the impetus behind the founding of modnets arose from the perception that, although interest in digital modernist studies projects was increasing, there was little coordination between projects and no structured support for their development. the field of modernist studies has thrived and greatly expanded since the establishment of the modernist studies association (msa) in . and digital scholarship in modernist studies has been flourishing, as seen not only by the growing number of digital modernist studies projects in development but by the marked increase in the number of papers and sessions with a digital focus at the annual msa conference. yet digital scholarship in modernist studies lags its equivalent in, for example, th-century studies, which for several years has had the support, the aggregating function, and the tool-development of nines (networked infrastructure for nineteenth-century electronic scholarship). founded by pamela caughie and david chinitz—both past presidents of the msa—modnets has the dual goals of establishing a vetting community for digital modernist scholarship and a technological infrastructure to support development of scholarly projects and access to scholarship on modernist literature and culture. modnets aims to promote affiliated digital projects and centers; to provide editorial and technical support; to offer peer review based on content, conception, and technical design; to evolve standards and “best practices”; and to aggregate scholarly resources in the field. 
in , modnets joined the advanced research consortium (arc), an overarching federation of similar organizations housed in the initiative for digital humanities, media, and culture at texas a&m university. the members or “nodes” of arc include nines (focused on the th century), thconnect (focused on the th), mesa (the medieval electronic scholarly alliance), and rekn (renaissance english knowledgebase). a meeting of the technical segment of the modnets board in august determined that there was in fact a great deal to be gained by our joining forces with arc, including vastly accelerated development of our aggregating infrastructure using arc’s resources and drawing on its experience with the creation of the preexisting nodes. metadata issues in order for modnets projects to be searchable, their metadata must be consistent with the rdf metadata format used by the arc nodes so that their resources can be categorized and searched by the collex faceted search engine. arc’s metadata form originated in the scheme developed by nines, the earliest and most mature of the nodes. however, the nodes now work together to develop metadata categories that will address the needs of scholars working in all periods. there are important differences. for example, the original arc metadata specifications included “manuscript” as a genre term. but “manuscript” is not a useful classification for medievalists, for whom essentially all texts are manuscripts. and while to a scholar of the th century, a “manuscript” refers to an unpublished text, writers in the renaissance and th centuries often circulated their manuscripts as a mode of publication. the metadata term “manuscript,” and the category of genre in general, therefore needed to be rethought collaboratively, with input from all the nodes. 
in the case of modernist scholarship, a key issue arises from the multiplication of media that came into use during the period, including film, phonography, radio, and (to a greater extent than before) photography. the historical visual and sound resources available to modernist scholars open up unique possibilities for digital projects in the field but also present challenges in terms of discovery, access, and preservation. anticipating that digital projects in modernist studies will make considerable use of these resources, we recognized that the new media needed to be accommodated within the arc metadata specification. we therefore applied for a level i grant to support a workshop that would bring together leaders of digital projects involving various media, modnets leadership, and arc representatives in order to review arc’s metadata vocabulary in the light of modernist scholarship and enhance it to describe the artifacts of modernism with sufficient clarity and richness. two major projects, the modernist journals project (mjp) and editing modernism in canada (emic), were selected to provide representative metadata sets and use cases.

the workshop and its outcomes

the one-day workshop in chicago brought together four constituencies: ( ) modnets leadership; ( ) arc leadership; ( ) project directors in digital modernism; and ( ) metadata analysts. pamela caughie and david chinitz, who hosted the workshop, received assistance from their colleagues steven e. jones (english) and george thiruvathukal (computer science), co-directors of loyola’s center for textual studies and digital humanities.
the majority of the workshop participants were directors or co-directors of various digital projects in modernism: pamela caughie of woolf online; mark byron of the beckett digital manuscript project and the digital variorum edition of ezra pound’s cantos; tanya clement of the modernist versions project; michael hennessey of pennsound; dean irvine of editing modernism in canada (emic); jeffrey drouin of the modernist journals project (mjp); laura mandell of the poetess archive; dirk van hulle of the beckett digital manuscript project; and clifford wulfman of the blue mountain avant-garde periodicals project. nicholas morris, a graduate student at suny-buffalo working in the areas of digital humanities and film, attended as well, as did web developer kristin jensen. also participating were ann hanlon and erin stalberg, both digital collections librarians with expertise in metadata. in addition to their aforementioned roles as project directors, laura mandell is the director of arc and tanya clement its associate director.

our goals for the one-day workshop were to produce
• a demonstrable working set of rdf documents derived from mjp and emic metadata that can be indexed and searched via collex, the open-source aggregator for digital projects used by the arc nodes; and
• a draft recommendation that details changes to the existing arc vocabularies necessary to describe modernist resources.

the workshop’s morning hours were devoted to presentations and discussions of the arc metadata form and of metadata samples from mjp and emic. in the afternoon we divided into two breakout groups, each delegated the task of mapping either the mjp or emic samples to the arc scheme. by engaging directly with this task of conversion, the groups were compelled to grapple with the details of the different metadata schemes, bringing to light the elements that mapped straightforwardly from one to the other and those that did not.
we then reconvened to share our results and to plan out the “next steps” whose necessity had emerged from this hands-on work.

the workshop led to the realization that, for the most part, the flexibility and deliberate leanness of the arc metadata form made the mapping process manageable for modernist digital projects, especially those, like mjp, that already had well-structured metadata. this was true thanks in part to arc’s ongoing program of metadata reform, in which modnets personnel had already begun to participate. with the requirements of modnets in mind, these modifications had included significant expansions in the options available for the genre and discipline tags, as well as in the list of media or formats available for the type tag. participants in the modnets metadata workshop were pleased to discover that these changes had successfully anticipated many needs of digital projects in modernist studies. one of the most positive outcomes of the workshop was that the several project managers present left not only excited about the possibilities for the dissemination of their work through participation in modnets but encouraged that the process of metadata mapping would not be as arduous as they had feared.

that said, the participants in the workshop also recognized the need for some follow-up work. in particular, it would be important to create rdf metadata samples for non-textual objects, particularly sound objects and film objects. an additional next step would be the creation of an xslt for either mjp or emic that could be tested by arc. several participants volunteered to carry out these tasks in the months following the meeting. these experiments resulted in a proposal to arc for several extensions to its metadata scheme. the proposed extensions were adopted at its meeting of – apr.
:
• added to genre: advertisement, animation, chronology, documentary, essay, interview
• added to discipline: dance, fine arts, sound studies
• added to role: broadcaster, cinematographer, conductor, director, former owner, interviewer, interviewee, owner, producer, production company

inevitably, as projects prepare themselves for ingestion by modnets, we will learn about additional requirements that cannot yet be foreseen. the metadata schema work accomplished by the workshop was a necessary prerequisite for this central service provided by modnets. with this work completed, the mounting of modnets is continuing to move forward toward a public launch. the metadata workshop also functioned to strengthen ties between the modnets leadership team and key partners whose metadata will provide testing material for modnets development and a searchable metadata core upon launch.

additional funded work and next steps

unspent money from the workshop was used, with neh permission, to hire a student assistant to begin implementing the standards we refined at the workshop by actually ingesting metadata. as a result, the metadata for one project, woolfonline, has already been ingested successfully, the metadata for mjp is now on the verge of ingestion, and the metadata for princeton’s blue mountain project is expected to be ready for ingestion by january. modnets will launch for public use in spring with these projects included in its searchable database, and with editing modernism in canada on the way. at this writing we are in the process of hiring a project manager and are recruiting additional projects.
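as a rough illustration of how a project might check its own records against the vocabulary extensions listed in this report, the sketch below flags terms that fall outside those lists. it is a minimal sketch under stated assumptions: the record layout (a dictionary of field names to lists of terms), the function name, and the sample record are hypothetical and do not reproduce arc's actual rdf schema, and only the newly added terms are included, not the full arc vocabularies.

```python
# Terms this report lists as added to the ARC vocabularies.
# Assumption: a real check would also include the pre-existing ARC terms.
ADDED_TERMS = {
    "genre": {"advertisement", "animation", "chronology",
              "documentary", "essay", "interview"},
    "discipline": {"dance", "fine arts", "sound studies"},
    "role": {"broadcaster", "cinematographer", "conductor", "director",
             "former owner", "interviewer", "interviewee", "owner",
             "producer", "production company"},
}


def unmapped_terms(record, vocabulary=ADDED_TERMS):
    """Return {field: [terms not found in the vocabulary]} for one record.

    `record` is a hypothetical structure: field name -> list of terms.
    Matching is case-insensitive; fields absent from the record are skipped.
    """
    problems = {}
    for field, allowed in vocabulary.items():
        values = record.get(field, [])
        missing = [value for value in values if value.lower() not in allowed]
        if missing:
            problems[field] = missing
    return problems


# Hypothetical sample record for a sound object ("Newsreel" is not in the lists).
sample = {
    "genre": ["Documentary", "Newsreel"],
    "discipline": ["Sound Studies"],
    "role": ["Broadcaster"],
}
print(unmapped_terms(sample))  # {'genre': ['Newsreel']}
```

a check of this kind only reports which terms would still need a mapping decision; choosing the right arc term for each remains editorial work.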
appendix: itemized workshop schedule

saturday, august th
: am – : am breakfast
: am – : am welcome (pamela caughie and david chinitz)
: am – : am workshop goals (clifford wulfman)
: am – : am presentation of arc metadata form (laura mandell)
: am – : am break
: am – : am presentation of mjp metadata samples (jeff drouin)
: am – : am presentation of emic metadata samples (dean irvine)
: am – : pm work plan for mapping (clifford wulfman)
: pm – : pm lunch
: pm – : pm working groups on mapping mjp and emic samples
: pm – : pm break
: pm – : pm discussion of mapping results and revisions to arc schema
: pm – : pm planning of next steps

appendix: list of participants

byron, mark (university of sydney)
caughie, pamela l. (loyola university chicago)
chinitz, david e. (loyola university chicago)
clement, tanya (university of texas, austin)
drouin, jeff (university of tulsa)
hanlon, ann (university of wisconsin, milwaukee)
hennessey, michael (university of pennsylvania)
irvine, dean (dalhousie university)
jensen, kristin (performant software)
jones, steven e. (loyola university chicago)
mandell, laura (texas a&m)
morris, nicholas (suny buffalo)
stalberg, erin (mount holyoke)
thiruvathukal, george (loyola university chicago)
van hulle, dirk (university of antwerp)
wulfman, clifford (princeton university)

appendix: breakout groups

mjp group
jeff drouin (mjp manager)
cliff wulfman (project manager)
ann hanlon (metadata librarian)
laura mandell (collex expert)
michael hennessey (project manager)
mark byron (project manager)
pamela l. caughie (modnets leader)

emic group
dean irvine (emic manager)
tanya clement (project manager)
erin stalberg (metadata librarian)
kristin jensen (collex expert)
nicholas morris (project manager)
dirk van hulle (project manager)
david e. chinitz (modnets leader)

appendix: modnets search page (prelaunch staging site)

a digital humanities reading list: part , skill building

liber’s digital humanities & digital cultural heritage working group is gathering literature for libraries with an interest in digital humanities. four teams, each with a specific focus, have assembled a list of must-read papers, articles and reports. the recommendations in this article (the third in the series) have been assembled by the team in charge of enhancing skills in the field of digital humanities for librarians, led by caleb derven of the university of limerick.

the third theme: skill building

the recommended readings and tutorials in this post broadly focus on what skills are needed for providing dh services in libraries and how library staff can acquire these skills.

in the case of the former, we examined resources that resonated as representative or evocative of what skills library staff might obtain allowing them to participate in digital humanities work or practices. with the latter, we’ve highlighted a few skills tutorials that provide practical instruction in useful tools and skills for dh practice. of course, given the sheer plurality of both web-accessible and published resources, this posting highlights a sampling of what’s available. the working group’s zotero library, and items specifically related to skill building within libraries, offers a surfeit of additional starting places.

1.
coding for librarians: learning by example, andromeda yelton
this issue of library technology reports examines the contexts of, the motivations for, and concrete examples of coding in libraries. the chapters in the issue are notable for the range of libraries represented (albeit in primarily north american settings), from public to special to academic libraries. the chapters carefully describe not only the what of coding (specific tools or approaches used, the problems addressed by the coding, etc.) but also why librarians should code and, by exploring the political and social dimensions of coding, outline a sort of ethics of coding in libraries. the issue makes a strong case for the active role of the librarian in the creation of the digital library.

https://libereurope.eu/working-group/digital-humanities-digital-cultural-heritage/
https://web.archive.org/web/ /http://libereurope.eu/blog/dt_team/caleb-derven/
https://www.zotero.org/groups/ /liber_digital_humanities_working_group/collections/ zs ckrj
http://dx.doi.org/ . /ltr. n

2. using open refine to create xml records for wikimedia batch upload tool: nora mcgregor
many of us working in dh or digital library projects that involve any level of metadata clean-up, data munging or data transformations have likely encountered open refine, a veritable panacea for many data-related issues. this blog post from the british library’s digital scholarship department provides a comprehensive and detailed description of a specific approach to uploading collection metadata to wikimedia commons using open refine as a core tool. the post highlights openness as both platform and tool.

3.
digital humanities clinics – leading dutch librarians into dh: lotte wilms, michiel cock, ben companjen
this article describes a series of dh clinics run in academic and research libraries in the netherlands aimed towards enabling library professionals to provide services to students and researchers, identify skill gaps and provide identifiable solutions and to assist in automating daily work, echoing themes in the library technology reports issue noted above. the librarians involved in the project ran five dh clinics in and found that the model of training collections librarians interested in dh all at once worked very well, as you not only get the training part in order, but also put a network in place.

4. programming historian
as our first suggestion for dh-related tutorials, the programming historian provides lessons in a wide range of open skills, technologies and tools, from a variety of disciplinary perspectives, related to many data and content areas that librarians work with in dh contexts. the site covers a broad range of use cases that strongly reverberate with library dh work, from visualisation to textual analysis to gis and mapping contexts and digital publishing.

5. library carpentry: what is library carpentry?
building on the lessons and approach of software carpentry and data carpentry, library carpentry could be viewed as an essential prologue before embarking on the deep dives of the programming historian lessons. the tools detailed in library carpentry’s lessons form the core of the work undertaken in many of the resources noted in this post.
https://britishlibrary.typepad.co.uk/digital-scholarship/ / /using-open-refine-to-create-xml-records-for-wikimedia-batch-upload-tool.html
https://hdl.handle.net/ /
https://programminghistorian.org/
https://librarycarpentry.org/

6. british library digital scholarship training programme
this collection of courses provided by the british library is aimed at librarians to provide them with an understanding of digital scholarship and to develop the necessary skills to deliver dh-related services. links are provided to all the slides and resources used in the training. the tools and approaches are consonant with resources noted above.

the skill-building team of the working group will be providing additional posts in the coming months that highlight both specific use cases faced in liber institutions and potential challenges in providing dh services.

https://www.bl.uk/projects/digital-scholarship-training-programme

the open dentistry journal
content list available at: https://opendentistryjournal.com

review article
state of the art contemporary prefabricated fiber-reinforced posts
emad s. elsubeihi*, tareq aljafarawi and heba e. elsubeihi
department of restorative dentistry, college of dentistry, ajman university, ajman, uae

abstract:
background: there is increased interest among scientists and clinicians in the investigation and use of prefabricated fiber-reinforced posts in the restoration of endodontically treated teeth.
objective: the objective of this narrative review was to summarize the composition of contemporary prefabricated fiber-reinforced posts and elucidate its effect on the different properties of these posts.
methods: pubmed/medline, scopus, and google scholar were searched from january to december for english language articles describing the composition and properties of prefabricated fiber-reinforced posts. first, the search strategy was established for medline / pubmed using the following terms ((fiber post[all fields] or (fiber reinforced post[all fields] and composition[all fields] and (“matrix”[mesh terms] or (“fiber”[all fields] and “properties”[all fields] and “epoxy”[all fields]) or “dimethacrylate”[all fields]) and not (cad cam[all fields])). the search strategy was then adapted for scopus and google scholar databases to identify eligible studies.
results: the current state of the art of prefabricated fiber-reinforced posts revealed a myriad of products with different formulations, which are reflected in the mechanical and handling characteristics of the different posts available on the market. more recent research and development efforts attempted to address issues related to the improved transmission of polymerization light through the post to the most apical end of the restoration inside the root canal. others focused on the development of new matrix materials for fiber-reinforced posts.
conclusion: a review of the literature revealed that currently available prefabricated fiber-reinforced posts consist of a heterogeneous group of materials whose composition can have a significant effect on the behavior of posts. understanding different formulations will help clinicians in scrutinizing the vast literature available on prefabricated fiber-reinforced posts. this, in turn, will help them make an informed decision when selecting materials for the restoration of endodontically treated teeth.
keywords: fiber-reinforced post, glass fibers, composition, review, prefabricated fiber posts, post matrix.
article history: received: march , revised: april , accepted: april ,

1.
introduction
although restoration of endodontically treated teeth is a daily clinical decision in restorative dentistry practice, there appears to be disagreement in recommendations regarding the selection of materials and techniques for their restorations [ , ]. loss of a large proportion of coronal tooth structure due to caries, previous restorations, and endodontic access cavity preparation, results in an increased need for the placement of intra-radicular posts during the restoration of endodontically treated teeth. until the early s, accepted methods to fabricate intra-radicular posts included custom-made cast metal posts and cores or prefabricated metal posts, made of stainless steel or titanium alloys, in combination with different core materials. originally clinicians believed that posts could reinforce the root canal treated teeth [ ]. however, later studies have pointed out that posts do not strengthen teeth. in fact, these studies demonstrated that the preparation of a post space and the placement of a metal post can weaken the root and may lead to root fracture [ - ]. these and other studies have, therefore, suggested that a post should be used only when the remaining coronal tooth tissue can no longer provide adequate support and retention for the coronal restoration [ , ].

* address correspondence to this author at the department of restorative dentistry, college of dentistry, ajman university, p.o. box , ajman, uae; tel: + - - ; fax: + - - ; e-mail: e.elsubeihi@ajman.ac.ae
disadvantages of metallic posts such as the risk of corrosion, root fractures, and loss of retention, coupled with increased demand for aesthetic restorations that necessitates placement of aesthetic posts underneath all ceramic crowns, led to the development of posts made of aesthetic materials such as ceramic zirconia [ ], fiber-reinforced composites [ ], and polyetherketoneketone (pekk) [ ]. among these, fiber-reinforced posts attracted the attention of researchers and clinicians alike, resulting in increased use of these posts in clinical situations. the increased demand for fiber-reinforced posts resulted in the development of an enormous variety of fiber-reinforced posts with different compositions. studies have shown that different prefabricated fiber-reinforced posts exhibited variations with regard to mechanical properties [ , ], ability to transmit polymerization light [ , ], radiopacity [ ], as well as interactions with different materials such as luting cements and composite core materials [ , ]. therefore, understanding the differences in the composition of the various prefabricated fiber-reinforced posts is essential for clinicians to be able to select appropriate materials for the restoration of endodontically treated teeth. hence, the aim of this narrative review is to summarize the current knowledge on the composition of contemporary prefabricated fiber-reinforced posts.

2. materials and methods
it is well accepted that systematic reviews follow a predetermined method to methodically search, select, appraise, synthesize, and analyze the literature [ ]. as systematic reviews are designed to answer focused questions, they do not allow for comprehensive insight into some topics, particularly those tracing the development of a clinical concept [ - ], such as that of the current review on the different compositions of prefabricated contemporary fiber-reinforced posts.
the question to be answered in this review was “what are the different formulations/compositions of prefabricated fiber-reinforced posts?”. to answer this, a narrative review was used to avoid the loss of valuable information that may result from the strict inclusion/exclusion criteria used in systematic reviews, thus allowing for the selection of literature relevant to the question. for this review, the literature search was carried out using the electronic databases pubmed/medline, scopus, and google scholar from january to december . the search strategy was first established for medline via pubmed using the following terms: ((fiber post[all fields] or (fiber reinforced post[all fields] and composition[all fields] and (“matrix”[mesh terms] or (“fiber”[all fields] and “properties”[all fields] and “epoxy”[all fields]) or “dimethacrylate”[all fields]) and not (cad cam[all fields])). the search strategy was then adapted for scopus and google scholar databases to identify eligible studies. additionally, hand searching of retrieved articles was performed for further relevant publications. the search was limited to english language literature. the articles screened were divided into relevant or nonrelevant for the present review based on the following inclusion criteria: 1) articles describing the composition of prefabricated fiber-reinforced posts, and 2) articles describing properties as related to the composition of prefabricated fiber-reinforced posts. articles describing custom-made fiber posts and cad-cam made posts were excluded.

3. results
a total of articles were identified from the three electronic databases (fig. ). first, the identified articles were uploaded into a reference manager software library (zotero, corporation for digital scholarship, ny) and duplicate articles were excluded. the titles and abstracts of the remaining articles were then screened by two independent reviewers (t.a. and h.e.e.).
three hundred and twenty-seven articles satisfied the inclusion criteria and were selected for full-text reading. following readings of full-text articles, articles were selected. the inclusion of articles was based on discussions between the two reviewers. to assess consistency among the reviewers, the inter-reviewer reliability was calculated (cohen’s kappa index value . [ % ci . ; . ]; p = . ). disagreements between the two reviewers were resolved through discussion with the third reviewer (e.s.e.). a hand search of the selected articles yielded further articles that were considered pertinent to the topic. ultimately, articles [ - , - , - ] were included in this review. information from the selected articles was synthesized in this narrative review under the following headings: i) historical background, ii) advantages of fiber-reinforced posts, iii) composition of fiber-reinforced posts, and iv) conclusions, as described in the following sections.

4. historical background
fiber-reinforced posts were introduced in the early s [ ] as an alternative to cast post-and-core metal posts [ , , ]. the technology of fiber-reinforced post development was based on the principles of fiber-reinforced acrylic and composites. the dental use of this technology started in the s to strengthen acrylic base materials for removable partial dentures [ ]. this was followed by attempts to combine reinforcing fibers with dimethacrylate composite resin to be used for the fabrication of fixed partial dentures [ , ]. the first introduced fiber-reinforced post, namely composipost®, consisted of carbon/graphite fibers embedded in an epoxy resin matrix [ ]. these posts were characterized by good mechanical properties, such as high stiffness and tensile strength, in addition to electrical conductivity and comparatively low toxicity [ , ].

fig. ( ). flowchart of the screening and selection process.
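the inter-reviewer reliability reported for the screening step is a cohen's kappa, which corrects the observed agreement between two reviewers for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). the sketch below shows that calculation for two reviewers making include/exclude decisions. it is illustrative only: the counts are hypothetical and are not the data of this review, and the function and argument names are assumptions.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions.

    Arguments are the four cells of the 2x2 agreement table:
    both_include  - articles both reviewers included
    only_a        - articles only reviewer A included
    only_b        - articles only reviewer B included
    both_exclude  - articles both reviewers excluded
    """
    total = both_include + only_a + only_b + both_exclude
    if total == 0:
        raise ValueError("agreement table is empty")

    # Observed agreement: share of articles where the decisions match.
    observed = (both_include + both_exclude) / total

    # Chance agreement from each reviewer's marginal inclusion rate.
    a_rate = (both_include + only_a) / total
    b_rate = (both_include + only_b) / total
    expected = a_rate * b_rate + (1 - a_rate) * (1 - b_rate)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 screened articles.
kappa = cohens_kappa(both_include=40, only_a=5, only_b=5, both_exclude=50)
print(round(kappa, 3))  # 0.798
```

with these made-up counts the reviewers agree on 90 of 100 articles, chance agreement is 0.505, and kappa is about 0.80; a confidence interval such as the one reported in the text would need an additional standard-error calculation that is not shown here.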
the main drawbacks of carbon fiber-reinforced posts were their black color, which limited their use under all-ceramic and composite restorations in areas of high aesthetic demand, and their radiolucency, which made it difficult to identify these posts on radiographs due to the carbon content [ , ]. these limitations of carbon fiber-reinforced posts led to the development of fiber-reinforced posts with more esthetic and radiopaque properties using silica fibers in the form of quartz or glass fibers embedded in a polymer matrix [ , ].

5. advantages of fiber-reinforced posts
it has been suggested that the most significant advantage of fiber-reinforced posts compared with metallic or ceramic posts is their modulus of elasticity. several authors believe that the similarity between the elastic moduli of fiber-reinforced posts and dentine distributes stress and makes these posts less likely than metal posts to cause root fracture in endodontically treated teeth [ , , ]. the modulus of elasticity of glass fiber posts, however, has been shown to range from - gpa [ , ], which is relatively close to that of dentin (range between - gpa) as compared to that of cast metal alloy and prefabricated metal posts, which ranges from - gpa [ ]. therefore, the elastic moduli of some prefabricated fiber-reinforced posts are about - times, whereas those of metal posts are about - times, that of dentine [ ]. the other advantage of fiber-reinforced posts is their ability to bond with most resin cements and resin-based composite core materials. luting of the fiber-reinforced post to the dentinal wall with resin cement offers advantages such as reducing the wedging effect of the post in the root canal, thus reducing the incidence of root fracture [ - ]. fiber-reinforced posts also overcome limitations of metal posts such as the possibility of corrosion and associated biocompatibility concerns that may trigger allergic reactions [ ].
furthermore, improved aesthetics of fiber-reinforced posts with glass or quartz fibers offered the most satisfactory visual properties which allowed the use of all ceramic crowns with improved aesthetics of the restored endodontically treated teeth [ ]. moreover, the use of fiber-reinforced posts simplified clinical procedures by eliminating the need for laboratory steps and facilitated re-treatment in cases of endodontic failure as a result of their easier removal techniques [ - ].

6. composition of fiber-reinforced posts
there is a myriad of commercially available fiber-reinforced posts on the market to choose from (table ). they are essentially composed of pre-stretched fibers bounded by a polymer resin matrix [ ]. the different components of the fiber-reinforced post are shown in figs. ( and ).

fig. ( ). fig. ( -a) shows a scanning electron microscope image of a cross-section of prefabricated fiber-reinforced post surrounded by a composite core (x ). fig. ( -b) magnified image (x ) in the middle of the post demonstrating glass fibers surrounded by the polymer matrix. c: composite core. p: prefabricated fiber-reinforced post. m: polymer matrix. f: glass fiber.

table . commonly used/investigated prefabricated fiber-reinforced posts.

post | matrix | fiber | shape | manufacturer
dt light post illusion | epoxy % | quartz % | double tapered | rtd, grenoble, france
dt light post | epoxy % | quartz % | double tapered | rtd, grenoble, france
dt light-post illusion x-ro | epoxy % | quartz % | double tapered | bisco, usa
aestheti plus | epoxy % | quartz % | two-stage taper | rtd, grenoble, france
macrolock illusion post | epoxy % | quartz % wt. | tapered, circumferential head grooves, spiral head serration | rtd, grenoble, france
ellipson post | epoxy resin % | quartz fiber % wt. | tapered, oval fiber post | rtd, grenoble, france
fibercone ‘the accessory post’ | epoxy resin | quartz stretched fiber | tapered shaft portion | rtd, grenoble, france
endo-light post | epoxy % | quartz % vol. | tapered | rtd, grenoble, france
dt illusion xro sl | epoxy % | quartz fiber % vol. | double tapered | vdw, munich, germany
dt light | epoxy % | pre-conditioned quartz fiber % wt. | double tapered | vdw, munich, germany
dt light safety lock | epoxy % | (silica and silane coated) quartz % wt. | double tapered | vdw, munich, germany
relyx fiber post | epoxy - %, zirconia filler | glass - % | double tapered | m espe, st. paul, mn, usa
dentin post x | epoxy % | glass % | tapered with a retentive head, coating layers of silicate, silane and polymer | komet, lemgo, germany
glass fiber reforpost | epoxy % | glass %, stainless steel filament % | parallel and serrated | angelus, londrina, pr, brazil
exacto translucent fiber post | epoxy % | glass % | double tapered | angelus, londrina, pr, brazil
reforpin ‘the accessory post’ | epoxy % | glass % | tapered | angelus, londrina, pr, brazil
glassix fiber post | epoxy % | glass % | parallel | nordin, montreux switzerland
glassix plus radiopaque & light transmitting fiber post | epoxy (ethoxyline) - % | glass - % | parallel | nordin, montreux switzerland
matchpost | epoxy % | glass % | tapered apical section | rtd, grenoble, france
parapost fiber white | epoxy % | glass %, filler % | parallel | coltene/whaledent inc, usa
parapost fiber lux | epoxy % | glass % | parallel | coltene/whaledent inc, usa
parapost taper lux | epoxy % | glass % | tapered | coltene/whaledent inc, usa
dentolic glass fiber post | epoxy % wt. | glass % wt. | double tapered | itena-clinical, france
ilumi fiber optic post | epoxy % wt. | reinforced optical glass fiber % | tapered | ilumi sciences inc, usa
radix fiber post | epoxy % | zirconium enriched glass % | double tapered | dentsply maillefer, ballaigues, switzerland
snowpost | epoxy % | glass (with % zirconia) % | taper in the apical third | carbotech, ganges, france
snowlight | vinyl-polyestermethacrylate % | glass (with % zirconia) % | taper in the apical third | carbotech, ganges, france
carbon fiber reforpost | epoxy % | carbon % | parallel and serrated | angelus, londrina, pr, brazil
composipost | epoxy % | carbon % vol. | two-stage parallel | rtd, grenoble, france
carbonite | epoxy % | carbon % (carbon fiber braided plait) | parallel | nordin, montreux switzerland
c-post | epoxy % | carbon %, pyrolitic carbon fiber | tapered | bisco, usa
carbopost | epoxy % | carbon % | parallel | carbotech, ganges, france
luxapost fiber post | bis-gma based resin | glass fiber | tapered | dmg, hamburg, germany
fibrekor fiber post | bis-gma, hddma, udma, deama % wt.; barium sulfate, barium silicate fillers | glass % wt. | parallel and serrated | pentron, wallingford, ct, usa
fibrekleer serrated post | bisgma, udma, hddma - % | glass - % wt. | three types: parallel, tapered, and serrated | jeneric/pentron, wallingford, ct, usa
rebilda post | dimethacrylate % (udma) | glass %, filler % | apical mm is taper, coronal mm is parallel | voco, cuxhaven, germany
frc postec plus | dimethacrylate % (udma, tegdma); ytterbium trifluoride %, highly dispersed silicon dioxide | glass % | tapered | ivoclar-vivadent, schaan, liechenstein
gc fiber post | methacrylate % | glass % | double tapered | tokyo, japan
everstick fiber post | semi ipn pmma and bis-gma | glass . % wt. | unidirectional fiber bundle | gc, usa

bis-gma: bisphenol a-glycidyl methacrylate; tegdma: triethylene glycol dimethacrylate; udma: urethane dimethacrylate; ipn: interpenetrating polymer matrix; pmma: polymethylmethacrylate; deama: diethylaminoethyl methacrylate; hddma: hexanediol dimethacrylate.
some manufacturers do not specify whether the matrix to fiber ratio was in weight (wt) or volume (vol.). information obtained from technical data provided by the respective manufacturers.

fig. ( ). different components of prefabricated fiber-reinforced posts. * some manufacturers add fillers such as zirconia, barium, and silicate dioxide to improve radiopacity. polyimide matrix is still experimental, and no known commercial prefabricated fiber-reinforced post has polyimide matrix.

6.1. matrices used in fiber-reinforced posts
the functions of the matrix in fiber-reinforced posts are to hold the fibers together in the post, as well as interact with functional monomers contained in the adhesive cements for successful bonding of post to root dentine and to composite core materials [ ]. furthermore, the matrix transfers stresses between fibers and protects fibers from the outside environment such as chemicals, moisture, and mechanical shocks [ ]. thus, the matrix may influence the compressive strength of the post, as well as interlaminar shear properties between the matrix and the fiber [ ].

during the manufacturing of fiber-reinforced posts, glass or quartz fibers are pre-stretched and treated with silane coupling agent before they are impregnated in the resin matrix [ ]. the resin-impregnated fibers are then heat cured to form blocks of different shapes and diameters. finally, the blocks are shaped into posts with different geometries and diameters through a milling process [ ]. as a result of the milling process, some of the fibers are exposed onto the surface of the prefabricated posts (fig. ).

two major types of matrices are used in prefabricated fiber-reinforced posts.
the first type consists of a highly cross-linked polymer matrix polymerized by the manufacturers, while the second type consists of an unpolymerized, so-called interpenetrating polymer matrix that the dentist can polymerize during the fabrication of the post-core restoration [ ]. the most common types of matrices used in the polymerized cross-linked fiber-reinforced posts are epoxy-based or dimethacrylate-based cross-linked matrices. less commonly, some manufacturers use polymethylmethacrylate-based resin matrix for their posts. additionally, some investigators have suggested the use of aromatic polyimides as a matrix for fiber-reinforced posts [ ].

[figure diagram labels: composition of prefabricated fiber-reinforced posts; matrix: interpenetrating polymer matrix, pre-polymerized cross-linked matrix* (epoxy, dimethacrylate, polymethylmethacrylate); fiber: carbon, silica, quartz fibers, glass fibers (s-glass fibers, e-glass fibers, zirconia enriched glass)]

fig. ( ). scanning electron microscope image ( x) of the surface of the prefabricated fiber-reinforced post. note matrix and exposed glass fibers as a result of the manufacturing process.

epoxy resins, also known as polyepoxides, are thermosetting polymers formed by the reaction of the base epoxide with the reactor polyamine [ ]. on the other hand, the aromatic monomer bis-gma (bisphenol a glycidyl methacrylate), used widely as a matrix in dental composite resin materials, has also been used as a matrix in fiber-reinforced posts. bis-gma matrix is known to be stiffer than the epoxy matrix [ ]. as a result, flexural strength tests have shown that bis-gma-based matrix experiences greater stresses than epoxy-based matrix in fiber-reinforced posts [ ]. only a few manufacturers use polymethylmethacrylate of high molecular weight (> kda) as a matrix for fiber-reinforced posts [ ].
the wide use of carbon-reinforced polyimide composite materials in the aerospace and automobile industries has prompted researchers to investigate the possible use of these materials in fiber-reinforced posts [ ]. fiber-reinforced aromatic polyimides have been shown to have high strength and stiffness, low density, high fatigue endurance, a low thermal coefficient, and the ability to withstand extreme temperature changes [ , ]. gao and colleagues demonstrated that polyimide-based resin reinforced by high-strength carbon fibers has good mechanical and biological properties and suggested its use in clinical situations [ ]. recently, yang and xu [ ] found that a blend of polyimide and epoxy polymers has favorable mechanical properties and suggested its use as a matrix for fiber-reinforced posts. however, no fiber-reinforced post with a polyimide matrix has been marketed.
. . fibers used in fiber-reinforced posts
as already alluded to, the original fibers used in fiber-reinforced posts were made of carbon [ , ]. as these carbon fibers did not fulfill aesthetic requirements under all-ceramic restorations, manufacturers developed more aesthetic fibers made of silica. silica-based fibers can be either glass or quartz [ , ]. the incorporated glass or quartz fibers impart biomechanical properties similar to those of carbon-fiber-reinforced posts, including elasticity, high tensile strength, low electrical conductivity, and resistance to solubility and biochemical degradation [ , , ]. on the other hand, some manufacturers use zirconia-enriched glass fibers [ ]. due to their favorable mechanical properties and transparent appearance, glass fibers have become the most commonly used fibers in fiber-reinforced posts [ ].
based on their chemical composition, several types of glass fibers are available, including a-glass (alkali glass), c-glass (chemically resistant glass), d-glass (dielectric glass), r-glass (resistant glass), s-glass (high-strength glass), and e-glass (electric glass), among others [ ]. however, the most common types used in fiber-reinforced posts are s-glass and e-glass fibers [ ]. s-glass is known to have higher tensile strength and is rather expensive to produce, whereas e-glass has good tensile and compressive strength, good electrical properties, and a lower production cost [ ]. however, e-glass has a lower tensile modulus and lower fatigue resistance, resulting in relatively poor impact resistance compared to s-glass [ ]. on the other hand, quartz fibers, made of pure silica in crystallized form, an inert material with a low coefficient of thermal expansion, have been used in several commercial fiber-reinforced posts [ ], as seen in table . in addition to the types and properties of individual fibers, other fiber-related factors can affect the mechanical properties, and hence possibly the clinical success, of fiber-reinforced posts [ ]. these include fiber orientation, fiber density, impregnation of fibers with the matrix polymer, and adequate adhesion of fibers to the matrix polymer [ , ]. in fiber-reinforced posts, continuous unidirectional fibers are used [ ]. fiber direction influences the mechanical properties of fiber-reinforced posts [ ]. the fibers are continuous and oriented parallel to the longitudinal axis of the post (fig. ), with diameters ranging from . to . microns [ , ]. fiber density (i.e. the number of fibers per mm² of the post cross-sectional surface) is usually provided by the manufacturer, is expressed by weight or volume, and varies from one brand to another. increased fiber density improves the strength and load-bearing capacity of fiber-reinforced posts [ ].
in a transverse section of the post, - % of the area is occupied by fibers [ , ], with fibers of smaller diameter allowing a higher packing density of up to %. the fibers should be well impregnated, meaning that resin should come into contact with the surface of every fiber, in order to achieve adequate adhesion of the fibers to the polymer matrix [ ]. with good impregnation, optimal reinforcement and transfer of stresses from the polymer matrix to the reinforcing fibers are achieved [ ]. during the fabrication of posts, fibers are pre-stressed, and resin is injected under pressure to fill the spaces between the fibers, giving them solid cohesion [ ]. the smaller the diameter of the fiber filaments, the better the matrix spreads between the fibers, which increases interlaminar tightness [ ]. in addition, fibers are pre-coated with silane in order to improve adhesion at the fiber-resin matrix interface [ , ]. this also protects the fibers from damage during handling and modifies the catalytic and wettability properties of the fiber surfaces so that their chemical resistance increases [ ]. durable adhesion between the fibers and the matrix of a post ensures that the load is transferred to the stronger fibers, thus optimizing the function of the fibers as the reinforcing component of fiber-reinforced posts. on the other hand, if adhesion is less durable and voids appear between the fiber and the matrix, these voids may act as initial fracture sites that encourage the breakdown of the material [ ]. differences in the coefficient of thermal expansion between fibers and resin matrix may affect the structural integrity of the post following thermal fluctuations in the mouth. there are large variations in the coefficient of thermal expansion between the polymer matrix ( - x - /°c) and that of e-glass ( x - /°c), quartz ( . x - /°c), and carbon ( . x - /°c) fibers [ ].
it has been shown that thermocycling decreased the flexural modulus of different fiber posts by approximately % [ ]. this suggests that the mismatch in the coefficient of thermal expansion between fibers and matrix polymers might affect the long-term integrity of fiber posts. studies have reported that post debonding is the most common type of failure seen in teeth restored with prefabricated fiber-reinforced posts [ ]. in vitro studies have shown that the bond strength of luted fiber-reinforced posts was significantly lower in the apical third of canals, followed by the middle third, than in the coronal third of root canals [ , ]. in fact, various fiber-reinforced posts have been shown to reduce the intensity of transmitted polymerization light to different degrees as light travels from the coronal toward the apical third of root canals [ ].
fig. ( ). scanning electron microscope image (x ) of a prefabricated fiber-reinforced post that sustained a cohesive fracture within the post. note the unidirectional arrangement of glass fibers.
fig. ( ). schematic drawing of a cross-section of an ilumi glass fiber post (a) and a conventional prefabricated fiber-reinforced post (b). note the monolayer of optical glass cladding material around each glass fiber of the ilumi post. f: glass fiber. m: polymer matrix.
the difference in light transmission between different fiber-reinforced posts can be attributed to differences in the type and number of fibers, as well as the diameter and orientation of fibers [ ]. in addition, the refractive index of the matrix, which can be influenced by factors such as the type of monomer, pigments, and fillers used, can also affect the absorption and scattering of light transferred to various depths of posts in the root canals [ ].
recently, a novel fiber-reinforced post (ilumi fiber optic post; ilumi sciences) was introduced into the market on the premise that it is able to transmit light to the most apical parts of the canal, thus improving the retention of the post and reducing the debonding rate [ ]. in the ilumi fiber optic post, each fiber is thermally coated with optical glass cladding made of a non-corrosive, biocompatible material, the exact nature of which is kept confidential by the manufacturer (fig. ). this monolayer around each fiber is believed to force the light to be internally reflected and transmitted to the apical end of the post. studies have shown that this results in complete polymerization of resin cement, in contrast to other types of translucent fiber-reinforced posts [ ]. while the manufacturer claims that this results in increased bond strength in the apical third of luted ilumi fiber optic posts, this has not been independently demonstrated.
it is highly desirable that posts cemented in root canals be visible on radiographs for evaluation and follow-up. therefore, prefabricated posts should have radiopacity similar or close to that of root dentine. studies have shown that different types of prefabricated fiber-reinforced posts exhibit different radiopacity levels. the composition of the post appears to be the most significant factor affecting the radiopacity of prefabricated fiber-reinforced posts [ ]. elements that appear to affect the radiopacity of a prefabricated fiber-reinforced post include silicon (si), which may be present in the form of silica, as well as zirconium (zr), barium (ba), and aluminum (al) [ ]. differences in the percentages of the various radiopaque elements, their atomic numbers, and their crystallization forms seem to account for the differences in radiopacity from one type of fiber-reinforced post to another [ ]. these elements can be present in the fibers, in the matrix, and/or as added fillers in some commercially available prefabricated fiber-reinforced posts.
conclusion
increased interest in the use of prefabricated fiber-reinforced posts has resulted in the development of an enormous number of prefabricated cross-linked fiber-reinforced posts with different compositions, geometries, and properties. understanding the composition of the different prefabricated fiber-reinforced posts available will aid clinicians in understanding the vast literature on the topic and help in their selection for clinical use. while the exact composition of the different prefabricated fiber-reinforced posts is kept confidential by the manufacturers, the basic composition consists of pre-stretched fibers embedded in a resin polymer matrix. the functions of the different components and how they influence each other were discussed.
consent for publication
not applicable.
funding
this work was supported by research grant [no. -a- dn- ] from ajman university, ajman, united arab emirates.
conflict of interest
the author declares no conflict of interest, financial or otherwise.
acknowledgements
declared none.
references
[ ] faria ac, rodrigues rc, de almeida antunes rp, de mattos mdag, ribeiro rf. endodontically treated teeth: characteristics and considerations to restore them. j prosthodont res ; ( ): - . [http://dx.doi.org/ . /j.jpor. . . ]
[ ] türp jc, heydecke g, krastl g, pontius o, antes g, zitzmann nu. restoring the fractured root-canal-treated maxillary lateral incisor: in search of an evidence-based approach. quintessence int ; ( ): - .
[ ] cheung w. a review of the management of endodontically treated teeth. post, core and the final restoration. j am dent assoc ; ( ): - . [http://dx.doi.org/ . /jada.archive. . ]
[ ] lovdahl pe, nicholls ji. pin-retained amalgam cores vs. cast-gold dowel-cores. j prosthet dent ; ( ): - . [http://dx.doi.org/ . / - ( ) - ]
[ ] guzy ge, nicholls ji. in vitro comparison of intact endodontically treated teeth with and without endo-post reinforcement. j prosthet dent ; ( ): - . [http://dx.doi.org/ . / - ( ) - ]
[ ] sorensen ja, martinoff jt. clinically significant factors in dowel design. j prosthet dent ; ( ): - . [http://dx.doi.org/ . / - ( ) - ]
[ ] trope m, maltz do, tronstad l. resistance to fracture of restored endodontically treated teeth. endod dent traumatol ; ( ): - . [http://dx.doi.org/ . /j. - . .tb .x]
[ ] morgano sm. restoration of pulpless teeth: application of traditional principles in present and future contexts. j prosthet dent ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] torbjörner a, fransson b. a literature review on the prosthetic treatment of structurally compromised teeth. int j prosthodont ; ( ): - .
[ ] bittner n, hill t, randi a. evaluation of a one-piece milled zirconia post and core with different post-and-core systems: an in vitro study. j prosthet dent ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] alnaqbi iom, elbishari h, elsubeihi es. effect of fiber post-resin matrix composition on bond strength of post-cement interface. int j dent ; ; : . ecollection [http://dx.doi.org/ . / / ]
[ ] song ch, choi jw, jeon yc, et al. comparison of the microtensile bond strength of a polyetherketoneketone (pekk) tooth post cemented with various surface treatments and various resin cements. materials (basel) ; ( ): e . [http://dx.doi.org/ . /ma ]
[ ] alonso de la peña v, darriba il, caserío valea m, guitián rivera f. mechanical properties related to the microstructure of seven different fiber reinforced composite posts. j adv prosthodont ; ( ): - . [http://dx.doi.org/ . /jap. . . . ]
[ ] chieruzzi m, pagano s, pennacchi m, lombardo g, d’errico p, kenny jm. compressive and flexural behaviour of fibre reinforced endodontic posts. j dent ; ( ): - . [http://dx.doi.org/ . /j.jdent. . . ]
[ ] goracci c, corciolani g, vichi a, ferrari m. light-transmitting ability of marketed fiber posts. j dent res ; ( ): - . [http://dx.doi.org/ . / ]
[ ] galhano ga, de melo rm, barbosa sh, zamboni sc, bottino ma, scotti r. evaluation of light transmission through translucent and opaque posts. oper dent ; ( ): - . [http://dx.doi.org/ . / - ]
[ ] erik aa, erik ce, yıldırım d. experimental study of influence of composition on radiopacity of fiber post materials. microsc res tech ; ( ): - . [http://dx.doi.org/ . /jemt. ]
[ ] zicari f, de munck j, scotti r, naert i, van meerbeek b. factors affecting the cement-post interface. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] greenhalgh t, thorne s, malterud k. time to challenge the spurious hierarchy of systematic over narrative reviews? eur j clin invest ; ( )e [http://dx.doi.org/ . /eci. ]
[ ] boell sk, cecez-kecmanovic d. a hermeneutic approach for conducting literature reviews and literature searches. cais [internet] . available from https://aisel.aisnet.org/cais/vol /iss / [http://dx.doi.org/ . / cais. ]
[ ] collins ja, fauser bc. balancing the strengths of systematic and narrative reviews. hum reprod update ; ( ): - . [http://dx.doi.org/ . /humupd/dmh ]
[ ] duret b, duret f, reynaud m. long-life physical property preservation and prosthodontic rehabilitation with the composipost. compend contin educ dent ; (suppl. ): s - .
[ ] theodosopoulou jn, chochlidakis km. a systematic review of dowel (post) and core materials and systems. j prosthodont ; ( ): - . [http://dx.doi.org/ . /j. - x. . .x]
[ ] bateman g, ricketts dnj, saunders wp. fibre-based post systems: a review. br dent j ; ( ): - . [http://dx.doi.org/ . /sj.bdj. ]
[ ] narva kk, vallittu pk, helenius h, yli-urpo a. clinical survey of acrylic resin removable denture repairs with glass-fiber reinforcement. int j prosthodont ; ( ): - .
[ ] vallittu pk. an overview of development and status of fiber-reinforced composites as dental and medical biomaterials. acta biomater odontol scand ; ; ( ): - . [http://dx.doi.org/ . / . . ]
[ ] freilich ma, karmaker ac, burstone cj, goldberg aj. development and clinical applications of a light-polymerized fiber-reinforced composite. j prosthet dent ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] soares cj, santana fr, pereira jc, araujo ts, menezes ms. influence of airborne-particle abrasion on mechanical properties and bond strength of carbon/epoxy and glass/bis-gma fiber-reinforced resin posts. j prosthet dent ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] lassila lv, tanner j, le bell am, narva k, vallittu pk. flexural properties of fiber reinforced root canal posts. dent mater ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] vichi a, ferrari m, davidson cl. influence of ceramic and cement thickness on the masking of various types of opaque posts. j prosthet dent ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] asmussen e, peutzfeldt a, heitmann t. stiffness, elastic limit, and strength of newer types of endodontic posts. j dent ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] wolff d, geiger s, ding p, staehle hj, frese c. analysis of the interdiffusion of resin monomers into pre-polymerized fiber-reinforced composites. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] dietschi d, ardu s, rossier-gerber a, krejci i. adaptation of adhesive post and cores to dentin after in vitro occlusal loading: evaluation of post material influence. j adhes dent ; ( ): - .
[ ] stewardson da, shortall ac, marquis pm, lumley pj. the flexural properties of endodontic post materials. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] novais vr, rodrigues rb, simamoto júnior pc, lourenço cs, soares cj. correlation between the mechanical properties and structural characteristics of different fiber posts systems. braz dent j ; ( ): - . [http://dx.doi.org/ . / - ]
[ ] santos af, meira jb, tanaka cb, et al. can fiber posts increase root stresses and reduce fracture? j dent res ; ( ): - . [http://dx.doi.org/ . / ]
[ ] salameh z, sorrentino r, ounsi hf, sadig w, atiyeh f, ferrari m. the effect of different full-coverage crown systems on fracture resistance and failure pattern of endodontically treated maxillary incisors restored with and without glass fiber posts. j endod ; ( ): - . [http://dx.doi.org/ . /j.joen. . . ]
[ ] naumann m, sterzenbach g, rosentritt m, beuer f, frankenberger r. is adhesive cementation of endodontic posts necessary? j endod ; ( ): - . [http://dx.doi.org/ . /j.joen. . . ]
[ ] de moraes ap, cenci ms, de moraes rr, pereira-cenci t. current concepts on the use and adhesive bonding of glass-fiber posts in dentistry: a review. appl adhes sci ; : . [http://dx.doi.org/ . / - - - ]
[ ] anderson gc, perdigão j, hodges js, bowles wr. efficiency and effectiveness of fiber post removal using techniques. quintessence int ; ( ): - .
[ ] de rijk wg. removal of fiber posts from endodontically treated teeth. am j dent ; (spec no): b- b.
[ ] zhang m, matinlinna jp. e-glass fiber reinforced composites in dental applications. silicon ; : - . [http://dx.doi.org/ . /s - - -x]
[ ] grandini s, goracci c, monticelli f, tay fr, ferrari m. fatigue resistance and structural characteristics of fiber posts: three-point bending test and sem evaluation. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] manhart j. fibreglass-reinforced composite endodontic posts. endod pract ; : - .
[ ] gao h, zhang zt, fan l, wang ds, zuo hj, sheng y. development of a novel polyimide composite core materials reinforced with carbon fiber. chinese j prosthodont ; : - .
[ ] lamichhane a, xu c, zhang fq. dental fiber-post resin base material: a review. j adv prosthodont ; ( ): - . [http://dx.doi.org/ . /jap. . . . ]
[ ] drummond jl, bapna ms. static and cyclic loading of fiber-reinforced dental resin. dent mater ; ( ): - . [http://dx.doi.org/ . /s - ( ) - ]
[ ] seefeld f, wenz hj, ludwig k, kern m. resistance to fracture and structural characteristics of different fiber reinforced post systems. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] liaw dj, wang kl, huang yc, lee kr, lai jy, ha cs. advanced polyimide materials: syntheses, physical properties and applications. prog polym sci ; ( ): - . [http://dx.doi.org/ . /j.progpolymsci. . . ]
[ ] li j. the effect of surface modification with nitric acid on the mechanical and tribological properties of carbon fiber-reinforced thermoplastic polyimide composite. surf interface anal ; ( ): - . [http://dx.doi.org/ . /sia. ]
[ ] yang a, xu c. synthesis and characterization of a polyimide-epoxy composite for dental applications. mech compos mater ; ( ): - . [http://dx.doi.org/ . /s - - - ]
[ ] giachetti l, grandini s, calamai p, fantini g, scaminaci russo d. translucent fiber post cementation using light- and dual-curing adhesive techniques and a self-adhesive material: push-out test. j dent ; ( ): - . [http://dx.doi.org/ . /j.jdent. . . ]
[ ] le bell-rönnlöf am. fibre-reinforced composites as root canal posts. department of prosthetic dentistry and biomaterials science. university of turku, turku, finland: institute of dentistry ; pp. - .
[ ] dyer sr, lassila lv, jokinen m, vallittu pk. effect of fiber position and orientation on fracture load of fiber-reinforced composite. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] goracci c, ferrari m. current perspectives on post systems: a literature review. aust dent j ; (suppl. ): - . [http://dx.doi.org/ . /j. - . . .x]
[ ] perdigão j, gomes g, lee ik. the effect of silane on the bond strengths of fiber posts. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] goracci c, raffaelli o, monticelli f, balleri b, bertelli e, ferrari m. the adhesion between prefabricated frc posts and composite resin cores: microtensile bond strength with and without post-silanization. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] parisi c, valandro lf, ciocca l, gatto mr, baldissara p. clinical outcomes and success rates of quartz fiber post restorations: a retrospective study. j prosthet dent ; ( ): - . [http://dx.doi.org/ . /j.prosdent. . . ]
[ ] amižić ip, baraba a, ionescu ac, brambilla e, van ende a, miletić i. bond strength of individually formed and prefabricated fiber-reinforced composite posts. j adhes dent ; ( ): - . [http://dx.doi.org/ . /j.jad.a ]
[ ] chen yc, ferracane jl, prahl sa. a pilot study of a simple photon migration model for predicting depth of cure in dental composite. dent mater ; ( ): - . [http://dx.doi.org/ . /j.dental. . . ]
[ ] stylianou a, burgess jo, liu pr, givan da, lawson nc. light-transmitting fiber optic posts: an in vitro evaluation. j prosthet dent ; ( ): - . [http://dx.doi.org/ . /j.prosdent. . . ]
© elsubeihi et al. this is an open access article distributed under the terms of the creative commons attribution . international public license (cc-by . ), a copy of which is available at: https://creativecommons.org/licenses/by/ . /legalcode. this license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
digitcult | scientific journal on digital cultures
published june
correspondence should be addressed to federico meschini, università per stranieri di perugia/École normale supérieure de paris.
email: fmeschini@gmail.com
digitcult, scientific journal on digital cultures is an academic journal of international scope, peer-reviewed and open access, aiming to promote international research and to present current debate on digital culture, technological innovation and social change. issn: - . url: http://www.digitcult.it
copyright rests with the authors. this work is released under a creative commons attribution (it) licence, version . . for details please see http://creativecommons.org/licenses/by/ . /it/
digitcult http://dx.doi.org/ . / , vol. , iss. , – . doi: . /
documenti, medialità e racconto. di cosa parliamo quando parliamo di digital scholarship.
documents, mediality and narration. what we talk about when we talk about digital scholarship.
abstract
digital scholarship consists of both research and publishing methodologies and practices based on the digital paradigm.
it therefore has a strategic role in the scholarly landscape, since it is a meeting place between the humanities and the hard sciences. starting from the etymology of scholarship, and from the dynamic relationship between the different meanings it conveys, this article focuses on the relationship between digital humanities and digital scholarship. an important aspect of this relationship is that electronic publishing implies a pluralistic view of text: these different meanings are a bridge between contiguous but often non-communicating disciplines, such as the humanities and media studies. the concluding reflections focus on the constituent elements of digital scholarship and their possible combinations, and on the relationship between textual and visual language and their use in scholarly communication.
federico meschini, università per stranieri di perugia / École normale supérieure de paris
computers and scholars
the term scholarship has no immediate translation in italian, even though its etymological origins go back to the latin schola, which “designates both the concept and the place of study” and which in turn derives from the greek skholḗ but was destined to take on more meanings than the latter, particularly from the middle ages onward. schola is the root of scholastĭcus, which has a double sense, as a noun and as an adjective; the adjective denotes everything relating to school, a meaning later carried over into italian . the latin noun form was itself polysemous, since it initially referred to the teacher, to the learner, and to a generic state of erudition; it therefore implied both research and teaching, without sharply distinguishing them, since both are characterized by a continuous process of study.
from here, together with scholāris , one passes to old and middle english with scōlere and arrives at the modern scholar, the agent of scholarship. the suffix ship turns the characteristics of the individual into something generalized that relates to the activity itself, referring both to the process, and hence to the conditions needed to carry it out, and to the product, together with the media that make it available. while in italian the adjective scientifico covers the humanities and the exact sciences alike (which were themselves not always clearly distinguished in the classical and medieval world, just as there was no sharp separation between research and teaching), in the english-speaking world it is scholarly that plays this role of inclusive label, whereas scientific is limited to the hard sciences alone . digital scholarship is therefore, as often happens when activities that developed in and are tied to an analog dimension are redefined digitally, a two-sided label, as effective and incisive as it is hard to define. melanie schlosser, in the blog digital scholarship @ the libraries , describes it as “research and teaching that is made possible by digital technologies, or that takes advantage of them to ask and answer questions in new ways” ( ). schlosser, however, then prefers to use what abby smith rumsey writes on the subject: “digital scholarship is the use of digital evidence and method, digital authoring, digital publishing, digital curation and preservation, and digital use and reuse of scholarship” (rumsey , ). this latter explanation is more effective and complete for two reasons: first, the digital is present not only at a technological level but also at a methodological and epistemological one ; second, it describes the entire research cycle, in which every phase is recast in this new mode and the last phase rejoins the first, thus producing a virtuous circle.
Note: http://www.treccani.it/enciclopedia/schola_% enciclopedia-italiana% /.
Note: http://www.treccani.it/vocabolario/scolastico /. The noun, by contrast, survives in Italian only with reference to scholastic philosophy, http://www.treccani.it/vocabolario/scolastico /.
Note: Initially synonyms, scholāris and scholastĭcus over time take on symmetrical meanings, coming to denote respectively the carrying out and the result of the learning process (Quinto).
Note: German, exploiting its agglutinative character, starts from a general and inclusive concept, Wissenschaft, from which derive Naturwissenschaft, the sciences of nature, and Geisteswissenschaft, those of the spirit.
Note: http://library.osu.edu/blogs/digitalscholarship/. The very title of the blog brings out the strategic role of libraries (easily generalized, even though the specific case is Ohio State University) in redefining research activity, extending the traditional activities of acquisition, preservation, and dissemination, particularly with regard to the support, not only technological and infrastructural, needed to create digital resources. The blog's final post, dated December, recalls the task of libraries as a collaborative space for discussion, needed to answer the question “What is digital scholarship and what should libraries be doing to support it” over a four-year period.
Once that task was complete and the next questions had been reached, “What are we doing to support digital scholarship?” and “How can we continue to improve and evolve our digital scholarship program?” (ibid.), the baton was handed to the official blog of the Research Commons initiative, http://library.osu.edu/researchcommons/, also run by the university libraries, whose aim is to provide services for the various phases of the research process, including finding, managing, and visualizing data.
Note: Not by chance, only a few lines later Rumsey writes (emphasis mine): “The goals of scholarly production remain intact, but fundamental operational changes and epistemological challenges generate new possibilities for analysis, presentation, and reach into new audiences”.
The strength of this definition comes from its having been worked out over almost a decade: Abby Smith Rumsey was the director of the Scholarly Communication Institute (SCI), an annual meeting place for shared reflection hosted at the University of Virginia Library. Over two successive phases, the SCI was charged with identifying and proposing strategies to advance scholarly communication, particularly in the humanities, in light of the ever wider spread of the digital paradigm. After a series of specific topics had been analyzed, including humanities research centers and visual studies, the last two of the nine reports produced by the SCI focus, from their titles onward, on a new-model scholarly communication.
The starting point is that the traditional division of processes and roles in scholarly communication can no longer respond adequately to the changes brought about by the digital revolution and by the ongoing transformation of academia, in particular the crisis of the humanities, and that it is therefore necessary to develop a model “enacted by individuals and groups playing multiple and overlapping roles” (Rumsey). The proposed way forward centers on several aspects: new genres and models of scholarly production; business and copyright models; proper recognition of the different professional profiles and roles needed for the scholarly community to develop and grow; the creation of digital infrastructures shared among publishers, libraries, and research centers, so as to ease the exchange and spread of knowledge; the development of training paths able to give new generations of humanities scholars the technological, communicative, and managerial skills this new system requires; and possible funding from institutions and private bodies, obtained by demonstrating the strategic value of the humanities in this new information landscape (Rumsey). These topics are easily seen to be linked in a chain: the change in the nature of the document propagates to the other aspects, both cognitive-educational and socio-economic, and is in turn influenced by them. On precisely this documentary aspect the two reports contain relevant observations, moving continually from the level of content to that of expression and back.
As for content, the role of the monograph is called into question: an argument whose rationale lies precisely in its length, in its being a long form, it is judged still relevant and strategic but in need of some evolution; in particular, the presence of non-textual content, and hence a finer granularity, must be taken into account (Riva). This last aspect clearly bears on the structure of the monograph, which must somehow be made explicit: “Monographs are structured like trees, with a long central line or trunk from which many branches lead off and from there, ever smaller branches are spawned” (Rumsey). The simile foregrounds the argumentative consistency that characterizes a monograph, without neglecting the possible hooks for further extensions, left as a task for the reader. On the web the scenario is diametrically opposite: immersed in a seemingly boundless graph whose nodes are the various pieces of content, it is the user who must each time build a linear path, as consistent as possible, by choosing among the many options available: “The book is the anti-open-web” (Rumsey). It should be stressed, however, that it is precisely familiarity with this tree structure of the monograph, acquired through a continuous process of study, that allows a scholar to move nimbly within an information space, creating the necessary connections as needed and assessing their validity, first of all against what is already part of his or her own knowledge: “Perhaps we are so familiar with the monograph form that we no longer notice that few scholars read long-form arguments from the first page to last, in that order. Rather, they move in well-worn paths that run between introductory, reference, citation, and index materials, all centering around the core narrative presentation.” (ibid.)
Note: http://uvasci.org.
Note: As evidence of how widely this need was felt and shared at a global level, FORCE11 (http://www.force .org) also got under way in Europe. Its name is an acronym for Future of Research Communication and eScholarship; its aims are entirely similar to those of the SCI, and it promotes both a conference and a summer school on these same themes.
Note: http://uvasci.org/institutes- - /.
It must also be stressed, however, that the possibility of making structure explicit in the digital monograph risks favoring those who already have this skill and, conversely, penalizing those who have yet to develop it. This aspect is judged fundamental in the doctorate, the stage of initiation into research. Not by chance, in connection with the reflection on the form of the monograph, the discussion moves at once to the dissertation, the final product of doctoral research: “What do those digital genres tell us about the ‘dissertation-as-proto-book’ as the most appropriate preparation for a career of productive scholarship?” (ibid.). To question the product is to interrogate the process. The excessive specialization that currently characterizes the humanities is criticized as carrying a strong risk of narrowness and isolation, which in turn hinders the dissemination and real impact of research results; this specialization certainly exists in the exact sciences too, but there it is at least partly mitigated by teamwork, a feature often found in the digital humanities. What can the aim, and the corresponding final product, of a doctorate be, if it is to replace a monograph built on the vertical accumulation of knowledge on ever more specific topics?
The question to answer is framed as follows: “The dissertation is meant to demonstrate capacity in relation to some body of knowledge [...] and demonstrate capacity as well. Capacity for what is the question now.” (ibid.). One possible answer is the capacity to work directly on structure, to create connections, “the ability to navigate the online environment and to disseminate knowledge to an audience” (ibid.), to be no longer a lonely scholar but a node of knowledge. On this principle, several alternatives can take the place of the monograph: “Can we imagine that a new-model dissertation would be a translation, a collection of essays, original digital objects, or curatorial projects?” (ibid.). Clearly the very idea of a horizontal relation, of juxtaposition, rather than a vertical one, of specialization, lies at the heart of the possibilities listed, which are linguistic, conceptual, code-based, or thematic in kind. The reference to the digital object explicitly moves the discussion from the plane of content to that of expression, and to the mixing of heterogeneous communicative codes. A first reflection is, as often happens, dichotomous: “There are two models of multimedia argument: in one, argument is carried by prose and punctuated by media as illustration; in the other, the medium itself bears the burden both of presentation and argumentation” (Rumsey). The difference underlying this opposition lies in the different role and weight taken on in each case by the various types of content (central line or side branch, denotative or connotative, informative or narrative, figure or ground) and, as the step that follows the opposition, in the interaction that arises between these two modes. Immediately after this statement two questions are addressed that also appear opposed but are in fact connected.
The first, already partly addressed, is how essential linearity is to the development of an argument: multiple codes and granularity inevitably call linear progression into question, if only because the various contents can be organized by type and their relations made explicit. The second concerns the need, when using a medium and hence a communicative code, to possess “basic technical proficiency and literacy skills” (ibid.). This latter kind of competence leads to the grammar used by an expressive medium and thus to a diegetic approach; non-linearity seems to call this into question, but in fact it entails on the one hand multiple paths, and therefore multiple narratives, and on the other the user's construction of an autonomous path starting from the individual nodes. That a diegetic aspect is present in this latter case too is confirmed both by the fractal nature of narrative, which is therefore contained in granular content as well (Yorke), and by closure (McCloud), a reader's capacity to fill the missing gaps autonomously, and more or less consciously, so as to create a whole that meets the requirements of homogeneity and coherence. Should this prove impossible, one would in any case fall back into the categories of antiplot and inconsistent realities, which nonetheless have their own rationale, just as, conversely, their opposites do: the classical plot and consistent realities (McKee). The diegetic factor, together with its organization into elements that can be assembled and taken apart as needed, gains further importance once one moves beyond the synchronic aspect, the juxtaposition of heterogeneous codes, and considers the diachronic one, which is fundamental to the development of the long form. At the level of labels this corresponds to the definitions of multimedia and transmedia and to their two different prefixes, the latter indicating the presence of a path, and therefore a narration, within a varied media setting.
Note: Thanks to the D3.js library, the Scalar platform, designed for creating enriched publications and frequently cited in the two SCI reports, offers several ways of visualizing content (grid, tree, radial, or aggregated graph), thereby showing the relations based on the underlying model, which consists of: individual image, sound, or audiovisual objects; annotations on objects or on portions of them; pages containing both text and one or more objects; paths able to arrange pages linearly; and tags for grouping pages on a set-based principle (Sayers and Dietrich). The digital edition of Jason Mittell's book on complexity in serial television narrative, built with Scalar itself (http://scalar.usc.edu/works/complex-television/), extends the print version by presenting the relevant portions of the primary sources to which the print edition refers. Leaving aside its function of extending the source text at specific points through granular content, which is in itself strongly hypertextual, this edition slavishly reproduces the original table of contents; so, even imagining a complete version that included the contents of both editions, both Scalar's model and its visualization options allow and encourage a non-linear reading starting from a linear base. This confirms that such a mode of reading is closely tied to knowledge of the underlying structure, whether because that structure is formalized, as in this case and in digital publications more generally, or because it is extrapolated empirically by an expert reader.
Reflecting on the transposition of a long-form argument, originally conceived for a print monograph and later destined for digital publication, Massimo Riva sums up several of the points made here (emphasis mine): “Rethinking my book as a digital monograph compelled me to shift the weight of my argument from the written to the visual component, embedding as much of my argument in the latter. At the same time, this also required a substantial shift in my writing strategy [...] investing the written text with a new crucial function: supporting the visualizations (in the shape of captions or internal annotations), on the one hand, and providing a narrative frame which allows the reader to connect the various visualizations among themselves, and follow a path toward some theoretical and methodological conclusions” (Riva). The supporting and narratively contextualizing role of the textual component is all the more necessary the more the non-textual part is experienced synchronically. Besides images, of course, this holds less for audio and video content than for computational content, where computation serves not only the plane of expression (for example, playing a video) but also that of content, which consists of a set of possible discrete states resulting from the underlying processing and from user interaction. These considerations, on the relationship among the narrative aspect, the mixing of heterogeneous communicative codes, granularity, non-linearity, and the fundamental role of technology, finally shed further light on what digital publishing shares with expressive forms marked by these same traits, such as comics or cinema (Posner); these forms can therefore be a valuable resource both for the relations among the different codes within individual blocks of information (McCloud) and for the overall narrative balance.
Note: An example of computational content related to narrative, by now evidently one of the main themes of these reflections, is Hedonometer, a sentiment analysis tool: applied to a corpus of Project Gutenberg texts (http://hedonometer.org/books/v /), it analyzed their emotional arc and so checked how closely they match one of six basic plots (Reagan et al.).
Media and Humanities
The Wikipedia page on digital scholarship takes up Rumsey's definition, which opens the initial paragraph; the paragraph closes with a decidedly vague attempt, which we will try to bring into sharper focus here, to establish a relationship with humanities computing: “Digital scholarship has a close association with digital humanities, though the relationship between these terms is unclear”.
On the one hand, it is plain that the digital humanities, in their various forms, constitute the humanities side of digital scholarship, and did so well before the latter term became widespread; on the other, certain aspects of the former extend strategically to the latter as a whole, in particular everything that pertains to the concept of electronic publication: “Digital humanities is not a unified field but an array of convergent practices that explore a universe in which print is no longer the exclusive or the normative medium in which knowledge is produced and/or disseminated.” A first proof of this thesis at a theoretical level is the explicit reference to the digital humanities in reflections on the evolution of the scholar in relation to the use of the computer as a cognitive tool: “One potentially rich space for action for the media studies professor is in a third variant of the digital humanities, the multimodal scholar. [...] She aims to produce work that reconfigures the relationships among author, reader, and technology while investigating the computer simultaneously as a platform, a medium, and a visualization device.” (McPherson). A point worth dwelling on in this statement concerns the implicit variants, and the corresponding generations, that precede and underlie the multimodal scholar. The first refers to the group of scholars engaged, in the footsteps of Father Roberto Busa and his Index Thomisticus, in activities directly tied to computational practices, for which the disciplinary label was, not by chance, humanities computing (Hockey).
Features of this variant are a historical and cultural tradition several decades long and relatively uniform despite its disciplinary heterogeneity, which was balanced both by the limited size of the scholarly community and by shared reflection on the centrality of the computational tool in the various practices, itself an object of theoretical and methodological consideration. All this must also be set in a context in which the poor usability of user interfaces pushed toward knowledge of the underlying technology, from the commands of text-based operating systems to the instructions of programming languages: this mix of theoretical and technological aspects could not fail to foster, besides interdisciplinary dialogue, a strong sense of cohesion. Finally, the main distinction within this first generation can be summed up as the dialogic opposition between a qualitative aspect, such as text encoding, and a quantitative one, such as text analysis (Gigliozzi).
The second and more recent generation is identified mainly with the use of the communication tools typical of Web 2.0, blogs and wikis above all, as an alternative and complement to the traditional venues of academic publication, thereby extending the editorial heterodoxy that in the first generation had been limited, for both pragmatic and cultural reasons, to research products that cannot be reduced to a typographic form without distorting their essence, such as textual databases or digital critical editions. It is with the progressive rise and spread of this second generation that the shift takes place, not without criticism, from humanities computing to digital humanities (Vanhoutte), a transition officially sanctioned in the mid-2000s thanks in part to the publication, both singularly and significantly in print, of A Companion to Digital Humanities by Blackwell (Schreibman et al. a).
Note: http://en.wikipedia.org/wiki/digital_scholarship.
Note: “A Digital Humanities Manifesto”, http://manifesto.humanities.ucla.edu/ / / /digital-humanities-manifesto/.
Note: Gino Roncaglia, describing a situation of discontinuity in the practices of electronic publishing applied to non-fiction, writes: “An exception is, in part, the field of digital critical editions, which is however tied to a set of tools and problems different from the idea of ‘enriching’ the text.” (Roncaglia). It is equally true that continual reflection on the nature and model of the edition, and in particular on the concept of the data model (Witt), inevitably brings these two paradigms together.
The more than evident intention is to define and at the same time expand a field in continual change, owing to an interplay of cultural and technological factors, by including practices, and consequently disciplines, in which the emphasis falls on the communicative and multimedia aspect. What emerges is something varied and heterogeneous that nonetheless “remains deeply interested in text, but […] has redefined itself to embrace the full range of multimedia” (Schreibman et al. b, xxiii). The opening is clearly toward media studies, which brings us to the third variant described by McPherson, while keeping as its privileged focal point the text, the heart of the traditional humanities.
Whether because of the ever-growing specialization of the various fields or because of the greater heterogeneity of this new scenario, the process of harmonizing the humanities on one side with media studies on the other, despite their common redefinition on the basis of the computational/digital paradigm, has not been and is not automatic or linear, and it is still marked by a certain tension. After the publication of McPherson's article, in a January exchange of tweets among Matthew Kirschenbaum, Stephen Ramsay, and Mark Sample, there was talk of a probable feud between the two camps and a consequent “turf war”, an image so evocative and effective that it was later taken up to describe a possible, and pessimistic, scenario in the relationship between the humanities and the digital humanities caused by a failure to integrate them (Hayles). This last question remains open and will remain so for some time: despite an ever wider contamination and spread of computational practices and tools (Stella), one cannot fail to notice a corresponding retreat into conservative positions. The relationship with media studies, by contrast, has changed completely. It should be stressed that in this field there seems to be no trace, or at least not to the same degree, of the mistrust that characterizes many humanities scholars (Tomasin), whether because the discipline is younger or because of an intrinsic interest in the mechanisms underlying any cognitive or communicative tool.
Publications such as The Arclight Guidebook to Media History and the Digital Humanities (Acland and Hoyt) or The Routledge Companion to Media Studies and Digital Humanities (Sayers), here too in print, show that this relationship exists and that, despite the inevitable disciplinary inflections, it has strategic points of contact, in particular on the nature of digital documents and their expressive possibilities: “Just as the codex was an improvement over the papyrus scroll […] the digitally mediated “page” offers yet another paradigm shift in the processes of writing and reading. The digital page yields a new axis of depth—a page that layers to other pages, can be seen next to other pages, and can include moving images, still images, sounds.” (Friedberg).
The reference in the Blackwell companion to the centrality of the text, quoted earlier, makes it possible to examine more closely this relationship between the media-studies approach and the humanistic one. A necessary first step is the far from trivial definition of the concept of text. Patrick Sahle, in his reflections on digital critical editions, has worked out a pluralistic theory in which the text can take on different meanings depending on the point of view adopted, and can therefore equally be interpreted as an idea, a work, a linguistic code, a version, a document, and a visual sign (Sahle, III). Despite the parallels with other formalisms or approaches, Sahle's innovation is to arrange these different interpretations in a wheel, with relations that are not hierarchical but circular and diametrical.
Note: In particular, in the first volume the chapter by Eric Hoyt focuses on creating digital collections of primary sources, on using and developing programs to analyze them and, finally, on writing books and articles that present the results, all themes connected with those addressed here; in the second, the chapter Futures of the Book (Bath et al.) analyzes the relationship between the printed book and the computer, moving beyond the “integrated” view that sees the former wholly replaced by the latter.
Note: An immediate comparison is with the FRBR model, although the two are not fully isomorphic (Pierazzo).
Figure. The text wheel of Patrick Sahle.
In media studies the focus is on the traits tied to materiality and to the plane of expression (document, version, and visual sign), which are directly connected to sociological aspects (McKenzie) and in turn form the link with the multi-code dimension; thanks to this circular structure, however, there is a direct connection, by juxtaposition or by opposition, with the readings tied to the plane of content, in a mixture that cannot fail to recall the interweaving of elements and roles traditionally considered separate that Rumsey emphasizes. Not by chance, Franz Fischer uses Sahle's text wheel to describe which aspects of textuality are represented by a digital critical edition and how they are connected to one another: “The boundaries between the different aspects are constantly in a state of flux and thus the diplomatic transcription also represents one particular version of the text” (Fischer). Likewise the adjective multimodal fits this pluralistic view well: it refers to the heterogeneous materiality of the media together with their technological and functional characteristics, and also to the polysemous definition of scholar seen earlier, in which the distinction between humanities and exact sciences, research and teaching, teacher and learner is quite loose.
On the same approach it is possible to reread statements such as McPherson's about studying and using the computer as at once platform, medium, and visualization device: the computational device cannot do without the linguistic aspect, which underlies code and programming languages, and in Sahle's wheel this reading sits directly opposite that of the visual sign; likewise the medium, understood as the union of expression and content, is represented by the diametrical opposition between document and work, and these two readings are adjacent respectively to that of sign and that of linguistic code. The opposition between document and work also explains a phenomenon that Rumsey describes as apparently contradictory and yet sees as fundamental to the evolution of the role of libraries: while on the one hand they act as “trusted conservator and long-term steward of humanities scholarship”, on the other they are “a force for innovation and a neutral meeting ground of people from different disciplines and professions to collaborate and experiment” (Rumsey). The first aspect is clearly tied to their traditional function with respect to the document and its physical dimension, whereas the second makes more sense when read through the lens of the work.
Note: If this last concept is recast in semiotic terms, the correspondence is complete, provided that the levels of content and expression are put on the same footing as the other sublevels and so treated as intermediate stages: the idea corresponds to the substance of content, the work to content, the linguistic code to the form of content, the version to the form of expression, the document to expression, and the visual sign to the substance of expression (Barthes).
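The relations on the text wheel discussed here can be written down as a small data structure. The sketch below is illustrative only: it assumes that the order in which the article lists the six readings (idea, work, linguistic code, version, document, visual sign) is also their order around the wheel, an assumption that reproduces the oppositions and adjacencies stated in the text. The names `FACETS`, `SEMIOTIC_LEVEL`, `opposite`, and `neighbours` are hypothetical and do not come from Sahle or from the article.

```python
# Six readings of "text", in the order the article lists them.
# Assumption: this listing order is also the order around the wheel.
FACETS = ["idea", "work", "linguistic code", "version", "document", "visual sign"]

# Semiotic counterparts as given in the accompanying note (Barthes).
SEMIOTIC_LEVEL = {
    "idea": "substance of content",
    "work": "content",
    "linguistic code": "form of content",
    "version": "form of expression",
    "document": "expression",
    "visual sign": "substance of expression",
}


def opposite(facet):
    """Return the facet diametrically across the wheel."""
    i = FACETS.index(facet)
    return FACETS[(i + len(FACETS) // 2) % len(FACETS)]


def neighbours(facet):
    """Return the two facets adjacent to `facet` on the wheel."""
    i = FACETS.index(facet)
    n = len(FACETS)
    return FACETS[(i - 1) % n], FACETS[(i + 1) % n]


if __name__ == "__main__":
    for facet in FACETS:
        print(facet, "| opposite:", opposite(facet), "| next to:", neighbours(facet))
```

Under this assumed ordering, the linguistic code faces the visual sign and the document faces the work, while the document sits next to the visual sign and the work next to the linguistic code, which matches the relations described in the text. A hierarchical model such as a tree could not express these diametrical pairs directly, which is why a ring with modular indexing is used here.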
Where the work is understood as narrative (and this holds for fiction and for non-fiction alike, both humanistic and scientific; Lolli), it is the library that provides the meeting place among heterogeneous actors whose interactions create new relations, connections, and states of knowledge with respect to those that existed before: the same kind of change that underlies the nature of narrative (Yorke). Returning to the relationship between digital scholarship and digital humanities from which we started, a further confirmation of their close relation, this time a pragmatic one, comes from the various centers born out of significant experiences in humanities computing, often with an academic library providing the necessary institutional support: notable examples are the Scholars' Lab at the University of Virginia, a direct evolution of the Electronic Text Center, and the Center for Digital Scholarship at Brown, born from the Scholarly Technology Group. Both centers were indispensable points of reference in the 1990s and early 2000s for the development of the Text Encoding Initiative standard for the digital encoding of texts, particularly literary and linguistic ones. Although this standard was conceived and developed mainly for documents and texts belonging to cultural heritage, its versatility and gradual spread, together with the availability of software tools for the XML format on which it is currently based, have extended its use to other sectors as well, in particular scholarly publishing (Holmes and Romary).
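To make concrete the point about generic XML tooling being usable on TEI documents, here is a minimal sketch that reads a tiny TEI fragment with Python's standard library. The sample document and the helper name `outline` are invented for illustration; only the TEI namespace and element names (`teiHeader`, `titleStmt`, `div`, `head`, `p`) are taken from the TEI guidelines. It is not drawn from the article or from Holmes and Romary.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents; the prefix "tei" is a local choice.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A made-up, minimal TEI document used only for this illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample article</title></titleStmt>
      <publicationStmt><p>Unpublished illustration</p></publicationStmt>
      <sourceDesc><p>Born digital</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="section">
        <head>Introduction</head>
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </div>
    </body>
  </text>
</TEI>"""


def outline(tei_xml):
    """Extract the title and a per-section list of paragraphs (hypothetical helper)."""
    root = ET.fromstring(tei_xml)
    title = root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)
    sections = []
    for div in root.iterfind(".//tei:body/tei:div", TEI_NS):
        sections.append({
            "head": div.findtext("tei:head", namespaces=TEI_NS),
            "paragraphs": [p.text for p in div.findall("tei:p", TEI_NS)],
        })
    return {"title": title, "sections": sections}


if __name__ == "__main__":
    print(outline(SAMPLE))
```

Because the markup describes what each part of the text is rather than how it looks, the same file can feed a web view, a print layout, or an analysis script without change.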
Scholarly publishing, together with scholarly communication, of which it is a subset, has been the main area in which the digital component made its effects felt from the very start, well before the computer became a tool for reading, above all in the creation and formatting of editorial products destined for print. This happened both at a general level, with page description languages, among them Adobe's PostScript, used by desktop publishing programs, and at a specialist level, through procedural and descriptive markup languages, the former represented by TeX and LaTeX and the latter by SGML first and XML later (Coombs et al.). Subsequently, the open access movement was in effect made possible by the creation of software infrastructures such as platforms for building institutional repositories and journals, together with the related metadata standards and protocols (Guerrini). Although open access content continues to be almost exclusively of a traditional kind, such as PDF articles or datasets, new modes of scholarly communication are emerging and gradually spreading, based on alternative formats as well as alternative channels. As far as formats are concerned, attention now centers on the computational essay (Somers), in which data and code become an integral and active part of the publication alongside the narrative argument: the document thus becomes a digital edition in the full sense, since it includes that edition's constituent components, namely the operational logic, the user interface, and the structured data (Meschini). For the channels and modes of dissemination, by contrast, it is necessary to dwell on the other end of the document-narrative axis.
The “unclear relationship” between digital scholarship and digital humanities, from which we started, now comes into sharper focus: if the document and the story, together of course with their dialogic relationship, are essential elements of the humanities, their re-mediation becomes strategic for the relationship and cross-contamination with the exact sciences. While this is immediate and intuitive for the document and digital data, and for everything that joins and harmonises these two poles (the edition, the archive or the digital library), the same cannot be said of the heterogeneous set defined as digital storytelling (Alexander), which therefore needs to be analysed with greater care.

http://scholarslab.lib.virginia.edu/; http://dcs.library.virginia.edu/digital-stewardship-services/etext/.

http://library.brown.edu/create/cds/; http://xml.coverpages.org/stgover.html.

For an overview of the main formats used for datasets, both textual ones such as CSV and binary ones, see the Library of Congress guide Format Descriptions for Dataset Formats, http://www.loc.gov/preservation/digital/formats/fdd/dataset_fdd.shtml. To this must of course be added the Linked Data initiative and the various serialisations of the RDF standard in different syntaxes, among them XML, JSON and Turtle.

Structured data should also include the metadata describing the structure of the document which, together with the other two components, allow the user to read it non-linearly (Venerandi).

| Documenti, medialità e racconto doi: .
/ DigitCult | Scientific Journal on Digital Cultures

Transmedia scholarship

A first example of this new mode of dissemination shows significant traits of the form of transmedia storytelling that Henry Jenkins defines as corporate, characterised by a top-down approach (Jenkins): the Why We Post project, a comparative anthropological study of the use of social media, made its results available by taking its cue from its own object of study. Daniel Miller, digital anthropologist and project coordinator, states that the holistic approach used in the research had in some way to be present in its dissemination as well. He also refers to the relationship between publishing results and large-scale popularisation, including the educational aspect as a mediating element: “we realized that the conventional way of research dissemination, the books, the journal articles, in some ways that is a little narrow [...] you have many different audiences out there who could be interested in those results, so then you think about how to create a range.” On the basis of these principles, the Why We Post website has interesting features from a structural and media point of view. The first piece of content available is a video of about four minutes giving a general presentation of the project, which sets out its aims, the methodology used (based on comparative field research), the main results obtained and finally the dissemination methods adopted, stressing the intention to move from a “global research” project to a phase of “global education” and inviting viewers to share the social content produced. The next section of the site, called Discoveries, relies on a progressive and multicodal, or if one prefers multimodal, approach to illustrate the results.
The starting point is a single sentence, often consisting of a single clause, stating a principle that frequently contradicts commonly held beliefs about social networks. These range from the idea that social media do not automatically make people more individualistic or the world more homogeneous, to the creation of new intermediate spaces between the public and private spheres, to the non-linear relationship between the use of a platform and the underlying technology, to the fact that it is users who shape social media and not the other way round, up to the educational opportunities, or even the privacy, offered to those who previously had no access to other kinds of information sources. Each of these statements is developed and argued through a series of discrete pieces of content, about six or seven, each relating to one of the countries in which the research was carried out. These are accompanied by a short text, which introduces rather than contextualises, and consist mainly of a video a few minutes long, with interviews or moments of everyday life, or a story, a short narrative of just over a thousand characters in which the general statement is applied and exemplified through a concrete case.

CCSCS Public Talk | Daniel Miller: Why We Post: The Anthropology of Social Media, http://youtu.be/ r_ a hub .

http://www.ucl.ac.uk/why-we-post.

http://youtu.be/ ja b mp .

http://www.ucl.ac.uk/why-we-post/discoveries/.

Figure: The Discoveries section of the Why We Post website.
The purpose of this section is to provide a general introduction to the research results, supported by a careful selection of the documentation collected. It thus fulfils, on its own scale, the function theorised by Robert Darnton in describing the possible pyramidal structure of an electronic book (or more generally of a digital edition) arranged in layers in which, immediately after a high-level description of the topic, “the next layer could contain expanded versions of different aspects of the argument, not arranged sequentially as in a narrative, but rather as self-contained units that feed into the topmost story” (Darnton). A consequence of the absence of a sequential narrative arrangement is that narrative is more strongly present in the individual discrete units, as shown by the fact that they are called stories and by the documentary-style videos. Although the correspondence is not complete and above all not formalised, Darnton's layered structure finds further echoes in the document and media layout of Why We Post. The project's YouTube channel contains all the videos produced during the research and therefore corresponds to the third level, “composed of documentation, possibly of different kinds, each set off by interpretative essays” (ibid.). The interpretative essays, however, are found instead, together with the theoretical component of the fourth level, in one of the project's two remaining information blocks: eleven monographs published in open access and available as PDF and in HTML through the digital platform of UCL Press, the university press of University College London.

http://www.youtube.com/user/whywepost.

http://www.uclpress.co.uk/collections/series-why-we-post. Of the eleven monographs, nine are single-authored and each is centred on one of the places where the fieldwork was carried out; see http://www.ucl.ac.uk/why-we-post/research-sites.
The remaining two are co-authored and cut across the individual studies: in particular How the World Changed Social Media (Miller et al.), written by almost all the researchers involved in the project, is at once a kind of introduction and a general summary, since it reports and summarises in argumentative, linear form the content distributed across the various discrete information units. All the monographs have a strongly popularising slant, especially in their linguistic register: this explains, together with their availability in open access in several languages and their subject of general interest, the high number of downloads, in the order of tens of thousands (Costa et al.). In keeping with the blending of qualitative aspects, here tied to how content is structured, and quantitative ones, six of these monographs are available through Topicgraph, http://labs.jstor.org/topicgraph/. This tool, developed within JSTOR's Reimagining the Monograph project (Humphreys et al.), graphically displays the distribution of the main topics in a document through an analysis based on topic modeling (Brett). Finally, for a complete list of the publications produced during the project, including the articles published in scholarly journals, see http://www.ucl.ac.uk/why-we-post/about-us/publications/.

Finally, the last of the main blocks covers both the fifth and, in part, the sixth level of this structure, the pedagogical and dialogic element between authors and readers: Why We Post: The Anthropology of Social Media is an online course available on the FutureLearn platform, run by a consortium made up mainly of British universities.
Compared with the other main MOOC providers, including Coursera and edX, the courses on FutureLearn, and Why We Post is certainly no exception, are more concise and essential: the learning content consists of short articles and videos, plus any external in-depth material, distributed, as is now common practice, across the weeks of the course, with an estimate of the weekly hours of work required. This description of the project's document and media spectrum shows a natural tendency towards a certain correspondence, albeit not made explicit, with the levels described by Darnton as that spectrum expands. Again according to Jenkins, the corporate mode, imposed from above, is contrasted with the grassroots mode, characterised by bottom-up development. Although it cannot be fully identified with the latter paradigm, mainly because cultural institutions rather than individual users are involved, the next example shares several of its characteristics. The most prominent scientific event of April, at least in terms of media coverage, was the live broadcast of the image of the shadow of the black hole at the centre of the galaxy Messier 87 by the Event Horizon Telescope project. Thanks in part to coverage by the major online newspapers, on the day of the announcement this image was among the most shared on social platforms, and the official video of the press conference on YouTube shows almost one million three hundred thousand views.
While this was fairly predictable, as was the almost immediate reuse of that image as a meme, the same could certainly not be said of the post published on her own Facebook profile by Katie Bouman, the young researcher who played a prominent role in developing the algorithm for processing the data received and subsequently transforming them into

http://www.futurelearn.com/courses/anthropology-social-media. In the specific case of Why We Post, the course lasts five weeks, with three hours of work per week. Unlike the other platforms, FutureLearn grants access to courses only in the periods in which they are actually running. As part of its business model, once the scheduled weeks have passed it is possible to keep accessing a course's content, otherwise free, only by paying a fee, which also entitles the learner to an official certificate of participation on completing the course. There is also a preview of the content, clearly for promotional purposes. For Why We Post this preview consists of two articles and a video. The articles are on Twitter (http://www.futurelearn.com/courses/anthropology-social-media/ /steps/ ) and on users' caution in using social media for topics such as politics in certain contexts (http://www.futurelearn.com/courses/anthropology-social-media/ /steps/ ), while the video focuses on memes (http://www.futurelearn.com/courses/anthropology-social-media/ /steps/ ). Both articles are about five thousand characters long on average and are divided into paragraphs, so as to make them easier to read by exploiting both structure and presentation. The teaching purpose is reflected in the language, which relies on non-specialist terminology and linear syntactic constructions.
These same features are found in the video, to which must be added both the dialogic and narrative dimension brought by the explicit presence of a teacher and a strong use of the visual channel, through images that give concrete form to the topic. The FutureLearn course ran twice in one year, in February and June, three times in another, in January, June and October, and only once in a third, in August, the period in which posts on the related Facebook and Twitter channels stop. The videos are available in a dedicated playlist on the project's YouTube channel and, although no new run has so far been officially announced, the course is permanently available on UCLeXtend, University College London's e-learning platform, based on Moodle, where it has been translated into other languages, including Italian, Chinese and Spanish, to make it more accessible. It should of course be stressed that the technological and conceptual difference between the two platforms, as information systems that are heterogeneous in data model, operating logic and user interface, turns the use of the same content into two effectively different experiences, particularly in the mode of learning: the one more group-based and collaborative, the other autonomous and individual.

http://eventhorizontelescope.org/.

National Science Foundation/EHT Press Conference Revealing First Image of Black Hole, http://youtu.be/lnji jy w.

a visual representation. In the post, published to coincide with the event, Bouman shares a photo of herself visibly thrilled as she watches the result obtained.

Figure: The post published by Katie Bouman during the reconstruction of the black hole image.
The virality of this photo, which in turn became a meme, focused social media attention on the young researcher, with the well-known effects of idealisation, personalisation and polarisation of opinion. A later post was needed in which Katie Bouman made clear that the result had been possible only thanks to teamwork. Nonetheless, on that same day a Wikipedia page about Bouman was created, and her videos on YouTube, both popular ones, such as a TED talk, and scientific ones, including one produced specifically by the California Institute of Technology a few days later, gathered a high number of views. According to transmedia theory, Bouman's original post played, albeit unintentionally, the role of rabbit hole, an entry point into the narrative universe. Admittedly, in this specific case the exceptional nature of the event may raise doubts about how far it can be adapted and generalised, but Why We Post also had similar entry points, in the form of articles published in The Economist or BBC reports. What must follow the rabbit hole is the construction of a path in which, at every node, the user is encouraged to continue, so as to satisfy their narrative and informational needs, even if these are induced. It goes without saying that this approach can also be transposed to scholarly communication, ideally including both the corporate and the grassroots side. Returning to the comparison between Darnton's methodology and Jenkins's, the differences at first sight seem far from few. The former concentrates on the documentary and informational aspect, neglecting if not rejecting outright the narrative factor, which is at the heart of the latter's reflections together with the mediological aspect. In reality, as we have seen, story and information are closely linked, as are documentary and media nature.
Darnton speaks of “documentation, possibly of different kinds” and Jenkins of how “each medium makes its own unique contribution”, and the pyramidal structure can be considered a particular case of the “unified and coordinated entertainment experience” (ibid.).

http://www.facebook.com/photo.php?fbid= .

http://www.facebook.com/photo.php?fbid= . Despite the response it received, this post did not go viral like the previous one and moreover attracted the attention of trolls and haters, marked by an anti-scientific and discriminatory attitude towards the young scientist.

http://en.wikipedia.org/wiki/katie_bouman.

The view counts are in the millions for the TED video (http://www.youtube.com/watch?v=bivezcvcsys) and considerably lower for the Caltech one (http://www.youtube.com/watch?v=ugl_ol orce). The total reached by the second video is striking, despite the numerically lower figure, since it is expressly aimed at a highly specialised audience.

The main difference between the two lies in the formalisation of relations. For the former it is a starting point, for the latter a point of arrival: for Jenkins the dispersion, systematic and certainly not random, of what is a coherent whole is a fundamental aspect of transmedia storytelling, since this phase is followed by a process of reconstruction and decoding at the level of both message and medium, with non-negligible aspects of collective intelligence. All this is added value for the entertainment industry, which is usually viewed negatively with regard to training and educational processes, processes that are instead at the heart of the university and library world to which Darnton belongs.
The ideal user of the layered book either already possesses information literacy skills or is aware of their importance, whereas this cannot be said of the user of a work of fiction. That is why Darnton is interested in the product, in having relations already made explicit so as to meet the user's information needs more easily, while Jenkins is interested in the process, and consequently in the indirect development of those same skills through the act of making the relations explicit, so as to satisfy a need that is both informational and emotional. From this perspective, and going back to the original definition of scholarship from which we started, in which the distinction between teaching and research was blurred, an interesting line of development is whether and how the principles of transmedia storytelling can be adapted to digital scholarship, along the lines of what has already been done for teaching with transmedia education (Jenkins). Of particular relevance are factors such as spreadability and drillability, which respond on the one hand to the already mentioned needs for granularity and ease of dissemination of content, and on the other to the move from granularity to longform. Immersion and extraction can represent the presence of computational components together with the possibility of freely using the data, or part of them, in third-party applications. The principle of seriality can be controversial. In a positive sense it means highlighting the progression of a research path together with the use of different media, such as the recording of a conference presentation or a promotional video, which also brings in the aspect of performance. In a negative sense it is the phenomenon known as “salami science”, in which the results of a study are fragmented to obtain the largest possible number of publications, an extremely pragmatic solution to the Darwinian opposition publish or perish.
Similarly, continuity vs. multiplicity and subjectivity are not easy to apply, mainly because of the internal consistency required of a publication, particularly in the exact sciences. In the first case the already mentioned factors of granularity and non-linearity can be a solution, at least a partial one, while in the second an annotation mechanism, such as the one implemented by Hypothes.is, lets other users extend the base text and so add multiple perspectives. Developing this aspect, and going beyond the single document, a concept map can make it possible to arrange the publications on a given topic spatially, together with any semantic relations of affinity or divergence, thereby achieving worldbuilding.

Combinations and conclusions

The many forms of digital scholarship are, case by case, combinations, with variable modes and values, of principles that we have tried to identify here,

And for this purpose the use of the Semantic Web and ontologies is appropriate. The SPAR Ontologies (http://www.sparontologies.net/) focus on the domain of scholarly publishing and can be used to express Darnton's model, in particular the part describing documents, with the necessary extensions where needed.

The StoryCurve project, based on processing and visualising the relationship between récit and histoire in films that do not follow chronological order, such as Pulp Fiction or Memento, presents on its website (http://storycurve.namwkim.org/) an introductory text, a scientific article (Kim et al.) with supplementary materials in PDF, Story Explorer (http://storyexplorer.namwkim.org/), the tool developed, together with its source code and the data for the various films, and finally a short video of about three minutes in which the project's main topics are presented effectively and engagingly.
The presence in the research group of two members from Disney's R&D division very probably played no small part in the making of this video, since they are well aware of the importance of communication.

http://web.hypothes.is/.

classifiable as relations of both opposition and interaction between document-centric and data-centric arrangements, mono- and multicodal, individual and collective, informational and narrative (denotative and connotative), static and dynamic (the latter covering both interactive and computational), synchronic and diachronic (temporal and spatial), short form and long form. In this way it is possible to bring together practices as heterogeneous as the various digital platforms for publishing articles, monographs, miscellanies and edited volumes, chrono- and georeferenced maps, up to video essays. Concluding with a reflection on this last category, at first glance it may seem reductive to use audiovisual language in place of textual language, which demands greater participation from the reader, with fundamental consequences at the cognitive level (Wolf). It should not be forgotten, however, that the suprasegmental and performative aspect of verbal language, and the synaesthetic approach of the double visual/auditory channel, can be remarkably effective in conveying even concepts that are not strictly narrative. Moreover, visual grammar has a level of expressiveness comparable to that of text, although of a different nature; learning to decode it can therefore be an important component of that more general activity which is education in complexity (Roncaglia). Naturally, the use of video essays is all the more widespread and mature the more the visual medium is itself the object of analysis.
In the exact sciences, by contrast, their role seems to be reduced to a secondary, paratextual contribution, as with video abstracts, or to a mimetic approach centred on reproducing conferences and lectures, available on various YouTube channels or on dedicated portals such as VideoLectures. The exception is educational material, where increasing care is taken over communication, as in the CrashCourse channel, where the development of content, ranging from computer science to biology and statistics and on to mythology or the history of theatre, is supported by a creative team responsible for the various aspects of each episode, from the script to virtual sets to the use of music, drawings and animations. It should be stressed, however, that on the one hand the ever greater speed at which information spreads and on the other the equally growing interdisciplinarity of science reduce the interval

Literary Studies in the Digital Age: An Evolving Anthology (Price and Siemens), http://dlsanthology.mla.hcommons.org/, is an anthology on the digital humanities in which chapters are published progressively and users can comment. The digital edition of Debates in the Digital Humanities (Gold), http://dhdebates.gc.cuny.edu/, also lets users comment, but adds a computational aspect concerning structure: it shows graphically how much a sentence has been highlighted by readers, making it possible to spot at a glance the points judged most salient, a possibility that must not, however, come at the expense of reading, understanding and evaluating the argument as a whole. In addition, sentences, annotations and comments are available in JSON format for further processing.
See in particular the Spatial History Project at Stanford University (http://web.stanford.edu/group/spatialhistory/), centred on the spatiotemporal visualisation of historical and cultural data to better understand how phenomena evolve, and Neatline, a tool built by the Scholars' Lab (http://neatline.org/) for creating narratives based on maps and timelines.

As in the YouTube channel Every Frame a Painting (http://www.youtube.com/user/everyframeapainting/), devoted to film language. Another significant example is the channel The Art of Story (http://www.youtube.com/channel/ucngffb ouo i mfx mqzlg/), which offers a video course on storytelling, created from a series of in-person lectures and later adapted into a monograph (Skelter). The incisiveness of this course, compared with many others on similar topics, lies precisely in its use of the medium, both synchronically and diachronically. In the first case images and text effectively connote and give shape to the content of the audio track: in talking about the two different types of writers, those who rely on instinct and those who rigorously follow an outline, it uses the labels berserker, the Norse warriors famous for their uncontrolled fury, and assassin, marked by calm and rigour, while showing short clips from Kurosawa's films depicting the two styles. In the second case, whenever a topic is illustrated, for example the distinction in a dialogue between text, subtext and context, it is immediately followed by a film scene centred on that theme, with any explanatory additions supplied through on-screen captions or voice-over. This same attention to balancing explanation and examples, between a more explanatory phase and an illustrative, in-depth one, is also found in the overall structure of each video.

http://videolectures.net.
http://www.youtube.com/user/crashcourse/.

of time between the research process and the educational and outreach process. Paradoxically, or perhaps precisely significantly, digital scholarship shows the distinctive traits of the etymology of scholarship set out at the start of these reflections, in which research and teaching, humanities and exact sciences are not sharply distinguished and separated. For this reason the concluding reflection is that, even more than the digital humanities, because of the greater involvement of the hard sciences, digital scholarship is a privileged and strategic meeting place in the whole scientific and knowledge landscape, being made up of content, technological and communicative factors that interact continuously and seamlessly.

References

Alexander, Bryan. The New Digital Storytelling: Creating Narratives with New Media. Santa Barbara, CA: Praeger.

Acland, Charles R. and Eric Hoyt (eds.). The Arclight Guidebook to Media History and the Digital Humanities. Falmer, UK: Reframe Books. http://projectarclight.org/book.

Barthes, Roland. “Éléments de sémiologie”. Communications. http://www.persee.fr/doc/comm_ - _ _num_ _ _ .

Bath, John, Alyssa Arbuckle, Constance Crompton, Alex Christie and Ray Siemens, INKE Research Group. “Futures of the Book”. In The Routledge Companion to Media Studies and Digital Humanities, edited by Jentery Sayers. Abingdon-on-Thames: Routledge.

Brett, Megan R. “Topic Modeling: A Basic Introduction”. Journal of Digital Humanities. http://journalofdigitalhumanities.org/ - /topic-modeling-a-basic-introduction-by-megan-r-brett.

Costa, Elisabetta, Daniel Miller, Laura Haapio-Kirk, Nell Haynes, Tom McDonald, Jolynna Sinanan, Razvan Nicolescu, Juliano Spyer, Shriram Venkatraman and Xinyuan Wang. “Why We Post: Taking Anthropology to the World”. Anthropology News .
( ): e -e . doi: . /an. .

Coombs, James H., Allen H. Renear and Steven J. DeRose. “Markup Systems and the Future of Scholarly Text Processing”. Communications of the ACM. doi: . / . .

Darnton, Robert. “The New Age of the Book”. New York Review of Books, March. http://www.nybooks.com/articles/ / / /the-new-age-of-the-book/.

Fischer, Franz. “The Pluralistic Approach. The First Scholarly Edition of William of Auxerre's Treatise on Liturgy”. Computerphilologie. http://computerphilologie.tu-darmstadt.de/jg /fischer.html.

Friedberg, Anne. “On Digital Scholarship”. Cinema Journal. http://www.jstor.org/stable/ .

Gigliozzi, Giuseppe (ed.). Studi di codifica e trattamento automatico di testi. Roma: Bulzoni.

Gold, Matthew K. (ed.). Debates in the Digital Humanities. Minneapolis, MN: University of Minnesota Press.

Guerrini, Mauro. Gli archivi istituzionali. Milano: Editrice Bibliografica.

Hayles, Nancy K. “How We Think: Transforming Power and Digital Technologies”. In Understanding Digital Humanities, edited by David M. Berry. London: Palgrave Macmillan. doi: . / _ .

Hockey, Susan. “The History of Humanities Computing”. In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens and John Unsworth. Oxford: Blackwell. doi: . / .ch .

Holmes, Martin and Laurent Romary. “Encoding Models for Scholarly Literature: Does the TEI Have a Word to Say?”. In E-Publishing and Digital Libraries: Legal and Organizational Issues, edited by Ioannis Iglezakis, Tatiana-Eleni Synodinou and Sarantos Kapidakis. Hershey, PA: IGI Global.

Hoyt, Eric. “Curating, Coding, Writing: Expanded Forms of Scholarly Production”. In The Arclight Guidebook to Media History and the Digital Humanities, edited by Charles R. Acland and Eric Hoyt. Falmer, UK: Reframe Books. http://projectarclight.org/book.
Humphreys, Alex, Christina Spencer, Ronald Snyder, Laura Brown and Matthew Loy. “Reimagining the Digital Monograph. Design Thinking to Build New Tools for Researchers”. Journal of Electronic Publishing. doi: . / . . .

Jenkins, Henry. Convergence Culture: Where Old and New Media Collide. New York: NYU Press.

Jenkins, Henry. “Transmedia Storytelling”. Confessions of an Aca-Fan. http://henryjenkins.org/blog/ / /transmedia_storytelling_ .html.

Jenkins, Henry. “Transmedia Education: The Principles Revisited”. Confessions of an Aca-Fan. http://henryjenkins.org/blog/ / /transmedia_education_the_ _pri.html.

Kim, Nam Wook, Benjamin Bach, Hyejin Im, Sasha Schriber, Markus Gross and Hanspeter Pfister. “Visualizing Nonlinear Narratives with Story Curves”. IEEE Transactions on Visualization and Computer Graphics. doi: . /tvcg. . .

Lolli, Gabriele. Matematica come narrazione. Bologna: Il Mulino.

McCloud, Scott. Understanding Comics. Northampton, MA: Kitchen Sink Press.

McKee, Robert. Story: Substance, Structure, Style, and the Principles of Screenwriting. New York: Regan Books.

McKenzie, Donald F. Bibliography and the Sociology of Texts. Cambridge: Cambridge University Press.

McPherson, Tara. “Introduction: Media Studies and the Digital Humanities”. Cinema Journal. http://www.jstor.org/stable/ .

Meschini, Federico. Reti, memoria e narrazione. Archivi e biblioteche digitali tra ricostruzione e racconto. Viterbo: Sette Città.

Miller, Daniel, Elisabetta Costa, Nell Haynes, Tom McDonald, Razvan Nicolescu, Jolynna Sinanan, Juliano Spyer, Shriram Venkatraman and Xinyuan Wang. How the World Changed Social Media. London: UCL Press. http://www.uclpress.co.uk/products/ .

Mittell, Jason. Complex TV: The Poetics of Contemporary Television Storytelling. New York: NYU Press.

Pierazzo, Elena. Digital Scholarly Editing: Theories, Models and Methods.
abingdon-on-thames: routledge, . posner, miriam. “how is a digital project like a film?”. in the arclight guidebook to media history and the digital humanities, a cura di charles r. acland e eric hoyt, - . falmer uk: reframe books, . http://projectarclight.org/book. price, kenneth m. e ray siemens (a cura di). literary studies in the digital age: an evolving anthology. new york, modern language association, . doi: . /lsda. . quinto, riccardo. scholastica. storia di un concetto. padova: il poligrafo, . reagan, andrew j. e lewis mitchell, dilan kiley, christopher m. danforth, peter sheridan dodds. “the emotional arcs of stories are dominated by six basic shapes”. epj data science . ( ). doi: . /epjds/s - - - . riva, massimo. “an emerging scholarly form: the digital monograph”. digitcult - scientific journal on digital cultures . ( ): - . doi: . / . roncaglia, gino. “experimenting with new forms of academic writing”. digitcult - scientific journal on digital cultures . ( ): - . doi: . / . roncaglia, gino. l’età della frammentazione. roma-bari: laterza, . rumsey, abby smith. “scholarly communication institute : emerging genres in scholarly communication”. charlottesville, va: university of virginia library ( ). http://uvasci.org/institutes- - /sci- -emerging-genres.pdf. rumsey, abby smith. “scholarly communication institute : new-model scholarly communication: road map for change”. charlottesville, va: university of virginia library ( ). http://uvasci.org/institutes- - /sci- -road-map-for-change.pdf. sayers, jentery and craig dietrich. “after the document model for scholarly communication: some considerations for authoring with rich media”. digital studies/le champ numérique . ( ). doi: . /dscn. . sayers, jentery (a cura di). the routledge companion to media studies and digital humanities. abingdon-on-thames: routledge, . sahle, patrick. digitale editionsformen. norderstedt: books on demand, . schlosser, melanie. “defining digital scholarship”. 
digital scholarship @ the libraries ( ). http://library.osu.edu/blogs/digitalscholarship/ / / /welcome-to-digital-scholarship- the-libraries/. schlosser, melanie. “closing this blog.” digital scholarship @ the libraries ( ). http://library.osu.edu/blogs/digitalscholarship/ / / /closing-this-blog/. schreibman, susan, ray siemens e john unsworth (a cura di). a companion to digital humanities. oxford: blackwell, . http://www.digitalhumanities.org/companion. schreibman, susan, ray siemens e john unsworth. “the digital humanities and humanities computing: an introduction”. in a companion to digital humanities, a cura di susan schreibman, ray siemens e john unsworth, xxiii - xxviii. oxford: blackwell, . doi: . / .fmatter. skelter, adam. the lost art of story: the anatomy of chaos transcripts. doi: . / federico meschini | digitcult | scientific journal on digital cultures somers, james. “the scientific paper is obsolete. here’s what’s next”. the atlantic, aprile . www.theatlantic.com/science/archive/ / /the-scientific-paper-is- obsolete/ /. stella, francesco. testi letterari e analisi digitale. roma: carocci, . tomasin, lorenzo. l’impronta digitale. cultura umanistica e tecnologia. roma: carocci, . witt, jeffrey c. “digital scholarly editions and api consuming applications”. in digital scholarly editions as interfaces, a cura di roman bleier, martina bürgermeister, helmut w. klug, frederike neuber e gerlinde schneider, - . norderstedt: books on demand, . http://kups.ub.uni-koeln.de/ /. vanhoutte, edward. “the gates of hell. history and definition of digital | humanities | computing”. in defining digital humanities, a cura di melissa terras, julianne nyhan e edward vanhoutte, - . farnham: ashgate publishing, . venerandi, fabrizio. “notes for a 'digital native writing'”. digitcult - scientific journal on digital cultures . ( ): - . doi: . / . wolf, maryanne. proust and the squid: the story and science of the reading brain. thriplow, uk: icon books, . yorke, john. 
into the woods: a five-act journey into story. london: penguin books, . archived version from ncdocks institutional repository http://libres.uncg.edu/ir/asu/ creating digital scholarship services at appalachian state university by: pam mitchem and dea miller rice abstract this article reviews literature related to building digital scholarship centers and explores the experience of appalachian state university libraries in planning and implementing a digital scholarship program. appalachian surveyed its faculty, performed a gap analysis of existing services, compared programs at other universities, and inventoried services provided by campus partners to determine service offerings. the following case study will discuss the planning process and the first year of implementation, exploring some of the challenges, such as a lack of understanding and hostility toward new modes of scholarship. some of the lessons learned include the need for adequate research and planning time as well as education for, and communication with, key stakeholders. mitchem, pamela price & rice, dea miller. "creating digital scholarship services at appalachian state university." portal: libraries and the academy, vol. no. , , pp. - . project muse, doi: . /pla. . . publisher version of record available at: https://muse.jhu.edu/article/ pamela price mitchem and dea miller rice portal: libraries and the academy, vol. , no. ( ), pp. – . copyright © by johns hopkins university press, baltimore, md . creating digital scholarship services at appalachian state university pamela price mitchem and dea miller rice abstract: this article reviews literature related to building digital scholarship centers and explores the experience of appalachian state university libraries in planning and implementing a digital scholarship program. 
appalachian surveyed its faculty, performed a gap analysis of existing services, compared programs at other universities, and inventoried services provided by campus partners to determine service offerings. the following case study will discuss the planning process and the first year of implementation, exploring some of the challenges, such as a lack of understanding and hostility toward new modes of scholarship. some of the lessons learned include the need for adequate research and planning time as well as education for, and communication with, key stakeholders.

introduction

appalachian state university is a regional comprehensive university in boone, north carolina, serving more than , students and employing close to faculty, including non-tenure track. the university libraries strives, its mission statement says, “to cultivate an environment where people discover, create, and share information that reflects the acquisition of 21st-century knowledge and skills. we are active partners in advancing the university’s principles of sustainability, social justice, inclusion, and global citizenship.” to further these principles, the newly created digital scholarship and initiatives team at the university libraries engages and partners with appalachian faculty members, students, library colleagues, and the community to support new scholarship in a constantly changing digital landscape.

appalachian state’s university libraries formed a digital initiatives task force in november that spent five months extensively researching appalachian’s digital services, faculty scholarship needs, and programs at other universities. this task force was initiated to determine how we could expand our program to help support and collaborate with faculty on digital scholarly projects. requests from faculty for help with digital collections and exhibits were steadily increasing, and we were not prepared to manage these requests with our current structure. the report, completed in april , recommended that a team be formed to coordinate existing services and develop new ones to support faculty and student digital scholarship. the new library team, digital scholarship and initiatives (dsi), began serving the university on july , .

diane goldenberg-hart writes in the introduction to the “report of a cni [coalition for networked information]-arl [association of research libraries] workshop: planning a digital scholarship center”:

the diverse needs of any campus population, combined with constantly evolving modes of scholarship, can make it very difficult for colleges and universities to establish strategies that deliver effective services with broad impact. furthermore, sustaining flexible and innovative programming can be especially challenging.

academic libraries have grappled for some time with how to provide digital research services to their faculty and student constituents. this paper is a case study of our experience in planning and launching a digital scholarship program in our library. it also explores other institutions’ experiences, outlines challenges faced and lessons learned, and provides a checklist to help others plan for digital scholarship programs.

digital this, digital that: how do we define “the center”?

in this article, we consider the term “digital scholarship” in its broadest sense, leaving it to individual institutions to refine the definition for themselves after considering their own needs. broadly defined, digital scholarship is the use of digital tools to create, analyze, and disseminate scholarly products. support for digital scholarship comes in many forms. it can be library-centered, or it may develop in information technology (it) or academic departments.
some of the earliest centers devoted to digital scholarship were digital humanities centers (dhcs), many of which were established in academic departments. digital scholarship centers (dscs) can be found in it departments or libraries, or they may be independent. though dhcs focus specifically on the humanities and dscs encompass all disciplines, there are many similarities between the two. the following is a review of studies that examine digital humanities centers and digital scholarship centers.

digital humanities centers

one of the first digital humanities projects was the text encoding initiative, initiated in by a group of scholars in the humanities, linguistics, and computer science to develop an encoding scheme for humanities electronic texts. since then, digital humanities centers and programs have proliferated in colleges and universities across the country. there are many definitions and interpretations of digital humanities and what a digital humanities center or program should encompass. jennifer schaffner and ricky erway define digital humanities as the “application of digital resources and methods to humanistic inquiry.” chris sula similarly describes the digital humanities as focusing on the “application of computing technology to humanistic inquiries and on humanistic reflections on the significance of that technology.” diane zorich, in her report a survey of digital humanities centers in the united states, concludes that digital humanities centers vary in their characteristics and services.
her expansive definition of dhcs is based on two assumptions: (1) digital humanities in the broadest sense amounts to “humanities-based research, teaching and intellectual engagement conducted with digital technologies and resources”; and (2) the center can be either physical, virtual, or a hybrid of the two. a hybrid model may encompass a few core staff who work with partners from other departments to provide a suite of services. the centers in zorich’s study were housed in various departments other than the library or were independent.

digital scholarship centers

digital scholarship is a broader concept, encompassing all disciplines, not just the humanities. edward ayers, in “does digital scholarship have a future?” ( ), defines the enterprise as “discipline-based scholarship produced with digital tools and presented in digital form.” charles inskip, in “from information literacy to digital scholarship: challenges and opportunities,” discusses a broad range of research that places digital scholarship and information literacy within a broader framework of digital literacy. he defines digital scholarship as “the ability to participate in emerging academic, professional and research practices that depend on digital systems.” this definition is not far removed from the one articulated by ayers.

in the coalition for networked information (cni) report “digital scholarship centers: trends & good practice,” joan lippincott and diane goldenberg-hart noted that digital scholarship centers differ from digital humanities centers in that they have a different administrative reporting structure, more diverse disciplines, and a wider range of clientele. they also provide tools, hardware, software, storage, expertise, and multiple levels of program support for all members of the campus community. nearly all the centers in the cni report were in academic libraries.

models: what does a digital scholarship center do?
a review of the literature finds multiple models for digital scholarship services. the cni-arl (coalition for networked information-association of research libraries) report “planning a digital scholarship center” ( ) underscores that no one model will fit every institution and that centers are individualized according to the specific needs of their parent organization. understanding the organizational culture and building partnerships with its constituents are necessary for the sustainability of the center. lippincott and goldenberg-hart’s cni report stated that the most common services among the participating centers in their workshop were: consultation on digital technologies; digital preservation and curation; and digital project management workshops related to these topics.

zorich, an information management consultant for the council on library and information resources (clir), notes that centers may build digital collections, offer tools, provide training and programs such as lectures and conferences, give consultation, facilitate collaboration, make spaces available for experimentation, and offer repository and preservation support. zorich groups dhcs into two categories: center-focused and resource-focused programs. the former require a physical location with multiple services and programs for diverse audiences. the latter are “organized around a primary resource, located in a virtual space” and serve a “specific group of individuals.” for example, the center might be a digital library or archive focused on a subject specialty.
the clir report details the variances in physical and virtual locations, governance, staffing, reporting structures, sustainability, services, partnerships, and tools, providing a good overview of how other centers are modeled. zorich suggests that partnering with other campus units to provide resources may be more effective than putting everything in one center.

schaffner and erway state that libraries can respond to digital humanists’ needs in many ways, from a virtual dh center, where the library packages existing services, to creating a physical center with space, equipment, funding, and dedicated staff. the virtual center is a popular option because it requires fewer changes in organizational structure and funding. as evidence of this, the arl spec [systems and procedures exchange center] kit on digital humanities reports that about percent of responding libraries have centers, with about half the responding libraries providing ad hoc services.

joyce ogburn, in “a report on the digital humanities and concept paper for a virtual center for interdisciplinary knowledge arts” ( ), categorized digital humanities centers as: (1) tool based, focusing on general services and tools or creating new tools; (2) theme based, with projects and tools developing around a community of interest or like projects; (3) networked, utilizing partnerships across the organization, dynamic approaches, and many different tools and communities; and (4) individualist, relying on individual or small group one-off projects.

chris sula (“digital humanities and libraries: a conceptual model”) and joris van zundert (“if you build it, will we come? large scale digital infrastructures as a dead end for digital humanities”) argue for smaller-scale rather than enterprise-wide
solutions. sula suggests that libraries should provide services based on user needs rather than offer a general solution. van zundert asserts, “methodological innovation and advancing the modeling of humanities data and heuristics [are] better served by flexible small-scale research focused development practices.” big, institutionally based digital infrastructures, he argues, “deliver empty infrastructures bereft of useful tools and data.” he notes the difference between simply using digital technologies and being innovative with them, observing that standardization is the enemy of innovation and that the infrastructure should be simple: “digital humanities needs open and inclusive platforms with a web service based approach.”

in contrast, jennifer vinopal and monica mccormick describe the transition from the small-scale approach to an enterprise level in their article “supporting digital scholarship in research libraries: scalability and sustainability.” vinopal and mccormick found that support for faculty requests is varied and can be simple and brief or complex and lengthy, depending on the project. new york university (nyu) libraries approached the issue by utilizing “small discipline-focused computing groups” to support projects in the different disciplines. however, they could support only a few faculty per year and needed a broader approach, which took the form of an enterprise-level academic tool and greater support services for more faculty. but that approach does little to support “innovative web-based collaboration communication, and publication activities.” scholars need dissemination tools, interoperable tools, repositories, and faculty collaboration.
ultimately, nyu libraries developed a four-tiered model of sustainable and scalable services through standardization employing reusable tools and platforms: (1) enterprise academic and administrative tools that include wikis, e-mail, and file storage; (2) standard research services, such as institutional repositories, data analysis tools, and web exhibits; (3) enhanced research services, which are custom designed for the project; and (4) applied research and development (r&d)-grant funded services.

figuring it all out

there is no way to avoid the research needed to determine what digital scholarship support services your library should offer. fortunately, there are some models for accomplishing this task. vinopal and mccormick note that in determining services libraries must: (1) utilize a well-defined selection process to manage demand; (2) determine scalability and sustainability goals; (3) identify the audience; (4) provide tools, services, and projects that meet the library’s goals; (5) create a service-level agreement that specifies hours, availability, functionality, service and customer support levels, customer and service provider obligations, and fees; and (6) institute a portfolio-management process.

to learn how to meet faculty needs, nyu libraries worked with their subject specialists and faculty to perform a service gap analysis. nyu interviewed peer institutions and developed three general types of service models: (1) digital collections, which
provide infrastructure for digitization, preservation, and access; (2) digital research and publishing services, which support a wide range of needs with little customization; and (3) digital scholarship or digital humanities centers, which are “scholar-driven with a strong research and development component.”

the process followed by penn state university libraries in university park, described by karen estlund in “first steps toward a digital scholarship center,” comprised multiple steps. the libraries consulted with various stakeholder groups, performed a needs analysis, initiated an environmental scan, and reviewed other services on campus and off.

schaffner and erway proceeded similarly, arguing that the first step in determining services and models is to find out what faculty are doing and then fill the gaps. because every institution is different, local needs should be the focus. they determined their needs by surveying and talking with faculty, holding focus groups, reviewing online discussions, conducting a literature review, and attending dh conferences.

lippincott and goldenberg-hart note that contacting constituents and engaging them as partners rather than clients is necessary for success. going where your constituents are, such as faculty departmental meetings, and engaging them there will build trust.

history of digital initiatives at appalachian

appalachian state university’s digital program has grown steadily over the past years. the following timeline outlines our progress.
library initiatives are denoted “(library)” and university initiatives “(university).”

july first grant-funded digital project launched (library)
june appalachian collection, stock car racing collection, and university archives merged to become special collections (library)
september digital initiatives task force created (library)
june contentdm, storage and retrieval software for multimedia collections and other digital assets, purchased (library)
january digital initiatives librarian position created (library)
june preservation and digital projects archivist position created (library)
january omeka digital collections management platform adopted (library)
april first digital humanities symposium sponsored by appalachian state university humanities council (university)
october data matters! appalachian symposium on data informatics held, cosponsored by the library, campus it, and the humanities council (university)
september digital humanities working group formed (university)
july digital scholarship and initiatives team created (library)
february campus technology research support group formed (university)

methodology

responding to a growing demand by appalachian state faculty for support in their use of digital resources, the library’s new dean, joyce ogburn, called for the creation of a task force to investigate the options. she appointed the special collections preservation and digital projects archivist as a special assistant to create and lead the task force. the group was small enough to be nimble but large enough to adequately represent the teams involved. the task force ultimately included the metadata librarian as a representative from bibliographic services; the coordinator of our technology services team; the digital initiatives librarian, who also worked in technology services; the coordinator of special collections; and our digitization technician, also in special collections.
our first move was to conduct a literature review of case studies for creating digital scholarship and digital humanities centers. we then surveyed campus faculty to determine the types of digital projects they were undertaking, what tools were being used, where they most needed support, and their base knowledge of digital scholarship and scholarly communications issues.

the survey of tenured and tenure-track faculty digital scholarship activities generated complete responses (an percent response rate), with around one-third coming from assistant professors. this was a low response rate, so we followed up with the campus digital humanities working group regarding the climate in individual departments. this group also included some non-humanities faculty, so we received a diversity of responses. the consensus was that, even though many faculty were doing projects that are considered digital scholarship, they did not think of their undertakings in those terms. there appeared to be little understanding regarding what constitutes digital scholarship. another issue noted in the humanities departments was the conflict among faculty over traditional versus new forms of scholarship.

roughly percent of the respondents indicated that they use some type of digital tools for scholarship or teaching, and approximately percent expressed interest in receiving more information on digital scholarship and learning tools. according to the survey responses, the top five digital tools or methods faculty used in teaching were (1) video/audio production, (2) online text resources, (3) text analysis, (4) data/information visualization, and (5) online authoring tools (for example, blogs) or gis (geographic information system) mapping. the top five digital tools or methods used in research were (1) online texts or databases, (2) digital versions of archival material, (3) online indices or concordances, (4) text analysis, and (5) online media criticism.
when asked, “would you participate in a workshop that taught faculty how to use digital tools in the classroom?” percent responded in the affirmative, and percent indicated that they might but first wanted to know more about digital scholarship. approximately percent of the respondents said that they were interested in publishing support for journal articles. the survey established, in other words, that a core group of faculty considered themselves digital scholars, and a larger group had sufficient interest to request more information and education on the topic.

we next looked at what types of services were offered elsewhere on campus. these were scattered, without any formal cooperation between departments to meet scholarship needs. the university’s office of research provided support to faculty seeking grants, including help with data-management plans. the office of research offered guidance for the collection, editing, verification, and management of quantitative, statistical, and biostatistical data. it also supplied assistance and resources to help faculty with the collection, management, and analysis of data from ongoing research projects using a variety of software. our campus it (information technology services) provided support for many types of software, as well as for equipment, web design, database development and hosting, virtual environments, data storage and backup, and high performance computing.
digital scholarship was supported by university documentary film services and through training provided by the humanities council and digital humanities working group, which sponsored speakers and workshops for campus. finally, the college of arts and sciences maintained a visualization lab to support its departmental programs.

we also surveyed the library’s offerings. our technology services team provided an audio recording room, a digital media studio, software instruction, and equipment checkout. they also offered web design, software, and equipment support. our bibliographic services team provided metadata services, and special collections did most of the digitization of the collection. we also had a new scholarly communications and intellectual property librarian who reported directly to the dean of libraries.

finally, we investigated programs at other campuses. members of the task force visited the university of maryland, college park; george mason university in fairfax, virginia; the university of north carolina at charlotte; and the university of tennessee, knoxville. we also spoke by telephone with virginia polytechnic institute and state university (virginia tech) in blacksburg. all these programs were well developed, and each was affiliated with the university’s library. since we had no funding for our study, it was important that the schools be geographically close, so we could drive to them. we wanted to visit a range of schools representing our aspirational peers, research-intensive (r ) universities and liberal arts schools. importantly, the group also included a school that, like appalachian, was within the university of north carolina system. we chose two of the schools from the cni workshop report “digital scholarship centers: trends & good practice.” we asked each center:
1. what is your mission statement?
2. what services do you offer?
3. what space do you have?
4. what is your reporting structure?
5. what positions do you have to support your work?
6. how are you funded? do you have your own budget line?

recommendations and implementation

of the three main program models (virtual, physical center, or hybrid), we determined that the best option for our institution was the hybrid model described in zorich’s report a survey of digital humanities centers in the united states. this approach involved creating a new team with a core group of staff dedicated to digital projects. we would work with our partners, whom we identified through the campus and library surveys, to fill the gaps.

working from our research findings, we determined that our initial services should include (1) digital imaging and reformatting; (2) preservation, data curation, and web harvesting for the university; (3) text analysis; (4) consultations on project management, preservation, curation, and project development; (5) workshops and training; (6) grant-writing assistance related to digital projects; (7) hosting speakers and workshops; (8) scholarly communications and intellectual property rights consultation and education; (9) electronic records management; and (10) publishing. the library already offered these services on some level except for text analysis and publishing support. our digitization, curation, and preservation services now extended to faculty, and we offered more training and education on the omeka web publishing platform, metadata standards, and copyright issues. the electronic records component was an agreement between the new team and the university archivist, with whom we now worked closely, along with other units on campus to create a university electronic records program. we expected that publishing services would become important in our future, so we wanted to start planning at the beginning.

staffing

our next challenge was to determine who would be on the core team. we had in-house expertise to fill some of the staffing needs but knew we would eventually need additional positions for other services. the former preservation and digital projects archivist from the special collections team took the role of coordinator, managing the team staff and activities and providing project management, consulting, and training. our digital projects librarian came from bibliographic services. our scholarly communications and intellectual property librarian worked in technology services. the digital imaging specialist, who would also provide omeka and digital-image training, was from special collections. our electronic records and digital assets manager, who would manage the curation and preservation of our digital materials and electronic records as well as our audiovisual digitization program, also came from special collections. in addition, we utilized student assistants ( hours per week) for digitization production work.

to sustain omeka and do other projects, we determined that we would need an additional programmer. with only one programmer on the library staff, we needed another to support faculty projects. also, given the interest in publishing services, we anticipated that we would eventually need someone to manage digital publishing.

space and equipment

the libraries had hired a consultant to develop a new space plan for the entire library. digital scholarship and initiatives secured space on the third floor adjacent to the technology services team, large enough for offices and two workrooms, one for audiovisual (av) and one for imaging. since we already had an active digitization program, we had all the necessary equipment, which was transferred to the new area. the move left the special collections area with space for new staff and workrooms.
the first year

our first action as a team was to create our mission statement:

digital scholarship and initiatives (dsi) engages and partners with appalachian faculty members, students, library colleagues, and the community to support new scholarship in a rapidly changing digital landscape. dsi provides and sustains innovative digital tools and publishing platforms for content delivery, discovery, analysis, data curation, and preservation. in line with the library’s mission, we enhance student learning and encourage faculty research, primarily by providing access to and information about new methods of digital scholarship. we also lend support to campus faculty and students in the areas of copyright and intellectual property.

next, we created goals for the year, which required that we write new job descriptions for the team staff. the team had four goals for the first year:

1. build education and consulting services that promote, support, and facilitate the production of digital scholarship.
2. create partnerships with campus and community constituents to develop digital content.
3. develop and implement solutions for the ongoing preservation of born-digital and digitized materials.
4. promote the institutional repository (ir) to appalachian state university faculty to increase the quantity and range of items archived in the university ir.

the team decided to focus on developing the services we already offered and on building our infrastructure. our four service areas were (1) digital scholarship, (2) digital preservation, (3) digital access, and (4) scholarly communications. digital scholarship includes consultation and project management services, project collaboration, and workshops and events. our digital preservation services are electronic records and digital asset management, data curation, and storage.
we provide digital access through digitization services and by maintaining data repositories, digital content management systems, and the institutional repository. our scholarly communications and intellectual property librarian provides education and consultation services regarding copyright, intellectual property, and open access publishing.

we developed our website to promote these services and created policies, procedures, and workflows to assist with faculty projects and to work with our partners in creating and maintaining those projects.

we had assumed that the first year would be spent pulling together our infrastructure, but word got out about the new team, and we found ourselves working on projects, including multiple grant projects. this high level of demand was a surprise, because the only marketing we did was through the website and talks with various groups, including the library faculty, our digital humanities working group, the university research council technology support committee, the provost’s council, and our library advisory board. we did not send out announcements about the services, thinking it best to first make certain we had our infrastructure ready.

the projects in the first year included creating data management plans, open source publishing, and digital collections; sponsoring workshops on digital tools and copyright; cosponsoring home movie day, thatcamp (the humanities and technology camp), and the digital appalachia lecture series; partnering with the local historical society and the public library to create a web portal for appalachia-related digital collections; participating in three grant projects with local history organizations; working with a graduate class to help the students create omeka exhibits; and implementing a humanities open book grant from the national endowment for the humanities and the andrew w.
mellon foundation to digitize publications of the former appalachian consortium press, which dissolved in .

during this first year, we solidified partnerships with library departments and developed new campus and community collaborations. the two most important partnerships developed were with the university of north carolina (unc) press and appalachian state’s newly formed technology research support group.

we now work with unc press to provide publishing services for appalachian state that include epubs, a format for digital books established by the international digital publishing forum, and print-on-demand. the imprint or publisher for these items is appalachian state, with unc press providing the digitization and dissemination services.

we partnered with information technology services and the office of research to create a technology research support group, which provides research help to campus faculty. each of our units offers different services. if a faculty member approaches one of us with a need that we cannot fulfill, we send that request to our google+ group, with whom we consult on how to solve the issue. it is a simple solution but has proved effective.

the challenges and how we could have done it better

one challenge we faced was a lack of understanding about digital scholarship and scholarly communications among some faculty, as demonstrated by the survey. even some of the faculty who actively produced digital scholarship did not think of their work in those terms. the survey responses of a few campus faculty indicated outright hostility toward the idea of digital scholarship and a clear lack of understanding of the technology. on most campuses, including ours, there is still some resistance to new forms of scholarship.
there was also some reluctance among library faculty to commit resources to a larger digital program, primarily because of competition for resources. a few also believed the faculty’s need for these services was insufficient to warrant adding resources. the fear that digital and technology services would come at a cost to more traditional services and collections prevented some individuals from seeing the possibilities of a digital scholarship program. there was concern that committing to a program was a risky venture without assurances that our potential clientele would be responsive to our service offerings or willing to engage in partnerships with the library.

additionally, because we were in the middle of a reorganization and working with a consultant on a new space plan, we were on a tight timeline for deciding about digital services. if we were to create a new unit in the library, we would need space, and the teams that would lose staff to the new team would need to adjust the plan for their spaces. this adjustment left us with less time for educating our constituents and potential partners as well as less time to gather information.

these challenges could have been addressed simply by building in more time and moving into the project gradually. education and communication are key components in making any new initiative successful; both, however, take time and require a consistent message, effective listening, and engagement with stakeholders. if we had had the time to hold more focus groups and workshops, visit more academic departments, and distribute information about digital scholarship and scholarly communications, our survey response might have been stronger and the concepts less intimidating to those unfamiliar with them.

for library colleagues or administrators who are skeptical about a digital scholarship program because of a lack of understanding of new technologies and their uses in teaching and scholarship, education can help.
providing examples or holding short sessions to highlight a new tool can eliminate much of the anxiety about new technology, particularly if you focus on tools that relate directly to an individual’s research or teaching.

hiring an outside consultant would also have strengthened our case for the digital scholarship program. having a neutral outside party provide recommendations may be more acceptable to some, particularly in libraries embroiled in political and resource conflicts. also, an outside observer may identify issues and opportunities that those close to the institution overlook.

our advice to libraries is to focus on four points as they create digital scholarship services:

1. time. take adequate time (defined, perhaps, as the more conservative of your early estimates) to do all the research and to be thorough in your program proposal.
2. education. start educating your stakeholders even before you start gathering information from them. their responses to your surveys, discussions, and other information-gathering efforts will be more informed and more useful.
3. research. be thorough in gathering your information, which will allow you to be detailed in its presentation. use multiple methods for gathering input. there may be unanticipated objections to your recommendations, but most can probably be addressed with documentation.
4. communication. support for and understanding of your efforts, whether on the part of the wider faculty or within the library itself, depends on effective communication. let individuals know what they stand to gain; make them stakeholders in your success.

summary

effective planning will help ensure that the program is supported and sustainable.
creating a digital scholarship program need not be an all-or-nothing situation. as schaffner and erway recommend, you can start small. “packaging” virtual services requires little investment, and you can always scale up as needs change.

as nyu libraries had before us, we considered the different models and determined that a small core staff of subject specialists would best suit our needs. working with library and campus partners helped us determine the services on which to focus.

from our research and experience, we have developed an outline for laying the groundwork for a digital scholarship program:

1. start with your institution’s mission and strategic plan goals. you must be ready to explain how the program supports the institution’s mission and goals. this is crucial for funding support.
2. create a timeline for information gathering. plan for enough time to gather all the information and present your program proposal.
3. identify your partners, who will be your key stakeholders. they include users, other units in the library and on campus who provide digital scholarship services, library and campus administration, your library advisory board, and potentially others.
4. survey your current services and perform a gap analysis. the library probably already offers certain services; consider how they can be repackaged to present a cohesive program. explore what services should be eliminated or enhanced, and then determine what services should be added.
5. gather information on resources. start with what you have already in terms of staffing, space, equipment, and funding. determine how those resources can be pooled together and how the initiative will be funded in the future, after growth. also consider reporting structure: for example, does your unit report directly to the associate dean, the dean, or the provost? the reporting structure can influence funding support as well.
6. look at what other services are offered on campus. other campus departments may provide related research assistance. these are potential partners in your program.
7. ask how you can work with these units to provide a suite of services.
8. include education for constituents. focus groups are a good start, but informal discussion sessions, mini workshops, conversations with individuals, and the sharing of information via your website, e-mail, and other venues all create awareness of the issue.
9. do not forget to include sustainability in your plans. starting small and building gradually can help ensure success of your program. sustaining both resources and partnerships requires planning.

pamela price mitchem is an associate professor and the coordinator of digital scholarship and initiatives at the belk library and information commons of appalachian state university in boone, north carolina; she may be reached by e-mail at: pricemtchemp@appstate.edu.

dea miller rice is an assistant professor and digital projects librarian at the belk library and information commons of appalachian state university in boone, north carolina; she may be reached by e-mail at: ricedm@appstate.edu.

notes

1. appalachian state university libraries, “library plan and beyond,” accessed march , , http://library.appstate.edu/sites/library.appstate.edu/files/documents/library_ strategic_plan_ .pdf.
2. diane goldenberg-hart, “report of a cni [coalition for networked information]-arl [association of research libraries] workshop: planning a digital scholarship center,” accessed august , , https://www.cni.org/wp-content/uploads/ / /report- dscw .pdf.
3.
jennifer schaffner and ricky erway, “does every research library need a digital humanities center?” online computer library center (oclc), , , accessed august , , http://www.oclc.org/content/dam/research/publications/library/ / oclcresearch-digital-humanities-center- .pdf.
4. chris alen sula, “digital humanities and libraries: a conceptual model,” journal of library administration , ( ): .
5. diane zorich, a survey of digital humanities centers in the united states (washington, dc: council on library and information resources, ), , vi, accessed august , , https://www.clir.org/pubs/reports/pub /pub .pdf.
6. edward l. ayers, “does digital scholarship have a future?” educause review , ( ): , accessed july , , http://er.educause.edu/articles/ / /does-digital-scholarship-have-a-future.
7. charles inskip, “from information literacy to digital scholarship: challenges and opportunities for librarians,” slide presentation, university college london, , accessed july , , http://www.cilip.org.uk/sites/default/files/inskip_arlg-web.pdf.
8. joan k. lippincott and diane goldenberg-hart, “digital scholarship centers: trends & good practice,” cni, , – , accessed july , , https://www.cni.org/wp-content/uploads/ / /cni-digitial-schol.-centers-report- .web_.pdf.
9. zorich, a survey of digital humanities centers in the united states, .
10. schaffner and erway, “does every research library need a digital humanities center?” .
11. tim bryson, miriam posner, alain st. pierre, and stewart varner, spec [systems and procedures exchange center] kit : digital humanities (washington, dc: arl, ).
12. joyce l. ogburn, “a report on the digital humanities and concept paper for a virtual center for interdisciplinary knowledge arts,” unpublished paper, university of utah, salt lake city, , .
13. sula, “digital humanities and libraries,” .
14. joris van zundert, “if you build it, will we come?
large scale digital infrastructures as a dead end for digital humanities,” historical social research/historische sozialforschung , ( ): , .
15. jennifer vinopal and monica mccormick, “supporting digital scholarship in research libraries: scalability and sustainability,” journal of library administration , ( ): – .
16. ibid.
17. ibid.
18. ibid.
19. karen estlund, “first steps toward a digital scholarship center,” slide presentation, university of oregon libraries, , accessed july , , http://www.slideshare.net/kestlund/estlund-educause-dighum.
20. schaffner and erway, “does every research library need a digital humanities center?” .
21. lippincott and goldenberg-hart, “digital scholarship centers: trends & good practice.”
22. schaffner and erway, “does every research library need a digital humanities center?” .

escience in practice: lessons from the cornell web lab

d-lib magazine may/june volume number / issn -

william y. arms
computing and information science, cornell university

manuel calimlim
computer science, cornell university

lucia walle
center for advanced computing, cornell university

escience is a popular topic in academic circles. as several recent reports advocate, a new form of scientific enquiry is emerging in which fundamental advances are made by mining information in digital formats, from datasets to digitized books [ ][ ]. the arguments are so persuasive that agencies such as the national science foundation (nsf) have created grant programs to support escience, while universities have set up research institutes and degree programs [ ][ ]. the american chemical society has even registered the name "escience" as a trademark.

yet behind this enthusiasm there is little practical experience of escience.
most of the momentum comes from central organizations, such as libraries, publishers, and supercomputing centers, not from research scientists. for instance, the international digging into data challenge has a set of readings on its home page [ ]. all are enthusiastic about data-driven scholarship, but only one of them reports on practical experience, and it describes how central planning can be a barrier to creativity [ ].

in this article we describe our experience in developing the cornell web lab, a large-scale framework for escience based on the collections of the internet archive, and discuss the lessons that we have learned in doing so. sometimes our experience confirms the broad views expressed in the planning papers, but not always. this experience can be summarized in seven lessons:

lessons for escience and esocial science

1. build a laboratory, then a library
2. for sustainability, keep the staff small
3. extract manageable sub-collections
4. look beyond the academic community
5. expect researchers to understand computing, but do not require them to be experts
6. seek generalities, but beware the illusion of uniformity
7. keep operations local for flexibility and expertise

lesson 1: build a laboratory, then a library

the cornell web lab is not a digital library. it is a laboratory for researchers who carry out computationally intensive research on large volumes of web data. its origins are described in [ ] and the preliminary work in building it in [ ].

when we began work on the web lab, we expected to build a research center that would resemble a specialized research library. the center would maintain selected web crawls and an ever-expanding set of high quality software tools. everything would be carefully organized for large numbers of researchers.

most escience initiatives follow this approach. in developing services for escience and esocial science, the instinct is to emphasize high quality, in both collections and services.
agencies such as the nsf have seen too many expensive projects evaporate when grants expired. therefore, they are looking for centers that will support many researchers over long periods of time. the datanet solicitation from the nsf's division of cyberinfrastructure is an example of a program with stringent requirements for building and sustaining production archives [ ].

this is a high risk strategy. all libraries are inflexible and digital libraries are no exception. they require big investments of time and money, and react slowly to changing circumstances. the investments are wasted if the plans misjudge the research that will be carried out or are bypassed by external events.

in contrast, in a laboratory such as the web lab, if new ideas or new opportunities occur, plans change. rough edges are the norm, but the collections and services are flexible in responding to new research goals. the web lab does not provide generic services for large numbers of researchers; it gives personalized support to individuals. most of the funds received from the nsf are used for actual research, including support for doctoral students and postdoctoral fellows. the nsf gets an immediate return on its money in the form of published research.

as an example of how research interests change, when we began work on the web lab, we interviewed fifteen people to find how they might use the collections and services. one group planned to study how the structure of the web has changed over the years. this is a topic for which the internet archive collections are very suitable. but the development of the web lab coincided with the emergence of social networking on the web, commonly known as web 2.0. the group changed the thrust of their research. they are now using web data to study hypotheses about social networks. for this new research they need recent data covering short time periods; links within pages are more important than links between pages.
the internet archive datasets are less suitable for this work, but this group has used the other facilities of the web lab heavily. (as an example of their work, see [ ].)

lesson 2: for sustainability, keep the number of staff small

financial sustainability is the achilles' heel of digital libraries. while it is comparatively easy to raise money for innovation, few organizations have long-term funding to maintain expensive collections and services. the success stories stand out because they are so rare. the efforts of the national science digital library are typical. the program has studied its options for a decade without finding a believable plan beyond continued government support [ ].

part of the sustainability problem is that a digital library can easily become dependent on permanent employees. if there is a gap in funding, there is no money to pay these people and no way to continue without them. supercomputing centers suffer from the same difficulties: large staffs and high fixed costs.

almost by accident, the web lab has stumbled on a model that minimizes the problems of sustainability. its development has been made possible by nsf grants, but it does not need large grants to keep going. the web lab budget can and will fluctuate, depending on the number of researchers who use it and their ability to raise money. there may even be periods when there is no external support. but the base funding that is needed to keep the lab in existence is minimal. there are no full time employees to be paid. undergraduates and master's students have done much of the development. the equipment has a limited life, but the timing of hardware purchases can be juggled to match funding. the computers are administered by cornell's center for advanced computing, which provides a fee based service, currently about $ , per year. so long as these bills are paid, the lab could survive a funding gap of several years.
lesson 3: extract manageable sub-collections

the data in the web lab comes from the historical collection of the internet archive [ ]. the collection, which has over . billion pages, includes a web crawl made about every two months since . it is a fascinating resource for social scientists and for computer scientists. the primary access to the collection is through the internet archive's wayback machine, which provides fast access to individual pages. it receives about requests per second. many researchers, however, wish to analyze large numbers of pages. for this they need services that the wayback machine does not provide.

the initial goal of the web lab was to support "web-scale" research, looking for patterns across the entire web and seeing how these patterns change over time. to satisfy this need, four complete web crawls have been downloaded from the internet archive, one from each year between and . these four crawls are summarized in table 1.

[table 1. four web crawls. columns: crawl date; pages (millions); size (tb compressed)]

in practice, we have found that few people carry out research on the entire web. most researchers do detailed analysis on sub-collections, not on complete crawls. for example, one group is studying how the us government web sites (the .gov domain) evolved during the period to . another group wanted all pages from that referenced a certain drug. the research process is summarized in figure 1.

[figure 1. the research process]

to extract sub-collections from the four web crawls listed in table 1, we have built a relational database with basic metadata about the web pages and the links between them. table 2 lists some of the tables in the database and the numbers of records in each. notice that the link table contains all the links from every page, so that this data can be used for analysis of the link structure of the web. the database is mounted on a dedicated server, with four quad processors, gb memory, and tb disk.
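the kind of sub-collection extraction described above can be sketched in a few lines. this is a minimal illustration only: it uses python's built-in sqlite3 module rather than the web lab's production database, and the table names, column names, and sample rows are all hypothetical, since the article does not give the real schema.

```python
import sqlite3

# Hypothetical schema, loosely modeled on the tables the article mentions
# (crawls, hosts, pages, links). Column names are assumptions.
SCHEMA = """
CREATE TABLE crawls (crawl_id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE hosts  (host_id  INTEGER PRIMARY KEY, hostname TEXT);
CREATE TABLE pages  (page_id  INTEGER PRIMARY KEY, crawl_id INTEGER,
                     host_id INTEGER, url TEXT);
CREATE TABLE links  (src_page_id INTEGER, dst_page_id INTEGER);
"""


def build_demo_db():
    """Create a tiny in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO crawls VALUES (?, ?)",
                     [(1, "crawl-a"), (2, "crawl-b")])
    conn.executemany("INSERT INTO hosts VALUES (?, ?)",
                     [(1, "www.example.gov"), (2, "www.example.com")])
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)", [
        (1, 1, 1, "http://www.example.gov/"),
        (2, 1, 1, "http://www.example.gov/about"),
        (3, 1, 2, "http://www.example.com/"),
        (4, 2, 1, "http://www.example.gov/"),
    ])
    conn.executemany("INSERT INTO links VALUES (?, ?)",
                     [(1, 2), (2, 1), (1, 3), (3, 2)])
    return conn


def gov_subcollection(conn, crawl_id):
    """Return (pages, links) for the .gov pages of one crawl.

    Only links whose source and target are both .gov pages of the
    same crawl are kept, so the result is a self-contained subgraph.
    """
    pages = conn.execute(
        """SELECT p.page_id, p.url
           FROM pages p JOIN hosts h ON h.host_id = p.host_id
           WHERE p.crawl_id = ? AND h.hostname LIKE '%.gov'
           ORDER BY p.page_id""",
        (crawl_id,),
    ).fetchall()
    links = conn.execute(
        """SELECT l.src_page_id, l.dst_page_id
           FROM links l
           JOIN pages s  ON s.page_id  = l.src_page_id
           JOIN hosts hs ON hs.host_id = s.host_id
           JOIN pages d  ON d.page_id  = l.dst_page_id
           JOIN hosts hd ON hd.host_id = d.host_id
           WHERE s.crawl_id = ? AND d.crawl_id = ?
             AND hs.hostname LIKE '%.gov' AND hd.hostname LIKE '%.gov'
           ORDER BY l.src_page_id, l.dst_page_id""",
        (crawl_id, crawl_id),
    ).fetchall()
    return pages, links
```

calling `gov_subcollection(build_demo_db(), 1)` returns the two .gov pages of the first demo crawl and the links between them; the result is small enough to copy to another machine for analysis, which is the pattern the text describes.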
the total size of the database and indexes is tb.

[table 2. records in the web lab database. columns: table; number of records. rows: crawls; pages (millions); urls (millions); links (millions); hosts (millions)]

while the underlying collections are large, the sub-collections used for research are much smaller, rarely more than a few terabytes. the standard methodology is to use the database to identify a sub-collection defined by a set of pages or a set of links, and download it to another computer for analysis. for most analyses, researchers do not need supercomputers.

lesson 4: look beyond the academic community

the academic community has limited capacity to develop and maintain the complex software used for research on large collections. therefore we must be cautious about where we place our efforts and flexible in adopting software from other sources.

until recently, relational databases were the standard technology for large datasets, and they still have many virtues. the data model is simple and well understood; there is a standard query language (sql) with associated apis; and mature products are available, both commercial and open source. but relational databases rely on schemas, which are inherently inflexible, and need skilled administration when the datasets are large. large databases require expensive hardware.

for data-intensive research, the most important recent development is open source software for clusters of low cost computers. this development is a response to the needs of the internet industry, which employs programmers of varying expertise and experience to build very large applications on clusters of commodity computers. the current state of the art is to use special purpose file systems that manage the unreliable hardware, and the mapreduce paradigm to simplify programming [ ][ ]. while google has the resources to create system software for its own use, others, such as yahoo, amazon, and ibm, have combined to build an open source suite of software.
the internet archive is an important contributor. for the web lab, the main components are:

hadoop, which provides a distributed file system for large clusters of unreliable computers and supports mapreduce programming [ ].
the lucene family of search engine software, which includes nutch for indexing web data and solr for fielded searching [ ].
the heritrix web crawler, which was developed by the internet archive for its own use [ ].

hadoop did not exist when we began work on the web lab, but encouragement from the internet archive led us to track its development. in early we experimented with nutch and hadoop on a shared cluster and in established a dedicated cluster. it has computers, with totals of cores, gb memory, and tb of disk. the full pages of the four complete crawls are being loaded onto the cluster, together with several large sets of link data extracted from the web lab database.

this cluster has been a great success. it provides a flexible computing environment for data analysis that does not need exceptional computing skills. several projects other than the web lab use the cluster, and it was recently used for a class on information retrieval taken by seventy students.

lesson 5: expect researchers to understand computing, but do not require them to be experts

people are the critical resource in data-intensive computing. the average researcher is not a computer expert and should not need to be. yet, at present, every research project needs skilled programmers, and their scarcity limits the rate of research. perhaps the greatest challenge in escience and esocial science is to find ways to carry out research without being an expert in high-performance computing.

typical researchers are scientists or social scientists with good quantitative skills and reasonable knowledge of computing. they are often skilled users of packages, such as the sas statistical package.
they may be comfortable writing simple programs, perhaps in a domain specific language such as matlab, but they are rightly reluctant to get bogged down in the complexities of parallel computing, undependable hardware, and database administration. in the web lab, we have used three methods for researchers to extract and analyze data:

(a) web based user interfaces

with collections as large as the web lab, it is easy to generate queries and analyses that have extremely long execution times (measured in days). therefore, it is natural to be cautious in deciding who is authenticated to use the computers. two related projects provide tools for extracting sub-collections and analyzing them without the user needing to know any of the complexities.

a web front end to the database provides a simple form-filling interface. users can specify operations that select a crawl, specify filters based on the metadata in the database, extract and manage sub-collections, or monitor and control the execution of their requests [ ].
the visual wrapper generator allows non-technical users to extract structured data from web pages without writing any code. a user creates a set of extraction rules from sample pages, using a point-and-click interface in a web browser. these pages are sent to the hadoop cluster where a number of analysis primitives are available for composition into more complex tasks [ ].

(b) packages and applications

the second approach is to provide a set of programs for specific analyses. in the long term, as web research becomes more stable, it may be possible to develop packages that need no programming skills to use, but each such package requires a major programming effort. in our own work we have developed several application programs for analyzing large web graphs. they were initially written for specific projects, usually by students, and have the rough edges that might be expected.
as described below, we are currently rewriting a group of them as an open source suite for graph analysis on hadoop. researchers who use this package will not need any programming skills.

(c) end user programming

excellent work has gone into the graphical user interfaces described above and in developing application programs, but these tools are quite restrictive. researchers who wish to explore new methods of analysis need to write their own programs.

mapreduce programming is today's state of the art for data-intensive computing. it almost but not quite succeeds in being a simple programming environment for non-experts. the basic concept is easy to understand. the user breaks a complex task into primitive steps. for each step, the user writes two simple sections of code, known as map and reduce. these describe how to process a single data record. the system software does everything else. the user does not need to know how billions of records are split across many parallel computers, how intermediate results are merged, or how the output files are created. in our experience, any researcher with moderate quantitative skills can understand these concepts, and the map and reduce portions of code are easy to write.

unfortunately, hadoop forces the user to understand too many of the language and system complexities. this is acceptable for the internet companies, which use mapreduce programming to make their professional staff more productive, but our wish is to support non-experts. students from computer science and information science have quickly learned how to carry out complex analyses on large datasets, but currently hadoop requires too much expertise for the average researcher.

if escience and esocial science are to be broadly successful, researchers will have to do most of their own computing, but some tasks will still require computer experts. every year computers get bigger, faster, and cheaper.
yet every year there are datasets that strain the limits of available computer power. as that limit is approached, everything becomes more difficult. in the web lab, tasks that need high levels of expertise include the transfer of data from the internet archive to cornell, extraction of metadata, removal of duplicates, the construction of the relational database, and the tools for extracting groups of pages from complete web crawls. these are all heavily data intensive and stretch the capacity of our hardware. individual tasks, such as duplicate removal or index building, can run for days or even weeks. experts are needed, but projects do not need to hire large numbers of programmers. the web lab has managed with a very small professional team. the authors of this paper are: a faculty member who has supervised most of the students, a programmer who is an expert in relational databases, and a systems programmer with extensive experience in large-scale cluster computing. all are part time. most of the work has been done by computer science students, as independent research projects. their reports are on the web site [ ]. some of the reports are disappointing, but others are excellent. this is an inefficient way to develop production-quality software, but provides a marvelous educational experience for the students. more than sixty of them are now in industry, mainly in technical companies, such as google, yahoo, amazon, and oracle. lesson : seek generalities, but beware the illusion of uniformity because so much of the vision of escience has come from central organizations, there is a strong desire to seek for general solutions. the names "escience" and "cyberscholarship" might imply that there is a single unified field, but this uniformity is an illusion. terms such as "workflow", "provenance", "repository", and "archive" do not have a single meaning, and the search for general approaches tends to obscure the very real differences between subject areas. 
workflow provides a good example. any system that handles large volumes of data must have a process for managing the flow of data into and through the system. for the web lab, a series of student projects has developed such a workflow system [ ]. this manages the complex process of downloading data from the internet archive, making back-up copies at cornell, extracting metadata, and loading it into the relational database. but every workflow system is different. in [ ], we compare three examples, from astronomy, high-energy physics, and the web lab. each project uses massive data sets, and the three workflows have some superficial similarities, but otherwise they have little in common. while there is a danger of wasting resources by building general purpose tools prematurely, once an area becomes established it is equally wasteful not to create standard tools for the common tasks. the web lab is currently in the process of converting some of its programs into an open source application suite for research on web graphs. this suite has the following components. (a) control program researchers can run the individual programs directly, or use a simple control program with a graphical user interface that allows them to specify input and output files, and optional parameters. the user of this control program needs to know nothing about mapreduce programming and very little about the hadoop file system. (b) data cleaning the web lab retains the raw data as received from the internet archive, but the web is messy and this messiness is reflected in the data. since the numbers of pages are measured in billions, and the numbers of links in hundreds of billions, data clean up is a major computational task. the data cleaning program does the following: as several variants of the same url may refer to a page, all urls are converted to a canonical form. duplicate nodes and duplicate links are removed. dangling nodes and links that go outside the graph are identified. 
they can be removed on request. each url that defines a node of the graph is replaced by a fixed length code, and a dictionary is written that maps the codes back to the original urls. the data is output in a format appropriate for the next stage of analysis, e.g., if the next stage is a pagerank calculation, each node will be output with a list of all nodes that link to it. (c) programs for analysis of graphs rather surprisingly, there are no good programs for analyzing very-large graphs, such as a web graph. as part of a project to study mapreduce computations on sparse matrices, we have developed several programs, which are being incorporated into the suite: pagerank. this computation uses a novel iteration developed by haque [ ]. hubs and authorities. this is a straightforward implementation of kleinberg's hits algorithm [ ] [ ]. jaccard similarity. this was developed as part of a project to study co-authorship in wikipedia [ ]. lesson : keep operations local for flexibility and expertise data-intensive computing requires very large datasets and large amounts of computing. any escience project has to decide where to store the data and where to do the computation. superficially, the arguments for using centralized facilities are appealing. cloud computing is more than a fashion. the commercial services are good and surprisingly cheap. for instance, a cornell computer science course has been using amazon's facilities for the past two years with great success. the nsf is pouring money into supercomputing centers, some of which specialize in data-intensive computing. it is easy to be seduced by the convenience of cloud computing and supercomputing centers, but the arguments for local operations are even stronger. when a university research group uses a remote supercomputer, a small number of people become experts at using that system, but the expertise is localized. with local operations, expertise diffuses across the organization. 
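returning to the data cleaning steps listed above for the graph suite, the following python sketch shows one way they could fit together on a toy graph. it is an illustration under stated assumptions, not the web lab's program: the canonical form chosen here (lower-case host, default port and fragment dropped, empty path replaced by /), the 8-digit codes, and the names canonical and clean_graph are all hypothetical, and a link is treated as going outside the graph when either end is not a known page.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def canonical(url: str) -> str:
    """reduce url variants to one form (these rules are assumptions)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # keep the port only when it is not the default for the scheme
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    return urlunsplit((scheme, netloc, parts.path or "/", parts.query, ""))


def clean_graph(pages, links):
    """return (edges as code pairs, links leaving the graph, code -> url dictionary)."""
    nodes = sorted({canonical(p) for p in pages})          # duplicate nodes removed
    code_of = {url: f"{i:08d}" for i, url in enumerate(nodes)}  # fixed-length codes
    edges, outside = set(), set()                          # sets drop duplicate links
    for src, dst in links:
        s, d = canonical(src), canonical(dst)
        if s in code_of and d in code_of:
            edges.add((code_of[s], code_of[d]))
        else:
            outside.add((s, d))                            # identified, not silently lost
    dictionary = {code: url for url, code in code_of.items()}
    return sorted(edges), sorted(outside), dictionary


# made-up example: two spellings of page a, one duplicate link, one link out
PAGES = ["http://Example.org/a", "http://example.org:80/a#top", "http://example.org/b"]
LINKS = [
    ("http://example.org/a", "http://example.org/b"),
    ("HTTP://EXAMPLE.ORG/a", "http://example.org/b#x"),
    ("http://example.org/b", "http://elsewhere.example/"),
]
EDGES, OUTSIDE, DICTIONARY = clean_graph(PAGES, LINKS)

# output shaped for a pagerank-style next stage: each node with the nodes linking to it
IN_LINKS = {code: [] for code in DICTIONARY}
for src_code, dst_code in EDGES:
    IN_LINKS[dst_code].append(src_code)
```

on this toy input the two spellings of page a collapse to one node, the repeated link is kept once, and the link to elsewhere.example is reported separately so that it can be removed on request. at web scale each of these steps would itself be a distributed job rather than an in-memory loop.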
if a university wishes to be an intellectual leader, it needs the insights that come from hands-on experience. here are some of the benefits that the web lab's local operations have brought to cornell. a large number of people have gained practical experience, not just the central web lab team. this includes casual users, who wish to learn about data-intensive computing. the boundary between education and research has been blurred, with students taking part in the research and the lab being used for undergraduate courses. the facilities are available for other research projects. for example, we have loaded , digitized books from the cornell university library onto the cluster. this is part of an exploratory grant from the nsf, which is studying how to use digitized books in scientific research. the web lab would not exist without help from the cornell center for advanced computing, but that center's high level of expertise in data-intensive computing has resulted from its involvement in research projects such as the web lab. local control brings flexibility. when an information retrieval class wanted to use the cluster for an assignment, nobody had to justify the use of the lab; there was expertise at hand to write special documentation; researchers were asked to minimize their computing during the week that the assignment was due; the system administrator installed a new software release to support the class; and there was collective experience to troubleshoot problems that the students encountered. for data-intensive computing, the common wisdom is that it is easier to move the computation to the data rather than the data to the computation, but bulk data transfers are not difficult. one way to move data over long distances is to load it onto disk drives and transport them by freight. 
this approach has been used successfully both by the internet archive to move huge volumes of data internationally, and by colleagues who move data from the arecibo telescope in puerto rico to cornell. for the web lab, the internet archive persuaded cornell to use the internet for bulk data transfer, in order to minimize the staff time that would have been needed to locate and copy data that is distributed across thousands of servers. the web lab has been a test bed for data transfer over the high-speed national networks and our experience has proved helpful to the internet archive.

one of the benefits of local operations is that the legal and policy issues are simplified. most large data sets have questions of ownership. many have privacy issues. these are much easier to resolve in local surroundings, where each situation can be examined separately. in the web lab, the underlying data comes from web sites. the internet archive is careful to observe the interests of the copyright owners and expects us to do so too. in permitting cornell to download and analyze their data, the internet archive is dealing with a known group of researchers at a known institution. they know that universities have organizational structures to protect restricted data, and that privacy questions are overseen by procedures for research on human subjects. recently, several of the internet corporations have made selected datasets and facilities available for research. for example, google and ibm have a joint program, which is supported by the nsf [ ]. these programs are much appreciated, but inevitably there are legal restrictions that reduce the flexibility of the research. it is easier to carry out research on your local computers, with locally maintained datasets.

acknowledgements

this work would not be possible without the forethought and longstanding commitment of the internet archive to capture and preserve the content of the web for future generations.
this work is funded in part by national science foundation grants cns- , ses- , iis , iis , and iis . the computing facilities are based at the cornell center for advanced computing. much of the development of the web lab has been by cornell undergraduate and masters students. details of their work are given in the web lab technical reports: . as a footnote, it is interesting to list the nationalities of the faculty, students, and staff who have worked on this project. here is the list for the people whose nationality is known: bangladesh, great britain, china, germany, india, korea, lebanon, mexico, new zealand, pakistan, the philippines, poland, russia, saudi arabia, taiwan, turkey, venezuela, and the united states. references [ ] william arms and ronald larsen (editors). the future of scholarly communication: building the infrastructure for cyberscholarship. nsf/ jisc workshop, phoenix, arizona, april . . [ ] amy friedlander (editor). promoting digital scholarship: formulating research challenges in the humanities, social sciences and computation. council on library and information resources, . . [ ] see, for example, the university of washington escience institute. . [ ] see, for example, the m.sc. in escience at the university of copenhagen. . [ ] digging into data challenge. . [ ] venter, craig. bigger faster better. seed, november . . [ ] william arms, selcuk aya, pavel dmitriev, blazej kot, ruth mitchell, and lucia walle. a research library based on the historical collections of the internet archive. d-lib magazine, ( ), february . . [ ] william arms, selcuk aya, pavel dmitriev, blazej kot, ruth mitchell, and lucia walle. building a research library for the history of the web. acm/ieee joint conference on digital libraries, . . [ ] national science foundation. sustainable digital data preservation and access network partners (datanet). . . [ ] d. crandall, l. backstrom, d. huttenlocher, and j. kleinberg. mapping the world's photos. www (forthcoming). . 
[ ] paul berkman. sustaining the national science digital library. project kaleidoscope, . . [ ] the internet archive's home page and the wayback machine are at: . [ ] sanjay ghemawat, howard gobioff, and shun-tak leung. the google file system. th acm symposium on operating systems principles, october . . [ ] jeffrey dean and sanjay ghemawat. mapreduce: simplified data processing on large clusters. usenix osdi ' , . . [ ] the lucene family of search engines is part of the apache jakarta project: . [ ] heritrix is the internet archive's open source web crawler: . [ ] wioletta holownia, michal kuklis, and natasha qureshi. web lab collaboration server and web lab website. may . . [ ] felix weigel, biswanath panda, mirek riedewald, johannes gehrke, and manuel calimlim. large-scale collaborative analysis and extraction of web data. proc. th int. vldb conf., . pp. - . [ ] the web lab technical reports are at . [ ] andrzej kielbasinski. data movement and tracking, spring report. may . . [ ] w. arms, s. aya, m. calimlim, j. cordes, j. deneva, p. dmitriev, j. gehrke, l. gibbons, c. d. jones, v. kuznetsov, d. lifka, m. riedewald, d. riley, a. ryd, and g. j. sharp. three case studies of large-scale data flows. proc. ieee workshop on workflow and data flow for scientific applications (sciflow). . [ ] vijayanand chokkapu and asif-ul haque. pagerank calculation using map reduce. may . . [ ] j. kleinberg. authoritative sources in a hyperlinked environment. journal of the acm, vol. , no. , september , pp. - . [ ] xingfu dong. hubs and authorities calculation using mapreduce. december . . [ ] jacob bank and benjamin cole. calculating the jaccard similarity coefficient with map reduce for entity pairs in wikipedia. december . . [ ] national science foundation. cise – cluster exploratory (clue). . . copyright © william y.
arms, manuel calimlim, and lucia walle. doi: . /may -arms

open repositories conference highlights: repository island in sea of research data

d-lib magazine september/october volume , number /

carol minton morris, duraspace, cmmorris@duraspace.org. doi: . /september -morris

abstract

the eighth international conference on open repositories was held july - , on prince edward island, canada.
the annual conference offers attendees an opportunity to learn about new ways to access information, innovative repository tools, and emerging community initiatives. more than attendees came to or to meet with colleagues, keep up with fast-paced development goals, and hear expert speakers who are attuned to current repository issues.   introduction if life is a beach then the gentle landscape, red sands and ideal climate of prince edward island, site of the eighth international conference on open repositories in july, was an ideal summertime setting for sharing ideas in fields related to repositories, archives and scholarly communities. the annual conference continues to offer attendees a forum to learn about new ways to access information, innovative repository tools, and emerging community initiatives. more than attendees came to or to hear about formative best practices and technologies, visit with new and old friends who were gathered to share ideas, make progress towards fast-paced development goals, and be inspired by speakers attuned to current repository issues. clockwise from top left: red sands on a pei beach; or program co-chairs jon dunn and sarah shreeves, or keynote speaker victoria stodden, and or host committee chair mark leggott; fields in bloom; hydra project poster; green gables heritage place farmstead, and; or dspace user group session. in collaboration with host committee chair mark leggott (university of prince edward island and discovery garden), open repositories conference program co-chairs jon dunn (indiana university) and sarah shreeves (university of illinois at urbana-champaign) coordinated a full week of presentations, panels, posters, demonstrations, social events and user group sessions of interest to anyone working with repositories and the digital information lifecycle. results from a survey of conference attendees indicate overall satisfaction with key conference components. 
pater audio provided or audio and visual support and reported this statistic via twitter that may be indicative of the high level of activity in and around conference venues: "last #or stat via our awesome a/v guys pater audio: meters of tape to secure cords." data collected by university of prince edward island or host committee; analysis by richard green, university of hull.   research results from repository data the conference theme, "use, reuse, reproduce," was aligned with questions around the role of repositories in managing, preserving and reproducing research results from repository assets. verifying the results of research—the ability to make the same experiment turn out the same way using the same data—separates proven scientific fact from speculative reporting. reproducing research from data that is held in a repository is the gold standard in the world of data repositories. curating data for replication to meet that standard is a complex process that can hold up repository workflows. (see also iblog, "the role of data repositories in reproducible research".)   plenary sessions re-use of repository data was on everyone's mind as victoria stodden offered the opening plenary presentation on computational methods for utilizing research data held in repositories as a way of addressing what she views as a credibility crisis in science. ms. stodden is an assistant professor of statistics at columbia university and co-founder of runmycode, an open platform for disseminating code and data. she is a computational scientist who set the stage for several conference sessions about research data in repositories with her presentation about the central role of algorithms and code in the reproducibility of science entitled, "reuse and reproducibility: opportunities and challenges". 
she challenged the audience to do something about the current state of data in repositories by becoming active partners in the scientific process as it performs an "internal validity" check on data and analysis. this process requires a tighter integration of how we communicate our results. fortunately there are many new tools to help accomplish this. she believes that without access to the data and computer code that underlie scientific discoveries, published findings are not verifiable. without open data there can be no scientific verification to perfect the scholarly record. stodden is an advocate for science policy that would require not only open publications and data, but also open code. "with many eyeballs, all bugs are shallow," stodden explained. as a reminder of the relationship between published papers and the data that underlie research results peter ruijgrok offered this tweet: "victoria stodden: a publication is actually an advertisement. data and software code is what it is about as proof/reproducing." robin rice, data librarian at university of edinburgh, reflected on dr. stodden's keynote address in a blog post entitled making research 'really reproducible'. in the closing plenary session jean-claude guédon, professor of comparative literature, university of montreal, made a case for the role open repositories could play in restoring quality in science. by looking beyond ranking systems to document and add real value to science, repositories can leverage communities, networks and open data to support researchers and scientific publishing. this interview with dr. 
guédon from casa da cultura digital provides background.

conference sessions & workshops

a wide range of content was presented in main conference sessions and several workshops, on topics that included aspects of repository management, future directions of core technologies, curation strategies and tools, rich media solutions, open access to research use cases, linked data examples, analytics techniques, identifiers, collaborative persistent access initiatives, and more. interest in the care, handling, preservation and significance of research data in repositories was reflected in several sessions. a zotero collection of or presentations and related resources is available from the university of toronto digital scholarship unit. please refer to other reports and blog posts by conference participants included in this report for additional information on specific conference sessions.

every time a camera turns on and off there are new files, which is why preserving and providing access to media files in repositories is a challenge. the "repository solutions for time-based media" panel discussion was led by claire stewart, northwestern university; karen cariani, wgbh; declan fleming, university of california san diego; todd grappone, university of california los angeles; and brian tingle, california digital library. panelists focused on explaining criteria and issues around repository solutions for time-based media management and delivery at several institutions. film, video, and audio files can be large and hard to manage and store, and they come in a variety of formats that change and morph. to keep these files accessible over time, both technical and descriptive metadata are required. karen cariani suggested that the global addition of metadata, which can often be captured in the camera, makes the most sense. panelists seemed to agree that a convergence in media asset management among institutions would be a good idea.
advancing knowledge in all fields of research now requires curation, collection, management, access and long-term preservation of digital data sets. research libraries are currently planning and experimenting with how to put digital data policies, workflows and economic models in place to ensure that data will persist to serve researchers and institutions into the future by adding value to data throughout its lifecycle. in the panel discussion "institutional approaches to research data and repositories," mark leggott, university librarian, university of prince edward island and discovery garden; sarah shreeves, university of illinois champaign-urbana library, coordinator for the illinois digital environment for access to learning and scholarship (ideals); andrew bell, university of southampton, eprints services; dean krafft, cornell university library chief technology strategist; and jill sexton, university of north carolina head of digital repository services, discussed techniques and practices at their institutions for curating, collecting, managing, providing access to and preserving digital data sets. see also yale isps blog.

developer challenge

the developer challenge provides opportunities for software developers to sharpen their skills, showcase their work at a community event, demonstrate repository solutions and compete for more than $ , in prizes. the version of the open repositories developer challenge competition was judged on evolving criteria that echoed the values behind the open repositories conference:

transparent, fun, open collaboration in diversely constituted teams over individual brilliance and/or groups of like individuals in cutthroat competition.
the creation of new professional networks over the ossification of old ones.
effective engagement of non-developers (researchers, repository managers) in development over purely developer driven projects.
work done at the conference over presentation of something prepared earlier.
innovative ideas expressed in running code over wire frames, hand waving and elevator pitches.
the development of the open repositories movement as a whole over siloed development on particular repository platforms.
entertaining live presentation of challenge projects in a relaxed setting over formal submissions.

winners included team ravens for the "pdf/eh" project. kevin bowrin, carleton college, explained a restful api software solution that enabled varying levels of pdf/a compliance. team orcid demonstrated a generic api integration with eprints that allowed for creation and management of ids from the repository. establishing new networks that facilitate working together as a community is a valuable part of the developer challenge event. developer challenge enthusiast peter sefton said in his blog, "it was impossible not to network unless you stayed in your hotel room."

user groups

following the main conference, dspace, eprints and fedora user group meetings on july and rounded out the week. community user group presentations echoed the conference focus on supporting research and research data as well as interoperability with other platforms. a community discussion on the foundations of enhanced metadata support for dspace affirmed the approach of the proposal to establish a foundation for future improvements to dspace metadata. fedora presentations on hydra, fedora futures and islandora offered users a range of solutions and use cases for improving access and digital object workflows, echoing the conference theme of "use, reuse, reproduce". program co-chair jon dunn offered this view on the overall significance of the conference: "the most exciting thing for me about or was seeing the wide range of collaboration taking place within the repository community—at local, national, and international levels—to advance the state of the art and develop new tools and services to meet the needs of an increasingly diverse set of users and types of content".
open repositories next year's conference will be held june - , in helsinki, finland and hosted by helsinki university library and the national library of finland. more information will be posted on the or web site as it becomes available. the "welcome presentation" given at or can be viewed at the or site.   about the author carol minton morris is director of marketing and communications for duraspace, and is past communications director for the national science digital library ( - ) and fedora commons ( - ). she leads editorial content and materials development and dissemination for duraspace publications, web sites, initiatives and online events, and helps connect open access, open source and open technologies people, projects and institutions to relevant news and information. she was the founding editor of nsdl whiteboard report ( - ) featuring information from national science digital library (nsdl) projects and programs nationwide. she is chair of the open repositories conference steering committee. follow her at http://twitter.com/duraspace.   
copyright © carol minton morris

creative destruction in libraries: designing our future – in the library with the lead pipe

nov caro pinto

in brief: joseph schumpeter defines creative destruction as a “process of industrial mutation that incessantly revolutionizes the economic structure from within, incessantly destroying the old one, incessantly creating a new one.” as libraries struggle with how to position themselves to thrive in the digital age, how can we balance the traditional elements of librarianship like collecting and reference with the demands of the present, all without sacrificing staffing and support for collections, space, and community?

image credit: rebecca partington

by caro pinto

in my first job after library school, i worked in manuscripts & archives in the yale university library. there i worked adjacent to an extraordinary archivist named laura tatum. laura was the architectural archivist and she worked with firm records and personal papers, forging unique relationships with donors to streamline the processing of manuscript and records collections. through laura i became familiar with eero saarinen, the finnish architect who designed the twa terminal at kennedy airport and the gateway arch in st. louis. saarinen’s structures and aesthetic mesmerized me.
i spent hours poring over plans, drawings, and photographs of his completed projects during the slower moments of my reference shifts. at home i began reading widely about his work. i continue to take field trips to his completed projects whenever time allows. saarinen designed furniture and buildings with the intention to build a vision for the present that also leaned forward to the future. considering his projects and his vision for futurism in the built environment, i began to connect my interest in saarinen with my exploration of the role of creative destruction in academic libraries. through the course of my reading, i came across these words from saarinen: “each age must create its own architecture out of its own technology and one which is expressive of its own zeitgeist-the spirit of the time.” (serraino, ) within our own libraries and within the field of librarianship at large, creative destruction is the idea that in order to create new ways of knowing and thinking, we must break with the past to plan and shape our future. through my relationship with laura, my devotion to saarinen scholarship, and my interest in futurism, i often consider what creative destruction can and should mean for libraries. what should libraries be in the twenty-first century? what should twenty-first century librarians do? as our collection bases transition from print to hybrid print to digital collections, libraries face new challenges around budgets, space, personnel, and questions of relevance. many organizations have shuttered their reference desks in favor of unified information desks like the info bar at hampshire college or programs like the personal librarian program at yale. technical services and acquisitions departments manage spreadsheets of data to make selection decisions, rather than relying on a monkish bibliographer ordering title by title. 
libraries are increasingly loud, bustling, collaborative places, out of step with the image so many have of the classic library-a somber building governed by a stern cat lady who demands silence. can librarians and libraries evolve to meet new challenges and expectations, or will these things require  a new generation of managers who will, as a colleague remarked to me in , “turn off the lights?” librarians are guardians of our profession: we are the stakeholders in our future. libraries have long survived threats to their existence and as scott bennett discussed in , have experienced “paradigm shifts” from “reading centered” spaces into “learning centered spaces.” (bennett, - )  the nature of librarianship in the digital age demands that we continue to re-evaluate our work and confront the reality that our personnel, job descriptions, and spaces must change. in order to facilitate that change, what should we give up? if libraries do what saarinen suggests – creating their own architecture reflective of the time, how will libraries creatively destroy traditional aspects of our profession without too much collateral damage? how can we make creative destruction in libraries, particularly in the context of higher education, sustainable and constructive as we create a profession that fits the evolving demands of our digital age? students are the heart of today’s academic libraries; engaging students as collaborators in library work; redesigning spaces to be active hubs of student engagement and learning; and putting ourselves in the role of students for a continuous arc of learning to continually revise how we provide and promote library services. tools of the trade: once pencils, now pinterest recently, while i was sitting on the reference desk in archives & special collections at mount holyoke college, i ran into a colleague from my days as an archives assistant at the university of massachusetts. 
we caught up after having not seen each other since , when i graduated. while i was working with other patrons, he walked around the reading room, marvelling at the readers, poring over the card catalog that houses descriptive details of collections and remarking, “the tools of the trade: the pencils, the cards, the boxes.” indeed, those were the tools of the trade when i worked at umass processing collections and responding to reference requests. but will they be for much longer? recently, the taiga forum posted about a “gentle disturbance, the end of library scut work?” responding to an earlier piece in library journal, where stanley wilder asserted that the decline in library support and student worker staff since at arl (association of research libraries) institutions is less a byproduct of the recession than an effect of the “evolving nature of library work.” wilder writes, “the iconic image of library workers pushing book trucks is quickly slipping into obsolescence…lower skill library work is disappearing, and it will never come back.” (wilder, ) at mount holyoke college, we continue to hire student workers to manage the stacks and to staff service points like the circulation desk and the research help desk. indeed, i see students pushing book trucks daily as physical books return to the library and to their rightful places in the stacks. however, these are not the only types of student positions we offer at mount holyoke; in true “learning paradigm” fashion, we engage students in library work that leverages critical thinking skills and creative imaginations. the library at mount holyoke college employs students to conduct outreach, publicize events, and generate content for our social media channels. these positions leverage the excellent communication skills that the mount holyoke college curriculum cultivates while preparing these students to carry skills learned in the classroom and exercised in student positions into internships and jobs off campus.
students acting as collaborators, incubating projects and actively engaging in daily work, are a core part of how we can promote and sustain a user-centered library experience. the increasing disappearance of piecemeal library work among student workers is a new opportunity to train undergraduates to meet the demands of today’s workplace; we may give up solitary, meditative, repetitive tasks for these workers, but the students and the staff who supervise them gain much more. where students like me once relied on pencils for our library work, today’s students rely on pinterest.

this used to be my playground? revising job descriptions

as stanley wilder discussed the end of low-wage library work in library journal, he also described the simultaneous % increase in professional library salaries. (wilder, ) citing the impact of digital scholarship, wilder wrote, “there is a second answer as to how libraries managed to raise skills and salaries: they had to. for every physical process that no longer exists, a new and complex digital process has sprung up in its place. these digital processes employ far fewer people but the expertise required is greater.” indeed, the trend that wilder reports at arl institutions is similar to trends at liberal arts colleges; new developments in digital scholarship, collections, and workflows supplant traditional library work. i made this connection over the summer when the five colleges (five colleges, incorporated is a consortium of colleges in western massachusetts) held a digital humanities symposium to consider how to build an effective community of practice in the digital humanities, especially at liberal arts colleges.
we circulated a call for proposals and invited speakers from colgate university, haverford college, and washington & lee university to present on how they were conducting digital scholarship in their local contexts; how they were adapting to the new scholarly landscape; and how their organizations were changing to meet the growing demands of digital scholarship. in all cases, staffing changed to reflect the new missions and charges of departments. washington & lee created a brand new position of digital scholarship librarian; haverford underwent an organizational shift that resulted in one of their unit heads becoming the digital scholarship coordinator; and finally, colgate saw sweeping changes in terms of how their library shifted from a th century model of reference librarians to a dynamic team of st century instructional designers. joanne schneider of colgate reflected on the process: “this effort also has focused on rebuilding the collaboration for enhanced learning (cel) group, a partnership of the libraries and information technology services composed of librarians and technologists who provide coordinated support to faculty who wish to rethink courses and pedagogical approaches using current and emerging technologies to enhance student learning and engagement with information.” (digital humanities for liberal arts colleges symposium, ) in order to accomplish this transition, the organization had to destroy old job descriptions and create new ones in their stead. 
the type of human capital transformation described at colgate is also represented well at columbia university, where librarians in the humanities and history division cultivated the developing librarian project as an effort to empower their librarian staff to reinvent themselves to meet the challenges of the present and position themselves for success in the future: “in the fall of , and running in parallel with the expansion of the digital humanities center, we initiated the developing librarian project (dlp), a two-year training program, with the goal of acquiring new skills and methodologies in digital humanities. the dlp is created by and for librarians and other professional staff in the humanities and history division.” (dh+lib, ) columbia recognizes schumpeter’s “incessant revolution” and responds by empowering its staff to gain the skills necessary to participate in the digital scholarship ecosystem by participating in the process themselves. in their announcement earlier this summer on dh+lib, a project of the association of college & research libraries digital humanities interest group, the team reflected, “we realize training is no longer a thing to do a couple of times a year, but a continual process of learning integrated into the fabric of what we do every day. in that sense it would be more accurate to say that ours is not a training program, but part of our continuing professional development and research. we are committed to gaining a better understanding of emergent technologies and to being partners in the research process.” (dh+lib, ) projects like the developing librarian project and organizational shifts like the one described at colgate university reinforce the idea that in order to stay agile and relevant, librarians and libraries must have organizational structures and programs in place to promote change.
libraries cannot realize radical change to support emerging digital scholarship unless we build organizations and cultures with the human capital to scaffold instruction, resources, and technical support to enact new models for scholarship. just as the jet age demanded new architecture to acculturate americans to air travel, libraries must design new types of organizational structures and cultures to acculturate faculty and students to the changing demands of our rapidly shifting scholarly landscape.

trading spaces: a slide library becomes a media lab

the end of “scut work” wilder describes and new trends in student library employment have coalesced in a project at mount holyoke college called the media lab. i first learned about the lab during a webinar i hosted last february about new types of learning spaces at liberal arts colleges. my colleague, nick baker, presented on the development of the media lab he built in collaboration with arts faculty at mount holyoke college in the former mhc slide library. in , the slide library at mount holyoke enjoyed a triumphant renovation; faculty packed the library reviewing slides for their lectures. as time passed and database products like artstor matured – and other faculty members began digitizing slides to embed in powerpoint presentations – by mount holyoke faculty no longer stood “elbow to elbow” in the slide library. the space stood idle. in , the library created a new department, digital assets and preservation services (daps), and absorbed the slide librarian into its group. the slide library effectively closed; the art librarian and the former slide librarian shifted to the main library. in response, the art and architecture departments hosted a contest for students to propose new plans to revise the space. students across the five colleges submitted proposals.
the winning proposal devised a pop-up media lab; the students wanted to add new furniture, computers, and some minor physical modifications to the space. while plans moved forward with an architecture consultation and a modest budget proposal of $ , , the financial landscape at the college rendered those changes impractical. in spite of this, baker and the art department moved forward with small changes: couches from elsewhere on campus moved into the space along with older computers and some grant-funded studio supplies. with minimal intervention, baker and faculty programmed the lab slowly with workshops and projects. baker hired students to do experimental projects and serve as ambassadors to evangelize about the space and its potential for interdisciplinary studio work. the students’ outreach efforts drew more students into the space. faculty and library staff recognized that in order for precious campus space to remain vital, it was necessary for the slide library to close and transform into something entirely new. baker also found ways to ground the space in the past in spite of its experimental nature. as baker cleared out projectors and obsolete technologies, he was inspired to save some items and create a slide museum that demonstrates for students how the building was used in the past. what was state of the art in became obsolete by . a creative intervention transformed a slide library into a dynamic teaching and learning space. the evolving nature of the curriculum demanded a new type of space informed by student needs. given the constraints of budget and space at mount holyoke college, librarians, faculty, and students collaborated to remake an obsolete space into an energized and relevant one.

which way do we go?

as guardians of the profession, we all must decide how to proceed. in many cases, change is hard, even emotional for some employees, users, and organizations.
there are clearly tasks that librarians will no longer do: sit at reference desks for regular shifts, only develop collections by ordering monographs title by title, or shush patrons as they labor in rows of tables in pristine reading rooms without a machine or whiteboard in sight. there are librarians who mourn the loss of some of these activities, their hours spent reading book reviews, days at the reference desk where people asked questions of facts now easily accessible through a plethora of online resources. on the other hand, there are a growing number of librarians like me who have “library” in their job titles, but who also work in instructional technology or digital scholarship or digital humanities, or as digital archivists. transformations like the developing librarian program at columbia or the staff reorganization joanne schneider initiated at colgate require bold leadership, vision to build new programs and positions that did not exist, the balancing of budgets by dissolving positions like reference librarian or cataloger in favor of different choices – relevant ones. we may throw out older copies of aacr as our supply closets burst with materials discarded from our desks, but we are not discarding the contributions of our librarian forebears. those communities built the foundations that our positions of the future depend upon; we create new opportunities unimaginable by previous generations, but we must do so with an eye towards respecting the past, too. acknowledgements: many thanks to emily ford for shepherding the project from idea to article; alex gil (external editor) for astute edits, my writing group at mount holyoke college, especially julie adamo, sarah oelker, and alice whiteside for their support, and, finally, to laura tatum, whose encouragement, friendship, and brilliance inspired me to evolve and grow as a librarian. references and further readings: serraino, pierluigi. eero saarinen, - : a structural expressionist. 
köln ; london: taschen, .
schumpeter, joseph a. capitalism, socialism, and democracy. new york; london: harper & brothers, .
bennett, scott. “libraries and learning: a history of paradigm change.” portal: libraries and the academy , no. ( ): – .
booth, char. “the library as indicator species: evolution, or extinction?” october , . http://www.slideshare.net/charbooth/the-library-as-indicator-species-evolution-or-extinction.
“the end of library scut work? | taiga forum.” accessed september , . http://taiga-forum.org/the-end-of-library-scut-work/.
“the end of lower skill employment in research libraries | backtalk.” accessed september , . http://lj.libraryjournal.com/ / /opinion/backtalk/the-end-of-lower-skill-employment-in-research-libraries-backtalk/.
“digital humanities for liberal arts colleges symposium.” accessed october , . https://sites.google.com/a/mtholyoke.edu/digital-humanities-for-liberal-arts-colleges-symposium/.
“the developing librarian project.” accessed october , http://acrl.ala.org/dh/ / / /the-developing-librarian-project/
nick baker, interview by caro pinto, mount holyoke college, august , .

responses

laborlibrarian – – at : am
“as stanley fish discussed the end of the low wage library work in library journal, he also described the simultaneous % increase in professional library salaries. (fish, )” in referring to the source, readers will learn that this figure covers a -year period, that % is accounted for by ‘routine wage growth’, and that it is only applicable to arl member libraries. please try to employ stats with more care.
i’d hate to see folks throwing around that % number indiscriminately without actually looking at aggregate salary data (from the arl or ala-apa salary surveys) or labor market statistics.

robert teeter – – at : pm
there’s a reference to “fish ” in the article, but it doesn’t show up in the references. the one link to lj doesn’t work.

caro pinto – – at : pm
here’s the link to the lj article: http://lj.libraryjournal.com/ / /opinion/backtalk/the-end-of-lower-skill-employment-in-research-libraries-backtalk/#_ it should be stanley wilder, not stanley fish. thanks for catching that!

caro pinto – – at : pm
it’s been corrected in the article, too.

stevem – – at : am
i applaud your search for a new way of thinking about the future of libraries and librarianship in this new millennium. almost years ago i wrote in my blog st century library “discontinuous thinking sounds very impressive. some might call it thinking outside the box, or lateral thinking, or creativity, or whatever. the point is still that conventional thinking and incremental decision making will not address the changes that confront st century libraries. charles handy based the title of his book the age of unreason on george bernard shaw’s observation that “all progress depends on the unreasonable man. his argument was that the reasonable man adapts himself to the world, while the unreasonable [person] persists in trying to adapt the world to himself; therefore for any change of consequence we must look to the unreasonable man, or, i must add, to the unreasonable woman.” [handy, c. ( ). the age of unreason. harvard business school press, boston, ma.]
discontinuous thinking reasons to believe discontinuous change

this work is licensed under a cc attribution . license. issn -

collaborators’ bill of rights | off the tracks | mediacommons press
mcpress.media-commons.org/offthetracks/part-one-models-for-collaboration-career-paths-acquiring-institutional-support-and-transformation-in-the-field/a-collaborati…

off the tracks comments collaborators’ bill of rights page
pingbacks and trackbacks
. getting started in the digital humanities | digital scholarship in the humanities october , at : pm […] in “care of the soul,” and the off the tracks workshop devised a useful “collaborators’ bill of rights.”) if you can bring seed funding or administrative backing to a project, that might make it easier to […]
. who owns this stuff? | thatcamp southeast march , at : am […] and build upon the resulting code and artifacts? in this session, i propose we use the “collaborators’ bill of rights” as a starting point for discussion. how might we instantiate these recommendations in our […]
. we are rrchnm | lot march , at : pm […] revealed and highlighted the names of everyone who had ever worked on this project before the collaborator bill of rights existed. i asked on twitter, how many of you look at the about page of a digital humanities […]
. post-doctoral fellowship (closes - - ) « occasional drama june , at : pm […] of all persons and and affirms the dignity of all persons. moeml is committed to honouring the collaborators’ bill of rights.   enquiries and applications may be sent to moeml via janelle jenstad at jenstad@uvic.ca. […]

) all kinds of work on a project are equally deserving of credit (though the amount of work and expression of credit may differ). and all collaborators should be empowered to take credit for their work.
¶ ) the dh community should default to the most comprehensive model of attribution of credit: credit should take the form of a legible trail that articulates the nature, extent, and dates of the contribution. (models in the sciences and the arts may be useful.) ¶ a) descriptive papers & project reports: anyone who collaborated on the project should be listed as author in a fair ordering based on emerging community conventions. ¶ b) websites: there should be a prominent “credits” link on the main page with pis or project leads listed first. this should include current staff as well as past staff with their dates of employment. ¶ c) cvs: your cv is your place for articulating your contribution to a collaboration. all collaborators should feel empowered to express their contributions honestly and comprehensively. ¶ ) universities, museums, libraries, and archives are locations of creativity and innovation. intellectual property policies should be equally applied to all employees regardless of employment status. credit for collaborative work should be portable and legible. collaborators should retain access to the work of the collaboration. ¶ ) funders should take an aggressive stance on unfair institutional policies that undermine the principles of this bill of rights. such policies may include inequities in intellectual property rights or the inability of certain classes of employees to serve as pis. 
. how collaboration works and how it can fail | archaeoinaction.info june , at : pm […] and the growth of collaborative projects involving humanities scholars, including the excellent collaborator’s bill of rights as well as rumination on what dangers collaboration may pose, such as my own article in jdh - . my […] . coltt : “the digital dossier” | erin m.
kingsley august , at : pm […] dh bill of rights: including all authors/collaborators must be listed as taking some part of the project (although tasks/credit may vary); individual cvs should list individual, not group, collaboration—what did you do on this project –  http://mcpress.media-commons.org/offthetracks/part-one-models-for-collaboration-caree… […] . credit transparency and the collaborator’s bill of rights | introduction to digital history december , at : pm […] projects. in particular, i want to draw your attention to and work through the provisions of the “collaborator’s bill of rights,” which is part of a larger report entitled “off the tracks: laying new lines for digital […] . evaluating non-traditional digital humanities dissertations | literature geek september , at : am […] should get credit and thanks for sharing their work with others! (see the awesome “collaborators’ bill of rights” that came out of a mith workshop for more on why correct credit should matter to everyone). […] . credit transparency and the collaborator’s bill of rights | dave decamp october , at : pm […] projects. in particular, i want to draw your attention to and work through the provisions of the “collaborator’s bill of rights,” which is part of a larger report entitled “off the tracks: laying new lines for digital […] . credit transparency and the “collaborator’s bill of rights” | boston public history november , at : pm […] humanities project, i want to draw your attention to and work through the provisions of the “collaborator’s bill of rights,” which is part of a larger report entitled “off the tracks: laying new lines for digital […] . the pedagogy of digital humanities in the liberal arts classroom | april , at : pm […] they are encouraged to include dh research projects, experiences, and skills on their resumes. the dh collaborators bill of rights provides some nice initial guidelines for these […] . 
creating a group project charter | introduction to digital humanities november , at : pm […] also you might want to read this collaborators’ bill of rights […] . milking the deficit internship | january , at : pm […] collaborators’ bill of rights. off the tracks: laying new lines for digital humanities scholars. […] . disrupting student labor in the digital humanities classroom | research and destroy march , at : am […] for the principles of open access, or the guidelines for professional collaboration outlined in the collaborators’ bill of rights. we can develop and share resources for constructively encouraging students to produce durable […] . cetl faculty forum: “developing digital project assignments” notes and resources – sarah e. cornish april , at : pm […] for a wide selection of readings that may help you think about digital pedagogy and research ideas, browse through debates in the digital humanities edited by matthew k. gold of the cuny graduate center. i always incorporate readings on dh into my longer-term projects to get students to engage with the conversation, and i encourage them to read the collaborators’ bill of rights. […] . on developing a collaborators’ bill of responsibilities | september , at : am […] guidance on these matters does exist. the collaborators’ bill of rights, upon which the ucla guidelines are based, makes it clear […] . digital book project – eng - intro. to literary history and interpretation january , at : pm […] concerning credit, we will discuss and follow the collaborators’ bill of rights. […] . collaborators’ bill of rights – eng - intro. to literary history and interpretation january , at : pm […] collaborators’ bill of rights […] . 
introduction: issue fourteen / january , at : pm comment awaiting moderation
content © off the tracks . all rights reserved. http://mcpress.media-commons.org/offthetracks/

white paper report
report id:
application number: hd
project director: roger schonfeld (roger.schonfeld@ithaka.org)
institution: ithaka harbors, inc.
reporting period: / / - / /
report due: / /
date submitted: / /

white paper (to be released through ithaka s+r’s website shortly)

cover sheet
type of report: final
grant number: hd
title of project: campus services to support historians
name of project director(s): roger c. schonfeld
name of grantee institution (if applicable): ithaka
date report is submitted: november ,

research support services for scholars: history project final report

table of contents
introduction
research practices
gathering and using primary sources
discovery
secondary sources and research support from libraries and librarians
organizing sources
digital research methods, collaboration, and communication
audience, outputs, and credit
graduate students
conclusions and recommendations
appendix a: interview participants
appendix b: interview protocol for historians
appendix c: evidence

introduction

new technologies have been changing academic research and teaching for years. in many academic fields, changing research methods are re-shaping the very nature of the types of research questions that scholars are able to pursue and the rigor with which they can address them. and, even when underlying research methods remain constant, day-to-day research practices are digitally enabled, a transformation that has had in some cases substantial implications for the substance of scholarly research.
research support providers such as libraries, archives, humanities centers, scholarly societies, and publishers – not to mention the academic departments that are often at the front line of educating the next generation of scholars – find themselves faced with the need to innovate in support of these opportunities. the innovation required of research support providers is the subject of significant debate. while the print to electronic transition has made clear some of the requirements for publishing, acquiring, and preserving information resources, some of the more fundamental questions regarding services have been more complicated to address. at a basic level, research support providers are eager to develop a deeper understanding of the changing needs of their users and customers. with the need to understand changing research methods and practices of scholars, ithaka s+r has launched a program of discipline-specific studies that we are calling research support services for scholars. we have begun this series in this project with history, for which the national endowment for the humanities has generously provided start-up funding to develop and test a method that is already being extended to additional fields. this report shares our findings and recommendations with respect to the field of history. for this project, we have focused on the practices and needs of history scholarship exclusively as conducted in an academic context. in history, the ithaka s+r project team found a discipline in transition. an expansion in the nature of the field over the past years has introduced new sources, both in terms of subject coverage and international scope. however, only a comparatively small share of the primary sources required by historians has been made available digitally, tempering the opportunity for new methods to take hold. 
even if the impact of computational analysis and other types of new research methods remains limited to a subset of historians, new research practices and communications mechanisms are being adopted widely, bringing with them both opportunities and challenges. the introduction of digital cameras to archival research is altering interactions with materials and dislocating the process of analysis, with potential impacts not only for support service providers but for the nature of history scholarship itself. there are as a result a number of key opportunities to increase the efficiency and comprehensiveness of archival research practices through improved researcher training and support services. in sum, research practices have evolved in subtle but significant ways, requiring parallel adjustments for those supporting history research. ten years ago, the american historical association extensively explored the state of the field of history as it was then practiced in the united states, to identify changes that might be suggested for educating phd students. the project, ultimately published as the education of historians for the twenty-first century, recommended a variety of opportunities to strengthen the structure and culture of history departments and the education they offer to their graduate students. in the ensuing decade, new technologies have allowed historians to introduce new research methods and practices, raising questions not only about the education of the next generation of scholars but even more broadly about how best to support new forms and means of scholarship. the findings and recommendations from the present project connect directly to efforts to best educate phd students for the field of history. the findings and recommendations of this project will find interest among the broad community that supports academic history research. 
we hope they will suggest opportunities at both a field and a campus level to ensure that academic historians and the field of history are well served in their digital turn.

methodology

in the first phase of the project, ithaka s+r interviewed professionals who support the research work of historians. before interviewing faculty members directly we established an understanding of the breadth of support available to history faculty members on campus, as well as the environment and institutions that support their research from concept to publication. the goal for this set of interviews was to explore the different types of service models currently engaged in supporting history research on campus, as well as the challenges that research support professionals are facing in today’s rapidly evolving research environment. ithaka s+r interviewed fourteen research support professionals altogether, and one member of our research team attended a round table discussion about the digital humanities with research support professionals from institutions in new york city. the interviewees included library professionals, professionals working in centers associated with libraries, professionals associated with scholarly societies, publishers, professionals associated with independent campus digital centers, and professionals associated with independent higher education organizations. in our selection of interviewees, we placed an emphasis on campuses with support for digital humanities work. the research team conducted interviews via phone conversations; each interview was about minutes long. interviews were recorded for transcription and analysis purposes. interview questions focused on four fundamental areas: current services provided, planning for future services, perceptions of evolving scholarly needs, and challenges.
while the majority of the interview subjects work with a variety of humanities and social science scholars, there was an attempt to focus conversations and examples on history in particular. however, because libraries and centers do not typically focus their support on a single discipline, in many cases it was necessary and contextually relevant to discuss the broader context of humanities researchers. an interim memo of findings from this stage was reviewed with our advisory board and made available publicly.

for the second phase of this study, ithaka s+r interviewed thirty-nine practicing academic historians and graduate students about their work practices. of the thirty-nine, seven were phd students at various stages in the dissertation process. the researchers and the advisory board worked together to identify a diverse group of historians, varying in career stage, sub-field, geographic location, and type of institution. as the study focuses on research methods, faculty members were selected from institutions that to some degree emphasize faculty members’ research. though we believe that the sampling approach has reduced obvious bias considerably, this sample of historians is not meant to be perfectly representative of the history community. (please see appendix a for a complete list of interview subjects and appendix d for further demographic information about the participants.) as the study is concerned with both the typical research experience for history and the digital scholarship that is now taking place in the field, the historians sampled fall across this spectrum of methodologies and approaches.

(note: thomas bender, philip f. katz, colin a. palmer, and the committee on graduate education of the american historical association, the education of historians for the twenty-first century (university of illinois press, ).)

the interviews were conducted using a variety of methods.
eleven interviews were conducted in-person, most of them at the american historical association annual conference in , and fourteen of the interviews were conducted over the phone. thirteen interviews were conducted in the researcher’s office or primary work space. these onsite interviews allowed us to observe first-hand each subject’s work space and the artifacts of their research, which included research notes, resources, organizational techniques, writing approaches, and tools used in the research process. researchers were sometimes able to demonstrate their work practices, often on the computer or via photographs they shared, during conversations. the interviews were guided by an interview protocol (see appendix b), and they were semi-structured and exploratory in nature. the primary topics of interest included the research process, use of archives and libraries, research notes management, writing and publishing, general challenges throughout the process, and the use of digital methods in the scholarly process.

(note: in phase i of the study, it was recognized that in discussing the future of research support services for historians, it was critically important to include phd students in the interviews.)

(note: jennifer rutner, ithaka s+r, research support services for scholars: history project interim report. http://www.researchsupportservices.net/?p=)

acknowledgments

a number of individuals in addition to the named authors contributed to this project, and we express our gratitude. we thank first of all the members of this project’s advisory board, who helped us formulate the scope and coverage of the project, assisted us in identifying interview candidates, and reviewed the analysis and recommendations that appear in this final report. our advisory board members are:

- francis x. blouin, director, bentley historical library; professor, school of information and department of history, university of michigan
- daniel cohen, associate professor in the department of history and art history at george mason university, and the director of the roy rosenzweig center for history and new media
- james grossman, american historical association, executive director
- miriam posner, ucla, digital humanities program coordinator
- stefan tanaka, university of california at san diego, professor in the department of communication and director of the center for the humanities

we interviewed both historians and research support service professionals, each of whom gave generously of his or her time to ensure that as balanced as possible a perspective could be presented in our analysis. they are listed by name in appendix a, and to each of them we offer our deepest thanks. the development of this project, its analysis, and the final report were reviewed formally and informally by every member of the ithaka s+r team. we offer special thanks to ross housewright, matthew long, deanna marcum, and kate wulfson, for their comments on various drafts. finally, this project could not have been conducted without the start-up funding from the national endowment for the humanities through its office of digital humanities. we thank brett bobley, jennifer serventi, and perry collins, along with anonymous reviewers, for their advice and recommendations in helping to see this project come into being. while the work of this project was aided by the enthusiasm and support of many individuals, we take sole responsibility for the contents of this report.

research practices

historians and graduate students use archives as a principal source for primary source materials and libraries for secondary source materials. historians utilize a mixture of traditional and emerging scholarly practices. they organize and manage research notes to gain intellectual control over their research topics. in each of these areas of their research work, historians have needs for different types of support than they typically receive.

gathering and using primary sources

“you never know where you’ll get your records from.”

“it’s about the relationship you develop over time with the archivists and librarians at the archive. after you leave, you want to have support at the archive; good relationships facilitate this. the rapport at the archives is very, very important.”

“traveling to international archives, making connections to local archivists and librarians is critically important.”

“having a meeting with the archivist and librarian is really fantastic, because they help you understand what is in the archive, and what you might be able to use.”

“the publisher then digitized the entire collection. immediately, i went from traveling to see this material to being able to search everything from my computer. there were some things outside that collection i still had to track down. but, i didn’t have to travel, and it was available to me anywhere i went. i wouldn’t have finished the book on time for my tenure had i not had access to this online.”

the use of primary sources remains at the heart of the historical research method. all interviewees had done extensive work with archival collections – using physical and digitized collections – for a current or recent project. archivists emerged as critically important research support professionals, whose collaboration can be invaluable to a project.
the use of digitized finding aids, digitized collections, and digital cameras has altered the way that historians interact with primary sources. while the centrality of archives to the research process remains, the nature of interactions with archival materials has changed dramatically over time; for many researchers, activities in the archives have become more photographic and less analytical. there may be great advantages to conducting analysis at greater leisure outside a trip to the archives, but there appear also to be at least some important challenges to the researcher in redirecting a project mid-course and to the archivist in providing support when analytical work is displaced from the archives.

working in the archives

despite the wide availability and use of digitized primary sources, research trips to archives remain an important part of nearly every history research project. all but a handful of interviewees had recently conducted a research trip, or were planning one. for faculty members, trips generally did not extend beyond a month, though some had spent summer months, fellowships, or sabbatical time conducting research over longer periods of time. most, however, scheduled research trips during semester breaks and summer months, and they often struggled to find time for these trips. if domestic, a researcher might plan a series of trips to different archives, for various amounts of time, returning home after each. or, for either domestic or international research, an historian might take up temporary residence near an archive for extended use. the ability to carve out time for research trips was a primary challenge for most interviewees. interviewees repeatedly emphasized that the amount of time they are able to spend in the archives shapes the nature of the interaction with the sources significantly.
the consequence of shorter research trips is that researchers spend the majority of their time in the archives informally digitizing materials for later review and analysis. in some cases, the availability of existing digital resources – digitized collections, online finding aids, and digital secondary sources – allowed them to stay engaged with their research throughout the semesters and between research trips. the availability of these materials is a significant change, and a clear improvement, for most historians’ research processes.

historians approach research trips in a variety of ways. some plan focused research trips, with a prepared itinerary and a list of collections they know they are looking for. others take a more adventurous, exploratory approach; they start with one key collection of interest, and travel with the intention to solicit advice from local experts while in the area. depending on the topic and the location of the archives, particularly with international archives, it may not be possible to thoroughly plan a research trip. some historians are required to do more excavating than others due to the nature and degree of maintenance a collection has received over time. in some cases, interviewees reported working with collections that would be nearly incomprehensible to non-experts.

historians sometimes plan a sequence of archival visits within the research and writing process, with different trips serving different purposes. historians might go on a “scouting mission” early on in a project, visiting an archive of known interest to explore the holdings and judge how much time will be needed for subsequent visits. the use of online finding aids greatly facilitates, and sometimes displaces, these visits. if a “good” finding aid is readily available online, this might make a scouting visit unnecessary, depending on the importance of the archive to the research project.
in some cases, researchers were able to rule out a visit to an archive based on the online finding aids, and repurpose funds and effort to tracking down other sources for the project.

during the in-depth research visits, an historian will engage deeply and comprehensively with an archive, attempting to identify and capture all of the relevant material for the project. depending on the state of the archive, and the extent to which it has been organized and indexed, this may be a relatively easy or labor-intensive process. this may require multiple visits over a period of time, potentially years. during these visits researchers will work through collections methodically. initially, there is a process of identifying what sources are relevant. this vetting process involves finding aids, consultation with archivists, and combing through a collection or parts of a collection to gauge its relevance to the topic.

towards the end of a project, an historian might conduct a wrap-up visit. these trips are generally used to identify sources that are known but not yet gathered, follow up on earlier leads, or confirm citations and quotations before submitting for publication. of course, research is a highly iterative process, different for each researcher and project, and highly dependent on the need for travel and funding available for research travel.

e-archives

the digitization of primary sources and finding aids has shifted many aspects of the archival research process for historians. relatively few interviewees worked only with tangible primary sources. for some, working only with tangible versions of primary source materials was a preference and a habit. others, especially those working in international archives, felt that they had little choice but to use tangible versions, since their source materials are not available digitally.
on the opposite end of the spectrum, two interviewees had been able to complete all of their research for a project – even a book project – using digitized primary sources and avoiding travel. another historian reported having completed a recent book project using a combination of online resources and research assistants who visited archives in another country on the researcher’s behalf. the historian and the research assistants communicated regularly via email, and utilized digital cameras to capture archival content and share the images.

finding aids

online finding aids clearly offer scholars enormous benefits. as mentioned, the use of finding aids before visiting an archive can help a scholar prepare more thoroughly for the visit, and use his/her time most effectively while there, especially given limited travel time. most notably, finding aids were used in the prioritization of research trips, and allowed researchers to determine the contents of an archive before making a trip. most interviewees said they are not traveling less for research because of digitized finding aids and collections, but they have been able to travel more strategically. high-quality finding aids may grow in importance as researchers continue to see their visits to the archive as increasingly photographic and less serendipitous in character.

generally, historians discover finding aids through google searches and archive websites. the general consensus among interviewees was that more online finding aids would greatly benefit their research, and that archives should continue to make efforts to make these accessible online. continued and expanded efforts to develop finding aids more efficiently and to make them available digitally would seem to support the needs of historians for improved access.

(note: see for example the clir initiative on cataloging hidden special collections and archives (information available at http://www.clir.org/hiddencollections/). some archives have launched efforts to develop finding aids more quickly but less exhaustively as a starting point to increase access.)

research support in the archives

“you bump into an archivist who is interested in your topic and strike up conversation. […] they have an active interest in showing you more things than you were asking for.”

the role of the archivist is critically important to historians’ research processes. these research support professionals emerged as the primary collaborators and colleagues of the historians interviewed; they are often intimately involved in helping scholars achieve their research goals. some interviewees discussed directly the importance of cultivating a relationship with an archivist early in a research project, in order to facilitate access and support when visiting an archive, or in requesting digital copies of materials. because these archivists are typically deeply knowledgeable about the content of their collections, and have their own networks of research support professionals, they are well-positioned to connect history scholars to additional resources. as noted above, many interviewees rely on archivists to inform and direct their research practice, and they often see them as a primary supporter and teacher when it comes to working with primary sources.

from the interviews it was clear that archivists’ deep knowledge of the collections they work with and understanding of related collections is of tremendous value to historians working with primary sources. archivists are often able to hone and direct an inquiry, bringing to light items and collections that the researcher may have been unaware of.
the archivist is seen as an expert and a partner in the discovery process, providing a gateway to access for collections that are often described as “hidden.” the moments of discovery that scholars share with archivists were described by historians with delight and gratitude.

the archivist is also critically important for scholars who cannot travel to an archive. interviewees reported relying on them via sometimes extensive phone and email exchanges. historians would engage sometimes at length about their research project, and the archivist would suggest materials, and prepare and distribute digital copies. this type of long distance relationship has been critically important for those who cannot travel, and provides access to collections that would otherwise be impossible.

capturing primary sources

“they weren’t open about this on the page – you can bring your scanner! i would have had no reservations to scan everything i looked at. i took really good notes, copied really good stuff. but i might want to see it again later. without going back to the archive. seems silly to do the work twice. scanning lets you do that.”

“i just took pictures. i haven’t even gone through them yet. i just photographed everything in that box. […] i only had a certain amount of time. there’s not time to reflect too much.”

“i would just go in and photograph like crazy. then i would sort these out. i would go through a series of files and figure out what were the titles of the works i had just been looking at – and then i would just rename the files so i would have the titles. then i have another system where this is hooked up to a larger bibliography, where these letters are tied into a form i can retrieve.”

“i’m not using a digital camera. i’ve tried it and abandoned it. if i don’t process it [photographs] then, annotate, decide what’s important, it just goes into a big pile that never gets figured out. you don’t know what you’ve got at the end, and you have to essentially go through it all again. it becomes hard to process it later.”

the widespread use of digital cameras and other scanning equipment to capture source materials is perhaps the single most significant shift in research practices among historians, and one with as-yet largely unrecognized implications for the work of historical research and its support. capturing source material in a way that facilitates continued access to the intellectual content over time is essential for historians.

researchers have had a variety of methods available to them for interacting with and capturing the content of archival materials, a process at the heart of the historical research method. note-taking, microfilm, printed volumes of primary sources, photography and scanning are options that have long been available in most archives, depending on the material in question. transcription remains an important part of the research method for many historians, and they reported spending hours in an archive taking notes by hand or on computer. in some instances – though rarer by the day – transcription is the only option available to archival researchers for capturing the content of the sources. this may be done by hand, on paper, or using a laptop.

the most notable development in capturing primary source materials is the now widespread use of digital cameras in the reading room to photograph sources. many interviewees reported using digital cameras in the archives, and found them to be incredibly beneficial in terms of efficiency and convenience. scholars were able to spend time in the reading room photographing the collections, and would often postpone viewing the images until they returned home from the trip. this was notable in that some historians reported that they no longer engage intellectually with the sources while in the archive; these trips have become more of a collection mission.
some felt that this convenience enabled them to conduct their research amidst the many demands of academic life, and were thrilled to be able to interact with their sources from their homes or offices, rather than having that activity relegated to a few days or weeks in an archive. this allowed them to engage with their research throughout the year in a completely different way than before. it was clear that the influx of digital cameras in reading rooms is changing the nature of the research visit for many historians. it is important to note that the quality of digital images and the availability and use of high-resolution, large-format screens were key factors making possible these new approaches.

many archives have long offered reproduction or scanning services, sometimes at a fee, and the introduction of self-service high-quality imaging has in some cases reduced this source of income. in at least one case, an archive has elected to charge scholars for the right to take their own photographs, perhaps at least in part to retain this source of revenue. interviewees consistently argued that more archives should allow and facilitate their ability to photograph the collections, in a variety of ways.

(note: while some might call into question the role of these existing services, at the same time their professional quality has been vital to imagery reproduced in monographs and journal articles, and they can at times serve as a source for the development of digitized special collections (in a way that individual digital cameras might not serve as well).)

some historians hope that their own digitization work can contribute to more content being made available for both the public and other scholars. in one case, a scholar noted that he was scanning material from a small local archive that had never been scanned before. he intended to provide the archives with copies of everything he had scanned, so that future scholars might have improved access to the material. the value of such contributions, even with the potential development of protocols to guide their development, is not entirely clear.

while the use of digital cameras is a significant benefit for scholars busy with professional and personal commitments, their use also presents some challenges. the ability to organize and access photographs in a constructive way after a trip is a sticking point for many of those who worked with digital cameras. because the digital images are typically jpegs, there is no metadata inherently associated with the file that relates it to the content of the image. scholars rely on complex file structures and good memories to access their files once home from the archive. one interviewee includes call slips in her photographs, which state the name of the archive and the collection, so that she can always orient herself to the source. again, the displacement of the intellectual engagement with the material appears to have some downsides, given the lack of tools or software to facilitate the process of capturing and using digital photographs for scholars. scholars also reported the challenge of integrating the images with their textual notes, which adds another layer of format types to the mix. these digital photographs clearly add value to the research process, but working with them effectively and efficiently remains a struggle for most.

in one notable instance, a scholar was able to conduct research remotely, working with research assistants near the archive of interest. the research assistant would photograph the requested materials, and email the files to the researcher, who was then able to review them and request further files for photographing. the entirety of the primary source collection was reviewed in this manner, and the historian used this research for his monograph.
it is not yet possible to predict if this type of development is the logical outcome of vastly improved finding aids and analytical practices displaced in time and space.

international archives

“i take my laptop and my camera. i can take photos for free in france. but italy charges me a lot to take my own photographs.”

many interviewees were traveling to archives outside of the united states, which presents a range of challenges, from language barriers to differences in organization and access. in some cases, historians are using well-maintained, well-catalogued collections at large institutions like the british library or bibliothèque nationale de france. in others, historians are hunting down and weeding through local archives that may never have been formally processed or accessed by a researcher previously. for some historians, sorting through a relatively disorganized, unprocessed archive adds to the adventure of the research process. however, using an unprocessed collection does require different preparation and different approaches once at the archive. while most interviewees did not say that working with unprocessed materials was an insurmountable challenge, it was clear that further training would be beneficial for some researchers in ensuring their ability to work with all types of archives and sources in diverse locations and conditions.

working with non-text formats

“video clips ruin everything. they’re so huge.”

“it’s just thinking through how the digital makes it possible to ask different questions. how it shapes what comes across. the extended mind. artifacts enable you to extend what we know.”

a number of interviewees discussed the use of non-textual (mostly digitized) formats in their scholarship, and the challenges they are facing in working with them effectively. primarily, historians were discussing the use of primary source material in non-text formats such as video, audio, websites and video games.
these types of artifacts have long been used as a source of content in history. overall, there was consensus that it is easier to locate, access, and work with digitized materials than ever before. in some cases, this availability has fundamentally changed the research process for scholars; one discussed how a mass digitization of government audio recordings and their availability in the public domain have shaped his career and his research. however, some barriers to working effectively with media sources still exist. in some cases merely capturing this content for viewing and analysis is a challenge. some materials are available only in archives, and cannot be copied. in some cases, as with websites and video games, there may not be established ways to capture, present, and cite these materials within the academy. and, as these particular types of materials are not associated with an institution or archive, there is no support for working with them in a scholarly way. even with advancements in access to digital video online and affordable storage options, working with video files can still present challenges to scholars who depend on media. some scholars who have an interest in new media sources also expressed concern about these sources being taken “seriously” as artifacts, within the academy. discovery “it’s overw helming , k nowing how mu ch inf ormation is available to me now, and how much has been p roduce d in the la st years. my rea ction i s that it’s intimidatin g to have this mu ch informatio n readily ava ilable.” “the bottle neck used to be acces s to in formation. that’s not th e case to day.” “i was also a ble to do very broa d sear ches that would have take n yea rs of a ctually di ggin g through the newspapers to fin d o bscure re ferences to [my topic]. 
so that is where i think i first started to use digital sources as a genuine research tool, rather than as a teaching tool.”

“it’s nice when i can find a database […] where i can enter in key words and start coming across this material. but i am not all that comfortable with that kind of system – in that sense i am pretty old fashioned. i still like reading through, understanding that i might be limiting my search artificially with a narrow search term.”

“that is what needs to happen, this is very important. we do not have a centralized clearing house that can indicate to us what digital collections are out there. you have to use your intuition and go to certain kinds of institutions, and there are some publications but they are very erratic in what they have in them, and what they describe.”

discovery is an essential part of history research. identifying sources - both primary and secondary - on a variety of topics is part of a scholar’s daily work. the process of locating sources for history research is understandably different for primary and secondary resources. few interviewees reported any challenges locating secondary sources, for which they make extensive use of search tools provided by their campus libraries, as well as the open web, although achieving comprehensiveness is often a concern. locating primary sources presents a much more significant challenge.

finding primary sources

“well, i go online and i search through the various databases and catalogs. for example for the records of the [archive], i’ll search electronically through [their] database to find the records i know i’ll want to look at, and then i’ll go to the [archive]. that is a case where there are still paper catalogs that have more complete information and so i will look at the paper catalogs as well.”

nearly all historians are engaged in a continuous search for primary source material relating to their research topic.
the range of institutions that they work with to identify relevant resources is vast and varied. historians know no bounds when it comes to finding primary sources, and they work with archives at academic institutions, independent archives, and local, state, and national archives, depending on the topic at hand. researchers typically develop a deep knowledge of the primary source collections available to them on their particular topics. in some cases, the historian may be the expert in what sources are available, with intimate, comprehensive knowledge of the archival holdings at multiple institutions. these scholars are often seen as a resource for others in their field, and other historians will rely on their network of colleagues to assist with identifying relevant primary sources for their research. sometimes, these networks are built through interactions among scholars at an archive. a handful of interviewees reported reaching out to well-known scholars in their field – perhaps someone they’ve read and respect – to ask advice on using an archive or locating sources.

typically, historians reported traveling to the archives they were working with, with a very small minority relying on local resources. none of the scholars included in these interviews were actively using the collections held at their local institutions. for the most part, scholars indicated that they had explored their campus special collections holdings upon arrival, taking note of relevant and potentially interesting sources. however, they generally have to look much farther afield for primary sources, and the campus collections are not a primary resource. of course, in some cases, historians were doing locally-oriented research. this might be due to naturally evolving interests, or may be an adjustment of the scope of the research due to lack of funding for travel.
the “open” web is often the primary search tool for locating archival collections that are held by independent organizations or government offices. learning the networks of organizations related to a topic is a central part of the discovery process, and the open web has become a ubiquitous, enabling tool for historians. historians reported needing to be creative with their searching; they must consider many different search terms as well as organizations that might hold relevant records. outside of collections held at universities or independent research organizations, finding aids or collection descriptions are rarely collected into searchable databases, and it is still necessary for historians to locate each collection independently. this lack of collocation and collection presents efficiency challenges and deepens scholars’ concerns about comprehensiveness. the anxiety over “missing something” was quite common across interviews, and historians often attributed this to the lack of comprehensive search tools for primary sources.

finding secondary sources

historians use secondary sources in a variety of contexts. they use them early in a research process, especially if they are exploring a new field and require orientation. they also keep up with the current research in the field through a variety of mechanisms, including journals, publisher catalogs, and book exhibits. some interviewees reported that not only reading but also writing book reviews is a valuable way of staying engaged with new publications in their field. for the most part, historians did not cite challenges with discovering or accessing secondary sources, with the only issues reported at institutions where journal subscriptions were somewhat limited. the campus library is the primary resource for gaining access to secondary resources, but historians do not limit their searching to their own institution.
when a book or article is not available in the local collection, interlibrary loan (ill) will provide access. historians consistently praised their library’s ill services, and it was clear that these were integral in gaining access to secondary sources for research. in addition, when scholars cannot get access to a particular item, they often turn to their network of scholars, who may have access to a resource at their local institution and be able to share it with them. where it can supplement the resources available to them from their home institution, historians will take advantage of any local libraries that may have relevant collections - including public libraries, independent organizations, or other higher education institutions, as noted above.

keyword searching is a primary mechanism – indeed a ubiquitous practice – for discovering secondary sources in the context of a research project. some interviewees expressed concerns about limitations of keyword searching, recognizing that the corpus of materials available to search is not, in fact, comprehensive. however, these concerns do not deter researchers from using the tools. many recognize that their search methods shape their work by defining the collections that they access. one historian noted that this is not necessarily different from previous practice, “pre-internet,” where a scholar would access a limited set of archives, and base the argument on the resources held in those collections.

another important discovery mechanism is following citation trails. this is especially important when familiarizing oneself with a new area. one researcher described a typical search strategy:

“i use [my campus] libraries. and, their interlibrary loan service. i also like to see snippets of something obscure on google books. then i’ll go to the [campus] library to get the book itself. if it isn’t there then i’ll go to ill, or maybe worldcat. interlibrary loan is pretty good.
sometimes i can’t find something i know is there. i’ll search through jstor, worldcat, archive grid or archive finder.”

this example, typical of many interviewees, indicates that historians actively engage a wide network of search tools and services to address their research questions. the campus library, google, and other search services are part of the daily search routine. the “open web” is a valuable tool that brings special collections, commonly not found in a catalog or database, to light.

it was also clear that digitized secondary sources have been widely accepted among historians, and nearly all interviewees reported using such resources. while it is still the case that the majority of interviewees would seek a print copy of a relevant source, the use of digitized texts – books, book chapters and articles – was ubiquitous. historians cited the benefits of their ability to preview “snippets” or sections of a book in order to determine relevance before getting the book. in some cases, historians were working with the digitized text, taking notes or copying out passages, just as they might with a print text.

exploring new topics

“for instance, maybe i have become interested in some topic or some figure, and i am trying to understand whether or not someone else has written about this person or issue. usually, with some kind of keyword searching you can get a sense of whether or not it appears in some other book.”

“[…] about something i am interested in that i do not know much about. i will go to google books and i will type in a couple of key terms, and see what else turns up. often that will direct me to a couple of other titles, and that will direct me to some footnotes from somebody’s book that is worth looking at.”

historians said it can be challenging to identify primary and secondary resources in new topical areas, particularly at the beginning of a new research project.
after having developed deep, comprehensive knowledge in one, typically narrow, area for a dissertation or monograph, diving into a new, unfamiliar topic can feel daunting. not only do researchers need to identify specific resources to address their questions and support an argument, but they also may need to familiarize themselves with a new sub-field of history or work from another discipline. historians often need assistance orienting themselves to the resources available on a new topic, both primary and secondary. again, many scholars rely on citations, general web searches, and subscription databases when exploring new topics. few reported working with a librarian in these instances, and some rely instead on colleagues. in general, exploring new topics was reported as one of the most daunting aspects of the research process for historians.

google

“google is the first port of call.”

“[…] a lot of times i will try to just start with a google search.”

“[google books is] also helpful at the very beginning of a project, when you are not quite sure what sources you are going to use. or you want to do a massive scan using keywords. i never did that until recently. […] i started just in google books, searching for that phrase or related phrase. this has been the most fun part about it; searching digitized books, the full-text for [the] phrase. it’s been so great for my research; there are so many ridiculous things out there.”

“even some pretty obscure things have landed in there [google books], and it’s made things a lot easier. because if they are in the period, they are public domain, and i can just download them and use them at my leisure. or search them… now that is a big change! i can’t even imagine, i cannot even remember… being able to do key word searches, within pdfs of books is awesome.
that’s what i would say, more of that please!”

there was extensive discussion with interviewees of google discovery tools, including the general google search, google books, and google scholar. while most historians recognize that google has limited access to materials - it doesn’t actually search “everything” - it was generally seen as the most comprehensive discovery tool available for certain types of searches. google discovery tools’ convenience, ease of use, and overwhelming scope of searchable material clearly outweigh the limitations of their search. historians seem to be savvy users of google. when discussing google, one interviewee noted “technology is not a substitute; it is a supplement.”

interviewees use general google searches to start the discovery process. for many of them, google is the primary search tool in identifying archives that hold relevant materials, as information about archival collections is nearly always available on the open web. google is recognized as a tool that has expanded the breadth of types of materials that an historian can access on a given topic, and that can introduce a researcher to collections that they were not aware of, even after years of working within a sub-field. several interviewees noted that they had recently found sources that they would not have been able to identify without google. one noted that google has been particularly useful for accessing digitized local newspapers, which have become a “rich resource” for his scholarship.

note: there was strikingly little discussion of google scholar. it was mentioned as a resource by a handful of interviewees, but there were no trends or notable significance placed on this tool in the interviews.

note: there was no discussion of the google newspaper digitization project, directly, in the interviews. see “google ends newspaper digitization project,” by greg landgraf, american libraries magazine, may , . http://americanlibrariesmagazine.org/news/ /google-ends-newspaper-digitization-project

interviewees widely acknowledged google books as a valuable tool for their work. nearly all of them mentioned using it in some capacity, and were enthusiastic about the perceived convenience of the search tool. for some sub-fields, particularly those focused on historical periods that are pre- , google books can be a centrally important tool for accessing primary and secondary sources for research, and some interviewees reported using it extensively. google books is also valuable in orienting scholars to a new field by helping them identify sources and gain access to a network of citations. many scholars mentioned that even the previews in google books, for those that aren’t available in full text, are valuable in helping them understand whether a source is worth pursuing. some researchers also use google books (and one person, amazon.com) to check citations when doing bibliography work.

the full-text search functions of google books are a huge advantage to historians. one interviewee spoke about her use of google books:

“being able to search for a particular word that i’m interested in is so much more powerful than searching in a library catalog. it’s not in any title. it’s not in a subject term. everything in my field is out of copyright and digitized. it’s all there. i feel like i’m cheating half the time. knowing who the current scholars writing about this are, past scholars, and primary sources of things that mention this world. it’s made it so easily accessible.”

interviewees reported using google books to identify resources that they want to access in print, through their campus libraries. they will typically use google books to explore a topic, and then use their local library discovery system to locate a known item or request the item through ill.
some scholars even mentioned using google books to search texts that they own in print copy.

the full-text search capabilities that google books offers historians appear to have had a profound effect on their research practice. many interviewees shared their perspectives on the incredible value of being able to search through a digitized text, and compared that experience to using a print version (in many cases, they had used both the print and electronic versions of a single text during a research project).

“it is a trade-off. a trade-off between convenience on the one hand; or more importantly, that ability to search. and, it is that searchability that is so brilliant, compared to the tactile joy of holding the manuscript. on balance, i would much rather have accessibility and searchability.”

a number of interviewees shared that they use google books during the writing and editing phases of a project to confirm quotes and citations.

historians working on international topics noted limitations of the corpus of foreign language material available on google books. in these cases, many continue to rely on subscription databases that provide access to collections of foreign-language materials.

secondary sources and research support from libraries and librarians

interviewees were asked about the role of the academic library and the services that it provides in supporting their research. while the interviewees were enthusiastic about their campus libraries, it became clear that these libraries are not deeply embedded in the research processes for most historians. of course, the interviewees are regular users of the print and online library collections. outside of the collections, interlibrary loan was the most commonly used formally defined service. historians reported occasional interaction with reference staff in their research projects, especially as they examined new areas of interest, but an inability to rely on librarians for detailed help in a given sub-field.
historians also reported using a wide network of libraries in their local area, and were not solely engaged with the campus library; they make use of all local library collections that they can access, including public libraries and other university libraries.

these interviews did not cover the support that the library may provide historians in their instructional roles or for their students in supporting academic coursework or critical thinking and information literacy skills more generally, services that are known to be important priorities for many academic libraries but about which no findings can be drawn from the research for this project.

working with librarians

“i talk to the librarians when i’m looking for something outside my comfort zone.”

“she’s very good at pointing out online resources that i haven’t considered. but, doesn’t have the subject knowledge of recent books in [my subfield].”

“the history librarian is a [specialist in a particular subfield]. i could have worked more closely with her, but i didn’t feel like she would know about my subfield.”

“i would say [i get] half [of the books i need for research via] ill, and the other half i am purchasing for myself.”

“[my institution] is very small; only , students. so, their library is very small. but, i live in [a nearby city]. that [has] a gigantic library, so i just treat that like my research library. that was one of the big attractions of the job, that it was still in that orbit. at this stage in my career, feeling secure that i have access to that tier of library material.”

some research support professionals are eager for collaborative relationships with faculty members, so this was one possible role explored in interviews. while it was clear that the historians interviewed held their campus libraries and research support professionals in high regard, the extent of their collaboration with them on research projects was rather limited.
they usually knew their campus subject librarian by name, and generally felt that they had a positive relationship with this research support professional. however, when asked when or how they work together, nearly all interviewees cited teaching support, rather than research support. when asked what the librarian’s role was in a recent research project, some simply said “none.”

at the same time, it is important to distinguish a collaborative role, which was not recognized, from a support role, which in some cases was valued. some interviewees noted that they have worked with a librarian to identify resources in the library collection (often subscription databases) related to their current research project. one interviewee recalled seeking the assistance of a librarian in locating a particular type of map; unfortunately, the librarian was unable to find the item, and the researcher then planned to go to an archivist for further support. a handful of historians also mentioned working with the librarian on search strategies, and two mentioned going to a gis librarian for gis support.

one interviewee noted that the history subject librarian on her campus holds a phd in the field, and therefore “knows us well intellectually.” for researchers in some sub-fields, and particularly area studies, there may be, from their perspective, no subject specialist on campus with the domain expertise to support them. specific expertise is valued, but in some cases the perception has emerged that the librarian lacks needed subject expertise. in addition, some interviewees experienced frustrations with interactions with library staff or archivists, including lack of timely communication, difficulty communicating, and inability to provide assistance or referral.

this section of a transcript provides one illustration of a relatively engaged relationship between an historian and the campus library, according to interviewees.
interviewer: does your campus library have a role in your research?

historian: yes, we have digital databases that i use. we have very good interlibrary loan facilities which are very important. [my institution] is also a member of the center for research libraries. the crl has an enormous range of stuff, much of which has been microfilmed. they are also digitizing it more and more. so as a member of the crl you get access to their vast holdings, which cover virtually every country in the world and every time period – it’s amazing.

interviewer: have you worked with any of the librarians on campus?

historian: oh yes. because they are trained as librarians they can think of search terms, or ways of searching that i – i am not trained as a librarian, so i don’t. so yes, definitely the librarians are crucial in the whole research process – both at [my institution] and wherever i go.

interviewer: at what point do you talk to the librarians?

historian: dead ends.

interviewer: at dead ends?

historian: yes, i share my frustrations with them and ask them to help me get out of the cul-de-sac.

interviewer: so if there is something that you cannot find, that’s when you go?

historian: yes. i know that somehow, somewhere it is there, and i just need to be able to find it – that my searching isn’t being as efficient as it ought to be.

interviewer: do you ever talk to them about the overall process of research and writing?

historian: no, not really. the interaction tends to be the other way, they receive invitations to look at possible research databases and they will send those invitations out to us and ask if we think this is something we should pursue. then if we pursue it we will have maybe a two or three week window to use that collection and then at the end of that window the members of the faculty will recommend whether we should subscribe or not.

one interviewee claimed that campus library staff were ill-equipped to handle interdisciplinary research.
as subject librarians in research libraries are typically most familiar with one subject area, such as “american history” or “women’s history,” scholars who are engaging multiple fields and drawing on sources across topical areas often lack a single point-person for research support in the library. one scholar expressed his struggle with finding research support for interdisciplinary research:

“people whose books are all adjacent to each other in the stacks have a better relationship with librarians. rather than my multi-disciplinary topic. […] the way i frame my questions… there’s no question that will be answered by a single collection.”

if more phd students and scholars take on interdisciplinary topics, there may be additional challenges to providing research support, in terms of content expertise, to such researchers.

collections

it was clear from interviews that campus library collections were the most frequently used library service among historians. all interviewees cited their access to their library’s collections for printed primary sources, secondary sources, and electronic resources. interlibrary loan services were the second most frequently discussed and valued library service. only a handful of interviewees mentioned requesting that the campus library purchase a title or subscribe to a journal or database to support their research.

in general, if a library offered an on-campus delivery service for print collections, historians were using it. while they may disclose that they “miss” going to the stacks, convenience appeared to win out over the value of browsing, according to these interviewees. moreover, libraries’ approaches to collection management did not evoke significant complaints. historians interviewed expressed little to no concern about value lost in working with electronic secondary sources. interviewees consistently stated that they use electronic secondary sources, and that it was convenient and efficient to do so.
there was only one mention (in thirty-nine interviews) of frustration with portions of a physical collection being moved to offsite storage. overall, it was clear that these historians have accepted and adapted to the evolution in collections, and are benefitting from electronic collections in the same ways that other disciplines report.

a network of libraries

as mentioned in the previous interview transcript, historians reported using a network of libraries in addition to their campus library. most will patronize any library that they have access to, including those of other colleges and universities in their local area, as well as public and independent libraries. interviewees reported great awareness of the breadth and limitations of the collections at their local institutions, and a willingness to look beyond the campus to access the resources they need for research. in some cases, another library is simply more conveniently located, especially in instances where faculty commute to and from campus (sometimes between states).

because scholars use a number of libraries, academic libraries are likely providing research support services to faculty from other institutions with all of their materials, not just the rare or unique materials. it was clear that history researchers are not solely reliant on the campus library for access to collections or research support services. one interviewee at a liberal arts college noted that he uses a nearby research library at another academic institution “all the time”; its proximity even influenced his decision to accept his current position. a number of interviewees from teaching-focused institutions discussed the limitations of their local collections for research, and their dependence on other sources, including their network of peers, for access to research materials. again, historians cast a wide net when searching for materials for their research.
organizing sources

“a huge problem has been organizing the material i’ve found. i’ve accumulated a huge amount of information.”

“once it’s organized, it’s up to me to think about it and write. but i do resent the time that’s spent organizing and managing everything.”

“i realized that i was repeating myself. i had already taken notes on something, but it was in a notebook and i didn’t realize. i need everything to be in one central place.”

“[…] it’s just the sheer amount of information one tries to deal with. it’s really too much.”

“i have taken so many photographs, and they are in order, and they are in order in my paper notes, but i have not had time to go back and actually code and organize all of them. i have started, i have these excel spreadsheets where i try to fill in information – then i keyword tag in that.”

researchers widely and consistently reported that managing analog and digital research notes and sources is a primary challenge for them. collocating and accessing research notes, and relating them to the writing in an effective way, is an organizational challenge, especially for large book projects that can last multiple years and cover hundreds, if not thousands, of resources. and yet, this is perhaps the most tangible component of the analytical work conducted by historians.

research notes and their management

no one approach emerged to organizing research notes, physically or digitally, and it was clear that this is another part of the highly personalized research process for historians. early on in a project, interviewees reported using a number of different, mostly folder-based, approaches to organizing content, where topic or author were the dominant criteria. most interviewees, when working on monograph projects, organized their material according to chapter. the idea of the chapter, and the argument that it contains, provided structure for many scholars who were organizing their information.
one even stated “it’s not like i can go to my notes from [my last] book, and put them together in a different order and write a different book. they were created with a goal in mind.” this strong tie to the structure of the book exerted a lot of influence on the act of organizing sources and notes.

in numerous cases, interviewees demonstrated their organization processes by showing the physical and digital “piles” of sources that made up a chapter. many scholars had stacks of paper notes and sources organized by chapter. in one case, an interviewee shared the bookshelf on which he kept his last book, with each chapter’s sources sorted neatly into piles and labeled. these processes and organizational structures were also evident in the digital work flows and file structures that interviewees have put in place. historians want the digital environment to enable their physical and intellectual processes of sorting through materials, understanding their content, relating it to their narrative, and shaping it accordingly.

the chapter number or name was in some cases used as a “tag” in note taking to indicate the concept or section to which a particular source would relate. but it is clear that digital systems do not address the needs of even those scholars who seek to use them. one scholar’s process for collecting and organizing source material incorporates a database to capture passages and collect notes. from the database he then prints each note or quote onto an index card, and the words are then organized into chapters. he manually reviews the stack of note cards for a section of a chapter, arranges them into a narrative, and writes from this tangible tool.

historians reported a myriad of approaches, processes, and tools for addressing the challenge of research notes management. this process was highly personalized, as was the case for most of the research process for historians.
one interview excerpt illustrates how a scholar approaches research notes management:

“if i come across a book, and i don’t need it right now, but someday i might, i put it in the bookends database. i have about , sources. it’s not good for primary sources. it’s hard to explain. the citations are so inconsistent. it’s haphazard. filling in all the fields; it shows up funny. i keep them in an excel spreadsheet for the primary sources. i started using excel, and each document would get a number, and i’d save it that way. so if my spotlight [mac operating system] search, the title wasn’t coming up, i could search for the number. i don’t know why i file things in folders anymore, because i just search for everything. i just started naming documents with spotlight in mind a couple of months ago. i have longer file names now, so it’ll come up right away. if it’s a piece of writing, i’ll put lots of keywords in the title of the file, and i can always find these files.”

note: bookends software, http://www.sonnysoftware.com/

another reported advantage of the comprehensive operating system search functions was the ability to not only search across documents, but within documents. so, in cases where the scholar was adding metadata – such as key words – to a document, spotlight would be able to find them. this was clearly a powerful tool for those who were using the search functions in this way, and eliminated some of the challenges of organizing and accessing documents from multiple stages of research. while some interviewees reported using a database to organize and access their research materials, the operating system’s file search functions seemed to supersede this practice. searches within microsoft word documents also allow scholars to identify content by keywords. scholars are now amassing incredible personal libraries of digitized material, alongside the content they are producing as part of the research process (notes or writings).
note taking took many forms for interviewees. some continue to take notes by hand, some in word or excel documents, and a few reported taking notes in a database or other software tool. some archival reading rooms do not permit the use of computers, and thus scholars who may prefer to adopt a system for note-taking must continue to take notes by hand. some scholars who have worked in the field for a number of years said they feel a bit “behind” in their approaches to note taking and research notes management, preferring to stick with time-tested approaches. newer scholars who take notes by hand referred to themselves as “old fashioned”; however, taking notes on paper is a prevalent practice across generations and sub-fields. nearly all of the interviewees had some combination of paper and digital notes and often lengthy processes for re-writing and organizing these notes. often, this is a tactile, physical experience. some interviewees demonstrated how they like to rearrange information, sort it, and organize it into the conceptual tracks that will become the book project or dissertation. for some, the visual and physical elements of doing this with paper, rather than digital, remained important. however, there are emerging approaches for doing this digitally with tools like scrivener, which allows scholars to work with text and image sources in a flexible, visual way. more research on how these types of tools might be applied to the historical research process is needed. as some of these approaches to research notes management emerged in interviews, it was clear that although most struggle with this process, it is not addressed in a formal (or informal) way in the education of historians. again, historians are expected to develop a personal approach to this process. in most cases, they will rely on their peers – from their dissertation cohort – for tips and tricks on how to get organized and work productively with their sources. 
in some cases, interviewees mentioned observing how their advisors had approached organizing and writing up and how that shaped their own work, even, in some cases, many years after they had been students. while some interviewees noted that they have picked up tips from colleagues, often in their department, there was also discussion about the lack of awareness of the research process in discussions between colleagues. some recognized that they “have no idea” what another professor might do to organize sources and notes, despite the fact that this may also present a significant challenge for them personally. strengthening the network among scholars, and providing opportunities and forums for scholars to discuss their personal approaches, could be of great benefit to the community. (it is important to note that one interviewee expressly stated that she prefers to work with all materials in digital form, and will digitize paper sources and notes. this was partially informed by her travel schedule – both personal and for research trips. due to the frequent travel, and distance from her physical school and home, digital materials were best for her to work with.)
several phd students explained exactly what they would like to see in a comprehensive tool:
“i think one thing that would be really helpful [is to] have something that would be a comprehensive – maybe software – that could keep all of these disparate notes that i have. field notes, archival photos, and organize that in some fashion, and keep it all in one place. a systematic research tools for people who are doing multiple types of research. i’ve organized it as best i can at the moment. but it’s still a lot of searching in a lot of places. and, i’m using a few methodologies. it can be really confusing; trying to organize all of this information and pull it all together.”
“i think it would collate the different kinds of materials in a way that i could access them.
like fourteen different screens, each of which contains a subject [topic area]. i could go to one screen to find everything i’ve collected on that topic, and it would have the citations for where i’d collected each one. the organization of materials. in one place.”
citation management
“i should learn how to do this. it’s lazy, really. maybe later. it’s a waste of time to re-write these references over and over. it would be nice if it would just appear automatically.”
“i have tried but that takes too much time. it takes too much time. i will put time into setting that up and getting that going -- but all i’ll ever do with it is work on getting it set up and getting items into it. not once ever using that either to locate anything.”
“quite frankly when i am referring to archival sources, there is really not a stop form for that – at least not in turabian. there is such a wide range of style expectations in journals and other kinds of publishers that i might as well not worry about coming up with a standard way to refer to that.”
“i’m afraid it’ll take more time for me to figure that out, so it’s not worth it.”
citation management, the work to track the sources that comprise one’s bibliography, is a laborious but vital process for historians, one that ensures integrity of the research output. citation management practices varied dramatically for interviewees, and are often dependent on the scope of the project. given that citations refer to the same materials as the research notes discussed in the previous section, it is important to underscore that citation management quite frequently comprises an entirely distinct process from research notes management. for dissertations and monographs, citation management was a significant aspect of the work, and required a more systematic approach. for smaller projects, historians report that citation management does not warrant significant time and energy.
many researchers choose to manage citations “by hand” because of the complex nature of their primary sources, which are not sufficiently well-addressed by many of the available citation management tools. overall, there was very low adoption and application of citation management software among interviewees, which was reinforced by the questionnaire responses. (see appendix d.) operating in both the digital and physical worlds complicated the citation management process for many interviewees, just as it does the research notes management process. historians are aware of newer tools, such as zotero, but many of them reported frustration with these systems. nearly all interviewees reported that they have not been able to work as effectively with a citation management tool as they had hoped. according to interviewees, these tools require more time and effort than managing citations “by hand” for a given project. in the end, it was clear that historians prefer, as with many aspects of the research process, to handle citations in a way that they have developed personally, have likely been using for a number of years, and that does not require the adoption of a new system or approach. consistently, the barriers to learning a new system, despite the understood benefits, were the time needed to learn the tool, the effort of importing any current citations, and the perceived limits on the flexibility of the systems to work effectively with primary source materials, unpublished materials, foreign languages, and non-text or media sources. most historians who have taken up new citation management tools seem not to be aware of the full capabilities of these tools. of course, there was a small handful of interviewees who have adopted new tools, mainly zotero, and are enthusiastic about the role these play in their research. (several of these interviewees were using zotero for research notes management as well as citation management.)
several historians who work mainly with published source materials – monographs and articles – viewed zotero as a very useful tool and had adopted it. although only one interviewee was using such a tool for primary sources, he was not only curating his own bibliographies of primary sources he had gathered in various small archives, but was intent on sharing these bibliographies freely online in hopes of encouraging greater usage of these materials. the ambivalence towards citation management tools was also reinforced in some conversations about students and teaching. it was clear that for some historians, teaching citation management approaches is not an active part of their curriculum. the expectation seemed to be that students would learn how to manage citations on their own, in their own way. “do you use a citation management software?” “i haven’t, but some of the students have. they like it.” “do you know what they use?” “no clue. as long as the end product is acceptable that is all that i care about.” “do your students learn citation management software?” “i don’t know if they learn it. i don’t think they learn it. there is no formal place where they learn it.” “do you think they use it?” “some probably do, but most i would say do not.” again, this was another assertion of the perception of the highly personal nature of the research process for historians. some of the newer tools have expanded functionality, combining citation management and bibliography creation with certain research notes management capabilities. in theory, these tools should address many of the reported unmet needs of historians. some of the needs that may not be as fully addressed as they could be include the ability to work flexibly with certain kinds of archival and other primary source materials; and the challenge of organizing materials, including both research notes and primary sources, in the analytical work to outline, organize and develop a manuscript. 
still, lack of awareness of newer functionalities is clearly one of the key barriers to adoption among historians, raising important questions about how best to ensure that historians have an efficient, effective way to identify and learn to use new tools that support their research practices.
digital research methods, collaboration, and communication
“increasingly, i am interested in how this profession interacts outside the confines of the academy.”
“i think there are some of my colleagues who couldn’t care less. and, indeed, find this [digital scholarship] to be a colossal waste of time.”
historians’ engagement with digital scholarship comes in many forms. some interviewees were engaged in using digital research practices and sources, which were discussed above, or communication tools, publishing strategies, or pedagogies. this section examines digitally-driven research methods, as well as the collaboration and communications dynamics that are seen by many to enable, or to inhibit, the use of these new methods. many scholars who are using digital methods are self-taught to a great extent, and rely on a network of collaborators to provide methodological expertise or guidance. in general, the digital scholarship a researcher produces is most typically one aspect of the broader research project, and scholars continue to produce a monograph about the research project. one interesting trend that emerged was that many scholars who are engaged in digital scholarship consider themselves to be public historians. finally, researchers who engage with digital research methods or apply digital tools in teaching will likely continue to engage with traditional sources such as those available through archives. notwithstanding the excitement of the historians using digital methods, they constituted a distinct minority of the sample.
the sampling strategy attempted to bring together a representative body of those conducting historical research, but it was not random, so no attempt is made to estimate the overall breadth or magnitude of uptake of digital methods among historians. still, based on the sample of historians that comprise this project, it seems that the transformation of research methods is not the most significant or widespread development that new technologies have wrought on the field of history.
new methods
“my interest in assembling these resources is in the interest for others doing the same research. we’re past the point where we need to reinvent the wheel. no one needs to geo-rectify the same map […] twice.”
“it’s not about the visualization. the dissertation itself wouldn’t be a meditation on visualization and public history. it [the visualization] would be a tool i’d use to answer a question that might arise out of the research.”
one vital question for this project is the type and distribution of new research methods that are emerging in the field of history. gis and text mining have emerged as the two most prevalent technological methodologies. in most cases, historians working with gis had partnered with experts on campus, often in the library or it department, or sometimes with experts from another institution (for larger projects). locating gis data on which to base maps for analysis can sometimes be challenging. in addition to using gis in their research, some historians incorporate gis technologies in their courses as well. the library could be seen as a partner in this work, and also as a source of content, as scholars search for maps to scan and geo-code in order to work with them in gis. two scholars described their work with gis, and the support they received at their institutions: “it happened in fits and bursts. i got a grant, and hired someone from it. the first question was – where will we put it [the gis project]?
i didn’t want to put it on my webpage. so i contacted the library and we developed [its] role in supporting this type of work. i’m trying to work with them to establish a single campus or state-wide repository that would host and maintain gis data and metadata and make it accessible in one central place. their [the library] commitment is to the data maintenance and sustainability of the project. they’ll make sure it meets federal standards for gis metadata, the layers are updated, the software is updated.” “[…] now how do i analyze this? i wanted it [the map] to move over time. i got support from gis center […]. they gave me a book, and a computer. good luck! naiveté is a great thing. i learned some techniques. it took a while, and i spent a summer doing this, and being frustrated. i got something crude and i couldn’t animate it over time. this is very important for history. then i started working at the center for digital history. i got a grant, which was enough to build what’s on the website. a flash-based animation, which allows you to browse over time and space. it took a long time. but, it became really useful for my research. the movement revealed patterns to me.” while the scholars engaged in gis work agreed that this type of analysis allowed them to ask new questions in new ways and revealed new perspectives on their topics, one scholar noted particularly how time-intensive the project was. he went so far as to say he might not conduct gis-based analysis again, and felt that his book would be finished much more quickly without the digital work. in some cases, interviewees were planning to make their gis databases available publicly, presenting their work as a new tool, although through what infrastructure or organization over time was not always clear. one scholar noted, “i’m working with a colleague in […] to set up a gis database. once the project is finished all the data will be put online for public access. 
the public will be able to see images and take them apart, online.” gis work has also inspired some historians to look to other disciplines for data to inform their analysis. some digital historians are incorporating census data into their gis projects. this type of work can rarely be undertaken by a lone history scholar, and requires new ways of working collaboratively. the historians interviewed for this project mostly felt that the gis work was one aspect of their research, and would fit into their broader narrative and a monograph. these projects were generally not intended to replace monographs, articles, or other traditional historical works. gis does, however, add a valuable layer of interpretation to the work. text mining – searching across a large corpus of text or using tools like google ngram (google books ngram viewer, http://books.google.com/ngrams/) – is a significant new methodology in historical research, but it does not appear to be widespread. applications for the method remain unclear for many historians, and there were some concerns about the quality and scope of the corpus of full-text works available for analysis. outside of one scholar who was deeply and significantly engaged in this work, it was viewed more as an interesting novelty, rather than an immediately applicable methodology. some interviewees discussed visualization tools enthusiastically; there was much interest in this area but little activity. overall, the interviewees were not able to articulate exactly what types of visualizations they would benefit from, nor what types of content they were interested in visualizing. in some cases, interviewees indicated that visualizing spaces, perhaps beyond the ability of some gis programs, would be beneficial to their research and allow deeper analysis and understanding of their topic.
when discussing place-based historical work, one interviewee discussed his desire to create an enhanced “cultural geography of urban spaces,” in order to “visualize those kinds of realities.” history scholars reported a combination of self-directed learning and seeking support from colleagues and campus departments in adopting new methods such as gis and text mining. in some cases, the campus library or digital humanities center staff gis experts who are available to work with faculty on any number of aspects of the process. one interviewee noted explicitly that he goes to twitter and blogs to connect with the digital humanities community when he has a question about his work. those working with these technologies and methods tended to either rely on campus experts to contribute expertise, or had been trained over time on their use. many did not feel that it was necessary to become an expert in the method, and were happy to collaborate with others in order to apply these methods and answer their research questions. collaboration collaboration in digital scholarship can look quite different from typical historical scholarship. rather than sharing work between scholars who may each have separate content expertise, collaborators on a digital scholarship project will often have separate skill sets to contribute to a project. the historian typically holds content expertise, while collaborators are likely to have expertise on the particular technology tool or method that is being applied to the research. in some cases, larger teams may collaborate together, with a variety of experts supporting work. one such digital scholarship project at university of virginia included a history scholar, a gis professional, a project manager, and library staff members who could contribute collections expertise. this type of team is able to take a comprehensive approach to digital scholarship work, ensuring that the work is accommodated and supported. 
scholars working on digital projects didn’t cite collaboration as a challenge, per se, but did comment that it was a new way of working. in the history project interim report, research support providers had noted that there was a significant learning curve for some historians when starting to work with colleagues on digital projects. it is likely that scholars themselves may not be aware of the best ways to approach this work, and how to take full advantage of the collaborations.
audience, outputs, and credit
historians are interested in reaching a variety of audiences, including scholarly and public alike. this section reviews some of the ways in which historians are working to shape their outputs to engage with the audiences that matter to them, and some of the incentives that help to shape their choices in how to do so.
scholarly communication
“writing in small chunks and being aware of the audience along the way is better.”
“i have a book. maybe forty people have cracked the spine. but, the blog has tremendous readership.”
“open and free. you can download it; we have a podcast; you can print it. we are giving it away.”
“keep the dissertation off the blog, because that’s what people tell you to do.”
“i think of blog posts as the first stab at an article. historians are paranoid about putting things out there.”
“it [a blog] is a low impact, non-threatening place to put ideas. sometimes i get comments that are like “you’re wrong,” but i learn from those.”
“i use twitter a lot. it’s my virtual dh department.”
a number of interviewees noted that they have engaged new formats, outside of articles and monographs, for communicating their scholarly work using technology. blogging has emerged as a significant form for scholarly communication among some historians, and is seen as one mode of engagement with digital scholarship.
phd students and younger scholars reported more active use of blogs as part of their scholarly communication strategy. interviewees who blog do not view this format as a substitute for other formal publications, but approached it as a supplement and enhancement to their scholarship. (one graduate student said that she would like to have blogging count towards the dissertation.) some faculty mentioned that they are encouraging their students to blog about their scholarship, and to consider a wider audience than the professor or the class. blogging is seen by historians as a way to engage an extended audience (including non-academics), find a community, build writing skills, and develop ideas. while one phd student noted that he had blogged his dissertation, others reported being advised not to do this in order to protect their intellectual property. there was a feeling from most interviewees that blogs (rather than journals or magazines published on a blog platform) didn’t “count” as scholarship in the history community. one phd student blogged his thesis, among other things, and feels that this outlet has helped him connect with key scholars in his field. in some cases, blogging has been used to expose experiments with methodology and engage the community in discussing and improving techniques. this seems to be a significant change from the typical “lone scholar” approach to historical research. one interviewee shared his experience in using a blog to document a project “testing” new digital methods. his intention was to share his “test” with a community of interested scholars and get feedback throughout the project. he referred to the blog as a “lab” space. similarly, another researcher blogged throughout the process of applying a new method and shared results along the way. this led him to a relationship with a scholar at another institution, who is building on his model and using it for her own research. 
other historians had many different reasons for choosing to blog. one interviewee uses her blog for promoting her current scholarship. as an openly available publishing platform, blogs are networked with and indexed by other online information resources that are now part of the scholarly environment. by publishing work on blogs, academic scholarship is no longer isolated from “the rest of the internet,” in the words of one researcher. another historian noted that he had had experience contributing to an organized blog in previous years, but had not been able to prioritize that writing in light of other professional duties. this interviewee also mentioned that he felt that the blog posts needed to be “polished,” and were competing with other writing projects. yet another interviewee discussed how he had developed a blog to share supplementary material (based on digital scholarship) that relates to a recent publication; his book’s publisher was even aware of the site. in one example, an historian and colleagues in his department had started a monthly, online magazine. he and his colleague serve as editors and recruit authors to contribute articles. this initiative is supported, technologically, by an academic center on campus (although this is not through a formal service offering of the center). this scholar noted that the center has provided nearly all of the technology support, and he has been “shielded” from the necessity of learning that aspect of the digital work. online formats including blogs can help graduate students develop experience and gain exposure: “i get these books hot off the presses and i am able to get the reviews up – graduate students do the reviews. 
it’s a line on their cv – and i have told them that they will get more readers for their online book review than for almost anything else than they will ever publish in an academic journal.”
public history
“public history is the bridge between the ivory tower and the public’s learning about history. it presents a great opportunity for the library and archives.”
“i am invested in reaching a variety of publics.”
“we have to do more of our work in public, where people can see it. getting out from behind pay walls. our conversations are behind pay walls.”
“there are too many documents for me to work on in my project… history was a monolithic, individual activity. you sat down and translated the documents. but now there’s so much out there and it’s only going to grow. why not bring in more people?”
“what does interest me is making what we do relevant outside of this building. i think there is a genuine crisis on a host of levels, and it behooves us to think about ourselves as public scholars. some people clearly do, some people have in the past, but if we are not attentive to that then we are in some professional trouble. […] what we are able to do is connect a reading public with an academic expert – in a way that works for both of them. it is an opportunity for that academic expert to speak without footnotes, to speak without jargon, not to worry about petty-minded colleagues, and it is a way for the public to have access to someone who is really smart and who knows about this particular topic – to be better informed, therefore, about what is going on in the world.”
many interviewees discussed the motivation and benefits of digital scholarship initiatives in terms of “engaging the public” and making history more accessible to the public. public history has a long legacy, but it has been viewed in different lights by different departments.
at this point, however, it is impossible to ignore the role of public history in the adoption of digital methods in the discipline. public history, and at the very least a commitment to making historical scholarship accessible to a public audience (as opposed to a scholarly audience), came forth as a clear motivator for most interviewees who are engaged in digital scholarship. in some cases, where scholars are using public information as a source for their scholarship, including crowd sourcing or the use of publicly-generated sources, scholars feel a commitment to share the output of their work with the public in an open, accessible way. interviewees who were engaged in making their research public or who identified as public historians held a range of perceptions about the acceptance of this work by their peers and colleagues at their institutions. in some cases, history departments support strong public history programs. in others, a scholar may be working more independently to achieve their goals of making their scholarship accessible to the public, without explicit support from the department. some interviewees at public institutions saw their commitment to the public as a core value of their institution, and a motivator for their scholarship.
promotion and tenure
“‘points’ dictate what types of material you produce. books are worth more than peer reviewed articles, which are worth more than book reviews.”
“there’s a sense in history – blogging about stuff doesn’t really ‘count.’”
“they are not against it [digital scholarship]; they just do not have the resources to promote it on such a small scale.”
“first [problem is], peer review. there is no systematic way to accomplish this if someone is working with their library to put up something that’s flashy and smart. no one’s vetted it or has an opinion of it. there are lots of people whose digital research are a blog with pictures on it. there’s no line.
no one’s going to pretend that’s a substitute for a book. and, there’s no publisher for these projects.”
many choices that historians will make are driven by their understanding and prioritization of the audiences for their scholarship and the outputs appropriate for reaching them. during the course of this project a number of issues were raised in terms of the opportunities and constraints imposed by these dynamics. the promotion and tenure process for history faculty is often raised as an area of concern in discussions about digital scholarship. current tenure standards and requirements remain heavily focused on the monograph and articles published in peer-reviewed scholarly journals, and the interviews suggested that this status quo is still in place. as expected, some history scholars are exploring new methods of digital scholarship and scholarly communication, and are struggling to understand how the academic world will evaluate and accept (or not accept) their scholarship. colleges and universities require widely differing balances of teaching and research in the promotion and tenure process. many faculty have a tenure process that is focused on their teaching portfolio, rather than publications. one interviewee stated, “i usually know that scholarship is appreciated, but that it is not what comes first. excellence in teaching is our first thing.” (as this project is primarily about the research method, this report will focus on research in promotion and tenure.) there was some evidence that faculty who have achieved tenure feel “safe” to explore new digital methods in ways that pre-tenure faculty do not. one interviewee noted, “i don’t have to worry about whether it will result in a book or not. i have a form of job security that allows me to do something i feel is productive and not worry about my c.v.
our institution has been slow to figure out how they would assess this work.” in contrast, some noted that as many departments are hiring new faculty, digital scholarship is an attractive addition to the c.v. it was clear from interviews that pre-tenure history faculty at research-focused institutions are still required to produce a monograph in order to advance to the next stage of their careers. in most cases, the expectation is direct and explicit, and most faculty appreciate that clarity. one interviewee noted, “my department is very clear that i need a book. and probably a couple of articles in recognized peer review journals. i’m glad the expectations are so clear.” new faculty are very aware of the requirements for tenure, which seem to be relatively stable. the monograph remains the centerpiece of the tenure process for these historians. in most cases, digital scholarship work is seen as a part of or a supplement to the monograph. if digital scholarship allows scholars to ask new types of questions and interact with the sources and data in new ways, it follows that the answers rendered from these new methods will be incorporated into the historical arguments that scholars are already making. it is common that a particular method will illuminate a new way of approaching an issue of time, place, or language, and that these results will be incorporated into the monograph. in these cases it is typical for the scholar to produce an online platform to share the tool, method, results, or data that were a part of the digital scholarship method, in addition to the book. however, it is difficult to say whether historians are given “credit” for this work in promotion and tenure reviews. in some cases, scholars are producing text output in formats that are neither the traditional monograph nor a scholarly article. these are typically blog posts, but could take other forms.
some scholars feel that their work might be better addressed in one of these non-traditional formats, like a website or a series of blog posts. one phd student felt that the dissertation was not an ideal form for presenting his work, which is heavily informed by digital methods. in his case, “articles make more sense.” but, as dissertations are required and the format is established and standardized (unlike in some other fields where a series of articles may be composed into a dissertation), he is spending time adapting his work to the required format. one interviewee shared his approach to digital scholarship with his students, advising them to maintain a balance between new methods and traditional scholarship. these efforts, as stated by the interviewee, were an effort to ensure that the student would be acceptable or marketable in the current academic environment in history. he felt it was a risk for students to concentrate their studies too heavily on new methodologies. in some cases, historians are producing digital projects as the output of their scholarship, without a print text accompaniment. in these cases, questions of review, credit, tenure and promotion are aggravated. this study did not interview anyone directly who was pursuing a phd, tenure, or promotion with a digital project in the stead of a traditional textual (monograph) output. issues of new formats and open models of publishing beg the question of peer review. this study did not delve into the deep waters of this dilemma and debate. some history departments and scholarly associations are adopting standards for evaluating new scholarly methods and non-traditional outputs. this may serve to reduce professional barriers to exploring and applying new methods to historical research, and it was clear that there is a need and momentum building to do just this. 
while it is impossible to generalize on this issue, there was an overall sense that the issue of earning “credit” for non-traditional forms of scholarship is a very real barrier to exploring and adopting new methods and outputs. harley remains an excellent source on these topics.

graduate students

“one of my big issues with graduate education in general right now is that there’s almost no training with methodology and what you actually do in the archive and why that matters. you don’t always know how to ask someone for help. there are larger philosophical questions about what an archive is. i haven’t gotten systematic training. i had done some archival work through previous education. i’d been to an archive and i kind of knew how to use one on a basic level. a lot of it is figuring it out as you go.”

“i would be interested in attending a session about organizing information and writing [it] up.”

“learning to use archives and sources… i’m just learning myself.”

during the course of this project, phd students echoed many of the same concerns that faculty members described. interviews with phd candidates indicated that there is often little support, of which they are aware, for learning about new research methods or practices, either in their department or elsewhere at their institution. while the subject matter treated by historians continues to diversify dramatically, new methodologies develop, and research practices change rapidly, it is clearly critically important that students have a grounding in the methods and practices of the field. the field universally expects that scholars produce a dissertation, and in most cases a subsequent monograph, effectively demonstrating a standard set of skills in the discipline. however, formal, implicit training of scholars in these skills may not be as prevalent as it could be.
given that graduate students are deeply engaged in their research, and are forming life-long research habits during their dissertation work, this area emerged in this project as a vital area for further attention. phds often struggle to define the scope of the project and develop an efficient approach to managing numerous sources and notes. they also struggle with developing and refining their argument. graduate students reported that they rely on fellow students, advisers, archivists, and colleagues in the field for advice. while interviewees varied in their approaches to the dissertation, about half were treating it very much as “the first book.” choosing a dissertation topic that is practical given funding constraints and refining an argument are key challenges for graduate students. one interviewee said “a lot of us [phd students] have cool topics or ideas, but making it into something you can answer is more difficult than i’d realized.” these challenges, very likely common throughout the academy, may indicate a need for more active guidance on these topics as students progress through their programs. as noted in other studies, skill sets range dramatically for incoming phd students, leading to a variety of support needs. the amount of formal training on research methodologies varies widely depending on the adviser. methodological training was often “thin” compared to expectations and needs even for working with traditional sources and methods, such as in archives. some interviewees said their programs had included one or two organized visits to campus archives, where they met with archivists who illustrated how to work with an archive and interact with the materials. (there is even less support for working in poorly resourced or otherwise untraditional settings.) phd candidates said that these training sessions are invaluable.
even when students have a methods class available to them, it does not always provide a good foundation for practice in working with source materials, though one interviewee mentioned that a methods class she had taken provided a solid foundation in theory. several phd students expressed a desire for a real “boot camp” on methods and practices at the appropriate point in their graduate education. relying on the support of professionals and colleagues in the archives is an important way for phd students to learn how to work with sources. the necessity of traveling for research takes young scholars away from the assumed support system that would be found in an academic department, leaving them to rely on the archivists and other scholars in those settings. interviewees sometimes noted having made connections with other scholars in their subfield at an archive and even observing and learning from how other scholars work through a collection, take notes, and write. they also indicated that discussing various approaches to working with sources with these scholars was an invaluable aspect of their training and work. this was one way that they built a network of scholars within their subfield, as many scholars are working on the same or related collections at one archive. additionally, the archivist is an important instructor for history students and a guide for experienced researchers. scholars noted the importance of building a good relationship with the archivist, the primary research support professional in the archive, and his/her role in guiding them as to how to approach a collection, identify relevant resources, and work with different types of materials. some phd student interviewees said that they need more training in working with non-document based sources. they struggle technologically and methodologically to locate, capture, analyze, and report on a variety of source types including audio, video, oral histories, websites, and video games, as noted earlier.
some interviewees expressed direct frustration with the lack of training in using these primary source formats. in some cases, phd students noted that their advisers were not familiar with the use of these materials, and were therefore not a source of support for this aspect of their dissertation process. some phd students benefit from working with multiple advisers from multiple departments, because this allows them to learn about other approaches to non-traditional sources. a significant part of their time is spent exploring tools and approaches to facilitate effective, efficient, productive research and writing processes. several interviewees mentioned having attended workshops on campus, often hosted by the library, to learn about research tools like zotero. graduate students view these workshops with varying degrees of satisfaction, and they often feel that they are not taking full advantage of these tools. overall, phd candidates are eager to identify new tools, for which they typically rely on their peer networks. advisers and professors are typically not able to address questions about new technologies, as these are generally outside of their primary skill sets. there is an unstated expectation that students will find support for using technology elsewhere on campus, outside of the department. phd students use their campus library in ways that are not dissimilar from faculty members. they use both print and electronic collections heavily, mostly for secondary sources. library space can be important to phd students, especially for those who are local to the campus and do not have additional office space. they may have occasional interactions with librarians and archivists about their research, though some of them reported dissatisfying experiences working with library staff. among the interviewees, there were no good examples of strong relationships with the librarian. 
historians reported that delays in response time to requests or emails are a major inconvenience for them. in general, the campus library is a critically important service provider; however, it is not seen as a core collaborator or partner in the research process. phd student engagement with digital research methods included using gis and text analysis in their dissertations. one noted that she has interest in incorporating digital visualizations into her dissertation, saying “it [visualization] would be a tool i’d use to answer a question that might arise out of the research.” interviewees, particularly the current phd students, noted the value of learning from their peers throughout the dissertation process. while in their phd programs, historians build strong connections with their fellow students, and often cite this community as their primary support for discussing the “how to” of research. in many cases, interviewees noted that they had learned about a tool or an approach from a fellow student. phd students and new faculty reported staying in touch with these networks and relying on them for support after graduation. sometimes scholars in these communities also share sources (primary and secondary). clearly, the experience of learning to work with primary sources – which is at the heart of the historical method – can be described as informal, at best. the consequences of this approach, both positive and negative, were apparent in further discussion of the research process. historians feel a great deal of control over and comfort with their personalized approach to research. however, they also struggle with some aspects of the process. starting a conversation about research practices within the community, and re-instating formal training on research methods, could provide significant support for the field. conclusions and recommendations this report has taken a snapshot of some of the many ways in which new technologies have affected historical scholarship. 
while the adoption of new research methods continues to grow, this project has documented an absolute explosion in new research practices and communications mechanisms. this report concludes that research support services should strive to provide adequate and increasing support for these new practices and communications mechanisms. three key findings are summarized here, followed by a series of recommendations to specific audiences.

gaining intellectual control

the majority of interviewees said that a central challenge of their research is “gaining intellectual control” over the content they have collected throughout their research process. from the interviews, it was clear that historians are interacting with a wide ecosystem of information, which they are continuously collecting, interpreting, and attempting to organize and access for analysis. nearly all historians face an ever-growing mass of paper and electronic resources, notes, writing and images. organizing these materials in a consistent way so that they can be easily accessed throughout the research and writing process – typically over many years – is an enormous challenge. as noted earlier, the researchers observed historians creating and revising and struggling with their organization systems, many saying “i should be more organized…” during interviews. while organizing information has always been a challenge for historians, the ever-expanding landscape of resources available to historians in digital form has allowed them to collect and analyze more and more information during their research process, and thus it has increased the challenge of engaging with all of the material.

discovery and digitization

it was clear from interviews that finding and accessing secondary source material is straightforward for historians. given amazon, google books, the library catalog, and interlibrary loan services, historians can nearly always find what they need for their research.
anxiety about comprehensiveness is, however, growing. and primary sources present another challenge. the process of identifying archives - in some cases small, local archives or international archives - can present an amazing challenge to researchers. another level of this challenge is determining what is in an archive before visiting it. given limited travel budgets (with many historians funding research trips out-of-pocket) scholars need to go through a complex decision making process to target high-priority archives. the digitization and consequent discovery of archival finding aids is incredibly valuable for historians, and greatly in demand. the value of online finding aids was clearly communicated by participants, and instances where archives do not provide online finding aids were a challenge for many interviewees. not only was there a desire to have finding aids for all archival collections online, there was a desire to have these finding aids collocated for centralized searching.

the library and the archive

it was clear in the interviews that the majority of historians view the library in a collections-centric way, either immensely satisfied with collections, delivery services, and interlibrary loan, or craving improvements. in addition, many historians highly valued some of the recent digitization and discovery efforts of libraries and archives. google books and more finding aids available online are just two of the technology-driven innovations that historians celebrated. still, there was also a noteworthy concern among historians about whether librarians, in particular, had sufficient command of the field to provide more focused support for their work. regardless of the possibility of a service decline in this sense, there is clearly a need to marshal capacities among a variety of support providers in a way that more directly responds to the needs of historians for expertise in their individual sub-fields.
recommendations

fundamentally, historians require a variety of types of research support. prior to the interviews with historians themselves, this project included interviews with a variety of research support professionals, including history librarians and digital humanities professionals. a variety of issues emerged from those interviews about the identity and responsibilities of the diverse individuals in research support roles for the field of history, all of which are quite germane in contemplating how best to serve the variety of needs that emerged from interviews with historians. one of the key issues that emerged from the research support professionals was a basic uncertainty about the distinction between individuals who serve in a research support role and those who see themselves as collaborating with historians. many of the basic needed services discussed in this section focus on issues such as discovery and information organization and management, which are almost certainly best considered from a service perspective. assistance with adoption of new research methods, however, is more typically provided through a variety of fairly bespoke collaborations. on the other hand, many of the anxieties facing historians have to do with learning how to adopt new practices and tools and incorporate technology, which has more to do with education and instruction than with collaboration. partnerships between departments, collaborators, and service providers are almost certainly needed to ensure that these types of needs are accommodated. in ithaka s+r’s recommendations, based on the research and analysis conducted for this project, we have broken these down by audience: archives, libraries, digital humanities centers (and other campus support providers), providers of digital and digitized secondary sources, providers of citation and research notes management systems, and historians.
while we recommend a variety of collaborations across some of these stakeholders, we hope that this organization will help the reader immediately identify actions that could be taken in his or her professional and organizational context.

recommendations to archives

archives serve as unique destinations for historical research, and the best archives offer a combination of valuable content, tools, expertise, and programming.

because archives are the primary provider of sources for historical research, and present the greatest challenges for researchers, efforts to improve access to descriptions of archival materials are vital. online finding aids are critically important to today’s researchers, and archives should consider these a priority service that they provide. even if detailed finding aids cannot always be created due to resource constraints, expedited approaches to creating more basic discovery mechanisms may be a way to shed at least some light on otherwise hidden collections.

archives should continue to make every effort to make collections as accessible as possible through digitization. there may be an opportunity for archives to partner with researchers who are digitizing some portion of the archives on their own, in order to collect this material and make it available for other researchers. with respect to smaller archives, there may be collaborative opportunities that would make such efforts more feasible.

archives should work together to develop, support, and/or promote discovery tools that make archival finding aids more readily accessible and cross-searchable. cataloging and discovery services that cross institutional boundaries are becoming increasingly important, and archives should determine whether and how these services can best accommodate their finding aids and support the needs of historians.
such tools would be particularly valuable if they could facilitate the creation and dissemination of online finding aids for small, local, and obscure archives and institutions.

historians deeply value the expertise of the research archivist, and archives should ensure that they are devoting adequate resources to engaging actively as interpreters of the collection and important connectors within their subfield. archivists can play a patron services role in working with historians, and libraries should allow them time and other resources needed to be readily available to researchers. archives are uniquely positioned to facilitate connections within the community of researchers who use their materials, and should make efforts to support engagement between researchers.

archives should adapt to and facilitate the use of digital cameras and scanners in their reading rooms. they can serve a very real need for history researchers who are beginning to use this technology by creating policies, providing adequate space for photography, and providing instruction on best practices for capturing and organizing images.

campus archives should explore additional opportunities to train phd students at their institution in partnership with history departments. such training should focus not only on the use of the campus archives, but on the diversity of archives that students may encounter worldwide, including those that are less well resourced.

recommendation to libraries

libraries continue to provide a wealth of secondary sources to historians, and the digitization initiatives they have spearheaded have been tremendously valuable. in addition, libraries offer some of the principal campus-based support services for historians but may wish in some cases to consider their place in the broader network of service provision, both on-campus and remote.
historians are prepared for the print to electronic transition in library collections, certainly for scholarly journals, and libraries should proceed (with all appropriate individualized sensitivity and consideration for long-term preservation) with whatever strategies they may be pursuing for evolving format preferences for their collections.

even the greatest research libraries serve only a portion of the secondary source requirements for historical research, making collection sharing an especially vital service for historians. libraries should continue to advance their borrowing partnerships and joint collection management plans. some historians think about their library access in terms of regional (rather than institutional) collections, and many libraries may wish to do the same in order to serve their needs comprehensively.

historians noted that library expertise does not always cover their sub-field or area of interdisciplinary focus, which is understandable given staffing constraints. libraries have traditionally focused their collaborative efforts on their collections, and they may want to consider opportunities to make other types of services – such as staff expertise – more readily available to those at other institutions who can benefit from them. for historians, this would be beneficial if it were to allow institutions to develop deep specializations in discrete sub-fields of history.

digitized monographs and other books were extremely important to historians for discovery and research purposes, as are non-textual sources (such as audio, video, oral histories, websites, and video games) and archival finding aids. libraries should ensure that full text search of digitized books, archival finding aids, and non-textual sources is available to researchers as comprehensively as possible through their main discovery services.
libraries should explore offering a “concierge” service to historians focusing around their need to discover, and broker access to, primary source materials. historians describe the identification of primary and secondary resources in new topical areas, particularly at the beginning of a new research project, as a challenge, yielding significant anxiety over “missing something.” the campus library and/or centers of excellence devoted to individual subfields could design a variety of services that would mitigate these challenges.

historians who had adopted digital methods relied on partners throughout the university for support, and this support often makes digital scholarship possible. for scholars who are exploring these methods and incorporating them into their research, these partnerships are critical. campuses can continue to facilitate this work by providing support and expertise at either the departmental or campus wide levels.

recommendations to providers of digital and digitized secondary sources

amazon, google books, hathitrust, and internet archive are among the most significant sources of digitized book content, alongside a variety of publishers and platforms that are making scholarly monographs available online as well.

historians working on international topics noted limitations of the corpus of foreign language material available on google books. maximizing the inclusion of foreign language material in these services would offer additional value to a variety of researchers.

among interviewees, the singular importance of google’s library of digitized books was quite striking. community services such as hathitrust and internet archive, not to mention publisher and platform services, may have different objectives. nevertheless, they may find it useful to evaluate their role in support of the needs of historians in the context of google’s apparently unique importance for this population.
scholars interested in utilizing digital corpora of texts for computational analysis are uncertain about the scope, provenance, and quality of the content that has been digitized. providers should address these issues transparently to enable computational research to be conducted without methodological compromise.

historians’ needs for non-textual sources must be supported by making them both more readily available and more seamlessly discoverable.

recommendation to providers of citation and research notes management systems

one of historians’ key research challenges is the need to organize, gain intellectual control over, and sift through a diversity of sources. citation management and research management functions are increasingly coming together in tools such as zotero and mendeley, and some historians are making extensive use of them.

faculty members perceive key limitations in these systems’ bibliography tools, especially in working with primary source materials, other unpublished materials, foreign languages, or non-text or media sources. providers of these systems should bear the challenges of using these content types in mind when they establish future development priorities. where they have existing features that would ease these challenges, they should focus on marketing those features more effectively to the history community.

many of these tools now provide functionality far beyond citation management and bibliography creation. these tools can address some of the research notes management challenges that are pervasive in the field of history. providers of these systems should bear in mind that while citation management is a core process common to all scholarly fields, research notes management has certain aspects that are highly discipline- or method-specific. they should create appropriate flexibility or customization to take disciplinary needs into account.
recommendations to history departments

history departments provide a variety of research support services, not least to phd students through methods courses and other graduate training.

several phd students indicated that they would have benefitted from additional help in developing a dissertation topic, especially given the practical matter of resource constraints. it may be too much to suggest that topic development could include a formal budgeting process, but advisors and departments may want to provide additional guidance in considering resource availability.

phd students reported significant uncertainty about their knowledge of research methodology. they were not uniformly well-versed in effective techniques for research notes management, outlining, use of the archives (especially in less well resourced settings), comprehensive discovery techniques, various types of collaboration, and other techniques necessary to research and write a dissertation and enter the profession. history departments should carefully examine how they expect phd students to learn fundamental and innovative research practices – perhaps but not necessarily alongside new research methods – and make adjustments to maximize student success. there may be opportunities for partnerships between history departments and libraries and archives in support of these objectives.

scholars and phd students alike need significant training in new research methods. some history departments teach methods courses to their phd students, but these need to be better adapted to emerging research methods that involve visualizations, computational analysis, and other emerging approaches, in addition to teaching the fundamentals of the historical method and working with primary sources. the field may want to adopt the model of summertime national “boot camps” in research methods that have proven successful in other fields that are adopting new digital methods.
finally, given the disappointment that some scholars have reported with new methods, all methods training should afford significant attention to identifying the right research method, whether new or traditional, to suit a given research question.

at both the level of methods and practices, phd students require more training and support in the use of non-textual materials, including audio, video, oral histories, websites, and video games, as well as collections that are poorly organized or cataloged. whether through formal peer networks or departmental coursework, departments should ensure that their phd students are being acclimated into the full range of sources they may encounter in their research. the clir dissertation fellows program may offer one model for consideration in this regard.

new forms of scholarly expression are offering emerging scholars earlier opportunities to develop their ideas and their voices. departments should provide guidance to phd students and faculty members regarding the role of new types of scholarly communication, including blogging.

as new approaches to research notes management emerged in interviews, it was clear that although most struggle with this process, it is not addressed in a formal (or informal) way in the education of historians. strengthening the network between scholars, and providing opportunities and forums for scholars to discuss their personal approaches could be of great benefit to the community.

recommendations to scholarly societies

while the american historical association is the principal field-wide scholarly society for history in the united states and may be the right venue for many of the considerations discussed here and elsewhere in the report, a variety of other scholarly associations may find that there are appropriate contributions for them to make in these areas as well.
it is important that the field engage in discussions about the role of digital scholarship in history, and support faculty members in exploring and adopting new methods. the history community can make a commitment to incorporating this work into the field by setting standards for its review in publication, tenure, and promotion. scholars should be able to gauge what will “count” in the forms of scholarship they may wish to adopt.

as day-to-day research practices of historians continue to evolve, continued examination of their changes and the associated needs they produce will be necessary. scholarly societies may want to establish mechanisms for tracking these changes over time, for formally identifying support needs of the field, and for engaging with a variety of partners to help ensure they are addressed, not least librarians and archivists at a professional rather than institutional level.

recommendations to funders

many of the recommendations made elsewhere in this report may benefit from one kind or another of outside support, but several are identified here specifically because they may be impossible without such support.

the extensive need for professional development in new research practices and tools will require some amount of experimentation that might be spurred along by dedicated sources of funding. if a funder were to choose to support such needs, one set of considerations is whether such professional development is best situated internally within the history department (for example as a requirement of phd education), in collaboration with another campus organization such as the academic library, or through a third party model such as a summer institute or thatcamp.

to bridge perceived gaps between historians and those who provide them with research support services will require a mix of formal programs and informal approaches.
while the former may typically be easier to support, the latter may be facilitated by structures that ultimately rely on outside funding to develop. one model that was called to our attention for consideration was the newberry library's summer institute in quantitative history.

appendix a: interview participants

research support professionals
marta brunner, ucla library, head of collections research and instructional services department at the charles e. young research library
brian croxall, emory university library, clir fellow and emerging technologies librarian
julia flanders, brown university library, center for digital scholarship, director for women writers project
kathleen fitzpatrick, modern language association, director for scholarly communications
matt gold, cuny graduate center, assistant professor and advisor to the provost for master’s programs and digital initiatives
rebecca kennison, columbia university, director for the center for digital research and scholarship
ed linenthal, journal of american history, editor
joan lippincott, cni, associate executive director
ken middleton, middle tennessee state university library, associate professor and user service librarian
tom scheinfeldt, george mason university, managing director for the center for history and new media
lisa spiro, nitle, director of nitle labs
robert townsend, american historical association, deputy director
katherine walter, university of nebraska-lincoln, co-director for center for digital research in humanities
elizabeth watts pope, american antiquarian society, head of readers’ services

historians
jeremy antley, university of kansas
brian bockelman, ripon college
steve brier, cuny graduate center
joshua brown, cuny graduate center
antoinette burton, university of illinois, urbana-champaign
claudia calhoun, yale university
david cannadine, princeton university
brian caton, luther college
lawrence cebula, eastern washington university
steven conn, the ohio state university
simon cordery, monmouth college
kevin dawson, university of nevada, las vegas
hasia diner, new york university
s. max edelson, university of virginia
colin gordon, university of iowa
shawn graham, carleton university
timothy graham, university of new mexico
greg grandin, new york university
maggie greene, university of california, san diego
john haldon, princeton university
martha hodes, new york university
julia irwin, university of south florida
kc johnson, brooklyn college
deborah kanter, albion college
david ludden, new york university
kate mcdonald, university of california, santa barbara
sean mcenroe, southern oregon university
daniel mcinerney, utah state university
sarah melton, emory university
april merleaux, florida international university
celia naylor, barnard college
matthew o’hara, university of california, santa cruz
jenna phillips, princeton university
ben schmidt, princeton university
william thomas, university of nebraska, lincoln
andrew torget, university of north texas
ed triplett, university of virginia
david troyansky, brooklyn college
carl wennerlind, barnard college

appendix b: interview protocol for historians

warm-up
• thinking back to your phd studies, can you describe your training as an historian for me?
• tell me about your dissertation topic. what types of resources were you using for your dissertation?
• how has your approach to research changed since then?

research
what research methodologies are currently in use and how are these expected to change? what support is available – locally or distributed – to help facilitate the research process?
• tell me about a research project you’re working on now.
  o how did you develop your topic?
  o how did you start finding materials for your project? (follow-up in discovery)
• research notes management
  o how do you keep track of the articles, images, resources you’ve gathered for your current project?
• use of “new” technology
  o look for cues and follow up.
  o explore any known digital humanities methodologies.
• collaboration
  o have you worked on any collaborative projects? tell me about them.
• challenges
  o what’s going really well with your current project?
  o what obstacles have you experienced in working on this project?

discovery
how do researchers obtain information, begin the process of discovery, and use network and local resources in the field?
• tell me about your current research project.
  o where did you start? describe your research path to me.
  o were all the resources you needed available to you on campus?
  o what do you do if something isn’t readily available?
  o how do you know when you have everything you need?
• last time you were looking for a book or article, what did you do?
• last time you were exploring a new topic, for class prep or for a potential research project, what did you do?
• what can’t you find with google and your usual search strategies? what happens when you can’t find something?
• challenges
  o what are the biggest barriers to finding the resources you need?

library and resources
• use of archives
  o are you doing archival research for your current project? which ones? tell me about how you’re using the collections there.
  o how did you prepare for your visit?
  o how did you capture information while you were there?
  o how did you work with your research notes once you came back?
  o have you used a digital camera while you’re working in archives?
  o do you wish any of these materials were available digitally? how would that impact your work?
• use of digital collections
  o are you using digitized collections – text, images, video – in your current project?
• use of the campus library
  o how would you describe the library’s role in your research?
  o what’s the most valuable thing that the library helps you with?
  o have you worked with a history librarian at your library for this project? from another library?
  o have you used any technology services offered by the library? what technology support do you wish the library offered?
  o what do you wish was available to you on campus that isn’t?
• use of other libraries
  o what other libraries, archives, societies, or collections are you working with on your current project? tell me about the last time you worked with them.
• what obstacles have you encountered in conducting research for your current project?

digital scholarship (if relevant)
• how have new technologies impacted your scholarship?
  o seeking out new sources
  o analyzing information
  o organizing information
  o sharing information
• are you interested in exploring any new methods in your work?
• is there anything you wish you had time or resources to learn?
• have you worked with a digital humanities center, or equivalent, on any of your projects?
• what inspired you to try this new method/approach/technology?
• how did you go about building skills in this method?
• what impact has this method had on your scholarship?
• would you describe yourself as a “digital historian?”
• what challenges have you experienced in incorporating this new method into your scholarship?

future
• looking forward, what challenges do you see for yourself as you continue to do research?
• looking forward, what challenges do you see your field facing as methods continue to evolve?

wrap-up
• looking back at our conversation today about your scholarship, can you reflect again on how your approach to research has changed or is changing?
• if i gave you a magic wand that could fix something that isn’t working for you, or create something for you to use in your research, what would you ask the magic wand to do?

appendix c: evidence
notes and sources from a recent book project. each “pile” of folders is one chapter.
notes taken from a visit to the archives. a post-it note flags a note of interest and labels it with the chapter number to which it relates.
photograph of an unprocessed “archive” in a foreign country. the records found here were used in research.
a graduate student uses post-it notes to organize dissertation structure.
the same graduate student uses scrivener to organize dissertation structure, notes, and draft sections of the dissertation.
an example of a digital photograph taken in a reading room. in this case, the scholar has used the “flag” from the collection to provide a type of visual metadata for future reference.

so what are you going to do with that?: the promises and pitfalls of massive data sets

sigrid anderson cordell (a) and melissa gomis (b)
(a) hatcher graduate library, university of michigan, ann arbor, michigan, usa; (b) perkins library, doane university, crete, nebraska, usa

college & undergraduate libraries
journal homepage: http://www.tandfonline.com/loi/wcul
published online: jul .
published with license by taylor & francis © sigrid anderson cordell and melissa gomis
contact: melissa gomis, msgomis@gmail.com, perkins library, doane university, boswell ave, crete, ne.

article history: received february; revised june; accepted june
keywords: data mining; library services; supporting dh across the institution; teaching dh

abstract
this article takes as its case study the challenge of data sets for text mining, sources that offer tremendous promise for dh methodology but present specific challenges for humanities scholars. these text sets raise a range of issues: what skills do you train humanists to have? what is the library’s role in enabling and supporting use of those materials? how do you allocate staff? who oversees sustainability and data management? by addressing these questions through a specific use case scenario, this article shows how these questions are central to mapping out future directions for a range of library services.

introduction
when the first set of texts from the early english books online text creation partnership (eebo-tcp) was released on january , (text creation partnership [tcp] ), there was understandable excitement about the release of , openly available texts from the early modern period (levelt n.d.). in addition to making these texts available to read, this release also opened up possibilities for text mining the eebo-tcp data set. however, while there is clear potential for digital humanities research in making a relatively clean data set of texts from the early modern period available, the structure of the data set itself poses considerable challenges for scholars without a background in programming. most humanities scholars cannot take advantage of a data set like this one (or similar data sets, such as the historical newspapers that proquest has recently made available to institutions that have purchased perpetual access) without considerable training and support. the question becomes, who is best positioned to provide that support? for many, the obvious answer to this question is the library because of its position as provider of resources and expertise in navigating them. if the library is to provide this support, however, how can it do so most effectively? the gap between the promise and usability of massive humanities data sets like the eebo-tcp project presents an opportunity to consider a host of questions facing libraries today as they develop service models and expertise to support traditional and emerging forms of scholarship.

this article takes as its case study the challenge of massive data sets for text mining, sources that have been lauded as offering tremendous promise for dh methodology but present very specific challenges for humanities scholars with minimal programming skills. the data management and use issues with which we are concerned in this article engage the question of whether humanists should learn to code; however, they go beyond that in scale and scope. the text sets under discussion in this article raise a broad range of issues if they are to be used by researchers: what skills do you train humanists to have?
while the library in most cases helped to create and provides access to these data sets, what is the library’s evolving role in enabling and supporting use of those materials? how do you allocate staff in this situation? who’s going to oversee sustainability and data management? by addressing these questions through the lens of a specific use case scenario, this article shows how these questions are central to mapping out future directions for a range of library services.

background
new digital methodologies and sources for humanistic scholarship raise new questions for training humanities scholars, as well as for the roles that libraries can play in supporting emerging scholarly approaches. as many have noted, emerging digital methodologies in humanities scholarship have opened up new ways to analyze texts at scale. as heuser, le-khac, and moretti ( ) observe, digital methodologies open up the possibility of asking broader questions of larger corpora to understand texts and underlying social and cultural phenomena at scale. traditional scholarly methods, in particular the close reading of texts, necessarily limit the scale of analysis, leaving open the question of how authoritative any analysis based on reading a necessarily limited corpus can be. as heuser, le-khac, and moretti point out, machine reading methods hold promise for allowing us to answer new questions based on a larger, more inclusive corpus: “these emerging methods promise ways to pursue big questions we have always wanted to ask with evidence not from a selection of texts, but from something approaching the entire literary or cultural record. moreover, the answers produced could have the authoritative backing of empirical data” ( ).

alongside the “authoritative backing” that “empirical data” promises, these approaches raise concerns among humanists, especially for disciplines that have long defined themselves in opposition to the sciences.
as heuser, le-khac, and moretti ( ) observe,

by offering an entirely different model of humanities scholarship, the digital humanities raise many questions …. can we leverage quantitative methods in ways that respect the nuance and complexity we value in the humanities? … under the flag of interdisciplinarity, are the digital humanities no more than the colonization of the humanities by the sciences? ( ).

in conjunction with this lively debate over whether the core values of the humanities are lost by drawing on computational approaches is the question of how best to train humanists to undertake these approaches, as well as a necessary discussion about what might get lost in the process. some of the resistance to computational training by humanists, kirschenbaum argues, stems from a misunderstanding of what computer science is about, as well as its relevance to critical thinking:

many of us in the humanities think our colleagues across the campus in the computer-science department spend most of their time debugging software. this is no more true than the notion that english professors spend most of their time correcting people’s grammar and spelling. more significantly, many of us in the humanities miss the extent to which programming is a creative and generative activity. ( , b )

scholars like kirschenbaum ( ) have argued forcefully for rethinking humanities training so as to incorporate programming skills. one way to make space, kirschenbaum suggests, is to replace the foreign language requirement in phd programs with programming. these skills are crucial, he argues, because

computers should not be black boxes but rather understood as engines for creating powerful and persuasive models of the world around us. the world around us (and inside us) is something we in the humanities have been interested in for a very long time.
i believe that, increasingly, an appreciation of how complex ideas can be imagined and expressed as a set of formal procedures (rules, models, algorithms) in the virtual space of a computer will be an essential element of a humanities education.

as kirschenbaum argues, humanities scholars cannot explore the “complex ideas” that humanities computing generates without an understanding of the underlying computational systems.

likewise, scholars connected to the humanities, arts, science, and technology alliance and collaboratory (hastac) have devoted considerable energy to advocating for humanists to learn coding. hunter ( ) recounts an anecdote that her advisor told her when she wanted to do dh work but resisted taking a programming class: “‘i’ll never forget this young scholar who put himself forward as an expert on chekhov,’ he mused. ‘i asked if he spoke russian, and he proudly said he’d never even taken a class. he lost all credibility in that moment. don’t be the chekhov scholar who didn’t take russian.’” as hunter suggests, scholars need to understand code to design digital projects.

while there is some consensus in the scholarship that it is valuable for humanists to learn programming skills, there has been less detailed attention paid to what the best process is for teaching those skills. antonijevic’s ( ) ethnographic study of digital humanists reveals

an informal, unstructured mode of learning that is focused on point-of-need, where learning is linked to immediate scholars’ needs, arising from specific research problems, which generally makes this way of learning preferred over organized efforts, such as library workshops, where learning is decontextualized from scholarly practice. this method also successfully makes use of one of the scholars’ most scarce resources: their time. ( – )

as antonijevic ( ) points out, this method has the disadvantage of “depend[ing] on a scholar’s social network and its knowledge capacity” ( ).
the idea of a “social network” as the basis for acquiring programming skills is linked to another solution to the training dilemma offered by the literature on digital scholarship: collaboration. gibson, ladd, and presnell ( ) argue that, “unlike traditional humanities research, digital humanities scholarship is not a solitary affair. generally, no single person has all the skills, materials, and knowledge to create a research project. by nature, the digital humanities project, big or small, requires a collaborative team approach with roles for scholars, ‘technologists,’ and librarians” ( ). liu echoes this sentiment, arguing that dh work requires

a full team of researchers with diverse skills in programming, database design, visualization, text analysis and encoding, statistics, discourse analysis, website design, ethics (including complex ‘human subjects’ research rules), and so on, to pursue ambitious digital projects at a grant competitive level premised on making a difference in today’s world. ( , )

collaboration, however, requires considerable support and advocacy in a disciplinary landscape where it is not the norm. reid points out that, unlike a laboratory, which requires a team of people to operate,

the default mode for humanities academic labor has been for a professor to work independently …. it is unusual for humanities scholarship to appear with more than two authors, let alone the long list of authors that will accompany work in the sciences …. while there are certainly examples of notable, long-standing collaborations in the humanities, they are exceptions to the rule. ( , )

although collaboration can be fruitful for scholars in the humanities, it requires both a cultural shift and a rethinking of the workflow for scholarly projects. at this point, collaboration has not been fully embraced by scholars across the disciplines.
in addition to differing disciplinary attitudes that engender resistance to collaboration in the humanities, collaboration can have its own drawbacks, especially when the collaboration is not seen as fully equitable. as edmond points out, “in the worst cases, teamwork based on an ethos of knowledge sharing can degenerate into the negotiation of uncomfortable tacit hierarchies, where some contributors (regardless of their expertise or seniority) feel like service providers working in the shadow of otherwise autonomous project leaders” ( , ). further, edmond observes that collaboration doesn’t just require bringing people together but also reimagining projects so that all people involved have an intellectual stake. according to edmond, successful digital humanities collaborations “ensure from the outset that the project objectives propose interesting research questions or otherwise substantive contributions for each discipline or specialty involved” ( ). as reid ( ) explains, “given that the assemblage operates effectively with a single author, one essentially has to invent new roles for additional participants” ( ).

because of their well-established role supporting research, librarians have taken up the question of how to enable fruitful collaborations and how best they can train humanists seeking to create dh projects or learn programming skills. green asks how libraries can facilitate “scholars’ initial skills acquisition in text encoding” ( , ). green recommends a workshop model that does “not simply inculcate scholars with the latest software; rather librarians and scholars work together to facilitate scholars’ entry into the communities of practice that make up digital humanities” ( ).
pointing to the tei (text encoding initiative) consortium as a model, she argues that it “presents a strong case study of the role of librarians in building learning environments that enable scholars to become members of its community of practice” ( ).

one key question is whether it is the role of libraries to offer technical support for digital projects, train researchers in attaining new skills (through workshops, for example), or enable collaboration. lewis et al. assert that “organizations most successful at building expertise among faculty, students, and staff tended to share characteristics such as an open and collaborative interdisciplinary culture in which each team member contributes expertise and is respected for it” ( , ).

discussions of the library’s role in supporting scholars in emerging digital scholarship skills necessarily invite a conversation about staffing in libraries. should the library provide support staff for digital projects, or should that support staff come from the ranks of graduate students? if graduate students are used as labor for these projects, how can that work be organically integrated into graduate training? lewis et al. ( ) point to both the advantages and disadvantages of this model for graduate students:

often, digital scholarship projects rely on graduate student assistants. the experience gives students opportunities to build their knowledge and provides inexpensive labor. but such projects must contend with frequent turnover; as one faculty member put it, “i get these ma students, i train them, they graduate.” one university that offers degree programs in digital scholarship tries to recruit its own students as staff, but there aren’t necessarily enough students to meet the demand, especially with competition from other organizations. most of their graduates go to industry, since “they can offer more money.
the only people we have are here because of idealism.” ( , )

likewise, sustainability can be an issue when the support model is based on labor by students who necessarily stay only a short period of time. in describing the community of practice support model that has been used by various projects such as tei, documenting the american south, and the victorian women writers project, green points out, “the labor and craft taught for encoding texts generates a ‘shared repertoire’ of skills that is continually disseminated and refined through the training of new and established scholars. this shared repertoire is a critical element to the ability of a community of practice to sustain and expand itself” ( , ). the community of practice model constantly requires new participants, especially because many graduate students in library and information science programs or schools of information are only pursuing master’s degrees and graduate after two years.

at the center of the question of library staffing, training, and support for digital scholarship is the debate over whether libraries should establish digital humanities centers. ithaka’s report on supporting dh outlines three “campus models for support”: the service model, the lab model, and the network model. in the network model, “there are multiple units whose services have developed over time, in the library and it departments, but also visualization labs, centers in museums, and instructional technology groups, each of which was formed to meet a specific need” (maron and pickle , ). maron follows up on the ithaka report on dh centers by arguing that the service model has been controversial in libraries because of the debate over “the degree to which librarians should envision themselves in a ‘service role’” ( , ).
nevertheless, this is the most common model, and it is driven by the fact that it

meet[s] faculty and students where they are – to offer courses, training, and some programming support for members of the campus community. this often takes the form of developing a full range of programming, from workshops to courses, and bringing in guest speakers. the library or center following this model seeks to identify and respond to faculty needs rather than “independently identifying a path of innovation” ( ).

maron identifies the “path of innovation model” as closer to the lab model.

likewise, digital humanities centers can create a central space for networking and collaboration. as fraistat explains,

digital humanities centers are key sites for bridging the daunting gap between new technology and humanities scholars, serving as the crosswalks between cyberinfrastructure and users, where scholars learn how to introduce into their research computational methods, encoding practices, and tools and where users of digital resources can be transformed into producers. ( , )

while there is much support for the development of digital humanities centers, there are also detractors. schaffner and erway argue that “there are many ways to respond to the needs of digital humanists, and a digital humanities (dh) center is appropriate in relatively few circumstances” ( , ). instead, libraries can draw on a host of other approaches to support dh on their campuses. in this case, schaffner and erway assert, “[i]n most settings, the best decision is to observe what the dh academics are already doing and then set out to address gaps” ( ).

whether or not libraries build digital humanities centers, there is widespread consensus that libraries are natural partners in supporting digital scholarship. at the same time, there has been much less discussion of the specific challenges raised by complex data sets that are not inherently user-friendly.
libraries offer varying models of support, and there is a robust conversation in the scholarly literature about whether training, direct technical support, or enabling collaboration (or a combination of all three) is the best approach to supporting digital scholarship. as we argue in the next section, the potential and challenges of large data sets provide an opportunity to think through approaches to training, as well as the library’s role in supporting teaching and research using these data sets.

case study: the eebo-tcp data set
as new digital methodologies emerge, along with new data sets that enable textual analysis at scale, many scholars have sought help from librarians, other researchers (both in and beyond their disciplines), and technology experts as they begin navigating resources and methodologies far outside their traditional training. while there are expected challenges to learning the basic methods of digital scholarship and analysis, a significant additional barrier exists in formatting and preparing the data sets themselves, even beyond the programming skills that are necessary for analysis. for example, while many researchers can operate basic web-based text visualization tools such as voyant with relative ease, finding and then preparing a corpus for analysis with these tools is often far more daunting. the challenge in this case comes from the complex nature of raw data sets, as well as other factors that work against usability. creating data sets for analysis often involves individual downloads of plain text files (in the relatively limited cases in which platforms allow that functionality), using r or python to isolate subsets of larger corpora, or being limited to corpora that are larger than the researcher may need.
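to give a sense of what "using r or python to isolate subsets of larger corpora" can involve, the following is a minimal python sketch, not a description of any particular data set's tooling. it assumes a metadata table is available as csv; the column names (file_id, author, year), the identifiers, and the rows are all invented for illustration.

```python
import csv
import io

# Hypothetical metadata table. The column names and every row below are
# invented for illustration; a real data set will have its own layout.
SAMPLE_METADATA = """file_id,author,year
T0001,anonymous,1598
T0002,example author,1623
T0003,example author,1641
T0004,another author,1660
"""


def select_ids(metadata_file, start_year, end_year, author=None):
    """Return file_ids dated within [start_year, end_year].

    If author is given, keep only rows whose author matches exactly.
    """
    selected = []
    for row in csv.DictReader(metadata_file):
        year = int(row["year"])
        if not (start_year <= year <= end_year):
            continue
        if author is not None and row["author"] != author:
            continue
        selected.append(row["file_id"])
    return selected


# Pick out the files from a chosen span of years.
subset = select_ids(io.StringIO(SAMPLE_METADATA), 1600, 1650)
print(subset)
```

the resulting list of identifiers would then be used to copy or open only the matching files. even this small step presumes that usable metadata exists and that the researcher knows how to read it programmatically, which is part of the barrier described above.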
while it would be unrealistic to suggest that it is possible to eliminate all challenges to creating corpora, putting resources toward facilitating the creation of corpora from raw data sets would offer significant advances in scholars’ involvement with digital scholarship.

even data sets that have been produced by libraries pose challenges in usability for researchers. without a significant infusion of resources aimed at increasing the usability of these data sets by researchers at all levels of technical ability, the question becomes, who is best positioned to offer researchers and instructors support in using these data sets? likewise, who is best positioned to communicate the research possibilities, as well as how to determine a fruitful research question, for using these data sets? preparing a corpus takes time, and there is no guarantee that text analysis will yield usable results.

this article takes the eebo-tcp data set as a case study to discuss the challenges and potential approaches for libraries to support digital humanities work using these corpora. we draw on the eebo-tcp data set both because its potential and challenges are representative of other data sets being made available for humanities research and because it is openly available.

eebo-tcp offers considerable potential because it makes transcriptions of early modern texts available for scholars, as well as because it is a clean data set. eebo-tcp is based on the early english books microfilm collection that includes over , titles from pollard and redgrave’s short title catalogue ( – ), wing’s short-title catalogue ( – ), and the thomason tracts ( – ) (early english books online [eebo] n.d.). when the microfilm set was originally digitized, the scans appeared as images, and only the metadata was searchable.
to make the texts themselves searchable, and because optical character recognition (ocr) software has not yet advanced to handle early modern fonts with any degree of accuracy, the text creation partnership made the ambitious decision to re-key (i.e., transcribe) the texts, as well as to mark them up using xml/sgml encoding. although the original goal was to make the texts full-text searchable, emerging text mining methodologies have made the existence of clean data sets particularly desirable for researchers. because the texts have been re-keyed, there are fewer errors in the texts than in those that have been ocr’d. as part of the agreement with proquest, which makes the eebo database commercially available, phase i of the eebo-tcp texts, which includes the first , re-keyed texts, was made publicly available in december .

s. a. cordell and m. gomis

while the data set offers considerable potential for researchers and also makes the texts themselves available, the data set itself is not easy for researchers to use for a variety of reasons. the texts are available either as a full data set on box and github or as individual html, epub, and tei p xml files through the oxford text archive. the files on box and github are referenced by tcp number, a number that is not available on the proquest platform. as a result, researchers who are not interested in working with the corpus as a whole (who, for example, are interested only in texts from a specific time frame or author) have to do considerable extra work to identify the relevant files before they can begin downloading and formatting them for analysis. while researchers who are fluent in programming languages such as r or python have little trouble accessing these texts, in our experience many researchers in the humanities are understandably daunted when faced with zip files containing , files, each of which contains xml or sgml markup that they must decide whether (and how) to scrub or retain.
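to make the scrub-or-retain decision concrete, the following python sketch removes all markup from a small, invented tei-like fragment and keeps only the text. the fragment and the function name strip_markup are assumptions for illustration; real eebo-tcp files are far larger, and a researcher may prefer to treat some elements (for example notes or page breaks) individually rather than discard every tag.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified TEI-like fragment written for this example;
# it is not an excerpt from an actual EEBO-TCP file.
SAMPLE = """<TEI><text><body>
<head>The Preface</head>
<p>To the <hi>gentle</hi> reader,
greeting.</p>
</body></text></TEI>"""


def strip_markup(xml_string):
    """Drop every tag and return the text content with whitespace collapsed."""
    root = ET.fromstring(xml_string)
    # Join without a separator so that markup inside a word does not split it.
    # Caveat: adjacent elements with no whitespace between them will run together.
    text = "".join(root.itertext())
    return re.sub(r"\s+", " ", text).strip()


print(strip_markup(SAMPLE))
```

stripping everything is the simplest choice, not necessarily the right one: retaining selected elements, such as headings, can matter for some research questions, which is part of why the decision is daunting for newcomers.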
there is little documentation on strategies for accessing and cleaning up the text in preparation for mining, or information on analysis tools once you have the data.

likewise, proquest has recently made their historical newspaper collections available (for a fee) to libraries that have already purchased perpetual access to specific titles. when libraries license the full-text data sets of historical papers, they are given access to the marked-up files. the los angeles times, for example, is a collection of . million files, presented in no particular order and with no metadata in the file names. as in the case of the eebo-tcp data set, to make use of these files, researchers must begin by pulling down slices of the corpus (such as by year or article type) using r or python. unlike the eebo-tcp files, most la times articles are not available one by one as plain text files on a platform where researchers could cobble together a corpus through the search interface (and license agreements generally limit bulk downloads in any case). once researchers have pulled down a subset of the corpus, they must decide how much of the markup to keep or strip out before they can run it through a text visualization tool (unless they decide to use the text mining package in r or a similar programming language). leaving aside the technical skills needed to do this, researchers must also decide how to approach the dirty ocr problem because the texts themselves are riddled with errors due to the conversion process from microfilm. while data sets like this offer tremendous potential, it is not feasible for humanities scholars to make use of them without considerable support.

another example outside of the humanities is the united states census bureau, which provides access to data sets through a variety of different websites and formats. determining the type of data that is needed and locating that data can be challenging to researchers new to working with census data.
the census bureau offers a list of recommended software and provides workshops, webinars, and classroom trainings to help people get what they need. they also provide phone and e-mail support for researchers and people using census data in their work.

libraries are just beginning to offer a range of data sets to their users either through their subscription databases or through their own digital projects. usually this type of information is provided without creating a service model. faculty and students often have to figure out how to use these data sets themselves. once users have the data set, the library doesn’t play a strong role in helping them use it. the u.s. census bureau could serve as a service model for supporting text mining in the digital humanities. when an institution or a company provides access to a data set, do they have a responsibility to assist researchers in using the data set?

the following section presents different support models that allow us to examine the ways libraries are supporting digital scholarship projects with large data sets for research and learning. gaining access to the texts and analysis tools is not always the barrier to digital scholarship, especially for content out of copyright. researchers often need help locating resources, including money for staff, storage space, and software and technological expertise to execute their projects.

potential support models for digital scholarship using unwieldy data sets

although there are certainly scholars out there who are capable of making use of raw data sets, the majority are not. we as librarians and scholars need to advocate for the ways in which our scholars want to use these materials.
at the moment, we are operating in a bifurcated context. on the one hand, there are graphical interface tools, such as the google n-gram tool, that meet the needs of some researchers but do not give much flexibility or control to manipulate or build the corpus being analyzed; on the other hand, there is a move by publishers to dump the raw data. as in the case of the proquest historical newspapers data sets, publishers have responded to requests from researchers by making data sets available; these data sets are usually delivered in large raw text file dumps that are not manageable for the average humanist scholar.

advocacy

as a first step in enabling research with these data sets, libraries, as the purchasers and as the supporters of researchers, need to advocate for tools that create bridges between easy-to-use digital tools (like voyant and antconc) and the data sets. for example, rather than having either the entire raw data set for eebo-tcp or the oxford cut-and-paste formatted version, why not create tools that make it easy to use the platform to designate a corpus (i.e., by doing a search using the parameters on the platform) and then extract plain text files from the search results? in the case of the proquest historical newspapers example mentioned, it is not consistently possible across the pqhn platform to download plain text files of individual articles, although this would make text mining custom corpora much more manageable for researchers without a background in programming or the resources to hire an assistant to manage the technical aspects.

creating new tools

leonard recommends that libraries create tools or adopt open source tools to make analysis easier. at the yale university library, they adopted the hathitrust bookworm tool to analyze a small digital corpus of the vogue collection.
by creating tools that researchers can use to search text in other ways, they also help patrons to analyze their large digital collections ( ).

to facilitate work on the eebo-tcp data set, washington university in st. louis created the early modern print (n.d.) project, which is supported by the humanities digital workshop at washington university. the early modern print project provides exploration tools tailored to the eebo-tcp data. they describe the tools as

an aggregate view of the corpus that enables us to probe english lexical and orthographic history in ways that usefully complement the search capabilities of eebo-tcp and the oxford english dictionary; they also help us to see early modern book culture in a new way, as a structured flow of words. (early modern print n.d.)

the developers have created graphical interface tools, such as an eebo n-gram browser, to facilitate use of the collection by researchers, but users necessarily have less ability to manipulate the corpus when they are using this tool. until there are more robust tools available to make working with a broad range of data sets easier for scholars, libraries can play a role in supporting emerging research by teaching scholars basic skills.

the workshop model: creating stages for learning

in designing workshops to teach skills in digital scholarship, librarians need to be attentive to felt needs in their community and to carefully stage those workshops to make sure that instructors are not spending too much time on technical minutiae, such as constructing a corpus, or on frustration with setting up tools. to do this, workshop facilitators need to draw on the principles of backward design by asking, what is the intellectual outcome that they want to have in the session? wiggins and mctighe explain backward design as a methodology that conceives of curricular design by thinking at the outset in terms of outcomes rather than lessons: “given a task to be accomplished, how do we get there?
… what kinds of lessons and practices are needed to master key performances?” ( , ). in just the same way that you might design a classroom exercise to focus narrowly on imparting a specific skill or research strategy, it is useful to isolate the specific technical skill, as well as the possibilities for further exploration, that you hope to impart. this is likely to require more setup in advance by the workshop leaders (for example, creating a specific corpus to work with or downloading example files to practice on), but it will allow the session to focus on that specific skill rather than the frustrations of getting ready to learn that skill. a scenario to avoid is when workshop participants try to download software and wind up spending most of the time troubleshooting the download and relatively little time on using the tool.

designing workshops in ways that focus narrowly on outcomes may also require participants to use the same operating system and computers that have all been set up identically in advance. creating an equal computing environment is a big challenge, especially when people have different skill levels and different technology vocabularies. as the scholarship on how researchers learn technical skills suggests, if you can give an opening to the possibilities, and offer a framework for follow-up support, interested researchers will take the time to teach themselves or request consultations on how to do the technical minutiae. a key goal for a workshop can often be illustrating the possibilities. how can you illustrate the possibilities in the approach so that scholars are motivated to learn the details of downloading and constructing their own corpus? can you create a session that focuses on a piece of the process, for example, looking at a predetermined corpus in antconc? one approach is to make the entry easy so that scholars can decide if they want to do more, then offer resources for them to take the next steps.
a significant goal for workshops can be illustrating why researchers would want to learn these approaches.

workshops can also be augmented by working sessions, such as the hackfest sponsored by the bodleian libraries in (oxford university n.d.). this full-day session included researchers as well as robust technical support, as participants had a chance to “pitch ideas and find collaborators, firm up projects and groups, and request (or indeed recruit) technical help as necessary” (willcox ). key to the success of this model, practiced also by software carpentry, whose goal is “teaching basic lab skills for research computing” (software carpentry n.d.), is the availability of support from multiple people, rather than one or two workshop leaders trying to troubleshoot and lead the session.

classroom approach

in addition to workshops aimed at researchers at all levels, librarians can offer considerable support for digital scholarship through course-integrated instruction at the undergraduate or graduate level. if integrated thoughtfully into a course’s learning goals and assignments, course-integrated instruction can be, arguably, at least as effective as workshops because the individual skills to be taught are bound up with the questions raised by a specific course theme. by working with the faculty member leading the course, and by being attentive to the specific learning goals and questions for the course, librarians can design exercises that are targeted toward specific research questions. just as in workshops, it is essential that librarians front-load the planning for these instruction sessions to isolate the specific learning goal for the course. while it is neither possible nor, really, desirable to eliminate all possible frustration in working with complex data sets, librarians can anticipate and minimize potential pain points so that the session can focus on the learning goals.
for example, in one undergraduate class session at the university of michigan, the librarian and technology specialist worked closely with the faculty member to design an instruction session that drew on the eebo-tcp data set in a -level course. because the point of the assignment was not necessarily to teach students how to compile corpora for analysis but rather to allow students to perform text analysis on a set of relevant texts, they set the session up so that students were creating a limited corpus of only ten texts, based on search criteria that students determined (determining the search words was part of the goal for the exercise). to minimize frustration with the data set as a whole, they first showed students how to use the eebo platform so as to explore texts related to their topics and identify ten potential texts. once they had identified the ten texts, it was relatively easy for students to find those texts on the oxford platform and cut and paste the text into plain text files. although this approach may have glossed over some of the intricacies of the data set and corpus creation, it allowed students to create a minicorpus relatively easily to import into voyant, where the bulk of the learning was meant to happen.

the lab approach: scholarspace at the university of michigan library

scholarspace at the graduate library at the university of michigan provides access to technologies for small-scale experimentation and technologies for formal project support with the understanding that anyone can access them. scholarspace supports humanists working on text mining projects by providing access and expertise for digitization, storage, text cleanup, and analysis. we have purchased text mining software that is not available elsewhere on campus, thereby providing access to anyone affiliated with the university. this approach relies on humanists being willing to experiment with librarians and to train each other.
text mining varies greatly by discipline; through creating a community of scholars, we can build a network of experts and draw on experiences and expertise related to text mining in chinese studies, economics, history, english language and literature, and more.

staffing models

across these different models, the question remains as to how best to apportion staffing to support digital scholarship. in a distributed model, where librarians are leading workshops for the campus community and for classes, subject specialists, technology librarians, and undergraduate learning librarians can provide considerable support, especially if they are provided training and if the workshops are a natural extension of their expertise and outreach areas. depending on the demand on campus, this model can, however, lead to librarians being stretched too thin; thus, creative staffing, such as training students to lead or support workshops, is necessary. likewise, students can be brought into a project to work on a specific slice, such as ocr-ing pdf files and cleaning up the resulting ocr. in this case, however, it is important to bring the students into the conversation about the project at some level so that they understand how their work fits into the larger intellectual work of the project. otherwise, libraries miss out on the opportunity to mentor students in emerging questions and methodologies of digital scholarship. the bulk of preparing texts for mining and analysis can also be tedious, and it requires careful attention to detail. librarians or others overseeing students working on dh projects need to be vigilant in keeping the work moving forward and in checking the quality and consistency of the work. sustainability and scalability are challenges across all staffing models. projects that have dedicated funding may not have enough funding to cover the entire project.
students cycle off projects either because they graduate or because they receive other opportunities such as internships or jobs.

conclusion

as the preceding discussion of staffing illustrates, challenges remain in thinking through collaborative work in digital scholarship, especially in terms of the necessary (but not as obviously exciting) work of data preparation and cleanup. the need to develop and create digital scholarship projects will continue to grow in the humanities, and at some institutions it will be embedded into the curriculum. project management, digitization, and analysis are skills humanists will need in the future, and they will learn them through the channels available. these skills can translate easily to a number of positions postgraduation and will be desired by employers. having graduate students work on digital projects can provide them with perfect opportunities to obtain new skills. considering that resources are not currently in place to make data sets easier to use in the near future, librarians can advance digital scholarship by helping scholars in incremental ways targeted at the specific challenges and frustrations that data sets pose. librarians can set the expectation that they will work with students and faculty to explore these new areas together and work to scaffold the learning experience so that humanists beginning text mining see the possibilities and not just the minutiae. some challenges that still persist include developing relationships across campus, continually building skills, and finding partners to collaborate with.

orcid

sigrid anderson cordell http://orcid.org/ - - -
melissa gomis http://orcid.org/ - - -

references

antonijevic, smiljana. . amongst digital humanists: an ethnographic study of digital knowledge production. new york: palgrave macmillan.
early english books online (eebo). n.d.
“what is early english books online?” http://eebo.chadwyck.com/about/about.htm#top
“early modern print: text mining early printed english.” n.d. http://earlyprint.wustl.edu
edmond, jennifer. . “collaboration and infrastructure.” in a new companion to digital humanities, edited by susan schreibman, ray siemens, and john unsworth, – . chichester, uk: john wiley & sons.
fraistat, neil. . “the function of digital humanities centers at the present time.” in debates in the digital humanities, edited by matthew gold, – . minneapolis: university of minnesota press.
gibson, katie, marcus ladd, and jenny presnell. . “traversing the gap: subject specialists connecting humanities researchers and digital scholarship centers.” in digital humanities in the library: challenges and opportunities for subject specialists, edited by arianne hartsell-gundy, laura braunstein, and liorah golomb, – . chicago: association of college and research libraries.
green, harriett e. . “facilitating communities of practice in digital humanities: librarian collaborations for research and training in text encoding.” the library quarterly ( ): – .
heuser, ryan, long le-khac, and franco moretti. . “learning to read data: bringing out the humanistic in the digital humanities.” victorian studies: an interdisciplinary journal of social, political, and cultural studies ( ): – .
hunter, elizabeth. . “must humanists learn to code? or: should i replace my own carburetor?” hastac (blog), december , https://www.hastac.org/blogs/shakespeare-games/ / / /must-humanists-learn-code-or-should-i-replace-my-own-carburetor
kirschenbaum, matthew. . “hello worlds: why humanities students should learn to program.” the chronicle review ( ): b .
leonard, peter. . “mining large datasets for the humanities.” ifla library.
http://library.ifla.org/ / / -leonard-en.pdf
levelt, sjoerd. n.d. “#eeboliberationday.” https://storify.com/sjoerdlevelt/eeboliberationday
lewis, vivian, lisa spiro, xuemao wang, and jon e. cawthorne. . building expertise to support digital scholarship: a global perspective. washington, dc: council on library and information resources.
liu, alan. . “digital humanities and academic change.” english language notes ( ): – .
maron, nancy. . “the digital humanities are alive and well and blooming: now what?” educause review. http://er.educause.edu/~/media/files/articles/ / /erm .pdf
maron, nancy, and sarah pickle. . “sustaining the digital humanities: host institution support beyond the start-up phase.” ithaka s+r. http://www.sr.ithaka.org/wp-content/mig/sr_supporting_digital_humanities_ f.pdf
oxford university. n.d. “text creation partnership: eebo, ecco and evans texts.” http://ota.ox.ac.uk/tcp/
reid, alexander. . “graduate education and the ethics of the digital humanities.” in debates in the digital humanities, edited by matthew gold, – . minneapolis: university of minnesota press.
schaffner, j., and r. erway. . “does every research library need a digital humanities center?” oclc research report. http://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-digital-humanities-center- .pdf
software carpentry. n.d. “software carpentry: teaching basic lab skills for research computing.” https://software-carpentry.org
text creation partnership (tcp). . “eebo-tcp phase i public release: what to expect on january .” http://www.textcreationpartnership.org/ / / /eebo-tcp-phase-i-public-release-what-to-expect-on-january- /
wiggins, grant p., and jay mctighe. . understanding by design. alexandria, va: association for supervision and curriculum development.
willcox, pip. .
“early english books hackfest.” bodleian libraries (blog), april , http://blogs.bodleian.ox.ac.uk/digital/ / / /early-english-books-hackfest/

tri-agency research data management policy development
research data management summit for canadian colleges, institutes and polytechnics
centennial college
december ,
presented by kevin fitzgibbons, executive director, corporate planning and policy, nserc

presentation outline
• rationale for research data management (rdm)
• tri-agency rdm policy development
• stakeholder engagement and consultation

what are research data?
research data are contents that are used as primary sources to support research, scholarship, artistic activity or research-creation, and that are used as evidence in the research process and commonly accepted in the research community as necessary to validate research findings and results.

why data management?
• research excellence
• research dissemination
• research impact
• research best practice

research excellence
project management support reproducibility avoid duplication journal guidelines

research dissemination
data sharing, citation, interdisciplinarity

research impact
within science, social impact, policy impact

research best practice
transparency, trust, ethics, responsible use of public funds

international developments
research funders
• national science foundation
• national endowment for the humanities – odh
• national institutes of health
• uk research councils
• european commission
• national natural science foundation of china
foundations and charities
• american heart association
• bill and melinda gates foundation
• alfred p sloan foundation
• gordon and betty moore foundation
• open society foundations
• royal society
• wellcome trust
• cancer research uk

[diagram: research institutions (cags, libraries, it, ethics, grad studies, researchers, rso); courtesy of chuck humphrey, former director, portage]

government of canada directive on open government

tri-agency rdm policy development: background
• capitalizing on big data: toward a policy framework for advancing digital scholarship in canada
• tri-agency statement of principles on digital data management
• draft tri-agency research data management policy

intended impact of a tri-agency rdm policy
the agencies aim to contribute to a future research culture that sees:
• strong data management as an accepted signifier of research excellence across disciplines, and a regular feature in the conduct of research;
• more canadian datasets cited, and valued as a product of research in tenure, promotion and peer review processes;
• canadian researchers equipped and ready to engage in international research collaboration where data management requirements are becoming the norm;
• canadian research institutions ready to support the management of the data their researchers produce; and
• increased ability for research data to be archived, found and
responsibly reused, to fuel new discovery and innovation.

draft tri-agency rdm policy
• consultation feedback will inform final policy
• proposed policy includes possible requirements:
  1. institutional strategy (institutions)
  2. data management plans (researchers)
  3. data deposit (researchers)
• implementation: phased, incremental

draft tri-agency rdm policy
1. institutional strategy
• each institution administering tri-agency funds could be required to create an institutional research data management strategy. the strategy could outline how the institution will provide its researchers with an environment that enables and supports world class research data management practices.
• the strategy could be posted and made publicly available on the institution’s website, with contact information to direct inquiries about the strategy.

draft tri-agency rdm policy
why require institutional strategies?
• recognizes the role of institutions in providing supports for data management;
• provides an opportunity for institutions to think through where gaps exist, and how to address them from a campus-wide perspective;
• could aid institutions in developing an approach that works for them, while encouraging alignment and collaboration with other institutions;
• could provide information to agencies about data management capacity; and
• serves as foundation for the potential requirements that follow.
example support: portage institutional strategy template

draft tri-agency rdm policy
2. data management plans
• grant recipients could be required to create data management plans (dmps) for research projects supported wholly or in part by tri-agency funds. grant recipients could submit these plans to their institution’s research office as a condition of the release of grant funds.
• for specific funding opportunities, the agencies could require dmps to be submitted to the appropriate agency at time of application; in these cases, they may be considered in the adjudication process.
draft tri-agency rdm policy
why require data management plans?
• dmps are an emerging international best practice;
• dmps are an excellent way for researchers to identify opportunities and challenges in managing their data, well before those opportunities and challenges emerge;
• researchers claim that the process of developing a dmp helps them to improve their research plans and methodologies;
• dmps could serve the responsible conduct of research and the research ethics approval process; and
• dmps help identify and mitigate issues related to ownership of data, potential for data sharing, etc.
example support: portage dmp assistant

draft tri-agency rdm policy
3. data deposit
• for all research data and code that support journal publications, pre-prints and other research outputs that arise from agency-supported research, grant recipients could be required to deposit these data and code in an appropriate public repository or other platform that will ensure safe storage, preservation, curation, and (if applicable) access to the data.

draft tri-agency rdm policy
why require data deposit?
• methods, expectations and online security will change; storing in a secure location provides a better chance for data to be safe and of use to the creator in the future;
• data deposit helps ensure proper use of public funds;
• facilitates reproducibility of results; and
• facilitates data sharing.
example support: carl-portage-compute canada’s federated research data repository (frdr)

community feedback is key
research community feedback is essential to inform the final design of the policy and the mode of its implementation. the agencies consider the draft rdm policy as a proposal through which to advance discussion with stakeholders in the research community, with a tri-agency rdm policy as the desired end product.
stakeholder engagement
• regional stakeholder meetings
  • vancouver, calgary, toronto, montréal and halifax;
  • revealed excitement and optimism about the potential for data management to contribute to research excellence; and
  • also demonstrated concern over challenges, such as researcher awareness, capacity and funding.
• online consultation on draft policy june-september
• continued discussions with broad array of stakeholders
  • researchers, scholarly and scientific associations, data management advocacy and support organizations, funding agency colleagues around the globe.

online consultation june - september
• approx. responses, mostly from colleges & universities
• areas of feedback:
  • strength of policy
  • ethics and privacy
  • implications of the policy
  • capacity
  • supports for education
• feedback will inform further development of the policy over winter -

online consultation feedback

thank you!
questions or feedback? contact:
nserc: researchdata-donneesderecherche@nserc-crsng.gc.ca
sshrc: researchdata-donneesderecherche@sshrc-crsh.gc.ca
cihr: researchdata-donneesderecherche@cihr-irsc.gc.ca

scholarly and research communication volume / issue /

abstract

enhancing publications has a long history but is gaining acceleration as authors and publishers explore electronic tablets as devices for dissemination and presentation.
enhancement of scholarly publications, in contrast, more often takes place in a web environment with a focus on interoperability within and across publication platforms, and is coupled with presentation of supplementary materials related to research. the approach to enhancing scholarly publications presented in this report goes a step further and involves the interlinking of the “objects” of a document: bibliographic information on authors, datasets, supplementary materials, secondary analyses, and post-publication interventions. this approach has been explored in a project and this is a technical report about that project. specific to that project is the combination of the user-centricity of web . with the semantic web. the goal is to facilitate long-term content structure through standardized formats, thereby improving interoperability between concepts and terms within and across knowledge domains. in our project, we explored this specific concept of enhancement on a small set of books prepared for traditional academic publishers. concentrating in this report on aspects of the technical development, we introduce an ongoing conceptual discussion and reflection on the position of this project in relation to new directions in scholarly publishing. keywords scholarly communication; publishing; enhanced publication; semantic web; wordpress nicholas w. jankowski is associate researcher at the ehumanities group, royal netherlands academy of arts and sciences (knaw). he is editor of e-research: transformation of scholarly practice (routledge, ). email: nickjan@xs all.nl . andrea scharnhorst is head of e-research at data archiving and networked services (dans), and scientific coordinator, ehumanities group, royal netherlands academy of arts and sciences (knaw). anna van saksenlaan , ht the hague, the netherlands. email: andrea. scharnhorst@dans.knaw.nl . 
clifford tatum is a phd candidate at leiden university, project manager of academic careers understood through measurement and norms (acumen), and associate researcher at the knaw e-humanities group. web: tatum.cc; email: clifford@tatum.cc. zuotian tatum is a scientific programmer for the human genetics group at leiden university medical center. she is also a freelance software developer, working with wordpress for academic scholarship on the web. email: z.tatum@lumc.nl.
enhancing scholarly publications: developing hybrid monographs in the humanities and social sciences nicholas w. jankowski, andrea scharnhorst, & clifford tatum royal netherlands academy of arts and sciences zuotian tatum leiden university medical center ccsp press scholarly and research communication volume , issue , article id , pages journal url: www.src-online.ca received january , , accepted june , published december , nicholas w. jankowski, andrea scharnhorst, clifford tatum, & zuotian tatum. ( ). enhancing scholarly publications: developing hybrid monographs in the humanities and social sciences. scholarly and research communication, ( ): , pp. © nicholas w. jankowski, andrea scharnhorst, clifford tatum, & zuotian tatum. this open access article is distributed under the terms of the creative commons attribution non-commercial license (http://creativecommons.org/licenses/by-nc-nd/ . /ca), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
introduction
scholars in the humanities and social sciences are increasingly considering possibilities for making research available on the web.
instruments for data collection and analysis, datasets and metadata describing this material, conference papers, and project reports are appearing (and in some cases are required to be placed) in web-based repositories. one area receiving less attention in this trend, however, is the development of web venues that integrate the traditionally published book with the diverse materials related to an overall research project. “enhanced publication” is a term reflecting such integration, and a range of initiatives have been supported by the surffoundation in the netherlands to develop such forms of publication. surf is a publicly founded organization, responsible for administrative coordination and for initiating information and communication technology (ict) driven innovation in the area of higher education and research. surf regularly issues tenders for innovative projects, including for the preparation of enhanced publications. the project “enhancing scholarly publishing in the humanities and social sciences” was one of six projects of the last tender call for surf. this report, written by the researchers conducting this project, elaborates on the main intentions and accomplishments during that period. initiated in january , enhancing scholarly publishing was designed to prepare websites for four scholarly books which had previously been traditionally-published and, in the process, to utilize a model of and tools for enhanced publications developed by surf. by way of conclusion, we reflect on some of the challenges encountered, and sketch paths meriting further exploration, when developing enhanced publications. this report begins with a backdrop of more general initiatives for enhancing print publications and proposes a definition of the term “enhanced publication” as relevant to this project. 
that panorama and definition are followed by the presentation of the objectives of the enhancing scholarly publishing project and the conceptual principles underlying the database architecture for the book websites. three of the four websites are presented in the next section (the fourth book that was part of the project is in production and insufficiently developed for inclusion in this report). finally, in the conclusion section we reflect on the overall project and note areas where further research and development should be undertaken.
enhancing publications
panoramic overview
the enhancement of print publications has been an ongoing endeavour, at least since gutenberg, if not earlier (palmer & frangenberg, ). manuscripts artistically and elaborately illustrated by monks in monasteries suggest interest in enhancement as far back as the middle ages. more contemporary practices of including visualizations in books (tables, figures, photographic plates) are extensions of such enhancement. in the current digital age, many publishers, both within and outside of academia, are exploring ways to integrate printed text with dynamic visualizations and supplementary digital materials available on the internet. this project was designed around web technologies for conventional personal computers. however, present developments of enhanced publications are making use of mobile devices and e-tablets. scholarly presentations are also being developed for the ipad platform, including a version of the iliad containing both the original greek and english translations, jack kerouac’s novel on the road, including a wide range of supplementary materials (e.g., maps of the journey, audio files, original manuscript), and t. s. eliot’s seminal poem the waste land, showing some pages of the original manuscript, audio readings synchronized to the text, annotations explaining passages, and interpretations of the poem by eliot scholars. popular science publications, perhaps less scholarly in objective but nevertheless engagingly interactive, include a range of entries prepared by touch press in association with faber and faber. solar system for ipad (chown, ) is the widely acknowledged “crown jewel” in the growing series of titles being prepared by these two collaborating publishers. a scholarly initiative in enhancement is the decade-long research enterprise for the project entitled visualizing cultures – image-driven scholarship. this enterprise was initiated at the massachusetts institute of technology (mit) and involves the compilation of images and texts related to historical studies of japan and china, all available in a web environment. designed for both research and educational objectives, the images are housed in a database, and more than a hundred videos are included on the site (see figure ).
figure : screenshot from visualizing cultures. source: http://ocw.mit.edu/ ans / f/ f. /home/index.html
one of the most far-reaching scholarly publications integrating scholarly data with its publication is the mark twain project (http://www.marktwainproject.org/). as expressed in its mission statement, the “ultimate purpose is to produce a digital critical edition, fully annotated, of everything mark twain wrote” (mark twain project, , para. ). the project reflects long-term collaboration between the university of california bancroft library and the university of california press, resulting in both traditionally printed volumes and web access to digital versions of the volumes and the original sources on which the publications are based. in comparison to the above-mentioned projects, initiatives from most academic book publishers fall short when it comes to enhancement. the situation is different in the area of journal publications. here, traditional academic publishers are rapidly providing a host of features for the traditionally staid and text-dominated journal article. elsevier, through its subsidiary cell press, prepared a far-reaching initiative called the article of the future, launched in july . a year later it had been implemented in the score of cell press journal titles. figure shows several of the features available for articles in the journal cell: pop-up illustrations and in-text references, a navigation bar to sections of the article, and dynamic updating of reference citations.
figure : screen shot of pop-up visualization, elsevier “article of the future.” source: http://cell.com/
many of the web features found in the cell press journal articles are also suitable for inclusion in websites accompanying traditionally published book monographs.
however, few of these features are commonly found on the websites prepared to accompany printed scholarly books in the humanities and social sciences. in that regard, this project represents a step forward in preparation of enhanced publications within the humanities and social sciences.
what is an enhanced publication?
an increasing body of literature is appearing that elaborates on the idea of an “enhanced publication” (e.g., boulal, lordanidis, quast, & schirrwagen, ; hoogerwerf, jong, & scholte, ). one of the more extensive reviews of this literature (woutersen-windhouwer & brandsma, ) proposes the following definition:
[an enhanced publication is] a publication that is enhanced with research data, extra materials, post publication data, database records (e.g. the protein data bank), and that has an object-based structure with explicit links between the objects. in this definition an object can be (part of) an article, a data set, an image, a movie, a comment, a module, or a link to information in a database. (woutersen-windhouwer & brandsma, , p. )
since formulation of this definition, debate around what constitutes an enhanced publication continues. most recently, breure, voorbij, and hoogerwerf ( ) proposed a new term, “rich internet publication,” which they suggest can be seen as a scale that reflects degrees of involving integration, visualization, and exploration. aware of the need for further exploration of types of enhancement suitable for scholarly publications, surf issued a round of funding for pilot projects in . in the call for proposals, to which the enhancing scholarly publications project was submitted, the term “enhanced publication” is described as follows:
an enhanced publication consists of a publication, usually in the form of text, enhanced with extra material. a publication can be an article in a journal, a dissertation, report, memo, or a chapter in a book.
the condition is that it is related to (scientific) research and includes an interpretation or analysis of the primary data or derivative thereof. the supplementary material can, for example, consist of research data, illustrative images, meta datasets, and post-publication data, such as comments and ranking data. given the changes in post-publication data, it is possible that an enhanced publication continues to develop across time. (author translation of text, “wat is een verrijkte publicatie?” surffoundation, )
enhancement of publications involves a range of concrete tasks, and woutersen-windhouwer and brandsma ( , pp. – ) propose a checklist for preparation of the objects included in an enhanced publication:
• providing persistent identifiers that are unique and global;
• ascertaining timestamp and citation information;
• using commonly available file types;
• ensuring that datasets have universal numeric identification;
• achieving adequate technical quality to merit preservation;
• considering legal issues related to incorporation of materials.
they also suggest various forms of additional information related to objects in enhanced publications: availability and sustainability; ownership and responsibility; and indication whether an object has been peer reviewed, ranked, cited, and commented upon. objects within an enhanced publication should be linked in a meaningful manner, they suggest, balancing complexity with utility. moreover, the relation between linked objects (e.g., a chapter being part of a book) should be made clear.
the above definition stresses permanence, persistence, and authenticity, an approach understandable from the perspective of the institutions involved in preserving text, i.e., repositories for scientific publications and research libraries. but persistence and permanence are only two aspects of an enhanced publication; linking information in an interoperable and machine-readable manner to be processed over the web is another facet. various standards, protocols, and tools have been developed to facilitate preparation of enhanced publications. perhaps the most important of these are the resource description framework (rdf) and especially the open archives initiative object reuse and exchange (oai-ore), the latter of which “develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content” (woutersen-windhouwer & brandsma, , p. ). figure below depicts typical object relationships in the form of subject-predicate-object statements (triples). oai-ore handles the aggregation of rdf triples that describe the relations between the publication, its sub-components (such as chapters, illustrations, and references), and related web resources. without elaborating here on specific relations, the figure illustrates the logic of rdf triples, which are the foundation of semantic interoperability.
figure : basic rdf structure. source: http://www.openarchives.org/ore/ . /primer#rdf
data models such as the resource description framework (rdf) (as shown in figure ) belong to technical specifications developed by the world wide web consortium (w3c), and are part of semantic web technologies.
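the subject-predicate-object logic described above can be sketched in a few lines of python, using the standard library only. this is an illustration under stated assumptions, not the project's own code: the example.org identifiers, the placeholder title, and the `match` helper are hypothetical, and the two predicates are taken from the dublin core terms vocabulary.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, for illustration only.
BOOK = "http://example.org/book"
CHAPTER_1 = "http://example.org/book/chapter-1"
FIGURE_1 = "http://example.org/book/chapter-1/figure-1"

# Predicates from the Dublin Core terms vocabulary.
IS_PART_OF = "http://purl.org/dc/terms/isPartOf"
TITLE = "http://purl.org/dc/terms/title"

triples = [
    Triple(CHAPTER_1, IS_PART_OF, BOOK),
    Triple(FIGURE_1, IS_PART_OF, CHAPTER_1),
    Triple(BOOK, TITLE, "an example monograph"),
]


def match(store: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Which objects are declared to be part of the book?
parts_of_book = [t.subject for t in match(triples, predicate=IS_PART_OF, obj=BOOK)]
print(parts_of_book)
```

because every statement has the same three-part shape, a single pattern-matching function is enough to answer questions such as "what belongs to this book?"; a real triple store offers the same idea through a query language.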
originally designed as a data model for metadata, rdf has been applied, for example, to bibliographic references. this immediately points to one of the challenges confronting this project. despite the long tradition of bibliographic referencing, interoperability among reference managers and beneath the layer of ontologies is far from perfect. a variety of classification schemes and ontologies exist that are used for bibliographies. although traditional bibliographic entries usually indicate the author(s), title of document, publisher, year of publication, and specification of details when the document is an article in a periodical, the elements included in the ontology or categorization can change between different bibliographic systems. common problem areas include references to special issues and serials, and disambiguation of contributor roles, such as editors and authors. but the aim of enhanced publications goes far beyond the automatic construction of bibliographic data. the oai-ore model depicted in figure below contains elements such as:
• seeing a scholarly work as an aggregation of different contributions
• flagging sources for discursive arguments in the text
• highlighting the different roles among contributors, such as authors, editors, and sometimes objects of research
• linking to other related material (from research data, to visualizations, to other related work)
in the description of this project we present specific solutions to some of those problems.
figure : oai-ore aggregation. source: http://www.openarchives.org/ore/ . /primer#rdf
project objectives, platform design, and website features
project design
two central objectives were formulated for the surf project enhancing scholarly publishing: ( ) to develop hybrid forms of publications, and ( ) to develop a database allowing for aggregation of content attributes and associations across the individual book websites (jankowski, ). regarding the first objective, a template constructed within the wordpress platform was to be used to construct websites to complement the four selected books. these websites were to contain a broad range of features:
• supplementary resources (e.g., links, blogs, chapter appendices, author profiles);
• chapter visualizations (e.g., animations, figures, tables) in colour;
• hyperlinks, both internal and external, to the book texts;
• author updating of site materials;
• search features.
regarding the second objective, a database was to be developed that would allow for aggregation of content attributes and associations across the individual book websites, such that topical relationships, intellectual underpinnings, and contextual factors could be made explicit. fundamental to this approach is a focus on web-based texts as dynamic and evolving discourses rather than completed works ready to be archived. in this project, the wordpress (http://wordpress.org/) content management system (cms) was employed as the foundation for the websites because of its ease of use and its widespread adoption. according to the recent world wide web technology survey, roughly % of the million largest websites on the internet use a cms. of the websites using a cms, wordpress holds % market share. four books were selected as pilots to be included in the project: three edited anthologies and a single-author university-level textbook:
• jankowski, n. w. ( ). e-research: transformation in scholarly practice. new york, ny: routledge.
• wouters, p., beaulieu, a., scharnhorst, a., & wyatt, s. ( ). virtual knowledge: experimenting in the humanities and social sciences.
cambridge, ma: mit press. [ ].
• park, d., jankowski, n. w., & jones, s. ( ). the long history of new media: technology, historiography, and newness contextualized. new york, ny: peter lang.
• jankowski, n. w. (forthcoming). digital media: concepts & issues, research, & resources. cambridge, uk: polity press. [ ].
all books belong to the area of communication sciences, media studies, and science and technology studies. as reflected in figure , a conceptual diagram of the project, the content of each book website is managed with a local database that is connected to a central database. in this way, the linkage is established for aggregation and each book website retains an individual web presence with local content management and storage. the following websites have been developed as part of this project:
• central project website: http://ep-books.ehumanities.nl
• e-research book website: http://scholarly-transformations.virtualknowledgestudio.nl
• long history of new media book website: http://thelonghistoryofnewmedia.net
in addition to these websites, a central accomplishment for the project is the launch of semantic wordpress for digital scholarship, termed semantic words, which consists of two specially tailored open source plugins designed to introduce traditionally published books to web-based scholarly communication. each of the websites offers a broad range of features intended to enhance the printed versions of these books, including supplementary resources, visualizations, intertextual linking of content, and formal structuring of content using semantic web ontologies.
in a subsequent phase, a central database will be established to facilitate aggregation of content across the individual book websites, such that object relationships, discursive threads, and contextual factors can be traced across the collection.
figure : concept diagram of enhanced publication project.
tool development
the development strategy for this project focused in the first place on academic practice: how academics in the social sciences and humanities conduct scholarship. this priority informed the development of specific functionality and interfaces. to this end, the relative ease of using wordpress played an important role. figure displays the functional modules of the platform in three clusters: the wordpress software (upper left), community-developed wordpress plugins (lower left), and the custom-developed plugins (right).
semantic wordpress for digital scholarship
the semantic wordpress for digital scholarship framework (semantic words: http://ep-books.ehumanities.nl/semantic-words) is the basis for this hybrid platform, which leverages web 2.0 participatory modes of scholarly communication combined with formalized content structures imposed by semantic web formats (see figures and ). semantic web stands for a new generation of data models and web technologies. the vision proposed by tim berners-lee ( ) aims for a web of data on top of the existing network of web resources linked by hyperlinks. one could also talk of a giant, machine-readable index to the web. in contrast to web 2.0, in which user-generated content and platforms for information sharing are central, the semantic web or web 3.0
stands for standards, new data representations, and models that are machine-readable, and so support automatic knowledge ordering. the term web 2.0 is typically used to mark the transition to user-generated content on the web. from this perspective, content on the web interconnects with the use of hyperlinks. by adding content and linking to relevant webpages, users facilitate the construction of meaning through associative links. the aggregate of user-generated content gives rise to the search engine as a dominant navigation tool. the semantic web is an effort to formalize content structure through the creation of centralized (as opposed to emergent) content ontologies. for enhanced publications, the semantic web approach facilitates increased granularity of web objects and increased precision of the semantic meaning between objects. increased granularity means the book is broken down into its constituent parts, for example, chapters, images, and references, each of which is uniquely identified. the role of a content ontology is ( ) to define meaningful relationships (instead of simple hyperlinks) between and among objects and ( ) to assign precise types to each of the objects. for example, in a web 2.0 environment, the user gives the hyperlink any meaning she chooses, whereas in semantic web formats “book” and “chapter” have specific, standardized meanings. in web 2.0, both a book and a chapter are just html objects and a search engine cannot distinguish between them, but in the semantic web they are distinct types with unique identifiers.
the semantic words software was developed as open source for use with the wordpress content management software, which is also open source. semantic words consists of two custom plugins that are integrated with wordpress and zotero (http://www.zotero.org/), an open source web annotation and citation management system.
figure : semantic words functional diagram.
enhanced bibliplug
the first of the semantic words plugins, enhanced bibliplug (http://ep-books.ehumanities.nl/semantic-words/enhanced-bibliplug), provides a suite of features for authors, which are focused on organization and publication of academic content on the web. features include custom page templates for academic texts, integration with zotero for citation management, and expanded author profile pages for cv content management, such as publications, presentations, projects, and other related career accomplishments. in addition to providing authors with advanced tools for publishing on the web, bibliplug facilitates visibility (e.g., in search engines) of relationships between and among researchers, institutions, and both formal and informal scholarly communication. bibliplug was first developed for the virtual knowledge studio in and is still in use on some dozen project-related websites. at the time of its development, the goal was to create a central repository for all researchers affiliated with the studio to organize their academic work. the initial design included (a) a database schema for storing bibliographical references, (b) administration pages to manage the references, and (c) a shortcode for easy retrieval of references based on author, year, and publication type. in this project, we further developed the plugin and re-released it as enhanced bibliplug.
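the retrieval feature in the initial design (references selected by author, year, and publication type) can be sketched as follows. this is a python illustration of the filtering logic only, assuming a simple in-memory list; bibliplug itself is a wordpress plugin with its own database schema, and the `Reference` record, the `find_references` function, and the placeholder entries below are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Reference:
    """A simplified bibliographic record (hypothetical structure)."""
    authors: Tuple[str, ...]
    year: int
    pub_type: str
    title: str


# Placeholder records, not real publications.
REFERENCES = [
    Reference(("author a", "author b"), 2010, "book", "placeholder book"),
    Reference(("author a",), 2011, "article", "placeholder article"),
    Reference(("author c",), 2011, "article", "another placeholder article"),
]


def find_references(refs: List[Reference],
                    author: Optional[str] = None,
                    year: Optional[int] = None,
                    pub_type: Optional[str] = None) -> List[Reference]:
    """Filter by any combination of author, year, and publication type."""
    selected = [
        r for r in refs
        if (author is None or author in r.authors)
        and (year is None or r.year == year)
        and (pub_type is None or r.pub_type == pub_type)
    ]
    # Newest first, then alphabetical by title for a stable listing.
    return sorted(selected, key=lambda r: (-r.year, r.title))


for ref in find_references(REFERENCES, author="author a"):
    print(ref.year, ref.pub_type, ref.title)
```

leaving a criterion unset means "do not filter on it", which mirrors how a shortcode with optional attributes would typically behave.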
added functionalities include (a) the ability to connect and synchronize with zotero accounts (see figure ), (b) a custom author page template to display a user’s academic title and affiliation, biography, and cv content such as publications and presentations, (c) the ability to export bibliography data in rdf format based on semantic publishing and referencing (spar) ontologies, and (d) the ability to group references based on categories and tags.
enhanced publication for wordpress
the second plugin, enhanced publication for wordpress (http://ep-books.ehumanities.nl/semantic-words/enhanced-publication-plugin-for-wordpress), works in parallel with bibliplug. added content is simultaneously structured in semantic web formats based on academic publishing ontologies. unlike many semantic web applications, this plugin includes integration of a visualization feature, such that object relationships can be browsed with the incontext application developed by the surffoundation. the central function of this plugin is to describe a wordpress site as an oai-ore aggregated book (an enhanced publication). in this structure, we convert wordpress pages into book chapters and use various other plugins to facilitate and describe reference lists, authors and editors, and attachments. for visualizing the content object relationships, we employ surf’s incontext visualiser, which is shown in figure (http://www.surffoundation.nl/en/projecten/pages/escapevisualisationcomponent.aspx).
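the idea of describing a site as an oai-ore aggregation whose pages become chapters can be sketched as follows. this is a minimal python illustration, not the plugin's implementation: the `describe_site` and `to_ntriples` functions, the `Literal` marker class, the page dictionaries, and the example.org addresses are all hypothetical. the vocabulary terms used (`ore:Aggregation`, `ore:aggregates`, `dcterms:title`, `rdf:type`) come from the published oai-ore, dublin core, and rdf vocabularies.

```python
from typing import Dict, List, Tuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"


class Literal(str):
    """Marks a plain text value, as opposed to a URI."""


def describe_site(site_uri: str, title: str,
                  pages: List[Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Describe a site as one aggregation that aggregates each page."""
    base = site_uri.rstrip("/")
    aggregation = base + "#aggregation"
    triples = [
        (aggregation, RDF_TYPE, ORE + "Aggregation"),
        (aggregation, DCTERMS + "title", Literal(title)),
    ]
    for page in pages:
        # Each page is treated as a chapter with its own identifier.
        chapter = f"{base}/{page['slug']}"
        triples.append((aggregation, ORE + "aggregates", chapter))
        triples.append((chapter, DCTERMS + "title", Literal(page["title"])))
    return triples


def to_ntriples(triples: List[Tuple[str, str, str]]) -> str:
    """Serialize the statements in the line-based N-Triples syntax."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Hypothetical site and pages, for illustration only.
pages = [
    {"slug": "chapter-1", "title": "first chapter"},
    {"slug": "chapter-2", "title": "second chapter"},
]
triples = describe_site("http://example.org/book/", "an example book", pages)
print(to_ntriples(triples))
```

a fuller description would add the roles and types mentioned in the text (authors, editors, reference lists, attachments) as further statements about the same identifiers.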
rdf and ontologies: backbone of semantic words as mentioned in the introduction, we aimed to create a web-based representation of resources, which is also machine-readable. the notion of a semantic web was introduced by tim berners-lee (berners-lee, hendler, & lassila, ) to indicate a move from a web of resources or documents to a “web of data.” if the web is compared to a collection of items in a library, the semantic web could be compared to the creation of catalogues and indexes to describe the content of the collected items in a meaningful way. to be interoperable, such indexes need to be constructed in a standardized way. consequently, semantic web technologies use “formal (usually symbolic) representation languages where some meaning is encoded separately from data and content” (meroño-peñuela, ashkpour, van erp, mandemakers, breure, scharnhorst, schlobach, & van harmelen, , p. ). a set of different technologies belong to the semantic web approach, among them rdf as a general method to describe information, and ontologies as knowledge representation languages (see wikipedia, ). this combination of website features and book objects necessitated use of a particular set of ontologies to cover the hybrid configuration of book contents published as a website. in particular, the visualization layer (incontext visualizer) expects roles to be defined in a particular way. we therefore selected a list of related ontologies to describe the full content of the aggregation. 
following is a list of ontologies used:
• rdf: resource description framework ontology;
• oai-ore vocabulary for resource aggregation;
• dcterms: dublin core metadata ontology;
• foaf: friend of a friend ontology;
• frbr: functional requirements for bibliographic records ontology;
• swan: provenance, authoring, and versioning in scientific discourse ontology;
• res: academic researchers ontology;
• biro: bibliographic reference ontology;
• fabio: frbr-aligned bibliographic ontology;
• prism: publishing requirements for industry standard metadata ontology;
• escape-display: vocabulary for describing inverse relationship of foaf.

in addition to customized plugins, several other plugins are used from among the wide range of open source plugins developed by the wordpress community (see figure ). we use three additional plugins to augment functionality in our custom plugins: co-authors plus (http://wordpress.org/extend/plugins/co-authors-plus/), ninja page categories and tags (http://wpninjas.net/plugins/ninja-page-categories-and-tags/), and user avatar (http://wordpress.org/extend/plugins/user-avatar/).

figure : zotero connector admin page in wordpress
figure : semantic words aggregation structure
figure : incontext visualization, virtual knowledge website

implementing enhancement

this section outlines the steps undertaken in developing the web presence for the three traditionally published books noted above using the semantic words framework. inasmuch as there are specific details and features related to each, they are presented separately below. all of the book-related websites, however, are based on a uniform template constructed within the wordpress content management system described in the previous section.

book : e-research

the book e-research: transformation in scholarly practice was released in mid- by routledge and reflects the characteristic features of traditionally published and specialized scholarly monographs: hardcover, black text printed on white paper, and figures reproduced in tones of gray. there is no use of colour in the book, other than on the cover. a web-based enhanced version of this publication could include a myriad of features associated with websites, such as:
• illustrations, figures, and tables in colour;
• internal hyperlinks between sections of the book;
• external hyperlinks to related internet-based materials;
• supplementary resources for book chapters (e.g., recent publications, multimedia, and other materials).
many additional features are also possible:
• interlinking index terms with book text;
• chapter references with hyperlinks;
• author search via google scholar for other publications;
• keyword search for similar publications;
• periodic updating of material by chapter authors;
• comment and blog functions facilitating interactions between readers and authors.

the publisher granted permission to place the text of the book on the website, and this allowed us to illustrate how the chapters would be presented in both pdf and html file formats. at this time, two chapters have been prepared in this fashion. figure shows the website page with links to presentations given by authors. figure illustrates information on related books and links to sites associated with these publications. figure depicts author information from the database created for the book. although preparation of the website complementing this book is well underway, the text for all chapters has not yet been uploaded to the site. once completed, these chapter presentations will also include the following functionalities:
• search function through chapter texts;
• hyperlinks embedded in chapters;
• pop-up figures and tables in chapters.

figure : links to presentations, e-research website
figure : publications related to e-research book
figure : author information, e-research author database

book : virtual knowledge

the book manuscript virtual knowledge is based on research prepared by scholars associated with the virtual knowledge studio (vks) for the humanities and social sciences, established by the royal netherlands academy of arts and sciences (knaw) in . the primary objective of the vks was to facilitate innovative research practices in the humanities and social sciences, and the book virtual knowledge is designed to reflect that aim. contributions came from scholars associated with the three divisions of the vks: in amsterdam, rotterdam (erasmus studio), and maastricht (maastricht studio). one function of the book project was to enhance cohesion among the wide array of vks projects and to foster interactions among staff at the three divisions of vks.

from its conception, vks intended to initiate and conduct new research practices and to engage with ongoing innovative practices of other researchers. in this regard, vks researchers were both “makers” and “observers” of new digital scholarship. two notions central to science and technology studies (sts), which constituted the home discipline of many of the central members of the vks, are practice and community. these notions are reflected in the preparation of virtual knowledge and in the complementing website.
regarding the book, three workshops were conducted during preparation of chapters; regarding the website, one workshop was held on preparing and uploading content for the site. based on interactions during preparation of the book, it was decided to prepare a web complement to the print volume. several considerations contributed to adoption of this idea:
• continuing interactions among authors;
• supporting the formation of a community around ideas expressed in the book;
• embedding the book in an emerging environment of similar books;
• disseminating and promoting the book.

to support preparation of an enhanced publication for the book and to explore how preparation of such a web complement might facilitate the previously mentioned community function, a workshop was organized in april for book contributors. of the contributors, seven attended the workshop. the event provided the opportunity for participants to become familiar with the website and the general procedures for uploading information, including bibliographic entries that were submitted with a specially prepared plugin for the wordpress site. the workshop concluded with a general discussion, during which some participants expressed regret at not having been involved at an earlier stage of the process, when they could have contributed to the design and the user interface of the site. this discussion was continued in a post-workshop survey that allowed all contributors to reflect on the website under construction.

the level of contribution during and after the workshop was modest. while content has been uploaded to the site, much remains to be completed. that acknowledged, preparation for the workshop did stimulate members of the project team to complete the website template and specially developed plugins for bibliographic entries.
some of the criticisms of the workshop and reservations about an enhanced publication included:
• inadequate involvement of the book editors in the planning;
• unclear value of a book website for authors;
• time constraints preventing engagement at the desired level;
• uncertainty about the utility of some site features, including author photos and videos;
• technical problems experienced with the site, including functioning of the interface;
• insufficient support from project team members in using the site.

some of the positive reactions to the workshop and enhanced publication project included:
• appreciation for being able to link references;
• acknowledgment of potential value in creating cohesion of edited collections through an enhanced publication;
• value of website for author visibility;
• relevancy of site to own research practices.

negotiations are ongoing with mit press, the publisher of the book, for development of an enhanced version. preliminary reactions reflect interest in publishing the volume and in combining the book with an enhanced publication in the form developed during this project. to this end, a website (under construction and not yet publicly accessible) has been prepared and includes the basic functionalities included in the wordpress template. many of the functionalities for the accompanying website will remain important and further work will be required to complete preparation of the content related to these features (e.g., providing supplementary resources such as links, uploading bibliographic entries, and completing video films of authors reflecting on their chapters).
it is anticipated that a second workshop for authors may be necessary once arrangements have been made with the publisher regarding preparation of the book. this workshop will build on the experiences of the initial workshop held during this project.

figure : homepage, long history of new media

book : long history of new media

the third book that is part of this project was released by peter lang in may and is entitled the long history of new media: technology, historiography, and contextualizing newness. as with the other books in the project, this is an edited volume and has been prepared and published in a manner reflective of conventional procedures for scholarly publishing. the book was released as a paperback and the cover consists of a designed arrangement of book title and names of editors. the text of the book is printed in black ink on white paper; there are few illustrations and no tables in the book.

the website constructed for the long history of new media contains a set of features similar to those prepared for the other two books and uses the same wordpress template for the site; see figure illustrating the homepage of the site and figure containing biographical sketches of contributing authors. inasmuch as the book was recently released by the publisher peter lang, and no prior arrangement had been made for reproducing the full book manuscript, only introductory paragraphs from the chapters have been uploaded to the site, along with the text of the introduction chapter.
book : digital media and society

the website for this book is under construction and the content will mirror the content available on the e-research book website and include the following features:
• book-related materials: description of book, table of contents, chapter abstracts, figures from chapters, compilation of references, and publisher information;
• profiles of contributors: photos and bios of authors and editors;
• supplementary resources: lists of institutions, publications, videos, and presentations related to web history;
• topic-related blogs: group blog for authors of the book, individual blogs by book authors, and other blogs relevant to the themes in the book;
• interlinking index terms with book text;
• figures reproduced on website; figures in colour;
• chapter references with hyperlinks and an overall bibliography for book;
• author search via google scholar for other publications by author;
• keyword search for similar publications based on chapter titles.

figure : author biographical sketches, long history of new media

conclusion

this project involved the development of hybrid web venues for three traditionally published scholarly books. these web venues extended beyond the increasingly common practice of preparing electronic brochure-style websites accompanying scholarly titles; the sites developed within this project incorporate features that reflect what has come to be termed enhanced publishing.
while there is a broad range of interpretation as to what constitutes an enhanced publication, the features included in the websites of the enhanced scholarly publication project reflect an interlinking of components of the publications in a manner made possible by utilizing web . applications and practices, and content structures facilitated by semantic web formats. this involved construction of a database for each of the book titles, allowing for aggregation of content within and across the individual book websites. the initial wordpress template for the book websites was redesigned to facilitate ease of use by book authors and to ensure basic uniformity in the presentation of site content. plugins for the site were designed, tested, and implemented; these plugins facilitate author bios and reference management with a variety of display options within each book chapter and for the book as a whole.

websites were prepared for each of the four books using the common template, and illustrations of content for each of the books were uploaded to the respective sites. the amount of content uploaded varies per book because of the different phases of completion and the particular “life cycles” of each book. for example, the book e-research was released two years ago and the publisher agreed to allow the full text of the book to be placed on the website. a database has been constructed for each of the three book titles and these individual databases are integrated into an overall database.
looking back on the project, the undertaking involved an accelerated learning experience in the terminology and models underlying the notion of enhanced publishing as formulated by the funding organization, the surffoundation. while the websites for the books are not equally developed and completed, each reflects a different phase of the book life cycle and all are based on preparation of an overarching database containing materials from those monographs. one of the most interesting outcomes of this project was the application of the incontext visualizer to this set of books. as the navigation interface for users, it introduces semantic web technology to non-specialist users and illustrates the kinds of information that are exposed for machine readability.

to evaluate this project properly, it is important to realize that enhanced publications are still in a preliminary phase of development. adopting a phrase from innovation studies, it can be said that the development is experiencing a “precambrian explosion” of many different approaches, small in scale, coexisting, and often short in lifespan (kauffman, , p. ). the development of dominant technologies and design has yet to be achieved. this concerns not only enhanced publications but also semantic web technologies, as is evident in google’s knowledge graph or wolfram’s computational knowledge engine, both of which are considered alternatives to a “web of data.”

in conclusion, it should be stressed that the enhancing scholarly publications project was practically oriented and exploratory; it did not have theoretical aspirations or intend to perform empirically-grounded research. the exploration did reveal, however, the need to extend the theoretical understanding of the transformations that scholarly publishing is undergoing, and to develop an empirical research agenda related to those understandings.
although separate from this project, some of the team members have been undertaking theoretical and empirical research related to the concept of openness and scholarly communication, which could guide further practically-oriented projects at enhancing publications (tatum & jankowski, ). the research agenda that might evolve from this exploration could include formative case studies of similar initiatives to enhance scholarship, thereby contributing to both theory construction and scholarly practice.

acknowledgements

the authors gratefully acknowledge the financial and collegial support provided by the surffoundation and the e-humanities group. this report is a revised version of the paper prepared for the pkp conference , available at: http://pkp.sfu.ca/ocs/pkp/ index.php /pkp /pkp /paper/view/ / . the website for the project enhancing scholarly publishing provides additional materials on which this paper is based: http://digital-scholarship.ehumanities.nl/enhanced-publications. a video describing this and other surf enhanced publications projects is available at: http://www.surffoundation.nl/en/themas/openonderzoek/verrijktepublicaties/pages/default.aspx.

notes
this report presents a specific project “enhancing scholarly publications in the humanities and social sciences,” presented at the pkp scholarly publishing conference, september - , , berlin, germany, http://pkp.sfu.ca/ocs/pkp/ index.php/pkp /pkp .
see, e.g., the nsf data sharing policy, available at: http://www.nsf.gov/bfa/dias/policy/ dmp.jsp .
see surf website for details: http://www.surffoundation.nl/en/themas/openonderzoek/verrijktepublicaties/pages/default.aspx .
the project has been presented by a film produced by surf: http://www.youtube.com/watch?v=nhkd oqslnw&feature=relmfu. another film introduces all six projects: http://www.youtube.com/watch?v=fhi j yuuk&feature=youtu.be .
one example is the website announced in july as supplement for the harry potter series (http://www.pottermore.com/). more elaborate, however, is the specially designed app for the ipad version of al gore’s our choice (gore, ), released by push pop press in . the website accompanying the book, http://ourchoicethebook.com/site_media/ index .html, reflects many of the features found on websites complementing published books: illustrations, sample texts, order information, references.
the ipad app for the iliad was developed by the center for visualization and virtual environments at the university of kentucky; see http://viscenter.wordpress.com .
an extensive overview of the additional ipad features available on the app of on the road can be seen at the penguin books website for the amplified edition: http://us.penguingroup.com/static/pages/features/amplified_editions/on_the_road.html .
publisher faber and faber prepared a special website for the release of this ipad app: http://thewastelandforipad.com .
at the time of this writing, early , touch press has released five titles for ipad; see the publisher website for details: http://www.touchpress.com .
http://ocw.mit.edu/ans / f/ f. /home/index.html .
see elsevier press release for details: http://www.elsevier.com/wps/find/authored_newsitem.cws_home/companynews _ .
the eu-funded project driver (digital repository infrastructure vision for european research) was responsible for this and other reports; see project website for an overview of studies: http://www.driver-repository.eu .
http://www.openarchives.org
http://w techs.com/technologies/overview/content_management/all
http://ep-books.ehumanities.nl/semantic-words
for background and rationale of this hybrid approach to enhanced publications, see the project final report (jankowski, scharnhorst, tatum, & tatum, ), available on the digital scholarship website: http://digital-scholarship.ehumanities.nl/enhanced-publications .
the wordpress community has produced more than , plugins; see http://wordpress.org/extend/plugins .
the vks concluded operation on december and the e-humanities group was created to continue and extend activities of the vks under a modified organizational structure. this transition is discussed further on the e-humanities group website: http://ehumanities.nl .
http://googleblog.blogspot.nl/ / /introducing-knowledge-graph-things-not.html
http://www.wolframalpha.com

websites

co-authors plus: http://wordpress.org/extend/plugins/co-authors-plus
ninja page categories and tags: http://wpninjas.net/plugins/ninja-page-categories-and-tags/
user avatar: http://wordpress.org/extend/plugins/user-avatar
enhanced bibliplug: http://ep-books.ehumanities.nl/semantic-words/enhanced-bibliplug
enhanced publication for wordpress: http://ep-books.ehumanities.nl/semantic-words/enhanced-publication-plugin-for-wordpress
incontext visualiser: http://www.surffoundation.nl/en/projecten/pages/escapevisualisationcomponent.aspx
semantic words: http://ep-books.ehumanities.nl/semantic-words
wordpress: http://wordpress.org
zotero: http://www.zotero.org

references

boulal, anouar, iordanidis, martin, quast, andres, & schirrwagen, jochen. ( ). report on enhancing interoperability between existing open access publication infrastructures. cologne, germany: bielefeld university library. url: http://www.eco r.org/downloads/eco r_report_compoundobjects_draft.pdf [january , ].
berners-lee, tim, hendler, james, & lassila, ora. ( ). the semantic web - a new form of web content that is meaningful to computers will unleash a revolution of new possibilities. scientific american, special online issue, – . url: http://csis.pace.edu/~marchese/cs /lec / _semweb.pdf [june , ].
breure, leen, voorbij, hans, & hoogerwerf, maarten. ( ). rich internet publications: ‘show what you tell.’ journal of digital information ( ). url: http://journals.tdl.org/jodi/article/view/ / [january , ].
chown, marcus. ( ). solar system for ipad. london, uk: touch press.
url: http://itunes.apple.com/us/app/solar-system-for-ipad/id ?mt= &ign-mpt=uo% d [january , ].
gore, al. ( ). our choice: a plan to solve the climate crisis. new york, ny: rodale books. url: http://ourchoicethebook.com/site_media/index .html [january , ].
hoogerwerf, maarten, jong, jan de, & scholte, hans. ( ). enhanced publications in archaeology: an analysis of potential enhancements. url: https://www.surfgroepen.nl/sites/jalcproject/project% results/wp % -% enhanced% publications% in% archaeology.% an% analysis% of% potential% enhancements.pdf [january , ].
jankowski, nicholas w. (ed.) ( ). e-research: transformation in scholarly practice. new york, ny: routledge.
jankowski, nicholas w. ( ). enhancing scholarly publishing in the humanities and social sciences: innovation through hybrid forms of publication. [project proposal]. url: http://digital-scholarship.ehumanities.nl/enhanced-publications/ [january , ].
jankowski, nicholas w. (forthcoming). digital media: concepts & issues, research, & resources. cambridge, uk: polity press.
jankowski, nicholas w., scharnhorst, andrea, tatum, clifford, & tatum, zuotian. ( ). enhancing scholarly publishing in the humanities and social sciences: innovation through hybrid forms of publication. [project report]. url: http://digital-scholarship.ehumanities.nl/enhanced-publications/ [january , ].
kauffman, stuart a. ( ). at home in the universe. the search for laws of self-organization and complexity. oxford, uk: oxford university press.
mark twain project. ( ). home. url: http://www.marktwainproject.org/ [june , ].
meroño-peñuela, albert, ashkpour, ashkan, van erp, marieke, mandemakers, kees, breure, leen, scharnhorst, andrea, schlobach, stefan, & van harmelen, frank. ( ). semantic technologies for historical research: a survey. semantic web – interoperability, usability, applicability. url: http://www.semantic-web-journal.net/content/semantic-technologies-historical-research-survey [september , ].
palmer, rodney, & frangenberg, thomas. (eds.) ( ). the rise of the image: essays on the history of the illustrated art book (reinterpreting classicism). surrey, uk: ashgate publishing.
park, d., jankowski, n.w., & jones, s. ( ). the long history of new media: technology, historiography, and newness contextualized. new york, ny: peter lang.
surffoundation. ( ). wat is een verrijkte publicatie? [what is an enhanced publication?] url: http://www.surffoundation.nl/nl/themas/openonderzoek/verrijktepublicaties/pages/default.aspx [january , ].
tatum, clifford, & jankowski, nicholas. ( ). beyond open access. a framework for openness in scholarly communication. in: paul wouters, anne beaulieu, andrea scharnhorst & sally wyatt (eds.), virtual knowledge: experimenting in the humanities and social sciences. (pp. – ). cambridge, ma: mit press.
wikipedia. ( ). semantic web. url: http://en.wikipedia.org/wiki/semantic_web [june , ].
wouters, p., beaulieu, a., scharnhorst, a., & wyatt, s. ( ). virtual knowledge: experimenting in the humanities and social sciences. cambridge, ma: mit press.
woutersen-windhouwer, saskia & brandsma, renze. ( ). report on enhanced publications; state-of-the-art. project driver (digital repository infrastructure vision for european research ii). url: http://dare.uva.nl/document/ [january , ].

parsing early and late modern english corpora

gerold schneider, hans martin lehmann and peter schneider
english department, university of zurich, switzerland

abstract

we describe, evaluate, and improve the automatic annotation of diachronic corpora at the levels of word-class, lemma, chunks, and dependency syntax. as corpora we use the archer corpus (texts from to ) and the zen corpus (texts from to ). performance on modern english is considerably lower than on present day english (pde). we present several methods that improve performance. first we use the spelling normalization tool vard to map spelling variants to their pde equivalent, which improves tagging. we investigate the tagging changes that are due to the normalization and observe improvements, deterioration, and missing mappings.
we then implement an optimized version, using vard rules and preprocessing steps to improve normalization. we evaluate the improvement on parsing performance, comparing original text, standard vard, and our optimized version. over % of the normalization changes lead to improved parsing, and . % of all manually annotated sentences get a net improved parse. as a next step, we adapt the parser’s grammar, add a semantic expectation model and a model for prepositional phrase (pp)-attachment interaction to the parser. these extensions improve parser performance, marginally on pde, more considerably on earlier texts: – % on pp-attachment relations (e.g. from . to . % and from to . % on th century texts). finally, we briefly outline linguistic applications and give two examples: gerundials and auxiliary verbs in the zen corpus, showing that despite high noise levels linguistic signals clearly emerge, opening new possibilities for large-scale research of gradient phenomena in language change.

. introduction

over the past decade several robust broad coverage syntactic parsers have become available. they have successfully been used for the annotation of present day english (pde) corpora. more recently, large, automatically annotated corpora have been investigated in areas like syntax-lexis interactions, where enormous amounts of data are necessary (e.g. lehmann and schneider, ) and manually annotated corpora are limited by their size. historical corpora tend to be limited in size not only by the restrictions set by extant material but also by the effort necessary to bring the data into electronic form. however, there are fairly large unannotated diachronic corpora like the zen corpus with . million words, the archer corpus with . million words, and the old bailey corpus with million words.
the entire old bailey proceedings contain approximately million words. the main goal of the present article is to explore automatic syntactic annotation of this kind of data covering a period from roughly to the present.

correspondence: gerold schneider, english department, university of zurich, plattenstrasse , ch - zurich, switzerland. email: gschneid@es.uzh.ch
digital scholarship in the humanities, vol. , no. , . © the author . published by oxford university press on behalf of allc. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqu advance access published on february

concerning the periodization of the english language history, we follow approaches in which the early modern english period (emode) has been suggested as ranging from about to (e.g. görlach , p. – , rissanen ), and the late modern english period (lmode) from or to start of the th century (e.g. tieken-boon van ostade ).

in this article we describe the automatic annotation of diachronic corpora at the levels of word-class, lemma, noun and verb chunks as well as dependency syntax. for this purpose, we adapt a framework for annotation and analysis developed for pde (cf. lehmann and schneider, a,b). the spelling variation found in early and late modern english presents a major obstacle to automatic annotation. in section , we present strategies and discuss the training and adaptation of the normalization tool vard (baron and rayson, ). section reports on the performance and adaptations made to pro3gres (schneider ), the dependency parser we employ for the syntactic annotation. we evaluate the performance and describe the adaptations in the areas of lexical preferences and grammar rules necessary to parse the historic data as diachronic variation is potentially stronger than synchronic variation. in section , we explore the possibilities and limitations of the syntactically annotated diachronic corpora for historical linguistics.
specifically, we discuss the problems introduced by the automatic annotation. to illustrate the new possibilities offered by the dependency annotated corpora, we present two pilot studies. we investigate diachronic change in the use of gerundials as well as the change from be to have as auxiliary in present perfect constructions.

. spelling variation and normalization

spelling variants can cause major problems for automatic annotation. simple variants like call’d for called typically result in wrong tagging, chunking, and parsing, as can be seen in fig. . the tagger assigns the word-class general noun singular to call and modal to ’d. as a consequence, the chunker fails to identify the verb group was called. in turn, the parser only produces two fragments and unsurprisingly fails to attach the modal ’d.

there are two possible strategies for dealing with spelling variants. either the annotation tool is adapted to cope with the variant directly or the spelling variants are normalized to the forms expected by the annotation tool. our annotation framework makes use of lt-ttt2, which in turn uses the c&c tagger, the morpha lemmatizer and the lt-ttt2 chunker (grover ). let us consider the seemingly simple problem of hath and doth. it is not enough to amend the lexicon of the tagger with forms like hath and doth. to really incorporate the variant forms, we would have to retrain the tagger with tagged text in which hath and doth actually occur. but we could not stop there because even a correctly tagged hath may not be recognized by the lemmatizer. and after adapting the tagger and lemmatizer we would have to change the rules of the chunker, which would otherwise not recognize hath seen as a verb group in the same way as has seen. last but not least we would have to adapt the parser, which relies on a closed class of words that can function as auxiliaries in order to deal with auxiliaries in subject verb inversions.
in our present approach we try to avoid this type of complexity by normalizing the variant forms. by simply substituting doth with does, we inherit the lexicon entry and the training data for does as well as the properties of does encoded in the lemmatizer, the chunker, and the parser, as illustrated in fig. .

fig. annotation problem caused by variant form call’d

for automatic normalization we use vard (baron & rayson, ). intuitively, tagging, and consequently also chunking and parsing, improve from mapping the original spelling to the same spelling as used in the tagger and parser training resource. the statistical performance disambiguation, which uses lexical heads, should equally profit. as the normalization process also makes errors, the assumption that performance will improve cannot be taken for granted. concerning tagging accuracy, this assumption has been tested in rayson et al. ( ). they report an increase of about % (from to % accuracy) on shakespeare texts. as an upper bound, when texts are manually normalized, they report % accuracy. in the following we describe the normalization with vard.

. using unmodified vard for zen normalization

as a first step, the zen text was input to vard using the default setup parameters included with version . . of the software. the non-interactive mode of vard compares every w-unit of the input text to a standardized pde lexicon. if a variant does not occur in the lexicon, several algorithms are applied to find a normalized replacement, and a ‘confidence score’ is calculated which indicates the estimated likelihood that the replacement actually matches the original w-unit. using the auto-normalize function with a % threshold, the vard output was analysed cursorily to get a rough idea of where it could be improved.
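The lookup, candidate generation, and confidence threshold described above can be illustrated with a minimal sketch. It is not VARD's algorithm: the lexicon, the two rewrite rules, the confidence values, and the names `MANDATORY`, `LEXICON`, `propose`, and `normalize` are all assumptions made for illustration, and the code uses the ASCII apostrophe rather than the typographic one.

```python
from dataclasses import dataclass

# Hypothetical mandatory replacements (forms that are in the lexicon
# but should still be mapped to a modern equivalent).
MANDATORY = {"hath": "has", "doth": "does"}

# Toy modern lexicon; a real one would hold many thousands of entries.
LEXICON = {"the", "was", "called", "public", "has", "does", "he", "it"}


@dataclass
class Candidate:
    form: str
    confidence: float  # invented score between 0.0 and 1.0


def propose(token):
    """Generate replacement candidates from two illustrative rules."""
    candidates = []
    if token.endswith("'d"):            # call'd -> called
        candidates.append(Candidate(token[:-2] + "ed", 0.9))
    if token.endswith("ick"):           # publick -> public
        candidates.append(Candidate(token[:-1], 0.8))
    return candidates


def normalize(token, threshold=0.5):
    """Return a normalized form, or the token unchanged."""
    low = token.lower()
    if low in MANDATORY:                # explicit replacements win
        return MANDATORY[low]
    if low in LEXICON:                  # already a known modern form
        return token
    best = max(propose(low), key=lambda c: c.confidence, default=None)
    if best and best.confidence >= threshold and best.form in LEXICON:
        return best.form
    return token                        # no confident candidate


print([normalize(t) for t in ["he", "hath", "call'd", "publick", "swann"]])
```

Raising `threshold` above a candidate's confidence leaves the original spelling in place, which is the trade-off the auto-normalize threshold controls: fewer wrong replacements at the cost of more missed ones.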
most of the automatic normalizations are obviously useful, such as the -ick and ’d endings, and the e->o vowel change, while other items need a closer look (e.g. assignees should not be normalized to assigns). table shows a list of the most frequent automatically suggested normalizations.

looking at the suggested normalization in context, we found the following types of suboptimal output:
• unnecessary normalization
• missing normalizations
• incorrect normalizations
• abbreviations

since our aim was to normalize zen for tagging and parsing, not for lexical correctness by pde standards, we tried to concentrate on those areas where we expected the normalization to help the part-of-speech (pos) tagger. ideally, an optimized normalization process should observe the following maxims:
• all normalized items should retain the word class if it was correctly identifiable in the original form.
• when the tagger would not correctly identify the word class of an original item, it should be normalized to a form with the correct word class.
• little or no effort should be made to improve the normalization of items whose original and normalized form share the same word class.

fig. comparison of normalized and original input to the annotation chain

. problems and solutions for vard processing

. . unnecessary normalization

most non-standard variants and problematic normalizations concern place names and proper names. while it may be historically interesting, the normalization of names is not really necessary in the context of part-of-speech identification since software for automatic tagging can identify them in the original spelling, as in example ( ):

( ) this day sir william swan, . . . cui : this_dt day_nn sir_nnp william_nnp swann_nnp . . .
likewise, variants of place names pose no problem, such as in ( ):

( ) letters_nns from_in vienna_nnp and_cc francfort_nnp tell_vbp us_prp . . .

since titles and honorifics are usually followed by one or more proper names as in sir john fitzgerald, vard was instructed not to process a sequence of title variants (e.g. sir, lord, marquis), a preposition (e.g. de, of), and one or two capitalized words. this was achieved with a set of regular expressions in the ‘text_to_ignore.txt’ file. some more expressions were added to skip likely place names preceded by a set of indicators, such as province of . . . , parish of . . . to avoid more unnecessary normalizations.

. . missing normalizations

the old verb forms hath and doth confuse the tagger. of the occurrences of hath, only are identified as verbs, and in the case of the fifty-three instances of doth, only eight are seen as verbs. since the standard lexicon contains both forms, they are not automatically normalized. this was remedied in the interactive mode of vard by explicitly adding the normalized variants has and does to the list of mandatory replacements (variants.txt).

. . incorrect normalizations

since vard’s lexicon is derived from a word list based on most frequent items in modern corpora, many less-frequent words are missing. this means that vard will attempt to normalize items even though they would be correctly spelled by pde standards. table presents a list of items and their (incorrect) normalization as proposed by standard vard.

since zen has a different lexical frequency distribution compared with modern corpora, it was necessary to manually go through the most frequent variants in the vard interactive mode, and decide if an item needs to be added to the word list (‘all not variant’) or to the list of mandatory replacements (‘normalize to . . .’).

. . abbreviations

non-standard abbreviations occur frequently in zen. while abbreviated titles such as bart (baronet) or esq.
(esquire) are usually non-problematic, the tagger sometimes stumbles over abbreviated first names, such as wm (william) or edw (edward):

( ) lgz : whoever_wp secures_vbz the_dt mare_nnp,_, and_cc gives_vbz notice_nn to_to edw_vb quane_nn . . . shall_md have_vb _cd s_prp._.reward_nnp . . .

( ) lgz : whoever_wp secures_vbz the_dt horse_nnp . . . and_cc gives_vbz notice_nn to_to wm_vb brooke_nnp . . . shall_md have_vb _cd guineas_nnp reward_nnp._.

some common abbreviations were therefore added to the vard list of items with mandatory replacements (variants.txt). in addition to first names, we included frequent items such as ult (‘last month’, fifty-nine instances) and ’em (‘them’, ).

. . non-standard capitalization

the tagger is sensitive to capitalization issues since capitalization is used to identify proper nouns. taggers do typically not identify a capitalized adjective, as in ( ):

( ) gat : . . . prisoners_nns in_in the_dt tobooth_nnp here_rb,_, were_vbd served_vbn with_in criminal_nnp letters_nnp,_, at_in the_dt instance_nn of_in his_prp$ majesty_nnp ’s_pos advocats_nns . . .

we do not address the problem of non-standard capitalization of nouns in zen in this article.

table most frequent vard normalizations of zen ( , , tokens, of which , were automatically normalized)
count | original | normalized
 | tis | it is
 | publick | public
 | publish’d | published
 | tho’ | though
 | assignees | assigns
 | call’d | called
 | lett | let
 | chuse | choose
 | arriv’d | arrived
 | shew | show

. evaluation of optimizations

. . summary view

in order to evaluate the relative improvements between the original zen text (z ), the default vard auto-normalized version (z ), and the optimized version (z ), the three text versions were processed by the c&c tagger. in a first attempt, individual pos tags were counted and arranged in four main groups of tags (fig. ).
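The grouped count described here is easy to reproduce on any text in `word_tag` format. The following is a minimal sketch, assuming whitespace-separated tokens with the tag after the last underscore; the function name `grouped_tag_counts` and the sample sentence handling are illustrative and not part of the authors' tool chain.

```python
from collections import Counter

# Tag prefixes for the four groups named in the figure caption:
# JJx adjectives, NNx nouns, RBx adverbs, VBx verbs.
GROUPS = ("JJ", "NN", "RB", "VB")


def grouped_tag_counts(tagged_text):
    """Count tokens per tag group in whitespace-separated word_tag text."""
    counts = Counter()
    for token in tagged_text.split():
        _word, sep, tag = token.rpartition("_")
        if not sep:                      # token carries no tag; skip it
            continue
        for prefix in GROUPS:
            if tag.upper().startswith(prefix):
                counts[prefix + "x"] += 1
                break
    return counts


sample = ("we_prp are_vbp advised_vbn that_in admiral_nnp norris_nnp "
          "'s_pos fleet_nnp met_vbd with_in a_dt great_jj storm_nn")
print(grouped_tag_counts(sample))
```

Because such totals ignore which tokens changed, two versions of a text can have nearly identical group counts even when many individual tags differ.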
however, this evaluation only revealed a somewhat lower proportion ( %) of nouns and a very slightly higher proportion of verbs ( %) when both normalized texts z and z were compared with the original z .

fig. distribution of grouped pos tags (jjx: adjectives, nnx: nouns, rbx: adverbs, vbx: verbs, x axis indicates number of tags)

. . a changes-based look at the normalizations

another type of analysis was therefore necessary to reveal more relevant differences. rather than going on counting unrelated entities, we decided to classify how normalization affected pos sequences and word+pos-tag combinations. to this end, the gnu wdiff tool was applied to each set z z , z z , creating a list of word+tag edits. the output annotates deleted sequences with [- and -] indicators, and corresponding replacements with {+ and +}. ( ) shows the influence of normalization on pos tagging between z and z :

( ) evp : we_prp are_vbp [-advis_nns ’d_vbd-] {+advised_vbn+} that_in admiral_nnp norris_nnp ’s_pos fleet_nnp met_vbd with_in a_dt great_jj storm_nn in_in the_dt [-gulph_nnp-] {+gulf_nnp+} of_in lions_nnps,_, but_cc [-suffer_vbp ’d_md-] {+suffered_vbd+} no_dt other_jj damage_nn than_in some_dt of_in the_dt transports_nns with_in troops_nns on_in board_nnp being_vbg [-oblig_vbn ’d_md-] {+obliged_vbn+} to_to shelter_nn themselves_prp in_in some_dt of_in the_dt harbours_nns of_in the_dt mediterranean_nnp._.

while the normalization of gulph to gulf did not prompt the tagger to analyse the item differently, the normalization of the ’d verb forms leads to a better analysis. for a further comparative look at the changes, regular expressions were applied to the wdiff output to only consider sequences where the assigned pos tags underwent a change. table lists the most frequent such changes in z .

it turns out that roughly half of the normalizations affect the tagging, as shown in table . the z version has % fewer overall normalizations compared with z , but still has a % higher number of tag-affecting normalizations. this is a likely result of the title sequence ignore instructions indicated in section . . .

another analysis of the wdiff output discards other content, leaving just the pos tag intact. the results of the comparison between z and z presented in table show that the most frequent tag sequence changes are similar, apart from the nn to vbz transition in z . this is the effect of the addition of hath to the dictionary as proposed in section . . .

table incorrect normalizations due to lexicon limitations
zen original | vard auto-normalization
assignee | assigns
patence (patentee) | patience
relict (widow) | relic
footpad (robber on foot) | footpath
porte (ottoman empire) | port
dom (spanish title, or abbreviated an[no] dom[ini]) | doom
messuage (dwelling) | message
paul (first name) | pal

. . standard versus optimized normalization

the same set of tools that was used to assess the differences between the original (z ) and the normalized versions (z , z ) was also employed to evaluate the potential improvements in tagging. similar to table , tag-affecting changes between the non-optimized and the optimized normalization are summarized in table . to increase legibility, we did not include changes due to differences in the form of compounds, such as the presence or absence of a hyphenation or a word space in place names with street, lane, row (e.g. fleetstreet/fleet street, or drury-lane/drury lane).
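The two steps described in this section, a word-level diff of two tagged versions followed by a regular-expression filter that keeps only tag-changing edits, can be sketched as follows. This is an approximation under stated assumptions: it uses Python's `difflib` in place of GNU wdiff, imitates only the `[-...-] {+...+}` notation, and the helper names `wdiff`, `tags_only`, and `tag_changes` are hypothetical.

```python
import difflib
import re


def wdiff(old_tokens, new_tokens):
    """Word-level diff in a wdiff-like [-deleted-] {+inserted+} notation."""
    out = []
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens,
                                      autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(old_tokens[i1:i2])
            continue
        if i2 > i1:                                   # deleted material
            out.append("[-" + " ".join(old_tokens[i1:i2]) + "-]")
        if j2 > j1:                                   # inserted material
            out.append("{+" + " ".join(new_tokens[j1:j2]) + "+}")
    return " ".join(out)


# A deletion immediately followed by its replacement.
EDIT = re.compile(r"\[-(.*?)-\] \{\+(.*?)\+\}")


def tags_only(chunk):
    """Drop the word forms and keep only the tag after the last underscore."""
    return " ".join(tok.rpartition("_")[2] for tok in chunk.split())


def tag_changes(diff_text):
    """Return (old tags, new tags) pairs for edits that alter the tagging."""
    changes = []
    for old, new in EDIT.findall(diff_text):
        old_tags, new_tags = tags_only(old), tags_only(new)
        if old_tags != new_tags:
            changes.append((old_tags, new_tags))
    return changes


old = "are_vbp advis_nns 'd_vbd that_in the_dt gulph_nnp".split()
new = "are_vbp advised_vbn that_in the_dt gulf_nnp".split()
diff = wdiff(old, new)
print(diff)
print(tag_changes(diff))
```

In this sketch the gulph/gulf edit is dropped by `tag_changes` because both forms carry the same tag, mirroring the distinction between overall and tag-affecting normalizations. Pure deletions or insertions without a paired replacement are not matched by `EDIT`; counting them would need an extra pattern.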
the importance of the correct identification of the verb hath as pde has is again illustrated nicely: if has carries the (correct) vbz tag, the following verb form will also be correctly identified as a past participle (vbn) instead of past tense (vbd). while most of the z ->z changes are welcome improvements, table shows that there are exceptions: items which were correctly normalized in z appear to have regressed in z , such as the missing ’d/ed verb ending normalization. since the other instances of allow’d and ninety-three instances of follow’d are handled and normalized by vard as expected, this is likely due to a different f-score assigned in the optimized version.

table affected tag sequences in z and z (n ≥ )
count | z | count | z
 | [-nn md-] {+vbn+} | | [-nn md-] {+vbn+}
 | [-nn-] {+nnp+} | | [-jj nnp-] {+vbn+}
 | [-jj nnp-] {+vbn+} | | [-nn-] {+nnp+}
 | [-vb nnp-] {+vbn+} | | [-vb nnp-] {+vbn+}
 | [-nn-] {+jj+} | | [-nn-] {+jj+}
 | [-nnp-] {+nn+} | | [-nnp nnp-] {+vbn+}
 | [-nnp nnp-] {+vbn+} | | [-vb md-] {+vbn+}
 | [-vb md-] {+vbn+} | | [-nn-] {+vbz+}
 | [-vbp md-] {+vbd+} | | [-nnp-] {+nn+}
 | [-nn md-] {+vbd+} | | [-vbp md-] {+vbd+}

table tag-affecting changes due to normalisation (n ≥ )
count | z z
 | [-publish_vb ’d_nnp-] {+published_vbn+}
 | [-publick_nn-] {+public_jj+}
 | [-tis_nnp-] {+it_prp is_vbz+}
 | [-publick_nn-] {+public_nnp+}
 | [-tis_vbz-] {+it_prp is_vbz+}
 | [-tho_nns ’_pos-] {+though_in+}
 | [-tho_nnp ’_pos-] {+though_in+}
 | [-republick_nn-] {+republic_nnp+}
 | [-tis_nns-] {+it_prp is_vbz+}
 | [-’s_pos-] {+s_vbz+}

table overall and tag-affecting normalizations in z and z
normalizations | z | z | difference
overall (o) | , | , | , (− %)
tag-affecting (t) | , | , | (+ %)
ratio o/t | . | . |

. syntactic parsing of modern english texts

robust broad-coverage syntactic parsers, for example, collins ( ), nivre ( ), schneider ( ) have now become available. van noord and bouma ( , p.
) state that ‘[k]nowledge-based parsers are now accurate, fast and robust enough to be used to obtain syntactic annotations for very large corpora fully automatically’. large corpora such as the british national corpus (aston & burnard, ) have been made accessible in automatically parsed versions, for example, andersen ( ) or lehmann and schneider ( b), offering new perspectives for linguistic research.

a major reason for the relative accuracy and efficiency of these syntactic parsers is that they use fast finite-state technology like taggers, chunkers, and morphological analysers in the pre-processing step and that they largely rely on statistical data which minimally encodes lexical preferences. kaplan et al. ( ) describe finite-state preprocessing as a necessary prerequisite for efficient and accurate parsing. concerning lexical preferences, it is important to point out that applying all grammatical rules to a sentence to be parsed massively overgenerates, i.e. often leads to hundreds of possible parses, most of which are semantically implausible. lexical preferences are used to disambiguate and find the most likely syntactic analysis. lexical preferences are encoded in the form of bi-lexical conditioning (e.g. collins ), which means that syntactic rules in which both the governor and the dependent lexeme are likely to occur are preferred. this strategy is analogous to the dichotomy of syntax principle versus idiom principle (sinclair , hunston and francis, ) in which the application of syntactic competence rules is constrained and ranked by idiomatic performance patterns. in addition to affecting tagging performance (section and . ), lexical statistics often fail to deliver any data (or deliver incorrect data) if historical spelling instead of normalized spelling is used, which means that the disambiguation between various syntactically possible analyses is affected. we address this point in section . . .
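To make the idea of bi-lexical conditioning concrete, here is a toy sketch of PP-attachment disambiguation by governor and dependent counts. It is not the model of Collins or of the parser used in this article: the training triples are invented, the add-one smoothing is a simplification chosen for brevity, and the names `TRAINING`, `score`, and `choose_attachment` are hypothetical.

```python
from collections import Counter

# Invented (governor, preposition, dependent noun) attachment triples,
# standing in for statistics learnt from a treebank.
TRAINING = (
    [("eat", "with", "fork")] * 3
    + [("pizza", "with", "cheese")] * 3
    + [("eat", "in", "kitchen")]
)

TRIPLES = Counter(TRAINING)
GOVERNORS = Counter(gov for gov, _prep, _noun in TRAINING)
N_PAIRS = len({(prep, noun) for _gov, prep, noun in TRAINING})


def score(governor, prep, noun):
    """Add-one smoothed relative frequency of (prep, noun) under governor."""
    return (TRIPLES[(governor, prep, noun)] + 1) / (GOVERNORS[governor] + N_PAIRS)


def choose_attachment(verb, obj_noun, prep, pp_noun):
    """Attach the PP to the verb or to the object noun, whichever scores higher."""
    verb_score = score(verb, prep, pp_noun)
    noun_score = score(obj_noun, prep, pp_noun)
    return "verb" if verb_score >= noun_score else "noun"


print(choose_attachment("eat", "pizza", "with", "fork"))    # seen with the verb
print(choose_attachment("eat", "pizza", "with", "cheese"))  # seen with the noun
print(choose_attachment("eat", "pizza", "with", "forke"))   # unseen spelling
```

For the unseen spelling forke neither governor has any evidence, so the decision rests on smoothing alone. This is the situation the text describes for historical spelling: without normalization, the lexical statistics have nothing to say.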
. improvement due to normalization

the assumption that normalization improves parsing performance was first confirmed in schneider ( ): in a sentences random sample from the archer corpus th century section, normalizations are made (in vard batch mode, % confidence level). in the normalized text, of the sentences receive a syntactic analysis which differs from the original. a manual inspection reveals better syntactic analysis due to vard in twelve sentences, worse syntactic analysis due to vard in one sentence, and improvements paralleled by new errors in three sentences.

table tag-affecting changes between z and z (some omitted items)
count | z z
 | [-hath_nn-] {+has_vbz+}
 | [-’_pos em_nn-] {+them_prp+}
 | [-s_vbz-] {+’s_pos+}
 | [-hath_nn surrendered_vbd-] {+has_vbz surrendered_vbn+}
 | [-hath_vbp-] {+has_vbz+}
 | [-’_’’ em_nn-] {+them_prp+}
 | [-allowed_vbn-] {+allow_vb ’d_nnp+}
 | [-doth_nn-] {+does_vbz+}
 | [-port_nnp-] {+porte_nn+}
 | [-poultry_nn-] {+poultrey_nnp+}
 | [- _cd th_nn-] {+ th_jj+}
 | [-tis_jj-] {+it_prp is_vbz+}
 | [-switzers_nns-] {+swiss_nnp+}
 | [- _cd th_nn-] {+ th_jj+}
 | [-tis_nnp-] {+it_prp is_vbz+}
 | [-infant_nn-] {+infanta_nnp+}
 | [-hath_nn sent_vbd-] {+has_vbz sent_vbn+}
 | [-hath_nn made_vbd-] {+has_vbz made_vbn+}
 | [-followed_vbn-] {+follow_vb ’d_md+}

here we use a larger random sample from the zen corpus, comprising sentences. we use first (section . . ) a version with the standard normalization settings of vard, then (section . . ) our retrained vard version.

. . standard vard

of the sentences, obtain a different syntactic analysis when using the standard vard settings. the results are broken down by syntactic relation in table . we get an improvement of sixty-eight relations opposed to five new errors. more than % of the changes are improvements, and % of the original sentences, and % of the sentences whose tagging was affected get a net improved parse.
an example is given in fig. , where the original spelling in sentence ( ) scorbutick is tagged as a verb (top), while the normalized scorbutic is tagged correctly as adjective, which leads to the correct syntactic analysis (bottom).

( ) the only short and infallible cure for that reigning disease the scurvy and all scorbutick humours, . . . (zen cjl)

fig. syntactic analysis with original spelling and normalized spelling

table parser improvement versus new errors with standard vard (columns: better, worse, equal; rows: subj, obj, pobj, modpp, sentobj, p)

. . retrained vard

of the sentences, obtain a different syntactic analysis after retraining vard, compared to using the standard vard. the results are broken down by syntactic relation in table . we get a further improvement of eleven relations opposed to new error. of all sentences, . % get a net improved parse. an example can be found in fig. in section . the original spelling doth is not normalized by vard standard. after our retraining it is correctly normalized to does, which leads to the correct syntactic analysis.

. parser adaptation

we have stated that a major reason for the relative accuracy and efficiency of syntactic parsers is that they rely on statistical data which encodes lexical preferences between governors and dependents (collins, ). lexical preference statistics are learnt from a manually annotated resource (the learning process is called training), typically the penn treebank is used (marcus et al. ). while a number of parsers now reach acceptable accuracy when applied to domains that are similar to the training domain, performance drops considerably when texts from different domains are parsed (gildea ). domain adaptation is therefore a
lehmann and schneider ( a) have evaluated random sets from the bnc and report similar to slightly lower performance than on in-domain texts. performance decreases increasingly with do- mains that differ more from the training domain, partly due to incorrect part-of-speech tagging in the preprocessing step, and partly due to inappropriate lexical preferences. there is a danger that the level of noise introduced by tagging and parsing errors will at some stage be stronger than the signal. the signal reports true quantitative differences. schneider and hundt ( ) evaluate parser performance on l varieties of english such as indian or fiji english. they show that for the application to regional vari- ation the signal delivered by an automatic parser (schneider ) is typically strong enough. even if the performance decrease for variation according to region and genre seems manageable, diachronic variation has the potential to be much stronger than synchronic variation, and not only affect lexical preferences but also the set of permis- sible grammar rules. rissanen ( ) states that from about on, the structure of pde had largely been established. ‘at that time [ ], the structure of the lan- guage was gradually established so that eight- eenth-century standard written english closely resembles the present-day language. the lan- guage of most sixteenth-century authors still reflects the heritage of middle english, whilst it is possible to read long passages from eight- eenth century novels or essays and find only minor deviations from present-day construc- tions.’ (rissanen , p. ). denison ( ) also confirms: ‘by the english language had already undergone most of the syntactic changes which differentiate present-day english (henceforth pde) from old english (hence- forth oe)’ (denison , p. ). 
these quotes support our initial hypothesis that except for spelling variation (which we have addressed in section ), shifts in lexical preferences (which degrade parsing performance), and changing frequencies of certain syntactic constructions (which we hope to measure as signal with our approach), the fundamental set of grammar rules may only need large adaptations for earlier periods, in other words for the earliest texts in archer and zen. we expect a weak decline in parser performance from the th century to the th century, and then a stronger decline for the th century texts. particularly for the lmode, it has been claimed that the differences to pde are mainly of statistical nature. construction types remain the same. the frequency of the types, however, may change. these changes in frequency can themselves be preparatory steps for language change.

( ) illustrates the difficulties automatic parsers face in early modern english. it also highlights some of the features of early modern english.

( ) the ship, the amerantha, had never yett bin att sea, and therfore the more daungerous to adventure in her first voyage; butt she was well built, a fayre ship, of a good burden, and had mounted in her forty pieces of brasse cannon, two of them demy cannon, and she was well manned, and of good force and strength for warre: she was a good sayler, and would turne and tacke about well; she held persons of whitelocke’s followers, and most of his baggage, besides her own marriners, about . (archer whit.j b, italics added)

table parser improvement of standard vard versus retrained vard (columns: better, worse, equal; rows: subj, obj, pobj, modpp, sentobj, p)

processing our spelling normalised version (see section ) of sentence ( ), the parser makes a number of errors that are related to markedness. the following constructions are also possible in pde, but highly marked.
genitives of quality (e.g. of a good burden and of good force) are frequent in latin or in biblical contexts but rarely used in pde (e.g. köstenberger and patterson, , p. ).

x-bar violations are rare and poetic in pde: mounted [in her] forty pieces is an x-bar scheme violation. in one possible syntactic interpretation of this sentence the subcategorized object forty pieces is further remote from the verb than the adjunct in her (the x-bar compatible order would be had mounted forty pieces in her). notice that the non-argument is not moved outside the vp as in topicalization but may rather be a scrambling phenomenon similar to present day german (e.g. grewendorf and sternefeld, ).

there is also a second possible syntactic interpretation of mounted [in her] forty pieces in which had is the main verb, and mounted in her is a modifying participial clause. the x-bar violation then consists in having a non-subcategorized participial clause closer to the verb than the subcategorized object (the x-bar compatible order would be had forty pieces mounted in her). the effect of the x-bar violation here is that the chunker returns [her forty pieces] as a single base noun phrase.

conjunctions are typically constrained to combine constituents that have the same word class. in was well manned, and of good force and strength for war an adjective and a complex prepositional phrase (pp) are in coordination. it seems that this constraint was much weaker in emode.

it also appears that constraints on appositions were weaker: in besides her own mariners, about an apposition relation is used to convey quantity information, a use we might perhaps only find in cooking recipes in pde.

in modern english, particularly in emode, sentences are considerably longer than in pde. fries ( , p.
) reports a decrease in sentence length in the zen corpus from forty-two words per sentence in down to twenty-nine words per sentence in , while pde figures (from the bnc) are about twenty-one words per sentence. high sentence length in itself creates considerably more scope for ambiguity. we exemplify the ambiguity for pp-attachment. the ambiguity of prepositional phrase attachment can be described by the catalan numbers. a sequence verb np n*pp with n pps has $C_{n+1}$ analyses, where $C_{n+1}$ is the $(n+1)$’th catalan number. $C_n$ is defined as follows:

$$C_n = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{(n+1)!\,n!}$$

where $C_n$ . . . is [ , , , , , , , , , , , , , , , , ]

for five pps there are forty-two possible readings.

as a crude indicator of the potential ambiguity we can compare sentence length across the centuries. average sentence length in the zen corpus is . words, compared with about words in the bnc. as archer is not sentence-tokenized, only approximate figures can be obtained. our own tokenization, which is very conservative, reports about words per ‘sentence’ in the th century compared with about words per ‘sentence’ in the th century.

in sum, we conclude that the types of parsing errors produced for both early and late modern english are similar to pde, but more frequent. this is due to increased ambiguity caused by longer sentences and marked word order. we expect more disambiguation errors, and we need more statistical data and semantic resources to improve results.

in section . . , we evaluate the parser without any adaptations to mode (but using automatically normalized spelling). in sections . . to . . , we then address improvements and adaptations for mode.

. . evaluation

we have manually annotated random sentences from each of the th, th, th and the th century from archer corpus texts. they include the twenty-five random sentences per century which have been used in the evaluation in schneider ( ); the current evaluation set is thus four times larger.
we have used the standard vard normalization. the evaluation results including raw frequencies are given in table , in terms of precision and recall, broken down by century and syntactic relation. the f-score results, by century, are given in a bar chart in fig. . the f-score is the harmonic mean of precision and recall.

g. schneider et al. digital scholarship in the humanities, vol. , no. ,

as expected, parser performance decreases for the th and th centuries, and shows a steeper decline for the texts before . there is also some fluctuation. when we inspected the errors, we noticed that the xx random evaluation set texts are affected by many instances of hath, which is not normalized by vard standard settings, and which leads to tagging and lemmatizing errors. our inspection of errors also revealed that some errors are relatively easy to correct, as they involve closed class words. we will briefly describe them in . . , before turning to errors that can partly be corrected by improving semantic and statistical resources in section . . .

. . closed class lexis extensions

some closed-class words, e.g. but as adverb in ( ), are not known to the parser grammar. we have made a number of such adaptations: the conjunction lest, as in the function of a relative pronoun (which we discarded again as it led to new errors), or gain as a ditransitive verb.

( ) he is such an itinerant, to speak that i have but little of his company. (archer: aadm)

such adaptations are straightforward and efficient. however, they only lead to small, specific improvements.

table performance of the baseline parser in absolute frequencies and in percent, on selected relations, broken down by century and syntactic relation. for each of the four centuries the table reports the columns is, should, and %, first for precision and then for recall, over the rows subj, obj, pobj, modpp, sentobj, and p. subj = subject; obj = object; pobj = verb-attached pp; modpp = noun-attached pp; sentobj = subordinate clause.

parsing early and late modern english corpora digital scholarship in the humanities, vol. , no. ,

. . more semantics and context

additional parsing errors in the texts before come from a number of sources, including the following:

• rare ‘poetic’ constructions that are not licensed by the grammar. examples include mounted in her forty pieces in example sentence ( ), and the sentence (archer prie.s b) on what this difference depends i can not tell where a complex pp is fronted. the grammar only licenses the fronting of simple pps (such as in the morning), and relaxing this constraint generally leads to lower parsing performance.
• lexical preferences that do not match. examples include the genitive of quality ship of a good burden in example ( ), and pp-attachment involving at large in the sentence (archer leew.s b) as i shall manifest at large in the ensuing discourse, where discourse is attached to the adjective large instead of the verb manifest.
• high complexity, marked constituent order. in this category, we find parser errors that also occur in pde, but they are more frequent in mode; mode is similar to pde but harder.

particularly the last sources of errors illustrate what could be called the ambiguity trade-off between constraining and disambiguating: if one constrains rules too much, the correct reading can often not be found, for example, if a marked constituent order is used. if one constrains too little, ambiguity explodes and the risk of incorrect disambiguation increases. disambiguation can sometimes be improved by adding more resources.
one way to help disambiguation is to include more semantics and context, as we have done in the following.

a) semantic expectation. the original parser models probabilities using only those syntactic relations that are in competition. for example, objects (e.g. eat pizza) and nominal adjuncts (e.g. eat friday) are modeled as being in competition, but not subjects and objects.

$$p(R, dist \mid a, b) = p(R \mid a, b) \cdot p(dist \mid R, a, b) \approx \frac{f(R, a, b)}{f(\sum R, a, b)} \cdot \frac{f(R, dist)}{f(R)}$$

we now add semantic competition as a further factor: every relation is in competition with every other relation. a sentence like the rabbit chased the dog now gets a lower probability than the dog chased the rabbit because rabbits are very unlikely to be subjects of active instances of chase. our semantic world knowledge (e.g. selectional restrictions) becomes part of the model.

b) wider context. attachment of pp is typically the most ambiguous syntactic relation. the interaction between multiple pps was not considered in the original statistical model of the parser. knowledge expressed across more than one node generation was lost. we have added a model for the probability that $PP_2$ is a dependent of $PP_1$ ($PP_1 < PP_2$) in a verb-$PP_1$-$PP_2$ sequence, given the lexical items. it is calculated as follows:

$$P(verb < (PP_1 < PP_2)) = \frac{\#(verb < (PP_1 < PP_2))}{\#(verb < (PP_1 < PP_2)) + \#((verb < PP_1) < PP_2)}$$

these two measures improve recall and precision, as can be seen in fig. . the performance of the baseline parser from section . . is shown by grey bars, and the performance of the extended parser by striped bars. interestingly, earlier centuries profit more from the adaptation, which we believe may indicate that, due to freer word order and longer sentences, constraints on semantics and complexity are more important.

fig. f-score performance of the baseline system, by century and syntactic relation
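The relative-frequency estimate for the verb-PP-PP model in b) can be illustrated with a short sketch. This is only an assumption about how such counts might be stored: the dictionaries, the choice of (verb, first preposition, second preposition) as the lexical key, and all counts are hypothetical and are not taken from the parser or from the corpora.

```python
from collections import Counter

# Hypothetical attachment counts from a parsed corpus, keyed by
# (verb, head of the first PP, head of the second PP). All numbers are invented.
nested_counts = Counter({("send", "to", "of"): 30})  # verb < (PP1 < PP2)
flat_counts = Counter({("send", "to", "of"): 10})    # (verb < PP1) < PP2


def p_nested(key):
    """Estimate P(verb < (PP1 < PP2)) as a relative frequency.

    Returns None when the lexical combination was never observed,
    so that a caller can fall back to some other estimate.
    """
    nested = nested_counts[key]
    flat = flat_counts[key]
    total = nested + flat
    if total == 0:
        return None
    return nested / total


if __name__ == "__main__":
    print(p_nested(("send", "to", "of")))
    print(p_nested(("mount", "in", "with")))
```

Returning `None` for unseen combinations is a design choice of this sketch; how sparse counts are handled in practice (for example by backing off to less specific keys) is not stated in the text.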
the overall performance with the new semantic expectations and the improved pp-model is given in table . as expected, we see a weak decline from the th century down in history to the th century, and then a stronger decline for the th century texts.

as expected, the addition of more statistical data for the highly ambiguous pp-attachment, and semantic resources modelling our expectations, improves parsing. particularly on the historical texts, where ambiguity was found to be higher than in pde, pp-attachment improves by – %.

. linguistic applications

the structure of english had been established by the beginning of the th century (denison , rissanen ); see section . . lópez-couso, aarts and méndez-naya ( ) state that in addition to a few grammatical innovations, namely, the progressive passive and the get-passive, the late mode period is marked by regulatory and statistical changes: the progressive form increases in frequency, be as perfect auxiliary decreases, periphrastic do is fully established, and non-finite complementation and relativization (hundt, denison, schneider a) have undergone changes.

the present progressive form, which has a relative frequency of about instances per , words in ice spoken, and about twenty in ice written, has fewer than ten instances per , words in the zen corpus period ( – ). the increase of the progressive has been described in detail in hundt ( ). when looking at -ing forms, we have noticed that the majority of them from the early zen and archer texts are in fact non-finite -ing forms, also known as gerundials (mair ), which we discuss in section . . in section . , we show that be as perfect auxiliary shows a clear decline even in the short zen period of years.

in the following we present two pilot studies based on the new annotation. the results and distributions presented have been derived from our web-based interface developed for the dependency bank project.
see lehmann and schneider ( a, b) for a detailed description.

table base (base) parser compared with improved (imp) parser, f-score. for each of the four centuries the table reports a base and an imp column, over the rows subj, obj, pobj, modpp, sentobj, and p.

fig. precision and recall of improved parser (striped) and baseline parser (grey)

. gerundials

the progressive form, which is already found in old english, had become an established construction in early modern english (denison , p. ), and has increased in frequency since, as we have just discussed. while -ing forms used as progressives are rare in the early zen and archer texts, nominal -ing participle clauses, also known as gerundials, are quite frequent. in particular, there are surprisingly many occurrences of gerundials with subjects. in pde (quirk et al. , p. ), the functions of gerundials comprise subject, object, subject complement, appositive, adjectival, and prepositional complement. an example of prepositional complement with subject is as follows:

( ) all the passages which were shut up on account of the plague being at leipzig and several places in saxony are now again open and trade is restored to its former course. (zen lgz)

the most frequent syntactic function by far is the appositive clause.

( ) some scottish brethren, in the north of ireland finding their wonted practices interrupted by the late declaration of the lords justices and council against presbyterians anabtists. (zen kin)

( ) the picaroons have not visited our coasts these six months and indeed our vessels so well fitted several of them carrying six eight ten and twelve guns apiece that the small capers which usually haunted these coasts have no encouragement to adventure.
(zen cui)

( ) this congregation ending a courier was immediately dispacht to segnior ravizza . . . ( lgz)

many occurrences are also found in main clauses:

( ) robert pierrepoint esq; his troop consisting of horse whose lieutenant is toplady esq; and gregory esq; cornet. (zen kin)

( ) mr. wiseman a mercer accompanying sir george geffryes. (zen cui)

where the gerundial occurs in a main clause, such as ( ) and ( ), it cannot be distinguished from a present participle clause (quirk et al., , p. , ). present participle clauses are also known as present tense reduced relative clauses, but their status is contested (e.g. hundt, denison, schneider b). we have used the following syntactic search patterns: ( ) subject relation, where the verb is in the progressive form and non-finite and ( ) reduced relative clause where the verb is in the present. query ( ) delivers hits, ( ) hits. the hits contain many appositive clauses ( – ), present tense reduced relative clauses ( – ), but also parsing mistakes and other syntactic functions.

( ) the states of holland being now complete are resolved to dispose forthwith of the vacant companies. ( cui)

( ) seven of the dutch frigates standing into margate road cause the lilly and another frigate to stand for the river. ( cui)

( ) yesterday was a council at white-hall chiefly to hear several appeals from out of the island of guernsey according to the constitution of that place but one of the persons being dead since the appeal was brought it could not be heard. (zen imp)

( ) the publication of books of medicines and other such things being remote from the business of a paper of intelligence; this is to notify that we will not charge the intelligence with advertisements unless they be matter of state but that a paper of advertisements will be forthwith printed apart and recommended to the public by another hand. (zen cui)

semantically, the participle often conveys an argumentative semantic function, most obviously in ( , , , ).
in pde this only survives in set expressions such as this being so. the frequency distribution delivered by the interface for query ( ) is given in table and shows a clear decline, graphically rendered in fig. . the frequency of gerundials with subjects has been decreasing, and this change seems to take place early in the investigated period, between , and , . the difference is very highly significant (p < e- ), according to a chi-square contingency test. even if the data from , are discarded as they may be seen as too sparse, the difference stays highly significant (p < e- ).

our findings pattern well with the larger picture drawn by lópez-couso, aarts, and méndez-naya ( ), who observe that:

while we can speak of relative stability in the area of finite complementation, the realm of non-finite complementation experienced ‘fundamental and rapid changes’ in our period (mair , p. ), some of them still under way.

. be or have as auxiliary in the perfect

concerning the fixation of have as auxiliary, we have investigated auxiliary verbs in the perfect form. in some verbs, even the short zen period reveals clear change. while the verb go keeps a preference for the auxiliary be throughout zen (and be gone is still occasionally used in pde), the verb come has shifted from the auxiliary be to the auxiliary have in the period covered by zen, as fig. illustrates. as fluctuation is considerable, and as we wanted to extend the study to other verbs, we have also tested go, arrive, and enter, and found similar, slightly less clear trends. if all verbs with the auxiliary be are searched, the majority of hits are passive forms. without manual validation of the hits, only intransitive verbs (come, go, arrive) or verbs that are hardly used in the passive (enter) can be investigated fully automatically. .
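The chi-square contingency test mentioned for the gerundial frequencies earlier in this section can be sketched for the simplest case of two periods. This is a simplified illustration, not the authors' procedure: it assumes a 2x2 table (hits versus remaining words, early versus late period), uses Pearson's statistic without continuity correction, and all counts below are invented.

```python
from math import erfc, sqrt


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With one degree of freedom the
    p-value equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected counts under independence of period and construction.
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts: gerundial hits vs. all other words in two periods.
    early_hits, early_words = 120, 100_000
    late_hits, late_words = 60, 100_000
    stat, p = chi_square_2x2(early_hits, early_words - early_hits,
                             late_hits, late_words - late_hits)
    print(f"chi2 = {stat:.3f}, p = {p:.2e}")
```

A comparison across more than two decades would need a larger table and the general chi-square distribution, for which a statistics library is the more practical choice.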
conclusion

we have described the automatic annotation of mode corpora, such as zen and archer. we have evaluated the performance of the spelling normalization tool vard and improved its performance on early and late mode text. we have evaluated the performance of pro3gres, our dependency parser, and improved its performance by using statistical data and semantic resources. we have shown that these improvements can constrain the higher ambiguity observed in earlier texts. we have presented two short pilot studies illustrating applications of using automatically parsed historical corpora.

so far we have shown the potential of syntactically annotated data on the zen corpus. we expect the larger archer corpus and old bailey corpus to yield even more interesting results. in the future, application of automatic syntactic annotation to resources like the million old bailey proceedings will open new possibilities for historical linguists that would be beyond the reach of small manually annotated corpora.

fig. absolute and relative frequency of gerundials with subjects in zen

table absolute and relative frequency of gerundials with subjects in zen. columns: decade, n, words, f per , wd.

fig. perfect auxiliary be and have with come in the zen corpus. n =

references

andersen, Ø. e., nioche, j., briscoe, t., and carroll, j. ( ). the bnc parsed with rasp uima. proceedings of the sixth international language resources and evaluation (lrec’ ). marrakech, morocco.

aston, g. and burnard, l. ( ). the bnc handbook. exploring the british national corpus with sara. edinburgh: edinburgh university press.

baron, a. and rayson, p. ( ). vard : a tool for dealing with spelling variation in historical corpora. proceedings of the postgraduate conference in corpus linguistics. birmingham: aston university, may .

buchholz, s.
and marsi, e. ( ). conll-x shared task on multilingual dependency parsing. proceedings of the tenth conference on computational natural language learning (conll-x). new york: association for computational linguistics, pp. – , june .

collins, m. ( ). head-driven statistical models for natural language parsing. ph.d. thesis, university of pennsylvania, philadelphia, pa.

denison, d. ( ). chapter : syntax. in romaine, s. (ed.), the cambridge history of the english language, volume : – . cambridge: cambridge university press, pp. – .

fries, u. ( ). sentence length, sentence complexity and the noun phrase in the th-century news publication. in kytö, m., scahill, j., and tanabe, h. (eds), language change and variation from old english to late modern english: a festschrift for minoji akimoto. bern: peter lang, pp. – .

gildea, d. ( ). corpus variation and parser performance. proceedings of the conference on empirical methods in natural language processing (emnlp), pittsburgh, pa, pp. – .

görlach, m. ( ). introduction to early modern english. cambridge: cambridge university press.

grewendorf, g. and sternefeld, w. (eds), ( ). scrambling and barriers. amsterdam/philadelphia: benjamins.

grover, c. ( ). lt-ttt example pipelines documentation. edinburgh: edinburgh language technology group, july .

hundt, m. ( ). animacy, agentivity, and the spread of the progressive in modern english. english language and linguistics, ( ): – .

hundt, m., denison, d., and schneider, g. ( a). retrieving relatives from historical data. literary and linguistic computing, ( ): – .

hundt, m., denison, d., and schneider, g. ( b). relative complexity in scientific discourse. english language and linguistics, ( ): – .

hunston, s. and francis, g. ( ). pattern grammar: a corpus-driven approach to the lexical grammar of english. amsterdam/philadelphia: benjamins.

kaplan, r. m., maxwell, j. t. iii, holloway king, t., and crouch, r. s. ( ). integrating finite-state technology with deep lfg grammars.
in esslli workshop on combining shallow and deep processing for nlp (comshadep ), nancy, france.

köstenberger, a. and patterson, r. d. ( ). invitation to biblical interpretation: exploring the hermeneutical triad of history, literature, and theology. grand rapids: kregel.

lehmann, h. m. and schneider, g. ( ). parser-based analysis of syntax-lexis interaction. in jucker, a. h., schreier, d., and hundt, m. (eds), corpora: pragmatics and discourse: papers from the th international conference on english language research on computerized corpora (icame ), ascona, switzerland, – may (language and computers; no. ). amsterdam: rodopi.

lehmann, h. m. and schneider, g. ( a). bnc dependency bank . . in ebeling, s.o., ebeling, j., and hasselgård, h. (eds), studies in variation, contacts and change in english, volume : aspects of corpus linguistics: compilation, annotation, analysis. helsinki: varieng.

lehmann, h. m. and schneider, g. ( b). a large dependency bank. in lrec conference workshop “challenges in the management of large corpora”, istanbul, turkey, pp. – , may .

lópez-couso, m., aarts, b., and méndez-naya, b. ( ). late modern english syntax. in bergs, a. and brinton, l. j. (eds), historical linguistics of english: an international handbook, vol. i. (handbooks of linguistics and communication science [hsk] . ). berlin: mouton de gruyter, pp. – .

mair, c. ( ). gerundial complements after begin and start: grammatical and sociolinguistic factors, and how they work against each other. in rohdenburg, g. and mondorf, b. (eds), determinants of grammatical variation in english. berlin/new york: mouton de gruyter, pp. – .

marcus, m., santorini, b., and marcinkiewicz, m. a. ( ). building a large annotated corpus of english: the penn treebank. computational linguistics, : – .

nivre, j. ( ). inductive dependency parsing. text, speech and language technology . dordrecht, the netherlands: springer.
nivre, j., hall, j., kübler, s., mcdonald, r., nilsson, j., riedel, s., and yuret, d. ( ). the conll shared task on dependency parsing. proceedings of the conll shared task session of emnlp-conll . prague, czech republic: association for computational linguistics, pp. – .

rayson, p., archer, d., baron, a., culpeper, j., and smith, n. ( ). tagging the bard: evaluating the accuracy of a modern pos tagger on early modern english corpora. proceedings of corpus linguistics, uk: university of birmingham, pp. – july .

rissanen, m. ( ). chapter : syntax. in romaine, s. (ed.), the cambridge history of the english language, volume : – . cambridge: cambridge university press, pp. – .

schneider, g. ( ). hybrid long-distance functional dependency parsing. ph.d. thesis, university of zürich.

schneider, g. and hundt, m. ( ). using a parser as a heuristic tool for the description of new englishes. in the fifth corpus linguistics conference, liverpool, uk, pp. – july , online.

schneider, g. ( ). adapting a parser to historical english. in tyrkkö, j., kilpiö, m., nevalainen, t., and rissanen, m. (eds), studies in variation, contacts and change in english, volume : outposts of historical corpus linguistics: from the helsinki corpus to a proliferation of resources. helsinki: varieng.

sinclair, j. ( ). corpus, concordance, collocation. oxford: oxford university press.

tieken-boon van ostade, i. ( ). an introduction to late modern english. edinburgh: edinburgh university press.

van noord, g. and bouma, g. ( ). parsed corpora for linguistics. proceedings of the eacl workshop on the interaction between linguistics and computational linguistics: virtuous, vicious or vacuous?, athens, greece. association for computational linguistics, pp. – .
the influence of academic values on scholarly publication and communication practices

uc berkeley research and occasional papers series
permalink: https://escholarship.org/uc/item/ j gf
research & occasional paper series: cshe. .
university of california, berkeley
http://cshe.berkeley.edu/

september

diane harley∗, sarah earl-novell, jennifer arter, shannon lawrence, and c. judson king†
center for studies in higher education, university of california, berkeley

copyright diane harley et al., all rights reserved.

abstract

this study reports on five disciplinary case studies that explore academic value systems as they influence publishing behavior and attitudes of university of california, berkeley faculty. the case studies are based on direct interviews with relevant stakeholders (faculty, advancement reviewers, librarians, and editors) in five fields: chemical engineering, anthropology, law and economics, english-language literature, and biostatistics. the results of the study strongly confirm the vital role of peer review in faculty attitudes and actual publishing behavior. there is much more experimentation, however, with regard to means of in-progress communication, where single means of publication and communication are not fixed so deeply in values and tradition as they are for final, archival publication.
we conclude that approaches that try to "move" faculty and deeply embedded value systems directly toward new forms of archival, "final" publication are largely destined to fail in the short term. from our perspective, a more promising route is to ( ) examine the needs of scholarly researchers for both final and in-progress communications, and ( ) determine how those needs are likely to influence future scenarios in a range of disciplinary areas.

many opportunities and concerns are at play in scholarly communication and publication. these result from capabilities afforded by new technologies, pressures associated with the purchasing power of library budgets, marginal operations by university presses, and the pricing structures of the publishing industry. many of those involved in supporting new publishing and communication ventures see “the lack of willingness of the faculty to change” as a key barrier to moving to more cost-effective publishing models in an environment of escalating costs and constrained resources.

∗ diane harley, ph.d. is the co-principal investigator of this project, with c. judson king. she is a senior researcher at the center for studies in higher education at the university of california, berkeley and directs the higher education in the digital age (heda) project.
† c. judson king is former provost and senior vice president – academic affairs of the university of california and is director of the center for studies in higher education.

harley et al., academic values and scholarly publication

this article summarizes the results of a planning study carried out during the - academic year and funded by the andrew w. mellon foundation. in it we describe our assessment of the criteria by which faculty decide when and in what venues to publish or otherwise communicate the results of scholarly research. we were interested in how faculty values relating to advancement and stature in their fields affect these decisions.
this was the first step toward making a nuanced and insightful analysis of the roles that universities and faculty do and can play in the resolution of the perceived “crisis in scholarly communication.”

specifically, we developed five disciplinary case studies that are based on direct interviews with relevant stakeholders (faculty, advancement reviewers, librarians, and editors) in five fields. in doing so, we explored academic value systems as they influence publishing behavior and attitudes of a subset of university of california, berkeley faculty. the larger report is available online.

our goal was to provide a preliminary descriptive analysis and understanding of the academic value systems associated with scholarly publication and communication, including means of communication extending beyond archival publication:

• within a discipline. (for example, what do scholars perceive as necessary to make a name for themselves?) it is recognized that there are different needs and value systems for different disciplinary areas, that different disciplines are in different stages of incorporating electronic communication, and that some disciplines, e.g., architecture, have products other than text.
• within a university. (for example, what are the value systems of the academic promotion and advancement processes, as perceived by different actors in those processes?)

overview of methods

the five disciplinary case studies were based on direct interviews with relevant stakeholders, almost all of whom were associated with the uc berkeley campus. these case studies describe the state of scholarly communication in each of five fields: chemical engineering, anthropology, law and economics, english-language literature, and biostatistics. in the case of law and economics, it was the intersection of these two broad fields that was examined, not the sum.
we also developed two smaller case studies representing the views of librarians and former budget committee members across these five disciplines.

these “thickly described” case studies have the potential to enable a more precise identification of the factors associated with academic and disciplinary value systems. more specifically, such case studies should facilitate the identification of the factors that influence attractiveness, viability, and financial sustainability of different methods of scholarly communication for various participants in the publication/communication system, including authors (producers), researchers (consumers), libraries, and publishers.

formal interviews were conducted during the - academic year with individuals, of whom were faculty (comprising regular faculty, former and current faculty administrators, and recent ex-budget committee members). twenty-two of the faculty interviewed were also editors of scholarly journals or had been so in the recent past. five librarians were interviewed, as were two campus-level academic administrators. the remaining interviewees were drawn from our steering committee. the basic interview protocol, initially designed for faculty, was modified as required over the course of the project to include questions of particular relevance to each class of stakeholders.

because this was a one-year project, our sample was biased (uc berkeley only) and relatively small (fewer than informants per discipline), so extensions to other institutions and disciplines should be made with caution. as an external check to the uc berkeley perspective, we included in each case background research on innovations taking place internationally in the targeted fields.

our research was considerably facilitated by the very structured review process for appointment, promotion, and advancement at the university of california (uc).
that process involves formal review at regular intervals, both before and after tenure is awarded. reviews are initiated in comprehensive written form by the department chair, using material drawn together and submitted by the faculty member. external letters of evaluation are solicited and included for appointments, promotions, and certain critical advancements within the rank of professor. the package, or “case,” is then reviewed by the dean, by a specially appointed ad hoc committee for promotions, and by the committee on academic personnel (denoted the committee on budget and interdepartmental relations, or “budget committee” for the uc berkeley campus), who then make a recommendation. that recommendation is followed by the campus administration in nearly all cases. (the criteria for advancement are put forward in the university of california academic personnel manual.)

findings

peer review

conventional peer review is so central to scholars’ perception of quality that its retention is essentially a sine qua non for any method of archival publication, new or old, to be effective and valued. peer review is the hallmark of quality that results from external and independent valuation. it also functions as an effective means of winnowing the papers that a researcher needs to examine in the course of his/her research. peer review was cited as an essential factor when faculty were asked about: ( ) their perceptions of both standard and newer forms of publication, ( ) disadvantages of newer forms of publication, ( ) where one should publish to make a name for oneself in the field (e.g., publish in top flight peer reviewed journals), and, of course, ( ) peer review specifically.

there is a strong tendency for many members of the research community to equate electronic-only publication with lack of peer review, despite the fact that there are many examples to the contrary.
because of the very nature of peer review, this factor holds back even those who are fully aware of the advantages of fully peer-reviewed e-journals, because they know that the individuals reviewing their work for advancement may well not have that awareness. it will be important to try to separate the issue of peer review for newer, electronic journals from those issues associated with the fact that most such journals are simply new and not yet well established.

to some degree, however, peer review and the means of publication and dissemination can be separated. for example, there are authors whose work is peer reviewed and published in prestigious print journals, but who also retain rights to place the article on their own web site. as noted by some interviewees, the result is that the work is accessed far more often on the web site than in the published print journal.

thus, peer review is essential, although there is some worry among interviewees that the quality of peer review may be declining. the result is that it may be easier to rely on the tried-and-true outlets.

the locus of peer review has, in some cases, moved out of the institution. specifically, there is a growing tendency to rely on secondary measures associated with peer review, such as: perceived journal quality, selectivity, and/or stature; the fact that papers are invited; or keynote lectures for conferences. there is reliance, for instance, on university presses and reviewers of journals to evaluate scholarly work. (even though reviewers for university presses are academic faculty, the editor exerts much more independent judgment than is typical for peer-reviewed journals published by scholarly societies.) in some cases, the impact factor may also serve as a gauge of quality.
despite the goals and quality of peer review, interviewees mentioned several times that the proliferation of journals has resulted in the possibility of getting almost anything published somewhere, if the author persists in trying to gain acceptance by different journals. the peer review process is more complicated for compound disciplines because many such fields are relatively nascent and therefore result in small, specialized communities of scholarship. faculty in these interdisciplinary fields often prefer to publish within a single discipline because the most highly respected and recognizable outlets reside there; however, divergent expectations (ranging from quantity to methodology to writing style) and standards (especially with regard to quality) among fields often make it difficult for reviewers in standard fields to judge submissions from compound disciplines. interdisciplinary publications may address this concern more readily as they become more prestigious. in fields that are joined with law, such as law and economics, the utilization and perception of peer review is particularly complicated, given the prominence of student-edited law reviews.

online publishing

although online publication may be less of a concern to senior faculty with regard to advancement, they are often hindered in using it by their lack of ability or time. there is also no perceived reward for changing the status quo. personal desire and interest, however, are often the drivers for participation in newer modes of communication and publication for senior faculty. publishing in online-only resources is perceived among junior faculty as a possible threat to achieving tenure because online publication may not be counted as much, or even at all, in review. even when written policy indicates that online publications should not be undervalued in consideration of advancement, actual practice may vary.
some interviewees observed that new modes of communication and publication contribute to a proliferation of scholarly material. the result is that it is more difficult for time-pressed faculty to sift through all that is available in their fields. there is the perception that it is easier to get published in newer electronic journals and that they contain material of lesser and dubious quality. there is also a perception that the number of pages publishable in a journal is not restricted by cost for e-journals in the same way that it is for print journals, and thus editors of e-journals are not pressed to be as selective.

crisis/cost issues/open access

many faculty interviewees believed that uc berkeley is insulated to a large degree from any crisis in scholarly publishing. the prestige of the institution and the quality of faculty work often enable faculty to publish with the most prestigious journals or presses. for the most part, faculty do not concern themselves with the burden of cost to the institution resulting from the scholarly publication process. these scholars had minimal, if any, understanding of open-access models, although they were somewhat familiar with the “open” concept. we found that scholars are generally receptive to the ideal of making knowledge available for the “public good.”

positive perceptions

faculty did have a good understanding that the high cost of journals is problematic and faculty in chemical engineering, in particular, viewed open-access models as a possible alternative to commercial presses. some faculty refuse to publish in particular journals because of their high cost and pricing mechanisms. senior faculty appeared to be more comfortable with the idea of sharing material at the early stages of work (e.g., preprint servers), as did faculty in chemical engineering, biostatistics, and law and economics in general.
archaeologists already use some open-access web sites to share field observations.

negative perceptions

the largest concern among scholars was the perception that open-access models had little or no means of quality control, such as peer review. some faculty in biostatistics, interestingly, equated the high cost of print journals with quality and believed that online open-access models are “cheaper” and therefore might be prone to lower standards. others expressed fear that scholarly work placed in open-access media could be “stolen,” although faculty with a better understanding of the online publication process saw licensing bodies, such as creative commons, as a potential solution. there was also some concern about the ownership of open-access and author-pays journals. should universities act as repositories and implement some sort of selection process, there could be legal liabilities regarding the acceptance and rejection of work submitted by the institution’s own faculty, whom the institution then judges for advancement. faculty also expressed concern regarding how such repositories would be managed, including how subjects would be organized.

author-pays publishing models

scholars were generally not aware of author-pays models. once explained, faculty responses were universally negative. paying to publish one’s work was perceived as self-promotion and fundamentally in conflict with the peer review process. english-language literature faculty, in particular, equated the author-pays models to vanity presses, while those in the sciences equated it with advertising and therefore believed that any such publication would compromise academic integrity. many faculty realized that publication costs are an issue and believed that the author-pays model could possibly serve to discriminate against countries, institutions, and faculty with fewer financial resources.
in particular, scholars from all fields expressed concern that such a model might exacerbate differences between the sciences and the humanities since funds to cover any charges would likely come from grants. faculty who were in fields that lacked a sense of urgency in scholarship, especially english-language literature (and one interviewee in biostatistics), viewed the author-pays model as particularly irrelevant. it should be noted that page charges or submission fees are a reality for some disciplines, such as the biological sciences and economics, respectively. page charges have now largely disappeared, however, for some scientific disciplines such as chemical engineering. we also note that page charges could have a particularly chilling effect on those who rely on expensive graphics in publications, especially in the humanities.

enhanced capabilities of electronic communication

many faculty interviewed were happy to consume scholarly material afforded by new modes of communication and publication. day-to-day scholarly practice relies on them heavily, but for the last stage of scholarly practice, archival dissemination of scholarly work, scholars rely on traditional publishing formats with few exceptions. there are clear advantages to newer forms of publication that are recognized by a wider circle of scholars than those who have actually used them for publishing their own work. these include the ability to reach a larger audience, ease of access by readers, more rapid publication even when peer reviewed, the ability to search within and across texts, and the opportunity to make use of hyperlinks. administrators and faculty both cited the fact that new technologies enable innovation in scholarly work. anthropologists and chemical engineers agreed that moving images and three-dimensional (3d) models are particularly positive attributes.
english-language literature faculty noted that technologies enable new ways of conducting scholarly work, most notably manuscript comparison in which single interpretations are no longer necessary because access to multiple interpretations is possible. faculty, especially chemical engineers, believed that newer technologies have a democratizing effect on scholars outside of north america. the ability to have enough information (e.g., software code, back-end data, etc.) to enable the reproduction of statistical analyses was of particular importance to faculty in biostatistics.

data storage/management needs

data storage and data management needs vary depending on the discipline and even subspecialty. data produced by scholarly work vary both across and within disciplines, and vary from interpretive text, to visual or motion images, to 3d renderings or computer simulations, to observations whether in numeric or text form. scholars in some fields also rely upon existing datasets rather than new data. in the sciences, grant monies often fund data management and storage. it was noted that funders rarely dictate how data should be preserved. there is little to no institutional support for data management and preservation according to our interviewees. as a result, individual scholars are responsible for maintaining data integrity. overall, faculty were concerned about the rapid evolution of technologies, which often results in archaic storage devices and thereby loss of work. in some fields with data-rich scholarship, such as biostatistics, there was the concern that not all data can be stored. some suggested that their department or the university should have policies in place to address this problem.
the budget committee

interviewees who had served on the budget committee with terms ending more than two years ago had not encountered the need to review non-conventional forms of publication and communication, and thus this was not a significant issue during their service. because of academic specialization, the nine budget committee members, in most instances, do not have the disciplinary knowledge necessary to judge the research themselves in cases that they review. thus there is a heavy reliance on peer review to aid the budget committee in its evaluation of scholarly work. as well, lack of peer review is associated, correctly or not, with newer forms of publication. former budget committee members believed that the advancement process should be supportive of non-traditional publishing models, provided that peer review is strongly embedded in the process, and that it should be unprejudiced toward those scholars exploring new modes of publication. despite faculty perceptions to the contrary, those with budget committee experience indicate that there is some degree of flexibility built into the review process. former budget committee members and higher administrators who receive budget committee recommendations commented that the committee reflects standards in disciplinary fields and does not mandate appropriate methods, which effectively serves to maintain the status quo. some explained that if the faculty member or department chair could make the case that a particular publication outlet was sufficiently peer reviewed for quality and well known within a particular subfield, then the budget committee would give it appropriate weight. regardless, faculty are often unwilling to take risks by using newer publishing technologies that they presume may not be recognized by the budget committee as reputable and/or prestigious venues.

librarians

librarians appeared to have a much better understanding of available resources and the politics among publishers.
they often were more technologically savvy than their faculty counterparts and were well aware of new technologies likely to affect available resources. unlike many faculty, librarians who were interviewed strongly perceive a crisis in scholarly communication and see the rise in new forms of communication and publication as a positive step, albeit a slow and evolutionary one. librarians indicated that they try to educate faculty about the scholarly communication crisis and how faculty might play a larger role. although new modes of communication are not widely used by faculty for presenting their work, librarians believe that open-access and/or author-pays models are viable alternatives to the problem of unsustainable journal costs. online resources were also viewed largely as advantageous from a consumer perspective for many of the same reasons that faculty provided, e.g., ease of access, speedy dissemination, and so on. librarians also believed that online technologies enable them to connect faculty and students with better information. librarians’ main concerns about new modes of publication were along fiscal and technological dimensions, namely the economic sustainability of newer models and the role of the library in that financial equation. librarians also pointed out the “version” problem for placing scholarly material in repositories. most problematic for librarians, however, is the increasing reliance by students and, to some degree, faculty on search engines such as google and yahoo.

publishers and editors

publishers with whom we made formal contact included the university of california press, the berkeley electronic press (bepress), the e-scholarship project of the california digital library (cdl), the public library of science (plos), ithaka, the electronic publishing initiative at columbia (epic), and the stanford encyclopedia of philosophy.
formal interviews of principals were conducted for the first four of these. we recognize that this is not a representative group of publishers. twenty-two of the faculty members whom we interviewed were also editors of journals or had been in the recent past. perceptions among publishers and editors were tied closely to the mode (print/electronic) of publication, their institutional affiliation and philosophy, and often the disciplinary fields in which they specialized; thus, their opinions often reflected those different viewpoints. some felt that print publications were ineffective compared to electronic venues in disseminating work in a timely manner, although all recognized the challenges associated with newer forms of publication, regardless of format. many of the publishers and editors we interviewed were aware of the concern about increasing costs. publishers concurred that academia is in a transition period with regard to publishing, and they understood the complex interplay of tenure requirements and distribution of publishing choices among faculty. although scholarship is inherently innovative in both approach and method, and in that way a natural match for newer forms of publication, change is often hindered by institutional requirements and standard practice, such as the perceived necessity of traditional publication for advancement and achieving tenure, and apprehension among scholars that reviewers will not accept newer forms of publication for advancement. most agreed that use of newer forms of publication has not yet reached a sufficient saturation point to tip the scale and opined that the power to change rests with the university world, for both production and consumption. they recommended incentives for faculty, both in terms of policies (e.g., advancement process) and resources, as well as budgetary and technical support for libraries.
publishers shared with other interviewees the concern about perceptions that equate low cost with low quality for electronic forms of publication. although all agreed that quality control systems are not in place in most open-access repositories, publishers in general pointed out that electronic journals can and often do use the same review process as traditional print journals. several also expressed concern about the peer-review process, and believed that too much emphasis is placed on outside opinion and prestige rather than a review of actual content quality. one issue for faculty editors is the difficulty in finding reviewers who are qualified, neutral, and objective scholars in a fairly closed academic community. this is compounded by the fact that the increasing quantity of publications requires more scholarly input for the review process, while already overburdened academics have limited time to participate. editors, in particular, have a difficult time coordinating reviewers’ schedules and available time.

reflections

lessons learned and challenges

while our investigations have yielded rich and descriptive case studies that shed light on the current state of scholarly communication, there are limitations to our study. our small sample, both in the number of participants and in the range of disciplines, makes generalizations at this stage sketchy at best. furthermore, we focused specifically on one campus in the university of california (uc) public system, uc berkeley, and our results at this stage are thereby obviously biased. to develop general conclusions applicable to wider populations, future investigations will need to include other campuses and/or institutions. the highly structured advancement system of the university of california has been advantageous to us in many ways in conducting our research.
first, we know who the actors are at the several stages of the review process and we were able to talk with them directly. second, we were able to compare and contrast the views of those in the different steps of the review process—faculty, department chairs, deans, (former) budget committee members, and campus-level academic administrators. as well, we can ascertain the views of bodies such as the budget committee by those involved in other stages of the review process. third, at each of the levels of review we had access to persons who have reviewed many advancement cases and who therefore can make informed comparative judgments. this fact gives reviewers the ability to identify the relative importance of different factors, including the medium and nature of the publication vehicle. fourth, the nature of this review process affords the wherewithal of assessing the degree of importance and roles of peer review and the vehicles for peer review that hold cachet. by talking with reviewers involved with the budget committee at different times in recent years, we have been able to make initial inferences of whether and how the values ascribed to new media by that body are changing. the academic values that we have identified in this project may be specific to the most prestigious of universities, where faculty researchers nearly always have their papers accepted for publication and can publish wherever they want. it is also true that the values exercised at these leading universities will likely be emulated throughout the academic community. 
conclusions

the descriptive case-study approach began to elucidate the ways in which faculty do or do not perceive electronic means and other new capabilities as enhancing (1) the quality, effectiveness, and immediacy of communication of a scholar’s research output to peers and users, (2) the recognition of that research, and (3) the efficiency and effectiveness of progress of scholarship as a whole. the disciplinary case studies also enabled a more precise identification of the factors associated with academic and disciplinary value systems that influence viability and financial sustainability of different methods of scholarly communication for various participants in the publication/communication system, including authors (producers), researchers (consumers), libraries, and publishers. from an examination of the ways in which value systems in five disciplinary areas affect scholarly publication and communication practices, we have reached the following conclusions:

• peer review is the coin of the realm. it is the value system supporting assessment and the perceived quality of research. it is commonly viewed as the primary mechanism through which research quality is nurtured, and through which research is made both effective and efficient. there was also a strong perception that peer review provides an excellent quality filter for the proliferating mass of scholarly information available on the web.
• there is some concern that the locus of peer review has moved out of the institution. this has particular repercussions for academic advancement as increasing reliance is placed on the prestige of publication rather than a review of actual content and quality.
this is especially a concern for those scholars in compound disciplines, where peer review can be complicated by differing standards and expectations among fields, and where quality assurance depends upon a small group of specialized academics.
• there is presently a somewhat dichotomous situation in which electronic forms of print publications are used heavily, even nearly exclusively, by performers of research in many fields, but perceptions and realities of the reward system keep a strong adherence to conventional, high-stature print publications as the means of record for reporting research and having it evaluated institutionally. this was true of all of the disciplines we examined. in the science fields, although major journals are maintained in print form, electronic replicates are used increasingly for most access and research.
• while both are critically important to one’s career, the means of publication and communication for gaining advancement within the institution can differ significantly from those for making one’s name within a discipline. the former depends almost exclusively upon final, fully peer-reviewed archival publication, whereas the latter is more fluid and oriented toward partial results, meetings and information exchanges with other researchers during the course of the research (“in-progress communication”), as well as final, archival publication.
• such “in-progress” communication also fulfills needs such as (1) gaining the critical thoughts of others while one’s research is in progress, (2) “staking claim” to one’s activity and accomplishments in an area, and (3) sparking thoughts and new ideas as a product of the discussion.
• in-progress communication does not substitute for the need for final, archival presentation and dissemination of research results. they serve different purposes and needs. both are important.
• there is much more experimentation with regard to means of in-progress communication, where single means of publication and communication are not fixed so deeply in values and tradition as they are for final, archival publication.
• from an institutional standpoint, there are looming questions about how to support faculty in their scholarly practice. our interviews suggested that (at uc berkeley, at least) there are currently few, if any, mechanisms or structures that support storing, archiving, and sharing the significant research products of faculty, such as databases, collections of literature, etc., that are created en route to ultimate archival publication. based on our preliminary research, this is true of other institutions as well, except in a few fields.
• campus-level academic administrators perceive an inevitable but slow evolution toward new forms of publication (particularly in fast-moving scientific disciplines), similar to the shift from print journals to conference proceedings that occurred in computer science in the s. they see this evolution gaining momentum and credibility. respected scholars, however, will begin using such venues in great numbers only once these venues, and the peer review associated with them, become better established.
• according to our interviewees, the budget committee has so far rarely needed to address the issue of publication venue. this is because so few new forms of publication are represented in the cases that come through the committee. former budget committee members, however, believed that the committee would be open to new forms provided that they meet the same standards for peer review and quality as traditional forms. until more cases demanding the evaluation of digital scholarship come before tenure and review committees, we foresee that the situation will remain relatively stagnant.
• campus-level academic administrators perceived a distinction between peer review in the discipline and peer review for promotion. while clearly interconnected, administrators maintained that discipline-based peer review cannot stand on its own; the input of immediate colleagues in addition to discipline-based peer review is necessary for promotional consideration. on the other hand, administrators believed that it was possible, in some cases, for local peer review to substitute for discipline-based peer review, for instance in considering the quality of work published in a non-peer-reviewed journal.

results from the project indicate that the values surrounding final archival publication are deep and relatively inflexible in many, if not most, disciplines at research universities. yet, what scholars value and want will eventually become accepted practice. this is a much more realistic way of looking at issues than is devising models and modes of communication because of their cost efficiencies or other non-research criteria and then trying to draw scholars to them. approaches that attempt to “move” faculty and deeply embedded value systems directly toward new forms of archival, “final” publication are largely destined to fail in the short term. thus, it is our opinion that the development of any new models should focus on the needs of scholarly researchers for both final and in-progress communications in order to determine how those needs are likely to influence future scenarios in a range of disciplinary areas. in summary, we suggest that more innovation does and will occur first in in-progress communication than in final archival publication. one can foresee a scenario where useful and effective innovations in in-progress communication will eventually serve as drivers for improvements in final archival publication.
it is therefore worthwhile to gain deeper insights into the needs, motives, and new capabilities within in-progress communication as well as for final, archival publication.

acknowledgements

we would like to thank the andrew w. mellon foundation for generously funding this research. we are also indebted to an unusually active and involved steering committee that has provided invaluable guidance, support, and time. we thank the more than fifty formal and informal interviewees who graciously scheduled time to provide candid opinions and ideas. irene perciali participated in preparing background information on publication practices in the five disciplines considered.

notes

donald w. king, peter b. boyce, carol hansen montgomery, and carol tenopir. . “library economic metrics: examples of the comparison of electronic and print journal collections and collection services.” library trends ( ): - . also see, roger c. schonfeld, donald w. king, ann okerson, and eileen gifford fenton. the nonsubscription side of periodicals: changes in library operations and costs between print and electronic formats, , http://www.clir.org/pubs/reports/pub /contents.html ( december ). also see, donald j. waters, “managing digital assets in higher education: an overview of strategic issues.” arl bimonthly report , february . http://www.arl.org/newsltr/ /assets.html ( december ). judith ryan, idelber avelar, jennifer fleissner, david e. lashmet, j. hillis miller, et al. the future of scholarly publishing from the ad hoc committee on the future of scholarly publishing, , http://www.mla.org/resources/documents/issues_scholarly_pub/repview_future_pub ( december ). also see, leigh estabrook, the book as the gold standard for tenure and promotion in the humanistic disciplines, , http://lrc.lis.uiuc.edu/reports/cicbook.html ( december ). theodore c. bergstrom, “free labor for costly journals?” journal of economic perspectives , no. ( ), http://repositories.cdlib.org/postprints/ ( december ), - .
also see, aaron s. edlin and daniel l. rubinfeld. . “exclusion or efficient pricing? the ‘big deal’ bundling of academic journals.” antitrust law journal , no. ( ), http://works.bepress.com/aaron_edlin/ ( december ), - . also see, roger noll and w. edward steinmueller. “an economic analysis of scientific journal prices: preliminary results.” serials review ( ): - . deborah lines andersen, ed. . digital scholarship in the tenure, promotion, and review process. armonk, ny: m.e. sharpe. also see nature’s peer review debate at http://www.nature.com/nature/peerreview/debate/index.html ( december ). c. judson king, diane harley, sarah earl-novell, jennifer arter, and shannon lawrence. scholarly communication: academic values and sustainable models. july , http://cshe.berkeley.edu/publications/publications.php?id= ( december ). these particular five fields were selected with the goal of obtaining a diverse array of disciplines and publishing traditions, and taking advantage of the fact that at least one member of our project steering committee had deep knowledge of each of the disciplines selected. academic senate berkeley division. “introduction to the budget committee.” february , http://academic-senate.berkeley.edu/pdf/intro_to_bc.pdf ( december ). university of california office of the president, “appointment and promotion, review and appraisal committees,” academic personnel manual, july , section , http://www.ucop.edu/acadadv/acadpers/apm/apm- .pdf ( december ).
the so-called impact factor is a measure of the citation frequency of papers in journals and is thereby equated by some to the stature and presumably the prestige of the journal. see, e.g., richard monastersky, “the number that’s devouring science.” chronicle of higher education. october http://chronicle.com/weekly/v /i / a .htm ( december ). ibid., “impact factors run into competition.” http://chronicle.com/weekly/v /i / a .htm ( december ). a classic example of in-progress communication from the pre-electronic era is the gordon research conferences, where current research is presented and discussed at length, but there are no written materials, nor are the presentations final or archival. the stimulating environment of these and similar conferences is particularly valuable for generating one’s own research ideas. another traditional example is faculty invitations to other institutions for visits built around a seminar. the recently released mla taskforce report on evaluating scholarship notes that of the departments it surveyed, . percent at doctoral institutions, . at master’s institutions, and . percent at baccalaureate institutions report having “no experience” evaluating digital scholarship. “mla task force on evaluating scholarship for tenure and promotion.” december . http://www.mla.org/tenure_promotion ( december ).
investigating perceptions and support for transparency and openness in research: using card sorting in a pilot study with academic librarians

liz lyon, school of information sciences, university of pittsburgh, n. bellefield ave, pittsburgh, pa, elyon@pitt.edu
eleanor mattern, school of information sciences; university library system, university of pittsburgh, n. bellefield ave, pittsburgh, pa, emm @pitt.edu
wei jeng, school of information sciences, university of pittsburgh, n. bellefield ave, pittsburgh, pa, wej @pitt.edu
daqing he, school of information sciences, university of pittsburgh, n. bellefield ave, pittsburgh, pa, dah @pitt.edu

abstract
this paper explores the role of academic librarians as advocates for research transparency and open research. we describe the design and piloting of a qualitative card-sorting research protocol that investigates academic librarians’ attitudes, awareness and practices related to research transparency.
we report on preliminary results from interviews with librarians, presenting their conceptualizations of research transparency and open research, existing library services that support and advocate for both concepts, and potential services that would augment this support and advocacy. library activities they feel are most important to the advancement of transparency and openness are identified and perceptions of disciplinary differences are noted.

keywords
research transparency; open research; academic libraries; data curation

introduction
in the last decade, there has been a significant policy shift towards articulating and consolidating an open research agenda by governments and research funders. the open agenda has in part been constructed around a desire for a more democratic dissemination of the discoveries and knowledge contained within the scholarly record funded by public monies. there is also a need to ensure the quality, integrity and rigor of the scholarly process and to facilitate accountability and trust in the outcomes of research. the concept of transparency has been included in these descriptions of openness; however, the engagement and perceptions of library and information professionals, who are key stakeholders in digital scholarship and open research, are less understood and form the subject of this study. this paper describes the preliminary findings from a pilot exercise that applied a qualitative card-sorting methodology to probe the following research questions:
1. what are the attitudes, understandings and practices of librarians and information professionals to the concepts of transparency and open research?
2. what are the practical implications of these attitudes, understandings, and practices for the development and delivery of innovative research services provided by academic libraries?

background
research transparency is a recurrent theme within policy statements developed in recent years.
transparency is one of the principles listed in the organization for economic co-operation and development (oecd) guidelines for access to research data from public funding (oecd, ). in the united states, the obama administration released a memorandum, setting three specific actions for departments and government agencies: transparency, participation and collaboration (holdren et al., ). transparency as a value has been discussed in ideological terms by etzioni ( ), who describes the strong variant relating to regulatory contexts and disclosure. the royal society report ( ) references “transparent policies for custodianship, data quality and access” in outlining a set of principles of stewardship that should be shared by custodians of scientific work (p. ). other terms are used in the literature to describe concepts related to transparency: reproducibility (peng, ), repeatability (easterbrook, ), and verifiability (gezelter, n.d.).

asist , october - , , copenhagen, denmark.

the dimensions of openness are investigated by lyon ( ), who describes openness as a continuum with two orthogonal axes: ‘access’ and ‘participation’. corrall and pinfield ( ) explore the first of these dimensions and construct a typology of ‘open’. lyon and beaton ( ) address the second dimension of participation in the context of libraries, reviewing citizen science initiatives, education and skills development. a three-dimensional model of open science, extending this prior work, was introduced by lyon ( ) and includes the additional dimension of ‘transparency’. she suggests transparency is linked to the research lifecycle and points to broad opportunities for library and information sciences professionals to engage with and support transparency in research, including through policy, education and infrastructure.

methods
frequently used in human-computer interaction (hci) and design studies, card sorting is a qualitative data collection methodology that probes how individuals categorize and draw relationships among concepts. there are examples in the library and information science literature of researchers who implement card sorting as a methodology, particularly in studies involving usability testing of online resources (e.g., faiks & hyland, ; whang, ; lewis & hepburn, ). morville and rosenfeld ( ) distinguish two approaches to card sorting, open and closed: “in totally open card sorts, users write their own card and category labels. totally closed sorts allow only pre-labeled cards and categories. open sorts are used for discovery. closed sorts are used for validation. there’s a lot of room in the middle” (p. ). for usability studies of websites in a library setting, there are examples of both open (lewis & hepburn, ) and closed sorting (faiks & hyland, ). this study applies both of these approaches.
for this study, we developed a semi-structured protocol using card sorting as a method. we piloted the instrument with seven librarians at a research university [name removed for review]. our sampling method was driven by an interest in interviewing librarians who support a range of disciplinary communities on campus. in doing so, we tested the instrument with librarians who work with scientists, social scientists and humanists. four sessions were conducted with individual librarians with one to two research team members present. two researchers conducted a focus group with three librarians. we chose to deliver the instrument both in a one-on-one and a focus group setting to gauge whether one approach was more effective than the other. table presents the instrument in detail. we provided participants with cards with pre-defined library activities supporting open research and transparency (e.g.,
assistance with locating a data archive for depositing data, advocacy for open access publishing, training on tools like open science framework) and with blank cards to capture additional activities that they identified. we asked them to categorize the cards according to their status: activities currently done; activities that are not currently done but that can be done immediately; activities done but not requested; activities that can be offered in the future; and not applicable (irrelevant, unsure or unnecessary). we asked the participants to speak to the benefits and barriers of offering existing and potential services and to share their thoughts on the services most critical to open research and transparency.

table . study instrument piloted with librarians
study part: activity
i. setting the stage: the research team reviewed study objectives and verbal consent script; obtained permissions for use of audio recorder.
ii. about you: we asked librarians about the disciplinary background that informs their work, the disciplinary communities they serve, their research experiences, and their support for research in their roles as librarians.
iii. concept construction: participants described what two terms mean to them: open research and research transparency. we asked the participants to keep their understandings in mind as we moved forward in the study.
iv. card sorting of library activities: we provided participants with pre-defined library activities that support open research and research transparency and with blank cards to capture additional activities that they identified. we asked participants to categorize the cards according to status of library support:
● activities that the library is currently doing
● activities that the library is not currently doing but that you think can be done immediately (quick wins)
● activities that the library is ready to provide but that patrons have not or have rarely requested
● activities that you think the library can offer in the future (longer-term wins)
● the rest of activities (i.e. irrelevant, unsure)
we asked participants to provide verbal insight into their sorting decisions.
v. benefits and barriers: participants were invited to comment on their perceptions of the benefits of the services to research and to the library. we asked that they comment on barriers that do or could challenge the library’s development and provision of the open research/research transparency support activities.
vi. importance of library activities: we gauged librarians’ attitudes concerning the importance of library activities to the advancement of research transparency and open research. the librarians arranged the activities on a numbered scale, with being the least important and being the most important.
vii. debriefing: we invited feedback on the pilot instrument from participants, asking if anything was unclear or, from their perspective, could be improved.

we met with librarians in their offices or in a designated place in the main campus library. the length of the interviews with individual librarians ranged from to minutes, while the focus group reached an hour. because of the timing, we had to modify the instrument for the focus group and chose to move past the discussion on barriers and benefits. we recorded and transcribed sessions and photographed the results of the participants’ card sorting.

preliminary findings

definitions of open research and research transparency
when asked about the meanings of open research and research transparency, librarians interpreted open research as closely connected to other, perhaps more familiar terms in librarianship, such as open access. describing open research, librarian related “...it means sharing and being transparent with your research throughout the entire process from a grant proposal to the very end, with publishing. and i think a lot about open access and being able to access final products or information even during the process as well”.
some participants conflated the terms, seeing them as one and the same. for others, the boundaries were clearer. librarian associated open research with clearly reported protocols and the articulation of process, stating: “to me, that [open research] means openness in the disclosure of methods of research”. for at least one librarian, there was a strong connection between transparency and ethics. librarian explained, “i’m almost feeling like it [research transparency] is more ethical – principles instead of protocol”.

library activities supporting research transparency
we asked participants to sort pre-defined cards that included library activities that may support open research and transparency (closed sorting) and to add new cards to sort (open sorting) (see table , study section iv). few participants added activities to the pre-identified cards. those that were added included assistance that would help with later discovery and with greater provision of access: consult on database and metadata design; consult on taxonomy development; and approach community groups to offer assistance with access to information (librarians and ). there was limited reference to the research lifecycle; instead, the participants primarily used the chronology of the academic term to talk about existing and potential services. three particular topics attracted in-depth commentary from the participants: a recognition of sharing and collaboration as a primary facet of transparency and openness, the library’s role in education and advocacy, and the relevance of the disciplinary environment to perceptions of open research.

benefits and barriers
the librarians articulated benefits that library services supporting open research and transparency could bring to research.
librarian , for example, addressed how library assistance with research data stewardship could advance research: “having well-cared for data that can then be used by other people to discover other things is definitely a good thing for science. it makes me think of all the articles in the news about people who haven’t been able to replicate experiments and stuff” (librarian ). benefits for the library and for librarianship were also highlighted, reflecting current professional challenges: librarian remarked, “i think it’s a matter of survival. i think it’s a matter of staying where our researchers are and not being left behind.” for this librarian, extending research services is necessary for academic libraries to remain relevant. several participants, however, reported limited experience with independent or collaborative research and suggested that this could introduce barriers to service. among the barriers described was the challenge of incentivizing researchers to adopt new research practices and to think of the library as a place for support. librarian questioned whether the academic reward system sufficiently encouraged open research practices among faculty, namely junior scholars on the tenure track. this same librarian questioned whether funders motivate thoughtful attention to open research behaviors. with regard to requirements for data management plans, she noted: “i remember from a talk someone saying ‘no one’s ever been denied a research grant because of their data management plan’”.

importance of library activities
when asked to rank library activities supporting open research and transparency, the librarians considered the importance of the services to the disciplines they themselves served. librarian ’s card sorting reflected the importance of the disciplinary context in making prioritization decisions for services: “you have to know your discipline and what’s important.
quite honestly, if i were to do this for english (see figure ), it might look quite differently than for computer science (see figure )”. for academic libraries prioritizing the development and delivery of research services, this librarian suggests that the potentially competing needs of the communities they serve must be considered. one librarian observed that she was prioritizing activities that occurred at the end of the research. she commented: “as i was doing this, i realized that i ranked the most important as the end product – the publishing part. i think that’s because for me, …., this is where i probably would be looking at the most to support research.” this same librarian assigned a lower importance to activities that she viewed as group-specific (benefiting what she perceived as a niche group) and activities that users may be able to accomplish without intervention (e.g. locating books on open research). librarian articulated the challenge of prioritizing quick wins versus longer term actions, where importance may be measured by what the library can most easily accomplish: “sometimes you prioritize things you can do quickly and get something done but it may not be the biggest most…”. the challenge of prioritizing the allocation of resources for new services, in terms of quick wins versus the most impactful long-term solutions, is illustrated by the comment by librarian : “the consulting activities felt to me very important – maybe because i’ve been thinking about what our role is in terms of education and the things that we can actually provide in the form of consultation.
it seems like we can hit the ground running if we do things like help researchers take care of their data…all of that is actually helping from the ground up.”

discussion
the card-sorting exercises conducted in interviews and focus groups provided valuable early insight into framing the context of research transparency from the perspective of academic librarians, as well as determining the methodology, vocabulary and process best suited for future investigation of the research questions. the discussion is arranged in two sections reflecting these perspectives.

professional library practice
the results confirmed the complexity of open research and research transparency concepts, with some participants viewing these terms as synonymous. the narratives highlighted the different facets of research transparency (e.g. collaboration, sharing resources, and documenting metadata). a key finding was the critical importance of establishing the relevance of the research lifecycle at an early stage of the protocol; it provides an essential foundation for examining transparency practices and workflows in some depth. many of the interviewees did not relate their answers to the research lifecycle and this framing was missing from the conversations. instead, the librarians were largely conceptualizing research in the context of the academic term. one interpretation of this result is that librarians are positioning their activities as learning opportunities oriented towards students, rather than as advocacy activities oriented towards the research community. this is supported by the emphasis on reference work, training or instruction in research methods and by commentary. a further interpretation is related to service delivery mode: librarians are positioned at a distance from researchers (i.e. located in the library, rather than adopting an immersive mode). observations from the card-sorting exercises and analysis of the dialogue highlighted key themes requiring further exploration.
firstly, the scope and interpretation of the key terms indicated a predominant focus on access and democratization of information as an important dimension for librarians. this focus is one that can be investigated further. secondly, there is a need for greater clarity of roles in engagements with the researcher community. the observed tensions between whether the library should ‘advocate’ or ‘educate’ around open research captured the contrasting service values but also strengthened the importance of identifying “transparency verbs” (lyon, ).

implication of methodology
the card-sorting methodology worked effectively in a group and with individuals, although there were time pressures with the focus group. the selection of terms within the instrument requires further consideration to increase clarity. explicit framing within the research lifecycle is required to push librarians to think differently about the timing of their interventions: this is a key conclusion from the pilot. the prioritization of activities revealed a further finding: the perceived differences across disciplinary practice and culture. the application of a single instrument across different disciplines will require careful review to ensure particular domains are not disenfranchised by the choice of transparency concepts, vocabulary, or activities described.

figure . card sorting for english
figure . card sorting for computer science

conclusions and future work
card sorting provided valuable early insight into framing the context of research transparency from the perspective of academic librarians, as well as determining the methodology, vocabulary and process best suited for future investigation of the research questions. this pilot has provided insightful pointers to inform the refinement of the methodology for a larger-scale study and has teased out librarians’ opinions, understandings and practices related to open research and transparency.
the importance of explicitly framing activities within the research lifecycle, the selection of vocabulary used in the instrument (in particular the use of transparency verbs) and differences in disciplinary research practice and culture, have emerged as key points to consider in designing the next stage of this study.

references
corrall, s. & pinfield, s. ( ). coherence of “open” initiatives in higher education and research: framing a policy agenda. proceedings of iconference , berlin.
easterbrook, s.m. ( ). open code for open science? nature geoscience , - .
etzioni, a. ( ). is transparency the best disinfectant? journal of political philosophy ( ), - .
faiks, a. & hyland, n. ( ). gaining user insight: a case study illustrating the card sort technique. college & research libraries, ( ), - .
gezelter, j. d. (n.d.). open science and verifiability. retrieved from http://web.stanford.edu/~vcs/nov /dg- openscienceandverifiability.pdf
holdren, j.p., orszag, p. & prouty, p.f. ( ). president’s memorandum on transparency and open government – interagency collaboration. https://www.whitehouse.gov/sites/default/files/omb/assets/ memoranda_fy /m - .pdf
lewis, k. m. & hepburn, p. ( ). open card sorting and factor analysis: a usability case study. the electronic library, ( ), - .
lyon, l. ( ). open science at web-scale: optimising participation and predictive potential. consultative report. retrieved from http://opus.bath.ac.uk/ / /open- science-report- nov -final-sentojisc.pdf
lyon, l. & beaton, b. ( ). citizen science, open access, open data, and research inclusivity. proceedings of alise annual conference, chicago.
lyon, l. ( ). transparency: the emerging third dimension of open science and open data. liber quarterly. ( ), – .
morville, p. & rosenfeld, l. ( ). information architecture for the world wide web. sebastopol, ca: o'reilly media.
oecd ( ). oecd principles and guidelines for access to research data from public funding.
retrieved from http://www.oecd.org/sti/sci-tech/ .pdf
peng, r.d. ( ). reproducible research in computational science. science , - .
royal society ( ). science as an open enterprise: open data for open science. retrieved from http://royalsociety.org/~/media/royal_society_content/policy/projects/sape/ - - -saoe.pdf
whang, m. ( ). card-sorting usability tests of the wmu libraries’ web site. journal of web librarianship, ( - ), - .

digital scholarship and digital studies: the state of the discipline
matthew kirschenbaum and sarah werner
book history, volume , , pp. - (article)
doi: . /bh. .
http://muse.jhu.edu/journals/bh/summary/v / .kirschenbaum.html

i
in dialogue with her epistolary interlocutors in three guineas ( ), virginia woolf sketched the current landscape for bookmaking and bookselling from the perspective of an author and small press: “still, madam, the private printing press is an actual fact, and not beyond the reach of a moderate income. typewriters and duplicators are actual facts and even cheaper. by using these cheap and so far unforbidden instruments you can at once rid yourself of the pressure of boards, policies, and editors.” the passage is striking, not only for the uncompromising pragmatism of “actual facts” but for how deeply it resonates with our own situation, complete with questions of access to technologies of both authorship and publication and the uncertainties of a still-shifting legal landscape. today not only are word processors and e-books actual facts, so too are mass digitization projects and new forms of analytics ranging from so-called data mining and distant reading to visualization, geographic information systems (gis), and advanced image processing techniques. book history, as both a scholarly discipline and an intellectual community, now shares the world with the actual facts of these things.

nor is this an especially new development, save perhaps for some technical particulars. those who remember the first wave of academic enthusiasm for hypertext, cybertext, electronic textuality, and virtual-everything will also recall the unabashed enthusiasm with which glossy books with primary colored covers celebrated the coming of the empowered reader, the decentered author, non-linear narrative (seemingly paradoxically held together by “links”), and the equally paradoxical end of closure. these texts were laced with techno-neologisms or else imports from continental theory: thus “hypertext” was deemed “writerly” after roland barthes, whereas the poor, staid pages of the conventional codex were condemned to be merely readerly. readers themselves, meanwhile, clicked through “lexias” which populated hypertexts (or hypermedia), engaging “transversal functions” to navigate the “contours” of “textons” rendered on-screen in “flickering signifiers” dubbed “scriptons.” in journalism and the popular media, meditations on the death of the book were the order of the day. the gutenberg galaxy was preemptively mourned by the gutenberg elegies, while wired magazine served up a monthly dose of mcluhanesque folk wisdom coupled with edgy, pixelated layouts that emblematized a new aesthetic that was equal parts mtv and william gibson. text was reimagined as image, whether the suddenly ubiquitous banner ads on first-generation web sites or the photoshopped excesses of wired’s many imitators. critics from neil postman to michael joyce reframed the age-old paragone between word and image as a new battle of the books: “hurry up please, it’s time,” joyce wrote in . “we are in the late age of print; the time of the book has passed. the book is an obscure pleasure like the opera or cigarettes.
the book is dead, long live the book.”

today, more than twenty years further on, we are perhaps in the late age of print still, even as books themselves are undeniably still being printed—indeed never more conspicuously so, as they are fabricated on the spot and “on demand” by large purpose-built machines installed in the showrooms of venerable booksellers like politics and prose in washington, d.c., or the harvard book store. that most modest word “text” has rather immodestly become a verb. the future of the web turns out to be not immersive virtual reality—from the early virtual reality modeling language to second life, efforts to terraform the web in three dimensions have achieved at best niche success—but rather social media. our avatars are not the animatronic phantasmagoria projected by science fiction writers like neal stephenson but rather mere thumbnail images, most often “selfies” captured with ubiquitous digital camera technology; our online activities consist less of the cyberpunk exploits of gibson, stephenson, or bruce sterling than considerably more mundane interactions: the aforementioned texting, as well as “posting,” “sharing,” “liking,” and, yes, “tweeting.” nonetheless, if the social media landscape lacks the glam escapism of techno-color science fiction, it is no less dangerous and sometimes malevolent a place, all the more so because the barriers between the “virtual” and the material are becoming ever more permeable—what steven e. jones, after gibson, has compellingly framed as the “eversion” of cyberspace, its eruption into the material world. whether a teen suicide as a result of cyber-bullying on facebook or sinister revelations about government surveillance, the web today is considerably more perilous a place than mere lolcats and likes might let on.
it is also overtly textualized, governed now by what kirschenbaum has previously termed a .txtual condition (after jerome mcgann’s influential idea of the textual condition). hypertext persists, but it has become normalized, absorbed into the most basic fabric of our daily routines, and governed neither by the readerly nor the writerly but rather by pitiless regimes of clicks, hits, eyeballs, and analytics. innovative electronic fiction still exists, some might even say has flourished (witness emily short and liza daly’s remarkable first draft of the revolution or the even more recent device , a sophisticated piece of fiction wrapped in a series of ludic puzzles developed as an app for the ipad), even if we are no longer buying the equivalent of small press titles on diskette. but the bookselling industry as a whole has been utterly transformed by the still unsettled cohabitation of print and e-books, even as massive swaths of the cultural record are digitized by google books, the internet archive, and an array of smaller initiatives. readers will know that was the year that amazon.com reported that e-book sales in its popular kindle format had exceeded their sales of printed books.
this, then, in broad strokes, is the media ecology in which contemporary authorship, book publishing, and reading now find themselves, a text-centric world that is characterized by new forms of short-form interaction, new economic models, new metrics of visibility and reputation, and new forms of viral dissemination, as well as a polyglot riot of devices, platforms, systems, and services, most of them held tenuously together in something known vaporously only as “the cloud.” book history, however, must keep its feet on the ground: narratives of inevitability are as uninteresting as they are unnecessary.
the “digital” pedigree that is the ostensible unifying principle for this essay therefore reflects not so much the accidents of medium (the supposed reduction of all knowledge to a lingua franca of ones and zeroes) but rather a series of material interventions in established systems of reading, writing, and publication, interventions that take shape and define themselves in relation to the affordances of other, more familiar media, the printed page not least among them. the kind of scholarship we are interested in here, whether theoretical or applied, does not posit a transcendental “digital” that somehow stands outside the historical and material legacies of other artifacts and phenomena; rather, the scholarship we favor understands the digital as a frankly messy complex of extensions and extrusions of prior media and technologies. rather than speaking in a speculative or deterministic mode, we have focused on the particular, grounding our review on what specific projects are now doing and what is happening in the real, decidedly non-virtual world of books today.
we have also chosen to focus our remarks on those areas where we feel we might have some authority to discern overall trends and developments, as well as where we can articulate a message we want to bring to this journal’s readership. for werner (section ii), this is the relevance of digital tools and methods to diverse areas of book history and the study of books as physical objects, whether or not individual scholars may elect to identify as “digital humanists”; for kirschenbaum (sections iii and iv), it concerns the transformations underway in nearly every aspect of contemporary authorship, reading, and bookselling, and their implications for those scholars who seek to approach the study of printed books from the s to the present.
both of us see the value in digital tools and in theories of the digital for complicating and reconfiguring our notions of textual “materiality” and dissemination.
our coverage is not comprehensive, and the omission of a specific project or work should not be construed as a comment on its significance or interest; but it is not merely coincidental that in looking at these new fields, much of the scholarship we cite exists in the full range of options for scholarly publishing, from print collections to electronic editions, blog posts, and digital databases. we have made no attempt to cover technical developments in the delivery of electronic content, especially not data standards like xml or epub, or the particulars of device technology like electrophoretic ink or retina displays. we have largely eschewed the fascinating field of book futurism, as manifested by journalists and critics such as tim carmody and matthew battles, and organizations such as bob stein’s institute for the future of the book. nor have we covered the public debates about the status of “reading” in contemporary society, as focalized by the media attention around the several reports on the subject from the national endowment for the arts. likewise, we have given only very passing consideration to copyright, legal matters, and the court cases being waged over google and others’ mass digitization and scanning efforts. we have also made no attempt to cover the ins and outs of the debates and discussions that have accompanied the sudden and seemingly ubiquitous arrival of “digital humanities” as the term of choice for digital scholarship. finally, our perspective is unavoidably parochial in that it is limited primarily to work not only in english but indeed originating in anglophone nations. we hope whatever usefulness attends the survey that follows might help offset that last shortcoming in particular.
in keeping with the style of previous “state of the discipline” essays we have given the publication information for the many works we discuss inline in the text; these are not typically duplicated as citations in our notes. we have provided a list of resources at the end, which will be useful as a starting place for those seeking a hands-on introduction to the projects and resources we discuss.
ii
there is sometimes a reluctance among book historians to see the world of digital humanities as relevant and helpful to our work. we are, after all, a group who works intensely with material texts, books in hand, seated in special collections of rare materials. perhaps more than most other scholars, we are aware of the immediacy and circulation of texts as physical objects. yet much of the digital work that seems to get the most attention in the press and grant world at the moment involves distant reading: using computers to analyze large corpora, looking for patterns of usage and other signals that are not readily visible through reading one book at a time. in the right hands, distant reading can reveal new insights into the development and deployment of linguistics and rhetoric and genre, the impact of cultural forces, and the patterns of literary influence. (in the wrong hands, it fails to do any of these things.) this big data trend in the humanities is not one that has spoken to book historians. it has been the tool of literary and linguistic scholars, something prized by researchers interested in text, rather than textual production. but ignoring what digital tools can offer the study of book history cuts us off from opportunities to further develop our knowledge of how books are made and used.
not only do we need to learn what tools to take advantage of, the rest of the scholarly and public world needs our insights as part of the conversation, especially as the means by which information circulates today continues to shift in response to the technological and societal shifts around us.
this approach should not be a big change for book historians. the desire to catalog and to count and to sort means that book historians have been long involved in digital humanities, whether it has been called by that name or no. in the field of early modern english literature, this desire to collect information has produced the forerunners of many of the tools that we use today. the catalogue of printed books in the library of the british museum, a decidedly non-digital project, led to pollard and redgrave’s a short-title catalogue, which in turn is the forerunner to the decidedly digital english short title catalogue now hosted as an open-access resource at the british library (we will return to the implications of this trajectory later in this section). the estc is not the only catalog that book historians rely heavily on, of course, but it is a convenient stand-in for the ways in which the tools we take for granted are, whether despite or precisely because of their long histories, digital resources. and because they are digital resources, the information in them is available to explore and manipulate in ways that can reveal larger patterns of production and circulation. the estc records works printed between and in english and in the british isles and north america; it includes information not only on author, title (typically including uniform and variant titles when pertinent), date, and imprint, but also often on format, page length, genre, subject, and current institutional holdings.
with access to the full marc data in this and similar catalogs, one has access to many of the pertinent elements of the first centuries of book printing and the ability to sort, refine, and analyze its contours.
the atlas of early printing uses the data in the incunabula short title catalogue (another freely accessible database at the british library with origins in print catalogs) to map out the locations of presses and their dates of operation in the incunable period. the atlas also has options to indicate the locations and dates of paper mills, book fairs, universities, and conflicts, thus handily making visible the relationships between cultural and economic forces in the early days of print. (the atlas also provides some background essays on early printing and books and an animated printing press, in addition to a carefully detailed explanation of where their data is from.) the atlas looks at the creation of incunabula; it is also possible to use this data to look at their subsequent histories. mitch fraas uses the gesamtkatalog der wiegendrucke (first printed in and now available as an online database through the staatsbibliothek zu berlin) to look at the distribution of incunabula in institutional holdings today. (the gw provides better geospatial information to work from, fraas explains; your visualization is only as good as your data is.) while generally the map of output coincides with current holdings in europe, there are some gaps (fraas notes them in the adriatic and the region south of the baltic sea and east of berlin) that suggest some disruption in institutional histories. the atlas does not necessarily show us anything we do not already know as book historians, although it is valuable for showing us that information in a manner that makes it clearly understandable to those who are not. and while fraas’s exploration of incunabula distribution is merely the start of delving into that data, his mapping highlights how questions about institutional histories and resources are an important aspect of studying rare books today.
there are numerous other mapping projects, especially for the hand-press period, including the french book trade in enlightenment europe, – : mapping the trade of the société typographique de neuchâtel, the atlas of the rhode island book trade in the eighteenth century, and, although still in its early days, mapping colonial americas publishing project. indeed, a number of the projects highlighted at sharp’s digital showcase were mapping-related. mapping, you might be thinking to yourself, is not a particularly new activity for book historians, and you would be right. we have been producing maps of the book trade for as long as we have been studying it. but that is the point: the technology and value of mapping is not foreign to book history, but of it.
cataloging is also of book history, an information-parsing tool that we have been producing since there were first texts to be organized. with the easy manipulation that digital records allow, they can not only track texts and their locations, but help us discern other traits of production and reception. ben schmidt, for instance, has carefully considered whether or not we might be able to gain insight into the consumption and cultural history of genres of books by using library of congress classifications of books published in the mid-nineteenth century and their relative page length in order to examine whether history was read more often during revolutions.
schmidt has also been using lc classifications as a way of looking at the gender distribution of authors in library holdings of works published between and , noting that, among other findings, the field of german history appears to be significantly more male-authored than other fields, while fiction unsurprisingly has the highest numbers of female authors. meanwhile, scholars focused on the history of reading have taken advantage of database capabilities to create catalogs of readers: the reading experience database (red) is a multi-national collection of databases tracking evidence of reading left through a range of sources from to , including marginalia, diaries, court records, and surveys. what middletown read reconfigures the detailed records of muncie, indiana’s, public library to create a database of readers and reading materials between and .
figure . fraas’s map of incunabula distribution in europe. screenshot by werner.
in most of these instances, what such projects are using are the metadata of books, turning their imprint and holdings information into network analysis. but what might digital tools offer scholars who are interested in textual history? are there ways of mapping the interior of a book? alan galey has been experimenting with how to display textual variants and paratextual movement across editions. as textual scholars have long noted, the instability of texts is a regular feature throughout textual transmission histories. print editions have relied on a combination of commentary, parallel texts, and varying levels of complex notation to indicate variants. digital editions have tended to rely on the same typographical features, albeit sometimes with hypertext functionality: one option might let you display the hamlet second quarto variants only, another the folio variants.
galey’s experimentation, however, plays with displaying instability itself, animating variants so that they switch back and forth without the user’s input. as a method of exploring the effect of instability on textual circulation, digital tools offer options that paper does not. animation enacts instability on the word level, but galey has also experimented with how to visualize the instability of paratext by mapping its relative placement in and absence from textual sequencing. tracking the levels of paratextual material in more’s utopia, to use his prototype example, helps us understand the nuances of its circulation and reception: the humanist circle through which more carefully deployed his text shows up in the multiple combinations of commendations published in its first four editions. (if you want to read through that paratextual material, visit the open utopia, which includes all letters found in the – editions, albeit not in an order that reflects any one of those printings, and which strives to provide an interface for open, social commentary on the text.)
the same mapping can be done for the levels of commentary in variorum editions, in which centuries of notes accrue in differing densities to texts. now that increasing numbers of digital editions are being created in increasing levels of complexity (the modern language association’s push to release its new variorum shakespeare editions in xml comes to mind, as does the folger shakespeare library’s tei-encoded digital texts of shakespeare’s plays), the opportunities for textual scholars to develop new tools for displaying and analyzing textual histories are rich. galey’s prototype focuses on highlighting patterns of emendations over the centuries; in response to the mla’s invitation to create projects based on its comedy of errors edition, patrick murray-john used their data to view the variorum commentary as a community of scholarly conversations in his bill-crit-o-matic. the potential of electronic editions to allow social annotations not only has the possibility of expanding the knowledge pool that scholarship can draw from, it replicates the interpretive methodologies of earlier periods. annotated books online is a digital archive of early modern annotated books that provides high-quality digital images of the books as well as transcriptions and translations of their marginalia and that invites users to contribute their own transcriptions, thereby annotating the annotations. the implementing new knowledge environments (inke) project is, in their own words, “an interdisciplinary initiative spawned in the methodological commons of the digital humanities that seeks to understand the future of reading through reading’s past and to explore the future of the book from the perspective of its history.” one recent inke project, the social edition of the devonshire manuscript, takes a sixteenth-century manuscript miscellany and turns it into a wikibook edition, aiming to replicate in digital form the coterie circulation of early modern poems.
the circulation of texts within coteries and beyond them is another book history field that benefits from digital tools. infectious texts: viral networks in th-century newspapers uses algorithms to search large corpora of nineteenth-century newspapers in order to identify texts that have been reused in multiple papers. the team’s work so far has identified the most popular viral texts, suggesting that their popularity is due in part to their ability to participate in multiple contexts. more excitingly, they have used gis software to map the print histories of these viral texts alongside transportation data, census reports, and other information in order to begin uncovering the physical and social networks that linked these viral texts.
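the essay does not say which algorithms the infectious texts team uses, so what follows is only a minimal sketch of one common way to flag reused passages: break each text into overlapping five-word sequences (“shingles”) and compare the sets with jaccard similarity. the function names, the threshold, and the three sample “newspapers” are all invented for illustration and do not come from the project.

```python
import re
from itertools import combinations


def shingles(text, n=5):
    """Return the set of overlapping n-word sequences found in a text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Share of shingles two texts have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_reprints(papers, n=5, threshold=0.5):
    """List pairs of papers whose shingle overlap meets the threshold."""
    sets = {name: shingles(text, n) for name, text in papers.items()}
    pairs = []
    for left, right in combinations(sorted(sets), 2):
        score = jaccard(sets[left], sets[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    return pairs


# Hypothetical sample data: paper_b reprints paper_a with a short preamble.
papers = {
    "paper_a": "a stitch in time saves nine and a penny saved is a penny "
               "earned said the old farmer",
    "paper_b": "It is often said that a stitch in time saves nine, and a "
               "penny saved is a penny earned, said the old farmer.",
    "paper_c": "the railroad reached the town last week and the first "
               "train was met by a large crowd",
}

reprints = find_reprints(papers)
```

a real newspaper corpus would need more than this: ocr errors break exact word matches, and comparing every pair of pages does not scale, so the sketch is best read as a picture of the underlying idea rather than a workable pipeline.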
the infectious texts team has confirmed a correlation between the railroad and the spread of linked texts, but they have also uncovered relationships between newspapers that might not have been otherwise noticed. their graphs revealed a close connection between the vermont phoenix (brattleboro, vermont) and the fremont journal (fremont, ohio) based on the frequency with which they reprinted texts; further investigation by the team showed that the newspapers’ editors were brothers-in-law. one of the participants in infectious texts, ryan cordell, has also produced research using similar techniques of mobilizing large-scale digitization to reveal the frameworks of social texts; looking at the early publication of nathaniel hawthorne’s “the celestial railroad,” cordell recovered early printed witnesses of the story and paratexts that had not been part of the scholarly record.
if the focus so far has been on ways in which digital tools are a natural home for the interests of book historians, it shifts here to argue that digital tools would benefit from the scrutiny of book historians. the english short title catalogue (estc) is one example: the current version is a remarkable tool, but what do we learn from studying its history through different media incarnations? the biases embedded in the catalog’s initial creation become part of its current functionality. for instance, as ian gadd has been exploring, the catalogue of printed books in library of the british museum printed in england, scotland, and ireland, and of books in english printed abroad to the year used as its end-date the year because that was the terminal date used by edward arber in a transcript of the registers of the company of stationers of london – , a.d (a date chosen not because of its significance but because after this point, in the wake of the long parliament, the registers grew exponentially and working with them would have become significantly more complicated).
that decision shaped the scope of a.w. pollard and g.r. redgrave’s a short-title catalogue of books printed in england, scotland, & ireland and of english books printed abroad, – (stc), which catalogs extant books in major institutional holdings along the same criteria and incorporates information from arber’s transcript. works printed between and are part of a different catalog compiled in the mid-twentieth century, donald wing’s short-title catalogue of books printed in england, scotland, ireland, wales, and british america, and of english books printed in other countries, – (wing), which, unlike the stc, does not incorporate information from the stationers’ registers and which excludes periodicals and many ephemera. works printed in the eighteenth century formed the eighteenth century short title catalogue, which was published in the s first on microfiche and later on cd-roms, and which was intended to be a union catalog of all known copies, again excluding periodicals and most ephemera. in the s, these three catalogs were combined into the single english short title catalogue and released first as cd-roms in the mid- s; in , the estc was made available as an online, open-access resource hosted by the british library.
users of the estc today, therefore, are in fact consulting three separate catalogs, each following its own principles and scope. someone who is not familiar with that history might wonder why periodicals suddenly disappeared in , for instance, or why a catalog of english works includes many items printed in other languages. one small but telling detail is the way in which estc numbers are generated. stc and wing numbers are ordered by author; put the numbers in order and the list of authors will also be in order. estc numbers, however, are determined in part by the location from which they were entered (a number starting with t was done at the british library, for example) and are otherwise randomly generated.
the earlier numbering system was shaped by the format by which users encountered the records: you needed to be able to turn through pages in a book to locate the item or the item number you were looking for. under these older systems, you could flip through pages to get to “joceline, elizabeth” to find the edition of her a mother’s advice to her unborn child or you could look up stc to discover what work the number corresponded to. (wing numbers start with the first letter of the author’s last name and then proceed numerically.) but the estc was from its beginning conceived as a machine-readable catalog. there was no book that had to be flipped through in order to find an entry, but search fields. with locatability not being shaped by sequence but by searching, the primary purpose of estc numbers is durability: they provide a persistent identifier, not a key to discovery. looking up s brings you to the same joceline edition (stc ), but the next edition is s (stc . ), followed by s (stc . ) and r (wing j ). s leads to nathaniel wickins’s woodstreet-compters-plea, for its prisoner (stc ); s to thomas middleton’s sir robert sherley his entertainment in cracovia (stc ). with estc cataloging being done in multiple locations simultaneously, random numbers make more sense than sequential ones. both systems of citation numbers make sense according to their own needs, and if we were to look closely at the numbers without knowing their history, we might still be able to reconstruct the paths behind their creation. but without that vantage point, we miss the stories that estc has to tell us. a book historian’s perspective on the shaping principles and effects of this digital resource adds a much-needed lens on how its current incarnation operates.
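the contrast between a number that encodes a sequence and a number that only identifies can be seen in a few lines of code. this is a minimal sketch under stated assumptions: the three records, their authors, and all of their numbers are invented, and the field names print_no and machine_id are hypothetical labels for a wing-style number and an estc-style number respectively.

```python
import re

# Hypothetical records: authors and numbers are invented for illustration.
RECORDS = [
    {"author": "adams, a.", "print_no": "A100", "machine_id": "R20731"},
    {"author": "baker, b.", "print_no": "B250", "machine_id": "S1042"},
    {"author": "clark, c.", "print_no": "C75", "machine_id": "R318"},
]


def number_key(number):
    """Sort key for a letter-plus-digits number: letter first, then value."""
    match = re.fullmatch(r"([A-Z])(\d+)", number)
    if match is None:
        raise ValueError(f"unrecognized number: {number!r}")
    return match.group(1), int(match.group(2))


def authors_sorted_by(field):
    """Authors in the order produced by sorting records on one number field."""
    ordered = sorted(RECORDS, key=lambda record: number_key(record[field]))
    return [record["author"] for record in ordered]


# Sorting on the print-era number reproduces the author sequence of the book.
by_print = authors_sorted_by("print_no")

# Sorting on the machine-era identifier yields no meaningful sequence.
by_machine = authors_sorted_by("machine_id")

# The machine-era identifier earns its keep as a stable lookup key instead.
index = {record["machine_id"]: record for record in RECORDS}
```

here by_print comes out in alphabetical order of author while by_machine does not, yet index finds any record directly from its identifier, which is the durability the passage describes.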
the perspectives of book historians are also sorely needed on the large-scale digitization efforts underway at places such as google books, the internet archive, hathitrust, gallica, and other institutions that are actively aiming to make print resources available as digital objects. digitization projects are key to the history of digital humanities and to the work of book historians and textual scholars. some of the earliest projects, like the william blake archive, which began in and continues on today, have helped us see the possibilities for online resources for bringing together disparate physical objects into a single virtual home. the shelley-godwin archive, released in beta in november , attests to the power of this kind of digital work. combining high-resolution images of works held in multiple libraries with careful transcriptions and an interface that allows users to search and interact with the texts in a range of ways, these sites make possible a view into the production and dissemination of these important materials. more disparate digitization efforts can have the same effect. at last count, there were ten different freely accessible copies of shakespeare’s first folio fully digitized by eight different institutions; although the quality of the images and the richness of the interfaces vary, it is nonetheless possible for a user to find and share variants as well as other copy-specific features. while it can be hard to track down digitized copies of works when they are held at different institutions, an omnibus site can also be misleading in its appearance of completeness. the newly opened emily dickinson archive and the debate over its contents highlight some of those dangers: although the site presents itself as a source for viewing dickinson’s manuscripts, the materials online represent only a portion of the manuscripts available through its partner institutions.
the controversy over how the site positions itself in relationship to already published editions of dickinson’s poems and which partner institutions have been given access to materials reflects not only the struggles for funding and publicity that all libraries face, but the long-standing battle over dickinson’s legacy and manuscripts fought first by her heirs and continued by the institutions holding the bulk of her papers.
digitization has wonderful benefits for book historians: we can consult high-quality images of works from multiple locations at a single moment. but our ability to do that depends on the quality of the metadata attached to those digital objects. finding digital copies of works can itself be a huge challenge. anyone who has searched for something on google books knows how difficult it can be to know what you are looking at: multiple-volume works are recorded as separate objects without being linked together and works that exist in multiple editions (let alone multiple states) are often cataloged as different printings than what they are. the records in hathitrust’s digital library are dramatically better (not surprising, since they are a partnership of academic and research institutions), while those in the internet archive are a mixed bag (depending on the quality of the information provided by the person who uploaded the item). eighteenth-century book tracker, run by benjamin pauley, strives to improve this situation by creating an index of openly accessible digital facsimiles of eighteenth-century texts linked to bibliographically reliable records. the site allows users to add texts but also provides a bookmarklet to help users navigate google books and internet archive by making it easier to identify accurate bibliographic information about the texts they hold.
(pauley is also part of the working group for estc , an effort to reimagine how the estc can be redesigned “as a st century research tool,” including allowing for user input and better matching of estc records with digital resources.)
of course, figuring out what you are looking at is only part of the challenge of working with digitized texts. another is understanding the risks of letting a copy stand in for an edition. digitization projects often let the digitization of one book represent the entire print edition of that work. early english books online (eebo) claims that it “contains more than , titles” listed in stc, wing, and the thomason tracts. but what eebo provides is access to digitized microfilms of copies of more than , titles. especially in the hand-press period, with its proliferation of variant states, including stop-press changes and cancels, a copy is not necessarily representative of an edition. to choose but one example, the eebo instance of the edition of the earl of rochester’s poems &c on several occasions (r , to use its wing number) is taken from the huntington copy of the work, a copy that includes the cancellanda of leaves d and d , rather than the cancels that were to replace them. (the later state of the poems omits the last stanza of “love to a woman,” presumably out of the same prudishness about sexuality that was responsible for the cuts made throughout the collection.) there is nothing in eebo’s record to indicate that this copy is anything other than a surrogate for the edition. but the textual history of rochester’s poems is complicated enough without adding in confusion about states of editions.
the solution to this problem is not difficult: accurate and accessible metadata, so that we know what it is we are looking at and so that search engines can find it, would fix many of these problems.
the problem of how digital objects can represent the materiality of textual objects is a more complicated one, and in many ways more interesting. at the moment, most digitizations focus on the value of the object as a text to be read. the text block is digitized, but not necessarily the endleaves or the binding (sometimes, as in eighteenth century collections online, even the blank pages inside the text block are omitted, presumably on a cost-saving theory that if a page does not have words on it, it surely does not have any meaning). and many images are of pages only, rather than openings, so that the text is further removed from the context and experience of reading it in a book. digitizations of textual objects tend not to show the watermarks and chainlines of paper, the bite of type, the texture of parchment: the characteristics of an object that we observe as we handle it and that inform our knowledge of its making and its history. digital facsimiles appear to be flat, made up of pages without depth or relationship to other pages, part of a sequence that is made up of bits rather than bindings.
but this is not because such flatness is inherent to digitization. it is because of the limited ways in which digitization has been put to work for us. we have allowed digital images of texts to be conceived of as surrogates of those texts, rather than new objects with their own affordances. what might digitizations do other than show us pages of text? they might show us text that is not there. the work done with the great parchment book exploits the potential of digitization to reshape the material object to our benefit. the great parchment book is a survey compiled in of all those estates in derry managed by the city of london through the irish society and the city of london livery companies. a fire in badly damaged the book, and the surviving leaves remained unavailable to researchers for over years.
through careful preservation, about % of the text was recovered, but the brittle, wrinkled parchment remained an intractable obstacle to further work. but a team at university college london’s centre for digital humanities was, after detailed digital imaging, able to virtually unwrinkle the pages. about percent of the text of the great parchment book is now readable and available for examination online as images of the leaves, enhanced images, or a transcription of the text. the archimedes palimpsest project has similarly disembodied a manuscript to make accessible text that would otherwise remain hidden, using multispectral imaging to recover two lost archimedes treatises and other ancient texts that had been written over in the thirteenth century. the project then released all of its data to the public and published the earlier state of the manuscript through google books, making available to read in digital form a text unreadable in its material manifestation.
digitization also offers the opportunity to take objects apart so that we can study their components. the bodleian’s broadside ballads online has not only been digitizing their large collection of sixteenth- through twentieth-century ballads, but has been experimenting with an image search tool that allows users to highlight an image (or a selection of an image) to search across the collection for other instances of its use. imagematch can, for example, trace the use of a woodcut image of a hat across multiple ballads; while tagging might allow one to search for “hats,” image searching allows one to look for a particular hat, even when the person depicted wearing it changes. rather than cutting out bits of a text, the folger shakespeare library’s impos i tor strives to turn bound books back into printed sheets.
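the arithmetic behind turning pages back into sheets is simple enough to sketch. in the common folio, quarto, and octavo schemes, the pages of one sheet divide between the outer forme (the side carrying the first page) and the inner forme. the code below is a minimal illustration of that division only; it is not the folger tool’s code, the function name formes is invented, and it deliberately says nothing about where each page sits on the sheet or which way up it is printed.

```python
def formes(leaves):
    """Split one sheet's pages into (outer forme, inner forme).

    leaves is the number of leaves the folded sheet yields:
    2 for a folio, 4 for a quarto, 8 for an octavo.
    """
    if leaves not in (2, 4, 8):
        raise ValueError("this sketch covers only folio, quarto, and octavo")
    outer, inner = [], []
    for page in range(1, 2 * leaves + 1):
        # Pages 1, 4, 5, 8, ... share the side of the sheet with page 1;
        # pages 2, 3, 6, 7, ... are printed on the other side.
        if page % 4 in (0, 1):
            outer.append(page)
        else:
            inner.append(page)
    return outer, inner


quarto_outer, quarto_inner = formes(4)
```

for a quarto this gives pages 1, 4, 5, and 8 on the outer forme and pages 2, 3, 6, and 7 on the inner, which is why the two pages of any opening in an unopened sheet always belong to the same side.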
using the images and metadata produced by the library as part of its digital image collection, michael poston created a tool that allows users to generate a facsimile of a printed sheet. you cannot disbind a book in order to rearrange its leaves into the format in which it would have been printed (unless the book has already been slated for conservation and the conservation team is willing to let you play with it), but digital pages can be rearranged in any order you like.

figure . an example of an impos[i]tor-generated quarto imposition for titus andronicus. screenshot by werner.

figure . the first image of the recovered archimedes palimpsest, as seen in its google book incarnation. screenshot by werner.

digital tools can help us see what is otherwise difficult to observe. reflectance transformation imaging (rti) has been used more often on archaeology and art objects than on textual objects, but it is a potentially rich tool for physical bibliography. developed at hewlett packard labs, rti uses multiple digital photographs shot from a stationary position with varying angles of light; through an interactive rti viewer, a user can manipulate the light source and qualities to create a detailed d imaging of an object’s surface. with the cuneiform tablets that were used in the first exploration of the technology’s potential, the relief in the rti images revealed more clearly than photographs could the features of the tablets. subsequent projects have used rti technology to explore japanese woodblock prints, book bindings, and illuminated manuscripts. taking our cue from the work that randall mcleod has done on the topographies of paper, looking at bearing type and other blind impressions, imagine what rti could do for the study of books.

if we are going to let our imaginations run wild with what digital tools might offer the study of material books and book history, there are other suggestive paths forward.
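The relighting idea behind RTI, described above, can be illustrated with a small sketch. It is not the method used by RTI software, which typically fits richer per-pixel models; as a labeled substitution, it uses the simpler Lambertian photometric-stereo model, in which a pixel's brightness is its albedo times the dot product of the surface normal and the light direction. The function names (`estimate_surface`, `relight`) and the sample values are hypothetical, chosen only for illustration.

```python
import math


def unit(v):
    """Scale a 3-vector to length 1."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("the three light directions must not be coplanar")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    return solution


def estimate_surface(lights, intensities):
    """Recover (albedo, unit normal) for one pixel from three photographs.

    Assumes a Lambertian surface and that the pixel is lit (not shadowed)
    in all three photographs.
    """
    g = solve3(lights, intensities)  # g = albedo * normal
    albedo = math.sqrt(sum(c * c for c in g))
    return albedo, [c / albedo for c in g]


def relight(albedo, normal, light):
    """Brightness of the pixel under a new, virtual light direction."""
    return max(0.0, albedo * sum(n * l for n, l in zip(normal, light)))


if __name__ == "__main__":
    # Hypothetical pixel: a slight tilt, as at the edge of a blind impression.
    true_normal = unit([0.3, 0.0, 1.0])
    lights = [unit([1.0, 0.0, 1.0]), unit([0.0, 1.0, 1.0]), unit([-1.0, -1.0, 1.0])]
    photos = [relight(0.8, true_normal, l) for l in lights]

    albedo, normal = estimate_surface(lights, photos)
    # Raking light from the left, a direction never actually photographed.
    print(relight(albedo, normal, unit([-1.0, 0.0, 0.2])))
```

The point of the sketch is the last line: once the surface has been estimated from a few fixed photographs, the viewer can compute the pixel under a raking light that was never physically set up, which is what makes faint relief legible.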
what might the distribution of dirt tell us about the usage of books? kathryn rudy uses densitometers to study medieval prayer books and identifies which pages were used the most often and how they were held; her research has also revealed some of the effects that cleaning treatments have had on the books’ appearances today. could smell tell us about something other than nostalgia for paper books over digital ones? scientists have been analyzing the smell of paper and suggesting the use of odor analysis as a diagnostic tool for conservation purposes in nineteenth- and twentieth-century works, but anecdotal evidence suggests that earlier books might have different smells depending on where their paper was sized. could sound help us understand books and textual scholarship? listen to wikipedia is a site that translates the edits made to wikipedia into sound, producing the sonic equivalent of visualization that could help us grasp the nuances of variorum histories of editing.

as we hope is clear from these examples, book historians can do a lot with the digital tools that are available to us. but if we want tools that reflect the full range of work that we do as book historians, studying the social, economic, and material circulation and creation of texts, we will need to engage with the development of these resources. even if we do not have the technical skills to create digital tools from scratch, we should understand them well enough to be able to recognize how these tools might shape our research and to participate in conversations with those who can build the tools we need.

iii

jonathan franzen’s freedom was published on tuesday, august , . many readers who had placed an advance order for the electronic edition of the novel woke that morning to find that the text had been wirelessly delivered to their kindle (whether their account or an actual kindle reader device) as they slept.
freedom was a widely anticipated book, even if not quite a publishing sensation on the order of, say, harry potter and the deathly hallows (see ted striphas’s masterful coverage of the potter franchise’s marketing techniques and retail procedures). it would seem uncontroversial to suggest that franzen, national book award winner and oprah enfant terrible, will be a subject of future inquiry by critics and historians of the novel and literary fiction. what will such persons wish to have available to them as prerequisites for scholarly inquiry? merely a good, clean copy of the text? even this might not prove completely unproblematic if one isn’t careful, since the uk harpercollins edition was subject to a recall (some , copies) when it was found to contain errors from an uncorrected proof of the text published by mistake. but of course many will want much more than just a clean reading text. depending on one’s interests, one might well want as many editions and printings and translations as one can lay hands on, including an exemplar of the corrupt uk release (amazon lists no fewer than formats and editions).

the kindle release is just one of these, yet it presents a reader with a number of unique features. one can access the popular highlights function to see passages that other readers have singled out as significant. for example, we can know that , other readers have taken note of the fact that “she knew that you could love somebody more than anything and still not love the person all that much, if you were busy with other things.” the amazon kindle edition also includes “extras” like a plot summary, lists of characters and important places in the book, memorable quotes, errata, and recommendations for other books a reader might like if they like this one (this content is all drawn from something called “shelfari,” an “editable book encyclopedia”). the plot synopsis includes an option to toggle spoilers on and off.
clearly a future student of franzen’s freedom would have some cause to wish to access this electronic incarnation along with a printed text, even if we assume the textual content, what mcgann once termed the linguistic codes, to be the same. but of course, there’s more. in amazon’s online listing for the book, we find, as of this writing, , customer reviews, many of them in turn rated and commented on by other members of the amazon community. we can see that the book’s current sales rank, again as of this writing, , , though it was in amazon’s top at the time of its publication. we can purchase the audio book, read by one david ledoux; there is an exclusive fourteen-minute interview with franzen for the amazon omnivoracious podcast; there is a discussion forum, with active threads. we can “look inside the book,” and, more intriguingly, perform, within the limits of fair use on copyrighted material, keyword searches to call up specific passages.

figure . page from franzen’s freedom as displayed in kindle for ipad version . on an ipad running ios . . in a leather carrying sleeve. the “popular highlights” tab is open, with a passage selected by some other readers marked with underlining at the top of the screen. font has been enlarged to suit the owner’s preferences. photo by kirschenbaum.

and then there are the other obligatory ports of call: franzen’s official page at his publisher, farrar, straus and giroux; the oprah book club site, a universe all its own with spiraling nebulae of supplemental material and vast galaxies of discussion forums; dozens and dozens of videos on youtube, capturing franzen at readings, in interviews, even on the street. unlike, say, margaret atwood or alice walker or william gibson, franzen himself, a strident social media refusenik, does not blog or tweet, though there have been several franzen fakes on twitter.
pirated copies of freedom, meanwhile, were reported on the usual torrent sites no later than september . and all of this thus far relates only to the book’s publication and reception. we have not yet said anything about the novel’s composition, its editing, or production. franzen himself, according to time magazine, writes with a “heavy, obsolete dell laptop from which he has scoured any trace of hearts and solitaire, down to the level of the operating system.” where are the digital manuscripts? will franzen allow them to be accessioned by whatever institution eventually acquires his literary papers? will the documents contain track changes and other algorithmically encoded versions and variants? what would forensic computing tell us about the expurgated fragments of files on the original hard disk? and what of the digital prepress materials at farrar, straus and giroux? franzen’s email correspondence with agents, editors, publicists, friends, and confidantes?

what does it mean, then, to study histories of authorship, publishing, and reading right now? what will future scholars have to account for as different with respect to today’s books, even a mainstream piece of literary fiction, when it is released into the kind of networked media environment that characterizes our most mundane daily interactions, whether paying a bill or checking the forecast? what are the material realities of book-writing, bookmaking, and bookselling in the present moment? that is the question to which we turn in this latter part of our state of the field essay. for in , book history shades ineluctably into media history. some might see a hopeless schism, or better, a punctuation mark for book studies, the point at which the book as physical object is subsumed by a much vaster media spectrum where it is at best a derivative object in a system of digitized production and vertically integrated transmedia content.
yet over the last ten years or so there has been a marked “material turn” in digital studies that, we will insist, more or less aligns with the material turn that brought about the study of books as historically situated and socially manufactured artifacts. a variety of scholars, theorists, and media arts practitioners now recognize that computers—by which we mean not only the tangible hardware, but also software and even the very algorithmic processes of computation—are material phenomena. how we move from the seemingly counterintuitive assertion that code, bits, symbolic logic, and signal processing are in fact “material” has been the decisive maneuver in digital studies, largely defining the state of the field as it is conducted today. there is thus a marked contrast between current scholarship in digital studies and the early enthusiasms we limned at the beginning of this essay. moreover, given the media ecology surveyed in our brief discussion of freedom, the convergence between the materially-minded pursuits of book history and the agendas of contemporary digital studies opens the way for sophisticated studies of contemporary reading, writing, and publishing that are grounded in the individual circumstances of authoring technologies like word processing and beyond, as well as the bookseller’s marketplace, networks for electronic dissemination, and readerly histories that spill across the whole of the web . landscape.

the materialist turn in digital studies is not a unified or prescribed movement, and practically speaking it has coalesced through several different sub-fields that we will survey below.
there are, however, some broadly shared assumptions: the materialist turn assumes that computers and computational processes are material in nature, and thus subject to documentary and historical forms of understanding; it is technically rigorous and acknowledges the material particulars of media and computation as worthy of critical investigation; it understands the particular constraints of software, code, and platform as generative for studying the processes and products of digital culture; it cultivates and actively seeks to refine an archival record for digital culture; and it understands the activity of archiving itself in new and capacious ways that include such techniques as crowd-sourcing, hacktivism, restoration and retro-computing, and citizen archivists.

of course none of the above are concerns or ideas that have manifested exclusively in just the last ten years, and certainly not only in the primarily anglo-american contexts we will look at below. harold innis, whose key books on the materialities of communication have been overshadowed by toronto colleague marshall mcluhan’s fame and following, laid the groundwork for such an agenda in the years immediately following the second world war (see especially empire and communications [ ] and the bias of communication [ ]). in germany, meanwhile, friedrich kittler rejected the overtures of post-structuralism in favor of the dubious allure of a soldering iron and machine code, fabricating a techno-hardcore media historiography that displaced human agency from the central circuits of the culture machine, paving the way for the media archaeology movement that we will discuss in some detail.
kittler, of course, is routinely taken to task for playing fast and loose with his historical accuracies, but his impact is undeniable; as geoffrey winthrop-young writes in his book-length introduction to kittler’s legacy, “the battle cry ‘media determine our situation’ is reduced to the tacit agreement that scholars should pay some attention to media formats after having paid none at all for decades.” kittler is also surely the most brutally minimalistic of all the techno-materialist thinkers, arguing that in the end “there is no software” because all digital phenomena “come down to absolutely local string manipulations and that is, i am afraid, to signifiers of voltage differences.” (the canonical introduction to kittler remains his gramophone, film, typewriter, especially the opening chapter wherein he presents the thesis about the nineteenth century’s liberation of media from the symbolic constrictions of exclusively alphabetic forms, but the essays collected in literature, media, information systems are likewise very approachable [those in optical media somewhat less so]; both are recommended before attempting the gesamtkunstwerk, discourse networks / .) finally, nancy ann roth’s recent translations of the czech vilém flusser’s work for the university of minnesota press (into the universe of technical images and does writing have a future? [originally published in german in and , respectively]) have helped restore to our attention a theorist who, as the late mark poster puts it in his introduction to both volumes, “stands out, with only a handful of others, as one who presciently and insightfully deciphered the codes of materiality disseminated under the apparatus of media” (xi).
innis, kittler, and flusser have each produced work that is broadly relevant to students of all media forms, wherein the inscription and transmission of the written word and specifically the materialities of print and literature are channeled through a wider media spectrum. together with figures such as benjamin and mcluhan, they offer a foundation for an approach to book history in the current threshold moment of the digital, even as more recent thinkers have challenged, revised, and extended their positions. to these we now turn.

over a decade old, the new media reader edited by noah wardrip-fruin and nick montfort and released in by the mit press is an appropriate milestone to demarcate the onset of what we have characterized as the material turn in digital studies. the nmr was very much intended as an intervention when published, bringing together artists, humanists, and technologists from the second half of the twentieth century, pointedly ending with tim berners-lee’s paper about the world wide web. the “new media” between its covers (and on the accompanying cd) thus arrived already overtly historicized, the very heft of the hardbound volume a reminder of the fact that conversations about computers, writing, art, and interactive design had been underway for decades prior to the advent of today’s desktop browser. the historical documents range from turing, bush, licklider, and wiener to burroughs, roy ascott, brenda laurel, and lynn hershman; of particular interest to students of book history will be pieces such as ted nelson’s “proposal for a universal electronic publishing system and archive” (from ’s literary machines) and robert coover’s much-cited “end of books” new york times book review essay, as well as a compendium of oulipo writings including a complete do-it-yourself cut-up implementation of raymond queneau’s cent mille milliards de poèmes.
the volume deploys a deliberate contrapuntal strategy, juxtaposing, say, bush’s memex with borges’s “garden of forking paths.” while a useful compendium for researchers and a compelling choice for classroom instruction, the nmr also helped inaugurate a new historically-aware phase of digital studies, one in which the presentism that afflicted so much of the field in its earlier incarnations—or else the crude historicism whose fulcrum was ceci tuera cela—is filled in by documentation of the decades of dense aesthetic and scientific conversation on the very borders of the screens, pages, windows, and frames that limn the contours of our contemporary media landscape.

two other figures whose careers have been heavily identified with aspects of book history and textual scholarship deserve particular mention at this point. johanna drucker, whose pathbreaking scholarship on the radical typographic experiments of the modernist avant garde will be known to many readers here, as will her steady output of artist’s books, began speaking and writing overtly about digital media in the s. many of her early statements about digital media are collected in figuring the word: essays on books, writing, and visual poetics, a granary press volume; her formal identification with “digital humanities” as it manifests today can be seen in digital_humanities (mit, ), co-authored with anne burdick, peter lunenfeld, todd presner, and jeffrey schnapp. her university of virginia colleague jerome mcgann, meanwhile, had been integrating ideas from early hypertext theory into his thinking and writing about critical textual editing since the s; by the s, the rossetti archive project was well underway, and it furnished a continual source for theoretical reflection and provocation.
these essays of mcgann’s are collected in ’s radiant textuality: literature after the world wide web (palgrave), while his more recent thought on matters digital can be found in a new republic of letters (harvard, ). what drucker and mcgann each offered in their own way were models of figures whose deep engagements in the materialities of books and printed matter served to shape and refine their thinking about electronic textual forms, rather than positioning them in reductive opposition as was the case for such bibliophiles as sven birkerts. both of them introduced perspectives from the material study of textuality to audiences otherwise engaged with electronic technologies, who then found occasion to bring such perspectives to bear on digital objects and artifacts.

one such point of influence was n. katherine hayles. though hayles’s intellectual trajectory was already well established, marked out by her training and scholarship in the history of science, her short writing machines from mit press (which featured a collaboration with graphic designer anne burdick) carried concepts from textual materiality directly to readings of works that included both threshold codex productions like mark danielewski’s house of leaves, as well as talan memmott’s online hypertext lexia to perplexia. here hayles introduces the term “media-specific analysis,” and enjoins her readers to no longer “treat text on the screen as if it were print read in a vertical position. electronic text has its own specificities, and a deep understanding of them would bring into view by contrast the specificities of print, which could again be seen for what it was, a medium, and not a transparent interface.” similarly, matthew kirschenbaum’s mechanisms: new media and the forensic imagination (mit press) explicitly brings together perspectives from textual scholarship and book history, as well as the technical field of computer forensics. this approach produces new readings of landmark digital work such as william gibson’s “agrippa” and michael joyce’s afternoon, and demonstrates the extent to which computer forensics—which locates, recovers, and authenticates digital evidence to a degree admissible in legal settings—offers the specific methodological bridge between new forms of electronic writing and the traditional concerns of bibliographers and textual scholars. arguing that the previous generation of writing about electronic textuality had been governed by a “medial ideology” in which tropes such as light, lightning, speed, and ephemerality predominated, kirschenbaum insisted instead on the forensically replete realities of inscription for devices such as hard drives to argue that a “computer,” like a “book,” is in fact an individuated artifact, always subject to deep historical forms of understanding.

what has become one of the most important routes to such understandings emerged almost in passing, as a throwaway, in the course of an enormously influential book in its own right, lev manovich’s the language of new media (mit press, ). manovich’s work here is justly celebrated as perhaps the most comprehensive formal framework of digital media and objects to date; it has been influential for its linkages between digital media and cinema, as well as a provocative (if much contested) “opposition” between database and narrative as cultural organizing principles. but early in the pages of the book manovich presents us with a vignette. he is seeking to distance himself from what he perceives as the vulgar futurism, as well as the lack of interest in the messy details of actual software and computer programs, of previous academic commentators on the digital technologies emerging all around us. at stake are not just better theories, but also the actual history of digital culture and its myriad non-virtual realities.
“where,” he asks, “were the theoreticians at the moment when the icons and the buttons of multimedia interfaces were like wet paint on a just-completed painting, before they became universal conventions and thus slipped into invisibility?” he then poses the question even more insistently, evoking the hypothetical but eminently plausible “historical moment”

when a young -something programmer at netscape took the chewing gum out of his mouth, sipped warm coke out of the can—he was at a computer for hours straight, trying to meet a marketing deadline—and, finally satisfied with its small file size, saved a short animation of stars moving across the night sky? this animation was to appear in the upper right corner of netscape navigator, thus becoming the most widely seen moving image sequence ever until the next release of the software.

this is an enormously captivating and compelling gesture, dramatizing as it does the distance from the so-called “theoreticians” of first-generation digital studies to the specific, localized, embodied, and ineluctably materialist concerns manovich wishes to foreground. he called the research agenda he was then proposing “software studies,” and although its uptake was not immediate, software studies has now emerged as a recognized sub-field of digital studies, complete with a dedicated book series from the mit press. the affinities with book history should at this point require no great elaboration on our part: manovich and those who followed him into software studies are interested in specific software packages, their conceptualization, design, engineering, implementation, and their use and circulation within particular communities. matthew fuller, editor of software studies: a lexicon (mit press, ) puts it this way in his introduction: “while applied computer science and related disciplines . . .
have now accreted half a century of work on this domain, software is often a blind spot in the wider, broadly cultural theorization and study of computational and networked digital media . . . . software is seen as a tool, something you do something with. it is neutral, grey, or optimistically blue.” (fuller’s important early work, meanwhile, is collected in beyond the blip: essays on the culture of software [autonomedia, ].) software studies thus emerges as a framework for historicizing software and dislodging it from the purely instrumental sphere. one could imagine useful convergences of software studies and book history around applications such as wordstar or microsoft word, or aldus pagemaker, to name just some of the most obvious.

yet software is not always an intuitive artifact, which surely helps account for the kind of blind spots fuller notes. is software the user interface that most of us see and experience, or is it the lines of source code? is it the application or the complete operating environment? what about documentation, packaging, and other kinds of ancillary material? closely related to the ambitions of software studies then is the practical challenge of software preservation—how will researchers actually access historically important software packages in decades to come? how is the history of software being preserved? some software history is contained in the corporate archives of entities like microsoft and adobe, and researchers will need to become proactive about seeking access to these typically cloistered settings.
but much is also now available on the open web, for example the efforts of the internet archive, whose recently launched historical software collection offers users the ability to interact with emulations of key software programs natively in their browser; likewise, large amounts of documentation are readily available through grassroots computer history efforts such as bitsavers. finally, oral history interviews with living key technological innovators can be extremely valuable, as belinda barnet demonstrates in her memory machines: the evolution of hypertext (anthem, ). though she does not use the term, barnet’s software studies approach makes her work very different from first generation treatments of hypertext theory. manovich’s most recent book, software takes command (bloomsbury academic, ), takes as its centerpiece an extended history and “reading” of adobe after effects, the industry standard for creating moving image animations.

critical code studies is a related movement which focuses not so much on software as an application or artifact but on the literal code of the application itself. if software studies is akin to the study of paper or bindings or typography, critical code studies asks us to reckon with the underlying processes of computation, much as we would seek to understand the interaction between, say, collation and imposition in the hand-press period. while often regarded as the sole province of programmers and other specialists, the reality is that all “computer code” as we typically know it is really only ever human readable; it only becomes legible (which is to say actionable, or operationalized) by the machine once it has undergone a process known as compiling, which takes so-called “high-level” languages like java or basic or fortran and converts them to the binary ones and zeroes that furnish a computer’s operating instructions.
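The gap between human-readable source and the form a machine acts on can be seen in miniature with Python's built-in `compile` function and its standard `dis` module. This is only an analogy to the compiling described above, under a stated assumption: Python translates source into bytecode for its own virtual machine rather than into a processor's native binary instructions, and the one-line program used here is invented for illustration.

```python
import dis

# A line of source code, written by and for a human reader.
source = "total = sum(n * n for n in range(4))"

# Translation step: the text becomes a code object the interpreter can run.
code = compile(source, "<essay-example>", "exec")

# The machine-facing form is a sequence of raw bytes, not words.
raw = code.co_code

# Only the translated form is executed; the result lands in this namespace.
namespace = {}
exec(code, namespace)

if __name__ == "__main__":
    print("source text :", source)
    print("bytecode    :", raw.hex())
    print("result      :", namespace["total"])
    # A disassembly is a third, intermediate reading of the same program.
    dis.dis(code)
```

The variable name `total` and the choice of a generator expression survive only in the source text and in metadata attached to the code object; the byte sequence itself carries no trace of why the author wrote the line that way, which is one reason close reading tends to begin from the source.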
critical code studies thus foregrounds software and computer programs as semantically replete fields of interpretation, written by and for human beings (nor is this strictly a humanistic conceit: donald knuth, perhaps the most famous living computer scientist, espouses the same principles through what he terms literate programming). critical code scholars are given to close readings of individual lines of computer code, looking for the expressive dimension of such elements as the names given to variables or the choice of conditional structures used to govern the actions of the program; however they also locate agency at the level of the process the code enacts, the specific computational behaviors set in motion by the source code. noah wardrip-fruin, in his book expressive processing: digital fictions, computer games, and software studies (mit press, ) writes eloquently of a code literacy as not only a scholarly virtue but a civic necessity, from the algorithmically ranked results of “everyday google searches to the high stakes of diebold voting machines”; the book itself offers close, “procedural” readings of the semantics and structure of individual software programs, including the tale-spin story generator, simcity, and eliza. in yet another example, dennis jerz recovered the original fortran source code for the foundational interactive story-game adventure, and offers a detailed “reading” of its particulars and their implications for our understanding of the composition of the game in a model of both critical code and software studies. but perhaps the most extreme, and tantalizing, example of the potential of the critical code approach is a book cryptically entitled print chr$( . +rnd( )); : goto (mit press ).
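For readers who have not seen the program behind that title run, the line is usually given in full as `10 PRINT CHR$(205.5+RND(1)); : GOTO 10`. On the Commodore 64 it prints, forever, one of two diagonal-line graphics characters chosen at random, and the accumulated diagonals read as a maze. The sketch below is a rough Python approximation rather than a faithful port: the function name `ten_print` is hypothetical, the output is bounded instead of looping endlessly, and the ASCII slash and backslash stand in for the two diagonal characters.

```python
import random


def ten_print(width, height, seed=None):
    """Return `height` rows of `width` randomly chosen diagonal characters."""
    rng = random.Random(seed)  # a seed makes a particular maze repeatable
    rows = []
    for _ in range(height):
        # The original picks character 205 or 206 with equal odds;
        # here "/" and "\" play those two roles.
        rows.append("".join(rng.choice("/\\") for _ in range(width)))
    return "\n".join(rows)


if __name__ == "__main__":
    # 40 columns matches the Commodore 64's text screen width.
    print(ten_print(40, 12))
```

Even this small translation raises the kinds of questions a close reading dwells on: what is lost when an endless loop becomes a bounded one, or when a machine's own character set is swapped for ASCII.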
the book, which was jointly authored by a collective of some dozen members, takes that single eponymous line of source code for the commodore (which drew a randomly determined maze pattern on the screen) as the basis for an exploration of s home computer culture that ranges from discussions of the labyrinth as a cultural form to the nature of computational randomness to the means of dissemination for early computer programs, which often included (for example) print magazines, from which a reader would transcribe and retype them into his or her own system. print thus uses a line of code as the proverbial grain of sand (silicon) within which to see a world; it is a remarkable example of the cultural richness and repleteness of a supposedly purely operational expression.

closely aligned with both software studies and critical code studies (involving many of the same individual scholars) is the “platform studies” movement, which is most heavily associated with nick montfort and ian bogost, who edit another mit press book series devoted to the topic and published its inaugural volume, racing the beam: the atari video computer system ( ). textbooks and tutorials often explain the fundamentals of modern computing to newcomers by employing the metaphor (and visual imagery) of stacks and towers, working from hardware and machine code up through levels of abstraction including assembly code, high-level program languages, and finally end-user applications and what we see on the screen. much of the scholarship comprising the material turn in digital studies has tended to hew, sometimes quite explicitly, to this same model (which, it is worth noting, is itself a historical construct, an artifact of the von neumann architecture for computer systems). “platform,” montfort and bogost tell us, “is the abstraction level beneath code, a level which has not yet been systematically studied.
if code studies are new media’s analogue to software engineering and computer programming, platform studies are the humanistic parallel of computing systems and computer architecture, connecting the fundamentals of new media work to the cultures in which they were produced and the cultures in which coding, forms, interfaces, and eventual use are layered upon them.” the title of their atari book, racing the beam, in fact refers to the beam of the cathode ray gun that would “paint” the game’s graphics on a television display in a continuously scanning horizontal pattern that programmers of the system’s cartridges not only had to compensate for but sometimes took advantage of to overcome the inherent limitations in memory and processing power also characteristic of the system. this close dialectic between the technical particulars of the platform, sometimes articulated at very high levels of detail, and their implications for the kind of creative and imaginative work performed on those systems is characteristic of platform studies, which has also seen books covering the nintendo wii (codename revolution, by steven e. jones and george k. thiruvathukal [ ]) and the commodore amiga (the future was here, by jimmy maher [ ]). though “platform” is perhaps most conveniently associated with physical computing hardware (as the preceding examples suggest), montfort and bogost are quick to point out that platforms can be virtualized as well: for example, a forthcoming book in the series addresses the web’s once ubiquitous flash technology as a “platform.” here too then we can see the explicit parallels to book history: what would it mean to think of the kindle as a platform, for example, to critically examine the constraints and affordances of the device (both its physical incarnation as well as its architecture and protocols)?
platform studies, like the history of the book, is characterized by close, some might even say obsessive or unseemly, attention to detail out of the fundamental conviction that such material particulars are ineluctably part of the history of communicative objects, artifacts, and our human interactions with them. the distinctions between software studies, critical code studies, and platform studies can sometimes be opaque intellectual terrain for the uninitiated, not only because of the technical connotations of such terms but also because the boundaries between them, in terms of people, publishers, and intellectual approach, can seem rather permeable. indeed, the print volume discussed above in relation to critical code studies was published as part of the mit press’s software studies series. other works have also blended the three approaches to create generative readings of electronic media. terry harpold’s ex-foliations: reading machines and the upgrade path (university of minnesota press, ) is meticulous in its documentation of specific platforms and software versions for the creative electronic literature it takes as the focus of its discussion, including (again) joyce’s afternoon. likewise, christopher funkhouser’s prehistoric digital poetry: an archeology of forms, – (university of alabama press, ) is a deeply researched volume based on the archival recovery of primary source documentation for the period under discussion, in this regard treating “digital poetry” no differently from other literary phenomena where such explicit period demarcation is commonplace. and steven e. jones’s the meaning of video games: gaming and textual strategies (routledge, ) applies a textual and software studies approach to the study of computer games as material artifacts.
perhaps the single most illustrative and effective example of the relevance of all three approaches to book history comes in the form of alan galey’s book history essay “the enkindling reciter: e-books in the bibliographical imagination.” this essay, which won the fredson bowers prize, is a tour de force in its demonstration of both the new materialist sensibility and new bibliographic (and forensic) techniques in the investigation and evaluation of digital book objects. readers will recall that we have already encountered galey’s innovative interface designs in the visualizing variation project; here he solves an actual bibliographical problem (several, in fact) in the electronic presentation of the text of johanna skibsrud’s the sentimentalist, winner of the canadian scotiabank giller prize. since readers of the present essay will have ample access to galey’s text we will not rehearse its particulars in detail, but instead remind our readers of galey’s closing contentions around “stripping the veils of code,” offered in conscious revision not only of bowers but also kirschenbaum’s earlier work on digital forensics: one of the consequences that the bibliographical study of e-books forces upon us is the need to rethink traditional bibliography’s basis in empiricism. to reverse the terms of the errant william james epigraph, the different forms of e-books may have no rocky bottom, no absolute real that serves to anchor the evidence of our senses. the reason is simple: e-books, like all digital texts, require us to interpret phenomena not directly observable by the senses. we must rely on layers upon layers of digital tools and interfaces, as we have seen in the examples above. a purely empirical and forensic perspective assumes that objects speak for themselves, and yield up their evidence to the observation of human senses and the inquiry of human reason.
my purpose in drawing attention to the role of the enkindling reciter is to emphasize that digital objects do not speak for themselves; someone always speaks for them. ultimately software studies, critical code studies, and platform studies are each varyingly inflected methodologies for cultivating both the critical sensibility and the technical acumen necessary to swim deep into the cultural reservoirs of contemporary digital production, if not quite touch that final rocky bottom. we would advise our readers to attend to the commonalities between them rather than succumbing to the parsing of their differences. there is one other articulated movement with direct bearing on the material turn, developing not primarily in north american but rather anglo-european settings. media archaeology is a term which originates in cinema studies with the work of c.w. ceram, but which has more recently expanded to offer coverage to the full spectrum of media phenomena, including, of course, the products and productions of the digital age. kittler, discussed earlier, is often regarded as a prototypical media archaeologist for his assignment of radical agency to non-human actors and technologies, though he himself would have disavowed the label. media archaeology’s most influential figures have nonetheless tended to emerge from the continental intellectual scene, though the movement’s most prominent english-language organizer and advocate, jussi parikka, is a finn working in the british university system. parikka’s what is media archaeology? (polity, ) and a collection co-edited with erkki huhtamo, media archaeology: approaches, applications, and implications (california, ), will be the best starting points for most of our readers. in north america, meanwhile, media archaeology has been increasingly absorbed into the academic conversation around digital technologies, with the ground having already been prepared by the thinkers and trends discussed above.
as parikka himself acknowledges, there is a general compatibility between the methods and concerns of software studies, critical code studies, platform studies, computer forensics, and media archaeology. broadly speaking then, media archaeology is characterized by an intense fixation on the technological operations of media. its historiography generally hews to foucauldian genealogies of “disruption” and discontinuity. siegfried zielinski’s deep time of the media: toward an archaeology of hearing and seeing by technical means (mit press, ) and the work collected in his series of edited variantology volumes (verlag der buchhandlung walther könig, –), as well as erkki huhtamo’s illusions in motion: media archaeology of the moving panorama and related spectacles (mit press, ) are representative in this regard. though invested in the recovery of neglected, forgotten, crashed, erased, and overwritten media devices in order to question and reframe established narratives of media history, media archaeology is also, in the eyes of at least some practitioners, about a radical revisionary historiographical practice in which machines assume primacy of agency in the recording and narration of cultural events. wolfgang ernst, who uses the term “archaeography” to describe this process, by which the “archive” writes itself through varied modes of technical inscription (many of them forms of signal processing occurring at sub-semantic levels), is the key figure here: his writings are collected in english in digital memory and the archive, edited by parikka (minnesota, ). there is a practical component as well, in that hardware conservation and preservation are important facets of media archaeology, the skills and expertise necessary to restore vintage computers and other technologies to working condition.
(ernst maintains such a facility, the “media archaeology fundus”; lori emerson’s work with her media archaeology lab at the university of colorado boulder, which maintains dozens of vintage computers in working order, is likewise exemplary here.) finally, as the title of ernst’s collection above suggests, “the archive” has emerged as a site of intense interest for media archaeological investigation, not only for the practicalities in preserving access to its technological apparatus but also because the very conceptualization and theorization of archives has direct implications for our articulation of media history. wendy hui kyong chun thus lays great stress on the technological as well as the bio-informatic origins of archive and memory in her programmed visions: software and memory (mit, ), a work that has been generally embraced by media archaeological writing. for book history, media archaeology offers a framework for media investigation which tends to have an even longer historical reach than the primarily north american movements described above; media archaeological investigations routinely extend back into the nineteenth century and beyond, grounding themselves in “prehistoric” (recall funkhouser) manifestations of cinema and the moving image, photography, and recorded sound. there is also a conspicuous strain of media archaeology that takes as its primary locus documents, records, and writing technologies such as the typewriter and telegraph, as well as “soft” technologies such as shorthand. lisa gitelman’s work is exemplary and indispensable here, and though she has never overtly declared herself a “media archaeologist” she has both influenced and been influenced by the movement. her scripts, grooves, and writing machines (stanford, ) offers a more historically attentive narrative of nineteenth century inscriptive economies than kittler, and an essay collection co-edited with geoffrey b.
pingree, new media – (mit, ), consolidates and amplifies the import of this period as a long antecedent to the media landscape of today. more recently, her always already new: media, history, and the data of culture (mit, ) offered perhaps the first serious attempt to genuinely historicize the web, including explicit attention to the twin concepts of records and documents in electronic (and aural) culture; her newest monograph, paper knowledge: toward a media history of documents (duke, ), supplies readings of documentary technologies from nineteenth-century job printing to the ubiquitous pdf format of our own time. a scholar such as gitelman thus foregrounds the linear conceptual path from book history to media history and back again, with the additional virtue of a long historical perspective that understands the screens and devices of the present as descendants of earlier technological dispensations. much the same could be said of darren wershler and the iron whim: a fragmented history of typewriting (cornell, ), which offers archaeologies of book-writing and media alike. similarly, ben kafka, in the demon of writing: powers and failures of paperwork (mit, ) explores paper as a medium, even as he develops a media archaeological account of bureaucracy and office work. cornelia vismann, in files: law and media technology (stanford, ; translated by geoffrey winthrop-young, also kittler’s chief translator and explicator) brings similar attention and stress to the construct of the “file” in legalistic and documentary contexts. jean-françois blanchette’s burdens of proof: cryptographic culture and evidence law in the age of electronic documents (mit, ) draws together (again) legal discourse with forensic technologies and a consideration of the longstanding problems of diplomatics, namely document authentication and documentary authority, in relation to the particular problematics of born-digital documents.
lori emerson’s forthcoming reading/writing/interfaces (minnesota, ) employs media archaeological precepts to consider the physical substrates of experimental poetry and poetics, with authors ranging from dickinson to contemporary canadian writers such as steve mccaffery and bpnichol. finally, jonathan sterne’s mp : the meaning of a format (duke, ), might appear at best oblique to the interests of book history, that is, until one remembers to consider the place of digital audio books amongst today’s reading public.

iv

media archaeology, together with software studies, critical code studies, and platform studies, gives us a route into the vexed, recursive layers of today’s textual landscape that is broadly compatible with the sensibilities and intellectual agendas of today’s scholarship in book history. more than that, however, all of these movements or trends offer the opportunity to reconsider the book as the locus of critical attention. books, after all, have always been a narrow and particular subset of humankind’s written endeavors and activities. what is the nature of the relationship between books and documents, or books and records, or books and paper or other forms of media and material supports? such questions, we maintain, are not mere theoretical prompts, but essential prerequisites for responsible scholarship of books as they are written and read today; for despite some important contributions, book history by itself does not yet have a critical mass of scholarship with which to answer that challenge. works such as jason epstein’s book business: publishing past, present and future (norton, ) are invaluable as memoirs but they lack the necessary critical and theoretical framework for working through questions such as we have raised. david m.
levy’s scrolling forward: making sense of documents in the digital age (arcade, , now sadly out of print) offered a starting place for a materialist reconsideration of the status of texts as embodied documents amid the shifting landscape of digitization. bonnie mak’s concise how the page matters (toronto, ) historicizes the seemingly homogenous “page” as both a conceptual and a material unit in manuscript, print, and digital culture through a case study of what randall mcleod might have called the “transformissions” of one particular text. andrew piper’s book was there: reading in electronic times (chicago, ) is a focused, sometimes personal attempt to historicize today’s questions about the significance of reading (specifically) books, informed but not burdened by piper’s training in critical theory; it is usefully considered with both jeff gomez’s print is dead: books in our digital age (palgrave, ) and alan jacobs’s the pleasures of reading in an age of distraction (oxford, ). the best critical overview of book publishing in the present moment is undoubtedly john b. thompson’s merchants of culture: the publishing business in the st century (plume, ). of equal relevance is ted striphas’s aforementioned the late age of print: everyday book culture from consumerism to control (columbia, ) which grounds its analysis not in futurisms but rather in research and critical analysis of the status of the book in the present, from big-box bookselling and electronic distribution systems to reading clubs, and (yes) the harry potter phenomenon, among other topics; striphas, whose intellectual pedigree is more cultural studies and materialist marxism than the digital studies authors we have been discussing, nevertheless offers an example of a project that understands that the distinction between book history and media history is now literally and purely and finally only academic.
the recent collection comparative textual media: transforming the humanities in the postprint era, edited by n. katherine hayles and jessica pressman (minnesota, ) as well as the new cambridge companion to textual scholarship, edited by neil fraistat and julia flanders (cambridge, ), likewise eschew unproductive distinctions between these fields. let us now offer some additional examples to demonstrate the potential for scholarly inquiry bridging the various approaches to media history and theory we have been surveying, and the history of the book, broadly conceived. we can begin with the fact that computer history has spawned any number of compelling book objects that ought to be of interest to book history. there are surely projects for those who wish to explore the publication histories of newsletters and ’zines like that of the homebrew computer club (whose archives are at stanford), or mondo ; similarly, landmark publications such as ted nelson’s computer lib/dream machines and the whole earth catalog are fascinating book objects, filled with complex assemblages of visual and verbal material. the massive popular interest in personal computing and video games that had taken hold by the early s (in , time magazine anointed the computer its “machine” of the year) spawned hundreds of mass-market trade publications, including introductions, tutorials, how-tos, and guidebooks (see fig. ). computer magazines such as byte and pc magazine and macworld also offer key documentation from this period. in short, the reality is that much significant computer history has been written and rendered in print; this is a vast and largely unexplored space. book history has the potential to bring much-needed nuance to tired, reductive binaries around the paragone between print and the digital. abigail j. sellen and richard h. r.
harper have argued compellingly that far from eliminating it, digital technologies (such as word processing and the laser printer) greatly exacerbated the consumption of paper in office settings. it is tempting to explore similar dynamics in other milieus: for example, the avant garde literary arts journal between c&d, a “little magazine” which began publication in new york’s east village in and was printed and distributed on fanfold paper from a dot matrix printer and came packaged in a ziplock bag. (authors included kathy acker, dennis cooper, gary indiana, patrick mcgrath, and lynne tillman.) as editors joel rose and catherine texier recall: “the combination of our high-tech look (the computer printout, the fanfold, the dot-matrix print type) in conjunction with handmade art by east village (or downtown) artists on the front and back covers, and the ziplock plastic bag binding, along with, needless to say, the featured ‘new writing’ immediately attracted both readers and writers, from new york city and elsewhere.” such episodes dovetail, at least anecdotally, with the kinds of arguments scholars such as harold love and peter stallybrass have long made about the persistence of scribal and manuscript writing into cultures of printing, whereby the cross-transfer between two active media spheres (in this case print and the digital) results in the proliferation, rather than the diminishment, of prior forms of media and inscription.

figure . a selection of s trade books written as guides to home computers and video games, whose diverse and notable authors include martin amis, michael crichton, newt gingrich, frank herbert, hugh kenner, and jerry pournelle; also a pop-up book entitled inside the personal computer. photo by kirschenbaum.

one notable exemplar from this period is robert pinsky, steve hales, and william mataga’s mindwheel, published by synapse/brøderbund in .
mindwheel is a self-described “electronic novel.” while our readers will know who pinsky is of course, it is unlikely they will recognize either of the other names. that is because they are computer programmers who share the authorship credits with pinsky. mindwheel is in fact a hybrid book/digital artifact. it consists of a ninety-page clothbound volume (packaged in a paper slipcase) containing prose materials (credited separately to one richard sanford), artwork, verse, photos, and faux-interviews, journal entries, and other documentary materials; also included is a sleeve containing a . - or . -inch computer disk, available for macintosh, the apple ii, and the pc. one engaged mindwheel by beginning with the thirty or so pages of sanford’s prose in the printed volume, at which point the transition to the interactive content was effected by way of a dream sequence; practically speaking, the reader would set the book aside, boot the disk, and find him- or herself in a “text adventure” style environment where they would read prose descriptions of their current situation and type their intended actions, to be interpreted by the program’s parser which would advance the action of the story accordingly. this in itself was no great novelty at the time, and in fact text adventures constituted an important segment of the computer gaming market for home computers (infocom set the standard with dozens of such titles). what set mindwheel apart, however, was both the dual book/disk combination (though it should be noted the packaging of infocom games routinely included printed paraphernalia such as maps, letters, and photos), and the engagement of pinsky as a “significant” literary talent.
(in a feature which conspicuously leverages the material affordances of print versus disk media as an anti-piracy safeguard, the participant in the interactive portion of the text must enter a “password” which is discovered by referencing a particular page in the accompanying volume.) clearly the conceit of an “electronic novel” as a combined print and interactive experience was envisioned as a paradigm for future publishing; the phrase was apparently even trademarked. alas, it was not to be: synapse went under shortly thereafter. pinsky himself today still seems to recall his contributions to the project fondly, and talks freely about it in interviews. we rehearse this history not only because of mindwheel’s import in its own right, though it does bear genuine significance both for the involvement of a future poet laureate of the united states and as an early exemplar of a hybrid publishing model which would be often repeated in the avant garde world, as well as in the commercial market, where for a time cd-roms (far more durable than diskettes) were routinely bundled with all manner of trade books and textbooks. as these examples show, digital media cohabitate with the codex, not only in close physical proximity (between the same covers) but occupying conceptually coterminous textual and narrative space.

figure . mindwheel, by pinsky, et al. slipcover, book, and . -inch diskette. note label on slipcover specifying it as the macintosh version and the two “programmers” who are given billing alongside of the “author.” pinsky was responsible for the electronic content, but the text in the printed volume was by sanford. photo by kirschenbaum.

but mindwheel also raises the vital question of preservation, and how one accesses it today as a significant historical artifact.
one option, of course, is to seek out an original copy on the secondhand market, and at least as of this writing they are not especially difficult to find or priced prohibitively; yet it would still be impractical to assign the book to a class full of students, and even having an original diskette in hand raises the question of how it can be accessed on a modern computer, which lacks the disk drives to say nothing of the appropriate operating systems. an alternative presents itself in the form of the web’s so-called “abandonware” hubs, where software of uncertain copyright status (presumed “abandoned” by the original rights holders) may be downloaded under conditions of questionable legality and experienced by way of an emulator, a piece of software whose function is to replicate the graphics and behaviors, using the original programmed logic, of some long-vanished platform; pdfs of the original printed volume, meanwhile, are also in circulation, and their copyright status is equally dubious. both abandonware repositories and emulators are situated within what is at best a grey area where, as galey has also noted in the context of his work on e-books, copyrights and digital rights management technologies can render seemingly innocuous scholarly activities illicit or uncertain under the letter of the law. mindwheel thus dramatizes the complexities of doing book history on this comparatively recent material, as well as the importance of scholars acquainting themselves with the various issues and trade-offs inherent in various forms of digital preservation. such knowledge is no different in principle from what we expect of those who would navigate a reading room for access to special collections materials because they understand that a facsimile is not an adequate substitute for the experience of the original volume.
books themselves can also be used as a kind of emulator, to capture and document the experience of the digital, a trend we see in mainstream publishing in a novel like jessica grose’s sad desk salad ( ), which embeds the ubiquitous chat balloons, message icons, and avatars of social media into its prose pages. but the most interesting such work typically takes the form of artist’s books or novelty projects. the most literal example may be richard moore’s paper pong ( ), which allows you to actually “play” the classic video game using a system of directed page references similar to the old-style choose your own adventure™ books. the difference is that instead of a picaresque adventure plot, one makes their decision about what to do next (which way to move the paddle) based on a visual depiction of the current state of the game, which is rendered in a full page layout. silvio lorusso and sebastien schmieg’s broken kindle screens ( ) presents the reader with exactly that, uncaptioned black and white photo-reproductions of the eponymous cracked and shattered devices. while clearly intended as a statement on the materiality of the digital and its counterintuitive persistence in the form of print, the book also serves to document the surprising variety and aesthetic allure of these unfortunate accidents. more extreme in this regard is martin howse’s diff in june ( ), a -page tome, like broken kindle screens available as either a pdf or a print-on-demand publication from lulu.com. the project is described as follows: “using a small custom script, for the entire month of june martin howse registered each chunk of data which had changed within the file system from the previous day’s image.
excluding binary data, one day’s sedimentation has been published in this book, a novel of data archaeology in progress tracking the overt and the covert, merging the legal and illegal, personal and administrative, source code and frozen systematics.” the experience of encountering diff in june as a printed volume is to be confronted with a dense slab of text whose closest cousin may be a telephone book for a large-sized metropolitan area. the pages are the data dump, most of it simply opaque and even the infrequent pockets of legibility resisting any simple semantic engagement since they are messages intended for the operating system of the computer rather than the attention of a reader. diff in june reminds us that the vast majority of writing that takes place now occurs without human agency or intervention: it is machines writing to machines, as félix guattari once said, a fact which makes this volume a primary media archaeological artifact after the likes of kittler and ernst. yet this book, like moore’s, and lorusso and schmieg’s, also speaks to some more mundane but no less consequential facts about book publishing today: practically speaking, diff in june could not exist without either electronic distribution or the print-on-demand services harnessed by a company such as lulu. similarly, moore’s paper pong is also available as a pdf, under a creative commons license, which grants its reader the ability to “share” and “remix” the work as long as moore is credited and there is no commercial profit from the activity; broken kindle screens, meanwhile, is not available electronically, but this book too could not exist without digital media, not only in the obvious sense of its subject matter but also because the images it collects were themselves harvested from the web, from flickr and other photo-sharing services.
all of these works thus demonstrate the book’s surprising capaciousness as a platform for documenting the minute particulars of the digital world and its devices, even as new digital publishing practices and economies of circulation are reconfiguring the status and consumption of books themselves. with the exception of boutique letterpress editions and the like, all books today, as kirschenbaum has previously argued, are “born digital” in the sense that at some point in their composition, editing, layout, and printing they become (re)configured as data objects in software packages such as word and quark. the history here is very rich and very much in the process of active exploration, but we can cover it only in passing. computer typesetting began in the mid- s, with pioneers like roderick chisholm at brown university and douglas hofstadter, who has attested that gödel, escher, bach ( ) could not have been completed without the assistance of an early stanford program called tv-edit. bessinger and smith’s beowulf concordance ( ) was even earlier, and is a landmark of both “humanities computing” and digital presswork. starting in meanwhile, stanford computer scientist donald knuth took nearly a decade away from his writing of what are still the definitive textbooks on the art of computer programming to develop tex, a computer typesetting language that enabled him to lay out the mathematical equations and other specialized elements of the books to his satisfaction, something his commercial publishers were not then capable of doing; the story is told in his digital typography (csli, ). small presses were also often innovators, as john maxwell’s ongoing work on coach house press will demonstrate when published; document markup technologies, including the web’s ubiquitous xml, owe substantial debts to innovations by stan bevington and others associated with coach house.
kirschenbaum, meanwhile, has documented what is likely the first book written with a word processor, len deighton’s bomber ( ), as part of his ongoing research on the literary history of word processing. john updike summed up much of the state of things at a conference at mit: “and in regards to the iron curtain that exists between the humanities and the sciences, the computer is a skillful double-agent: the production and analysis of texts has been greatly facilitated by the word processor; for instance, programs for the making of indices and concordances have taken much of the laboriousness out of those necessary scholarly tasks. in my own professional field, not only does word processing make the production of perfectly typed texts almost too easy, but computer-setting has lightened the finicky labor of proofs.” but while computers have had an obvious impact on the circumstances of authorship and the industry of publishing, they are also responsible for key aspects of what we might think of as a widespread renaissance in the appreciation of the book as material object. steven e. jones, in his aforementioned the emergence of the digital humanities (routledge, ) narrates the way in which such an extraordinary and intricate book object as jonathan safran foer’s tree of codes in fact owes its existence to computer-mediated design and production. describing a video released by the publisher, jones notes: “at about one minute in, we see the large computer screen of the designer at work, and then what looks like a laser cutter producing the dies.” tree of codes, therefore, turns out to be a book which, despite its flagrant bookishness, could not have been created (or certainly could not have been practically fabricated as a trade multiple) without the employment of sophisticated digital technologies.
any scholar attentive to the actual facts of book history must regard such a work not as a protest against digitization but as material evidence of the interplay between analog and digital forms. one could multiply examples here with other recent books with conspicuous design dimensions whose production was abetted by desktop publishing and layout tools: mark danielewski’s novels come immediately to mind, including his first, house of leaves ( ), which circulated as a pdf samizdat before being picked up by pantheon (legend has danielewski flying to new york to lay out the book himself in quarkxpress in his publisher’s offices). likewise, the collaboration between j.j. abrams and doug dorst, s, is a faux-library volume dating from the s which comes complete with a slip case and interior pages stuffed full of postcards, scrap paper, ephemera like a coffee shop napkin, and marginal annotations. the design firm who did the work, melcher media, is based in new york city’s west village and recently hosted a “future of storytelling summit.” s is thus not a nostalgic offering; it is a proleptic one. perhaps the most extreme example we can consider is the remarkable and aptly titled between page and screen ( ), which is a collaboration between writer amaranth borsuk and programmer brad bouse; originally produced in a limited letterpress edition, the book is now available from siglio press. to read it, one opens the pages of the volume, which turn out to contain not text but rather large, geometric black and white glyphs loosely resembling the more familiar qr codes.
by itself, then, the book is essentially meaningless; but when readers access the accompanying web site they are directed to activate their computer’s camera and hold the book’s pages up to the lens: software on the web site interprets the glyphs as captured by the camera, and the result is an “augmented reality” text that appears to literally float above the pages of the book when rendered by the computer’s display of the camera’s image. the effect never fails to be breathtaking, and the project, though it wears its prejudices on its sleeve, nonetheless serves as perhaps the definitive statement on the jointly enabling potential of computers and codex.

figure . borsuk and bouse’s between page and screen. here what one is looking at is a reproduction of a screenshot taken of the feed from her computer’s digital camera, which has captured an image of borsuk facing the screen with the book open to one of its glyphs, thus generating the legible text which the software superimposes on the display of the camera image. screenshot by borsuk and bouse, used by permission.

some readers may object that these are specialized texts, projects which sought out their opportunities to make a statement on “the future of the book.” the reality, however, is that in today’s bookselling economy digital media are often integral to the publishing phenomenon. books are thus not only “born digital” in the sense that their composition and layout involve digital tools and technologies, but often their viability as marketable book projects is itself a direct outgrowth of a digital pedigree. one could argue, for example, that the massive online fan culture devoted to the harry potter books is indispensable to that franchise’s success. “harry potter,” after all, is not just a series of novels; it is a platform from which fans engage in their own creative acts, whether via official extensions of the franchise or through fan fiction, with online sites hosting literally hundreds of thousands of creations set in the potter universe. a less prominent example is the experience of hugh howey, author of the science fiction novel wool, which began in with a short story released through amazon’s kindle direct publishing. the story’s reputation spread through online word-of-mouth, and howey was prompted to write additional installments, eventually packaging them as a novel-length offering; after intense interest from conventional publishers, he sold one-time rights for an edition of , copies to simon and schuster, but retains online distribution rights himself and has also negotiated foreign publishing rights and a film deal with th century fox. customers who buy a copy of the simon and schuster edition, whether from a brick and mortar bookseller or an online e-tailer, will therefore hold in their hands a book whose existence as a printed object is a derivative outgrowth of the success of its digital forerunner. (in an “advice to writers” essay on his personal web site, howey encourages aspiring authors to leverage social media and to think of themselves as a “start-up” enterprise.) jennifer egan, meanwhile, well established as one of the more significant voices in contemporary american fiction, has experimented with twitter as a storytelling platform. in may, she began tweeting (yes, in -character installments) her short story “black box” from the new yorker’s account. the serial tweets were broadcast nightly over the course of ten days, with the new yorker’s followers replying and retweeting all the while. the complete piece was then published in the june , print edition of the new yorker. what becomes especially interesting, however, is egan’s disclosure in an interview that the story was initially drafted longhand in a japanese notebook whose pages were ruled with rectangular boxes that could accommodate prose statements roughly the length of a tweet.
the particular features of this notebook, then, were as much a material constraint for the project as twitter was. “black box” therefore is an artifact of multifaceted interchange, a generative friction, between print and digital writing platforms. (on amazon, meanwhile, there is a german kindle edition of the work available, a fact which serves to demonstrate that digital forms are no more self-identical than printed exemplars.) a “book history” project engaging the work of contemporary fiction writers such as j.k. rowling, hugh howey, or jennifer egan will of necessity also be a software studies, platform studies, and media archaeology project.

no account of the interplay between digital and traditional forms of authorship, reading, and the book market would be complete without the bête noire of contemporary publishing. we refer, of course, to fifty shades of grey ( ), whose author, e.l. james, famously displaced j.k. rowling as the best-selling author of all time on amazon.com uk. besides its sensational sales records (for which e-books bear a disproportionate responsibility), the novel is best known for its supposedly daring content, which has sparked the predictable controversies and debates in the predictable venues. but its e-book sales are not incidental in this regard, since, as many observers have pointed out, e-books offer the opportunity for private reading in public spaces: in a crowded train car, no one around you can tell if you’re reading e.l. james or henry james. (every book has the same non-judgmental cover.) fifty shades of grey (hereafter fsog) has an additional significance in the terms of our discussion, however. its trajectory to print or screen makes those of franzen, pinsky, egan, danielewski, howey, and others we have discussed appear straightforward.
the story involves not just the usual binaries between print and the digital, but also relationships between fans and producers, between traditional publishing and viral content online, and between multiple layers of socially sanctioned forms of authorship. it is also a cautionary tale about the conduct of literary history in the present moment and the stakes for the future, since, whatever one’s view of the novel’s content, it is undeniably groundbreaking and historic as an illustration of just how complex the contemporary book’s media environment has become. fsog has its origins as an instance of so-called “fan fiction,” briefly mentioned above in relation to the harry potter series. fan fiction, as the name implies, involves original storytelling (emphasizing prose, but also illustrations and other media) undertaken by the devotees of popular film, tv, and book franchises; its quality varies widely from the sophomoric to the sophisticated, with leading fan authors garnering formidable followings of their own on the sites and portals where their work is disseminated and discussed. various franchises take different views of fan fiction, some (including rowling) accepting it, while others actively seek to discourage the phenomenon. e.l. james began writing fan fiction set in the twilight vampire universe under the penname “snowqueens icedragon” in . her work was sexually explicit, something different fan sites and communities have varying degrees of tolerance for; james opted to remove her work from the fanfiction.net hub where it was being distributed and to instead serve it from her own personal web site, fiftyshades.com. by this time she had already garnered a substantial readership, hundreds of thousands of readers by some reasonable estimates. crucially, at this point james also began rewriting the text, which had originally been published under the title master of the universe; all vestiges of the twilight universe were expunged.
she then brought the revised story to the writers’ coffeehouse, an australian-based print-on-demand company. it was published there as fifty shades of grey in may ; a year later james sold the rights to random house’s vintage books, the move which yielded its now meteoric sales figures. thus a book which began as the derivative work of another fiction franchise now commands even larger assets. (as of this writing an fsog movie is on the way, with its own attendant controversies.) today, of course, fsog is no longer available from the writers’ coffeehouse; james’s original fiftyshades.com is a standard author’s platform, with no trace of master of the universe; and her profile has been scrubbed from fanfiction.net. no doubt whatever unease a successful writer (or a protective publisher) may feel upon encountering reminders of a work’s inauspicious origins is greatly exacerbated in this instance by its relationship to the twilight universe. a scholar of the future, then, whether interested in so-called “mommy porn” or the fan fiction phenomenon or james’s career, will have a difficult if not impossible time amassing the primary source materials required to do his or her work. likely they will have to rely on a network of individuals who may, improbably, still have a copy of the original files sequestered on some piece of now obsolescent media (pdfs of the original master of the universe remain in circulation, though they are wholly dissociated from the fsog brand); they may have some luck with content crawlers like the internet archive (though its archived crawls of fiftyshades.com have now been removed); or else they will have to rely on a smattering of screenshots and third-person accounts.

figure . e.l. james’s master of the universe as it was presented on her fiftyshades.com site in december, under her original penname “snowqueens icedragon.” screenshot from galleycat.com.
one could argue that this is no different from the kind of sleuthing and serendipity good research has always required; but we believe the combination of technologies, distribution networks, and the legalities of blockbuster properties spanning multiple media platforms renders the situation qualitatively different. we have suggested that the various approaches we have surveyed from digital studies and media theory can serve to offer some traction on these questions, and this is true. but more fundamental issues remain, chief among them that the scholarly and archival apparatus for contemporary book studies is quite simply unequipped to accommodate the particulars of today’s book publishing landscape, necessitating as it would the use of technologies such as web crawlers, torrent sites, screen scrapers, social media feeds, and sometimes even trafficking in illicit file trading.

v

if book history is the study of how platforms shape and deliver texts, then today’s platforms of pixels and plastic are as much a part of those studies as paper and papyrus. how many of us encounter the objects of our study unmediated by subsequent technologies? even in special collections, what we find is presented to us through the thresholds of catalogs, phase boxes, and call slips. we all experience this, even if we do not always theorize it. but what might we learn if we do think about the entrance of old media into the platforms of new media? whitney trettien’s exploration of print-on-demand copies of areopagitica suggests that such debased “zombified” books can teach us more about how texts are circulated today than most deliberately translated or hybrid works can.
english reprints jhon milton areopagitica, as one of these pod books is titled, is a mishmash of bad metadata and worse optical character recognition, but its “remediated, dismediated strangeness brings the increasingly normalized processes of digital archiving into sharp relief.” through her engagement with the long history and digital and physical presences of these works, trettien shows how “these moments of engaging with the printed materiality of digital texts point to the multiform ways digitization is altering the weight of history.” and through her clever coding, trettien’s online digital humanities quarterly essay performs the deformation it discusses, disrupting the normal mediation of code we have come to expect.

meanwhile, works such as the story of @mayoremanuel reveal all the messy interplay between text and real life that exclusively fictional stories like jennifer egan’s do not capture. a parody twitter account that emerged during rahm emanuel’s campaign to be elected the mayor of chicago in the - winter, @mayoremanuel appeared at first to be a one-note joke. (rahm emanuel swears. he swears a lot.) but over the course of the campaign, the account began to introduce recurring characters, to respond to events happening in the real campaign, and to weave together a fictional narrative that played out in real time. @mayoremanuel’s conclusion came the day after emanuel’s election and, uncannily, played off a hailstorm happening in chicago. after the account’s creator was revealed to be the founder of the music ’zine punk planet and journalism professor dan sinker, the @mayoremanuel story was retold in articles and released as a book. but what the book fails to capture is what made the story so exciting: it was a narrative that coincided with real life.
the tweets played out over five months, and the story they told happened over the same five months; when the bears lost, @mayoremanuel mourned; when @mayoremanuel was stuck in the sewer pipes for seven hours, the tweets played out over seven hours. the details of @mayoremanuel’s campaign were interspersed in followers’ feeds with the details of their other friends’ feeds, with no visible demarcation between fact and fiction. the platform on which the text of @mayoremanuel was created and through which it was delivered made possible not only the mechanisms of its reception but the shape and meaning of its story. if that’s not “book” history (think of the analysis of christian adoption of the codex as the form through which to receive the bible), then what is?

so what can we say about today’s objects of tomorrow’s book history? books themselves are transmedia properties, franchises spanning multiple formats, media channels, and distribution networks. today’s texts are also, inevitably, hybrid artifacts, migrating back and forth between digital and analog states. importantly, the print does not always precede the digital; in fact the norm may be the other way around. moreover, the digital is no longer exclusively a presentist form, if it ever was: the digital itself is now historical. a work such as mindwheel is now more than thirty years old. like textual scholarship, then, the book history of tomorrow will consist in the application of techniques from fields like digital forensics, as archivists and other experts work to stabilize, authenticate, and index the born-digital materials that now function, indisputably, as primary records in and of themselves. but scholars will also need to understand something about network effects, the massive assemblages of data that will be susceptible to analysis through text mining and visualization rather than traditional close reading.
there is no way to evaluate the thousands of user reviews on a site like amazon, for example, other than in aggregate, as streams which we will sift for patterns and anomalies. nor will the material circumstances of the digital be homogeneous: they will encompass an array of different platforms, systems, formats, and standards, some of which will interoperate and some of which will not, but all of which a future scholar will have to contend with in the same way as the complex imbrications of printings and editions (all of which will also still obtain). finally, our scholarship will have to confront the reality that its largest challenges may not be technological but legalistic. intellectual property, digital rights management, terms of service, and end-user license agreements will govern access at least as much as media obsolescence, bit rot, or curatorial neglect. legalities, material hybridity, network effects: contemporary book studies shares many features with its predecessors, but we ignore the marked material differences at the peril of our scholarly legacy.
list of digital resources

annotated books online (utrecht university): http://www.annotatedbooksonline.com/
the archimedes palimpsest (walters art museum): http://archimedespalimpsest.org/
the atlas of early printing (university of iowa library): http://atlas.lib.uiowa.edu/
the atlas of the rhode island book trade in the eighteenth century (rhode island historical society): http://www.rihs.org/atlas/index.php
bill-crit-o-matic (patrick murray-john): http://billcritomatic.org/
bodleian ballads online (bodleian library): http://ballads.bodleian.ox.ac.uk/; to search using imagematch http://zeus.robots.ox.ac.uk/ballads/page
the dickinson electronic archives (dickinson editing collective): http://www.emilydickinson.org/
early english books online: http://eebo.chadwyck.com/home [subscription only]
eighteenth-century book tracker (benjamin pauley): http://www.easternct.edu/~pauleyb/ c booktracker/
eighteenth century collections online: http://gdc.gale.com/products/eighteenth-century-collections-online/ [subscription only]
emily dickinson archive (harvard university): http://www.edickinson.org/
english short title catalogue: http://estc.bl.uk/
folger digital texts (folger shakespeare library): http://www.folgerdigitaltexts.org/
the french book trade in enlightenment europe (university of western sydney): http://fbtee.uws.edu.au/main/
gallica (bibliothèque nationale de france): http://gallica.bnf.fr/
gesamtkatalog der wiegendrucke database (staatsbibliothek zu berlin): http://www.gesamtkatalogderwiegendrucke.de/
google books: http://books.google.com/books
the great parchment book (london metropolitan archives): http://www.greatparchmentbook.org/
hathitrust digital library: http://www.hathitrust.org/
implementing new knowledge environments (university of victoria): http://inke.ca/
impos i tor (folger shakespeare library): http://titania.folger.edu/impositor/
incunabula short title catalogue (british library): http://www.bl.uk/catalogues/istc/
internet archive: https://archive.org/
listen to wikipedia (stephen laporte and mahmoud hashemi): http://listen.hatnote.com/
mapping colonial americas publishing project (brown university): http://www.stg.brown.edu/projects/mapping-genres/index.html
the open utopia (stephen duncombe): http://theopenutopia.org/
reading experience database (open university): http://www.open.ac.uk/arts/reading/
the shakespeare quartos archive (bodleian library, folger shakespeare library, and mith, university of maryland): http://www.quartos.org/
the shelley-godwin archive (new york public library and mith, university of maryland): http://shelleygodwinarchive.org/
a social edition of the devonshire manuscript (inke): http://en.wikibooks.org/wiki/the_devonshire_manuscript
visualizing variation (alan galey): http://individual.utoronto.ca/alangaley/visualizingvariation/
what middletown read (ball state university): http://www.bsu.edu/libraries/wmr/
the william blake archive (iath, university of virginia and carolina digital library and archives): http://www.blakearchive.org/blake/

notes

virginia woolf, three guineas (new york: harcourt, ), .
michael joyce, “notes toward an unwritten non-linear electronic text, ‘the ends of print culture’ (a work in progress),” postmodern culture . (september ): http://pmc.iath.virginia.edu/text-only/issue. /joyce. .
see steven e. jones, the emergence of the digital humanities (new york and london: routledge, ).
kirschenbaum, “the .txtual condition: digital humanities, born-digital archives, and the future literary,” digital humanities quarterly . ( ): http://www.digitalhumanities.org/dhq/vol/ / / / .html. see also mcgann’s the textual condition (princeton: princeton university press, ).
released in , first draft of the revolution is available at http://lizadaly.com/first-draft/. the work is subtitled “an interactive epistolary story.” for device , see http://simogo.com/games/device /.
as was widely reported in the press; for example: http://techcrunch.com/ / / / that-was-fast-amazons-kindle-ebook-sales-surpass-print-it-only-took-four-years/
see http://www.futureofthebook.org/ for bob stein (of voyager publishing fame) and his colleagues’ ongoing work and thought in this field. the frankfurt book fair featured a “sprint beyond the book” organized by the arizona state university’s center for science and the imagination, and sponsored by intel; over the course of hours a group of writers collectively authored an online “book” exploring the titular theme: http://www.sprintbeyondthebook.com/.
see reading at risk ( ) and to read or not to read: a question of national consequence ( ), available at http://arts.gov/publications/reading-risk-survey-literary-reading-america- and http://arts.gov/publications/read-or-not-read-question-national-consequence- respectively. each generated considerable media coverage, debate, and attempts at rebuttal.
see last year’s state of the discipline report on the subject: meredith l. mcgill, “copyright and intellectual property: the state of the discipline,” book history , no. ( ): – .
those readers wishing for an introduction to digital humanities will be well served by the following collections, all of which exist in print and in freely available online editions: matthew k.
gold’s edited collection debates in the digital humanities [minnesota, ; http://dhdebates.gc.cuny.edu/]; blackwell’s companion to digital humanities, edited by susan schreibman, ray siemens, and john unsworth [ ; new edition pending; http://www.digitalhumanities.org/companion/]; and blackwell’s companion to digital literary studies, edited by ray siemens and susan schreibman [ ; http://www.digitalhumanities.org/companiondls/].
british museum, catalogue of books in the library of the british museum printed in england, scotland, and ireland, and of books in english printed abroad, to the year , ed. george bullen (london: by order of the trustees, ); alfred w. pollard and g. r. redgrave, a short-title catalogue of books printed in england, scotland, & ireland and of english books printed abroad, – (london: bibliographical society, ). the url for the english short-title catalogue, along with all other online resources discussed here, is provided in the projects cited list at the end of this essay.
mitch fraas, “mapping books: mapping pre- printed books today,” mapping books, july , , http://mappingbooks.blogspot.com/ / /mapping-pre- -printed-books-today.html.
see also fraas’s exploration of the dispersal of medieval british libraries: mitch fraas, “mapping books: the dispersal of the medieval libraries of great britain,” mapping books, november , , http://mappingbooks.blogspot.com/ / /the-dispersal-of-medieval-libraries-of.html.
for a list of these projects, see eleanor shevlin, “sharp digital projects and tools showcase,” early modern online bibliography, accessed october , , http://earlymodernonlinebib.wordpress.com/ / / /sharp- -digital-projects-and-tools-showcase/.
ben schmidt, “sapping attention: do revolutionaries really read history?” sapping attention, july , , http://sappingattention.blogspot.com/ / /do-revolutionaries-really-read-history.html; ben schmidt, “sapping attention: making and publishing history in the civil war,” sapping attention, july , , http://sappingattention.blogspot.com/ / / making-and-publishing-history-in-civil.html.
ben schmidt, “sapping attention: women in the libraries,” sapping attention, may , , http://sappingattention.blogspot.com/ / /women-in-libraries.html.
david a. smith, ryan cordell, and elizabeth maddock dillon, “infectious texts: modeling text reuse in nineteenth-century newspapers,” proceedings of the workshop on big humanities, accessed november , , http://www.viraltexts.org/infect-bighum- .pdf.
ryan cordell, “‘taken possession of’: the reprinting and reauthorship of hawthorne’s ‘celestial railroad’ in the antebellum religious press,” digital humanities quarterly , no. ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html.
gadd points to arber’s a transcript of the registers of the company of stationers of london, – a.d. (london, – ), volume v, p. xxi, and volume i, p. xvii, for support of these points.
british museum, catalogue of books in the library of the british museum printed in england, scotland, and ireland, and of books in english printed abroad, to the year ; pollard and redgrave, a short-title catalogue of books printed in england, scotland, & ireland and of english books printed abroad, – ; donald goddard wing, short-title catalogue of books printed in england, scotland, ireland, wales, and british america, and of english books printed in other countries, – (new york: index society, ); estc (project) and british library, eighteenth century short title catalogue (estc) (london: british library, ); “english short title catalogue,” accessed october , , http://estc.bl.uk/.
for a helpful overview of this history of the estc, see ian gadd, “the use and misuse of early english books online,” literature compass , no. (may ): – , doi: . /j. - . . .x.
for a description of what is included in the estc and what bibliographic standards are used, see adrian edwards, “english short title catalogue—content,” accessed november , , http://www.bl.uk/reshelp/findhelprestype/catblhold/estccontent/estccontent.html.
for an overview of the different copies available as online facsimiles and for an assessment of the various interfaces, see sarah werner, “first folios online,” the collation, april , , http://collation.folger.edu/ / /first-folios-online/; sarah werner, “what do we want from online facsimiles of shakespeare?” wynken de worde, may , , http://sarahwerner.net/blog/index.php/ / /what-do-we-want-from-online-facsimiles-of-shakespeare/.
liz bury’s article in the guardian provides an overview of the conflict and reflects the way in which the story appeals to the media today: liz bury, “emily dickinson legacy fuels war of the archives,” the guardian, october , , http://www.theguardian.com/books/ /oct/ /emily-dickinson-legacy-archives-online.
paul duguid addressed these issues early and compellingly in “inheritance and loss? a brief survey of google books,” first monday . (august ): http://pear.accc.uic.edu/ojs/index.php/fm/article/view/ / .
“the estc as a st century research tool,” the estc as a st century research tool, accessed november , , http://estc .wordpress.com/.
“about eebo,” accessed november , , http://eebo.chadwyck.com/marketing/about.htm.
giles bergel et al., “content-based image recognition on printed broadside ballads: the bodleian libraries’ imagematch tool” (presented at the ifla world library and information congress, singapore, ), http://library.ifla.org/id/eprint/ .
tom malzbender, dan gelb, and hans wolters, “polynomial texture maps,” proceedings of the th annual conference on computer graphics and interactive techniques ( ): – .
“cultural heritage imaging | more rti examples,” accessed november , , http://culturalheritageimaging.org/technologies/rti/more_rti_examples.html; “rich (illuminare) | portable light dome,” portable light dome, accessed november , , http://portablelightdome.wordpress.com/category/rich-illuminare/; “cultural heritage imaging | reflectance transformation imaging (rti),” accessed november , , http://culturalheritageimaging.org/technologies/rti/.
r. macgeddon [pseud. randall macleod], “an epilogue: hammered” in pete langman, ed. negotiating the jacobean printed book (ashgate, ), – .
kathryn rudy, “dirty books: quantifying patterns of use in medieval manuscripts using a densitometer,” journal of historians of netherlandish art , no. – (june ), doi: . /jhna. . . . .
matija strlič et al., “material degradomics: on the smell of old books,” analytical chemistry , no. (october , ): – , doi: . /ac ; ann fenech et al., “volatile aldehydes in libraries and archives,” atmospheric environment , no. (june ): – , doi: . /j.atmosenv. . . ; sarah werner, “where material book culture meets digital humanities,” journal of digital humanities, october , , http://journalofdigitalhumanities.org/ - /where-material-book-culture-meets-digital-humanities-by-sarah-werner/.
striphas, the late age of print: everyday book culture from consumerism to control (new york: columbia university press, ).
see http://www.theguardian.com/books/ /oct/ /jonathan-franzen-freedom-uk-recall.
indeed, ever more information about individual users’ reading habits is being collected by these devices, on the one hand raising obvious questions about privacy and personal data security, but also creating truly unprecedented opportunities for the study of public reading habits at scale.
see, for example, david streitfeld’s story on the phenomenon in the december , new york times: http://mobile.nytimes.com/ / / /technology/as-new-services-track-habits-the-e-books-are-reading-you.html?hp=&_r= .
see lev grossman’s “jonathan franzen: great american novelist,” april , : http://content.time.com/time/magazine/article/ , , , .html.
in winthrop-young, kittler and the media (cambridge: polity, ), – .
“there is no software” has been reprinted many times; here the source is the electronic copy available from the ctheory site: http://www.ctheory.net/articles.aspx?id= .
n. katherine hayles, writing machines (cambridge: mit press, ), .
lev manovich, the language of new media (cambridge: mit press, ), .
matthew fuller, software studies: a lexicon (cambridge: mit press, ), .
the internet archive’s historical software collection is available here: https://archive.org/details/historicalsoftware. exemplars range from wordstar and visicalc to games such as adventure, pac-man, and pitfall. the bitsavers collection of software documentation, currently consisting of over , individual items, may be accessed here: http://bitsavers.informatik.uni-stuttgart.de/.
see knuth, literate programming (stanford, california: center for the study of language and information, ).
noah wardrip-fruin, expressive processing: digital fictions, computer games, and software studies (cambridge: mit press, ), .
dennis jerz, “somewhere nearby is colossal cave: examining will crowther’s original ‘adventure’ in code and in kentucky,” digital humanities quarterly . ( ): http://www.digitalhumanities.org/dhq/vol/ / / / .html.
a complete pdf edition of print is available here: http:// print.org/.
see the platform studies web site for the full discussion: http://platformstudies.com/levels.html.
alan galey, “the enkindling reciter: e-books in the bibliographical imagination,” book history ( ): – , at , doi: . /bh. . .
the mal has a web presence here: http://loriemerson.net/media-archaeology-lab/. “the mal is a place for cross-disciplinary experimental research and teaching using obsolete tools, hardware, software and platforms, from the past.”
see the myth of the paperless office (mit press, ).
from their introduction to the collection between c&d: new writing from the lower east side fiction magazine (penguin, ), ix.
see love, scribal publication in seventeenth-century england (oxford, ), and stallybrass, “printing and the manuscript revolution,” in ed. barbie zelizer, explorations in communication and history (abingdon, uk, and new york: routledge, ), – , and “‘little jobs’: broadsides and the printing revolution,” in agent of change: print culture studies after elizabeth l. eisenstein, ed. sabrina alcorn baron, eric n. lindquist, and eleanor f. shevlin (amherst, ma: university of massachusetts press, ), - .
paul zelvansky’s artist’s book the case for the burial of ancestors ( ), john mcdaid’s hypercard fiction uncle buddy’s phantom funhouse ( ), william gibson, dennis ashbaugh, and kevin begos, jr.’s agrippa: a book of the dead ( ) and ian bogost’s a slow year ( ) are all subsequent exemplars. of these, agrippa has the most notoriety. see the agrippa files web site for complete documentation: http://agrippa.english.ucsb.edu/.
see http://linkeditions.tumblr.com/howse.
see kirschenbaum, “the .txtual condition.”
see “the book-writing machine,” slate (march , ): http://www.slate.com/articles/arts/books/ / /len_deighton_s_bomber_the_first_book_ever_written_on_a_word_processor.html.
in “where money and energy gather: a writer’s view of a computer laboratory,” research directions in computer science: an mit perspective, eds. albert r. meyer, et al. (cambridge: mit press, ): n.p.
jones, the emergence of the digital humanities, .
see http://www.hughhowey.com/my-advice-to-aspiring-authors/.
see http://www.newyorker.com/online/blogs/books/ / /coming-soon-jennifer-egan-black-box.html. . one source cites % ebook sales against the usual %: http://www.theguardian.com/media/ /mar/ /fifty-shades-random-house-record-profit. . see http://www.mediabistro.com/galleycat/fifty-shades-of-grey-wayback-machine_b for screenshots of the original fiftyshades site. . see http://fiftyshadesofpopculturetheory.blogspot.com/ / /full-exchange-with-jason-boog-for.html for the reasoning behind this figure. . whitney anne trettien, “a deep history of electronic textuality: the case of english reprints jhon milton areopagitica,” digital humanities quarterly , no. ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html. . the revelation of the account’s author came from alexis madrigal in the essay that best captures the nature of the narrative: “revealing the man behind @mayoremanuel,” the atlantic, february , , http://www.theatlantic.com/technology/archive/ / /revealing-the-man-behind-mayoremanuel/ /. the full series of tweets was later published in book form, along with sinker’s narratives about writing it: dan sinker, the f***ing epic twitter quest of @mayor emanuel (scribner, ). . ed finn has done some excellent preliminary investigation of “distant reading” the kind of user data housed on amazon and goodreads for contemporary fiction; see his “revenge of the nerd: junot díaz and the networks of american literary imagination” in digital humanities quarterly , no. ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html.
a model for institutional infrastructure to support digital scholarship publications , , - ; doi: .
/publications
publications issn - www.mdpi.com/journal/publications
open access
article
a model for institutional infrastructure to support digital scholarship
malcolm wolski * and joanna richardson
division of information services, griffith university, nathan, qld , australia; e-mail: j.richardson@griffith.edu.au
* author to whom correspondence should be addressed; e-mail: m.wolski@griffith.edu.au; tel.: + - - - .
external editor: björn brembs
received: april ; in revised form: august / accepted: september / published: september
abstract: there is a driving imperative for new knowledge, approaches and technologies to empower scholarship, especially in emerging areas of inquiry. sources of information now extend beyond the written word to include a wide range of born-digital objects. this paper examines the changing landscape in which digital scholars find, collaborate, create and process information and, as a result, scholarship is being transformed. it discusses the key elements required to build an institutional infrastructure, which will not only support new practices but also integrate scholarly literature into emerging and evolving models that generate true digital scholarship. the paper outlines some of the major impediments in implementing such a model, as well as suggestions on how to overcome these barriers.
keywords: digital scholarship; built infrastructure; digital artifacts; scholarship
. introduction
as society tackles increasingly complex issues, specialists across a number of disciplines have been examining the role of research and innovation in underpinning solutions to these issues [ – ]. new knowledge, approaches and technologies are discussed as core to proposed strategies. concurrently the very nature of scholarly communication is changing partly in response to these same drivers. as a result, new research practices are evolving.
in this paper the authors examine this rapidly evolving and transformative landscape in which scholars work. they discuss the emerging concept of digital scholarship. the authors conclude with a discussion of the key components required to build an institutional infrastructure to support new practices and new models of scholarship, as well as some of the barriers, which need to be overcome to achieve this goal.
. background
the role of new technologies in causing a change or paradigm shift in the practices of research, learning and teaching, and scholarly communication has been, and is continuing to be, widely discussed in the literature. for example, christensen’s [ ] theory of disruptive innovation, which is based on the changing application of technology in the marketplace, has been expanded to apply more broadly to technologies that have introduced radically different behaviors into society generally. in learning and teaching specifically, there is considerable discussion about the flipped classroom [ ]. in scholarly publishing, there is a focus not just on new digital reading devices but also on the changing nature of publishing itself. the internet has enabled new models of scholarly communication in which authors are directly linked with other scholars and readers, and in which distribution models can no longer rely on a wholesale model based on physical content. in an industry, which, by its nature, is content-focused, digital content has disrupted the former relationships and roles among writers, publishers and readers [ ]. new publishing and pricing models are being explored for journals, scholarly monographs, textbooks, and digital materials, as the various stakeholders try to establish sustainable business models [ , ].
along with the impact of open access [ – ], an important trend—from a publishing perspective—has been the emergence of different models for providing access to the data, which underpins published research [ – ]. the rapid development and dissemination of digital technologies have helped to enable interdisciplinary research, not just in “big science” but also in the fast growing field of digital humanities. electronic networks are making it much easier for investigators from different fields to communicate and collaborate. these rapid changes are pointing toward a very different model of research practice and have led major international, national and funding bodies to examine their respective impact. both the american association for the advancement of science [ ] and the arts and humanities research council (ahrc) [ ] in the uk, for example, have highlighted the importance of interdisciplinary, collaborative approaches. in europe, the eu rif project [ ] has been established to explore new and emerging ways of doing research in universities, research organizations, companies and society as a whole.
. the changing nature of scholarship
against this backdrop of evolving models in research practice and scholarly communication, especially in regard to publishing, it is not surprising that the concept of scholarship has also been undergoing profound change. it is intrinsically linked to these concepts. historically scholarship tended to be regarded as work undertaken in a tertiary institution, based on research, which led to publication as a book, book chapter or an article in a peer-reviewed journal. a narrower definition has limited this activity to “original” research conducted in the sciences [ ].
drawing upon the work of others, boyer [ ] subsequently expanded the concept, based on four areas, to show that scholarship is broader than just traditional notions of research: the scholarship of discovery, the scholarship of integration, the scholarship of application, and the scholarship of teaching. his work has been important in helping university departments assess the outputs of their staff. more recently, beattie [ ] and diamond [ ] have advocated expanding the definition of scholarship to reflect practice in the twenty-first century. in the humanities, particularly english studies, for example, there has been a tendency to attempt to categorize digital outputs as equivalent to print publications instead of considering them in terms of “design and delivery, recentness and relevance, and authorship and accessibility” [ ]. activities, which are undertaken in so-called “nontraditional” areas, need to be recognized as well. the performing arts sector is a primary example. in addition, scholarship, as diamond [ ] suggests, is defined at any given time, depending upon factors such as discipline, individual interests and institutional priorities. as the authors will show later, these are important considerations when building an infrastructure to support an institution’s scholarship. concepts such as digital scholarship and transformative scholarship have emerged to describe new ways in which scholars engage with innovative research practices. digital scholarship is defined by rumsey [ ] as “the use of digital evidence and method, digital authoring, digital publishing, digital curation and preservation, and digital use and reuse of scholarship”. ayres [ ] asserts that most frequently it is used to describe “discipline-based scholarship produced with digital tools and presented in digital form”. for pearce et al.
[ ], digital scholarship is more than just the use of information and communication technologies to inform research, collaboration and teaching: “… it is embracing the open values, ideology and potential of technologies born of peer-to-peer networking and wiki ways of working in order to benefit both the academy and society. digital scholarship can only have meaning if it marks a radical break in scholarship practices brought about through the possibilities enabled in new technologies. this break would encompass a more open form of scholarship.” transformative scholarship, for its part, tends to focus on the transition from traditional, print-based forms to an integration of digital practices into the intellectual creation process. at the university of southern california, the center for transformative scholarship’s mission is to “facilitate, explore, test, and advance the potentials of new media and networked scholarship for scholarly research, analysis, and publication” [ ]. the center provides a supportive institutional framework for those in the university community who wish to take advantage of developing born-digital scholarly projects. for purposes of this discussion, the authors have included transformative scholarship under the broader term of digital scholarship since both concepts are anchored in digital practice. fundamental to this overall discussion are also the changing concept of knowledge and the nature of its relationship to scholarship. as with scholarship, “knowledge is perpetually in motion. today, what we call ‘knowledge’ is constantly being questioned, challenged, rethought, and rewritten” [ ]. according to wilbanks [ ], whereas traditionally knowledge has been thought of as “a paper, a product, property”, it can now be thought of as “a network, an infrastructure”. 
unlike its manifestation in an object, such as a book, which is limited by physical boundaries, knowledge as a network invites both the creator and the reader to link content in ways previously not imagined. as a result, notwithstanding the tendency within institutions to continue time-honored traditional approaches, the increasing dialogue about new possibilities in both the creation and dissemination of scholarship is helping to drive the development of innovation. in this period of transition, there is recognition of the importance of pushing boundaries and reconceptualizing the form, as well as the very substance, of scholarship. digital scholarship potentially has a key role in expanding these boundaries; its outputs will hopefully create new insights. ayres [ ] asserts that “digital scholarship is the missing part of the cycle of productivity that we have long believed our investments in information technology would bring to institutions of higher education.” therefore tertiary educational institutions have an obligation not only to provide the necessary supporting infrastructure to bring this potential to fruition but also to validate digital scholarship as a valued model of scholarship. in the following section the authors outline the key elements that an institution needs to consider in building the type of infrastructure which can support new models of digital scholarship.
. institutional infrastructure
the new oxford american dictionary [ ] defines infrastructure as “the basic physical and organizational structures and facilities needed for the operation of a society or enterprise”. information infrastructure specifically has been discussed in the literature for more than a decade.
two common themes are: (a) the importance of the non-technological aspects – the human components, comprising standards, social norms and practices with technology that collectively facilitate scholarly work from a distance [ ] and (b) the need for interoperability between local and multiple divergent systems [ ]. in discussing information infrastructures, borgman [ ] asserts that traditionally initiatives have tended to be oriented toward the purely technical aspects of infrastructure, i.e., infrastructure of information, whereas the focus should be on infrastructure for information, which encompasses information practices within their specific social context and discipline. in the specific domain of digital scholarship, innovation values content/artifacts, people and technology as fundamental components. since knowledge is not created in isolation and is inherently evolving, the corresponding infrastructure needs to integrate these components, which are also continually evolving. this applies both within the institution and on a global level. the challenge for institutions is to build infrastructure for this new environment, especially for those engaged in digital scholarship. historically infrastructure in this context favors a traditional concept of scholarship in terms of big science. an example is the provisioning of mass storage for “big data”. institutional administrators traditionally have viewed infrastructure as something built for and within their own enterprise. however digital scholarship is about sharing information, collaboration and knowledge creation beyond the enterprise. the institutional infrastructure does not operate in a vacuum; it needs to be regarded as a “node” in the much broader knowledge ecology and regarded as more than just physical infrastructure. therefore institutional administrators will need to re-conceptualize infrastructure to meet new requirements.
core components would include organizational structure, built infrastructure, digital artifacts and people (see figure ).
figure . institutional infrastructure model.
. . organizational structure
organizational structures are those governance structures, policies, standards, legal instruments and processes, which are required to facilitate scholarship. this could include policy changes, regulations, new standards and licensing frameworks, and operational processes required to make research data more easily discoverable and shareable. it includes the roles of organizational units and the financial responsibilities that are required to facilitate the development of the infrastructure.
. . . current approaches and constraints
while the enthusiasm and innovation exemplified by individual digital scholars are readily evident, the same perspective may not necessarily be shared at the institutional level. weller [ ], for example, talks about the significant challenges for scholars when "faced with norms and values which oppose, hinder or fail to recognize these forms of scholarship". he goes on to discuss the general lack of understanding of, and experience with, this type of scholarship by senior university management; this in turn can lead to what weller terms as "resistance" in recognizing it as a valuable activity. ayres [ ] makes the point that digital scholarship will not get traction until it ceases to be viewed as a “series of isolated experiments”. while it is true that substantial projects have been underwritten by funding agencies, there is a sense of unrealized potential. projects may not have been viewed initially as scholarship because the focus has been more on the underlying technology rather than on creating new disciplinary knowledge.
the challenge, from an organizational perspective, is one of being tied to long-held, policy-based definitions as to what constitutes "research", especially for purposes of funding and/or other high-level activities. for example, in australia two national research assessment exercises use the reasonably broad organisation for economic co-operation and development (oecd)’s definition [ ] but tend to focus on “publications” as the assessed metric. in addition the national and international ranking of universities in the so-called “league tables”, e.g., qs world university rankings [ ], relies on easily quantifiable metrics such as research citations. new forms of digital scholarship, on the other hand, challenge the conventional methods for measuring research impact. unlike a text-based journal article, they do not currently easily lend themselves to citation analysis. much of the potential for digital scholarship is based upon its adoption of open values, e.g., open access, open scholarship, open data. at an institutional level, this “openness” requires a corresponding high-level commitment, if not policy or mandate. while some institutions are notable for their progress, the rate of implementation has been generally slow [ ]. as members of their respective institutional community, scholars are frequently subject to regulatory and legal requirements affecting their organization. some of these may not reflect current digital innovations. in australia, for example, the copyright act is being reviewed because it is generally considered that the exceptions and statutory licenses in the act are neither adequate nor appropriate in the digital environment.
. . . strategies for overcoming barriers
in implementing an enterprise-wide strategy, the institution needs to understand both the current and future research needs of its diverse research community. the challenge is to overcome the natural tendency within organizations to have a siloed approach.
therefore, in universities both it and the library need to partner with key academic groups to understand the current and future research information needs of a diverse university research community. from this understanding will be derived relevant operational processes and policies. addressing organizational processes is critical to such outcomes as facilitating information flow and the retention of valuable assets. from a governance perspective, some universities have implemented, for example, an advisory group, chaired by the vice president, academic research, (or corresponding role) to ensure input from across all the disciplines. proposals to build infrastructure to support new models of scholarship must necessarily be aligned with the strategic goals of the institution. it is, therefore, useful to develop discussion papers for senior executives that articulate an institutional vision within the context of the broader, external environment, including opportunities, and propose a corresponding digital strategy. each institution’s vision will of course be influenced by a range of drivers, which will vary across the sector [ , ]. there is a need to ensure “senior university awareness of, and engagement in, national (and international) research information infrastructure opportunities … if the most effective investment decisions are to be taken” [ ]. this can only happen if the institution itself is clear about its digital strategy. once the vision and strategy have been articulated, the next step would be to establish the gaps between the current state and the desired state so as to know where to focus effort. a combination of self-assessment and benchmarking against good practice could be used to inform this fact-finding, even if good practice is still being defined for some areas. for example, the institution could gather hard data on how much research data is stored within the university, what kind of data it is, where it is held, and whether it is at risk.
analysis done at this stage may lead to revised processes, governance models and policies not only within the organization quadrant but also across the other three quadrants. as a result of the strategies outlined above, common problems will be identified, from which there will be an opportunity to partner with key stakeholders to address them. the library is ideally positioned to take a lead role in drawing together the various stakeholders to develop institutional strategies to respond to new drivers [ ]. maccoll [ ] observes, “the library should be knowledgeable about knowledge, and should be the main authority on the campus about the ways knowledge is generated and transmitted through all of the disciplines it contains”. it is well positioned to work with diverse groups. at griffith university (australia) the division of information services (includes the library) has capitalized on the development of a researcher profile system to work with key stakeholders on research data management guidelines for the university and to initiate the development of data archives and related systems. in addition to further developing relationships with the researchers involved, this initiative has resulted in an improved communication network within the division itself as well as within the university as a whole, especially with the office for research and with major research centers [ ]. finally institutions should support opportunities to participate in national and international initiatives, which are tackling the high-level challenges discussed in this paper. the research data alliance [ ], for example, is developing enablers—social and technical—to facilitate the sharing of open data.
. . built infrastructure
the traditional approach to meeting it infrastructure demand is to build it within the institution.
developing solutions for scholarly communities internal and external to the institution requires a mixture of local and cloud infrastructure and internal and external services. for example, institutional-based communication and collaboration technologies such as video conferencing are used in combination with consumer-based solutions, such as skype, to facilitate scholarship. institutions need to develop blended infrastructure solutions to support scholarship beyond the enterprise gateway, which operates in hidden sub-surface ecosystems [ ]. from the scholar’s perspective, ideally the underlying technologies should operate seamlessly.
. . . current approaches and constraints
infrastructure planning approaches have tended towards developing built infrastructure within the institution. however, this has changed in recent times to leverage the cost savings and efficiencies of using cloud infrastructure, whether as infrastructure as a service (iaas), software as a service (saas) or platform as a service (paas). this has seen the use of externally hosted solutions for email and other corporate systems. however, this approach still tends to develop single purpose solutions and does not take into account that there are now many services available to scholars outside of the institution, especially for collaboration and storage. on the one hand, the institution may provide an enterprise solution for video conferencing or working storage, while on the other hand most scholars may be using instead common tools, such as skype or dropbox. for those institutions attempting to address this problem a variety of technical issues present themselves with regard to integration and lack of industry standards. another constraint is the institutional and system view of identity management. generally institutions have identity management systems, which are restricted to current employees or user accounts created through some internal process.
this is problematic to scholars seeking secure mechanisms to share data and information with a community, which has members across many institutions. in some institutions there is no coordinated central planning around these core services. central support for specialized scholarly applications is generally limited. the increasing use of information and communications technology (ict) to respond to demands of scholarship—including online learning, managing and analyzing big data, and support for large communities of researchers—has been driven from the discipline level, frequently through the use of grant funding. institutional support for many of these bespoke solutions is limited because of lack of sustainable resourcing coupled with the fact that the solutions have been developed to service communities across many institutions. without an institutional “champion” providing ongoing support, many of these solutions are at risk. there are also many scholars who are not well serviced with solutions to meet their specific research needs. in such cases, institutional funding does not tend to provide ongoing sustainable resources to develop and maintain these, e.g., lack of skilled staff to redevelop in-house desktop modeling software tools to meet the needs for a researcher to run larger or faster models on amazon web services. this also occurs at the hardware level since many institutions have standardized infrastructure to drive down costs and leverage economies of scale in purchasing and support. this approach comes at the expense of flexibility. researchers often require non-standard solutions for their purposes, such as software necessitating a non-standard hardware set up or a high speed network connection or the ability to host an application for their project community to use. all these constraints act as barriers for scholars seeking to leverage it to increase their capacity or capability or to easily collaborate with colleagues across the globe. 
central planners need to regard their infrastructure as a node in a global it ecosystem rather than as just local, physical infrastructure built only for use within their own institution. many institutions attempt to address these issues by deploying new products only to find that uptake is low. two common reasons are that (a) these projects tend to be limited in scope to just deploying the product and are not well resourced to assist end-users to leverage the product for their own use; and (b) the end-users themselves have little incentive to allocate time to learn how the new tools and services could be used to their advantage.
. . . strategies for overcoming barriers
the increasing use of cloud-based solutions has seen a new approach to designing and building it and information architectures to support more flexibility, thereby allowing for easier integration and coupling of systems. this is also reflected in the development of relevant international standards. there is also evidence of increasing collaboration to resolve some of these problems. for example, new solutions are emerging or are being addressed to manage identities whether between institutions (e.g., the australian access federation [ ]) or resolving multiple identities to a single individual (e.g., through an institutional and/or open researcher and contributor id (orcid) [ ] identity). institution planners will need to actively participate in these activities. national and international solutions are emerging for the sharing of resources through collaboration and partnerships. for example, in the australian research environment, key stakeholders—including funding bodies such as the australian research council and the national health and medical research council, research institutes and universities—have recognized the need to share knowledge.
in building its national collaborative infrastructure, the australian national data service has utilized a federated approach, which supports multiple layers: research data australia [ ] aggregates at the national level data about australian research that has already been aggregated at the local level. individual institutions are partnering and, in some cases, taking lead roles in developing new community-based solutions, such as the participation in the aforementioned research data alliance or the development of national solutions, e.g., biodiversity & climate change virtual laboratory [ ]. institutional decision makers will need to decide when it is best to collaborate and when the solution should be an institutionally owned investment. institutions will also need to decide how many resources should be allocated to support scholars using technology developed elsewhere. the current approach to infrastructure planning is to plan from the center. new models will need to be developed in which disciplines and communities can articulate their infrastructure needs to be accommodated within the central resource allocation. this will require a planning model based on partnering between infrastructure providers and key stakeholder groups within the respective institutions. this will require the identification of key stakeholders within the institution. it is already apparent that institutions need to preserve research outputs beyond the traditional text-based outputs, e.g., primary research data sets, software, video, and websites. this will require a closer working relationship with researchers.
. . digital artifacts
digital artifacts are the digital assets, i.e., the content, required to develop new knowledge. these need to be created/acquired, stored, delivered, and used as appropriate in a timely manner while respecting the terms of the artifact owner.
the institution has a responsibility for making these artifacts available in formats and context suitable for interpretation, analysis and manipulation for scholars across disciplines, internal and external to the institution.

current approaches and constraints

the typical approach to preserving digital artifacts is to address this issue at the end of the project, at a time when resources are limited, enthusiasm is waning and there is no incentive to curate and clean up the data for preservation. in addition, there are few, if any, skilled people available with the curation and preservation skills required to assist the researchers in making these artifacts available in formats and context suitable for discovery, interpretation, analysis and manipulation for scholars across disciplines and across the globe.

there is a specific issue in relation to properly describing and preserving details around ip, licensing and other regulatory, legal and compliance materials such as ethics clearances. unfortunately licensing and data sharing agreements tend to be on a project-by-project basis. the preservation task would be less onerous if licensing or data sharing were mandated at the government or institutional/agency level. additionally, within institutions all these materials tend to reside in different repository siloes, e.g., legal agreements are retained in corporate record archives but the data collected as a result of such agreements is stored elsewhere with no cross referencing.

generally institutions do not have expertise or strategies in digital preservation.
there are a number of reasons for this, such as:
• the needs to date have not warranted it
• the level of resources required, e.g., refreshing old formats is both labor-intensive and expensive
• the evolving nature of the problem (e.g., archiving of digital material regarded as a storage problem under the responsibility of the it department who traditionally do not have preservation/archival expertise), and
• lack of institutional policies and guidelines about what to retain and what to discard.

in support services, there is a lack of processes to intervene in the research lifecycle to provide assistance at the right time. for example, processes to alert support services when a researcher is:
• drafting a research grant proposal, so as to assist in developing a data management strategy, or
• about to publish a paper, so as to check whether there is supporting data that needs to be published.

there is also a constraint in that some artifacts, by their inherent nature, do not lend themselves easily to sharing or to typical preservation processes, for example, a piece of software or a very large data set [ ].

strategies for overcoming barriers

global movements in open scholarship, data, and access have the potential to address many of the licensing and ip issues. these movements are increasingly being supported with mandates from funding agencies (e.g., wellcome trust, national science foundation, australian research council) about preserving data, open models, linking publications to grants to data, etc. with big questions now being asked universally in public forums (e.g., recent studies outlining how much money universities are paying for major publisher subscriptions), there is promise that some of the problems at the institutional level will be resolved at the national and international level. additionally, in many countries governments themselves are moving towards open data in relation to administrative data [ – ].
while this may not resolve ip and licensing problems with legacy artifacts, it does open the doors for much simpler preservation and access mechanisms for the future. preservation services are now being developed through national initiatives, e.g., national collaborative research infrastructure strategy (ncris) [ ] in australia, or within discipline communities or through third party offerings, e.g., dryad [ ] and figshare [ ].

while all of the above provides an encouraging picture for the preservation of scholarly outputs, it does not resolve the current inefficiencies and ineffectiveness of how data is captured, manipulated and finally stored and reused. an analogy can be made here with the industrial design response to recycling, in which product design is increasingly moving to a cradle-to-cradle approach, i.e., you design the solution anticipating that the product will be used until end of life and will then be diverted to be used as input to other products [ ]. when scholars begin a project, they should be planning to capture, format, describe and preserve all the relevant data, inputs and outputs, from the activity so that they can be potentially used as inputs in future scholarly activities by other scholars. this will involve decisions about what needs to be kept and what can be deleted during and at the end of the project, based on knowledge about the data, e.g., what might be useful at some later date. the data retained may include data additional to what is cited in the published article from a specific research activity. for example, an article may only reference a subset of all the collected data.

the library within the institution is a logical potential source of skills and expertise in managing information and preservation.
to be effective, however, will require a common approach and partnership with the it department to (a) supply the hardware infrastructure to meet the needs either provisioned locally or through the cloud; (b) to leverage external services as needed, e.g., those funded by governments or specific discipline data repositories; and (c) collaboration and partnering with other institutions to develop cost-effective solutions. this will also need to extend to better engagement with key stakeholders within the institution so as to have better points of intervention. . . people the final component is the role of people as key to an infrastructure solution for digital scholarship. not only do institutions need to develop the skills required in their support service personnel but also they have a role in developing the skills of scholars themselves in new methods and processes to create, manage and use born-digital artifacts. this also extends to building awareness about emerging organizational structures such as global standards and licensing frameworks. scholars also need to find and join networks specific to their domain and gain an understanding of the global context of their research activities, e.g., lingual and cultural impacts, and opportunities for collaboration and data sharing. . . . current approaches and constraints lynch [ ] regards the biggest challenge for universities as the design and staffing of organizations that will work with academics to access local, national and global cyberinfrastructure services, assisting faculty to manage their data, prepare for handoff for curation and aiding them in data reuse, mining, and computation. while lynch may have been thinking of big data—and particularly as it applies to the sciences—attributes such as volume, variety and complexity are equally applicable to all disciplines. 
in the organization quadrant, we discussed the importance of “openness”, which characterizes much of the work being done in digital scholarship, and the importance for the institution of having a clearly articulated high-level commitment to this concept. that said, the idea of sharing resultant research outputs or collaborating in research is not necessarily part of the culture of every discipline, let alone individual researchers. this has been highlighted by the ahrc [ ] in their view that “there will be greater need to bring arts and humanities researchers together to influence the context in which they work; to build consortia, cross-disciplinary networks and multi-funder partnerships; and to support individual researchers to forge stronger relationships with academics overseas”.

as also discussed within the organization quadrant, if there is not a high-level recognition of the value of digital scholarship to the institution, then there may be little incentive for a researcher to innovate or set aside time to develop new methods and techniques to improve the processes for creation, management and analysis of born-digital artifacts. this lack of knowledge and skills may extend to a researcher’s potential lack of knowledge of international standards and frameworks as well as a lack of appreciation of the global context for their research activity.

in addition there is the overarching problem of the failure to break the text-based paradigm as the mode of delivery of publishing research [ ]. researchers may not set aside time to leverage either new channels or mediums, e.g., social media, or all the potential outputs from their research, such as publishing data [ ].

support within the it and library support services is typically internally focused, i.e., delivering services and resources supplied by the institution.
how do these support services become familiar with the offerings from sources external to the institution, whether they are widely used consumer based technologies and services or specific solutions developed within particular disciplines of scholars? this can be difficult to achieve when support services are constrained by internally developed standards, guidelines and policies, which do not allow for more loosely coupled solutions that blend or integrate internal and external offerings. . . . strategies for overcoming barriers having a good understanding of key partners with whom scholars are collaborating would assist institutional planners to further refine their service offerings. for example, for institutions beginning to engage with asian countries, institutional service planners may need to incorporate a response into their service delivery approach, e.g., providing services and resources in a number of specific languages, or addressing communication network and collaboration technology problems specific to those partner countries to better facilitate collaboration. assisting scholars to further develop their skills will require better targeted outreach services within the institution by addressing the specific needs of different cohorts (e.g., phd students, early career scholars) and discipline groups (e.g., humanities, medicine). having a well-defined and communicated vision and strategy at the institutional level should cascade down to the internal group and individual level and lend itself to skills development planning exercises. this could also lead to institutional initiatives to address specific issues, for example, the establishment of centers of expertise, such as the university of southern california’s centre for transformative scholarship [ ], and columbia university libraries’ digital social science center [ ] and digital humanities center [ ]. better business intelligence would also be useful in targeting support service resources. 
for example, data librarians should know when scholars begin projects or when they publish in order to proactively intervene at key points to assist researchers during the research cycle. this would require the gathering of data and developing of processes to provide information to support services in a timely manner.

while the library may be able to assist with preservation, much of the knowledge and expertise required to decide what needs to be preserved and how it should be described and formatted for potential research still remains within the specific discipline. developing the knowledge and skills base will require either information professionals acquiring discipline knowledge (e.g., a discipline data librarian) or scholars acquiring this knowledge and skills as part of their professional development.

changing the behavior of individuals, the culture of institutions and developing new skills takes time and unfortunately is typically slower than the rapid change of technologies and globalization, which are driving these changes. strategies will need to be developed to address the varying levels of awareness and skill sets within the institution.

legacy infrastructure

star and ruhleder [ ] make the point that infrastructure is typically developed from an installed base, inheriting both strengths and limitations from that base. new systems are designed for backward compatibility; failing to account for these constraints may be fatal or distorting to new development processes. while star and ruhleder were mainly referring to built infrastructure, the same constraint applies to other components of the infrastructure. for example, infrastructure impacted could be artifacts licensed under old licensing regimes or data sets preserved using obsolete standards or formats or, as mentioned above, staff with out-of-date skills. a key challenge for institutional planners is responding to these legacy issues without compromising their vision.
conclusions the impact of disruptive technologies has been examined on research practice and publishing, with particular reference to the changing world of digital scholarship. if scholars are to maximize the full potential of new models for the creation and dissemination of their research, their parent institutions have a responsibility to build supporting institutional infrastructure and to validate digital scholarship as a valued model of scholarship. the proposed model is based on the concept that technology development should have a synergistic relationship with organizational structure, people and digital artifacts. as each institution builds infrastructure in this new environment, it becomes a node in a global knowledge ecology based on open values. empowering scholarship through the creation of new knowledge, approaches and technologies is a work in progress, which requires profound change in practice. juxtaposed is the continuation of conservative, traditional practices within highly complex, competitive academic reward systems. although not without tension, this evolving landscape should be viewed as a period of transition in which scholars, institutions and other actors ultimately validate new models of scholarship. publications , author contributions malcom wolski and joanna richardson contributed equally to the literature review and the development and analysis of the proposed model. the content of this paper is solely the responsibility of the authors. conflicts of interest the authors declare no conflict of interest. references . georges, g.; goldsmith, s. leading social innovation. innovations , , – . . austrom, d.r.; lad, l.j. problem-solving networks: towards a synthesis of innovative approaches to social issues management. acad. manag. proc. , , – . . jordan, t.; andersson, p.; ringnér, h. the spectrum of responses to complex societal issues: reflections on seven years of empirical inquiry. integr. rev. , , – . . sandland, r. 
introduction to ands. share: newsl. austr. natl. data serv. , . available online: http://ands.org.au/newsletters/newsletter- - .pdf (accessed on july ). . facilitating interdisciplinary research and education: a practical guide; derrick, e.g., falk- krzesinski, h.j., roberts, m.r., eds.; american association for the advancement of science: washington, dc, usa, . . committee encouraging corporate philosophy; mckinsey. shaping the future: solving social problems through business strategy; cecp: new york, ny, usa, . . christensen, c.m. the innovator’s dilemma: when new technologies cause great firms to fail; harvard business review press: boston, ma, usa, . . bull, g.; ferster, b.; kjellstrom, w. connected classroom-inventing the flipped classroom. learn. lead. technol. , , . . disabato, n. publication standards part : the fragmented present. dyn. publ. . available online: http://alistapart.com/article/publication-standards-part- -the-fragmented-present (accessed on february ). . weller, m. the digital scholar: how technology is transforming scholarly practice; bloomsbury: london, uk, . . arts & humanities research council. the academic book of the future: call for proposals; ahrc: swindon, uk, . available online: http://www.ahrc.ac.uk/funding-opportunities/ pages/future-of-the-academic-book.aspx (accessed on february ). . mierzejewska, b.i. disruptive innovations in book publishing-threat or opportunity? int. j. bk. , , – . . owens, s. is academic publishing industry on the verge of disruption? . available online: http://www.usnews.com/news/articles/ / / /is-the-academic-publishing-industry-on-the- verge-of-disruption (accessed on february ) publications , . daniels, j.; thistlethwaite, p. engaging academics and reimagining scholarly communication for the public good: a report; the graduate center, cuny, justpublics@ : new york, ny, usa, . available online: http://library.gc.cuny.edu/aleph/jp % report% final% .pdf (accessed on april ). . welcome to oapen-uk. 
jisc collections: london, uk. available online: http://oapen- uk.jiscebooks.org (accessed on april ). . managing research data; pryor, g., ed.; facet: london, uk, . . faniel, i.m.; zimmerman, a. beyond the data deluge: a research agenda for large scale data sharing and re-use. int. j. digit. curat. , , – . . gigascience. available online: http://www.gigasciencejournal.com (accessed on april ). . dryad digital repository. available online: http://datadryad.org (accessed on april ). . arts & humanities research council. the human world. the arts and humanities in our times. ahrc strategy – ; ahrc: swindon, uk, . . diamond, r.m. defining scholarship for the twenty-first century. new dir. teach. learn. , , – . . boyer, e.l. scholarship reconsidered: priorities of the professoriate; carnegie foundation for the advancement of teaching: princeton, nj, usa, . . beattie, d.s. expanding the view of scholarship: introduction. acad. med. , , – . . purdy, j.p.; walker, j.r. valuing digital scholarship: exploring the changing realities of intellectual work. profession , , – . . rumsey, a.s. new-model scholarly communication: road map for change. in scholarly communication workshop ; university of virginia: charlottesville, va, usa, – july . available online: http://www.uvasci.org/wp-content/uploads/ / /sci -report.pdf (accessed on february ). . ayres, e.l. does digital scholarship have a future? educause rev. , , – . . pearce, n.; weller, m.; scanlon, e.; kinsley, s. digital scholarship considered: how new technologies could transform academic work. education , , – . . mission statement. university of southern california center for transformative scholarship: los angeles, us. available online: http://transformative.usc.edu/?page_id= (accessed on april ). . edwards, p.n.; jackson, s.j.; chalmers, m.k.; bowker, g.c.; borgman, c.l.; ribes, d.; burton, m.; calvert, s. knowledge infrastructures: intellectual frameworks and research challenges; deep blue: ann arbor, mi, usa, . 
available online: http://hdl.handle.net/ . / (accessed on february ). . wilbanks, j. new metaphors in scientific communication: libraries and the commons. in proceedings of the international association of scientific and technological university libraries (iatul ), stockholm, sweden, – july . . new oxford american dictionary; oxford university press: oxford, uk, . available online: http://www.oxfordreference.com/view/ . /acref/ . . /m_en_us ?rskey=wwvkn &result= (accessed on april ). publications , . edwards, p.n.; jackson, s.j.; bowker, g.c.; knobel, c. understanding infrastructure: dynamics, tensions and design; national science foundation office of cyberinfrastructure: arlington, va, usa, . available online: http://hdl.handle.net/ . / (accessed on april ). . monteiro, e.; pollock, n.; hanseth, o.; williams, r. from artefacts to infrastructures. comp. support. comp. w. , , – . . borgman, c.l. scholarship in the digital age; mit press: cambridge, ma, usa, . . organisation for economic co-operation and development. frascati manual: proposed standard practice for surveys on research and experimental development, th ed.; oecd: paris, france, . . qs world university rankings. available online: http://www.topuniversities.com/university- rankings (accessed on july ). . schmidt, b.; kuchma, i. implementing open access mandates in europe: openaire study on the development of open access; universitätsverlag göttingen: gottingen, germany, . available online: http://goedoc.uni-goettingen.de/goescholar/bitstream/handle/ / /oa_ mandates.pdf?sequence= (accessed on july ). . maron, n.l.; pickle, s. sustaining the digital humanities host institution support beyond the start-up phase; ithaka s+r: new york, ny, usa, . available online: http://www.sr.ithaka.org/sites/default/files/sr_supporting_digital_humanities_ f.pdf (accessed on july ). . o’brien, l. innovation and information infrastructure: making sound investments for e-research. ecar res. bull. , , – . . luce, r.e. 
a new value equation challenge: the emergence of eresearch and roles for research libraries; council on library and information resources: washington, dc, usa, . . maccoll, j. library roles in university research assessment. liber. q. , , – . . wolski, m.; richardson, j.; rebollo, r. shared benefits from exposing research data. in proceedings of the nd annual iatul conference proceedings, warsaw, poland, may– june . available online: http://www.bg.pw.edu.pl/iatul /proceedings/ft/wolski_m.pdf (accessed on july ). . research data alliance. available online: https://rd-alliance.org/ (accessed on july ). . services overview. australian access federation: brisbane, australia. available online: http://aaf.edu.au/services/services-overview/ (accessed on july ). . orcid (open researcher and contributor id). available online: http://orcid.org/ (accessed on july ). . research data australia. available online: https://researchdata.ands.org.au/ (accessed on july ). . biodiversity & climate change virtual laboratory. available online: http://www.bccvl.org.au/ (accessed on july ). . barbour, v. plos, orcids and article level metrics. available online: http://www.youtube.com/ watch?v=o bp khrqj (accessed on july ). . data.gov. us general services administration: washington, dc, us. available online: https://www.data.gov/ (accessed on july ). . data.gov.uk. available online: http://data.gov.uk/ (accessed on july ). publications , . data.gov.au. available online: http://data.gov.au/ (accessed on july ). . national collaborative research infrastructure strategy. available online: https://www.education.gov.au/national-collaborative-research-infrastructure-strategy-ncris (accessed on july ). . about figshare. figshare: london, uk. available online: http://figshare.com/ (accessed on july ). . braungart, m.; mcdonough, w. cradle to cradle: remaking the way we make things; north point press: new york, ny, usa, . . lynch, c. 
the institutional challenges of cyberinfrastructure and e-research. educause rev. , , – . . digital social science center (columbia university libraries). available online: http://library.columbia.edu/locations/dssc/about.html (accessed on july ). . mission and program of the digital humanities center. columbia university libraries: new york, ny, us. available online: http://library.columbia.edu/locations/dhc/about/program_mission.html (accessed on july ). . star, s.l.; ruhleder, k. steps toward an ecology of infrastructure: design and access for large information spaces. inf. syst. res. , , – .

© by the authors; licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /).

life on the outside: collections, contexts, and the wild, wild web – discontents

september / tim sherratt

keynote presented at the annual conference of the japanese association for the digital humanities, september , tsukuba. the full set of slides is available on slideshare. cross-published on medium.

this is tatsuzo nakata. in he was living on thursday island in the torres strait, just off the northern tip of australia.

from the late th century there was a substantial japanese population on thursday island, mostly associated with the development of the pearling industry.
i’ll admit that i know very little about tatsuzo, and i’ve selected him more or less at random from a large body of records held by the national archives of australia. i present him here out of context and in too little detail, simply as an example. working backwards from this photograph i want to restore some layers of context and reveal to you a complex and shameful history.

this photograph was attached to an official government form called a ‘certificate exempting from dictation test’. from the form we learn that the year-old tatsuzo was born in wakayama. he had a scar over his right eye.

tatsuzo carried a copy of this form with him when he departed for japan aboard the yawata maru in may . when he returned the following year the form was collected and compared with a duplicate held by port officials. the forms matched, and tatsuzo was allowed to disembark. to help confirm his identity, the form carried on its reverse side an impression of tatsuzo’s hand.

you might think that this was a travel document, an early form of visa perhaps. but at the top of the form you’ll notice a reference to the immigration restriction act, a piece of legislation introduced by the newly-federated australian nation in . the immigration restriction act and the complex bureaucratic procedures that supported its administration came to be known more generally as the white australia policy.
if tatsuzo had tried to return to australia without one of these forms, he would have been subjected to the dictation test, and he would have failed. despite its benign-sounding name, the dictation test was a form of racial exclusion aimed at anyone deemed non-white. no-one was meant to pass. if he hadn’t carried this form exempting him from the dictation test, tatsuzo would most likely have been denied re-entry.

this certificate is drawn from one of more than , files in series j in the national archives of australia. this series is solely concerned with the administration of the white australia policy. there are many other series from other ports and other time periods full of documents like this. the national archives holds many, many thousands of these certificates documenting the lives and movements of people considered out of place in a white australia.

photographs, forms, files, series, legislation: this small shard of tatsuzo’s life is preserved as part of a racist system of exclusion and control. but what happens when we extract the photos from their context within the recordkeeping system and simply present them as people?

i’ve created a site where you can explore some of the records relating to japanese people held in series j . instead of navigating lists of files, you can start with faces, with the people, not the system.

i’m starting today with tatsuzo and this wall of faces because what i want to explore are some of the complexities of context.

shark attack!

after a series of fatal shark attacks in australian waters, the community of port hacking, in southern sydney, began to wonder if they too were at risk.
in january the local newspaper published an article under the heading ‘shark “cover up” in port hacking’ alleging that research into the dangers had been suppressed. ten days later the newspaper followed up with details of the area’s only recorded fatal shark attack in . a local government member, it reported, had ‘unearthed the article on trove’.

‘it’s long been a story that a boy was killed by a shark at grays point many years ago’, he said, ‘i knew about it to years ago but if you talk to people around here, nobody knows about it’. ‘a lot of people say there are no sharks in port hacking but this is rubbish’, he added.

let me reassure anyone thinking about coming to dh in sydney next year that shark attacks are extremely rare. what interested me about these articles was not the risk of gruesome death, but the relationship between past and present. the question of whether shark attacks were possible could be answered simply by searching trove.

trove

for those who don’t know, trove is a discovery service developed and maintained by the national library of australia. like europeana, the digital public library of america, and digitalnz, it aggregates resources from the cultural heritage sector, and beyond.

it also provides access to more than million newspaper articles from onwards. the articles are drawn from over different titles (large and small, rural and metropolitan), with more being added all the time. search for just about anything and you’re likely to find a match of some sort amongst the digitised newspapers.
so of course i searched for tsukuba…

trove is also a community. users correct the ocr’d text of newspaper articles. they also add thousands of tags and comments to resources across trove.

[slide: numbers of trove users, tags, comments, corrections, and lists]

perhaps my favourite example of user-generated content on trove are the lists. lists are pretty much what they sound like: collections of resources. they make it easy for you to save and share your research. but more than tags or comments they expose people’s interests and passions. they give some insight into the many acts of meaning-making that occur in and around trove.

lists are also exposed through trove’s application programming interface (api) in a form fit for machine consumption. so with just a dash of code i can harvest the titles of all public lists and do some very basic word frequency analysis courtesy of voyant tools.

there’s nothing too surprising here: we know that family historians are our largest user group. but we can also see the long tail in action, the way that huge collections like trove can support very focused, specific interests. which leads me back to shark attacks.
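the word frequency step mentioned above can be sketched roughly as follows. this is only a minimal illustration, assuming the list titles have already been harvested into a plain python list; the function name, the tiny stopword set and the sample titles are all hypothetical, and the sketch neither calls the trove api nor reproduces what voyant tools does.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword set; a real analysis would use a fuller list.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def word_frequencies(titles):
    """Count how often each non-stopword appears across a list of titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and treat runs of letters/apostrophes as words.
        words = re.findall(r"[a-z']+", title.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts


# Made-up titles standing in for harvested public list titles.
sample_titles = [
    "Smith family history",
    "Family of John Smith",
    "Shark attacks in Sydney",
]

freqs = word_frequencies(sample_titles)
for word, count in freqs.most_common(5):
    print(word, count)
```

even on a toy sample like this, words such as "family" rise to the top while one-off interests sit in the tail, which is the pattern the word cloud makes visible at scale.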
old speak

the port hacking article made me wonder how many other web pages there might be out on the wider web that cited trove newspapers in a discussion of shark attacks. the answer was many. but what was most interesting wasn’t the volume of references, it was the variety of contexts — in blog posts, on facebook, in fishing forums. ‘ahh, old time newspapers are fascinating things aren’t they?’, notes one post in a weather forum, citing details of a shark attack in sydney from . on a fishing site, a thread on bull shark attacks in western australia’s swan river begins: ‘i found a great website to view really old newspapers in perth. just found a few swan river shark storys [sic]…’.

the author follows up with a direct link to the trove search page, prompting the exchange:

redfin life: ‘haha you would never know there had been that many incedents in the swan without seeing these…’
goodz: ‘oh how newspapers have changed the way the write… love the old speak!’
alan james: ‘that’s right goodz, and more often than not i’m sure they actually reported the truth.’

so a discussion of shark attacks turns to a consideration of the changing style of newspaper reporting. perhaps even more interesting is the way that digitised newspapers are used to test a hypothesis, challenge an interpretation, or argue a case. as in the port hacking case, questions about the history of shark attacks can be explored without needing to turn to experts, history books, or official statistics.
so when a local politician is quoted as saying ‘there have not been any serious or fatal shark attacks at coogee beach since records commenced in the s’, a reader can respond with two trove newspaper citations and the comment: ‘no previous shark attacks? or are they only searching for fatalities?’ when a media outlet asks its facebook followers whether the export of live sheep from western australia might be increasing the number of shark attacks off the coast, one follower can simply share a trove link to a newspaper article from and ask ‘did they have live sheep export in ?’

i don’t want to argue that these interactions are particularly profound or remarkable. in fact i’d suggest that they’re interesting because they’re not remarkable. million digitised newspaper articles chronicling years of australian history are just another resource woven into the fabric of online experience. the past can be mobilised, shared and embedded in our daily interactions as easily as pictures of cats.

traces

and it’s not just shark attacks. to explore the variety of contexts in which trove newspaper articles are used and shared, i started mining backlinks.

backlinks, as the name suggests, are just links out there on the wild, wild web that point back to your site. you can find them in your referrer logs, in google’s webmaster tools, or simply by searching. i started with a ‘try before you buy’ sample of backlinks from an seo service. from there i wrote a script to harvest the linking pages, remove duplicates, extract the newspaper references, retrieve the article details from the trove api, and save everything to a database for easy exploration. you can play with the results online.
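
the middle steps of that pipeline (removing duplicates and extracting the newspaper references) might look something like the python below. it is a rough sketch under stated assumptions rather than the script itself: the two url patterns are assumed shapes for links to trove newspaper articles, the page urls and article ids are made up, and fetching pages, querying the trove api and writing to a database are all left out.

```python
import re

# Assumed URL shapes for links to Trove newspaper articles; real pages may use others.
ARTICLE_LINK = re.compile(
    r"https?://(?:nla\.gov\.au/nla\.news-article|trove\.nla\.gov\.au/ndp/del/article/)(\d+)"
)


def extract_article_ids(html):
    """Return the unique article ids linked from a page, in order of first appearance."""
    ids = []
    for article_id in ARTICLE_LINK.findall(html):
        if article_id not in ids:
            ids.append(article_id)
    return ids


def collect_references(pages):
    """Map each page URL to the article ids it cites.

    Page URLs that have already been processed are skipped, and pages
    without any recognisable article links are left out of the result.
    """
    seen_urls = set()
    references = {}
    for url, html in pages:
        if url in seen_urls:
            continue
        seen_urls.add(url)
        ids = extract_article_ids(html)
        if ids:
            references[url] = ids
    return references


# Hypothetical harvested pages: (page URL, page HTML).
pages = [
    (
        "http://example.com/sharks",
        '<a href="http://nla.gov.au/nla.news-article123">one</a> '
        '<a href="http://trove.nla.gov.au/ndp/del/article/456">two</a> '
        '<a href="http://nla.gov.au/nla.news-article123">one again</a>',
    ),
    ("http://example.com/sharks", "<p>duplicate of the first page</p>"),
    ("http://example.com/cats", "<p>no newspaper links here</p>"),
]

references = collect_references(pages)
print(references)
```

keeping the ids in order of first appearance makes it easier to match a citation back to the place on the page where it was made.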
i ended up harvesting pages from domains containing , links to , articles in trove. remember that’s just a sample of all the links to trove newspapers out there on the web. what was more surprising than the raw numbers was the diversity of content across those pages. i knew that family and local historians were busily blogging about their trove discoveries, but i didn’t know that trove newspapers were being cited in discussions about politics, science, war, sport, music — just about any topic you could imagine. nor are these discussions just about australia. a little quick and dirty analysis suggests that more than languages are represented across those pages.

this is a work in progress. i hope to expand my hunt for traces — crawling sites for additional references, mining referrals, and inviting the public to nominate pages for inclusion. by adding a simple api i could make it possible for trove to include links back to relevant pages, like trackbacks on a blog. i also want to understand more about the scope of the content and the motivations of its authors.

what is going on here? undoubtedly some of these pages constitute link spam or attempts to game search engines, but most do not. browsing the database you find many examples of interpretation, persistence, and passion.
people around the world have something they want to say, something they want to share, and trove’s millions of newspaper articles provide them with a readily-accessible source of inspiration and evidence. it’s clear that those many small acts of meaning-making we can observe in trove’s activity statistics extend beyond a single site — to a much much wider (and wilder) world.

scale

one day earlier this year, trove received more than three times its usual number of visitors.

the culprit was the wtf subreddit — a popular place for sharing the weirdities of the web. someone posted a link to a trove newspaper article describing the unfortunate demise of a poodle called cachi, whose fall from a thirteenth-story balcony in buenos aires resulted in the deaths of three passers-by. as well as causing a dramatic spike in trove’s visitor stats, the post received more than votes and attracted comments on reddit. cachi was a hit.

trove articles pop up regularly on reddit. the traffic spikes they bring are reminders that however proud we might be of our stats, we are but a tiny corner of the web. there’s something much bigger out there. michael peter edson has long sought to alert cultural heritage organisations to the challenges of scale. in a recent essay he described the web’s ‘dark matter’:

there’s just an enormous, humongous, gigantic audience out there connected to the internet that is starving for authenticity, ideas, and meaning. we’re so accustomed to the scale of attention that we get from visitation to bricks-and-mortar buildings that it’s difficult to understand how big the internet is—and how much attention, curiosity, and creativity a couple of billion people can have.
libraries, archives and museums, he argues, need to meet the public where they are, to recognise that vigorous sites of meaning-making are scattered across the vast terrain of the web. trove newspaper traces and reddit spikes are mere glimpses of the ‘dark matter’ of cultural activity that lurks beneath the apps, the stats, and the corporate hype. people are already using our digital stuff in ways we don’t expect. the question is whether libraries, archives and museums see this hunger for connection as an invitation or a threat. do we join the party, or call the police to complain about the noise?

sharing

there’s something fundamentally human about sharing. yes, it’s easy to mock the shallowness of a facebook ‘like’; to see our obsession with followers, friends and retweets as evidence of our dwindling capacity for attention — reducing engagement and understanding to a single click. but haven’t we always shared — through stories, gossip, jokes, performances, and rituals?

rather than being measured against a threshold of meaning, surely each act of sharing exists on a continuum from the flippant to the philosophical. just because the act of sharing has been commodified by large social media services seeking to mine our preferences for profit, doesn’t mean it lacks deeper human significance. a retweet can represent a fleeting interest, a brief moment of distraction. but it can also mark the start of a journey.

cultural heritage institutions around the world have begun to recognise that sharing is not just a marketing strategy, it’s a mission.
as merete sanderhoff notes in her foreword to the anthology sharing is caring:

when cultural heritage is digital, open and shareable, it becomes common property, something that is right at hand every day. it becomes a part of us.

aggregation services, like trove, the digital public library of america, europeana, and digitalnz, bring resources together to share them more easily with the world. aggregation is only worthwhile if it serves discovery and reuse — it’s a process of mobilisation, rather than collection. as europeana argues in their strategy:

we believe culture is a catalyst for social and economic change. but that’s only possible if it’s readily usable and easily accessible for people to build with, build on and share.

of course the hard part is understanding what makes something ‘readily usable and easily accessible’. what balance do we need between push and pull? between ease-of-use and technical power? between licensing and liberty? between context and creativity?

busy bots

the mechanical curator was born in the british library labs as part of their innovative digital scholarship program. in september , she started posting to tumblr random images automatically extracted from a collection of , digitised th century books. it was, ben o’steen explained, an experiment in ‘providing undirected engagement with the british library’s digital content’. the book illustrations moved from inside to outside, opening opportunities for discovery beyond the covers.

but that was just the beginning. a few months later the mechanical curator dramatically expanded its labours, uploading more than a million public domain images to flickr.
what followed was something of a cultural feeding frenzy as people from all over the world started sharing, tagging, collecting, and creating with this rich assortment of th century illustrations. since then the images have been mashed up into new works, added and organised in the wikimedia commons, and featured in an installation at the burning man festival in nevada.

having been locked away within books for more than a hundred years, the illustrations were given new life online as works in their own right. opportunities for innovation and expression were created by a rupture in context.

meanwhile on twitter, a growing army of bots was liberating items from cultural collections around the world. inspired by the bot-making genius of mark sample, i created @trovenewsbot in june to tweet newspaper articles from trove. he was joined by @dplabot, @europeanabot, @kasparbot, @curtinlibbot, @digitalnzbot, @museumbot, @cooperhewittbot, @bklynmuseumbot, and no doubt others — all sharing random collection items. of course @mechcuratorbot soon joined the fray from the british library, and i eventually added @trovebot to tweet material from all the non-newspapery sections of trove.

the possibilities of serendipitous discovery are receiving increasing attention within the digital humanities.
at dh , kim martin and anabel quan-haase critically examined four dh tools — including @trovenewsbot — in the light of existing models of serendipity. their discussion noted that randomness is not the same as serendipity, and outlined how serendipity could be understood as a type of encounter with information. i do wonder though if what makes the bots interesting is not randomness as such, but the way randomness can play around with our assumptions about context.

steve lubar observes that the random offerings of collection bots can also expose the choices that are made in the creation and display of cultural collections. randomness can challenge our expectations. describing the genesis of the mechanical curator, james baker notes:

and so as what at first seemed simple descends into complexity the mechanical curator achieves her peculiar aim: giving knowledge with one hand, carpet bombing the foundations of that knowledge with the other.

the trove bots i created do more than tweet random offerings, they also allow you to interact with trove without ever leaving twitter. send a few keywords their way and they’ll do your searching for you, tweeting back the most relevant result.
you can modify their default behaviour by adding a series of hashtags — #luckydip, for example, will spice your result with a touch of randomness. more interestingly, perhaps, you can tweet a url at them and they’ll extract keywords from the web page and use them to construct the search.

this means that @trovenewsbot can offer commentary on current events. several times a day he retrieves the latest headlines from a news site and searches for something similar amidst trove’s million historical newspaper articles. what emerges is a strange conversation between past and present.

these bots do not simply present collection items outside of the familiar context of discovery interfaces or online exhibitions, they move the encounter itself into a wholly new space. just as the mechanical curator liberates illustrations from the printed page, the twitter bots loosen the institutional context of collections to allow them to participate in a space where people already congregate. they send collection items out into the wilds of the web, to find new meanings, new connections and perhaps even new love.

broken & repaired

but letting go can be scary. a survey of libraries, archives and museums revealed that one of the main factors inhibiting the opening up of online collections was the desire to avoid misrepresentation, mislabeling or misuse of cultural objects. easy sharing brings the risk that our carefully curated content will be shorn of context and bounced around the web — adrift and abused.
earlier this year sarah werner took aim at twitter feeds that pump out streams of ‘historical’ photos — unattributed and often wrongly captioned. but it wasn’t simply the lack of attribution that angered her:

these accounts capitalize on a notion that history is nothing more than superficial glimpses of some vaguely defined time before ours, one that exists for us to look at and exclaim over and move on from without worrying about what it means and whether it happened.

i have to admit that the excitement of seeing trove’s visitor numbers suddenly soar thanks to reddit is frequently tempered by the realisation that what is being shared is yet another story of gruesome death, violence, or misfortune. years of australian history is reduced to clickbait by our tabloid sensibilities. most of those who arrive from reddit read the article and click away — the bounce rate is around %. this is not ‘engagement’?

and yet, i can’t help but wonder about the % who don’t immediately leave, who pause and look around. three percent of a lot is still a lot — a lot of people who might have been exposed to trove and australian history for the very first time.

similarly while the viral pics industry is frustrating and exploitative, it might yet offer opportunities to learn. one of my favourite twitter accounts is @picpedant. it monitors many of the viral pics feeds, researches the images, and tweets the results — providing a steady stream of attributions, corrections, critiques, and context. not only do you find out about the images, you pick up research tips, and learn about the cannibalistic tendencies of the pic bots themselves — constantly recycling content from their kin.
@ahistoricalpics offers a different form of education, satirising the whole viral pics genre with its fabricated captions, and pricking at our own inclination to believe. freeing collections opens them to misuse, but it also exposes that misuse to analysis and critique. contexts can be rediscovered as well as lost, restored as well as broken.

generous signposts

it’s wonderful to see many trove newspaper articles shared on twitter. unfortunately a significant proportion of these come from climate change deniers, who mine the newspapers for freak weather events and past climatic theories, imagining that such reports undermine current research. this is bad science and bad history. their efforts are also well-represented in my database of web page citations, along with expressions of hatred and prejudice that i’d prefer to stay submerged.

it’s depressing, but it seems inevitable that people will do bad things with your stuff. in a recent post about the dpla’s metadata licensing arrangements, dan cohen suggested we should look beyond technical and legal controls around online use towards social and ethical guidelines:

the cynics, of course, will say that bad actors will do bad things with all that open data. but here’s the thing about the open web: bad actors will do bad things, regardless… the flip side of worries about bad actors is that we underestimate the number of good actors doing the right thing.
bad people will do bad things, but by asserting a social and ethical framework for the use of digital cultural collections we strengthen the resolve and commitment of those who want to do right. already there are examples in the work of the local contexts project which is developing a series of licenses and labels to guide use of traditional knowledge and cultural materials. similarly, creative commons aotearoa new zealand have been developing an indigenous knowledge notice to educate the public about what constitutes appropriate use.

we should remember too that footnotes have always been at the heart of an ethical pact. the australian historian tom griffiths has described footnotes as ‘honest expressions of vulnerability’ — ‘generous signposts to anyone who wants to retrace the path and test the insights’. this ‘professional paraphernalia’ has, he argues, grown out of a series of ethical questions:

to whom are we responsible – to the people in our stories, to our sources, to our informants, to our readers and audiences, to the integrity of the past itself? how do we pay our respects, allow for dissent, accommodate complexity, distinguish between our voice and those of our characters?

such questions remain crucial as we consider the relationship between cultural collections and their online users. if we expect people to erect ‘generous signposts’ we have to make our stuff easy to find and share. if we want them to consider their responsibility to the past we should focus on providing trust, confidence, and support, not permission.

responsibilities

if my wall of faces seems familiar, it might be because a few years ago i created something similar called the real face of white australia. the two walls use different sets of records, but they were constructed in much the same way: i reverse-engineered the national archives’ online database, downloaded images of digitised files, and used a facial detection script to identify and extract faces.
the real face of white australia was an experiment, built over the course of a weekend. but its discomfiting power was immediately evident. where there had been records, there were people — looking at us, challenging us.

my partner kate bagnall is a historian of chinese-australia and we were working together on a project called invisible australians, aimed at liberating the lives of these people from the bureaucracy of the white australia policy. the project was motivated by a strong sense of responsibility — not to the national archives, not to the records, but to the people themselves.

we often talk about preserving context as if it’s an end in itself; as if context is just a set of attributes to be catalogued and controlled. the exciting, terrifying, wonderful thing about the wild, wild web is how it upsets our notions of relevance and meaning. historic newspapers can find their way into contemporary debates. century-old illustrations can be remade as art. twitter bots can inspire conversations with collections. the people buried inside a recordkeeping system can be brought at last to the surface. contexts are unstable, shifting. and through that instability we can glimpse other worlds, we can imagine alternatives, we can build something new.

what’s important is not training users to understand the context of our collections, but helping them explore and understand their responsibilities to the pasts those collections represent. let’s remove technical barriers, minimise legal restrictions, and trust in the good will of our audiences. instead of building shrines to our descriptive methodologies, let’s create systems that provide stable shareable anchors, that connect, but don’t constrain.
contexts will flow and mingle, some will fade and some will burn. contexts will survive not because we demand it in our terms of service, or embed them in our interfaces, but because they capture something that matters. the ways we find and use cultural collections will continue to change, but questions about responsibility, value, and meaning will remain.

. tom griffiths, ‘history and the creative imagination’, history australia, vol. , no. , .

this work is licensed under a creative commons attribution . international license.

written by: tim sherratt. i'm a historian and hacker who researches the possibilities and politics of digital cultural collections.
humanities

article

romanian users’ youtube “shakespeares”: digital localities in global fields

adriana mihai

english department, faculty of foreign languages and literatures, university of bucharest, bucharest, romania; adriana.mihai@lls.unibuc.ro

received: march ; accepted: april ; published: april

abstract: the cultural production of “shakespeare” on the internet has received growing attention in recent years, particularly in reference to newcomers in the field such as media users or “prosumers”.
this is potentiated by the connectivity of digital platforms and growing access to digital means of production and distribution. from a field perspective on digital cultural production, participation online can be seen as a socially situated activity, often differentiated and marked by the habitus of digital media users. the present article aims firstly to discuss the utility of a field conceptualization of the cultural production of shakespeare online, with reference to the work of pierre bourdieu. secondly, through a critical framework derived from cultural and media economies, this article analyses romanian appropriations of shakespeare’s works on youtube as cultural productions inscribed in both the global digital economy and also in the local cultural field. thirdly, based on an overview of romanian producers who have published shakespeare videos and on the analysis of their visibility and circulation online, as well as their chosen genres and discourses, i argue that the romanian digital (re)production of shakespeare is situated at the periphery of both the national and the global digital field. user-made shakespeare productions are yet to find valuation in the local cultural and media fields, being situated in an illegitimate location for appropriating shakespeare. this will be contextualized in the larger discourse regarding the fundamental role of curation, but also in light of recent concerns about privileged locations, languages, and algorithmic bias across online cultural production, circulation, and consumption.

keywords: participation; shakespeare online; field of cultural production; youtube; romanian users; local

1. introduction

user-made videos appropriating shakespeare’s texts have been produced and distributed on youtube for more than a decade.
the global platform has provided an unprecedented, albeit far from universal, access to means of video distribution, concentrating a profusion of productions created by newcomers alongside institutional and corporate “shakespeares”. the works resulting from users’ cultural participation have been selected and valued in scholarly and curatorial discourse as forms of vernacular creativity and performances of identity politics (o’neill ); with respect to their storytelling genres (desmet ; lanier ) and converging media (calbi ); and in terms of the various “offline” cultural traditions they inherit such as pop culture appropriation practices (lanier ). such steps in legitimizing user-made videos as cultural objects through a set of evaluative criteria and frames of interpretation have also taken more informal routes that include blogs and “grassroots intermediaries” (jenkins et al. , p. ) such as luke mckernan’s bardbox, jeremy fiebig’s the shakespeare standard, and lisa starks’s “shakespeare friends” facebook community. situating shakespeare users outside institutional walls has meant seeing them in relation to youth culture and to a variety of interpretive communities who employ common production and consumption practices.

humanities , , ; doi: . /h www.mdpi.com/journal/humanities

these fragmented, multiple, and often incidental relations between users have unveiled an online discourse on shakespeare that “accommodates the far-reaching permutations of a network of linguistic, aesthetic, and cultural associations” (fazel and geddes , p. ). of course, there are a few problems with regard to participation in the shakespeare network.
alongside impediments to participation in the digital culture as a whole (van dijk ), it has previously been shown how youtube’s political economy can take over the agency in making cultural and user associations (o’neill a). the online attention economy can also determine whether producers maximize their chances of gaining views and engagement, competing under a neoliberal logic characterized by institutional and corporate practices (lehmann and way ). in examining romanian users who have appropriated shakespeare on youtube, i argue that weaker ties with the shakespeare network are symptomatic not only of the platform’s political economy and heightened competition, but also of a digital cultural economy which falls short in legitimizing local cultural production. posing language and retrieval barriers to international curators, romanian users’ works are absent from local blogs and online cultural magazines as well. their appropriation practices are overwhelmingly class assignment parodies of romeo and juliet, an online trace of the play’s canonical presence in some of the local literature and english high-school curricula. if english-speaking users produce an abundance of creative responses to shakespeare’s texts, some of which are attracting a large viewership, digital curators’ attention, and even awards, local production is rather scarce and users barely manage to stir the interest of any interpretive community in appropriating shakespeare. shifting the focus from the many to the few, this article aims to contextualize user-made appropriations in a cultural field framework, bringing forth the “global cultural field” (massai , p. ) model in conceptualizing, this time, the “world-wide” cultural production online. the dynamics between a global digital economy and the local symbolic economy are here explored by analyzing romanian producers’ legitimation process (bourdieu , pp. 
– ) in a local context where shakespeare has overwhelmingly belonged to academic and theatrical fields. more specifically, my analysis focuses on producers’ positions in relation to both the local and the digital fields they enter by firstly discussing the utility of a field conceptualization of digital cultural production for digital shakespeare. then, local users’ appropriations will be contextualized in the digital economy and in the local cultural economy, where shakespeare’s position as high culture triggers few, if any, interpretive communities’ interest in pop culture shakespeare. against scarce legitimizing agents and audiences, i discuss the positioning of three romanian producers of youtube shakespeare, taking into account their views, choice of genre, and discourse: the short performance parody of the balcony scene in romeo si julieta varianta scurta romaneasca:)), uploaded on the umor romanesc channel (umor romanesc ); the mash-up music video or vidding parodie romeo și julieta (glogovetan ), created by user ioan glogovetan; and the satirical performance vlog sonnet -william shakespeare (gherman ), created by comedian silviu gherman. while these choose media and comedy subgenres that are globally consecrated in appropriating shakespeare on youtube, local audiences are yet to consume and legitimize works which interfere with shakespeare’s cultural authority.

2. shakespeare users in the digital field of cultural production

even if a canonical author has a high status and is used as a source of authority in the cultural field being invoked “as a source of legitimation by all of the participating rival groups” (sela-sheffy , p. ), their role as a generative model in the current cultural production is not a given, not even in the case of a highly adapted author such as shakespeare.
this means that a canonical status can ensure the circulation of newly-derived cultural artefacts on the market as shock-absorbers in changing cultural paradigms, but this does not automatically attract valorisation or producers’ recognition in the market. these processes require field-specific criteria, strategies of differentiation, and knowledge of who occupies dominant positions in the specific cultural field.

the field model provides a dynamic unit of analysis for digital shakespeare, seen here as a cultural production in the making, because its structural laws around the dialectic of cultural distinction are able to accommodate, with difference, current media and cultural settings. shakespeare is produced and consumed relationally, across multiple platforms and affinity spaces online, becoming a site of struggles between institutions, private organizations, and individual social actors over “the power to impose the dominant definition” (bourdieu , p. ) of shakespeare’s meaning in global digital media. similar to digital fields of art production (goriunova ) or music (suhr ), digital shakespeare has been reigniting scholarly debates about the definition of what is and what is not shakespeare, and over who gains and who loses legitimacy and authority in digital scholarship and practices (carson and kirwan ), as well as in participation (rumbold ; o’dair ). my own attention to local user-made appropriations is a position-taking in itself, inscribed through the available positions within the convergence of the academic fields of shakespeare and digital humanities. the legitimization of youtube appropriations in the global shakespeare community of interest, for instance, has evolved from their categorization under general labels such as “fun stuff” (the internet shakespeare editions recommendations), “other links” (the shakespeare resource center) or “other sites” (mr.
william shakespeare and the internet) to the abovementioned digital curation practices, including stephen o’neill’s selection of youtube adaptations available on his own channel. in , the shakespeare standard’s bardie awards conferred the new zealand-based retelling of love’s labour’s lost, entitled lovely little losers, the award for the best shakespeare-inspired series (third annual bardie awards ). in bourdieusian terms, agents of consecration have begun to legitimize these works for their contribution to the shakespeare network and to shakespeare as a site of cultural production (holderness ). importantly, such curatorial interest for youtube shakespeare has been shown in other communities and industries as well, from online entertainment magazines to blogs targeting young audiences and fandoms such as bustle, odyssey, and hellogiggles. the latter have reviewed several web series (coronado ; epley ; kadner ), including nothing much to do, a much ado about nothing adaptation; kate the cursed, a canadian adaptation of the taming of the shrew; and jules and monty (romeo and juliet). the australian series shakespeare republic has received numerous awards from film industry festivals dedicated to web film productions (awards and screenings – ). such symbolic processes of evaluating productive participation are essential in the digital economy, which seldom manages to reward internet users’ work financially. as media theorists have shown, much of what is produced in digital media is “outside the market economy” (bolin , p. ) and rather confined to political, social or cultural economies. recent studies on what users look for when uploading videos on youtube point to cultural and social gratification needs such as “giving information, self-status seeking, and social interaction motives” (khan , p. ). thus, users primarily aim for symbolic capital, which in turn can be transformed, following bourdieu’s field logic, in economic gains in the long run. 
digital intermediaries such as youtube provide free access for publishers and consumers, but for a creator to receive revenues, they must set up a channel, publish regularly, produce plenty of viewing hours and attract subscribers, so that youtube can start placing advertisements on their videos. the revenue coming from advertisements differs from country to country and is unequally divided between the content creator and youtube. for newcomers, the textual and production work involved in publishing original content often becomes part of what australian queer web-series creator emma keltie called “authorized participation” (keltie , p. ), referring to that which is temporarily allowed by the culture industry, only to be “colonized” thereafter and used to reinforce the industry’s dominance. given the necessary resources in producing videos on a regular basis and competing with corporate and institutional production, keltie argues that participation is at best temporary, unless a creator is recognized by the culture industry and can continue producing being supported through different avenues. while the possibility to compete with dominant industries is limited, web . does allow for novel “bids for meaning and value” (jenkins et al. , p. ) for vernacular productions, performed by active audiences in circulating media content themselves across social media platforms. reminiscent of fandoms’ practices, users engage in social distribution practices by spreading media content for different personal, cultural, political or economic purposes. the extent to which this form of “user-circulated content” (jenkins et al. , p. ) can legitimize producers’ cultural works very much depends on the audience with whom users share those goods: in niche communities of interest or of practice, such as the case of fan communities or the abovementioned shakespeare groups, productions can enter a process of reviewing and valuation according to the community criteria.
if shared with their wider social network, however, the work becomes part of identity performance: in this case, audiences often think more about what the person circulating the work is trying to communicate about themselves, such as personal taste, personal experience, mood, or views, than about the production or producer of the work. thus, in order for newcomers to feel legitimated, it is important that their works be circulated either extensively or in meaningful locations where communities of interest exist. such strategies require social media users to have both digital skills and an up-to-date knowledge of the cultural field. the role of dominating search engines and digital intermediaries in consumption needs to be taken into consideration as well, since they impose dominating genres, as well as structures and algorithms with an editorial role in retrieving and ranking search results. in addition to the echo chamber and filter bubble effects shown to constrain cultural consumption in social networks, digital intermediaries’ algorithms retrieve content which suits the internet users’ consumption habits and interests, reproducing what the user is likely to be interested in or enjoy, so as to reward search engine use. the first time i searched for romeo și julieta (the romanian translation for romeo and juliet) and activated the “geographical location” filter, along with the “relevance” and “romanian language” filters, only two results appeared: one was the national television theatre production from , digitized by the romanian television corporation; the other was the geeky blonde’s video from the condensed shakespeare series produced in the united states. given that i previously knew several theatre trailers promoting local stage productions of romeo și julieta, not seeing them in my results indicated that the geographical filter was not reliable.
interestingly, after searching for such trailers specifically and viewing several romanian productions digitized by theatres and television companies, a repeated keyword search following the same initial query and filters displayed all the videos i had watched, including the original two results. our most recent consumption practices determine, through youtube’s algorithmic systems, what we find and consume next, such hierarchies challenging views or the hopes of a decentred video archive capable of producing “disjunctive shifts among cognitive frames of reference” (desmet ; see also o’neill ). the frames of reference reflect not only what we already know, but also our most recent preferences recorded in the algorithms, leading to an unreliability of search results and to a transfer of agency over what we consume to a digital platform. as such, digital intermediaries are not only gatekeepers of information online, but also potential reproduction machines of consumption preferences and habits. without a previous cultivation and digital literacy, internet users are unlikely to extend their cultural consumption beyond the familiar boundaries, drawn by interests, topics, and narratives, but also linguistic and geographical areas. in interpreting romanian shakespeare appropriations as cultural objects, the ineffective filter confirms their distribution in a global flow where location and borders might lose their importance (o’neill b, p. ) to such an extent that they become digital forms of “archival silence” (huang , p. – ). 
out of the search results displayed after employing several keyword combinations, including “name of the play in romanian”, “language” and “shakespeare” or “name of the play in romanian” and “famous lines” (such as “to be or not to be”), and excluding digitized video heritage, the following is evident: objects were opera and theatre trailers, were the aforementioned parodies, were romeo and juliet video essays, were commercials, were pop songs, was a comedian’s performance vlog of sonnet , and was a television production summarizing hamlet, using the thug notes style and format. from these, i have chosen three user-made appropriations based on their representative value in terms of genre and discourse from the available productions.

3. appropriating shakespeare in the local field

the valuation of local media appropriations had a legitimizing force for their inclusion in the global cultural field. the main criteria in valuing such works were, nevertheless, their “significant role in mass culture [ . . . ], as well as in more traditional sites of cultural production” (massai , p. ). as argued above, while broadcasting on a global digital platform, shakespeare users still rely on more traditional processes of legitimation in order to gain the type of symbolic capital they are looking for. their digital works can either be valued in consumption, attracting internet users’ views and engagement, or in production, by the media industry’s formal and informal agents or by other web or shakespeare producers from their respective communities, according to bourdieu’s field model. for romanian appropriations of shakespeare on youtube, their productive participation meets a few limitations in this respect: they rely on the production and consumption practices of local internet users, as they address a romanian-speaking audience.
even when they remove language barriers and perform in english, as in the case of silviu gherman’s rendition of sonnet , or just use images and music, as the romeo and juliet parody does, youtube’s algorithm makes it difficult for them to reach a global audience. moreover, there are currently very few local blogs and cultural magazines curating online cultural production and, when they do, they mostly select what has already been socially curated on facebook, the dominating social network used by romanians. part of the romanian independent cultural journalism scene potentiated by digital media, vice romania, scena , or sub magazines address the educated youth, the cultural omnivores sometimes labelled as “hipsters” who consume both high and popular culture. following the rising popularity of his videos among youths, for instance, scena interviewed a year-old youtube video user who transformed the typical literary analyses of romanian canonical authors used for the high-school baccalaureate exam into rap songs. comedian silviu gherman was also interviewed by several industry websites in relation to his youtube parodies of romanian cultural and political authority figures, which had attracted a larger user engagement, but his shakespeare appropriation was never discussed. local curators’ lack of interest in user-made shakespeare videos and users’ own limited creative responses to shakespeare are not the result of a rejection of hybridity or of web genres. they are rather a result of shakespeare’s position both in local cultural production fields, namely shakespeare as text in translation, theatre, and specialized academic study, and also in consumption fields, dominated as they are by attending theatre productions, as well as by viewing or listening similar productions via mass-media. in romania, shakespeare has resisted appropriation by the youth culture consuming media products. 
although the state-owned romanian television and radio have been producing inside their own studios shakespeare adaptations since the s in the programs known as the national television theatre and national radio theatre, their filmed or recorded studio performances are theatrical events, which borrow the “value criteria and professional hierarchies” (munteanu , p. ) from the theatrical field in order to give actors and stage performances a mass public exposure. television theatre and radio drama have been meant to educate mass audiences and popularize stage productions, continuing the dominant emphasis on shakespeare’s theatricality. a handful of corporate commercials using shakespeare as hypotext got closer to global media industries’ practices in capitalizing on shakespeare’s brand, locally coding him however as a “paragon of high culture [ . . . ] connoting elitism, sophistication and specialist knowledge” (colipcă-ciobanu , p. ). more recently, the media industry, similar to stage directors’ attempts (nicolaescu ) to attract younger audiences by remediating popular media, aimed to retell shakespeare’s hamlet and other canonical literary works with more accessible language, references, characters, and plots for local (digital) media consumers: in , a private television (prima tv ) produced and aired a show called s-o dăm carte în carte cu dorian, which imported the format and style of the youtube web series thug notes. in spite of its relatively good reception when uploaded on youtube, primarily due to the celebrity of its presenter, a well-known comedian in the urban stand-up scene, the show was cancelled after its first season. against this local cultural background in which shakespeare has occupied high positions in the field, to which the disconnectedness from more global audiences is added, user-made romanian
in the following section, i will focus on three case studies which productively respond to shakespeare by tapping into audiences’ consumption of comedy. although mass-media comedy is a popular genre of cultural production, with its own hierarchical structure reflected in the following works as well, romanian internet users are reluctant to enjoy irreverent treatments of the bard.

4. romanian users’ parodies, illegitimate?

a relevant particularity in local cultural consumption is that romanians are less than half as likely to watch video content online from commercial or sharing services as other e.u. and u.k. citizens, preferring to a much higher extent to participate in social networks, to listen to music or to read online news (eurostat ). facebook’s video feature has also hampered users’ desire to step into a different video platform and discover new content. although some emerging artists, influencers, and journalists do create content on youtube and promote it via facebook, the platform is dominated by pop music record labels and television shows. users’ cultural production on youtube in romania is also yet to attract academic interest; the only studies concerning the local use of this medium of production and distribution pertain to the fields of marketing and political communication. struggling with the dominance of facebook in the local web sphere, as well as with institutional and corporate channels on youtube, users who create shakespeare appropriations also compete with theatres’ marketing-driven trailers. trailers have a larger viewership than user-made videos, the most popular by far being the national opera in bucharest’s promotional trailer for tango. radio and juliet (opera națională bucurești ), a contemporary ballet retelling of romeo and juliet with radiohead’s music which gained , views, followed by petrică ionescu’s a midsummer night’s dream (teatrul național i.l.
caragiale ), attracting views, or the national theatre in timișoara’s demohamlet (teatrul național timișoara ), with views. theatres have started to use social media communication and genres in order to engage with their main audiences, who are typically young (aged – ), educated, and urban professionals. cultural consumption statistics show that the audiences attending theatre performances have the highest rate of cultural participation in all other cultural activities in the public sphere, including online (matei and hampu , p. ). their eclecticism in consumption guarantees the success of many stage performances making use of pop, rock or hip-hop music, with one of the most popular local adaptations of romeo and juliet being gérard presgurvic’s pop-rock musical, which since has been staged many times in bucharest at the national operetta and musical theatre ion dacian. distinguishing themselves from institutional shakespeare, users employ different genres and discourses in appropriating his works: rather than maintaining the source texts’ dramatic or poetic genres, users choose to rewrite romeo and juliet and sonnet in comedic genres, as do their international peers. compared to the more recent success of shakespeare web series (lanier ), however, romanian users create single shakespeare-related videos, which either remain their only video upload on the platform, or are followed by uploads of other humorous sketches and class assignments. although they appeal to the same audience category as theatres, few user-made youtube appropriations manage to attract more than a few hundred views. the first such video, representative of local classroom-inspired parodies of romeo and juliet, is entitled romeo si julieta varianta scurta romaneasca:)) ( views, no comments) and was posted by the channel umor romanesc, meaning “romanian humour”, which has subscribers and five other funny videos.
this particular video based on shakespeare’s play is a six-second comic twist of the iconic orchard or balcony scene in romeo and juliet, in which a teenager playing romeo attempts to signal his presence to juliet by throwing pebbles at her window. just as juliet hears the signals and prepares to open the window after obliviously reading a magazine in her room, romeo throws a heavier rock that lands on juliet’s face, instantly running away at the sight of his blunder. like most student-made romanian versions of this scene, the appropriation does not remain faithful to the original play’s setting and scene development, in which romeo sees juliet and overhears her sighs regarding having fallen in love with a montague. the user primarily replies, instead, to the universalist interpretation signifying a romantic wooing situation, where the scene’s original stakes, motivation and context at large can be easily replaced according to the user’s own preferences. the ubiquity of this scene in the cultural and educational flow makes it so familiar to local audiences that no actual lines need to be spoken or in-depth knowledge of the play be held in order for it to be recognizable. in this case, viewers’ expectations are defied when the sudden slapstick element aiming for a comic effect interrupts not only the anticipated teenage flirtation, but also the entire relationship and subsequent plot. relinquishing the suspension of disbelief as many parodies of the play do, this alternative early ending is also telling of the specific romanian humour that marks this version: a low burlesque, based on a character’s ridiculous failure in achieving the most rudimentary of tasks. such comedic tricks and characters are commonly used in folklore, as well as in entertainment television shows, appealing to a large audience with a low or medium level of education. their commonality, accessibility and entertaining value guarantee such shows’ viewership.
other local student-made videos use similar techniques in appropriating romeo and juliet, as a result of both the humour they consume from local pop media and their need to personally respond to aspects of the play they often find too antiquated, frustrating, or sophisticated. with regard to the balcony scene, characters’ metaphorical language and their rapid evolution to a marriage plan have a defamiliarizing effect upon local teens studying the play in school, the exemplified response managing to domesticate the scene. in spite of using pop culture recipes that work for young audiences, however, the attempt to bring shakespeare’s story to the realm of the burlesque has not managed to attract legitimation either in consumption, as it only attracted views and no user comments, or in production, as the video did not stir the interest of local curators and communities. the clash between a teenage joke and shakespeare makes such videos marginal. a similar intention to turn romeo and juliet into a parody is stated by user ioan glogovetan, who uploaded parodie romeo s, i julieta in , attracting since then views, likes, dislikes, and no comments. the user chose to create a mash-up parody in which he remediated scenes from franco zeffirelli’s classical film adaptation from , to which he added a variety of local music genres, including the most consumed pop and ethno music, but also the culturally illegitimate manele or “gangsta” pop (schiop ) to which i will return below. as a statement of a more knowledgeable user, the choice for zeffirelli’s film differs from the usual cinematic reference point in anglo-saxon youth culture, namely baz luhrmann’s romeo + juliet ( ). the purpose of using the production, bearing in mind the genre and music used to translate the plot and mood of the play in local terms, is precisely to signify elevated and prestigious shakespeare through its institutionally consecrated film adaptation. 
english textbooks used in romanian secondary education use stills from zeffirelli’s film in order to illustrate learning units on romeo and juliet. for a fairly careful selection of scenes providing a plot overview of the play in four and a half minutes, glogovetan meticulously chooses lyrics, rhythms, and genres from mostly local music that match or comment upon the characters’ dynamic and mood. for instance, over romeo’s melancholic longing for rosaline, a nostalgic ethno song is dubbed, the user tapping into the illustrative power of its lyrics: “i thought i wanted more/i didn’t know how to listen to you/and i’m dying from missing you/my love is all i’ve got left and i’m dying from missing you” (nek feat. shusanu and mr. juve, author’s translation). a joyful, upbeat ethno love song is used to convey the relationship between juliet and the nurse, transitioning towards reggaetón and manele rhythms from the capulets’ ball onwards: “dance, dance, and show them what you can do, come and dance just with me, i want to spite all others” (mr. juve and bobo) to dub the lovers’ first encounter, while the manele lyrics “here come the cops, taking all my stuff/i hide the dollars and the marks” (liviu mateș, vine poliția) are played when the nurse catches the lovers and pulls juliet away.

the position of music genres in the local cultural field determines, in this case, the position of glogovetan’s parody in the fields of consumption and production. in terms of consumption, ethno music is considered the rural version of pop music, and is nevertheless highly consumed across all ages and levels of education. conjoining ethno and pop music with “highbrow” shakespeare, such parodies can have a widespread entertainment value, and appeal to large audiences. in using manele as well in his dubbing, however, the user brings shakespeare to the realm of contested, illegitimate culture, given the genre’s position in the local field.
composed and performed by roma ethnics, maneaua (pl. manele) is a music genre which took off mid s and gained popularity during the s and early s as a mixture of romani traditional folklore, electronic instruments, and turkish, serbian, or bulgarian pop. lyrics are stylistically simple and, at times, vulgar or grammatically faulty, addressing themes of interest derived from the urban ghetto: clandestine businesses and relationships, love, sex appeal, pain, friends, and foes. initially performing for roma and lower classes’ community events, such as parties, weddings, or funerals, singers started to be promoted by record labels and media channels, attracting upper and middle class criticism both of their subject matter (ideologies such as consumerism, conformism, misogynism, tribalism) and also of their behavioural models (such as trickery for getting ahead, showing off material possessions, or cheating). ever since the anti-manele discourse began to dominate the public sphere, in particular through conservative intellectuals’ downright rejection of the genre, mass media has largely excluded the artists from mainstream radio and television channels, pushing them to the margins, as a peripheral cultural practice belonging primarily to the poor and uneducated or to roma communities. maneaua has since been associated with bad taste, lack of education, and being kitsch, deemed as an undesired product of post-communist eclecticism. arguing against the cultural field’s dismissal of the genre, leftist groups have started to appropriate such music, maintaining that what keeps manele peripheral in the cultural field is actually the result of structural racism towards the artists and their primary target audience (schiop ), rather than of the subject matter, ideology, and lyrics style, since these resemble “gangsta” rap or reggaetón themes and lyrics. 
the latter genres, comparatively, are global genres which have been popular in romanian cultural consumption, thus differently evaluated in the field. even though, more recently, manele singers have started to shift from a lower to more middle class production, blending with mainstream genres and importing their scenarios and imagery, their uses in parodies allude to their contested cultural capital in order to mock or lower the status of the parodied subject. on youtube, there are a few parodies of romeo and juliet using manele and a collage of shakespearean lines retold in such rhythms and language style, tapping into the difference between a high cultural authority and “low” cultural practices in the local field. this class discourse is more explicitly shown in the final example to be showcased here, sonnet -william shakespeare. this has gained more views and user engagement than the previous ones ( views, likes, dislikes, comments), given silviu gherman’s ongoing vidding activity attracting its own community and the aforementioned online magazines’ interest. the user is one of the few emerging artists committed to producing online videos, using crowdfunding websites such as patreon in order to support his work. gherman is producing parodies and satirical videos on both facebook and youtube, and his channel presently has , subscribers, with a frequency of at least one video a week on facebook and different monthly web series episodes on youtube. with a mission statement of “providing education and humour to a godforsaken and blunted population” (gherman’s youtube channel), his humour addresses a niche audience, with journalists repeatedly describing his style as distinct from pop and commercial entertainment, and not digestible by mass audiences: dry, absurd, unexpected, self-ironic, and critical, most of the time infused with elements of ridicule and satire spoken with a straight face in every-day domestic or public settings. 
gherman challenges cultural and social dominants in manner and discourse, requiring previous cultivation in order to decipher his critical undertones. in , he became well known to a larger category of social network users outside the restricted artistic community, with a video parody of dan puric (gherman ), an actor working at the national theatre in bucharest who in the past years has become a public spokesman of orthodox nationalism by giving speeches and interviews in mainstream media, holding conferences all throughout the country and publishing books about romanian national identity. without explicitly addressing political views, gherman parodied the actor’s bombastic, exalted language and amateur mix of approximate quotations from historical right-wing cultural authorities. in doing so, gherman attracted the interest of left-wing cultivated consumers frustrated by puric’s discourse. the video became viral on social media, reaching , views, a considerable figure for the local cultural battlefield. since then, he has also collaborated with established industry players who are interested in experimenting with the absurd and meta-language practiced by gherman in more popular productions.

in line with his anti-establishment and anti-elitist comedy, gherman’s youtube performance of sonnet is set in the artist’s own apartment hallway and kitchen, where he is filmed while reciting the sonnet in english to the camera, as if wisely explaining to the viewers what love is. the video provides english subtitles as well, making the verses easier to understand for romanian viewers. as he remarks “o no! it is an ever-fixed mark/that looks on tempests and is never shaken”, the camera shows him opening the refrigerator looking for food and, failing to find any, a bit pensively closing it.
he then casually proceeds to his trash bin where he finds a slice of bread, which he goes on to combine with mustard (“love alters not with his brief hours and weeks”) and chomp on right after uttering the last lines. the gap mentioned in the previous example regarding illegitimate, peripheral cultural practices and shakespeare’s elitism is even more explicitly shown by gherman, who rather widens it by choosing to contrast a display of poverty, scarcity, and insalubrious solutions to basic needs, with a snobbish, know-it-all rendition of a sophisticated shakespeare text. user comments show the different reactions to gherman’s humour. some viewers have asked him about drug procurement, implying the comedian’s level of absurdity can only be the result of drug or alcohol consumption, while others have ironically identified with the character’s practice of reciting poems while eating from the dumpster. others still comment on his use of shakespeare’s sonnet, praising gherman’s learning of a “pretentious text” in order to produce an amusing video. achieving a comical effect through the disproportionate gap, the coexistence between cultivation and poverty is interesting in this case, as it does not necessarily reject the sonnet’s value in the same way as gherman’s treatment of dan puric’s bombastic style does. rather, it reinforces shakespeare’s value, but with the purpose of commenting upon the class differences at play in the local cultural field itself, where emerging cultivated artists struggle with precarious working conditions and finances. moreover, it might well be considered a satirical reflection upon the status of the (web) artist in romania, putting into question the actual feasibility of romantically pursuing high culture when the need for resources is more stringent. 
gherman’s discourse on shakespeare’s position in the cultural field compared to local artists and their audiences can be inscribed in more recent public debates upon artists’ social security, as there is currently no public policy addressing artistic work in romania, or in the global discourse on cultural production online more generally where, as i already noted, participation becomes temporary due to a lack of legitimizing agents and rewards. the use of sonnet to tap into economic struggles could appeal to leftist, intellectual communities who might both recognize and understand shakespeare’s text and its politics too, and adhere to the discourse itself. however, given their low economic and political power in the local field, the position of gherman’s video is marginal, without any legitimizing agents to place a value on shakespeare’s association with peripheral cultural practices.

conclusions

this article has discussed local users’ participation in digital shakespeare as forms of cultural production in a digital field, thus supplementing network approaches and presentist analyses with a focus on the symbolic economy into which newcomers enter. youtube video producers can be legitimized either in consumption, by various audiences and communities of interest, or in production, by consecrating agents or other producers. if english-speaking producers have drawn digital and institutional curators’ interest within various fields, including shakespeare studies, the film industry, youth culture, or internet users at large, romanian producers are yet to find legitimizing agents for their shakespeare appropriations. i have argued, using cultural and media economy theory, followed by an overview of the local digital and cultural fields in which romanian “shakespeares” are produced, that local youtube appropriations are struggling to find their own interpretive communities and valuing agents due to a number of factors.
firstly, there are language barriers and increased competition with institutional and corporate content, as well as algorithm bias in reproducing consumer preferences, which hamper both global audiences from finding vernacular productions on the platform and also global formal or informal curators from evaluating them. secondly, the local field’s underdeveloped means of legitimizing digital cultural production largely leaves youtube videos to be valued in terms of consumption, which often occurs through social curation practices employed primarily on facebook. thirdly, the local coding of shakespeare as elite culture combined with the low presence of pop culture appropriations outside theatres means that youtube users enter a contested position from the onset. finally, based on a closer analysis of users’ practices in appropriating shakespeare, it is evident that a demystifying or class critique of the bard’s elite position springing from the emerging pop culture shakespeare in romania is locally struggling to find its own interpretive community and legitimizing agents.

the symbolic and, eventually, economic valuation of digital cultural practices is highly determined by their local institutional and discursive context. consequently, the extent to which global platforms can generate a global discourse on shakespeare, one that accommodates a flow of local cultural practices, seems rather limited for the time being. in spite of the accessible global media of production and distribution, online cultural production continues to be affected, if not entirely determined, by privileged locations and languages. in turn, the assimilation of global media genres and discourses into local users’ productive responses to shakespeare is ultimately evaluated in local digital and cultural fields.
given the obstacles of global curatorial practices in valuing local production, a closer interdisciplinary inspection of how local fields might variously factor in, foreground, and improve the evaluation of digital cultural practices needs to inform future research on the culturally specific iterations that shakespeare productions on youtube invariably take.

funding: this research received no external funding.

acknowledgments: an earlier draft of this article was presented at the international conference the circulation of shakespeare’s plays in europe’s borderland, university of bucharest, – november , and has benefitted from the feedback offered by andrei nae and petruța năiduț.

conflicts of interest: the author declares no conflict of interest.

references

awards and screenings. – . shakespeare republic. available online: http://shakespearerepublic.com/awards-screenings/ (accessed on october ).
bolin, göran. . value and the media: cultural production and consumption in digital markets. farnham and burlington: ashgate.
bourdieu, pierre. . the field of cultural production: essays on art and literature. cambridge: polity press.
calbi, maurizio. . spectral shakespeares: media adaptations in the twenty-first century. new york: palgrave macmillan.
carson, christie, and peter kirwan. . shakespeare and the digital world: redefining scholarship and practice. cambridge: cambridge university press.
colipcă-ciobanu, gabriela iuliana. . shakespeare in contemporary romanian advertising. cultural intertexts : – .
coronado, emily. . literary inspired web series to binge watch this summer. odyssey, july . available online: https://www.theodysseyonline.com/ -literary-inspired-web-series-to-binge-watch-this-summer (accessed on october ).
desmet, christy. . the art of curation: searching for global shakespeares in the digital archives. borrowers and lenders: the journal of shakespeare and appropriation . available online: http://www.borrowers.uga.edu/ /show (accessed on may ).
desmet, christy. . emo hamlet: locating shakespearean affect in social media. in broadcast your shakespeare: continuity and change across media. edited by stephen o’neill. london and new york: bloomsbury, pp. – .
epley, robin. . literary web series you should be watching based on your favorite book. bustle, july . available online: https://www.bustle.com/articles/ - -literary-web-series-you-should-be-watching-based-on-your-favorite-book (accessed on may ).
eurostat. . individuals - internet use (dataset). brussels: european commission. available online: https://ec.europa.eu/eurostat/web/digital-economy-and-society/data/database (accessed on january ).
fazel, valerie m., and louise geddes. . introduction: the shakespeare user. in the shakespeare user: critical and creative appropriations in a networked culture. edited by valerie m. fazel and louise geddes. cham: palgrave macmillan, pp. – .
gherman, silviu. . fă ca dan puric. youtube, march . available online: https://www.youtube.com/watch?v= lec mzyaagw&t= s (accessed on january ).
gherman, silviu. . sonnet -william shakespeare. youtube, march . available online: https://www.youtube.com/watch?v=_abn lakqea (accessed on october ).
glogovetan, ioan. . parodie romeo si julieta. youtube, april . available online: https://www.youtube.com/watch?v=nvtwkhype c (accessed on october ).
goriunova, olga. . art platforms and cultural production on the internet. new york and oxon: routledge.
holderness, graham, ed. . the shakespeare myth. manchester: manchester university press.
huang, alexander c. y. . global shakespeares as methodology. shakespeare : – . [crossref]
jenkins, henry, sam ford, and joshua green. .
spreadable media: creating value and meaning in a networked culture. new york and london: new york university press.
kadner, laura. . more literary web series you should be watching. hellogiggles, august . available online: https://hellogiggles.com/reviews-coverage/ -literary-web-series-watching/ (accessed on may ).
keltie, emma. . the culture industry and participatory audiences. cham: palgrave macmillan.
khan, m. laeeq. . social media engagement: what motivates user participation and consumption on youtube? computers in human behavior : – . [crossref]
lanier, douglas. . recent shakespearean adaptation and the mutations of cultural capital. shakespeare studies : – .
lanier, douglas. . vlogging the bard: serialization, social media, shakespeare. in broadcast your shakespeare: continuity and change across media. edited by stephen o’neill. london and new york: bloomsbury, pp. – .
lehmann, courtney, and geoffrey way. . young turks or corporate clones? cognitive capitalism and the (young) user in the shakespearean attention economy. in the shakespeare user: critical and creative appropriations in a networked culture. edited by valerie m. fazel and louise geddes. cham: palgrave macmillan, pp. – .
massai, sonia. . defining local shakespeares. in world-wide shakespeares: local appropriations in film and performance. edited by sonia massai. london and new york: routledge, pp. – .
matei, ștefania, and veronica hampu. . communities of public cultural consumption. in cultural consumption barometer. culture on the eve of the great union centenary: identity, heritage and cultural practices. edited by carmen croitoru and anda becuț marinescu. bucharest: universul academic publishing house, pp. – . available online: https://www.culturadata.ro/ -cultural-consumption-barometer-culture-on-the-eve-of-the-great-union-centenary-identity-heritage-and-cultural-practices/ (accessed on may ).
munteanu, ana maria. . shakespeare in the national radio theatre. silence as a figural event.
in shakespeare in romania – . edited by monica matei-chesnoiu. bucurești: editura humanitas, pp. – .
nicolaescu, mădălina. . remediating global media in recent shakespeare productions on romanian stages. messages, sages, and ages : – . [crossref]
o’dair, sharon. . ‘pretty much how the internet works’; or, aiding and abetting the deprofessionalization of shakespeare studies. in shakespeare survey. edited by peter holland. cambridge: cambridge university press, pp. – . [crossref]
o’neill, stephen. . shakespeare and youtube: new media forms of the bard. london and new york: bloomsbury.
o’neill, stephen. a. theorizing user agency in youtube shakespeare. in the shakespeare user: critical and creative appropriations in a networked culture. edited by valerie m. fazel and louise geddes. cham: palgrave macmillan, pp. – .
o’neill, stephen. b.
“in fair [europe], where we lay our scene”: romeo and juliet, europe and digital cultures. in romeo and juliet in european culture. edited by juan f. cerdá, dirk delabastita and keith gregor. amsterdam: john benjamins, pp. – .
opera națională bucurești. . “tango. radio and juliet” - opera naţională bucureşti. youtube, january . available online: https://www.youtube.com/watch?v=-yardiaa o (accessed on august ).
prima tv. . s-o dăm carte în carte cu dorian: despre “hamlet” | neam trezit. youtube, april . available online: https://www.youtube.com/watch?v= dsakwszxey (accessed on november ).
rumbold, kate. . from “access” to “creativity”: shakespeare institutions, new media, and the language of cultural value. shakespeare quarterly : – . [crossref]
schiop, adrian. . șmecherie și lume rea: universul social al manelelor, nd ed. chișinău: cartier, pp. – .
sela-sheffy, rakefet. . canon formation revisited: canon and cultural production. neohelicon : – . [crossref]
suhr, cecilia h. . social media and music: the digital field of cultural production. new york: peter lang.
teatrul național i.l. caragiale bucurești. . visul unei nopți de vară de william shakespeare, regia petrică ionescu, . youtube, january . available online: https://www.youtube.com/watch?v=suohrcakrga (accessed on august ).
teatrul național timișoara. . demohamlet.concert subcarpați | promo. youtube, march . available online: https://www.youtube.com/watch?v= yncaxxfdpm (accessed on august ).
third annual bardie awards. . the shakespeare standard. available online: http://theshakespearestandard.com/third-annual-bardie-awards/ (accessed on may ).
umor romanesc. . romeo si julieta varianta scurta romaneasca:)). youtube, march . available online: https://www.youtube.com/watch?v=qpgkdovkc (accessed on january ).
van dijk, jan a. g. m. . inequalities in the network society. in digital sociology: critical perspectives. edited by kate orton-johnson and nick prior.
hampshire and new york: palgrave macmillan, pp. – .

© by the author. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /).

american journal of computing research repository, , vol. , no. , -
available online at http://pubs.sciepub.com/ajcrr/ / /
© science and education publishing
doi: . /ajcrr- - -

word segmentation model for sindhi text

zeeshan bhatti*, imdad ali ismaili, waseem javaid soomro, dil nawaz hakro
institute of information and communication technology, university of sindh, jamshoro
*corresponding author: zeeshan.bhatti@usindh.edu.pk
received november , ; revised december , ; accepted january ,

abstract

this research addresses the problem of sindhi word segmentation and discusses various techniques for solving it. word segmentation is the preliminary phase of any tool based on natural language processing (nlp). for any system to understand written text, it needs to be able to break it into individual tokens for processing.
sindhi, being a cursive, ligature-based perso-arabic script, is quite complex and rich, with a large number of characters, each having multiple glyphs depending on its position in the text. in this paper a sindhi word tokenization model is proposed, implementing various algorithms that show the process of tokenizing sindhi text into individual words for corpus building and for creating a word repository for sindhi spell checkers, grammar checkers and other nlp applications. the problem of tokenization is resolved by first identifying the sentence boundaries and extracting each sentence into an isolated list, where each list element is a complete sentence. the segregated sentences are then broken down into words, with the hard space character used as the word boundary; soft spaces are considered part of the word and are thus ignored during segmentation. finally, each word is filtered to remove special characters and, after validation, converted and saved as a token.

keywords: word segmentation, sindhi tokenization, sindhi language, sindhi spell checker

cite this article: zeeshan bhatti, imdad ali ismaili, waseem javaid soomro, and dil nawaz hakro, “word segmentation model for sindhi text.” american journal of computing research repository , no. ( ): - . doi: . /ajcrr- - - .

introduction

the process of segregating a sentence into individual word tokens is termed word segmentation or tokenization [ ]. in natural language processing (nlp), tokenization or word segmentation is deemed the most fundamental task [ ]. almost every nlp application requires, at some stage, breaking its text into individual tokens for processing, for example in machine translation (mt) and spell checking [ , ]. in languages like english, where punctuation marks or white spaces are used to segregate words, tokenization is done by identifying word boundaries [ ].
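The three stages named in the abstract (sentence extraction, splitting on hard spaces while keeping soft spaces inside words, and filtering plus validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the set of sentence-ending marks, the use of U+200C (zero width non-joiner) to stand in for the "soft space", and the helper names `split_sentences`, `clean_word`, `is_valid` and `tokenize` are all hypothetical choices made for this sketch.

```python
import re
import unicodedata

# Assumption: illustrative sentence-ending marks ('.', '!', '?', ARABIC FULL STOP,
# ARABIC QUESTION MARK); the paper does not list the marks it uses.
SENTENCE_END = ".!?\u06D4\u061F"
HARD_SPACE = "\u0020"   # ordinary space, treated as the word boundary
SOFT_SPACE = "\u200C"   # assumption: zero width non-joiner stands in for the soft space


def split_sentences(text):
    """Return the list of non-empty sentences found in text."""
    parts = re.split("[" + re.escape(SENTENCE_END) + "]+", text)
    return [p.strip() for p in parts if p.strip()]


def clean_word(word):
    """Keep letters, combining marks (diacritics) and soft spaces; drop the rest."""
    return "".join(
        ch for ch in word
        if ch == SOFT_SPACE or unicodedata.category(ch)[0] in ("L", "M")
    )


def is_valid(token):
    """A token is accepted only if it contains at least one letter."""
    return any(unicodedata.category(ch).startswith("L") for ch in token)


def tokenize(text):
    """Sentence split -> hard-space split -> special-character filter -> validation."""
    tokens = []
    for sentence in split_sentences(text):
        # Normalise tabs and newlines to hard spaces so they also end a word.
        sentence = re.sub(r"\s+", HARD_SPACE, sentence)
        for raw in sentence.split(HARD_SPACE):
            word = clean_word(raw)
            if is_valid(word):
                tokens.append(word)
    return tokens


if __name__ == "__main__":
    # Illustrative sample text only; it is not taken from the paper's corpus.
    for token in tokenize("\u0627\u0684 \u0645\u0648\u0633\u0645 \u0633\u067A\u064A \u0622\u0647\u064A\u06D4"):
        print(token)
```

Filtering by Unicode general category is a design shortcut for this sketch: it keeps letters and diacritics without enumerating the Sindhi alphabet, whereas a production tokenizer would more likely validate tokens against an explicit character inventory or lexicon.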
the scanning routines usually include various algorithms for handling morphology in a language-dependent manner. even for a language like english, which is very lightly inflected, contractions and possessives also need to be handled within the word extraction routines [ , ]. sindhi, like other asian languages such as urdu, arabic, and persian, suffers from the same text segmentation problem, with space omission and insertion issues. sindhi is an official state language of sindh province in pakistan and is spoken by approximately . million people in pakistan and around . million people in india [ ]. sindhi script is based on the perso-arabic script, with the arabic naskh style of writing, written from right to left with a cursive ligature system [ ]. sindhi script is cursive in its written form, with subsequent characters in a word joined to each other, as shown in figure . its cursive nature and its aerabs (diacritic marks) make sindhi text difficult to process in nlp applications. for any nlp application it is vital that a standard corpus of the language is built, so that text can be processed and compared using statistical analysis [ ]. the need for a formal sindhi corpus is therefore evident, and a model is needed for the tokenization of sindhi words.

this paper discusses a sindhi word segmentation technique for developing a sindhi corpus and tokenizing sindhi text, in order to build a repository of sindhi words for nlp applications such as spell checkers. sindhi word boundaries within the text are identified by finding the hard space character. sindhi, being a very complex language, possesses fifty-two characters in its script, with each character having separate glyph shapes based on its position in a string. this generates ambiguity in sindhi script, as the sindhi language contains two types of letters: connectors and non-connectors.
sindhi words therefore use soft space as well as hard space characters, as shown in figure .
figure . sindhi text
related work
most modern languages already have various tools and techniques for segmenting written text and documents for spell checking and correction. apart from the languages of european countries, word tokenization algorithms have been implemented for various languages spoken in asian countries. relevant work includes word segmentation for arabic [ , , , ], bangla [ ], hindi [ ], nepali [ ], tamil [ ] and urdu [ , ]. these are only a few examples; unfortunately, very little work of this kind has been done for the sindhi language. for segmenting arabic text, shaikh et al. propose segmenting arabic words and sub-words into characters using primary and secondary strokes with vertical projection graphs [ ], but for ocr systems only and not for digital text. similarly, shaikh et al. use a height profile vector (hpv) to segment sindhi characters [ ], but again for printed or handwritten scanned sindhi text; this approach addresses segmentation for ocr systems and not for digital text. durrani n. and hussain s. address the orthographic and linguistic features of urdu for word segmentation, employing a hybrid solution of n-gram ranking with rule-based matching heuristics [ ]. akram m., in his thesis, discusses a statistical solution to word segmentation for urdu [ ], but again for ocr systems. mahar et al. develop five algorithms based on a lexicon-driven approach for segmenting sindhi words into possible morpheme sequences [ ]. the most relevant work on sindhi text segmentation is by mahar et al. [ ], who present a layer-based model for sindhi text segmentation.
that model uses three layers, where each layer segments words of increasing intricacy, from simple through compound to complex sindhi words. in contrast to this and the other techniques discussed, we address the problem of segmenting sindhi words that are already in digital or textual form, taken from the internet or typed into a word processor, for the purposes of corpus building, constructing a word repository, machine translation, spell checking, grammar checking, text-to-speech systems, etc. our adopted technique works by identifying sentence boundaries, then tokenizing the words of each sentence and validating each isolated word for accuracy.
character glyph and space types in sindhi text
the characters in sindhi script have multiple shape or glyph representations according to their position in the word. a character may take one of four categories of shape depending on its placement in the text: initial (start), medial (middle), final (end), and standalone (isolated), as shown in figure . soft spaces in sindhi script separate certain characters that do not have cursive, context-sensitive shapes for all four positions. for example, in table the character ’ذ’ {dhaal} has two basic types of cursive shape: an isolated and start shape, and a middle and end shape. when ذ is used at the start of a word, it does not join with any other characters; hence, a soft space is inserted to indicate the separation. this soft space does not indicate a word boundary because it is not a character, unlike the hard space, which is a character in its own right that has no glyph or shape yet occupies space.
figure . different shapes of characters according to position in a word
table .
shape group of ’ذ’ character
isolate start middle end
راذ ذ ذيذل ذلذي
because the hard space is an individual character with a designated ascii ( ) and unicode (u+ ) code that occupies memory and screen space when used, it provides an easily identified marker for word boundaries in the segmentation process. soft spaces, by contrast, arise from character shape groups and placement; a soft space has no ascii or unicode representation and does not occupy memory. the soft space is therefore not considered an individual character and is not used to identify word boundaries.
architecture of the system
the system is divided into two main sections. the first section segments sindhi words into tokens, and the second verifies each generated token. results are validated against a prebuilt word repository. previously developed sindhi word processor software (by the author [ ]) is used as the primary tool for working with sindhi text and showing the results. in the first section, a set of algorithms and routines scans the text and extracts the sindhi words, and each word is then compared with a repository of sindhi words for verification. the scanning routines evaluate each word against a set of rules to decide whether it is a valid token. a token found to be invalid is marked as incorrect and set aside in a list of ignored and unwanted words. this development methodology is illustrated in figure , which shows the various stages of the system. these stages of the system architecture are explained in the subsections below.
figure . various development stages of the system
scanning and extraction of sindhi words
the first stage of the sindhi word segmentation model works at four levels:
1. text segmentation into sentences.
2. sentence segmentation into words.
3. token creation.
4. token matching.
at each level, a set of routines parses the sindhi text, extracting sentences and creating valid tokens. these routines are discussed in the following sections.
text segmentation into sentences
initially, sindhi text written in a sindhi word processor document is read and scanned. the text is then separated into sentences by identifying the sentence boundaries: a sentence runs from the start of the text to the next full stop (.), as shown in figure below. sentence boundaries are identified by a full stop (.) or a question mark (?); the end of a paragraph and the start of a new line also mark sentence boundaries.
figure . scanning and identifying sentence boundaries
each input text is scanned from the beginning, starting at the initial position of the first sentence. the end-of-sentence identifier, either a full stop (.) or a question mark (?), is then searched for and marked. the system fetches the sentence from the start index to the end index. after the first sentence is isolated, the start and end index markers are reset and the next sentence is searched for. the implementation details for scanning sindhi text and identifying sentence boundaries follow.
sentence segmentation into words
to segment sentences and create sindhi words, word boundaries have to be identified and marked. word boundaries are determined by the space character before and after each word. as discussed earlier, sindhi script has two types of space in written or typed form: hard spaces and soft spaces. for word token generation, this system uses the hard space as the word boundary identifier. soft spaces are ignored and counted as part of the word, as shown in figure .
figure .
sindhi text with word boundaries marked at hard spaces
note that in our system the terms word and token are used separately. we consider a word to be the general form separated by spaces. a generated word may be misspelled or incorrectly typed, and may also contain punctuation marks or special symbols such as “@, &, *, !, #”. at this point all such words are treated as general words that have been correctly segmented. tokens, by contrast, are correctly spelled words verified against a repository of correctly spelled sindhi words. at the beginning of the process we set the locale to arabic (‘ar’) so that the text can be read and processed in right-to-left order for sindhi script. then a break iterator object is declared, which is used to break out each word according to the index of the hard space. the word is isolated for further verification and validation before it can be saved as a token. this segmentation process differentiates between words and characters that are not part of words; the latter are ignored and skipped to improve the accuracy of the tokens formed. the ignored characters, which include spaces, tabs, punctuation marks and most symbols, have word boundaries on both sides. in algorithm , the implementation details for segmenting sentences into words are provided.
sindhi word tokens
after the word boundaries are identified, each word is isolated and put into a hash list. each word is then retrieved from the list and analyzed for validity. a token is created by identifying the correctly spelled word and removing any additional unnecessary characters that may be part of the original word.
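the sentence and word segmentation steps described above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the authors’ implementation (which, per the text, uses a break iterator inside their sindhi word processor): the function names `split_sentences` and `split_words` are hypothetical, the hard space is assumed to be the ordinary space character u+0020, and the arabic question mark (u+061f) is added to the boundary set as an assumption beyond the full stop and question mark named in the text.

```python
import re

# Assumption: sentence boundaries are the full stop and question mark named
# in the text, the Arabic question mark (U+061F), and line breaks.
SENTENCE_END = re.compile("[.?\u061F\n]+")

# Assumption: the "hard space" is the ordinary space character U+0020.
HARD_SPACE = "\u0020"


def split_sentences(text):
    """Return the non-empty sentences found between boundary markers."""
    return [part.strip() for part in SENTENCE_END.split(text) if part.strip()]


def split_words(sentence):
    """Split a sentence at hard spaces only.

    Anything that is not a hard space stays inside the word, so a visual
    gap between non-joining letters never produces a word boundary.
    """
    return [word for word in sentence.split(HARD_SPACE) if word]
```

because only the hard space is treated as a delimiter, the words returned here may still carry punctuation or other stray characters; these are general words in the sense used above, not yet validated tokens.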
the filtration process that creates tokens traverses each character within the word and removes all special characters such as @, #, $, %, ^, &, *, (, ), _, +, -, =, {, }, |, [, ], :, ;, <, >, ?, etc. characters of these types can be part of the string and may have become attached to the word in the previous stage. hard space characters, newlines and new-paragraph symbols are also trimmed. more importantly, the filtration process removes any letters of the english alphabet, since english words are commonly used in places in a sindhi document or article. in the last stage, each word is compared against a known repository of sindhi words for final validation. figure shows the tokens created.
figure . word tokens created from a sentence
the implementation of token creation is shown in the following algorithm . here the ‘filter char’ array holds the characters to be filtered out of the word as invalid. for the sindhi word segmentation system, invalid characters are those not in the sindhi alphabet, such as punctuation marks, extra spaces, numerals, braces, etc.
token matching
after each token is created from the sindhi text, the tokens are searched for and matched against the sindhi wordlist in the repository. the system stores the wordlist in hash tables, so the basic operations of searching and matching become very simple. we use the same technique of matching a hash key to search for a token in the hash-table-based repository as discussed in [ ]. the hash structure makes searching fast and efficient. the analysis and verification of words also involves validating the error patterns and trends in spelling mistakes that occur while typing sindhi text, the results of which have already been published in [ ]. figure shows the basic structure of the token matching done by the system.
figure . a hash table structure for sindhi words
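the filtering and hash-based matching described above might look like the following python sketch. it is illustrative only and rests on several assumptions that are not from the paper: the names `filter_word` and `match_tokens` are hypothetical, the sindhi alphabet is approximated by letters and combining marks in the unicode arabic block (u+0600 to u+06ff), a python `set` stands in for the hash-table repository, and a word is counted as ignored when nothing is left after filtering.

```python
import unicodedata

# Assumption: the Sindhi alphabet is approximated by the Unicode Arabic block.
ARABIC_FIRST, ARABIC_LAST = "\u0600", "\u06FF"
# The two Sindhi word signs (U+06FD, U+06FE) are allowed explicitly so that
# they are kept whatever their Unicode general category is.
SINDHI_SIGNS = {"\u06FD", "\u06FE"}


def filter_word(word):
    """Drop every character that is not an Arabic-block letter or mark."""
    kept = []
    for ch in word:
        in_block = ARABIC_FIRST <= ch <= ARABIC_LAST
        category = unicodedata.category(ch)  # e.g. "Lo", "Mn", "Nd", "Po"
        is_letter_or_mark = category[0] == "L" or category == "Mn"
        if ch in SINDHI_SIGNS or (in_block and is_letter_or_mark):
            kept.append(ch)
    return "".join(kept)


def match_tokens(words, repository):
    """Sort words into valid tokens, incorrect words, and ignored words.

    `repository` is a set of known-correct words; membership tests on a
    set are hash lookups, which is the property the text relies on.
    """
    tokens, incorrect, ignored = [], [], []
    for word in words:
        cleaned = filter_word(word)
        if not cleaned:
            ignored.append(word)       # nothing usable left after filtering
        elif cleaned in repository:
            tokens.append(cleaned)     # verified against the repository
        else:
            incorrect.append(cleaned)  # not found: treated as misspelled
    return tokens, incorrect, ignored
```

a set was chosen here because lookup time does not grow with the size of the wordlist on average; any hash-keyed structure, such as a dictionary mapping each word to extra information, would serve the same purpose.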
results
the proposed model has been tested on various corpora of sindhi text collected from the internet (general articles and news articles) and from a publisher (book chapters and a digital dictionary). the detailed tokenization report is shown in table . the articles taken from sindhi literature books were first typed into the sindhi word processor using the sindhi keyboard designed for sindhi typing in [ ]. a total of , words were generated by the proposed tokenization model, of which , words were verified and marked as correct, valid sindhi word tokens, giving the model a cumulative accuracy of . %. some tokens that were generated were considered invalid by the system, and tokens were ignored completely because of anomalies such as special characters or unknown unicode literals in them.
table . results of the proposed segmentation model
source; paragraphs; lines; words; incorrect tokens; tokens ignored; total words
articles
news articles
book chapters ( )
sindhi digital dictionary --- --- ---
total , , , ,
figure (a) below compares the tokens generated from the various corpora of sindhi script. figure (b) compares the cumulative accuracy with the tokens marked incorrect or ignored by the model.
figure (a). bar graph showing the tokens generated
figure (b). pie chart showing the cumulative accuracy of the system
the overall accuracy of token generation by the proposed word segmentation model is given in table , and the graph in figure illustrates how accuracy and tokenization vary between the corpora of sindhi text.
table . accuracy of proposed model
source; accuracy %
articles .
news articles
book chapters ( ) .
sindhi digital dictionary .
figure . graphical representation of calculated accuracy in proposed word segmentation model
conclusion
the work presented in this paper shows the technique and algorithms used to tokenize sindhi words from a given sindhi text document. each algorithm has been discussed and its implementation given. the results are analyzed by checking each token generated by the system against a list of words from a prebuilt sindhi word repository (for spell checking). each token is checked by the system for correctness, and all invalid tokens are marked as incorrect and ignored. the proposed model generates a sizable number of tokens accurately. the algorithms can be further used to segment sindhi words for other nlp purposes such as machine translation, spell checking, grammar checking and text-to-speech systems.
references
[ ] mahar, j. a., shaikh, h., memon, g. q., “a model for sindhi text segmentation into word tokens”, sindh university research journal (science series), vol. ( ) pp. - ( ).
[ ] haruechaiyasak, c.; kongyoung, s.; dailey, m.; “a comparative study on thai word segmentation approaches,” electrical engineering/electronics, computer, telecommunications and information technology, . ecti-con . th international conference on, vol. , no., pp. - , - may .
[ ] nadir d. and sarmad h. . urdu word segmentation. in human language technologies: the annual conference of the north american chapter of the association for computational linguistics (hlt ' ). association for computational linguistics, stroudsburg, pa, usa, - .
[ ] “a fast morphological algorithm with unknown word guessing induced by a dictionary for web search engine” source: http://company.yandex.ru/articles/iseg-las-vegas.xml retrieved on: , june .
[ ] hull, d. a, “stemming algorithms a case study for detailed evaluation,” (rank xerox research centre), jasis vol. , .
[ ] ismaili, i.a, bhatti, z., shah, a. a. “design and development of graphical user interface for sindhi language (guisl)”. mehran university research journal of engineering & technology, volume , no. , october .
[ ] rahman m u ( ).
towards sindhi corpus construction, conference on language and technology, lahore, pakistan.
[ ] shaalan k. “arabic gramcheck: a grammar checker for arabic”, software practice and experience, john wiley & sons ltd., uk, ( ): - , june .
[ ] zribi, c. b. o. and ben ahmed, m. . “efficient automatic correction of misspelled arabic words based on contextual information.” in proceedings of the th international conference on knowledge-based intelligent information and engineering systems (kes’ ). v. palade, r. j. howlett, and l. jain eds., oxford, springer, - .
[ ] farghaly, a., shaalan, k. “arabic natural language processing: challenges and solutions,” acm transactions on asian language information processing (talip), the association for computing machinery (acm), ( ) - , december , ( ), - .
[ ] shaalan, k., allam, a., gohah, a., “towards automatic spell checking for arabic”. conference on language engineering, else, cairo, egypt, .
[ ] uzzaman, n., and khan, m., “a double metaphone encoding for bangla and its application in spelling checker”, proc. ieee natural language processing and knowledge engineering, wuhan, china, october, .
[ ] chaudhuri, b. b., “towards indian language spell-checker design,” lec, pp. , language engineering conference (lec' ), .
[ ] bal k. b. et al., “nepali spellchecker”, pan localization working papers - , centre for research in urdu language processing, national university of computer and emerging sciences, lahore, pakistan, pp. - .
[ ] dhanabalan, t., parthasarathi, r., & geetha, t. v. (n.d.). “tamil spell checker” resource center for indian language technology solutions – tamil, school of computer science and engineering, anna university, chennai, india, pp. - . .
[ ] naseem, t., & hussain, s. “spelling error corrections for urdu”. published online: september © springer science business media b.v. .
pan localization working papers , centre for research in urdu language processing, national university of computer and emerging sciences, lahore, pakistan, pp. - .
[ ] naseem, t. and hussain, s, “a novel approach for ranking spelling mistakes in urdu”, language resources and evaluation, . : - .
[ ] shaikh, n. a., shaikh, z. a., & ali, g. ( ). segmentation of arabic text into characters for recognition. in wireless networks, information processing and systems (pp. - ). springer berlin heidelberg.
[ ] shaikh, n. a., mallah, g. a., & shaikh, z. a. ( ). character segmentation of sindhi, an arabic style scripting language, using height profile vector. australian journal of basic and applied sciences, ( ), - .
[ ] akram, m. ( ) “word segmentation for urdu ocr system”, master’s thesis, department of computer science, national university of computer & emerging sciences, lahore, pakistan.
[ ] mahar, j.a., memon, g. q., danwar, h.s., ( ), algorithms for sindhi word segmentation using lexicon driven approach, international journal of academic research, vol. . no. . may, .
[ ] ismaili, i.a., bhatti, z., shah, a. a., “development of unicode based bilingual sindhi-english dictionary”. mehran university research journal of engineering & technology volume , no. , january .
[ ] bhatti, z., ismaili, i.a., shaikh, a. a., soomro, w. j. “spelling error trends and patterns in sindhi”. journal of emerging trends in computing and information sciences, vol. , no. , .
[ ] bhatti, z., ismaili, i.a., khan, w., nizamani, a. s., “development of unicode based sindhi typing system”, journal of emerging trends in computing and information sciences, vol. no. , .
space, scholarship and skills: building library strategy on new and emerging needs of the academic community
michelle blake, relationship management team, university of york, uk, michelle.blake@york.ac.uk, orcid.org/ - - -
vanya gallimore, relationship management team, university of york, uk, vanya.gallimore@york.ac.uk, orcid.org/ - - -
kirstyn radford, library’s research support team, university of york, uk, kirstyn.radford@york.ac.uk, orcid.org/ - - -
abstract
this article follows a previous article which discussed the outcomes of the understanding academics research project ( – ), which sought to better understand academic staff at the university of york. the project centred on the use of specific ethnographic methodologies, in particular two ux techniques: cognitive mapping followed by semi-structured interviews. this article focuses on the key themes which emerged from that research and which now underpin the new library strategy: space, scholarship and skills.
key words: academic library; ux; ethnography; academic staff; usability; strategy
this work is licensed under a creative commons attribution . international license
uopen journals | http://liberquarterly.eu/ | doi: . /lq. liber quarterly volume
introduction
over the past two years, the university of york library has been undertaking a major research project to understand more about the needs of academic staff and to help inform new strategic directions for the library.
the project had three core aims: to gain a much better understanding of how academics at york approach their research and teaching activities; to consider how library services currently facilitate and support those activities; and to integrate the ‘academic voice’ into future service planning and development of support for academics, ensuring that the library continues to engage departments in innovative ways that respond to both current and future needs. the project made use of specific ux ethnographic methodologies (cognitive mapping and semi-structured interviews) which placed the academics at the heart of the research and meant that we were able to fully understand their experience of using library services in highly practical ways. in total, interviews were carried out and the data was subsequently analysed using nvivo software. results were set within national and international contexts through close analysis of the professional literature including the survey (wolff-eisenberg, rod & schonfeld, ). the research ultimately led to three key outcomes: a set of “quick wins” whereby the library was able to make immediate changes to services for academics; a set of longer-term practical recommendations which are currently being implemented; and an evidence-based synthesis which seeks to define and explain academic life and understand the key motivations, frustrations and aspirations for academics. following the understanding academics research a further in-depth interviews took place to gain a better understanding of digital scholarship at york. findings on the synthesis of academic life at york have already been published (blake & gallimore, ), including a bibliography of related literature on this topic. this article focuses on the second major output of the project: an analysis of the key themes that emerged from the interview analysis and which now underpin our new library strategy – : space, scholarship and skills.
this synthesis is combined with the digital scholarship interviews and this article embodies the digital world in which we operate across all three themes. throughout this article we talk about both the digital world and the physical world, and how these intersect and inform teaching and research activities across the university of york.
space
libraries as physical and virtual spaces continue to play a key role in the lives of many academics across subject disciplines. much research happens within online spaces (whether provided by the library or not), yet many academics still value the opportunity to access physical collections and space. for some, serendipitous browsing remains an important feature of the research process; for others, libraries provide an opportunity to escape the busy, everyday demands of the office and participate in a shared, scholarly environment that stimulates thinking and creativity.
“i don’t tend to work in the library very much, often because it’s quite full, whereas i do in the vacation, i quite like working in libraries as nobody can bother me. at my desk there are always emails pinging in.” humanities academic
the very presence of physical books representing decades of knowledge and wisdom can help to inspire and encourage the scholarly endeavour by setting a particular tone and intellectual precedent (beer, ):
“writing spaces matter. the environment we write in inevitably shapes the work… the liberating enormity and evocative presence of the library’s books make it a place to think and work that, i think, adds some energy and a bit of fizz to what i’m doing. it’s a space i tend to turn to when i need a bit of a push to keep writing or to keep editing.
sometimes i fetch those books down from the shelves to inform what i’m doing, but often they sit there suggesting to me to think more, to be a bit freer and to get on with it.”
academics are conscious of the varying demands on library space from different users, and some tensions were raised in the study about the perceived loss of quiet study space. academics will make conscious efforts to avoid using the library at times that are busy for their students, and are keen to promote the space to their students and encourage academic thinking and collaborations:
“being in the library makes [the students] feel like they’re doing something intellectually productive rather than staying at home which is a good thing.” social science academic
some academics noted anxieties in their students about using the library, which can feel a busy, overwhelming and intimidating environment at times. individual study rooms can reduce these concerns but are in short supply.
“i have a surprising number of students who come through with mental health difficulties and they really struggle to be in the quite busy, hive of activity. there are bookable individual rooms so it’s not as if we’re not providing, it’s just that obviously, it’s the amount of space is limited. but i do get a handful of students who say they just can’t work in the library because it makes them too anxious.” social science academic
both this research and a previous study involving pgr students found that our community of users wanted different spaces for different activities. with this in mind, and like most libraries, we have tried to create a range of flexible spaces to suit different needs, including a traditional silent study area, quiet study spaces surrounded by books, and group study areas where users can choose to be more sociable.
to further address some of the specific issues around space raised in the understanding academics project, the library at york has carried out a range of other targeted ux activities to understand more about how all users interact with the physical space. over the past year, for example, the space ux project has focussed on the lounge area of the morrell library just beyond the main entrance. use of the area was monitored over time and the furniture layout subsequently adapted to fit how students actually prefer to use the space. a new fabric “wall” is currently being installed to help define the space and reduce noise levels from it. a specific focus on pgr needs led to the creation of dedicated pg spaces with added facilities including lockers and access to a tea and coffee machine. next steps will focus on how to ensure that the library’s virtual services match those that are provided physically and in person. many academics rely on online spaces and resources to access the information that they need for research and teaching, and many never or rarely make use of the physical library space.
scholarship
despite the opportunities to explore new research areas, and potential collaborators facilitated by the digital discovery services now ubiquitous in scholarly publishing, many york researchers prefer to follow their hunches: inspiration strikes while reading, listening to conference speakers or discussing ideas with colleagues. collaboration is key to the research process, as one academic noted: “grants make research a much more sociable business!” (humanities researcher). some academics said that nearly all their work is collaborative: “there’s pretty much no project i do that isn’t collaborative. there’s almost nothing i sit and do on my own.” (science researcher).
it is clear that attending scholarly conferences and networking events in person is still a valued opportunity for learning about new developments and sharing ideas. outside the applied sciences, it is reportedly uncommon for funding constraints, refability or the prospects for impact to dictate the direction of academic research, at least initially; however, researchers note the value of pump-priming and time to discover gaps in the literature and will often road-test ideas with peers before or during the funding application process. they can be frustrated by the complexity of the process, and cynical about prospects of success:
“i usually have various ideas which may appear spontaneous but are often the results of going to conferences, talking to colleagues, reading and previous research. maybe one or two will get fleshed out. at that point start thinking about who might fund this, or the whole thing might be driven by a funder. or call or deadline for studentship. then i have an outline, i also look at funders website to see what they are looking for and what they’ll fund.
then try and get pump priming funding for it from the department or the university.” science academic
“so, if you’re interested in doing research we have what we call pump priming...that would be as like testing the water part of research because obviously you will want to be developing your research on a wider scale which means applying for a larger fund either overseas or within europe.” social science academic
“so i’m trying to work out what are the current trends, what are the areas of interest, what are the gaps, where can my field add something.” humanities academic
with a few exceptions, researchers are broadly positive that the library’s collections meet their needs, and adopt a pragmatic approach to sourcing material unavailable at york, often consciously breaching publisher licence terms by sourcing journal articles through personal networks (including social media) or ‘guerilla’ open access platforms. quick access to journal articles in particular is critical for many academics, with websites like google scholar helping academics to quickly identify what isn’t available at york. all of this results in few academics being reliant on inter-library loans, and there was little evidence that green open-access articles are widely recognized or utilized.
“and it’s actually great the fact that york is linked with scholar so you can just access everything straightaway without having to go through the library unless it’s absolutely necessary because it’s much faster this way.” social science researcher
academics are working in a fast-paced culture with high expectations of accessing information instantly at the point of need. this reinforced the findings of our pgr ux study, in which some felt that the advertised interlending times were too long, while others felt the costs were prohibitive.
overall, people, be they students or staff, want something when they need it. at york we are starting to investigate ways to change how our service operates and implement a “tell us what you need” service, in which we take the burden away from the requestor and decide how best to fulfil the resource request, for example by buying the resource, sourcing it through interlending or possibly purchasing short-term article access. this includes removing user charges wherever possible. there were some comments from academics about gaps in our journal subscriptions which can impact research. websites like sci-hub and arxiv provide easy access to what academics need and help to facilitate and further their research and discussions; they are shaping the way that academics conduct their research and changing the nature of the research relationship with the library. this explanation of the use of arxiv by a researcher in the sciences sums up how the landscape is changing for accessing scholarly material:
“the first thing is that you know it’s there! and it’s like a library which is available all the time and you get typically pretty close to final versions and you get very new stuff which hasn’t been published so depending on time, there are various ways you manage the flow from the arxiv... but then i sometimes actually when i know a paper has been published in a journal, if the first link i get to is on the arxiv i don’t bother to look for the link to the journal and then go through the library and log myself in to read the final published version, i’m perfectly happy with the arxiv version in most cases. so it’s that important really. some people who have secure positions they have given up on journals, they only read what’s in the arxiv, it’s available to everybody. if it’s good, it will be referenced, and some metrics it might not count but in others it does, especially in the community.
a good arxiv publication can be as good as published paper. and only for careers, jobs, do you then need the stamp from the journal. and of course in the scientific community there is a lot going on against these ‘institutions’ of editors which hold the stamp of approval for careers which is a very weird distortion in some sense. why is some scientific result excellent if it has been published in nature? and if the same result has been published in the arxiv, the content is the same but it’s somehow not recognised as equally valid. by certain parts of relevant communities, like appointment communities, but the community itself doesn’t really bother that much.” science researcher

similar views were expressed about scihub:

“well i only learnt about it a year ago, but it seems to be fifty times the size of the arxiv, if not more, probably many times the library...it’s going to be possibly taken down, so it’s not necessarily reliable, legal, it’s an interesting question. what i liked about scihub is that they actually put in a manifesto. they had somebody, a quote from a retired professor in harvard, who said that somehow, something is really broken in the sciences, because the situation is like this: we have scientists producing work, we have scientists judging the work by refereeing, and then once the work has been published, we have scientists paying to have access to the work which they produced and they refereed. and that seems to be really the wrong way round, and this is a historic distortion which is just not correct somehow.” science researcher

with this in mind, a new project at york to ensure that the library’s key discovery tool, yorsearch, is fit for purpose and able to evolve in response to shifting ways of researching, communicating and collaborating within the scholarly world has commenced. it will also investigate new discovery tools such as yewno.
during our digital scholarship interviews, more junior participants seemed to hold “traditional” modes of scholarly communication in higher regard than their more senior colleagues, with strong preferences for the printed text for immersive reading and serendipitous discovery. overall, academics still had a preference for reading in print, rather than online, again aligning with the findings of the ithaka s+r survey (wolff-eisenberg et al., ). there is still a blended approach to downloading articles, reading them online or printing them out, although some appear to access scholarly literature primarily via their smartphone.

for many researchers, the research design and data collection are inseparable from the literature review: particularly in the humanities where a text may be an object of research in itself, but also in the sciences and social sciences where secondary literature can suggest new data sources or methodological approaches. access to third party data is occasionally a source of frustration, and awareness of good practice with regard to data management is somewhat patchy, with library support welcomed.

researchers repeatedly noted the cyclical nature of their research: attempting to write up their findings or present them to an audience often highlights unanswered questions or incomplete analyses. choosing where to publish appears to be largely down to reputation and informal networks, with superficial use of journal metrics. whilst many researchers are articulate about the shortcomings of the established scholarly publishing economy, very few appear to give serious consideration to alternative platforms, and misinformation about the costs and risks of open access is widespread.
“then of course, you’re looking, where do we publish our results, and more and more now, it used to be less so, but more and more when i think where am i going to publish i’m thinking who’s going to read it, you know, is it going to be widely accessible, because there’s no point in publishing it if nobody can get to read it... and the other thing when i publish is we’re having to think of metrics now, i’m not sure i believe in them but ultimately the people who count, the what their impact metrics are in any given field.” science academic

“...there’s a certain subset of journals, three or five journals, that everyone should publish in. it’s good for your cv, good for your promotion and everything, to publish in those journals.” social science academic

“if it’s an article i have a fairly strong idea of the sorts of places i want to publish it and i would always go for the best journals i think the piece is capable of sustaining. so i have a sense of whether i think it’s important and would go in a top ranking journal, or if it’s less important and would go in a lesser journal. and the journals i identify are also the ones i think people will read in my field.” humanities academic

. skills

there is clear evidence from the interviews that staff have mixed abilities in relation to digital skills. some talk at length about their collaborations (and how they use digital tools to enable these) while others feel they need more support in both new areas, e.g. digital note taking, as well as refreshers on more traditional skills such as finding and managing information. many staff also expressed concern about the digital skills of their students.

collaboration was a key theme from the interviews, with digital technologies, tools and websites enabling academics to keep in touch with each other, to collaborate across borders and to share their research data.
researchers utilise a range of tools to enable collaboration. their choice of tool may depend on who they are collaborating with and what they need the tool to do; however, a number of academics interviewed were mindful of the risks of using certain tools, particularly when it comes to confidentiality of data. many academics noted the fundamental failure of digital tools to substitute for real face-to-face conversations with people.

unsurprisingly, the majority of academics across all disciplines talked about the importance of working digitally and the need for this to be easy. indeed, this aligns with the findings from the ithaka s+r survey (wolff-eisenberg et al., ). this is perhaps best summed up by the heavy reliance on google.

“i never read books, i just google it.” science researcher

what was highly noticeable across the majority of interviews was the importance of google scholar for literature searching today, particularly for current awareness activities. the majority of academics interviewed who discussed their literature searching habits start with google, google scholar and/or specific subject databases. for some academics and disciplines, but not all, google scholar is now much easier to use than traditional databases like web of science and is equally comprehensive in terms of coverage. google scholar is often the starting point before academics then turn to library databases for more in-depth research (if they turn to them at all). for some academics, however, databases will always be more important than google scholar searching because the academic content of the databases simply isn’t available through google (law databases, for example).

“yorsearch can be quite cumbersome sometimes. i’ve tried using it several times but whether you end up with a lot of resources and sift through lots of them, or you get hardly anything.
and then you go to scholar and it finds lots of articles yorsearch didn’t find but you have at york anyway!” social sciences academic

“i...use google scholar because it’s easy and user-friendly and easy to drive.” science academic

“i tend to stick with the more recent literature. so i just use google scholar’s function to tell me what has been done since .” social sciences academic

few academics start with the library’s search tool, yorsearch, supporting findings from the ithaka s+r survey (wolff-eisenberg et al., ); however, some who do find it hugely beneficial and a way to facilitate serendipitous browsing which they value.

“i’m heavily reliant on yorsearch. i will put in a keyword or something, see what i get out of it. it’s always been my natural inclination to use a library’s catalogue, that’s always been my starting point...i know i’m better at using the library catalogue unlike my colleagues for example so i must’ve been taught it or had a guide i used to discover that information?” social sciences academic

the interviews indicated that academics are moving away from traditional search techniques (such as boolean logic) so we must consider what this means for libraries and our traditional information skills and whether these are still needed. at the university of york we have started to focus on critical evaluation skills rather than on finding information: how we teach students to differentiate between different types of (online) text and to engage critically with the literature, rather than just trying to find the “right answer” and copying and pasting that into written work.

it was clear from the interviews that academics develop their own individual ways of searching the literature and keeping up-to-date with new research and publications in their field.
across the interviews, a range of different methodologies and search techniques were discussed, from highly structured and systematic, to more haphazard and serendipitous approaches. while some of these are clearly illustrative of good practice, many others were of questionable effectiveness. this presents opportunities for the library to support academics in improving their search skills, particularly those who said they have limited time. one particular concern raised was around effective keyword searching to find relevant results, which a number of academics found particularly problematic.

“i should [use databases] but frankly i just don’t know enough about them. it’s always rather haphazard. talking to people helps a lot and, oh, you should read so and so...so it grows by word of mouth, it grows haphazardly...” humanities researcher

“literally just using big keywords, to put it into a very generic, i know i shouldn’t say this, like google or the top line of chrome or the top line of your phone.” social science academic

academics reported concern about research students’ level of proficiency with literature searching but sometimes lack those skills themselves and wondered if the library could do more to support them in this area. poor search skills can have a disproportionate impact on students who don’t have the established networks which many academics identified as key for the discovery of new research materials.

“i do encourage phd students in particular to come and speak to the [subject] librarian and i should do that myself more often actually. in terms of i’m doing this module, am i missing things? actually as i’m doing this i’m thinking really i should do that myself because it would really help students. i always feel that the support is great, the staff are fabulous. i could possibly ask for more, for myself...
i’m possibly not thinking that’s the right thing to do; thinking that i know how to find these things better myself, well actually i might not.” social sciences academic

“it could be if we work smart rather than work hard and utilise all the resources that we’ve got. you guys in the library are good at searching and good at finding stuff whereas academics might not be.” science academic

some academics also warned against over-reliance on google and worried that colleagues may not be aware of its drawbacks.

“you see a lot of researchers who just do that [google] when they’re trying to get into a topic, don’t go to econlit or don’t search pubmed with mesh terms, or don’t explore things quite as deeply, and they miss out on a rich literature. i also...try to do at least...searching old texts and literature. you know, i’ve been to the library to use the stacks to find textbooks. part of it is because there is a lot of old [subject] texts, [subject] theory that’s been out there years and years, that tends to get lost if it’s not picked up on some website...people find it really interesting in presentations when you bring up more classic texts or more older papers... everyone cites papers from – , and if you cite a paper from or , people are like ‘woah, what’s this, i didn’t know that was there.’” social science researcher

“the problem that google has, as a search engine in general, is that it has this sort of secret weighting algorithm that causes some results to drift to the top and others to the bottom, and that’s partly the result of the work that they do to keep things relevant because if the results aren’t relevant their user base will disappear.
but it’s also ripe for various kinds of manipulation and optimization by people who profit by having that, so there’s an uneasy truce between those two sides—as long as neither of them break it, it carries on working for everyone but those forces do not operate if you’re trying to find journal articles.” science researcher

with the proliferation of information now available, some academics interviewed found it challenging to find the time for current awareness activities; however, for others, the real challenge is in understanding how to go about finding new scholarly materials published in their field and uncertainty about using social media.

“i don’t do systematic ‘let’s keep up to date’, i don’t have time for that at all. it’s very much, i’m writing something and i need a reference for x and then i go and look.” social sciences researcher

“it gets harder and harder as there is more and more material, just to keep control of the basic bibliography of things is very hard and i still come across things and think how have i missed this which is a bit frightening when you’re supposed to be on top of things. there’s so much information these days.” humanities academic

“it’s knowing where to look. latest information isn’t always easy to find.” science academic

“so it’s difficult sometimes to find the journals, that’s where things like twitter and facebook, that networking, has been incredibly useful. and that’s the thing, you used to go to conferences and meet people, but now you add them on twitter and they post something: ‘i’ve published an article’ or ‘someone else has an article’...you don’t want to tell people they should be on social media but at the same time they’re missing out on all this stuff. but people have their own networks.
in some way something like twitter is no different from keeping up with people via phone calls or writing to them, but on a vast scale.” humanities researcher

as mentioned previously, academics still have a preference for reading in print, rather than online. they also raised concerns about their students in relation to this.

“maybe we need to start structuring into the modules ways of getting the students to actually use the physical texts. i don’t know how we do that, it’s going to be setting it on weekly reading, it might be through library tasks where you have to go look at books... to some extent we are not introducing them to the skills of a researcher. after undergraduate level, they will have to use books and they will need to be able to search the catalogue and find a book on the shelf, and that when you look at a shelf there is a whole context of shelfmarks around that could be useful. so they should be, still. you know reading an e-book is a different experience, it’s nostalgic, it’s pragmatic, they will have to use physical books eventually.” humanities researcher

for some academics, it is not just the impact of ebooks on student reading that causes concern. ebooks in general do not always fit in easily with their own research methods which can often involve referring to multiple texts at once, browsing footnotes and bibliographies across texts, making notes etc.

“reading stuff online is great if you know what you want—it’s useful for teaching preparation like reading two chapters or browsing stuff. for what i’m doing, you read the texts, footnotes, bibliography simultaneously and open access books are hopeless for that. you can’t browse or keep moving between the bibliography and the book.
if i were more technologically advanced i would use google books and word searches and i know some who do that, but you can’t use a scholarly edition online—i would always print out and scribble on it, you need to read something intensively.” humanities researcher

a proliferation of different software and their use was reported across interviews. as with literature searching, academics expressed weariness about the struggle to keep up to date with new analytical techniques, software releases and research dissemination platforms. a result of this may be the over-reliance by some academics on one tool and being closed to new ways of working, for example: “i do everything in word, anything else seems like overkill,” (humanities researcher). a few academics noted shortfalls in the university’s provision of training for research students and their supervisors, particularly with regard to effective use of it, impact maximisation and academic practice generally. there was also evidence of a gap in the provision of training for data retrieval and analysis, particularly for research students; however, it was not clear who should provide this training.

it is anticipated that the library’s digital skills projects will help ensure academics have the technical competencies and confidence to thrive in a digital environment irrespective of where they are physically located.

. conclusions

academics are increasingly working in a digital environment using a range of digital tools; however they still access library print collections when they need to. as a library service, we need to ensure that all staff and students have the necessary digital skills required to be successful in their scholarly practice, employment and more generally as lifelong learners. we also need to ensure that the physical library space continues to provide collections and space that support, inspire and empower the academic endeavour.
evidence and insights gained from academics throughout the project emphasised how critical the physical and digital spaces are and how, for many academics, they continue to co-depend on each other.

the scholarly world continues to develop and realign itself to shifting priorities and innovations. as libraries we need to be responsive and agile to such change and continue to be valued and trusted partners in the academic endeavour.

references

beer, d. ( ). writing in the library: some brief reflections on evocative writing space. retrieved june , , from https://medium.com/@davidgbeer/writing-in-the-library- ef f.

blake, m., & gallimore, v. ( ). understanding academics: a ux ethnographic research project at the university of york. new review of academic librarianship (electronic pre-publication). https://doi.org/ . / . . . retrieved september , , from http://eprints.whiterose.ac.uk/ /.

wolff-eisenberg, c., rod, a.b., & schonfeld, r.c. ( ). uk survey of academics . https://doi.org/ . /sr. .

note

the research excellence framework (ref) is the uk’s system for assessing the excellence of research in higher education institutions [see: https://www.ref.ac.uk/].

seeing emancipation: scale and freedom in the american south

edward l. ayers, scott nesbit

the journal of the civil war era, volume , number , march , pp. - (article). published by the university of north carolina press. doi: https://doi.org/ . /cwe. . https://muse.jhu.edu/article/

emancipation in the united states was vast, distended, and chaotic. shifting boundaries surrounded the struggle, unfolding unevenly over years and an expanse the size of continental europe. some enslaved people were able to escape to union lines within months of the beginning of the war, yet millions remained firmly bound by slavery in . the president, legislatures, judges, and generals played crucial roles in ending slavery, as did enslaved people, who seized freedom at every opportunity. military and political struggle were inextricably interwoven with the struggles of individuals held in slavery; thus abraham lincoln kept a map of the distribution of slavery—the first map of its kind in the united states—close at hand.

trying to make sense of this complexity, historians of emancipation have tended to focus on agency, on the ways actors in different spheres and places strove for freedom. in its simplest form, that inquiry has turned around the question of who freed the slaves. thanks to innovative work since the s, we now see that freedom came as a result of many struggles—in cataclysmic battles and in protracted debates, on farms and in bureaucracies, in political parties and on lonely roads. freedom demanded action on many fronts because slavery was entrenched throughout american society. a full understanding of emancipation requires that we put the pieces together. to do that—to comprehend the patterns, proportions, and timing of emancipation, to see multiple forms of power in interaction in space and time—we need an analytical framework that is inclusive, self-aware, and disciplined.

much of the debate over emancipation has, knowingly or not, turned on the issue of scale.
those who emphasize the role of abraham lincoln and the republicans focus on the national scale; those who emphasize the role of the enslaved people focus on the scale of individual struggle and the cumulative effects of that struggle. other historians focus on intermediate scales of armies, states, or regions. the debate to this point has often focused on agency—who possessed the capacity for action and who used that capacity for what purposes and when. the concept of agency, however, tends toward dichotomies: passive versus active, top versus bottom, middle versus periphery, black versus white, military versus home front. the work of historians of emancipation has shown us that those dichotomies cannot explain the dynamic complexity of emancipation, in part because each side of the dichotomy restricts itself to a single scale.

since scale makes such a difference in our understanding, it would be good to analyze that concept more explicitly. scale carries two fundamental meanings. first, it defines the varying spatial and temporal reach of specific practices. people act in networks of different sizes, with different degrees of awareness of those networks. it is also a matter of perception, the frame that observers lay down over evidence of social activity. that perceptual scale reveals and obscures, emphasizing some actions while truncating or ignoring others. being self-conscious about scale, in both these meanings, is crucial if we are to understand the patterns of the past. most important, an intentional focus on scale allows us to integrate multiple perspectives and social action of many kinds.

emancipation certainly unfolded at multiple scales simultaneously. it occurred at the scale of grand strategy and the nation-state, dictated from washington.
abraham lincoln understood that the military success of the united states depended on destroying the ability of slave labor to feed the armies of the confederacy. he also understood that enslaved people could be both aid and hindrance in the success of the union armies, helping guide those armies through alien territory, yet burdening fast-moving troops with thousands of desperate men, women, and children. lincoln also understood all too well the race he was running in the political realm, winning the war before his considerable opposition, fed by every move he made against slavery, mobilized to remove him from power.

emancipation played out, too, at the level of the military district and theater. in , for example, while commanding forces stationed off the south carolina and georgia coast, gen. david hunter temporarily created a large potential zone of emancipation, assuming the authority to enroll able-bodied former slaves as soldiers, with or without their consent, and declaring all slaves in the district free. this district of the south held authority over nearly one million slaves, according to the census, and hunter hoped to leverage these demographics to his military advantage. in announcing emancipation, he cut against the strategies of war that the president had outlined by august . lincoln quickly revoked hunter’s abolition of slavery, even as he prepared to announce his own.

emancipation also occurred at the scale of local action. in virginia, at fort monroe in , enslaved men and women living nearby fled to union lines. maj. gen. benjamin butler’s reaction to their initiative created the legal verbiage through which emancipation could be incorporated through existing law: contraband of war. a year later, up the james river from fort monroe, these local initiatives emerged in seeming lockstep with military maneuvers.
edmund ruffin, among the most ardent secessionists and defenders of slavery on the eve of war, noted with distress how slavery disintegrated with the approach of union arms in the peninsula campaign of . while watching emancipation in his own neighborhood, ruffin noted glumly that the “number, & general spreading of such abscondings of slaves are far beyond any previous conceptions.” the relation between nearby union armies and escape from slavery was clear, too, in newspaper advertisements offering rewards for the capture of runaways. figure displays the percentage of items in the richmond daily dispatch devoted to such advertisements over the course of the war. the highest peak comes in summer and fall , in the wake of the union approach to richmond via the peninsula.

each of those advertisements told a story of local scale and calculation. some enslaved people fled because concrete opportunities presented themselves, others did so because life had simply become intolerable. some ran away to particular destinations, with intimate knowledge of their intended routes, while others fled only in the rumored direction of union armies. whatever the case, the fugitives moved across a landscape wracked by the most profound dislocations of history at every scale.

scale, as this brief survey of emancipation reveals, is hardly a simple concept. since the early s, geographers have carried on a wide-ranging debate over the meanings of scale, and historians can learn from those debates. we can see that scales of action are produced by men and women at particular points in time and for particular political, economic, and cultural purposes. scale is self-consciously enacted; people intend their acts to have consequences of varying reach, though they can seldom perceive how far that reach extends and what results it brings to people and places they cannot see. we can also see that scale is imposed by interpreters of social action.
historians shape their conclusions in the moment they define the frame of their story, as soon as they establish a scale. analyzing scale rather than taking it for granted is the best way to avoid its pitfalls. keeping in mind the dual meanings of scale—as practiced and as perceived—helps us avoid conflating them.

making scale explicit can help us better understand how various kinds of action shaped american slavery and freedom. the broadest perspective, the scale over the largest area, reveals that rapidly changing global commodity chains profoundly influenced the material conditions of slavery and thus emancipation. in the s high cotton prices bolstered planters’ profits and fed their hunger for an independent nation. at the same time, the world’s largest manufacturer of cotton textiles, great britain, began to look for cotton in other, very different places in order to better control labor and regulate with more precision the price of raw materials. the american civil war shattered the existing pattern. as the french statistician charles j. minard demonstrated years ago in a pioneering representation of global scale (reproduced here in figure ), the shift in the global cotton market, particularly british imports from india, began as early as ; by , because of the u.s. naval blockade, the south was not a significant exporter of cotton. these fluctuations registered far beyond the cotton south, shaping human geography in the upper south, lower egypt, and western india. the cotton south would suffer from that loss of position for generations to come, emancipation arriving in a hostile economic environment not of its own making.

from the widest perspective, too, historians have seen political fortunes and ideologies spanning continents.
southerners flush with profits from the cotton trade sought to adapt the conservative ideologies and strategies of europe in the s to their purposes, but they also caught currents of nationalism that pushed along revolutions in china, germany, india, and italy. african americans were inspired and influenced by the actions of freedpeople and their allies in the caribbean and beyond. they sought grounding for their political claims in the material concerns and organizational structures of the countryside and in transnational, pan-african political desires, leading to expanding interest in the american colonization society and liberian emigration.

[figure: fugitive slave advertisements as a percentage of newspaper items in the richmond daily dispatch, – . vertical axis: percent; horizontal axis: months (jan, apr, jul, oct) across the war years.]

[figure: charles j. minard, “carte figurative et approximative des quantités de coton en laine importées en europe en et en ,” and “carte figurative et approximative des quantités de coton brut importées en europe en , , et en .”]

historians increasingly emphasize that the struggles in the united states were parts of larger struggles. an international scale emphasizes both material and ideological layers of action larger than the nation-state, deemphasizing the particular partisan struggles and the march of sectional crises that have dominated our textbooks and common narratives. pulling the camera back, paradoxically, also emphasizes the role of the enslaved people in their own freedom, showing that they were active advocates of their emancipation long before the liberty or republican parties emerged. both the global and hemispheric perspectives, in other words, reveal scales that play down the role of the nation-state, locating contingency beyond the usual borders of our understanding.
for its part, the national scale of study has its own surprises. examined from washington’s perspective, historians now see, emancipation followed an uncertain path toward an uncertain destination. abraham lincoln agonized over the timing, wording, and implementation of every step of abolition, and the thirteenth amendment followed a tortuous path that proved “anything but predictable,” michael vorenberg has discovered. the amendment “was not originally part of a carefully orchestrated political strategy; nor was it a natural product of prevailing legal principles; nor was it a direct expression of popular thought.” instead, it was the product of interaction, its meaning “contested and transformed from the moment of its appearance.” the contest over freedom was fought over the shifting meanings of emancipation for race, citizenship, and gender. as christian samito reminds us, at war’s beginning even many who hated slavery could not imagine full citizenship for former slaves, and “the idea of citizenship and suffrage for blacks had been unpopular within even the wartime republican party.” but things began to change during the war and accelerated afterward. “african american arguments for inclusion, as well as the exigencies created by postwar white southern resistance,” samito points out, “led republicans to make a profound shift during the s to embrace the idea that blacks stood entitled not just to the rights of persons but to those of a broadened concept of citizenship as well.” as amy dru stanley has shown, freedom for many african american women solidified in out of wartime arguments about black soldiers’ ownership of their wives and children.
congress fused the abolition of slavery with freedom that had been “endowed by marriage, thereby tethering a new birth of human rights to enduring domestic bonds.” the national scale shows that lawmakers brought about change that few of them had foreseen, a level of contingency at what has often been considered the most intentional scale. at another scale, historians have discovered that life at the local level possessed its own powerful dynamic. recent studies of plantations before the war have found that enslaved people lived in geographies attuned to their own needs, desires, and perceptions. white men voted in ways that make little sense until the most local information, the voting of their fathers or neighbors, is taken into account. the civil war, moreover, came to every part of the south in a unique way, triggered both by actions a thousand miles away and events around the next bend. the more we have learned about the conflict, the more the complex patterns of scale emerge. emancipation is particularly confusing—and useful—in thinking about scale because it reveals what geographers have called “scale-jumping.” despite a lack of formal access to power at state or national levels, enslaved people nonetheless bent large institutions of both the united states and the confederacy to their purposes. abraham lincoln and the u.s. army, with their vast resources, met the challenges of the confederacy through decisive military action possible only through the nation-state. their strength, in turn, gave force to individual or communal efforts at freedom the enslaved could not have had otherwise. but that state and its armies would not, could not, have exerted that force without the disruption and determination of the enslaved and freedpeople themselves. that was true from through reconstruction.
an interpretation attuned to disparate scales of action and the relations between them thus emphasizes how profoundly, and thus how intricately, the structures of state, military, and economic power connected to personal dramas. the challenge is to relate the scales of human action without collapsing them into each other, without reducing them to one or the other. scales are not like russian nesting dolls that fit inside one another. the local is not merely a subset of the national or the global, but a site of action that can change and challenge more geographically dispersed networks. one scale of analysis, too, is not necessarily more accurate or useful than the others. just as a traveler relies on maps of international air travel, airports, cities, neighborhoods, and streets, each containing layers of information relevant to its scale, so do historians. no map is intrinsically better than the others; rather, they each take on their full utility only in the context of others. the concept of deep contingency is a way to think about social action across scales; it argues that different aspects of social life connect with others in unpredictable ways in the flow of time, creating important shifts in structures and self-understanding. deep contingency is inherently spatial. the “deep” component of that concept invokes the interpenetration of personal, household, local, regional, national, and global scales. the “contingent” component invokes the connections, often unpredictable, across various facets of self-understanding and action. deep contingency means that change at one scale can trigger change at other scales, with systemic change resonating at all scales. deep contingency requires that we imagine social life in four dimensions, with change and space continually making one another. deep contingency marked each stage of the american civil war.
in secession, the outcome of elections in and triggered a cascade of consequences that led white southerners to redefine their relationship to not only the united states but their own families and even god. later, battles redefined the geography of war, which in turn redefined the possibilities of emancipation hundreds of miles away. an unanticipated product of war, freedom came with greater speed and proceeded farther than anyone could have predicted in . emancipation, once begun, displayed what appeared an inexorable logic, but freedom followed channels not of its own making. for all its complexity, emancipation occurred on a bounded landscape over a fixed number of years, with enormous documentation, and so we can see some of its patterns. visualizations can represent those patterns better than words alone. images permit us to see processes that unfold unevenly and simultaneously over time, across different scales. without mapping, it is easy to remain vague about social action, assuming consequence and reach. by requiring that evidence be apparent, visualizing a process permits us to understand how actions overlap, penetrate, and conflict with one another. historians’ characteristic and crucial focus on the singular and the anomalous provides a necessary check against overly schematized representations, and any representation suitable for history will have to be dynamic in every dimension, embodying change as part of its fundamental assumptions. but visualization allows us to see patterns and processes that are invisible in words alone. visualization allows—even requires—that we take account of scale and its consequences. by examining the entire civil war south in a single season of the war—summer —we can see interaction between geographic layers of legal enactments, military control, and shifting demography.
by then, four changing, discontinuous regions had emerged on the landscape, each with its own complex internal geography and dynamic, as we show in figure . a border region stretched through slave states that had not seceded and therefore in which slave-owners had the greatest legal recourse against emancipation. it encompassed those places where federal and state law protected slavery and the slave trade even while military practices undermined the institution. by summer , for example, kentucky had become an unlikely new center of the slave trade and the most fertile recruitment center for the united states colored troops, an anomalous product of the uneven political and legal geography of emancipation. zones of relatively long-lasting union control in the seceded south, by contrast, beading the coastline and the environs of washington, lining the mississippi, and extending to middle tennessee, left very different marks on slavery. the institution had all but fallen apart in these places. african americans living in the occupied south crowded into garrisoned cities and towns, leaving behind them a landscape nearly devoid of coerced labor. these regions also held union-sponsored plantations, where reconstruction might be tested and free labor transitions assessed but which proved vulnerable to confederate raids. even late in the war, confederate strongholds covered enormous territory, encompassing nearly all of texas; much of alabama’s black belt and the florida panhandle; as well as the piedmont from augusta, georgia, to southern virginia. these spaces had seen few, if any, union troops after three years of warfare and seemed to provide few chances for enslaved men and women to escape. still, even in these spaces, slavery had been transformed.
rumors of yankee incursions interrupted work routines; refugee planters carried what property they could from other districts, disrupting local power relations; trading with those outside neighboring counties had become even less reliable than confederate scrip; escaped slaves and deserters menaced many plantation districts. enslaved men and women across the confederate strongholds refused to work for their old masters.
[figure: approximate zones of emancipation, june . map legend: border, union occupied, battleground, confederate.]
in summer , finally, much of the south remained a battleground, where emancipation had the possibility of sweeping through entire districts with united states arms. this was certainly the intent of union officers, who devised a coordinated strategy spanning thousands of miles and including attacks up the red river of louisiana, sherman’s march southeast toward and then northward from the sea, and assaults around richmond and up the shenandoah valley. these campaigns would deprive confederate armies of the most productive lands of the south, where hundreds of thousands of enslaved people lived. these zones were also the spaces of confederate initiative and movement even late in the conflict, to the detriment of african americans. in spring and summer , these spaces encompassed nathan bedford forrest’s atrocities, jubal early’s invasion of pennsylvania, and a large number of less spectacular points where emancipation ebbed. military movements, in other words, overlay legal and demographic geographies to create a complex terrain for emancipation. each element changed according to its own dynamic. pulling layers of action apart and holding them in relation to each other at the same scale of analysis and representation, as in figure , we begin to see patterns more clearly.
the emancipation regions of the south, always moving, could also suddenly shift. preliminary emancipation proclamations and congressional acts between may and september had revolutionized the legal geography of slavery. before these acts, federal law throughout the american south protected slavery. after them, united states law put the institution into question in most of the confederacy. relatively few of the spaces where slavery had been endangered, however, were held securely by union troops in . slavery existed, if in a threatened state, throughout much of the confederacy. nowhere was the threat more urgent than in virginia, where actions taken at various scales collided. by summer of , david hunter had been transferred there to take up maj. gen. franz sigel’s position in the shenandoah valley when sigel was removed from command after the battle of new market. during hunter’s movement up the valley, he would work with his troops and enslaved men and women to enact emancipation less flamboyantly but more effectively than he had earlier in the war, operating on different layers and scales to different effect. in the valley, hunter was dependent on enslaved men and women to accomplish his military objectives and his object of emancipation. some african americans were willing to help, providing local knowledge of enemy positions and of the same terrain that thomas “stonewall” jackson’s dramatic campaign had spanned two years earlier. in helping hunter, and in using hunter to escape to freedom, men and women in the valley changed the demographic pattern of slavery, enacting emancipation on the local level, tracing a parallel line, we might say, alongside enactments of congress.
[figure: legal, military, and demographic layers of emancipation, june .]
in escaping slavery with the u.s.
army and in providing hunter’s soldiers with intelligence, these enslaved people operated within a geographic context impossible to see if we focus solely on strategic military zones, legally prescribed areas of freedom and lack thereof, or statistical representations of the census. black men and women acted at other scales, particularly within the existing familial and political networks created by migration and the slave trade. these networks, like the contingencies of battle, had a great impact on the contours of emancipation. they were part of an extended black geography, created in part through the expansive slave trade of the s, s, and s. black families, especially in long-settled districts, were often marked by the forced absence, by sale as well as by hiring, of children and spouses. in augusta county, virginia, for example, where hunter arrived in , most black families were composed of migrants from other virginia counties. the majority of migrants, a sample of whose journeys are represented in figure , came from counties across the state. local history was also state and regional history, all tied to networks of trade and profit with centers far from virginia. antebellum history was also wartime history, postbellum history. connecting history across some of these scales makes some otherwise invisible stories visible. figure maps the positions of significant u.s. army units in virginia in summer alongside all the flights of enslaved men and women reported in the richmond daily dispatch and staunton vindicator. these relative positions and timings show a loose coordination between enslaved men and women fleeing slavery and the paths of large armies. some, but certainly not all, slave escapes this summer in virginia closely accompanied union troops.
the electronic version of this map is available along with the other maps presented in this essay at the digital scholarship lab’s website on the hidden patterns of the civil war, http://dsl.richmond.edu/civilwar/. it combines information from the national archives; the official records of the war of the rebellion; and newspapers, diaries, and letters to create a matrix of emancipation. this snapshot and the narrative that follows reveal one moment in the complex processes that the larger and more dynamic map encodes. visualization must be translated into words that bring their own kind of clarity even as they necessarily sacrifice others. in june , an otherwise anonymous “jack” took leave of west view, a farming neighborhood about five miles west of staunton in the shenandoah valley. the immediate cause of his departure, most likely, was the approach of u.s. troops under hunter. that command—about twelve thousand men once they were reinforced with soldiers from the department of west virginia—stormed up the valley, pressing the relatively undermanned confederate forces east of the blue ridge as the union forces headed south, taking the small cities of staunton, lexington, and, they hoped, lynchburg. in this shifting military geography, thousands of men and women escaped slavery in virginia. among those leaving were people who had worked on farms and in industries around augusta, including a number of enslaved employees at the virginia insane asylum who followed hunter’s troops, first southward, then eventually toward west virginia. hunter’s party had two primary aims: to destroy the food-producing capacity of the valley and to destroy the railroad depot at lynchburg, cutting off any escape or supply route feeding the army of northern virginia around richmond and petersburg. jack had lived on the keller farm in augusta county for a number of years and had little desire to remain. when u.s.
troops arrived in staunton on june , the nineteen-year-old had ample opportunity to head for likely freedom in west virginia. confederate authorities had abandoned the city, and discipline was light in the countryside where he resided. jack’s designs, however, likely turned toward petersburg. he had grown up there and had been moved to augusta county unwillingly, almost certainly one of the thousands who made the trek every year from eastern virginia to points west in the state, an oft-traveled if seldom analyzed part of the second middle passage.
[figure: migrations of married augusta county freedmen and women, recorded . map legend: female, male.]
a large number of enslaved virginians took the considerable risk of leaving farms to attempt to join with u.s. army units. by late june, the army of the potomac’s maneuvers around richmond toward petersburg were well known. throughout the state, news traveled quickly that u.s. cavalry units, attached to benjamin butler’s army of the james, had raided along the richmond and danville railroad to the staunton river junction several days’ journey from west view. by the time of the petersburg campaign in and , enslaved men and women throughout eastern and central virginia had been able to find their way to union lines. such journeys were risky: the presence of lee’s army, early’s cavalry, and a number of active militias and slave patrols between staunton and petersburg made journeys such as the one jack took particularly dangerous. yet he and others took the opportunity created by the u.s. army’s campaigns in the state.
[figure: u.s. military movements and select, documented escapes from slavery in virginia, june . map labels: staunton, lynchburg, richmond, petersburg. map legend: significant u.s. army position, documented escapes from slavery.]
jack was “copper color, with a white speck in the ball of one eye.
he was wearing a green slouch hat and a pair of capped boots.” john keller had little doubt that jack was headed for his family in petersburg. we do not know whether jack was able to complete his journey, and we can only guess that, in fact, keller was correct in his assumption that jack went to petersburg. these intimate geographies are shrouded in hearsay, rumor, and circumstantial evidence that described freedmen and women according to the ideological purposes of their authors. that jack’s would-be owner understood the power of the familial network is, by itself, testimony to the importance of that network. at this scale, impossible to see without concatenating information usually bound in different scales and categories—military and civilian, regional and local, white and black—the complex consequences of the u.s. army’s incursions become apparent. the most successful attempts to attack confederate infrastructure enabled increasing numbers of african americans to find and follow union troops. in spotsylvania, hanover, and essex counties, slaveholders reported landscapes “entirely stripped” of enslaved people as african americans left plantations to follow union arms during the overland campaign. smaller movements of troops into territory that u.s. troops had not before encountered also found men and women eager to flee when those forces came near. indeed, the union army cast a shadow over a much wider area than the immediate vicinity of its marches. large armies created an extensive penumbra where slavery was disrupted. slave patrols fell apart, allowing individuals such as jack to elude capture long after nearby armies had passed. confederate authorities conscripted enslaved men from farms to work on fortifications long before a u.s. regiment threatened a county. towns and cities miles from armed conflict were flooded with immigrants, creating opportunities for urban anonymity. at the most fundamental level, the union army broke up slavery wherever it went and a good many places it did not. at a yet more intimate scale, however, we can see that armies were unreliable vehicles for emancipation, bringing heartbreak as well as liberation. after pressing through the valley, for example, hunter sent his wagon train ahead of his main body of troops en route to charlestown, west virginia. many of the enslaved people who made off with hunter originally were able to escape northward, eventually ending up in indianapolis. only a few were reenslaved during the ambushes led by john imboden’s cavalry and other units. but african americans who fled southside virginia plantations, a few days’ walk east of lynchburg, in what seemed similar circumstances at virtually the same time, were not nearly as fortunate as those with hunter’s men. the route of brigadier generals james h. wilson and august v. kautz’s cavalry units along the danville railroad, the union army’s first foray into southside virginia, was marked by slaveholder complaints of escaped slaves. during the cavalry’s return to petersburg, however, wade hampton’s confederate unit caught the u.s. troops at ream’s station. union soldiers scattered, suffering nearly fifteen hundred casualties and abandoning between two and four hundred men and women to reenslavement. coming as the u.s. congress seemed poised to end slavery through constitutional amendment, as officers and enlisted whites in the u.s. military came to support emancipation, and as the confederacy found itself increasingly hemmed in, such instances remind us that the patterns of emancipation worked at different scales in different ways, often chaotically, seeming to gather momentum at the national scale while suddenly disappearing with the fortunes of cavalry raids.
in this time and place, action at different scales produces discernible patterns: changes in the law of slavery by , enacted at the national scale, made flight toward union lines in virginia a more certain mode of emancipation than it had been earlier or than it was elsewhere in the south at that time; enslaved men and women sought out home and family at every opportunity, an ambition that combined the scale of the household with the scale of the interstate slave trade; at times, pursuit of family made it more likely that they would flee their current residence, at other times less likely; individual slaves found large and stable union armies, organized at the scale of the army or army group, more effective vehicles for emancipation than fast-moving, smaller units; household and local control over slavery became increasingly frayed over the course of the war. emancipation was a deeply patterned, deeply contingent affair that depended on the interaction of processes occurring at multiple scales. greater self-awareness about scale and geography can help us find the patterns in that vast variation, making us more alert to the nature of the contexts and the stories enacted there. to see emancipation, we have to imagine stories unfolding thousands of times across the south, each unique but each bearing the marks of its place and time. thanks to the work of generations of historians, we know more about emancipation than we might have thought possible. imaginative work at every scale has revealed the determination of the enslaved to be free in whatever measure they could—and of the complications of politics, policy, and warfare that both aided and compromised that freedom. every scale mattered, and every scale connected with the others. seeing those patterns of emancipation can help us understand the most profound social transformation in american history.
notes
the authors would like to thank robert k.
nelson, nathan altice, william blair, and the journal’s anonymous readers for their comments on the paper at various stages. university of richmond undergraduate students alex bloomfield, lauren gallagher, and kathleen lietzau assisted with research. we would especially like to thank nathaniel ayers, the creative lead at the digital scholarship lab, for his work on the article’s maps and images.
on the question of agency in u.s. slave emancipation, see w. e. b. du bois, black reconstruction in america (new york: russell & russell, ); benjamin quarles, lincoln and the negro (new york: oxford university press, ); leon f. litwack, been in the storm so long: the aftermath of slavery (new york: knopf, ); ira berlin, barbara j. fields, thavolia glymph, joseph p. reidy, and leslie s. rowland, eds., freedom: a documentary history of emancipation – , ser. , vol. , the destruction of slavery (new york: cambridge university press, ), – ; james mcpherson, “who freed the slaves?” proceedings of the american philosophical society ( ): – ; and steven hahn, a nation under our feet: black political struggles in the rural south from slavery to the great migration (cambridge, mass.: harvard university press, ); walter johnson, “on agency,” journal of social history ( ): – . on lincoln’s map of slavery, see u.s. coast survey, map showing the distribution of the enslaved population of the southern states and the united states. compiled from the census of , geography and map division, library of congress; francis bicknell carpenter, first reading of the emancipation proclamation of president lincoln (oil on canvas, ), u.s. capitol, washington, d.c.; and susan schulten, “the cartography of slavery and the authority of statistics,” civil war history ( ): – .
on the limits of debates about agency, see john rodrigue, “black agency after slavery,” in reconstructions: new perspectives on the postbellum united states, ed. thomas j.
brown (new york: oxford university press, ), – and walter johnson, “on agency,” journal of social history ( ): – .
ira berlin, joseph reidy, and leslie rowland, freedom: a documentary history of emancipation, – , ser. , the black military experience (new york: cambridge university press, ), – , – , g. m. wells to e. l. pierce, enclosed in s. p. chase to edwin m. stanton, may , , – .
edmund ruffin, may , , the diary of edmund ruffin, ed. william kauffman scarborough, vols. (baton rouge: louisiana state university, – ), : ; james marten, “a feeling of restless anxiety: loyalty and race in the peninsula campaign and beyond,” in the richmond campaign of : the peninsula and the seven days, ed. gary w. gallagher (chapel hill: university of north carolina press, ), – .
peter taylor, “a materialist framework for political geography,” transactions of the institute of british geographers, n.s., ( ): – ; neil smith, uneven development: nature, capital, and the production of space (cambridge, mass.: blackwell, ); sallie a. marston, “the social construction of scale,” progress in human geography ( ): – ; nina siu-ngan lam and dale a. quattrochi, “on the issues of scale, resolution, and fractal analysis in the natural sciences,” professional geographer ( ): – ; sallie a. marston, john paul jones iii, and keith woodward, “human geography without scale,” transactions of the institute of british geographers, n.s., ( ): – ; and andrew e. g. jonas, “pro scale: further reflections on the ‘scale debate’ in human geography,” transactions of the institute of british geographers, n.s., ( ): – . this recent debate is a resuscitation of older work on scale and the problems of representation, for example, david w. harvey, “pattern, process, and the scale problem in geographical research,” transactions of the institute of british geographers ( ): – .
the literature on slave emancipation in a global context is rich and will not be cited in detail. for recent transnational histories of slave emancipation that deal in depth with the u.s. context, see seymour drescher, abolition: a history of slavery and antislavery (new york: cambridge university press, ); david brion davis, inhuman bondage: the rise and fall of slavery in the new world (new york: oxford university press, ); rebecca j. scott, degrees of freedom: louisiana and cuba after slavery (cambridge, mass.: belknap press of harvard university press, ); edward b. rugemer, the problem of emancipation: the caribbean roots of the american civil war (baton rouge: louisiana state university press, ); matthew p. guterl, american mediterranean: southern slaveholders in the age of emancipation (cambridge, mass.: harvard university press, ). for a fuller survey, see joseph miller, bibliography of slavery and world slaving, http://www.vcdh.virginia.edu/bib/. cotton had already undergone dramatic material and conceptual changes that made such a rise in prices possible. see giorgio riello, “the globalization of cotton textiles: indian cottons, europe, and the atlantic world, – ,” in the spinning world: a global history of cotton textiles, – , eds. giorgio riello and prasannan parthasarathi (new york: oxford university press, ), – . on cotton and the u.s. role in world trade, see sven beckert, “emancipation and empire: reconstructing the worldwide web of cotton,” american historical review ( ): – . on the postwar fortunes of the cotton south, see susan o’donovan, becoming free in the cotton south (cambridge, mass.: harvard university press, ); joseph p. reidy, from slavery to agrarian capitalism in the cotton plantation south: central georgia, – (chapel hill: university of north carolina press, ); roger l.
ransom and richard sutch, one kind of freedom: the economic consequences of emancipation (new york: cambridge university press, ); gavin wright, the political economy of the cotton south: households, markets, and wealth in the nineteenth century (new york: norton, ), – , – .
on international trade and secession, see brian schoen, fragile fabric of union: cotton, federal politics, and the global origins of the civil war (baltimore: johns hopkins university press, ) and nicholas onuf and peter onuf, nations, markets, and war: modern history and the american civil war (charlottesville: university of virginia press, ). on nationalism as an international phenomenon in the midnineteenth century, see c. a. bayly, birth of the modern world, – (malden, mass.: blackwell, ), – ; and hahn, nation under our feet.
michael vorenberg, final freedom: the civil war, the abolition of slavery, and the thirteenth amendment (cambridge: cambridge university press, ), – .
christian samito, becoming american under fire: irish americans, african americans, and the politics of citizenship during the civil war era (ithaca, n.y.: cornell univ. press, ), ; amy dru stanley, “instead of waiting for the thirteenth amendment: the war power, slave marriage, and inviolate human rights,” american historical review ( ): – .
notable studies attendant to the local contexts of enslaved peoples’ lives include anthony kaye, joining places: slave neighborhoods in the old south (chapel hill: university of north carolina press, ); john michael vlach, back of the big house: the architecture of plantation slavery (chapel hill: university of north carolina press, ); maurie d.
mcinnis, the politics of taste in antebellum charleston (chapel hill: university of north carolina press, ), – ; charles joyner, down by the riverside: a south carolina slave community (urbana: university of illinois press, ); amy murrell taylor, “how a cold snap in kentucky led to freedom for thousands: an environmental story of emancipation,” in weirding the war: tales from the civil war’s ragged edges, ed. stephen berry (athens: university of georgia press, ); walter johnson, “the slave trader, the white slave, and the politics of racial determination in the s,” journal of american history ( ): – . on partisan politics at the local level, see daniel w. crofts, old southampton: politics and society in a virginia county, – (charlottesville: university of virginia press, ), – ; william g. thomas and edward l. ayers, “the differences slavery made: a close analysis of two american communities,” american historical review ( ), – , http://www.vcdh.virginia.edu/ahr; glenn c. altschuler and stuart m. blumin, rude republic: americans and their politics in the nineteenth century (princeton, n.j.: princeton university press, ).
originally, marxian geographers conceptualized scale jumping (the process whereby “political claims and power established at one geographical scale are expanded to another”) to describe the behavior of companies seeking ways to escape state-based regulatory regimes and to operate instead as multinational corporations. see marston, jones, and woodward, “human geography without scale,” , and smith, uneven development. smith later applied the concept to the actions of those with relatively little access to formal levers of power. neil smith, “contours of a spatialized politics: homeless vehicles and the production of geographical scale,” social text ( ): – . on political power and the confederacy, see stephanie mccurry, confederate reckoning: power and politics in the civil war south (cambridge, mass.: harvard university press, ).
historians have long understood what is required for a full history of emancipation and its consequences. eric foner declared of all reconstruction that “instead of proceeding in a linear, predetermined fashion, these developments arose from a complex series of interactions among blacks and whites, northerners and southerners, in which victories were often tentative and outcomes subject to challenge and revision.” reconstruction: america’s unfinished revolution (new york: harper, ), xxv–xxvii.
journal of the civil war era, volume , issue
john western, “russian dolls or scale skippers? two generations in strasbourg,” geographical review ( ): – .
on deep contingency, see edward l. ayers, in the presence of mine enemies: war in the heart of america, – (new york: norton, ); edward l. ayers, “what caused the civil war?” in what caused the civil war? reflections on the south and southern history (new york: norton, ), – .
stephen v. ash, when the yankees came: conflict and chaos in the occupied south (chapel hill: university of north carolina press, ); willie lee rose, rehearsal for reconstruction: the port royal experiment (indianapolis: bobbs-merrill, ).
the war of the rebellion: a compilation of the official records of the union and confederate armies (washington, d.c., gpo, – ), ser. , vol. , : – (hereafter or).
ira berlin, generations of captivity: a history of african american slaves (cambridge, mass.: harvard university press, ); hahn, nation under our feet; philip troutman, “slave trade and sentiment in antebellum virginia” (ph.d. diss., university of virginia, ), .
staunton vindicator, valley of the shadow: two communities in the american civil war, http://valley.lib.virginia.edu/, july , , p. , col. . the details of the story we tell of the enslaved man jack are based on a single, brief source, staunton vindicator, valley of the shadow, july , , p. , col. .
on microhistories of historical figures who leave few traces, see, among other works, wendy anne warren, “‘the cause of her grief’: the rape of a slave in early new england,” journal of american history ( ): – .
staunton vindicator, valley of the shadow, july , , p. , col. ; the shenandoah valley campaign of , ed. gary w. gallagher (chapel hill: university of north carolina press, ), ix–xii.
“outrages in essex county,” richmond daily dispatch, june , , p. , col. .
james marten, “a feeling of restless anxiety: loyalty and race in the peninsula campaign and beyond,” in the richmond campaign of : peninsula and the seven days, ed. gary w. gallagher (chapel hill: university of north carolina press, ), – .
brig. gen. john imboden to maj. gen. john c. breckinridge, june , , and col. a. moor to maj. gen. f. sigel, june , , or, ser. , vol. , : , – ; “the retreat of hunter,” richmond daily dispatch, june , , p. , col. .
earl j. hess, in the trenches at petersburg: field fortifications and confederate defeat (chapel hill: university of north carolina press, ), – ; john horn, the petersburg campaign (mechanicsburg, pa: combined publishing, ), – ; “reports from petersburg,” richmond daily dispatch, july , , p. , col. ; “negro troops,” boston daily advertiser [from richmond dispatch], august , , p. , col. .
notes for captions
figure . robert k. nelson, mining the dispatch, http://americanpast.richmond.edu/dispatch/, accessed june , . nelson has adapted mallet, a text-mining program, to develop a “topic model” of the richmond daily dispatch during the civil war. on topic modeling and historical research, see sharon block, “doing more with digitization: an introduction to topic modeling of early american sources,” commonplace ( ): http://www.historycooperative.org/journals/cp/vol- /no- /tales/. the daily dispatch was digitized by the university of richmond library in collaboration with the tufts university perseus project.
figure . courtesy of the library of congress.
figure . this figure draws inspiration from mark swanson and jacqueline d. langley, atlas of the civil war, month by month: major battles and troop movements (athens: university of georgia press, ).
figure . this exploded diagram of emancipation in three layers draws on the following sources. legality of slavery layer: “chronology of emancipation,” in free at last: a documentary history of slavery, freedom, and the civil war, ira berlin, barbara j. fields, and steven f. miller (new york: new press, ); abraham lincoln, “the emancipation proclamation,” www.archives.gov; approximate zones of emancipation, see figure ; aggregate enslaved population at the county level, census, minnesota population center, national historical geographic information system (minneapolis: university of minnesota, ).
figure . augusta county (va.) circuit court, augusta county (va.) register of colored persons cohabiting together as husband and wife, feb. , library of virginia, virginia memory online, http://www.virginiamemory.com/collections/collections_by_topic. for this exercise, we mapped approximately percent of the augusta cohabitation register, which encompassed couples. of the whole, , or percent, were natives to augusta.
figure . information on emancipation represented here comes from the staunton vindicator, the richmond daily dispatch, and the or for june , sources available online through the valley of the shadow: two communities in the american civil war, http://valley.lib.virginia.edu, the university of richmond’s daily dispatch online, http://dlxs.richmond.edu/d/ddr/, and cornell’s making of america, http://digital.library.cornell.edu/m/moawar/waro.html. we have also made use of documentary evidence found in bound volumes or behind pay walls, including such sources as the freedom series, by ira berlin, barbara j. fields, thavolia glymph, joseph p. reidy, and leslie s. rowland, eds.
[ – ] and the north american women’s letters and diaries collection, published by infotrac. army movements are largely taken from the or, though we also consulted relevant sections of frederick h. dyer, compendium of the war of the rebellion, available online at http://www.perseus.tufts.edu/; supplement to the official records of the union and confederate armies, ed. janet hewett, et al., vols. (wilmington, n.c.: broadfoot, ); james m. mcpherson, ed., the atlas of the civil war (new york: macmillan, ) and aaron sheehan-dean, concise historical atlas of the u.s. civil war (new york: oxford university press, ). further work on this mapping project will be funded by a grant from the national endowment for the humanities’ office of digital humanities.

quality indicators for blogs and podcasts used in medical education: modified delphi consensus recommendations by an international cohort of health professions educators
doi: . /postgradmedj- - 
@article{lin qualityif, title={quality indicators for blogs and podcasts used in medical education: modified delphi consensus recommendations by an international cohort of health professions educators}, author={m. lin and brent thoma and n. trueger and f. ankel and j. sherbino and t. chan}, journal={postgraduate medical journal}, year={ }, volume={ }, pages={ - } }
m. lin, brent thoma, + authors t. chan. published in postgraduate medical journal (medicine).
background: quality assurance concerns about social media platforms used for education have arisen within the medical education community. as more trainees and clinicians use resources such as blogs and podcasts for learning, we aimed to identify quality indicators for these resources. a previous study identified potentially relevant quality indicators for these social media resources.
objective: to identify quality markers for blogs and podcasts using an international cohort of health…
leveraging social media for cardio-oncology
ucla previously published works
permalink: https://escholarship.org/uc/item/ m g
journal: current treatment options in oncology, ( )
issn: - 
authors: brown, sherry-ann; daly, ryan p; duma, narjust; et al.
publication date: - - 
peer reviewed
curr. treat. options in oncol. ( ) : 
doi . /s - - - 
cardio-oncology (mg fradley, section editor)
leveraging social media for cardio-oncology
sherry-ann brown, md phd, ryan p. daly, md, narjust duma, md, eric h. yang, md, naveen pemmaraju, md, purvi parwani, mbbs, andrew d. choi, md, juan lopez-mattei, md*
address
department of cardiovascular diseases, mayo clinic, first st sw, rochester, mn, , usa
cardio-oncology program, division of cardiovascular medicine, medical college of wisconsin, w watertown plank road, wauwatosa, wi, , usa
franciscan health-indianapolis.
indiana heart physicians, east stop road, indianapolis, in, , usa
k / clinical science center, university of wisconsin carbone cancer center, highland ave., madison, wi, , usa
ucla cardio-oncology program, division of cardiology, ucla cardiovascular center, medical plaza, suite , los angeles, ca, , usa
md anderson cancer center, leukemia, holcombe blvd, unit , houston, tx, usa
division of cardiology, department of medicine, loma linda university health, anderson street, room , loma linda, ca, , usa
division of cardiology and department of radiology, the george washington university school of medicine, pennsylvania avenue nw, suite - , washington, dc, , usa
departments of cardiology & thoracic imaging, university of texas md anderson cancer center, holcombe blvd, houston, tx, usa
*, departments of cardiology & diagnostic radiology, university of texas md anderson cancer center, holcombe blvd, houston, tx, , usa
email: jlopez @mdanderson.org
* springer science+business media, llc, part of springer nature
this article is part of the topical collection on cardio-oncology
keywords: social media, some, twitter, cardioonc, cardiooncology, prevcardioonc, jacccardioonc, somecardioonc
opinion statement
as the world becomes more connected through online and offline social networking, there has been much discussion of how the rapid rise of social media could be used in ways that can be productive and instructive in various healthcare specialties, such as cardiology and its subspecialty areas. in this review, the role of social media in the field of cardio-oncology is discussed. with an estimated million cancer survivors in the usa in and million estimated by , more education and awareness are needed. networking and collaboration are also needed to meet the needs of our patients and healthcare professionals in this emerging field bridging two disciplines.
cardiovascular disease is second only to recurrence of the primary cancer or diagnosis with a secondary malignancy as a leading cause of death in cancer survivors. a majority of these survivors are anticipated to be on social media seeking information, support, and ideas for optimizing health. healthcare professionals in cardio-oncology are also online for networking, education, scholarship, career development, and advocacy in this field. here, we describe the utilization and potential impact of social media in cardio-oncology, with inclusion of various hashtags frequently used in the cardio-oncology twitter community.
introduction
scholarly works on social media in the broader fields of adult and pediatric cardiology, as well as oncology and hematology, set a precedent for the role of social media in cardio-oncology [ – ]. these works describe both benefits and limitations of cardiology and oncology on social media. the described benefits include opportunities for networking among multidisciplinary healthcare professionals and patient advocates, education of colleagues and patients, raising public and societal awareness for various diseases and conditions affecting children, adolescents, and adults in cardiology, as well as advocacy and professional branding [ – ]. in our quest to achieve these goals in cardiology on social media, it is important for us to do so in a way that promotes equity, diversity, and inclusion, particularly for women and ethnic minorities. it is also important to ensure that healthcare professionals protect the privacy of patients, and that patients and the general public understand that medical advice cannot be given on these platforms, nor should patient-doctor relationships be formed on social media in their current forms. cardio-oncology is relatively young within the field of cardiovascular medicine.
yet, the cardiac sequelae of many anti-neoplastic regimens, such as mantle radiation, anthracyclines, and targeted therapies (e.g., trastuzumab), have been known for many years. the volume of new and clinically important information continues to exponentially increase. novel therapeutics with a multitude of side effects, toxicities, and important drug-drug interactions are being produced at a rapid pace. social media can help busy clinicians and researchers keep up-to-date on current advances. real-time educational debates and informed discourse often follow publication of key articles or trials on social media platforms such as twitter, often led by trialists and content experts. fast dissemination and discussion of recent literature usually ensue, with reporting and discussion of approaches to unusual adverse effects of cancer treatments. social media is therefore changing the landscape of how we communicate, network, collaborate, and discuss current trends in cardiovascular medicine, and particularly cardio-oncology. here, we review current and proposed use of social media in cardio-oncology for networking, education, advocacy, branding, and academic career development. the presence and impact of cardio-oncology on social media at national scientific meetings, as well as the physician professionalism and patient perspectives, are also addressed. in addition, the roles of the cardio-oncology clinician or physician scientist on social media and of cardio-oncology social media editors and consultants are discussed. we primarily focus on twitter as the main social media platform, as it has been heavily embraced by the global cardio-oncology community.
organization and curation with hashtags
popular and appropriate cardio-oncology hashtags have been introduced and should continue to be widely used (fig. ).
these hashtags are metadata tags on twitter that help us organize, curate, and find content relevant to the field [ , ] (tables and ). such hashtags have been useful in various cardiology subspecialties on twitter. they connect active and engaged cardiologists, e.g., in prevention (#cvprev), cv imaging (#echofirst for echocardiography, #whycmr for cardiac mri, #yescct for cardiac ct, #cvnuc for nuclear cardiology), or structural heart disease (#tavr, #mitraclip). hashtags have also been used to establish overlapping interest areas in cardiology and diabetes (#cardiodiabetology) or cardiology and obstetrics (#cardioobstetrics; to highlight the relationship between the two specialties over the course of a woman’s lifetime), among others.
oncology and hematology meet cardiology in social media
cancer care has evolved in the last years; once an isolated specialty, oncology is now composed of multidisciplinary teams and international collaborations. the survival of our patients with cancer has significantly improved with the introduction of targeted therapy and immunotherapy, and we continue to learn the long-term consequences of those treatments, particularly cardiovascular toxicity [ ].
fig. . popular cardio-oncology twitter hashtags.
table . key twitter terminology
term | definition | example
bio | short (up to characters) personal description that appears in the user’s profile that serves to characterize their persona on twitter. | https://twitter.com/jeffhsumd
@ | used to mention usernames in tweets (“hello @twitter!”). twitter users can have their @username mentioned in tweets, communicate via direct messaging or have a link to their profile. | @uclahealth @twitter @circaha @jaccjournals @asco @jco_asco
# (hashtag) | a word or phrase immediately preceded by the # symbol. when hashtags are clicked on, users will see other tweets containing the same keyword or topic. | #cardioonc #echofirst #melanoma #breastcancer #immunotherapy
follow/follower | follow: subscribing to a twitter account. any post by that twitter account will be posted on the twitter user’s feed. follower: a twitter account that is following the user to receive the user’s tweets in the home timeline. | https://twitter.com/onco_cardiology
tweet | (noun definition): a post of up to characters that can contain photos, gifs, videos, and text. (verb definition): act of sending a tweet. tweets get shown in twitter timelines, or are embedded in websites and blogs. | https://twitter.com/jco_asco/status/ ?s=
retweet | (noun definition): a tweet that is forwarded to a user’s followers. (verb definition): act of sharing another account’s tweet to a user’s followers. | https://twitter.com/datsunian/status/
examples of user profiles used with permission. adapted from twitter glossary, https://help.twitter.com/en/glossary. accessed november ,
the online oncology and hematology communities have been among the early implementers of hashtags in order to improve the signal-to-noise ratio that can be seen in the twitter and facebook communities. the first reported cancer-specific hashtag was #bcsm (breast cancer social media), followed by #btsm (brain tumors social media) in and , respectively [ ]. subsequently, more cancer-specific hashtags have been developed. the influx of a disease-specific hashtag is generally correlated with the clinical research advances in the field [ ]. for example, in , the hashtags #lungcancer and #immunotherapy were among the five most commonly tweeted hashtags at the american society of clinical oncology (asco) annual meeting [ ].
this correlated with the presentation of clinical trials that would ultimately change the care of patients with lung cancer. the collaboration for outcomes on social media in oncology (cosmo) encourages all social media participants to use the designated disease-specific hashtags to clean the message, help new users find accurate information, and allow better data collection when research in social media is conducted [ ].
table . sample of disease-specific hashtags frequently used by the oncology and hematology communities on twitter
hashtag | disease
#ayacsm | adolescent and young adult cancer
#bcsm | breast cancer
#bmfsm | bone marrow failure syndromes
#bmtsm | bone marrow transplantation
#bpdcn | blastic plasmacytoid dendritic cell neoplasm
#breastcancer | breast cancer
#childhoodcancer | childhood cancer
#crcsm | colorectal cancer
#globonc | global oncology
#gyncsm | gynecologic cancer
#immunoonc | immuno-oncology
#immunotherapy | immunotherapy
#kcsm | kidney cancer
#lcsm | lung cancer
#leukemia | leukemia
#leusm | leukemia
#lungcancer | lung cancer
#lymsm | lymphoma
#mdssm | myelodysplastic syndrome
#mmsm | multiple myeloma
#mpnsm | myeloproliferative neoplasms
#pallonc | palliative oncology
#pancsm | pancreatic cancer
#pcsm | prostate cancer
#pedcsm | pediatric cancer
#supponc | supportive care in oncology
social media also provide us with the opportunity to network among many specialties, with the discussion of cases among cardiologists, oncologists, pathologists, and radiation oncologists. this unprecedented exchange of information and ideas can be guided by the use of hashtag twitter medical communities, pioneered in part by many colleagues in hematology, oncology, and cardiology, with specific hashtags developed for common use; widespread collaboration has aided in providing a framework for ongoing discussions [ , , ].
ultimately, social media has brought the cardio-oncology community together (including our colleagues in hematology, vascular medicine, and others), and has helped increase awareness about many new entities and the subspecialty itself. several resources exist for medical trainees and new social media users [ , , , ], many of which are discussed in this review.
cardio-oncologist on social media
each cardio-oncologist on social media plays a role in the education of not only other cardio-oncologists, but also other physicians and advanced practice providers involved in the care of patients in cardio-oncology. both radiotherapy and chemotherapy independently and synergistically appear to injure the pericardium, myocardium, valves, conduction system, and coronary vessels [ , ]. immunotherapy and targeted therapies can also injure the pericardium, myocardium, coronary vessels, or conduction system [ – ]. therefore, it would be prudent to educate all physicians and other healthcare professionals in cardiology at all career stages about #cardioonc. other disciplines crucial to cardio-oncology, such as medical oncology, radiation oncology, surgical oncology, hematology, and internal medicine, should also be engaged. this collaborative learning community would be best for patients, as we seek to prevent, manage, and limit the burden of existing cardiotoxicity. both clinicians and physician scientists in cardio-oncology and preventive cardio-oncology could take up the mantle of patient care, research, and education, along with community engagement and global collaborations. different yet complementary perspectives from clinical practice, basic science, and translational medicine can coalesce to form a cohesive field and learning community on social media. the entire cardio-oncology social media community participates in a disruptive public crowdsourced peer review process in which educational items can be evaluated and advanced.
in this way, robust discussions and debates among physicians and scientists at various stages of their careers, as well as patients, patient advocates, and other stakeholders, rigorously and democratically sharpen and disseminate ideas internationally [ , ]. daily, “case reports” are posted for the purpose of educating our colleagues and patients in a manner that can be judicious and meaningful [ ]. cases should not include identifiable patient information, and should be accompanied by informed patient consent [ ]. often, prior publications are shared, as well as professional anecdotes, in collective wisdom or query, sprinkled with reflections on self-care and physician burnout and moral injury [ , ].
educational content: cardio-oncology initiatives
professional societies and academic institutions use the hashtags #cardioonc and #cardiooncology on twitter to highlight scientific content in tutorials termed “tweetorials,” as well as case-studies, educational webcasts, and podcasts, many of which are free to access. many journals have also created their own unique hashtags. for example, the new journal jacc: cardiooncology routinely tweets its content accompanied by a central illustration and link to the article with the hashtag #jacccardioonc. an online medical education platform, medpage (medpage.com), uses a unique hashtag, #cardiooncoconnect, to facilitate cardio-oncology twitter chats about various #cardioonc topics, in collaboration with acc and asco. a few acc chapters have also followed suit. the texas-acc chapter (twitter handle: @txchapteracc) sponsors the texas cardiooncology seminar #tcos , as well as an educational blog on cardio-oncology topics. the indiana-acc chapter (twitter handle: @inacc) and the indiana cardiooncology network (#icon) together sponsor educational events that can be freely accessed through twitter and youtube.
several individuals practicing in cardio-oncology have also established themselves as influencers in education on social media. social media influencers are those who regularly post and garner a substantial following of engaged individuals focused on specific topics or themes, generating content using the hashtags #cardioonc, #cardiooncology, and #prevcardioonc. influencers set the trends and tone in discussions related to the relevant topics and themes. followers grow to trust influencers, who establish themselves as thought leaders. as early career and midcareer clinicians (and some researchers) in cardio-oncology, all authors of this article frequently post trendsetting articles, polls, questions, tips, and other information that help engage and educate the community. it should be noted that influence can be carried by tweeting under one’s own name, the name of a specialty, the name of a program or institution, a journal, or any of the above. in fact, many influencers oversee multiple social media accounts if the focus of each account is somewhat different from the other accounts, or if a particular field or program should be emphasized. a good example is @prevcardioonc and #prevcardioonc on twitter (and the upcoming social education blog prevcardioonc.com). this twitter handle and hashtag were created to introduce and disseminate various new ideas for prevention in cardio-oncology, while building a specific community around these ideas. there are other #cardioonc influencers on social media who also help engage the community. some community members post about recent scientific and medical journal publications from others (or themselves); others promote their podcasts through twitter.
other authors develop “tweetorials,” which are essentially tweets threaded together to form mini-lectures on focused topics. by actively participating in the vibrant social media community, several physicians and trainees have had the opportunity to forge successful international collaborations for data analysis and publication in #cardiooncology. tweeters for specialty tracks at cardiology conferences are often drawn from among such pools of active participants in academic #cardioonc communities on social media. the acc has been an early adopter engaging cardiologists on twitter. several national and international societies are also beginning to integrate social media coverage into their meetings and literature. the american heart association, european society of cardiology, heart failure society of america, association of black cardiologists, and several other societies are now designating societal “tweeters,” “influencers,” “catalysts,” “ambassadors,” and “commentators,” who often are assigned to specific subspecialty tracks for knowledge dissemination during the societies’ national or international annual scientific sessions. such influencers are often regarded as the “go-to folks,” and may frequently receive direct messages for opinions or collaboration. these are also typically the individuals who serve as social media editors and consultants for journals for societies such as the acc, american society of nuclear cardiology, and other journal editorial boards on which the authors of this article serve, many of which have formal cardiooncology journal sections.
cardio-oncology conferences on social media
activities related to networking and education in the cardiology social media community (commonly referred to as #cardiotwitter) often occur around the time of large national scientific sessions for major professional societies in cardiology, such as those of the american college of cardiology (acc), american heart association (aha), and european society of cardiology [ , , ]. indeed, there has been a tremendous increase in twitter usage by cardiologists around the time of these conferences [ ]. similar findings have been reported for major oncology meetings, as cancer specialists are also leading the way on twitter [ , – ]. the asco annual meeting, for example, has seen an -fold increase in the number of social media participants and content [ ]. most participants in the social chatter surrounding (especially the multidisciplinary) sessions at these conferences are actively tweeting pearls and insights from the conferences, while others are tweeting their responses to the scientific data and appreciation for being able to participate online. this allows physicians and other healthcare providers who are not attending the conferences to remain up to date regarding new research findings and changes in practice [ ]. such efforts by those at the conference and those remotely “listening” to the conference chatter on social media help to increase engagement of cardiologists, oncologists, and hematologists worldwide, even if they are unable to be physically present. this broadens access to educational material that would otherwise be limited to those attending in person. sessions, courses, or conferences on cardio-oncology are catching up as well, with participants specifically tweeting from and about cardio-oncology sessions, posters, and gatherings at these conferences or at conferences dedicated to the field.
there is great need to spread awareness and educate others currently in practice or those in training, to meet the needs of the growing cancer survivor population. conference speakers should [ ]:
• consider the fact that your slides will likely be photographed and tweeted.
• design your slides accordingly and include your twitter handle and the conference hashtag.
• speak slowly to give the physical audience the opportunity to prepare and send off their tweets about your work.
• finally, engage your worldwide virtual audience on social media (#some) to promote your talk in advance and then continue the conversation online after the presentation and even the conference.
such efforts help to broaden access to educational material and enhance the social media presence of your work and also the professional societies and sections hosting the conference or supporting your work.

role of the social media editorial board
in addition to individuals and professional societies, scientific journals have also adopted the use of social media. social media is being used to promote new articles and upcoming journal issues, enhanced by journal-initiated twitter journal clubs and twitter chats [ ]. previous studies in cardiology had shown that editorial board members at top journals were not appreciably present or active on twitter. the findings suggested a chasm between academic cardiology thought and science leadership and potential consumers of the vast knowledge being published in the journals. while consumers engaged with the actual journal twitter handles, interaction from academic cardiology journal thought leaders was lacking. since then, many scientific journals have expanded their editorial boards to include members who are responsible for sharing journal content and highlighting select in-press manuscripts on social media, with inclusion of author twitter handles now being requested with each manuscript submission.
some journals also request draft tweets from authors of submitted or accepted manuscripts (e.g., https://www.thepermanentejournal.org/authors/prepare.html). roles that have been assigned to social media editors consist of composing tweets about accepted journal articles, assigning composition of online contents (blogs), editing authors’ composed tweets and associated media, and creating and either posting or delegating content [ ]. in jacc: cardiooncology, for example, the social media editorial board consists of social media directors (smd) and social media consultants (smc) [ ]. the role of smds and smcs is to leverage the power of social media and the associated global audience to facilitate the dissemination of cardiovascular and cardio-oncology health information and education rapidly and efficiently [ ]. the smds develop educational content in collaboration with, and upon approval of, the editorial board. the content creation and dissemination are performed in collaboration with the smcs. one example of educational content creation in social media is a series of tweets disseminated using the @jaccjournals twitter account. the tweets highlighted seminal papers in cardio-oncology shared under the heading “how far we’ve come in #cardioonc with #jacccardioonc” [ ]. the intention behind this series of tweets was to educate the twitter audience on some of the most important papers in the field of cardio-oncology, by summarizing findings and sharing the most relevant graphics and references. this campaign was developed to promote the release of jacc: cardiooncology. the social media editorial board in a cardio-oncology journal may create content by developing blogs related to the journal’s articles, hosting live journal clubs
using social media platforms, developing or editing and sharing summarized visual abstracts, establishing or participating in podcasts, and tweeting articles in-press with the most appropriate accompanying illustrations. one of the main goals of any organization that promotes or supports cardio-oncology should be education, as this has been reported previously as one of the most important barriers in establishing cardio-oncology programs in hospitals [ ]. multidisciplinary collaborations among cardiologists, oncologists, and other stakeholders (e.g., nurses, advanced practitioners, administration executives, patient advocacy groups) using social media can foster synergistic relationships and develop mutual interests, thereby strengthening the field. in this way, social media can be used to encourage engagement of more stakeholders. this ensures that education is not limited to those who subscribe to a particular journal or read a particular issue. strategic plans need to be developed by smds during cardio-oncology meetings and other cardiology meetings with cardio-oncology sessions to share educational content from the journal relevant to discussions at the meetings. it is important to support the participation of other journal editorial board members, to increase the attention and engagement of the audience, for example, by sharing brief video clips of interviews and perspectives of the editorial board. smart and considerate engagement is key, as is abiding by specific journal policies regarding social media activity. best practices on the use of #cardiotwitter have been described [ ]. in general, “strive for accuracy and quality, give credit, share perspectives, and be civil” [ ].

opportunities for patient engagement in cardio-oncology on social media
we are connected globally.
while on social media, cardio-oncologists are accessible to patients and providers worldwide. globally it is estimated that billion people have mobile devices, with half of these devices being smartphones. according to pew research, smartphone ownership in the usa exceeds % of the population. most americans are online daily; twitter has million daily active users. the internet and social media have become major sources of information for all. nearly % of all seniors use the internet daily, with more than half of these individuals doing so for health-related concerns. an overwhelming majority of patients ( %) in one survey of cancer patients, survivors, and caregivers reported using the internet to search for information about their diagnoses. more than % of adults in the usa use at least one social networking site. in addition, % of physicians use some form of social media for personal or private reasons, with % of physicians using social media for professional reasons [ ]. given these statistics for both patients and physicians, social media has the opportunity to dramatically extend the reach and amplify the voice of the cardio-oncologist. instead of reaching – patients each day as many of us do in clinic, #some offers the opportunity to reach hundreds, or perhaps thousands, of patients daily [ ]. benefits may include providing patients with trusted, timely, understandable, and targeted health information vetted by physicians and professional societies. social media allows cardio-oncologists to curate and provide educational content for patients. social media provides an opportunity to educate the misinformed and uninformed about health behavior change and best practices to improve outcomes. the instantaneous exchange of information is incredibly powerful.
use of social media may allow a physician to keep an ongoing relationship with the patient community, allowing for ongoing education and communication opportunities. patients for the most part are reliant on physicians for insights, interpretations of medical literature and recent studies on medical advances, resolution of medical controversies, and importantly, limitations to our current knowledge and understanding of their disease and its treatment [ ]. furthermore, some patients appreciate the transparency that comes with public debate about trial results and different treatment modalities [ ]. hard data remain lacking regarding the outcomes of social media content in terms of knowledge or behavioral change of followers, although anecdotes abound. notably, patients may benefit from interacting with or following healthcare professionals on social media. a salient illustration involves a survivor of breast cancer who was diagnosed with cardiomyopathy and wanted to learn more about her condition and the field of cardio-oncology and wished she could find a relevant hashtag. she was introduced to the cardio-oncology community, and it seemed she had found what she was looking for. perhaps patients should be intentionally invited to enter into the #cardioonc, #cardiooncology, and #prevcardioonc communities as patient advocates, to help us as healthcare professionals better understand their path and needs. only by understanding them can we continue to optimally impact them. patients are quite active on twitter by following hashtags such as #cardioonc, #lcsm (lung cancer social media), and #bcsm (breast cancer social media) [ , , , ] (table ). indeed, patients already access social media to gain increased knowledge regarding their disease and its treatment and prognosis. patients may also use social media to express their emotions, share their experience with others, get advice, or be in touch with healthcare professionals.
patients and their families create their own virtual online community centered on the disease they are battling in the setting of their own specific circumstances (e.g., late-stage disease, disease in the very young). facebook and twitter are currently homes for many of these disease-specific groups and hashtags [ , ]. the cardio-oncologist may choose to interact with these communities to provide current and peer-reviewed content, which may both inform and empower patients and their caregivers, and also help them to seek and obtain appropriate care. cardio-oncologists, oncologists, and patients may read disease-specific news about advances regarding various cancers, by searching for frequently used hashtags (table , fig. ). while social media may be a useful tool, there are age and demographic disparities regarding internet access and frequency of use of social media. younger people are more likely to have internet access and use social media than older people, although internet use has increased in older persons [ ]. among those who use the internet, social media use appears to be higher among racial and ethnic minorities than among non-hispanic whites [ ]. unfortunately, social media is not a panacea and access will remain a problem, especially in patient populations that suffer from language barriers, mental or cognitive difficulties, lower socioeconomic status, or literacy barriers [ ].

patient perspectives of physicians on social media
there are limited data on patient perceptions of physicians’ social media use and their perceptions of physicians’ professionalism. one study demonstrated that a physician’s facebook profile may influence a patient’s perception of the provider’s professionalism. data are lacking regarding patients’ perception of physician professionalism in the context of twitter usage.
in this modern era, patients also use the internet to research their physicians before meeting them for the first time. their first impression is therefore in part influenced by the information they encounter online. social media may be a tool for physicians to optimize their online presence and present a positive image, to help shape that first impression [ ].

adolescents and young adults
in every medical field, it is important to reach out to all potential subgroups of patients. one critical group consists of adolescent and young adult cancer survivors (ages late teen to under age ) [ ]. this group is designated by the national cancer institute (nci) as vulnerable and with unique needs [ ]. greater emphasis should be placed on information exchange for this group of patients, many of whom are experiencing relationships, body changes, transition of life from parents’ house to their own, first jobs, fertility and parenting, higher education, financial and insurance barriers, and other life events for the first time [ , ]. multiple groups have shown that this group of patients may have the highest level of interaction on social media in their general lives, and this continues throughout their journey as cancer survivors, many of whom develop cardiovascular diseases at young ages [ ].

pitfalls and solutions in social media
important pitfalls of using social media for healthcare communication should be discussed. anything shared is public, and a digital trail is left behind even after a post is deleted. all posts instantly become public knowledge accessible to patients, colleagues, and future prospective employers. all posts should be thoughtful and ideally useful additions to academic community discussions. the quality and lack of reliability of health information on social media is also of concern. it may be challenging for patients and clinicians alike to discern the reliability of information found online.
physicians should make every effort to avoid posting errant information. patients using social media may be unaware of the risks of disclosing personal information online or of using incorrect advice. patients may also become overwhelmed and overloaded with information, even if reliable and accurate. it is possible for the general public to be uncertain about how to correctly apply online information to their personal health situation, especially given the subtleties and complexities of care. concerns about privacy, confidentiality, and data security should also be considered. specific patient details or details from which a patient can be identified should not be shared on social media. in addition, informed consent should be obtained from patients whose clinical information is shared on the internet. patients should be educated about how such platforms operate, and how the online community may respond to information being posted. it is also important to not give patients specific medical advice through social media platforms, and instead to speak more generally about disease processes, treatment, and prevention. the american medical association (ama), the american college of physicians (acp), and the federation of state medical boards (fsmb) have created guidelines for healthcare professionals to help create and maintain a social media presence while ensuring standards of patient privacy and confidentiality [ ]. current acp/fsmb recommendations are as follows [ ]:
• keep professional and personal accounts separate; do not individually “friend” or contact patients from your practice through social media.
• text messaging with patients for a medical interaction, even with an established patient and with consent, is discouraged.
• email only patients with patient consent and an established patient-physician relationship.
• recognize that documentation about patient care is part of the medical record.
• if approached for clinical advice through electronic media outside of a patient-physician relationship, handle the request with good judgment; consider scheduling the individual for an office visit, or, if urgent, advise the individual to visit the nearest emergency department.
• establishing a professional profile so that it, rather than a physician ranking site, appears first during a search can help guide the accuracy and utility of information read by patients prior to their initial encounter.
the ama cautions that physicians should monitor their own internet presence to ensure the accuracy and appropriateness of their personal and professional content. physicians are cautioned to maintain appropriate boundaries of the patient-physician relationship congruous with standard professional ethics guidance. the ama also holds a stance that physicians who see unprofessional content from their colleagues have a responsibility to bring the unprofessional content to the attention of the posting physician. it is vital that physicians recognize that online behavior and content may affect their reputations, have downstream consequences to their medical careers, and undermine public trust in the medical profession.

career development
with the acute and chronic cardiovascular conditions of cancer survivors comes the need for specialized care by health professionals with sufficient exposure to cardio-oncology. this necessitates networking and recruitment for both training and hiring of individuals dedicated to caring for these patients. social media vastly expands the opportunity for such networking on twitter, blogs, employment opportunities (e.g., cardioonctrain.com/fellowships-jobs), and other platforms.
the international multidisciplinary community on social media consists of cardiologists, oncologists, cardio-oncologists, and other related specialists, who can scour the terrain and identify appropriate candidates for training or hiring. networking and thought leadership on social media can also help facilitate faculty promotion at academic institutions. social media portfolios can be developed to catalog important contributions to social media, e.g., dissemination of new publications, journal clubs, tweetorials, and so on. many large academic institutions are incorporating social media contributions into their curriculum vitae templates [ – ]. online methods of sharing and disseminating these papers may be effective in expanding their reach and readership, as well as citations [ – ]. one study noted a strong association between social media exposure on twitter and rates of journal article citations [ ]. in another study, articles that were tweeted by several individuals were times more likely to be highly cited than those tweeted by only a few individuals [ ]. in the world of alternative metrics (from which “altmetrics” is derived), it has been suggested that “tweetations” be used to calculate a “twimpact factor” that may predict and estimate traditional citations [ , ]. thus, increased presence and engagement with these papers on social media can represent the breadth of impact of journal articles, and by extension, an individual faculty member’s work. although such a construct is yet to be widely embraced, social media portfolios on digital scholarship are currently being used by early adopters to assist in decisions to determine academic promotion and tenure. digital scholarship has been thought to be a gamechanger in the path to academic permanence and leadership. a real-life example by jeard gardner, md is publicly available online and could be adopted and used.
similarly, we should become leaders in our institutions’ efforts to incorporate digital scholarship into academic career development. as we think about building social media portfolios and using analytics for academic promotion, as a community, we will need to determine metrics, regulation, adoption, and standardization. who will determine appropriate metrics, and regulate how these portfolios and analytics are used? will there be standardization across institutions nationally, and perhaps also internationally? do we need to create a regulatory body for academic social media? would this benefit from or be appropriate for a guidelines or scientific statement document? all of these considerations need to be addressed, as all of these will likely continue to become part of routine academic practice. several groups have published recommendations for establishing portfolios and other means of curating digital scholarship for academic promotion and tenure [ – ]. such recommendations could be endorsed by the #cardioonc community to help advance the concept in academic medicine.
for ease of access, we have shortened the weblinks to tinyurl.com/someoverview and tinyurl.com/somedropbox.

conclusion
in our efforts to provide the best care for our patients in cardio-oncology, we need to be part of their healthcare conversations. the majority of patients are outliving their malignancies and subsequently developing acute or chronic cardiovascular diseases. many of these patients are turning to social media for support and information. it is prudent for their cardiovascular team to provide reliable information online, e.g., on twitter. social media can be used as a tool to augment our listening to patients’ views, better understand their needs, and provide topical, accurate, and trusted healthcare information to inform the misinformed and uninformed.
as digital methods become more common in the recruitment of patients for clinical trials, cardio-oncology should be at the helm of adopting digital transformation for patient-centered research. not only do we then have the opportunity to develop awareness among our patients, but we also benefit from interacting with other healthcare professionals, journals, societies, and conferences. in fact, social media gives us increased opportunity for national and international collaborations across disciplines related to care of our shared patients. the digital transformation in cardio-oncology can include new methods for education and study regarding differential mechanisms of various cardiovascular toxicities. this will likely improve our understanding of mortality risk and epidemiology, and help us further advance our efforts at both management and prevention. utilizing specific hashtags on social media, we have the ability to link instantaneously in real time the entire world’s cardio-oncology community to discuss, debate, educate, share, and support and learn from one another. twitter, along with social media in general, presents a free global platform to disseminate education and cardio-oncology information. creating online presence enhances visibility for both patients and the referring physician base. consequently, we would encourage all cardio-oncology practitioners, administrators, and aficionados to grab their twitter handle and swing into gear! let us use the social media platform to educate one another. network with colleagues to identify gaps in knowledge, appraise current literature, identify important barriers to care (e.g., financial toxicity and burdens to patients, excessive prior authorization, administrative burdens), propose solutions, motivate patients, and provide healthcare information to the global cardio-oncology community.
engage the community frequently, schedule several posts and create others in real time, adhere to institutional social media or communications policies, and protect your authentic personal brand and professional reputation, with utmost professionalism and camaraderie [ ]. this will help us all to establish our ground as early adopters of social media for online and offline community and career development. we hope that our perspectives in cardio-oncology will help to provide a roadmap for appropriate and fruitful use of social media for a myriad of benefits in this emerging cardiology subspecialty.

compliance with ethical standards
conflict of interest
sherry-ann brown declares that she has no conflict of interest. ryan p. daly serves as a social media consultant for jacc: cardiooncology. narjust duma has received compensation from inivata for service as a consultant. eric h. yang declares that he has no conflict of interest. naveen pemmaraju has received research funding from abbvie, stemline therapeutics, novartis, samus therapeutics, cellectis, plexxikon, daiichi sankyo, and affymetrix; is supported by grants from the sager strong foundation and dan’s house of hope; has received compensation from abbvie, celgene, stemline therapeutics, incyte, novartis, mustang bio, roche diagnostics, and lfb for service as a consultant; and has served as a board member/volunteer for dan’s house of hope and the hemonc times/oncology times. purvi parwani has received compensation for service as a consultant on the journal of the american college of cardiology. andrew d. choi declares that he has no conflict of interest. juan lopez-mattei has received compensation from arterys for service as a consultant.

references and recommended reading
parwani p, choi ad, lopez-mattei j, raza s, chen t, narang a, et al. understanding social media: opportunities for cardiovascular medicine. j am coll cardiol. ; ( ): – .
mandrola j, futyma p.
trends cardiovasc med: the role of social media in cardiology; .
schumacher kr, lee jm, pasquali sk. social media in paediatric heart disease: professional use and opportunities to improve cardiac care. cardiol young. ; ( ): – .
rosselló x, stanbury m, beeri r, kirchhof p, casadei b, kotecha d. digital learning and the future cardiologist. eur heart j. ; ( ): – .
pemmaraju n. editorial overview: emerging importance of social media for real-time communication in the modern medical era. semin hematol. ; ( ): – .
perales ma, drake ek, pemmaraju n, wood wa. social media and the adolescent and young adult (aya) patient with cancer. curr hematol malig rep. ; ( ): – .
katz ms, utengen a, anderson pf, thompson ma, attai dj, johnston c, et al. disease-specific hashtags for online communication about cancer care. jama oncol. ; ( ): – .
pemmaraju n, thompson ma, mesa ra, desai t. analysis of the use and impact of twitter during american society of clinical oncology annual meetings from to : focus on advanced metrics and user trends. j oncol pract. ; ( ):e –e .
sedrak ms, dizon ds, anderson pf, fisch mj, graham dl, katz ms, et al. the emerging role of professional social media use in oncology. future oncol. ; ( ): – .
dizon ds, graham d, thompson ma, johnson lj, johnston c, fisch mj, et al. practical guidance: the use of social media in oncology practice. j oncol pract. ; ( ):e – .
walsh mn. social media and cardiology. j am coll cardiol. ; ( ): – .
siegel rl, miller kd, jemal a. cancer statistics, . ca cancer j clin. ; ( ): – .
pemmaraju n, gupta v, mesa r, thompson ma. social media and myeloproliferative neoplasms (mpn)–focus on twitter and the development of a disease-specific community: #mpnsm. curr hematol malig rep. ; ( ): – .
pemmaraju n, utengen a, gupta v, thompson ma, lane aa.
blastic plasmacytoid dendritic cell neoplasm (bpdcn) on social media: #bpdcn-increasing exposure over two years since inception of a disease-specific twitter community. curr hematol malig rep. ; ( ): – .
widmer rj, larsen cm. call for fits/ecs to become engaged with social media. j am coll cardiol. ; ( ): – .
chang hm, moudgil r, scarabelli t, okwuosa tm, yeh eth. cardiovascular complications of cancer therapy: best practices in diagnosis, prevention, and management: part . j am coll cardiol. ; ( ): – .
chang hm, okwuosa tm, scarabelli t, moudgil r, yeh eth. cardiovascular complications of cancer therapy: best practices in diagnosis, prevention, and management: part . j am coll cardiol. ; ( ): – .
hu jr, florido r, lipson ej, naidoo j, ardehali r, tocchetti cg, et al. cardiovascular toxicities associated with immune checkpoint inhibitors. cardiovasc res. ; ( ): – .
zarifa a, salih m, lopez-mattei j, lee h, iliescu c, hassan s, et al. cardiotoxicity of fda-approved immune checkpoint inhibitors: a rare but serious adverse event. journal of immunotherapy and precision oncology. ; ( ): – .
touyz rm, herrmann j. cardiotoxicity with vascular endothelial growth factor inhibitor therapy. npj precis oncol. ; : .
brown sa, nhola l, herrmann j. cardiovascular toxicities of small molecule tyrosine kinase inhibitors: an opportunity for systems-based approaches. clin pharmacol ther. ; ( ): – .
cifu as, vandross al, prasad v. case reports in the age of twitter. am j med. .
yeh rw. academic cardiology and social media: navigating the wisdom and madness of the crowd. circ cardiovasc qual outcomes. ; ( ):e .
hudson s, mackenzie g. 'not your daughter's facebook': twitter use at the european society of cardiology conference . heart. ; ( ): – .
tanoue mt, chatterjee d, nguyen hl, sekimura t, west bh, elashoff d, et al. tweeting the meeting. circ cardiovasc qual outcomes. ; ( ):e .
pemmaraju n, mesa ra, majhail ns, thompson ma. the use and impact of twitter at medical conferences: best practices and twitter etiquette. semin hematol. ; ( ): – . . søreide k, mackenzie g, polom k, lorenzon l, mohan h, mayol j. tweeting the meeting: quantitative and qualitative twitter activity during the th esso con- ference. eur j surg oncol. ; ( ): – . . chaudhry a, glodé lm, gillman m, miller rs. trends in twitter use by physicians at the american society of clinical oncology annual meeting, and . j oncol pract. ; ( ): – . . gorodeski e. make your mark at medical meetings with social media [available from: https://www. medpagetoday.com/practicemanagement/ informationtechnology/ . . redfern j, ingles j, neubeck l, johnston s, semsarian c. tweeting our way to cardiovascular health. j am coll cardiol. ; ( ): – . . lopez m, chan tm, thoma b, arora vm, trueger ns. the social media editor at medical journals: responsi- bilities, goals, barriers, and facilitators. acad med : journal of the association of american medical col- leges. ; ( ): – . . jacc: cardiooncology – launching in september [website]. [updated . available from: http://www.onlinejacc.org/jacc-cardiooncology. . @jaccjournals. twitter [tweet]. available from: https://twitter.com/jaccjournals/status/ ?s= . . barac a, murtagh g, carver jr, chen mh, freeman am, herrmann j, et al. cardiovascular health of patients with cancer and cancer survivors. a roadmap to the next level. ; ( ): – . . parwani p, choi ad, lopez-mattei j, raza s, chen t, narang a, et al. understanding social media. oppor- tunities for cardiovascular medicine. ; ( ): – . . househ m, borycki e, kushniruk a. empowering pa- tients through social media: the benefits and chal- lenges. health informatics j. ; ( ): – . . campbell l, evans y, pumper m, moreno ma. social media use by physicians: a qualitative study of the new frontier of medicine. bmc med inform decis mak. ; : . . hawkins cm, dela oa, hung c. 
social media and the patient experience. j am coll radiol. ; ( pt b): – . . kuehn bm. social media becomes a growing force in cardiology. circulation. ; ( ): – . . attai dj, cowher ms, al-hamadani m, schoger jm, staley ac, landercasper j. twitter social media is an effective tool for breast cancer patient education and support: patient-reported outcomes by survey. j med internet res. ; ( ):e . . thompson ma, younes a, miller rs. using social me- dia in oncology for education and patient engagement. oncology (williston park). ; ( ): , – , . . chou wy, hunt ym, beckjord eb, moser rp, hesse bw. social media use in the united states: implications for health communication. j med internet res. ; ( ):e . . widmer rj, shepard m, aase la, wald jt, pruthi s, timimi fk. the impact of social media on negative online physician reviews: an observational study in a large, academic. multispecialty practice j gen intern med. ; ( ): – . . bleyer a, barr r. cancer in young adults to years of age: overview. semin oncol. ; ( ): – . . wood wa, lee sj. malignant hematologic diseases in adolescents and young adults. blood. ; ( ): – . . parsons hm, schmidt s, harlan lc, kent ee, lynch cf, smith aw, et al. young and uninsured: insurance pat- terns of recently diagnosed adolescent and young adult cancer survivors in the aya hope study. cancer. ; ( ): – . . woodruff tk, smith k, gradishar w. oncologists' role in patient fertility care: a call to action. jama oncol. ; ( ): – . . farnan jm, snyder sulmasy l, worster bk, chaudhry hj, rhyne ja, arora vm, et al. online medical profes- sionalism: patient and public relationships: policy statement from the american college of physicians and the federation of state medical boards. ann intern med. ; ( ): – . . farnan jm, sulmasy ls, chaudhry h. online medical professionalism. ann intern med. ; ( ): – . . cabrera d, vartabedian bs, spinner rj, jordan bl, aase la, timimi fk. 
more than likes and tweets: creating social media portfolios for academic promotion and tenure. j grad med educ. ; ( ): – . . cabrera d, roy d, chisolm ms. social media scholar- ship and alternative metrics for academic promotion and tenure. j am coll radiol. ; ( pt b): – . page of curr. treat. options in oncol. ( ) : https://www.medpagetoday.com/practicemanagement/informationtechnology/ https://www.medpagetoday.com/practicemanagement/informationtechnology/ https://www.medpagetoday.com/practicemanagement/informationtechnology/ http://www.onlinejacc.org/jacc-cardiooncology https://twitter.com/jaccjournals/status/ ?s= https://twitter.com/jaccjournals/status/ ?s= . sherbino j, arora vm, van melle e, rogers r, frank jr, holmboe es. criteria for social media-based scholar- ship in health professions education. postgrad med j. ; ( ): – . . quinn a, chan tm, sampson c, grossman c, butts c, casey j, et al. curated collections for educators: five key papers on evaluating digital scholarship. cureus. ; ( ):e . . carpenter cr, cone dc, sarli cc. using publication metrics to highlight academic productivity and re- search impact. acad emerg med. ; ( ): – . . widmer rj, mandrekar j, ward a, aase la, lanier wl, timimi fk, et al. effect of promotion via social media on access of articles in an academic medical journal: a randomized controlled trial. acad med. . . finch t, o'hanlon n, dudley sp. tweeting birds: on- line mentions predict future citations in ornithology. r soc open sci. ; ( ): . . buckarma eh, thiels ca, gas bl, cabrera d, bingener- casey j, farley dr. influence of social media on the dissemination of a traditional surgical research article. j surg educ. ; ( ): – . . eysenbach g. can tweets predict citations? metrics of social impact based on twitter and correlation with traditional metrics of scientific impact. j med internet res. ; ( ):e . . smith zl, chiang al, bowman d, wallace mb. 
publisher’s note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

designing a collaborative peer-to-peer system for archaeology: the digventures platform

research article

brendon wilkins
university of leicester and digventures, gb
brendon@digventures.com

wilkins, b. . designing a collaborative peer-to-peer system for archaeology: the digventures platform. journal of computer applications in archaeology, ( ), pp. – . doi: https://doi.org/ . /jcaa.

digital technologies are ubiquitous in archaeology, and have been argued to improve workflows across the archaeological knowledge chain; but to what extent have digital tools materially changed the nature of archaeological scholarship or the role of archaeologists in knowledge production? this paper compares a traditional ‘pipeline’ with a networked ‘platform’ model of fieldwork, assessing the impact of technology-enabled participation on archaeology’s disciplinary and professional boundaries. in contrast to the collaborative potential of peer-to-peer systems, the current vogue for intra-site digital tools (such as tablet recording, gis, and d technologies) can be seen to augment rather than reinvent pre-digital workflows. this point will be illustrated through an assessment of the uk based collaborative platform, digventures, in contrast with recent high-profile initiatives to transition to digital workflows by other established field projects. evaluated through the lens of nesta’s recent typology of platform organisations in the ‘collaborative economy’, the digventures platform will be used to model the underlying dynamics of peer-to-peer interaction by utilising the ‘platform design toolkit’, considered alongside a worked project example and assessment of digital web analytics of the platform. the paper will finally consider how a peer-to-peer system is experienced by scholars themselves, and the changing role of the archaeologist in a system that shifts the locus of work beyond the physical limits of an organisation, to open up the archaeological process to anyone who chooses to participate.

keywords: peer-to-peer; crowdfunding; crowdsourcing; platform; digital; collaborative economy

introduction – problem solvers and solution seekers

when the science fiction author william gibson remarked that “the future is already here – it’s just not evenly distributed” (rosenberg ), he could well have been describing the archaeology profession. from algorithmic newsfeeds to always-on mobile technology, archaeologists live their non-professional lives in an increasingly networked digital environment. however, contra galeazzi and richardson-rissetto ( ), it is a stretch to think that the internet has impacted archaeological method and practice to quite the same degree. the pervasive use of digital technology may well be entrenched in archaeology, leading some to proclaim “we are all digital archaeologists now” (morgan and eve : ). but in contrast with initiatives from outside the discipline that draw on collaborative models of ‘open’, ‘peer-to-peer’, or ‘distributed’ innovation (von hippel ; von hippel ; chesbrough ; benkler ; benkler ), the so-called digital turn in archaeology looks very much like a turn inwards, deploying digital tools such as tablet recording, gis and d technologies to augment rather than reinvent pre-digital workflows. despite the ready availability of potentially transformative technologies that could open up the archaeological knowledge chain to a networked community, the basic job of archaeology continues to be practiced by a bounded project team of specialists working in traditional formation.

how should archaeologists adapt to this ever-shifting digitally networked landscape, and what opportunities or threats await a meaningful engagement? william gibson’s suggestion is that the future is occupied at the margins; that strategies to cope with uncertainty and embrace opportunity are all readily observable in pockets of innovation and early adoption, and as these niche activities become increasingly assimilated into the mainstream, the future becomes the present (gibson ).

this paper is about engagement and experimentation with one of those possible futures: the emerging digital and collaborative economy. taking cues from other successful social and cultural initiatives at the margin of our discipline, it will introduce the uk-based digventures collaborative platform, and assess the implications for archaeology (and archaeologists) of a networked peer-to-peer approach to field work.

launched in , digventures is a social enterprise with a mission to expand civic engagement with archaeological research by experimenting with alternative business models and technology-enabled participation (wilkins ). ethically positioned as a ‘social contract’ between archaeologists and as great a range of stakeholders and participants as possible (wilkins ; neal : ), the organisation has benefited from the ability to design its systems and digital tools unencumbered by the legacy processes that often govern the organisational activities of a long-established project, company or institution. the clearest example of this digital reworking of traditional practice is digventures’ revenue model of crowdfunded and crowdsourced archaeology. this approach, which has been implemented on archaeological projects in the uk, europe and the us, raising approximately £ . m for excavation through crowdfunding and matched grant funding, is supported annually by over , participants. less discernible but potentially more significant has been the development of a platform approach to archaeological resource sharing, collaborative knowledge production, and crowdsourced labour. this is facilitated by a suite of networked digital tools creating an accessible space for micro-volunteering initiatives and experiences, enabling both tangible and intangible connections between peer producers, peer consumers, stakeholders and partners.

the concept of the commons, and in particular, ‘commons-based peer production’ (benkler ), has been constructively utilised to frame this modality, “whereby the community (virtual, physical or both) participate in the innovation process and have unlimited access to the tools that are co-developed by the community” (boyd xxiv).
by shifting the locus of work beyond the traditional physical and organisational limits of a project team, these digital architectures of participation present a significant opportunity to draw on expertise beyond the knowledge boundaries of the professional archaeological community. the epistemic advantages of engaging non-professional participants in the knowledge production process have been argued to result in “some of the best research in the social sciences,” ensuring that professional biases and concomitant errors are exposed (wylie : ). a bounded scientific community of practice may fail to recognise the inherent shortcomings of its basic assumptions and norms of justification, an epistemic limitation that can be mitigated by drawing on a wide range of perspectives from outside the discipline. characterised as ‘participatory action research’ and ‘community-based participatory-research’ across the social sciences (chevalier and buckles ), and defined more specifically as ‘collaborative archaeology’ within our discipline (mcanany and rowe ), this approach “is credible… because it is self-consciously situated and brings diverse angles of vision to bear on its central knowledge claims” (wylie : ). this demands a ‘‘rethinking of traditional views of objectivity that takes social, contextual values to be a resource for improving what we know, rather than inevitably a source of compromising error and distortion’’ (wylie : ).

such an approach seems wholly aligned with the profession’s determination to realise a wider public benefit (scanlon et al. ); however, far from embracing the possibility of meritorious contributions from the crowd, some archaeologists have raised concerns regarding the potential disintermediation of long-established gatekeeper organisations (perry ; perry and beale ), reanimating debates concerning the contemporary role of archaeologists in a digitally mediated landscape (brophy ; gonzález-ruibal et al.
; gonzález-ruibal ; perry ; and aspects of nativ are also relevant here). archaeological expertise is hard won, and current concerns regarding the profession’s contemporary status should be seen as more than luddite opposition to change for change’s sake. disciplinary boundaries serve to distinguish archaeologists from lay people, with an array of sub-disciplinary boundaries ordering professionals into period and material specialisms. this contrasts with peer-to-peer networks, where seemingly little is known about which individuals and communities participate, how they participate or their motivation for doing so. in an era of ‘post truth’ (oxford dictionary ) and ‘filter bubbles’ (pariser ), it could be argued that knowledge boundaries establish trust and legitimacy, guarding society against the potential misuse of the past from “an aggressive miasma of atavistic speculation” (trigger : ). how could such checks and balances, a notion characterised in contemporary discourse as ‘old power’ (timms and heimans ), be applied to a resolutely ‘new power’ peer-to-peer system that furnishes anyone who chooses to participate with the tools to join in?

this dilemma is well illustrated by a recent study of the effects of nasa’s experimentation with ‘open innovation’, a process that was revealed through an in-depth three-year longitudinal field study by a researcher embedded with nasa’s scientific community (lifshitz-assaf ). under financial pressure from congress, in the organisation published fourteen strategic challenges on open innovation platforms (innocentive, topcoder and yet ). overwhelmed by “the ‘spectacular results’ of the open innovation experiment” (lifshitz-assaf : ), nasa’s management sought to integrate open innovation into the day-to-day workflow of the organisation. but enthusiasm for this initiative was far from unanimous with nasa’s scientific community, leading to “rising tensions, emotions and fragmentation” (lifshitz-assaf : ).
studies of open innovation have hitherto tended to focus on the role of the peer-to-peer platform itself or the character of contributing communities (chesbrough, vanhaverbeke, and west ; benner and tushman ). lifshitz-assaf expanded her analysis to account for the role of identity, investigating how professionals may adopt or reject change and innovation depending on whether this contradicts or supports their professional sense of self and purpose. through structured interviews and close observation, her conclusion was that “open innovation challenged not only the knowledge-work boundaries of r&d professionals but also, to a great extent, their professional identity” (lifshitz-assaf : ).

from this insight, she identified two main groups: ‘problem solvers’, who self-identified with a bounded, scientific method adhering to notions of professional expertise and peer review, and ‘solution seekers’, who were open to collaboration and to dismantling professional boundaries. these divisions could not be explained by demographics or socioeconomics (both groups were equally varied), nor, given the scientific mission of the organisation, were they a reaction to technological change or to the validity of the experiment’s results.

returning to the theme of this special issue, we may question whether the on-going debate on the theoretical and philosophical aspects of digital scholarship originates from within a ‘problem solver’ or ‘solution seeker’ mindset. does this schema reflect the differences between those who advocate for a sub-disciplinary bounded digital scholar positioned “at the heart of the larger discipline” (perry and taylor : ), in contrast with those who argue for a post-digital, normative stance summed up by costopoulos ( ) who argues “i want to stop talking about digital archaeology.
i want to continue doing archaeology digitally.”

huvila and huggett ( ), in their positioning paper on digital scholarship, articulate “the need for at least a relative consensus” on how archaeological work is currently organised as a first step to addressing this debate. in line with lifshitz-assaf’s call “to zoom out of the existing ‘how’ we do our work, to pause, reflect and refocus on the bigger ‘why’” (lifshitz-assaf ) these authors similarly advocate for a zooming in and zooming out on archaeological practice and knowledge work, seeking to go beyond “the rules, formal descriptions, etc. and hence what essentially constitutes canonical practice, but also… what actually takes place: the day-to-day reality of the practice” (huvila and huggett ).

continuing in this vein, the following discussion will initially ‘zoom out’ to consider archaeological work in the uk in terms of its dominant business and operational models (understood here to be a blueprint describing how an organisation creates, delivers and captures value, in economic, social, cultural or other contexts), considering how this poses structural challenges to opening up the archaeological knowledge chain to public participation. this will be followed by a ‘zooming in’ on contemporary digital practice to contrast a traditional ‘pipeline’ with a networked ‘platform’ model of field work. whereas a born digital approach may be afforded the advantage of designing collaborative networked digital tools from the ground up, long-established organisations transitioning their legacy processes to a digital methodology can be seen to augment rather than reinvent their pre-digital workflows, consolidating disciplinary boundaries and maintaining traditional working practices in a manner unlikely to confront the profession’s structural challenges.
from atoms to bits – digital archaeology’s social context

though archaeology may be characterised as a research endeavour, it is predominantly practiced as a service sector, presenting an overwhelming structural challenge to the widespread adoption of technology-enabled civic participation in the knowledge production process. the profession predominantly comprises a state-backed ‘conservation sector’ designed to protect and maintain natural and built heritage, with a largely private ‘mitigation sector’ constituted to respond to the former’s demands (bradley ; carver ; carver ). of the , paid archaeologists working in the uk in – , it is estimated that % worked for organisations that provided commercial field investigation and research services (the mitigation sector), and % for organisations that provided historic environment advice (the conservation sector), with the remainder employed outside the development-led market dynamic within museums, universities and civil organisations (aitcheson : ).

the current dominant form of archaeological work emerged as a result of the increased demands for archaeologists following post-war urban reconstruction. the rescue revolution, and the later growth of cultural resource management through the ’s and ’s depended on standardisation and repeatable procedures (jones ), a move that was further consolidated in the ’s in britain with the issue of ppg / and map (wainwright ). the net result is that % of all archaeological work practised in the uk since has taken place within the commercial mitigation sector, mostly as a precondition to receiving planning permission as part of the development process (fulford : ). from relatively modest pre- state funded levels of £ . m per annum, the total revenue generated by uk commercial archaeology in – was approximately £ m, resulting in over , planning related archaeological investigations (trow : ; aitcheson : ).
from inauspicious beginnings to significant contributor to the wider economy, watson ( : ) argues that commercial archaeology’s plucky back story is often presented as a heroic “‘foundation myth’… with little critical awareness of the myriad difficulties (e.g. the fragmented, competitive nature of the contracting sector) that have perpetuated and, arguably, held back the development of a mature and respected occupation.”

the boom and bust cycles of the construction industry have enabled a fragmented and competitive business model to take root “within the vacuum created by a lack of alternative models” (watson : ). operationally this model can be described as a pipeline workflow: designing a product or service and following a step-by-step system to deliver the product or service in a linear value chain with producers at one end and consumers at the other (figure ). the ultimate goal of these predominantly client-funded investigations is to understand a site’s formation processes, reporting results in ‘grey literature’. when research does occur – understood here to be a synthesis of archaeological results to understand the historical processes that gave rise to those formation processes – this is usually conducted by a much smaller academic sector and paid for by research grants (bradley ; and see fulford and holbrook for an exceptional example of where synthesis has been achieved in spite of market logics).

the pipeline model of archaeological knowledge creation is exemplified by barry cunliffe’s influential ‘levels of publication’ concept (cunliffe ). based on a series of consecutive steps, his model was designed to ensure that archaeological knowledge could be produced for a range of specialist and non-specialist consumers.
level , he argued, was the site itself, with its unrealised information preserved in situ; this was followed by levels & represented by the archaeological producers’ easily accessible site archive and stratigraphic report. an academic journal or monograph publication would follow with level , with selected results made intelligible for non-specialist consumers at level (the public) and (the media). this closed cycle has enabled archaeology in the uk to grow into a commercial service industry embedded with environmental risk management, but there are significant structural issues with a model which rests on the assumption that the material remains of the past are a physical ‘asset’ framed as a ‘non-renewable resource’.

this approach enables a workflow where interpretive decisions can be delayed to a later phase of the project because the material uncovered by the excavation and the record produced by the individual excavator is seen as impartial and atheoretical. but this is far from ideal, and methodologies that box, bag and label the material remains into easily managed categories for interpretation at a later date have been argued to vastly reduce our capacity to discover what the existential poet donald rumsfeld would call ‘unknown unknowns’ (wilkins ). the consequence of this for berggren and hodder ( ) is the reduced intellectual status of field workers (see also everill ):

“the old theoretical debate about the separation between data and interpretation in archaeology partly has a social basis. it is not an abstract philosophical discussion. it is about who is empowered to interpret. and on the whole the answer has been ‘not the excavator’” (berggren and hodder : ).

martin carver counters that “it is hard to recognise the social basis for this report, which appears to emanate from the planet zog” ( : ).
far from restating an older antiquarian-labourer class system, carver ( : ) reminds us that

“in archaeology, as in the rest of the world of work, you are paid to do what is wanted by the person with the money, not to do what you would rather be doing… so there are two parallel parts to our profession: people paid to produce new research, mainly in universities, and people paid to manage research resources, mainly in government, underpinned by a large commercial sector.”

given that % of people working in the commercial mitigation sector in the uk are employed by “not-for-distributable profit organisations (registered charities, constituent parts of local planning authorities, constituent parts of universities)” with a social and educational mission (aitcheson : ; dore : ), it is no small irony that the sector has gravitated towards a business model that challenges its basic professional precepts (see scanlon et al. ; nixon ; wills ). it is not unusual for charitable bodies to establish commercial trading arms in order to fund their wider mission; however, in – the “community, public archaeology and educational work” undertaken by archaeological organisations in the uk was calculated at just . % of those organisations’ average annual turnovers (aitcheson : ). this seems like a vanishingly small percentage, but because it fails to capture the ‘non-market’ transactions that would underpin a thriving civic sector, this exclusively market orientated analysis is perhaps misleading.

following carver’s reasoning above, what does the growth of the commercial mitigation sector mean for people not paid to do archaeology, but who choose through voluntary endeavours to do it anyway?
figure : cunliffe’s levels of publication ( ) illustrated as a unidirectional pipeline articulating the relationship between producers and consumers. digventures. figure labels: level : the site; levels and : site archive and strat report; level : academic journal or monograph; level : selected results made intelligible for non-specialist audiences; level : further distillation for the press; archaeologists (producers); the public (consumers).

with a long tradition of voluntary participation in british archaeology reaching back into the th and early th century, a survey by the council for british archaeology of local archaeology group members has estimated that , people self-identify as regularly engaged with archaeology (thomas : ); these numbers suggest a thriving voluntary scene which could be a solid foundation for the development of technology-enabled civic participation. however, closer scrutiny of voluntary archaeology societies suggests that membership is aging and in decline; the majority of members surveyed for the cba report were over , a pattern also observed when the membership was surveyed once again in (figure ), which concluded that the membership had contracted, and those remaining were now eight years older (frearson : ). this aging demographic is also supported by a regional study of voluntary groups in the east of england where just % of members were aged between – years old, and only % were in full time employment (woolverton ). no national socioeconomic audit has been undertaken of archaeology society membership, but anecdotally, clubs and societies typically comprise a largely passive group of overwhelmingly white, middle class, and retired members, corralled by a handful of active committee members, the “preserve of the retired, of the established, of the pedantic…” (manley : ).
far from creating space for civic participation with archaeology, the dramatic uplift in resourcing for development-led work has exacerbated this contraction, with . % of volunteer projects operating outside the major sources of funding for archaeology and a large proportion failing to uphold the basic ethical requirement of sharing or publishing data (hedge and nash : ).

‘zooming in’ now on mainstream digital practice: rather than experimenting with ‘open’, ‘peer-to-peer’, or ‘distributed’ models that could address archaeology’s systemic challenges (von hippel ; von hippel ; chesbrough ; benkler ), innovation has taken a much narrower focus on digitising rather than reimagining traditional workflows. in consequence many of the same structural concerns raised with archaeology’s dominant business model persist, though restated in a digital context. perry and taylor ( : ) note that “the technical capacities of these [digital] tools still tend to eclipse meaningful critique of their implications,” lamenting the “lack of a larger critical disciplinary framework to guide digital practice.” alongside a concern that “digital applications generally make it near-impossible to recognise or interrogate power dynamics at play, leaving us blind to (and liable to reproduce) structural inequalities” ( : ), other, ‘pre-digital’ arguments are also transposed: “digital archaeology might in some cases be mistaken for a form of ‘neo-processualism’, focused on specifications, accuracy, and precision as means to generate increasingly ‘real’ archaeological models” ( : ).

layered upon the archaeology profession’s existing structural tensions, the widespread digitisation of traditional workflows resonates with ongoing anxieties regarding the future of archaeological work and skills.
these concerns are based on the critique that the rise of industrial capitalism leads to either the deskilling of workers following the introduction of machinery and the resulting redundancy of hard-learned craft skills (braverman ), or the upskilling of workers who will need new educational qualifications to either operate or design that machinery (bell ).

figure : comparison of age profiles of council for british archaeology membership undertaken by thomas ( ) and frearson ( ), a regional survey of community groups in the east of england (woolverton), contrasted with a typical age profile from a four-year digventures project (leiston abbey) and the barrowed time project. digventures.

following other pioneering experiments with digital recording systems at sites in the uk such as west heslerton (powlesland and may ) and the silchester town life project (clarke et al. ), the ‘braverman/bell’ debate echoes through recent reports of the development of digital recording systems using mobile devices in academic fieldwork (walcek averett et al. ). enthusiastic adopters of digital recording at pompeii noted that moving to a digital platform increased productivity by greater than % with around one third of the typical staff (poehler and ellis : ). acknowledging the increased accuracy of digital recording methods enabling the excavator to efficiently collect ever more data, caraher urges caution, advocating for a “slow archaeology… as a meticulous, integrated craft that resists the fragmented and mechanised process of the assembly line” (caraher : ). roosevelt et al. argue the upskilling hypothesis: “increasing efficiency… [enabling] fuller engagement with the material record in the field while simultaneously increasing the technical literacy of project participants” (roosevelt et al. : ).
this position has also been adopted in a series of reports that have emerged recently from the Çatalhöyük research project (Çrp) (berggren et al. ; taylor et al. ; lukas et al. ), claiming a ‘democratisation’ potential for tablet computers to address archaeology’s structural challenges. the programmatic aims of ‘reflexive archaeology’ determine that excavators should be conscious of why they do what they do (reflexivity), enabling multiple interpretations of archaeological evidence (multivocality), and conscious of the situated nature of knowledge production (relationality). through these means “reflexive archaeology… provides systematic opportunities for field archaeologists to engage in narrative construction and to provide critique of those narratives in relation to data and social context” (berggren and hodder : ). the introduction of web-viewable relational databases and tablet computers connected to a local network has been called a ‘living archive’ enabling

“deep integration of knowledge on site… it effectively brings that part of the archive which is generally conceived of as being accessible ‘off-site’ (or perhaps even only accessible post-excavation) ‘on-site’, transforming the archive from a static reference knowledge-base to a dynamic interpretative tool in its own right” (taylor et al. : section ).

this process is seen as bringing “many aspects of data manipulation, validation and interpretation, which are ordinarily reserved for certain ‘privileged’ individuals during the post-excavation process, into the field at the trowel’s edge” (berggren et al. : ). pictured schematically, however, it is difficult to recognise how Çrp’s ‘hierarchy of knowledge production’ fundamentally deviates from cunliffe’s levels of publication concept (berggren et al. : , figure ).
commensurate with level on cunliffe’s pipeline is what Çrp would define as “‘the site’ itself, which can be seen (perhaps fairly conventionally) as a primary resource for the generation of interpretations and knowledge for those who interact with it.” cunliffe’s level and correspond with Çrp’s ‘archaeologist’ tier, “an agent for observing, recording and interpreting the site. in this sense ‘archaeologist’ refers to any team member, of any specialism, that has some input into the generation of data and its subsequent interpretation” (taylor et al. : section ). levels , and map on to Çrp’s top level – “the various strands of output and dissemination of data and its interpretation, most commonly in the form of the archive and various tiers of publication, these being ultimately the physical manifestation of the team’s ‘generation of knowledge’” (taylor et al. : section ).

despite these parallels, the digitisation of traditional practice is still argued to materially challenge the knowledge production process because digital tools are “ontologically generative… emphasising the breadth of what can be called the archaeological” (shanks and witmore ), “hence there would be no archaeologists without archaeological stratum or the tools of their trade, or vice versa” (huvila and huggett : ). this mode of analysis has also been applied by Çrp, conceiving of archaeological practice as a bounded social network of practitioners deconstructing a site in the field in order to reconstitute the site into digital media (taylor et al. : section ). however, just as the mediating practice of field photography has come to determine the hygienic practices of excavation (such as the notion of cleaning up for a photo), it is difficult to see how scholarship will be materially transformed by digitising that practice (or other intra-site digital replacements for traditional tools).
by focussing on the mobile devices themselves, no matter how sophisticated “the tablets, and the suite of associated digital technologies that they allow in the field” (berggren : ), this and other recent high profile initiatives to transition to a digital workflow by established field teams could be overlooking the revolutionary potential of these networked devices. this does not discount that tablet computers could transform practice, but without a similar shift in operational and business model, adoption can be seen to be the equivalent of putting alloy wheels and go-faster stripes on a horse and cart.

archaeology as a peer-to-peer platform

there is a persistent perception that archaeology can be organized according to either socialist or capitalist principles: projects can be delivered either as a public service paid through taxation or procured through a private market of service suppliers (kristiansen ; willems and van den dries ). this assumption of a clear boundary between public and private is contradicted by david graeber’s ( : ) insight that “any market reform, any government initiative intended to reduce red tape and promote market forces will have the ultimate effect of increasing the total number of regulations, the total amount of paperwork, and the total number of bureaucrats the government employs.” with equivalent key performance indicators, audit trails, internal markets and management hierarchies, graeber’s ‘iron law of liberalism’ recognises that “public and private bureaucracies have become so increasingly entangled that it’s often very difficult to tell them apart” (graeber : ).
in their book, ‘what’s mine is yours’, botsman and rogers sought to move beyond the binary opposition of either socialist or capitalist formations, observing the proliferation of new kinds of marketplaces, businesses and communities emerging to help people to access the things they need in new and different ways, while also making the things they owned available to others. calling this phenomenon ‘collaborative consumption’, they defined it as “the reinvention of traditional market behaviours, such as bartering, renting, trading and exchanging, through technology, enabling them to take place on a scale and in ways never possible before” (botsman and rogers ). the underlying business model of these collaborative, peer-to-peer platforms is far from new, drawing on a similar two-sided marketplace that enabled the formation of the london stock exchange in ; what is different is the affordances provided by digital peer-to-peer platforms, which can be characterised as an ecosystem enabling different types of users to connect and conduct interactions with one another, facilitating the exchange of goods, services or social currency. though focussed initially on shifting consumer habits, botsman and rogers’s definition was expanded by nesta (stokes et al. ) into a broader conception of ‘collaborative economy’ to account for new forms of production, summarised in a four-pillared typology. from collaborative resourcing models (like airbnb) and collaborative finance (like kickstarter), to collaborative production (like wikipedia) and collaborative learning (moocs like futurelearn), these organisations draw on ‘network effects’ (shapiro and varian : ), making use of idle assets and creating new marketplaces, and in so doing challenge traditional ways of doing business, rules, and regulations.
in contrast to pipeline models where control of the ‘supply-side’ of the production process confers competitive advantage, network effects bequeath digital platforms with the capacity to build ‘demand-side’ communities of interest. and just as supply-side economies of scale can lead to industrial monopolies, network effects have also led in some instances to the monopolistic dominance of gafa-type platforms (google, apple, facebook and amazon) able to extract data and wealth from the network. but this is not an inevitable consequence of a platform model, and there are numerous other organisations that have taken this approach to address social and cultural challenges, with any accruing network benefits socialised amongst the platform’s users (mason ; zuboff ). countering the silicon valley trend of venture-funded enterprises seeking to ‘move fast and break things’, there are a multitude of ventures conversely seeking to ‘move fast and fix things’, experimenting with platform cooperatives, alternative currencies and distributed autonomous organisations (exemplified by the differences between airbnb and fairbnb) (fairbnb manifesto ; pick : ).

closely aligned with these social impact initiatives, the digventures platform was developed to apply network effects to archaeological workflows, addressing the central design challenge of how to improve research outputs whilst simultaneously creating space for civic participation. rather than focus on one of nesta’s collaborative economy pillars with a generalised offer that services all types of projects in the arts and the sciences, digventures have concentrated on a single theme, archaeology, to build and service a community around a common interest (westcott wilkins ). beginning with the archaeological resource itself, the platform links up owners or custodians of heritage assets with a networked community who want to learn, understand and enjoy those assets (collaborative resources, figures and ).
this is underpinned by a collaborative finance dimension to the platform, generating income and non-financial contributions from a networked community through crowdfunding and crowdsourcing (collaborative finance, figure ). this is supported by a bespoke digital recording system enabling project participants to collaboratively produce text/photos/video and 3d models directly from the trenches using their phones or tablets, harnessing comments and contributions from the crowd (collaborative production, figure ). the development of a mooc platform, or learning management system, has built on this, deepening the audience’s engagement with the data through collaborative learning, and extending field skills training to a digital audience (collaborative learning, figure ).

figure : group and individual profile pages for the digventures platform, displaying badges of achievement and projects completed for individual participants. digventures.

the nesta typology has been of benefit in framing the diversity of activities in the collaborative economy and positioning the digventures platform, but appreciating the underlying dynamics of peer-to-peer interaction will require a different analytical tool (figure ). the platform design toolkit was developed to assist in modelling emerging, multisided platforms to help shape strategies that could respond to and facilitate existing ecosystems of users to create and exchange value (cicero ). a platform strategy responds to an existing collaborative economy ecosystem by distilling an essential value proposition that seeks to improve and facilitate connections between ‘entities’ to scale its potential (figure , column c).
these entities (figure , column a and d), or actors, can be characterised as platform shapers (owners, ultimately responsible for the strategy); stakeholders (established bodies such as municipal or professional institutes with a vested interest in supporting or regulating the platform); peer producers (individuals or organisations interested in providing value on the supply side of the ecosystem/marketplace); peer consumers (users interested in consuming, utilizing, accessing the value that is created through and on the platform); and partners (professional entities seeking to create additional professional value and to collaborate with platform owners).

transposed through the platform design toolkit, the digventures model can be conceived as a system enabling both tangible and intangible connections between entities, illustrating the channels and contexts through which they exchange value, and the services through which individuals can learn and evolve their participatory role through the platform. the toolkit illustrates how these entities create value through two specific ‘engines’ – a transaction engine (facilitating interactions between value producers and consumers, figure , column d) and a learning engine (support services that enable platform participants to learn, improve and evolve their capacity to take advantage of the platform, column b). in the attached example, digventures is contrasted with a platform many readers will have direct experience of – airbnb. in the case of the airbnb platform, peer producers own space, peer consumers seek short term rental of this space, and stakeholders (such as municipal authorities) seek to control the potential impact of short-term rentals on available housing stock. the airbnb platform facilitates tangible connections (such as booking and taking payment for a room) and intangible connections (such as leaving a review or hosting an experience).
the essential value proposition enables peer consumers to afford unique travel experiences, peer producers to supplement their income whilst sharing their culture, and partners/stakeholders to benefit through the platform’s collective ownership and democratic governance.

mapping the entities that interact through the digventures platform reveals the ‘platform shapers’ role to be an in-house team of professional archaeologists, including a digital media team (videographer, writer, photographer and editor). projects are designed or shaped by the in-house team in partnership with a diverse range of heritage organisations (such as museums, local societies and councils), creating an accessible space for micro-volunteering initiatives and experiences. partners physically own archaeological dig sites, peer producers want to join digs on these sites, peer consumers enjoy visiting or viewing these digs physically or online, and stakeholders (such as the chartered institute for archaeologists, historic england or local councils) seek to monitor the quality of these digs through permitting or professional accreditation. the digventures platform facilitates tangible connections between peer producers and consumers (such as crowdfunding a project, visiting a dig and completing an evaluation form, or completing a dig record) and intangible connections (such as acquiring skills and knowledge, becoming part of a team and connecting with like-minded individuals). the value proposition differs between entities, with project partners benefiting from diversified income, increased revenue, visitor footfall, and digital profile. peer participants (producers and consumers) receive heritage skills training and experience as part of important field research projects, and local communities (represented by stakeholder entities) benefit from the long-term sustainability of their heritage resources.

figure : a community management system for archaeology – individual profile pages and badges viewable on mobile devices. digventures.

figure : project crowdfunding and crowdsourcing pages, accepting income and non-financial contributions from a networked community. digventures.

figure : digital dig team recording system, enabling project participants to collaboratively produce text/photos/video and 3d models directly from the trenches using their phones or tablets. digventures.

figure : the mooc platform, deepening the participant’s engagement with the data through collaborative learning, and extending field skills training to a digital audience. digventures.

figure : the platform design toolkit (after cicero ), a design canvas used for modelling the different entities (peer consumer, producer, partner and stakeholders) and the channels/contexts through which they exchange value, and the services through which they can learn and evolve their participatory role through the platform. digventures.

the latent assumption contained within cunliffe’s levels of publication concept, and similarly underlying theoretical public archaeology models (see matsuda ), is that archaeology is undertaken ‘within’ organisations, who then relate to ‘the public’, resulting in an us/them, top/down knowledge production pipeline. in contrast to the pipeline approach, a platform model can be characterised as an ecosystem, enabling different types of users to connect and conduct interactions with one another, thereby enabling value creation for all entities.
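
the entity roles and the tangible and intangible connections described above can be sketched as a small data model. the python below is a minimal illustration only, and is not part of the platform design toolkit or of the digventures software: the class names (`Role`, `ConnectionKind`, `Entity`, `Connection`, `Platform`), the helper `build_digventures_example`, the entity names, and the choice of which entity sits at each end of a connection are all assumptions made for this example. only the role labels and the example connections are taken from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    # Entity roles as listed in the description of the toolkit above.
    PLATFORM_SHAPER = "platform shaper"
    STAKEHOLDER = "stakeholder"
    PEER_PRODUCER = "peer producer"
    PEER_CONSUMER = "peer consumer"
    PARTNER = "partner"


class ConnectionKind(Enum):
    TANGIBLE = "tangible"
    INTANGIBLE = "intangible"


@dataclass(frozen=True)
class Entity:
    name: str
    role: Role


@dataclass(frozen=True)
class Connection:
    source: Entity
    target: Entity
    kind: ConnectionKind
    description: str


@dataclass
class Platform:
    name: str
    entities: list[Entity] = field(default_factory=list)
    connections: list[Connection] = field(default_factory=list)

    def add_entity(self, name: str, role: Role) -> Entity:
        entity = Entity(name, role)
        self.entities.append(entity)
        return entity

    def connect(self, source: Entity, target: Entity,
                kind: ConnectionKind, description: str) -> Connection:
        connection = Connection(source, target, kind, description)
        self.connections.append(connection)
        return connection

    def connections_of_kind(self, kind: ConnectionKind) -> list[Connection]:
        # Filter connections in insertion order.
        return [c for c in self.connections if c.kind is kind]


def build_digventures_example() -> Platform:
    # Hypothetical instance: entity names and connection endpoints are
    # illustrative assumptions, not data from the project.
    p = Platform("digventures")
    shaper = p.add_entity("in-house archaeology team", Role.PLATFORM_SHAPER)
    partner = p.add_entity("site owner", Role.PARTNER)
    producer = p.add_entity("dig participant", Role.PEER_PRODUCER)
    consumer = p.add_entity("exhibition visitor", Role.PEER_CONSUMER)
    p.add_entity("chartered institute for archaeologists", Role.STAKEHOLDER)

    p.connect(producer, partner, ConnectionKind.TANGIBLE, "crowdfunding a project")
    p.connect(consumer, partner, ConnectionKind.TANGIBLE, "visiting a dig")
    p.connect(producer, shaper, ConnectionKind.TANGIBLE, "completing a dig record")
    p.connect(producer, shaper, ConnectionKind.INTANGIBLE,
              "acquiring skills and knowledge")
    p.connect(consumer, producer, ConnectionKind.INTANGIBLE,
              "connecting with like-minded individuals")
    return p
```

calling `build_digventures_example().connections_of_kind(ConnectionKind.TANGIBLE)` returns the three tangible connections in the order they were added. modelling connections as records between entities, rather than as stages in a sequence, is one way of expressing the contrast with a pipeline: no entity is fixed at the start or end of the flow.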
level and ‘consumption’ is not positioned at the end of the archaeological workflow, but is manifest before, during and after the excavation, continuously funnelling the user into a deeper engagement with the research process. and by the same token, level , and ‘production’ is driven through the peer-to-peer engagement of the community, culminating in the research outputs that would be typically expected from a scientific excavation. the underlying technology of the platform includes a publishing hub, information pages encompassing project background and educational resources, an e-commerce crowdfunding payment system, and a read/write recording system enabling project participants to collaboratively produce archaeological data. taken together as a ‘digital stack’, this could be described as a ‘community management system’ for archaeology projects – standing in relation to archaeological workflows as content management systems like wordpress stand in relation to web publishing (westcott wilkins ).

scaling civic participation – a worked example

alongside an assessment of project outputs and web analytics, user journeys through the digventures platform can be illustrated with a specific project, the ‘barrowed time’ excavation, articulating how a model of collaborative resourcing, finance, labour and learning differs from a traditional pipeline approach. in terms of collaborative resourcing, the ‘barrowed time’ project (wilkins et al. ) was the first major excavation of an early bronze age funerary monument in north lancashire since (olivier ). the platform linked project partners (the landowner, durham university and lancaster city council) with a network of peer producers and consumers (dig participants and visitors to the project exhibition), managed by an in-house team of ten professional archaeologists from digventures.
work was structured as a community-based research project, with fieldwork designed to help contextualise the unexpected discovery of a late bronze age tanged chisel and knife blade by a local metal detectorist on private farmland (figure ). fieldwork was undertaken between the th and th of july and between the th and th of september , revealing several pits dating to the early part of the middle neolithic period, and two early bronze age cremation burials within a food vessel urn and collared urn enclosed by a ring cairn on the summit of a hill. this sequence now represents the most intensively radiocarbon dated site of the period in lancashire and cumbria (where the lack of scientific dating is frequently lamented) as well as the first use of strontium isotope analysis on prehistoric cremated human bone in northern britain (wilkins et al. ).

figure : aerial view of the ‘barrowed time’ community excavation, looking south-west over morecambe bay. digventures.

the site’s immediate communities (north lancaster, morecambe and heysham) fall within the % most deprived areas in england, with much lower levels of educational achievement than the national average (lancaster city council ). a key aim of the project was therefore to engage a far broader demographic than would typically participate in community archaeology, stimulating surrounding communities to become more involved with and enthused about the stewardship of their local heritage. raising the profile of the site was a particular challenge for the project, given the inaccessible rural location and potential threat of looting.
following the dynamics of the platform design toolkit articulated above, the strategy to address this encompassed delivering structured field school training to peer producers, with peer consumers benefiting from a pop-up exhibition hosted in a disused shop on morecambe promenade ( ) and in a lancaster city council building ( ) supported by school visits and an online virtual museum (figure ). an evaluation survey was completed for both peer consumers (visitors to the project exhibition and digital participants) and peer producers (dig participants) recording their age, gender and professional background as well as socioeconomic categories derived from the office for national statistics. this was coordinated through a welcome desk for exhibition visitors (with responses, or . % of all exhibition visitors) and through a pre- and post-dig interview with dig and online participants (or % of total participants, conducted by email and in person).

a collaborative finance budget of £ , was raised through matched crowd and grant revenue, with % of that total received through two grants from the national lottery heritage fund, combined with a further % received through crowdfunded contributions from individuals from eight different countries. this funding mix ensured that a range of cost-free opportunities could be provided for dig participants and the wider community alongside crowdfunded experiences. the pop-up museums received visitors over a -day period (peer consumers); this audience was predominantly local, with % of exhibition visitors travelling less than miles and % travelling up to miles. of the total number of visitors surveyed, % were drawn to the venue because of the exhibition (rather than just passing by) and % represented an entirely new audience as they had never visited an archaeology event before.
education sessions were delivered to primary school students, with special arrangements for group skype calls to the dig site where archaeologists could reveal their latest finds. a total of unique visitors accessed the virtual museum over the course of a -day period during the season, a social engagement strategy that was supplemented through channels such as facebook live where single videos such as the excavation of an urn could garner up to reactions, comments and shares and post clicks.

of the individuals who directly supported the crowdfunding campaign, % were digital contributors (peer consumers), with % of peer-participants joining the team for a week, % for a weekend and % for a day. of these sub-categories of peer producers, the database records an average of context and . finds records for each day participant; context and . finds records for each weekend participant; and . contexts, finds records, . sample records and . section records for week participants. set against the baseline data derived from age studies of existing community archaeology groups, this approach can be seen to substantially improve on existing provision as previously discussed in figure . each age category was well represented by participants with a wide representation of socioeconomic profiles (figure ), but perhaps more significantly in terms of audience development, % of peer participants had no previous experience of archaeology. a higher proportion of older participants was encountered on this project than other digventures excavations (contrast figure , leiston abbey).

figure : pop-up museum in lancaster city centre, displaying results, artefacts and a live stream to dig for local residents, visitors and school children. digventures.
when questioned, this group of older peer consumers described being drawn to support the project digitally due to difficulties with physical accessibility but ‘still wanting to feel involved’, with people aged and older engaged with the project entirely via the digital component of the project. these results run somewhat counter to the proposition that crowdfunding creates an exclusionary paywall favouring only the wealthy, or that digital engagement creates a barrier in and of itself, particularly amongst older people who are considered to be less digitally literate.

conclusion – the world is my dig

this paper has presented a novel theoretical framework and worked project example detailing how a networked peer-to-peer approach to field work can expand civic engagement with archaeological research. forthcoming doctoral scholarship will introduce a theory of change and evaluative framework for measuring the social impact of public participation (wilkins : ), with a longitudinal study of several case studies elucidating the participatory ‘scaffolding’ necessary for successful outcomes described above (wilkins in prep). to conclude and return to the subject of this special issue, we should consider how these moves to reposition archaeology as an open and participatory digital platform have been received by practitioners seeking to establish the disciplinary boundaries of what has been defined as ‘digital archaeology’ (perry and taylor : ) or ‘digital public archaeology’ (richardson : ).

rather than embracing these experiments with technology-enabled participation and crowd-based digital scholarship, civic participation has largely been seen as potentially exploitive (fredheim ; perry and beale ), with researchers questioning whether the “introduction of co-production means the economic value of archaeological expertise (and paid archaeological jobs) will survive unscathed” (richardson and dixon , ).
framed as a questionable response to austerity, crowd-based approaches are seen as a method to reduce costs and increase “the potential scope for what can be achieved on a small budget” in “exchange for some form of training and the opportunity to gain what bourdieu ( ) calls ‘social capital’” (richardson : ). perry criticises the “obfuscating discourse” surrounding the cynical acquisition of this social capital, questioning whether “the immediate benefits of crowdsourcing and crowdfunding are eclipsing concern for their profound longer term impacts” (perry : ). sayer questions the “morality” of a crowdfunding model “that actively excludes the wider public through its pricing structure” (sayer : ), whilst richardson asks “are these ‘crowds’ truly large and representative of the general public, or are they simply a small number of active and keen expert participants...” (richardson : , : ).

figure : age, gender and professional background (with categories derived from the office for national statistics) of peer producers and consumers who supported the barrowed time project, indicating a spike in older, retired participants amongst digital only supporters. digventures.

notwithstanding the justifiable concern with ethics (treated as a design challenge rather than a settled claim in section above), forcefully argued criticisms of crowd-based archaeology are not well evidenced or theorised. in light of lifshitz-assaf’s characterisation of problem solvers and solution seekers during similar experiments with crowd-based approaches at nasa discussed above (lifshitz-assaf : ), these strong reactions could be seen as indicative of a perceived challenge to professional identity.
though the experiment was widely regarded as a success, enthusiasm for open innovation was far from unanimous within nasa’s scientific community, leading to “rising tensions, emotions and fragmentation.” problem solvers saw open innovation as running against the grain of their professional raison d’être, with one remarking “i’ve been attracted to places that allow you to access a problem, come up with a plan, and execute the solution… to be able to think and solve greater problems. if i can’t do it at nasa, what is keeping me from going somewhere else?” ( , ). lifshitz-assaf noted that successful adoption of an open innovation model would require a “mindset shift,” a viewpoint echoed by timms and heimans who wryly note that “this was a group for whom the answer to ‘houston, we’ve had a problem’ could never be ‘stand by, apollo, we’re going to crowdsource that and see if any semi-retired telecommunications engineers in new hampshire have any insights’” ( , ).

although crowd-based digital initiatives have emerged from within archaeology’s traditional disciplinary structures, these projects have struggled to build sustaining and scalable digital communities (bonacchi et al. : ; richardson et al. : section ), an arguable consequence of not embracing the necessary ‘mindset shift’ described by lifshitz-assaf, or the ‘business and operational model-shift’ argued for in section and above. when practitioners face difficulties broadening participation beyond traditional audiences “with higher income and education levels” (bonacchi et al. : ), circular reasoning is deployed to argue that low levels of engagement are a natural consequence of the medium, reinforcing richardson’s statement ( : ; cited by richardson et al.
: section ) that “we must question whether new landscapes of participatory media can fundamentally change, open, or even threaten the authority of archaeological organisations and academic knowledge.”

designing the digventures system with a clean slate approach has proved to be a huge advantage in this reworking of professional identity, calling forth a solution seeker mindset that empowers a reimagining of how we fund, resource, record, analyse and communicate our science. a traditional problem solver value system will result in a linear operational model enclosed by disciplinary boundaries, exemplified by cunliffe’s levels of publication, in which the communication of archaeological knowledge with the public maintains the dichotomy of ‘us’ and ‘them’. a solution seeker mindset will enable practitioners to rethink their role in relation to both their subject of study and wider public, underpinned by an open, participatory model that seeks to dismantle disciplinary boundaries. but repositioning the locus of work beyond the physical limits of an organisation does not just enable the public to become citizen scientists. it also engenders the need for archaeologists to become ‘scientific citizens’, experts who are every bit as much a part of society as non-experts, with all the responsibilities and rewards that implies.

for legacy organisations and projects seeking to successfully embrace crowd-based digitally networked tools, the first challenge is to undertake what lifshitz-assaf calls “a shift in one’s professional role and one’s identity when challenged by a new technology… changing the focus of ‘how’ we do our work, to pause, reflect and refocus on the bigger ‘why’” (lifshitz-assaf ). as one of the subjects in lifshitz-assaf’s study described, it is a transformation from thinking “the lab is my world” to “the world is my lab” – a sentiment to which we can now add the word: dig.
acknowledgements
this paper was kindly supported by a scholarship from the college of social sciences, arts and humanities at the university of leicester. the open access publication of this article is supported by cost (european cooperation in science and technology) and the cost action arkwork. sincere thanks to all the staff at the school of museum studies, and particularly to giasemi vavoula, matthew allen (school of business) and oliver harris (department of archaeology and ancient history) for comments and continued help and guidance. final thanks to the team at digventures for companionship on the journey, and the overwhelming support of our peer community, without whom this work would not exist. this article is based upon work from cost action arkwork, supported by cost (european cooperation in science and technology). www.cost.eu. funded by the horizon framework programme of the european union.
competing interests
the author has no competing interests to declare.
references
aitcheson, k. . state of the archaeology market . landward research. available at: https://www.archaeologists.net/sites/default/files/archaeological% market% survey% - .pdf [last accessed: november ].
bell, d. . the coming of post-industrial society: a venture in social forecasting. london: penguin.
benkler, y. . ‘sharing nicely’: on sharable goods and the emergence of sharing as a modality of economic production. yale law journal, ( ): – . doi: https://doi.org/ . /
benkler, y. . the wealth of networks: how social production transforms market and freedom. new haven and london: yale university press.
benner, m and tushman, ml. . reflections on the decade award: ‘exploitation, exploration, and process management: the productivity dilemma’ revisited ten years later. academy of management review, : – . doi: https://doi.org/ . /amr. .
berggren, a, dell’unto, n, forte, m, haddow, s, hodder, i, issavi, j, lercari, n, mazzucato, c, mickel, a and taylor, js. . revisiting reflexive archaeology at Çatalhöyük: integrating digital and d technologies at the trowel’s edge. antiquity, : – . doi: https://doi.org/ . /aqy. .
berggren, a and hodder, i. . social practice, method, and some problems of field archaeology. american antiquity, ( ): – . doi: https://doi.org/ . /
bonacchi, c, pett, d, bevan, a and keinan-schoonbaert, a. . experiments in crowd-funding community archaeology. journal of community archaeology & heritage, ( ): – . doi: https://doi.org/ . / z.
botsman, r and rogers, r. . what’s mine is yours: the rise of collaborative consumption. new york: harper collins.
bourdieu, p. . distinction: a social critique of the judgement of taste. new york: routledge.
boyd, c. . post-capitalist entrepreneurship: startups for the %. boca raton: crc press.
bradley, r. . bridging the two cultures: commercial archaeology and the study of prehistoric britain. antiquaries journal, : – . doi: https://doi.org/ . /s
braverman, h. . labour and monopoly capital: the degradation of work in the twentieth century. new york: monthly review press.
brophy, k. . the brexit hypothesis and prehistory. antiquity, ( ): – . doi: https://doi.org/ . /aqy. .
caraher, w. . slow archaeology: technology, efficiency, and archaeological work. in: walcek averett, d, et al. (eds.), mobilizing the past: recent approaches to archaeological fieldwork in a digital age, – . grand forks: the digital press at the university of north dakota.
carver, m. . editorial. antiquity, : – . doi: https://doi.org/ . /s x
carver, m. . making archaeology happen: design versus dogma. new york and london: routledge.
chevalier, jm and buckles, dj. . participatory action research: theory and methods for engaged inquiry. london and new york: routledge.
chesbrough, h. . open innovation: the new imperative for creating and profiting from technology. boston: harvard business school press.
chesbrough, h, vanhaverbeke, hw and west, j. new frontiers in open innovation. oxford: oxford university press. doi: https://doi.org/ . /acprof:oso/ . .
cicero, s. . platform design toolkit: the user guide v . . boundaryless srl. available at: https://platformdesigntoolkit.com/toolkit/ [last accessed nov ].
clarke, a, fulford, m and rains, m. . nothing to hide – online database publication and the silchester town life project. in: doerr, m and sarris, a (eds.), caa . the digital heritage of archaeology. computer applications and quantitative methods in archaeology, proceedings of the th conference, heraklion, crete, april , – . greece: hellenic ministry of culture.
costopoulos, a. . digital archaeology is here (and has been for a while). frontiers in digital humanities, ( ): – . doi: https://doi.org/ . /fdigh. .
cunliffe, bw. . the publication of archaeological excavations: report of a joint working party of the council for british archaeology and the department of the environment. london: department of the environment.
dore, c. . value, sustainability and heritage impact. the archaeologist, : – . available at https://www.archaeologists.net/publications/archaeologist [last accessed: november ].
everill, p. . the invisible diggers. a study of british commercial archaeology. oxford: oxbow.
fairbnb. . manifesto. available at: https://fairbnb.coop/#manifesto [last accessed: november ].
frearson, d. . supporting community archaeology in the uk: results of a survey. cba research bulletin . york: council for british archaeology.
fredheim, lh. . endangerment-driven heritage volunteering: democratisation or ‘changeless change’. international journal of heritage studies, ( ): – . doi: https://doi.org/ . / . .
fulford, m. . the impact of commercial archaeology on the uk heritage. in: curtis, j, fulford, m, harding, a and reynolds, f (eds.), history for the taking? perspectives on material heritage, – . london, uk: british academy.
fulford, m and holbrook, n. (eds.). . the towns of roman britain: the contribution of commercial archaeology since (britannia monograph ). london: the society for the promotion of roman studies.
galeazzi, f and richards-rissetto, h. . editorial introduction: web-based archaeology and collaborative research. journal of field archaeology, (sup ): s –s . doi: https://doi.org/ . / . .
gibson, w. . cyberpunk (documentary), directed by marianne trench, produced by peter von brandenburg, an intercon production. timecode, : of : . available at https://youtu.be/xxtuegel eq?t= [last accessed november ].
gonzález-ruibal, a. . beyond the anthropocene: defining the age of destruction. norwegian archaeological review, ( – ): – . doi: https://doi.org/ . / . .
gonzález-ruibal, a, gonzález, p and criado-boado, f. . against reactionary populism: towards a new public archaeology. antiquity, ( ): – . doi: https://doi.org/ . /aqy. .
graeber, d. . the utopia of rules: on technology, stupidity, and the secret joys of bureaucracy. london: melville house.
graeber, d. . bullshit jobs: a theory. london: allen lane.
hedge, r and nash, a. . assessing the value of community-generated historic environment research. historic england and worcestershire county council. available at https://research.historicengland.org.uk/report.aspx?i= [last accessed november ].
huvila, i and huggett, j. . archaeological practices, knowledge work and digitalisation. journal of computer applications in archaeology, ( ): – . doi: https://doi.org/ . /jcaa.
jones, b. . past imperfect: the story of rescue archaeology. london: blackwell.
kristiansen, k. . contract archaeology in europe: an experiment in diversity. world archaeology, ( ): – . doi: https://doi.org/ . /
lancaster city council. . about the local plan. available at: http://www.lancaster.gov.uk/planning/planning-policy/about-the-local-plan [last accessed: november ].
lifshitz-assaf, h. . dismantling knowledge boundaries at nasa: the critical role of professional identity in open innovation. administrative science quarterly, ( ): – . doi: https://doi.org/ . /
lifshitz-assaf, h. . nyu stern professor hila lifshitz-assaf on innovation at nasa. st may [online access at https://youtu.be/lzw ks yjgm last accessed november ].
lukas, d, engel, c and mazzucato, c. . towards a living archive: making multi layered research data and knowledge generation transparent. journal of field archaeology, (sup ): s –s . doi: https://doi.org/ . / . .
manley, j. . old stones, new fires: the local societies. in: beavis, j and hunt, a (eds.), communicating archaeology, – . oxford: oxbow. doi: https://doi.org/ . /j.ctvh dhfr.
mason, p. . the new spirit of postcapitalism. international politics and society. available at: https://www.ips-journal.eu/regions/europe/article/show/the-new-spirit-of-postcapitalism- / [last accessed: november ].
matsuda, a. . a consideration of public archaeology theories. public archaeology, ( ): – . doi: https://doi.org/ . / . .
mcanany, pa and rowe, sm. . re-visiting the field: collaborative archaeology as paradigm shift. journal of field archaeology, ( ): – . doi: https://doi.org/ . / y.
morgan, c and eve, s. . diy and digital archaeology: what are you doing to participate? world archaeology, ( ): – . doi: https://doi.org/ . / . .
nativ, a. . on the object of archaeology. archaeological dialogues, ( ): – . doi: https://doi.org/ . /s
neal, c. . heritage and participation. in: waterton, e and watson, s (eds.), the palgrave handbook of contemporary heritage research, – . uk: palgrave macmillan. doi: https://doi.org/ . / _
nixon, t. . what about southport? a report to cifa on progress against the vision and recommendations of the southport report ( ), undertaken as part of the st-century challenges in archaeology. available at: http://www.archaeologists.net/ st-century-challenges-archaeology [last accessed: november ].
olivier, ach. . excavation of a bronze age funerary cairn at manor farm, near borwick, north lancashire. proceedings of the prehistoric society, : – . doi: https://doi.org/ . /s x
oxford dictionary. . word of the year is… available at: https://en.oxforddictionaries.com/word-of-the-year/word-of-the-year- [last accessed november ].
pariser, e. . the filter bubble: what the internet is hiding from you. new york: penguin. doi: https://doi.org/ . /
perry, s. . who actually profits from web-based crowdsourcing and crowdfunding in archaeology? a critique of the short and long-term impacts of crowd work. paper presented to the eaa, glasgow, . abstracts book.
perry, s. . the enchantment of the archaeological record. european journal of archaeology, : – . doi: https://doi.org/ . /eaa. .
perry, s and beale, n. . the social web and archaeology’s restructuring: impact, exploitation, disciplinary change. open archaeology, : – . doi: https://doi.org/ . /opar- -
perry, s and taylor, js. . theorising the digital: a call to action for the archaeological community. in: matsumoto, m, et al. (eds.), oceans of data: proceedings of the th conference on computer applications and quantitative methods in archaeology, – . oxford: archaeopress.
pick, f. . welcome to the age of participation. in: cabraal, a and basterfield, s (eds.), better work together: how the power of community can transform your business. wellington: enspiral foundation.
poehler, ee and ellis, sjr. . the season of the pompeii quadriporticus project: the southern and northern sides. fasti online documents and research, : – . available at https://core.ac.uk/download/pdf/ .pdf [last accessed november ].
powlesland, d and may, k. . digit: archaeological summary report and experiments in digital recording in the field. internet archaeology, . doi: https://doi.org/ . /ia. .
richardson, l-j. . a digital public archaeology? papers from the institute of archaeology, ( ): – . doi: https://doi.org/ . /pia.
richardson, l-j. . public archaeology in a digital age. unpublished phd thesis, university college london, london, uk. available at: http://figshare.com/authors/lorna_richardson/ [last accessed: th april ].
richardson, l-j. . i’ll give you ‘punk archaeology’, sunshine. world archaeology, ( ): – . doi: https://doi.org/ . / . .
richardson, l-j and dixon, j. . public archaeology : letting public engagement with archaeology ‘speak for itself’. internet archaeology, . doi: https://doi.org/ . /ia. .
richardson, l-j, law, m, andrew dufton, j, ellenberger, k, eve, s, goskar, t, ogden, j, pett, d and reinhard, a. . day of archaeology – : global community, public engagement, and digital practice. internet archaeology, . doi: https://doi.org/ . /ia. .
roosevelt, ch, cobb, p, moss, e, olson, br and ünlüsoy, s. . excavation is destruction digitization: advances in archaeological practice. journal of field archaeology, ( ): – . doi: https://doi.org/ . / y.
rosenberg, s. . virtual reality check digital daydreams, cyberspace nightmares. san francisco examiner, section: style, april, c .
sayer, f. . politics and the development of community archaeology in the uk. the historic environment: policy & practice, ( ): – . doi: https://doi.org/ . / z.
scanlon, k, fernandez, m, travers, t and whitehead, c. . an economic analysis of the market for archaeological services in the planning process. in: the southport group (eds.), realising the benefits of planning-led investigation in the historic environment: a framework for delivery. reading: institute for archaeologists. available at: https://www.archaeologists.net/sites/default/files/southportreporta .pdf [last accessed: november ].
shanks, m and witmore, c. . archaeology . ? review of archaeology . : new approaches to communication and collaboration [web book]. internet archaeology, . doi: https://doi.org/ . /ia. .
shapiro, c and varian, hr. . information rules: a strategic guide to the network economy. harvard: harvard business review.
stokes, k, clarence, e, anderson, a and rinne, a. . making sense of the uk collaborative economy. nesta. available at: https://media.nesta.org.uk/documents/making_sense_of_the_uk_collaborative_economy_ .pdf [last accessed: november ].
taylor, js, issavi, j, berggren, a, lukas, d, mazzucato, c, tung, b and dell’unto, n. . ‘the rise of the machine’: the impact of digital tablet recording in the field at Çatalhöyük. internet archaeology, . doi: https://doi.org/ . /ia. .
thomas, s. . community archaeology in the uk: recent findings. york: council for british archaeology.
timms, h and heimans, j. . new power: how it’s changing the st century – and why you need to know. london: macmillan.
trigger, b. . a history of archaeological thought, nd edition. cambridge: cambridge university press.
trow, s. . years of development-led archaeology in england: strengths, weaknesses, opportunities and threats. in: novaković, p, horňák, m, guermandi, m, stäuble, h, depaepe, p and demoule, j-p. (eds.), recent developments in preventive archaeology in europe. proceedings of the nd eaa meeting in vilnius, . ljubljana: ljubljana university press. available at: https://pdfs.semanticscholar.org/dc d/ b d ec bd cbe b ce e bd.pdf [last accessed november ].
von hippel, e. . sources of innovation. oxford: oxford university press.
von hippel, e. . democratizing innovation. cambridge, ma: mit press. doi: https://doi.org/ . /mitpress/ . .
wainwright, g. . time please. antiquity, ( ): – . doi: https://doi.org/ . /s x
walcek averett, a, gordon, jm and counts, db. (eds.) . mobilizing the past for a digital future. grand forks: the digital press at the university of north dakota.
watson, s. . whither archaeologists? continuing challenges to field practice. antiquity, – . doi: https://doi.org/ . /aqy. .
westcott wilkins, l. . the ‘real-time’ team: the future of fieldwork. current archaeology, : – .
wilkins, b. . rumsfeldian archaeology. current archaeology, : . available at https://www.archaeology.co.uk/issues/ca- .htm [last accessed november ].
wilkins, b. . social contract archaeology: a business case for the future. paper presented to the th annual meeting of the eaa, helsinki, finland.
wilkins, b. . digventures. the archaeologist, : – . available at https://www.archaeologists.net/sites/default/files/ta -website.pdf [last accessed november ].
wilkins, b. . a theory of change and evaluative framework for measuring the social impact of public participation in archaeology. european journal of postclassical archaeologies, : – .
wilkins, b. in prep. digging the crowd: the future for archaeology in the digital and collaborative economy. phd thesis. university of leicester, school of museum studies.
wilkins, b, noon, s, roberts, b, ungemach, j and caswell, e. . barrowed time: a community-based archaeological excavation, lancashire. digventures. available at: https://digventures.com/reports/?fwp_reports_project=barrowed-time [last accessed november ].
willems, w and van den dries, m. . the origins and development of quality assurance in archaeology. in: willems, w and van den dries, m (eds.), quality management in archaeology, – . oxford: oxbow books.
wills, j. . the world after ppg : st century challenges for archaeology. reading: historic england & cifa.
woolverton, j. . ‘becoming history ourselves’: a study of age demographics in community archaeology societies. journal of community archaeology & heritage, ( ): – . doi: https://doi.org/ . / . .
wylie, a. . community-based collaborative archaeology. in: cartwright, n and montuschi, e (eds.), philosophy of social science: a new introduction, – . oxford: oxford university press.
zuboff, s. . the age of surveillance capitalism: the fight for a human future at the new frontier of power. london: profile books.
how to cite this article: wilkins, b. . designing a collaborative peer-to-peer system for archaeology: the digventures platform. journal of computer applications in archaeology, ( ), pp. – . doi: https://doi.org/ . /jcaa.
submitted: january accepted: february published: march
copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.
journal of computer applications in archaeology is a peer-reviewed open access journal published by ubiquity press.
final published version: http://dsh.oxfordjournals.org/cgi/reprint/fqw ?ijkey=qfon r n ss gac&keytype=ref
modelling in digital humanities: signs in context
arianna ciula, university of roehampton, london, uk
Øyvind eide, university of cologne, germany
introduction
in this paper we focus on modelling as a creative process to gain new knowledge (meaning) about material and immaterial objects by generating and manipulating external representations of them. modelling is widely understood and used as a heuristic strategy in the sciences (frigg and hartmann ; mahr ) as well as in digital humanities (hereafter dh) research where it is considered a core practice (mccarty : – ). in the last two decades there has been a significant development of theory that complements the practice-based tradition of the field (e.g. ibid; buzzetti ; beynon et al. ; jannidis and flanders ; flanders and jannidis ). we aim to enrich the current theoretical understanding by contextualising dh practices within a semiotic conceptualisation of modelling. a semiotic approach enables us to contextualise dh modelling in a scholarly framework well suited to humanistic enquiries, forcing us to investigate how models function as signs within specific contexts of production and use. kralemann and lattmann’s ( ) semiotic model of modelling complemented by elleström's ( ) theories on iconicity are some of the tools we use to inform this semiotic perspective on modelling. we then go on to contextualise kralemann and lattmann’s theory within modelling practices in dh by using three examples of dh models representing components and structure of historical artefacts.
we show how their model of models can be used to understand and contextualise the models we study and how their classification of model types clarifies important aspects of dh modelling practice.
what is modelling?
in this paper we take a pragmatic definition of modelling as a starting point. indeed, interdisciplinary theories around modelling are used mainly to inform our analysis of modelling practices. by modelling we mean a creative process of thinking and reasoning where meaning is made and negotiated through the creation and manipulation of external representations. we narrow this definition further by applying it to modelling as a research strategy: modelling is a process by which researchers make and manipulate external representations – what godfrey-smith ( ) calls ‘imaginary concreta’ – to make sense of the conceptual objects and phenomena they study.
modelling in dh is often understood as “any act of formal structuring” of data intended as “formal information” (flanders and jannidis : ). our point of departure (see also ciula and eide ; ciula and marras ) is however wider, precisely to allow us to explore whether a more encompassing definition can overcome some limitations of a narrower take on modelling. rather than prioritising a conceptualisation of modelling directed first and foremost at communicating with the computer, we attempt to see modelling as a means to create “tools for thinking” (bradley ). our pragmatic understanding of modelling is comparable to what beynon et al. call empirical modelling: “model-building in em [empirical modelling] evolves through an extended process of observation and experiment in which exploration and negotiation of meaning play a fundamental role” ( : ).
in our work we make specific reference to peirce's semiotic pragmatism rather than jamesian pragmatism, since the latter implies a different understanding of experience and hence of the use of the term 'pragmatism'. see olteanu ( - ) for an informed and detailed explanation of this. what is particularly insightful in peirce's philosophy for us is his “understanding of life in term of phenomena of signification” (idem: ), which goes beyond and even against the epistemological account of (relativist) experience in james.
semiotics and dh modelling
rather than framing our reflection on modelling around human-machine communication or on implementative purposes in a strict sense, we propose to consider modelling as a process of signification and reasoning in action. contextualising modelling within a semiotic framework means to consider it as a strategy to make sense (signification) via practical thinking (creating and manipulating models). we use an interdisciplinary perspective on modelling to guide us both in understanding how models as signs are made (the construction of models) and in understanding how something new is discovered in the process of making and using models (the epistemic and heuristic value of models).
dynamic relation models/objects/interpretations
kralemann and lattmann ( ) claim that models should be understood as signs in the peircean sense. in peirce’s seminal theory of signs, the sign is a triadic relation between a representamen (the sign from which the relation begins, sometimes also called in the literature the sign-vehicle), its object, and the interpreting thought. often represented as a tripod where the three ‘composing elements’ (olteanu : ) – object, representamen and interpretant – intersect, the sign for peirce is hence, first and foremost, relational. the experience of interpreting signs or signification (semiosis) is therefore intrinsically dynamic. as a consequence, a semiotic approach which considers models as signs gives high prominence to a dynamic view on models, reinstating in renewed terms the value of modelling as an open process – in particular a process of signification.
fig. : the model relation includes the following components: a set of objects o_i, i = 1, ..., n (what kralemann and lattmann call the ‘extension’ of the model), a theory or language (what they call the ‘intension’ of the model) and an object o_mod (its attributes define what kralemann and lattmann call the ‘syntax’ of the model). for the subject who chooses the o_i and a theory or language, o_mod becomes a model of the objects o_i on the basis of a representational relation between its syntax and the semantic attributes of the o_i. this relation is determined by the context of a theory as well as by the purpose of the specific act of modelling.
this echoes of course mccarty’s approach to modelling as “orientation to questioning rather than to answers, and opening up rather than glossing over the inevitable discrepancies between representation and reality on which that questioning focuses” (mccarty : ).
this work of contextualising modelling within a semiotic approach builds on kralemann and lattmann ( ) as well as its recent applications to modelling in dh (ciula and marras ; ciula and eide ).
models as icons
the semiotic theory of signs proposed by peirce identifies three types of signs based on the relation between the object and the sign: symbols (e.g., conventional names used to denote objects), icons (e.g., onomatopoeic words such as ‘splash’), and indexes (signs used to point directly to their meaning, such as ‘there’). in this respect, kralemann and lattmann ( : – ) claim that models are icons, because the dominant relation with the objects they represent is one of similarity, as shown in fig. . in peircean theory, such iconic relation of similarity is what makes icons signify; icons act as signs based on how the relation of similarity is enacted: via simple qualities of their own in case of images, via analogous relations between parts and whole and among parts in the case of diagrams, and via parallelism of qualities with something else in the case of metaphors (olteanu : and ). different shades of iconic similarity between sign and object as theorised by peirce correspond to three kinds of models in kralemann and lattmann:
● image-like models, for example real life sketches where single qualities such as forms and shapes enable them to act as signs of the original objects they represent in given circumstances;
● relational or structural models, for example diagrams such as the relation exhibited in the graph of a mathematical equation, where the ‘interdependence between the structure of the sign and the structure of the object’ (ibid., ) enables the modeller to make inferences about the original by manipulating its model;
● metaphor-like models which represent attributes of the original by a non-standard kind of parallelism with something else which generates further models (metaphors are metamodels; ibid., ).
the distinction between the three types of hypoicons is not meant to be clear-cut. we follow elleström ( ) amongst others in seeing these types as grades of a continuum or even of a development rather than separate categories.
in kralemann and lattmann’s theory as well as in peirce’s original theory, models do not act as signs in virtue of themselves. what establishes the model as a sign is the interpretative act of a subject, whether as creator or reader. the practical act of modelling connects the model to its interpretation, that is, to its specific semantic content in a given social and institutional context (ibid., – ). the modeller’s judgement depends on his or her presuppositions connected to “theory, language or cultural practice” (ibid., ). models are contingent. kralemann and lattmann also reiterate the concept of models as middle ground between theory and objects. the relationship of iconicity between the model and the object being modelled is partly externally determined (it relies on the similarity between the model and the object) and partly internally determined (it depends on theory, languages, conventions, scholarly tradition, etc.). based on this duality they stress, on the one hand, the subjectively determined dependency of models on prior knowledge and theory and, on the other, their independence from these in light of the specific conditions of the objects being modelled.
beynon et al. defend such a pragmatic or empirical approach to modelling (based on william james’ pluralist philosophy of ‘radical empiricism’) which emphasises the role of informal semantics over the ‘formal semantics of computation’ ( : ). “[...] all kinds of conception of model are possible through assuming different kinds of context, observation, and agency”. (ibid.
) on the historical contingency of models especially within the context of economics see morgan ( : – ).
extensive literature in philosophy of science especially focusing on the use of models in the empirical sciences recognises models (including computational models) as mediators between theory and objects of analysis (e.g. winsberg ; morrison ). within a semiotic context, this finds a parallel in the concept of sign-vehicles functioning as mediators between denotational and connotational qualities, between thing and meaning (maceachren : ).
similarity, iconicity, and reasoning
one consequence of seeing models as icons is that through an understanding of the process by which icons are made and used we can gain new insights on how models are built and used. this understanding highlights similarity as a key to link models to the modelled:
representation based on resemblance generally falls under the heading of ‘iconicity’. when something is understood to be a sign of something else because of shared, similar qualities, it is referred to as an iconic sign (elleström : ).
the notion of iconicity is however not only about how models (as signs) appear with respect to similarity to their objects. it also encompasses the possibility of manipulating models and reasoning with them. this is another point of connection between models and icons, a point that goes to the core of dh practice. following nersessian, we subscribe to an expanded understanding of reasoning as ‘creative reasoning’ beyond logic and spanning the 'continuum' between ordinary and scientific problem-solving. model-based reasoning is not a simple recipe always leading to correct solutions, and reasoning cannot be equated with logic.
most scientific practice does not fit the traditional philosophical ‘gold standard’ of deductive reasoning. “the ‘hypothetico-deductive’ method, which comprises hypothesis generation and the testing of deductive consequences of these, is a variation that focuses the fallibility of science with respect to the premises. this leaves out of the account the prior inferential work that generates the hypotheses. [...] in model-based reasoning, inferences are made by means of creating models and manipulating, adapting, and evaluating them. [...] analogical, visual, and simulative modeling are used widely in ordinary and in scientific problem solving, ranging from mundane to highly creative usage. on a cognitive-historical account, these uses are not different in kind, but lie on a continuum.” ( : - ). we wish to thank gabor toth for pointing out the relevance of nersessian's work to our research.
fig. : the peircean trichotomy of signs into icons, indexes and symbols based on the relation with its object (of similarity in the case of the icons) and the subsequent classification of icons (or rather pure icons or hypoicons) into images, diagrams and metaphors based on how the respective similarity relations signify. highlighted in grey are the sign types associated with models by kralemann and lattmann ( , fig. ).
modelling in dh has a hybrid nature which combines implementation-oriented work with methodological inquiries bearing implications beyond the specific implementation. this distinction has recently been verbalised as one between altruistic and egoistic modellers in jannidis and flanders ( , ) and as one between modelling for production and modelling for understanding in eide ( a).
an altruistic modeller will create a model for others’ use, often as part of a production project, whereas an egoistic modeller will create a model to be used at the individual level or by a group to inquire into a specific area of interest. in the latter case using models to reason with is considered to be a main goal of modelling, whereas in the former it rather forms part of a process with a mainly practical goal, for example the publication of a collection of documents. this distinction can be useful in analytical terms, but is problematic in that it ignores that all models are used as external representations to facilitate reasoning. any model used in dh will to some extent be used for reasoning, and especially shared reasoning or negotiation of meaning. a model gives us a common language to talk about the world. to take one example: the text encoding initiative (tei) does not only give us a method for marking up texts, but also a language and formalism in which to think about textual phenomena such as manuscripts or poems. as stressed in stachowiak ( : ), stringent and exact systems for making deductions are useful also when no generally agreed upon objective reality exists; they can even be more necessary when reality is elusive and negotiable.
while it is outside the scope of this paper to account for the nuanced and precise terminology adopted by peirce, it should be noted that he defines a subclass of icons called hypoicons which are in their turn divided into images, diagrams and metaphors; for a recent detailed and comprehensive overview of peirce’s categories and taxonomy of signs see olteanu ( : – ).
the use of models as external representations to reason with has important points of connection with peirce’s thinking about icons and reasoning:
similarity, which is the root of iconicity, is not simply an absolute trait that is ready to be picked up in the external world; instead it is a perceived quality processed by subjective attention and selection, and a potent force in cognition. (elleström : )
according to peirce, “it is by icons only that we really reason” (peirce , cp . [ ]). in more recent literature, cognitive sciences and the philosophy of scientific modelling have been brought together (nersessian ). in particular, within theorisations of distributed cognition (hoffman : ), thinking processes are seen as being distributed in the world and shared among different people through external mediations. historical accounts of scientific practices establish model-based reasoning as a social problem-solving strategy comparable to practices in everyday life (nersessian ). when we share our scholarly ideas using models in reasoning and discussion, this is a type of process which is fundamentally icon-based in peirce’s sense. the role of graphical representations in “external cognition” is described by hoffman ( : ) as “diagrammatic reasoning to solve problems, to cope with complexity, to learn something new, or to resolve conflicts.” seen as icons, such diagrams fall into a wide variety of model types, from toy cars used as scale models to mathematical formulae and semantic networks.
as pointed out by beynon et al., “it is now possible to make computer models with which we can deliberately dwell upon our personal understanding of something of interest for its own sake, and without any functional use yet in mind” ( : ).
http://www.tei-c.org/ (checked - - )
why do we make such external representations? wood ( : ) distinguishes between the process of mapping and the one of mapmaking, which consists of the difference between a gesture leaving no physical trace and making a permanent inscription. the choice is based on the needs in concrete communication situations: if the communication need is complex, a map is better than just an allusive gesture. this distinction is not sharp and it is connected to the continuum between communication and reasoning, as pointed out by hoffman:
when i draw a map to explain a friend how to drive to a certain location, i would communicate by means of a diagram but i would not reason with it. diagrammatic reasoning is about problem solving, decision making, knowledge development, and belief change by means of diagrams. however, i do not presuppose a clear cut distinction between diagrammatic communication and diagrammatic reasoning. there might be a continuity between both these possibilities. (hoffman : - )
especially in project-based dh practice, where interdisciplinary groups work together to solve problems at practical as well as theoretical levels, reasoning and communication act as two sides of the same coin.
mccarty ( : - ) qualified extensively the relationship between diagramming and modelling.
grades of iconicity
as shown above, peirce distinguished between three types of hypoicons: images, diagrams, and metaphors. let us take the example of an apple. the image of an apple put up in the window of a grocery shop has a signification immediately perceived by a hungry tourist passing by. she will assume that the sign on paper, through its image-like resemblance with a real apple, indicates that apples are sold in the shop.
while this immediacy is not there to be seen for everyone and in every circumstance (it would not work for a person who does not know what an apple is or what it looks like, and it would not necessarily be experienced by somebody not interested in buying apples there and then), it is still general enough to be defined as an immediate image for apples within a given context. a botanical visualisation of the reproduction system of the apple plant can be used to exemplify a diagrammatic icon of apples. the diagram exhibits the structural similarity between the form of the organs as represented in the diagram and the organs we find in actual apples. finally, a metaphorical icon can be exemplified by a representation of an apple as a sign of sin. this can be expressed in various forms, such as an apple in a biblical painting or expressions such as “she gave me the apple.” the whole expression – reduced to ‘sin is an apple’ – is the metaphor implying a relation between the apple and sin (the object of the model). this sign relation makes it possible for the object of the sign ‘apple’ to become an icon for the object of the sign ‘sin’ (cf. the example provided by kralemann and lattmann : – ), establishing a chain of signs. hence the words of the poet pablo neruda “innocence is round like an unbitten apple” (ode to the apple). the relationship between the metaphorical icon and what it refers to is one of complex cognitive leaps and is highly creative, as argued by elleström and illustrated in fig. : the representamen of an image is perceptually close to its object, which means that the object may be sensuously perceived in much the same way as the representamen (this is a conception that is close to peirce’s own few remarks on the image).
the representamen of a metaphor is at a greater distance from its object, which means that the interpretation of a metaphor includes one or several cognitive leaps that make the similarity between representamen and object apparent. (elleström : )
fig. : the argument thus far builds on the concept of grades of iconicity, whereby icons form a scale with varying degree of complexity at the conceptual level. metaphors involve a greater distance from their objects compared to diagrams and images.
what we see clearly in the semiotic understanding of modelling is how the analytical dichotomy objects vs. models is useful, but also misleading. for analytical purposes the object is the apple and the models (icons) are the three different examples. but the object changes when the model changes; the meaning of the apple in the metaphorical example above is different from the apple in the diagrammatic example. the context of the interpretation changes the sign but the sign also changes the context of interpretation.
note that the term ‘sensuously’ (rather than, e.g., ‘sensorial’) occurs here for specific reasons. while one of our current senses of ‘sensuous’ has hedonistic and even erotic connotations, this was not the case for philosophers in the th century. for continental philosophy in particular (e.g., kant and hegel) the term ‘sensuousness’ is used in connection to the immediacy of nature and in relation or opposition to conceptual understanding. sensuous encounter is hence considered to be devoid of analytical consciousness and intention. peirce uses the term to refer to the impression of experience in its (conscious) immediacy as well as individuality situated in space and time with no ontological or moral bearing.
space and time
in his model of media modalities, elleström distinguishes between four modalities, namely the material, sensorial, spatiotemporal, and semiotic modalities (elleström ). this is not a claim for any linear development through the modalities; it is rather an analytical distinction to clarify various aspects of a media expression. different configurations of the four modalities can be used to specify the characteristics of specific media. while the focus in this paper is on the semiotic modalities of models as media expressions, our analysis, as we will see later with the examples, also considers the other three modalities. for our purpose it is especially important to understand how the spatiotemporal modality structures the experience of the material interface through which we encounter a media expression into conceptions of space and time. when we read a text and when we study a map we act in time. but time operates differently in the two cases. in most types of text the space of the printed or written page is turned into one or several sequences of characters and words, read in a pre-defined order. in studying a map we can let our eyes wander in any pattern while still getting to the meaning of the map.
examples in dh
in this paper we take previous research (ciula and eide ; ciula and marras ) one step further by mapping kralemann and lattmann’s trichotomy of models as icons to examples of digital modelling in dh research dealing with historical artefacts. these prototypical cases were chosen to investigate how model types relate to the cultural objects they represent and how modellers reason with them.
on elleström's system for media modalities applied to modelling of spatial information in dh, see eide ( b).
if we accept kralemann and lattmann’s argument it follows that by modelling we link models to qualities and relationships already existing in the objects being modelled. such linking is based on choices which are made for a certain end informing and motivating the act of modelling. models are contingent, created in actual scholarly situations of production and use. a model is partially arbitrary in that the same inferences drawn by manipulating one model could have been reached in other ways, for instance using a different model. in this framework, models operate as sign-functions initiating a sign-relation (model-relation). to understand their epistemic role, we need to look at both how they come to be and how the similarity relation with the object is realised. by analysing the association of syntactic attributes of the source object with the attributes of the model we focus on the latter; that is, the representational correspondence. to explain the semantics of the model, the analysis of the similarity relation needs to be complemented with an analysis of the overall sign-relation in which production and use of models is enacted, as indicated in fig. . three examples will be used to analyse the three types of sign-functions and relations in a dh context. in general one could say that every dh model is a diagram in that it is a formalism of logical and mostly mathematical nature; in this respect, flanders and jannidis talk about ‘data structure’ as different from ‘data modelling’ (flanders and jannidis, , ). however, we believe we can in fact identify different grades of iconicity corresponding to the three model types mentioned above, namely image, diagram, and metaphor. 
the classic example that comes to mind to represent an http://dsh.oxfordjournals.org/cgi/reprint/fqw ?ijkey=qfon r n ss gac&keytype=ref final published version: http://dsh.oxfordjournals.org/cgi/reprint/fqw ?ijkey=qfon r n ss gac&keytype=ref image-like model is a d graphic model such as, for instance, the model of an historical monument. the digital model acts as a surrogate of or a substitute for the reconstruction of the real object. a diagrammatic version of the same model could be the mathematical equations used to create the graphical d model. below we dwell on three examples in detail. example : image-like model we will use an example from digital palaeography research (ciula ; ), where the abstract model letter acts as an image-like model of the samples it was algorithmically generated from. what we can learn about the objects of analysis (the medieval handwritten letterforms) depends on the features being selected in the modelling process. what is relevant for the scope of this paper is that the inferential power of the model is mainly based on a strong immediate similarity (what above was called resemblance) between model and object. we can unpack this further by stating that the similarity is first and foremost of spatial nature: the handwritten letter is a two-dimensional spatial object as its spatial model is. however, their temporalities are different. we encounter single instances of letters in the manuscript pages, while the morphing models shown in fig. incorporate variants that can be visualised in sequence. this specific palaeographical model is based on immediate similarity relevant for this context. the ‘a’ of the model looks very much alike the ‘a’ of the handwriting in the manuscript, they have the same spatiality. its hermeneutical power relies, however, also on a different temporality between object and model. anchoring the reasoning on the spatial similarity and going beyond it enables us to learn new things about the object. 
indeed, new inferences are fostered by the availability of an ‘actual’ temporal element in the morphing of the model. while we have to look at all single instances in the manuscript, we get a model which incorporates all variants and by sliding from left to right, we can ‘see’ those variants in real time. the object itself, however, is not temporal in this sense. so while the model is an abstraction – a fuzzy image which loses the precision of the instances out of which it was generated (representation is indeed asymmetrical) while keeping a basic (symmetrical) similarity to it – it gains an actual temporal mode that the single instance objects do not hold. if the modeller can make any inferences this is also due to her awareness of scribal variants and of what morphological traits are more revealing of different dating and location than others. so context and prior knowledge are important not only for the creation of models but also – not surprisingly – for their interpretation.
note that interpretation involves multiple and intertwined processes of signification; iconic signs are indeed “mixed with indexical and symbolic ways of interpreting” (cf. elleström : ).
fig. : image-like model. morphological features of segmented letter forms are modelled into an average morphing letter. inferences on the manuscript handwriting are based on the analysis of the morphing letter-models in virtue of an ‘immediate resemblance’ between the original letters and the model.
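The idea of an average letter that can be morphed through its variants can be illustrated with a very small sketch. It is not the algorithm used in the palaeographical research cited above; it only assumes, for illustration, that each segmented letter sample is a grid of pixel intensities of identical size. The names `Glyph`, `average_glyph` and `morph` are hypothetical.

```python
from typing import List

# Assumption: a letter sample is a grid of pixel intensities in [0, 1].
Glyph = List[List[float]]


def average_glyph(samples: List[Glyph]) -> Glyph:
    """Pixel-wise mean of equally sized samples: a 'fuzzy' image-like model."""
    if not samples:
        raise ValueError("need at least one sample")
    rows, cols = len(samples[0]), len(samples[0][0])
    for sample in samples:
        if len(sample) != rows or any(len(row) != cols for row in sample):
            raise ValueError("all samples must share the same dimensions")
    count = len(samples)
    return [
        [sum(sample[r][c] for sample in samples) / count for c in range(cols)]
        for r in range(rows)
    ]


def morph(start: Glyph, end: Glyph, t: float) -> Glyph:
    """Linear blend between two glyphs; t=0 returns start, t=1 returns end."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0 and 1")
    return [
        [(1.0 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(start, end)
    ]
```

The average keeps the spatial layout of the samples but loses their individual precision, while stepping `t` from 0 to 1 adds the sequential, temporal reading that no single manuscript instance has.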
example : relational model
as an example we will use models of landscapes described in historical sources, where textual information is modelled in the form of maps (eide b). the inferential power of the model relies on the analogous relational structure between object and model. when the text says “a is north of b” it makes a claim about a geometrical relationship between places denoted in the text. a map showing a north of b makes a claim expressing a similar geometrical structure. what new knowledge we can gain about the object of analysis depends very much on the correspondence between the structuring of the textual expressions in the modelling process and the structure of the map model. the model–object relationship here is not between an expression and a landscape but between two expressions in different media, as shown in fig. . these media express structural relationships in fundamentally different ways. in order to see the structural similarity one needs to understand the written language being used in the text, the schemata used in topographical maps to convey meaning, and have experience of real landscapes. these elements define the context of the model. in this example ‘similarity’ is not immediate resemblance. the digital model – the map – looks completely different from the source object – the text, but there is a structural similarity between the two. this structural similarity possesses a strong hermeneutical potential. it can be used to reveal gaps; there are things expressed in the text that cannot be put on the map.
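A deliberately naive sketch can make the structural correspondence, and its limits, concrete. It is not the modelling procedure of the experiments cited above. It assumes that textual claims have already been encoded as hypothetical `(subject, relation, object)` triples, and that only `north_of` and `east_of` can be drawn, each as exactly one grid step. That rigid step is itself an assumption the text never makes, which is part of the point.

```python
from typing import Dict, List, Tuple

# Assumption: textual claims are pre-encoded as (subject, relation, object).
Statement = Tuple[str, str, str]
Point = Tuple[int, int]

# Only these relations have a geometrical counterpart in this toy map.
OFFSETS: Dict[str, Point] = {"north_of": (0, 1), "east_of": (1, 0)}


def place(statements: List[Statement]) -> Tuple[Dict[str, Point], List[Statement]]:
    """Return grid coordinates plus the statements the map could not express."""
    coords: Dict[str, Point] = {}
    unplaced: List[Statement] = []
    pending: List[Statement] = []
    for statement in statements:
        (pending if statement[1] in OFFSETS else unplaced).append(statement)
    if pending:
        coords[pending[0][2]] = (0, 0)  # arbitrary anchor for the first object
    progress = True
    while pending and progress:
        progress = False
        remaining: List[Statement] = []
        for subj, rel, obj in pending:
            dx, dy = OFFSETS[rel]
            if subj in coords and obj in coords:
                ox, oy = coords[obj]
                if coords[subj] != (ox + dx, oy + dy):
                    unplaced.append((subj, rel, obj))  # contradicts the map so far
                progress = True
            elif obj in coords:
                ox, oy = coords[obj]
                coords[subj] = (ox + dx, oy + dy)
                progress = True
            elif subj in coords:
                sx, sy = coords[subj]
                coords[obj] = (sx - dx, sy - dy)
                progress = True
            else:
                remaining.append((subj, rel, obj))  # not yet connected to the map
        pending = remaining
    unplaced.extend(pending)  # never connected: position left open by the text
    return coords, unplaced
```

For instance, `place([("a", "north_of", "b"), ("area", "open_north_of", "river")])` positions `a` and `b` but returns the open, borderless expression in the second list. The second list is where the analogy between text and map breaks, and it is exactly the part of the output that invites interpretation.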
examples of things that cannot be expressed include open, borderless expressions such as “the area north of the river” and ambiguous expressions such as “either a or b is on the border.” the analogy breaks at some point; the examples show how the signification of rich expressions in the text cannot be communicated via the structure of the map. realising this can lead to new knowledge, or rather to renegotiating what a text can mean, the meanings of a text. based on the structural correspondence and non-correspondence between the virtual geographical space of the text and the geographical space of the map, the map makes the virtual space ‘visible’ and in so doing reveals a dissimilarity. it pinpoints the degree to which the text is underspecified spatially, how open the virtual space of the text is. this forces our understanding of the text to change.
fig. : relational model. relational textual expressions are modelled into geometrical relations. inferences on space as expressed in the text are drawn in virtue of the corresponding spatial structure in the map.
various attempts have been made to put such things on maps. see eide ( b) for an extensive discussion.
this is exactly what happened in the modelling experiments described in eide ( b), where differences between the structures expressed in the text and structures expressible as maps were found. the model could not express what the source object expressed.
example : metaphor-like model
finally, we will use the example of network models used to capture information about references to persons in historical sources.
these can be used to tie specific textual passages to real world historical entities, but also to form parts of networks of co-references (eide ). the association of things shaped as woven networks (e.g. leaf venation, a spider or a fishing net) or of technical networks (e.g. in telecommunication) to describe relationships between people is metaphorical. the inferential power of the model leverages a deep conceptual similarity between the model (the topography of a network) and the object (e.g., kinship of historical characters). it can generate unexpected connections between the objects it represents, which exist ‘only’ metaphorically in a network. in the example in fig. we see a historical picture of a man and a woman laying her hand on his. the literature on the reading of this th century painting by jan van eyck is vast. for example, one interpretation of this image sees it as a claim that the two depicted persons are married; another suggests more subtly that the joining of arms is rather an act of presentation by the man in the picture of the child to be borne in the woman’s womb to the destinatary in the mirror, hence exhibiting the fatherhood of the painter (lancioni ). whatever the symbolic link between the figures, the physical link establishes a bond between them. this bond can be associated to and hence expressed as a link between two nodes in a network.
in kralemann and lattmann ( ) these models are claimed to be based on semiotic similarity, but this appears categorically misleading to us so we privilege the concept of metaphor taken from peirce.
for a recent discussion on the benefits and pitfalls of the use of network as metaphor in social sciences see erickson ( ).
the national gallery, london, image number ng .
fig. : metaphor-like model.
person names and their relationships as referred to by a document are modelled respectively into entities (nodes) and into properties connecting them (links). assertions of co-reference are also modelled into properties connecting entities. thus the net is used to model social relations as well as assertions about people.
there are also other types of links deduced from historical documents that can be expressed using a network model. one is co-reference, for instance in the case where two person references expressed by two different statements, such as names in texts or pictures of identifiable persons, refer to the same person. a source can for instance claim that b and c, the person on the image and a name in a text, refer to the same person. such claims can also be expressed as links between nodes in a network. both these types of links are metaphorical. there are no strings attaching occurrences of names referring to the same historical characters to each other, and there are no connections between historical persons that bear any structural similarity to the topography of a net. the social network in the model is a projection of a conceptual framework. concepts from our understanding of social relations are combined with a sequential object, the text, and a two dimensional painting, to form a spatial network model. but the development and use of such models change our view on history; we start seeing relationships as networks. the network gains hermeneutical power and makes visible as well as quantifiable aspects of a past family network or societal relations. however, different types of relationships (family vs. co-reference) easily lose their particularity and become ‘just’ links.
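The loss of particularity can be shown with a minimal sketch, assuming a hypothetical encoding of the network as typed `(node, link_type, node)` triples. The node names and link types below are illustrative and are not taken from the cited projects. Grouping nodes by connectivity gives different results depending on whether the link types are respected or ignored.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumption: each link carries an explicit type, e.g. "co_reference".
Link = Tuple[str, str, str]


def components(links: List[Link], keep_types: Optional[Set[str]] = None) -> List[Set[str]]:
    """Group nodes that are connected by links of the kept types.

    With keep_types=None every link counts as 'just' a link.
    """
    adjacency: Dict[str, Set[str]] = {}
    for a, link_type, b in links:
        for node in (a, b):
            adjacency.setdefault(node, set())  # keep nodes even if the link is filtered out
        if keep_types is None or link_type in keep_types:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        group: Set[str] = set()
        stack = [start]
        while stack:  # depth-first traversal of one connected group
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical data: one co-reference claim and one social-relation claim.
links = [
    ("man_in_painting", "co_reference", "name_in_text"),
    ("man_in_painting", "married_to", "woman_in_painting"),
]
```

Calling `components(links)` puts all three nodes into a single group, whereas `components(links, {"co_reference"})` keeps the woman in the painting apart. Once the types are dropped, an assertion about identity and an assertion about a social bond become indistinguishable in the structure of the net.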
the chain of signs becomes greedy and takes over another cognitive space or plane which in fact deals with relations with a different semantics, in our example moving from the plane of assertion of social relations to the plane of assertion of co-reference. a link in the net is just a link, and a documented co-reference relationship becomes like a supposed marriage. this feeds back to our view of the modelled objects; in other words: the context/prior knowledge influences the construction and interpretation of the model, but is also in turn influenced by it.
one meaning can trigger others; e.g., the links between entities not only connote a relation (e.g. kinship), but their length or thickness might also be interpreted as more or less distance between those entities (i.e. more or less related); in this sense the sign (model) takes on a life of its own.
common for all three types of models is the inferential power operating at the interplay between their ‘intrinsic structure’ and their ‘extrinsic mapping’ (kralemann and lattmann , ). indeed, the features being selected in the modelling process are influenced by contextual elements of different kinds, including hypothesis, scholarly methods and conventions, sample selection, and the technologies being used. however, the inferential and epistemic power of the model relies both on extrinsic and intrinsic aspects of the model relation. in the former case, examples show us how – sometimes with vivid immediacy – similarity of existing verifiable qualities between object and model enables dh modellers to manipulate models to make new sense of those objects.
in the latter case, examples show us again how models are conducive to new meaning and further modelling through our exercising of a certain imaginative freedom in selecting salient qualities and associating concepts.

conclusions

in this paper we focused on some aspects highlighted in kralemann and lattmann’s semiotic theory of models with respect to the role of context in modelling acts and the nature of the representational relation between objects and models through practical examples. we believe that these two foci are where modelling practices in dh meet with this semiotic framework in productive ways to explain both formal and open aspects of modelling practices. we contextualised this framework with specific examples of image-like, relational, and metaphor-like modelling in dh research. prior knowledge is a sine qua non to create models in the first place and to use them as interpretative tools with respect to the objects they are signs of (ciula and eide ). the relationships between modelling processes and interpretative outcomes are neither mechanical nor directly causal (ciula and marras ); however, the type of similarity on which modelling relies shapes the interpretative affordances of those ‘anchor’ models. modelling processes bring about investments and burdens with respect to our knowledge of the objects we model. in particular, models as signs relate to the interpretation of those objects in different ways, from the immediate similarity on the image end of the iconic continuum to the imaginative ramifications of conceptual similarity on the metaphorical end.
to understand the inferential, epistemic, and heuristic role of models as sign-relations, we need to look at both how they come to be (context; i.e., how we make our prior knowledge explicit and in most cases formalized) and how the similarity relation with the object is used to create meaning (new knowledge). in summary, studying the “single respects” (kralemann and lattmann : ; in peircean terms “the ground of the representamen”) by which a model becomes a sign for an object is useful to explain both the logic and syntax of dh models within specific contexts. it demonstrates how these models are built as well as how the relation with the object is realised, e.g., in terms of spatio-temporal modalities. the selection of salient qualities or features to exhibit in the models plays a crucial role both in the creation and interpretation of these models. such selection is, however, not necessarily only human-driven. we increasingly use computing algorithms to facilitate or even propose that selection, especially in complex environments where variables are many and interconnected (e.g., pattern recognition in image processing or textual similarity in stylometry). our examples showed how the relationship of iconicity between the model and the object being modelled is partly extrinsically determined (it relies on the similarity between the model and the object) and partly guided by intrinsic choices (it depends on theory, conventions, imaginative associations, and prior knowledge). indeed we showed how the inferential power operates at the interplay between their ‘intrinsic structure’ and their ‘extrinsic mapping’ (kralemann and lattmann : ).
a future challenge would be to explore how the interplay between intrinsic structure of models (selection of salient qualities) and extrinsic mapping (their iconic ground) develops in the creation of scholarly arguments in the humanities. from this exploration of the semiotics of models we gained a different way to look at and analyse models: models as a type of signs mediating between the impressions of experience and freedom of association. in future research we aim to combine further studies of modelling practice in dh with interdisciplinary studies of modelling in the sciences and the long tradition of abstraction, representation, and modelling in the humanities to expand the model of models presented here. the main challenge remains to grasp the iterative and generative translation of informal models into formal ones and vice versa.

bibliography

beynon, w.m., russ, s. and mccarty, w. ( ). human computing: modelling with meaning. literary and linguistic computing , : – .
bradley, j. ( ). how about tools for the whole range of scholarly activities? digital humanities, sydney, australia, june june–july .
buzzetti, d. ( ). digital representation and the text model. new literary history, : – .
ciula, a. ( ). digital palaeography: using the digital representation of medieval script to support palaeographic analysis. digital medievalist, . . http://www.digitalmedievalist.org/journal/ . /ciula/ (accessed november ).
ciula, a. ( ). the palaeographical method under the light of a digital approach. in malte rehbein, patrick sahle, torsten schaßan (eds.), kodikologie und paläographie im digitalen zeitalter / codicology and palaeography in the digital age. norderstedt: bod, – . http://kups.ub.uni-koeln.de/volltexte/ / / (accessed november ).
ciula, a. and eide, Ø. ( ). reflections on cultural heritage and digital humanities: modelling in practice and theory. first international conference on digital access to textual cultural heritage (datech), madrid, spain: – . http://dl.acm.org/citation.cfm?id= &cfid= &cftoken= (accessed november ).
ciula, a. and marras, c. ( ). circling around texts and language: towards “pragmatic modelling” in digital humanities. digital humanities quarterly, ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html (accessed september ).
eide, Ø. ( ). co-reference: a new method to solve old problems. digital humanities, university of maryland, usa, june – : – .
eide, Ø. ( a). ontologies, data modeling, and tei. journal of the text encoding initiative, .
eide, Ø. ( b). media boundaries and conceptual modelling: between texts and maps. houndmills, basingstoke, hampshire: palgrave macmillan.
elleström, l. ( ). the modalities of media: a model for understanding intermedial relations. in elleström, l. (ed.), media borders, multimodality and intermediality. basingstoke: palgrave macmillan, pp. – .
elleström, l. ( ). spatiotemporal aspects of iconicity. in elleström, l., fischer, o. and ljungberg, c. (eds), iconic investigations. amsterdam: john benjamins, pp. – .
erickson, m.
( ). network as metaphor. international journal of criminology and sociological theory, , : – .
flanders, j. and jannidis, f. ( ). knowledge organization and data modeling in the humanities. [white paper]. . url: https://opus.bibliothek.uni-wuerzburg.de/frontdoor/index/index/docid/
frigg, r. and hartmann, s. ( ). models in science. in zalta, e. n. (ed.), the stanford encyclopedia of philosophy. fall ed. stanford, ca: stanford university. http://plato.stanford.edu/entries/models-science/ (accessed november ).
godfrey-smith, p. ( ). models and fictions in science. philosophical studies , : – .
hoffmann, m. h. g. ( ). cognitive conditions of diagrammatic reasoning. semiotica : – .
jannidis, f. and flanders, j. (eds). ( ). knowledge organization and data modeling in the humanities: an ongoing conversation. workshop at brown university. http://datasymposium.wordpress.com (accessed november ).
jannidis, f. and flanders, j. ( ). a concept of data modeling for the humanities. digital humanities, lincoln, nebraska, usa, - july : – .
kralemann, b. and lattmann, c. ( ). models as icons: modeling models in the semiotic framework of peirce’s theory of signs. synthese , : – .
lancioni, t. ( ). il “doppio ritratto” di jan van eyck. uno sguardo impertinente. e/c. rivista dell’associazione italiana di studi semiotici (aiss). http://www.ec-aiss.it/archivio/tematico/arte/arte.php (accessed april ).
maceachren, a. m. ( ). how maps work: representation, visualization, and design. new york: guilford press.
mahr, b. ( ). information science and the logic of models. software & systems modeling, : – .
mccarty, w. ( ). humanities computing. basingstoke: palgrave macmillan.
morgan, m. s. ( ).
the world in the model: how economists work and think. cambridge: cambridge university press.
morrison, m. ( ). models, measurement and computer simulation: the changing face of experimentation. philosophical studies , : – .
nersessian, n. j. ( ). creating scientific concepts. cambridge, mass.: mit press.
olteanu, a. ( ). philosophy of education in the semiotics of charles peirce. a cosmology of learning and loving. oxford, bern, berlin, bruxelles, frankfurt am main, new york, wien: peter lang.
peirce, c. s. ( ). collected papers of charles sanders peirce [cp], volume iv, the simplest mathematics, c. hartshorne and p. weiss (eds). cambridge, mass.: harvard university press.
stachowiak, h. ( ). allgemeine modelltheorie. wien, new york: springer-verlag.
winsberg, e. ( ). simulated experiments: methodology for a virtual world. philosophy of science : – .

'making such bargain': transcribe bentham and the quality and cost-effectiveness of crowdsourced transcription | semantic scholar
doi: . /llc/fqx corpus id:
@article{causer makingsb, title={'making such bargain': transcribe bentham and the quality and cost-effectiveness of crowdsourced transcription}, author={t. causer and kris grint and a. sichani and m. terras}, journal={digit. scholarsh. humanit.}, year={ }, volume={ }, pages={ - } }
t.
causer, kris grint, + author m. terras
published political science, computer science digit. scholarsh. humanit.
in recent years, important research on crowdsourcing in the cultural heritage sector has been published, dealing with topics such as the quantity of contributions made by volunteers, the motivations of those who participate in such projects, the design and establishment of crowdsourcing initiatives, and their public engagement value. this article addresses a gap in the literature, and seeks to answer two key questions in relation to crowdsourced transcription: (1) whether volunteers…

topics from this paper: transcribe bentham; crowdsourcing; transcription (software); bargain buddy; medical transcription

paper mentions
blog post: editors’ choice: ‘making such bargain’: transcribe bentham and the quality and cost-effectiveness of crowdsourced transcription. digital humanities now, september

citations
the influences of social value orientation and domain knowledge on crowdsourcing manuscript transcription. x. zhang, si chen, y. zhao, shijie song, qinghua zhu. computer science, psychology. aslib j. inf. manag.
a knowledge perspective on quality in complex citizen science. m. p. lopez, m. soekijad, h. berends, m. huysman. sociology.
merry work: libraries and citizen science. tiberius ignat, p. ayris, + authors anne kathrine overgaard. political science.
crowdsourcing language resources for dutch using pybossa: case studies on blends, neologisms and language variation. p. dekker, t. schoonheim.
citizen science frontiers: efficiency, engagement, and serendipitous discovery with human–machine systems. l. trouille, c. lintott, l. fortson. computer science, medicine. proceedings of the national academy of sciences.
the role of academic institutions in supporting citizen science: a case of minna de honkoku. yuta hashimoto, y. kano. political science, computer science. th international congress on advanced applied informatics (iiai-aai).
the digital humanist: contested status within contesting futures. c. papadopoulos, p. reilly. political science, computer science. digit. scholarsh. humanit.
the relo-kt process for cross-disciplinary knowledge transfer. e. clarke. computer science.
correction des données : retour d'expérience sur la plate-forme recital de transcription participative. b. hervy, pierre pétillon, hugo pigeon, g. raschia. art.
digital peter: dataset, competition and handwriting recognition methods. m. potanin, d. dimitrov, a. shonenkov, vladimir bataev, denis k. karachev, maxim novopoltsev. computer science. arxiv.

references
crowdsourcing bentham: beyond the traditional boundaries of academic history. t. causer, m. terras. sociology, computer science. int. j. humanit. arts comput.
building a volunteer community: results and findings from transcribe bentham. t. causer, v. wallace. computer science, sociology. digit. humanit. q.
crowdsourcing our cultural heritage. mia ridge. engineering.
crowdsourcing in the digital humanities. m. terras. engineering.
breaking monotony with meaning: motivation in crowdsourcing markets. d. chandler, adam kapelner. economics, computer science. arxiv.
crowds and communities: light and heavyweight models of peer production. c. haythornthwaite. sociology.
more than fun and money. worker motivation in crowdsourcing - a study on mechanical turk. nicolas kaufmann, t. schulze, d. veit. psychology, computer science. amcis.
modeling crowdsourcing for cultural heritage. j. noordegraaf, a. bartholomew, a. eveleigh. engineering.
the labor economics of paid crowdsourcing. j. horton, l. chilton. computer science, economics. ec '
the future of crowd work. a. kittur, j. v. nickerson, + authors j. horton. computer science. cscw.
the digital middle ages: an introduction
by david j. birnbaum, sheila bonde, and mike kestemont

our aims in this supplement of speculum are frankly immodest. in organizing a series of sessions devoted to the digital for the medieval academy annual meeting in , we hoped, by bringing together a diversity of projects, to showcase for the academy membership the wide range of exciting possibilities afforded by digital humanities (dh). the papers gathered here are drawn largely from those sessions, with several additions. we want to acknowledge the contributions of sarah spence and william stoneman, coorganizers of the sessions, for their inspiration and help. this supplement is the first issue of speculum devoted to digital medieval projects, and it is offered in an online, open-access format that reinforces the openness to which the digital aspires and which it encourages.

busa

the advent of digital medieval studies is often attributed to the work of roberto busa ( – ). the italian jesuit priest was a philosopher and theologian who specialized in the lexical analysis of the works of thomas aquinas. because of the massive size of aquinas’s oeuvre, busa quickly found himself in need of an indexing method to search the corpus, one that could surpass the labor-intensive system of handwritten fiche cards with which he began his work.
busa was quick to recognize the possibilities of the early computing systems that were developed in his lifetime, and in or around he reached out to thomas j. watson sr., the founder of ibm. watson and his staff at ibm were impressed with the aspirations of the italian jesuit (they called him “more american than the americans”), but ibm was persuaded to participate in a joint research initiative only after the priest pointed out a flyer in the new york office of ibm that said, “the difficult we do right away, the impossible takes a little longer.”

over the next thirty years, ibm and busa created the index thomisticus project, the world’s first sizable machine-readable corpus, containing, among an array of other related texts, an index verborum of all works of aquinas, totaling approximately million words. the project required an administrative and organizational staff that was unprecedented for a humanities research initiative at that time, as aquinas’s entire oeuvre had to be digitized onto punch cards (and later magnetic tapes), which were the primary data carriers in use in the middle of the twentieth century. over the years, busa would hire large numbers of young, local, female typists for this specialized task (see cover of this issue).

once, i was told by father busa that he was used to choosing young women for punching cards on purpose, because they were more careful than men. further, he chose women who did not know latin, because the quality of their work was higher than that of those who knew it (the latter felt more secure while typing the texts of thomas aquinas and, so, less careful). these women were working on the index thomisticus, punching the texts on cards provided by ibm. busa had created a kind of “school for punching cards” in gallarate. that work experience gave these women a professionally transferable and documented skill attested to by father busa himself.

in recent years, these aspects of the index thomisticus project have become the subject of research projects in the fields of oral history and gender studies, and the index helps us to realize that women played a foundational role in the early days of computer science and digital medieval studies. the open-minded priest never saw a conflict between the aims of his work and his religious calling, seeing the computer “as the son of man, and therefore grandson of god.” busa praised both the speed and the enhanced accuracy of computer analyses. today, one of the most esteemed awards in the field of digital humanities is named after the italian jesuit: the triennial roberto busa prize issued by the alliance of digital humanities organizations (adho). busa himself was the first recipient of the award in and he remained an active contributor in the community until his death.

notes:
for an excellent in memoriam on the occasion of the centenary of his birth see marco passarotti, “one hundred years ago: in memory of father roberto busa sj,” proceedings of the third workshop on annotation of corpora for research in the humanities (acrh- ), ed. francesco mambrini, marco passarotti, and caroline sporleder (sofia, ), – .
passarotti, “one hundred years ago,” .
the corpus can now be consulted online, http://www.corpusthomisticum.org/it/.

speculum /s (october ). © by the medieval academy of america. all rights reserved. this work is licensed under a creative commons attribution-noncommercial . international license (cc by-nc . ), which permits non-commercial reuse of the work with attribution. for commercial use, contact journalpermissions@press.uchicago.edu. doi: . / , - / / s - $ . .
digital humanities and medievalists

in the decades following the onset of the index thomisticus project, medievalists were often early adopters of the digital, and continue to play an important role in the development of a broader field, which came to be called digital humanities. this field took other forms and names during its emergence and subsequent development: humanities computing, humanist informatics, literary and linguistic computing, digital resources in the humanities, ehumanities, and others. these competing alternatives, among which “humanities computing” had long been dominant, have only recently made place for the newly canonical term “digital humanities,” which today is rarely contested. “digital humanities” is generally meant to refer to a broader field than “humanities computing.” whereas the latter is restricted to the application of computers in humanities scholarship and had narrower technical goals, the former also incorporates a “humanities of the digital,” including the study (potentially via traditional means) of digitally created sources, such as art and literature. dh is therefore profoundly multidisciplinary and attracts contributions from scholars and scientists both within and outside the humanities and the humanistic social sciences. digital humanists have taken care to define themselves in an inclusive rather than exclusive manner. as a result, the term “digital humanities” connotes a greater sense of integration than the diversity of approaches that are sheltered within the “big tent” of dh and that are also reflected in the contents of this supplement.

thus, while the definition of dh has been the subject of dedicated anthologies, countless panel discussions, and even entire websites (http://whatisdigitalhumanities.com), a better question may be whether there still exist nondigital humanists today, since most scholars at least to some extent rely on computational aids, however basic, such as online search engines or word processors. even the “original” objects of our research are most often mediated by the printed or online text or the slide or digital image. the difference between the digital humanities and their less digital counterpart has become more a matter of degree than of kind.

it is clear that the digital humanities (and within them, digital medieval studies) are a practice-oriented community. it may be that it is a pragmatic methodological awareness that ties this community together, although theoretical self-reflection and meta-analysis have nonetheless become more prominent recently. a number of theorists, including willard mccarty, recipient of the busa award, and john unsworth, have pointed to the necessary disjunction between the “object studied and the representation of that object in digital analysis.” mccarty has argued that the concept of “modeling” is a central characteristic of the digital humanities. by a model, he means “a representation of something for purposes of study, or a design for realizing something new.” following clifford geertz, he distinguishes between models of things (for example, a grammar; a geographical map) and models for things (for example, an architectural plan). depending on disciplinary traditions, scientific models are known under various names (representation, diagram, map, simulation, and so on). what such models typically have in common is that they offer a condensed, often simplified representation of things. therefore, models are more easily manipulated than the things they represent, which allows for experimentation.

notes:
quoted in melissa terras’s “for ada lovelace day—father busa’s female punch card operatives” (blog), october , http://melissaterras.blogspot.be/ / /for-ada-lovelace-day-father-busas.html.
see, for example, julianne nyhan and andrew flynn, computation and the humanities: towards an oral history of digital humanities, springer series on cultural computing (cham, ), https://link.springer.com/book/ . % f - - - - .
passarotti, “one hundred years ago,” .
john unsworth, “medievalists as early adopters of information technology,” digital medievalist ( ), https://journal.digitalmedievalist.org/articles/ . /dm. /.
according to kirschenbaum, the rise of the term “digital humanities” can be traced to “a set of surprisingly specific circumstances”: (1) the publication of the blackwell companion to digital humanities, (2) the inauguration of the alliance of digital humanities organizations (adho), (3) the inauguration of the neh’s digital humanities program. see matthew kirschenbaum, “what is digital humanities and what’s it doing in english departments?,” in defining digital humanities: a reader, ed. melissa terras, julianne nyhan, and edward vanhoutte (farnham, ), – , at – in particular.
alan liu, “the meaning of digital humanities,” pmla ( ): – .
this multidisciplinary nature is treated as central in most general-purpose introductions to digital humanities, including a companion to digital humanities, ed. susan schreibman, ray siemens, and john unsworth (oxford, ); and digital_humanities, ed. anne burdick, johanna drucker, peter lunenfeld, todd presner, and jeffrey schnapp (cambridge, ma, ).
on the “big tent” discussion, see matthew jockers and glen worthey, “introduction: welcome to the big tent,” in digital humanities , ed. alliance of digital humanities organizations (stanford, ), vi–vii.
geoffrey rockwell humorously noted, “having wandered in the wilderness that was humanities computing since the late s i find it ironic to be part of something that is suddenly ‘popular’ or perceived to be exclusive when for so many years we shared a rhetoric of exclusion.” see the anthologized reprint, geoffrey rockwell, “inclusion in the digital humanities,” in terras, nyhan, and vanhoutte, defining digital humanities, – , at .
consult terras, nyhan, and vanhoutte, defining digital humanities.
see, for example, matthew battles and michael maizels, “collections and/of art and the art museum in the dh mode,” in debates in the digital humanities , ed. matthew k. gold and lauren f. klein (minneapolis, ), – .
the issue of “theory” in dh has been the subject of the first volume of the “conversations” section in the online journal of digital humanities in winter (http://journalofdigitalhumanities.org/ - ), in which a series of more spontaneous writings (e.g., from the blogosphere) about the topic have been collected.
in mccarty’s view, “modeling,” the heuristic process in which models are constructed and manipulated, is central to the digital humanities. of course, models and modeling practices have long existed in humanities scholarship: the critical apparatus in printed editions of medieval works is but one classic example of a well-known edition model, which attempts to represent in condensed fashion the complex phenomenon of a medieval text tradition. what sets the digital humanities apart is an increased awareness of, and explicit interest in, modeling strategies, as a consequence of the field’s intense interaction with computers. but computers can process only fully explicit and consistent models, which means that if computers are to analyze humanities data, our assumptions must be fully explicit and consistent. the need for explicitness and consistency can be alienating for scholars in humanities fields where the exceptional is often embraced. scholars from poststructuralist paradigms might also mistake the need for explicitness for scientific positivism.

digital medieval studies

models and modeling provide a framework for presenting ongoing work in the field of medieval studies and for explaining the ways in which much of this work might deviate from what went before. first, much new groundwork is being done in digital medieval studies. high-resolution electronic manuscript facsimiles are produced in large quantities by heritage institutions in the glam (galleries, libraries, archives, and museums) sector around the globe. the british library digitised manuscripts link and the cathedral library of cologne (the codices electronici ecclesiae coloniensis) are two good examples. in the future, initiatives like the iiif (international image interoperability framework) can be expected to enhance our capacity to inspect and compare primary sources on a scale, and with an immediacy, that would have been unimaginable for earlier generations of scholars.
(we should remember that some libraries do not even allow visitors to inspect multiple physical items at the same time!) institutions such as the schoenberg institute for manuscript studies lead the way in this respect; the institute's visionary director, will noel, was honored as a white house champion of change in for his commitment to open science. excellent examples of enriched digital libraries include the online froissart (where high-resolution facsimiles often are accompanied by transcriptions and historical information) and monasterium.net, a virtual archive that offers centralized access to over five hundred thousand primary diplomatic sources, such as charters, from more than one hundred european archives. equally representative is the database and interface supporting the digipal project, which won the medieval academy’s first digital humanities prize in the spring of (see the contribution to this supplement about this platform for the paleographic study of english manuscripts by the project’s principal investigator, peter a. stokes).
see julia flanders and fotis jannidis, “data modelling,” in schreibman, siemens, and unsworth, companion to digital humanities, – .
these views have been summarized in willard mccarty, “modeling: a study in words and meaning,” in schreibman, siemens, and unsworth, companion to digital humanities, – .
http://www.bl.uk/manuscripts/ and http://www.ceec.uni-koeln.de/.
modeling choices present themselves at even the most basic research steps.
with basic facsimile creation, for example, critical modeling choices must take into consideration bandwidth and memory limitations, which impose practical limits on the resolution at which manuscripts can be photographed, stored, and distributed. at what resolution does a photograph begin to offer a reliable representation of the immediate source? can we reasonably expect that some users will ever need reproductions at , dpi or more? johanna drucker has therefore correctly stressed that what we might regard as raw data (“given”) in the humanities are already the product of some form of modeling, however modest; and she proposes that we use the term capta (“taken”) to reflect the constructed nature of such data. metadata yield similar concerns, since many collections are currently digitized faster than glam institutions can manually annotate them with metadata. again, difficult choices have to be made: how can we responsibly (re)publish digital files for which no, incomplete, or only outdated metadata is available? should we allow users from across the globe to crowdsource annotations for these newly digitized objects, or should this remain the domain of trained experts? authority is a complex issue in this respect and presently under intense renegotiation. much effort goes into federating access to heterogeneous data streams through the creation of information repositories that collect linked information. descriptive metadata standards, such as dublin core or the getty vocabularies, play an important role in this. the contribution by toby burrows in this supplement sheds new light on how structured metadata can be leveraged in the field of manuscript studies.
digital scholarly editing is one of the major stakeholders in the digital humanities community, and much of the activity in this area revolves around the text encoding initiative (tei, http://www.tei.org).
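as a rough illustration of what such encoding looks like in practice, the following python sketch parses a tiny tei-style xml fragment with the standard library. the transcription, the `#froissart` identifier, and the helper names `reading_text` and `person_names` are invented for this example. the element names (`choice`, `abbr`, `expan`, `persName`) and the namespace come from the tei guidelines, but the fragment omits the `teiHeader` that a valid tei document requires, so it should be read as a sketch rather than a model edition.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented transcription, for illustration only (no teiHeader, so not valid TEI).
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <text><body>
    <p>in the yere of <choice><abbr>o^r</abbr><expan>our</expan></choice> lord
    <persName ref="#froissart">Jehan Froissart</persName> wrote this boke.</p>
  </body></text>
</TEI>"""


def reading_text(elem, prefer="expan"):
    """Flatten an element to plain text, keeping one side of each abbr/expan pair."""
    skipped = f"{{{TEI_NS}}}" + ("abbr" if prefer == "expan" else "expan")
    parts = []

    def walk(node):
        # Skip the unwanted alternative entirely, but never lose the text
        # that follows an element (its "tail").
        if node.tag != skipped:
            parts.append(node.text or "")
            for child in node:
                walk(child)
                parts.append(child.tail or "")

    walk(elem)
    return " ".join("".join(parts).split())  # normalize whitespace


def person_names(elem):
    """Return (name, ref) pairs for every persName element."""
    return [(p.text, p.get("ref")) for p in elem.iter(f"{{{TEI_NS}}}persName")]


root = ET.fromstring(SAMPLE)
print(reading_text(root))          # reading with the abbreviation expanded
print(reading_text(root, "abbr"))  # reading with the abbreviation as written
print(person_names(root))
```

the same encoded source yields both a normalized and a more diplomatic reading, which is the practical payoff of keeping descriptive and interpretative annotation side by side in the markup.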
the tei defines an influential set of guidelines for enriching texts with both interpretative and descriptive annotations using a markup language called xml. the contribution by franz fischer offers a broad survey of the sort of digital editions and archives that currently live on the web.
http://dla.library.upenn.edu/dla/schoenberg/index.html.
peter ainsworth and godfried croenen, ed., the online froissart, version . (sheffield, ), http://www.hrionline.ac.uk/onlinefroissart.
http://monasterium.net/mom/home.
johanna drucker, “humanities approaches to graphical display,” digital humanities quarterly ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html.
seth van hooland and ruben verborgh, linked data for libraries, archives and museums: how to clean, link, and publish your metadata (london, ).
elena pierazzo, digital scholarly editing: theories, models and methods (farnham, ).
optical character recognition (ocr) has allowed us to turn (scans of) existing editions into machine-readable and searchable texts, which often then serve as the basis for new digital editions. the electronic beowulf project (http://ebeowulf.uky.edu) is an early seminal project that allowed greater access to an important medieval text. beowulf is preserved in a single eleventh-century manuscript, which was damaged by fire in . transcriptions made in the late eighteenth century show that many letters then visible along the charred edges were subsequently lost. in , each leaf was mounted into a paper frame.
scholarly discussion of the date, provenance, and creation of the poem continues around the world, and researchers regularly require access to the manuscript. digitization of the entire manuscript provides a solution to problems of access and conservation.
immense corpora are today available to medieval linguists; well-known examples from the anglo-saxon world include the linguistic atlas of early middle english (c. , words) and the york-toronto-helsinki parsed corpus of old english prose (c. . million words), and resources on this scale have allowed scholars to address long-standing questions in medieval studies using quantitative means. the contributions by maxim romanov, jeroen de gussem, and david joseph wrisley in this supplement illustrate the sort of macroanalyses that large corpora enable.
although much progress has been made on the level of ocr, the computational study and semiautomated transcription of handwritten materials remains a much more elusive application. mike kestemont, vincent christlein, and dominique stutzmann contribute an article to this supplement where the reader is introduced to the field of computer vision and its considerable potential for the study of medieval script. many other interesting applications of digital script analysis have appeared in recent years, such as automatic writer identification for medieval documents. surely, we can expect much more progress in the field of visual analyses for dh in the coming years.
also see greta franzini, melissa terras, and simon mahony, “a catalogue of digital editions,” in digital scholarly editing: theories and practices, ed. elena pierazzo and matthew james driscoll (cambridge, uk, ).
a relevant example is jacob thaisen, “initial position in the middle english verse line,” english studies ( ): – . in this paper, thaisen uses statistical language modeling to show that the beginning (and, to a lesser extent, the ending) of middle english verse lines are relatively more stable in the transmission of texts.
for example, for chaucer manuscripts, marius bulacu and lambert schomaker, “automatic handwriting identification on medieval documents,” in proceedings of the th international conference on image analysis and processing (modena, ), – .
susan hockey, “a history of humanities computing,” in schreibman, siemens, and unsworth, companion to digital humanities, – .
in her history of humanities computing, susan hockey notes that the earliest work in the field of dh was strongly biased towards text. for medievalists this is especially limiting, given the importance of manuscripts (including their illuminations or initials) in medieval culture. when compared to, for example, image or audio files, it is clear that plain text files come with much more relaxed computational demands in terms of storage, user interfaces, and processing power. this helps explain why much of the early work has, for example, been lexicographic in nature. word-level analyses, such as the index thomisticus, lent themselves well to computer-based indexing and quantification. hockey gives special mention to medievalists like roy wisbey, who produced an index to early middle high german literature as early as the s. but hockey also emphasizes the serious limitations of both hardware and software (in regard to memory) with which early adopters struggled.
(it is easy to forget that modern smartphones come with larger amounts of computer memory than the onboard computer of the apollo mission in .) even what are now regarded as relatively trivial issues, such as displaying basic medieval glyphs on a computer screen for the common germanic characters thorn (þ) and eth (ð), remained a challenge until deep into the twentieth century. today the unicode standard (http://unicode.org) seeks to provide support in operating systems and applications for all human writing systems, including those of the middle ages, and the medieval unicode font initiative (mufi, http://folk.uib.no/hnooh/mufi/) promotes a microstandardization of the unicode private use area (pua) specifically for western medieval writing.
stemmatology is another typically medievalist domain in which we find early adopters of computational methods. collation software was able to align variant manuscript readings, which could serve as the input to the machine-assisted identification of a stemma codicum. peter robinson’s collate software was used to manage the variants in the canterbury tales project and elsewhere and has now been succeeded by the open-source collatex (http://collatex.net) for textual collation. an especially sophisticated exploration of machine-assisted stemmatology was caroline macé’s tree of texts project at ku leuven (katholieke universiteit, leuven), which was the starting point for tara andrews’s stemmaweb project: tools and techniques for empirical stemmatology (https://stemmaweb.net/).
within the domain of text analysis, computational stylistics (or “stylometry”) also played an early role in the development of digital humanities, and the article by jeroen de gussem in this supplement describes an application of this technology to twelfth-century latin literature.
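the alignment step that collation software performs, mentioned above in connection with stemmatology, can be sketched in a few lines of python. this is only a toy built on the standard library's `difflib`; it is not the algorithm used by collate or collatex, it compares just two witnesses, and both readings and the function names `collate` and `variants` are invented for illustration.

```python
from difflib import SequenceMatcher


def collate(base, witness):
    """Align two readings word by word; return (op, base_words, witness_words) rows."""
    a, b = base.split(), witness.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [(op, a[i1:i2], b[j1:j2]) for op, i1, i2, j1, j2 in matcher.get_opcodes()]


def variants(base, witness):
    """Keep only the places where the two witnesses disagree."""
    return [(x, y) for op, x, y in collate(base, witness) if op != "equal"]


# Invented readings, for illustration only.
BASE = "whan that aprill with his shoures soote"
WITNESS = "whan that auerill with his shoures sote"

for op, x, y in collate(BASE, WITNESS):
    print(f"{op:8} {' '.join(x):20} | {' '.join(y)}")
```

a table of such variant locations across many witnesses is the kind of input from which a stemma can then be hypothesized; real collation tools additionally handle more than two witnesses, transpositions, and spelling normalization.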
stylistic phenomena belong to the realm of tangible poetics and have the advantage of being more amenable to quantification than hermeneutical features, that is, those relating to interpretation. this has led to advances in authorship attribution of medieval texts, such as the many medieval romances that (allegedly) resulted from collaborative forms of authorship. results include the early study by john r. allen on the authenticity of the baligant episode in the chanson de roland and the more recent investigation of the middle dutch walewein by karina van dalen-oskam and joris van zundert, which sheds new light on the complex interferences between authorial and scribal aspects of medieval texts. likewise, the well-known dual authorship of the roman de la rose is now often used as a generic test case in the development of text analysis software. in more focused contributions, the results of stylistic analyses have been linked to issues involving gender criticism.
tara l. andrews and caroline macé, “beyond the tree of texts: building an empirical model of scribal variation through graph analysis of texts and stemmata,” digital scholarship in the humanities ( ): – .
john r. allen, “on the authenticity of the baligant episode in the chanson de roland,” in computers in the humanities, ed. john l. mitchell (edinburgh, ): – .
karina van dalen-oskam and joris van zundert, “delta for middle dutch: author and copyist distinction in walewein,” literary and linguistic computing ( ): – .
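to make concrete why stylistic features lend themselves to quantification, here is a minimal python sketch of burrows's delta, a distance measure widely used in stylometric authorship attribution. it is a simplified reading of the method and not the procedure of any study cited here: tokenization is naive whitespace splitting, the population standard deviation is used for the z-scores, every text is assumed to be non-empty, and the three-text corpus and the function name `burrows_delta` are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def burrows_delta(corpus, n_words=50):
    """Return {(name_a, name_b): delta} for every pair of texts in `corpus`."""
    # Deliberately naive tokenization: lowercase, split on whitespace.
    tokens = {name: text.lower().split() for name, text in corpus.items()}

    # 1. The n most frequent words of the whole corpus serve as features.
    totals = Counter()
    for toks in tokens.values():
        totals.update(toks)
    vocab = [word for word, _ in totals.most_common(n_words)]

    # 2. Relative frequency of each feature in each text.
    freqs = {}
    for name, toks in tokens.items():
        counts = Counter(toks)
        freqs[name] = [counts[word] / len(toks) for word in vocab]

    # 3. Z-score every feature across the corpus.
    zscores = {name: [] for name in corpus}
    for k in range(len(vocab)):
        column = [freqs[name][k] for name in corpus]
        mu, sigma = mean(column), pstdev(column)
        for name in corpus:
            z = (freqs[name][k] - mu) / sigma if sigma else 0.0
            zscores[name].append(z)

    # 4. Delta is the mean absolute difference between two z-score profiles.
    return {
        (a, b): mean([abs(x - y) for x, y in zip(zscores[a], zscores[b])])
        for a, b in combinations(corpus, 2)
    }


# Invented toy corpus: "a2" repeats "a1", so their word profiles coincide.
DEMO = {
    "a1": "the knight and the lady rode to the castle",
    "a2": "the knight and the lady rode to the castle " * 2,
    "b": "of arms and of love i sing of war",
}

for pair, delta in burrows_delta(DEMO, n_words=10).items():
    print(pair, round(delta, 3))
```

smaller values indicate more similar word-frequency profiles. on real texts the measure only becomes meaningful with much longer samples and a larger feature set than this toy corpus allows.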
metrical analyses, such as the work by friedrich dimpel on middle high german, are another domain where computational methods can be expected to break ground.
visualizations, sound, and 3d modeling
although historically the digital humanities have been dominated by text-oriented paradigms, the community is increasingly engaging with multimodal research objects and methods.
the visual turn
dh has adopted visualizations in many areas of research. graphs, charts, diagrams, and other visual interpretations were common in pre-dh scholarship, but with dh has come the interest and ability to engage with large data sets and to represent them visually; see, for example, the varied visualizations in maxim romanov’s contribution to this supplement. network visualizations are also frequently used, not only for textual exploration (de gussem’s paper), but also for geographic analyses, for instance in the papers by romanov and toby burrows. another recent article points to the potential for visual analysis to produce results in the arena of image-feature analysis, taxonomy building, and clustering methods for medieval manuscripts; see also kestemont, christlein, and stutzmann’s article on computational approaches to identifying scripts in this supplement. a number of recent projects have invested effort in virtual recreations of medieval libraries at chartres, lorsch, and elsewhere. manuscriptlink, a new digital humanities initiative, aims to reconstruct “virtual” medieval libraries by collaborating with collections around the world to reaggregate previously lost medieval volumes. burrows’s contribution to the present supplement tackles related issues.
maciej eder, jan rybicki, and mike kestemont, “stylometry with r: a package for computational text analysis,” r journal ( ): – .
examples include jan ziolkowski, “lost and not yet found: heloise, abelard and the epistolae duorum amantium,” journal of medieval latin ( ): – ; mike kestemont, sara moens, and jeroen deploige, “collaborative authorship in the twelfth century: a stylometric study of hildegard of bingen and guibert of gembloux,” digital scholarship in the humanities ( ): – .
friedrich m. dimpel, computergestützte textstatistische untersuchungen an mittelhochdeutschen texten (tübingen, ).
susan hockey, “history of humanities computing.”
s. jänicke, g. franzini, c. faisal, and g. scheuermann, “visual text analysis in digital humanities,” computer graphics forum ( ), doi: . /cgf. .
david hadbawnick, “the framing narrative and the host: two kinds of anxiety in the canterbury tales,” in open access companion to the canterbury tales, http://www.opencanterburytales.com/open-review-home/the-framing-narrative-and-the-host/.
dominique stutzmann, “clustering of medieval scripts through computer image analysis: towards an evaluation protocol,” digital medievalist ( ), https://journal.digitalmedievalist.org/articles/ . /dm. /.
for chartres, http://www.biblissima-condorcet.fr/fr/a-new-life-medieval-libraries-chartres; for lorsch, https://www.uni-heidelberg.de/presse/news /pm _lorsch_en.html.
the spatial turn
the strategic use of digital mapping is an offshoot of visualization, one that is often directed toward graphic analysis of location, ownership, and distribution within geographic boundaries. resources providing greater access to larger spatial data sets have enhanced research in this area. for example, harvard’s digital atlas of roman and medieval civilizations (darmc) app provides gis maps and geodatabases that are openly available and searchable online. the digitized medieval manuscripts app (dmmapp) provides original map resources online, while the digital mappaemundi allows for searching between medieval maps and textual sources.
geographic information system (gis) technologies provide ways to map and compare spatial data. for example, gis has been used to investigate the history of medieval rural and urban landscapes. city witness (http://www.medievalswansea.ac.uk/), a multidisciplinary research project, has created an online interactive map of swansea, c. , showing its principal topographical and landscape features, alongside an electronic edition of fourteenth-century texts. together the map and texts provide multiple vantage points on the town and the significations attached to locations within the town by various social and ethnic groups (including anglo-norman and welsh, lay and religious, male and female).
the focus of the mapping medieval chester project (http://www.medievalchester.ac.uk/index.html) is the identities that chester’s inhabitants formed between c. and . like city witness, the project integrates geographical and literary mappings of the medieval city using cartographic and textual sources in order to understand how urban landscapes were interpreted and navigated by local inhabitants.
gis has also been used to “map” individual objects like the manuscript page. mapping texts through gis is at the heart of david wrisley’s contribution to the supplement; and the lancelot-graal project (http://www.lancelot-project.pitt.edu/lancelot-project.html), featured in the article by alison stones in this supplement, is one of the leaders in this adaptation of gis. in the gough map project (http://www.goughmap.org/), gis was used to analyze the relational representation of space in medieval and contemporary maps, allowing us to understand that the fourteenth-century map was designed to be functional and demonstrated a high degree of spatial accuracy. mapping of places within charters or even hagiographic texts can allow for a deeper understanding of the construction of a sociopolitical landscape. the mapping of the locations and types of miracles within the life of sainte foy of conques provides evidence for the spatial extent of the monastery’s influence and for differences within it.
a description of the manuscriptlink project is available on vimeo (http://vimeo.com/ ) and youtube (http://youtu.be/b r f paeyq).
http://darmc.harvard.edu/.
http://digitizedmedievalmanuscripts.org/.
https://ihr.asu.edu/research/seed/digital-mappaemundi-resource-study-medieval-maps-and-geographic-texts.
ian gregory, christopher donaldson, patricia murrieta-flores, and paul rayson, “geoparsing, gis, and textual analysis: current developments in spatial humanities research,” international journal of humanities and arts computing ( ): – .
christopher d. lloyd and keith lilley, “cartographic veracity in medieval mapping: analyzing geographical variation in the gough map of great britain,” annals of the association of american geographers ( ): – .
three-dimensional reconstructions
seminal three-dimensional reconstructions of past buildings and spaces have included the reconstructions of the church at cluny by the laboratory at darmstadt university; the amiens cathedral website directed by stephen murray at columbia university; and the monarch website, with its three-dimensional reconstructions, time slider, and linked textual sources for saint-jean-des-vignes, soissons, produced by sheila bonde and clark maines.
see also the contribution by sheila bonde, alexis coir, and clark maines on the abbey of ourscamp in the current supplement. an ambitious recent project harnesses the results of archaeological survey and historical sources to create a complete three-dimensional reconstruction of the architecture of the entire medieval town of montieri, italy. this 3d reconstruction has aided researchers in their analysis of the architecture and layout of the town and will also make contributions to heritage and tourism.
the sonic turn
digital advances that allow us to recreate medieval manuscripts or to see three-dimensional recreations of medieval structures have made important contributions to the understanding of the medieval past. having a full understanding of how people experienced these objects and buildings carries this understanding still further. sound studies have been strongly linked to heritage and conservation, often focusing on the capture of songs, music, and sounds of our cultural environment. for medievalists, the recreation of past music and soundscapes links these efforts to the three-dimensional architectural reconstructions. one digital resource is provided by diamm (the digital image archive of medieval music) at oxford university, which presents information on thousands of manuscripts, as well as nearly fifteen thousand images and associated metadata. the online forum sounding out! provides space for publication, posts, discussion, and recordings. the stanford center for computer research in music and acoustics (ccrma) is a multidisciplinary facility where digital technology is used as an artistic medium and research tool. one recent recreation of the medieval soundscape for the cathedral of santiago de compostela, led by rafael suárez from the universidad de sevilla, found that the acoustic conditions for pilgrims in the nave were compromised, while the acoustic conditions in the choir were ideal for both plainchant and polyphony. see also the contribution to this supplement by bissera pentcheva and jonathan abel, which explores the acoustics of hagia sophia; and the article by spyridon antonopoulos, sharon gerstel, chris kyriakakis, konstantinos t. raptis, and james donahue describing the acoustic aspects of byzantine churches in thessaloniki.
faye taylor, “mapping miracles: early medieval hagiography and the potential of gis,” in history and gis: epistemologies, considerations and reflections, ed. alexander von lünen and charles travis (heidelberg, ), – .
manfred koob and horst cramer, cluny: architektur als vision (heidelberg, ).
http://www.learn.columbia.edu/mcahweb/amiens.html.
http://monarch.brown.edu/monarch/index.html.
daniele ferdani and giovanna bianchi, “3d reconstruction in archaeological analysis of medieval settlements,” in archaeology in the digital era, vol. , e-papers from the th annual conference of computer applications and quantitative methods in archaeology (caa), southampton, uk, – march, , ed. philip verhagen (amsterdam, ), – .
see tanya clement, “when texts of study are audio files: digital tools for sound studies in digital humanities,” in schreibman, siemens, and unsworth, companion to digital humanities, – .
immersive environments and heritage
the ability to make a virtual visit to medieval sites is one offshoot of digital work with a heritage application, and google and unesco have collaborated to offer virtual visits to several important locations. iive (interactive immersive virtual environments) provide an interactive engagement for the “viewer” as part of a museum or heritage display.
second life and its open-source counterpart, opensimulator; myo; google glass; and oculus vr are all potential applications. these virtual worlds, where users are represented by avatars, allow interaction between users and the environment and are thus appropriate for simulating (past) environments in real time. they may (though they do not always) include senses beyond the visual, especially harnessing sound. one such site, focused on the cathedral of saint andrews in scotland, combines three-dimensional reconstruction of the cathedral buildings, the movement of processions, music, and other sounds, experienced through an avatar. while brick and mortar museums are costly to build and maintain, and travel to an archaeological site may not be practicable (especially after an excavation has ceased), a simulated experience of a site visit can be created through digital technology. two archaeological projects from the roman world, the rome reborn and portus projects, have provided these technologies for virtual visitors. one recent medieval application has been realized for a tenth- to twelfth-century muslim suburb of sinhaya, outside zaragoza. the visualization of sinhaya was based on the archaeological evidence of excavations as well as archival material. photorealistic lighting algorithms were developed by grupo de informática gráfica avanzada (giga), and the virtual animation can be viewed in a low-cost cave-like system.
https://soundstudiesblog.com/ / / / /.
https://ccrma.stanford.edu/about.
rafael suárez, alicia alonso, and juan j. sendra, “intangible cultural heritage: the sound of the romanesque cathedral of santiago de compostela,” journal of cultural heritage ( ): – , http://www.sciencedirect.com/science/article/pii/s .
http://whc.unesco.org/en/news/ /.
s. kennedy et al., “exploring canons and cathedrals with open virtual worlds: the recreation of st andrews cathedral, saint andrews day, ,” digital heritage ( ), https://risweb.st-andrews.ac.uk/portal/files/ /digitalheritage _submission_ .pdf.
kimberly dylla et al., “rome reborn . : a case study of virtual city reconstruction using procedural modeling techniques,” in caa : making history interactive; th proceedings of the caa conference march – , , williamsburg, virginia (oxford, ), – ; s. keay et al., “the role of integrated geophysical survey methods in the assessment of archaeological landscapes: the case of portus,” archaeological prospection ( ): – . on the use of second life in archaeology, see luis miguel siquiera and leonel morgado, “virtual archaeology in second life and open simulator,” journal of virtual worlds research ( ): – .
open medieval studies
in the digital humanities, traditional print publication forms have not ceased to exist, but they are often complemented and supported by electronic formats. many digital humanists are attracted by the low threshold and immediacy of electronic communication platforms, so that scholarly communications increasingly happen in less formal blog posts, comments sections, and online or micromessaging platforms, such as twitter.
many digital humanists are inspired by the open-science movement, which advocates the widest possible electronic distribution via open-access repositories and journals, not only of research results, but also of primary research data (for example, editions) and any home-brewed software that enabled the research. some scholars even wonder whether it would not be desirable to open up the entire research cycle, from funding proposal, to software development, to peer review, to the wider public (which is still often the primary source of humanistic scholarship).
such ambitious proposals are sometimes contrasted to more conventional scholarship in the humanities, where scholars are imagined as sitting alone and brooding over their work for a prolonged period, until the research is finally perfected and released in a format that will, they hope, last for ages. such conventional longer-term projects, typically undertaken by individuals instead of teams, are today under pressure from digital humanists, who argue that more traditional forms of scholarship and the associated publication culture lead to less sustainable research, although the reverse is also true from some perspectives.
creating and publishing a traditional print edition of medieval documents does not easily allow future generations to refine this work and add layers of annotation and analysis, especially if the original source data is not released with the print volumes. with electronic publication, supplying the primary data also means that future scholars will not need to go through a cumbersome and error-prone digitization process. the use of version-management platforms, such as github (http://github.com), is helpful in this respect, because such platforms allow scholars and their peers to keep track of, comment on, and distinguish among versions of the work in real time.
thus prepublication feedback can be taken into account by scholars, and minor postpublication corrections need not wait until the next print edition to be integrated. the ease with which digital scholarly work in medieval studies can be modified over time makes it more fluid than print formats. umberto eco was but one among many contemporary thinkers to link the instability of modern electronic resources to the medieval transmission culture of texts. the internet is a young and still-fragile medium that struggles with the well-known phenomenon of “dead links” that naturally compromises the citability of sources. the citability of both scholarship and data sources is an obvious “teething issue” of dh and therefore an important concern in many projects. and so is the durability: books are not copied and distributed as easily as digital files, but we have books that are still usable after centuries, while computer files can become unreadable in a decade as operating systems, storage formats, encodings, and application software go out of fashion. in many cases, the fragility of a digital edition lies not only in the data, but also in the application we use to interact with the data. for example, a digital concordance may have practical advantages over a paper one, but only as long as it is not locked into a hardware or software environment that has gone out of fashion.

diego gutierez et al., “archaeological and cultural heritage: bringing life to an unearthed muslim suburb in an immersive environment,” journal of cultural heritage ( ): – .
intellectual property rights, economic costs, and privacy issues also stand in the way of the naïve realization of the ambitious goal of a completely open medieval studies. the entire patrologia latina, for instance, can today be found in digitized versions online. while the quality of such freely available texts is generally lower—they abound in ocr errors—than what can be found in established subscription databases such as brepols’s library of latin texts (http://www.brepolis.net/), it is an interesting, but also worrying, development that the mere “availability” of a particular text version is rapidly becoming a selection criterion that rivals the age-old importance attached to the criterion of quality. many digital humanities venues already require scholars to submit their source data together with their papers, which is impossible in the case of copyrighted editions. whereas for many applications in data mining, the minute differences among editions of the same text will not matter very much, it is frustrating that the high-quality materials produced by previous generations of scholars are sometimes severely underused in dh because of accessibility and “shareability” issues. when it comes to the economic side, open-access journals, such as the digital medievalist journal (https://journal.digitalmedievalist.org/), are courageous initiatives because they cannot count on a steady income flow in the form of subscription fees to guarantee their future sustainability. many open-access journals will in fact charge the author a substantial “article processing charge” (apc), which raises concerns about the financial independence of scholars without institutional backup, especially retired scholars or those in alternative academic careers (#alt-ac). one major advantage of traditional print journals is therefore that they are largely free of charge to authors.
the open-access community is currently working to overcome this challenge: national academies will probably take up new responsibilities, and initiatives such as the open library of humanities (https://www.openlibhums.org/)—which aims to cover fully the apcs for the journals in their collections—can be expected to play a more prominent role in the near future.

jonathan blaney and judith siefring, “a culture of non-citation: assessing the digital impact of british history online and the early english books online text creation partnership,” digital humanities quarterly ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html.

such issues lie at the heart of the current work by walter scholger, university of graz.

this issue is, for instance, raised in david bamman and david smith, “extracting two thousand years of latin from a million book library,” journal on computing and cultural heritage ( ): – .

all in all, copyright remains a largely unsettled matter in the humanities today and, arguably, too few medievalists, digital and nondigital alike, are properly informed about the various licensing possibilities. open-access licenses, such as the creative commons (cc), allow authors to enforce an attribution to the original creator of a work (e.g., cc-by) or add restrictions with respect to the commercial usage of their work (e.g., cc-nc) or subsequent reuse (cc-sa, cc-nd). the fact that this digital supplement to speculum is published in open access, under a liberal license that encourages wide dissemination, reflects these concerns in the dh community.
we are thankful to the university of chicago press and to the medieval academy of america for their openness to this project as well as for their support.

a panoramic reading of speculum

from a methodological perspective, it is vital that new approaches to medieval culture not lose touch with traditional and more conventional scholarly methods. nevertheless, thought-provoking tensions have emerged between the digital and traditional humanities. in , for example, google collaborated with a large number of scientists to publish an influential science paper on the well-known google books project. in this project, the california technology giant claims to have digitized roughly percent of all books ever printed—and the expansion of the data set is still ongoing. because this data set is easily searchable online, it offers a convenient resource, which today is probably queried by scholars more often than they care to admit. in this paper, jean-baptiste michel et al. discuss an emerging research field called “culturomics,” the study of high-throughput cultural data through lexical analysis, and they focus on the diachronic analysis of word frequencies in english books ( – ). although their word-counting strategy was simple, their research demonstrated that word usage in large corpora correlates with cultural developments. the relative frequency of the word “slavery,” for instance, peaked in their data during the u.s. civil war and later during the civil rights movement. in addition to a large array of linguistic analyses, their word counts even demonstrated the active censorship of jewish artists, such as marc chagall, in nazi germany. in december , the paper’s two lead authors presented their thought-provoking work at the annual meeting of the american historical association.
the association’s president, anthony grafton, would later offer a fascinating account of this event: “for all their panache—and all the fun their tool permits—lieberman-aiden and michel also inspired a little worry, as well as some hard thinking about the status of our discipline.” grafton regretted that the paper’s list of authors, though sizable, did not include a single historian, and that this lack of historical expertise occasionally showed in their presentation. he stated, in disappointment, “[a]pparently, historians have not established, in the eyes of many of their colleagues in the natural sciences, that they possess expert knowledge that might be valuable, or even crucial.”

jean-baptiste michel et al., “quantitative analysis of culture using millions of digitized books,” science ( ): – .

https://books.google.com/ and https://books.google.com/ngrams.

https://www.historians.org/publications-and-directories/perspectives-on-history/march- /loneliness-and-freedom.

lieberman-aiden and michel would later counter this view, stressing that they did receive input from academic historians, although their names were not included in the final author list. they stated that, “while ‘expert knowledge’ is important, shared paradigms, a shared language, and common intellectual values are a big part of what makes a successful team come together.
this suggests that history departments have to grapple with several emerging responsibilities: to encourage familiarity with quantitative methods, with computational techniques, and—as you so eloquently wrote—with large-scale collaboration.” the research initiative behind the culturomics paper is a typical instantiation of what is today commonly called “distant reading” in the digital humanities, a loosely defined notion seminally introduced by franco moretti in a momentous series of essays. at present, distant reading (sometimes also known as macroanalysis, algorithmic criticism, panoramic reading, and other terms) plays a role in a variety of approaches to text analysis in the humanities where large bodies of texts are queried and analyzed using a combination of techniques from language technology, information retrieval, and data science. common to all these approaches is the strategy that an important part of the conventional reading process is in fact deliberately outsourced to a machine; human intervention is largely postponed to the interpretation of the simplified model that the algorithms yield. as moretti noted, the reader’s distance from the original text as such becomes a function of the increased scope of the reading effort. grafton’s mixture of fascination and worry is probably representative of the attitude that many scholars today entertain towards such forms of distant reading. the fact that a crucial part of the reading process is outsourced to a machine calls into question the quality of the textual model that state-of-the-art computational methods can deliver. as a way of interrogating these issues, it will be illustrative to discuss a small, yet representative and critical, distant-reading exercise. for this, the university of chicago press granted us access to a digital version of the speculum archive covering the entire seventy-year period between the journal’s inaugural issue in and december .
as in any sizable corpus nowadays, the quality of the digital text varies enormously: for the part up to volume , we have to work from the uncorrected output of optical character recognition, whereas we can work with clean, born-digital data from volume onwards. as can be gleaned from the bar plot in fig. , where the token counts have been aggregated on a yearly basis, the full data set amounts to over million tokens (that is, words, but also punctuation marks and other symbols), although the numbers show severe fluctuations over the individual years.

http://www.culturomics.org/resources/faq/thoughts-clarifications-on-grafton-s-loneliness-and-freedom.

ibid.

for example, franco moretti, “conjectures on world literature,” new left review ( ): – . these essays were later brought together and commented on in franco moretti, distant reading (london, ).

relevant references for these terms include matthew jockers, macroanalysis: digital methods and literary history (lincoln, ); stephen ramsay, reading machines: toward an algorithmic criticism (lincoln, ); thomas crombez, “het onbehagen in de digitale cultuur: de opkomst van digital humanities,” meta: het vlaamse tijdschrift voor bibliotheek en archief ( ): – .

all software used for this exercise is made available on https://github.com/mikekestemont/panorama.
nevertheless, the impressive size of the archive raises the intriguing question whether valuable patterns could be extracted from this data, which might yield a “panoramic view” of the journal’s contents and thematic biases as well as its development throughout the years. which medieval authors and texts rank highest, for instance, in speculum’s popularity hit list; and which scholarly approaches have grown into or out of fashion over the years? for this effort, we made use of a range of computational techniques, which are representative of the state of the art in textual modeling strategies in the digital humanities nowadays. hopefully, this array of methods will allow us to showcase, in a nontechnical language, the opportunities and, perhaps more importantly, the challenges that arise from such a “vanilla” application of distant reading.

fig. . word counts for the data in the speculum archive, aggregated at the year level ( – ).

for an introduction to basic methods in nlp, consult, for instance, dan jurafsky and james martin, speech and language processing, nd ed. (upper saddle river, ).

one common preprocessing step in textual analysis is to apply a so-called tagger to the material, an established procedure in natural language processing. in this exercise, we segmented the original raw stream of characters in a speculum article into meaningful token units—for example, the clitic form “don’t” will be split into “do” and “n’t”. we applied the stanford corenlp software suite to the archive, which offers a host of basic procedures and which is maintained by one of the leading research groups in the field of language technology. above, we show an example in table for the suite’s output for a randomly selected sentence from a speculum contribution.
as can be seen in this example, the software will attempt to determine for each token its lemma (or uninflected dictionary headword; that is, past took becomes take), its part of speech (title, as it is used in this example, is an nn, or singular noun) and an indication of whether the token is a named entity ( is a date, but raimond is categorized as a person). the software makes these decisions on the basis of a statistical assessment of a token’s appearance (for example, is the token capitalized?) and the lexical context in which it appears (for example, is the token preceded by an adjective?). because of the ambiguous nature of human language, such an automatic enrichment of the material will naturally yield many errors, especially in the case of the ocr-entered data with its unstable spellings, but it nevertheless already offers many possibilities for creatively querying the corpus. one interesting question might be which medieval dates have been most frequently mentioned in speculum over the years. for this, we traced the cumulative frequency of all numbers in the data set that had been marked as a date in the named entity column and that fell in the “medieval” range of – . in the scatter plot in fig. , we plot the twenty-five dates with the highest cumulative frequency in the corpus as a function of their coefficient of variation over the documents in the corpus, to keep track of their dispersion over the material. the higher up in the plot a date can be found, the more frequent the date is; the more leftwards it is positioned (and the larger its font size), the better it is distributed over the individual documents in the corpus.
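The two quantities behind the scatter plot, cumulative frequency and coefficient of variation, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `date_dispersion`, the toy documents, and the choice of the population standard deviation divided by the mean as the coefficient of variation (the article does not say which variant was used).

```python
from collections import Counter
from statistics import mean, pstdev


def date_dispersion(docs):
    """Return {date: (cumulative_frequency, coefficient_of_variation)}.

    `docs` is a list of documents, each given as the list of years that
    a tagger marked as dates in that document (hypothetical input shape).
    """
    per_doc = [Counter(doc) for doc in docs]
    all_dates = set().union(*per_doc)
    stats = {}
    for date in all_dates:
        # One count per document; 0 where the date is never mentioned.
        counts = [counter[date] for counter in per_doc]
        # Assumption: population standard deviation over the mean count.
        stats[date] = (sum(counts), pstdev(counts) / mean(counts))
    return stats


# Invented toy data: three "documents" and the years mentioned in each.
TOY_DOCS = [
    [1066, 1066, 1204],
    [1066, 1348],
    [1066, 1348, 1348, 1348],
]

STATS = date_dispersion(TOY_DOCS)
for date, (total, cv) in sorted(STATS.items()):
    print(date, total, round(cv, 3))
```

In this toy data, 1066 and 1348 have the same cumulative frequency, but 1066 is spread more evenly over the documents and therefore receives the lower coefficient of variation, which is the kind of contrast the plot is meant to expose.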
the top of the list is clearly dominated by round dates ( , , ). this reflects the fact that medievalists have generally preferred to think along conventional decennial, centennial, and millennial boundaries. nevertheless, it is tempting to link a number of dates higher up in this hit list to well-known events in the medieval period, including (the battle of hastings) or (the sacking of constantinople in the fourth crusade). for an iconic date such as , it is interesting that one might be tempted to link it to several events, which helps explain its prominence: one can think of the siege of compiègne and joan of arc’s capture, but also philip the good’s marriage and the installment of the order of the golden fleece. interestingly, the years and (outbreak of the black plague) have a lower dispersion than their elevated frequency might make us suspect.

fig. . scatter plot showing the most commonly mentioned years ( – ) in speculum.

table . an example of a sentence (randomly drawn from a speculum issue) as tagged by the stanford corenlp suite

index   token          lemma          part of speech   named entity
        this           this           dt               o
        raimond        raimond        nn               person
        took           take           vbd              o
        the            the            dt               o
        title          title          nn               o
        of             of             in               o
        baron          baron          nnp              person
        de             de             in               person
        saint-gilles   saint-gilles   nnp              person
        in             in             in               o
                                      cd               date
        .              .              .                o

christopher manning et al., “the stanford corenlp natural language processing toolkit,” in proceedings of the nd annual meeting of the association for computational linguistics: system demonstrations (baltimore, ), – .
such corpus-level aggregations of frequencies are an interesting toy to help us characterize medieval studies from a panoramic viewpoint, but the tagging of our material also allows us to query speculum in a more specific fashion. for the line charts in figs. a–b, for instance, we have calculated the relative frequency of all nouns (whether plural or singular) in the material throughout the period – . using a common statistical test (kendall’s tau) we have queried the results for the five nouns that have shown the steadiest decrease (a) or increase (b) in usage throughout speculum’s history. the results in fig. a are not particularly exciting, and show merely that traditional (latinate) citation styles (op. cit., ff., loc.) are growing out of fashion among speculum authors. fig. b, on the other hand, suggests a clean and surprisingly linear frequency increase of the words “role” and “context”: this phenomenon strongly suggests that medieval studies, as represented by speculum articles, have been marked in the twentieth century by a transition towards a more functionalist and contextualized approach to the middle ages, something that has already been often observed in literary studies. the shift in the use of the words “overview,” “focus,” and “potential” seems on the other hand to be of a metascholarly nature and might signal a trend towards greater scholarly professionalization and specialization in the broader field of medieval studies. our analyses so far have been purely lexical or carried out at the level of individual words. the problem with such a brute surface-counting approach is that it conceals the actual context in which words are used.
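To make the ranking of nouns by trend more concrete, here is a minimal sketch of Kendall's tau applied to yearly relative frequencies. It assumes the simple tau-a variant (concordant minus discordant pairs, divided by the number of pairs); the article does not state which variant was used. The helper `steadiest_trends`, the placeholder years, and the frequency values are all invented for illustration.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x2 - x1) * (y2 - y1)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs (sign == 0) count for neither side in tau-a.
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def steadiest_trends(freqs_by_word, years, k=5):
    """Return (k steadiest decreasing words, k steadiest increasing words)."""
    scores = {w: kendall_tau(years, f) for w, f in freqs_by_word.items()}
    ranked = sorted(scores, key=scores.get)  # most negative tau first
    return ranked[:k], ranked[::-1][:k]


# Invented toy data: relative frequencies over five placeholder years.
YEARS = [1, 2, 3, 4, 5]
TOY_FREQS = {
    "context": [0.1, 0.2, 0.3, 0.4, 0.5],
    "loc.": [0.5, 0.4, 0.3, 0.2, 0.1],
    "abbey": [0.3, 0.1, 0.4, 0.2, 0.3],
}

DECREASING, INCREASING = steadiest_trends(TOY_FREQS, YEARS, k=1)
print("steadiest decrease:", DECREASING)
print("steadiest increase:", INCREASING)
```

A tau close to 1 or -1 signals a monotone rise or fall regardless of how steep it is, which is why a rank-based measure suits the question of which words changed most steadily.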
if a word has been frequently used in speculum, that would indeed seem to attest to the cultural salience of the word in the world of medievalists, but this context-free approach cannot tell us whether the term has primarily positive or negative connotations, nor does it indicate the scholarly context in which it is typically used. to remedy this situation, the digital humanities increasingly make use of methods borrowed from distributional semantics, an exciting research domain in natural language processing (or computational linguistics). in this field of study, researchers build upon the general idea that words derive meaning primarily from the lexical context in which they appear. for example, in a sentence such as “i made the *blarf fetch the stick” or “i took the *blarf for its evening walk,” the context in which the nonexistent term *blarf appears strongly hints towards a domestic animal—perhaps a dog.

a classic reference (among many others) is zellig s. harris, “distributional structure,” word ( ): – . an interesting recent opinion piece about distributional methods in language technology is christopher d. manning, “computational linguistics and deep learning,” computational linguistics ( ): – .

fig. . (a) the relative frequency of nouns with the most linear drop in frequency. (b) the relative frequency of nouns with the most linear increase in frequency.

in distributional semantics, researchers attempt to model the distributional patterns in word co-occurrences found in large corpora, such as the speculum archive. the underlying assumption is that the vocabulary can be modeled into a set of semantic fields or topics; these topics consist of clusters of words that typically co-occur in documents or paragraphs and that are therefore more likely to belong to the same topic than words that never appear in the same context. each of the topics in such a “topic model” can be assumed to bear a certain weight, or topic score, on each document in a corpus: a newspaper article about a famous soccer player’s transfer to real madrid, for instance, might be characterized as being percent about “sports,” percent about “finance,” and percent about “spanish lifestyle.” we have subjected the speculum archive to a topic-modeling exercise using the well-known method nmf (non-negative matrix factorization). we have asked the method to extract the most salient topics from consecutive segments of words, which did not include any so-called stopwords (such as articles, punctuation marks, or prepositions). we have cherry-picked a representative selection of these topics and visualized them as a series of word clouds in fig. . this selection clearly demonstrates the international and thematic variety of speculum contributions over the history of the journal. in these clouds, the font size of the individual words reflects their relative importance to the topic. note that the topic model itself does not produce a neat “label” for a topic, but its most significant words typically give a solid indication as to the semantic scope of a particular theme.
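As an illustration of what a non-negative matrix factorization does with a document-by-word count matrix, the following sketch implements the classic multiplicative update rules in plain Python. It is not the implementation used for the Speculum analysis: the vocabulary, the counts, the number of topics, and all function names are assumptions chosen to keep the example self-contained.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factorize V (docs x words) into W (docs x k) and H (k x words)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


def top_words(h, vocab, n=3):
    """For each topic (row of H), list the n highest-weighted words."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


# Invented toy corpus: rows are text segments, columns are word counts.
VOCAB = ["chaucer", "tale", "pilgrim", "saga", "norse", "icelandic"]
V = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 1, 3, 2, 2],
]

W, H = nmf(V, k=2)
for topic in top_words(H, VOCAB):
    print(topic)
```

Each row of `H` plays the role of a topic (a weighting over the vocabulary, as in a word cloud), and each row of `W` gives the topic scores of one segment. In practice one would more likely rely on an established library implementation than on hand-written updates.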
these topics form relatively neat word clusters, even though this analysis does not depend on any external, handcrafted resources such as dictionaries: the model derives its semantic knowledge in a completely data-driven fashion solely from word usage statistics in a large corpus. this overview of topics does attest to the dominance of insular topics, including those capturing the thematic fields surrounding the canterbury tales, beowulf, monmouth’s arthuriana, piers plowman and its alliterative colleagues, or the dominance of cluniac monasticism and cathedral architecture. nevertheless, the topical diversity is rich enough to include twelfth-century latin literature from france, such as the cistercian cluster of literature around bernard, and also the world of scandinavian sagas and of dante’s commedia. a number of topics also clearly reflect higher-level thematic interests, such as courtly love, as well as themes within medieval architecture, islamic studies, and gender studies. many topics also seem to pick up on the major cultural clashes that characterized the medieval period, including the confrontation of christian with arabic culture in medieval spain or the tension between christianity and judaism—note the presence of high-polarity terms such as accusation, violence, and murder in the latter topic.

on topic modeling, consult for instance david m. blei, “probabilistic topic models,” communications of the acm ( ): – .

fig. . word clouds representing a cherry-picked selection of topics from our thematic model ( topics in total). only the most salient words are plotted for each topic; the font size of the individual words gives an indication of their relative importance to the topic at hand.

interestingly, this topic model also enables us to study the speculum archive in a more diachronic fashion. if we were to calculate the average presence of a specific topic in all the speculum issues that were published in a given year, plotting these scores on a timeline might provide insight into thematic evolution. in figs. a–d, we have plotted a number of trend lines for a selection of topics that seem to reveal interesting evolutionary patterns. the gender-related topic (women, female, male), for instance, seems to have gained prominence only in the eighties, and the same goes for the sociocultural, functionalist approach to literature (social, cultural, culture), which seems to be captured in topic . one of the more obvious “downward” trends is the declining use of latin throughout the journal’s history (topic )—our analyses also suggested similar trends for other languages, such as french and german—suggesting that speculum is becoming a more monolingual journal. other topics are characterized by more local peaks, such as topic , which reflects an elevated number of citations in the field of medieval aristotelianism (averrois, aristotelem, commentariorum) in the period – .

while topic modeling thus offers insights not available from simpler word-count approaches, it also raises new issues. does the word bernard, for instance, refer to the medieval author bernard of clairvaux or the present-day scholar bernard mcginn (or both)?
the problem that arises here is that even named entities can be ambiguous, and to achieve a more holistic approach to autonomous machine reading, such entities must be disambiguated. “wikification” is a term that is used colloquially to denote the process of cross-document named-entity disambiguation in natural language processing. many software tools, such as the stanford corenlp suite used above, are available today to tag automatically the named entities in free-running text, such as the names of individuals or places. while this process of named-entity recognition is already a crucial step towards knowledge extraction, the ambiguity of named entities presents a major obstacle on the road towards a machine’s autonomous text understanding. in a sentence like “clinton took the stage,” it is unclear whether the named entity refers to hillary clinton, bill clinton, or the eponymous funk musician, george clinton. in wikification studies, researchers attempt to extract clues from the semantic context in which a named entity occurs to help disambiguate these mentions. if the sentence reads “secretary clinton took the stage,” the apposition “secretary” would strongly suggest that the sentence refers to hillary, since she is the only disambiguation candidate to have held this specific office. additionally, wikification systems can exploit the fact that the named entities that are mentioned in a text typically form a semantically coherent set: in the sentence “clinton took the stage with bob marley,” the relatively unambiguous identification of the musician bob marley would suggest that the clinton in this sentence is the artist george clinton.

xiao cheng and dan roth, “relational inference for wikification,” in proceedings of the conference on empirical methods in natural language processing (seattle, ): – .

fig. . four plots showing the diachronic presence of selected topics in speculum issues on an annual basis.

scholars often turn to wikipedia as a resource for mining fixed, unique identifiers for named entities. through linking the named entities to the single, relevant entry for a named entity in the well-known encyclopedia, the algorithm effectively performs cross-document named-entity resolution. additionally, wikipedia is built on top of a rich ontological structure, so that various sorts of metadata can be harvested for each entity, in the form of descriptive labels indicating whether an individual was, for example, a philosopher or a king. wikipedia has an impressive scope, but at the same time the use of a wikifier introduces strong biases. uncommon named entities that have not yet received an identifiable wikipedia page will be ignored by necessity. likewise, the fact that we use a wikifier for the english language might bias our analysis towards entities that are relatively more salient, culturally speaking, in the anglo-saxon part of the world. when we apply the illinois wikifier to speculum’s plain-text archive, a superficial reading of the wikifier’s output anecdotally suggests that the wikifier struggles with the poor ocr quality of the earliest volumes, but nevertheless is able to output interesting annotations: long ago sir john rhys offered a solar interpretation of arthurian lore, but, according to loomis, he did not work out the celtic mythological system from the evidence of the irish and welsh legends themselves. note that the wikifier deals well in this example with abstracting over superficial synonyms: in texts that mention bernard, bernardine, or bernard of clairvaux, the entities will be mapped to the same unique identifier as their latin equivalents, such as bernardus clarevallensis. therefore, such a tool offers much more powerful search capabilities than raw text data. nevertheless, the “disambiguations” are certainly not flawless, and at times are tragically hilarious—all sorts of american celebrities, including famous wrestlers and pop artists, would appear to have made a much larger contribution to medieval scholarship than we might have anticipated. even so, when aggregated at a higher level, the wikifier’s output is accurate enough to draw up even more insightful hit lists than the ones we have shown so far. in fig. , we have exploited the metadata that the wikifier attaches to each entity to draw up a list of the most salient—or, at least, most frequently mentioned—authors (a), poems (b), and saints (c) in speculum.

lev ratinov, dan roth, doug downey, and mike anderson, “local and global algorithms for disambiguation to wikipedia,” in proceedings of the th annual meeting of the association for computational linguistics: human language technologies, vol. (portland, ), – .
http://en.wikipedia.org/wiki/john_rhys
http://en.wikipedia.org/wiki/matter_of_britain
http://en.wikipedia.org/wiki/roger_sherman_loomis
http://en.wikipedia.org/wiki/irish_language
http://en.wikipedia.org/wiki/welsh_language

fig. . subplots showing the most frequently mentioned authors (a), poems (b), and saints (c) in speculum on the basis of the wikifier’s named entity disambiguation.

while such hit lists are interesting in their own right, looking at mere frequency does not reveal the complex relationships that might exist among them and with other words with which these entities are typically associated. to study and visualize these, we turn to one final technique, from the sphere of distributional embeddings: word embeddings. just like topic modeling techniques, word embeddings build upon the distributional hypothesis that words with a similar meaning will have the tendency to appear in similar contexts. however, whereas topic modeling techniques are geared to finding good representations for topics and documents, word embedding can yield much more fine-grained representations for individual words.
word embeddings represent each item in a vocabulary as a numerical vector, that is, a list of numbers that aims to characterize the word’s meaning. the advantage of such a word-level model is that we can apply straightforward arithmetic to these vector representations and ask the model, for instance, to return the five words that it deems most similar to a certain query term. if we apply a popular word-embeddings model (word2vec) to our wikified corpus, we can inspect the immediate semantic neighborhood of the terms listed in table . using the vector representation that we can extract for our wikified authors, we can also use these embeddings to visualize the relationships between our authors in a dendrogram, or tree diagram. in fig. , the wikified links take the form of leaves in a tree, which are eventually joined into new nodes in a branch structure. the branches reflect the distances between the representations that we obtained for these authors. note how the structure that arises from this tree makes sense (monarchs cluster with monarchs, philosophers with philosophers, and so on) but also offers some surprising results: ovid and virgil, for instance, cluster with boccaccio, petrarch, and dante, instead of with other authors from antiquity, such as cicero or plato. note, also, how the tree realizes at the top level what seems to be a fairly neat split between vernacular authors and nonvernacular authorities.

such word embeddings have attracted a good deal of attention recently, mostly because it has been shown that these models are able to solve independently an interesting form of analogical reasoning problem. for example, when asked which word is to “woman” as “king” is to “man,” a model trained on english wikipedia text will output the word “queen.” the task is simply solved through the following equation: king – man + woman.
the idea is that the model takes its vector representation for the word king, “subtracts,” or removes, all abstract properties that it associates with the word man, and then adds all the properties that it associates with the word woman. the model subsequently returns the word that is closest to the result of the operation. other, culturally intriguing outputs of the original model were: japan – sushi + new_york → pizza and belgium – brussels + france → paris. as an interesting spielerei, note that such a model is able to solve thought-provoking questions such as “who is the chaucer of the french?” through simply modeling it in the form of the equation geoffrey_chaucer – english_language + french_language. a number of actual results from drawing such a deliberately provocative analogy from our speculum model are given below:

geoffrey_chaucer – english_language + french_language → jean_de_meun
geoffrey_chaucer – english_language + latin → ovid
geoffrey_chaucer – english_language + italy → giovanni_boccaccio

while the output from such a naïve model naturally should be taken with a grain of salt, such exercises are nevertheless useful because the models are built in a purely data-driven way, and researchers have noted that these models generally tend to reproduce the cultural biases that are present in the material on which they have been based.

tomas mikolov, wen-tau yih, and geoffrey zweig, “linguistic regularities in continuous space word representations,” in proceedings of the conference of the north american chapter of the association for computational linguistics: human language technologies ( ), – .
ibid.
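the nearest-neighbor queries and the analogy arithmetic described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the pipeline used for the speculum corpus: the three-dimensional vectors are invented toy values (a trained word2vec model learns vectors with far more dimensions from text), the helper names cosine, nearest, most_similar, and analogy are hypothetical, and leaving the query words out of the candidate set is a choice made for this sketch.

```python
import math

# Toy vectors invented for illustration only; the three dimensions can be
# read loosely as (royalty, male, female). Real embeddings are learned.
VECTORS = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "prince": [0.8, 0.9, 0.0],
}


def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target, k=1, exclude=()):
    """Return the k vocabulary words whose vectors are closest to target."""
    candidates = [w for w in VECTORS if w not in exclude]
    candidates.sort(key=lambda w: cosine(VECTORS[w], target), reverse=True)
    return candidates[:k]


def most_similar(word, k=5):
    """Nearest neighbors of a query word, excluding the word itself."""
    return nearest(VECTORS[word], k, exclude={word})


def analogy(a, b, c):
    """Return the word closest to vector(a) - vector(b) + vector(c)."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    return nearest(target, 1, exclude={a, b, c})[0]


print(most_similar("king", k=2))        # ['prince', 'man']
print(analogy("king", "man", "woman"))  # queen
```

with these toy values, king – man + woman lands exactly on the vector for queen, which is why the analogy resolves cleanly; with learned embeddings the result is only the closest available word, so the output depends heavily on the training corpus.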
conclusion: the “canon” of medieval studies

proponents of distant reading have often praised the ability of computer techniques to broaden our reading scope beyond the obligatory canon of chaucers, dantes, and chrétiens. moretti, for instance, famously suggested that computer techniques would finally allow us to tackle what margaret cohen has called the “great unread,” the oubliette of historic literature. so far, however, the results in this respect have been limited, and many digitization projects still center around the comfortable and well-known pantheon of canonized authors; the disproportional attention for a figure like chaucer in traditional medieval studies, for instance, has been remarkably closely paralleled in the digital universe so far. this is but one case where digital medieval studies can probably do a better job of living up to its promise and lure our attention away from an already overexposed medieval canon towards the lesser-known peripheries of medieval culture.

nevertheless, it is troubling that much new digital medieval work responds more closely to the questions and concerns of nineteenth-century medieval scholarship than those of the twentieth or twenty-first centuries. in the field of text analysis, for instance, practitioners have so far shown little interest in modern literary theory, and especially poststructuralist approaches. the postmodern dismissal of, and lack of interest in, the authorship of texts may also explain why digital scholars might keep their distance from a field that does not value issues central to much of digital medieval studies. influential digital humanists, such as geoffrey rockwell or stephen ramsay, might interpret this observation in the light of their (as they themselves admit) rather polemic view of digital humanities as a community of “builders”: a community that “does” instead of “talks,” one that “makes” instead of “writes,” and, we could add, perhaps also a community where scholarship is often so experimental that it is more like “playing” than “working.”

some of his seminal essays on the matter have been reproduced in moretti, distant reading.

table: the nearest neighbors for a selection of canonical entities using a word-embeddings model
king_arthur: matter_of_britain; round_table; gawain; mordred; round_table_(camelot)
chrétien_de_troyes: perceval,_the_story_of_the_grail; yvain,_the_knight_of_the_lion; cligès; erec_and_enide; erec
geoffrey_chaucer: the_canterbury_tales; general_prologue; the_house_of_fame; troilus_and_criseyde; troilus
charlemagne: louis_the_pious; pepin_the_short; charles_the_bald; clovis_i; carolingian_dynasty

fig. . a dendrogram representing the outcome of a cluster analysis, where the (dis)similarities between writers are visualized as a tree structure. the dissimilarities here are based on the embeddings we obtained for these writers and which capture the semantic context in which these writers are typically mentioned in speculum.

the brothers grimm rediscovered medieval literature in nineteenth-century germany and took pains to initiate the scholarly study of a strange cultural phenomenon from a distant past, still fundamentally new to them at the time. they found themselves confronted with the need to catalog, describe, and edit an unstructured mass of new sources, and they struggled to apply the existing scholarly models that they had inherited from their humanist predecessors.
because of the european dimensions of many medieval phenomena, they were also involved in constant negotiations through their international scholarly correspondence, for example, about the authenticity of particular text versions or the directions of cultural exchange in medieval europe. it would not be far-fetched to liken the condition of present-day digital humanists to their nineteenth-century precursors. modern digital humanists, too, are confronted with the scholarly study of a medieval heritage that they often have to digitize from scratch, even as they define a scholarly, digital practice without a tradition of existing models that can be applied easily to the computational study and dissemination of these artifacts and new insights about them. working as a community, many digital humanists are currently reinventing important aspects of medieval studies in that process, through fundamental discussions about the purpose and meaning of the field.

this situation leads to a complex, opaque, and fascinating relationship between digital medieval studies and their conventional counterpart. on an anecdotal level, digital humanists are inspired by the relative freedom they enjoy in the experimental playground that is dh, where scholars can operate largely outside the gaze and criticism of the conventional humanities. according to some, dh can be viewed as a deliberately “undertheorized” field, where young scholars are not hampered by the mechanisms of intimidation and exclusion that are often related to the concept of “theory.” others have claimed that dh is in fact much more theoretical than the traditional humanities, because of the central place that is assigned to fundamental methodological debates about modeling in the humanities. in a famous blog
in a famous blog see, for example, their polemic pieces reprinted in terras, nyhan, and vanhoutte, defining digital humanities: stephen ramsay, “on building,” – ; and geoffrey rockwell, “inclusion in the dig- ital humanities,” – . on less targeted research methods facilitated by the digital, see stephen ramsay, “the hermeneu- tics of screwing around,” http://quod.lib.umich.edu/d/dh/ . . / : /--pastplay-teaching -and-learning-history-with-technology?gpdculture;rgnpdiv ;viewpfulltext;xcp # . . rockwell, “inclusion in the digital humanities.” for “theory” as a source of intimidation, see jonathan culler, literary theory: a very short in- troduction (oxford, ), . speculum /s (october ) this content downloaded from . . . on october , : : am all use subject to university of chicago press terms and conditions (http://www.journals.uchicago.edu/t-and-c). http://quod.lib.umich.edu/d/dh/ . . / : /--pastplay-teaching-and-learning-history-with-technology?g=dculture;rgn=div ;view=fulltext;xc= # . http://quod.lib.umich.edu/d/dh/ . . / : /--pastplay-teaching-and-learning-history-with-technology?g=dculture;rgn=div ;view=fulltext;xc= # . http://quod.lib.umich.edu/d/dh/ . . / : /--pastplay-teaching-and-learning-history-with-technology?g=dculture;rgn=div ;view=fulltext;xc= # . s the digital middle ages post, “who you calling untheoretical?,” jean bauer quoted susan smulyan, who shouted on one occasion, “the database is the theory!” while the presence of higher-level theoretical and methodological debates is not open to question, the relationship between traditional and nontraditional schools in medieval studies merits a closer look here. scholars in digital humanities typi- cally justify their existence through an active affiliation with older humanities dis- ciplines —in fact, one could say that it is primarily this affiliation that separates the digital humanities from computer science. 
in medieval studies, too, the link between traditional and digital practitioners is crucial if the medieval field is to advance as a whole. importantly, this requires a mutual interest from both parties and a fundamental willingness to learn from one another, while not neglecting the rich tradition of medievalist scholarship.

while we expect digital medieval studies to become more mainstream in the future, it will remain important to maintain dedicated outlets for digital medievalists to reflect on the more technical aspects of their work. a number of more recently inaugurated specialized journals, such as the digital medievalist journal (https://journal.digitalmedievalist.org/) and digital philology: a journal of medieval cultures (johns hopkins university press), merit watching, in addition to the more established, multidisciplinary journals in dh, such as llc: digital scholarship in the humanities (oxford university press; formerly known as literary and linguistic computing) and digital humanities quarterly, both published on behalf of adho. likewise, the book of abstracts of the annual global conference in dh organized by adho (http://adho.org/) helps keep track of current developments in the field.

equally important for the further development of the field are platforms of a more pedagogical nature, where practical tutorials are offered that can help novice practitioners of the dh to acquire digital skills that may not yet be part of curricular training programs in higher education. websites such as the programming historian (http://programminghistorian.org), for example, offer a wide range of peer-reviewed tutorials on technical skills.
other popular pedagogical resources for novice scholars are the many long-standing training events that are annually organized in the dh community, such as the digital humanities summer institute at the university of victoria, the european summer university in digital humanities at the university of leipzig, and the digital humanities at oxford summer school (dhoxss) at the university of oxford. the thatcamp (the humanities and technology camp) “un-conferences” held in various locations have also spread the word about digital methods and approaches to a broad audience. apart from offering a longer exposure to digital humanities practices, such events have an important social dimension by allowing newcomers to build up a network in dh.

jean bauer, “who you calling untheoretical?,” journal of digital humanities / ( ), http://journalofdigitalhumanities.org/ - /who-you-calling-untheoretical-by-jean-bauer/.
cf. liu, “the meaning of digital humanities.”
http://thatcamp.org/.

this digital supplement

this supplement is divided into four sections that aim to represent many of the trends we have traced above. in the first section, “manuscripts and images,” four papers engage with approaches to manuscript analysis. toby burrows introduces a project that collates the manuscripts formerly in the collection of sir thomas phillipps and explores the challenges of analyzing large corpora.
the enormous manuscript collection assembled by phillipps in the nineteenth century was subsequently dispersed to institutions and private collectors around the world. because the evidence relating to the provenance and history of these manuscripts is extensive and varied, developing a coherent framework for analysis required implementing a new data model for manuscript provenance. as well as examining the technical processes involved in this work, burrows presents the results of applying this approach to two specific research questions: the histories of the group of manuscripts that were owned by both thomas phillipps and alfred chester beatty, and the combined histories of the former phillipps manuscripts that are now in institutional collections in north america.

although it is well known that many scribes had several scripts and even alphabets available to them, there has been little discussion of the phenomenon from a paleographical point of view, and even less of the methods to address it. in his contribution about multigraphism in late anglo-saxon manuscripts, peter a. stokes examines the work of two multigraphic scribes in detail, drawing on the digipal framework and exploring the capabilities that it gives for communication and analysis of script and the insights it provides about late anglo-saxon scribal practice and multigraphic script in general.

mike kestemont, vincent christlein, and dominique stutzmann propose what they call “artificial paleography,” based on the adaptation of technology from the field of computer vision and artificial intelligence to the paleographic study of medieval manuscripts. their paper focuses on the automatic identification of script types in medieval manuscripts, which is an important step on the road to the fully automated “machine reading” of these documents.
the work is presented in the context of a recently organized competition, or “shared task,” on this subject, which is an increasingly common scientific format in the world of digital scholarship. in addition to a high-level introduction to the computer models they use, the paper focuses on the interpretation of these complex systems against the background of traditional paleography.

murray mcgillivray and christina duffy shine the new light of spectrometry to see beneath the illuminations of the well-known gawain manuscript. their article engages with the techniques of multispectral imaging to examine the illustrations in london, british library, ms cotton nero a.x., the unique manuscript of sir gawain and the green knight and three other important middle english poems. imaging reveals the ink drawings under the later paint and detects differences from the illustrative goals, damaged and faded portions of images that were restored, and the intentional deployment of chemically different pigments that have come to look similar with the passing of time.

the second section, on “mapping,” includes two articles that illustrate the use of geographic information systems (gis) in medieval studies. david joseph wrisley explores digital mapping for medieval studies at multiple scales for both close and distant readings. his article distinguishes mapping geographical information from historical gis, and it presents several findings of the visualizing medieval places (vmp) project for the study of medieval french texts.
wrisley argues for the need to expand the project into a research architecture that allows social cocreation of data and explores the affordances of linked open data. m. alison stones describes the evolution of the web-based lancelot-graal project, which adapts gis to the geography of the manuscript page, using it as part of a comparative examination of differences in the choice, placement, and treatment of subjects in manuscript illustrations.

the third section, “texts and editions,” brings together four articles. jeroen de gussem traces the “secretarial trail” of bernard of clairvaux by using the techniques of stylometry. the literary style of bernard of clairvaux (c. – ) was of such grandeur that it was imitated by the greatest theologians of his time, providing an “architecture” for a cistercian way of writing. bernard’s best imitators were, in fact, found by his side, in the scriptorium of clairvaux. these scribes were trained to mimic their abbot’s preferred wording and his mastery of rhetorical twists, and although bernard made a habit of rereading, correcting, and repolishing his works, it is often unclear how we should estimate his secretaries’ part in the ultimate constitution of his oeuvre. the focal figure in bernard’s scriptorium was nicholas of montiéramey, who served the abbot from c. – to c. – , and in this article, the dynamics of kinship between bernard’s and nicholas’s oeuvres are laid bare through stylometric methods. the stylistic familiarity between their texts can teach us more about the nature of collaboration in the scriptorium of clairvaux as well as allowing for a better close reading of bernard’s more dubiously attributed texts.

maxim romanov presents an algorithmic analysis of medieval arabic biographical collections, a unique data collection whose sheer size has hindered a holistic scholarly treatment so far.
his paper illustrates the sort of macroanalyses that large and understudied corpora enable, with an emphasis on the geographic and temporal distribution of the entities in his data. romanov discusses the complexities of tagging, structuring, and sustaining these data and offers valuable pointers to practical tools and realistic methodologies.

mark cruse performs a quantitative analysis of toponyms in a manuscript of marco polo’s devisement du monde (london, british library, ms royal d ). scholars have long noted that marco polo’s account presents many textual problems, and not only to modern scholars. the text’s toponyms also posed a particularly great challenge to the scribes who copied the early manuscripts because so many were unknown, and quantitative analysis of the toponyms in the oldest old french copy of the account (royal d ) confirms the scribal uncertainty that attended the copying of these words. by distinguishing between familiar and unfamiliar toponyms, by assigning the occurrences to specific scribes, and by quantifying the number of variants and the degree of orthographic and phonetic variance for each toponym, the article argues that we can identify the words and contexts that proved difficult to scribes. rather than regarding these variants as errors, cruse argues, we should analyze them as forms of reader response. an analysis of these toponyms in their manuscript context as semantic markers devoid of modern annotation enables us to encounter polo’s text as its earliest readers did: as the description of an as yet unknown world teeming with exotic places rich in significance.
ultimately, the ways in which scribes responded to the toponyms in polo’s account reflect not only scribal practice, but also the processes by which new geographical information was absorbed by medieval readers.

franz fischer’s article surveys a series of digital scholarly editions with a focus on the options and requirements for developing digital textual corpora. on the one hand, textual (or, rather, editorial) plurality seems to be one of the main characteristics of digital editions; on the other, the usefulness of a corpus depends substantially on the uniformity and representativeness of the texts that it includes. based on a clear yet flexible definition of digital critical editions, fischer makes several proposals to resolve the conflict between a variety of editorial approaches and a desirable homogeneity within a corpus. through the inclusion of editions that are digital in a wide sense and critical in a narrow sense, a focus on works rather than documents, and linkage to, or integration of, external resources, he argues that it is possible to create a valuable and truly digital corpus of critical editions. the usefulness of its features and the technical framework of such a corpus would be based on an elementary data model for metadata, text, annotation, and paratexts.

the fourth section, on “multimediality: space and sound,” presents three articles that explore reconstructions of medieval architectural space and of the sounds within medieval buildings. sheila bonde, alexis coir, and clark maines use computer-aided design (cad) technology to reconstruct, represent, and study architectural process at the cistercian church at notre-dame d’ourscamp, concentrating on the late thirteenth century, when workers dismantled the church’s romanesque east end and replaced it with a new gothic choir.
they argue that digital representation has the potential to encourage viewers to engage with the fuller life cycle of a building, and that it encourages researchers to analyze the three-dimensional application of their interpretations of building change. the goal of their digital project has been to promote a fuller understanding of the process by which medieval builders dismantled parts of earlier buildings to attach newer extensions. the article and cad project present an extended examination of the construction sequence and engage with issues of uncertainty in virtual representation.

the remaining two articles in this section examine the sounds of byzantium. the international team of spyridon antonopoulos, sharon gerstel, chris kyriakakis, konstantinos t. raptis, and james donahue investigates the acoustic aspects of byzantine liturgical spaces in thessaloniki’s churches. their project unites scientific analysis of acoustics with consideration of the architectural frame and imagery of choral performance. their project aims to identify and preserve the acoustic signatures of the churches under study and to capture the multisensory experience of the byzantine worshipper.

bissera pentcheva and jonathan abel present the method and the results of the stanford university multidisciplinary icons of sound project. they argue that digital technology allows us to transcend a text-based encounter with byzantine liturgical music and restores the performative aspects of the sung rite, and their focus is on hagia sophia: its acoustics, aesthetics, and music. the article details the effects
of the domed structure on the experience of sung chant within it: the amplification of sounds together with overlapping of notes and an “acoustic waterfall” produced both an aural and an optical brightness. using digital technology, icons of sound has successfully imprinted the acoustic signature of the building on the live performance of byzantine cathedral chant.

the articles in this supplement thus combine to offer a window into the wealth of approaches and experiences that medievalists have brought to the field of digital humanities. it is hoped that this contribution to speculum incites (even more) new interest and fresh activity in this promising field.

david j. birnbaum, university of pittsburgh (djbpitt@pitt.edu)
sheila bonde, brown university (sheila_bonde@brown.edu)
mike kestemont, university of antwerp (mike.kestemont@uantwerp.be)

ucsf: uc san francisco previously published works
title: opening access for a new era of scholarly publishing. a report of the access to continuing resources interest group (alcts crs) program, american library association annual conference, anaheim, june
permalink: https://escholarship.org/uc/item/ z vp
journal: technical services quarterly, ( )
issn: -
author: taylor, anneliese s
publication date: - -
peer reviewed
anneliese taylor, assistant director for scholarly communications & collections, university of california, san francisco library, san francisco, ca - , anneliese.taylor@ucsf.edu

there has been enormous growth in open access to scholarly resources over the last few years. the three speakers at this event addressed three different methods by which scholarly content is being made open access. ada emmett, scholarly communications program head at the university of kansas (ku) libraries’ center for digital scholarship, spoke about the implementation of ku’s open access policy. julie bobay, associate dean for collection development & scholarly communication at indiana university libraries, provided an overview of and update on the publication model for the stanford encyclopedia of philosophy. and jim gilden, editor, sage open sales at sage publications, gave a snapshot of sage’s new open access journal publishing in the social sciences and humanities.

many academic institutions are adopting open access (oa) policies that stipulate that articles published by faculty and others be deposited in the institution’s open access scholarly repository. the university of kansas (ku) was the first public university in the united states to adopt an oa policy. ku’s policy was passed by its faculty senate in and implementation began in . implementation is managed through ku libraries’ center for digital scholarship (cds), and ada emmett leads the effort to assist faculty with participation in the oa policy. ku faculty are responsible for supplying copies of their scholarly articles to cds staff for deposit in ku’s oa repository, ku scholarworks, though the library offers many services to assist faculty along the way. recognizing that faculty are very busy and also resistant to mandates, emmett and her staff do significant outreach and promotion of the policy in order to build understanding and support and to gain advocates for the policy.
outreach efforts include brown bag lunches, meetings with departments, one-on-one discussions with faculty, progress reports to key university groups, scholarworks workshops, and presentations during open access week. cds staff also offer full-service scholarworks submission assistance, consisting of researching publisher policies for all articles published by a faculty member and assisting with deposit of the articles. another important role the cds has assumed is consulting with authors concerning their rights with regard to copyright, and assisting authors to retain certain copyrights. estimates are that close to % of ku departments have deposited articles as part of the policy.

julie bobay presented on the stanford encyclopedia of philosophy (sep), an online, continually updated, peer-reviewed open access resource, founded through stanford university’s center for the study of language and information (csli). sep is unique both in how it is funded and in being the first effort by academics of any discipline to collaboratively write, publish, and maintain scholarly subject material on the web. it went online in and is now an important reference work in the discipline of philosophy. it includes , entries by , authors, and about updates and new entries are added monthly. sep averages almost , entry downloads weekly, and it accounts for % of all web traffic to stanford university, where sep content is hosted. funding for sep has gone through several stages. initial funding for the proof of concept in came from a csli grant. from - , sep received grants from the neh, nsf, mellon foundation, icolc, and hewlett foundation to cover all aspects of the publication and business models and design the content management system. stanford partnered with the international library community in to build a $ m endowment through institutional membership in the sep international association (sepia), hosted by indiana university libraries.
sepia members raised % of the goal and stanford raised the remaining %. in sep achieved a sustainable annual funding model thanks to % coming from the endowment, % from the friends of the sep society, and % from stanford. the first of its kind, sep has proved that, with support, scholars and their institutions can be stewards not only of the creation of scholarly matter, but also of the publication and management of that material.

publishers are also helping promote open access publication. jim gilden presented sage’s new oa journal, sage open, a broad-discipline gold oa journal following the plos one model. sage open is notable as the first broad-based journal covering social sciences and humanities disciplines. this broad-based model was established by plos one, which publishes peer-reviewed scientific and medical research articles. peer review assesses the soundness of an article’s research, rather than the article’s fit within the narrow subject or readership scope of a journal. the result is that many more articles can get published in broad-based journals. articles are not part of a journal issue, so they are pushed online as soon as they are approved. sage open began publishing articles in may and had published articles at the time of this presentation. there were over , full text downloads in the first twelve months. the journal is published on the highwire platform like sage’s other journals and is preserved through clockss. articles are published under a cc-by license. sage has received over , submissions to date from countries. submissions tilt much more heavily toward social sciences disciplines than humanities, with education and psychology representing the highest numbers. acceptance rates were % accepted or accepted pending minor revision, % requiring major revision, and % rejected. sage open plays an important role by serving disciplines that are under-represented when it comes to oa journals.
the presentation slides from these three talks can be found at http://ala .scheduler.ala.org/node/ .

med. hist. ( ), vol. ( ), pp. – . doi: . /mdh. . © the authors . published by cambridge university press

media reviews

teaching and researching the history of medicine in the era of (big) data: introduction

despite ample rhetoric about the utility of new digital methods that have emerged from the digital humanities, it remains difficult to understand exactly how and when various methods can be applied to research and teaching. what kinds of projects can benefit from digital methods? how can one tell which methods are most appropriate for which sources? what are the pros and cons of various tools and software? are new methods really worth the investment of time and energy? especially in the case of medical history, real-world examples of digital scholarship that can provide answers to these questions can seem elusive.

on april , in a panel at the annual meeting of the american association for the history of medicine, scholars gathered to address these timely issues and questions, and to embrace the opportunity to work together to help define a path forward for the history of medicine field as it faces an ever-greater digital world and intersects increasingly with the digital humanities. the reviews in this volume of medical history, and those that will follow in the next two volumes, reflect the proceedings of this panel, which consisted of a variety of engaging case studies, including a semantic network analysis of the linguistic contexts in which the definition of ‘nutrition’ developed, an unusually high-level view of how doctors perceived and discussed influenza across thirty different american newspapers, as well as new ways in which digital methods can and should be integrated into the history of medicine classroom.

in addition to two panels’ worth of papers being compressed into a single lunch session, further time constraints meant that presenters were not able to present full versions of their respective papers. nonetheless, the presentations collectively facilitated a lively interchange among the presenters and with the large and diverse audience, addressing key methodological questions about how best to bridge traditional and digital methods in the history of medicine. these published proceedings offer more detail of the case studies, and they advance for a broader audience the productive conversation about the utility, application and execution of digital methods in the history of medicine.

medical historians have long grappled with ways in which physicians have continually adopted, appropriated and transformed medical (and non-medical) technologies for the betterment (and at times the detriment) of their craft. we must apply the same kind of scrutiny to our own practices and technologies, neither adopting new methods for the sake of change, nor ignoring them out of allegiance to tradition. these reviews and case studies are not meant to be prescriptive. rather, we hope these examples contribute to, and indeed provoke, a broader continuum of programmes and conversations about the state and direction of the history of medicine field in the twenty-first century.

the intramural research program of the us national library of medicine, national institutes of health, supported the research and writing of this introduction, and the editing of its associated articles.

frederick w. gibbs, university of new mexico, usa
jeffrey s. reznick, history of medicine division, us national library of medicine, national institutes of health, usa
journal of victorian culture, journal homepage: http://www.tandfonline.com/loi/rjvc

to cite this article: katrina navickas & adam crymble ( ) from chartist newspaper to digital map of grass-roots meetings, – : documenting workflows, journal of victorian culture, : , - , doi: . / . . published online: mar .
digital forum

from chartist newspaper to digital map of grass-roots meetings, – : documenting workflows

katrina navickas and adam crymble
© leeds trinity university

i. introduction

chartism was the largest mass movement for democracy in nineteenth-century britain. it is best remembered for its extraordinary tactics: ‘monster’ meetings of thousands of people in squares and fields; the three national petitions of , , and , which gathered tens of thousands of signatures; and extraordinary events such as the ‘risings’ of and the ‘plug plots’ and conventions of . recently historians have reinterpreted the significance of the more ordinary and everyday elements of the movement. malcolm chase, tom scriven and others have shown how a familiar and quotidian culture was essential in sustaining chartism in between the periods of mass agitation. historians of protest now take a more rounded and wide-ranging approach to understanding what adherence to the movement entailed.

an integral part of the organization of chartism as a grass-roots movement was weekly local branch meetings. usually these meetings were held in the back room of pubs, but also in chapels, working men’s halls, and increasingly, as chartists raised the money to build them, in their own halls. these meetings gave working men and women (albeit in separate groups) the opportunity to put their democratic principles into practice in voting, speaking, serving on committees and educating themselves. eager to defend their legality and to spread the word, chartists advertised the locations of the meetings in separate columns in the chartist press, most notably in the northern star and leeds general advertiser newspaper (hereafter northern star). the paper was founded in november as the project of chartist agitator and former irish mp, feargus o’connor and the leeds printer joshua hobson. it was published in leeds and distributed nationally, reaching a regular circulation of , copies a week in . aimed at a respectable working-class readership, the northern star followed the usual early victorian newspaper mix of local, national and international news, but with an additional emphasis on advertising chartist activities.

malcolm chase, chartism: a new history (manchester: manchester university press, ); tom scriven, ‘humour, satire and sexuality in the chartist movement’, historical journal, . (march ), – .
katrina navickas, protest and the politics of space and place, – (manchester: manchester university press, ).
northern star, nineteenth century serials edition, birkbeck, university of london, and the british library, beta version (august ) http://www.ncse.ac.uk/headnotes/nss.html [accessed online september ].

there has not been a systematic analysis of the weekly meetings reported in the newspaper. how many meetings were there? who held them and where? what can we learn by examining the geographic patterns of all the meetings that were held? historians still primarily understand chartists on the basis of close readings of surviving texts rather than geo-spatial or social scientific modes of research. indeed, gareth stedman jones’s essay ‘rethinking chartism’ in his highly influential collection, languages of class, inspired what became known as the ‘linguistic turn’ among scholars of early nineteenth-century british popular politics in the s and s. attention to the texts of speeches and other literature became paramount to understanding the motivations and evolution of the movement.
and although important sources such as the northern star are now digitized and available online, as with most digitized newspapers, historians in effect use them in the same ‘analogue’ ways as they previously used microfilm or the original paper copies: reading one page at a time. a sea-change in research methods is occurring in that keyword searching is now the norm when using digital resources. while this is undoubtedly positive, there are pitfalls to this new research landscape. poor quality optical character recognition (ocr) frequently forms the basis of the searchable text. if trusted blindly, the results of such searches may be incomplete or, at worst, misinterpreted. in short, few patterns emerge if records are looked at sequentially; keyword searching is in effect still sampling with limited results.

nevertheless, the digital nature of the transcriptions in the northern star database opens up new possibilities if we are aware of the potential of digital analyses of texts that have hitherto only been read using conventional, micro-analytical approaches. the digitization of these periodicals and the development of text-mining tools to extract large amounts of quantitative as well as qualitative data from them facilitate macro-analytical approaches. this article explores some of those possibilities by highlighting an approach that co-opts rudimentary linguistics and historical geographical approaches and applies them in a digital environment for the purpose of enhancing historical understanding. it does so by highlighting the workflow used by katrina navickas’s political meetings mapper project undertaken with the british library digital scholarship department. the project started by seeking digital copies of the northern star newspaper, and ended with an interactive map of chartist meetings. this map made it possible to understand the geographical and temporal distribution of grass-roots chartist activity for the first time. the result is a macroscopic view, giving what katy börner calls an opportunity to ‘observe what is at once too great, slow, or complex for the human eye and mind to notice and comprehend’. this is not a challenge to close reading, but a complement at a different resolution.

gareth stedman jones, ‘rethinking chartism’, in languages of class: studies in english working class history, – , by gareth stedman jones (cambridge: cambridge university press, ), pp. – ; for work on chartist texts see mike sanders, the poetry of chartism: aesthetics, politics, history (cambridge: cambridge university press, ); ariane schnepf, our original rights of the people: representations of the chartist encyclopaedic network and political, social and cultural change in early nineteenth century britain (bern: peter lang, ).
katrina navickas, ‘political meetings mapper’, british library labs ( ) http://labs.bl.uk/political+meetings+mapper [accessed online september ].

workflow is of course always important to historians, but it finds itself in the foreground more often in some sub-disciplines than others. digital history frequently asks historians to be critical and indeed open about their sources and methods; however, digital history is not alone, nor did it invent the in-depth discussion of methodology and workflow. for example, e.a. wrigley’s the early english censuses ( ) is a book about the workflows the author used in his analyses of these early censuses, building upon decades of research in historical demography. the book provides such a clear map for readers of what the author did to the records that one could call it a monograph on historical workflow.
likewise, much of the work presented in journals such as the economic history review also focuses on processing data through mathematical models that are meticulously described so as to be reproducible. this social-scientific approach to reproducibility and transparency is an offshoot of the scientific method, which few humanities scholars have found a need to emulate until recently. this shift towards the scientific method may in part be explained by the fact that ‘digital’ analyses are often actually interdisciplinary uses of social scientific methods. both mapping and linguistics are social scientific approaches to knowledge building which have recently become accessible to humanities scholars in the form of digital tools and through new publications such as the programming historian ( –present), as well as exploring big historical data: the historian’s macroscope ( ), which have taken the lead on prioritizing reproducibility in humanities research. this article builds on the work of the programming historian and reproducible research practices, generalizing the processes used by navickas so that they can be useful to scholars working on different types of records but with similar aims of acquiring, cleaning, geocoding, and displaying historical information from across a set of historical primary sources.

katy börner, ‘plug-and-play macroscopes’, communications of the acm, . (march ), – .
e.a. wrigley, the early english censuses (oxford: oxford university press, ).
the economic history review ( –present).
adam crymble, fred gibbs, allison hegel, caleb mcdaniel, ian milligan, evan taparata and jeri wieringa, eds, the programming historian, nd ed. ( ) http://programminghistorian.org/ [accessed online september ]; shawn graham, ian milligan and scott weingart, exploring big historical data: the historian’s macroscope (london: imperial college press, ).
katrina navickas, political meetings mapper ( – ) http://politicalmeetingsmapper.co.uk [accessed online may ].

ii. acquire

as yet, digitizing a large historical corpus is impractical for most individual historians. even a publication run on the scale of a newspaper like the northern star, which was published for a modest years between and , still requires a library partner in possession of the paper or microfilm copies, at the very least. for most scholars, acquiring a newspaper or similarly substantial digital corpus involves finding one that has already been digitized. as tim hitchcock notes, much of that work has been done by private companies who charge for subscription access to material. the change to uk copyright legislation has gone a long way to facilitate greater access to digital corpora for uk-based researchers. the new law gave researchers the right to make copies of any textual records for which they had ‘legal access’, and made unenforceable any terms of use that prohibit the making of copies for non-commercial text and data mining analysis. the result of this has been a new openness by many commercial publishers to provide limited access to certain researchers as they try out this new model of access to their records. however, for most scholars – particularly early career scholars, independent scholars, or postgraduate students – getting a positive response still involves a level of privilege that is important to recognize. in the case of the political meetings mapper project, access to the textual layer of the northern star database was granted by british library labs, whose mandate is to promote the use of digital resources in the library collection.
each request will be met differently by the owners of the data, and it is not uncommon to be asked to pay fees or negotiate legal nondisclosure agreements. it is also not uncommon for requests to be rejected outright or ignored. sometimes these requests will be refused on technical grounds. what seems like a simple request for information may require someone to spend considerable time figuring out how to get what you want to use and package it in a way that makes it easy to transport. even small collections, if poorly documented or without an individual on the team who knows how the system works, can be difficult to extract from their databases. asking for data is an art rather than a science; however, as christian kreibich notes, there are strategies for improving one’s chances of success, ranging from using a university email address to emphasize the professional nature of the request, to being clear about why one wants the data, and of course expressing one’s gratitude.

tim hitchcock, ‘privatising the digital past’, historyonics ( june ) http://historyonics.blogspot.co.uk/ / /privatising-digital-past.html [accessed online september ].
‘exceptions to copyright: research’, intellectual property office, uk (october ) https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/ /research.pdf [accessed online july ].
‘gale leads to advance academic research by offering content for data mining and textual analysis’, cengage learning ( november ) http://news.cengage.com/higher-education/gale-leads-to-advance-academic-research-by-offering-content-for-data-mining-and-textual-analysis/ [accessed online july ].
‘british library labs’, the british library http://labs.bl.uk/ [accessed online september ].
christian kreibich, ‘how to ask for datasets’, medium.com ( april ) https://medium.com/@ckreibich/how-to-ask-for-datasets-d ef cb c#.b iufreo [accessed online june ].

in the case of the political meetings mapper project, navickas submitted a proposal to and won the second british library labs competition, which is funded by the andrew w. mellon foundation, and which awards two scholars per year privileged access to digital collections in the british library collection as well as access to library expertise. as part of that competition, navickas was given access to the collection via a series of digital indexes. with these, it was possible to identify the filenames of relevant page scans (figure ), and a set of the extensible markup language (xml) files that contained the searchable text layer (code block ), both of which had to be manually downloaded. this process took approximately hours of work. the data set included over page scans of issues of northern star between and , with a total word count of around , words for the column of interest: ‘forthcoming meetings’.
navickas chose this sample date range due to the time constraints of the project and because it covered the most active period of the chartist movement. the page scans for the most important year in chartist history, – , were not available in the collection so these had to be accessed manually from gale-cengage’s database, nineteenth century newspapers.

british library newspapers, gale cengage http://gale.cengage.co.uk/british-library-newspapers.aspx [accessed online september ].

figure . page scan extract from front page of the northern star newspaper, february , © british library, wo _nrsr_ _ _ - .tif. reproduced with permission of the british library.

code block : xml extract of the text layer of northern star newspaper, february . much of the xml refers to the pixel coordinates where the word can be found on the original page scan. this is used to highlight keywords when using the commercial provider’s website.
newspaper regional weekly saturday, february , . . fair w _nrsr_ _ _ - .tif , , , tictoria

iii. clean

in order to map the chartist meetings, the next step involved identifying relevant articles in the newspaper. the project focused only on one column, the ‘forthcoming meetings’ column of the northern star, as this provided the most succinct and regular form of wording and punctuation that could be most efficiently extracted without having to sift manually through extra contextual narrative description. initially this was identified through the standard column heading, ‘forthcoming meetings’, but as it quickly became clear that this was usually on the same page, navickas began to isolate the column manually (figure ). given the relatively modest size of the collection, a manual approach proved more effective than keyword searching.

this particular newspaper had been digitized in the previous decade using the latest ocr software available to create the searchable text layer that is stored in the xml files. the problems of poor quality ocr are well documented by rose holley, who noted in that a sample of australia’s massive trove newspaper database contained accuracy levels ranging from % to %, with % accuracy representing errors in an average paragraph of text. these results were comparable to those found by the koninklijke bibliotheek in . the accuracy levels of northern star newspaper ocr-generated transcriptions are unknown, as quantifying this measure requires manually counting errors in a sample of pages. however, even when the relevant column had been identified, it quickly became clear that the text contained enough errors that it could not be relied upon for a systematic extraction of chartist meetings within the column (see code block ).

rose holley, ‘how good can it get? analysing and improving ocr accuracy in large scale historic newspaper digitisation programs’, d-lib magazine, . / (march/april ) http://www.nla.gov.au/ndp/project_details/documents/andp_howgoodcanitget.pdf [accessed online march ].
edwin klijn, ‘the current state-of-art in newspaper digitization’, d-lib magazine, . / (january/february ) http://www.dlib.org/dlib/january /klijn/ klijn.html [accessed online september ].

figure . excerpt of ‘forthcoming chartist meetings’, northern star, february , © british library, wo_nrsr_ _ _ - .tif. reproduced with permission of the british library.

code block : example of xml errors on key terms that made it impractical to re-use the original xml.
&suraucee may be effected, daily &apost;opeetuses may be had

originally, the project envisaged setting up a crowd-sourced transcription site, building upon the model used by bob nicholson’s victorian meme machine, which would have required volunteers to transcribe the columns by hand. however, it proved more economical and efficient to perform the ocr again using the latest version of commercial ocr software. this provided new transcriptions with approximately eight out of words transcribed correctly – a much greater level of accuracy than the text in the xml files. the results were then cleaned up by a small team of research assistants: samantha walkden, megan dibble and john levin, who checked and corrected the ocr files. the corrections mainly involved altering spacing and punctuation of the columns. this amounted to days of work and resulted in four years of newspaper transcriptions ( – ), saved in .txt format. the new ocr’d copy of the transcriptions was now suitable to be used for research.

bob nicholson, ‘introducing … the victorian meme machine’, digital victorianist ( june ) http://www.digitalvictorianist.com/ / /victorian-meme-machine-interviews/ [accessed online july ].
the project used abbyy finereader , a commercial ocr package.
‘text file’, wikipedia https://en.wikipedia.org/wiki/text_file [accessed online july ].

code block : text file of the ocr’d newspaper text, northern star, november .
london – the public discussion will be resumed in the city chartist
hall, , turnagain-lane, on sunday next, at half-past ten o’clock in the
forenoon. at three o’clock in the afternoon of the same day, the
metropolitan delegate council will assemble for the dispatch of
business. – in the evening at seven o’clock, mr. j. h. r. bairstow will
deliver a lecture.

iv. extract

with digital text clean enough to identify relevant entries reliably, the next step was to extract those entries and structure them in a way that would make it possible to map the location of meetings. at this stage the need was to find any mention of a meeting and save the result to a database. there are a number of ways this could have been achieved. the political meetings mapper project chose to use custom gazetteers compiled by navickas that contained words known to appear frequently in that weekly column of meetings. each gazetteer was a simple text file with one term (lower case) per line.

for historical projects there is an added challenge: many of the individual pubs, halls and some streets of the s no longer exist. to solve this problem, navickas identified locations manually, using historic trade directories digitized by the university of leicester and looking visually for the sites on historic town plans. the information from the trade directories was obtained from the images rather than from any underlying xml. using old town plans, it was possible to geo-reference an old map and put it into google earth, where it could then be used to find the current geo-coordinates for those lost places. therefore, the research, like many small-scale digital projects, could not be done through a ‘one-stop shop’ software package, but involved the careful curation of various ready-made, custom-built, proprietary and open access resources. this process also raises questions about sustainability and replicability, as many of these resources rely on institutional hosting, while commercial tools such as google earth or fusion tables require signing up to online accounts and uploading one’s data to their servers.

the gazetteer was then used to search the text for matches.
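the matching step can be sketched as follows. this is a minimal illustration in python, not the code used by the project: the function names, the sample terms and the sample sentence are assumptions made for this example, and whole-word matching with regular expressions is only one way to search a text for gazetteer terms.

```python
import re


def load_gazetteer(lines):
    """Return the unique, lower-cased, non-empty terms from a gazetteer.

    `lines` stands in for the lines of a plain-text file with one term per line.
    """
    terms = {line.strip().lower() for line in lines if line.strip()}
    # Longest terms first, so multi-word terms are listed before their parts.
    return sorted(terms, key=len, reverse=True)


def find_terms(text, terms):
    """Return (term, offset) pairs for whole-word matches, in reading order."""
    lowered = text.lower()
    hits = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits.append((term, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical gazetteer terms and a shortened sample announcement.
gazetteer = load_gazetteer(["London", "turnagain-lane", "lecture", ""])
sample = (
    "london – the public discussion will be resumed in the city chartist hall, "
    "turnagain-lane, on sunday next. mr. bairstow will deliver a lecture."
)
matches = find_terms(sample, gazetteer)
print([term for term, _ in matches])
```

sorting the hits by their position in the text keeps them in reading order, which makes it easier to group the terms that belong to a single meeting announcement.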
this was done using a custom python programme by adam crymble, ‘using gazetteers to extract sets of keywords from free-flowing texts’, which is described as a step-by-step tutorial on the programming historian. navickas adapted this code slightly for the project’s needs, but the core principles behind the original tutorial apply to the needs of the workflow herein described. the full code (hereafter ‘the python code’) used by navickas can be found on zenodo in the project’s repository. this script was run on each column of the meetings’ announcements in turn, extracting the text relevant to a single meeting as it went. .  ‘historical directories of england and wales’, special collections online, university of leicester [accessed online july ]. .  google earth, –present [accessed online september ]. .  adam crymble, ‘using gazetteers to extract sets of keywords from free-flowing texts’, the programming historian ( ) [accessed online september ]. .  katrina navickas, ben o’steen and john levin, ‘meetingsparser: package’, zenodo ( ) [accessed online september ], doi: . /zenodo. , [accessed online september ]. http://specialcollections.le.ac.uk/cdm/landingpage/collection/p coll https://www.google.co.uk/intl/en_uk/earth/ http://programminghistorian.org/lessons/extracting-keywords http://programminghistorian.org/lessons/extracting-keywords https://zenodo.org/record/ #.v ep tatl http:// . /zenodo. https://github.com/bl-labs/meetingsparser/tree/concise-version journal of victorian culture v. geocoding at this point in the workflow, the individual meetings had been identified. the next step was to geocode the meeting locations. geocoding is the process of pairing words that relate to a physical location, to coordinates that represent the same place on a map. there are a growing number of tools that can perform this task, however these change frequently as new software emerges, and so it is more important to understand what geocoding does to the historical data. 
It is a process that involves converting strings of text that refer to places, such as ‘China Walk, Lambeth’, to a pair of decimal latitude and longitude coordinates. There are a number of coordinate formats for geocoding that go beyond latitude and longitude, and each system of mapping has strengths for a particular area. For example, the British National Grid is commonly used to study the geography of Britain, as it provides a highly accurate representation of British places, but the further from the British archipelago one travels, the more distorted the results become. This is in part caused by the challenge of rendering the curved surface of the globe onto a two-dimensional map. Readers are advised to consult a subject specialist in geography or cartography on which geocoding format is most appropriate for their project. As this project intended to use Omeka to display the data (see below), Navickas chose latitude and longitude because this format was required by the Omeka maps plug-in.

Geocoding was conducted by the Python code at the same time as the extraction process identified above; from the perspective of a workflow, however, this is a separate step. The geo-coordinates were then manually saved to the CSV file beside each entry, with latitude and longitude each in its own column (see the figure of the CSV file).

Notes:
The town names were geo-coded using IDRE Sandbox [accessed online September], then the co-ordinate information for the historic addresses was added manually (process described above) to the gazetteer generated by the geocoder.
‘The National Grid’, Ordnance Survey [accessed online July].
Anon, ‘Geolocation Plugin for Omeka’ [accessed online July].

Figure. CSV file containing each meeting and its associated metadata. Dublin Core is the metadata standard required for the Omeka content management system [accessed online September].
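The geocoding step described above, pairing a place string with coordinates and saving latitude and longitude in their own CSV columns, might be sketched as follows. This is an illustrative assumption rather than the project's code: the lookup table, the column names and the coordinate values are hypothetical placeholders.

```python
import csv
import io

# Hypothetical gazetteer: place string -> (latitude, longitude).
# The coordinates are rough illustrative values, not the project's data.
PLACE_COORDS = {
    "china walk, lambeth": (51.5, -0.1),
}


def geocode(place, gazetteer=PLACE_COORDS):
    """Return (lat, lon) for a known place, or (None, None) if unmatched."""
    return gazetteer.get(place.strip().lower(), (None, None))


def write_rows(meetings, out):
    """Write one CSV row per meeting with separate latitude and longitude columns."""
    writer = csv.writer(out)
    writer.writerow(["title", "place", "latitude", "longitude"])
    for title, place in meetings:
        lat, lon = geocode(place)
        # csv.writer renders None as an empty cell, flagging places still to be found.
        writer.writerow([title, place, lat, lon])


buffer = io.StringIO()
write_rows(
    [("Tailors' meeting", "China Walk, Lambeth"), ("Lecture", "Unknown Hall")],
    buffer,
)
csv_text = buffer.getvalue()
print(csv_text)
```

Leaving unmatched places blank, rather than guessing, mirrors the manual step in the workflow: those rows are the ones that would need to be located by hand on historic maps.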
Links for the notes above:
IDRE Sandbox geocoder: https://sandbox.idre.ucla.edu/sandbox/sandbox-geocoder
‘The National Grid’: https://www.ordnancesurvey.co.uk/resources/maps-and-geographic-resources/the-national-grid.html
Geolocation plugin for Omeka: https://omeka.org/codex/plugins/geolocation_ .
Dublin Core: http://dublincore.org/

VI. Dating the meetings

As each meeting also took place at a certain time, and the temporal distribution of meetings undoubtedly had historical meaning, it was important to identify the meeting date. As noted, each meeting had a place and a time listed in the advert in the Northern Star newspaper. Unfortunately, the dates were not written to be easily machine-readable. It was common, for example, for a meeting to be listed as ‘this Thursday’ or ‘tomorrow’. Because the Northern Star was always published on a Saturday, and because we know the date each newspaper issue was printed, it was possible to convert phrases like ‘tomorrow’ into the date of the meeting referred to, using some simple Python code that employed pattern matching with regular expressions. The list of patterns was manually created for the needs of the current project. Once dates had been identified, they were added to a new column in the CSV file described above. At this stage of the workflow, all information required to map the meetings over time had been extracted and structured.

VII. Display

The final step was to import the geo-coded meetings into the project website’s digital map. The project used Omeka and the ‘Geolocation’ plug-in. Omeka is a free content management system for building websites, produced by the Roy Rosenzweig Center for History and New Media at George Mason University. It was originally designed for the gallery, library, archives, and museum sector as a means of producing exhibits of collections.
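Because the import relies on the date column produced in the dating step, it may help to see how that conversion could work. The following is a minimal sketch, not the project's code: the pattern, the `resolve` function and the rule that a weekday name means the first such day after the issue date are all assumptions made for illustration, and the issue date is simply a Saturday chosen as an example.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

# Matches 'tomorrow', or a weekday name optionally preceded by 'this' or 'next'.
PATTERN = re.compile(
    r"\b(tomorrow|(?:(this|next)\s+)?(" + "|".join(WEEKDAYS) + r"))\b",
    re.IGNORECASE,
)


def resolve(phrase, issue_date):
    """Return the date a phrase refers to relative to issue_date, or None."""
    m = PATTERN.search(phrase)
    if m is None:
        return None
    if m.group(1).lower() == "tomorrow":
        return issue_date + timedelta(days=1)
    target = WEEKDAYS.index(m.group(3).lower())
    # Days until the next occurrence of the weekday, strictly after the issue date.
    # Simplifying assumption: 'this' and 'next' are treated alike.
    ahead = (target - issue_date.weekday() - 1) % 7 + 1
    return issue_date + timedelta(days=ahead)


# An example Saturday, standing in for the known print date of an issue.
issue = date(1842, 1, 1)
for phrase in ["tomorrow", "this Thursday", "Saturday evening", "no date given"]:
    print(phrase, "->", resolve(phrase, issue))
```

Returning `None` for phrases that match no pattern leaves those rows to be dated by hand, which fits a workflow in which the pattern list is built up manually as new phrasings are encountered.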
Omeka has strengths for those seeking to batch upload items that include metadata (such as museum objects). It has a number of plug-ins that add functionality to a site, including the location mapping used in this project. Omeka has some limitations from a user perspective, such as an inability to export search results (for example, all meetings from a particular locale). There are alternative websites and content management systems that could be used for similar projects, and the reader should consider the most suitable and sustainable platform for their project needs and audience. As Navickas planned to use this plug-in to build a digital map, the above steps were designed so that the data created would be compatible with this tool. This included adding a column that specified the optimal zoom level of the map for display. Import was conducted using the instructions for the plug-in. The result was a digital map of Chartist meetings, which can be viewed on the project website.

To provide historical context to the landscape, Navickas overlaid a nineteenth-century map of Britain on the modern Google map used by the plug-in. The most easily available large-scale map was the first-edition Ordnance Survey map of the UK, available through the National Library of Scotland’s application programming interface (API) service. This API was compatible with the Geolocation plug-in through an intermediary service, ‘Leaflet’, which enabled the historic map to be tiled, layered, and displayed at different levels over the Google map (see the figure of the plug-in map). Readers need to consider the sustainability of third-party programmes for display and visualization. In July, MapQuest, the service providing the background map tiles, was discontinued, resulting in the base map becoming unavailable (see the figure with missing base map tiles).

Figure. Political Meetings Mapper geo-location plug-in map, using the Geolocation plugin for Omeka, http://omeka.org/add-ons/plugins/geolocation/ [accessed online February], and the Leaflet JavaScript library, http://leafletjs.com/ [accessed online February]. Meeting locations plotted on the first-edition one-inch-to-the-mile Ordnance Survey map of the United Kingdom, using the National Library of Scotland API, http://maps.nls.uk/projects/api/ [accessed online February], under a Creative Commons Attribution Unported licence.

Figure. Political Meetings Mapper map, with missing base map tiles caused by a change in the terms of use by MapQuest that unexpectedly affected the project, July. API at http://maps.nls.uk/projects/api/ [accessed online February] and used under a Creative Commons Attribution Unported licence. This demonstrates a clear lesson in digital sustainability.

Notes:
For an introduction to regular expressions, see Doug Knox, ‘Understanding Regular Expressions’, The Programming Historian, http://programminghistorian.org/lessons/understanding-regular-expressions [accessed online September]; Laura Turner O’Hara, ‘Cleaning OCR’d Text with Regular Expressions’, The Programming Historian, http://programminghistorian.org/lessons/cleaning-ocrd-text-with-regular-expressions [accessed online September].
For more on sustainability in digital projects, see ‘Software Sustainability Institute’, http://www.software.ac.uk/ [accessed online September].
‘NLS Historic Maps API – Historical Maps of Great Britain for Use in Mashups’, National Library of Scotland, http://maps.nls.uk/projects/api/ [accessed online July].
‘Leaflet JavaScript Library’, http://leafletjs.com/plugins.html; the code for the amended plugin is available at https://zenodo.org/badge/latestdoi/ /bl-labs/geolocation [accessed online September].
Lori Colston, ‘Modernization of MapQuest Results in Changes to Direct Tile Access’, MapQuest + Developer Blog (June), http://devblog.mapquest.com/ / / /modernization-of-mapquest-results-in-changes-to-open-tile-access/ [accessed online July].

VIII. Conclusion

In this project we have learned the advantages of taking a digital approach to newspaper sources. To take one example from the Chartist meetings column of a January issue of the Northern Star, the Red Lion public house in Golden Square, London, advertised its forthcoming meeting the following Saturday, a spirited lecture by Mr L.H. Leighs denouncing ‘free trade fallacies’. Using the digital project, historians can not only find out that Chartist groups were also gathering on that evening elsewhere in London, in the Hit or Miss public house in Mile End and in the Black Bull, Hammersmith (to celebrate the birthday of Thomas Paine), as well as all around the country; the project database also displays the much wider context for these meetings, situated in place and time. Historians can discover different and much broader connections than they could manually. How common were these meetings in those particular places? How were they spread across the city, and how did this change over time? Of course, the historian can answer some of these questions using traditional approaches. However, using digital methods enables them to support their conclusions with more confidence, with a much larger sample of meetings rather than, say, a hundred, and in a format that appeals to our visual and spatial faculties.

Note:
‘Red Lion, King-Street, Golden-Square’, Political Meetings Mapper, http://politicalmeetingsmapper.co.uk/maps/items/show/ [accessed online May].

Figure. Heat-map of the concentration of London meeting sites in the Northern Star, ‘Forthcoming Meetings’, created using QGIS and Stamen OSM tiles.

Figure. Chartist tailors’ meeting sites plotted on an extract of Richard Horwood’s map of London, British Library, Maps.Crace.V, geo-referenced and layered on Google Earth, http://www.bl.uk/onlinegallery/onlineex/crace/p/ zzz u .html [accessed online September].

So, for example, the data clearly displayed the wide distribution of Chartist meetings across London. London Chartism has been curiously under-studied compared to other regions of England, with the last major study of the metropolitan movement being David Goodway’s London Chartism. Mapping the meetings’ data showed the spread of Chartist branches and meeting sites across the city (see the heat-map figure), with particular concentrations in Soho, Shoreditch-Spitalfields and Southwark. It also demonstrated the concentration of trades’ branches in particular areas. For example, the tailors had several Chartist branches in Soho and the West End, where members of the trade worked and lived (see the figure of tailors’ meeting sites). The map confirmed the impression of London Chartism as an artisanal and trades-based movement with easy access to familiar and close-by meeting sites related to their trades’ activities (many of the sites were pubs also holding the box for their friendly societies and trade unions).
Chartist activities could therefore be characterized as part of the everyday rather than the extraordinary, drawing their strength from locality and proximity as well as from a wider delegate system across the city.

The project also gave an insight into the history of the newspaper and its reach in particular. Plotting the meeting advertisements showed that even though the Northern Star was published in Leeds, the spatial distribution of reporting in the paper was not just concentrated in the West Riding of Yorkshire and neighbouring south-east Lancashire. Plotting a heat-map of meetings reported in the database shows that the industrial towns in the Leeds to Manchester corridor, to a lesser extent those in the West and East Midlands, and more particularly London, were well represented in the coverage of advertised meetings. The strength of London reporting was unexpected. The Northern Star’s coverage of other areas was much weaker, and therefore scholars should compare reportage of meetings in other newspapers to glean the wider coverage of the movement across the country. Indeed, Gwent Archives is currently conducting a crowd-sourcing project to digitize and transcribe the Chartist newspaper Western Vindicator, which will provide valuable comparative material to fill this gap in our knowledge about Welsh Chartist meetings.

The project we have documented here involved a carefully planned workflow: acquiring, cleaning, geocoding, and presenting hundreds of meetings extracted from millions of words of mutable newspaper text. While this workflow allowed Navickas to understand Chartism better, it has the potential to help historians identify sets of relevant texts from within any wider corpora and transform them into mappable entities that can be shared as historical data sets or visualized and interpreted.
This article shares that workflow with the hope that it will facilitate the development of more historical data sets and a broader sharing of methods in historical research.

Scholars who study texts increasingly turn to computational analyses, be they based in linguistics, geography, or otherwise, and so there is a growing need to understand exactly what has been done to a set of records to produce a result. This is important not just to ensure quality and academic rigor, but also to spread these new workflows to scholars working on other time periods or places, and to stimulate responsible experimentation. By encouraging the documentation of workflows, we can put computers to work for us, so that we can pursue our real interests, which are answers to humanities questions.

Notes:
David Goodway, London Chartism (Cambridge: Cambridge University Press).
‘Unlocking the Chartist Trials’, http://chartist.cynefin.wales/transcribe [accessed online September].

Disclosure statement

No potential conflict of interest was reported by the authors.

Katrina Navickas and Adam Crymble
University of Hertfordshire
k.navickas@herts.ac.uk

Introduction to Digital Humanities

Rocío Ortuño Casanova, Universiteit Antwerpen
rocio.ortuno@uantwerpen.be

Any questions may be addressed to my e-mail or, even better, to the discussion section of the Humanities Commons group Dagitab: https://hcommons.org/groups/dagitab/forum/

We are about to sit here and spend a few days talking about digital humanities (DH).
In this introductory session, we are going to reflect on why we are going to do so and how DH can be useful for your own teaching and research and that of others.

Firstly, I would like to explain that this workshop is part of a project to be developed over a number of years. It is funded by VLIRUOS (https://www.vliruos.be/en/home/) and is being developed in partnership between the University of the Philippines Diliman and the University of Antwerp, in Belgium, although we intend to involve the whole UP system (or almost). The project has two parts:

- The first part focuses on the digitization of Philippine historical periodicals which are held at the University of the Philippines Diliman library. This part of the project is led by Chito Angeles, who will be talking about the digitization process and the repository that they are creating so that the general public can access the periodicals online.
- The second part consists of delivering a series of training sessions on digital humanities, focusing especially on text analysis, corpus compilation and distant reading, so as to be able to do more things (research- and teaching-wise) than conventional, non-digital scholarship allows with those newspapers and other interesting materials, especially those related to Philippine history, society, languages, etc.

Here you can find some information about the whole project:

- https://hosting.uantwerpen.be/philperiodicals/
- https://www.vliruos.be/en/projects/project/ ?pid=
- https://www.uantwerpen.be/en/research-groups/digitalhumanities/about/projects/vlir-uos/

So, the objectives of the summer course are: (see the slide)

The idea behind the “added value” is important. Digital tools are fashionable, and they can be very useful, but sometimes they are used to achieve things that could just as well be done without them. In those cases, using digital tools does not bring any added value. For instance, if you would like to find out the topics in a chapter of a book, you do not need to use topic modelling for that. You can do it just by reading that chapter. If you would like to find topics in a large number of books, you may want to use digital tools, as doing it without them would take longer and be less accurate. (That is: maybe. In some cases.)

Now that this has been clarified, let’s do a small quiz about DH to start explaining what this is all about.

* Tip for echoing this workshop: if you are going to use this Kahoot in your classes or workshops on DH, in the link on the slide above you can find the inner workings of the Kahoot. From there, you can clone the quiz and modify it, or you can choose “play as guest”. Then you will be able to log in with a Google account and choose whether you want your students to play as individuals or as a team. Choose either one of the options, and you will find the instructions for your students to join the Kahoot with their phones. They just need to go to the page www.kahoot.it and enter the PIN indicated on your screen (which should be projected so that the students can see it, along with the questions and answers).

Now, after each question, students will have a set number of seconds to answer. You can move down to see the results (who has answered correctly and who hasn’t) and the ranking of players by number of points (depending on their number of correct answers and their speed in answering the questions). Before proceeding to the next question, I would recommend stopping and explaining the answer.
The explanation is on the slides:

Question on Kahoot:

Explanation of the answer “humanist computing”: humanist computing, or humanities computing, is what digital humanities was called before it acquired its current name, but the contents and the ideas behind the name were the same as what we today call digital humanities.

Explanation of the answer “information technology”, in layman’s terms: according to Alan Liu, “digital” just means “technology + media + information”. You can favour one or another of the components according to your object of study and methodology. Although there has been some instability in the nomenclature for certain processes and methodologies, it seems that lately everything is becoming more integrated, and the tendency is to include information technology under the more inclusive umbrella of digital humanities (the idea of a “big tent”), or at least to move towards symbiosis between the two.

Explanation of the answer “computational linguistics”: in linguistics, a discipline that typically falls within the humanities, scholars were pioneers in the use of digital tools for their quantitative research. Many methodologies from linguistic research with digital tools, especially from the domain of computational linguistics or natural language processing, are nowadays being used in literary studies and in other fields such as history. In this way, we can understand the connection between computational linguistics, which is actually quite an old discipline, and digital humanities.

Explanation of the answer “gardening”: however, there are several other disciplines and activities that fall within digital humanities.
You can learn more about this in these three links, for instance:

- https://mkirschenbaum.files.wordpress.com/ / /ade-final.pdf
- http://computerphilologie.uni-muenchen.de/jg /unsworth.html
- https://cpb-eu-w .wpmucdn.com/blogs.ucl.ac.uk/dist/e/ /files/ / /chapter- _ev.pdf

Question on Kahoot:

This question involves a much more difficult and controversial question in the history of digital humanities, namely “what is digital humanities?”. Although we are not going into this debate in depth, as it is time-consuming, we can start by proposing a minimal agreement and a definition: (see the slide)

Therefore, any of the answers is all right, as long as it uses digital tools, except for just writing a document, as Word does not add anything to the writing itself content-wise. That is, you could write exactly the same without a computer and the data would be the same.

Regarding the mapping of Ulysses’s route… check this:

- https://blogs.carleton.edu/dh/ / / /making-a-humanities-lab-out-of-greek-mythology/
- http://omeka.wellesley.edu/mappingmythology/
- https://whatisdigitalhumanities.com

Question on Kahoot (multiple correct answers are possible):

Explanation of the answer: distant reading is one of the main theoretical frameworks for the use of digital tools. It means that we can extract data from texts even without reading them.
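To make the idea of extracting data from texts without reading them a little more concrete, here is a minimal sketch. It is only a toy under stated assumptions: the two sentences are invented stand-ins for a large corpus, the `word_counts` function is a hypothetical helper, and counting word frequencies is just one of the simplest possible forms of distant reading.

```python
import re
from collections import Counter


def word_counts(texts):
    """Count lower-cased word tokens across an iterable of text strings."""
    counts = Counter()
    for text in texts:
        # A deliberately crude tokenizer: runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Invented toy sentences standing in for thousands of pages.
corpus = [
    "The meeting was held at the hall.",
    "A meeting of the tailors was announced.",
]
counts = word_counts(corpus)
print(counts.most_common(3))
```

With two sentences this tells us nothing we could not see at a glance, which is exactly the “added value” point: the same few lines only become worthwhile when the corpus is far too large to read.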
What Franco Moretti argues in the quote that you have above is that we often tend to characterize a literature, an epoch, or a trend just by reading a few canonical texts. Now, the question is: canonical for whom? Literary history has been relying on a selection made with a certain bias (aesthetic, political, social, religious, you name it). Against the question of the canon, there is nowadays the possibility of taking loads of books and extracting information from them without reading them. This might sound like a pity, but it gives us another kind of interesting information. There are many ways of extracting information from big amounts of data (or texts). Here you can find some more information on distant reading:

NB: because of the controversial accusations against Moretti in the States, his work is being cited less and less: https://www.stanforddaily.com/ / / /harassment-assault-allegations-against-moretti-span-three-campuses/

- Ross, Shawna. ‘In Praise of Overstating the Case: A Review of Franco Moretti, Distant Reading (London: Verso)’. Digital Humanities Quarterly. http://www.digitalhumanities.org/dhq/vol/ / / / .html
- Moretti, Franco. ‘Graphs, Maps, Trees: Abstract Models for Literary History’. New Left Review, November. https://www.mat.ucsb.edu/~g.legrady/academic/courses/ w /moretti_graphs.pdf
- Distant reading explained in layman’s terms: https://www.nytimes.com/ / / /books/review/the-mechanic-muse-what-is-distant-reading.html

An example of this distant reading is a work that Moretti did on Hamlet.
You can find the graph of interaction of the characters below, and an explanation of it at this link: https://elenadigi.wordpress.com/ / / /distant-reading-vs-close-reading/

Explanation of the answer:

In my opinion and experience, digital humanists are a community of practice in which sharing and collaborating are highly regarded, unlike in many other academic fields. Usually, the humanist’s work is quite solitary and highly theoretical: you go to the library or the archive and you get your raw materials/texts/data. You make sense of that data by organizing it and relating it to other texts or materials, and then you publish your conclusions on the whole thing. The workflow in the digital humanities follows a similar path; however, there are two important differences:

1. Projects tend to be more ambitious (and often more multidisciplinary). Therefore, more people, with complementary backgrounds and expertise, are needed. Projects may involve different disciplines, address larger research questions and need more complex teamwork. The members of the team may be involved in only one component or in the whole process: they can be organizing data from texts, for instance, or making sense of it, or testing that data in different ways.

2. Once the data has been extracted, organized and prepared for examination, it can be shared so that other researchers can apply different kinds of methods to it.
Even more interestingly, the “tools” created to approach the data (digital tools: say, some app, algorithm, or piece of programming) are also usually shared, to be applied by different researchers to different data. I have two examples for you:

a. The first one is the Stylo package, about which Mike Kestemont, one of the creators, will talk in the next few days: https://sites.google.com/site/computationalstylistics/stylo. It is a “tool” to be used in R for finding out about a text’s authorship and writing style.

b. The second one is the textbox of CLiGS, a research group in Germany working on some sorts of distant reading of literary texts: https://github.com/cligs/textbox. In their GitHub repository (GitHub is a very popular platform for sharing and developing chunks of code, materials, information about projects, and all sorts of things) they have uploaded the texts they are working with in different formats, after having OCR’d and “cleaned” those texts. Some of them are also tagged and annotated. You may want to use them for performing some sort of analysis, for having a corpus to compare with your own, or for training tools…

Just out of curiosity, you might want to have a look at other GitHub repositories like Mike’s http://mikekestemont.github.io/ or Enrique’s https://emanjavacas.github.io/ (these two are a bit more fancy: they are repositories within their own websites, also created with GitHub).

All this has to do with a set of values that are important to practitioners and a kind of identity imprint for the discipline. These values have been discussed and developed in a book series called Debates in the Digital Humanities.
You can see the link to the whole article at the bottom of this coming slide: http://journalofdigitalhumanities.org/ - /who-you-calling-untheoretical-by-jean-bauer/

Now, the last two answers are quite wrong. Firstly, because no machine does your job. Your job is the human part in digital humanities: you need to make sense of the data, explain it in context, and reach conclusions. Saying that computers are doing the whole research job would be like saying that microscopes and test tubes are doing the whole research job for biologists. The last answer is wrong because, well, fashion comes and goes and is all about perception. Much ado about nothing. Do not invest in fashion for the long term.

Now, I have an extra question outside the Kahoot for you all. Let’s get local and brainstorm a bit: (see the slide)

I am talking about this as someone who has been working on the Philippines for a few years, and for most of those years without being physically present in the Philippines. So, no magical recipes, just my experience. The first thing that caught my attention when approaching Filipino literature in Spanish was the fact that not much had been written about it. People working on postcolonial studies would rarely refer to the Philippines, and it was totally outside the circuit of studies on literature in Spanish. Wondering about the reasons for this, I realised that people from abroad (Spanish speakers) had difficulty reaching the texts. I myself could not access the literary texts I was interested in when working from England. That can also happen if you are working from Davao, or from Iloilo, or from Batanes: most of the materials are gathered in a few archives and libraries in Metro Manila.
The second reason was that, even for Filipino researchers, it was difficult to access and analyse those texts because they could not understand them anymore. Nobody speaks Spanish in the Philippines, right? And there are so many rich literary traditions in other languages that it is not really a concern. So, given these problems, I thought that digitization (as a first step) may contribute to: (see the slide)

Regarding research using digital tools, some ideas that came to my mind were: (see the slide)

Can you think of other answers?

Question on Kahoot (multiple correct answers are possible):

This tricky question just aims to show you a few projects related to digital humanities in different ways and from different fields of knowledge, which may give you a better idea of what this is all about: what kinds of questions you can answer using digital tools and the kinds of materials that you can analyse. I have prepared a summary of each project, but you can also consult the website and the output papers produced by following the links at the bottom of the slides. The explanations of the four answers are on the slides.

Not everything is so beautiful, and therefore it might be useful to have a look at why digital humanities is also criticised (heavily) by some sectors of academia. Some ideas about the current criticism of digital humanities can be found in this article: https://mkirschenbaum.files.wordpress.com/ / /dhterriblethingskirschenbaum.pdf and in this one: https://www.journals.uchicago.edu/doi/abs/ . /

Beyond this, there are also debates on different aspects of digital humanities, which leads us to the last question:

Question on Kahoot (multiple correct answers are possible):

Explanation of the answer: this is a question that was the title of one of the papers in the Debates in the Digital Humanities series. The author, Tara McPherson, from Minnesota, addresses what she thinks is an internal division within the discipline: the digital side and the humanities side. She feels that, in her experience, DH practitioners tend to be more aligned with one of these sides. She also thinks that those more aligned with the “digital” side tend to be suspicious of those more on the humanities side, and vice versa. She advocates closing the gap “from diaspora to database, from oppression to ontology, from visual studies to visualizations”.

From decolonial studies there is an even more complex debate that involves the centres of production of digital tools and the centres of training in digital methodologies (normally in the countries of the so-called North): how digital humanities are expensive and, therefore, not as democratic as they intend to be, and how digital humanities, being created, developed and taught from the North, deal with northern concerns and give little space for other realities to bring in their own questions and answers.

Explanation of the answer: the eternal September of digital humanities refers to the fact that there are always new practitioners who challenge the marked paths, and continuous new beginnings in the discipline. But (and these are my words) there is also a problem of repetition: people are creating tools for doing things that previous tools already did, and wondering about questions that others have already answered. As this is not such a new discipline, it might be useful to start by surveying the state of the art rather than assuming that you are a pioneer.

Explanation of the answer: indeed, the digital allows ways of breaking academic barriers and even of transferring knowledge from academia to a wider public.
it also has to do with the “value of sharing” that we were discussing in question number and the “values” of dh.

explanation to answer number :

ehhh… the slide is quite self-explanatory, i think. although i would think that digitization is green, but dh research not so much.

these are some of the topics being discussed around dh, but not the only ones. other topics can be found in these open access books and articles:

now, if those were all the questions we intended to answer in this introduction, it might be useful to come down to the local again and, in the spirit of this presentation, ask about digital humanities in asia. we have some initiatives. the focus is moving away from the us/canada, australia, and europe, which were probably the three main foci of dh work. regarding asia, i found some initiatives:

as i said, digital humanities are expensive, and the initiatives that i found are taking place in richer countries. so, besides singapore, well, there are also some initiatives in the philippines, actually. although mr chito angeles will be talking about this in more depth, here is a teaser of some of the projects going on, which you can look up on their websites if you are interested (just search for them on google):

and finally, some bibliography:

maraming salamat po (thank you very much).

this work is licensed under the creative commons attribution . international license. to view a copy of this license, visit http://creativecommons.org/licenses/by/ . / or send a letter to creative commons, po box , mountain view, ca , usa.

afterword

gabriel egan
school of humanities, de montfort university, leicester, england
at certain points in history, certain words take on a positive aura that makes it difficult to openly express dissenting or sceptical views about the objects, processes, or qualities they denote. right now, social has this aura. this word’s role as a modifier to make the noun after it refer to society and other kinds of human association—as in social law and social life—emerged at the end of the sixteenth century (oed ‘social’ adj. a, b). most recently, the word has attached itself to a relatively new word, media—first used to denote mass communication in (oed ‘media’ n. )—to denote a new kind of technology of communication. whereas the ordinary media provided only one-way, one-to-many, communication, the social media allow ‘users to create and share content or to participate in social networking’ (oed ‘social’ special uses s ‘social media’ n.).

in the field of textual editing, being social is not so new. the french theorists of the s roland barthes and michel foucault considered authorship itself to be an inherently social phenomenon. for barthes, texts were not spun like webs out of the solitary minds of lone individuals but rather woven together from existing ideas and sayings: ‘the text is a tissue of quotations drawn from the innumerable centres of culture’ (barthes, , p. ). according to foucault, we are thinking about creativity itself in the wrong way if we concern ourselves too much with authors as persons, for in truth we as readers collectively construct the author to suit what we want to do with the text. in this view, we have to speak not of the author but of the author function that we use to constrain the range of interpretations that a text may be subject to. these reader-constructed authors become ‘the principle of thrift in the proliferation of meaning’ (foucault, , p. ), saving us from outlandish misinterpretations.
in their original french-language publications—much translated and anthologized—foucault’s essay was a direct response to barthes’s, and their shared aim was a thorough transvaluation of the notion of authorship by socializing it (barthes, ; foucault, ).

in literary studies of authorship, this french post-structuralist and post-modern view still holds considerable sway, although research in computational stylistics is showing that in fact authorship is a good deal more personal and less socialized than barthes and foucault had us believe (craig, - ). the claim that authorship is a fundamentally social phenomenon became popular in the fields of book history and textual scholarship with the publication of jerome j. mcgann’s a critique of modern textual criticism (mcgann, ). mcgann argued that we see the idea of socialized creativity in practice most clearly when we think about how literary works reach their readers: ‘the production of books, in the later modern periods especially, sometimes involves a close working relationship between the author and the various editorial and publishing professionals’ (mcgann, , p. ). these various others, apart from the author, whose labour goes into making a book—including its printers—should not be seen as contaminating the work (as a previous generation of textual scholars believed) but rather as completing the authorial intention. d. f. mckenzie offered a practical illustration of this claim in jacob tonson’s edition of the works of william congreve, designed by master printer john watts, who made extensive use of typographic distinctions to embellish scene divisions. according to mckenzie, this landmark edition must be seen as an active collaboration between the author, the publisher, and the book designer, and hence only a notion of the book as a social object can fully account for its meanings. to respond to this reality, mckenzie called for a ‘new and comprehensive sociology of the text’ (mckenzie, , p. ). like the french theory from which it derives, these anglo-american notions of the socialized text are susceptible to considerable critique in practice and they often overstate their claims (egan, , pp. – ; egan, ).

correspondence: gabriel egan, school of humanities, de montfort university, leicester, england. e-mail: gegan@dmu.ac.uk

digital scholarship in the humanities, vol. , no. , . © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqw

the essays in this special issue invite us to consider the notion of social editing, and just as one would hope from thoughtful experts, they are all undazzled by the idea’s fashionable aura and think through carefully what it means for that adjective to qualify that noun. but what exactly is social editing? scholarship by mcgann and mckenzie in the s told us that the text itself is inherently social, so what implications might that have for a socialized approach to editing the text? might social editing be a portmanteau term invented merely to enable the staid scholarly endeavour of editing to dress itself in web . ’s gladrags? the strongest claim so far made for the endeavour of social editing (siemens, et al.
) invokes social media in its title and describes the dispersal of editorial authority in terms that are strikingly similar to those previously used to describe the dispersal of authorial authority, first by the french theorists in the s and subsequently by mcgann (whom it repeatedly cites on the nature of textuality), and mckenzie, and others. what is new are the opportunities offered by the latest technologies of connectivity: it is now practicable to share out the work that was formerly done by one scholar or a small team of them. but here a potential contradiction arises. if the first wave of authority-dispersal theorists were right—if the authority of a text was always already (to use one of this school’s favourite expressions) dispersed before its first readers clapped eyes on it—then what authority remains to be dispersed in the editing? in a penetrating survey of the claims made for a social turn in textual studies, peter robinson is deeply sceptical that the new technologies fundamentally alter the power relations between authors, readers, editors, and critics, and he is excoriatingly blunt in his conclusions that: ‘. . . neither ‘‘social text editions’’ nor ‘‘social editions’’ exist and that the phrase ‘‘social editing’’ is misleading’ (robinson, - ).

social editing can mean the eliciting of the contribution of labour from the general public during the creation of an edition, for example in transcribing primary documents. it can mean the eliciting of scholarly (rather than public) collaborative input during the creation of an edition. it can mean the eliciting of scholarly debate and reuse during the consumption of an edition. and it can mean eliciting public debate and reuse during consumption of an edition. new technologies for scholarly publishing—the ubiquitous xml markup and dissemination via the worldwide web—have not merely enabled scholars to be collaborative in their editorial labours, they have positively demanded it.
this is because most textual scholars do not know how to use xml or publish online and need training in these ways of working. according to robinson, the large accumulations of technical expertise in centres such as the institute for advanced technology in the humanities in virginia, the humanities text initiative at ann arbor, michigan, the king’s college london centre for computing in the humanities, and the maryland institute for technology in the humanities are not an efficient way for digital scholarly editions to get made (robinson, ). he points out that much scholarly expertise in textual matters remains embodied in the minds and labours of lone scholars who are unlikely ever to acquire the resources—the grant awards, the sabbaticals—that are needed to take up a residential course in xml and related technologies at such a centre. we need better ways of harnessing lone scholars’ textual expertise.

in his essay in this special issue, on ‘project-based digital humanities and social, digital and scholarly editions’, robinson observes that for most of the history of scholarly editing, we did not need such large centres nor did we organize ourselves into projects. yet scholars were still being social because the very means of scholarly communication are inherently social. in robinson’s example, lone scholars working individually elucidated the opening lines of dante alighieri’s inferno by each providing individual parcels of knowledge—on the calendar, on the writer’s biography, and on the movements in the cosmos—that cumulatively illuminate dante’s poetic purpose in these lines. this, according to robinson, is a way of being social that has served us well for many years. the new technologies certainly offer us new possibilities, according to robinson, but they are best exploited not in big projects organized within big centres but in genuinely dispersed scholarly labour. for this, wikipedia provides the most well-known model, but the underlying principles are embodied in the internet and the worldwide web themselves as vast collaborative endeavours running on simple open standards and based on the assumption that humans tend towards intellectual generosity rather than hoarding. these principles keep the bar for engagement as low as possible. writing that is circulated in print has long enabled collaborative, that is social, endeavours between scholars who never meet. we might think of the big institutional centres of digital research as somewhat like the medieval monasteries, with their special textual expertise and means of reproduction.
in this analogy, the lone scholars are like the many potential intellectuals of medieval europe who could not enjoy the life of the mind because they lived and worked outside of these institutions. in the somewhat disputed history of technology offered by elizabeth l. eisenstein, the technology of print itself was the catalyst that brought us the renaissance by ending this institutional dominance (eisenstein, ; cf. mcnally, ). less contentiously, we can at least acknowledge that the circulation of the catalogue of the frankfurt book fair created a kind of social network among the thinkers of early sixteenth-century europe who thereby knew—at least insofar as they could infer it from book titles—just what other members of the group were working on (wootton, ; wilding, ).

making sure that everyone knows what you are working on is the theme of murray mcgillivray’s essay in the present issue called ‘‘‘why don’t we do it in the road?’’: the case for scholarly editing as a public intellectual activity’. he finds that the old ways of working were ‘anti-social’, and as such is the only contributor to use that antonym of this issue’s key term. just as bad, according to mcgillivray, the old ways of doing scholarly editions do not meet the political and social agendas that dominate university life in the twenty-first century: we are in danger of simply not being allowed to do them any more. to counter this threat, we should stop doing our editions in secret, says mcgillivray, and we should display our activities for all to see. this does not mean crowdsourcing the construction of the edition itself, but simply revealing our working processes and publishing parts of the edition as they are completed. mcgillivray describes two of his own projects that have proceeded like this, and he is frank about this method’s necessary public disclosure of the imperfect documents made, and of the abandoned blind alleys followed, along the way.
mcgillivray recommends using the scientists’ notion of minimum publishable units as a way of giving to junior individuals—students, fixed-term researchers—the credit they deserve by explicitly self-publishing their contributions to the project. opening up the creation of an edition to the world’s public in this way gives that public an opportunity to answer back, and mcgillivray recounts valuable textual corrections that resulted from his approach. this is not quite the engagement of the public in the creation of an edition proposed by ray siemens, but it goes some way towards it (siemens, et al. ).

involving students and fixed-term researchers in the creation of editions is one thing, but surely bringing in the public at large is a recipe for disaster. peter shillingsburg, in ‘reliable social scholarly editing’, worries out loud that crowdsourcing some of the editorial work such as the proofreading of transcriptions might be just giving in to laziness and that it necessarily constitutes a threat to the maintenance of high quality. shillingsburg expresses a widespread scepticism that we can ensure the quality control needed to exploit the free labour of the crowd without admitting egregious errors into our editions. somebody, somewhere needs to be checking what is being done, and surely it is still true that ‘. . . what is everyone’s job is no one’s job’. the place where shillingsburg least objects to the public having a role is in ‘the analytical and explanatory commentary and critical engagement with works’. these are opinions, so in a sense they cannot be wrong.

paul eggert too sees problems in the model of editorial crowdsourcing proposed by siemens, and his essay’s title ‘the reader-oriented scholarly edition’ indicates the kind of thinking that he believes is needed to avoid them. eggert proposes that we conceive of the scholarly edition as a transaction with the reader rather than as a model of what the text really is. eggert gives an account of the post-war tension between the german editing tradition in which it was not permitted to mix readings from different witnesses—each witness was presented as a coherent singularity and its differences from the others recorded—and the more eclectic anglo-american editing tradition. in this narrative, literary critics have largely ignored editorial work on textual variation because either they just wanted a singular reading text to interpret (as did the new critics) or they entirely distrusted the categories used in that editorial labour, such as author, intention, and even the work, and treated everything—including things never written down—as a kind of text (as did the literary theorists).

understood as a transaction with the reader, writes eggert, the scholarly book has to be constructed with a particular market in mind, and we have to answer questions such as whether the identified readership needs a clean reading text or should be given some sense of the text’s genesis, for example by presenting alterations in situ.
as eggert asks, should we assume the existence of readers ‘who can cope with information needing to be decoded rather than just straightforwardly read’? if we do assume this—and as editors we are temperamentally inclined to—then the market for our editions gets smaller, and eggert thinks that in the print medium this reduction in market size may be unsustainable. perhaps digital editions can help us by separating the archive, on the evidence of which the textual choices are made, from the reading text itself, which is thereby made freer to engage in broader critical debates. eggert conceives of a digital edition being just ‘. . . a list of emendations, supported by justifications for them, of one or more of the texts already stored within the digital archive’. thus the edition is ‘an argument directed at the reader about the archive’, and this model restores the transactional relationship.

reflecting on this suggestion as one of the general editors of the forthcoming new oxford shakespeare complete works, it occurs to me that we could directly apply it to our original spelling texts but not to our modernized spelling texts, since in the latter we depart from the forms in the archive for most of the words. shakespeare and the dramatists of his time are rather an editorial oddity in this regard. english writings from just before shakespeare’s time are so unlike modern english that scholars do not consider modernizing them for other scholarly readers. a modernized chaucer, for instance, is only ever created to provide a crib to help students learn middle english or else to attract lay readers to this field. on the other hand, writings from shortly after shakespeare’s time are widely considered to be so like modern english as to need no modernization. shakespeare and his contemporaries lie in between these periods and are now routinely modernized for lay and scholarly readers.
yet, the great textual theorists of the twentieth-century new bibliography generally opposed the modernizing of shakespeare, and the view that it is unnecessary is still occasionally expressed even today. the linguist david crystal reckons that with only – % of the words and only % of the grammatical constructions in early modern english being substantially different from those of modern english, today’s readers get a good-enough sense of what shakespeare meant from an unmodernized text and that to go further specialist study is in any case required (crystal, ). however, since the publication of stanley wells’s scholarly argument for modernizing shakespeare’s spelling, which included his guide on how to do it (wells and taylor ), the case for original spelling editions is seldom made, and the remaining scholarly arguments revolve around particular words that present special obstacles to modernization (bevington, ).

when planning a scholarly edition, the mere fact that it is to be a digital edition should necessarily put the social aspect in a new light. one might try to be social by broadening the contributor base to bring in more scholars than would normally be involved, without letting in anyone else. but according to joris van zundert, even a few too many scholars can spoil the broth, not least because some of them—especially the non-digital ones—might not be able to see beyond the existing conceptual model of the printed book. in ‘the case of the ‘‘bold’’ button: social shaping of technology and the digital scholarly edition’, van zundert complains that we are still essentially making digital versions of books rather than editions that could only exist as digital editions. van zundert describes the makers of an xml annotation tool at his institution giving in to the scholarly editors’ request to implement a ‘bold’ button, allowing annotation of a section of text to show that it appears in boldface type in the documentary witness. this van zundert thinks was a mistake, as it constituted a reversion to a metaphor from the older textual form—the printed book—in place of a forward-looking consideration of what is possible in the new digital medium.

van zundert calls the ‘bold’ button error an example of the endemic ‘paradigmatic regression’ that plagues all our efforts. the last really big leap forward in the ability of new technology to express the true essence of text was, he argues, the hypertext reference (href) property of html’s <a> (for anchor) element. the hyperlink gave us for the first time a way to embody the interconnectedness of texts. but digital editions have not in general used hyperlinks to point to things outside of themselves and confine their use to internal linking. this is true, but i would say that we must blame the inadequacies of our current ways of handling external linking: the domain name system, hypertext transfer protocol (http), and more recently digital object identifiers (dois).
the last of these, as applied to the problem of scholarly referencing by the crossref consortium—a non-profit publishers’ organization initiated at the frankfurt book fair in —might one day solve the linking problem. however, there is no essentially new technology at work here: the doi system and crossref merely formalize the apportioning of responsibility for the maintenance of the records that keep cross-references alive. according to van zundert, to really think big about this topic, we need to provide users with the application programming interfaces (apis) to our editions; doing this will be the final and essential break from the book metaphor. but what of the analyses that the api-driven, distant-reading model promotes? van zundert characterizes them as ‘lossy’ and ‘reductive’ when compared to close reading. i would object here that in fact all interpretations—distant and close—are equally lossy but in different ways. criticism is necessarily reductive and that is a good thing, since the only non-reductive account of a text being interpreted is that text itself. van zundert thinks that our scholarly editions need to narrow the widening gap between close and distant reading. one way, he suggests (without pushing it as a universal panacea), is to consider texts as what computer scientists sometimes call ‘graphs’: that is, chains of ‘nodes’ (say, words) connected by ‘edges’ that represent their relationships.

roger osborne, anna gerber, and jane hunter seem to have avoided the kind of error that van zundert discloses in the making of their australian electronic scholarly editing (austese) workbench software for collaborative editing, as described in ‘archiving, editing, and reading on the austese workbench: assembling and theorising an ontology-based electronic scholarly edition of joseph furphy’s such is life’.
they describe the history of the publication of furphy’s novel and the complexities of revision that make a critical edition particularly desirable. the austese workbench software is meant to enable non-technical editors to work digitally, and its main contribution seems to be that it allows us to describe artefacts (such as manuscripts, typescripts, and editions), events (such as the writing of revisions), and persons (such as publishers and authors) in the life of the literary work being edited, and to indicate how these various entities relate to one another. if i understand it correctly, this identification of phenomena is rather like that in peter robinson’s textual communities software, the alignment with which suggests that investigators are happily converging on particular ways of thinking that will take us past the intellectual impasses that several contributors here identify in the state of the art of scholarly digital editing. apparently, the objects in the austese workbench can be as small as single pages in a book, so it is possible to describe in detail how an author revised a work by transposing material. i would have thought, however, that much finer granularity than the single page would be needed for most attempts to account for transposition-in-revision. in my field, shakespeare’s writing, we find alterations to, and transpositions of, individual words and even letters as the reviser hunts for the precise bon mot: ‘too sallied flesh’ (hamlet, - ) versus ‘too solid flesh’ (hamlet, ) for instance. and what if the change is not even shakespeare’s own but someone else’s? we have experience in recording the changes that multiple hands make to a work, of course, as pointed out by meg meiman in ‘documentation for the public: social editing in the walt whitman archive’. as she observes, we are quite used to figuring out just how to record the multidimensional changes to an xml document when many people work on it, and so in a way we have already achieved a degree of social editing. the very headers of our machine-readable documents contain this sort of information, and—as meiman implies without stating it so baldly—we perhaps are making a meal of things when we treat the multiple hands and multiple revisions present in our primary texts as if they present an almost intractable intellectual problem. touché! or as osric put it, ‘a palpable hit’.

if we are going to undertake crowdsourcing of some of the work in scholarly editing, what does practical experience tell us to plan for? kenneth price, in ‘the walt whitman archive and the prospects for social editing’, reckons that crowdsourcing efforts work best when there are no tricky conceptual questions at stake, no training is needed, and when we have mountains of simple, repetitive labour to complete and the vetting procedures can be made simple.
the project to crowdsource the transcription of jeremy bentham’s works found that there was an extraordinarily long tail to the volunteer profile: thousands of people transcribed just one or two documents and a handful of them transcribed many hundreds. like other essayists in this special issue, price is sceptical of siemens’ suggestion that editorial authority can also be socialized, and he asks of the bentham contributors ‘what was the quality of their contributions?’ and were they in fact not ordinary citizens as the project hoped but ‘other scholars, perhaps not affiliated with the project but nonetheless possessing expert training in early modern texts’? price considers the ideas of other investigators, including martin mueller, who hope to bring in masses of students to get undergraduate credit for their performance of ‘lapidarian’ tasks such as ‘proofreading, checking part-of-speech tagging, and correcting or creating a cast list’. offering degree-level credit might, it seems, act as an incentive to maintain high quality in the labour. we are, of course, only at the beginning of our exploration of the possibilities of social editing and it occurs to price that such experiments might have unanticipated spin-off benefits. for example, if we leave open a public poetry-annotating site for several decades, we would end up with a useful snapshot of changing public perceptions around various topics. our secondary material might turn into a social historian’s primary material.

all the essayists here are agreed that new technologies are changing our ways of thinking about our scholarly editing activities. for allison muri, catherine nygren, and benjamin neudorf (‘the grub street project: a digital social edition of london in the long eighteenth century’), one of the most important changes might be a departure from our traditional fixation with the author.
the grub street project aims to be a ‘collaborative social edition of eighteenth-century london’ itself, bringing together texts and images about books, artworks, people, places, and trades. there is a relational database holding all the data together and they have texts as transcribed by the eighteenth century collections online text creation partnership. but why is it an ‘edition’ not an archive? the authors explore the limitations of our standard nomenclature. digital archives, they argue, are themselves oddly metaphorical in using that name, since they do not really preserve anything in the way that professional archivists would understand in relation to their preservation of physical documents. (actually, i would contest that assertion: keeping old digital files useable is a kind of preservation.) moreover, many of us are no longer especially author centric even when we work on one writer: we acknowledge that writers exist in social networks that enable the reading of their words.
so, it does not make sense, this essay’s authors argue, to confine the word edition to works by one author. like the place name grub street itself (a real location in london and an imaginary place of low culture and despicable behaviour), the term edition is freighted with connotations about how people interact with one another that take us far beyond its simple denotation. as the french theorists told us, texts are inextricably embedded in wider social practices.

some of those wider social practices can seem to be ranged in direct opposition to our efforts. this is the topic of wout dillen and vincent neyt’s ‘digital scholarly editing within the boundaries of copyright restrictions’. they start with robinson’s exhortation to digital scholarly editing projects that they drop the non-commercial and no derivatives qualifiers that are often put on to a creative commons licence. the trouble is, dillen and neyt observe, that the editors might well not possess those rights that an attribution and share-alike licence would give away. a case study for this problem is the samuel beckett digital manuscripts project, for which the primary documents are in libraries in different national jurisdictions and so are subject to differing copyright restrictions. the beckett estate requires that the project put the materials behind a paywall, which virtually everyone in academia finds objectionable. dillen and neyt detail the other irksome restrictions that must necessarily be accepted by editors of materials that are encumbered by copyrights unless we are willing to just give up working on these subjects altogether. they observe that we can almost always safely give away our own project documentation files, and also if we use the text encoding initiative (tei) standard we can give away the one document does it all file that describes the schema used for the encoding; these actions go a long way towards helping others understand what we have done.
moreover, even copyright materials themselves may be reproduced under the fair use doctrine (called fair dealing in the uk), and dillen and neyt offer a couple of notable examples while cautioning that this principle merely provides a possible line of defence for those subject to a legal challenge from rights holders. no academic wants to have to actually fight such a case, and the law, being thus weighted towards rights holders, probably makes us much too timid in the exercising of our fair use/dealing rights.

encroached upon by rights holders from one side and on the other by political and institutional leaders who cannot easily see the value of a new edition of the writings of a dead author, the scholarly editor is in an invidious position. the long-term economic viability of our traditional allies, the commercial publishing houses, is uncertain, and there are undoubtedly some politicians who would regard their disappearance and ours as no bad thing. this is not because these politicians believe that electronic dissemination is better than print dissemination, but because they believe that we scholarly editors have nothing of great value to offer society. in the idea of ‘wisdom in the crowds’, some people would see an alternative to the putative wisdom of the scholar. from this perspective, the democratization that comes with crowdsourcing aligns discomfortingly well with what in the uk is called the impact agenda, which may not unfairly be characterized as a rather brusque enquiry of ‘what have you done for us lately?’, addressed to academics by those whose taxes pay for our research. the question is in fact a fair one, so long as we have the confidence to give it a considered response rather than slip into the habitual insecurity of our profession.
as terry eagleton remarks in his memoir the gatekeeper, middle-class academics ‘have a problem about patronizing the working class and worry about their posh accents’, whereas ‘working people themselves are usually quite prepared to accept them if they have something useful to offer’.

this observation is illustrated by an anecdote about an oxford academic who was invited to deliver a lecture at ruskin, the oxford trade union college, and who began with the typically donnish, self-deprecating ploy of claiming to know very little of the subject in question. a voice from the back boomed out in a rich lancashire accent ‘tha’art paid to knoow!’ (eagleton , pp. – )

there is no shame in knowing more about something than other people do, of course, but the point of the anecdote is that academics are in this position because they have an economic role in society, even when (as in eagleton’s case) they are committed to fundamental social change to transform that economy. none of the contributors to this special issue, certainly not those actively involved in crowdsourcing aspects of the editorial process, takes the view that the wisdom of the crowd surpasses that of the paid expert.
but those like shillingsburg who worry that such an idea might underlie some people’s conception of a social turn in scholarly editing are right to be worried. there has been a general devaluing of scholarly expertise across the western democracies in recent years, and the impact agenda and its expression in such things as the uk’s research excellence framework are symptoms of a political desire to hold academics to a merely economic accountability. at their most extreme, the instincts at work here arise from a managerialist, business-like approach to what happens in universities. in short, there is a discounting of scholarly knowledge except where it can directly be assigned a value by commercial exploitation. as any marxist would predict, the new technologies are double-edged in that regard, for as well as enabling the monetization of scholarly expertise they enable the scholars themselves to directly reach the great many ordinary readers around the world who value scholarly expertise for its own sake and not in monetary terms. as described in this special issue, there are opportunities for expert individuals to bypass the usual commercial and institutional channels for scholarly interchange and to involve their readers more directly in their editorial practices. the new technologies can enable a new compact between editors, as the expert curators and disseminators of extraordinary writings, and the worldwide readerships that want to read them.

references

barthes, r. ( ). la mort de l’auteur (the death of the author). mantéia, : – .
barthes, r. ( ). image-music-text. heath, s. (trans.). london: fontana.
bevington, d. ( ). modern spelling: the hard choices. in erne, l. and kidnie, m. j. (eds), textual performances: the modern reproduction of shakespeare’s drama. cambridge: cambridge university press, pp. – .
craig, h. ( - ). style, statistics, and new models of authorship. early modern literary studies, ( ): .
crystal, d. ( ).
to modernize or not to modernize: there is no question. around the globe, : – .
eagleton, t. ( ). the gatekeeper: a memoir. london: penguin.
egan, g. ( ). the struggle for shakespeare’s text: twentieth-century editorial theory and practice. cambridge: cambridge university press.
egan, g. ( ). what is not collaborative about early modern drama in performance and print? shakespeare survey, : – .
eisenstein, e. l. ( ). the printing press as an agent of change: communications and cultural transformations in early-modern europe (volumes and ). complete in one volume. cambridge: cambridge university press.
foucault, m. ( ). qu’est-ce qu’un auteur? (what is an author?). bulletin de la société française de philosophie, ( ): – .
foucault, m. ( ). what is an author? in harari, j. v. (trans.); davis, r. c. and schleifer, r. (ed.), contemporary literary criticism: literary and cultural studies, rd edn. new york, ny: longman, pp. – .
mcgann, j. j. ( ). a critique of modern textual criticism. chicago: university of chicago press.
mckenzie, d. f. ( ). typography and meaning: the case of william congreve. buch und buchhandel in europa im achtzehnten jahrhundert: fünftes wolfenbütteler symposium vom bis november [= the book and the book trade in eighteenth-century europe: proceedings of the fifth wolfenbütteler symposium november - ]. edited by giles barber and bernard fabian, hamburg, hauswedell, pp. – .
mcnally, p. r. (ed.) ( ). the advent of printing: historians of science respond to elizabeth eisenstein’s “the printing press as an agent of change.” montreal: mcgill university graduate school of library and information studies.
robinson, p. ( ). how we have been publishing the wrong way, and how we might publish a better way. in egan, g. (ed.), electronic publishing: politics and pragmatics. new technologies in medieval and renaissance studies. toronto: medieval and renaissance texts and studies (mrts) and iter, pp. – .
robinson, p. ( - ). chapter . social editions, social editing, social texts. in nelson, b. and cunningham, r. (eds), digital studies/le champ numérique. an online journal : subsidium on ‘beyond accessibility: textual studies in the twenty-first century’.
siemens, r., timney, m., leitch, c., koolen, c. and garnett, a. ( ). toward modeling the social edition: an approach to understanding the electronic scholarly edition in the context of new and emerging social media. literary and linguistic computing, : – .
wells, s. and taylor, g. ( ). modernizing shakespeare’s spelling, with three studies in the text of henry v. oxford: clarendon press.
wilding, n. ( ). “the strangest piece of news”: review of david wootton watcher of the skies (new haven ct: yale university press, ) and j. l. heilbron galileo (oxford: oxford university press, ). london review of books, : – .
wootton, d. ( ). “traffic of the mind – facts, theories, theories of facts: the scientific revolution and a ‘forty year struggle not to be confined by yesterday’s questions’”: review of robert s. westman the copernican question: prognostication, skepticism, and celestial order (berkeley ca: university of california press, ) and steven shapin and simon schaffer leviathan and the air-pump: hobbes, boyle, and the experimental life new edition (princeton nj: princeton university press, ). times literary supplement number, : – .

uva-dare (digital academic repository), a service provided by the library of the university of amsterdam (https://dare.uva.nl)
global encounters, local places: connected histories of darjeeling, kalimpong, and the himalayas: an introduction
harris, t.; holmes-tagchungdarpa, a.; sharma, j.; viehbeck, m.
doi: . /heiup.ts.
document version: final published version
published in: transcultural studies
license: cc by-nc
citation for published version (apa): harris, t., holmes-tagchungdarpa, a., sharma, j., & viehbeck, m. ( ). global encounters, local places: connected histories of darjeeling, kalimpong, and the himalayas: an introduction. transcultural studies, ( ), - . https://doi.org/ . /heiup.ts.
global encounters, local places: connected histories of darjeeling, kalimpong, and the himalayas: an introduction

tina harris, university of amsterdam; amy holmes-tagchungdarpa, grinnell college; jayeeta sharma, university of toronto; markus viehbeck, ruprecht-karls-universität heidelberg

majestic himalayan landscapes and colonial-era attractions continue to entice domestic and global tourists to the picturesque mountain towns of darjeeling and kalimpong. occasionally, a travel writer gushes over their quaint charms, but national and international media highlight them only when beset by natural disasters or political turmoil. local inhabitants fret about the inadequate infrastructure and indian state neglect that drives their youngsters away to global mega-cities such as mumbai, hong kong, or london. this present marginality stands in sharp contrast to the nineteenth and twentieth centuries, when darjeeling and kalimpong were internationally notable hubs for mountain exploration, commodity trade, religious innovation, and great game politics. until the mid-twentieth century, readers of international publications such as the times of india, the scotsman, the new york times, harper’s magazine, south african outlook, and the berliner volkszeitung regularly encountered darjeeling as a bucolic tea-growing destination and a colourful mart on the edge of the world’s highest mountains, or its neighbour kalimpong as the embarkation point for everest expeditions and the headquarters of the wool trade from tibet to the united states.

this themed section historicises and recovers the complex histories of these recently marginalised himalayan places by connecting them to larger transcultural narratives and global processes of economic, religious, and social exchange.
it is largely inspired by sanjay subrahmanyam’s call for “connected histories” that explore how local and regional places, transactions, and encounters constitute global histories through the circulation of people, ideas, and commodities into and across such spaces. contributors seek to develop this model as an alternative to the dominant historiographies and area studies scholarship that privileged nation state-centred histories and cold war political formations over connective and transnational ones. global processual histories, such as those by eric wolf, sidney mintz, and fernand braudel, have demonstrated from a macro-level, material culture perspective that the circulation of people, goods, and things transcends nation states. the recent turn to scholarship on maritime networks is a significant contribution to connected histories, but privileges coastal cities rather than inland ones, not to mention those in seemingly more remote areas. this themed section’s emphasis on borderland histories and their transcultural connections across asia and the globe is also inspired by north american scholars such as richard white, who called for careful historical interrogation of the shifting power dynamics that characterised spaces between states which functioned as a “middle ground” for different cultures and ethnicities. an important impetus is the call for “zomia” from willem van schendel and james scott as a heuristic area for study that comprises the linguistically, culturally, and economically connected asian borderlands of bhutan, bangladesh, china, nepal, india, thailand, laos, vietnam, and burma. this call serves to further de-centre the nation state as a sealed “container” of history and allows contributors to re-imagine himalayan place histories from connected local-global perspectives that move beyond nation states, area studies, and regions such as south asia or east asia.

sanjay subrahmanyam, “connected histories: notes towards a reconfiguration of early modern eurasia,” in “the eurasian context of the early modern history of mainland south east asia, – ,” special issue, modern asian studies , no. (july ): – .

fernand braudel, afterthoughts on material civilisation and capitalism (baltimore: johns hopkins university press, ); sidney mintz, sweetness and power: the place of sugar in modern history (new york: viking press, ); eric wolf, europe and the people without history (berkeley: university of california press, ).

an excellent example of a maritime history is nile green, bombay islam: the religious economy of the west indian ocean, – (cambridge: cambridge university press, ).

fig. : the eastern himalayas as a contact zone of different nation states. map created by uva-kaartenmakers.

we recognize that writing connected histories of such borderland and transnational spaces poses large and specific challenges. himalayan archives, for instance, are dispersed across many local and national collections in numerous languages, often in a perilous state of preservation, with only scanty historical materials to illuminate subaltern and mobile subjectivities. partly for these reasons, anthropologists, linguists, and specialists of religious studies who undertake ethnographic or purely textual research have managed to study this region much more than historians. this themed section emerges from a collaborative academic network aimed at promoting historically nuanced conversations and interdisciplinary initiatives that recognize and challenge such a lacuna.
our hope is that it will inspire scholars who face similar challenges to advance dialogue about how connected and collaborative approaches to humanities and social science research across national and disciplinary boundaries might thrive and, in turn, encourage public engagement for such under-studied areas.

richard white, the middle ground: indians, empires, and republics in the great lakes region, – (cambridge: cambridge university press, ). white’s theoretical model for understanding borderland history has inspired scholarship on asia with works such as c. patterson giersch, asian borderlands: the transformation of qing china’s yunnan frontier (harvard: harvard university press, ) and yudru tsomu, the rise of gönpo namgyel in kham: the blind warrior of nyarong (lanham: lexington, ).

see willem van schendel, “geographies of knowing, geographies of ignorance: jumping scale in southeast asia,” environment and planning d: society and space ( ): – and james scott, the art of not being governed: an anarchist history of upland southeast asia (new haven: yale university press, ). for more on zomia and the himalayas, see sara shneiderman, “are the central himalayas in zomia? some scholarly and political considerations across time and space,” political geography ( ): – .

the eastern himalaya research network is an international network of scholars that focuses on historically nuanced cultural studies of the eastern himalayas and their borderlands. we promote collaboration in digital scholarship and pedagogy, archival preservation and dissemination, and nurture research partnerships involving university academics, public intellectuals, young researchers, and institutions across the himalayas and beyond: https://www.utsc.utoronto.ca/digitalscholarship/ehrn/home [accessed on . june ].
for centuries, the himalayas were of spiritual and commercial significance, but mainly to the inhabitants of asia. from the mid-nineteenth century onwards, medical theories, plantation capitalism, commodity commerce, migrations, and strategic machinations brought these mountain localities and habitats to imperial and global attention. an important impulse behind the geo-political manoeuvring to control these spaces followed from the bio-political theories that encouraged imperial territorial expansion into mountain zones. nineteenth-century euro-american medical science was steeped in climatic thinking, which held that tropical colonies posed great dangers for white races. periodic bodily recuperation seemed essential to preserve white racial health in hot climates, but this depended on european colonisers having access to the temperate climes of high-altitude spaces. this in turn inspired the english east india company to annex a remote mountain hamlet named dorjéling (rdo rje gling) from the kingdom of sikkim in . the company’s new british settlement of darjeeling was planned as a high-altitude sanitarium that would provide refuge for white troops and administrators from the ravages of the indian plains. similar bio-political thinking based on climatic theories and race science inspired the creation of dalat, bukittinggi, and baguio, other hill station resorts across asia, when dutch, french, and american empire-builders adopted the british strategy of periodic high-altitude recuperation.

relevant works across a variety of disciplines include toni huber and stuart blackburn, eds., origins and migrations in the extended eastern himalayas (leiden: brill, ); alex mckay, their footprints remain: biomedical beginnings across the indo-tibetan frontier (amsterdam: amsterdam university press, ); saul mullard, opening the hidden land: state formation and the construction of sikkimese history (leiden: brill, ); karma phuntsho, a history of bhutan (delhi: random house, ); sara shneiderman, rituals of ethnicity: thangmi identities between nepal and india (philadelphia: university of pennsylvania press, ); amy holmes-tagchungdarpa, the social life of tibetan biography: textuality, community, and authority in the lineage of tokden shakya shri (lanham: lexington, ); wim van spengen, tibetan border worlds: a geo-historical analysis of trade and traders (london: kegan paul international, ).

judith t. kenny, “climate, race, and imperial authority: the symbolic landscape of the british hill station in india,” annals of the association of american geographers , no. ( ): – . other works on hill stations include nandini bhattacharya, “leisure, economy, and colonial urbanism: darjeeling, – ,” urban history , no. ( ): – ; aditi chatterji, landscapes of power: the colonial hill stations, research papers (oxford: school of geography, university of oxford, ); pamela kanwar, imperial simla: the political culture of the raj (delhi: oxford university press, ); dane k. kennedy, the magic mountains: hill stations and the british raj (berkeley: university of california press, ); queeny pradhan, “empire in the hills: the making of hill stations in colonial india,” studies in history , no. ( ): – .

british darjeeling acquired additional economic and political cachet when colonial experiments to grow transplanted tea-plants along its mountain slopes proved successful. after the portuguese introduced this chinese beverage into seventeenth-century europe, tea had become hugely popular. by the mid-nineteenth century, tea-drinking became inseparable from genteel ladies’ parlours as well as workers’ canteens on the british isles.
until that time, china was the sole supplier of tea to the globe. this gave the chinese qing empire a crucial commercial monopoly that european statesmen and scientists attempted to contest. such attempts were fruitless until the british succeeded in cultivating tea in their colonial acquisitions. in the s, when colonial tea plantations in darjeeling, assam, and ceylon started to challenge the chinese monopoly, this was hailed as a key political and scientific achievement of the british empire. over the next few decades, darjeeling tea acquired a reputation and fame that spanned the globe. the town’s growing popularity as a picturesque mountain destination and the main producer of a global beverage intensified when it became the summer capital of the bengal presidency, with a rail connection to the port-city of calcutta that largely mitigated the rigors of high-altitude travel. a few hours from darjeeling, across the teesta river, lies the hamlet of kalimpong, which the british acquired in from the kingdom of bhutan. already home to a lively market that connected dispersed himalayan localities with the long-distance salt and brick-tea trades, kalimpong became particularly popular with european missionaries and explorers, above all as a window into buddhist-ruled tibet, which was not accessible to most foreigners at that time. the two mountain towns developed into crucial economic and cultural crossroads between the eastern himalayas and the world, especially in the wake of the younghusband mission’s forcible opening of tibet in . they gained their hub status at the expense of the historic cities of kathmandu and lhasa, whose nepali and tibetan rulers, hoping to evade european territorial ambitions, imposed severe restrictions on cross-border travel and commerce. 
in their place, darjeeling and kalimpong, positioned between the qing and british empires, the kingdoms of bhutan, nepal, sikkim, and tibet, attracted settlers of diverse asian and himalayan ethnicities, as well as british, german, french, and scandinavian sojourners. in these british-ruled towns, colonial administrators, planters, and missionaries dominated social hierarchies, but asian traders were the key economic actors. marwari, tibetan, nepali, and chinese traders dispatched consumer goods ranging from rice to rolex watches to and from these hill stations, into territories such as tibet, which had only recently become accessible to the global circulation of goods. in turn, trading partners located across the himalayas assisted darjeeling and kalimpong agents in transmitting locally produced commodities to the rest of the world.

jayeeta sharma, empire’s garden: assam and the making of india (durham: duke university press, ).

sarah besky, the darjeeling distinction: labor and justice on fair-trade tea plantations in india (berkeley: university of california press, ).

darjeeling tea was the best known of these himalayan products. by , darjeeling and its surroundings had tea plantations owned by british firms that employed approximately , local workers, to grow a crop worth well over ten million pounds. other, less widely-known himalayan commodities were tibetan sheep wool and yak tails. these crossed the himalayas on mule and yak into kalimpong, from whence powerful traders such as the tibetan pangdatsang syndicate sent them across the oceans into north american factories and department stores. few global consumers realized that ubiquitous modern artefacts such as santa claus beards were manufactured from yak-tail hair.
wool trade figures fluctuated wildly due to global economic and geopolitical conflicts; at the height of its post-war boom during the s, the trade generated between one and a half and two million us dollars.

as colonial darjeeling and kalimpong expanded into hubs for commerce across british india, sikkim, nepal, bhutan, tibet, and china, they flourished as fluid contact zones characterized by economic mobility, urban socialization, and cross-cultural encounters. a substantial mobile population was drawn in from a wide swathe of economically marginal lands such as the himalayan territories of nepal, sikkim, bhutan, tibet, and china, as well as parts of northern, eastern, and western india. at the upper reaches of the hill station labour pyramid were the men and women who found employment in the military, on plantations, and in households; at the lower end were porters and other types of manual workers. the anthropologist tanka subba estimates that the darjeeling hinterland’s population increased from around one hundred at the time of british annexation, to , by , , by , and that it had jumped to over , by . a large proportion of this population arrived from nepal, and found new employment opportunities in the colonial economy as gurkha soldiers and tea plantation labourers. in , there were roughly , nepali workers in the darjeeling area, of whom , had been born in nepal.

jayeeta sharma, “producing himalayan darjeeling: mobile people and mountain encounters,” himalaya: the journal of the association for nepal and himalayan studies , no. (fall ): – .

g. w. christison, “tea planting in darjeeling,” society of arts journal ( – ): – .

tina harris, geographical diversions: tibetan trade, global transactions (athens: university of georgia press, ).

mary louise pratt, imperial eyes: travel writing and transculturation (london: routledge, ).

tanka subba, “living the nepali diaspora: an autobiographical essay,” zeitschrift für ethnologie , no. ( ): – .
away from the repressive rana regime that ruled their homeland, these darjeeling and kalimpong migrants played a prominent role in the creation of nepali as a literary language, establishing some of the earliest nepali print periodicals, such as the gorkha khabar kagat ( ) and chandrika ( ). these towns provided a safe home for newar buddhists from the kathmandu valley who faced religious persecution in nepal. settling in kalimpong and darjeeling, these families found not just religious refuge but a suitable base to conduct commerce that spanned india, sikkim, nepal, and tibet. assisted by such wealthy patrons, the two towns became vital centres for the creation and dissemination of buddhist knowledge that were connected to other asian centres such as ceylon and burma, but also generated secular information, especially on tibet, that spread across the world. a key figure in such information circulation was an ethnic tibetan migrant from the western himalayas, dorje tharchin, who earned his living at kalimpong as a translator and language instructor for the scottish foreign mission, but became famous as the creator of the world’s first tibetan newspaper with a wider distribution network, the mélong or tibet mirror, which was published from to . its subscribers ranged from the thirteenth and fourteenth dalai lamas to traders from tibet, nepal, and surrounding areas. this newspaper was also read by an emerging global network of intellectuals and academics interested in tibetan issues, including jacques bacot (france), marco pallis (uk), johan van manen (netherlands), and johannes schubert (germany).

catherine warner, “flighty subjects: sovereignty, shifting cultivators, and the state in darjeeling, – ,” himalaya: the journal of the association for nepal and himalayan studies , no. ( ).

an interesting biography of a transnational newar buddhist trading family is d. s. kansakar hilker, syamukapu: the lhasa newars of kalimpong and kathmandu (kathmandu: vajra publications, ). another transnational study where kalimpong networks play a crucial role is paul g. hackett’s theos bernard, the white lama: tibet, yoga, and american religious life (new york: columbia university press, ).

see digital collections on the tibet mirror from columbia university: http://www.columbia.edu/cu/lweb/digital/collections/cul/texts/ldpd_ _ / [accessed on . june ].

during the s, the british botanist joseph hooker was the first well-known figure to bring darjeeling into the public eye. he lived there for three years, making it his base for naturalist collecting activity. through private letters and published journals, hooker engaged an influential scientific community and a wider european readership with his himalayan findings. those ranged from the rhododendrons that began to bloom in gardens across the globe to natural observations incorporated by charles darwin into his influential on the origin of species. in the century after hooker’s visit, darjeeling and kalimpong played host to other sojourners and settlers from across the globe, who pursued a wide variety of interests.
they included the nepali christian preacher gangaprasad pradhan, the belgian-french mystical writer alexandra david-néel, the scottish missionary and child welfare reformer john anderson graham, the british administrator and tibet scholar charles bell, the bengali nobel laureate litterateur rabindranath tagore, the newar buddhist businessman bhajuratna kansakar, the danish anthropologist prince peter, the thirteenth dalai lama, the japanese buddhist pilgrim ekai kawaguchi, and the mountain porter-turned-everest-conqueror tenzing norgay. throughout the mid-nineteenth to mid-twentieth centuries, their travels, experiences, and narratives further tied the himalayas to larger histories of modernity and cross-cultural encounters. this themed section of the journal transcultural studies engages with such multiple journeys and crossings around darjeeling and kalimpong and offers alternative approaches which connect and intersect the history of local places and spaces with broader narratives of global history. the contributors to this section draw upon a range of perspectives and archives to frame their explorations of darjeeling, kalimpong, and the eastern himalayas as hubs for local, regional, and global circulation, transnational and transcultural encounters. an important aspect of these articles is their wide-ranging exploration of textual, oral, and visual source materials in multiple languages and locations across the world, from the colonial modern to the contemporary era. jayeeta sharma places darjeeling within the historical context of hill station sanatorium urbanity that was a key feature of british imperial culture. she explores how a strategic eastern himalayan location at the crossroads of transnational circulations of bodies, commodities, and ideas shaped the transcultural character of the town and gave it a distinctive global presence. 
the article examines darjeeling as a colonial space that was laboured upon and constituted by mobile historical actors from across the himalayas and beyond, including lepcha cultivators, nepali labourers, sherpa porters, gurkha soldiers, sikkim landholders, bengali clerks, scottish missionaries, and british planters.

joseph dalton hooker, himalayan journals: notes of a naturalist in bengal, the sikkim and nepal himalayas, the khasia mountains, &c (london: john murray, ); see also the newly available database of his correspondence on the website of the royal botanic gardens, kew: http://www.kew.org/science-conservation/collections/joseph-hooker/correspondence [accessed on . june ].

the circulation, migration, and representation of himalayan labouring bodies, the economic and social transactions of capitalism around commodities such as tea, salt, and wool, and the local and global journeys of euro-american sojourners became the key transcultural elements that transformed british darjeeling into a place famed across the himalayas and the world.

emma martin’s article takes the production of the british administrator charles bell’s influential tibetan dictionary ( ) as a thread to investigate scholarly encounters and interactions in the himalayan borderlands. she explores how bell spent his years as a colonial government functionary between sikkim, tibet, darjeeling, and kalimpong, consolidating the skills that made him a leading tibetologist of that era. in so doing, she illustrates how bell’s much-vaunted global reputation as a tibet scholar was largely based on his access to local intermediaries from darjeeling, tibet, and sikkim.
reading between the lines of the british state’s official archive and bell’s own correspondence, she shows how a landmark of imperial knowledge like bell’s dictionary was predicated on the “knife edge of colonialism” and its inequalities of power. this article deconstructs the expertise of an imperial functionary to delineate the “hidden histories” of local and indigenous agency that helped create an ostensibly western canon of himalayan knowledge. kalzang dorjee bhutia explores different perspectives on the importance of darjeeling and kalimpong as hubs of cultural hybridity where local intellectuals and global visitors were able to exchange and cross-fertilize ideas unhampered by the repressive political environments of other himalayan polities. this article focuses on the young men’s buddhist association, which devoted its energies to public education and, by taking the ymca as both inspiration and challenge, countered christian missionary efforts to convert locals by presenting modernized buddhism as an alternative. his sources include a little-known oral archive narrated in the bhutia, lepcha, and nepali languages about the theravada monk s. k. jinorasa and his wide-ranging social justice initiatives, as well as the self-consciously penned and self-published memoirs of the british-born buddhist philosopher sangharakshita. samuel thévoz investigates global forms of buddhism inspired by the representation of local cultures of tibet and sikkim in western scholarship, with a focus on the life of alexandra david-néel, the early twentieth-century belgian-french explorer, spiritualist, and new york times best-selling author who was to have a profound and long- lasting influence on beat culture and western esotericism. 
his examination of david-néel’s encounters with tibetans in darjeeling and kalimpong adds a new dimension to the crucial role played by these mountain towns located at the borderland junction of cross-cultural currents in the creation of global intellectual and cultural networks that derived inspiration from buddhism and orientalism.

the shifting perspectives employed in these articles tie these localities to larger histories of enormous relevance to our understanding of the global history of intellectual and commercial exchange. they are also of great import for a fuller understanding of twenty-first century geo-politics, where, as china and india consolidated their global position, smaller himalayan entities such as nepal, bhutan, sikkim, and tibet as well as indian and chinese borderland spaces became embroiled within, and perhaps even central to, those territorialising ambitions. by the late twentieth century, a broad constellation of economic and political forces made localities such as darjeeling and kalimpong marginal to the media and to area studies scholarship that focused on the rising asian nation states of china and india. china’s increasing post-war grip on tibet had mixed effects on the inhabitants of these borderland towns. on one hand, the increase in chinese soldiers and demand for commodities in tibet brought significant prosperity for some traders in darjeeling and kalimpong. on the other hand, many felt the impact of commercial restrictions, as well as heavy losses (in the case of the wealthy trading families) as the chinese state began to take over the tibetan economy. combined with the us export control act limiting goods from china, global trading connections with the now-indian towns of darjeeling and kalimpong began to visibly chill. in the late s, refugees from tibet into india initially used kalimpong as a base, but later left for other places with better opportunities.
the build-up of animosity between india and china culminated in significant border conflicts and eventually the sino-indian war of . chinese families based on the indian side were accused of spying and were repatriated to china despite the fact that they had not lived there for several generations. the war essentially sealed off all borders between china and india. whatever was left of small-scale tibet-india trade was rerouted through nepal, rather than via kalimpong. neighbouring darjeeling largely retained the cachet of its tea, which in acquired the status of india’s first gi commodity, along the lines of french champagne and italian parmesan. however, its share in the global market dipped, partly due to strong competition from lower-end teas from kenya and elsewhere, and partly due to the political turmoil associated with the sub-nationalist gorkhaland movement, which protested the distant west bengal state government’s political neglect of these mountain spaces.

in the eastern himalayas, images of these connected histories (tea and yak tails, spies and missionaries, administrators and monks) still remain central to negotiations of identity, albeit in a new globalized idiom. this is illustrated by the recent installation of a giant lcd screen at the chowrastra pedestrian square, known to any connoisseur of darjeeling’s picturesque vistas. once an exclusive social space reserved for the white elites of empire, today’s chowrastra is pleasantly chaotic in the daytime hours, with euro-american backpackers, israeli tourists, and indo-nepali, tibetan, and bengali locals milling around. bellowing over their voices is the soundtrack of television shows from the giant screen. the television images appear to mirror postcards at nearby bookshops: photographic images of soaring peaks and bucolic villages whose inspiration goes back to the colonial-era tourist trade.
closer attention reveals that the soundtrack that accompanies those images is not the locally dominant nepali or english languages, but mandarin. this linguistic detail evokes an even older connected history, one where mountain spaces controlled today by bhutan, china, india, nepal, and russia formed part of the ancient trade route that transported tea and silks from yunnan into eurasia. such entanglements represent the complex and connected histories that thread through darjeeling and kalimpong, and demonstrate how ostensibly marginal borderland places continue to perform as gateways to the world that connect diverse spaces, peoples, objects, and representations.

de-identification guidance

prepared by the portage network, covid- working group on behalf of the canadian association of research libraries (carl)

kristi thompson (western university)
erin clary (portage)
lucia costanzo (university of guelph)
beth knazook (portage)
nick rochlin (university of british columbia)
felicity tayler (university of ottawa)
jane fry (carleton university)
chantal ripp (university of ottawa)
kathy szigeti (university of waterloo)
qian zhang (university of waterloo)
roger reka (university of windsor)
minglu wang (york university)
rebecca dickson (coppul)
mark leggott (rdc-drc)
melanie parlette-stewart (portage)

september

portage network
canadian association of research libraries
portage@carl-abrc.ca

table of contents
de-identification guidance
identify and remove direct identifiers
how do i remove this information?
identify and evaluate indirect or quasi-identifiers based on perceived risk and utility
how do i figure out what combination of quasi-identifiers are a problem?
how do i assess the sensitivity of non-identifying variables in dataset?
considerations for qualitative data de-identification
brief considerations for social media, medical images, and genomics data
data collected from social media or social networking platforms (e.g., twitter, facebook).
medical images
genomics data, and other biomedical samples
appendix : code for checking k-anonymity
appendix : free de-identification software packages
appendix : fee-based services for de-identification
resources
references

de-identification guidance

the guidance below is intended to help you minimize disclosure risk when sharing data collected from human participants.
if you use any of the following techniques to anonymize your data, please include this information in your documentation and readme file. for transparency, it should be clear how the dataset was modified to protect study participants.

before proceeding, please note that not all human participant data needs to be de-identified, or stripped of direct and indirect identifiers. please review your consent form and prepare your data to share only what participants have agreed to share. if you are unsure whether you need to de-identify your data, please see the portage help guide can i share my data? and consult with your institution’s research ethics board. for help selecting a repository for your data, please see portage’s recommended repositories for covid- research data guide or consult with librarians at your institution to see if further support is available.

learn more about creating appropriate documentation for depositing your datasets in the portage covid- working group’s “documentation and supporting materials required for deposit,” september , , https://doi.org/ . /zenodo. .
portage covid- working group, “can i share my data?” september , , https://doi.org/ . /zenodo. .
portage covid- working group, “recommended repositories,” september , . https://doi.org/ . /zenodo. .

identify and remove direct identifiers

direct identifiers are those which place study participants at immediate risk of being re-identified. unless explicit consent was received from study participants, they must be removed from any published version of your dataset.
the following list is based on various sources, including guidance from major international funding agencies, the us health insurance portability and accountability act (hipaa) and the british medical journal. see preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers and list of items considered under hipaa to be identifiers.

direct identifiers are:
● names or initials, as well as names of relatives or household members
● addresses, and small area geographic identifiers such as postal codes / zip codes
● telephone numbers
● electronic identifiers such as web addresses, email addresses, social media handles, or ip addresses of individual computers
● unique identifying numbers such as hospital ids, social insurance numbers, clinical trial record numbers, account numbers, certificate or license numbers
● exact dates relating to individually-linked events such as birth or marriage, date of hospital admission or discharge, or date of a medical procedure
● multimedia data: unaltered photographs, audio, or videos of individuals
● biometric identifiers including finger or voice prints, and iris or retinal images
● human genomic data, unless risk was explained and consent to share data or consent for secondary use of data was received from study participants
● age information for individuals over years old

how do i remove this information?

removing direct identifiers from your data is relatively straightforward. you may either record this personal information in a separate document, spreadsheet, or database and link this to the other data points via a series of codes that can be removed before publishing, or choose to delete the identifying data points entirely at the end of the project. refer to your consent forms to determine how to proceed. if you are unsure whether data can simply be unlinked or if it must be destroyed, consult your local research ethics board.

iain hrynaszkiewicz, melissa l. norton, andrew j. vickers, and douglas g. altman, “preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers.” bmj (january , ): c , https://www.bmj.com/content/ /bmj.c ; and steve alder, “what is considered phi under hipaa rules?” hipaa journal (december , ), https://www.hipaajournal.com/considered-phi-hipaa/.

identify and evaluate indirect or quasi-identifiers based on perceived risk and utility

indirect or quasi-identifiers are characteristics (such as demographic information) relating to individuals that could be linked with other data sources to violate the confidentiality of individuals. quasi-identifiers may not be identifying on their own but can be disclosive in combination. for instance, identifying a participant’s home community size within an overall limited geographic study area may allow someone to infer that participant’s location more precisely. a variable should be considered a quasi-identifier if someone could plausibly match that variable to information from another source. see the international household survey network anonymization principles and the information and privacy commissioner ontario de-identification guidelines for structured data.
a list of potential quasi-identifiers:
● geographic identifiers (census geography, town name, urban/rural indicator) of home, place of birth, place of treatment, place of schooling, or other geography linked to individuals
● sex / gender identity, orientation
● ethnic background, race, visible minority, or indigenous status
● immigration status
● membership in organizations
● use of specific social networks or services
● socioeconomic data, such as occupation or place of work, income, or education
● household and family composition, marital status, number of children / pregnancies
● criminal records and other information that may link to public records
● generalized dates linked to individuals, e.g. age, graduation year, immigration year
● some full-sentence responses
  ○ note: these must be checked individually. for instance, the comment “the library should be open longer” is not identifying; however, a comment like “as chair of a research group that uses the library,…” is potentially identifying.
● some medical information (e.g. permanent disabilities or rare medical conditions) may be identifying; temporary illness or injury is less likely to be so. the test is whether this is information that can be found elsewhere and therefore could be used to re-identify the person.

“anonymization principles,” international household survey network, accessed august , , https://ihsn.org/node/ ; information and privacy commissioner of ontario, de-identification guidelines for structured data, information and privacy commissioner of ontario, june , . https://www.ipc.on.ca/privacy-organizations/de-identification-centre/.

how do i figure out what combination of quasi-identifiers are a problem?

. observe the possible combinations

a good first step may be to look at the demographic variables in the dataset and consider describing an individual to a friend using only the values of those variables. is there any likelihood that the person would be recognizable? for example, “i’m thinking of a person living in toronto who is female, married, has a university degree, is between the ages of and and has an income of between and thousand dollars.” even if there is only one such person in the dataset, this is likely not enough information to create risk unless contextual information about the dataset narrows things down further. for instance, if your data is limited to a specific, narrow group of individuals, such as the referees for the ontario hockey association, the list of quasi-identifiers given above may be enough to uniquely identify an individual. quasi-identifiers need to be evaluated in the context of what is known or what may be reasonably inferred about the survey population.

. assess these combinations mathematically

k-anonymity is a mathematical approach to demonstrating that a dataset has been anonymized, where k is an integer selected by the researcher that represents the minimum number of records that must share the same values across all quasi-identifiers. within your dataset, a set of records that share the same quasi-identifier values (e.g., a set of or records) is called an equivalence class. to achieve k-anonymity, it should not be possible to distinguish one record from the other records in its equivalence class.
for example, if you choose a k value of , each record in your dataset must have exactly the same quasi-identifier values as at least other records in order to achieve k-anonymity.

k-anonymity only works to precisely estimate risk if a dataset is a complete sample of some population. k-anonymity considerably overestimates risk in the case of a dataset that is a subsample of a population. when determining the appropriate k value to use, consider:
● a lower k value of may be sufficient in datasets that contain small samples from a large population.
● a higher (or more conservative) k value should be used if a dataset is a complete sample of a population.

keep in mind that a dataset that is a complete sample of a known population may have additional risk factors. imagine that all the respondents in a particular equivalence class answered a question the same way: you would know how each person in the survey belonging to that equivalence class answered the question. respondents to surveys are generally told that their responses will be kept confidential, not merely that no one will know which line of data contains their specific answers. a k-anonymous dataset that is a complete sample may not fulfill that promise.

khaled el emam and fida kamal dankar, “protecting privacy using k-anonymity,” journal of the american medical informatics association , no. (september ): – , https://doi.org/ . /jamia.m .

the code in appendix can be used with your preferred statistical software package to create equivalence classes based on the quasi-identifiers in the dataset and to list them by size. if any equivalence class has fewer members than the value of k you selected, use the data reduction techniques below to further reduce dataset risk.
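as an illustration only, the equivalence-class check described above can be sketched in python. this is not the appendix code: the function names, the toy records, the quasi-identifier column names, and the choice of k = 3 are all assumptions made for the example. the sketch groups records by their combination of quasi-identifier values, counts each group, and lists the groups that fall below k, smallest first.

```python
from collections import Counter


def equivalence_classes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)


def classes_below_k(records, quasi_identifiers, k):
    """Return (combination, size) pairs for classes with fewer than k members,
    sorted from smallest to largest."""
    sizes = equivalence_classes(records, quasi_identifiers)
    small = [(combo, n) for combo, n in sizes.items() if n < k]
    return sorted(small, key=lambda item: item[1])


# Hypothetical toy dataset; "answer" is a non-identifying survey response.
records = [
    {"city": "toronto", "sex": "f", "age_group": "30-39", "answer": 4},
    {"city": "toronto", "sex": "f", "age_group": "30-39", "answer": 2},
    {"city": "toronto", "sex": "f", "age_group": "30-39", "answer": 5},
    {"city": "guelph", "sex": "m", "age_group": "40-49", "answer": 3},
    {"city": "guelph", "sex": "m", "age_group": "40-49", "answer": 1},
    {"city": "windsor", "sex": "f", "age_group": "20-29", "answer": 4},
]
QUASI_IDENTIFIERS = ["city", "sex", "age_group"]  # assumed column names

# Classes smaller than the chosen k are candidates for data reduction.
risky = classes_below_k(records, QUASI_IDENTIFIERS, k=3)
for combo, size in risky:
    print(combo, size)
```

in this toy data the windsor and guelph combinations are flagged because they contain fewer than three records, while the toronto combination is not. the same grouping-and-counting logic applies whatever software is used.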
for more on k-anonymity, see international household survey network (ihsn)’s measuring the disclosure risk and the uk anonymisation network’s anonymisation decision-making framework section . . , guaranteed anonymisation.

. use data reduction techniques to address dataset risk

univariate frequencies and bivariate crosstabs can be used to identify small categories of quasi-identifiers. data reduction techniques can be used to mitigate risk once you have identified these small groups. there are three simple types of data reduction you may wish to consider:
. the simplest is to completely drop risky variables from the dataset. this is an option for variables with relatively high risk that are not considered to be of high research value. (for example, in some datasets geography may be considered relatively less important than ethnicity or language.)
. the second is global re-coding, or aggregating the observed values into a defined set of classes, such as transforming a variable with years of age into a variable of ten-year age categories, or top-coding a high income category to “$ , and above”.
. a third option for unusual cases is to use local suppression. for example, a very young married respondent might have their marital status set to ‘missing’ as an alternative to globally re-coding the otherwise non-risky age variable into a larger group.

after each exercise in data reduction, repeat the test for k-anonymity described above and check equivalence classes until every group has at least as many members as your selected value for k. for more information, including information about more complex types of data reduction, see uk anonymisation network’s anonymisation decision-making framework section . , anonymisation solutions.

“measuring the disclosure risk,” international household survey network, accessed august , , https://ihsn.org/anonymization-risk-measure; and mark elliot, elaine mackey, kieron o’hara, and caroline tudor, the anonymisation decision-making framework. uk anonymisation network (ukan), university of manchester, , https://ukanon.net/ukan-resources/ukan-decision-making-framework/.
‘small’ is relative; as a first pass, groups smaller than % of the dataset or containing fewer than cases could be considered.
elliot, mackey, o’hara, and tudor, the anonymisation decision-making framework.

how do i assess the sensitivity of non-identifying variables in dataset?

non-identifying information includes survey responses and measurements that are not likely to be recognizable as coming from specific individuals. examples include opinions, rankings, scales, or temporary measures such as resting heart rate after meditation or the number of times an individual ate breakfast in a week. it is possible for non-identifying information to be highly sensitive as well. information that could be used to stigmatize or discriminate against an individual, such as a criminal record, sexual practices, illicit drug use, mental health and psychological well being, and other sensitive medical information all increase the risk of the dataset and should be considered when deciding whether to release the data at all. you may wish to remove or modify these variables to create a less sensitive version of the data.

considerations for qualitative data de-identification

qualitative data describes qualities or characteristics that can be observed, but not necessarily measured.
this type of data is collected through interviews, surveys, or observations, and may be in the form of transcripts, notes, video and audio recordings, images, and text documents. as with quantitative data, direct identifiers may appear in the form of names, date and place of birth, other locations, and even photos. these direct identifiers can be used along with indirect or quasi-identifiers, such as medical, education, financial, and employment information, to trace or determine an individual’s identity. the process for removing identifying information in a video recording, audio interview, or oral transcript is very different from that used to de-identify a medical record. for one, it is harder to do programmatically. extremely detailed field notes or audiovisual information often requires someone to read or watch the content thoroughly.

general advice
● avoid asking for identifying information in the first place.
  ○ it is easier to edit the information at the point of capture than it is to remove information after it has been recorded.
  ○ if you require identifying information at the research stage, try to capture it within the first few minutes of an interview or recording, so that it is easy to edit it out quickly. alternatively, transcribe the information in a separate document that can be removed from a person’s file.
● make de-identification a part of the process of informed consent.
  ○ ensure that study participants are aware of your planned use of the data, and the fact that their information may be anonymized to protect them. make it clear in your consent forms how extensively the data will be de-identified (i.e., what elements will be replaced or removed). while direct identifiers may be eliminated (name, address, birthday, etc.), there may be other subtle clues to their identities that remain within the recording or transcript.
  ○ agree in advance with participants which type of identifying information can be revealed in an interview.
(for example, the participant may not wish to mention an employer’s name.) this is easier than removing information after the fact.
  ○ keep in mind that not all data needs to be de-identified or anonymized. in some circumstances, you may be recording deeply personal accounts and should be mindful of a participant’s right to have their story told in their own words. some participants may have a personal interest in staying identified.
● use pseudonyms and change identifying details to protect anonymity.
  ○ if changing the person’s name, location of residence, or occupation can be done without compromising the dataset, this can help to protect their anonymity. be advised that this could influence the utility of a dataset as it may alter a future researcher’s perception of the interviewee’s socio-economic status or behaviour.
● if necessary, remove blocks of sensitive text or edit out portions within audio-visual data.
  ○ some portion of the research may need to be redacted. be wary of using search and replace techniques as it is easy to replace the wrong piece of information.
  ○ voices in audio recordings may need to be masked by altering pitches.
  ○ faces in visual data may need to be pixelated.
● restrict access.
  ○ this is not preferred, but some datasets will not remain useful if all identifiers are removed. it may be possible to allow researchers seeking secondary access to request that queries be performed by the original research team, who can then share results if they are non-disclosive or can be appropriately de-identified.

for more information, see the uk data service's guide to anonymisation of qualitative data.

“anonymisation: qualitative data,” uk data service, last modified june , , https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation/qualitative.aspx.
brief considerations for social media, medical images, and genomics data

data collected from social media or social networking platforms (e.g., twitter, facebook).

although information on social networking sites may be free to access or view, it does not automatically follow that it is free to redistribute. many platforms have terms of use that you will need to abide by, and the people who use the platform may have an expectation of privacy which must be respected. some platforms require users to register before content is visible, and others may have terms that prohibit data collection, data scraping, or republishing content elsewhere. here is a series of questions to consider before you deposit social media data:
● could the topic you are studying be considered sensitive?
● could your data lead to stigmatization of, or discrimination against, the content author?
● is the study population vulnerable?
● what expectation of privacy might the individual users of this platform have?
● is it possible or reasonable to obtain informed consent?
● can or should the data be anonymized?
● do the platform’s terms of use allow you to redistribute content?

for example, twitter allows the content author to maintain control over their tweets. as part of twitter's policies, only numeric tweet ids and user ids should be redistributed. if you have weighed the questions above and decide to deposit your dataset, the tweets must first be ‘dehydrated’ (distilled down to just the tweet id) using a tool such as docnow’s twarc. any secondary use of the data would then require an end-user to “rehydrate” the tweet ids using the twitter rest api or an external tool such as docnow’s hydrator.
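to make the idea of dehydration concrete, here is a minimal python sketch that reduces a file of tweets to a list of ids. it is not twarc and does not reproduce twarc's behaviour; it assumes, for illustration, that the collected tweets are stored one json object per line and that each object carries its identifier in an "id_str" or "id" field. the function name and the sample tweets are hypothetical.

```python
import io
import json


def dehydrate(jsonl_in, ids_out):
    """Read one JSON tweet object per line and write only its id, one per line.

    Returns the number of ids written. Everything except the id (text, user
    details, timestamps) is discarded.
    """
    count = 0
    for line in jsonl_in:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        # Assumption: the id is stored as "id_str" (string) or "id" (number).
        tweet_id = tweet.get("id_str") or tweet.get("id")
        if tweet_id is None:
            continue  # no usable id on this record
        ids_out.write(f"{tweet_id}\n")
        count += 1
    return count


# Hypothetical sample input, held in memory instead of a file on disk.
sample = io.StringIO(
    '{"id_str": "1001", "text": "hello", "user": {"screen_name": "a"}}\n'
    '{"id": 1002, "text": "world"}\n'
)
out = io.StringIO()
n = dehydrate(sample, out)
print(out.getvalue(), end="")
```

in practice the two in-memory buffers would be replaced by open file handles, and the resulting id list is what would be deposited.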
content will not be returned for tweets that have since been deleted. the following resources provide more in-depth guidance: ● zeffiro and brodeur, social media research data ethics and management (slides from a workshop presented at mcmaster university). ● ryerson university research ethics board’s guidelines for research involving social media. “developer terms: more about restricted uses of the twitter apis,” twitter, accessed august , , https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases. documenting the now, “docnow/twarc.” github, accessed august , , https://github.com/docnow/twarc. documenting the now, “docnow/hydrator.” github, accessed august , , https://github.com/docnow/hydrator. andrea zeffiro and jay brodeur, “social media research data ethics and management.” workshop presented april , , sherman centre for digital scholarship, mcmaster university, http://hdl.handle.net/ / . ryerson university research ethics board, “guidelines for research involving social media,” ryerson university, november, , https://www.ryerson.ca/content/dam/research/documents/ethics/guidelines-for-research- involving-social-media.pdf. 
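To illustrate what "dehydrating" means in practice, here is a small standalone sketch. It is not twarc's implementation; a tool such as twarc handles this step for you. It assumes tweets are stored one JSON object per line and that each record carries its identifier in an `id_str` or `id` field (an assumption about the collected data's layout). The function name `dehydrate` and the sample records are hypothetical.

```python
import json

def dehydrate(jsonl_lines):
    """Return tweet IDs, in order and without duplicates, from JSON-lines tweet records."""
    seen = set()
    ids = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        # Assumed layout: the ID is in "id_str" (string) or, failing that, "id" (number).
        tweet_id = str(tweet.get("id_str") or tweet["id"])
        if tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids

# Hypothetical records, for illustration only.
sample = [
    '{"id_str": "1001", "text": "first", "user": {"screen_name": "a"}}',
    '',
    '{"id": 1002, "text": "second"}',
    '{"id_str": "1001", "text": "first"}',
]
tweet_ids = dehydrate(sample)
print("\n".join(tweet_ids))
```

Only the identifiers survive; the text, user details, and all other fields are discarded, which is what makes the deposited file compatible with the redistribution terms described above.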
● mannheimer and hull, sharing selves: developing an ethical framework for curating social media data.
● north carolina state university’s social media archives toolkit, which contains guidance on the legal and ethical implications of sharing social media data, and an annotated bibliography with further resources.

medical images

before you archive medical images, remove any direct identifiers you do not have explicit consent to share, such as name, patient id, and exact dates from the image header or embedded metadata, and black out any pixels in the image that contain identifying information. neuroimages must also be defaced using a tool such as pydeface.

the following resources provide more guidance for de-identifying dicom files:
● the cancer imaging archive (tcia) de-identification overview.
○ see specifically “table - dicom tags modified or removed at the source site” for a list of dicom tags deemed to be unsafe.
● the radiological society of north america (rsna) international covid- open radiology database (ricord) de-identification protocol.
● the dicom standard itself provides important guidance for de-identifying header information. specifically, dicom part : security and system management profiles, appendix e: attribute confidentiality profiles may be useful. sara mannheimer and elizabeth hull, “sharing selves: developing an ethical framework for curating social media data,” international journal of digital curation , no. (april , ), https://doi.org/ . /ijdc.v i . . “social media archives toolkit,” north carolina state university libraries, accessed august , , https://www.lib.ncsu.edu/social-media-archives-toolkit. some repositories may be able to assist you or recommend tools for defacing. for example, the international neuroimaging data-sharing initiative (indi) can help researchers who plan to share their data on the indi platform. for further information, see the indi data contribution guide, accessed august , , http://fcon_ .projects.nitrc.org/indi/indi_data_contribution_guide.pdf. see also, omer faruk gulban, dylan nielson, russ poldrack, john lee, chris gorgolewski, vanessasaurus, and satrajit ghosh, “poldracklab/pydeface: v . . .” october , . http://doi.org/ . /zenodo. . kirby, justin. “submission and de-identification overview.” the cancer imaging archive (tcia), university of arkansas for medical sciences, april , , https://wiki.cancerimagingarchive.net/display/public/submission+and+de-identification+overview. “rsna international covid- open radiology database (ricord) de-identification protocol,” radiological society of north america, international covid- open radiology database, accessed august , , https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-covid- -deidentification-protocol.pdf. medical imaging & technology alliance, dicom standards committee, “dicom part : security and system management profiles.” dicom standard (arlington, va: national electrical manufacturers association), accessed august , , https://www.dicomstandard.org/current/. https://doi.org/ . /ijdc.v i . 
○ these profiles attempt to balance the need to protect privacy with the need to retain information so the data remain useful.
○ if it is necessary to retain identifiers, your reb application will ideally have referenced the profile you intend to use, and your consent form should clearly state what information will be shared.

de-identification of dicom files may be done programmatically, using software to strip identifiers from the header.
● tcia recommends the clinical trial processor (ctp) software developed by rsna.
● rsna’s covid- open radiology database (ricord) recommends another rsna tool called anonymizer, and has published instructions on how to install and use it.
anonymizer implements ricord’s de-identification protocol.
● there are many other non-commercial options available, such as the dicomcleaner™ tool.
● as with all de-identification software, results may be variable, and you should confirm that identifying information was removed before you share your images. note that:
○ vendors or end-users may not have always used dicom elements in a way that conforms to the standard.
○ private elements or private tags may have been used to store personal information, and the use of these tags may not be well-defined in the vendor documentation.

genomics data, and other biomedical samples

because each person's dna sequence is unique, human biological materials can never be truly anonymous. before you archive or biobank these data, please review your consent form. ideally the consent process will have:
● provided participants with information about how their data will be used, analyzed, stored and shared,
● identified what information will be stored alongside the data,
● communicated what level of privacy or confidentiality a participant may expect, and who may have access to the data,
● indicated whether the data/samples will be stored in canada or outside of canada,
● acknowledged whether there is a possibility that the data will be used for commercial purposes,
● clearly explained the risks of disclosure.

further information is available in tcps ( ), chapter : human biological materials including materials related to human reproduction (sections a and d specifically), and chapter : human genetic research. see also thorogood ( ) canada: will privacy rules continue to favour open science?

the nih privacy in genomics webpage provides a concise overview of some of the benefits and risks of sharing genetic information. for an example of how genetic information was used to identify study participants, see identifying personal genomes by surname inference, or a summary of the study in the nature editorial on genetic privacy. for further information on ethics and consent in genomics, see the global alliance for genomics and health regulatory & ethics toolkit resources, such as data privacy and security policy and consent policy.

“clinical trial processor (ctp),” radiological society of north america, medical imaging resource community (mirc), accessed august , , https://www.rsna.org/research/imaging-research-tools.
“rsna covid- dicom data anonymizer,” radiological society of north america, international covid- open radiology database, accessed august , , https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-anonymizer-program-instructions.pdf.
“rsna international covid- open radiology database (ricord) de-identification protocol,” radiological society of north america, international covid- open radiology database.
clunie, david a., “dicomcleaner™,” pixelmed publishing™, accessed july , , http://www.dclunie.com/pixelmed/software/webstart/dicomcleanerusage.html.
government of canada (canadian institutes of health research, the natural sciences and engineering research council of canada, and the social sciences and humanities research council), tri-council policy statement: ethical conduct for research involving humans, december , https://ethics.gc.ca/eng/policy-politique_tcps -eptc _ .html.
adrian thorogood, “canada: will privacy rules continue to favour open science?” human genetics : – (july , ), https://doi.org/ . /s - - -z.
“privacy in genomics,” national human genome research institute, february , , https://www.genome.gov/about-genomics/policy-issues/privacy.
gymrek, melissa, amy l. mcguire, david golan, eran halperin, and yaniv erlich. “identifying personal genomes by surname inference.” science , no. (jan , ): - . https://doi.org/ . /science. ; and “genetic privacy” [editorial], nature (january , ): , https://doi.org/ . / a.
global alliance for genomics & health. genomic toolkit: regulatory & ethics toolkit. toronto, on: global alliance for genomics and health, accessed july , , https://www.ga gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/.

appendix : code for checking k-anonymity

--- stata ---
* stata code for checking k-anonymity
* kristi thompson, may
* create the equivalence groups
egen equivalence_group = group(var1 var2 var3 var4 var5)
* create a variable to count cases in each equivalence group
sort equivalence_group
by equivalence_group: gen equivalence_size = _N
* list the id numbers of equivalence groups containing fewer than K cases
* (K is a placeholder: substitute your chosen threshold)
tab equivalence_group if equivalence_size < K, sort
* list the values of the quasi-identifiers for each small equivalence class
* (G is a placeholder: substitute a group id from the table above)
list var1 var2 var3 var4 var5 if equivalence_group == G

--- r ---
# r code for checking k-anonymity
# carolyn sullivan and kristi thompson, may
# install plyr, a useful data manipulation package.
install.packages("plyr")
# load the library.
library('plyr')
datafile <- " location of the data file - csv format - "
# read the csv file.
df <- read.csv(datafile)
# figure out what equivalence classes there are, and how many cases in each equivalence class.
dfunique <- ddply(df, .(var1, var2, var3, var4, var5), nrow)
# sort so that the smallest equivalence classes come first (ddply names the count column V1).
dfunique <- dfunique[order(dfunique$V1),]
View(dfunique)

the uk anonymisation network’s anonymisation decision-making framework, appendix b has code for doing this in spss.

elliot, mackey, o’hara, and tudor, the anonymisation decision-making framework, https://ukanon.net/ukan-resources/ukan-decision-making-framework/.
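For readers who work in neither Stata nor R, the same check can be sketched in plain Python. This is an illustrative equivalent under stated assumptions, not part of the original appendix: the column names, the sample rows, and the threshold `k=3` are all hypothetical. It groups records by their quasi-identifier values, counts each group, and reports the groups that fall below the threshold.

```python
from collections import Counter

def equivalence_class_sizes(rows, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)

def small_classes(rows, quasi_identifiers, k):
    """Return the equivalence classes with fewer than k records, smallest first."""
    sizes = equivalence_class_sizes(rows, quasi_identifiers)
    below = [(combo, n) for combo, n in sizes.items() if n < k]
    return sorted(below, key=lambda item: item[1])

# Hypothetical data: three quasi-identifiers, six records.
rows = [
    {"age_group": "30-39", "sex": "F", "region": "north"},
    {"age_group": "30-39", "sex": "F", "region": "north"},
    {"age_group": "30-39", "sex": "F", "region": "north"},
    {"age_group": "40-49", "sex": "M", "region": "south"},
    {"age_group": "40-49", "sex": "M", "region": "south"},
    {"age_group": "60-69", "sex": "F", "region": "south"},
]
qis = ["age_group", "sex", "region"]

risky = small_classes(rows, qis, k=3)
# The dataset as a whole is k-anonymous for k equal to its smallest class size.
dataset_k = min(equivalence_class_sizes(rows, qis).values())
print(risky)
print(dataset_k)
```

In this sample the single 60-69 record forms a class of one, so the dataset is only 1-anonymous until that record is generalized or suppressed.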
appendix : free de-identification software packages

many of these tools take a hierarchical approach to de-identifying data, which means that you will need to pre-define possible generalizations for the quasi-identifiers in the dataset, and the program will search for possible solutions and recommend a set of the generalizations to use to best meet anonymization goals. for datasets with a large number of quasi-identifiers, or cases where several datasets with similar quasi-identifiers need to be de-identified, this might be a useful approach. for smaller datasets, it may be more straightforward to work in a statistical package.

the software packages included here all have some usability issues, and fairly steep learning curves. amnesia and the graphical user interface to sdcmicro may be the most user-friendly.

recommended tools:
● amnesia
○ this software has both online and desktop versions; however, uploading sensitive data to a third-party web site is not generally recommended. if possible, install the software locally (windows or linux only).
○ amnesia supports k-anonymity and k^m-anonymity (a slightly more flexible approach to anonymity when the number of quasi-identifiers in a dataset is very high, as it allows for combinations up to m quasi-identifiers to appear at least k times in the published data).
○ a few limitations: there is not currently a way to specify missing values, and documentation could be more thorough (for instance, defining hierarchies is not straightforward).
○ this software may work best for clinical data, or data which are not survey data.
● sdcmicro
○ an r package for statistical disclosure control (microdata anonymization). this software can read many data types (e.g., csv, sav, dta, sas7bdat, xlsx) and can be used in windows, linux or mac operating systems. implements muargus code.
○ a graphical user interface is available, and there is a vignette with guidance called ‘using the interactive gui - sdcapp’ linked from the sdcmicro landing page in cran repository. ○ please be aware that large datasets take time to load, and computation time for large or complex datasets may be lengthy. ○ the statistical disclosure control for microdata practice guide section on sdc with sdcmicro in r may be helpful if you need further guidance installing and using the sdcmicro package, or see benschop’s sdcmicro gui manual documentation. “using the interactive gui – sdcapp, the comprehensive r archive network (cran), accessed august , , https://cran.r-project.org/web/packages/sdcmicro/vignettes/sdcmicro.html. “computation time,” sdc with sdcmicro in r: setting up your data and more, sdc practice guide, , https://sdcpractice.readthedocs.io/en/latest/sdcmicro.html#computation-time. “statistical disclosure control (sdc): an introduction,” sdc practice guide, , https://sdcpractice.readthedocs.io/en/latest/sdc_intro.html; and thijs benschop and matthew welch. statistical disclosure control for microdata: a practice guide for sdcmicro, international household survey network, accessed august , , https://sdcpractice.readthedocs.io/en/latest/index.html. 
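The hierarchical approach described at the start of this appendix can be sketched in a few lines. The example below is a simplified illustration and does not reproduce how Amnesia or sdcMicro work internally: it pre-defines a generalization hierarchy for a single quasi-identifier (age) and searches for the least coarse level at which every equivalence class holds at least k records. The hierarchy, the field names, and the sample data are hypothetical.

```python
from collections import Counter

# Hypothetical hierarchy for one quasi-identifier: each level is coarser than the last.
AGE_LEVELS = [
    lambda age: str(age),                                  # level 0: exact age
    lambda age: f"{age // 10 * 10}-{age // 10 * 10 + 9}",  # level 1: ten-year band
    lambda age: "under 50" if age < 50 else "50+",         # level 2: two groups
    lambda age: "*",                                       # level 3: suppressed
]

def min_class_size(rows, level):
    """Smallest equivalence class once age is generalized to the given level."""
    sizes = Counter((AGE_LEVELS[level](r["age"]), r["sex"]) for r in rows)
    return min(sizes.values())

def lowest_level_for_k(rows, k):
    """Return the least coarse level at which every class holds at least k records."""
    for level in range(len(AGE_LEVELS)):
        if min_class_size(rows, level) >= k:
            return level
    return None  # even full suppression of age does not reach k

# Hypothetical records, for illustration only.
rows = [
    {"age": 31, "sex": "F"}, {"age": 34, "sex": "F"}, {"age": 38, "sex": "F"},
    {"age": 42, "sex": "M"}, {"age": 47, "sex": "M"}, {"age": 45, "sex": "M"},
]
level = lowest_level_for_k(rows, k=3)
print(level)
```

Real tools search over hierarchies for several quasi-identifiers at once and weigh information loss, which is why defining the hierarchies carefully matters; this sketch only shows the idea for one variable.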
other tools that may be useful:
● arx
○ open source anonymization tool for use in windows, linux, and mac. provides support for sql databases, xlsx and csv files, and has a graphical user interface.
○ supports various privacy models including k-anonymity, and variants ℓ-diversity, t-closeness, β-likeness, and more.
○ allows end-users to categorize, top and bottom code, generalize, and transform data in more complex ways.
○ large datasets take time to load, and computation time for large or complex datasets may be lengthy.
● mu-argus
○ software to apply statistical disclosure control techniques. the program takes a hierarchical approach to de-identifying data.
○ jar file should be executable in windows or mac os.
○ a tester found that getting data loaded and correctly defined was a bit of a challenge and advised that the program could use better documentation on setting up hierarchies.
● the university of texas at dallas anonymization toolbox
○ the toolbox currently supports different anonymization methods and privacy definitions, including k-anonymity, ℓ-diversity, and t-closeness.
○ algorithms can either be applied directly to a dataset or can be used as library functions inside other applications. ○ this is a set of java routines. data curators who prefer to do their statistical programming in java might find it useful. “privacy models,” arx – data anonymization tool, accessed august , , https://arx.deidentifier.org/overview/privacy-criteria/. https://arx.deidentifier.org/overview/ https://arx.deidentifier.org/overview/privacy-criteria/ https://github.com/sdctools/muargus http://cs.utdallas.edu/dspl/cgi-bin/toolbox/index.php https://arx.deidentifier.org/overview/privacy-criteria/ portage network / canadian association of research libraries appendix : fee-based services for de-identification a few fee-based services that researchers may opt to use for de-identification are included below: ● d-wise (american & european offices) ○ offering free anonymization services to anyone working on a covid- vaccine. ○ offering free anonymization services to researchers who deposit individual participant- level data from covid- clinical trials in vivli. ● inter-university consortium for political and social research (icpsr) (archive headquartered at university of michigan) ○ if you wish icpsr to conduct disclosure analysis of your data, you will need to purchase the professional curation package. cost is based on the number of variables and complexity of the data. contact icpsr acquisitions at deposit@icpsr.umich.edu for additional information (information obtained from open icpsr faq under pricing and sensitive data sections). ● privacy analytics (ottawa-based company) ○ privacy analytics can review datasets as part of their data privacy validation services. ○ methodology based on the hipaa expert determination de-identification standard. ○ to find out more about their services, please fill in the form at the bottom of their “certification” webpage. 
“d-wise offers free transparency services accelerating covid- vaccine research,” cision prweb, march , , https://www.prweb.com/releases/d_wise_offers_free_transparency_services_accelerating_covis_ _vaccine_research/prweb .htm.
“d-wise offers anonymization services available on vivli covid- portal,” center for global clinical research data, april , , https://www.prweb.com/releases/d_wise_offers_free_transparency_services_accelerating_covis_ _vaccine_research/prweb .htm.
“faqs,” openicpsr, accessed august , , https://www.openicpsr.org/openicpsr/faqs.
“clinical trial transparency services,” privacy analytics, accessed on august , , https://privacy-analytics.com/clinical-trial-transparency/ctt-services/.
“double-check your data and leverage it with confidence,” privacy analytics, accessed on august , , https://privacy-analytics.com/health-data-privacy/health-data-services/expert-data-opinion-services/.
https://privacy-analytics.com/clinical-trial-transparency/ctt-services/ https://privacy-analytics.com/clinical-trial-transparency/ctt-services/ https://privacy-analytics.com/health-data-privacy/health-data-services/expert-data-opinion-services/ portage network / canadian association of research libraries resources . amnesia https://amnesia.openaire.eu/ . arx https://arx.deidentifier.org/overview/ . d-wise https://www.d-wise.com/de-identification-services . mu-argus https://github.com/sdctools/muargus . inter-university consortium for political and social research (icpsr) https://www.openicpsr.org/openicpsr/ . privacy analytics https://privacy-analytics.com/services/certification/ . sdcmicro https://cran.r-project.org/web/packages/sdcmicro/index.html . the university of texas at dallas anonymization toolbox http://cs.utdallas.edu/dspl/cgi- bin/toolbox/index.php references . alder, steve. “what is considered phi under hipaa rules?” hipaa journal, december , . https://www.hipaajournal.com/considered-phi-hipaa/. . benschop, thijs, and matthew welch. “statistical disclosure control for microdata: a practice guide for sdcmicro.” international household survey network. accessed august , . https://sdcpractice.readthedocs.io/en/latest/index.html. . clunie, david a. “dicomcleaner™.” pixelmed publishing™. accessed july , . http://www.dclunie.com/pixelmed/software/webstart/dicomcleanerusage.html. . documenting the now. “docnow/hydrator.” github. accessed august , . https://github.com/docnow/hydrator. . documenting the now. “docnow/twarc.” github. accessed august , . https://github.com/docnow/twarc. . el emam, khaled, and fida kamal dankar. “protecting privacy using k-anonymity.” journal of the american medical informatics association , no. (september ): – . https://doi.org/ . /jamia.m . . elliot, mark, elaine mackey, kieron o’hara, and caroline tudor. the anonymisation decision- making framework. uk anonymisation network (ukan). university of manchester. . 
https://ukanon.net/ukan-resources/ukan-decision-making-framework/. . “genetic privacy.” [editorial]. nature (january , ): . https://doi.org/ . / a. . global alliance for genomics & health. genomic toolkit: regulatory & ethics toolkit. toronto, on: global alliance for genomics and health. accessed july , . https://www.ga gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/. . government of canada (canadian institutes of health research, the natural sciences and engineering research council of canada, and the social sciences and humanities research council). tri-council policy statement: ethical conduct for research involving humans. december . https://ethics.gc.ca/eng/policy-politique_tcps -eptc _ .html. https://amnesia.openaire.eu/ https://github.com/sdctools/muargus https://www.openicpsr.org/openicpsr/ https://www.hipaajournal.com/considered-phi-hipaa/ https://sdcpractice.readthedocs.io/en/latest/index.html http://www.dclunie.com/pixelmed/software/webstart/dicomcleanerusage.html https://github.com/docnow/hydrator https://github.com/docnow/twarc https://doi.org/ . /jamia.m https://ukanon.net/ukan-resources/ukan-decision-making-framework/ https://doi.org/ . / a https://www.ga gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/ https://ethics.gc.ca/eng/policy-politique_tcps -eptc _ .html portage network / canadian association of research libraries . gulban, omer faruk, dylan nielson, russ poldrack, john lee, chris gorgolewski, vanessasaurus, and satrajit ghosh. “poldracklab/pydeface: v . . .” zenodo. october , . http://doi.org/ . /zenodo. . . gymrek, melissa, amy l. mcguire, david golan, eran halperin, and yaniv erlich. “identifying personal genomes by surname inference.” science , no. (jan , ): - . https://doi.org/ . /science. . . hrynaszkiewicz, iain, melissa l. norton, andrew j. vickers, and douglas g. altman. “preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers.” bmj (january , ): c . 
https://www.bmj.com/content/ /bmj.c . . information and privacy commissioner ontario. deidentification guidelines for structured data. information and privacy commissioner of ontario. june , . https://www.ipc.on.ca/privacy- organizations/de-identification-centre/. . international household survey network. “measuring the disclosure risk.” accessed august , . https://ihsn.org/anonymization-risk-measure. . international neuroimaging data-sharing initiative (indi). data contribution guide. accessed august , . http://fcon_ .projects.nitrc.org/indi/indi_data_contribution_guide.pdf. . kirby, justin. “submission and de-identification overview.” the cancer imaging archive (tcia), university of arkansas for medical sciences. april , . https://wiki.cancerimagingarchive.net/display/public/submission+and+de- identification+overview. . mannheimer, sara, and elizabeth hull. “sharing selves: developing an ethical framework for curating social media data.” international journal of digital curation , no. (april , ). https://doi.org/ . /ijdc.v i . . . medical imaging & technology alliance, dicom standards committee. “dicom part : security and system management profiles.” in dicom standard. arlington, va: national electrical manufacturers association. accessed august , . https://www.dicomstandard.org/current/. . moore, stephen m., david r. maffitt, kirk e. smith, justin s. kirby, kenneth w. clark, john b. freymann, bruce a. vendt, lawrence r. tarbox, and fred w. prior. “de-identification of medical images with retention of scientific research value.” radiographics , no. (may , ). https://doi.org/ . /rg. . . national human genome research institute. “privacy in genomics.” february , . accessed august , . https://www.genome.gov/about-genomics/policy-issues/privacy. . north carolina state university libraries. “social media archives toolkit.” accessed august , . https://www.lib.ncsu.edu/social-media-archives-toolkit. . portage covid- working group, “can i share my data?” september , . 
https://doi.org/ . /zenodo. . . portage covid- working group. “documentation and supporting materials required for deposit.” september , . https://doi.org/ . /zenodo. . . portage covid- working group. “recommended repositories.” september , . https://doi.org/ . /zenodo. . . international household survey network. “anonymization principles.” accessed august , . https://ihsn.org/node/ . . radiological society of north america, international covid- open radiology database. “rsna international covid- open radiology database (ricord) de-identification protocol.” accessed august , . https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-covid- - deidentification-protocol.pdf. http://doi.org/ . /zenodo. https://doi.org/ . /science. https://www.bmj.com/content/ /bmj.c https://www.ipc.on.ca/privacy-organizations/de-identification-centre/ https://www.ipc.on.ca/privacy-organizations/de-identification-centre/ https://ihsn.org/anonymization-risk-measure http://fcon_ .projects.nitrc.org/indi/indi_data_contribution_guide.pdf https://wiki.cancerimagingarchive.net/display/public/submission+and+de-identification+overview https://wiki.cancerimagingarchive.net/display/public/submission+and+de-identification+overview https://doi.org/ . /ijdc.v i . https://www.dicomstandard.org/current/ https://doi.org/ . /rg. https://www.genome.gov/about-genomics/policy-issues/privacy https://www.lib.ncsu.edu/social-media-archives-toolkit https://doi.org/ . /zenodo. https://doi.org/ . /zenodo. https://doi.org/ . /zenodo. https://ihsn.org/node/ https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-covid- -deidentification-protocol.pdf https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-covid- -deidentification-protocol.pdf portage network / canadian association of research libraries . radiological society of north america, international covid- open radiology database. “rsna covid- dicom data anonymizer.” accessed august , . 
https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-anonymizer-program-instructions.pdf.
radiological society of north america, medical imaging resource community (mirc). “clinical trial processor (ctp).” accessed august , . https://www.rsna.org/research/imaging-research-tools.
ryerson university research ethics board. “guidelines for research involving social media.” ryerson university. november, . https://www.ryerson.ca/content/dam/research/documents/ethics/guidelines-for-research-involving-social-media.pdf.
thorogood, adrian. “canada: will privacy rules continue to favour open science?” human genetics : – (july , ). https://doi.org/ . /s - - -z.
twitter. “developer terms: more about restricted uses of the twitter apis.” accessed august , . https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases.
uk data service. “anonymisation: qualitative data.” last modified june , . https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation/qualitative.aspx.
zeffiro, andrea, and jay brodeur. “social media research data ethics and management.” workshop presented april , . sherman centre for digital scholarship. mcmaster university. http://hdl.handle.net/ / .

electronic copy available at: http://ssrn.com/abstract=

enhancing scholarly publishing in the humanities and social sciences: innovation through hybrid forms of publication

paper prepared for pks scholarly publishing conference - september
by nicholas w. jankowski, clifford tatum, zuotian tatum, andrea scharnhorst
e-humanities group, royal netherlands academy of arts and sciences, knaw
cruquiusweg at amsterdam, nl
enhanced publications project web site

abstract

enhancing scholarly publication involves presentation in a web environment with interlinking of the 'objects' of a document: datasets, supplementary materials, secondary analyses, and post-publication interventions.
development of the semantic web is aimed at facilitating long-term content structure through standardized meta-data formats intended to improve interoperability between concepts and terms within and across knowledge domains. at the same time, ad-hoc scholarly discourse, facilitated by the participatory dynamics of web . applications, contributes to an emergent content structure through compliance with open web standards. while the top-down semantic web and bottom-up intertextuality structure are not inherently incompatible, their differences have implications for the design, use, and diffusion of enhanced scholarly publications. in this paper we illustrate a hybrid approach that employs semantic web techniques while focusing on practices entailed in contemporary intertextual discourses. this approach is applied to four books prepared for traditional academic publishers; the web sites and functions for three of these books are described in this paper.

introduction

scholars in the humanities and social sciences are increasingly considering possibilities to make research available on the web. instruments for data collection and analysis, datasets and metadata describing this material, conference papers and project reports are appearing in web-based repositories. one area lagging behind in this trend, however, is development of web venues that integrate the traditionally published book with the diverse materials related to an overall research project. ‘enhanced publications’ is a term reflecting such integration, and a range of initiatives have been supported by the surffoundation in the netherlands to develop such forms of publication. one of the six projects included in this initiative is ‘enhancing scholarly publishing in the humanities and social sciences’.
initiated in january , enhancing scholarly publishing was designed to prepare web sites for four traditionally-published scholarly books and, in the process, to utilize a model of and tools for enhanced publications developed by surf. the project concluded at the end of june . this paper notes the objectives and accomplishments during that period, reflects on some of the challenges encountered, and sketches paths meriting further exploration.

http://pkp.sfu.ca/ocs/pkp/index.php/pkp /pkp
http://ehumanities.nl/
http://knaw.nl/smartsite.dws?id= &lang=eng
http://digital-scholarship.ehumanities.nl/enhanced-publications/

the project had three objectives: ( ) to develop hybrid forms of publications, ( ) to develop a database allowing for aggregation of content attributes and associations across the individual book web sites, and ( ) to disseminate the results of the experiences in preparing enhanced publications.

regarding the first objective, a template constructed within the wordpress platform was to be used to construct web sites to complement the four selected books. these web sites were to contain a broad range of features:
• supplementary resources (e.g., links, blogs, chapter appendices, author profiles);
• chapter visualizations (e.g., animations, figures, tables) in color;
• hyperlinks, both internal and external to the book texts;
• author updating of site materials;
• search features.

regarding the second objective, a database was to be developed that would allow for aggregation of content attributes and associations across the individual book web sites, such that topical relationships, intellectual underpinnings, and contextual factors could be made explicit. fundamental to this approach is a focus on web-based texts as dynamic and evolving discourses rather than completed works ready to be archived.
the third objective of the project was to disseminate the results of the experiences in preparing enhanced publications. this objective involved sharing experiences through a range of activities: conference papers and presentations, such as this paper for the pkp conference; journal publications, such as a theme issue of new media & society; and discussions with publishers and other players involved in scholarly publishing.

four books were selected for inclusion in the project: three edited anthologies and a single-author university-level textbook:
• jankowski, n. w. ( ). e-research: transformation in scholarly practice. new york: routledge.
• wouters, p., beaulieu, a., scharnhorst, a., & wyatt, s. (forthcoming, under review by mit press). virtual knowledge.
• park, d., jankowski, n. w., & jones, s. ( ). the long history of new media: technology, historiography, and contextualizing newness. new york: peter lang.
• jankowski, n. w. (forthcoming ). digital media: concepts & issues, research & resources. cambridge uk: polity press.

the conceptual foundations of the web sites and the underlying principles of the database architecture are elaborated in the next section. web sites complementing three of the four scholarly book publications noted above are presented in the subsequent section (the fourth book is insufficiently developed to merit presentation at this time). finally, in the concluding section we reflect on the overall project and note areas where further work is to be undertaken.

platform design and features

as an important mode of scholarly communication, particularly in the humanities, the academic book format has experienced relatively little enhancement from the affordances of digital media, networked content, and database technologies. in this project we leverage the potential for creating enhancements for the book in its present form.
the basic architecture for this project includes a database-driven web site for each of four traditionally-published books, along with a central aggregation database that facilitates synchronization features and will in a future stage of development include queries within and across the four books. as reflected in figure , a conceptual diagram of the project, content on each book web site is managed with a local database, connected to a central database. in this way, the linkage is established for aggregation and each book web site retains an individual web presence with local content management and storage.

in addition to the five web sites (four book web sites and the central project web site), a central accomplishment for the project is the launch of semantic wordpress for digital scholarship, termed semantic words, and two specially tailored open source plugins designed to introduce traditionally published books to web-based scholarly communication. each of the four book web sites contains a broad range of features intended to enhance the printed versions of these books, including supplementary resources, visualizations, intertextual linking of content, and formal structuring of content using semantic web ontologies. in a subsequent phase, further development is planned of the central database in order to facilitate aggregation of content across the individual book web sites, such that object relationships, discursive threads, and contextual factors can be made explicit.

http://ep-books.ehumanities.nl/semantic-words

figure : schematic diagram of ep project

the following web sites have been developed as part of this project:
• central project web site;
• e-research book web site;
• virtual knowledge book web site;
• long history of new media book web site.

in this project we employed the wordpress content management system (cms) as the foundation for the web sites, both for its relative ubiquity and ease of use.
according to the recent world wide web technology survey (w techs), the wordpress platform is used by % of the largest one million web sites on the internet, which amounts to more than half of the web sites using a content management system. the development strategy for this project focused in the first place on academic practice: how academics in the social sciences and humanities conduct scholarship. this priority informed development-specific functionality and interfaces. to this end, the relative ease of using wordpress played an important role. figure displays the functional modules of the platform in three clusters: the wordpress software (upper left), community-developed wordpress plugins (lower left) and the custom-developed plugins (right).

semantic wordpress for digital scholarship

the semantic wordpress for digital scholarship framework (semantic words) is the basis for this hybrid platform, which leverages web . participatory modes of scholarly communication combined with formalized content structures imposed by semantic web formats; see figures and . the semantic words software is developed as open source for use with the wordpress content management software, which is also open source. semantic words consists of two custom plugins that are integrated with wordpress and zotero, the open source web annotation and citation management system.

for background and rationale of this hybrid approach to enhanced publications, see tatum, ( june ) ‘web . and/or semantic web?’ available on the digital scholarship site.

http://ep-books.ehumanities.nl/
http://scholarly-transformations.virtualknowledgestudio.nl/
http://thebook.virtualknowledgestudio.nl/
http://thelonghistoryofnewmedia.net/
http://wordpress.org/
http://w techs.com/technologies/overview/content_management/all
http://ep-books.ehumanities.nl/semantic-words
http://digital-scholarship.ehumanities.nl/aggregated/augmenting-wordpress-for-enhanced-publication/

figure : semantic words functional diagram

enhanced bibliplug

the first of the semantic words plugins, enhanced bibliplug, provides a suite of features for authors, which are focused on organization and publication of academic content on the web. features include custom page templates for academic texts, integration with zotero for citation management, and expanded author profile pages for cv content management, such as publications, presentations, projects and other related career accomplishments. in addition to providing authors with advanced tools for publishing on the web, bibliplug facilitates visibility (e.g., in search engines) of relationships between and among researchers, institutions, and both formal and informal scholarly communication.

bibliplug was first developed for the virtual knowledge studio (vks) in , and is still in use on some dozen project-related web sites. at the time of development, the goal was to create a central repository for all researchers affiliated with the studio to organize their academic work. the initial design included: (a) database schema for storing bibliographical references, (b) administration pages to manage the references, and (c) short code for easy retrieval of references based on author, year, and publication type.

in this project we further developed the plugin and re-released it as enhanced bibliplug. added functionality includes: (a) the ability to connect and synchronize with zotero accounts (see figure ), (b) a custom author page template to display a user’s academic title and affiliation, bio, and cv content such as publications and presentations, (c) the ability to export bibliography data in rdf format based on the spar ontologies, and (d) the ability to group references based on categories and tags.
enhanced publication for wordpress

the second plugin, enhanced publication for wordpress, works in parallel with bibliplug. added content is simultaneously structured in semantic web formats based on academic publishing ontologies. unlike many semantic web applications, this plugin includes integration of a visualization feature, such that object relationships can be browsed with the incontext application developed by the surffoundation. the central function of this plugin is to describe a wordpress site as an oai-ore aggregated book (an enhanced publication). in this structure, we convert wordpress pages into book chapters and use various other plugins to facilitate and describe reference lists, authors and editors, and attachments. for visualizing the content object relationships, we employ surf’s incontext visualiser; see figure .

http://www.zotero.org/
http://ep-books.ehumanities.nl/semantic-words/enhanced-bibliplug
http://www.virtualknowledgestudio.nl/
http://ep-books.ehumanities.nl/semantic-words/enhanced-publication-plugin-for-wordpress
http://www.surffoundation.nl/en/projecten/pages/escapevisualisationcomponent.aspx

this combination of web site and book objects was not readily compatible with any single ontology. we therefore selected a list of related ontologies to describe the full content of the aggregation.
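To make the idea of an OAI-ORE aggregated book concrete, the following is a minimal sketch of how a site and its chapter pages might be expressed as triples. It is an illustration under stated assumptions, not the plugin's actual output: the URI patterns (`/rem`, `/aggregation`), the function names, and the example site are hypothetical, and only the OAI-ORE terms `ResourceMap`, `Aggregation`, `describes`, and `aggregates` plus the Dublin Core `title` term are taken from the published vocabularies.

```python
ORE = "http://www.openarchives.org/ore/terms/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_book(site_url, title, chapter_slugs):
    """Build (subject, predicate, object) triples for one book web site.

    Objects are tagged as ("uri", value) or ("literal", value).
    """
    resource_map = site_url + "/rem"          # hypothetical URI pattern
    aggregation = site_url + "/aggregation"   # hypothetical URI pattern
    triples = [
        (resource_map, RDF_TYPE, ("uri", ORE + "ResourceMap")),
        (resource_map, ORE + "describes", ("uri", aggregation)),
        (aggregation, RDF_TYPE, ("uri", ORE + "Aggregation")),
        (aggregation, DCTERMS + "title", ("literal", title)),
    ]
    # Each page of the site becomes one aggregated resource (a chapter).
    for slug in chapter_slugs:
        chapter = f"{site_url}/{slug}"
        triples.append((aggregation, ORE + "aggregates", ("uri", chapter)))
    return triples


def to_ntriples(triples):
    """Serialize triples as simplified N-Triples (escapes only \\ and ")."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    book = describe_book(
        "http://example.org/book", "Example Book", ["chapter-1", "chapter-2"]
    )
    print(to_ntriples(book))
```

Terms from further vocabularies (for instance authors, editors, and reference lists) would be added as extra triples on the aggregation or on individual chapters, which is why a single ontology was not sufficient.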
following is a list of ontologies used in creating a new ontology that satisfied this need:
• rdf: resource description framework ontology;
• oai-ore vocabulary for resource aggregation;
• dcterms: dublin core metadata ontology;
• foaf: friend of a friend ontology;
• frbr: functional requirements for bibliographic records ontology;
• swan: provenance, authoring and versioning in scientific discourse ontology;
• res: academic researchers ontology;
• biro: bibliographic reference ontology;
• fabio: frbr-aligned bibliographic ontology;
• prism: publishing requirements for industry standard metadata ontology;
• escape-display: vocabulary for describing inverse relationship of foaf.

in addition to our custom plugins, we use several plugins from among the wide range of open source plugins developed by the wordpress community. eleven of these are depicted in the diagram shown in figure . we use another three plugins to augment functionality in our custom plugins: co-authors plus, ninja page categories and tags, and user avatar.

the wordpress community has produced more than , plugins; see http://wordpress.org/extend/plugins/.

http://wordpress.org/extend/plugins/co-authors-plus/
http://wpninjas.net/plugins/ninja-page-categories-and-tags/
http://wordpress.org/extend/plugins/user-avatar/

figure : zotero admin page in wordpress
figure : semantic words aggregation structure
figure : incontext visualization – screenshot from the virtual knowledge web site

enhancing scholarly publications

this section presents work undertaken in developing the web sites for three traditionally published, or to be published, books noted above. inasmuch as there are specific details and features related to each, they are presented in separate subdivisions of this section.
all of the book web sites, however, are based on a uniform template constructed within the wordpress content management system described in the previous section.

book : e-research: transformation in scholarly practice

the book e-research: transformation in scholarly practice was released mid- by routledge and reflects the characteristic features of traditionally published and specialized scholarly monographs: hardcover, black text printed on white paper, and figures reproduced in tones of gray. there is no use of color in the book, other than on the cover. a web-based enhanced version of this publication could include myriad features associated with web sites, such as:
• illustrations, figures and tables, in color;
• internal hyperlinks between sections of the book;
• external hyperlinks to related internet-based materials;
• supplementary resources for book chapters (e.g. recent publications, multimedia and other materials).

many additional features are also possible:
• interlinking index terms with book text;
• chapter references with hyperlinks;
• author search via google scholar for other publications;
• key word search for similar publications;
• periodic updating of material by chapter authors;
• comment and blog functions facilitating interactions between readers and authors.

routledge granted permission to place the text of the book on the web site, and this has allowed us to illustrate how the chapters will be presented in both pdf and html file formats. at this time, two chapters have been prepared in this fashion. figure shows the web site page with links to presentations given by authors; figure illustrates information on related books and links to sites associated with these publications. figure depicts author information from the database created for the book. although preparation of the web site complementing this book is well underway, the text for all chapters has not yet been uploaded to the site.
once completed, these chapter presentations will also include the following functionalities:
• search function through chapter texts;
• hyperlinks embedded in chapters;
• pop-up figures and tables in chapters.

figure : links to presentations given by contributors to e-research
figure : publications related to e-research
figure : author information from e-research database

book : virtual knowledge

the book manuscript virtual knowledge is based on research prepared by scholars associated with the virtual knowledge studio for the humanities and social sciences (vks), established by the royal netherlands academy of arts and sciences (knaw) in . the primary objective of the vks was to facilitate innovative research practices in the humanities and social sciences, and the book virtual knowledge is designed to reflect that aim. contributions came from scholars associated with the three divisions of the vks: in amsterdam, rotterdam (erasmus studio), and maastricht (maastricht studio). one function of the book project was to enhance cohesion among the wide array of vks projects and to foster interactions among staff at the three divisions of vks.

from its conception, vks intended to initiate and conduct new research practices, and to engage with on-going innovative practices of other researchers. in this regard, vks researchers were both ‘makers’ and ‘observers’ of new digital scholarship. two notions central to science and technology studies (sts), which constituted the home discipline of many of the central members of the vks, are practice and community. these notions are reflected in the preparation of virtual knowledge and in the complementing web site. regarding the book, three workshops were conducted during preparation of chapters; regarding the web site, one workshop was held related to preparing and uploading content for the site.

based on interactions during preparation of the book, it was decided to prepare a web complement to the print volume.
several considerations contributed to adoption of this idea:
• to continue interactions among authors;
• to support the formation of a community around ideas expressed in the book;
• to embed the book in an emerging environment of similar books;
• to disseminate and promote the book.

to support preparation of an enhanced publication for the book and to explore how preparation of such a web complement might facilitate the previously mentioned community function, a workshop was organized in april for book contributors. of the contributors, seven attended the workshop. the event provided opportunity for participants to become familiar with the web site and the general procedures for uploading information, including bibliographic entries that were submitted with a specially prepared plugin for the wordpress site. the workshop concluded with a general discussion, during which some persons expressed regret at not being involved in an earlier stage in order to contribute to the design process and the user interface with the site. this discussion was continued in a post-workshop survey that allowed all contributors to reflect on the web site under construction.

the vks concluded operation on december and the e-humanities group has been mandated to continue the work of the vks in a more focused manner and under a modified organizational structure. further aspects of this transition are indicated on the web site of the e-humanities group.

http://www.virtualknowledgestudio.nl/
http://ehumanities.nl/

the level of contribution during and after the workshop was modest. while content has been uploaded to the site, much remains to be completed. that acknowledged, preparation for the workshop did stimulate members of the project team to complete the web site template and specially developed plugins for bibliographic entries.
some of the criticisms of the workshop and reservations about an enhanced publication included:
• inadequate involvement of the book editors in the planning;
• unclear value of a book web site for authors;
• time constraints prevented engagement at the desired level;
• uncertainty about the utility of some site features, including author photos and videos;
• technical problems experienced with the site, including functioning of the interface;
• insufficient support from project team members in using the site.

some of the positive reactions to the workshop and ep project included:
• appreciation for being able to link references;
• acknowledgment of potential value in creating cohesion of edited collections through an enhanced publication;
• value of web site for author visibility;
• relevancy of site to own research practices.

negotiations are ongoing with the publisher to which the book manuscript virtual knowledge has been submitted for consideration. preliminary reactions reflect interest in publishing the volume and in combining the book with an enhanced publication in the form developed during this project. to this end, a preliminary web site has been prepared for the book and includes the basic functionalities included in the wordpress template; see figure for illustration of the page describing the book. many of the functionalities for the accompanying web site will remain important and further work will be required to complete preparation of the content related to these features (e.g., providing supplementary resources such as links, uploading bibliographic entries, completing video films of authors reflecting on their chapters). it is anticipated that a second workshop for authors may be necessary once arrangements have been made with the publisher regarding preparation of the book. this workshop will build on the experiences of the initial workshop held during this project.
http://thebook.virtualknowledgestudio.nl/

figure : page ‘about the book’ from virtual knowledge site

book : the long history of new media

the third book that is part of this project on enhanced publishing was released by peter lang in may and is entitled the long history of new media: technology, historiography, and contextualizing newness. as with the other books in the project, this is an edited volume and has been prepared and published in a manner reflective of conventional procedures for scholarly publishing. the book was released as a paperback and the cover consists of a designed arrangement of book title and names of editors. the text of the book is printed in black ink on white paper; there are few illustrations and no tables in the book.

the web site constructed for the long history of new media contains a set of features similar to those prepared for the other two books and uses the same wordpress template for the site; see figure illustrating the homepage of the site and figure containing biographical sketches of contributing authors. inasmuch as the book has recently been released by the publisher peter lang and no prior arrangement was made for reproducing the full book manuscript, only introductory paragraphs from the chapters have been uploaded to the site, along with the text of the introduction chapter. these texts are temporary additions to the site and require approval from the publisher before the site is formally announced, which is planned for the autumn .

the web site is under construction and the content will mirror that available on the site for the e-research book, and include the following features:
• book-related materials: description of book, table of contents, chapter abstracts, figures from chapters, compilation of references, and publisher information;
• profiles of contributors: photos and bios of authors and editors;
• supplementary resources: lists of institutions, publications, videos, and presentations related to web history;
• topic-related blogs: group blog for authors of the book, individual blogs by book authors, and other blogs relevant to the themes in the book;
• interlinking index terms with book text;
• figures reproduced on web site; figures in color;
• chapter references with hyperlinks and an overall bibliography for book;
• author search via google scholar for other publications by author;
• key word search for similar publications based on chapter titles.

figure : homepage of site for long history of new media
http://thelonghistoryofnewmedia.net/
figure : author biographical sketches, long history of new media

conclusion

in this project a form of enhanced publication has been developed for traditionally prepared and published scholarship. the undertaking has involved an accelerated learning experience in the terminology and models underlying the initial surf tender. we accepted the opportunity made available by the tender to achieve our initial and central objectives: preparation of web sites to complement traditionally published scholarly monographs, along with construction of an overarching database containing materials from those monographs. we were intentionally ambitious in formulation of objectives for the project, and it is not surprising that those objectives were only partially accomplished. still, the work undertaken reflects considerable accomplishment during the six months available for the project. much remains to be done, however, and some of the remaining tasks are itemized in this section. beyond a ‘to do’ list, this section reflects on the overall project and ‘next steps’ that build on and extend beyond this surf project with enhanced publishing.
original objectives

three objectives were formulated in the revised proposal submitted in january by the e-humanities group to the surffoundation: ( ) develop web sites to complement four printed volumes released by commercial and university-based publishers; ( ) develop a database allowing for aggregation of content across the individual book web sites; and ( ) share the experiences in preparing enhanced publications through conventional channels for scholarly communication, informal and formal. each of these objectives involved a range of specific activities, and a summary of the activities undertaken is presented below.

activities undertaken

the initial wordpress template for the book web sites was redesigned to facilitate ease of use by book authors and ensure basic uniformity in the presentation of site content. plugins for the site were designed, tested, and implemented; these plugins facilitate author uploading of references. web sites were prepared for each of the four books using a common template, and illustrations of content for each of the books were uploaded to the respective sites. the amount of content uploaded varied per book because of the different phases of completion and ‘life cycle’ of each book. for example, the book e-research was released two years ago and the publisher agreed to allow the full text of the book to be placed on the web site. the book digital media, in contrast, is still being written and only brief descriptions of chapters are made available. a database has been constructed for each of the book titles and these individual databases are integrated into an overall database. finally, a range of dissemination activities has been initiated: from individual talks with publishers and authors to formulation of proposals for conference panels and papers.

remaining tasks

perhaps the most pressing task remaining to be achieved involves further population of the four book web sites with content.
this is a prerequisite to approaching the authors of the respective books and requesting update of material. we envision availability of adequate content on each of the sites by late , at which time we will invite chapter authors to revise and update the content as necessary. in the case of the book virtual knowledge, site completion will be performed in consultation with the publisher of the volume, which has expressed strong interest in such a complementary web venue to the printed volume.

reflections

three points merit comment in this final section of the report: ( ) project duration, ( ) database construction, and ( ) the original aspiration of this project to accentuate the communicative component of enhancing scholarly publications.

first, six months is a very short period of time and is inadequate to achieve the initially formulated objectives. with hindsight, we should have strived for less in order to achieve more. that acknowledged, progress was made across four different scholarly titles for development of complementing sites using a platform with a large and active community of users.

second, we initiated construction of a database for objects from each of the books with some reservation, not knowing whether and how the construction would develop. as it turned out, this component of the project developed well and the work accomplished positively reflects the potential of enhanced publications containing such infrastructure.

third and finally, we underestimated the need for exchange with book editors and authors; although one of the books, virtual knowledge, was well-placed regarding short communication lines and author interest, we nevertheless encountered considerable reservation at the time of a specially planned workshop, partially reflected in limited attendance and subsequent uploading of content. the ‘lesson’ from this experience is that the need for personal communication among central players should not be underestimated.
in conclusion, it may be appropriate to mention that all members of the e-humanities group enhanced publication project are extending the achievements of this project into other arenas of journal and book publishing. other enhanced publications are in preliminary stages of planning, as is the preparation of reflective texts on this trajectory of scholarly communication. we welcome the challenges these plans involve and look forward to contributing further to a hybrid form of web-based and traditional publishing.

communicating value across the university: library assessment across academic, student, and administrative affairs
depaul university (http://www.depaul.edu)
from the selectedworks of scott walter (https://works.bepress.com/scott_walter/)
this work is licensed under a creative commons cc_by-nc international license. http://creativecommons.org/licenses/by-nc/ . /
available at: https://works.bepress.com/scott_walter/ /

scott walter, mls, phd
depaul university
chicago, illinois, usa
e-mail: swalte @depaul.edu

abstract

a recent survey of u.s. library directors has identified concerns in regard to their ability to communicate library contributions to student success to senior leadership and other institutional stakeholders, and to communicate the ways in which academic libraries contribute to strategic initiatives at the institutional level.
this paper presents a case study of an academic library in which alignment with the university mission and strategic plan, and alignment of library assessment efforts with the broader culture of assessment at the university, have resulted in improved communication of library value to senior leadership, new investment in library facilities, and enhanced opportunities for collaboration across the university on strategic initiatives including student success, innovation in teaching and scholarship, and community engagement. introduction since the publication of the value of academic libraries report (oakleaf, ), academic libraries across the united states have sought new ways to demonstrate the contributions they make to institutional goals and strategic initiatives related to student learning, faculty productivity, innovation in teaching and scholarship, community engagement, and more. as part of this “value agenda,” librarians have pursued new approaches to engagement with students, faculty, and other stakeholders in order to identify new opportunities for collaboration and to demonstrate impact on institutional priorities. as oakleaf ( , pp. - ) wrote: “academic librarians must understand institutional missions and how they contribute to them; they must also share that information with others by clearly aligning library services and resources to institutional missions. communicating that alignment is crucial for communicating library value in institutional terms.” to this, one might add that the ability to communicate library value in institutional terms is critical to senior leadership support for investment in library staff and services during a period of seismic change in higher education finance models in the united states (fiscal federalism initiative, ; seltzer, ; “what trump’s budget outline would mean for higher ed,” ). despite an unprecedented effort to support “value” studies in u.s. 
academic libraries since , chiefly as part of the association of college and research libraries (acrl) “value of academic libraries” (val) initiative (acrl, - ), questions remain regarding the best way to communicate library contributions to student learning, student success, and other strategic initiatives pursued at the institutional level. a recent survey of u.s. library directors (wolff-eisenberg, ) highlighted this continuing concern, finding that directors: ) have difficulty articulating the library contributions to student success; and, ) “feel increasingly less valued by, involved with, and aligned strategically with their supervisors and other senior leadership” (p. ). in order to provide continued support in this area, acrl has partnered with oclc research to establish an “action-oriented research agenda on library contributions to student learning and success” (oclc research, ). a participant in the first cohort of acrl’s “assessment in action” (aia) program (acrl, - ), depaul university presents a case study in how a commitment to mission and strategic alignment between the library and the university may serve to address concerns that campus colleagues and senior leadership do not fully appreciate the value of the academic library within the context of institutional goals. in an earlier essay, walter ( ) described how aia guidelines dovetailed with existing commitments at depaul university to collaboration in support of student success and to promoting a culture of assessment across the university. conducted in - , depaul’s aia project, a revision of information literacy instruction in the “common hour” provided to all students enrolled in its first-year-experience program, revealed ways in which students developed the “habits of mind” associated with academic inquiry (dempsey and jagman, ). 
this was also the first year of implementation for a new library strategic plan, and the first year following a major renovation of library space designed to promote greater collaboration among learner support services at the university. since , the library has conducted assessments of student learning, user experience, and library contributions to the strategic goals of the university, often in partnership with campus colleagues. one notable outcome of this approach has been the invitation for librarians to join university leadership groups such as the assessment advisory board, studio x advisory committee, and cross-college collaboration task force (c tf). strategic alignment between the library, campus colleagues, and community partners has resulted in a higher profile for library efforts and continued investment by senior leadership in the library, including the commitment of $ . m (u.s.) to another library renovation in (marciano, ; walter, ). taken as a whole, the depaul university library experience over the past years presents a compelling counter to the concerns about senior leadership perception of library value raised by wolff-eisenberg ( ). depaul university and the depaul university library founded in by the congregation of the mission (also known as the vincentians), depaul university is the largest catholic university in the united states, enrolling more than , students in schools and colleges offering undergraduate and graduate programs. one of the largest private, not-for-profit universities in the u.s., more than , faculty members teach across multiple campuses in chicago and its suburbs and in cohort programs sponsored by corporate partners. depaul has been recognized for its diversity, service-learning programs, and sustainability efforts, and is home to nationally recognized programs in the college of business, college of computing and digital media, and theatre school. in , depaul was recognized by u.s. 
news & world report as one of the nation’s “ most innovative schools” (depaul university, b). depaul has long been distinguished by a strong sense of shared mission among its faculty, staff, and students (filkins and ferrari, ; ferrari and velcoff, ). catholic in sponsorship, but pluralistic in composition, depaul maintains a historic commitment to meeting the needs of underserved communities for access to higher education, engaging with its urban community, conducting research relevant to issues of contemporary concern to the community, and preparing students for futures committed to service (office of mission and values, ). “it is a maxim of ours,” the university’s sponsor, st. vincent de paul, wrote, “to work in the service of the people” (de paul and coste, ), and this maxim continues to guide work at the university where his remains “the name above the door” (office of mission and values, ). commitment to its distinctive mission is evident in depaul’s student body and emphasis on student success. with % of its students coming from communities of color and % of its most recent freshman class representing first-generation students (enrollment management & marketing, a), depaul’s student body is highly diverse for a national, four-year, private university. recognized as a leader in strategic enrollment management, depaul is notable for its success in supporting students across this diverse community, with first-year retention rates, -year graduation rates, and -year graduation rates considerably higher than those reported at the national level by peer institutions (enrollment management & marketing, b). walter ( ) has described university-level leadership groups dedicated to coordinating efforts in support of student recruitment, retention, and success, library involvement with those groups, and the role this involvement has played in the successful launch of shared initiatives such as the learning commons (najmabadi, ). 
the depaul university library comprises the john t. richardson library, loop library, and library services delivered to depaul’s suburban campuses and cohort programs (depaul university library, ). with an annual budget of almost $ m (us) in fy , professional librarians, and a total staff complement of + fte, the library contributes to a number of campus programs, including first-year experience, teaching commons, and the new digital scholarship center, studio x. in recent years, the library has initiated partnerships in new areas of the university, including enrollment management and marketing, which has supported library initiatives related to k- community engagement and educational affordability. in , a major renovation of the richardson library included the launch of the information commons, a technology-enhanced space including individual and collaborative workspaces, learning commons (https://library.depaul.edu/get-help/pages/learning-commons.aspx), and scholar’s lab (https://library.depaul.edu/services/pages/scholars-lab.aspx). a similar renovation in will see the launch of the collaborative research environment (core), maker hub, a suite of digital media studios, and space for campus partners including studio x, c tf, and faculty instructional technology services (walter, ).
teaching, learning, and assessment at depaul university
another distinctive aspect of the institutional culture at depaul is the commitment to teaching and learning across the university. as one of the largest private universities in the u.s. where “faculty members’ priority is teaching” (depaul university, a), depaul champions the role of the “teacher-scholar” in higher education. evidence of this commitment can be found in the instructional improvement programs offered to faculty and staff through the university’s teaching commons ( ), including certificate programs in teaching and learning, online instruction, cultural competencies, and assessment of student learning.
this commitment extends beyond the traditional areas of academic affairs, as seen in the division of student affairs, which has established student learning outcomes in areas such as “persistence and academic achievement,” “socially responsible leadership,” and more (division of student affairs, a). student affairs staff provide student programs designed to meet these outcomes, seeing themselves as “[full partners] in the university’s efforts to promote student learning and success …. [recognizing] that learning happens always and everywhere throughout the student experience” (division of student affairs, b). walter and eodice ( ) and hinchliffe and wong ( ) have noted the potential for developing “powerful partnerships” between libraries and student affairs programs, and, while there are factors that may make this difficult (long, ), the core commitment to teaching and learning as part of the character of the institution has opened the door to many such partnerships at depaul. collaboration between academic and student affairs is also shaped by a shared commitment to assessment of student learning. the office of teaching, learning, and assessment ( a) coordinates an annual assessment of student learning across the curriculum and has begun work with co-curricular units, including student affairs, academic advising, university center for writing-based learning, and university library to explore learning outside the classroom (office of teaching, learning, and assessment, b). the career center is notable for its leadership of a university-wide effort to identify “transferable skills” inherent in academic programs, co-curricular programs, and student employment programs, and is now working with the library to more clearly define information literacy as a set of measurable skills transferable to the workplace (coloma, ; head, ; walter, ). 
library engagement with these programs has been coordinated since by the library assessment and research committee (larc), whose members provide support for micro- and macro- assessments of library services, including, most recently, assessments of current models of reference service, user experience, and faculty engagement with digital scholarship services. a focus for larc is the completion of the library’s annual assessment of student learning (discussed below). depaul’s commitment to having a “teaching library” (walter, ) ensures that librarian contributions to teaching and learning inside and outside the classroom are recognized. librarian involvement in programs such as the assessment advisory board, teaching and learning certificate program advisory board, and co-curricular learning assessment task force ensures that reports of library contributions are widely shared. the library’s unique place at the crossroads of curricular and co-curricular teaching and learning is reflected in the library’s vision statement: “the depaul university library is a center for intellectual inquiry and academic engagement beyond the classroom, building and inspiring the campus and community partnerships distinctive of a depaul education” (depaul university library, ). vision the depaul university library began work on its current strategic plan in late , following the launch of the university plan, vision (office of the president, ). vision established five goals for the university: . enhance academic quality and support educational innovation . deepen the university’s distinctive connection to the global city of chicago . strengthen our catholic and vincentian identity . foster diversity and inclusion . 
ensure a business model that builds the university’s continued strength and educational excellence while the library has adopted each of these as the goals for its own strategic plan (“to work in the service of the people: the depaul university library’s strategic plan, - ”), and has been recognized by senior leadership for contributions to a number of these goals [e.g., its leadership in the chicago collections consortium (http://chicagocollections.org/), a feature of the library’s contribution to goal (mattson, )], the remainder of this essay will focus on the library’s contribution to the initiatives related to teaching and learning, and associated with the goal of “[enhancing] academic quality and [supporting] educational innovation.” foundations for success a focus on goal is especially important for understanding collaboration and the communication of library value at depaul because a fundamental concern for every office, unit, or program at the university in recent years has been its contribution to goal as part of the university’s bid for re-accreditation. in , depaul was visited by an external review committee as part of its decennial re-accreditation by the higher learning commission of the north central association of colleges and schools, one of six regional accrediting bodies for u.s. schools and institutions of higher education. as part of that review, depaul was required to complete a “major quality initiative” designed “to suit its present concerns or aspirations” (higher learning commission, ). the depaul mqi was the “foundations for success” initiative, which was designed to meet one of the sub-goals identified in the university’s strategic plan: to focus the entire university community on student learning and success (office of academic affairs, ). 
the projects pursued across the university as part of “foundations for success” included: the learning commons, support for transfer students, improved communication across student support offices, and more. strategic alignment between library initiatives and the university strategic plan, together with a focus on the library’s contributions to efforts associated with “foundations for success,” have proven essential to communicating the value of the library to student success and to other strategic initiatives pursued by the university. communicating value across academic affairs the library’s efforts to promote greater awareness of its contributions to student success grew from a strong foundation of support among its traditional partners in academic affairs. library involvement in instructional improvement programs for faculty has been noted above, and a implementation of the ithaka s+r local faculty survey provided evidence that faculty recognize the library’s role in teaching and learning. designed to provide local perspectives on issues highlighted in ithaka’s triennial faculty survey (ithaka s+r, - ), the depaul survey was notable for the degree to which participants agreed that the library was “important” or “extremely important” to supporting their teaching activities and to developing undergraduate critical thinking and information literacy skills (walter, ; walter and yu, ). since , the library has built on that foundation by establishing undergraduate learning goals for information literacy, learning goals related to the use of primary source materials, and annual assessments of student learning as part of university-wide assessment programs. the most recent of these, an assessment of student learning in a research seminar offered by depaul’s school for new learning, was recognized with the office of teaching, learning, and assessment’s annual assessment award (schultz, ). 
alongside these formal assessments of student learning, librarians have reviewed the way in which traditional assessment data related to reference services are captured, and the way in which the library contribution to innovation in teaching and learning is explored and communicated to faculty colleagues and senior leadership.
undergraduate learning outcomes
over the past several years, the office of teaching, learning, and assessment has worked with colleagues in academic affairs to establish student learning outcomes that may be assessed, both on an annual basis and as part of periodic academic program review. while a complete review of the process by which depaul’s undergraduate learning outcomes for information literacy were established is beyond the scope of this paper, they may be summarized as follows (see figure , below):
strategize – identify gaps in one’s current knowledge in order to determine the data, evidence, and diverse viewpoints needed to support one’s research and learning goals
analyze and evaluate – articulate essential attributes of different information sources and apply critical thinking in order to determine the reliability, applicability, and responsible use of the resource
search and explore – demonstrate flexibility and persistence in developing and revising strategies for finding and using a range of resources
engage and extend – contribute to scholarly conversation at an appropriate level and credit the contributing work of others in their own information production
figure : depaul university information literacy learning outcomes ( ) (developed by depaul university library staff in collaboration with the office of teaching, learning, and assessment)
discussions of the use of these outcomes in undergraduate information literacy instruction may be found in dempsey and jagman ( ) and alverson and schwartz ( ).
these learning outcomes have also been associated with issues specifically related to the use of primary source materials in special collections, also known as “artifactual literacy” (carini, ), and an initial assessment of student learning in this area has been completed in collaboration with faculty teaching a research methods sequence in history (nelson, ). in addition to sharing this work with faculty through teaching commons programs and the annual assessment process, librarians have collaborated with colleagues in the first-year-writing program to establish a “library research prize for first-year writing.” the prize was awarded for the first time in spring ; nominees were required to submit a writing sample along with a reflective “research statement” focused on “how they went about the process of information exploration and discovery and what they learned from it” (parker, ). the criteria for the prize are aligned with information literacy instruction provided in first-year writing courses following its revision to reflect the undergraduate learning outcomes described above (alverson and schwartz, ).
innovation in instruction
while efforts associated with establishing, assessing, and communicating the impact on student learning derived from the implementation of new learning outcomes demonstrate the library’s contribution to enhancing academic quality, the library has also pursued initiatives designed to demonstrate contribution to educational innovation. chief among these have been efforts associated with teaching and learning in emergent areas of digital scholarship. since the launch of the scholar’s lab in , librarians have collaborated with faculty across the university in a swiftly growing digital scholarship program, providing the space, technology, and expertise needed to support faculty moving into new areas, including digital humanities, data-intensive social sciences, and data science.
from the first, award-winning english capstone course taught in the scholar’s lab by a faculty-librarian team ( ), to the establishment of the graduate certificate in the digital humanities and the c tf (both ), to the launch of studio x ( ), and, most recently, the establishment of a university-wide group of faculty and staff interested in using maker spaces in their teaching, librarians have provided leadership for efforts designed to promote educational innovation. in , the library made a strategic investment in new staff members to reinforce this role, with the hiring of its first digital scholarship librarian, its first data services librarian, and an information technology librarian with prior experience managing maker spaces in public libraries. innovation in using information resources across the curriculum has also been supported by the library’s recruitment of a “wikipedian-in-residence,” who has engaged faculty interested in including wikipedia-based assignments in their classes (nelson, ). with the upcoming launch of the maker hub, studio x, and c tf programs in the library, there will be new opportunities to explore how to assess and communicate the library’s contribution to educational innovation at depaul. communicating value across student affairs the institutional characteristics supporting collaboration between the library and student affairs have been noted above. this environment made depaul an ideal participant for the assessment in action program, as colleagues from academic and student affairs (first-year programs, writing center, academic advising, center for students with disabilities) were already working with the library to design and deliver the “common hour” instruction as part of the first-year experience. 
this collaborative culture has informed the development of other programs, as well, e.g., banned books week, which has included partners from the university center for writing-based learning, office of multicultural student success, and center for identity, inclusion, and social change. chief among these, in terms of the library’s ability to help senior leadership to appreciate its value to student success, has been the learning commons.
the learning commons
launched in as one of the initiatives associated with “foundations for success,” the learning commons brings together peer-based learner support programs from around the university, including the university center for writing-based learning, stem tutoring center, career center, coleman entrepreneurship center, office of multicultural student services, college-based programs, supplemental instruction, and more. providing a “one-stop shop” for student success efforts, the learning commons has proven successful in promoting use of learner support programs and facilitating collaboration across programs in terms of shared approaches to recruiting and training peer tutors, collecting use data, and developing joint programs. figure (below) documents the growing use of learning commons programs by students since .
[figure : use of richardson library learning commons services. series shown: learning commons appointments (all programs), by academic year.]
in addition to this overall growth in use of learner support services provided through the learning commons, certain programs have seen especially significant growth in use since joining, including the office of multicultural student services (+ %) and supplemental instruction (+ %).
the relationships built among staff and students associated with the learning commons have also had the effect of promoting collaboration among peer leadership programs, an outcome highlighted in the final report on the learning commons as part of “foundations for success.” participation in the learning commons has also had an impact on the library’s peer leadership programs, including the edge student employment team and the peer research assistant program, which was featured as part of the university’s peer tutor and mentor summit (https://resources.depaul.edu/student-success/tutoring/pages/peer-tutoring-mentoring-summit.aspx). while the learning commons has provided the library with opportunities to promote its expertise in research assistance as a component of student success, its physical location in the library also reflects a new area of assessment being explored at depaul around the impact of co-curricular services and spaces on student learning, and this is an area that will be explored further in the future. communicating value across administrative affairs the foundation for communicating library value to senior leadership has been noted above in regard to alignment with the university’s mission, strategic plan, and re-accreditation effort. the library’s success in establishing new partnerships with administrative areas not traditionally associated with the library, e.g., enrollment management and marketing, has also been noted. to conclude this discussion of engagement with administrative areas of the university, the author will briefly introduce library engagement with learning analytics and institutional affiliation programs. learning analytics learning analytics is “the measurement, collection, analysis, and reporting of data about learners and their contexts, for the purposes of understanding and optimizing learning and the environments in which it occurs” (conole, et al., ). 
the establishment of a learning analytics system (“bluestar”) was a component of “foundations for success” (teaching commons, ). oakleaf ( ) has described the emergent engagement between libraries and campus learning analytics programs, and this was also the subject of a series of acrl webinars (acrl, ). the depaul university library was established as a learner resource in bluestar in late , and was established at about the same time in student affairs’ complementary “campus community engagement system” (“orgsync”) (oakleaf, et al., ). while space does not allow for a detailed discussion of library contributions to learning analytics programs in both academic affairs and student affairs, library initiative in pursuing these partnerships reflects several of the approaches noted above.
institutional affinity
one of the sub-goals of vision is to “strengthen the sense of community, affinity and institutional pride among all depaul constituencies,” and this is an area where library initiatives such as developing the depaul heritage digital collection (http://libservices.org/contentdm/heritage.php), the “into the archives” feature for depaul newsline (http://www.depaulnewsline.com/departments/into-the-archives), and an innovative approach to the american library association’s “libraries transform” marketing campaign (libraries transform, ) have made notable contributions. each of these initiatives has allowed the library to establish new partnerships with key “communicators” at the university, including the office of public relations and communications, and the office of advancement and alumni affairs.
conclusion
wolff-eisenberg ( ) identified a critical concern for academic library leaders when she argued that they felt increasingly unsure of their ability to communicate to senior university leaders in a compelling way the contributions made by libraries to student success and other strategic initiatives.
the importance of library leaders having the ability to do this was noted by oakleaf ( ) at the beginning of the current drive to establish assessment, research, and communication initiatives focused on “value,” but it appears there is work yet to be done. depaul university provides a case study in the ways that mission alignment, strategic alignment, and a proactive approach to campus engagement, assessment, and strategic communication may inform and extend the library’s efforts to communicate its value to institutional stakeholders, both among traditional partners in academic affairs, as well as new partners in student affairs and administrative programs. references alverson, j., & schwartz, j. ( ), “successfully collaborating to revamp first-year instruction: the case of depaul university”, c&rl news, vol. no. , available at: http://crln.acrl.org/index.php/crlnews/article/view/ / (accessed july ). association of college & research libraries. ( - ), acrl value of academic libraries, available at: http://www.acrl.ala.org/value/ (accessed july ). association of college & research libraries. ( - ), assessment in action: academic libraries and student success, available at: http://www.ala.org/acrl/aia (accessed july ). association of college and research libraries. ( b), learning analytics: strategies for optimizing student data on your campus, available at: http://www.ala.org/acrl/learninganalytics (accessed july ). carini, p. ( ), “information literacy for archives and special collections: defining outcomes”, portal: libraries and the academy, vol. no. , - . coloma, n. a. ( , october ), “transferable skills initiative helps affirm the value of a degree” [blog post], available at: https://offices.depaul.edu/enrollment-management-marketing/enrollment-matters/pages/transferable-skills- initiative.aspx (accessed july ). conole, g., gasevic, d., long, p., & siemens, g. 
depaul university, from the selectedworks of scott walter: communicating value across the university: library assessment across academic, student, and administrative affairs

never rest on your ores: building a mining company, one stone at a time by norman b. keevil. review by mica jorgenson. ontario history, volume , number , fall . copyright © the ontario historical society, . distributed and preserved by Érudit (https://www.erudit.org/fr/). uri : https://id.erudit.org/iderudit/ ar doi : https://doi.org/ .
/ ar publisher: the ontario historical society. issn - (print), - (digital). cite this review: jorgenson, m. ( ). review of [never rest on your ores: building a mining company, one stone at a time by norman b. keevil]. ontario history, ( ), – . https://doi.org/ . / ar

book reviews

[...] author’s handling of the relationship between taxes and racism is one of the most intriguing but ultimately unsatisfying aspects of the book. for instance, the chapter on the tax history of the recently-created province of british columbia is an important addition to the pacific province’s historiography, moving beyond the earlier insights on racism in bc by patricia roy. heaman’s argument that bc politicians in the s “tried, as much as possible, to tax by race” is convincing ( ). so too is her assertion that during the war income tax was introduced in a manner that aimed to protect the property interests of the anglo wealth community rather than those of french canadians. but i do wonder how germane the question of racism is to the central argument of the book. quibbles aside, tax, order, and good government represents a powerful addition to the developing field of “new political history.” in particular, it helps to define the field in two ways. first, along with shirley tillotson’s recently-published give and take: the citizen-taxpayer and the rise of canadian democracy ( ) it challenges readers to consider how the tax system mediated relations between citizens and the canadian state.
heaman’s emphasis on the creative function of municipal and provincial tax policies, and the need to understand how thinking about taxes encompasses all levels of the state, is an important contribution to the development of canadian political history. in addition, tax, order, and good government reminds us that there is much more to be said about the relationship between wealth, poverty, and political power in canadian history. heaman has written an outstanding book that, while too long, is consequential. it is a book that ontario history readers will find provocative and rewarding. robert a.j. mcdonald department of history university of british columbia (ret.)

never rest on your ores: building a mining company, one stone at a time by norman b. keevil. montreal & kingston, mcgill-queen’s university press, . pages. $ . cloth. isbn - - - - . (www.mqup.ca)

with never rest on your ores, norman b. keevil of teck-hughes gold mines ltd. adds his family’s story to a stack of popular mining histories written in ontario since the s. in the tradition of the genre, never rest on your ores celebrates liberal corporate ascension while erasing indigenous people. despite some serious problems, readers may find some useful material here: teck-hughes is of a newer generation than the usual subjects of popular business storytelling (i.e. the long-dead behemoths of the early twentieth century industry). never rest on your ores portrays an agile, connected, and responsive company which successfully navigated the cyclical nature of its industry and continues to shape the world in the present. keevil’s portrayal is rooted in his canoeist/geologist father’s exhortation to “never rest on your oars,” which keevil extends as a metaphor for teck-hughes’ business model writ large. the mining company jumped from deposit to deposit across canada and around the world, experimenting with a variety of products and strategies.
from humble origins in ontario, the company experimented with oil, gold, silver, and copper in british columbia, the arctic, and chile. at first, such acquisitions were a method of survival. norman keevil senior impressed his son with the fundamental understanding of the ephemeral nature of a mineral deposit: a successful company must constantly seek out its next mine or risk fading with the last of its ore. mines are made, not found, so success comes from imaginative financial maneuvers which allow a deposit to be mined profitably regardless of its quality. this pillar of teck’s business model is perfectly embodied by the company’s “annual mine opening golf tournament,” whereby managers, owners, and investors would come to the newest site to play golf, socialize, and raise money. these annual tournaments feature in every section of never rest on your ores, each time marking a turning point in the company’s history – and providing an opportunity for keevil to regale his reader with a new tale of business (or athletic) acumen. thus never rest on your ores roughly outlines the way the business of mining changed in the late twentieth century. teck came to power in the middle of canada’s mining ascendance, and this book is a microhistory of broader trajectories in the country’s business, environmental, and mining history. yet keevil shows little awareness of teck’s context. the story jumps backward and forward in time, derails itself with unrelated side-notes, and switches unpredictably from dry accounts of stock division to juicy personal anecdotes (altered to an unknown extent with exaggeration, memory, and wishful thinking). keevil’s narrative reads like a long afternoon spent listening to an old-timer’s stories: conversational, meandering, and periodically offensive. one of the most confounding aspects of the book is the arbitrary quotations at the top of each new chapter. these are inconsistently dated and attributed. they range from sources as diverse as george w. bush, albert einstein, hernando de
they range from sources as diverse as george w. bush, albert einstein, hernando de t o t t c – a k r t a o ly t in t t in c c b h t c book reviews soto, and yogi berra. the quotes and their originators bear questionable con- nection to the topic at hand, and some of them actually undermine the authority of the book. at the beginning of chapter for example, keevil quotes a wikipedia page on a distant keevil ancestor ( ). such additions provoke more questions than they answer. more troublingly, never rest on your ores is a new chapter in an old tradition of half-mythologized hyper-masculine sto- ries in ontario’s north which depends on the erasure of indigenous people, wom- en, and working-class people. the cover image shows the author and his father on horseback dressed in classic cowboy at- tire. the photograph neatly summarizes keevil’s perception of himself and his company. in erasing inconvenient parts of mining’s story, keevil frames develop- ment as inherently progressive and good. in his version of events, the keevils came over from england and staked their claim on empty land—keevil calls it a landscape “populated mainly by blackflies and the occasional moose” ( ). keevil eats up and then uncritically reproduces the old stories of “discovery” on the land around teck’s first mine ( ), adding his own single white male discoverer to the ranks—james hughes, “who may or may not have been grizzled” ( ). the rest of the origin story revolves around keevil’s suburban upbringing in which hardship is measured by the coming and going of his father’s cadillac ( ). female characters appear fleetingly, making foolish invest- ments ( ), being married off ( ), or accompanying their husbands on business. only on one memorable occasion does a woman actually prospect for gold ( ). other wise, keevil’s world of mine- making is passed down from father to son. all the major characters are men— educated, upper class, and (with the exception of his japanese investors), white. 
in the aftermath of the truth and reconciliation commission, ongoing debate around indigenous sovereignty over land and resources in canada, and a considerable body of indigenous scholarship, keevil’s adherence to an out-of-date colonial mythos of empty land and benign industrial development is inexcusable. unfortunately, such selective versions of the past remain institutionalized at the highest levels of canadian mining. reading never rest on your ores makes it easy to imagine how canada has come to be reviled and distrusted by indigenous people the world over. as frustrating as it is to see the old out-of-date mining mythos revived in , this book is valuable as an insider account of the industry. with a bit of sleuthing, scholars will find a complete and detailed story of an obscure part of ontario’s northern history, one which has gone on to shape the world. keevil is open about his feelings and exhaustive in his detail. never rest on your ores provides a rare glimpse behind the doors of corporate board rooms, into the offices of legislators, and onto the corporate golf green. lengthy excerpts from personal letters, accounts of conversations, and keevil’s distinctive (if problematic) story may be a useful primary source for those in business, mining, ontario, or canadian history. mica jorgenson post doctoral fellow, sherman centre for digital scholarship, mcmaster university.

workshop prof. dr. andreas degkwitz liber patras - session : strategy july

the interactive library as a virtual working space
liber conference in patras, session : strategy, july , prof. dr. andreas degkwitz - humboldt-universität zu berlin

where are we starting from?
the logistics of printed books and journals have influenced all the processes and structures of libraries since the age of gutenberg → our core processes are linear: acquisition, cataloging, short- and long-term availability, and usage. by implementing IT-driven library systems and providing e-books and e-journals (pdf), we transfer and transform the analogue processes of printed materials into digital environments. this „emulation“ is part of the transformation process, but not the end of the development. components of the logistics of digital materials are interaction, collaboration, multimedia and global networking – do we identify these items in the libraries that we call or define as digital libraries?

is anything changing in depth?
the organisation of libraries is still oriented to the traditional patterns → networked structures? the roles of librarians and users have not changed for many years → collaborative approaches? print-oriented e-books and e-journals (as emulations) are the focus of library collections and services → multimedia objects? → „gaps between infrastructure and research“: interaction is missing!

yes, changes are happening
patron-driven acquisition models: users choose the materials that they demand and need. digital resources like e-books need not be recorded by librarians in a traditional way; moreover, the related metadata can be loaded into the index of the discovery system. scholarly materials and objects outside the familiar scope of books and journals are permanently increasing → „enhanced publications“. users and researchers are providing repositories or information hubs by themselves. these resources could and should be harvested and indexed by the „search engine“ of the library.

we should increasingly exploit the digital potential of the internet and the new media. we should allow and enable more interaction and collaboration between the librarians and the users. we should reshape the roles of the librarians and the users in a collaborative way. → we had better talk about „multi-users“: „multi-user driven acquisition“, „multi-user driven collection building“, „multi-user driven indexing“, „multi-user driven funding“, „multi-user driven availability“ → that’s pushing us forward! multi-users – multi-taskers

new policies …
• acquiring and collecting: beyond the librarians, every user is allowed to acquire or to transmit information materials and objects into the libraries’ collection, with different rights and on different hubs or repositories. the scope of materials and material types covers everything related to scholarly communication: books, journals, digitised items, research data, software tools, audios, pictures, videos, simulations etc.
• cataloging and enriching: beyond the librarians, every user is allowed to create and/or to enrich the metadata of scholarly materials and objects for loading them into the index of the (central) search engine, with different competencies and rights. enrichments may range from name authorities, classifications and subject headings to semantic relationships. in this way more customer-oriented access and search facilities can be established.

… and new rules
• usage and availability: beyond the librarians, every user is allowed to define the operation and usage of acquired/collected materials and objects, up to „time limits“ on their availability. the rights and roles handed over have to comply with the governance rules of the library policy. the principles of open access are generally observed.
• funding and sourcing: librarians and users are „owners“ of different funds for paying for acquisitions and licences of content or materials. in contrast to current practice, these sources have to cover the materials’ „maintenance“ too – that means: cataloging, indexing, availability, operation, preservation etc., unless this is done by the users themselves. long-term archiving is a basic option, which is free to a certain extent.

scholarly makerspaces for creating interactive, virtual working spaces
libraries are in a position to take up the approach of the scholarly makerspaces. following the internationally known approach of “makerspaces” in public libraries, scholarly makerspaces are digital working environments where digital resources and tools are combined and made available. the service portfolios of scholarly makerspaces are provided and supported by academic libraries collaborating and interacting with researchers and third-party providers of digital data, materials and tools according to the disciplinary needs. libraries are providing and supporting virtual scholarly makerspaces as open, dynamic and interactive infrastructure oriented to the disciplinary demands.

new services - new shapes
→ to meet the demands and requirements of digital scholars and students by services providing expertise, infrastructures, resources, training and tools,
→ to give researchers enhanced access to, and an overview of, existing methods and resources concerning digital scholarship and e-research,
→ to share digital procedures and tools with students and the young researcher generation,
→ to complete and to gain expertise about new technologies as well as about what the disciplines are demanding,
→ to establish the library as an active broker or intermediary between researchers and local, national and international providers of content and services.

purposes of the pilot study
the pilot study aims at a valid concept of an organization and process model, including cost calculations, for realizing scholarly makerspaces. through the impact of this new working environment and the related services, the intended business model will influence the entire library as well. the study should prepare the development of the virtual scholarly makerspaces, but not the prototype itself; that will follow if the study can demonstrate a viable implementation at reasonable cost. the study will outline the financial and organisational framework for the implementation of the makerspaces.

interactive library → vwe „light“
the interactive library is proving to be a virtual working space as a result of the collaboration of librarians and users – making materials of data hubs, information platforms or portals, media archives and networks available.
examples: the german digital library, europeana, hathitrust, internet archive and many other hubs or platforms like google scholar, mendeley, wikipedia are not digital libraries, but show approaches, components and procedures of the new paradigm. facing the challenges of the internet and the new digital media, the shapes of libraries have to be re-designed and re-organized → taking up the approaches of integrating users in the libraries’ development, libraries will be created and established as scholarly makerspaces and virtual users’ working environments.

questions
• what is the primary future role of libraries – information provider, virtual working space, enabler of digital scholarship?
• do we replace the traditional services by interactive patterns and shapes of the scholarly makerspaces?
• what about the role of collection management and service provisioning for local users and user communities worldwide?
• how can we imagine cooperation and relationships between international data and information hubs, providers of infrastructure and tools, and the local libraries?

thank you for your attention! contact: andreas.degkwitz@ub.hu-berlin.de

digital humanities

the importance of pedagogy: towards a companion to teaching digital humanities

hirsch, brett d. brett.hirsch@gmail.com university of western australia
timney, meagan mbtimney.etcl@gmail.com university of victoria

the need to “encourage digital scholarship” was one of eight key recommendations in our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences (unsworth et al). as the report suggested, “if more than a few are to pioneer new digital pathways, more formal venues and opportunities for training and encouragement are needed” ( ).
in other words, human infrastructure is as crucial as cyberinfrastructure for the future of scholarship in the humanities and social sciences. while the commission’s recommendation pertains to the training of faculty and early career researchers, we argue that the need extends to graduate and undergraduate students. despite the importance of pedagogy to the development and long-term sustainability of digital humanities, as yet very little critical literature has been published. both the companion to digital humanities ( ) and the companion to digital literary studies ( ), seminal reference works in their own right, focus primarily on the theories, principles, and research practices associated with digital humanities, and not pedagogical issues. there is much work to be done. this poster presentation will begin by contextualizing the need for a critical discussion of pedagogical issues associated with digital humanities. this discussion will be framed by a brief survey of existing undergraduate and graduate programs and courses in digital humanities (or with a digital humanities component), drawing on the “institutional models” outlined by mccarty and kirschenbaum ( ). the growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn about the sorts of “transferable skills” and “applied computing” that digital humanities offers (jessop ), and the desire of practitioners to consolidate and validate their research and methods. we propose a volume, teaching digital humanities: principles, practices, and politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. we plan to structure the volume according to the four critical questions educators should consider as emphasized recently by mary breunig, namely:
- what knowledge is of most worth?
- by what means shall we determine what we teach?
- in what ways shall we teach it?
- toward what purpose?

in addition to these questions, we are mindful of henry a. giroux’s argument that “to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak” ( ). consequently, we will encourage submissions to the volume that address these wider concerns.

references

breunig, mary ( ). 'radical pedagogy as praxis'. radical pedagogy. http://radicalpedagogy.icaap.org/content/issue _ /breunig.html.
giroux, henry a. ( ). 'rethinking the boundaries of educational discourse: modernism, postmodernism, and feminism'. margins in the classroom: teaching literature. myrsiades, kostas, myrsiades, linda s. (eds.). minneapolis: university of minnesota press, pp. - .
schreibman, susan, siemens, ray, unsworth, john (eds.) ( ). a companion to digital humanities. malden: blackwell.
jessop, martyn ( ). 'teaching, learning and research in final year humanities computing student projects'. literary and linguistic computing. . ( ): - .
mccarty, willard, kirschenbaum, matthew ( ). 'institutional models for humanities computing'. literary and linguistic computing. . ( ): - .
unsworth et al. ( ). our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies.

data, humanities and the history of medicine: new pedagogical approaches

doi: . /mdh. . corpus id:
@article{gibbs dataha, title={data, humanities and the history of medicine: new pedagogical approaches}, author={frederick w. gibbs}, journal={medical history}, year={ }, volume={ }, pages={ - } }
frederick w. gibbs. published in medical history. fields: art, medicine.

the centrality of data and born-digital documents in contemporary medical care, public health and health policy means that the primary sources for future, and even present, medical historians will increasingly take on unprecedented digital forms. historians of medicine – and indeed historians of virtually everything – will need to be trained in new digital tools and methods to work seamlessly between analogue and digital sources available to them. such new demands present an opportunity for…

topics from this paper: pedagogy, analog, health policy, mathematics, spreadsheet

exploring entity recognition and disambiguation for cultural heritage collections

seth van hooland¦∗, max de wilde¦, ruben verborgh†, thomas steiner‡ and rik van de walle†
¦université libre de bruxelles (ulb) information and communication science department avenue f.
d. roosevelt, – cp b- brussels, belgium {svhoolan,madewild}@ulb.ac.be †iminds – multimedia lab – ghent university gaston crommenlaan bus b- ledeberg-ghent, belgium {ruben.verborgh,rik.vandewalle}@ugent.be ‡universitat politècnica de catalunya – department lsi carrer jordi girona, e- barcelona, spain tsteiner@lsi.upc.edu abstract unstructured metadata fields such as ‘description’ offer tremendous value for users to understand cultural heritage objects. however, this type of narrative information is of little direct use within a machine-readable context due to its unstructured nature. this paper explores the possibilities and limitations of named-entity recognition (ner) and term extraction (te) to mine such unstructured metadata for meaningful concepts. these concepts can be used to leverage otherwise limited searching and browsing operations, but they can also play an important role to foster digital humanities research. in order to catalyze experimentation with ner and te, the paper proposes an evaluation of the performance of three third-party entity extraction services through a comprehensive case study, based on the descriptive fields of the smithsonian cooper-hewitt national design museum in new york. in order to cover both ner and te, we first offer a quantitative analysis of named-entities retrieved by the services in terms of precision and recall compared to a manually annotated gold-standard corpus, then complement this approach with a more qualitative assessment of relevant terms extracted. based on the outcomes of this double analysis, the conclusions present the added value of entity extraction services, but also indicate the dangers of uncritically using ner and/or te, and by extension linked data principles, within the digital humanities. all metadata and tools used within the paper are freely available, making it possible for researchers and practitioners to repeat the methodology. 
by doing so, the paper offers a significant contribution towards understanding the value of entity recognition and disambiguation for the digital humanities.

this is the author version of an article submitted for publication. please cite as: van hooland, s., de wilde, m., verborgh, r., steiner t., and van de walle, r., exploring entity recognition and disambiguation for cultural heritage collections. in: literary and linguistics computing, . ∗corresponding author

accepted for publication in ‘literary and linguistics computing: the journal of digital scholarship in the humanities’

introduction

linked data and the potential of entity extraction for the digital humanities

the combination of decreasing budgets and growing electronic collections is currently forcing cultural heritage providers to rethink the ways in which they provide access to their resources. the traditional model of manual cataloging and indexing practices has already been under pressure for a number of years. the econtentplus funding program of the european commission, for example, explicitly did not fund the development of metadata schemas and the creation of metadata itself (van hooland et al., ). funding bodies and grant providers expect results within a limited time span and encourage cultural heritage institutions to gain more value out of their own existing metadata by linking them to external data sources. it is precisely in this context that the concepts of linked and open data (lod) have gained momentum. recent initiatives such as openglam and lod-lam illustrate how these evolutions are percolating into the cultural heritage domain. both the us and the eu flagship digital library projects, respectively the digital public library of america and europeana, are currently embracing linked data principles (berners-lee, ).
the semantic enrichment and integration of heterogeneous collections can be facilitated by using subject vocabularies for cross-linking between collections, since major classifications and thesauri (e.g. lcsh, aat, ddc, rameau) have been made available following linked data principles. reusing these established terms through mappings between vocabularies holds great potential for the cultural heritage sector. the shift from printed books to digital tools for the management and use of controlled vocabularies already led in the s to a considerable body of research regarding automated and semi-automated methods for achieving interoperability between vocabularies (doerr, ; tudhope et al., ; van der meij et al., ; van erp et al., ). isaac et al. ( ) identified four general approaches towards vocabulary reconciliation or alignment: lexical alignment techniques, structural alignment, extensional alignment, and alignment using background knowledge. the majority of projects focus on lexical alignment technologies, as most of the terms can be reconciled by taking care of lemmatization, harnessing preferred labels or computing string similarity. van hooland et al. ( ) provide a state of the art regarding the use of linked data for vocabulary reconciliation and illustrate how collection managers can use non-expert tools to successfully reconcile their local vocabularies with the lcsh and the aat. by doing so, collection holders can hook up their holdings within the linked data cloud. hands-on tutorials, specifically geared towards non-it experts from the cultural heritage domain, have been developed in the framework of the free your metadata project in order to demonstrate how interactive data transformation tools (idts) can be used to clean up and reconcile metadata. the reconciliation of local vocabularies, or even uncontrolled keywords, can be a first logical step towards publishing metadata as linked data.
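the lexical alignment idea mentioned above (normalizing terms, then comparing them to preferred labels by string similarity) can be sketched roughly as follows. this is a minimal illustration and not the method of any of the cited projects: the function names, the crude plural stripping that stands in for real lemmatization, the 0.85 threshold, and the toy vocabulary are all assumptions made for the example.

```python
import difflib
import unicodedata


def normalize(term: str) -> str:
    """Lowercase, strip accents, collapse whitespace, drop a trailing plural 's'.

    The plural rule is a deliberately naive stand-in for lemmatization.
    """
    decomposed = unicodedata.normalize("NFKD", term)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    text = " ".join(no_accents.lower().split())
    if len(text) > 3 and text.endswith("s") and not text.endswith("ss"):
        text = text[:-1]
    return text


def reconcile(local_term, preferred_labels, threshold=0.85):
    """Return (label, score) for the most similar preferred label, or None."""
    target = normalize(local_term)
    best_label, best_score = None, 0.0
    for label in preferred_labels:
        score = difflib.SequenceMatcher(None, target, normalize(label)).ratio()
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score
    return None


# Hypothetical preferred labels; a real run would load them from a thesaurus.
vocabulary = ["ceramics", "glass", "engraving"]
print(reconcile("Céramic", vocabulary))
print(reconcile("woodcut", vocabulary))
```

a threshold this strict favours precision over coverage; in practice it would need tuning against a manually checked sample of matches.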
this paper explores a complementary approach by mining the unstructured narrative offered in descriptive fields for meaningful concepts through the use of named-entity recognition (ner) and term extraction (te). for clarity’s sake, we will refer to such fields throughout the paper by using the dublin core element ‘description’ defined as ‘an account of the resource’, which ‘may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource’.

research questions and outline of the paper

this paper aims to examine the possibilities and the limits of applying ner and other extraction methods to derive more value out of existing unstructured metadata content from the description field. more precisely, we will consider and answer the following two questions: how do the different ner services score in terms of precision and recall when compared to a manually annotated gold standard corpus? and how can we overcome the shortcomings of the gold standard corpus (gsc) by extracting terms that are not generally recognized as named entities? the first question will be answered in section through a clearly delineated and standardized approach. the second question is more difficult to answer. a number of terms identified by the services, such as epigraphy or gold for example, hold a potential value but do not appear in our gold standard corpus since they are common nouns. in order to assess the overall quality of the outcomes of the entity extraction services, section outlines what elements need to be taken into account when considering the added value of entity recognition in the cultural heritage sector from a more global perspective.
the article starts out with an overview of how ner developed and what directions this field is currently taking in collaboration with the semantic web community, including previous work on ner within the cultural heritage sector (section ). we then describe the case study and the methodology used within the paper to evaluate the outcomes of ner (section ). in section , we present the actual results of the study, and proceed with a discussion of the added value of te, along with opportunities and risks from a more global perspective (section ) before concluding and setting forth future challenges in section . context and related work . background and early developments regarding entity extraction originally developed by computational linguists as an information extraction subtask, named-entity recognition and disambiguation has subsequently attracted the attention of researchers in various fields such as biology and biomedicine (ananiadou and mcnaught, ), information science (moens, ), and the semantic web (tamilin et al., ). the original concept of a ‘named entity’ (ne), proposed by grishman and sundheim ( ), covered names of people, organizations, and geographic locations as well as time, currency, and percentage expressions. similarly, named entities were defined for the conference on computational natural language learning shared task as ‘phrases that contain the names of persons, organizations, locations, times, and quantities’ (tjong kim sang, ). 
as a result of the diversification of ner applications, this rather loose definition was further extended to include products, events, and diseases, to name but a few types recognized today as valid named entities, although nadeau and sekine ( ) note that the word ‘named’ in ‘named entity’ is effectively restricting the sense to entities referred to by rigid designators, as defined by kripke ( ): ‘a rigid designator designates the same object in all possible worlds in which that object exists and never designates anything else’. there is, nonetheless, no real consensus on the exact definition of a (named) entity, which remains largely domain-dependent. a useful approach was adopted recently by chiticariu et al. ( ) who proposed a list of criteria for the domain customization of ner, including entity boundaries, scope and granularity. they observe, for instance, that some ner tools choose to include generational markers (e.g. ‘iv’ in ‘henry iv’), whereas others do not. the definition of a named entity, according to them, is never clear-cut, but depends both on the data to process and on the application. in this article, we chose to use entity to refer to any type of entity, whether a named-entity (in kripke’s sense) or a plain term. however, in what follows we use the well-known acronym ner to cover both named-entity recognition and term extraction, which will be specifically addressed in section .

ner and the semantic web

the ner task is strongly dependent on the knowledge bases used to train the ne extraction algorithm. leveraging resources such as dbpedia, freebase, and yago, recent methods have been introduced to map entities to relational facts exploiting these fine-grained ontologies. in addition to the detection of a ne and its type, efforts have been made to develop methods for disambiguating information units with a uniform resource identifier (uri).
disambiguation is one of the key challenges in natural language processing, giving birth to the field of word-sense disambiguation (wsd), since natural languages (as opposed to formal or programming languages) are fundamentally ambiguous (bagga and baldwin, ; navigli, ). for instance, a text containing the term washington may refer to the george washington or to washington dc, depending on the surrounding context. similarly, people, organizations, and companies can have multiple names and nicknames. these methods generally try to find clues in the surrounding text for contextualizing the ambiguous term and refine its intended meaning. therefore, a ne extraction workflow consists of analyzing input content to detect named entities, assigning them a type weighted by a confidence score, and providing a list of uris for disambiguation. however, as will be demonstrated in section . , a uri cannot be taken at face value. we will therefore refer to the four principles tim berners-lee informally defined in a w3c design issue to assess the quality of linked data (berners-lee, ):

1. use uris as names for things.
2. use http uris so that people can look up those names.
3. when someone looks up a uri, provide useful information, using the standards (rdf*, sparql).
4. include links to other uris, so that they can discover more things.

the services used in this paper were selected on the basis of conforming to these principles, under a minimal interpretation of ‘useful’ in the third principle. for example, the well-known service opencalais has been excluded from our analysis because it mostly provides http uris that do not deliver additional information or links, violating the third and fourth principles. initially, the web mining community has harnessed wikipedia as the linking hub where entities were mapped (hoffart et al., ; kulkarni et al., ).
a natural evolution of this approach, mainly driven by the semantic web community, consists in disambiguating named entities with data from the linking open data (lod) cloud. several web apis such as alchemyapi, dbpedia spotlight, evri, extractiv, yahoo! term extraction, and zemanta, provide services for named-entity extraction and disambiguation within the lod cloud. these apis take a text fragment as input, perform named-entity extraction on it, and then link the extracted entities back to the lod cloud. in order to facilitate the evaluation of different ner services, rizzo and troncy ( ) have developed a tool that facilitates the examination of the outcomes of multiple services in parallel. . previous use of ner within the digital humanities a number of research projects and cultural institutions have experimented with ner in recent years. the powerhouse museum in sydney has implemented opencalais within its collection management database (chan, ). the feature has been appreciated both by the professional museum world and end-users, but no concrete evaluation of the ne has been performed. lin et al. ( ) explore ne in order to offer a faceted browsing interface to users of large museum collections. on the basis of interviews with a limited test group, the relevance of the extracted ne is assessed, but this evaluation is not based on a statistically significant sample. segers et al. ( ) offer an interesting evaluation of the extraction of event types, actors, locations, and dates from non-structured text from the collection management database of the rijksmuseum in amsterdam. however, the test corpus consists of , historical wikipedia articles, whose form and content may be inherently more suited for ner than descriptive metadata fields from a museum collection. also, the ner process is highly customized and requires a substantial amount of programming effort. rodriquez et al. 
( ) discuss the application of several third party ner services on a corpus of mid- th-century typewritten documents. a set of test data, consisting of raw and corrected ocr output, is manually annotated with people, locations, and organizations. this approach allows a comparison of the precision, recall, and f score of the different ner services against the manually annotated data. the methodology applied by rodriquez et al. ( ) is very much in line with the approach of this paper, which makes it possible to position the outcomes of our analysis relative to the results obtained there. the corpus and the ner services used within this paper are sufficiently different in character to offer a significant added value to the discussion regarding the value of ner for cultural heritage collections.

methodology

the main goal of the paper is to foster more experimentation and research regarding the use of ner within the digital humanities context. linked data has become an important topic for digital humanists, but the use of ner has been limited to large-scale projects. ramsay and rockwell ( ) recently underlined the importance of hands-on experimentation in order to come to grips with technology and to work towards an epistemology of building the necessary tools and research infrastructures. if the digital humanities truly want to foster such an epistemology, tools need to be made more accessible for humanities scholars, and so do the methodologies for assessing the outcomes of those tools. previous research provides an introduction on the topic of vocabulary reconciliation (van hooland et al., ), making it possible for scholars and metadata practitioners to interconnect cultural heritage collections across the web with the help of a browser-based graphical interface. within this work, the content of a structured keyword field was used.
the current paper builds on top of this previous work, as ner makes it possible to detect concepts in unstructured fields which can, at a later stage, be used for vocabulary reconciliation, using the methodology presented by van hooland et al. ( ). with the help of a comprehensive case study based on a freely available corpus and tools, the current paper delivers all necessary components for digital humanities scholars to repeat the analyses performed. the following sections will describe in detail the building blocks of the case study: the framework for ner services, the corpus, and the sample.

open-source framework for ner services

context of interactive data transformation tools and the use of openrefine

idts are similar in appearance to common spreadsheet interfaces. while spreadsheets are designed to work on individual rows and cells, idts operate on large amounts of data at once. these tools offer an integrated and non-expert interface through which domain experts can perform both the cleaning and reconciliation operations. several general-purpose tools for interactive data transformation have been developed in recent years, such as potter’s wheel abc and wrangler. in this paper, we will focus on openrefine (formerly freebase gridworks and google refine), as it has recently gained a lot of popularity and is rapidly becoming the tool of choice to efficiently process and clean large amounts of data in a browser-based interface. openrefine also makes it possible to reconcile data with existing knowledge bases, creating the connection with the linked data vision.

development of an openrefine ner extension

while openrefine supports reconciliation, i.e. mapping single- or multi-word terms to a unique identifier, it does not offer native ner capabilities on full-text fields. in contrast, several third-party companies provide web services that offer ner functionality.
unfortunately, those services can be difficult to access without a technical background, and it is impractical to invoke them repeatedly on multiple text fragments. furthermore, each service has a different, proprietary interaction model. an ideal solution would be to integrate them into an existing workflow, hiding the low-level details from users. to this end, we have developed an open source extension for openrefine, which is freely available for download. this extension provides an integrated front-end, illustrated in fig. , that gives access to multiple ner services from within openrefine, thereby providing two levels of automation: first, only a single user interaction is required to perform ner on multiple records; second, each record can be analyzed by multiple ner services at the same time. the implementation of the extension abstracts every ner service into a uniform interface, minimizing the amount of code necessary to support additional services. it also allows users to manage their service preferences, ensuring consistency between ner operations on different datasets. the extension makes ner part of a common toolkit of data operations, offering the full potential of ner in a single, accessible operation.

currently supported services

the initial version of the extension supports three services out-of-the-box: alchemyapi, dbpedia spotlight, and zemanta. despite the excellent results delivered by stanford ner in rodriquez et al. ( ), we decided not to include this service as stanford ner limits itself to standard recognition and does not provide disambiguation with uris. for similar reasons, it was decided not to include opencalais, as the uris it provides are unfortunately proprietary ones and only a fraction of the returned entities link to other sources from the lod cloud.

• alchemyapi: capable of identifying people, companies, organizations, cities, geographic features, and other typed entities within textual documents.
the service uses statistical algorithms and nlp to extract semantic richness embedded within text. alchemyapi differentiates between entity extraction and concept tagging. alchemyapi’s concept-tagging api is capable of abstraction, i.e. understanding how concepts relate and tagging them accordingly (‘hillary clinton’, ‘michelle obama’ and ‘laura bush’ are all tagged as ‘first ladies of the united states’). in practice, the difference between named-entity extraction and concept tagging is subtle. as a consequence, we treat entities and concepts in the same way. overall, alchemyapi results are often interlinked to well-known members of the lod cloud, among others with dbpedia (auer et al., ), opencyc (lenat, ), and freebase (markoff, ). alchemyapi offers free use of their services for research and non-profit purposes. on registration, users receive an api key allowing a default amount of , extraction operations per day. upon request, non-profit users receive , operations per day.

[fig. : illustration of the ner openrefine extension]

• dbpedia spotlight: a tool for annotating mentions of dbpedia resources in text, providing a solution for linking unstructured information sources to the linking open data cloud through dbpedia. dbpedia spotlight performs named-entity extraction, including entity detection and disambiguation with adjustable precision and recall. dbpedia spotlight allows users to configure the annotations to their specific needs through the dbpedia ontology and quality measures such as prominence, topical pertinence, contextual ambiguity, and disambiguation confidence. dbpedia spotlight can be used for free as a web service.

• zemanta: allows developers to query the service for contextual metadata about a given text.
the returned components currently span four categories: articles, keywords, photos, and in-text links, plus optional component categories. the service provides high quality identification of entities that are linked to well-known datasets of the lod cloud such as dbpedia or freebase. zemanta also offers free use of their services for research and non-profit purposes. upon registration, users receive an api key allowing a default amount of , operations per day. upon request, non-profit users receive , operations a day.

as alchemyapi and zemanta are proprietary services with a closed-source code base, their algorithms cannot be inspected and compared on a conceptual level. therefore, the services are treated as black boxes and quantitatively compared.

case study: smithsonian cooper-hewitt national design museum

description of the corpus and the sample

the smithsonian cooper-hewitt national design museum is the world’s largest design museum and holds over , objects, % of which are documented within the online database. the collection management team has been very active in getting the most value out of the existing metadata and in enriching them with outside sources in an automated manner. fig. illustrates the front-end of the collection database, which was published as an alpha release in the fall of and is available on http://collection.cooperhewitt.org/. in parallel, the museum offers a complete dump of its metadata on github, publicly available for download on https://github.com/cooperhewitt/collection/.

[fig. : front-end display of the descriptive field]

within this metadata export, we specifically focus on the ‘description’ field, which represents a free-text account of the resource.
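the extension design described in the framework section (each service hidden behind one uniform interface, and a single user action fanning out over many records and several services) can be sketched as follows, applied to free-text ‘description’ values of the kind just introduced. this is only an illustration of the pattern, not the extension’s actual code: the two ‘services’ are toy local functions invented for the example and make no network calls, and all names are hypothetical.

```python
from typing import Callable, Dict, List

# Uniform interface (an assumption for this sketch): a service is any
# callable that takes a text and returns a list of entity strings.
Service = Callable[[str], List[str]]


def capitalised_tokens(text: str) -> List[str]:
    """Toy stand-in service: every token that starts with an uppercase letter."""
    return [tok.strip(".,;") for tok in text.split() if tok[:1].isupper()]


def keyword_lookup(text: str) -> List[str]:
    """Toy stand-in service: matches against a tiny hard-coded term list."""
    known = {"renaissance", "rhine valley"}
    lowered = text.lower()
    return sorted(term for term in known if term in lowered)


def run_services(records: List[str],
                 services: Dict[str, Service]) -> Dict[int, Dict[str, List[str]]]:
    """One call analyses every record with every registered service."""
    return {
        index: {name: service(text) for name, service in services.items()}
        for index, text in enumerate(records)
    }


records = ["View of the Rhine Valley.", "floral sprays on white ground"]
services = {"caps": capitalised_tokens, "lookup": keyword_lookup}
results = run_services(records, services)
for index, by_service in results.items():
    print(index, by_service)
```

supporting an additional service then only means registering one more callable in `services`, which is the point of the uniform interface.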
the descriptive fields from the cooper-hewitt museum vary from characters ( words) to characters ( words), with characters ( words) on average, and therefore represent both short and more elaborate descriptions. out of the , records available from the github download, only , records contain a description. some of them being identical, this leaves us with , unique descriptions. on the basis of a confidence level of % and a confidence interval of , a representative sample of records was selected through a simple random sampling method.

methodology for the elaboration of the manually annotated gold standard corpus

there is, to the best of our knowledge, no freely available corpus that can be used as a gold standard corpus (gsc) for the evaluation of ner in the cultural heritage sector. making the same observation, rodriquez et al. ( ) built their own gsc for the evaluation of ner on raw ocr text, but using very different data: testimonies and newsletters, which do not compare to object descriptions. even if a museum-oriented gsc existed, it would still be useful to develop multiple manually annotated corpora for different application domains, the task of ner being largely domain-dependent, as already noted in section . . for these reasons we decided to annotate the sample ourselves. obviously, a concrete set of ne types was required in order to perform this annotation. an analysis of the data showed that the most relevant categories in our metadata were persons (per, e.g. robert de vaugondy), locations (loc, e.g. rhine valley) and historical events (eve, e.g. renaissance). all capitalized names were considered valid ne candidates, and categorized according to this typology.
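the sampling step described above (deduplicating the descriptions, then drawing a simple random sample sized from a confidence level and a confidence interval) can be sketched as follows. the concrete figures are not legible in this copy of the text, so the 95% level, the 5-point margin, and the population of 10,000 used here are placeholders chosen for illustration only; the sample-size formula is the common cochran formula with finite population correction, which is an assumption about how such a sample is typically sized rather than a statement of the authors’ exact procedure.

```python
import math
import random
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Cochran's formula with finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def draw_sample(descriptions, seed=0, **kwargs):
    """Drop empty and duplicate descriptions, then sample without replacement."""
    unique = sorted({d.strip() for d in descriptions if d and d.strip()})
    n = min(len(unique), sample_size(len(unique), **kwargs))
    return random.Random(seed).sample(unique, n)


# Placeholder corpus: 10,000 distinct strings plus one duplicate and one blank.
corpus = [f"object {i}" for i in range(10000)] + ["object 1", ""]
sample = draw_sample(corpus)
print(len(sample))
```

fixing the seed keeps the draw repeatable, which matters when the sample is later annotated by hand.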
organizations, although a common ne type for journalistic corpora, are less frequent in cultural heritage data, so they were bundled together with other miscellaneous entities (misc, e.g. italian gothic). we first converted the sample into a , -line text file with one word per line. the sample was then split into three equal parts, each part being annotated by two distinct persons in order to reduce errors. the kappa coefficient (carletta, ) indicates an agreement rate of k = . , . and . respectively for the three parts, or . on average. we used a variant of the widely-used iob format (ramshaw and marcus, ), producing content such as the following:

lincoln b-per
delivered o
an o
effective o
political o
speech o
at o
cooper-union b-loc
, o
feb. b-eve
[…] i-eve
, i-eve
[…] i-eve
. o

this annotated sample was then used as a gsc, allowing us to compute the precision, recall, and f-score by service and category. these results are presented in the following section.

analysis of precision and recall

using the annotated sample described in section . , we performed a quantitative analysis of the services in terms of precision and recall. it should be noted that, for this purpose, our annotation was considered a gold standard, i.e. an absolute reference as to what is a valid ne and what is not. as a consequence, terms that could be considered useful by collection holders (such as gold for example) were explicitly excluded and treated as errors when retrieved by a ner service. these shortcomings, unavoidable for the computation of recall, are accounted for in section where a more qualitative analysis of results is offered. out of the entities we identified in the sample (detailed by ne type in table ), alchemyapi retrieved , dbpedia only , and zemanta . alchemy also incorrectly tagged extra entities, dbpedia , and zemanta .
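the annotation workflow described above, with one token per line in an iob-style format and agreement between two annotators measured by a kappa coefficient, can be sketched as follows. this is an illustrative reading only: the paper uses ‘a variant’ of iob whose exact rules are not given, so the span-building logic here follows the plain b/i/o convention, and the agreement measure is the standard two-annotator cohen’s kappa computed per token. function names and the toy tag sequences are made up for the example.

```python
from collections import Counter


def iob_spans(rows):
    """Group (token, tag) rows into (type, [tokens]) entity spans."""
    spans, current = [], None
    for token, tag in rows:
        prefix, _, label = tag.partition("-")
        prefix = prefix.lower()
        if prefix == "b" or (prefix == "i" and (current is None or current[0] != label)):
            # A 'b-' tag, or an 'i-' tag with nothing to continue, opens a span.
            current = (label, [token])
            spans.append(current)
        elif prefix == "i":
            current[1].append(token)
        else:
            current = None  # 'o' closes any open span
    return spans


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two annotators' tags over the same tokens."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(tags_a)
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    freq_a, freq_b = Counter(tags_a), Counter(tags_b)
    expected = sum(freq_a[t] * freq_b[t] for t in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


rows = [("lincoln", "b-per"), ("delivered", "o"), ("at", "o"),
        ("cooper-union", "b-loc"), ("feb.", "b-eve"), (",", "i-eve")]
print(iob_spans(rows))
print(cohen_kappa(["b-per", "o", "o", "b-loc"], ["b-per", "o", "o", "o"]))
```

because the ‘o’ tag dominates token-level data, kappa computed this way is usually much lower than raw percentage agreement, which is why it is the more informative figure to report.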
typical errors made by the services include wrong boundary detection (stadt instead of stadt theater basel, jack instead of jack and jill, etc.), mistaking the first word of a sentence for a proper name, and category errors (falkenstein and wedgewood were tagged as persons for instance). overall, entities were found by at least one service. using these data, we computed the precision, recall, and f-score for each service. the results are summarized in table . the results show that, on our -object sample, zemanta performed best (almost % f-score), followed by alchemyapi (about %), while dbpedia is lagging behind (only just above %). persons and locations are generally better recognized than other ne types, although zemanta scores over % on the heterogeneous misc category. although events and dates are an important dimension of object descriptions in historical collections, they are generally more difficult for these services to spot, a few of them being correctly identified (yielding % precision scores) but most being ignored, as shown by the low recall figures. overall, precision is better than recall, which could be surprising since many common terms found by the services were tagged as incorrect since they did not fit in our closed categories. in this respect, dbpedia was more affected than the two others. recall does not hit the % mark for any service, which means that they failed to identify more than half of the ne we judged relevant. to sum up, while these results show that silence outweighs noise, alchemyapi and zemanta provide a meaningful input for cultural heritage collections. while combining the services makes it possible to improve on zemanta’s precision score, it also introduces more noise. as a result, the general f score is only slightly better ( %) than zemanta’s.
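the evaluation described above can be sketched in a few lines: each service’s output is compared with the gold standard as a set of entities, and precision, recall, and the balanced f-score follow from the overlap. the entities and service outputs below are invented for illustration (they are not the paper’s data), entities are matched exactly on a hypothetical (record id, surface form, type) triple, and the ‘combination’ is modelled as a plain set union, which is one possible way to merge services.

```python
def prf(gold, predicted):
    """Precision, recall and balanced F-score of a predicted set against gold."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical gold standard: (record id, surface form, type).
gold = {
    (1, "robert de vaugondy", "per"),
    (1, "rhine valley", "loc"),
    (2, "renaissance", "eve"),
    (2, "italian gothic", "misc"),
}
# Hypothetical service outputs; 'gold' and 'table tennis' count as noise.
service_a = {(2, "renaissance", "eve"), (2, "gold", "misc")}
service_b = {
    (1, "robert de vaugondy", "per"),
    (1, "rhine valley", "loc"),
    (2, "table tennis", "misc"),
}

for name, output in [("a", service_a), ("b", service_b),
                     ("a+b", service_a | service_b)]:
    p, r, f = prf(gold, output)
    print(f"{name}: precision={p:.2f} recall={r:.2f} f={f:.2f}")
```

in this toy case the union recovers more of the gold entities than either service alone while also carrying over the spurious hits of both, so its precision is bounded by how much noise each service contributes.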
it should be noted that, contrary to traditional ner tools, the services used provided not only a categorization but a full disambiguation of almost all entities in the form of a uri. of the three services, only alchemyapi provided a number of non-disambiguated entities to which a category was assigned. however, these categories were mostly correct (only four cases of loc or misc wrongly tagged as per), so we decided not to make a further distinction between fully disambiguated and categorized nes. we might wonder about the efficacy of using services that do not even reach the % f-score mark: is there a real added value to be gained from these tools for collection holders? to answer this tricky question, we should first note that the services score unevenly on different ne types: persons are well recognized for instance, so could be individually extracted while leaving more slippery entities such as events aside. of course, events are an important part of collections spread over time, so there could be a case for using a more specific event extractor, or even for designing a cultural heritage-specific ner service, but these considerations are beyond the scope of this paper. our analysis, however, has the merit of showing that a decent number of entities can be retrieved relatively easily by using general-purpose tools. for cultural institutions with limited budgets, we are confident this could still prove a simple and efficient way of gaining extra semantic value from existing metadata. moreover, section expands from the strict ne definition to also include the extraction of relevant terms that were not annotated in the sample because of their variety. the combination of ne and term extraction in a single service makes it easy for non-linguists to benefit from nlp technology.

discussion

section presented a clearly delineated and standardized approach to the precision and recall of ner, which can be compared to results of other publications using the same methodology.
however, this approach excludes from the analyses a large number of generated entities which do not belong to one of the categories defined in section . . and used to annotate the gold standard corpus. nouns or adjectives identified by the services, i.e. terms rather than named-entities, such as epigraphy or gold for example, obviously hold a potential value. this issue opens the door to a number of important questions, which all directly or indirectly refer to the question of how we can assess the overall quality of the outcomes of the services.

[table : distribution of entities across ne types in our sample (columns: type, #, %; rows: per, loc, eve, misc, total)]

[table : results of the services by category (columns: service, type, p, r, f; rows: per, loc, eve, misc, and total for each of alchemyapi, dbpedia, zemanta, and the combination of all three services)]

how can quality be defined in the context of information systems? we can refer to the iso definition, which describes quality as the ‘totality of features and characteristics of a product, process or service that bears on its ability to satisfy stated or implicit needs’ (iso, ). therefore, the quality of an information system denotes its adequacy with respect to the purposes assigned to it, which can be referred to as the ‘fitness for use’ principle. ‘total quality’ does not exist, since the concept is relative: on the basis of a cost-benefit analysis, the most pertinent quality criteria – which can include the timeliness of information and the speed of data transmission or of user access – must be adopted in a given context (boydens and van hooland, ). to tackle the issue of quality at a more fundamental level, one needs to clearly distinguish deterministic data from empirical data.
as boydens clearly points out, deterministic data are ‘characterized by the fact that there is, at any moment, a theory which makes it possible to decide whether a value (v) is correct. this is the case with algebraic data: in as much as the rules of algebra do not change over time, we can know at any time whether the result of a sum is correct. but for empirical data, which are subject to human experience, theory changes over time along with the interpretation of the values that it has made possible to determine’ (boydens, , p. ). cultural heritage metadata, such as those of the cooper-hewitt case study, are empirical by nature and equally lack a direct frame of reference for testing their correctness. their appropriateness to the needs of the field can be determined only indirectly, by considering the relative relevance of the information with respect to the objectives pursued (boydens and van hooland, ). drucker also refers to this tension between deterministic and empirical realities, which often brings us back to the clash between the humanities and the hard sciences: ‘probability is not the same as ambiguity or multivalent possibility within the field of humanistic inquiry. the task of calculating norms, medians, means, and averages will never be the same as the task of engaging with anomalies and taking their details as the basis of an argument’ (drucker, , p. ). in the following subsections, we will pose a number of interrelated questions which will help us to evaluate in a more qualitative way, when compared to section , the output of the entity extraction services, including terms that were not specifically annotated in our sample. by doing so, a more global perspective on the added value of ner and te for the digital humanities can be developed. . are identified entities relevant? the first general question to be asked on the totality of the retrieved entities of the sample, is whether they are relevant with regards to the description. 
a manual inspection of all retrieved entities within the sample allowed an assessment to be made of whether an entity is closely connected or appropriate to the description. this resulted in the following observations for the three different services:
• alchemyapi: entities in total, out of which one is irrelevant (‘della mura’)
• dbpedia: entities in total, all of which are relevant
• zemanta: entities in total, out of which are irrelevant (e.g. ‘table tennis’ and ‘far right politics’)
on the whole, the relevance of the entities is very high. zemanta scores lower than the two other services, as its attempts at detecting hyperonyms sometimes fail. a representative example is the entity white ground technique, which is returned on the basis of the description ‘floral sprays on white ground’. other errors are more difficult to explain, such as the entity table tennis associated with the description ‘oval base decorated with band of overlapping acanthus leaves, applied leaf design above, holds ink pot with open lid, the front showing a mask with protruding tongue. pen holders, in shape of a horn, flank the pot’.

do entities refer to specific or general concepts?
knowing that the large majority of entities are relevant with regard to the description, the next step is to analyze whether the entities carry discriminatory value. variation in the application domain, but also in the type of use, makes it impossible to differentiate low- from high-level semantics in an absolute manner. for example, words considered as stop words in one context can be useful in others: ‘the’ and ‘who’ could be discriminatory in the music domain when querying for ‘the who’. however, certain objective indications can provide indirect insights.
an analysis of the syntactic structure of the entities, for instance, delivers useful information about their complexity. in order to assess the internal structure of the entities retrieved, a part-of-speech (pos) analysis was performed with the help of the natural language toolkit, a collection of modules for text analytics, providing among other tools a probabilistic (maximum-entropy) pos tagger. the tags used originate from the penn treebank project, which is the most widely established reference in the field of natural language processing. table shows the five most common patterns, with figures and percentages for each service (nnp stands for proper noun; nn for singular or mass noun; nns for plural noun and jj for adjective). terms consisting of a single proper noun (japan) account for about a third of alchemy entities, a quarter of zemanta’s but less than % of entities from dbpedia, which recognizes many more common nouns, both singular (silver) and plural (cartoons), explaining its lower score on our sample. entities composed of two proper nouns (abraham lincoln) are also frequent, especially in alchemy, and so are single adjectives (rectangular) to a lesser extent. note that adjectives are also included in the ‘things’ targeted by the linked data principles, so they are similarly identified with a uri. in total, alchemy and dbpedia identified roughly the same number of patterns, and respectively (with a large overlap), whereas zemanta recognized three times as many ( patterns), demonstrating an ability to cover more diverse entities. these include very rare patterns such as nnp nnp jj nn (new york public library) and nnp cd in nnp (louis xvi of france, cd standing for cardinal number and in for preposition), but also common ones such as jj nn (classical ballet) that alchemy and dbpedia generally fail to detect.

table parts of speech patterns of the entities (columns: pos tags, example, and # and % for each of alchemy, dbpedia and zemanta; rows: nnp (japan), nn (silver), nnp nnp (abraham lincoln), nns (cartoons), jj (rectangular))

it should be mentioned that only a minority of the reconciled single-word concepts relate to very broad and general types of objects (e.g. ‘brown’ or ‘windows’), whereas the majority of them deliver sufficient discriminatory value to perform interesting queries over large, heterogeneous metadata sets (e.g. ‘brooch’, ‘anemones’ or ‘gilt’, which identify highly specific object types).

are the entities correctly disambiguated?
one of the main selection criteria for the inclusion of the three specific ner services within our framework is their ability to disambiguate through the provision of uris. a manual inspection of the concepts retrieved within the sample allowed an assessment to be made of how well the different ner services disambiguate, and more particularly what the impact of polysemy is:
• alchemyapi: entities in total, no issue of polysemy was found
• dbpedia: entities in total, two issues of polysemy were found (‘doubles’ and ‘swatch’)
• zemanta: entities in total, nine issues of polysemy were found (e.g. ‘blue flower’ and ‘pink ribbon’)
we can conclude that only a few cases of polysemy were detected. in most cases, the literal sense of an entity (‘blue flower’, i.e. a flower which has the color blue) is mistaken for the figurative sense (‘blue flower’ as the symbol of the joining of human with nature, popularized by german romanticism). such cases are seldom problematic, but could yield embarrassing annotations (e.g. for ‘groin vault’).

what is the overlap and complementarity in between ner services?
an obvious question is to what extent overlap and complementarity exist between the three different ner services. fig. gives a synthetic overview of the statistics. .
% of the ne of our manually annotated gold standard corpus were identified by either alchemyapi, dbpedia spotlight or zemanta. a surprisingly low . % of the entities were found by all three services, illustrating a very small global overlap. when we have a closer look at the figures, we clearly see that dbpedia spotlight delivers a very limited value, as only . % of the ne are identified by this service alone, all the others being also retrieved by zemanta. the figures regarding alchemyapi and zemanta do make a case for a parallel use.

fig. the overlap between ner results of different services (venn diagram of alchemyapi, dbpedia spotlight and zemanta)

despite a partial complementarity between the services, a vast number of named-entities identified in the gsc are left out. these include persons such as ‘droschel’ and ‘the virgin’, locations such as ‘old england’ (tagged as ‘england’) and ‘basilica s. lorenzo’, events such as ‘whitsunday’ and ‘ th century’, and miscellaneous entities such as ‘aztec’ and ‘national india rubber company’. while a proportion of . % might seem low, it means that over half of the meaningful concepts are already extracted automatically, leaving more complex terms for advanced extraction methods or human annotation.

do uris refer to resources or their descriptions?
understanding what a uri is actually referring to is conceptually probably the most challenging question. before referring to examples of the case study, the topic needs to be positioned within the broad debate in the web community on whether a uri should be understood as a reference to a document or a resource. for example, does the uri http://en.wikipedia.org/wiki/richard_nixon identify the former us president, or does it identify a document about this person? clearly, they are distinct entities: they can have separate values for the same property (e.g.
the age of a person is different from the age of a document about that person) and one entity can evolve independently of the other. since one uri can only identify a single resource (berners-lee et al., , ), a concept and its describing document(s) should necessarily have different identifiers. the question of what is identified by a uri has been a long-standing issue for the w c’s technical architecture group (tag), and has been known as ‘http-range ’ (berners-lee, c). the conceptual difficulty arises because http uris serve a double purpose: on the one hand, they identify a resource, and on the other hand, they can provide the address to obtain a representation of that resource. the linked data principles (section . , berners-lee, ) demand that both functions are fulfilled to ensure all uri-identified resources have a representation at their own address. berners-lee ( a,b) initially suggested distinguishing between uris without and with a fragment identifier. the former (e.g. http://en.wikipedia.org/wiki/richard_nixon) would identify documents, and the latter (e.g. http://en.wikipedia.org/wiki/richard_nixon#richard) would identify a concept (within that document). this distinction is also referred to as the difference between information resources and non-information resources. the compromise ultimately chosen by the tag was to make this distinction by inspecting the return code when the uri is dereferenced (fielding, ). while this is an acceptable solution for some, the debate still goes on (rees, ). this issue and the discussion surrounding it are very relevant for the digital humanities community, because they determine how identifiers for documents and concepts should be used.
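the two conventions discussed above (fragment identifiers and the return code observed on dereferencing) can be sketched as a small classification function. this is a minimal illustration, not the procedure used by any of the services: the function name `classify_uri` is hypothetical, the status code is passed in by the caller rather than fetched over the network, and the concrete codes (2xx for an information resource, 303 for a resource described elsewhere) are an assumption about the tag compromise that the text itself does not spell out.

```python
from urllib.parse import urldefrag


def classify_uri(uri: str, status_code: int) -> str:
    """Classify a URI as information or non-information resource.

    `status_code` is the HTTP code observed when the URI is dereferenced
    without following redirects. It is supplied by the caller, so this
    sketch performs no network access. The code-to-category mapping is
    an assumption made for illustration.
    """
    _, fragment = urldefrag(uri)
    if fragment:
        # Early proposal: a URI with a fragment identifier names a concept.
        return "non-information resource"
    if 200 <= status_code < 300:
        # A representation is served directly: the URI names a document.
        return "information resource"
    if status_code == 303:
        # "See Other": the URI names something described at another address.
        return "non-information resource (described elsewhere)"
    return "undetermined"


print(classify_uri("http://en.wikipedia.org/wiki/richard_nixon", 200))
print(classify_uri("http://en.wikipedia.org/wiki/richard_nixon#richard", 200))
```

keeping the status code as a parameter separates the classification rule from the http client, so the same rule can be applied to codes logged during an earlier crawl.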
in particular with ner, we should be careful not to consider a link to a document about a resource as an identifier for that resource. unfortunately, not all apis make this distinction. while alchemyapi and zemanta differentiate between various link types and sources (attaching labels such as ‘dbpedia’, ‘yago’, and ‘website’), there is no explicit indication whether the link points to an information or a non-information resource, although any given link type should consistently produce one or the other. dbpedia spotlight returns dbpedia uris, which always point to the concept. still, it is important that distinct extracted entities have a unique uri to determine whether two pieces of content refer to the same entities. continuing the earlier example, a text about richard nixon and a text about a document that describes president nixon deal with different topics. however, if a ner service assigns the document’s uri as an identifier of the person, that uri cannot be used to identify the document itself, leading to a paradoxical situation. let us bring the discussion back to our case study. the issues mentioned above are clearly illustrated by the various uris referring to the fashion designer isaac mizrahi. alchemyapi provides http://www.freebase.com/view/en/isaac_mizrahi, a link to the biography of mizrahi available in freebase and therefore a document about the subject. on the other hand, zemanta provides a uri to http://www.lyst.com/isaac-mizrahi/, bringing us to an online catalog of objects made by mizrahi. another example of a uri to an information resource is http://www.lastfm.fr/music/lulu, providing access to the music of the artist. in general, we see many non-information uris and few or no information uris.

conclusions and future work
within this article, we focused on the evaluation of three services (alchemyapi, dbpedia spotlight, and zemanta) in order to assess the added value of ner within the digital humanities field.
in order to calculate the precision, recall, and f-score of the different services, a manually annotated gold standard corpus was created, based upon a sample from the smithsonian cooper-hewitt national design museum. the results clearly identified zemanta as the best-performing service (almost % f-score), followed by alchemy (about %), with dbpedia largely lagging behind (only just above %). persons and locations were generally well-recognized. unfortunately, events and dates remained largely unidentified. this is especially surprising for dates, because they are generally in a rigid format and easy to recognize automatically; we therefore suspect the lack of date recognition is due to lack of demand from ner service customers. generally speaking, recall did not hit the % mark for any service, which means that they failed to identify more than half of the ne judged relevant. in summary, these results show that silence outweighs noise, although alchemy and zemanta clearly provide a meaningful input. a large part of the entities identified by the ner services (such as the material out of which an object is made) do not belong to one of the categories (per, loc, eve, and misc) explicitly defined to allow the computation of recall. however, as the terms excluded from the strictly defined categories potentially hold value for search and retrieval purposes, we focused within the discussion in section on a more qualitative analysis of all entities identified by the services, irrespective of the formal categories used to annotate the gold standard corpus. first of all, a manual analysis of all the entities showed that their relevance is very high. almost no entities were found that lacked relevance with regard to the descriptive field from which they were derived.
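the precision, recall and f-score computation mentioned at the start of this passage can be sketched as follows. this is a minimal illustration under stated assumptions: the function name `precision_recall_f1` is hypothetical, entities are compared as exact (surface form, type) pairs, and the toy annotations are invented for the example rather than taken from the gold standard corpus.

```python
def precision_recall_f1(gold: set, predicted: set) -> tuple:
    """Return (precision, recall, F1) for exact-match entity sets."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Invented toy data: four gold annotations, three entities from a service.
gold = {
    ("abraham lincoln", "per"),
    ("japan", "loc"),
    ("whitsunday", "eve"),
    ("aztec", "misc"),
}
predicted = {
    ("abraham lincoln", "per"),
    ("japan", "loc"),
    ("silver", "misc"),  # a term outside the gold categories counts as noise
}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

in this toy case recall is lower than precision, the same pattern of silence outweighing noise that the results above describe.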
one such exceptional error comes from zemanta, which proposes the entity ‘far right politics’ based on the following part of a description: ‘to the very far right and closer to the foreground is a belltower with domed cupola’. the identification of irrelevant entities necessarily has to be done manually, but one could crowd-source this process by inviting users to react when confronted with an irrelevant entity. an analysis of the syntactic structure of the entities demonstrated that a large majority of the entities represent complex concepts, and also made it possible to compare how effectively the different services identify complex entities. alchemy and dbpedia identified roughly the same number of syntactic patterns, whereas zemanta recognized three times as many, demonstrating an ability to cover more diverse entities. these include very rare patterns represented by terms such as ‘new york public library’ or ‘louis xvi of france’. the manual analysis also enabled evaluation of the capacity of the ner services to correctly disambiguate the entities. only a few cases of polysemy were detected within the entities identified by zemanta, caused by confusion between the literal and figurative sense of entities. an obvious question is whether it makes sense to use three ner services in parallel. the venn diagram depicted in fig. represents the overlap and complementarity between the services. almost % of the ne of our manually annotated gold standard corpus were identified by either alchemyapi, dbpedia spotlight or zemanta, but only . % were found by all three services, illustrating a very small global overlap.
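the overlap and complementarity figures summarized above amount to set operations on the gold-standard entities each service recovers. the sketch below shows one way to compute them; the function name `overlap_report` is hypothetical and the entity sets are invented toy data, so the resulting shares do not correspond to the percentages in the venn diagram.

```python
def overlap_report(gold: set, services: dict) -> dict:
    """Share of gold entities found by any, all, or exactly one service."""
    # Restrict each service's output to entities present in the gold standard.
    hits = {name: found & gold for name, found in services.items()}
    found_by_any = set().union(*hits.values())
    found_by_all = set.intersection(*hits.values())
    only = {}
    for name, found in hits.items():
        others = set().union(*(h for n, h in hits.items() if n != name))
        only[name] = len(found - others) / len(gold)
    return {
        "any": len(found_by_any) / len(gold),
        "all": len(found_by_all) / len(gold),
        "only": only,
    }


# Invented toy data for illustration.
gold = {"japan", "abraham lincoln", "whitsunday", "aztec", "droschel", "old england"}
services = {
    "alchemyapi": {"japan", "abraham lincoln"},
    "dbpedia": {"japan", "silver"},
    "zemanta": {"japan", "abraham lincoln", "aztec"},
}

report = overlap_report(gold, services)
print(report)
```

a service whose `only` share is zero adds nothing that the others do not already find, which is the kind of evidence behind the remark that dbpedia spotlight delivers very limited added value.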
on the whole, dbpedia spotlight delivers a very limited added value, but a parallel use of alchemyapi and zemanta definitely makes it possible to identify more ne. the discussion finishes with the challenging issue of what exactly is identified by a uri: a resource or a document about this resource? this has been a long-standing issue for the w c’s technical architecture group (tag), known as ‘http-range ’. the clarification of this issue will only become more urgent as linked data principles are being applied within the digital humanities field. there is a fundamental difference between how services refer to, for example, the fashion designer isaac mizrahi: alchemyapi provides a link to mizrahi’s biography in freebase, whereas zemanta provides a link to an online catalog of products designed by him. this issue also confronts us with a fundamental problem of metadata: they are ever-extendible, in the sense that every representation can be documented by another representation, becoming a resource in itself (boydens, ). distinguishing between information and non-information resources is therefore context-dependent. based on the results of the paper, we can affirm that ner and te provide relevant entities at a low cost, based on non-structured metadata from the description field. however, the analyses also raise awareness of potential difficulties or even outright dangers regarding the use of ner within the digital humanities. for example, if we take the ne ‘henry iv’, zemanta delivers http://rdf.freebase.com/ns/en/henry_iv_of_france, whereas alchemyapi delivers http://dbpedia.org/resource/henry_iv_of_france, http://umbel.org/umbel/ne/wikipedia/henry_iv_of_france, and http://mpii.de/yago/resource/henry_iv_of_france. confronted with the heterogeneity of information given by these four different knowledge bases, the famous julian barnes quote spontaneously comes to mind: ‘history isn’t what happened. history is just what historians tell us’ (barnes, , p. ).
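the heterogeneity of the four ‘henry iv’ uris quoted above can be made explicit by grouping them per knowledge base. the sketch below is only an illustration: the function name `group_by_knowledge_base` is hypothetical, and treating the host name as a stand-in for the knowledge base is a simplifying assumption.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_knowledge_base(uris):
    """Group URIs by host name, used here as a proxy for the knowledge base."""
    groups = defaultdict(list)
    for uri in uris:
        groups[urlparse(uri).netloc].append(uri)
    return dict(groups)


# The four URIs returned for the named entity 'henry iv' in the text above.
henry_iv = [
    "http://rdf.freebase.com/ns/en/henry_iv_of_france",
    "http://dbpedia.org/resource/henry_iv_of_france",
    "http://umbel.org/umbel/ne/wikipedia/henry_iv_of_france",
    "http://mpii.de/yago/resource/henry_iv_of_france",
]

groups = group_by_knowledge_base(henry_iv)
# The last path segment of each URI, i.e. its local name.
local_names = {urlparse(uri).path.rsplit("/", 1)[-1] for uri in henry_iv}

for host, uris in sorted(groups.items()):
    print(host, uris)
print(local_names)
```

the four uris share the same local name but live in four different namespaces, so nothing in the identifiers themselves states that they denote the same person.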
linked data evangelists will instantly point out that different descriptions of the same reality can be reconciled by cross-referencing uris from competing knowledge bases and metadata schemes with owl:sameas. however, in reality and especially in a humanistic one, two things are hardly ever exactly the same. schemes such as dublin core helped us over the last decade to aggregate for example sculptures and paintings by picasso, by mapping the fields ‘sculptor’ and ‘painter’ from individual databases to an aggregator such as europeana using the dublin core field ‘creator’. this approach is very useful, but has also opened the door for numerous metadata quality issues (foulonneau and riley, ). before starting to apply linked data principles on a large scale, the digital humanities community needs to be fully aware of these issues and learn lessons from the existing literature in the information science domain. to conclude, the digital humanities need to launch a broader debate on how we can incorporate within our work the probabilistic character of tools such as ner services. drucker eloquently states that ‘we use tools from disciplines whose epistemological foundations are at odds with, or even hostile to, the humanities. positivistic, quantitative and reductive, these techniques preclude humanistic methods because of the very assumptions on which they are designed: that objects of knowledge can be understood as ahistorical and autonomous.’ (drucker, , p. ). the purely probabilistic nature of ner not only abstracts away from the empirical nature of humanistic data but is also tremendously influenced by economic factors, which remain by and large opaque to the general public but also to researchers.
within the next years, the competition between knowledge bases (dbpedia, representing an open-source approach, versus freebase, which has been acquired by google) and metadata schemes (schema.org, an initiative of google, bing, and yahoo! versus the open graph protocol, a facebook initiative) will rise as linked data principles are applied. whether we like it or not, a small number of competing players such as google and facebook are currently imposing their own way of rendering semantics explicit within the linked data cloud. as a community, the digital humanities remain for the most part ignorant of these issues, as we are busy writing up grant proposals to hook up our research data into the linked data cloud. instead of this hype-driven and opportunistic behavior, the digital humanities community should use its unique potential to stand up and launch a scientific and public debate on these matters.

notes
http://ec.europa.eu/information_society/activities/econtentplus/closedcalls/econtentplus/, accessed january ,
http://openglam.org, accessed january ,
http://lodlam.net, accessed january ,
http://dp.la, accessed january ,
http://europeana.eu, accessed january ,
http://freeyourmetadata.org, accessed january ,
http://purl.org/dc/elements/ . /description, accessed january ,
http://www.opencalais.com/
http://control.cs.berkeley.edu/abc/, accessed january ,
http://vis.stanford.edu/papers/wrangler/, accessed january ,
https://openrefine.org, accessed january ,
https://github.com/rubenverborgh/refine-ner-extension, accessed january ,
http://www.alchemyapi.com/api/entity/, accessed january ,
https://github.com/dbpedia-spotlight/, accessed january ,
http://wiki.dbpedia.org/ontology, accessed january ,
http://developer.zemanta.com/docs/, accessed january ,
although events were previously considered on their own, there is now a tendency to include them into ne. the dutch sonar corpus (oostdijk et al., ), for instance, divides named entities into six categories: per, loc, org, eve, pro (products), and misc (buitinck and marx, ).
the tokenization was performed with the natural language toolkit’s wordpunct tokenizer.
being zero agreement and total agreement. a value of k greater than . shows that the annotation is reliable to draw definitive conclusions.
http://www.nltk.org/, accessed january ,
http://www.ling.upenn.edu/courses/fall_ /ling /penn_treebank_pos.html, accessed january ,

references
ananiadou, s. and mcnaught, j. (eds.)
( ), text mining for biology and biomedicine, artech house, london. auer, s., bizer, c., kobilarov, g., lehmann, j., cyganiak, r. and ives, z. ( ), dbpedia: a nucleus for a web of open data, in the semantic web: th international semantic web conference, nd asian semantic web conference, iswc + aswc , springer, pp. – . bagga, a. and baldwin, b. ( ), entity-based cross-document coreferencing using the vector space model, in proceedings of the th annual meeting of the association for computational linguistics and th international conference on computational linguistics - volume , acl ’ , association for computational linguistics, stroudsburg, pa, usa, pp. – . url: http://dx.doi.org/ . / . barnes, j. ( ), a history of the world in ten and a half chapters, picador. berners-lee, t. ( a), “the range of the http dereference function”, maling list of the w c techni- cal architecture group, available at http://lists.w .org/archives/public/www-tag/ mar/ .html (accessed january , ). berners-lee, t. ( b), “what do http uris identify?”, available at http://www.w .org/ designissues/http-uri.html (accessed january , ). berners-lee, t. ( c), “what is the range of the http dereference function?”, issue of the w c tech- nical architecture group, available at http://www.w .org/ /tag/group/track/issues/ (accessed january , ). berners-lee, t. ( ), “linked data”, available at http://www.w .org/designissues/linkeddata. html (accessed january , ). berners-lee, t., fielding, r. t. and masinter, l. ( ), “uniform resource identifier (uri): generic syn- tax”, ietf request for comments, available at http://tools.ietf.org/html/rfc (accessed january , ). berners-lee, t., masinter, l. and mccahill, m. ( ), “uniform resource locators (url)”, ietf request for comments, available at http://tools.ietf.org/html/rfc (accessed january , ). boydens, i. ( ), informatique, normes et temps, bruylant. boydens, i. 
( ), practical studies in e-government : best practices from around the world, springer, chapter strategic issues relating to data quality for e-government: learning from an approach adopted in belgium, pp. – . boydens, i. and van hooland, s. ( ), “hermeneutics applied to the quality of empirical databases”, journal of documentation, vol. , pp. – . buitinck, l. and marx, m. ( ), two-stage named-entity recognition using averaged perceptrons, in bouma, g., ittoo, a., métais, e. and wortmann, h. (eds.), nldb, vol. of lecture notes in computer science, springer, pp. – . carletta, j. ( ), “assessing agreement on classification tasks: the kappa statistic”, comput. linguist., vol. , mit press, cambridge, ma, usa, pp. – . url: http://dl.acm.org/citation.cfm?id= . chan, s. ( ), “opencalais meets our museum collection: auto-tagging and semantic parsing of col- lection data”, available at http://www.freshandnew.org/ / /opac -opencalais-meets- our-museum-collection-auto-tagging-and-semantic-parsing-of-collection-data/ (ac- cessed january , ). http://lists.w .org/archives/public/www-tag/ mar/ .html http://lists.w .org/archives/public/www-tag/ mar/ .html http://www.w .org/designissues/http-uri.html http://www.w .org/designissues/http-uri.html http://www.w .org/ /tag/group/track/issues/ http://www.w .org/designissues/linkeddata.html http://www.w .org/designissues/linkeddata.html http://tools.ietf.org/html/rfc http://tools.ietf.org/html/rfc http://www.freshandnew.org/ / /opac -opencalais-meets-our-museum-collection-auto-tagging-and-semantic-parsing-of-collection-data/ http://www.freshandnew.org/ / /opac -opencalais-meets-our-museum-collection-auto-tagging-and-semantic-parsing-of-collection-data/ accepted for publication in ‘literary and linguistics computing: the journal of digital scholarship in the humanities’ chiticariu, l., krishnamurthy, r., li, y., reiss, f. and vaithyanathan, s. 
( ), domain adaptation of rule-based annotators for named-entity recognition tasks, in proceedings of the conference on empirical methods in natural language processing, mit, massachusetts, usa, pp. – . doerr, m. ( ), “semantic problems of thesaurus mapping”, journal of digital information, vol. . drucker, j. ( ), debates in the digital humanities, minesota press, chapter humanistic theory and digital scholarship, pp. – . fielding, r. t. ( ), “the range of the http dereference function”, maling list of the w c technical ar- chitecture group, available at http://lists.w .org/archives/public/www-tag/ jun/ . html (accessed january , ). foulonneau, m. and riley, j. ( ), metadata for digital resources, chandos. grishman, r. and sundheim, b. ( ), message understanding conference- : a brief history, in th international conference on computational lingusitics, pp. – . hoffart, j., yosef, a., bordino, i., fürstenau, h., pinkal, m., spaniol, m., taneva, b., thater, s. and weikum, g. ( ), robust disambiguisation of named entities in text, in conference on empirical methods in natural language processing, pp. – . isaac, a., schlobach, s., matthezing, h. and zinn, c. ( ), “integrated access to cultural heritage resources through representation and alignment of controlled vocabularies”, library review, vol. , pp. – . url: www.emeraldinsight.com/ . / iso ( ), quality management systems – fundamentals and vocabulary (iso : ), technical report. kripke, s. ( ), naming and necessity, harvard university press. kulkarni, s., singh, a., ramakrishnan, g. and chakrabarti, s. ( ), collective annotation of wikipedia entities in web text, in th acm international conference on knowledge discovery and data mining, pp. – . lenat, d. b. ( ), “cyc: a large-scale investment in knowledge infrastructure”, communications of the acm, vol. , acm, new york, ny, usa, pp. – . lin, y., ahn, j.-w., brusilovsky, p., he, d. and real, w. 
( ), “imagesieve: exploratory search of museum archives with named entity-based faceted browsing”, journal of the american society for information science and technology, vol. , pp. – . markoff, j. ( ), “start-up aims for database to automate web searching”, available at http://www. nytimes.com/ / / /technology/ data.html (accessed november , ). moens, m.-f. ( ), information extraction: algorithms and prospects in a retrieval context, springer- verlag new york, inc., secaucus, nj, usa. nadeau, d. and sekine, s. ( ), “a survey of named entity recognition and classification”, linguisticae investigationes, vol. , pp. – . navigli, r. ( ), “word sense disambiguation: a survey”, acm comput. surv., vol. , acm, new york, ny, usa, pp. : – : . url: http://doi.acm.org/ . / . oostdijk, n., reynaert, m., monachesi, p., noord, g. v., ordelman, r., schuurman, i. and vandeghinste, v. ( ), from d-coi to sonar: a reference corpus for dutch, in chair), n. c. c., choukri, k., maegaard, b., mariani, j., odijk, j., piperidis, s. and tapias, d. (eds.), proceedings of the sixth international conference on language resources and evaluation (lrec’ ), european language resources association (elra), marrakech, morocco. http://lists.w .org/archives/public/www-tag/ jun/ .html http://lists.w .org/archives/public/www-tag/ jun/ .html http://www.nytimes.com/ / / /technology/ data.html http://www.nytimes.com/ / / /technology/ data.html accepted for publication in ‘literary and linguistics computing: the journal of digital scholarship in the humanities’ ramsay, s. and rockwell, g. ( ), debates in the digital humanities, minesota press, chapter developing things: notes towards an epistemology of building in the digital humanities, pp. – . ramshaw, l. a. and marcus, m. p. ( ), text chunking using transformation-based learning, in acl third workshop on very large corpora, acl, pp. – . rees, j. 
( ), “http-range webography”, w c wiki pages, available at http://www.w .org/wiki/httprange webography (accessed january , ).
rizzo, g. and troncy, r. ( ), nerd: evaluating named entity recognition tools in the web of data, in iswc , workshop on web scale knowledge extraction (wekex’ ), bonn, germany.
rodriquez, k. j., bryant, m., blanke, t. and luszczynska, m. ( ), comparison of named entity recognition tools for raw ocr text, in proceedings of konvens , vienna, pp. – .
segers, r., van erp, m., van der meij, l., aroyo, l., schreiber, g., wielinga, b., van ossenbruggen, j., oomen, j. and jacobs, g. ( ), hacking history: automatic historical event extraction for enriching cultural heritage multimedia collections, in proceedings of the th international conference on knowledge capture (k-cap’ ).
tamilin, a., magnini, b., serafini, l., girardi, c., joseph, m. and zanoli, r. ( ), context-driven semantic enrichment of italian news archive, in proceedings of the th international conference on the semantic web: research and applications - volume part i, eswc’ , springer-verlag, berlin, heidelberg, pp. – .
tjong kim sang, e. f. ( ), introduction to the conll- shared task: language-independent named entity recognition, in proceedings of conll- , taipei, taiwan, pp. – .
tudhope, d., binding, c., jeffrey, s., may, k. and vlachidis, a. ( ), “a stellar role for knowledge organization systems in digital archaeology”, bulletin of the american society for information science and technology, vol. , pp. – .
van der meij, l., isaac, a. and zinn, c. ( ), a web-based repository service for vocabularies and alignments in the cultural heritage domain, in proceedings of the th european semantic web conference (eswc), vol. , pp. – .
van erp, m., oomen, j., segers, r., van den akker, c., aroyo, l., jacobs, g., legêne, s., van der meij, l., van ossenbruggen, j. and schreiber, g. ( ), automatic heritage metadata enrichment with historic events, in trant, j. and bearman, d.
(eds.), museums and the web : proceedings, archives & museum informatics, toronto.
van hooland, s., vandooren, f. and mendéz, e. ( ), “opportunities and risks for libraries in applying for european funding”, the electronic library, vol. , pp. – .
van hooland, s., verborgh, r., wilde, m. d., hercher, j., mannens, e. and van de walle, r. ( ), “evaluating the success of vocabulary reconciliation for cultural heritage collections”, journal of the american society for information science and technology, vol. , pp. – .

uc irvine
western journal of emergency medicine: integrating emergency care with population health

title: consensus guidelines for digital scholarship in academic promotion
permalink: https://escholarship.org/uc/item/ f v f
journal: western journal of emergency medicine: integrating emergency care with population health, ( )
issn: - x
authors: husain, abbas; repanshek, zachary; singh, manpreet; et al.
publication date:
doi: . /westjem. . .
supplemental material: https://escholarship.org/uc/item/ f v f #supplemental
license: https://creativecommons.org/licenses/by/ . / .
peer reviewed

volume , no. : july
western journal of emergency medicine
educational advances

consensus guidelines for digital scholarship in academic promotion

abbas husain, md*; zachary repanshek, md†; manpreet singh, md‡; felix ankel, md§; jennifer beck-esmay¶; daniel cabrera, md||; teresa m. chan, md, mhpe, frcpc#; robert cooney, md, msmeded**; michael gisondi, md††; michael gottlieb, md‡‡; jay khadpe, md§§; jennifer repanshek, md†; jessica mason, md||||; dimitrios papanagnou, md##; jeff riddell, md***; n. seth trueger, md†††; fareen zaver, md‡‡‡; emily brumfield, md¶¶ o
additional author affiliations listed at the end of paper

* staten island university hospital - northwell health, department of emergency medicine, staten island, new york
† lewis katz school of medicine at temple university, department of emergency medicine, philadelphia, pennsylvania
‡ university of california, los angeles medical school, department of emergency medicine, los angeles, california
§ university of minnesota medical school, department of emergency medicine, minneapolis, minnesota
¶ mount sinai st. luke’s-west, department of emergency medicine, new york, new york
|| mayo clinic college of medicine, department of emergency medicine, rochester, minnesota
# mcmaster university, department of medicine, hamilton, ontario
** geisinger medical center, department of emergency medicine, danville, pennsylvania
†† stanford university school of medicine, department of emergency medicine, stanford, california
‡‡ rush university medical center, department of emergency medicine, chicago, illinois
§§ university of florida college of medicine, department of emergency medicine, jacksonville, florida
¶¶ ochsner clinic foundation, department of emergency medicine, new orleans, louisiana
o senior author

section editor: jeffrey druck, md
submission history: submitted january , ; revision received april , ; accepted april ,
electronically published july ,
full text available through open access at http://escholarship.org/uc/uciem_westjem

introduction: as scholarship moves into the digital sphere, applicants and promotion and tenure (p&t) committee members lack formal guidance on evaluating the impact of digital scholarly work. the p&t process requires the appraisal of individual scholarly impact in comparison to scholars across institutions and disciplines. as dissemination methods evolve in the digital era, we must adapt traditional p&t processes to include emerging forms of digital scholarship.

methods: we conducted a blended, expert consensus procedure using a nominal group process to create a consensus document at the council of emergency medicine residency directors academic assembly on april , .

results: we discussed consensus guidelines for evaluation and promotion of digital scholarship with the intent to develop specific, evidence-supported recommendations to p&t committees and applicants. these recommendations included the following: demonstrate scholarship criteria; provide external evidence of impact; and include digital peer-review roles. as traditional scholarship continues to evolve within the digital realm, academic medicine should adapt how that scholarship is evaluated. p&t committees in academic medicine are at the epicenter for supporting this changing paradigm in scholarship.
conclusion: p&t committees can critically appraise the quality and impact of digital scholarship using specific, validated tools. applicants for appointment and promotion should highlight and prepare their digital scholarship to specifically address quality, impact, breadth, and relevance. it is our goal to provide specific, timely guidance for both stakeholders to recognize the value of digital scholarship in advancing our field. [west j emerg med. ; ( ) - .]

introduction
the promotion and tenure (p&t) process requires the appraisal of individual scholarly impact in comparison to scholars across institutions and disciplines. comparative metrics such as the journal impact factor and the h-index are used to quantify and compare the quality of an individual’s scholarship and, therefore, his or her academic merit. as knowledge dissemination methods evolve in the digital era, we must adapt traditional p&t processes to include emerging forms of digital scholarship.

in this paper, we aim to first situate our readers within the literature on the topic of academic scholarship, after which we will describe the process by which we derived and refined our consensus guideline. finally, we will outline the recommendations for the use of digital scholarship for academic promotion made by this particular guideline group.

the evolution of scholarship
scholarship is persistently dynamic. analog technologies progressed from tablet and stone to pen and paper; modern digital scholarship is evolving with blogs, podcasts, and digital journals. still, the standards for evaluation are consistent and focus predominantly on impact and quality of the scholarship. in , ernest boyer of the carnegie foundation originally redefined scholarship for the professoriate as belonging to one of four types.
a decade later, charles glassick followed up this work by describing criteria for evaluating scholarship. to develop nuances around the scholarship of teaching and learning, lee shulman and patricia hutchings clarified specific criteria for this subtype of scholarship (to differentiate it from high-quality, scholarly, and evidence-based teaching). these foundational concepts are summarized in table below.

traditionally, peer-review processes of academic journals served as a safeguard to ensure overall quality, with evaluators deferring to experts and peers within a scholar’s domain to provide an appraisal of quality and an estimate of impact. similarly, bibliometrics of journals (eg, journal impact factor) and number of citations are surrogates for scholarly reach and proof of impact. despite well-described limitations, these metrics are quantifiable and defined processes that are easily compared. thus, they are highly relied upon by p&t committees to compare scholars from disparate disciplines.

when scholarship using new media is produced, it is reasonable to scrutinize the methodology, content, impact, and quality of these new forms of scholarship, such as digital scholarship. our use of the term “digital scholarship” in this paper reflects original content that is disseminated digitally, whether that content is research, teaching materials, enduring resources, commentaries, or other scholarly work.

it is unsurprising that as the world becomes more digital, so do scholarly contributions. online-only journals, pre-print archives, and post-production, peer-review journals (eg, cureus) are rapidly changing the landscape of peer-reviewed publication. similarly, with the advent of peer-reviewed blogs, self-published peer-reviewed books, and educational resource repositories, we see an increased breadth of expression from those engaging in the scholarship of teaching.
these varied forms increasingly mirror the rigor required by glassick’s criteria and shulman’s paradigms.

quantity vs quality
judging these new forms of scholarship is different. in many ways, with advanced web analytics, it is easier to quantify the reach and attention (eg, pageviews, podcast downloads, ip addresses that have accessed the content, and time on page) of these digital assets. (see table for common analytics available for new digital scholarship.) for example, the pubmed-indexed repository mededportal provides download analytics of the published resources that aid in describing entries as fitting within the scholarship of teaching.

however, since many disciplines both within and outside of medicine have not yet embraced digital scholarship as enthusiastically as emergency medicine (em) and critical care, it is no surprise that p&t committees do not yet have specific or universal standards for presentation or evaluation of digital scholarship. those without digital scholarship experience may grapple with understanding the nuances of determining impact and quality in this new era, and their lack of understanding may even result in general skepticism of novel products. thus, fields that have already established robust methods for determining the quality of digital scholarship can lead the way.

since digital scholarship has matured in em, it is appropriate for our field to call for the identification of best practices for evaluating digital scholarship and for consensus in the inclusion of such items in promotions decisions. specific guidelines for p&t are lacking despite robust digital contributions proliferating among academicians. in this work, we provide a guiding framework for the presentation and evaluation of digital scholarship for the applicant for promotion, referees for the candidate, and members of p&t committees.

table . foundations of education and teaching scholarship.
boyer’s scholarly domains: 1) scholarship of discovery; 2) scholarship of integration; 3) scholarship of application; 4) scholarship of teaching.
hutchings and shulman criteria for scholarship of teaching: 1) public; 2) available for peer review & critique according to the standards of a field; 3) able to be reproduced and extended by other scholars.
glassick’s criteria for evaluating scholarship: 1) clear goals; 2) adequate preparation; 3) appropriate methods; 4) significant results; 5) effective presentation; 6) reflective critique.

methods
we conducted a blended, expert consensus procedure using a nominal group process to create a consensus document. invited participants met at the council of emergency medicine residency directors (cord) academic assembly on april , (seattle, wa), to discuss recommendations for evaluation and promotion of digital scholarship with the intent to develop specific, evidence-supported recommendations to p&t committees and applicants. we began with a live brainstorming event. the meeting notes were compiled by a leadership team and formatted into a collaborative working document. all authors continued formulating this document via collaborative online authorship using google docs (google llc, mountain view, ca).

participants
the participants were selected by the leadership of the cord social media and digital scholarship committee (eb, zr, ah). participants were selected based on criteria of known interest or scholarship in the area, national and international level contributions to em digital scholarship, and availability to attend the conference in person or by phone. supplemental digital appendix a lists the original invitation list and individual selection rationale. the complete list of attendees of the in-conference proceedings is listed in the acknowledgments.
procedures
as a large group, the consensus conference participants democratically developed the discussion and brainstorming procedures. based on suggestions from the floor about previous consensus procedures at other similar conferences, our group decided to engage in small-group brainstorming discussions aligned with the expertise and interests of the participants, which were then discussed as a large group and vetted by the rest of the participants. consensus was defined as universal agreement of the participants.

table . summary of metrics used to demonstrate digital scholarship impact, role and quality, with a sample scholarly work.

promotion metric: impact
demonstration of impact shows your work reaches your intended audience.
supporting data: pageviews; time spent on page; likes; impressions; dissemination (shares); unique users; geographic reach; followers on professional social media accounts; social media index; digital object identifier (doi); alexa ranking; altmetrics.
example with metrics: thoma b, chan t, benitez j, lin m. educational scholarship in the digital age: a scoping review and analysis of scholarly products. the winnower. . doi: . / winn. . pageviews altmetric score tweets from users, with an upper bound of , followers

promotion metric: role
demonstration of your “brand” or role within digital scholarship helps establish your area of expertise.
supporting data: editor; author; curator; reviewer; invited commentaries; podcast guest or editor.
example with metrics: [invited commentary] berg a, weston v, gisondi ma. journal club: coronary ct angiography versus traditional care. nuem blog. http://www.nuemblog.com/blog/cta-for-chest-pain/ published online / / .

promotion metric: quality
while also demonstrating commitment to scientific rigor in your work, you may also highlight novel quality assurance methods unique to digital scholarship.
supporting data: metriq- and - , rmetriq; aliem air score; saem online academic resources (soar); social media index (smi); the quality checklists for health professions blogs and podcasts.
example with metrics: [peer-reviewed blog] long, b. “myths in heart failure: part i - ed evaluation” emdocs.net http://www.emdocs.net/myths-in-heart-failure-part-i-ed-evaluation/ published online / / . selected as aliem air cardiovascular, non-acs module . this post was deemed to be of an acceptable score within the aliem air scoring tool, and was granted the designation “air approved” by the adjudicating group of educators. there is a second tier below, known as “honorable mention” for posts of moderate quality that did not meet the threshold for inclusion.

aliem, academic life in emergency medicine; air, approved instructional resources; saem, society for academic emergency medicine

ideation and refinement
the participants self-identified their areas of expertise or interest, and then separated into three groups based on these content areas using an iterative process to formulate specific recommendations. the three discussion groups were tasked with formulating recommendations for the following: 1) the p&t applicant, for promotion of one’s digital scholarship; 2) p&t committee members, for evaluation of the quality of digital scholarship; 3) p&t committee members, for evaluation of the impact of digital scholarship.

small groups presented preliminary recommendations to the entire group and made further revisions via iterative discussion. participants transcribed an outline of the discussion and final recommendations and agreed upon them in a democratic fashion. participants self-selected areas of the manuscript to prepare based on expertise, interest and group approval. all members developed the manuscript from the outline via collaborative authorship. all participants contributed to the manuscript, and cord social media and digital scholarship committee members (ah, ms, zr, eb) served as final editors of the manuscript.
results
recommendations for presenting digital scholarship for promotion and tenure

demonstrate scholarship criteria
when presenting digital scholarship to a p&t committee, begin by ensuring and demonstrating that it meets the criteria of scholarship as defined by glassick and expanded upon by sherbino and colleagues with regard to social media. the adapted criteria are as follows: 1) create original content; 2) advance the field of health professions education by building on theory, research or best practice; 3) be archived and disseminated; and 4) provide the health professions education community with the ability to comment on and provide feedback in a transparent fashion that informs wider discussion. in addition, consider providing evidence of archiving and dissemination, such as google scholar indexing or inclusion of a digital object identifier (doi).

provide external evidence of impact
ensure that your digital scholarship is reflected consistently throughout your promotions dossier. dissemination metrics are important to include as measures of impact. for example, some blog editors will provide information about how many times a post has been accessed and the locations of its readers, if requested for p&t purposes. such metrics of dissemination and impact should be presented in the dossier as evidence of your professional reputation as a scholar in your field. additional metrics include pageviews, downloads, and geographic reach. other programs assessing the reach of scholarship, such as altmetrics, may also be valuable. the social media index is a relatively new technique to assess the impact of websites and could be used as a surrogate for impact, much the same as a journal’s impact factor. see table .

other measures of impact could include letters of support and awards. if permitted by your institution, consider obtaining letters of support with regard to your digital scholarship.
you may also consider inviting both peer letters and letters from non-collaborators discussing the dissemination metrics and impact of specific pieces of scholarship, or simply your overall impact. there are also a number of digital scholarship-based awards, which may be of value for demonstrating scholarly impact.

include digital peer-review roles
include editor or peer-reviewer roles for digital scholarly content in your curriculum vitae (cv) in a similar manner as you would for traditional print literature. it is important to highlight these supporting components of digital scholarship and they should be factored into the p&t decisions.

citing digital scholarship
cite scholarly work on your cv using a consistent format, whether that work was published in a hard-copy journal or as digital content. reorganize the categories of scholarly publications on your cv to include a section for “digital scholarship,” which is the appropriate subheading for items such as blog posts, podcasts, and videos. see table below for example subheadings for the scholarly bibliography of your cv. include only those items that reflect true scholarship and relate to the health professions or sciences. do not list citations for personal website posts or other digital content that is unrelated to your academic position.

table . subheadings for “scholarly bibliography” section of curriculum vitae.
- original research articles - peer reviewed
- editorials, reviews, case reports, letters, commentaries - peer reviewed
- textbooks, textbook chapters
- proceedings and non-refereed papers
- digital scholarship
- abstracts
- exhibits, audiovisuals, teaching materials
- media appearances and quotes - print, television, online

consistently format your scholarship across all subheadings on your cv following the american medical association (ama) manual of style, th ed.
the ama manual describes the methods for citing scholarship in most of the categories listed. examples of each citation type are provided above, and selected citations are adapted in table . digital scholarship is best formatted using the ama manual instructions for “internet documents.” academic life in emergency medicine (aliem) also offers guidelines for citing digital scholarship, with examples.

digital scholarship is often criticized for lack of peer review, which leads to confusion about the quality and integrity of articles published in exclusively online journals. peer review is a requirement for all journals to be indexed and available on pubmed, including online journals. research articles published in online-only journals that have a pubmed unique identifier (pmid) should not be listed under “digital scholarship,” but rather alongside similar scholarly work published in peer-reviewed print journals. regardless of the mode of publication, all peer-reviewed research should be listed under the same cv subheading in the “highest” possible category.

blog posts that are cited under a “digital scholarship” cv subheading can be peer reviewed as well. for example, some blogs offer a peer-review process for authors and identify which posts have undergone peer review. therefore, use qualifiers to identify any digital scholarship citation on your cv that was peer-reviewed or invited. these qualifiers may add credibility to your scholarship when a p&t committee reviews your cv.

discussion
crafting a digital scholarship mission statement
a digital scholarship mission statement can provide a framework for your p&t committee to understand and interpret your digital scholarship. akin to the educational philosophy statement of a teaching portfolio, the digital scholarship mission statement provides a lens through which the committee can interpret the congruence and value of your scholarship.
this narrative should articulate the beliefs that drive your digital work in ways that give perspective to your activities and provide consistency with the academic and social media strategies of each institution. table below lists specific considerations to include. please see supplemental digital appendix b for a sample narrative.

table . suggested examples of digital scholarship citations and qualifier use.

format: last name, first initial. “title of submission.” name of publisher. url as hyperlink. published online xx/xx/xx.

example: gisondi ma, stefanac l. “the feedback formula: part , giving feedback.” international clinician educators blog. https://icenetblog.royalcollege.ca/ / / /the-feedback-formula-part- -giving-feedback/. published online / / .

example qualifiers for curriculum vitae:
[blog post] gisondi ma. “leadership in medical education: addressing sexual harassment in science and medicine.” international clinician educators blog. https://icenetblog.royalcollege.ca/ / / /leadership-in-medical-education-addressing-sexual-harassment-in-science-and-medicine/ published online / / .
[podcast guest] kellogg a, gisondi ma. “sex and why episode : how to give better feedback.” in: sex & why podcast (wolfe j, editor-in-chief.) https://www.sexandwhy.com/sex-why-episode- -how-to-give-better-feedback/ published online / / .
[peer-reviewed] schnapp b, fant a, powell e, richards c, gisondi m. “ tips for how-to-run an awesome works-in-progress meeting.” academic life in emergency medicine. http://www.aliem.com/ -tips-works-progress-meeting/ published online / / .
[commentary, invited] berg a, weston v, gisondi ma. journal club: coronary ct angiography versus traditional care. nuem blog. http://www.nuemblog.com/blog/cta-for-chest-pain/ published online / / .
[video] mason j. placing a transvenous pacemaker. emergency medicine: reviews and perspectives. october , . https://www.emrap.org/episode/transvenous/transvenous. accessed november , .
[traditional paper with altmetrics] chan tm, gottlieb m, sherbino j, cooney r, boysen-osborn m, swaminathan a, ankel f, yarris lm. the aliem faculty incubator: a novel online approach to faculty development in education scholarship. academic medicine. oct ; ( ): - . altmetrics data: https://wolterskluwer.altmetric.com/details/

use traditional frameworks: harnessing the teaching portfolio
we recommend using traditional frameworks to describe digital scholarly activity and support for academic promotion. one such example of this is the teaching portfolio. as not all institutions require a separate educational portfolio, we recommend that you present your digital scholarship alongside traditional scholarship according to your institutional requirements. refer to your respective institutional guidelines for requirements and formatting of teaching portfolios. regardless, to facilitate appraisal by p&t committees you should create a dossier that includes a digital mission statement, demonstrates alignment with overall career development goals, and describes the scholarly significance of your digital work. digital scholarship should not replace materials that are typically included in a teaching portfolio, such as course evaluations or other traditional measures of teaching effectiveness.

teaching portfolios should summarize teaching effort and quality that meet the criteria of boyer’s scholarship of teaching. within the teaching portfolio, you may reflect and provide exemplars of digital works and curricula that you have created or curated for learners, but you will not actually list item-by-item the digital scholarship you produce; this should take place in the cv.
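the citation format suggested in this paper’s table of example citations (last name, first initial. “title of submission.” name of publisher. url. published online xx/xx/xx., optionally prefixed with a bracketed qualifier such as [peer-reviewed]) can be sketched in code for applicants who keep their cv entries in a structured list. this is only a minimal illustration under stated assumptions: the names `DigitalCitation` and `format_digital_citation` are hypothetical, the sample entry is invented placeholder data, and the sketch is not taken from the ama manual or from any existing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DigitalCitation:
    """One digital scholarship entry for a cv (hypothetical structure)."""
    authors: str                      # e.g. "doe j, roe a" (last name, initials)
    title: str                        # title of the post, episode, or video
    publisher: str                    # blog, podcast, or site name
    url: str                          # link to the archived item
    published: str                    # date string, already written as xx/xx/xx
    qualifier: Optional[str] = None   # e.g. "peer-reviewed", "blog post"


def format_digital_citation(c: DigitalCitation) -> str:
    """Render an entry in the table's suggested order of elements."""
    # the bracketed qualifier is optional and, when present, leads the entry
    prefix = f"[{c.qualifier}] " if c.qualifier else ""
    # avoid a doubled period if the stored title already ends with one
    title = c.title.rstrip(".")
    return (
        f'{prefix}{c.authors}. "{title}." {c.publisher}. '
        f"{c.url}. published online {c.published}."
    )


# placeholder entry for illustration only; not a real publication
sample = DigitalCitation(
    authors="doe j",
    title="an example post",
    publisher="example blog",
    url="https://example.org/post",
    published="01/02/20",
    qualifier="peer-reviewed",
)
print(format_digital_citation(sample))
```

keeping the qualifier as a separate field, rather than typing it into the title, makes it straightforward to apply qualifiers consistently across every entry under the “digital scholarship” subheading.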
an entry in a portfolio would holistically describe the pedagogical principles behind a digital educational program or innovation (eg, if you are the creator of a popular podcast, you would explain how you developed the podcast, how you engaged stakeholders to develop the podcast, and, if possible, share data to convey its impact at large through analytics). in contrast, entries of digital scholarship on a cv would be entered individually. table provides some common examples of digital scholarship, and how they might align best with previous descriptions of traditional academic scholarship (as per boyer, glassick, hutchings and shulman).

appraising impact
there are no hard and fast rules for determining impact. cabrera and colleagues have previously suggested scale-based assessments of social media-based impact. they provide ample guidance to promotions committees for comparing size and scale of various media within a specific subtype (eg, international blog vs a local blog). we highly recommend that readers review this article for further guidance.

another tool is the social media index, which seeks to create an “impact factor”-like metric based on social media followership. this tool would be best used to judge the impact of an entire digital media collection, such as an entire website or podcast. this tool is available online (https://www.aliem.com/social-media-index/) and has been revised and validated against quality metrics within emergency medicine free open access medicine resources.

appraising quality
because lower barriers to entry make digital scholarship easier to produce, and because less serious, nonmedical, pseudoscientific, or predatory online content breeds general skepticism, groups have sought to scaffold and support end-users and educators in seeking high-quality online resources.
the online medical education community has worked to quell skepticism by establishing methods to appraise the quality of digital scholarship. see below for a list of critical appraisal tools for rating online secondary resources. for those who have been asked to review files as external referees, these tools may be very useful in guiding us toward high-quality educational content from an educator’s cv or portfolio.

some scholars in this space have proposed that we move beyond bibliometrics and surrogates for quality (eg, impact factor, citations, altmetrics), and that p&t committees consider applying direct quality assessments to items of interest (eg, applying the revised metriq or aliem approved instructional resources (air series) scores to a few choice works of digital scholarship from a faculty member’s cv, or applying the prisma reporting guidelines to a few systematic reviews). equitably applying both descriptive bibliometrics (eg, citation rate, h-index, etc.) and quality audits to all works of scholarship (digital or otherwise) would go a long way to augment p&t processes. table contains suggested critical appraisal tools to facilitate secondary resource evaluation.

table . specific elements to consider within a mission statement.
- reinforce why your digital scholarship exists and is important to the field.
- explain your digital scholarship’s broad goals and objectives.
- explain your perception of needs in the modern learning environment, and how that affects your methods.
- explain how your approach to digital scholarship/teaching has changed over time.
- explain the niche that you are filling, specifically highlighting how your role/expertise at your institution gives you a reputable voice.
- describe how your digital scholarship complements your other, more traditional forms of scholarship.
- explain how digital scholarship aligns with your overall career objectives.
- name your intended target audience and describe other collateral audience groups that may benefit from your public academic work.
- describe best practices for ensuring quality during the content creation process:
  a. highlight team-based and interdisciplinary scholarship as markers of quality
  b. preview external validation processes of your digital scholarship (below).
- highlight the ancillary benefits that have arisen because of your digital scholarship presence, such as invited lectures or collaborations on additional scholarship.

limitations
the live conference was limited to invited participants who could join in person or by phone. those with scheduling conflicts were therefore excluded from the live session, perhaps limiting valuable insights and contributions. however, those who could not attend the live conference were still heavily involved in the organization and creation of the recommendations post-conference via a collaborative writing process. additionally, all authors participated robustly in the asynchronous editing of this manuscript, reducing the potential that important viewpoints were excluded. conference participants were selected by the committee members, and important contributors may have been overlooked. to reduce this possibility, invited members were requested to suggest additional invitees.

finally, as digital scholarship participants and creators, there may be bias toward legitimizing our own work over less-familiar scholarship. we attempted to ground our recommendations using best available evidence in order to reduce this bias. however, there is certainly a paucity of literature on how social media is viewed (or accepted) as a form of scholarship by the academy.
thus, further explorations of the acceptability or evaluation of digital scholarship by p&t committees may be a useful program of research going forward. a paper has recently been published about perceptions in the librarian sciences world that is quite interesting, and worthy of replication within academic medicine.

table . how types of digital scholarship might be described using traditional descriptions of academic scholarship.
criteria for scholarship per hutchings and shulman: 1) public; 2) available for peer review and critique according to the standards of a field; 3) able to be reproduced and extended by other scholars.

blogging
example of digital scholarship: blog post providing a new insight into a novel teaching technique, with a recipe for helping students learn about social justice by meeting patient partners.
does this meet the criteria for scholarship per hutchings and shulman?
  1) is it public? yes.
  2) is it available for peer review? yes (some blogs have pre-publication peer review, others have comments enabled to allow for post-publication peer review).
  3) able to be reproduced and extended by other scholars? yes; because it is openly published on the internet, it is available for review and extension.
what type of boyer’s scholarship is this? scholarship of teaching.

podcasting
example of digital scholarship: podcast synthesizing the role of human factor engineering in the emergency department.
does this meet the criteria for scholarship per hutchings and shulman?
  1) is it public? yes.
  2) is it available for peer review? yes, listeners can leave comments on most podcast hosting sites.
  3) able to be reproduced and extended by other scholars? yes; because it is openly published on the internet, it is available for review and extension.
what type of boyer’s scholarship is this? scholarship of integration (merging of engineering and medicine).

tweeting
example of digital scholarship: tweetorial reviewing and appraising the latest evidence on a topic.
does this meet the criteria for scholarship per hutchings and shulman?
  1) is it public? yes.
  2) is it available for peer review? yes, tweetorials can be found by searching twitter.
  3) able to be reproduced and extended by other scholars? yes; because it is openly published on the internet, it is available for review and extension.
what type of boyer’s scholarship is this? scholarship of application (helping others to determine if evidence might be applied in their context).

conclusion
as traditional scholarship continues to evolve within the digital realm, academic medicine must also adapt how that scholarship is evaluated. p&t committees in academia are at the epicenter of supporting the changing paradigm in scholarship. unlike with traditional academic products, where reach and impact were difficult to quantify, web-based metrics allow us to track unique users and their locations. the authors suggest that committees critically appraise digital scholarship using the methods outlined in this paper. applicants for appointment and promotion should highlight and prepare their digital scholarship in a way that specifically addresses quality, impact, breadth, and relevance. it is our goal to provide specific, timely guidance for both stakeholders to recognize the value of digital scholarship in advancing our field.

acknowledgments
the authors would like to thank the council of residency directors for their support.

additional author affiliations listed here for lack of space on the first page:
|||| university of san francisco-fresno, department of emergency medicine, fresno, california
## sidney kimmel medical college at thomas jefferson university, department of emergency medicine, philadelphia, pennsylvania
*** university of southern california keck school of medicine, department of emergency medicine, los angeles, california
††† northwestern university, department of emergency medicine, chicago, illinois
‡‡‡ university of calgary, department of emergency medicine, calgary, alberta, canada

western journal of emergency medicine, volume , no. : july. husain et al. consensus guidelines for digital scholarship in academic promotion.

references
azer sa, holen a, wilson i, et al. impact factor of medical education journals and recently developed indices: can any of them support academic promotion criteria? j postgrad med. ; ( ): - . . schimanski la, alperin jp. the evaluation of scholarship in academic promotion and tenure processes: past, present, and future. f res. ; : . . chan tm, kuehl d. on lampposts, sneetches, and stars: a call to go beyond bibliometrics for determining academic value. acad emerg med. ; ( ): - . . boyer el. ( ). scholarship reconsidered: priorities of the professoriate. princeton, nj: princeton university press. . glassick ce, huber mt, maeroff gi. ( ). scholarship assessed: evaluation of the professoriate. san francisco, ca: jossey-bass. . glassick ce. boyer’s expanded definitions of scholarship, the standards for assessing scholarship, and the elusiveness of the scholarship of teaching. acad med. ; ( ): - . . hutchings p, shulman ls. the scholarship of teaching: new elaborations, new developments. change: the magazine of higher learning. ; ( ): - . . cabrera d, roy d, chisolm ms. social media scholarship and alternative metrics for academic promotion and tenure. j am coll radiol. ; ( pt b): - . . thomas b, chan t, benitez j, et al. educational scholarship in the digital age: a scoping review and analysis of scholarly products. the winnower. . . cold spring harbor laboratory. biorxiv homepage. available at: https://www.biorxiv.org/. accessed june , . . cureus. an introduction to publishing with cureus. available at: https://www.cureus.com/author_guide. accessed june , . . adler jr, chen tm, blain jb, et al. #openaccess: free online, open-access crowdsource-reviewed publishing is the future; traditional peer-reviewed journals are on the way out. cjem. ; ( ): - . . azim a, beck-esmay j, chan tm. editorial processes in free open access medical educational (foam) resources. aem educ train. ; ( ): - . . chan tm, gottlieb m, sherbino j, et al. ( ).
education theory practical: volume . san francisco, ca: academic life in emergency medicine. . association of american medical colleges. mededportal: the journal of teaching and learning resources. available at: https://www.mededportal.org/. accessed june , . . jetem: journal of education and teaching in emergency medicine. available at: https://jetem.org/. accessed june , . . cadogan m, thoma b, chan tm, et al. free open access meducation (foam): the rise of emergency medicine and critical care blogs and podcasts ( - ). emerg med j. ; (e ):e - . . colmers in, paterson qs, lin m, et al. the quality checklists for health professions blogs and podcasts. the winnower. . . thoma b, joshi n, trueger ns, et al. five strategies to effectively use online resources in emergency medicine. ann emerg med. ; ( ): - . . lo a, shappell e, rosenberg h, et al. four strategies to find, evaluate, and engage with online resources in emergency medicine. cjem. : - . . waggoner j, carline jd, durning sj. is there a consensus on consensus methodology? descriptions and recommendations for future consensus research. acad med. ; ( ): - . . google accounts. available at: http://docs.google.com. . woods ra, artz jd, carrière b, et al. caep academic symposium on education scholarship: training our future clinician educators in emergency medicine. cjem. ; (s ):s - . . kessler cs, leone ka. the current state of core competency assessment in emergency medicine and a future research agenda: recommendations of the working group on assessment of observable learner performance. acad emerg med. ; ( ): - . . glassick ce. reconsidering scholarship. j public health management practice. ; : - . . sherbino j, arora vm, melle ev, et al. criteria for social media-based scholarship in health professions education. postgrad med j. ; ( ): - . . trueger ns, thoma b, hsu ch, et al. the altmetric score: a new measure for article-level dissemination and impact. ann emerg med. ; ( ): - . .
thoma b, sanders jl, lin m, et al. the social media index: measuring the impact of emergency medicine and critical care websites. west j emerg med. ; ( ): - . . monette d, joshi n. aliem awards : congratulations to our winners! available at: https://www.aliem.com/ / /aliem-awards- -winners/. accessed june , . . iverson c, christiansen s, flanagin a, et al. ( ). ama manual of style ( ed.). new york city: oxford university press. . mason j. how to cite podcasts, videos, and blogs in publication. . available at: https://www.aliem.com/ / /cite-podcasts-videos-blogs-publication/. accessed june , . . national center for biotechnology information, u.s. national library of medicine. pubmed. available at: https://www.ncbi.nlm.nih.gov/pubmed/. accessed june , .

address for correspondence: abbas husain, md, staten island university hospital, department of emergency medicine, seaview avenue, staten island, ny . email: abbashu@gmail.com.
conflicts of interest: by the westjem article submission agreement, all authors are required to disclose all affiliations, funding sources and financial or management relationships that could be perceived as potential sources of bias. no author has professional or financial relationships with any companies that are relevant to this study. there are no conflicts of interest or sources of funding to declare.
copyright: © husain et al. this is an open access article distributed in accordance with the terms of the creative commons attribution (cc by . ) license. see: http://creativecommons.org/licenses/by/ .
. kobner s. introducing in-line peer review: advancing the state of academic blogging. . available at: https://www.aliem.com/ / /in-line-expert-peer-review-academic-blogging/. accessed june , . . cabrera d, vartabedian bs, spinner rj, et al. more than likes and tweets: creating social media portfolios for academic promotion and tenure. j grad med educ. ; ( ): - . . kuhn gj. faculty development: the educator’s portfolio: its preparation, uses, and value in academic medicine. acad emerg med. ; ( ): - . . seldin p, miller je, seldin ca. ( ). the teaching portfolio: a practical guide to improved performance and promotion/tenure decisions. san francisco, ca: jossey-bass. . lamki n, marchand m. the medical educator teaching portfolio: its compilation and potential utility. sultan qaboos univ med j. ; ( ): - . . thoma b, sanders jl, lin m, et al. the social media index: measuring the impact of emergency medicine and critical care websites. west j emerg med. ; ( ): - . . thoma b, chan tm, kapur p, et al. the social media index as an indicator of quality for emergency medicine blogs: a metriq study. ann emerg med. ; ( ): - . . chan tm, grock a, paddock m, et al.
examining reliability and validity of an online score (aliem air) for rating free open access medical education resources. ann emerg med. ; ( ): - . . colmers-gray in, krishnan k, chan tm, et al. the revised metriq score: a quality evaluation tool for online educational resources. aem education and training. ; ( ). . liberati a, altman dg, tetzlaff j, et al. the prisma statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. plos med. ; ( ):e . . gruzd a, staves k, wilk a. tenure and promotion in the age of online social media. proceedings of the american society for information science and technology. ; ( ): - .

different preservation levels: the case of scholarly digital editions
oltmanns, e, et al. . different preservation levels: the case of scholarly digital editions. data science journal, : , pp. – . doi: https://doi.org/ . /dsj- -
research paper
elias oltmanns, tim hasler, wolfgang peters-kottig and heinz-günter kuper
department scientific information, konrad-zuse-zentrum für informationstechnik berlin, de
corresponding author: elias oltmanns (oltmanns@zib.de)

ensuring the long-term availability of research data forms an integral part of data management services. where oais-compliant digital preservation has been established in recent years, in almost all cases the services aim at the preservation of file-based objects. in the digital humanities, research data is often represented in highly structured aggregations, such as scholarly digital editions. naturally, scholars would like their editions to remain functionally complete as long as possible.
besides standard components like webservers, the presentation typically relies on project specific code interacting with client software like webbrowsers. especially the latter being subject to rapid change over time invariably makes such environments awkward to maintain once funding has ended. pragmatic approaches have to be found in order to balance the curation effort and the maintainability of access to research data over time. a sketch of four potential service levels aiming at the long-term availability of research data in the humanities is outlined: ( ) continuous maintenance, ( ) application conservation, ( ) application data preservation, and ( ) bitstream preservation. the first being too costly and the last hardly satisfactory in general, we suggest that the implementation of services by an infrastructure provider should concentrate on service levels and . we explain their strengths and limitations considering the example of two scholarly digital editions. keywords: scholarly digital editions; digital preservation; information infrastructure; service levels introduction digital resources are becoming crucially important to the humanities. products of digital scholarship, i.e. research data and publications, are no longer predominantly confined to the linear representation of text but include a wide spectrum of data types, from digital objects like images or audio files to databases and aggregated software environments. a characteristic digital resource in many humanities disciplines is the multi layered annotated collection, particularly in the form of a scholarly digital edition (sde). sdes are now establishing themselves as the norm in many areas of philological endeavour (hughes, constantopoulos and dallas ). 
during the zuse institute berlin’s participation in the humanities data centre project hdc ( – ) we came across several projects considered typical examples of research data presented as web applications by our partners from the digital humanities, among them the sde ‘opus postumum’. conserving the interdependent components of such a software stack can be a daunting task. like other digital resources, complex software environments tend to become less accessible over time due to financial, technological, legal, or organizational issues, as well as due to the lack of auditing or due to cultural changes which threaten usability. the more technological features an sde incorporates, the faster and more likely new problems in an evolving environment will occur, including system security concerns, failed compatibility and unsupported external dependencies (bingert and buddenbohm ). sahle and kronenwett ( ) pose a central question, ‘who cares for the presentational systems/“living systems” in the long run?’. in an ideal world, this task is shared between an information infrastructure provider (data facility/computing centre) and scholars of digital humanities with an insight into digital editing.

from our point of view as an infrastructure provider, research data has particularly good prospects of being preserved for a long time if it can be dissected into singular digital objects like tiff images and xml files without losing important information implicitly hidden in the application code. such digital objects and their metadata can be stored with reasonable technical and organizational effort in a digital preservation system compliant with the open archival information system standard (oais; ccsds ). hence, standard preservation actions like the migration of digital objects to other formats upon detection of data format obsolescence can be applied.
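to make the idea of a standard preservation action more concrete, the following is a minimal python sketch and not the workflow of any particular oais-compliant system. the format registry `MIGRATION_TARGETS`, the media type names in it, the `DigitalObject` record and the function `plan_migrations` are all hypothetical; they only illustrate how objects stored in a format flagged as obsolete could be selected for migration.

```python
from dataclasses import dataclass

# Hypothetical registry: obsolete format -> preferred successor format.
# A real preservation system would consult a maintained format registry.
MIGRATION_TARGETS = {
    "image/x-old-bitmap": "image/tiff",
    "text/x-legacy-markup": "application/xml",
}


@dataclass(frozen=True)
class DigitalObject:
    identifier: str
    media_type: str


def plan_migrations(objects, targets=MIGRATION_TARGETS):
    """Return (identifier, old_format, new_format) for each obsolete object."""
    return [
        (obj.identifier, obj.media_type, targets[obj.media_type])
        for obj in objects
        if obj.media_type in targets
    ]


# Illustrative inventory of singular digital objects (assumed, not real data).
inventory = [
    DigitalObject("page-001", "image/tiff"),
    DigitalObject("page-002", "image/x-old-bitmap"),
    DigitalObject("transcription", "application/xml"),
]

for identifier, old, new in plan_migrations(inventory):
    print(f"{identifier}: migrate {old} -> {new}")
```

the sketch only works because each object is a self-contained file with a known format, which is the property the paragraph above identifies as favourable for long-term preservation.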
this perspective may easily be in conflict with scholars’ interest to interact with the original web interface whose content and technology remains unchanged or, even better, is continually enhanced. ongoing maintenance of the fully functional online presentation would seem to be the ideal preservation solution but gets unaffordable in the long run. employing the preservation strategy of emulation to enable an authentic representation of sdes would require the encapsulation and distribution of the complete hardware and software stacks, including the operating system and driver interdependencies. this is a complex and likewise expensive undertaking that potentially incurs intellectual property rights issues when offered as a service.

in this paper we discuss a set of service levels for the long-term preservation of sdes beyond the day that project funding has expired. after giving just a little background on sdes and the challenging question of what actually needs to be preserved in the next two sections, we go on to introduce the service levels, putting them into perspective with regard to the interests/needs of contemporary and future scholars as well as resource constraints. concentrating on two service levels, we discuss their applicability and potential to complement each other considering two specific examples of sdes in the section “use cases: opus postumum and edition visualization technology”. these use cases shall demonstrate how decisions in the design phase of an sde may affect the time span and the size of the community it will be accessible for.

excursus: scholarly digital editions
sahle ( ) defines a scholarly edition to be ‘the critical representation of historic documents’. a representation entails recoding, a transformation from one medium to another, e.g. by means of transcription or facsimiles. ‘document’ is to be understood as a more generalised, i.e. broader category of material than just text.
sdes are deemed to be guided by a digital paradigm in their theory, method and practice (sahle ). it seems natural yet still worth noting that an sde is not simply the online version of a traditional scholarly edition. popular features include parallel views, for instance of the transcription juxtaposed with the facsimile, search and annotation facilities or tools to study a particular segment in detail. the increasing importance of the sde has already been acknowledged by gabler ( ), who states that the digital medium is becoming the scholarly edition’s original medium.

from a technical point of view, most current sdes are based on the guidelines of the text encoding initiative (tei). tei provides guidance on the markup of text in xml, accounting for numerous aspects of semantics as well as visual appearance. even the correspondence between passages of transcribed text and their original locations on facsimiles can be encoded in tei-xml. hence, preserving an sde basically means taking care of text encoded in tei-xml, as well as possibly accompanying facsimiles in some image format, and a (web) application rendering all that information in a form deemed conducive to scholarly work by the creators of the sde.

what to preserve?
when considering the preservation of an sde as a complete digital resource, one question to be addressed should be how to decide which features are essential to be preserved: what are the ‘significant properties’ of the sde? the inspect project defined significant properties to be ‘[t]he characteristics of digital objects that must be preserved over time in order to ensure the continued accessibility, usability, and meaning of the objects, and their capacity to be accepted as evidence of what they purport to record’ (grace, knight, and montague ). the identification of significant properties is a complicated task that must reflect the consensus within an edition project regarding the most important aspects of an sde.
since the idea that certain properties of a digital object be deemed significant derives from resource constraints, this concept is central to the question of which service level for the preservation of an sde is appropriate (and economically feasible). in contrast to a traditional scholarly edition in print, the information provided by an sde is only indirectly accessible to humans and always requires sophisticated decoding and processing in order to be presented in some legible form. one of the most obvious reasons for digitally encoding data in the first place is the intention to perform complex and/or fast operations on that data. a default route to the contents of print media, though not necessarily a full understanding of their meaning, may be assumed due to prevalent cultural techniques that only change at a moderate pace. a digital paradigm cannot rely on such a default route and should account for rapidly changing processing environments.

the challenge posed by the digital stewardship of sdes lies in the fact that the data is presented in specialised aggregations that themselves are significant in terms of understanding, using, and curating the application (flanders and muñoz ). however, scholars in the digital humanities tend to dismiss preservation efforts that do not encompass the ‘look and feel’ of the original web presentation in all its detail. during the hdc project, a non-representative survey of scholars at berlin-brandenburgische akademie der wissenschaften revealed a strong commitment to the concept of exact, authentic representation of an sde as being paramount to its scholarly value. tei-xml represents a significant investment of labour in the course of encoding a text and discarding it would be unwise, especially considering the fact that xml is not concerned with presentation by design, thus making it a flexible format that can be used in other contexts (e.g.
after applying xslt transformations). also, turska, cummings, and rahtz ( ) point out that ‘different usage scenarios call for different presentations’. returning to sahle’s notion of a digital paradigm, the data might very well be valuable outside the context of the original application.

service levels for long term availability of sdes
a comprehensive solution to the ongoing threat of research data loss should involve permanently funded institutions, i.e. institutions of a durable nature like data facilities providing storage capacity (computing centres). since project funding is per se always restricted in time, there is an inherent problem for the sustainability of infrastructures. when there is limited time to prove the benefit of a service, the possible outreach is also limited. on the other hand, only services with considerable backing in the community have the chance to get long-term funding. the establishment of humanities data centres as a collaboration between well established infrastructure providers and digital humanists is one possible approach to ensure a trustworthy long term preservation. examples are the hdc (buddenbohm, engelhardt & wuttke ), the data center for the humanities (dch) cologne (sahle & kronenwett ), the data and service center for the humanities (dasch) (rosenthaler, fornaro & clivaz ) and the kompetenznetzwerk digitale edition (konde, ). key to the success of such infrastructures is the relevance to the research community of the services provided. with regard to the sustainability of software environments such as sdes, a diverse spectrum of service levels is conceivable. we present four potential service levels, ordered by increasing effort required on the part of interested scholars to work with the preserved research (figure ).

continuous maintenance
this service level is discussed mostly for the sake of completeness and is essentially hypothetical, because it implies the ongoing maintenance of the original application.
naturally, this form of indefinite curation can be assumed to be the preservation level most desired by the researchers, due to the undiminished ‘functional completeness’ of the application. however, it requires permanent funding and it does not scale.

figure : different aspects of the proposed preservation levels.

continuous maintenance entails active development in order to accommodate changes not only on the host system but also changes to popular user clients. as long as the research project is alive and funded there are enough resources for the maintenance of the software stack. subsequently, one could envision an indefinitely funded foundation or trust that supports the ‘ongoing editing, enrichment and processing of digital research data’ (sahle and kronenwett ). however well this approach may work for a single application, it becomes an increasingly laborious undertaking with each research product added to the service portfolio of an operator/data centre, since there are currently no agreed upon standards on how to use the tools of the trade. technological development and accumulating security issues probably render the curation of an sde for more than a few years a futile endeavour.

application conservation
this service level is a more pragmatic approach to the long-term availability of applications in the humanities realm using virtualisation technology. this concept has been developed in the context of the hdc project (under the name ‘application preservation’; bingert and buddenbohm ). all the components making up the application are transferred to a virtualised environment, e.g. a docker container or a virtual machine. in the particular case of web applications, both server components (webserver, deployed web applications and their runtime environments, possibly a database server, etc.) and client components (webbrowser, maybe some extensions, etc.)
need to be virtualised in this way, possibly in two separate environments. each time a user requests access to this particular research resource, the docker container gets fired up and the conserved environment is presented to the user. freezing the web application in an environment together with a tried-and-tested client (i.e. a particular browser version), which is then made accessible to interested users as a remote desktop client in their favourite browser, allows the conserved application to retain a close match to the original user experience.

initial changes to the web application will be required when it is first moved to a virtualised environment run by an infrastructure provider. a drawback is the lack of modifiability at the point of access: the application can be used as is. a user can interact with the current application and reproduce research results retrospectively. access to underlying raw data is usually not possible.

once up and running, the virtualised environment is frozen, effectively growing into an unpatched system with serious security implications. appropriate counter measures will have to be put into place on the host system, and, in all likelihood, access will have to be restricted to authenticated users. hence, the sde, including its original user interface, can be conserved as long as the required environment is supported by the virtualisation solution and access restrictions for security reasons do not have to be too rigorous. this results in a limited lifespan of maybe – years, depending on the time that can be invested.

this service intentionally does not extend to the sophisticated emulation of hard- and software environments. once the application cannot be catered for within the deployed virtualisation infrastructure, it is no longer considered accessible and the service is terminated.

application data preservation
this service level aims at the separate preservation of all digital data objects underlying the application.
with regard to sdes, we concur with rosselli del turco ( ), who states that ‘the only viable solution to ensure that an edition is usable for the foreseeable future is to completely decouple the edition data from the visualization mechanism’. application data preservation means that the relevant data reflecting significant properties of the application are exported, or rather, extracted from the software environment. the ‘atomised’ application data can be stored in a long-term digital preservation system using open, well-known and documented data formats, accompanied by technical, administrative and descriptive metadata explaining the application, its intended performance, and the underlying assumptions on presentation or visualisation. migration of data formats from an obsolete format to a new format is the predominant preservation action for this rather static type of (research) data.

this level may be compared to well-documented raw data storage in the empirical sciences. close cooperation between data curators and scholars should ensure thorough description prior to the ingest into a long-term digital preservation system. in this case, the lifespan of the intellectual content of the application is expected to be much higher than that for application conservation (level ). compared to application conservation, more effort has to be invested in the data acquisition process and also in any activity to virtually resurrect the data in the original or a new context. on the other hand, there is potential for the future reuse of data in hitherto unseen research contexts, provided that the data are maintained according to the fair principles of research data management (wilkinson et al. ): findability, accessibility, interoperability, and reusability.
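as a hedged illustration of application data preservation, the following python sketch builds a simple manifest for a directory of extracted data objects. it is an assumption-laden example, not the ingest procedure of the authors or of any specific preservation system: the function name `build_manifest`, the manifest layout, the choice of sha-256 and the sample file names are all hypothetical. only technical metadata is derived automatically; descriptive metadata is passed in by hand, reflecting the cooperation between curators and scholars described above.

```python
import hashlib
import json
import mimetypes
import tempfile
from pathlib import Path


def build_manifest(data_dir, description):
    """Describe every file below data_dir with basic technical metadata."""
    data_dir = Path(data_dir)
    files = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        media_type, _ = mimetypes.guess_type(path.name)
        files.append({
            "path": path.relative_to(data_dir).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            # Fall back to a generic type when the extension is unknown.
            "media_type": media_type or "application/octet-stream",
        })
    return {"description": description, "files": files}


# Throwaway files standing in for exported edition data (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "transcription.xml").write_text("<TEI/>", encoding="utf-8")
    Path(tmp, "facsimiles").mkdir()
    Path(tmp, "facsimiles", "page1.tif").write_bytes(b"not a real image")
    manifest = build_manifest(tmp, {"title": "example edition (hypothetical)"})
    print(json.dumps(manifest, indent=2))
```

a manifest of this kind travels with the data objects, so that a future user can verify the files and understand what they were meant to represent without access to the original application.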
bitstream preservation

it can be assumed that bitstream preservation provides the longest prospective time for which the integrity of preserved data can be assured. the concept is well understood and addresses ‘the first requirement of digital preservation – to remain an intact physical copy of the digital object’ (brown : ). it relies on different copies ideally stored on different machines with different technologies. the application (or at least its integral components like underlying databases, presentation code, logic code and configuration) is saved as is to some long-term storage, typically without further context information. we suggest that the sole bitstream be enriched with at least technical and administrative metadata, ideally with descriptive metadata. bitstream preservation is the preservation level with the least functional completeness. it relies entirely on future users and their ability to reverse engineer the environment needed to run the application, be it software or hardware. this approach is considered a last resort when a higher level of preservation or maintenance is unfeasible.

use cases: opus postumum and edition visualization technology

despite the existence of the de-facto encoding standard tei-xml, sdes are far from uniform. in fact, they can vary considerably with regard to the software components involved, implementation details and the data model. we are going to study two setups of sdes with rather differing design goals more closely. their characteristics shall be discussed in relation to the requirements and goals of the service levels proposed in this paper. in particular, it shall be demonstrated why application data preservation is not a general purpose service level suitable for arbitrary applications but under certain conditions should be considered a valuable service complementary to the, in some sense, more generic application conservation approach.
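As a brief aside on the bitstream preservation level described earlier in this section: the requirement of keeping intact copies on different machines is commonly checked by comparing checksums. The following is a minimal sketch, assuming SHA-256 digests recorded at ingest time. The function names `sha256_of` and `damaged_copies` and the file names are hypothetical, and nothing here is taken from the preservation systems discussed in this article.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def damaged_copies(copies, expected_digest):
    """Return the paths whose current checksum differs from the recorded one."""
    return [str(p) for p in copies if sha256_of(p) != expected_digest]


# Demo with two temporary "storage locations" (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "copy_a.bin")
    second = Path(tmp, "copy_b.bin")
    first.write_bytes(b"edition data")
    second.write_bytes(b"edition data")
    recorded = sha256_of(first)          # digest stored at ingest time
    second.write_bytes(b"edition dat4")  # simulate silent corruption
    report = damaged_copies([first, second], recorded)
    demo_result = [Path(p).name for p in report]
```

A check of this kind only reports that a copy has changed. Repairing it from an intact copy, and recording the event in the administrative metadata, would be separate steps.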
the opus postumum is a critical edition of an extant manuscript by the famous german th century scholar immanuel kant. first preparatory steps toward digitising and re-editing the manuscript took place in and work has continued at the berlin-brandenburgische akademie der wissenschaften (bbaw) ever since (figure ). the bbaw, being one of the partners in the hdc project mentioned in the introduction, had provided the data of this sde for our investigations into possible solutions for long-term preservation.

figure : opus postumum online-edition. original screenshot (https://xmlpublic.bbaw.de/legacy/apps/kant/web/index.html).

the opus postumum has been realised as a web application based on the xml database exist (http://exist-db.org/), which stores the tei encoded transcriptions, and the digital image library digilib (https://robcast.github.io/digilib/) providing access to the facsimiles. the setup has been designed not only to publish project results but also to aid project participants in the process of editing the text. apart from native xml processing, the database provides an access control and permissions system and its resources are accessible from the popular oxygen xml editor. digilib serves the facsimile images in different resolutions at the user’s request and restricts data transmission to the segment currently being viewed, increasing performance when zooming in and studying the facsimile in detail. the frontend of the opus postumum, developed at the bbaw, renders the facsimile and its transcription one manuscript page at a time, side by side, providing the user with various options of zooming in and navigating through the manuscript.
in particular, the opus postumum boasts reciprocal linking between passages in the transcription and the corresponding segments on the facsimile. the two representations of a passage are highlighted when hovering the mouse over either one of them, bringing the other one into view first if necessary, which effectively provides simultaneous scrolling.

searching for existing approaches to reusability in the domain of scholarly digital editing, we came across the publishing tool edition visualization technology evt (http://evt.labcd.unipi.it/) (figure ). its development was started as part of the ‘digital vercelli book’ project (rosselli del turco ). distinguishing between the data forming the essence of the sde and its presentation to the user has been a guiding principle of the project. consequently, evt has been designed to transform a tei-encoded text or transcription, possibly accompanied by facsimile images, into a complete web application, i.e. to build the edition around the data (rosselli del turco ). contrary to the opus postumum, evt is based on a client-only approach relying on established standard technologies like html , css and javascript. the intention is to ease (re)deployment of an sde on just about any web server, without further software being required on the server. on the other hand, evt lacks the image annotation services and transfer performance optimisations provided by digilib and the database features of exist, mentioned earlier as part of the opus postumum, which are intended to assist scholars (not least the editors themselves) in their work on and with the sde.

the two approaches toward creating sdes described above would benefit in different ways from the service levels discussed in this article.
application data preservation being all about making data available for reuse in other applications, a few tests have been performed on samples from the opus postumum and the digital vercelli book kindly provided by the bbaw and roberto rosselli del turco (univ. turin), respectively. simple xml schema validation on the tei-xml encoded data revealed that both data sets violated the schema in some way or other. this was less surprising than one might think given that the tei guidelines are complex and voluminous. however, fixing the markup turned out to be fairly trivial with regard to the digital vercelli book, whereas the task was more laborious, i.e. not always scriptable in an obvious way, in the context of the opus postumum. moreover, we tried our hand at processing some data from the opus postumum for visualisation based on evt. for this endeavour, the evt documentation could be relied upon but the opus postumum, not being designed for data exchange, provides purely non-technical documentation explaining the user interface only. as it turned out, preparing a single page of the manuscript is not trivial but possible. anything beyond that, however, involves the rather complex merging of tei documents, with id collisions to be handled along the way due to an unorthodox data model.

figure : the digital vercelli book in evt. original screenshot (http://evt.labcd.unipi.it/).

in defence of the opus postumum it must be emphasised at this point that reusability of the underlying data in different contexts was, quite simply, not a design priority during its development, unlike evt. therefore, evt has to be considered at an advantage with regard to the investigations described above. all the same, the results indicate that considering interoperability and reusability at the design stage really does make a difference.
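The id collisions mentioned above can be illustrated with a small sketch. It does not reproduce the procedure actually applied to the opus postumum data, which the text does not describe. It assumes that each page is a separate XML document, that colliding identifiers are `xml:id` attributes, and that local references take the form `#id`. The function names, the per-page prefixes and the namespace-less sample elements are hypothetical simplifications; real TEI documents use the TEI namespace and a `teiHeader`.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def prefix_ids(root, prefix):
    """Rename every xml:id in one document and update its local '#id' pointers."""
    # Collect the ids defined in this document before changing anything.
    old_ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    for el in root.iter():
        for name, value in list(el.attrib.items()):
            if name == XML_ID:
                el.set(name, prefix + value)
                continue
            # Pointer attributes may hold several space-separated targets.
            tokens = value.split()
            renamed = [
                "#" + prefix + t[1:] if t[:1] == "#" and t[1:] in old_ids else t
                for t in tokens
            ]
            if renamed != tokens:
                el.set(name, " ".join(renamed))


def merge_pages(pages):
    """Merge (prefix, xml_text) pairs under one hypothetical <merged> wrapper."""
    merged = ET.Element("merged")
    for prefix, xml_text in pages:
        root = ET.fromstring(xml_text)
        prefix_ids(root, prefix)
        merged.append(root)
    return merged


# Two pages that both use the id "l1" (sample data, not from the edition).
page1 = '<div xml:id="l1"><seg corresp="#l1">a</seg></div>'
page2 = '<div xml:id="l1"><seg corresp="#l1">b</seg></div>'
merged = merge_pages([("p1_", page1), ("p2_", page2)])
ids = [el.get(XML_ID) for el in merged.iter() if el.get(XML_ID)]
pointers = [seg.get("corresp") for seg in merged.iter("seg")]
```

Prefixing keeps every identifier unique within the merged document while leaving the references inside each page intact. References that point from one page to another would need a mapping of their own, which this sketch does not attempt.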
the intentional separation of data and interface makes any evt-based sde a promising candidate for application data preservation. at this service level, the data is readily available to the research community, allowing scholars to use it with their own tools. meanwhile, a plugin called tei publisher (https://teipublisher.com/) is available for recent releases of exist, the database serving the opus postumum. the purpose of tei publisher is to generate standalone web applications from an application originally served by exist. eliminating exist from the requirements to access the sde obviously reduces the complexity of the software stack to be maintained. also, such a tool may conceivably serve the cause of data reusability by enforcing restrictions on the tei input to be processed, and generating a certain degree of uniformity in the produced output. on a more hypothetical basis it can be argued that the client-only approach pursued in the development of evt supports the hosting of an unmaintained sde on an otherwise fully maintained web server for an extended period of time without too many risks for the host. of course, clients are subject to change and may even be replaced, possibly breaking the interface of the sde in the course of time. at this point, application conservation may give the user access to a legacy browser that still renders the sde as originally intended. since evt has been developed for scholars to build sdes from their own datasets, it is shipped with documentation that may be a valuable source of information even when evt ceases to function after the discontinuation of development. it is of potential interest to anyone performing preservation actions on an evt-based sde as part of service level or trying to make sense of such an sde preserved only at the bitstream level. application conservation appears to be a good fit for the opus postumum, because the whole environment can be cloned with only minor changes to the setup.
this flexibility comes at a cost, however, since the service can only be operated under conditions that mitigate the risk of attacks on an unmaintained system. the opus postumum is based on rather complex server components executed in a java runtime environment. access restrictions may conceivably be imperative sooner than hoped for in order to protect the system. as far as bitstream preservation is concerned, application conservation as an intermediary stage will probably assist in the curation process. it provides a fairly straightforward test case ensuring that all components required for execution (and thus potentially helpful for understanding the application) have been included. obviously, this does not automatically enforce the appropriate level of supplemental documentation.

summary and conclusions

infrastructure providers cannot step in for the developers of applications when they run out of resources, be it because project funding expires or their interests and priorities have shifted. nonetheless, the service levels proposed and discussed with special regard to sdes address different needs arising from varying usage scenarios and application architectures. providers implementing service levels , , and should be in a position to offer reasonable or even attractive solutions for the mid- and long-term preservation of research data in the humanities. the zuse institute operates the digital preservation service ewig (klindt & amrhein ) based on the oais reference model, thus providing the infrastructure for service levels and . service level is offered at the hdc in göttingen. as stated in the introduction, the infrastructure is only part of the responsibilities and the curation effort preceding the transfer of research data to the infrastructure provider involves a lot of dedication on the part of the scholars familiar with the data.
still, the impact of decisions early in the design stage and all through the development of an sde on its preservation prospects can hardly be overestimated. in particular, the scope of application data preservation (service level ) is restricted due to two important factors: firstly, it only makes sense if a subset of the information built into the application can be envisaged as a valuable contribution to future research without the full context of the original application; secondly, the extraction of this information from the application, modelled and encoded according to well-established and documented standards, must be affordable. the major benefit is that we are now dealing with objects that can be preserved in an oais-compliant way with a fair chance of being subjected to preservation actions when required. bitstream preservation is the only one among the remaining service levels that can be handled by an oais-compliant preservation system but obviously limits the options for preservation planning. application conservation, on the other hand, may be considered closer to hosting than a preservation service in the traditional sense. it is very flexible, though, and probably the only viable option for maintaining access to certain applications and their data at least for some time. note that service level is more flexible and therefore applicable to far more web applications than service level but only operable for a limited time. for a certain period of time, the two service levels are complementary for the majority of applications that satisfy the requirements of application data preservation. as indicated in figure , service level can be assumed to preserve reusable and possibly interoperable data for much longer than service level is able to retain access to the original application – both exceeded by the virtually indefinite time spanned by bitstream preservation.
together, service levels and are a good match to provide findable, accessible, reusable and possibly interoperable research data in the humanities.

acknowledgements

we would like to thank alexander czmiel of berlin-brandenburgische akademie der wissenschaften and roberto rosselli del turco of università di torino for generously providing us with the data required to analyse the opus postumum and the digital vercelli book, respectively, as well as their helpful answers to the questions we put to them in the course of our investigations.

competing interests

the authors have no competing interests to declare.

references

bingert, s and buddenbohm, s. . research data centre services for complex software environments in the humanities. information services & use, ( – ): – . doi: https://doi.org/ . /isu-
brown, a. . practical digital preservation. a how to guide for organizations of any size. london: facet publishing. doi: https://doi.org/ . /
buddenbohm, s, engelhardt, c and wuttke, u. . angebotsgenese für ein geisteswissenschaftliches forschungsdatenzentrum. zeitschrift für digitale geisteswissenschaften, heft . doi: https://doi.org/ . / _
ccsds. . reference model for an open archival information system (oais). recommended practice, issue . washington, dc: consultative committee for space data systems. https://public.ccsds.org/pubs/ x m .pdf.
flanders, j and muñoz, t. . an introduction to humanities data curation. dh curation guide: a community resource guide to data curation in the digital humanities. http://guide.dhcuration.org/contents/intro/.
gabler, hw. . theorizing the digital scholarly edition. literature compass, ( ): – . doi: https://doi.org/ . /j. - . . .x
grace, s, knight, g and montague, l. . inspect final report. london: king’s college london. http://citeseerx.ist.psu.edu/viewdoc/download?doi= . . . . &rep=rep &type=pdf.
hughes, l, constantopoulos, p and dallas, c. . digital methods in the humanities.
in schreibman, s, siemens, r and unsworth, j (eds.), a new companion to digital humanities. chichester: wiley. doi: https://doi.org/ . / .ch
humanities data centre project. . hdc-designprojekt. http://humanities-data-centre.org/.
klindt, m and amrhein, k. . one core preservation system for all your data. no exceptions! in lee, c, zierau, e, woods, k, tibbo, h, pennock, m, maeda, y, mcgovern, n, konstantelos, l and crabtree, j (eds.), proceedings of the th international conference on digital preservation. school of information and library science, university of north carolina at chapel hill. http://hdl.handle.net/ / .
konde. . kompetenznetzwerk digitale edition. http://www.digitale-edition.at/.
rosenthaler, l, fornaro, p and clivaz, c. . dasch: data and service center for the humanities. digital scholarship in the humanities, (issue suppl_ ): i –i . doi: https://doi.org/ . /llc/fqv
rosselli del turco, r. . the battle we forgot to fight: should we make a case for digital editions? in driscoll, mj and pierazzo, e (eds.), digital scholarly editing: theories and practices. cambridge: open book publishers. doi: https://doi.org/ . /obp.
rosselli del turco, r. . the digital vercelli book. http://vbd.humnet.unipi.it/beta/.
rosselli del turco, r, buomprisco, g, di pietro, c, kenny, j, et al. . edition visualization technology: a simple tool to visualize tei-based digital editions. journal of the text encoding initiative, issue . doi: https://doi.org/ . /jtei.
sahle, p. . what is a scholarly digital edition? in driscoll, mj and pierazzo, e (eds.), digital scholarly editing: theories and practices, – . cambridge: open book publishers. doi: https://doi.org/ . /obp.
sahle, p and kronenwett, s. . sustainability?! four paradigms for humanities data centers. in curdt, c and willmes, c (eds.), proceedings of the nd data management workshop. kölner geographische arbeiten, , cologne: geographisches institut der universität zu köln. doi: https://doi.org/ . /tr db.kga .
turska, m, cummings, j and rahtz, s. . challenging the myth of presentation in digital editions. journal of the text encoding initiative, issue . doi: https://doi.org/ . /jtei.
wilkinson, md, dumontier, m, aalbersberg, ij, appleton, g, et al. . the fair guiding principles for scientific data management and stewardship. scientific data, : . doi: https://doi.org/ . /sdata. .

how to cite this article: oltmanns, e, hasler, t, peters-kottig, w and kuper, h-g. . different preservation levels: the case of scholarly digital editions. data science journal, : , pp. – . doi: https://doi.org/ . /dsj- -

submitted: june accepted: august published: october

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

data science journal is a peer-reviewed open access journal published by ubiquity press.
publications

article

developing a business plan for a library publishing program

kate mccready and emma molls *

university libraries, university of minnesota, minneapolis, mn , usa; mccre @umn.edu
* correspondence: emolls@umn.edu; tel.: + - - -

received: august ; accepted: october ; published: october

abstract: over the last twenty years, library publishing has emerged in higher education as a new class of publisher. conceived as a response to commercial publishing practices that have strained library budgets and prevented scholars from openly licensing and sharing their works, library publishing is both a local service program and a broader movement to disrupt the current scholarly publishing arena. it is growing both in numbers of publishers and numbers of works produced. the commercial publishing framework which determines the viability of monetizing a product is not necessarily applicable for library publishers who exist as a common good to address the needs of their academic communities. like any business venture, however, library publishers must develop a clear service model and business plan in order to create shared expectations for funding streams, quality markers, as well as technical and staff capacity.
as the field is maturing from experimental projects to full programs, library publishers are formalizing their offerings and limitations. the anatomy of a library publishing business plan is presented and includes the principles of the program, scope of services, and staffing requirements. other aspects include production policies, financial structures, and measures of success. keywords: business plan; publishing; academic libraries; open access . introduction academic publishing, fueled by the boom of digital internet technologies, has created space for new types of publishers, including library as publisher. because of the growth of new library publishing programs, and the distinctiveness of scholarly communication approaches across institutions, this paper advocates for the creation and adoption of business plans within library publishing programs. the foundations of library publishing are presented, along with examples of current library publishing programs. this paper walks through a business plan template that can be used by current and future library publishers. readers working in established library publishing programs that currently lack a business plan, and readers who are considering launching a library publishing program, will find a number of guiding questions for each section of the included business plan template. finally, the authors hope that this paper engages the entire library publishing community and increases the number of publicly available library publishing business plans. . development of library publishing programs academic library publishing programs first saw adoption in the early s and have continued to grow over the last two decades [ ]. since , the library publishing coalition (lpc), a membership organization made up of mostly north american academic libraries, has increased membership by over %, and has surveyed academic libraries who identify as actively engaging in publishing. 
thirty-six percent of lpc members established their publishing program in the last ten years [ ].

publications , , ; doi: . /publications www.mdpi.com/journal/publications

beyond the library publishing coalition, “ . . . most of the arl (association of research libraries) member libraries are engaged in publishing or publishing support activities” [ ]. libraries started publishing programs for a variety of reasons, including “mission-aligned work for exploring new opportunities in the digital age [ . . . ], demonstrating the market for scholarly, peer-reviewed, open access monographs, and empowering the library to engage with and effect changes in scholarly publishing” [ ]. although the goals of individual programs may vary, overall, library publishing programs “ . . . are focusing on the capabilities and possibilities of new models” and working to avoid the “replicat[ion of] traditional publishing services” [ ]. for many libraries, providing publishing services is an extension of a larger suite of scholarly communication offerings, offered frequently to “ . . . advance a strategic objective of transitioning the library’s collecting activities away from licensing content and towards supporting open access to scholarship” [ ]. although libraries addressing scholarly communication issues was discussed as far back as , scholarly communication service efforts vary greatly across libraries [ ]. in a survey, ithaka s+r found that across surveyed institutions, scholarly communication programs rarely share organizational structures, functions, and objectives [ ].
the varying makeup of scholarly communication programs, combined with the relatively new, and often experimental, nature of library publishing services, leaves libraries to newly navigate the complex landscape of open access publishing. unlike other scholarly communication services within a library, where the costs have typically been absorbed by assigning new duties to existing staff, or hiring new staff with new skill sets, the expenditures made on behalf of publishing activities require creative thinking to ensure that necessary elements transform a document into a publication (e.g., having a reputable authority vet the content, applying production techniques to the content, making the published work available through distribution networks, etc.). these early days of library publishing are seeing an examination of which elements that go into creating a publication are necessary to instill trust and produce high quality scholarship, while also examining how those activities should be paid for. business planning for library publishing examines both of these elements.

open access context: library publishing as disruption

most library publishers firmly align with the open access movement which “ . . . had its origins in the crisis in scholarly communication and publishing, which has both caused and is the result of declining collections budgets, more demand for newer, expensive resources, and greatly increased pricing for serials, electronic resources, and other library materials [ ]”. as of , % of library publishing programs focused entirely or almost entirely on open access publications [ ]. both the budapest open access initiative and the berlin declaration on open access to knowledge in the sciences and humanities focus on scholarly publishing’s results: to make the knowledge created and published open for reading and reuse. the process of getting there is less straightforward.
some institutions and individual authors attempt to achieve open access through the piecemeal deposit of a copy of the work in an institutional repository. others rely on article processing charges (apcs) to create an open copy of the published work. but there is a finite amount of money within scholarly publishing. expenditures on these “solutions” are not relieving the pressure on library collection budgets. in the monitoring the transition to open access report focused on the uk, the findings (based on a sample of uk universities) suggest that subscription expenditures have grown % since (or an increase of £ million) while apc expenditures for those institutions grew from £ , to £ . million. in years, those institutions spent an additional £ . million while, at the time of publication, % of materials remained locked behind a paywall [ ]. a growing number of libraries are now asking: can library budgets support the production of scholarly publications differently? can they instead support the production in a new system where they know and control the costs? furthermore, current conversations ask: can academia achieve its open access aspirations while continuing to support the commercial models of production [ ]?

although library publishers make up only a tiny fraction of the scholarly publishers in existence, they are attempting to shift the ecosystem. instead of spending library resources to purchase bundled collections of titles where subscription and production costs are hidden, some institutions are applying a portion of those resources to the production and publication of those works. libraries are allocating funding to support infrastructure for launching new publications that may not have fit into the legacy commercial publishing model. charles watkinson writes “if visualized as a spectrum from informal to formal, the formal book (or journal) occupies a narrow space at the right-hand end of the continuum.
to its left lie the many other types of publishing and dissemination needs that a campus community may have” [ ]. in , the research report “library publishing services: strategies for success” noted that “the vast majority of library publishing programs (almost %) were launched in order to contribute to change in the scholarly publishing system, supplemented by a variety of other mission-related motivations. the prevalence of mission-driven rationale aligns with the funding sources reported for library publishing programs, including library budget reallocations ( %), temporary funding from the institution ( %), and grant support ( %). however, many respondents expect a greater percentage of future publishing program funding to come from service fees, product revenue, charge-backs, royalties, and other program-generated income” [ ]. it is questionable if it is in the best interest of scholarly communications to attempt to continue supporting, or adopting, the business models used by commercial publishers. libraries are hiring staff, and engaging with third party vendors, to support publishing services that are grounded in providing both technology support for publishing software systems and production services. they are learning about the necessary production work and finding expertise outside of the library to perform required tasks that aren’t typically available within a library’s staff’s skillset. importantly, they do not necessarily need to recoup those costs; however, they must spend those dollars judiciously and produce knowledge resources that benefit both their campus and the broader scholarly publishing landscape. therefore, they need a wholly new business model that holds them accountable to high quality standards, and fulfills their mission, while also being fiscally responsible agents of the dollars entrusted to them. . 
institutional budget models and their impacts

libraries, and u.s.-based academic libraries in particular, typically receive the majority of their funding from state appropriations, tuition, and grant awards. based on data collected by the association of research libraries, % of public university library budgets are from state or institutional allocations [ ]. the type of budget model used at an institution, and how that model determines the process by which money is allocated to units, will likely have an impact on the services offered by the library. some budget models have disincentives for attempting cost recovery for operations, while others make it politically difficult to serve “clients” that are not directly affiliated with the university. additionally, an institution’s budget model can have an impact on how library publishing services are funded.

a variety of budget models used in institutions of higher education are explained very well in budgets and financial management in higher education by margaret j. barr and george s. mcclellan. as listed in the first column of table , the book’s authors detail the types of structures used at institutions of higher education. examining these structures, the authors of this article outline some potential impacts on starting or funding a library publishing service in the second column.

table . higher education institution budget models. the description of each budget model is based on budgets and financial management in higher education [ ].

all funds
  type of budget model: emphasizes a holistic goals-oriented perspective. takes into account all sources of revenue and expense. facilitates the monitoring of resource allocation in pursuit of institutional goals.
  potential effect on library publishing programs: may need library publishing to be seen as an institutional goal, or there is a related goal of transforming scholarly publishing. cost-recovery income may be considered “revenue” that is scooped.

formula
  type of budget model: relies on the use of specified criteria in allocating resources. development of the formula is critically important. retrospective in nature.
  potential effect on library publishing programs: formulas are typically developed at a very high level (based on enrollments or facilities costs), so the overall library budget could fluctuate. cost recovery may be difficult if units cannot keep their own income.

performance based
  type of budget model: allocation of resources premised on attainment of performance measures. strength in linking state priorities for higher education to resource allocation.
  potential effect on library publishing programs: performance measures are often tied to graduation or job placement rates. library publishing may not be seen as contributing to those performance measures. cost recovery may be difficult if units cannot keep their own income.

incremental
  type of budget model: establishes across-the-board percentage changes in expenditures over current budget based on assumptions regarding revenues for coming year.
  potential effect on library publishing programs: assuming the library is allowed to reallocate funds internally, this would allow for the development and growth of library publishing. cost recovery revenues may affect future allocations.

initiative-based
  type of budget model: requires units to return portion of their budgets for the purposes of funding new initiatives. units apply to the pool to support new initiatives.
  potential effect on library publishing programs: requires successful application to begin or grow services. growth may need to be self-funded through cost-recovery activities if the initiative funding is one-time vs. recurring.

planning, programming, and budgeting systems
  type of budget model: premised on tightly integrating strategic planning, budgeting, and assessment. decisions are a function of identified challenges and opportunities, weighing risk/reward ratios, and monitoring performance.
  potential effect on library publishing programs: similar to performance based. cost recovery may be difficult if units cannot keep their own income. requires a great deal of planning and staff to calculate and monitor the work.

responsibility center
  type of budget model: locates responsibility for unit budget performance at the local level. units are seen as revenue centers or cost centers. units are allowed to retain some portion of end-of-year budget surplus.
  potential effect on library publishing programs: other revenue-generating units are “taxed” for library services, making it difficult to do additional cost recovery. increased scrutiny on serving externally owned publications, which may require complete cost recovery when serving societies/non-profits. more likely that the program/library would be able to keep cost recovery revenues in their own budget.

zero based
  type of budget model: each item in the budget must be justified at the time the budget is developed. assures active monitoring of the link between institutional activities and institutional goals.
  potential effect on library publishing programs: library publishing must be a goal of the institution. requires a great deal of staff effort each year to justify the program’s existence.

prior to determining the scope of service, or the financial structure of the publishing program, questions about the institution’s budget model that should be asked include:
• does the institution’s budget model prevent cost-recovery activities?
• if costs are recovered, and revenue is generated, does that money need to be given back to the university?
• are allocated or revenue-generated funds scooped at the end of the year (i.e., spend or return to the university)?
• can the library’s publishing unit support external publications? or, for political reasons, does there need to be a university affiliation with the publication?
• does the university recognize the benefits of library publishing? what case needs to be made that library publishers are necessary, effective disruptors to the current scholarly publishing environment?
• how can library publishing get an initial allocation? can it be done at the library level or the university level?
• how can library publishing tie its goals to those of the institution? does the university have a mission to support the public (e.g., land-grant mission)?
content creation as service

the financial framework in which libraries operate is important to explore before attempting to determine the aspects of a library publishing business plan. libraries at academic institutions are considered to be a common good. they allocate substantial resources to building collections through traditional collection development activities in order to provide content to users without charge. libraries typically have missions that aim to provide access to content to all patrons free from barriers. egalitarian, justice-oriented principles prevail throughout their value statements and are expressed thoroughly in the american library association’s core values [ ]. by their nature and their primary aim, libraries strive to get the information that is needed or wanted into a patron’s hands as quickly and barrier-free as possible regardless of who that person is or what they want to do with the information. academic libraries may recoup some of their costs, fine patrons for late, damaged, and lost books, or generate income on services such as outward facing research or document delivery services; however, there are no examples of those charges or services fully supporting the primary mission of collecting and delivering resources.

as quinn and innerd write in their analysis of the integration of their university press into the library: “ . . . the library operates under a budget-allocation model provided entirely by the university . . . the centrality of the library to the teaching and research mission of the university is generally accepted and understood. the library’s budget has traditionally been based on historical spending and the ability of the library to articulate its need for additional funding to innovate and meet student and faculty demands. the library’s goal is to spend wisely, efficiently, and as fully as possible within the budget provided” [ ].
this philosophy and approach apply to nearly all scholarly-communication-oriented services provided by academic libraries: data curation and management, digital scholarship support, institutional repository services, digital library development, research consultations, etc. this prevailing philosophy and service ethic of libraries can also be applied to scholarly publishing in libraries. when doing so, it informs the development and support of content dissemination in new and interesting ways that primarily support openness rather than cost recovery. commercial publishers are reliant on serving their shareholders, not content users. saarti and tuominen sum this up well when they write: “scholarly interests of sharing collide with commercial interests of generating profits” [ ]. in the instances where university presses and libraries have merged, their differing approaches to financial resources and business models have been a source of tension and illustrate how emerging library publishers differ from all other types of publishers.

because nearly all types of publishers in the past have been expected to recover the majority of their costs (along with limited institutional subsidies in the case of society publishers and university presses), it is challenging to consider a publishing program that does not assume cost recovery as a necessity. library publishing, however, when seen as an active library-supported collection development strategy, is presenting that challenging question to the scholarly community.
graham stone, in his thoughtful article about “new university presses” or nups, notes that “these new publishing ventures, often based in the library, have harnessed the changes in the digital landscape and the rise of the open access movement to allow them to publish scholarly works, such as journals and monographs.” he goes on to say that “furthermore, a business model based on scholarly communication rather than profitability, but working on a cost recovery model appears to be contradictory . . . the institution/funder-pays model is the more appropriate model” [ ]. conversations within libraries about philosophy and the need for cost recovery are essential in the development of library publishing business plans.

three case studies of library publishing programs

most library publishing programs do not make their business plan publicly available; however, some elements of a library publishing program’s business plan are evident through the information publicly displayed on its website and in the lpc’s annual directory. a brief examination of three diverse programs, outlined in table , illustrates similarities in principles, a variety in scope, a wide range of staffing, and differences in services offered at these institutions.

table . case studies of library publishing programs. descriptions of each library publishing program’s principles, scope, eligibility, staffing, financials, and services are based on the program’s website, listed with each entry.

institution # : university of minnesota libraries, publishing services
  website: https://www.lib.umn.edu/publishing/about
  organization: operates separately from the university of minnesota press, which is housed in a different administrative unit at the institution. administratively separate from the institutional repository, data repository, & digital humanities.
  principles: library involvement is critical to advancing transparent scholarly and academic publishing practices. umn libraries have a commitment to open access, scholar-led publishing where creators maintain copyrights.
  scope & eligibility: publishes journals, monographs, dynamic scholarly serials, and course materials. no apcs allowed. u of mn affiliates and scholarly societies may apply to publish content with the university libraries. proposals reviewed biannually.
  staffing & financials: director; publishing services librarian; development & technology staff; publishing services coordinator ( . fte total). external vendors used for production tasks. funding sources: library operating budget ( %); library materials budget ( %). other financial information not available.
  development & production services: basic services (hosting, preservation, etc.) offered without charge to affiliates. hosting charges apply to society-owned publications. production (e.g., copy editing, typesetting, graphic design, etc.) and development charges apply to all publications.
  public business plan: not available.

institution # : university of michigan libraries, michigan publishing services
  website: https://www.publishing.umich.edu/services/
  organization: operates within the same office as the university of michigan press. the press reports up administratively to the library and functions as a traditional university press. also administered in the same office as the institutional repository.
  principles: mi publishing services staff are experts in scholarly publishing and “help increase the visibility, reach, and impact of scholarship.” emphasis on open access formats that advocate for author rights through new digital publishing models to ensure wider knowledge sharing.
  scope & eligibility: publishes books, journals, conference proceedings, digital projects, and course materials in print and electronic forms. focus: support for university of michigan affiliates.
  staffing & financials: publishing services director; publishing services librarian; publishing services coordinators; community manager ( fte total). university press and external vendors are used when needed. funding sources: library operating budget ( %); sales and hosting revenue ( %); charge backs ( %). other financial information not available.
  development & production services: full suite of services offered, including hosting, editing, typesetting, design, formatting (e.g., pdf, epub, ocr, etc.), digitization, web design, preservation, and print on demand. charges apply to most services.
  public business plan: not available.

institution # : university of pittsburgh library system e-journal publishing
  website: https://www.library.pitt.edu/e-journals
  organization: operates separately from the university of pittsburgh press, which is housed in a different administrative unit at the institution.
  principles: committed to helping research communities share knowledge and ideas through open access electronic publishing. they subsidize the costs of electronic publishing so that their “partners can focus on editorial content and scholarly collaboration”.
  scope & eligibility: publishes open access ejournals. apcs allowed but no journals currently charge them. focus: publications that have rigorous peer review, an internationally recognized editorial board, and a robust staff, and that publish selectively from an open call for papers. no u of pittsburgh affiliation required.
  staffing & financials: director, digital repository manager, electronic publications manager, library specialists ( fte). funding sources: library operating budget ( %); charge backs ( %). other financial information not available.
  development & production services: design services, assignment of standard identifiers, social media connections, analytics, consultations on editorial and management, indexing, archiving and preservation.
  public business plan: not available.
creating a business plan for library publishing

there has not yet been analysis or work done to define business plans for library publishing programs. this article uses the definition of business plan developed by collier in , and used in his edited volume, business planning for digital libraries: international approaches:

business planning for digital libraries is here defined as the process by which the business aims, products and services of the eventual system are specified, together with how the digital library service will contribute to the overall business and mission of the host organizations. these provide the context and rationale, which is then combined with normal business plan elements such as technical solution, investment, income, expenditure, projected benefits or returns, marketing, risk analysis, management, and governance [ ].

the anatomy of a library publishing business plan closely mirrors a template for a traditional, stand-alone business. however, because a library publishing program is nested within a larger organization, the financial section varies based on a university’s budget model (discussed in section ) and their library’s approach to funding these services. the authors of this paper recommend that libraries first identify the university’s current budget model prior to writing a library publishing business plan. the basic template for a library publishing business plan includes the following sections:
• principles of service
• scope of service
• staffing and governance
• development & production
• financials
• measures of success

it is important to note that, if the institutional context calls for it, additional sections can be added to the business plan to strengthen alignment. this is especially true for libraries that are venturing into library publishing on an experimental basis and for libraries that are in the process of advocating for the formalization of a publishing program.
useful additional sections for libraries in these positions include a pest analysis (political, economic, social, technological) and a swot analysis (strengths, weaknesses, opportunities, threats). these sections can further illustrate the rationale behind the development of a publishing program [ ].

the template used in this paper does not include a section on technology. publishing technologies, specifically open source publishing technologies, are constantly growing in number and functionality. the authors highly recommend conducting a review of available publishing platforms. the library publishing coalition offers members and non-members a number of resources on available technologies (https://librarypublishing.org/).

the finalized business plan should be inclusive and detailed enough that administrators and campus partners can reference the plan and understand the functions and goals of the publishing program. the business plan can also act as a reference when questions arise from clients about the viability and sustainability of a new service. the ability to communicate the structure and financial commitments of the publishing program is essential to conveying stability, knowledge of process, and boundaries. with the exception of the principles of service, it is expected that the business plan will need additional updates as staffing changes, library priorities shift, and the program matures and grows.

principles of service

a library publishing business plan is a roadmap for the service. it explains to internal and external partners the details of how the program will travel from point a to point b. principles of service, in turn, explain to partners why the program is traveling at all. this is the intrinsic lead-in to a library publishing business plan. principles can touch on themes mentioned earlier in this article, including transparency, openness, and institutional support.
as a department or service offering of the library, library publishing programs inherit established mission statements, goals, and other strategic planning objectives from the library, and in turn the university. although these objectives may convey the spirit of the service, a library publishing program will benefit from principles of service that are specific to the program. developing and adopting principles of service will clearly define a library publishing program, communicate the program’s purpose, and create a shared expectation of goals and outcomes. unlike an annual or strategic plan, principles of service should remain true despite the often unpredictable ebbs and flows of passing years. principles of service can be considered the “core” of the program and should not depend on a specific project or specific person. principles should be clear, accessible, and easy to share with clients and partners.

libraries with suites of scholarly communication services can leverage principles of service to help distinguish publishing services from other services offered within the organization. drafting principles with library colleagues, including perspectives from digital humanities, copyright, and administration, allows for language that works in harmony with other services.

scope of service

one of the most challenging sections of the business plan, and the section most likely to change as the service is updated, is the scope of service. this section should address specific services that the library publishing program will provide; it could also highlight related services that the program will not provide. (for example, the library publishing program will not manage the inventory of print publications.) additionally, this section will likely have the most dependencies with other sections. for established programs, this section will likely be a formal write-up of currently provided services within the program.
for newly developed programs, this section should include the services that the program is ready to offer, and exclude services that the program hopes to provide in the future. generally, this section should address the following questions:
• what type of publications will be published?
• which authors/editors are eligible?
• what level of service will be provided to each publication?

each of these questions requires a deeper consideration based on selected technologies, availability of staffing/personnel, and cost. the most common types of publications published by library publishing programs are journals, monographs, and textbooks. however, as digital publishing tools grow, and the definition of scholarship broadens, programs may become publishers of increasingly difficult to categorize modes of scholarship. no matter the breadth of publication types, libraries should consider:
• what technologies will be needed to host and produce each type of publication?
• are there other library or campus programs that currently serve the needs of the identified publication type?
• will publishing staff be available to assist the editors of publications on an on-going basis (serials) or for only a limited time (monograph)?
• what is the average cost associated with each type of publication? are these one-time costs or on-going?

identifying the type of eligible clients for the publishing program will help the library build a customer profile for a marketing base. even though the program may not be “selling” the final outputs, identifying who the service is for will help communicate the program’s principles of service to the appropriate audience. in specifying the program’s eligible clients, libraries should further consider:
• does the library’s mission focus on serving affiliated users?
• does the program have a discipline specialty or focus?
• can the program’s selected technology work with affiliates and non-affiliates? or are there ezproxy or shibboleth requirements?
• can the library and/or university budget cover expenses of non-affiliates?
• will the program prioritize the works of different groups? (e.g., faculty, graduate students, undergraduates)

cutting across all of the above points is the question of what level of service the program will provide. this may be one of the harder questions to answer for a program that is just developing. however, once one publication is published, a program can run a project post-mortem to help identify how the skill sets of the individuals staffing the publishing program were leveraged and how much time went into the publication. similarly, this question can also be answered throughout the initial publishing technology review: what processes can be automated using the available technology? (e.g., assigning dois, creating article metadata, password resets for platform users.) generally, all of the following points should be considered:
• what can the technology for each type of publication automate?
• do all publication types require the same amount of time and attention from the program staff?
• what will the editors of each publication be responsible for? what will the publisher be responsible for?
• how will clients contact the publisher?
• how will customer service be approached in relation to existing library services?

staffing

staffing within library publishing programs varies greatly. the library publishing coalition directory includes listings for programs with . % of a full-time professional staff, all the way up to full-time professional staff [ ]. as noted in the previous section, the availability of staff directly impacts the services that a program can provide. a library can anticipate that this section of the business plan is inseparable from the program’s scope of services. libraries drafting this section of the business plan should also consider where the publishing program is organizationally situated within the library.
since publishing may be a cross-departmental or cross-divisional effort, it is important to clearly describe where the program sits within the organization. including this description for brand new programs will help colleagues throughout the library understand the reporting structure of the program.

publishing programs need to define roles and responsibilities for each element identified in the scope of services. programs that depend on the labor and/or time of library staff members in other units or departments can formalize these relationships in the business plan in order to solidify cross-library buy-in. although each element in the scope of services should be addressed in this section, the business plan is not an internal workflow document, so responsibilities may be identified at a general level and individual staff members may be identified by position, rather than name. these responsibilities will likely include:
• technology development and support
• marketing of services and recruitment of publications
• production and development of publications, including additional processes identified in the development & production section
• assessment, discovery, and promotion of individual publications
• long-term strategic planning and goal setting (at publishing program level)

in addition to staffing, library publishing programs may find it beneficial to implement a governance structure. unlike the day-to-day operations of the program, a governance structure can provide recommendations to enhance the quality and future viability of the program. building in the development of a governance structure can be a way to incorporate disciplinary faculty and other university stakeholders into the publishing program.

development & production

a variety of policies are required in order to make a library publishing program successful and sustainable.
policies guide decision making and can be referred to by administration or clients when questions arise. the need for policies is best summarized in the handbook of journal publishing, as policies address “what is to be published, how and why” [ ]. although an individual library publishing program may have policies unique to the program’s goals and needs, there are a handful of policies that are essential to any publishing program.

accepting publications

whether a publishing program anticipates publishing or publications a year, the program needs to consider how publications will be received by the library publisher. many publishers use a call for proposals (cfp) to solicit publications. using a cfp, even if the respondents are few, enables publishers to advertise their service, while giving guidelines as to what will be accepted. even for library publishing programs that are experimental, and willing to publish content with limited traditional publishing options, each program will likely have some limitations, especially involving staffing and technology. library publishing programs that are just getting off the ground, and unsure of limitations, should consider a cfp with open-ended questions; this will enable submitters to describe their project without limiting answers to checkboxes.

once proposals are submitted, each publishing program will need to determine how proposals are accepted or rejected. again, the library publisher will want to consider which proposals are actually doable based on staffing and technology. there will likely be publications and projects that are just not possible given the program’s available support. for proposals that are viable, each program will need to determine who gets to say “yes” and “no” to publications. this can be done by the staff working in the program, by a committee established by the program, or by library administration.
after a proposal is accepted, the library publishing program will need to develop an moa (memorandum of agreement) or mou (memorandum of understanding) for each publication. an moa/mou will clearly lay out the expectations of each party and can include any necessary legal agreements or policies that are relevant to the relationship between publisher and publication. libraries not familiar with moas/mous should consult the institution’s office of general counsel or contract office.

rights

library publishers need clear statements about rights related to each publication. policies may vary across individual publications, but the publishing program should create policies that address the following:
• who does the copyright of a publication belong to?
• who does the title of the journal belong to? (could an editorial board member find a new publisher and move the journal/book series/conference proceeding?)
• how can the content be used? (this question can be addressed by the addition of a creative commons license.)
• how can either party end the business relationship between publisher and publication?

individual publications, especially those with multiple authors, will need to create publication-specific policies to ensure that content within the publication is following copyright and/or licensing policies. as a publisher, it is important to assist editors or editorial boards that are new, or those that have questions related to rights. set up formal channels of communication and encourage publication editors to reach out for support.

privacy

user privacy statements need to be included on each digital publication or digital publication access point. chances are that the publishing program’s selected software, especially if using a hosted solution, will include a privacy policy. make sure that staff working on publications understand the privacy policies and are able to communicate the policies to users of the platform.
for publications that require registration for readers, authors, or reviewers, make sure that any default privacy statements are correct and that all users are prompted to read the privacy/user agreement before entering any information into the system.

distribution & marketing policies

because the majority of library publishers publish content that is openly accessible, publishing programs will need marketing and distribution tactics that are less common among traditional publishers and university presses. setting distribution and marketing policies will clarify expectations between authors/editors and the publisher. if the publishing program sells print copies of books, will there be a markup fee? can the author, as the copyright holder, set up their own digital storefront? even in the world of open access publishing there is a need for policies related to distribution.

a library publisher with the staff time and expertise may want to be the party responsible for applying to databases and indexes for each publication. additionally, the publisher can take the lead on advertising or marketing publications. this may be something that the author/editor does not think of, especially if the publication is available online for free; however, the publisher will want to see a publication attract as many readers as it can. it is never too soon to work with editors/authors to develop a strategy for distribution and marketing; having a policy in place when a potential publication reaches the library publishing program will make any effort much more successful.

preservation policies

preservation of library published content continues to be an area under investigation. in , the library publishing coalition noted that programs are “making slow but thoughtful progress on digital preservation” [ ].
although libraries continue to improve policies around the preservation of library published content, there are a number of approaches that can be taken to ensure that published works are preserved. public knowledge project (pkp) and bepress, common library publishing platforms, allow users to set up accounts through the global clockss program (controlled lots of copies keeps stuff safe, from stanford university). additionally, pkp offers a private preservation network available to platform users who are unable to join the global clockss program. portico is also an option for library publishers, and is the most common journal and ebook preservation tool used by libraries to preserve purchased content. portico requires membership with fees based on journal or ebook revenue [ ].

regardless of whether or not a library publishing program is connected with preservation tools, a library publishing program should develop a clear policy that can address author/editor questions about both short- and long-term preservation. the policy should also address what content is to be preserved. additionally, programs will want to consider:
• will the publishing program preserve all publications?
• what about publications that cease or move to another publisher?
• will a journal’s webpages be preserved, or just pdfs?
• will production files be preserved, or just the version of record?

preservation will likely be a policy that requires the expertise of librarians beyond the publishing program. it is also a policy that will need updating as technologies and best practices change. editors and authors want a publisher that will look out for published content for the long term; a successful preservation policy should address this.

financials

unlike other scholarly communication services, publishing has well-documented, though debated, costs associated with the service [ ].
libraries are especially sensitive to costs set by publishers; a library acting as publisher therefore has the opportunity to be especially transparent and clear about the costs associated with publishing. the financials section of the business plan will need to be developed in close consultation with library administration; it is likely that the library has pre-developed language and/or templates for communicating costs. the basic financial structure of the program will likely be addressed in earlier sections of the business plan; however, the financials section should address the following questions:
• how will the service fit into the library’s budget model?
• how can/will the service leverage the university’s budget model?
  o will staffing and core technologies be paid for by the library’s budget or covered by publishing revenue?
  o will the service charge fees for any/all services?
  o how will service rates be calculated?
  o what expenses will potential revenue cover?
• which expenditures are flat versus usage-based?
• which pre-existing memberships or technologies will the program use?
• how will costs, whether charged directly to clients or covered by the library, be communicated to clients?
like earlier sections, the financials section requires that libraries estimate growth of the program in order to calculate costs. in addition to staffing and core technologies (digital publishing platforms), libraries need to consider expenses that fluctuate based on volume. some of these costs may be:
• identifiers (dois, issns, isbns)
• graphic design for individual publications
• material for marketing and promotion
• licenses for production tools (indesign, ithenticate, overleaf)
• memberships for preservation and publishing best practices (portico, cope, etc.)
additionally, each individual title should also have a budget assigned to it.
the program’s approach to publication-level planning should be included in the financials section; this can be done by including a template or spreadsheet that is used to structure the relationship between author/editor and publisher. being able to express to authors what resources are needed to launch and maintain their publication helps communicate expectations and outlines where they need to partner to provide additional resources for elements or features that are not currently supported by the service.

measures of success

given the often experimental nature of library publishing, and the lack of longitudinal studies on library publishing, determining measures of success for a library publishing program can be a challenge. measures of success will be determined based on each publishing program’s principles of service and the parent institution’s mission and vision. to do this, publishing programs may find it useful to tie measures of success to individual publications and projects. measures of success for individual publications, especially those available free of cost and therefore not measured by revenue, frequently fall into three general areas:
• sustainability: is the publication able to recruit reviewers, editors, and authors? is the publication meeting publication-specific goals?
• scalability: is the publication able to respond to increased readership? are editorial workflows keeping up with an increase in content?
• visibility: is the publication attracting readership? is the publication being cited? when eligible, is the publication included in disciplinary-appropriate indexes?
however, the diversity of library publishing portfolios means that measures of success do not always work when tied to specific publications, especially books and other non-serials, whose content is not likely to grow over time. measures of success for the overall publishing program “ . . .
must also be able to demonstrate that they are fulfilling the traditional roles of scholarly publishers” [ ]. some library publishers have principles of service that may vary drastically from those of “traditional publishers,” making it important for a successful publishing program to also meet the needs of its clients. with that in mind, the same measures of success used to evaluate individual publications can be used to measure the success of the overall publishing program:
• sustainability: is selected technology still meeting publication needs? are publishing staff able to maintain developed workflows?
• scalability: is there a growth in number of publications? are additional services being added as requested?
• visibility: is there campus awareness of the publishing program?
additionally, staff in library publishing should be aware of other measures of success that are used across library services. if a publishing program has services that include outreach and education, consider meeting with colleagues in library information literacy units to determine appropriate evaluation metrics for publishing services that extend beyond publications. measures of success is another section of a library publishing business plan that can benefit greatly from vertical alignment with a library’s related services and units.

conclusions

in response to the variety of issues in scholarly communication, the development of library publishing programs is one way libraries have become active participants in the growing open access publishing landscape. business plans for library services, especially for scholarly communication services, are not yet commonplace. however, by creating and adopting a business plan for library publishing programs, libraries can formalize a relatively new service within the unique structures of academic libraries.
a library publishing business plan will provide a clear understanding of the program’s goals and services, and will provide a path for growth and assessment in the long and short term. its development offers the opportunity for the library’s leadership and staff to discuss and create framing principles, which provide a foundation for communicating the goals and purpose of the service. the remaining elements of a robust business plan provide a structure for a program’s operations and clear communication.

author contributions: conceptualization, e.m. and k.m.; introduction, e.m. and k.m.; background and development, k.m.; anatomy, e.m. and k.m.; conclusion, k.m.; writing-original draft preparation, k.m. and e.m.
funding: this research received no external funding.
acknowledgments: the authors thank shaan hamilton for his feedback on section : institutional budget models and their impacts.
conflicts of interest: the authors declare no conflict of interest.

references

gillman, i. the evolution of scholarly communication programs. in library scholarly communication programs: legal and ethical considerations, st ed.; chandros publishing: oxford, uk, ; pp. – , isbn .
library publishing coalition directory committee (ed.) library publishing directory ; library publishing coalition: atlanta, ga, usa, .
taylor, l.n.; keith, b.w.; dinsmore, c.; morris-babb, m. libraries, presses, and publishing, spec kit ; association of research libraries: washington, dc, usa, . available online: https://publications.arl.org/libraries-presses-publishing-spec-kit- / (accessed on august ).
hahn, k.l. research library publishing services; association of research libraries: washington, dc, usa, .
marcum, d.; schoenfeld, r.; thomas, s. office of scholarly communication scope, organizational placement, and planning in ten research libraries; ithaka s+r: new york, ny, usa, .
collister, l.b.; deliyannides, t.s.; dyas-correia, s. the library as publisher. ser. libr. , , – . [crossref]
jubb, m.; plume, a.; oeben, s.; brammer, l.; johnson, r.; bütün, c.; pinfield, s. monitoring the transition to open access: december ; universities uk: london, uk, ; p. . available online: https://www.universitiesuk.ac.uk/policy-and-analysis/reports/pages/monitoring-transition-open-access- .aspx (accessed on august ).
ghamandi, d.s. liberation through cooperation: how library publishing can save scholarly journals from neoliberalism. j. librariansh. sch. commun. , . [crossref]
watkinson, c. three challenges of pubrarianship. against grain , , – . [crossref]
mullins, j.l.; murray-rust, c.; ogburn, j.l.; crow, r.; ivins, o.; mower, a.; nesdill, d.; newton, m.p.; speer, j.; watkinson, c. library publishing services: strategies for success: final research report; sparc: washington, dc, usa, .
collier, m. business planning for digital libraries. in business planning for digital libraries: international approaches; collier, m., ed.; leuven university press: leuven, belgium, ; pp. – , isbn .
barr, m.; mcclellan, g. budgets and financial management in higher education margaret, st ed.; john wiley & sons: san francisco, ca, usa, ; pp. – .
ala council. core values of librarianship. available online: http://www.ala.org/advocacy/intfreedom/corevalues (accessed on july ).
quinn, l.; innerd, c. the evolution(s) of wilfrid laurier university press: toward library-university press integration. j. sch. publ. , , – . [crossref]
mcintyre, g.; chan, j.; gross, j. library as scholarly publishing partner: keys to success. j. libr. sch. commun. , , – . [crossref]
stone, g. sustaining the growth of library scholarly publishing in a new university press. inf. serv. use , , – . [crossref]
collier, m. the business aims of eight national libraries in digital library co-operation: a study carried out for the business plan of the european library (tel) project. j. doc. , , – . [crossref]
saarti, j.; tuominen, k. from paper-based towards post-digital scholarly publishing: an analysis of an ideological dilemma and its consequences. inf. res. , , – .
koerbin, p. issues in business planning for archival collections of web materials. in business planning for digital libraries: international approaches; collier, m., ed.; leuven university press: leuven, belgium, ; pp. – , isbn .
managing journals. the handbook of journal publishing; morris, s., barnas, e., lafrenier, d., reich, m., eds.; cambridge university press: cambridge, uk, ; pp. – , isbn .
van noorden, r. the true cost of science publishing. science , , – .

© by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /).
modeling digital humanities collections as research objects

katrina fenlon
kfenlon@umd.edu
college of information studies, university of maryland, college park
college park, maryland

abstract

advancing digital libraries to increase the sustainability and usefulness of digital scholarship depends on identifying and developing data models capable of representing increasingly complex scholarly products. this paper considers the potential for an emergent model of scientific communication, the research objects data model, to accommodate the complexities of digital humanities collections. digital humanities collections aggregate and enrich diverse sources of evidence and context, serving simultaneously as "publications" and dynamic, interactive platforms for research. the research objects model is an alternative to traditional formats of publication, facilitating aggregation and description of all of the inputs and outputs of a research process, ranging from datasets to papers to executable code. this model increasingly underpins research infrastructures in some scientific domains, yet its efficacy for representing humanities scholarship, and for undergirding humanities cyberinfrastructure, remains largely untested. this study offers a qualitative content analysis of digital humanities collections relying on a content/context analytical framework for characterizing collection components and their interrelationships.
this study then maps those components and relationships into a research objects model to identify the model’s strengths and limitations for representing diverse digital humanities scholarship.

ccs concepts
• information systems → data structures.

keywords
data models, digital humanities, digital libraries, research objects

acm reference format:
katrina fenlon. . modeling digital humanities collections as research objects. in proceedings of acm conference (jcdl ’ ). acm, new york, ny, usa, pages. https://doi.org/ . /nnnnnnn.nnnnnnn

permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. copyrights for components of this work owned by others than acm must be honored. abstracting with credit is permitted. to copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. request permissions from permissions@acm.org.
jcdl ’ , june , urbana-champaign, il, usa
© association for computing machinery.
acm isbn -x-xxxx-xxxx-x/yy/mm...$ .

introduction

across disciplines, the growth and evolution of digital scholarship has overwhelmed traditional systems for the representation and communication of research. digital scholarship in the humanities produces resources that range widely beyond our traditional concept of publication, resources that incorporate not only narratives and rich media, but also datasets and linked data, interactive and functional components, and objects and processes that are physically and logically dispersed as well as dynamic and evolving over time.
despite the rise of digital scholarship, most existing research infrastructures lack support for the creation, management, sharing, maintenance, and preservation of complex, networked digital objects.

this paper considers the potential for emergent models of scientific communication and publication to accommodate the complexities of digital humanities scholarship, and therefore to underpin shared research infrastructure in the humanities. in particular, this study analyzes the suitability of the research objects model, one among several emergent models for representing and describing complex digital objects that interweave data, workflows, and supplementary and contextual information, models for logically bundling the diverse inputs and outputs of research processes [ , ]. research objects comprise metadata frameworks with associated packaging standards. the model has gained uptake in some disciplines and witnessed concomitant growth in related tools, management systems, and supportive communities [ , , ], which indicate its usefulness and contribute to its sustainability.

this study offers a starting point for answering the question: to what extent may existing (scientific) data models for representing research objects accommodate dh research products and processes? this paper focuses on a common form in dh scholarship: digital collections (often called digital archives and thematic research collections), which are scholar-built aggregations of digital sources of evidence about a topic [ , , ]. this study provides selected results of a qualitative content analysis of dh collections, and offers a content/context analytical framework to characterize collection components and their interrelationships. this study then retrospectively maps those components and their relationships into the research objects model in order to identify the strengths and limitations of that model for representing dh scholarship.
digital scholarship and sustainability

in the past few decades, research and scholarship have witnessed sweeping efforts to rethink existing formats for knowledge transfer and scholarly publication, and to develop technologies that support the publication and interlinking of data, software, workflows, and narratives, all as first-class research objects [ ]. in the humanities, scholarship takes an increasing variety of forms, ranging from digital scholarly editions (e.g., the walt whitman archive) to curated collections of content (e.g., colored conventions), from layered visualizations (e.g., the torn apart/separados project) to models and simulations (e.g., the mayaarch3d project). the outputs of dh research are increasingly media-rich, data-centric, interactive, dynamic, interlinked, and subject to indefinite evolution.

http://www.researchobject.org/ https://whitmanarchive.org/ http://coloredconventions.org/

as infrastructures for sustaining digital research struggle to keep pace with the advance of scholarly communication technologies, dh confronts sustainability challenges [ , , ]. digital libraries (including data repositories, aggregations of cultural records and artifacts, and certain publication platforms) are important components of research infrastructure in the humanities. while the capacity of digital libraries for representing complex digital objects and workflows continues to advance [ , , ], there remains an urgent need for data models and standards to represent and describe increasingly complex scholarly products [ , ]. digital humanities (dh) collections, including those analyzed in this paper, often resemble cultural heritage digital libraries, broadly conceived.
but dh collections are differentiated in several ways that make sustainability uniquely problematic. dh collections are often developed and maintained outside of the walls and purview of dedicated memory institutions. they tend to be centered in scholarly communities; scholars create them and maintain them for their own purposes, with fluctuating resources and support. because they function simultaneously as scholarly "publications" and as platforms and hubs for ongoing research and communication among scholarly communities, and because they tend to be funded on short cycles, they often rely on bespoke infrastructures and take unique forms to serve specific research purposes. these factors combine to make dh collections uniquely difficult to sustain over time, and suggest the urgent need for shared infrastructure that does not limit the diversity of digital scholarship.

research objects in the humanities

the basic concept of the research object is simple. conceptually, research objects are composed of two main parts: aggregated resources (listed in a manifest with minimal metadata, and packaged into the research object using one of several packaging formats), and annotations (used to express metadata about, provenance of, and relationships among aggregated and external resources). the standard model specifies how relationships are declared, relying on extant linked data standards, primarily on oai-ore, and w3c standards including the annotation data model and prov. the research object may be packaged and serialized in different ways, but always contains a manifest of metadata about the research object and its contents represented in json-ld. there are other models closely related to the research objects model, including models for enhanced publications [ ], executable papers, and scientific publication packages. research objects have seen growing application in several domains, in various commercial and open-source implementations [ , , , , ].
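the two-part structure described above (aggregated resources listed in a manifest, plus annotations about them) can be sketched in a few lines. this is a minimal illustration, not the normative manifest format: the key names (aggregates, annotations, about, content), the context url, and every resource path below are assumptions chosen for the example and should be checked against the research object specification before use.

```python
import json

# Assumed context URL for research object manifests; verify against the spec.
RO_CONTEXT = "https://w3id.org/bundle/context"


def build_manifest(aggregated, annotations):
    """Return a manifest dict listing aggregated resources and annotations.

    aggregated:  list of (uri, mediatype) pairs
    annotations: list of (about_uri, content_uri) pairs
    """
    return {
        "@context": [RO_CONTEXT],
        "id": "/",
        "aggregates": [
            {"uri": uri, "mediatype": mediatype} for uri, mediatype in aggregated
        ],
        "annotations": [
            {"about": about, "content": content} for about, content in annotations
        ],
    }


# Hypothetical resources, for illustration only.
manifest = build_manifest(
    aggregated=[
        ("/data/page-001.xml", "application/tei+xml"),
        ("/data/page-001.jpg", "image/jpeg"),
    ],
    annotations=[("/data/page-001.xml", "/annotations/page-001-provenance.ttl")],
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

the manifest stays a plain dictionary so that it serializes directly to json; richer provenance statements would live in the annotation files that the manifest points to.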
in the humanities, research objects and closely related models have been applied to repository and data-sharing architectures [ , ], digital preservation and archive serialization [ , ], semantic publishing [ ], and digital libraries for musicology [ ]. these applications are compelling, and suggest the need for and timeliness of a systematic investigation of whether, or to what extent, the model could serve to represent a range of dh collections as whole, cohesive objects, and therefore have potential to underpin a widely adoptable, sustainable dh infrastructure with cross-disciplinary investment and impact.

data modeling is a pervasive scholarly practice in dh [ ]. like research objects, dh collections may be conceptualized and modeled as assemblages of resources with semantic interconnections, designed to support research objectives [ , ]. this study considers to what extent that resemblance bears out in the application of the research objects data model to complete representation of collections.

http://xpmethod.plaintext.in/torn-apart/volume/ /index http://www.mayaarch3d.org/language/en/sample-page/ https://www.openarchives.org/ore/ https://www.w3.org/tr/annotation-model/ https://www.w3.org/tr/prov-overview/

methods

the analysis presented in this paper builds upon an ongoing, multimodal study of digital collections [ , ]. the study seeks to thoroughly characterize dh collections as a scholarly genre using three approaches: (1) a survey and typological analysis of dh collections (n= to date); (2) a qualitative content analysis of exemplary collections; and (3) interviews with researchers and practitioners who build digital collections, to identify challenges for libraries and other institutions in supporting and sustaining dh scholarship. the typological analysis identified three primary types, useful for describing dh collections in terms of their purposes and the completeness toward which they are developed; those types are briefly described in table .
complete results of the first phase of the study and a detailed account of the interrelated methods are given in [ ].

qualitative content analysis

the current paper extends the qualitative content analysis to address the question: what components of these collections must be modeled in order to logically represent dh collections as research objects? in other words, what are the main products of the collection (its discrete, publishable outcomes) and how are they related to one another and to other resources? the initial phase of content analysis identified close to forty distinct aspects of the content, design, and contexts of digital collections. table gives an overview of the whole content analysis protocol and each aspect of the sample collections that has been subject to analysis and characterization. the two most immediately relevant aspects of this protocol to the analysis at hand are items and interrelatedness. these aspects concern (1) what are the items in the collection, and (2) how are they interrelated with one another, with contextual information, with external resources, etc.? a closer analysis of items and interrelatedness in each of our sample collections identified discrete components of collections along with the relationships, both technical and abstract, that obtain between components. this study uses the terms "item" and "component" loosely, not only to indicate a collection’s main conceptual units of gathering (such as books or artifacts), but also other parts of collections that substantially contribute to a collection’s intended contribution to the scholarly and cultural records.
the analysis focuses on discrete logical pieces that may be understood to have some kind of mereological, membership, or isgatheredinto relationship to the collection as a whole [ ], and which contribute to its scholarly purpose according to the collection’s self-described objectives.

table : collection types
type | purpose
definitive source | provide access to high-quality, authoritative, or otherwise definitive primary sources, (re-)assembling and shaping the affordances of the cultural record on the web
exemplar-context | interrelate and (re-)contextualize diverse primary sources, building rich context and connection within and around exemplary sources
evidential platform | aggregate, deconstruct, and remodel sources for new uses, leveraging evidence into more flexible platforms for analysis and interpretation

table : content analysis protocol overview
cluster | categories of analysis
context | theme; purposes; impact; creators; audience; documentation; provenance; related collections; related projects and publications; review; funding; developmental stage; host; rights; sustainability and preservation plans; method of collection
content | items; interrelatedness; diversity; size; narrativity; quality; language; completeness; density; spatial coverage; temporal coverage
design | data models; navigation; infrastructural components; interface design; interactivity; interoperability; openness; identification and citation; modes of access and acquisition; accessibility; flexibility
content/context component framework

to refine the analysis of collections, this study developed and applied an analytical framework for characterizing components of collections more precisely. this characterization leverages a few different properties of components (including whether they are primary or secondary sources, and whether they are original to a collection) with the goal of identifying different ways in which components contribute to collections as wholes and, in turn, to the wider scholarly record. figure illustrates the "content/context" analytical framework used to focus the content analysis of collections in anticipation of applying the research objects model.

the framework is intended to refine analysis of how collections are constituted, and how their constitution determines the ways in which they contribute to scholarship. using this framework, each component is first categorized as either content or context. "content" includes components that are discrete, independent sources of evidence for scholarship. "context" includes components that play a supportive, interpretive, representational, or functional role that is essential or utilitarian for the use and understanding of content. the reason for differentiating these categories conceptually, despite the difficulty of teasing them apart in practice, is to refine our understanding of collection contributions. the next question put to components identified as content is: are they primary or secondary sources, or would it be more accurate to say they fall somewhere in between? for both content and context components, a third question is: is the component original, or has it been previously published or published externally to the collection? the final question is, how are both context and content components interrelated? these questions are intended to challenge our intuitions about aspects of collections that are commonly understood to be peripheral to collections.
figure : content/context analytical framework

content components in these collections include primary sources, secondary sources, and derived sources. primary sources are well understood to be representations of original documents or first-hand evidence, while secondary sources offer substantial interpretation of primary sources. however, some resources seem to fall between these two categories, such as datasets extracted from primary sources. this study considers such sources to be derived. derived sources are generated "directly" from primary sources through some interpretive intervention, where interpretation is manifested in the mode or method of derivation, such as an algorithm or encoding scheme designed to foreground or extract specific pieces of data from the sources. i posit that derived sources are more closely related to primary sources than other secondary sources because they are intended as alternative (usually computational) representations of primary sources.

content components further divide into categories of original versus previously published/external. "original" implies that a source is the first (digital) source of its kind, or has no available counterpart. "previously published" implies that a source or comparable version has been published or digitized elsewhere, or is a reference component that exists externally to a collection.

contextual components in these collections include elements that are essential or important to the interpretation, use, management, curation, and preservation of collections, but which do not constitute the main content. for example, contextual components include documentation and data models such as markup schemas or ontologies. finally, many contextual components are functional, dynamic, and interactive features or affordances. context components may also be original, previously published or external, or somewhere in between.
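the questions the framework puts to each component can be written down as a small data structure. the following is an illustrative sketch only: the class and field names are hypothetical, the two-valued origin is a simplification (the text allows for components that fall somewhere in between), and the origin values assigned to the two example components are assumptions rather than findings of the study.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    CONTENT = "content"   # discrete, independent source of evidence
    CONTEXT = "context"   # supportive, interpretive, or functional role


class SourceKind(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    DERIVED = "derived"   # generated from primary sources by an interpretive method


class Origin(Enum):
    ORIGINAL = "original"
    PREVIOUSLY_PUBLISHED = "previously published/external"


@dataclass(frozen=True)
class Component:
    name: str
    role: Role
    origin: Origin
    source_kind: Optional[SourceKind] = None  # asked of content components only

    def __post_init__(self):
        # The primary/secondary/derived question is put only to content.
        if self.role is Role.CONTENT and self.source_kind is None:
            raise ValueError("content components need a source kind")
        if self.role is Role.CONTEXT and self.source_kind is not None:
            raise ValueError("source kind applies to content components only")


# Example classifications; the origin values are assumed for illustration.
dataset = Component("dataset extracted from primary sources",
                    Role.CONTENT, Origin.ORIGINAL, SourceKind.DERIVED)
schema = Component("markup schema", Role.CONTEXT, Origin.PREVIOUSLY_PUBLISHED)
print(dataset.role.value, dataset.source_kind.value, schema.role.value)
```

the final framework question, how components are interrelated, is not captured here; it would need a separate structure linking components to one another.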
collections

the following three collections were selected for close qualitative content analysis: the shelley-godwin archive, the vault at pfaff’s, and o say can you see: early washington d.c. law & family. these collections were selected to represent three distinctive types of collection, summarized in table [ ], which were identified in prior typological analysis.

the shelley-godwin archive (shelley-godwin) represents a definitive source collection, a digital library focused on the representation of definitive primary sources, such as scholarly editions and authoritative archival sources intended for close study by scholars in a domain. shelley-godwin provides digitized, transcribed manuscripts from the shelley-godwin family of th- and th-century writers, including percy bysshe shelley, mary wollstonecraft shelley, william godwin, and mary wollstonecraft. the collection aims to be a definitive digital source for close study of the shelley-godwin manuscripts, including major literary works such as frankenstein (m. w. shelley) and prometheus unbound (p. b. shelley). manuscripts are supplemented with biographical, bibliographical, and other secondary sources.

the vault at pfaff’s (vault) represents an exemplar-context collection, which aims to present exemplary (rather than definitive) sources on a subject, and to interrelate them with interpretive, contextual materials. vault gathers primary and secondary sources about the historically significant bohemians of antebellum new york, u.s.a., particularly the social network revolving around the historical bar pfaff’s, which became an epicenter for a literary movement. the site provides a searchable annotated bibliography of more than , texts, linking to full-text internal and external sources.
critically, while some of the primary sources are hosted by vault, many are instead references (with some linked to external sites), because the main content of this collection is the records of primary sources and the rich, interwoven contextual information with which records are augmented. the site also provides a map, timelines, biographies, and historical essays. unlike shelley-godwin, vault does not aim to provide an original or definitive set of primary sources for close study, but rather a massive set of interrelated sources, social entities, and contextual information to support the discovery of new connections.

o say can you see: early washington, d.c., law and family (o say) represents an evidential platform, a digital library focused on gathering sources to provide evidence for a specific interpretive or analytical goal [ ]. o say gathered, digitized, and analyzed freedom suits filed in washington, d.c., and surrounding areas between and , in order to explore family, legal, and social networks. like shelley-godwin, o say provides carefully transcribed and encoded primary sources, but with a central goal of deconstructing and remodeling those sources for use as data (e.g., for computational social-network analysis).

http://shelleygodwinarchive.org/ https://pfaffs.web.lehigh.edu/ http://earlywashingtondc.org/

components of collections

in this section i consider what components of our sample collections must be modeled in order to logically represent them as research objects, to lay the groundwork for attempting a retrospective mapping to the research objects data model. for each collection, content analysis and the application of the content/context framework serve to identify the main products of the collection (its discrete, publishable outcomes) and how they are related to one another and to other resources.
the remainder of this section characterizes the items and interrelatedness of the current instantiation, identified through content analysis of each collection.

shelley-godwin archive components

shelley-godwin aims to provide a definitive collection of manuscripts, digitized as high-quality page images with corresponding tei-encoded transcriptions. these manuscripts are augmented by innovative modes of access and participation for users, including features for multimodal and comparative reading, and features for facilitating future participation in the archive through user annotation and curation of manuscripts. what are the original contributions and important contextual components, and how are they related? content analysis of the collection identified the following components:

• manuscripts: manuscripts are abstract objects, with multiple possible orderings, of sequential transcriptions and corresponding page images, currently instantiated through tei-xml files that reference and order the separate tei-xml files representing transcribed pages (see below).
  – page images: digitized manuscript page images. the image files are hosted remotely and appear on the site through a call to the bodleian digital library’s iiif api; but images were digitized under the auspices of the shelley-godwin archive project and thus constitute a contribution of the project.
  – encoded transcriptions: transcriptions of page images, encoded in a tei-xml schema for representation of primary sources. multiple representations of the page images and transcribed text stem from shared canvas manifests that are generated based on these tei files; these transcriptions are the foundation of this project’s contribution.
• narrative components:
  – original texts: the project offers manuscript descriptions, currently instantiated as html files.
  – excerpted texts: the project includes excerpts of previously published texts, including manuscript descriptions and a chronology, currently instantiated as html files.
• browse and search functionalities: browse and search of shelley-godwin operate across manuscripts as wholes, and across components of manuscripts. these functionalities are customized to offer multiple reading orders, taking advantage of the highly rich encodings.
• reading viewer: the custom implementation of the reading viewer takes advantage of shared canvas/iiif representations of the manuscript images in addition to the encodings, to allow readers to compare the original handwritten text with its transcriptions, and to limit views by authorial hands.
• schemata and utilities: shelley-godwin relies on multiple custom data models and utilities for constituting the manuscripts from numerous components.

modeling digital humanities collections as research objects. jcdl ’ , june , urbana-champaign, il, usa

table : collection objectives
shelley-godwin | vault | o say
provide access to a complete set of encoded manuscripts | aggregate access to distributed, related sources | digitize, transcribe, encode archival documents to extract data for analysis
facilitate multi-modal, comparative reading and user participation | illuminate a network of works (sources), people, places | reconstruct and expose hidden relationships and personal histories

the components of vault and o say are described in less detail, below, to facilitate comparison with shelley-godwin.
vault at pfaff’s components

vault, which aims to help users discover connections among a large set of related sources and people, decomposes into the following main components: annotated bibliographic metadata records, which include annotated internal hyperlinks to related people entities (whether authors or mentions) and internal/external hyperlinks to electronic sources when available; annotated biographical records (people entities); a dedicated relationships browser, along with other browsing and searching facilities; original narrative components including historical essays and full biographies; an extended timeline and interactive map; and transcriptions and page images of the subset of primary sources hosted by vault (most primary sources in this aggregation are externally linked).

o say can you see components

o say provides encoded primary sources and extracted data. its main contributions may be decomposed into the following components: page images of archival documents; encoded transcriptions of archival documents in tei-xml; extracted and augmented person data (represented as rdf data documenting relationship and personal information, derived from a central csv file, all extracted from case documents); family guides (family trees that interrelate "people" entities, derived from the same central data source); cases (abstract entities, a mechanism for aggregating extracted data and documents, such as person entities and case document references); annotated cases (which are the same as cases, but including long annotations with hyperlinks); a relationships ontology (owl) and other customized data models; a special browse and search functionality, including relationship browse and search with multiple serialization options and a simple relationship api; stories (original long-form narratives heavily linked both to internal entities/resources and external resources); and a bibliography with links to related projects, and primary and secondary
sources.

content and context components

applying the content/context framework to the components identified through content analysis exposes a few important characteristics of dh collections, which any data model intended to represent and describe collections must take into account. as an example of how this analytical framework applies to collections, figure shows selected content and context components of all three collections mapped to a two-dimensional grid, to demonstrate how components fall along two spectra of (1) primary/derived/secondary sources and (2) previously published (or external) versus original sources. the grid differentiates six boxes or categories for the sake of making the framework more legible, but in reality the category boundaries are fuzzy and each axis should be understood as a spectrum. components of the three collections fall into almost every category. (the only category into which no components fall, in this analysis, is the category of components that are both derived from primary sources and previously/externally published; but it is easy to imagine components that would fall into such a category, such as datasets hosted in an external repository.)

figure : "content" components mapped to framework

mapping components identified above to this framework, as in figure , exposes the following essential and interesting characteristics of dh collections:

components contribute to scholarship in diverse ways. the mapping illustrates the great variety among the components of even just a few collections: variety not only in type and form, but also in less predictable dimensions, including their originality and how they participate in the scholarly record, whether as primary, secondary, or derived sources. the contributions of a collection are often framed in terms of concrete, novel additions to the scholarly and cultural records, but such additions are more various, and sometimes more abstract, than usually imagined.
the multidimensional diversity of the components that constitute our collections may complicate our judgments about which pieces are priorities for sustainability and preservation.

not all essential content is original or internal to the collection. for example, many of the primary sources that make vault a valuable resource for discovery were previously published and constitute external references. in a different case, the manuscript page images that constitute a major part of shelley-godwin’s contribution to scholarship are original but externally referenced, which will pose fundamentally different challenges to the sustainability and preservation of the collection as a whole than if they were co-located with the rest of the collection components.

content is not the only essential contribution of a digital collection. the contribution may be partly or even centrally manifested in the interrelationships among components, or in the context surrounding the content. these relationships and context have been called the "connective tissue" of a collection [ ]. for example, the customized schemas and utilities used to constitute the archive and its contents may represent a technical contribution to dh as a field of practice. the custom relationships browsers of vault and o say serve to enact scholarly interpretations; the ability to search and browse fine-grained relationships within and among components in bespoke ways is essential to the purposes of those collections. flanders ( ) invites us to "consider what happens to our understanding of a ’collection’ when its constituent items are no longer the primary unit of meaning" [ ]; at the least, this idea suggests that standard repository models for representing "items + metadata" as constituting a collection are insufficient to represent and describe dh collections.
the next section breaks some of the connective tissue down to have a closer look, prior to the application of the research objects model.

relationships among components

components of collections are interrelated both conceptually and technically, and these relationships are essential to representing and describing collections as complex and cohesive wholes. in the case of shelley-godwin, relationships are implemented in various ways. the collection leverages identifiers, schemata, utilities (scripts or processes), and data files to construct the archive’s representation of each manuscript.

figure offers a reductive illustration of the components of shelley-godwin and the relationships among them. in figure , items included in the collection are enclosed in (blue) squares. note that page images appear in a separate square; while they are logically part of shelley-godwin, they are maintained and hosted by a different institution in a separate digital library (digital bodleian, https://digital.bodleian.ox.ac.uk/) and called via api. in figure , arrows represent relationships. solid arrows represent referential relationships that are formalized and actionable (if not semantically encoded), such as relationships performed by hyperlinked uris. these include the following (broadly described):

(a) custom data models refer to (and extend) standard, external data models, for purposes of validation and documentation. for example, the shelley-godwin tei-odd file references the tei standard, in addition to the standard that defines the odd.
(b) scripts and utilities refer to all components in order to construct or enact the functional website. for example, the site relies on the unbind utility, a python utility to create shared canvas manifests (which underlie the interactive reading viewer) from shelley-godwin tei files.

figure : conceptual and technical relationships among components
dashed (yellow) arrows represent conceptual or abstract relationships. these relationships are made visible to users by the design of the site, but they are implemented indirectly, performed by completely separate components of the collection. these include the following:

(c) relationships between page images and corresponding encoded transcriptions. for users this relationship is experienced via the juxtaposition of both in the reading viewer. behind the scenes, this juxtaposition is created by the utilities described above.
(d) relationships between each manuscript and its components. each manuscript is an abstract entity with a proxy in the form of xml documents, one for each volume, which list the uris for the individual pieces, or pages, that constitute the volume and manuscript. the identifiers for pieces of the manuscript serve to identify both page images and corresponding xml files, because scripts and utilities expand the identifiers into uris. the dashed circle in figure encompasses the abstract object of the manuscript, an abstract entity that is evident and interactive to users through browsing mechanisms and the comparative reading viewer, but which is constructed behind the scenes through a complex, distributed process.
(e) relationships between narrative components and manuscripts. references to manuscripts within the narrative components of the site are implemented as hyperlinks between textual references and the landing pages for corresponding manuscripts.

through this analysis of the final aspect of our "content/context" framework, the aspect of relationships among components, we find another crucial observation about dh collections: not all relationships among components are equal.
some are implemented directly using mechanisms such as uri addresses, which would readily translate to alternate representations, such as semantic relationships in a linked data or research objects model. other relationships are implemented indirectly via processes that may prove more difficult to translate or migrate. dwelling on relationships within vault and o say is out of scope for this paper, but those collections, even more than shelley-godwin, realize their purposes and contributions through their connective tissue, and demand a deeper analysis in future work.

research objects and collections

so far this analysis has broken collections down into sets of logical components and relationships, with the goal of applying the research objects model to describing and representing them. by way of reminder, research objects are comprised of two main kinds of things: aggregated items and annotations. in this model, a research object may be serialized as a bundle, which is a zip archive of a file structure and all constituent data files, along with a json-ld manifest of metadata about the aggregation contents.

how well can this model capture the logic and meaning of digital collections? this section suggests a basic mapping of components and relationships of one collection, shelley-godwin, to the research objects model, in order to begin to identify challenges and implications of this model for representing dh scholarship. the following examples assume the goal of trying to migrate shelley-godwin (the complete collection, as data) into a research object bundle. the collection could then be migrated into a research objects management system, so that other digital humanists could access and use the data alongside (presumably) many other collections, or so that third-party applications could draw on the data to support custom interactions.
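as a rough illustration of the bundle serialization described above (a zip archive of data files plus a json-ld manifest), the following python sketch builds a small bundle in memory and reads its manifest back. the file names and contents are hypothetical placeholders, and the manifest location (.ro/manifest.json) and context uri are assumed here to follow the research object bundle convention; neither is taken from the shelley-godwin project's actual layout.

```python
import io
import json
import zipfile

# Hypothetical file names and placeholder contents, for illustration only.
FILES = {
    "tei/volume_i.xml": "<TEI><!-- volume-level file listing pages in order --></TEI>",
    "tei/page_0001.xml": "<TEI><!-- transcription of a single page --></TEI>",
    "about/introduction.html": "<html><body>manuscript description</body></html>",
}


def build_bundle(files):
    """Return the bytes of a zip archive holding data files and a JSON-LD manifest."""
    manifest = {
        # Assumed context URI and manifest path (RO Bundle convention).
        "@context": ["https://w3id.org/bundle/context"],
        "id": "/",
        "aggregates": [{"uri": "/" + path} for path in sorted(files)],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, text in files.items():
            archive.writestr(path, text)
        archive.writestr(".ro/manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


# Build the bundle, then reopen it to show that the manifest travels with the data.
bundle_bytes = build_bundle(FILES)
with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as archive:
    names = sorted(archive.namelist())
    manifest = json.loads(archive.read(".ro/manifest.json"))
```

keeping the manifest inside the archive means the aggregation carries its own description, which is the property that makes a bundle a plausible unit for migration.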
the details of access and use are not imagined here, but some potential implications for varieties of access and use are considered in section .

first, adopting the model means capturing components that fall into the content category of the content/context framework articulated above. for shelley-godwin, these components are (at least): (1) page images, (2) encoded transcriptions corresponding to page images, and (3) narrative components that serve to describe manuscripts. manuscripts, in turn, are abstract entities that are manifested by relationships among page images and encoded transcriptions. in a research object, each component would be referenced in the manifest as an aggregated item. the following example record shows a portion of a research object manifest, which lists aggregated items including (1) an xml file (ending in "volume_i.xml") representing volume of mary shelley’s frankenstein manuscript, and which references the individual pages in order; (2) a single digital page image (in jpeg format); (3) an xml file (ending in "c - .xml") representing a single page of the frankenstein manuscript; (4) an html file representing a narrative introduction to the manuscript; and (5) the tei-odd schema that governs the shelley-godwin implementation of tei-xml.

figure : snippet of partial manifest for shelley-godwin research object aggregation

note that the aggregates field already captures several important relationships among the components of shelley-godwin, even prior to the addition of explicit relationship annotations. first, the research object manifest represents and makes explicit the relationships between "tangible" or self-contained components (such as files or documents) and abstract components of the collection. in this example, the volume-level xml file stands as a proxy for a manuscript, which, as discussed above, is an abstract object in shelley-godwin’s architecture.
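the five kinds of aggregated items listed above could be sketched as the following manifest fragment, built in python. this is not the manifest from the figure: every uri is a hypothetical placeholder (the archive's real file paths and image host are not reproduced), and the mediatype key and the remote_items helper are assumptions added for illustration.

```python
import json

# Hypothetical URIs standing in for the five aggregated items described in the text.
AGGREGATES = [
    # (1) volume-level XML file that references the individual pages in order
    {"uri": "/tei/volume_i.xml", "mediatype": "application/xml"},
    # (2) a single page image, hosted outside the collection
    {"uri": "https://images.example.org/page-0001.jpg", "mediatype": "image/jpeg"},
    # (3) XML file representing a single transcribed page
    {"uri": "/tei/page_0001.xml", "mediatype": "application/xml"},
    # (4) HTML file with a narrative introduction to the manuscript
    {"uri": "/about/introduction.html", "mediatype": "text/html"},
    # (5) TEI-ODD schema governing the TEI-XML encoding
    {"uri": "/schema/shelley_godwin_odd.xml", "mediatype": "application/xml"},
]


def make_manifest(aggregates):
    """Wrap a list of aggregated items in a minimal manifest dictionary."""
    return {
        "@context": ["https://w3id.org/bundle/context"],  # assumed context URI
        "id": "/",
        "aggregates": list(aggregates),
    }


def remote_items(manifest):
    """List the URIs of aggregated items that live outside the bundle."""
    return [
        item["uri"]
        for item in manifest["aggregates"]
        if item["uri"].startswith(("http://", "https://"))
    ]


manifest = make_manifest(AGGREGATES)
manifest_json = json.dumps(manifest, indent=2)
```

listing the remotely hosted image alongside local files in one aggregates list is what lets the manifest state, in one place, that the collection depends on a resource it does not hold.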
it would also be possible to represent the manuscript as an abstract entity more explicitly in this model, perhaps relying on the oai-ore proxy mechanism.

in addition, uris for aggregated objects may reference both local files contained within a research object and remote resources. in figure , relationships to external resources are highlighted. the conformsto field allows a research object creator to indicate schemas or standards to which a given aggregated resource conforms; in this case conformsto references schemas both internal and external to the collection. relationships between the encoded transcriptions and relevant schemas and standards, embedded in the tei-xml file headers, can also be described in the research object manifest, where they can be exposed to consumption by independent applications. figure gives an example of how a shelley-godwin research object might reference page images hosted externally to the collection, in digital bodleian. digital bodleian is in fact the source of page images displayed in the shelley-godwin website. but in the current shelley-godwin site, this referential relationship is only made explicit within the code used to generate pages. the research objects model makes this relationship explicit, semantic, and discoverable in the outward-facing manifest.

annotations, constituting the second major piece of a research object manifest, are used to express descriptive metadata about aggregated resources, including relationships among resources (internal or external) and detailed provenance information. annotations rely on domain-specific ontologies and vocabularies. figure exemplifies annotations that make explicit several relationships among aggregated components of shelley-godwin, including shelley-godwin relationships (c), (d), and (e), identified in section .
, above:

(c) relationships between page images and corresponding encoded transcriptions: in this example research object, these relationships are made explicit in annotations that link each xml file representing a single transcribed page to its corresponding page image, via prov:wasderivedfrom. there are, of course, other ways to express this relationship.
(d) relationships between the various components that constitute a manuscript: in this example research object, the relationships are made explicit in annotations that link each xml file representing a single transcribed page to its corresponding tei-xml file representing a single volume, via dct:haspart. there are other ways this relationship could be represented.
(e) relationships between narrative components and manuscripts. hyperlinks forge relationships between textual references and manuscripts; therefore these relationships are best modeled not at the document level but at a lower level within the text. these relationships could simply remain as embedded hyperlinks, relying on unique identifiers for manuscripts (assuming the urls continue to function in the new context of a research object). alternatively, the fact that a narrative component refers to a manuscript could be made explicit in the manifest, via an annotation such as crm:refersto. but it is not immediately clear how a document-level annotation indicating references would be useful.

figure offers an alternative view of these relationships, expressed as an rdf snippet derived from an rohub research object and visualized.

the research objects model supports the use of domain ontologies (such as cidoc-crm and bibliographic ontologies) for rich descriptions of the interrelationships among collection components and external sources. there are numerous alternative ontological approaches to modeling the relationships given in the examples above.
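relationships (c) and (d) above can be sketched as plain subject/predicate/object triples. the python below is an illustration under stated assumptions, not the annotation format used in the figure: the example.org uris are hypothetical, the predicates are written with their standard casing (prov:wasDerivedFrom from prov-o, dcterms:hasPart from dublin core terms), and the part-whole relationship is stated from volume to page, which is one of several possible readings of (d).

```python
PROV = "http://www.w3.org/ns/prov#"
DCT = "http://purl.org/dc/terms/"

# Hypothetical identifiers for one volume and two of its pages.
VOLUME = "https://example.org/ro/tei/volume_i.xml"
PAGES = [
    ("https://example.org/ro/tei/page_0001.xml", "https://images.example.org/page-0001.jpg"),
    ("https://example.org/ro/tei/page_0002.xml", "https://images.example.org/page-0002.jpg"),
]


def page_annotations(volume_uri, pages):
    """Build triples for relationships (c) and (d).

    pages is a list of (transcription_uri, image_uri) pairs in reading order.
    """
    triples = []
    for transcription, image in pages:
        # (c) each transcription was derived from its page image
        triples.append((transcription, PROV + "wasDerivedFrom", image))
        # (d) the volume-level file has each transcribed page as a part
        triples.append((volume_uri, DCT + "hasPart", transcription))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = page_annotations(VOLUME, PAGES)
ntriples = to_ntriples(triples)
```

relationship (e) is left out of the sketch on purpose: as noted above, it lives inside the narrative text as hyperlinks, so a document-level triple would add little.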
current research object management systems (such as rohub) offer a limited set of terms for adding annotations to objects, mainly oriented toward description of computational and scientific research workflows. for example, rohub’s "ro basic requirements" require research objects to include hypotheses or research questions, along with conclusions. for expressing relationships among the aggregated research object resources, rohub relies on terms from the prov and wf4ever research object ontologies, which are both focused on scientific workflows. such ontologies will prove inadequate to fully describe the processes or workflows of digital scholarship in the humanities.

footnote urls: http://www.cidoc-crm.org/cidoc-crm/ (cidoc-crm), http://www.rohub.org/ (rohub), http://ontodia.org/ (ontodia).

figure : snippet of partial manifest for shelley-godwin research object annotations

this example application of the research objects model has not accounted for the components of collections that are interactive, dynamic, and functional, such as shelley-godwin’s custom search and browse options, and its comparative reading viewer. these are essential aspects of the project’s contributions to scholarship. not only do they represent technological contributions to the dh landscape, but they were built for symbiosis with shelley-godwin data, which was modeled to support the use of these advanced tools. as flat code, of course, these pieces readily fit into the research objects model, which has been shown to be useful for aggregating data and code for migration and preservation purposes. but as performative, interactive components that function to enable new kinds of exploration and encounter with collection contents, these components challenge the research objects model.
while the model has been applied to software preservation [ ], and while workflow-oriented research objects usefully represent certain kinds of dynamic and executable content, the functional and interactive components of dh collections are really about enabling specific, purposeful kinds of real-time, end-user interaction. the duties of the functional, contextual components of collections (to enable exploration, discovery, connection-making, learning, etc.) would be assumed not by a data model but by the interactive components of a research objects management system or other applications built on top of a research objects management system. the potential for such systems and applications to enact the diverse methodological and functional goals of dh scholarship is a topic for future investigation.

discussion and future work

this study has analyzed three dh collections using qualitative content analysis, employing a novel content/context analytical framework in order to characterize collection components and their interrelationships. applying the framework highlighted a few important characteristics of dh collections that complicate our understanding of how collections are constituted, and which therefore carry implications for the data models that represent collections along with approaches to sustaining and preserving them. these characteristics are: (1) components of collections contribute to scholarship in diverse ways. (2) not all of the essential content of a collection is necessarily original or internal to the collection. (3) contextual components and interrelationships among components may be equally as essential as the main content of a collection.

figure : partial visualization of ro
research objects have the potential to represent and describe a wide range of scholarly products, more fully and more sustainably than models that currently dominate content management and publication systems. in this paper, the components and interrelationships of the sample dh collections were retrospectively mapped into a research objects model in order to identify strengths and limitations of that model for representing dh scholarship. the following three central strengths emerged.

(1) research objects readily perform the most essential function of a collection: to aggregate related resources in order to support scholarly objectives. (for this reason, research objects have been leveraged to support digital preservation and big-data transfer [ ].)

(2) research objects have the capacity to accommodate rich semantic descriptions of interrelationships among components, using domain ontologies. these interrelationships may obtain between components with identifiable and addressable representations, such as documents or files, and components that are more abstract. in dh collections, such interrelationships are often inexplicit or "hidden", enacted by or encoded in the layers of scripts and processes that operate to assemble collections for presentation on the web. when these relationships are hidden, they may be more vulnerable to dissolution in the course of data manipulation, preservation, and migration processes. formalizing these relationships not only makes them more sustainable; it also opens them to linked data representation and computational uses.

(3) the research objects model accommodates aggregations of linked data, offering researchers the opportunity to create and annotate virtual, fully referential collections in any context and at scale. in addition, structured descriptions of aggregations in research objects are amenable to third-party annotation, and can be leveraged by external applications.
these advantages of the research objects model for representing dh collections suggest new possibilities for collaboration, communication, and data reuse within scholarly communities.

the most immediate limitation of the model for dh collections is that functional components designed for end-user interaction are not usefully captured in a basic research objects model. instead, these components raise questions about the capacities of research objects management systems to serve the distributed development of a diversity of applications. how can management systems serve to underpin experimental, interactive, and dynamic platforms? different kinds of dh scholarship aim to facilitate different kinds of interactions between users, evidence, and context; the diversity of dh scholarship and the compulsion toward experimentation and innovation have hindered large, sustainable, cross-disciplinary infrastructures. realizing the advantages of research objects and related efforts for dh will depend on implementations that establish dynamic platforms for experimentation, participation, and co-creation.

this study has treated collections in terms of their logical components and relationships, setting aside for now several other important characteristics and properties, such as collections’ look and feel, their digital materiality, and the detailed contours of their implementations. these aspects are essential to the experience and preservation of some collections; it is hard to see how the research objects model could benefit such projects after their development, in retrospective sustainability or preservation efforts, but it is clear that the model could underpin systems going forward that support a wide variety of implementations.

dh research objects would necessarily represent extensions of the basic research objects model, based on the representational and user requirements in different domains and scholarly communities.
the work of ontologizing the humanities is well underway. a research objects profile specific to representing collections such as shelley-godwin, vault, and o say will depend on cobbling together ontologies and vocabularies to express a diversity of relationships among primary, derived, and secondary sources, in addition to workflows, people, and contextual entities. prior research has emphasized the necessity of highly granular systems of identification, addressability, and reference for supporting dh research and collection practices within digital libraries [ ]. indeed, implementing the research objects model at scale within a linked data paradigm would demand more pervasive use of persistent identifiers for dh objects at varying levels of granularity, including ideally addressable identifiers for each component of a collection, the pieces that make up a component, and so on.

footnote url: http://www.researchobject.org/scopes/

in terms of architecture, dh collections bear significant resemblance to other kinds of digital libraries. the benefits, constraints, and practical challenges of applying the research objects model for dh collections seem, for the most part, likely to hold for cultural heritage digital libraries generally. emerging linked data collections of cultural heritage institutions stand to support the rise of research objects and similar publication models across disciplines. future work will investigate the potential intersections between research objects and linked data representations of cultural collections in libraries, archives, and museums.

there are numerous emergent models for representing digital publications and digital objects, including models for publishing media-rich and interactive digital monographs along with supplementary materials, and experiments with alternative scientific publishing models such as nanopublications [ ].
future work will investigate the intersections between the research objects model and various alternatives for representing the breadth of dh scholarship, collections, and data, including forerunning applications of research objects to humanities collections [ , , , , , ], and ongoing studies of other approaches to containerization in dh.

the research objects data model evaluated in this paper is "data-centric"; workflow-oriented research objects, as a closely related alternative, extend the basic model to capture holistic, executable research workflows. while workflows have received growing attention in the humanities from both technical and strategic perspectives [ , ], the implications of workflow-oriented data models for capturing the idiosyncrasies of humanities research processes need further investigation. future work will extend this analysis to a more complete study of dh scholarship, scholars, and workflows, in order to advance data models that may help us realize the benefits of standard infrastructure while minimally attenuating the irrepressible diversity of digital humanities scholarship.

references

[ ] bridget almas. . perseids: experimenting with infrastructure for creating and sharing research data in the digital humanities. data science journal , ( ).
[ ] alessia bardi and paolo manghi. . enhanced publications: data models and information systems. liber quarterly , ( ).
[ ] sean bechhofer, iain buchan, david de roure, paolo missier, john ainsworth, jiten bhagat, philip couch, don cruickshank, mark delderfield, ian dunlop, matthew gamble, danius michaelides, stuart owen, david newman, shoaib sufi, and carole goble. . why linked data is not enough for scientists. future generation computer systems , ( ).
[ ] sean bechhofer, david de roure, matthew gamble, carole goble, and iain buchan. . research objects: towards exchange and reuse of digital knowledge. nature proceedings ( ).
[ ] khalid belhajjame, jun zhao, daniel garijo, matthew gamble, kristina hettne, raul palma, eleni mina, oscar corcho, josé manuel gómez-pérez, sean bechhofer, graham klyne, and carole goble. . using a suite of ontologies for preserving workflow-centric research objects. journal of web semantics ( ), – .
[ ] tobias blanke and mark hedges. . scholarly primitives: building institutional infrastructure for humanities e-science. future generation computer systems , ( ).
[ ] joshua borycz and bonnie carroll. . managing digital research objects in an expanding science ecosystem: conference summary. data science journal ( ).
[ ] phil e. bourne, tim clark, robert dale, anita de waard, ivan herman, eduard hovy, and david shotton. . force manifesto: improving future research communication and e-scholarship. technical report. force .
[ ] kyle chard, mike d’arcy, ben heavner, ian foster, carl kesselman, ravi madduri, alexis rodriguez, stian soiland-reyes, carole goble, kristi clark, eric w. deutsch, ivo dinov, nathan price, and arthur toga. . i’ll take that to go: big data bags and minimal identifiers for exchange of large, complex datasets. ieee conference on big data ( ).
[ ] tracey clarke and andy bussey. . research information systems – fit for the future? a report on the situation and plans of the university of sheffield library. o-bib. das offene bibliotheksjournal / herausgeber vdb ( ).
[ ] anita de waard. . fair cures: a research object authoring tool for the data commons. cni fall membership meeting ( ).
[ ] katrina fenlon. . thematic research collections: libraries and the evolution of alternative digital publishing in the humanities. library trends , ( ), – .
http://digits.pub/
[ ] katrina fenlon. . thematic research collections: libraries and the evolution of alternative scholarly publishing in the humanities. ph.d. dissertation. university of illinois at urbana-champaign, http://hdl.handle.net/ / .
[ ] katrina fenlon, megan senseney, harriett green, sayan bhattacharyya, craig willis, and j. stephen downie. . scholar-built collections: a study of user requirements for research in large-scale digital libraries. in proceedings of the american society for information science and technology.
[ ] julia flanders. . advancing digital humanities. palgrave macmillan uk, chapter rethinking collections.
[ ] julia flanders and fotis jannidis. . knowledge organization and data modeling in the humanities. technical report. brown university.
[ ] richard freedman, raffaele viglianti, and adam crandell. . the collaborative musical text. music reference services quarterly ( ).
[ ] andres garcia-silva, josé manuel gómez-pérez, raul palma, and ikay ...altintas. . enabling fair research in earth science through research objects. arxiv: . [cs] ( ).
[ ] david hansen, liz milewicz, paolo mangiafico, will shaw, mattia begali, and veronica mcgurrin. . a framework for library support of expansive digital publishing. technical report. duke university.
[ ] mark hedges, heike neuroth, kathleen m. smith, tobias blanke, laurent romary, marc küster, and malcolm illingworth. . textgrid, textvre, and dariah: sustainability of infrastructures for textual scholarship. journal of the text encoding initiative ( ).
[ ] inna kouper, beth plale, dharma akmon, and margaret hedstrom. . practical and conceptual considerations of research object preservation. digital preservation ( ).
[ ] christine madsen and megan hurst. . are digital humanities projects sustainable? a proposed service model for a dh infrastructure. in cni membership meeting. https://www.slideshare.net/mccarthymadsen/are-digital-humanities-projects-sustainable-a-proposed-service-model-for-a-dh-infrastructure.
[ ] nancy l. maron and sarah pickle. . sustaining the digital humanities: host institutional support beyond the startup phase. technical report.
ithaka s+r, https://sr.ithaka.org/publications/sustaining-the-digital-humanities/.
[ ] dominic oldman and diana tanase. . reshaping the knowledge graph by connecting researchers, data and practices in researchspace. the semantic web - iwsc ( ).
[ ] kevin page, david lewis, and david weigl. . contextual interpretation of music notation. digital humanities ( ).
[ ] raúl palma, piotr hołubowicz, oscar corcho, josé manuel gómez-pérez, and cezary mazurek. . rohub - a digital library of research objects supporting scientists towards reproducible science. springer international publishing, – .
[ ] carole l. palmer. . a companion to digital humanities. blackwell publishing, chapter thematic research collections.
[ ] roger c. schonfeld and donald j. waters. . the turn to research workflow and the strategic implications for the academy. cni spring membership meeting ( ).
[ ] sarah j. sweeney, julia flanders, and abbie levesque. . community-enhanced repository for engaged scholarship: a case study on supporting digital humanities research. college and undergraduate libraries , - ( ), – .
[ ] david tarrant, ben o’steen, tim brody, steve hitchcock, neil jefferies, and leslie carr. . using oai-ore to transform digital repositories into interoperable storage and services applications. the code lib journal ( ).
[ ] karen m. wickett, allen h. renear, and jonathan furner. . are collections sets?. in proceedings of the american society for information science and technology, vol. .
an analysis of researchgate and academia.edu as socio-technical systems for scholars’ networked learning: a multilevel framework proposal

italian title: un’analisi di researchgate e academia.edu come sistemi socio-tecnici per l’apprendimento in rete degli accademici: una proposta di framework multilivello

stefania manca
istituto per le tecnologie didattiche, consiglio nazionale delle ricerche, genova, italy, stefania.manca@itd.cnr.it

how to cite
manca, s. ( ). an analysis of researchgate and academia.edu as socio-technical systems for scholars’ networked learning: a multilevel framework proposal. italian journal of educational technology, ( ), - . doi: . / - /

abstract
academic social network sites (asns) like researchgate and academia.edu are digital platforms for information sharing and systems for open dissemination of scholarly practices that are gaining momentum among researchers of multiple disciplines. although asns are increasingly transforming scholarly practices and academic identity, a unifying theoretical approach that analyses these platforms at both a systemic/infrastructural and a personal/individual level is missing. moreover, despite an increasing number of studies on the benefits of social media for scholarly networking and knowledge sharing, very few studies have investigated the specific benefits of researchgate and academia.edu for scholars’ professional development from a networked learning perspective. this study focuses on academic social network sites as networked socio-technical systems and adopts a three-level analysis related to asns as platforms for digital scholarship and scholarly communication. the approach comprises: 1) a macro-level, which constitutes the socio-economic layer; 2) a meso-level, which comprises the techno-cultural layer; and 3) a micro-level, which constitutes the networked-scholar layer.
the study reports on investigations into the technological features provided by researchgate and academia.edu for networked learning that are based on the multilevel approach. the final aim is to exemplify how these digital services are socio-technical systems that support scholars’ knowledge sharing and professional learning.

keywords
academic social network sites, researchgate, academia.edu, socio-technical systems, networked learning, professional development.

sommario (italian abstract, translated)
academic social networks (asns) such as researchgate and academia.edu are digital platforms for information sharing and systems for the open dissemination of academic practices that are attracting growing interest among researchers from different disciplines. although asns are progressively transforming researchers’ practices and academic identity, a unifying theoretical approach that analyses these platforms at both the systemic/infrastructural and the personal/individual level is still missing. moreover, although a growing number of studies address the benefits of social media for academic networking and knowledge sharing, very few have analysed the specific benefits of researchgate and academia.edu for academics’ professional development from a networked learning perspective. this study focuses on academic social networks as networked socio-technical systems and adopts a three-level approach that analyses them as platforms for digital scholarship and academic communication. the approach comprises: 1) a macro-level, which constitutes the socio-economic layer; 2) a meso-level, which includes the techno-cultural layer; and 3) a micro-level, which constitutes the networked-researcher layer. the study reports the results of the analysis of the technological features of researchgate and academia.edu for networked learning, based on the multilevel approach. the final aim is to provide examples of how these digital services are socio-technical systems that support researchers in knowledge sharing and professional learning.

parole chiave (keywords, translated)
academic social networks, researchgate, academia.edu, socio-technical systems, networked learning, professional development.

1. introduction
digital technologies and social media are progressively reshaping professional development and work-based learning in a variety of knowledge-intensive professions, such as teaching, academia and health care (manca & ranieri, a). several studies have shown that professionals employ social media platforms to build learning networks, share professional ideas and engage collaboratively with their peers. for instance, fox and bird ( ), who recently investigated the landscape of academic studies on how teachers and doctors learn through social media, find that there is a growing professional interest in social media in these professions, although a broader evidence base is still needed to evaluate the benefits in both. with specific reference to social network sites, studies on the use of facebook groups for teachers’ professional development have shown that teachers participate in social media for sharing and professional support (kelly & antonio, ; ranieri, manca, & fini, ) and for identity positioning (lundin, lantz-andersson, & hillman, ). in addition, studies on twitter use have found that teachers emphasize sharing novel ideas and collaborating with other teachers to create education-related content as central practices (carpenter & krutka, ; macià & garcía, ).
today in the scholarly field asns and ict in general are redefining academic networking and scholars’ identities and increasingly affecting strands of open science and public engagement (weller, ). digital scholarship is commonly understood as the use of digital evidence, methods of inquiry, research, publication and preservation to achieve scholarly and research goals. in this light, social media are progressively affecting academic scholarship and its four dimensions - discovery, integration, application and teaching (boyer, ) - according to several strands of analysis. one of the approaches, networked participatory scholarship, has been advanced as «the emergent practice of scholars’ use of participatory technologies and online social networks to share, reflect upon, critique, improve, validate, and further their scholarship» (veletsianos & kimmons , p. ). the emergent uses of tools like facebook, twitter, academia.edu and mendeley reveal how scholarly knowledge has come to be acquired, tested, validated, and shared, as well as how university subcultures of ‘invisible college’ (wagner, ) are constructed. another strand of research, named social scholarship (greenhow & gleason, ), investigates how social media affordances influence the ways in which academia accomplishes scholarship through values like promotion of users and decentralised accessible knowledge. researchers are studying social media as means through which to promote the enhancement of open dissemination practices, the formulation of alternative indicators of scientific impact, and the strengthening of relationships among cohorts of scholars (borrego, ; donelan, ; manca & ranieri, b; nández & borrego, ).
however, while empirical studies carried out in the light of these networked and social participatory frameworks have mostly focused on the microblogging site twitter in scholarly practice (kimmons & veletsianos, ; li & greenhow, ; stewart, ), very few studies have thoroughly investigated the use of academic social network sites (asns) like researchgate and academia.edu in the light of theoretical frameworks developed in the educational technology sector and aimed at analysing social digital scholarship practice. most of the research focused on these two services has been produced in the library and information sciences and examines them as means of reputation building and as alternative ranking systems (hoffman, lutz, & meckel, ; kuo, tsai, wu, & alhalabi, ; niyazov et al., ; thelwall & kousha, ). today there is a dearth of research investigating practices and new modes of communication that employ researchgate and academia.edu in the light of a networked participatory approach to scholarship.

moreover, although asns are held to benefit scholars by enhancing their visibility and reputation, and by increasing opportunities for collaboration and exchange (nicholas, herman, & jamali, ), descriptions of scholars’ experiences of these platforms are fragmentary, and comprehensive views of how digital scholarship should be investigated are lacking (raffaghelli, cucchiara, manganello, & persico, ). the debate is stagnating in separate research strands that do not accommodate infrastructural issues and phenomenological experiences of use. present limitations demand a theoretically founded approach capable of providing a framework for analysing asns at both a systemic/infrastructural level and a personal/individual one, and of providing evidence of support for scholars’ networked learning and professional development. this study elaborates a conceptual analysis of asns in accordance with a multi-level framework that was investigated in a previous publication (manca & raffaghelli, ).
the proposal is aimed at analysing how sociality and digital platforms like researchgate and academia.edu are interrelated and how systemic and individual uses of these sites might best be investigated in the light of scholars’ professional learning. although the idea of analysing scholarly communication forums as socio-technical interaction networks is not new (kling, mckim, & king, ), research on asns from a socio-technical perspective at both the platform’s organizational level and the individual level is scarce. this study provides an analysis of the technological features employed by the two most prominent asns, researchgate and academia.edu, for networked learning and scholars’ knowledge sharing.

in the following sections, the theoretical background of the study is presented along with a detailed description of the technological features of researchgate and academia.edu.

2. theoretical background
the focus on patterns in the design and use of technologies as highly context dependent can be dated back at least to the ‘ s (williams & edge, ), while early ideas of the social construction of technology can be traced back to the early ‘ s (bijker, hughes, & pinch, ). according to socio-technical approaches, technological systems are determined by social forces and by technological features at the same time. design, implementation and use of information technologies are the result of interactions and negotiations between technology, users and organizational contexts (huysman & wulf, ).

digital scholarship has recently been reconceptualised as a complex techno-cultural system that includes technological innovations and dominant cultural values, along with differential identity markers and norms of practice and prestige (stewart, ). on the one hand, asns, being digital platforms and infrastructures that support digital scholarly practices, can be considered socio-material phenomena.
on the other hand, scholars have investigated scholarly communities as knowledge-sharing entities that are formed by trust, a sense of mutuality and recognition by peers (costa, ; fulk & yuan, ; huysman & wulf, ).

a socio-technical approach that combines emergent user practices and content with the platform’s organizational level has been proposed in the study of social media and social network sites as microsystems (van dijck, ). the approach positions social media like twitter, facebook, youtube, flickr and wikipedia as systems that encompass coevolving networks of people and technologies with economic infrastructure and legal-political governance, and blends techno-cultural and political economy views. it also provides a two-layered approach that addresses social media platforms as socio-economic structures and techno-cultural constructs. each of these two layers comprises different components. the socio-economic layer includes: 1) an ownership component, which governs commercial and non-profit platforms according to different policies; 2) a governance component, which is constituted by technical and social protocols and sets of rules for managing user activities; and 3) a business model component, which mediates the engineering of connectivity through subscription models. the techno-cultural layer includes: 1) a technology component, namely a number of services that help encode activities into a computational architecture that steers user behaviour; 2) a user/usage component, which orients user agency and implicit and explicit participation; and 3) a content component, which determines the standardization of content and the uniform delivery of products.

however, it should be noted that van dijck’s two-level framework does not explicitly encompass the individual use of social platforms and the ways single users exploit these sites for specific purposes.
to this end, we consider the concept of networked participatory scholarship (nps) proposed by veletsianos and kimmons ( ). according to nps, scholars’ learning and knowledge in networked spaces are facilitated, negotiated, and constructed both individually and socially. incorporating the ideas of nps provides a new, third layer, the networked-scholar layer, which concerns the actual uses of asns by individuals and communities of scholars. in this light, the networked-scholar layer comprises three major components: 1) a networking component, which refers to the structural dimension of scholarship and how the connectivity of communication and collaboration is engineered; 2) a knowledge-sharing component, which concerns the collective and distributed learning dimension; and 3) an identity component, which relates to reputation and trust as elements that shape academic personae. in this perspective, asns might be conceived as three-layered structures that comprise a socio-economic, a techno-cultural and a networked-scholar layer, as illustrated in figure 1.

figure 1. the multilayer approach for analysing asns according to a networked socio-technical perspective.

the following sections investigate asns through a multi-level analysis that emphasises the several dimensions these sites involve from a socio-technical perspective. a macro–meso–micro framing approach (dopfer, foster, & potts, ) is employed to investigate meanings across the systemic, organizational and individual layers.

3. the multi-layered framework

3.1. the socio-economic layer
the macro-level identifies the asns components that are at stake in framing the socio-economic model. from the perspective of the ownership component, asns are generally for-profit enterprises. they have attracted a range of external investors over the years and established agreements with other platforms and api-based services (e.g.
twitter, facebook, linkedin, google) to implement each other’s buttons and get reciprocal access to data streams. the governance component regards the mechanisms by which communication and data traffic are managed and includes the terms and conditions agreement governing, and legally framing, the provider-user relationship. finally, most of these platforms have supplemented their basic free-of-charge services with paid subscriptions as part of their overall business plan. in this respect, the socio-economic layer of asns is positioned in terms of ownership, governance and business model, and ascertains the opportunities and limitations for members to connect. it ultimately shapes the way that social relationships form and develop on the platform.

3.2. the techno-cultural layer
the meso-level identifies the asns components that are at stake in building the techno-cultural layer. the first, technology, comprises the computational architecture that steers user behaviour through programmed directives; it provides a mediator layer that shapes the performance of social acts by means of services. the platform interface reflects the owner’s strategic choices about the final presentation of information and default/optional settings, as in the case of the individual’s biographical data (e.g. real name, academic qualification, affiliation and gender). by default, such data are typically displayed to all users for identification purposes, while sensitive information like contact details is disclosed only to mutual followers. the complex of algorithms, protocols and default settings shapes the networking experiences of users active on academic platforms and engineers sociality in diverse manners.
sociality is, for instance, encoded by aggregating and processing (meta)data to calculate network connections within a given subnetwork or community, which is defined according to specific parameters like affiliation, country and research topics. features like news feed and network updates, partly adopted in the attempt to emulate high-profile social network sites, allow users to monitor members’ activities like new uploads, bookmarked publications and shared updates, and can also generate suggestions for new contacts or new activities to take part in. in this light, asns prioritise the events and actions considered most meaningful to foster connectivity among users, like prompting endorsements to recommend other researchers for their skills and expertise, and to spur new connections, like suggesting academics to follow.

the second component, user/usage, concerns activation of platform features according to implicit and explicit user participation. while the former regards the usage inscribed in platform design by means of the encoding mechanisms mentioned above, the latter concerns how actual users interact with the platform. usage can be limited to reading and acquiring information about what other users post, or can regard active participation, such as posting new content or activating new connections. both types of usage can generate professional benefits for academics; stumbling upon relevant information not only enriches their personal knowledge base, it can also help them build a transactive memory or a cognitive representation of who is who in their networks (see utz, ). transactive memory is further fostered, for instance, through the possibility to ask and respond to user questions in a dedicated question & answer section of the platform. moreover, by checking the profiles of other users, reading news feeds or approaching peers for advice and knowledge sharing, socio-emotional effects can be achieved as social lubricant (utz, ).
lastly, the content component concerns the types of information resources that can be shared on asns. being academically oriented, asns mostly prompt the uploading and sharing of academic output, which mainly comprises research papers but also grey literature like open datasets, drafts, results from failed experiments, and open reviews of papers. however, researchers can also ask research-related questions and share their expertise through question & answer sections, showcase the projects they are currently working on, and share updates with collaborators and peers. each platform sets and applies its own particular standards for content, formatting and delivery in line with the way it engineers connectivity and sociality.

3.3. the networked-scholar layer
the micro-level identifies the asns components that are at stake in building relations at the networked-scholar layer. the first component, networking, concerns the possibility to build an individual network of contacts using the features provided by the platform. in this view, asns offer a range of possibilities for users to build their personal network in order to access fresh information or locate relevant expertise. unlike other social networking services that are based on reciprocal ties (e.g. “friends” or “connections”), asns usually rely on the “follow” function, exploiting one-way relations that do not imply automatic reciprocation. the concept of “following” concerns a loose kind of connection that serves to exploit latent ties, namely ties that are still potential and can be activated socially (haythornthwaite, ), which may eventually become weak ties (genoni, merrick, & willson, ). ‘bookmarking’ publications and ‘following’ other users’ projects or research items are additional features for trailing users.

the second component, knowledge sharing, concerns sharing knowledge with peers.
knowledge sharing has been investigated in information public goods theory (fulk & yuan, ), as well as in studies on content sharing in social network sites like facebook (fu, wu, & cho, ). academic social network sites provide a range of features that support content distribution and sharing, mostly based on adding or uploading new articles. however, while some asns prompt the sharing of references, others openly encourage full text upload or support private one-to-one exchange when sharing involves copyrighted material with peers who do not have access to commercial channels of distribution. other features to exploit knowledge sharing and expertise are the search function, which allows users to search for fellow researchers, publications, questions, job announcements, research interests and affiliations, discussion sessions, and asking questions. by browsing the existing list of q&as or setting up a new discussion session, users can gain valuable access to the expertise of peers in a much more effective way than through conventional expert directories. as discussed below, motivations for sharing and perceptions of personal and collective gains have direct consequences on the reputational mechanism and overall identity.

lastly, the third component, identity, refers to reputation and trust as elements shaping academic personae. as acknowledged by veletsianos and stewart ( ), scholars disclose specific personal and professional information to promote their evolving professional identity and to boost their reputation in terms of visibility. in asns, identity is mostly conveyed through features like the profile and a number of strictly associated components. the profile, the main place where visibility and reputation are constructed, usually displays a user’s network size (i.e.
the number of followers and following), tie strength (weak ties are constituted by followers and following, while strong ties regard co-authors, who are usually listed separately), skills and expertise, number of peer endorsements, and awards and achievements. further features allow the visualization of new followers, research products, projects, teaching resources, engagement in discussion sessions and number of citations. reputation is further exploited in overall scores, which some asns offer as community measures to value users’ willingness to share knowledge and expertise (see nicholas, herman & jamali, ).

4. an analysis of researchgate and academia.edu
researchgate and academia.edu are undoubtedly the most popular of the social networking services developed specifically to support academic and research practices (nicholas, herman & jamali, ). these sites allow academics to build a professional profile, connect with colleagues, share publications, and state their mission in terms of research sharing, openness and transparency.

in the following, the two platforms are analysed according to the multilevel framework presented above, providing examples of features that correspond to the socio-economic, techno-cultural and networked-scholar layers (for a complete updated list of features, see http://tinyurl.com/acmergfunctions). analysing asns in this way has the advantage of disassembling these sites according to explicit and implicit dimensions that steer users’ behaviour. this facilitates reflection on how connectivity among users is engineered in these specific academic microsystems.

4.1. researchgate and the imperative of openness
researchgate is a social network service founded in and has attracted more than million members distributed worldwide in countries. the majority of the members, sixty per cent, belong to a wide range of hard scientific disciplines such as medicine, biology, engineering, chemistry, computer science and physics. the stated mission of researchgate is «to connect the world of science and make research open to all».

4.1.1. researchgate socio-economic layer
defined as «a free facebook-style social network aimed solely at scientists worldwide», researchgate is a for-profit company headquartered in berlin that counts more than employees. the company has completed four rounds of financing and over the years has attracted a wide range of external investors, including bill gates and venture capitalists. it has also established agreements with other social platforms, and users can now connect with api-based services like facebook, linkedin and google.

the governance component is mostly managed through the terms and conditions that regulate the provider-user relationship. one of the most important terms concerns the statement on privacy and data protection, which operates in full compliance with german laws. the statement also claims that email addresses are processed solely to send information or notifications about the service, although the company reserves the right to attach a minor share of advertisements for third-party products and services. as for the intellectual property rights of third parties, researchgate can, at its discretion, disable and/or terminate the accounts of users who infringe or repeatedly infringe the copyrights of others in accordance with the digital millennium copyright act (dmca). conversely, users are required to indemnify researchgate in case of copyright infringements.

the business model is largely based on a wide range of free-of-charge services supplemented with subscription-based services like the job openings section for posting job ads.
in addition, in the near future, the company is likely to adopt highly targeted advertising so that scientific conferences can market their events on the platform and companies can advertise products, devices, books and services to scientists, leading the way to an academic-oriented marketplace along the lines of amazon.

researchgate techno-cultural layer

the capability to connect users with each other and to foster communication between them is enabled by the technology component. through a number of invisible algorithms and protocols that execute the programmed social tasks of interaction among members, researchgate implements features to spur users’ connectivity and to channel social interaction, for instance, automatically signalling which other people one may be interested in contacting and adding to his/her network. like facebook and other popular social network sites, researchgate home provides news feeds that allow users to monitor members’ activities like new uploads, new projects, bookmarked publications, comments and shared updates. other features of this kind include prompting endorsements of researchers for their skills and expertise and suggesting new researchers to follow. connectivity and interaction are also fostered through the recommend button, which is quite similar to the like button in facebook, and the follow button for projects and publications. researchgate gives users the option to share bibliographic references to their own work but nonetheless urges them to add full texts, by indicating the number of references that the user has added without full texts and by stressing the advantages that sharing full texts brings for increasing the visibility of one’s work and boosting personal ratings.

as reported in the previous sections, platform usage can be passive, i.e. limited to reading and acquiring information about what others post, or can involve active participation, such as posting new content or activating new connections.
one way of engaging actively in the network is to participate in the questions discussion threads by posing research questions and/or sharing expertise.

finally, at the content level, researchgate affords the publication of diverse types of scientific output. these include not just publications, but also grey literature such as open datasets, drafts, results from failed experiments, and open reviews of papers that users have read or worked with. indeed, a significant feature distinguishing researchgate from other academic networks is the possibility to publish raw data and media files in order to stimulate discussion among interested members. a recently added feature allows users to organise research outputs into projects so that publications and other research outputs are grouped according to research topics. however, the project is also seen by some as something in itself to share and promote, not just as a thematic label for outputs. finally, the timeline has been adopted to showcase research output from start to finish and provides an overview of a researcher’s publishing history, questions, open reviews, and publication comments over time.

researchgate networked-scholar layer

the possibility to build an individual network of contacts is mostly based on the follow feature, through which users can subscribe to others’ updates without this being automatically reciprocated. the follow function gives users access to new and updated information and opportunities to locate relevant expertise. each user’s personal network of following and followers is displayed in her/his profile. the list of co-authors is also displayed in a specific tab, highlighting strong ties in one’s network of contacts. to strengthen ties in their personal network, researchers can also use the recommend resources function to spotlight publications, projects, etc.
the knowledge sharing component chiefly regards the adding or uploading of research products but also includes commenting on publications and projects and asking and replying to questions via the questions feature, which is of value for boosting knowledge and expertise sharing. another key feature for retrieving useful information and maintaining distributed memory is the search function, which allows users to search for fellow researchers, publications, question/answer topics, job announcements, research interests and affiliations; the possibility to browse the existing list of q&as is also useful to these ends. in addition, the personal profile includes a tab for displaying expertise and skills; users can browse this when seeking to locate competences useful for their research.

as stressed in previous sections, motivations to share and perceptions of personal and collective gains have direct consequences on the reputational mechanism and researchers’ identity. user identity is mostly conveyed through the profile, the main feature for constructing visibility and reputation. information displayed in a researcher’s profile includes: a short bio; visualization of research products and projects; list of followers and following; engagement in q&a sessions; and awards and achievements. researchers can also raise their visibility by listing their best publications in a featured research tab.

however, what distinguishes researchgate from other social network sites for academics is its set of proprietary reputation metrics. these can also be seen as community measures of value that encourage user interaction by rewarding the willingness to share knowledge and expertise. in this perspective, researchgate provides three types of scores: rg score, rg reach and h-index. rg score is a metric of scientific reputation calculated by combining two factors.
one is the community’s response to all the research outputs one shares, response being expressed through number of views. the other is the user’s level of overall activity across the platform: q&a participation, number of followers attracted, etc. rg reach gauges the visibility of one’s work on the platform in terms of how many individual researchers are notified when one adds new research. the total reach is calculated by adding the number of direct connections one has to the number of people connected to one’s work as co-authors and project collaborators. lastly, h-index measures the impact of one’s work by considering the number of publications a researcher has published and displayed in his/her profile, and the number of citations these have attracted, regardless of which journal the study was published in. the service provides two separate h-indices for each author: while the first includes self-citations, the second does not. users are encouraged to increase their impact and boost their scores by adding as many publications as possible to their profile; they are also constantly reminded to add full texts that are missing.

academia.edu and the paradigm of sharing

academia.edu is a social networking service founded in that counts almost million accounts and attracts over million unique visitors a month. in contrast with researchgate, the platform is more popular in arts and humanities and to a lesser extent in social sciences and economics (kramer & bosman, ).

academia.edu socio-economic layer

academia.edu is a for-profit company headquartered in san francisco with a small team of people. the platform has recently brought its total equity funding to almost million dollars from a number of capital ventures that have fuelled the growth of the platform and the team.
connection with other social media services involves account sharing (users can log in with third party accounts like google and facebook) and the possibility to tweet the uploading of a new publication.

as for the governance component, the site’s terms of use grant users the right to download, view and print any academia.edu content solely for personal and non-commercial purposes. when posting, uploading, publishing, submitting or transmitting member content, members grant academia.edu a worldwide, revocable, non-exclusive, transferable license to exercise any and all rights under copyright, in any medium. since an account may be linked to other online accounts, relationships with third party service providers are regulated accordingly. as for intellectual property rights of third parties, academia.edu can terminate the accounts of users who may infringe or repeatedly infringe the rights of copyright holders.

the platform’s business model is largely based on provision of a wide range of free-of-charge services that are supplemented by premium accounts, mostly organised around enhanced analytics. these provide additional features like mentions, readers, advanced search and extra analytics, as well as full text search of pdfs and a job board for advertising academic vacancies. academia.edu is also aiming to commercially engage r&d institutions by providing them with trending research data gleaned from algorithms that aggregate the latest high-impact papers in a given research area.

academia.edu techno-cultural layer

the technology component that spurs users’ interactions and the building of new contacts is engineered in a similar way to researchgate. academia.edu’s home provides a constant news feed that updates users on new uploads, bookmarked publications, user actions like joining or commenting on a discussion session and publications recommended by one’s contacts.
the home page also features functions like suggested sessions and suggested academics for increasing one’s connectivity on the basis of similar research interests. moreover, academia.edu members can invite others to join academia.edu using the platform’s automated invitation system. in this light, the agreement with third-party social media services like facebook and twitter makes it possible to automatically follow one’s social media contacts when they sign up to academia.edu.

as far as the usage component is concerned, academia.edu offers a unique feature called sessions that allows users to create a special page where peers and colleagues can leave general comments on papers or line-specific annotations. permission levels depend on whether the contributor is a mutual follower or is outside the author’s network of contacts. restrictions are also applied to language, in so far as the only language allowed is english.

finally, at the content level, academia.edu affords the publication of diverse types of scientific products, including papers, books, book chapters, drafts, but also conference presentations and teaching material.

academia.edu networked-scholar layer

users build an individual network of contacts mostly using the follow feature, through which they subscribe to contacts’ updates without being automatically reciprocated. the list of each user’s followers, following and co-authors can be accessed via their profile by clicking on separate links; they are not automatically displayed there.

the knowledge sharing component chiefly regards the adding or uploading of research products such as publications, drafts and teaching materials. however, it also includes contributing to sessions pages, where users can leave general comments on papers or line-specific annotations.
a key feature for retrieving useful information and maintaining distributed memory is the search function, which allows users to search for papers, people, research interests and affiliations (full text search of pdfs is also available, but only for premium accounts).

the profile feature, where user identity is mostly conveyed, displays various information: a short bio; research interests; contact details; number of followers, following and co-authors; and lists of research products. in terms of reputation, the profile also includes a ‘total views’ tally, a ‘top’ percentile designation and an author rank which is a function of the paperranks of the papers on the user’s profile. in order to enhance visibility, users can also benefit from adding social media profiles, such as twitter, facebook, google scholar, linkedin and skype to their profile.

the service also provides an analytics dashboard, which gives the user an overview of how others have interacted with their own publications (number of visitors, views and downloads, as well as the related countries and cities of provenance, etc.). however, detailed analysis is accessible only with a premium account. all the data about visits to one’s profile and papers can be exported in a csv file, which can then be opened in a spreadsheet program for further analysis. the service also periodically emails users to alert them to paper views, with subject lines like «someone just searched for you on google and found your page on academia.edu. to see what city they came from and what paper they viewed, follow the link below».

discussion and conclusion

although academic social network sites present features similar to those of general social network sites (ellison & boyd, ), they are also specific socio-technical systems that offer unique technological affordances to researchers and scholarly communities.
the aim of this study was to explore asns as social infrastructures for digital scholarship that allow knowledge sharing and scholarly networking at different levels. from this perspective, a multi-layer analysis has been proposed to investigate academic social media platforms like researchgate and academia.edu as microsystems that combine emergent user practices and content with the platform’s organizational level according to three distinct levels, each involving a number of components.

as socio-technical systems, asns foster some forms of social interaction and constrain others, like the ‘questions’ tool in researchgate that does not allow the posting of questions in languages other than english. this means that affordances may come at a price, especially when they are implemented in for-profit services. aggressive marketing policies tied to the for-profit nature of these services have drawn considerable criticism from users. in the past, researchgate was criticised for sending co-authors unsolicited invitations to join the service, unless the author-member explicitly opted out of this. such marketing tactics caused many researchers to boycott researchgate or even to unsubscribe. the company has since changed this policy and now invitations are only sent if a member chooses to invite co-authors to join. more recently, researchgate has attracted strong criticism for harvesting data available on the web, such as affiliations, publication records and full texts, and using these to automatically generate nominal profiles that are not actually owned by the people concerned (van noorden, ).

moreover, powerful as they may sound, asns’ technological architectures are also influencing the factors academia considers relevant in the evaluation of scientific productivity and academic rankings.
for instance, by insisting on the relevance of dashboard analytics, academia.edu has been criticised for reinforcing a culture of incessant self-monitoring and for amplifying and accelerating the logic of self-branding among scholars that is increasingly encouraged by quantification-driven university policies (duffy & pooley, ). other scholars question whether rg scores are creating ghost academic reputations while progressively taking on the role of assessing scholars’ reputation (orduna-malea, martín-martín, thelwall, & lópez-cózar, ). indeed, rg score has been criticized for having questionable reliability and an opaque calculation methodology that makes it hard to compare it with other popular standard scores (nicholas, herman, & clark, ; thelwall & kousha, ).

moreover, while these systems are open and increasingly encourage open sharing, they have to reconcile open access and sharing policies with their business models and corporate governance. more generally we are observing a clash between the rhetoric of open science and «the profit motive [that] is fundamentally misaligned with core values of academic life, potentially corroding ideals like unfettered inquiry, knowledge-sharing, and cooperative progress» (pooley, , n.p.). in this light, how asns reconcile open access and sharing policies with their business models and corporate governance needs to be carefully considered in researchers’ practices and future research in the field.

despite these criticisms, forms of researchers’ technological appropriation and transformation should be considered in future research. an in-depth analysis of the socio-technical affordances these platforms offer for supporting digital scholarship should be carried out, considering that at present empirical research on the use of asns in scholarly communities seems to have mostly attracted attention in the library and information sciences.
while the majority of these studies focus on the general uptake or impact assessment of alternative metrics, very few have investigated the individual and collective scholarly practices that asns support from a networked learning perspective (manca, submitted). while this tendency may be attributed to the longstanding interest within library and information sciences in scholarly communication (borgman, ), the educational technology field could contribute by helping recompose the fragmented picture of studies concerned with digital scholarship into a cohesive research field (raffaghelli, cucchiara, manganello, & persico, ).

references

bijker, w. e., hughes, t. p., & pinch, t. j. (eds.) ( ). the social construction of technological systems. cambridge, ma: the mit press.
borgman, c. l. ( ). scholarship in the digital age. cambridge, ma: the mit press.
borrego, a. ( ). institutional repositories versus researchgate: the depositing habits of spanish researchers. learned publishing, ( ), - .
boyer, e. l. ( ). scholarship reconsidered: priorities of the professoriate. san francisco, ca: carnegie foundation.
carpenter, j. p., & krutka, d. g. ( ). engagement through microblogging: educator professional development via twitter. professional development in education, ( ), - .
costa, c. ( ). double gamers: academics between fields. british journal of sociology of education, ( ), - .
donelan, h. ( ). social media for professional development and networking opportunities in academia. journal of further and higher education, ( ), - .
dopfer, k., foster, j., & potts, j. ( ). micro-meso-macro. journal of evolutionary economics, ( ), - .
duffy, b. e., & pooley, j. d. ( ). “facebook for academics”: the convergence of self-branding and social media logic on academia.edu. social media + society, ( ), - .
ellison, n. b., & boyd, d. ( ). sociality through social network sites. in w. h. dutton (ed.), the oxford handbook of internet studies (pp. - ). oxford, uk: oxford university press.
fox, a., & bird, t. ( ). #any use? what do we know about how teachers and doctors learn through social media use? qwerty. open and interdisciplinary journal of technology, culture and education, ( ), - .
fu, p.-w., wu, c.-c., & cho, y.-j. ( ). what makes users share content on facebook? compatibility among psychological incentive, social capital focus, and content type. computers in human behavior, , - .
fulk, j., & yuan, y. c. ( ). location, motivation, and social capitalization via enterprise social networking. journal of computer-mediated communication, ( ), - .
genoni, p., merrick, h., & willson, m. ( ). the use of the internet to activate latent ties in scholarly communities. first monday, ( ).
greenhow, c., & gleason, b. ( ). social scholarship: reconsidering scholarly practices in the age of social media. british journal of educational technology, ( ), - .
haythornthwaite, c. ( ). social networks and internet connectivity effects. information, communication & society, ( ), - .
hoffmann, c. p., lutz, c., & meckel, m. ( ). a relational altmetric? network centrality on researchgate as an indicator of scientific impact. journal of the association for information science and technology, ( ), - .
huysman, m., & wulf, v. ( ). it to support knowledge sharing in communities, towards a social capital analysis. journal of information technology, ( ), - .
kelly, n., & antonio, a. ( ). teacher peer support in social network sites. teaching and teacher education, , - .
kimmons, r., & veletsianos, g. ( ). education scholars’ evolving uses of twitter as a conference backchannel and social commentary platform. british journal of educational technology, ( ), - .
kling, r., mckim, r., & king, a. ( ). a bit more to it: scholarly communication forums as socio-technical interaction networks. journal of the american society for information science and technology, ( ), - .
kramer, b., & bosman, j. ( ). swiss army knives of scholarly communication - researchgate, academia, mendeley and others. presentation at stm innovation seminar . retrieved from http://www.stm-assoc.org/events/innovations-seminar- /
kuo, t., tsai, g. y., wu, y-c. j., & alhalabi, w. ( ). from sociability to creditability for academics. computers in human behavior, , - .
li, j., & greenhow, c. ( ). scholars and social media: tweeting in the conference backchannel for professional learning. educational media international, ( ), - .
lundin, m., lantz-andersson, a., & hillman, t. ( ). teachers’ reshaping of professional identity in a thematic fb-group. qwerty. open and interdisciplinary journal of technology, culture and education, ( ), - .
macià, m., & garcía, i. ( ). properties of teacher networks in twitter: are they related to community-based peer production? the international review of research in open and distributed learning, ( ), - .
manca, s. (submitted). researchgate and academia.edu as networked socio-technical systems for scholarly communication: a literature review. research in learning technology.
manca, s., & raffaghelli, j. e. ( ). towards a multilevel framework for analysing academic social network sites: a networked socio-technical perspective. in a. skaržauskienė & n. gudelienė (eds.), proceedings of the th european conference on social media - ecsm , vilnius, lithuania - july , pp. - .
manca, s., & ranieri, m. ( a). editorial. reshaping professional learning in the social media landscape: theories, practices and challenges. qwerty. open and interdisciplinary journal of technology, culture and education, ( ), - .
manca, s., & ranieri, m. ( b). networked scholarship and motivations for social media use in scholarly communication. the international review of research in open and distributed learning, ( ), - .
nández, g., & borrego, a. ( ). use of social networks for academic purposes: a case study. the electronic library, ( ), - .
nicholas, d., herman, e., & clark, d. ( ). scholarly reputation building: how does researchgate fare? international journal of knowledge content development & technology, ( ), - .
nicholas, d., herman, e., & jamali, h. r. ( ). emerging reputation mechanisms for scholars. brussels: european commission, joint research centre, institute for prospective technological studies.
niyazov, y., vogel, c., price, r., lund, b., judd, d., akil, a., mortonson, m., schwartzman, j., & shron, m. ( ). open access meets discoverability: citations to articles posted to academia.edu. plos one, ( ): e , - .
pooley, j. ( ). scholarly communications shouldn’t just be open, but non-profit too. lse impact blog, august , . retrieved from http://blogs.lse.ac.uk/impactofsocialsciences/ / / /scholarly-communications-shouldnt-just-be-open-but-non-profit-too/
orduna-malea, e., martín-martín, a., thelwall, m., & lópez-cózar, e. d. ( ). do researchgate scores create ghost academic reputations? scientometrics, ( ), - .
raffaghelli, j. e., cucchiara, s., manganello, f., & persico, d. ( ). different views on digital scholarship: separate worlds or cohesive research field? research in learning technology, ( ), - .
ranieri, m., manca, s., & fini, a. ( ). why (and how) do teachers engage in social networks? an exploratory study of professional use of facebook and its implications for lifelong learning. british journal of educational technology, ( ), - .
stewart, b. e. ( ). in abundance: networked participatory practices as scholarship. international review of research in open and distributed learning, ( ), - .
thelwall, m., & kousha, k. ( ). researchgate: disseminating, communicating, and measuring scholarship? journal of the association for information science and technology, ( ), - .
utz, s. ( ). is linkedin making you more successful? the informational benefits derived from public social media. new media & society, ( ), - .
van dijck, j. ( ). the culture of connectivity. a critical history of social media. oxford, uk: oxford university press.
van noorden, r. ( ). online collaboration: scientists and the social network. nature, , - .
veletsianos, g., & kimmons, r. ( ). networked participatory scholarship: emergent techno-cultural pressures toward open and digital scholarship in online networks. computers & education, ( ), - .
veletsianos, g., & stewart, b. ( ). discreet openness: scholars’ selective and intentional self-disclosures online. social media + society, ( ), - .
weller, m. ( ). the digital scholar. how technology is transforming scholarly practice. london/new delhi/new york/sydney: bloomsbury.
wagner, c. ( ). the new invisible college: science for development. washington, dc: brookings institution press.
williams, r. & edge, d. ( ). the social shaping of technology. research policy, ( ), - .

a laboratory as the infrastructure of engagement: epistemological reflections

article

how to cite: pawlicka-deger, u a laboratory as the infrastructure of engagement: epistemological reflections. open library of humanities, ( ): , pp. - . doi: https://doi.org/ . /olh.
published: october
peer review: this article has been peer reviewed through the double-blind process of open library of humanities, which is a journal published by the open library of humanities.
copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.
open access: open library of humanities is a peer-reviewed open access journal.
digital preservation: the open library of humanities and all its journals are digitally preserved in the clockss scholarly archive service.

urszula pawlicka-deger, aalto university, fi; king’s college london, gb
pawlickadeger@gmail.com

today’s big challenges (the covid- pandemic, climate change, migration, and refugee crises) are global in scale, transcending geographical, national, and cultural boundaries, but responded to at the local level. it has therefore become necessary to reflect on the following questions: what kind of new forms of organizations are needed to tackle real-world problems? how can we enhance the humanities as a responsive field with the ability to translate knowledge into actions? how can we design a better humanities laboratory that is more attuned to contemporary challenges? the social labs as innovative institutions have opened up new epistemological directions for understanding a lab as a platform for addressing complex issues. a laboratory can be understood as a way of thinking and acting that entails new social practices and new research modes. drawing on social lab theories, critical infrastructure studies, and digital humanities infrastructure theories, this essay aims to present a new theoretical approach to conceptualizing a laboratory in the humanities. i discuss two epistemological perspectives represented by bruno latour and graeme gooday in order to disclose the power of the laboratory. next, i present the principles and network structure of social labs. then, i introduce the concept of the infrastructure of engagement as a new analytical framework for understanding a laboratory as a site of intervention for the humanities as they are involved in addressing pressing global problems.
based on the humanities action lab, i seek to reimagine a laboratory guided by the principles of collaborative infrastructure, participatory approach, and public engagement.

introduction

the term ‘infrastructure’ brings to mind images of heavy material things, such as bridges, roads, railways, sewers, and other public services that enable daily functions, communication, and transportation. this common sense definition is not, however, as obvious as it may seem. in seminal works on ethnographic studies of infrastructure, sociologist susan leigh star revealed the complexities, multiformity, and multilayers of infrastructure described as ‘something that other things “run on,” things that are substrate to events and movements’ ( : ). the infrastructure comprises not only heavy operations but also soft materials, such as intellectual and institutional structures and documentations (e.g., bureaucratic forms, technical protocols, work spaces, and equipment). infrastructure is thus a flexible term concerning situation, context, and purpose. in the following essay, i use star’s way of conceptualizing the infrastructure and refer to it as a ‘fundamentally relational concept, becoming real infrastructure in relation to organized practices’ ( : ). infrastructure is used here as an analytical construct through which we can follow practices and values shaped and informed by intellectual and institutional substrates. i also refer to shannon mattern’s analysis of infrastructure as a critical scaffolding used to address critical issues, including environmental health, the distribution of public resources, and social justice.
the essay that follows deals with one form of infrastructure, that is, a laboratory: a place equipped with specialized instruments and technologies for experimental and practical studies that require manual skills as well as conceptual knowledge for their construction and deployment (hannaway, : ). a laboratory has played a central role in science and technology studies (sts) since, during the s, sociologists (latour and woolgar, ; latour, ; knorr cetina, ; lynch, ) began ethnographic studies of laboratories aimed at disclosing the mechanism of scientific work and the power of this place in constructing reality. the lab has become a gateway for understanding how scientific knowledge is produced. from this moment forward, sts and the history of science have seemed impossible to accomplish without investigating this scientific space. while the study of scientific labs is well-grounded, laboratories in the humanities have remained largely underexplored. thus far, only a few researchers have studied laboratories used for the humanities from the socio-material infrastructure perspective (emerson et al., n.d.; svensson, ; smithies et al., ; foka et al., ; smithies and ciula, ; pawlicka-deger, ; pawlicka-deger, n.d.).

laboratories have entered the humanities as a new research infrastructure, transforming the humanities into an experimental, collaborative, and technology-driven field. this well-established institutional model of science has been adopted as a place that provides equipment and technologies for supporting the application of digital tools and methods to the humanities. the spread of the concept of a laboratory into city spaces and cultural institutions has significantly transformed the definition and realization of a lab as a physical place rooted in the scientific tradition.
social labs, community labs, and citizen labs have opened up new epistemological directions for understanding labs as platforms for addressing complex challenges through prototyping-based techniques (hassan, ). a laboratory goes beyond the concept of a fixed place involving material instruments and hands-on scientific exploration, becoming, instead, a broadly understood project set up for a specific period, which can be managed without any equipment. a laboratory is thus imagined as something more than material infrastructure; it is the concept that makes it possible to provoke ‘new ways of engaging with public audiences’ (fhi, n.d.) and ‘change the way faculty and students approach instruction and research’ (asu, n.d.). a laboratory can be conceptualized as a way of thinking and acting that entails new social practices and new research modes. therefore, a lab can be established anywhere. the only condition for creating a lab is community: a lab is constituted by, and for, the people gathered together to address particular challenges. in light of these conceptual changes, many new questions arise as to how to understand emerging lab models, such as a lab as a challenge-centric space, a lab as a coalition, and a lab as a distributed network, and how to reimagine a laboratory in the humanities as a site of intervention in public spaces and social problems. the term ‘humanities labs’ is used here to cover entities that engage humanities knowledge and methods of inquiry; these include both institutional, technology-based labs and conceptual, non-digital labs. i pose the following questions, which aim to delve into the epistemology of laboratories, and discuss them within a framework that approaches infrastructure as a relational concept: how does a laboratory grow from a physical work space into actions taken around challenges?
how can changes be accomplished through the humanities infrastructure? how can we design a better humanities lab, more attuned to the challenges of today’s world? this essay aims to present a new theoretical perspective on a laboratory in the humanities that draws on sts, digital humanities infrastructure theory, and social lab theories. i seek to construct a new analytical tool for reconceptualizing infrastructure, which represents not simply a material thing but, more broadly, an intellectual structure that constitutes the way that we approach space, people, and challenges. as paul dourish and genevieve bell state, ‘infrastructure is analytically useful, both because it is embedded into social structures, and because it serves as a structuring mechanism in itself’ ( : ). the purpose is thus to investigate a laboratory through this dual perspective to see how the lab has become critical infrastructure through practical organization (structure and operation) and activities (experiment, production, and transfer of work beyond laboratory walls). i attempt to go beyond the prevailing discussion of a laboratory as a research infrastructure to conceptualize it as the infrastructure of engagement inspired by social labs. referring to social lab theorists, i thus propose to conduct an epistemological experiment that aims to reframe laboratories for the humanities in the vein of social labs and use the perspective of the infrastructure of engagement as a critical lens for analysing a laboratory’s structure and action.

the humanities and the concept of laboratories

a lab history in the humanities dates back to the s when the first laboratories, serving subjects other than the natural sciences, were established in media studies. for instance, the laboratory paragraphe at the university of paris , france was established in , the media lab at mit, u.s. in , the digital writing and research lab at the university of texas at austin, u.s.
in , and the aalto media lab at aalto university, finland in . media labs were launched as production and experimental research spaces and studios. the first laboratories aimed to foster the creation of media projects which explored the impact of technology on society and the human condition, developed hardware and software within the context of artistic projects, and tested the potential of electronic technologies. concurrently, in the late s the word ‘lab’ was applied to humanities and technology, for example, with the humlab at umeå university, sweden, which was established in , and the stanford humanities lab in the u.s., founded at stanford university in . although the institutional models for these units were not new, their conceptualization as ‘labs’ was an original move. nevertheless, the end of the twentieth century and the beginning of the twenty-first century were dominated by media labs, including, for example, the speculative computing laboratory at the university of virginia, u.s., la camera ottica at the university of udine, italy, and liacs media lab at leiden university, netherlands (pawlicka-deger, ). in , the american council of learned societies released a report on deploying cyberinfrastructure for the humanities and social sciences (acls, ). it was a specific call for developing a new research infrastructure for the humanities and social sciences that would enable new learning and teaching through digital tools and technologies. at the time, digital technologies had gradually entered the humanities, entailing new forms of scholarly communication and knowledge production and giving rise to the field of digital humanities (dh). the term ‘cyberinfrastructure’ was thus brought from science and engineering over to the humanities domain, a move that was not without consequences for embedding humanistic infrastructure.
the borrowed term was not simply a new entry in the humanistic dictionary but rather entailed the transfer of the science, technology, engineering, and mathematics (stem) model for the infrastructure to the humanities and social sciences (pawlicka, ). the evidence that the stem notion of infrastructure was being deployed came in the form of a call for establishing a laboratory as an institutional innovation for the humanities which would foster collaborative and experimental digital scholarship. as stanford university computer scientist marc levoy states: once humanities faculty began using the laboratory in their research, they would also find creative ways to fold its technology into their teaching—for example, through project-based assignments in upper-level courses. this would bring humanities students into the lab, some of whom have dual backgrounds, and so could help run the lab (acls, : – ). the above-mentioned report stimulated the rise of a laboratory in the humanities, modelled on stem practices, as a condition for deploying digital tools and technologies in scholarly research. viewed in this context, laboratories were introduced as the infrastructure for facilitating and propelling the advancement of dh. the field of dh was both a beneficiary of the emerging laboratories and a leader in developing new spaces for inquiries modelled upon the techno-science type of labs. the concept of the laboratory was thus foregrounded as an innovative infrastructure for developing technology-based, collaborative, and experimental research.
many labs have been established in the humanities departments and libraries as a research infrastructure aiming to provide the space, community, and resources necessary for employing computational methods for art and humanities questions, including, for example, the scholars’ lab at the university of virginia library, the franke family digital humanities laboratory at the sterling memorial library of yale university, and the price lab for digital humanities at the wolf humanities center at the university of pennsylvania. the process of advancing cyberinfrastructure and setting up laboratories may be framed as the first wave of the ‘infrastructure turn’ in the humanities (rockwell, ). it involves a deep understanding of the role and operation of organizational structures for the humanities, which not so long ago were considered to be a set of fields without infrastructure. to comprehend the concept of infrastructure, geoffrey rockwell applied an infrastructure paradigm representing heavy organizational structures, such as roads and sewers, to research practices and explained the need for both physical and virtual infrastructures for the humanities (rockwell, ). the first wave aimed at creating infrastructural opportunities and producing the substrates of research organizational structures for the humanities, including digital services, tools, and laboratories. the movement towards the establishment of labs constituted a part of the infrastructure turn and was determined by the following three motivations. the first incentive was to construct a physical place for the assembly of computers and other technological devices and thereby to provide the environment needed for facilitating digital scholarship. this aspiration led to the techno-science and work station models for humanities labs.
the second impetus for accelerating the development of infrastructure was to add team-based work and interdisciplinarity to humanities practices. the existing research infrastructure, including reading rooms in libraries and scholars’ offices, supported a model of individual research conducted in isolation. the individualist mode of humanities knowledge production hindered cross-disciplinary communication and collaboration. to facilitate team-based and cross-disciplinary work, researchers increasingly advocated for developing laboratories as a place fostering collective work (arac, ) and intellectual exchange (davidson, ), ‘building opportunities for the reciprocal interrogation and bonding that lies at the heart of what the humanities can contribute to democracy’ (dubois and jenson, ), and establishing new relationships between people in the humanities and sciences (joselow, ). the last motivation, stemming from the previous argument, was to provide a common place to energize collaborative practices. this space, referred to variously as a trading zone (galison, ), a contact zone (pratt, ), and a meeting place (svensson, ), was meant to allow people representing various disciplines and epistemic traditions to gather in a common place to share and inspire creative ideas. the laboratory emerged in the humanities in response to structural inefficiencies: the existing infrastructure was insufficient to meet the institutional requirements resulting from deploying computational tools and digital technologies. the laboratory concept was thus adapted to the humanities based on the models of techno-science labs, computer science labs, and media labs. the lab has been conceptualized as a physical place for providing equipment and technologies and for conducting hands-on work.
this perspective, however, reduces the idea of a laboratory to a site where experiments are carried out inside a controlled environment without any connection with the world outside the room. the laboratory concept has much greater applicability beyond its work station model. new forms of laboratories have thus emerged, disrupting the precursor models and opening up a discussion about the purpose of humanities labs for current and future research, teaching, and societies. humanities labs have therefore been reimagined as an innovative platform for addressing pressing social concerns in the vein of the social and community labs. one example is the humanities action lab (hal) led by rutgers university-newark. it represents a laboratory model that operates as a coalition of various universities, issue organizations, and public spaces working collaboratively on the same initiative for a given period of time to produce community-curated public humanities prototypes on urgent social issues. students and stakeholders in cities across the world develop local chapters of exhibits, web projects, public programs, and other platforms for civic engagement. afterward, projects travel to museums, public libraries, cultural centres, and other spaces in each of the communities that helped to create them. the goals of the hal are to develop a new perspective on cross-institutional and cross-sector collaboration, foster a national and international exchange of local experiences, and produce systemic knowledge on a particular social challenge (hal, n.d.). the hal is based on the principles of inclusivity, the action-oriented method, the systemic approach, and citizen engagement. this innovative laboratory model aims to intervene in urgent social problems through the practices of the engaged humanities.
the hal initiative, officially founded in , grew out of the guantánamo public memory project, which was launched in and hosted by columbia university. in , nine universities initially came together to establish this project in order to interrogate the history of the us naval base at guantánamo bay, cuba, and foster public dialogue on the urgent questions it was raising in the present: ‘the project was motivated by a concern that the vitriolic public debate over “closing guantánamo” was severely limited by ignorance of how guantánamo had been “closed” before, and how it could open again’ (gpmp, n.d.). the community-curated memory project eventually involved over students from universities who collaborated with more than community stakeholders, including haitian refugees, former service people, and attorneys representing current detainees, in order to research and document the history of guantánamo from multiple perspectives. together they created a traveling exhibit, web platform, digital and physical archive, interview collection, and a series of public dialogues. the exhibition travelled for more than three years to cities and initiated public dialogues in each place. ultimately, after this impressive initiative, the nine original universities formed the humanities action lab to advance the public humanities and foster civic engagement. at the time of its establishment, growing numbers of people were demanding that the united states reckon with incarceration’s past and the construction of the carceral state before planning new reforms. ‘after four decades of feverish imprisonment, a remarkable bipartisan consensus had emerged that mass incarceration had failed and must be dismantled’ (chettiar and waldman, ; cited in Ševčenko, ). in this context, the hal launched its first project, states of incarceration, which was centred on the past, present, and future of incarceration in the united states.
the project brought together over people in states, who studied the explosion of prisons and incarcerated people in the united states and its global dimensions. the action also involved people affected directly by incarceration in these states, as each state explored a history of incarceration within its own communities, from angola’s slave plantation-turned-prison in louisiana, to the legacy of the dakota wars for native americans incarcerated in minnesota, to immigration detention at ellis island and elizabeth, new jersey. in the end, the participants together created the project ‘states of incarceration: a national dialogue of local histories’, which encompassed a series of public dialogues, a digital platform, and a national traveling exhibition: all the pieces—each featuring combinations of historic images, audio interviews, videos, and artwork—were compiled by a designer into a single national physical and digital exhibit that launched in new york city in april and is traveling to each of the communities that contributed to it, accompanied by public dialogues at each stop, through at least early (states of incarceration, n.d.). following the success of its previous ambitious projects, the hal launched its next three-year initiative on climate and environmental justice in , which brought together communities from around the world. as the project’s website states: the initiative on climate and environmental justice will address this issue by sharing stories and strategies from the communities who bear the greatest impact, while contributing the least to environmental degradation. by exploring the roots of climate and environmental justice, this project seeks to: center frontline communities, raise awareness, build political efficacy, and develop mechanisms for accountability (climates of inequality, n.d.).
the hal has continued to grow significantly by engaging more and more universities and organizations outside the u.s. its most recent initiative at the time of writing, on climate and environmental justice, involves partners from the university of puerto rico in mayagüez, the universidad autonoma metropolitana-cuajimalpa in mexico city, and the royal institute of technology in stockholm. in the face of the covid- pandemic, the hal has organized a set of events called climates of inequality and covid: stories from frontline communities. the initiative has aimed to build a collective mass-listening project designed to record the impacts of covid- on frontline communities. the hal is turning into an influential and powerful global laboratory that launches projects as interventions in public discourse. this laboratory model is distinguished by the following features: the focus on a locally-grounded global problem undertaken from a broad perspective; interdisciplinary and vertically integrated projects; the application of participatory and prototyping techniques; strong collaboration across disciplines, institutes, and sectors; and a pedagogical mission realized through innovative lines of inquiry and new ways of engaging students and scholars with public audiences. the humanities action lab represents a unique type of laboratory that emphasizes the power of the humanities and engages the field in pressing global challenges. this seminal case offers a new vision of the laboratory as the infrastructure that supports the engagement of the humanities in real-world problems and bridges the gap between the university and the public. it has also shown how the concept of the laboratory has been extended beyond a techno-science-based place and reimagined as a site for interventions in social concerns.
this lab model has provoked new questions, not yet explored in the humanities debates, about the epistemology of network-based labs and the understanding of infrastructure, conceptualized not only as heavy and physical materials but also as something that emerges from practices and connects people and activities. the relational character of infrastructure makes it possible to constantly form and shift connections across space and time. infrastructure will thus constitute a critical lens for unlocking the potential power of laboratories. the question that arises at this point is: why has a lab been seen as a site imbued with certain forms of power?

disclosing the power of the laboratory

in the s, laboratory studies emerged as the field concerned with ethnographic and epistemological investigations of scientific laboratories, with the goal of understanding the process of scientific knowledge production and the relationships between science and society. this nascent discipline with sociological and philosophical inclinations arose from the studies of present-day laboratories in the s- s. numerous researchers contributed to the move towards considering science as a social practice (pickering, ) with the aim of demystifying scientific practices and investigating the network of actors taking part in the construction of scientific knowledge, i.e., space, material instruments, technology, and community. some publications, in particular, were the driving forces behind constituting laboratory studies, such as laboratory life ( ) by bruno latour and steve woolgar, the manufacture of knowledge ( ) by karin knorr cetina, art and artifact in laboratory science ( ) by michael lynch, and science in action ( ) by bruno latour. a laboratory as a space for scientific experiments was considered an epistemically special place, since it was an important agent of scientific development (knorr cetina, : ).
early ethnographic studies of the laboratory aimed to examine its inner construction and understand the factors contributing to its power. the approach to laboratory studies has changed in recent decades as a result of the spread of the idea of the laboratory into public space. latour, in his seminal article ‘give me a laboratory and i will raise the world’, sought to understand the positioning of laboratory practice to answer a significant question: ‘why it is that in the laboratory and only there new sources of strength are generated?’ ( : ). as a starting point, latour referred to knorr cetina’s provocative statement that ‘nothing extraordinary and nothing “scientific” was happening inside the sacred walls of these temples’ ( : ). if nothing special was happening inside the lab, to repeat latour’s inquiry, why was the laboratory seen as the lever with the capacity to raise the world? based on microstudies of pasteur’s lab, latour provided an argument for understanding the process of how ‘the laboratory gains strength to modify the state of affairs of all the other actors’ ( : ). latour argued that a laboratory action entails the dissolution of the inside and outside dichotomy by transferring the knowledge from inside the lab, where it was produced, to the world outside the lab, where it is applied. as mads dahl gjefsen and erik fisher explain, he argued that the laboratory has been a lever for action in and on the outside world, whereby its resources depended on capturing the interests of external actors, who assessed its performance through forms of verification that were distinct from the representational language used in the laboratory itself ( : ). experts gather within laboratory walls, and these experts conduct experiments and measurements, with room for iterations and mistakes.
as latour notes, ‘it is simply that they can make as many mistakes as they wish or simply more mistakes than the others “outside” who cannot master the changes of scale’ (latour, : ). through the multiplication of trials and errors, they produce the invisible knowledge that becomes translated into words and visible as written text spread in the world outside the laboratory. the experts, the network of actors, and the process of inscription make a laboratory a powerful and successful place that can invert the hierarchy of forces. therefore, according to latour, the lab’s power lies in the knowledge that is manipulated and projected in the lab by the group of people who, in the lab environment, gain the strength of experts. the transfer of such knowledge produced in a closed, controlled environment is the lever that has the capacity to change the world. referring to robert e. kohler’s analysis, latour showed that ‘the power of modern labs lies neither in some essential internal features nor in their social symbolism, but solely in the knowledge that labs project into the world: their exports are the levers that move worlds’ ( : ). the laboratory studies of the s and s played a significant role in understanding scientific practices and knowledge production; however, the study has met with criticism and accusations of having a reductive view of social reality. as gjefsen and fisher argue, ‘criticisms of early laboratory ethnography were grounded in the perceived limitations on explanatory power imposed by studying the laboratory in isolation from broader societal structures and context’ ( : ). the shift towards an inclusive perspective of a laboratory has emerged along with the development of laboratories outside the scientific environment in public space and cultural institutions.
the rise of social labs and citizen labs has begun a new chapter in the history of laboratory studies by incorporating social situatedness. in , isis: a journal of the history of science society released a special issue with a focus on laboratory history. this special issue brought together well-recognized historians and sociologists of science (graeme gooday, ursula klein, and robert e. kohler) who reviewed the laboratory concept years after the establishment of laboratory studies. as kohler observed, after the s and s, which was seen as a productive time for the field, the laboratory was neglected until interest was revived again in the twenty-first century ( : ). indeed, the interest in the laboratory was enlivened, giving new impetus and direction for its exploration. gooday, in the article ‘placing or replacing the laboratory in the history of science?’ ( ), proposes an alternative to the interpretations of laboratories, which have changed and diversified in recent decades. as gooday claims, ‘originally, a laboratory could be a site of organic growth or material manufacture, but it can now be a specialized domain for technological development, educational training, or quality testing’ ( : ). this observation is a starting point for the discussion on the permeable or non-existent boundaries between laboratories and other spaces which are now arranged into experimental laboratories without being designated for such a purpose. the extension of the laboratory idea to the public space demands that the laboratory be examined from a broad and inclusive perspective. hence, as gooday notes, ‘a lot of historians of science now devote their attention to what went on outside the laboratory: on the theatre platform or in the museum, auditorium, exhibition, or home’ ( : ).
situating laboratories associated with scientific activities in the public domain raises a question concerning the reconceptualization of the role of science in society. scientific experts are now replaced by citizens in white coats who gather together in a common laboratory space to produce knowledge and influence local challenges. in light of these changes, gooday suggested a new, inclusive approach to laboratory studies. he claimed that theorists now must seek to understand ‘what constitutes a laboratory, especially in relation to the difficulty of demarcating this scientific space from other less formal sites of empirical making of new knowledge or new artifacts’ ( : ). hence, the sts-situated research purpose, to demystify the power of scientific laboratories, has been shifted towards identifying the epistemological reasons for turning some spaces into laboratories, providing a fruitful and fresh approach for studying the phenomenon of ubiquitous, present-day laboratories that do not resemble their precursors. various spaces have been turned into laboratories for interventions in social challenges. the laboratory imbued with the power to raise the world, recalling latour once again, has become a vehicle for systemic change. the social labs exemplify the extension of the laboratory to the public sector. these models have become powerful and ubiquitous due to their public engagement as well as their collaborative and inclusive infrastructure.

social labs as platforms for systemic change

zaid hassan notes that ‘we have scientific and technical labs for solving our most difficult scientific and technical challenges. we need social labs to solve our most pressing social challenges’ (hassan, a). in this concise statement, hassan aptly describes the goal of social labs through a comparison with scientific laboratories.
while, in techno-science labs, teams of experts are focused on advancing technological innovations, in social labs, diverse stakeholders are concerned with pressing social questions. in the social labs revolution, hassan presents a comprehensive view of social labs, which he defines as platforms for addressing complex social challenges that have three core characteristics: they are social, experimental, and systemic ( a). the rise of social labs, including social innovation labs and citizen labs, stemmed from the need to address urgent and highly complex global concerns, such as climate change, inequality, poverty, migration, slavery, and more. therefore, in recent years, numerous social labs have appeared around the world (hassan, b), and many initiatives have been launched to support an emerging social innovation movement, including journals (e.g., stanford social innovation review) and network communities (e.g., social innovation europe). concurrently, the body of literature on social labs has grown significantly, with the aim of understanding their principles and providing practical suggestions for establishing those labs (edwards, ; torjman, ; hassan, a; kieboom, ; tiesinga and berkhout, ; westley et al., ). the social lab movement has been built upon a series of convictions and beliefs. first, contemporary global challenges are recognized as complex, multi-causal, and unstable problems that cannot be addressed with traditional linear and analytical approaches. the original method for tackling complex problems was focused on treating their micro levels in an orderly and linear way, from identifying the challenge to finding the solution. this narrow approach turned out to be inadequate and ineffective in the face of modern, complex challenges. the handling of complex and nonlinear problems requires holistic, macro thinking rather than linear, micro thinking.
systems thinking was thus proposed as a new approach to finding solutions. as kimberly bowman explains, “systems thinking” tries to take into account the interactions between different parts of a system and understand how together they are effecting change rather than simply trying to understand specific components in isolation. in doing so, systems thinking can be an important part of developing truly sustainable and transformative change (bowman et al., : ). the first conviction concerning social labs is that complex and inter-related challenges, viewed as a system, demand a new systems analysis method for seeking the perspectives of multiple stakeholders and providing solutions for sustainable development. the second belief resulting from the previous premise is that present-day institutions relying on conventional organization and approaches are outdated and inefficient when it comes to addressing complex issues. in traditional thinking, a problem is undertaken by an individual institution focused only on a specific part of the challenge. this problem-solving process is thus insufficient to address the multiple issues simultaneously contained in the problem. therefore, the urgent need to handle complex problems requires the construction of a new social institution to connect diverse stakeholders: public institutions, private organizations, and civil society. this new institution was thus envisioned as a platform or coalition made up of multiple inter-linked actors that would get involved in investigating the system through collaborative and iterative processes. the next conviction underlying the emergence of social labs is related to seeking a new cross-sector and cross-disciplinary institutional model to meet complex challenges effectively. this innovative organization was designed to eliminate sector- and discipline-based silos.
hence, the purpose of the new model was to involve diverse actors in recognizing how different parts of the system were entangled and influenced by each other. eventually, the social lab was formed as a hub that ‘seeks to catalyze emergent innovations in a particular domain—through diverse strategies and interventions’ (tiesinga and berkhout, : ). this experimental institution brings together various actors from academic institutions, cultural institutions, the government sector, private organizations, and civil society to respond collaboratively to complex problems and accelerate social innovation. the last argument for launching social labs stems from the previous statement that the current methods used to tackle complex problems are ineffective and inadequate. in seeking successful tools, social labs have developed new change methodologies derived from social science, design, and management studies. thus, in the labs, the collaborators apply a wide range of methods, including experiments, participatory design, and prototyping. the purpose of these techniques is to engage diverse actors in the process of ‘learning by doing’ and designing solutions together inside the lab through consultations and iterations. as torjman argues, ‘importantly, prototyping enables the team to see and feel a physical object to begin early-stage feasibility planning. failure is embraced; prototypes that do not “work” are part of the process to find those with potential’ ( : ). afterward, the prototype is transferred to, and tested in, the world outside the lab. the authors of labcraft, the important publication on social labs, describe the process of designing and applying the lab’s products to the real world as follows: we translate our hypotheses into prototypes for new or improved solutions to social challenges, often in the form of products, processes, policies, or services.
we test those solutions through their application, often in the form of pilots or trials with users. and we use the results of our tests to iterate and to inform the creation of still-better solutions. and we develop our own strategies and programs through a trial-and-error process of experimenting and prototyping (tiesinga and berkhout, : ). the network structure of these labs is seen as an ‘idea funnel’ (edwards, : ), which is filled by circulating iterative-based solutions between social institutions and public spaces. the sustainable food lab and the parsons design for social innovation and sustainability lab constitute good examples of the social lab models. the first one, the sustainable food lab founded in , operates as a platform for corporations, governments, farmers’ associations, and ngos to work together to accelerate the incorporation of environmental, economic, and social sustainability into the world’s food production systems. sustainable food lab members believe that the industry is facing critical issues that cannot be tackled by one organization. these wicked problems include water quality impacts in every one of the world’s waterways on which farmers grow crops, emissions from the whole food supply chain, and farm labor improvements that require immigration and government policies as well as employment conditions in private businesses (hamilton, : ). this innovative consortium-based lab aims to address climate change, farmer poverty, and soil health through iterative and system analysis methods. the cornerstone of the lab’s approach is to create a ‘kitchen table culture’ where ‘stakeholders can roll up their sleeves, speak candidly and learn from each other’s perspectives […] designing and managing supply chain initiatives and collaborations among food companies, ngo’s, farmer organizations and research institutions’ (sustainable food lab, n.d.).
the lab seeks to transform food production systems through cross-sector partnerships to create outcomes that improve the whole system and solve intractable problems from multiple perspectives. the parsons design for social innovation and sustainability lab (desis lab) is, in turn, an action research laboratory created in at the new school. the lab collaborates with local partners in new york city and with global stakeholders through the desis network, the platform that was established to bring together around design labs based at universities around the world. the goals of the desis lab are to explore the relationship between design and social change, advance the practice and discourse of design-led social innovation, and develop social sustainability through strategic and service design, management, and social theory. the lab launches three-year projects conducted with local partners and disseminated through public exhibitions and toolkits. the first action that the lab undertook, amplifying creative communities, aimed to engage groups of citizens in sustainable and positive living in new york city. these creative communities―community gardens, food co-ops or alternative childcare―seek to solve problems of everyday urban life by promoting simple, but often ingenious, solutions (amplify, n.d.). the laboratory practices are driven by participatory design methods, which actively involve individuals and communities in the collective design process. the desis lab relies on the principles of diverse collaboration, the participatory approach, and heterarchies. the last value, in particular, constitutes the foundation of the lab culture: whereas researchers have focused on social hierarchies and structural asymmetries, little attention has been paid to heterarchies — the lateral forms of collaboration through which social life is constructed.
we promote such interdependent networks, as they generate more opportunities for heterogeneous forms of collaboration (parsons desis lab, n.d.). increasingly complex, nonlinear, and interconnected challenges, such as the above-mentioned food system and social diversity, require a new approach based on renewed commitment and unprecedented collaboration. as lisa torjman states: complex problems cannot be solved by individual entrepreneurs working independently, or even by teams of like-minded specialists. we must engage multi-sectoral expertise in an evidence-based, design-driven approach, to advance solutions to these seemingly intractable challenges. this is where labs come in (torjman, : ). thus, new laboratories have emerged as the collaborative infrastructure for turning ideas into action.
towards the infrastructure of engagement
following the evolution of techno-science labs and social labs, we can notice how the concept of a laboratory has been profoundly transformed from a separate scientific room into a collective chain of coalition-based spin-offs capable of catalysing social change. as a result, laboratory studies have also undergone significant changes from ethnographic investigations of workplaces to tracking a network of labs’ engagement in social challenges. the laboratory has been redesigned as a transformative infrastructure with the power to make changes. the shift in the laboratory concept has resulted from thinking beyond the instrumental infrastructure towards the critical infrastructure intertwined with socio-cultural, political, and technological systems. therefore, to envision a laboratory as something more than a solely technical-research infrastructure, it is necessary to reveal the transformative side of the infrastructure. as paul n. edwards et al.
claim, ‘transformative infrastructures cannot be merely technical; they must engage fundamental changes in our social institutions, practices, norms and beliefs as well’ (edwards et al., : ). rockwell ( ) argues that the infrastructure turn in the humanities was a political move that involved redefining the idea of research in such a way that scholars would be aware of what sort of infrastructure was needed and what should be particularly supported by infrastructure. rockwell calls for reimagining ‘infrastructure not just for professional researchers at universities, but the amateur researchers in the community. if we want long-term political investment, we need to open it up to the community’ ( ). picturing the infrastructure as a bridge is important for enhancing the connection between scholars and facilitating collaborative research; however, as in sewers and power lines, it should also provide utilities for society. the type and scale of organizational structures determine what kinds of services and products are offered to the community. rockwell suggests that we need to explore the social dimension of infrastructure where the values and principles are embodied in the provision of technical support. therefore, large-scale investments in infrastructure for the humanities have evoked the need for the critical investigation of its layers as an inherent socio-cultural system. the turn towards interrogating infrastructure led to the emerging field of critical infrastructure studies as a mode for conducting cultural studies. this nascent field was initiated by alan liu and james smithies through the establishment of a collective of international scholars from different disciplines who continue to build a theoretical foundation for reading culture through the critical lens of infrastructure (cistudies, n.d.).
the researchers have framed a wide scope of critical infrastructure studies that aim to interpret and critique culture at the level of infrastructure, ‘where “infrastructure,” the social-cum-technological milieu that at once enables the fulfilment of human experience and enforces constraints on that experience, today has much of the same scale, complexity, and general cultural impact as the idea of “culture” itself’ (liu, : ). the study offers many different approaches to infrastructure explored from the perspectives of the dh, sts, media studies, and feminist studies, to name just a few. this newly established field has called for reinterpreting culture and society through infrastructure layers in such a way that the substrates reveal themselves as agents with the capacity to both drive and constrain the process of knowledge creation. the last matter was a particular focus of a workshop held at the university of michigan school of information in . an international group of scholars representing different domains was brought together to debate the operation and challenges of knowledge infrastructure. one theme of the discussion was devoted to the issue of how different channels of knowledge distribution encode and reinforce existing interests and relations of power. as we can read from the report: all infrastructures embed social norms, relationships, and ways of thinking, acting, and working. as a corollary, when they change, authority, influence, and power are redistributed. knowledge infrastructures are no different; they create tensions and raise concerns that are best addressed early and often. new kinds of knowledge work and workers displace old ones; increased access for some may mean reduced access for others (edwards et al., : ). to disrupt the uneven spread of knowledge, the scholars proposed involving citizens in the process of the co-production of infrastructure.
referring to the scandinavian participatory design movement, the researchers stressed the role of public engagement in designing infrastructure as well as the very notion of infrastructure being involved in socio-political matters. these reflections rightly show the feedback loop between infrastructure and the knowledge represented in its design. going further along this path, we can deduce that infrastructure has the power to create the conditions for producing certain knowledge and values. drawing upon lessons on infrastructural agency, scholars have called for engagement in critical reflections on different forms of infrastructures: institutional, material, social, and digital. the concept of infrastructure is thus employed as a useful lens for analysing socio-cultural concerns. one way of applying infrastructural literacy (mattern, ) is to envision a new organizational structure capable of disrupting current constructions. the discourse involved in reimagining infrastructure for the humanities has been taken up particularly by north american feminist scholars. it is worth recalling two panels at digital humanities conferences organized by the alliance of digital humanities organizations (adho), which were both devoted to the issue of rebuilding infrastructure in the vein of feminist thinking: creating feminist infrastructure in the digital humanities ( ) and reimagining the humanities lab ( ). while the first discussion was focused on addressing the social and relational aspects of infrastructure, the second panel aimed at disrupting the movement towards positioning digital humanities labs in line with science labs. the scholars proposed to rebuild the lab ‘as a site for humanistic rather than scientific work’ ( ) and have that building process rely on the values of generativity, legibility, and creativity.
in addition to these adho conference panels, when initiating a discussion on reconsidering the lab’s structure and mission, the university of colorado, boulder organized the symposium what is a feminist lab? in . it was the first event dedicated entirely to feminist approaches to interpreting and reconstructing a laboratory. the talks revolved around building a laboratory in line with feminist thinking as well as designing the infrastructure through the lens of feminist studies (e.g., tara mcpherson discussed software and digital projects design motivated by feminist intent). significant principles of feminist thinking in reconstructing infrastructure are the focus on its transparent construction, the culture that has emerged from its implementation, and its co-creation of knowledge based on the values of collectivity, equality, and inclusivity. the participants in such infrastructure become aware of how knowledge and resources are manufactured and obtain access to an environment ripe with opportunities for disrupting and reconstructing such a system. a laboratory drawn upon the feminist approach represents a space for critical engagement and new modes of knowledge production that are ‘collaborative, experimental, emergent, and responsive’. as the organisers of what is a feminist lab? argue: despite the novel forms taken up by labs for some time, their colonial and masculinist lineages have not yet been properly addressed. this is a significant omission, as inherited lab models can have direct influence not only on the types of projects taken up by a lab, but on its overall ethos, the way equity is handled in its organization, its epistemic models, as well as on the accessibility of its methods and outputs (what is a feminist lab? ).
the feminist perspective draws attention to a lab culture shaped by cooperative models of labour and credit, critical practices, including a speculative design that aims to imagine new futures, and active engagement in social transformations by producing community-curated projects, shaping new cultural policy and embodying the values of equality and inclusivity in the design of digital tools and projects. the following two labs, built by two recognized feminist scholars in digital humanities, are good examples of labs based on the feminist approach: the human security collaboratory (hs collab) directed by jacqueline wernimont, and the equality lab at william and mary, founded by elizabeth losh. the hs collab launched by the global security initiative at arizona state university is ‘a collective of artists and scholars dedicated to addressing the ways in which digital technology creation and use affects individual and community life’ (hs collab, n.d.). the lab is focused on tackling complex problems related to digital security and civil rights through the application of digital humanities tools and inquiries. the lab’s projects include border quants, a project related to digital human rights, personal data protection, and decolonial approaches to data use, and vibrant lives, an immersive performance installation that served as a critical comment on the use and monetization of personal data production. an important part of the lab’s activities is public engagement through events, such as a series of lunchtime conversations about digital human security issues. the equality lab, in turn, is devoted to testing, experimenting, and questioning the nature of equality across many different domains by using digital tools.
it provides a ‘space to ask big questions about how equality has been defined in different places at different times in history and study how equality (and inequality) can be represented in scholarly works’ (equality lab, n.d.). the lab represents a flat, non-hierarchical, and diverse cohort of students, scholars, and community members. one of the lab’s research areas is the lgbtiq research project, which is conducted by students and scholars in collaboration with multiple community partners across virginia with the aim to better understand and preserve the history of lgbtiq people and cultures in the commonwealth, and make these histories accessible through digitized materials and oral histories. a critical reading of the layers of infrastructure aims to understand how they condition research practices and social experience as well as to disclose the power stemming from their structure and situatedness. some places, unlike others, are distinguished by the specific powers assigned to them based on their essential internal features. the laboratory has been considered to have such a capacity to raise the world and has undergone a critical scaffolding, as mattern puts it in her excellent analysis of infrastructures as critical structures through which we can address social issues (mattern, ). drawing on critical infrastructure studies, social lab theories, and the discourse of reimagining the lab in the vein of feminist thinking, i propose to reflect on a laboratory through the lens of the infrastructure of engagement. the inspiration to frame laboratories this way comes from humanities for all, an initiative of the national humanities alliance foundation.
the foundation has established what the humanities action lab defines as the infrastructure of engagement: ‘institutional structures that support engaged scholarship, including degree programs, centers, funding opportunities, digital technologies, and curriculum reorientation initiatives’ (humanities for all, n.d.). by developing a theoretical framework for this concept, i want to draw attention to the ways in which the laboratory has the potential to be involved in addressing social challenges via embedding its infrastructure in a social system of interconnected elements as well as by embodying social values and principles in the very design of infrastructure. the laboratory has turned into the infrastructure of engagement, which, through its design and actions, has become a voice for public involvement. to set up the discussion of the epistemological foundation, i refer to paul dourish and genevieve bell’s concepts of the infrastructure of experience and the experience of infrastructure. dourish and bell examined the meaning of ubiquitous computing in everyday encounters with space from the perspective of design studies and the cultural organization of space. they reflected on how we can experience and interact with pervasive computing through physical space conceptualized not just as material infrastructure but as the infrastructure through which we perceive the world. dourish and bell presented infrastructures as fundamental elements of the ways in which we encounter the world and, in doing so, introduced the concept of the infrastructure of experience as something that is embedded in everyday space, shapes our experience of that space, and provides a framework through which our encounters with space take on meaning. further, they proposed ‘the experiential reading of infrastructure’ (dourish and bell, : ), upon which infrastructure and everyday life are coextensive.
it is a useful theoretical framework for showing how an infrastructure that is normally taken for granted can shape our perceptions and evoke unexpected changes. drawing on dourish and bell’s idea of intertwining experience and organizational structures, i determine the following implications for the conceptualization of the infrastructure of engagement. first, the infrastructure is socially and culturally organized and situated; therefore, the social and cultural aspects provide contexts for understanding the infrastructure as well as the framework for designing such infrastructure. as dourish and bell state: technological infrastructures are, inherently, given social and cultural interpretations and meanings; they render the spaces that they occupy as spaces that can be distinguished and categorized and understood through the same processes of collective categorization and classification that operate in other domains of social activity ( : ). therefore, the infrastructure embodies social and cultural values and, to pursue this line of thinking further, it is capable of reinforcing and redistributing influence and power. second, the infrastructure as a social product builds new forms of communities and organizations through the mechanism of inclusion, by connecting actors in a network, and exclusion, by setting up physical and symbolic boundaries. third, the infrastructure operates the mechanism of knowledge production and transfer; therefore, it can influence the ways in which knowledge is reinforced or undermined. reflecting upon changes in knowledge infrastructures, paul n. edwards claims, ‘new knowledge infrastructures hold great promise, and they may help address key issues of public import. but knowledge infrastructures also face limits, create tensions, and raise concerns’ ( : ).
the above premises of the infrastructure of engagement reveal how the infrastructure can be used as a critical scaffolding that, through its structure and design, has the potential to interrogate and shift social dynamics. the building of the infrastructure of engagement is thus a critical gesture to make its substrates crucial elements in constructing reality. therein lies the promise of the infrastructure of engagement to reimagine and rebuild the world differently. i propose to conceptualize a laboratory as the infrastructure of engagement to show how a lab as a physical place and a theoretical construct can bridge the inside and outside worlds of ‘laboratory walls’ and become a critical site for engineering, prototyping, and testing new ideas. the following elements of the lab can constitute a framework to design the lab through the lens of the infrastructure of engagement: a network structure connecting various stakeholders across sectors, institutions, and disciplines, and spin-off labs, enabling collaborative action; the diversity of actors constituting the engine of a lab; an inclusive lab structure and culture embodying the values of openness and equality; transparency of research, actions, and data; rapid response to challenges reflected in the focus on a specific problem undertaken from a broad and holistic perspective; the utilization of the systems analysis approach to understand a problem’s bigger picture; experimental practices (e.g., prototyping, participatory-based approaches, intervention); and co-creation of solutions with citizens (e.g., through community-curated projects). from the perspective of this concept, a laboratory is rebuilt as a site of intervention that enables actors to be involved in real-world challenges through the very design of collaborative structures and actions.
the laboratory described by the social lab theorists as ‘a container for a multitude of capabilities and perspectives’ (tiesinga and berkhout, : ) and a ‘platform for addressing complex social challenges’ (hassan, a) can effectively implement the idea of the infrastructure of engagement. the proposed framework can provide a theoretical understanding of new lab models in the humanities that no longer resemble their precursors. the aforementioned humanities action lab is one exemplification of the infrastructure of engagement: a coalition-based lab that connects various local institutions and communities working together on a particular challenge for a specific period of time. there are more examples of laboratories that can be analysed through this category: the medialab at the university of granada, which is an open lab connecting academic work with society through the exploration of new forms of prototyped and participatory knowledge; the public data lab, a distributed lab coordinated by a group of researchers from different european institutions, with the critical aim to facilitate democratic engagement and public debate around the creation and use of public data; and the rights lab at the university of nottingham, which is a large-scale platform-based lab devoted to ending global slavery through the collaboration of various stakeholders. a laboratory is far more than just a place with instruments and equipment. it is a highly epistemologically and culturally charged concept that implies a specific way of thinking, experimenting, and seeing the world. the analytical concept of the infrastructure of engagement aims to unlock the power of the laboratory and inspire the search for new lab models with critical inflection and intervention in public space and global challenges.
it also aims to catalyse the design and development of new forms of laboratories that will apply humanities knowledge with the use of digital or non-digital components, and translate that knowledge into actions beyond laboratory walls. labs can be envisioned as an open space, a coalition, and a digital platform that, through its own infrastructure and culture, has the power to produce, test, and shape the future of society.
conclusion
today’s big challenges―the covid-19 pandemic, climate change, migration, and refugee crises―are global in scale, transcending geographical, national, and cultural boundaries, but they are all felt, and responded to, at the local level. the large scope of contemporary challenges requires us to seek rapid and innovative solutions scaled at the global level and adjusted to local conditions and needs. in the face of complex global problems, it has become necessary to reflect on the questions: what kind of new forms of organizations are needed to tackle real-world problems? what new approaches are necessary for a cross-sectoral way of solving problems and moving ideas from one place to the other? how can we rebrand and enhance the humanities as a responsive field with the ability to translate knowledge into collaborative actions? in this essay, i have presented an emerging model for a cross-sectoral organization that is a network-based laboratory made up of diverse actors―public institutions, private organizations, and civil society―involved in investigating the system through collaborative and iterative processes. the new institution called the social lab refers to a well-established institutional model of a laboratory characterized by experimental practices conducted in a sterile environment with the use of instruments and equipment.
as torjman notes, ‘similar to traditional science labs where the scientific method dictates the iterative process by which results are achieved, the newer class of labs offers a neutral space dedicated to problem-solving in a highly experimental environment’ (torjman, : ). however, unlike traditional labs, social labs are focused on the diversity of perspectives, the co-creation of solutions with citizens, and the application of prototyping techniques; ‘structured as a flat rather than a hierarchical model, collaborative action can occur more freely, with everyone having something of equal value to contribute’ (torjman, : ). social labs, along with community and citizen labs, have contributed to expanding the concept of a laboratory that can be applied to various spaces and fields where there is a need for a collaborative understanding of complex systems and rapid prototyping of solutions. as a result, the laboratory serves as a theoretical construct, rather than a physical place, used to implement an experimental and collective approach, diverse perspectives, and critical interventions. this has opened up the possibility of the formation of new forms of laboratories in the humanities that no longer resemble the science labs that they evolved from. the new models inspired by the social and community labs require, therefore, a new analytical framework for better understanding their principles and practices. i have therefore introduced a new theoretical approach to conceptualizing a laboratory in the humanities as the infrastructure of engagement. the laboratory has become a powerful infrastructure with a great potential to support social engagement and drive systemic changes. the concept of the infrastructure of engagement is intended to be an epistemological tool for reconsidering and redesigning physical and conceptual structures for humanities inquiry.
by referring to the humanities action lab, i aimed to envision a laboratory as a critical and interventive space guided by the principles of cross-sector collaboration, diversity, horizontal structure, inclusivity, systemic approaches, and public engagement. as david edwards rightly puts it, ‘effective labs listen carefully to public reaction and translate what is learned into useful guidance for future experimentation’ ( : ). the proposed approach opens up new research questions and inquiries. first, it offers a new perspective on the networked collaboration between labs situated in different socio-cultural environments. the networking structure has the potential to ‘bridge the gap between these disconnected worlds by translating ideas and resources from one world to the other’ (tiesinga and berkhout, : ). this, however, raises the question of how to build collaboration and communication between labs conditioned by divergent socio-technical infrastructure and affected by different social and political concerns. it thus requires further research on how to both scale solutions at the global level and respond to local differences. this, in turn, can entail a critical debate on the implementation of ideas that are perceived differently across geographical and cultural environments. for instance, the concepts of openness and public engagement are perceived differently from one country to another. while open access to resources is a desirable movement in western countries, in other regions there is uncertainty about opening public access to indigenous cultural heritage data; e.g. in the context of indigenous people in australia (bowrey and anderson, ) and africa (piron, ).
therefore, network-based laboratories require further explorations and efforts towards developing the mutual understanding of epistemological situatedness and designing new flexible methods and practices that would enable various actors to engage in a global dialogue. second, the development of cross-disciplinary laboratories challenges the role of the humanists in team-based work. the laboratories referred to above represent good examples of collaborative work between the humanities and other fields; for instance, the humanities action lab brings together scholars coming from history, literature, art, design, policy studies, and so on; the medialab at the university of granada carries out projects at the intersection of the digital humanities, media studies, and digital culture; the public data lab, in turn, is coordinated by a group of researchers representing the fields of anthropology, digital humanities, sts, design, and media studies. this triggers questions as to how a team of people with different epistemological backgrounds might collaborate. what is the contribution of the humanities in designing solutions for contemporary challenges? how can we enhance the position and the application of humanities knowledge in cross-disciplinary and cross-sectoral labs? these labs are established to produce systemic knowledge on a particular social challenge and collectively design a solution based on prototyping and intervention techniques. this raises further questions about the emergence of new research methods in the humanities, such as prototyping, participatory mapping, and design thinking. it thus requires an understanding of prototyping ideas from humanities perspectives, and developing innovative methods for scholarly research using software development and design thinking approaches.
the discussion on the significance and the role of humanities knowledge in addressing global challenges has been revived in the face of the covid-19 pandemic, in the form of the question: ‘what can the humanities offer in the covid era?’ (reisz, ). over the years, the humanities have been challenged and developed into a field that connects research and teaching with projects to advance social justice and the public good. the engaged humanities aim to create new configurations of humanities knowledge, social relevance, and public involvement and, by doing so, become a key voice in debates on tackling real-world problems. in response to covid-19, the london school of economics and the british academy and arts council england set up the social sciences, humanities & the arts for people & the economy (shape) initiative, aimed at rebranding the arts, humanities and social sciences and promoting these as the branches of knowledge necessary for ‘understanding more profoundly the world around us and the people in it, through observation, analysis, translation and interpretation’ (shape, n.d.). this initiative has the potential to become a powerful movement towards reinforcing the position of the humanities that, together with stem subjects, can contribute to addressing problems in the contemporary world. the global challenges are complex and interrelated; therefore, they require active engagement and collaboration from actors representing different perspectives, disciplines, and sectors. under these conditions, laboratories have been reimagined as sites for interventions in pressing social challenges. laboratories have the power to reposition the humanities in society as they can provide space for the application of humanities knowledge in epistemological and practical experiments and for the transformation of ideas into actions.
this means that we must understand them as infrastructures of engagement in order to reimagine and rebuild the world differently.

acknowledgements

the research included in this essay was conducted as part of the willard mccarty fellowship at the department of digital humanities (ddh) at king’s college london and presented during my lecture, entitled “a laboratory as critical infrastructure in the humanities”, at the workshop “humanities laboratories: critical infrastructures and knowledge experiments”, hosted by ddh with king’s digital lab in conjunction with the critical infrastructure studies initiative at king’s college london on may . i thank ddh for awarding me the willard mccarty fellowship / and giving me an opportunity to share my research. i would like to thank the participants at the workshop for the feedback i received. i also thank the two anonymous reviewers for their careful reading and many insightful comments and suggestions.

competing interests

the author has no competing interests to declare.

references

acls our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies.
amplify (n.d.) amplifying creative communities. available at http://www.amplifyingcreativecommunities.org/ [last accessed november ].
arac, j shop window or laboratory: collection, collaboration, and the humanities. in: kaplan, a e and levine, g (eds.) the politics of research. new brunswick: rutgers university press. pp. – .
asu n.d. asu humanities lab. arizona state university. available at https://humanities.lab.asu.edu/ [last accessed august ].
bowman, k et al (ed.) systems thinking. an introduction for oxfam programme staff. oxford: oxfam.
bowrey, k and anderson j the politics of global information sharing: whose cultural agendas are being advanced? social & legal studies, ( ): – . doi: https://doi.org/ . /
cistudies (n.d.) critical infrastructure studies. available at https://cistudies.org/ [last accessed november ].
climates of inequality n.d. climates of inequality. stories of environmental justice. humanities action lab. available at https://www.humanitiesactionlab.org/projects [last accessed november ].
creating feminist infrastructure in the digital humanities panel discussion at digital humanities , dh , adho conference, krakow. available at dh .adho.org/static/data/ .html [last accessed november ].
davidson, c n what if scholars in the humanities worked together, in a lab? chronicle review, may. https://www.chronicle.com/article/what-if-scholars-in-the/ [last accessed november ].
desis lab, parsons design for social innovation and sustainability lab. available at https://www.desisnetwork.org/ [last accessed november ].
dourish, p and bell, g the infrastructure of experience and the experience of infrastructure: meaning and structure in everyday encounters with space. environment and planning b: planning and design, : – . doi: https://doi.org/ . /b t
dubois, l and jenson, d humanities in the lab: rethinking haitian studies. diversity and democracy, ( ). available at https://www.aacu.org/publications-research/periodicals/humanities-lab-rethinking-haitian-studies [last accessed november ].
edwards, d the lab: creativity and culture. cambridge and london: harvard university press. doi: https://doi.org/ . /
edwards, p n et al (eds.) knowledge infrastructures: intellectual frameworks and research challenges. ann arbor: deep blue. available at http://hdl.handle.net/ . / [last accessed november ].
emerson, l, parikka, j and wershler, d n.d. the lab book. situated practices in media studies. university of minnesota press. available at https://manifold.umn.edu/projects/the-lab-book [last accessed november ].
equality lab n.d. the equality lab. william & mary. available at https://www.wm.edu/as/equality-lab/about-us/index.php [last accessed august ].
fhi n.d. humanities laboratories. john hope franklin humanities institute at duke university. available at https://fhi.duke.edu/labs [last accessed august ].
foka, a et al beyond humanities qua digital: spatial and material development for digital research infrastructures in humlabx. digital scholarship in the humanities, ( ): – . doi: https://doi.org/ . /llc/fqx
galison, p trading zone: coordinating action and belief. in: biagioli, m (ed.) the science studies reader. new york: routledge. pp. – .
gjefsen, m d and fisher, e from ethnography to engagement: the lab as a site of intervention. science as culture, ( ): – . doi: https://doi.org/ . / . .
gooday, g placing or replacing the laboratory in the history of science? isis, ( ): – . doi: https://doi.org/ . /
gpmp n.d. guantánamo public memory project, humanities action lab. available at https://www.humanitiesactionlab.org/gtmopublicmemoryproject [last accessed november ].
hal n.d. humanities action lab, rutgers university-newark. available at https://www.humanitiesactionlab.org/ [last accessed november ].
hamilton, h sustainable food lab learning systems for inclusive business models worldwide. international food and agribusiness management review, , special issue a: – .
hannaway, o laboratory design and the aim of science: andreas libavius versus tycho brahe. isis, : – . doi: https://doi.org/ . /
hassan, z a the social labs revolution: a new approach to solving our most complex challenges. san francisco: berrett-koehler.
hassan, z b mapping the landscape of labs: a google map. social labs blog . , june. available at https://social-labs.org/mapping-the-landscape-of-labs-a-google-map/ [last accessed november ].
hs collab n.d. the human security collaboratory, arizona state university. available at https://www.hscollab.org/ [last accessed august ].
humanities for all n.d. humanities for all, the national humanities alliance foundation. available at https://humanitiesforall.org/about [last accessed november ].
joselow, m labs are for the humanities, too. inside higher ed, july. available at https://www.insidehighered.com/news/ / / /conference-explores-humanities-labs [last accessed november ].
kieboom, m lab matters: challenging the practice of social innovation laboratories. amsterdam: kennisland.
knorr cetina, k the manufacture of knowledge. oxford: pergamon press.
knorr cetina, k laboratory studies: the cultural approach to the study of science. in: jasanoff, s (ed.) handbook of science and technology studies. los angeles: sage. pp. – . doi: https://doi.org/ . / .n
kohler, r e lab history. isis, : – . doi: https://doi.org/ . /
latour, b give me a laboratory and i will raise the world. in: knorr cetina, k and mulkay, m (eds.) science observed: perspectives on the social study of science. london: sage. pp. – .
latour, b science in action. how to follow scientists and engineers through society. cambridge, ma: harvard university press.
latour, b and woolgar, s laboratory life: the construction of scientific facts. princeton university press.
liu, a toward critical infrastructure studies. nassr, april. pp. – . available at https://cistudies.org/wp-content/uploads/toward-critical-infrastructure-studies.pdf [last accessed november ].
lynch, m art and artifact in laboratory science. london: routledge & kegan paul.
mattern, s scaffolding, hard and soft. infrastructures as critical and generative structures. spheres: journal for digital cultures, : – . doi: https://doi.org/ . /mediarep/
parsons desis lab n.d. available at https://www.newschool.edu/desis/ [last accessed november ].
pawlicka, u data, collaboration, laboratory: bringing concepts from science into humanities practice. english studies, ( ): – . doi: https://doi.org/ . / x. .
pawlicka-deger, u n.d. laboratory: a new space in digital humanities. in: mcgrail, a, nieves, a d, senier, s (eds.) institutions, infrastructures at the interstices: debates in the digital humanities. minneapolis: university of minnesota press (forthcoming).
pawlicka-deger, u the laboratory turn: exploring discourses, landscapes, and models of humanities labs. digital humanities quarterly, ( ). available at http://digitalhumanities.org/dhq/vol/ / / / .html [last accessed august ].
pickering, a from science as knowledge to science as practice. in: pickering, a (ed.) science as practice and culture. chicago and london: university of chicago press. pp. – .
piron, f postcolonial open access. in: herb u and schopfel j (eds) open divide. critical studies in open access. sacramento, ca: litwin books & library juice press. pp. – .
pratt, m l arts of the contact zone. profession. pp. – .
reimagining the humanities lab panel discussion at digital humanities conference , dh , adho conference in mexico city. available at https://dh .adho.org/en/reimagining-the-humanities-lab/ [last accessed november ].
reisz, m what can the humanities offer in the covid era? times higher education, july . available at https://www.timeshighereducation.com/news/what-can-humanities-offer-covid-era [last accessed august ].
rockwell, g as transparent as infrastructure: on the research of cyberinfrastructure in the humanities. in: mcgann, j (ed.) online humanities scholarship: the shape of things to come, proceedings of the mellon foundation online humanities conference at the university of virginia. houston: rice university press. pp. – . available at https://cnx.org/contents/pvdh -ld@ . :_usvuzfn@ /as-transparent-as-infrastructure-on-the-research-of-cyberinfrastructure-in-the-humanities [last accessed november ].
Ševčenko, l the humanities action lab: mobilizing civic engagement through mass memory projects. diversity and democracy, ( ). available at https://www.aacu.org/diversitydemocracy/ /winter/sevcenko [last accessed november ].
shape n.d. social sciences, humanities & the arts for people & the economy. available at https://thisisshape.org.uk/ [last accessed august ].
smithies, j et al. mechanizing the humanities? king’s digital lab as critical experiment, digital humanities , dh , adho conference in montreal, august. available at https://jamessmithies.org/blog/ / / /mechanizing-humanities-kings-digital-lab-critical-experiment/ [last accessed november ]. doi: https://doi.org/ . / - - - - _
smithies, j and ciula, a humans in the loop: epistemology & method in king’s digital lab. in: dunn, s and schuster, k (eds.) routledge international handbook of research methods in digital humanities. london: routledge. doi: https://doi.org/ . / -
star, s l infrastructure and ethnographic practice: working on the fringes. scandinavian journal of information, ( ): – .
states of incarceration n.d. states of incarceration. a national dialogue of local histories. humanities action lab. available at https://www.humanitiesactionlab.org/statesofincarceration [last accessed november ].
sustainable food lab n.d. available at https://sustainablefoodlab.org/ [last accessed november ].
svensson, p big digital humanities: imagining a meeting place for the humanities and the digital. ann arbor: university of michigan press. doi: https://doi.org/ . /dh. . .
tiesinga, h and berkhout, r (eds.) labcraft: how social labs cultivate change through innovation and collaboration. london and san francisco: labcraft publishing.
torjman, l labs: designing the future. mars discovery district, february. available at https://www.marsdd.com/research-and-insights/labs-designing-the-future/ [last accessed november ].
westley, f et al change lab/design lab for social innovation. annual review of policy design, ( ).
what is a feminist lab? symposium at the university of colorado, boulder. available at https://whatisafeministlab.online/ [last accessed november ].

how to cite this article: pawlicka-deger, u a laboratory as the infrastructure of engagement: epistemological reflections. open library of humanities, ( ): , pp. – . doi: https://doi.org/ . /olh.
published: october
copyright: © the author(s).
this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

open library of humanities is a peer-reviewed open access journal published by open library of humanities.

© university of toronto press journal of scholarly publishing april doi: . /jsp. . .

humanities scholars and library-based digital publishing
new forms of publication, new audiences, new publishing roles

katrina fenlon, megan senseney, maria bonn and janet swatscheno

the rise of library-based digital scholarly publishing creates new opportunities to meet scholars’ evolving publishing needs. this article presents findings from a national survey of humanities scholars on their attitudes toward digital publishing, the diversification of scholarly products, changing perceptions of authorship, and the desire to reach new audiences.
based on survey findings, the authors offer recommendations for how library publishers can make unique contributions to the scholarly publishing ecosystem and support the advancement of digital scholarship in the humanities by accommodating and sustaining more diverse products of digital scholarship, supporting new modes of authorship, and helping scholars reach broader audiences through interdisciplinary and open access publishing.

keywords: library-based publishing, humanities research, digital scholarship, scholarly communication, digital humanities

introduction: understanding the needs of scholars in a contemporary publishing environment

scholars in the humanities too often find their publishing needs unmet, despite the rapid evolution and diversification of digital scholarship. at the same time, library-based publishing services and library–press collaborations are on the rise, growing in response to a scholarly demand to fill gaps in the current landscape of scholarly communication. based on a national survey of humanities scholars, this study identifies areas of opportunity for library-based publishers to fill gaps in the current support for digital publishing in the humanities. these gaps include supporting a growing diversity of scholarly products, sustaining and preserving complex digital publications, helping scholars find new and broader audiences, and supporting collaborative, incremental, and iterative authorship and new forms of review.

in the last two decades, humanities scholars have increasingly turned to digital publishing. still, long-established print-centric genres (especially the print monograph) remain the gold standard of humanities publishing, even as scholars increasingly employ digital technologies and multimedia sources in their research processes.
several studies have identified barriers to digital scholarly publishing, including a scarcity of easy-to-use tools, lack of faculty time to learn new skills, insufficient institutional support, high costs, and concerns about the evaluation, prestige, and sustainability of digital publications.

several university presses in recent years have targeted these barriers directly. building on efforts such as the public knowledge project, recent initiatives (including fulcrum, manifold, and vega academic publishing system) are developing publishing platforms for diverse digital scholarship. often constituting collaborations between presses and libraries, among other stakeholders, these platforms aim to support flexible, collaborative publishing workflows that yield interactive, multimedia content bearing the imprint of a library or press. libraries have also developed training and outreach programs to support digital scholarship, often through dedicated scholarly communications units. and widespread efforts, in academic departments and professional societies, have sought to formalize systems of credit and evaluation for digital publications.

this article reports on a national survey of humanities scholars that was designed to assess how attitudes and practices are changing in light of these efforts to advance digital publishing. within the humanities, scholars have distinctive and diverse publishing needs, and they often confront more significant barriers to digital publishing than do scholars in other disciplines. conducted as part of the publishing without walls initiative (pww), this research supports the development of a service model for library-based publishing, with the goal of reducing barriers to digital publishing in the humanities.
the survey comprised twenty-nine questions covering six broad themes: (1) experiences with print and digital publishing; (2) goals for publishing; (3) use of and preferences for publishing tools and platforms; (4) use of and preferences for publishing services and support; (5) opinions on digital publishing from the perspective of reader as opposed to author; and (6) general attitudes toward print and digital publishing. questions were presented as likert-scale rating questions (usually presented in matrix tables, which asked respondents to rate several items in succession); ranked responses (which asked respondents to place items in rank order, for example, of perceived importance); multiple-choice questions; and open-ended questions. by a method of purposive sampling, the survey was distributed through listservs and social media venues targeting scholars in the humanities and humanistic social sciences. a further round of recruitment focused on encouraging responses from scholars more likely to have experienced systemic barriers to digital publishing, specifically scholars at historically black colleges and universities and other minority-serving institutions.

in total, the survey received responses. because only per cent of respondents elected to answer optional demographics questions, how well the responses represent the diversity of humanities scholars is unknown. the majority of those who provided demographic information are tenure-track faculty ( per cent). disciplinary representation is skewed toward literature in english ( per cent) and library and information science ( per cent), with further respondents from an array of disciplines including history, foreign languages, area studies, classics, digital studies, linguistics, and race and gender studies.
the survey sheds light on four major dimensions of the contemporary publishing environment in the humanities: 1) scholars’ attitudes toward digital publishing in general and their attitudes toward and needs from publishing services; 2) their publishing practices and the diversification of scholarly products they produce and use; 3) their changing perceptions of authorship, including ongoing renegotiation of author and publisher roles and increasing collaborative authorship; and 4) the breadth of their target audiences for publications. the rest of this article is structured around these four areas. survey findings in each area suggest opportunities for emergent library-based publishers to fill gaps by supporting rapidly evolving digital scholarship and communication in the humanities.

readiness for change and adherence to conventions: scholars’ attitudes toward print and digital publishing

responses to the survey suggest that humanities scholars have largely positive perceptions of digital publishing, both as producers and consumers of scholarship, despite having comparatively less digital than print publishing experience. a little over half of respondents ( per cent) indicated that they are enthusiastic producers of digital scholarly publications; only per cent of respondents described themselves as skeptical of digital publishing (figure ). in contrast, respondents consider their peers to be less enthusiastic about digital publishing, especially as producers (rather than consumers) of digital publications. only per cent of respondents rated their peers enthusiastic producers of digital publications. however, they consider their peers to be more enthusiastic consumers of digital content, with per cent rating their peers enthusiastic or somewhat enthusiastic consumers.
the discrepancy between how scholars think about digital publishing and how they believe their peers think about it resonates with scholars’ ongoing concern about the reputation of digital publishing within conservative humanities disciplines. the discrepancy may also reflect a bias among respondents, who may have self-selected into participation out of special interest or investment in digital publishing.

seventy-nine respondents elaborated on their attitudes toward digital publishing through free-text responses, providing various rationales for their positive or negative attitudes. the most commonly identified benefit of digital publishing was improving access to publications. one respondent wrote, ‘i value [digital publishing] most for the access it provides. while i personally prefer to read printed materials, i still tend toward acquiring digital texts because i can often get them and store them more easily.’ forty-three per cent of the free-text responses explicitly mentioned or alluded to the benefits of open access as a possibility of digital publication. a positive inclination toward open access publishing aligns with the findings of rowley et al., who identified a cautiously positive view toward open access journal publishing among scholars across disciplines, especially for the capacity of open access to increase the circulation, readership, and visibility of publications.

figure . attitudes toward digital publishing.

in their free-text responses, respondents also expressed concerns about digital publications’ perceived lack of prestige and quality, along with uncertainty about their durability, persistent concerns that are in keeping with earlier studies of the obstacles confronting digital scholarly publishing. twenty respondents cited lack of prestige and poor quality as a concern.
one respondent lamented, ‘there is so much digital junk: very low-quality scholarship masquerading as good research; at least in print, there is better gatekeeping.’ eight people mentioned durability, preservation, and concern over future access; one respondent wrote, ‘access is great; but i worry a lot about the lack of durability, about obsolescence, about the preservation of the long-term record of knowledge.’ views on the trade-offs between print and digital publishing varied across respondents. for example, one respondent preferred to collapse the distinction: ‘i think we make too big a deal out of whether something is digital or print. i don’t care. i just care about the content and sometimes the process, meaning peer review.’

beyond general attitudes about publishing, the survey identified divergent experiences and needs related to the processes of print and digital publishing. to elicit information about their publishing challenges, the survey asked respondents to rate the following nine variables from not at all challenging to extremely challenging for both print and digital publishing: 1) getting adequate technical support for publication; 2) manuscript preparation; 3) getting adequate editorial support for publication; 4) getting adequate financial support for publication; 5) securing third-party permissions; 6) reaching intended audiences; 7) finding appropriate venues; 8) securing a publisher; and 9) speed to publication. the outstanding complaint about print publishing as a process is its perceived slowness, with per cent of respondents categorizing ‘speed to publication’ as challenging or extremely challenging for print publication, confirming other studies. speed to publication has been identified as a main benefit of digital open access publishing.
nearly a third of participants ( per cent) considered marketing and audience-creation to be inadequately supported aspects of publishing (regardless of whether print or digital), and more than a third ( per cent) wanted more help from publishers in navigating third-party permissions for the publication of sources, although neither of these services was among their top priorities. respondents indicated general satisfaction with what they considered to be among the most important publisher services: peer-review coordination, publisher transparency and communication, and publisher interventions into publication design. figure shows how the scholars rated the adequacy of extant publishing services, from extremely inadequate to extremely adequate. while participants rated the adequacy of extant services on a five-point scale, we have simplified figure in order to highlight the most important outcomes, collapsing the categories extremely inadequate and inadequate to represent negative perceptions, and adequate and extremely adequate to represent positive perceptions. the services are listed in the order of their perceived importance to respondents, with those at the top being most important.

only three categories of service or support were perceived by more respondents to be inadequate than adequate: digital archiving and preservation measures, marketing and audience-creation, and navigating third-party permissions. digital preservation was among respondents’ most valued services. approximately one-third of respondents ( per cent) considered preservation services for digital publications to be inadequate; the remaining two-thirds were evenly split between neutral and adequate. given the importance of preservation to respondents, improving scholars’ trust in the preservation of digital publications constitutes an area of urgent potential for publishers.
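the collapsing step described above, in which a five-point adequacy scale is reduced to negative, neutral, and positive groups, can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors’ actual analysis: the label strings, the function name `collapse_ratings`, and the sample responses are all hypothetical, and the middle category is assumed to be labelled neutral.

```python
from collections import Counter

# Hypothetical labels for the five-point adequacy scale; the survey's
# exact wording for each point may differ.
COLLAPSE = {
    "extremely inadequate": "negative",
    "inadequate": "negative",
    "neutral": "neutral",
    "adequate": "positive",
    "extremely adequate": "positive",
}


def collapse_ratings(ratings):
    """Return the share of negative, neutral, and positive ratings for one service."""
    if not ratings:
        raise ValueError("at least one rating is required")
    # Map each five-point label to its collapsed group, then count the groups.
    counts = Counter(COLLAPSE[rating] for rating in ratings)
    total = sum(counts.values())
    return {group: counts.get(group, 0) / total
            for group in ("negative", "neutral", "positive")}


# Invented responses for illustration only; these are not survey data.
sample = [
    "inadequate", "extremely inadequate",
    "neutral", "neutral",
    "adequate", "extremely adequate",
]
shares = collapse_ratings(sample)
print(shares)
```

the function returns proportions rather than counts, so that services rated by different numbers of respondents could be compared on the same axis.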
many of the services discussed above could be delivered by publishers in a variety of ways. respondents’ preferences for how they receive publishing services varied, with the most popular means being one-on-one consultations ( per cent), remote support ( per cent), and detailed documentation ( per cent). far fewer respondents expressed interest in walk-in support or workshops. within an academic library, these preferences suggest that service models emphasizing library office hours and short-form instructional sessions may not align with the mission of a library publishing unit and the needs of faculty. with respect to publishing processes, however, survey results were not robust enough to suggest a single best path forward. rather, they serve as indicators for the value of prioritizing a diverse service portfolio.

figure . perceptions of the adequacy of extant publishing services.

digital publishing and the diversification of scholarly products

respondents’ reported enthusiasm for digital publishing is reflected in publishing practices, albeit to a limited extent. digital publishing is common among respondents, with most respondents ( per cent) having published both in print and digitally ( per cent reported exclusively print publishing experience, and per cent exclusively digital publishing experience). previous studies have found that common genres of humanities publishing have been slow to transition from print to digital form. indeed, for the top four print genres (journal articles, book chapters, conference papers, and books), respondents reported having substantially more print than digital publishing experience (figure ). despite advances made in the availability of tools, services, and support, our survey results suggest that digital publishing still lags behind print for the main forms of publication.
of course, the lag may be decreasing (the survey does not reveal trends over time). however, respondents reported publishing a wider variety of forms digitally, and digital publishing was more common than print across most genres of publication.

technological advancement has given rise to new forms of scholarly output, ranging from datasets and collections to software applications and blogs. some forms have begun to stabilize, having garnered varying degrees of acceptance among different disciplinary communities. while familiar forms such as journal articles still dominate publishing, our survey results confirm an increasing variety of scholarly products. when asked how frequently they had published digitally in various categories, a large majority of respondents reported having self-published via personal or professional websites ( per cent), blogs ( per cent), or other websites ( per cent). beyond the controlled categories provided by the survey, twelve respondents named other forms and genres in which they had published, including creative works, book reviews, ‘living processual works,’ maps, encyclopedia entries, games, edited texts, and transmedia works. in addition, when asked what kinds of content are typically present in the products of their scholarship, a sizeable minority of respondents identified multimedia resources such as collections and archives ( per cent), datasets ( per cent), interactive visualizations ( per cent), and other multimedia ( per cent). this diversity of content exceeds the capacities of most extant publishing systems, and alternative forms of publication tend to be omitted from related systems of formal review, evaluation, discovery, access, and preservation. for these reasons, the robust numbers of respondents publishing across numerous genres warrant sustained attention from publishers.

figure . digital versus print publishing experience, by genre.
changing perceptions of authorship: collaboration and publishing roles

while most disciplines have witnessed a dramatic increase in co-authorship in the last few decades, this change has come to the humanities more slowly. in keeping with several recent studies that have identified a trend toward increased collaboration in the humanities, our survey results suggest that our respondents are very open to collaborative authorship. nearly per cent of survey respondents have authored collaboratively or hope to in the future. this desire to collaborate may be attributed in part to bias among our respondents, who may have been drawn to participate in the survey by their prior interest in collaborative, digital modes of publication. while most ( per cent) had collaborated, or wished to, with a limited number of (local or remote) co-authors, a few reported hoping to collaborate with an unrestricted number of co-authors, for example through large-scale public or community authorship of open documents, or in a more limited fashion through open annotation and open review tools for gathering commentary and feedback.

respondents disseminate in-progress work predominantly through conferences and workshops or through direct communication with other scholars (figure ). a substantial minority (between one-fifth and one-third) reported using blogs, social media, or personal websites for publishing in-progress works. unlike other options for scholars to share their in-progress work (especially conferences and institutional repositories), these self-publishing options may exist outside institutional systems of discovery, access, and preservation. figure shows different strategies for obtaining feedback and review from peers and collaborators.
most respondents reported gathering feedback and review by emailing file attachments back and forth with peers ( per cent), by collaborating asynchronously on cloud-based documents ( per cent) or through shared file storage ( per cent), or by working together in real time ( per cent). fewer respondents reported using a version-control repository (such as github) for gathering feedback on works in progress ( per cent).

figure . how respondents disseminate or publish in-progress works.

figure . how respondents gather feedback on in-progress works from their peers.

more than half of the respondents ( per cent) expressed interest in using open peer review and open annotation tools (e.g., hypothes.is or commentpress) to gather feedback on their research, either before or after its publication. among the remaining respondents, the largest number ( per cent) were merely unsure about open review and annotation, perhaps in part because they were unfamiliar with such tools or uneasy about the implications of their use. only per cent were altogether unwilling to engage in these forms of review and comment. while the interest is substantial, the practice of employing open review and annotation is still rare. few respondents (eight total, or approximately per cent, as shown in figure ) reported having used open review or web-hosted annotation tools to gather feedback on their work. fitzpatrick and santo have expressed the need for improved technological systems along with the significant human infrastructure necessary to manage expectations and sustain the labour of participating in a functional open review process. they recommend the development of socio-technical solutions grounded in dedicated communities of practice.

the survey identified further shifts and tensions in author and publisher roles.
goldsworthy has described publishers as ‘shifting their position in the value chain, and redefining themselves as they go, into training and assessment, information systems, networked bibliographic data, and learning services.’ while the roles of academic publishers have been in flux for decades, shifts in authorial roles seem to be more recent, enabled by digital self-publishing. beyond self-publishing for in-progress work and open review, humanities scholars rely on self-publishing for disseminating genres of work that exceed the capacity of traditional publishers. authors may turn to self-publishing in order to share complex digital objects such as collections and datasets, interactive visualizations, and multimedia. in an era of digital self-publishing, more roles traditionally relegated to publishers are being assumed by authors, including aspects of publication design, facilitation of review, and publicity. further blurring the line, self-publishing tools usually aim to support various aspects of authorship (including content creation, editing, and design) along with publication to the web.

the humanities scholars we surveyed are not uniformly contented with a perceived convergence of the roles of author and publisher. one respondent wrote, ‘as someone who has written professionally for many decades, i consider digital publishing tools, like typesetting and layout, my publisher’s job . . . [m]aking writers do design work . . . would be like requiring faculty to clean classrooms and do tech support.’ spence notes the shift in author and publisher roles and argues that creative partnerships will be essential to bridging the gaps that exist among different stakeholders in the publishing process, including authors, publishers, and, we argue, library-based publishing services.
reaching new audiences

the survey asked respondents to indicate the top audiences they most wish to reach with their scholarship (figure ). unsurprisingly, per cent of respondents sought to engage with scholars in their discipline, but respondents also revealed an appetite for cross-boundary engagement, with per cent hoping to reach cross-disciplinary peers and per cent interested in reaching the general educated reader. other target audiences included students in their disciplines ( per cent) and specific non-scholarly audiences ( per cent), ranging from funding agencies and policy advocates to communities of interest, such as lgbt readers and creative writers.

figure . audiences that respondents most wish to reach.

it was highly important to respondents that a publication venue have the capacity to reach their target audiences. the survey asked respondents to rank factors they use to choose publication venues (figure ). while reputation of the venue was more frequently ranked as respondents’ first priority ( per cent), audience was also commonly ranked first ( per cent), and in a calculation of weighted averages, audience comes out slightly ahead of reputation. as consumers of scholarship, respondents also gave preference to accessibility (e.g., ease of access and availability) over status and trustworthiness of the publisher or venue (with a weighted average of . and . , respectively, on a -point scale). the survey responses suggest that certain readers of digital publications (researchers and instructors) may be willing to overlook the lack of prestige of a publisher if the publication is relevant and easily accessible.

figure . factors in how respondents choose publication venues, ranked by priority.
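the article does not state how its weighted averages of rankings were computed. the sketch below assumes one common convention, in which a first-place ranking among n options receives weight n, a second-place ranking weight n - 1, and so on down to 1. the function name `weighted_rank_average` and the counts are hypothetical and are not survey data. the example shows how a factor that is ranked first less often can still come out ahead on the weighted average.

```python
def weighted_rank_average(rank_counts):
    """Average weight of an option, given how many respondents gave it each rank.

    rank_counts[i] is the number of respondents who assigned rank i + 1.
    Assumed weighting (not taken from the article): rank 1 gets weight n,
    rank 2 gets n - 1, ..., rank n gets 1, where n = len(rank_counts).
    """
    n = len(rank_counts)
    total = sum(rank_counts)
    if total == 0:
        return 0.0
    weighted_sum = sum(count * (n - i) for i, count in enumerate(rank_counts))
    return weighted_sum / total


# Hypothetical counts for ten respondents ranking three factors.
reputation = [5, 1, 4]  # ranked first most often, but also ranked last often
audience = [4, 5, 1]    # ranked first slightly less often, rarely ranked last

print(weighted_rank_average(reputation))  # 2.1
print(weighted_rank_average(audience))    # 2.3
```

under this assumed weighting, frequent second-place rankings are enough to lift one factor above another that has more first-place rankings, which is the pattern the text reports for audience and reputation.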
scholars themselves are prime audiences for less conventional publications. substantial numbers of respondents (though none a majority) reported frequent use of other scholars’ personal and professional websites ( per cent); of collections, exhibitions, and archives ( per cent); and of blogs ( per cent). comparatively fewer respondents reported frequent use of datasets and multimedia, but these materials fit within the ‘collections as data’ ethos in contemporary academic librarianship and may readily be collected in anticipation of future use. these survey results suggest that services oriented toward a wider variety of digital content and alternative genres of publication stand to benefit between a quarter and a third of humanities scholars who may be underserved by current systems of digital publishing.

opportunities for library publishing services

this research illuminates areas of special opportunity for publishing libraries to build programs that effectively address unmet author needs. these results can guide the prioritization of resources and investment in services. four prime areas of opportunity for library-based publishers are discussed in this concluding section:

1. platforms to support more diverse scholarly products, including emergent genres and the integration of complex digital objects into long- or short-form narratives;
2. support for durable digital publications, including maintenance systems for diverse content, social media, and informal publications used to disseminate scholarship, and outreach to promote trust in the sustainability of digital publications;
3. support for audience-creation, marketing, and otherwise getting the word out to very broad target audiences, including interdisciplinary audiences and the interested general public;
4. platforms that support collaborative authorship and alternative forms of peer review and feedback gathering.
libraries and university presses each have important roles to play in meeting scholars’ publishing needs, and many scholars have recognized the actual and potential benefits of library–press collaboration. academic libraries are collecting institutions intended to gather and provide access to information in all its forms in the service of research and teaching. library holdings contain evidence upon which scholars base their arguments, and, in turn, libraries collect new generations of scholarship. historically, libraries have collected materials produced by scholars and printed and disseminated by presses. while the missions of academic libraries and presses retain an important degree of differentiation, the synergies between the two enterprises are clear and mutually beneficial, so much so that the reporting lines of university presses increasingly run through library administrations. the joint efforts of library publishers and presses can promote a diversified, thriving scholarly communications ecosystem, capable of reaching a broader range of audiences, promoting sustainable digital publication platforms and formats, and reducing the need for third-party vendors for digital products.

in the case of library publishers, strengths and opportunities emerge around pre-existing technical infrastructures that can support platforms for digital authoring, services for content representation, and workflows for dissemination and long-term preservation. the library’s mission-critical status within higher education emerges as an overall strength, but taking on additional roles and service models may strain capacity and tax library personnel. acquiring new content, coordinating peer review, and developing strategies to connect readers with new scholarship all require considerable time and energy.
as fledgling publishers, librarians also do not have the established reputation enjoyed by their colleagues in university presses. strategically, libraries would do well to build their reputations by focusing on core strengths and known gaps rather than replicating what presses already do well.

for library publishers, building out strategies for transparent communication with authors and marketing to reach target audiences would likely be the best priorities for a lean staff. to increase transparency, library publishers should document their publication workflows in ways that are openly accessible to authors and employ communications checklists to ensure frequent and systematic status updates. communication may be partially automated through platforms that provide author dashboards and auto-generated notifications. library publishers may develop individualized marketing strategies through interviews or dialogues with authors, comparable to traditional library reference interviews, designed to help authors identify target audiences and to delegate marketing responsibilities among authors and publishing staff. in response to authors’ desire to reach wider audiences, publishers may adopt a multifaceted approach that begins with the review process by soliciting feedback from outside disciplinary silos and developing plans to reach an expanded audience via social media and other forms of community outreach. such strategies would also build on established attention in most academic libraries to frequent and open communication with users. managing peer-review processes is highly important, but it is also well supported by existing publishing models. developing strategies that integrate existing peer-review models or experimenting with alternative modes of peer review would address scholars’ needs while also helping to pave the way for new modes of publishing.
instruction in working with new publishing tools and support for navigating third-party permissions did not emerge as prime concerns for survey respondents. despite clear opportunities to improve presently inadequate services, these activities should be regarded as lower priorities.

the diversification of digital scholarship presents the most significant opportunity for library-based digital publishing services. as the survey’s results confirm, scholars are producing a diversity of genres and media that are not well served by systems and services oriented toward conventional university press publications. libraries have a unique opportunity to confront the diversity and complexity of digital scholarship before, during, and after publication. as collecting institutions, libraries have a vested interest in gathering these materials and, in doing so, become natural candidates for disseminating them as well. moreover, many libraries are establishing systems for digital preservation of increasingly complex objects, whether within individual institutions, in consortia, or in large-scale preservation networks. for scholars seeking to incorporate complex digital objects into their publications or publish them as stand-alone products of scholarship, the library is a natural source for collecting, disseminating, and preserving such scholarship.

many humanities scholars’ interest in open review and collaborative authorship suggests an opportunity for advancing systems for early publication, sharing, feedback, and post-publication review. in particular, next-generation digital publishing systems will need flexible support for collaborative authorship, including flexible permissions and access management.
while other disciplines have well-established venues for preliminary publishing (e.g., the arxiv preprint repository), there are fewer analogous systems in the humanities, although experimental platforms for sharing scholarship and facilitating open review do exist (e.g., humanities commons and mediacommons). institutional and domain-specific repositories can serve iterative and incremental publication for some kinds of content, if not the more complex digital objects that scholars are beginning to publish, but these repositories do not usually support annotation or review. libraries have an opportunity to innovate by developing, hosting, and maintaining combined authorship and publishing tools that support the flexible, open, and collaborative authorship in demand among some humanities scholars.

collaborative authorship through common authoring tools (like word-processing software and cloud-based text editors) does not disrupt extant publishing models because it retains the distinction between authorship and publishing as separate and successive processes. but where the processes of authorship and publication begin to overlap, the next generation of digital publishing services may need to step in and provide support. for example, in rare cases, documents are made publicly available during the process of their authorship (a process that may keep going indefinitely). documents open to community and public authorship may be understood as undergoing authorship and publication at the same time, even if they are later finalized and officially published.

humanists’ wide range of target audiences may offer an opportunity for open access library-based digital publishing services, especially for reaching the general public and interdisciplinary communities.
scholars’ reported interest in reaching wider audiences aligns well with the established and growing recognition that the purpose and sustainability of humanities research depend on its ability to make a public impact. libraries are well positioned to meet this challenge head on, given their institutional ethic (and often express mission) of providing community service and fostering research literacy, their disposition toward open access publishing, and their willingness to take on higher-risk publications.

publication venues (journals, conferences, etc.) targeted at emergent interdisciplinary intersections are scarce, despite some humanities scholars’ desire to publish across disciplinary boundaries. many libraries have begun to adapt services to support interdisciplinary research and teaching through a range of programmatic efforts, including direct interventions into publishing and initiatives in scholarly communication that target dissemination gaps for interdisciplinary scholars. at the same time, open access publication has been demonstrated to increase discoverability and citation significantly, suggesting both an overall increase in audience reach and an explicit increase in engagement among academic consumers. with a funding model that demands demonstration of impact rather than generation of self-supportive revenue, library-based publishers have a unique opportunity to embrace open access and interdisciplinary publication. as library publishers work to establish their reputation, a commitment to open access, combined with a dissemination strategy to reach more diverse audiences, may attract authors and readers alike.

to reach new, diverse audiences, traditional academic publishers may face challenges different from those confronting library-based publishers.
university presses and other academic publishers have disciplinary or subject-area specializations; acquiring publications at disciplinary edges and intersections may stretch their acquisition policies and processes. even for those that publish books intended for popular consumption, actually reaching the audience is still a common challenge. on the other hand, library publishers may be agnostic with regard to subject area but may focus on publishing scholarship generated at their home institutions, as libraries exist primarily to serve their universities as a local good. to publish on the edges of disciplines, and to orient publications toward diverse and scattered audiences, library publishers may need to reorient their focus toward spaces and people outside their institutional boundaries. given their divergent strengths and limitations, libraries and presses may find opportunities to collaborate on initiatives focused on interdisciplinary work and public readership, with accompanying promotion strategies.

in exploring potential solutions to the problems posed by humanities scholars’ unmet publishing needs, we have focused on areas where libraries are best positioned to make advances in scholarly publishing. we have also stressed opportunities for emerging digital publishing programs, but that emphasis should not preclude long-established scholarly publishers from finding relevance in our survey’s results. by identifying strategic areas for library-based publishers to develop innovative models of service, and by proposing models of collaboration with presses and other publishing stakeholders, we seek to advance scholarly communication in step with the advance of digital scholarship in the humanities.

acknowledgements

the authors would like to gratefully acknowledge generous funding from the andrew w.
mellon foundation; contributions from christopher maden, aaron mccollough, harriett green, latesha velez, justin williams, and members of the publishing without walls project team; and initial feedback from dan tracy, mara thacker, heather simmons, jaimie carlstone, and sarah christiansen.

katrina fenlon is an assistant professor in the college of information studies at the university of maryland, college park. her research focuses on digital cultural collections (their curation, representation, and use) and the future of research and communication in the humanities. kfenlon@umd.edu. orcid: - - - .

megan senseney is head of the office of digital innovation and stewardship at the university of arizona libraries. her research focuses on collaboration in the humanities, digital scholarly publishing, and socio-technical issues related to the curation and analysis of textual data. mfsense @illinois.edu. orcid: - - - .

maria bonn is an associate professor in the school of information sciences, university of illinois at urbana-champaign, where she teaches courses on academic librarianship and the role of libraries in scholarly communication and publishing. her research focuses on understanding the needs of scholars in a contemporary publishing environment. mbonn@illinois.edu. orcid: - - - .

janet swatscheno is instructor and digital publishing librarian, university library, at the university of illinois at chicago. she was previously the visiting digital publishing specialist at the university of illinois at urbana-champaign library. before joining the university of illinois, she worked as a digital metadata assistant at depaul university and was a junior fellow at the library of congress. jswatsc @uic.edu. orcid: - - - .

notes
maria bonn and mike furlough, eds., getting the word out: academic libraries as scholarly publishers (chicago: association of college and research libraries, ), http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/booksanddigitalresources/digital/ _getting_oa.pdf.
nancy maron and k. kirby smith, current models of digital scholarly communication: results of an investigation conducted by ithaka for the association of research libraries (washington, dc: association of research libraries, ), https://eric.ed.gov/?id=ed ; diane harley, sophia krzys acord, sarah earl-novell, shannon lawrence, and c. judson king, assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines (berkeley: center for studies in higher education, university of california, ), https://escholarship.org/uc/item/ x g; sophia krzys acord and diane harley, ‘credit, time, and personality: the human challenges to sharing scholarly work using web . .’ new media & society , no. ( ): – .
harley et al., assessing the future landscape of scholarly communication; acord and harley, ‘credit, time, and personality.’
the public knowledge project (pkp), a multi-university research initiative founded in , has developed software with broad impact to support open access peer-reviewed publishing.
journals relying on pkp’s open journal systems (ojs) proliferated by the late aughts of the twenty-first century. ojs now supports more than journals annually, by pkp’s estimate (https://pkp.sfu.ca/ojs/ojs-usage/), making it the most widely used open access journal publishing system in the world. in addition, pkp’s open monograph press ( ) offers editorial workflows for and publication of long-form, open access, peer-reviewed works. see the public knowledge project, stanford university and simon fraser university library, https://pkp.sfu.ca/.
see fulcrum, university of michigan library, https://www.fulcrum.org/.
see manifold, university of minnesota press, https://manifold.umn.edu/.
see vega academic publishing system, https://vegapublish.com/.
maria bonn, ‘tooling up: scholarly communication education and training.’ college & research libraries news , no. ( ): – .
e.g., mla committee on information technology, ‘guidelines for evaluating work in digital humanities and digital media,’ modern language association, accessed september , , https://www.mla.org/about-us/governance/committees/committee-listings/professional-issues/committee-on-information-technology/guidelines-for-evaluating-work-in-digital-humanities-and-digital-media; ad hoc committee on the evaluation of digital scholarship by historians, ‘guidelines for the evaluation of digital scholarship in history,’ american historical association, accessed september , , https://www.historians.org/teaching-and-learning/digital-history-resources/evaluation-of-digital-scholarship-in-history/guidelines-for-the-professional-evaluation-of-digital-scholarship-by-historians.
pww is an andrew w. mellon foundation–funded initiative at the university of illinois: http://publishingwithoutwalls.illinois.edu/.
the survey used the term digital publishing without any further definition, remaining intentionally broad to evoke respondents’ own understandings of and experiences with publishing.
for multiple-choice questions, the pww research team collectively generated response options and compared them for completeness against protocols from prior published studies on similar topics. the pww team is composed of information professionals, scholars, and publishers, who drew on their experience and knowledge of best practices in constructing the protocol. for each multiple-choice question, respondents were also given the option of providing additional, free-text responses.
in our discussion of survey results in this article, we offer descriptive statistics about responses, with percentages rounded to whole numbers, and quotes from free-text responses that were analysed using open coding strategies drawn from qualitative analysis. the statistics are meant to characterize responses and cannot be used inferentially, given the constraints of what we know about our sample of respondents. due to irb conditions, demographics were collected separately from survey responses and so cannot be related to specific results. when we refer to likert scale categories, terms are displayed in italics. more details for each result are available in a separate survey report written by our research team: pww research group, understanding the needs of scholars in a contemporary publishing environment: survey results (champaign, il: publishing without walls, ), http://hdl.handle.net/ / .
christine m. borgman, ‘the digital future is now: a call to action for the humanities,’ digital humanities quarterly , no. ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html.
jennifer rowley, frances johnson, laura sbaffi, will frass, and elaine devine, ‘academics’ behaviors and attitudes towards open access publishing in scholarly journals,’ journal of the association for information science and technology , no. ( ): – .
harley et al., assessing the future landscape of scholarly communication; acord and harley, ‘credit, time, and personality.’
for both questions (one for print, one for digital), respondents had the option to describe ‘other’ challenges with a free-text response. few respondents selected this option for either question. eight respondents signalled other problems with print, which included ‘onerous peer review processes,’ ‘time in writing,’ and problems with print as a medium, including the need to include digital results or visual evidence beyond what print publications usually accommodate.
four respondents noted other problems with digital publishing, including making publications openly accessible, the perceived low quality of digital publications, and the perceived long-term instability of digital publications.
harley et al., assessing the future landscape of scholarly communication; borgman, ‘digital future is now.’
e.g., kristine k. fowler, ‘mathematicians’ views on current publishing issues: a survey of researchers,’ issues in science and technology librarianship ( ), doi: . /f qn nm; rowley et al., ‘academics’ behaviors and attitudes.’
other studies confirm this potential: e.g., daniel g. tracy, ‘assessing digital humanities tools: use of scalar at a research university,’ portal: libraries and the academy , no. ( ): – .
borgman, ‘digital future is now.’
humanities scholars and library-based digital publishing © university of toronto press doi: . /jsp. . .
of course, some genres are necessarily digital (e.g., websites and blogs). note that a few respondents marked publishing in print for these genres. this error may be an artefact of the matrix presentation of the print/digital question and does not contribute anything meaningful to our results.
see carole l. palmer, ‘scholarly work and the shaping of digital access,’ journal of the american society for information science and technology , no. ( ): – ; kathleen fitzpatrick, ‘peer review,’ chap. in planned obsolescence: publishing, technology, and the future of the academy (new york: new york university press, ); william g. thomas iii, ‘the promise of the digital humanities and the contested nature of digital scholarship,’ in a new companion to digital humanities, ed. susan schreibman, ray siemens, and john unsworth (chichester, uk: john wiley, ), – .
camilla mackay, ‘book reviews and digital scholarship,’ journal of electronic publishing , no.
( ), doi: . / . . ; katrina fenlon, ‘thematic research collections: libraries and the evolution of alternative scholarly publishing in the humanities’ (phd diss., university of illinois at urbana-champaign, ), http://hdl.handle.net/ / ; harley et al., assessing the future landscape of scholarly communication.
stefan wuchty, benjamin f. jones, and brian uzzi, ‘the increasing dominance of teams in production of knowledge,’ science , no. ( ): – ; truyken l. b. ossenblok, frederik t. verleysen, and tim c. e. engels, ‘coauthorship of journal articles and book chapters in the social sciences and humanities ( – ),’ journal of the association for information science and technology , no. ( ): – .
jennie m. burroughs, ‘no uniform culture: patterns of collaborative research in the humanities,’ portal: libraries and the academy , no. ( ): – ; co-authorship in the humanities and social sciences: a global view, a taylor & francis group white paper, , http://authorservices.taylorandfrancis.com/wp-content/uploads/ / /coauthorship-white-paper.pdf; gaby haddow, jianhong xia, and michele willson, ‘collaboration in the humanities, arts and social sciences in australia,’ australian universities’ review , no. ( ): – .
kathleen fitzpatrick and avi santo, ‘open review: a study of contexts and practices,’ andrew w. mellon foundation, , https://mellon.org/resources/news/articles/open-review-study-contexts-and-practices/.
sophie goldsworthy, ‘the future of scholarly publishing,’ oupblog, november , , https://blog.oup.com/ / /future-scholarly-publishing/.
paul spence, ‘the academic book and its digital dilemmas,’ convergence: the international journal of research into new media technologies ( ): – .
for weighted averages, a weighted sum is calculated for each option by taking the number of respondents who selected that option as the highest rank and multiplying that number by the total number of options that people were asked to rank. the number of respondents who selected that option as the next highest rank is then multiplied by the total number of options minus one, and so on. the sum of the products for each option is then divided by the total number of respondents for that question.
thomas padilla, ‘collections as data: implications for enclosure,’ college and research libraries news , no. ( ), doi: . /crln. . . .
bonn and furlough, getting the word out, – , – .
bonn and furlough, – .
fenlon, ‘thematic research collections.’
institutional repositories, research data services, and discipline-based repositories such as humanities commons (https://hcommons.org/core/) aim to support open access publishing and long-term preservation of complex digital objects. large-scale networks, such as the digital preservation network (https://dpn.org/), aim to support preservation for member organizations, beyond the life span of individual institutions. however, preservation for complex digital publications (those that include multimedia resources, interactive or dynamic content, and linked or embedded external resources) remains widely unsupported.
see, respectively, https://hcommons.org/ and http://mediacommons.futureofthebook.org/.
e.g., eleonora belfiore, ‘“impact,” “value,” and “bad economics”: making sense of the problem of value in the arts and humanities,’ arts and humanities in higher education , no. ( ): – ; jon parrish peede, ‘the humanities in relationship’ (presentation, national humanities conference, boston, ma, november ).
daniel c. mack and craig gibson, interdisciplinarity and academic libraries (chicago: association of college and research libraries, ).
carole l. palmer and katrina fenlon, ‘information research on interdisciplinarity,’ in the oxford handbook of interdisciplinarity, ed. robert frodeman, julie thompson klein, and roberto c. s. pacheco (oxford: oxford university press, ), – .
heather piwowar et al., ‘the state of oa: a large-scale analysis of the prevalence and impact of open access articles,’ peerj ( ), doi: . /peerj. .
see the association of university presses, subject area grid, http://www.aupresses.org/images/stories/documents/ subject_area_grid_final_ .pdf.
ruth scurr, ‘academic publishers are struggling to publicise more accessible books, many of which never reach wider readership,’ times higher education supplement, march , .

mapping movement on the move: dance touring and digital methods
harmony bench, kate elswit
theatre journal, volume , number , december , pp. - (article)
published by johns hopkins university press
doi: https://doi.org/ . /tj. .
for additional information about this article: https://muse.jhu.edu/article/
theatre journal ( ) – © by johns hopkins university press

harmony bench is an assistant professor of dance at the ohio state university and coeditor of the international journal of screendance (with simon ellis). her research sits at the intersections of dance, media, and performance studies, with a recent turn toward leveraging digital tools for scholarly inquiry. her writing has appeared in numerous edited collections, as well as in the international journal of performance arts and digital media, participations, and performance matters, among others. upcoming projects include a book tentatively titled dance as common: movement as belonging in digital cultures, as well as mapping touring, a digital humanities and database project focused on the performance engagements of early twentieth-century dance companies.

kate elswit is a reader in theatre and performance at the royal central school of speech and drama, and is the author of watching weimar dance ( ) and the forthcoming theatre & dance. she has won three major awards for scholarly publications (the gertrude lippincott award from the society of dance history scholars, the sally banes publication prize from the american society for theatre research, and honorable mention for the joe a. callaway prize), and her research has been supported by a marshall scholarship, a postdoctoral fellowship in the andrew w. mellon fellowship of scholars in the humanities at stanford university, and the lilian karina research grant in dance and politics. her essays have appeared in tdr, modern drama, art journal, theatre journal, performance research, dance research journal, and new german dance studies. she also works as a choreographer, curator, and dramaturg.
mapping movement on the move: dance touring and digital methods
harmony bench and kate elswit

in a scrapbook of images compiled by musical conductor alexander smallens, who traveled with anna pavlova on her company’s tours to central and south america and the caribbean, there are two photographs of dancers giving themselves a ballet barre on what smallens indicates is the french liner antilles taking the company from trinidad to martinique (fig. ). these images offer a behind-the-scenes glimpse of how the dancers spent their time between performance engagements; in so doing they highlight the complexity of touring as an object of study. a broad account of how, why, and by what means dances travel requires that scholars attend to the lived, day-to-day experiences of multiple bodies, together with the financial, technical, and political infrastructures that support such movement moving. in this essay we propose that a better understanding of the transnational networks of dance touring is critical to placing dance within larger theatrical and cultural systems. digital research methods can work in tandem with more traditional scholarly ones to manage the scale of data truly necessary to model traveling dance in terms of what we call “dynamic spatial histories of movement.”

support for research leading to this essay was provided by a digital humanities summer grant from the faculty of arts at the university of bristol, and the office of research’s grants for research and creative activity in the arts and humanities at the ohio state university.

despite the existence of many projects of archival digitization and online presentation in the field of dance, we are not aware of any projects involving historical cultural analytics and inquiry-driven visualization of historical data in dance studies besides our own.
we thus build on the foundations of our ongoing projects while drawing into conversation the ways in which digital methods can facilitate large-scale comparative analyses of the mobilities evidenced in the phenomenon of dance touring.

this essay triangulates its discussion of digital humanities and theatre studies with dance. although we focus here on disciplinary overlaps and the cross-pollination of digital research methods, we believe that such methods can facilitate future cross-genre studies of the performing arts as a whole. in the process, we engage with scholarship on literature, geography, and particularly theatre, which shares many concerns regarding performance as live event and its re-presentation over time in different places and forms.

there is huge potential for analyses of touring to help us understand larger ecosystems of performance, past and present. for example, most theatres and agents present multiple genres of performance, and many artists themselves cross between dance and theatre, as well as between elite and popular stages. at the same time, dance-based perspectives, including dance ethnography, can extend theatrical discussions of travel of, and even as, embodied practice. whereas ethnographic approaches have primarily been the domain of performance studies rather than theatre studies, the prevalence of anthropology as one of the foundational pillars of the field of dance studies means that such methods are more intertwined with how dance scholars conceive of dance practices, whether onstage or off. together, the fields of theatre and dance can not only draw from the digital humanities, but also propose new means to consider embodied experience in terms of dynamic spatial histories of movement. such approaches can facilitate dance research, among other things by enabling discovery and display at a scale not available in analog media.

figure . dancers in anna pavlova’s company practice at a makeshift barre onboard a ship.
(source: image courtesy of the jerome robbins dance division of the new york public library for the performing arts, new york city, reprinted with permission.)

for example, eiko and takashi koma otake, eiko & koma ( ), available at http://eikoandkoma.org/home; siobhan davies, sarah whatley, et al., siobhan davies replay: the archive of siobhan davies dance ( ), available at http://www.siobhandaviesreplay.com/; william forsythe, maria palazzi, norah zuniga shaw, et al., synchronous objects for one flat thing, reproduced by william forsythe ( ), available at http://synchronousobjects.osu.edu/.

a few years ago we discovered that we were embarking on similar projects of using digital tools to engage with dance history, specifically histories that involve what we now call “movement on the move.” we came to our projects from different backgrounds. kate was looking to digital methods as a way to tackle the particular historical problem of tracing dance’s complex global networks and infrastructures, and experimenting with various visualizations that would enable her to better understand and account for the onstage and offstage operations of dance’s transnational circulation. harmony had come to touring as a way to historicize dance’s screen-based transmission, and to leverage digital humanities research for dance studies, building a series of datasets to track staged repertory on tour. both of us are, however, interested in the circulation of dance, and specifically how digital tools can elucidate the ways in which touring has functioned in dance’s transnational history. there is something telling in the fact that our two unique, dance-based cultural analytics projects happened to converge on the same nexus of ideas.
as we elaborate, the problems of infrastructural networks and dance’s transmissions preceded the choice of digital methods, which enable us to develop new and complementary lenses from which to attend to critical scholarly horizons. we argue that a better understanding of touring is necessary to account for dance’s global nature, and that the scale and distribution demanded of this research require dynamic spatial histories of movement supported by digital methods. given the scarcity of scholarly projects combining dance history with digital research, we began to collaborate in order to both consolidate our limited resources and expand our scope of intellectual inquiry in an uncharted field of research. for a first phase of research, we brought our respective projects into alignment. we designed a comparative structure in which each of us worked independently on one of two wartime tours in south america: pavlova’s company tour during world war i, and the american ballet caravan tour during world war ii. in so doing we wished to clarify our individual projects’ aims and also to test their limits and expansiveness together in relation to broader scholarly initiatives. what follows is the first print output of this collaborative work in which we explore how digital modes of analysis can expand understandings of dance’s transnational circulation, while we begin to articulate rigorous practices by which to do so. throughout this essay we use these early explorations to argue for the urgency of tracing such dynamic spatial histories of movement on the move. the next section expands on the methodological underpinnings of this work. from there, we turn to two key tools—the database and then the map—which we consider to be vital to furthering our understanding of dance touring. one of the advantages of working together in a peer-review collaboration has been to help demystify the celebratory attitude that sometimes accompanies digital making. 
for the purposes of this essay, except where indicated, we are working with unpublished manuscript materials that have been collected for our respective projects from the archives at the jerome robbins dance division at the new york public library; the jerome lawrence and robert e. lee theatre research institute at the ohio state university; the new york city ballet archives; and the rockefeller archive center. further details, maps, and datasets on american ballet caravan can be found in kate elswit, moving bodies, moving culture, available at https://movingbodiesmovingculture.wordpress.com; details, maps, and datasets on anna pavlova can be found in harmony bench, mapping touring: dance history on the move, available at https://harmonybench.wordpress.com. we are also in the early stages of a new, jointly initiated project titled “dance in transit,” funded by a battelle engineering, technology, and human affairs grant from the ohio state university.

it is relatively easy to use freely available online software to build a visual essay or multimedia narrative, such as an aesthetically pleasing historical map with some annotations along the lines of “this happened,” or “here is a picture connected to this location,” or even “this was supposed to happen but didn’t,” and then stop. instead, we need to ask whether and how digital tools can be leveraged to support scholarly interpretation and analysis by revealing things we do not already think we know. our conversations have stressed the importance of not becoming enamored of digital research methods for their own sake, even as digital approaches to scholarship are changing the forms that humanistic inquiry can take. as barbara herrnstein smith puts it, “[n]ew methods enable new questions to be posed and old answers to be sharpened or corrected.
but in any field of knowledge production, significant questions come out of ongoing interests and problems, not usually just methods as such.” although we may envisage different scholarly outputs for our individual projects, we share a core interest in the potential for such digital research methods to bring together and examine multiple complex datasets related to dance touring. this vastly expands scholarly understandings of dance’s mobility, for example, by destabilizing accounts of transnational contact that privilege certain large metropolitan areas. at the same time it also enables the discovery of new historical interconnections and patterns that can, for example, supplement and contextualize reviews and other firsthand accounts among audience members. it can point to convergences in patterns of travel, shedding light on transportation networks and infrastructures. and it can motivate new dimensions of audience analysis through the lens of company repertory. in this way we can extend our capacity to place individual performances and audiences within larger performance ecosystems and global networks of touring.

toward dynamic spatial histories of movement

in her overview of dance studies janet o’shea contends that four intellectual arenas laid the foundation for the discipline that emerged as a field from their points of overlap: namely, anthropology, dance criticism and analysis, philosophy, and history. of these, ethnographic and historiographic tendencies have been the most prominent in what o’shea calls “new dance studies,” marked by the simultaneous pursuit of choreographic and cultural analysis, which she aligns with the dance scholarship of the late s. as dance scholarship has continued to focus on the cultural politics of dance, and in particular the politics of representation, questions regarding how dance practices travel have emerged.
priya srinivasan notes in her analysis of the transnational circulation of the labor of classical indian dance that “[d]ance is embodied and passes from body to body, whether we like it or not.” the concern over how dance travels, “from body to body, whether we like it or not,” has been generative for the field. the focus of much of this scholarship, however, has addressed the social responsibility of such travel, and in particular the colonial, neocolonial, and neoliberal politics that situate the circulation of dance within a framework of cultural appropriation. whether focused on the contemporary moment or on the recent or distant past, such research is taking shape against a background in which digital and social media are profoundly restructuring how and where dance circulates, and how quickly movement practices spread.

on aesthetic considerations within data visualization and information design, see orit halpern, beautiful data: a history of vision and reason since (durham, nc: duke university press, ).
barbara herrnstein smith, “what was ‘close reading’? a century of method in literary studies,” minnesota review , no. ( ): – , quote on (original emphasis).
janet o’shea, “roots/routes of dance studies,” in routledge reader in dance studies, ed. alexandra carter and janet o’shea, nd ed. (new york: routledge, ), – , quote on .
priya srinivasan, sweating saris: indian dance as transnational labor (philadelphia: temple university press, ), .
work such as ours owes a debt to deirdre sklar’s observation that “[o]ne has to look beyond movement to get at its meaning,” which inaugurated a turn to offstage analyses in dance studies, beginning with ethnographic accounts. see sklar, “five premises for a culturally sensitive approach to dance,” in moving history/dancing cultures: a dance history reader, ed. ann dils and ann cooper albright (middletown, ct: wesleyan university press, ), – , quote on .
popular media screenscapes have reoriented the circulation of dance, shifting its presumed locus from the concert stage or dance club to computer and television screens, through which videos are shared widely and rapidly. the dramatic nature of this shift invites scholars to examine previous modes of dance’s dissemination, and dance touring is an obvious place to begin.

sociologist john urry points out that the movement of bodies is not necessarily faster than other global processes, given such border mechanisms as passports and visas, but that the interactions of subjects tend to be privileged by scholars while the infrastructures that enable such forms of exchange are overlooked. today, a telecommunications network provides the infrastructure for bringing dance to casual and “serious” viewers alike, from youtube and vimeo to live broadcasts of performances in high-art venues. postal and telegraphy services previously facilitated such virtual communication, supporting what bruno latour has called “long distance networks.” but for dance it was more specifically the development of transportation technologies that connected cities and theatres through which performers could circulate. although this transportation network operated at slower speeds and moved people rather than data, it also disseminated the works and practices of cultural producers, thereby creating new geographies of knowledge and practice. the phenomenon of dance touring thus asks us to consider the role of such infrastructural factors as transportation in disseminating movement practices by enabling bodies to circulate as mobile contact zones consuming, absorbing, and spreading aesthetic and cultural practices.

steven harris offers a schema for understanding how bodily practices travel when he breaks down the geography of knowledge into three analytic approaches, which he explores in relation to jesuit missionaries and the advancement of western science.
the first he describes as a “static geography of place” that attends to the development see brenda dixon gottschild, digging the africanist presence in american performance: dance and other contexts (westport, ct: greenwood press, ); anthea kraut, choreographing copyright: race, gender, and intellectual property rights in american dance (new york: oxford university press, ); and jacqueline shea murphy, the people have never stopped dancing: native american modern dance histories (minneapolis: university of minnesota press, ). see, for example, harmony bench, “’single ladies’ is gay: queer performances and mediated mas- culinities on youtube,” in dance on its own terms: histories and methodologies, ed. melanie bales and karen eliot (new york: oxford university press, ), - , and “monstrous belonging: performing ‘thriller’ after / ,” in the oxford handbook of dance and the popular screen, ed. melissa blanco borelli (oxford: oxford university press, ), – . john urry, mobilities (cambridge, uk: polity press, ), . bruno latour, science in action: how to follow scientists and engineers through society (cambridge, ma: harvard university press, ), . on the importance of telegraphy for theatre scholarship, see marlis schweitzer, transatlantic broadway: the infrastructural politics of global performance (new york: palgrave macmillan, ), – . earlier postal networks have also provided the basis for digital scholarship, including paula findlen, dan edelstein, nicole coleman, et al., mapping the republic of letters project, stanford university ( ), available at http://republicofletters.stanford.edu/. / harmony bench and kate elswit of knowledge in situ. where were key figures when they engaged in the discovery/ production of knowledge? the second harris describes as a “kinematic geography of movement” that considers where practices and approaches came from and how knowledge is disseminated. 
his third approach to analyzing the geography of knowledge encompasses “the dynamics of travel: why and by what means did all these movements take place? what was the anima motrix responsible for the multiple peregrinations of the elements of knowledge?” in this final sense, harris suggests, a scholarly account must bring historiography to bear on “these geographies of place, movement, and social organization.” as stephen greenblatt points out, scholars must take such mobility in a highly literal sense, including physical and institutional barriers to and conditions of movement: “[o]nly when conditions directly related to literal movement are firmly grasped will it be possible fully to understand the metaphorical movements” (for example, between center and periphery).

dance historians and dance ethnographers have emphasized, in their own ways, the static and kinematic dimensions of a geography of knowledge, analyzing dance practices and dance works at the site of their origination or occurrence, and thinking through questions of cultural diaspora, human migration, and political exile through dance movement. such studies of dance’s circulation have often emphasized the ways in which dance moves through time, more than how it crosses space, with documentation and the body as archive serving as prominent frameworks for pursuing dance histories. the spatial turn in the humanities withdrew choreography from dance as its exclusive medium of manifestation, and enabled choreography as a concept to operate within a larger conversation regarding movement, mobility, travel, and displacement. but with few exceptions, dance scholars have left travel itself virtually untouched.
at a moment when key concepts associated with dance, such as mobility, are increasingly used to articulate the flows of contemporary labor and migration, deeper understandings of dance’s own mobility position dance studies to engage with and further develop interdisciplinary scholarly conversations of “kinopolitics.”

steven j. harris, “long-distance corporations, big sciences, and the geography of knowledge,” in postcolonial science and technology studies reader, ed. sandra harding (durham, nc: duke university press, ), – , quote on .
ibid.
ibid.
ibid.
stephen greenblatt, “a mobility studies manifesto,” in cultural mobility: a manifesto (cambridge: cambridge university press, ), – , quote on .
on the relationship between cross-cultural journeys and fantasies of time travel, see michelle clayton, “touring history: tórtola valencia between europe and the americas,” dance research journal , no. ( ): – .
a sample of the extensive writing on danced reengagements with archives appears in mark franko, ed., the oxford handbook of danced reenactment (oxford: oxford university press, forthcoming).
notable exceptions include analyses of seventeenth- and eighteenth-century choreography and cartography, such as susan leigh foster, choreographing empathy: kinesthesia in performance (abingdon, uk: routledge, ); dance’s relationship to twentieth-century tourism, such as jane desmond, bodies on display from waikiki to sea world (chicago: university of chicago press, ); and the twentieth- and twenty-first-century use of dance in american cultural diplomacy, such as naima prevots, dance for export: cultural diplomacy and the cold war (middletown, ct: wesleyan university press, ), and clare croft, dancers as diplomats: american choreography in cultural exchange (new york: oxford university press, ).
thomas nail, the figure of the migrant (stanford, ca: stanford university press, ), . see also tim cresswell, on the move: mobility in the modern western world (new york: routledge, ).
the question of travel, and further of infrastructures facilitating such travel, is more comprehensively addressed in theatre and performance studies. for example, in transatlantic broadway marlis schweitzer attends to the nonhuman entities, including technological advances in transatlantic travel, which expanded theatre networks prior to world war i. her project is among a growing number in theatre and performance studies to draw on network analysis in order to situate historical and contemporary performance events and their global circulation within larger economic and cultural systems. such infrastructures of travel are particularly important to dance, which is even less likely to rely upon a text that can circulate independently of live performers. at the same time, as christopher balme points out, even theatrical touring remains the most under-researched manifestation of modern transnational, or even global, theatre practices. as these scholars demonstrate, performing artists covered transnational ground via the complex relational structures of networked systems. urry calls for a study of “network capital” in order to point to “the real and potential social relations that mobilities afford.” underlying mobilities in themselves, he argues, do nothing; it is necessary to account for the social consequences of such mobilities. to think about touring in this way is not only to tell the story of a star performer and her most famous audience members, but to consider the ripple effects and residual affects of many travelers’ arrivals, departures, stays, and returns. such a distributed world is suggested by clare croft’s dancers as diplomats, which incorporates an ethnographic approach to the day-to-day life of dancers on tour, using the experiences of multiple dancers as a means to counter top-down narratives of cultural diplomacy in which national ideologies tend to overdetermine interpretation.
further dimensions of touring’s network capital include the many travelers’ prior tours, since company membership changes over time, and the relationship between one company and those that preceded or followed its performance engagements in the same cities or even theatres. for example, in terms of american ballet caravan’s south american tour, the wardrobe mistress, a russian émigré to the united states, had also been with george balanchine on a previous south american tour, which visited many of the same cities. likewise, when pavlova’s company toured south america simultaneously with the ballets russes during , the two companies sometimes competed for venues, as lamented by a dancer of pavlova’s who noted that they had to open in buenos aires at the teatro coliseo, which she described as the second-best theatre in the city, because sergei diaghilev had already booked the nicer teatro colón. here, the tools of digital cartography can help to elaborate print arguments in visual form; for example, that concert dance has participated in a complex and entangled history of global circulation for a long time. digital scholarship thus contributes to the reorganization of dance history beyond the nation by making visible distributed “micropolitics of exchange.”
schweitzer, transatlantic broadway. for example, jen harvie, fair play: art, performance and neoliberalism (basingstoke, uk: palgrave macmillan, ); leo cabranes-grant, “from scenarios to networks: performing the intercultural in colonial mexico,” theatre journal , no. ( ): – ; elizabeth maddock dillon, new world drama: the performative commons in the atlantic world, – (durham, nc: duke university press, ). as compared to, for example, the international expansion of british theatre markets during the late victorian and edwardian periods through new distribution mechanisms that were free of the constraints of human travel. see tracy c. davis, the economics of the british stage, – (cambridge: cambridge university press, ), . christopher balme, “the bandmann circuit: theatrical networks in the first age of globalization ,” theatre research international , no. ( ): – , quote on . urry, mobilities, . ibid. croft, dancers as diplomats. letter from mascot moskovina to billie morton dated august , , mascot moskovina, available at http://scalar.usc.edu/works/mascot-moskovina/letter-aug- - -page- .
at the same time that digital tools can reinforce and elaborate other scholarly work, they also provide new methods for analysis. visualizing spatial history, as richard white has argued, is productive not primarily as a vehicle “to communicate things that you have discovered via other means”; rather “[i]t is a means of doing research; it generates questions that might otherwise go unasked, it reveals historical relations that might otherwise go unnoticed, and it undermines, or substantiates, stories upon which we build our own versions of the past.” in the case of dance touring, the scale of digital analysis, particularly organized as a database and represented as a map, expands our capacity to trace real and potential networks of relation. whether deployed as a means of exposition or exploration, such mapping can thus “show alignments, reveal patterns and display affinities” by managing large amounts of data in a manner not easily achieved via other means. to take one dataset as a sample, each time american ballet caravan crossed national and even municipal boundaries, the company presented officials with substantial documentation of travelers’ identities. from this paperwork we can collate a table of personal information regarding the tour’s travelers who were born in thirty-six cities in eight countries, although these anchor statistics cannot, of course, present the full story of the company’s workers and their backgrounds.
a dancer born in london was domiciled in montreal, worked in the united states under a quota visa, but traveled under a valid canadian passport. two dancers born in different cities in germany required extensive documentation, since they were officially listed as “stateless.” by the time of the tour american ballet caravan’s travelers held citizenship in four countries, plus the stateless germans, whose passports had been invalidated. to trace a map of the world that is capable of representing the places of birth, citizenship, and previous journeys of all travelers is not just a demographic project; instead, it visualizes touring as a distributed transnational network of individual agents rather than a single clump shuttled around in a loop in service of a company. dance or dancers as circulating entities could thus be explored in a manner similar to export commodities, through critical visualizations that represent how presenters, company managers, and dancers maneuver in a complex, global dance market.
see, for example, kate elswit, “the micropolitics of exchange: exile and otherness after the nation,” in the oxford handbook of dance and politics, ed. randy martin, gerald siegmund, and rebekah kowal (oxford: oxford university press, forthcoming, ). on touring, see also elswit, watching weimar dance (new york: oxford university press, ), – . richard white, “what is spatial history?,” para. . the spatial history project, stanford university ( ), available at https://web.stanford.edu/group/spatialhistory/cgi-bin/site/pub.php?id= (emphasis in original). richard rogers, natalia sánchez-querubín, and aleksandra kil, issue mapping for an ageing europe (amsterdam: amsterdam university press, ), . see new york city ballet archives, box rg - , folder . assigning individual performers and others virtual international authority file (viaf) numbers could facilitate the tracking of well-known persons across the many companies with which they worked.
the harvard university center for economic development’s globe of economic complexity offers one example of visualizing global exports. notably, it includes no arts or heritage objects. see owen cornec and romain vuillemot, “globe of economic complexity” ( ), available at http://globe.cid.harvard.edu/.
spatial history addresses, in part, how particular sites change over time, emphasizing, for example, how changes in the landscape or architecture or in population density and demographics are registered in a given place. building on this basic premise, dynamic spatial histories add the dimension of movement, emphasizing such patterns as migration, trade and travel routes, war, or even the spread of disease, in particular by utilizing digital visualizations, animations, or interactive elements. in advocating for dynamic spatial histories of movement, we are interested in putting these same resources to work (database and map alongside narrative interpretation) in order to think not only about how people and objects travel from point to point, but also about their embodied ways of moving, both in place and in transit. the following two sections elaborate first the building of databases as engines capable of managing such large scales of movement, and then their exploratory visualization in the form of maps that enable access to different experiences of space. these tools allow us to construct narratives that emphasize movement in circulation, not just isolated in moments of contact or encounter, but rather as part of larger dynamic systems in which individuals function as vectors of movement. dance studies is particularly well-suited for such work because the field is accustomed to the contradiction of bodies as agents of movement (colonizers, proselytizers, negotiators, protesters) that are nevertheless subject to social, political, and aesthetic choreographies.
movement moves across and through; dance repertories move through dancing bodies as performers learn new dances, just as the dancers themselves move through various cultural landscapes, leaving gestures, steps, and choreographies in their wake. and all these are embedded within infrastructures of mobility, from transportation and communication to impresarios and presenting networks. by bringing together the powerful combination of database and map, together with narrative contextualization and interpretation, we propose that dynamic spatial histories of movement can address static, kinematic, and dynamic geographies of knowledge as they pertain to bodies in motion. further, analyzing movement in historical and cross-cultural contexts challenges digital humanities to grapple with the phenomenon of live bodies, which are not fixed in print or image, but carry, borrow, and share techniques, styles, theories of corporeality and composition, gestures, and ways of being as they travel.
dance’s datasets
every historian has the experience of sitting in an archive and scribbling lists in order to sort through detailed information, creating idiosyncratic databases that almost no one else can use. the power of the digital database lies in the standardization of data, which facilitates access and usability, and its ability to manage datasets of whatever size. supported by databases at both fine-grained and sweeping scales, digital scholarship can then bring unexpected details to the foreground. for example, in cataloging the repertory performed on pavlova’s south american tour show by show, her signature piece, “the dying swan,” emerges as the most frequently performed. while this is unsurprising, what is in fact more interesting is that “gavotte pavlova,” “holland dance,” and “pizzicato” were danced almost as often (fig. ).
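A show-by-show tally of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' own workflow: it assumes each performance program has been transcribed as a list of piece titles, and the `programs` entries and the helper name `repertory_frequency` are invented placeholders rather than data or tooling from the Pavlova dataset.

```python
from collections import Counter

# Hypothetical transcription: one list of piece titles per performance program.
# These entries are placeholders for illustration, not archival data.
programs = [
    ["Piece A", "Piece B", "Piece C"],
    ["Piece A", "Piece C"],
    ["Piece A", "Piece B", "Piece D"],
]


def repertory_frequency(programs):
    """Count the number of programs on which each piece appears."""
    counts = Counter()
    for program in programs:
        # set() ensures a piece listed twice on one program is counted once
        counts.update(set(program))
    return counts


frequency = repertory_frequency(programs)
for title, shows in frequency.most_common():
    print(f"{title}: {shows}")
```

Counting per program rather than per listing is a design choice; a study interested in encores or repeated numbers within one evening would drop the `set()` step.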
how then does “the dying swan” emerge as the most influential and memorable of pavlova’s repertory if, at least within the context of her south american appearances, the frequency of its performance is not substantially greater than other items in her repertory? certainly, pavlova’s promotional materials capitalized on and solidified the connection, but in order for dance scholars to see past a marketing strategy they need ways to attend to non-canonical aspects of pavlova’s repertory that were equally important to shaping and reinforcing global modernism, although they passed by unnoticed. a database extends beyond the scope of an individual artist to organize and manage data that represent larger systems and networks of performance. by focusing, for example, on the intertwined relationship between repertory and touring, data analysis can also support claims of dance’s importance to (and not only its implication in) the complex exchange of embodied cultural practices. pavlova was not simply an ambassador of ballet, narrowly construed; throughout her career she staged a great variety of folk dances and orientalist numbers in addition to romantic and classical ballets, with a repertory of as many as forty full-length ballets and divertissements from various genres, performed to classical music as well as popular tunes. while on tour pavlova and her dancers both gathered and disseminated movement. like many dancers, she also commissioned music and created new pieces as she toured. for example, in in santiago, chile, pavlova premiered a piece called “el sueño” to music by a señora fernandez from valparaiso, where her company had been a few weeks prior. additionally, after her stay in mexico, she added a suite of mexican folk dances, including “el jarabe tapatío” (mexican hat dance), to her repertory.
because pavlova presented a range of material and dance styles (filtered through the movement vocabulary of ballet), her audiences were witness to and participants in a global circulation of music and dance. a historical narrative that emphasizes choreographers and their works independent of place, or which highlights only metropolises, fails to account for the impact of travel, audiences, and local arts communities on the development of repertory in touring companies.
figure . this table, generated in tableau public, and superimposed query from palladio are based on a dataset in progress. they show the pieces in pavlova’s repertory that were most frequently performed while her company was in south america during – . see https://public.tableau.com/profile/hb #!/vizhome/shared/s wfst j (source: image and dataset by harmony bench.)
for an analysis of how pavlova functioned within a narrative of modernization among the elite classes of mexico city, see jose l. reynoso, “choreographing modern mexico: pavlova in mexico city ( ),” modernist cultures , no. ( ): – .
as the example of pavlova makes visible, touring as a phenomenon both takes advantage of transnational networks and furthers the economic and cultural project of globalization. yet, the global scale of dance’s circulation through these networks far exceeds any single artist’s participation. the ability of databases to store and organize many complex pieces of information greatly enhances our capacity to track the touring and performance engagements of numerous artists over time, to trace shifts in repertory and company membership, and to note the proximity of multiple cultural agents in time and space. furthermore, databases can support analyses of local audiences and what they might have seen in the past, and enable comparisons among cities and venues in terms of repertory selected for presentation.
working at a large scale, across many artists and many decades of touring, one can more fully grasp the impact of dance on the global arts landscape, and the centrality of global connectedness to the development and transmission of dance practices in modern history. there is enormous potential in what has been called “distant” analysis to sift through large quantities of cultural data to observe the patterns that emerge. yet, such work requires that there be large quantities of fairly “clean” (that is, relatively uniform and hopefully accurate) data through which to sift. many cultural analytics projects thus focus on the analysis of born-digital or previously digitized datasets, both past and present, whether textual or image-based. by contrast, especially given the limited amount of digitized artifacts or pre-collated data regarding dance, it is important that we do not limit our scholarly scope by letting the data lead, whether by exclusively relying upon the content of archives privileged enough to have been digitized or using only available motion capture and video recordings that fix single iterations of a dance into an extreme historical present. instead, we begin by collecting our data through traditional archival research with the intent of eventually making the datasets themselves available to other researchers. the resulting datasets may be quite small initially, given the labor intensiveness of collection, but they reflect an investment in the future of a field where historical datasets are scarce. our respective work on dance touring has involved manually collating datasets in archives and special collections, culling from different sources that can support analyses of onstage and offstage elements related to performance.
for example, performance programs provide extensive information about what appears onstage, from dancers and their roles in repertory performed, to choreographers and designers, to location information, such as the towns and theatres in which performances were given.
ibsenstage and linked jazz are two projects in theatre and music, respectively, that capture the potency of data-driven projects for the performing arts. the university of oslo’s ibsenstage exhaustively documents the countries and theatres in which the playwright’s works have appeared, the dates of performances, and those who have contributed to staging the work over time (https://ibsenstage.hf.uio.no/). linked jazz: revealing the relationships of the jazz community works laterally, building and visualizing relationships among jazz musicians as a social network, which is both supplemented and driven forward by a combination of open data and transcribed oral histories (https://linkedjazz.org/). see franco moretti, distant reading (london: verso, ). on “distant reading” and its provocations and complications with respect to “close reading,” see herrnstein smith, “what was ‘close reading’?”
performance programs have the benefit of being highly structured and presenting information in a predictable, even formulaic manner. data from such programs can be supplemented with announcements for upcoming shows, as well as newspaper advertisements, previews, reviews, tour itineraries, season subscription mailers, and performers’ diaries and scrapbooks. the archives that lend themselves to analyses of offstage activity are often much less standardized, although the data themselves can be likewise broken down into quantitative structures.
financial documents can provide profit-and-loss statements on a night-by-night or city-by-city basis, which include the dates, times, and types of performances (matinee or evening, subscription or benefit), as well as the costs and types of transportation (taxi, bus, train, plane, and boat fares, connection by connection). there may be further supporting lists, such as the personal information collected on company members for group visas, the bonds paid to cross certain international borders, or the vaccinations needed prior to departure. material histories are evidenced by props and costume items—for example, the amount of makeup, shoes, or sunscreen anticipated for a single trip. these archival onstage and offstage datasets on dance touring can also be combined with datasets that have been collected for other purposes. clearly, even establishing geo-location data for a town or specific venue requires cross-referencing with existing datasets. then there are further datasets that can not only support, but also enhance the understanding of touring data, such as a city’s demographic information, active transportation routes, and migration patterns. for example, analyzing the ratio of performances given to the populations of towns on a tour reveals a number of cities where events per capita deviate from the mean; this can suggest the need for further attention to the political, cultural, or artistic value of those sites in particular. more importantly, setting dance-touring pathways alongside well-known burlesque wheels and vaudeville circuits, not to mention touring theatre productions, can paint a more thorough portrait of the performing arts ecosystem as a whole, as well as illuminate specific interconnections. who performed in the same cities and theatres not only within, but across forms? which routes were the most common? who performed different dances to the same music or the same dances to different music? 
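The per-capita comparison mentioned above can be illustrated with a short sketch. It is offered only as one possible reading of "deviate from the mean": the one-standard-deviation `threshold`, the per-100,000 scaling, the function name `per_capita_outliers`, and every figure in `stops` are assumptions made for illustration, not values drawn from either tour.

```python
from statistics import mean, pstdev

# Hypothetical tour stops: (town, number of performances, population).
# All figures are invented placeholders, not archival data.
stops = [
    ("Town A", 4, 400_000),
    ("Town B", 2, 200_000),
    ("Town C", 3, 300_000),
    ("Town D", 6, 50_000),
]


def per_capita_outliers(stops, threshold=1.0):
    """Return towns whose performances per 100,000 residents lie more than
    `threshold` standard deviations from the mean rate across the tour."""
    rates = {town: shows / population * 100_000
             for town, shows, population in stops}
    average = mean(rates.values())
    spread = pstdev(rates.values())
    if spread == 0:
        # every town has the same rate, so nothing stands out
        return []
    return [town for town, rate in rates.items()
            if abs(rate - average) / spread > threshold]


outliers = per_capita_outliers(stops)
print(outliers)
```

A flagged town is a prompt for closer archival attention rather than a finding in itself; the explanation for an unusual rate still has to come from the historical record.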
and what are the implications for disciplinary and interdisciplinary performing arts research? whatever the source, creating new datasets from archival materials rather than “scraping” already digitized information reminds us, as lev manovich cautions, that “data does not just exist – it has to be generated. data creators have to collect data and organize it, or create it from scratch.” similarly, lisa gitelman and others have argued that data is never “raw”; regardless of warnings that datasets are presented “as is,” with inconsistencies, inaccuracies, and biases, the larger the dataset, the more raw (neutral, unbiased, or objective) it appears. gitelman and virginia jackson argue that scholars must therefore account for not only how disciplines “have imagined their objects and how different datasets harbor the interpretive structures of their own imagining,” but also, playing on claude lévi-strauss’s distinction between the “raw” and the “cooked,” how data are “variously ‘cooked’ within the varied circumstances of their collection, storage, and transmission.” although they have the aura of neutrality, data are already interpretations, and they carry with them cultural values and assumptions that guide identification and assembly within scholarly research; as interpretations, data reflect the biases of institutions and researchers. the “cleaner” and bigger a dataset is, the more likely it is to favor practices and people that are already well-represented in archives. one of the dangers is thus that data-driven methods will reinforce existing structures of power and aesthetico-political hierarchies.
lev manovich, the language of new media (cambridge, ma: mit press, ), . lisa gitelman and virginia jackson, “introduction,” in “raw data” is an oxymoron, ed. lisa gitelman (cambridge, ma: mit press, ), – , quote on .
it is no coincidence, for example, that ballet was a common denominator chosen for our own collaboration; we knew that cultural institutions would have substantial collections of artifacts relevant to its history. accounting for and counteracting biases built into archives remains a significant challenge for digital researchers, and yet it is also possible that the critical interrogation of quantified historical information will enable holes in the archival record to emerge bigger and brighter. bringing cultural criticism to bear on data analysis, critical data studies insists on the nonneutrality of data. such a framework acknowledges that data collection is already an act of interpretation, and similarly acknowledges the limits of data analysis when not accompanied by subject-matter expertise. as critical data studies advocate rob kitchin remarks: “[i]t is one thing to identify patterns; it is another to explain them.” for example, lincoln kirstein, the director of american ballet caravan, noted at various times that one of the goals of its six-month tour of south america was to showcase a kind of american art, while having to do so in places where the frame of reference for ballet was primarily russian.
in one report he suggests that for audiences in buenos aires, “[f]ew believed beforehand there was an american ballet equal to russian ballet,” while in another he offers a caveat for some of the controversies that the tour had faced more generally in south america: “[t]he more conservative [audiences], used to the classic dance of the russian ballet type, neither understood nor approved the modern approach of such ‘typically american’ ballets as billy the kid.” pavlova’s earlier tours make an appearance as one such russian form in a letter that kirstein wrote to nelson rockefeller regarding rio de janeiro: “i knew that our purely american ballets are not liked by the rich conservative public whose idea of dancing is pavlova.” as kirstein’s own comparison suggests, to frame his comments rather than take them at face value requires contextualizing this tour not only in terms of the company’s own touring or repertory, but also the prior tours and repertory of euro-american artists like anna pavlova, as well as isadora duncan, tórtola valencia, maud allan, and the ballets russes, among others. without examining the repertory and routes of these artists and companies, it is difficult to say if kirstein’s comment reflects identifiable aesthetic differences between russian and american modernisms, represents a conflict between classicism (perceived as russian) and modernism (perceived as american), or results from the tour’s status as a prelude to cold war cultural diplomacy. in addition, we need to assess what kirstein knew or did not know about local dance by tracing the ways that central and south american dancers were themselves circulating in the early and mid-twentieth century. building this larger dataset requires time, travel funding, and language competency. clearly, this is a lot of work, but we believe it can extend our scholarly frame of view.
rob kitchin, “big data, new epistemologies and paradigm shifts,” big data & society , no. ( ): – , quote on . lincoln kirstein, “draft of a preliminary report concerning the tour of the american ballet in south america, june–september ” (emphasis in original) and “report on the american ballet caravan covering june–september ,” box , series l (fa ), folder , in nelson a. rockefeller papers, projects, rockefeller archive center. lincoln kirstein, letter to nelson rockefeller, july , , box , series l (fa ), folder , in nelson a. rockefeller papers.
the relationship between the construction of datasets and what is more readily understood as a conventional gathering of historical evidence can be thought of in terms of what sarah bay-cheng describes as seeing “the composite image within the pixelated fragments,” or what martin mueller calls “scalable reading.” advocating for “digitally assisted” analyses of early modern texts, mueller notes that “[d]igital tools and methods certainly let you zoom out, but they also let you zoom in, and their most distinctive power resides precisely in the ease with which you can change your perspective from a bird’s-eye view to close-up analysis. often it is a detail seen from afar that motivates a closer look.” rather than maintain a binary distinction between distant and close reading practices, the paradigm of scalability recognizes both the need and ability to approach a research area at multiple distances, not just close up and far away. further, bay-cheng’s description of a composite emerging from fragments reminds us that we are rarely working with material that neatly conforms to the same scale, nor is a complete picture likely to result from our efforts.
when working with data, this means building nested relationships, such that, for example, a performance venue is located in a city, a state, and a country, and perhaps at a finer scale, at a street address or latitude/longitude coordinate, or at a grander scale, on a continent; but it is also located within a diary, a letter, a series of interactions among travelers, and the like. scalability recognizes that not all temporal and geographic scopes are useful or meaningful for inquiry, and that overlaps and discontinuities are to be expected in composites, as are absences. tethering such a hermeneutics of suspicion to data-driven inquiry is thus critical to employing digital research methods in a humanistic context.
mapping narratives of touring
there is a long history of leveraging analog visual media to help render abstract information like datasets intelligible. media archaeologists and historians of science have extensively documented the roots of contemporary digital visualization tools in prior technologies of representation. but digital visualizations make user interaction and manipulation more explicit and more available than their analog counterparts. in , for example, prior to the broad availability of digital visualization tools, mark monmonier proposed the use of print images in “complementary pairs or triads” in order to enable side-by-side analysis that made information clearer to the reader. digital tools now enable the layering of multiple datasets, bypassing the need for side-by-side comparisons in favor of superimposed images, graphs, and other visual representations of data. for example, figure overlays our respective datasets on the performance locations and trajectories of pavlova’s and american ballet caravan’s tours from – and , respectively.
sarah bay-cheng, “pixelated memories: theatre history and digital historiography” (plenary presentation, astr conference, nashville, november – , ), quote on , academia.edu, available at https://www.academia.edu/ /pixelated_memories_theatre_history_and_digital_historiography. martin mueller, “shakespeare his contemporaries: collaborative curation and exploration of early modern drama in a digital environment,” digital humanities quarterly , no. ( ), available at http://www.digitalhumanities.org/dhq/vol/ / / / .html. ibid., para. . for example, edward r. tufte places the origin of the “data map” in the seventeenth-century combination of cartographic and statistical skills. see his the visual display of quantitative information, nd ed. (cheshire, ct: graphics press, ), . mark monmonier, mapping it out: expository cartography for the humanities and social sciences (chicago: university of chicago press, ), .
figure . using the open-access mapping platform carto, we have plotted key locations of the two tours underlying this essay: the performances given in south america by anna pavlova’s company during world war i and by american ballet caravan during world war ii. a good deal of overlap between the two tours is visible, particularly in coastal cities like lima, buenos aires, and são paulo. see https://kelswit.carto.com/viz/ cc c - d - e -a - e c e ffdb/public_map (source: map and datasets by kate elswit and harmony bench.)
digital visualization further allows for the higher density of information associated with the database to remain available to the eye, at different levels of specificity. recalling greenblatt’s assertion that it is necessary to understand literal movement in order to understand metaphorical movement, digital visualizations like mapping offer new means to extend that literal understanding of how movement moves by reorganizing our representations of global space.
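The nested relationships described earlier, in which a venue sits within a city, a country, and a continent while also being located within diaries and letters, can be modeled with a simple parent-linked record. The sketch below is an assumption-laden illustration rather than the structure of the authors' database: the `Place` class, its field names, and the `sources` strings are hypothetical, and the state level is omitted for brevity. Coordinates are optional so that absences in the record stay visible instead of being filled in.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Place:
    """One level in a nested geography (venue, city, country, continent)."""
    name: str
    level: str                        # e.g. "venue", "city", "country"
    parent: Optional["Place"] = None  # the next broader scale, if recorded
    lat: Optional[float] = None       # left empty when no coordinate is known
    lon: Optional[float] = None
    sources: list = field(default_factory=list)  # diaries, letters, programs

    def lineage(self):
        """Names from this place up to the broadest enclosing scale."""
        place, names = self, []
        while place is not None:
            names.append(place.name)
            place = place.parent
        return names


continent = Place("South America", "continent")
country = Place("Argentina", "country", parent=continent)
city = Place("Buenos Aires", "city", parent=country)
venue = Place("Teatro Colón", "venue", parent=city,
              sources=["hypothetical letter", "hypothetical program"])

print(" < ".join(venue.lineage()))
```

Because each record points only to its parent, a query can zoom from venue to continent or stop at any intermediate scale, which is one way to keep analysis scalable in the sense discussed above.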
spatial historians have underscored the potential of tools like gis for discovering the correlations and connections “among events in space and time for narrative generation,” which also facilitate the generation of dynamic spatial histories of movement. it is important to remember that such events are not static; map-based visualizations are capable of drawing attention to the spatial distribution or concentration of more and less mundane systems and events over time, each of which is itself in motion. the need for such a flexible tool is apparent in exploring the relationships of the pavlova and the american ballet caravan tours to both world wars, respectively. on the one hand, there is clear evidence for the ways in which the distant wars enabled and constrained travel through their impact on global transportation networks. this includes the very choice of south american routes because of the inaccessibility or inadvisability of transatlantic crossings and the personal circumstances of family members, such as dancer ruth page, who was able to join the pavlova company in puerto rico in because, as her mother writes, “they were perfectly free to go ([their] men folk being in france).” but such wartime disruption also appeared later in the tours, as when american ballet caravan could not take its originally planned boat between santos and buenos aires because atlantic shipping schedules had been thrown off, and so it ended up on a spanish refugee ship originally from bilbao. on the other hand, the faraway wars were clearly not the only factor impacting such travels. in fact, for american ballet caravan in , it was the brief ecuadorian–peruvian war that lasted less than a month, but rerouted the tour already underway.
in each of these instances a map alone might mean very little, but the combination of map and narrative explanation together enables a different scope of consideration, one capable of accounting for the constant evolution of interactions between local and global. the multiple framings of global space that these war stories reference can be provisionally organized in terms of what cultural geographer david harvey began to theorize in the s as the overlapping frameworks of absolute, relative, and relational space in order to support his work on space and social justice. in terms of dance touring, absolute space would be the fixed location, such as the private property of the theatre or hotel that is located by a series of bounded territorial designations: on a particular street, in a particular city, and so on. if standard units of measure represent distances between locations in absolute space, by contrast, distance in relative space is measured in terms of time and effort. in harvey’s words, “[t]he movement of people, goods, services, and information takes place in a relative space because it takes money, time, energy, and the like to overcome the friction of distance.” mapping relative space for
this supports the turn from the discrete interactions of national practices to the distributed language of “interweaving” or “entanglement” that exceeds any single participating culture. see, for example, erika fischer-lichte, “interweaving cultures in performance: different states of being in-between,” new theatre quarterly , no. ( ): – .
may yuan, john mcintosh, and grant delozier, “gis as a narrative generation platform,” in deep maps and spatial narratives, ed. david j. bodenhamer, john corrigan, and trevor m. harris (bloomington: indiana university press, ), – .
marian heinly page, “a tour of south america with the pavlova company” ( – ), in ruth page collection, (s) *mgzmd folder m , jerome robbins dance division, new york public library.
david harvey, “space as a keyword,” in david harvey: a critical reader, ed. noel castree and derek gregory (malden, ma: blackwell publishing, ), – , quote on .
dance touring might begin by measuring the proximity of particular cities in terms of the total number of roadways, trains, or other transportation connections, or by the cost of the travel leg that connects them. finally, relational space is created internal to its own processes on the basis of human experiences, such as feeling or memory. harvey’s model is useful for considering the variability of cultural space and thus how visualization tools like mapping can represent multiple geographies of movement on the move. before we use these three types of space to demonstrate some of the ways in which digital spatial analyses can develop critical questions within dance scholarship regarding the nature of touring, a note about mapping itself is important. just as the database is based on curated datasets, the reference map for any such representation of space likewise appears deceptively neutral. spatial history tends to depend on gis-based understandings of space as a more absolute, pre-given cartesian grid, which curtails local specificity and understandings of place. for example, looking through archival materials there is a cultural precision to the granularity of place names like “harlem” and “fifth avenue,” but this needs to be negotiated vis-à-vis the opportunity for a larger connection to new york city in general. in addition, there are professional agreements in the geodata community that have identified generic latitude/longitude coordinates for cities. but in the case of new york city, that point itself is fixed at fifth avenue. what about that exact location qualifies it to stand in for the whole?
likewise, most base maps can only represent a single period of time at once in terms of national borders, railways, major roads, and so on, and yet the very existence of some routes and not others in a given year will have determined a touring company’s pathway. this kind of fixity stands in opposition to the work of contemporary geographers, among others, who understand space relationally as an emergent property. in our projects of tracing dance in transit, it is this latter, place-based, often relative or relational perspective that foregrounds the flow of local–global relations and thus has the potential to demonstrate how the mobility of dance redraws cultural geographies. in order to bring these two perspectives into alignment we are particularly interested in qualitative gis work, among other projects, that “share an assumption that while some kinds of fixity are inherent and unavoidable in gis, there exists a great deal of room for strategic deployments of this fixity, and for iterative adaptations of fixed representations or practices.” this active, ongoing negotiation with fixity strikes us as particularly essential to datasets based on lived experience and bodies in motion, both on- and offstage.
see monmonier, mapping it out, .
harvey later revisited this to propose a nine-square grid in which absolute, relative, and relational spaces were run up against henri lefebvre’s experienced, conceptualized, and lived spaces. see harvey, “space as a keyword,” .
on this contrast, see n. katherine hayles, how we think: digital technology and contemporary technogenesis (chicago: university of chicago press, ), – ; and david j. bodenhamer, john corrigan, and trevor m. harris, eds., deep maps and spatial narratives (bloomington: indiana university press, ).
meghan cope and sarah elwood, “introduction: qualitative gis: forging mixed methods through representations, analytical innovations, and conceptual engagements,” in qualitative gis: a mixed methods approach, ed. meghan cope and sarah elwood (london: sage, ), – , quote on .
to deal with touring in terms of absolute spaces, such cartographic staples as point-to-point lines can help to draw out narratives of touring as something that occurs not only in theatres, but between them. although the american ballet caravan tour was meant to prototype cultural diplomacy, the stage itself was likely not the primary location where such diplomacy occurred. in almost every city there were political struggles over the high cost of tickets, which directly impacted how few people could actually attend. kirstein’s biographer quotes a letter from late in the tour in which he describes the experience of the shows themselves as “like performing in a half-filled house at o’clock in the morning for a ladies club . . . in detroit.” while american ballet caravan’s performances happened in sixteen cities over six months, there were up to two weeks of travel between engagements. this means that the travelers actually passed through many towns and cities en route to their sixteen destinations. overlaying diagrams of multiple companies’ tour routes at once identifies the convergences among them, such as frequent stopovers where exchange took place through dancers and other travelers as consumers rather than producers of culture, even if limited in form to food and souvenirs.
by looking at the larger patterns of offstage life on tour in this way we can begin to locate what greenblatt calls “contact zones” for dance-based exchanges of cultural goods and find new ways to attend not just to the travelers, but to what michelle clayton borrows from mary louise pratt to call the “travelee” narrative that may offer a counter-narrative to orientalist appropriation and the colonial dispersal of culture. dynamic spatial histories of movement need to be studied, like choreography, in terms of time and space simultaneously. yet, because maps of absolute space privilege the destination and possibly certain intermediary points on the route, they risk losing duration. while some tour itineraries held a punishing pace (often finishing a show in one city on one night, traveling the next day, and appearing onstage in a new city the following evening), others saw travelers spending weeks or even months in a single location before moving on, which produced a different density of engagement. sometimes travel was delayed for weather or mechanical failure or even the absence of transportation; and sometimes local politics rerouted travel plans. the friction of these different temporalities might be considered in terms of relative space. a choropleth map that is shaded by statistical density might give a researcher a sense of the total number of shows or the duration of a stay, and an animated sequence might even allow that to be experienced in accelerated time. yet, all of these will remain focused on the shows themselves as discrete events and not the act of travel itself that stretches between them or its effects on dancers’ bodies. this issue is compounded by the fact that it is much easier to accumulate static data from archives.
for the american ballet caravan tour, for example, the company documented the dates and the time of day for each performance and recorded the costs of particular legs of transportation, but information on the actual travel times is much less consistent. travel is bounded by the shows, yet letters and other accounts indicate that the company did not always arrive even a full day ahead of a performance engagement, given transportation mishaps and customs delays. visualizing such relative spatial data as duration, using a critical mixed-methods approach that attends to the power structures of what data is
martin duberman, the worlds of lincoln kirstein (new york: knopf, ), .
greenblatt, “a mobility studies manifesto,” .
michelle clayton, “modernism’s moving bodies,” modernist cultures , no. ( ): – , quote on .
see the samples at elswit, moving bodies, moving culture.
not available as well as what is, we see that the challenges of visualization can call attention to the ways in which archival materials privilege certain aspects or experiences of travel over others. yet, this friction has the potential to reveal so much about ways of moving and their impact on cultural movement and movers. after absolute and relative space, harvey’s final category is relational space, which can be used as a metric of qualitative experience and combined with geospatial technologies in order to understand human experience and social power. one of the exploratory maps that kate built (figure ) was meant to capture the particular yet transnational experiences that touring facilitated. the map traced the associations that american ballet caravan dancers made between the places in which they were and places elsewhere in the world, whether through memories triggered by a particular scene or through the people they encountered.
the dataset for this map was curated from dancer william dollar’s lengthy pseudonymous account of american ballet caravan’s south american tour, titled “old granny spreads goodwill.” the unpublished manuscript compares the dancers’ then current locations with places to which they had previously traveled, such as a location just outside of cucuta, colombia, that was connected with the “badlands of dakota.” others were likely imagined, such as the comparison between what is called the guaya river in ecuador and the congo river in west and central africa. when collated into a database and visualized, these point-to-point linestrings draw connections among multiple cities and continents, reframing absolute space in terms of relational geography that demonstrates how dancers folded global space by transposing locations onto one another. more importantly, this analytic representation of experience reveals how their relational geography was dynamic and in fact changed over the course of the six-month tour. whereas in the earlier part of the tour the majority of the associations were made to locations in the united states and europe, as the tour continued the map suggests a change in the travelers’ own frames of reference, as increasing numbers of connections are drawn back to south american locations that the company had previously visited. the map thus begins to show how the tour imprinted itself in the travelers’ imaginations over time. at the same time, there is so much further such a visualization can go. imagine a map that not only draws point-to-point lines, as this one does, but connects singular locations to the polygons of countries and continents. more importantly, imagine one that in fact reforms the picture of the world through the dancers’ eyes by skewing the base map itself by the size and proximity of various points and shapes.
if we combine similar accounts from many tours by twentieth-century north american artists together, what would that particular skewed geography look like? what could it reveal
in prose form it is easy to make room for this uncertainty, and even databases can accommodate date ranges if they have been set up to do so. but when a database powers a map or other visualization, the slipperiness of uncertain data can easily slide into invisibility.
on the use of gis to build relational maps of experience and power, see the examples in marianna pavlovskaya, “non-quantitative gis,” in qualitative gis, – ; and ladona knigge and meghan cope, “grounded visualization: integrating the analysis of qualitative and quantitative data through grounded theory and visualization,” environment and planning a , no. ( ): – .
for more on the curation of this dataset, see kate elswit, “mapping tours through the dancers’ eyes” and “mapping tours through the dancers’ eyes, redux,” moving bodies, moving culture, september , and september , , available at https://movingbodiesmovingculture.wordpress.com/ / / /mapping-tours-through-the-dancers-eyes/ and https://movingbodiesmovingculture.wordpress.com/ / / /mapping-touring-through-the-dancers-eyes-redux/.
about the sites that are dominant versus marginal in these dancers’ perspectives on the world? another way to complicate such relational geography would be to compare the travelers’ views, as accounted for by dollar’s manuscript with all of its yankee privilege, to a travelee perspective that maps, for example, the associations that local audience members themselves made in accounting for the performances or in fact dance more broadly.
exploring such relational spaces supports clayton’s proposal for more attention to traveling cultural practices from a perspective that has “less to do with modernist cosmopolitanism than with comparative particularisms.”
lingering thoughts on digital methods in practice, onstage and offstage
in this essay we have drawn on the beginnings of our own individual and collaborative work at the intersections of dance touring and digital methods. the type of analytic work that we have been describing brings to the fore the question of how research practices co-articulate with scholarship and process with outcome. in the digital humanities such questions have led to tensions within the field. some argue that participation in the digital humanities should be contingent on knowing how to code; as johanna drucker puts it, “[i]f we are interested in creating in our work with digital technologies the subjective, inflected, and annotated processes central to humanistic inquiry, we must be committed to designing the digital systems and tools for
figure . this relational map is based on william dollar’s pseudonymous account of american ballet caravan’s south american tour, “old granny spreads goodwill.” here, lines trace the associations that he reported american ballet caravan dancers made between the places in which they were and places elsewhere in the world. this map makes visible how previous tour cities become new points of reference. see http://www.kateelswit.org/moving-bodies-moving-culture.leafletmap/# / . /- . (source: dataset by kate elswit; map by kate elswit and eric andrew sherman.)
clayton, “modernism’s moving bodies,” .
stephen ramsay, “who’s in and who’s out,” paper presented at the mla convention, january , , los angeles, available at http://stephenramsay.us/text/ / / /whos-in-and-whos-out/, and “on building,” january , , available at http://stephenramsay.us/text/ / / /on-building/.
our future work.” other humanities scholars decry the “invidious distinction between making things and merely critiquing them [that] has come to be one of the generally accepted differences that marks off dh [digital humanities] from the humanities in general.” this tension between making and critiquing recalls ongoing conversations in theatre, dance, and performance studies. rationales for comparing digital research and performance reside in the time-based nature of the scholarship, as well as the interest in user experience. while some scholars have suggested that the scholarship produced by new digital methods functions more as performance than publication, and bay-cheng, among others, has clearly argued for the parallels between digital and performance scholarship, this body of literature is not generally referenced within the digital humanities community. citing expertise in reenactment, presence, documentation, and reception, bay-cheng also argues that theatre and performance historiography is uniquely positioned to engage with digital methods. likewise, performance itself also offers a framework for thinking through and developing evaluative language around digital research. to return to richard schechner’s “is performance”/“as performance” distinction, what happens if we not only study the work produced through these digital methods “as” performance, but in fact tap into the robust scholarly language that has been developed in order to articulate the value of practical artistic research as complementary to more conventional scholarly methods? since the s the basic argument for artistic practice as a mode of inquiry has been grounded in the relationship between research method and scholarly knowing: namely, the understanding that we think differently on our feet and thus may come to different propositions while testing out ideas in the studio than working them out in print form alone.
in this way practice-based research cultivates a fluid back-and-forth exchange between embodied knowledges and textual practices, or indeed embodied texts and knowledge practices. although performing arts researchers often turn to the humanities in order to articulate method, and digital humanists tend not to draw on the scholarly framework of artistic research, it is in fact our own fields of theatre, dance, and performance studies that may provide new language to articulate the many ways in which “digital humanities practice” can be situated in tandem with scholarly inquiry. here, we have used our projects to discuss on- and offstage dimensions of touring in terms of data collection, analysis, and visualization. these same distinctions apply to the production of digital scholarship and therefore the terms and language through which this type of work can and should be evaluated. if the backstage labor of performance has recently come to the fore as an important aspect of scholarship on the history of theatrical production, many scholars have yet to grapple with the labor of scholarly production, a situation felt acutely among scholars utilizing digital
johanna drucker, “blind spots: humanists must plan their digital future,” chronicle of higher education , no. ( ): b –b .
richard grusin, “the dark side of the digital humanities: dispatches from two recent mla conventions,” differences: a journal of feminist cultural studies , no. ( ): – , quote on .
for example, scheinfeldt and mcpherson, referenced in bay cheng, “pixelated memories,” and debra caplan, “notes from the frontier: digital scholarship and the future of theatre studies,” theatre journal , no. ( ): – .
richard schechner, performance studies: an introduction (abingdon, uk: routledge, ), – .
for an overview, see robin nelson, practice as research in the arts: principles, protocols, pedagogies, resistances (basingstoke, uk: palgrave macmillan, ).
methods of research.
we have felt this even in the process of producing this essay, choosing to focus on the rationale and foundation for the work rather than producing the many visualizations that could have easily constituted a second full-length “visual essay” alongside. engaging in the practices of which digital scholarship is composed, including manually compiling and cleaning comprehensive datasets and developing software or learning to use existing visualization platforms, adds a dimension of labor to digital research that is generally invisible in the final product. furthermore, for many scholars the “product” may ultimately materialize as a digital object, such as a database, dataset, or visualization, whether in service of their own research or that of others. how then to apply rigorous standards of evaluation and peer review when, like performance, digital scholarship is reliant upon a tremendous amount of invisible labor and is frequently developed in a prolonged workshop phase with multiple collaborators and technological platforms? when the process of digital scholarship and its iterative manifestations of a research idea may exceed the final product? or when it results in a digital object that is, at minimum, unstable and even ephemeral, prone to failure and accelerated obsolescence? at the same time that these methods invite us to consider new ways to approach performance research, we believe that our own fields are already equipped with frameworks for producing and evaluating such work, and we are excited about the next steps.

cicero, sigonio and burrows: investigating the authenticity of the "consolatio"
richard s. forsyth, university of luton, uk
david i. holmes, the college of new jersey, usa
emily k. tse, university of california, los angeles, usa
[ contact: forsyth_rich@yahoo.co.uk cite as: forsyth, r.s., holmes, d.i. & tse, e.k. ( ). cicero, sigonio, and burrows: investigating the authenticity of the "consolatio".
literary & linguistic computing, ( ), - . ]

abstract
when his daughter tullia died in bc, the roman orator marcus tullius cicero ( - bc) was assailed by grief which he attempted to assuage by writing a philosophical work now known as the consolatio. despite its high reputation in the classical world, only fragments of this text -- in the form of quotations by subsequent authors -- are known to have survived the fall of rome. however, in a book was printed in venice purporting to be a rediscovery of cicero's consolatio. its editor was a prominent humanist scholar and ciceronian stylist called carlo sigonio. some of sigonio's contemporaries, notably antonio riccoboni, voiced doubts about the authenticity of this work; and since that time scholarly opinion has differed over the genuineness of the consolatio. the main aim of this study is to bring modern stylometric methods to bear on this question in order to see whether internal linguistic evidence supports the belief that the consolatio of is a fake, very probably perpetrated by sigonio himself. a secondary objective is to test the application of methods previously used almost exclusively on english texts to a language with a different structure, namely latin. our findings show that the language of the consolatio is extremely uncharacteristic of cicero, and indeed that the text is much more likely to have been written during the renaissance than in classical times. the evidence that sigonio himself was the author is also strong, though not conclusive.
keywords: authorship attribution, cicero, multivariate methods, neo-latin, quantitative linguistics, text categorization, stylometry.

. introduction
this paper describes a study of the authorship of the consolatio ciceronis, a work known to have been written by cicero on the occasion of the death of his daughter tullia in bc.
like many works from so long ago, this text was eventually lost, and only fragments are known to have survived -- seven fragments as quotations in the surviving works of lactantius (c. ad - ) and one passage in a surviving work by cicero himself, the tusculan disputations, book i. more than sixteen centuries later, however, a book identifying itself as the consolatio mysteriously reappeared, and was published in venice and bologna in . it was edited by carlo sigonio ( - ), a prominent humanist scholar and skilled imitator of classical latin. however, no source manuscript of this text was ever made public, so, although its style appeared much like cicero's, doubts about its authenticity were aroused almost at once. its most outspoken critic was antonio riccoboni ( - ), who attacked it as a forgery by sigonio in two successive publications, iudicium and accusator. another contemporary doubter was latino latini ( - ) who found in the text traces of what he took to be post-christian thinking (e.g. a reference to the flight of the soul "ad futuram vitam") and post-classical usage (e.g. "homines" rather than "viri" used to denote specifically male human beings). latini's letters were not published until many years afterwards, but they were widely circulated during the controversy (mccuaig, ) -- which was prematurely curtailed by sigonio's death in , without reaching a definite conclusion. although the consolatio of has been included in numerous editions of cicero's collected works, most modern scholars regard it as a forgery. however, very little textual analysis has been done on this question since sage's book (sage, ). the main aim of the present work is to weigh the linguistic evidence for and against cicero's authorship, using modern stylometric techniques unavailable in . a secondary objective is to extend the applicability of methods previously used almost exclusively on english texts to an inflected language, namely latin. .
background as a background to the present study, it is necessary to be aware that the latin language went through three major (and several minor) phases between the time of cicero and that of sigonio. we use the term classical latin to cover the period from about bc to about ad. this is often subdivided into the "golden" and "silver" ages, with the former covering roughly the first century bc and the latter the next two and a half centuries. cicero himself was the most notable prose author of golden-age latin, and virgil ( - bc) its pre-eminent poet. prominent silver-age authors include quintilian (c. - ad), tacitus ( - ad) and the younger pliny ( - ad). but we know from inscriptions at pompeii and other sources that classical latin was already something of an artificial construct by the middle of the first century ad, and, after the fall of the roman empire early in the fifth century ad, it ceased to be a living language. however, it survived right through the middle ages as the language of diplomacy, law, scholarship and theology. in western europe, all who were educated wrote in latin. we term the latin of this period (over a thousand years) medieval latin. because teaching was left almost entirely in the hands of churchmen, medieval latin was predominantly ecclesiastical in nature. the start of the third main phase in the development of latin can conveniently be dated to the fourteenth century, with the revival of humanism and the renaissance of classical learning. in this phase the language escaped the confines of the cloister and became a vehicle for expressing secular concerns; indeed many prominent humanists of the time were suspected of pagan sympathies. a pioneer of this movement was the poet and scholar petrarch (francesco petrarca, - ), who devoted himself to reviving the literature of classical rome. 
he spent much time and effort searching for ancient manuscripts and was rewarded in by the discovery of one of the most celebrated finds of the period -- a batch of over letters written by cicero, which had been lost till that time. the language of this third phase we call neo-latin. this represents an attempt to re-establish the latin of the golden age, but it could never be a reproduction of that language: firstly, technology and society had changed too much in the interim, and secondly neo-latin was nobody's native tongue. in fact, many of its users were not proficient at speaking it, but only in writing it -- including sigonio, according to mccuaig ( ). after petrarch, a succession of famous humanists strove to promote the pure style of classical latin, with cicero as their most respected model. among these were: gasparino barzizza ( - ), poggio bracciolini ( - ), pietro bembo ( - ), christophe de longueil ( - ), pietro vettori ( - ), jacopo sadoleto ( - ) and marc-antoine muret (muretus) ( - ), as well as carlo sigonio himself. although criticized by some, most notably desiderius erasmus ( - ), for slavish aping of cicero (scott, ), their views held sway at least until the end of the th century and influenced the syntax and vocabulary of neo-latin. as j.r. hale puts it: "the language of cicero was imitated as part of a movement to restore the writing of latin to the purity of its outstanding model. after the s, at least in italy, ciceronianism was to become an orthodoxy" (hale, , p. ). this view is also attested by nauert ( , p. ) who says: "students were expected to write or speak on an assigned topic in the approved ‘roman’ way and in acceptably ‘ciceronian’ latin." paradoxically, according to bodmer ( ), this desire to return to a prior state of perfection finished off latin as a viable international medium. thus the humanists killed the thing they loved.
"pedantic attempts of the humanists of the fifteenth and sixteenth centuries to substitute the prolix pomposity of cicero for the homely idiom of the monasteries hastened its demise. by reviving latin, the humanists helped to kill it." (bodmer, , p. )
the relevance of this brief historical interlude to the current case is that if the consolatio is genuine it was written by one of the foremost stylists of classical latin, whereas if it is a forgery it will have been written by one of his imitators, in neo-latin. consequently, if it does bear the hallmarks of neo-latin, it cannot be genuine -- whoever wrote it. at the heart of our problem, therefore, is the need to find a way of distinguishing between cicero and ciceronianism.

. materials
we have assembled a collection of writings by authors from classical rome and renaissance italy, including generous selections from cicero and sigonio. these authors are listed in table , in order of birth date, with our chief ‘suspects’ underlined.

table -- authors sampled.
classical:
  marcus tullius cicero ( - bc)
  julius caesar (c. - bc)
  cornelius nepos (c. - bc)
  gaius sallustius crispus ( - bc) [= sallust]
  lucius annaeus seneca (c. bc - ad)
  publius cornelius tacitus (c. - ad)
neo-latin:
  pietro vettori ( - )
  carlo sigonio ( - )
  marc-antoine muret ( - ) [= muretus]
  bernadino di loredan ( -??) [= lauredanus]
  antonio riccoboni ( - )

the concept of random sampling cannot truly apply in such a case, but we have endeavoured, particularly for our two major protagonists, cicero and sigonio, to achieve a breadth of coverage sufficient to permit estimation of the variability of the authors concerned. in addition, we have also selected, for purposes of comparison, two other works which were long accepted as by cicero, but which are now generally thought to be spurious. these are probably imitations of cicero, but classical rather than neo-latin, namely, the epistula ad octavianum and the rhetorica ad herennium.
this gives us a total of more than , words of latin, divided into samples. the full list of the text files used in this investigation may be found in appendix i. it is ordered alphabetically by author (with the dubitanda preceding the works of known authorship). note that fragments of the consolatio indisputably by cicero, preserved as quotations in other works, were removed from the text prior to the analyses described below. these amounted to words in total. it should also be noted that mark-up codes (e.g. html) which were present in some of the samples that we obtained in electronic form, have been removed in all cases, leaving plain ascii text in the roman alphabet without diacritics. unless otherwise stated, all our analyses ignore the difference between upper and lower case; and so far we have made no use of punctuation marks. . method of approach as holmes ( ) has shown, a great variety of linguistic variables have been used in authorship studies. in this case, we decided to avoid excessive subjectivity by concentrating on variables which, in a sense, emerge from the texts under consideration. a number of studies have appeared recently (e.g. burrows, , ; binongo, ; burrows & craig, ; holmes & forsyth, ; forsyth & holmes, ; tweedie et al., ) in which the features used as indicators are not imposed by the prior judgement of the analyst but are found by straightforward procedures from the texts under scrutiny. such textual features have been used by burrows ( ) as well as binongo ( ), among others, not only in authorship attribution but also to distinguish among genres. this approach involves finding the most frequently used words and treating the rate of usage of each such word in a given text as a feature. the exact number of common words used varies by author and application. burrows and colleagues (burrows, ; burrows & craig, ) discuss examples using anywhere from the to most common words. binongo ( ) uses the commonest words (after excluding pronouns). 
greenwood ( ) uses the commonest (in new testament greek). most such words are function words, and thus this approach can be said to continue the tradition, pioneered by mosteller & wallace ( / ), of using frequent function words as markers. in fact, these studies (and some others) can be lumped together as applications of what may be called the "burrows approach", which is outlined below.

1. pick the n most common words in the corpus under investigation. n may be from to . (manual preprocessing is sometimes done, e.g. distinguishing "that"-demonstrative from "that"-conj.)
2. compute the occurrence rate of these n words in each text or text-unit, thus converting each text into an n-dimensional vector of numbers.
3. apply techniques of multivariate data analysis to reveal patterns, especially:
   - principal components analysis;
   - clustering;
   - discriminant analysis.
4. interpret the results (with care!).

a striking success of this method is described by burrows ( ) on prose works by the bronte sisters. he took -word samples of first-person fictional narrative from novels by the three sisters anne, charlotte and emily, and was able to show that they fell into three distinct clusters. given three such authors, linked by heredity and upbringing, writing in the same genre at around the same time, this was an impressive feat. a number of studies have followed this approach, the majority of which have been on english-language texts. the central thrust of our investigation has been an application of this method to the consolatio, along with our latin control samples.

choice of frequent function words

there is no definitive statement by burrows ( ) or his successors on deciding exactly how many words to use. generally about fifty are used, with the implication being that they should be among the most common in the language, and that content words should be avoided. in the absence of a precise specification, our procedure was as follows.
we picked twelve of our texts, one sample from each author (treating "anon" as a separate author for this purpose). the text sample chosen was the largest file of each author that did not exceed words in length. although this did not give exactly equal coverage of all our authors, it gave a selection that was not dominated by any single author, time-period or topic. in so far as a bias exists in this selection, it is towards overselection of cicero, who contributed words to a sample of . on the basis of exactly equal contributions by each author he would have contributed words to this sample. we regard this as an acceptable departure from strict equality both because he is the central focus of our investigation and because of his position as a stylistic model. this aggregation of texts was then subjected to a word count, giving a word-frequency listing of which the top fifty words are shown in table . this list shows that the top fifty words are mostly common content-free words, as required by the burrows approach. we have used orthographic words rather than lemmata (lexical entries) in the analyses that follow primarily for the sake of simplicity. gurney & gurney ( ) have reported that lemmatization helped them in tackling a latin authorship problem (scriptores historiae augustae), but the lemmatization of a large body of latin text is no trivial matter. software tools which partially automate this task do exist (e.g. http://www.shef.ac.uk/uni/projects/hpp/stemmer.html) but their usage requires quite a heavy investment in text pre-processing (schinke et al., ). moreover, lemmatization is somewhat contrary to the spirit of the burrows method. (a follow-up study to assess the pros and cons of lemmatization in this and other latin authorship problems would doubtless be valuable, but is beyond the scope of the present investigation.) 
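the feature extraction behind the burrows approach (pool the texts, pick the n most common orthographic words, then express each text as a vector of rates per thousand words) can be sketched in python as below. this is a minimal illustration under stated assumptions, not the software used in this study: the function names `tokenize`, `top_words` and `rate_vector` are hypothetical, and tokenization is assumed to be a simple split on runs of the letters a-z after lower-casing, in line with the case-insensitive, punctuation-free treatment described in the materials section.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return its orthographic words (runs of a-z).

    Assumption: plain ASCII input without diacritics, punctuation ignored.
    """
    return re.findall(r"[a-z]+", text.lower())


def top_words(texts, n):
    """Return the n most frequent words in the pooled collection of texts."""
    pooled = Counter()
    for text in texts:
        pooled.update(tokenize(text))
    return [word for word, _ in pooled.most_common(n)]


def rate_vector(text, vocabulary):
    """Return the rate per thousand tokens of each vocabulary word in one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return [0.0 for _ in vocabulary]
    return [1000.0 * counts[word] / total for word in vocabulary]
```

calling `top_words` on the pooled selection and then `rate_vector` on every sample would give one row per text, i.e. the kind of data matrix on which a principal components, cluster or discriminant analysis can be run.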
to determine the exact number of words to be used, we asked a latinist (ekt) to scan down table until the first unequivocal content word. she decided that was number ("rerum"), and so in all the analyses reported herein, we have used variables, i.e. relative frequencies of the words from "et" to "tamen" in the list below.

table -- fifty most common latin words in order of frequency. n = .

et, in, est, non, ut, cum, quod, ad, qui, quae,
ac, esse, quam, atque, ex, a, si, sed, aut, se,
de, enim, etiam, neque, autem, ab, nec, sunt, quo, ita,
ea, nihil, quid, sit, hoc, eo, quidem, vero, vel, tum,
quibus, id, eius, per, ne, tamen, rerum, natura, modo, nam

it should be noted that the first words between them account for more than % of the tokens in this multi-author sample.

syllable counts

the standard burrows procedure only uses word-frequency information, but it is our long-term intention to extend this by employing other sources of linguistic evidence. in the present study, a start was made towards this end by writing a syllable-counting program. this enabled us to compute not only the proportion of words of one, two, three, four syllables and so on, but also some information about syllabic transitions. specifically, for each text the additional variables described in table were computed. table -- syllabic variables.
variables meaning s , s , s , s , s , s percentage of - syllable words in the text [syllable transitions:] st , st , st , st percentage of -syllable words that are immediately followed by -syllable, -syllable, -syllable and -syllable words (respectively) st , st , st , st percentage of -syllable words immediately followed by words of , , , & syllables st , st , st , st percentage of -syllable words immediately followed by words of , , & syllables st , st , st , st percentage of -syllable words immediately followed by words of , , & syllables the rules of the procedure used for counting syllables are given in appendix ii. in the analyses that follow, mention will be made of syllabic information when it is used. if no such mention is made, it can be presumed that the analysis is being performed using only the first words of table . . results and analyses we present in this section a sequence of analyses, based on the burrows approach, using the variables described in section , with the objective of shedding light on the question of who wrote the consolatio. . cicero the initial investigation concerned only the textual samples from the works of cicero. a principal components analysis was carried out on the frequencies (rate per thousand) of the most frequently-occurring words detailed above, the samples being labelled as ‘orations’ or ‘non- orations’ according to their genre. figure shows the textual samples plotted in the space of the first two principal components, which together account for . % of the total variation in the data. figure about here. the genre effect is clearly visible along the direction of the first principal component, with the orations ‘ ’ falling generally to the left of the non-orations ‘ ’. the accompanying scaled loadings plot, figure , shows that non-orations are characterized by relatively high occurrences of the connectives "nec" and "enim", and words such as "est", "sit" and "sunt" which are forms of the verb "esse" (to be). 
orations, by contrast, have high occurrences of "ac" and "atque". this discovery is no surprise since "enim" is an explanatory connective which would feature often in philosophical works or treatises, which comprise the bulk of the non-orations, whereas "ac" and "atque" are words with a more emphatic meaning typical of an oration.

figure about here.

cicero and the classical dubitanda

the two classical dubitanda, epistula ad octavianum and rhetorica ad herennium ii, were then added to the cicero samples in . above. these are often included in the ciceronian corpus but are generally accepted by classicists as not having been written by cicero. figure shows the textual samples plotted in the space of the first two principal components (pcs), which now account for . % of the total variation in the data set. the dubitanda are labelled ‘ ’. this shows that the epistula fits in quite well as a ciceronian oration, albeit only as a borderline case. the rhetorica, however, appears as an outlier along the direction of the second principal component and a look at the scaled loadings plot, figure , shows that an exceptionally high usage of "ab", "ad", "id" and "aut" is associated with this placement. it is interesting to note that if we bring in the third principal component, not shown in figure , then the epistula also becomes an outlier, having a more negative value than any of the other texts. so we could sum up by saying (roughly) that the first pc separates orations from non-orations, the second pc separates the rhetorica ad herennium from the rest and the third separates the epistula ad octavianum from the rest.

figures and about here.

cicero and the classical controls

having looked at the genre effect within cicero’s works and at the classical dubitanda, we now turn our attention to the broader picture concerning the ciceronian texts and all the classical control samples.
a principal components analysis was conducted on the rates of occurrence of the most frequently-occurring words for the samples from caesar, cicero, nepos, sallust, seneca and tacitus, and the resulting plot in the space of the first two principal components is shown in figure ; these two components together account for . % of the total variation.

figure about here.

this remarkable plot clearly shows how successful the word-frequency approach can be at discriminating between writers. the cicero text samples, labelled ‘ ’, form a distinct group with the exception of pro cluentio. the seneca samples, labelled ‘ ’, are on their own with the sole tacitus sample, ‘ ’, appearing nearby. the caesar samples, ‘ ’, and the sallust samples, ‘ ’, are close together, but these particular textual samples were concerned with military campaigns so perhaps this is not surprising. the nepos samples, ‘ ’, form a tight grouping. the first principal component separates the philosophical works, on the right, from the military and biographical works on the left, whilst the second principal component would seem to reflect temporal change. those texts with positive (or almost positive) scores on this component are all texts written during the roman republic (bce), and those texts with scores of - or less are written during the roman empire (ce). the associated scaled loadings plot, figure , reveals that the ce texts are associated with high usage of "per", "ac", "et" and "nec", and the bce texts with "cum", "quo" and "eo". the military and biographical texts, being of a narrative nature, are associated with high usage of "eius", a pronoun, whilst the philosophical texts are associated with high usage of "enim" and "quidem", which are explanatory and qualifying particles that elaborate on a preceding clause.

figure about here.

an alternative analysis in this section may be provided by conducting a cluster analysis on the textual samples, using the word rates as variables.
figure shows the resultant dendrogram using average linkage as the clustering algorithm and euclidean distance as the metric. we can see that works from the classical writers tend to cluster very well. there are some cicero clusters which ultimately come together, a seneca cluster (with the exception of de ira, which groups with tacitus forming a pair of outliers), a nepos cluster and a caesar/sallust cluster as revealed above in the principal components analysis. the only change of any consequence is that the cicero outlier is no longer pro cluentio but orator! our results with the two methods of analysis are mutually supportive. thus, even though we can interpret the first pc in terms of a genre effect and the second pc as a temporal factor, these works still tend to form clusters on the basis of authorship. in particular, the genre effect is not strong enough to disrupt the coherence of the cicero cluster. figure about here. . sigonio and the sixteenth century controls having successfully discriminated between classical writers using the frequencies of occurrence of the common words, we now turn our attention to the sixteenth century textual samples, where we have writings from lauredanus, muret, riccoboni, sigonio and vettori. once again a principal components analysis was conducted on the data set and figure shows the texts plotted in the space of the first two principal components, which account for . % of the total variation. the configuration obtained is less remarkable than that in the section above, but distinct groupings are nevertheless visible. the sigonio texts, labelled ‘ ’, split quite dramatically into his two histories and remaining non-histories (mainly orations), whilst the muret texts, labelled ‘ ’, split into three funeral orations and four scholarly orations. clearly the genre effect is at play here as in the earlier analyses. also visible are groupings for lauredanus, labelled ‘ ’, riccoboni, labelled ‘ ’, and vettori, labelled ‘ ’. 
the associated scaled loadings plot is shown in figure .

figures and about here.

an alternative analysis was again provided by using cluster analysis with average linkage as the algorithm and euclidean distance as the metric. figure shows the resultant dendrogram. whilst the sigonio histories are clearly clustered well away from his non-histories, the muret samples exhibit a slightly different pattern to that revealed in the principal components plot, the funeral oration pro antonia rege navarre ad pium.. now appearing with his scholarly orations. also, vettori’s oratio petri victorii in max. ii now lies apart from the other vettori texts. the clustering, although not as distinct as that shown in the principal components plot, is broadly supportive, and the word-frequency approach appears to have been successful as a sixteenth century authorial discriminator.

figure about here.

cicero, sigonio and the consolatio

having checked the efficacy of the set of frequencies of occurrence of the common words as a discriminator for both batches of control texts, we can now concentrate on the main protagonists in the argument, namely cicero, sigonio and the textual samples from the consolatio. using the genuine cicero and genuine sigonio texts as defined groups, a stepwise discriminant analysis was run on the data. the four words chosen by the stepwise routine as being the best discriminators between cicero and sigonio were "ad", "ac", "neque" and "ab", and using these words the null hypothesis of no difference between the group means along the axis of the discriminant function was clearly rejected (wilks’ lambda of . and p-value . ). when the discriminant function was employed to classify the texts from the known cicero and sigonio groups, it achieved an accuracy of . % (without cross-validation), only the ciceronian texts orator and de imperio being incorrectly classified in the sigonio group.
the discriminant function score was then computed for the two text samples from the consolatio (not used in developing this discriminant function). this assigned both samples to the sigonio group. a graphic illustration of this result may be seen in figure which is a plot of the textual samples along the axis of the discriminant function. here (as in figures - ) the horizontal axis is the score on the canonical discriminant function, which is a weighted sum of scores on each of the selected variables that maximizes the separation between the two categories. in figure , four vertical symbols represent one text; genuine cicero texts are denoted by ‘ ’, genuine sigonio by ‘ ’ and the consolatio texts by #. we can identify the two misclassified ciceronian texts and the assignation of the consolatio to sigonio. figure about here. . classical latin, neo-latin and the consolatio previous analyses in this study have shown that time or genre effects are often so marked that they can partly mask authorship. it seems entirely appropriate, therefore, to conduct a discriminant analysis on the two defined groups of classical latin texts and neo-latin texts, and then allocate the consolatio to one of these two groups. a stepwise routine was employed, as in . above, on the word occurrence rates. the words chosen by the routine as best discriminators between these time periods were "ac", "vel", "sed", "vero", "id", "ut", "ea", "neque", and "cum". the null hypothesis of no difference between the group means along the axis of the discriminant function was clearly rejected (wilks’ lambda of . for a p-value of . ), and the function, when applied to the original data without cross-validation, achieved a classification accuracy of . % with cicero’s pro a. licinio archia poeta oratio and de re publica, and the tacitus sample (the most modern) being incorrectly assigned to the neo-latin group. 
the one neo-latin text incorrectly assigned to the classical group was riccoboni’s de legum laudibus oratio. the two consolatio textual samples were firmly ascribed to the neo-latin group by their discriminant function score. figure shows the plot of the text samples along the axis of the discriminant function. this time two vertical symbols represent one text; the symbol ‘ ’ denotes classical latin, the symbol ‘ ’ neo-latin and the # symbol the consolatio. we can see quite clearly the two groups and how the consolatio appears to be distinctly neo-latin in time. we have discovered that the consolatio is neither classical nor ciceronian! figure about here. . syllabic analysis we have seen in section . that lengths of words by syllables for both single words and word- pairs were also counted. these particular counts have not yet been used. at this stage in the analysis it was decided to incorporate these counts of word-length by syllables into the data and re- run some of the previous analyses. the addition of syllable counts into the analyses for sections . and . , in which cicero was compared with the classical controls and sigonio was compared with the renaissance controls, confused the issue. in the plots of the first two pcs, the previously distinct groupings by author became blurred and overlapping in both the classical case and the sixteenth century case: it would seem that syllable counts play no positive role in authorship discrimination within time periods. returning to the analysis between time periods, a stepwise discriminant analysis was conducted on word and syllabic variables with the classical latin and neo-latin texts as the pre-defined groups. the variables chosen as being the best discriminators between the two time periods were the words "ac", "vel" and "vero" (all chosen in section . 
above), and the newly introduced syllabic variables st , st , st and st , the first of these, for example, measuring the percentage of one- syllable words immediately followed by a three-syllable word. the null hypothesis of no difference between the group means along the axis of the discriminant function was soundly rejected (wilks’ lambda of . for a p-value of . ). when applied to the textual samples, without cross-validation, the discriminant function successfully classified . % of the texts into their correct time periods with only the tacitus text incorrectly classified as neo-latin. as commented previously, this text is the most modern of the classical texts. the two consolatio text samples were firmly ascribed to the neo-latin group, as in . above, by their discriminant score. figure shows the plot of the text samples along the axis of the discriminant function using exactly the same notation as in section . . we seem to have discovered that while syllable counts are not very useful within time-periods they play an important role in discriminating between time-periods. figure about here. we finally return to the analysis comparing the cicero texts with the sigonio texts, and re-run this adding the syllabic variables to the word counts. a stepwise discriminant analysis with the genuine cicero and genuine sigonio texts as pre-defined groups revealed that the most effective discriminatory variables were the words "a", "ad", "enim", "est", "neque", "quibus", "quid", "sed", "vel" and "vero" (a substantially different listing from the analysis with words only), and the syllabic variable st . once again, the discrimination between the groups was highly significant (wilks’ lambda of . for a p-value of . ), and the discriminant function achieved an accuracy of %, without cross-validation, in assigning the text samples to their groups. 
when asked to assign the two samples from the consolatio, the discriminant function gave us our first equivocal piece of evidence, the first sample being ascribed to cicero and the second sample to sigonio. figure shows the plot of the textual samples along the axis of the discriminant function, two vertical symbols representing one text, with the ciceronian texts denoted by ‘ ’, the sigonio texts by ‘ ’ and the consolatio texts by the # symbol. figure about here. the introduction of syllable counts has moved the consolatio to a borderline position between the two groups. if sigonio is the author then it looks more ciceronian than most of his output. the syllabic variable which has entered into the discriminant function at this point, st -- the percentage of four-syllable words immediately followed by a four-syllable word, concerns words of above average length and may be indicative of a scholarly and technical second language rather than of a native language. . chronometric analysis to shed further light on the date of the consolatio, a stepwise regression was performed. this used of our text samples (all apart from the dubitanda) with the century of composition used as the dependent variable. for these texts the century variable could take only three different values: (first century bc), (first century ad) or (sixteenth century). the minitab stepwise-regression procedure with standard defaults was allowed to choose from all word variables as well as the syllabic variables. from these it chose five, giving the regression equation below, which accounted for . % of the variance. century = - . + . st + . ac + . vel + . s - . st the variables in this linear equation appear in order of importance, with st being the most important. the presence of three syllabic variables, including the most significant, compared with two word variables appears to confirm that syllabic information is useful for this kind of task. 
all coefficients are positive, indicating increased usage in the renaissance texts, except for that of st (percentage of -syllable words immediately followed by a -syllable word) which decreases in frequency from classical to renaissance texts. summing up this formula qualitatively, it shows that the words "ac" and "vel" increase in frequency from classical to neo-latin times, as does the frequency of -syllable words. this latter attests to the more learned nature of latin in the later period. the role of the syllable-transition variables is harder to interpret, although an increase in the proportion of -syllable words followed by a -syllable word (st ) would seem consistent with a move from native tongue to second language. having developed the above formula on securely dated texts, it was then used to estimate the date of the two halves of the consolatio. both pieces were placed firmly in the later period. the fitted value for the first half was . and for the second . . for comparison, the mean value for our classical texts was . and for our neo-latin texts was . . thus both segments of the consolatio were very close to the mean of our neo-latin sample, and over centuries later than the mean of our classical sample; and while they fell well within the lower and upper quartile of our neo- latin sample, they were more extreme than the most extreme outlier of our classical sample (the agricola of tacitus, with a computed value of . ). on this basis, the evidence of anachronism is extremely strong -- confirming latini's suspicions at the time (see section ). taken together with the results of the discriminant analysis in section . , this virtually eliminates the possibility of ciceronian authorship. how does this relate to the specific question of sigonio's authorship? it allows us -- provided we accept that ciceronian authorship is excluded -- to concentrate on finding the renaissance author whose style matches most closely that of the consolatio. . 
discrimination among neo-latin authors

for this analysis we excluded lauredanus, of whom we have only texts, as having too few samples for reliable estimation. this left muretus, riccoboni, sigonio and vettori under consideration. a stepwise discriminant analysis procedure was then executed to find the variables which best distinguished sigonio from the other three authors. the four most distinctive variables for this purpose were: st , "aut", "vero" and "hoc". then the distance of the consolatio was computed from each of the four authors on each of these four variables. this was standardized as a z-score, using the following formula

z_j = (x_c - m_j) / s_j

where z_j is the distance of the consolatio from author j, x_c is the value of the variable in the consolatio, m_j is the mean value for author j and s_j is the standard deviation of that variable in author j -- with j varying from to . the results are summarized in table . in this table an asterisk indicates a value outside the % confidence interval (by reference to the normal distribution) and a double asterisk indicates a value outside the % confidence interval.

table -- distance of consolatio from authors (sigonio markers).

variable   muretus   riccoboni   sigonio   vettori
st         . *       . *         - .       . *
aut        .         . **        - .       . **
vero       .         .           - .       . **
hoc        .         - .         - .       . **
zsum =     .         .           .         .

all four z-scores fall within the % confidence interval for sigonio. for muretus one (variable st ) falls outside, for riccoboni two and for vettori all four. the final row in this table gives the sum of the absolute (unsigned) z-scores, a convenient aggregate distance measure, corresponding to "city-block" distance as used in nearest-neighbour classification (see: beale & jackson, ). using this measure, which is based on the variables that best differentiate sigonio from his contemporaries, the consolatio is clearly more like sigonio than any of the other three.
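the z-score and zsum calculations described above can be sketched as follows. this is an illustrative sketch only: the function names and the dictionary layout are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the text does not say which form was used. any values passed in when trying it out would be invented, not taken from the table.

```python
from statistics import mean, stdev


def z_score(x_c, author_values):
    """z_j = (x_c - m_j) / s_j for one variable and one candidate author.

    x_c is the value of the variable in the disputed text; author_values are
    the values of the same variable in that author's known samples.
    """
    m_j = mean(author_values)
    s_j = stdev(author_values)  # assumption: sample (n-1) standard deviation
    return (x_c - m_j) / s_j


def zsum(disputed, author_samples):
    """Sum of absolute z-scores over all variables (a city-block distance).

    disputed maps variable name -> value in the disputed text;
    author_samples maps variable name -> list of values for one author.
    """
    return sum(
        abs(z_score(disputed[name], values))
        for name, values in author_samples.items()
    )


def nearest_author(disputed, authors):
    """Return the author whose samples give the smallest zsum."""
    return min(authors, key=lambda name: zsum(disputed, authors[name]))
```

the smaller the zsum, the closer the disputed text lies to that author on the chosen marker variables; `nearest_author` simply picks the minimum, which mirrors the nearest-neighbour reading of the final row of the table.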
this evidence is compatible with the hypothesis that sigonio wrote the consolatio, and completely incompatible with the hypotheses that riccoboni or vettori wrote it (which were already highly implausible on other grounds). on these figures alone it might just be possible to entertain the hypothesis that muretus could have had a hand in the authorship of the consolatio (although nobody has seriously made such a proposal). so, just for completeness, the same procedure was repeated, this time with the four most distinctive muretus variables, which were: "ita", "ac", "ne", "quod". the results are summarized in table .

table -- distance of consolatio from authors (muretus markers).

variable   muretus   riccoboni   sigonio   vettori
ita        - .       .           .         .
ac         - .       .           .         - .
ne         .         . *         .         . *
quod       .         .           .         .
zsum =     .         .           .         .

in terms of aggregate distance, the order is the same as before: sigonio is the closest match, followed by muretus, riccoboni and lastly vettori; but the result is less clear-cut. once again, all four matches fall within the % confidence limits for sigonio, which is not true for riccoboni or vettori; but in this case so do the matches with muretus (one marginally). nevertheless, even using what might be termed the favourite markers of muretus, the consolatio appears more similar to works by sigonio than by muretus himself. this impression was further confirmed by a stepwise linear discriminant analysis performed on muretus and sigonio only, which was % successful in assigning our muretus and sigonio samples to their correct sources, and gave both halves of the consolatio to sigonio. the variables chosen as discriminators by the procedure (using standard spss default settings) were the words "ita" and "ac", both of which are significantly more frequent in muretus than in sigonio. in fact, these are the first two markers in table above. a plot of both authors and the consolatio segments on the discriminant axis is given as figure .
but as only two variables are used, the relationship between these two authors and the consolatio can be better appreciated by a scatter diagram, using "ac" and "ita" as axes, which is shown as figure . visually it seems clear that, in respect of these two most discriminatory words, the consolatio resembles sigonio’s writings more than those of muretus. figures and about here. thus of the four renaissance authors considered here, sigonio's language is closest to that of the consolatio. . discussion . substantive conclusions the findings from this analysis tend to support received opinion among latin scholars that the consolatio of is a work of neo-latin and not therefore the rediscovery of cicero's long-lost text. moreover it resembles sigonio's style more than it resembles those of three other neo-latin control authors, namely muretus, riccoboni or vettori. in our view, the evidence presented here against cicero's authorship of the consolatio is compelling. the evidence that sigonio himself was the author is also quite strong, although the effort required to reach that conclusion is tribute to his skill as a ciceronian imitator. . methodological considerations from the methodological viewpoint, we have demonstrated that the approach pioneered by burrows ( ) works well enough to find differences between the language of cicero and a number of his imitators; hence that it can be generalized to an inflected language, latin. this agrees with the findings of tweedie et al. ( ), who also worked on latin. we have also shown that syllabic information can be usefully added to the basic burrows method in certain cases, thus extending the method somewhat. it is interesting, in this connection, to note that a number of studies by a group of researchers centred on goettingen university, on many languages, including english, german and latin, have found rather little variation between authors and genres in respect of word-length distribution (e.g. wimmer et al., ; best, ). 
possible reasons for this apparent contradiction might be: (1) the goettingen group have sought general models; (2) syllabic information is more useful for temporal discrimination than for authorship or genre as such; and (3) syllable-transitions give access to more useful information than plain syllable counts or the distributions of such counts. we suspect that this is an area of quantitative linguistics that would repay further investigation.

to end on a cautionary note, we should add that we succumbed initially to the temptation (out of curiosity) to throw all our samples from a dozen authors and two time periods sixteen centuries apart into a single large multivariate analysis. the results were confusing. only when we split our problem into the series of steps recounted in section above did some clarity begin to emerge. of course, picking an author from possible candidates from different time periods, where the distribution of genres between authors is inevitably unbalanced, is asking rather a lot of any method. it is considerably harder than the classic stylometric problem solved by mosteller & wallace ( / ) of assigning disputed political essays written around the same time to one of only two candidate authors -- alexander hamilton or james madison. in that case, although the two authors' styles are remarkably similar, neither tried to mimic the other. thus we should have known better than to expect enlightenment "in a single hit". nevertheless, we are sure that the success of the burrows method will tempt other workers, at least initially, to seek -stop insight, as we did. the fact that we were able to break our problem into main stages -- first deciding that the suspect text belonged to the more recent time period, then finding the author from that time period whose style matched it most closely -- was essential to making our task feasible.
we would suggest that other researchers with similar multi-author or multi-genre problems should likewise seek ways of subdividing their task.

acknowledgements

this research could not have been carried out without the kind help of professor jane crawford (loyola marymount university) and professor bernard frischer (ucla). thanks are also due to the british academy (grant reference apn ) and to uwe, bristol whose financial support has made this project possible. we also owe a debt of thanks to the various sources that we have drawn on in compiling our latin text sample (appendix i), especially the lillard classical library.

references

beale, r. & jackson, t. ( ). neural computing: an introduction. adam hilger, bristol.
best, k.-h. ( ). results and perspectives of the goettingen project on quantitative linguistics. j. quantitative linguistics, , - .
binongo, j.n.g. ( ). joaquin's joaquinesquerie, joaquinesquerie's joaquin: a statistical expression of a filipino writer's style. literary & linguistic computing, ( ), - .
bodmer, f. ( ). the loom of language. g. allen & unwin, london.
burrows, j.f. ( ). ‘an ocean where each kind...’: statistical analysis and some major determinants of literary style. computers & the humanities, , - .
burrows, j.f. ( ). not unless you ask nicely: the interpretive nexus between analysis and information. literary & linguistic computing, ( ), - .
burrows, j.f. & craig, d.h. ( ). lyrical drama and the "turbid montebanks": styles of dialogue in romantic and renaissance tragedy. computers & the humanities, , - .
forsyth, r.s. & holmes, d.i. ( ). feature-finding for text classification. literary & linguistic computing, ( ), - .
greenwood, h.h. ( ). common word frequencies and authorship in luke's gospel and acts. literary & linguistic computing, ( ), - .
gurney, p.j. & gurney, l.w. ( ). authorship attribution of the scriptores historiae augustae. literary & linguistic computing, ( ), - .
hale, j.r. ( ). renaissance europe - .
fontana / collins, london.
holmes, d.i. ( ). authorship attribution. computers & the humanities, , - .
holmes, d.i. & forsyth, r.s. ( ). the "federalist" revisited: new directions in authorship attribution. literary & linguistic computing, ( ), - .
mccuaig, w. ( ). carlo sigonio: the changing world of the late renaissance. princeton university press: princeton.
mosteller, f. & wallace, d.l. ( ). applied bayesian and classical inference: the case of the federalist papers. springer-verlag, new york. [first edition: .]
nauert, c.g. ( ). humanism and the culture of renaissance europe. cambridge university press.
sage, e.t. ( ). the pseudo-ciceronian consolatio. university of chicago press, chicago.
schinke, r., greengrass, m., robertson, a.m. & willett, p. ( ). a stemming algorithm for latin text databases. j. of documentation, , - .
scott, i. ( ). controversies over the imitation of cicero. teachers college, columbia university: new york.
tweedie, f.j., holmes, d.i. & corns, t.n. ( ). the provenance of de doctrina christiana, attributed to john milton: a statistical investigation. literary & linguistic computing, ( ), - .
wimmer, g., koehler, r., grotjahn, r. & altmann, g. ( ). towards a theory of word-length distribution. j. quantitative linguistics, , - .

appendix i -- latin text samples.

sample id | words | work | source
cons.a | | first half of consolatio | ucla
cons.b | | second half of consolatio | ucla
epistula.oct | | epistula ad octavianum | loeb
rhet.her | | rhetorica ad herennium | loeb
caesar.bc | | de bello civile, liber ii | lillard
caesar.gal | | bellum gallicum i | lillard
cicero.ami | | laelius de amicitia | lillard
cicero.arc | | pro a. licinio archia poeta oratio | lillard
cicero.att | | letters to atticus i | lillard
cicero.b | | brutus, - | lillard
cicero.b | | brutus, - | lillard
cicero.clu | | pro cluentio, - | bristol
cicero.fin | | de finibus bonorum et malorum i | lillard
cicero.ic | | in catilinam ii | oxta
cicero.imp | | de imperio cn. pompei (pro lege manilia) | lillard
cicero.leg | | de legibus, - | loeb
cicero.mar | | pro m. marcello oratio | lillard
cicero.mil | | pro milone, - | loeb
cicero.nd | | de natura deorum ii | lillard
cicero.off | | de officiis i, - | lillard
cicero.ora | | orator, - | loeb
cicero.ph | | philippics ii | lillard
cicero.ph | | philippics, vii | lillard
cicero.re | | de re publica ii, - | loeb
cicero.sen | | cato maior de senectute, - | oxta
cicero.sex | | pro sexto roscio amerino oratio | lillard
cicero.som | | somnium scipionis (de re publica vi) | lillard
cicero.sul | | pro sulla, - | oxta
cicero.t | | tusculan disputations i, - | loeb
cicero.t | | tusculan disputations ii | packard
cicero.t | | tusculan disputations iv | packard
lauredan.fra | | in funere francisci venerii ... | bl
lauredan.mat | | in funere m. antonii trivisanii ... | bl
muretus. | | de laudibus | ucla
muretus. | | de philosophiae et eloquentia ... | ucla
muretus. | | pro antonia rege navarre ad pium ... | bl
muretus. c | | ingressus explanare ciceronis libros ... | bl
muretus. | | in funere pii v pont. max. | bl
muretus. | | de utilitate iucunditate ac praestantia | ucla
muretus. | | in funere pauli foxii | ucla
nepos.att | | atticus | lillard
nepos.cat | | cato | lillard
nepos.dio | | dion | lillard
nepos.han | | hannibal | lillard
nepos.mil | | miltiades | lillard
riccobon.ben | | in obitu m. mantuae benavidii ... | vl
riccobon.nic | | ad nicolaum pontium venetiarum ... | cul
riccobon.leg | | de legum laudibus oratio | vl
riccobon.pat | | philosophorum in patavino ... | bl
riccobon.rho | | civis rhodigini et patavini oratio | bl
riccobon.stu | | oratio pro studiis humanitatis | cul
sallust.bc | | bellum catilinae, - | loeb
sallust.jug | | de bello iugurthino, - | loeb
seneca.bre | | de brevitate vitae | lillard
seneca.con | | de constantia sapientis | lillard
seneca.ira | | de ira, - | loeb
seneca.oti | | de otio | lillard
seneca.pro | | de providentia | lillard
sigonio. | | pro eloquentia i | ucla
sigonio. | | pro eloquentia ii | ucla
sigonio. | | de latinae linguae usu retinendo | ucla
sigonio. | | de laudibus historiae | ucla
sigonio.dd | | de dialogo liber, pp - | bl
sigonio.h a | | historiarum de regno italiae iv, pp - | bl
sigonio.h b | | historiarum de regno italiae iv, pp - | bl
tacitus.agr | | agricola | packard
vettori.fun | | oratio funebris de laudibus ioannis m. | ucla
vettori.hab | | oratio habita in funere ad iulium iii | ucla
vettori.lau | | liber de laudibus ioannae austriacae | ucla
vettori.pet | | oratio petri victorii in max. ii | ucla

note on sources:
bl: british library
bristol: bristol classical press
cul: cambridge university library
lillard: http://patriot.net/~lillard/cp/latlib
loeb: loeb classical library, harvard up / heinemann
oxta: oxford text archive
packard: packard humanities institute
ucla: ucla research library
vl: vatican library

appendix ii -- procedure for latin syllable counting.

the following procedure takes as input a word (w) which has been extracted as a string from the text being read and delivers an integer as result which is interpreted as the number of syllables in that word. the numbered steps are executed in order, as below. it should be noted that upper-case letters in w will already have been converted into lower case before reaching this procedure, and that any changes made to w are local and have no effect on the text outside this procedure. for counting purposes any of the characters "aeiouy@" is treated as a vowel: "@" is never present on input, but is a device to help deal with certain diphthongs. the characters "j" and "w" are used in this routine to replace "i" and "u" only in contexts where these letters should not be counted as full vowels.

within string w:
1. 'qu' becomes 'qw'
2. 'gu' becomes 'gw' in front of a vowel
3. at the beginning of w:
   'i' becomes 'j' after 'ab', 'ad', 'con' or 'ob' in front of 'a', 'e', 'o' or 'u'
   'iniu' becomes 'inju'
   'interiac' becomes 'interjac'
   'iec' becomes 'jec' after 'in', 'inter' or 'sub'
   'i' becomes 'j' in front of a vowel
4. 'i' becomes 'j' between vowels
5. 'oe' becomes '@' except between 'p' and 'm', 's' or 't'
6. 'ae' becomes '@'
7. 'au' becomes 'aw'
8. 'hui' becomes 'hwi'
9. at the end of w: 'eu' becomes 'ew'
10. the number of vowels in w is counted and returned as the result.

figures

figure 1: pca cicero: genre effect
figure 2: scaled loadings plot cicero: genre effect
figure 3: pca cicero and dubitanda
figure 4: scaled loadings plot cicero and dubitanda
figure 5: pca cicero vs. classical controls
figure 6: scaled loadings plot cicero vs. classical controls
figure 7: cluster analysis cicero vs. classical controls
figure 8: pca sigonio vs. neo-latin controls
figure 9: scaled loadings plot sigonio vs. neo-latin controls
figure 10: cluster analysis sigonio vs. neo-latin controls
figure 11: discriminant analysis cicero, sigonio and the consolatio
figure 12: discriminant analysis classical latin, neo-latin and the consolatio
figure 13: discriminant analysis words and syllables, classical and neo-latin
figure 14: discriminant analysis words and syllables, cicero and sigonio
figure 15: discriminant analysis muretus, sigonio and the consolatio
figure 16: muretus, sigonio and the consolatio.

[figure 1: pca cicero: genre effect (orations marked separately)]
[figure 2: scaled loadings plot cicero: genre effect; point labels: vero vel ut tum tamen sunt sit si sed se quod quo quidem quid quibus qui quam quae per non nihil neque nec ne_ ita in id hoc ex etiam et est esse eo enim eius ea de cum autem aut atque ad ac ab a]
[figure 3: pca cicero and dubitanda (dubitanda marked separately)]
[figure 4: scaled loadings plot cicero and dubitanda; same point labels as the loadings plot above]
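The ten steps of the syllable-counting procedure in appendix II can be sketched in Python as follows. This is a hedged reading of the rules rather than the authors' original program: the function name `count_syllables`, the use of regular expressions, and the interpretation of "at the beginning of w" and "at the end of w" as string anchors are all assumptions.

```python
import re

VOWELS = "aeiouy@"  # '@' stands in for the diphthongs 'oe' and 'ae'


def count_syllables(w):
    """Estimate the syllable count of a lower-case Latin word w."""
    # 1. 'qu' becomes 'qw'
    w = w.replace("qu", "qw")
    # 2. 'gu' becomes 'gw' in front of a vowel
    w = re.sub(rf"gu(?=[{VOWELS}])", "gw", w)
    # 3. rules applied at the beginning of w (assumed: start-of-string anchor)
    w = re.sub(r"^(ab|ad|con|ob)i(?=[aeou])", r"\1j", w)
    w = re.sub(r"^iniu", "inju", w)
    w = re.sub(r"^interiac", "interjac", w)
    w = re.sub(r"^(in|inter|sub)iec", r"\1jec", w)
    w = re.sub(rf"^i(?=[{VOWELS}])", "j", w)
    # 4. 'i' becomes 'j' between vowels
    w = re.sub(rf"(?<=[{VOWELS}])i(?=[{VOWELS}])", "j", w)
    # 5. 'oe' becomes '@' except between 'p' and 'm', 's' or 't'
    #    (replace when NOT preceded by 'p' OR NOT followed by m/s/t)
    w = re.sub(r"(?<!p)oe|oe(?![mst])", "@", w)
    # 6. 'ae' becomes '@'
    w = w.replace("ae", "@")
    # 7. 'au' becomes 'aw'
    w = w.replace("au", "aw")
    # 8. 'hui' becomes 'hwi'
    w = w.replace("hui", "hwi")
    # 9. at the end of w: 'eu' becomes 'ew'
    w = re.sub(r"eu$", "ew", w)
    # 10. count the remaining vowels
    return sum(1 for ch in w if ch in VOWELS)
```

Under this reading, `count_syllables("quae")` gives 1 (via 'qw@'), `count_syllables("eius")` gives 2 (via 'ejus'), and `count_syllables("poeta")` gives 3, because the 'oe' between 'p' and 't' is left as two vowels.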
[figure 5: pca cicero vs. classical controls; key groups: cicero, caesar, nepos, sallust, seneca, tacitus]
[figure 6: scaled loadings plot cicero vs. classical controls]
[figure 7: cluster analysis cicero vs. classical controls]
[figure 8: pca sigonio vs. neo-latin controls; key groups: sigonio, lauredanus, muretus, riccoboni, vettori]
[figure 9: scaled loadings plot sigonio vs. neo-latin controls]
[figure 10: cluster analysis sigonio vs. neo-latin controls]
[figure 11: discriminant analysis cicero, sigonio and the consolatio; frequency histogram with class centroids]
[figure 12: discriminant analysis classical latin, neo-latin and the consolatio; frequency histogram with class centroids]
[figure 13: discriminant analysis words and syllables, classical and neo-latin; frequency histogram with class centroids]
[figure 14: discriminant analysis words and syllables, cicero and sigonio; frequency histogram with class centroids]
[figure 15: discriminant analysis muretus, sigonio & the consolatio; all-groups stacked histogram of the canonical discriminant function; groups: sigonio, muretus, consolatio (#)]
[figure 16: muretus, sigonio and the consolatio; scatter plot with axes "ita" and "ac", points labelled by sample]

the character in the letter: epistolary attribution in samuel richardson’s clarissa

lisa pearl, kristine lu, and anousheh haghighi
department of cognitive sciences
social science plaza
university of california, irvine
irvine, ca
corresponding author email: lpearl@uci.edu

abstract

deliberate differences in how authors represent characters have been a core area of literary investigation since the dawn of literary theory. here, we focus on epistolary literature, where authors consciously attempt to create different character styles through series of documents like letters.
previous studies suggest that the linguistic gestalt of an author’s style – the author’s writeprint – can be extracted from the various characters of an epistolary novel, but it is unclear whether individual characters themselves also have distinct writeprints. we examine samuel richardson’s clarissa, lauded as a watershed example of the epistolary novel, using a recently developed and highly successful authorship attribution technique to determine (i) whether richardson can construct distinct character writeprints, and (ii) if so, which linguistic features he manipulated to do so. we find that while there are not as many distinct character writeprints as characters, richardson does appear to have signature features he alters to create distinct character styles – and few of these features are the function word or abstract syntactic features typically comprising author writeprints. we discuss implications for other questions about character identity in clarissa and character writeprint analysis more generally.

introduction

since the dawn of literary theory, deliberate differences in how authors represent characters have been a core area of investigation (aristotle, bce). more recently, technological advances and interdisciplinary collaborations have expanded the methodologies used for literary scholarship, enabling investigations into more complex topics of authorship (e.g. the federalist papers: adair, ; mosteller and wallace, ; rokeach et al., ; holmes and forsyth, ; tweedie et al., ; bosch and smith, ; fung, ; collins et al., ; oakes, ; rudman, ) and character psychology (zunshine, ; vermeule, ). computational stylistics (milic, ; stamatatos et al., ) has been a favored tool in literary scholarship for investigating authorial differences for at least two reasons: (i) its emphasis on character, and (ii) its ability to provide quantitative analysis without displacing the critic’s powers of interpretation (smith, ).
the related field of authorship attribution often uses similar techniques based on author differences to determine an author’s identity. this has been useful for contentiously attributed documents like the federalist papers, as well as in cases of authorship deception, where one writer attempts to consciously imitate another author’s style (sometimes called imitation attacks: brennan and greenstadt, ; pearl and steyvers, ).

here, we focus on epistolary literature, a genre defined by heightened mimetic qualities such as the lack of an omniscient narrator and the collection of “found documents” by each of the novel’s characters. these qualities make it an intriguing domain for questions of character identity, style, and authorship. in general, authors are assumed to have a writeprint where the gestalt of their linguistic feature usage is distinctive (li et al., ; abbasi and chen, ; iqbal et al., , ; pearl and steyvers, ). even in epistolary novels where authors consciously attempt to create different character styles, the author’s writeprint can be extracted. for example, burrows ( ) successfully applied his prominent delta technique to identify differences in the writing of samuel richardson’s epistolary pamela and henry fielding’s subsequent parody shamela.

however, a related question concerns the characters within the epistolary novel itself: since they are all written by the same author, are they in fact distinct? that is, do different characters created by the same author have distinct character writeprints? while an author’s writeprint may imprint each of the created characters, those characters could still be quite different stylistically; on the other hand, it could be that the author was unable to deviate significantly from his author writeprint for any of the characters.
notably, the linguistic features that comprise an author’s writeprint have often been drawn from function words and abstract syntactic structures, because these are believed not to be consciously manipulable (mosteller and wallace, ; burrows, ; holmes et al., ; binongo, ; burrows, ; juola and baayen, ; zhao and zobel, ; garcía and martin, ; stamatatos, ; lučić and blake, ; kestemont et al., ). this is why they become part of the linguistic signature of that author. so, we might expect that an author is unable to alter them even when writing as different characters. instead, perhaps other linguistic features are altered, or perhaps the author is not truly able to distinguish characters stylistically at all.

recently, van dalen-oskam ( ) examined precisely this question of distinctive character writeprints within the epistolary novels of famous dutch women writers. while successful in identifying the author writeprints using existing techniques, van dalen-oskam was less satisfied with the results of the character writeprint analysis using existing tools like bootstrap consensus trees (eder, ; eder and rybicki, ) and burrows’s zeta (burrows, ). she concluded that convincing and objective computational methods do not yet exist for the task of identifying exact differences between character writers, and that “a lot of work will have to be done to find a (combination of) method(s) that will lead to verifiable and repeatable results,” particularly beyond “words and their frequencies” (van dalen-oskam, ). one contribution of the current study is the application of a different writeprint technique used by pearl and steyvers ( ) for authorship deception, with what we feel are more satisfying results for issues surrounding character writeprints.

we focus our investigation on samuel richardson’s clarissa, one of the longest novels in english history, at over pages and nearly a million words.
it is a novel rich with critical history and has been lauded as a watershed example of the epistolary novel, receiving a pedagogical renaissance for its expanded set of character authors (over ) and insight into psychological realism (zunshine, ). with clarissa, richardson sealed his reputation not only as a masterful writer cited for his editorial prowess (price, , p. ) but also as an ambitious publisher of literary insight. the novel centers around the beautiful and virtuous clarissa harlowe, a young lady caught between her ambitious, greedy family and the wiles of the dashing libertine robert lovelace (referenced mainly by his last name). lovelace’s consuming desire to possess clarissa leads to increasingly dastardly and involved ploys of abduction and seduction. to illustrate the moral, familial, and psychological torment facing clarissa as she stalwartly clings to her virtue, richardson creates a diversity of meticulous epistles ranging from letters to legal documents to musical compositions to torn remnants of clarissa’s improvisational poetry. as noted in richardson’s postscript to clarissa, the goal of this heterogeneous collection was to capture the “interesting personalities” of the characters represented while also allowing them to be “various”, “natural”, and “well distinguished”. that is, a primary goal for richardson was to make the characters distinct enough to realistically be separate people conversing with each other.

given this rich dataset of character styles, we explore two basic questions about character writeprints in clarissa. first, can richardson construct distinct character writeprints at all? if so, it is useful to determine how distinct they are and whether there are as many character writeprints as there are characters. second, if any distinct character writeprints were in fact created, which linguistic features did richardson manipulate in order to do so?

we begin by describing the clarissa epistolary corpus in more detail.
we then briefly review the writeprint analysis method from pearl and steyvers ( ) (henceforth ps) that we apply, comparing it to other commonly used authorship techniques and highlighting its utility for automatically determining linguistic features indicative of particular characters. we next discuss the set of potential linguistic features that can comprise a character’s writeprint, which the ps technique draws from to automatically construct a given character’s writeprint. our results suggest that richardson is somewhat successful at creating distinctive character styles. however, there are not as many character writeprints as there are characters, and even the character writeprints discovered are not as distinct as typical author writeprints identified by the ps method. nonetheless, there do appear to be signature features that richardson tends to alter to create distinct character styles. interestingly, few are the function word features that have traditionally comprised author writeprints (mosteller and wallace, ; burrows, ; holmes et al., ; binongo, ; burrows, ; juola and baayen, ; zhao and zobel, ; garcía and martin, ; kestemont et al., ) or the deeper syntactic features more recently gaining prominence (stamatatos, ; lučić and blake, ). we discuss implications for other questions about character identity in clarissa and character writeprint analysis more generally, concluding with suggestions for fruitful future work in this area.

corpus: richardson’s clarissa

the version of richardson’s clarissa used in our research was downloaded from the oxford text archive at http://ota.ahds.ac.uk, and contains letters, comprising , words total. there are samples from distinct characters in total, including an epilogue by richardson himself.
though nearly all samples are letters by a single author, several are not so neatly classified: (i) letters where it is undecided who wrote them, (ii) letters jointly written by two or more characters, (iii) letters written by one character pretending to be another, and (iv) a conclusion “supposedly written by” one of the characters, john belford. table summarizes the distribution of letters across characters. it is clear that the majority of the letters ( of , which is %) come from just a few of the characters: the two central characters clarissa harlowe and lovelace, and their respective confidantes, anna howe and john belford, as highlighted in fig. . fig. shows the distribution of words per letter for these four characters, which can range from tens of words to thousands of words.

figure : the number of letters each character wrote is shown, with the points representing the four main characters indicated. the remaining points correspond to the other characters who authored letters in clarissa.

table : distribution of letters by character in clarissa, indicating character name and the number of letters written by that character.

single characters | # | others | #
clarissa harlowe | | undecided |
lovelace | | two authors |
john belford | | lovelace as clarissa |
anna howe | | lovelace as anna |
judith norton, william morden | | supposedly written by belford |
arabella harlowe, james harlowe, jr. | | |
elizabeth lawrance, john harlowe | | |
charlotte harlowe, charlotte montague, lord m | | |
antony harlowe | | |
anabella howe | | |
antony tomlinson, c. h. hickman, clarissa’s father, dorothy hervey, elias brand, joseph leman, r.d. mowbray | | |
alexander wyverly, arthur lewen, clarissa’s grandfather, dolly hervey, f. j. de la tour, hannah burton, patrick mcdonald, roger solmes, samuel richardson, tho doleman, william summers | | |

figure : the distribution of words per letter for the four main characters.

methods
overview of the ps authorship method

we adopt the method used by pearl and steyvers ( ), a highly successful authorship approach that incorporates several aspects useful for character writeprint analysis, which are listed below in ( ). the ps method is notable for using all these components together, though other authorship methods typically use some subset of them.

( ) ps authorship components
a. preprocessing the raw linguistic feature values to increase the perceived importance of those that are distinct for a given author
b. using a variety of linguistic feature types
c. utilizing a subset of the available features to form the writeprint
d. allowing some writeprint features to matter more than others when making authorship decisions

we briefly review each ps method component in turn before describing the method in more detail.

preprocessing feature values

the preprocessing step is somewhat similar to using the kullback-leibler divergence (kld), often called relative entropy (zhao et al., ; savoy, , ), as well as to burrows’s delta (burrows, ; hoover, ; burrows, , ; argamon, ; stamatatos, ; savoy, ; kestemont et al., ; savoy, ) and the chi-squared method (grieve, ; savoy, ). these approaches compare the probability of a given feature value for the author in question against the probability of that feature value in a specified comparison set. for the ps method, the comparison set is the entire set of authors collectively; in contrast, for the kld, burrows’s delta, and chi-squared methods, the comparison set is a single author at a time. to our knowledge, no other authorship method currently uses this component implemented this way, though the ps implementation of it could be incorporated into any method that involves running an algorithm over feature values.

linguistic feature types

unlike methods that rely on word frequency alone (e.g.
burrows’s delta), the ps method allows linguistic features to range across a variety of character-level, word-level, syntactic, semantic, and formatting features (see tables - ), similar to some previous approaches (see stamatatos ( ) for a review, and luyckx and daelemans ( ) and eder ( ), among others). like the preprocessing component, this could be used with any method that involves running an algorithm over feature values, though several recent implementations have not done so (e.g. the kld implementation of zhao et al. ( ), the nearest shrunken centroid (nsc) implementation of jockers ( ), the bootstrap consensus tree as applied by van dalen-oskam ( ), the principal component analysis (pca) implementation of kestemont et al. ( ), and the kld and chi-squared implementations of savoy ( )).

features in a writeprint

the sparse multinomial logistic regression (smlr) algorithm of krishnapuram et al. ( ) used by the ps method has the ability to automatically determine which subset of the available features is most useful for making authorship decisions, like several other approaches (e.g. burrows’s delta, nsc, pca, and support vector machines (svms)). this contrasts with methods like k-nearest neighbors (knn) that obligatorily use the entire feature set. importantly, the feature subset is what comprises the writeprint of any particular author. one notable ability of the smlr algorithm is that it can potentially identify a different subset of features for each author writeprint, distinguishing it from methods that require all writeprints to use the same subset of linguistic features (e.g. pca, nsc).

using writeprint features to make authorship decisions

like some other approaches (e.g. svm, pca), the smlr algorithm used by the ps method allows some writeprint features to matter more than others when making authorship decisions. a feature’s importance to the algorithm is typically indicated by its weight.
for example, a positive weight corresponds to a writeprint feature that is indicative of the author, with higher weights indicating features that are more useful for determining authorship.

applying the ps authorship method for character writeprints

the basic representation of the problem the ps method will solve for character writeprints is whether a target letter (e.g. a letter by a character from clarissa) is by the same character as the reference set of letters (e.g. letters by a single character from clarissa). so, every data point will involve information derived from the sets of letters in ( ), and the core decision is which character is the author.

( ) sets of documents used to create data points
a. same data point s = letter written by a character c1, reference set r = all letters written by c1
b. different data point d = letter written by a character c2, reference set r = all letters written by c1

for a “same author” data point ( a), the label should be the same as the character whose letters comprise the reference set (e.g. clarissa harlowe for the clarissa harlowe reference set). for a “different author” data point ( b), the label should be some other character – ideally the correct character, but at the very least not the character whose letters comprise the reference set (e.g. some non-clarissa harlowe character for the clarissa harlowe reference set).

the smlr classifier (krishnapuram et al., ) is a supervised machine learning method that first trains on a collection of these data points that are labeled with the character author, learning about the character writeprint for the character (c1) in the reference set. the classifier then attempts to apply this acquired writeprint knowledge to a new unlabeled collection of data points (the test set), which contains both same and different data points.
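To make the role of feature weights concrete, the sketch below shows only the decision rule of a multinomial logistic model with sparse weights. It does not reproduce the SMLR training procedure of Krishnapuram et al.; the weights, the feature names (`exclamation_rate`, `avg_word_length`), and the two labels are invented for illustration.

```python
from math import exp


def softmax(scores):
    """Turn per-label scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {label: exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def writeprint(label_weights):
    """Features with non-zero weight: the subset that forms the writeprint."""
    return {name for name, w in label_weights.items() if w != 0.0}


def classify(features, weights):
    """Score each label as a weighted sum of transformed feature values.

    features: feature name -> log-odds transformed value for the target letter
    weights:  label -> {feature name -> weight}; absent or zero weights mean
              the feature is not part of that label's writeprint
    """
    scores = {
        label: sum(w * features.get(name, 0.0) for name, w in ws.items())
        for label, ws in weights.items()
    }
    probs = softmax(scores)
    return max(probs, key=probs.get), probs


# Hypothetical weights: one informative feature, one pruned to zero.
toy_weights = {
    "anna howe": {"exclamation_rate": 1.5, "avg_word_length": 0.0},
    "other": {"exclamation_rate": -1.5},
}
toy_letter = {"exclamation_rate": 0.8, "avg_word_length": 0.3}
label, probs = classify(toy_letter, toy_weights)
```

A positive weight pushes the score toward that label when the transformed feature value is positive, and a feature whose weight has been driven to zero drops out of the writeprint entirely.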
if a character’s writeprint is distinct, the classifier should perform well at identifying letters written by the character in the reference set (c1); in contrast, if a character’s writeprint is not distinct, the classifier will perform poorly.

a dataset is created for each character being investigated, with that character’s letters used as the reference set r for all the data points in the set. to create a data point, we follow the ps preprocessing procedure that increases the prominence of potentially distinctive feature values by determining if a particular feature value is unusual compared to the values characters typically have for that feature. we note that this procedure is applied to each feature separately.

specifically, we first calculate the probability of the feature value fv occurring, given the distribution of feature values in the reference set of letters from the character in question: pchar = p(fv | distribution for character c1). then, we compare pchar against the probability of that feature value occurring, given the distribution of feature values in the set of letters from all characters in clarissa: pall = p(fv | distribution for all characters). this gives us a quantitative translation of how distinctive feature value fv is for the character. if the feature value is unusual (e.g., f1 in fig. ), the probability of fv coming from the character in question will be higher than the probability of fv coming from the entire population of characters (i.e. pchar > pall). if the feature value is instead common to other characters as well, the probability of fv coming from the character in question will not be higher (i.e. pchar ≈ pall). if the feature value is an aberration for this character (e.g., f2 in fig. ), the probability of fv coming from the character in question will be lower (i.e. pchar < pall).

each transformed feature value is calculated as follows, with the major aspects highlighted in fig. .
first, we log transform the feature values from the reference set of letters for that character (i.e. new value = log(raw value)), which creates a distribution of values that is roughly normally distributed, provided the sample size is large enough. we then estimate the best-fitting normal distribution for this observed distribution that represents the feature distribution for this character’s letters (char normal distribution). we note that this is where the size of the letter set authored by a particular character ceases to matter (e.g., one character writing far more letters than another); the only information extracted is the parameter values of the best-fitting normal distribution of the feature values in that character’s letters. as long as a normal distribution can be reasonably estimated, the exact size of the character letter data set is irrelevant. we then apply the same process to the set of letters from all characters in clarissa, generating the normal distribution all that represents that feature’s distribution across all characters.

we then calculate the probability that the observed feature value fv in the target letter would be drawn from the char distribution (p(fv|char) = pchar) and compare that against the probability that fv would be drawn from the all distribution (p(fv|all) = pall) using the log-odds ratio in ( ):

( ) $$\log\left(\frac{p_{\mathrm{char}}}{p_{\mathrm{all}}}\right) = \log\left(\frac{p(fv \mid \mathrm{char})}{p(fv \mid \mathrm{all})}\right)$$

a positive value means that pchar is larger than pall, and so this feature value is unusual for the population of characters as a whole, but more typical for the character whose letters comprise the reference set. that is, it is more likely to be a distinctive feature for this character. a negative value means that pchar is smaller than pall, and so this feature value is more typical for the population of characters as a whole and not for the reference set character. that is, it is likely not to be a distinctive feature for this character. fig. demonstrates this for two sample values of f: the first would have a positive log-odds ratio while the second would have a negative one.

figure : sample normal distributions derived from the log-transformed values for a single feature in the letters of a single character (char) and the letters of all characters (all). the first sample feature value is typical of char but atypical of all, and so would have a positive log-odds ratio. the second sample value is atypical of char but typical of all, and so would have a negative log-odds ratio.

because this preprocessing procedure is applied to each feature separately, the effective diagnosticity of each feature is assessed separately. importantly, what matters is not how large or small the raw feature value is for a given feature, but how unusual it is compared to other feature values that occur. so, a data point in a character’s data set, derived from a target letter and a reference set of letters from that character, has the form in ( ): a label indicating the character author of the target letter followed by a set of log-odds transformed feature values from that target letter, given the reference set.

( ) sample data points derived from target letters and a reference set of letters by anna howe
  a. anna howe, . , . , . , . , ... .
  b. clarissa harlowe, . , . , . , . , ... .

a dataset contains one data point per letter in clarissa, with every letter used as a target letter. so, some portion of a dataset will be labeled with the reference author (e.g. anna howe in ( )), while the rest will be labeled with other characters. because the preprocessing procedure requires the reference set to be of some size in order to more accurately estimate a normal distribution for an individual character’s reference set of letters (char), we restricted our analysis to characters with fifty or more letters in clarissa.
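the per-feature preprocessing described above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the original implementation: the function names (`fit_log_normal`, `log_density`, `log_odds`) and the toy values are hypothetical, raw feature values are assumed to be strictly positive so that the log transform is defined, and the "probability" of a value is assumed to be the density of the fitted normal distribution.

```python
import math
from statistics import fmean, stdev


def fit_log_normal(raw_values):
    """log-transform raw feature values and fit a normal distribution.

    returns (mean, standard deviation) of the log values. only these two
    parameters are kept, so the size of the letter set no longer matters.
    """
    logs = [math.log(v) for v in raw_values]  # assumes every value > 0
    return fmean(logs), stdev(logs)           # stdev needs >= 2 values


def log_density(x, mu, sigma):
    """log of the normal density at x (computed in log space to avoid underflow)."""
    return (-math.log(sigma)
            - 0.5 * math.log(2 * math.pi)
            - (x - mu) ** 2 / (2 * sigma ** 2))


def log_odds(fv, char_values, all_values):
    """log(pchar / pall) for one feature value fv from a target letter."""
    x = math.log(fv)
    char_mu, char_sigma = fit_log_normal(char_values)
    all_mu, all_sigma = fit_log_normal(all_values)
    return log_density(x, char_mu, char_sigma) - log_density(x, all_mu, all_sigma)


# hypothetical values of one feature: the character's letters vs. all letters
char_values = [9.0, 10.0, 11.0, 10.5, 9.5]
all_values = [1.0, 2.0, 3.0, 5.0, 8.0, 9.0, 10.0, 11.0, 10.5, 9.5, 4.0, 2.5]

typical_for_char = log_odds(10.0, char_values, all_values)   # positive
atypical_for_char = log_odds(2.0, char_values, all_values)   # negative
```

a positive result marks a value that is more typical of the character than of the population as a whole, and a negative result marks the reverse. a full data point would repeat this computation for every feature of the target letter.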
this confined our analysis to the four characters with the most letters and so yielded four distinct datasets: clarissa harlowe, lovelace, john belford, and anna howe.

the smlr classifier was then run on each constructed dataset. we note that the smlr classifier requires a parameter λ that controls how strongly the classifier is biased to select a subset of the available linguistic features for a character’s writeprint. larger values of λ lead to writeprints consisting of fewer features, while smaller values lead to writeprints consisting of more features. a value of . , which is a fairly strong bias to prefer fewer features, led to the best performance in a pilot analysis, and so we used this for our analyses below. we used the smlr java implementation available from http://www.cs.duke.edu/~amink/software/smlr/, implementing ten-fold cross-validation to evaluate the classifier’s performance.

in ten-fold cross-validation, the dataset is divided into ten “folds” of equal size, with each fold containing approximately the same distribution of labeled data as the entire data set. so, for example, since of letters are clarissa harlowe letters, each fold contained or target letters and approximately target letters in each fold were clarissa harlowe letters. the classifier then makes ten passes through the dataset, training on the data in nine of the ten folds and testing on the data in the remaining fold, with each fold taking a turn as the test fold. so, for every pass, the classifier learns what it can from 90% of the dataset about how the character labels are determined based on the preprocessed feature values, and then attempts to label the data points in the remaining 10% with the appropriate character. one benefit of cross-fold validation is that results are obtained for every single data point in the data set, as each data point will be in one of the test folds.
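the fold construction and the ten train/test passes can be sketched in python as follows. this is an illustrative sketch only: the function names and the `train_and_predict` callable are hypothetical placeholders (the analysis itself used the smlr java implementation), and the round-robin assignment is just one simple way to keep each fold's label distribution close to that of the whole dataset.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """split indices into k folds with each label spread as evenly as possible."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across labels to balance fold sizes
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


def cross_validate(data, labels, train_and_predict, k=10, seed=0):
    """return one predicted label per data point, each made while held out.

    train_and_predict(train_x, train_y, test_x) is a hypothetical stand-in
    for any classifier; it must return one label per item in test_x.
    """
    predictions = [None] * len(labels)
    for test_indices in stratified_folds(labels, k, seed):
        held_out = set(test_indices)
        train_indices = [i for i in range(len(labels)) if i not in held_out]
        predicted = train_and_predict(
            [data[i] for i in train_indices],
            [labels[i] for i in train_indices],
            [data[i] for i in test_indices],
        )
        for index, label in zip(test_indices, predicted):
            predictions[index] = label
    return predictions
```

each index lands in exactly one test fold, so every data point is labeled exactly once by a classifier that never saw it during training.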
this guards against results being impacted by a particularly easy or difficult test set, since ten test sets are used, and the average of the ten test sets is the final score.

linguistic features available for character writeprints

the ps method uses the smlr classifier to automatically construct writeprints from a set of available linguistic features and bases its authorship decisions on those writeprints. we extracted a set of linguistic features of seven different kinds, shown in tables - in the appendix: character-level features, word-level features, syntactic category features that could be viewed as contentful (e.g. nouns), syntactic category features that could be viewed as functional (e.g. prepositions), syntactic structure features (e.g. passives), formatting features (e.g. italicized words), and semantic features (e.g. endearments). similar to pearl and steyvers ( ), these are all stylometric features that can be extracted automatically using freely available natural language processing software and scripts written in a text manipulation programming language like perl.

notably, many of these features can be easily and automatically extracted from languages besides english, provided the natural language processing software is available in the desired language (i.e., the character-level features, the word-level features, several content syntactic category features, several syntactic structure features, and the formatting features). however, we do note that many of the remaining features were manually identified using knowledge of english grammar (some content syntactic category features, several functional syntactic category features, and some syntactic structure features).
additionally, the semantic features were manually identified using domain-specific knowledge of clarissa, rather than using the topic-modeling approach of pearl and steyvers ( ), due to the relatively small size of the corpus. in general, when applying the ps method to character writeprints, we believe it is likely more expedient for literary scholars to draw on their own knowledge of the particular literature under investigation when identifying potentially useful semantic features. for scholars studying works in other languages, this would require manual identification of the relevant semantic features in those languages.

results

we present two kinds of results: (i) quantitative results relating to character writeprints, based on applying the ps method, and (ii) qualitative interpretations of those results that relate to character distinctiveness and signature writeprint features manipulated by richardson.

character distinctiveness

for each of the four characters who wrote over fifty letters in clarissa, we can examine how distinctive that character’s style is by assessing the ability of the ps method to correctly label a target letter with the correct character, given a reference set of letters by that character (e.g. label a letter by clarissa harlowe as being by her, when referenced against a set of letters by her). notably, the ps method had an average success rate of % on one data set and % on another in pearl and steyvers ( )’s study that attempted to identify different authors, so it is very accurate when there are in fact different authors writing. if the character writeprints in clarissa are as distinct as author writeprints typically are, we would expect similar performance.

from the results of the smlr classifier for each character, we can derive a confusion matrix as in table , where the rows represent the true character writer of the letter and the columns represent the character writer labeled by the smlr classifier.
the four quantities correspond to the distinctions in signal detection theory, representing (i) true positives (a): the number of letters where the smlr correctly labeled the letter by the target character as being by the target character, (ii) false negatives (b): the number of letters where the smlr incorrectly labeled a letter by the target character as being by a different character, (iii) false positives (c): the number of letters where the smlr incorrectly labeled a letter not by the target character as being by the target character, and (iv) true negatives (d): the number of letters where the smlr correctly labeled a letter not by the target character as not being by the target character.

standard metrics used in computational linguistics to gauge the success of a classifier are precision and recall, defined as in ( a) and ( b), and combined into a single summary statistic known as the f-score using the harmonic mean definition in ( c). precision, recall, and f-score all range between 0 and 1, with 1 being perfect performance. the intuitive interpretation of precision is how accurate identification is, while the intuitive interpretation of recall is how complete identification is. ideally, a classifier will be both very accurate and very complete in its identification, yielding a high f-score.

table : the performance of the smlr classifier can be summarized using a confusion matrix, where the rows represent the true identity of the target letter’s author and the columns represent the smlr-labeled identity of the target letter’s author. correct labels are a (true positives) and d (true negatives), while incorrect labels are b (false negatives) and c (false positives).

                     | labeled character | labeled non-character
  true character     | a                 | b
  true non-character | c                 | d

( ) evaluation metrics, with quantities from table
  a. $$\text{precision} = \frac{\#\ \text{correctly labeled as character}}{\#\ \text{labeled as character}} = \frac{a}{a+c}$$
  b. $$\text{recall} = \frac{\#\ \text{correctly labeled as character}}{\#\ \text{should be labeled as character}} = \frac{a}{a+b}$$
  c. $$\text{f-score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

table shows the f-scores for each of the four characters investigated, while the detailed information from the confusion matrices as well as the precision and recall scores for each character appear in appendix b. most notably, none of the characters seem to have an easily identifiable style: the highest performance by f-score is . . this suggests that it is not easy for a single author, such as richardson, to differentiate the writing styles of different characters. still, there seem to be natural classes of characters: (i) those that are more distinctive (lovelace: . , clarissa harlowe: . ), (ii) those that are somewhat distinctive (john belford: . ), and (iii) those that are not very distinctive at all (anna howe: . ).

there are at least two reasons why a character’s writeprint may not be very distinctive. first, a character may simply be a hodgepodge of multiple characters’ styles, and so would overlap with all other character styles. alternatively, a character may be a derivative of other character styles and have a writeprint that borrows from the main writeprint of those other character styles. we can discern between these options by looking at the common confusions that emerge from the smlr confusion matrix, shown in table . any character which the smlr confused the target character with more than % of the time (either for the precision calculation, the recall calculation, or both) is shown.

we observe that both john belford and anna howe are confused with clarissa harlowe and lovelace, but not with each other; we interpret this to mean that john belford’s and anna howe’s writeprints are distinct from each other. moreover, while lovelace is confused with john belford, he is not confused with anna howe; we interpret this to mean john belford’s writeprint comes primarily from lovelace’s.
similarly, while clarissa harlowe is confused with anna howe, she is not confused with john belford; we interpret this to mean anna howe’s writeprint comes primarily from clarissa harlowe’s.

given the quantitative information from the f-scores and qualitative information from the systematic confusions, we suggest that richardson was able to generate two main styles, one for each of the two central characters (lovelace, clarissa harlowe). he then derived additional character styles from those two main styles (john belford from lovelace, anna howe from clarissa harlowe). narratively, this makes intuitive sense because belford and anna serve primarily as “sounding walls” for lovelace and clarissa, respectively, as the moral conflict in the novel escalates.

because of this, an interesting subsequent analysis is to calculate the f-scores for the two main characters while allowing the letters derived from their styles to count as instances of the main character letters; that is, john belford’s letters counted as instances of lovelace’s letters and anna howe’s letters counted as instances of clarissa harlowe’s letters. this increases the f-scores of both main characters: lovelace’s f-score increases to . while clarissa harlowe’s f-score increases to . . though this improvement is non-trivial, these character f-scores are still far below what we see when we compare the writings of two different authors (where f-scores are at . and above, based on pearl and steyvers ( )). a possible reason for this involves the linguistic features that richardson is able to manipulate to create different character styles. we examine these next.

signature features

for each character (whether distinctive or not), we can examine the most distinctive features according to the smlr analysis, since the classifier learned not only which features comprised a character’s writeprint but also how important each of those features was to the writeprint.
the importance of a given feature is indicated by the weight learned for it; the features with the highest positive weight are the ones that influenced the smlr’s decision the most when deciding to label a target letter based on a particular character’s reference set. though the smlr bases its decision on all non-zero weighted features, these top features can serve qualitatively as the “signature” writeprint features for that character’s style, as they matter the most to the classifier’s decision. from this, we can assess how similar signature features are across character styles.

table lists the signature features for each character style, indicating which ones are shared across character styles and which specific characters a given signature feature is associated with when more than one character utilizes that signature feature. there do seem to be common features richardson prefers to manipulate, and every character (distinctive or not) has several. the most commonly manipulated linguistic features are shown in table .

interestingly, few of these commonly manipulated features include what would be considered function words, which have often been a core distinguishing feature of author writeprints (mosteller and wallace, ; burrows, ; holmes et al., ; binongo, ; burrows, ; juola and baayen, ; zhao and zobel, ; garcía and martin, ; stamatatos, ; lučić and blake, ; kestemont et al., ). only one feature (frequency of all function words together) is of this kind. similarly, few are the abstract syntactic features that have also been central to more recent authorship studies (lučić and blake, ). for example, while the average lengths of different syntactic phrases were potential writeprint features that would capture nuances of verbosity specific to those syntactic structures, none of these features were identified as common writeprint features that richardson manipulated.
the three he manipulated which are closest to this kind of feature are the frequency of clauses, the frequency of fragments, and the frequency of wh-adverb phrases.

table : distinctiveness of main characters, in descending order by f-score, which is a summary statistic of classifier performance. signature features for each character (smlr weight ≥ . ) are also listed in descending order of strength, according to the writeprints identified by the smlr classifier. signature features appearing for more than one character are bolded, with the initials of the character(s) that share(s) that signature feature in parentheses. common confusions (accounting for ≥ % errors for precision and/or recall) are also shown.

  lovelace (l)
    f-score: .
    signature features: verb frequency (ch, ah), alphabetic character frequency, verb phrase frequency, function word frequency (ah), noun phrase frequency, fragment frequency (jb), total characters, punctuation frequency (ch), total words, wh-adverb phrase frequency (jb), title frequency (jb), word length (jb), familial term frequency, gerund or present participle frequency, infinitive to frequency
    common confusions: ch, jb

  clarissa harlowe (ch)
    f-score: .
    signature features: verb frequency (l, ah), clause frequency (jb), punctuation frequency (l), first person pronoun frequency, universal determiner frequency, contraction frequency
    common confusions: l, ah

  john belford (jb)
    f-score: .
    signature features: clause frequency (ch), parenthesis frequency, noun frequency, colon frequency, fragment frequency (l), title frequency (l), endearment frequency (ah), adverb phrase length, word length (l), wh-adverb phrase frequency (l)
    common confusions: ch, l

  anna howe (ah)
    f-score: .
    signature features: function word frequency (l), endearment frequency (jb), verb frequency (l, ch), wh-noun phrase frequency, second person pronoun frequency, emdash frequency, noun phrase frequency
    common confusions: ch, l

interestingly, two of these three syntactic structure features have cues that are somewhat contentful.
sentence fragments represent an incomplete structure, and an incomplete syntactic structure leads to an incomplete sentence meaning. likewise, wh-adverb phrases are headed by when, where, why, and how, and these words are fairly easy to define (e.g., the time something happened = when). so, both sentence fragments and wh-adverb phrases may be structural features that are easier to consciously recognize and manipulate. clause frequency may operate this way as well, since clause frequency within a sentence can be increased by either conjoining main clauses together (e.g. i like this and i want to read more) or embedding clauses in main clauses (e.g. i think that i like this). in either case, a complete thought (represented by the additional clause) is added. thus, clause frequency may also be straightforward to consciously manipulate.

the remaining commonly manipulated features range over syntactic categories based on content words (frequency of verbs), semantic categories (frequency of endearments and titles), and character-level features (frequency of punctuation, average word length). these may also be easier to consciously recognize and manipulate. this lends support to the idea that both functional category and abstract syntactic structure usage are more unconscious and therefore more indicative of an author’s genuine identity. if functional category and abstract syntactic structure usage are harder to consciously manipulate, and these aspects are often at the heart of an author’s writeprint, this could be one reason why character writeprints aren’t as distinctive as author writeprints typically are, even for an expert like richardson. instead, richardson focused on features he could consciously manipulate, which are other feature types.
table : linguistic features most commonly manipulated by richardson, including the type of feature, number of characters where the linguistic feature is among the set of signature features, and the specific characters whose writeprints contain the signature feature. characters with distinctive writeprints are indicated with an asterisk (*).

  feature                    | type                    | # char | characters
  verb frequency             | syntactic (content)     | 3      | *lovelace, *clarissa harlowe, anna howe
  clause frequency           | syntactic (structure)   | 2      | *clarissa harlowe, john belford
  fragment frequency         | syntactic (structure)   | 2      | *lovelace, john belford
  wh-adverb phrase frequency | syntactic (structure)   | 2      | *lovelace, john belford
  function word frequency    | syntactic (function)    | 2      | *lovelace, anna howe
  endearment frequency       | semantic                | 2      | john belford, anna howe
  title frequency            | semantic                | 2      | *lovelace, john belford
  punctuation frequency      | character (punctuation) | 2      | *lovelace, *clarissa harlowe
  word length                | character (word)        | 2      | *lovelace, john belford

next, how did richardson manipulate the signature features for each character? for each signature feature in a character’s writeprint, we can examine how the distribution of that feature’s values in a character’s letters compares to the distribution of that feature’s values in the entire set of character letters. in particular, a simple analysis is whether that character’s feature value was generally higher or lower than that feature’s value in the character population, as measured by average or median feature value. if this occurs, it indicates one simple way that richardson manipulated that feature to create an aspect of the character’s writeprint. we note that not all signature features will have a distribution that shows up under this analysis because there are many ways for a feature distribution to be distinctive, and having an average or median value that is higher or lower than the population average or median value is merely one of them.
nonetheless, this is a convenient analysis to try as it lends itself well to verbally summarizing a character’s style (e.g. one character uses more present participles and fewer titles), and generating content in that character’s style (e.g. to emulate this character, use more present participles and fewer titles).

for each signature feature, we compared the character’s median value against the population’s median value, and the character’s average value against the population’s average value. if the character’s median and/or average value was at least % higher or lower than the population’s median and/or average value for that feature, we included it in the set of signature features in table that have an easily describable pattern of usage. for example, lovelace’s title frequency had a median value of approximately . and an average value of approximately . . the population median value is approximately . while the population average value is approximately . . so, lovelace’s median value is less than the population median value by over % ( . / . - = - . ) and lovelace’s average value is less than the population’s average value by over % ( . / . - = - . ). this indicates that lovelace’s style uses titles less frequently than other characters, based on both the median and average values for this feature. table summarizes the results of this analysis for the four characters investigated.

table : signature features for each character that have an average or median characteristic usage that is either significantly higher (+ %) or lower (- %) than the character population average or median. if average and median usage differ, [avg] or [median] indicate which one behaves which way. distinct usage shared across character signature features is bolded, with the initials of the character(s) that share(s) that distinctive average/median usage in parentheses. if the shared distinctive usage is in the opposite direction (e.g. the first character’s is + and the second character’s is -), the character in parentheses is also italicized. asterisks (*) indicate signature feature usage that is at least % higher/lower than the character population average and/or median.

  lovelace (l)
    +: *gerund or present participle frequency, *infinitive to frequency, total characters, total words
    -: familial term frequency, title frequency (jb [avg], jb [median])

  clarissa harlowe (ch)
    +: *contraction frequency, *first person pronoun frequency
    -: universal determiner frequency

  john belford (jb)
    +: adverb phrase length, *colon frequency (ch), *parenthesis frequency, title frequency [median] (l)
    -: *endearment frequency (ah), title frequency [avg] (l)

  anna howe (ah)
    +: *emdash frequency, *endearment frequency (jb), *second person pronoun frequency, *wh-noun phrase frequency
    -: none

from this analysis, we can observe two notable stylistic choices made by richardson, both of which appear to distinguish character styles. the first distinguishing feature is the use of endearments, as endearment frequency is a signature feature for both john belford and anna howe, but in the opposite direction: belford tends to use relatively fewer endearments while anna tends to use relatively more. interestingly, this is not a signature feature for either of the main writeprints (lovelace and clarissa), so it is unlikely to be something accidentally transferred from the main writeprints to the derived writeprints. instead, it is more likely to be a conscious choice by richardson to distinguish these two characters, who write more letters than any others except for lovelace and clarissa. the second distinguishing feature is the use of titles (title frequency), which is manipulated for both lovelace and john belford.
lovelace always uses titles less frequently, while belford’s letters show a more nuanced pattern: his average usage is less frequent, but his median usage is more frequent. this suggests that belford’s letters typically use titles more frequently (i.e. many of belford’s letters have more titles than a typical character’s letter), but there are a few outlier letters that use titles far less frequently than a typical character’s letter. these outlier letters would lower the average title frequency while leaving the median title frequency relatively unaffected (see endnote about the relationship between average and median usage). a potential interpretation of this pattern is that the usage of titles in letters is something richardson felt was masculine, and so he consciously manipulated it when writing letters by male characters.

more generally, this analysis provides simple rubrics for how to write in the style of a specific character. for example, a message by lovelace should contain few titles or familial terms, be verbose, and use both present participles and infinitival to. table provides sample messages that obey these rubrics, thereby representing “prototypical” examples of these characters’ styles.

general discussion

using a state-of-the-art authorship classification approach (pearl and steyvers, ), we discovered that richardson was able to create two distinct character writeprints for the four characters examined. this suggests that while it is possible for an author to make some distinctive character writeprints, it is non-trivial to do so for each character. notably, the character writeprints richardson is able to create are not as distinctive as author writeprints typically are. a related observation is that the features richardson most often manipulates to create these character writeprints are not the functional or abstract syntactic features that have been prominent for authorship studies.
we suggest that this may be due to the accessibility of these features. that is, the reason they are so often used for author writeprints is precisely because they are not easy to consciously manipulate. in contrast, when a single author is creating several character writeprints, the manipulated features may naturally be the ones that are consciously accessible.

table : example messages from each character that follow the rubrics derived from each character’s signature features.

  lovelace: and then, what a comely sight, all kneeling down together in one pew, according to eldership, as we have seen in effigie, a whole family upon some old monument, where the honest chevalier, in armour, is presented kneeling, with uplift hands, and half a dozen jolter-headed crop-eared boys behind him, ranged gradatim, or step-fashion, according to age and size, all in the same posture–facing his pious dame, with a ruff about her neck, and as many whey-faced girls, all kneeling behind her: an altar between them, and an opened book upon it: over their heads semilunary rays darting from gilded clouds, surrounding an atchievement-motto, in coelo salus– or quies–perhaps, if they have happened to live the usual married life of brawl and contradiction. (http://ota.ox.ac.uk/text/ .html)

  clarissa harlowe: you’ll observe, that altho’ i have not demanded my estate in form, and of my trustees, yet that i have hinted at leave to retire to it. how joyfully would i keep my word, if they would accept of the offer i renew!–it was not proper, i believe you’ll think, on many accounts, to own that i was carry’d off, against my inclination. (http://ota.ox.ac.uk/text/ .html)

  john belford: he succeeds, takes private lodgings for her at hackney; visits her by stealth, both of them tender of reputations, that were extremely tender, but which neither had quite given over; for rakes of either sex are always the last to condemn or cry down themselves: visited by nobody, nor visiting: the life of a thief, or of a man beset by creditors, afraid to look out of his own house, or to be seen abroad with her. and thus went he on for twelve years, and, tho’ he had a good estate, hardly making both ends meet; for, tho’ no glare, there was no oeconomy; and besides, he had every year a child, and very fond of them was he. but none of them lived above three years: and being now, on the death of the dozenth, grown as dully sober, as if he had been a real husband, his good mrs. thomas (for he had not permitted her to take his own name) prevailed upon him, to think the loss of their children a judgment upon the parents for their wicked way of life... (http://ota.ox.ac.uk/text/ .html)

  anna howe: i have both your letters at once. it is very unhappy, my dear, since your friends will have you marry, that such a merit as yours should be addressed by a succession of worthless creatures, who have nothing but their presumption for their excuse. that these presumers appear not in this very unworthy light to some of your friends, is, because their defects are not so striking to them, as to others.–and why? shall i venture to tell you?– because they are nearer their own standard.–modesty, after all, perhaps has a concern in it; for how should they think, that a niece or a sister of theirs (i will not go higher, for fear of incurring your displeasure) should be an angel? (http://ota.ox.ac.uk/text/ .html)

applying the ps approach for related authorship questions in clarissa

in addition to these discoveries about the character writeprints in clarissa, we can also answer interesting questions about literary deception in this particular epistolary novel. notably, there are three letters in which one character is pretending to be another: lovelace writes one letter as anna howe and two letters as clarissa harlowe.
this presents an intriguing layering effect for character writeprints: richardson is attempting to create the character writeprint for a character (lovelace) who is attempting to imitate another character’s writeprint (anna or clarissa). a very basic question is whether richardson successfully shifted lovelace’s writeprint so that it appeared not to be lovelace. we applied the ps method to each of these three letters, using lovelace’s letters as the reference set. if richardson was successful at altering lovelace’s writeprint, the smlr classifier should not identify any of those letters as having been written by lovelace. this was indeed the case for all three letters, meaning that richardson effectively masked lovelace’s style for those letters. given this, was lovelace successful in his deception as anna and clarissa? to answer this, we applied the ps method to these three letters, using anna’s letters as the reference set for the letter impersonating anna and clarissa’s letters as the reference set for the letters impersonating clarissa. here, the deception seemed to fail. the letter impersonating anna was not labeled as being by anna when compared against the anna reference set. similarly, the letters impersonating clarissa were not labeled as being by clarissa when compared against the clarissa reference set. so, lovelace’s deception was incomplete in this sense – though perhaps that was richardson’s intention. in particular, richardson could have intended for the reader to recognize that the style wasn’t quite the purported one in each case (anna or clarissa, respectively).

additionally, there is a single letter that is “supposedly written” by john belford. yet, when compared against the belford letters as a reference set, this letter was not labeled as being by belford.
this suggests that richardson was unsuccessful in his portrayal of belford as the writer for this letter, whether intentionally or unintentionally. richardson may have intended for it to be written by one of the other characters; if so, it would have a writeprint matching one of these other characters. however, when compared against other character reference sets with ten or more letters (anna howe, arabella harlowe, clarissa harlowe, james harlowe jr, judith norton, lovelace, william morden), this letter was also not identified as being the writeprint of any of those characters. so, in general, this letter does not seem to be an effective portrayal of any easily identifiable character from clarissa.

. applying the ps approach more generally for character writeprints

we believe the ps approach for discerning character writeprints in epistolary novels can be used to answer several questions of interest to literary scholars, including epistolary novel technique (both richardson’s and that of other epistolary novel authors), comparative evaluations of epistolary author skill, approaches to constructing character writeprints, and cues to author identity.

with respect to samuel richardson in particular, how skilled is he in his other epistolary novels (pamela, the history of sir charles grandison) at creating the appropriate number of character writeprints? using the same ps approach, we can examine how distinct the character writeprints are and whether there are both main and derived writeprints, as in clarissa. we can also investigate whether richardson manipulates the same signature linguistic features as he did in clarissa, and if these features tend to exclude the functional and abstract syntactic features common in author writeprints.

additionally, we can apply the ps approach to investigate whether character writeprints are sensitive to major plot changes.
this nuanced question would be particularly worthwhile to examine in clarissa, as there is an abrupt letter marking the novel’s astonishing climax when relationships and alliances shift, particularly among the four central characters. do the character writeprints reflect these changing alliances, e.g., with character writeprint similarities aligning with current character alliances?

with respect to other writers of epistolary novels, especially richardson’s contemporaries like jane austen, aphra behn, fanny burney, james howell, frances brooke, and mary shelley, how skilled are these other writers at creating character writeprints? using the ps approach, we can determine how distinct these writeprints are, if there are the appropriate number, whether there are main and derived writeprints, which signature features are manipulated, and the nature of these signature features. we can also compare the signature features used by other epistolary authors to those we discovered for richardson in clarissa. if the same features or same types of features are commonly manipulated, this suggests that those are core features (or feature types) for character writeprints within epistolary literature.

the ps approach also allows us to explore the impact of author identity in a unique way, based on the types of features that distinguish character writeprints and author writeprints. in epistolary novels, authors must subsume their own identity in order to generate a distinct set of character identities. yet, our analysis of richardson’s clarissa suggests that the features manipulated to create character writeprints differ from the features typically comprising author writeprints. given this, it may be that an author consciously manipulates certain features to generate character writeprints while leaving the features that are not consciously accessible alone. in other words, the author’s writeprint features may remain the same across all character writeprints generated by that author.
suggestive evidence for this view comes from burrows ( ), who successfully distinguished richardson’s epistolary novel pamela from fielding’s parody shamela, and from van dalen-oskam ( ), who successfully distinguished the epistolary novels of two dutch women writers. if the author writeprint features do in fact remain the same across character writeprints, it should be the case that character writeprints generated by one author are as distinct from character writeprints generated by another author as the author writeprint of the first author is from the author writeprint of the second author. this is a prediction that can be empirically tested with the quantitative approach we have used here, with intriguing implications for the cues to author identity if validated.

conclusion

we have demonstrated how to apply current machine learning techniques to answer questions in literary scholarship related to authorship. this case study focuses on issues surrounding character authorship in the landmark epistolary novel clarissa, yielding both quantitative and qualitative results about the ability of the innovative samuel richardson to develop distinct character styles within a novel. the particular machine learning approach we use can serve as a reliable tool for investigating other literary questions surrounding both the style of individual characters and the style of the author who creates those characters.

notes

this dataset is available at http://www.socsci.uci.edu/~lpearl/colalab/corpora/richardsons_clarissa.zip, with letters organized by character.

we note that we present results from an “n-way” classification task (where one of n characters is selected for each data point), but this task could also be set up as a simpler binary classification task where the goal is to label a letter as simply by the “same” author or by a “different” author than the reference set.
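the two task framings described in this note can be related with a small sketch. it is only an illustration under stated assumptions: the labels are hypothetical (they are not results from the paper), and `to_binary` and `confusions` are helper names introduced here, not part of the ps method. the sketch shows how n-way character labels collapse to "same"/"different" labels for one reference character, and how the n-way labels additionally record which other character a letter was attributed to.

```python
from collections import Counter


def to_binary(labels, reference):
    """Collapse n-way character labels into 'same'/'different'
    relative to one reference character."""
    return ["same" if label == reference else "different" for label in labels]


def confusions(gold, predicted, character):
    """Count which other characters the letters truly written by
    `character` were labeled as (only available in the n-way setup)."""
    return Counter(
        p for g, p in zip(gold, predicted) if g == character and p != character
    )


# Hypothetical labels for five letters (illustration only).
gold = ["lovelace", "lovelace", "belford", "anna", "clarissa"]
predicted = ["lovelace", "belford", "lovelace", "anna", "clarissa"]

binary_predicted = to_binary(predicted, "lovelace")
lovelace_confusions = confusions(gold, predicted, "lovelace")

print(binary_predicted)
print(lovelace_confusions)
```

the binary labels discard the identity of the confused character, whereas the counter keeps it, which is the information the n-way setup makes available.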
interestingly, we achieve better results with the harder n-way classification than the easier binary (2-way) classification, suggesting that it is useful to know which specific other character a letter was written by, rather than simply knowing that it was written by a different character from the author of the reference set letters. this may be a specific instance of the more general situation where a seemingly harder problem is actually easier than a seemingly simpler problem because of the subtle information in the data available for the harder problem (e.g. joint inference in cognitive development: dillon et al., ; doyle and levy, ; feldman et al., ; börschinger and johnson, ). in addition, the n-way classification task allows us to see which specific characters a given character is confused with, which is an important aspect of our character writeprint analysis.

we used the stanford part-of-speech tagger (available at http://nlp.stanford.edu/software/tagger.shtml) to identify syntactic categories and the stanford parser (available at http://nlp.stanford.edu/software/lex-parser.shtml) to identify syntactic structures.
the syntactic category tags used from the stanford part-of-speech tagger are as follows, with an example of each tag in parentheses: cc = coordinating conjunction (and), cd = cardinal number (one penguin), dt = determiner (the), eos = end of sentence marker (there’s a penguin here!), ex = existential there (there’s a penguin here), fw = foreign word (hola), in = preposition or subordinating conjunction (after), jj = adjective (good), jjr = comparative adjective (better), jjs = superlative adjective (best), ls = list item marker (one, two, three, ...), md = modal (could), nn = singular or mass noun (penguin, ice), nns = plural noun (penguins), nnp = proper noun (jack), nnps = plural proper noun (there are two jacks?), pdt = predeterminer (all the penguins), pos = possessive ending (penguin’s), prp = personal pronoun (me), prp$ = possessive pronoun (my), rb = adverb (easily), rbr = comparative adverb (later), rbs = superlative adverb (most easily), rp = particle (look it up), sym = symbol (this = that), to = to (i want to go), uh = interjection (oh), vb = base form of verb (we should go), vbd = past tense verb (we went), vbg = gerund or present participle (we are going), vbn = past participle (we should have gone), vbp = non-3rd person singular present tense verb (you go), vbz = 3rd person singular present tense verb (he goes), wdt = wh-determiner (which one), wp = wh-pronoun (who), wp$ = possessive wh-pronoun (whose), wrb = wh-adverb (how).
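as a rough illustration of how tags like these can become frequency features, the sketch below computes the share of tokens carrying each tag, and the share of adjective tokens (jj, jjr, jjs) among all word tokens, following the token-ratio form listed for adjectives in the appendix tables. it assumes the text has already been tagged (for instance by the stanford tagger) and makes no tagger call itself; the function names and the tagged example are hypothetical, and tags are written in upper case as is conventional for this tagset.

```python
from collections import Counter

# Adjective tags from the list above (upper-case by convention).
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}


def tag_relative_frequencies(tagged_tokens):
    """Return {tag: share of tokens with that tag} for (word, tag) pairs."""
    total = len(tagged_tokens)
    if total == 0:
        return {}
    counts = Counter(tag for _, tag in tagged_tokens)
    return {tag: count / total for tag, count in counts.items()}


def adjective_frequency(tagged_tokens):
    """Share of tokens tagged as any kind of adjective."""
    total = len(tagged_tokens)
    if total == 0:
        return 0.0
    adjectives = sum(1 for _, tag in tagged_tokens if tag in ADJECTIVE_TAGS)
    return adjectives / total


# Hypothetical pre-tagged sentence (illustration only).
tagged = [("the", "DT"), ("good", "JJ"), ("penguin", "NN"), ("goes", "VBZ")]

frequencies = tag_relative_frequencies(tagged)
adjective_share = adjective_frequency(tagged)

print(frequencies)
print(adjective_share)
```

normalizing by the number of word tokens keeps the feature comparable across letters of different lengths.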
the phrase-structure tags used from the stanford parser are as follows, with an example of each tag in parentheses: s = declarative sentence (i like penguins), sinv = sentences with subject-auxiliary inversion (never have i seen such penguins!), sbar = embedded clauses (i like penguins that are cute.), intj = interjection (um), frag = fragment (see penguins in the), rrc = reduced relative clause (penguins not presently swimming), sbarq = wh-questions (what did you see?), sq = yes/no questions (did you see that?), adjp = adjective phrase (outrageously cute), advp = adverb phrase (rather sweetly), conjp = multi-word conjunctions (...as well as...), lst = list marker (one, two, three), nac = not a constituent (in the back of my mind it), np = noun phrase (those penguins), nx = sub-noun phrase (other people and), pp = preposition phrase (with the penguins), prn = parenthetical (those penguins (i really like them)), prt = particle (look it up), qp = quantifier phrase (a little bit more), ucp = unlike coordinate phrase (from that, but that’s why), vp = verb phrase (we like penguins), whadjp = wh-adjective phrase (how hot is it?), whadvp = wh-adverb phrase (how are you?), whnp = wh-noun phrase (who are you?), whpp = wh-preposition phrase (with whom did you see it?), x = unknown phrase.

this inference is also supported by noting that four of the six signature features of john belford’s writeprint that are shared by other characters are shared only by lovelace’s writeprint (fragment frequency, title frequency, word length, and wh-adverb phrase frequency).

see discussion in the next section for how signature features were derived for each character.

we note that while average and median population values generally align, sometimes they do not. this is because average values don’t factor out the effect of outliers that shift the average value significantly up or down, while median values do.
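the contrast between average and median described in this note can be checked numerically. the sketch below is only an illustration: the values are hypothetical (they are not taken from the clarissa data), and `relative_difference` is a helper name introduced here. it computes value / reference - 1, the ratio used when comparing a value to a population average or median.

```python
from statistics import mean, median

# Hypothetical population: ten very low outliers plus ninety typical values.
values = [1.0] * 10 + [100.0] * 90

average = mean(values)    # pulled downward by the ten outliers
middle = median(values)   # unaffected by the outliers


def relative_difference(value, reference):
    """How far `value` lies above (+) or below (-) `reference`, as a fraction."""
    return value / reference - 1


observed = 95.0
vs_average = relative_difference(observed, average)
vs_median = relative_difference(observed, middle)

print(f"average={average:.1f}, median={middle:.1f}")
print(f"vs average: {vs_average:+.1%}, vs median: {vs_median:+.1%}")
```

with these made-up numbers the same observed value sits above the average but below the median, which is why the two comparisons can point in opposite directions.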
for example, suppose a set of values has ten very low values around , while the remaining values have an average of . the average of this set is about while the median is likely to be around . a value of is then % higher than the average ( / - = . ) but % lower than the median ( / - = - . ).

references

abbasi, a. and chen, h. ( ). writeprints: a stylometric approach to identity-level identification and similarity detection in cyberspace. acm transactions on information systems (tois), ( ): . adair, d. ( ). the authorship of the disputed federalist papers: part ii. the william and mary quarterly: magazine of early american history, institutions, and culture: – . argamon, s. ( ). interpreting burrows’s delta: geometric and probabilistic foundations. literary and linguistic computing, ( ): – . aristotle ( bce). poetics. url http://classics.mit.edu/aristotle/poetics. . .html. binongo, j. n. g. ( ). who wrote the th book of oz? an application of multivariate analysis to authorship attribution. chance, ( ): – . börschinger, b. and johnson, m. ( ). exploring the role of stress in bayesian word segmentation using adaptor grammars. association for computational linguistics. bosch, r. a. and smith, j. a. ( ). separating hyperplanes and the authorship of the disputed federalist papers. american mathematical monthly: – . brennan, m. r. and greenstadt, r. ( ). practical attacks against authorship recognition techniques. in iaai. burrows, j. ( ). delta: a measure of stylistic difference and a guide to likely authorship. literary and linguistic computing, ( ): – . burrows, j. ( ). questions of authorship: attribution and beyond a lecture delivered on the occasion of the roberto busa award ach-allc , new york. computers and the humanities, ( ): – . burrows, j. ( ). who wrote shamela?
verifying the authorship of a parodic text. literary and linguistic computing, ( ): – . burrows, j. ( ). all the way through: testing for authorship in different frequency strata. literary and linguistic computing, ( ): – . burrows, j. f. ( ). word-patterns and story-shapes: the statistical analysis of narrative style. literary and linguistic computing, ( ): – . collins, j., kaufer, d., vlachos, p., butler, b., and ishizaki, s. ( ). detecting collaborations in text comparing the authors’ rhetorical language choices in the federalist papers. computers and the humanities, ( ): – . http://classics.mit.edu/aristotle/poetics. . .html dillon, b., dunbar, e., and idsardi, w. ( ). a single-stage approach to learning phonological categories: insights from inuktitut. cognitive science, : – . doyle, g. and levy, r. ( ). combining multiple information types in bayesian word segmen- tation. in hlt-naacl. citeseer, pp. – . eder, m. ( ). does size matter? authorship attribution, small samples, big problem. proceed- ings of digital humanities: – . eder, m. ( ). mind your corpus: systematic errors in authorship attribution. literary and linguistic computing: fqt . eder, m. and rybicki, j. ( ). stylometry with r. in digital humanities : conference abstracts. citeseer, pp. – . feldman, n., griffiths, t., goldwater, s., and morgan, j. ( ). a role for the developing lexicon in phonetic category acquisition. psychological review, ( ): – . fung, g. ( ). the disputed federalist papers: svm feature selection via concave minimiza- tion. in proceedings of the conference on diversity in computing. acm, pp. – . garcı́a, a. m. and martin, j. c. ( ). function words in authorship attribution studies. literary and linguistic computing, ( ): – . grieve, j. ( ). quantitative authorship attribution: an evaluation of techniques. literary and linguistic computing, ( ): – . holmes, d. i. and forsyth, r. s. ( ). the federalist revisited: new directions in authorship attribution. 
literary and linguistic computing, ( ): – . holmes, d. i., robertson, m., and paez, r. ( ). stephen crane and the new-york tribune: a case study in traditional and non-traditional authorship attribution. computers and the hu- manities, ( ): – . hoover, d. l. ( ). testing burrows’s delta. literary and linguistic computing, ( ): – . iqbal, f., binsalleeh, h., fung, b. c., and debbabi, m. ( ). mining writeprints from anony- mous e-mails for forensic investigation. digital investigation, ( ): – . iqbal, f., hadjidj, r., fung, b. c., and debbabi, m. ( ). a novel approach of mining write-prints for authorship attribution in e-mail forensics. digital investigation, : s –s . jockers, m. l. ( ). testing authorship in the personal writings of joseph smith using nsc classification. literary and linguistic computing, ( ): – . juola, p. and baayen, r. h. ( ). a controlled-corpus experiment in authorship identification by cross-entropy. literary and linguistic computing, (suppl): – . kestemont, m., moens, s., and deploige, j. ( ). collaborative authorship in the twelfth cen- tury: a stylometric study of hildegard of bingen and guibert of gembloux. digital scholarship in the humanities, ( ): – . krishnapuram, b., figueiredo, m., carin, l., and hartemink, a. ( ). sparse multinomial logistic regression: fast algorithms and generalization bounds. ieee transactions on pattern analysis and machine intelligence, : – . li, j., zheng, r., and chen, h. ( ). from fingerprint to writeprint. communications of the acm, ( ): – . lučić, a. and blake, c. l. ( ). a syntactic characterization of authorship style surrounding proper names. digital scholarship in the humanities, ( ): – . luyckx, k. and daelemans, w. ( ). the effect of author set size and data size in authorship attribution. literary and linguistic computing, ( ): – . milic, l. t. ( ). the next step. computers and the humanities, ( ): – . mosteller, f. and wallace, d. l. ( ). 
inference in an authorship problem: a comparative study of discrimination methods applied to the authorship of the disputed federalist papers. journal of the american statistical association, ( ): – . mosteller, f. and wallace, d. l. ( ). applied bayesian and classical inference: the case of the federalist papers. springer science & business media. oakes, m. p. ( ). ant colony optimisation for stylometry: the federalist papers. in proceed- ings of the th international conference on recent advances in soft computing. pp. – . pearl, l. and steyvers, m. ( ). detecting authorship deception: a supervised machine learning approach using author writeprints. literary and linguistic computing, ( ): – . price, l. ( ). the anthology and the rise of the novel: from richardson to george eliot. cambridge university press. rokeach, m., homant, r., and penner, l. ( ). a value analysis of the disputed federalist papers. journal of personality and social psychology, ( ): . rudman, j. ( ). the non-traditional case for the authorship of the twelve disputed federalist papers: a monument built on sand. proceedings of ach/allc . savoy, j. ( ). authorship attribution based on a probabilistic topic model. information pro- cessing & management, ( ): – . savoy, j. ( ). estimating the probability of an authorship attribution. journal of the association for information science and technology. smith, j. b. ( ). computer criticism. literary computing and literary criticism: theoretical and practical essays on theme and rhetoric: – . stamatatos, e. ( ). a survey of modern authorship attribution methods. journal of the amer- ican society for information science and technology, ( ): – . stamatatos, e., fakotakis, n., and kokkinakis, g. ( ). computer-based authorship attribu- tion without lexical measures. computers and the humanities, ( ): – . tweedie, f. j., singh, s., and holmes, d. i. ( ). neural network applications in stylometry: the federalist papers. computers and the humanities, ( ): – . 
van dalen-oskam, k. ( ). epistolary voices. the case of elisabeth wolff and agatha deken. literary and linguistic computing: fqu . vermeule, b. ( ). why do we care about literary characters? jhu press. zhao, y. and zobel, j. ( ). effective and scalable authorship attribution using function words. in information retrieval technology. springer, pp. – . zhao, y., zobel, j., and vines, p. ( ). using relative entropy for authorship attribution. in information retrieval technology. springer, pp. – . zunshine, l. ( ). introduction to cognitive cultural studies. jhu press. a potential features in character writeprints table : potential character-level features available to the smlr classifier for character writeprints. the feature name, description, number of individual features of this kind, and im- plementation are provided. one or more of the following is provided in parentheses for each feature: an example of that feature or a description of that feature. feature description # implementation alphabetic characters all letters # char tokens total # char tokens digits all digits - punctuation all punctuation marks word length average length of words # char tokens # word tokens punctuation apostrophes, colons, commas, double quotation marks, ellipses, em dashes, en dashes, exclamation marks, forward slashes, interrobangs, multiple punctu- ation (!!), parentheses, periods, ques- tions marks, semicolons, single quotation marks, square brackets # punct tokens total # punct tokens total characters total # of characters # character tokens table : potential word-level features available to the smlr classifier for character writeprints. the feature name, description, number of individual features of this kind, and implementation are provided. one or more of the following is provided in parentheses for each feature: an example of that feature and/or part-of-speech tags used to identify that feature. feature description # implementation contractions won’t, can’t, etc. 
# contracted words # word tokens foreign words foreign words (fw) #foreign words #word tokens hyphenated words ever-to-be-revered, etc. # hyphenated word tokens # word tokens lexical diversity word types word tokens # word types # word tokens total words # total words total # word tokens table : potential content syntactic category features available to the smlr classifier for character writeprints. the feature name, description, number of individual features of this kind, and imple- mentation are provided. one or more of the following is provided in parentheses for each feature: an example of that feature and/or part-of-speech tags used to identify that feature. feature description # implementation adjectives all adjectives (good), comparative adjec- tives (jjr), superlative adjectives (jjs) # adj tokens # word tokens adverbs all adverbs (really), basic adverbs (rb), comparative adverbs (rbr), superlative adverbs (rbs) # adv tokens # word tokens cardinal numbers one, two, three, etc. # cardinal tokens # word tokens interjections all interjections (uh) # interjection tokens # word tokens nouns all nouns, plural nouns (nns), plural proper nouns (nnps), singular or mass nouns (nn), singular proper nouns (nnp) # noun tokens # word tokens ordinal numbers first, second, third, etc. # ordinal tokens # word tokens possessives clarissa’s (pos) # poss tokens # word tokens pronouns st person pronouns (i), nd person pro- nouns (you), rd person pronouns (he), demonstrative pronouns (this), personal pronouns (prp), possessive pronouns (prp$ ), relative pronoun (which) # pronoun tokens # word tokens verbs all verbs, gerund or present participle (vbg), non-finite verbs (vb), past partici- ple (vbn), past tense verbs (vbd) # verb tokens # word tokens table : potential functional syntactic category features available to the smlr classifier for character writeprints. the feature name, description, number of individual features of this kind, and implementation are provided. 
one or more of the following is provided in parentheses for each feature: an example of that feature, part-of-speech tags used to identify that feature, or a collection of tokens comprising that feature. feature description # implementation coordinating conjunctions although, and, because, but, for, nor, or, since, so, though, unless, while, yet # coordinating conj # word tokens determiners additive (more), alternative (another, other, somebody else, different), articles (a, an, the), disjunctive (either, neither), distributive (each, every), elective (any, either, whichever), equative (same), evaluative (such, so, that), exclamative (what cheek), existential (some, any), inter- rogative & relative (which, what, whichever, whatever), maximal & minimal (most, least), negative (no, neither), paucal (a few, a little, some), personal (we friends, you scoundrels), quantifiers (all, few, many, several, some, each, every, each, any, no, a lot of, much), subtractive (less, fewer), sufficiency (enough, sufficient, plenty), uni- versal (all, both, every) # det tokens # word tokens function words all function words: articles (a, an, the), copula be, deter- miners (dt ), expletives (ex), infinitival to, prepositions, personal pronouns, possessive pronouns (prp), posses- sives (prp$ ), relative pronouns (which) # function tokens # word tokens infinitival to to go # infin to tokens # word tokens prepositions with, etc. # prep tokens # word tokens sentence connectors occurring at the beginning or end of sentences: also, any- way, as, besides, finally, first, furthermore, hence, how- ever, in addition, last but not least, lastly, moreover, nev- ertheless, on the other hand, otherwise, second, so, still, then, thus, too, well, yet # sent connectors # word tokens table : potential syntactic structure features available to the smlr classifier for character writeprints. the feature name, description, number of individual features of this kind, and imple- mentation are provided. 
one or more of the following is provided in parentheses for each feature: an example of that feature, part-of-speech tags used to identify that feature, or phrase-structure tags used to identify that feature. feature description # implementation avg phrase length adjective phrases (adjp), adverb phrases (advp), conjunction phrases (conjp), noun phrases (np), parenthetical phrases ((...)), preposition phrases (pp), quantifier phrases (qp), verb phrases (vp) # word tokens in phrase type total # phrase type avg sentence length sentences, exclamations, questions # word tokens in sent type total # sent type clauses she laughed, and then she cried # clause tokens # sentences embedded clauses the ones that i like # emb cl tokens # clauses exclamations what a surprise! # excl tokens total # sentences fragments fragments (frag) # fragment tokens # sentences imperatives do this now! # imperative tokens # sentences main clauses i think she’s right. # main cl tokens # clauses passives was kissed, etc. # passive tokens # sentences phrases (rel freq) adjective phrases (adjp), adverb phrases (advp), conjunction phrases (conjp), noun phrases (np), preposition phrases (pp), quantifier phrases (qp), verb phrases (vp) # phrase type tokens # phrases phrases all phrases (adjp, advp, conjp, np, pp, qp, vp, whadjp, whadvp, whnp, whpp) # phrase tokens questions was that a surprise? # questions tokens # sentences sentences total sentences # sentence tokens wh-phrases wh-adjective phrases (how hot, whadjp), wh-adverb phrases (how, whadvp), wh-noun phrases (what, whnp), wh-preposition phrases (to what, whpp) # wh−phrase tokens # phrases table : potential formatting features available to the smlr classifier for character writeprints. the feature name, description, number of individual features of this kind, and implementation are provided. an example or description of each feature is provided. feature description # implementation all capitals like, etc. 
# caps word tokens # word tokens italicized phrases average length of italicized phrases # ital word tokens # italicized phrases parenthetical words this seems (decidedly) interesting, etc. # paren word tokens # word tokens italicized words like, etc. # ital word tokens # word tokens table : potential domain-specific semantic features available to the smlr classifier for char- acter writeprints. the feature name, description, number of individual features of this kind, and implementation are provided. one or more of the following is provided in parentheses for each feature: an example of that feature or a collection of tokens comprising that feature. feature description # implementation endearments affectionate(ly), beloved, (my) bird, charmer, darling, (my) dear, dearest, dearly, (ever-)affectionate, (my) good, goodness, honey, lamb, (my) love, (my) pet, sincere, sincerity, (my) sweet # endearment tokens # word tokens epistolary correspondence, letter(s), paper(s), par- cel(s), pen(s), receipt(s), resume, write, written, writing, wrote # epistolary tokens # word tokens familial aunt, brother, child, daughter, family, father, grandfather, mamma, maternal, mother, papa, paternal, sister, son, uncle # familial tokens # word tokens propriety compliment, congratulate, excuse me, grateful, manners, obliged, politeness, propriety, thank you # propriety tokens # word tokens “remarkable” adverbs best, better, quite, so, soon, sooner, soon- est, too, well # remarkable tokens # word tokens titles dr, esq, jr, lord, madam, miss, ms, mr, mrs, reverend, sir, sr # title tokens # word tokens writing authorship, compose, composition, cor- respondence, drop a line, indite, ink, letter(s), paper(s), parcel, pen(s), pen- ning, receipt, resume, spell, words, write, (piece of) writing, written material # writing tokens # word tokens b detailed character precision and recall scores below we show the details of the confusion matrices as well as the precision and recall scores used to 
calculate the f-scores reported in the main text for each character. precision and recall scores range between . and . .

table : specific confusion matrix values and accompanying precision and recall scores for clarissa harlowe’s letters.
labeled recall clarissa non-clarissa true clarissa . non-clarissa precision .

table : specific confusion matrix values and accompanying precision and recall scores for lovelace’s letters.
labeled recall lovelace non-lovelace true lovelace . non-lovelace precision .

table : specific confusion matrix values and accompanying precision and recall scores for john belford’s letters.
labeled recall belford non-belford true belford . non-belford precision .

table : specific confusion matrix values and accompanying precision and recall scores for anna howe’s letters.
labeled recall anna non-anna true anna . non-anna precision .

white paper
report id:
application number: hd- -
project director: timothy duguid
institution: texas a & m university, college station
reporting period: / / - / /
report due: / /
date submitted: / /

white paper
grant number hd- -
“muso: aggregation and peer review in music”
timothy duguid, project director
texas a&m university
august ,

narrative description

“muso: aggregation and peer review in music” was a project that laid the foundation for a virtual research environment (vre)
dedicated to music. it explored ways in which such an environment could draw from and contribute to existing vres in the fields of history and literature. the muso (music scholarship online) project considered the descriptive metadata needed for digital projects in music to become interoperable with these existing resources and proposed a peer reviewing mechanism that would provide quality control for the projects that would be aggregated by the muso vre. project activities at the end of september, the project director and principal investigator, timothy duguid, attended the project directors’ meeting at the neh headquarters in washington d.c. by this time preparations had already begun in organizing the primary output of the project: a meeting to discuss issues surrounding aggregation and peer review of digital projects in music. immediately following the awarding of the grant, the college of liberal arts published a story on the project (along with another neh project) that was placed on the school’s website and sent to texas a&m former students (see appendix a). to promote the meeting and the activities of the project, a website was built, at http://muso.tamu.edu. in addition, the project director utilized twitter and facebook to promote the activities of the muso project, including the meeting and the conference presentations that followed. the meeting gathered a group of leading music librarians, musicologists, and music encoders to discuss these issues at texas a&m university on january and . 
discussions occurred both through an email discussion list and at the meeting, the latter of which was attended by the following individuals:
maristella feustle, university of north texas
richard freedman, haverford college
giuseppe gerbino, columbia university
francesca giannetti, rutgers university
johannes kepper, detmold/paderborn
mark mcknight, university of north texas
laurent pugin, répertoire international des sources musicales
perry roland, university of virginia
craig sapp, stanford university
carl stahmer, university of california at davis
joanna swafford, state university of new york, new paltz
raffaele viglianti, university of maryland
the following participants were unable to attend the meeting, but they participated in the email discussions:
mauro calcagno, university of pennsylvania
tim crawford, goldsmiths college, university of london
ichiro fujinaga, mcgill university
robin leaver, yale university
jesse rodin, stanford university
sherry vellucci, university of new hampshire
those unable to attend the meeting were also able to participate remotely using the bluejeans meeting service to which texas a&m has an ongoing subscription. the meeting was divided into two parts: day one dealt with issues surrounding the minimum metadata that should be required of digital projects in music, and day two focused on issues of peer review for digital projects (see appendix b). the meeting incorporated several paper presentations, focused small group discussions, and large group discussions. participants were invited to make notes on a shared google doc, and those notes were used to compile the final meeting notes (see appendix c). in the weeks and months following the meeting in january and before the arc spring meeting, discussions continued via the email discussion list.
in the course of those discussions, we were able to confirm a set of six recommendations to the arc community that would promote the aggregation of musical resources and content along with its existing historical and literary collections (see appendix d). the project director attended the spring arc meeting held at purdue university on may - . at that meeting, he presented a proposal for muso to join the arc community. muso was officially admitted to the community, and the director participated in the metadata discussions that followed, presenting muso’s recommendations for modifications to arc’s metadata standards. due to the activities of muso, the project director was invited to participate in a question-and-answer panel at the music library association in cincinnati entitled “bridging emerging and established approaches to music research”, during which he discussed the muso project and the need for a virtual research environment for music that draws together new and existing digital resources for music along with those being created by scholars in other humanistic fields. to promote the findings of the muso meeting among musicologists and music encoders, the project director gave paper and poster presentations at relevant conferences. first, he presented a paper at the american musicological society southwest spring meeting in san antonio, tx (see appendix e). he also presented a poster at the music encoding conference in montreal. the poster was entitled “music scholarship online: aggregation and peer review for music” (see appendix f), and it garnered much interest from the encoding community as it seeks to ensure high-quality digital scholarship is accessible and discoverable to music performers and scholars alike.
accomplishments
this project set out to accomplish two things: to establish a metadata framework for digital objects relating to music and to outline a method for peer reviewing digital content relating to music.
the first objective was accomplished through the rdf recommendations that muso made to the arc community. with these set, muso has a basic rdf schema on which to start aggregating objects, one that will make them interoperable with digital objects relating to other disciplines already aggregated by arc. the second objective was accomplished by outlining a bi-level peer review process. the muso community decided that the first level of peer review would be most appropriate for digital collections. at this level, the muso advisory board would determine the suitability of the content by asking each project the following questions:
1. to whom is this content interesting?
2. how does the project make its materials manifest, exposed, and documented?
3. what is the sustainability plan for the project?
4. does the project achieve its own goals?
should a project require a more rigorous academic review, the muso peer review board would assign it to a discipline-specific reviewer who would consider the resource’s content and a technology reviewer who would ensure the material is stored and presented in ways that adhere to current standards.
audiences
there were three major audiences for this project: music librarians, traditional and digital musicologists, and digital humanists. participants in the muso meetings and email discussions were taken from these three groups, and specific actions were taken to reach out to each during and after the meeting in january. the first major audience, music librarians, forms perhaps one of the most important groups for the muso project, particularly in its early stages, due to their expertise in descriptive metadata and the myriad of recent and ongoing digitization projects that could be aggregated into muso. as such, the project director has attended consecutive annual meetings of the music library association. this has been invaluable as he has been able to raise awareness for the muso project.
at the last meeting in cincinnati this past march, the project director participated in a panel during which he outlined the muso project for an audience of music librarians. the second major audience, digital musicologists, is a significantly smaller group than music librarians. however, it is no less important. this second group will generate the born-digital projects that muso will review and aggregate. the project director gave a paper presentation at the april meeting of the american musicological society southwest chapter in san antonio, which was attended by musicologists from texas and the american southwest. the paper was very well received and has sparked new collaborations with musicologists in the southwest united states, as many of them are engaging in born-digital research projects and music digitization projects. in addition, the project director presented a poster in may at the music encoding conference in montreal. the poster garnered significant interest from the digital musicologists and students who attended the conference from around the world. the final major audience, digital humanists, was also reached in a couple of significant ways. first, the muso project teamed up with the digital humanities working group at texas a&m to host a public lunchtime presentation by muso participant carl stahmer on the final day of the muso meeting. the presentation, entitled “the early modern ideology: the economics and politics of moveable and virtual type”, built on the muso discussions by exploring the new developments in the english short title catalogue, particularly as it builds and implements a linked data infrastructure for its database. the presentation was attended by digital humanities scholars from texas a&m.
the project director has also had a snapshot presentation accepted at the upcoming digital library federation forum meeting in milwaukee in november, which will serve to expand awareness of the muso project, particularly as it looks ahead towards implementation.
continuation of the project
with the rdf established and a peer reviewing process outlined, muso is ready to be implemented as a full-fledged virtual research environment and member of the advanced research consortium. the project director has taken a new position at glasgow university starting in october of this year, and glasgow is very interested in the development and implementation of muso and will support the director as he seeks implementation funding for the project from within the united kingdom. in particular, the mellon foundation has historically funded projects similar to muso, and the leverhulme foundation and arts and humanities research council are other potential benefactors for an implementation project. thanks to the outreach efforts of the project, a number of partnerships have been strengthened that will be critical in the future implementation of muso. the advanced research consortium has been a key partner in the muso project, and this relationship promises to continue into the implementation, particularly as muso has officially become a member of the arc community. in addition, the muso project has resulted in strengthened collaborations with the music encoding initiative (mei) out of the university of virginia, and muso will look to establish mei as the standard for encoding rdf-compliant metadata for participating projects. it will also continue to work with the single interface for music score searching and analysis (simssa) project out of mcgill university to develop a method of sharing data so that both can benefit from the content available through their aggregated digital resources.
a number of new partnerships have also been formed as a result of the outreach efforts of the muso project. notably, muso will collaborate with digital humanities quarterly and dhcommons to publish muso peer reviews. muso will also pursue formal partnerships with répertoire international des sources musicales as well as the opera and ballet primary sources project out of brigham young university to help identify and provide content for aggregation into muso and to fine-tune the rdf for muso.
long term impact
thanks to the recommendations of the muso community, a number of changes have been made to the arc rdf standards that will allow it to aggregate and describe musical content in ways that will be meaningful to music scholars. this will broaden the scope and impact of arc as it seeks to make digital scholarly content discoverable and accessible to students and researchers around the world.
grant products
the grant resulted in a number of products. first, elizabeth grumbach and laura mandell gave presentations on aggregation and digital peer review at the muso meeting in january, and the powerpoint slides from these presentations are freely available through texas a&m’s digital repository, the oaktrust at https://oaktrust.library.tamu.edu/handle/ . / . in addition, the official notes from the meeting are stored and freely available from the oaktrust. the project director’s panel presentation in “bridging emerging and established approaches to music research” from the music library association meeting may be viewed through the mla’s conference video archives at http://www.musiclibraryassoc.org/mpage/mla_ _media. the poster presented at the music encoding conference is reproduced in appendix f. finally, the presentation that the project director made at the american musicological society is available in appendix e and is being revised for submission to notes, the journal of the music library association.
the muso website, available at http://muso.tamu.edu, provides another set of links to many of these resources. it is hosted by the initiative for digital humanities, media, and culture (idhmc), where it will continue to reside for the next two years, until funding can be secured to implement muso, or until another host can be found.

appendix a: local publicity
major grants to preserve the arts
july ,
(taken from https://liberalarts.tamu.edu/blog/ / / /major-grants-to-preserve-the-arts/)
by tyler webb
with two new grants from the national endowment for the humanities (neh), one of the largest funders of humanities programs in the united states, the college of liberal arts at texas a&m university will be able to preserve historical culture and musical arts. daniel schwartz, an assistant professor in the department of history and grant recipient, is on the second round of funding of a $ , grant project, “advanced reference resources for middle eastern history.” the main focus of the project, which is co-authored by david michelson of vanderbilt and jeanne-nicole mellon saint-laurent of marquette, is the website syriaca.org, which publishes online reference works regarding the culture, literature and history of syriac communities from the late antiquity period to the present. syriac, a dialect of aramaic that is still used liturgically in christian churches throughout the middle east, is a large part of their heritage and culture. “the basic mission of this website is to create a cyber-infrastructure, which is basically a set of online tools for doing research but also for linking research projects,” schwartz said.
“so we collaborate with a handful of projects in the states working with manuscripts and texts.” the prevalence of syriac culture sprouted during the period commonly known as the dark age, or late antiquity, which refers to the moment when the roman imperial power in the western half of the mediterranean fell apart and the development of barbaric kingdoms became prevalent. while previously overlooked, scholars have recently shown interest in this period because this is when cultures outside of the roman core began to produce their own languages and literature. “they are considered barbarians by the romans,” schwartz said. “they’re outside the civilized world. it’s a great moment when we get the periphery talking back to the core.” one of the key things being done through syriaca.org is the creation of unique resource identifiers, or uris, that allow researchers to link their information together into one common system. “my specific area of focus is working with people,” schwartz said. “there are many different names that people use to refer to me, and as a human you understand that, but a computer has no way to comprehend that unless we tell it that. these uri’s create the opportunity to link between a variety of things. we’re creating these unique identifiers for people, texts, and manuscripts.” with this next round of funding, schwartz’s team is working on a handbook of syriac texts, cataloging all the authors, sources and works, and putting all of them into one place. the second neh grant is for a project entitled “aggregation and peer review in music.” it is a $ , start-up fund with similar principles of preservation. led by timothy duguid, a postdoctoral research associate in the initiative for digital humanities, media and culture, the project aims to create muso, or music scholarship online. this will serve as a library of musical projects available on the internet. 
ranging anywhere from beethoven to lady gaga, duguid says this site is for the preservation of all music styles. “it’s sort of like a one-stop shop for finding things online,” duguid said. “it fills a need right now in the music community. music researchers are starting to produce resources online but don’t have a way to promote them. there’s no single place where i, as a scholar, can go to find it. this will make it so that even if you don’t know it exists, you will still be able to find it.” in addition to a significant publicity campaign for muso, the funding from the grant will be largely used to host a two-day workshop of scholars at texas a&m to discuss the project. “we will be discussing two areas of focus,” duguid said. “first is regarding aggregation: what data do we need to collect from projects in order to make this valuable to music scholars. the second is providing peer review for projects that we decide to ingest. we want to make sure that the resources we are gathering are high quality.” the funding for the project began in june, so the workshop is expected to be held within the next year.

appendix b: meeting schedule
thursday, january
: - : – welcome & introductions [msc ] (timothy duguid)
: - : – breakouts & discussion [msc ]: what do music scholars need from a digital curator and search mechanism?
: - : – break
: - : – presentation [msc ]: “a template for muso: the advanced research consortium and its rdf guidelines” (laura mandell & elizabeth grumbach)
: - : – lunch
: - : – breakout session [msc ]: what can current music projects tell us about essential metadata for music scholars?
: - : – break
: - : – discussion [msc ]: what rdf guidelines should be used by muso?
friday, january
: - : – presentation [msc ]: “digital peer review within arc” (laura mandell & liz grumbach)
: - : – break
: - : – breakout discussions [msc ]: what standards should be used to evaluate digital projects in music? what are some exemplars for digital projects in music?
: - : – lunch and public presentation [glasscock center]: “the early modern ideology: the economics and politics of moveable and virtual type” (carl stahmer)
: - : – discussion [laah ]: what should be the peer review process for digital projects in music?
: - : – break
: - : – discussion [laah ]: future plans and next steps

appendix c: meeting notes
1. breakout session – “what do music scholars need from a digital curator and search mechanism?”
participants were asked to list some music aggregators and then to identify the critical characteristics of a music scholarship aggregator. the group identified the following aggregators:
• archivegrid (www.oclc.org/research/themes/research-collections/archivegrid.html)
• digital resources for musicology (drm.ccarh.org)
• doremus (www.doremus.org)
• europeana sounds (www.europeanasounds.eu)
• isidore (www.rechercheisidore.fr)
• music treasures consortium (memory.loc.gov/diglib/ihas/html/treasures/treasures-home.html)
• opera and ballet primary sources (sites.lib.byu.edu/obps)
• portal to texas history (texashistory.unt.edu/)
• nines (www.nines.org)
the group agreed that good aggregators should:
• include short descriptions of the projects as a whole
• keep those descriptions uniform and metadata-based
• keep them flexible enough to allow for some variability based on individual project needs
• allow user submissions
• allow easy searching
• offer outreach and training for metadata standards
• acquire a constant funding source
2. breakout session – “what can current digital projects tell us about essential metadata for music scholars?”
participants were asked to list some digital projects in music and to take a look at their descriptive metadata. they were then asked to compare this with arc’s rdf.
the list of digital projects included:
• augmented notes (www.augmentednotes.com)
• beethoven’s werkstatt (beethovens-werkstatt.de)
• centre for the history and analysis of recorded music (www.charm.rhul.ac.uk/sound/sound.html)
• chopin first editions online (www.chopinonline.ac.uk/cfeo)
• documenting teresa carreño (documentingcarreno.org)
• english broadside ballad archive (ebba.english.ucsb.edu)
• enhancing music notation addressability (mith.umd.edu/research/enhancing-music-notation-addressability/)
• freischütz digital (www.freischuetz-digital.de)
• john cage unbound (exhibitions.nypl.org/johncage)
• linked jazz (linkedjazz.org)
• lost voices: the chansons of nicolas du chemin (digitalduchemin.org)
• marenzio online digital edition (www.marenzio.org)
• networked environment for musical analysis (cirss.lis.illinois.edu/project/project-details.php?id= )
• new york philharmonic digital archives (archives.nyphil.org)
• online chopin variorum edition (www.chopinonline.ac.uk/ocve)
• schenker documents online (www.schenkerdocumentsonline.org)
• songs of the victorians (www.songsofthevictorians.com)
• structural analysis of large amounts of music information [salami] (cirss.lis.illinois.edu/project/project-details.php?id= )
• virginia woolf online (www.woolfonline.com)
projects identified generally used the following metadata categories:
• creator
• title
• unique identifier (uri)
• scope and content statement
• repository
• form/genre
• notation types
• tools/capabilities
• typology
• technical specs for recordings, etc.
• authorities
it was suggested that the mei header could be a vehicle for metadata content. one suggested modification to the arc rdf was to change to something like . this would make the data more interoperable and compatible with linked data systems. it was determined that some of the arc rdf is not consistent and that the categorizations need to be brought to the same level.
that is, apples and oranges should not be possibilities in the same metadata field. for the field, the following should be added:
• dataset
• printed text
• realia
• notated music
• encoded content
we also recommend that “full text” should be modified to “searchable content” or something to that effect to allow for searching of encoded media.
3. breakout session – “what standards should be used to evaluate digital projects in music?”
objects that can be reviewed:
• encoded content
• software tools
• archives
• digital editions
things to consider in a review:
• motivation of the project (audience, perceived use, goals)
• documentation of the project
• integrity of practices, research questions
• clear and orderly site architecture
• visibility and accessibility (usability)
• sustainability (a plan must be in place, regardless of whether it is to last or become obsolete)
• description of the intellectual property and materials that the site offers
• accreditation of sources and contributors
• importance and relevance
• innovation and originality (either in presentation or content)
• interoperability
we determined that muso should have two levels of peer review:
1. aggregation review – this is a basic review by the editorial board to determine whether a project merits inclusion in the muso catalog
2. traditional review – this is an academic review of the content and presentation of the resource
we recommend that arc change its basic peer review questions to:
1. to whom is this content interesting?
2. how does the project make its materials manifest, exposed, and documented?
3. what is the sustainability plan for the project?
4. does the project achieve its own goals?
next steps
it was agreed that muso should join the arc community. a sub-node structure for muso could be envisaged that would parallel the current arc structure. however, muso should start as a single node that could then subdivide as things develop in the future.
the initial governance structure of muso would consist of an appointed advisory board. once muso is up and running, a more representative system will be put in place that will include representatives from relevant scholarly societies. an application will be submitted for an neh implementation grant to help implement muso. that grant will fund:
• software development
• metadata creation
• database curation
• publicity and pr for metadata creation, aggregation, and digital peer review
remaining questions:
- what is muso going to aggregate?
- how do you deal with umbrella projects vs. smaller projects (i.e. simssa vs. its components like diva.js)?
- should we aggregate software and how?
- how do we evaluate collaborative work?
- how should we modify ?
- how should we modify ?
- should we use collex? what are our other options?

appendix d: official recommendations to arc
1. arc should change its formatting for the role element to something like . are there any other metadata elements that should be treated similarly? (see http://bit.ly/collexwiki)
2. the element currently includes formats and genres, so it would be best to modify it as below while moving the deleted values over to the element (added values are in bold, deleted values are struck-through): analysis, bibliography, catalog, citation, collection, correspondence, criticism, drama, ephemera, edition, fiction, historiography, law, life writing, liturgy, musical work, musical analysis, musical recording, musical score, nonfiction, performance, pretext, poetry, religion, reference works, review, scripture, sermon, translation, travel writing, treatise
if this is not possible or acceptable to the arc community, we would recommend retaining the deleted values and adding the new ones.
3. we came up with a set of recommended new values for the element in january.
however, given that the current element includes a number of formats in addition to genres, i have modified our original suggestions so as to make a better distinction between type and genre. you will note that i added “ephemera” instead of “realia”. is that acceptable? the following are the suggested values for (additions are in bold): citation, codex, collection, dataset, drawing, encoded content, ephemera, illustration, interactive resource, manuscript, map, moving image, notated music, periodical, physical object, printed text, roll, sheet, sound, still image, typescript
4. regarding the element, most are happy with the broader term “music” to replace “musicology”, especially due to the confusion that could result from using the term musicology (what about music theory, composition, etc.). however, i agree with some that “art history” should be similarly broadened to “art” (allowing for art criticism, research, history, etc.). the recommended values for this element would therefore be (modified values are in bold): anthropology, archaeology, architecture, art, book history, classics and ancient history, ethnic studies, film studies, gender studies, geography, history, law, literature, manuscript studies, math, music, philosophy, religious studies, science, theater studies
5. arc’s current model using , , and is sufficient for now, but muso will require more complex descriptions of relationships and will be investigating a more frbr-based model.
6. arc should include a review date for peer-reviewed content by creating a new non-mandatory element called

appendix e: paper given at the american musicological society southwest spring meeting
ams southwest meeting
april ,
music scholarship online: problems for digital musicology and a potential solution
by timothy duguid
when i moved from scotland to texas in , i traded in castles and kilts for american football stadiums and cowboy boots.
a historical musicologist whose idea of digital humanities consisted of html websites, pdfs, digital music recordings, and excel spreadsheets, i also found myself in an english department (of all places) trying to navigate a strange new world of metadata schemas and data visualization along with a collection of acronyms and abbreviations that reminded me of alphabits soup. my first year was spent configuring a new visualization laboratory dedicated to humanities research, while also trying to catch up with these new concepts. throughout my time there, however, my work has revealed several ways in which musicological research is lagging behind other disciplines such as history and literature. as laurent pugin recently observed, most digital work in the field of musicology to date has been focused on issues of access and scale. as we are all aware, more and more resources are being made available online as collections are being digitized. some examples include the digital archive of the beethoven-haus, the digital image archive of medieval music, and the juilliard manuscript collection. in fact, the ams conference in milwaukee included an entire panel on “digital musicology”, and it focused almost exclusively on digitized collections and archives. however, digital musicology and indeed the digital humanities as a whole reach well beyond simply taking a picture of a resource and cataloging it for a web-based interface (valuable as those efforts may be). they also relate to the ability to use, analyze, and manipulate the information contained in that picture. computers do not natively know how to read what is on an image, so groups such as the text creation partnership have set out to transcribe text-based resources to make them fully computer searchable. similar efforts in music research would include the kernscores repository and the josquin research project out of stanford university.
along with the elvis project out of mcgill university, these projects are focused on creating large collections of computer-readable music for the purposes of analysis. since hand transcription is so time intensive, however, corporations such as google are investing significant capital in optical character recognition technologies. these effectively program computers to be able to “read” what is on images. the situation is similar for music notation, which is even more complicated for computers to “read”. nevertheless, researchers at mcgill are attempting to produce a reliable optical music recognition tool that will allow them to quickly transcribe music that can then be searched and analyzed. despite these advances, the musicological community has shied away from incorporating these and other recent developments into research and dissemination workflows. beyond making more music accessible and analyzable as computer-readable data, adaptive user interfaces for displaying and playing music hold amazing promise for researchers. it is now possible to build and develop diverse, open digital resources to help people better understand and work with music-related data. there are a number of reasons for the community’s reticence towards digital methodologies. frans wiering recently published the results of a survey of the ismir community that revealed that the lack of usable data was the most significant barrier to scholarship in digital musicology. second to that were issues of usability and training. since this survey was conducted among so-called “techies”, it is no wonder that they were most concerned with data availability. if i were to poll this room or music departments across the american southwest, however, i would anticipate that the most significant hurdle would be unfamiliarity or discomfort with digital resources and computer programming.
indeed, there is presently a steep learning curve for conducting digital research in musicology, and when we add to that the many demands that are made upon the schedules of early career faculty and researchers (the folks who are most likely to want to conduct that research), it is no wonder that digital musicology and its research methodologies remain relatively unexplored and underutilized. beyond issues of time constraints, the most significant barriers to born-digital projects in musicology center around the concepts of authority and discoverability. information posted on the internet, particularly outside of traditional reputable journals and publishers, carries the stigma of being academically suspect. given the amount of time and labor required in generating digital resources, few are willing and able to invest a significant amount of time into something that will not advance their career. this then carries over into the music classroom, leaving students to assume that the latest technology in musicology research is limited to pdfs, powerpoints, and streaming audio. discoverability also remains a significant issue for all digital projects. indeed, the challenge for anyone posting information on the internet – regardless of whether that information is open or proprietary, music-related or otherwise – is ensuring that the people who need it most are aware of its existence. whether we like it or not, google stands at the forefront of web crawling practices to help with the discoverability problem. even so, the bias of google’s search results, placing the most well-connected websites and the most popular websites at the top of its results pages, is well documented. the question for researchers on limited or nonexistent budgets is therefore how to ensure that their content can be discovered and disseminated.
while most researchers in the sciences and humanities therefore still turn to fixed formats for reporting their findings and for sharing their data, some are generating born-digital resources for dissemination. for instance, jerome mcgann’s rossetti archive combines analysis of art, design, and literature into a single digital resource that includes digital editions of rossetti’s writings and scholarly analysis of all of the site’s content. it is with these types of projects in mind that mcgann began the networked infrastructure for nineteenth-century electronic scholarship (nines). he argued that there would be a brain-drain from digital studies if pre-tenure researchers could not get proper academic credit and wide recognition for their digital work. this virtual research environment (vre) aggregates digital projects alongside content from archives and scholarly journals, providing a one-stop-shop for nineteenth-century studies. since the development of nines, other communities have come online using it as a model: the medieval electronic scholarly alliance (mesa), the renaissance knowledge network (rekn), th connect, and modernist networks (modnets). all of these resources form nodes in the advanced research consortium (arc), providing coverage of humanistic research in each of the major historical epochs of western culture. arc has also begun to add subject-specific mini-nodes that are based on libraries’ special collections. these include studies in radicalism online (siro) out of michigan state university and the great lakes aggregator out of the university of michigan. arc stands as an answer to those who are concerned with both discoverability and career advancement in the digital humanities. scholarly projects request peer review by the illustrious editorial boards that serve each period-specific scholarly community.
for projects that pass this peer review, arc then collects descriptive metadata about them so that they can be aggregated with other high quality resources in a faceted search interface. this is admittedly similar to the cataloging work already being done by libraries, with a significant difference. arc is a grassroots organization built by scholars for scholars. it relies on its contributors (the experts) to describe their own projects. arc does not store anything other than descriptive metadata. therefore, its user interfaces respond to search queries by presenting the relevant returns from each of its period-specific nodes and then they send users out to the actual resources themselves. this gives each project an amazing amount of flexibility to determine how much information it wants arc to index, and this allows each project to determine the best methods for presenting its data and/or analyses. arc has been eminently successful in reviewing and aggregating digital projects in the fields of history and philology, as its database now lists nearly , peer-reviewed digital objects among over . million other cultural artifacts that can all be freely and openly searched through any one of arc’s participating nodes. this is great for researchers in literature and history, but what about music scholars? indeed, a number of high-quality digital projects in music have begun to surface, and beethoven’s werkstatt is just one of those projects. this project is creating born-digital genetic editions of beethoven’s music. how can scholars generating born-digital music scholarship such as this ensure that their hard work will be discoverable? furthermore, how can users ensure that the information presented there is reliable? enter: music scholarship online (muso).
this neh-funded project seeks to establish a virtual research environment dedicated to music studies that would join the arc community and would therefore benefit from the interdisciplinary resources already contained therein. in its initial phase, muso is working with music scholars, librarians, and coders to modify arc’s current metadata requirements, which are currently tailored to literary and historical scholarship. we hosted a workshop at texas a&m at the end of january to begin the revision process, and we came to some interesting conclusions. first, the descriptive metadata required by muso must be lean and simple. we are working with digital projects that have limited budgets. if these projects have given any thought to using metadata to describe their site and content, they probably cannot employ a metadata librarian to generate that metadata. moreover, we are not interested in generating preservation metadata. we are rather interested in gathering metadata necessary for discovery. so, while we rely on the cataloging expertise of librarians, we must continually remind them and ourselves that we only need the information that will allow scholars to find the digital resources. second, we realized that music projects need a more robust system for describing the relationships between objects than what exists for literary and historical scholarship. perhaps more than any other discipline, music relies on hierarchies and sequences of smaller units to generate perspective and meaning. in the most basic sense, just as arc presently allows literary scholars to identify that the “return of the king” is part of tolkien’s lord of the rings, so also music scholars can identify the fourth movement of brahms’s symphony no. . more than that, however, musicians need to be able to express and distinguish more complex relationships including (but not limited to) excerpts, arrangements, and medleys that are far less prevalent in literary objects.
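The kinds of relationships named above (part-of, excerpt, arrangement, medley) can be illustrated with a small data-structure sketch. This is only a hypothetical model for discussion: the class names, relation labels, and identifiers below are assumptions made for the example and are not taken from ARC's or MuSO's actual metadata schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relation(Enum):
    # Hypothetical relation labels; not an official ARC or MuSO vocabulary.
    PART_OF = "isPartOf"
    EXCERPT_OF = "isExcerptOf"
    ARRANGEMENT_OF = "isArrangementOf"
    MEDLEY_OF = "isMedleyOf"


@dataclass
class MusicObject:
    identifier: str
    title: str
    # Each entry is (relation, identifier of the object it points to).
    relations: list = field(default_factory=list)

    def relate(self, relation: Relation, target_id: str) -> None:
        self.relations.append((relation, target_id))


def related_to(objects, target_id, relation=None):
    """Return identifiers of objects that point at target_id,
    optionally restricted to one kind of relation."""
    return [
        obj.identifier
        for obj in objects
        for rel, tgt in obj.relations
        if tgt == target_id and (relation is None or rel == relation)
    ]


# Illustrative records with made-up identifiers.
symphony = MusicObject("work:symphony", "a symphony")
finale = MusicObject("work:symphony/mvt4", "fourth movement")
finale.relate(Relation.PART_OF, "work:symphony")
piano = MusicObject("work:symphony-piano", "piano arrangement")
piano.relate(Relation.ARRANGEMENT_OF, "work:symphony")

catalog = [symphony, finale, piano]
print(related_to(catalog, "work:symphony"))
print(related_to(catalog, "work:symphony", Relation.ARRANGEMENT_OF))
```

Storing relations as a list lets one object point at several sources, which is what a medley would need; a flat part-of field of the kind that suffices for a trilogy could not express that.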
finally, we need a more standardized vocabulary for describing objects. arc relies on authorities such as dublincore and the library of congress for its vocabulary, but even so literary scholars have their own ideas of what constitutes discipline, genre, and even the element from dublincore. these do not necessarily align with how musicians understand these elements, nor do the existing vocabularies meet the needs of music projects. once again, we are reliant on the community of music scholars and librarians in helping us to develop an ontology that provides meaningful descriptions to music scholars and that plays well across disciplines. having dealt with the discoverability issue, i return to the issue of authority. muso’s position as an aggregator of digital music scholarship can only be cemented among academic circles if it can develop a system of peer review that can ensure the quality of its contributions. following the examples of arc’s other communities, muso will gather an editorial board consisting of well-respected music scholars. this group will oversee two levels of peer review. in the first level, the board itself will evaluate projects and archives for inclusion in muso. at this level, the board will evaluate projects based on the following four questions:
1. to whom is this content interesting?
2. how are this project’s materials manifested, exposed, and documented?
3. what is the sustainability plan for the project?
4. does the project achieve its stated goals?
should a project desire a more rigorous academic review, it can apply to the editorial board for the level two review. in these instances, the editorial board would turn to two groups of experts in relevant fields to examine the resource’s content and presentation of data. with these set, muso will be poised to be a leading resource for music researchers to conduct high-quality scholarship, both digital and analog.
it will allow scholars to discover content from archives, journals, and digital projects; furthermore, it will promote new digital scholarship. beyond musical studies, however, scholars will be presented with multidisciplinary relationships and therefore avenues for new and innovative enquiries.

appendix f: poster given at the music encoding conference

aggregation and peer review for music data

[poster diagram; recoverable labels: aggregation, solr index, arc rdf, arc catalog (ar-c.catalog.org), peer review for digital projects, archives, submit for peer review, p&t letter, metadata ingestion, muso metadata, dublin core, custom muso elements, consultation & standards meetings]

so, you have a digital project....

join the muso community! muso.tamu.edu

music scholarship online (muso) is a proposed finding aid and peer review platform for digital scholarship in music. it will gather a community of music scholars dedicated to high-quality digital scholarship that will work together to promote the work of their colleagues by conducting outreach to the music community and by building a research environment for students and researchers to harness the power of technology to conduct and disseminate new and innovative research in music. muso has joined the advanced research consortium, which is a hub of humanities research nodes containing scholarly resources spanning the history of western culture from the medieval to the modern periods. this strategic partnership will promote high-quality multidisciplinary digital research.

how can you ensure that people can discover the content of your project? how can you make your project interoperable with other projects and digital resources? how can you receive professional credit for your project?

appendix g: letter of commitment to host the muso website

terras, m. ( ). "another suitcase, another student hall, where are we going to? what ach/allc can tell us about the current direction of humanities computing" literary and linguistic computing, volume , issue .
another suitcase, another student hall – where are we going to? what ach/allc can tell us about the current direction of humanities computing

the thirteenth joint international conference of the association for computers and the humanities and the association for literary and linguistic computing, held at new york university in greenwich village, new york city, between june th – june th , showcased the wide spectrum of research being undertaken within the field of “digital media and humanities research” at present. this was the largest ever ach/allc (the oldest established meeting of scholars working at the intersection of advanced information technologies and the humanities), providing a meeting point for the expanding humanities computing community, whilst giving an indication of the type and quality of work being pursued in the arena. the conference was well attended, with over participants registered, and a further or so witnessing the webcasts of the opening and closing sessions (a first for the ach/allc). whilst over half of these participants came from academic institutions within the usa, the international academic community was also well represented. (there were sizeable numbers from the united kingdom, canada, norway, and germany. there were also participants from institutions in italy, ireland, austria, spain, czech republic, romania, poland, holland, belgium, sweden, finland, mexico, puerto rico, japan, hong kong, south africa, and new zealand.) there was an increased presence of attendees from projects outwith academic institutions, and industry. the community of ach/allc, whilst being very westernised, is international, predominantly academic, but with necessary and welcome links to the practical application of its techniques in the heritage, teaching, and commercial sectors. this segue between the research based and the practical, industry and academia, demonstrates the rich paradox of the meeting of the computing and humanities fields. johanna drucker, the robertson professor in media studies at the university of virginia, in her opening plenary “reality check: problems and prospects in digital humanities” explored the meeting of these two different academic disciplines thoroughly. it is hard to know enough to work in the field of humanities computing – after all, it is a discipline which embraces the technical subjects but depends on a long background of humanities research. from the perspective of the humanist this is not a happy synthesis: there are un-reconcilable differences between the humanities and computing, but it is drucker’s belief that keeping those differences alive is the most productive way forward for research in the domain. (a theme revisited in alan liu’s closing keynote “the tribe of cool: information culture and history”, where he stressed that humanities computing has always been about collaboration, and that the cusp between the disciplines offers the most rewarding prospects for research.) drucker pointed out that there is a malaise surrounding the condition of the humanities, and traditional humanities research, in contemporary culture – but the digerati are happy. in a society where the dominant destructive force of “disneyfication” depends on a sense of historical amnesia, the humanist’s job is to create historical memory. although it would be romanticising to suggest that technology could ever be our only saviour, those working in digital media and the humanities are optimistic and excited about their research and their place in the more traditional disciplines; we are now in the situation where research is not pushing but being pushed by humanities computing.
there may indeed be an inherent contradiction in the coupling of the humanities and computing as an academic subject, but drucker stressed that instead of trying to wholly embrace engineering or computing science working practises, or worrying about whose territory we are poaching on, we should embrace the sense of not really knowing where humanities computing is going, and celebrate the “mutually destabilising otherness” of the field, adding that this “fraying and fragmentation is essential to critical thought”. but if we don’t know where the field is going, can this conference give some indication of where it is at present, or at least where it has been recently? of course, the papers presented are not the only worthwhile work that is being done in the area of computing and the humanities at the moment, but they do reflect a wide international academic audience and authorship. the conference comprised individual papers and sessions, split into parallel sessions, with harold short commenting that “the key intellectual problem at this conference will be in choosing which of the three sessions to attend”. a survey of these sessions, papers, and posters should present a snapshot of the kind of work being undertaken. this conference indicates that work is being done across the whole spectrum of humanities research, with dominant themes emerging in literature and linguistics, the development of digital resources, the evaluation of these resources, and also in the creation of policy, strategy and standards to maintain, manage and preserve such research. the question of how to utilise such resources in teaching was repeatedly raised, as was how to deliver humanities computing as a taught course itself.
furthermore, a large proportion of papers at the conference dealt with the development of software and systems to aid in humanities teaching and research, indicating that humanities computing scholars are increasingly able and willing to embrace the technically complex, whilst forging interdisciplinary partnerships and associations which further research in the field.

literary and linguistic computing

there were a number of papers addressing the more “traditional” aspects of literary and linguistic computing, such as the analysis of corpora to indicate language change and community development, the stylistic and structural analysis of individual literary texts, and the interrogation of texts to answer questions of authorship attribution. these more traditional applications in the field showed the development of new techniques and software, and were applied to both earlier and modern sources.

linguistics

corpus studies were represented in a number of papers. juola’s “the time course of language change” describes an experiment in using techniques to determine rate of language change in a source that spans decades, to look at how small corpora can be analysed to give evidence of change in the type and structure of language used across a time frame. horobin, in “the evolution of standard written english: a corpus approach” described and demonstrated a corpus of middle english to assess the development and influence of standard written english, suggesting important factors in the design and use of historical corpora to allow the data to be interrogated in a number of intra- and extra-linguistic ways.
a more specific study on the analysis of one feature of language change was illustrated in blatna and koprivova’s poster, “czech approaching english-verbal forms with personal pronouns across styles in the czech national corpus” which looked at the increasing frequency of personal pronouns, showing how czech is becoming closer to english on the lexical but also the grammatical level. the analysis of style was demonstrated in bolshakova’s poster, “phraseological database extended by educational material for learning scientific style”, which presented an analysis of functional style in scientific and technical prose, examining the phraseology of scientific texts. a modern body of text was analysed in giordano’s “the genre of electronic communication: a virtual barbecue revisited” which argued that empirical linguistic analysis should be an alternative and fruitful way to understand the emergence and structure of a digital community, examining the linguistic structure of chat rooms and discussion groups.

literary studies

techniques used to try and ascertain the order an author intended sections of a text to be published in were discussed in spencer et al, “reconstructing the stemma of a textual tradition from the order of sections in manuscripts”, and also bordalejo et al, “the order of the canterbury tales: praxis of computer analysis”. the first appropriated analogous techniques from evolutionary biology for the reconstruction of family trees, describing two techniques which calculate the “distance” between a pair of manuscripts through the number of insertions, deletions and transpositions, and showing how it is possible to reconstruct a stemma from the matrix of pairwise distances among manuscripts. the second presented the results and implications of the use of such computational techniques to produce stemmata based on the tale order of the canterbury tales, in order to show a relationship between the textual tradition and the order of the tales.
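The idea of a "distance" between two manuscripts based on section order can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the method of either paper: it uses plain Levenshtein distance (insertions, deletions, substitutions), so a transposition of two adjacent sections costs 2 rather than being counted as a single move, and the witness names and section labels are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of section labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def distance_matrix(witnesses):
    """Pairwise distances between all witnesses, keyed by name pair."""
    names = sorted(witnesses)
    return {(p, q): edit_distance(witnesses[p], witnesses[q])
            for p in names for q in names}


# Hypothetical witnesses: each lists the order in which sections appear.
witnesses = {
    "ms1": ["A", "B", "C", "D"],
    "ms2": ["A", "B", "D", "C"],
    "ms3": ["A", "B", "D"],
}
matrix = distance_matrix(witnesses)
print(matrix[("ms1", "ms2")], matrix[("ms1", "ms3")], matrix[("ms2", "ms3")])
```

A tree-building method would then take this matrix as input to propose a stemma; that step is left out here because the source does not say which algorithm was used.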
the analysis of more contemporary literature was presented in gardener, “versions of interactivity: meta-interpretive response in hypertext fiction” which described techniques to identify and interpret the way in which feedback and non-trivial decision making influence or determine a reader’s choices when confronted by hypertext as opposed to traditional texts. these recordings, and subsequent analysis of interactions, can be used to show how a text is being interpreted by a reader, whilst paying attention to the distinctive aspects of hypertext fiction. elements of style, rather than narrative structure, were considered in robey, “rhythm and meter in italian renaissance narrative verse”, which presented a systematic representation of dante’s divine comedy by creating an electronic text marked up in terms of accents and syllable divisions, indicating some quite substantial divergences in accentual structure between this and other renaissance texts. pawlowski et al, in “time series modelling in the analysis of greek metrics” detailed the controversy regarding the rhythmical organisation of greek texts, and how this can be studied to gain understanding of the underlying rhythms in greek oral literature. problems of authorship attribution were addressed in a number of papers. hoover, in “vocabulary richness and authorship reconsidered”, demonstrated the usefulness of vocabulary richness for attributing authorship in the stylistic analysis of literary texts. rudman’s “the dna authorship attribution model” described a method of representation of an author’s style based on the detection of their “intellectual dna” – that is, identifying the number and category of features which identify their work, using statistical analysis.
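To make "vocabulary richness" concrete, the sketch below computes two simple measures that are commonly used under that heading: the type-token ratio and the count of words occurring exactly once (hapax legomena). The source does not say which measures the paper examined, so the choice of measures, the function names, and the crude tokeniser are all assumptions for illustration only.

```python
import re
from collections import Counter


def tokens(text):
    """Very rough tokeniser: lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Distinct word forms divided by total words (0.0 for empty text)."""
    toks = tokens(text)
    return len(set(toks)) / len(toks) if toks else 0.0


def hapax_count(text):
    """Number of word forms that occur exactly once."""
    counts = Counter(tokens(text))
    return sum(1 for c in counts.values() if c == 1)


sample = "the cat and the dog"
print(type_token_ratio(sample))  # 4 distinct forms over 5 tokens
print(hapax_count(sample))       # cat, and, dog
```

The type-token ratio falls as a text gets longer, so comparisons between authors are only meaningful on samples of equal length.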
a project specific analysis was covered in holmes, “a widow and her soldier: the case of the picket letters”, which detailed the problems in determining the authorship of a volume of letters, presenting a discussion of the sampling techniques, textual preparation, and the stylometric analysis used to raise questions of subterfuge and questionable authorship. hoover’s poster, “experiments in multi-variate analysis and authorship attribution”, reported on a project that is re-examining and comparing the statistical techniques used in stylistics. burrows, in the busa presentation plenary “questions of authorship; attribution and beyond” discussed the development and application of statistical and computing tools for the analysis of literary texts, detailing problems in computational stylistics and authorship attribution. it is obvious from these papers that the role of computational techniques in answering problems in literary and linguistic computing is as pertinent as ever, and such analyses are providing statistical representations of texts that would be impossible without the use of such tools. the use and development of such techniques continues to provide a different viewpoint, and a new way of interrogating and understanding texts for linguistic or literary purposes.

digital resources

textual encoding and editing

a large proportion of papers and posters presented at the conference related to the creation, editing, and publishing of digital materials.
although there were a few projects regarding the development of multimedia resources, and problems in evaluation of digital resources, most of these papers focussed particularly on the implementation of tei standards for encoding and display of textual matter (indicating the bias towards the textual in humanities computing as a discipline). many of the projects had developed comprehensive, flexible sets of specifications regarding digitisation, markup, presentation, and delivery when the tei guidelines were not sufficient in these areas, and discussed the problems in developing encoding guidelines when there were often considerable challenges to design and implementation. this was often coupled with consideration of the extent to which such markup processes could be automated, and the development of suitable tools to view, manipulate and process the encoded data. a large spectrum of types of source material was covered, for example the process of producing digital books (gibson and ruotolo, “beyond the web: tei and the ebook revolution” and bia, “technical aspects of the production process of digital books using xml-tei at the miguel de cervantes digital library”), and the digitisation, markup, presentation, and delivery of letters (vanhoutte, “dancing with dalf: towards a digital archive of letters written by flemish authors and composers in the th and th century” and eide, “putting the dialog back together: re-creating structure in letter publishing”). problems with the encoding and display of poetry were covered, showing both the difficulties encountered when dealing with variations
between manuscripts (price et al, “what’s interesting about whitman’s poetry manuscripts?”) and the development of a standard for encoding verse within the framework of the tei (ore et al, “tei for better or verse” and schreibman and chua, “revisiting revisions: employing xml and xsl to display deeply encoded, multi-versioned text”). using the tei to produce academic electronic journals was covered in two papers (unsworth, “publishing originally digital scholarship at the university of virginia”, and hannon et al, “building belphegor: a multilingual electronic journal using tei”) and the application of the tei to web pages, a less obvious than usual domain, was considered in rahtz, “using the tei to author web sites”. the encoding of text and the development of suitable tools for use specifically in corpus studies was discussed in biber et al, “aac- digital resources in textual studies”, and attention was also drawn towards the need to nurture the involvement of humanities scholars in the process of corpus building and annotation in o’donnell et al, “opentext.org: an experiment in internet-based collaborative humanities scholarship”. using encoding to aid with the display and interrogation of linguistic systems was considered in tu, “the adaption and breakthrough of chinese documents encoding v: a case study of cbeta digital triptaka and tei”, and also bia and quero “building spell check facilities for ancient spanish”. problems in the markup of different typographical conventions were considered in russom and bauman, “typographic regularisation in the wwp textbase” and also the poster presented by anderson “markup vs. character encoding: the quandary of handling the epigraphical/papyrological “underdot” in computer representation”.
on a more general note, problems with the encoding and integration of different types and formats of historical texts were considered in fuchs et al, “digitizing the difference: the challenge of heterogeneity in the sources of early modern science”, whilst a poster by anderson and crawford, “form or function: considerations in presenting historical documents on the web”, surveyed the difficulties in representing and encoding marginalia in historical documents. robertson’s “the historical event mark-up and linking project” looked at the need for a common historical markup language through which historical documents can be published.

markup was considered on a more conceptual level in brown et al, “intertextual encoding in the writing of women’s literary history” which presented the attempt to devise an encoding scheme for a broad-based, integrating literary history to emphasise intertextual relationships between texts, and this use of tei in the encoding of an analytical framework was also considered in wittern, “tei and topic maps” which considered the possibility of encoding not only the concrete features of a text but also aspects of the world as presented through the encoded texts. mccarty, in “the diy commentary; or, what the reference and the link told each other” discussed what the most fully developed traditional examples of the commentary genre teach us about designing and implementing better scholarly information systems, and how it may be possible to imagine an interface to multiple commentaries that would better represent the plural text.
the act of text encoding itself was considered in caton, “towards a politics of text encoding” which undertook a critical evaluation of the process, exploring the notion that it is of a neutral, a-political nature. these papers show a broad range of focus, from the project specific, to the more outward looking, from those that have taken a very practical stance towards problems of encoding and delivery, to those that are more concerned about the theoretical implications of their actions. however, they indicate the breadth to which the tei is being implemented throughout the domain as a whole, and the critical processes that have to be developed to solve remaining problems with markup, presentation, and delivery of digital resources. as a representation of the type of textual encoding, editing, and publishing that is happening in the field of humanities computing, these papers indicate the creativity necessary, and indeed present, when working within the guidelines of the tei, and the need to embrace these guidelines to further the quality of humanities based digital resources.

multimedia

however, although these text based resources represent the majority of papers presented regarding the creation of digital resources, this conference presented an increased interest in the visual, and in the creation, dissemination, and evaluation of multimedia resources. fraistat and jones, in “immersive textuality, the editing of virtual spaces” showed how exploiting the theatrical possibilities of digital environments allowed users to explore a literary text through its virtual representation in digital media, expanding the work into a game, or a space in which to travel through to explore its meaning.
stoicheff and deshaye, in “the visual display of literary complexity in a hypertext critical edition of william faulkner’s the sound and the fury” examined the visual display of textual information in the digital environment, indicating how representing texts visually can give new insights into their structure, and provide resources to aid in teaching. interest in the physicality of digital texts and presentation was shown with robertson’s quirky, “an e|mediated rhetoric of visuality” exploring the role of typography as an information source within electronic resources, and the role of visual design as a cultural and literary shaping force was demonstrated in walton’s poster, “cultures and literacies: south african students and western visual design on the world wide web”. the growing interest in the use of virtual reality within humanities computing was represented by beacham and denard’s “the pompey project: digital research and virtual reconstruction of rome’s first theatre”, which detailed the practical and theoretical problems in putting together a virtual reality model of the theatre, and an exploration of the problems of integrating historical and archaeological data into a high-end vr space. the presence of these papers at the conference demonstrates a willingness to include new technologies, and their applications, in the evolving humanities computing pantheon, and as such to encourage further research in these non textual areas, which are as pertinent to the use of computing in humanities disciplines as the need for textual resources.

resource evaluation

there also seems to be a growing interest in the need for evaluation of these resources, in order to be able to evaluate digital resources in the areas of research, teaching, and in the provision of tools to search and retrieve information, such as in digital library projects.
smith et al, in their session “towards a generative evaluation toolbox: a roundtable”, probed the present state of evaluation tools in the areas of digital research projects, digital pedagogical projects, and digital library projects, questioning how project quality and project success should be judged. mactavish, in “more than words: astonishment and special effect in multimedia”, called for a better way to evaluate how visual and aural elements function within interactive environments, as humanities computing expands to deal with more non-linguistic elements. individual projects were also concerned with evaluation issues, such as roz et al, “the decameron web. how does encoding help pedagogy?”, which detailed users' evaluation of the usefulness of the resource whilst indicating how the encoding of a primary text and the related retrieval system contribute to the teaching and learning experience in the digital environment; many of the textual encoding and editing papers also showed such concern with evaluation issues. the need for evaluation of resources in image retrieval and management was covered in chen, “image retrieval knowledge and art history curriculum in the digital age”, which questioned the effectiveness of the current tools available to search for images from digital archives, and in kraus, “mimetic metadata: linguistic representations of visual objects in image-based electronic projects”, which criticised the subjective nature of the image descriptions that are the basis for image retrieval systems, and asked how this affects the usability of the resources.
these papers can be taken to indicate the state of “digital media and humanities research” in the area of digital resources, in that they show the bias towards the textual, although there is a growing interest in other media. they also show that there is a great deal of innovative and creative work being done, both in textual encoding and editing, and in the development of other multimedia resources. those involved are increasingly considering the development process, and how it can be made easier by the development of software and tools for creation and presentation. ultimately, after exploring thoroughly how to create such resources, the field is learning how to evaluate and utilise such research, to enable the design and implementation of resources which meet the needs of researchers, students, and institutions.

policy, strategy, and standards

a growing number of papers at this conference discussed the need for the design and implementation of a framework for creating and managing digital resources, detailing the research needed in information storage, standards for preservation and access, and integrated and sophisticated search mechanisms to put in place dynamic architectures for digital scholarship. there was also concern for long-term preservation of such resources, with discussion of reprocessing and archival standards. such papers also covered the strategic issues involved in implementing an integrated digital environment for the humanities, detailing the collaboration necessary between various institutions (universities, libraries, archives, museums, and media centres) and individuals to ensure the success and applicability of such endeavours, whilst examining the role of the humanist in the creation of new computer technologies.
henry et al, in their session “international strategic and policy issues in networking digital resources in the humanities”, discussed the strategic issues regarding implementing an integrated digital environment for the humanities, and how the extensibility of models, strategies and practices between projects and institutions across the sector could aid in the integration and management of resources across a broad community. staples et al, in their session “progress of the supporting digital scholarship project”, focussed on the creation, analysis, and reprocessing of digital resources, discussing the progress made in collecting and preserving existing digital resources of scholarly research for long-term use. rehberger et al’s session “digitizing the human: humanizing the digital” addressed problems of organisation and implementation of large digital corpora in new media, querying how search engine interfaces influence users’ access to the warehouses of information. on a more site-specific note, ore and eide, in their poster “the norwegian museum project”, discussed the steps taken to develop a common database system for the management of collections for the norwegian university museums. mciver et al, in “a new framework for web-based contributory encyclopaedias”, detailed the motivations, design, and implementation of a framework for creating and managing web-based contributory encyclopaedias, exploring the continuing need for some centralising mechanism for knowledge organisation such as that in traditional encyclopaedias. renear et al, in “the w3c consortium and standards”, provided an introduction to the development and implementation of standards by the w3c, particularly those regarding the tei and xml.
these papers indicate a concern over the management of digital resources. now that the humanities scholar is becoming more adept at producing such resources, it is important that they are managed, maintained, and preserved, and the implementation of frameworks to do so is logically the next step in the dissemination and preservation of such scholarship, although the development and adoption of such frameworks will demand large investments of time, effort, and finances to provide adequate infrastructures for the humanities.

teaching

this conference also indicated a growing concern regarding teaching humanities computing as an academic subject: partly to fill the immediate cultural need for trained professionals who understand both the humanities and information technology, partly to transform the enthusiasm for the subject into undergraduate programmes that enable a wider audience to develop the skills necessary to understand and contribute to the growing arena, and partly to incorporate digital resources into more traditional academic subjects. hockey et al, in the session “ma programmes for humanities and digital media”, discussed the increasing demand for humanities computing as a taught course, comparing and contrasting two ma programmes: one established course and one that has just been developed. the comparison covered both what is expected of the institution, regarding investment, teaching, assessment, and technical provision, and what is expected of the student, regarding course work and theoretical and practical studies.
the presentation showed an emphasis on evaluation and assessment of the programmes themselves, a focus that was mirrored in unsworth’s session “a masters degree in digital humanities at the university of virginia”, which detailed the coursework involved, topics covered, and structure of this particular course, whilst addressing how successful such a course is in providing a scholar with both the computing and humanities-based skills needed to have a true understanding of the digital humanities field. on a more theoretical level, liu’s closing keynote, “the tribe of cool: information culture and history”, questioned the distance between “cool” and the educational system, and how to encourage students to learn about the humanities when such dominant knowledge structures are not fashionable in today’s society, whilst considering the importance of teaching humanities computing as a singular subject.

there were a few papers that considered the role of the humanities computing department or unit in traditional academic institutions, and how they could aid in the integration of technology and teaching in more traditional humanities subjects. burnard et al’s session “symbiosis or serfdom?… are you/we/they being served?” considered the relationship between the users and providers of humanities computing services, and suggested that the key to understanding humanities computing is a focus on the needs and requirements of clientele. the session also stressed the need to stimulate demand by demonstrating the benefits of such technology to a sometimes sceptical audience.
condron et al’s poster “sharing expertise in the use of information and communication technology to enhance teaching and learning in the humanities” presented a collaborative project between institutions to explore how new technologies can support staff and students in making better use of small-group teaching. stroupe’s “writing against the curriculum: technology, writing and reconciling disciplinary with social consciousness” indicated how the tools and techniques of distance (off-campus) education can provide an infrastructure for communication across faculties and departments in the humanities and sciences.

consideration of the use of digital resources for teaching in particular subject areas was also addressed. duguld’s “digital pedagogy in film and media studies” discussed the use of digital moving images in the classroom, stressing how using such images makes it possible to develop rich multimedia resources, and eases their dissemination through digital delivery mechanisms. hamilton’s poster “remote interactive animated projection” also addressed some of these issues whilst focussing on the development of a video streaming facility to aid in humanities research and teaching in an arts faculty, utilising animation because of its flexible nature.

the development of software and tools

a great number of papers at this conference dealt with the development of software and computational tools to aid in humanities research, and how such software can be utilised and tested. again, there was a bias towards the textual, with many papers detailing systems to aid in the markup of texts and their presentation.
other systems were developed for the organisation and analysis of texts, particularly hypertexts, and, again, there were some papers regarding the development of software and systems for non-textual elements of humanities research, including the visual, and music.

textual tools

many of the papers presented tools to aid in the markup of texts. akhtar et al, in “automating xml mark-up”, demonstrated a novel two-stage automatic xml markup system, which automatically extracts and applies markup rules to documents by using self-organisation and adaptive automatic markup, learning from its own errors to increase accuracy. bia and carrasco, in “automatic dtd simplification by examples”, described a method for the automatic generation of a simplified dtd from a source dtd and a set of sample marked-up files, in order to create a minimum dtd. sperberg-mcqueen and huitfeldt, in “practical extraction of meaning from markup using xslt”, documented the development of software to address the problem of providing a clear, explicit account of the meaning and interpretation of markup, by providing a notation for expressing the meaning of constructs in a markup language and using that notation to define elements and attributes. burnard and fix’s “introducing phelix: an open xml database system” demonstrated the development of a general-purpose tei-compatible xml database system, configured to support a complex xml dtd and tested on a collaboratively designed database of several hundred manuscripts. porter et al, in “building flexible language-learning systems: perl and html vs. xml and xsl”, discussed two new systems to aid and support language learning, comparing the different technologies and different target
audiences with the common requirements, and the need for built-in, user-friendly systems administration tools.

a large proportion of the papers regarding software and systems development discussed techniques for the organisation, analysis, and querying of electronic texts. smith, in “linking and gathering: automatic hypertext in the perseus digital library”, described a system which aims to augment electronic documents by automatically generating hypertexts from the contents of a digital library, to aid in the creation of composite documents for richer contextualisation. wong and webster’s “linguistic description and exploration using rdf” proposed an approach to the search and retrieval of linguistic data from texts, based on the identification of linguistic information about the rhetorical structure of the text. white et al, in “co-cited author maps as real-time interfaces for web-based document retrieval in the humanities”, presented an account of a web-based information retrieval interface that aids in the mapping of scholarly literatures by creating maps of inter-related author names, so providing an aid to humanities research, which tends to centre work around named persons.

a few of these papers dealt with the categorisation of texts, and how the use of techniques from the information retrieval field, for example numerical classification and categorisation strategies, can replace the manual categorisation process. de pasquale and meunier’s “categorisation techniques in computer assisted reading and analysis texts (carat) in the humanities” presented the development of tools to categorise texts, asking if these text classification techniques can be applied successfully to the reading and analysis of texts in the humanities and social sciences. forest et al, “from mathematical classification to thematic analysis of philosophical texts”, explored methods of classification and categorisation for the automatic reading and analysis of humanities texts.
leopold and kindermann’s poster “what can hyperplane-classifiers tell us about texts” reported on a project which uses support vector machines for text classification, utilising semantic spaces and latent semantic indexing to classify texts.

the development of tools to aid in the construction of dictionaries was demonstrated in tufis and barbu, “extracting multilingual lexicons from parallel corpora”, which showed a method for the automatic extraction of translation equivalents from parallel corpora in order to produce bilingual dictionaries automatically, and also in silberztein et al, “large coverage dictionaries and grammars for text processing: the intex system”. this session focussed on the use of a corpus processing system for the textual analysis of lexical resources and grammars, explaining advanced methods for information extraction, demonstrating how the tools may be used in research, and how applications may be built based on the technology. the intex system was utilised again in dougherty et al, “intex solves pronunciation and intonation problems in text to speech reading machines”, a discussion of how to train text-to-speech machines to intone sentences correctly so that they retain their meaning, and of what the optimal structure of lexical entries could be in order to account for lexical ambiguities in sentences, demonstrating how markup is a practical problem which comes to the fore when developing tools to do a specific task.

the management and analysis of text as part of the internet was addressed in rockwell et al, “tracking culture on the web: an experiment”, which demonstrated the use of the www as a tool to track ideas and culture through the development of a system for tracking selected items, and showed how such tools can indicate cultural change.
zarri’s “the euforbia project: a semantic approach to the filtering of illegal and harmful content on the internet” presented an experiment in filtering internet documents according to an unbiased and semantically rich approach. the english bias of the internet and related technologies was pointed out in golumbia’s “the computational object: a poststructuralist approach”, which demonstrated the focus on standard average european languages in today’s computer programming infrastructure, querying how this affects the development of the code and the accessibility of such systems.

multimedia tools

the papers dealing with the development of tools and systems for non-textual humanities research covered a broad scope. kirschenbaum et al, in “the virtual lightbox: the potential of peer-to-peer humanities computing”, demonstrated the development of a software tool which functions as an image-based whiteboard for the web, allowing images to be juxtaposed for comparison, and discussed the prospects that the development of this kind of tool holds for humanities computing. brown and seales, in “3d imaging and processing of damaged texts”, showed how 3d imaging can be used as a means of creating, manipulating, restoring and carefully measuring features on digital facsimiles of manuscripts, applying new restoration techniques such as flattening to aid historians in their study of manuscripts and other texts. terras, in “reading the papyrologist: building systems to aid the humanities expert”, discussed the process involved in working with humanities experts in order to identify what type of computer tools will help them carry out their task, and the construction of a system to aid papyrologists in reading ancient texts.
the analysis of music scores was presented in ng’s poster, “optical music recognition: stroke tracing and reconstruction of hand-written manuscripts”, which documented an automatic and efficient method to transform paper-based music scores into a machine representation. bod’s “using natural language processing techniques for musical parsing” presented an investigation into whether it is possible to use probabilistic parsing techniques from natural language processing to parse music into groups and phrases which can be represented in a tree structure. the paper presented the development of a new parser which combines probabilistic heuristics to resolve ambiguity so that it can parse music accurately. ng, in “music via motion: interactive multimedia performances”, demonstrated a motion and colour detection system which uses a video camera to survey a live scene and track visual changes.

again, these papers show the breadth of vision humanities computing is developing: the inclusion of papers dealing with non-textual elements shows how the field is evolving and embracing a much wider remit, rather than just retaining a textual focus. these papers also indicate that some of the technical problems faced by scholars in the humanities are being solved by such scholars themselves, as they develop new techniques, tools, and software, and implement and test these new systems on a variety of sources. a number of the papers show that there is great inter-disciplinary collaboration between scholars to aid in the development of such tools, and that, as trucker intoned in her opening session, the meeting of the technical and the humanities-based expert can often prove to be a rich collaboration.
the social aspect…

socially, the conference was calmer than it has been in previous years. being in the heart of greenwich village, delegates were mostly left to fend for themselves, and the social schedule was kept to a minimum with a few drinks receptions, although there was still adequate opportunity to socialise between and after sessions. the conference culminated in a banquet at the th avenue ballroom, but unfortunately there was no dancing (no dancing?), with the hardiest having to go on to a blues club in the village. . session on saturday morning, anyone?

conclusions

the remit of such a conference has moved away from the merely literary and linguistic side of computing, towards computing and the humanities in its broadest sense, encompassing not only the development of digital resources, but their management, evaluation, and preservation, the development of computational tools to aid the humanities scholar in both textual and multimedia systems, and the development of teaching programmes and techniques to increase the presence of humanities computing in academic establishments. this conference has shown the interdisciplinary nature of such research, and how such collaborations can be fruitful. however, there still remains a lot to be done in all the areas mentioned, presenting great opportunities for the scholars involved.

humanities computing is still a very young, developing field, as indicated by the fact that so many young researchers were attending and presenting at the conference, alongside more established academics in the arena. there is a feeling of approachability, and although the conference is getting larger every year, there still remains a palpable sense of community, what allen renear, the president of allc, called the “social work” of humanities computing: the building of connections between people and projects to aid work and advance research. such a conference is increasingly about the relationships fostered and connections made at such meetings. the quality of discussion after the papers was firm evidence of this, and was often commented upon. compare this to conferences in more established traditional fields in the arts or sciences; anyone who has attended such a conference will attest to the difference in tone. the phrase “warm and fuzzy” was used repeatedly to describe the allc/ach community at this conference, and we presume this had nothing to do with the nyc humidity.

humanities computing appears to be an academic field in transit: having not long started its journey, this conference attests to the momentum that is building, and the quality and disparate nature of the research presented indicates that it has far to travel whilst technologies are explored, infrastructures put in place, and resources developed, evaluated, and utilised. so where is it all going? in many directions. and tubingen, …

international journal of scientific research in computer science, engineering and information technology
cseit | accepted : march | published : march | march-april- [ ( ) : - ]
© ijsrcseit | volume | issue | issn : -
doi : https://doi.org/ . /cseit

concept of digital literation based on value of local wisdom piil pesenggiri in history learning in the industrial revolution 4.0

regiano setyo priamantono, warto, akhmad arif musadad
student at master degree program in historical education, sebelas maret university, surakarta, indonesia
lecturer at master degree program in historical education, sebelas maret university, surakarta, indonesia

abstract

literacy is one of the abilities that are considered important in facing the 21st-century world.
the distinctive character of the 21st-century world is the industrial revolution 4.0. the impact of the industrial revolution 4.0 has been felt by everyone in every aspect of life, including education. indonesia's low level of digital literacy must be addressed immediately. for this reason, this study aims to propose a concept of digital literacy based on the values of the local wisdom piil pesenggiri in learning history in the era of the industrial revolution 4.0. piil pesenggiri is the behavior and outlook on life of the lampung people, which is still firmly held to this day. it is hoped that through this strategy the historical awareness of the students at public high school kalianda in lampung province will increase amidst the current of the industrial revolution 4.0. this study used qualitative research methods. data collection was conducted in january and february under natural conditions, using primary data sources, with data collection techniques including participant observation, in-depth interviews, and documentation. the results show that conventional history learning resources must change to a digital history book that current students can relate to without losing their cultural identity.

keywords: digital literacy, piil pesenggiri, learning history, industrial revolution 4.0

i. introduction

literacy is one of the abilities that are considered important for facing the 21st-century world. according to the 21st century education framework developed by the world economic forum (wef), there are sixteen important skills that children need to prepare and possess in order to survive and succeed today. the sixteen skills are divided into three large groups, namely foundational literacies, competencies, and character qualities. the distinctive character of the 21st-century world is the industrial revolution 4.0. the impact of the industrial revolution 4.0 has been felt by individuals, communities, companies, and even countries.
what is most felt is in the economic field, where many companies are not ready to face the industrial revolution 4.0. what, then, is meant by the industrial revolution 4.0, and how will it affect education, especially the learning of history? the fourth industrial revolution era is colored by artificial intelligence, supercomputers, genetic engineering, nanotechnology, autonomous cars, and innovation. these changes occur at exponential speed and will have an impact on the economy, industry, government, and politics. in this era, the world has more visibly become a global village (hasudungan & kurniawan, ).

in the field of digital literacy especially, indonesia's education sector needs to improve its abilities, given that indonesia is in a low position. reading interest in indonesia is still low: among the countries surveyed in "most literate nation in the world", indonesia ranks second from the bottom, only one place above botswana, which ranks last. besides, as explained at the outset, digital literacy is one of the basic abilities that a person must possess to face the world of the industrial revolution 4.0. this can begin with students.

figure . : the concept of application of digital literacy based on local wisdom piil pesenggiri

the low level of indonesian digital literacy may be due to a lack of material or literacy content that is close and relevant to students. therefore, it is felt necessary to integrate local wisdom that is essentially close and relevant to the daily lives of students.
also, based on preliminary data obtained through interviews with students, it can be concluded that historical awareness still does not show maximum results. thus, in conducting this research, digital literacy will be integrated with the local wisdom piil pesenggiri in the history learning conducted at public high school kalianda, south lampung. based on the introduction above, the research problem is to find out what form the concept of digital literacy based on the local wisdom piil pesenggiri takes in learning history in the industrial revolution 4.0 era.

ii. methods and material

qualitative research is research that uses a natural setting, intending to interpret phenomena that occur, and is carried out by involving existing methods. according to donald ary, qualitative research tries to understand phenomena by focusing on the whole picture rather than breaking them into variables. the aim is a holistic picture and depth of understanding rather than numerical data analysis (donald ary, ). according to donald ary (donald ary, ), qualitative research has six characteristics, namely: (1) concern for context, (2) natural settings, (3) human instruments, (4) descriptive data, (5) emergent design, (6) inductive analysis. data collection was carried out in natural settings, using primary data sources, with data collection techniques including participant observation, in-depth interviews, and documentation.

iii. results and discussion

a. digital literacy

what is the importance of digital literacy? the rapid development of information and communication technology provides challenges and prospects on a multidimensional basis. in the context of education, this development provides opportunities and challenges, both for teachers and learners, bringing new nuances to teaching and learning, social interaction, and professional work.
for teachers, for example, mastery of digital literacy provides convenience and effectiveness in planning, implementing, and evaluating learning programs that it does. •time•subject •content•as a skill digital literacy local wisdom piil pesengg iri industri al revoluti on . learnin g history http://www.ijsrcseit.com/ volume , issue , march-april- | http://ijsrcseit.com regiano setyo priamantono et al int j sci res cse & it, march-april- ; ( ) : - it can be imagined when non-literate teachers operate computers, it will take more time and energy and more money to prepare learning plans, compile teaching materials, and develop practical learning media, which can be attractive and strengthen learners' understanding of teaching material. conversely, educators who are literate in digital technology can compile and develop teaching materials and instructional media more attractively by utilizing images, videos, and music that are suitable for that purpose. this, in turn, increases the quality of student learning. especially considering students currently living in the era of the industrial revolution . it is felt necessary to develop learning that is up to date and contains local values of local wisdom. foundational literacies are skills related to the child's ability to apply core skills in everyday tasks. skills related to basic literacy consist of (the skills needed in the st century, ) a) literacy (skills related to text and language) b) numeracy (skills related to numbers) c) scientific literacy (skills related to scientific thinking) d) ict literacy (skills related to the use of information technology) e) financial literacy (skills related to decision making related to personal finance f) cultural and civic literacy (skills related to cultural understanding and civic rights) figure . 
: st century capabilities (https://widgets.weforum.org/nve- /content/exhibits/ .svg)

digital literacy is a person's knowledge or skills in understanding, analyzing, evaluating, managing, using, and utilizing various information from digital (including online) media, including how to re-communicate that information to individuals, groups and the wider community (hary soedarto harjono, ). digital literacy in this context does not merely mean the ability to use computers to write and read, as in the context of general literacy, but rather a set of basic skills in the use and production of digital media, information processing and utilization, participation in social networks to create and share knowledge, and various professional computing skills (tour, ).

rila setyaningsih et al. summarize seven elements of digital literacy (setyaningsih et al., ):
(1) information literacy is the ability to search for, evaluate and use needed information effectively;
(2) digital scholarship covers the active participation of digital media users in academic activities, using information from digital media as reference data, for example in research practice or the completion of college assignments;
(3) learning skills are the effective learning of various technologies that have complete features for formal and informal learning activities;
(4) ict literacy, or awareness of information and communication technology, focuses on ways to adopt, adapt and use ict-based digital devices and media, both applications and services. the ict-based media in question include computers or lcd projectors with powerpoint slides that have been designed to suit their intended use and are already connected to the internet as a basis for learning;
(5) career and identity management relates to ways of managing online identity. a person's identity can be represented by several different avatars that can have relations with more than one party at almost the same time;
(6) communication and collaboration is a form of active participation in learning and research through digital networks;
(7) media literacy includes critical reading skills and creative academic and professional communication in various media. media literacy keeps audiences from being easily deluded by information that at a glance meets and satisfies their psychological and social needs.

in the context of historical learning, listening, speaking, reading and writing skills can be trained with the help of digital technology. for learning these four language skills, historical learning resources are not limited to printed materials but also include digital materials and media that can be used more practically and efficiently. the learning media are very flexible and easy to develop. for learners, how fortunate are students who live in the millennial era and are supported by capable digital technology? college assignments can be typed on a computer, and learning resources from all over the world are available; like fruit, they need only be picked from the tree. however, downloading and uploading information that we or others need, or even just reading and listening to information, requires digital literacy. in other words, learning resources will not provide benefits if we do not have sufficient knowledge to use them. in the era of the industrial revolution . 
, historical education faces a considerable challenge, and its role is now to foster historical awareness among students, both as members of society and as citizens, and to strengthen the spirit of nationalism and love of the motherland without ignoring a sense of togetherness among the nations of the world. to deal with these challenges, an approach that utilizes digital literacy is needed, integrating the values of piil pesenggiri local wisdom in learning history.

b. local wisdom piil pesenggiri lampung community

hadikusuma ( ) said that lampung people inherit a pattern of behavior and outlook on life called piil pesenggiri, which has the following characteristics:
1. pesenggiri, meaning unyielding, not wanting to lose in attitude and behavior.
2. juluk adek, meaning fondness for a good name and honorable title.
3. nemui nyimah, meaning accepting and giving in an atmosphere of joy and sorrow.
4. nengah nyappur, meaning sociable and deliberative in solving a problem.
5. sakai sambayan, meaning helpfulness and cooperation in kinship and neighborhood relations.

the research of sulistyowati and risma margaretha (irianto & margaretha, ) also shows that lampung ethnicity today is in a serious condition, increasingly marginalized by cultural change, whether viewed from the perspective of diffusion or assimilation or as a challenge of multicultural, national and global society. the book written by umar rusdi and colleagues contains a quote that signifies the uniqueness of the people of lampung (umar rusdi et.al., ):

“tandonou ulun lappung, wat piil pesenggiri, you balak piil ngemik malou ngigau diri. ulah nou bejuluk you beadek, iling mewari ngejuk ngakuk nemui nyimah. ulah nou pandai you nengah you nyappur, nyubali jejamou begawiy balak, sakai sambayan”

the translation is: the sign of the people of lampung: there is piil pesenggiri, he has a big heart, has a sense of shame, self-respect, because it is more, big name and title. like brothers, give open arms. because he is smart, he is sociable. work together with a large work, please help.

researchers' field observations show a kind of flexibility among the lampung people in applying the piil pesenggiri culture in a contemporary or modern context. this can be seen as an effort to keep the culture of the lampung people dynamic in response to the demands of the times. in a traditional context, the meaning of piil pesenggiri is bound up with ceremony, custom and prevailing norms. for example, berjuluk beradok, in lampung custom, is a title that brings a name of greatness to the person who receives it (a nickname in the form of a title, for example, raja sejagat lampung). the person given this title carries a mandate of values that must be upheld and must apply the principle of piil pesenggiri, namely self-esteem, dignity, and personal and family authority.
usually, a title in a lampung traditional event is obtained by holding a begawi (lampung traditional party). in holding the event, lampung people apply the philosophy of nemui nyimah, which means receiving guests with a sweet face and open arms. the holder of the title must also bear the philosophy of nengah nyappur, which is to live together in a community, whether among indigenous people or the general public, so that they can consult one another in solving problems or planning activities. finally, sakai sambayan means working together on events, whether traditional parties or other activities; it can be interpreted as cooperation.

piil pesenggiri, as a pillar of lampung philosophy with its four pillars of nanggah nyimah, sakai sambayan, nengah nyappur, and bejuluk beradok, has existed for centuries and has been lived by indigenous peoples in lampung. piil pesenggiri, the ethos and spirit of this village, if carried out consistently and sincerely, will bring people to a harmonious order of life. piil pesenggiri keeps people from division and strengthens them in a multicultural society. piil pesenggiri can therefore be recommended to and lived by anyone who loves peace and also values diversity. the local wisdom and ethics of piil pesenggiri can become the spirit and capital for fostering development in sang bumi ruwai jurai so that lampung people can stand tall with other tribes in the global community.

c. the concept of learning implementation plan digital literacy based on value of local wisdom piil pesenggiri

historical education taught in schools is an alternative way to achieve the goals stated above.
as explained by isjoni, "historical education in schools aims to build the personality and mental attitudes of students, awakening awareness of a fundamental dimension in human existence (continuity of movement and continuous transition from the past towards the future), bringing people to the values of honesty and wisdom in students and instilling national love and humanity" (isjoni, ). djamarah, hamalik, and wiyanarti, in ahmad fakhri hutauruk (hutauruk, ), stated that there are five (5) main activities in designing historical learning strategies, namely:
1. identifying the initial abilities of students, and determining the specifications and qualifications of the expected changes in student behavior and personality.
2. choosing a historical learning approach based on people's aspirations and outlook on life.
3. choosing and determining the procedures, methods, and techniques of teaching history that are considered the most suitable and effective, so that they can guide the teacher in carrying out their work.
4. establishing norms and minimum limits of success, or criteria and standards of success, to guide the teacher's work.
5. evaluating both the process and the learning outcomes of history, which will then be used as feedback for improving the overall learning system.

d. 
digital literation based on value of local wisdom piil pesenggiri

learning implementation plan: digital literation based on value of local wisdom piil pesenggiri

educational unit: public high school kalianda, south lampung
subjects: mandatory history
class / semester: x /
main material: historical thinking
main sub-material: chronological, diachronic, synchronous, space and time thinking
time allocation: x meeting ( x minutes)

learning objectives

through learning activities that examine the local wisdom piil pesenggiri with a scientific approach, using the discovery learning model and digital sources, students can understand historical thinking, which includes chronological, diachronic, and synchronous thinking, understand the concepts of space and time, present the results of discussions to the class using information technology, and develop an attitude of religiosity (love of peace, tolerance), independence (fondness for reading, curiosity), mutual cooperation (cooperation, orientation towards mutual benefit) and integrity (responsibility).

learning steps (description of activities, with time allocation)

introduction ( minutes)
1. students pray and give greetings when starting learning (discipline of worship, the value of religiosity).
2. the teacher conducts classroom management (the teacher checks the readiness of students to learn, starting from class cleanliness, neatness of student clothing, and the neatness of tables and chairs).
3. the teacher checks student attendance.
4. the teacher prepares digital history tools, media, and books.
5. apperception to focus students on the lesson: the teacher conducts questions and answers that relate prior knowledge to the material to be learned.
6. the teacher explains the learning objectives or basic competencies to be achieved.
7. the teacher conveys an outline of the learning material and the assessment to be carried out.

core ( minutes)

stimulus
• students observe historical timelines about local wisdom piil pesenggiri (hard work, independence).
• students read digital history books to discover the concept of chronological (diachronic) and synchronous thinking in the development of local wisdom piil pesenggiri (reading fondness, value of independence).

problem statement
through observing images on the projector screen and reading digital history books, the teacher gives students the opportunity and motivation to formulate the problems that will be discussed in the next steps, for example:
• what is chronology, in relation to local wisdom piil pesenggiri?
• what is meant by chronological thinking, in relation to local wisdom piil pesenggiri?
• how does one think chronologically in relation to local wisdom piil pesenggiri?

data collection
students are divided into groups consisting of - people per group. each group is assigned a topic according to the sub-material on local wisdom piil pesenggiri that will be discussed (collaboration, mutual cooperation values).

discourse
if we study history, we will never be separated from the concept of chronological thinking. history teaches us how to think chronologically, meaning to think in a coherent, orderly and continuous manner. with a chronological concept, history gives us a complete picture of the local wisdom piil pesenggiri from the perspective of certain aspects, so that we can easily draw benefits and meaning from its development in every era. the diachronic concept sees the local wisdom piil pesenggiri as developing and moving over time. through this process, students at public high school kalianda can make comparisons and see the development of local wisdom piil pesenggiri in the lives of the people of south lampung from one era to the next.

command:
1. explain the concept of chronological thinking in the development of local wisdom piil pesenggiri.
2. explain the concept of diachronic thinking in the development of local wisdom piil pesenggiri.
3. explain the concept of synchronous thinking in the development of local wisdom piil pesenggiri.
4. explain the concept of space and time in the development of local wisdom piil pesenggiri.

data processing
• students formulate answers to the problems that arise (responsibility, integrity value).

verification
• students match their observations of the images on the projector screen with the digital history books in relation to the questions or problems that arise (creative and innovative, the value of independence).

generalization
• thinking chronologically means thinking in a coherent, orderly and continuous manner. the diachronic concept sees the local wisdom piil pesenggiri as developing and moving over time. through this process, the students at public high school kalianda can make comparisons and see the historical development of their people's lives from one age to the next.
• the synchronous way of thinking in history means thinking that is wide in space but limited in time. this way of thinking analyzes the development of local wisdom piil pesenggiri at one point in time. it is important in studying the development of local wisdom piil pesenggiri because it serves to analyze the state of a place at a certain time. this approach is horizontal and analyzes contemporaneous events.
• the concept of space and time in history is very important. space is an element that cannot be separated from an event or from changes in the lives of humans as subjects or agents of history. all human activities take place in a particular setting. space is the concept most closely tied to time: it is the place where historical events occur over time. the study of an event along the time dimension cannot be separated from the space in which the event occurred. if the concept of time focuses on when the event happened, the concept of space focuses on where it occurred, in this context the development of local wisdom piil pesenggiri at public high school kalianda, south lampung.

closing ( minutes)
1. teachers and learners together draw conclusions about the material on the development of local wisdom piil pesenggiri covered at this meeting (friendly/communicative, mutual cooperation values).
2. students, with the teacher's guidance, reflect on what they learned about the development of local wisdom piil pesenggiri by writing a summary of the material and the learning process and by giving suggestions or responses at this meeting (moral commitment, the value of integrity).
3. students are assigned independent tasks to send to the group email, and the results will be published on the school website https://www.smanegeri kalianda.sch.id/ (hard work, value of independence).
4. students are given a group task to complete the analysis and alternative solutions to the problems studied in class (cooperation, mutual cooperation values).
5. teachers and students close the activity by expressing gratitude to god that this meeting has gone well and smoothly (discipline of worship, the value of religiosity).

learning resources
• internet
• google scholar
• digital history books

learning tools and media
• tools: laptops, projectors, projector screens, speakers
• media: concept map about the development of local wisdom piil pesenggiri

iv. conclusion

literacy is one of the abilities considered important for facing the st-century world. the distinctive character of the st-century world is the industrial revolution . . students of public high school kalianda in south lampung live in the era of the industrial revolution . . they also live with the values of the local wisdom piil pesenggiri as the lampung people's behavior and outlook on life. these values are: big-hearted, have shame, self-esteem, because of more, big name and title. like brothers, give open arms. because he is smart, he is sociable. work together with a large work, please help. theoretically, applying digital literacy that integrates the values of the local wisdom piil pesenggiri in learning history is expected to foster students' historical awareness at public high school kalianda, south lampung in the era of the industrial revolution . . further research is therefore needed to produce digital learning resources that contain the local wisdom piil pesenggiri for learning history.

v. references

[1] donald ary, dkk., , “pengantar penelitian dalam pendidikan” in pustaka pelajar, yogyakarta.
[2] hary soedarto harjono. . literasi digital: prospek dan implikasinya dalam pembelajaran bahasa. ( ).
[3] hasudungan, a. n., & kurniawan, y. . meningkatkan kesadaran generasi emas indonesia dalam menghadapi era revolusi industri . melalui inovasi digital platform. seminar nasional multidisplin , september, – .
[4] hutauruk, a. f. . digital citizenship: sebagai upaya meningkatkan kualitas pembelajaran sejarah di era global. historis | fkip ummat, ( ), . https://doi.org/ . /historis.v i .
[5] irianto, s., & margaretha, r. . piil pesenggiri: modal budaya dan strategi identitas ulun lampung. makara human behavior studies in asia, ( ), . https://doi.org/ . /mssh.v i .
[6] isjoni, , “pembelajaran sejarah pada satuan pendidikan” in alfabeta, bandung.
[7] setyaningsih, r., abdullah, a., prihantoro, e., & hustinawaty, h. . model penguatan literasi digital melalui pemanfaatan e-learning. jurnal aspikom, ( ), . https://doi.org/ . /aspikom.v i .
[8] the skills needed in the st century. . http://widgets.weforum.org/nve- /chapter .html
[9] tour, e. ( ). digital mindsets: teachers’ technology use in personal life and teaching. language learning and technology, ( ), – .
[10] umar rusdi et.al., , “arsitektur tradisional daerah lampung (rifai abu (ed.))”, in departemen pendidikan dan kebudayaan.

cite this article as: regiano setyo priamantono, warto, akhmad arif musadad, "concept of digital literation based on value of local wisdom piil pesenggiri in history learning in the industrial revolution . ", international journal of scientific research in computer science, engineering and information technology (ijsrcseit), issn : - , volume issue , pp. - , march-april . available at doi : https://doi.org/ . /cseit journal url : http://ijsrcseit.com/cseit

exploring the bloodaxe archive: a creative and critical dialogue

the acquisition of the internationally significant archive of poetry publisher bloodaxe books in was the starting point for a new collaboration between library staff at newcastle university and researchers in the school of english literature, language and linguistics. an exploratory, and then a major, arts and humanities research council-funded project, ‘the poetics of the archive: creative and community engagement with the bloodaxe archive’, became the opportunity to test new theories about archival practice, particularly through digital applications, and expand the audience traditionally involved with literary archives from humanities researchers, to poets, artists and film-makers.
the project was realized through both following and subverting established foundations of archival practice: alongside traditional cataloguing, which provided the strong frame for other activities and the metadata on which it drew, an experimental digital interface was created, which could harness multimedia and creative outputs. it was also the aim of the interface to emulate an experience where users had the capacity to ‘browse’ the data alongside ‘search and retrieve’ discoverability, and make links that depended on the kind of serendipity on which creative activity thrives.

introduction

in a first for newcastle university, the university library and the school of english language, literature and linguistics (selll) jointly acquired the bloodaxe books archive in . bloodaxe books was established in and is now one of the largest and most successful independent poetry publishers in the uk. thanks to the foresight of its editor, neil astley, bloodaxe from the start laid down the habit of keeping everything and its archive comprises extensive editorial, business and financial records, and correspondence. it is also a continually expanding resource, with further accruals received by the library on an annual basis. as we transferred the first tranche of the bloodaxe archive from its home in northumberland to our archival standard stores in (see photograph), a preliminary project, also funded by the arts and humanities research council (ahrc), began. two poets, phd graduates tara bergin and anna woodford, under the direction of linda anderson, worked alongside our archivist, ian johnson. we attempted to support the researchers in our traditional role of ‘describer’ and ‘provider’, giving clues as to what was in the boxes, while also ensuring legal requirements under data protection and copyright legislation were adhered to.
[photograph: the initial tranche of material from the bloodaxe archive]

insights – ( ), november exploring the bloodaxe archive | linda anderson and ian johnson
linda anderson, professor of modern english and american literature, newcastle university
ian johnson, head of special collections & archives, newcastle university

while the latter was a necessity, the former was less crucial beyond the logistics of providing access in the controlled environment of an invigilated reading room. this pilot project provided an initial sense of the scope and potential of the archive to us, with the focus on creative outputs. it also allowed questions to be raised. as non-traditional researchers, the poets still felt the need for the metadata of box-lists and catalogues, even as the methods and assumptions behind them seemed alien. our outputs, in collaboration with visual artist kate sweeney, included a short poem-film documenting the initial ‘opening’ of the archive. tara bergin’s poem ‘what we found in the archive’ ended presciently with the lines: we asked the archivist to make us copies. we promised not to tell. we signed his orange form in pencil – and pocketed everything. in this figurative view, we as a library emerged as an authority to be colluded with or subverted. it was a timely caveat, spurring further reflection: archival infrastructures are necessary but can also be viewed, as in creative contexts, as stifling, encouraging suspicion and even rejection. this case study sets out first of all the general context in which research was happening from the library point of view. it then addresses a particular set of issues that proved important for the project ‘the poetics of the archive’ from the point of view of literary researchers.
context in archival practice and digital humanities

by the time of the project, it had become difficult for archivists and librarians simply to hold uncritically to the view that the desirable end product of archives was objective discoverability, particularly in the new ubiquitous digital environments. as geoffrey yeo states, even where expertise of archivists is acknowledged, ‘experts cannot be seen as infallible providers of objective information’. standards-based explorations of collections through catalogues that are compliant with isad[g] (general international standard archival description) adhere strictly to concepts of faithful capture of provenance, original order and functional analysis. however, these constructs, especially around signalling significance, are based on fallible human assessments of what is the most logical presentation of descriptive data. even where digitization provides the assets to ‘show not tell’, this metadata still drives discoverability. beyond the acknowledgement of ‘islands of innovation’, the first wave of the digital revolution, rather than undermining these curatorial constructs in archives, seems to have cemented them through the interoperability these standards afford. through discovery portals, researchers are able to interrogate, with objective rigour, obscure and aggregated collections in a way that was unimaginable even years ago. but who are these researchers? within this paradigm, they are predictable abstracts, seen as both logical and interested in narrative wholes as well as unexplored fragments. do they always seek objectivity? do they even know exactly what they may be looking for in an archive? how does the theory match up with the experience of coming into contact with a range of different researchers, researchers who may also include creative practitioners? as a field in itself, digital humanities as a subset of digital scholarship can be seen as amorphous, an attribute it often playfully embraces.
at its core, however, it ‘refers to new modes of scholarship and institutional units for collaborative, trans-disciplinary, and computationally engaged research, teaching, and publication’. it extends and expands traditional humanities research to embrace, and co-curate, digital design. moreover, digital humanities is a collaborative endeavour involving humanists, librarians and artists in conceptualizing and solving problems, never relying on one person’s or group’s interpretations. this view challenges entrenched ideas of objectivity as the end goal. working on the papers of poet elizabeth jennings, dr jane dowson has written, ‘by digitally bringing the material into the public arena with self-declared interests and subjectivities, we shift monolithic narratives to multivocal ones’. what power must information professionals and archival interfaces preserve in order to service the needs of end users of archives and digital libraries? to what extent can they function as ‘critical nodes’, ‘a site of potential interdisciplinary collaboration and user engagement’ in a more creative design space? these are the questions that underwent further elaboration in our project ‘the poetics of the archive’.

the poetics of the archive – the view of literary researchers and poets

the particular ahrc scheme we responded to was a ‘capital funding call for digital transformations in community research co-production in the arts and humanities’ and it seemed designed to cover many of the areas we hoped to develop in relation to the bloodaxe archive, including the chance to curate and catalogue the archive.
our successful application initiated months of frantic learning with a large team that extended across poets, fine artists, digital artists, literary researchers and library staff, including a dedicated archivist and digitizer. the library professionals, as noted above, provided with great speed and assiduousness the building blocks to enable design. collaboration enables or perhaps, at its most challenging, forces a kind of openness, if the ideas of the different researchers are to be fully respected and accommodated within the project and if it is to exceed any one person’s vision. we accomplished more than we ever thought at the beginning we could. however, four particular areas are worth highlighting where a mixture of circumstance, lack of precedent, theoretical exploration and ambition created the crucible for innovation and laid down a series of challenges in terms of customary archival practice. many of the results, including different kinds of creative output, can be accessed through the experimental interface we created.

the uncatalogued archive

one of the results of receiving research funding shortly after having physically acquired the archive was that processes that one might have expected to happen sequentially had to happen at the same time. there was no time for library staff to do more than briefly list materials before researchers began to delve into them (supervised, of course), choosing only two boxes of material at a time from a box-log (figure ).

figure . bloodaxe archive prototype box-log interface

this meant that researchers and participants got access to papers that ordinarily might have been disposed of in the course of sorting and curating.
one participant commented on her surprise that ‘the packets of proofs and correspondence were still enclosed in original stamped envelopes that felt like relics in their tattiness’. this experience of ‘tattiness’ highlighted the materiality of the archive, its availability as a source of visual inspiration. macro-photographs by photographer phyllis christopher helped to release a sense of the allure and fascination of paper, revealing its textures, its fraying edges and indentations, and a suggestive visual resemblance to skin. they also made visible ghostly traces of the poets’ working, hidden under coffee stains and tipp-ex. whilst one of our aims was ‘digital transformation’, in this case the digital helped to create continuity with the material, exemplifying jacques derrida’s argument that the digital liberates the ‘past resources of paper’ and provides us with insight into materiality in ‘a sort of future anterior’. according to derrida, technologies ‘liberate our reading for a retrospective exploration of the past resources of paper, for its previously multimedia vectors’. what is the temporality of an archive? is the past it stores always a coming into being of the past? is the process of writing that it records an illusion, its fluidity already fixed, its provisionality already a historical fact? or can we encounter something in an archive, as susan howe writes suggestively, through a kind of telepathy, ‘quickly – precariously – coming as it does from an opposite direction’, maybe even, not what is fixed but ‘a moment before’. more practically, what should be kept of an archive and what should be thrown away? how can we predict what might, in the future, become of importance to researchers? is an editorial function already in operation in the tidying of papers and disposing of ephemera? and who decides what is important and what not? 
the abridged archive an archive by contemporary writers, most of whom are still alive, raises particular copyright problems when part of the aim is also to create digital surrogates. the filtering and redaction of personal information by our archivist was important and conformed to best archival practice. however, when it came to digitizing editorial pages from the bloodaxe archive, permission had to be sought from the poets in question. many gladly gave permission, but at least per cent of those asked said no or did not reply. again, there was a time pressure. not everything in the archive could be digitized for practical as well as legal reasons, and indeed, it seemed uninteresting to digitize pages where there was no marginalia, no editorial changes, and the page was the same as the final, published version. the pages digitized therefore are and can only be a sample of the whole, and this archive cannot hope to emulate the comprehensiveness of some other, more historical digital archives. the techniques of computer text mining we employed in the digital interface are not operating across a complete record, and the patterns they establish can only be suggestive rather than definitive. yet, perhaps it is here that the ‘literariness’ of the archive interposes itself, offering another set of meanings from those underpinning text-mining techniques. to follow the journey of particular words, even as a sample, is to appreciate how language within poetry is both a matter of individual choice – a making new – but, as one of our researchers and poets, ahren warner, reflected, also bound within previous uses by poets of the same words. the section of the interface entitled ‘words’ (figure ), warner reminds us, demonstrates that ‘words are malleable, changeable entities that will themselves always be reconstructed, renewed and perhaps rendered completely other by further words’. 
the archive offers, in its own way, a lesson in creativity and poetry writing, and the ever-shifting ground they inhabit between originality and repetition.
figure . word map of connections interface
creative expansion one of the exciting challenges of the original funding call was that it offered the opportunity to explore the archive as a provocation to fresh creativity, rather than as a scholarly resource that would produce writing about it and not in response to it. we gathered together poets who were in poetry groups both regionally and nationally in order to create a community of people who could meet to hear about the archive and spend time reading and responding to their own selection from it. there were also two fine artists, irene brown and alan turnbull, employed to make their own art in relation to the archive, a film-maker, kate sweeney, and photographer phyllis christopher, whose work all contributed to an exhibition as part of the newcastle poetry festival, . this was in many ways one of the most exciting and rewarding aspects of the project. some of our poets had never visited an archive before, nor thought of doing so, but found themselves responding to the archival space itself, experiencing the fascination of containers and secrets hidden away, waiting to be brought back to life. there was an element of serendipity about which boxes they chose to look at, and often the collocation of different pages produced a collage effect, generating different meanings. or as poets they found themselves particularly affected by the seeming intimacy of the encounter with particular poets, and the effect, in the archive, of seeming to experience a voice or a bodily trace. 
of her own experience in an archive, the poet susan howe writes, ‘here is deep memory’s lure, and sheltering. in this room i experience enduring relations and connections between what was and what is.’ one of our participants, anna woodford, commented, ‘it is a strange situation – writing a poem about a vast poetry archive while surrounded by other poets doing the same thing. i realised that wherever i write, i am also in some sense in the reading room trying to insert my poem/my name into a room full of my contemporaries and the work of my forebears represented by the archive’. the impressive showcase of work produced for the project is at the moment accessed from the digital interface, but this is work from only one occasion. we want there to be many occasions when poets use the archive to learn about and stimulate their own creativity, and can experience the digital interface in some of the same ways the poets entering the physical archive did. the digital archive we approached the creation of the digital interface to the archive with a set of questions about how we could create a different kind of interface which was ‘generous’ in mitchell whitelaw’s terms and could support kinds of interaction which were less task-driven, but opened up new forms of encounter which were of themselves enjoyable, open and non-predictable, and where there was room for affect, rather than only, as nicholas belkin says, ‘efficiency and effectiveness’. led by digital artist tom schofield, we explored different kinds of browsing and search through words and shapes, which could throw up interesting and often serendipitous connections. 
we also added photographs and films, some of which feature interviews by two of our researcher-poets, colette bryce and ahren warner, with a selection of the poets whose work appears in the archive, but some of which explore gesture or interactions with paper in their own creative way. it was nevertheless important, of course, that the archive be catalogued fully, and the new digital interface we created is in many ways parasitic on the information and metadata contained in the more traditional catalogue for the bloodaxe archive that exists on the archives hub. however, here for us was an opportunity to align an informatics tool with research in the humanities, and particularly with the creativity contained within the archive. ‘from analysis springs invention’, johanna drucker writes, then warns us that ‘humanities documents and aesthetic artefacts are not “data” and they don’t contain “data”’. in the bloodaxe archive the partial nature and partiality of the knowledge it contains underpin an uncertainty that may be its most valuable discovery and asset. current and future developments the bloodaxe archive and the potential unlocked through ‘poetics of the archive’ cemented contemporary poetry archives as a collection strength at newcastle university and this is being expanded through further acquisitions, including the archives of individual poets such as sean o’brien, moniza alvi and jack mapanje, and the development of an overarching interface which, under construction at the moment, will in time grow and explore new forms of connection between the different archives. 
for the library, ‘poetics of the archive’ and the generous interface that it created stand as a benchmark for the potential of working alongside humanities academics and exploring digital innovation and new ways of engaging with archives. one of the interests that emerged from this research was ‘archival liveness’, a topic on which tom schofield has subsequently published, and which explores the temporal interconnectedness of archival systems and archivists. a shift from archives to archiving emphasizes not only the dynamic nature of archiving but also the construction of knowledge that takes place within archival institutions. this focus is also being explored by kate sweeney as she finds new ways of registering through film the unacknowledged labour within an archive, both in relation to the creation of the original documents and their storage within the library system. conclusion the work of discovery and experimentation that has been opened up by ‘the poetics of the archive’ has the potential to grow in exciting new directions. this research is built on the firm foundations of collaboration between literary researchers and library staff. the project allowed for the development of new paradigms, through digital design experience and the serendipity of creativity, to which literary researchers and library cultures have responded in tandem and will go on responding in the future. abbreviations and acronyms a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘abbreviations and acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa competing interests the authors have declared no competing interests. references . ncla archive: http://archive.nclacommunity.org/content/?= (accessed october ). . 
yeo g, trust and context in cyberspace, archives and records, , ( ), – ; doi: https://doi.org/ . / . . (accessed october ). . weller m, , the digital scholar: how technology is transforming scholarly practice, london, bloomsbury academic. doi: https://doi.org/ . / . what is digital humanities?: https://whatisdigitalhumanities.com/ (accessed october ). . burdick a, drucker j, lunenfeld p, presner t and schnapp j, digital humanities, , cambridge, massachusetts, the mit press: https://mitpress.mit.edu/sites/default/files/titles/content/ _open_access_edition.pdf (accessed october ). . dowson j, poetry and personalities: the private papers and public image of elizabeth jennings. in: the boundaries of the literary archive: reclamation and representation, eds smith c and stead l, , surrey, ashgate. . hedstrom m, archives, memory, and interfaces with the past, archival science, , ( – ), – . doi: https://doi.org/ . /bf . the bloodaxe archive: http://bloodaxe.ncl.ac.uk (accessed october ). . howe s, notes on process in poetics of the archive, eds anderson l and warner a, , newcastle, newcastle centre for the literary arts. . derrida j, paper machine trans. r bowlby, , stanford, stanford university press ( ). . howe s, spontaneous particulars: the telepathy of archives, , new york, new directions ( ). . warner a, ‘words’: http://bloodaxe.ncl.ac.uk/explore/index.html#/words (accessed october ). . howe s, ref. . . woodford a, poem for the archive: http://bloodaxe.ncl.ac.uk/explore/index.html#/research/poa aw (accessed october ). . whitelaw m, generous interfaces slideshare: http://www.slideshare.net/mtchl/generous-interfaces (accessed october ). . 
belkin n, ‘some(what) grand challenges for information retrieval’: https://www.researchgate.net/profile/nicholas_belkin/publication/ _ecir_keynote_somewhat_grand_challenges_for_information_retrieval_ /links/ deec e fd /ecir-keynote-somewhat-grand-challenges-for-information-retrieval- .pdf (accessed july ). . drucker j, graphical approaches to the digital humanities. in: a new companion to the digital humanities, eds schreibman s, siemens r and unsworth j, , chichester, wiley-blackwell. . contemporary poetry collections: http://poetry.ncl.ac.uk (accessed october ). . schofield t, kirk d, amaral t, dörk m, whitelaw m, schofield g and ploetz t, archival liveness: designing with collections before and during cataloguing and digitization, digital humanities quarterly, , ( ): http://www.digitalhumanities.org/dhq/vol/ / / / .html (accessed october ). article copyright: © linda anderson and ian johnson. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited. corresponding author: linda anderson professor of modern english and american literature newcastle university, uk e-mail: linda.anderson@ncl.ac.uk orcid id: http://orcid.org/ - - - ian johnson orcid id: http://orcid.org/ - - - to cite this article: anderson l and johnson i, exploring the bloodaxe archive: a creative and critical dialogue, insights, , ( ), – ; doi: https://doi.org/ . /uksg. published by uksg in association with ubiquity press on november
thou shalt commit: the internet, new media, and the future of women’s history claire bond potter journal of women's history, volume , number , winter , pp. - (article) published by the johns hopkins university press doi: . /jowh. . http://muse.jhu.edu/journals/jowh/summary/v / . .potter.html © journal of women’s history, vol. no. , – . more than a tool for global networking and intellectual exchange, digital technology has transformed the most basic terms of feminist scholarship: reading, writing, archival research, and publication itself. this article addresses how the internet and the emerging field of digital humanities have fulfilled some of the larger aspirations of feminist scholarship as they were articulated at the dawn of the twenty-first century. 
when we move online, however, scholars engaged with history and new media identify new questions that require feminist attention. among them are the digital divide between universities and their publics; transnational linguistic barriers; the uncertain future of journals within an altered reading and publishing environment; and the gendered history of digital technology itself. in the mid- s, i was about to sign my first book contract with a university press. having made a heroic effort to read the dense legal prose, i stumbled on a line about electronic rights. “what are they talking about?” i asked my tennis partner, a former mass-market book editor, later in the day. “who knows?” she responded, handing two balls over the net. “just sign it. your serve.” what if she had told me instead that, by , this same book would be delivered over radio waves, making it possible for a buyer to “start reading war on crime on your kindle in under a minute”; that i myself would be reading other people’s books on a six-by-eight glass screen; or that i would be pulling journal citations for this article from electronic files stored on that device? i would not have believed it. in , even leo beranek, a collaborator on arpanet, a predecessor to the internet, predicted that linked computers would expand scholarly networking, but not that they would transform a task as basic as reading. “early in the new century the number of homes connected to the internet will equal the number that now have televisions,” he wrote in the massachusetts historical review. “the internet has succeeded wildly beyond early expectations because it has immense practical value and because it is, quite simply, fun.” when the journal of women’s history (jwh) opened for business twenty-five years ago, most historians were not even familiar with email. 
by the spring of , as the contours of a post-cold war world were emerging, and jwh geared up for its tenth anniversary issue, the internet still had not made an impact on how feminist scholarship was written or consumed. the journal instead looked to major changes in what we thought “women’s history” was, changes that were undoubtedly related to the enhanced networking possibilities created by email. leila rupp’s introduction urged readers toward these paradigm shifts: the critical importance of advances in feminist and queer theory; decentering north american women’s history and its dominance over the field; and reconsidering women’s history from a transnational perspective. looking back, an article on the recent history of digital scholarship, and its promise for our work, would have dovetailed nicely with this agenda. if digital and new media projects were already demonstrating the dramatic impact they would have on the historical profession, and on what we mean when we use basic keywords like “archive,” “document,” or “publish,” very few intellectuals had recognized that the internet would reshape the humanities, producing distinct practices, methodologies, and tools. and yet, many collaborative digital history projects were up and running prior to jwh’s tenth anniversary. the american social history project at the city university of new york had moved into the new center for media and learning (now the ashp/cml) in , its own tenth anniversary. in , the historian richard jensen, of the university of illinois-chicago, and his collaborators founded h-net as a free, list serve-based scholarly network. jensen, collaborating with a group of scholars who were disproportionately female compared to the rest of the profession, launched h-women and twelve other field-based lists in , creating a communications infrastructure that would allow journals, like jwh, to send calls for papers anywhere that a computer could receive them. 
by , roy rosenzweig, who had published his first article on historical computing in and had collaborated with steve brier and josh brown of ashp/cml at ashp and the mid-atlantic radical historians organization (marho) collective, established the center for history and new media at george mason university. despite feminist scholarship’s roots in an activist social history project (two historians of women, elizabeth blackmar and susan porter benson, were also collaborators at marho), there was no explicit link between feminist scholarship and the research occurring at what have come to be known as digital humanities labs. this may be why the editors of jwh did not speculate in that tenth anniversary issue about the impact that applied computer technology and the emerging internet would have on feminist research, the objects of feminist research, or collaborative scholarship, much less on how the community of scholars that made up women’s and gender history would function in a digital media environment. the jwh editors were not alone in this omission. at the dawn of the twenty-first century, historians of all genders and specialties seemed largely unaware that the fancy typewriters with screens suddenly appearing on their desks would fundamentally alter how anyone practiced, published, or taught history. a google scholar search for the phrase “digital history” between and turns up only forty-eight results, most of which are related to archiving. as a skeptical andrew mcmichael put it in the american historical association’s perspectives in the same year as jwh celebrated its tenth anniversary, the internet offered historians “a mixed message” and always would. why? perhaps this was because computer-driven, quantitative cliometric research, and the historians who did it, had largely taken their work to more receptive economics departments by . perhaps, ironically, it was a tautological problem. 
what was cutting edge in the digital humanities often migrated to specialized research centers, like ashp/cml, because history departments did not view such projects as sufficiently scholarly, almost ensuring that non-digital historians could remain happily ignorant of what computers could accomplish. perhaps it was a status problem: many of the people who pioneered the digital humanities were at public institutions, were women, and were working at colleges and universities outside the academic research metropolis. whatever the reason, in , it was easy for most historians to imagine that anything not printed on paper was unscholarly and impermanent, despite the rapid adoption of digital learning at the secondary school level. this condition persists in many departments today. despite the rise of refereed on-line history journals, on-line versions of every major publication, and sophisticated archiving systems, the question of why electronic scholarship matters, and how it should be evaluated, is still an alarmingly difficult conversation to have with many colleagues in the humanities. interestingly, when i reflected on the great silence about digital history in the jwh tenth anniversary issue, i realized that most of the basic tools that structure daily onscreen life for the average historian were already in place in . email, web pages, web logs (soon to be called blogs), newsgroups, the pioneering graphically-oriented netscape browser, and the list serve were well established. google was in the works and would launch in september . in other words, everything existed but facebook, twitter, wikipedia, and research tools like endnote and zotero! 
regardless, mcmichael asserted a consensus among historians (“all would concede”) that “much, perhaps most, of the information on information available on the internet is not very useful.” the most open tools, such as newsgroups, would “never be any more than a forum where those who repeat their arguments the most and the loudest tend to dominate the discussion.” email alone had potential. “a majority of historians are now using e-mail,” he wrote, “and the rest have probably heard about it from colleagues. in fact, from the perspective of the historian, email may be one of the greatest benefits of the internet revolution,” giving us the capacity “to communicate almost instantly with other historians around the globe.” things turned out differently, although it is worth mentioning that internet utopians have been wrong about some of their predictions, too. firewalls limit free access to scholarship and major journalism outlets, while connecting to the digital realm usually requires either a job or money to pay for a wireless signal. the internet has also transformed access to some, although not all, archives. it is particularly friendly to collections where privacy, copyright, or (as recent events in the united states have demonstrated) national security, are not a concern. but access to digitized documents and finding aids does push research forward, and in some fields the absences are unimportant when compared to improved access. “digitization is a godsend,” one prominent early american women’s historian said to me over breakfast when i asked if, and how, the internet had changed her work. newspapers are a critical source for anyone writing about women and gender in the early modern period, not only because they provide documentation of a period in which women were less literate, had less leisure to write, and were rarely published, but because they provide insights into the gendering of public discourse. 
not only could this historian read seventeenth and eighteenth century newspapers at her desk, she could enlarge them at will, a huge leap over microfilm and microfiche reproduction and a welcome relief to the eyes, young or older. in , it was also hard to imagine that the emergence of free blogging and social media software would make the boundaries between “fun” and “work” on the internet highly porous within five years. on the one hand, scholars who have made the leap to new media may be the targets of skepticism by a more traditional history establishment that prefers paper. on the other hand, for most of the twentieth century, the same establishment preferred white men of the political and intellectual classes as historical subjects, and look what happened to that. if history is not yet on the side of digital scholarship, those of us who prowl the internet may nevertheless be on the side of history, which is trending sharply toward the consumption of information and education, as well as entertainment, on digital platforms. in , when asked which electronic technologies, or “screens” they used “too much,” american adults between and named television, not the internet as their chief “time-waster.” yet the survey adds to common sense evidence that the public sphere is increasingly shaped by new media. fifty-eight percent of adults between the ages of and saw themselves as too active on smartphones, as opposed to thirteen percent aged to , and percent in the and over age group. those who felt they used the internet too much: , , and percent. those who felt they spent too much time on social media such as facebook: , , and percent. 
while these numbers could be read in several different ways (for example, that older age groups are less deft with new technologies, are more resistant to change, think that the time they spend on the internet is not wasted, or believe that they have less time to waste), the cultural impact of time spent in front of screens is undeniable. young people raised with technology have simply integrated digital platforms and screens more thoroughly into all their activities, including those that they believe, or have been schooled to believe, are a waste of time. i would also propose that if historians strive to be more publicly engaged than we currently are, we might want our work (whether it is traditional scholarly books and articles, narrative and mapping done on web platforms, community history, or writing for a popular audience) to be done in a virtual world where people are both working and wasting time. this is a good place to pause and ask, in particular, what the move to screens might mean for feminists, for female-bodied people, for the study of gender and sexuality, and for this journal as it moves into its next twenty-five years. has the internet made a difference to the practice of women’s history? if so, what difference has it made? and what do journals devoted to the history of women, gender, and sexuality need to be thinking about as they look to the future? let’s start with the bigger picture. the first thing that probably should come to mind is the likelihood that the journal you may be holding in your hand right now will not always be printed on paper and delivered in the mail. already many of us receive email links to journals we subscribe to, followed a week or so later by a heavy, plastic-wrapped tome printed on expensive, acid-free paper. while paper will still be with us for some time, the writing, so to speak, is on the (facebook?) wall. historians should soon expect to have the option to receive journals only through the ether. 
those of us who have access to a university library don’t need subscriptions now: we simply log in to jstor, or project muse, to get what we want, or what we think we need. most libraries simplify this task by making table of contents services available to their faculty: a simple email alerts the recipient to a new issue. like the effect of mp music files on the concept of a unified record album, the trend towards on-line delivery will accelerate mash-up style journal consumption. scholars will read without regard to editorial themes, sequencing, clusters, and introductions, unless these files are “bound” to each other in such a way as to prevent separating them. one solution is to make a journal deliverable in the manner of mass-market magazines or e-books, in which the reader can use hyperlinks to read around the volume, but the issue is delivered as a coherent whole. at the most recent meeting of the american historical association, president william cronon announced that the aha would begin delivering its journal through an app, or internet application, sometime in (the conference program had been delivered on paper, in a pdf, and in a very popular smartphone app). technological changes are not cheap and represent a new state of play that may hit history journals that are not backed up by steep professional membership dues in the pocketbook. this currently implicates nearly all journals in the fields of women’s, gender, queer, and sexuality studies based in the united states. journals also will experience differently the increasing cultural pressure on universities and scholarly organizations to make our work freely available. this is something some universities do through digital commons platforms, like bepress.com, and individual scholars can accomplish by uploading pdf files of their work to a website made from a simple wordpress blog platform. but recirculating our own work on the web is not always legal. 
most history journals, which have contracts with subscription services like jstor and project muse, currently ask authors to sign a contract that restrains republication for a year or more. while the problem of access is not particular to women’s history as a field, i would argue that feminists and queer scholars have a special responsibility to address it because of our attachment to political communities outside the university. in other words, our intellectual principles have always been animated by our progressive commitments to social justice. attending to the project of writing and publishing across national, class, and racial lines is, arguably, a task that is complete only when our scholarship is returned to the communities where the research originates. freedom of access needs to be matched by attending to the linguistic barriers that divide scholars from each other, and from their global publics. language barriers prevent scholars and citizens around the world from accessing primary and secondary material that may be easily available on line, but is unreadable. as historians alice yang and alan s. christy, co-directors of the university of california santa cruz center for the study of pacific war memories, have argued, antagonistic nationalisms thrive on linguistic barriers that allow each country’s narrative to go unchallenged by another’s. however, as they point out, “demanding mastery of multiple languages and historiographies” to pursue truly transnational projects “is unrealistic.” digital technology that translates the work of collaborators and history consumers across many national publics “can effectively achieve this goal…. the key to this is not erasing the language barrier but making the language barrier visible and negotiable.” transnational and translingual collaborations may also require a necessary shift away from the history profession’s outdated valorization of the sole author. 
recognition of the scholarly value of new media and digital publication does not promote, as some have put it, the “death of the book” (after all, literary theorist roland barthes announced the death of the author almost forty years ago and that seems to have passed); nor does the possibility of collaborating in virtual space mean that scholarly meetings will become redundant. while some may prefer an ereader, almost nobody prefers to read a book on a computer screen. most recently, the demise of print was bandied about at a history convention that had an entire exhibit devoted to displaying thousands of books, hundreds of which had been published that year. nor, despite the surge in popularity of online teaching and massive open online courses (moocs), should we imagine that digital and new media would necessarily replace face-to-face pedagogy. what we do know, however, is that academics (many of whom are not particularly interested in digital history as a practice) have, in the past fifteen years, flocked to social media and blogging as a new way of making community. this has been occurring even as hiring and tenure practices, and the turn to contingent labor, have undermined older forms of community like the department and the campus. new media has also become a way in which feminist communities organize themselves at a time in which women’s, gender, and sexuality studies programs, founded to organize against gender inequities within the university, have ceased playing an activist or mentoring function on many campuses. what do these new forms of community look like? as one example, on labor day weekend, , feminist academic bloggers notorious ph.d. and another damned medievalist put out a call for participants in their second on-line writing group. 
the two women, each a long-time presence in the blogosphere, proclaimed that the network of writers who would be hosted on both their blogs had been “founded as a virtual alternative to those dissertation writing groups that many of us benefited from when we were grad students, but that seem to disappear as we move into jobs.” members were instructed to dispense support and advice, and establish discussion topics, but slackers would not be tolerated. “the main commandment here,” they wrote, “is thou shalt commit.” if you blog it, they will come. by september , fifty-six historians and literary and cultural studies scholars had signed up for what became known as another damned notorious writing group. each listed a project, or in some cases a series of tasks, that they wanted to complete during the virtual group’s twelve-week life span. as another damned medievalist warned, not showing up for two weeks would cause a member to be dropped. “obe happens. yes, we are all at times overtaken by events,” and the most common response to that was to stop writing. “if shit happens, then you have two choices. you cannot make any progress, and drop out of the group. or you can write in and say you have been obe, and then use the group as a reason to think through how you are going to deal with it. i’m voting for the latter.” this kind of tough love was once only possible in real life (or as we say in the blogosphere, irl). now history is being written, archived, posted, researched, and lived online, with the added advantage that scholars can be more open about what, and who, sabotages their work than colleagueship permits.
however, email, counter to andrew mcmichael’s prediction, has, by , become the internet’s “mixed message.” a worse time waster than facebook, lol cats, and online shopping, email nags at our consciousness, drives us to wi-fi free locations and internet blocking software, and leaves us wringing sore hands that have not written a scholarly word. a search for the phrase “managing email” at the chronicle of higher education in january turned up over articles devoted to lifting the burden of daily, or hourly, communication. “with so many messages coming in, many people on campuses are feeling a sense of overload,” one blog announced. “the tech therapy team talks with brett foster, an associate professor of english at wheaton college, in illinois, about his experiment in keeping his inbox to zero each day.” academic computing technology often seems to generate more work than it is worth, particularly for those who doubt their ability to learn something more complex than email, and has eased the way to forms of labor exploitation and speedup. however much our hearts may sink at the most recent administrative exhortation to develop a mooc on the civil war, or announcement of the most recent “student success software,” the internet has had a far greater capacity to create community and generate creative, shareable scholarship than could have been anticipated twenty-five years ago. at the same time, it allows historians to work across the boundaries of the universities where we are appointed, edit journals, organize conferences, and access archives without ever leaving our desks. feminist historians have long prided themselves on the creation of community.
the berkshire conference of women historians, the coordinating council for women in history, the international federation for research in women’s history, the committee on women of the american historical association, the western association of women historians, the southern association of women historians, and the lesbian herstory archives are but a few of the organizations that arose to meet a felt need for intellectual support, mentoring, and mutuality. our presence on the web has surely enhanced that. women’s studies programs, many of which have now become gender, sexuality, or feminist studies programs to mark changes in the discipline, create web communities for the students and faculty they serve. the internet makes all of the organizations and journals that are foundational to feminist scholarship more visible and makes them better resources for young scholars. blogs, which emerged as a powerful social networking platform in the early twenty-first century, have become sites for public and experimental feminist writing, for sharing scholarly and methodical insights, for discussing professional issues, and for interpreting current events. blogs attached to professional websites create a flexible location for spreading news and calls for papers, while those that are attached to the mass-circulation press provide a more inclusive sphere for bringing women and gender history to the attention of a non-scholarly public. finally, new digital technologies have their own history, one that is recent to be sure, but that nevertheless resonates with historical questions of race, class, gender, nationalism, and sexuality that are at the heart of a feminist intellectual enterprise. the recent history of new media will build on established questions and patterns in intellectual, cultural, and business history that frame the economic progress of women in relation to the gendering of work itself.
scholars may also want to weave their findings into the long labor history of the american west, and comparative economic frontier industries in canada, australia, and south africa. as historian stephen pitti has argued, companies like facebook, google, and apple thrived in the so-called “silicon valley” as only the most recent entrepreneurs to displace and exploit working class mexican-americans. apple, for example, has also made enormous profits at the expense of a predominantly female, chinese factory labor force that rioted in in response to oppressive work conditions. feminist historians have an important task before them in examining the proto-pioneer, masculine progress narratives that portray successful internet entrepreneurs as establishing a newly democratic public sphere on and off the web. in this contemporary history, the industry had a sexual hierarchy, but no longer does. male aggression has given way to the uniquely civilizing influence of american women in a bricks and mortar workplace far from the sites of neo-colonial exploitation. experience, a job bank and dotcom networking site founded in , imagines an industry now come to full maturity, one that encourages a diverse workforce to “work hard and play hard” in rational settings that abhor sexism, racism, and homophobia. in reality, women have succeeded in new media and technology careers in this new west much as they did in the old west, or indeed, anywhere: by educational, racial, and class status; and in relation to whether the work itself has always been, or has become, gendered in such a way as to allow “women” to excel at it. as one experience blogger notes, although an earlier generation had to fight for status, women now have a distinct advantage over men because of “the attributes they bring to their jobs” (communication skills, consensus building, and intuition).
the feminist historian asks: is it possible to achieve gender equality on any shop floor, digital or irl, that shapes itself around the “essential qualities” of masculinity and femininity? here, the digital brings nothing intellectually new to the table, even though evidence collection and preservation on the internet may require new skills. whether in automobile and electrical factories, where women kept their jobs in the upholstery shops after world war ii because their fingers were said to be more nimble than men’s, or on the editorial boards of mass-market magazines, new employment opportunities for women have often disguised more permanent forms of highly gendered, racialized, and class inequality. early research on the technology boom has indicated, for example, that if well-educated women do well in a digital world, working-class women do not. pink-collar workers in technology-driven service industries, the core of the neoliberal u.s. economy, have access to less modern, hand-me-down equipment and are inadequately educated to use it. this results in low pay, little or no advancement, and “difficult if not impossible work environments” for working-class women. in addition to valorizing masculine adventurism, cheating, sexual conquest, and misogyny, dotcom hagiographies ignore these persistent inequalities. the american journalist ben mezrich’s best-selling account of mark zuckerberg and the founding of facebook, the basis for an award-winning movie, the social network, is one example. appearing in its first edition with a martini and a piece of red lingerie on the cover, mezrich’s zuckerberg and his male cronies celebrate each technological and financial victory by seeking out beautiful women who are glad to open their, er, arms to the triumphant nerds. in a less sensational register, katherine losse describes her years as one of the first, and few, female employees of facebook in the boy kings: a journey into the heart of the social network.
an almost perversely asexual observer of adolescent male social aggression, losse’s facebook is an endless “bromance,” a frat house with free food, free toys, flipped work and sleep schedules, and free concert tickets. male engineers are at the top of the hierarchy, while the less financially privileged female support staff perform the metaphorical housework and practical tasks that connect the company to its users. since its inception, women’s history has expanded its portfolio to meet new challenges, new opportunities, and new subjects. too many historians continue to privilege email, online journal searches, and guilty consultations with wikipedia, while ignoring scholarly developments in digital scholarship and new media. one immediate, and unnecessary, consequence of this is that departments are reluctant to educate, evaluate, hire, or tenure scholars in a field that is not only well established, but may represent great potential for training future phds for jobs in the university, journalism, corporate and independent publishing, secondary education, it and archives management, and public service. this makes it ever more important that scholars, like those assembled on the editorial board of jwh, provide the space, the prestige, and the creative publishing opportunities that feminist scholars in digital history around the globe require to move forward in their work and careers. these unexploited opportunities beckon to the leadership that feminist historians have shown in the past as they have remade the university around principles of inclusion, progress, and cutting edge interdisciplinary work. feminist historians, like other specialists who came out of the progressive social history tradition, can set an example by firmly moving history towards the digital, just as jwh shrewdly moved to decenter the united states and u.s.-based historians in its own pages fifteen years ago.
not eliminating but re-imagining print, and with it the power of conventional narratives and textual styles linked to the production of codex narratives, has the potential to transform and re-energize history as a scholarly, public, and politically relevant practice.

notes

leo beranek, “roots of the internet: a personal history,” massachusetts historical review. ( ), . leila rupp, “editor’s note,” journal of women’s history , no. (spring, ): – . richard jenson to campbelld@apsu.bitnet, june , humanist discussion group, history net lists / , no. , http://lists.village.virginia.edu/lists_archive/humanist/v / .html; and anthony grafton, “roy rosenzweig: scholarship as community,” clio wired: the future of the past in the digital age (new york: columbia university press, ), ix–xx. by , brier, brown, and rosenzweig had also produced a cd-rom for the first edition of the united states social history textbook, who built america? (new york: bedford books, ) and the website, history matters, http://historymatters.gmu.edu/. i can’t help but point out that google scholar, an indispensable tool, was launched as recently as . andrew mcmichael, “the historian, the internet and the web: a reassessment,” perspectives (february ), http://www.historians.org/perspectives/issues/ / / vie .cfm. this is my speculation based on observations made by melanie r. shell-weiss, marilyn levine, kriste lindenmeyer, paul turnbull, and kelly a. woestman, some of the founding members of h-net on “h-net and the disciplines,” th american historical association annual meeting, new orleans, la, january . ibid. gail drakes, “who owns your archive? historians and the challenge of intellectual property law,” in doing recent history: on privacy, copyright, video games, institutional review boards, activist scholarship, and history that talks back, ed. claire potter and renee romano (athens: university of georgia press, ), – .
for a contemporary history of the u.s.-led invasion of iraq that began life as a blog, see riverbend, baghdad burning: girl blog from iraq (new york: the feminist press, ). jeff nunokawa, of princeton university’s english department, has been writing facebook notes for several years that combine photographs, scraps of literature, and commentary about the feelings these quotations summon or encapsulate. this project was the object of a new yorker “talk of the town” piece; see rebecca mead, “earnest,” the new yorker, july . mead’s piece will soon appear as a book. frank newport, “u.s. young adults admit too much time on cell phones, web,” gallup economy, april , http://www.gallup.com/poll/ /young-adults-admit-time-cell-phones-web.aspx. one study argues young people are leaving facebook because their parents are going on it. in , the fastest growing demographic on facebook was in its mid-thirties and forties; see mary madden, “older adults and social media,” pew internet, august , http://www.pewinternet.org/reports/ /older-adults-and-social-media.aspx; and sherry turkle, alone together: why we expect more from technology and less from each other (new york: basic books, ), – . alice yang and alan s. christy, “eternal flames: the translingual imperative in the study of world war ii memories,” in doing recent history: on privacy, copyright, video games, institutional review boards, activist scholarship, and history that talks back, ed. claire potter and renee romano (athens: university of georgia press, ), – , . roland barthes, image, music, text, trans. stephen heath (new york: hill and wang, ), – . leah price, “dead again,” new york times, august ; christopher mims, “the death of the book has been greatly exaggerated,” mit technology review, september , http://www.technologyreview.com/view/ /the-death-of-the-book-has-been-greatly-exaggerated/.
“virtually a historian: blogs and the recent history of dispossessed academic labor,” historical reflections/reflexions historiques , no. (summer ): – . the adventures of notorious, ph.d., girl scholar, “writing group, fall term: call for projects,” september , http://girlscholar.blogspot.com/ / /writing-group-fall-term-call-for.html. blogenspiel, “another damned notorious writing group is called to order,” september , http://anotherdamnedmedievalist.wordpress.com/ / / /another-damned-notorious-writing-group-is-called-to-order/; and the adventures of notorious ph.d., girl scholar, “writing group : pacing,” september , http://girlscholar.blogspot.com/ / /writing-group-week- -pacing.html. jeffrey r. young, “academics struggle with managing email,” chronicle of higher education, january , http://chronicle.com/blogs/techtherapy/ / / /episode- -academics-struggle-with-managing-e-mail/. joan greenbaum, windows on the workplace: technology, jobs and the organization of office work (new york: monthly review press, ). leila j. rupp, “at the turn of the millennium,” journal of women’s history , no. (spring ): – , . stephen pitti, the devil in silicon valley: northern california, race and mexican americans (princeton: princeton university press, ); and adam gabbatt, “foxconn workers on iphone line strike in china, rights group says,” the guardian, october , http://www.guardian.co.uk/technology/ /oct/ /foxconn-apple-iphone-china-strike. amy marcott, “a level field at last? women and the internet,” experience, http://s b.experience.com/alumnus/article?channel_id=diversity&source_page=additional_articles&article_id=article_ (accessed january ). ruth milkman, gender at work: the dynamics of job segregation by sex during world war ii (champaign-urbana: university of illinois press, ), ; carrie pitzulo, bachelors and bunnies: the sexual politics of playboy (chicago: university of chicago press, ), – ; and mary e.
virnoche, “pink collars on the internet: roadblocks to the information superhighway,” women’s studies quarterly, vol. , no. 3/4 (fall-winter ): – , . ben mezrich, the accidental billionaires: the founding of facebook, a tale of sex, money, genius and betrayal (new york: doubleday, ); and katherine losse, the boy kings: a journey into the heart of the social network (new york: free press, ).

asist panel revised draft

research know-how for research support services: preparing information specialists for emerging roles

sheila corrall, school of information sciences, university of pittsburgh, north bellefield avenue, pittsburgh, pa, scorrall@pitt.edu
mary anne kennan, school of information studies, charles sturt university, locked bag , silverwater nsw, mkennan@csu.edu.au
dorothea salo, school of library & information studies, university of wisconsin-madison, north park street, madison, wi, salo@wisc.edu

abstract
the panel will discuss the importance of understanding the research environment for providing effective information and technology support to researchers, and the implications for curricula in professional education. our specific context is growing involvement of academic libraries and information services in managing research data, but the issues raised have wider implications for educating and developing other information specialists (e.g., in research institutes, government agencies, public libraries). studies in the past five years have identified technical and discipline-related skills and knowledge gaps as potential constraints on developing library research data services.
our recent research in australia, new zealand, the uk, and ireland confirmed the need for data curation and technology skills, but also found practitioners engaging in other forms of research support, and expressing needs for a multilayered introduction to the research environment, extending beyond the research skills typically gained in masters programs, including subjects such as academic culture and practice, and research policy and evaluation. the panelists represent a mix of academic and practitioner viewpoints from different countries. they will each offer their views on what is missing and should be added to graduate curricula, and how programs can make space, asking the audience to respond with their own suggestions, counter-arguments, and alternative visions, using an interactive style from the start. keywords academic libraries, curriculum development, data management, information services, professional education, research support. introduction political, economic, and technological developments in the past decade have renewed interest globally in the role of libraries and librarians in supporting research (auckland, ; bourg, coleman, & erway, ; maccoll & jubb, ; webb, gannon-leary & bent, ). the most frequently discussed area is the curation and management of data from e-research (garritano & carlson, ; henty, ; hey & hey, ; lewis, ; lyon, ; salo, a; soehner, steeves & ward, ; tenopir, birch & allard, ; tenopir, sandusky, allard & birch, ), but libraries are also raising their profile with involvement in areas such as institutional repositories (cassella & morando, ; horwood, sullivan, young & garner, ; kennan & kingsley, ; salo, ), scholarly publishing (adema & schmidt, ; crow et al., ; hahn, ), and bibliometrics (ball & tunger, ; drummond & wartho, ; hendrix, ). 
a recurring theme of such discussions is the need to re-skill or up-skill the information workforce to provide higher-end research support (auckland, ; henty, ; lewis, ; lyon, ; tenopir et al., ). funding from the institute of museum and library services has supported new modules, courses, specializations, and programs in the us to prepare practitioners for digital curation and data management (harris-pierce & liu, ; keralis, ), but developments in the uk and other countries have been slower (cox, verbaan & sen, ; pryor & donnelly, ). most reports of funded curriculum initiatives have concentrated on the technical aspects of data management, but some have also highlighted the need for practitioners to understand the research process and policy context (cox et al., ). others have called for more discussion among educators and practitioners to determine future curriculum content and presentation (harris-pierce & liu, ; lewis, ). our own recent study of developments in library support for research in australia, new zealand, the uk and ireland (corrall, kennan & afzal, ) confirmed the need for technical knowledge and ict skills development, but also pointed up significant needs in the areas of research processes, research methods, research workflows, and policy contexts (national and institutional agenda). asist , november - , , montreal, quebec, canada. problem statement and research questions academic libraries are responding to political, economic, and technological challenges in the research environment with service innovations in areas such as bibliometric support for research evaluation, and planning for the curation and management of digital research data.
a survey of library practitioners in australia, new zealand, the uk, and ireland found competency needs for delivering the desired services were broader and deeper than formerly acknowledged, including subjects that are not typically the focus of current curriculum initiatives, such as academic culture and practice, digital scholarship, education and research policy, intellectual property and licensing, and research assessment and evaluation, to provide a fuller understanding of the context for service development. discussion representing the viewpoints of both academic educators and professional practitioners is needed to debate future directions for library and information science curricula to meet the needs identified. insight gained from such a discussion can be used to inform the planning and design of both preparatory professional education programs and continuing professional development courses, and/or to suggest lines of inquiry for further investigation. the central question is:
• what do library and information professionals need to know about research to provide effective support in the e-research environment, e.g., methodologies, policies, processes, workflows?
related subsidiary questions include:
• what additional subjects must be included in graduate library and information science program curricula, e.g., as required or elective courses?
• what subjects or courses will be dropped to make space for the subjects identified?
• should all library and information students undertake an empirical research project as preparation for research support roles?
• how can practitioners in the field get the research know-how required for effective support services?
key issues for discussion there is growing acknowledgment that students and practitioners in the library and information domain need education and training in technical aspects of digital data curation to enable research libraries to support institutional expectations in the area of research data management. around one-third of ala-accredited mlis programs have offered a course on data curation within the last three years, of which almost half offered a concentration or specialization in the subject (harris-pierce & liu, ). there has been less discussion about the background knowledge and understanding of the research arena needed to complement the technical skill sets already defined. the first task here is to identify aspects of the research environment that should be part of the core knowledge base for information specialists in research support roles. the next issue is the breadth and depth of treatment required for the topics identified, to determine how many courses might be needed. a key question is whether graduates aiming for research support roles should carry out a small-scale project to gain fuller understanding of the process of research. another issue is whether the subject matter deemed essential for practitioners specializing in research support should be part of the core curriculum, or only offered as a specialist track or program. a final important matter for debate is how to enable practitioners already in the field to update their knowledge and skills in a way that fits the demands of their jobs and personal/financial circumstances. the panel will work interactively to engage the audience throughout the session. the chair will put each specified question in turn first to the panel members (rotating the order in which they speak) and then to the audience, inviting participants to provide their own perspectives on the questions and responses, or offer counter-arguments and alternative proposals. 
panelists the panel members will provide complementary and contrasting expert views on the questions for debate, drawing on their varied backgrounds and experiences. sheila corrall sheila corrall is professor and chair of the library and information science program at the university of pittsburgh school of information sciences, where she teaches courses on research methods and academic libraries. she was formerly head of the university of sheffield ischool, and served as director of library and information services at three universities in the uk. her research interests focus on the evolving roles and competencies of library and information professionals, and their education, training, and development needs. recent work includes a review of the roles and responsibilities of libraries and librarians in the research data arena (corrall, ), an analysis of evolving academic library specialties (cox & corrall, in press), and a survey of research support services in academic libraries in australia, new zealand, the uk, and ireland (corrall et al., ). she serves on the editorial boards of education for information, information research, international journal of digital curation, and new review of academic librarianship. she also serves on the advisory panel of the jisc-funded rdmrose project, which is developing learning materials about research data management for liaison librarians in university libraries, both for the continuing professional development of practitioners and for embedding into graduate curricula. corrall will introduce the panel, and chair the discussion, in addition to responding to the questions from the viewpoints of research, education, and practice in the us and the uk. mary anne kennan mary anne kennan is a senior lecturer in the school of information studies at charles sturt university, australia, where she teaches courses on the digital environment, research data management and research methods. 
her research interests build on her phd, which focused on scholarly communication, institutional repositories, and open access, moving into the broader areas of e-research and research data management, including the practices of sharing and collaboration. recent work includes the survey with corrall of research support services in academic libraries in australia, new zealand, the uk, and ireland (corrall et al., ). other recent projects have investigated the management and sharing of volunteer-collected data (kennan, williamson & johanson, ) and the role of institutional mandates in promoting open access (kennan, ). her previous experience includes years working in libraries and the information world, including serving as director of the frank lowy library at the australian graduate school of management in sydney. she has also taught at the university of new south wales and the university of technology sydney. she is joint editor of australian academic and research libraries and serves on the editorial board of the international journal of actor-network theory and technological innovation. kennan will respond to the questions from the viewpoints of research, education, and practice in australia, drawing on her experience of online distance education for librarians. dorothea salo dorothea salo is a faculty associate in the school of library & information studies at the university of wisconsin-madison, where she teaches courses on digital curation; digital trends, tools, and debates; libraries and publishing; organization of information; and research-data management for graduate students. she also works with libraries and librarians as an independent consultant, specializing in scholarly communication and data curation. salo formerly worked as digital repository librarian and research services librarian at the university of wisconsin, and as digital repository services librarian at george mason university.
relevant publications include a review of the roles and responsibilities of libraries and librarians in institutional repository development (salo, ), an examination of intellectual property ownership in the e-research environment (salo, b), and an analysis of the challenges facing libraries in adapting their technical infrastructures for research data management (salo, a). salo will respond to the questions from the viewpoints of practice, training, teaching, and consultancy, drawing on her experience of working with researchers and students from different disciplinary backgrounds, in the us. references adema, j., & schmidt, b. ( ). from service providers to content producers: new opportunities for libraries in collaborative open access book publishing. new review of academic librarianship, (suppt. ), - . retrieved april , , from http://www.tandfonline.com/doi/pdf/ . / . . auckland, m. ( ). re-skilling for research: an investigation into the roles and skills of subject and liaison librarians required to effectively support the evolving information needs of researchers. london: rluk research libraries uk. retrieved april , , from http://www.rluk.ac.uk/content/re-skilling-research ball, r., & tunger, d. ( ). bibliometric analysis – a new business area for information professionals in libraries? support for scientific research by perception and trend analysis. scientometrics, , - . bourg, c., coleman, r., & erway, r. ( ). support for the research process: an academic library manifesto. dublin, oh: oclc research. retrieved april , , from http://www.oclc.org/content/dam/research/publications/library/ / - .pdf carlson, j. r., & garritano, j. r. ( ). e-science, cyberinfrastructure, and the changing face of scholarship: organizing for new models of research support at the purdue university libraries. in s. walter & k. williams (eds.). staffing, sustaining, and advancing the academic library in the st century (pp. - ).
chicago, il: association of college and research libraries. cassella, m., & morando, m. ( ). fostering new roles for librarians: skills sets for repository managers – results of a survey in italy. liber quarterly, , - . retrieved april , , from http://liber.library.uu.nl/publish/articles/ /article.pdf. corrall, s. ( ). roles and responsibilities: libraries, librarians and data. in g. pryor (ed.). managing research data (pp. - ). london: facet. corrall, s., kennan, m. a., & afzal, w. ( ). bibliometrics and research data management: emerging trends in library support for research. library trends, , - . cox, a., & corrall, s. (in press). evolving academic library specialties. journal of the american society for information science and technology. cox, a., verbaan, e., & sen, b. ( ). upskilling liaison librarians for research data management. ariadne, . retrieved april , , from http://www.ariadne.ac.uk/issue /cox-et-al crow, r., ivins, o., mower, a., nesdill, d., newton, m., speer, j., watkinson, c. ( ). library publishing services: strategies for success, final research report (version . ). washington, dc: sparc. retrieved april , , from http://wp.sparc.arl.org/lps/ drummond, r., & wartho, r. ( ). rims: the research impact measurement service at the university of new south wales. australian academic & research libraries, , - . retrieved april , , from http://www.alia.org.au/publishing/aarl/ /arrl.vol .no . .pdf garritano, j. r., & carlson, j. r. ( ). a subject librarian's guide to collaborating on e-science projects. issues in science and technology librarianship, . retrieved april , , from http://www.istl.org/ - spring/refereed .html# hahn, k. k. ( ). research library publishing services: new options for university publishing. washington, dc: association of research libraries. retrieved april , , from http://www.arl.org/bm~doc/research-library-publishing-services.pdf harris-pierce, r. l., & liu, y. q. ( ).
is data curation education at library and information science schools in north america adequate? new library world, , - . hendrix, d. ( ). tenure metrics: bibliometric education and services for academic faculty. medical reference services quarterly, , - . henty, m. ( ). developing the capability and skills to support e-research. ariadne, . retrieved april , , from http://www.ariadne.ac.uk/issue /henty hey, t., & hey j. ( ). e-science and its implications for the library community. library hi tech, , - . horwood, l., sullivan, s., young, e., & garner, j. ( ). oai compliant institutional repositories and the role of library staff. library management, , - . kennan, m. a. ( ). learning to share: mandates and open access. library management, , - . kennan, m.a., & kingsley, d.a. ( ). the state of the nation: a snapshot of australian institutional repositories, first monday, ( ). retrieved april , , from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/ fm/article/view/ / . kennan, m. a., williamson, k., & johanson, g. ( ). wild data: collaborative e-research and university libraries. australian academic and research libraries, , - . keralis, s. d. c. ( ). data curation education: a snapshot. in l. jahnke, a. asher, & s. d. c. keralis. the problem of data (pp. - ). washington, dc: council on library and information resources. retrieved april , , from http://www.clir.org/pubs/reports/pub /pub .pdf lewis, m. ( ). libraries and the management of research data. in s. mcknight (ed.). envisioning future academic library services: initiatives, ideas and challenges (pp. - ). london: facet. lyon, l. ( ). the informatics transform: re-engineering libraries for the data decade. international journal of digital curation, ( ), - . retrieved april , , from http://www.ijdc.net/index.php/ijdc/article/view/ / maccoll, j., & jubb, m. ( ). supporting research: environments, administration and libraries. dublin, oh: oclc research. 
retrieved april , , from http://www.oclc.org/resources/research/publications/lib rary/ / - .pdf pryor, g., & donnelly, m. ( ). skilling up to do data: whose role, whose responsibility, whose career? international journal of digital curation, ( ), - . retrieved april , , from http://www.ijdc.net/index.php/ijdc/article/view/ salo, d. ( ). innkeeper at the roach motel. library trends, , - . retrieved april , , from https://www.ideals.illinois.edu/handle/ / salo, d. ( a). retooling libraries for the data challenge. ariadne, . retrieved april , , from http://www.ariadne.ac.uk/issue /salo salo, d. ( b). who owns our work? serials, , - . retrieved april , , from http://uksg.metapress.com/content/l u tqlh j v l/ful ltext.pdf soehner, c., steeves, c., & ward, j. ( ). e-science and data support services: a study of arl member institutions. washington, dc: association of research libraries. retrieved april , , from http://www.arl.org/bm~doc/escience_report .pdf tenopir, c., birch, b., & allard, s. ( ). academic libraries and research data services: current practices and plans for the future, an acrl white paper. chicago, il: association of college & research libraries. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/p ublications/whitepapers/tenopir_birch_allard.pdf tenopir, c., sandusky, r. j., birch, b., & allard, s. ( ). academic librarians and research data services: preparation and attitudes. ifla journal, , - . retrieved april , , from http://www.ifla.org/files/assets/hq/publications/ifla- journal/ifla-journal- - _ .pdf webb, j., gannon-leary, p., & bent, m. ( ). providing effective library services for research. london: facet. asist panel revised draft.
fostering effective data management practices at leiden university
verhaar, p.a.f.; schoots, s.p.; sesink, l.; frederiks, f. ( ). article / letter to editor. editor(s): verhaar p.a.f., schoots s.p. journal: liber quarterly. doi: . /lq. link: https://doi.org/ . /lq. persistent url of this record: https://hdl.handle.net/ / open access. leiden university scholarly publications; collections: centre for the arts in society (lucas), leiden university libraries (ubl).

the best, the worst, and the hardest to find: how people, mobiles, and social media connect migrants in(to) europe

maren borkert , karen e. fisher , , and eiad yafi

technical university berlin, germany; university of washington, usa; newcastle university, uk; Åbo akademi university, finland; universiti kuala lumpur, malaysia

corresponding author: maren borkert, technical university berlin, strasse des . juni , h , berlin, germany. email: m.borkert@tu-berlin.de

research article. si: forced migrants and digital connectivity
social media + society january-march : – © the author(s)
doi: . / https://doi.org/ . /
journals.sagepub.com/home/sms
reprints and permissions: sagepub.co.uk/journalspermissions.nav

creative commons non commercial cc by-nc: this article is distributed under the terms of the creative commons attribution-noncommercial . license (http://www.creativecommons.org/licenses/by-nc/ . /) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the sage and open access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

abstract

for displaced people, migrating into europe involves highly complex information needs about the journey and destination. each new need presents problems of where to seek information, how to trust or distrust information, and financial and other costs. the outcomes of receiving poor or false information can include bodily harm or death, loss of family, or financial ruin. we aim to make two major contributions: first, we provide rich insights into digital literacy, information needs, and strategies among syrian and iraqi refugees who entered europe in , a topic rarely dealt with in the literature. second, we seek to change the dominant perspective on migrants and refugees as passive victims of international events and policies by showing their capacities and skills to navigate the complex landscape of information and border regimes en route to europe. building on research at za’atari refugee camp (jordan), we surveyed arab refugees in two centers in berlin. analyses address refugees’ temporal information worlds, focusing on the importance and difficulty in finding specific information, how migrants identify mis- and disinformation, and the roles of information and technology mediaries. findings illustrate the digital capacities refugees employ during and after their journey to europe; they show social support via social media and highlight the need for a radical shift in thinking about and researching migration in the digital age.

keywords

refugees, displaced people, syria, media use, information texture, border landscape, social connection, icts, digital literacy, misinformation, disinformation

introduction

each year, millions of people are forced to leave their homes due to disasters, climate change, persecution, or armed conflict. in , the massive influx of refugees from crisis-ridden countries such as syria, afghanistan, and iraq into european territory revealed fundamental political controversies among the european union (eu) member states and led to the collapse of the european border system. soon the alleged “european migration crisis” topped european and international newspapers. in this somewhat heated atmosphere, german chancellor angela merkel ordered a temporary open-door asylum policy by famously noting “wir schaffen das” (“we can do this”; merkel, , p. ). her decision to suspend eu rules on registering asylum seekers in the first eu state they entered was aimed at the growing number of syrians fleeing the conflict in their country. in reality, though, significant numbers of people from other countries took the opportunity to enter, too. in this situation, the city of berlin, the capital of the german “refugees welcome” policy, soon became a hot spot for the european controversies over the humanitarian crisis.

in berlin, all arriving asylum seekers had to register with one central reception facility, the regional office for health and social affairs (lageso). as one of the authors witnessed on a daily basis, hundreds of refugees, both adults and children, exhausted and partly traumatized by war, expulsion, and months of flight, queued every day to be registered. it soon became evident that the rush of refugees from syria, the balkan regions, afghanistan, and iraq hopelessly overwhelmed the lageso. the office was capable of issuing waiting numbers for about – people and processing around applications a day, yet on just day , refugees could line up to register there. in the midst of the german “refugee crisis,” these scenes made berlin a sad synonym for official failure. without the numerous, mostly female, volunteers and the countless helpers in secular and ecclesiastical organizations, who largely worked until exhaustion, the humanitarian and administrative crisis would have been much more intense (bundesamt für migration und flüchtlinge [bamf], ; karakayali & kleist, ). this clumsy handling of the influx of hundreds of thousands of refugees, followed by the new year’s eve sexual assaults attributed to asylum seekers, inter alia in cologne, and a string of islam-inspired terror attacks, changed the mood in the country (stinauer, ). rapidly losing support, chancellor merkel took a tougher approach to asylum seekers in germany, (re-)installing measures to reduce their numbers with the help of international partners such as turkey. the public backlash brought anti-immigrant sentiment and rhetoric to the fore, facilitating, among other things, the fast rise of the german right-wing populist party alternative for germany (afd), largely unknown before.
in an attempt to balance the public discussion, which was inspired by fears of uncontrolled immigration and subsequent financial burdens, with scientific evidence, we conducted research on those who arrived, with a view to their abilities and agency. to add a new angle, we focused on digital literacy and the complex information needs that migrants, and refugees in particular, face during their journey and at their destination in europe. in the remainder of this article, we introduce the context of the research in berlin; discuss social media, the concept of digital literacy, and relevant research on refugees; and share our findings with regard to the survey methodology.

asylum seekers in the city of berlin

in , germany received an unprecedented number of individual first asylum applications, with , applications submitted to the bamf, the german federal office for migration and refugees: , (or %) more than in the previous year (bamf, ). the , asylum seekers from syria comprised % of all applications. among the top countries of origin, are from the balkan regions: albania, kosovo, serbia, and macedonia, evidencing that the open-door asylum policy toward syrian refugees attracted nationals from other countries as well. however, the actual number of asylum seekers entering germany was significantly higher, since the formal application for asylum is processed with a time delay, and some who submitted and were distributed (in)to germany moved to other eu countries. in the easy (“erstverteilung von asylbegehrenden,” initial distribution of asylum seekers) system, for instance, around . million entries of asylum seekers were registered nationwide (bamf, ).

in berlin, , asylum seekers, . %, were settled in (bamf, ) with notable impact on the city. as the statistical office of berlin-brandenburg states, the overall growth of berlin in terms of population was driven by the influx of foreigners, the first such instance in years. indeed, of berlin’s . million registered population in , every third berliner came from abroad or was german with “migration background.” in / , syrians ( , ) became the third largest foreign group in berlin, after polish ( , ) and turkish people ( , ; statistical office of berlin-brandenburg, ).

closer examination shows that the majority of asylum applications ( . %) in were by men, whose share was higher in all age groups up to “under years.” only in the “ -year-old and older” category is the proportion of female applicants greater. furthermore, . % ( , ) of asylum seekers are under years old, and almost three-quarters ( . %), namely, , persons, are under age years (bamf, ). thus, the overwhelming majority of asylum seekers in – are male and under years of age, a trend reflected in our study.

the arrival of more than a million refugees and migrants clearly left its mark on german politics and society. all levels of administration, from local communities to regional and national authorities, faced unprecedented challenges, while the question of social equity and burden sharing rose to the fore. the political difficulties of mitigating the “european refugee crisis,” however, should not obscure the challenges and hardships faced by those who came to europe. besides the daunting challenges and the emotional trauma of uncertainty and loss that displaced persons, whether internally (idps) or internationally displaced, face during flight, new difficulties await on arrival. in contrast to the safe haven that displaced persons seek and imagine, they are confronted with cramped living quarters, resource scarcity, and animosity and violence from home- and host-country citizens. this holds particularly true for children and women, who are among the most disadvantaged (marfleet, ). whether within or across national boundaries, displaced persons are forced to choose between bad and worse.
while migration and displacement are increasing worldwide, the use of information and communication technologies (icts) has spread and intensified. driven by the rapid increase in cheap mobiles and services, migrants and displaced persons use mobile and internet technologies in planning departures, managing flight, coordinating with others, and finding their way to new locations. similarly, organizations supporting displaced persons are making far more intensive use of new icts (bishop & fisher, ; vernon, deriche, & eisenhauer, ). from mobiles and social media to crowdsourced mapping, the rapid acceleration of technology use is both benefiting and creating challenges for refugees and service providers alike. for internationally displaced persons, migrating into europe is associated with highly complex information needs about the journey and the destination. each new need presents problems of where to seek information, whom to trust, and financial and other costs, and the outcomes of receiving poor or false information can be as severe as death, loss of family, or financial ruin.

refugees and social media

at the margins of mainstream migration research, literature on forced migration and media use is steadily growing, fueled mainly by three disciplines: migration studies, media and communication studies, and information science. the most prominent strands of literature address transnational migration, e-diasporas, and media landscapes and information worlds of refugees and migrants.

regarding transnational migration, a growing body of research examines how migrants and refugees are using technologies, particularly cell phones, to connect to their countries of origin and, in some cases, to create new relationships and connections in their countries of destination (horst, ; horst & taylor, ; panagakos & horst, ; vertovec, and ). vertovec ( , p. ) calls the widespread use of cell phones among today’s mobile populations “the social glue of migrant transnationalism.” furthermore, mobile phones play a vital role for emotional intimacy. thomas and lim ( ) found that the use of mobile phones among transmigrants enhances their overall well-being, as it facilitates communication and intimacy with loved ones in their countries of origin as well as within their diasporic communities. as madianou and miller ( ) show, mobile phones enable migrated parents to keep a (more) active role in their children’s lives, mediating new forms of digital intimacy. studies on transnational migration also address how mobile phones and the virtual in general stimulate a (new) sense of belonging and support the construction of common experiences and social identities (gajjala, ; hedge, ; parham, ; wilding, ) as well as self-representation (diminescu & loveluck, ). witteborn ( ) emphasizes how new technologies enable people to enhance sociality and build networks. on a more negative note, archambault ( ) suggests that new media may disrupt intimate long-distance relationships when they are used for personal surveillance.

in the context of e-diasporas, diminescu ( , ) argued that migrants cannot be seen as “double absent” (sayad, ) but must be conceptualized as multiply connected. according to diminescu and loveluck ( ), the ubiquitous presence of digital technologies affects all aspects of a migrant’s experience, both pre-entry and post-arrival. even before a migrant enters a new country, the migration journey often starts by going “through the screen,” that is, crossing an informational frontier made up of databases and identification systems such as the schengen information system (sis) to gather information on the desired destination. after arrival, migrants face the early necessity of acquiring a sim card or mobile phone and gaining access to a computer to find work and stay connected with family and friends.
these multiple forms of presence leave traces in the analog and virtual world that, when combined, provide a rich ground for understanding migration trajectories and migrant networks (diminescu and loveluck, ). in addition, georgiou ( , ) and hepp, bozdag, and suna ( ) show how diasporic minority groups use media in complex ways that feed back into how they communicate interests, make claims, and mobilize identities. with an emphasis on youth digital diasporas, leurs and ponzanesi ( ) develop this argument further. they stress that established dimensions for locating a migrant’s feeling of belonging, such as country of origin/country of destination or local/transnational, no longer hold in the hypertextual world of esthetics. in the digital diasporas they inhabit, migrant youth show mutual recognition and express individuality by combining national or “ethnic” affiliations with other, largely transnational, youth subcultures, producing a blend of cultural belonging and hybridized connections that is far more articulate and complex than current theory allows (leurs & ponzanesi, ). similar findings were reported by fisher, yefimova, and bishop ( ), who worked with immigrant and refugee youth from latin america, myanmar, and east africa to understand the roles youth play as information guides on behalf of families, friends, and within communities and social institutions. their work shows that refugee youth are early adopters of technology and serve as linguistic, geographic, and cultural wayfarers on behalf of others; however, youth are often overburdened with the load of unpaid helping, including in social institutions such as schools, and do not share their parents’ views of home culture. providing new evidence, gillespie et al. ( ) and wall, campbell, and janbek ( ) take a more critical stance on new media.
they confirm that new technologies play a crucial role in planning a migration journey and navigating its dangers as well as in a migrant’s protection and empowerment after arrival. yet they warn that the smartphone in particular is a double-edged sword: as a resource, it helps migrants make translations, access vital services (such as legal advice, medical help, and shelter), and keep in touch with families and friends. but the digital traces that migrants leave make them vulnerable to surveillance by state and nonstate actors and to intimidation by extremist groups (gillespie et al., ). coining the term “information precarity,” wall et al. ( ) found that refugees experience information precarity in five forms: in terms of (a) the technological and social access to information; (b) the prevalence of irrelevant, sometimes dangerous information; (c) the lack of their own image control; (d) surveillance by the state; and (e) disrupted social support.

crucial to our research and understanding is the concept of digital literacy as social practice. although the term “digital literacy” had been applied before, its introduction is often attributed to gilster ( ). pointing out the differences between digital information media and conventional print media, he conceptualizes digital literacy as the development of competences in four areas, that is, assembling knowledge, evaluating information content, searching the internet, and navigating hypertext (gilster, ). as lankshear and knobel ( ) emphasize, the most commonly used definitions of digital literacy tend to (a) confine digital literacy almost exclusively to roles concerned with information; (b) reduce interaction with information to assessing its truth (or validity), credibility, reliability, and so on, as a defense against being manipulated; and (c) theorize it as a “thing” or master competence to possess, lack, need, and acquire.
in this vein, digitally literate people are often seen as functioning better in the knowledge-driven economy, while digitally illiterate people are perceived as vulnerable and passive (lankshear & knobel, ). in light of this criticism, concepts such as media literacy and information literacy (koltay, ) or, more recently, multicultural literacy and emerging technology literacy (cordes, ) soon proliferated. but while the first two tend to be applied to describing consumption and not always production of digital content (livingstone, ), the latter appear as subcategories of what is identified as a key ability in the digital age. in our study, we apply a slightly different approach and follow street ( ) in conceiving literacy as “social practices and conceptions of reading and writing.” widening the notion of digital literacy, we assume that many cultural ways of reading and writing exist and that individuals move in and out of multiple ways of reading and writing. in this sense, they become both consumers and producers of digital content as well as active agents in digital practices of online searching and communicating. in practice, this is demonstrated by the differences in how people read and write on a (public) facebook page or in a person-to-person whatsapp communication as well as in their strategies and social practices to identify and handle misleading or wrong information.

methodology: research setting

the city of berlin, where the massive influx of refugees in and caused administrative turbulences, committed support, and the rise of (new?) anti-immigrant sentiment, provided an exceptional location for this study. even after the number of arrivals dropped significantly, due to the highly disputed refugee pact between the eu and turkey, some , refugees were still living in city gyms, the hangars at tempelhof airport, or other emergency accommodation (beikler & vogt, ; kopietz, ; zeit online, ).
at the time of our research, there were two reception centers at storkower street, officially belonging to the district of pankow, where the authors recruited their interview partners in person and through snowballing. visiting the two reception centers for weeks during working time and holidays, the trained arabic-speaking interviewers approached the residents at the facilities and asked about their willingness to participate in the study. to raise the number of female interviewees, the female interviewers turned to conducting the interview not in the open but in the somewhat protected environment of the refugees’ private space. the sample, however, might be skewed toward less mobile refugees (mothers, fathers, and elderly people) who remained in the camp, as those more active were searching the city for (irregular) work and an apartment. typical for berlin, the accommodations at storkower street were situated in a converted former office building, consisting of a collective living facility for refugees who had passed the first admission procedure and had been living there for several months or even years, as well as an emergency shelter for the initial reception of refugees. both facilities accommodate roughly people, of whom around are children and young people. in the first structure, the refugees live in two- to three-bedded rooms and have common sanitary facilities, a kitchen, a common room, and a dining room on each floor. the facility has a play room for children, while a playground is planned in the outdoor areas. refugee children for whom schooling is compulsory attend the nearest available school. if children are unfamiliar with the german language, they are first taught in special welcome classes by teachers provided by the berlin senate department for education, youth and science. in addition to the center manager, four social workers, one administrative employee, one caretaker, and child care worker worked there.
security guards were available hr a day, and every person aiming to enter the facility was requested to identify themselves and register as a visitor. to this day, the two facilities are managed by the protestant youth and welfare office (ejf), supported by a large circle of volunteer workers from the citizens’ initiative “pankow helps.” a third reception center was opened in direct proximity in by a private owner. in an attempt not to make the inhabitants an easy target for rising anti-immigrant sentiment in germany, further sociodemographic details on the accommodated refugees were deliberately withheld.

researching crisis-torn refugees: privacy rights and ethics

interviewing recently arrived refugee adults and children, specifically in refugee accommodation centers, requires special sensitivity and preparation (borkert & de tona, ). as fontes ( ) observes, biases, cultural differences, and linguistic misunderstandings have the potential to exert a powerful influence in interviews with migrants, even when interviewers have the best intentions. guidelines and recommendations for interviewing migrants and refugees, both adults and children, commonly categorized as “vulnerable groups,” are not lacking. there are also good publications on studying social behavior online as well as legal provisions on processing personal data in germany and europe. as ethical decision making is a deliberative process, the authors consulted different people and sources during the research process: regional experts and experienced interpreters, fellow researchers, and people participating in and familiar with the context under study, as well as ethics guidelines and publications in migration and refugee studies and in internet and information research. although principles vary by discipline, some shared basic principles of research ethics and ethical treatment of interview partners can be identified that formed the basis of our methodological approach.
these core principles are based on the fundamental rights of human dignity, autonomy, protection, safety, respect for human beings and particularly children, justice, and the general public interest. we agree with markham and buchanan ( ) that the greater the vulnerability of the interviewee and the community he or she belongs to, the greater the obligation of the researcher to protect the interlocutor and the community involved. to balance harms and benefits, we abandoned interview questions that could potentially have harmed the interviewee or influenced his or her asylum request. in consequence, detailed questions on migration routes to germany or on the context in the country of origin were not included. equally, information on specific websites of interest (urls) was neither archived nor subjected to analysis. as adolescents and adults were interviewed whose first language is not german, the authors arranged for qualified foreign language interpreters ahead of time (fontes, ). we deliberately involved arabic-speaking interpreters (two women and one man), who visited the refugee accommodations in the context of their volunteer and/or professional work. the interpreters were thus known to the refugees inhabiting the facilities as well as to the center managers, which helped create an atmosphere of mutual trust and confidence. because adults, like all humans, are influenced by how they feel physically, the data collection assistants were briefed not to interview refugees who were overly tired, hungry, or unwell. a culturally acceptable snack was made available, and care was taken that the interviewees were comfortable with the room settings. in this context, some female interviewees preferred to be interviewed in the bedrooms inhabited by the family rather than in shared facilities.
according to fontes ( ), rumors, jealousy, privacy, and reputation are often crucial issues in close-knit (ethnic) communities, while the concept of “confidentiality” may not exist in every language. using simple language, the interviewers explained to the interviewees where their information would be shared and with whom. considering that interviews which are held in a warm and friendly way are more likely to produce valid information (davis & bottoms, ), the data collection assistants were asked to approach and assist refugees in a warm, relaxed, supportive, and nonjudgmental manner.

analysis

aiming to use statistics to generalize findings, we developed a questionnaire in english combining migration research with information studies and building on past experiences in research carried out in the za’atari refugee camp in jordan (fisher, ; fisher, yefimova, & yafi, ). the questionnaire was pretested, revised, and again pretested before the actual data collection began. a total of three qualified foreign language interpreters and three arabic-speaking refugees who volunteered for the study were trained to assist with completing the questionnaire and to collect data. using informal interpreters such as family members and friends was avoided to increase the accuracy, confidentiality, and impartiality in interpretation. the assisted survey used a random sampling approach to gather data with individuals. the completed questionnaires were collected and then transferred to an (english) online survey by a bilingual researcher. the questionnaire comprised questions, closed-ended questions, and open-ended questions for which six indices were constructed to guide analysis:
• information needs;
• information seeking and role of icts;
• identifying mis- and disinformation;
• role of information mediaries;
• social and economic inclusion factors of migrants in host communities.
data were analyzed using nonparametric statistics, disaggregated by age and gender, and content analysis. in terms of positionality, the co-authors bring different disciplinary strengths and insights to the study. german sociologist maren borkert has extensive international experience in interdisciplinary and transdisciplinary research and communication. she works at the intersection of business studies, innovation, and computation and aims at introducing digital methods to the study of migration, inclusion, and entrepreneurship. karen fisher is an information scientist specializing in info-sociological aspects of people and information. engaged at the unhcr za’atari syrian refugee camp by the jordan/syrian border since , her field experience with people displaced by conflict zones builds on years working with displaced migrants in the united states. eiad yafi is a computer scientist with extensive experience in information and communication technologies for development (ict4d) with a focus on icts for sustainable education and immigrants. from homs, syria, his family are members of the syrian refugee community and e-diaspora.

findings

demographic data included nationality, age, gender, civic status, and country of stay. of the participants, % were male and % female; all were between and years old with % being youth (age: – years). for respondents under years of age, interviewers made sure to obtain parental or guardian permission before conducting the interview. approximately % were married with % single, % divorced, and % widowed. the majority were from syria ( %), % palestinian and syrian palestinian, participants from iraq, and from egypt. the participants were well educated: % completed their education before fleeing their country.
regarding highest level of enrolled education, % were enrolled in a university program or above, % in an intermediate or middle school, while % were in a secondary/high school, and % in a trade school/college. considering syria’s location at the eastern end of the mediterranean sea, the majority of syrian migrants fled to neighboring turkey and lebanon before being able to enter europe. in our sample, % traveled by sea and land with % by air. historically, migrants move in groups to reduce the risks of the journey. when asked, % traveled with friends, % with spouses and/or children, and % with other relatives. yet, almost one-third ( %) traveled alone.

information needs

the information needs strand focused on refugees’ needs, their information seeking, and the importance of information before and during migration. from the survey’s choices plus “other category,” the most critical information needs were “well-being of family in home country” ( %), “news about my country of origin” ( %), learning a new language ( %), and learning the culture of destination country ( %). learning how to use communication technologies was reported as critically important by %. at the other end of the spectrum (information not important at all), participants listed “communicating with smugglers while travelling” and “identifying worst european country.”

to further understand refugees’ information needs, we asked about migration consideration factors. the current influx of migrants to europe provides new insights. the most critically important were “political stability in a chosen country” ( %) and “strong economy in the chosen country” ( %). while % of respondents indicated “easiness of the asylum procedure” as an important issue, the least important topics were “health care system” ( %), having a social support system ( %), “aid provided in host country” ( %), and “weather/climate” ( %).
information seeking and icts

with the help of icts and social media networks, the participants seemed to have little difficulty in finding needed information, especially about route maps, identifying essentials to bring, exchanging money, and so on. for example, % and %, respectively, said that it was “very easy to find” out the economic strength and political stability of germany, while % stated the same of “best and worst european countries to migrate,” along with “how to use communication technology” ( %). challenging information topics included “health care system” ( %), “vocational and university education for adults” ( %), and “friendliness of local people” ( %).

disaggregating the results further by gender and age group shows that gender was not a significant factor in determining the use of icts to seek help or information, with % of females and % of males confirming not using the mobile phone to seek help or information. however, disaggregating the results by age shows that almost all older participants (age: – years) used their mobile phones and technologies to seek help or information during their trip, in contrast with % of the – years age group. while older people may have more familial responsibility to stay in touch and seek/share information with family, younger respondents, mostly males, may not have had funds to cover calls.

regarding icts for obtaining needed and serendipitous information, % of respondents used their mobile to call people asking for help or information during their trip to europe. most participants ( %) used their own sim cards for accessing the internet/wi-fi since they left their country, while % used internet cafes or public places, followed by bus/train stations ( %), for getting connections. however, only % used their mobile to access social media such as facebook en route to europe. this is significant, showing that the heavy use of social media occurred prior to deciding on and arranging the trip.
this relatively small percentage supports others’ findings that “conversations with other travelers” is an important information source and that the smaller portion of the refugees with continuous access to information via social media were information mediators (figure ).

identifying mis- and disinformation

despite mobile use, social media, and other people as information sources, respondents did not receive accurate information all the time. in total, % stated that information was “sometimes correct—a couple of sources were ok,” while % received information that was “mostly correct a lot of valuable, accurate information,” and % received “rarely correct” information. when asked how they knew when to distrust information, % replied “learning by experience,” suggesting that refugees became aware of when to distrust information only when faced with a different reality. the importance of other people in judging information was raised by % who knew to distrust information from friends and people who arrived earlier.

information mediaries

many actors played a significant role in helping refugees to search for information using the internet and mobile phones. “friends” ( %) topped the list, followed by “other refugees” ( %). considering that the majority of migrants reached europe via sea and land, smugglers surprisingly were not important actors: only % reported “smugglers” as providing help with searching for information (figure ). open response data about infomediaries were analyzed for the person who helped and the type or nature of help, which were grouped into eight categories: travel/directions; information unrelated to travel; money/material goods; refugees, child and health care; language and education; technology; and, finally, employment and membership. since all migrants were concerned about arriving safely at their destination, it was expected that travel/directions was most prevalent.
examples include the following:
•• “someone showed us the way to the united nations office for the support of refugees in the capital of hungary” ( -year, male, syrian, university degree holder);
•• “a turkish taxi driver picked us up km before the austrian border and dropped us off in vienna and saved us from having to do the finger prints in hungary” ( -year, male, syrian, university degree holder);
•• “my uncle took me from munich to berlin” ( -year, male, iraqi, intermediate school/middle school certificate holder);
•• “someone helped me to find the way to the station” ( -year, male, syrian, secondary school certificate holder).

regarding money and material goods: as middle eastern societies are rather conservative with strong family relations, it was consistent that infomediary help included family members, both close and distant:
•• “my husband’s brother helped me with money” ( -year, female, syrian, intermediate school/middle school certificate holder);
•• “my cousin helped me with money” ( -year, male, syrian, university degree holder);
•• “my brother gave me money” ( -year, male, syrian, intermediate school/middle school certificate holder).

the helpfulness and sympathy of strangers, mainly europeans, affected by social media shows that easing the difficulties faced by migrants was demonstrated through informational and instrumental assistance, such as food, family, a car lift, shelter, child care, and so on. these results support the thomson reuters foundation study in september , which found that more than three-quarters of europeans sympathize with syrian refugees coming to their countries, challenging reports of growing anti-immigration sentiment across the continent.

figure . actors playing significant role in helping refugees with information and technology.
figure . sources for learning the best route to travel.
examples of european help include the following:
•• “the smuggler didn’t take money for the transport of my children” ( -year, male, syrian, intermediate school/middle school certificate holder);
•• “in greece, a lebanese woman took care of my children”;
•• “someone gave me money after my money got stolen” ( -year, male, palestinian, intermediate school/middle school certificate holder);
•• “an austrian family invited us to their home, fed us, gave us money and rented a car for us to get to germany” ( -year, male, syrian, intermediate school/middle school certificate holder);
•• “a woman helped me in berlin to find my way back to the refugee camp and she also bought me a ticket for the public transport” ( -year, male, syrian, intermediate school/middle school certificate holder).

“help not related to travel” included advice to follow groups and not individuals, the necessity to communicate with people who arrived in germany earlier, not to trust smugglers, and information on the asylum process in germany and austria, how to find a pediatrician, “warning from the police and from the places where the police usually are” ( -year, male, syrian, secondary school certificate holder) and “my uncle registered me in a football club” ( -year, male, syrian).

given the plethora of facebook pages and other messaging applications as sources of information to migrants seeking a safer place, we asked whether the refugees themselves were contributing to sharing information with others, on which platform, such as facebook, whatsapp, viber, and so on, and what data were being shared, such as posts, texts, voice messages, maps, video files, and so on. analysis showed that respondents were not spending much time sharing information while traveling, especially as facebook was inaccessible at times. only % posted daily on facebook and % posted to facebook group pages.
however, text messaging was popular, due to the ease of using whatsapp and viber: % sent text messages via chat applications when traveling, while % sent voice messages via the same applications. sending maps or video files was not significant, due to poor connectivity and other factors. this low sharing behavior was boosted on arrival in europe: daily texting and voice messaging remained highest at % and %; sending maps and video files daily via social media also increased, suggesting refugees had good connectivity in germany.

discussion: the need for a radical new approach to understanding migration

our research confirms the relevance of smartphones and internet-based communication tools such as whatsapp and facebook for migration but highlights the intrinsic value of other people. this finding is consistent with the unhcr connecting refugees report (vernon et al., ) and the international rescue committee (irc), which assessed the importance of mobiles to arab refugees in (handelsblatt, ). our research shows that social media enables contact with families and friends, while creating and maintaining social networks between those on the move and people who migrated earlier, shedding fresh light on the question of disrupted social support in situations of dislocation and refuge (wall et al., ; see also dekker, engbersen, klaver, & vonk, in this special section). our findings support gillespie et al. ( ) and wall et al. ( ) in that refugees who fled to the eu in – , particularly from syria and iraq, are largely well educated and digitally literate. they are frequently concerned about staying connected, finding access to wi-fi and phone charging, but most of all about staying safe online and off-line. misinformation is widespread, and it is difficult to know which information to trust (gillespie et al., ). indeed, people enjoy sharing information, even when they do not believe it (karlova & fisher, ).
in consequence, misinformation and disinformation, defined as inaccurate information and deceptive information, respectively, have to be considered as varieties of human information behavior. both are by default diffused through social networks. social media such as facebook and whatsapp have made their diffusion easier and faster. as our research shows, migrants, and particularly refugees, for whom false information can potentially lead to severe harm and even death, are very well aware of faulty and misleading information circulating in social media. nevertheless, refugees described being both consumers and producers of social media content. with % of them using their smartphone during their journey to europe, they demonstrate an advanced degree of digital connectivity and literacy. this also holds true after they arrived in europe and germany: % and % of refugees shared information on their journey via whatsapp, viber, and so on as a text or voice message, with % and % doing so daily.

our findings highlight, on one hand, the relevance of transnational digital networks among refugees and migrants as well as their impact on migration movements. the overwhelming majority ( . %) of the refugees, in fact, learned their best route to europe via facebook, whatsapp, or viber. literally no one accessed books or library computers for this purpose. on the other hand, our findings show that migrants, with their strong digital literacy, are digital agents of change who themselves post and share information in social media and digital social networks. as both consumers and producers of digital migration knowledge, they, furthermore, demonstrate an elaborate degree of awareness with regard to information quality, mis- and disinformation.
besides digital connectivity and social media literacy, it is the social ties to persons who successfully migrated that our respondents considered most trustworthy in terms of accuracy, completeness, and truthfulness of information. the latter points toward a certain rationality in matters of flight that seems to contradict the common idea of fleeing as a helter-skelter reaction to a situation of stress in which someone leaves everything behind and starts to run. for the persons interviewed, at least, fleeing rather seems to manifest itself as a (pro)active process of decision making in which complex information needs and information gains through social media play a vital role.

finally, we wish to highlight three implications of our main findings summarized above. the first and foremost concerns the common misconception that all refugees are passive victims fleeing misery with nothing but their lives. our research shows instead that the newly arrived refugees in germany actively escaped using a wide range of resources and skills available to them (including ict, family ties, creative solution seeking, and the rational assessment of information quality, for instance). to our understanding, this false image of refugees in germany needs to be corrected. second, our analysis of the digital connectivity, information behavior, and interaction needs among refugees during and after their journey to europe calls for the establishment of a digital scholarship in migration studies capable of exploring the digital traces that migrants leave behind with digital (=computational) tools, while contributing to the development of its own methodological approaches and theoretical perspectives based on achievements of the social sciences in the analog era.
third, for future research, we recommend a focus on understanding the most effective ways of facilitating integration that reflect refugees’ cultural and communication stances, specifically regarding people, place, and time. fisher ( ), for example, reports on distinct design and field research insights of syrians displaced in the middle east that are relevant to co-designing integration services, systems, and policies in germany and the eu. examples include understanding the roles of young arabs (male and female) in serving as infomediaries in sub-communities; how libraries and other cultural agencies may be engaged in integration, especially given our study’s finding that refugees did not use books and libraries while en route to germany; and facilitating refugees’ needs and access to information about education, health care, civics, and other hard-to-find topics. relatedly, activities that bring mainstream society together with refugees such that established residents can understand the culture, experiences, and concerns of refugees are also needed for future work.

declaration of conflicting interests

the author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

funding

the author(s) received no financial support for the research, authorship, and/or publication of this article.

notes

1. for the latter, especially the data protection directive (directive / /ec) and the general data protection regulation (gdpr / ) of the european union (eu), which enters into force on may , proved to be informative.
2. these principles are codified in policies and documents such as the un declaration of human rights, the helsinki declaration, and the belmont report or the european textbook on ethics in research.
3. according to the dublin regulation (regulation no. / and predecessors), an eu law, a refugee’s asylum request must be processed in the eu member state through which the applicant first enters the eu.
thus, asylum applicants tend to be careful when talking about their stories of flight as details might be used against them during the asylum process. in an attempt to relieve pressure on hungary and greece, in , berlin stopped returning syrian asylum seekers to their first port of entry in the eu. yet information on migration journeys and flight routes remains sensitive as non-syrian nationals tried to benefit from the exemption and enter the eu pretending to be syrian nationals. together with the eurodac regulation, which establishes a europe-wide fingerprint database for unauthorized entrants to the eu, the dublin regulation is the cornerstone of the dublin system.
4. a total of two studied at tu berlin, and one inhabited the same refugee accommodation facility and was considered trustworthy by the engaged foreign language interpreters.

references

archambault, j. s. ( ). breaking up “because of the phone” and the transformative potential of information in southern mozambique. new media & society, , – .
beikler, s., & vogt, s. ( , october ). flüchtlinge in berlin. die zustände vor dem lageso sind lebensgefährlich [translation for the article title: refugees in berlin. the conditions in front of the lageso are life threatening]. der tagesspiegel. retrieved from http://www.tagesspiegel.de/berlin/fluechtlinge-in-berlin-die-zustaende-vor-dem-lageso-sind-lebensgefaehrlich/ .html
bishop, a. p., & fisher, k. e. ( ). using ict design to learn about immigrant teens from myanmar. in proceedings of the seventh international conference on information and communication technologies and development. article no . new york, ny: acm. retrieved from https://dl.acm.org/citation.cfm?id=
borkert, m., & de tona, c. ( ). stories of hermes: an analysis of the issues faced by young european researchers in migration and ethnic studies. forum qualitative social research, ( ).
retrieved from http://www.qualitative-research.net/index.php/fqs/article/view/
bundesamt für migration und flüchtlinge. ( ). aktuelle zahlen zu asyl [current figures on asylum]. ausgabe april . retrieved from https://www.bamf.de/shareddocs/anlagen/de/downloads/infothek/statistik/asyl/aktuelle-zahlen-zu-asyl-april- .pdf;jsessionid= e f fa baf ab fafc bc . _cid ?__blob=publicationfile
bundesamt für migration und flüchtlinge. ( ). das bundesamt in zahlen [the federal office in figures ]. asylum. retrieved from https://www.bamf.de/shareddocs/anlagen/de/publikationen/broschueren/bundesamt-in-zahlen- .html
cordes, s. ( ). broad horizons: the role of multimodal literacy in st century library instruction. retrieved from http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=c c f cdc d f d fa e ?doi= . . . . &rep=rep &type=pdf
davis, s. l., & bottoms, b. l. ( ). effects of social support on children’s eyewitness reports: a test of the underlying mechanism. law and human behavior, , – .
dekker, r., engbersen, g., klaver, j., & vonk, h. (in press).
smart refugees. how syrian asylum migrants use social media in migrant decision-making. social media + society.
diminescu, d. ( ). the connected migrant: an epistemological manifesto. social science information, , – .
diminescu, d. (ed.). ( ). e-diasporas atlas. explorations and cartography of diasporas on digital networks. paris, france: ed. de la maison des sciences de l’homme.
diminescu, d., & loveluck, b. ( ). traces of dispersion: online media and diasporic identities. crossings: journal of migration and culture, ( ), – .
fisher, k. e. ( ). information worlds of refugees. in c. m. maitland (ed.), icts for refugees and displaced persons. cambridge, ma: the mit press.
fisher, k. e., yefimova, k., & bishop, a. p. ( , may - ). adapting design thinking and cultural probes to the experiences of immigrant youth: uncovering the roles of visual media and music in ict wayfaring. proceedings of the chi conference extended abstracts on human factors in computing systems, san jose, ca.
fisher, k. e., yefimova, k., & yafi, e. ( , june - ). future’s butterflies: co-designing ict wayfaring technology with refugee syrian youth. proceedings of the th international conference on interaction design and children (idc’ ), manchester, uk.
fontes, l. a. ( ). interviewing immigrant children and families for suspected child maltreatment. elmhurst, il: american professional society on the abuse of children (apsac) advisor spring title.
gajjala, r. ( ). cyberselves: feminist ethnographies of south asian women. walnut creek, ca: altamira press.
georgiou, m. ( ). diasporic media across europe: multicultural societies and the universalism-particularism continuum. journal of ethnic and migration studies, , – .
georgiou, m. ( ). diaspora in the digital age: minorities and media representation. journal on ethnopolitics and minority issues in europe, , – .
gillespie, m., ampofo, l., cheesman, m., faith, b., iliadou, e., issa, a., & skleparis, d. ( ). mapping refugee media journeys.
smartphones and social media networks (research report). retrieved from http://www.open.ac.uk/ccig/sites/www.open.ac.uk.ccig/files/mapping% refugee% media% journeys% % may% fin% mg_ .pdf
gilster, p. ( ). digital literacy. new york, ny: john wiley.
handelsblatt. ( , september ). clinging to a smartphone lifeline. handelsblatt. retrieved from http://global.handelsblatt.com/edition/ /ressort/politics/article/clinging-to-a-smartphone-lifeline
hedge, r. s. (ed.). ( ). circuits of visibility: gender and transnational media cultures. new york: new york university press.
hepp, a., bozdag, c., & suna, l. ( ). mediale migranten. mediatisierung und die kommunikative vernetzung der diaspora [media migrants. mediatization and the communicative networking of the diaspora]. wiesbaden, germany: vs verlag.
horst, h. a. ( ). the blessings and burdens of communication: cell phones in jamaican transnational social fields. global networks, , – .
horst, h. a., & taylor, e. b. ( ). the role of mobile phones in the mediation of border crossings: a study of haiti and the dominican republic. the australian journal of anthropology, , – .
karakayali, s., & kleist, o. ( ). strukturen und motive der ehrenamtlichen flüchtlingsarbeit (efa) [structures and motives of voluntary refugee work (efa)] in deutschland (research report). retrieved from http://www.bim.hu-berlin.de/media/studie_efa _bim_ _v%c % .pdf
karlova, n., & fisher, k. ( ). a social diffusion model of misinformation and disinformation for understanding human information behaviour. information research, ( ). retrieved from http://www.informationr.net/ir/ - /paper .html#.wfv jfzau x
koltay, t. ( ). the media and the literacies: media literacy, information literacy, digital literacy. media, culture & society, , – .
kopietz, a. ( ). moabit—kleiner tiergarten ist neuer brennpunkt der kriminalität, berliner zeitung [moabit—kleiner tiergarten is the new focal point of crime].
berliner zeitung. retrieved from http://www.berliner-zeitung.de/
lankshear, c., & knobel, m. ( ). digital literacy and digital literacies: policy, pedagogy and research considerations for education. nordic journal of digital literacy, , – .
leurs, k., & ponzanesi, s. ( ). mediated crossroads: youthful digital diasporas. journal of media and culture, . retrieved from http://journal.media-culture.org.au/index.php/mcjournal/article/view/
livingstone, s. ( ). media literacy and the challenge of new information and communication technologies. communication review, , – .
madianou, m., & miller, d. ( ). mobile phone parenting: reconfiguring relationships between filipina migrant mothers and their left-behind children. new media & society, , – .
marfleet, p. ( ). refugees in a global era. basingstoke, uk: palgrave macmillan.
markham, a., & buchanan, e. ( ). ethical decision-making and internet research: recommendations from the aoir ethics working committee (version . ). retrieved from http://aoir.org/reports/ethics .pdf
merkel, a. ( ). bundespressekonferenz vom [federal press conference of ] . . . retrieved from https://www.bundesregierung.de/content/de/mitschrift/pressekonferenzen/ / / - - -pk-merkel.html
panagakos, a. n., & horst, h. a. ( ). return to cyberia: technology and the social worlds of transnational migrants. global networks, , – .
parham, a. ( ). diaspora, community and communication: internet use in transnational haiti. global networks, , – .
sayad, a. ( ). the suffering of the immigrant.
cambridge, ma: polity press.
statistical office of berlin-brandenburg. ( ). statistischer bericht. einwohnerinnen und einwohner im land berlin am . juni [statistical report. residents in the state of berlin on june , . basic data]. grunddaten. retrieved from https://www.statistik-berlin-brandenburg.de/publikationen/stat_berichte/ /sb_a - - _ h _be.pdf
stinauer, t. ( , january ). neue erkenntnisse der polizei. offenbar kaum nordafrikaner zu silvester in köln [new findings of the police. apparently few north africans on new year’s eve in cologne]. kölner stadtanzeiger. retrieved from https://www.ksta.de/koeln/neue-erkenntnisse-der-polizei-offenbar-kaum-nordafrikaner-zu-silvester-in-koeln-
street, b. ( ). literacy in theory and practice. cambridge, uk: cambridge university press.
thomas, m., & lim, s. s. ( ). on maids, mobile phones and social capital—ict use by female migrant workers in singapore and its policy implications. in j. katz (ed.), mobile communication and social policy (pp. – ). trenton, nj: transaction.
thomson reuters foundation. ( ). poll shows most europeans sympathize with syrian refugees, “have not lost their hearts.” retrieved from http://www.reuters.com/article/us-europe-refugees-poll-iduskcn l x
vernon, a., deriche, k., & eisenhauer, s. ( ). connecting refugees: how internet and mobile connectivity can improve refugee well-being and transform humanitarian action. geneva, switzerland: united nations high commissioner for refugees. retrieved from http://www.unhcr.org/en-us/publications/operations/ d c /connecting-refugees.html
vertovec, s. ( ). cheap calls: the social glue of migrant transnationalism. global networks, , – .
vertovec, s. ( ). transnationalism. new york, ny: routledge.
wall, m., campbell, m. o., & janbek, d. ( ). syrian refugees and information precarity. new media & society, , – .
wilding, r. ( ). virtual intimacies? families communicating across transnational contexts. global networks, , – .
witteborn, s. ( ).
becoming (im)perceptible: forced migrants and virtual practice. journal of refugee studies, , – . zeit online. ( ). flüchtlinge: alles neu am lageso? [refugees: everything new at the lageso?]. retrieved from http://www. zeit.de/politik/deutschland/ - /lageso-berlin-fluech- tlinge-situation-ueberfordert author biographies maren borkert (phd, technical university berlin), is a marie- curie-experienced researcher at the school of economics and management, centre for entrepreneurship, chair of entrepreneurship and innovation management, tu berlin. her research interests include digital innovation and networks, inclusive entrepreneurship, and european migration governance. karen e. fisher (phd, university of washington), is a professor at the information school and an adjunct professor at the department of communication, university of washington; as well as consultant with unhcr (jordan); visiting professor, open lab, newcastle university, united kingdom; and adjunct professor, Åbo akademi university, finland. eiad yafi (phd, universiti kuala lumpur), is a senior lecturer at malaysian institute of information technology, universiti kuala lumpur. from homs, syria, he received his phd in computer science (data mining) from jamia hamdard university, india, in . 
from fair leading practices to fair implementation and back: an inclusive approach to fair at leiden university libraries

hettne, km, et al. . from fair leading practices to fair implementation and back: an inclusive approach to fair at leiden university libraries. data science journal, : , pp. – . doi: https://doi.org/ . /dsj- -

practice paper

kristina maria hettne, peter verhaar, erik schultes and laurents sesink
centre for digital scholarship, leiden university libraries, leiden, nl
go fair international support and coordination office (gfisco), leiden, nl
corresponding author: kristina maria hettne (k.m.hettne@library.leidenuniv.nl)

leiden university (lu) adopted a data management regulation in . the regulation embraces the findable, accessible, interoperable and reusable (fair) principles. to implement the regulation, a programme was established.
the focus of the programme was initially to raise awareness, to set up services to make data findable and accessible, and to train researchers in data management planning. in , the programme entered its second phase, with an increased focus on interoperable and reusable data, and on implementing the machine-actionable aspects of fair data. this step is non-trivial, however, because the international field of fair research data is developing fast and requires support professionals with adequate skills to adopt leading practices quickly. in this paper we describe how lu aims to close the feedback loop between international bottom-up organisations such as go fair, the research data alliance and codata on the one hand and university staff engaged in developing and implementing emerging fair leading practices on the other. during processes such as these, it is of crucial importance to focus primarily on the needs of researchers. we describe how lu builds up its support for fair data before, during and after research through its involvement in leading practices, training and consultancy, and end with recommendations for other universities wanting to implement the fair principles.

keywords: fair data; research data management; leading practice; implementation

introduction

in april , leiden university (lu) launched its research data management (rdm) regulation (research data management regulations leiden university ). this regulation was formulated to make it easier for researchers to comply with rdm requirements formulated by funders and other external parties to enhance the transparency and the integrity of research. the rdm regulation was inspired by the fair principles (ensuring that data and software are findable, accessible, interoperable, and reusable by humans and machines), which were initially formulated in (jointly designing a data fairport ), published in (wilkinson et al.
) and now widely adopted by research funding and research performing organisations.

the lu rdm regulation applies to three distinct stages of data management: before, during and after research. it states, for example, that a data management plan (dmp) must be drawn up before the start of the data collection phase. more concretely, the dmp consists of a further specification of the data management protocols of the faculty or institute for the specific research project in question. during the research project, research data must be preserved securely, meaning that the integrity, availability and, if required, confidentiality of the data must be guaranteed. after the formal completion of the research project, the data must be managed in such a way that they are findable, accessible, comprehensible and reusable in the long term. this means that data must be stored together with the metadata, documentation and possibly the software required for their reuse. to better ensure secure and sustainable access to the data, they should preferably be preserved in a repository certified with the coretrustseal (coretrustseal ). the regulation stipulates a minimum retention term for research data of ten years.

following the publication of the lu rdm regulation, faculties were asked to translate the general principles in this regulation to concrete guidelines, following the needs of individual disciplines. a data management implementation programme was set up to support six faculties, with an expected finish date at the end of . members of the project team include data management experts from the centre for digital scholarship (cds) at lu libraries, policy advisors from academic affairs, research ict experts from lu ict shared service centre and rdm experts from the faculties.
the cds is responsible for information provisioning, advice and training. during the first phase of the programme, the cds started a range of activities to raise awareness of the activities needed to make data fair. for example, a research data service catalogue listing resources essential for research data management at lu was released in and revised in (leiden research data service catalog ). in addition to this, the cds organises regular workshops (occurring every six weeks) on how to write dmps. next to general workshops, the cds also organises dmp workshops tailored to the needs of specific research institutes. both types of workshops include a general introduction to the fair principles. during the customised workshops, students are also asked to evaluate the ‘fairness’ of a number of studies in their discipline. the participants are encouraged to revise or to update their dmp during their project if necessary. there is no formal assessment of the workshops, but participants are asked for feedback.

the above activities concentrated on making data fair for humans as a first step. by fair for humans we mean that human researchers can find the data set in a repository, download it, understand it (because it has a readme file describing it) and reuse it for their own purposes (because it has a permissive license attached to it). the process of making data fair for human researchers demands relatively little effort, but it is still a huge leap forward compared to the situation in which the data are stored exclusively on the researcher’s own hard drive, for instance.

the second phase of the data management implementation programme, which started in , had a much stronger focus on interoperability and reusability and the machine actionability of data as stated in the fair principles. the cds believes that most of the principles underlying findability and accessibility can be implemented at a university-wide level.
in contrast, the principles underlying interoperability and reusability can be implemented more effectively at faculty and institutional level, since these more frequently require domain-specific knowledge.

the approach that was adopted at leiden university was strongly informed by the work of international organisations such as go fair, the research data alliance (rda) and codata, which all formulate and recommend leading practices for the implementation of the fair principles. the experiences and lessons learned from the local implementations at the faculty and institutional level can in turn be used to strengthen these international leading practices. in this paper we describe how the cds tries to close the feedback loop between international leading practices and the lu implementation during this second phase of the data management programme, maintaining a clear focus on the researcher. while the implementation of rdm policies, support and services (see for example cruz et al. or mushi et al. . for a survey of research data services implementation see tenopir et al. ) has already been discussed in a range of other publications, the various ways in which universities can advance the interoperability, reusability and machine-actionability of research data, in agreement with the fair principles, have still received little attention, to our knowledge. by describing the approach within lu, this paper aims to contribute to this emerging debate.

approach

the primary focus of the cds is to deliver the best possible support to all researchers affiliated with leiden university. we believe that this can best be achieved by making researchers aware of relevant leading practices, by training them in how to make their data fair and by providing individual consultancy when needed. to give state-of-the-art support, the cds needs to keep up with the pace of the fast-developing international field around fair research data.
the implementation of the fair principles for interoperable and reusable data in particular requires advanced knowledge that can be found more easily at an international level. ideally, by actively taking part in current developments related to fair data, we can learn from leading practices for implementation solutions and translate these to relevant researcher support, while influencing the pace of the availability and the suitability of these solutions to leiden researchers. in return, we also share our experience with the organisations providing these leading practices. we therefore engage closely with the international bottom-up organisations go fair, rda and codata (figure ). below, we describe our activities within three areas: leading practices, training and consultancy. we also discuss how these activities contribute to our researcher support before, during and after research projects.

leading practices

at the end of , the cds began its active involvement in two international activities to develop leading practices for implementation solutions for fair data: the fair funders pilot programme (ffpp) (wittenburg et al. ) and the fair implementation matrix (sustkova et al. ). both activities were initiated and led collectively by go fair and rda. the ffpp wanted to make it easier for grantees to produce fair data by using tools and services for creating dmps and fair metadata. the fair implementation matrix was aimed at encouraging communities to record their technical implementation choices for fair, and to work towards a degree of convergence around the fair leading practices. these two activities resulted in several workshops and preliminary results that were tracked in the open science framework project (fair implementation matrix, ). in , both initiatives entered a phase of rapid development because of the covid- pandemic.
it prompted the birth of the virus outbreak data network (vodan) as a joint activity of codata, rda, wds, and go fair (vodan ). the ffpp and fair implementation matrix became core activities within vodan and the cds decided to intensify its involvement and contribute actively to the vodan network. led by go fair, a three-point framework for fairification ( pfff ) was launched on the th of july . one of the key components of the three-point framework for fairification is the metadata for machines (m m) workshop (metadata for machines workshops ). in an m m workshop, data stewards and researchers collaborate to decide on data policy issues and metadata descriptions needed to ensure fairness and render their decisions in a machine-actionable form (a metadata schema). the metadata schemas produced in the m m workshops form part of the larger fair implementation profile (fip) (magagna et al. ), which in turn guides the configuration of a fair data point (a metadata repository that provides access to metadata in a fair way) (fair data point design specification ).

the three-point framework for fairification was adopted by the dutch national covid- research program from the research funding agency zonmw, among others. in theory, these activities could have given the cds a concrete opportunity to explore the added benefit of being actively involved in the development of these leading practices when supporting leiden researchers applying for funding within this program. in practice, however, we noticed that the cds was only contacted by researchers at a late stage when the proposal was already almost completed (in most cases just days before the deadline or even the same day). we can only speculate that the quality of these proposals would have improved if we had been involved at an earlier stage and if we could have worked out the details from a fair data stewardship perspective.
we did note, however, that we could offer valuable advice to researchers about the budget for data stewardship, since we knew, from our involvement in vodan, what zonmw was expecting in terms of making data fair. we will continue to be involved in the developments of the three-point framework for fairification and plan to offer the m m and fip workshops and to set up fair data points for researchers from any discipline as a way to guide researchers in their decisions on how to make their data fair for machines. in terms of research support, m m and fip workshops should be performed at the beginning of a research project, to make sure that there is still enough freedom to choose the appropriate metadata and data standards.

figure : schema showing the feedback loop between institutional protocols and international leading practices in the context of leiden university.

training

researchers need to be trained in fair data skills to enable them to communicate effectively with data experts when making their data fair. to understand the skills needed, the cds contributes to the development of a competency framework to describe the skills and knowledge required to do fair-related work in a particular discipline. the first workshop (a terminology for fair stewardship skills workshop ) was organised by codata, the digital curation centre, dutch techcentre for life sciences (dtl)/elixir-nl, fairsharing and royal holloway. the competency framework is work in progress and a first version was released after the second workshop in october (fair terminology ). some skills are generic, such as data modelling, but to actually model their own data, researchers need to be aware of data formats and metadata standards in their own field. as explained previously, the m m workshops help communities to define and declare their metadata standards while the fair implementation profile records them.
however, these activities are directed at research communities; individual researchers need other training materials to learn how they can make their data fair. to create discipline-specific and understandable fair data resources aimed at single researchers, the cds participated as collaborators in the international library carpentry top fair data and software things sprint in november . during this sprint, training materials were developed for specific academic disciplines including history, biodiversity and archaeology (erdmann et al. ). the cds led the sprint on the history resource. the top fair data things are intended as living resources that can be edited and expanded by the community. most of these resources have researchers as their primary audience, but they can also be used as a source by librarians during their efforts to raise awareness of the fair principles. in fact, when developing a workshop entitled ‘let your research bloom: practical steps for fair data’ for the science phd day at the lu, the top fair data and software things resources helped to identify the four first steps to make data more fair in a very productive manner. as explained in the top fair data things, researchers need to upload their data to a repository (f), decide who has access to the data (a), describe the data using the metadata scheme offered by the repository (i), and choose a license (r). in may , another library carpentry international sprint was held, this time in collaboration with mozilla. the cds coordinated the sprint on the top fair things for astronomy together with the vrije universiteit in amsterdam, the netherlands and the university of notre dame in the united states. the resulting document has been added to the top fair things resource on zenodo (erdmann et al. ). the resource on zenodo has been consulted quite frequently ( unique views, data accessed on september ) and, since may , it has been recognised by the rda as an endorsed output.
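the four first steps named above (deposit in a repository, decide on access, describe with the repository's metadata scheme, choose a license) can be read as a simple checklist. the sketch below is only an illustration of that reading and is not taken from the top fair data things or from any cds tool: the `DatasetRecord` class, its field names and the `missing_fair_steps` function are hypothetical, and a filled-in field says nothing about the quality of what was entered.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical description of one dataset; all field names are illustrative."""
    repository_identifier: Optional[str] = None   # F: identifier received on deposit
    access_policy: Optional[str] = None           # A: e.g. "open" or "restricted"
    metadata: Dict[str, str] = field(default_factory=dict)  # I: repository metadata fields
    license: Optional[str] = None                 # R: e.g. "CC-BY-4.0"


def missing_fair_steps(record: DatasetRecord) -> List[str]:
    """Return the letters (F, A, I, R) of the first steps not yet taken."""
    missing = []
    if not record.repository_identifier:
        missing.append("F")
    if not record.access_policy:
        missing.append("A")
    if not record.metadata:
        missing.append("I")
    if not record.license:
        missing.append("R")
    return missing


if __name__ == "__main__":
    # A dataset that was deposited and licensed, but not yet described
    # and without an access decision (identifier value is made up).
    draft = DatasetRecord(repository_identifier="hypothetical-id-001",
                          license="CC-BY-4.0")
    print(missing_fair_steps(draft))
```

a checklist like this only records whether each step has been taken; whether the chosen metadata scheme or license is appropriate for a discipline remains a matter for the researcher and the data steward.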
we encourage leiden researchers to contribute to these sprints. the involvement of a leiden researcher in the creation of the top fair things for astronomy was a fruitful experience, both for the researcher and for the data stewards in the sprint. in terms of research support, the material is broad and could fit anywhere during the research project, for example right at the start when there is a need to create “fair awareness” or at the end since it contains guidelines on interoperability and reusability.

to meet the need for researchers to work with standards when “wrangling” their own data, the cds organised a ‘bring your own data fairification’ pilot workshop for early career researchers at the leiden institute for advanced computer science in june . the course was based on information from the first draft of the ‘essential steps of the fairification process’ book (schultes et al. ). similar to the top fair things, it is expected that the book will evolve into a dynamic resource that will be updated by the community. following the very positive evaluation, the ‘bring your own data’ workshop will be followed by similar events at other lu institutes. it complements the m m and fip workshops mentioned in the leading practices section by focussing on the data wrangling while the other workshops focus on documenting the standards that need to be used. given these qualities, m m and fip workshops ideally take place before the data collection and a bring your own data workshop is best organised after data collection.

consultancy

while it is crucially important to develop good educational resources and to organise interactive training sessions, such initiatives are not always sufficient. research projects often work with highly complicated data sets, and researchers sometimes lack the skills and the knowledge to manage their data in a way that ensures maximum reuse.
because the questions that researchers may have cannot always be adequately addressed during training sessions, it is also necessary to be willing to make dedicated appointments with individual researchers and to offer consultancy about specific topics in the field of data management. in the course of the last few years, the cds has offered advice on a wide range of activities across the entire research lifecycle. most of the questions that we have received were about database design and about data modelling. a large number of researchers have also asked for advice on how to publish data in compliance with the fair principles (imming et al. ).

in and , the cds participated actively as a partner in an educational project which concentrated on the fairification of an existing born-digital scholarly archive (verhaar ). the project was initiated by a professor of book history at lu who had produced a large number of text documents containing semi-structured research annotations in the course of his academic career. to improve the reusability of the data, the cds helped to convert the semi-structured research annotations into a searchable mysql database, based on a well-considered data model. the entries in this database were connected to entries in wikipedia, so that the new data set could be integrated more effectively into existing data sets. the project continues to evolve every year since it is embedded in the curriculum for the ma programme on book and digital media studies at the lu. therefore, students continue to work on the database and related projects.

the cds has also participated in a project which aimed to expose the legacy data of the lu centre for linguistics in a fair format. this project was taken up as a case study in the leiden rdm programme.
in , the lucl sent out a questionnaire and performed interviews with researchers at the centre to investigate what type of legacy data they knew about. the questions concentrated, among other aspects, on the languages that were studied, the data formats and the curation needs. the survey resulted in information about datasets. the metadata about these datasets was entered into an online, searchable database, developed by the cds. during this process, the cds also advised on the standards to use in the description of these data sets (leiden linguistics data ). the project is still ongoing and the next steps include archiving the metadata in a repository and making the actual datasets fair.

the participation in projects such as these helped us to develop a better understanding of the concrete steps that can be taken to make data more fair. with the development of the three-point framework for fairification, we are beginning to get more tools for our fair toolkit to benefit these types of projects. both the book history project and the lucl legacy data project could set up a fair data point, for instance, and an m m and fip workshop could be arranged for a project similar to the lucl project, to stimulate discussions and decisions on metadata and data standards.

lessons learned

the cds uses the leading practices recommended by go fair, rda and codata to support the implementation of fair data within lu faculties and institutes. like the lu rdm regulation, the leading practices are broad, and there is a need for a translation to protocols for faculties and institutes, as well as for training and consultancy to accompany such protocols. we recognise that, as universities, we must take on an active role in the feedback loop going from leading practices to implementation at faculty and institutional level and vice versa.
by collaborating closely with go fair, rda and codata, amongst other organisations, we help to co-develop leading practices and implementation guidelines for fair, in a manner that is informed by first-hand experiences in the cds training sessions and by experiences from supporting actual projects to create fair data. such collaborations have the obvious advantage that we can deliver machine-actionable fair data support and training to researchers, as well as input for institutional protocols that is informed by the latest developments. however, this can only be achieved when data stewards have the means and the skills to implement not only the principles of findability and accessibility but also the principles of interoperability and reusability, which unfortunately are less straightforward.

until very recently, the cds was partly supported financially by a lu innovation fund and partly by the lu libraries. after a successful internal evaluation, based on a questionnaire sent out to all researchers and other users of our services, the cds is now being funded structurally from the lu central budget. the evaluation specifically stated that the involvement in international initiatives as stated above is important for the lu, allowing for quick action in response to new developments. specific projects have been funded either from lu research or educational grants, or by external grants, but it has always been the goal to have a core, central, structural funding for the team. this stable funding, in combination with the collaborative skills and the expertise of the members of the cds team, who have been trained in fields such as information technology and computer science, forms the conditions under which the cds can actively participate in the development of international leading practice.
the situation in which data stewards are employed by a specific research project only is clearly undesirable, as there will generally be fewer opportunities for them, in that case, to be involved in similar efforts. the generic tasks of a data steward are to develop rdm infrastructure, policies, and support services. to be able to do this and to further the development of the data stewardship profession itself, the data steward needs to balance these activities with innovation. we argue that a data steward should possess a number of basic skills in the fields of information technology and computer science, next to innovation skills and collaborative skills. in addition to this, the data steward must be given the opportunity, in the form of time and funding, to take part in leading practices activities to develop the data stewardship profession further. this is important for the less affluent institutes as well, in order to build future-proof fair data support. institutes with smaller budgets can, however, start on a small scale by being involved in one single working group of one international leading practices organisation, to advance the support in at least one area of fair.

competing interests

the authors have no competing interests to declare.

author contributions

all authors wrote the paper collectively.

references

pfff. . available at https://www.go-fair.org/ / / /a-three-point-framework-for-fairification/ [last accessed october ].
a terminology for fair stewardship skills workshop. . available at https://terms fairskills.github.io/announcement.html [last accessed july ].
coretrustseal. . available at https://www.coretrustseal.org [last accessed july ].
cruz, m, et al. . policy needs to go hand in hand with practice: the learning and listening approach to data management. data science journal, ( ): . doi: https://doi.org/ . /dsj- -
erdmann, c, et al. .
top fair data & software things. zenodo. doi: https://doi.org/ . /zenodo.
fair data point design specification. . available at https://github.com/fairdatateam/fairdatapoint-spec [last accessed october ].
fair implementation matrix. . available at https://osf.io/n uwp/ [last accessed october ].
fair terminology. . available at https://github.com/terms fairskills/fairterminology [last accessed october ].
imming, m, et al. . fair data advanced use cases: from principles to practice in the netherlands. zenodo. doi: https://doi.org/ . /zenodo.
jointly designing a data fairport. . available at https://www.lorentzcenter.nl/lc/web/ / /info.php ?wsid= [last accessed july ].
leiden linguistics data. . available at http://leiland.ullet.net [last accessed october ].
leiden research data service catalog. . available at https://digitalscholarship.nl/rds/ [last accessed oct ].
magagna, b, et al. . reusable fair implementation profiles as accelerators of fair convergence. osf preprints. doi: https://doi.org/ . /osf.io/ p g
metadata for machines workshops. . available at https://www.go-fair.org/resources/go-fair-workshop-series/metadata-for-machines-workshops/ [last accessed july ].
mushi, ge, et al. . identifying and implementing relevant research data management services for the library at the university of dodoma, tanzania. data science journal, ( ): . doi: https://doi.org/ . /dsj- -
research data management regulations leiden university. . available at https://www.bibliotheek.universiteitleiden.nl/binaries/content/assets/ul ub/research--publish/research-data-management-regulations-leiden-university_def.pdf [last accessed july ].
schultes, e, et al. . available at https://osf.io/sjzc / [last accessed july ].
sustkova, hp, et al. . fair convergence matrix: optimizing the reuse of existing fair-related resources. data intelligence, ( – ): – . doi: https://doi.org/ . /dint_a_
tenopir, c, et al. . research data management services in academic research libraries and perceptions of librarians. library & information science research, ( ): – . doi: https://doi.org/ . /j.lisr. . .
verhaar, p. . available at https://digitalscholarshipleiden.nl/articles/durable-access-to-book-historical-data [last accessed october ].
vodan. . available at https://www.go-fair.org/implementation-networks/overview/vodan/ [last accessed october ].
wilkinson, md, et al. . the fair guiding principles for scientific data management and stewardship. scientific data, : . erratum in: scientific data, ( ): . doi: https://doi.org/ . /sdata. .
wittenburg, p, et al. . the fair funder pilot programme to make it easy for funders to require and for grantees to produce fair data. arxiv. https://arxiv.org/abs/ .

how to cite this article: hettne, km, verhaar, p, schultes, e and sesink, l. . from fair leading practices to fair implementation and back: an inclusive approach to fair at leiden university libraries. data science journal, : , pp. – . doi: https://doi.org/ . /dsj- -

submitted: july accepted: october published: october

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

data science journal is a peer-reviewed open access journal published by ubiquity press.
Digital humanities is text heavy, visualization light, and simulation poor

Erik Malcolm Champion
CIC, AAPI, School of Media Culture and Creative Arts, Curtin University, Australia

Abstract
This article examines the question of whether digital humanities has given too much focus to text over non-text media and provides four major reasons to encourage more non-text-focused research under the umbrella of digital humanities. How could digital humanities engage in more humanities-oriented rhetorical and critical visualization, and not only in the development of scientific visualization and information visualization?

Digital humanities is text? Four arguments

There has long been a debate on what exactly digital humanities is (Cohen et al., ; Terras et al., ). My article puts forward the suggestion that earlier books carry a subtext: that digital humanities is primarily, uniquely, or best viewed as computing services and tools applied to the digitalization and processing of text or literature (Baldwin, ). Such a view would be to the detriment of both text-based and non-text-based scholarly research.

My concern that visualization projects are not often mentioned as being part of the digital humanities might seem a little paranoid; clearly there are presentations on visualizations at digital humanities conferences. However, I am not alone. Svensson ( ) has pointed out the large number of projects that can be described as digital humanities even if they are not textual studies.
Meeks ( ) entitled his provocative article ‘Is digital humanities too text-heavy?’ and observed that at digital humanities conferences ‘a quick look at the abstracts shows how much the analysis of English literature dominates a conference attended by archaeologists, area studies professors and librarians, network scientists, historians, etc.’ Perhaps there are so many text-focussed attendees because they do not feel their digital leanings are appreciated at mainstream academic conferences in their field. Perhaps geographers and archaeologists do not attend en masse because their digital leanings are appreciated in their discipline but publications in digital humanities-specific proceedings and journals are not.

However, there may be another reason. As Meeks himself recounts, early digital humanities books were keen to show a trail of mythical origins in the humanities computing field, and the humanities computing field is itself heavily indebted to text-based research. Hence text-based research historically dominates digital humanities events. As an example, Hockey ( ) wrote the following in her chapter ‘The history of humanities computing’, in one of the first books dedicated to digital humanities (Schreibman et al., ): ‘Applications involving textual sources have taken center stage within the development of humanities computing as defined by its major publications and thus it is inevitable that this essay concentrates on this area’.

Such a move has been recently contested (Robertson a, b), but there does appear to be a text emphasis in many digital humanities research infrastructures. For example, ontologies for directories of digital humanities tools and methods in European projects (such as Digital Research Infrastructure for the Arts and Humanities (DARIAH) and Network for Digital Methods in the Arts and Humanities (NeDiMAH)) and in American or international projects (such as Digital Research Tools (DiRT) Bamboo, currently known as DiRT) are heavily influenced by the ontology of digital humanities as developed at the University of Oxford, following Unsworth ( ).

Correspondence: Erik M. Champion, School of Media Culture and Creative Arts, Faculty of Humanities, Curtin University, GPO Box U Perth, Western Australia , Australia. Email: erik.champion@curtin.edu.au
Digital Scholarship in the Humanities © The Author . Published by Oxford University Press on behalf of EADH. All rights reserved. For permissions, please email: journals.permissions@oup.com. doi: . /llc/fqw
This is a pre-copyedited, author-produced version of an article accepted for publication in Digital Scholarship in the Humanities following peer review. The version of record “Champion, E. . Digital humanities is text heavy, visualization light, and simulation poor. Digital Scholarship in the Humanities. : fqw .” is available online at: http://doi.org/ . /llc/fqw

The University of Oxford definition of digital humanities, at least on their webpage (unpublished), is text based and desk based. Their website (http://digital.humanities.ox.ac.uk/support/whatarethedh.aspx) says that, amongst other new advantages, digital humanities offers ‘new desktop working environments’ and ‘new ways of representing data’.

Yet virtual reality has been involved with the humanities for at least two decades, and closer to three. I was involved in computer-aided design and drafting (CADD), multimedia, and digital reconstructions of artefacts and heritage sites over years ago, and in computer games for over years; others have been involved in this field for much longer. I consider these projects to be in the realm of the humanities.
As an academic area, virtual reality’s intersection with the humanities also measures in the decades. Year celebrates the 22nd conference of Virtual Systems and Multimedia (http://www.vsmm .org/): ‘Virtual Systems and Multimedia (VSMM) has become a bridge between technology, art, culture, history, science and engineering’. VSMM has had a virtual heritage element for almost all of its years. The Silicon Graphics International Corp (SGI) Virtual Reality Modelling Language model of Tenochtitlan is from , and Dudley Castle in England featured a ‘virtual reality tour’ from around . On a more personal note, I experienced the joys (and usability issues) of a virtual reality environment (head-mounted display with CyberGlove) at the start of , and I was certainly not the first participant.

This leads me to argue that there are at least four reasons to be concerned with any idea that digital humanities are being perceived as primarily text based (and in particular not related to visualization). I will argue that there is not always a clear separation between written language and images; that to be a humanist or a humanistic scholar (not the same thing) we do not always have to have high levels of literacy; that non-text-based media can be part of digital humanities because they are actually part of the humanities; and that visualization-incorporating media can provide suitable scholarly arguments.

Written language and images

Historically, the distinction between text and symbol has been blurred, from cave paintings through early European and Asian languages and as part of world history in general. Recent research suggests that caves were painted where the spaces were most reverberant; they are not only visual art forms but also reverberation chambers, and possibly the more resonant spaces were seen as more spiritual. Regardless of the original reason, this is evidence of the early symbiotic relationship between space, sound, and image (Viegas, ; Brown, ).
Writing discovered in China that has been dated 5,000 years old also reveals the early mixed origins of image and text. Tang ( ) noted the ‘primitive writing . . . [lies] . . . somewhere between symbols and words’. This language is created when five or six of the symbols are combined; they are no longer symbols but words.

Literature is also inextricably linked to rhythm and movement. Politics and the brainwashing effect of nationalistic marches are related to an understanding of movement (Turner and Pöppel, ); musical appreciation is heavily affected by both our mammalian heritage (Panksepp and Bernatzky, ) and by the body in space (Sacks, ; Thomas, ). Even today, language appears to be geographically influenced; one paper reveals that prepositions in parts of Spain appear to depend on the geographical terrain and that the local speakers are unaware of this (Mark et al., ).

If history is only that which has been written, then many cultures are excluded. Oral heritage has proven that cultural heritage does not have to be written down to be considered part of the humanities. Worryingly, the scholarly field of history has a popularity challenge: a survey of the American public revealed they were engaged by the notion of the ‘past’, but repelled by the word ‘history’ (Rosenzweig and Thelen, ).
Visualization literacy

In their book Digital Humanities in Practice (Warwick et al., ) and on the related blog (Warwick, unpublished), Warwick, Terras, and Nyhan have decried the lack of public dissemination of digital humanities projects, and a lack of public accessibility was also pointed out by Kirschenbaum ( ). To improve public access to digitalized material we also need to tackle the problem of literacy, digital literacy, and digital fluency (Resnick, ). Multimedia, visualizations, and sensory interfaces can communicate across a wider swathe of the world’s population. Although literacy is increasing, technology is further wedging a fundamental divide between those who can read and write and those who cannot (UNESCO, ).

There also seems to be a need for visualization literacy: the public appear to be far more easily convinced by visualizations than by reading text. The implication is that their level of visualization literacy is not as discerning (Pandey et al., ).

Visualization is part of the humanities

Visualization is an extremely significant aspect of digital humanities, and writers such as Burdick et al. ( , pp. – ) agree. Literature itself is linked to both the image (Theibault, ) and materiality (Rudy, ); the materiality of Icelandic sagas and runic inscriptions is considered by various scholars to be an essential property (Jesch, ). Archives are not just text, and the digital humanities are collaborative and interwoven.

Even the book itself is a material, embodied experience. The University of Dundee’s Poetry Beyond Text project group’s research is further evidence of the importance of image to the literary (University of Dundee, ): ‘The CRs [co-researchers] rated works in which they felt the text and image mutually enhanced one another more highly than works which they felt were “fragmented” or disjunctive’. The humanities are not merely multimodal but also embodied experiences.
The objects in and on which the humanities are described, critiqued, and preserved are more than just holders for text; they are essential artefacts, which give researchers essential clues in the interpretation of text and author. Material objects are not merely brute objects; they are symbolic as well, inscribed into the lived and symbolic world (McDonald and Veth, ).

Visualization as scholarly argument

Where is visualization as a research tool in its own right? Can visualization not actually create new research questions? Jessop ( ) has argued that digital visualization is more than just an illustration; it is a scholarly methodology. Visualization is promoted at Stanford University’s digital humanities workshops as both a tool and an argument (Robichaud and Blevins, ). Visualization workshops are increasingly popular fixtures at digital humanities workshops (Milner, ) and conferences (Weingart , ), and some recent conference papers even promote the use of ‘persuasive visualizations’ (Hann, ). Archival organizations now offer tools to help humanities scholars visualize new research questions: ‘by replacing information with image, we can often see a different story hidden in the data’ (Tocewicz, ).

Research by van den Braak et al. ( ) indicated that some studies show improvement from argument visualization tools. However, the challenge of adapting visualizations to the strategies of the humanities is not always clear-cut, especially given that visualizations in the humanities tend to cover as many interpretations as possible (Sinclair et al., ).

Various scholars have argued that visualization can be reflective and critical (Dörk et al., ; Jessop, ; Robichaud and Blevins, ), but there is an important problem that is critical to my field of research, virtual heritage, and, I believe, of great interest to digital humanities in general. I am speaking here of the distinction between the model and the simulation.
Simulations are not simply models

I am trained as an architect, and so I probably define the word ‘model’ differently to an archaeologist, a computer scientist, or a fashion designer. I am, however, finding myself more and more influenced by the archaeological distinction between model and simulation because it has also revealed to me an important issue in my own field of research, virtual heritage. It makes more sense to see the model as a physical or digital representation of a product or process, while a simulation is the reconfigurative use of a model to reveal new and potential aspects of that model. So a model can reveal or explain current states of a system, but a simulation can reveal new and hitherto unimagined potential states and possibilities of a system. A model of the weather is not the same as a simulation engine that finds out what the weather might be like tomorrow. This distinction between model and simulation is important when we wish to understand process rather than merely an end product.

I employ games, game engines, and virtual reality to create virtual heritage projects (virtual reality in the service of cultural heritage). The most famous charter dedicated to best practices in virtual heritage is the London Charter (Denard, , p. ), which defines ‘computer-based visualization’ as ‘the process of representing information visually with the aid of computer technologies’. It may seem that virtual heritage is simply the recreation of what used to be there. Yet, what used to be ‘there’ was more than a collection of objects.
Those objects had specific meaning to the cultural perceptions of the site’s traditional inhabitants. Reproducing the artefacts is not enough, for we must also convey the importance of that cultural heritage to the public. And here lies the dilemma of space and time: a culture may no longer exist, the artefacts may have moved and been dispersed, our understanding of either the site or its owners could be conflicted, and our interpretations of both may have dramatically changed or never have been agreed upon. These considerations lead me to suggest an alternative definition: ‘Virtual heritage is the attempt to convey not just the appearance but also the meaning and significance of cultural artefacts and the associated social agency that designed and used them, through the use of interactive and immersive digital media’.

This alternative definition of virtual heritage is directly involved in the issue of simulation versus model. In many archaeological texts (Bentley et al., ; Costopoulos, ; Lake, ; Molyneaux, ; Rahtz and Reilly, ; Winsberg, ; Wurzer et al., ) there is a notion of a simulation as being like a model, but a less restricted model, because the aim is to understand the processes rather than view an abstracted or simplified representation (a model, in other words). So a simulation is concerned with creating just enough modelling so that the ways in which components interact can be studied (and experienced) both spatially and temporally. Winsberg in particular gave a good explanation: ‘Successful simulation studies do more than compute numbers. They make use of a variety of techniques to draw inferences from these numbers. Simulations make creative use of calculational techniques that can only be motivated extra-mathematically and extra-theoretically.’

As an example, I would like to proffer the research opportunities of game design.
Games may be defined as systems of rules, but the rules that people follow, break, and create are not the algorithms in the software, and the way in which people interact with each other is far more than a pre-scripted system of rules. Games are simulations in the sense that they allow both players and spectators to examine how behaviours change and reveal themselves over time (behaviours here can be in the simulated environment or be expressed by the human actors). Thanks to game templates and frameworks, there are many technological options to explore human issues and values over time without having to immerse oneself in years of programming.

Archaeologists such as Wattrell ( ) can see the potential of games for engaging the public, ‘a no brainer of mythical proportions’, but stress that they also require games and virtual environments to ‘provide the vital intellectual context of that information, exploring how and why archaeologists and Egyptologists reached the conclusions they did about a given site, individual, historic event, cultural practice, etc.’ Meyers ( ) reminds us that it is ‘necessary for students to know how this highly contested knowledge is constructed’. Graham ( ) declares, ‘Let the students do it. . .the learning in doing’. Other archaeology academics have also told me of the unexpected but delightful learning benefits they and their students discovered when trying to simulate archaeological environments inside game engines. For example, the Fort Ross historical game project in Unity had input from historians, staff, and students (Lercari et al., ).

Some have noted that games research has not been met with much approval and encouragement even in the digital humanities.
Jones ( ) commented, ‘My own interest in games met with resistance from some anonymous peer reviewers for the program for the DH conference, for example . . . [yet] . . . computer-based video games embody procedures and structures that speak to the fundamental concerns of the digital humanities’.

The distinctive and, dare I say it, revolutionary power of games to afford the player the ability to test and develop their own theories is perhaps best but paradoxically exemplified by the attempts of traditional scholars to mould the simulation-rich possibilities of games into a system of rules, a model if you like. Jeremy Antley provided an example in his article ‘Going beyond the textual in history’:

    To put it on even simpler terms – the main objection the authors have with current gamic modes is that they produce history for consumers, while the authors would much rather produce history for producers. This approach, currently, is endemic in the historical discipline because historians, by and large, are used to being both the producers and consumers of their own product . . . Textual modes focus on producing knowledge through reading, while gamic modes focus on producing knowledge through play.

Yet, historical understanding does not have to be passively received. In Norway and Italy a virtual reality project was designed to engage students in the area of Renaissance science and travel diaries (Carrozzino et al., ). The project team wished to explore information technology (IT) in museum education, particularly to see how historic manuscripts from the th and th centuries could convey knowledge through interactivity, without damaging the originals. They created an augmented 3D book, where objects appear to pop out of the page, an ‘information landscape’, and a virtual reality (VR) display so participants could view and share a digital simulation of the books.
The relevant aspect to this discussion is that the project did not stop at digital displays; the participants perform experiments in the real world after visiting the digital environments.

My own area of research is more to do with the simulation of built history and interactive heritage (Champion, a), but even here I have found that students learn even more from designing and play-testing their own and others’ game engines than they learn simply as players. Games should not only be seen as products but also as processes. Games have the ability to synthesize narrative, conjecture, computer-generated objects, contextually constrained goals, real-time dynamic data, and user-based feedback (Mateas and Stern, ).

For example, I have explored the action and role-playing game ‘Elder Scrolls V: Skyrim’ to see if new ways of interacting with literature could be designed inside the game engine (Champion, b). Skyrim mods can potentially allow scholars to create and insert their own stories, voice-overs, and movies into books. More interestingly though, the mod editor of this game allows designers to create their own adventures predicated on the player’s interaction with books as interactive artefacts. I could, for instance, create a game level where the player has to determine which characters are authors by judging their behaviours in comparison to the writing style found in books discovered in the game. Or possibly the players could be transformed into different characters, but are not able to see themselves or their identities, and must discover what sort of character they are from information found in books or in the game level or from conversations with the non-playing characters in the game.

Through this interactive richness, rather than through a high-tech ability to reproduce elements of the real world, people can both learn and enjoy alterity (experience of the ‘other’). In a virtual heritage environment, the more one can master local cultural behaviour, the more one can understand significant events from the local cultural perspective. Mastery of dialogue and artefact use, as viewed from a local cultural perspective, may lead to enhanced cultural immersion. It may consequently lead to a heightened sense of engagement. On the other hand, the interactive nature of the simulated environment allows us to create questioning rhetorical affordances that are either encountered dramatically and abruptly, forcing the player to confront their subconscious or desensitized default behaviours, or absorbed slowly during game-time, evoking questions only after post-game reflection.

This critical approach can be used in game mods (Champion, ), but it can also be employed in machinima (game engine cameras used to create pre-rendered video); it does not have to be employed solely in real-time computer games. So, while game design and machinima production are not typically seen as part of digital humanities, they are interesting vehicles for fostering and examining community feedback, cultural issues, critical reflection, and medium-specific techniques (such as procedural rhetoric). Machinima in particular is an excellent vehicle to engage and then confront automatic player behaviours and assumptions (Champion, ).

Conclusion

Visualization projects leverage and incorporate text, they have been taught for centuries as humanities disciplines, and they can present and project interesting and provocative questions of immediate interest to humanities scholars; these projects also function in ways beyond the traditional act of reading. Visualization employs research in the traditional humanities, converts information communication technology (ICT) people to humanities research (sometimes), and, in the above examples, helps preserve and communicate cultural heritage and cultural significance through alterity, cultural constraints, and counterfactual imaginings.

Despite some strict definitions of the terms, history and heritage are not always literature! And the digital humanities audience is not always literature-focused or interested in traditional forms of literacy. Down through the ages, text has not lived in a hermetically sealed hermeneutic well all by itself. A world with literature but without the arts is intellectually and experientially impoverished. Critical thinking and critical literacy extend beyond the reading and writing of text. Visualization can make scholarly arguments. Therefore, non-text-based research should figure more prominently in digital humanities readers and monographs.

References

Baldwin, S. ( ). The idiocy of the digital literary (and what does it have to do with digital humanities)? Digital Humanities Quarterly, ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html.
Bentley, R. A., Maschner, H. D., and Chippindale, C. ( ). Handbook of Archaeological Theories. Lanham: Rowman & Littlefield.
Brown, A. S. ( ). From caves to Stonehenge, ancient peoples painted with sound. Inside Science. Inside Science News Service website: http://www.insidescience.org/content/caves-stonehenge-ancient-peoples-painted-sound/ .
Burdick, A., Drucker, J., Lunenfeld, P., Presner, T., and Schnapp, J. ( ). A short guide to the digital humanities. Digital Humanities. Cambridge, MA: MIT Press.
Carrozzino, M., Evangelista, C., Bergamasco, M., Belli, M., and Angeletaki, A. ( ). Information landscapes for the communication of ancient manuscripts heritage.
Paper presented at the Digital Heritage International Congress (Digital Heritage), Marseille, France, October- November, .
Champion, E. ( ). Undefining machinima. In Lowood, H., and Nitsche, M. (eds), The Machinima Reader. Cambridge, MA: MIT Press, pp. – .
Champion, E. (ed.) ( ). Game Mods: Design, Theory and Criticism. Pittsburgh, United States: Entertainment Technology Centre Press.
Champion, E. ( a). Critical Gaming: Interactive History and Virtual Heritage. In Evans, D. (ed.). Farnham, England: Ashgate Publishing.
Champion, E. ( b). Ludic literature: evaluating Skyrim for humanities modding. Paper presented at the Digital Humanities Congress . Studies in the Digital Humanities, Sheffield, - September .
Cohen, D. J., Frabetti, F., Buzzetti, D., and Rodriguez-Velasco, J. D. ( ). Defining the digital humanities. http://academiccommons.columbia.edu/item/ac: .
Costopoulos, A. ( ). Simulating society. Handbook of Archaeological Theories. London: AltaMira Press, pp. – .
Denard, H. ( ). The London Charter for the computer-based visualisation of cultural heritage. London. http://www.londoncharter.org/.
Dörk, M., Collins, C., Feng, P., and Carpendale, S. ( ). Critical InfoVis: exploring the politics of visualization. Paper presented at the CHI Extended Abstracts, Paris, France.
Graham, S. ( ). My glorious failure. http://www.playthepast.org/?p= .
Hann, R. ( ). Visualized arguments; or how to pierce the persuasive visualization and other arguments. Paper presented at the EVA London Conference, - July , London. http://www.bcs.org/upload/pdf/ewic_eva _paper .pdf.
Hockey, S. ( ). The history of humanities computing. In Schreibman, S., Siemens, R. and Unsworth, J. (eds), A Companion to Digital Humanities. Oxford: Blackwell.
Jesch, J. ( ). Runes and words: runology in a lexicographical context (plenary paper), in Proceedings of the Seventh International Symposium on Runes and Runic Inscriptions, ‘Runes in Context’. Futhark, International Journal of Runic Studies, , – . https://www.academia.edu/ /runes_and_words_runic_lexicography_in_context.
Jessop, M. ( ). Digital visualization as a scholarly activity. Literary and Linguistic Computing, ( ): .
Jones, S. E. ( ). The Emergence of the Digital Humanities. New York, NY: Routledge.
Kirschenbaum, M. G. ( ). Digital humanities. http://www.academicroom.com/topics/what-is-digital-humanities
Lake, M. W. ( ). Trends in archaeological simulation. Journal of Archaeological Method and Theory, ( ): – . http://hesp.irmacs.sfu.ca/sites/hesp.irmacs.sfu.ca/files/lake_ .pdf.
Lercari, N., Forte, M., and Onsurez, L. ( ). Multimodal reconstruction of landscape in serious games for heritage, an insight on the creation of Fort Ross Virtual Warehouse serious game. Paper presented at the Digital Heritage International Congress, Marseille, France, October- November, .
Mark, D. M., Gould, M. D., and Nunes, J. ( ). Spatial language and geographic information systems: cross-linguistic issues (El lenguaje espacial y los sistemas de información geograficos: temas interlinguisticos). Paper presented at the Conferencia Latinoamericana sobre el Technologia de los Sistemas de Información Geográficos (SIG), Venezuela.
Mateas, M., and Stern, A. ( ). Façade: an experiment in building a fully-realized interactive drama. Paper presented at the Game Developers Conference, - March, San Jose, CA.
McDonald, J., and Veth, P. ( ). The archaeology of memory: the recursive relationship of Martu rock art and place. Anthropological Forum, ( ): – . doi: . / . . .
Meeks, E. ( ). Is digital humanities too text-heavy? https://dhs.stanford.edu/spatial-humanities/is-digital-humanities-too-text-heavy/.
Meyers, K. ( ). Dealing with multiple narratives of truth and creating meaningful play. http://www.playthepast.org/?s=dealing+with+multiple+narratives&x= &y= .
Milner, M. ( ). Visualization workshop. http://digihum.mcgill.ca/blog/ / / /visualization-workshop/.
Molyneaux, B. ( ). From virtuality to actuality: the archaeological site simulation environment. In Archaeology and the Information Age: A Global Perspective. London: Routledge, p. .
Pandey, A. V., Manivannan, A., Nov, O., Satterthwaite, M., and Bertini, E. ( ). The persuasive power of data visualization. Transactions on Visualization and Computer Graphics, IEEE, ( ), – .
Panksepp, J., and Bernatzky, G. ( ). Emotional sounds and the brain: the neuro-affective foundations of musical appreciation. Behavioural Processes, , – . http://www.musikament.at/b /panksepp% bernatzky.pdf.
Rahtz, S., and Reilly, P. ( ). Archaeology and the Information Age. London: Routledge.
Reilly, P., and Rahtz, S. (eds) ( ). Archaeology and the Information Age: A Global Perspective. London: Routledge.
Resnick, M. ( ). Rethinking learning in the digital age. The Global Information Technology Report: Readiness for the Networked World. Oxford University Press.
Robertson, S. ( a). CHNM’s histories: digital history & teaching history. http://drstephenrobertson.com/blog-post/digital-history-teaching-history/.
Robertson, S. ( b). The differences between digital history and digital humanities. http://drstephenrobertson.com/blog-post/the-differences-between-digital-history-and-digital-humanities/.
using blogs as communication tools for the architecture design studio
procedia - social and behavioral sciences ( ) –
available online at www.sciencedirect.com
© published by elsevier ltd. this is an open access article under the cc by-nc-nd license (http://creativecommons.org/licenses/by-nc-nd/ . /). selection and peer-review under responsibility of the organizing committee of wces
doi: . /j.sbspro. . .
maja baldea*, alexandra maier, oana a. simionescu
faculty of architecture and urbanism, “politehnica” university timisoara, str. traian lalescu nr. , timisoara , romania
* maja baldea. e-mail address: maja.baldea@student.upt.ro

abstract
architecture teaching in romania is aligning itself with european trends, partly by starting to use blogs as communication tools. this research focuses on a set of dedicated blogs for the architecture design studio of three different study years, developed at the faculty of architecture and urbanism of timisoara between / , and on their evolution. the activity and educational accomplishments of the blogs are analyzed through theoretical discussions on using blogs in teaching, followed by a comparative study of feedback from students and teachers. the conclusions suggest that the number of blogs should be expanded, that different hosting platforms allowing greater dynamism should be used, and that a genuine dialogue should be nurtured between teachers and students.

keywords: blog, teaching, education, architecture, design studio

. introduction
the evolution of web . has played a tremendous role in transforming teaching and its technologies, and its services have contributed to a large extent to the development of current higher education (grosseck, ). current architecture teaching across europe relies heavily on web communication in the teaching process, transforming the way in which traditional programs are taught. in this context, educational approaches in architecture in romania are still in an early phase compared to western europe, and schools here should strive to bring communication closer to contemporary teaching trends. the main role in implementing the new technologies belongs to the teachers involved, who have to assume a new attitude towards teaching, both more open towards sharing information and more responsible regarding the information offered (grosseck, ).
architecture disciplines nowadays, and the design studio in particular, have a high degree of specificity with regard to the teaching process. contemporary architecture school curricula include a wide range of working methods and directions, resulting in theoretical and physical products that show the degree of preparation of young architects (di batista, ), and implementing web . technologies in teaching architecture needs to take into account the specific nature of the working methods involved.
this research focuses on the way in which blogs are used as a teaching tool in higher education, in relation to the activity of the architecture design studio of the faculty of architecture and urbanism of timisoara, romania. the introductory part of the study discusses in a broad sense the concepts and contexts of education and the use of new media in contemporary teaching, while the second part analyzes a set of blogs implemented simultaneously, following their development throughout the course of a whole academic year. by comparing different educational situations, a better understanding of the teacher’s roles as well as of the requirements for flexible communication has been attained.

. a theoretical approach on the use of web based communication in teaching
traditional academic communication is decisive in the educational processes.
it is based on the interpersonal relationship established between teacher and student and aims to transmit not only precise information, but also knowledge exceeding the strict meaning of the message sent. in order to define the specific characteristics of scholarly communication, one must start from the modern concepts of communication. new visions regarding teaching methodologies derive from new theories of communication that imply various ways of knowing, learning and transmitting information using nonconformist communication.

. . academic communication. concepts
modern research has determined a set of principles defined as axioms of communication, partially overlapping the specific features of pedagogic discourse (watzlawick, bavelas, & jackson, ). a review of these axioms is useful for insight into the possible nuances of new scholarly communication channels. these axioms state that: communication is inevitable, meaning that in the given institutionalized context the teacher must be aware of his ability to communicate; every communication has a content and a relational aspect, meaning that communication represents both the transfer of information and the relationship between those who communicate; communication is a continuous process in which the partners are involved in a chain of exchanges of action/reaction and stimulus/response; communication takes either a digital (verbal, concrete meanings) or an analogical (non-verbal, representative or referential) form; communication is either symmetrical or complementary, based on equality or difference, one of the interlocutors being designated a priori (watzlawick et al., ).
two additional principles could be added: that communication is irreversible, producing an effect on the receiver that cannot be undone retroactively, which demands fast and lucid decision-making skills from teachers (salavastru, ); and that communication involves processes of adjustment and adaptation, meaning that the message makes sense only in the light of the previous life and linguistic experience of each individual (parvu, ). to conclude, everything from body language to the relationship between the teacher and his audience defines the act of academic communication.

. . academic communication in the digital age. context
the digital age offers several interactive tools through which one can develop knowledge and relationships. most of them are used daily by students and have a great impact upon their lives. in this context, blogs appear as a tool with great potential within the educational process, because students are supposedly accustomed to this way of processing information. this raises a few questions for those who lead the educational process. how do you use the available technology to extend your time and space? how do you communicate with the outside world through cybernetic technology? are you a generator of information? are you willing to exchange information with others? what would be the results? can you successfully generate, store and share information?
digital technologies of communication have broken the paradigms of the industrial society and brought in other communication channels, affecting our daily lives and becoming more visible in education too. there are visible changes in the way we learn and absorb information, since the instant messaging and social networks of the present provide a fragmented communication, while reality is built on a kaleidoscopic model of dynamic, discrete and multiple stimuli of a short-term nature.
contemporary technological trends shift communication towards becoming sensory, multidirectional and non-linear (moran, ). new technology is both shaping human behavior and changing scholarly practice. in this context, alternatives to the traditional educational system appear, which can be incorporated into daily educational processes, relying on collaborative learning, connectivity and mobility. m. weller closely studied the concept of “digital scholarship” (weller, ) and its implications for scholarly practices. the democratization of online space, as weller notes, “opens up scholarship for a much wider group”, as well as subjects beyond the institutionally established curriculum (weller, , para. ). weller observes how scholarship has changed due to the extreme openness of higher education information brought about by new technologies, the main drivers of this process being the use of digitization as a general means of transposing scholarly work into digital media and the distribution of that digital content via the global network (weller, ). weller believes that those precise characteristics of the online learning environment represent “the means by which higher education comes to understand the requirements and changes in society, and thus the route by which it maintains its relevance to society” (weller, ), while blogs are considered to be “the epitome of the type of technology that can lead to rapid innovation” ( , para. ).

. a comparative research on the use of blogs of the design studio
beginning with the academic year / , three parallel blogs were created in order to ease communication with students involved in the design studio. they were implemented simultaneously in order to overcome deficiencies in communication perceived in the previous activity ( st year blog, ), ( nd year blog, ), ( th year blog, ). the primary need came from the specific character of the design studio curriculum, which consists of studio-based design education.
during design studio classes, students individually develop their projects through fundamental methods such as drawing and model making, guided by teachers mostly in a one-to-one relationship. the blogs were intended to fill in gaps in direct communication during studio classes, as a supplementary communication and information source. they were used to display design tasks, documentation sources and theoretical supports, and to communicate informative notices on the studio schedule such as approaching deadlines, workshop materials needed or specific events.

. . the situation before using blogs
the previous communication channels used in the design studio were direct communication, mainly verbal, addressing all students in the class, supplemented by indirect electronic communication in the form of e-mail messages sent to a small group of student representatives, whose mission was to pass the information on to their classmates via class group mails. with the number of students of each study year ranging between and , the former communication had clear disadvantages and limitations (table ), constituting the main reason for implementing blogs. direct communication was used to transmit general announcements to all students of each year, who had to be gathered together in one single space, followed by the division of students into workgroups, where the basic information was further interpreted. electronic communication could only be used as a supplement to direct communication and was not used often, since the reception of the message by students was cumbersome.

table . advantages and disadvantages of the direct and electronic communication methods before using blogs.
direct communication
advantages: direct teacher-student interaction that allows swift verification of hypotheses. allows the transmission of an idea, with input from the entire teaching group.
disadvantages: involves a large number of students; not everyone can listen from a correct ergonomic position. the class spaces are not right for this type of interaction, being designed to hold a smaller number of students.
electronic communication
advantages: allows high-speed transmission of general information to all students. supplements information.
disadvantages: use of intermediaries in transmission: the message reaches the larger group through some representatives. it is difficult to transmit differentiated messages for a particular group (individual workgroup). the message has little visibility due to the amalgamated character of information in each student’s mailing list.

. . using blogs
the concurrently implemented blogs had similar graphic and content organization schemes, but later developed personalized representations. although visually appealing and important in distinguishing between blogs, the graphical schemes are less important than their actual content. the major differences in the way blogs are used are based on the content they carry and on the way in which information is communicated. the differences in managing content and in the way of communicating information (table ) derive from the type of communication that each teaching team follows, as well as from the previous experience of those responsible for blog postings in communicating via internet channels.

table . differences in the content of blogs of the st, nd and th year of study.
st year: students / teachers / workgroups
content particularities: the only blog that contains a theoretical support for the design studio, based on the fact that the dean of the st year also teaches the architecture theory course. the only year that runs a separate facebook page for its activities, where more dynamic and informal information is shown, so that the blog may hold only formal information. a sharp organization of the blog, carrying out general communication with all students.
nd year: students / teachers / workgroups
content particularities: a blog that holds both general communication on the central newsfeed page, concerning all students, and separate categories for each workgroup, targeting a differentiated working approach within groups.
th year: students / teachers / workgroups
content particularities: a blog pursuing a newspaper-like communication, with information of an amalgamated character on the main page, containing main and secondary information organized exclusively by time-line ordering.

. results
. . student feedback
the student feedback can be discussed either through blog access statistics, which fail to show how many of the people accessing the blog are students of that year, or through organized inquiries. only the nd year teaching staff requested a questionnaire survey on the whole activity of the / academic year. it showed that the blog was perceived positively, receiving . points out of for efficient communication, while teachers received . points for communicating in the blog’s group pages. % of the respondents considered that they had been exhaustively informed by the blog, while % considered the published references useful. still, the students’ general response to blogs was weak, as demonstrated by direct comments or by informal feedback provided during the classes. only a very small number expressed their opinions or interacted actively with the current content of the blogs.

. . teacher feedback
the teachers’ feedback (table ) reflects each year’s specific needs of communicating content. the st year includes academic courses as support, while the other two use the blogs mainly to inform students on the scholarly process, to indicate further study references or to deliver notices.
a critical aspect in understanding how they work is the fact that they weren’t designed as freestanding education sources, but are closely linked to the design studio activities of each year.

table . a comparison of the use of blogs at the st, nd and th year from the teacher’s point of view.
st year: students / teachers / workgroups
advantages: communicating through the blog has shortened the path of information from teachers to students. communication and dissemination of information in digital format is essential, also providing storage for information to which one can always return. it is a convenient information process.
disadvantages: the architecture theory course presented on the blog tends to be overlooked. teachers presume that all students read everything written on the blog, but in reality this is not the case. some students don’t have internet access.
nd year: students / teachers / workgroups
advantages: allows the pursuit of information by everybody involved in the teaching process and the return. fast communication, lacking redundancies. differentiated announcements for each workgroup.
disadvantages: only some students are interested in accessing information within the sub-menus. students don’t interact with the blog’s content.
th year: students / teachers / workgroups
advantages: providing a dynamic platform for discussions (obtaining feedback on the project theme through links posted by students on the blog). greater interest and better focus on the project theme and on the guidance offered by teachers. the traceability of the activity of the semester.
disadvantages: the discussions (on the content) took place only during the workshop, but this is understandable since there are two weekly workshops that facilitate direct encounter. the blog failed to become a discussion platform, providing only mutual information.

. . discussions
by comparing the distinct blogs, the primary finding is that each teaching group uses its own blog according to its specific communication needs. also, there are several distinctions in the way in which teachers transmit content via the blogs. comparing the opinions of each teaching group, a positive assessment stands out in relation to the noticeable improvement of communication towards the students, while at the same time a weak response from the students is commonly noted in relation to a dynamic electronic medium with which they are supposedly acquainted. a common result, initially unintended but revealed by the teachers’ feedback, is that higher levels of group identity and social cohesion have been achieved compared to previous years. apparently this happened through the publication of images of the teaching process and the increased involvement of everybody in the group.
since the issue of weak student interaction remains, we propose several hypotheses. on the one hand, the institutionalization of educational communication gives the professor a special status in the relationship with students, possibly causing inhibitions in a symmetrical communication. also, the low number of comments may show that students feel responsible for making qualitative and relevant comments, which can inhibit much of the possible interaction. participating through comments on the content of blogs occurs more frequently with the th year students, partially because professional opinion is built over time, and the students in the lower study years don’t have sufficient professional knowledge to be able to validate their views. the latter assimilate information at a more intuitive level, and this aspect is taken into account in the communication strategy of the blogs of the st and nd year.
the only blog that changed in the following academic year as a direct result of last year’s experience is the blog of the nd year ( nd year blog / , ), while the teachers of the th year also created a blog for the design studio. the nd year blog currently uses a hosting platform that supports class blogging, forums and individual student blogs. the individual blogs should be used by students to present individual work during the semester. apparently, students tend to upload work only when requested and fail to recognize their blog as a personal tool of representation. although it is too early to discuss the implications of the new blog platform, it is clear that a general inhibition of students in using blogs does exist, even if those students experienced blog communication in the previous year too, during the st year. we think that this inhibition cannot be broken by the use of new media alone, but only through the active involvement of teachers in the entire scholarly process.

. conclusions
the conducted research shows that the use of blogs did improve general scholarly communication, attesting blogs as successful instruments in supplementing information. improved communication has been confirmed by both teachers and students, when compared to prior teaching and learning experiences. still, the current phase of their implementation only represents an incipient form of embracing new media in the educational process. it is clear that, in relation to the studio character of design education as it is currently carried out, blogs can only be used as complementary channels to direct teaching and therefore play a limited role. in order to achieve a greater academic impact through the blogs’ content, future recommendations are to widen the current functions of blogs and to adapt the character of communication to the specific needs of each year’s students, according to their age and interests.
the key to achieving these recommendations is a greater involvement of both students and teachers in the blogging process. teachers should integrate the blogging process in their wider scholarly activity, since blogging and teaching have grown to be functions of the same scholarly practice.

references

st year blog. ( ). retrieved october , from http://arhitectura tm.wordpress.com/
nd year blog. ( ). retrieved october , from http://arhitectura tm.wordpress.com/
nd year blog, / . ( ). retrieved december , from http://arhitectura tm.edublogs.org/
th year blog. ( ). retrieved october , from http://arhitectura tm.wordpress.com/
di batista, n. ( , january). studying architecture and design in europe today. in europe’s top schools of architecture and design . retrieved from http://digitaledition.domusweb.it/domus/books/ domus/#/ /
grosseck, g. ( ). to use or not to use web . in higher education? procedia - social and behavioral sciences, ( ), - . doi: . /j.sbspro. . .
moran, t. p. ( ). introduction to the history of communication: evolutions & revolutions. new york: peter lang.
parvu, i. ( ). filosofia comunicării [the philosophy of communication]. bucurești, românia: s.n.s.p.a.
salavastru, d. ( ). psihologia educației [the psychology of education]. retrieved from http://www.bueckergmbh.de/luci/files/books/dorina-salavastru-psihologia-educatiei.pdf
watzlawick, p., bavelas, j. b., & jackson, d. d. ( ). pragmatics of human communication: a study of interactional patterns, pathologies, and paradoxes. new york: norton.
weller, m. ( ). using learning environments as a metaphor for educational change. retrieved from http://nogoodreason.typepad.co.uk/welleronthehorizon.pdf
weller, m. ( ). digital, networked and open. in the digital scholar: how technology is transforming scholarly practice. retrieved from bloomsbury academic database.
the academic ebook ecosystem reinvigorated: a perspective from the usa

research article (wileyonlinelibrary.com) doi: . /leap.
received: june | accepted: june

charles watkinson
university of michigan, ann arbor, mi, , usa
orcid: - - -
e-mail: watkinc@umich.edu

abstract

the development of infrastructure to support new forms of long-form digital scholarship that go ‘beyond the ebook’ has been an active area of humanities publishing over the last years. proactive philanthropic support for this work has energized the us non-profit publishing community, especially university presses and library-based publishers. this article describes the various strands of work that are ongoing and identifies some common themes: an emphasis on shared values; a focus on building an ecosystem of interoperable platforms and tools; and engagement with the challenges facing new-form digital publications (especially preservation, discovery, and accessibility). this article also considers how publishers who are looking for new platforms and processes can navigate the variety of options now on offer.

introduction

frequent news stories about the persistence of print and a stable and possibly shrinking market for direct-to-consumer ebooks obscure the significant digital transformation that monographic publishing is undergoing. over the last years speculative discussions of what it would mean to go ‘beyond the book’ such as the academic book of the future initiative (lyons & rayner, ) have morphed into pilot and prototype projects. these in turn are now entering production in the form of new workflows, new production tools, and a proliferation of technology platforms. the implementation of new infrastructure is happening across the world and in all sectors of academic book publishing.
in europe, for example, the hirmeos project (bertino, ) is taking a systematic approach to developing common standards and linkages between platforms. in the commercial world, bloomsbury’s turn towards a digital strategy focused on transforming academic book content features dramatically enhanced ebook-based products like the bloomsbury architecture library, screen studies, and applied visual arts library (bloomsbury.com, n.d.). in the uk, the proliferation of new university and academic-led presses is leading to new initiatives around shared infrastructure and open access business models (adema, stone, & keene, ). while recognizing the broad spread of activity, this article focuses narrowly on an overview of the current state of innovation among non-profit humanities book publishers in the usa: a space where focused philanthropic funding has created an environment of unprecedented change and constructive upheaval.

the role of the andrew w. mellon foundation

the national endowment for the humanities (neh) and the institute of museum and library services (imls) are both us government agencies with a strong interest in scholarly communication in the humanities. but while these agencies and several other private funders selectively support publishing initiatives, the dominant force in transforming humanities publishing in the us is undoubtedly the andrew w. mellon foundation through its long-running scholarly communication programme. not only does the level of mellon support dwarf other financial investments, but the programme officers are very proactive in working with potential and current grantees to shape and improve their programmes. mellon also partners strategically with other funders. for example, the foundation has collaborated with neh to create the humanities open book program under the auspices of which , out-of-print humanities books have been made available to the public as free ebooks, often with additional digital affordances (hindley, ).

learned publishing ; : – www.learned-publishing.org © the author(s). learned publishing © alpsp.

in , mellon announced a substantial new programme of grantmaking focused on academic ebooks. while some seed funding had already been granted by the time a broader request for proposals was issued, most university presses first heard about the funding programme dedicated to a rethinking of the scholarly book in a may e-mail from senior programme officer for scholarly communications, don waters. this called on scholars and publishers to ‘develop and experiment with ways to produce, disseminate, and make easily discoverable high quality digital works of long-form interpretive scholarship, including monographs, that interact effectively with related materials on the web as well as with online readers’. the particular focus of the funding, the e-mail from waters explained, was to enable ‘university presses to collaborate with each other and with other organizations to develop shared capacity and infrastructure in one or more of the following areas of long-form digital publishing for the humanities: (a) editing; (b) clearing rights to images and multimedia content; (c) the interaction of the publication on the web with primary sources and other related materials; (d) production; (e) pre- and post-publication peer review; (f) marketing; (g) distribution; and (h) maintenance and preservation of digital content’. while open access outputs were not explicitly required, the e-mail also explained that the foundation was ‘especially interested in developments that would support new business models, such as those in which authors or their institutions, rather than readers, pay for the costs of producing and distributing works on the web, or those that generate other new sources of revenue’.
the grant funding offered was generous (proposals should request funding of approximately $ , – , with the grant period to be determined by the project partners, but not exceeding years) and the e-mail well-timed, preceding by about a month the annual meeting of the association of university presses. many meetings were scheduled and conversations provoked. successful awards began to be announced in december and the author’s own institution, university of michigan, was one of the recipients of funding. as of june , the digital monograph initiative had provided funding of $ . million spread over grants (waters, ).

several reviews of the intent and progress of the mellon foundation’s intervention have been published, both by the funder itself and by scholars supported by the funder (maxwell, bordini, & shamash, ; waters, ). a planned follow-up landscape study of the digital monograph initiative by john maxwell, conducted under the auspices of the mit press, has recently been announced. that the exact list of projects supported varies between articles reflects the depth and breadth of the foundation’s influence and its capacious vision of the scholarly communications system. because projects are funded through a variety of mechanisms (including smaller planning grants and larger project grants), start at different dates, and last for different lengths of time, keeping track of the various initiatives is sometimes difficult.

while other important projects such as vega, scalar, and hypothes.is will be mentioned, the focus of this article is on the initiatives listed in table . representatives of these projects were invited to a digital monograph initiative meeting hosted by the foundation in new york on and september, , to promote collaboration between publishing platform and tool developers. the vibrancy of the meeting and the further conversations it provoked revealed many links and common interests.

the original / grants were for or years. therefore, in most of the projects listed in table have launched their platforms and tools and/or reached decision points around applying for further support. several projects have applied for additional grant funding with a focus on reaching a point of self-sustainability, both financially and in terms of community support. others are evaluating their next steps, often by bringing together stakeholder gatherings. it seems like a good time to take stock of what these projects have achieved and look to the future of the ecosystem they are starting to create.

key points
• the discussion about how digital affordances will impact academic book publishing has moved from ‘speculation’ to ‘action’ as new platforms and workflows are implemented.
• the world of us non-profit book publishers has been energized by funding support, particularly from the andrew w. mellon foundation, aimed at creating new infrastructure for long-form scholarship.
• separate initiatives now align around shared values and seek interoperability with an emerging ecosystem of largely open source scholarly communication tools.
• shared challenges are being identified, especially related to preservation, discoverability, and accessibility.
• the plethora of new tools offers special value to smaller book publishers but is difficult to navigate and make sense of.

towards a scholarly publishing open source ecosystem

applying the concepts developed in the study of natural ecosystems and biological evolution to software ecosystems and software evolution has become popular in computer and information sciences over the last decade (hanssen, ; mens, claes, grosjean, & serebrenik, ). such approaches are much newer to the field of academic publishing but the ecosystem metaphor has become ubiquitous in recent conference presentations and is an especially helpful theoretical framework as a diversity of ‘organisms’ (platforms, workflow tools, reading interfaces) proliferate in the marketplace, explore ways of connecting with each other, and attempt to find their unique niches.

in april , for example, the joint roadmap for open science tools (jrost) project was initiated noting that ‘while open technologies and services are becoming essential in science practices, so far there has been no holistic effort to align these tools into a coherent ecosystem that can support the scientific experience of the future’ (angell, ). a may preconference convened by the library publishing coalition showcased a number of open-source platforms and tools created in the last few years under the title ‘owned by the academy: a preconference on open source publishing software’. the meeting included presentations by pubsweet (coko), janeway (birkbeck college, university of london), vega (wayne state university), hypothes.is, pkp publishing services and ojs, pressbooks, pubpub (mit), quire (getty), ubiquity press, scalar, fulcrum (michigan), and manifold (minnesota). as its title suggests, the library publishing coalition preconference was not just about promoting technical interoperability but also about shared values.

this emphasis on scholarly values strongly intersects with the humetricshss initiative, also funded by the mellon foundation, which has focused on identifying a core set of values for the humanities and qualitative social sciences (agate et al., ). these are currently articulated as ‘collegiality, quality, equity, openness, community’ and a central focus of the initiative is how these values manifest in various academic contexts such as creating a syllabus, contributing an annotation, reviewing a tenure case, or preparing and publishing an academic book.
recent meetings of non-profit publishing groups in the usa have been notably more focused on articulating a values-based field of academic publishing that, the participants claim, is distinct from that of commercial rivals, with a strong focus on articulating ethical frameworks and a particular interest in issues of diversity, equity, and inclusion.

a strong focus on ‘equity’ is one way in which several of the platforms described above manifest these values. for example, in constructing their ‘platform for multimedia books in indigenous studies’, the university of british columbia press (lead) and university of washington press (partner) teams have emphasized the importance of including indigenous native american communities in all aspects of the publishing workflows and tools that are used to disseminate scholarship by and about them. this involves including ‘community review’ alongside other forms of peer review during the selection process and allocating indigenous knowledge categories to the content management system (mukurtu) that underlies the publishing platform, allowing selective access restriction based on community norms, for example. as discussed further below, accessibility to blind and partially sighted users has been a strong focus of several of the platforms, also very connected to principles of equity and inclusion.

table. mellon-funded monograph publishing platform and tool projects ( – ).

title of project | lead organization(s) | project output
collaborative services platform for university presses | university of north carolina press | www.longleafservices.org/our-story
web-based content management system for oa monograph publishing | university of california press and california digital library | https://editoria.pub/
electronic portal for art and architecture books | yale university press | www.aaeportal.com
platform for management of monographic source materials | university of michigan press | www.fulcrum.org
developing the iterative scholarly monograph | university of minnesota press and city university of new york | https://manifoldapp.org/
digital publishing platform for interactive scholarly works | stanford university press and stanford university library | www.sup.org/digital
infrastructure for enhanced networked monographs | new york university libraries and press | https://wp.nyu.edu/enmproject/
a distribution platform for open access monographs | johns hopkins university press | https://muse.jhu.edu/museopen/
open webbooks prototype for scholarly monographs | rebus foundation | https://rebus.foundation/
platform for multimedia books in indigenous studies | university of british columbia press and university of washington press | https://uwpressblog.com/ / / /ubcpress-uwapress-mellon-grant-to-help-develop-indigenous-studies-digital-publishing-platform/

‘collegiality’ is an underlying value often referenced in identifying the particular strengths of the different platforms and tools and finding ways to connect them. the extent of this activity can be exemplified by the university of michigan’s interactions with other grant recipients regarding its platform, fulcrum. figure shows a potential workflow that michigan is exploring for the production and publication of an enhanced ebook based on extensive interactions with mellon-funded projects. while there is some overlap in the systems (manifold or vega, for example, are also presentation platforms) the illustration shows how mellon’s investments have been structured around a publishing workflow concept that bolts onto tools employed during the author’s research workflow. encouragement from the programme officers and associated funding for travel have fostered a spirit of collegiality between many of the platforms. for example, new york university, university of minnesota, and university of michigan have met in person each year of their initial grants to share challenges and seek opportunities for collaboration. this is not to say that the projects do not engage in competition for clients and resources, but the relationships are also intensively collaborative. this may best be described as ‘coopetition’ (bengtsson & kock, ). figure may suggest a misleadingly closed system; however, not shown are other relationships with both commercial and non-profit partners, often facilitated by open application programming interfaces (apis).
for example, fulcrum is working to deposit enhanced media content through its university of michigan library parent into distributed preservation networks such as aptrust. it also has a relationship with digital science and google to manage analytics and multiple arrangements with vendors like ebsco and proquest to enable discovery through library systems.

‘openness’ is also a value often expressed. taking advantage of the openness of the software frameworks that underlie these systems, further opportunities for integration are being explored by several third parties. the collaborative knowledge foundation (coko), for example, is providing modular components to link several of the projects in its quest to ‘build modular, open source publishing software using collaborative development to ensure the technology underlying research communication enables innovation and rapid publishing’ (coko foundation, n.d.). lyrasis, meanwhile, is creating an incubator environment to introduce hosted versions of several open source software platforms to its member community of over , libraries, archives, and museums (about lyrasis, n.d.). its imls-sponsored ‘it takes a village’ project has involved several of the platforms in strategic planning around sustainability. through this project, a useful framework has been expressed as a ‘sustainability wheel’ in which governance, technology, community engagement, and resources form the four quadrants. best practices during three phases of development (‘getting started’, ‘growing’, ‘stable but not static’) are defined (arp & forbes, ).

the idea of a values-based ecosystem of open source tools has sometimes manifested in discussions of an alternative network of tools, a ‘parallel ecosystem’, that would not be accountable to the interests of commercial shareholders but aligned with the values of the academic community. this rhetoric has sometimes become quite heated.
as jefferson pooley characterizes it, ‘there’s a contest underway, pitting non-profit platforms and initiatives, supported by foundations like andrew w. mellon and alfred p. sloan, against projects underwritten by the legacy publishing industry and silicon valley venture-capital firms. the contest isn’t really about feature sets or new formats: the basic values of the academic enterprise are at stake. we have the chance to disrupt (to repurpose a stale verb) the strange, if explainable, joint-custody arrangement we currently have: non-profit universities and for-profit publishers. a publishing ecosystem centered on scholarly values – rather than per cent, elsevier-style profit margins – is within reach. for that to happen, we have to throw our weight behind the non-profits, before it’s too late’ ( ).

while the vision may be compelling, such a framing also risks being divisive and exclusionary; indeed at odds with the values it espouses. there are many mission-driven publishers who are not categorized as non-profit and many non-profit book publishers in the usa continue to rely on experienced and robust platform services from commercial organizations like wiley atypon, ubiquity press, silverchair, pubfactory, ingenta, and highwire. publishers like allison belan from duke university press note that any argument that such entities are not values-based unfairly caricatures their substantial investments in advancing important open source standards like epub , counter, shibboleth, and lockss. they also point out that the single-minded focus of such organizations on maintaining systems and relationships frees publishers up to focus on content development.

as t.
scott plutchak has written, in the context of the controversial elsevier acquisition of berkeley electronic press, the commercial owner of the digital commons publishing/repository platform widely used by libraries, ‘a scholarly communication ecosystem managed entirely within the academy, with no need or room for commercial players, dedicated to no-cost sharing of the products of research globally, remains the holy grail for many librarians who’ve dedicated their work lives to scholarly communication issues. i remain deeply skeptical of efforts to create an entirely separate ecosystem without engaging the people in commercial publishing. these are talented and committed people with a wealth of knowledge about how scholarly communication systems actually work. certainly they have their blind spots, but that’s why all of the other stakeholders need to be tightly engaged. we count on the others to help us past our own blind spots’ ( ). don waters also notes that ‘our trustees actually do not respond favorably to arguments for the use of mellon funds based solely on the perceived need to stand up competitors to existing commercial organizations. however, they are persuaded when there is a need to open new pathways that simply do not now exist’ (personal communication, june , ).

figure . an enhanced ebook publication workflow based entirely on mellon-funded elements.

identifying common challenges and seeking shared solutions

as real scholarly works start to be published on the new platforms and the various connecting tools and workflows start to be deployed, the participants in the developing ecosystem described above are identifying a number of common challenges to the publication of enhanced ebooks, especially around preservation, discoverability, and accessibility.
a number of workshops, many supported in whole or in part by mellon, have been exploring how best to address these issues. how to divide roles and responsibilities between authors, libraries, publishers, and digital humanities centres has been a consistent theme. while out of scope of this article, a parallel stream of mellon foundation funding is supporting libraries and digital humanities centres to conceptualize their roles in supporting new forms of digital scholarship, particularly in the earlier stages of a work’s production, as described by maxwell and colleagues ( ). a particular area of negotiation is at what point the preparation of a work of long-form digital scholarship is handed off from a library or humanities centre to a publisher, if at all.

while the challenges around preservation, discoverability, and accessibility are similar in kind they vary in extent depending on the complexity of the types of work being published. such complexity can be conceptualized as varying across a continuum from ‘simple ebook’ through ‘enhanced ebook’ to ‘expansive digital humanities project’. while a taxonomy of these new types of work is lacking (and needed), ‘enhanced’ describes an ebook which may include digital affordances such as time-based multimedia (audio, video), annotations, interactive timelines, or maps but is still enclosed within a container such as an epub file. ‘expansive’ is a term usefully defined by researchers from duke university as referring to ‘projects that are interactive and dynamic in their content as they span and often grow over time across multiple content types, audiences, and contributors’ and manifest in ways that extend beyond a single container (hansen, milewicz, & mangiafico, ).
a similar comparison has been made by michael elliott who contrasts ‘long-form scholarship published digitally that is substantially enhanced by the digital format’ with ‘digitally published, long-form scholarship that is not suitable for print publication’ in a taxonomy developed at emory university (elliott, ).

preservation

leading north american commentators like clifford lynch and peter brantley have been eloquent over the last years in identifying the preservation of ebooks as a looming challenge, particularly for libraries (brantley, ; lynch, ). preservation organizations based in the usa such as portico, hathitrust, and lockss have started to wrestle with the demands of enhanced ebooks and the library of congress announced proposed mandatory deposit of electronic-only books in april . however, none of the services currently offer the capacity to curate even the simplest of enhancements such as embedded multimedia. a recent digital preservation coalition report notes that ‘ownership of the responsibility for the preservation of different large categories of digital artefacts that fall under the rubric of ebooks is not clearly established. nor are the costs for carrying out the preservation, and establishing sufficient permanent funding to meet those costs’ (kirchhoff & morrissey, ).

among the mellon-funded platform projects, divergent approaches to digital preservation are being explored, oriented around the distinction between ‘emulation’ and ‘migration’ strategies. emulation involves using a program that imitates the original, obsolete, hardware or software to render a digital object. in emulation, the original bit stream (the information that comprises the file) is saved and used. in contrast, in migration, the original bit stream is changed over to a new, current file format (stuchell, ). stanford university press, which has published several complex digital publications on scalar, is now exploring emulation and virtualization approaches.
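the difference between the two strategies can be illustrated with a minimal python sketch. it is not drawn from any of the platforms discussed here: the sample text, the choice of latin-1 as the ‘obsolete’ format, and the function names are all assumptions made for illustration. emulation keeps the stored bit stream identical to the original, so a fixity checksum still matches; migration rewrites the bit stream into a current format, so the checksum changes even though the content a reader sees is preserved.

```python
import hashlib


def checksum(bitstream: bytes) -> str:
    """Return a SHA-256 fixity checksum for a stored bit stream."""
    return hashlib.sha256(bitstream).hexdigest()


def keep_for_emulation(bitstream: bytes) -> bytes:
    """Emulation (hypothetical helper): store the original bit stream
    unchanged; an emulator of the old environment renders it later."""
    return bytes(bitstream)


def migrate(bitstream: bytes) -> bytes:
    """Migration (hypothetical helper): rewrite the content into a
    current format. Here the assumed 'old' format is Latin-1 text and
    the assumed 'current' format is UTF-8."""
    return bitstream.decode("latin-1").encode("utf-8")


# Hypothetical stand-in for a file held in an obsolete format.
original = "résumé of an enhanced ebook".encode("latin-1")

emulated = keep_for_emulation(original)
migrated = migrate(original)

# Emulation leaves the bits untouched, so fixity checks still pass.
emulation_bits_unchanged = checksum(emulated) == checksum(original)
# Migration changes the bits (é is one byte in Latin-1, two in UTF-8).
migration_bits_unchanged = checksum(migrated) == checksum(original)
# The human-readable content survives migration.
content_preserved = migrated.decode("utf-8") == original.decode("latin-1")

print(emulation_bits_unchanged, migration_bits_unchanged, content_preserved)
```

one practical consequence, under these assumptions, is that a repository following a migration strategy would need to record a fresh checksum after each format change, whereas one following an emulation strategy can keep verifying the original checksum.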
university of michigan press is focused on migration of content as a core feature of its fulcrum publishing platform which is built on the same samvera fedora stack as the university of michigan library’s deep blue institutional data repository and shares its preservation policies as well as infrastructure.

discoverability

the challenges of discovering open access monographs have been explored by the ‘mapping the free ebook supply chain’ project which investigated how readers found, acquired, and used a sample of open access ebooks published by open book publishers and university of michigan press (watkinson, welzenbach, hellman, gatti, & sonnenberg, ). while the situation is being improved by the work of organizations such as jstor, ingenta open, doab, and oapen since the study was completed, the ‘mapping’ project revealed that the dominant role of commercial infomediaries in ebook discovery and the rigidity of their systems has made discovery of open access ebooks through libraries very challenging. this same rigidity of systems creates discovery obstacles for titles that include additional digital affordances, irrespective of whether they are sold or made open access, since they are oriented to expect works that are presented in familiar containers. for example, google scholar (where many researchers naturally turn for journal articles) does not yet index and rank scholarly ebooks at the same depth and breadth; the library discovery services ecosystem is in a constant state of commercial consolidation and transition, making integrations between platforms/publishers and services challenging to establish and maintain; and these same systems insert a number of jumps between the place where the content is discovered, the place where it is linked, and the place where it is accessed/consumed.
as allison belan at duke university press has noted, each jump introduces a point of discovery-to-access failure, making it difficult for either the library or the publisher to understand where the user’s disconnect is happening (belan, personal communication, june , ).

a mellon-funded project engaging specifically with discoverability challenges is muse open which aims to deliver open access book content cost effectively in a browser-native format and provide a discovery layer for new forms of content, such as the ‘black press in america’ multimodal project published by johns hopkins university press in collaboration with the university’s sheridan libraries (barbara kline-pope, ; schonfeld, ).

closely connected with the issue of discoverability is that of information about use and engagement. finding ways to present measures of impact is important so that authors of new formats, already potentially viewed with suspicion by more conservative academic administrators, can demonstrate the reach and impact of their work. project muse has always placed an emphasis on providing high quality usage information to participating publishers and has placed transparent usage reporting at the heart of its new platform. a new mellon-funded project led by the book industry study group is engaging with the issue of gathering usage information from multiple platforms to tell a coherent story of open access ebook usage (bisg, ).

important work on the ‘last mile’ of ebook delivery (how a user interacts with the content once they have discovered it) has been done by jstor labs and the rebus foundation, both using a combination of design thinking exercises and survey approaches. these have resulted in two white papers (humphreys, spencer, brown, loy, & snyder, ; mcguire et al., ). jstor has created several tools to embed in its platform to improve engagement with ebooks, notably topicgraph and text analyser (https://labs.jstor.org/projects/).
the rebus foundation, meanwhile, is now engaging with the problem of ensuring that ebooks spread across a variety of platforms, including manifold and fulcrum, can be collected and organized in a common web-based interface by a scholarly reader.

accessibility

encouraged by the mellon foundation, accessibility has been a challenge that many of the new platforms have been wrestling with as have many others across the publishing industry (conrad & kasdorf, ). as well as fulfilling legal responsibilities and engaging with the needs of users with print disabilities, the publishers of enhanced ebooks see that designing platforms and content with accessibility in mind also catalyses good digital design. enriched image descriptions aid discoverability while a user experience designed for screen reading software also facilitates other forms of machine reading. several mellon grant recipients have been developing guidelines and resources to help streamline the work required to make content accessible, with initiatives such as the describing visual resources toolkit (supported by the samuel h. kress foundation and university of michigan library) bringing representatives of a number of the organizations together (https://describingvisualresources.org/).

a growing concern, however, has been with the amount of additional labour that requiring accessibility for multimodal publications imposes on authors and publishers. audio and video files need to be captioned or transcribed, images need alt-text to be written, and more complex digital objects, such as d models, need explanations to be written that explain why they may not be fully accessible. susan doerr has described the results of time tracking the labour involved in entering metadata for projects to be published on manifold and fulcrum and the questions of sustainability this raises (doerr, ).
while no publishers underestimate the importance of providing accessible content, it is clear that the best can sometimes be the enemy of the good.

conclusion: picking and choosing

the proliferation of new platforms, tools, and workflows has created excitement and uncertainty in the scholarly book publishing ecosystem in the usa. for the projects funded by the andrew w. mellon foundation, a condition for receiving a grant is that the software products are openly licensed. not only is there now a plethora of github repositories filled with open source software objects, but in many cases a hosted option of the tool or platform is also being offered for a fee, with incentives for early adopters.

for a publisher looking to take advantage of this new environment, picking and choosing among the new technology and service options vying for attention is intimidating. which projects will survive and which will become extinct depends on how well the different creators can identify and inhabit their unique niches, symbiotically partner with other existing and emergent life forms, and create a large enough community of software developers, content producers, and users to become keystone species. oss watch from the university of oxford provides nine ‘top tips for selecting open source software’ (metcalfe, ). three of the criteria listed seem particularly relevant in this context:

• its reputational fit with the publisher’s own disciplinary focus.
• a commitment to open standards and interoperability with other systems.
• the presence of a large enough community to sustain continued development.

while some of the workflows and tools are generalizable, it is increasingly clear that the different platforms will have particular resonance with certain communities of scholarly practice.
michigan’s fulcrum, for example, offers a durable, flexible, and discoverable solution for multimedia content which is attractive to scholars in visually rich fields, especially those that deploy time-based media. yale’s electronic portal for art and architecture books has sophisticated tools for managing the licensing restrictions that shape the practice of art history and is deeply embedded in the museum community. minnesota’s manifold and mit’s pubpub are optimized for collaborative public scholarship in which a community of researchers work iteratively to develop new knowledge. vega’s design is shaped by the concerns of digital rhetoric for preserving the form of an author’s work as well as its content. in short, certain platforms will fit particular publishers better than others, and in other cases they may supplement rather than replace other online delivery mechanisms. it seems clear that an increasing number of us publishers will be maintaining their own content on multiple platforms rather than just one.

formal partnerships between many of the organizations responsible for tool maintenance and creation are being formed, and these promise to create a coherent set of services for publishers who wish to make bold moves. however, at least as important is a commitment to interoperability with existing systems. the availability of apis allows connections with commercial systems as well as other open source tools and caters to publishers who have already invested substantially in workflows that they are not willing to abandon wholesale. potential client publishers should look for providers with useful, open, well-defined, consistent, and stable apis.
one vulnerability for open source software providers lies in the size of the community of developers supporting a particular product. martin eve and andy byers describe the advantages of using a popular programming language in the creation of their janeway journal publishing system (eve & byers, ). fulcrum is part of the samvera community, in which a number of institutions, including the university of michigan, commit to contributing to development work in a formal community framework (awre & green, ). building a community of users also increases the chance of sustainability, especially as they commit increasing quantities of content to the platform or embed the workflow tool integrally in their processes. potential client publishers should look closely at the robustness of the underlying open source software as well as the strength and commitment of the producer and user community.

as this article makes clear, much has happened in the ebook infrastructure space since . the challenge for the next few years lies in whether the tools and platforms described can sustain themselves financially as well as technologically. while one source of income lies in selling software as a service, there are other discussions around pooling library resources to support open source infrastructure, most eloquently expressed by ‘the . % commitment initiative’. this envisions libraries devoting at least . % of their budgets to sustaining open source infrastructure, rather than purchasing and licensing published resources (lewis, goetsch, graves, & roy, ). the idea is exciting and builds on collaborative funding models for open access initiatives such as knowledge unlatched and the lever press. there is some concern that library funders would not invest sufficiently in the growth and further development of open source projects, which require substantial surplus funds to stay current with changing needs.
however, the prospect of base institutional funding in addition to a fee-for-service model is exciting and will further encourage publishers interested in embracing some of the new technology options.

acknowledgements

i am very grateful to allison belan, don waters, and nicole mitchell for their helpful comments on a draft of this paper, and to jeremy morse and kentaro toyama for generously sharing their perspectives on the us open source landscape.

references

about lyrasis. (n.d.). retrieved from https://www.lyrasis.org/about/pages/default.aspx
adema, j., stone, g., & keene, c. ( ). changing publishing ecologies: a landscape study of new university presses and academic-led publishing: a report to jisc. london, england: jisc. retrieved from http://repository.jisc.ac.uk/ / /changing-publishing-ecologies-report.pdf
agate, n., kennison, r., konkiel, s., long, c., rhody, j., & sacchi, s. ( , july). humetricshss: towards value-based indicators in the humanities and social sciences. retrieved from humanities commons website: https://doi.org/ . /m r s
angell, n. ( , may). open science projects collaborate on joint roadmap [web log post]. retrieved from http://jrost.org/ / / /jrost-launched.html
arp, l. g., & forbes, m. ( ). it takes a village: open source software sustainability: a guidebook for programs serving cultural and scientific heritage. atlanta, ga: lyrasis. retrieved from https://www.lyrasis.org/technology/documents/itav_interactive_guidebook.pdf
awre, c., & green, r. ( ). from hydra to samvera: an open source community journey. insights, ( ), – . retrieved from https://insights.uksg.org/articles/ /
barbara kline-pope, w. q. ( , may). getting to the heart of the matter through muse open [web log post]. retrieved from http://choice .org/blog/university-press-forum-getting-to-the-heart-of-the-matter-through-muse-open
bengtsson, m., & kock, s. ( ). “coopetition” in business networks—to cooperate and compete simultaneously. industrial marketing management, ( ), – . https://doi.org/ . /s - ( ) -x
bertino, a. ( ). hirmeos – high integration of research monographs in the european open science infrastructure. septentrio conference series, ( ). https://doi.org/ . / .
bisg. ( , june). understanding oa ebook usage: toward a common framework – book industry study group. retrieved from https://bisg.org/news/ /understanding-oa-ebook-usage-toward-a-common-framework.htm
bloomsbury.com. (n.d.). bloomsbury – products. retrieved from https://www.bloomsbury.com/dr/digital-resources/products/
brantley, p. ( ). the curation of obscurity. in h. mcguire & b. f. o’leary (eds.), book: a futurist’s manifesto: essays from the bleeding edge of publishing. sebastopol, ca: o’reilly.
coko foundation. (n.d.). homepage. retrieved from https://coko.foundation/
conrad, l. y., & kasdorf, b. ( ). making accessibility more accessible to publishers. learned publishing, ( ), – . https://doi.org/ . /leap.
doerr, s. ( ). adding media, adding value. against the grain, ( ), . retrieved from https://against-the-grain.com/ / /v - -adding-media-adding-value/
elliott, m. a. ( ). the future of the monograph in the digital era: a report to the andrew w. mellon foundation. the journal of electronic publishing, ( ). https://doi.org/ . / . .
eve, m. p., & byers, a. ( ). janeway: a scholarly communications platform. insights, , . retrieved from https://insights.uksg.org/articles/ . /uksg. /print/
hansen, d., milewicz, l., & mangiafico, p. ( ). developing library support for publishing expansive digital humanities projects. in cni: coalition for networked information spring meeting briefings. washington, dc: coalition for networked information. retrieved from https://www.cni.org/topics/digital-humanities/developing-library-support-for-publishing-expansive-digital-humanities-projects
hanssen, g. k. ( ). a longitudinal case study of an emerging software ecosystem: implications for practice and theory. the journal of systems and software, ( ), – . https://doi.org/ . /j.jss. . .
hindley, m. ( ). a decade of digital: a look back at projects supported by the office of digital humanities. humanities: the magazine of the national endowment for the humanities, ( ), .
humphreys, a., spencer, c., brown, l., loy, m., & snyder, r. ( ). reimagining the digital monograph: design thinking to build new tools for researchers. the journal of electronic publishing, ( ). https://doi.org/ . / . .
kirchhoff, a., & morrissey, s. ( ). preserving ebooks. york, england: digital preservation coalition. https://doi.org/ . /twr -
lewis, d. w., goetsch, l., graves, d., & roy, m. ( ). funding community controlled open infrastructure for scholarly communication: the . % commitment initiative. college & research libraries news, ( ), .
lynch, c. a. ( ). ebooks in . american libraries, – . retrieved from https://www.jstor.org/stable/
lyons, r. e., & rayner, s. ( ). the academic book of the future. basingstoke, england: palgrave macmillan.
maxwell, j. w., bordini, a., & shamash, k. ( ). reassembling scholarly communications: an evaluation of the andrew w. mellon foundation’s monograph initiative (final report, may ). the journal of electronic publishing, ( ). https://doi.org/ . / . .
mcguire, h., anthony, b., hyde, z. w., ashok, a., bjarnason, b., & mays, e. ( ). an open approach to scholarly reading and knowledge management – simple book publishing. montreal, qc: rebus foundation. retrieved from https://press.rebus.community/scholarlyreading/
mens, t., claes, m., grosjean, p., & serebrenik, a. ( ). studying evolving software ecosystems based on ecological models. in t. mens, a. serebrenik, & a. cleve (eds.), evolving software systems (pp. – ). berlin, germany: springer.
metcalfe, r. ( , february). top tips for selecting open source software [web log post]. retrieved from http://oss-watch.ac.uk/resources/tips
pooley, j. ( , august). scholarly communications shouldn’t just be open, but non-profit too [web log post]. retrieved from http://blogs.lse.ac.uk/impactofsocialsciences/ / / /scholarly-communications-shouldnt-just-be-open-but-non-profit-too/
schonfeld, r. c. ( , august). open ebooks coming to project muse: an interview with wendy queen [web log post]. retrieved from https://scholarlykitchen.sspnet.org/ / / /open-ebooks-coming-to-project-muse-an-interview-with-wendy-queen/
scott plutchak, t. ( , october). beprexit and then what? [web log post]. retrieved from http://tscott.typepad.com/tsp/ / /beprexit.html
stuchell, l. ( , march). what is digital preservation? retrieved from https://www.lib.umich.edu/preservation-and-conservation/digital-preservation/what-digital-preservation
waters, d. j. ( , july). monograph publishing in the digital age [web log post]. retrieved from http://mellon.org/resources/shared-experiences-blog/monograph-publishing-digital-age/
waters, d. j. ( , july). the monograph is dead: long live the monograph. presentation at the jisc and cni leaders' conference. retrieved from https://www.slideshare.net/jisc/the-monograph-is-dead-long-live-the-monograph?
watkinson, c., welzenbach, r., hellman, e., gatti, r., & sonnenberg, k. ( ). mapping the free ebook supply chain: final report to the andrew w. mellon foundation. retrieved from http://hdl.handle.net/ . /

digital humanities

the importance of pedagogy: towards a companion to teaching digital humanities

hirsch, brett d.
brett.hirsch@gmail.com university of western australia timney, meagan mbtimney.etcl@gmail.com university of victoria the need to “encourage digital scholarship” was one of eight key recommendations in our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences (unsworth et al). as the report suggested, “if more than a few are to pioneer new digital pathways, more formal venues and opportunities for training and encouragement are needed” ( ). in other words, human infrastructure is as crucial as cyberinfrastructure for the future of scholarship in the humanities and social sciences. while the commission’s recommendation pertains to the training of faculty and early career researchers, we argue that the need extends to graduate and undergraduate students. despite the importance of pedagogy to the development and long-term sustainability of digital humanities, as yet very little critical literature has been published. both the companion to digital humanities ( ) and the companion to digital literary studies ( ), seminal reference works in their own right, focus primarily on the theories, principles, and research practices associated with digital humanities, and not pedagogical issues. there is much work to be done. this poster presentation will begin by contextualizing the need for a critical discussion of pedagogical issues associated with digital humanities. this discussion will be framed by a brief survey of existing undergraduate and graduate programs and courses in digital humanities (or with a digital humanities component), drawing on the “institutional models” outlined by mccarty and kirschenbaum ( ). 
the growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn the sorts of “transferable skills” and “applied computing” that digital humanities offers (jessop ), and the desire of practitioners to consolidate and validate their research and methods.

we propose a volume, teaching digital humanities: principles, practices, and politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. we plan to structure the volume according to the four critical questions educators should consider, as emphasized recently by mary breunig, namely:

- what knowledge is of most worth?
- by what means shall we determine what we teach?
- in what ways shall we teach it?
- toward what purpose?

in addition to these questions, we are mindful of henry a. giroux’s argument that “to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak” ( ). consequently, we will encourage submissions to the volume that address these wider concerns.

references

breunig, mary ( ). 'radical pedagogy as praxis'. radical pedagogy. http://radicalpedagogy.icaap.org/content/issue _ /breunig.html
giroux, henry a. ( ). 'rethinking the boundaries of educational discourse: modernism, postmodernism, and feminism'. margins in the classroom: teaching literature. myrsiades, kostas, myrsiades, linda s. (eds.). minneapolis: university of minnesota press, pp. - .
schreibman, susan, siemens, ray, unsworth, john (eds.) ( ). a companion to digital humanities. malden: blackwell.
jessop, martyn ( ). 'teaching, learning and research in final year humanities computing student projects'. literary and linguistic computing. . ( ): - .
mccarty, willard, kirschenbaum, matthew ( ). 'institutional models for humanities computing'. literary and linguistic computing. . ( ): - .
unsworth et al. ( ). our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies.

journal of religion, media and digital culture volume , issue ( ) https://jrmdc.com

some initial reflections on xml markup for an image-based electronic edition of the brooklyn museum aramaic papyri

f. w. dobbs-allsopp, princeton theological seminary
contact: chip.dobbs-allsopp@ptsem.edu
chris hooker, princeton theological seminary
contact: christopher.hooker@ptsem.edu
gregory murray, princeton theological seminary
contact: gregory.murray@ptsem.edu

keywords: aramaic; brooklyn museum; critical edition; elephantine; markup; papyrus; tei; xml

abstract: a collaborative project of the brooklyn museum and a number of allied institutions, including princeton theological seminary and west semitic research, the digital brooklyn museum aramaic papyri (dbmap) is to be both an image-based electronic facsimile edition of the important collection of aramaic papyri from elephantine housed at the brooklyn museum and an archival resource to support ongoing research on these papyri and the public dissemination of knowledge about them. in the process of building out a (partial) prototype of the edition, to serve as a proof of concept, we have discovered little field-specific discussion that might guide our markup decisions. consequently, here our chief ambition is to initiate such a conversation.
after a brief overview of dbmap, we offer some initial reflection on and assessment of xml markup schemes specifically for semitic texts from the ancient near east that comply with tei, cse, and mep guidelines. we take as our example bmap (=tad b . ) and we focus on markup as pertains to the editorial transcription of this documentary text and to the linguistic analysis of the text’s language.

about the authors: f. w. “chip” dobbs-allsopp is professor of old testament at princeton theological seminary. his research interests include the historical, philological, and literary study of biblical and ancient near eastern literature (with special focus on poetry, northwest semitic inscriptions) and exploring how new technologies can enhance the editing of ancient semitic texts. dobbs-allsopp’s most recent monograph is on biblical poetry (new york/oxford: oxford university press, ). christopher hooker is a phd candidate at princeton theological seminary. gregory murray is director of academic technology and digital scholarship services at princeton theological seminary library. he has worked with tei encoding of humanities texts since (tei p in sgml) and has extensive experience with text processing and xml technologies, including xslt and xquery.

to cite this article: dobbs-allsopp, f.w., c. hooker and g. murray, . some initial reflections on xml markup for an image-based electronic edition of the brooklyn museum aramaic papyri. journal of religion, media and digital culture ( ), pp. - . online. available at: .
introduction: project overview [i]

a collaborative project of the brooklyn museum, princeton theological seminary, and west semitic research, the digital brooklyn museum aramaic papyri (dbmap) is to be both an image-based electronic scholarly edition of the important collection of aramaic papyri from elephantine housed at the brooklyn museum and an archival resource to support ongoing research on these papyri and the public dissemination of knowledge about them. the collection, consisting of nine whole papyrus rolls (eight of which were still intact, folded, with original cords and sealings upon acquisition) and a large number of fragments (from more than eight other rolls), was bequeathed to the brooklyn museum by ms. theodora wilbour in . ms. wilbour’s father, charles edwin wilbour, had purchased these papyri originally sometime during the period january - february in aswan (this according to a notebook entry of his from that time). the papyri were packed in “tin biscuit boxes” and placed in a trunk with other boxes of egyptian papyri, where they remained (ultimately stored in a new york warehouse) unknown and unread for over half a century; wilbour died in , without having revealed to anyone the contents of his purchase. as a result of ms. wilbour’s bequest, an editio princeps of these papyri was published in short order by emil g. kraeling, the brooklyn museum aramaic papyri: new documents of the fifth century b.c. from the jewish colony at elephantine (= bmap; kraeling ), some sixty years after their initial discovery. the papyri date to the fifth century bce and mostly consist of legal documents having to do with the interrelations of two families spanning several generations. historically, the collection represents the earliest major acquisition of aramaic papyri related to the ancient military colony at elephantine.
at the heart of the proposed project is the creation of an image-based, facsimile edition of these spectacular aramaic papyri. as with all scholarly text editions, whether print-based or digital, the chief task is to provide a reliable and accurate simulation of the underlying source text. mla’s committee on scholarly editing guidelines (= cse) assert that editors establish reliability by “explicitness and consistency with respect to methods, accuracy with respect to texts, adequacy and appropriateness with respect to documenting editorial principles and practice” (mla ). a primary rationale for undertaking a new critical edition of any text is to improve on what older editions have achieved. in the case of these papyri there is currently only a single critical (i.e., answering to cse guidelines) edition available, the editio princeps published by kraeling now almost sixty years ago. the volume in all respects is typical of a standard print-based critical edition, consisting of a long, informative “historical introduction” (kraeling , p. - ) and editions of each of the papyri: general description, transcription, english translation, critical commentary. there are also several indices (proper names, words) and a set of photographic plates: one black and white image for each papyrus plus an assortment of other images (e.g., endorsements, unopened papyrus rolls). this edition remains the single most comprehensive treatment of these papyri as a whole, although, of course, scholarship over the last sixty years has greatly improved our understanding of almost every aspect of these papyri. which is to say, bmap, no matter its historical contributions, can no longer serve as a fully adequate and accurate edition of these texts.
in fact, most contemporary students of these papyri use the edition of these papyri found in the handbook edition of the entire corpus of aramaic documents from egypt by bezalel porten and ada yardeni (= tad; porten and yardeni - ). this latter volume features the insights and readings of the foremost student of the elephantine aramaic corpus, porten, and the exquisite hand drawings of yardeni. it offers the most accurate rendition of the brooklyn museum papyri generally available. but as the title suggests, tad was never envisioned as a critical edition of these texts (e.g., there is no commentary, explicit theory of editing or photographs of the texts and only a very minimal critical apparatus). it is our intention, then, to author a truly critical edition of these papyri, one that aspires to the traditional goals (and standards) of scholarly editing and that builds on the vast scholarly advances achieved during the period between the appearances of bmap and tad and since.

reflections on xml markup

in the process of building out a (partial) prototype of the edition to serve as a proof of concept, we have discovered little field-specific discussion that might guide our markup decisions. consequently, here our chief ambition is to initiate such a conversation. we offer some initial reflection and assessment of xml markup schemes specifically for semitic texts from the ancient near east that comply with tei (text encoding initiative n.d.), cse, and the model editions partnership (mep) guidelines. we take as our example bmap (=tad b .
) and we focus on markup as pertains to the editorial transcription of this documentary text and to the morphosyntactic markup (part-of-speech tagging) of the text’s language. [ii]

editorial transcription

transcription may be defined as “the effort to report—insofar as typography allows—precisely what the textual inscription of a manuscript consists of” (meulen and tanselle , p. ). our transcription is of a documentary text (i.e., non-literary) in a single copy and is designed to support a facsimile edition, not to stand primarily in its place. [iii] as such, our transcription is what has been traditionally described as a “typographic facsimile,” which “attempts to duplicate exactly the appearance of the original source text as far as possible within the limits of modern typesetting technology” (kline and perdue , p. ; cf. meulen and tanselle , p. - ). [iv] where we believe explicit editorial comment is warranted (e.g., with regard to scribal alterations or where a reading is graphically ambiguous), users will be pointed to an epigraphic commentary for elaboration and discussion and will always be able to compare the transcription with the digital facsimile.

a chief editorial aim in our transcription, then, is to report “what actually appears in a manuscript” as faithfully as possible (meulen and tanselle : ). within the limits of folio technology, this has usually required a conscious editorial decision to forego the incorporation of editorial emendations, for example in order to correct apparent errors. as meulen and tanselle (as late as ) state, “a text cannot simultaneously be unemended and emended” ( : ). in a digital environment this is no longer the case. the <choice> element in the general tei guidelines ( .
) specifically enables the encoder to represent for example a text in its ‘original’ uncorrected and unaltered form, alongside the same text in one or more ‘edited’ forms. this usage permits software to switch automatically between one ‘view’ of a text and another, so that (for example) a stylesheet may be set to display either the text in its original form or after the application of editorial interventions of particular kinds. this provides us with the very attractive opportunity to both present the textual artifact as it has been historically preserved and to register editorial interventions where we deem them desirable. our interventions remain minimal, restricted (at this point) to apparent errors. the (inter-linear) writing of šhdy without the final aleph in l. in the formulaic phrase šhdyʾ bgw “the witnesses herein are” (bmap . - , . , . , . , . , . , . ; cf. . ) is a case in point:

( ) šhdy šhdyʾ

this makes good editorial (and historical) sense. it also means that when we add the morphosyntactic markup we are not left only with the erroneous analysis, in this case as if the noun were actually a plural construct. this leads to a general observation, namely, that because of the digital environment the various aspects of our critical edition (e.g., facsimile, transcription, translation) and archival resource (e.g., morphosyntactic analysis) need not fully overlap or seamlessly agree. the various components can be manipulated to multiple (and even conflicting) ends.

the transcription will be offered in two scripts: a transliterated roman script and an aramaic block script. all the xml markup was composed initially using the roman script. the markup for the transcription in the aramaic block script (and also for the digital facsimile itself) will be generated from this same file.
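The parallel original and corrected readings discussed above can be illustrated with a short sketch. It assumes that the erroneous and corrected forms are paired as TEI `<sic>` and `<corr>` children of `<choice>`; the fragment, the function name `render`, and the view labels are hypothetical and are not taken from the edition's actual files.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names follow TEI, but the markup actually
# used in the edition is an assumption made for this illustration.
FRAGMENT = "<ab><choice><sic>šhdy</sic><corr>šhdyʾ</corr></choice> bgw</ab>"


def render(xml_text, view):
    """Return the plain text of a fragment for one 'view'.

    view="original" keeps the <sic> reading of every <choice>;
    view="edited" keeps the <corr> reading instead.
    """
    keep = {"original": "sic", "edited": "corr"}[view]
    root = ET.fromstring(xml_text)
    parts = []

    def walk(node):
        if node.text:
            parts.append(node.text)
        for child in node:
            # Inside <choice>, descend only into the reading we want.
            if node.tag != "choice" or child.tag == keep:
                walk(child)
            if child.tail:
                parts.append(child.tail)

    walk(root)
    return "".join(parts)


print(render(FRAGMENT, "original"))
print(render(FRAGMENT, "edited"))
```

Because both readings live in the same file, the choice between them is deferred to display time, which is the property the passage above attributes to the digital environment.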
after some experimentation and following mep recommendations (chesnutt, hockey, and sperberg-mcqueen ), we follow a gradual markup procedure, attending to one variety (or level) of markup at a time (e.g., editorial transcription, morphosyntactic analysis). the basic elements in our markup scheme are three: the alphanumeric characters themselves and indications for line and word division:

( ) b l ʾlwl hw

these correspond to the characters of the aramaic script and numerical notation system and the two meta-script conventions (line division, word spacing) habitually employed by the scribe (haggai b. shemaiah):

( ) bmap . -

the writing consists of alphabetic characters and numerical ciphers grouped by word spacing and organized into horizontal lines. we employ the standard transliteration conventions formulated by the sbl (alexander , § . . . ) for representing the aramaic script, a linear alphabetic script:

( ) ʾbgdhwzḥṭyklmnʿspršt

in the aramaic numeral notation system, for numbers up to ninety-nine, the system is purely cumulative-additive, consisting of signs for , , and . the unit-signs are grouped in threes (since up to nine such signs could be required). we use the corresponding arabic numerals for the three component signs ( , , and ), plus and , depending on how the units are grouped. for example, the number twenty-eight is written out in l. with a cipher composed of the following signs:

( ) bmap .

the lines are right-adjusted and line-ends always coincide with graphic word boundaries. these lines are presentational in nature only, and thus bear no semantic significance for what is written. that is, the text is written in a running format with lines ending where they may, constrained only by the width of the sheet of papyrus being used and coincidence of word boundary.
we use the “anonymous block” element (<ab>) in tei with the @type ("line") and @n (e.g., "bmap. .r: ") attributes to signify these lines:

( ) ... ... ...

as generally in alphabetic writing from the ancient levant, the aramaic of the elephantine papyri is written out with word division. spacing (a brief segment of the papyrus left uninscribed) is used in these papyri to signify (graphic) word division, a convention of aramaic scribal practice that becomes prominent in the seventh century bce (e.g., kai = assur ostracon; tad a . ). we use whitespace wrapped in the “punctuation character” element (<pc>) to represent this meta-script convention. this turns out to have a number of benefits.

first, it underscores the fact that such a use of spacing is a material, meta-script convention (e.g., like the use of commas). the <pc> element according to the tei guidelines “contains a character or string of characters regarded as constituting a single punctuation mark” ( . . ). in this instance, spacing is used just like the point or dot in the old hebrew script or the small cuneiform wedge in the alphabetic cuneiform from ancient ugarit, and stands in contrast to the scriptio continua tradition of alphabetic writing without word dividers, as in some phoenician scripts and in ancient greek manuscripts.

second, it allows a more perspicuous linguistic description in the coding since a graphic word does not necessarily correspond to a linguistic (or grammatical) word. for example, prepositional phrases with the proclitic prepositions b-, l-, and k- are written out graphically together with their objects, e.g., lʿnnyh (l + ʿnnyh) “to ananiah” ( . ), bʾbny (b + ʾbny) “in the stone weights” ( . ). thus, the “word” (<w>) element may be reserved for representing a “grammatical (not necessarily orthographic) word” ( . . ).
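the distinction between graphic and grammatical words can be made concrete with a short python sketch. it is a minimal illustration under stated assumptions: the fragment is hypothetical (built from the two phrases cited above), the word-spacing element is taken to be tei's <pc>, the grammatical units are taken to be <w> and <name>, and the tei namespace is omitted. whitespace counts as part of the text only when it sits inside <pc>; indentation in the file is discarded.

```python
import xml.etree.ElementTree as ET

# hypothetical, pretty-printed fragment; only the space inside <pc> is textual
SAMPLE = """
<ab type="line">
  <w>l</w><name>ʿnnyh</name>
  <pc> </pc>
  <w>b</w><w>ʾbny</w>
</ab>
"""


def strip_ws(s):
    """Remove all whitespace from a text node (None is treated as empty)."""
    return "".join((s or "").split())


def transcription(elem):
    """Concatenate the text of elem, keeping whitespace only inside <pc>."""
    if elem.tag == "pc":
        return elem.text or ""
    parts = [strip_ws(elem.text)]
    for child in elem:
        parts.append(transcription(child))
        parts.append(strip_ws(child.tail))
    return "".join(parts)


line = ET.fromstring(SAMPLE.strip())
text = transcription(line)
graphic_words = text.split(" ")  # units separated by scribal spacing
grammatical_words = [e for e in line.iter() if e.tag in ("w", "name")]
```

in this sample the scribe's spacing yields two graphic words, while the markup records four grammatical units, which is the mismatch the encoding is designed to expose.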
this use of the <pc> element also allows the use of whitespace within the xml markup for human readability; that is, only whitespace wrapped in a <pc> element indicates a character from the text, while all other whitespace is insignificant.

alterations observed in the source text are mainly of two kinds, additions and deletions, for which the “addition” (<add>) and “deletion” (<del>) elements from the core tei guidelines ( . . ; cf. . . . ) are used. with the <add> element, the @hand (e.g., "scribe," "witness ") and @place (e.g., "above," "inline") attributes are used, and with the <del> element, the @type (e.g., "erasure") attribute. examples:

( ) <add hand="scribe" place="above">ʾlhʾ</add> ( . ) (the scribe added the word ʾlhʾ “the god” above the line)

( ) ( . ) (the scribe originally wrote the cipher for the number “ ”; by erasing the last vertical stroke, he corrected the number to “ ”)

additions, which (in this document) are generally inclusions of material accidentally left out initially and written in inter-linearly above the line, are marked approximately at the point in the text where they are inserted (often above spacing between words; cf. meulen and tanselle , p. ). deletions, which are mostly erasures, are marked at the inter-linear point at which they occur. the markup aims only to report the fact of alteration. all additional editorial comment, including specification of chronological sequence, is reserved for the epigraphic commentary. the presence of the accompanying facsimile offers a helpful clarifying aid, relieving the markup of the need to be overly precise as to point of execution.

on occasion it is apparent that a deletion and addition have been coordinated. for example, at the beginning of .
the scribe initially wrote kl nšn , “altogether, ladies.” upon recognizing his mistake (the sellers are husband and wife) he erased the final vertical stroke in the cipher for the numeral “ ” (converting it to the cipher for the numeral “ ”) and added gbr “ man” inter-linearly above the line following the word kl “all.”

( ) kl gbr nšn

there is no way of knowing the precise sequence in which these alterations were executed (i.e., addition then erasure, erasure then addition), but it is at least clear that they are interdependent. in such cases, the tei guidelines allow for the use of the “substitution” (<subst>) element ( . . . . ) to group coordinated alterations. however, since in some cases (as here) the alterations are not proximate, we forgo the use of this element, marking only the fact of an addition and deletion and relying on the epigraphic commentary to detail a more precise characterization of the coordination (or of other relevant matters).

there are places where a material reading is unclear for some reason. for example, in . the wife’s name was originally written as wbl. later the scribe adds an additional letter super-linearly, above and to the left of the bet:

( )

kraeling construes the letter as an aleph and reads ʾwbl (bmap, pp. - ). in contrast, porten and yardeni construe the letter as a yod and read wbyl (tad b, p. ). the name is apparently hurrian (kornfeld : ) and is spelled three other ways: ʾwbl ( . ), ybl ( . ), and ʾwbyl (bmap . ). graphically, the inserted letter patterns more like a yod than an aleph, especially in overall size, and its placement (above the line after the bet) is also consistent with the yod in ʾwbyl (bmap . ).
if the scribe intended an aleph as the initial consonant in the name, presumably he would have inserted the letter above the line and closer to the beginning of the name, for which there is plenty of space (most of the super-linear additions in this document are inserted beginning approximately at the point in the text where they would fall most naturally, e.g., gbr in . ; ʾlhʾ in . ; zk in . ).

( ) gbr in . ( ) ʾlhʾ in . ( ) zk in .

in such cases, the tei guidelines provide for multiple ways of marking such variation. we have opted to use the “apparatus entry” (<app>) element, which may be used “whether or not represented by a critical apparatus in the source text,” with the parallel segmentation method for coding variant readings ( . . ), because it provides maximum transparency and flexibility. when there is an editorial preference for a reading (as here) we mark that with the “lemma” (<lem>) element (with the @resp attribute signaling any supporting opinions, e.g., "tad"). other readings are marked with the “reading” (<rdg>) element and the @resp attribute (e.g., “k”). so our markup for this example is as follows:

( ) wb y ʾ

if the alternative readings are judged to be equally preferable each is marked with the <rdg> element (with @resp attribute), as (possibly)v in the following example ( . ):

( ) bmap . ḥyḥ ḥyrw

again, the main intent is reportorial in nature. any supporting rationale will be given in the epigraphic commentary.

morphosyntactic markup

classification of words into parts of speech (or word classes) is not entirely straightforward. the modern linguistic practice has been to use morphological and syntactic criteria for defining parts of speech, which of course vary cross-linguistically and do not necessarily totally overlap even within languages. the appendix contains our working pos classification.
the intention here is not to innovate linguistically. we have attempted to use an intuitive, field-specific sense of the relevant grammatical and lexical categories in use. in general, our default analysis is cued principally by the treatments in standard grammars (e.g., gea (muraoka and porten )) and lexicons (e.g., cal (kaufman et al., n.d.), dnwsi) of the various aramaic dialects. the chief aims in providing such tagging are to ease usability of these documents and to support the linguistic analysis of their language. furthermore, wanting all markup to be well-formed xml, and thus enabling general portability and use of standard xml parsers for processing and the like, we have utilized only five general tei elements:

( )
<name> (name, proper noun) contains a proper noun or noun phrase
<num> (number) contains a number, written in any form
<w> (word) represents a grammatical (not necessarily orthographic) word
<m> (morpheme) represents a grammatical morpheme
<abbr> (abbreviation) contains an abbreviation of any sort

four of the five elements (<name>, <num>, <m>, and <abbr>) are used fairly restrictively. the <w> element, in contrast, does the bulk of the descriptive work.vi

several observations about the markup itself. first, the <name> and <num> elements provided by tei accord well with the fact that proper names and numbers are generally distinguished lexicographically in semitic (and aramaic in particular) from other word categories. numerals are mostly indicated through ciphers in this document, which we indicate with the @type (="cipher") and @value (e.g., " ") attributes. when a number is spelled out, as in .
(lʿšrtʾ "to the ten"), we indicate what kind of number with the @type attribute (e.g., "cardinal") and then wrap it within the <w> element:

( ) ʿšrt

we do something similar with the <abbr> element. since we are not interested necessarily in expanding abbreviations within the transcription but in pointing to a lexical entry, we identify an abbreviation with the <abbr> element and then wrap it in the <w> element:

( ) r ( . )

the <w> element, the workhorse of this markup scheme, may appear with as many as three attributes: the @type attribute identifies the relevant part of speech; the @subtype attribute provides pertinent inflectional information (e.g., for nouns: gender, number and state; for verbs: binyan, tam, person, gender, and number); and the @lemma attribute points to the citation form in a lexicon (module). in the case of homographs, we have followed the ordering found in cal. as a rule, the attributes are used only as relevant (e.g., prepositions, conjunctions and the like require no @subtype entry) and only to the extent relevant (e.g., the @subtype description of nouns with possessive suffixes is marked only for gender and number).vii initially, we have erred on the side of providing more descriptive pos categories, especially when it comes to the various kinds of particles, conjunctions, adverbs, and the like that are used.

we treat clitics differently, depending on their kind. clitics are (phonologically) bound forms (“constrained to occurring next to an autonomous word” (hopper and traugott , p. )) that have an independent syntactic role and thus may be thought of as standing halfway between autonomous words and fully grammaticalized affixes. prepositions and pronouns are two word categories that often become cliticized in natural languages. we have marked the proclitic prepositions (b-, l-, k-) and conjunctive waw (w-) with the <w> element:

( ) b ( .
) l ( . ) k ( . ) w ( . )

this is consistent with the general treatment these elements receive from lexicographers, who habitually provide lexical entries for them in the standard lexicons. by contrast, the pronominal suffixes attached to verbs, nouns, and prepositions have been marked with the <m> element:

( ) bh ( . )
( ) ksp k ( . )
( ) ygrn k ( . )

the logic here is twofold: one, pronominal suffixes are not treated separately lexicographically in aramaic (and in west semitic generally), and, two, they are not considered a part of the standard inflectional feature set for verbs and nouns.viii marking them with the <m> element (instead of with the <w> element) signals both of these distinctions and captures as well these clitics' strong resemblance to other affixes (e.g., suffixes on the perfect), their lexical status notwithstanding.ix

another area where we default to the lexicographers (at least initially) is in our treatment of compound or pseudo prepositions (gea, ). for example, following cal (and gea) we consider both bšm (l. ) and br mn (l. ) as fully grammaticalized and thus autonomous lexical items and mark them as such:

( ) br mn ( . )

alternative markups privileging the decomposition of these complex items are readily imaginable and perhaps could be handled alternatively using the element:

( ) bšm bšm
( ) br mn br mn

we have used the traditional nomenclature for the various verbal binyanim (e.g., peal, pael, afel) in the markup but will write a program that allows users to shift back and forth between this set of terms and the newer, cross-semitic terms (e.g., g, d, c, as used in cal).
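a program of the kind just mentioned could start from a simple two-way lookup table. the sketch below is hypothetical and covers only the three correspondences named in the text (peal/g, pael/d, afel/c); the function name convert_binyan, the uppercase spelling of the cross-semitic sigla, and the decision to pass unknown labels through unchanged are all assumptions made for illustration.

```python
# correspondences taken from the text; any further binyanim would have to be
# added by the editors and are deliberately not guessed at here
TRADITIONAL_TO_CROSS_SEMITIC = {"peal": "G", "pael": "D", "afel": "C"}
CROSS_SEMITIC_TO_TRADITIONAL = {
    siglum: name for name, siglum in TRADITIONAL_TO_CROSS_SEMITIC.items()
}


def convert_binyan(label, target):
    """Translate a binyan label into the requested nomenclature.

    target is "cross-semitic" or "traditional"; labels that are not in the
    table are returned unchanged rather than guessed at.
    """
    if target == "cross-semitic":
        return TRADITIONAL_TO_CROSS_SEMITIC.get(label.lower(), label)
    if target == "traditional":
        return CROSS_SEMITIC_TO_TRADITIONAL.get(label.upper(), label)
    raise ValueError("target must be 'cross-semitic' or 'traditional'")


as_siglum = convert_binyan("pael", "cross-semitic")
as_name = convert_binyan("G", "traditional")
untouched = convert_binyan("ethpe", "cross-semitic")
```

keeping the mapping in one table means the markup itself never has to change when a user prefers the other set of terms; only the display layer consults the table.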
conclusion

in closing, nothing about what we have just reviewed in terms of xml markup seems to us to be revolutionary, either technically or theoretically. the surprise remains the general absence of a scholarly discussion on such issues in the field. in part we suspect this is because most of the digital-based text projects in the field to date have been dominantly entrepreneurial in motivation and orientation and not conceived as research or scholarship. there are exceptions. for example, there seems to be a live interest currently in leveraging digital resources for syntactic analysis of various (semitic) text corpora, and there are now a number of sites dedicated to presenting transcriptions of cuneiform literature (e.g., sources for early akkadian literature, http://www.seal.uni-leipzig.de/). but to our knowledge no digital-based project involving texts from the ancient near east (esp. pre-hellenistic corpora) has been conceived of from an explicitly articulated editorial perspective.x that is, most of the commonly used electronic text resources in the field (e.g., accordance, logos, michigan-claremont-westminster electronic hebrew bible) are essentially what is known as “reader editions.” they are not critical or scholarly editions and therefore, ultimately, cannot be depended on academically. these “reader” editions have served the field well, showing, for example, the viability and benefit of electronic text-based resources and “tools.” now the field needs to take the next step: to create critical, scholarly editions that will make use of all of the advantages of the currently available electronic reader editions and also be trustworthy and reliable. this is what we are proposing to do with dbmap.
notes

i this represents a slightly revised version of a paper presented in the digital humanities in biblical, early jewish, and christian studies unit at the annual meeting of the society of biblical literature in san diego, ca (november , ). the images used in examples , , and - are details of bmap ( . . ; =tad b . ). inscriptifact text isf_txt_ . photograph by bruce and kenneth zuckerman, west semitic research. courtesy brooklyn museum. reuse of these images is prohibited without permission of the rights-holders. we thank bruce zuckerman and marilyn lundberg of west semitic research and ed bleiberg of the brooklyn museum for their support of this project more generally.

ii in what follows, we employ inline markup. as one reviewer of this paper has pointed out, however, other methods of markup, such as stand-off markup (see, for example: http://www.tei-c.org/activities/workgroups/so/sow .xml; http://www.balisage.net/proceedings/vol /html/banski /balisagevol -banski .html), may actually end up being more congenial to our project. we find this an incredibly generative observation and plan to explore further such possibilities as the project moves forward.

iii kline and perdue ( , p. ): “increasingly common are print editions in which a photo facsimile appears as part of a parallel text accompanying a printed editorial transcription. digital scanning creates wider options for editors who wish to offer such photographic images in online or dvd-based editions, conveniently linked to machine-searchable transcriptions, accessed through automated indexes.”

iv both bmap and tad (unwittingly) offer approximations of a “typographic facsimile,” although neither is consistent on this issue since these volumes are not expressly theorized from an editorial perspective.

v this is by way of example only and follows the judgment of porten and yardeni (tad b, ); we have not looked closely at personal names to this point.

vi here we emphasize the practical and limited nature of our initial experiment. there are other standards that are both compatible with tei and promote the use and reuse of textual data across applications, e.g., laf (linguistic annotation framework, iso ; see ide and romary , p. - ).

vii historically, possessive suffixes were attached to nouns after the case endings in aramaic, and syntactically, nouns with possessive suffixes are considered determined. some synchronic grammars of specific aramaic dialects (e.g., gea, ; hug , p. ) indicate that the suffixes are attached to the construct forms of nouns. whether this is the right analysis is open to debate, but even if correct, for our purposes, the presence of a possessive suffix implicates the use of the construct state of the noun, and therefore need not be explicitly marked (cf. bar-haim, sima'an, and winter , p. ).

viii contrast the suffixes on the perfect form of the verb, which are clearly related historically to the larger pronominal system, with the chief difference that over time they became fully grammaticalized as suffixes, and thus a part of the verb's inflectional morphology.

ix contrast bar-haim, sima'an, and winter ( , p. , ), who treat pronominal suffixes on verbs and prepositions in modern hebrew as word segments, but not possessive suffixes on nouns.

x for example, seal offers this as its main rationale: “to enable the efficient study of the entire early akkadian literature in all its philological, literary, and historical aspects.” the site boasts of new “collations” for the texts presented, but offers no explicit editorial theory for guidance. presumably this is to be elaborated in the print volumes under production.
appendix: part of speech (pos) inventory

<name> (name, proper noun) contains a proper noun or noun phrase
  @type="person" "divine" "place" "gentilic"

<num> (number) contains a number, written in any form
  @type="cipher" "cardinal" "ordinal" "fraction" "multiplicative"

<abbr> (abbreviation) contains an abbreviation of any sort

<m> (morpheme) represents a grammatical morpheme
  *mainly used (now) for representing object and possessive suffixes
  @type="sf-(person, gender, number)"

<w> (word) represents a grammatical (not necessarily orthographic) word
  @type="pos(sessive)" @lemma="(dictionary entry)"
  @type="indef(inite)" @lemma="(dictionary entry)"
  @type="prep(osition)" @lemma="(dictionary entry)"
  @type="conj(unction)" @lemma="(dictionary entry)"
  @type="neg(ative)" @lemma="(dictionary entry)"
  @type="cond(itional)" @lemma="(dictionary entry)"
  @type="inter(rogative)" @lemma="(dictionary entry)"
  @type="adverb" @lemma="(dictionary entry)"
  @type="interj(ection)"
  @type="verb" @subtype="(binyan: pe, pa, af/haf, ethpe, ethpa, ettaf)-(tam: pf, impf, impv, inf, part)-(person, gender, number)" @lemma="(dictionary entry)"
  @type="noun" @subtype="(gender, number)-(state: abs, cstr, det)" @lemma="(dictionary entry)"
  @type="adj(ective)" @subtype="(gender, number)-(state: abs, cstr, det)" @lemma="(dictionary entry)"
  @type="pron(oun)" @subtype="_(person, gender, number)" @lemma="(dictionary entry)"
  @lemma="(dictionary entry)"
  @type="exist(ence)" @lemma="(dictionary entry)"
  @type="part(icle)" @lemma="(dictionary entry)"
  @type="abbr(eviation)" @lemma="(dictionary entry)"

bibliography

alexander, p. h., ed., .
the sbl handbook of style: for ancient near eastern, biblical, and early christian studies. peabody: hendrickson.

bar-haim, r., k. sima'an, and y. winter, . part-of-speech tagging of modern hebrew text. natural language engineering ( ), pp. - .

chesnutt, d. r., s. m. hockey, and c. m. sperberg-mcqueen, . markup guidelines for documentary editions. july. [online] available at: http://xml.coverpages.org/mepguide .html [accessed december ]

donner, h. and w. röllig, - and . kanaanäische und aramäische inschriften. d ed and th ed. vols. wiesbaden: harrassowitz. (=kai)

hoftijzer, j. and k. jongeling, . the dictionary of north-west semitic inscriptions. vols. leiden: brill. (=dnwsi)

hopper, p. j., and e. c. traugott, . grammaticalization. cambridge: cambridge university press.

hug, v., . altaramäische grammatik der texte des . und . jh.s v. chr. heidelberg: heidelberger orientverlag.

ide, n., and l. romary, . international standard for a linguistic annotation framework. natural language engineering ( - ), pp. - .

iso, . language resource management – linguistic annotation framework. iso : . edition . [online] available at: http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= [accessed december ] (= laf)

kaufman, s., et al. n.d. comprehensive aramaic lexicon. cincinnati: hebrew union college. [online] available at: http://cal .cn.huc.edu/. [accessed december ] (= cal)

kline, m.-j., and s. h. perdue. . a guide to documentary editing. d ed. charlottesville: university of virginia. [online] available at: http://gde.upress.virginia.edu/ [accessed december ]

kraeling, e. g., . the brooklyn museum aramaic papyri: new documents of the fifth century b.c. from the jewish colony at elephantine. new haven: yale university press. (= bmap)

meulen, d. l. v., and g. t. tanselle, . a system of manuscript transcription.
studies in bibliography , pp. - .

mla's committee on scholarly editions, . guidelines for editors of scholarly editions. mla.org. [online] available at: http://www.mla.org/cse_guidelines [accessed december ] (= cse)

muraoka, t., and b. porten, . a grammar of egyptian aramaic. leiden: brill (= gea)

porten, b., and a. yardeni, - . textbook of aramaic documents from ancient egypt. vols. winona lake: eisenbrauns. (=tad)

text encoding initiative, n.d. p : guidelines for electronic text encoding and interchange. [online] available at: http://www.tei-c.org/release/doc/tei-p -doc/en/html/index.html [accessed december ] (= tei)
against the grain / november

biz of digital - developing and growing a new repository service: part expansion

by column editor: michelle flinchbaugh (acquisitions and digital scholarship services librarian, albin o. kuhn library & gallery, university of maryland baltimore county, hilltop circle, baltimore, md ; phone: - - ; fax: - - )

column editor’s note: this is part of a part series on creating a new repository service. part : getting started appeared in the june issue (v. # ). part : procedures for library submissions appeared in the september issue (v. # ). this is part , which completes the series. - mf

introduction

the university of maryland, baltimore county (umbc), a research-intensive institution with full-time, part-time faculty, was utilizing the maryland shared open access repository (md-soar) dspace platform to develop repository services. few faculty would self-submit, so the library was developing processes, procedures, and documentation for doing submissions in the library. a librarian shifting duties to work full-time on a new repository, the digital scholarship services (dss) librarian, was eight months into a soft roll-out with minimal outreach only to individual faculty members, and she was the only staff working on the repository. despite the limited outreach, and although the system and services hadn’t been rolled out to all of campus, a robust flow of submissions for her to process and enter had developed. the system needed to be rolled out to all of campus, and staffing needed to be added; both were done simultaneously at the beginning of the fall semester.

expanding service to the entire campus

first, the implementation of the system was announced to all of campus. then the dss librarian attempted to tell all of the faculty on campus about the new system. first, she sent an email to the email list for academic department chairs asking to present at one of their meetings in the coming year. most didn’t respond, but she was scheduled into a few departments’ meetings. she then began contacting department chairs for the remaining departments individually, and continued with this through an entire semester. she was able to set up many more presentations at departmental meetings, but still the vast majority of departments, %, didn’t respond. for those departments where she wasn’t invited to present, she emailed a flier to all of the faculty in the department. later she began contacting campus centers about scholarworks@umbc. via these presentations and contacts, and ongoing processing of new umbc publications via google scholar alerts, at the end of the first year % of umbc’s academic departments had at least one work in scholarworks@umbc. all of the major academic administrative units (the provost’s office, deans’ offices) and centers also had works in scholarworks@umbc.

hiring help

the amount of time that the dss librarian had to process and enter new submissions became insufficient as soon as she began receiving long lists of publications. she notified both her supervisor and other administrators that she was only able to add a little more than items per month, and of the number of submissions she had, and requested a student. she also began developing and preparing processes and documentation for a student to do the data entry work, and also possibly some processing of submissions.

figure : fields in the submission spreadsheet that student assistant uses to input items into the ir

the challenge of having a student do data entry work was in how to convey information to the student on what to enter without entering all of the data on the item. the first attempt at this was a simple list of items to enter, along with a link to the item’s metadata record on the publisher’s site. metadata values readily available on the publisher’s site (author, abstract, keywords, etc.) weren’t included in the list of items to enter since the student assistant could readily find them, whereas information not available or that requires a judgement call was included in the list. relying on copying and pasting from the publisher’s metadata record where possible speeds up both the processing of items and their submission to the system. however, using a list, the librarian noted that there were inconsistencies in what information she included and didn’t include, some because of the nature of the work and others because of which versions of the work were available and what metadata was available, and yet others because she simply didn’t prepare the items consistently.

unhappy with the list, she switched to a spreadsheet with columns, and decided most information that could be found in metadata records and the work itself wouldn’t be included in the spreadsheet, limiting the spreadsheet to links to works and metadata about works and information not readily available in metadata or on the work, or that required a judgement call. (see figure .) the spreadsheet was supplemented with detailed documentation on entering and completing items from a spreadsheet, available here: https://wiki.umbc.edu/display/library/entering+and+completing+items+from+a+spreadsheet% c+full+procedure and a metadata chart to use as a short guide on what to look for and where to put it in a record. (see figure .)
to finish, the dublin core element was added to all lines in the “where to put it” column to facilitate editing in administration, which is done entirely by dublin core element.

in addition to providing these resources, after hiring the student assistant the dss librarian spent a great deal of time training the student in steps, checking and correcting work until it was done correctly. initial training covered only the submission process utilizing the submission form. when that was mastered, the student was trained in utilizing administrative capabilities to edit metadata and map items to additional collections, then in how to use the administrative capabilities to correct an error. in a final phase of training the student learned to add rights statements, change a creative commons license version, and embargo an item for automatic release when the embargo ends.

once the student had worked through a substantive backlog of items to be entered, the dss librarian also trained the student to determine if an item is in scope for the repository, to check rights, and then to enter those works that can be posted into a spreadsheet for future entry. substantive documentation was created covering this. the portion on scope covers theses and dissertations, cvs, obituaries, and abstracts with no full text, and the requirement that an author must be affiliated with umbc or that the article be about umbc or someone affiliated with it. the section on checking rights is broken down by format, and covers creative commons licenses, open access, and u.s. federal government publications. a final section covers determining which collections to add an item to. the full procedure is available here: https://wiki.umbc.edu/pages/viewpage.action?title=preparing+a+spreadsheet+of+items+to+enter&spacekey=library. additional detail could be added to spell out some situations and decisions not covered, but this already extensive procedure is very challenging for someone just beginning this work. training on it is best divided up, with a new staff person doing the steps they know and the dss librarian doing the rest. this would allow for training in more manageable phases and better mastery of the work.

figure : guide for what to look for and where to put it when entering items into the ir

still needing more help

to date, the dss librarian has barely scratched the surface of outreach and much more can be done, but the bulk of her time has been spent on a perpetual backlog of items to process and enter for faculty. the current backlog of works to check rights and enter is reaching nearly , . with the dss librarian working full-time plus the half-time student assistant, the maximum monthly rate of processing has been per month so that this constitutes nearly a month backlog. additionally, they continue processing google scholar alerts, and a commitment was made to revisit publications websites annually to add new materials that have been added to them. the student assistant won’t be available during breaks, and graduates in just a year and a half. at the date of this writing, the dss librarian has requested a full-time line to hire a staff person, which was promised if/when there is enough work to justify doing so.

workflows

two workflows were developed: one for loading etds, and a second, bifurcated workflow that first checks rights, finds information, and files it into a spreadsheet, and then uses the spreadsheet to enter items. these two workflows handle % of items going into the repository. however, some items don’t fit well into the spreadsheet because of their nature, and require utilizing dublin core elements not normally utilized, or entering multiple values into a dublin core element where there is usually only one. other items may be serials, multivolume sets, art, video, or symposia with video of multiple presentations given by different people. modified spreadsheets were developed, or will be developed, to be used in these instances. at some point in the future, items that are particularly time consuming to enter manually (for example, works with more than ten authors) may also be loaded, depending on our ability to develop automated methods of reformatting data accurately.

future plans

with the large backlog of work, methods for making work more efficient are a high priority. in the short term, macro express can quickly automate some data entry tasks, making the entry of new submissions less time-consuming. the dss librarian has read articles on how other libraries have automated the submission process using citation managers and spreadsheets to batch load new submissions, and in the future will investigate whether what other libraries have done will work at umbc and with md-soar.

another high priority is further extending outreach. additional outreach needs to be done to professional programs, and to faculty and programs located at distant locations in either baltimore city or at the university of maryland system’s shady grove campus. outreach also needs to be done to lecture series, campus awards, and student publications and research forums. outreach to new faculty six months to a year after they come to umbc needs to be put in place. finally, an annual email needs to go to faculty affiliated with each department and center reminding them to send materials.

works in the repository also need to be promoted. a plan to have the system automatically tweet all new items has stalled. umbc library’s committee of social media and outreach will promote items on social media, but interesting items have to be identified and sent to them. other means of promotion are also possible, but it’s been difficult to find time for this with a perpetual large waitlist of materials to process and add.

the dss librarian has also been working with the md-soar governance group toward enhancements to solve various inefficiencies and problems related to the system configuration. a current effort is being made to standardize how metadata indicates that an item is a preprint or postprint and to select enhancements to move forward. a new extended submission form is desirable, and tweaks to the indexing, display, and field configuration would also be of value.

another area needing work is resolving inconsistencies in metadata as procedures have changed over time. in some instances we’ve learned, and in others we’ve reached agreements not previously realized, so there are inconsistencies between how we entered records six months ago and how we enter them now; at some point we hope to do a large-scale batch edit of our metadata to make records consistent. of particular importance is putting in place some type of authority control on the names of umbc authors. the system automatically creates author pages, but creates an additional author page for every different form of a name. putting authority control in place for umbc author names would collate each author’s works on one author page.

repository training for faculty was developed and scheduled, but with very little interest in self-submission on campus, no one rsvp’ed and the training session was cancelled. the university of maryland, college park, has had some success with a workshop on authors’ rights, and important rights issues directly pertinent to the repository can be addressed in that type of workshop, so in the future the dss librarian hopes to develop and offer such a workshop.

moving from a part-time student assistant with diminishing returns on training to a full-time staff person is also highly desirable. that would facilitate much more work getting done at a higher quality. this would also allow the dss librarian to shift from spending most of her time doing staff-level production work (or correcting the student assistant’s work) to doing further outreach and promotion of the repository and the works in it, refining procedures and documentation, and batch clean-up of the data in the system. at present, the digital scholarship service librarian can only take on a single project each summer. more time would also allow her to develop and offer workshops and do more outreach. additionally, she could develop other digital scholarship services in consultation with library management and write grant proposals toward obtaining funding for startup costs.

conclusions

the gradual implementation was good preparation for extending the repository service to all of campus, and moving to a production mode where many more items would be processed and added each month. basic procedures were in place, allowing a shift in focus from legal and technical issues to people, from how to do work to working with staff to increase production. it also gave the novice dss librarian time to learn. in months since the implementation of scholarworks@umbc, , items have been added, bringing the total number of works to , , and there have been , visits to scholarworks@umbc. this is a strong start, but in time umbc could potentially be adding , + items per year to the repository.

biz of digital: developing and growing a new repository service: part expansion

squirreling away (continued)

we serve. the most successful libraries in navigating through change are likely the ones that have the greatest ability to be flexible given the changes that our profession is being asked to take on. here are three areas where flexibility has been key in the way we have faced change.

first, flexibility in the services we provide is one of the most important aspects to consider with library change management. there are certain aspects of our work and the services we provide where we need to make sure that we are aligned with our community needs. it seems that much of the literature about marketing in libraries (and elsewhere) is about how to successfully increase adoption or use of a service or product that you are providing vs. providing what is desired. one instance where we had to change was our desire to manage course reserves in our new library. taking into consideration the lack of space and the desire not to hire evening staff to work onsite, we opted to change course and not bring back course reserves to our library portfolio. we aspired to move items into electronic format if possible, but as you know, this can be difficult with course adoption texts, let alone textbooks. we had to balance what we felt we could do against the needs of our community, and moved accordingly. it would be nice if we could provide this, but it did not meet our abilities and resources.

second, flexibility with collections is certainly a key aspect to change management. over the past eight years in our library, we have had flat or declining collections budgets. even in a flat year, that means cuts to cover inflation for the resources that you are planning to keep. the flexibility involved here is to ensure that your cuts are balanced and that redundancies are eliminated first. while you can work on a straight cost per use model, this will likely not tell the entire story. resources targeted for faculty and researchers will likely be far less commonly used than ones geared to student use. additionally, there might be resources that have been staples in the library collection since before you arrived and that represent the core holdings of many libraries. but if they are not being used, then that is an opportunity for you to make changes that reflect your reality at the library.

third, flexibility with people might be the most important aspect of navigating through change in any organization. i believe that change is an extremely personal construct that will impact different people in vastly different ways. in our staff of just under twenty, we had some people who saw relatively little change in their day to day life with our transformation, as well as some who had to learn nearly a completely new job. but long before this change, i preached flexibility in the workplace for one simple reason. my premise is that if i am flexible with my team, they in turn may be flexible with our users and the community we serve. conversely, if i were rigid or rule bound with my team, it would be difficult to expect them to be flexible with our community. additionally, a great number of the ways that we may be flexible with our teams can create a better working environment. giving your team the freedom and flexibility to navigate through these changes as they see fit enables the library group to better serve your community. while there will be aspects of library change that are fairly rigid, especially with space and budget constraints, creating a flexible environment will pay dividends for you and your team.

so just like that cute puppy, kitten, dog, cat or other pet you bring home from an adoption event, there is a great deal of joy that will come your way. but the more rigid you are in bringing this new being into your family, the more likely you will be disappointed and troubled with the results. pets can be a great deal of work, but all of it is worth it when they are curled up with you when you are working on your late articles, right?

corey seeman is the director, kresge library services at the ross school of business at the university of michigan, ann arbor. he is also the new editor for this column that intends to provide an eclectic exploration of business and management topics relative to the intersection of publishing, librarianship and the information industry. no business degree required! he may be reached at or via twitter at @cseeman.

endnotes
sutton, sarah. “flexibility in the face of change.” library resources & technical services, vol. , no. , apr. , pp. – . ebscohost, doi: . /lrts. n .
kesselman, martin a. “hot tech trends in libraries: flexibility and changeability is the new sustainability.” library hi tech news, vol. , no. , sept. , pp. – . ebscohost, doi: . /lhtn- - - .

city, university of london, msc library science
inm libraries & publishing in an information society, ernesto priego
may
assignment option
identify the main ways in which transformations in publishing are changing the way people do research. what are the relationships between publishing and digital scholarship? and what do these relationships make possible? what are some challenges and opportunities for publishers and/or libraries in the context of the new developments in digital scholarship?
word count , including titles; estimated reading time: min

mariana strassacapa ou

publishing as sharing: observations from oral history practices in the digital humanities

despite the evident general feeling that we experience an information deluge in our daily lives, whether ours is an ‘information society’ is a subject of great debate. the term implies that ‘information’ is the very defining aspect of today’s society, rather than ‘agriculture’, for example (bawden & robinson, ); it also implies that at some point in the twentieth century a revolution took place, one that replaced a previous ‘industrial society’ with the current ‘information society’ as it fundamentally disrupted technologies and cultural practices related to human communication. even though i am not convinced by the idea that we live in a ‘new’ kind of society, and prefer interpretations that identify the continuities in the development of modernism and capitalism through the last century, it is undeniable that in recent decades transformations in mediated communication have accelerated the production and dissemination of information enormously, increasing the complexity of the ways people interact (borgman et al., ). the widespread use of the internet and the world wide web through cheap, personal digital information computing devices is largely responsible for these profound transformations; the term ‘digital’, originally applied as synonymous with discrete electronic processing techniques, came to refer to anything related to computers, from electronics to social descriptors (digital divides, digital natives), to emerging fields of inquiry (digital art, digital physics) (peters, ). ‘digital scholarship’ fits the latter category; according to christine borgman, it ‘encompasses the tools, services, and infrastructure that support research in any and all fields of study’ ( ).
clearly this is quite a broad definition, but it does express the essential idea that scholarly practices and research opportunities have been widened in many new ways. as i will argue here, a leading force defining digital scholarship has been the generalisation, in the digital milieu, of publishing as sharing.

‘sharing’ as the new rhetoric of publishing

in the book digital keywords: a vocabulary of information society & culture, nicholas john scrutinises the term ‘sharing’ and the meanings it has recently acquired through use in the digital realm. non-metaphorically, john explains, to share is to divide, and at least since the sixteenth century the word has referred to the distribution of scarce resources; recently, though, it has also been attributed a more abstract communicative dimension: ‘a category of speech, a type of talk, characterised by the qualities of openness and honesty, and commonly associated with the values and virtues of trust, reciprocity, equality, and intimacy, among others’; it has become ‘the model for a digitally based readjustment of our interactions with things (sharing instead of owning) and with others’ (john, ). furthermore, ‘sharing’ would also imply a positive attitude towards future society; john talks in terms of the promise of sharing:

the promise of sharing is at least twofold. on the one hand, there is the promise of honest and open (computer-mediated) communication between individuals; the promise of knowledge of the self and of the other based on the verbalisation of our inner thoughts and feelings. on the other hand, there is the promise of improving what many hold to be an unjust state of affairs in the realms of both production and consumption; the promise of an end to alienation, exploitation, self-centred greed, and breathtaking wastefulness. (john, )

publishing after the digital boom (and specifically after the internet and the world wide web took over a large share of our usual communication routines), i argue, has a meaning that increasingly intersects with that of the ‘sharing’ we are referring to here. digital publishing and ‘sharing’ are intertwined as both follow a ‘distributive logic’ that is more sustainable than, and an alternative to, capitalist models of production and consumption (john, ); publishing has had its definition widened, as well as its actors and subjects, and, just like ‘sharing’, it ‘plays heavily on interpersonal relations, promising to introduce you to your neighbours, for instance, or to reinstate the sense of community that has been driven out by, say, the alienation supposedly typical of modern urban life’ (john, ): it is now part of everybody’s daily activities, and not just a specialised profession.

this new notion of ‘publishing as sharing’ is in accordance with the new paradigm of openness in digital scholarship. publishing processes had to be readapted, some of them radically, both to developments in digital technologies and to the pervasive digital ‘sharing’; when it comes to academic publishing and research practices, that means ‘open scholarship’, as in making your research data available in a repository for consultation and reuse; ‘open access’, as in publishing free of charge academic articles that would otherwise be charged for in digital journals; and ‘open dissemination’, as in the idea behind institutional websites like the oxford university research archive (two screenshots below), a friendly, searchable repository of research outputs, including many open-access articles.
in this essay, i use the debates on oral history in the digital humanities to present some of the relationships between publishing and digital scholarship and their implications, as well as challenges and opportunities that should concern those involved in both publishing and library & information science.

new standards in oral history: widening scholarship practices through digital publishing

the transformations in scholarship brought about by the universe of digital possibilities and the world wide web abound, but few fields have been impacted as much as oral history. in the introduction to oral history in the digital humanities: voice, access, and engagement (boyd & larson, ), the authors provide an overview of the developments in oral history and highlight how they were heavily influenced by the changing recording technologies of the last decades. affordable and accessible new analogue technologies helped establish oral history as a compelling methodology for historical research in the s, but the transcription of the audio recordings still posed a great challenge from the library/archival perspective: as text, transcripts were considered a more efficient form of communication than the recording, easier to go through when looking for specific bits of information; ‘without the transcript, the archive might have no more information about an oral history interview on its shelves beyond a name, a date, and the association with a particular project’, and oral history collections (of cassettes) were always under the threat of obscurity, with no prospect of use or discovery (boyd & larson, ).
digital technologies, however, not only came to solve these problems but, with the world wide web, also gave new and widened meanings to access; as the authors point out, ‘digital technologies posed numerous opportunities to explore new models for automating access and providing contextual frameworks to encourage more meaningful interactions with researchers as well as with community members represented by a particular oral history project’. in this essay, i present four main changes in publishing after the ‘digital shift’ (publishing = sharing) that we can identify from oral history’s new practices in research and dissemination:

• the ‘democratic spirit’

boyd & larson talk about a ‘democratic spirit’ found in both oral history and the digital humanities as ‘the sense that the materials created, shared, generated, or parsed belong to everyone—not just to the educated or the well-to-do, but to those outside the university walls as well as those within’. indeed, oral historians are interested in history from the ‘bottom up’, the one that can be found and captured in common people’s voices, and are thus characterised by a more ‘democratic’ approach to historical inquiry, one that assumes collective participation in the creation of materials; in combination with the digital humanities, this inclusion of people in the creation process extends also to people’s access to these materials (boyd & larson, ); oral history’s ‘democratic’ values and preconditions are enhanced and find fertile ground in digital publishing. as we can read in the founding statement of the journal for multimedia history of the university at albany, a website that used to publish oral history collections:

[it is] because so much of what we were doing as professional historians seemed so isolating that we wanted to “get out on the web”, to reach not only academicians, but an entire universe of interested readers. we wanted to bring serious historical scholarship and pedagogy under the scrutiny of amateurs and professionals alike, to utilise the promise of digital technologies to expand history's boundaries, merge its forms, and promote and legitimate innovations in teaching and research that we saw emerging all around us (zahavi & zelizer, )

i understand this ‘democratic spirit’, as boyd & larson put it, as a manifestation of one of the transitions in authorship in the digital realm, ‘from intellectual property to the gift economy’, suggested by kathleen fitzpatrick in her book planned obsolescence: publishing, technology, and the future of the academy. if academics and publishers are to restore scholarly communication’s origins and work towards genuinely open practices of producing and sharing academic content, she argues, then scholars must embrace the creative commons licenses for their work, ‘thus defining for themselves the extent to which they want future scholars to be able to reuse and remix their texts, thereby both protecting their right to be credited as the author of their texts and contributing to a vibrant intellectual commons that will genuinely “promote the progress of science and useful arts.”’ (fitzpatrick, ; citing the u.s. constitution). oral history research output has always been a complicated type of material in terms of authorship, ownership, and rights; whole collections may be inaccessible because of copyright issues, e.g. when the interviewer has died without leaving any documentation on the matter.
but online, it is becoming more common to apply cc licenses to oral history interviews through the interviewees’ consent forms; in the words of an oral historian, ‘it clearly keeps the copyright in the hands of the oral history interview participant, but allows us to freely share the recording and transcript on our open-access public history website and library repository, where individuals and organisations may copy and circulate it, with credit to the original source’ (simpson, ). the ‘democratic’ solution seems to be already available for academics, but the challenge now is to promote the cc license as such; academic and librarian jane secker seems to be on the right track when she refers to ‘copyright literacy’ as closely related to information literacy, and of concern to everyone who ‘owns a device with access to the internet’ (secker, ).

• ‘share your story’: authorship, collaboration, crowdsourcing

co-authorship in interviewing projects is nothing new, but collaborative work tends to become the norm when we consider oral history as part of the digital humanities. oral history has always been distinct from other practices in the humanities in that it often involves a certain complexity with regard to authorship (who is the author of an interview: the interviewer, the interviewee, both, or neither?), and this complexity has been successfully embraced in the digital realm. with crowdsourced websites like storycorps.org and antievictionmappingproject.net (below), anyone is encouraged to ‘share their story’ and take part as an author of a larger narrative, made up of the collection of stories that assemble an ever-changing, growing whole.
furthermore, as an oral history collection is published online and becomes a website, new roles that arguably correspond to that of an author become essential: ‘while there are always two (and sometimes more) participants in the initial recording of an oral history, i would argue that there are three primary players in the presentation and preservation of a digital oral history once it has been recorded—the oral historian, the collection manager, and the information technology (it) specialist. these three roles may, in some programs, actually be represented by the same person, but there are specific concerns and responsibilities particular to each’ (schneider; in boyd & larson, ). in that sense, oral history is indeed in conformity with the basis of the digital humanities, understood in contrast to the essentially mono-authorial and monographic traditional processes and outputs of research in the humanities; as the dh manifesto . states: ‘digital humanities = co-creation’ (the digital humanities manifesto . , ; in boyd & larson, ). this is not to say that digital humanities has not been disruptive to previous practices in the humanities. on the contrary: it is the sciences that appear to have found continuity and enhancement of their procedures and methods in the digital realm, given that, as gross & harmon argue, in the sciences ‘collaboration was already flourishing; the internet greatly facilitated it, among not only networked scientists from around the globe but also armies of citizen-scientists participating through websites like galaxyzoo’ (gross & harmon, ). knowledge in the humanities, in contrast, the authors argue, builds up as ‘a chain of individual achievements. even in the st century, collaboration in the humanities, though more common than previously, is not common at all. when it does occur, only two scholars are usually involved. there is a sense that these achievements ought to be individual.’ the humanities seem to be lagging behind the sciences in terms of being able to embrace the web’s possibilities, as we can see from some online journals: the oral history review by oxford academic, for example, presents no audio recording files or any other interactive feature, just the traditional single-author text article in pdf. institutional digital publishing in the humanities would greatly benefit from more ‘digital’ explorations of content and linking, but that obviously involves difficult changes in well-established mindsets and practices with regard to the notion of the strong individual author and the acclaimed, recognition-conferring, conventional text-based academic journal article.

• ‘archive everything’

a habit that is being abandoned thanks to the possibilities of digital archiving and storage is getting rid of the audio recordings of oral history once they have been transcribed. now, researchers are able not only to keep the audio recordings and their many versions and editions, but also to house and organise the interview collections using digital repositories and content management systems like contentdm, and to enhance access to the interviews with ohms (oral history metadata synchronizer), which connects search terms with the online audio or video (website screenshot below) (boyd & larson, ). usability and discoverability issues are being sorted out by the ‘archive everything’ (giannachi, ) trend that comes with publishing-as-sharing practices. the ‘archive everything’ paradigm is becoming such a norm in digital scholarship that fitzpatrick talks about a ‘database-driven scholarship’, which refers to new kinds of research questions made possible through the online availability of collections of digital objects (fitzpatrick, ).
nyhan & flinn also mention a ‘rubric’ in the present research agenda of the digital humanities: one that looks back at humanities questions long asked and attempts to ask them in new ways, and to identify new questions that could not be conceived of or explored before (nyhan & flinn, ); academic digital datasets, databases and archives are major enablers of these new opportunities. gross & harmon use a prize-winning monograph as an example of how current possibilities help ‘historians see anew’: pohlandt-mccormick’s research on the soweto uprising uses ‘photographs and official documents as an archive that can supplement, even interrogate the traditional historical archive. her monograph contains images and reproductions of some written documents in all, a trove hard to imagine in a conventional book. these images and documents are reproduced in an “archive” in her e-book, and select ones are integrated into the text and hyperlinked to supplementary information.’ (gross & harmon, ). of course, database and archival academic websites are not just products of research, but are increasingly made available as opportunities for other researchers to develop new inquiries from them. that is one of the ideas behind making research data accessibility a requirement for journal publication; gross & harmon cite science’s stated policy as now typical: ‘as a condition of publication, authors must agree to make available all data necessary to understand and assess the conclusions of the manuscript to any reader of science’. with the ‘archive everything’ practices and the emergence of digital collections of data and documents comes the increasing significance of the activity of curation, meaning ‘making arguments through objects as well as words, images, and sounds’ (digital humanities manifesto . , ).
for fitzpatrick, curation relates to another shift in authorship that she identifies as ‘from originality to remix’:

we might, for instance, find our values shifting away from a sole focus on the production of unique, original new arguments and texts to consider instead curation as a valid form of scholarly activity, in which the work of authorship lies in the imaginative bringing together of multiple threads of discourse that originate elsewhere, a potentially energising form of argument via juxtaposition. (fitzpatrick, )

but just as difficult as establishing this kind of curation as legitimate academic work is enhancing the reusability of these valuable datasets and digital archives; merely requiring data sharing does not seem to be enough. if we want to ‘archive everything’, discoverability and dissemination are essential, but they cannot happen without a solid institutional base and support: storage must be ample, urls must always work, and metadata and indexing must be precise and efficient.

conclusion: academic publishing should be about sharing

layers of london is a project being undertaken at the university of london’s institute of historical research, funded by the heritage lottery fund; it ‘will bring together, for the first time, digitised heritage assets provided by key partners across london including: the british library, london metropolitan archives, historic england, the national archives, mola. these will be linked in an innovative new website which will allow you to create and interact with many different layers of london’s history from the romans to the present day. the layers include historic maps, images of buildings, films as well as information about people who have lived and worked in london over the centuries.’ (screenshot below) (layers of london, ).
it is still being developed at the time of writing, but it is already working hard on its dissemination, as ‘a major element of the project will be work with the public at borough level and city-wide, through crowd-sourcing, volunteer, schools and internship programmes. everyone is invited to contribute material to the project by uploading materials relating to the history of any place in london. this may be an old photograph, a collection of transcribed letters, or the results of local research project’ (layers of london, ). so, instead of an individual piece of historical research on mapping london that would traditionally be published as a textual product, layers of london is an open, funded website being built in an academic institution as a platform for voluntary contributions; it has a blog, a twitter account and, instead of an ‘author’, a team made up of a director, a development officer, an administrator and a digital mapping advisor. it represents all the shifts in authorship proposed by fitzpatrick: ‘from product to process’; ‘from individual to collaborative’; ‘from originality to remix’; ‘from intellectual property to the gift economy’; and ‘from text to… something more’ (fitzpatrick, ); and, just like contemporary oral history projects, its success will be ‘measured by metrics pertaining to accessibility, discovery, engagement, usability, reuse, and … impact on both community and scholarship.’ (boyd & larson, ). as an open digital humanities work that fully embraces the possibilities of the web, however, it faces all the challenges that this kind of academic digital publication usually faces today, including the struggle to be recognised as academic research at all. fitzpatrick points out: ‘the key, as usual, will be convincing ourselves that this mode of work counts as work—that in the age of the network, the editorial or curatorial labor of bringing together texts and ideas might be worth as much as, perhaps even more than that, production of new texts.’ (fitzpatrick, ).
this ‘convincing ourselves’ effort involves the difficult task of rethinking university practices and the academic career, which simply cannot afford to shy away from the disruptive impact of digital publishing as sharing. the humanities in particular have been trying to work this out through the digital humanities; according to nyhan & flinn, another ‘rubric’ of the dh ‘has a distinct activist mission in that it looks at structures, relationships and processes that are typical of the modern university (for example, publication practices, knowledge creation and divisions between certain categories of staff and faculty) and questions how they may be reformed, re-explored or re-conceptualised.’ (nyhan & flinn, ). it must be a concern and responsibility of the university to establish and guarantee academic publishing as sharing, addressing today’s unsustainable models of publishing and embracing the shifting, more open forms of scholarly communication and research; i agree with fitzpatrick: ‘publishing the work of its faculty must be reconceived as a central element of the university’s mission.’ (fitzpatrick, ). librarians have significant roles to play in this mission; the web is not a library, but librarians can help ensure it is used to its full potential, as a world wide networked communication system, and can help to let publishing be about sharing.

references

antieviction mapping project: documenting the dispossessions and resistance of sf bay area residents, ( - ). home. [online] available at: http://www.antievictionmap.com/#/we-are-here-stories-of-displacement-and-resistance/ [accessed may ].
bawden, d. and robinson, l. ( ). introduction to information science. london: facet.
borgman, c. ( ). digital scholarship and digital libraries: past, present, and future. keynote presentation, th international conference on theory and practice of digital libraries, valletta, malta. available at: http://works.bepress.com/borgman/ / [accessed may ].
borgman, c., abelson, h., dirks, l., johnson, r., koedinger, k., linn, m., … szalay, a. ( ). fostering learning in the networked world: the cyberlearning opportunity and challenge. national science foundation. available at: https://www.nsf.gov/pubs/ /nsf /nsf .pdf [accessed may ].
boyd, d. and larson, m. ( ) introduction. in: boyd, d. and larson, m., eds., oral history and digital humanities: voice, access, and engagement. new york: palgrave macmillan us.
the digital humanities manifesto. ( ). [online] available at: http://manifesto.humanities.ucla.edu/ / / /the-digital-humanities-manifesto- / [accessed may ].
dougherty, j. and simpson, c. ( ). who owns oral history? a creative commons solution. in: boyd, d., cohen, rakerd, s. and d. rehberger, eds., oral history in the digital age. institute of library and museum services. available at: http://ohda.matrix.msu.edu/ / /a-creative-commons-solution/ [accessed may ].
giannachi, g. ( ). archive everything: mapping the everyday. cambridge, massachusetts: the mit press.
fitzpatrick, k. ( ). planned obsolescence: publishing, technology, and the future of the academy. new york: new york university press.
gross, a. and harmon, j. ( ). the internet revolution in the sciences and humanities. st ed. new york: oxford university press.
john, n. ( ). sharing. in: peters, b., ed., digital keywords: a vocabulary of information society & culture. princeton: princeton university press.
the journal for multimedia history, ( , ). current issue.
[online] available at: http://www.albany.edu/jmmh/ [accessed may ].
layers of london, ( ). home. [online] available at: https://layersoflondon.blogs.sas.ac.uk [accessed may ].
nyhan, j. and flinn, a. ( ). computation and the humanities: towards an oral history of digital humanities. springer open. doi . / - - - -
oral history metadata synchronizer: enhance access for free, ( ). home. [online] available at: http://www.oralhistoryonline.org [accessed may ].
oxford university research archive, ( ). home. [online] available at: https://ora.ox.ac.uk [accessed may ].
pohlandt-mccormick, h. ( ). ‘i saw a nightmare…’ doing violence to memory: the soweto uprising, june , . [online] columbia university press and gutenberg-e. available at: http://www.gutenberg-e.org/pohlandt-mccormick/index.html [accessed may ].
secker, j. ( ). digital, information or copyright literacy for all? [blog] libraries, information literacy and e-learning: reflections from the digital age. available at: https://janesecker.wordpress.com/ / / /digital-information-or-copyright-literacy-for-all/ [accessed may ].
schneider, w. ( ). oral history in the age of digital possibilities. in: boyd, d. and larson, m., eds., oral history and digital humanities: voice, access, and engagement. new york: palgrave macmillan us.
storycorps. ( ). stories. [online] available at: https://storycorps.org/listen/ [accessed may ].
romans and rollercoasters: scholarship in the digital playground

leiden university scholarly publications
persistent url of this record: https://hdl.handle.net/ /
publisher's version, open access
collections: faculty of archaeology; centre for the arts in society (lucas)

politopoulos, a.; ariese, c.; boom, k.h.j.; mol, a.a.a. ( ) romans and rollercoasters: scholarship in the digital playground
article / letter to editor
journal: journal of computer applications in archaeology
doi: . /jcaa.

engagement with, or research and teaching driven by, play has long been only a minor aspect of archaeological scholarship. in recent years, however, spurred on by the continued success of interactive entertainment, digital play has grown from a niche field to a promising avenue for all types of archaeological scholarship (champion ; champion ; mol et al. a; morgan ; reinhard ). firstly, this article provides an introduction on the intersection between play and scholarship, followed by a discussion on how ‘archaeogaming’ scholarship has been shaping and been shaped by its subject matter over the last years. secondly, the scholarship that arises from digital play is further illustrated with a case study based on the romeincraft project developed by the authors. the latter, made use of minecraft, the popular digital building game, to (re-)construct and discuss roman heritage through collaborative play between archaeologists and members of the public. starting with in-game maps, sites such as forts, settlements, and infrastructural elements were rebuilt based on geological, archaeological, and historical information. these crowdsourced reconstructions, which not only relied on archaeological knowledge but also on a fair dose of creativity, took place in a series of educational public events in – . the case study will detail the results of this project, as well as its methods, thus providing a practical example of digital scholarship which begins with discovery and ends in learning. the paper will conclude by reflecting on how the fun yet unpredictable dynamics of a digital playground not only shape public engagement with the past, but also open up unexpected avenues for more inclusive archaeological scholarship.

digitization practices for translations: lessons learned from the our americas archive partnership project

d-lib magazine, september/october , volume , number /

lorena gauthereau-bryson, robert estep, monica rivero
rice university
point of contact for this article: monica rivero, monica.p.rivero@rice.edu
doi: . /september -rivero

abstract

the "our americas archive partnership" (oaap) is a collaborative effort between scholars, librarians, and information scientists, and provides an integrated approach to discovering, accessing and using scholarly works that exist in multiple digital repositories. the archive is comprised of electronic texts and images originally written in or about the americas from to approximately . its goal is to represent the full range and complexity of a multilingual "americas" and to foster new research examining american literatures and histories from a hemispheric perspective.
this paper discusses the complexities involved in digitizing multilingual historical documents, including practices for creating "born-digital" translations and unique metadata to best describe these rare, primary documents.   introduction the our americas archive partnership (oaap) is a collaboration between rice university, the university of maryland's maryland institute for technology in the humanities (mith), and instituto mora — fondo antiguo biblioteca ernesto de la torre villar (mexico) to develop a common interface for american studies scholars to discover digital resources that support their research and teaching. the work has included the development of a federated search interface to support resource discovery across the partners' sites, digitization of unique archival materials, and efforts to address the complexity of multilingual digital documents. this project was funded by an institute of museum and library services (imls) national leadership grant.[ ] the collections collectively span a period of years and focus on the cultural transformation period that saw the making of modern and colonial cultures in the americas from to the s[ ]. many documents are original government publications such as constitutions, decrees, or presidential and congressional messages, as well as broadsides and pamphlets serving as public statements regarding the political and social events of the time, and firsthand accounts in the form of diaries and journals. in total there are , items in the oaap and non-english texts comprise almost percent of the contents. one of the key challenges for the archive was the digitization of these non-english texts along with a selection of related translations. a total of translations (over , pages) were completed during this project. 
more than librarians, archivists, scholars and information scientists participated in documenting these digital resources using qualified dublin core metadata and text encoding initiative (tei) p mark-up encoding for texts. during the project, novel digitization practices for translations were developed and refined; these approaches are the focus of this paper, along with the "lessons learned" throughout the process.   translations: a scholarly perspective the main objective of oaap is to "facilitate current critical work in inter-americas and hemispheric studies, spurring new possibilities and directions for comparativist approaches to the americas."[ ] the archive incorporates many spanish-language primary documents, not previously available from current on-line platforms, along with their translations into english. the translations seek to enforce the trans-hemispheric, multilingual goal of the oaap itself. though the archive includes full-text versions for all original-language documents, a different approach was taken for translations. the oaap includes both full translations of some diaries and books, as well as excerpts of longer documents. the selection of excerpt translations was made with a view towards expanding the range of accessible documents, so as to provide a wider sample of what is available. in this way the full breadth of the collection could be acknowledged. later submissions may also include an assortment of excerpts, as a way to present content in other languages. translation of historical documents requires much more than knowledge of multiple languages; it also requires research into specialized terminology. furthermore, idiomatic phrases must be translated into modern day equivalents, rather than directly translated, to maintain the author's original meaning. in addition, finer epistemological resonances must be negotiated to maintain the tone of the source document. 
an added value of oaap digital translations is the inclusion of the translator's annotations to explain historic, untranslatable or archaic terms, as well as references to events, locations, persons, etc., that may not be as well known today as they were when the documents were originally written. translations include allusions to foreign canonical or historical figures, such as "the prudent king" (philip ii of spain), or "the monk of yuste" (charles i of spain); explanations of colonial legal terminology and military ranks; and a deciphering of a variety of abbreviations which authors commonly employed to conserve ink and paper[ ]. these specialized notes provide a general context for modern readers and for those unacquainted with a document's related history.   online treatment of "born-digital" translations the online treatment of "born-digital" translations (that is, translations that do not exist in print format, but were created for digital publication) raises both academic and logistical questions in terms of presentation, description and item-level representation. an initial survey of existing online collections that appeared to contain "born-digital" translations was conducted at the beginning of the project[ ]. no universal treatment of "born-digital" translations was found to exist. some collections elected to present translated text, and the original text, side by side. some supplied what appeared to be only translated text, but were vague about where the source document existed, who created the translation or in what language the original text might have been. surprisingly, the most common characteristic found was a lack of information about the translations and methodology used to create them. translations are not surrogates for the original documents; rather they are closely tied to, and best used in conjunction with, the original text for scholarly interpretation and research. 
from a digital archive perspective, however, treating the translation as a separate item allows for better usability as the document can be assigned unique descriptive metadata and discovered independently from the source document (for example by language). this treatment also allows for translation documents to be discovered by browsing through the site's map and timeline tools and assigning user tags. therefore, the oaap treats translations as stand-alone documents in the digital archive. in order to facilitate the needs of users to compare translations to the original text, the archive also includes page images of each original document, as well as a cross reference link to the source document's electronic version and a detailed metadata description about both the translation and the source document.   text encoding all texts in the oaap collections were marked-up using xml (extensible mark-up language) following the tei (text encoding initiative) guidelines[ ] to enhance full-text searches, increase their online usability and provide a mechanism for capturing historical or informational annotations at relevant points within the document itself. the enhancement of search functions occurs behind the scenes, where the oaap system converts xml files into plain text and then processes expanded abbreviations, regularized words (a search for the modern spelling of "texas" would yield documents containing the archaic spelling: "tejas"), annotations, and any data that may have been included within the xml tags themselves. this becomes very important when dealing with special characters, a common feature of historical archives of non-english texts, so that searches containing phrases with and without diacritics produce the same results. for example, the oaap search platform treats "méxico" and "mexico" (with and without an accent mark) as the same word so that the search results produce all occurrences of either word (with and without the e-acute character). 
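the diacritic-insensitive matching described above can be sketched in a few lines of python. this is a minimal illustration, not the oaap search engine itself: the function names `fold_diacritics` and `search` and the three sample documents are hypothetical, and the approach (unicode decomposition followed by removal of combining marks) is one common way to obtain the behaviour the article describes, assumed here for the sake of the example.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Case-fold text and drop combining marks, so 'méxico' becomes 'mexico'."""
    # NFD splits an accented letter into a base letter plus combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def search(query: str, documents: dict[str, str]) -> list[str]:
    """Return ids of documents whose folded text contains the folded query."""
    needle = fold_diacritics(query)
    return [doc_id for doc_id, body in documents.items()
            if needle in fold_diacritics(body)]


# Hypothetical sample texts; the stored text keeps its original markings.
documents = {
    "doc-1": "Viaje a México en enero",
    "doc-2": "Carta desde Mexico",
    "doc-3": "Papeles de Goliad",
}

print(search("méxico", documents))
print(search("mexico", documents))
```

both queries return the same two documents, while the stored text is left untouched for display. one design consequence of this particular sketch is that every combining mark is removed, so "ñ" is also matched by "n"; an index that needs to keep such letters distinct would have to exempt them from the folding step.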
this system design required a conscious decision in the programming of the oaap search engine. without this normalization of diacritics in the search function, it would be very difficult for a researcher to find all relevant content, because historical documents most likely do not contain modern accent markings, or may vary with language standards of the day, or even with a single writer at various points within the same document. due to the inconsistency of such markings, normalization of diacritics is a way to support successful searches. though diacritics are normalized for searching purposes, the actual documents are marked-up faithfully, or as close to the original author's markings as possible, for display purposes, and page images of the original documents are included for visual comparison. another benefit of tei mark-up is that it provides a semantic structuring of the text. these logical divisions can be used to customize the presentation of the online text, including the creation of a hyperlinked table of contents. since the oaap archive includes many lengthy books and legal documents, many of which span over pages, clickable tables of contents make navigation of this type of material less cumbersome. users have the option of jumping to a specific section by clicking on the division titles or numbers, as opposed to scrolling through the entire document. other online usability features include the display of mouse-over comments, hyperlinks to footnotes and endnotes and embedded full metadata records including provenance and other source information for archival materials. text mark-up also allows for insertion of historical or contextual annotations directly within the relevant passages. this capability is especially helpful in many of the spanish manuscripts that contain archaic terminology or historical references. 
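to illustrate how the logical divisions mentioned above can drive a clickable table of contents, the following python sketch walks nested tei divisions and collects their headings. the article reports that oaap used custom xslt stylesheets for rendering; python is used here only for illustration. the sample fragment, the `xml:id` values, the tei namespace uri (assumed to be the tei p5 one) and the function name `build_toc` are all assumptions, and the fragment is not a complete, valid tei document.

```python
import xml.etree.ElementTree as ET

# Assumption: the texts use the TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical fragment with nested divisions, each carrying a heading.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div xml:id="ch1"><head>Chapter I</head><p>...</p>
      <div xml:id="ch1-s1"><head>Section 1</head><p>...</p></div>
    </div>
    <div xml:id="ch2"><head>Chapter II</head><p>...</p></div>
  </body></text>
</TEI>"""


def build_toc(parent, depth=0):
    """Collect (depth, anchor id, title) for each division, in document order."""
    entries = []
    for div in parent.findall("tei:div", TEI_NS):  # direct child divisions only
        head = div.find("tei:head", TEI_NS)
        title = (head.text or "").strip() if head is not None else "[untitled]"
        entries.append((depth, div.get(XML_ID), title))
        entries.extend(build_toc(div, depth + 1))  # recurse into sub-divisions
    return entries


root = ET.fromstring(SAMPLE)
body = root.find("tei:text/tei:body", TEI_NS)
toc = build_toc(body)
for depth, anchor, title in toc:
    # Each entry could be rendered as a link to "#<anchor>" in the online text.
    print("  " * depth + f"{title} -> #{anchor}")
```

because the entries keep their nesting depth, the same list can be rendered as an indented outline or as nested links, letting a reader jump to a division instead of scrolling through a long document.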
while an advanced spanish-speaking scholar may be familiar with such terminology, students or researchers who may not be as experienced with archival materials may have difficulty interpreting them. in fact, many historical terms do not have accurate translations, and literal translations often erase their historical context. one example is the term "congregas," which appears several times in the collection of handwritten documents, "papeles sobre la reducción del seno mexicano y sierra gorda"[ ] ("documents regarding the religious conversion in the seno mexicano and sierra gorda"). a literal translation would be "congregations" or "assemblies." yet, the term "congregas" in colonial new spain referred to a subjugating system in which the spaniards would compel the indians into villages (called "congregas") and then use these indians for slave labor. since a translation would erase the spanish colonial meaning, the translation actually maintains the spanish term (in italics, to signify that it is a foreign word) and includes a definition in the translator's footnote. with access to the context of such historical terms preserved in the original language, the reader can then conduct further research on the subject by searching for the term in its original language.

other forms of archaic terminology include antiquated abbreviations and spellings. another useful application of tei encoding is to provide a reader with expanded versions or regularized spellings of these sorts of terms, such as: "bejar" (bexar county) or "mejico" (mexico).

figure : application of tei encoding used to help decipher archaic abbreviations

the marked-up document presents the original spelling as inline text denoted by a dotted underline. when moused over, the text is highlighted in blue and a display of the expanded text is shown in a yellow pop-up box. in the above example the name "guadalupe" was often abbreviated as "guad.e"; n.s. (nuestra señora) and henº (enero) are other abbreviations with supplied expanded text.

while text mark-up enriches the documents by helping to preserve the text in its original form, converting it for full text searching and providing semantic structure for customizable presentation, the work of transforming page images to tei encoded text is a labor-intensive process. it requires a considerable time investment in terms of training in tei encoding, strong quality control and review measures, supporting technologies such as custom stylesheets (xslt) for transforming xml and system infrastructure to render the output for on-line display, and in the case of oaap, encoders proficient in multiple languages.

metadata

our metadata practices for translation documents were developed through a series of trials and internal discussions amongst catalogers, archivists, and scholars on the project. one challenge that presented itself early on was the need to be vigilant in the pursuit of consistency across a heterogeneous collection of document types. for logistical reasons, digitization of materials occurred in batches according to the type of document: monographs, manuscripts, broadsides, etc. with each new batch it became necessary to revisit solutions from the previous batch in order to ensure consistency in the handling of such details as dating conventions, the appearance order of author/recipient for correspondence and so on. transcription and translation work of the actual contents of the text also raised questions regarding metadata and sometimes required refinements or outright changes to dates, authors or places as certain facts from the documents themselves became better known through research conducted by the oaap project team[ ].
in the final analysis, an in-depth metadata approach was adopted that provided data about both the original document and the translated work and included a joint review by both the translator and cataloger for the assignment of document titles. oaap translation documents inherit the metadata that was created to describe the original document, because both the translation and the original contain the same content and can be assigned the same metadata for subject analysis and name authority. additional metadata was added that directly describes the translation work itself, such as the name of the translator, the language of translation, etc.[ ]

metadata: description
dc.contributor.translator: proper name of the creator of the translated work
dc.date: date of original work
dc.description.translation: provides information about the translation: name of translator, title of the source document, language of the translation and original document
dc.format: translation [globally applied to all items]
dc.language: language of translated document using the international standard is . "codes for the representation of names of languages" (e.g.: "spa")
dc.relation.isversionof: provides link to original document [this "isversionof" qualifier corresponds to a matching "isreferencedby" qualifier in the original]
dc.title: translated title

table : definitions for unique translation related metadata

translations retain the date of the source document because we believe a researcher conducting a date-based search is more interested in the date of the document's actual contents than the date of its translation. the description (dc.description.translation) element provides a human readable reference to the source document, giving a clear explanation of what the translation work contains.

    this document is an english translation of the "carta de angel navarro al jefe ayuntamiento de goliad, , bejar." translated by lorena gauthereau-bryson. the language of the original document is spanish.
figure : example of a description for a translated document

standardization of titles

the creation of translation titles proved to be less clear-cut than was first anticipated. while many of these could be handled with a simple default to common sense and tradition, others required a more flexible choreography to compromise between differing conventions based on language (dates written in the order month-day-year versus day-month-year, for example). preliminary titles for translated works were based on archival finding aid information. in a number of cases, the finding aid used only english language titles regardless of the native language of the original text, so the original title in the native language had to be created as well. the titles provided in the finding aids, while reliable for the most part, did reveal some interesting and unexpected infelicities. a good example of this was found in a spanish manuscript. deciphering the document was in itself extremely laborious, as there were signs of wear (holes, tears, etc.), the handwriting was exceptionally florid and extensive marginalia crowded the principal text from both sides of the page.

figure : cross section of a page from spanish manuscript[ ]

the content, or at least the parts that could be initially understood, seemed to indicate that this was a portion of the testament, or will, of one diego alvarez. however, after additional research during translation work, it was determined that this document, although legal in nature, was an appendix of sorts to the proper will, which was not owned by the rice university library. the manuscripts dealer from whom these materials had been acquired, had failed to note that the "testamento..." was only a portion, and tangential to, the document accurately reflected by the larger title.
therefore the actual translation of the document's content showed that this document was not the full will but rather a statement or "ordinance" related to diego alvarez' plan to make provisions for orphaned girls. the title was changed to reflect this:

initial finding aid title: testamento de diego alvarez
revised native language title: siguense las ordenancas de las donzellas huerfanas
final translated title: the following are ordinances regarding the orphan maidens

an example of how transcription work by researchers on the project provided information impacting native title formulation is an item from the "charlotte & maximilian" manuscript collection.[ ] initially thought to be a single large manuscript, it turned out to be a grouped document consisting of four separate texts written by the empress carlota when she was an adolescent. these consist of a brief fictional piece concerning refugees from the french revolution who find themselves at the summer retreat of the british royal family, a speech written in the first person persona of pope urban ii, calling on the christian princes of europe to go on crusade in the holy land, and two sermons to be delivered on catholic feast days. because of the differing interpretation as to what these texts represented (the creative endeavors of a devout and royalist teenager or simply homework?) a covering title was devised to include both such possibilities, namely "recueil de petits exercices et d'œuvres créatives, sous la responsabilité de carlota, l'impératrice du mexique" (grouping of brief academic exercises and creative works in the handwriting of empress carlota of mexico).

figure : selection from manuscript "discours d'urbain ii au concile de clermont" written by the empress carlota, dated

another challenging aspect in metadata work for manuscripts and other unpublished printed texts was the construction of supplied titles.
as a general rule, a supplied title is based on an initial quick reading of the item to be cataloged. typically these sorts of works require qualifying segments of dates, places or people to help distinguish individual items, such as individual letters in a folder of similar letters. the convention we ultimately chose for assigning titles to correspondence was as follows: letter from author to recipient/date/place of composition — with both date and place necessarily optional, depending on their actual appearance in the text. examples: carta de j.g. de los santos a c. antonio vasquez, de mayo , goliad lettre de charlotte, l'imperatrice du mexique, à son oncle, le duc de nemours, le fevrier , lacken in a number of cases, the relevant dates and authors were not immediately apparent without a closer reading of the item's contents. this seemed to occur with some frequency in the notoriously difficult-to-read handwritten missives of mexican civil servants and military men from the period immediately prior to the war of independence in texas. in other instances, a close reading revealed an additional author not previously noted. in some of the spanish-language government documents the most prominent date on the piece might not in fact be the actual date of issuance or composition, and close reading was again required in order to sort this out. therefore, working closely with the translator of a text was a critical step in the formation of the final translation titles. after all translations were completed, a detailed review of translated titles was jointly conducted by the project translator, lorena gauthereau-bryson (americas studies researcher) and language specialist cataloger, robert estep, to ensure consistency with title formations across the collection. this standardization of titles included examining such things as: consistency in the way dates and places are placed or formed within titles. 
dates within spanish language titles have different capitalization, punctuation and order than english language documents. example: Órden n. , guatemala, de agosto order no. , guatemala, august , consistencies in the way various title segments are displayed across similar document types (such as broadsides and letters). example: initial title: el presidente de la republica a los centro-americanos revised native language title: hoja suelta-decreto, guatemala, guatemala, de agosto translated title: broadside-decree, guatemala, guatemala, august , consistency in the level of data used to create titles between the original title and translated title, for example, including subtitles in the translated title if subtitles existed for the original work. methodology for creation of excerpt titles. the general rule is to supply a title for the excerpt, usually a chapter or section header plus the title of the larger work, placed within square brackets. example: savage america, chapter i [excerpt from: the moral history of women] the oaap collection is comprised of approximately percent non-english texts, mostly spanish with some latin, french, and portuguese. one of the key findings during this project was the importance of language skills in preparing metadata for multilingual collections. having a working knowledge of the native language of the texts allows catalogers to better analyze contents to assign relevant subject terms and prepare descriptive titles.   future of oaap as a multilingual site from the inception of the project, there has been a genuine desire to make the site truly hemispheric, reflecting the concept of the americas as a community of nations with a shared, intertwined, and often conflict-ridden history. 
since the documentary wealth of the americas collection is multilingual (spanish, english, portuguese, and french, etc.), the in-house translation of selected documents is designed to expand the collection both in terms of quantity and accessibility, ideally forming a collection of originals alongside full translations in every language represented. one method suggested to achieve this goal is a formation of a community of students, scholars and researchers who could contribute full or partial translations to the archive. this long term effort could be part of a larger "journal" approach to the web site, where prominent scholars are invited to write editorials or host exhibits that foster a community focused on the field of hemispheric studies. another strategy for building oaap to a full multilingual site is the addition of a multilingual user interface. anecdotal feedback received to date has suggested that providing search mechanisms in a user's native language would be very beneficial, particularly so for researchers seeking non-english content. an even further extension of this ideal would be to take a step back and regard not merely the documents themselves or the user interface as targets for this treatment, but all aspects of communicative information related to the documents, such as translating the metadata used to describe the varying types of documents in the archive. the translation of metadata may be as complex an undertaking as the translation of textual contents. for example, one of the advantages of library of congress (lc) subject headings has been their formulaic and yet flexible consistency, their overarching and sweeping categories combined with the richness of individual case-specific detail, each of these components reflective of disciplinary and historical certainty in their accuracy and validity. how then would one go about translating lc subject headings into spanish, french, or portuguese? 
for both spanish and french the existence of databases of subject heading equivalents for lc subject headings has been a matter of library practice for some time, and with probably the same degree of universal usage and local tinkering as the lc subject headings in the english-speaking world. the spanish version is bilindex, and the french versions are rameau (répertoire d'autorité-matière encyclopédique et alphabétique unifié) for france, and répertoire de vedettes-matière for canada. portuguese subject headings are another matter, and although there are a variety of resources in both brazil and portugal, these tend to be discretely focused by discipline (medicine, law, sports, music, environment, etc.). however, a quick canter through any non-english language cataloging chat-room turns up a fair amount of wariness regarding adhering too closely to the perceived anglo-centrism of the lc subject headings. this wariness seems justified, as there appears to be a certain built-in bias in the creation of lc subject heading terms over time. lc subject headings, in regard to certain subjects (particularly history), are luxurious when treating english-speaking north america and europe, and comparatively thin when treating latin america or africa. this relative paucity with regard to the southern portion of the continent was especially noticeable while cataloging materials for the oaap. a few examples: the american civil war receives more than valid subject headings, one of which, "campaigns", has a further breakdown of individual battles; the american revolution receives subject headings, with a breakdown of battles under "campaigns", and the war of receives subject headings for individual battles. in contrast, the -year war of chilean independence has a list of three battles and the -year mexican revolution is provided with a list of eight battles.
the overall chronological periodization of us history up to lists broad time spans, in contrast to for mexico, eight for chile, and a surprising three for guatemala, which are, specifically, "history—to ", "history— - " and "revolution, ". thus, providing appropriate and accurate subject analyses and headings for the large numbers of items related to guatemala's history in the early part of the th century required considerable creativity in formulating valid entries which reflected more detail than the bare bones of the historical possibilities as defined by lc. this raises questions regarding the value of perpetuating this bias by creating multilingual versions of existing lc subject headings. perhaps efforts would be better spent exploring the use of alternative subject vocabularies produced in native languages either supplied directly by scholars or some other national lexicon.   conclusion the translations in the our americas archive partnership provide greater access to the contents of primary source materials of non-english languages. they support new research and emerging scholarship into inter-americas and hemispheric studies. these resources also represent a corpus of materials that explore the digital treatment of "born-digital" translations including historical annotations, the semantic encoding of texts and methods for describing such materials in meaningful ways. these works and others can be explored at the oaap beta site http://oaap.rice.edu/, and comments are welcomed.   notes [ ] imls award lg- - - - . [ ] read more online about these collections at http://oaap.rice.edu/collections.php. [ ]"imls grant abstract" found at http://oaap.rice.edu/about.php?page=documentation. [ ] examples of historical abbreviations may be seen in the educational module, "abreviaturas históricas" (historical spanish abbreviations) http://cnx.org/content/m /latest/. 
[ ] for more information on the survey of existing online collections that appeared to contain "born-digital" translations, including a complete list of search sites and methodology, please see http://tinyurl.com/ qqhd . [ ] http://www.tei-c.org/guidelines/. [ ] gorraez, josé de and escandón, josé de. guevara's report to his excellency, the viceroy, regarding the seno mexicano missions made in the year [excerpt from: documents regarding the religious conversion in the seno mexicano and sierra gorda]. translated by gauthereau-bryson, lorena. manuscripts. . from woodson research center, rice university, britton collection of early texas and u.s. civil war documents, - , ms . http://hdl.handle.net/ / . [ ] list of oaap project team members can be found at http://oaap.rice.edu/about.php?page=project#people. [ ] a comprehensive list of translation metadata elements and input guidelines is available online [see "application profile - translation metadata" found at http://oaap.rice.edu/about.php?page=documentation]. [ ] alvarez, diego and cabello, francizco. the following are ordinances regarding the orphan maidens. translated by gauthereau-bryson, lorena. legal documents. . from woodson research center, rice university, britton collection of early texas and u.s. civil war documents, - , ms . http://hdl.handle.net/ / . [ ] this archival collection (covering the period - ) contains original letters from charlotte of belgium, chiefly as carlota, empress of mexico, as well as photographs, engravings, and drawings of charlotte, of mexico city, and of members of the mexican military and other published materials regarding maximilian's reign in mexico. only a small selection of carlota's letters was digitized for this project. for more information on the complete collection, please see the archive finding aid at http://library.rice.edu/collections/wrc/finding-aids/manuscripts/ /.
about the authors

lorena gauthereau-bryson is the americas studies researcher for the 'our americas archive partnership', where her main focus is the translation and research of archival documents. she is fluent in english and spanish and holds an ma in hispanic studies from rice university and bas in english and political science from rice university. her scholarly interests include us-mexico border studies, mexican-american literature, and hemispheric studies.

robert estep is a senior copy cataloger at fondren library, rice university, with over years experience. he received a ba in english (minor in french) from the university of texas at austin. robert is fluent in multiple languages. besides providing subject-area metadata for the our americas archive partnership he assisted in the proofreading and transcription of some spanish-to-english and french-to-english translations, and translated some items from portuguese.

monica rivero is the digital curation coordinator for the center for digital scholarship at rice university. she worked as the project manager for the 'our americas archive partnership'. she holds an mlis from university of north texas graduate school of library and information sciences and a ba in business management from sam houston state university. monica has over years experience in project management in the private sector.

copyright © lorena gauthereau-bryson, robert estep and monica rivero

a reflection on a data curation journey

lucia lötter, christa van zyl. published in the journal of empirical research on human research ethics (medicine).

this commentary is a reflection on experience of data preservation and sharing (i.e., data curation) practices developed in a south african research organization. the lessons learned from this journey have echoes in the findings and recommendations emerging from the present study in low and middle-income countries (lmic) and may usefully contribute to more general reflection on the management of change in data practice.

heard on the net: developing the balance of discovery and respect with primary resources

jill emery, portland state university; tara robertson; peggy glahn. portland state university, from the selectedworks of jill emery, fall october ,

this work is licensed under a creative commons cc_by international license.
available at: https://works.bepress.com/jill_emery/ /

the charleston advisor / october www.charlestonco.com

within libraryland social media this past spring and summer, an emerging story began to unfold. a relatively new upstart company, reveal digital, has begun developing digital archives of primary resources which are funded by institutions pledging upfront support. the eventual result of this work will be collections made available as open access content to everyone. the majority of the content is being sourced from research libraries’ archival collections. those pledging money get early access to the content as it is being digitized and made available. in addition, source libraries obtain digital copies that they can dark archive. pledging libraries also gain marc records and counter compliant usage statistics. reveal digital makes no claims on copyright to the material. after a designated period, the content will be made available as open access resources. all-in-all, this is an exciting new model in regards to primary resources that often cost librarians tens of thousands of dollars with sometimes hefty on-going access fees levied.

however, there were questions being raised by librarians regarding how rights holders may have been contacted or given input into making some of the content available. in particular, the independent voices project is a collection that has come under close scrutiny, most specifically, the inclusion of the entire run of on our backs within this collection.

on our backs (oob) was an alternative press publication that focused primarily on lesbian erotica/pornography from - . as the first all-female-produced content of this type in the united states, it bears historical significance.
the publication was produced in san francisco, california for an independent press and developed a fairly large circulation/distribution model in north america and eventually australia. however, there have been concerns about how the rights holders’ permissions have been sought in regards to fully digitizing this publication given the nature of its content. [author’s note: the oob content has been removed at this point from the independent voices collection due to concerns regarding depiction of pornography in some states.]

one of the librarians who has expressed concerns is tara robertson, accessibility librarian, in vancouver, british columbia. tara has a personal blog in which she thoughtfully and eloquently writes about universal design, open-source software, intellectual freedom, and feminism. her initial post on march , , noted:

“most of the oob run was published before the internet existed. consenting to appear in a limited run print publication is very different than consenting to have one’s sexualized image be freely available on the internet. these two things are completely different. who in the early s could imagine what the internet would look like in ?”

tara’s post was noted by other librarians involved in digital scholarship and used as a reference for the need to develop an ethical framework when producing digital content. on august , , tara robertson and jenna freedman hosted a critlib discussion on digitization ethics (the curated collections of posts to that discussion can be found here: ). furthermore, tara was an invited speaker at code lib nys during the first week of august, which resulted in her sharing her slides as well as further thoughts on this digitization effort. in particular, when performing research at cornell university libraries, tara discovered some of the copyright permission forms indicating that editors’ image releases were for the print publication and one-time use only.
as more and more academic librarians consider digitization projects, especially projects that utilize primary source material that was not mass or commercially produced, considering and determining the ethical standards to be instituted is paramount. as a way of helping librarians frame these discussions at their own institutions, peggy glahn, the program director at reveal digital, and tara robertson agreed to answer the following questions posed to them. it is hoped that by reading through their responses and thinking through these issues and concerns, we can all become more conversant in the digital ethical concerns in regards to making our primary resources discoverable in the twenty-first century.

how do you define or present the concept of “the alternative or independent press”?

[pg]: independent voices is a collection of alternative press titles published in the s, s, and s. it is challenging to definitively define the alternative press in a way all can agree upon; it isn’t in the nature of the alternative press! we developed the project’s target title list in collaboration with librarians who collect in this area and others who were involved in the various social movements of the era. our criteria for selection include periodicals that began publishing in the s, s, or s; titles that were intended for public consumption and distribution; titles for which we are able to obtain copyright permission or are otherwise in the public domain. zines are outside the scope of this project.

[tr]: i’d define alternative or independent press as publications with a smaller print run than a mainstream press. alternative or independent presses often represent more diverse opinions than mainstream publishing.

advisor reports from the field: heard on the net. developing the balance of discovery and respect with primary resources. doi: . /chara. . .
by jill emery (collection development librarian, portland state university), peggy glahn (program director, reveal digital), tara robertson (accessibility & systems librarian)

most zines and alternative/independent presses are compilations of works of multiple contributors. in many instances, the main editors/authors did not seek permissions and/or releases from the producers providing submissions to their publications. given this situation, to what extent should copyright permissions be sought?

[pg]: the independent voices project contains no publications we would classify as a zine, so my response is only related to publications from independent presses. we are creating independent voices under the precedent set by the greenberg vs. national geographic society ruling related to u.s.c section (c) of the copyright act. this ruling came after the tasini case. the ruling stipulates that publishers do not have to secure additional rights from contributors to digitize and display their previously published works, regardless of the media selected for redistribution, as long as the original printed context is maintained. this position is supported by ala, arl, aall, and mla through their joint amicus brief in support of the national geographic society.

[tr]: it can be complex figuring out who has copyright, but it’s necessary. zine librarians’ code of ethics has good information on this subject. while it may be possible to argue fair use, they recommend respecting the wishes of content creators. i like that they also explicitly acknowledge community: “in the name of community respect, we advise getting explicit permission whenever possible.” this makes good legal and ethical sense.
in tara’s criticism of reveal digital, she cites the people involved in the creation of the mukurtu project as a positive example of how to engage with various communities when creating digital collections. given the scope and breadth of the independent voices project, how could that same feedback be scoped to result in a worthwhile engagement with the communities involved?

[pg]: the mukurtu project is a thoughtful, innovative approach to opening up access to cultural material in a positive and affirming way. i love what they are doing. i hope there will be opportunities in the future to incorporate concepts and features mukurtu has developed into our platform when we develop new projects.

for now we are limited in our ability to add new platform functionality to independent voices. the project is being funded by contributions from libraries. as of today, we are $ , away from reaching the $ . m cost recovery goal. we are very close, but only have until december , to reach the goal. it won’t be an easy task. every dollar that is contributed now is going toward digitizing the content originally scoped for the project. there are no dollars available to add new platform functionality, unless we vastly exceed the goal.

as we transition our investment fund approach to open access in , we will be building an editorial board composed of librarians and faculty from funding libraries. the editorial board will provide the mechanism for funding libraries to have a voice and a vote in how their dollars are spent. should future projects approved by the editorial board include unpublished cultural material, we will be looking to projects like mukurtu for best practices in engaging cultural communities in the digitization of their material.

[tr]: it’s important to talk to the editors and the original content creators: writers, photographers, and their models, and find out what their wishes are with this magazine.
most of the models wouldn’t have had copyright of the images that they appeared in, yet it’s their bodies on the page. in this situation, the models are the ones with the least amount of rights and who can be harmed the most.

several people who modeled shared their thoughts and feelings with me. one said, “when i heard all the issues of the magazine are being digitized, my heart sank. i meant this work to be for my community and now i am being objectified in a way that i have no control over. people can cut up my body and make it a collage. my professional and public life can be high jacked. these are uses i never intended and i still don’t want.” another said, “it’s one thing to have regrets over what you’ve published, but i actually never consented to have this photoshoot published by on our backs in the first place, let alone digitally.”

consultation should also include researchers, academics, librarians, and archivists. members of the queer community also have a stake in this, though defining who is representative of the queer community could be difficult. i think it’s possible to design a way for people to engage online or through social media to allow for broad consultation without it costing too much.

the new zealand text collection consulted various communities in deciding if and how to digitize moko, or maori tattooing. their report is online and would be a good place to start when planning community consultation on culturally sensitive materials.

what are the best practices or proper steps to be taken in obtaining permission for creating digital archives of content produced by third parties?

[pg]: the answer depends on the kind of content that will be included in a digital archive. different content types have different rules regarding their use. for a collection like independent voices, which is composed of published periodicals, our process for obtaining rights was and continues to be properly conducted.
the first step is identifying the legitimate rights holder for a publication. that isn’t always easy, particularly with alternative press content. it can take a good deal of detective work to find the right person or people and their contact information. if we are not able to identify or reach the legitimate rights holder and obtain their written permission, we do not include the title in the collection.

going beyond obtaining permission from rights holders to obtaining permission from individual contributors is not required, nor is it desirable, nor is it economically feasible. it is not required due to the greenberg ruling described above. it is not desirable because of the gaping holes the process would inevitably lead to in the historic record for every publication included in a collection. it is not economically feasible because of the exponentially higher number of person hours it would require to identify and contact rights holders and negotiate permissions.

[tr]: [author’s note: tara felt she wasn’t sure what to say for question # as she doesn’t work in that area and copyright in canada is different from the u.s.]

in , there was the case the new york times, co. versus tasini which outlined some fundamental problems between the digitization of works where permissions had not been fully granted to a third party. more recently, there have been the takedown concerns involving the getty research institute images. how can librarians, as a profession, be better informed on these types of copyright permission concerns with our locally created digital collections?

[pg]: the tasini case is one that is often raised in the context of digital periodical collections. it is very well-known among librarians. the greenberg ruling is equally important as it directly builds on the tasini ruling, but seems to be less frequently raised in discussions.
without greenberg, most of the archival periodical collections available today would never have been created.

copyright permission concerns will always be a tricky area when it comes to creating digital collections. unless you are a copyright librarian or lawyer, most librarians are not going to have time to keep up with the case law. i have seen good, informative discussions of copyright issues on listservs like scholcomm or blogs like the scholarly kitchen. these venues can offer food for thought, but ultimately, consulting with a copyright lawyer will enable librarians to understand all the issues as they pertain to their particular collection and enable them to make the right decisions for their institution.

[tr]: as with most areas of librarianship, you end up developing a depth of knowledge in the area that you are working in. the copyright librarians and the librarians who work in digital collections are particularly knowledgeable in this area as it’s part of their everyday work.

i think as a profession we need to have a broader conversation about the ethics of digitization. even if we’ve got the copyright clearances to digitize, there are cases where it’s inappropriate. central to librarianship is a concern about increasing access to information. we also need to talk about where it’s not appropriate for us to be providing access. many librarians bring up intellectual and academic freedom, which, while important, misses the point that we have a responsibility to honour and respect other cultural protocols around information sharing. at the ipinch cultural commodification, indigenous peoples & self-determination public symposium, kim christen withey said,

many indigenous knowledge systems rely on protocols. many of the protocols have to do with not seeing, which very much is the antithesis of the western “seeing is believing.” you have to see it to know it. and these systems are saying you don’t get to see it or know it—deal with it.
we need to learn more about this in libraries.

should textual works and two dimensional works be treated differently than photography or other visual and audio media?

[pg]: i think the answer to this question depends on the context of the project. a collection of photographs from an archive is treated differently under copyright law than photographs that are part of a published work. likewise, audio media is covered by a raft of copyright requirements. anyone creating digital collections must first adhere to the applicable copyright in addition to any ethical or privacy concerns that the content itself may raise.

[tr]: the content is more important than the format. textual works can be published newspapers, love letters, or journal entries. that said, in our society sexually explicit images and video are generally more controversial than sexually explicit text.

in your opinion, should any library or library consortium offering digital collections provide a mechanism for authors or members of their community to request redaction of digital content?

[pg]: sure. the mechanism can be as simple and inexpensive as providing an e-mail address for redaction requests. the trickier question for librarians will be how to respond to those requests. librarians have traditionally stood firm against censorship of any kind. redaction is a form of censorship. however, there may be circumstances where redaction is the right thing to do for legal or privacy reasons. the choice to redact or not to redact is not always obvious or easy.

[tr]: i used to think a clear public takedown policy and contact was important, and maybe it still is; however, i think it’s more important to know who you could have a conversation with if you had concerns. it’s difficult when you’re outside an organization or university to know how things are structured and who you can contact.
by making this clear, you’re making it possible for people in the community to start a dialogue with you. libraries need to consult with communities before putting sensitive collections online in the first place.

reveal digital’s independent voices began digitization in and the zine librarian’s code of ethics came out in october . for peggy gahn to answer: what is the review cycle in place for updating the procedures and processes of the copyright permissions sought? for tara, what would be the best practice for reviewing and updating the process and procedures for seeking copyright permissions?

[pg]: independent voices is a collection of alternative press periodicals that were originally created for as wide a public distribution as possible. while the publications in independent voices share some characteristics with zines (often published by marginalized populations, sometimes short runs), they are not considered to be zines. our copyright permissions process adheres to the principles upheld by greenberg. we have no plans to change our copyright permissions process for the remainder of titles targeted for inclusion in independent voices.

one of reveal digital’s future projects will focus on zines. we will be working in close partnership with librarians who are currently following the zine librarian’s code of ethics. we intend to be in full compliance with this document when we do work with zine content.

[tr]: i know i’m sounding like a broken record, but i think for culturally sensitive collections we need to go beyond just looking at copyright.

since scholarship and research are now sourced in many cases from social media and from mechanisms that once were considered realms of smaller communities, how can librarians engage their academic communities with the concept of “the right to be forgotten”?
[pg]: i think everyone from my era and older (genx and baby boomers) feels very fortunate that we and our friends did not have social media sources on which we discussed and posted pictures documenting our youthful exploits! everyone has done, said, or written something at some point in their life they regret. before social media, those things were largely undiscoverable and therefore forgotten. it is a very different picture today. social media itself may be the answer to how librarians engage with academic communities around the concept of the right to be forgotten. tara robertson through her blog and other social media outreach has done much to advance the discussion within scholarly circles.

[tr]: for people who modelled in on our backs, i understand why some of these people would not want their photo shoots online. our society isn’t terribly sex positive. the models who i talked to did not consent to have images of their bodies online and some were also worried that this could hurt their careers or lives.

we often think about these things as a balancing act, where there’s a need to balance the freedoms of researchers and queers who want access to that history with the freedoms of people whose lives could be hurt by this access. i don’t actually think it’s a balancing act. i think that in this case we need to prioritize the voices of people who could be hurt by this content being freely available on the internet.

i can see how, in other cases, people such as those who committed crimes against humanity or other atrocities shouldn’t be allowed to erase them from history. perhaps we should be able to erase embarrassing things we’ve done? perhaps not? i’m not really sure where to draw the line. docnow is “a tool and community developed around supporting the ethical collection, use, and preservation of social media content” ().
i’m enjoying listening to the thoughtful conversations that are exploring these questions.

are there any last thoughts or considerations to be shared?

[pg]: reveal digital’s inclusion of on our backs (oob) in independent voices has generated a lot of good discussion about ethical issues related to digitization.

as we said in our public statement on this issue, it was only after careful consideration and consultation with librarians and scholars that we arrived at the decision to include oob in independent voices. it is considered by many in the academic community to be an essential artifact of the “feminist sex wars,” when feminists split into factions based on their attitudes towards women’s sexual expression.

the reaction from librarians to our removal of oob from independent voices has been universally positive. many have expressed their hope that we can find a way to bring the material back into the collection. we have also heard from a number of scholars who were actively using oob in their research. they have expressed disappointment but understanding about our decision as well.

heather findlay, who was the final publisher of oob and is the current rights-holder, was sad to see the material removed. as ms. findlay reflects on her experience with oob she characterizes the models and contributors as incredibly brave women who participated in the publication as a political statement and an act of power and rage. they wanted the material to be seen by as many people as possible. in ms. findlay’s experience, oob’s contributors were delighted about the digitization of oob and its inclusion in the independent voices project.

there are many different voices to be heard in this debate. we will continue to listen and look for a path forward that is sensitive to all.
[tr]: although i’ve publicly critiqued reveal digital, figuring out best practices for the ethical digitization of independent media from just before the internet existed is not easy or simple. reveal digital is in a great position to figure out what best practices look like in terms of community consultation, ethics beyond just looking at copyright and digital access. they are intelligent folks who have figured out a great business model, so i’m hopeful that they’ll also be able to figure this out.

portland state university, from the selectedworks of jill emery, fall october , heard on the net: developing the balance of discovery and respect with primary resources

ucla previously published works
title: engagement of academic libraries and information science schools in creating curriculum for sustainability: an exploratory study
permalink: https://escholarship.org/uc/item/ x x sw
journal: journal of academic librarianship, ( )
authors: jankowska, maria a.; smith, bonnie j.; buehler, marianne a.
peer reviewed

engagement of academic libraries and information science schools in creating curriculum for sustainability: an exploratory study

maria a. jankowska, ucla charles e. young research library, majankowska@library.ucla.edu, a charles e. young drive po box los angeles, ca
bonnie j. smith, university of florida, george a. smathers libraries, bonniesmith@ufl.edu, library west po box gainesville, fl
marianne a. buehler, university of nevada, las vegas libraries, marianne.buehler@unlv.edu s.
maryland pkwy box las vegas, nv

abstract

in , the association for the advancement of sustainability in higher education released, “sustainability curriculum in higher education: a call to action,” encouraging infusion of sustainability topics into universities’ teaching and research. since then, academic programs and research related to social, economic, and environmental sustainability have enriched university curricula. an exploratory study was conducted to determine the position and engagements of academic libraries and information science schools in their contributions to scholarly sustainability activities and curricular initiatives. this article presents the results of the study, which reveals a number of engagements by library professionals in the areas of sustainability, such as increasing open access to research, building sustainability-related collections and research guides, and incorporating sustainability content into information literacy. while academic libraries and information science schools are engaged in a broad spectrum of initiatives that support their institutions’ sustainability research and curricular functions, this study indicates that such activities require a more targeted approach.

keywords: library sustainability, sustainability activities in academic libraries, marketing sustainability, lis sustainability, lis curriculum for sustainability, academic library engagement

introduction

in , over twenty-two international university presidents signed the talloires declaration in france, creating the association of university leaders for a sustainable future (ulsf).
with their signatures, they committed their institutions to a ten-point action plan for incorporating sustainability and environmental literacy into research, teaching, outreach, and operations. more than university presidents and chancellors at institutions in over countries across five continents have now signed this declaration that states: “universities bear profound responsibility to increase the awareness, knowledge, technologies, and tools to create an environmentally sustainable future” (talloires declaration, ). with the increasing recognition of higher education’s critical role in creating a sustainable future, the association for the advancement of sustainability in higher education (aashe) was formed in , as a professional association to coordinate and strengthen campus sustainability. later that year, the american college and university presidents’ climate commitment (acupcc) was inaugurated and currently has institutional signatories. subsequently, university presidents and chancellors committed their institutions to incorporate environmental sustainability into their campus operations, services, research, teaching, and outreach (about aashe, ). in , aashe initiated a self-reporting tool to assess and compare progress in campus sustainability efforts. the sustainability tracking, assessment & rating system™ (stars) tool uses a credit system to report, rate, measure, and compare sustainability performance in higher education (stars, a program of aashe, ). these national and international efforts indicate that academic institutions are addressing the sustainability challenge in their operations, research, teaching, and services. to meet their commitments and goals, campus leaders are looking for input and leadership from all campus community stakeholders. it is therefore, incumbent on libraries and library information science (lis) schools to respond to this challenge and be active partners in designing and supporting a sustainability curriculum. 
this is especially important since libraries support all disciplines and are in a unique position to serve as hubs for sustainability collaboration, dialog, and innovation. an exploratory study was conducted to create a better understanding of academic libraries’ and lis schools’ roles in educating for sustainability. for this purpose, a survey was conducted to measure their current levels of engagement. this article reports on the study’s results and presents insights and opportunities for libraries and lis schools to engage in the sustainability challenge on their campuses.

literature review: libraries creating curriculum for sustainability

aashe defines sustainability as “encompassing human and ecological health, social justice, secure livelihoods, and a better world for all generations” (about aashe, ). in august , aashe challenged institutions of higher education by releasing a document, “sustainability curriculum in higher education: a call to action” ( ), that encouraged the infusion of sustainability topics into their curricula and research. currently, campus sustainability initiatives and academic programs are created and discussed by academic affairs’ representatives, capital programs, general services, housing, student affairs, and university communications, as well as graduate and undergraduate students and faculty with expertise in responsible business practices, social justice, ecology, energy, public health, and the environment. how are academic libraries embracing campus-wide efforts to incorporate sustainability concepts into scholarship, teaching, and service?
a search for articles, studies, and research on infusing sustainability across the curriculum indicated that the topic is generally well represented in literature related to higher education, but very few results were discovered associated with library practices and lis school participation in this emerging focus. nevertheless, library literature related to greening libraries, environmental concerns, and sustainability concepts in academic libraries, as presented by jankowska and marcum, has been expanding since the s (jankowska & marcum, ). the recent movement towards greening library buildings, collections, services, operations, and outreach has been well covered in a book titled “greening libraries,” edited by monika antonelli and mark mccullough ( ). in , terry link’s article, “transforming higher education through sustainability and environmental education,” turned librarians’ attention to the importance of using sustainability topics in information literacy. he encouraged libraries to “bring voices to the conversations by building of collections that challenge narrow disciplinary answers to the issues before us” (link, ). the same year, the american library association’s (ala) president, sarah long, focused her term on the theme “libraries build communities.” one of her initiatives was to support a special pre-conference workshop held at the july ala conference in chicago with the goal to teach librarians community-building skills to promote sustainable development in their localities (long, ). since that time, few articles have been published on successful sustainability workshops (jankowska, ) and recommendations of sustainability resources worth adding to library collections (desilva, ; applin, ). however, the attention was primarily focused on greening collections (connell, ).
recently, megan stark ( ) stated, “a cornerstone of academic librarianship, information literacy should be included in discussions about sustainability and academic libraries.” in her article, she reported on the successful adaptation of the association of college & research libraries (acrl) information literacy competency standards encouraging students to reflect on cultural, historical, ecological, and economic elements of sustainability at the university of montana mansfield library. in , madeleine charney (university of massachusetts—amherst) conducted a survey and interviewed librarians who were interested in academic library sustainability topics. she presented the preliminary results of this survey at the aashe annual conference in pittsburgh. her presentation “getting closer: the librarian, the curriculum and the office of sustainability” ( ), emphasized that “academic librarians play a vital role in supporting sustainability across the curriculum. as seasoned consolidators and distributors of information, librarians also bring a unique voice to sustainability councils and committees.” later, charney published, “a sustainability librarian’s manifesto: your “take action” checklist,” in which she presented practical ideas to move sustainability efforts forward. in this manifesto, she stated, “we must each do our part to prevent “the library” from becoming an afterthought in the sustainability movement. step up, connect to the key players, ask questions. be bold!”( ). from february to august beth filar williams, madeleine charney, and bonnie smith organized a four-part webinar series titled, "libraries for sustainability” ( ). the webinars connected librarians who were interested in sustainability and provided an opportunity for discussions about best practices, resource sharing, and future sustainability actions in libraries. 
the result of this networking series was the emergence of individuals committed to working within ala to continue this important dialog in a more formal venue. concurrently, a book chapter, “libraries as sustainability advocates, educators, and entrepreneurs,” by beth filar williams, anne less, and sarah dorsey ( b) presented interviews of librarians who implemented sustainability concepts into librarianship, although not necessarily in information literacy. the sustainability librarians linkedin group stemmed from the book chapter essays and has become a forum for “librarians who are entrepreneurs, leaders, facilitators, and advocates in the sustainability movement” (http://www.linkedin.com/groups/sustainability-librarians- ). increasingly, library literature has used the term sustainability in the context of lis education (howard, ), collection management (chadwell, ), institutional repositories (buehler, ), digital resources (maron & loy, ), scholarly communication (schroeder, ), preservation, and digital formats (hoebelheinrich, ). in this study, the authors investigate sustainability through activities promoting equitable access to information that maximizes a return on investment for scholarly research, communication, and lifelong learning. they considered all academic library activities and lis school initiatives in support of promoting equal access to information, open research and scholarship, and teaching across the curriculum in institutions of higher education.
these activities and initiatives focus on supporting open access to information, research, and scholarly communication, promoting institutional repositories to preserve digital content, educating faculty about retaining their author rights, building sustainability-related collections and research guides, and incorporating sustainability content into information literacy, diversity, and teaching as well as collaborating on sustainability projects, and seeking funding for sustainability efforts.

purpose of the study and research questions

as indicated in the literature review, more research needs to be conducted to fully understand and document academic library and lis school contributions to their academy’s sustainability curricular initiatives and activities. the main goal of the study was to investigate the engagement level of academic libraries and lis schools in campus sustainability teaching and curricular activities. for academic libraries, the authors explored the following five research questions:

1. what is the level of academic library engagement in campus sustainability teaching and curricular activities?
2. is there a relationship between the level of engagement of academic libraries in the emerging focus of teaching sustainability across the curriculum and the carnegie classification (cc) taxonomy of higher education institutions in the us?
3. what kind of sustainability activities are academic libraries involved in?
4. in what ways do academic libraries market sustainability resources to users?
5. in what areas do libraries collaborate on sustainability-related content with other units on campus?

recently, aashe’s stars steering committee discussed the need to connect stars ratings and cc (stars steering committee meeting, & ). this discussion relates directly to the authors’ second research question where they investigated the association between the cc of institutions and their level of engagement in activities that support sustainability curricula.
while academic libraries were the main focus of this study, the authors were also interested in exploring the practices of lis schools. they wanted to understand whether and in what ways lis curricula have evolved to prepare students for their future academic library workforce demands in relationship to existing courses that encompass a context of sustainability. the authors proposed one research question:

6. what is the main focus of sustainability stressed by lis schools in their curriculum and practice?

a research survey design with an online questionnaire was used to elicit responses from academic librarians, administrators, and lis educators and students to explore answers to the above six research questions.

research design and methodology

an online survey was used as the primary research method, as well as library, lis program, and university homepage searches and a literature review. to efficiently collect data describing the research phenomenon, the questionnaire was designed to collect qualitative information for the purpose of further understanding academic libraries’ and lis schools’ sustainability initiatives and activities in north america (see appendix ). the qualtrics online survey software was used to create the survey with both open-ended and closed questions, allowing respondents to make comments. the survey questionnaire consisted of questions. this survey used branching logic which allowed targeting two different groups by jumping a block of questions if it did not pertain to the specific group (lis or academic library). three of these questions were general questions to establish the affiliation of the respondent, either with an academic library, lis program, or both.
twenty-four questions were addressed only to respondents from academic libraries and questions addressed only to lis school participants. eighteen questions focused on sustainability initiatives and activities in academic libraries, including instruction, research, scholarly communication, resources, collaboration, and outreach. six questions addressed the university’s commitment to sustainability as expressed by the establishment of an office of sustainability, committees, policies, and sustainability workshops. seven questions pertained specifically to lis schools and their sustainability initiatives. prior to distribution, the questionnaire was reviewed by a number of colleagues and approved by the internal review board (irb) offices of the authors’ institutions. the information about the survey and link to questionnaire, with its accompanying introduction, was then distributed using electronic mailing lists, linkedin groups, blogs, google groups, multiple facebook accounts, and over direct solicitations. two weeks after the initial survey distribution, the authors sent reminders through the various posting mechanisms (electronic mailing lists, blogs, linkedin) to encourage additional survey participation. the survey was open for three weeks, from april th through may nd, . study population and response rate the survey targeted two categories of respondents. academic library employees (librarians, administrators, managers, and staff) in higher education institutions were in the first category. faculty, staff (non-faculty), students, administrators, and managers from lis schools comprised the second category. geographically, the survey targeted employees and lis school students in north american institutions. non-us institutions were removed from the data analysis as a result of the low response rate. survey respondents included academic library employees and lis school employees and students. 
twelve (five from canada) respondents from institutions outside the us were removed from the data analysis. a total of respondents reported working in an academic library, while respondents reported being associated with an lis school. sixteen respondents were associated with both an academic library and an lis program (figure ).

fig. number of respondents by category

academic libraries

among academic library respondents, % were librarians, % staff (non-librarian), % administrators, and % managers. twelve percent of respondents selected more than one type of position (librarian and administrator, for example). respondents in the academic library category represented institutions, constituting approximately . % of the total number of libraries in four-year degree-granting us institutions. of the institutions in the sample, represented cc category one, category two, category three, and category four (figure ). public institutions accounted for % and private institutions constituted % of the total number. the institutions’ size listed by the number of students was represented with % under , students and % with , or more students (figure ). the geographical region most represented was the south ( %), followed by the midwest region ( %). the west and the northeast were similarly represented with % and % respectively.

__________
calculations based on academic libraries: . ( ). the national center for academic statistics. retrieved february , http://nces.ed.gov/pubs / .pdf.
the geographical areas were defined as regions: south, midwest, west, and northeast following u.s. census bureau. census regions and divisions of the united states. retrieved february , . https://www.census.gov/geo/reference/gtc/gtc_census_divreg.html.
fig. number of institutions by carnegie classification

fig. percent of institutions represented by number of students

lis schools

the second survey category targeted faculty, staff, and students from lis schools in the us and canada. considering canada’s low response rate, the authors excluded their responses from the analysis. the sample in this category included students and faculty members from lis schools, constituting % of the total number of ala accredited programs.

__________
calculation based on ala accredited master’s programs ( ). retrieved february , . www.ala.org/accreditedprograms/home.

quantitative analysis results

to evaluate participants’ perceptions of their academic libraries’ engagement in teaching for sustainability across the curriculum (addressed by the first and second research questions), the authors created a scaling system that measures the outcome of five categories based on responses to the following survey questions.

category : formal documents/statements/actions/programs

the first category included the following questions:
q. - has your library adopted any of the following (sustainability statement, commitment, action plan, or other)? please select all that apply.
q. - is an individual or group responsible for coordinating your library's sustainability efforts?
q. - select the sustainability efforts (institutional repository, data curation, collection development, subject research guide, research instructions, exhibits, other) this individual or group coordinates. please select all that apply.
q.
- are your library’s sustainability efforts reported to any of the following (university/college administration, library administration, university level committee, library level committee, state level, aashe sustainability tracking & reporting system (stars®) report, none of the above, unsure, n/a (we have no sustainability efforts.), other: please elaborate)? please select all that apply.

category : incorporating sustainability components into information literacy

the second category includes the following question:
q. - what sustainability content areas are librarians at your institution incorporating into student information literacy classes/instruction (open access to research, retaining author rights, institutional repository use, public engagement in the community, environmental, social equity, none, unsure, other: please elaborate)? please select all that apply.

category : libraries sustainability activities

the third category included the following questions:
q. - in which of the following areas is your library involved (sustainability research, sustainability teaching, sustainability curriculum development, sustainability collection development, aashe sustainability tracking & reporting system (stars®) report, greening libraries, none, unsure, other: please elaborate)? please select all that apply.
q. - does your library have a designated person responsible for coordinating sustainability collection development with other subject specialists (yes, no, unsure, other: please elaborate)?
q. - does your library have an institutional repository (yes, no, unsure)?

category : marketing sustainability resources and practices

the fourth category includes the following questions:
q.
- in what specific ways does your library market sustainability resources to users (subject research guides, exhibits, instruction, social media, newsletter, library website, we do not market sustainability resources, unsure, other)? please select all that apply.

category : collaborating with other units on campus

the fifth category includes the following questions:
q. - in which of the following areas has your library collaborated on sustainability-related content with other units on campus (teaching/co-teaching, training, co-authoring, course development, curriculum development, conference or symposium, presentations, workshops, exhibits, speakers, films, aashe sustainability tracking & reporting system (stars) report, other collaborative activities, none of the above, unsure)? please select all that apply.
q. - has your library sought development or grant funding for sustainability efforts (yes, no, unsure)?
q. - is your library represented on any of the committees responsible for developing and recommending policies and strategies to advance the university's/college's commitment to sustainability (yes, no, unsure)?

the questions above included possible initiatives or activities as options for engagement in sustainability efforts. category (formal documents/statements/actions/programs) included options; category (incorporating sustainability components into instruction) included options; category (libraries sustainability activities) included options; category (marketing sustainability resources and practices) included options; and category (collaborating with other units on campus) included options. each of these options was assigned one point. the total points for each institution were then tabulated based on question answers in the five categories above. where several individuals from the same institution completed the survey, the highest number of points for each category was attributed to the institution.
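the tabulation rule just described (one point per selected option, with the highest per-category count kept when several people answer for the same institution) can be sketched in python. this is only an illustration under stated assumptions, not the authors' actual workflow: the category keys, the option names, and the sample responses are hypothetical, and the sketch simply counts every selected option as one point.

```python
from collections import defaultdict

# Hypothetical short names for the five survey categories.
CATEGORIES = ("formal", "info_literacy", "activities", "marketing", "collaboration")


def institution_points(responses):
    """Return {institution: total points}.

    Each response is (institution, {category: set_of_selected_options}).
    Every selected option counts as one point. When several respondents
    share an institution, the highest count per category is kept, and the
    five per-category maxima are summed.
    """
    best = defaultdict(lambda: dict.fromkeys(CATEGORIES, 0))
    for institution, selected in responses:
        for category in CATEGORIES:
            points = len(selected.get(category, ()))
            if points > best[institution][category]:
                best[institution][category] = points
    return {name: sum(per_category.values()) for name, per_category in best.items()}


# Made-up responses: two respondents from "library a", one from "library b".
responses = [
    ("library a", {"formal": {"statement"}, "marketing": {"exhibits", "website"}}),
    ("library a", {"formal": {"statement", "action plan"}, "marketing": {"exhibits"}}),
    ("library b", {"activities": {"greening libraries"}}),
]
totals = institution_points(responses)
print(totals)
```

in this made-up sample, "library a" receives the higher of its two respondents' counts in each category (two for formal documents, two for marketing) rather than the sum of both respondents' answers.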
where the institution was unknown, the data was removed from the dataset.

the libraries represented in the survey were grouped into four intervals according to the number of points received, one point for each activity the respondent acknowledged their library was involved in. the intervals were selected to represent a low (minimally engaged) and a high (highly engaged) level of engagement, with most institutions falling in two middle intervals representing to distinct activities. the selection of the interval cutoffs was based on the response bell curve and what the authors thought represented a fair assessment of the level of engagement. the four engagement intervals were:
• minimally engaged: libraries with to points
• somewhat engaged: libraries with to points
• moderately engaged: libraries with to points
• highly engaged: libraries with to points

survey results revealed that out of libraries represented in the survey, ( %) were minimally engaged, ( %) were somewhat engaged in sustainability activities, ( %) were moderately engaged, and ( %) were highly engaged (figure ). this quantitative data showed that almost % of academic libraries represented in this survey were involved in sustainability initiatives or activities.

fig. level of library engagement in sustainability activities

next, the authors measured the strength of association between the universities’ degree level, volume and field coverage, research funding, undergraduate selectivity, and specialization (expressed by the carnegie classification (cc) taxonomy) and the level of sustainability activities (expressed by an index of library engagement in activities supporting sustainability). 
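The scoring procedure described above (one point per selected activity, the highest per-category count among respondents from the same institution, unknown institutions dropped, totals grouped into four intervals) and the rank-order association the authors go on to measure can be sketched as follows. This is a minimal illustration, not the authors' procedure: the category keys, the interval cutoffs, and all function names are hypothetical, the cutoffs in particular do not reproduce the study's actual point ranges, and the pure-Python Spearman calculation stands in for the statistical package used in the study.

```python
from collections import defaultdict

# Hypothetical keys for the five survey categories.
CATEGORIES = ("formal_documents", "information_literacy",
              "library_activities", "marketing", "collaboration")

# Hypothetical cutoffs (inclusive); the study's real point ranges are not reproduced here.
INTERVALS = [
    (0, 5, "minimally engaged"),
    (6, 15, "somewhat engaged"),
    (16, 25, "moderately engaged"),
    (26, 60, "highly engaged"),
]


def institution_scores(responses):
    """Total points per institution.

    responses: iterable of (institution, {category: set of selected activities}).
    Assumes non-activity answers such as "none" or "unsure" were filtered out.
    For each category the highest count among an institution's respondents is kept;
    responses with an unknown (empty) institution are dropped.
    """
    best = defaultdict(lambda: defaultdict(int))
    for institution, answers in responses:
        if not institution:
            continue  # institution unknown: remove from the dataset
        for category in CATEGORIES:
            points = len(answers.get(category, ()))
            best[institution][category] = max(best[institution][category], points)
    return {inst: sum(cats.values()) for inst, cats in best.items()}


def engagement_level(points):
    """Map a point total onto one of the four (hypothetical) intervals."""
    for low, high, label in INTERVALS:
        if low <= points <= high:
            return label
    raise ValueError(f"point total {points} is outside every interval")


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    demo = [
        ("univ a", {"marketing": {"exhibits", "social media"}}),
        ("univ a", {"marketing": {"exhibits"}, "collaboration": {"workshops"}}),
        ("", {"marketing": {"exhibits"}}),  # dropped: institution unknown
    ]
    for name, total in institution_scores(demo).items():
        print(name, total, engagement_level(total))
```

Because both the classification category and the engagement index are ordinal and heavily tied, the sketch ranks with averaged ties instead of using the shortcut formula based on squared rank differences, which is exact only when there are no ties.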
the first cc is assigned to doctoral-granting institutions, the second to comprehensive universities and colleges, the third to liberal arts colleges, and the fourth to two-year colleges and institutes (mccormick & zhao, ). the authors constructed the index by adding the responses to five categories of questions about the libraries’ engagement (categories - ). the index (fig. ) was represented on the ordinal scale ranging from to , where the value of represents the highest level of engagement (the number of initiatives and activities ranging from to ) and the value of represents the lowest level of engagement (ranging from to ). the strength of the association was calculated with spearman’s rank-order correlation (laerd statistics, ) using spss statistical software. the spearman's rho correlation between cc and the index is . (fig. ). this is a weak positive correlation that is not statistically significant (sig. -tailed = . ). this correlation analysis helped the authors answer the second research question: the cc taxonomy of higher education institutions in the us is only weakly associated with the level of academic libraries’ engagement in teaching sustainability across the curriculum, and neither variable appears to influence the other.

fig. index of library engagement in activities supporting sustainability and the carnegie classification taxonomy.

fig. 
correlation between the index of library engagement in activities supporting sustainability and the carnegie classification taxonomy.

in summary, this survey revealed that the level of engagement of academic libraries in sustainability activities is associated with the cc taxonomy of us colleges and universities but that they do not influence each other. for example, libraries in institutions with carnegie classification categories one and two do not necessarily function at a higher level of engagement in their sustainability activities and initiatives than libraries at institutions classified as categories three and four. additionally, quantitative data showed that nearly % of libraries represented in the survey were engaged in sustainability initiatives and activities.

qualitative analysis: survey results and respondent comments

academic libraries

when asked how engaged they felt academic libraries should be in campus sustainability teaching and curricular activities, % of respondents believed libraries should be very engaged, while % felt libraries should be somewhat engaged. only % selected the option, “not very engaged–this is not a priority.” this question received the most comments, with % of the respondents providing an explanation with their answers. regardless of the choice made (very engaged, somewhat engaged, or not a priority), almost a third of the respondents commented that sustainability is only one area of the curriculum that needs to be supported. 
eight percent of those who provided comments felt strongly that academic libraries should play a significant role in sustainability teaching and curricular activities with comments such as: “libraries are the hub of campus life and ought to take the lead in sustainability issues; we are positioned perfectly to be an integral player to sit at the table along with other faculty and administrators deciding campus-wide sustainability initiatives, programs, and philosophies.” (original comments. all comments are anonymous.)

the results of the survey reveal academic libraries are engaged in a wide variety of activities that both support and enhance sustainability curricular efforts at their institutions. since sustainability links many disciplines, it was not surprising to find that these important library services were prominent. a quarter of respondents indicated that their library was involved in sustainability research while % of respondents indicated their library was involved in teaching sustainability topics. more than % of respondents included initiatives related to greening libraries such as: “we do have some modest energy and resource programs in place, paper reduction, electricity consumption reduction, compost bins in the library, etc.; all i know, is we are recycling our paper and we push for green cleaning; we participate in environmental efforts such as recycling, reducing paper waste, green printing, using filtered tap water, composting in library kitchen; i think the library should be as sustainable as possible –adopting measures to mitigate the impact we have on the environment; probably minimal efforts with greening the library–could do more.”

open access (oa), retaining author rights, and institutional repositories (irs) appeared prominently and in diverse contexts throughout the survey. 
these were seen as sustainable and equitable models of access to information that also maximize the return on investment for scholarly research. these topics were incorporated into student information literacy programs, used in development and grant proposals, and reported as creating opportunities for collaboration. one respondent commented, “i believe we should teach students the value of access to information in open access formats and the value of archiving and licensing their own work for use and re-use by others.” nearly % of respondents indicated that their library has used an ir specifically to collect, preserve, and disseminate sustainability-related scholarly materials. reportedly, the items most often archived in the ir for this purpose were articles and reports ( % of those who have an ir), and graduate and undergraduate work ( % of those who have an ir). others reported having a dedicated ir section for sustainability-related content and using the repository to archive materials from the university’s office of sustainability.

another significant area of engagement was incorporating sustainability content into information literacy, including topics related to oa ( %), use of the ir ( %), environmental subjects ( %), retaining author rights ( %), social equity ( %), and community engagement ( %). in several instances, respondents reported that inclusion was not programmatic but rather an individual decision. one respondent captured this focus: “we have a supportive role in helping to infuse sustainability literacy and practice on campus.” information literacy was also reported as one of the key ways in which libraries market sustainability resources, with the most common method being the use of subject research guides. exhibits, the library’s website, and social media were also used by libraries to showcase and promote their sustainability materials and resources.

academic libraries are accustomed to collaborating with other units on campus. 
with regard to sustainability-related content, teaching and presentations were reported as the most common ways in which libraries work together with other units on campus, while more than a third of the respondents reported collaborating on exhibits, speakers, workshops, and films. the least common areas of collaboration were the co-authoring of research and working jointly on the aashe stars report. the survey responses revealed a low rate of library engagement in the stars reporting process, indicating that academic libraries are either not involved at all or involved only at a minimal level. in the current version of stars . (stars, a program of aashe, ), libraries are evaluated under the support for research category and on waste generation in paper and ink during printing. of the possible stars credits, libraries could be evaluated on their sustainability engagement activities. through broader evaluation in stars, libraries would demonstrate to campus, administrators, and faculty that they have much to offer to academic sustainability. during the stars period of public commenting (november ) on aashe’s recently drafted credits, a number of library professionals submitted remarks addressing this issue. they petitioned the stars . committee to: evaluate academic library initiatives supporting oa; promote irs to preserve digital content created by faculty, graduate, and undergraduate students; educate faculty about retaining their author rights; build sustainability-related collections and research guides; incorporate sustainability content into information literacy, diversity and teaching; and collaborate on sustainability projects. if approved, these activities and initiatives might have a chance to engage library partnerships with other university stakeholders and foster a greater level of collaborative sustainability in higher education. 
when asked about library-specific documents related to a sustainability vision, % of respondents indicated that they had a detailed sustainability statement, commitment, or action plan. over % of respondents indicated that there is no official reporting on sustainability efforts at the university or library level while only % stated having a person responsible for coordinating sustainability collection development. twenty-nine percent reported having a person responsible for coordinating their library’s sustainability efforts, with only % functioning in an official capacity. in the words of a few respondents: “we do have some sustainability actions but no formal plan; there are multiple teams responsible for several of the options. no one team or committee is responsible for all aspects; individual librarians do many of the listed things; these duties are spread out between various groups.”

in summary, although the majority reported the lack of a library-specific formal sustainability commitment or action plan, more than % of respondents from academic libraries felt that libraries should be very engaged or somewhat engaged in educating for sustainability. many replied indicating they have some sustainability actions, but no formal plans. only % of respondents reported having a person functioning in an official capacity and responsible for coordinating their library’s sustainability efforts. the survey results also revealed that library employees still strongly associate the term sustainability with green efforts. recycling programs, tracking energy usage, making appropriate changes to reduce energy consumption, leed certified buildings, and greening libraries were often mentioned. 
lis schools

when lis school participants were asked the question, “how proactive do you think library schools should be in preparing students for sustainability practice in libraries?” twenty-five respondents stated, “very proactive,” replied “proactive,” felt “neutral,” and stated, “not an lis school role” (figure ).

fig. “how proactive do you think library schools should be in preparing students for sustainability practice in libraries?”

these results showed that % of respondents felt sustainability issues should be addressed in lis programs. survey respondents’ quotes provided insights on what they believed to be a priority and their experiences in attending or working in lis schools: “same reasons i believe libraries should consider it a priority. moreover, the training starts in library school, so it is imperative that sustainability practices are taught in these programs; it should be learned in library school and continued as librarians branch out in their various disciplines within the field.”

the authors were interested in specific ways that lis schools have evolved to reflect academic institutions’ foci on educating for sustainability. the survey respondents were asked to list course names offered in lis programs that incorporate sustainability-related subject matter. the most popular were:
• course content addressing diversity issues– respondents,
• oa and scholarly communication: ir, retaining author rights, data curation or digital content– ,
• patrons’ social equity issues– ,
• sustainability in collection development– ,
• greening library buildings and practices– .

aspects of scholarly communication, collection development, and social equity/diversity were considered major library concerns for information access. 
survey quotes focused on specific content in lis programs that went beyond traditional library courses such as: diversity and global connections; literacy and services to underserved populations, organizational ethics, intellectual freedom; information ecology and ecological informatics; ethics diversity and change; information ethics and policy; multicultural services in libraries; information services to diverse client groups; archival outreach: programs and services; library architecture and space planning; information access & knowledge acquisition; community informatics; and accessibility for information technology.

when asked how they felt regarding how lis programs have evolved to reflect academic institutions' focus on educating for sustainability, some faculty stated: “expansion of the meaning of information resources. stewardship and continuity (sustainability) of information resources emphasized. more discussion of systems and users as unique human patterns;” “providing classes to educate students on diverse, collection development, open access, intellectual freedom;” “emphasizes importance of open access–library is open to public. diversity and social equity are important, as is sustainability in the buildings and in classes.”

among lis students, answers to this question were very diverse. some of them reported a strong emphasis on sustainability: “my lis program put strong emphasis on diversity and the value of multiple perspectives in problem solving. 
the program encouraged students to go beyond our traditional thoughts about libraries to discover new solutions that promote sustainability; i am enrolled in the information and diverse populations specialty in my mls program, which is designed to educate students on the need to provide services inclusive of a variety of users.” other students reported an absence of focus on sustainability such as: “to my knowledge it is not even on the radar, let alone a priority. i've never heard such ideas mentioned in courses, guest talks, and research areas. the campus/university in general is, in my opinion, extremely out of touch with sustainable living habits that i considered standard before arriving here. oblivious waste and consumption are the norm here, as they are, throughout much of the midwest, from what i can see; i don't recall coming across the concept of sustainability in my classes.”

the authors were interested in discovering if lis programs marketed their sustainability efforts and course offerings to attract students and in what specific areas. of the responses to this question, only answered “yes,” responded “no,” were unsure, and chose “not applicable.” specific areas of sustainability that some schools advertised encompassed diversity in multiple iterations, such as global connections, populations, and partnerships with community agencies, archives management, digital libraries, and developing collections of open access databases for small-budget libraries. additional lis program sustainability marketing included education advantages, such as originality in course offerings, technical skills, and well-qualified staff or a scholarship program such as: “the information and diverse populations specialty is actually a scholarship program that covers all tuition for full-time students enrolled in the mls program. 
in addition we are assigned mentors and receive monthly lectures from individuals in the field on various issues of diversity.”

in summary, the lis schools’ section of the survey elicited responses that illustrated an innovative range of coursework addressing sustainability in digital content, importance of oa, patrons’ free access to information, social equity and diversity, transformations in collection management, intellectual freedom, continuity and expansion of information resources. lis respondents ( %) believed that lis schools should play a role in educating for sustainability. a % response rate to marketing lis program sustainability efforts and course offerings to attract students was considered low by the authors in contrast to actual activity recorded by survey respondents. the notion of lis schools marketing sustainability components in their programs does not equate to what is actually occurring within the curriculum and reported in the survey by lis faculty and students.

findings and discussion

this exploratory study provided an empirical snapshot of library employee and lis faculty and student perceptions on the level of engagement of academic libraries and lis programs in the emerging focus on educating and teaching for sustainability across the curriculum in us academic institutions. the authors considered the research successful from the perspective of establishing some baseline information for continuing to improve our understanding of the role of libraries and the lis schools in educating for sustainability. 
findings related to the first research question addressing the level of academic libraries’ engagement in campus sustainability teaching and curricular activities revealed that out of libraries represented in the survey, % were minimally engaged, % were somewhat engaged, % were moderately engaged, and % were highly engaged in sustainability activities and initiatives.

conclusions related to the second research question revealed a weak positive correlation between the level of engagement of academic libraries in the emerging focus of teaching sustainability across the curriculum and the carnegie classification taxonomy of higher education institutions in the us.

findings related to the third research question addressing types of sustainability activities at academic libraries revealed over % of the libraries represented in this research reported between and actions and initiatives. some of the reported sustainability-related activities included:
• information literacy classes incorporating topics related to open access, use of the ir, environment, retaining author rights, social equity, and community engagement ( %)
• collaborating with other units on campus sustainability-related activities ( %)
• creation of subject guides ( %)
• efforts to build collections devoted to sustainability-related topics ( %)
• greening libraries ( %)
• involvement in sustainability research ( %)
• teaching ( %)
• involvement in stars report ( %)

findings related to the fourth research question revealed the following most frequently reported venues for marketing sustainability resources to users:
• subject research guides ( %)
• information literacy classes ( %)
• exhibits ( %)
• library website ( %)
• social media ( %)

responses to the fifth research question addressing the area of library collaboration on sustainability-related content with other units on campus indicated that 
libraries collaborate most often with academic units and less frequently with administrative or other non-academic campus units. the findings presented the following most frequently reported sustainability-related activities and initiatives:
• teaching with academic units on campus ( %)
• presentations with academic units on campus ( %)
• exhibits with academic units on campus ( %)
• course development with academic units on campus ( %)
• organizing speakers with academic units on campus ( %)

findings related to one research question addressing the focus of sustainability stressed in lis curriculum and practice revealed the following most frequently reported sustainability-related course content:
• diversity: organizational ethics, services to diverse user groups ( %)
• oa and scholarly communication: curation of digital content and intellectual freedom ( %)
• social equity: free access to information, accessibility for information technology ( %)
• collection development: continuity and expansion of information resources ( %)
• greening: library buildings, collections, services, and information technology ( %)

in summary, both categories of respondents expressed overwhelming support for engaging academic libraries ( %) in campus sustainability teaching, research, and outreach, and lis programs ( %) in their curricular activities.

conclusion

overall, the authors found that, with the increasing focus on educating for sustainability, library employees, lis faculty, and students realize the importance for libraries and lis programs to respond to this expanding movement. the importance of oa, irs, retaining author rights, and diversity figured prominently in different contexts throughout the survey. these are seen as sustainable and equitable models of access to information that also maximize a return on investment for scholarly research and protect equal access to resources, now and in the future. 
they are incorporated into student information literacy programs, research guides, collection development, teaching equity and diversity issues; used in development and grant proposals; and reported as creating opportunities for outreach and collaboration. the study demonstrated that libraries in institutions with carnegie classification categories one and two do not present a higher level of engagement in their sustainability activities and initiatives than libraries at institutions classified as categories three and four.

while most academic libraries represented in the study have been engaged in a broad spectrum of activities that support their institution’s sustainability research and curricular functions, this study has indicated that these activities lack a focused and targeted approach. the study revealed a gap between an eagerness to be actively engaged in sustainability activities and an absence of specific sustainability documents such as a statement, commitment or action plan in the strategic plans of academic libraries. out of libraries represented in the study, only a few were reported as having sustainability activities included in their strategic plan. expanding evaluation criteria for academic libraries in the stars reporting system might lead to the development of action plans focused on sustainability activities and allow them to respond more quickly to the needs of university sustainability initiatives. 
lis programs may not be marketing courses that focus on sustainability, but according to survey responses, they do offer a variety of substantive courses that include aspects of sustainability-related content, focusing on access to information and diverse users; transformation in collection development and management; entrepreneurship in information, ethics, diversity and change; multicultural services; community informatics; accessibility for information technology; archival outreach; digital curation; digital scholarship and open content. this exploratory study initiated a better understanding of the role of libraries and lis programs in educating for sustainability and presented opportunities for further research to discover essential factors to foster sustainability and effective ways to market sustainability concepts in lis programs.

acknowledgment

the presented research project was carried out with support from the authors’ libraries. the authors wish to thank administrators and colleges for their assistance.

appendix . supplementary data

supplementary data to this article can be found online at: http://dx.doi.org/ . /j.acalib. . . .

references

aashe. ( ). sustainability curriculum in higher education: a call to action. retrieved february , , from http://www.aashe.org/files/a_call_to_action_final% % .pdf
aashe. ( and ). stars steering committee meeting minutes. retrieved june , , from http://www.aashe.org/files/documents/stars/stars_steering_committee_meeting_ . . .pdf
aashe. ( ). stars history and system development. retrieved february , , from https://stars.aashe.org/pages/about/faqs/stars-history-and-system-development.html
aashe. ( ). about aashe. retrieved october , , from http://aashe.org/about
american college & university presidents' climate commitment. ( ). mission and history. 
retrieved february , , from http://www.presidentsclimatecommitment.org/about/mission-history
antonelli, m. & mccullough, m. (eds.) ( ). greening libraries. los angeles: library juice press.
applin, m. ( ). building a sustainability collection: a selected bibliography. reference services review, ( ), – . doi: . / .
buehler, m. ( , october). building global bridges to higher education sustainability research & collaborations. paper presented at the aashe annual conference, boulder, co. retrieved january , from http://www.aashe.org/resources/conference/building-global-bridges-sustainability-researchcollaborations-higher-education
chadwell, f. ( ). what's next for collection management and managers?: sustainability dilemmas. collection management, ( ), - . doi: . / . .
charney, m. ( , october). getting closer: the librarian, the curriculum, and the office of sustainability. paper presented at the aashe annual conference, pittsburgh, pa. retrieved february , , from http://www.aashe.org/resources/conference/getting-closer-librarian-curriculum-and-office-sustainability-
charney, m. ( ). a sustainability librarian's manifesto: your "take action" checklist. libraries for sustainability webinar series. unpublished results. retrieved february , , from http://works.bepress.com/charney_madeleine/
connell, v. ( ). greening the library: collection development decisions. endnotes: the journal of the new members round table, ( ), - . 
retrieved february , , from http://www.ala.org/nmrt/sites/ala.org.nmrt/files/content/oversightgroups/comm/schres/endnotesvol is / greeningthelibrary.pdf
desilva, m. ( ). more than sustainable agriculture resources. college & research library news, ( ), - . retrieved february , , from http://crln.acrl.org/content/ / / .full
hoebelheinrich, n. ( ). an aid to analyzing the sustainability of commonly used geospatial formats: the library of congress sustainability website. journal of map & geography libraries, ( ), - . http://dx.doi.org/ . / . .
howard, k. ( ). new paradigm, new educational requirements? australian viewpoints on education for digital libraries. world library and information congress: th ifla general conference and assembly. retrieved february , , from http://conference.ifla.org/past/ifla / -howard-en.pdf
jankowska, m., a. ( ). can the ala interest in sustainable development be continued? public libraries, ( ), - . retrieved february , , from http://crl.acrl.org/content/ / / .full.pdf+html
jankowska, m., a., & marcum, j. ( ). sustainability challenge for academic libraries: planning for the future. college & research libraries, ( ), - . retrieved february , , from http://crl.acrl.org/content/ / / .full.pdf+html
laerd statistics. ( ). a how to statistical guide, spearman’s rank correlation using spss. retrieved june , , from https://statistics.laerd.com/spss-tutorials/spearmans-rank-order-correlation-using-spss-statistics.php
link, t. (spring ). transforming higher education through sustainability and environmental education. issues in science and technology librarianship. retrieved february , , from http://www.istl.org/ -spring/article .html
long, s. ( ). libraries can help build sustainable communities. american libraries, ( ), . retrieved february , , from http://www.questia.com/library/ g - /libraries-can-help-build-sustainable-communities
maron, n., & loy, m. ( ). funding for sustainability: how funders' practices influence the future of digital resources. united kingdom: ithaka s+r. 
retrieved february , , from http://www.sr.ithaka.org/research-publications/funding-sustainability-how-funders%e % % - practices-influence-future-digital mccormick, a. & zhao, c. ( ). the carnegie classification of u.s. institutions of higher education. retrieved february , , from http://classifications.carnegiefoundation.org/downloads/rethinking.pdf schroeder, r. ( ). promotion of the “scholarship of publishing”–a sustainable future for scholarly communication. presentation, sustainable scholarship conference, pacific university. retrieved february , , from http://commons.pacificu.edu/sustainableschol/program/oct / / stark, m. ( ). information in place: integrating sustainability into information literacy instruction. electronic green journal, ( ). retrieved february , , from http://www.escholarship.org/uc/item/ fz w p stars, a program of aashe ( ). version . technical manual, - . retrieved february , , from http://www.aashe.org/files/documents/stars/stars_ . _technical_manual.pdf ulsf ( ). talloires declaration retrieved october , , from http://www.ulsf.org/programs_talloires_td.html williams, b., less, a., & dorsey, s. ( a). libraries as sustainability advocates, educators, and entrepreneurs. in m. krautter, m. lock & m. scanlon (eds.), the entrepreneurial librarian: essays on the infusion of private-business dynamism into professional service. jefferson, nc: http://www.ulsf.org/programs_talloires_td.html% http://www.aashe.org/files/documents/stars/stars_ . 
_technical_manual.pdf http://www.escholarship.org/uc/item/ fz w p http://commons.pacificu.edu/sustainableschol/program/oct / / http://classifications.carnegiefoundation.org/downloads/rethinking.pdf http://www.sr.ithaka.org/research-publications/funding-sustainability-how-funders%e % % -practices-influence-future-digital http://www.sr.ithaka.org/research-publications/funding-sustainability-how-funders%e % % -practices-influence-future-digital http://www.questia.com/library/ g - /libraries-can-help-build-sustainable-communities http://www.questia.com/library/ g - /libraries-can-help-build-sustainable-communities http://www.istl.org/ -spring/article .html https://statistics.laerd.com/spss-tutorials/spearmans-rank-order-correlation-using-spss- https://statistics.laerd.com/spss-tutorials/spearmans-rank-order-correlation-using-spss- http://crl.acrl.org/content/ / / .full.pdf+html engagement of academic libraries and information science schools in creating curriculum for sustainability mcfarland. williams, b., charney, m., & smith, b. ( b). libraries for sustainability call to action and collaboration! retrieved february , , from http://greeningyourlibrary.wordpress.com/ / / /libraries-for-sustainability-a-four-part- webinar-series/ http://greeningyourlibrary.wordpress.com/ / / /libraries-for-sustainability-a-four-part-webinar-series/ http://greeningyourlibrary.wordpress.com/ / / /libraries-for-sustainability-a-four-part-webinar-series/ microsoft word - idcc -fenlon-inpress.docx idcc | extended abstract draft from december correspondence should be addressed to katrina fenlon, campus drive, hornbake library south rm , college park, md . email: kfenlon@umd.edu the th international digital curation conference takes place on - february in dublin, ireland url: http://www.dcc.ac.uk/events/idcc copyright rests with the authors. this work is released under a creative commons attribution . international licence. for details please see http://creativecommons.org/licenses/by/ . 
/

sustaining digital humanities collections: challenges and community-centered strategies

abstract

since the advent of digital scholarship in the humanities, decades of extensive, distributed scholarly efforts have produced a digital scholarly record that is increasingly scattered, heterogeneous, and independent of curatorial institutions. digital scholarship produces collections with unique scholarly and cultural value—collections that serve as hubs for collaboration and communication, engage broad audiences, and support new research. yet, lacking systematic support for digital scholarship in libraries, digital humanities collections are facing a widespread crisis of sustainability. this paper provides outcomes of a multimodal study of sustainability challenges confronting digital collections in the humanities, characterizing institutional and community-oriented strategies for sustaining collections. strategies that prioritize community engagement with collections and the maintenance of sociotechnical workflows suggest possibilities for novel approaches to collaborative, community-centered sustainability for digital humanities collections.

katrina fenlon
college of information studies
university of maryland, college park

| sustaining digital humanities idcc | extended abstract

introduction

since the advent of digital scholarship in the humanities, decades of extensive, distributed scholarly efforts in collecting and digitization, datafication, modelling, encoding, scholarly editing, annotation, and the development of maps, games, simulations, and more, have resulted in a digital scholarly record that is increasingly scattered, heterogeneous, and independent of libraries and cultural institutions. the digital outputs of humanities research are increasingly media-rich, data-centric, interactive, and interlinked with external resources.
they are also increasingly common; more than half of faculty report creating digital tools and collections, most intended for public use or to serve a disciplinary community of researchers (maron and pickle, ). digital scholarship produces collections with unique scholarly and cultural value, both in their capacity to manifest scholarly interpretation and serve new research and reuse, and in their propensity to gather and represent digital primary source evidence that does not exist as such in mainstream memory institutions.

yet the bulk of digital humanities collections are unsustainable. outside of well-resourced digital humanities centers and libraries, there continues to be a systematic lack of support for digital scholarship after the phase of its initial creation. even on campuses with established digital humanities centers, there are rarely end-to-end solutions in place for supporting digital scholarship from its conception to preservation, so that maintaining projects—which are built by scholars or research communities, often on bespoke infrastructures using short-term funding—has become a major problem for institutions (maron and pickle, ; smithies et al., ). library support for digital scholarship at every phase of its lifecycle is growing but remains profoundly inadequate overall to match the ongoing growth in digital scholarship or confront the existing accumulation of legacy collections.

this paper reports on a multimodal study of the sustainability challenges confronting digital collections in the humanities. based on a set of interviews with practitioners in digital humanities centers and libraries, supplemented by an analysis of digital collections, this paper identifies the central challenges confronting the management of collections over time.
this paper then characterizes strategies for sustaining collections, dwelling on one strategy in urgent need of increased research and understanding: that of community engagement with and reuse of digital collections in the humanities, with the goal of moving toward community-centered sustainment.

background

one common mode of digital humanities production is the digital collection—often called thematic research collection (palmer, ) or digital archive—which takes the form of a curated aggregation of primary sources along with materials and features designed to support research on a theme. “collection” is used as a shorthand in this paper for a variety of digital projects and their outcomes, ranging from scholarly editions to linked data hubs, which gather primary sources or evidence derived from sources, and integrate those sources with annotation, contextual information, secondary sources, or functional and interactive elements in order to construct platforms for learning and research. digital humanities collections serve as hubs for collaboration and communication, engage broad audiences, and generate new research (palmer, ; fenlon, ).

while collections have long constituted a prominent mode of digital scholarship (palmer, ; flanders, ; fenlon, ; cooper and rieger, ), they rarely gain integration into systems of digital curation or preservation in libraries and other curation institutions. despite the fact that most fall well within scope of the preservation missions of libraries responsible for stewarding institutional research, digital humanities collections are facing a widespread crisis of sustainability.

fenlon | idcc | research paper

sustainability and preservation are uniquely problematic for digital humanities collections, for many reasons. collections are often developed and maintained outside of the purview of dedicated memory institutions.
they tend to be centered in scholarly communities, in the sense that scholars create and maintain collections for their own uses or the uses of their communities, with fluctuating resources, and usually without professional curatorial support. because these collections tend to be funded on short cycles oriented toward technical innovation or experimentation and rapid development, they often rely on bespoke or fragile infrastructures. these collections are highly creator-dependent; they rarely endure beyond the interest and involvement of their initial creators, even when there are active communities of use. there is evidence of systemic confusion around the value of digital scholarship to academic institutions, and how institutions should understand ownership of highly collaborative and distributed projects (maron & pickle, ). and because collections function simultaneously as scholarly publications and as platforms for ongoing research, they confront a conceptual morass around what sustainability and preservation really mean for different kinds of digital scholarship in different contexts. more pragmatically, most academic libraries simply lack capacity to take in and sustain any more than a narrow swath of digital scholarship.

sustainability is a term that has garnered widely varying definitions across the literatures of practice and research in cultural heritage, digital humanities, and digital curation. most discussions of sustainability revolve around organizational resilience, long-term economic viability, and questions of institutional management (eschenfelder et al., ). there is increasing recognition of the sociotechnical aspects of sustainability—of the need to maintain the collaborative processes and labor that serve to construct digital scholarship in combination with technical artifacts and processes (langmead et al., ; madsen and hurst, ).
this paper builds on sociotechnical approaches to sustainability, considering sustainability to mean the ability of a collection to remain viable over time, to responsively support the communities that create and use it, in whatever forms are useful, for as long as useful. in contrast to a paradigm of digital preservation focused on fixity, this definition of sustainability admits the need for collections to continue to change and grow. this definition also presumes that sustainability and preservation approaches exist on a spectrum, with no clear delineation between them.

institutional efforts to sustain and preserve digital scholarship are commonly characterized by one or more of the following three main features: ( ) maintenance and preservation efforts are solely or primarily assumed by digital humanities centers, where they exist; ( ) where centers or preservation institutions (mostly libraries) offer long-term support for digital scholarship, that support is generally framed in terms of service levels; and ( ) repository, publishing, and data management infrastructures are developed to increase the capacity of institutions to hold and maintain increasingly complex digital scholarship.

digital humanities centers commonly serve as inadvertent, sometimes reluctant memory institutions. depending on their capacity and their relationships with other entities, they make sporadic, often reactive investments into maintaining digital projects that they host. some centers and labs have developed comprehensive strategies and policies to confront burgeoning maintenance needs (smithies et al., ; madsen and hurst, ). centers may possess a range of relationships with institutional libraries, ranging from complete independence to physical colocation and organizational ties. these relationships substantially affect the capacity of a center or lab to sustain digital scholarship over time (prescott, ).
for both digital humanities centers and for libraries playing an active role in sustaining or preserving digital scholarship, the most commonly reported strategy involves the articulation and negotiation of a service model comprised of varying service levels or layers. service levels are usually defined around the varying commitments a library or center agrees to make to maintain discrete kinds of components, significant properties, or levels of access to collections in response to identified functional requirements (e.g., oltmanns et al., ; madsen and hurst, ; goddard and walde, ; vinopal and mccormick, ; sustaining digital scholarship, ). service levels may be negotiated on a per-project basis to create formal agreements between digital humanities creators and libraries or centers, or they may constitute blanket institutional policies. for libraries, this layered service model in almost every case entails a “handoff” of a collection—migration of the collection along with transfer of ownership or responsibility—from a research community to the library. at what point in the lifecycle of a project that handoff happens varies widely.

in addition to developing policies, a final common institutional strategy is the development or adoption of advanced technical infrastructure for the management, preservation, and publication of increasingly complex digital objects and collections. emergent preservation repositories, publishing platforms, and collaborative research environments aim to capture and represent complex digital research objects, linked data, and primary source collections alongside and interleaved with traditional forms of scholarly publication (e.g., sweeney et al., ; almas, ; white et al., ; fenlon, ).
digital humanities scholarship has generally resisted large-scale infrastructure for many reasons, including the high variation in user requirements across projects (dombrowski, ), the non-scalability of digital humanities and digital curation (rawson and muñoz, ), and epistemological tensions with established and emergent cyberinfrastructure from other domains (fenlon, ; smithies et al., ).

beyond institutionally centered strategies for digital humanities sustainability and preservation, there is a promising movement within cultural institutions toward shared stewardship and related models for partnering with communities to share the work of collection maintenance over time (e.g., smithsonian, ). these models emerge from a substantial body of research in the archival community on post-custodial and participatory archives (gilliland and flinn, ; caswell, ; clement, ). while these efforts have largely focused on community archives rather than digital scholarship, they may offer a promising direction for digital collections more broadly.

methods

this paper reports selected outcomes of a multimodal, qualitative study of thematic research collections as an emergent mode of digital scholarship in the humanities, along with challenges for libraries in supporting collections throughout their lifecycles. the study was conducted in three phases: ( ) typological analysis of a large sample of collections (n= ), which characterized the range and defining features of collections; ( ) qualitative content analysis of three exemplary collections to more deeply characterize the genre; and ( ) a set of semi-structured interviews with nine practitioners, representatives of digital humanities centers and libraries, each with significant expertise in the creation and management of digital humanities collections.
the goal of the interview phase of the study was to identify current practices in supporting thematic research collections, along with challenges and strategies for integrating collections into infrastructures of maintenance and preservation. this paper focuses on the outcomes of the interviews, which had the most bearing on questions of sustainability and preservation. however, a relevant outcome of the typology and content analysis phases of this study—which pertains to different modes of contribution of digital collections—is summarized in the first part of the “challenges” section, below. for details on methods and findings of typology and content analysis, see fenlon ( ).

the interview phase of this study addressed questions including:
• what are the challenges, for libraries and related scholarly-publishing entities, in supporting thematic research collections as a scholarly genre?
• how do library publishing programs and related scholarly-publishing entities support the creation and publication of thematic research collections, and what problems exist in meeting the needs of collection creators?
• how do libraries collect, represent, describe, preserve, and otherwise treat thematic research collections after publication, and what problems exist in meeting user needs?

sampling for the interview phase of the study was purposive. while the sample was small, participants were selected for their expertise in the creation and maintenance of thematic research collections, prioritizing the potential richness of expert response over any gains in generalizability that might be attained from a larger or random sample.
participants were selected to represent well-established centers and labs with a long history of creating and maintaining digital collections, including the center for digital research in the humanities at the university of nebraska-lincoln, the maryland institute for technology in the humanities, the roy rosenzweig center for history and new media at george mason university, and the scholars’ lab and the institute for advanced technology in the humanities at the university of virginia. all participants waived confidentiality for this study; nonetheless, the description of results below employs participant codes (in the form of “participant x”), rather than names, to distinguish quotations by different participants. where possible, interviews were conducted with more than one person from each institution. two additional interviewees were selected for their extensive experience working with collections in addition to expertise in library administration. interviews were coded using qualitative content analysis. the coding frame was built inductively, deriving themes from the transcripts in answer to the research questions. the study admits several limitations beyond those that confront interview studies generally. this study focuses on the perspectives of collection creators within digital humanities centers (albeit, collection creators with significant expertise). future work will need to integrate the perspectives of independent scholars, along with those of more and varied stakeholders in preservation institutions. few libraries appear to systematically deal with thematic research collections post-publication, which makes empirical investigation of the possibilities difficult. for this reason, this study aims to be foundational rather than comprehensive or conclusive about the challenges confronting institutions. 
challenges to sustaining digital humanities collections

this study surfaced four main challenges confronting the sustainability and preservation of digital humanities collections: ( ) discontinuity between the essential interactivity of digital collections and the paradigm of artifactual preservation; ( ) the importance and vulnerability of “connective tissue” within and between collections; ( ) ambiguity of institutional contexts and roles; and ( ) lack of infrastructure for collaborative humanities workflows. these challenges are grounded in and contextualized by an important observation about digital scholarship which emerged from the typological and content analysis phases of research: that the varying contributions of digital scholarship seriously complicate discourse around and practical approaches to sustainability and preservation.

different collections aim to contribute to scholarship in different ways. this study identified different kinds of contributions that collections make to scholarship. while the contributions described here are by no means exhaustive, they exemplify epistemological differences that have a bearing on sustainability and preservation decisions. based on the typological analysis and content analysis reported in fenlon ( ), collections may be usefully differentiated by constellations of interrelated properties, such as a collection’s purpose(s), a collection’s theme or subject, the kinds and diversity of items in a collection, and how interrelationships among items in a collection are created through technical, narrative, and design elements. in fact, this study found that the combination of these properties may be boiled down to a deceptively simple question, with which to differentiate collections: what would it mean for a given collection to be complete?
in other words, what idea of completeness—in the senses of wholeness, totality, or comprehensiveness—guides the development of the collection? the study identified three preliminary kinds of collections, each bent toward a different ideal of completeness:
• definitive source collections aim to bring together an exhaustive set of definitive primary sources, to serve as an authoritative resource for scholarship. sustainability and preservation efforts for such collections would likely center on maintaining access to the sources directly.
• interpretive context collections aim to surround a diverse set of exemplary (not necessarily definitive) sources with interpretive context and make interrelationships between sources and context actionable and usable. sustainability and preservation efforts for such collections would likely prioritize metadata over sources themselves.
• evidential platform collections are focused on aggregating, deconstructing, and remodelling diverse forms of primary sources for new analytical and interpretive uses, for example by deriving computationally amenable data from primary sources. sustainability and preservation efforts for such collections would likely prioritize the data along with rigorous documentation of provenance and persistent links to original sources.

of course, many collections combine aspects of each of these varieties of contribution (and presumably many other varieties). if the aim of sustainability and preservation efforts is to maintain the contributions of digital scholarship, then those efforts must be adaptive to varieties of contribution. the digital humanities community lacks a common vocabulary for discussing different modes of contribution of digital scholarship; thus, the first of the four challenges identified in the interviews is a conceptual challenge. the rest of this section elaborates the four challenges identified above.
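the correspondence between kinds of collections and likely preservation priorities described above can be sketched as a small lookup table. this is an illustration only, not part of the study: the names `CollectionKind`, `PRIORITIES`, and `preservation_priorities` are hypothetical, and the priority strings paraphrase the three descriptions above. because many collections combine kinds, the function merges priorities across every kind a collection exhibits.

```python
from enum import Enum


class CollectionKind(Enum):
    """The three preliminary kinds of collections named in the text."""
    DEFINITIVE_SOURCE = "definitive source"
    INTERPRETIVE_CONTEXT = "interpretive context"
    EVIDENTIAL_PLATFORM = "evidential platform"


# Hypothetical priority labels, paraphrased from the descriptions above.
PRIORITIES = {
    CollectionKind.DEFINITIVE_SOURCE: ["access to primary sources"],
    CollectionKind.INTERPRETIVE_CONTEXT: ["metadata and interpretive context"],
    CollectionKind.EVIDENTIAL_PLATFORM: [
        "derived data",
        "provenance documentation",
        "persistent links to original sources",
    ],
}


def preservation_priorities(kinds):
    """Merge priorities for a collection that combines several kinds.

    Order follows the order of `kinds`; duplicates are dropped.
    """
    merged = []
    for kind in kinds:
        for priority in PRIORITIES[kind]:
            if priority not in merged:
                merged.append(priority)
    return merged


mixed = preservation_priorities(
    [CollectionKind.INTERPRETIVE_CONTEXT, CollectionKind.EVIDENTIAL_PLATFORM]
)
print(mixed)
```

a table like this could only ever be a starting point for discussion; as the text notes, real collections combine these and presumably many other varieties of contribution, so any actual assessment would need to be made per collection.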
( ) discontinuity between the essential interactivity of digital collections and the paradigm of artifactual preservation.

thematic research collections tend to be essentially interactive. user-interactivity, collection performativity, or experientiality are often integral to the purposes and intellectual contribution of the collection. customized browsing functions that exploit scholarly encodings, indexing and navigational schemes that manifest scholarly interpretation, specialized reading and annotation tools, games, interactive maps, three-dimensional models, and simulations—the interactive components of digital collections are often designed to accomplish multiple things at once: to manifest interpretive stances, to enable knowledge transfer, and simultaneously to serve as platforms for ongoing research (palmer et al., ; fenlon, ). therefore, many collections must remain interactive for their contributions to be manifest. collections are intended to be “living” (participant ). for many collections to realize their scholarly purposes, they may not be decomposed into “items,” “objects,” or “raw data,” or reconstructed in a standard content management system.

the interactivity of digital scholarship challenges the prevailing paradigm of artifact-oriented digital preservation. a scholar-centric paradigm of sustainment would prioritize the sustainment of contributions, which may be amorphous, and which may or may not neatly align with preservation-ready outputs. there are some promising solutions to aspects of this problem emergent from software preservation and web archiving research (e.g., rhizome’s webrecorder, https://webrecorder.io/), which begin to confound the distinction between sustainment and preservation. indeed, interview participants in this study tended to conflate the terms sustainability and preservation in light of the essential interactivity of digital collections.
one implication of this challenge is the need for a stronger vocabulary for articulating the contributions of digital scholarship to support determinations about what needs to be kept “alive” (and in what form, and for how long), and what can be effectively fixed in amber. it also seems likely that sustainability itself will mean very different things to different research communities in different contexts, and this needs further research.

( ) the importance and vulnerability of “connective tissue” within and between collections.

digital humanities collections are networked resources with visible and invisible dependencies among components, and with external resources and services. a collection’s contents may be less essential than “connective tissue” among contents (participant ). connective tissue—interrelationships among components and contextual information, often forged through links or calls to external resources and customized schemas and utilities—may constitute the main interpretive or intellectual contributions of a collection, transcending the discrete digital objects that are the ‘items’ of a collection. however, the same connective tissue is highly vulnerable to dissolution precisely because it tends to be invisible, undocumented, or technically bespoke and difficult to migrate. this poses the most immediate technical challenge for both sustainability and preservation.

integral and interstitial components of collections frequently carry important and inexplicit meaning and context. the term relationships is used here to indicate constitutive pieces of collections that are not readily classed as primary or secondary sources or data, including links or calls to external data sources and services; implicit contextual and relational information asserted via narrative and design elements; descriptive and relational schemas and ontologies; and computed components such as information retrieval components, dynamic components, algorithmic components, etc.
fenlon ( ) identified a distinction between direct and indirect relationships undergirding digital scholarship. direct relationships are referential relationships that are formalized and actionable, for example as calls to uris encoded in processing scripts or in files, which serve to interrelate, for example, page images to corresponding encoded transcriptions and relevant external standards and authorities. indirect relationships, on the other hand, are visible and usable in the design of a collection or its web presence (for example, when a webpage juxtaposes a manuscript image with a transcription of the image), but are technically performed by completely unrelated, often computational processes, and are not encoded explicitly in the digital objects comprising the collection. relationships that are inexplicit or forged dynamically through computation are vulnerable to loss during migration and preservation actions, during staffing changes, and in the absence of thorough documentation. indirect relationships within a collection’s architecture may prove essential to the meaning and the contribution of a collection, and they are intuitively more difficult to characterize and document, let alone sustain or preserve. one participant, describing how important semantic and editorial information was located in stylesheets rather than directly in digital objects, noted that, “if those things ever get separated, you’ve lost a huge analytical contribution,” and acknowledged the “tight interconnectedness, the integration of purposes of these two things—the phenomena of the data model and the other, related phenomena of the stylesheet or the computational processes” (participant ). becker ( ) has detailed the metaphorical and computational nature of digital objects, and the challenges for preservation work. 
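as a rough illustration of what makes direct relationships more tractable than indirect ones, the sketch below scans an xml file for attribute values that are http(s) uris and lists them, which is one simple way such encoded references could be inventoried for documentation. everything in it is an assumption made for illustration: the function name `direct_relationships`, the sample markup, and the `example.org` addresses are hypothetical and are not drawn from the study or from any particular collection.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def direct_relationships(xml_text):
    """Return (element tag, attribute name, uri) for each attribute
    whose value parses as an http(s) URI."""
    found = []
    root = ET.fromstring(xml_text)
    for element in root.iter():
        for name, value in element.attrib.items():
            parsed = urlparse(value)
            if parsed.scheme in ("http", "https") and parsed.netloc:
                found.append((element.tag, name, value))
    return found


# Hypothetical markup relating a page image to its encoded transcription.
SAMPLE = """<page>
  <image src="https://example.org/images/page1.jpg"/>
  <transcription ref="https://example.org/tei/page1.xml" lang="en"/>
</page>"""

for tag, attribute, uri in direct_relationships(SAMPLE):
    print(tag, attribute, uri)
```

a scan of this kind can only surface relationships that are explicitly encoded. indirect relationships, which are performed by stylesheets, page layouts, or other computational processes, would not appear in its output, which is part of the preservation challenge at issue here.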
these challenges are amplified when we consider not only aggregated and interrelated objects, often rife with external dependencies, but also objects that are essentially interactive. this challenge seems likely to grow in an era of linked data and increasingly networked digital scholarship.

( ) ambiguity of institutional contexts and roles.

while many digital humanities collections are created, managed, and sustained by communities of use, they may bear a great variety of relationships to institutional libraries, ranging from complete independence to active and formalized partnership. a collection’s institutional context, including factors such as its administrative home within the organization of a university or its proximity to the library, bears heavily on its sustainability, particularly affecting how collection curators are able to plan for or implement maintenance as opposed to innovation or development. this study found that the roles of various entities with a stake in digital scholarship—including scholars, academic departments, libraries and units within libraries, and digital humanities centers and labs—are complex, context-dependent, and subject to ongoing negotiation. roles within the system of scholarly communication at large become systematized and institutionalized only around established, well understood genres, which may help explain why comparatively unfamiliar or nascent forms of digital scholarship have struggled to attain systematic treatment in libraries.

participants were unanimous that libraries have a significant role to play in the sustainment of digital scholarship. most participants reported having had one or more interactions with the library toward the maintenance or preservation of digital humanities collections.
two participants reported that their respective centers had established relationships and standing agreements with the library, which ensure that the library would serve as the “eternal resting place” (participant ) for each digital humanities center’s collections, but in both cases the commitment did not carry a timetable for transfer of responsibility, and was constrained to item-level metadata and limited types of items that would fit readily into the existing institutional repository. determining transfers of responsibility can be a fraught exercise: it is rarely clear when digital projects are “done and ready for the library to migrate and preserve, and sort of embalm, or whether they were things that the scholar might still like to add to” (participant ). another participant, working within a digital humanities center, noted that when a center is physically or administratively located within a library there seems to be an almost unconscious reliance on the surrounding infrastructure to bear the weight of stewardship of collections: “i don’t have to constantly worry about [preservation] because there’s an infrastructure around me that’s thinking about this” (participant ). however, no participants reported having established systematic measures or ongoing processes for collaborating with libraries in sustainability and preservation. in some cases where librarians play active roles in the development and maintenance of digital scholarship, their involvement may not reflect established or sustained administrative or institutional support from the library; it may just reflect the initiative of individual librarians. one participant noted that librarians often enter into digital-scholarship collaborations almost “in spite of or around the edges of their existing roles” (participant ).
another suggested that digital humanities centers can serve as a “focal point for collaboration between librarians and faculty” toward increasing the library’s roles as “a partner in the research enterprise” (participant ). while libraries continue to increase support for digital scholarship and digital publishing, and indeed take increasingly active roles in research and the collaborative construction of thematic research collections and other forms of digital scholarship, it is not always clear how library digital scholarship initiatives are related to collection development and preservation missions of the library. the appropriate and sustainable division of labor for digital collections is of course a heavily context-dependent determination, and one that may be negotiated and renegotiated over time. as mentioned above, there is no consensus around the value of digital scholarship from an institutional perspective, nor a strong understanding of how libraries or preservation institutions should negotiate the ownership of collaborative and distributed projects (maron & pickle, ). this study evinces the need for increased research into context-dependent sustainability strategies, and the many and varying roles to be played by different stakeholders. ( ) lack of infrastructure for collaborative humanities workflows. emergent digital humanities preservation and sustainability strategies are increasingly prospective. libraries seek to make interventions earlier in scholars’ development processes, to help scholars make more sustainable technological and representational choices, and to gather requirements to make sustainability plans.
as an alternative to the pattern of retrospectively migrating digital projects into the care of libraries after their development, there are increasing efforts to develop and implement common preservation-oriented infrastructures that have the flexibility and extensibility to undergird distributed, custom development by individual digital humanities projects. prospective strategies aim to lay sustainable foundations in the form of preservation-oriented data management systems underlying advanced indexing and access layers, as platforms on top of which humanists can build expressive, interpretive, customized digital scholarship (e.g., sweeney et al., ; madsen and hurst, ; almas, ; white et al., ). the success of cyberinfrastructure for the humanities will depend on its capacity to accommodate the wide-ranging human and technical processes or workflows that structure the development and maintenance of collections. indeed, we can understand those workflows as integral to the infrastructure of collections, and therefore of sustainability. the workflows or processes that create and maintain collections (and digital humanities scholarship generally) are idiosyncratic, distributed, and highly collaborative, and this will complicate attempts to establish a shared cyberinfrastructure even within domains of research (fenlon, ). indeed, this study found that beyond maintenance of the technical components of a collection, sustaining a collection may depend on the maintenance of human workflows. one interview participant described needing to alter the course of a whole collection-development workflow—a distributed and collaborative process of digitization, transcription, and ingest—in order to conduct a routine data migration. 
this participant described the difficulty and necessity of implementing changes to a workflow that was well established and distributed across teams at multiple institutions, asserting that alterations to workflow necessarily accompany technical maintenance and may in fact be more complex: “having a conversation about…what the folks working on [the collection] like to do, want to do with it - that was sustainability work - and keeping their workflow intact in some ways, but just fixing some things that maybe weren’t working” (participant ).
toward community-centered sustainability strategies
this study illuminated several institutional strategies for digital humanities scholarship, some of which are well established in library practice, while others are emergent. as described above, the most common, institutionally centered strategies for sustainability and preservation rely on negotiated levels of commitment and, ultimately, handoff of responsibility for the collection from the original creators to a curation institution, often with some loss of fidelity to the collection. this strategy is inevitably inadequate for handling the diversity and scope of digital scholarship, due to the challenges described above: comprehensive collection of digital scholarship would exceed the capacity of most preservation institutions; and there are aspects of digital scholarship that strongly resist common approaches to preservation or shared, scalable curatorial and research infrastructures. this research identified a promising complement or alternative to institutionalized sustainability strategies: reorienting sustainability efforts toward research communities, rather than focusing exclusively on collections themselves.
the notion of community-centered sustainability emerges from two interrelated outcomes of the interviews: ( ) collection sustainability depends on engaging communities of interest, including original creator and user communities, development/maintenance communities, and communities of reuse; and ( ) as described above, maintaining collections may frequently entail maintaining the sociotechnical workflows that structure collaborations within research and development communities. interview participants were unanimous about the critical importance of use to ensuring a collection’s sustainability. one participant observed that stakeholder engagement is more important than any technical intervention: “the bigger concern is not, how do you structure these?…it’s really, how do you create those kinds of community engagements that result in people squawking if the project goes away?” (participant ). the study suggested strong interest among collection stakeholders in the strategy of preparing collections to pivot toward new purposes and therefore new user communities over time. one participant suggested that collections might be documented and structured from the start to support handoffs to new research communities, mirroring patterns of open-source software development. however, this participant also acknowledged significant obstacles, including the lack of support and incentive in digital humanities research for repurposing existing collections rather than developing new ones (participant ). participants also suggested that aggregating thematically related collections might help combine and grow user communities from across disciplines or topical areas. community-centered sustainability strategies revolve around the ongoing growth and development of collections in service to communities, further highlighting the distinction between sustainability and preservation of digital humanities scholarship.
the idea that purposefully and strategically growing and engaging user communities benefits the sustainability of collections is not new. in a study of open data and digital curation practices, lee et al. ( ) argued that the mission of the curator must be extended beyond access-provision to the facilitation of new forms of use and interaction with and among users of data. in addition, post ( ) has explored new models of institutional and community partnership for the preservation of new media art. however, the question of how curators can purposefully grow community engagement with a collection or, alternatively, increase the capacity of collections for use and development by varying communities, remains open and vitally important to the future of humanities data curation. despite a robust literature on humanities scholars’ information practices, ongoing digital curation efforts would benefit from increased understanding of the needs of users of digital humanities scholarship and scholar-generated collections specifically. future work by re-orienting our conception of sustainability toward research communities rather than focusing exclusively on the collections or artifacts created and used by those communities, we open a landscape of possibilities for collaborative sustainment of digital scholarship. community-centered archiving strategies, including community-oriented acquisition and participatory archives, aim to reorient archival practice away from institutional imperatives and toward the well-being and endurance of communities (christen and anderson, ; caswell and cifor, ; gilliland and flinn, ; caswell, ; yoon, ; shilton and srinivasan, ). 
in cultural heritage practice, there are numerous emerging models of institutional partnership with communities, including efforts to:
• create resources such as toolkits, workshops, and community-oriented best practices to support community curation work;
• provision community sustainability efforts through re-granting programs, the reallocation of collection development funds toward community investments, or in-kind resources such as library staff time and consultation;
• establish spaces and practices for building trust and equitable partnership among communities and memory institutions;
• develop a common foundation of principles along with model policies and agreements toward ongoing partnership or shared stewardship.
while many of these developments are happening in the context of community archives theory and practice in cultural institutions, rather than in the realm of digital scholarship and academic libraries, there is significant commonality across community archives and digital humanities collections (centered in research communities), and in the sustainability challenges they face. future work will explore the overlap among and differences between collections centered in different kinds of communities, and the sustainability strategies available to them. the results reported here have laid the groundwork for an ongoing investigation into the sustainability challenges confronting collections more broadly, particularly collections that are created, managed, and sustained primarily by their communities of use, either outside of the purview of memory institutions or in tentative or provisional relationships with memory institutions. future work aims to support and extend this movement toward community-centered sustainability of all kinds of digital collections through case studies of digital humanities collaborations and collections.
the goal of future work is to answer foundational questions confronting next-generation sociotechnical infrastructures for long-lived cultural and scholarly records: on what sustainability means for different communities, different stakeholders within communities, and different collection contexts; on the contributions, purposes, and completeness of different forms of digital scholarship; and around the distinctive and evolving roles of institutions and communities in sustaining cultural records.
references
almas, b. ( ). perseids: experimenting with infrastructure for creating and sharing research data in the digital humanities. data science journal, ( ). https://doi.org/ . /dsj- - caswell, m. ( ). community-centered collecting: finding out what communities want from community archives. proceedings of the american society for information science and technology, ( ), – . https://doi.org/ . /meet. . caswell, m., & cifor, m. ( ). from human rights to feminist ethics: radical empathy in the archives. archivaria, . retrieved from https://archivaria.ca/index.php/archivaria/article/view/ christen, k., & anderson, j. ( ). toward slow archives. archival science, ( ), – . https://doi.org/ . /s - - -x clement, t., hagenmaier, w., & knies, j. l. ( ). toward a notion of the archive of the future: impressions of practice by librarians, archivists, and digital humanities scholars. the library quarterly, ( ), – . https://doi.org/ . / cooper, d., & rieger, o. y. ( ). scholars are collectors: a proposal for re-thinking research support [issue brief]. retrieved from ithaka s+r website: https://doi.org/ . /sr. dombrowski, q. ( ). what ever happened to project bamboo? literary and linguistic computing, ( ), – . https://doi.org/ . /llc/fqu eschenfelder, k. r., shankar, k., williams, r., lanham, a., salo, d., & zhang, m. ( ).
what are we talking about when we talk about sustainability of digital archives, repositories and libraries? proceedings of the th asis&t annual meeting: creating knowledge, enhancing lives through information & technology, : – : . retrieved from http://dl.acm.org/citation.cfm?id= . fenlon, k. ( ). thematic research collections: libraries and the evolution of alternative scholarly publishing in the humanities (doctoral dissertation, university of illinois at urbana-champaign). retrieved from http://hdl.handle.net/ / fenlon, k. ( ). modeling digital humanities collections as research objects. in proceedings of the acm/ieee joint conference on digital libraries, urbana-champaign. flanders, j. ( ). rethinking collections. in p. l. arthur & k. bode (eds.), advancing digital humanities (pp. – ). https://doi.org/ . / _ gilliland, a., & flinn, a. ( ). community archives: what are we really talking about? presented at the cirn prato community informatics conference . retrieved from https://www.monash.edu/__data/assets/pdf_file/ / /gilliland_flinn_keynote.pd f goddard, l., & walde, c. ( ). negotiating sustainability: the grant services “menu” at uvic libraries. presented at the digital humanities (dh ). retrieved from https://dh .adho.org/abstracts/ / .pdf green, h. e., & courtney, a. ( ). beyond the scanned image: a needs assessment of scholarly users of digital collections. college and research libraries, ( ), - . retrieved from http://crl.acrl.org/content/early/ / / /crl - .full.pdf langmead, a., quigley, a., gunn, c., hakimi, j., & decker, l. ( ). sustaining medart: the impact of socio-technical factors on digital preservation strategies (report of research funded by the national endowment for the humanities, division of preservation and access no. pr- - ). retrieved from https://sites.haa.pitt.edu/sustainabilityroadmap/wp- content/uploads/sites/ / / /sustainingmedart_finalreport_web.pdf lee, c., allard, s., mcgovern, n., & bishop, a. ( ). 
open data meets digital curation: an investigation of practices and needs. international journal of digital curation, ( ). retrieved from https://doi.org/ . /ijdc.v i . madsen, c., & hurst, m. ( ). are digital humanities projects sustainable? a proposed service model for a dh infrastructure. presented at the coalition for networked information fall membership meeting (cni f), washington, d.c. retrieved from https://www.slideshare.net/mccarthymadsen/are-digital-humanities-projects-sustainable-a-proposed-service-model-for-a-dh-infrastructure maron, n. l., & pickle, s. ( ). sustaining the digital humanities: host institution support beyond the start-up phase [research report]. retrieved from ithaka s+r website: https://sr.ithaka.org/publications/sustaining-the-digital-humanities/ oltmanns, e., hasler, t., peters-kottig, w., & kuper, h.-g. ( ). different preservation levels: the case of scholarly digital editions. data science journal, ( ), . https://doi.org/ . /dsj- - palmer, c. ( ). thematic research collections. in a companion to digital humanities. blackwell publishing. retrieved from http://digitalhumanities.org: /companion/view?docid=blackwell/ / .xml&chunk.id=ss - - &toc.depth= &toc.id=ss - - &brand= _brand palmer, c. l., teffeau, l. c., & pirmann, c. m. ( ). scholarly information practices in the online environment: themes from the literature and implications for library service development. retrieved from oclc research and programs website: http://www.oclc.org/content/dam/research/publications/library/ / - .pdf post, c. ( ). preservation practices of new media artists: challenges, strategies, and attitudes in the personal management of artworks. journal of documentation, ( ), – . https://doi.org/ . /jd- - - prescott, a. ( ). beyond the digital humanities center. in s. schreibman, r. siemens, & j. unsworth (eds.), a new companion to digital humanities (pp. – ). retrieved from http://onlinelibrary.wiley.com/doi/ .
/ .ch /summary rawson, k., & muñoz, t. ( ). against cleaning. in m. k. gold & l. f. klein (eds.), debates in the digital humanities. retrieved from https://dhdebates.gc.cuny.edu/read/untitled- f acf c-a - d -be - f ac e a /section/ de - - e- c - a a f e shilton, k., & srinivasan, r. ( ). participatory appraisal and arrangement for multicultural archival collections. archivaria, ( ), – . smithies, j., westling, c., sichani, a.-m., mellen, p., & ciula, a. ( ). managing digital humanities projects: digital scholarship & archiving in king’s digital lab. digital humanities quarterly, ( ). smithsonian center for folklife and cultural heritage. ( ). shared stewardship of collections policy. retrieved october , , from https://folklife-media.si.edu/docs/folklife/shared-stewardship.pdf sustaining digital scholarship. ( ). sds final report. retrieved from institute for advanced technology in the humanities website: https://dcs.library.virginia.edu/files/ / /sds_finalreport .pdf sweeney, s. j., flanders, j., & levesque, a. ( ). community-enhanced repository for engaged scholarship: a case study on supporting digital humanities research. college & undergraduate libraries, ( – ), – . https://doi.org/ . / . . vinopal, j., & mccormick, m. ( ). supporting digital scholarship in research libraries: scalability and sustainability. journal of library administration, ( ), – . https://doi.org/ . / . . white, l., feldman, k., kosnik, a. d., gere, a., naglak, m., rathemacher, a., & cohen, s. ( , october). sustainable publishing for digital scholarship in the humanities. technical services faculty presentation presented at the charleston conference webinar series. retrieved from https://digitalcommons.uri.edu/lib_ts_presentations/ yoon, a. ( ). defining what matters when preserving web-based personal digital collections: listening to bloggers. international journal of digital curation, ( ).
retrieved from http://www.ijdc.net/article/view/ . . session proposal worlds collide: a repository based on technical and archival collaboration erin o'meara electronic records archivist, university of north carolina at chapel hill gregory jansen repository developer, university of north carolina at chapel hill the failure of many institutional repositories (ir) to acquire large sets of faculty publications has shown that the traditional ir model is not sustainable without a shift in academic publishing. the carolina digital repository (cdr) aims to be more than a traditional ir and instead of focusing primarily on open access publishing, it will acquire, preserve and make accessible a range of at-risk scholarly output, such as datasets, faculty papers, university records and other faculty research projects. research libraries have begun to focus on their special collections as core to their mission and a key contribution to the wider world of scholarship. we extend this emphasis on collecting unique materials to include the new ir as a strategic direction. repository implementations suffer from a disconnect between technologists and digital curators in terminology and approach. this barrier has inhibited the development of comprehensive solutions that meet technical needs and deeply incorporate curatorial requirements. overcoming this barrier demands work to establish common language and mutual understanding of concepts from each discipline. the key output of academic institutions, besides the conferring of degrees, is the development of new knowledge. through preservation a repository supplies a key missing ingredient in the life cycle of knowledge production, a persistent point of access that fills the growing gulf between modern modes of digital scholarship and traditional publishing. the new ir model addresses the institution's role in providing stewardship of these assets. 
the carolina digital repository (cdr) is an institution-wide initiative at university of north carolina at chapel hill (unc-ch), spearheaded by university libraries. it is a preservation framework for material in electronic formats produced by members of the unc-ch community. following standards developed in the oais reference model, the cdr employs fedora for data content models and uses irods as a data store. in this paper we will explain how the repository works as a whole to meet these challenges. we will show how our repository was made possible through a sustained commitment of institutional resources and the deep involvement of motivated experts with archival, librarianship and software development backgrounds. we will trace how the functional needs for preservation activities were brought into the project and satisfied without sacrificing flexible access and ingest work flow. we will explain how we blended and extended the models set forth in open archival information system (oais), fedora commons and the integrated rule-oriented data system (irods) to meet these requirements without creating a muddle of concerns in code and subsystems. lastly, we will explore the stable measures taken and the ongoing project to create data management and object integrity safeguards used to mitigate risk of data corruption and loss. a key recommendation from the clir report for reconceiving research libraries for the st century is: collaboration should under-gird all strategic developments of the university, especially at the service function level. greater collaboration among librarians, information technology specialists, and faculty on research project design and execution should be strongly supported. (clir , ) the cdr has been a collaboration of four partners, university libraries, information technology services, the school of information and library science and the data intensive cyber environments (dice) research group.
from conception to deployment, on committees, the project team and working groups, the initiative incorporated expertise from all of these disciplines. the cdr's design is guided by archival theory that includes principles of authenticity and trust. oais provided a conceptual model that informed the design of system layers and components. certification tools, such as trustworthy repository audit checklist (trac), supplied functional requirements. the resulting implementation is a distinct best-of-breed software architecture aimed at teasing apart the concerns of ingest work flow, durable access and data preservation. these worlds met as we adopted and adapted numerous preservation standards, including oais, mets, mods and premis. the cdr is set to go live in april . we will have several pilot collections populating the repository that span disciplines and collecting efforts. our goals for the first year of deployment include an enhanced user interface with greater search and access control, enhanced metadata and curation of objects, and audit of current preservation activities within the repository.
references
council on library and information resources. no brief candle: reconceiving research libraries for the st century. august .
white paper
report id:
application number: hd- -
project director: laura wexler
institution: yale university
reporting period: / / - / /
report due: / /
date submitted: / /
cover page
type: white paper
grant number: hd- -
title of project: photogrammar
project director: laura wexler
grantee institution: yale university
date submitted: - -
narrative description
project activities
photogrammar - a web-based platform that makes it easy for users to organize, search, and visualize over , photographs from to - demonstrates how historical archives, particularly archives of visual culture, can be digitally reimagined to increase scholarship, visibility and usability. with photogrammar, users can now follow photographers’ paths across the country and retrace their steps over time, explore the collection’s historical archival systems, and organize the photographs in new and informative ways. physically housed in the library of congress, the photographic archive, which was commissioned by the united states farm security administration and office of war information (fsa-owi) to document american life, serves as an important visual record for scholars and the public-at-large. in the public domain, the collection contains over , monochrome and color photographs and offers a unique snapshot of the nation during the great depression and world war ii. prior to photogrammar, users interested in the archive contended with limited filtering options; they either had to wade through the massive print collection or they could use the online site the library of congress provides, which only allows for basic searching due to government restrictions. when building photogrammar, the team worked closely with the collection’s curator to develop creative new approaches to the collection using computational methods.
the innovation of photogrammar is in augmenting, re-framing and re-visualizing a substantial public archive from the library of congress, in turn allowing new and unprecedented access and research. by standardizing the archive’s metadata, the photogrammar team increased entry points into the historical collection. extensive metadata accompanies a majority of the photographs, capturing such features as dates, captions, locations, photographers’ names, and a uniquely identifying call number. however, the metadata was inconsistently recorded; dates, for instance, were written in a number of styles (“aug ”, “aug. ”, “august ”, “summer ”). the photogrammar team cleaned and transformed the metadata on a case-by-case basis, standardizing the fields. in the process, the team discovered latent information that allowed them to add metadata to % of the photographs. fsa-owi photographers sent their mm film to be processed in washington d.c. the staff would then cut the film into strips of - frames to print the images. this film structure was formerly hidden as the suffix in the reproduction number field associated with each image in the library of congress’s digital catalogue. in a novel approach, the photogrammar team reconstructed the strips of film, allowing attributes to be algorithmically assigned to the photos (fig ). for example, when the first and last photographs in a strip share the same photographer, date, and location, we know that the three shots in between have these same attributes. as a result of the standardization and new metadata, users can now map photographs over time (fig ), run faceted searches (fig ), and see the order in which photographs were taken.
figure . reconstructed strip from the work of john vachon (chicago, ).
figure . map by county in vs. map of photos from -
figure . faceted searching.
the mapping interface photogrammar’s team designed facilitates new discoveries about the fsa-owi collection.
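The film-strip inference described above (interior frames inherit attributes on which the first and last frames of a strip agree) can be illustrated with a minimal Python sketch. This is not the project's code: the `Frame` record, its field names, and `fill_strip` are hypothetical names introduced here, and the sketch assumes each photograph has already been assigned to a strip and given a position within it.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Hypothetical record for one photograph; field names are assumptions.
@dataclass(frozen=True)
class Frame:
    strip_id: str
    position: int                      # order of the frame within its strip
    photographer: Optional[str] = None
    date: Optional[str] = None
    location: Optional[str] = None

FIELDS = ("photographer", "date", "location")

def fill_strip(frames: List[Frame]) -> List[Frame]:
    """Fill missing fields from values shared by the first and last frames.

    A field is propagated only when both endpoint frames carry the same
    non-empty value; values already recorded on a frame are never overwritten.
    """
    ordered = sorted(frames, key=lambda f: f.position)
    if len(ordered) < 3:
        return ordered                 # no interior frames to fill
    first, last = ordered[0], ordered[-1]
    shared = {
        name: getattr(first, name)
        for name in FIELDS
        if getattr(first, name) is not None
        and getattr(first, name) == getattr(last, name)
    }
    filled = []
    for frame in ordered:
        missing = {k: v for k, v in shared.items() if getattr(frame, k) is None}
        filled.append(replace(frame, **missing))
    return filled
```

Only missing values are filled, so a frame that already records a different location keeps it; whether the project resolved such conflicts this way is not stated in the report.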
over half of the photographs were tagged with county geographic information. the photogrammar team added longitude and latitude coordinates, added a time slider and mapped the photographs over historic county boundaries. the result is a new understanding of the collection. the popular and scholarly characterization of the archive hitherto argued that the photographs chiefly depict rural poverty in the american south and the dust bowl. photogrammar’s map challenges this characterization by showing the national breadth of the archive, which was apparent virtually immediately and increased over time. (fig ). beyond looking at the collection at scale, there is a wealth of information that can be learned and questions that can be posed about each photograph or sequence of photographs by putting them in context through use of this interactive map. figure . national reach of the fsa-owi collection photogrammar also makes the archival system that organized the physical archive at the library of congress available digitally. at the library of congress, filing cabinets contain approximately , printed photographs from the collection. in , paul vanderbilt developed two pioneering archival systems for cataloging the photographic prints: a classification system and a lot system. for the classification system, he assigned each photograph to a three-tier hierarchy based on the photograph's themes or subjects, a process that today would commonly be referred to as tagging. with the lot system, he put photographs into groups based primarily on the photographer’s shooting assignment. vanderbilt’s archival systems persist today as the method for organizing the photographs at the library of congress: the classification system is used for managing prints, while the lot system is used for the microfilm copies of the photographs. 
vanderbilt's two systems serve as an object of study themselves via photogrammar and are an exciting way to explore the archive and thoughts about it from the early s. digitally reconstructing the categorization scheme greatly simplifies the process of analyzing vanderbilt's cataloging project; it was previously possible only when visiting the physical fsa-owi archive and in that case required substantial effort to look at several categories at once (fig and ).
figure . vanderbilt’s classification and lot number systems are reconstructed and searchable in photogrammar.
figure . interactive visualization of vanderbilt’s archive system.
figure . the similar photos feature is created using term frequency-inverse document frequency.
photogrammar also created three new ways to search the collection using text analysis and image analysis. the first uses term frequency-inverse document frequency to identify captions that are most similar. it allows users to search photographs by shared semantic information, finding patterns of similarity and difference across the entire archive that would be virtually impossible to establish otherwise (fig. ). the other two search methods use image analysis. the first uses color analysis to search the color photographs in the collection by frequency, hue, saturation and brightness (fig. ). the second uses facial analysis to identify the faces in individual photographs (fig ). the two searches based on image analysis and facial detection will be released this spring once the user experience is tested.
figure . color-based search of the color photographs in the fsa-owi.
figure . interactive search of the photographs by faces.
accomplishments and audiences
to measure impact, photogrammar maintains analytics. since the site’s launch in october , almost , users visited the site with over million pageviews. photogrammar has a global reach with users from over countries and continents.
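As a brief technical aside on the caption-similarity search described earlier in this section, the following Python sketch shows one common way term frequency-inverse document frequency can rank captions by similarity. It is an illustration under stated assumptions, not the project's implementation: the tokenizer, the smoothed idf formula, the use of cosine similarity, and all function names are choices made here, and the report does not say which variants photogrammar used.

```python
import math
import re
from collections import Counter
from typing import Dict, List

def tokenize(text: str) -> List[str]:
    """Lowercase the caption and keep alphabetic tokens only (an assumption)."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(captions: List[str]) -> List[Dict[str, float]]:
    """Build one sparse tf-idf vector per caption."""
    docs = [Counter(tokenize(c)) for c in captions]
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())    # count captions containing each term
    # Smoothed idf, one of several common variants.
    idf = {t: math.log((1 + n) / (1 + k)) + 1.0 for t, k in doc_freq.items()}
    return [{t: count * idf[t] for t, count in doc.items()} for doc in docs]

def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(captions: List[str], index: int, top_k: int = 3) -> List[int]:
    """Return indices of the top_k captions most similar to captions[index]."""
    vectors = tfidf_vectors(captions)
    scored = [
        (cosine(vectors[index], vec), j)
        for j, vec in enumerate(vectors)
        if j != index
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [j for _, j in scored[:top_k]]
```

Terms that appear in few captions receive higher weights, so two captions that share a rare word score as more similar than two that share only common words.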
citing its innovative approach to visual culture, slate named it one of “five of ’s most compelling digital history exhibits and archives,” and it was featured in national media such as npr’s morning edition and bbc world news. acclaim has also come from museums and universities, with invited talks at institutions such as columbia university, pitzer college, the university of pennsylvania, and rice university. recently, the museum of modern art in new york city used photogrammar as a model for the digital component of the exhibition object: photo. responding to users, who have sent hundreds of emails offering new information about the photographs, the team is now developing a crowdsourcing component for the project’s next phase. we have also made a lot of headway in publicizing the project to a wide academic audience. specific presentations, including future scheduled talks, include:
• southern american studies association, february
• new england american studies association, november
• modern language association annual conference, january
• american historical association annual conference, january
• policy history conference, june
• new england american studies association, october
• american studies association, november
• international center for photography, april
• smithsonian archives of american art, november
• american studies association, november
• organization of american historians, april
• digital humanities , july
• american studies association, november
• museum of modern art, december
• north eastern public humanities consortium, may
• national world war ii museum, may
• pitzer college, september,
• american historical association, january
• university of pennsylvania, march,
• rice university, march
• rutgers university, april
• american statistical association, july
for an expansive list of photogrammar citations in the press and public media, see appendix a.
classroom use of photogrammar is robust and growing, in both k- and post-secondary environments. audiences have also been very interested in the specific kinds of information made accessible by the project, such as the expanse of the country that was covered by the photographers and the specific paths some of them took on their journeys. they have been interested as well in the applicability of the photogrammar techniques for re-envisioning other large sets of documentary photographs. as the fsa photographs are very well known in a general kind of way, people have been fascinated not only to see how new kinds of information can be extracted from the canonical images but also to see new images beyond the usual iconic subset. already we can tell that the photogrammar project will generate information that will allow scholars to challenge common preconceptions of the fsa archive, such as that it was overwhelmingly concerned with rural life, or that it produced very little photography in northern states. we predict that it will also allow connections to be made more evident between the fsa and the owi periods. as very little scholarship has been done on the owi work, this is an interesting new facet. and finally, the smithsonian institution has approached us with an offer of partnership, currently under consideration, between photogrammar and the s. i.’s public sculpture archive. we speculate that should this proposal develop further, exciting new aspects of mapping large cultural heritage archives along the photogrammar model will emerge. members of the photogrammar team are planning to teach a new course at yale university in - about public sculpture, in order to fill in scholarship and context needed for such a plan. since the start of the granting period, the photogrammar project group has met together virtually every week to discuss techniques and strategies. by now our disparate talents have been very successfully melded into a true team effort. 
indeed, the team itself is one of the achievements of the project so far: we have become keenly aware of how much we have been able to do that would not have been possible without the skills of each team member.

for the code related to photogrammar, see https://github.com/statsmaths/photogrammar.

continuation & long term impact

the photogrammar team is actively maintaining the current infrastructure of the public-facing website. we have also applied for a second round of funding to enhance two areas of photogrammar. in this proposed extension, photogrammar will create links across archives in order to place the fsa-owi in the larger federal effort to document america during the great depression. we will incorporate the federal writers' project (fwp), which recorded the lived experience of americans, and bring together over , life histories from the university of north carolina libraries (unc) and the library of congress (loc). interviews will be plotted on a new geographical layer allowing search by space and time. users will be able to search the new layer independently or along with the geographical layer of fsa-owi photographs. the new and cleaned transcripts created over the next year will also allow for refined search functionality including faceted browsing and full text search. users will be able to explore the fwp and fsa-owi spatially, temporally, and through faceted searching, allowing the public to explore the broader documentary record of the era relationally.

photogrammar will also expand and deepen knowledge about the work of individual photographers. the two proposed additions are adding audio files of photographer oral histories that were conducted by the smithsonian's archives of american art (aaa), and rebuilding and making available the photographers' film rolls, a reconstruction made possible by photogrammar’s innovations.
these additions will augment photogrammar's current faceted searching by date and photographer, a feature that already allows users to track individual photographers on assignment across the country. the new combination of interactive resources will allow users to hear, and read transcripts of, what the photographers themselves thought about their experience with the fsa-owi, as well as to see the photographers' rolls of film in shooting order, revealing individual photographers’ ways of seeing and capturing through the camera. the expansions will offer source materials for some of the most famous documentarians of the th century, including dorothea lange and walker evans, as well as new insights into some of the lesser-known photographers.

the proposed extensions of the project will broaden photogrammar's reach and facilitate the use of digital scholarship in research and teaching. by linking disparate archives held by the aaa, loc, and unc, photogrammar will promote new, collaborative avenues for inquiry across the academy and in public venues. scholars working on the lives of individual photographers can move seamlessly between their images, their photographic journeys and their narrative accounts. studies on the history of particular cities or states can easily juxtapose photographic evidence and personal accounts from both the people who lived there and the photographers who documented them. scholars will have new access to a large, curated humanities dataset. the federal government's role in recording american life during the great depression and world war ii can be analyzed for large-scale patterns, including questions of demographic representation in the fsa-owi (photographic) and the fwp (written) archives.

in addition to increasing the visibility of and access to the aaa, loc, and unc archives, photogrammar (in its extended form) will also be of interest to a diverse public audience.
visits to photogrammar's website and wide coverage in popular press publications indicate interest from a large and international audience spanning a variety of age groups and backgrounds. the combined collections, coupled with the faceted search and customized visualization options, will offer scholars and the public an easy-to-use digital platform for further exploration of, research on, and teaching about the documentary expression of the s and s.

grant products

the primary grant product is the public website, accessible at http://photogrammar.yale.edu. we have also drafted a paper entitled “uncovering latent metadata in the fsa-owi photographic archive”, which has been accepted, contingent on minor revisions, for publication in an issue of the peer-reviewed journal digital humanities quarterly (dhq).

accomplishments

in our original grant proposal submission, february, , we hoped to complete the project in three phases, as follows:

“phase i: the initial phase of the project aims to deliver a working website which will have a minimal working version of all the core aspects of the expected final website. to accomplish this, we will acquire historical maps, georeference the digital maps, build a core lamp server with wordpress management system, develop a graphical interface with processing and javascript (http://processingjs.org), and create the sites textual content (i.e. about page, introductory text). expected to start october and conclude by the end of october .

phase ii: during the second phase, we will introduce the beta version of the website to a limited audience which will include the classrooms of advisory board members and conference workshop groups. in parallel, we will continue to improve the structure of the website. in particular, we will allow for user generated content and visualization of user generated content. additionally, the photogrammar team will start the process of writing up academic reports pertaining to the project.
by the end of phase ii we expect to have a version of the website which is ready to go “live” to a general audience. expected to start september and conclude by the end of may .

phase iii: in the final phase of the project, we will concentrate on promoting the website to a wide audience. this will take the form of both speaking at conferences to promote the site as well as submitting academic papers which both use and discuss the website’s contributions to the digital humanities. we will also continue in this phase to address our new users’ suggestions and concerns both methodologically and technically. expected to start may . it will technically conclude with the grant by the end of september , although yale it will continue to host the project in perpetuity and the team members expect to continue working on it and other extensions in the future.”

looking back, we can say that we met and exceeded our goals for the website itself and for our associated scholarly contributions and public speaking, but fell somewhat short on collaborating on teaching with our board of advisors. we lost one core team member along the way, ken panko, who moved to singapore, but managed to hold onto another, stacey maples, despite his move to stanford. we also added peter leonard, librarian for digital humanities research, and trip kirkpatrick, senior instructional technologist, both at yale university, to the core team. we requested and received three no-cost extensions, ending the grant period on november , . we came in at the original budgeted cost, although the awarded amount was supplemented significantly by in-kind contributions of resources from yale and donated time of the entire team. we are enormously grateful to the neh for the opportunity to make photogrammar, and we are undiminished in our enthusiasm for continuing to build and explore the site.

appendix a:

, ambassador, , dick greenberg northern colorado residential real estate new paradigm partners llc s.
lemay ave fort collins, and co blog my profile home search o: - m: - . “the grab bag - photogrammar.” activerain. accessed october , . http://activerain.com/blogsview/ /the-grab-bag-photogrammar. “ pic.jp - 万枚の貴重な ~ 年当時のリアルな風景がわかる写真を好きなだけ見て 探せる「photogrammar」.” accessed september , . http:// picweb.csdsol.com/detail.html?id=m_ _ _ . , rashid faridi on october, and said. “photogrammar.” the dreams of cities. accessed october , . http://thedreamsofcities.wordpress.com/ / / /photogrammar/. “ , iconic pictures of depression-era america released by yale (photos).” rt english. accessed october , . https://www.rt.com/usa/ -depression-yale-photos-america/. “ . απίστευτες φωτογραφίες της ζωής στις ΗΠΑ τις δεκαετίες ’ και ’ .” palo.gr. accessed october , . http://www.palo.gr/pagosmia-nea/ - -apisteytes-fwtografies-tis-zwis- stis-ipa-tis-dekaeties- -kai- / /. “ ~ 年のアメリカの写真が沢山みれるwebサービス 『photogrammar』 | pcあれこれ探 索.” accessed september , . http://pc.mogeringo.com/archives/ . aahds symposium: welcome, & “re-visioning the archive,” . http://www.youtube.com/watch?v= yciam- wuw&feature=youtube_gdata_player. “ableton forum • view topic - photogrammar.” accessed september , . https://forum.ableton.com/viewtopic.php?f= &t= . “accessing historic images — like those of john vachon — is much easier now, thanks to yale photogrammar.” minnpost. accessed september , . http://www.minnpost.com/stroll/ / /accessing-historic-images-those-john-vachon-much- easier-now-thanks-yale-photogrammar. account, ministère culturecom verified. “#photogrammar : . photos de la grande #crise économique de aux États-unis @yale > http://photogrammar.yale.edu/ pic.twitter.com/rtk scu qy.” microblog. @ministerecc, october , . https://twitter.com/ministerecc/status/ . american art history and digital scholarship: new avenues for exploration, . http://www.youtube.com/watch?v=rm zzzshsme&feature=youtube_gdata_player. 
“announcing new start-up grant awards (july ) | national endowment for the humanities.” accessed february , . http://www.neh.gov/divisions/odh/grant-news/announcing- -new- start-grant-awards-july- . “awesome archives: photogrammar is a web-based platform for...” accessed september , . http://awesomearchives.tumblr.com/post/ /photogrammar-is-a-web-based-platform- for. boboltz, sara. “ photos that show just how badass american women were during wwii.” huffington post, september , . http://www.huffingtonpost.com/ / / /women-jobs- wwii-photos_n_ .html. boldenow, amanda. “photogrammar - curated vintage photographs grouped by geography/county (yale.edu).” pinterest. accessed september , . https://www.pinterest.com/pin/ /. brooke, eliza. “government photos take new form online.” yale daily news, september , . http://yaledailynews.com/blog/ / / /government-photos-take-new-form-online/. “cool website .... photogrammer.” accessed october , . http://www.okctalk.com/current- events-open-topic/ -cool-website-photogrammer.html. “delfi foto > lietotāja albums > lielā depresija, asv.” accessed october , . http://foto.delfi.lv/album/ /. department, wired photo. “this week in photography: flickr photobooks, the nick cage insta- bot, and google glass.” wired, november , . http://www.wired.com/ / /twip- /. “depression-era photos make a mark in american photography.” accessed september , . http://radio.wpsu.org/post/depression-era-photos-make-mark-american-photography. “depression-era photos make a mark in american photography : npr.” accessed september , . http://www.npr.org/ / / / /depression-era-photos-make-a-mark-in- american-photography. “ecrater.com :: view topic - photogrammar.” accessed october , . http://community.ecrater.com/viewtopic.php?p= &sid= d bfb d c dd dd . estes, adam clark. “yale just released , incredible photos of depression-era america.” gizmodo. accessed october , . http://gizmodo.com/yale-just-released- - -incredible- photos-of-depress- . ———. 
“yale showcases , incredible photos of depression-era america.” gizmodo. accessed october , . http://gizmodo.com/yale-just-released- - -incredible-photos- of-depress- . evans, hayley. “yale university releases , incredible photos of the great depression.” beautiful/decay. accessed october , . http://beautifuldecay.com/ / / /yale- university-releases- -incredible-photos-great-depression/. “explore depression era photographs with the yale photogrammar | novel technology.” accessed september , . http://carlispina.wordpress.com/ / / /yale-photogrammar/. “fascinating depression-era food images from yale university’s photogrammar archive. #fwx - scoopnest.com.” scoopnest. accessed october , . http://www.scoopnest.com/user/foodandwine/ . gallegos, emma g. “photos: yale releases nearly , pictures of d.c. during the depression.” dcist. accessed october , . http://dcist.com/ / /photos_depression_era_photos_in_dc.php. “ganz großes kino: amerikabilder bei photogrammar | die welt der medizinischen blogs.” accessed september , . http://www.medicalblogs.de/ / /ganz-groses-kino- amerikabilder-bei-photogrammar/. “ganz großes kino: amerikabilder bei photogrammar › denkmale › scilogs - wissenschaftsblogs.” accessed september , . http://www.scilogs.de/denkmale/ganz-grosses-kino- amerikabilder-bei-photogrammar/. “ganz großes kino: amerikabilder bei photogrammar « science.newzs . de.” accessed september , . http://science.newzs.de/ / / /ganz-grosses-kino-amerikabilder-bei- photogrammar/. geneva project. accessed july , . https://vimeo.com/ . “get lost in this map of , photos from depression-era america.” gizmodo. accessed september , . http://gizmodo.com/get-lost-in-this-map-of- - -photos-from- depression- . gilmore, ruth wilson. “fatal couplings of power and difference: notes on racism and geography.” professional geographer , no. (february ): . “gray-card: http://photogrammar.yale.edu/map/ ... - scientia rex.” accessed october , . 
http://scientia-rex.tumblr.com/post/ /gray-card-httpphotogrammaryaleedumap. groetzinger, kate. “yale has released , government photos of the great depression.” quartz. accessed october , . http://qz.com/ /yale-just-released- -government- photos-of-the-great-depression/. “historic photos of belle glade revealed on yale’s new photogrammar site | quick pulse.” accessed september , . http://quickpulse.blog.palmbeachpost.com/ / / /historic- photos-of-belle-glade-revealed-on-yales-new-photogrammar-site/. “honderdduizenden foto’s van crisisjaren vs vrijgegeven.” ad. accessed october , . http://www.ad.nl/ad/nl/ /buitenland/article/detail/ / / / /honderdduizenden- foto-s-van-crisisjaren-vs-vrijgegeven.dhtml. “http://photogrammar.yale.edu.” accessed october , . http://seenthis.net/sites/ . “http://www.petaluma .com/gallery/ - /story.html.” petaluma argus courier. accessed october , . http://www.petaluma .com/gallery/ - /story.html. “impressive! yale university launches photogrammar platform, search and visualize more than , historic images from to | lj infodocket.” accessed september , . http://www.infodocket.com/ / / /nice-yale-university-launches-photogrammar-platform- search-and-visualize-more-than- -historic-images/. independent, the post. “not everything on the internet is junk | postindependent.com.” the post independent. accessed february , . http://www.postindependent.com/news/ - /not-everything-on-the-internet-is-junk. “instagram photos for tag #photogrammar | iconosquare.” accessed september , . http://iconosquare.com/tag/photogrammar. “interactive historical photo collection launched | best education news.” accessed september , . http://www.besteducationnews.com/interactive-historical-photo-collection-launched.html. “interactive historical photo collection launched | yale daily news.” accessed september , . http://yaledailynews.com/blog/ / / /interactive-historical-photo-collection-launched/. jalna. “photos by jalna: photogrammar.” photos by jalna, september , . 
http://jalna.blogspot.com/ / /photogrammar.html. “jenny’s art, design and architecture blog: photogrammar: archive of , photographs documenting the great depression.” accessed september , . http://artdesarc.blogspot.com/ / /photogrammar-archive-of- .html. jmfelli. “from the scout report: photogrammar treasure trove of primary source materials | schurz library news.” accessed october , . https://www.iusb.edu/library/blog/?p= . julienne, stéphan. “photogrammar, l’album photo interactif de la grande dépression américaine.” photo memory. accessed october , . http://www.photo-memory.eu/photogrammar-album- photo-interactif-grande-depression-americaine/. lace-evans, olivia. picturing america’s great depression. accessed october , . http://www.bbc.com/news/magazine- . “l’america povera, tra il e il | linkiesta.it.” accessed september , . http://www.linkiesta.it/archivio-fotografico-stati-uniti- - -photogrammar. long, heather. “yale releases , incredible great depression photos.” cnnmoney, october , . http://money.cnn.com/gallery/news/economy/ / / /great-depression- - photos-yale/. malomil. “malomil: ainda o photogrammar (desculpem, é viciante).” malomil, sábado, de setembro de . http://malomil.blogspot.com/ / /ainda-o-photogrammar-desculpem- e.html. martelle, scott. “the past makes a comeback -- in searchable words and pictures.” latimes.com, october , . http://www.latimes.com/opinion/opinion-la/la-ol-library-of-congress- newspapers-photographs- -story.html. meier, allison. “library of congress photographs mapped into an interactive atlas of the great depression.” accessed september , . http://hyperallergic.com/ /library-of- congress-photographs-mapped-into-an-interactive-atlas-of-the-great-depression/. meletzis, menelaos. “photogrammar: . eikoneΣ ΑΜΕΡΙΚΑΝΙΚΗΣ ΙΣΤΟΡΙΑΣ!” photonet: to µοναδικό µηνιαίο φωτογραφικό περιοδικό! accessed october , . http://www.nexusmedia.gr/photogrammar/. 
“muck rack - journalists comments on: photogrammar from photogrammar.yale.edu.” accessed september , . http://muckrack.com/link/osi h/photogrammar. naotoj. “photogrammar - hole in the wall.” http://naotoj.hatenablog.com/. accessed september , . http://naotoj.hatenablog.com/entry/ / / / _ . neh digital humanities lightning round part , . http://www.youtube.com/watch?v=ce -lktve #t= m s. “new deal project map | living new deal.” accessed november , . http://livingnewdeal.org/map/. new directions in digital scholarship, spring , . http://www.youtube.com/watch?v=rlfibjbcnmc&feature=youtube_gdata_player. “oh, how i love the story old photos tell.” pinterest. accessed march , . https://www.pinterest.com/pin/ /. “online platform photogrammar serves as search engine for library of congress photographs » mass appeal.” accessed september , . http://massappeal.com/multi/online-platform- photogrammar-provides-search-engine-for-library-of-congress-photographs/. “photogrammar.” accessed october , . http://www.uglyhedgehog.com/t- - .html. “photogrammar.” accessed october , . http://www.mymodernmet.com/profiles/blogs/list/tag/photogrammar. “photogrammar.” texags. accessed october , . http://texags.com/forums/ /topics/ . “photogrammar.” the darkroom: exploring visual journalism from the baltimore sun. accessed october , . http://darkroom.baltimoresun.com/ / /baltimore-street-photographer- great-depression-era-edition/. “photogrammar.” phiffer.org. accessed october , . http://phiffer.org/links/photogrammar/. “photogrammar.” ash ryan beats. accessed march , . http://ashryanbeats.com/photogrammar/. “photogrammar.” the trendy things. accessed january , . https://thetrendythings.com/read/ . “photogrammar.” accessed october , . http://dirtdirectory.org/resources/photogrammar. “photogrammar –.” accessed september , . http://corinneand.com/photogrammar/?utm_source=rss&utm_medium=rss&utm_campaign=phot ogrammar. “photogrammar.” accessed september , . http://www.metafilter.com/ /photogrammar. 
“photogrammar.” accessed september , . http://www.metafilter.com/ /photogrammar. “photogrammar | analyzing educational technology.” scoop.it. accessed october , . http://www.scoop.it/t/educational-technololgy/p/ / / / /photogrammar. “photogrammar | coole zitate.” accessed april , . http://goldfinanceuk.tk/photogrammar/photogrammar.html. “photogrammar | effective technology integration...” accessed september , . http://www.scoop.it/t/effective-technology-integration-into- education/p/ / / / /photogrammar. “photogrammar | etilen sosyete.” accessed october , . http://etilen.net/photogrammar/. “photogrammar | human interest.” scoop.it. accessed october , . http://www.scoop.it/t/human- interest-oppitori/p/ / / / /photogrammar. “photogrammar | kentucky | pinterest.” accessed november , . https://www.pinterest.com/pin/ /. “photogrammar | le kiosque mÉdias.” accessed september , . http://kiosquemedias.wordpress.com/tag/photogrammar/. “photogrammar | libraries.” accessed september , . http://stlawu.edu/library/announcement/photogrammar. “photogrammar | technologies | scoop.it.” accessed september , . http://www.scoop.it/t/technologies/p/ / / / /photogrammar. “photogrammar | todoele . .” accessed september , . http://todoele.org/todoele /category/etiquetas/photogrammar. “photogrammar | yale university library.” accessed october , . http://web.library.yale.edu/dhlab/photogrammar. “photogrammar — , + photos from the depression era, organized by county.” designer news. accessed november , . https://news.layervault.com/stories/ -photogrammar-- - photos-from-the-depression-era-organized-by-county. “photogrammar — , + photos from the depression era, organized by county • /r/photography.” reddit. accessed november , . http://www.reddit.com/r/photography/comments/ mw wn/photogrammar_ _photos_from_t he_depression_era/. “photogrammar: , photographs from to across america – mapped | photo archive news.” accessed september , . 
http://photoarchivenews.com/blog/ / / /photogrammar- -photographs-from- - to- -across-america-mapped/. “photogrammar : , photographs from to - battlefront forum.” accessed september , . http://www.battlefront.com/community/showthread.php?s= ada c f ebe b eb &t= . “photogrammar: general town meeting place: the park. lakeland, photographer marion post wolcott | florida - maybe a few good things | pinterest.” accessed november , . https://www.pinterest.com/pin/ /. “photogrammar - a call/trail of cthulhu resource with . depression-era photographs.” obskures.de. accessed october , . http://obskures.de/ /photogrammar-a-calltrail-of- cthulhu-resource-with- - -depression-era-photographs/. “photogrammar: banco de imágenes hist&oac...” accessed september , . http://www.scoop.it/t/el-mundo-del-diseno-grafico/p/ / / / /photogrammar- banco-de-imagenes-historicas. “photogrammar - browse a map of , photos from - created by the us farm security administration • /r/internetisbeautiful.” reddit. accessed october , . https://www.reddit.com/r/internetisbeautiful/comments/ nusqi/photogrammar_browse_a_map_o f_ _photos_from/. “photogrammar: cutting burley tobacco and putting it on sticks to wilt before taking it into curling and drying barn. russell spears’ farm near lex… | pinterest.” accessed november , . https://www.pinterest.com/pin/ /. “photogrammar: depression era photo collection searchable by location and topic. : internetisbeautiful.” accessed october , . http://www.reddit.com/r/internetisbeautiful/comments/ hydr /photogrammar_depression_era_p hoto_collection/. “photogrammar, el buscador de fotografías históricas.” altfoto. accessed september , . http://altfoto.com/ / /photogrammar-buscador-interactivo-library-congress. “photogrammar, el buscador de fotografías históricas | zayra mo.” accessed september , . http://www.zayramo.com/photogrammar-el-buscador-de-fotografias-historicas/. “photogrammar - from here to there.” accessed september , . 
http://blog.garritys.org/ / /photogrammar.html. “photogrammar impressionnant... l’université de yale a….” accessed october , . http://seenthis.net/messages/ . “photogrammar is a web-based platform for organizing, searching, and visualizing the , photographs from to created by the united sta… | pinterest.” accessed november , . https://www.pinterest.com/pin/ /. “photogrammar - képek a washingtoni kongresszusi könyvtár fotógyűjteményéből ( - ).” mai manó ház. accessed october , . http://maimanohaz.blog.hu/ / / /photogrammar_kepek_a_washingtoni_kongresszusi_kon yvtar_fotogyujtemenyebol_ . “photogrammar - let’s talk about pics.” alamy. accessed october , . http://discussion.alamy.com/index.php?/topic/ -photogrammar/. “photogrammar mentions (with images, tweets) · triplingual.” storify. accessed september , . https://storify.com/triplingual/photogrammar-mentions. “photogrammar met en ligne . photos des États-unis entre et | ufunk.net - nekoblog.org :: marque-pages.” accessed october , . http://links.nekoblog.org/?nu hmq. “photogrammar met en ligne . #photos des #Étatsunis entre et - scoopnest.com.” scoopnest. accessed october , . http://www.scoopnest.com/fr/user/pierreyvesrevaz/ . “#photogrammar on.” lockerdome. accessed september , . http://lockerdome.com/hashtag/photogrammar. “photogrammar: organizing , photographs from to | hacker news.” accessed september , . https://news.ycombinator.com/item?id= . “photogrammar - or how to spend a day online doing ‘research.’” accessed september , . http://www.nscale.net/forums/showthread.php? -photogrammar-or-how-to-spend-a-day- online-doing-quot-research-quot&p= . “photogrammar: over k photos of american everyday life between - | whatever.” voat. accessed october , . https://voat.co/v/whatever/comments/ . “photo grammar - pentaxforums.com.” accessed october , . http://www.pentaxforums.com/forums/ -general-photography/ -photo-grammar.html. 
“photogrammar: photos from the depression era, organized by county • /r/internetisbeautiful.” reddit. accessed november , . http://www.reddit.com/r/internetisbeautiful/comments/ mszjf/photogrammar_photos_from_the_ depression_era/. “photogrammar site archives fsa photos | amelia + dan.” accessed november , . http://ameliaanddan.com/blog/charleston-wedding-photographer/archive-of- -fsa-photos- happy-monday/. “photogrammar : top news articles [ ].” accessed october , . http://www.ooyuz.com/newsarticles?term=photogrammar. “photogrammar - top stories and breaking news.” accessed september , . https://www.inside.com/photogrammar. “photogrammar : une plongée dans l’histoire américaine en | rt news.” accessed october , . http://realtimenews.eu/fr/photogrammar-une-plongee-dans-lhistoire-americaine-en- - _ .html. “photogrammar : une plongée dans l’histoire américaine en photos.” big browser. accessed october , . http://bigbrowser.blog.lemonde.fr/ / / /photogrammar-une- plongee-dans-lhistoire-americaine-en- - -photos/. “photogrammar : une plongée dans l’histoire américaine en photos.” accessed october , . http://www.tropiquesfm.net/photogrammar-une-plongee-dans-l.html. “photogrammar:米国fsa-owiプロジェクトで撮影された写真 万点に新たな命を吹き込む イェール大学の取り組み | カレントアウェアネス・ポータル.” accessed september , . http://current.ndl.go.jp/node/ . “photogrammar:米国fsa-owiプロジェクトで撮影された写真 万点に新たな命を吹き込む イェール大学の取り組み | カレントアウェアネス・ポータル.” accessed september , . http://current.ndl.go.jp/node/ . “photogrammar项目:美国 和 之间的黑 文艺圈 展示 设计时代网-powered by thinkdo .” accessed october , . http://www.thinkdo .com/s/ . “photogrammer.” whatnext.pl. accessed october , . http://whatnext.pl/tag/photogrammer/. “photogrammer.” accessed october , . http://photogrammer.com.ua/?lang=en. “photogrammer | the events that happen in the mysterious brain of the person that writes this blog, english c.” accessed november , . http://thedistantevent.blogspot.com/ / /httpphotogrammar.html. “#photogrammer - instausers.com.” accessed october , . 
killer applications in digital humanities

patrick juola
duquesne university
pittsburgh, pa
united states of america
juola@mathcs.duq.edu

august ,

abstract

the emerging discipline of “digital humanities” has been plagued by a perceived neglect on the part of the broader humanities community. the community as a whole tends not to be aware of the tools developed by dh practitioners (as documented by the recent surveys by siemens et al.), and tends not to take seriously many of the results of scholarship obtained by dh methods and tools. this paper argues for a focus on deliverable results in the form of useful solutions to common problems that humanities scholars share, instead of simply new representations. the question to address is what needs the humanities community has that can be dealt with using dh tools and techniques, or equivalently what incentive humanists have to take up and to use new methods.
this can be treated in some respects like the computational quest for the “killer application”: a need of the user group that can be filled, and by filling it, create an acceptance of that tool and the supporting methods/results. some definitions and examples are provided both to illustrate the idea and to show why this is necessary. the apparent alternative is the status quo, where digital research tools are brilliantly developed, only to languish in neglect and disuse.

introduction

“the emerging discipline of digital humanities”... arguably, “digital humanities” has been emerging for decades, without ever having fully emerged. one of the flagship journals of the field, computers and the humanities, has published nearly forty volumes, without having established the field as a mainstream subdiscipline. the implications of this are profound; tenure-track opportunities for dh specialists are rare, publications are not widely read or valued, and, perhaps most seriously in the long run, the advances made are not used by mainstream scholars.

this paper analyzes some of the patterns of neglect, the ways in which mainstream humanities scholarship fails to value and participate in the digital humanities community. it further suggests one way to increase the profile of this research, by focusing on the identification and development of “killer” applications (apps), computer applications that solve significant problems in the humanities in general.

patterns of neglect

patterns of participation

a major indicator of the neglect of digital humanities as a humanities discipline is the lack of participation, particularly by influential or high-impact scholars. as an example, the flagship (or at least, longest running) journal in the field of “humanities computing” is computers and the humanities, which has been published since the 1960s. despite this, the impact of this journal has been minimal.
the journal citation reports database suggests that for , the impact factor of this journal (defined as “the number of current citations to articles published in the two previous years divided by the total number of articles published in the two previous years”) is a relatively low . . (this is actually a substantial improvement from ’s impact factor of . .) in terms of averages from – , chum was the th most cited journal out of a sample of , scoring in only the th percentile. by contrast, the most influential journal in the field of “computer applications,” bioinformatics, scores above . ; computational linguistics scores at . ; the journal of forensic science at . . neither literary and linguistic computing, text technology, nor the journal of quantitative linguistics even made the sample.

in other words, scholars tend not to read, or at least cite, work published under the heading of humanities computing. do they even participate? in six years of publication ( - ; volumes – ), chum published articles, with different authorial affiliations (including duplicates) listed. who are these authors, and do they represent high-profile and influential scholars? the unfortunate answer is that they do not appear to. of the affiliations, only are from “ivy league” universities, the single most prestigious and influential group of us universities. similarly, of the affiliations, only sixteen are from the universities recognized by us news and world report [usnews, ] as one of the top departments in any of the disciplines of english, history, or sociology. only two affiliations are among the top ten in those disciplines. while it is of course unreasonable to expect any group of american universities to dominate a group of international scholars, the conspicuous and almost total absence of faculty and students from top-notch us schools is still important.
nor is this absence confined to us scholars; only one affiliation from the top canadian doctoral universities (according to the maclean’s ranking) appears. (geoff rockwell has pointed out that the maclean’s rankings do not necessarily identify the “best” research universities in canada, and that a better list of elite research universities would be the so-called “group of ” or g– schools. even with this list, only three papers, two from alberta and one from mcmaster, appear.) australian elite universities (the go8) are slightly better represented; three affiliations from melbourne, one from sydney. only in europe is there broad participation from recognized elite universities such as the leru. the english-speaking leru universities (ucl, cambridge, oxford, and edinburgh) are all represented, as are the universities of amsterdam, leuven, paris, and utrecht despite the language barrier. however, students and faculty from harvard, yale, berkeley, toronto, mcgill, and adelaide, in many cases the current and future leaders of the fields, are conspicuously absent.

http://jcrweb.com/www/help/hjcrgls .htm, accessed june ,

table : universities included for analysis of ach/allc and dh proceedings
columns: school, papers ( ), papers ( )
usnews top : harvard, cal-berkeley, yale, princeton, stanford, cornell, chicago, columbia, johns hopkins, ucla, penn, michigan-ann arbor, wisconsin-madison, unc-chapel hill
maclean’s top : mcgill, toronto ( authors), western, ubc, queen’s
ivies not otherwise listed: brown (one paper authors), dartmouth

perhaps the real heavyweights are simply publishing their dh work elsewhere, but are still a part of the community? a study of the abstracts accepted to the ach/allc conference (victoria) shows that only included affiliations from universities in the “top ” of the usnews ranking. only two came from universities in the “top ” of the maclean ranking, and only from ivies (four of those six were from the well-established specialist dh program at brown, a program unique among ivies).
a similar analysis shows low participation among the abstracts at the dh conference (paris). the current and future leaders seem not to participate in the community, either.

tools and awareness

people who do not participate in a field cannot be expected to be aware of the developments it creates, an expectation sadly supported by recent survey data. in particular, [siemens et al., , toms and o’brien, ] reported on a survey of “the current needs of humanists” and announced that, while over % of survey respondents use e-text and over half use text analysis tools, they are not even aware of “commonly available tools such as tact, wordcruncher and concordancer.” the tools of which they are aware seem to be primarily common microsoft products such as word and access.

this lack of awareness is further supported by [martin, ] (emphasis mine):

some scholars see interface as the primary concern; [electronic] resources are not designed to do the kind of search they want. others see selection as a problem; the materials that databases choose to select are too narrow to be of use to scholars outside of that field or are too broad and produce too many results. still others question the legitimacy of the source itself. how can an electronic copy be as good as seeing the original in a library? other, more electronically oriented scholars, see the great value of accessibility of these resources, but are unaware of the added potential for research and teaching. the most common concern, however, is that scholars believe they would use these resources if they knew they existed. many are unaware that their library subscribes to resources or that universities are sponsoring this kind of research.

similarly, [warwick, a] describes the issues involved with the oxford university humanities computing unit (hcu).
despite its status as an “internationally renowned centre of excellence in humanities computing,”

[p]ersonal experience shows that it was extremely hard to convince traditional scholars in oxford of the value of humanities computing research. this is partly because so few oxford academics were involved in any of the work the hcu carried out, and had little knowledge of, or respect for, humanities computing research. had there been a stronger lobby of interested academics who had a vested interest in keeping the centre going because they had projects associated with it, perhaps the hcu could have become a valued part of the humanities division. that it did not, demonstrates the consequences of a lack of respect for digital scholarship amongst the mainstream.

killer apps and great problems

one possible reason for this apparent neglect is a mismatch between the expected needs of the audience (market) for the tools and the community’s actual needs. a recent paper [gibson, ] on the development of an electronic scholarly edition of clotel may illustrate this. the edition itself is a technical masterpiece, offering, among other things, the ability to compare passages among the various editions and even to track word-by-word changes. however, it is not clear who among clotel scholars will be interested in using this capacity or this edition; many scholars are happy with their print copies and the capacities print grants (such as scribbling in the margins or reading on a park bench). furthermore, the nature of the clotel edition does not lend itself well either to application to other areas or to further extension. the knowledge gained in the process of annotating clotel does not appear to generalize to the annotation of other works (certainly, no general consensus has emerged about “best practices” in the development of a digital edition, and the various proposals appear to be largely incompatible and even incomparable).
the clotel edition is essentially a service offered to the broader research community in the hope that it will be used, and runs a great risk of becoming simply yet another tool developed by the dh specialists to be ignored. quoting further from [martin, ]: “[some scholars] feel there is no incentive within the university system for scholars to use these kinds of new resources”, let alone to create them.

this paper argues that for a certain class of resources, there should be no need for an incentive to get scholars to use them. digital humanities specialists should be in a unique position both to identify the needs of mainstream humanities scholars and to suggest computational solutions that the mainstream scholars will be glad to accept.

definition

the wider question to address, then, is what needs the humanities community has that can be dealt with using dh tools and techniques, or equivalently what incentive humanists have to take up and to use new methods. this can be treated in some respects like the computational quest for the “killer application”: a need of the user group that can be filled, and by filling it, create an acceptance of that tool and the supporting methods/results. digital humanities needs a “killer application.”

“killer application” is a term borrowed from the discipline of computer science. in its strictest form, it refers to an application program so useful that users are willing to buy the hardware it runs on, just to have that program. one of the earliest examples of such an application was the spreadsheet, as typified by visicalc and lotus 1-2-3. having a spreadsheet made business decisionmaking so much easier (and more accurate and profitable) that businesses were willing to buy the computers (apple iis or ibm pcs, respectively) just to run spreadsheets. gamers by the thousands have bought xbox gaming consoles just to run halo.
a killer application is one that will make you not just buy the product itself, but also invest in the infrastructure necessary to make the product useful.

for digital humanities, this term should be interpreted in a somewhat broader sense. any intellectual product (a computer program, an abstract tool, a theory, an analytic framework) can and should be evaluated in terms of the “affordances” [gibson, , ruecker and devereux, ] it creates. in this framework, an “affordance” is simply “an opportunity for action” [ruecker and devereux, ]; spreadsheets, for instance, create opportunities to make business decisions quickly on the basis of incomplete or hypothesized data, while halo creates the opportunity for playing a particular game. ruecker provides a framework for comparing different tools in terms of their “affordance strength,” essentially the value offered by the affordances of a specific tool. in this broader context, a “killer app” is any intellectual construct that creates sufficient affordance strength to justify the effort and cost of accepting, not just the construct itself, but the supporting intellectual infrastructure. it is a solution sufficiently interesting to, by itself, retrospectively justify looking at the problem it solves: a great problem that can both empower and inspire.

three properties appear to characterize such “killer apps”. first, the problem itself must be real, in the sense that other humanists (or the public at large) should be interested in the fruits of its solution. for example, the organizers of a recent nsf summit on “digital tools for the humanities” identified several examples of the kinds of major shifts introduced by information technology in various areas. in their words,

when information technology was first applied [to inventory-based businesses], it was used to track merchandise automatically, rather than manually. at that time, the merchandise was stored in the same warehouses, shipped in the same way, depending upon the same relations among producers and retailers as before[. . . ]. today, a revolution has taken place. there is a whole new concept of just-in-time inventory delivery. some companies have eliminated warehouses altogether, and the inventory can be found at any instant in the trucks, planes, trains, and ships delivering sufficient inventory to re-supply the consumer or vendor, just in time. the result of this is a new, tightly interdependent relationship between suppliers and consumers, greatly reduced capital investment in “idle” merchandise, and dramatically more responsive service to the final consumer.

a killer application in scholarship should be capable of effecting similar change in the way that practicing scholars do their work. only if the problem is real can an application solving it be a killer. the clotel edition described above appears to fail under this property precisely because only specialists in clotel (or in 19th-century or african-american literature) are likely to be interested in the results; a specialist in the canterbury tales will not find her work materially affected.

second, the problem must get buy-in from the humanities computing community itself, in that humanities computing specialists will be motivated to do the actual work. the easiest and probably cheapest way to do this is for the process of solution itself to be interesting to the participating scholars. for example, the compiling of a detailed and subcategorized bibliography of all references to a given body of work would be of immense interest to most scholars; rather than having to pore through dozens of issues of thousands of journals, they could simply look up their field of interest. (this is, in fact, very close to the service that thomson scientific provides with the social science citation index, or that penn state provides with citeseer.)
the problem is that though the product is valuable, the process of compiling it is dull, dreary, and unrewarding. there is little room for creativity, insight, and personal expression in such a bibliography. most scholars would not be willing to devote substantial effort (perhaps several years of full-time work) to a project with such minimal reward. (by contrast, the development of a process to automatically create such a bibliography could be interesting and creative work.) the process of solving interesting problems will almost automatically generate papers and publications, draw others into the process of solving it, and create opportunities for discussion and debate. we can again compare this to the publishing opportunities for a bibliography: is “my bibliography is now % complete” a publishable result?

third, the problem itself must be such that even a partial solution or an incremental improvement will be useful and/or interesting. any problem that meets the two criteria above is unlikely to submit to immediate solution (otherwise someone would probably already have solved it). similarly, any such problem is likely to be sufficiently difficult that solving it fully would be a major undertaking, beyond the resources that any single individual or group could likely muster. on the other hand, being able to develop, deploy, and use a partial solution will help advance the field in many ways. the partial solution, by assumption, is itself useful. beyond that, researchers and users have an incentive to develop and deploy improvements. finally, the possibility of supporting and funding incremental improvements makes it more likely to get funding, and enhances the status of the field as a whole.

some historical examples

to more fully understand this idea of a killer app, we should first consider the history of scholarly work, and imagine the life of a scholar c. .
he (probably) spends much of his life in the library, reading paper copies of journal articles and primary sources to which he (or his library) has access, taking detailed notes by hand on index cards, and laboriously writing drafts in longhand which he will revise before finally typing (or giving to a secretary to type). his new ideas are sent to conferences and journals, eventually to find their way into the libraries of other scholars worldwide over a period of months or years. collaboration outside of his university is nearly unheard-of, in part because the process of exchanging documents is so difficult.

compare that with the modern scholar, who can use a photocopier or scanner to copy documents of interest and write annotations directly on those copies. she can use a word processor (possibly on a portable computer) both to take research notes and to extend those notes into articles; she has no need to write complete drafts, can easily rearrange or incorporate large blocks of text, and can take advantage of the computer to handle “routine” tasks such as spelling correction, footnote numbering, bibliography formatting, and even pagination. she can directly incorporate the journal’s formatting requirements into her work (so that the publisher can legitimately ask for “camera-ready” manuscripts as a final draft), eliminating or reducing the need both for typists and typesetters. she can access documents from the comfort of her own office or study via an electronic network, and use advanced search technology to find and study documents that her library does not itself hold. she can similarly distribute her own documents through that same network and make them available to be found by other researchers. her entire work-cycle has been significantly changed (for the better, one hopes) by the availability of these computational resources.
we thus have several historical candidates for what we are calling “killer apps”: xerographic reproduction and scanning, portable computing (both arguably hardware instead of software), word processing and desktop publishing (including subsystems such as bibliographic packages and spelling checkers), networked communication such as email and the web, and search technology such as google. these have all clearly solved significant issues in the way humanities research is generally performed (i.e. met the first criterion). in ruecker’s terms, they have all created “affordances” of the sort that no modern scholar would choose to forego. the amount of research work (journals, papers, patents, presentations, and books) devoted to these topics suggests that researchers themselves are interested in solving the problems and improving the technologies, in many cases incrementally (e.g., “how can a search engine be tuned to find documents written in thai?”).

of course, for many of these applications, the window of opportunity has closed, or at least narrowed. a group of academics is unlikely to have the resources to build or deploy a product competing with microsoft and/or google. on the other hand, the very fact that humanities scholars are something of a niche market may open the door to incremental killer apps based upon (or built as extensions to) mainstream software, applications focused specifically on the needs of practicing scholars. the next section presents a partial list of some candidates that may yield killer applications in the foreseeable future. some of these candidates are taken from my own work, some from the writings of others.

potential current killer apps

back of the book index generation

almost every nonfiction book author has been faced with the problem of indexing. for many, this will be among the most tedious, most difficult, and least rewarding parts of writing the book.
the alternative is to hire a professional indexer (perhaps a member of an organization such as the american society of indexers, www.asindexing.org) and pay a substantial fee, which simply shifts the uncomfortable burden to someone else, but does not substantially reduce it.

a good index provides much more than the mere ability to find information in a text. the clive pyne book indexing company lists some aspects of what a good index provides. according to them, “a good index:

• provides immediate access to the important terms, concepts and names scattered throughout the book, quickly and efficiently;
• discriminates between useful information on a subject, and a passing mention;
• has headings which are concise, accurate and unambiguous reflecting the contents and terminology used in the text;
• has sufficient cross-references to connect related terms;
• anticipates how readers will search for information;
• reveals the inter-relationships of topics, concepts and names so that the reader need not read the whole index to find what they are looking for;
• provides terminology which might not be used in the text, but is the reference point that the reader will use for searching through the index;
• can make the difference between a book and a very good book”

a traditional back-of-the-book (botb) index is a substantial intellectual accomplishment in its own right. in many ways, it is an encapsulated and stylized summary of the intellectual structure of the book itself. “a good index is an objective guide to the text, a link between the author’s ideas and the reader. it should be a road map that leads readers to every relevant idea without frustrating detours and dead ends.” and it is specifically not just a concordance or a list of terms appearing in the document.

it is thus surprising that a tedious task of such importance has not yet been computerized.
this is especially surprising given the effectiveness of search engines such as google at “indexing” the unimaginably large volume of information on the web. however, the tasks are subtly different; a google search is not expected to show knowledge of the structure of the documents or the relationships among the search terms. as a simple example, a phrasal search on google (may , ) for “a good index,” found, as expected, several articles on back of the book indexing. it also found several articles on financial indexing and index funds, and a scholarly paper on glycemic control as measured (“indexed”) by plasma glucose concentrations. a good text index would be expected to identify these three subcategories, to group references appropriately, and to offer them to the reader proactively as three separate subheadings. a good text index is not simply a search engine on paper, but an intellectual precis of the structure of the text.

http://www.cpynebookindexing.com/what makes a good index.htm, accessed / /
kim smith, http://www.smithindexing.com/whyprof.html, accessed / / .

this is therefore an obvious candidate for a killer application. every humanities scholar needs such a tool. indeed, since chemistry texts need indexing as badly as history texts do, scholars outside of the humanities also need it. unfortunately, not only does it not (yet) exist, but it isn’t even clear at this writing what properties such a tool would have. thus there is room for fundamental research into the attributes of indices as a genre of text, as well as into the fundamental processes of compiling and evaluating indices and their expression in terms of algorithms and computation.

i have presented elsewhere [juola, , lukon and juola, ] a possible framework to build a tool for the automatic generation of such indices.
without going into technical detail, the framework identifies several important (and interesting) cognitive/intellectual tasks that can be independently solved in an incremental fashion. furthermore, this entire problem clearly admits of an incremental solution, because a less-than-perfect index, while clearly improvable, is still better than no index at all, and any time saved by automating the more tedious parts of indexing will still be a net gain to the indexer. thus all three components of the definition of killer app given above are present, suggesting that the development of such an indexing tool would be beneficial both inside and outside the digital humanities community.

annotation tools

as discussed above, one barrier to the use of e-texts and digital editions is the current practices of scholars with regard to annotation. even when documents are available electronically, many researchers (myself included) will often choose to print them and study them on paper. paper permits one not only to mark text up and to make changes, but also to make free-form annotations in the margins, to attach post-it notes in a rainbow of colors, and to share commentary with a group of colleagues. annotation is a crucial step in recording a reader’s encounter with a text, in developing an interpretation, and in sharing that interpretation with others. the recent iath summit on digital tools for the humanities [iath summit, ] identified this process of annotation and interpretation as a key process underlying humanistic scholarship, and specifically discussed the possible development of a tool for digital annotation, a “highlighter’s tool,” that would provide the same capacities of annotation of digital documents, including multimedia documents, that print provides.
the flexibility of digital media means, in fact, that one should be able to go beyond the capacities of print: for example, instead of doodling a simple drawing in the margin of a paper, one might be able to “doodle” a flash animation or a .wav sound file.

discussants identified at least nine separate research projects and communities that would benefit from such a tool. examples include “a scholar currently writing a book on anglo-american relations, who is studying propaganda films produced by the us and uk governments and needs to compare these with text documents from on-line archives, coordinate different film clips, etc.”; “an add-on tool for readers (or reviewers) of journal articles,” especially of electronic journal systems (the current system of identifying comments by page and line number, for example, is cumbersome for both reviewers and authors); and “an endangered language documentation project that deals with language variation and language contact,” where multilingual, multialphabet, and multimedia resources must be coordinated among a broad base of scholars. such a tool has the potential to change the annotation process as much as the word processor has changed the writing and publication process.

can community buy-in be achieved? there is certainly room for research and for incremental improvements, both in defining the standards and capacities of the annotations and in expanding those capacities to meet new requirements as they evolve. for example, early versions of such a project would probably not be capable of handling all forms of multimedia data; a research-quality prototype might simply handle pdf files and sound, but not video.
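as a rough illustration of how small such an early version could be, the sketch below models a single annotation record that accepts pdf and sound targets and rejects video, and that can be serialized for exchange between colleagues. the class name, field names, anchor strings, and the choice of json as an exchange format are all assumptions made for this example; none of them comes from the summit report or from an existing tool.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical: the media types an early, limited prototype might accept.
SUPPORTED_MEDIA = {"pdf", "audio"}


@dataclass(frozen=True)
class Annotation:
    """One reader's note attached to a region of a document (illustrative only)."""

    document: str  # identifier or file name of the annotated document
    media: str     # "pdf" or "audio" in this sketch
    anchor: str    # where the note attaches, e.g. "page=3" or "t=12.5s"
    author: str    # who wrote the note
    body: str      # the free-form note itself

    def __post_init__(self):
        # Reject media the prototype does not handle yet (e.g. video).
        if self.media not in SUPPORTED_MEDIA:
            raise ValueError(f"unsupported media type: {self.media!r}")


def to_json(annotations):
    """Serialize annotations so they can be shared rather than kept locally."""
    return json.dumps([asdict(a) for a in annotations], ensure_ascii=False)


def from_json(text):
    """Rebuild annotation records from the shared representation."""
    return [Annotation(**record) for record in json.loads(text)]
```

extending the prototype to a new medium would then mean adding an entry to the supported set and agreeing on an anchor convention for it, which is the kind of incremental improvement described above.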
it’s not clear that the community support is available for building early, simple versions. although “a straw poll showed that half of [the discussants] wanted to build this kind of tool, and all wanted to use it” [iath summit, ], responding to a straw poll is one thing and devoting time and resources is another altogether; it is not clear that any software development on this project has yet happened. however, given the long-term potential uses and research outcomes from this kind of project, it clearly has the potential to be a killer application.

resource exploration

another issue raised at the summit is that of resource discovery and exploration. the huge amount of information on the web is, of course, a tremendous resource for all of scholarship, and companies such as google (especially with new projects such as google images and google scholar) are excellent at finding and providing access. on the other hand, “such commercial tools are shaped and defined by the dictates of the commercial market, rather than the more complex needs of scholars” [iath summit, ]. this raises issues about access to more complex data, such as textual markup, metadata, and data hidden behind gateways and search interfaces. even where such data is available, it is rarely compatible from one database to another, and it’s hard to pose questions to take advantage of the markup. in the words of the summit report,

what kinds of tools would foster the discovery and exploration of digital resources in the humanities? more specifically, how can we easily locate documents (in multiple formats and multiple media), find specific information and patterns in across [sic] large numbers of scholarly disciplines and social networks? these tasks are made more difficult by the current state of resources and tools in the humanities.
for example, many materials are not freely available to be crawled through or discovered because they are in databases that are not indexed by conventional search engines or because they are behind subscription-based gates. in addition, the most commonly used interfaces for search and discovery are difficult to build upon. and, the current pattern of saving search results (e.g., bookmarks) and annotations (e.g., local databases such as endnote) on local hard drives inhibits a shared scholarly infrastructure of exploration, discovery, and collaboration.

again, this has the potential to effect significant change in the day-to-day working life of a scholar, by making collaborative exploration and discovery much more practical and rewarding, possibly changing the culture by creating a new “scholarly gift economy in which no one is a spectator and everyone can readily share the fruits of their discovery efforts.” “research in the sciences has long recognized team efforts. . . . a similar emphasis on collaborative research and writing has not yet made its way into the thinking of humanists.”

but, of course, what kind of discovery tools would be needed? what kind of search questions should be supported? how can existing resources such as lexicons and ontologies be incorporated into the framework? how can it take advantage of (instead of competing with) existing commercial search utilities? these questions illustrate many of the possible research avenues that could be explored in the development of such an application. jockers’ idea of “macro lit-o-nomics (macro-economics for literature)” [jockers, ] is one approach that has been suggested for developing useful analysis from large datasets; the “just-in-time” text analysis of ruecker and devereux [ruecker and devereux, ] is another.
in both projects, the researchers showed that interesting conclusions could be drawn by analyzing the large-scale results of automatically-discovered resources and looking at macro-scale patterns of language and thought.

automatic essay grading

the image of a bleary-eyed teacher, bent over a collection of essays far past her bedtime, is a traditional one. writing is a traditional and important part of education, but most instructors find the grading of essays to be time-consuming, tedious, and unrewarding. this applies regardless of the subject; essays on shakespeare are not significantly more fun to grade than essays on the history of colonialism. the essay grading problem is one reason that multiple choice tests are so popular in large classes. we thus have another potential “killer app,” an application to handle the chore of grading essays without interfering with the educational process.

several approaches to automatic essay grading have been tried, with reasonable but not overwhelming success. at a low enough level, essay grading can be done successfully just by looking at aspects of spelling, grammar, and punctuation, or at stylistic continuity [page, ]. foltz [foltz et al., ] has also shown good results by comparing semantic coherence (as measured, via latent semantic analysis, from word co-occurrences) with that of essays of known quality:

lsa’s performance produced reliabilities within the range of their comparable inter-rater reliabilities and within the generally accepted guidelines for minimum reliability coefficients. for example, in a set of essays written on the functioning of the human heart, the average correlation between two graders was . , while the correlation of lsa’s scores with the graders was . . . . . in a more recent study, the holistic method was used to grade two additional questions from the gmat standardized test. the performance was compared against two trained ets graders.
for one question, a set of opinion essays, the correlation between the two graders was . , while lsa’s correlation with the ets grades was also . . for the second question, a set of analysis of argument essays, the correlation between the two graders was . , while lsa’s correlation to the ets grades was . . thus, lsa was able to perform near the same reliability levels as the trained ets graders.

beyond simply reducing the workload of the teacher, this tool has many other uses. it can be used, for example, as a method of evaluating a teacher for consistency in grading, or for ensuring that several different graders for the same class use the same standards. more usefully, perhaps, it can be used as a teaching adjunct, by allowing students to submit rough drafts of their essays to the computer and re-write until they (and the computer) are satisfied. this will also encourage the introduction of writing into the curriculum in areas outside of traditional literature classes, and especially into areas where the faculty themselves may not be comfortable with the mechanics of teaching composition. research into automatic essay grading is an active area among text categorization scholars and computer scientists for the reasons cited above [valenti et al., ].

from a philosophical point of view, though, it’s not clear that this approach to essay grading should be acceptable. a general-purpose essay grader can do a good job of evaluating syntax and spelling, and even (presumably) grade “semantic coherence” by counting if an acceptable percentage of the words are close enough together in the abstract space of ideas. what such a grader cannot do is evaluate factual accuracy or provide discipline-specific information. furthermore, the assumption that there is a single grade that can be assigned to an essay, irrespective of context and course focus, is questionable.
here is an area where a problem has already been identified, applications have been and continue to be developed, uptake by a larger community is more or less guaranteed, but the input of humanities specialists is crucially needed to improve the service quality provided.

discussion

the list of problems in the preceding section is not meant to be either exclusive or exhaustive, but merely to illustrate the sort of problems for which killer apps can be designed and deployed. similarly, the role for humanities specialists to play will vary from project to project: in some cases, humanists will need to play an advisory role to keep a juggernaut from going out of control (as might be needed with the automatic grading), while in others, they will need to create and nurture a software project from scratch. the list, however, shares enough to illustrate both the underlying concept and its significance. in other words, we have an answer to the question “what?”: what do i mean by a “killer application,” what does it mean for the field of digital humanities, and, as i hope i have argued, what can we do to address the perennial problem of neglect by the mainstream.

an equally important question, of course, is “how?” fortunately, there appears to be a window opening, a window of increased attention and available research opportunities in the digital humanities. the iath summit cited above [iath summit, ] is one example, but there are many others. recent conferences such as the first text analysis developers alliance (tada), in hamilton ( ), the digital tools summit for linguistics in east lansing ( ), the e-meld workshops (various locations, – ), the cyberinfrastructure for humanities, arts, and social sciences workshop at ucsd ( ), and the recent establishment of the working group on community resources for authorship attribution (new brunswick, nj; ) illustrate that digital scholarship is being taken more seriously.
the appointment of ray siemens in as the canada research chair in humanities computing is another important milestone, marking perhaps the first recognition by a national government of the significance of humanities computing as an acknowledged discipline.

perhaps most important in the long run is the availability of funding to support dh initiatives. many of the workshops and conferences described above were partially funded by competitively awarded research grants from national agencies such as the national science foundation. the canadian foundation for innovation has been another major source of funding for dh initiatives. but perhaps the most significant development is the new ( ) digital humanities initiative at the (united states) national endowment for the humanities. from the website (http://www.neh.gov/grants/digitalhumanities.html, accessed / / ):

neh has launched a new digital humanities initiative aimed at supporting projects that utilize or study the impact of digital technology. digital technologies offer humanists new methods of conducting research, conceptualizing relationships, and presenting scholarship. neh is interested in fostering the growth of digital humanities and lending support to a wide variety of projects, including those that deploy digital technologies and methods to enhance our understanding of a topic or issue; those that study the impact of digital technology on the humanities, exploring the ways in which it changes how we read, write, think, and learn; and those that digitize important materials thereby increasing the public’s ability to search and access humanities information.

the list of potentially supported projects is large:
• apply for a digital humanities fellowship (coming soon!)
• create digital humanities tools for analyzing and manipulating humanities data (reference materials grants, research and development grants)
• develop standards and best practices for digital humanities (research and development grants)
• create, search, and maintain digital archives (reference materials grants)
• create a digital or online version of a scholarly edition (scholarly editions grants)
• work with a colleague on a digital humanities project (collaborative research grants)
• enhance my institution’s ability to use new technologies in research, education, preservation, and public programming in the humanities (challenge grant)
• study the history and impact of digital technology (fellowships, faculty research awards, summer stipends)
• develop digitized resources for teaching the humanities (grants for teaching and learning resources)

most importantly, this represents an agency-wide initiative, and thus illustrates the changing relationship between the traditional humanities and digital scholarship at the very highest levels.

of course, just as windows can open, they can close. to ensure continued access to this kind of support, the supported research needs to be successful. this paper has deliberately set the bar high for “success,” arguing that digital products can and should result in substantial uptake and effect significant changes in, as neh put it, “how we read, write, think, and learn.” the possible problems discussed earlier are an attempt to show that we can effect such changes. but the most important question, of course, is “should we?”

“why?”

why should scholars in the digital humanities try to develop this software and make these changes? the first obvious answer is simply one of self-interest as a discipline. solving high-profile problems is one way of attracting the attention of mainstream scholars and thereby getting professional advancement.
warwick [warwick, b] illustrates this in her analysis of the citations of computational methods, and the impact of a single high-profile example. of all articles studied, the only ones that cited computational methods did so in the context of don foster’s controversial analysis of “a funeral elegy” to shakespeare.

the funeral elegy controversy provides a case study of circumstances in which the use of computational techniques was noticed and adopted by mainstream scholars. the paper argues that a complex mixture of a canonical author (shakespeare) and a star scholar (foster) brought the issue to prominence. . . . the funeral elegy debate shows that if the right tools for textual analysis are available, and the need for, and use of, them is explained, some mainstream scholars may adopt them. despite the current emphasis on historical and cultural criticism, scholars will surely return in time to detailed analysis of the literary text. therefore researchers who use computational methods must publish their results in literary journals as well as those for humanities computing specialists. we must also realize that the culture of academic disciplines is relatively slow to change, and must engage with those who use traditional methods. only when all these factors are understood and are working in concert, may computational analysis techniques truly be more widely adopted.

implicit in this, of course, is the need for scholars to find results that are publishable in mainstream literary journals as well as to do the work resulting in publication, the two main criteria of killer apps.

on a less selfish note, the development of killer applications will improve the overall state of scholarship as a whole, without regard to disciplinary boundaries. while change for its own sake may not necessarily be good, solutions to genuine problems usually are.
creating the index to a large document is not fun; it requires days or weeks of painstaking, detailed labor that few enjoy. the inability to find or access needed resources is not a good thing. by eliminating artificial or unnecessary restrictions on scholarly activity, scholars are freed to do what they really want to do: to read, to write, to analyze, to produce knowledge, and to distribute it.

furthermore, the development of such tools will in and of itself generate knowledge, knowledge that can be used not only to generate and enhance new tools but to help understand and interpret the humanities more generally. software developers must be long-term partners with the scholars they serve, but digital scholars must also be long-term partners, not only with the software developers, but with the rest of the discipline and its emerging needs. in many cases, the digital scholars are uniquely placed to identify and to describe the emerging needs of the discipline as a whole. with a foot in two camps, the digital scholars will be able to speak to the developers about what is needed, and to the traditional scholars about what is available as well as what is under development.

conclusion

predicting the future is always difficult, and predicting the effects of a newly-opened window is even more so. but recent developments suggest that digital humanities, as a field, may be at the threshold of a new series of significant developments that can change the face of humanities scholarship and allow the “emerging discipline of humanities computing” finally to emerge.

for the past forty years, humanities computing has more or less languished in the background of traditional scholarship. scholars lack incentive to participate in (or even to learn about) the results of humanities computing.
this paper argues that dh specialists are placed to create their own incentives by developing applications with sufficient scope to materially change the way humanities scholarship is done. i have suggested four possible examples of such applications, knowing well that many more are out there. i believe that by actively seeking out and solving such great problems, that is, by developing such killer apps, scholarship in general and digital humanities in particular will be well-served.

references

[foltz et al., ] foltz, p. w., laham, d., and landauer, t. k. ( ). automated essay scoring: applications to educational technology. in proceedings of edmedia ’ .
[gibson, ] gibson, m. ( ). clotel: an electronic scholarly edition. in proceedings of ach/allc , victoria, bc ca. university of victoria.
[iath summit, ] iath summit ( ). summit on digital tools for the humanities: report on summit accomplishments.
[jockers, ] jockers, m. ( ). xml aware tools - catools. in presentation at text analysis developers alliance, mcmaster university, hamilton, on.
[juola, ] juola, p. ( ). towards an automatic index generation tool. in proceedings of ach/allc , victoria, bc ca. university of victoria.
[lukon and juola, ] lukon, s. and juola, p. ( ). a context-sensitive computer-aided index generator. in proceedings of dh , paris. sorbonne.
[martin, ] martin, s. ( ). reaching out: what do scholars want from electronic resources? in proceedings of ach/allc , victoria, bc ca. university of victoria.
[page, ] page, e. b. ( ). computer grading of student prose using modern concepts and software. journal of experimental education, : – .
[ruecker and devereux, ] ruecker, s. and devereux, z. ( ). scraping google and blogstreet for just-in-time text analysis. in presented at casta- , the face of text, mcmaster university, hamilton, on.
[siemens et al., ] siemens, r., toms, e., sinclair, s., rockwell, g., and siemens, l. ( ).
the humanities scholar in the twenty-first century: how research is done and what support is needed. in proceedings of allc/ach , gothenberg. u. gothenberg.
[toms and o’brien, ] toms, e. g. and o’brien, h. l. ( ). understanding the information and communication technology needs of the e-humanist. journal of documentation, (accepted/forthcoming).
[usnews, ] usnews ( ). u.s. news and world report: america’s best graduate schools (social sciences and humanities).
[valenti et al., ] valenti, s., neri, f., and cucchiarelli, a. ( ). an overview of current research on automated essay grading. journal of information technology education, : – .
[warwick, a] warwick, c. ( a). no such thing as humanities computing? an analytical history of digital resource creation and computing in the humanities. in proceedings of allc/ach , gothenberg. u. gothenberg.
[warwick, b] warwick, c. ( b). whose funeral? a case study of computational methods and reasons for their use or neglect in english studies. in presented at casta- , the face of text, mcmaster university, hamilton, on.

mapping the kominas' sociomusical transnation: punk, diaspora, and digital media

wendy fangyu hsu, center for digital learning and research, occidental college, los angeles, ca, usa

asian journal of communication. publisher: routledge. published online: jun .

to cite this article: wendy fangyu hsu ( ) mapping the kominas' sociomusical transnation: punk, diaspora, and digital media, asian journal of communication, : , - , doi: . / . .
original article

mapping the kominas’ sociomusical transnation: punk, diaspora, and digital media

wendy fangyu hsu*
center for digital learning and research, occidental college, los angeles, ca, usa

(received december ; final version received april )

the kominas is a south asian american punk band known for its iconic role within the punk-inspired, muslim-affiliated music culture self-labeled as ‘taqwacore’.
since its national tour in , the kominas has been creating a radically translocal social geography comprised of musicians, listeners, artists, filmmakers, and bloggers on- and off-line. the band concocts a transnational sound, combining elements of punjabi and punk music. this paper examines the kominas’ web production and interactions over digital social media on myspace and twitter. it discusses how the band members contemplate their troubled sense of national belonging; and illustrates how they build a diasporic space that is digitally produced and unified by minoritarian politics. this ethnographic project uses participant-observation and tools from digital humanities (data-mining and geospatial visualization) to map the transnational contours of the kominas’ self-made community.

keywords: south asian identity; punk; popular music; internet; digital humanities; digital ethnography

the kominas is a south asian american punk rock band based in northeastern united states. formed in boston, massachusetts in , the band is well-known for its association with the grassroots music culture self-labeled as ‘taqwacore.’ the prefix ‘taqwa’ is a quranic arabic term meaning ‘fear-inspired love’ or ‘love-based fear’ for the divine. the suffix ‘core’ refers to its punk roots, highlighting the do-it-yourself ideology and subversive attitudes central in hardcore punk music scenes. writer michael muhammad knight coined the term ‘taqwacore’ in his novel about a group of college-age individuals who live in a house together in buffalo, new york ( ). knight conceived the term as a way to reclaim a space for an alternative practice of islam inflected with the punk anti-status-quo ethos.

the world shared by the kominas, me, and many other active members of the taqwacore scene, is embedded in a global digital media network. since , the kominas has been vigorously creating a transnational social terrain via online social networking and face-to-face interactions through touring.
members of the band have crossed the borders of more than six nation/states including pakistan, united kingdom, canada, norway, austria and the us, leaving their home in northeastern united states to perform on three continents of the world, and establishing friendly networks across north america, europe, the middle east, south and southeast asia.

*email: hsuw@oxy.edu
© amic/sci-ntu

the kominas’ transnational sound and community, i argue, are linked to, if not a consequence of, the feeling of lack of national belonging and social comfort experienced by individuals of south asian descent living in the united states (maira, ). responses to the events on september, (islamophobia and the war on terror) brought this collective state of melancholia into relief. the members of the kominas have experienced discrimination and alienation, similar to many individuals of south asian, muslim, and arabic heritage living in the us. in their everyday life, they juggle the consequences of neo-orientalist and ‘civilizational’ (polumbo-liu, ) discourses that partition the world into two opposing halves, namely, muslim and western. the band members have suggested that they never feel quite at home when they physically are home. they have written songs with titles such as ‘sharia law in the usa’ to question the racializing surveillance upon individuals assumed to be of muslim descent. the song ‘suicide bomb the gap’, for example, subverts the terror-infused imagery of south asian and muslim masculinity that is rampantly circulated in mainstream news media.

this article, however, does not elaborate on the overt instances of the band’s resistance against racializing media and surveillance.
instead, it focuses on how the members of the band have asserted themselves in reclaiming their own spaces across national and regional boundaries, in the context of post- / geopolitics. looking at the kominas’ interactions over digital social media, this article seeks to articulate the band’s self-made geography. the band deploys the punk sound and do-it-yourself social-networking to re-territorialize and re-embed itself into a world partitioned by ideology, politics, and migration. in doing so, the kominas decenters the anglo-american domination of punk and rock music with the creation of an alternative community, a new home away from its physical home.

in missing: youth, citizenship, and empire after / ( ), sunaina maira investigates how young south asian muslim immigrants negotiate their alienation, social discomfort, and lack of national belonging. maira outlines the ways in which these youth interpret and repurpose popular culture and media to express dissent toward us imperialism and military actions. in an example, she describes an informant’s expression of ‘ambiguous dissent’ against us imperialism in the form of personal webpages and mashup images circulated on the internet ( , pp. – ). these digital moments, however, are left un-theorized in maira’s work. this paper responds to maira’s work by highlighting the role of digital music and social media in the south asian american expression of post- / dissent and ethnic pride. this essay argues that digital media have enabled the creation of generative spaces for individuals to share personal and collective discontentment. digital spaces have become a vital alternative to users’ repressed physical reality.

following the works on digital sociality within minority music cultures in a postnational context (luvaas, ; murthy, , ; pinard & jacobs, ), i examine the kominas’ dynamic geographical occupation, digitally and physically instantiated. theoretically, i build on josh kun’s ( ) conception of ‘audiotopia . .
. [as] small, momentary, lived utopias built, imagined, and sustained through sound, noise, and music’ ( , p. ). kun uses the concept of audiotopia to explore the meaning of music in spatial terms, specifically ‘the spaces that the music itself contains, the spaces that music fills up, the spaces that music helps us to imagine as listeners occupying our own real and imaginary spaces’ ( , p. ). kun’s formulation focuses on the emotional and sensual register of music as embedded in recorded sound. in this paper, i broaden kun’s conception of audiotopia to consider not only the psychoacoustic space produced by music, but also a variety of sociomusical spaces created by a band: live and documented, embodied and virtual, physical and imagined.

to trace and document these varied lived spaces, i applied the methods of field research with an integrated focus on the physical and the digital. i spent two years following the kominas, tracking the band’s musical and social engagements including live performances, touring, networking, promoting, and show booking. these processes took place in both online and offline environments. in these variously mediated social spaces marked as taqwacore, i participated as a fan, blogger, and fellow musician. methodologically, i extended the methods of participant observation from the physical into the virtual realm of interaction. i position the digital as an extension of the physical and vice versa. my digital participation is no less important than my physical participation in the scene.

in what follows, i will describe how the kominas reconfigures the world’s geography to make a home in a punk transnation. i will do so while reflecting on the dynamic range of mediation in the band’s sociomusical formation.
first, in an analysis of the band’s web production and social media engagement, i discuss how the band established a defiant, minority-based community for fellow musicians through forming musical kinships and a web-based record label. this analysis examines the musicians’ creation of a ‘brown’ ethno-racial social space and unfolds the relationship between race and ethnicity in a south asian postcolonial context. the second section focuses on the band’s use of the social networking site myspace as a means to extend in-person interactions into a global network of listeners and supporters. this section delves into my application of digital humanities methods, specifically web scraping and geospatial visualization, to map the transnational contours of the kominas’ digital social terrain.

a brown ethno-racial continuum

on twitter, the kominas actively sent instant messages of characters or less referred to as ‘tweets’ to its fans and friends. the band sent tweets that contain the word ‘brown’ to provoke and organize online dialogues about global south asian and other minority identities. the band used the hashtag #brownbandreparations (following the convention of using a hashtag (#) to group messages under a common topical label) to relay a message from its brother band, the psychedelic alternative country duo sunny ali and the kid (thekominas, ). in the blogosphere, writers flaunted a similar connection between the kominas and brownness. on washington, dc-based music and commentary site true genius requires insanity (tgronline, ), a contributing blogger proclaimed that ‘the future of american punk is brown,’ alluding to the kominas’ first album wild nights in guantanamo bay ( ). a blogger on music and opinion site elivindotorg asked: ‘is the year punk turns brown?’ (elivindotorg, ). in more mainstream press, mtv iggy announced its endorsement for the kominas as ‘beloved brown brothers’ ( ).
what are the ethnic, racial, and geopolitical implications of brownness in the kominas’ social world? in this section, i discuss the ways in which the kominas transforms the notion of diaspora by deploying a radically inclusive brown ethno-racial identity. this brown sociomusical kinship, at times, stems from an ethnic space, marked by a shared pakistani-american or south asian pan-ethnic identity. in other contexts, brownness is charged with racial meanings as the band explicates its minority position within the anglo-american dominance in rock music in us society. the brown ethno-racial identification enables a productive movement between a conventional notion of diaspora oriented toward an ancestral homeland and a politically charged space that centers minority experiences in the postcolonial era. in what follows, i will outline how the band forms a network of friends within and across the boundaries of the pakistani and south asian diaspora, through an active engagement with digital social media. more often than not, the kominas has used the term ‘brown’ to express a sense of fraternity based on a shared pakistani heritage. members of the band adopt a creative naming practice that highlights its musical kinship with other bands. the band renames its physical locations to signify its pakistani background by playfully adding ‘-tan(i)’ to the names of the us cities in its friend network. for instance, the kominas labeled its birthplace as ‘bostonstan’ on its twitter profile. in an mtv iggy interview, basim usman, the bassist of the kominas, describes his brother’s band sunny ali & the kid as being ‘phillistani,’ a term signifying its home base in philadelphia (kishwer, ). more recently, two of the kominas’ members relocated to the so-called ‘phillistan,’ neighboring their brown brothers in sunny ali & the kid.
the logic of this naming convention extends to the band’s personal identities online. the name of basim’s livejournal blog is ‘punkistani’; drummer imran’s twitter handle is ‘rockistani.’ at other times, the kominas’ self-created discourse conflates its pakistani and south asian affiliations. on its official band website, the kominas describes itself as a ‘desi punk outfit comprised of four brown sons of south asian parents.’ this statement subsumes the descriptors ‘brown’ and ‘south asian’ under the label of ‘desi.’ it is unlikely that the members of the band confuse ethnic and racial meanings of these terms. on twitter, basim sent out a declarative message that comments on the relationship between desi, an ethnic identity, and brown, a racial identity: ‘we’re a new breed of desi that doesn’t have to forgo brownness to get into patently non-desi things’ (basimbtw, ). this statement suggests that previously, desi artists who engaged in cultural activities labeled as non-desi had to avoid or veil their race-based position as a brown minority. basim’s assertion encourages the retention of brownness as a racial identifier in contexts markedly non-south-asian. this assertion, i argue, presents a new ethno-racial space for individuals of south asian descent and challenges the essentialist conception of what it means to be ethnically desi. basim’s statement broadens the desi ethnic continuum to include non-desi cultural processes, thus allowing the desi ethnic minority to claim ownership over cultural practices outside of its group. this is evident in the second part of the tweet, ‘to get into patently non-desi things,’ in which basim prescribes an anti-assimilation ideology. according to this stance, a desi individual could retain his or her minority status even when the individual, as an outsider minority, engages in a non-south-asian cultural activity.
i read this redefinition as a politicized proclamation that re-positions ethnicity, not in isolation, but in relation to the material and historical conditions of race and racial injustice. this ethno-racial continuum couches all desi experiences within the minority experience in the postcolonial context. it also enables the racialized notion of brownness to exist broadly in a recontextualized ethnic spectrum that includes both the south-asian and the non-south-asian experiences. basim’s assertion of a brown ethno-racial continuum fits with the psychosocial geography as articulated in the song ‘par desi.’ in the recording, basim’s voice shivers as he sings the chorus line, ‘in lahore it’s raining water, in boston it rains boots.’ the subject in the song defines his physical home in boston, where he experienced an assault by skinhead punks. he sings, ‘they tried to stomp me out, but they only fueled the flame.’ the song narrates a history of migration and the emotions of displacement. it raises the questions, ‘where do i point to blame, when men scatter like moths? / . . . how’d i get here, from a land with long monsoons?’ local alienation fuels the nostalgia for lahore, a home far away from home. this song describes an emotional geography: a spatial containment in boston (and by extension, the united states) in juxtaposition with a safe refuge in lahore, pakistan, remotely located on the south asian subcontinent. musically, this song combines rhythmic elements of ska-punk, a subgenre of punk common in the local music scenes in boston, and bhangra, a dance music from the cultural region in eastern pakistan and north india known as punjab, on the guitar to form the rhythmic backbone. this song ties the minority experience of a race-based alienation in boston to the diasporic identity rooted in ethnic migration.
though this song does not explicitly locate its time-place in any ethnic or racial terms, as a musical statement, it resonates with what basim has declared as the meaning of desi and brown identities. i argue that this unique conception of a brown ethno-racial continuum fuels the band’s efforts in politicizing its seemingly apolitical cultural engagement. the discursive continuum extends into how the band textually represents its own music. in , the kominas and sunny ali & the kid formalized their musical brotherhood by creating an independent record label called poco party in . drummer imran malik states that a motivation behind starting poco party was to engage in the cultural process of taste-making or, in his words, ‘establishing an aesthetic’ (imtiaz, a):
we don’t identify with islam as much as we identify with our pakistani heritage. songs like “pardesi”; you take a typical iktara [a traditional one-string instrument] riff and mix it with reggae and ska. that kind of stuff is exciting to us. the idea is not to fuse these kinds of music, but to take south asian music and translate it into something where you use it with the instruments we know how to play because we grew up in america, we identify with rock culture and the instruments. and the idea of three to five brown kids making music . . . you can use your own vocabulary, so that’s what we’re doing, talking about things that we talk about between ourselves. (imtiaz, b; my italics)
imran connects the band members’ pakistani heritage to the practice of identity expression through music that is markedly south asian. this diasporic affiliation with south asia and pakistan is expressed to be in relation to the experience of being a racial minority in the host country: in imran’s words, as ‘brown kids’ who ‘grew up in america . . .
and identify with rock culture.’ the notion of translating south asian musical elements and pakistani heritage to be understood as ‘american’ rock music, i contend, is more politicized than it appears to be. rejecting the mix-bag model of fusion, imran privileges the act of translation to emphasize the role of agency in cultural mediation. i see this musical translation as a performative act, empowering and politicized. more than just an assertion of a personal identity, this translation highlights the marginal position of a postcolonial translator as a brown racial minority, as someone who is neither black nor white, and grew up listening to and loving rock music, but having never felt embraced by the culture surrounding the music. the kominas’ brown identification alongside its brother band challenges the black and white racial binary that has organized the discourses of rock and punk music (shank, ). the idea of filtering american rock music through the lens of diasporic pakistani experiences and south asian sensibility facilitates the formation of a cultural space for the formerly invisible and silenced minority participants to feel a degree of social comfort and to voice their perspectives. the design of poco party’s website offers further insight into the record label’s aesthetic and ethical orientation. the wallpaper background of the website is a map of british india during britain’s colonial occupation of the indian subcontinent. to begin with, this map reminds web users of the subjugation of the indian subcontinent under the british raj, and the historical legacies of colonialism. this map insinuates poco party’s minoritarian status of being brown in the global postcolony. on the record label’s blog, basim posted a provocative postcolonialist interpretation of the video game series mario brothers.
in his analysis, basim likens the takeover of the mushroom kingdom by the evil bowser (the main antagonist) to the british colonization of india, and the character of toad to the indian raja (indian ruling figure of the british indian empire); he equates the heroic efforts of the mario brothers with the neo-imperialist wars in iraq and afghanistan waged by the west. alluding to spivak’s well-known article ‘can the subaltern speak?’, basim claims that he empathizes with the minor characters in the video game, or in his words, the ‘mute sub-alterns’ of the mushroom kingdom. he explains, ‘at the end of the day, we are all a bunch of turtles weary of white people jumping on us. they can collect extra life after extra life, but when we get stomped we’re gone forever’ (poco party, ). basim’s blog post offers a defiantly anti-colonial critique of the british colonization of india. it is a perspective that adamantly articulates the silenced voices of the colonized. alternatively, the map of the british indian empire could be read as a hope for the re-unification of the subcontinent separated by the violent partition in . i have observed the notion of unification in practice by the band in person. in its performative deployment, unification acts as an emotional compass that brings together individuals across social barriers defined by national origins, ethnicity, religion, and beyond. at their stop in charlottesville on the tour, the members of the kominas accompanied omar waqar, the brainchild behind qawwali punk project sarmust, in the performance of his song ‘return to ambala.’ strumming the chords on his acoustic classical guitar, omar repeated the chorus lines, ‘they call it partition/it’s more like separation.’ he and his backing band moved the crowd to join in a participatory call and response reminiscent of a qawwali chant. i clapped and sang along, feeling an intense emotional unification with everyone in the room.
this performance broadened the notion of unification to fit the immediate context of the united states, creating a social space that was inviting to other misfits. the performance brought together the performers and the audience, muslim, hindu, and christian, pakistani and indian, south asian and east asian, black, white, and brown. furthermore, the band’s forging of a brown identity can be discussed in an explicitly anti-racist and anti-colonial context with respect to discourses about rock music and the postcolonial condition. in an interview with mtv iggy, basim makes note of the band’s recent changes: ‘we’re singing less and less in english. we may pull a bad brains and phase the rock out all together!’ (kishwer, ). basim’s juxtaposition of ‘singing less and less in english’ and ‘phase the rock out’ is intriguing. both phrases imply a movement away from the center. punjabi language, in this example, represents a movement away from the english linguistic center in rock music culture worldwide. the reference to the black hardcore band bad brains further reinforces the decentering of the white-dominated punk rock music, as well as the broader rock music category. in the same interview, basim describes his goal: ‘to get white people fluent in punjabi, so they can teach it to brown people’ (kishwer, ). in this provocative statement, basim makes an analogy between the historical fact of european colonialism and the anglo-american dominance within rock music. the punjabi punk invasion prophesied by basim is not just about linguistic translation. aspiring to shift the power center from white to brown people, basim is hinting at a minoritarian global punk project. basim’s statement thus aims to de-colonialize punk rock music, and subsequently, the global society.
alongside its multi-diasporic brown brothers, the members of the kominas have adapted the notion of unification in flexible ways to engage in minoritarian politics in various settings. remapping the south asian subcontinent and its related diasporic hubs as their own, they have worked to ameliorate the effects of imperialist colonialism of the past and the postcolonial social partitioning of the present. using digitally produced media, the kominas has created an inclusivist space for brown-identified social formation beyond the boundaries of race, ethnicity, and geography. this space is a reparative response in the face of the historical (hebdige, ; kalra, hutnyk, & sharma, ; zuberi, ) and the present (krishnan, ) exclusion of south asian and other minoritarian subjectivities in punk rock.
the taqwacore digital diaspora
the use of social networking site myspace has been documented as a key factor that contributes to the birth of the taqwacore scene (crafts, ). the most accepted narrative about the inception of the scene goes something like this: using social networking site myspace and email, mike knight reached out to various punk rockers of muslim heritage living in north america, forming a network of friends and enthusiasts around the self-identified label of taqwacore. in the summer of , knight joined together with five us and canada-based bands including the kominas, turning this online community into a physical reality. they organized their first collective tour and dubbed it ‘taqwa-tour.’ the bands, along with friends, fans, documentary photographer kim badawi, and filmmaker omar majeed, toured north america, traveling from the us east coast to the midwest in a painted-green school bus that knight had purchased on ebay. the kominas’ do-it-yourself network is comprised of muslim, south asian, and other taqwacore-inspired musicians, listeners, artists, filmmakers, and bloggers.
in a radio interview, basim attributes the nascent formation of the taqwacore scene to the online communication between mike knight and the muslim punk musicians across north america. basim said:
i guess there were a lot of kids playing in various scattered bands, standard rock bands. because, i guess, we all have muslim names, we would be asked questions about islam . . . we got in touch with each other. and we met lots of people who were into punk, into islam already . . . . the internet played a big part in this. we all got connected and i guess we tried to flesh out this idea of a cultural space. (akbar & hsiao, )
i became aware of the taqwacore scene by first listening to this radio interview with basim. then via twitter and myspace, i reached the members of the kominas and scheduled a meeting in boston (this interaction is recounted in the beginning of the article). after i met and interviewed the band, i posted my review of the band’s first album wild nights in guantanamo bay on my blog yellowbuzz.org. this blog post became my entry into the conversation stream brought together by the members of the taqwacore community who were using the #taqx hashtag on twitter. how i entered the social terrain surrounding the kominas is illustrative of the extent to which the band’s social networks are embedded in digital media. i read in the taqwacores zine an apt description of this digital rite of passage, an initiation ritual common to most followers of the kominas and other taqwacore bands. zine contributor paddycakes, a twentysomething, west indian woman living in queens, new york, describes how she became a part of the scene:
a few months into college in ish, a friend of mine told me about this online makeshift book called the ‘taqwacores.’ i didn’t pay too much attention to it, but he made it a huge deal.
he went on and on how there are kids out there like me, ‘brown kids, muslim kids.’. . . i check out online, bought the book off amazon, and read it in like a day . . . weirdly enough, as if mike attached low-jack in the books, he added me on myspace, then the kominas, it spiraled into this huge community. i was excited! i spend loads of time chatting up kids all around the world about music . . . up until a year ago, i finally met face to face with a few online taqx kids. moshing around together to songs we listened to over and over on myspace, emailing each other mp3s that took forever to send since some of us still had dial-up. (paddycakes, , pp. – )
after reading her compelling narrative in the zine, i reached out to paddycakes and became friends with her on twitter. the taqwacores zine consists of six xeroxed, double-sided pages of text and pictures, stapled together. the font style is set in typewriter face. in black and white ink, the aesthetic of the zine resembles the grassroots publications of fan-zines in the punk scenes in london and new york during late s and s (spencer, ). the cover art is a collage of skeletons, punk jacket zippers, a shirtless kurt cobain, and a cartoonish drawing of a bass player with a scarf over his face. the zine was compiled and edited by kait foley and britny rose, both active (white, non-muslim) bloggers within the taqwacore scene. kait sent out a tweet announcing the release of the taqwacores zine. after corresponding with kait on twitter and sending her $ via paypal, i received in the mail a large manila envelope containing the zine, along with a homemade bootleg cd-r of a show recording featuring the kominas, and other taqwacore groups. the availability of free social networking software tools (myspace, facebook, twitter, and so on) has enabled the band to extend its social networks beyond the physical constraints of living as muslim, south asian minorities in north america.
in a punk grassroots manner, the kominas has joined efforts with bloggers, music writers, and their friends and fans in creating a strong presence within various digital social spaces. whether they self-identify or are marked as taqwacore or not, these individuals stake claims to their existence in spite of alienation experienced in the physical world. i explore the spatial dimension of the kominas’ virtual community by asking: where are the kominas’ online friends and fans located in the physical world? where is the band’s ‘digital diaspora’ situated geographically? i deploy the term ‘digital diaspora’ to encapsulate the social and geographical domain of the kominas’ community. conceptually, i extend the notion of ‘virtual diaspora’ as defined in pinard and jacobs’s ( , p. ) work on the online hip-hop community in the african diaspora. the authors build on benedict anderson’s notion of the ‘imagined community’ ( ) to describe the virtual communication of online participants. importantly, they highlight the political nature of the virtual diaspora, as ‘a metaphor for a terrain in which, due to experiential and historical dynamics, social agents position themselves oppositionally, as well as opportunistically, to the status quo or the dominant ideology’ (pinard & jacobs, , p. ). similarly, the digital diaspora carries an underlying oppositional edge. my formulation, however, foregrounds social media technology and examines it beyond the intended end-user roles of social networking sites. for that reason, i emphasize articulations of digital media, focusing on their design and functioning with respect to the formation of the digital diaspora. this theoretical distinction also breaks down the virtual-vs-real binary that is implied in much of the discourse about online and offline communities.
through the example of myspace, i contend that a holistic approach to the study of the internet articulates the embeddedness of digital media in physically embodied social life and a seamless connection between online and offline social dynamics. my notion of the digital diaspora also contrasts with the conventional meaning of diaspora that is often ethnically, racially, nationally, or sometimes religiously determined. in his study of the taqwacore scene participants, for example, murthy ( ) uses the term diaspora (or diasporic) in a conventional manner to refer to ethnicity as an explanation for displacement, for example, the informants’ relationship to an imagined homeland of pakistan and south asia. my reconceptualization, however, foregrounds the internet as a productive site of social interactions and community formation around and across the boundaries of nation, ethnicity, race, and religion. by inserting the term ‘digital,’ i insist upon digital sociocultural processes as not only sites of my research inquiry, but also objects of my study. the digital diaspora redefines the basis of the relationship between home and diaspora, echoing gopinath’s ( ) reading of bhangra as a critique of diasporic thinking. the concept of digital diaspora, i argue, reconfigures the conventional diaspora-home relationship; it highlights the band as a new social home for its friends and fans in a digitally generated and hosted community. to investigate and articulate such a digital diaspora, i used methods of ethnography or computational ethnography. with an awareness of characteristics unique to digital media, i participated while making observations in online communities. i also set out to explore the band’s digital social terrain beyond the web information that an internet browser displays by leveraging software tools such as web scraping and mapping technologies. web scraping refers to a set of programmatic methods designed to extract targeted information from web pages.
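as a minimal illustration of what such extraction involves, the following python sketch pulls location fields out of a single profile page using only the standard library. it is not a reconstruction of the scraper used in this project, which was written in ruby. the sample markup, the class names `locality` and `country-name`, and the helper `extract_location` are all assumptions made for the example; the actual structure of myspace profile pages is not documented here.

```python
from html.parser import HTMLParser

# Hypothetical markup: the class names below are assumptions for illustration,
# not the actual structure of any social networking site's profile pages.
SAMPLE_PROFILE = """
<html><body>
  <div class="profile">
    <span class="name">example friend</span>
    <span class="locality">Kuala Lumpur</span>
    <span class="country-name">Malaysia</span>
  </div>
</body></html>
"""

# Class names treated as location fields (assumed, see above).
LOCATION_CLASSES = {"locality", "country-name"}


class LocationParser(HTMLParser):
    """Collects the text inside elements whose class marks a location field."""

    def __init__(self):
        super().__init__()
        self._current = None   # location field currently being read, if any
        self.location = {}     # field name -> extracted text

    def handle_starttag(self, tag, attrs):
        # An element may carry several space-separated class names.
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in LOCATION_CLASSES:
                self._current = name

    def handle_data(self, data):
        # Keep only non-empty text that sits inside a location element.
        if self._current and data.strip():
            self.location[self._current] = data.strip()

    def handle_endtag(self, tag):
        # Leaving any element ends the current location field.
        self._current = None


def extract_location(html):
    """Return a dict of location fields found in one profile page."""
    parser = LocationParser()
    parser.feed(html)
    return parser.location


location = extract_location(SAMPLE_PROFILE)
print(location)
```

a full crawl would repeat this step for every friend profile and would also have to handle missing or free-form location text, which this sketch leaves out.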
during the period that i was working on this project, the kominas had close to friends on myspace. to extract location information displayed on the profile pages of the band’s friends, i created a software web-scraper in the form of an application programming interface (a.p.i.) in the ruby scripting language. the a.p.i. successfully crawled through the web profile pages of friends of the kominas on myspace and parsed the geographically related text in the source code of these profile pages. using openlayers, a browser-based mapping program, i then turned the extracted geospatial data into a dynamic map that visualizes the friend locations of the band. to fit the format of this article as a textual document, i took a series of screenshots to demonstrate the depth and flexibility of this dynamic spatial visualization. computational methods such as web scraping and web-mapping, though undocumented in ethnographic, (ethno) musicological, and other anthropologically informed scholarship, are a part of an emerging conversation about the use of technology in humanist research within the field of digital humanities. scholars in communication and social sciences, however, have applied similar computational methods such as web-crawling in their works (halavais, ; lin, halavais, & zhang, ). until recently, mapping has been applied only metaphorically as a spatial theoretical framework in many humanities disciplines including music (bohlman, ; caspary & manzenreiter, ; swiss, sloop, & herman, ). traditionally, ethnographers insert a single-page map in the beginning of their monograph to contextualize their narratives geographically. what happens when ethnographers investigate communities comprised of multiple sites, some online and others off-line?
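the step from extracted location text to a web map can be sketched as follows. this python example, again an illustration and not the project’s own ruby code, counts friends per place and writes the result as geojson, a format that browser-based mapping libraries such as openlayers can load as a point layer. the friend list and the hand-entered approximate coordinates are invented for the example, and the function name `to_geojson` is hypothetical; a real workflow would need a geocoding step that is not shown.

```python
import json
from collections import Counter

# Hypothetical scraped output: one (city, country) pair per friend profile.
friend_locations = [
    ("Boston", "United States"),
    ("Lahore", "Pakistan"),
    ("Kuala Lumpur", "Malaysia"),
    ("Kuala Lumpur", "Malaysia"),
    ("Jakarta", "Indonesia"),
]

# Approximate (latitude, longitude) pairs entered by hand for this example;
# a real project would need a geocoding step, which is not shown here.
COORDINATES = {
    ("Boston", "United States"): (42.36, -71.06),
    ("Lahore", "Pakistan"): (31.55, 74.34),
    ("Kuala Lumpur", "Malaysia"): (3.14, 101.69),
    ("Jakarta", "Indonesia"): (-6.21, 106.85),
}


def to_geojson(locations):
    """Build a GeoJSON FeatureCollection with one point per distinct place."""
    counts = Counter(locations)
    features = []
    for place, count in counts.items():
        if place not in COORDINATES:
            continue  # skip places that cannot be geocoded
        lat, lon = COORDINATES[place]
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "city": place[0],
                "country": place[1],
                "friends": count,  # number of friend profiles at this place
            },
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_geojson(friend_locations)
print(json.dumps(collection, indent=2))
```

storing the friend count as a feature property lets the map layer scale or color each point by concentration, which is one way to make regional clusters visible.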
a digital map not only encapsulates the geographical coverage of these projects, but also articulates the intricate dynamics of social interactions across various geographical boundaries. with this custom methodological design, i intend to achieve ‘radical empiricism,’ a term that i use to describe my goals in identifying and documenting. with purposeful attention on specificity and precision, i examine the sociomusical processes that take place in digital social environments and the software infrastructure that supports these interactions. the computational methods that i designed for this project have allowed me to go beyond the textual and discursive dimensions, a path previously unexplored by academic online participant observers. using this dynamic digital map, i have uncovered new visual and geo-spatial patterns of the kominas’ global friend networks. this map not only visualizes, but also helps contextualize the stories of the band’s translocal occupation and diasporic preoccupation. the visualization has made visible patterns of social linkage that i had not anticipated in my physical field research. it has also allowed me to contemplate the spatial dimensions of ethnic belonging and transnational communities while staying rooted in my empirical observations. interestingly, the map shows a salient concentration of online friends in south and southeast asia. the kominas’ connection to pakistan is most likely tied to the band members’ heritage and personal relations to the country. as discussed previously, basim, shahjehan, and imran of the kominas are all of pakistani descent. they have spent significant time living in pakistan. basim and shahjehan lived in lahore and worked as journalists in and early . they played in a band called noble drew and performed three shows in pakistan: two in lahore, one in islamabad in june . drummer imran went to medical school in lahore.
basim and imran, along with two friends whom they met in pakistan, formed a band called the dead bhuttos during their tenure in lahore.
in an interview with a pakistani newspaper, imran expresses on behalf of the band its goal to instigate an independent music scene in pakistan, alternative to the mainstream pop music industry there. he notes:
my take on the music industry here is that there are very few live venues here, one in karachi and now one in islamabad. but there is no place with proper sound and light that’s dedicated to just being a proper music venue. also, there are around six music channels but they don’t seem to promote new music or do stories on bands that are just forming. they’re not really like taste makers, they’re just going with what sells, i find that kind of frustrating. i think it can be changed, and it’s one of the things we’d like to see through. (imtiaz, a)
here imran proclaims the purposes of the transnational outreach of his band and his record label poco party. imran’s declaration implies an independence or ‘indie’ ethos based on an anti-corporation and anti-commercialism attitude. this indie ideology motivates him to establish further connections to pakistan as not only a diasporic subject, but also as a musician and tastemaker. it enables him to curate a unique kind of pakistani american sensibility that is based in nostalgia toward his imagined homeland, a sense of belonging to the country of his ethnic origin. the most surprising pattern that i found on the map is the spread of the kominas’ digital friendship in southeast asia (figure ). malaysia and indonesia are known for a strong presence of local punk scenes in urban centers, in particular kuala lumpur and jakarta (wallach, ), the capital cities of these countries, respectively.
the band is well aware of its geographical position relative to the global punk terrain. in a recent interview, basim launches a defensive remark, reacting to the interviewer who assumed the ‘anglophone world’ as a place of authenticity for punk rock. basim says:
what an ignorant question. anglophone world? punk is ten times bigger in kuala lampur [sic] than it ever will be in the uk, france, or germany. or america. no, the reason for forming the dead bhuttos, and the rush to put a single online was to show, at least cosmetically, that pakistan was as capable of putting out punk rock as turkey, malaysia, japan, and lebanon. the usa is good to sell obscure malaysian and japanese records in, but it’s not a good place to play this kind of music. we’d do much better in south eastern asia, which yes, we get a lot of traffic from online. tons of people from singapore, malaysia, and indonesia add us. we’ve been covered in the major malaysian music magazine. i think it makes more sense for us to play in malaysia than it does to play in europe. (rashid, foley, usmani, & khan, )
figure . map : the kominas’s myspace friend distribution in south east asia.
in his comment, basim expresses his feelings of connection to pakistan and malaysia. he relegates the us to a place of commerce that spawns the consumption and not the production of punk music. with his claim of the us and the rest of the anglophone world as sites of punk inauthenticity, he gravitates toward asia while validating it as a more legitimate site of punk rock music production. in his assertion, basim highlights the asian punk scenes on the global punk map while articulating his desire to forge connections with punk music scenes based in asia. in doing so, he implies his interest in de-centering the euro-american hegemony of punk music.
is the kominas’ friend concentration in southeast asia related to the fact that malaysia and indonesia have a muslim-majority population? religion certainly does not explain the lack of friend concentration in other countries with a muslim-majority population. speaking with basim about my map, i learned about a sectarian difference between the north american and the southeast asian experience of islam. basim said that the southeast asian punk kids that he has befriended online all seem to be ‘very religious.’ they question basim for his lax observance of ramadan and, according to basim, pray five times a day. while alluding to basim’s observations, i am hesitant to draw generalizations regarding the sectarian and regional differences in the practice of islam between north american and southeast asian muslims. nevertheless, the friendship connections to southeast asia could be a reflection of the active participation on myspace among independent and punk rock musicians in indonesia (luvaas, ). the friend concentration in malaysia can be attributed to the malaysia-based users’ high web interest in myspace. based on the analytics of google web search interest, the search interest in myspace by users in malaysia was ranked fourth in the world, at the time of this research, right below puerto rico, the united states, and australia. the meanings and motivation behind forging these social connections, however, call for further ethnographic processes such as peer-to-peer conversations or interviews. finally, these maps show not a cyberpunk fantasy, but a social reality that has burgeoned in a digital space. the kominas has reconfigured the world’s map and created its own punk rock diaspora. this emerging transnational friend-territory, shown in figure , is not just an imagined community.
it is a sociocultural space created by punk rock sound and the exchanges of mix-tapes, mp s, face-to-face visits, shows, tweets, zines, blog posts, hyperlinks, virtual hugs, encouragement and strength. the kominas has performed the cultural work of building a translocal home away from home. the band’s friendship network is like an archipelago, scattered across bodies of water. this archipelago of friends is arguably a new kind of diaspora that is not only digitally constructed by the band, but also digitally articulated via emerging digital ethnographic research techniques. this digital diaspora radically transgresses the boundaries between muslim and non-muslim territories, a highly charged geographical distinction after september . it also traverses the orientalist east-west binary, a geopolitical construction that reinforces the differences between the two halves of the world on either side of the levantine coast (prashad, ; said, ). this space has enabled many misfits, ‘brownies,’ immigrants, queers, and punks to congregate and interact without feeling like outcasts. paddycakes captures this sentiment in her contribution to the taqwacores zine: ‘feeling a sense of understanding, a beginning point. we are all still so different from each other. taqx isn’t about who’s more hardcore than who, or which is the loudest, or even who wears the most spikes. it’s about the sheer happiness we have seeing each other and as soon as we do, hugs are given away freely’ (paddycakes, , p. ).
conclusion
by articulating, quite literally, the contour and shape of the kominas’ transnational community via the technique of web mapping, i have extended josh kun’s conception of audiotopia from the sound-based psychoacoustic dimension to the ethnographic sociocultural register.
working toward a shared postnational utopian vision, the musicians assert their creative agency in both musical performance and social actions. in the case of the kominas, music-making has led to various facets of social organizing and real-life consequences such as d.i.y. tours, recording production, performance exchanges, hangouts, and record label formation. borrowing from foucault’s conception of archipelago, i contend that the kominas’ archipelago, as illustrated in my mapping project, is in fact ‘physically dispersed yet at the same time covers the entirety of a society’ (foucault, , p. ). unlike foucault’s archipelago, the kominas’ counterpart is not a punitive system itself, but a subversion of one. it engages in a constant struggle to survive and flourish in the midst of past and present global inequities left over from the legacy of colonial occupations. a steadily growing network, this global archipelago fosters a refuge for its member islands, while countering the forces that impinge upon its dispersed but powerful existence.
figure . map : the kominas’s myspace friends worldwide (satellite view).
as i am writing this conclusion, i recall the taqwacore support of the political events in egypt, tunisia, and their neighboring countries known as the arab spring. on february, , the day the egyptian dictator mubarak fell, tazzystar, a bay-area-based activist/blogger whom i began to follow via our mutual love for the kominas, tweeted a widely circulated message. she made an unequivocal declaration that ‘revolution is taqwacore’ (tazzystar, ). this tweet evoked a transnational, anti-status-quo solidarity to support the revolution, and staked claims to the meaning of the arab spring through the lens of taqwacore. along with two taqwacore-affiliates based in indonesia, i retweeted this message.
acknowledgements
the author is grateful to the scholars’ lab at the university of virginia for the digital humanities graduate fellowship, and to basim usmani, shahjehan khan, imran malik, and other members of the kominas for their friendship.
notes
1. on my personal blog [hosted on http://yellowbuzz.org], i compiled and shared a set of field notes, music reviews, and interviews. the blog not only made my work publicly available on the internet to the musicians and their friends and peers in my study, but also contributed to the building of the musician-informants’ social networks.
2. qawwali is devotional music associated with sufism, a mystic sect within islam in india and pakistan. the practice of qawwali extends throughout the indian subcontinent, but its roots are tied to north india. for more on qawwali, see qureshi ( ).
3. bad brains started as a fusion jazz band, but came into fame as a hardcore band in the late s and early s. later the band developed a strong reggae and metal sound, departing from its genre anchors in punk and hardcore music. for more about bad brains, see darryl ( ).
4. the term ‘web-crawling’ is sometimes used synonymously with web-scraping. typically, crawling refers to extracting all information on the web, as google’s search engine does, whereas web-scraping refers to extracting specific online information. scholars in computer science, communication, and the social sciences have nevertheless used web-crawling technology in their studies of web-based communities and populations. for an example, see lin et al. ( ).
5. in openlayers, i inserted a base layer of the world’s regions (marked by various shades of green in the background) to help contextualize the friend distribution across continental boundaries. i have documented the technical process of this project on my blog: wendyhsu ( , january ).
6. i have hosted this dynamic web map on a server, making it available for interested researchers to interact with as a mode of ethnographic data discovery and representation. the dynamic map is hosted at: http://beingwendyhsu.info/soundmaps/thekominas/.
7. as i was finishing my dissertation, the taqwacore movements in indonesia and malaysia were emerging. in october , marwan kamel, the frontman of chicago-based taqwacore band al thawra, started to use the hashtags #indotaqx and #mtaqx in an effort to locate and connect all taqwacore-identified users on twitter. up until this point, there had been tweets labeled with these hashtags. i will continue to monitor and document this formation of the transnational social networks.
8. this study is documented in the form of analytics by google. for more details, see google ( ).
notes on contributor
wendy hsu is a music ethnographer and musician who engages in sonic materials from asia, continental and diasporic. she received her phd in critical and comparative studies from the music department at the university of virginia. as a mellon postdoctoral digital scholarship fellow at the center of digital learning & research at occidental college, she engages with multimodal ethnographic methodology and digital pedagogy. she blogs at http://beingwendyhsu.info.
references
akbar, a., & hsiao, a. (producer). ( , march ). the kominas: south asian muslim punk [audio podcast]. retrieved from http://www.asiapacificforum.org/show-detail.php?show_id=
anderson, b. ( ). imagined communities: reflections on the origin and spread of nationalism. new york, ny: verso.
basim, btw. ( , september ). we’re a new breed of desi that doesn’t have to forgo brownness to get into patently non-desi things [twitter post].
retrieved from http://twitter.com/#!/basimbtw/status/
bohlman, p.v. ( ). pilgrimage, politics, and the musical remapping of the new europe. ethnomusicology, , – . doi: . /
caspary, c., & manzenreiter, w. ( ). from subculture to cyberculture?: the japanese noise alliance and the internet. in n. gottlieb & m. j. mclelland (eds.), japanese cybercultures (pp. – ). new york: routledge.
crafts, l. ( , july ). taqwacore: the real muslim punk underground [national public radio]. retrieved from http://www.npr.org/templates/story/story.php?storyid=
darryl, j.a. ( ). play like a white boy: hard dancing in the city of chocolate. in horse, k.c. (ed.), rip it up: the black experience in rock ‘n’ roll (pp. – ). new york, ny: palgrave macmillan.
elivindotorg ( , december ). looking forward to : the kominas and sunny ali and the kid – taqwacore’s leading men [web log message]. retrieved from http://elivin.org/ / / /looking-forward-to- -the-kominas-and-sunny-ali-and-the-kid-taqwacores-leading-men/
foucault, m. ( ). questions on geography (c. gordon, l. marshall, j. mepham, & k. soper, trans.). in c. gordon (ed.), power/knowledge: selected interviews and other writings ( – ), (pp. – ). new york, ny: pantheon books.
google. ( ). web search interest: myspace. worldwide, –present [web analytics]. retrieved from http://www.google.com/trends/explore#cat=&q=myspace&geo=&date=&clp=&cmpt=q
gopinath, g. ( ). ‘bombay, u.k., yuba city’: bhangra music and the engendering of diaspora. diaspora, , – . doi: .
halavais, a. ( ). national borders on the world wide web. new media & society, ( ), – . doi: .
hebdige, d. ( ). subculture: the meaning of style. london: methuen & co, ltd.
imtiaz, h. ( a, november ). punking pakistan up [the express tribune]. retrieved from http://www.karachidigest.com/articles/news/punkingpakistan-up/
imtiaz, h. ( b, october ). music channels are not tastemakers, they are just going with what sells. [the express tribune].
retrieved from http://tribune.com.pk/story/ /music-channels-are-not-taste-makers-theyre-just-going-with-what-sells/
kalra, v., hutnyk, j., & sharma, s. ( ). resounding (anti)racism, or concordant politics? revolutionary antecedents. in s. sharma, j. hutnyk & a. sharma (eds.), dis-orienting
rhythms: the politics of new asian dance music (pp. – ). atlantic highlands, nj: zed books.
kishwer. ( , august ). q & a with the kominas basim usmani & imran malik part ii: ‘we’re singing less and less in english’ [web log message]. retrieved from http://blog.mtviggy.com/ / / /q-a-with-the-kominas-basim-usmani-imran-malik-part-ii-were-singing-less-and-less-in-english/
knight, m.m. ( ). the taqwacores. brooklyn, ny: autonomedia.
kominas, the. ( ). wild nights in guantanamo bay [cd]. new york, ny: poco party.
krishnan, m. ( ). how can you be so cold? in s. duncombe & m. tremblay (eds.), white riot: punk rock and the politics of race (pp. – ). brooklyn, ny: verso.
kun, j. ( ). audiotopia: music, race, and america. berkeley: university of california press.
lin, j., halavais, a., & zhang, b. ( ). the blog network in america: blogs as indicators of relationships among us cities. connections, ( ), – . retrieved from http://www.insna.org/connections-web/volume - /lin.pdf
luvaas, b. ( ). dislocating sounds: the deterritorialization of indonesian indie pop. cultural anthropology, , – . doi: .
maira, s. ( ).
missing: youth, citizenship, and empire after / th. durham, nc: duke university press.
mtv iggy. ( , july ). sharia blawg in the usa! the kominas on tour [web log message]. retrieved from http://www.facebook.com/note.php?note_id=
murthy, d. ( ). communicative flows between the diaspora and ‘homeland’: the case of asian electronic music in delhi. journal of creative communications, ( ), – . doi: .
murthy, d. ( ). muslim punks online: a diasporic pakistani music subculture on the internet. south asian popular culture, ( ), – . doi: .
paddycakes. ( , february). i’m not punk, i’m not taqwacore. no wait, i’m just me. . . in foley, k. & b. rose (eds.), the taqwacores zine (pp. – ).
palumbo-liu, d. ( ). multiculturalism now: civilization, national identity, and difference before and after september th. boundary , ( ), – . retrieved from http://muse.jhu.edu/journals/b /summary/v / . palumbo-liu.html
pinard, a., & jacobs, s. ( ). building a virtual diaspora: hip hop in cyberspace. in m. d. ayers (ed.), cybersounds: essays on virtual music culture (pp. – ). new york, ny: peter lang publishing.
poco party. ( , september ). the mushroom kingdom’s colonial reign [web log message]. retrieved from http://pocoparty.com/archives/the-mushroom-kingdoms-colonial-reign
prashad, v. ( ). everybody was kung-fu fighting: afro-asian connections and the myth of cultural purity. boston, ma: beacon press.
qureshi, r. ( ). sufi music of india and pakistan: sound, context, and meaning in qawwali. new york, ny: cambridge university press.
rashid, h., foley, k., usmani, b., & khan, s. ( , february ). taqwacore roundtable: on punks, the media, and the meaning of ‘muslim.’ religion dispatches. retrieved from http://www.religiondispatches.org/archive/culture/ /taqwacore_roundtable% a_on_punks,_the_media,_and_the_meaning_of_%e % % cmuslim%e % % d
said, e. ( ). orientalism. new york, ny: random house.
shank, b. ( ). from rice to ice: the face of race in rock and pop.
in the cambridge companion to pop and rock. new york, ny: cambridge university press.
spencer, a. ( ). diy: the rise of lo-fi culture. new york, ny: marion boyars.
swiss, t., sloop, j., & herman, a. (eds.). ( ). mapping the beat: popular music and contemporary theory. malden, ma: blackwell.
tazzystar. ( , february ). revolution is taqwacore [twitter post]. retrieved from http://twitter.com/tazzystar/status/
tgrionline. ( , may ). the drop: the kominas – boston’s favorite desi punks . . . get schooled in tacqwacore! [web log message]. retrieved from http://elivin.org/ / / /the-drop-the-kominas-bostons-favoite-desi-punks-get-schooled-in-tacqwacore/
thekominas. ( , november ). #brownbandreparations rt@sunnyalithekid: everybody gets a video game video, we want a video game video too [twitter post]. retrieved from http://twitter.com/thekominas/status/
wallach, j. ( ). modern noise, fluid genres: popular music in indonesia, – . madison: university of wisconsin press.
wendyhsu. ( , january ). mapping an asian american indie rock digital diaspora [web log message]. retrieved from http://beingwendyhsu.info/?p=
zuberi, n. ( ). sounds english: transnational popular music. chicago: university of illinois press.
commentary
editor’s note: this is one of a series of commentaries on the future of scientific publishing. for a listing of the other commentaries, see http://www.jneurosci.org/cgi/content/full/ / / .
will research sharing keep pace with the internet?
richard k. johnson
scholarly publishing and academic resources coalition, washington, dc
the ways scientists share and use research are changing rapidly, fundamentally, and irreversibly. the signs are plain to see.
e-mail and a growing range of other network technologies efficiently and rapidly link researchers from around the globe and enhance informal communication. most scientific literature is now created in digital form and, in nearly every discipline, some scholarship is digital-only or can be fully understood only in digital form. google has cataloged more than eight billion web pages and a billion images, and is now undertaking to digitize books on a scale that previously seemed unthinkable.
these changes signal a new era of digital scholarship. many of yesterday’s limitations on research and learning are being swept away by the internet. for the first time in history, we have a practical opportunity for efficient, unlimited sharing of information at virtually no cost beyond that of providing it to the first reader.
dynamics of change
many elements of the process of scientific exchange have been quick to respond to the opportunity. for example, biomedical researchers have used the internet to rapidly form new or ad hoc communities of scientists in response to health crises such as severe acute respiratory syndrome and avian influenza. scientists using the interconnectivity of the web have begun to break down information silos, allowing interdisciplinary perspectives on complex questions and vexing challenges, and teams of investigators in far-flung time zones work together effectively and easily, quickly sharing information.
however, journals have been comparatively slow to embrace the potential of the ubiquitous network. true, online editions are now the norm for most journals and online reference linking has made it easier to navigate the literature. but fundamentally, most online journals are simply digital editions of their print analogs, little changed since they were invented years ago.
why haven’t journals evolved more rapidly? the culture of academe and its “prestige economy” is one factor impeding change.
academic career advancement depends on publishing in leading, well established journals, journals that may have little incentive to alter their model. also, economies extrinsic to science have grown up around the sale (and now lease, in the digital context) of journals. change has sometimes been held back by efforts to protect publishing revenues and profits. related to this is the desire of many publishers to rigorously defend “their” intellectual property (the texts provided to them by scholarly authors, together with editing, formatting, and other enhancements) in the digital environment through licensing restrictions. new technical protection schemes for intellectual property could make matters worse yet for information users.
but scientists and scientific organizations, including the society for neuroscience (sfn), increasingly are asking themselves, what could journals become if freed to do all that they might for the advancement of science?
the national science foundation’s blue-ribbon advisory panel on cyberinfrastructure looked at the overall scientific communication process and concluded that “the traditional linear, batch processing approach to scholarly communication is changing to a process of continuous refinement as scholars write, review, annotate, and revise in near-real time using the internet” (national science foundation, ). this points to the need for a rather dramatic change by traditional journals if they are to keep up.
a team of distinguished scientists in wide-ranging fields, brought together by microsoft research in to look ahead at the transformation of science, envisioned “the rise of new kinds of publications, not merely with different business models, but also with different editorial and technical approaches” serving research needs that will evolve with science itself.
particularly intriguing is their comment that “these developments will not only reflect changes in the way research is done but in some cases may also stimulate them” (microsoft research, ). today journals are a record of research, but perhaps in the near future they will be vehicles for real-time, iterative, collaborative refinement of scientists’ understanding of research.
as these prognostications suggest, the scientific paper and its historic container, the journal, are poised for change. the possibilities and demands of science together with new enabling technologies are just too compelling to resist.
received july , ; accepted july , .
r.k.j. was the founding executive director of sparc (scholarly publishing and academic resources coalition), a position he held from to . currently he is a scholarly communications consultant and senior advisor to sparc.
correspondence should be addressed to richard k. johnson, scholarly publishing and academic resources coalition, dupont circle nw, washington, dc . e-mail: rick@arl.org.
© richard k. johnson. licensed under a creative commons attribution-noncommercial . license.
doi: . /jneurosci. - .
copyright © society for neuroscience - / / - $ . /
the journal of neuroscience, september , • ( ): –
research sharing
the internet offers the opportunity to eliminate access barriers that limit use of scientific findings, to share research freely among all potential readers. because scientific discovery is a cumulative process, with new knowledge building on earlier findings, it is counterproductive to keep research locked up like books in a fourteenth-century monastery.
the large audience for freely accessible scientific knowledge may be surprising to many, but the hunger for it is apparent from the experience of the national library of medicine (nlm). a few years ago, nlm transformed its fee-based index and abstracts of biomedical journal articles to free availability on the web as pubmed.
use of the database increased -fold once it became freely available. the potential scope of this usage could never have been anticipated by looking solely at use of the controlled-access version.
who are these new readers? they surely include scientists around the globe at institutions that may not be able to afford needed journals. they also may be researchers in unexpected fields, search engine users who didn’t realize previously they could use work in a seemingly unrelated field. they may be students, patients or their families, physicians, community health workers, or others from the general public: taxpayers who finance so much biomedical research.
much of the thinking about new ways to share scientific knowledge with these readers and about new economic models to sustain the process revolves around two complementary strategies.
open-access journals
open-access journals, whose costs are typically covered through advertising, dues, publication fees, sponsorships, in-kind contributions, or a combination of these and other sources of support, are emerging as an alternative to the traditional subscription model. according to the directory of open access journals, there currently are open-access journals in wide-ranging fields. this is a good start, but so far it represents only about a tenth of the world’s peer-reviewed journals.
online open archives
commonly hosted by universities or government agencies to advance their mission, online open archives provide free access to articles, supporting data, working papers, preprints, images, and other material deposited by members of an institutional or disciplinary community. in biomedicine, the national library of medicine’s pubmed central online archive is the best known open archive, but many universities have also established “institutional repositories” to preserve work conducted at their institution. open archives supplement journal browsing and readership; they don’t replace it.
the outlook for the future of open archives is framed in large part by the sometimes-conflicting terms and obligations of authors’ agreements with their funders and the journals in which they publish.
a discussion of the merits and tactics for each of these open-access strategies is beyond the scope of this commentary, but suffice it to say that neither spells the end of science or peer review, as skeptics have suggested. however, both involve the unbundling of the functions journals have traditionally performed: registration: establishing the intellectual priority of research; certification: certifying the quality of the research and the validity of the claimed finding; awareness: ensuring the dissemination and accessibility of research, providing a means by which researchers can become aware of new research; and archiving: preserving the intellectual heritage for future use (roosendaal and geurts, ).
these functions can now be distributed via the internet among various service providers, not just a journal. no longer is it obligatory for the certification of research quality (e.g., the peer review process overseen by a particular editorial board) to be hardwired to its dissemination; they can be independent. this disaggregation opens the door to a more dynamic communication environment.
the role of funders
not surprisingly, many governments and funding agencies around the world recognize that dissemination of research results is part of the research process itself, that the impact of the research they fund will be magnified by bringing down barriers to its use. increasingly, funders are implementing or exploring policies to facilitate the sharing of information and realize the benefits of digital scholarship.
the national institutes of health (nih) has been among the highest visibility agencies to open the door to research sharing with its public access policy, aimed at securing a permanent digital archive, enhancing management of its research portfolio, and ensuring that findings are available to all potential users. at this writing, the policy requests that nih investigators voluntarily deposit their final peer-reviewed manuscripts in pubmed central. (however, a mandatory deposit policy may be on the horizon.) the nih also allows grant funds to be used to pay journal publication fees charged by some open-access journals.
the united states congress has taken a growing interest in ensuring access to federally funded research. indeed, the nih policy was framed in response to congressional pressure. now pending in congress is the federal research public access bill (s. ), introduced in may , which would require that research supported by major government funding agencies be freely available online within months of publication in a journal.
interest is hardly limited to the united states. the wellcome trust, the united kingdom’s largest private biomedical research funder, has played an international leadership role by requiring its grantees to submit an electronic copy of the final manuscripts of their research papers into pubmed central. wellcome also provides grantholders with additional funding to cover publication fees charged by some open access journals. other uk funders have followed suit, including the government’s biotechnology and biological sciences research council (bbsrc) and medical research council (mrc). they announced recently that all their funded researchers will be required to submit a copy of their final manuscript “at the earliest opportunity,” with the mrc stipulating that the works be made available “certainly within months” (biotechnology and biological sciences research council, ; medical research council, ).
the canadian institutes of health re- search (cihr) is exploring development of policies governing access to physical products of research (e.g., cell lines, dna libraries), data typically deposited in pub- lic databases (genomic data, dna se- quences, and protein sequences), and peer-reviewed published results. its goal is to increase access to cihr-funded re- searchers’ discoveries and, in so doing, “stimulate the development of new health products that will benefit the health of ca- nadians as well as the global population” (canadian institutes of health research, ). these kinds of policies and actions may ultimately break the gridlock that is holding back the evolution of journals. funders have a unique perspective on the outcomes of research that transcends the narrower interests of other stakeholders. their influence can overcome some of the coordination problems associated with • j. neurosci., september , • ( ): – johnson • will research sharing keep pace with the internet? change and the emergence of new norms for research sharing. a new era of opportunity by seamlessly linking data, knowledge, and users, the emerging research environ- ment promises to catapult science ahead. and, given the complex scientific, social, and economic challenges that face us, the arrival of these new capacities is coming none too soon. to its credit, the society for neuro- science is taking steps to embrace change rather than guard the status quo that se- duces so many successful organizations. the guiding principles of sfn�s “open access publishing strategy” (http://www. 
sfn.org/index.cfm?pagename � strategic- plan# ) well capture the spirit with which all societies should approach the transi- tion ahead: recognize the value and likeli- hood of open access publishing and be ready with an effective strategy when this happens; maintain the ethos of scientific publishing (i.e., that it is by and for scien- tists and that the advancement of science ranks above all other publishing motives); maintain peer review as an essential ele- ment in any open access format (society for neuroscience, ). this kind of constructive approach will go a long way toward ensuring that neuroscience and sfn advance and flourish in a time of great change and opportunity for science, the era of the internet. references biotechnology and biological sciences research council ( ) bbsrc’s position on de- posit of publications. retrieved july , , from http://www.bbsrc.ac.uk/news/articles/ _june_research_access.html canadian institutes of health research ( ) cihr policy in development, access to products of research. retrieved july , , from http://www.cihr-irsc.gc.ca/e/ .html medical research council ( ) mrc guid- ance on open and unrestricted access to pub- lished research. retrieved july , , from http://www.mrc.ac.uk/open_access microsoft research ( ) towards science, p . retrieved august , , from http:// research.microsoft.com/towards science/ downloads/t s_reporta .pdf national science foundation ( ) revolu- tionizing science and engineering through cy- berinfrastructure: report of the national sci- ence foundation blue-ribbon advisory panel on cyberinfrastructure, p . retrieved au- gust , , http://www.nsf.gov/od/oci/ reports/atkins.pdf roosendaal he, geurts pa ( ) forces and functions in scientific communication: an analysis of their interplay. crisp . retrieved august , , http://www.physik. kuni-oldenburg.de/conferences/crisp / roosendaal.pdf society for neuroscience ( ) strategic plan. 
retrieved july , , from http://www.sfn.org/index.cfm?pagename� strategicplan# johnson • will research sharing keep pace with the internet? j. neurosci., september , • ( ): – • alexandria, volume , no. ( ) published by manchester university press http://dx.doi.org/ . /alx. social media use by the us federal government at the end of the presidential term debbie rabina, anthony cocciolo and lisa peet abstract the purpose of this study is to describe in quantitative and qualitative terms the use of social media by the us government. during the autumn of the researchers collected and examined over , unique social media sites used by the executive, legislative and judicial branches of government. this data was collected as part of a national web archiving initiative known as the end of term harvest, where us government websites are web archived in anticipation of changes prompted by the election. we found that social media is used heavily across all federal agencies and that they utilize a variety of social media platforms, with the most popular being facebook, twitter, youtube and flickr. the qualitative examination revealed that agencies use social media to provide the public with information and to engage the public in conversation through the feedback and comment mechanisms enabled by the social media providers. however, we did not find evidence that social media is enabling high levels of collaboration between government and citi- zens, which was a goal stated in obama’s transparency memorandum. introduction the united states is increasing its adoption of social media as part of its official communication on both the federal (hansell, ) and local levels (stelter and preston, ). 
however, the content of social media in the .gov domain or on third-party social media providers is not considered official government information, and is therefore not subject to the legal requirements for collection, retention, preservation and access that apply to official information published by the us government. because it is not preserved, this content is considered at risk of disappearing and becoming unavailable to the public. this could be a loss to future researchers who may be interested in seeing how the government interacted (or failed to interact) with its citizens through then-emergent information and communications technologies. to preserve this content, a group of organizations (the library of congress, the internet archive, university of north texas, california digital library and the government printing office) included social media in a larger project to document the web presence of the us federal government in the months preceding the presidential election. this larger project of preservation of web content is known as the end of term harvest (eot), and this particular project focuses on the web archive of social media content. in this project, we collected and studied , social media urls from official government websites, and of those , unique urls were included in the eot archive. using this social media archive, this study aims to show how the us federal government incorporates social media within its official web-based communications and how it uses social media. we pose the following research questions:

rq : who uses social media in the us government?
rq : what social media platforms does the us government use?
rq : what observations can be made of the content available from the us government via social media platforms?

first, we will describe the use of social media by the us government and some challenges associated with its use.
second, we will describe the eot harvest project and our study methodology. lastly, we will present our findings regarding the use of social media by the us federal government.

literature review

social media and the us government

social media are web-based platforms that employ web . , which is a series of design patterns and approaches to structuring web-based systems that capitalize on the networked information environment, enabling the web to better support the use, production, and circulation of information in a peer-to-peer networked arrangement (cocciolo, ; benkler, ). these platforms rely on individual production and user-generated content, and are designed to support participation and individuation through such mechanisms as profile pages, which often state explicit likes, interests, and friendships (o’reilly, ; cormode and krishnamurthy, ). the largest and most visible examples of social media are facebook and twitter, whose content is almost entirely dependent on the activity and engagement of users.

the use of social media in the us government is decentralized and managed by each agency or department individually. from a policy perspective, the use of social media draws from the transparency memorandum issued by president obama in his first weeks in office (obama, ). the transparency memorandum recognized the importance of openness in government as a way to strengthen democracy and called for government to be transparent, participatory, and collaborative; these principles were formalized in the open government directive issued in late (open government directive, ). while the policies that direct open government come from the office of management and budget (omb), which operates within the white house, implementation guidance is provided by the general services administration (gsa), an independent agency that supports the operational aspects of the federal government.
gsa maintains a register of social media accounts in government (howto.gov, b), which ironically is not made public, but does allow users to verify that an account is indeed affiliated with the us government. the gsa social media navigator (u.s. general services administration a) provides guidance for employee responsibilities when accessing social media services in an official capacity. it is directed to gsa employees and contractors, and is designed to adhere to the standards of ethical conduct for employees of the executive branch ( c.f.r. part ), the conflict of interest statutes ( u.s.c § - ), and the hatch act ( u.s.c. § – ). gsa clarifies that information provided through social media may come only in addition to, and not in place of, official communication channels such as government websites (u.s. general services administration, a). gsa offers government employees online training on the use of social media, with an emphasis on ethical aspects of social media (u.s. general services administration, n.d.). government employees are cautioned not to disclose on social media information that is protected by other statutes or that could further personal interests such as endorsing services, products, or businesses (u.s. general services administration, a). further, adoption of social media software is subject to equal access laws. disabled persons should have the same access rights to government information disseminated through social media channels as they do to all information disseminated by the government (u.s. general services administration, a). in addition to disseminating information, the government may use social media to collect information as part of conducting official business. collecting information from the public via social media is permitted when the collection is voluntary, does not place a burden on participants, and public dissemina- tion of results is not intended (u.s. general services administration, b). 
when collecting information from the public, agencies must comply with privacy laws and regulations and specify how private and personally identifiable information will be used (u.s. general services administration, b). the gsa provides guidance and information on the use of social media in government, including policies, best practices, and resources (howto.gov, c). the policies discuss risk mitigation (federal cio council, ), terms of service agreements (howto.gov, a), and government efficiency, specifically addressing the need to reconcile with the paperwork reduction act (pra) (sunstein, ). this last memorandum clarifies that the pra does not apply to many uses of social media. it describes social media as being used by agencies to engage the public by means of ‘publishing’ solicitations for public comment and for conducting ‘virtual public meetings’ (sunstein, ).

the transparency memorandum legitimized the use of social media by the government as a means to support the goals of transparency, participation, and collaboration. this led to widespread use of social media by the government, which caught the attention of the web archiving team as it prepared to archive government websites at the conclusion of the presidential election.

recent research on social media and transparency in government

the increased use of social media in government is of interest to policy makers and researchers alike. recent research on the use of social media in government largely centres on the three goals of obama’s transparency memorandum: transparency, participation, and collaboration. although us federal agencies have been using new platforms such as wikis, blogs, flickr, facebook, twitter, and youtube for only a short time, these platforms have become ubiquitous, with the obama administration leading by example (bertot, jaeger and hansen, ).
we discuss the recent research on social media in government in terms of transparency, participation, and collaboration, and conclude with a set of challenges.

transparency

one of the primary goals of the proposed shift is transparency of government operations, both between agencies (cain, ) and, more important, from a citizen perspective. bertot, jaeger and grimes ( ) propose that this can be an important anti-corruption tool, both in the united states and elsewhere. such factors as the dissemination of information, timely release of materials as requested, facilitation of public meetings, and the ability for whistleblowers to make themselves heard can all work to empower the public and keep government accountable. desirable outcomes such as administrative reform, law enforcement and social change can be effected through the use of social media, and such platforms as blogs and wikileaks can serve as an alternative press. however, much of the success or failure of these initiatives depends on the culture of openness already in place; challenges are often more sociological than technological. transparency also raises issues of trust, including the risk of privacy violation and the separation of professional and private roles in social media (kavanaugh et al., ).

participation

one desired outcome of transparency and accessibility of government operations is citizen participation or ‘shared governance promoting democracy’ (editorial, , p. ). kavanaugh et al. ( ) examine localized government media use to identify several current initiatives and some issues surrounding user participation. almost one-third of all online adults in the us use social tools to keep up on government activities, and this includes minorities; a pew study shows no significant gap for latinos or african americans (smith, ). cell phone use also extends the government’s reach. kavanaugh et al.
report that much local social media participation originates around issues of public safety, both at the civil and emergency level: monitoring public opinion, social convergence, community issues, response to crises, and tracking civic-related themes. this can enable agencies suffering from budget cuts to extend the government’s information reach, and citizens’ ability to mobilize. however, while social media can outpace the government’s official apparatus and mainstream media, this also results in unchecked sources; both information and misinformation spread more quickly. with the change in administrations has also come a shift from closed network technology to third-party applications. the department of state has traditionally been ‘the outward face of the united states to the world’ (cain, , p. ), so its origination of new technology has been a logical step. its web presence—www.state.gov—produces and archives information for the public. it uses various social media platforms and a number of web pages for different aspects of the state department’s administration, with varying degrees of successful interaction. however, cain ( ) indicates a need for consistent internal policies about defining the media contained within and its searchability. bertot, jaeger and hansen ( ) point out existing policy instruments that advocate for those with disabilities or cultural disenfran- chisement issues, which predate the government’s push for citizen participa- tion but which can be leveraged to realize these objectives. collaboration collaboration between the government and public has evolved from neigh- bourhood watches and an auxiliary police force to electronically facilitated collaborations, with the potential for a scenario where the ‘government treats the public not as customers but as partners’ (linders, , p. ). 
key opportunities for collaboration include democratic participation and engagement, where the public can enter into constructive dialogue; co-production, in which the public is involved in the development, design and delivery of government services; and crowdsourcing innovations and the development of solutions (bertot, jaeger and hansen, ). changing boundaries between the government and the public result in a greater diversity of relationship types. linders ( ) identifies the need to define categories of co-production in terms of these relationships, the spectrum of public service delivery partnerships, and collaborative activities. these collaborations can move in the direction of government-to-citizen (the delivery of highly personalized decision-influencing information, embedding government capabilities such as data.gov into the greater ecosystem, and open book government); citizen-to-government (e-participation and e-rulemaking input, crowdsourcing for problem solving, and citizen reporting); and citizen-to-citizen (collective action, self-service community organization, and self-monitoring in the form of evaluation and complaint platforms).

challenges to social media use by governments

although social media platforms have enjoyed popular success, they pose a series of challenges to government agencies looking to use them to communicate and exchange information with members of the public. a first challenge is related to advertising. in the commercial marketplace, social media platforms have become formidable advertising vehicles, with companies of all sizes using them to connect with consumers (li and bernoff, ). users of social media can expect sponsored advertising designed for their particular demographic group to be displayed alongside all other content.
this can be a challenging environment for government agencies, which may not want their services combined with commercial offerings out of a wish to remain vendor neutral (hemphill, ). for example, suppose a member of the public discusses the time requirements for getting a new passport with a government official via that agency’s facebook page. this content could be mined, and advertising for passport expediting services could be provided to the user, with the service provider willing to pay the most for the advertising space appearing the most prominently. this advertising could be useful to the user (especially one who is in a rush and can afford the service), but it thwarts the government agency’s attempt not to privilege one vendor over another. figure illustrates how government information and advertising coalesce on facebook.com.

a second concern is related to privacy. for example, a user can inadvertently reveal private information via a public function of a social media platform (e.g. attempting to discuss a private matter, such as receiving benefits from assistance programmes, on an agency’s public facebook page). design features for redacting private information, without removing the entire online contribution outright, are often lacking in social media sites. of course, even deleting information from a public portion of a social media website does not mean that all copies have been destroyed. the provider may have additional copies available in non-public locations.

a third challenge associated with government agency use of social media is the possibility of those sites acting as ‘walled gardens’, giving the commercial social media provider control over how that information is curated and made available to users, if at all (berners-lee, ). returning to our prior example, there is little preventing a social media provider from highlighting government agency information that is amenable to advertising (e.g.
passport expediting services), versus information that is less conducive to advertising. further, nothing keeps social media providers from burying (or not showing) government information if they believe it may cause the user to exit the ‘walled garden’ for the open internet or another ‘walled garden’. social media providers are motivated to have users spend as much time as possible within their sites because the more time users spend in them, the more opportunities there are to expose them to sponsored advertising (friesen, ). any content that is not engaging or pleasant to users, which could be determined probabilistically from past behaviours in the social media environment, may be subject to being buried.

a fourth issue related to making government information available on social media platforms is the widespread inaccessibility of these platforms in some countries. for example, at the time of writing, facebook, twitter, and youtube were not available at all in mainland china (at least not without the use of a proxy server outside of china and some technical know-how). us citizens living abroad would benefit from this information being available elsewhere (e.g. a non-blocked webhost or at an embassy or consulate).

a fifth issue is that social media platforms pose a challenge for government agencies because they are difficult to archive. if the information put into social media were to be considered government records, the difficulty in web archiving could complicate record-keeping practices and hinder the eventual transfer of such records to the national archives. the difficulty of web archiving social media is caused by the information being layered within complex client-side web interfaces that do not necessarily abide by open standards (e.g. a url or uri for identifying a piece of information) (masanès, ; berners-lee, ).
however, some social media are more easily web archived, such as twitter, which uses open standards (such as the url for each tweet) and has made deliberate efforts to be web archived (through a partnership with the library of congress) (osterberg, ).

figure : government information on facebook, with the centre column showing white house information and advertising in the right column

the last issue is related to the reliability and consistency with which government delivers information to citizens. linders ( ) cites the risk that the shift to citizen collaboration may be perceived as a withdrawal of support on the government’s part, adding: ‘services based on internet-facilitated volunteerism replace planning with probability, i.e. no one is “scheduled” to be available, but someone will “probably” be there to help’ (linders, , p. ). bertot, jaeger and hansen ( ) point out that while electronic access to services and information has become a common expectation, the administration’s policy structure has not changed significantly to accommodate these shifts. although the e-government act of provides some guidelines, and the gsa offers social media providers a standard agreement for government usage, much policy regarding social media has yet to be updated. bertot, jaeger and hansen ( ) argue that agencies are engaging in social media ‘through an antiquated policy structure that establishes the parameters for information flows, access, and dissemination’ (bertot, jaeger and hansen, , p. ). the obama administration is aware of problems, but the trend has been to allow these technologies and sort out the issues on an ad hoc basis.
social media, if adequately mined for data, has the ability to provide self-referential information on the responsiveness of government policy to technological change (bertot, jaeger and hansen, ) and how best to use these media to enable civic participation (kavanaugh et al., ). bertot, jaeger and grimes ( ) point out that adoption of icts can be a self-fulfilling prophecy; perceptions of their value to the public are promulgated through the same social media that are being questioned. however, all sources agree on e-government’s democratic potential: engaging and educating the public, bringing services to the people, fostering a participatory democracy, and promising the citizenry as a whole consistent access to its products (bertot, jaeger and hansen, ).

the end of term harvest

the dataset used in this study was collected as part of the efforts to archive the web content of the united states federal government. the us federal government maintains an active collection of information published by the federal government and collected and preserved by the us government printing office. this collection includes the laws, bills, regulations, congressional hearings, published papers, and other collections available from fdsys, the federal digital content management system (http://www.gpo.gov/fdsys/). despite all these efforts, there is much that does not get captured, and information that is not subject to requirements for retention or preservation often does not get preserved. many agency websites fall under this category. moreover, the website of the office of the white house is managed by each administration, and each newly elected president recreates the white house website in his own image.
in anticipation of the presidential election, a group of open government information proponents came up with a plan to document this moment in history by capturing and preserving us federal government websites. this group comprises the library of congress, the internet archive, the university of north texas, the california digital library, and the united states government printing office. during the months preceding the presidential election, the eot captured and archived about terabytes of information (the equivalent of billion single-spaced typed pages). the project is described in detail by seneca et al. ( ) and the archive can be viewed at the end of term web archive (http://eotarchive.cdlib.org/).

prior to the us presidential election, the eot team prepared for another web capture. in the years between and the use of social media by the us government proliferated, encouraged in no small part by policies set forth by the obama administration with the transparency memorandum and other policy documents discussed earlier. well aware of the penetration of social media into government, the eot team was eager to capture these websites as part of the record of the end of term. the lessons learned from the data collected serve as the basis for the findings presented in this paper.

methodology

overview

preparation for collecting urls of social media in the us government began in summer of and the data collection period lasted from about mid-september to the end of october, with all nominations in place by the november election day. the data was collected by sixteen students enrolled in the graduate course ‘government information sources’ taught by the corresponding author at {institution name removed for review purposes}. the students were supervised by the corresponding author, who also approved all nominations submitted by the students. the dataset created includes , urls.
the eot team provided the scope of urls sought for nomination as well as detailed syntax on how to submit urls. the scope included all government agencies listed on the usa.gov a-z list and in the us government manual. as mentioned earlier, the gsa registry is not publicly available, so we relied on these two sources to make our coverage of federal websites as comprehensive as possible. the a-z list is a web directory available from usa.gov, the main gateway to the us government on the web. the list is an alphabetical listing of both agencies and departments within agencies (for example, under ‘a’ there is a directory listing for the alcohol, tobacco, firearms, and explosives and under ‘j’ there is a listing for the department of justice, which is the parent organization). information on the a-z directory is sparse, often listing nothing but the main url and minimal information (see figure ).

to verify the listing and get more information about the agency, we turned to the united states government manual. the us government manual, in addition to being the official directory of government, is very detailed and comes in at over , pages. the manual provides ‘comprehensive information on the agencies of the legislative, judicial, and executive branches. it also includes information on quasi-official agencies; international organizations in which the united states participates; and boards, commissions, and committees’ (u.s. government printing office, , p. iii). social media content from senators and members of congress not seeking reelection was also considered at risk, as the content may disappear after the election, and was thus included within the scope.

data collection process: selection of urls

the first stage of data collection was the preparation stage.
this stage began with a discussion between the researcher and the eot team about the nature of the project and the addition of social media websites to the web archive. the researcher prepared a written description of the workflow, which is briefly illustrated in figure .

figure : agency entry from the a-z index ( . . )

step : identifying content for nomination

the students attended a virtual meeting and presentation by the eot team. afterward, the class of sixteen was divided into four teams. each team had at least one member versed in social media, and in many cases more than one. we divided the listing of government agencies into four parts. there are approximately agencies listed, so each team was assigned agencies. each team divided the agencies among its own members, resulting in approximately thirty-one agencies per student. in addition to agency websites, each group was assigned social media of elected individuals who were serving in the senate or the house at that time and were not running for reelection, and this list was also divided among the four groups. each student examined his or her list and located the urls of the social media used by those agencies and representatives. each team created a shared google spreadsheet listing the url and other information requested on the eot nomination form.

step : review and approval of nomination

after students added the nomination to the google spreadsheet, the corresponding author reviewed each nomination. the review consisted of checking the syntax of the nomination, opening it in a browser that was not logged into any social media websites (to verify that no password was required for access), and verifying the authenticity of the website. once a nomination was approved, the researcher indicated this on the spreadsheet. upon this approval, students submitted the nomination.
step : submission

once the corresponding author had reviewed and approved each nomination, students began nominating using the eot form provided by the university of north texas, where the nominating tool was developed. the form asks for a url and some descriptive fields such as title, agency, branch of government, comments, nominator and institution. once a nomination is submitted, the eot team at university of north texas preserves and archives the selected url.

figure : workflow of submitting social media sites for inclusion in eot archive

data collection process: scope of data collection, syntax and verification

the scope of nominations as defined by the eot team included social media sites sponsored by government agencies and representatives, specifically federal government websites (.gov, .mil) in the legislative, executive, and judicial branches of government. of particular interest for prioritization were sites likely to change dramatically or disappear during the transition of government. out of scope of the harvest were local or state government websites, or any other site not part of the above federal government domain. intranets and deep web content were also not captured. in addition, only social media websites that were freely accessible, without the need to be a registered user or have an account on the social media website, were included.

the urls submitted followed a precise syntax, since any variation from the prescribed syntax would result in overharvesting or no harvesting. for example, nominating flickr sites requires a url to end with a slash (example: http://www.flickr.com/photos/barbaraboxer/), whereas nominating facebook sites requires omitting ‘https’ and the final slash. not following the specified syntax for each social media site would result in an error message (as in the case of facebook) or in archiving millions of documents unintentionally.
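the two syntax rules stated above can be sketched as a small pre-submission check. this is a minimal illustration only, not the eot nomination tool: the function name `check_nomination_syntax` and the wording of its messages are hypothetical, only the flickr and facebook rules quoted in the text are covered, and the extra guard against a facebook url with no account path is our own assumption about how overharvesting would arise.

```python
from typing import List
from urllib.parse import urlsplit


def check_nomination_syntax(url: str) -> List[str]:
    """Return a list of syntax problems for a nominated URL (empty if none found).

    Hypothetical helper: it encodes only the two rules quoted in the text
    (Flickr URLs end with a slash; Facebook URLs omit 'https' and the final slash).
    """
    problems = []
    # Add a dummy scheme when none is present so urlsplit can locate the host.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()

    if host == "flickr.com" or host.endswith(".flickr.com"):
        if not url.endswith("/"):
            problems.append("Flickr URL should end with a slash")
    elif host == "facebook.com" or host.endswith(".facebook.com"):
        if url.lower().startswith("https"):
            problems.append("Facebook URL should not include 'https'")
        if url.endswith("/"):
            problems.append("Facebook URL should not end with a slash")
        # Assumed guard: a bare domain would nominate the whole site.
        if parts.path.strip("/") == "":
            problems.append("Facebook URL has no account path")
    return problems


if __name__ == "__main__":
    for candidate in (
        "http://www.flickr.com/photos/barbaraboxer/",
        "https://www.facebook.com/deptofdefense/",
    ):
        print(candidate, check_nomination_syntax(candidate))
```

a check of this kind only catches mechanical slips; the manual review step (opening the url in a logged-out browser and verifying authenticity) is still needed.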
for example, one misplaced ‘/’ could result in harvesting all of facebook.com instead of only facebook.com/deptofdefense. the instructions and guidelines containing the syntax were provided by the eot partners, specifically the library of congress, the internet archive and the university of north texas, and were based on their experience with web harvesting.

in cases where no social media was found, an attempt was made to find such sites using an advanced query on internet search engines. for example, the office of the inspector general at the department of commerce (http://www.oig.doc.gov/pages/default.aspx) showed no evidence of use of social media, but to verify we ran a search for ‘facebook site:oig.doc.gov’, which returned no results. social media sites that were not linked to from a government website were not included, since they may be predatory sites.

data collection process: limitations

although attempts were made to comprehensively list every social media website publicly accessible to users, there were some limitations on our ability to achieve complete coverage. as mentioned above, in the absence of a directory for social media government websites, we could never be quite sure whether we were missing sites. in addition, we encountered some government agencies with complex and deep structures where we were unsure that we had located every level of the agency website, and where, when we did, we found such a profusion of social media use that it was not possible to cover it all. an example is the us department of state, which maintains social media for many of the us embassies and consulates, often in languages other than english. since it was not possible to manually nominate all their urls in the time allotted, we opted in such cases for a selective approach.
for example, we nominated social media for embassies that were of political interest during the autumn of (such as afghanistan and syria) and excluded stable countries such as finland and austria.

analysis of collected data

to address the first and second research questions, ‘who uses social media in the us government?’ and ‘what social media platforms does the us government use?’, we provide tabulations of total social media webpages by branch of government, agency, and social media provider (e.g. facebook, twitter). urls were imported into an excel spreadsheet, and standard functions were used to generate the tabulations.

to address the third research question, ‘what observations can be made of the content the us government makes available on social media platforms?’, we provide a range of examples that illuminate the content the government makes available via social media.

the dataset of social media urls nominated for inclusion in the end of term web archive, as well as other non-social media websites included, is available for use by researchers and can be downloaded from a digital library available from the university of north texas.

results

rq : who uses social media in the us government?

in this project, we collected and studied , social media urls from official government websites, and , unique urls were included in the eot archive. the executive branch had by far the largest number of social media webpages, which is to be expected since it is the largest branch, followed by the legislative and judicial (see table ).

rq : what social media platforms does the us government use?

the most popular platform for social media use by government is facebook, followed closely by twitter (see table ). further, eighty-six government sites embedded social media features directly into their websites (e.g. blogging with comments from a .gov or .mil domain).

end of term harvest nomination reports http://digital .library.unt.edu/nomination/eth /reports/.
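the tabulations were produced with standard excel functions. as a rough equivalent, the python sketch below counts urls by platform from the host name of each url. it is not the authors' procedure: the `PLATFORMS` mapping, the function names, and the rule that any .gov or .mil host counts as social media features embedded in an official site are assumptions made for illustration, and the sample urls are accounts cited as examples in this article.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical mapping from host name to platform label (not exhaustive).
PLATFORMS = {
    "facebook.com": "facebook",
    "twitter.com": "twitter",
    "youtube.com": "youtube",
    "flickr.com": "flickr",
    "pinterest.com": "pinterest",
}


def platform_of(url):
    """Classify one URL by its host name."""
    host = urlsplit(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Assumption: official domains are counted as embedded social media features.
    if host.endswith(".gov") or host.endswith(".mil"):
        return "embedded in official site"
    return PLATFORMS.get(host, "other")


def tabulate(urls):
    """Count URLs per platform, like a pivot table on a 'platform' column."""
    return Counter(platform_of(u) for u in urls)


sample = [
    "https://www.facebook.com/ejournalusa",
    "https://twitter.com/fmc_gov/",
    "https://www.facebook.com/digitaloutreachteam",
    "http://pinterest.com/usarmy/",
    "https://github.com/usgs/",
]
counts = tabulate(sample)
for platform, n in counts.most_common():
    print(platform, n)
```

the same counting step could be repeated with a branch or agency label in place of the platform label to obtain the other two tabulations.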
rq : what observations can be made of the content the us government makes available on social media platforms?

once we had identified the agencies that use social media and the platforms they use, we wanted to gain a more nuanced understanding of the ways in which government uses social media. in particular, we were interested in learning whether social media aids in accomplishing the goals of the transparency memorandum: transparency, participation and collaboration. we chose to examine the urls we collected to find examples of the various uses of social media in government.

most sites we examined are on the spectrum between presence and interaction as described by the crs report (seifert, ). in other words, they provide information to the public and allow the public to comment, as is expected from the architecture of social media. the reach of social media websites varies widely. for example, ejournal usa, a journal moderated by the united states department of state, has . million likes and frequent postings. many of the postings include questions and are intended to directly engage readers in a conversation. for example, the site asks readers to provide their ideas for closing the online gender gap, a posting that received over comments and over , likes. at the other end of the spectrum is the federal maritime commission, with a mere five tweets.

many agencies gear their social media to very specific audiences they are trying to target and engage, and these include non-english speakers both in the us and outside its borders. for example, the state department maintains facebook accounts in arabic and farsi to engage non-us citizens, and the department of homeland security maintains a twitter account geared toward spanish speakers in the us.

agencies are taking advantage of the full capabilities that each social media platform offers, using text and visual materials as appropriate. for example, the us army maintains a pinterest site that engages over , followers with photographs on topics ranging from army fashion to army values.

state dept. ejournal facebook account https://www.facebook.com/ejournalusa
federal maritime commission twitter account https://twitter.com/fmc_gov/
state dept. facebook account in arabic https://www.facebook.com/digitaloutreachteam
state dept. facebook account in farsi https://www.facebook.com/usadarfarsi
dept. of homeland security twitter account in spanish https://twitter.com/uscis_es/
us army on pinterest http://pinterest.com/usarmy/

table : social media use by government branch
branch | urls
executive | ,
legislative |
judicial |
unclassified |
total social media pages | ,

of those urls, unique government agencies had at least one social media page. the agency with the most social media pages is the us house of representatives (see table ).

table : social media use by government agency
agency | number of urls
house of representatives |
state department |
health and human services department |
defense department |
national archives and records administration |
national aeronautics and space administration |
homeland security |
agriculture department |
senate |
veterans affairs department |
army department |
food and drug administration |
interior department |
president of the united states |
transportation department |
commerce department |
education department |
joint chiefs of staff |
federal executive board |
internal revenue service |
treasury department |
geological survey |
architect of the capitol |
land management bureau |
energy department |
urls for agencies with less than urls |
unclassified |
total social media webpages | ,

even social media platforms that are not well known in the general social media landscape are used.
for example, the us department of agriculture uses storify, a platform that curates other social media sites, and the us geological survey uses github, a website that allows sharing computer code.

government agencies that are part of the national security efforts and are not typically considered publicly minded are also making use of social media. for example, the us missile defense agency maintains a flickr account and the us border patrol maintains a pinterest account.

several agencies use social media to call for participation in offline activities, such as attending open meetings and submitting grant applications. for example, the u.s. department of health and human services twitter account called on the public to apply for seed funding and participate in a conference.

usda on storify http://storify.com/usda/
usgs on github https://github.com/usgs/
us missile defense agency flickr account http://www.flickr.com/photos/mdabmds/
us border patrol on pinterest http://pinterest.com/esmietana/united-state-border-patrol/
us health data on twitter https://twitter.com/healthdatagov/

table : social media platform use
platform | number of urls
facebook |
twitter |
youtube |
flickr |
pinterest |
google+ |
tumblr |
linkedin |
foursquare |
vimeo |
myspace |
other commercial social media platform |
social media features (e.g. blogging) embedded in official site |
total social media webpages | ,

conclusion

in conclusion, social media is widely used by the branches and agencies of the us government. this includes use of a variety of social media platforms, with facebook, twitter, youtube and flickr being the most popular, and growing use of other platforms such as pinterest and tumblr. however, as the qualitative examples reveal, government use of social media largely amounts to a presence on these sites, and it is difficult to find examples of more extensive engagement between government and citizens through these platforms.
social media has yet to achieve the same impact as the ‘we the people’ petitions on whitehouse.gov. ‘we the people’ allows individuals to petition the white house to act on matters of their choice, and the white house is committed to addressing any petition that receives , or more signatories. a recent accomplishment of ‘we the people’ was the petition to mandate open access to all federally funded research. in response to the petition, president obama signed a memorandum instructing federal agencies that provide grants of more than $ million annually to make the results of federally-funded research publicly available free of charge within twelve months after original publication (holden, ). in the social media sites analysed in this study, participation is limited to user comments between government and citizen, and collaboration at the level demonstrated by the ‘we the people’ petitions cannot be observed.

government social media sites offer users a potential opportunity to easily engage with agencies, which is an important step in achieving the goals of the transparency memorandum. however, fully reaching the goals of the transparency memorandum is still far off. open platforms have yet to provide the panacea that will increase participation and collaboration between citizens and government. further research is needed to identify the steps needed to help the government achieve these goals.

references

c.f.r. part ‘standards of ethical conduct for employees of the executive branch’, http://www.gpo.gov: /fdsys/granule/cfr- -title - vol /cfr- -title -vol -sec - /content-detail.html?null (visited . . ).
u.s.c. - ( ) ‘an act to prevent pernicious political activities’, http://www.law.cornell.edu/uscode/text/ /part-iii/subpart-f/chapter- /subchapter-iii (visited . . ).
u.s.c. - ( ) ‘bribery, gift and conflict of interest’, http://www.law.cornell.edu/uscode/text/ /part-i/chapter- (visited . . ).
benkler, yochai ( ) the wealth of networks: how social production transforms markets and freedom. new haven, ct: yale university press.
berners-lee, tim ( ) ‘long live the web: a call for continued open standards and neutrality’. scientific american, november, , http://www.scientificamerican.com/article.cfm?id=long-live-the-web (visited . . ).
bertot, john c., jaeger, paul t. and grimes, justin m. ( ) ‘using icts to create a culture of transparency: e-government and social media as openness and anti-corruption tools for societies’. government information quarterly ( ), , pp. – .
bertot, john c., jaeger, paul t. and hansen, derek ( ) ‘the impact of polices on government social media usage: issues, challenges, and recommendations’. government information quarterly ( ), pp. – .
cain, jonathan ( ) ‘web . utilization in e-diplomacy and the proliferation of government grey literature’. the grey journal ( ), , pp. – .
carvin, andy ( ) distant witness. new york: cuny journalism press.
cocciolo, anthony ( ) ‘can web . enhance community participation in an institutional repository? the case of pocketknowledge at teachers college, columbia university’. journal of academic librarianship ( ), , pp. – .
cormode, graham and krishnamurthy, balachander ( ) ‘key differences between web . and web . ’. first monday ( ), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / (visited . . ).
editorial ( ) ‘social media in government’. government information quarterly , pp. – .
end of term web archive (n.d.) ‘u.s. federal government websites – ’, http://eotarchive.cdlib.org/ (visited . . ).
federal cio council ( ) ‘guidelines for secure use of social media by federal departments and agencies’, https://cio.gov/wp-content/uploads/downloads/ / /guidelines_for_secure_use_social_media_v - .pdf (visited . . ).
friesen, norm ( ) ‘education and the social web: connective learning and the commercial imperative’. first monday ( ), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / (visited . . ).
hansell, saul ( ) ‘should the white house be a place for friends?’ the new york times may, , http://bits.blogs.nytimes.com/ / / /should-the-white-house-be-a-place-for-friends/ (visited . . ).
hemphill, thomas a. ( ) ‘government technology acquisition policy: the case of proprietary versus open source software’. bulletin of science technology & society , pp. – .
holden, john p. ( ) ‘increasing access to the results of federally funded scientific research’. memorandum for the heads of executive departments and agencies. february, http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_ .pdf (visited . . ).
howto.gov ( ) ‘federal-compatible terms of service agreements’, http://www.howto.gov/web-content/resources/tools/terms-of-service-agreements (visited . . ).
howto.gov ( b) ‘social media registry’, http://www.howto.gov/social-media/social-media-registry (visited . . ).
howto.gov ( c) ‘using social media in government’, http://www.howto.gov/social-media/using-social-media-in-government (visited . . ).
kavanaugh, andrea l., fox, edward a., sheetz, steven d., yang, seungwon, li, lin t., shoemaker, donald j., natsev, apostol, and xie, lexing ( ) ‘social media use by government: from the routine to the critical’. government information quarterly, , , pp. – .
li, charlene and bernoff, josh ( ) groundswell: winning in a world transformed by social technologies. cambridge, ma: harvard business press.
linders, dennis ( ) ‘from e-government to we-government: defining a typology for citizen coproduction in the age of social media’. government information quarterly, , , pp. – .
masanès, julien ( ) ‘web archiving: issues and methods’.
in: masanès, julien (ed.), web archiving. berlin: springer, pp. – .
obama, barack ( ) ‘transparency and open government: memorandum for the heads of executive departments and agencies’, http://www.whitehouse.gov/the_press_office/transparencyandopengovernment (visited . . ).
open government directive ( ) ‘memorandum for the heads of executive departments and agencies.’ december, http://www.whitehouse.gov/open/documents/open-government-directive (visited . . ).
o’reilly, tim ( ) ‘what is web . : design patterns and business models for the next generation of software’, http://www.oreillynet.com/pub/a/oreilly/tim/news/ / / /what-is-web- .html (visited . . ).
osterberg, gayle ( ) ‘update on the twitter archive at the library of congress’, january, http://blogs.loc.gov/loc/ / /update-on-the-twitter-archive-at-the-library-of-congress/ (visited . . ).
seifert, jeffrey w. ( ) ‘a primer on e-government: sectors, stages, opportunities, and challenges of online governance’. congressional research service report order code rl , january, , www.fas.org/sgp/crs/rl .pdf (visited . . ).
seneca, tracy, grotke, abigail, hartman, cathy n., and carpenter, kris ( ) ‘it takes a village to save the web: the end of term web archive’. dttp: documents to the people, spring, pp. – .
smith, a. ( , april). government online. pew internet and american life project. http://www.pewinternet.org/reports/ /government-online.aspx (visited . . ).
stelter, brian and preston, jennifer ( ) ‘in crisis, public officials embrace social media’. the new york times, november, http://www.nytimes.com/ / / /technology/in-crisis-public-officials-embrace-social-media.html (visited . . ).
sunstein, cass r. ( ) ‘memorandum for the heads of executive departments and agencies, and independent regulatory agencies’, http://www.whitehouse.gov/omb/assets/inforeg/socialmediaguidance_ .pdf (visited . . ).
u.s. dept. of commerce (n.d.)
‘office of the inspector general’, http://www.oig.doc.gov/pages/default.aspx (visited . . ).
u.s. general services administration (n.d.) ‘gsa on-line university learncenter’, https://gsaolu.gsa.gov/login.asp?sessionid= - f - bf- - -f e ffc b &dct= &lcid= &requestedurl=learncenter.asp% fid% d % page% d &secure=true (visited . . ).
u.s. general services administration ( a) ‘gsa social media navigator, chapter introduction – guidance for the official use of social media’, http://www.gsa.gov/portal/content/ (visited . . ).
u.s. general services administration ( b) ‘gsa social media navigator, chapter – your responsibilities’, http://www.gsa.gov/portal/content/ # (visited . . ).
u.s. general services administration ( c) ‘gsa social media navigator, chapter – social media use should be strategic’, http://www.gsa.gov/portal/content/ (visited . . ).
u.s. government printing office (n.d.) ‘fdsys’, http://www.gpo.gov/fdsys/ (visited . . ).
u.s. government printing office ( ) ‘united states government manual’, http://www.gpo.gov/fdsys/browse/collection.action?collectioncode=govman (visited . . ).
usa.gov ( ) ‘a-z index of u.s. federal departments and agencies’, http://www.usa.gov/directory/federal/index.shtml (visited . . ).

acknowledgements

we would like to thank abbie grotke (library of congress), kris carpenter (the internet archive), cathy hartman (university of north texas) and the following pratt sils students: laural angrist, leo bellino, denis chaves, megan fenton, eloise flood, shanta gee, lucia kasiske, mike kohler, emily lundeen, julia marden, joan markey, erin noto, lauren reinhalter, megan roberts, malina thiede and rachel wittmann.

debbie rabina is assistant professor at pratt institute school of information and library science. her areas of specialization include government information, information law and policy, and information systems of international organization. dr rabina’s research is situated within the framework of cultural information studies, and focuses on how democratic micro and macro organizations form and harbor information policies that stem from and support their perception of democracy.

anthony cocciolo is an assistant professor at pratt institute school of information and library science, where his research and teaching are in the areas of digital archives, moving image and sound archives, and digital libraries. he completed his doctorate in the communication, computing, technology in education programme at teachers college, columbia university. prior to pratt, he was the head of technology for the gottesman libraries at teachers college, columbia university.

lisa peet received her masters in information and library science from pratt institute in new york. she currently works with the darwin manuscripts project, a grant-funded transcription and bibliography project operating out of the american museum of natural history library. she writes book reviews and literary criticism for the literary website bloom (www.bloom-site.com), where she is senior editor/writer, and for her own site, like fire (www.likefire.org). her interests include digital scholarship, digital humanities and archiving practices.

sustaining the eebo-tcp corpus in transition

judith siefring, bodleian libraries, university of oxford
eric t. meyer, oxford internet institute, university of oxford

march

bodleian libraries

sustaining the eebo-tcp corpus in transition: report on the tidsr benchmarking study

this report was funded by jisc, and is an output of the bodleian libraries (http://www.bodleian.ox.ac.uk/) and the oxford internet institute (http://www.oii.ox.ac.uk), both at the university of oxford. all images by the authors unless otherwise indicated.
questions or queries about this report may be directed to:

judith siefring
eebo-tcp
bodleian digital library systems and services (bdlss)
osney one building, osney mead, oxford, ox ew, united kingdom
tel: + ( )
email: judith.siefring@bodleian.ox.ac.uk

dr. eric t. meyer
oxford internet institute, university of oxford
st giles, oxford, ox js, united kingdom
tel: + ( )
email: eric.meyer@oii.ox.ac.uk

please cite this report as: siefring, j. & meyer, e.t. ( ). sustaining the eebo-tcp corpus in transition: report on the tidsr benchmarking study. london: jisc. available online: http://ssrn.com/abstract=

table of contents

acknowledgements
acronyms & abbreviations
executive summary
introduction
context of the study
research design & methods
quantitative impacts
survey of researchers
analytics
bibliometrics
web . impacts
twitter
google blog search
qualitative impacts
focus groups
sect workshop
digital humanities summer school focus group
digital citation focus group
interviews and opinion-gathering
librarians
encoding experts
projects
eebo-tcp editors
user feedback
conference
conclusions
appendix
list of projects based on or related to eebo-tcp

acknowledgements

the authors wish to thank all the participants in this research, some of whom are named in the report and some of whom remain anonymous, but all of whom have helped us to better understand how early english books online and the text creation partnership are having impacts. particular thanks go to jonathan blaney, simon charles, amanda flynn, colm maccrossan, michael popham, rebecca welzenbach and pip willcox. finally, thanks to jisc for funding our research.
acronyms & abbreviations

bho: british history online
cerl: consortium of european research libraries
ebba: english broadside ballad archive
ecco: eighteenth-century collections online
eebo: early english books online
eebo-tcp: early english books online text creation partnership
estc: english short title catalogue
odnb: oxford dictionary of national biography
oed: oxford english dictionary
oii: oxford internet institute, university of oxford
sect: sustaining the eebo-tcp corpus in transition

executive summary

between march and november , the sect project, funded by jisc’s digital preservation and curation programme, carried out a benchmarking study of the use and impact of the eebo-tcp corpus using the oxford internet institute’s toolkit for the impact of digital scholarly resources (tidsr).

in summary, the main findings of the study were:

• usage statistics from proquest (the point of access for most users) show a steady increase in eebo usage from - .
• bibliometric analysis indicates that eebo-tcp is having an impact both in published scholarship and in new scholarship being produced as part of masters and doctoral work.
• eebo’s reputation is very high amongst the user community – it is considered reliable, easy to use and easy to find.
• users consider eebo to be important for their own research and teaching, but particularly for research. they strongly believe in its importance to their field or discipline, and value its contribution to new research possibilities.

the study identified particular areas where work could be done to improve the long-term sustainability and usefulness of the resource.

• usage statistics show that there are considerable bumps when new collections are announced, but these often seem to be followed by rather steep declines. the reasons for these declines should be looked at in more detail.
• the tcp’s visibility and profile should be significantly developed in order to create a stronger brand and distinguish eebo-tcp from eebo. this will have particular importance in relation to the lifting of restrictions on phase one tcp texts in .
• steps could be taken to raise the project’s visibility online and through social media.
• eebo-tcp should make more effort to target users and potential users working outside the disciplines of traditional history and english language and literature.
• the tcp should look in detail at metadata and documentation in order to provide more (and more useful) information or to clarify/improve the existing data.
• the most useful (and sustainable) resources will be the ones which work in tandem with each other and which link with each other in useful ways. eebo-tcp should look to develop such links.
• the completeness of the corpus and the possibility of future corrections being made to the data are areas of particular importance in the user community. an exploration of potential funding models to allow the addition of more texts and the correction of mistakes in the data should be carried out.
• absence of citation to digital resources is a significant problem in digital humanities – eebo-tcp should take steps to make citation easier and to publicize the importance of citing online material.
• careful planning for the release of the phase one tcp texts into the public domain in needs to be carried out as soon as possible.
• the tcp must consider how to sustain the expertise and knowledge that has been developed by project staff over the course of the project. opportunities should be found to properly exploit the significant knowledge gained over many years.

introduction

the early english books online text creation partnership (eebo-tcp) was established in , as a collaborative project involving the university of oxford, the university of michigan, the commercial publisher proquest and the council on library and information resources (clir).
the aim of the text creation partnership was to create fully searchable xml-encoded transcriptions of the image sets of early printed books which form the basis for proquest's early english books online, http://eebo.chadwyck.com. phase i of the project, in which text production began, ran from to , and created , searchable texts, which are available through the proquest interface and also via tcp interfaces, http://eebo.odl.ox.ac.uk/e/eebo/ and http://quod.lib.umich.edu/e/eebogroup/. the full cost of phase i production was around $ million. the project has now moved into phase ii, which aims to complete the corpus: one copy of every text printed in england or in english between - . the completed resource will make available around , searchable electronic texts, and phase ii is projected to cost around $ . million. the entire eebo-tcp corpus will therefore represent an investment of c.$ m, involving in excess of person-years of effort applied over a year period.

context of the study

the bodleian libraries and the oxford internet institute sought and received funding from jisc under their digital preservation and curation programme, managed by neil grindey, for the sect: sustaining the eebo-tcp corpus in transition project. the first stage of the sect project was to carry out a benchmarking study of the impact and use of eebo-tcp, using the oii’s toolkit for the impact of digital scholarly resources (tidsr), itself a jisc-funded initiative. the study concentrated primarily on the use and impact of eebo-tcp in the uk. this report outlines the results of the tidsr study, which will be used as a basis for the creation of practical recommendations for improvements to eebo-tcp, focussing on how best to secure the long-term sustainability of the corpus.

research design & methods

a mixture of qualitative and quantitative methodologies offered through tidsr was used to consider in detail the use and impact of eebo-tcp.
quantitative methods of analytics, bibliometrics, web . analysis, and an in-depth survey of researchers were used to build a detailed picture of the use and profile of the resource. this research was complemented by qualitative data gathering through three focus groups, a conference, individual interviews, and email discussion. updates on the project were made available via the project blog, www.bodleian.ox.ac.uk/eebotcp/sect.

quantitative impacts

survey of researchers

an eebo-tcp user survey was put up online in the summer of . participation was low during the summer vacation and the closing date of st december was chosen to take advantage of the return to study of staff and students for michaelmas term. the survey was incentivized by offering entry into a prize draw. people in total started the survey, completed at least part of the survey, and completed it in full. the survey sought to collect data on use of digital resources generally, of eebo, and of eebo-tcp. the summary and discussion that follow show the results of each question and suggest conclusions which can be drawn from the data.

percentages of time on particular activities

participants were asked to indicate what proportion of their time was spent in research and teaching, administration and other activities. this allowed us to put certain questions, for example those on teaching, only to respondents with significant teaching responsibilities.

table: mean percentage of time spent and active* respondents (n= )
activity | mean % of time (total) | mean % of time (active*) | active* respondents (n) | active* respondents (%)
research | % | % | | %
teaching | % | % | | %
administration | % | % | | %
other activities | % | % | | %
* active respondents are those reporting spending at least % of their time on the given activity.

the participants in this survey therefore spend the largest portion of their time engaged in research.
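the "active respondent" figures in the table above can be derived from raw survey answers along the following lines. this is only a minimal sketch: it assumes the threshold is one-fifth (20%) of a respondent's time, as the prose describes, and the function name, field names and sample data are hypothetical rather than taken from the tidsr toolkit or the survey itself.

```python
from statistics import mean

# Assumption: "active" means spending at least one-fifth of one's time.
ACTIVE_THRESHOLD = 20.0


def summarise_activity(percentages, threshold=ACTIVE_THRESHOLD):
    """Summarise one activity from per-respondent time percentages."""
    if not percentages:
        raise ValueError("no responses supplied")
    # Keep only respondents at or above the activity threshold.
    active = [p for p in percentages if p >= threshold]
    return {
        "mean_total": mean(percentages),                # mean over all respondents
        "mean_active": mean(active) if active else None,  # mean over active only
        "active_n": len(active),
        "active_share": 100.0 * len(active) / len(percentages),
    }


# Hypothetical data: % of time five respondents say they spend teaching.
teaching = [0, 10, 20, 30, 40]
summary = summarise_activity(teaching)
print(summary)
```

reporting both means side by side, as the table does, shows how much a low overall average is driven by respondents who barely take part in the activity at all.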
of the % of respondents who reported spending at least one-fifth of their time doing research, the average amount of time spent was %. slightly less than half ( %) spent at least one-fifth of their time teaching, and those respondents reported spending % of their time on teaching activities.

use of online resources in teaching

how often do you use online resources in your teaching? n= 

of the respondents who spend at least one-fifth of their time teaching, the majority use online resources in their teaching either daily ( %) or several times a week ( %). this result indicates that use of online resources is now commonplace in teaching – of those with teaching responsibilities, none said that they never use online resources in their teaching. creators of online content should therefore consider how they can best help teachers make use of their material, to help ensure that a particular resource keeps being used in teaching in the future. teachers actively encourage their students to access material online and content creators need to make their resource one that is pointed to and recommended.

[figure: frequency of use of online resources in teaching: daily, several times a week, several times a month, about once a month, less than once a month]

n= 

nearly all of the teachers reported encouraging students to access materials online ( %), and none of the remaining respondents actively discouraged the use of online materials. of our sample (which was specifically recruited from eebo-aware audiences), a high proportion ( %) encourage their students to use eebo in particular. the respondents who said they didn’t encourage their students to use eebo were asked to give their reasons. the free text responses were as follows: • i was unsure how to answer this question: i *would* encourage my students to use eebo, but i don't teach courses for which it's pertinent. • rarely relevant to them. • current university of employment does not subscribe.
• the lack of full-text searching; and also the fact that some of the images are of very poor quality. • it is not available at my university. • not available at my institution. • i'm currently teaching first year courses that do not require early english at all. • i teach a first year module, where accessing primary materials via eebo is not required. if i was teaching second or third year students, i would encourage them to use eebo. • i teach archaeology but use eebo occasionally for my research • teaching mainly nineteenth-century literature - does not seem relevant • my library doesn't have a subscription! • have never heard of it until this email.

for the most part, then, teachers are not encouraging their students to use eebo either because their institution does not subscribe or they teach courses where it is not relevant. the one critical response may indicate either lack of awareness of the possibilities afforded by eebo (full-text searching is available) or may reflect dismay that not everything is available for full-text search (completeness of the corpus being particularly desirable for many scholars).

use of online resources in research

how often do you use online resources in your research? n= 

of the % of respondents who spend at least one-fifth of their time on research, we can see that online resources are important tools. this result indicates just how heavily online resources are now used in research. careful catering to the needs of the research community is therefore a key priority for creators of digital content.

[figure: percentage of teachers who encourage students to access materials online and who encourage students to use eebo]
[figure: frequency of use of online resources in research: daily, several times a week, several times a month, about once a month, less than once a month]

use and awareness of other resources

respondents were asked to describe their use of or awareness of a number of resources which have some similarities with eebo, and were then asked to indicate their use or awareness of particular general web sites that enable researchers to access electronic texts.

[figure: use and awareness of similar sites (eebo, ecco, british history online, literature online, internet shakespeare editions, jisc historic books, brown u. women writers) and other sites (google books, internet archive, project gutenberg, hathitrust, gallica, europeana); response options: use regularly, use on occasion, do not use, never heard of it]

n= 

these results indicate that amongst this (admittedly self-selecting) audience, eebo is by far the best known and used, but that ecco, bho and lion in particular are used by significant numbers of people who use eebo. a comparison of the benefits and disadvantages of these resources would be a useful exercise to suggest possible improvements to eebo-tcp. these results from the general purpose sites suggest that amongst the eebo user-community, google books has very high usage and recognition, project gutenberg and the internet archive are reasonably well-used and known, while gallica, europeana and the hathitrust are little known or used. an exploration of the attractions of the top three “rival” resources, and of google books in particular, would provide useful insights into user preferences.

finally, respondents were asked to list any additional resources which they use regularly or frequently. there were free-text responses, which included all of the following unique resources (with the number in the first column indicating if multiple respondents mentioned the resource): depositions academic journals through university library website allegro alumni cantabrigensis alumni oxoniensis amazon amazon kindle american memory america's historical newspapers anglo-american legal tradition archive.org artcyclopedia artstor bavarian state library bayerische stadtsbibliothek [munchen] (bsb) bbti bethlem royal hospital archives and museum gallery bibles online bibliography of british and irish history bibliotecacervantesvirtual bnf bodleian library broadside ballads brepolis bridewell online british book trade index british history online british images pre- british library british literary manuscripts online british museum prints british periodicals online british state papers (home and colonial series, but published by different online vendors) brotherton manuscripts burney collection - th and th century newspapers online calendar of state papers online cced cerl connected histories copac database of early english playbooks (deep) dcb dictionary of irish biography dictionary of literary biography digilib.usm.edu digital scriptorum dnb online early american imprints ebba ebo ebooks ebrary edit e-journals e-rara.ch estc (english short-title catalogue) ethos folger.edu gale state papers online geonames google maps google scholar history of parliament http://www.lib.rochester.edu icdl individual research library's digital collections that i know of istc jisc ireland john foxe online jstor kvk london lives lost plays database - matt steggle luminarium.org manuscript facsimile websites, such as the site run by harvard houghton library many of the diocese of york cause papers dating from the th century have been digitised, and are accessible via the borthwick institute website.
memory of the netherlands milton reading room museum and gallery sites museum catalogues national archives national archives at kew odl bodley odnb (oxford dictionary of national biography) oed (oxford english dictionary) old bailey proceedings on line online dictionaries online journals and newspaper databases open source shakespeare oxford scholarly editions online pares parker on the web perdita manuscripts persée perseus online library perseus project plre.folger.edu project muse reed ricorso.net sabin americana sbn sciencedirect scottish dictionaries online shakespeare collection shaw-shoemaker state papers online statutes of the realm the agas map the burney collection the cecil papers the holinshed project the latin library thesaurus ucsb ballad project university of edinburgh online resources ustc v&a various linguistic corpora, grammars and dictionaries viaf wright american fiction

these very full responses are a helpful indication of the kind of collections most commonly used by eebo’s user community. this sense of there being a “suite” of resources commonly used by researchers suggests that work could profitably be done exploring the potential for links with other resources, thus helping scholars to find their way through the collections they need. the responses given above suggest that eebo-tcp could most profitably consider the potential for links with (in no particular order): • jstor • estc • oed • odnb • cerl • british book trades index • database of early english playbooks • british history online • ebba

finally in this area, participants were asked about learning to use digital resources.

how do you prefer to learn how to use digital resources? n= 

the answers to this question suggest that library training sessions and web tutorials aren’t particularly effective ways of carrying out user education. most people prefer to explore resources themselves or on the advice of their peers.
[figure: how do you prefer to learn how to use digital resources? categories: other; web tutorials; by attending training sessions; reading research papers that have used them; being shown uses in specific research; help pages and documentation; learning about them from peers; exploring them yourself]

eebo’s reputation

participants were asked to indicate their level of agreement with various statements. the results are outlined in the table below.

n= 

we can conclude that eebo’s reputation is very high amongst the user community – it is considered reliable, easy to use and easy to find. users consider eebo to be important for their own research and teaching, but particularly for research. they strongly believe in its importance to their field or discipline, and value its contribution to new research possibilities. users recommend eebo to both their colleagues and students, but of these more often to colleagues. the user community appears to be aware of how to make use of eebo in their work, and there doesn’t seem to be an overwhelming need for more training, although this would broadly be welcomed. the user community very strongly believes that electronic resources like eebo do not undermine the quality of humanities research.

finding the eebo corpus

participants were asked, if they remember, where they first learned about eebo.

n= 

those who responded “other” gave the following free-text responses: • i originally used the microfilms.
:) • probably as a trial of the uni library, maybe plugged by faculty members • from a lecturer, when i was a student • from a teacher • mentioned in an undergraduate lecture. • tutor • in my time as a graduate student, by a professor's recommendation. • when researching my ancestor's george thomason collection • from a professor during undergraduate studies • mentioned by a prof • from teachers when i was a postgraduate student • from a graduate supervisor • contributor • folger shakespeare library • probably from hearing lecturers mention it during my undergraduate study. • in a grad course for an assignment i had to do • mentioned in a postgrad course description.

[figure: level of agreement (strongly agree, agree, neutral, disagree, strongly disagree) with the statements: e-resources undermine the quality of research; would like to use eebo but i’m not sure how; more training is needed in how to use eebo; eebo is important to my teaching; eebo is easy to use; eebo is easy to find; eebo is a reliable resource; i have recommended eebo to students; eebo makes new research possible; eebo is important to my research; i have recommended eebo to colleagues; eebo is important to my field or discipline]

[figure: where respondents first learned about eebo: seeing it mentioned in publication; from a student; stumbling across it; in a press release; professional association; search engine such as google; at a conference or presentation; other; i don’t remember; listing of library resources; from a colleague]

it is enlightening that while library documentation seems to reach many people, the most common method of learning about eebo is through a colleague. this may have implications for the kind of outreach work that the project pursues. the free-text responses indicate the importance of academics passing on information about eebo to their students.
supporting these academics would therefore have knock-on benefits for their students.

using eebo

participants were asked to describe how they use eebo.

when using eebo, do you make use of:

[figure: frequency (always, often, occasionally, never, don't know) with which respondents make use of: catalogue records only; text search to find image; full text only; both images and full text; images only]

the results of this question indicate the value of all three aspects of the eebo resource – the image sets, the transcribed texts and the catalogue records – working in tandem. they also indicate the significant added value that the tcp texts give to eebo, both as transcriptions and as finding aids. scholars will need different parts of the eebo resource depending on the particular task or work they are undertaking at any one time. the value of this interconnectivity must be considered when planning for the texts being made available in the public domain.

participants were then asked to identify the ways that they use eebo, ticking all that apply.

n= 

those who ticked “other” gave the following free-text responses: • to identify fragmentary text • to refresh my memory on resources i’ve already seen • to supplement the oed to provide a wider sample set of how keywords were used

these figures once again strongly illustrate that users use the different aspects of eebo in tandem and for different types of work. large numbers of users both consult and download the full-text transcriptions. a very high number carry out full-text searches across the corpus. % of the users surveyed look to reuse or re-edit the texts for new purposes, which gives a snapshot of the potential for future development of the corpus.

the respondents who did not use eebo were asked to give reasons why, ticking all that apply.
reason | n
it is not relevant to my research | 
i just never got around to using eebo | 
i don't have a subscription that allows me to access eebo | 
it is not relevant to my teaching | 
i tried to use it, but found it too difficult to use | 
there are other similar resources i find more useful | 
i prefer working with physical materials over electronic materials | 
other | 

this small sample suggests that those who don’t use eebo, don’t use it because it is not directly relevant to their work or because they cannot access it. some simply have never got round to it. it does not appear that people aren’t using eebo because they don’t consider it useful or reliable or because they prefer other resources.

use and awareness of eebo-tcp

respondents who earlier answered that they had used eebo texts (either by themselves or in conjunction with images) at least occasionally were asked if, before completing the survey, they had heard of eebo-tcp, and whether they were aware that eebo-tcp creates the full texts available on the main eebo site.

[figure: ways in which respondents use eebo: other; source for quantitative analysis; reuse/re-editing for new purposes; download full-text transcriptions; find teaching materials; find materials to consult in person; pursue personal interests/research; download image sets; full-text searches across the corpus; research resource for manual analysis; consult full-text transcriptions; consult image sets; reference resource]

n= 

of the people that answered this question, there is about an even split between people who didn’t know or were unsure of what eebo-tcp does ( %) and people who had heard of eebo-tcp and knew what it does ( %). this is particularly striking as this was an eebo-tcp survey and therefore to some extent a self-selecting and aware sample group.
this might suggest that eebo-tcp needs to do more awareness and profile-raising work to make more people more informed about the nature of the corpus they are using and the processes via which it was created.

[figure: awareness of eebo-tcp: knew that the full texts were created separately, but didn’t know by whom; had heard of eebo-tcp but didn’t know what it does; hadn’t heard of eebo-tcp; had heard of eebo-tcp and knew what it does]

the next survey question asked, when accessing eebo-tcp, which of the following interfaces have you used:

[figure: interfaces used to access eebo-tcp: jisc’s historic books; university of michigan’s eebo-tcp; university of oxford’s eebo-tcp; don’t know; proquest’s eebo]

most people, as we would expect, use the proquest site. however, some also have used the local implementations of the corpus at oxford and michigan, and a small number have used jisc historic books (although we’d expect this figure to rise in the future). it is notable that a significant number of people don’t know which interface they use – this suggests that most users just go to eebo, probably as linked to via their local institution’s network, and don’t think about who is providing the resource. reaching such people with additional support materials could be tricky, as they may tend simply to visit sites via static bookmarks or by always using the same pathways via institutional websites.

respondents were then asked about their citation habits with regard to eebo-tcp.

how do (or would) you cite materials from eebo-tcp? researchers, n= ; teaching students, n= 

researchers were asked about their habits with regard to citing materials from eebo-tcp. earlier in the survey, those engaged in teaching had been asked how they would teach students to cite resources consulted online; those data are also included here for comparison.
[figure: how respondents cite (or would cite) materials from eebo-tcp, for researchers and for teaching students: online version only; print + url; print + [online] (no url); print only; other]

these responses reveal the variety of approaches to digital citation current within the scholarly community, as well as the startling number of people who fail to indicate their use of digital resources at all. the student data are an indication that not only do many teachers teach their students effectively to hide their use of digital resources, but also that those who do teach their students to acknowledge their use, teach them to do so in different ways. the implications of these are discussed more fully in the sect report on digital citation, available separately.

importance of eebo-tcp features

participants were asked to rank various factors in order of importance to their work. the following percentages resulted, divided by ranking position:

[figure: importance of eebo-tcp features by ranking position: some other factor; consistent underlying xml; rich underlying xml; comprehensiveness; accuracy of transcription]

overall, then, accuracy of transcription is considered by most to be the most important factor, followed by completion/comprehensiveness of the corpus. the consistency and richness of the xml encoding are considered important but less so than the first two factors. participants who answered “other” provided the following free-text descriptions of other factors that their work depends on: • i'd actually like to see some built-in links to oed definitions. i put that ahead of xml bc i'm not xml literate. • links to images are correct • access, searching, downloading, using offline • time • accessibility--i.e. it is not accessible to individual users who do not have access through an institution • images • the accuracy and comprehensiveness of the search functions across corpus.
• downloadability for texts • user-friendliness • ease of relating transcription to image of corresponding original text • access to texts not available in hard copy • number of texts covered. • the quality of the texts • unavailable on google or gallica • i haven't used it and that was the most neutral answer • high quality reproductions of illustrations, covers, title pages etc • i am most appreciative of the opportunity to see an approximation of how the words appear on the page. the tcp transcriptions are a close second.

these responses indicate other areas of importance that the tcp must consider: the availability of images alongside the text, the ability to download materials and use them offline, search functionality and user-friendliness, and the ability to access texts unavailable elsewhere.

when asked if eebo-tcp had allowed them to ask any research questions that otherwise wouldn’t be possible to address, the following responses were given:

[figure: responses: yes, no]

this result suggests the truly transformative nature of eebo-tcp: when you take away those who didn’t see or answer the question, % of respondents ( of ) have found eebo-tcp to enable new, otherwise impossible, research questions to be explored. these respondents were asked if they could provide one or two examples of new research questions that eebo-tcp had made possible, and the responses are both varied and fascinating: • i spent some time working on providing annotations for a historical text, and i was able to trace the sources the author used to see where his transcribed a text precisely and where he modified it. without the ease of full-text searching, i probably would not have pursued these questions because they would have taken too much time to investigate. • what kind of genres are there in early english printed texts? how do they change over time? • how a specific set of works were cited across a much broader field of contemporary publishing than i would otherwise have been able to locate.
• broadly speaking, it has allowed me to attain a greater understanding of the usage of various words over a given time period. • the frequency / distribution of particular words between particular dates. this can be especially useful when the results run contrary to the information contained on the oxford oed. • locating the use of particular words. • generally: use of the full text as a subject index to texts that otherwise would not be known to discuss a specific person, topic, etc. • locating a passage quoted but not cited elsewhere - finding out about the collocation of two terms • e.g. it allows me to track phrases, proverbs, etc. across playtexts for purposes of commentary in a modern critical edition • i have been able to trace printed books possibly read by early modern readers from fragments of text transcribed in manuscript commonplace books and memoranda. working on early book lists and catalogues, which often use not the actual title of a book but a subtitle or other familiar description, i use eebo to identify books not identifiable in any other way (for example using estc). this is particularly useful if a catalogue lists something which is a part/section of a bigger work. • tracking the development of particular terminology across different texts. • how do readers read early modern books? how do texts differ? • easier to compare newspapers texts by using electronic versions • responses to minor french poets, later reception history of du bartas (see my paper from the eebo-tcp conference for more) • it is useful for searching for key words or concepts across large swathes of text. • which contemporary printed sources referenced a particular author or work, suggesting ways a work was received. • 'is this word/phrase common or rare in the period?' • the use of the term originality in art texts the use of the term connoisseurship in art texts • clarity of print.
• it allows you to make links between apparently disparate texts that you may not otherwise think to connect. • the most important development that eebo has made possible in my research has to do with the prevalence and registers of circulation of works in which particular words were present. in other words, previous research may have lead me to consider more "high literary" contexts, but eebo helps me begin to gauge the broader contexts in which key terms and concepts may have been considered. • it has pushed my research in a more linguistic direction because i can search for keywords across a large corpus. • searching for particular key words relating to palaeography in my research. searching for particular key words relating to shorthand in my research. • searching for references to particular writers in the newsbooks of the s and s. using frequency analysis and word clouds to investigate writers' styles and preoccupations. • though one must approach the tcp with care, as not all texts are available and those that are were not chosen at random, i've looked for history of usage information for various words, phrases. • how common was a phrase like ___ in the corpus? how early was such a phrase available in print? • it has helped me find very esoteric citations in books i never knew existed since these are searchable across the internet. these helped me make connections from my work to a much more global consideration. • how and where did writers reference bookmarks? how and where did writers reference enclosure? • appearance or not of specific vocabulary in author's work. • access to texts i wouldn't be able to use otherwise • full-text searches allow for discovery of patterns of allusion • it helps me search full-text to find references in print sources. sample research q: "what early modern books include shakespeare's proverb "fat paunches make lean pates"?" • how do literature scholars use electronic primary sources? 
what influences literature scholars' desires to use electronic primary sources? • access to search allows for identifying synergies, eg on gender, recipes, ingredients etc • it has enabled me to search across works for brief references to objects, allowing me to build up a broader picture of how these seemingly slight allusions might have acted and been received culturally. • it has allowed me to think through some of the implications of text-mining and distant reading. • a complex set of q • searching for allusions to particular names or places across a corpus. comparing the use of such a thing in prose vs in verse. • was able to see language trends across time as well as determine frequency of specific attributes • i have been better able to look at the use of classical authors in early modern english geographical works • it is less a specific question in general and more the opportunity to constantly ask very specific questions of the primary sources because they are always available on eebo, unlike in a library where you only get a certain amount of time with them. • not sure what eebo-tcp is, but in terms of eebo, it allows me to pursue thematic topics across the corpus irrespective of genre or other limitations. • it has allowed me to work on images of the material text, which was important for my undergraduate thesis (and consulting the copy in the huntingdon library as an undergraduate was not feasible!) • better idea of integration of text & image - eg. how images are facing in a volume. • ability to trace diffusion and evolving meanings of terminology through title and full text keyword searching • as i received the email to participate in the survey, i was in fact in the middle of conducting an eebo search to find out the history of perceptions of a relatively obscure substance (ambergris); i wouldn't have been able to begin without eebo. • compare frequency of use of certain crucial terms before and after given dates.
find particular terms in texts i would not have thought to look at. • frequency of punctuation use in drama. • i use it a lot to trace words, proverbial phrases, poem titles, etc. where dictionaries and other resources would give me a distorted and less full sense of their origins and history it has enabled me to identify the sources of many things in a current editing project that i would not otherwise have found. i'm not sure either of these was a question eebo-tcp allowed me to ask, so much as a question it enabled me to hope to answer • 'noli me tangere' is a phrase used in wyatt's famous sonnet 'whoso list to hunt ...' it means 'do not touch me'. biblically, these words were spoken by the resurrected christ. in wyatt's poem they are spoken by an attractive but unattainable courtly woman, possibly anne boleyn. eebo enabled me to discover that 'noli me tangere' was also the name of a medical disease that, like syphilis, rotted the nose. so eebo helped me to discover a new aspect of the poem, a vengeful ambiguity that attributes to this woman a particularly virulent degree of unpleasantness. • the dispersal of new words, mostly. . • helping to trace early usage of certain words helping to explore perceptions of scotland and scottishness in early modern period • by giving me access to a wider wealth of early english literature it has given me a more comprehensive view of the period and consequently i feel all my research is much indebted to the resource. • using full text search enables retrieval of references to specific terms (eg. psalm, psalms) within a variety of texts which would otherwise be impossible to find. searching titles of texts to discover range of associated meanings (eg. diary) and diversity of texts produced under such a title. • how often certain keywords have appeared in texts over a period of time. • text analysis • it has allowed me to analyse a single term across a wide period of time and range of literature. 
• dating of usage (getting round limitations of oed). associations of words in phrases. recurrence of textual material between works by different authors. but the primary value is as a way of reading early modern texts and doing word-searches quickly to enrich the answers i can give to relatively traditional research questions. • it frees up time and money previously needed for physical travel to access these resources. it has allowed me to browse more resources, more quickly and thereby granted me access to new questions. • how often does a specific term appear in early modern texts • i am an editorial musicologist based in north-west wales, miles or more from any of the major libraries in london, oxford and cambridge; eebo makes it possible for me to examine, search and trawl through far more materials than i could possibly access on any affordable research trip undertaken in order to consult original materials. it also facilitates the elimination of materials that prove irrelevant -- for example, if i am seeking copies which contain manuscript emendations. • it has given me access to writing on cunning folk by protestant theologians that were unavailable in my university library. as well, it provided me with transcripts of plays from the th- th centuries. • was this abnormality present in multiple editions of this text? how were woodcuts utilised in other editions of this text? • it allowed me to find a particular phrase difficult to find in the image set. • any research question that i ask has been facilitated by eebo tcp because i am unable to spend long periods in the archives due to family commitments. • eebo has allowed the researcher to save large amounts of time in his/her search for various items. it is an incredible resource. • eebo-tcp and ecco and similar online resources have enabled me to pursue a phd at trinity college dublin that i may not have been able to pursue at a university in another country. 
i am disabled and suffer from serious medical conditions so the easy access allows me to engage in high quality research that i would otherwise not be able to do.
• the range of spelling variants in foreign words (french, italian, german, etc) used in a thematically or generically defined corpus of english texts (e.g. plays from a particular decade or on a particular theme).
• much easier access to word patterns, frequency of particular words etc.
• shakespeare and philosophy
• what kinds of attitudes and appraisals there were towards literary publication in the th century. how do the translators of early print products apply the dedicatory device and other paratexts they were producing in order to justify their translation choices and evaluate the text and their own work.
• the ability to follow a particular intertextual discourse across years, including both canonical and non-canonical texts, allowed me to answer a question that a very careful scholar had raised but been unable to answer years ago.

these responses give us a snapshot of the breadth and depth of research enabled by eebo-tcp. participants' views on what improvements could be made to eebo-tcp are just as enlightening.

respondents were asked: what one thing would you change about eebo-tcp to improve it, if anything? people answered this question. their answers were:
• i'd like to see more texts available--there aren't always transcriptions for the texts i'm interested in.
• make it free
• more works keyed in!
• include more texts
• the availability of full text searching within all texts would make the single biggest difference.
• more consistent comprehension of abbreviations + greek characters
• add collaborative, open, crowdsourced corrections and additional layers of markup/annotation in a stand-off manner for publicly accessibly volumes.
• make it freely available *along with* digital images.
• completing the full-text transcriptions for all items in eebo.
• speed
• make it clear (especially for infrequent and student users) that eebo does not contain all estc titles. i'm staggered by how many researchers assume that they can use eebo to find everything written by/printed by a particular individual. there should be a large health warning somewhere! a particular frustration is that if you reorder search results by date, every time you look at one record's images the results list returns to the original alpha order - very annoying!
• be less conservative over illegibility or find a way of including 'guesses'. sometimes when you look at a transcription, a word that is transcribed as illegible can be deduced or is legible bar one or two letters-- but by marking it or some letters as illegible, it's no longer searchable.
• better metadata
• accurate transcription of latin is needed: both of texts that are entirely in latin, and of bits of latin embedded in english texts. the corpus of texts currently offered is biased in favour of the vernacular, for no good scholarly or intellectual reason.
• i'd improve the quality of the transcriptions.
• make the process of downloading .pdfs easier. the marked list system is clunky and time consuming. ecco is better. i know this is two things, but it's important that you record the shelfmark of the physical item the facsimile comes from where possible.
• speed up the server. searching is very slow. pages take a long time to load.
• greater coverage (ie all the eebo books)
• i find the search interfaces complex, clunky and not always reliable
• i would suggest ways to link the image sets to the full-text transcriptions a bit more readily. (it may be that these links do exist already, and i am not familiar with them--if so, perhaps making the links from one to the other more explicit would be helpful!)
• add more texts ) improve the transcriptions ) create modern transcriptions underlying the texts so that searching is rationalized
• the search function - really clunky to use compared to google searching.
• move toward more random or at least carefully selected set of texts to establish a more representative sample
• eliminate genre distinctions in the defaults -- defaults should always search the maximum set.
• more texts from other literary traditions
• price--my current institution can't afford either eebo or eebo-tcp.
• make it more affordable
• open access
• the xml should be freely available at one click.
• being able to search the images.
• full eebo texts are difficult to download for saving and printing.
• more full-text transcriptions, completed more accurately.
• a complex set of questions about forms of imprints.
• accuracy of transcriptions and cataloguing information!
• still not clear on "tcp" eebo
• haven't used it yet, but will now
• increase the number of full-text transcriptions, so that i can do full text searches on more documents. and make the searching faster - it's very time consuming waiting for (basic) searches to complete.
• i want to know more about eebo-tcp
• make it easier to access information on authors without linking to lion (perhaps also linking to odnb?)
• improvement of transcription accuracy, particularly ability to recognise abbreviated forms in black letter
• it would be useful if it could show exactly which spelling variants are being searched for with particular words.
• allow searches that combine terms in a certain proximity (e.g. within a given paragraph)
• it can sometimes be cumbersome having to go through the marked list in order to download images.
• accuracy of transcription.
• comprehensiveness - there are stc items missing from it altogether (i can give an example if you want), so one must still use estc.
- and then comprehensively good images (there are still image sets from defective copies which need replacing - again, i can give an example) - and then comprehensive availability of etexts
• all texts should have a transcription.
• quicker and easier access for remote users with my login always remembered
• sometimes it is very slow, and if i'm doing an extensive methodical search it is frustratingly slow! i've had difficulty trying to download and save whole books - single images much easier. seem to be unable to download full text versions which would be very helpful. have to select/copy these - and as a result text carries on the line out of the page, so i have to redo each line.
• accuracy of transcription
• i would make it possible for users to download searchable copies of the text with the formatting intact and i would make the online searching capacity for where one is looking for a word or phrase easier to use and faster. i would also make the formatting more consistent - there are frequently unnecessary spaces. i would consider allowing privileged users to correct inaccuracies in the full text. most academics will check against the original images if they are going to quote an excerpt of text and then discover inaccuracies in the full text; if they were given permission to alter the originals (as in wikipedia) this would vastly improve accuracy. offload onto your well-educated userbase! you could moderate this if you felt it necessary. page signatures should also be given where there are not page numbers to help users locate text in the originals.
• it would be great if the texts were properly digitised rather than reproductions from the old stc microfilms
• somehow make it easier to navigate through the screenshots.
• more transcriptions.
• it's probably my fault but i can't always save and download eebo tcp texts in the layout as it appears via eebo. the sparse, text-only downloads are hard to read.
• the searching capabilities seems not consistent at times.
• it is very slow to work on most browsers. whether searching or retrieving an image, it would be nice to have faster access
• allow display of image and text concurrently
• i have found it very slow and lots of the images fuzzy
• better quality of transcription
• some catalogue listings are a little opaque as regards detailed contents.
• accuracy of transcription, including latin and greek,
• an improved tutorials system for new users
• i would like to have the possibility to download a pdf searchable of the text
• i found a text that was inaccurately recorded. the text differed from the record relating to it. i'd improve accuracy of data recording, in other words.
• the search engine does not accurately portray the depth of the catalogue and should be more simpler
• scans of originals as opposed to microfilm to enable machine ocr. the user can to this armed with some decent software so you don't have to.
• i have always found that eebo appears to run slower when i am browsing through image sets of periodicals. it is always much quicker to download these in full than to browse them in eebo. i'm not sure if this is relevant to the survey though.
• more images to be made freely available
• i would like to have more information on the production choices of the full text. on what grounds is it decided if the section of text is considered prologue, dedication or 'to the reader'? is one word code switch enough to have the work listed among those marked as containing the language in the search, etc.
• i love the way the estc has added links to the eebo texts; it would be nice if that went in the other direction, and i could quickly jump from the eebo metadata to estc. (i didn't give the obvious answers: free access to all and full-colour images a la early european books! i assume those as givens!)
all of these responses provide valuable input for eebo-tcp, but we might make particular note of the repeated requests for comprehensiveness of coverage, improved transcription accuracy, and free open access to text and images.

eebo-tcp feedback mechanisms

participants were asked a series of questions about engagement with online resources, such as finding errors and suggesting texts.

[figure: responses (yes / unsure / no / don't have the time) to four questions: do you report errors that you find in online resources, including eebo?; if it were possible, would you send corrections you spot in eebo-tcp texts?; would you suggest texts for inclusion in eebo-tcp?; would you visit a website that included updates and information about eebo-tcp and projects… n= ]

this suggests that only a fairly small number of people actively report errors found in online materials. this may be because they don’t have the time or inclination, but it may also be the case that they don’t know how or where to report errors, or whether such reports would be welcome.

respondents were asked: if it were possible, would you send corrections you spot in eebo-tcp texts? this very encouraging response suggests that a very significant number of users would become involved in error correction were such a mechanism available.

respondents were also asked: would you suggest texts for inclusion in eebo-tcp? again, the positive response ( % saying yes) suggests a willingness on the part of the user community to become more involved in the work of the tcp.

with regard to whether scholars would visit a website that included updates and information about eebo-tcp and projects related to it, a large number responded positively to this question, or were at least unsure. there were few outright “no”s. of course, the tcp already has a centralised website, but this answer also suggests the importance of a centralised space in the future, perhaps particularly after the texts enter the public domain.

participants were asked to identify who they’d contact in the event of different kinds of problems with using eebo-tcp: who would you contact first if you needed to get in touch with someone with regard to a technical question or problem, a transcription question or problem, and an access question or problem with eebo-tcp? n= 

[figure: first point of contact for a technical issue, a transcription issue, and an access issue. response options: i wouldn’t know who to contact; my own library; proquest; the text creation partnership; university of oxford; my local computer support; other; university of michigan; jisc]

the answers to these questions suggest that it is not very clear who should be contacted in the event of different kinds of problem. access and to a lesser extent technical problems are perhaps most likely to be addressed to the user’s own library. for the tcp’s purposes, the most revealing result is of the question on transcription errors – most users wouldn’t know who to contact, and only % ( respondents) would think to contact the tcp (although an additional would contact either oxford or michigan). this once again suggests that more outreach and publicity work on the work of the tcp would be of value to the user community.

who are eebo-tcp’s users?

participants were asked: please choose the title that best describes your role when you use eebo (or your main role, if you don't use eebo)

role n %
undergraduate student %
postgraduate student %
librarian %
professor %
lecturer %
researcher – academic %
researcher – independent %
reader %
other %
n= 

those who answered other:
• book conservator
• phd retired
• assistant lecturer

the people who answered this survey are overwhelmingly either academics or students at postgraduate level. with the caveat that more people at this level may have seen the survey, it is interesting that only undergraduates responded.
we might take it as indicative of eebo-tcp’s core users and be aware of the needs of this core user group – postgraduate and academic researchers – but should also consider how to reach undergraduate users better than is currently the case. the question about how people first learned about eebo confirmed the importance of academics passing on their knowledge of eebo to their students.

participants were asked: what is your main field of specialization?

field n %
languages and literature %
history %
art history %
library science %
history of science %
linguistics %
archaeology %
other %
n= 

those who responded “other” answered:
• bibliography
• computer science
• digital research support
• history of book, book conservation
• history of philosophy
• intellectual history
• liturgy
• music
• music
• political science
• theology

as we would expect, there is a very strong showing here for languages & literature and history, which may be thought of as eebo-tcp’s core subjects. however, the range of disciplines represented reflects the range of materials available via eebo. art history makes a notable showing – an area where tcp might consider doing further outreach.

finally, participants were asked to indicate in which country they live:

country n %
england %
scotland %
wales %
northern ireland %
republic of ireland %
usa %
canada %
australia %
other %
n= 

again as we should expect, most of the survey respondents came from the uk and the republic of ireland, with a number of representatives from other english-speaking countries. the “other” answers indicate that eebo is also valued and used by scholars from non-english speaking countries, notably finland and italy (although this may tell us more about the distribution of the survey than about usage in europe):
• italy
• finland
• switzerland
• netherlands

of the survey respondents asked to be added to our mailing list to receive eebo-tcp updates.
making efforts to increase the eebo-tcp mailing list may be an effective way to disseminate more information about the work of the tcp.

analytics

the analytics in this report are based on usage statistics from three sources: jisc historic books (for the period august -july ), proquest (for the period january -february ), and university of michigan (for the period august -december ). selected usage statistics are presented below.

usage statistics, proquest

the proquest statistics, which cover the longest period, show a steady increase in usage from - . there are no really surprising numbers here, which generally show a trend of steady growth with an uptick in . text views via the proquest interface consistently run at about % of page image views.

usage statistics, jisc historic books collection

the jisc historic books collection has much lower usage than proquest, and shows a worrying decline throughout . this trend should be investigated further to determine whether there is a reason for the decline, and whether anything can be done to reverse it.

[figure: usage over time (axis ticks: jan, may, sep); series: searches, sessions, page image views, text views]

usage statistics, university of michigan eebo (august -december )

while usage of the michigan site has historically been low compared to proquest, numbers improved dramatically in april (when phase ii launched) and even more in december -february (when the most recent batch of texts was released to the public). in general, the material that has been made public is (unsurprisingly) the content which gets the most usage.

taken together, these basic usage stats show a mixed picture: while usage in general is growing, there are considerable bumps when new collections are announced, but these seem to be often followed by rather steep declines.
it is important to determine whether these declines are simply because many of the people who first come to the resources after an announcement are just window shoppers, as it were, or whether there are features of the resources which are making it difficult to turn interested audiences into regular users.

bibliometrics

bibliometric analysis of sources citing eebo-tcp and eebo was performed using data from google scholar, scopus, and jstor. these results show the extent to which eebo and eebo-tcp are mentioned and cited in the literature, although they will not capture uses of the resources that did not cite or mention them by name.

publications related to early english books online

source n
google scholar ,
jstor
scopus
scopus theses & dissertations ,
proquest dissertations & theses

it is apparent in the data shown above that early english books online is having an impact in the published literature. one thing to keep in mind is that all the databases use different search mechanisms and index different bodies of literature. for instance, both jstor and scopus allow searching all fields in the database, but have a narrower range of materials available to search (i.e. only those journals included in each index, which are selected using fairly stringent criteria of impact and scholarly importance). google scholar, on the other hand, searches a much wider selection of publications, including not just journal publications, but also things such as reports, unpublished documents hosted on academic servers, and presentations.

[figure: eebo-related publications in scopus, and citations of those publications]

only scopus allows one to readily count the publications per year, but these data show a steady growth in publications over the last decade, which indicates that the online collections are having a positive impact on scholarship.
in addition, we see that citations to the articles that mention eebo/eebo-tcp are also growing, which indicates a growing secondary impact on scholarship. one would expect the data in google and jstor to follow a similar pattern.

search terms used:
• google scholar search term: "eebo-tcp" or "eebo tcp" or eebo or "early english books online"
• jstor search term: eebo-tcp or "eebo tcp" or eebo or "early english books online" in full-text, including all content
• scopus search term: all("eebo-tcp" or "eebo tcp" or eebo or "early english books online")
• proquest dissertations & theses: the humanities and social sciences collection search term: "eebo-tcp" or "eebo tcp" or eebo or "early english books online" in full-text

[figure legend: publications, citations]

author location, scopus publications

the scopus data also lets us see the country of the authors in the publications, shown above. we can see that the united states and the united kingdom are by far the most common locations of authors mentioning their use of early english books online, followed by two other english-speaking countries, canada and australia. note that this pattern of publication is unsurprising given general patterns of publication in english-language journals, where the us and uk tend to dominate across most disciplines.

if we look at the journals of the publications, we see a range of journals one might expect, focusing on english literature, language, and period studies.
journals of publication, all scopus journals with more than one article

[figure: journals shown: studies in english literature; notes and queries; renaissance studies; pmla; library; english literary history; studies in philology; review of english studies; literary and linguistic computing; english studies; spenser studies; shakespeare; shakespeare quarterly; comparative drama; historical journal; journal of medieval and early modern studies; modern philology; northern history; renaissance and reformation; serials; eighteenth century studies; agricultural history; reference services review; journal of the history of medicine and allied sciences; milton quarterly; journal of the early book society; rhetorica journal of the history of rhetoric; huntington library quarterly; explicator; papers of the bibliographical society of america; literature and theology; print quarterly; anq quarterly journal of short articles notes and reviews]

additional journals only appear once each in the dataset.

note on the author location figure: only countries with more than one publication shown. additional countries are also represented in the data by one publication each. countries shown: spain, new zealand, finland, sweden, france, netherlands, australia, canada, united kingdom, united states.

number of theses & dissertations per year in scopus & proquest

the thesis & dissertation data show a somewhat unusual pattern that is likely an artefact of the database. there is considerable growth from - in the number of post-graduate theses and dissertations mentioning the early english books online resource (from in , to in , and to a high of in ), followed by a rather remarkable apparent decline. it is likely, given the precipitous nature of this decline, that some repositories are either behind in making their data available to scopus, or stopped indexing certain kinds of documents. this is particularly likely since the proquest data, which has a smaller set of dissertations, does not show a similar pattern.
instead, the proquest data increases dramatically in and then stays relatively steady (keeping in mind that data will still be updated in the early months of ). putting aside data anomalies, these data generally demonstrate the importance of eebo/eebo-tcp to new scholarship being produced as part of masters and doctoral work. this already successful area is one which should be encouraged, since today’s post-graduate students are the faculty and researchers of tomorrow.

[figure legend: scopus, proquest]

web . impacts

two types of data were collected and analysed: twitter data, and data from google blog search. both results are reported below.

twitter

twitter data was automatically collected once a day from february -february .

measure n
total tweets
retweets
links in tweets
unique twitter accounts

these data show a moderately active twitter presence, and a reasonable volume of re-tweets of content related to the resource. however, there is certainly room for improvement to increase the visibility of eebo in the twittersphere. this is particularly true given that the uk twitter account for eebo-tcp (see below) has , followers. this shows a strong level of interest in eebo-tcp that could be better leveraged by thinking through the eebo-tcp twitter strategy.
total tweets for all twitter accounts with greater than tweets

top tweeters (n, short description):
• heatherfro: phd student in glasgow studying gender in early modern london
• thefrozensea: “early modern dialogues” blogger
• oxfordeebotcp: the eebo-tcp project twitter in the uk
• sgwingo: phd student in michigan studying rare manuscripts
• jamescummings: u of oxford tei expert
• historicbooks: jisc historic books collection
• carenmilloy: head of projects at jisc collections
• tcpstream: the tcp at the u of michigan
• perayson: senior lecturer at lancaster u using natural language processing
• tracelarkhall: english literature faculty at bath spa u
• pipwillcox: bodleian staff member working on eebo-tcp

data were gathered using an automated google spreadsheet running the tags (http://mashe.hawksey.info/twitter-archive-tagsv /) template set to gather data once per day. twitter search terms: @oxfordeebotcp or #tcpsect or @tcpstream or #eebo. data from february to february .

visualization of the top terms used in tweets

the visualization above of the words from the twitter collection related to eebo and eebo-tcp shows lots of attention to the september ( ) conference co-hosted in oxford by the sect project. notice also the terms of positive affect occurring in the data (e.g. excited, delighted, interested, great) as well as the terms that indicate the impact of eebo/eebo-tcp on research (e.g. revolutionizing, future, advance). again, this collection of words can suggest ways for the team to capitalize on perceived strengths to increase the visibility of the resource in the twittersphere more widely.

some words removed for readability, including @oxfordeebotcp, rt, @tcpstream

google blog search

google blog search found sources mentioning eebo or eebo-tcp. this is a relatively modest number of blog posts, and suggests that efforts could be made to make the resources more visible in the blogosphere.
as you can see below, blog posts came from the project itself in some cases, and there were also a number of posts related to the conference organized by the project. others announce the availability of new material. it would be good to see more blogging about innovative uses and new lines of research inspired by eebo and eebo-tcp (such as those which were discussed at the conference), and it might prove valuable to encourage scholars who we know have done interesting work to blog about it if they have blogs. conference organizers reflect on “revolutionizing early modern ... www.textcreationpartnership.org/ jan by rebecca conference organizers reflect on “revolutionizing early modern studies”? eebo-tcp in . january , - posted in conference report. this conference report was contributed by judith siefring, a tcp editor at the university of oxford, with contributions from pip willcox, also an ... the early english books online text creation partnership in , held in oxford on the th and th september , was a cause for great celebration for those of us involved in its organization. more results from text creation partnership eebo-tcp conference proceedings | eebo-tcp blogs.bodleian.ox.ac.uk/eebotcp/ dec by judith s as the proceedings illustrate, the conference was a stimulating meeting where work and ideas using eebo-tcp were shared through a series of excellent papers, posters, and discussion. the event provided a wealth of ... erica zimmer presents at recent eebo-tcp conference » editorial ... www.bu.edu/editinst/ oct by katherine a evans congratulations to erica zimmer, phd candidate in the editorial institute, for having presented at the recent eebo-tcp conference, “revolutionizing early modern studies”?: the early english books online text creation ... early english books online-text creation partnership (eebo-tcp ... cucataloging.blogspot.com/ may by cataloging and metadata services this week, cms loaded , eebo-tcp records into chinook. 
the collection is described on the eebo-tcp homepage as: “the university of michigan, the university of oxford, the council on library and information ... early english books online text creation partnership: user survey ... earlymodernonlinebib.wordpress.com/ oct by eleanor shevlin posted on behalf of the eebo-tcp project please help the early english books online text creation partnershipplan for the future by filling in our user survey, and be entered intoa prize draw to win one of ten £ amazon ... more results from early modern online bibliography text creation partnership releases over , new eebo-tcp texts publishing.umich.edu/ apr by admin http://www.textcreationpartnership.org/ / / /conference-organizers-reflect-on-%e % % crevolutionizing-early-modern-studies%e % % d-eebo-tcp-in- / http://www.google.co.uk/url?url=http://www.textcreationpartnership.org/&rct=j&sa=x&ei= xkeufe_omq qx _ocwda&ved= cc q auwaa&q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% &usg=afqjcnfparqc i uf z kb-vnup bfkog http://www.google.co.uk/search?q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% +blogurl:http://www.textcreationpartnership.org/&hl=en&tbo=d&biw= &bih= &gbv= &tbm=blg&sa=x&ei= xkeufe_omq qx _ocwda&ved= cc q guwaa http://blogs.bodleian.ox.ac.uk/eebotcp/sect/ / /eebo-tcp- -conference-proceedings/ http://www.google.co.uk/url?url=http://blogs.bodleian.ox.ac.uk/eebotcp/&rct=j&sa=x&ei= xkeufe_omq qx _ocwda&ved= cdmq auwaq&q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% &usg=afqjcngzybefztlbyqgjrknzim pe-kpa http://www.bu.edu/editinst/ / / /erica-zimmer-presents-at-recent-eebo-tcp-conference/ http://www.google.co.uk/url?url=http://www.bu.edu/editinst/&rct=j&sa=x&ei= xkeufe_omq qx _ocwda&ved= cdgq auwag&q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% &usg=afqjcnhc m vjja-rnzzft xhv qiewqw http://cucataloging.blogspot.com/ / /early-english-books-online-text.html 
http://www.google.co.uk/url?url=http://cucataloging.blogspot.com/&rct=j&sa=x&ei= xkeufe_omq qx _ocwda&ved= cd q auwaw&q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% &usg=afqjcne oemodtpzy_b lrbvhor xwpzta http://earlymodernonlinebib.wordpress.com/ / / /early-english-books-online-text-creation-partnership-user-survey/ http://www.google.co.uk/url?url=http://earlymodernonlinebib.wordpress.com/&rct=j&sa=x&ei= xkeufe_omq qx _ocwda&ved= ceiq auwba&q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% &usg=afqjcngw u y bmokhtlocq-afgs_okdnq http://www.google.co.uk/search?q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% +blogurl:http://earlymodernonlinebib.wordpress.com/&hl=en&tbo=d&biw= &bih= &gbv= &tbm=blg&sa=x&ei= xkeufe_omq qx _ocwda&ved= cemq guwba http://www.publishing.umich.edu/ / / /tcp-releases-over- -new-eebo-tcp-texts/ http://www.google.co.uk/url?url=http://publishing.umich.edu/&rct=j&sa=x&ei= xkeufe_omq qx _ocwda&ved= cegq auwbq&q=% eebo% +or+% eebo+tcp% +or+% eebo-tcp% +or+% early+english+books+online% &usg=afqjcneeh ucxt- xcyvvt qqe n htrw we are pleased to announce the release of texts from the second phase of our early english books online text creation partnership (eebo-tcp) project. these texts were produced in collaboration with proquest, and ... early english books online – text creation partnership (eebo-tcp) blogs.unimelb.edu.au/eresource/ jul by admin electronic resources, new databases and key research tools at the university of melbourne. call for papers: “revolutionizing early modern studies”? - bodleian ... historyatox.wordpress.com/ may by iholowaty call for papers: “revolutionizing early modern studies”? conference: eebo-tcp . / / by iholowaty · “revolutionizing early modern studies”? the early english books online text creation partnership in . university of ... 

collection development blog » , eebo-tcp texts now live
obelix.lib.hku.hk/cdblog/ oct by electronic resources coordinator
as one of the partner institutions of the early english books online text creation partnership (eebo tcp) project, hku libraries is pleased to announce the completion of the first production phase of eebo tcp. the project ...

ecco-tcp and eebo-tcp: a new way of exploring texts from ...
litlanglibrary.wordpress.com/ mar by litlanguiuc
tired of squinting at the scanned page images in early english books online (eebo) and eighteenth-century collections online (ecco)? those days are now gone: the eebo-tcp and ecco-tcp project databases offer ...

project curriculum/work plan: week one » early modern digital ...
emdigitalagendas.folger.edu/ oct by owen williams
at the end of the day, exercises will be assigned introducing the most widely used digital corpus in early modern english studies, early english books online (eebo). eebo is a commercially available collection of digitized full-text facsimiles. it currently ... participants will break into small groups to find examples and discuss applications of eebo-tcp for research and classroom use. on friday afternoon, discussion returns to the principles of stcs by examining those ...

partnership makes th century texts available to public | the ...
ur.umich.edu/feed may
... publishers, and university libraries to produce scholar-ready text editions of works from digital image collections, including ecco, early english books online (eebo) from proquest, and evans early american imprint from readex. ... tcp outreach coordinator ari friedlander says the eebo-tcp project is much larger than ecco-tcp because pre- works are more difficult to capture with optical character recognition than ecco's th-century texts, and ...

reading rare books online. « vade mecum
andrewkeener.wordpress.com/ jan by andrewkeener
for readers with access, electronic databases including early english books online (eebo) offer thousands of early and rare printed materials that can be downloaded to a home computer, printed out, consulted in a pdf ... for instance, you can page through the artifact in its entirety; you can download it to your computer; you can peruse the ascii text (although eebo's tcp project currently only has available first-edition keyed texts, so this one would not be there).

university of michigan library opens ecco – eighteenth century ...
www.libraries.wright.edu/noshelfrequired/ apr by spolanka
... and university libraries to produce scholar-ready (that is, tei-compliant, sgml/xml enhanced) text editions of works from digital image collections, including ecco, early english books online (eebo) from proquest, and evans early ... according to ari friedlander, tcp outreach coordinator at u-m, the eebo-tcp project is much larger than ecco-tcp because pre- works are more difficult to capture with optical character recognition (ocr) than ecco's ...

from the director – april , – osul odds and ends | from ...
https://library.osu.edu/blogs/director/ apr by batts. @osu.edu
this article from the chronicle of higher education provides more detail – http://chronicle.com/article/language-and/ . early english books online – text creation partnership (eebo-tcp). many years ago, ohio state ...

update on tcp--full text searching of eebo & ecco | lcr collections
libcollections.blogspot.com/ apr by helen clarke
update on tcp--full text searching of eebo & ecco. i just wanted to ... most notably, you can now search all collections (eebo-tcp, evans-tcp, and ecco-tcp) ... early english books online - tcp - , . evans early ...

english at reading · mark hutchings: recent and forthcoming ...
https://blogs.reading.ac.uk/english-at-reading/ nov by cindy
in september i presented a paper at the university of oxford's eebo-tcp conference on the use of the database early english books online (eebo) in teaching, drawing on my part module editing the renaissance ...

early modern digital agendas | hastac
hastac.org/users/edgaradams dec by zhoel
... can historicize, theorize, and critically evaluate current and future digital approaches to early modern literary studies—from early english books online-text creation partnership (eebo-tcp) to advanced corpus linguistics, ...

the future of primary texts online is almost here - profhacker ...
chronicle.com/blogs/profhacker/ apr by prof. hacker
it's no exaggeration to say that life in the humanities has been radically transformed over the last decade or so as a result of the release of databases of primary texts, including early english books online (eebo), eighteenth-century collections online (ecco), and early .... if your institution is a tcp partner, then you can go to the eebo-tcp website and access all the keyed texts (and if your library subscribes to eebo as well, the corresponding images are pulled in).

launch of website | the early modern blog
blogs.reading.ac.uk/emrc/ jul by leigh blount
alice eardley and michelle o'callaghan will be launching the website at the eebo-tcp conference, “revolutionizing early modern studies”? the early english books online text creation partnership in , which will be ...

where material book culture meets digital humanities » wynken de ...
sarahwerner.net/blog/ apr by sarah werner
thanks to eebo (early english books online), ecco (eighteenth century collections online), and gallica (the digital collection of the bibliothèque nationale), among others, digital facsimiles are available for us to consult and download entire works from the early modern printed world. there are limitations, of ... eebo-tcp can make research a bit easier if you're interested, say, in sassafras and want to find instances of it being discussed. in the right hands, you can ...

the permissive digital archive – copious but not compendious
blogs.helsinki.fi/kaislani/ nov by kaislani
[ ] the following example is from john lavagnino, “scholarship in the eebo-tcp age”, talk by john lavagnino at the conference revolutionizing early modern studies? the early english books online text creation ...

appositions: studies in renaissance / early modern literature ...
appositions.blogspot.com/ may by noreply@blogger.com (whow)
we propose to annotate existing texts created by the early english books online text creation partnership (eebo-tcp), enriching these texts by providing detailed information about their prosodic structure. we would use a ...

humanist discussion group, vol. , no. . - renaissance humour
renhum.blogspot.com/ apr by renaissance humour
... commercial publishers, and university libraries to produce scholar-ready (that is, tei-compliant, sgml/xml enhanced) text editions of works from digital image collections, including ecco, early english books online (eebo) from proquest, and ... according to ari friedlander, tcp outreach coordinator, the eebo-tcp project is much larger than ecco-tcp because pre- works are more difficult to capture with optical character recognition (ocr) than ecco's ...

digitalkoans » blog archive text creation partnership project ...
digital-scholarship.com/digitalkoans/ jun by admin
... , , representing a substantial portion of the nearly , books contained in the subscription databases from which they are transcribed: early english books online (eebo), evans early american imprints, and eighteenth century collections online (ecco). ... through , the primary focus of the tcp is to produce around , texts for a second phase of the eebo-tcp partnership (the first phase, which ended in , produced around , texts).

peter scott's library blog: the university of michigan library ...
xrefer.blogspot.com/ apr by peter scott
the university of michigan library has announced the release of , texts from the second phase of its early english books online text creation partnership (eebo-tcp) project. the text creation partnership produced ...

early modern digital agendas at the folger institute | the early ...
www.emintelligencer.org.uk/ nov by karen
... current and future digital tools and approaches in early modern literary studies—from early english books online-text creation partnership (eebo-tcp) to advanced corpus linguistics, semantic searching, and visualization ...

washington college news: wc alum wins national award for ...
washingtoncollegenews.blogspot.com/ feb by washington college
chestertown, md, february , — heidi atwood, a graduate in english from washington college, has received the grand prize in the early english books online/eebo-tcp undergraduate essay ...

p-herbals-msg - stefan's florilegium archive
www.florilegium.org/ jun
to: sca-cooks at ansteorra.org. subject: re: [sca-cooks] gerard's herball. from: "christina l biles" . date: tue, nov : : - . the url for early english books online is. http://wwwlib.umi.com/eebo/. unfortunately, you have to be a member institution for full access to the. project. gerard is ..... plants (deluxe clothbound edition) (hardcover)on the shelf. it's also up on eebo and eebo-tcp for those with academic connections. hope this helps, ...

bodleian libraries secure £ million jisc award « german friend's ...
germanbodfriends.wordpress.com/ mar by iholowaty
this funding enables the early english books online text creation partnership (eebo-tcp), led by the bodleian and the university of michigan, to make available a further , texts as part of their project to offer all ...

hobo: events
www.english.ox.ac.uk/hobo/ nov by ian gadd
... full-text transcriptions of works in early english books online (eebo), we invite proposals for research papers and posters reflecting the various ways in which tcp texts are being used. is eebo-tcp revolutionizing research and teaching in ...

bodleian libraries - digify - miriam mueller
blog.miriammueller.net/ mar
this grant extends earlier work done to create the early english books online (eebo) resource through proquest. eebo “provides access to digital facsimiles of over , works published in england or english between and .” the eebo-tcp (text creation partnership) has made available more than , documents that allow users to view either the original document or a fully searchable and browsable full-text version. the new grant will make available another ...

qualitative impacts

the quantitative results in the preceding section give one view of the impact of eebo/eebo-tcp, but to put these raw numbers in context, we also gathered extensive qualitative information using focus groups, interviews, and an analysis of user feedback. these results are reported here.

focus groups

sect workshop

the first workshop for the sect project was held on the th of april, and was attended by people, including academics, editors, project developers and digital technologies specialists. a full report on this workshop is available via the project website, www.bodleian.ox.ac.uk/eebotcp/sect.

digital humanities summer school focus group

a short focus group on eebo-tcp was held on the july and was attended by around people.
the group was made up of delegates to the digital humanities summer school, which is multidisciplinary and not period-specific, and so provided a useful audience, not restricted to those with a particular interest in literary studies, history or the early modern period. attendees were asked about their awareness of eebo-tcp and about their use of digital humanities resources. the main points to emerge from this focus group were:

1. eebo-tcp is reasonably well-known in a general humanities context. about a third of attendees said they'd heard of it, and one compared it with the oed in terms of the importance of the resource in his department.
2. when asked about other resources comparable to tcp, participants suggested:
   a. broadside ballads at santa barbara (modern text, old layout; only of ballads done; in some ways better than eebo, in others worse)
   b. ecco (participants aware that it was ocr-based)
   c. old bailey online
   d. allegra catalogue (“not efficient to search”)
   e. googlebooks
   f. internet archive (user liked that they can make their own collection and search it)
3. what improvements/developments would users like from eebo-tcp?
   a. make it easier to find particular genres, e.g. ballads
   b. diachronic presentation / n-gram-like qualities
   c. ability to separate the paratext, e.g. front matter
4. no one seems to go to library training sessions. participants were unaware of eebo training sessions. when asked for a show of hands, about half preferred online training and about half face to face, but some wanted both.
5. one participant suggested that librarians are against publisher training; they think it's not appropriate. so if eebo-tcp offers training, it should be made very clear that it is eebo-tcp editors who are doing the training.

digital citation focus group

on the th november a focus group on digital citation and research methodologies in the humanities was held, attended by people, including academics, editors, online and print publishers, and digital content creators. the focus group was held in response to themes which developed at the eebo-tcp conference in september (more details are given under “user feedback” below). a full report on digital citation will be made available separately.

the focus group explored the lack of adequate citation of digital resources and the variation in practice amongst those who do cite or otherwise acknowledge their use of such material. the group discussion indicated that there is unlikely to be a quick or easy way to change either the perception that digital resources are somehow less “respectable” or scholarly than traditional print ones or the variation in citation practice. however, there are measures which could gradually improve the citation situation in the humanities. the group suggested that the main measures likely to improve the situation are:

1. publicizing the issue. we must continue to talk about it, formally and informally. conference papers, blog posts, articles and presentations which focus on the problem of digital citation will keep the issue current and will encourage users to consider their own practices. it may be particularly productive to target the subject associations as they often run associated journals.
2. making citation easy. creators or curators of digital content must make it as easy as possible for their users to cite (or as difficult as possible for them not to), including making useful urls.
3. incentivizing citation. researchers and projects could be incentivized to make clear their use of resources, e.g. by publicizing articles or projects on the eebo-tcp website, or even by some kind of monetary incentive – everyone who has cited a text could enter some kind of competition (like the eebo-tcp essay competition of old).
4. dating digital items. digital collections must make it clear how to date content accessed via their resource. release information and/or editorial updates should be made as obvious as possible.
5. interdisciplinary knowledge exchange. we must look for input from other areas of study where there is philosophical overlap – for example, citation for audiovisual material (the work being done by sian barber at royal holloway) or for music.
6. respected institutions leading change. we may already be seeing a gradual shift in practice led by respected bodies – recently the royal society announced their move to continual publication, whereby they will give a doi but no page numbers. such moves will change the focus within digital scholarship.

in addition to the broad measures outlined above, we should consider specific measures in relation to teaching and training. uptake for library training sessions tends to be low. uptake for web tutorials, anecdotally, seems rather low too – but this may be due to lack of publicity or planning for dissemination. participants agreed that it would be helpful to draft citation guidelines for digital resources that could be circulated to academic departments and subject administrators for inclusion in local documentation circulated to students as they begin their studies. making such teaching and training materials easily available on project websites would also be helpful. one-off project-led training sessions could be worthwhile, provided that they are properly promoted to encourage good attendance.

what other, more specific, recommendations emerged from the focus group discussion?

1. make urls as short as possible and, if possible, human-decodable.
2. include a clear link to a citation from the main page of a text, image, etc.
3. encourage/guide users always to give a date of access whenever they cite a digital resource, and include such a date in automatically generated citations.
4. provide easily accessible editorial documentation at the point of accessing texts and images (rather than solely on project – descriptive – websites).
5. digital content creators should consider how best to raise and develop the scholarly reputation of their resource, and promote that resource accordingly.
6. where content (such as, from , eebo-tcp phase one texts) is in the public domain and not tied to one point of access, citation information should be tied to individual texts (perhaps by including a citation in the tei header, if possible).

overall, the group felt that the best way to tackle the problem of digital citation is to continue to raise it as an issue and prompt users to reassess individual and institutional practices.

interviews and opinion-gathering

librarians

librarians play an important role as mediators of digital content; they are often the key people responsible for introducing users to online resources and for helping these users resolve any problems they encounter. indeed eebo-tcp itself is the product of libraries; text production is conducted at the university of michigan library and the bodleian library at oxford and draws on library special collections around the world. library representatives were present at the sect workshops and focus groups, and at the eebo-tcp conference. additionally, individual interviews with specialist librarians were conducted for the project: isabel holowaty (history subject librarian at the bodleian library), sarah wheale (bodleian library rare books cataloguer), sean hughes (assistant librarian at trinity college dublin library), and teresa pedroso (disability librarian at the bodleian). some of the issues raised in discussions with librarians are outlined below.

eebo is presented as a key resource for the study of history and english language and literature. however, students don’t know the difference between eebo and eebo-tcp – they don’t understand why some books have full text and some don’t (some believe that all image sets have full-text). academics and more advanced researchers better understand the distinction, and are better aware of the nature of the collection.

many oxford students come across eebo via the marc records available through solo. they may find a reference to a text in a book they are reading and then search for it in solo, which pulls up the eebo text. they like solo because it uses a single search box, like google. a shocking number of students find things via google, which doesn’t find things in collections like eebo. younger students assume that if they find something on the internet it exists, if they don’t it doesn’t exist – they don’t, for example, ask rare books specialists about uncatalogued or card-catalogued material.

readers don’t understand some very basic things about eebo. some don’t know that you can’t full-text search images – confusion with eighteenth century collections online (ecco) could be an issue here as ecco uses ocr technology and allows results to be highlighted on the images. many readers don’t understand why eebo is different.

documentation is extremely important – users need to be able to establish clearly the nature of a collection and what they can do with it. many students do not read help pages or editorial policy documents; perhaps a click-through button on every page of full-text would be better, saying something like “how was this text created?”

placement of help pages and documentation also needs to be taken into account when considering the needs of disabled users – it is important not to label any help pages as “for the disabled”.
such a label reduces the number of people who will look at the page to a minority, while people who could benefit from the information ignore it as they feel it is not for them. for example, an elderly researcher whose eyesight isn’t what it used to be could benefit from larger font sizes, but would not think to read documentation intended for disabled users. calling such documentation something like “user preferences” would be more useful to more people. the language level of help pages is also important – overly complicated textual language should be avoided. in general, online collections should ideally hold usability sessions to look in detail at how they cater for disabled users, looking for such things as clearly labelled buttons, clear form fields, easy to access text versions for the visually impaired, clearly marked steps to get to download text, and appropriate labelling.

more project updates would be welcome – these could be made available via websites or blogs and then librarians could disseminate them amongst the student body. web tutorials would also be helpful in this respect – these need to be short and specific, e.g. how to do a proximity search, a boolean search, wild card searching, etc.

students tend not to attend library training sessions. training and user education materials would be better sent to academics who could give them to their students – students do attend compulsory sessions in their academic syllabus.

one librarian estimated that of the direct reader queries that she gets, around % of them involve online resources in some way. readers think they understand how to use print collections and are more likely to ask about online resources. this illustrates the importance of making sure that librarians are kept up-to-date about online collections. a hub for information about eebo, eebo-tcp, and projects based on or related to it would be very useful in this context.

encoding experts

sebastian rahtz and james cummings of it services at oxford university are very active members of the text encoding initiative (tei) community and they have consulted on numerous projects which have made use of eebo-tcp texts. some of the issues raised by james and sebastian were:

• a primary concern about eebo-tcp is that it is a one-way workflow; there is no mechanism to feed in corrections. crowdsourcing was identified as offering significant potential in this area.
• the metadata associated with tcp files can be confusing. there are multiple file identifiers and unhelpful file names, plus the headers are separate from the files. which is the canonical identifier? such issues have implications for citation.
• some collections (like the oxford text archive) allow corrected and enhanced versions to be filtered back in to the main corpus, but this can produce a situation where multiple versions of a single text exist and have to be managed carefully.
• sebastian and james have converted many of the tcp texts into tei-p , for use by projects and for development as e-books. they stress the importance in their view of converting all tcp files in this way, and suggest that funding ought to be sought to convert the texts en masse before the lifting of restrictions on phase texts.
• tcp documentation is a problem. projects should have as part of their start-up something like a wiki, which would allow all queries and resolutions to be accessible in a shared space. users who want to work with the underlying tagging need to know what was meant by a particular piece of mark-up or why something was tagged in a particular way. good documentation needs to be made available for users.

projects

there have been many projects which have taken eebo-tcp texts and have used and developed them for new purposes. representatives of a number of these projects have participated in sect events, notably the sect workshop in april and the eebo-tcp conference in .
individual meetings and/or conference calls were also held with representatives from particular projects, and others were contacted via email and invited to share their views. project perspectives appear in the workshop, conference and digital citation reports, and a list of projects based on or connected to eebo-tcp text is included as an appendix to this report. this section will briefly deal with some additional issues raised by projects.

many projects use the tcp transcriptions as a good starting point for further work. the john donne society’s digital text project, for example, checks and corrects errors and gaps in the transcriptions, then edits the xml mark-up to make it fully tei-compliant before adding additional, more detailed mark-up. such projects like to access the texts in both plain text and xml formats, and would appreciate being able to do so via ftp or a dedicated content management system that would allow them to get the text in various desired formats. e-book formats are also desirable, for example for ease of reading. as the results of the survey also suggest, accuracy and comprehensiveness are seen to be the most important elements of tcp, as xml encoding can be edited and updated automatically, and for different purposes.

some project personnel expressed a desire to be able to feed back some of their work into the tcp corpus – many projects work carefully to perfect the particular texts and would like to be able to share the results of their hard work with other eebo users. a mechanism to include enriched mark-up would also be attractive, although it may be problematic. a desire for improved metadata was also expressed – better metadata would allow researchers to more effectively isolate the works or authors or genres that interest them.
eebo-tcp editors

arguably the people who know the pros, cons and potential of the eebo-tcp collection best are those who work most closely with it – the eebo-tcp editing teams based at michigan and oxford. for this reason, the editors at oxford were met with individually, and the editors at michigan were invited to contribute via email, in order to put together a picture of how the corpus is viewed by those who are creating it. editors’ views focused in the main on the topics outlined below – these issues were identified as important, although opinion varied on the specifics.

encoding

some felt it very important that the tcp texts should be converted to tei p -compliant xml, on the understanding that this would be an automated process that would require some compromise. maintaining a corpus that is in up-to-date tei has significant benefits in terms of interoperability and future development. many projects already take the xml provided by the tcp to tei experts in order to have the data converted. it would be very useful to such users were our data already compliant. others are much less convinced of the importance of the tei – it was suggested that to those outside the tei community, the encoding scheme doesn’t mean much. when the texts enter the public domain, they will go in all directions (uncontrolled by the tcp) and many users will access them in e-book format via amazon or the internet archive. it was argued that the underlying encoding scheme in such a context is of little interest to most users.

conservation

some argued very strongly that the tcp and/or oxford and michigan have a cultural responsibility to conserve the data that has been created. there should be a simple, plain data set held in untouchable ur-text form. there should be a time-stamped archive copy (several copies ideally) of every text created by the tcp – physical back-up tapes would be desirable from a preservation point of view.
this basic principle of preservation was felt by some to be fundamentally important and of significant long-term value.

documentation

the state of the tcp documentation was strongly felt to be a problem. many (most?) of the editorial policy decisions are not written down in a way that users can access. it is strongly felt that users need to know as much information about how the texts were created as possible, and a clear editorial policy needs to be made available. one view was that tcp documentation should be (have been) date stamped. it should be possible for researchers to establish when a text was reviewed and therefore what policies were being followed at that time. one suggestion was that the best way to address this problem would be to write a tcp manual with a full index (and that specific funding might be sought to carry out this work). a supplementary “history of the tcp” document would also be useful. such complete and careful documentation would literally sustain the expertise and decision-making of the project – complete transparency on behalf of the tcp will affect how people use the resource.

accuracy and correction

overall, editors feel that the tcp is a fantastic corpus, but acknowledge that there are inevitable inadequacies in the corpus, due to the production processes and the nature of the underlying images, but also due to the unavoidable differences between individual editors and therefore texts. in an ideal world there would be a second pass through texts, but given that this is highly unlikely if not impossible, the tcp must consider the potential offered by correction. some feel that there definitely should be a mechanism for correcting errors and illegibles in the corpus. eebo-tcp will be a key resource for decades to come, and accuracy of data is very important to most users. however, it is unclear what the business model for such a process could or should be.
others were uncertain as to how much value a correction process would actually have – how much would it really improve the data? some suggested that a pilot project to look at correction could answer this question. the idea of shared applications is growing in currency (see project bamboo, for example). involving the community of users would be an interesting approach to correcting the data. crowdsourcing initiatives often encourage users to build a resource – like transcribe bentham, your art, what’s the score at the bodleian? – rather than to correct existing data. corrections to the tcp data might be better served by the development of a plug-in for textual projects – essentially a form (plus image) for users to fill in with corrections and submit for manual checking by a digital editorial team. however, some felt very strongly that a correction process based on crowdsourcing would be a bad idea for quality assurance reasons – it was suggested that a better model would involve teams at various libraries who could correct from the originals or one central team who could travel to various libraries. thoughts on correction focused on transcription rather than encoding – standardizing encoding across all texts doesn’t seem to be something editors consider workable. it was argued, however, that front and back matter might be one area that would particularly benefit from standardized encoding. if any correction and editing to the corpus is to be carried out, it was strongly argued by some that an edition-based model be employed. an editions model would be easier to cite, and would allow the important ability to date stamp research results.

completeness

completeness of the corpus was acknowledged to be of importance for scholarship – research results have to be heavily caveated. the collection is inherently partial – it contains only what survived and is held by or so libraries. the eebo-tcp selection process has added another layer of partiality.
ideally the tcp corpus would be expanded to include all books in eebo, but it was conceded that the funding for such a process would be very difficult to obtain. there were differing views on what should be done if the tcp had some more funding and therefore had to prioritise the inclusion of some texts over others. some argued that the tcp should look at incorporating single editions of everything including foreign language texts in order to correct the huge tcp bias in favour of english language material. if multiple editions of works are done, however, they should be linked together. an alternative view held that the lack of latin, french, and multiple editions was not too problematic. the tcp policy – that one edition of everything (as it were) is better than multiple editions of some things – was felt to be sound. in this view, the order of priority would be 1. finish all english language texts, 2. finish multiple editions of english works, 3. work on non-english material. the latter was considered to require a different skill-set than that developed over the course of the tcp project, and would require input from specialists in the various languages concerned. the completion of non-english texts might be better done in a collective european context.

metadata

improved metadata was considered important by some. one view is that the tcp should add estc numbers and the shelf mark of the physical copy to the catalogue records and to the tcp metadata (i.e. the tei header), because of the increasing volume of work concentrating on copy-specific research. another suggestion was that we need to consider how users will search metadata when the texts separate from proquest’s image resource (i.e. when they enter the public domain).

copyright

some felt that while it is understandable, it is regrettable from an open-access point of view that the images and metadata are copyrighted by proquest.
might proquest be willing to change their copyright statement to allow professional (but not commercial) use? this would help users who might want to use an image on a blog or a presentation – they may not even realise that they ought to request appropriate permissions from the publisher.

public domain in

the phase one texts will enter the public domain in . some editors think that the tcp has a responsibility to manage the availability of the texts at this point. a plan for providing access and a publicity strategy need to be devised. in terms of the textual data, it was felt by some that if any improvements or developments were made to the phase texts before january , then there should also be a commitment to making the same changes across the phase data set. internal consistency of this sort, across all the tcp texts, is very important.

general

editors have many ideas about potential enhancements to and developments of the data – they believe that the editorial teams who created the texts should look into ways to further develop the data for the benefit of users. collectively, editors see the tcp as a brilliant project. for some, the practicalities of creating it can make one forget what a fantastic and important resource it is. it will be transformative for generations to come, especially as other resources come online. the more high-quality resources there are available to search across, the more important and enlightening research will come to light. the project needs to focus on user needs and user understanding. our user group will need an authoritative version of the texts, and we should have a user education strategy to publicise the corpus in the future.

user feedback

conference

the sect project co-hosted, with oxford eebo-tcp, the conference “revolutionizing early modern studies”? the early english books online text creation partnership in , held in oxford on the th and th september .
the conference coincided with the tenth year of production for the tcp in oxford and it allowed sect and eebo-tcp to reflect on the impact that the corpus has had on research and teaching in the early modern period and to explore planned and potential developments in the future. the conference provided invaluable evidence for sect of just how valued the corpus is amongst its users, and the many ways that it enhances and develops research in the humanities. proceedings of the conference are available via the oxford university research archive, at http://ora.ox.ac.uk/objects/uuid: e ddb -f - cb - faf- a af . the conference was opened by richard ovenden, associate director of the bodleian libraries, who has been an important advocate for the tcp since its inception. richard introduced keynote speaker dr john lavagnino of king’s college london, who delivered a superb survey of “scholarship in the eebo-tcp age”. john set the tone for the whole conference by exploring the philosophical questions and practical challenges of digital scholarship. he explained the importance of the tcp production model – transcribed rather than ocred text – and considered the kinds of work that the corpus allows scholars to do, either uniquely in digital rather than print form or very significantly faster than previously possible. john also introduced what would become a recurring theme in the conference – that eebo-tcp is “everywhere in early modern studies, though largely hidden: overt citation and discussion are minimal”. this citation problem was followed up in the sect focus group on digital citation (see http://www.bodleian.ox.ac.uk/eebotcp/sect/ / /digital-citation-focus-group/), while research methodologies in the humanities remains a topic ripe for further discussion. the first panel, eebo-tcp: practice and potential, was opened by becky welzenbach, the tcp’s outreach librarian, who gave delegates an overview of the current state of the tcp. 
she was followed by martin mueller, who gave a thought-provoking talk outlining his work on linguistic annotation of the tcp corpus, looking in particular at his work on morphadorner. martin subsequently wrote a very interesting blog post on the conference and on his ideas for the future of eebo-tcp which will be discussed further under “projects” below. martin was followed by marie-helene lay, who provided a detailed exploration of how to deal with spelling variation in early modern french and english. panel one was completed by elizabeth scott-baumann, who discussed the database she has created together with ben burton, which offers early modern poetry marked up by form and metre. this first session gave a real sense of the careful and detailed development work currently being undertaken, based on eebo-tcp materials, and of some of the challenges presented by using early modern materials for research. peter auger opened the second panel, on early modern reception and response, with a fascinating discussion of how eebo-tcp has allowed him to explore early modern english responses to french poets. the tcp corpus has allowed peter to build on earlier research by allowing him to identify additional sources. this sense of building on and reinforcing earlier research was itself reinforced in simon davies’ discussion of his work on early modern demonology. mary erica zimmer, like peter and simon, gave a clear sense of the kind of detailed and specialist work that the tcp enables in her talk on spenser’s letter of the authors. these stimulating panels were followed by a poster session, which showcased some of the projects which have used or are related to eebo-tcp in significant ways. james cummings illustrated the productive ways that eebo-tcp materials can be enhanced and reused for new purposes.
james also joined ian gadd, giles bergel, and pip willcox in a poster exploring a project which will be of great interest to eebo-tcp users, the digitization of the stationers’ register. jayne henley provided a striking poster showing her work on editing texts in welsh for the tcp. jim kuhn, sarah werner and owen williams of the folger shakespeare library showed a poster focusing on their plans for interoperable digital editions of early modern drama. judith siefring’s final poster on sect: sustaining the eebo-tcp corpus in transition, described the project’s focus on assessing the impact of the tcp corpus, for which all of the posters and panels at the conference have supplied such valuable input. the third and final panel of the first day of the conference was a superb illustration of the ways in which the tcp is being used in teaching. heather froelich presented her paper, co-written with richard j whitt and jonathan hope, on the textlab course run at strathclyde university, which fosters collaborative working to explore text and language in detail. mark hutchings then spoke about his course which uses eebo-tcp materials to teach his students editing theory and practice. leah knight surveyed her ten years’ worth of experience in using the tcp in the classroom and the challenges that this has brought with it. this excellent panel provided a useful counterpoint to the impressive and detailed research work outlined earlier in the day.
day two of the conference opened with panel four, on the subject of the politics and practicalities of editing. daniel carey and anders ingram opened with an engaging paper on their work creating an edition of richard hakluyt’s principal navigations based on the tcp transcription. giles bergel followed them with a timely and thought-provoking discussion on the politics and poetics of transcription. this very practical engagement with the challenges of digital editing was followed up by michelle o’callaghan and alice eardley’s presentation of their own work creating digital editions for the verse miscellanies online project. sebastian rahtz closed this fascinating session with an exploration of how he and james cummings have worked to bring the tcp encoding into line with more recent versions of the text encoding initiative guidelines. panel five concentrated on the work being done by the corpus research on early modern english (creme) team at lancaster university. alistair baron, andrew hardie, paul rayson, stephen pumphrey, alison findlay and liz oakley-brown gave a series of papers exploring the potential of the tcp corpus for linguistic and semantic analysis, and applications in the classroom. these very stimulating papers were extremely well-received by the conference audience. the sixth and final panel of the conference, on digital research methods, was opened by jake halford, who discussed his work on the emergence of “new philosophy” in the seventeenth century. jake explored how eebo-tcp has helped him in his research and graciously suggested that hearing the work of others explored during the conference has given him possibilities for his own work. helen sonner then gave a very engaging paper on the popular construction of meaning in early modern print, tracing the meaning and development of the word “plantation”.
matthew steggle closed the session with a charming discussion of how eebo-tcp has enabled his work looking for “lost plays”, concentrating, for this paper, on the work of thomas dekker. the conference was brought to a close with a summary and plenary discussion, led by emma smith. emma skilfully pulled together the themes of the conference, highlighting the range of work being carried out using eebo-tcp and demonstrating the value of the conference in bringing scholars together to share their work and ideas. emma led a discussion which considered how scholars can fully embrace the possibilities offered by digital technology, and how this changing digital landscape is prompting researchers and content creators alike to think about research methodologies. how are research methods changing? how can scholars explain and make explicit their methodologies? what role can content creators play in this process? this discussion of the changing nature of the research process, and of research goals, led on to a discussion of the role of libraries and in particular of rare book libraries. by considering the state of the tcp in , this conference enabled a stimulating exploration of the changing research landscape for scholars in the humanities and for those who endeavour to support such research. the important questions raised are ripe for further discussion in the future and have fed directly into the work of the sect project.

conclusions

the tidsr analysis of eebo-tcp has provided an important opportunity to reflect on the impact of eebo-tcp. this process of taking a step back and looking in detail at how users use the corpus, what they like and don’t like about it, what the reputation of eebo-tcp is and how a good reputation can be maintained in the future, has been enormously valuable for eebo-tcp.
overall, the broad consensus seems to be that eebo-tcp is a fantastic resource and is greatly valued by the scholars that use it, but there are improvements that could be made to ensure its central place in early modern scholarship for decades to come. many themes developed over the course of the study and were brought up by a variety of different people in different areas. the material gathered through the tidsr process will now be used to formulate some recommendations for improvements that could or should be made to eebo-tcp and to establish how feasible these recommendations are in practice.

appendix

list of projects based on or related to eebo-tcp

this list demonstrates the range of projects which have made use of the eebo-tcp corpus. it is not an exhaustive list. all urls were accessed on / / .

• complete works of james shirley, http://www .warwick.ac.uk/fac/arts/ren/oupjamesshirley/
• corpus research on early modern english (crème), http://ucrel.lancs.ac.uk/
• electronic database of poetic form, http://digital.humanities.ox.ac.uk/projectprofile/project_page.aspx?pid=
• great writers inspire, http://openspires.oucs.ox.ac.uk/greatwriters/
• the holinshed project, http://www.english.ox.ac.uk/holinshed/
• the hakluyt project, http://www.rmg.co.uk/researchers/research-areas-and-projects/hakluyt-editorial-project/
• inke: implementing new knowledge environments, http://inke.ca/
• jisc historic books, http://www.jisc-collections.ac.uk/jiscecollections/jischistoricbooks/
• john donne society digital text project, http://community.itergateway.org/groups/john-donne-society-digital-text-project
• leme: lexicons of early modern english, http://leme.library.utoronto.ca/
• manuscripts online, http://manuscriptsonline.wordpress.com/
• the map of early modern london, http://mapoflondon.uvic.ca/
• the monk workbench, https://monk.library.illinois.edu/cic/public/
• morphadorner, http://morphadorner.northwestern.edu/
• patterns of reference, http://www.internetcentre.imperial.ac.uk/project/por/
• philologic, https://sites.google.com/site/philologic /
• the spenser archive, http://spenserarchive.org
• verse miscellanies online, http://www.reading.ac.uk/emrc/research-activities/emrc-miscellanies-project.aspx
• witches in early modern england, http://witching.org/

sustaining the eebo-tcp corpus in transition: report on the tidsr benchmarking study. judith siefring & eric t. meyer. london: jisc

in, out, across, with: collaborative education and digital humanities (job talk for scholars' lab)

mar nd

i’ve accepted a new position as the head of graduate programs in the scholars’ lab, and i’ll be transitioning into that role over the next few weeks! as a part of the interview process, we had to give a job talk.
while putting together this presentation, i was lucky enough to have past examples to work from (as you’ll be able to tell, if you check out this past job talk by amanda visconti). since my new position will involve helping graduate students through the process of applying for positions like these, it only feels right that i should post my own job talk as well as a few words on the thinking that went into it. blemishes, jokes, and all, hopefully these materials will help someone in the future find a way in, just as the example of others did for me. and if you’re looking for more, visconti has a great list of other examples linked from her more recent job talk for the scholars’ lab. for the presentation, i was asked to respond to this prompt: what does a student (from undergraduate to doctoral levels) need to learn or experience in order to add “dh” to his or her skill set? is that an end or a means of graduate education? can short-term digital assignments in discipline-specific courses go beyond “teaching with technology”? why not refer everyone to online tutorials? are there risks for doctoral students or the untenured in undertaking digital projects? drawing on your own experience, and offering examples or demonstrations of digital research projects, pedagogical approaches, or initiatives or organizations that you admire, make a case for a vision of collaborative education in advanced digital scholarship in the arts and humanities. i felt that each question could be a presentation all its own, and i had strong opinions about each one. dealing with all of them seemed like a tall order. i decided to spend the presentation close reading and deconstructing that first sentence, taking apart the idea that education and/or digital humanities could be thought of in terms of lists of skills at all. along the way, my plan was to dip into the other questions as able, but i also assumed that i would have plenty of time during the interview day to give my thoughts on them. 
i also wanted to try to give as honest a sense as possible of the way i approach teaching and mentoring. for me, it’s all about people and giving them the care that they need. in conveying that, i hoped, i would give the sort of vision the prompt was asking for. i also tried to sprinkle references to the past and present of the scholars’ lab programs to ground the content of the talk. when i mention potential career options in the body of the talk, i am talking about specific alumni who came through the fellowship programs. and when i mention graduate fellows potentially publishing on their work with the twitter api, well, that’s not hypothetical either. so below find the lightly edited text of the talk i gave at the scholars’ lab - “in, out, across, with: collaborative education and digital humanities.” i’ve only substantively modified one piece - swapping out one example for another. and a final note on delivery: i have heard plenty of people argue over whether it is better to read a written talk or deliver one from notes. my own sense is that the latter is far more common for digital humanities talks. i have seen both fantastic read talks and amazing extemporaneous performances, just as i have seen terrible versions of each. my own approach is, increasingly, to write a talk but deliver that talk more or less from memory. in this case, i had a pretty long commute to work, so i recorded myself reading the talk and listened to it a lot to get the ideas in my head. when i gave the presentation, i had the written version in front of me for reference, but i was mostly moving through my own sense of how it all fit together in real time (and trying to avoid looking at the paper). my hope is that this gave me the best of both worlds and resulted in a structured but engaging performance. your mileage may vary! 
in, out, across, with: collaborative education and digital humanities it’s always a treat to be able to talk with the members of the uva library community, and i am very grateful to be here. for those of you that don’t know me, i am brandon walsh, mellon digital humanities fellow and visiting assistant professor of english at washington and lee university. the last time i was here, i gave a talk that had almost exclusively animal memes for slides. i can’t promise the same robust internet culture in this talk, but talk to me after and i can hook you up. i swear i’ve still got it. in the spirit of amanda visconti, the resources that went into this talk (and a number of foundational materials on the subject) can all be found in a zotero collection at the above link. i’ll name check any that are especially relevant, but hopefully this set of materials will allow the thoughts in the talk to flower outwards for any who are interested in seeing its origins and echoes in the work of others. and a final prefatory note: no person works, thinks or learns alone, so here are the names of the people in my talk whose thinking i touch upon as well as just some – but not all – of my colleagues at w&l who collaborate on the projects i mention. top tier consists of people i cite or mention, second tier is for institutions or publications important to discussion, and final tier is for direct collaborators on this work. today i want to talk to you about how best to champion the people involved in collaborative education in digital research. i especially want to talk about students. and when i mention “students” throughout this talk, i will mostly be speaking in the context of graduate students. but most of what i discuss will be broadly applicable to all newcomers to digital research. 
my talk is an exhortation to find ways to elevate the voices of people in positions like these to be contributors to professional and institutional conversations from day one and to empower them to define the methods and the outcomes of the digital humanities that we teach. this means taking seriously the messy, fraught, and emotional process of guiding students through digital humanities methods, research, and careers. it means advocating for the legibility of this digital work as a key component of their professional development. and it means enmeshing these voices in the broader network around them, the local context that they draw upon for support and that they can enrich in turn. i believe it is the mission of the head of graduate programs to build up this community and facilitate these networks, to incorporate those who might feel like outsiders to the work that we do. doing so enriches and enlivens our communities and builds a better and more diverse research and teaching agenda. this talk is titled “in, out, across, with: collaborative education and digital humanities,” and i’ll really be focusing on the prepositions of my title as a metaphor for the nature of this sort of position. i see this role as one of connection and relation. the talk runs about minutes, so we should have plenty of time to talk. when discussing digital humanities education, it is tempting to first and foremost discuss what, exactly, it is that you will be teaching. what should the students walk away knowing? to some extent, just as there is more than one way to make breakfast, you could devise numerous baseline curricula. this is what we came up with at washington and lee for students in our undergraduate digital humanities fellowship program. we tried to hit a number of kinds of skills that a practicing digital humanist might need. it’s by no means exhaustive, but the list is a way to start. 
we don’t expect one person to come away knowing everything, so instead we aim for students to have an introduction to a wide variety of technologies by the end of a semester or year. they’ll encounter some technologies applicable to project management, some to front-end design, as well as a variety of programming concepts broadly applicable to a variety of situations. lists like this give some targets to hit. but still, even as someone who helped put this list together, it makes me worry a bit. i can imagine younger me being afraid of it! it’s easy for us to forget what it was like to be new, to be a beginner, to be learning for the first time, but i’d like to return us to that frame of thinking. i think we should approach lists like these with care, because they can be intimidating for the newcomer. so in my talk today i want to argue against lists of skills as ways of thinking. i don’t mean to suggest that programs need no curriculum, nor do i mean to suggest that no skills are necessary to be a digital humanist. but i would caution against focusing too much on the skills that one should have at the end of a program, particularly when talking about people who haven’t yet begun to learn. i would wager that many people on the outside looking in think of dh in the same way: it’s a big list of unknowns. i’d like to get away from that. templates like this are important for developing courses, fellowship, and degree-granting programs, but i worry that the goodwill in them might all too easily seem like a form of gatekeeping to a new student. 
it is easy to imagine telling a student that “you have to learn github before you can work on this project.” it’s just a short jump from this to a likely student response - “ah sorry - i don’t know that yet.” and from there i can all too easily imagine the common refrain that you hear from students of all levels - “if i can’t get that, then it’s because i’m not a technology person.” from there - “digital humanities must not be for me.” instead of building our curricula out of as-yet-unknown tool chains, i want to float, today, a vision of dh education as an introduction to a series of professional practices. lists of skills might be ends but i fear they might foreclose beginnings. instead, i will float something more in line with that of the scholarly communication institute (held here at uva for a time), which outlined what they saw as the needs of graduate and professional students in the digital age. i’ll particularly draw upon their first point here (last of my slides with tons of text, i swear): graduate students need training in “collaborative modes of knowledge production and sharing.” i want to think about teaching dh as introducing a process of discovery that collapses hierarchies between expert and newcomer: that’s a way to start. this sort of framing offers digital humanities not as a series of methods one does or does not know, but, rather, as a process that a group can engage in together. do they learn methods and skills in the process? of course! anyone who has taken part in the sort of collaborative group projects undertaken by the scholars’ lab comes away knowing more than they came in with. but i want to continue thinking about process and, in particular, how that process can be more inclusive and more engaging. by empowering students to choose what they want to learn and how they want to learn it, we can help to expand the reach of our work and better serve our students as mentors and collaborators. 
there are a few different ways in which i see this taking place, and they’ll form the roadmap for the rest of the talk. apologies - this looks like the sort of slide you would get at a business retreat. all the same - we need to adapt and develop new professional opportunities for our students at the same time that we plan flexible outcomes for our educational programs. these approaches are meant to serve increasingly diverse professional needs in a changing job market, and they need to be matched by deepening support at the institutional level. so to begin. one of our jobs as mentors is to encourage students to seek out professionally legible opportunities early on in their careers, and as shapers of educational programs we can go further and create new possibilities for them. at w&l, we have been collaborating with the scholars’ lab to bring uva graduate students to teach short-form workshops on digital research in w&l classrooms. funded opportunities like this one can help students professionalize in new ways and in new contexts while paying it forward to the nearby community. a similar initiative at w&l that i’ve been working on has our own library faculty and undergraduate fellows visiting local high schools to speak with advanced ap computer science students about how their own programming work can apply to humanities disciplines. i’m happy to talk more about these in q&a. we also have our student collaborators present at conferences, both on their own work and on work they have done with faculty members, both independently and as co-presenters. here is abdur, one of our undergraduate mellon dh fellows, speaking at the bucknell digital scholarship conference last fall about the writing he does for his thesis and how it is enriched by and different from the writing he does in digital humanities contexts. while this sort of thing is standard for graduate students, it’s pretty powerful for an undergraduate to present on research in this way.
learning that it’s ok to fail in public can be deeply empowering, and opportunities like these encourage our students to think about themselves as valuable contributors to ongoing conversations long before they might otherwise feel comfortable doing so. but teaching opportunities and conferences are not the only ways to get student voices out there. i think there are ways of engaging student voices earlier, at home, in ways that can fit more situations. we can encourage students to engage in professional conversations by developing flexible outcomes in which we are equal participants. one approach to this with which i have been experimenting is group writing, which i think is undervalued as a taught skill and possible approach to dh pedagogy. an example: when a history faculty member at w&l approached the library (and by extension, me) for support in supplementing an extant history course with a component about digital text analysis, we could have agreed to offer a series of one-off workshops and be done with it. instead, this faculty member – professor sarah horowitz – and i decided to collaborate on a more extensive project together, producing introduction to text analysis: a coursebook. the idea was to put the materials for the workshops together ahead of time, in collaboration, and to narrativize them into a set of lessons that would persist beyond a single semester as a kind of publication. the pedagogical labor that we put into reshaping her course could become, in some sense, professionally legible as a series of course modules that others could use beyond the term. so for the book, we co-authored a series of units on text analysis and gave feedback on each other’s work, editing and reviewing as well as reconfiguring them for the context of the course. professor horowitz provided more of the discipline-specific material that i could not, and i provided the materials more specific to the theories and methods of text analysis. 
neither one of us could have written the book without the other. professor horowitz was, in effect, a student in this moment. she was also a teacher and researcher. she was learning at the same time that she produced original scholarly contributions. even as we worked together, for me this collaborative writing project was also a pedagogical experiment that drew upon the examples of robin derosa, shawn graham, and cathy davidson, in particular. davidson taught a graduate course on “ st century literacies” where each of her students wrote a chapter that was then collected and published as an open-access book. for us as for davidson, the process of knowing, the process of uncovering is something that happens together. in public. and it’s documented so that others can benefit. our teaching labor could become visible and professionally legible, as could the labor that professor horowitz put into learning new research skills. as she adapts and tries out ideas, and as we coalesce them into a whole, the writing product is both the means and the end of an introduction to digital humanities. professor horowitz also wanted to learn technical skills herself, and she learned quite a lot through the writing process. rather than sitting through lectures or being directed to online tutorials by me, i thought she would learn better by engaging with and shaping the material directly. her course and my materials would be better for it, as she would be helping to bind my lectures and workshops to her course material. the process would also require her to engage with a list of technologies for digital publishing. beyond the text analysis materials and concepts, the process exposed her to a lot of technologies: command line, markdown, git for version control, github for project management. in the process of writing this document, in fact, she covered most of the same curriculum as our undergraduate dh fellows. 
she’s learning these things as we work together to produce course materials, but, importantly, the technical skills aren’t the focus of the work together. it’s a writing project! rather than presenting the skills as ends in themselves, they were the means by which we were publishing a thing. they were immediately useful. and i think displacing the technology is helpful: it means that the outcomes and parameters for success are not based in the technology itself but, rather, in the thinking about and use of those methods. we also used a particular platform that allowed professor horowitz to engage with these technologies in a light way so that they would not overwhelm our work – i’m happy to discuss more in the time after if you’re interested. this to say: the outcomes of such collaborative educations can be shaped to a variety of different settings and types of students. take another model, cuny’s graduate center digital fellows program, whose students develop open tutorials on digital tools. learning from this example, rather than simply direct students or colleagues towards online tutorials like these, why not have them write their own documents, legible for their own positions, that synthesize and remix the materials that they already have found? the learning process becomes something productive in this framing. i can imagine, for example, directing collaboratively authored materials by students like these towards something like the programming historian. if you’re not familiar, the programming historian offers a variety of lessons on digital humanities methods, and they only require an outline as a pitch to their editorial team, not a whole written publication ready to go. your graduate students could, say, work with the twitter api over the course of a semester, blog about the research outcomes, and then pitch a tutorial to the programming historian on the api as a result of their work. 
it’s much easier to motivate yourselves to write something if you know that the publication has already been accepted. obviously such acceptance is not a given, but working towards a goal like this can offer student researchers something to aim for. their instructors could co-author these materials, even, so that everyone has skin in the game. this model changes the shape of what collaborative education can look like: its duration and its results. you don’t need a whole fellowship year. you could, in a reasonably short amount of time, tinker and play, and produce a substantial blog post, an article pitch, or a library research guide (more on that in a moment). as jeff jarvis has said, “we need to move students up the education chain.” and trust me - the irony of quoting a piece titled “lectures are bullshit” during a lecture to you is not lost on me. but stay with me. collaborative writing projects on dh topics are flexible enough to fit the many contexts for the kind of educational work that we do. after all, not everyone needs or values the same outcomes, and these shared and individual goals need to be worked out in conversation with the students themselves early on. articulating these desires in a frank, written, and collaborative mode early on (in the genre of the project charter) can help the program directors to better shape the work to fit the needs of the students. but i also want to suggest that collaborative writing projects can be useful end products as well as launching pads, as they can fit the shape of many careers. after all, students come to digital humanities for a variety of different reasons. some might be aiming to bolster a research portfolio on the path to a traditional academic career. others might be deeply concerned about the likelihood of attaining such a position and be looking for other career options.
others still might instead be colleagues interested in expanding their research portfolio or skillset but unable to commit to a whole year of work on top of their current obligations. writing projects could speak to all these situations. i see someone in charge of shaping graduate programs as needing to speak to these diverse needs. this person is both a steward of where students currently are – the goals and objectives they might currently have – as well as of where they might go – the potential lives they might (or might not!) lead. after all, graduate school, like undergraduate, is an enormously stressful time of personal and professional exploration. if we think simply about a student’s professional development as a process of finding a job, we overlook the real spaces in which help might be most desired. frequently, those needs are the anxieties, stresses, and pressures of refashioning yourself as a professional. we should not be in the business of creating cv lines or providing lists of qualifications alone. we should focus on creating strong, well-adjusted professionals by developing ethical programs that guide them into the professional world by caring for them as people. in the graduate context, this involves helping students deal with the academic job market in particular. to me in its best form, this means helping students to look at their academic futures and see proliferating possibilities instead of a narrow and uncertain route to a single job, to paraphrase the work of katina rogers. a sprinkler rather than a pipeline, in her metaphor. as rogers’s work, in particular, has shown, recent graduate students increasingly feel that, while they experienced strong expectations that they would continue in the professoriate, they received inadequate preparation for the many different careers they might actually go on to have. the praxis program and the praxis network are good examples of how to position digital humanities education as answers to these issues. 
fellowship opportunities like these must be robust enough that they can offer experiences and outcomes beyond the purely technical, so that a project manager from one fellowship year can graduate with an ma and go into industry in a similar role just as well-prepared as a phd student aiming to be a developer might go on to something entirely different. and the people working these programs must be prepared for the messy labor of helping students to realize that these are satisfactory, laudable professional goals. it should be clear that this sort of personal and professional support is the work of more than just one person. one of the strengths of a digital humanities center embedded in a library like this one at uva is that fellows have the readymade potential to brush up against a variety of career options that are revealed when peeking outside of their disciplinary silos: digital humanities developer and project manager positions, sure, but also metadata specialists, archivists, and more. i think this kind of cross-pollination should be encouraged: library faculty and staff have a lot to offer student fellows and vice versa. developing these relationships brings the fellows further into the kinds of work done in the library and introduces them to careers that, while they might require further study to obtain, could be real options. to my mind the best fellowship programs are those fully aware of their institutional context and those that both leverage and augment the resources around them as they are able. we have been working hard on this at w&l. we are starting to institute a series of workshops led by the undergraduate fellows in consultation with the administrators of the fellowship program. the idea is that past fellows lead workshops for later cohorts on the technology they have learned, some of which we selectively open to the broader library faculty and staff.
the process helps to solidify the student’s training – no better way to learn than to teach – but it also helps to expand the student community by retaining fellows as committed members. it also helps to fill out a student’s portfolio with a cv-ready line of teaching experience. this process also aims to build our own capacity within the library by distributing skills among a wider array of students, faculty, and staff. after all, student fellows and librarians have much they could learn from one another. i see the head of graduate programs as facilitating such collaborations, as connecting the interested student with the engaged faculty/staff/librarian collaborator, inside their institution or beyond. but we must not forget that we are asking students and junior faculty to do risky things by developing these new interests, by spending time and energy on digital projects, let alone presenting and writing on them in professional contexts. the biggest risk is that we ask them to do so without supporting them adequately. all the technical training in the world means little if that work is illegible and irrelevant to your colleagues or committee. in the words of kathleen fitzpatrick, we ask these students to “do the risky thing,” but we must “make sure that someone’s got their back.” i see the head of graduate programs as the key in coordinating, fostering, and providing such care. students and junior faculty need support – for technical implementation, sure – but they also need advocates – people who can vouch for the quality of their work and campaign on their behalf in the face of committees and faculty who might be otherwise unable to see the value of their work. some of this can come from the library, from people able to put this work in the context of guidelines for the evaluation of digital scholarship. but some of this support and advocacy has to come from within their home departments. the question is really how to build up that support from the outside in. 
and that’s a long, slow process that occurs by making meaningful connections and through outreach programs. at w&l, we have worked to develop an incentive grant program, where we incentivize faculty members who might be new to digital humanities or otherwise skeptical to experiment with incorporating a digital project into their course. the result is a slow burn – we get maybe one or two new faculty each term trying something out. that might seem small, but it’s something, particularly at a small liberal arts college. this kind of slow evangelizing is key in helping the work done by digital humanists to be legible to everyone. students and junior faculty need advocates for their work in and out of the library and their home departments, and the person in this position is tasked with overseeing such outreach. so, to return to the opening motif, lists of skillsets certainly have their place as we bring new people into the ever-expanding field: they’re necessary. they reflect a philosophy and a vision, and they’re the basis of growing real initiatives. but it’s the job of the head of graduate programs to make sure that we never lose sight of the people and relationships behind them. foremost, then, i see the head of graduate programs as someone who takes the lists, documents, and curricula that i have discussed and connects them to the people that serve them and that they are meant to speak to. this person is one who builds relationships, who navigates the prepositions of my title. it’s the job of such a person to blast the boundary between “you’re in” and “you’re out” so that the tech-averse or shy student can find a seat at the table. this is someone who makes sure that the work of the fellows is represented across institutions and in their own departments. this person makes sure the fellows are well positioned professionally. this person builds up people and embeds them in networks where they can flourish.
their job is never to forget what it’s like to be the person trying to learn. their job is to hear “i’m not a tech person” and answer “not yet, but you could be! and i know just the people to help. let’s learn together.”

dh poster

in , librarians at bucknell university developed a librarian-led undergraduate digital scholarship research program. we created the digital scholarship summer research fellows (dssrf) program to broaden research opportunities for students and introduce them to new ways of engaging in scholarship. the eight week program provides students with an opportunity to undertake independent research on a topic of their own choosing, and utilize digital humanities tools and methodologies to both answer questions and convey their research findings. here, we examine the lasting impacts of dssrf on the participants. we surveyed past fellows to understand how their participation and the skills they acquired were applicable to their subsequent coursework and career paths, and how the program influenced their thinking about scholarship.

assessing the impact of a digital humanities summer research program
carrie pirmann, bucknell university and courtney paddick, bloomsburg university

[poster sections: reflections, self-assessment, future directions]

"dssrf made me realize that research has no limits. you can conduct research in any field, and add to it through it being in a digital form. i think it's the research of the future."

we asked students to assess their confidence levels, before and after dssrf, with a variety of research and soft skills. these charts represent the areas in which students displayed the greatest amount of growth.
( = not at all confident; = extremely confident)

academic/career impacts

several students reported the program influenced their choice of majors, minors, and/or career paths. some examples:
one student leveraged his newly developed data visualization skills and showcased his project on the job market, and was hired by a sports analytics firm.
one student decided to pursue a graduate degree in library science after learning about archives and special collections.
one student, who is pursuing a career in market research, credited dssrf with both confirming her decision to major in economics, and kick-starting her interest in data visualization.
two undeclared students indicated participation in dssrf helped confirm their choice of major.

"i think that the biggest impact that the program had was about how presentation of scholarship might change and expand to allow for more collaboration, and what this could be used for in different situations."

responses to the survey have proven very helpful as we look forward to future iterations of the program. based on student feedback, we know they found field trips, interactions with peers and members of library and it, and work on their individual projects to be the most impactful aspects of dssrf. students found the weekly blog posts and assigned readings to be the least helpful parts of the program, so these will certainly be areas to revise moving forward. based on the results of the survey, we have also identified the tools students most frequently gravitate towards for their own projects as well as tools they have used after dssrf, and we will use this information to make decisions on the tools and techniques included in the future.

background

international journal of information science and management
open access policy
dr. mohammad reza ghane
regional information center for science and technology (ricest)
assistant prof.
in library and information science
ghane@ricest.ac.ir

abstract

scholarly communication as a social activity needs rethinking, since the process is monopolized by commercial publishers. authors and their institutions, as well as librarians, have been working to achieve unrestricted access to research output. to this end, many researchers from around the world gathered in budapest on february to decide on global access to publications free of legal and price barriers. this campaign led to the declaration known as the budapest open access initiative. this global scientific gathering was the starting point of the open access movement. this paper discusses the new paradigm in scholarly communication from price, legal, and business perspectives. in relation to open access policies, the “green” and “gold” routes as well as the new creative commons licences are considered. finally, i conclude that higher education institutes should provide suitable infrastructure to make researchers’ works accessible to others. at the same time, custodians of higher education have to introduce new policies that mandate their researchers to publish their outputs in institutional and subject-based repositories.

keywords: open access policy, journal price crisis, journal permission crisis, creative commons, institutional repository, subject-based repository, golden route, green route

introduction

scholarly journal publishing started with the establishment of learned societies in the mid th century. the royal society of london and the french academy of sciences were the pioneers of scholarly journal publishing, with philosophical transactions and le journal des savants, respectively. learned societies, along with scholarly journals, increased scientific collaboration between researchers. universities and learned societies controlled journal publishing before the second world war. over time, commercial publishers found that the publishing industry is a profitable business.
at the end of the s, with the setting up of thomson reuters (formerly isi) and university research policies compelling academic staff to publish in journals indexed by isi, the situation changed in favor of commercial publishers, and researchers gradually lost control over their research findings. university and research libraries decided to improve their journal collections to meet the information needs of academicians. on the one hand, higher education policy makers wanted their faculty to publish in the prestigious journals, which were manipulated by a few commercial publishers; on the other, they insisted on “publish or perish”, especially in europe. the result was an inelastic market, in which spiralling journal prices caused only a small fall in demand. consequently, there was no control on journal pricing policy. in this situation libraries could not afford to purchase all the journals their academicians demanded, which hindered researchers, other than subscribers, from accessing their peers’ findings and even their own. suber ( ) called this the permission and price crisis. pricing under monopoly conditions caused journal prices to grow faster than inflation (fig. ).

open access policy ijism, special issue nd international conference on scholarly journals editors-in-chief (islamic countries, - , dec. )

figure . expenditure trends in arl libraries, -

the advent of the internet (rooted in the s) enabled researchers to share their scientific findings with peers. new technologies combined with researchers’ traditional ideal (providing access to findings for all, everywhere, at any time), and this combination brought about changes in scholarly communication (i.e. the open access movement). it is noteworthy that world internet penetration is more than %, with a world population of .
billion people as of july according to the medium fertility estimate by the united nations department of economic and social affairs, population division, and , , , estimated internet users in september (http://www.internetlivestats.com/internet-users/). an overall internet penetration rate of % (http://www.internetworldstats.com/stats.htm) shows the global tendency of users toward this technology, and the global information society is preparing for a new environment. academic staff, as members of the global information society, are confronted with new challenges in the form of price and permission barriers in scholarly communication. research is of global importance. in this regard, universities are responsible for distributing the research findings of their researchers and providing access to knowledge as widely as possible. to this end, information and communication technologies (ict) play a crucial role. higher education institutes should provide suitable infrastructure and implement new policies that enable researchers to make their research outputs open access and to have access to peers’ published works. to achieve this aim, higher education institutes should set out the actions required. in this regard, they may implement processes and procedures including the green and gold routes to open access and, in addition, mandate academicians to deposit their outputs in an institutional or subject-based repository.

literature review

open access as a new model of publishing research outputs has been investigated from different approaches. the most controversial issues are the economic and licensing approaches as well as citation advantages. regarding citation advantages, swan’s study ( ) showed that most studies ( ) demonstrated a citation advantage for open access articles of up to .
citation rates of open access versus non-open access articles were investigated by antelman ( ), who concluded that citation rates are higher for open access articles in mathematics ( %), electrical and electronic engineering ( %), political science ( %) and philosophy ( %) indexed in web of science. the motivation of making publications open access in order to increase citations was studied by bernius & hanauske ( ). they concluded that making articles open access enhances citation rates. business models of open access journals have also attracted research. article processing costs, advertising, sponsorships, internal subsidies, external subsidies, and donations and fundraising are issues studied by crow ( ). willinsky ( ) surveyed other economic aspects, such as author self-archiving, sponsored open access, and delayed open access, and finally suggested “. . . by challenging the need for current levels of economic stratification while seeking to increase the openness with which this public good is cultivated, circulated, and built upon”. several new copyright models have been created for open access journals. hoorn & van der graaf ( ) reported the attitudes of authors in the uk and the netherlands toward different open access copyright models. their findings showed that authors believe the traditional copyright model should change in favor of academics. pappalardo et al. ( ) investigated digital repositories in terms of open access policies and the copyright licensing of materials deposited in them. to this end, they designed a guide setting out principles for establishing “a lawful and effective management model” for repositories. routes to open access are another issue of importance to authors. it is noteworthy that the gold route is of interest to authors in some disciplines and some geographical areas (swan, ). the green route is used in digital repositories for author self-archiving.
some authors prefer the green route, others the gold route, and some advocate both (suber, ). in general, open access is noteworthy for its new model of publishing and attracts advocates in different disciplines. in response, commercial publishers have also changed their business models.

open access definition

the definitions of open access focus on access to the scientific literature without legal, price and technological barriers. suber ( ) declared “open access (oa) literature is digital, online, free of charge, and free of most copyright and licensing restrictions”. the budapest open access initiative (boai, ) defines open access as:

by ‘open access’ to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. the only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

when authors retain all copyright in their work, documents are free of access restrictions for everyone, based on copyright licenses. the two other choices are to share copyright or to transfer it. with regard to copyright transfer, there are two options for traditional journals. first, the author transfers all intellectual and commercial exploitation rights to the publisher. second, the author partially keeps intellectual rights and transfers commercial rights to the publisher (hoorn & van der graaf, ).
sharing copyright on the one hand guarantees the authors’ moral rights and on the other removes permission barriers and allows readers to use and reuse the works, even for commercial purposes. in this model, exemplified by creative commons (cc), “all rights reserved” is replaced by “some rights reserved”. cc has created several types of licenses that define how creative work in open access journals may be used.

creative commons licenses

the creative commons copyright licenses permit readers to copy, distribute, edit, remix, and build upon the content of previous works, but within the framework of copyright law. using cc licenses enables creators to define usage conditions and lets authors change their copyright terms from “all rights reserved” to “some rights reserved” (http://creativecommons.org/about). the interesting point is that cc licenses “are not an alternative to copyright”. the great virtue of cc licenses, from the creators’ perspective, is the flexibility they give in controlling the works: authors can easily modify their copyright terms to meet their needs. the licenses are as follows.

attribution: cc by
attribution-noderivs: cc by-nd
attribution-noncommercial-sharealike: cc by-nc-sa
attribution-sharealike: cc by-sa
attribution-noncommercial: cc by-nc
attribution-noncommercial-noderivs: cc by-nc-nd

dr. mohammad reza ghane ijism, special issue nd international conference on scholarly journals editors-in-chief (islamic countries, - , dec. )

all six licenses have some features in common and operate without infringing copyright. they allow others to copy, redistribute, and share the content legally, and the licenses without the noderivs (nd) element also allow others to adapt and build upon it. users must always acknowledge the author. three cc licenses, cc by, cc by-nd, and cc by-sa, permit commercial use of materials. the three others, cc by-nc-sa, cc by-nc, and cc by-nc-nd, are for non-commercial purposes only. an explanation of some of the licence elements is useful for users.
cc noncommercial (nc) permits others to copy, use, display, and perform the materials for noncommercial purposes only. cc no derivative works (nd) means that users are not allowed to adapt the original work; they can distribute, display, and perform only verbatim copies of it. the last important element is cc share alike (sa), which means users are allowed to give out “the derivative works under the same the licence terms that govern the original work” (http://creativecommons.org.au/learn/licences/). a significant point is that “cc no derivative works” and “cc share alike” cannot be combined in one licence: share alike governs derivative works, which no derivative works forbids.

open access routes

there are two ways to make research findings open access: the gold route and the green route. the gold route recommends that a researcher publish in a fully open access journal. in addition to fully open access journals, there are hybrid open-access journals and delayed open-access journals. historically, hybrid open-access journals date back to (walker cited in björk, ). to make their content freely available electronically, authors are charged by commercial publishers. in this respect, the policy of toll-access (subscription) journals allows some articles to be made open access for a fee known as an article processing charge (apc). delayed open-access journals, another business model among subscription journals, place an embargo of six to twelve months or more on free access to journal articles after publication (laakso & björk, ). the green route is achieved by self-archiving research outputs in an open access institutional or subject-based repository. journal publishers’ policies determine which route an author can choose, whether publishing the manuscript open access or depositing it in a repository. the sherpa/romeo ( ) services based at the university of nottingham inform contributors of publishers’ copyright policies (http://www.sherpa.ac.uk/romeo/).
institutions organize open access institutional repositories (e.g. dash at harvard university) to preserve the digital collections of research outputs created by researchers in higher education institutions (bailey, jr., ). subject-based repositories (e.g. arxiv for physics) collect content in a particular subject. in the early ’s the first repositories in different disciplines were founded (björk, ). in general, digital repositories increase the visibility and readership of research findings, which brings more citations and consequently greater research impact (johnson, ). furthermore, they bring value to scholarly communication from the perspective of both content contributors and users (burns, lana & budd, ). facts and figures on repositories are available in the directory of open access repositories, opendoar (http://www.opendoar.org/), and the registry of open access repositories, roar (http://roar.eprints.org/).

discussion

open access to research outputs, as a new paradigm in scholarly communication, boosts content availability. creators’ willingness to give free access to their findings, helped by new information and communication technologies, has sped up this process. commercial publishers have come to dominate scholarly communication and seized control of journal publishing. on the one hand, their absolute monopoly on the publishing industry has led to a price crisis. on the other hand, the transfer to journals of authors’ exclusive right to publish their research findings is a bargaining chip for commercial publishers, which have gained an overwhelming market advantage by leveraging their network across the publishing industry. the price and permission barriers imposed by commercial publishers on researchers and their university libraries forced them to rethink scholarly communication.
the consequence is the open access movement and the various routes for making findings available to anyone, anytime and anywhere without bypassing copyright law. open access is a new model of scholarly communication that is free of legal and price barriers. we should bear in mind that open access journals are not an alternative to subscription journals; they, too, need a business model. one of the differences between open access journals and toll access journals is that readers do not pay to access the contents of the former. however, both models incur expenses that must be covered to survive in a competitive era. although researchers’ positive attitudes toward unrestricted access to publications are growing (tbi communications, ; odell, dill & palmer, ), they are sensitive about the way others copy, re-use, and build upon their content (van nordoon, ). on the legal side, cc licences help authors determine how much control they keep over their outputs.

conclusion

open access to publications is an issue of importance to the stakeholders of scholarly communication. readers, authors, research funders, learned societies, and librarians are strong advocates of open access. commercial publishers, an important component of this process, found themselves in an unsustainable situation, but they soon changed their copyright terms and conditions in favor of open access. the motives behind open access have accelerated the growth of this new paradigm in scholarly communication. higher education institutes should consider these motives and attempt to codify access to research outputs. waaijers ( ) listed some benefits of open access as “corporate social responsibility”, “citation advantage”, “advancement of science”, “protection against abuse”, “research transparency”, and “cheap distribution”.
research funders, another important component of scholarly communication, should set up the infrastructure to provide access to outputs through different channels such as institutional and subject-based repositories. mandating open access at the national or institutional level is another task for the custodians of higher education. finally, universities need to cooperate and coordinate with other duty holders in scholarly communication to achieve their objectives.

references

antelman, k. ( ). do open-access articles have a greater research impact? college & research libraries, ( ), - .
bailey, jr., c. w. ( ). institutional repositories, tout de suite. digital scholarship. retrieved from http://digital-scholarship.org/ts/irtoutsuite.pdf (accessed september, ).
bernius, s., & hanauske, m. ( ). open access to scientific literature: increasing citations as an incentive for authors to make their publications freely accessible. paper presented at the nd annual hawaii international conference on system sciences, hicss, january , - january , . retrieved from http://origin-www.computer.org/csdl/proceedings/hicss/ / / / - - .pdf (accessed september , ).
björk, b. c. ( ). open access subject repositories: an overview. journal of the american society for information science and technology, ( ), - .
björk, b. c. ( ). the hybrid model for open access publication of scholarly articles – a failed experiment? journal of the american society for information science and technology, ( ), - .
budapest open access initiative – boai ( ). retrieved from http://www.budapestopenaccessinitiative.org/read (accessed june, ).
burns, c. s., lana, a., & budd, j. m. ( ). institutional repositories: exploration of costs and value. d-lib magazine, ( / ). retrieved from http://www.dlib.org/dlib /january /burns/ burns.html (accessed september ).
crow, r. ( ). income models for open access: an overview of current practice. retrieved from http://sparc.arl.org/sites/default/files/incomemodels_v .pdf (accessed october, ).
hoorn, e., & van der graaf, m. ( ). copyright issues in open access research journals: the authors’ perspective. d-lib magazine, ( ). retrieved from http://www.dlib.org /dlib/february /vandergraaf/ vandergraaf.html (accessed september, ).
johnson, r. k. ( ). institutional repositories: partnering with faculty to enhance scholarly communication. d-lib magazine, ( ). retrieved from http://www.dlib.org /dlib/november /johnson/ johnson.html (accessed september ).
laakso, m., & björk, b. c. ( ). delayed open access – an overlooked high-impact category of openly available scientific literature. journal of the american society for information science and technology, ( ), - .
odell, j., dill, e., & palmer, k. ( ). author’s rights to share scholarship: a survey of faculty attitudes and actions. indianapolis: indiana library federation. retrieved from https://scholarworks.iupui.edu/bitstream/handle/ / /facultyoasurvey-ilf c.pdf?sequence= &isallowed=y (accessed september, ).
pappalardo, k. m., fitzgerald, a. m., fitzgerald, b. f., kiel-chisholm, s. d., o'brien, d., & austin, a. ( ). a guide to developing open access through your digital repository. retrieved from http://eprints.qut.edu.au/ / / .pdf (accessed october, ).
sherpa/romeo ( ). retrieved from http://www.sherpa.ac.uk/romeo/ (accessed october ).
suber, p. ( ). open access. mit: the mit press essential knowledge series. retrieved from http://cyber.law.harvard.edu/hoap/open_access_% the_book% , (accessed march, ).
swan, a. ( ). policy guidelines for the development and promotion of open access. unesco: france.
retrieved from https://books.google.com/books?hl=en&lr=&id= ptsr-dixx-ac&oi=fnd&pg=pa &dq=policy+guidelines+for+the+development + and+promotion+of+...&ots= bi wqtxu&sig=eqswwgghfocsecmu uhs - tun #v=onepage&q= policy% guidelines% for% the% development% and% promotion% of% ...&f=false (accessed october, ).
swan, a. ( ). the open access citation advantage: studies and results to date. retrieved from http://eprints.soton.ac.uk/ / /citation_advantage_paper.pdf (accessed september, ).
tbi communications ( ). learned society attitudes towards open access: report on survey results. retrieved from http://www.edp-pen.org/images/stories/doc/edp_ society_survey_may_ _final.pdf (accessed september, ).
van nordoon, r. ( ). researchers opt to limit uses of open-access publications. nature, february. doi: . /nature. .
waaijers, l. ( ). a brief overview of international trends in open access. retrieved from http://www.kb.se/docs/about/projects/openaccess/international_trends_oa _leowaaijers.pdf (accessed september, ).
willinsky, j. ( ). the stratified economics of open access. economic analysis & policy, ( ), - .

prototyping across the disciplines

research

how to cite: el khatib, randa, et al. . “prototyping across the disciplines.” digital studies/le champ numérique ( ): , pp. – . doi: https://doi.org/ . /dscn.
published: january
peer review: this is a peer-reviewed article in digital studies/le champ numérique, a journal published by the open library of humanities.
copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.
open access: digital studies/le champ numérique is a peer-reviewed open access journal.
digital preservation: the open library of humanities and all its journals are digitally preserved in the clockss scholarly archive service.

research

prototyping across the disciplines

randa el khatib, david joseph wrisley, shady elbassuoni, mohamad jaber and julia el zini
university of victoria, ca
new york university abu dhabi, ae
american university of beirut, lb
corresponding author: david joseph wrisley (djw @nyu.edu)

this article pursues the idea that within interdisciplinary teams in which researchers might find themselves participating, there are very different notions of research outcomes, as well as languages in which they are expressed. we explore the notion of the software prototype within the discussion of making and building in digital humanities. the backdrop for our discussion is a collaboration between project team members from computer science and literature that resulted in a tool named topotext that was built to geocode locations within an unstructured text and to perform some basic natural language processing (nlp) tasks about the context of those locations. in the interest of collaborating more effectively with increasingly larger and more multidisciplinary research communities, we move outward from that specific collaboration to explore one of the ways that such research is characterized in the domain of software engineering: the iso/iec : standard. although not a perfect fit with discourses of value in the humanities, it provides a possible starting point for forging shared vocabularies within the research collaboratory. in particular, we focus on a subset of characteristics outlined by the standard and attempt to translate them into terms generative of further discussion in the digital humanities community.
keywords: software prototyping; interdisciplinary collaboration; standards; geocoding; spatial humanities; shared research vocabularies

in various global contexts, researchers are coming together to imagine the environments that can host, sustain, and facilitate new forms of academic research and inquiry. indeed, research infrastructures in the academy have evolved to include new spaces of exchange, data management, and computation. at their most virtual, such spaces for digital humanists include software, middleware, cloud computing, labs, makerspaces and the like. indeed, we have entered an era of what has been called “software intensive humanities (sih),” where digital humanists not only use packaged software bundled in their institutional infrastructures, but they also embark on innovative tool creation as a form of generative, critical practice (smithies ). in this article, we explore the idea of proof of concept software prototyping, stemming from a collaboration between researchers in the humanities and computer science, and we examine the issue of the value of collaboration across the disciplines. we have been attempting to model a process that could very well be popularized in coming years, even embedded within basic computational infrastructures for humanists the way that platforms such as voyant tools have democratized text analytics. that process is creating a map from a text.

humanities and computer science collaborations: towards a product or prototype?

digital humanities are a deeply social endeavor, one in which project results are shaped by the various actors involved, as well as by the mutual value drawn by them from the process. those projects are not always carried out in the context of a shared lab.
cross-disciplinary collaborations also take place virtually, instead of within the confines of a single room, department, university, or even region. collaboration, it can be argued, is a form of a third place, drawing on the theoretical interests and practical expertise of different kinds of disciplinary actors, taking place in and between their traditional working spaces, and importantly, growing out of the development of shared vocabularies for collaboration and an appreciation of the stakes of the research for others in our team (bracken and oughton ). our experience stems from a collaboration initiated informally between a small group of researchers and students in departments of english and computer science, rather than within a formal research collaboratory, and at a moment where digital humanities had limited purchase within the home institution. whereas interdisciplinarity is an easily promoted ideal, building structures across the disciplines for successful, and sustainable, collaboration is more challenging to achieve (bos, zimmerman, olson, et al. ). collaboration is known to be difficult for a number of reasons: focus in some disciplines on individual work over research teams, lack of planning and project management skills, lack of infrastructure to facilitate the collaboration, new forms of accountability or communication required to carry projects to fruition, in addition to basic disciplinary difference (siemens and siemens ). in this article, we turn to another important challenge of collaboration unmentioned above: the ways we characterize the prototype resulting from interdisciplinary collaboration, or in oversimplified terms, the “finished product.” since the work of such software prototyping is iterative, experimental, and without a clear end in sight, the humanists on our team came to appreciate the process as passing through multiple stages of somewhat finished prototypes.
a new version or prototype might improve performance or user experience compared with previous versions while, in turn, eclipsing another part of its previous functionality. we propose to examine in this article how the computational task of text mapping, that is, modelling and operationalizing a relationship between geographic entities and features of language, can be framed within a mutually beneficial language for a collaborative team. we do this by turning to some documentation from beyond the humanities (some might say far beyond the humanities) that, if reframed and generalized, might provide a starting point for forging common goals and vocabulary for interdisciplinary teams. this relies, however, on unpacking, and refining, the notion of the prototype for the specific case of software development within digital humanities.

software prototypes: materializing contemplative knowledge

the word prototype appears in renaissance english from a latinized greek word meaning a “first form,” or a “primitive pattern.” a software prototype, according to a dictionary of computer science, can be defined as a “preliminary version of a software system in order to allow certain aspects of that system to be investigated … additionally (or alternatively) a prototype can be used to investigate particular problem areas or certain implications of alternative design or implementation decisions” (n.p.). prototyping after the digital turn can also be seen as an assemblage of various modes of intellectual work: “theoria (or contemplation), poiesis (or making), and praxis (or practice/action)” (saklofske , n.p.). according to saklofske, contemplation and action are related through the process of making, which can be seen as a materialization of contemplative knowledge both in, and through, engaged activity.
research on building in digital humanities has framed prototyping as an intertwined process of making and thinking, embodied together in the prototype “product;” examples of functional software prototypes, it has been argued, are a contribution to knowledge in themselves in as much as they suggest innovative methodologies (galey and ruecker ; ruecker and rockwell ). we assert that a prototype is best understood in a similar light, as intertwined thinking and making, a process of modeling embodied in step-wise software versions (el khatib forthcoming). in this sense, in saklofske’s terms, the prototype, which embodies the process and product, serves both as an argument and theory. in our case, what kinds of thinking across the disciplines led to our prototype? data creation is central to many of our research projects in digital humanities, and it is common knowledge that it can be very time-consuming and expensive. one common research task at the intersection of textual and spatial analysis consists in extracting geographical information from unstructured text and visualizing such data on map interfaces. it is a rather time-consuming task, which has led researchers to want to automate it. another system that models the text mapping process is the “edinburgh geoparser” (edinburgh language technology group https://www.ltg.ed.ac.uk/software/geoparser). practitioners in the spatial humanities have recourse to a growing body of code and critical literature, in addition to infrastructure in the form of gazetteers: digital lists of places against which entities extracted from texts can be matched. convinced that the immediate linguistic context of geographic entities found in texts is illustrative of the ways that place is constructed by literature, the authors of this paper set out to operationalize this hypothesis by prototyping software named topotext to carry out the task.
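the text mapping task described above (find place names in unstructured text, match them against a gazetteer, and keep their immediate linguistic context) can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not topotext's implementation: the three-entry gazetteer, its approximate coordinates, the function names `map_text` and `context_counts`, and the fixed word window are all hypothetical, and a plain dictionary lookup stands in for the named entity recognition that a real geoparser would use.

```python
import re
from collections import Counter

# Hypothetical toy gazetteer: place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "beirut": (33.89, 35.50),
    "edinburgh": (55.95, -3.19),
    "victoria": (48.43, -123.37),
}


def map_text(text, window=3):
    """Return one record per gazetteer match: place, coordinates, nearby words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    records = []
    for i, token in enumerate(tokens):
        if token in GAZETTEER:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            records.append({
                "place": token,
                "coords": GAZETTEER[token],
                "context": left + right,
            })
    return records


def context_counts(records):
    """Count the words that occur near each matched place."""
    counts = {}
    for record in records:
        counts.setdefault(record["place"], Counter()).update(record["context"])
    return counts


sample = "The ship left Beirut at dawn. Rain fell on Edinburgh all week."
records = map_text(sample)
for record in records:
    print(record["place"], record["coords"], record["context"])
```

each record could then be plotted at its coordinates on a map interface, while the context counts give a first, rough view of the words that surround each place.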
creating a software prototype involves different skill sets in code, interface design, implementation, and testing; in short, it is a social process. the various iterations of topotext, from basic conceptualization to implementation, involved different groups of students and faculty, and this meant that we confronted many disciplinary assumptions that went unmentioned. furthermore, software prototypes, particularly of the kind one finds in digital humanities, are acts of open scholarship. they are placed in code sharing and versioning environments for others to use, adapt, and refine. of course, working within a digital humanities lab or on funded projects, one solution to software or coding needs is to contract developers to carry out discrete tasks. if digital humanities enter the research collaboratory, however, where they are face to face with others in computer science (or other disciplines), different dynamics come into play, in which divergent notions of both process and product emerge. our experience made us very aware of the fact that within the single academic unit of computer science, we also find different forms of reflection and action that map onto the abovementioned axis of theoria, poiesis, and praxis. in other words, “there are many different computings” (mccarty , ). mccarty calls the domain of software “a locus of confounding” precisely because, he argues, “the more theoretical side of computer science meets the world through systems engineered to serve and interact with it” (mccarty , ). our specific experience has made us see the urgency of thinking through the ways that research is validated from the perspective of different disciplines, as well as within the same disciplinary groupings of the academic unit.
at stake here is the common language we might use to describe software prototyping within research teams, and the ways we can take our results back to our different disciplinary homes.

tensions of reproducibility

instead of relying on a service-based, developer-for-hire model of computing, what we call for here is an active discovery of how our disciplinary values and expectations as humanists converge or diverge with those working in different aspects of computing. within the context of prototyping, we describe a hybrid mode of interdisciplinary academic collaboration set beyond the confines of a physical space (such as a laboratory) or grant-funded project; this collaboration was carried out in its beginning on the same campus, and then subsequently via virtual communication between humanists and computer scientists working from different sites. the project was experimental, not only regarding the concepts employed, but also in the structure of the collaboration. to our knowledge, no such project had been attempted before between the two academic departments at our institution. in this light, looking back on several years of working together on the topotext team, we wonder how this type of collaboration was able to pursue its experimental nature while there was no formal support, and likewise, what incentives kept team members pursuing the research project despite the lack of structure. although they were not clearly articulated in this initial stage of collaboration, there were reasons that the prototyping process appealed to all members of the team. if we are to generalize from the experience, what might be some ways of constructing the frame of collaboration in mutually beneficial ways? how do we balance experimentation and rigor in software prototyping that can bring us closer to “next generation tools” (siemens, , n.p.)?
how can humanists understand what colleagues in computing think is a valuable result in a research project? some common guidelines would be useful for aligning future collaboration. multidisciplinary collaboration models take into account disciplinary characteristics and differences. major considerations in disciplinary difference include defining research problems and choice of critical vocabulary, designing methodology, asserting authorship, choosing publication venues, assigning rewards and recognition, as well as inter-researcher communication (siemens, liu, and smith , ). two models for collaboration in an academic setting are faculty-oriented research projects where lead faculty members make decisions on behalf of the entire team and lead the intellectual direction of the project, and collaboration that approaches team members as equals, where all members intellectually contribute to the project. the multidisciplinary humanities-computer science team discussed in this paper fits better into the latter, where the students continue to be as invested in its intellectual direction as faculty members. additionally, it is further from the service-based approach that is more commonly associated with faculty-oriented research projects; here, team members are invested in creating shared research foci, vocabularies, and methodologies. as would be expected in a new collaboration, when we began building topotext, the computer science team came to the project with another set of implicit values. although we worked quite closely together, it is not fair to say that at the beginning of the collaboration, we were completely aware of the other side’s workflows or standards of success.
one of the members of the team from computer science pointed to the systems and software quality standard of the international organization for standardization (iso) and the international electrotechnical commission (iec), known as iso/iec : , a part of the systems and software quality requirements and evaluation (square) series, as a standard that his field follows when developing software and that informs the pedagogy of software engineering. at first, such a profession-specific document seemed to alienate us, and yet it contained some interesting wisdom; had we not made a conscious effort to “exploit the benefits of diversity” of our project team, we would perhaps have missed this way that his research community articulates project goals (siemens, cunningham, duff, and warwick , n.p.). negotiating between the more open-ended, experimental nature of prototyping and the iso/iec : indeed involved balancing novelty and conceptual innovation with functional suitability and accuracy. as a standard, it generally challenges the theoretical boundaries of the prototype, while maintaining a level of robustness of the methodologies of the software product. one of the ideas the humanities members of the team had, as we continued to theorize the emerging prototype, was how easy it is for our collaborator’s labor and professional expertise to disappear behind the accuracy of the code. in other words, as the software prototype began to embody the qualities of a purposeful, running tool, it was easy to neglect the design decisions and testing that brought the tool into functional existence. it is important to note that another contributor from the computer science team asserted that he does not rigidly follow such standards as the iso since the research projects he directs do not have the specific goal of an end-product software system.
the divide, if there can be said actually to be one, in our humanities-computer science collaboration was not so much across departmental lines, but rather between two competing goals, one of experimental modeling without any particular futurity for the prototype in mind, and another that sought to make strategic choices in step-wise implementation planning for scalability and sustainability. we might formulate this insight as a question: for whom does a prototype have an afterlife? the software engineering part of the team described this method of developing code into software as a “spiral,” incorporating feedback between developmental phases that allow for modification and improvement. that is to say, new phases are begun before predecessor phases are complete. it would seem, therefore, that unbeknownst to us, the tension in the dual meaning of the notion of the “prototype,” signifying both the singular, abstract “original” form and the mold or pattern from which subsequent copies can be developed, was playing itself out in the daily business of our collaboration. the analogy here could be extended to the tension between a “pure” thought experiment, and an experiment with the notion of reproducibility built in to its execution. reproducibility is key to non-proprietary, open software development these days, as well as to standards of reliability and transparency in certain circles in the digital humanities community, as the use of environments such as github, r markdown source documents or jupyter notebooks would seem to attest (kluyver, ragan-kelly, perez, et al. ). the principle of reproducibility also serves historiographic ends, as a way “of thinking-through the history and possibilities of computer-assisted text analysis” (rockwell and sinclair ).
on our topotext team, there were multiple perspectives on what needed to be done to bring about the software prototype as a public, shareable object: one that conceived the prototype as a kind of essai, and another that conceptualized the prototype as a structure that could be expanded on later. whereas in digital humanities we might speak of operationalizing a concept, that is, translating a theoretical concept into a finite, computable experiment, others aim to move beyond experimental thinking with a computer to build a tool guided by best practices of software development so that it can be shared, distributed and further elaborated. again, we are reminded of the “rich plurality of concerns” included within computing (edwards, jackson, chalmers, et al. ). let us push forward to explore, then, the differences and commonalities between such social experiments in digital humanities software prototyping and the abovementioned iso/iec : standard.

forging vocabularies of collaboration: from standards to guidelines

we are not suggesting that all digital humanities projects or collaboratories should aim for standardized implementation. far from it. instead, it is worth examining to what extent the details of the iso/iec : standard for software development might be translated into guidelines for software prototyping. could delving into these standards be helpful for defining a mutually comprehensible language for some of the guidelines of the research collaboratory? are there aspects of them that all sides of a collaboratory can share? are there other aspects that are simply too commercially oriented to take root in the open source ethos espoused by digital humanities? are there ways that the iso/iec : standard might be used to draft some more general guidelines, or even rethought to be useful to an informal, interdisciplinary encounter such as ours?
could such guidelines be scaled from the informal collaboration to the more formalized research collaboratory? we believe that the documentation contains some material for mutual understanding and deserves closer analysis. central to multidisciplinary collaboration is developing understanding between disciplines, which can be forged through an understanding of field-specific language and the contexts in which it is being used. l. j. bracken and e. a. oughton ( ) identify three main aspects of language that are involved in attaining such an understanding: dialectics, metaphor, and articulation. dialectics refers to the difference between the everyday use of a word and its expert use, as well as the different meanings that are assigned to the same words by different disciplines. metaphor, or ‘heuristic metaphor,’ refers to expressions that push a conceptual understanding by systematically extending an analogy (klamer and leonard ). a metaphor assumes that those involved in the conversation share the same context before making a conceptual correlation. the final aspect is articulation, which refers to the process of deconstructing one’s disciplinary knowledge in conjunction with the disciplines of collaborators in an attempt to identify the building blocks and employ them towards developing a common understanding. according to thierry ramadier ( ), “articulation is what enables us to seek coherence within paradoxes, and not unity” ( ). this idea of seeking coherence within paradoxes rather than attempting to reconcile disciplinary differences is what drives us to engage with the language of iso/iec : . all three aforementioned aspects of language play a crucial role in developing a disciplinary understanding as articulated through these standards.
dialectics is crucial since some of the words used in the standards are familiar either in everyday life (such as “trust” or “comfort”) or a humanities context (such as “effectiveness”), but are actually used in a more specialized context in software development. we employ a metaphor in our explanation of the “modularity” characteristic below by drawing an analogy to a wrapper in order to explain how the characteristic functions. our approach to the language of iso/iec : focuses on developing an understanding between its characteristics and humanities concepts rather than attempting to reconcile the two. the above standards related to quality control in software development, originally published in and reviewed and confirmed in , remain in force at the moment of the writing of this article. although software developers aspire to them, and they are taught as guidelines in computer science programs, as with all other standards, more work needs to be done to assess to what extent they are effective or even observed. at first take, adapting guidelines for product-driven industry into the humanities may seem counterintuitive, or even meet with resistance; however, let us not forget that adopting (and reinterpreting) the languages of the different strands of computing runs deep in digital humanities. practitioners have been both adopting and adapting standards since some of its earliest days. take, for example, the text encoding initiative, which initially adopted standard generalized markup language (sgml) as an expression of its metadata, later succeeded by extensible markup language (xml), which is still being used today. by adopting robust guidelines for markup of structural and conceptual features of humanities data, the tei community laid one of the foundations of digital scholarship today. whereas the tei community has created “guidelines” out of what are xml standards, would it not be possible to do the same with software prototyping out of the iso/iec : document? as we mentioned above, our first foray into collaborative software prototyping, topotext . , was made in an undergraduate software engineering course offered at the american university of beirut (lebanon). it might be fair to say that the initial impulse for the collaboration (automating a process of mapping toponyms found in texts and conducting some basic textual analysis around them) was a case study through which undergraduates were exposed to industry-defined standards for high-quality software. this does not mean that such standards were scrupulously followed in the ensuing code, nor that topotext went through the full series of tests for professional software development, but rather that they were the ideal to which the computing team referred in building the software core. after being exposed to this process model, the humanists on the project team were stirred to explore how such standards were not only product-centered aims, but also how they enriched the conceptualization of code-based work. this led us to ask the question: are standards a conceptual apparatus sitting at the human interface of digital humanists and developers without ever being acknowledged as such? in the “waterfall model” of software development followed by our colleague in computing, an initial phase deals with the translation of concepts into processes and the articulation of specifications. this phase is followed by one focused on design, consisting of a modular decomposition of the steps of the core process. in these first two phases, the humanists worked closely with the computer scientists to articulate a common vision of the conceptual model. in the implementation phase that followed, this was less the case.
in the ensuing testing and validation phases, the humanists stepped back in to confirm to what extent the desired processes were successfully implemented. these phases reinforce the social element of software development, in which “tests of strength” of the project’s functionality and usability are carried out. indeed, the testing phase attempts to compare the “symbolic level of the literate programmer with the machinic requirement of compilation and execution of the software” (berry , ). one of the basic tensions inherent in our process-oriented collaborative model was the language used to describe the resultant system. the digital humanists on the team called the first version of the validated system a prototype, by which we meant an initial step that exposed some of the shortcomings of the text mapping process, whereas the software engineering approach characterized the system as a product. for some on the research team, the partial operationalization of a concept was at stake, a full implementation of which may never be possible, whereas from the software development angle it consisted of an entire “life cycle” from the taking of specifications to the delivery of a workable system. we can conclude from this difference in perspective that the cycles of labor in prototyping, or perhaps just in research where software development is especially involved, from planning and implementation to validation, are conceived of very differently across disciplines. in retrospect, working together necessitated an understanding of our mutual notions of such phases of labor in research. in the iso/iec : standards documentation, the reader is struck by the language of engineering, functionalism and quality control, a far cry from what most humanists, even digital humanists, deal with every day.
the document in question provides guidelines of what it calls characteristics and sub-characteristics for quality software development. the relevant sections of the document are contained in section , terms and definitions. section . outlines “quality in use” characteristics and sub-characteristics, that is to say, traits of a piece of software that deal with the user experience. section . outlines “product quality” characteristics and sub-characteristics, in other words, elements related to how the objectives set out in the design process phase are met by the software prototype. both of these domains, the role and experience of the non-expert user, and the optimal performance of the tool, were issues of perpetual conversation and debate in our collaboration. the principles set out in the iso/iec : document are not all applicable to the specific case of software development engaged in by the authors of this paper; for example, the principle of freedom from risk touches on forms of risk management that do not come into play with our text mapping tool. likewise, the principles of physical comfort and security do not seem immediately relevant, since topotext creates no particular physical stress and works with open source gazetteers and plain text source material. the risk of compromising the confidentiality or integrity of its users is very low to nil. the same might not be true of other software prototyping endeavors with geo-locating devices or wearable computing that collect data about users or create other physical stress. these notions notwithstanding, the characteristics and sub-characteristics of sections . and . , both user-centered and function-centered, contain quite a few pertinent concepts worthy of both our attention and contextualization within current conversations in digital humanities. it is with them that we believe bridges of dialogue could be built.
space does not allow us to cover every single one of the themes evoked by the iso/iec : . here, we will limit ourselves to a brief discussion of a few of them that are most relevant to our experience within the framework of designing topotext. by linking various functionalities together and automating a process, some of the more rigid standards were satisfactorily met in the software prototype; without them, the prototype would not exhibit (in terms of the iso/iec : ) functional completeness, that is, the extent to which the software functions matched the outlined objectives. for example, the first version of topotext aimed to map locations from texts using the google map api, and also to carry out what bubenhofer has called “geo-collocation” [geokollokationen], making spatial association of features with natural language (bubenhofer , – ). this approach encountered problems with erroneous spatial data. in the case of nineteenth-century novels about london, the errors were most often mismatches with places elsewhere in the anglophone world that are named after the geographies of london. although this version met standards sufficient to carry out its functions, it left little space for effectiveness/accuracy. we did not know that we would discover something about the qualities of the data we were using (historical literary texts and a contemporary gazetteer) as well as about the processes we were attempting to model. there is a growing literature on “failure” in digital humanities and the possibility of a “failure that works,” leaving open the possibility of “uncovering and correcting your mistakes to be an essential part of the creative process, rather than something reserved for hindsight” (mlynaryk ). it is not immediately clear if the software development world would adopt the “working failure” as part of its standard, but the notion does seem to be found lurking within section .
of the iso/iec : about product quality, inasmuch as functional completeness of a software prototype may be satisfied, but functional accuracy or appropriateness may not.

affinities between software prototyping and digital humanities

building on the first instance of collaboration, as well as on a functioning skeleton of the first prototype, in the second version of topotext, we sought to integrate human judgment into the geocoding process. we changed the reference gazetteer to geonames and implemented a basic interface by which a list of potential matching places was produced, a function similar to the edinburgh geoparser’s capacity to disambiguate with respect to a gazetteer. topotext adds the function of allowing the user to rank, in an act of close reading, the best match. we also added what the layperson might call functionalities to topotext, aspects of which are also defined by the iso/iec : , a selection of the characteristics that the first iteration failed to achieve. these include modularity, reusability and maintainability; compatibility, interoperability, and coexistence; functional correctness, and, from the “quality in use” section, trust. modularity insists that the implementation of the prototype should be well documented and should be based on wrappers to ensure the feasibility of future enhancement, such as replacing used technologies or integrating different libraries. moreover, the model should be separated from the view (i.e., from ways of displaying the model) in order to support different types of interfaces for data consumers. essentially, this quality has to do with separating the content from the form. being an open data generator, topotext generates a comma-separated values (csv) file of the geographic entities included in each text matched with lat-long coordinates, which can then be exported, allowing reusability in other environments.
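the separation of model from view described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not topotext's actual code: the `PlaceMatch` record, the `to_csv` and `to_summary` functions, the column names and the sample coordinates are all hypothetical and chosen only for the example.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceMatch:
    """Model: one toponym matched to coordinates (hypothetical record shape)."""
    toponym: str
    latitude: float
    longitude: float


def to_csv(matches):
    """View 1: serialize the model as CSV text for reuse in other tools."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["toponym", "latitude", "longitude"])  # assumed column names
    for match in matches:
        writer.writerow([match.toponym, match.latitude, match.longitude])
    return buffer.getvalue()


def to_summary(matches):
    """View 2: a plain-text listing built from the same, unchanged model."""
    return [f"{m.toponym} ({m.latitude}, {m.longitude})" for m in matches]


# Illustrative sample data only; not output of any real geocoding run.
matches = [
    PlaceMatch("london", 51.5074, -0.1278),
    PlaceMatch("greenwich", 51.4826, -0.0077),
]
csv_text = to_csv(matches)
print(csv_text)
print(to_summary(matches))
```

because both views read the same list of records, a further output format could be added without touching the matching logic, which is one sense of the modularity discussed here.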
topotext also generates the maps and word clouds of most frequent words in collocation with particular places, although in a separate browser window. taken from this angle, the development process has aimed to keep the prototype separate from its form. although this principle has not translated to a seamless, non-expert user experience with the tool, the notion of open data generation overlaps with the standard of modularity. the software engineering focus on the tool adopted an “agile scrum” approach with respect to the various functions; new functions can be added (that is, specific theoretical interventions can be operationalized) provided they are modular and consistent with the overall process framework. more work can be done in the case of topotext with documenting its own interwoven process of design and theory to ensure replicability. after all, a prototype must exhibit maintainability, that is, it should always contain the seed of its own improvement. reusability is one of the key motivating factors for version . of topotext. the question of reusability finds its most immediate expression in the tool’s function allowing for data import and export of the tool’s geo-coded data in plain csv format; in other words, all data generated by topotext is reusable in other gis-based platforms. this sub-characteristic closely relates to compatibility, which houses the two subcategories of interoperability and coexistence.
both versions of topotext were created through a deep remix of existing tools and libraries that are interoperable; in the theoria stages of the second iteration, the outside data source that topotext draws upon was revisited and replaced in order to situate it further within the realm of open data; we switched from google maps engine and map api to leaflet (an open source javascript library for interactive maps) and geonames (an open gazetteer published with a liberal creative commons license). as we mentioned before, sometimes a new version of a prototype might improve specific functions at the risk of outperforming previous functions. in fact, future versions of topotext need to upgrade the visualization of its textual analysis to match the improved level of the geocoding. in sum, interoperability was taken into account from the beginning and in a way that would allow us to shuffle the coexisting tools as the prototyping process continued. nonetheless, a working prototype exhibiting modularity exists. it remains a work in progress with its different parts changing incrementally. functional correctness refers to the degree to which the prototype “provides the correct results with the needed degree of precision” (see iso/iec : , section . . . . system and software quality models. n.d. https://www.iso.org/standard/ .html). as we mentioned above, the move away from automatic geocoding toward a semi-automatic, human-in-the-loop process of disambiguation of data not only allowed for more accurate matching of place name with spatial coordinates, but it enacted one of the more interesting human-centered aspects of the iso/iec : , namely trust.
reincorporating human close reading, that is, human judgement about location, obviously slows down the data creation process, but it also serves as a way to peek into the “black box”; this semi-automated approach is meant to mediate between the advantages of automatic parsing, namely speed and scope, and the painstakingly slow process of manual geocoding.

conclusion

much is made of the interdisciplinary encounters in the digital humanities lab, in particular, the inclusion of other academic voices from outside the humanities, and yet much more needs to be theorized about the languages of collaborative work, especially if we imagine reaching far beyond the humanities for potential collaborators. such collaborative work necessarily means venturing into disciplinary conventions and idioms that appear foreign and even alienating. navigating such radically different discourses is tantamount to analyzing, and even deconstructing, the “boundary-work” of disciplinary construction (klein , – ). we might call it a form of digital humanities translanguaging, moving beyond established academic language systems, in order to draw upon complex semiotic resources for enacting our transdisciplinary research. the examples of the international organization for standardization (iso) document have provided us with some starting points for a dialogue with other disciplines, in what might just be an opportunity to infuse lessons learned from critical digital humanities into a software development model. experimental prototypes such as topotext implement any number of important design decisions that are based upon theoretical positions, for example, about the value of the human in semi-automated computational processes.
although notebooks of the kind rockwell and sinclair suggest have not been built for topotext, building them is perhaps a valuable next step, as they document for others how theoretical positions become instantiated in code and then developed towards software. for a third iteration, we plan to continue thinking through the terminology of the standards explored in this article, and about how to continue prototyping across disciplines in a meaningful way, seeking points of interest or overlap between what might appear to be divergent research goals. one of the foci will be on the usability and operability of the prototype, characteristics which refer to the attributes that make software easy to use and control, in particular for non-expert users. this effort will focus on existing functionalities but will also address interface design aesthetics, which, incidentally, are also covered in the standards. software is meant for something more than an end in itself. software developers work on innovating their methods in order to fine tune practical applications. instead of viewing the standards as a technical and limiting framework, or as a strictly industry-based, product-driven set of rules alien to the type of work carried out in digital humanities, let us continue to think of ways that the standards might be drawn upon as resources to shape critically informed guidelines that will enable next-generation software. the standards can, and should, be approached critically, conceived of as a core part of the prototyping process that allows for future flexibility, given changing project goals or project team members, rather than serving as an ideal to which all products conform.

competing interests

the authors have no competing interests to declare.

references

berry, david. . the philosophy of software: code and mediation in the digital age. houndmills: palgrave macmillan limited. doi: https://doi.org/ . /
bos, nathan, ann zimmerman, judith olson, jude yew, jason yerkie, erik dahl, and gary olson.
. “from shared databases to communities of practice: a taxonomy of collaboratories.” journal of computer-mediated communication : – . doi: https://doi.org/ . /j. - . . .x
bracken, l. j., and e. a. oughton. . “‘what do you mean?’ the importance of language in developing interdisciplinary research.” transactions of the institute of british geographers new series ( ): – . doi: https://doi.org/ . /j. - . . .x
bubenhofer, noah. . “geokollokationen – diskurse zu orten: visuelle korpusanalyse.” mitteilungen des deutschen germanistenverbandes ( ): – . doi: https://doi.org/ . /mdge. . . .
edinburgh language technology group. . “edinburgh geoparser.” accessed july , . https://www.ltg.ed.ac.uk/software/geoparser.
edwards, paul n., steven j. jackson, melissa k. chalmers, geoffrey c. bowker, christine l. borgman, david ribes, matt burton, and scout calvert. . knowledge infrastructures: intellectual frameworks and research challenges. ann arbor: deep blue. http://hdl.handle.net/ . / .
el khatib, randa. forthcoming. “collocating places and words with topotext.” in social knowledge creation in the humanities , edited by alyssa arbuckle, aaron mauro, and daniel powell. new technologies in medieval and renaissance studies series. toronto: iter press.
galey, alan, and stan ruecker. . “how a prototype argues.” literary and linguistic computing ( ): – . doi: https://doi.org/ . /llc/fqq
international organization for standardization. . “iso/iec : .” accessed nov , . https://www.iso.org/standard/ .html.
klamer, arjo, and thomas c. leonard. . “so what’s an economic metaphor?” in natural images in economic thought: markets read in tooth and claw, edited by philip mirowski, – . cambridge: cambridge university press. doi: https://doi.org/ . /cbo .
klein, julie thompson. .
“the rhetoric of interdisciplinarity: boundary work in the construction of new knowledge.” in the sage handbook of rhetorical studies, edited by andrea a. lunsford, kirt h. wilson, and rosa a. eberly, – . thousand oaks: sage publications.
kluyver, thomas, benjamin ragan-kelley, fernando perez, brian granger, matthias bussonnier, jonathan frederic, kyle kelley, jessica hamrick, jason grout, and sylvain corlay. . “jupyter notebooks-a publishing format for reproducible computational workflows.” in positioning and power in academic publishing: players, agents and agendas, edited by fernando loizides and birgit schmidt, – . amsterdam: ios press.
mccarty, willard. . humanities computing. houndmills, hampshire: palgrave macmillan. doi: https://doi.org/ . /
mlynaryk, jenna. . “working failures in traditional and digital humanities.” hastac (blog). accessed nov , . https://www.hastac.org/blogs/jennamly/ / / /working-failures-traditional-and-digital-humanities.
ramadier, thierry. . “transdisciplinarity and its challenges: the case of urban studies.” futures ( ): – . doi: https://doi.org/ . /j.futures. . .
rockwell, geoffrey, and stéfan sinclair. . “thinking-through the history of computer-assisted text analysis.” in doing digital humanities: practice, training, research, edited by constance crompton, richard j. lane, and ray siemens, – . london: routledge.
saklofske, jon. . “digital theoria, poiesis, and praxis: activating humanities research and communication through open social scholarship platform design.” scholarly and research communication ( ): , .
siemens, lynne, and ray siemens. . “notes from the collaboratory: an informal study of an academic dh lab in transition.” paper presented at digital humanities conference. hamburg, germany, july . published on implementing new knowledge environments blog.
siemens, lynne, richard cunningham, wendy duff, and claire warwick. . “‘more minds are brought to bear on a problem’: methods of interaction and collaboration within digital humanities research teams.” digital studies/le champ numérique ( ). doi: https://doi.org/ . /dscn.
siemens, lynne, yin liu, and jefferson smith. . “mapping disciplinary differences and equity of academic control to create a space for collaboration.” canadian journal of higher education ( ): – .
siemens, ray. . “communities of practice, the methodological commons, and digital self-determination in the humanities.” digital studies/le champ numérique. doi: https://doi.org/ . /dscn.
smithies, james. . the digital humanities and the digital modern. london: palgrave macmillan. doi: https://doi.org/ . / - - - -

how to cite this article: el khatib, randa, david joseph wrisley, shady elbassuoni, mohamad jaber and julia el zini. . “prototyping across the disciplines.” digital studies/le champ numérique ( ): , pp. – . doi: https://doi.org/ . /dscn.

submitted: july accepted: june published: january

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

digital studies/le champ numérique is a peer-reviewed open access journal published by open library of humanities.

june ,

human resources manager
university of miami libraries
p.o. box
coral gables, fl -

re: digital humanities librarian staff position

to whom it may concern:

i am writing to apply for the position of digital humanities librarian at richter library. i began working with digital humanities while finishing my ph.d. in english. in the process, i became involved in building dh activity and infrastructure across multiple departments on campus. that focus on infrastructure and community building led to my current library-based position supporting digital scholarship at mcmaster university. i am seeking a permanent library-based position because i am energized by the work of helping people discover local and non-local resources and collaborators and develop effective plans for short and long-term projects; and i believe that the library is the best fit for my skillset and expertise. i completed my ph.d. in english at the university of washington in . while writing my dissertation, i founded the demystifying digital humanities workshop series (dmdh, http://www.dmdh.org) with my colleague sarah kremen-hicks in .
dmdh provides participants with an introduction to major trends and practices, working with programming languages, and project ideation and development. our participants have included undergraduates and graduate students, faculty, and staff from twenty-one departments and degree programs. i worked closely with staff from the uw simpson center for the humanities, uw libraries, and uw information technology in order to ensure that the workshops promoted networked growth of dh activity across campus. starting and running dmdh provided me with invaluable experience in administration and marketing. perhaps more importantly, facilitating dmdh provided me with invaluable knowledge about what dh means in a wide range of academic disciplines, what sorts of research questions students and faculty were framing, and the challenges they encountered. in my current role in the sherman centre for digital scholarship in mcmaster’s mills library, i developed and taught mcmaster's first undergraduate dh course, collaborating with mcmaster's maps & data department and special collections so that the course subject matter would highlight the library's extensive collections of wwi maps and documents. my other duties at the sherman centre have included consulting with faculty, graduate, and undergraduate students to find proper tools, think through issues of sustainability and adapt to new genres and mediums for scholarly communication. students and faculty are often unaware of the range of services that places like the library and sherman centre offer to support digital humanities learning and research. increasing the sherman centre's visibility (and thus clarifying the role that libraries play in supporting digital scholarship) has been one of my main goals as a postdoctoral fellow. my efforts have resulted in several new collaborations with faculty and students in the departments of english, history, french, languages & linguistics, and communications. 
my own entry into the digital humanities was in , when i came up with the idea for visible prices (vp, http://www.visibleprices.org), a tool to help readers understand the significance and purchasing power of prices that appear in literary texts. working to take vp from an idea to a reality has been an excellent education in several fundamental areas of digital scholarship. i have learned how to find potential platforms and experiment with them in order to effectively assess their suitability and sustainability for my particular objectives. i have also learned how to clearly articulate the objectives of vp for both general and specialist audiences, and to explain how vp integrates with traditional research questions. in the process of building visible prices and training as a digital humanist, i have dealt with a range of obstacles. figuring out how to surmount these obstacles has been one of the most significant aspects of my dh education. my own experiences in problem-solving mean that i am rarely surprised by the questions that students and faculty ask. even when questions involve tools where i have little or no expertise, my experience with vp combined with my work with participants in the demystifying workshops have given me broader knowledge of the digital humanities that allows me to think through questions in terms of the data involved, and identify resources and search keywords that will lead to solutions. a robust digital humanities community supports multiple levels of engagement with digital scholarship: some people build tools and projects, while others produce more traditional scholarship that engages with digital sources and projects. i envision working in a role that helps members of the university of miami community attain sufficient informational and technological literacy to pursue their individual goals. such a role would cultivate networked support and participation from a range of academic programs and departments. 
this will help to ensure the sustainability of university of miami digital humanities resources. equally importantly, it will help academic programs develop a better understanding of what digital scholarship involves, and support student research featuring digital components. i am tremendously excited about the possibility of helping to develop digital humanities at the university of miami from within richter library, and look forward to discussing the digital humanities librarian position further.

best,
paige morgan

compartir lo que nos une
digitizing and curating colonial records from the caribbean and central and south america for public outreach
# - panel virtual / virtual panel
dh v conference humanities commons, july - ,

• antonio rojas castro, tobias kraft, kathrin kraller, berlin-brandenburg academy of sciences and humanities, germany
• hadassah st. hubert, florida international university, usa
• maría josé afanador-llach, universidad de los andes, colombia
• grisel terrón, eritk guerra, alaina solernou, oficina del historiador de la ciudad de la habana, la habana, cuba
• amalia s. levi, the heritedge connection, barbados

protecting haitian patrimony initiatives at digital library of the caribbean (dloc)
presenter: hadassah st. hubert, ph.d., clir postdoctoral fellow in data curation for latin american and caribbean studies
email: hsthuber@fiu.edu

digital library of the caribbean (dloc)
created in , dloc is a platform for caribbean research materials which are free and open access. administered by florida international university (fiu) in partnership with the university of the virgin islands (uvi) and the university of florida (uf), dloc's technical infrastructure is provided by the university of florida (uf).
dloc's diverse partners serve an international community of scholars, students, and peoples by working together to preserve and to provide enhanced electronic access to cultural, historical, legal, governmental, and research materials. dloc's partners collaborate with scholars and teachers to promote and perform educational outreach for caribbean studies, create new works of digital scholarship, and develop other research and teaching initiatives.

partner training; shared infrastructures; institutional support
http://www.fiu.edu/ http://www.uvi.edu/pub-relations/uvi/home.html http://www.ufl.edu/ http://dloc.com/

dloc quick facts
• partners – caribbean, europe, and us
• over million hits since (average million views per month)
• over . million pages of open access content
• , titles with , items
• training program: digitization, data curation, and more
• scholarly collaborations
• educational outreach
• shared governance

protecting haitian patrimony initiative (phpi)
goals:
• encourage communication across institutions working to assist haiti’s libraries
• coordinate technical and in-kind assistance for haiti’s libraries
• raise funds to support specific collection/archival recovery and preservation projects in haiti, including purchasing and shipping needed materials, paying travel costs for specialists to travel to haiti, and paying wages for personnel carrying out archival preservation and recovery work in haiti

initial partners in phpi
archives nationales d’haïti houses civil and state records as well as those of the office of the president and most government ministries.
bibliothèque nationale d’haïti, established in , holds a collection of historical rare books, manuscripts and newspapers, and offers current publications, research support, and study space.
bibliothèque haïtienne des frères de l’instruction chrétienne (bhfic), founded in by the christian brothers, serves as a repository for haitian imprints and holds one of the most significant collections of newspapers.
bibliothèque haïtienne des pères du saint-esprit (bhpse) (saint-martial), founded in by the fathers of the holy spirit, holds documents and rare books chronicling french colonization, slavery and emancipation, the haitian revolution, and haiti’s nineteenth and twentieth century history. it was renamed bibliothèque haïtienne des spiritains (bhs) in .

haiti: an island luminous
an island luminous is a digital humanities website to help readers learn about haiti’s history. a collaborative effort hosted online by digital library of the caribbean (dloc), an island luminous combines rare books, manuscripts, and photos scanned by archives and libraries in haiti and the united states with commentary by over experts. it is available in english, haitian kreyòl, and french. http://islandluminous.fiu.edu/

institut de sauvegarde du patrimoine national (ispan)
the digital library of the caribbean (dloc) partnered with institut de sauvegarde du patrimoine national (ispan) in to protect haitian national patrimony.

ispan archive
cap haïtien, haiti
royal chapel of our lady of the immaculate conception church
• in , following the long u.s. occupation of haiti ( - ), haitian president sténio vincent had the dome reconstructed in an effort to reclaim haiti’s historic heritage – the chapel was the only building in the sans-souci complex to be rebuilt.
• in , the dome underwent another series of repairs and restoration by ispan.
• early in the morning on monday, april , , the historic church caught fire and the dome was destroyed; only the walls remain. experts are evaluating the site to determine what can be saved.
findings and experiences
• haiti’s cultural patrimony is mostly outside of the country
• demystifying preservation and digitization
• lack of institutional capacity and reliance on contingency
• trust and long-term partnerships
• project based funding

more information
http://dloc.com/
http://islandluminous.fiu.edu/
contact us at: dloc@fiu.edu

studies in technology enhanced learning
journal homepage: stel.pubpub.org
studies in technology enhanced learning, ( )
article type: full paper, double-blind peer review.
special issue: debating the status of ‘theory’ in technology enhanced learning research | more at https://doi.org/ . / c f e.dc
https://doi.org/ . / c f e. aaef b
citation: moffitt, p. ( ). engineering academics and technology enhanced learning; a phenomenographic approach to establish conceptions of scholarly interactions with theory. studies in technology enhanced learning, ( ).
keywords: theory; scholarship; technology enhanced learning; tel; phenomenography; higher education

abstract
discussions and debates of theory in technology enhanced learning (tel) within higher education (he) are often characterised by instrumentalist motives, where theory is either juxtaposed somehow with reality or is used in expedited attempts to order, predict and monitor directly business-driven outcomes. the current paper instead examines and interprets scholars’ experiences of their activities at a nexus of their research and practice; specifically, how participants conceive of their own scholarly interactions with theory in tel. the paper summarises an interpretive study conducted in mid- with teaching-focused lecturers at the royal school of military engineering, a school providing he in infrastructure engineering for defence personnel.
the paper first describes problematic notions of scholarship, theory and tel, then analyses the related existing literature to illustrate a dearth of studies which examine experiences, perceptions and conceptions of scholarly interactions with theory in tel. a phenomenographic study is presented, with outcomes disclosing four parsimonious conceptions ranging in successive inclusivity and complexity. participants conceive that scholarly interactions with theory in tel enable them to: understand their own competence; exhibit their own competence; critique the change endeavours of others; and undertake their own change endeavours. the categories of description and dimensions of variation show how, to the study’s participants, the status of theory in tel is very much thriving and informs sociocultural perspectives of tel’s enhancement. the findings also expose important contradictory social conditions, which are beyond the scope of phenomenography and are lucrative for further research of an interventionist nature.

philip moffitt
professional engineering wing, royal school of military engineering, chatham, united kingdom

engineering academics and technology enhanced learning; a phenomenographic approach to establish conceptions of scholarly interactions with theory

cover image: memorycatcher via pixabay.
publication history: received: november . revised: may . accepted: june . published: june .

1. introduction

in this paper i set out to contribute to debating the status of theory by examining lecturers’ experiences, perceptions and conceptions of their scholarly interactions with theory in technology enhanced learning (tel). a phenomenographic study is described in the paper, which was conducted in mid- .
participants of the study are teaching-focused lecturers at the higher education (he) wing of the royal school of military engineering, a defence school in the united kingdom concerned with knowledge and understanding of built infrastructure. it is important at the outset to differentiate the content knowledge of engineering from the pedagogical knowledge being examined. in their disciplinary scholarship, the participants’ content knowledge can be described as a nexus of engineering theory and application (christensen et al., ). in the current paper, i examine their experiences at a similar nexus of “practice informed by theory and theory informed by practice” (la velle, , p. ) but specifically in their scholarly interactions with theory in tel. it is also important to note that the paper discusses second-order perspectives of scholarly interactions with theory in tel. it does not scrutinise any particular theories in tel. the research question driving this paper is “what is the nature of variance in lecturers’ conceptions of their scholarly interactions with theory in tel?”.

i open the paper with a discussion of three notions which can be problematic in he: scholarship, theory and tel. i describe some key issues which frame these three notions, examining the related existing literature and depicting an apparent dearth of empirical studies which examine academics’ perceptions, conceptions and experiences of scholarship, theory and tel. an empirical study is then described, where i present phenomenographic interpretations of participants’ experiences in their scholarly interactions with theory in tel. my analyses identify an outcome space with four collective and parsimonious ways that participants conceive of their scholarly interactions with theory in tel, from understanding their own competence in tel to undertaking their own change endeavours.
the paper’s discussion analyses how participants place importance and meaningfulness on sociocultural enhancement of tel, exposing lucrative contradictions for further research of an interventionist nature. to close the paper, i conclude by describing how the categories of description – the most important outcomes of phenomenographic research – show that for these participants the status of theory in tel is very much thriving.

2. literature review

three notions which are foundational for the current paper, and which can be problematic in he, are scholarship, theory and tel. these are used to structure the sub-sections below, where i discuss key issues which are presented in existing literature and clarify positions taken in the paper. the fourth sub-section summarises the coverage of empirical studies which examine perceptions, conceptions and experiences of scholarship, theory and tel.

2.1 scholarship

in this paper scholarship is broadly conceived as undertaking activities relating to the character of knowledge and understanding, which take place within the purview of he. in my discussions i use boyer’s ( ) definition of scholarship as informed, reflexive and inquiring approaches of: discovery (creating knowledge); integration (knowledge across disciplines); application (engagement beyond he) and teaching (developing others). varied interpretations of scholarship and scholarly activities in he have been discussed, including by marshall and pennington ( ); tight ( ); and weller ( ). many contemporary debates have centred on dissatisfaction with defining scholarship; the conflation of scholarly competence with market-driven procurement and publishing of research has been commonly disputed.
while relating scholarship to tel in this paper may serve to focus on specific literature, technology can also add ambiguity to the notion of scholarship, with terms like “digital scholarship” having been used as euphemistic shorthand for the “curation and collection of digital resources” (weller, , p. ). i thus deem scholarship in tel to include participants making research contributions to the tel activities they are involved in (laurillard et al., ; ). i also consider that researching such tel activities has an a priori requirement for a theoretical approach (cf. antonenko, ; elken & wollscheid, ; tight, a), accepting that theory is also a problematic notion.

2.2 theory

theory can be understood in three ways (calhoun, ): as some conjecture to be refuted or confirmed; as that which is commonly accepted as truth; and as some means with which to understand the connections of phenomena. the last of these is the notion of theory which i use in this paper. educational researchers who have shared this conception, where theory guides and illuminates research and practice, have presented varied judgements of notable criteria and attractive attributes of theory. examples include: sayer’s ( , p. ) theoretical dimensions of ordering, conceptualising and hypothesising; tight’s ( , p. ) association of theory with evaluation, generalisation, explanation and prediction; halverson’s ( , p. ) theoretical powers of description, rhetoric, inference and application; and ashwin’s ( , pp. - ) discussions of simplification, consistency, framing and openness to development. in this paper i try to avoid endorsing particular characterisations, seeking instead to interpret experiences of others. yet i do presuppose theory as being in an interweaved nexus of thought, research and practice.
i concurrently wish to carefully avoid appearing to use the term theory simply to avoid or obscure analysing reality, whilst “grasping for legitimation” (bligh & flood, , p. ). i reject claims that theory is inconsequential (see e.g. thomas, ) or peripheral (such as the dichotomy of “book knowledge” and “practical knowledge” described by jarvis, , p. ).

2.3 tel

the third problematic notion to describe is that of tel, which has faced accusations, notably from within, of being under-theorised (hew et al., ), of lacking a stable ontology (laurillard et al., ) and of pervasive “common sense assumptions” of positive effects of technological innovation based on hype (bennett & oliver, , p. ). in response, authors have called for an explicit repositioning and reconsideration of a theory of tel (e.g. crook & sutherland, ; gunn & steel, ; jones & czerniewicz, ). passey ( ) has proposed that the very term tel has become so encompassing as to have lost much of its theoretical meaning to scholars, calling for the representation of distinguishable technology-enhanced fields: managing learning (teml); education (tee); managing education (teme); teaching (tet); and managing teaching (temt). in this current paper, i conceive of tel as a social activity related to the qualitative development of learning processes, activity whose production is mediated by technology yet with the precedence of pedagogical concerns above technological concerns (kirkwood & price, ). of additional note, i consider that technological artefacts are both tools (acting on the world) and signs (acting on the mind), which can be either digital or analogue in tel. by extension, i reject the notion of tel as referring to wholly virtual interactions which preclude physical co-presence at places of education (see also crook & bligh, ).

table 1. prevalent themes in studies of perceptions, conceptions and experiences of scholarship, theory and tel from to
columns: notion; prevalent themes from tight’s ( ) schema for he research (teaching and learning; course design; the student experience; quality; system policy; institutional management; academic work; knowledge and research); total
rows: scholarship; theory; tel

2.4 coverage of empirical studies

to locate relevant and informative empirical studies in each of the three notions of scholarship, theory and tel, i first used search criteria to identify peer-reviewed papers ranging from to inclusive. the results were manually screened to identify studies of perceptions, conceptions and experiences of phenomena, as discrete from studies of phenomena themselves. during detailed analyses and coding, each paper was allocated a thematic category from tight’s ( , p. ) schema for he research: teaching and learning; course design; student experience; quality; system policy; institutional management; academic work; and knowledge and research. at the time of writing, the literature was found to have the coverage in table 1. the review highlights gaps, whilst indicating prevalent trends:
• in research of experiences of scholarship, there is a dominance of business-driven outcomes such as knowledge transfer and marketisation of the student experience.
• in research of experiences of interacting with theory, studies most frequently examine redesigning curricula and developing academic staff to sustain competitive advantage.
• in research of experiences of tel, projects dominate which conduct post-hoc validation of technologies and which relate implemented artefacts to changing teaching and learning.
when the intersects of these three notions were examined, as shown in figure 1, there were studies which examined perceptions, conceptions and experiences at the intersect of any two of the three notions. yet there were none which examined conceptions at the intersect of all three notions. in synopsis, the review illustrates a deficiency of research which examines perceptions, conceptions and experiences of scholarly interactions with theory in tel.

figure 1. quantities of studies of perceptions, conceptions and experiences of scholarship, theory and tel and quantities of those studies (in brackets) at their intersects

3. theoretical framework

phenomenography was used as the study’s theoretical framework to interpret participants’ perceptions, conceptions and experiences of their scholarly interactions with theory in tel. it is a qualitative, interpretivist and relatively recent form of research rooted in education (marton, ). a phenomenographic theoretical framework is non-dualistic, and it assumes a limited number of ways in which a group of people perceive, understand or experience a shared phenomenon. knowledge is thus constituted through relationships between individuals and the world (ibid.). phenomenography emerged from an empirical basis, and it is the only methodology to have been substantially developed within he, where it is hermeneutically developed by researchers who apply, critique and improve its methodological processes. phenomenography concentrates on participants’ experiences and conceptions of phenomena as knowledge’s central form, rather than focusing on researching phenomena themselves (svensson, ). experiences and variances are presented for a unitary group; the findings apply to variation in meaning across, as discrete from within, a population (Åkerlind, ). as a second-order technique based on accounts of experiences, phenomenography is susceptible to allegations of poor rigour, questionable validity and disputed reliability.
the majority of criticisms appear to be most notably raised by phenomenographers themselves: it is described by hallett ( , p. ) as “methodological idolatry” which over-prioritises its own procedural concerns; its methodological processes can merely expose the researcher’s own pre-ordained categories (ashworth & lucas, ); and naïve researchers can uncritically replicate participants’ conservative, descriptive and neutral data (webb, ). ekeblad ( ) counters that such critiques are unhelpfully postmodernist, and that the processes of phenomenography do not necessarily impede a researcher being able to test or critique findings. the research design in section 4 is presented in consideration of these concerns, notably those related to quality. particular arrangements are summarised which sought during the study to provide assurance of theoretical validity and reliability (sin, ).

4. research design

in designing the study described in this paper, proposals for phenomenographic research were discussed with a disinterested colleague who conducted regular checks of validity and reliability; not to provide tacit agreement with my conduct or my findings, but for some assurance that the outcome space could have plausibly originated from the research design, the gathered data and the analyses. to further substantiate quality in the research design, i incorporated theoretical and methodological recommendations from more established authors. in particular, my research design included arrangements for: bracketing my intentions; being parsimonious with findings; presenting illustrative quotes for informed scrutiny; and overtly presenting the participants’ social conditions (Åkerlind, ; ashworth & lucas, ; marton & booth, ). in sub-section 4.1, i describe who the participants of the study were, and how they were selected, followed in sub-section 4.2 by an explanation of the design to gather and phenomenographically interpret their data.
4.1 participants and their selection

the participants in the study are amongst my work colleagues at the royal school of military engineering. they are teaching-focused lecturers, recruited from industry to fill the school’s lecturing positions in he, having been recruited for experience in vocational areas of engineering and management. they teach groups of six to twelve mature military students on degree programmes, who learn to design, construct, project manage and maintain infrastructure such as hospitals, airfields, water treatment and electrical distribution. pedagogical issues of tel are generally first introduced to these lecturers during their preparatory training for their positions, which includes undertaking postgraduate certification in he with a partnered collegiate university. their continued endeavours at developing tel activity take place during their design of teaching and learning, in response to feedback from managers, peers and learners, and during formal professional development if they seek it.

in designing the research, i applied purposive sampling to a group of these lecturers at the school who had volunteered for the study. purposive sampling selected participants for their representation, value and variation. my intent in asking for volunteers and then purposively sampling them was to balance variation with critical cases, limiting my own bias and the potential for artificial distinction (cohen, manion, & morrison, ). purposive sampling is typical of phenomenographic studies, in contrast to pursuing saturation through a large sample size. i accepted a sample size of nine for this study, because it comprised variation of personal characteristics and traits in spite of being below the generally accepted ten to twenty for phenomenography (Åkerlind, ).
this smaller sample size gave me the opportunity for a deeper approach in the study’s field work, and it allows me in this paper to explore contextual shared experiences, yet it does constrain the paper’s generalisability.

4.2 gathering and interpreting data

data collection was conducted in semi-structured and open-ended interviews with each participant. interviews comprised conversational open questioning techniques pertaining to notions of scholarship, theory and tel. these discursive exchanges led to specifically focused questions to expose personal experiences, perceptions and conceptions. i made significant and conscious effort to avoid the carryover of my bias into the problem space, and thus into the subsequent outcome space (creswell, ). opening discussions of scholarship enquired what each lecturer conceived of their scholarly identity and what typical activities were undertaken in the name of scholarship. open-ended questioning identified and discussed historically recent activities in scholarship, with follow-up enquiries in response. the second notion of theory and the third notion of tel were examined in similar ways, applying conversational segue as required. conversation was guided between the three notions to relate them to the others, in latter stages merging all three for coinciding and associative experiences.

the data informed the discretisation of conceptions for the subsequent structuring of categories of description and dimensions of variation. completed transcripts were iteratively analysed to progressively reveal shared conceptions across the group. reed ( ) describes two analytical methods at this stage: the first, generally in european research, identifies pools of meaning with commonality across the group relatively early; the second, more prevalent in australasia, preserves individual responses until as late as possible.
my analyses used the former; i coded transcripts while removing irrelevant text, progressively identifying congruent experiences and variation of meaning. boundaries between participants were discarded, as common categories of description were identified (marton & booth, ). numerous iterations and reinterpretations eventually revealed an outcome space (svensson, ) which was assessed for important criteria of quality in phenomenographic studies (ibid.): illustration of something distinctive about how phenomena are understood; logical and structural relatedness; and parsimony (meaning faithful illustration of variance, communicated as concisely as practicable).

5. findings

the outcome space in sub-section 5.1 shows categories of description in order; the first conception is the least complex and inclusive, and the final conception is the most complex and inclusive. it is important to note that my phenomenographic findings in this paper present only one of many potential outcome spaces. it is also stressed that these categories are not equitably distributed between participants; they represent conceptions across the group as one whole. the categories of description are first introduced, then presented in sub-section 5.2 as structural and referential representations, and then exemplified in sub-section 5.3 with quotes from the pools of meaning.

5.1 outcome space

in this instantiation, there are four qualitatively different conceptions of the group’s scholarly interactions with theory in tel. the group conceived that their scholarly interactions with theory enabled them to:
category 1. understand their own competence in tel.
category 2. exhibit their own competence in tel.
category 3. critique the tel change endeavours of others.
category 4. undertake their own tel change endeavours.
the outcome space is inclusively hierarchical, yet not developmental (see also ashwin, ). to explain, category 1 does not include experiences of any other categories, whilst category 4 includes experiences of all other categories. the outcome space is not progressive or step-by-step, because participants at any one time have assumed any one conception; they have not needed to progressively move through each in turn, from category 1 to category 4.

5.2 structural and referential aspects

the outcome space can be alternatively presented as in table 2, with structural and referential aspects of scholarly interactions with theory in tel (described in marton & pong, ; yates et al., ). structural aspects represent importance to the group. in this study, structural aspects of scholarly interactions with theory in tel relate to internalisation (individual understanding of factual and relatively isolated concepts) and externalisation (societal understanding of more contextually applied concepts). referential aspects represent meaningfulness to the group. in this study, referential aspects of scholarly interactions with theory in tel relate to competence, criticality and enhancement.

5.3 exemplification of the findings

each of the conceived categories of the outcome space is exemplified below, in the tone of the phenomenographic tradition; not by describing phenomena, but by describing experiences of relationships between participants and phenomena. the authenticity of language has been preserved in the examples, with swearing and taboo references included as they were expressed. these expressions were appropriate in the context and comfort of speaker-listener relationships. any offence or impoliteness is unintentional “on the part of the speaker” (timothy & janschewitz, , p. ) and is my sole responsibility.

5.3.1 category 1. understand their own competence in tel

challenges in individual sense-making for notions of competence such as confidence, proficiency and capability were frequently identified in the pools of meaning. navigating an understanding of competence in tel scholarship appeared to be considered by participants as a foundational utility for their interactions with theory. theory seemed to be experienced as some means for them to understand: others’ expectations of competent tel scholarship; their own expectations of competent tel scholarship; and gaps between the normative judgements of others and their own judgements of their competence at that point in time. the following examples show experiences related to understanding their competence in tel scholarship.

“being capable and all that means using it [theory] to work out what [managers] want from us, kind of set expectations, aspirations maybe … what good enough looks like to them … like a known skillset … common language with [peers and colleagues], knowing the theoretical stuff and feeling a bit better for it, rather than worse off for it, not be scared … not being embarrassed and incompetent, like what to try and not to try in tel and whatever that is …”.

“i wouldn’t feel a bit confident … if people were saying stuff and i didn’t know what they were on about. a bit of theory … lets you crack the code, you can read stuff … know what people are on about … it [theory] can help you work out tel that [learners] might need, if you’re worried about them getting pissed off in a lab … or on site or a classroom, [theory] might help you work out what might happen, what might not be worth a go …”.

5.3.2 category 2. exhibit their own competence in tel

the external and shared meaning-making of theory was considered by the group as essential to their negotiation and exhibition of competence in tel, particularly in managerial interactions, collaboration with peers and teaching-learning interactions.
with reference to the structural aspects of the outcome space, the pools of meaning show that internal sense-making to understand competence was considered discrete from external meaning-making to exhibit competence. varied motives for exhibiting competence were evident, apparently adapted to suit the stakeholders involved. the extracts below illustrate the participants’ conceptions of exhibiting their competence in tel, successively including understanding competence in tel.
“a lot of using it [theory] i suppose is about not looking a dick … at first we do it [interact with theory] for things like getting a [postgraduate certificate in he] out of the way, and getting through probation periods, but then it turns from not making dicks of ourselves to … doing tel whatever it might be and making tel better. it makes you look at least a bit capable in their [colleagues’ and managers’] eyes, being proficient … to know and show you know your theory … but i’m convinced we don’t really think much about it [theory] until we’re worried about looking or sounding stupid, then we need to show other people what we know …”.
https://doi.org/ . / c f e. aaef b moffitt ( ) studies in technology enhanced learning, ( ) https://doi.org/ . / c f e. aaef b
table . structural and referential aspects of scholarly interactions with theory in tel
structural \ referential | competence | criticality | enhancement
internal | understanding competence in tel | - | -
external | exhibiting competence in tel | critiquing tel change endeavours | undertaking tel change endeavours
“… coming at it [tel] from some theory or other makes people think, they’ll think you know what you’re doing, and [students] they’ll feel in safe hands, line managers will know you mean business … it’s [theory] got a shitter reputation than it deserves … sometimes it comes across as “either-or” [air quotes with fingers] … either theory or practice like that [chopping motion with hand], a line’s drawn … funny thing is if you design say fire pumps you’d be saying and writing, maybe d’arcy’s and bernoulli’s and pascal’s, laws and theories all over the place, but with tel you don’t, you jump straight in … you wouldn’t design fire pumps by saying ‘just piss about with them and we’ll see if they work later’ but we do when we’re designing tel, unless we’re talking to other people then suddenly we give a shit …”.
. . category . critique the tel change endeavours of others
in the pools of meaning, the ability to competently critique change was usually directed at the endeavours of hierarchical line managers. theory was experienced by participants as intrinsic to their informed and credible critique of such change endeavours. scholarly interactions with theory in tel seemed to be experienced as empowering, lending some authority to their critique, with understanding and exhibiting competence in tel expressed as precursors to critiquing change endeavours. two examples are shown below, which also illustrate successive inclusion of the previous conceptions. with regard to the referential aspects of the outcome space, and the dimensions of variation, the pools of meaning highlight criticality as being discrete from exhibiting competence.
variance appears to be experienced through the capacity to reject, confront, adapt and exchange ideas based on expressing differences of opinion.
“… you wouldn’t turn up to find new smart boards on your walls or ipads on your desks and just go ‘that’s shit’ or ‘that’s ace’. at least i hope not. but if you know what they’re [managers] trying to achieve with them you can do some swotting … make sure you know what you’re on about and how to tackle them … say ‘what exactly will this do … what’s going wrong that this’ll solve exactly, why are we throwing tablets at people, how do we make learning better with this?’. if you can say that using a bit of grown-up talk, some theory here and there, it’s miles better than just bleating how shit their ideas are”.
“you can push back on things they [managers] want to change … maybe model what might happen … theory’s good for that, maybe get to grips with what the [problem] was in the first place, theory’s good for that … telling them ‘this won’t work, it’s shit, spend the money on more people or a vending machine instead’ isn’t exactly helpful, but analysing and talking with [theory] is … if or when things go tits up you can go ‘look it never worked in study x or study y either, but it’s not tel it’s just buying a load of ipads, but this other stuff might work’ … you can use theory for that, offer solutions that will work not just saying ‘it’s shit i’m not doing it, it won’t work’ … ”.
. . category . undertake their own tel change endeavours
the final inclusive conception of the outcome space describes participants’ interactions with theory to conceive, design and undertake their own tel change endeavours. descriptions in the pools of meaning built upon the previous three conceptions. in the referential aspects of the outcome space, and in examining the dimensions of variation, the extracts show how enhancing tel was experienced as discretely meaningful from, yet inclusive of, critique. differentiation of undertaking change endeavours refers to participants taking ownership of recognising the impetus for change, diagnosing the requirements for change, and then designing and undertaking change. the following examples describe how this appeared to be their most complex and inclusive conception.
“we’d use it [theory] for having a go at something different, or should anyway … things to try in seminars, labs, on sites, even just for working out if you know enough to try it, wondering what’ll happen, to ask [colleagues] if they reckon they [students] might go with you, or if they might be thinking ‘she just hasn’t thought this through, fuck this’. i can’t remember what i did before i knew what it [theory] was good for, not for tel anyway … maybe i just tried and guessed and sometimes got lucky … but when you know it and use it [theory] … it’s hard not to think about it for your own changes, it saves a bit of time and effort when you’re having a go … better than a stab in the dark … at least you can have a read and a think and a chat with other people whether something might work or not …”.
“if you’re changing things it’s to make things better, not worse, so you need enough [theory] flying round, else you’re going to look a right dick if it’s an epic fail, and you’ll take them [learners] down with you. imagine someone saying ‘you said the gaffer’s ideas were crap, so do tell us what you tried instead’ … then you go ‘oh i never had a plan i just thought it’d be worth a go, it’s gone tits up never mind [apparent sarcasm] it’s only a whole [cohort] can’t do their jobs now poor bastards’.
there’s an incentive to get your “theory proficiency badge” [air quotes with fingers] for tel before trying to change stuff, or just a “making sure you don’t look a dick” badge… if it fails then you’ve got productive failure if you can use theory to redesign and explain what happened … but you’ve got a plain old fuck up if you can’t use theory to explain it … they’ll [learners] thank you one day if you can tell them what went wrong by using a bit of theory, but if you can’t they’ll just think you’re a dick, and they’ll probably be right …”.
. discussion
various implications for scholarly interactions with theory in tel can be drawn from the outcome space, with my discussion here delimiting observations which inform debates of the status of theory. the implications of the conceptions are first structured to suit the separate notions of scholarship, theory and tel. i then amalgamate these three notions, to discuss implications for scholarly interactions with theory in tel.
. implications for scholarship
the outcome space’s structural and referential aspects illustrate what participants in the study perceived as important and meaningful for scholarship. articulation was generally expressed through pedagogical concerns of scholarship, rather than other activities associated with scholarship. categories of description relate to their use of theory, to negotiate their identity as teaching-focused scholars (from internal and external perspectives) and to understand and enact enhancing their teaching-focused scholarship. these conceptions go a small way to countering narratives in the literature of prevailing “orthodoxies and pseudo theories” in scholarly practice (drumm, , p. ) and they present opposition to the empirical literature’s foci on instrumentalist gains of scholarship, such as technology transfer with industry and commercialising knowledge.
yet it is important to re-state the limitation that participants of this study are teaching-focused lecturers, conflating their daily reality of scholarship with that of teaching and learning, explaining the pedagogical foci of conceptions. the setting and the participants thus limit generalisability of the findings, yet they do so whilst illustrating how phenomenographic research can yield important empirical results across a bounded group. amidst claims that learning is displacing teaching, and that students and managers are displacing teachers (e.g. murphy, ), the outcome space implies that some scholars can resist role displacement, and that theory informs their experiences of scholarly activities. the successively inclusive and hierarchical categories show that theory informs scholarship’s movement for the group from instrumentalist to humanistic perspectives, despite the latter being politically problematic (mardis, hoffman & rich, ). on one hand, the ‘exchange value’ of scholarship in exhibiting competence in tel is related to securing their wage and status. on the other hand, the ‘use value’ of scholarship in enhancing tel is related to societal development. for this study’s participants, sustaining scholarship’s use value through a humanistic perspective is perhaps less hazardous than for most in he; education in the defence sector is anecdotally characterised by ample resourcing, well-motivated students and job security (c.f. the united kingdom’s he market and competitive scholarship in watermeyer & tomlinson, ). yet the outcome space does expose a nascent lucrative contradiction, with categories of description which are recognisably related to scholarship’s use or exchange value.
the dimensions of variation thus illustrate some potential for opposition to scholarship’s commodification (by which i mean the process of “transforming use to exchange value”, as in mosco, , p. ). aggravating such contradictions is beyond the scope of this paper’s phenomenographic research, yet it may benefit future interventionist research (bligh & flood, ), suggesting lucrative opportunities for formative and agentic change of scholarship driven by participants themselves.
. implications for theory
the dimensions of variation are informative for the shared social meaning of theory, commencing with the structural representation of competence. the externalisation of exhibiting shared competence leads to successively meaningful experiences of critique and enhancement, the social negotiations of which are informed by theory. this study’s collective group is clearly engaged with theory, although again generalisability is likely to be limited. they may have vested interests related to their postgraduate study of tel, their relatively late entries into lecturing careers and their negotiations of new professional identities as scholars. there is another important caveat that i wish to raise, driven by a dialectic of distance, where the closer i examine the participants’ interactions with theory the more ambiguous their implications seem to become. such ambiguity is characterised in the examples from the pools of meaning, through confused interplay of terms and notions such as theory, theories, methodology and methods, exacerbated during detailed questioning. further ambiguity exists regarding the group’s organisational level. while phenomenography necessitates a relatively consistent sample of participants, there is shared recognition across the group of theory-related outside influence at macro, meso and micro levels (see also tight, ; crook & sutherland, ).
and yet this group’s direct experiences are at one organisational level; further research of theory needs to recognise teaching and learning, but ought to also include representation of “the institution and larger society” (anderson, , p. ). by my preclusion of other stakeholders, and by examining experiences of theory rather than theories, this current paper evades engaging in debates of ideological perspectives, avoiding related traps described in seminal works: passey’s ( , p. ) elusive “unifying theory of learning”; trowler’s ( , p. ) concerns for “forcing evidence into a frame”; and shaw and crompton’s ( , p. ) “misted theoretical spectacles”. given this paper’s interpretivist paradigm i also miss opportunities to aggravate contradictions that theory is embroiled in: theory as a convenient “model of learning to frame teaching” (drumm, , p. ), theory’s rejection in anti-intellectual endeavours (ellis, ), and theory’s relationship with emotionally laden changes of identity (van veen & lasky, ). this study partially exposes theory’s contradictory nature to these participants, yet those contradictions remain un-aggravated: theory is recognised as complex, yet is sought to provide clarity; scholars secure a wage and status through theory’s exchange value, yet use theory for societal benefit; and theory better informs their practice, yet separation of the notions is a false dichotomy to them (the cartesian wrong turn described by toulmin & gustavsen, ). theory in research of tel is often marginalised for expedience (bennett & oliver, ) or findings are theorised only to marketise tel and sustain education’s “fetishisation of emergent technologies” (hall & stahl, , p. ). in some contrast this study’s participants, whose expressions must be taken at face value, evidently value theory in social and cultural negotiations; others may have more conflicting and troublesome experiences, deserving follow-up research.
whilst debates of its exact nature are far from resolved, theory is thriving.
. implications for tel
in research of experiences of tel, much of the existing literature is dominated by post-hoc validation of implemented digital technologies, the resulting impact on developing staff and redesigning curricula. the socially transformative possibilities of tel have been nascent for decades, yet empirical research is characterised by the procurement and retrospective acceptance of pre-ordained digital media and platforms (see e.g. bates & sangrà, ; hew, lan, tang, jia & lo, ). this study’s findings show that participants experience alternative conceptions of tel, which differ across the group yet are commonly related to the technological mediation of learning and to social negotiations of knowledge and its meaning. participants foreground their competence in tel as foundational for the successive critique of change and, in turn, their design and implementation of change. their most successively inclusive conception, their enhancement of tel, is described from a sociocultural perspective (as defined in trowler, saunders & bamber, ). despite these social negotiations, the pools of meaning consistently highlight divergence in how tel itself is defined, with some open and explicit acknowledgements of uncertainty. this observation echoes the words of seminal writers in earlier sections; the very definition of tel is itself problematic. the group’s value judgements of the importance and meaningfulness of tel are clear, despite tel itself being less clear. examples of what constitutes tel from elsewhere appear to variously interweave with the group’s perceptions, including: means by which teachers and learners interact (beetham & sharpe, ); tools and processes for pedagogical design (laurillard, ); and conditions to develop scholarship (passey, ). shared recognitions of stubborn ontological and epistemological challenges of tel research are also reflected in the study’s pools of meaning. these include the manifestation of challenging encounters in the daily reality of tel scholarship: countering the marketisation of tel artefacts; resisting technological determinism; and rejecting the unmediated transmission of digital media for passive consumption (see also bayne, ; hall & stahl, ; jones & czerniewicz, ). whilst tel lacks a stable definition in the group’s pools of meaning (and to an extent that instability informs the dimensions of variation), there are consistent social and cultural learning-oriented perspectives to their experiences of tel. as with the previous two notions, their experiences and perceptions of tel may relate to vested interests, but their accounts must be taken at face value in this interpretivist study. as in the wider he community, a debate of ‘what tel is’ across the group appears to be very active, often emotional, and far from resolved. their uncertainty of conceptions of tel appears to warrant further research, again of a more interventionist nature.
. implications for scholarly interactions with theory in tel
results of phenomenographic research can be used to develop processes of education, such as the strategic use of variation in teaching to encourage effective learning (tight, b). the results of this study are likely to have a less direct path to developing tel processes, yet amalgamating the three notions illustrates valuable findings for scholars beyond their use of theory to merely sustain digital artefacts (c.f. flavin, ).
firstly, the representation of structural aspects implies some movement, from an individual goal of internalised sense-making in tel, toward a more important societal motive of externalised meaning-making in tel (described in vygotskian terms by aidman & leontiev, ). secondly, participants attribute and label meaningfulness to their successive competence, critique and enhancement of tel. in a provocative call to reject theory in all educational research, thomas ( , p. ) claims that “debate about theory is rarely accompanied by any discussion of its meaning”. the structural and referential representation of the outcome space suggests that these participants reject thomas’s claim; the status of theory is highly meaningful to them. the categories of description and dimensions of variation also rebuff thomas’s additional claim (ibid.) that theory stifles creativity; data show varied, inventive and imaginative scholarly interactions with theory in tel. in a paper written over a decade ago, laurillard ( , p. ) described education as “on the brink of being transformed through learning technologies; however, it has been on that brink for some decades”. in a call for more theoretical work by engeström ( , p. ), digital technology is described as having “not yet brought about significant change”. bayne ( , p. ) navigates theory and rhetoric to describe tel as “black-boxed, under-defined and generally described in instrumental or essentialist terms”. this paper illustrates that such claims are acknowledged by the study’s participants, and yet they are dilemmatic to them, presenting contradictions which are deserving of further research outside the scope of phenomenography.
for their scholarly engagement with theory in tel there are dialectics at play, beyond those of phenomenographic meaning and importance: participants recognise complexity in theory whilst seeking clarity through theory; they consider theory and practice separately but consider them inseparable; they present intellectual analyses which are countered by visceral reactions; and they attribute local difficulties to more systemic problems. dilemmas are particularly notable in the most inclusive and complex of their conceptions, the enhancement of tel, and they may be lucrative to agentic change (see e.g. virkkunen, ). in many studies which discuss scholarship, theory and tel, managerial consensus can inhibit genuine change, through the pragmatic pursuit of “contrived collegiality” (mulford, , p. ) and market-driven “bureaucratic rationality” (brookfield, , p. ). other scholars face the implementation of artefacts to replicate results observed elsewhere, with coercion and edict to pursue pre-ordained intentions (gray, ). the current paper’s findings instead illustrate potential for dialectical debates of theory in tel, informed by this interpretivist research. a caveat is that generalisability of the results is limited. although no research has “context-free meaning” (sin, , p. ), the limitations of this paper deserve explication. participants were selected to provide variance in “the collective mind” described by marton ( , p. ) yet they are all from the same organisational level in the same he wing of the same defence school; whilst these results aspire to be of use and interest, there can be no claims of saturation or fitness for extrapolation. the findings do, however, inform further research at this school and are perhaps of some modest interest or use to other researchers. 
for further research at this particular school, i plan to conduct interventionist research related to scholarly interactions with theory in tel, led by participants themselves, supplementing these interpretivist findings by aggravating the contradictory social circumstances which have been exposed.
. conclusion
this paper describes a study of the nature of variance in how participants conceive of their scholarly interactions with theory in tel. the outcome space, which is one of many possible outcome spaces, shows that participants conceived of interactions with theory enabling them to: understand their own competence in tel; exhibit their own competence in tel; critique tel change endeavours of others; and undertake their own tel change endeavours. the structural representation of the outcome space shows the variance in their importance of interactions with theory in tel; from internal sense-making and understanding competence in tel, to external meaning-making and exhibiting competence in tel. the representation of referential aspects shows increasing meaningfulness of their interactions with theory in tel, attributed to: their competence in tel; critique of change in tel; and undertaking their own change endeavours in tel. these successively inclusive, hierarchical, and complex categories indicate potential for movement in the participants’ interactions with theory. the movement from understanding and exhibiting competence in tel, toward the sociocultural enhancement of tel, counters many instrumentalist and business-driven perspectives of theory in existing literature. the categories and dimensions of variation also refute claims that theory is void of meaning, or is somehow oppositional to the reality of practice.
this study’s participants engaged with theory to confront their role displacement in tel, and they foregrounded sociocultural perspectives of tel’s enhancement. importantly, they did so in dilemmatic ways, recognising and negotiating differences in the use and exchange value of their own scholarly interactions with theory in tel; whilst not using those terms, they presented experiences of tensions between wage and status on the one hand, and change through sociocultural enhancement of learning on the other hand. this dialectic of use versus exchange value is one of many contradictions exposed in the study which are related to theory. others include: value judgements of theory in tel, despite shared ambiguity of a definition of tel; local difficulties with theory, presented as symptoms of systemic problems; intellectual analyses of theory, countered by visceral reactions; using theory to inform practice, whilst describing both as inseparable; and recognising the complexity of theory whilst seeking its value in clarification. research to further expose and aggravate these contradictions is beyond the scope of phenomenographic interpretation of meaning and importance; they can be better examined through further research of an interventionist nature, informed by this interpretive study, and driven by participants themselves. to close the paper, the most important outcomes of phenomenographic research, the qualitatively different categories of description, show that for these participants the status of theory in tel is very much thriving.
references
aidman, e. v., & leontiev, d. a. ( ). from being motivated to motivating oneself: a vygotskian perspective. studies in soviet thought, ( ), – . https://doi.org/ . /bf
Åkerlind, g. s. ( ). learning about phenomenography: interviewing, data analysis and the qualitative research paradigm. in j. a. bowden & p. green (eds.), doing developmental phenomenography (pp. – ). melbourne: rmit university press.
anderson, t. ( ). towards a theory of online learning. in t. anderson (ed.), the theory and practice of online learning (pp. – ). athabasca: athabasca university press.
antonenko, p. d. ( ). the instrumental value of conceptual frameworks in educational technology research. educational technology research and development, ( ), – . https://doi.org/ . /s - - -
ashwin, p. ( ). variation in academics’ accounts of tutorials. studies in higher education, ( ), – . https://doi.org/ . /
ashwin, p. ( ). analysing teaching–learning interactions in higher education: accounting for structure and agency. london: continuum publishers.
ashworth, p., & lucas, u. ( ). what is the ‘world’ of phenomenography? scandinavian journal of educational research, ( ), – . https://doi.org/ . /
bates, t., & sangrà, a. ( ). managing technology in higher education: strategies for transforming teaching and learning. san francisco: jossey-bass.
bayne, s. ( ). what’s the matter with ‘technology-enhanced learning’? learning, media and technology, ( ), – . https://doi.org/ . / . .
beetham, h., & sharpe, r. ( ). an introduction to rethinking pedagogy. in h. beetham & r. sharpe (eds.), rethinking pedagogy for a digital age: designing for st century learning (pp. – ). abingdon: routledge, taylor & francis group. https://doi.org/ . /
bennett, s., & oliver, m. ( ). talking back to theory: the missed opportunities in learning technology research. research in learning technology, ( ), – . https://doi.org/ . /rlt.v i .
bligh, b., & flood, m. ( ). the change laboratory in higher education: research-intervention using activity theory. in j. huisman & m. tight (eds.), theory and method in higher education research: volume (pp. – ). london: emerald group publishing limited. https://doi.org/ . /s -
bligh, b., & flood, m. ( ). activity theory in empirical higher education research: choices, uses and values. tertiary education and management, ( ), – . https://doi.org/ . / . .
boyer, e. ( ). enlarging the perspective. in a special report. scholarship reconsidered: priorities of the professoriate (pp. – ). new york: john wiley & sons. https://doi.org/ . /ptj/ . .
brookfield, s. ( ). pedagogical peculiarities. in e. medland, r. watermeyer, a. hosein, i. m. kinchin, & s. lygo-baker (eds.), pedagogical peculiarities: conversations at the edge of university teaching and learning (pp. – ). leiden, the netherlands: koninklijke brill nv. https://doi.org/ . /
calhoun, c. ( ). theory. in dictionary of the social sciences (pp. – ). oxford: oxford university press. https://doi.org/ . / acref/ . .
christensen, s. h., didier, c., jamison, a., meganck, m., mitcham, c., & newberry, b. ( ). general introduction: the engineering-context nexus a perennial discourse. in s. h. christensen, c. didier, a. jamison, m. meganck, c. mitcham, & b. newberry (eds.), philosophy of engineering and technology; volume . engineering identities, epistemologies and values: engineering education and practice in context (pp. xix–xxxiv). london: springer. https://doi.org/ . / - - - -
cohen, l., manion, l., & morrison, k. ( ). research methods in education ( th ed.). london: routledge. https://doi.org/ . /
creswell, j. w. ( ). research design: qualitative, quantitative, and mixed methods approaches. london: sage publications.
crook, c., & bligh, b. ( ). technology and the dis-placing of learning in educational futures. learning, culture and social interaction, , – . https://doi.org/ . /j.lcsi. . .
crook, c., & sutherland, r. ( ). technology and theories of learning. in e. duval, m. sharples, & r. sutherland (eds.), technology enhanced learning: research themes (pp. – ). london: springer. https://doi.org/ . / - - - - _
drumm, l. ( ). folk pedagogies and pseudo-theories: how lecturers rationalise their digital teaching. research in learning technology, ( ), – . https://doi.org/ . /rlt.v .
ekeblad, e. ( ). on the surface of phenomenography: a response to graham webb. higher education, ( ), – . https://doi.org/ . /a:
elken, m., & wollscheid, s. ( ). the relationship between research and education: typologies and indicators. oslo: nordic institute for studies in innovation, research and education. retrieved from https://brage.bibsys.no/ xmlui/handle/ /
ellis, v. ( ). reenergising professional creativity from a chat perspective: seeing knowledge and history in practice. mind, culture, and activity, ( ), – . https://doi.org/ . / . .
engeström, r. ( ). new forms of transformative agency. in a. littlejohn & a. margaryan (eds.), technology-enhanced professional learning: processes, practices and tools (pp. – ). london: routledge, taylor & francis group. https://doi.org/ . /
flavin, m. ( ). disruptive technology enhanced learning: the use and misuse of digital technologies in higher education. london: palgrave-macmillan. https://doi.org/ . / - - - -
gray, j. ( ). probing the limits of systemic reform: the english case. in a. hargreaves, a. lieberman, m. fullan, & d. hopkins (eds.), springer international handbooks of education, volume . second international handbook of educational change (pp. – ). london: springer. https://doi.org/ . / - - - - _
gunn, c., & steel, c. ( ). linking theory to practice in learning technology research: a review of learning technology research articles. research in learning technology, ( ), – . https://doi.org/ . /rlt.v i .
hall, r., & stahl, b. ( ). against commodification: the university, cognitive capitalism and emergent technologies. in c. fuchs & v. mosco (eds.), marx and the political economy of the media (pp. – ). leiden, the netherlands: brill academic publishers. https://doi.org/ . / _
hallett, f. ( ). the dilemma of methodological idolatry in higher education: the case of phenomenography. in m. tight & j. huisman (eds.), international perspectives on higher education research, volume (pp. – ). london: emerald group publishing. https://doi.org/ . /s - ( )
halverson, c. ( ). activity theory and distributed cognition: or what does cscw need to do with theories? computer supported cooperative work, , – . https://doi.org/ . /a:
hew, k. f., lan, m., tang, y., jia, c., & lo, c. k. ( ). where is the “theory” within the field of educational technology research? british journal of educational technology, ( ), – . https://doi.org/ . / bjet.
jarvis, p. ( ). practice-based and problem-based learning. in p. jarvis (ed.), the theory and practice of teaching (pp. – ). london: kogan page. https://doi.org/ . /
jones, c., & czerniewicz, l. ( ). editorial. theory in learning technology. research in learning technology, ( ), – . https://doi.org/ . /rlt. v i .
kirkwood, a., & price, l. ( ). technology-enhanced learning and teaching in higher education: what is ‘enhanced’ and how do we know? a critical literature review. learning, media and technology, ( ), – . https://doi.org/ . / . .
la velle, l. ( ). the theory–practice nexus in teacher education: new evidence for effective approaches. journal of education for teaching, ( ), – . https://doi.org/ . / . .
laurillard, d. ( ). digital technologies and their role in achieving our ambitions for education. london: institute of education.
laurillard, d. ( ). teaching as a design science: building pedagogical patterns for learning and technology. london: routledge, taylor & francis group. https://doi.org/ . /
laurillard, d., charlton, p., craft, b., dimakopoulos, d., ljubojevic, d., magoulas, g., masterman, e., pujadas, r., whitley, e., & whittlestone, k. ( ). a constructionist learning environment for teachers to model learning designs. journal of computer assisted learning, ( ), - . https://doi.org/ . /j. - . . .x
laurillard, d., kennedy, e., charlton, p., wild, j., & dimakopoulos, d. ( ). using technology to develop teachers as designers of tel: evaluating the learning designer. british journal of educational technology, ( ), – . https://doi.org/ . /bjet.
mardis, m. a., hoffman, e. s., & rich, p. j. ( ). trends and issues in qualitative research methods. in j. m. spector, m. d. merrill, j. elen, & m. j. bishop (eds.), handbook of research on educational communications and technology (pp. – ). new york: springer. https://doi.org/ . / - - - -
marshall, s., & pennington, g. ( ). teaching excellence as a vehicle for career progression. in h. fry, s. ketteridge, & s. marshall (eds.), a handbook for teaching and learning in higher education: enhancing academic practice (third, pp. – ). abingdon: routledge, taylor & francis group. https://doi.org/ . /
marton, f. ( ). phenomenography - describing conceptions of the world around us. instructional science, , – . https://doi.org/ . /bf
marton, f., & booth, s. ( ). learning and awareness. mahwah, new jersey: lawrence erlbaum associates, inc. https://doi.org/ . /
marton, f., & pong, w. ( ). on the unit of description in phenomenography. higher education research & development, ( ), – . https://doi.org/ . /
mosco, v. ( ). the political economy of communication (second). london: sage. https://doi.org/ . /ins. mmj.
mulford, b. ( ). recent developments in the field of educational leadership: the challenge of complexity. in a. hargreaves, a. lieberman, m. fullan, & d. hopkins (eds.), springer international handbooks of education, volume . second international handbook of educational change (pp. – ). london.
https://doi. org/ . / - - - - murphy, t. ( ). the future of technology enhanced learning (tel) is in the hands of the anonymous, grey non-descript mid-level professional manager. irish jour- nal of technology enhanced learning, ( ), – . https:// doi.org/ . /ijtel.v i . passey, d. ( ). inclusive technology enhanced learning : overcoming cognitive, physical, emotional, and geo- graphic challenges. london: routledge. https://doi. org/ . / passey, d. ( ). technology‐enhanced learning: rethink- ing the term, the concept and its theoretical background. british journal of educational technology, ( ), – . https://doi.org/ . /bjet. reed, b. ( ). phenomenography as a way to research the understanding by students of technical concepts. nucleo de pesquisa em technologia da arquiterura e urbanismo (nutau): technological innovation and sustainability, – . sayer, a. ( ). introducing critical realism: realism and social science. london: sage publications. https://doi. org/ . / shaw, i., & crompton, a. ( ). theory, like mist on spectacles, obscures vision. evaluation, ( ), – . https://doi.org/ . / sin, s. ( ). considerations of quality in phenom- enographic research. international journal of qualitative methods, ( ), – . https://doi. org/ . / svensson, l. ( ). theoretical foundations of phenomenography. higher education research & development, ( ), – . https://doi. org/ . / thomas, g. ( ). what’s the use of theory? harvard educational review, ( ), – . https://doi. org/ . /haer. . . x w u tight, m. ( ). researching higher education (second). maidenhead: mcgraw-hill international and the open university press. tight, m. ( a). examining the research/teaching nexus. european journal of higher education, ( ), – . https://doi.org/ . / . . tight, m. ( b). phenomenography: the development and application of an innovative research design in higher education research. international journal of social research methodology, ( ), – . https://doi.org/ . / . . tight, m. ( ). 
higher education research: the developing field. london: bloomsbury academic. timothy, j., & janschewitz, k. ( ). the pragmatics of swearing. journal of politeness research, ( ), – . https://doi.org/ . /jplr. . toulmin, s., & gustavsen, b. ( ). beyond theory: chang- ing organizations through participation, dialogues on work and innovation. amsterdam: john benjamins publishing company. https://doi.org/ . /dowi. trowler, p. ( ). wicked issues in situating theory in close up research. higher education research & development, ( ), – . https://doi.org/ . / . . trowler, p., saunders, m., & bamber, v. ( ). enhancement theories. in v. bamber, p. trowler, m. saunders, & p. knight (eds.), enhancing learning, teaching, assessment and curriculum in higher education: theory, cases, practices ( st ed., pp. – ). maidenhead: mcgraw-hill international and the open university press. https://doi. org/ . /j. - . . .x van veen, k., & lasky, s. ( ). emotions as a lens to explore teacher identity and change: different theoretical approaches. teaching and teacher education, ( ), – . https://doi.org/ . /j.tate. . . virkkunen, j. ( ). dilemmas in building shared trans- formative agency. activities / @ctivités, ( ), – . https://doi.org/ . /activites. https://doi.org/ . / c f e. aaef b moffitt ( ) studies in technology enhanced learning, ( ) https://doi.org/ . / c f e. aaef b watermeyer, r., & tomlinson, m. ( ). the marketization of pedagogy and the problem of “competitive account- ability.” in e. medland, r. watermeyer, a. hosein, i. m. kinchin, & s. lygo-baker (eds.), pedagogical peculiarities: conversations at the edge of university teaching and learn- ing (pp. – ). leiden, the netherlands: koninklijke brill nv. https://doi.org/ . / webb, g. ( ). deconstructing deep and surface: towards a critique of phenomenography. higher education, ( ), – . https://doi.org/ . /a: weller, m. ( ). the digital scholar: how technology is transforming scholarly practice. london: bloomsbury academic. 
https://doi.org/ . / yates, c., partridge, h., & bruce, c. ( ). exploring infor- mation experiences through phenomenography. library and information research, ( ), – . https://doi. org/ . /lirg https://doi.org/ . / c f e. aaef b engineering academics and technology enhanced learning studies in technology enhanced learning, ( ) https://doi.org/ . / c f e. aaef b open access (cc by . ) © the authors. this article is distributed under creative commons attribution . international licence. you are free to • share — copy and redistribute the material in any medium or format • adapt — remix, transform, and build upon the material for any purpose, even commercially. under the following terms: • attribution — you must give appropriate credit, provide a link to the license, and indicate if changes were made. you may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. • no additional restrictions — you may not apply legal terms or technological measures that legally restrict others from doing anything the license permits. the full licence conditions are available at: https://creativecommons.org/licenses/by/ . / acknowledgements the author would like to express appreciation to the anonymous reviewers for their constructive comments on previous versions of this paper, and to colleagues at the royal school of military engineering who volunteered for this paper’s phenomenographic study. about the author philip moffitt is a consultant and teaching-focused lecturer based at the higher education wing of the royal school of military engineering in the united kingdom. a chartered engineer, facilities manager and ergonomist, he specialises in technol- ogy enhanced learning for teams who design, build and operate critical national infrastructure, and whose learning requirements are often only identified at the time and location of need. 
phil’s research interests include: collaborative learning for geographically distal teams; relationships of learning with culturally and historically embedded organisational practices; ergonomics for human-computer interaction and error reduction; and research-interventions to redesign learning activity, driven by participants themselves. phil is an alumni member of the centre for technology enhanced learning at lancaster university.

email: phil@philipmoffitt.com
orcid: - - -
twitter: @philmoffitt

cm&r : (march) hmorn – selected abstracts

c-d - : governing access to a distributed research network’s data resources

beth l syat, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; kimberly lane, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; jeffrey s brown, phd, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; david magid, md, mph, institute for health research, kaiser permanente colorado; joe v selby, md, mph, division of research, kaiser permanente northern california; richard platt, md, ms, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; andrew nelson, mph, healthpartners research foundation

to answer many public health questions, it is essential to use information from more than one electronic data system, and efficient ways are needed to securely access and use data from multiple organizations while respecting the regulatory, legal, proprietary, and privacy implications of this data use and access.
one approach centers on the development of distributed research networks that allow data owners to maintain confidentiality and physical control over their data, while permitting authorized users to ask essential questions. once such a network is fully operating and key elements are in place, sharable data resources can be made available to approved network users, under approved conditions. for instance, data from a large cohort of hypertensive patients with five years of utilization (a hypertension cohort) could be available on the network. the following questions will need to be addressed: who can have access? under what conditions should access be granted? what policies/procedures are required? to address the specific needs associated with governance of a network’s resource(s), the authors call for the establishment of user eligibility requirements, policies to deal with funders (i.e., access rules for study funders), clear standard operating procedures, and guidelines for accessing the network. recommendations to meet those needs include: 1) establishing data oversight policies; 2) defining responsibilities for data resource access; 3) defining responsibilities for data owners at each site (i.e., responding to queries when requests come in); 4) creating standard operating procedures for the data resource; 5) creating collaboration guidelines for external partners; and 6) monitoring overall resource use. for the purpose of this poster, we propose to illustrate responsibilities for data owners at each site.

ps - : digital scholarship: scientific publishing at the crossroads

virginia d scobba, mls, ma, group health center for health studies, group health cooperative

background/aims: scholarly communication is the system through which research and other scholarly writings are created, evaluated for quality, disseminated to the scholarly community, and preserved for future use.
the traditional formal means of interchange, publication in peer reviewed journals, is at the core of the communication infrastructure. however, the structures and processes by which scholars communicate have undergone a major transformation in recent years with the advent of the digital age. new electronic technologies for access to information appear to be revolutionizing scholarly publishing, aptly defined by the term, digital scholarship. current trends in the chaotic scholarly publishing market can be perceived as both opportunities for and threats to digital scholarship. methods: digital scholarship is in a state of unprecedented upheaval as publishers, librarians, legislators, scholarly societies, scientists and other scholars engage in tactics to propel change in directions that promote their individual goals. strategies involve remodeling the publishing market, modifying academic and research institutional procedures, and influencing public policy. results: emerging digital publishing technologies, increasing volume of scholarly works, and decreasing satisfaction with a costly and dysfunctional economic model are changing the fundamental structure of scholarly publishing. research institutions, as well as government and funding agencies, are implementing or exploring strategies which promote free and open access to research results. these include alternative copyright arrangements, e-print archives and digital repositories. conclusion: scholars, researchers, and society at large gain tremendous benefits from the expanded dissemination of research findings. however, several factors have impeded the progress of digital scholarship, including efforts to protect publishing revenues and profits, legal licensing restrictions, and the traditional culture of academia. it is therefore critical that the scientific community is actively engaged to ensure that the advancement of scholarship takes priority in the development of new publishing models.
ps - : developing an analytical tool for assessing the adequacy of state health information exchange laws

randy mcdonald, jd, lovelace clinic foundation; maggie gunter, phd, lovelace clinic foundation; shelley carter, rn, mph, lovelace clinic foundation; bob mayer

aims: to develop and test an analytic legislative tool that provides states with the ability to analyze and propose reform to laws related to the exchange of electronic health information. background: through extensive research, the multi-state harmonizing security and privacy law collaborative (hsplc) found myriad barriers to health information exchange in laws and business practices. in some cases, barriers are beneficial because they protect people’s privacy. however, barriers can be problematic when they prevent the timely exchange of information needed for the treatment of patients. there are many inconsistencies in state and federal laws and among state statutes in their definitions, organizational structure, and content. some states have adopted new legislation that addresses the exchange of health information that may further exacerbate differences among states and impede interstate exchange of electronic health information. methods: hsplc developed a set of analytical tools and a narrative guide, the roadmap, to assist states in implementing an effective legal framework for the review and adoption of legislation that supports health information exchange (hie). the tools and roadmap were created through extensive research to identify best practices for identifying, evaluating, and reforming state laws related to the disclosure of electronic health information. results: hsplc found that various state resources (legal, legislative, healthcare policy, healthcare providers, and consumers) are necessary for successful completion of the roadmap to identify opportunities for legislative reform.
hsplc believes that states will have a greater likelihood of success in achieving legislative reform if they use the roadmap and reach out to other states contemplating a change in legislation. interstate collaboration and coordination are essential if we are to achieve a national legal and technical infrastructure that facilitates health information exchange. conclusions: legislation in most states does not adequately address the exchange of electronic health information. drafting of legislation must take into account a state’s unique environment and culture, and the needs and support of stakeholders. the goal of using the analytic tool is to protect health information while removing barriers that impede the exchange of vital information. the hsplc roadmap provides a step-by-step process to analyze and reform state legislation.

ps - : optimizing health informatics interventions from the patient’s perspective: focus group on improving safe nsaid use

douglas w roblin, phd, kaiser permanente georgia; richard m shewchuk, phd, university of alabama at birmingham; jeroan j allison, md, msc, university of alabama at birmingham; renny varghese, mph, kaiser permanente georgia; suzanne baker, mph, university of alabama at birmingham; catarina i kiefe, md, phd, university of alabama at birmingham

background: patient-provider messaging in an electronic medical record (emr) system provides an opportunity to create and sustain productive patient-provider interactions. we elicited patient perspectives on design, benefits, and concerns to improve usability and efficacy of a proposed health informatics intervention to support surveillance of, and provider feedback on, over the counter (otc) non-steroidal anti-inflammatory drug (nsaid) use. methods: we conducted four focus groups involving kaiser permanente georgia (kpg) adults – years old who had a medical condition for which nsaids should be used cautiously or had a recent prescription for nsaids.
the focus groups elicited information regarding: otc nsaid use (including recognition of risks and side effects), design of an otc nsaid survey to be delivered via kp.org (the secure kp internet portal for patient-physician messaging), benefits and concerns about transmission of this information via electronic messaging to their primary care physicians, and

defining open science

author’s note: many of the words, diagrams, and ideas in this chapter make generous use of creative commons licensed materials. the basic framing and many words and sentences are borrowed directly from kramer and bosman’s defining open science definitions. the chapter was then developed by micah vandegrift, with some rewrites, edits, updates, and contextualization.

keywords: open science, open knowledge, open source, open scholarship, open research

overview

this chapter lays out some definitional landscape for “open science.” it offers a brief overview of key points, core topics, and common discussions in this area.

chapter outline
i. introduction
ii. frameworks
iii. characteristics
iv. horizon(s)
v. bibliography

i. introduction

open science encompasses a multitude of assumptions about the future of knowledge creation and dissemination. defining this term is important because it is picking up momentum in practical use as a shorthand umbrella term for a variety of activities that stem from a variety of principles on university campuses, across higher education and in affiliated industries. as global scholarship continues to be more deeply intertwined, concerns about the unequal availability of participation in human knowledge are being unearthed. open science is one of a few movements that are responding to the injustice of information access that tends to privilege the anglo-euro western culture and northern hemisphere.
because openness in higher education and research has been a public policy topic in europe for years, many of the core definitions, ideas, and concepts of open science come from the european union, member states, and organizations in europe. only recently has the united states begun to utilize the language of open science, due in part to the distributed nature of our higher education/research industry (we don’t have a department of higher education, science, and technology, for example), and also based on deeply entrenched ideals about american individualism and bootstrapism, which can often be resistant to the communal, share-alike orientation that open science represents.

https://im punt .wordpress.com/ / / /defining-open-science-definitions/

open science is “ongoing transitions in the way research is performed, researchers collaborate, knowledge is shared, and science is organised.” in the broadest spectrum, open science is related to open access (how academic publications are shared), open data (sharing raw materials of research), open source software (reuse and adaptation encouraged), open educational resources (barrier free teaching and learning), and many subcategories of each of these. additionally, open science bumps up against science and technology policy and the challenges and opportunities in the public policy sphere. to state it bluntly, a broad literacy in open science means to dabble a bit in each of these areas, and to pull good ideas and aspects from all of them into a way of doing or supporting research.

fig. - the eu’s foster taxonomy is an essential tool for visualizing the connections between these areas, and also referencing basic definitions for the related terms.
danielle robinson and the open source alliance helpfully define open as “transparent and freely available for use, reuse, remixing, and sharing… modif[ying] another term such as open source or open access, implying a difference from a conventional, closed or non-transparent approach.” open science in their estimation is a new way of doing research and scholarship, where the goal is the advancement of knowledge through gracious giving to the common pool of resource from which anyone with an internet connection can pull. as outlined in their umbrella diagram (fig. ), all the opens participate in creating a more just, equitable, diverse and welcoming system of knowledge.

from the european commission report “open innovation, open science, open to the world: a vision for europe.” accessible at https://op.europa.eu/s/ohun
https://www.fosteropenscience.eu/taxonomy/term/
https://osaos.codeforscience.org/what-is-open/

the broad scope of open science makes it unrealistic and counterproductive to expect there to be one unifying definition of open science that fits all. while there are common descriptors, the concept is evolving, so a helpful way to approach defining open science is to talk about what it does and to what it applies, rather than what it is.

so, what does open science do? open science, according to the national academies report open science by design, aspires to “increase transparency and reliability, facilitate more effective collaboration, accelerate the pace of discovery, and foster broader and more equitable access to scientific knowledge and to the research process itself.” mirrored in the national academies report and in the european union’s foster open science training handbook, the phrase “open science” applies to principles as well as practices (fig. ).
for example, a researcher might believe in open access in principle, and make judicious decisions about how to make their own work open. open science is a spectrum rather than an on/off switch. another helpful phrase from european open advocates is that research should be “as open as possible, as closed as necessary.”

cribbing elinor ostrom via open knowledge institutions: reinventing universities, https://wip.mitpress.mit.edu/pub/oki
https://book.fosteropenscience.eu/en/ introduction/
https://github.com/open-science-promoters/opensciencelogo
this phrase first began to appear in relation to privacy and data use as the eu adopted open research data policies, and has been adopted more widely in and across open science language. it is especially helpful when discussing sensitive research data about human subjects and/or endangered or marginalized populations. referenced in eu policy at https://op.europa.eu/s/ohuo.

we also need to be aware that some challenges of defining this area come from an english-language focus and euro-centric perspective. the german word wissenschaft “incorporates scientific and non-scientific inquiry, learning, knowledge, scholarship and implies that knowledge is a dynamic process discoverable for oneself, rather than something that is handed down” and is helpful for broadening open beyond just stem (science, technology, engineering, and math). more recently, phrases like “open scholarship” and “open knowledge” are being employed for wider utility. this chapter will default to the term “open science” in an effort to align with global efforts and established literature in this area. practically in the united states, a phrase like “open research and scholarship” would probably best represent the fullest variety of activities, methods, and principles, and also explicitly include, welcome, and make space for the social sciences and humanities in the conversation.

https://en.wikipedia.org/wiki/wissenschaft

ii. frameworks

much like digital scholarship and/or digital humanities, open science resists a monolithic definition. even so, it is a helpful umbrella for situating lots of related concepts, ideas, services, technologies, and projects. we tend to talk about these umbrellas as “frameworks”, which can encompass workflows (processes/procedures which are not always explicit), best practices, and/or just theoretical models. the idea of frameworks is helpful in shaping the concept of open science that is explored here.

fecher & friesike compiled a five “schools of thought” framework (fig. ) through which open science is approached:
● the infrastructure school - concerned with technological architecture
● the public school - concerned with accessibility or invitational qualities of knowledge
● the measurement school - concerned with evolving impact measurement
● the democratic school - concerned with free access to knowledge
● the pragmatic school - concerned with efficiency of collaborative research

fig. - fecher & friesike five schools

fecher and friesike’s framework proposes that these five approaches encompass most of the aspirations of open science. libraries working in this area tend to lean toward one or maybe two of these schools, often based on the character of the university they are attached to and the specific strengths of the librarians employed there. for example, nc state university, a land-grant, stem-focused university with deep connections to the state of north carolina through our cooperative agriculture extension program, fits squarely in the pragmatic school, invested in efficient and collaborative work.
florida state university, my alma mater and former place of employment, would fit much more comfortably in the democratic school, as etched in stone above dodd hall, “the half of knowledge, is to know where to find knowledge.” the “five schools” model offers a flexible suite of definitions for understanding what open science does and where it applies.

the knowledge exchange, a think tank of european researchers, developed an “open scholarship framework”, proposing that any open activity could be situated in between three dimensions: arena, research phase, and level (fig. ). for example, a post-doc’s software development project could be technological and focused on dissemination at a micro level, while simultaneously being part of a larger research group’s work that is challenging the social fabric of discovery across their discipline. while feeling a bit sterile and conceptual, this model is helpful for visualizing another important aspect of what open science does, as alluded to above: open science is a spectrum, effecting change in many ways concurrently, changing how researchers perform daily work, how universities value new forms of scholarly outputs, and also how governments invest in and extract value from higher education as an industry.

fig. - ke open scholarship framework

another important framework for understanding what open science does and applies to is the history of other “open” movements. a valuable trait in this history is that the predecessors of open science took care to document and refine their definitions over time, leading to a nice building block approach using common terms and ideas across the years. the precepts of open science arose from a few “open” movements that had several established framing definitions:
● free software ( ): “free software” means software that respects users' freedom and community. roughly, it means that the users have the freedom to run, copy, distribute, study, change and improve the software. thus, “free software” is a matter of liberty, not price.
● open source definition ( ): includes qualities and characteristics for something to be called open source, including: free redistribution, access to the source code, allowing and encouraging modified and derivative works, and some progressive claims about anti-discrimination.
● open access ( - ): ...open access refers primarily to scholarly literature that is “digital, online, free of charge, and free of most copyright and licensing restrictions. what makes it possible is the internet and the consent of the author or copyright-holder.”
● open definition ( ): “open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).”

https://en.wikipedia.org/wiki/the_free_software_definition
https://opensource.org/osd-annotated
http://legacy.earlham.edu/~peters/fos/brief.htm

building from this history, unesco issued a first draft of recommendations on open science in late , following a six month long period of global comments and discussion including participants from countries , underlining and responding to the need for a worldwide open science defining framework.
helpfully, that recommendation offered a mega-definition, affirming that open science is: “an umbrella concept that combines various movements and practices aiming to make scientific knowledge, methods, data and evidence freely available and accessible for everyone, increase scientific collaborations and sharing of information for the benefits of science and society, and open the process of scientific knowledge creation and circulation to societal actors beyond the institutionalized scientific community.” gesturing backwards to history again, the unesco authors hedge their bets and say the open science is “a complex of at least the following elements…” listing out lengthy definitions for open access, open data, open source hardware/software, open science infrastructures, open evaluation, open educational resources, open engagement of societal actors, and openness to diversity of knowledge. approaching open science through these frameworks, or others like them, has allowed a breadth of people to claim ownership in the term and utilize it as is helpful to describe their innovations in how they work, why they chose one perspective over another, and in what ways they distribute and invite others into their work. teased in the introduction, and aligning with the trajectory toward a more equitable and inclusive research environment, a remaining barrier for open science is the “science” part of that phrase. if the distilled version of open science is an evolutionary movement in the culture and behaviors of how academic ideas are created, shared, and used, it's clear that many other people who do not identify as scientists also have purchase in that movement. what we can best glean from the five schools, the three dimensions, and the mega-inclusive global concept is that open science is changing things, rapidly, and hopefully, toward a more equitable data, information, and knowledge future. iii. 
characteristics
the principles of open science, according to the national academies report mentioned earlier, work together to “increase transparency and reliability, facilitate more effective collaboration, accelerate the pace of discovery, and foster broader and more equitable access to scientific knowledge and to the research process itself.” taken individually, these principles in practice provide a loose set of qualities (not all encompassing) that we can look for in a research project or a scholar’s portfolio. while not a win/lose checklist, these characteristics can be helpful in identifying open interventions that one could encourage in partnerships, consultations, or collaborative projects. open science is greater than what it is, and being aware of core principles can help us get closer to describing more clearly what it does.
http://opendefinition.org/ https://en.unesco.org/science-sustainable-future/open-science/consultation https://unesdoc.unesco.org/ark:/ /pf .locale=en
deepening the list the national academies proposed, the open and collaborative science in development network (ocsdnet) proposed in their open science manifesto (https://ocsdnet.org/manifesto/open-science-manifesto/) that open science:
● enables a knowledge commons where every individual has the means to decide how their knowledge is governed and managed to address their needs
● recognizes cognitive justice, the need for diverse understandings of knowledge making to co-exist in scientific production
● practices situated openness by addressing the ways in which context, power and inequality condition scientific research
● advocates for every individual's right to research and enables different forms of participation at all stages of the research process
● fosters equitable collaboration between scientists and social actors and cultivates co-creation and social innovation in society
● incentivizes inclusive infrastructures that empower people of all abilities to make, and use accessible open-source technologies
● strives to use knowledge as a pathway to sustainable development, equipping every individual to improve the well-being of our society and planet
practically then, the chart below attempts to pair some open science practices to open science principles, in an effort to show what open might look like in action.

practice | principle
documenting data workflows | increases transparency by allowing the process of research to be more visible.
equitably apportioning credit across a research project | advocates for every individual's right to research and enables different forms of participation at all stages of the research process.
sharing research ideas through pre-prints or video abstracts early in the research process | accelerates the pace of discovery by circulating new knowledge.
producing non-technical, non-technological things (art works, community events, translations/interpretations, etc.) | recognizes cognitive justice, the need for diverse understandings of knowledge making to co-exist in scientific production.
clearly indicating copyright and licenses for the things you produce (data, articles, posters, graphics, software), and advocating for author-favored licensing in publishing | facilitates more effective collaboration by allowing anyone who encounters your work to immediately understand how they can use, build on, and re-share it.
advocating for revised tenure and promotion guidelines | practices situated openness by addressing the ways in which context, power and inequality condition scientific research.
resisting corporate monopolization of academic tools and systems (like major publishing companies owning open access repository software) | enables a knowledge commons where every individual has the means to decide how their knowledge is governed and managed to address their needs.

these characteristics, and the many not detailed in this chapter, are a snapshot of the kinds of actions a researcher or research collective might take to illustrate their commitment to open science. there is no shortage of articles, blog posts, conference presentations, or listicles on how to be an open researcher, indicating a growing shift in behavior. however, the culture of research production, steeped in the traditions of pre-digital higher education, is slower to recognize, value, and provide credit for many open practices. as open science practices become more commonplace the expectation is that systems like tenure and promotion will adapt to account for it. looking ahead to that time, statements like the vienna principles (https://viennaprinciples.org/) or the san francisco declaration on research assessment (https://sfdora.org/) offer a vision of “cornerstones of the future scholarly communication system” and “the need to improve the ways in which researchers and the outputs of scholarly research are evaluated.”
the characteristic practices of open science are not confined to the empirical research that is often stereotyped in the hard sciences. advancements in applied fields like education, sociology, public history, and fine arts can also connect and reaffirm goals and principles like those outlined above. a modern and progressive researcher might then talk about their work using familiar disciplinary terms (e.g. cultural heritage artifact) while also connecting to the increasingly common language of open science (e.g. fair data), effectively tying their work to this next phase of how we produce new knowledge.
iv. horizon(s)
in the end it’s perhaps more important to point to the increasing speed of developments towards open science, than worry about the exact definition of it. returning to the trans-atlantic perspective that opened this chapter, the latter years of the ’s have produced a groundswell of open science advancements, from individuals and communities building and aligning, to full-scale university programs, and perhaps most impactful, governmental and research funder policies shifting to full-throated support. concurrently, the global connectivity of research, riding the waves of open access and open educational resources thanks in large part to the maturity of the internet, has solidified the realization that an open science for north america and europe is not open at all.
in an essay for the geopolitics of open, chen, mewa, albornoz, and huang urge caution in the spread of open science from the northern to the southern hemisphere, writing that if it is not respectful and inclusive of local and diverse knowledge systems, “we will continue to witness the strengthening of systems that seek to be global and “open” research infrastructures, yet continue to limit wider and equitable participations from researchers in less powerful regions and institutions.” they continue, acknowledging the work of critical theory scholars studying the spread of open knowledge, “an uncritical uptake of “openness” that does not actively work to redress power imbalances in the current system of academic knowledge production - such as the primacy of knowledge written in colonial languages in historically dominant institutions and validated by international academic journals (chan ; czerniewicz ; canagarajah ) - threatens to replicate and amplify them.” knowing this, there is an implicit responsibility in claiming open science to be aware of defining its boundaries porously. kramer and bosman rightly point out that open science does not develop in a vacuum and is part of a broader movement towards open knowledge. they clarify that nicely, writing that open knowledge and within that open science should be open to the world, offering: translations, plain language explanations, outreach beyond academia, open to questions from outside academia, curation and annotation of non-scholarly information, actionable formats, and participation in public debate. the near horizon is to think globally and act locally; the distant horizon is to erase the barriers between the academy and the public. a broadly defined, principled, action-oriented open science movement may be part of realizing that vision for a more equitable, open knowledge environment.
see: global young academy (https://globalyoungacademy.net/activities/open-science/), university of utrecht - open science platform (https://www.uu.nl/en/research/open-science), plan s (https://www.coalition-s.org/), open research funders group (http://www.orfg.org/)
v. bibliography
fecher, b., & friesike, s. ( ). open science: one term, five schools of thought. https://link.springer.com/chapter/ . % f - - - - _ (cc-by-nc)
national academies of sciences, engineering and medicine. ( ). open science by design: realizing a vision for st century research. https://doi.org/ . /
bosman, j., & kramer, b. ( , march ). defining open science definitions [blog]. retrieved august , , from i&m / i&o . website: https://im punt .wordpress.com/ / / /defining-open-science-definitions/ (cc-by)
https://book.fosteropenscience.eu/en/ introduction/
http://www.knowledge-exchange.info/event/os-framework
albornoz, d., chen, g. (zhiwen), huang, m., mewa, t., cota, g. m., & solís, Á. o. Á. ( ). the geopolitics of open. https://hcommons.org/deposits/item/hc: /

using blogs as a communication tool for teaching students in the architecture design studio
procedia - social and behavioral sciences ( ) –
available online at www.sciencedirect.com
- © published by elsevier ltd. this is an open access article under the cc by-nc-nd license (http://creativecommons.org/licenses/by-nc-nd/ . /).
selection and peer-review under responsibility of the organizing committee of wces doi: . /j.sbspro. . . sciencedirect wces using blogs as a communication tool for teaching students in the architecture design studio maja bâldea a *, alexandra maier a, oana simionescu a a faculty of architecture and urbanism, polytechnic university timișoara, str. traian lalescu nr. , timișoara , romania abstract the research focuses on the way in which specific and dedicated blogs can be used as a tool for teaching and a channel of didactic dialogue with students, in relation to the activity of the architecture design studio at the faculty of architecture of the polytechnic university of timisoara. three different blogs for three different years of study have been developed at the beginning of the study year of / , as a necessary addition to traditional communication of an essentially applicative subject, at the initiative of the teachers involved in the design studio. the paper follows the activity and educational accomplishments of the blogs from their debut until present, comparing them at the same time, while also discussing different concepts on the use of blogs in the teaching process. the teaching experience offered by the blogs is discussed through the feedback requested at the end of the study year from both teachers and students that have been using and experiencing it. this feedback was also used as a basis of shifting to a new blogging platform for one of them, that can offer an improved educational experience for both students and teachers, integrating individual student blogs into the main blog of the ”class”. thus, the research depicts the positive and negative aspects of using blogs as communication tools in teaching students of the faculty of architecture, studying its direct implications in the didactic process. © the authors. published by elsevier ltd. selection and peer-review under responsibility of the organizing committee of wces . 
keywords: use of blogs, teaching tool, architecture education, design studio;
* maja bâldea. tel.: + - - - e-mail address: maja.baldea@arh.upt.ro
maja bâldea et al. / procedia - social and behavioral sciences ( ) –
. introduction
this paper focuses on the way in which a blog is used as a tool for teaching and a channel of didactic dialogue with the students, in relation to the activity of the architecture design studio at the faculty of architecture and urbanism of timisoara. it discusses a general view upon blogs implemented simultaneously at the same faculty while discussing in a broad sense the concepts and context of teaching methodology and use of new media.
. . didactic communication. concepts
didactic communication is one of the determining aspects of the educational process. it is based on the interpersonal relationship established between teacher and student and aims to transmit not only a set of precise information, but also knowledge exceeding the strict meaning of the sent message. in order to define the specific characteristics of didactic communication, one must start from the modern sense of the concept of communication. new visions regarding the teaching methodologies are due to the new theories of communication that imply various ways of knowing, learning and transmitting information using nonconformist communication. in fact, modern research (watzlawick, beavin & jackson, ) managed to determine a set of principles defined as axioms of communication, partially overlapping the specific features of the pedagogic discourse.
a review of these axioms is useful for the insight on possible nuances of the new didactic communication channels, as follows: communication is inevitable, meaning that in the given institutionalized context, the teacher must be aware of his ability to communicate; every communication has content and a relationship aspect, meaning that communication represents both the transfer of information and the relationship between those who communicate; communication is a continuous process in which the partners are involved in a chain exchange of action reaction, stimulus/response; communication takes either a digital (verbal, concrete meanings) or analogical (non-verbal, representative or referential) form; all communication is either symmetrical or complementary, based on equality or difference (“mirrored” behaviour - equal, or complementary behaviour, one of the interlocutors is designated a priori). two additional principles can be added: communication is irreversible, producing an effect on the receptor on which one cannot intervene retroactively, which means fast and lucid decision taking skills for teachers; communication involves processes of adjustment and adaptation (the message makes sense only in the light of previous life and linguistic experience of each individual) (pârvu, ). to conclude with, everything from the body language to the relationship between the teacher and his audience defines the act of communicating. . . didactic communication in the digital age. context the digital age offers several interactive tools through which one can develop knowledge and relationships. most of them are used daily by students and have a great impact upon their lives. in this context, blogs appear as a tool presenting a great potential within the educational process, because students are accustomed to this way of information processing. this raises a few questions to those who lead the educational process. 
how do you use the technology available to extend your time and space? how do you communicate with the outside world through cybernetic technology? are you a generator of information? are you willing to exchange information with others? what would be the results? can you successfully generate, store and share information? digital technologies of communication have broken the paradigms of the industrial society and brought on other communication channels, affecting our daily lives and becoming more visible in education too. according to moran (moran, ), “communication becomes sensory, multidirectional and non-linear”. there are visible changes in the way we learn and absorb information, since instant messaging and social networks of the present provide a fragmented communication, while reality is built on the model of a kaleidoscope of dynamic, discrete and multiple stimuli of short-term nature. in this context alternatives to the traditional educational system appear, which can be incorporated into daily educational processes, relying on collaborative learning, connectivity and mobility. the entire scholarship concept has changed due to the extreme openness to higher education information brought about by the new technologies, triggering the concept of digital scholarship (weller, ).
. . using the blog as a teaching tool in architecture teaching
blogs as tools for teaching are used by different universities of architecture in the world, as an attempt to counteract the discrepancy between the practice of an educational activity inherited from an industrial society and the adoption of new communication practices based on new technologies. the current challenge consists in finding ways to transform the teaching-learning relationship, so these new ways of interaction with the world can positively contribute to the architecture education.
.
a comparative research on the use of blogs of the design studio beginning with study year / , three parallel blogs were generated in order to facilitate easy communication with students involved in the design studio. they were implemented simultaneously at the st (http://arhitectura tm.wordpress.com), nd (http://arhitectura tm.wordpress.com) and th (http://arhitectura tm.wordpress.com) year, in order to overcome deficiencies in communication perceived in previous activity of the teaching staff. the primary need came from the specific character of the design studio, consisting in workshops where students individually develop their projects, guided by teachers. the blogs were intended to fill in gaps in direct communication during studio hours, as completing communication by displaying design tasks, documentation sources or theoretical support. they also serve to communicate informative notices on the teaching process such as approaching deadlines, workshop materials needed or specific events. . . comparison of the previous teaching activity, without blog, with the current one which uses the blog the previous systems that were used for communicating with the students in the design studio implied either direct communication, mainly verbal with all the students in the class, or by electronic communication with only some students whose mission was to transmit the information to their classmates. the former use of these methods is discussed in what follows (table ), their disadvantages being the main trigger that started the use of blogs to improve teacher-student communication. direct communication consisted mainly in direct verbal communication, used to transmit general announcements to all students gathered together, followed by the division of students into workgroups, where the basic information was further interpreted. electronic communication supplemented direct communication and involved transmission of e-mails to the student’s representatives. 
it has not been used often, since the reception of the message by students was cumbersome.
table . advantages and disadvantages of the direct and electronic communication methods
method | advantages | disadvantages
direct communication method | direct interaction teachers-students that allows swift verification of hypotheses. | a large number of students, not all of them can listen to the announcement from a correct ergonomic position.
direct communication method | the transmission of an idea, with input from the entire teaching group. | the class spaces are not right for this type of interaction, being designed to hold a smaller number of students.
electronic communication method | supplements information. | use of intermediaries in transmission: the message reaches the larger group through some representatives.
electronic communication method | high-speed transmission of general information to all students. | it is difficult to transmit differentiated messages for a particular group (individual workgroup).
electronic communication method | | the message has little visibility due to the amalgamated character of information in the mailing group.
. . content
the concurrently implemented blogs had similar graphic and content organization schemes, but later developed personalized representations. important differences appear from the point of view of the content, of the materials and of the way in which information is communicated. the differences in managing content and in the way teachers communicate the information (table ) derive on the one hand from the type of communication that each teacher’s team chooses to follow, and on the other hand from the previous experience of communicating information on blogs or generally via internet of the ones responsible for blog postings.
table .
differences in managing the content of blogs for the st, nd and th year of study
year of study | content particularities
st year: workgroups | the only blog that contains a theoretical support for the design studio workshop derived from the fact that the dean of the st year is also teaching the theory class of architecture theory.
st year | the only year that has a separate facebook page for its activities, where more dynamic and informal information is shown, so that the blog may hold only formal information.
st year | a sharp organization of the blog, carrying out general communication with all students.
nd year: workgroups | a blog that combines general communication on the central newsfeed page, containing information of general character which concerns all students, and also separate categories for each workgroup, targeting a differentiated working approach within the workshop’s evolution.
th year: workgroups | a blog pursuing a newspaper-like communication, information on the main page having an amalgamated character, containing main and secondary information organized exclusively by time-line ordering.
. results
the study discussed the results after one academic year, based on teachers’ and students’ feedback. the opinions of the teachers who were using the blogs (table ) cover the perceived advantages and disadvantages and are intended to show the characteristic points of each year’s approach.
table . a comparison of advantages and disadvantages in the use of blogs at the st, nd and th year
work structure | advantages | disadvantages
st year: students / teachers ( dean, teaching assistants, phd students, externs) / workgroups | communicating through the blog has shortened the path of information from teachers to students. | the architecture theory course presented on the blog tends to be overlooked.
st year | communication and dissemination of information in digital format is essential, providing storage for information where you can always return to. | teachers presume that all students read everything that is written on the blog, but in reality this is not the case.
st year | it is a convenient information process. | some students don’t benefit of internet access.
nd year: students / teachers ( associate professor, dean, teaching assistants, phd students, extern) / workgroups | allows the pursuit of information by everybody involved in the teaching process and the return. | not all students know how to access information within the sub-menus.
nd year | quick communication, lacking redundancies. |
nd year | differentiated announcements for each workgroup. |
th year: students / teachers ( associate professor, teaching assistants, phd students, extern) / workgroups | providing a dynamic platform for discussions (obtaining feedback on the project theme through links posted by students on the blog). | the discussions still took place only during the workshop, understandable since there are two weekly workshops that facilitate direct encounter.
th year | a greater interest and better focus on the project theme and on the guidance offered by teachers. | the blog failed to become a discussion platform, providing only mutual information.
th year | the traceability of the activity of the semester. |
the students’ feedback can be assessed either through blog access statistics, which cannot show how many of the people accessing the blog were students, or through organized inquiries, but only the teachers of the nd year requested a questionnaire survey on the activity. the respondents were positive on the use of the blog, % of them considering that they have been exhaustively informed by it, while % stated that they have read and considered the bibliographical references useful.
still, the students’ general response to the blogs, shown by direct comments or by informal feedback provided during the classes, was weak, and only a very small number expressed their opinions or interacted actively with the current content of the blog. by comparing the blogs, the primary finding is that each teaching group uses its own blog according to its specific communication needs. comparing the opinions of the different teaching groups, a positive view stands out regarding the visible improvement of communication to the students, while at the same time a weak response from the students is observed in relation to communication via a dynamic electronic medium. a common result, not initially anticipated but revealed by the teachers’ feedback, is that higher levels of group identity and social cohesion have been created than in previous years, apparently strengthened by the publication of images of the teaching process and the involvement of everybody in the group. still, the issue of reduced student interaction with the blogs remains, so we propose several hypotheses. on one hand, the institutionalization of educational communication gives the professor a special status in the relationship with students, possibly causing inhibitions in a symmetrical communication. also, the low number of comments may show that the students feel responsible for making qualitative and relevant comments, which can inhibit much of the possible interaction. participating through comments on the blog’s content occurs more frequently with the th year students, partially because professional opinion is built over time, and the students in the lower study years don’t have sufficient professional knowledge to be able to validate their views. the latter assimilate information at a more intuitive level, and this aspect is taken into account in the communication strategy of the blogs of the st and nd year.
.
discussions, conclusions, recommendations the only blog that changed in the current academic year as a result of last years’ experience is the blog of the nd year, by turning to the edublogs.org hosting platform that allows a more dynamic interaction, containing a forum and also supporting individual student blogs administrated by each student that should be used to present individual work during the semester, that can be embedded in the main page of the class blog. the results are yet unclear, but students only upload work upon teacher’s request and don’t yet identify their blog as a personal tool of representation and also fail to interact with each other’s work via individual blogs. the conducted research proves that the implemented blogs did improve teaching communication, proving to be successful instruments in supplementing and sustaining it. the implementation of blogs represented an incipient form of embracing new media that only shows the potential of using blogs. a general positive impact has been achieved in the communication of teachers and students, but the main future development task of the current blogs would be to achieve a greater involvement of students in the process and converting the existing platforms in instruments with a more dynamic character. we consider that a key issue in the adequate functioning of blogs is the connection between the character of communication and the specific needs of each year’s students, given by their age and interests. references moran, t., p. ( ). introduction to the history of communication. evolutions and revolutions. new york: peter lang. pârvu, i., ( ). filosofia comunicării. bucureşti: ed.comunicare.ro. sălăvăstru, d. ( ). psihologia educaţiei. iași: ed. polirom, pp. - . watzlawick, p., & beavin, j.,h., & jackson, d., d., ( ). pragmatics of human communication: a study of interactional patterns, pathologies and paradoxes. new york, w.w. norton &company. weller, m. ( ). 
the digital scholar: how technology is transforming scholarly practice. basingstoke: bloomsbury academic. conference full paper template published in interlending and document supply , no. : - open access: help or hindrance to resource sharing? tina baich, iupui university library introduction the growing acceptance of the open access movement has created an increasingly large body of free, online information that library users may have difficulty navigating. students, in particular, may not be fully aware of open access and the corpus of knowledge available to them. as a result, users still request open access materials through interlibrary loan (ill) despite their ability to access these materials directly. the resource sharing & delivery services (rsds) department of indiana university-purdue university indianapolis’ (iupui) university library began tracking ill borrowing requests for open access materials in . rsds tracks any request fitting the general criteria of open access content established by peter suber: “digital, online, free of charge, and free of most copyright and licensing restrictions” (suber, ). therefore, the collected data include requests for grey literature, electronic theses and dissertations (etds), and public domain works in addition to open access journal content. in , the author used the collected data to study open access borrowing requests over two fiscal years (july -june ) (baich, ). this period showed an increase in open access requests while overall borrowing requests held relatively steady. this paper presents an update on that research using data for july -june . the new study provides evidence that the number of borrowing requests for open access documents continued to grow in the ensuing two years. literature review it is a commonly held belief that interlibrary loan is and will continue to be adversely affected by the growth of open access. 
studies conducted in japan and belgium do show a reduction in the number of ill requests and place at least partial blame on open access (koyama et al., ), (corthouts et al., ), but schöpfel recently stated there is “little empirical evidence for this causal relation” (schöpfel, ). (mcgrath, ) also notes the many situations in which interlibrary loan will still be necessary despite any negative impact of open access. the experience at iupui university library has so far run counter to the argument of open access as a hindrance to resource sharing. open access has not reduced the number of article requests to date, and users actually request open access materials through interlibrary loan. the author could locate only one study which shares the view that open access is “an extraordinarily useful source for librarians to perform document delivery service” (hu and jiang, ). one of the key reasons users submit ill requests for open access materials is difficulty with discovery. there are a vast number of resources for locating open access materials, but users want ease of access. connaway, et al. found this is so imperative for users that they will “readily sacrifice content for convenience” (connaway, et al., ). additionally, a report on the findings of twelve user behavior studies found that google and other search engines are increasingly central to the search for information (connaway and dickey, ). in fact, when “information consumers” were asked by oclc research where they begin their information search, percent indicated beginning in a search engine while not a single person began their search on a library website (derosa, et al., ). as kroll and forsman note, “researchers find google and google scholar to be amazingly effective in finding isolated bits of information or getting to publications or findings of interest to them” (kroll and forsman, ). 
as a result, users are unlikely to search multiple resources for the information they seek both out of convenience and the possible perception that what they seek has been found. these user behaviors present a particular problem for open access content housed in repositories. google scholar doesn’t follow the same metadata standards as libraries, which causes a level of incompatibility that can impact discovery. google scholar does have inclusion guidelines for webmasters to help increase the likelihood an open access repository will be indexed, but some libraries may lack the knowledge or resources to implement these guidelines (artlisch and o’brien, ). artlisch and o’brien found that “in general, irs [institutional repositories] that followed these guidelines had a much higher indexing ratio ( - percent) than sites that did not ( - percent)” (artlisch and o’brien, ). the current inconsistency in discovery of open access content through a google or google scholar search has a negative impact on user discovery. the discovery problem extends beyond open access materials. the most recent study of literature regarding ill requests for owned items summarizes the literature by stating, “most … found that interlibrary loan requests for items owned or available through electronic access through the library represented percent or greater of the total cancelled requests” (kress et al., ). while there are a number of factors that can cause users to place requests for owned items, one of the key issues is similar to that for open access materials. libraries offer numerous methods for locating an item – online catalogs, databases, a-z e- journal lists, and openurl link resolvers – that retrieve different formats and results. this does not align with users’ need for convenience and ease of access and may result in a greater reliance on ill to locate information. 
the initial study of ill requests for owned items suggested that users may “take the line of least resistance in a search and believe that if it is not in the first place they look, it must not exist” (yontz et al., ). this proves to be a prescient statement in light of later research. as discussed earlier, users’ demand for ease of access has only increased in the ensuing years. overview of iupui and university library iupui iupui is an urban university with nineteen schools and academic units from both indiana university and purdue university enrolling more than , students. iupui is administratively linked to indiana university (iu) and is considered a core campus in the iu system along with bloomington. the iu system also includes six regional campuses around the state. iupui has its own extension campus, indiana university-purdue university columbus, located approximately forty-five miles south of indianapolis (iupui, n.d.; indiana university, n.d.). all indiana university campus libraries collaborate in a number of ways including a shared online catalog, a remote circulation service, and some shared subscriptions. university library iupui university library serves the faculty, staff and students of all iupui schools except the law, medicine and dentistry schools, which have their own libraries. the herron school of art also has its own library, but it reports administratively to the dean of university library. resource sharing services for herron users are provided by university library. david w. lewis, the dean of university library, is a well-known proponent of the open access publishing model and has written a number of articles on scholarly communication and open access. one of his earliest works on the topic is an unpublished paper titled “six reasons why the price of scholarly information will fall in cyberspace” (lewis, ). 
guided by dean lewis’ vision, university library has developed a strong structure and services to support the scholarly communication of iupui faculty. one of the library’s earliest open access initiatives was the launch of an institutional repository in . the largest collection within the institutional repository, now known as iupui scholarworks, is that of electronic theses, dissertations, and doctoral papers from iupui graduates. the growth of this collection was largely facilitated by the iupui graduate office’s etd submission requirement, which began in . in , the library launched its first hosted open access journal in open journal systems (ojs). at the time of its transition to ojs, advances in social work was edited by an iupui faculty member. that same year, university library refocused the existing digital libraries team as the digital scholarship team to be more inclusive of scholarly communication and open access issues. the increased focus on scholarly communication resulted in the addition of several librarian positions to the team including a digital scholarship and data management librarian ( ); scholarly communications librarian ( ); digital scholarship outreach librarian ( ); digital user experience librarian ( ) and digital humanities librarian ( ). these new positions are largely focused on supporting iupui faculty research and its dissemination to a broader audience. 
in , the digital scholarship team increased its outward focus again with the creation of the iupui university library center for digital scholarship to enrich the research capabilities of scholars at iupui, within indiana communities, and beyond by: • digitally disseminating unique scholarship, data, and artifacts created by iupui faculty, students, staff and community partners; • advocating for the rights of authors, fair use, and open access to information and publications; • implementing and promoting best-practices for creation, description, preservation, sharing, and reuse of digital scholarship, data, and artifacts; • strategically applying research-supporting technologies; • teaching digital literacy (iupui university library, ). university library also launched another major service in in the form of the iupui open access publishing fund. this fund “underwrites reasonable publication charges for articles published in fee-based, peer-reviewed journals that are openly accessible” (iupui university library, ). though administered by university library, financial support comes from several key campus stakeholders including the office of the vice chancellor for research, iu school of dentistry, robert h. mckinney school of law, and university library. since its launch, the fund has supported the publication of sixteen faculty articles representing nine of iupui’s schools (jere odell, personal communication, dec. ). in september , the director of the center was elevated to associate dean for digital scholarship once again emphasizing the importance of scholarly communication and open access to the library. open access policies in april , the iupui library faculty, the governing body for all librarians from the campus’ five libraries plus columbus, passed a deposit mandate in affirmation of its support of the open access movement. 
the mandate requires iupui and iupuc librarians to deposit their scholarly articles in the institutional repository, iupui scholarworks. after months of work, the iupui faculty council’s library affairs committee introduced an open access policy. following discussion at iupui faculty council meetings and a series of town hall meetings, the iupui faculty council passed the open access policy on october , , becoming the first indiana university faculty body to do so. the policy requires the deposit of faculty-created scholarly articles in iupui scholarworks.

overview of resource sharing operations

iupui university library’s rsds department provides interlibrary loan and document delivery services to the faculty, staff and students of all iupui schools except the law, medicine and dentistry schools, which have their own libraries. university library also has an agreement with martin university, a local university without its own library, to provide ill services to its affiliates. rsds consists of half an fte librarian, three fte staff (two of which have responsibility for resource sharing services) and two-three fte student employees. iupui university library is an oclc supplier, participates in rapidill, and uses the oclc illiad ill management system. total ill borrowing requests have decreased slightly over the past three fiscal years, but each decrease can be attributed to fewer loan requests. when borrowing copy requests are considered separately, the statistics show an increase in this type of request every fiscal year since / . the large increase in article requests in / can be attributed to the implementation of a document delivery service for articles and book chapters owned by the library. these trends are illustrated in figures and .

figure . borrowing requests submitted by fiscal year

figure . borrowing copy requests submitted (columns: fiscal year, submitted, % change from previous year)

open access ill workflow

the rsds department utilizes the oclc illiad ill management software, which supports the creation of custom routing rules, queues, and emails that assist staff in automating workflows. two custom queues, “awaiting open access searching” and “awaiting thesis processing,” allow staff to monitor potential open access borrowing requests. items published in the us prior to are considered to be within the public domain and free from copyright restrictions. a custom routing rule directs any borrowing request with a pre- publication date into the “awaiting open access searching” queue so staff can search for freely available electronic copies prior to sending the request to another library. staff members use illiad addons, which automatically execute searches in hathitrust, internet archive, and google or google scholar based on information in the request. with the increase in availability of electronic theses and dissertations (etds), staff members now search for open access versions if a title is not part of our proquest dissertations & theses (pqdt) subscription before submitting a request via oclc. the “awaiting thesis processing” queue facilitates this by segregating all requests with a document type of thesis or containing the phrase “dissertation abstracts.” when a thesis or dissertation request is submitted, an rsds staff member first searches the pqdt database to determine whether iupui university library has access through its subscription. if access is not possible through proquest, the staff member searches google scholar and/or google for an etd deposited in an institutional repository. it is only after failing to find an etd that the staff member will turn to oclc where she will confirm there is no electronic resource record or url included in the print record. if no etd is located, the staff member will submit a request for a physical copy from another library. a visual representation of these workflows is depicted in figure .
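the routing and thesis-search steps described above can be sketched in a few lines of python. this is only an illustration of the decision logic, not illiad's actual routing-rule syntax: the class, the function names, and the "default processing" queue label are hypothetical, the public domain cutoff year is left as a parameter because the workflow description is the only source for it, and the order in which the two rules are checked is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BorrowingRequest:
    """Hypothetical stand-in for the fields a routing rule would inspect."""
    document_type: str                 # e.g. "article", "thesis", "book"
    publication_year: Optional[int]    # None when the year is unknown
    citation_text: str = ""            # free text submitted with the request


def route_request(request: BorrowingRequest, public_domain_cutoff_year: int) -> str:
    """Return the queue a new borrowing request should enter.

    Assumption: the thesis rule is checked before the public domain rule.
    """
    text = request.citation_text.lower()
    if request.document_type.lower() == "thesis" or "dissertation abstracts" in text:
        return "awaiting thesis processing"
    if (request.publication_year is not None
            and request.publication_year < public_domain_cutoff_year):
        return "awaiting open access searching"
    # Hypothetical label for everything the two custom rules do not catch.
    return "default processing"


def thesis_fulfilment_source(in_pqdt_subscription: bool,
                             etd_found_on_web: bool,
                             oclc_record_has_url: bool) -> str:
    """Mirror the staff search order: PQDT, then Google/Google Scholar, then OCLC."""
    if in_pqdt_subscription:
        return "pqdt subscription"
    if etd_found_on_web:
        return "open access etd"
    if oclc_record_has_url:
        return "electronic copy linked from oclc record"
    return "physical copy requested from another library"
```

a caller would pass the library's own cutoff year, for example `route_request(request, public_domain_cutoff_year=cutoff)`, so the sketch does not hard-code a date.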
figure . open access ill workflow all other article requests are sent into the rapidill system, which also checks for open access titles. very few of the open access requests received by iupui university library are fulfilled through rapidill’s open access check ( of , requests, or . %). staff members search the article title in google scholar for open access versions when requests are returned from rapidill as unfilled. rapidill also returns requests that are part of our local holdings, which can result in the identification of additional open access items when requests are searched in the library’s e-journal portal. the library uses serials solutions as its vendor for electronic resource management. within the administrative module, it is possible to activate “subscriptions” to various open access journal collections. an example of how this appears in the user interface is shown in figure . thanks to this feature, resources such as pubmed central and the directory of open access journals as well as various collections of freely accessible journal titles are linked through the library’s e-journal portal. this allows staff to fill requests for gold oa articles as well as those green oa articles archived in pubmed central with minimal searching and without burdening possible lenders with requests for open access materials. figure . example open access article as located in e-journal portal post- conference paper and report copy requests are screened for open access versions prior to submission to oclc. specific open access searching is typically not done for post- book chapter or loan requests, but staff members are conscious of electronic resource records in oclc and may sometimes identify an open access item based on the url included in the record. extensive searching for open access options does not occur for book chapter and loan requests until all other borrowing options have been exhausted. 
when an open access item is located, the staff member enters tracking information into the call number and location fields within the request form and records “open” or “etds” (depending on the document type) as the lending library. she then saves the pdf to the illiad web server and sends the user a custom email notifying him both of the document’s availability on his account and of its location on the open web. requests for which an open access version is located are considered filled by rsds since the staff member has used her time and expertise to find and deliver the item to the user.

data overview

since the publication of the author’s study, open access requests have increased by - percent each year. figure shows the number of borrowing requests filled with open access materials during fiscal years through . despite these substantial increases, open access requests only account for percent of total borrowing copy requests ( , of , ).

figure . open access borrowing requests by fiscal year

as stated in the introduction, students may not be fully aware of open access and the corpus of knowledge available to them. the number of requests by user status seems to support this assertion. alternatively, or perhaps in addition, students may have greater difficulty with the discovery issues covered in the literature review. when taken in combination, iupui undergraduate and graduate student requests account for percent of open access requests received in / and / . if requests from martin university students are added, then student requests account for percent of open access borrowing requests. figure shows the number of open access borrowing requests by user status.

figure . open access borrowing requests by user status

users representing unique departments or schools submitted open access requests in / and / . figure shows the number of open access requests submitted by users from the top fifteen departments or schools.
of the top fifteen departments, seven are stem or health sciences disciplines. the amount of open access materials available in these disciplines may relate to the public access policies enacted by the national science foundation and national institutes of health, which require that research funded by these federal agencies be accessible to the public.

figure . open access borrowing requests by user department or school

open access document types and resources

the , open access requests received during fiscal years and represent a variety of material types (see fig ).

figure . open access borrowing requests by document type (columns: doc type and the two fiscal years; rows: article, book/chapter, thesis, conference, report, grand total, percent change on previous year)

article requests

nearly three-quarters (n= , , %) of open access requests in fiscal years and were for articles. these requests were filled from a wide variety of both gold (open access journals) and green (self-archiving) sources. though it is difficult to identify percentages with precision due to a variety of factors (e.g., language barriers, changes in access, multiple oa options), only article requests ( %) can be clearly identified as gold oa. an additional open access articles were retrieved from the sites of journals that still rely primarily on a subscription-based publishing model. thirty article requests were for public domain materials and were filled using hathitrust, internet archive and library digital collections. based on this analysis, the majority of open access article requests were filled via green (self-archiving) sources. within the administrative module of university library’s electronic resource management system, it is possible to activate “subscriptions” to open access journal collections. access to these collections then becomes available in the e-journal portal (see fig above).
the activation of open access collections within university library’s e- journal portal resulted in the location of ( %) articles in open access journals and repositories (see fig ), which is a decrease from % from the previous study. another ( %) requests were filled from open access journals not included in e-journal portal open access collections; while more than ( %) requests were for open access articles included in journals that still rely primarily on a subscription model. figure . number of open access borrowing requests filled through e-journal portal open access repositories were a major source for articles. there are several types of open access repositories including subject, institutional, consortial, and national. though no one subject repository was the location for a significant number of articles, subject repositories as a whole provided access to articles (see fig ). eighty-four open access article requests were located in institutional repositories, while consortial and national repositories such as dialnet, redalyc, and scielo accounted for another requests. when taken together, these open access repositories represented percent of total open access borrowing requests for articles. figure . number of open access borrowing requests filled through subject repositories (including e- journal portal) subject repository number of requests arxiv.org citeseerx digital library for physics and astronomy education resources information center (eric) europe pubmed central project euclid pubmed central optics infobase organic eprints total dialnet is the consortial repository of spanish libraries. redalyc is a repository of iberoamerican journal content based in mexico. scielo stands for scientific electronic library online and exists in a number of iterations to serve as the national repository of various south american countries. 
book and book chapter requests books and book chapters represented only percent (n= ) of open access requests, which is a decrease from percent in the previous study. more than two-thirds ( %, n= ) of book and book chapter requests were published in the th and th centuries with another percent (n= ) published in the th, th, and th centuries. twenty-six percent (n= ) were published in the st century. one item had an unknown publication date. the majority ( %) of freely available books were located in hathitrust ( ) and internet archive ( ). this is a shift from the previous two fiscal years when the most common source was google books, which is down to just two requests from . in all likelihood, this is due to the ability to automatically search hathitrust and internet archive within the ill management system. though still a small percentage, four requests ( %) were for recently published open access e-books, which is a new development from the previous study. thesis and dissertation requests theses and dissertations accounted for percent (n= ) of total open access requests, which is a decrease from percent in the previous study. however, a greater percentage ( %) of total thesis and dissertation borrowing requests (n= ) were filled using etds than in the previous study ( %). not surprisingly, graduate students were the most frequent requesters of etds ( %). ninety-three percent of etds were located within an open access repository. the institutional repository of the granting institution was most common with percent (n= ) of requests followed by the consortial repository ohiolink etd center with percent (n= ). the national repositories theses canada portal ( ) and ethos ( ) comprised percent of the total etd requests. one request was located in the subject repository, education resources information center (eric). following open access repositories was a new entrant to the etd field, pqdt open. 
thesis and dissertation authors now have the option to publish their work as open access through proquest’s umi dissertation publishing service for a fee. these etds are available both through the proquest dissertations & theses subscription database as well as the public interface, pqdt open (proquest, ). six requests ( %) were filled with etds published as open access through proquest. conference paper and report requests conference papers represented three (n= ) percent of open access borrowing requests, which is a substantial decrease from the previous study where conference papers accounted for percent of the total. this is primarily due to changes with all academic, an online conference management tool that was previously an excellent source for open access conference papers. in the previous study, percent (n= ) of open access conference papers were located in all academic or the related repository, political research online, compared to percent (n= ) in the current period under study. linking within the all academic site appears to have changed causing many dead links from google scholar. once a search is re-executed in all academic, the index page for a given paper can be confusing and frequently does not yield a link to the full-text. these changes have greatly reduced the usefulness of all academic for locating conference papers. instead, conference papers were located in a variety of repositories and websites including those of the conference or sponsoring organization. in the previous study, reports represented such a small number (n= ) of open access borrowing requests that they were not discussed. however, report requests are now more numerous than those for conference papers at ( %) of the , open access requests. of these requests, percent (n= ) were located on the issuing institution’s website and percent (n= ) in eric. 
other sources included government agency websites ( ), open access repositories ( ), and the national criminal justice reference service ( ). conclusion the discovery problems surrounding information retrieval do not align with users’ need for convenience and ease of access and may result in a greater reliance on ill to locate information. an example can be taken from iupui university library’s own document delivery service. rsds offers document delivery of articles and book chapters from the library’s print collection for all users. however, users do not limit their requests to items from the print collection. from july through june , rsds filled , document delivery requests of which percent were available through the library’s electronic holdings. users clearly find it easier to request through ill rather than completing the search process themselves even though this means a delay in access. the data presented here show that this is clearly the case for open access materials as the number of ill requests for such content steadily rises. the request volume and discovery problems may make open access feel like a hindrance to resource sharing. ill practitioners may themselves be overwhelmed or frustrated by the number of possible sources for open access materials. the growth in the number of requests for these materials also adds a manual workflow and the burden of filling requests that could have been located by the user. despite these potential drawbacks to the use of open access materials in ill, the benefits are clear. open access helps resource sharing in three ways. first is the increased ability to fulfill borrowing requests. theses and dissertations as well as grey literature like conference papers and reports are notoriously difficult to obtain due to lack of holdings or unwillingness on the part of the owning library to lend. 
in these instances, open access is an enormous help to ill practitioners in that it allows them to obtain materials for users that they may not be able to otherwise. second is speed. by utilizing open access materials, the turnaround time for these requests is greatly reduced. the requests do not need to be sent to other libraries or handled by lending library staff. a parallel can be drawn between the difference in turnaround time for borrowing versus document delivery requests since document delivery requests can be filled with material immediately at hand just as requests for open access materials can be. the rsds department’s overall turnaround times for borrowing and document delivery requests during the two years under study vary by . days. if you limit the comparison to items delivered electronically, the borrowing turnaround time was . days while the document delivery turnaround time was . days. immediate access to the material requested saved . days, a clear benefit to ill services and users. third is cost. since open access materials are free of charge, libraries are saved potential borrowing and shipping fees that a typical ill transaction could incur. during the two years included in this study, rsds filled , borrowing requests using open access materials. the potential cost of borrowing these items through traditional ill is $ , . based on mary jackson’s cost estimate of $ . per borrowing transaction (jackson , p. ). by utilizing open access materials, the cost for these requests is reduced to a minimal amount of staff time. these benefits will outweigh the potential pitfalls especially as open access continues to grow. if ill practitioners want the number of requests for open access materials to decrease, we need to take an active role in the education of our users through our websites, electronic communications, and by working with our colleagues to embed information about open access in instruction. 
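the cost and speed arguments above reduce to simple arithmetic, sketched here in python. the function names are hypothetical, and the figures in the usage note are placeholders rather than the study's values; the per-transaction cost is the kind of estimate the text attributes to jackson.

```python
def estimated_borrowing_cost_avoided(open_access_fills: int,
                                     cost_per_borrowing_transaction: float) -> float:
    """Potential cost of the same requests had they gone through traditional ILL."""
    return open_access_fills * cost_per_borrowing_transaction


def turnaround_days_saved(borrowing_turnaround_days: float,
                          document_delivery_turnaround_days: float) -> float:
    """Difference between borrowing and document delivery turnaround times."""
    return borrowing_turnaround_days - document_delivery_turnaround_days
```

for instance, with placeholder inputs of 100 requests and a 10.00 per-transaction cost, `estimated_borrowing_cost_avoided(100, 10.0)` gives the amount a library would have spent borrowing those items. staff time spent searching is not modelled here.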
as expert searchers, ill practitioners are also perfectly positioned to assist their colleagues in improving the discovery of open access materials. users should be able to discover open access items with ease using intuitive, user-friendly systems and interfaces. in the meantime, ill practitioners must embrace the idea that we provide a vital service in aiding users with the discovery of open access resources as well as the benefits this large body of literature provides us. references artlisch, k. and o’brien, p.s. ( ), “invisible institutional repositories: addressing the low indexing ratios of irs in google scholar”, library hi tech, vol. no. , pp. - . baich, t. ( ), “opening interlibrary loan to open access”, interlending & document supply, vol. no. , pp. - . connaway, l.s. and dickey, t.j. ( ), the digital information seeker: report of the findings from selected oclc, rin, and jisc user behavior projects, the higher education funding council for england, on behalf of jisc, available at: http://www.jisc.ac.uk/media/documents/publications/reports/ /digitalinformationseekerreport.pdf (accessed may , ). connaway, l.s., dickey, t.j. and radford, m.l. ( ), “‘if it is too inconvenient, i’m not going after it:’ convenience as a critical factor in information-seeking behaviors”, library and information science research, vol. , pp. - , pre-print available at: http://www.oclc.org/research/publications/library/ /connaway- lisr.pdf (accessed may , ). corthouts, j., van borm, j. and van den eynde, m. ( ), “impala - : years of ill in belgium”, interlending & document supply, vol. no. , pp. - . derosa, c. et al. ( ), perceptions of libraries, : context and community, oclc, dublin, oh, available at: https://www.oclc.org/en-us/reports/ perceptions.html (accessed may , ). 
hu, f. and jiang, h. ( ), “open access and document delivery services: a case study in capital normal university library”, interlending & document supply, vol. no. / , pp. - . indiana university (n.d.), “campuses”, available at: http://www.iu.edu/campuses/index.shtml (accessed may , ). iupui (n.d.), “about iupui”, available at: http://www.iupui.edu/about/ (accessed may , ). iupui university library ( ), “iupui open access publishing fund”, available at: http://www.ulib.iupui.edu/digitalscholarship/oafund (accessed january , ). iupui university library ( ), “center for digital scholarship: about”, available at: http://www.ulib.iupui.edu/digitalscholarship/about (accessed january , ). jackson, m.e. ( ), assessing ill/dd services: new cost-effective alternatives, greenwood, westport, connecticut. koyama, k., sato, y., tutiya, s., and takeuchi, h. ( ), “how the digital era has transformed ill services in japanese university libraries: a comprehensive analysis of nacsis-ill transaction records from to ”, interlending & document supply, vol. no. , pp. - . kress, n., del bosque, d. and ipri, t. ( ), “user failure to find known library items”, new library world, vol. no. / , pp. - . kroll, s. and forsman, r. ( ), a slice of research life: information support for research in the united states, oclc, dublin, oh, available at: http://www.oclc.org/content/dam/research/publications/library/ / - .pdf (accessed may , ). lewis, d. ( ), “six reasons why the price of scholarly information will fall in cyberspace”, unpublished manuscript, available at: http://hdl.handle.net/ / (accessed january , ). mcgrath, m.
( ), “viewpoint: open access – a nail in the coffin of ill?”, interlending and document supply, vol. no. , pp. - . proquest ( ), “open access publishing plus from proquest: frequently asked questions (faq)”, available at: http://media .proquest.com/documents/open_access_faq.pdf (accessed may , ). schöpfel, j. ( ), “open access and document supply”, interlending and document supply, vol. no. , pp. - . suber, p. ( ), “open access overview”, available at: http://www.earlham.edu/~peters/fos/overview.htm (accessed may , ). yontz, e., williams, p. and carey, j.a. ( ), “interlibrary loan requests for locally held items: why aren’t they using what we’ve got?”, journal of interlibrary loan, document delivery & information supply, vol. no. , pp. - .

heritage article digital artifacts and landscapes. experimenting with placemaking at the impero project alessandro sebastiani citation: sebastiani, a. digital artifacts and landscapes. experimenting with placemaking at the impero project. heritage , , – . https://doi.org/ . / heritage received: december accepted: february published: february publisher’s note: mdpi stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. copyright: © by the author. licensee mdpi, basel, switzerland.
this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (https://creativecommons.org/licenses/by/ . /). department of classics, university at buffalo (suny), buffalo, ny , usa; as @buffalo.edu; tel.: + - - -

abstract: this paper describes the public archaeology approach and placemaking experiment at the etruscan and roman site of podere cannicci in tuscany (italy), drawing from the previous experience at three other archaeological sites along the tyrrhenian coast. after three years of excavations at the impero project (interconnected mobility of people and economy along the river ombrone), the team has begun a side project to develop new strategies for communicating the results of the research. these include, but are not limited to, an app which displays augmented reality and 3d reconstructions of both the site and the material culture. the project uses digital narratives to engage local communities and scholars in the interpretation and reconstruction of ancient landscapes along the middle valley of the ombrone river. this approach also has the potential to support and sustain local tourism, providing an original experience for visitors. moreover, the solution allows people from all over the world to be connected with the ongoing research and its results, as everything will be published on a dedicated website.

keywords: cultural heritage; classical archaeology; augmented reality; 3d reconstructions; digital archaeology; placemaking

. introduction

the world of cultural heritage is changing rapidly, and new approaches for its sustainability are needed to face the challenges that the 21st century is presenting to our communities. unfortunately, we are responsible for having created a gap between academia and public communities, and for some decades we failed altogether to transmit our knowledge of the past to vast audiences.
new challenges are arising in our discipline, and they concern how to preserve our monuments and make them accessible far more than how to produce further datasets from newly opened excavations. it is time to slow down the investigation and excavation of new sites and to concentrate instead on what has already been uncovered. as richard hodges writes, “ . . . will there be the means to challenge great questions about the past or will archaeologists increasingly concentrate upon making sense of and re-assessing discoveries made by our baby-boomer generation? a major aspect of the future of studying the past is to make it accessible to our communities [ ] we cannot afford any longer the inevitable. public interest is becoming insatiable as global tourism and a global hunger for history reduces the import of mere reporting of digs” [ ].
obviously, archaeological excavations continue and new sites are brought to light daily. however, approaches for relating to cultural heritage have changed. with the beginning of a new research expedition, the scholar’s eye is aimed almost immediately at the enhancement of a site and the construction of narratives around which to base tourist experiences as well as the involvement of local communities. archaeologists are becoming placemakers [ ], providing historical identities to critical waypoints of the past (whether these are major or “minor” archaeological sites) and [ ] transmitting these identities through a number of different strategies and narratives to wider audiences.
digital archaeology is one of many approaches for engaging wider audiences and making our narratives accessible to as many people as possible. 3d models and reconstructions, as well as the extensive use of virtual and augmented reality, play a crucial role in our 21st century approach to monuments, archaeological sites, research and global tourism [ – ]. it is under this lens that our impero project (interconnected mobility of people and economy along the river ombrone) in south tuscany, italy has begun to employ a strategy of placemaking, as we shall see in the next paragraphs. it is a journey that began in , and like all challenges, it has seen moments of success alternating with some sudden setbacks. each autumn, however, has opened new and unexpected directions in our attempt to develop innovative approaches to the management and transmission of cultural heritage.
the impero project
in , a new archaeological project started at the university of sheffield and continued under the department of classics at the university at buffalo (suny) [ , ]. the impero project aims at reconstructing the historical landscape of the middle valley of the ombrone river in south tuscany from the etruscan period until the end of the middle ages. since last year, it has also included the data available from the alberese archaeological project that was carried out between and on the coastal area of the ancient territory of the etruscan-roman city of rusellae [ ]. in this way, the research intends to fully document the intermittent changes that came into being in an under-investigated riverine landscape and to promote the discovery of ancient historical sites to local communities and global tourism (figure ).
it is a path, as mentioned, which began in within the maremma regional park, where the enhancement of historical and artistic attractions is accompanied by the preservation and protection of the natural environment. as we will see in the next paragraph, several strategies developed in the context of the maremma regional park were put in place at alberese and served as the starting point for a much more elaborate placemaking project for the inland area. stemming from this effort, the impero project began with investigations into two archaeological sites located along the middle stretch of the ombrone river, belonging to two different historical phases: on the one hand, the late etruscan sanctuary and roman republican village of podere cannicci, and on the other, the ruins of the medieval fortified settlement of castellaraccio di monteverdi [ ], both within the modern municipality of civitella paganico (grosseto) (figure ).
located at the foot of the gentle slopes of a tuscan hill, podere cannicci represents quite an intriguing etruscan to late republican site ( th century bce to early st century bce) (figure ). its story begins with a natural sacred place, attested by the abundance of votive offerings recovered in the s during some rescue excavations carried out by the soprintendenza archeologica della toscana [ , ], and continues with the construction of a possible vicus, whose economy relied heavily on its strategic location along riverine and terrestrial trade routes [ , ]. the village grew economically and, most likely, socially to the point of becoming a reference point within the surrounding territory. the excavations and geophysical surveys have clearly shown that the settlement extended over a wide area of at least ha, with a number of different complexes (figure ), including dwellings for the villagers, as well as manufacturing structures and storage facilities, all of which point to a thriving and flourishing settlement.
apparently, its fortune was also determined by proximity to the religious area. although monumental or substantial remains are not yet visible (if any were ever constructed!), the votive sphere at cannicci clearly revolved around fertility cults and offerings [ , ]. terracotta uteri were predominant in the ancient votive deposits and were accompanied by small terracotta heads and statuettes (figure ). the settlement relied on an agrarian economy, and also on craftsmanship. the proximity to the sanctuary also meant that the inhabitants could produce some of the necessary offerings and special requests of the sacred place: black gloss ware and ex-votos, both in metal and pottery, were produced here in specialized and seasonal workshops. the excavations brought back to light the heavy background noise of these activities. a large amount of waste testifies to the intense production of metal objects, domestic ware and luxury black gloss ware vessels (figure ). as the excavations grew bigger in , another dwelling of the village was investigated; the remains of a large facility appeared, with one room containing at least dolia (figure ) [ ].
figure . map showing the geographical areas of research discussed in the paper.
figure . map showing the area of the impero project. the red square indicates the location of podere cannicci, the late etruscan and republican sanctuary and vicus; the purple star indicates the location of the deserted medieval village of castellaraccio di monteverdi.
figure . aerial view of the late etruscan and republican settlement at podere cannicci.
figure . results of the geophysical investigations at podere cannicci. remains of underground structures are clearly visible through the arp resistivity, showing a wider settlement covering some ha.
figure . some of the votive offerings collected during the – excavations at podere cannicci; (a) represents a terracotta face; (b) represents one uterus; (c) is a clay statuette, maybe representing minerva.
figure . some of the pottery wastes recovered during the last two excavation seasons at podere cannicci.
figure . aerial view of the excavation trench (area ) showing the remains of a dwelling with seven dolia still in situ. the structure was shown in the results of the geophysical campaigns.
this vibrant community of worshipping and industrious farmers and artisans came to an end when the social war between marius and sulla reached this part of etruria [ ].
sulla’s troops showed no mercy and the settlement at cannicci was set on fire, its houses never rebuilt, and people dispersed in the surrounding territory if not killed. what was once a dynamic settlement in between religion, agriculture and craftsmanship, was, at this point, destroyed and never occupied again.
the impero project also investigates a medieval site. as one of the tasks of the research is to understand the changes that occurred between the classical and the modern world along the valley of the ombrone river, it was necessary to excavate the remains of castellaraccio di monteverdi, a fortified hilltop settlement facing the river and a collapsed medieval bridge that crossed it [ , , ] (figure ).
the excavations revealed the existence of a much larger settlement, developing over at least three main terraces molded across the slopes of the hill.
at this stage, parts of the manor house/tower are under investigation, as well as one of the dwellings located on the hilltop. a general mapping of the fortification and of all the visible walls of the castle was carried out, allowing a preliminary understanding of the topography. the thick deposits of rubble that seal the occupation layers of the structures naturally slow down the excavations at this site. given also the minimal collection of material culture available, it is still rather early to advance significant hypotheses on the chronology of the settlement, as well as on the structures it contained and their original functions. an interesting aspect was the discovery of a number of fragments of a late th century bc dolium amidst and at the bottom of a rubble context. this may hint at the possibility that a late etruscan phase can be documented one day on the hilltop of castellaraccio, providing a unique case for this territory. although etruscan settlements are quite often present at the very bottom of medieval stratigraphies and structures, the area of civitella paganico, as well as all the middle valley of the ombrone river, is rather devoid of etruscan sites.
figure . aerial view of the remains of the deserted medieval village of castellaraccio di monteverdi.
academically then, the impero project sits at the intersection of a number of large debates, spanning from the organization of the historical landscape and its settlement networks to the rise and fall of ancient economies.
it investigates commerce and trade along riverine and terrestrial routes and embraces the wider mediterranean and its micro and macro ecologies to reconstruct the historical identity of a place, paganico, and its territory. in engaging with these crucial questions of research, the project also attempts to involve local communities in the interpretative process. interestingly enough, this second task proves to be the most challenging, requiring us to seek new narratives and tools for transmitting the history of a settlement, the construction of historical identities, and the sense of authenticity that comes from our archaeological research to local communities and wider audiences. despite its challenging nature, the communication and dissemination of authentic stories about the places we investigate remains a fundamental responsibility that we, as a scientific community, must fulfill.
a digital venture for the project
as mentioned in the brief introduction of this paper, archaeologists are currently facing new challenges in providing the kinds of authentic experiences that international tourists and local communities increasingly seek while visiting historical sites. undoubtedly then, our discipline is now engaging with new methods, practices and approaches to disseminating our interpretations of sites and settlements. engagement with local communities, placemaking, the accessibility of archaeological areas, and the visual dissemination of artifacts and structures are just a few examples of a new vocabulary that archaeology must integrate into its research. in this spirit, an increasing number of archaeological projects are committed to public archaeology. we have the moral duty to open the doors of our sites and to shape new narratives that might attract wider audiences, helping to produce sustainable economic strategies for archaeological places.
we are experiencing a new revolution in the use of visual art that is no longer confined to scholarly publications and academic treatments of the past. visual art and the possibility of personal interaction may be one of the keys to engaging with local communities and imbuing our work as historians with authenticity and identity. the moment could not be better: as we witness a growing trend of nationalistic movements, whose propaganda is based on falsified myths and misrepresented historical reconstructions, in a world that seems to prefer to erect walls and barriers, archaeology has the power to address the reality of the past and to construct bridges between history and local communities, to make places in lieu of non-places. following marc augé’s definition of non-places as spaces of vulnerability where members of society are helpless, since the non-place neither provides nor transmits any sense of belonging to a culture or authentic history [ ], at the impero project we have started to build these bridges between the academic community of archaeologists and specialists, and the local, wider community. we believe that the latter are the final recipients of our work. our research should produce an authentic reconstruction of the historical changes that occurred, in our specific case, along the middle valley of the ombrone river.
our two sites, the late etruscan and republican sanctuary area and the medieval deserted hilltop village, serve the purpose of our experiment with placemaking. as we have seen, each settlement tells stories of different social and cultural identities, struggling and melting together in the crucial passages between the etruscan period, the roman era and the transformations of the intricate world of the middle ages. it is challenging, however, to relay our research to the modern community of farmers and rural inhabitants who live in the area around paganico.
this community, who might well represent the very descendants of the ancient settlement we study, will always experience barriers in learning about its past unless we create the means to disseminate our results and help them construct their identities, making cannicci a place for their stories and narratives. how can we transform an archaeological landscape of destruction and abandonment, with scattered elements, into something tangible and understandable for wider audiences? how can we transmit the importance of those rubble contexts in terms of social and historical identities?
our experience at the impero project has its roots in a previous attempt that, although much more traditional in its approach, had the chance to quantify the interest of a general audience and to seize the opportunity of creating a place. it is fair to conclude that the attempt was successful overall, but it also failed to develop into something more structural and substantial.
back in , an independent research project began in the area of the maremma regional park in south tuscany. known as the alberese archaeological project, it led to the excavation of three important roman sites along the tyrrhenian coast: the sanctuary area of diana umbronensis [ ], the manufacturing district at spolverino [ , ], and the positio of umbro flumen [ , ] (figure ). in , when these excavations were almost completed, the research team decided to organize an archaeological exhibit to display some of the objects and to disseminate the results of the project. this was of vital importance, as the sites were neither accessible nor open to visitors but were on public land and excavated with public funds. moreover, the project took place in the protected landscape of a regional park whose coasts and nature trails attract thousands of tourists annually. the exhibition was also a first opportunity to introduce local communities and regional institutions to the cultural heritage present within this protected area.
titled “i romani di alberese” (romans at alberese), the exhibit opened on july at the archaeological museum of grosseto (figure ) and was by all measures a success. although reduced in size to only display cases and a variety of roman objects, it managed to attract the attention of some thousands of tourists and was, as a result, extended three times before moving to the headquarters of the regional park in alberese for another year. unfortunately, we were not able to capitalize fully on the results and the interest that the exhibit generated in almost two and a half years and about , visitors. what we created was supposed to be the first step towards the development of a series of archaeological trails within the regional park, starting from the area of the sanctuary of diana umbronensis located on the main road leading tourists to one of the most beautiful and crowded beaches of tuscany. we began to plan an archaeological area that could guide visitors through the meanders of the historical stratigraphy that was patiently removed and interpreted. smart panels were to provide the necessary link between a traditional interaction with the archaeological site and the use of mobile devices (smartphones and tablets) to view 3d reconstructions and models of the roman architecture and objects (figures and ). in essence, we hoped to migrate the physical exhibition onto any personal device, so that the experience of discovering and learning about an archaeological site could follow the visitors to their homes. tags and specific targets would have been used to recreate the otherwise hidden settlement at spolverino, a stunning roman manufacturing district sealed under almost two meters of alluvial clay. since the site could not be left in the open air, the idea was to use virtual reality to represent it visually and engender appreciation of its history (figures and ).
finally, at umbro flumen our intention was to use digital means to recreate the settlement and the original roman coastline on which it sat in the nd century bc. nowadays the sea is more than km away from the roman positio, about as far away as our chances of bringing these plans to fruition.
the attempt failed for a number of reasons. most likely, however, it was due to the fact that the project was too ambitious to be realized. the time was not yet right, and the costs of such a move were too high to be covered by a humble, independent project. none of the main political actors believed in the potential of this idea, seen as too grand for a provincial location that aspires to survive rather than advance.
figure . aerial views of the three sites investigated in the area of alberese. (a) shows the sanctuary dedicated to diana umbronensis; (b) shows the manufacturing district at spolverino, while (c) shows the remains of umbro flumen.
figure . banner of the exhibit “i romani di alberese” opened in and showing the results of the excavations at alberese.
figure . masterplan for the creation of a smart panel using tablets and gps at alberese.
figure . 3d reconstructions of the area at alberese in the roman period. in the front, the sanctuary area of diana umbronensis, looking north towards umbro flumen and spolverino.
note that the landscape and natural environment were also reconstructed to show the original coastline.
figure . archaeological plan of the excavations at the manufacturing district of spolverino–alberese.
figure . 3d rendering of the manufacturing district of spolverino–alberese.
as too often happens, the archaeological interest in alberese began to dwindle. settlements were studied, most of them published or soon to be, and historical questions (for the moment) answered. it was time, then, to abandon this part of tuscany and try to make some places somewhere else.
as the impero project continues its main scientific research, a different approach to communicating and placemaking is taking shape. having learned from previous failures at alberese, the project decided to invest even more in future technologies.
we decided to make extensive use of 3d modeling, augmented reality reconstructions, and web-based information to make our research accessible to everyone.
our new experiment with placemaking started at podere cannicci. its more accessible location and lower density of archaeological contexts immediately allowed for a more rapid extension of research which could expose much of the settlement in just three years of investigations. unfortunately, the presence of heavy collapses of wall structures (some even over two meters deep) led to a logistical slowdown of the ongoing investigations at castellaraccio, where we hope to initiate a project redeveloping the area as a tourist attraction in the immediate future. furthermore, as we had anticipated, a series of data regarding the function and possible interpretation of the classical site was already available due to the rescue excavations in the late s.
once the site was chosen, the first step was to identify the assemblages that we might use to support our placemaking project; in other words, we needed to define a strategy for artistic visual design. the choice fell on using building materials, votive offerings and peculiar objects as inspiration. these all represent quintessential elements of our site: they describe the way rural dwellings were built, showcasing skills and techniques that are still present in the tuscan countryside, as well as representing everyday objects that functioned to sustain the communities in the past (figures and ). they serve to create a bridge between the past and the present of our rural communities, a trait d’union of cultural identities that are handed down from generation to generation. the votive offerings, mainly terracotta uteri, symbolize prosperity and fertility, as we have seen previously.
we attempt to match this cult with the need for descendants (mostly men as they were necessary for working the land and joining the army), although we can also address these objects as a request for fertility for the fields (keeping in mind that in etruscan and roman times, the economy was largely based on agricultural goods). those votive offerings bonded ancient etruscans and roman colonists to a land and fertile soils that still guarantee incomes and revenues to the local communities today. the skilled artisans that molded terracotta objects and forged bronze statuettes of bovines were protected by minerva, and their tradition and heritage have survived into modern paganico and its environs. the land and the fields around podere cannicci, for example, produced grapes, olives and wheat that were stored and managed by patient farmers. archaeology was able to retrieve the traces of these forms of production. dolia still in situ tell a story of seasonal work, of the process of transforming grapes into wine, olives into oil (figure ), a tradition that is still present and vivid in tuscany. 3d reconstructions helped us visualize these practices, connecting the local, modern communities to their past and their ancestors.
our 3d models and reconstructions followed two different methods and perspectives [ – ]. some of the material culture that we decided to digitize was selected in order to represent, as previously mentioned, particular aspects of everyday life at roman cannicci that can still be experienced in modern paganico. once the selection was done, we used a portable artec spider 3d scanner to digitize the artefacts and artec studio pro to create the digital models; we then uploaded the final renderings onto sketchfab (https://sketchfab.com/imperoproject). by navigating the website, the visitor will also notice the presence of almost the entire collection of votive offerings retrieved during the excavations in – .
this decision was made because the objects are currently displayed at the archaeological museum in grosseto, but we wanted to make them available to everyone.
figure . 3d model of a loom weight found during the excavations at podere cannicci.
our 3d reconstructions are a little more subjective, as one can easily expect. we started with the precise recording of the archaeological strata and evidence during the excavations at podere cannicci. as we excavated the different deposits, we mapped the material culture retrieved and created a digital plan of the room with dolia. our very first attempt to visualize this particular space included the elevation of the clay walls on top of the physical remains and the juxtaposition of 3d models of dolia on the actual remains that were still in situ. the carbonized wooden table was 3d reconstructed at the exact place of its discovery in , while the objects that we decided to render were those collected during its excavation. as for the architectural reconstruction of the possible porticoed area, we utilized once again the real data coming from the excavations. two stone pillar bases were exposed in , with a carbonized beam in between them; hence, we decided to render this type of construction as shown in figure . we also collected some of the roof tiles, which allowed us to hypothesize the final reconstruction, and we relied on previous renderings of similar facilities [ ].
figure . 3d model of a mud brick recovered during the excavations at podere cannicci.
finally, we needed to make our work as archaeologists understandable. why should public institutions fund our research? what are we giving back to the public in terms of monuments and art?
once again, we relied heavily on digital tools. we needed to let people visualize our work: how we date structures and sites, how we reconstruct historical landscapes, and how we uncover identities and make places. think for a second of pompeii or herculaneum, or the coliseum. these are quintessential, authentic places that transmit certain values from the past, identities that are preserved and shown daily to hundreds of thousands of tourists. then we return to podere cannicci: humble stone foundations, melted mud bricks and shattered pottery vessels filling a large drain. the task of transmitting value to these features in terms of historical authenticity is enormous. our rural settlement, the place where sulla sealed his conquest of the ager rusellanus, has no monumentality. augmented reality helped us fill this void. through the use of this technique, we were able to start a process of communicating our research and our historical vision of the past, our visual art. people can now see where votive offerings were found, or how we date our strata and why we use stratigraphy (figure ). we wanted to make our research reliable and understandable to everyone. this task will continue during the next seasons, when we will implement our holistic and technological approach to the historical reconstruction and dissemination of visual art.
figure . 3d reconstruction of the dwelling at podere cannicci where seven dolia were found still in situ during the – excavation seasons. the picture shows an overlay between the archaeological deposits (the earth-beaten floor and remains of the walls) and the 3d reconstruction built on top of it.
it is for this reason that we open a new direction in our development project for podere cannicci.
thanks to the generous support of a research grant from the digital scholarship studio and network (dssn) at the university at buffalo, the project was able to purchase two oculus quests. our idea is to recreate a virtual and augmented reality environment, fully interactive for the user, where the etruscan and republican settlement comes back to life. the idea was suggested by a number of other applications of, or attempts at, this technique [ – ]. the user should be able to walk through the different spaces that archaeology uncovered during the excavations, picking up objects, seeing and experiencing a virtual scenario where all the details are based on a precise, scientific reconstruction of the material culture and buildings. because we are developing a dedicated app for the oculus quest, visitors will be able to use their own devices, although some will be available onsite. in this way, by recreating the structures and life of the site, the project will overcome the limits that a humble settlement like cannicci presents. the lack of monumentality can be compensated for by the virtual reconstruction of the spaces, of the people, and of the objects, while still allowing some flexibility in updating the data (figures and ). this solution will allow for updates and ongoing development as we increase the number of reconstructions and keep pace with the excavations. the maintenance and printing costs of traditional panels will thereby be reduced. at the same time, since virtual experiences do not require physical presence onsite, visitors will be able to interact with the etruscan and roman site at cannicci remotely, from the comfort of their home. this part of the project was set to start during the research season at the impero project; however, the outbreak of covid-19 forced us to cancel all activities. nonetheless, our team is already working on the app and the reconstructions.
we hope to deliver this innovative experience in summer .
figure . the picture shows one roman drain at podere cannicci with a tag. our experimental app proposes an augmented reality reconstruction of the archaeological stratigraphy contained there and of the material culture used to date it. in this way, the user will see the physical remains of the drain (upper part of the picture) and the reconstructed facility in augmented reality with details of the archaeological deposits (lower part of the picture). the app will be accessible via personal devices such as tablets and smartphones.
figure . our oculus quest experiment: in this case the user can navigate inside the virtually reconstructed roman buildings at podere cannicci. here the user interacts with objects found on the remains of a carbonized roman table.
in the meantime, in september , we opened an archaeological exhibit in paganico where visitors could interact with artifacts and sites. the exhibition was made possible by the generous support of a research grant from the office of the vice president for research and economic development of the university at buffalo. the exhibition was divided into a series of rooms with panels explaining the various stories related to the excavations at podere cannicci and castellaraccio di monteverdi (figure ). moreover, through the use of qr codes, visitors were able to visualize artifacts that could not be displayed, learn their stories, and assimilate and appropriate their historical identities. technology also allowed us to disseminate our preliminary results through the establishment of a web-based virtual exhibit. people from all over the world can still view our work, debate our theories and study our artifacts (www.imperoproject.com/archeologia-a-monteverdi). our intention is to renew the exhibition every year, showing different aspects of our research, as well as of the historical landscapes of the municipality of civitella paganico.
figure . our oculus quest experiment: the user is navigating through the 3d reconstruction of one of the rooms at podere cannicci, where archaeology retrieved the remains of a wine cellar with dolia still in situ as well as a carbonized wooden table.
figure . the exhibition “archeologia a monteverdi” at paganico in september .
. measure of success and public opinion
since day one of our project, we decided to have an open access profile, and to make extensive use of the internet as a way to communicate and measure the success of our research. in , the website of the project was set up (www.imperoproject.com). this serves as a collector of all the information gravitating around the excavations and the historical sites we investigate annually. almost immediately afterwards, our facebook page was created (https://www.facebook.com/imperoproject), while twitter, instagram and youtube accounts were added only during the summer of , as we began the excavations at podere cannicci.
social media have finally acquired their primary role in the field of humanities, and the communication of archaeological research has certainly benefited in terms of visibility and access to information. our social media approach at the impero project took advantage, once again, of our previous experience at alberese. during the excavation campaigns, the website and social pages are updated on a daily basis so that our research is transparent and can be immediately reviewed and commented on. the excavation journals are published at the end of each day, together with all the 3d models, including those in progress, in order to receive feedback from external users. we have chosen to use english as the official language, instead of the more comfortable italian, to favor a wider dissemination and understanding of the contents we publish. this way of conveying raw excavation data immediately (without waiting for the publication of each final report) and of showing the subsequent steps towards the interpretation of our contexts is certainly inspired by previous ventures, such as the pivotal project at miranduolo, in the municipality of chiusdino, siena (http://archeologiamedievale.unisi.it/miranduolo/); here, the team of the university of siena has always made all the excavation information available. it is a method of data-sharing that not only inspired us, but whose philosophy and application we share and engage with [ ].
the large involvement of digital technologies and of the internet is also useful for measuring and understanding public engagement with our project. the website is accessed from all the continents, especially in the summer, when we have fresh content to share. it is the moment when we publish our daily excavation journals (from both podere cannicci and castellaraccio) and digital interactions spike.
in absolute terms, every year saw an increment of visitors: we started with visitors and visualizations in , to reach visitors and , visualizations in . the last set of numbers was boosted by the opening of the virtual exhibition in september , allowing us to understand that people from all over the world were able to virtually visit the exhibition, as well as interact with our d models and reconstructions. due to the coronavirus outbreak in , we were not able to constantly update our contents; the number of both visitors and visualizations decreased accordingly, unfortunately. finally, we will implement our methods of measuring public engagement, especially when we organize our archaeological open days; these events are usually attended by hundreds of local people and they can turn into a unique occasion to collect feedback and comments on our job as placemakers. . conclusions i would like to conclude this paper with a few additional remarks. only three years ago, the territory of civitella paganico was confined to amateur studies, where, however, some of the great potentials of a multi-faceted and stratified landscape were already emerging. the few data on the etruscan period were accompanied by the excavation of the large roman thermal complex of pietratonda [ , ], but still lacked the ability to identify the more complex and ramified settlement dynamics. the middle ages were isolated to a few publications on the birth of the “borgo franco” at paganico, and only some works attempted a broader reading in light of extensive field surveys [ ]. the archaeological activities of our project almost immediately intersected with those undertaken in the nearby necropolis of casenovole by the ass. arch. odysseus, where a group of local archaeologists is discovering some late etruscan and republican tombs [ – ]. we entered into a composite archaeological and historical scenario where data was fragmented, and we are pushing to create synergies among the different agents. 
without jeopardizing the necessary independence of individual research projects, we are trying to find a reasoned solution to the management, enhancement and transmission of the cultural heritage of the entire municipal area. it is for this reason that we are planning the possibility of a diffused museum for the territory, open to a classic showcasing of historical areas as well as to innovative digital communication techniques. what we are trying to create is a complete and constant interaction between the visitor and the places of the historical identity of this territory and its stratified landscapes. furthermore, through the integration of an augmented reality platform (oculus quest), i want to immerse the visitor inside the settlements and material culture that we investigate annually. through this project, we want to convey the historical identity and authenticity that emanate directly from the archaeological knowledge of the territory. the challenge is to be able to connect local communities to their cultural roots, to the economic aspects of a landscape that have spanned the centuries unchanged. at the same time, we intend to connect tourists to landscapes and historical realities through the possibility of visiting, studying and coming to know the places we have investigated even remotely. for this reason, our website is already transforming itself into an information portal and, over the years, we will develop it in order to have different educational and interaction levels appropriate for a variety of user experiences. the task is as arduous as it is fascinating. being able to connect global tourism and local communities to the authentic narratives of a territory so rich in history can only emerge as a generational goal. in other words, we are preparing to become placemakers for the municipality of civitella paganico. 
we are sure that it will be an engaging journey to rediscover the historical memories and cultural identities of a precious corner of the middle valley of the ombrone river.
funding: this research was funded by the university at buffalo, office of the vice president for research and economic development (ovpred), as well as by the university at buffalo digital scholarship studio and network (dssn). the archaeological excavations and geophysics at podere cannicci were also supported by the new york community trust.
data availability statement: the data presented in this study are available on request from the corresponding author. the data are not publicly available due to the forthcoming publication of monographs and articles related to the ongoing project.
acknowledgments: the author wishes to thank michelle hobart (the cooper union, new york) and todd fenton (michigan state university) for their constant help, support and commitment to the impero project. josef souček (national museum of prague) diligently provided all the 3d models and augmented reality data and pictures. tyler johnson (university of michigan) kindly edited the initial version of this paper, providing feedback and useful comments; it goes without saying that any remaining mistakes are my own. the giannuzzi savelli family allows the excavations at both podere cannicci and castellaraccio di monteverdi and i am always grateful to them. a special thank you goes to alessandro carabia (university of birmingham) and edoardo vanni (university of siena), who direct the excavations at our archaeological sites. the mayor of civitella paganico, alessandra biondi, has always been supportive of the project, and thanks to her encouragement we were able to set up the archaeological exhibition in paganico in – . last but not least, a thank you to all the students who took part in our excavations between and ; without their efforts we would not have any story to tell.
conflicts of interest: the author declares no conflict of interest. references . hodges, r. the archaeology of mediterranean placemaking: butrint and the global heritage industry; bloomsbury: london, uk, . . hodges, r. archaeologists as placemakers: making the butrint national park. in butrint : the archaeology and histories of an ionian town; hansen, i.l., hodges, r., leppard, s., eds.; oxbow books: oxford, uk, ; pp. – . . holtorf, c. on pastness: a reconsideration of materiality in archaeological object authenticity. anthr. q. , , – . [crossref] . remondio, f.; campana, s. d recording and modelling in archaeology and cultural heritage: theory and best practices; archaeopress: oxford, uk, . . forte, m.; siliotti, a. virtual archaeology: re-creating ancient worlds; harry n. abrams: new york, ny, usa, . . olson, b.r.; caraher, w.r.; heath, s. visions of substance: d imaging in mediterranean archaeology; university of north dakota: grand forks, nd, usa, . . gaitatzes, a.; christopoulos, d.; roussou, m. reviving the past: cultural heritage meets virtual reality. in proceedings of the conference on virtual reality, archeology and cultural heritage—vast ’ , glyfada, greece, – november ; acm press: glyfada, greece, ; pp. – . . sebastiani, a. from villa to village. late roman to early medieval settlement networks in the ager rusellanus. in encounters, excavations and argosies: essays for richard hodges; moreland, j., mitchell, j., leal, b., eds.; archaeopress archaeology: oxford, uk, ; pp. – . . sebastiani, a.; hobart, m. scavi nella tenuta di monteverdi a civitella paganico. boll. archeol. online , , – . . sebastiani, a. new data for a preliminary understanding of the roman settlement network in south coastal tuscany. the case of alberese (grosseto, it). res. antiq. , , – . . hobart, m.; carabia, a. excavation at castellaraccio (civitella paganico—gr) . j. fasti online , , – . . barbieri, g. 
aspetti del popolamento della media valle dell’ombrone nell’antichità: indagini recenti nel territorio di civitella paganico. j. anc. topogr. , , – . . fabbri, f. la stipe votiva di podere cannicci (civitella paganico, grosseto). in un’anima grande e posata. studi in memoria di vincenzo saladino offerti dai suoi allievi; bazzecchi, e., parigi, c., eds.; scienze e lettere: rome, italy, ; pp. – . . sebastiani, a.; vanni, e.; morelli, g.; woldeyohannes, e.; hobart, m. the second archaeological season at podere cannicci (civitella paganico – gr). j. fasti online , , – . . fabbri, f. la stipe votiva di podere cannicci a paganico (civitella paganico). in le vie del sacro. culti e depositi votivi nella valle dell’albegna; rendini, p., ed.; nuova immagine: siena, italy, ; pp. – . . fabbri, f. votivi anatomici fittili. uno straordinario fenomeno di religiosità popolare dell’italia antica; ricerche; ante quem: bologna, italy, . . sebastiani, a.; vanni, e.; brando, m.; woldeyohannes, e.; mccabe, m.d., iii. the third archaeological season at podere cannicci (civitella paganico – gr). j. fasti online , , – . . keaveney, a. the social war bc. in rome and the unification of italy; liverpool university press: liverpool, uk, ; pp. – . . farinelli, r. la valle dell’ombrone dalla tarda antichità al basso medioevo. il contributo delle indagini storico-archeologiche alla storia del popolamento e dei flussi di traffico. in ombrone. un fiume tra due terre; resti, g., ed.; pacini: ospedaletto, pisa, italy, ; pp. – . . augé, m. non-places: introduction to an anthropology of supermodernity; verso: london, uk, . . sebastiani, a.; chirico, e.; colombini, m.; cygielman, m. diana umbronensis a scoglietto. santuario, territorio e cultura materiale; archaeopress: oxford, uk, . . sebastiani, a. spolverino (alberese – gr). the th archaeological season at the manufacturing district and revision of the previous archaeological data. j. fasti online , , – . .
sebastiani, a.; derrick, t. a regional economy of recycling over four centuries at spolverino (tuscany) and environs. in recycling and the ancient economy; duckworth, c., wilson, a., eds.; oxford university press: oxford, uk, ; pp. – . . sebastiani, a.; chirico, e.; colombini, m. grosseto località alberese: indagini nel sito marittimo di età romana nell’area di prima golena. not. della soprintend. ai beni archeol. della toscana , , – . . chirico, e. prima golena (alberese, gr): umbro flumen una mansio-positio a servizio della viabilità. boll. archeol. online , , – . . garstki, k. virtual representation: the production of d digital artifacts. j. archaeol. method theory , , – . [crossref] . jeffrey, s. challenging heritage visualisation: beauty, aura and democratisation. open archaeol. , , – . [crossref] . mcpherron, s.p.; gernat, t.; hublin, j.-j. structured light scanning for high-resolution documentation of in situ archaeological finds. j. archaeol. sci. , , – . [crossref] . klein, m.; vermeulen, f.; corsi, c. radiography of the past—three dimensional, virtual reconstruction of a roman town in lusitania. int. j. herit. digit. era , , – . [crossref] . lyttleton, j.; herron, t. through the virtual keyhole. archaeol. irel. , , – . . knabb, k.a.; schulze, j.p.; kuester, f.; defanti, t.a.; levy, t.e. scientific visualization, d immersive virtual reality environments, and archaeology in jordan and the near east. near east. archaeol. , , – . [crossref] . sanders, d.h. virtual heritage. j. east. mediterr. archaeol. herit. stud. , , – . [crossref] . valenti, m. la “live excavation”. in proceedings of the vi congresso nazionale di archeologia medievale, l’aquila, italy, – september ; redi, f., forgione, a., eds.; all’insegna del giglio: florence, italy, ; pp. – . . barbieri, g. civitella paganico (gr). scavi alle terme romane di pietratonda. not. della soprintend. ai beni archeol. della toscana , , – . . cabarrou, m.; darles, c.; pisani, p. 
essai de description d’un bâtiment des eaux de toscane, l’édifice mystérieux de pietratonda (gr). l’antique partag. , , – . [crossref] . marcocci, a. contributo alla carta archeologica del comune di civitella paganico (gr); università degli studi di siena: siena, italy, . . barbieri, g.; lippi, b.; mallegni, f. civitella paganico (gr). la tomba del tasso di casenovole presso pari. not. della soprintend. ai beni archeol. della toscana , , – . . barbieri, g. tomba del tasso di casenovole presso pari (civitella paganico). in l’occhio dell’archeologo. ranucci bianchi bandinelli nella siena del primo ’ ; barbanera, m., ed.; silvana: milano, italy, ; pp. – . . turchetti, m.a. civitella paganico (gr). casenovole: la tomba delle tre uova. not. della soprintend. ai beni archeol. della toscana , , – .
this paper must be cited as: gamermann, d.; moret-tatay, c.; navarro pardo, e.; fernández de córdoba, p. ( ). the small-world of 'le petit prince': revisiting the word frequency distribution. digital scholarship in the humanities. ( ): - . doi: . /llc/fqw
the final publication is available at: https://doi.org/ . /llc/fqw
document downloaded from: http://hdl.handle.net/ /
copyright: oxford university press
the small-world of “le petit prince”: revisiting the word frequency distribution
d. gamermann∗, c. moret-tatay, e. navarro-pardo, and p. fernandez de córdoba castellá
department of physics, universidade federal do rio grande do sul (ufrgs) - instituto de física, av. bento gonçalves - caixa postal - cep - - porto alegre, rs, brasil.
departamento de neuropsicobiología, metodología y psicología social - facultad de psicología, magisterio y ciencias de la educación, sede de san juan bautista. universidad católica de valencia, san vicente mártir - calle guillem de castro , - valencia, spain.
department of developmental and educational psychology - faculty of psychology, universitat de valència. av. blasco ibáñez, - valencia, spain.
instituto universitario de matemática pura y aplicada, universitat politècnica de valència. camino de vera, s/n - valencia, spain.
march ,
abstract
many complex systems are naturally described through graph theory, and different kinds of systems described as networks share certain important characteristics. one of these features is the so-called scale-free distribution of node connectivity, which means that the degree distribution of the network’s nodes follows a power law. scale-free networks are usually referred to as small-world because the average distance between their nodes does not scale linearly with the size of the network, but logarithmically. here we present a mathematical analysis applied to linguistics: the word frequency effect for translations of “le petit prince” into different languages. comparison of word association networks with random networks makes evident the discrepancy between the random erdős–rényi model for graphs and real-world networks.
key words: small-world, word frequency, zipf’s law
many objects of study in different interdisciplinary fields find a natural mathematical description as graphs. a graph is simply an object formed by two different sets: a set of nodes and a set of edges connecting these nodes. for many decades the mathematical study of graphs has been guided by the erdős–rényi model for random graphs (erdős, p. and rényi, a. ( )). in this model a (random) graph is constructed from a set of $n$ nodes by connecting or not each one of the $n(n-1)/2$ pairs of nodes with a probability $p$.
a random graph will, therefore, have on average $p\,n(n-1)/2$ links, and the degree distribution of its nodes will follow a poisson distribution. another characteristic of random graphs is the fact that their size (average node distance) scales linearly with the number of nodes in the graph.
∗danielg@if.ufrgs.br
as graph theory started being applied to many real systems such as metabolic or protein networks, neural networks, the internet, social networks and food chains, among many others (rives & galitski ( ), haykin ( ), pastor-satorras et al. ( ), crucitti et al. ( )), a discrepancy between these real-world graphs and the random erdős–rényi graphs became evident. the node degree distribution in real-world graphs does not follow a poisson distribution; instead it follows a power-law distribution, and such graphs thus became known as scale-free. as a consequence, the average distance between two nodes in such networks grows slowly with the number $n$ of nodes in the network, and this characteristic is known as small-world behavior (amaral et al. ( )).
it has been observed that the word frequency distribution in a language also follows a scale-free distribution, and many explanations for this phenomenon have been given. in linguistics, this observation is known as zipf’s law. it states that the proportion of words $p$ (in a text, for example) with a given frequency $k$ follows a power law:
$$p(k) \sim k^{-\gamma}$$
where $\gamma$ is generally a number between and . this law shows that few words present very high frequency and, conversely, many words present low frequency. a particular and appealing explanation for this could be achieved via concepts from statistical mechanics, where one tries to minimize an energy function based on the balance between the efforts of the speaker and the listener, which is defined by the word frequency and ambiguity, as shown in cancho & solé ( ).
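the power-law form of zipf’s law stated above can be checked on any plain-text sample with a short script. the following python sketch is only an illustration, not the procedure used in this work: the tokenization rule, the function names `frequency_spectrum` and `estimate_gamma`, and the simple least-squares fit on log-log values are all assumptions made for the example.

```python
from collections import Counter
import math
import re


def frequency_spectrum(text):
    """Return {k: p(k)}, the fraction of distinct words that occur exactly k times."""
    # Assumed tokenization: maximal runs of letters, lowercased.
    words = re.findall(r"[^\W\d_]+", text.lower())
    word_counts = Counter(words)              # word -> frequency k
    spectrum = Counter(word_counts.values())  # k -> number of distinct words
    n_types = len(word_counts)
    return {k: n / n_types for k, n in sorted(spectrum.items())}


def estimate_gamma(spectrum):
    """Estimate gamma in p(k) ~ k^(-gamma) by least squares on log p(k) vs log k."""
    if len(spectrum) < 2:
        raise ValueError("need at least two distinct frequencies to fit a slope")
    xs = [math.log(k) for k in spectrum]
    ys = [math.log(p) for p in spectrum.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # the fitted slope is -gamma
```

a least-squares fit on log-log values is the simplest estimator, but it is known to be biased for heavy-tailed data; maximum-likelihood estimators are generally preferred for careful work.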
one traditional way to examine differences between languages is through variables such as frequency, morphological complexity, evolution and cultural transmission. all these aspects can be related in a complex adaptive system beckner et al. ( ). in particular, the word frequency effect is a classical effect in cognitive psychology characterized by its robustness: high-frequency words are recognized more quickly and remembered better sternberg & powell ( ). therefore, a large body of research has employed word frequency as a proxy for word difficulty dufau et al. ( ), esteves et al. ( ), moreno-cid et al. ( ), moret-tatay & perea ( b,a), navarro-pardo et al. ( ), perea, moret-tatay & carreiras ( ), perea, comesaña, soares & moret-tatay ( ), perea, gatt, moret-tatay & fabri ( ), perea, moret-tatay & gómez ( ). according to breland ( ), the logic of this is that low-frequency words are more difficult because they appear less often in print. moreover, van heuven et al. ( ) proposed the zipf scale as a better standardized measure of word frequency.
given the ease with which word counts can be collected at the present time, a useful tool in contrastive linguistics is a lexical corpus of a language, that is, a large collection of texts in electronic form supplemented by linguistic annotation, which has become an important tool in linguistic studies. not surprisingly, several databases for computing statistical and psycholinguistic measures in several languages have been developed for this purpose coltheart ( ), davis ( ). however, according to perea et al. ( ), yap et al. ( ), other variables might be involved in word recognition, and in particular in the word frequency effect, such as the number of contexts in which a word appears.
in the present work we focus on the analysis of a single linguistic material (the little prince by saint-exupéry) in several different languages. for this purpose, we have studied statistical properties of the text and of networks (graphs) associated with this text.
in the different languages we studied the word frequency distribution on the one hand, and on the other we constructed different networks from word associations. for each network we built, we evaluated its main properties, namely its average clustering coefficient, its node distances and its degree distribution. in the next section we present the methodology we used and the mathematics behind our analyses, in the results section we describe our findings, and in the conclusions section we present the main aspects of our results and a brief overview.

methods
materials
the little prince text was obtained from the internet in eight different languages: spanish, english, dutch, greek, basque, italian, portuguese and (of course) french. in order to analyze the text, python scripts were written. the code was run on a computer with a i quadcore processor and gb of ram. the scripts first stored the whole text in memory. then they used punctuation to slice the text into its sentences, and removed all punctuation and numerals from the raw text. they then identified the different words as the remaining strings separated by spaces. as an example, below one can see the first characters from the french text:
antoine de saint-exupéry le petit prince premier chapitre lorsque j’avais six ans j’ai vu, une fois, une magnifique image, dans un livre sur la forêt vierge qui s’appelait « histoires vécues ». ça représentait un serpent boa qui avalait un fauve. voilà la copie du dessin. on disai
through our scripts, the extract above becomes the list of words: antoine, de, saint, exupéry, le, petit, prince, premier, chapitre, lorsque, j, avais, six, ans, j, ai, vu, une, fois, une, magnifique, image, dans, un, livre, sur, la, forêt, vierge, qui, s, appelait, histoires, vécues, ça, représentait, un, serpent, boa, qui, avalait, un, fauve, voilà, la, copie, du, dessin, on, disait.
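the slicing described above can be sketched in a few lines of python. this is only a minimal sketch and not the authors’ script: the function name `tokenize`, the choice of which symbols end a sentence, and the lowercasing step are all assumptions made for illustration.

```python
import re

# Assumption: these symbols end a sentence; the text does not list them.
SENTENCE_END = re.compile(r"[.!?;:]+")
# Anything that is not a letter (punctuation, digits, underscore) becomes a separator.
NOT_LETTER = re.compile(r"[\W\d_]+")

def tokenize(text):
    """Return a list of sentences, each sentence being a list of words."""
    sentences = []
    # Assumption: words are compared case-insensitively, so the text is lowercased.
    for raw_sentence in SENTENCE_END.split(text.lower()):
        # Replace punctuation and numerals by spaces, then split on blanks.
        words = NOT_LETTER.sub(" ", raw_sentence).split()
        if words:
            sentences.append(words)
    return sentences

sample = "Lorsque j'avais six ans j'ai vu, une fois, une magnifique image. Voilà la copie du dessin."
sentences = tokenize(sample)
words = [w for s in sentences for w in s]  # flat list of words for frequency counts
```

keeping the sentence boundaries (a list of lists) rather than a single flat list matters for the network construction, where links are only defined between words in the same sentence.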
once the python script has transformed the whole text into a raw list of words ( in the case of the french text), it counts the number of different words ( in the french text) and also the number of times that each single word is repeated in the text. for the construction of networks, we link words based on their relative distance in the text. for this, one needs to keep track of the sentences into which the text is divided and of which words appear in each sentence. so our script actually first creates a list of sentences, by slicing the text where it finds a punctuation symbol, and after that a list of single words, by slicing the sentences at their blank spaces.

analysis
the word frequency distribution p(k) is a function that, for each natural number k, tells how many words appeared in the text k times. in the case of the french text, for example, different words appeared only once (p(1) = ); one of these is the word “réjouir”. on the other hand, the word “et” was the fifth most frequent word, appearing times (k = ), and it is the only word that appeared this number of times; consequently p( ) = 1. the most frequent word was the article “le”, which appeared times and is the only word appearing times in the text (p( ) = 1). typically, for a text, many words appear only a few times, while a few words are repeated constantly along the text. as a consequence, the function p(k) is a decreasing function. a mathematical function that often fits p(k) in a text is the power-law distribution:
$$p(k) = a\,k^{-\gamma}, \qquad (1)$$
$$\log_{10} p(k) = \log_{10} a - \gamma \log_{10} k, \qquad (2)$$
where a is a proportionality constant that can be evaluated from the total number of words. the fact that the frequency distribution follows a power-law (or scale-free) distribution is known as zipf’s law. note from equation (2) that, in a log-log plot, the distribution will follow a straight line.
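as an illustration of the definition of p(k), the following minimal sketch counts how often each word appears and then how many distinct words share each count. the function name `frequency_distribution` and the toy word list are hypothetical and are not taken from the original scripts.

```python
from collections import Counter

def frequency_distribution(words):
    """Return (word -> number of occurrences, k -> number of distinct words appearing k times)."""
    word_counts = Counter(words)       # how many times each word appears
    p = Counter(word_counts.values())  # p(k): how many words appear exactly k times
    return dict(word_counts), dict(p)

# Toy example: "le" appears 3 times, "et" twice, "petit" and "prince" once each.
toy_words = "le petit prince le le et et".split()
word_counts, p = frequency_distribution(toy_words)
```

in this toy list p(1) = 2, p(2) = 1 and p(3) = 1, so the sum of p(k) over all k equals the number of different words (four).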
for real texts, the tail (large values of k) of the p(k) distribution will be very noisy, because only a handful of large values of k will be populated, and then by a single word. in figure 1 we show the function p(k) (in logarithmic scale) for the french text. one can clearly see the noise in the right tail.
figure 1: word frequency distribution p(k) for the french text with a noisy right tail; axes: log(freq) versus log(# words).
in order to fit the distribution while avoiding the noisy tail, one can use the right-cumulative distribution:
$$p_c(k) = \int_k^{\infty} p(k')\,dk' = \frac{a}{\gamma-1}\,k^{-(\gamma-1)}, \qquad (3)$$
$$\log_{10} p_c(k) = \log_{10}\left(\frac{a}{\gamma-1}\right) - (\gamma-1)\log_{10} k. \qquad (4)$$
in figure 2 one can see the distribution $p_c(k)$ (in logarithmic scale) for the french text. this curve is much smoother than the raw p(k) distribution and it is always decreasing.
figure 2: word frequency cumulative distribution $p_c(k)$ for the french text; axes: log(freq) versus log(# words).
from equations (2) and (4) it is clear that the plot of log(p) or log($p_c$) versus log(k) will follow a straight line if the distribution p(k) follows the power law in equation (1). so, by fitting lines to the empirical data collected from the texts, one can determine the parameters a and γ. the parameter a divided by γ−1 is just the total number of different words in a text. one can see this by noticing that $p_c(1)$ is the total number of different words.
apart from measuring and fitting the word frequency distribution, we analyzed networks of word associations built from the texts. in order to build a network from the text in each language, we set each word as a node and we built two different networks by following two different rules for setting the links between words. in the first network we define a link between two words if they appear side by side in at least one sentence in the text. in the second network a link is defined between two words if there is a third word between the two in at least one sentence in the text.
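the two linking rules can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors’ code: the function name `build_networks` is hypothetical, links are stored as unordered pairs, and a word is never linked to itself (the text does not say how repeated words at distance one or two are handled).

```python
def build_networks(sentences):
    """Return two sets of undirected links (frozensets of two words).

    network 1: words that appear side by side in some sentence.
    network 2: words separated by exactly one other word in some sentence.
    """
    network_1, network_2 = set(), set()
    for words in sentences:
        for gap, network in ((1, network_1), (2, network_2)):
            # Pair each word with the word `gap` positions later in the same sentence.
            for a, b in zip(words, words[gap:]):
                if a != b:  # assumption: no self-links
                    network.add(frozenset((a, b)))
    return network_1, network_2

example = [["my", "drawing", "was", "not", "a", "picture", "of", "a", "hat"]]
net1, net2 = build_networks(example)
```

for the example sentence, network 1 contains eight links (such as my-drawing and a-hat) and network 2 contains seven (such as my-was and of-hat). because links never cross sentence boundaries, the result depends on how the text was sliced into sentences.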
in figure 3 we show examples of the two networks based on a single sentence in the text: “my drawing was not a picture of a hat!”
figure 3: example of the two networks, network 1 on the left and network 2 on the right (nodes in both: my, drawing, was, not, a, picture, of, hat).
an important structure for analyzing a graph is its adjacency matrix. this is a symmetric n×n matrix, where n is the number of nodes in the graph and the elements $m_{ij}$ are equal to one if there is a link between nodes i and j and zero otherwise. from this matrix, one can directly obtain the degree (number of neighbors or connections) of any given node in the graph: $k_i = \sum_{j=1}^{n} m_{ij}$. the number of nodes (words) in each network constructed from the texts may be less than the total number of different words in each whole text, because we remove non-connected components (sets of nodes from which it is not possible to reach a bigger set of nodes by following the links within the set) from the graphs.
for each network we performed three analyses: we fitted a power law to its degree distribution, and we calculated the average clustering coefficient and the average distance between two nodes. the fitting of a power law follows the same steps used to fit word frequencies (but now looking at the degree of each node in the network). the clustering coefficient of a node is given by ravasz & barabasi ( ):
$$c_i = \frac{2\,e_i}{k_i(k_i-1)}, \qquad (5)$$
where $k_i$ is the degree of node i and $e_i$ is the number of connections between the neighbors of node i. the average clustering c̄ of a network can then be calculated straightforwardly as the average value of the $c_i$ over all nodes in the network. the distance between two nodes is defined as the minimum number of links one has to go through in order to travel from one node to the other.
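the three quantities defined above (degree, clustering coefficient and node distance) can be sketched with plain python dictionaries. this is an illustration only: the function names are hypothetical, nodes with fewer than two neighbors are given a clustering coefficient of zero by assumption, and distances are obtained with a breadth-first search, which gives the same result as dijkstra’s algorithm when every link has the same weight.

```python
from collections import deque

def neighbours_from_links(links):
    """Build node -> set of neighbours from an iterable of two-node links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return neighbours

def clustering(neighbours, node):
    """c_i = 2 e_i / (k_i (k_i - 1)); taken as 0 when k_i < 2 (assumption)."""
    nbrs = neighbours[node]
    k = len(nbrs)  # degree of the node
    if k < 2:
        return 0.0
    # e_i: number of links among the neighbours, each pair counted once.
    e = sum(1 for u in nbrs for v in nbrs if u < v and v in neighbours[u])
    return 2.0 * e / (k * (k - 1))

def average_distance(neighbours):
    """Mean shortest-path length over all pairs of nodes of a connected graph."""
    total = pairs = 0
    for source in neighbours:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# A triangle a-b-c with one extra node d attached to c.
graph = neighbours_from_links([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
mean_clustering = sum(clustering(graph, n) for n in graph) / len(graph)
mean_distance = average_distance(graph)
```

in the toy graph, node a has clustering 1, node c has 1/3 and node d has 0, and the average distance over the six pairs of nodes is 4/3.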
the average distance between every one of the $n(n-1)/2$ different pairs of nodes in each network was calculated using dijkstra’s algorithm dijkstra ( ) via the pynetmet package gamermann et al. ( ). the average of the distances between every pair is the network’s average distance d̄.
we compared the average clustering and average distance in every network with results from random networks. for this purpose, for each network, we built an ensemble of twenty random networks with the same number of nodes and the same number of links, but with random topology. the input for a network is its adjacency matrix m. so, for building a random network we use the following algorithm:
(1) start with an n×n matrix where all elements are zero (one has here n nodes and zero links, ℓ = 0, between them).
(2) while the number of links (ℓ) is less than the desired number of links in the network, repeat:
(2.1) choose two different random integers (i and j) between 1 and n.
(2.2) if $m_{ij}$ is zero, change $m_{ij}$ and $m_{ji}$ to one and increase the number of links by one unit (ℓ → ℓ + 1).
(3) check if any node (i) has been left unconnected. if so, randomly choose a node (j) to connect it (i) to and randomly break an existing connection of node j.
(4) repeat step (3) until no node is left unconnected.
steps (3) and (4) are actually optional, but throughout our calculations we have chosen to work with fully connected graphs. this algorithm returns a randomly generated adjacency matrix representing a connected network with a predefined number of nodes and links. using this algorithm, for each network obtained from a text, we generate an ensemble of twenty random networks with the same number of nodes and links. for each random network in the ensemble the average clustering and average distance are calculated, and then the average inside each ensemble is evaluated.
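the steps above can be sketched as follows. this is a minimal illustration rather than the authors’ implementation: the function name `random_network`, the zero-based node indices, the input check and the restriction of node j to nodes that already have a connection are assumptions added so that the sketch runs safely. as written, steps (3) and (4) only guarantee that no node is left without links.

```python
import random

def random_network(n, n_links, rng=None):
    """Random symmetric 0/1 adjacency matrix with n nodes, n_links links, no isolated node."""
    rng = rng or random.Random()
    # Assumption: reject inputs for which the loops below could not terminate.
    if not (n + 1) // 2 <= n_links <= n * (n - 1) // 2:
        raise ValueError("number of links incompatible with number of nodes")
    m = [[0] * n for _ in range(n)]              # step (1): no links yet
    links = 0
    while links < n_links:                       # step (2)
        i, j = rng.sample(range(n), 2)           # step (2.1): two different nodes
        if m[i][j] == 0:                         # step (2.2): add the link if new
            m[i][j] = m[j][i] = 1
            links += 1
    while True:                                  # steps (3) and (4)
        isolated = [i for i in range(n) if sum(m[i]) == 0]
        if not isolated:
            return m
        i = isolated[0]
        # Assumption: j must already have a link, so that one can be broken.
        j = rng.choice([c for c in range(n) if c != i and sum(m[c]) > 0])
        old = rng.choice([c for c in range(n) if m[j][c] == 1])
        m[j][old] = m[old][j] = 0                # break an existing connection of j
        m[i][j] = m[j][i] = 1                    # connect the isolated node to j

matrix = random_network(30, 60, random.Random(0))
```

breaking one link for every link added keeps the total number of links fixed, which is what allows the random ensemble to be compared with the language network at equal n and ℓ.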
results
in figure 4 the distributions for all eight languages are superposed in log-log scale, showing their tendency to follow a straight line. in figure 5 the distribution for each individual language is shown with the best line fitted using the least squares method. the title of each plot gives the fitted equation. in table 1 we show the values of γ, $a/(\gamma-1)$, the total number of words and the $\chi^2$/dof of the best fit for each language. the value of $\chi^2$ (minimized by the least squares method) is calculated as:
$$\chi^2 = \sum_{k=1}^{k_{max}} \frac{\left(\log_{10} p_c(k) - \log_{10} p_{c,obs,k}\right)^2}{\epsilon_k^2}, \qquad (6)$$
where $p_{c,obs,k}$ is the observed value of the right-cumulative distribution of words at frequency k, $\epsilon_k$ is the error associated with $\log_{10} p_{c,obs,k}$, and the sum runs over all k for which $p_{obs,k}$ is different from zero. since $p_{c,obs,k}$ is an absolute frequency, the error associated with it is its square root and, therefore, one evaluates the logarithmic error as $\epsilon_k = \frac{1}{\ln(10)\sqrt{p_{c,obs,k}}}$.
figure 4: cumulative word frequency distribution for all texts (spanish, english, dutch, basque, greek, italian, portuguese, french); axes: log(freq) versus log(# words).
the results of the network analysis can be found in tables 2 and 3. in figure 6 we show, for the network constructed from the portuguese text, its degree distribution, the best line fitted to it, and the degree distribution of a random network with the same number of nodes and links (n = and ℓ = ). from this figure, one can clearly see the difference between the distribution obtained from a “real” network (power-law distribution) and the one obtained from a completely random network (poisson distribution). in a power-law distribution there is a non-negligible probability of observing nodes with a degree much bigger than average, while in a poisson distribution this probability drops to zero very fast.
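the fit of the right-cumulative distribution and the χ² defined above can be sketched as a weighted least squares fit of a straight line. this is a sketch under assumptions, not the authors’ code: the function names are hypothetical, the logarithm is taken in base 10, and each point is weighted by the inverse square of the logarithmic error given above.

```python
import math
from collections import Counter

def right_cumulative(word_counts):
    """k -> number of distinct words appearing at least k times, for populated k only."""
    p_obs = Counter(word_counts.values())
    pc, running = {}, 0
    for k in sorted(p_obs, reverse=True):  # accumulate from the largest k downwards
        running += p_obs[k]
        pc[k] = running
    return pc

def fit_power_law(pc):
    """Weighted straight-line fit of log10(pc) against log10(k)."""
    xs = [math.log10(k) for k in pc]
    ys = [math.log10(v) for v in pc.values()]
    # weight = 1 / eps^2 with eps = 1 / (ln(10) * sqrt(pc)); base 10 is an assumption
    ws = [(math.log(10) * math.sqrt(v)) ** 2 for v in pc.values()]
    s = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    delta = s * sxx - sx * sx
    slope = (s * sxy - sx * sy) / delta
    intercept = (sxx * sy - sx * sxy) / delta
    chi2 = sum(w * (y - intercept - slope * x) ** 2 for w, x, y in zip(ws, xs, ys))
    # slope = -(gamma - 1); intercept = log10(a / (gamma - 1))
    return {"gamma": 1.0 - slope, "intercept": intercept, "chi2": chi2, "dof": len(xs) - 2}

toy_pc = right_cumulative({"a": 3, "b": 1, "c": 1})
fit = fit_power_law({1: 1000, 10: 100, 100: 10})  # synthetic points on an exact line
```

the synthetic points lie exactly on a line of slope −1, so the sketch recovers γ = 2 with a vanishing χ². only frequencies k that actually occur in the text are kept, matching the restriction of the sum to populated values of k.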
note that $p_{c,obs,k}$ is the right-cumulative distribution, so if $p_{obs,k}$ is zero for a given value of k, $p_{c,obs,k}$ stays constant for all subsequent k until reaching a new k where $p_{obs,k}$ is not zero; these points would therefore not bring any new information to the analysis.
in all our equations log is the base-10 logarithm and ln is the natural (base e) logarithm.
figure 5: cumulative word frequency distribution for all texts with the best line fitted, one panel per language (french, greek, basque, dutch, english, italian, portuguese, spanish); axes: log(freq) versus log(# words); each panel title gives the fitted line log($p_c$) = m log(k) + b.
figure 6: degree distribution for the network obtained from the portuguese text compared with a random network (series: network, fit, random network); axes: log(freq) versus log(# nodes).
table 1: summary of the fits. columns: language, # words, $a/(\gamma-1)$, γ, $\chi^2$/dof; rows: spanish, english, dutch, basque, greek, italian, portuguese, french.
the properties calculated for the two types of networks (1 and 2) are very similar, but they differ significantly from the properties calculated for random networks. the average node distance in the random networks is, on average, around two units bigger than in the language networks, and it has a much smaller standard deviation in the random networks.
the second interesting difference between random and language networks is the average clustering coefficient, which is very close to zero in the case of random networks. in language networks, words tend to form clusters because of the language structure (they share context, grammatical function or semantic function), and this feature is reflected in the clustering coefficient calculated from eq. (5).

conclusions
here we present a mathematical analysis on linguistics: the word frequency effect for different translations of the same book (“le petit prince”) in eight different languages. the interest of these studies is that the occurrence of words in sentences reflects the language’s organization. apart from the word frequency distribution, we also performed analyses of different networks built from word associations in the text and compared these to random networks. as expected, word frequency presented a scaling law. the results suggest small differences in language volume for the same material. in particular, the γ parameter varied slightly across the different languages. moreover, our study shows how different languages tend to differ slightly in formal aspects.
comparison of word association networks with random networks makes evident the discrepancy between the random erdős-rényi model for graphs and real-world networks. a real network follows a specific design principle and therefore its nodes are connected in an organized way. this becomes evident from the clustering coefficient, which has a high value for networks 1 and 2 but is very close to zero for the random networks. another interesting difference between the real and random networks is the observation of the small-world effect in real networks: their average node distance is much smaller than in random networks.
table 2: network parameters for the different languages. n is the number of nodes and ℓ is the number of links, γ is the parameter obtained by fitting a power law to the degree distribution of the nodes, c̄ is the average clustering, d̄ is the average node distance. the parameters with a subscript r refer to the averages in the random networks, and the uncertainties shown are the standard deviations of the calculated averages (in the case of c̄r and d̄r, it is the standard deviation within the ensemble and not the average standard deviation within networks). columns: language, n, ℓ, γ, c̄, d̄, c̄r, d̄r; rows: spanish, english, dutch, basque, greek, italian, portuguese, french.
finally, one can conclude that these results show how different languages tend to differ slightly in formal aspects when the context is controlled. in particular, these results are of interest to other applied fields. bear in mind that, in recent decades, cognitive psychology has paid particular attention to the factors influencing the recognition of printed words, i.e., frequency, familiarity, word length and age of acquisition, among others, according to andrews ( ). some empirical questions remain about how to measure word frequency for different languages, with sources ranging from printed manuals to subtitles. even if more research is needed here, the comparison between these sources is beyond the scope of this study. here, we offer a comparison employing different translations of the same printed material in different languages. that allows us to compare differences in word frequency in the same context. regarding this topic, perea et al. ( ), yap et al. ( ) stated that other variables must have a role in frequency, such as the number of contexts in which a word appears.
that corresponds with the nature of our results. furthermore, some researchers (van heuven et al. ( )) proposed the zipf scale as a better standardized measure of word frequency, also giving examples of printed words with various zipf values. the authors also claimed that an alternative zipf scale presented in their work is better suited for research in word recognition. here, we follow the same logic. thus, these results might offer some insights into the role of the word frequency effect for printed words, but more research in this field is necessary.

acknowledgment
we would like to thank thomas irvin for his invaluable help and comments.

table 3: network parameters for the different languages. n is the number of nodes and ℓ is the number of links, γ is the parameter obtained by fitting a power law to the degree distribution of the nodes, c̄ is the average clustering, d̄ is the average node distance. the parameters with a subscript r refer to the averages in the random networks, and the uncertainties shown are the standard deviations of the calculated averages (in the case of c̄r and d̄r, it is the standard deviation within the ensemble and not the average standard deviation within networks). columns: language, n, ℓ, γ, c̄, d̄, c̄r, d̄r; rows: spanish, english, dutch, basque, greek, italian, portuguese, french.

references
amaral, l. a., scala, a., barthelemy, m. & stanley, h. e. ( ), ‘classes of small-world networks’, proc. natl. acad. sci. u.s.a. ( ), – .
andrews, s. ( ), ‘all about words: a lexicalist perspective on reading’, from inkmarks to ideas: current issues in lexical processing p. .
beckner, c., blythe, r., bybee, j., christiansen, m. h., croft, w., ellis, n. c., holland, j., ke, j., larsen-freeman, d. & schoenemann, t.
( ), ‘language is a complex adaptive system: position paper’, language learning (s ), – .
breland, h. m. ( ), ‘word frequency and word difficulty: a comparison of counts in four corpora’, psychological science-cambridge- , – .
cancho, r. f. & solé, r. v. ( ), ‘least effort and the origins of scaling in human language’, proceedings of the national academy of sciences ( ), – .
coltheart, m. ( ), ‘the mrc psycholinguistic database’, the quarterly journal of experimental psychology ( ), – .
crucitti, p., latora, v., marchiori, m. & rapisarda, a. ( ), ‘efficiency of scale-free networks: error and attack tolerance’, physica a: statistical mechanics and its applications , – .
davis, c. j. ( ), ‘n-watch: a program for deriving neighborhood size and other psycholinguistic statistics’, behavior research methods ( ), – .
dijkstra, e. ( ), ‘a note on two problems in connexion with graphs’, numerische mathematik ( ), – . url: http://dx.doi.org/ . /bf
dufau, s., duñabeitia, j. a., moret-tatay, c., mcgonigal, a., peeters, d., alario, f.-x., balota, d. a., brysbaert, m., carreiras, m., ferrand, l. et al. ( ), ‘smart phone, smart science: how the use of smartphones can revolutionize research in cognitive science’, plos one ( ), e .
erdős, p. and rényi, a. ( ), on the evolution of random graphs, in ‘publication of the mathematical institute of the hungarian academy of sciences’, , pp. – .
esteves, c. s., oliveira, c. r., moret-tatay, c., navarro-pardo, e., carli, g. a. d., silva, i. g., irigaray, t. q. & argimon, i. i. d. l. ( ), ‘phonemic and semantic verbal fluency tasks: normative data for elderly brazilians’, psicologia: reflexão e crítica ( ), – .
gamermann, d., montagud, a., jaime infante, r., triana, j., urchueguía, j. & fernández de córdoba, p. ( ), ‘pynetmet: python tools for efficient work with networks and metabolic models’, computational and mathematical biology ( ), – .
haykin, s.
( ), neural networks: a comprehensive foundation, st edn, prentice hall ptr, upper saddle river, nj, usa.
moreno-cid, a., moret-tatay, c., irigaray, t. q., argimon, i. i., murphy, m., szczerbinski, m., martínez-rubio, d., beneyto-arrojo, m. j., navarro-pardo, e. & fernández, p. ( ), ‘the role of age and emotional valence in word recognition: an ex-gaussian analysis’, studia psychologica ( ), – .
moret-tatay, c. & perea, m. ( a), ‘do serifs provide an advantage in the recognition of written words?’, journal of cognitive psychology ( ), – .
moret-tatay, c. & perea, m. ( b), ‘is the go/no-go lexical decision task preferable to the yes/no task with developing readers?’, journal of experimental child psychology ( ), – .
navarro-pardo, e., navarro-prados, a. b., gamermann, d. & moret-tatay, c. ( ), ‘differences between young and old university students on a lexical decision task: evidence through an ex-gaussian approach’, the journal of general psychology ( ), – .
pastor-satorras, r., vazquez, a. & vespignani, a. ( ), ‘dynamical and correlation properties of the internet’, phys. rev. lett. ( ), .
perea, m., comesaña, m., soares, a. p. & moret-tatay, c. ( ), ‘on the role of the upper part of words in lexical access: evidence with masked priming’, the quarterly journal of experimental psychology ( ), – .
perea, m., gatt, a., moret-tatay, c. & fabri, r. ( ), ‘are all semitic languages immune to letter transpositions? the case of maltese’, psychonomic bulletin & review ( ), – .
perea, m., moret-tatay, c. & carreiras, m. ( ), ‘facilitation versus inhibition in the masked priming same–different matching task’, the quarterly journal of experimental psychology ( ), – .
perea, m., moret-tatay, c. & gómez, p. ( ), ‘the effects of interletter spacing in visual-word recognition’, acta psychologica ( ), – .
perea, m., soares, a. p. & comesaña, m.
( ), ‘contextual diversity is a main determinant of word identification times in young readers’, journal of experimental child psychology ( ), – .
ravasz, e. & barabasi, a. l. ( ), ‘hierarchical organization in complex networks’, phys rev e stat nonlin soft matter phys ( pt ), .
rives, a. w. & galitski, t. ( ), ‘modular organization of cellular networks’, proc. natl. acad. sci. u.s.a. ( ), – .
sternberg, r. j. & powell, j. s. ( ), ‘comprehending verbal comprehension.’, american psychologist ( ), .
van heuven, w. j., mandera, p., keuleers, e. & brysbaert, m. ( ), ‘subtlex-uk: a new and improved word frequency database for british english’, the quarterly journal of experimental psychology ( ), – .
yap, m. j., tan, s. e., pexman, p. m. & hargreaves, i. s. ( ), ‘is more always better? effects of semantic richness on lexical decision, speeded pronunciation, and semantic classification’, psychonomic bulletin & review ( ), – .

social science data repositories in data deluge: a case study at icpsr workflow and practices

abstract:
design/methodology/approach: we conducted two focus group sessions and one individual interview with eight employees at the world’s largest social science data repository, the interuniversity consortium for political and social research (icpsr). by examining their current actions (activities regarding their work responsibilities) and it practices, we studied the barriers and challenges of archiving and curating qualitative data at icpsr.
purpose: due to the recent surge of interest in the age of the data deluge, the importance of researching data infrastructures is increasing. the open archival information system (oais) model has been widely adopted as a framework for creating and maintaining digital repositories.
considering that oais is a reference model that requires customization for actual practice, this study examines how the current practices in a data repository map to the oais environment and functional components.
findings: we observed that the oais model is robust and reliable in actual service processes for data curation and data archives. in addition, a data repository’s workflow resembles that of digital archives or even digital libraries. on the other hand, we find that: 1) the cost of preventing disclosure risk and 2) a lack of agreement on the standards of text data files are the most apparent obstacles for data curation professionals handling qualitative data; and 3) the maturation of data metrics seems to be a promising solution to several challenges in social science data sharing.
original value: we evaluated the gap between a research data repository’s current practices and the adoption of the oais model. we also identified answers to questions such as how the current technological infrastructure in a leading data repository such as icpsr supports its daily operations, what the ideal technologies in such data repositories would be, and what challenges accompany these ideal technologies. most importantly, we helped to prioritize challenges and barriers from the data curator’s perspective, and drew out implications for data sharing and reuse in the social sciences.

introduction
as the research paradigms in science disciplines become data-intensive and collaborative (hey, tansley, tolle, ), researchers are promoting data as the “infrastructure of science,” critical in forming “the basis for good scientific decisions, wise management and use of resources, and informed decision-making” (tenopir et al., ). although disciplinary cultural differences exist between the social sciences and the natural sciences, the former are changing to require greater access to data and more transparency (guest, ; elman & kapiszewski, ).
all of this calls for a strong emphasis on data depositing and sharing. despite the recent surge of interest in the age of the data deluge, managing digital resources inside a repository for the purposes of preservation and access is neither novel nor unique. since the late ’s, digital library communities have been designing and improving the concept of a trusted digital repository, which by its definition should possess key attributes such as “reliable”, “long-term access”, “managed resources” and for the “designated community”; all are recognized as critical requirements for data management and curation services (borgman et al., ). the open archival information system (hereafter: oais) has been a well-known and widely-adopted conceptual model for creating and maintaining a digital repository. oais was proposed two decades ago and, ever since, has become a consensus and a standard for “maintaining digital information over the long-term” (lavoie, , p. ). the oais model can be viewed at three different levels of granularity. the first level describes the external world with which oais interacts. the second level defines the internal workflow of oais, including six functional entities: ingesting, archive storage, data management, preservation planning, access, and administration (i.e., day-to-day operation). the third level defines the format of possible inputs to the oais services. considering the important status of oais in digital repositories, and the fact that oais is a conceptual reference model that requires customization or “translation” into actual practice or a service (vardigan & whiteman, ), it is important to closely examine the practices in data repositories and review how they adopt the oais model. so far, we have seen reports from data repository management teams documenting their adoption of oais for data curation services, but there are few third-party studies examining how data curation practices map to the oais model. 
therefore, the first research question in this preliminary study is: rq . what are the current practices in a data repository? in order to closely examine how data repository services support social science data sharing, it is necessary to gather information about how data professionals carry out current practices at a research data repository. we conducted a case study on the world’s largest social science data repository, the interuniversity consortium for political and social research (hereafter: icpsr). we further mapped the gathered information to the oais environment and oais’s suggested functional entities in order to examine the current practices with a scaffolding reference. we view the case study with icpsr as an opportunity to examine the support technologies in data repository services. therefore, our second research question is: rq . what are the current challenges of the underlying technologies at a data repository? what are the desired information technologies (its) perceived by employees to support their data repository services? in addition to rq and rq , we also report several interesting findings on the challenges and opportunities in social science data sharing-reuse cycles. we attempt to address the critical inquiry: what are the challenges or barriers encountered by data curation professionals when handling social science data? what general challenges do they see regarding social science data sharing? by investigating these research questions, we are able to evaluate the gap between current practices and the straightforward realization of the oais model. we are also able to identify whether the current technological infrastructure in a leading data repository such as icpsr is sufficient to support their work; if not, what their desired its would be; and the challenges of supporting any ideal technologies. 
most importantly, we can prioritize the identified challenges and barriers from the data curator’s perspective and obtain a holistic view of data sharing practices in the social sciences.

literature review

we take a funnel approach (i.e., from broader to narrower topics) to review related literature. first, we review the background of the increasing importance of data repositories for research data management in the data deluge age. then, in section . , we review articles related to the overall operation and workflow of data repositories, specifically the adoption of oais by data repositories. aligned with our focus on technical challenges, in . , we turn to the technical infrastructure and review its evolution in the social sciences.

. the data deluge and research data management

the need for research data management dates back to the e-research movement in the mid- s. in discussions of the e-research movement, rapidly increasing computational capabilities enable greater demand for data-driven scientific discoveries (e.g., griffin, ). the predecessors of e-research are cyber-infrastructure and e-science, terms that were coined in the early s to highlight the importance of information technology that supports scholarly activities. according to borgman ( ), the united states tends to use the term “cyber-infrastructure,” whereas asia, europe, australia, and other areas favor the term “e-science.” the prefix “e” in e-science is usually taken to stand for “electronic,” but can also be understood as “enable” or a concept of “enhancement” (p. ). plans, controls, and management are needed to face the “data deluge” and advance data scholarship.
in response to the popularity of e-research and data scholarship, since , the nsf has organized councils and digital scholarship workshops and produced several high-impact reports, including cyberinfrastructure vision for st century discovery and understanding infrastructure: dynamics, tensions, and design in . this series of efforts reflects the government’s view of research data management: the data deluge requires greater control of data management. later, in , the us government announced a manifesto of digital stewardship and preannounced a mandate in that all nsf applications should include a research data management plan. the nsf’s mandate on data management has also become a source for exploring how pis share and reuse their data. mischo, schlembach, & o’donnell ( ) analyzed , dmps from july to november at the university of illinois. mischo et al. found that the most common venues where pis preserve their datasets were personal websites ( . %), personal servers ( . %), local institutional repositories (e.g., ideals at uiuc, . %), and off-campus repositories, including disciplinary repositories ( . %) and other non-uiuc organizations ( %). across all , dmps, the authors counted the occurrences of named repositories mentioned by pis; arxiv, genbank, and nanohub are among the most frequently mentioned. however, mischo et al. did not find significant differences in storage venues when comparing funded grants to unfunded proposals. additionally, they found that nsf grant applicants underutilized disciplinary repositories. similarly, bishoff and johnston ( ) analyzed dmps included in nsf grant proposals at the university of minnesota and found a variety of data sharing strategies among pis. the nsf mandates signal the important role of data repositories.
because of the growing importance of the research repository in the data deluge age, it is imperative to examine its current state and potential challenges.

. the adoption of oais in data repositories

many data repositories in the archive communities have adopted the oais model. for example, as early as , icpsr published a series of articles and guidelines describing how it integrates the oais model into its work model. the outcome, called the “icpsr pipeline,” adopts the oais reference model in the context of social science research data, and is well-documented in “designing the future icpsr pipeline process” (gutmann et al., ) and “icpsr meets oais” (vardigan & whiteman, ). on the other hand, data repositories such as the uk data archive (ukda) at the university of essex and the national archives (tna) tested and reported on how their systems and processes complied with the oais reference model (beedham et al., ). such work helps to provide guidance for digital repositories and further promote a more cooperative environment among data repositories. aside from assessing compliance, there are existing studies that use oais as a foundational framework to examine data repositories’ practices. for example, yoon and tibbo ( ) conducted a content analysis on data submission package elements (sip as “submission information package” in the oais model), and examined submission forms and submission guidelines collected from data repositories in the social science domain.

note: the term “data deluge” was coined in the early s (e.g., hey & trefethen, ) in order to reflect the sheer volume and magnitude of research data in the digital age.

note: based on this preannouncement, all nsf grant applicants, on or after january , , are required to submit a two-page research data management plan describing how to share and manage their data. us federal funding agencies further expanded this mandate in by adding new data management and data-sharing requirements to grant applications.

.
technical infrastructure in social science

information communication technologies have enhanced academia with more possibilities and opportunities since the early s (bingham, ). in particular, improved technical infrastructure encourages research activities in social science in many ways, e.g., the increasing size and complexity of analyzable research data, the availability of easy-to-use social science analysis tools, and more channels and patterns for scholars to collaborate efficiently and widely. as revealed by curty’s interview with social scientists ( ), the technological infrastructure of data repositories is important for data sharing and re-use. however, the challenges underlying the technical infrastructure are not inevitable. lazer et al. ( ) claimed that the unavailability of easy-to-use social science analysis tools and the insufficient sharing of data impede the advance of computational social science. bingham ( ) identified three particular aspects of research conduct influenced by technical infrastructure: ) data collection and analysis, ) communication and collaboration, and ) storage and retrieval. based on bingham’s framework, researchers developed an integrated data collection and analysis platform in social sciences: common language resources and technology infrastructure (clarin), which is a research infrastructure embedding technical infrastructure to support researchers. this particular tool offers the social sciences and humanities research community advanced tools to discover, explore, exploit, annotate, analyze or combine data on language resources (krauwer & hinrichs, ). re-focusing on social scientists’ data sharing practices on an individual level, jeng et al. ( ) recently investigated researchers’ perceptions of technical infrastructure and reported four technical limitations that hinder data sharing, namely: platform availability, platform usability, facilities, and technical standards.
as for technical infrastructure concerning data storage and retrieval, fecher et al. ( ) summarized three sub- factors as architecture, usability, and management software. methodology . case study: icpsr icpsr was established in and is the world’s largest primary data archive of social science research. as of july , icpsr holds , studies, , datasets, and , files for download (icpsr, ). as we mentioned in the previous section, icpsr has adopted oais and represents its workflow as the “icpsr pipeline” (beecher, ). although these publications provide documentation for the adoption of oais, our examination of icpsr’s workflow and infrastructure is promising and legitimate for the following rationales. firstly, we are external information science researchers interested in social science data management and services. we provide a different and novel perspective on the data management issues at icpsr. secondly, we used a focus group as our research method so that we can collect in-depth data directly from icpsr practitioners, and we used a bottom-up approach to reconstruct data management and services in icpsr. the depth of data collected through focus groups cannot be matched by reading published articles. . research design and study protocol our focus group study uses participatory design and employs a special technique called visual narrative inquiry (bowler et al., ). the detailed execution of this research method has been creatively revised by mattern et al. ( ) and lyon et al. ( ; ) to enhance the engagement of a focus group. using a focus group approach in this case study draws upon participants’ experiences and encourages interaction among group participants. our study project has been reviewed by the institutional review board at the university of pittsburgh and meets all the necessary criteria for an exemption (irb#: pro ). 
the brief version of study protocol, as shown in table , begins with stage i: the study information was introduced and consent was obtained from the participants in the focus group. the participants were then invited to describe their backgrounds and explain how their backgrounds have led them to their job positions at icpsr. in stage ii: session of professional activities, each participant wrote down their professional activities (i.e., activities related to their day-to-day responsibilities at their institution) regarding data curation or collection development at the institution, one activity per sticky note. the participants then had a discussion among themselves and explained these activities to each other. next, participants worked on sorting these actions into clusters. participants were encouraged to leave their seats and go to the whiteboard, self-grouping their sticky notes. they were allowed to use magic markers as a visual aid or re-position the sticky notes however they see fit. in stage iii, participants were sent back to the table and asked, on another set of sticky notes, to write down the tools related to the sorted concepts on the whiteboard, such as specific software, online services, or homegrown programs. participants were then encouraged to describe imaginary or desired information technologies. in the final stage, participants were asked to elaborate about challenges and opportunities regarding data-sharing practices, and were additionally asked about icpsr’s professional activities. while the appendix lists all actual questions, here are some examples of them in stage iv: § please elaborate more about the differences when curating qualitative, mixed-method, and quantitative data, if any. (group a) § what are critical factors that may influence researchers’ willingness to share their data? (group a) § how do you determine the scope of icpsr’s collection? 
(group b) § does icpsr provide other services or support to further connect the data depositors and data reusers? (group b) table . stages in focus group sessions stages description i. warm-up • the mediators introduce the study information and acquire consent. • participants describe their background and explain how their backgrounds led them to their current job positions. ii. session of professional activities • each participant writes down their actions (professional activities related to their responsibilities at their institution) regarding data curation or collection development at the institution, one action per sticky note. • all participants leave the table and go to the whiteboard, self-grouping their sticky notes. participants are free to use magic markers as a visual aid or re-position the sticky notes. iii. underlying information technology activity- collecting current its and desired its • participants return to the table and, on another set of sticky notes, write down the tools related to the concepts on the whiteboard, such as certain software, online services, or homegrown programs. • participants describe desired information technologies. iv. semi-structured interview • each participant elaborates more about their actions in curation, acquisition, and collection development. note: the detailed procedure is attached in the appendix . sampling and data collection our study consists of two focus groups and one individual interview, all of which were conducted in june onsite at the icpsr headquarters in ann arbor, michigan. in total, eight icpsr employees participated in the study, and seven out of eight were directors or senior managers (at least years of experience). the sampling method in this study is expert and convenience sampling, targeting data curation professionals and other professionals who work in icpsr. 
to contact such a specific target population, the invitations were sent according to two categories (“administration” and “collection development & delivery”) on the staff directory webpage or were referred by icpsr employees. table summarizes work experience (in years) and general responsibilities of our participants. group a’s session lasted about minutes, group b’s session lasted about minutes, and the individual interview lasted about minutes. table . participant background groups id year of experience general responsibilities in icpsr a p > years curation p > years curation, data processing p < years curation, data processing b p > years acquisition, administration p > years customer relations, administration p > years curation, administration p > years administration * p > years administration note: * individual interview was conducted. the topics discussed in these focus groups and the interview were as follows. group a — “curation services.” the emphasis of group a was on data curation services. participants include p to p . figures a- d illustrate a more detailed breakdown of our focus group procedure. in stage ii (see the description in table ), each participant first wrote on their individual sticky notes and attached them to the whiteboard in the conference room (figure a). individual participants were welcome to write additional notes after a discussion with the other participants in their group. participants were also invited to take advantage of visual aids to elaborate about their actions (figure b). in stage iii, participants added underlying it and desired it on the whiteboard using yellow rectangular sticky notes (figure c). in figure d, participants continued adding different visual aids, such as the section that reads “openicpsr” with a dashed line to the final outcome. figure . group a activity breakdown group b — “collection development.” the emphasis of group b was on collection development at icpsr. 
all participants in group b are directors or managers, and their daily responsibilities extend beyond collection development, including acquisition, delivery, supervising, customer relations, outreach, administration, and preservation planning. participants include p to p in table . a more detailed breakdown can be found in figures a- d. firstly, all group b participants attached their notes to the whiteboard without any sorting or classification (figure a). later, the participants grouped similar actions into columns (figure b) and named each cluster themselves (figure c). note that the focus group mediators did not directly participate or interfere with participants’ sorting process. finally, as shown in figure d, the participants added their it practice notes onto the white board. figure . group b activity breakdown. interview — in addition to the two focus groups, we interviewed an experienced director (p in table ) to add valuable perspective and to clarify some points regarding our research questions. questions include: ) a follow-up on how curation professionals communicate with data depositors about potential disclosure risk; ) factors that can influence a researcher’s willingness to share data with icpsr; ) the potential challenges and opportunities for social scientists when sharing their qualitative data. after collecting data from the research sites, we digitalized all the sticky notes and entered data into a spreadsheet-style table. specifically, the workflows or clusters created by participants in both focus groups were digitalized by a digital camera. these digital images allowed us to re-create and analyze the focus group results. all conversations that happened during the focus groups and the interview were recorded and transcribed. participants’ quotations on transcription files are managed using atlas.ti, a qualitative data analysis software. 
findings in this section, we report interesting findings observed in the data collected from the two focus group studies (with participants p -p ) and in the individual interview with p , including direct quotations. the results are divided into three sections. in . , we discuss the overall practices (data curation actions) at icpsr, while in . and . we answer rq by reporting the current it practices and desired it from data curation professionals’ perspectives. finally, in . , we report our findings and observations regarding challenges and opportunities in social science data sharing. . data curation workflow at icpsr as we collected participants’ actions, results presented by the participants in group a resemble the icpsr pipeline (figure d). however, the results presented by the participants in group b were mostly bottom-up action clusters (figure d), which have little similarity with the oais structure. based on the clusters of sticky notes, we further integrate the participant-created action clusters with the oais model, presented in figure . in group a’s reported actions, after receiving an sip (submission information package) from the data depositors, data processors perform a series of activities to prepare the data for documentation, such as “building metadata” and “creating codebook.” the various actions in the data processing stage seem to be interrelated and not necessarily sequential, as the participant p expressed, “once we get everything together, then we start to put all these pieces together and they're all interrelated. you don't have to do one before the other.” figure . participant-reported activities and oais components at icpsr unlike group a’s use of a workflow to explain their actions, group b sorts their actions (shown in yellow rectangles in figure ) into eight clusters: curation, new products, acquisition, outreach, evaluation, management, customer services, and training & education. 
we found that group b’s action clusters overlap with other oais functional components except for data processing and metadata building. however, we find that only a portion of the action clusters can be perfectly covered by a single oais functional entity. for example, in figure , the actions in “ingest,” “archival storage” and “data management” overlap, suggesting that they require support from multiple entities. this is exactly the purpose of viewing oais as a reference: although the oais model provides a high-level reference guideline, data archives or repositories should expect to work out the details and customize the model to reflect their own needs. . current it practices table enumerates the reported technologies based on associated action clusters. we find that participants mentioned more it tools related to “data processing” and their effort to develop “new products”. we also see that office software (such as word processors, text editors, and spreadsheets) is the most common type of tool. in addition, participants reported that they prefer linux-based operating systems in their work environment, and that most of their work is done under linux: “we do our work in the linux environment but we have windows environment that we can also work in as well” (p ); “(we) log on pc but using linux” (p ). table .
current information technologies reported by participants

action clusters | current it | participants
acquisitions | metadata editor, lead management tool, deposit viewer, deposit form, spreadsheet, email | p , p , p
web team | bibliographical database (bibliofake), pdf applications | p , p
processing | word processor, spreadsheet, gis scripts, spss, sas, stata, r, text editor, linux, windows, study management tool, deposit viewer, metadata editor, pdf applications, web browser, unix, hermes, html | p , p , p
new products | online questionnaire software, usability testing tool, web-hosted service for webinars, responsive design tools, email, unix, html, xml, word processor, funding database, lead management tool, deposit form, email | p , p , p , p
outreach | web-hosted service for conferences, presentation software, google analytics, word processor | p , p , p
evaluation | text visualization tool, google analytics, data mining tools, data visualization tools, online questionnaire software | p
management | university financing reporting system, spreadsheet, word processor | p , p , p
customer service | email tracking system, web-hosted service for webinars, email, social media, online video | p , p , p
training and education | word processor, web-hosted service for webinars, email extension (boomerang for gmail) | p , p , p , p

according to group a (in which participants used figure to explain the internal tools), we find that core actions in the data processing cluster mostly rely on internally-developed applications, which include:
• hermes (a file-converting tool that can convert data files from one format to another, such as from spss to csv and sas);
• deposit form (creating the package after data depositors or pis finish the deposit);
• deposit viewer (allowing curators at icpsr to view metadata about deposits);
• metadata editor (“the primary environment for creating, revising, and managing descriptive and administrative metadata about a study” [beecher, , para ]); and
• bibliofake (a database created for storing “bibliographic information and exports it into a format in a system that can use to render that information on the website” [p ]).

figure . the internal workflow of processing data package at icpsr (provided by p )

we concluded that there is no single integrated platform that handles multiple action clusters simultaneously. moreover, some actions, such as processing, involve more tools and thus are more complex than others. as shown in figure , p wrote down a couple of tools that she uses during data-package processing. participant p elaborated on what p wrote by saying: “i'm mostly surprised these are all the stuff that we're doing” (p ).

figure . a data curator’s toolbox for processing data packages at icpsr (provided by p )

. desired information technologies for data curation professionals

as shown in table , group a precisely describes the tools and technologies needed to address their daily challenges. for example, they would like to have technologies that can automatically extract all of the metadata from an input dataset; as one participant mentioned, “wouldn't it be great if there was a form where you uploaded a file and that system would automatically extract all of the metadata for that file” (p ). they also desire tools that can help “flag” possibly sensitive or harmful content, and technologies that can automatically discover possible identifier combinations. almost all participants in group a mentioned the disclosure check: “you always have to decide, ‘is it harmful? what’s the level of harm that's going to happen and what's the level of sensitivity?’” (p ). “[s]ometimes you miss human sense of what kind of information is dangerous. i know there are tools for disclosure risk but they are not efficient and they cannot identify information [that] we actually identify as disclosure risk” (p ).

table . current challenges and ideal it solutions reported by participants

action clusters | current challenges | ideal it solutions | participant
processing | metadata are manually extracted | technologies that can automatically extract most of the metadata from an input dataset | p
processing | disclosure risk or sensitive content is manually checked | technologies that can help ‘flag’ possibly sensitive or harmful content; automatically find possible combinations of identifiers | p & p
processing | quality control | tools that can speed up the process of ensuring data quality by checking for file crashes and errors and by executing datasets and scripts | p & p
administration | hard to estimate “cost” for every single case | technologies that can estimate needed resources before assigning labor and money | p
management | hard to synchronize with other departments in the institution | one united and transparent system that can instantly and actively inform or facilitate communication and synchronization between internal departments or separate archives; that can reduce time between contacts | p & p
training and education | -- | a platform that can enhance user engagement and allows customization for training purposes | p

on the other hand, since all the participants in group b are in management positions, their descriptions of ideal technologies are less specific but more comprehensive than those provided by group a. for example, they desire automated tools to estimate the cost of each study, and systems that can unite multiple departments. participant p called for tools that can “make things connect and interact across because now we have all of these silos, systems with the university (u of michigan) with icpsr.” she also hoped that such a one-stop-shopping system could be developed soon: “…the hope is that over the next few years, we’ll be putting in a new enterprise system, securities and if this will connect some of those things better or just take one place that you put everything and go in and grab what you need” (p ).
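the disclosure check that group a describes, in particular the wish for technologies that automatically discover possible identifier combinations, can be illustrated with a small frequency count. the python sketch below is not a tool used at icpsr; it is a minimal, k-anonymity-style illustration under stated assumptions. the function name `flag_identifier_combinations`, the field names, the toy records, and the threshold `k` are all hypothetical. for each combination of candidate fields, it counts how many records fall into groups smaller than `k`.

```python
from collections import Counter
from itertools import combinations


def flag_identifier_combinations(records, candidate_fields, k=2, max_size=2):
    """Flag field combinations under which some records sit in groups smaller than k.

    Returns a list of (combination, number_of_records_at_risk) tuples.
    This is an illustrative heuristic, not a complete disclosure-risk review.
    """
    flagged = []
    for size in range(1, max_size + 1):
        for combo in combinations(candidate_fields, size):
            # Count how many records share each combination of values.
            group_sizes = Counter(
                tuple(record.get(field) for field in combo) for record in records
            )
            # Records in groups smaller than k are easier to re-identify.
            at_risk = sum(n for n in group_sizes.values() if n < k)
            if at_risk:
                flagged.append((combo, at_risk))
    return flagged


# Hypothetical toy records; none of these fields come from the study.
records = [
    {"region": "north", "birth_year": 1980, "occupation": "teacher"},
    {"region": "north", "birth_year": 1980, "occupation": "nurse"},
    {"region": "north", "birth_year": 1975, "occupation": "teacher"},
    {"region": "south", "birth_year": 1975, "occupation": "nurse"},
]

for combo, at_risk in flag_identifier_combinations(
    records, ["region", "birth_year", "occupation"]
):
    print(combo, at_risk)
```

in this toy data, neither birth year nor occupation isolates anyone on its own, yet together they single out every record. as the participants stress, such a count can only flag candidates for review; judging which information is actually harmful remains a human task.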
challenges and opportunities of social science data sharing in this section, we discuss the challenges and opportunities regarding social science data sharing. table lists out the challenges and opportunities that we identified through the focus group sessions (p -p ) and the interview with p . challenges and opportunities occur at various levels, ranging from individual researchers, their discipline communities and data infrastructures, to the national level. we would like to note that a cross-level investigation is needed because a challenge that exists in one level may be solvable by an opportunity existing in another level. table . challenges and opportunities toward social science data sharing in different levels challenges opportunities individual level § social scientists’ individual concerns about data sharing: o pi’s confidentiality concerns (p , p ) o pi’s confidence of data sharing (p , p ) § lack of reward model (e.g., data are not recognized as research products) (p ) -- community level § lack of agreement on the standard of text data files in qualitative studies (p , p ) § low awareness of data sharing in social sciences (p , p ) § data metrics (p , p , p , p , p ) infrastructural level § labor-intensive process of data curation, especially for qualitative data (p , p , p , p ) § hard to fulfill various community needs at once (p , p , p ) § active curation (p ) § enclaves and embargo settings (p , p , p ) national level can be both challenges or opportunities: § regulations and mandates on data sharing at the national level (p ) . . challenges we explain in detail each of the six identified challenges in social science data sharing from data curators’ perspectives. labor-intensive process of data curation, especially for qualitative data. preparing qualitative data for sharing requires extra time and effort. for data curation professionals, open-ended responses can be text-heavy, and the processing cost for time and labor is hard to estimate. 
for example, participants p and p had a conversation and described the effort of processing qualitative responses, “if you have to read through , responses” (p ); “sometimes they mention the names, other people name their names or the exact date of something happened, that's the information we don't want them to (reveal)” (p ). lack of agreement on the standard of text data files. participants also suggested that it is necessary to adopt and inform data depositors about sustainable digital file formats and standard metadata for qualitative data. regarding qualitative data curation, icpsr widely accepts a series of text-based files, whereas the pdf is an exception. “we have a very good handle on that where we put it into an ascii text file or set ups with qualitative stuff. it's not as cut and dried to use word as a proprietary format, to use xml, or pdfs, or if you put it in a pdf, is it searchable? (in a rhetorical tone)” (p ). the designated community problem: difficult to fulfill various community needs at once. data curators often face the designated community problem—that is, they find it difficult to clearly identify the target users of a data repository. for example, p expressed that from time to time they would ask themselves about who the designated community of icpsr is: “there's customers (research institutions who pay the annual membership fee to icpsr) and there's users (data reusers), and then people who use our data are often not the people who pay for it” (p ). therefore, the team may need to use additional labor and time to repeatedly review potential stakeholders. social scientists’ individual concerns about data sharing. several observations made by the data curators can help explain why a social scientist might refuse to share their data. 
at the top of the list, it seems that social scientists are most worried about “sensitive data” and have “confidentiality concerns”: “(one barrier) is fear of confidentiality or privacy issues, feeling like they have some sensitive information or data that they won't be able to release and so but they don't know about these other channels that are available” (p ). in addition, qualitative approaches usually deeply involve the researchers’ worldviews; such subjectivity might influence how qualitative researchers view and value their research data, and thus may sometimes result in resistance to archiving and sharing their data. participant p , speaking from an administrator’s perspective at icpsr, shared his thoughts on qualitative data sharing and still believes qualitative data sharing is possible: “data sharing tends to be weakest in qualitative fields because qualitative researchers many of them for various ideological and ontological reasons believe they can't share their data, but it's not true that that's not universal” (p ). awareness of data sharing is increasing but still low. the majority of faculty and graduate students in social science fields neither share data nor are aware of its importance. participant p related this phenomenon to the low awareness of perceived benefits: “not everyone or even not the majority maybe know that publishing data or putting your data into a repository is a good thing” (p ). the lack of a reward model can be another critical hindrance to researchers’ data sharing in general. the same participant compared data products with research articles: “[y]ou've probably gone through the tenure process where your reviewers, if you publish a data collection, or let's say you publish an article, but you also spent… a lot of time publishing a data product. that data product is used by thousands of people around the world.
that article maybe was read by ten people but it was in science or nature, that would be a tenure, the data product, from what i understand, doesn't get nearly the eyeballs or attention” (p ). . . opportunities despite the challenges, we also observe four encouraging opportunities for social science data sharing from data curators’ perspectives. among these opportunities, data metrics were on the top of the list and were mentioned by participants in both focus groups. secure dissemination services, such as an enclave policy. several participants (p , p , and p ) mentioned the enclave policy at icpsr. “we do have a restricted data use policy. people can apply and receive the data from our secure downloads if they can have it or if it's just really restricted, we can put it in a physical enclave or we have a digital enclave where people can log into it and only use the data there” (p ). research data infrastructure also pays attention to potential disclosure risks, and data repositories such as icpsr often offer secure dissemination services. such security mechanisms are an opportunity to address an individual’s confidentiality concerns, mentioned above. the maturity of data metrics. despite imperfections, citation-based bibliometric methods have been widely used to evaluate scholars for promotion, tenure, hiring, or other recognizing mechanisms (borgman, ). however, data citation or data publication is not a common recognizing mechanism in academia. in our study, participants across both focus group sessions mentioned the lack of recognition of data citation repeatedly: “it's funny that you look at the citation or reference of a book or a journal article and that's very well established in research and academia but this you can't say nearly the same for our data collection. it's not yet considered a first rate research product and as a result it affects other aspects of the research life cycle” (p ). 
although nsf ( ) has recognized data as a research product since , it is taking time for academia to form an agreement to adopt data publications as research products. to encourage data sharing in social sciences, the community can consider data sharing a kind of academic contribution by adopting data metrics. p in group b expressed her positive attitude about the connection between providing data metric services and a pi’s willingness to share data at icpsr: “… individual pi, they might be excited to see downloads and citations and search…they can say, look at how much impact we have had… [b]ut again it's all still relatively new” (p ). call for an “active curation.” to speed up the process of data curation, participant p mentioned the concept of active curation, a new model of accomplishing data curation piece by piece (myers et al., ). the traditional curation model usually requires everything to be available before proceeding to the next step, whereas active curation is an incremental model where metadata and elements can be added over time: “that's where my wishes came from, reducing the time it takes to get data in the door, supporting active curation, so maybe we can get the data in before they have to actually deposit it or let others use it, but if we can help them along the way” (p ). this opportunity not only reduces curation time, but also ultimately allows pis to proactively update their datasets. this is beneficial for pis who are hesitant to share data because they are afraid that errors or mistakes in their data will be pointed out. call for a national policy. participant p mentioned the uk, which has national policies that encourage uk researchers to submit datasets to the national archives: “yeah, and many other countries like uk, there is requirement that people deposit their data in a particular place” (p ). 
there is no nation-wide data sharing infrastructure as of in the us, and there is no universal guideline for selecting a data archiving platform. the existence of a national policy can simplify pis’ effort to select a data archiving platform, but it would be challenging to build supporting infrastructure for such a policy. discussion and implications the adoption of oais. the curation and collection development practices that we collected greatly resemble the icpsr pipeline, which adopts the oais model as its high-level framework. we observe that although data infrastructures such as icpsr are free to depart from the oais model, their practices still strongly resemble oais at large. this observation may indicate two things: ) the oais model is robust and reliable in actual service processes for digital curation and digital archives; ) a data repository’s workflow resembles that of digital archives or even digital libraries. however, even though we observe that icpsr’s workflow and practices resemble the oais model, the institution also revises the model to meet its own needs. for example, some activities require interaction between multiple functional components; we observe overlapping areas between “ingest,” “data management,” and “archival storage” as well as between “administration” and “access”. this finding may reflect the employees’ actual skillsets or work allocations in a data repository, and thus can serve as a reference for other data repositories. current it practices are specific and their coherence needs to be improved. as for current it practices, we find that icpsr’s main actions in data processing are handled by internally-developed tools, which is consistent with the observation made by jeng and lyon ( ). that is, social science projects tend to require a unique set of it functionalities, and thus it is common to develop customized tools for a specific task rather than using general-purpose tools for multiple tasks.
current it challenges include data disclosure checks and the coherence problem. for disclosure management, a more intelligent tool is desired, which can help improve efficiency and the decision-making process by providing additional information, such as highlighting possibly risky texts. addressing the coherence problem requires a better platform on which people can work together smoothly. however, we did not ask participants to elaborate on the possible functionalities and appearance of the desired it, so a future participatory study is needed to capture more details. gaps in data-sharing practices in social sciences at scale. the aforementioned discussion reveals several challenges as well as opportunities. again, although a particular challenge exists on one level (e.g., pis’ concerns about data sharing at the individual level), it may be resolvable by an opportunity existing on another level (e.g., the maturity of data metrics at the community level). data curation remains challenging to scale due to privacy concerns and its labor-intensive process. to resolve this scaling issue and handle big data in social sciences, researchers require better and automated tools to help detect disclosure risks or perform disclosure checks. in addition, consistent with prior work (jeng, he, and oh, ), data curators also express their worries about the low awareness of data sharing in social sciences. however, it is unclear what the root cause of this is, given that every stakeholder appears to support data sharing. as a bottom-up approach, we suggest that it might be helpful to expose early-career social scientists (i.e., senior graduate students, post-doctoral researchers, and assistant professors) to training on research integrity, data transparency, and the spirit of open research.
conclusion through two focus group sessions and one individual interview with eight icpsr employees, we evaluated the gap between icpsr’s current practices, it practices, and its adoption of the oais model. we also revealed the current its which support data curation professionals’ daily operations, the ideal technologies these professionals desire, and the challenges with these ideal technologies. most importantly, we helped to prioritize barriers from data curators’ perspectives and we contributed implications about data sharing and reuse in social sciences. based on participants’ points of view, several challenges and opportunities regarding data sharing in social sciences are also observed. our reported findings reveal several challenges (such as data ownership and confidentiality concerns); however, to reiterate, a particular challenge may exist on one level (e.g., pis’ concerns about data sharing at the individual level), and be resolvable by an opportunity existing on another level (e.g., the maturity of data metrics at the community level). data sharing and curation in social sciences remain challenging to scale due to privacy concerns and a labor-intensive process, especially with regard to qualitative data sharing. better and automated tools would be required to help detect disclosure risks or perform disclosure checks. this case study on icpsr might be limited because the study only focused on one repository and participants were self-selected. in addition, interviewing more participants with data-curation-related responsibilities (e.g., the web team, the it team, or metadata librarians) would yield a more robust outcome and present a more holistic workflow at a data repository. one direction for future work is to compare our results with related work based on investigations of social scientists’ data-sharing and reuse practices (e.g., jeng, he, and oh, ; yoon, ; curty, ).
a cross-level (i.e., individual, institution, community, and infrastructure) triangulation is exceptionally needed for capturing the whole picture of data sharing and reuse practices in social sciences. another future direction is to compile a list of design principles for improving the design of a data curation system, based on the collected it practices and ideal technologies in this study. acknowledgements the authors thank the ifellowship, guided by the committee on coherence at scale (coc) for higher education, sponsored by the council on library and information resources (clir) and andrew w. mellon foundations; as well as beta-phi-mu honor society, which provided research funding for this project. this study is also partially supported by the project titled research on knowledge organization and service innovation in the big data environments funded by the national natural science foundation of china (no. ). the authors also thank drs. nora mattern, liz lyon, sheila corrall, jian qin, jung sun oh, and stephen griffin for their invaluable comments and suggestions on this research project. last but not least, the authors thank all participants and people who helped facilitate the field study at icpsr for their valuable input and assistance. references beecher, b. ( , november ). the icpsr pipeline process. retrieved october , , from http://techaticpsr.blogspot.com/ / /icpsr-pipeline-process.html beedham h, missen j, palmer m, ruusalepp r ( ) assessment of ukda and tna compliance with oais and mets standards. joint information systems committee (jisc), united kingdom, http://data-archive.ac.uk/media/ /oaismets_report.pdf. bingham, j. l. ( ). information technology and the conduct of research. bulletin of the medical library association, ( ), . bishoff, c., & johnston, l. ( ). approaches to data sharing: an analysis of nsf data management plans from a large research university. journal of librarianship and scholarly communication, ( ), ep . bohémier, k. 
a., atwood, t., kuehn, a., & qin, j. ( ). a content analysis of institutional data policies. in proceedings of the th annual international acm/ieee joint conference on digital libraries (pp. - ). acm. borgman, c. l., wallis, j. c., mayernik, m. s., & pepe, a. ( , june). drowning in data: digital library architecture to support scientific use of embedded sensor networks. in proceedings of the th acm/ieee-cs joint conference on digital libraries (pp. - ). acm. bowler, l., knobel, c., & mattern, e. ( ). from cyberbullying to well-being: a narrative-based participatory approach to values-oriented design for social media. journal of the association for information science and technology, ( ), - . curty, r, g. ( ). actors influencing research data reuse in the social sciences: an exploratory study. international journal of digital curation (ijdc), ( ): - . elman, c., & kapiszewski, d ( ). a guide to sharing qualitative data. center for qualitative and multi method inquiry (cqmi), syracuse university. fecher, b., friesike, s., & hebing, m. ( ). what drives academic data sharing?. plos one, ( ), e . griffin s. ( ). libraries in the digital age: technologies, innovation, shared resources and new responsibilities, chapter in “communication and technology”, volume of the series “handbook of communication science”, ed. by cantoni, l., danowski, j., de gruyter mouton. guest, g., namey, e. e., & mitchell, m. l. ( ). collecting qualitative data: a field manual for applied research. sage. gutmann, m. p., evans, b., mitchell, d., & schürer, k. ( ). the data archive technologies alliance: looking towards a common future. in iassist conference. hey, t., tansley, s., & tolle, k. (eds.). ( ). the fourth paradigm; data-intensive scientific discovery. redmond, washington: microsoft research. icpsr. ( ). size of icpsr's holdings. retrieved october , , from https://www.icpsr.umich.edu/icpsrweb/content/about/history/ icpsr. icpsr: a case study in repository management. 
retrieved from https://www.icpsr.umich.edu/icpsrweb/content/datamanagement/lifecycle/ingest/enhance.html jeng, w., & lyon, l. ( ). a report of data-intensive capability, institutional support, and data management practices in social sciences. international journal of digital curation (ijdc), ( ): - . jeng, w., he, d., & oh, j. ( ). toward a conceptual framework for data sharing practices in social sciences: a profile approach. in the proceedings of the asis&t annual meeting. kim, y. ( ). institutional and individual influences on scientists’ data sharing behaviors. unpublished dissertation. syracuse university. krauwer, s., & hinrichs, e. ( ). the clarin research infrastructure: resources and tools for e-humanities scholars. in proceedings of the ninth international conference on language resources and evaluation (lrec- ) (pp. - ). european language resources association (elra). lavoie, b. f. ( ). the open archival information system reference model: introductory guide. microform & imaging review, ( ), - . lazer, d., pentland, a. s., adamic, l., aral, s., barabasi, a. l., brewer, d., ... & jebara, t. ( ). life in the network: the coming age of computational social science. science (new york, ny), ( ), . lyon, l., jeng, w., & mattern, e. (forthcoming). research transparency: a preliminary study of disciplinary conceptualisation, drivers, tools and support services. oclc ( ). trusted digital repositories: attributes and responsibilities. retrieved from https://www.oclc.org/content/dam/research/activities/trustedrep/repositories.pdf mattern, e., jeng, w., he, d., lyon, l., & brenner, a. ( ). using participatory design and visual narrative inquiry to investigate researchers’ data challenges and recommendations for library research data services. program: electronic library and information systems. ( ): - . mischo, w. h., schlembach, m. c., & o’donnell, m. n. ( ). an analysis of data management plans in university of illinois national science foundation grant proposals.
journal of escience librarianship, ( ), . myers, j., hedstrom, m., akmon, d., payette, s., plale, b. a., kouper, i., ... & kumar, p. ( ). towards sustainable curation and preservation: the sead project's data services approach. in e-science (e-science), ieee th international conference on (pp. - ). national science foundation (nsf). ( , january). national science foundation’s merit review criteria: review and revisions. retrieved october , , from https://nsf.gov/pubs/policydocs/pappguide/nsf /gpg_sigchanges.jsp tenopir, c., allard, s., douglass, k., aydinoglu, a. u., wu, l., read, e., … frame, m. ( ). data sharing by scientists: practices and perceptions. plos one, ( ). vardigan, m., & whiteman, c. ( ). icpsr meets oais: applying the oais reference model to the social science archive context. archival science, ( ), - . yoon, a., & tibbo, h. ( ). examination of data deposit practices in repositories with the oais model. iassist quarterly, ( ). yoon, a. ( ). data reusers' trust development. journal of the association for information science and technology. appendix. focus group protocol group a (data curation professionals): minutes time activity mediator actions question prompts : - : review information and consent distribute introduction script obtain consents on: § proceeding with the focus group § use of recorders, and § data will be shared thank you for your participation. i believe your input will be valuable to this research and in helping grow all of our professional practice. approximate length of interview: minutes, two group activities and three major questions : - : warming up mediator actions § set timer § set recorder taking note: § education background § career history § years of experience § primary activities please take us back through a little history in your career that brought you to this current position. also, we would like to know more about your current work at icpsr. prompts: § how long have you been involved in your current job?
(in what year did you become involved?) § what primary tasks does your job involve? : - : concept construction distribute post-its (different colors) process: individual write post-its stick to white board sort cluster take a picture distribute easel pad take a picture distribute post-its (yellow post-its) take a picture question : what are your activities as a curation professional to support data curation? prompt: before/after data submission process: individual write post-its → stick to white board → sort → draw cluster → ask participants if there is anything left. question : now we have n clusters, could you explain the relationships among the activities? question : what are the tools that you use for your actions in curation? prompts: § computer equipment § software § online services § internal toolkits? question : can you think of any desired tools or technology (tools may not exist) which can facilitate your actions at icpsr? (talking only, do not distribute sticky notes) : - : questions about qualitative data curation -- question a: have you ever curated qualitative data? if yes, jump to b. if no, have you heard about your colleagues or others in icpsr curating qualitative data? do you have any observations? question b: please tell us about the difference when curating qualitative, mixed method, and quantitative data, if any. is there any special case or example that you would like to share? question : based on your observations and experience as curation professionals in icpsr, what are the critical factors that may influence a pi’s willingness to share his/her data? prompts: § has a pi ever told you about, or have you heard of, factors that could influence a pi’s willingness? § are they from: § individual incentives § research culture § institution : - : debriefing -- suggestions about research instrument? was anything unclear?
group b (collection development professionals): minutes time activity mediator actions question prompts : - : review information and consent distribute introduction script obtain consents on: § proceeding with the focus group § use of recorders, and § data will be shared thank you for your participation. i believe your input will be valuable to this research and in helping grow all of our professional practice. approximate length of interview: minutes, two group activities and three major questions : - : warming up mediator actions § set timer § set recorder taking note: § education background § career history § years of experience § primary activities please take us back through a little history in your career that brought you to this current position. also, we would like to know more about your current work at icpsr. prompts: how long have you been involved in your current job? (in what year did you become involved?) what primary tasks does your job involve? : - : concept construction distribute post-its (different colors) process: individual write post-its stick to white board sort cluster take a picture distribute easel pad take a picture distribute post-its (yellow post-its) take a picture question : what are your responsibilities in supporting collection development and delivery in icpsr? prompt: before/after data submission process: individual write post-its → stick to white board → sort → draw cluster → ask participants to clarify if there is any sticky note unclassified. question : are there any tools that you use? prompts: computer equipment software online services internal toolkits? (yellow post-its) question : can you think of any desired tools (tools may not exist) or technology which can facilitate your actions at icpsr? (talking only) : - : questions about collection development and vision now we have a couple of questions related to collection development, collection delivery, management, and marketing topics in icpsr.
question : how do you determine the scope of icpsr’s collection? we read about icpsr’s collection development policy, we read about the high-priority areas including sexual orientation, social media, immigration, and so on. how does icpsr decide which areas should be given priority? prompts: § are these decisions from icpsr’s internal decision? § members’ opinions or feedback? § recent research hot topics (recent publications)? § or community or specific researchers’ demands? § how does icpsr decide to add a new interest? question : this question is related to appraisal standards in icpsr. please tell us about how icpsr applies the selection and appraisal criteria for data from mixed-method or qualitative studies. are they different from those for quantitative studies? is there any special case or example that you would like to share? prompts: when will data be referred to the qdr? question : this question is about openicpsr. given the differences between openicpsr and icpsr, please share your experience with us about how icpsr handles or manages these two different collections. is openicpsr within the scope of icpsr? prompts: § do icpsr members mention anything about icpsr? (their experience with openicpsr?) § what is your observation? § is there any plan for further promoting openicpsr to icpsr members? question : currently icpsr supports search interface and track utilization for data sharers and reusers. does icpsr provide other services or support to further connect the data depositors and reusers? : - : debriefing -- suggestions about research instrument? was anything unclear?
received / / review began / / review ended / / published / / © copyright quinn et al. this is an open access article distributed under the terms of the creative commons attribution license cc-by . ., which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
curated collections for educators: five key papers on evaluating digital scholarship
antonia quinn, teresa chan, christopher sampson, catherine grossman, christine butts, john casey, holly caretta-weyer, michael gottlieb
department of emergency medicine, suny downstate college of medicine; faculty of health sciences, department of medicine, division of emergency medicine, mcmaster university; emergency medicine, university of missouri columbia; pulmonary and critical care medicine, virginia commonwealth university health systems; section of emergency medicine, louisiana state university health sciences; department of emergency medicine, ohiohealth doctors hospital; department of emergency medicine, oregon health & science university; department of emergency medicine, rush university medical center
corresponding author: antonia quinn, antonia @gmail.com
disclosures can be found in additional information at the end of the article
abstract traditionally, scholarship that was recognized for promotion and tenure consisted of clinical research, bench research, and grant funding. recent trends have allowed for differing approaches to scholarship, including digital publication. as increasing numbers of trainees and faculty turn to online educational resources, it is imperative to critically evaluate these resources. this article summarizes five key papers that address the appraisal of digital scholarship and describes their relevance to junior clinician educators and faculty developers. in may , the academic life in emergency medicine faculty incubator program focused on the topic of digital scholarship, providing and discussing papers relevant to the topic. we augmented this list of papers with further suggestions by guest experts and by an open call via twitter for other important papers. through this process, we created a list of papers in total on the topic of evaluating digital scholarship.
in order to determine which of these papers best describe how to evaluate digital scholarship, the authorship group assessed the papers using a modified delphi approach to build consensus. in this paper we present the five most highly rated papers from our process about evaluating digital scholarship. we summarize each paper and discuss its specific relevance to junior faculty members and to faculty developers. these papers provide a framework for assessing the quality of digital scholarship, so that junior faculty can recommend high-quality educational resources to their trainees. these papers help guide educators on how to produce high quality digital scholarship and maximize recognition and credit with respect to receiving promotion and tenure.
categories: medical education, quality improvement, other
keywords: curated collection, digital scholarship evaluation, academic promotion, medical education
open access review article doi: . /cureus. how to cite this article quinn a, chan t, sampson c, et al. (january , ) curated collections for educators: five key papers on evaluating digital scholarship. cureus ( ): e . doi . /cureus.
introduction and background the manifestation of scholarship is changing. whereas traditionally, academia only recognized clinical research, benchwork, peer reviewed publications, and grant funding as markers of success, other forms of scholarship were defined in the late 20th century by ernest l. boyer. dr. boyer, through his seminal work, scholarship reconsidered, created a major paradigm shift to include other forms of scholarship, namely scholarship of integration, scholarship of application/engagement, and scholarship of teaching and learning [ ]. with the advent of the digital age, however, disruptive technologies like social media are now pushing us even closer towards yet another paradigm shift. in the age of the printing press, publishing and distributing ground-breaking new ideas were controlled by publishing houses. today, web 2.0 applications allow for ease of publication at an unprecedented level. this has led to a veritable explosion of certain types of digital products, including blogs, podcasts, and tweet chats [ - ]. recent trends have also allowed for differing approaches to scholarship, including digital publication [ - ]. while there is a compelling case that these new forms of scholarship are actually no different from prior technologies (e.g. is a blog post not just the modern interpretation of a textbook? isn’t a podcast just a more easily distributed recording of a lecture?), their ease of digital publication makes quality surveillance imperative [ ]. as increasing numbers of trainees and faculty turn to online educational resources, it is imperative to evaluate digital resources for quality. trainees and practicing physicians alike are inconsistent in gestalt recommendations of online educational content [ - ]. in order for academia to accept these disruptive forms of scholarship, it is imperative that these publications be scrutinized and rigorously assessed for content and quality in the same way as traditional scholarship.
in , the faculty incubator was created by the academic life in emergency medicine (aliem) team to create a community of practice (cop) for early career educators. in this online forum, members of this cop discuss and debate topics relevant to modern clinician educators. to that end, we created a one month module focused on learning technologies, with digital scholarship as a prominent point of discussion. since there is an emerging literature base on this important topic, our team sought to identify key literature about how to evaluate the quality of digital scholarship. this paper utilized a consensus-based review process to determine which papers may assist junior educators who wish to learn more about how they can evaluate their digital scholarship. these papers provide a framework for assessing digital scholarship for quality so that junior faculty can recommend quality educational resources to their trainees. furthermore, by highlighting the parts of digital scholarship that produce high quality, future writers of online content may be able to better shape their work in order to maximize credit when applying for promotion and tenure. review methods in the first month of the aliem - faculty incubator, we discussed the topic of digital scholarship. we monitored the proceedings of this cop from may st- st, . the monitored online discussions involved both junior faculty members and senior faculty mentors. while discussions occurred, we gathered the titles of papers that were cited, shared, and recommended within our online discussion forum and compiled these into a list. this list was then expanded via two other methods. first, we consulted two content experts, brent thoma and jonathan sherbino, for suggestions. next, we posted a call for important papers regarding the evaluation of digital scholarship on twitter. 
we ‘tweeted’ requests to have participants of the #foamed and #meded online communities provide suggestions for important papers on the topic of peer review. once the augmented list was completed, we conducted a three-round voting process inspired by the delphi methodology. similar methods were used on our previous papers to build consensus on the most important papers to feature [ - ] [ - ]. readers will note that this was not traditional delphi methodology because our raters included novices (i.e. junior faculty members, participants in the faculty incubator), as well as experienced/expert medical educators (i.e. clinician educators, all of whom have published > peer reviewed publications, who serve as mentors and facilitators of the aliem faculty incubator). rather than only including experts, we intentionally involved junior educators to ensure we selected papers that would be of use to a spectrum of educators throughout their careers.
results our faculty incubator discussions yielded papers, and the expert recommendations and social media calls yielded a total of one additional article. our process provides a rank-order listing of all these papers in the order of perceived relevance, from the most to the least relevant. our top five papers are expanded upon below. after our third round of voting, we had a tie for the final article leading to a fourth round of voting. we included the article not chosen as the fifth paper as an 'honorable mention'. our ratings of all papers are listed in table , along with their full citations.
citation | round . initial mean scores (sd). max score | round . % of raters that endorsed this paper | round . % of raters that endorsed this paper | round . tie break round
top papers
cabrera, et al. [ ] | . ( . ) | % | . %
thoma, et al. [ ] | . ( . ) | . % | . %
chan tm, et al. [ ] | . ( . ) | . % | . %
colmers, et al. [ ] | . ( . ) | . % | . %
sherbino, et al. [ ] | . ( . ) | % | . % | . %
gottlieb, et al. [ ] | . ( . ) | . % | . % | . % honorable mention
thoma b, et al. [ ] | . ( . ) | % | %
chan t, et al. [ ] | . ( . ) | % | %
frank jr, et al. [ ] | . ( . ) | % | . %
cameron cb, et al. [ ] | . ( . ) | % | . %
lin m, et al. [ ] | . ( . ) | % | . %
krishnan k, et al. [ ] | . ( . ) | . % | . %
chan tm, et al. [ ] | . ( . ) | % | %
thoma b, et al. [ ] | . ( . ) | % | %
lin m, et al. [ ] | . ( . ) | . % | %
flynn l, et al. [ ] | . ( . ) | . % | %
sherbino j, et al. [ ] | . ( . ) | . % | %
boyer el. [ ] | . ( . ) | % | %
sterling m, et al. [ ] | . ( . ) | % | %
purdy e, et al. [ ] | . ( . ) | . % | %
paterson qs, et al. [ ] | . ( . ) | . % | %
eysenbach g [ ] | . ( . ) | % | %
roland d, et al. [ ] | . ( . ) | %
lumba-brown a, et al. [ ] | . ( . ) | . % | %
jordan j, et al. [ ] | . ( . ) | %
boulos mn, et al. [ ] | . ( . ) | . % | %
sutherland s, et al. [ ] | . ( . ) | . % | %
diug b, et al. [ ] | . ( . ) | %
lin m, et al. [ ] | . ( . ) | . %
raine t, et al. [ ] | . ( . ) | %
riddell j, et al. [ ] | . ( . ) | %
thoma b, et al. [ ] | . ( . ) | %
ke q, et al. [ ] | . ( . ) | %
hillman t, et al. [ ] | . ( . ) | %
pronk np, et al. [ ] | . ( . ) | %
flynn s, et al. [ ] | . ( . ) | %
solomon dj, et al. [ ] | . ( . ) | %
langdorf mi, et al. [ ] | . ( . ) | %
table : the complete list of literature items collected by the authorship team
discussion the following is a list of papers that our group has determined to be of interest and relevance to junior faculty members and more senior colleagues who may be in charge of faculty development. the accompanying commentaries are meant to explain the relevance of these papers to junior faculty members and also highlight considerations for senior faculty members when using these works for faculty development workshops or sessions.
. thoma b, chan t, benitez j, lin m. educational scholarship in the digital age: a scoping review and analysis of scholarly products, the winnower : e . , , doi: . /winn [ ].
summary: this article applies the more widely accepted boyer model of scholarship to classify digital works [ ].
a literature review and coding schema allow the different forms of digital media presented to be reclassified into boyer’s four subtypes of scholarship: teaching, integration, application, and discovery [ ]. various types of digital scholarship were also reviewed and summarized. over % of digital scholarship was mapped to the teaching subtype, although some mapped to more than one subtype. the weakness most often associated with digital scholarship was the difficulty of ensuring rigorous scrutiny for quality, which is presumed to occur with traditional scholarship. a framework, such as the one proposed, is valuable as the development and distribution of digital scholarship continue to increase and these works are integrated into academic portfolios.
relevance to junior faculty members: associating various types of digital scholarship with a well-established and accepted education framework allows junior faculty to strengthen their academic portfolio. demonstrating how their digital works fit into the established subtypes of boyer’s scholarship framework can assist junior faculty when explaining their work to a promotion and tenure committee. this may be particularly valuable because it recasts digital scholarship in terms that promotion and tenure committees may be more familiar with.
relevance to faculty developers: thoma and colleagues go beyond demonstrating how digital scholarship can fit into the framework of boyer’s four subtypes of scholarship to make a case for developing tools that can examine the impact and quality of these digital products [ ]. the growing number of digital products identified in the literature emphasizes the increased use of digital technology for educational scholarship, with most being categorized into boyer’s scholarship of teaching and learning [ ].
the authors concluded that there is no compelling evidence that digital products are more effective but, given their widespread use, further research is needed in order to assess their quality.
. cabrera d, vartabedian bs, spinner rj, jordan bl, aase la, timimi fk. more than likes and tweets: creating social media portfolios for academic promotion and tenure. journal of graduate medical education. aug; ( ): - [ ].
summary: more education faculty are focusing their scholarly efforts on the creation, curation, and dissemination of free, open-access medical education. in this paper, cabrera and colleagues propose a framework to incorporate social media scholarship into current academic promotion and tenure systems. they provide best-practice recommendations for institutions implementing a system for recognizing social media and digital scholarship for academic promotion. these include the development of focused guidelines and strategic priorities for social media use, the development of a clear impact grid to identify types of social media activities considered for academic tenure, and methods by which the quality and impact of scholarship can be measured. suggested measures of impact include altmetrics such as page views, peer review of the work, and objective measures of quality such as those laid out by sherbino and colleagues [ ]. an example impact grid is also included in the paper. faculty scholars are encouraged to create and maintain a social media scholarship portfolio, much the same as they would an educator’s portfolio for their work in clinical and didactic teaching, curriculum design, and formal mentorship. the description of one’s scholarly work in social media should remain consistent with glassick’s framework and include clear goals, adequate preparation, appropriate methods, significant results, effective presentation, and reflective critique [ ].
relevance to junior faculty members: this paper lays out a clear framework for presenting social media and digital scholarship for academic promotion and tenure. junior faculty should develop and curate a social media scholarship portfolio, which includes a statement of social media scholarship philosophy, academic niche, audience, objectives, and platforms. all aspects of social media scholarship should be presented, including original content creation, curation of content, community management, platform administration, data analysis, and a durable record of scholarship (eg, permalinks, cached content). a clear description of how social media scholarship aligns with the junior faculty member’s overall career development plan and program of scholarship will provide a cohesive picture for the promotion and tenure committee at the time of review.
relevance to faculty developers: this is an essential paper for educators who engage in social media and seek proper recognition when applying for promotion and tenure. the authors outline a list of potential social media scholarship avenues, as well as their relative impact. the paper also provides a guideline for developing local institutional criteria for social media-based scholarship in addition to a formal definition based upon the paper by sherbino and colleagues [ ]. this resource may be provided to one’s academic appointment committee when applying for promotion, but would be better utilized in advance to help develop local criteria.
. colmers in, paterson qs, lin m, thoma b, chan tm. the quality checklists for health professions blogs and podcasts, the winnower : e . , , doi: . /winn. . [ ].
summary: blogs and podcasts are highly popular in medical education and have grown exponentially without significant oversight or quality control.
this paper seeks to create an easy-to-use checklist for educators to evaluate and identify the educational value of a blog or podcast. to do so, the authors used a three-step process. initially, they performed a literature review with the goal of extracting quality indicators of secondary educational resources. these findings then underwent qualitative analysis to isolate those that would be applicable to blogs or podcasts, yielding a total of indicators. to further refine these quality indicators, the authors then surveyed a large and varied group of expert producers of online medical education content to determine which of this initial group of indicators were most important. this analysis yielded a more manageable indicators for blogs and for podcasts. to avoid bias in surveying only experts in content development, the authors also surveyed general medical education experts. from the initial group of quality indicators, the general medical education experts identified three indicators for blogs, one for podcasts, and nine for both blogs and podcasts. once these surveys were completed, the authors compiled the findings from both the expert producers of online content and the general medical education experts to form simplified checklists for both blogs and podcasts. the checklists are similarly divided into three sections (ie, credibility, content, and design) and contain a series of questions that can be answered yes, no, or unclear.
relevance to junior faculty members: many educators and students enjoy blogs and podcasts due to their entertainment value, brevity, and style. they offer a welcome break from more traditional methods such as textbooks or lectures. however, there is a tendency to overlook the content and quality of a blog or podcast when blinded by other factors, such as the aforementioned entertainment value.
this checklist offers junior faculty, who may have limited experience in evaluating online educational materials, a simple and rapid method to assess the worth of a blog or podcast as a teaching tool. by focusing on key aspects of the blog or podcast, this checklist helps junior faculty assess its quality accurately.
relevance to faculty developers: this paper is invaluable for faculty development for both consumers and producers of digital scholarship. assessing the quality of podcasts and blogs based on personal gestalt has been shown to be unreliable [ ]. this paper provides an easily usable checklist for developers, editors, and end-users of blogs and podcasts. this checklist helps guide producers to focus their efforts on creating a quality product. editors can use the rubric to provide quality feedback to writers and optimize the peer review process, advancing the overall quality of digital scholarship. educators can guide learners to critically appraise blogs and podcasts prior to their use.
. chan tm, grock a, paddock m, kulasegaram k, yarris lm, lin m. examining reliability and validity of an online score (aliem air) for rating free open access medical education resources. annals of emergency medicine. dec ; ( ): - [ ].
summary: this article compared a global rating gestalt-based likert scale versus a five-domain rating scale (aliem approved instructional resources score (air score)) to evaluate free open access medical education (foam) blog posts. the air score included ratings of: best evidence in emergency medicine rater scale, content accuracy, educational utility, having an evidence-based medicine construct, and provision of author and literature references. the air score had moderate congruence with the gestalt evaluations and required nine raters to achieve adequate interrater reliability per article.
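as a rough illustration of how a five-domain score of this kind might be tallied across raters, the python sketch below sums each rater's domain ratings and averages the totals for one blog post. the domain keys, the 1 to 7 range per domain, and the example ratings are assumptions made for illustration only; they are not taken from the published air instrument, and the sketch does not reproduce the authors' reliability analysis.

```python
from statistics import mean

# the five domains named in the summary above; the short keys are hypothetical labels
AIR_DOMAINS = (
    "beem_rater_scale",
    "content_accuracy",
    "educational_utility",
    "evidence_based_medicine",
    "referenced",
)

# assumption for illustration only: each domain is rated on a 1 to 7 scale
MIN_RATING, MAX_RATING = 1, 7


def total_score(rating):
    """sum one rater's five domain ratings after checking the assumed range."""
    for domain in AIR_DOMAINS:
        value = rating[domain]
        if not MIN_RATING <= value <= MAX_RATING:
            raise ValueError(f"{domain} rating {value} is outside the assumed scale")
    return sum(rating[domain] for domain in AIR_DOMAINS)


def mean_air_score(ratings):
    """average the per-rater totals for a single blog post."""
    if not ratings:
        raise ValueError("at least one rater is required")
    return mean(total_score(r) for r in ratings)


# made-up ratings from three hypothetical raters for one post
example_ratings = [
    dict(zip(AIR_DOMAINS, (6, 7, 5, 6, 7))),
    dict(zip(AIR_DOMAINS, (5, 6, 5, 5, 6))),
    dict(zip(AIR_DOMAINS, (6, 6, 4, 5, 7))),
]

print(mean_air_score(example_ratings))
```

averaging totals over several raters is one simple way to dampen individual disagreement, which is in keeping with the finding that multiple raters were needed per article.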
relevance to junior faculty members: this article is important to junior faculty as it provides an initial scaffold for evaluating existing resources, as well as a framework for creating quality online foam resources. as evaluation tools for foam resources evolve, it will be important to keep up with the literature and adapt publications as needed to stay within best practices.
relevance to faculty developers: this paper is of two-fold importance for those interested in faculty development. first, it is a good example of applying measurement science to an educationally relevant scoring system. this can open up a discussion around the methodology used by the authors to generate ‘validity evidence’ for a particular scoring system, which can be a nice way to compare and contrast with known diagnostic test validation methods. second, it is an example of how you can generate multiple wins from a single educational innovation [ ]. notably, this group has authored a paper describing their innovation [ ] and subsequently analyzed the data they acquired whilst running their innovation processes to generate validity evidence around the scoring system they created.
. sherbino j, arora vm, van melle e, rogers r, frank jr, holmboe es. criteria for social media-based scholarship in health professions education. postgraduate medical journal. oct ; ( ): - [ ].
summary: sherbino and colleagues recognized that substantial questions remained about how social media-based scholarship was viewed through traditional scholarship assessment methods [ ]. using a consensus approach, they were able to define the criteria for social media-based scholarship. following a facilitated session of both in-person and virtual health professions educators, a consensus statement was produced.
this statement required that social media-based scholarship in health professions be original, advance the field of health professions education, be archived and disseminated, and provide the community with the ability to comment on and provide feedback in a transparent fashion. a process was also defined that listed the standards of criteria for authorship. with respect to the impact on education, the authors agree that evidence of a transparent critical appraisal is required. also, the innovations must have the potential to impact health professions in a rapid or broad fashion. ease of accessibility is also a concern. alternative metrics were suggested as a way to demonstrate the dissemination and impact of social media-based scholarship. they concluded that the health professions education community should champion social media-based scholarship as a legitimate educational pursuit.
relevance to junior faculty members: given the use of social media-based educational scholarship among junior faculty, it is important to know criteria that can be applied in order to make one’s scholarship more robust and of high quality. this helps junior faculty members align their work with criteria that evaluators may apply when considering academic advancement.
relevance to faculty developers: with increasing use of social media within academic medicine, it is essential that junior faculty understand what constitutes scholarship. this paper builds upon the work of glassick to define scholarship with social media and is complementary to the above paper by cabrera [ , ]. this may be a valuable resource to ensure that one’s efforts meet the criteria for scholarship in order to properly categorize and support projects when applying for promotion and tenure.
it is equally important to ensure that faculty understand what does not constitute scholarship, so that they can realign their efforts with a project that meets the criteria and makes the most of those efforts [ ].
honorable mention: gottlieb m, chan tm, sherbino j, yarris l. multiple wins: embracing technology to increase efficiency and maximize efforts. aem education and training [ ].
this paper was highly ranked by our team, but ultimately not selected because it does not bear on how to actually evaluate digital scholarship. instead, we summarize it here because we feel that it is a valuable “how-to” article that may guide how junior faculty members align projects and harness the power of technology to efficiently increase their scholarly efforts [ ]. while it does not help those seeking evaluative tools for judging the quality of digital scholarship, it describes well how one might integrate digital scholarship into their practice.
limitations
many of the final articles share one or more authors with this manuscript. these same authors served as senior mentors and content experts for the month covering digital scholarship for the aliem faculty incubator and therefore were essential in the writing of this manuscript. as with more systematic reviews, individuals with a good grasp of key literature and content experts, such as active researchers in a particular field, are best placed to identify papers. we attempted to minimize selection bias by “tweeting” requests for additional articles covering the evaluation of digital scholarship and by using multiple rounds of the modified delphi. the chosen articles not only had to cover the topic of evaluating digital scholarship, they also had to be beneficial for junior faculty. articles not chosen via the modified delphi methodology may not have been appropriate for junior faculty.
conclusions
this paper describes five key articles on the evaluation of digital scholarship.
we believe this resource will be valuable to clinician educators evaluating digital scholarship, while also providing guidance for producers of digital content to create a robust product that meets the definition of scholarship.
additional information
disclosures
conflicts of interest: in compliance with the icmje uniform disclosure form, all authors declare the following: payment/services info: all authors have declared that no financial support was received from any organization for the submitted work. financial relationships: all authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. other relationships: all authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.
acknowledgements
the authors would like to thank dr. brent thoma and dr. jonathan sherbino for their input as content experts on the topic of digital scholarship and for providing references on the topic. the authors would like to acknowledge dr. michelle lin and the - academic life in emergency medicine (aliem) faculty incubator participants and mentors for facilitating the drafting and submission of this manuscript.
references
. boyer e: scholarship reconsidered: priorities of the professoriate. princeton university press, lawrenceville, nj; . . cadogan m, thoma b, chan tm, lin m: free open access meducation (foam): the rise of emergency medicine and critical care blogs and podcasts ( – ). emerg med j. , : – . . /emermed- - . chan tm, thoma b, radecki r, topf j, woo hh, kao ls, et al.: ten steps for setting up an online journal club. j contin educ health prof. , : – . . lin m, joshi n, hayes bd, chan tm: accelerating knowledge translation: reflections from the online aliem-annals global emergency medicine journal club experience. ann emerg med. , : – . 
. /j.annemergmed. . . . sherbino j, arora vm, van melle e, et al.: criteria for social media-based scholarship in health professions education. postgrad med j. , : – . . /postgradmedj- - . cabrera d, vartabedian bs, spinner rj, et al.: more than likes and tweets: creating social media portfolios for academic promotion and tenure. j grad med educ. , : – . . /jgme-d- - . . roland d, trueger s, thoma b, et al.: foam helps to bridge the knowledge translation gap . emerg physician int. , : . accessed: november , : http://www.epijournal.com/articles/ /foam-helps-to-bridge-the-knowledge-translation- gap. . krishnan k, thoma b, trueger ns, et al.: gestalt assessment of online educational resources may not be sufficiently reliable and consistent. perspect med educ. , : – . . /s - - - . thoma b, sebok-syer ss, krishnan k, et al.: individual gestalt is unreliable for the evaluation of quality in medical education blogs: a metriq study. ann emerg med. , : – . . /j.annemergmed. . . . king a, boysen-osborn m, cooney r, et al.: curated collection for educators: five key papers about the flipped classroom methodology. cureus. , :e . . /cureus. . thoma b, gottlieb m, boysen-osborn m, et al.: curated collections for educators: key papers about program evaluation. cureus. , : – . . /cureus. . gottlieb m, chan t, fredette j, et al.: academic primer series: five key papers about study designs in medical education. west j emerg med. , : – . . /westjem. . . . cooney r, chan tm, gottlieb m, et al.: academic primer series: key papers about competency- based medical education. west j emerg med. , : – . . /westjem. . . . yarris l, gottlieb m, scott k, et al.: academic primer series: key papers about peer review . west j emerg med. , : – . . /westjem. . . . boysen-osborn m, cooney r, gottlieb m, et al.: academic primer series: key papers about teaching with technology. west j emerg med. , : – . . /westjem. . . . 
chan t, gottlieb m, quinn a, et al.: academic primer series: five key papers for consulting clinician educators. west j emerg med. , : – . . /westjem. . . . thoma b, chan t, benitez j, lin m: educational scholarship in the digital age: a scoping review and analysis of scholarly products. the winnower. , : - . . /winn. . . chan tmy, grock a, paddock m, et al.: examining reliability and validity of an online score (aliem air) for rating free open access medical education resources. ann emerg med. , : – . . /j.annemergmed. . . . colmers in, paterson qs, lin m, thoma b, chan tm: the quality checklists for health professions blogs and podcasts. the winnower. , : – . . /winn. . . gottlieb m, chan tm, sherbino j, yarris l: multiple wins: embracing technology to increase efficiency and maximize efforts. aem educ train. , : – . . /aet . . thoma b, sanders j, lin m, et al.: the social media index: measuring the impact of emergency medicine and critical care websites. west j emerg med. , : – . . /westjem. . . . 
chan t, trueger ns, roland d, thoma b: evidence-based medicine in the era of social media: scholarly engagement through participation and online interaction. cjem. , – . . /cem. . . frank jr, cheung wj, sherbino j, et al.: caep academic symposium: how to have an impact as an emergency medicine educator and scholar. cjem. , :s –s . . /cem. . . cameron cb, nair v, varma m, et al.: does academic blogging enhance promotion and tenure? a survey of us and canadian medicine and pediatric department chairs. jmir med educ. , :e . . /mededu. . lin m, thoma b, trueger ns, et al.: quality indicators for blogs and podcasts used in medical education: modified delphi consensus recommendations by an international cohort of health professions educators. postgrad med j. , : – . . /postgradmedj- - . chan t, thoma b, krishnan k, et al.: derivation of two critical appraisal scores for trainees to evaluate online educational resources: a metriq study. west j emerg med. , : – . . /westjem. . . . thoma b, chan tm, paterson qs, et al.: emergency medicine and critical care blogs and podcasts: establishing an international consensus on quality. ann emerg med. , : – . . /j.annemergmed. . . . lin m, joshi n, grock a, et al.: approved instructional resources series: a national initiative to identify quality emergency medicine blog and podcast content for resident education. j grad med educ. , : – . . /jgme-d- - . . 
flynn l, jalali a, moreau ka: learning theory and its application to the use of social media in medical education. postgrad med j. , : – . . /postgradmedj- - . sherbino j: education scholarship and its impact on emergency medicine education . west j emerg med. , : – . . /westjem. . . . sterling m, leung p, wright d, bishop tf: the use of social media in graduate medical education. acad med. , : – . . /acm. . purdy e, thoma b, bednarczyk j, migneault d, sherbino j: the use of free online educational resources by canadian emergency medicine residents and program directors. cjem. , : – . . /cem. . . paterson qs, thoma b, milne wk, lin m, chan tm: a systematic review and qualitative analysis to determine quality indicators for health professions education blogs and podcasts. j grad med educ. , : – . . /jgme-d- - . . eysenbach g: can tweets predict citations? metrics of social impact based on twitter and correlation with traditional metrics of scientific impact. j med internet res. , :e . . /jmir. . roland d, spurr j, cabrera d: preliminary evidence for the emergence of a health care online community of practice: using a netnographic framework for twitter hashtag analytics. j med internet res. , :e . . /jmir. . lumba-brown a, tat s, auerbach m, et al.: pemnetwork barriers and enablers to collaboration and multimedia education in the digital age. pediatr emerg care. , : – . . /pec. . jordan j, jones d, williams d, druck j, kuehl dr: publishing venues for education scholarship: a needs assessment. acad emerg med. , : – . . /acem. . boulos mnk, maramba i, wheeler s: wikis blogs and podcasts: a new generation of web- based tools for virtual collaborative clinical practice and education. bmc med educ. , : . . / - - - . sutherland s, jalali a: social media as an open-learning resource in medical education: current perspectives. adv med educ pract . , : – . . /amep.s . 
diug b, kendal e, ilic d: evaluating the use of twitter as a tool to increase engagement in medical education. educ health (abingdon). , : – . . lin m, sherbino j: creating a virtual journal club: a community of practice using multiple social media strategies. jgme. , : – . . /jgme-d- - . . raine t, thoma b, chan tm, lin m: foamsearch.net: a custom search engine for emergency medicine and critical care. emergency medicine australas. , : – . . / - . . riddell j, patocka c, lin m, sherbino j: jgme-aliem hot topics in medical education: analysis of a multimodal online discussion about team-based learning. j grad med educ. , : – . . /jgme-d- - . . 
thoma b, paddock m, purdy e, et al.: leveraging a virtual community of practice to participate in a survey-based study: a description of the metriq study methodology. aem educ train. , : – . . /aet . . ke q, ahn yy, sugimoto cr: a systematic identification and analysis of scientists on twitter. plos one. , : - . . /journal.pone. . hillman t, sherbino j: social media in medical education: a new pedagogical paradigm. postgrad med j. , : – . . /postgradmedj- - . pronk np, ankel fk: building resilience into the workplace: bending the system to adapt. acsm's health & fitness journal. , : – . https://experts.umn.edu/en/publications/building-resilience-into-the-workplace-bending-the-system-to-adap . solomon d: the impact of digital dissemination for research and scholarship. ecancer. , : – . . /ecancer. .ed . langdorf m, lin m: emergency medicine scholarship in the digital age. west j emerg med. , : – . . /westjem. . . . glassick ce: boyer's expanded definitions of scholarship, the standards for assessing scholarship, and the elusiveness of the scholarship of teaching. acad med. , : – . . / - - 
curated collections for educators: five key papers on evaluating digital scholarship
college & research libraries january
goals that could be applied to find an answer. the reader is able to see how a variety of different problems can be answered using statistics.
the final part of the book includes fourteen case studies. these cover a range of topics from e-books to reference staffing. each one shows how data analytics can be applied to the problem being addressed. some, like the case study on benchmarking library standards, give more detail on the analytic process, whereas the case study on instruction is so fleeting at just over one page, it is more of a minor commentary on testing. at times these chapters are less case studies and more like examples of problems. they do allow the reader to think about the problem, although they, like so much of the book, are not covered in any particularly great detail. this is a lost opportunity; the section would have been much stronger if other authors had been able to submit their own more detailed case studies of how they used data analytics.
surprisingly, the book ends with no concluding commentary, no recommendations, nothing to pull it all together into a nice little package. the book does provide a bibliography of cited works, which is interesting since each chapter already lists its references at the end. the reason may be that there are problems with some of the chapter references. for example, a work by xia is cited on page , but the reference is not included at the end of the chapter, although it is included in the bibliography. also, there is one item listed in the references at the end of chapter but it is never cited within chapter .
the authors do provide a list of “other useful readings” at the end of the book in case the reader wants to explore further. note that some of the items in this list are also in the bibliography. for someone just getting started in this area, the book does give a broad overview. the lists, charts, and tables also are excellent resources and can be beneficial. the book could have been more complete; maybe another edition will address some of this edition’s problems.
mark shelton, college of the holy cross
marketing and outreach for the academic library: new approaches and initiatives. bradford lee eden, ed. lanham, md.: rowman & littlefield, . p. paper, $ . (isbn - - - - ). lc .
marketing and outreach are increasingly becoming topics of interest among academic librarians. it seems that conferences, presentations, journal articles, and blog posts are often discussing issues related to social media, programming, and student and faculty engagement in library services. these are a few areas of focus within marketing and outreach that are relevant to nearly all who work in an academic library.
“marketing and outreach for the academic library” is the seventh volume in the “creating the st century academic library” series published by rowman & littlefield. this volume follows titles such as “rethinking technical services: new frameworks, new skill sets, new tools, new roles,” “cutting-edge research in developing libraries of the future: new paths for building future services,” and “enhancing teaching & learning in the st century academic library: successful innovations that make a difference.” this series clearly aims to cover all aspects of current academic librarianship. the editor of the series is bradford lee eden, dean of library services at valparaiso university and editor of several journals related to library science and musicology.
this particular volume is composed of ten chapters.
while there is no clear thematic organization to the content matter, there are three chapters that discuss various aspects of social media in libraries, two chapters related to events and event planning, three chapters related to digital services, one related to library space, and one that proposes a vision for the future of librarians and librarianship. each chapter ends with cited references, and most include a list of other relevant and useful resources for further information. the index is brief and not recommended for navigating the contents of the book.

doi: . /crl. . . book reviews

the first chapter is titled “making social media worth it: planning and implementing for a small institution.” this chapter discusses the process that librarians at california state university northridge followed to create a social media plan after they inherited several inactive library social media accounts. the timeframe that is discussed in this chapter is to . while initially this could strike the reader as historical information in the realm of social media strategies and best practices, there is still plenty of information that can be of use to librarians who are still struggling with stagnant or multiple, unwieldy social media accounts. takeaways from this chapter include advice for content creation in facebook and instagram, best practices for developing a social media plan, and recommendations for assessment of social media activity with tools including facebook analytics and the free website likeanalyzer. the authors of this chapter also include an interesting discussion about their campus-specific social media site called “campusquad.”

“from idea to instagram: how an academic library marketing committee created a character for the youtube generation” is the second chapter dedicated to social media topics. this provocative submission from towson university’s albert s.
cook library describes the path these librarians took to create engaging video tutorials about the library and related services. initially, the tutorials were policy and procedure videos that parodied popular commercials. in an attempt to add consistency to their videos, the librarians feature a ventriloquist dummy, which they soon discovered reminded their students of a character from a horror film. finally, they settled on featuring a muppet in their video tutorials, which they thought would be less frightening. the majority of this chapter discusses the successes and failures of using the muppet named al. there are lengthy descriptions of fine details about tutorial creation, including selecting the muppet’s wardrobe, film technique and equipment, scriptwriting, and the process of branding and marketing the muppet to be a friendly, recognizable face to first-year students. while a muppet mascot may not be feasible for all academic library cultures, this chapter offers quite a few unique ideas and suggestions for creating memorable video tutorials. the third chapter that focuses on social media is titled “digital engagement in deliv- ering library services: a case study from the state library of new south wales.” this australian research library engaged in a two-year project dedicated to experimenting with various social media platforms including instagram and wikipedia. the librarians took each platform through a four-stage process that included accessing, exploring, engaging, and evaluating each platform. they also defined outcomes for successful digital engagement on each of the platforms. some initiatives that staff members took with the library’s social media accounts included geo-tagging photographs of the li- brary that patrons posted to instagram, answering reference questions on instagram, and encouraging patron contributions to wikipedia articles created by librarians. 
the authors make an intriguing speculation that the future of collection development will involve “a more collaborative function with crowdsourcing, community sourcing, community curation, participatory interpretation” ( ).

librarians from the cunningham memorial library at indiana state university discuss events and event planning in the chapter titled “events and extravaganzas.” this chapter details some of the events held in a specific dedicated event space in the library. these events include fundraisers, faculty receptions, and campus fundraisers. one particular event, the “library extravaganza,” is discussed at length. readers will be interested to learn that an events coordinator, who is part of the library staff, is in charge of the activities in the events space. following this chapter is one entitled “librarians as event coordinators: building partnerships and engagement through user-centered programs.” in this chapter, joe c. clark from kent state university’s performing arts library discusses various events that librarians collaboratively host with other nonlibrary campus offices and organizations. examples of these shared events include an open mic lunch; a lecture series featuring professional directors, designers, and choreographers; an excellence in research award; and a jazz café. while these events may not be relevant to all academic libraries, the author explains several tips for librarians who wish to collaborate with other campus offices for events. points of advice include how to approach shared funding for events and selecting refreshments, as well as branding and marketing events. the information discussed in these two chapters will be of interest to anyone who seeks fresh ideas for programming in an academic library.

two chapters in this book are dedicated to various aspects of digital services.
chapter discusses the process that the university of notre dame libraries followed to promote some of the digital services in their center for digital scholarship. these librarians learned that word-of-mouth marketing from workshop attendees proved to be a highly effective method of outreach for their services. the final chapter of this book is titled “democratizing digital: the highway digital collection and the promise of inclusive online collaboration.” this chapter outlines the collaboration of various institutions including the utah manuscript association and utah state university in creating and promoting a digital collection related to u.s. highway . this information will be sure to inspire any librarian charged with marketing digital collections or anyone who aspires to collaborate with other institutions to build and market a digital collection. chapter discusses three different approaches that librarians from loyola university, new orleans take to embedded librarianship. these librarians have faculty status and therefore are able to participate in activities such as co-lecturing for an entire course with department faculty, grading research assignments, and teaching students the technical side of creating video presentations. while some activities, such as colectur- ing for a course, may not be attainable by librarians at all institutions, there are many innovative ideas presented in this chapter that are sure to at least inspire ideas for collaboration with faculty. the wide scope of topic coverage in this volume makes it invaluable to not only outreach librarians, but any librarian concerned with social media, event planning, and digital collections or services. 
there are countless unique ideas and practical points of advice to inspire readers with their own marketing and outreach efforts.
laura wilson, college of the holy cross

scholars in an increasingly open and digital world: how do education professors and students use twitter?

george veletsianos a,⁎, royce kimmons b
a school of education and technology, royal roads university, victoria, bc, canada
b instructional psychology & technology department, brigham young university, provo, ut, united states
⁎ corresponding author.

internet and higher education ( ) –
http://dx.doi.org/ . /j.iheduc. . .
© elsevier inc. all rights reserved.

article info
article history: received august; received in revised form february; accepted february; available online february

abstract
there has been a lack of large-scale research examining education scholars' (professors' and doctoral students') social media participation. we address this weakness in the literature by using data mining methods to capture a large data set of scholars' participation on twitter ( students, professors, , unique hashtags, and , tweets). we report how education scholars use twitter, which hashtags they contribute to, and what factors predict twitter follower counts. we also examine differences between professors and graduate students. results (a) reveal significant variation in how education scholars participate on twitter, (b) question purported egalitarian structures of social media use for scholarship, and (c) suggest that by focusing on the use of social media for scholarship researchers have only examined a fragment of scholars' online activities, possibly ignoring other areas of online presence.
implications of this study lead us to consider (a) the meaningfulness of alternative metrics for determining scholarly impact, (b) the impact that power structures have upon role-based differences in use (e.g. professor vs. student), and (c) the richness of scholarly identity as a construct that extends beyond formal research agendas. © elsevier inc. all rights reserved. keywords: social media networked scholarship digital scholarship twitter faculty members' use of online networks online participation . introduction research on emergent forms of technology-infused scholarship and social media use by scholars has explored the relationship between tech- nology and scholarly practice and the impact and implications of technol- ogy in the work and life of scholars. such research, however, has rarely focused on scholars in the field of education or differentiated between faculty members and doctoral students and typically has depended on surveys, interviews, or small-scale naturalistic observations of social media practices. in other words, while existing empirical research from a variety of disciplines may yield some insights into education scholars' activities online, there has been a lack of large-scale research examining social media participation. research in this area is necessary because many researchers have claimed that digital practices in general, and so- cial media activities in particular, have the potential to transform the ways in which education scholarship is conducted and disseminated (burbules & bruce, ; fetterman, ; greenhow, robelia, & hughes, ; yettick, ). for instance, social media may foster par- ticipatory learning and expand the reach of research. yet, such advocacy often rests on claims rather than empirical evidence (kimmons, ) and uses of social media have led to tensions and conundrums in scholars' professional lives (veletsianos, ; veletsianos & kimmons, ). 
this dichotomy suggests that we need to better understand how social media are being used in scholarship as well as the implications of their use. to help fill that gap, this study analyzes a large data set of education scholars' activities on twitter, one of the most popular social media platforms among academics (lupton, ). using these data, we examine the ways in which doctoral students and professors in education use twitter, the hashtags that they contribute to, and the factors that predict their follower counts. by doing so, we hope to provide greater insight into education scholars' online participation.

literature review

proponents of open, digital, and social scholarship have argued that scholarly use of social media can “enhance the impact and reach of scholarship” and “foster the development of more equitable, effective, efficient, and transparent scholarly and educational processes” (veletsianos & kimmons, , pp. ). as a result, universities are increasingly encouraging researchers and educators to expand their online presence (mewburn & thomson, ). advocates for greater incorporation of digital technology into scholarly practice have focused on the societal benefits of these emergent forms of scholarship (e.g., broadening access to education and scholarship for the common good), but scheliga and friesike ( ) have found that scholars face both individual and systemic barriers that may prevent them from engaging in these practices despite understanding their potential at a systemic level. similarly, esposito ( ) found that scholars' use of digital and open practices may largely serve functional purposes and be driven by a desire to achieve efficiencies instead of an aspiration to re-imagine scholarly practices.

g. veletsianos, r.
kimmons / internet and higher education ( ) – twitter is a popular social media platform for scholars (lupton, ; van noorden, ), and prior research on twitter has found that scholars use it to share information, resources, and media pertaining to their teaching and research practice. for instance, scholars have been shown to use twitter to request and offer assistance to others (veletsianos, ), critique the work of other scholars (mandavilli, ), contribute to conferences via hashtags (li & greenhow, ; mahrt, weller, & peters, ; ross, terras, warwick, & welsh, ), implement engaging pedagogies (junco, heiberger, & loken, ), and share and comment upon preprint and published articles (eysenbach, ). although several studies have examined disciplin- ary differences in the use of twitter (holmberg & thelwall, ; rowlands, nicholas, russell, canty, & watkinson, ), other than the research reported by li and greenhow, we were unable to identify studies that specifically examined its use by education scholars. researchers have also argued that attending to alternative metrics, such as examining references to the scholarly literature in tweets, can extend scholars' impact beyond citations in peer-reviewed journals (priem & hemminger, ). for instance, some have found that the frequency of article mentions via twitter appears to correlate with sub- sequent downloads and citations (shuai, pepe, & bollen, ; thelwall, haustein, larivière, & sugimoto, ), although the correlation be- tween tweets and citations in all fields is unclear (haustein, peters, sugimoto, thelwall, & larivière, ) and in some cases appears to be weakly associated (de winter, ). on the other hand, hall ( ) warns that researchers may lose sight of valuable scholarly met- rics (e.g., citation indices) in favor of popularity metrics like twitter fol- lowers. 
by examining a large sample of education scholars' online practices, we can begin to better understand social media metrics and thus contribute to the conversation of whether social media metrics can be used to better understand a scholar's impact. while researchers are able to say with increasing confidence what scholars do on social media, it is somewhat unclear how scholars partici- pate on twitter and how online activities relate to academic identity. greenhow et al. ( ) argued that social media support the development of scholars' digital identities, and others found that both professors (veletsianos & kimmons, ) and students (kimmons & veletsianos, ) intentionally refine or limit their online participation so that it can be scrutinized by others. one study examined education scholars' twitter participation during the american educational research association (aera) conference and described commonalities and differences be- tween faculty members and students (li & greenhow, ). in that study, faculty members reported that twitter supported their professional digital identity, while students reported that twitter served other purposes for them that were unrelated to identity (e.g., access to the research commu- nity). li and greenhow's study supports findings from other literature that showed that faculty and student perceptions of popular social media devi- ate (e.g., roblyer, mcdaniel, webb, herman, & witty, ). the existing research suffers from three weaknesses that this study attempts to remedy. first, very little research has examined education scholars' activities on social media, and even less has compared educa- tion professors' activities with students' activities. second, while educa- tion student and faculty use of social media has been examined via self- reported means (e.g., kimmons & veletsianos, ; li & greenhow, ), no research has examined such differences by examining natu- ralistic data trails at any scale. 
third, current research on what mediates education scholars' participation on social media has been mostly exploratory, thus preventing scholars from developing inferential models. this study addresses all these weaknesses by using data mining methods to capture and analyze a large data set to illuminate scholars' participation on twitter. . theoretical framework this study is situated in the digital networked practices of scholars, and in particular on networked participatory scholarship (nps). nps refers to scholars' use of “online social networks to share, reflect upon, critique, improve, validate, and otherwise develop their scholarship” (veletsianos & kimmons, , p. ). the networked spaces that scholars use (e.g., twitter, blogs) can be described as fluid organization- al structures that impose little restrictions on membership and enable loosely-connected and tightly-knit distributed individuals to connect with one another (dron & anderson, ). social learning theory un- derpins networked participation on social media. in this perspective, learning and knowledge in networked spaces are facilitated, negotiated, and co-constructed individually as well as socially (cf. brown, collins, & duguid, ; lave & wenger, ; wenger, ). thus, learning in online networks becomes a situated activity that takes the form of participation in the socio-cultural practice of scholarship, and as veletsianos ( , p. ) argues, online social networks serve as “emerging and evolving network[s] of scholar–learners where scholarly practices may be created, refined, performed, shared, discussed, and negotiated.” . methods the research focuses on twitter as a platform for scholarly pur- poses, because it is widely used for scholarship (lupton, ). twit- ter is a free microblogging platform that allows users to post content in the form of “tweets” that may also contain links to online content. 
tweets are limited to characters of text and may be hashtagged with keywords (e.g., #education) or may mention other users by username (e.g., @barackobama). a hashtag refers to a “#” symbol followed by a short phrase. through hashtags and mentions, users can find others that are tweeting on similar topics, share information in an organized manner, and form networks around shared interests. about one-third of tweets include mentions (boyd, golder, & lotan, ), most of which may be conversational in nature (honeycutt & herring, ). users can also retweet a tweet to share content posted by someone else with all of their followers. by default, all sharing on twitter is publicly visible, meaning each user can go to another user's profile page, see all of that user's tweets, and “follow” that user to have new tweets provided directly to them. each user's profile page also provides some general metrics about use and popu- larity, including that user's number of tweets and followers. this study collected the most recent tweets for each user who used the official hashtag of the american educational research as- sociation conference (#aera ). contributors to the hashtag were a sub- section of education scholars, and by gathering a list of contributors we were able to examine education scholars' twitter participation. we se- lected this particular hashtag as a way to identify education scholars be- cause the aera annual meeting is one of the largest gatherings of education scholars worldwide, includes a broad array of education re- searchers (as opposed to a content- or methods-focused conference), and the conference was the latest aera conference at the time of writing. thus, the #aera hashtag served as a vehicle to locate a large and diverse sample of education scholars. in other words, the data in our sample are not limited to the aera conference – the confer- ence only served as a way to identify education scholars. 
while some users may have used other hashtags in relation to this conference (e.g., #aera ), we limited our identification of scholars by examining the profiles of those who posted using the official hashtag. as a result, our sample excludes scholars who did not use the official hashtag.

research questions

to better understand education scholars' uses of twitter, we asked the following three research questions:
rq how do scholars in the education field use twitter?
rq which hashtags do education scholars contribute to?
rq what factors predict participants' follower counts?

for each of these questions, we also examined possible differences between professors and graduate students because prior research suggests that students and faculty hold different perceptions about the use of social media in education. one study found that students believed social media could be more convenient than did faculty, while faculty were more likely to believe that such media were not appropriate for classwork (roblyer et al., ). in interviews of education scholars contributing to #aera , li and greenhow ( ) similarly reported that these groups differed in how they viewed twitter. based on these findings we anticipated observable participation differences between graduate students and professors.

the first question addressed the scholarly uses of twitter specifically among education scholars to better understand how this technology is used. the second was intended to uncover what intellectual and social online communities education scholars participate in, and how diverse or homogenous those communities happen to be, to better understand the scope or multi-facetedness of scholarly online identities. the third was intended to examine the factors that may predict scholars' follower numbers and shed light on the claim that social media metrics can enrich our understanding of scholars' impact.
data collection

twitter's application programming interface (api) allows researchers to systematically retrieve large amounts of public user data. we used the #aera twitter feed and the twitter api to collect data. first, we developed a series of php/rest/json scripts to use the twitter api to extract information for all of the identified #aera tweets, including tweet text, metadata (e.g., creation date, retweet count), and author information (e.g., id, name, tweet count, description). tweet, user, hashtag, and mention data were stored in the database, and identifiers were included to maintain relationships between objects (e.g., tweets and their authors, hashtags and their tweets). second, we developed another series of web scripts for the twitter api to extract the most recent tweets from each user identified in the previous step, which allowed us to collect user tweets that were not labeled with the #aera hashtag. a twitter api restriction, though, limited our access to only the most recent tweets for each user. collection of tweet data began six months following the aera conference and continued for several months. thus, collected tweets include tweets prior to, during, and after the conference. finally, we programmatically generated basic descriptives for each tweet (e.g., number of hashtags, number of mentions), user (e.g., lifespan), and hashtag (e.g., unique uses), and generated binary descriptive variables (e.g., hashtagged, mentioned).

data analysis

we identified users. next, we read each user's profile information (bio, location, username) and using this information we coded the collected users as graduate students, professors, or other.
accounts that could not be readily identified from this information as either graduate student or professor accounts (e.g., unclear, corporate, multiauthor, or anonymous accounts) were excluded from analysis, and the final data set included an almost equal number of graduate students ( ) and professors ( ) for a total number of . by identifying accounts in this way, it is possible that student or professor accounts might have been excluded from analysis if they did not self-identify as such. this, however, is an intentional delimitation of the study, as we did not feel it to be appropriate to label accounts in a manner that was not reflective of self-descriptions. furthermore, if the goal of this study is to understand scholars' participation in social media, then it seems to make sense to focus our attention on social media use that users connect to their identities as scholars.

all data were then exported from the database and imported to spss for statistical analysis. separate variables were analyzed for three data sets.

tweet data set

this dataset included unique identifiers, retweets (the number of times the tweet had been retweeted), and retweet (a binary variable reflecting whether the tweet was original or a retweet).

hashtag data set

this dataset included unique hashtags and the number of times each was used. hashtag counts were calculated to determine communities (e.g., phdchat), conferences (e.g., aera ), and topics (e.g., immigration) that were identified in tweets. hashtag use varied by user, and some hashtags were widely used while others were used by only one or two users. hashtags that were used by more than unique users in the data set (roughly % of users) were marked as viral. we used users to demarcate viral and non-viral hashtags because the number of users using a hashtag fell dramatically from that point onwards, indicating low uptake.

user data set

this dataset included raw and percentage participation factors.
percentage factors were used to represent each user's overall twitter activities, counteracting skewing that would have resulted from highly differential numbers of tweets. participation factors were:
• professor: whether the participant self-identified as a professor (non-exclusive to student).
• student: whether the participant self-identified as a student (non-exclusive to professor).
• followers (dependent): the number of other twitter users who “follow” the user.
• following (independent): the number of other twitter users whom the user “follows.”
• listed (correlate/independent): the number of twitter lists on which the user appears.
• tweets (independent): the number of tweets the user has posted.
• lifespan (independent): the number of years (in decimal form) since the twitter user account was first created, calculated as lifespan = current date − creation date.
• frequency (independent): the number of tweets the user posts in a day, calculated as number of tweets ÷ (last tweet date − first tweet date).
• mentioning (independent): the percentage of tweets in which the user mentions another user, calculated as user tweets with mentions ÷ tweets.
• hashtagging (independent): the percentage of tweets in which the user includes a hashtag, calculated as user tweets with hashtags ÷ tweets.
• linking (independent): the percentage of tweets in which the user includes a url, calculated as user tweets with links ÷ tweets.
• retweeting (independent): the percentage of tweets that are retweets (i.e., non-original), calculated as user retweets ÷ tweets.
• replying (independent): the percentage of tweets that are replies to other twitter users or tweets, calculated as user replies ÷ tweets.
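the participation factors listed above can be sketched in code. this is a minimal illustration only, not the authors' scripts (the paper mentions php scripts and spss but does not give their code): the `Tweet` record, its field names, the regular expressions, and the function name `participation_factors` are all assumptions made for this example. shares are returned as fractions between 0 and 1 rather than as percentages.

```python
import re
from dataclasses import dataclass
from datetime import datetime

# Simplified detection patterns (assumptions, not the paper's rules):
# a whitespace-delimited token starting with "@" or "#", or any http(s) URL.
MENTION = re.compile(r"(?:^|\s)@\w+")
HASHTAG = re.compile(r"(?:^|\s)#\w+")
LINK = re.compile(r"https?://\S+")


@dataclass
class Tweet:
    """Hypothetical tweet record; the field names are illustrative."""
    created_at: datetime
    text: str
    is_retweet: bool = False
    is_reply: bool = False


def participation_factors(tweets, account_created, now):
    """Compute the per-user factors defined in the list above.

    Lifespan is in decimal years; frequency is tweets per day between the
    first and last collected tweet; the remaining factors are shares (0 to 1).
    """
    if not tweets:
        raise ValueError("at least one tweet is required")
    n = len(tweets)
    dates = [t.created_at for t in tweets]
    active_days = (max(dates) - min(dates)).total_seconds() / 86400

    def share(predicate):
        # Fraction of the user's tweets that satisfy the predicate.
        return sum(1 for t in tweets if predicate(t)) / n

    return {
        "lifespan": (now - account_created).days / 365.25,
        # Undefined when all tweets share one timestamp, so report None.
        "frequency": n / active_days if active_days > 0 else None,
        "mentioning": share(lambda t: MENTION.search(t.text) is not None),
        "hashtagging": share(lambda t: HASHTAG.search(t.text) is not None),
        "linking": share(lambda t: LINK.search(t.text) is not None),
        "retweeting": share(lambda t: t.is_retweet),
        "replying": share(lambda t: t.is_reply),
    }
```

dividing each count by the user's own number of tweets is what makes heavy and light users comparable, which is the stated reason for using percentage factors.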
results

after data cleaning, identification of users by role, and exclusion of participants who could not be identified as students or professors, the user data set included users ( students and professors), the hashtag data set included , unique hashtags that were used , times, and the tweet data set included , tweets ( % from students and % from professors).

table: followers (popularity) and tweets (activity) of scholars by percentile groups.
percentile group | followers | tweets | avg. followers
top % | % | % | ,
top % | % | % |
top % | % | % |
top % | % | % |
bottom % | % | % |
bottom % | % | % |

rq : how do graduate students and professors in the education field use twitter?

the descriptive statistics of student and professor use of twitter revealed that more than half of the tweets ( . %) mentioned other users, while only about a quarter ( . %) were replies to others. further, more than % of tweets were retweets, % included a hashtag, and more than % included a hyperlink. results also showed considerable variance between individual users with a positive skew on most non-normalized factors. for instance, the standard deviation of followers, following, listed, and tweets exceeded each factor's mean, and the median was far below the mean. those participants who were more active (i.e., posted more tweets) or more popular (i.e., gained more followers) exponentially exceeded their counterparts (table ).

table provides an overview of activity and popularity by percentile groups. comparing the popularity of the top % with that of the bottom % of scholars suggests that popularity is roughly equivalent to activity, or the efforts of the individual in terms of the number of tweets posted. as we consider the top percentile groups, however, participation becomes more and more unequal: the top % garner % of all followers (though they provide % of all tweets) and the top % command % of all followers (though they only provide % of all tweets).
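as an illustration of how group shares of this kind can be computed, the sketch below ranks users by follower count and reports the share of all followers and of all tweets held by the top group. it is a minimal sketch under stated assumptions: that percentile groups are formed by ranking on followers, that group size is rounded up, and that each user is a `(followers, tweets)` pair; the function name `top_group_shares` is hypothetical, and the paper does not describe its own procedure at this level of detail.

```python
import math


def top_group_shares(users, top_percent):
    """Summarise the top `top_percent` of users ranked by followers.

    `users` is a list of (followers, tweets) pairs (an assumed layout).
    Returns the group's share of all followers, its share of all tweets,
    and its average follower count.
    """
    if not users:
        raise ValueError("users must not be empty")
    ranked = sorted(users, key=lambda u: u[0], reverse=True)
    # Round the group size up so that a small sample still yields one user.
    k = max(1, math.ceil(len(ranked) * top_percent / 100))
    top = ranked[:k]
    total_followers = sum(f for f, _ in ranked)
    total_tweets = sum(t for _, t in ranked)
    top_followers = sum(f for f, _ in top)
    top_tweets = sum(t for _, t in top)
    return {
        "followers_share": top_followers / total_followers if total_followers else 0.0,
        "tweets_share": top_tweets / total_tweets if total_tweets else 0.0,
        "avg_followers": top_followers / k,
    }
```

a followers share well above the tweets share for the same group is the pattern the passage describes as unequal participation.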
the most popular % of scholars have an average follower base nearly times that of scholars in the lower % and times those in the bottom %.

to determine whether any factors were attributable to participant roles as either student or professor, we ran a multivariate analysis of variance (manova) with user descriptives as the dependent variables and role as the independent variable, which yielded significant overall effects (wilks' λ = . , p < . , partial eta squared = . , observed power = ). given the significance of the overall test, univariate main effects were examined, and significant effects were detected for followers, f( ) = . , p < . , partial eta squared = . , observed power = . ; listed, f( ) = . , p < . , partial eta squared = . , observed power = . ; and linking, f( ) = . , p < . , partial eta squared = . , observed power = . . estimated marginal means for each dependent variable by role (table ) revealed that:
1. professors had more followers than students (md = ).
2. professors were listed more often than students (md = ).
3. professors included links in their tweets more often than did students (md = %).

table: descriptive results of students' and professors' twitter use.
factor | mean | sd | median | min | max
followers . . ,
following . .
listed . .
tweets ,
lifespan . . . . .
frequency . . . .
mentioning . . . . .
hashtagging . . . .
linking . . . .
retweeting . . . .
replying . . . .

rq : which hashtags do education scholars contribute to?

other than #aera , which was the hashtag that all these scholars contributed to, what hashtags did they use? our analysis revealed that both graduate students and professors hashtagged % of their tweets. in total, , unique hashtags were used, with around unique hashtags used per user, but just unique hashtags ( . %) were considered viral. the average hashtag was used by only . users an average of . times, suggesting high variability. though non-viral hashtags accounted for . % of all hashtags, they were present in about half ( .
%) of all tweets. the viral hashtags were present in . % of all tweets, reflecting that some hashtags were important to a large number of participants. as shown in table , these hashtags were related to education (e.g., edchat, highered, edreform), civil rights or advocacy (e.g., ferguson, blacklivesmatter), or general internet culture (e.g., ff for follow friday, tbt for throwback thursday).

we created viral hashtag scatter plots of participation and use frequencies to better understand differences between groups and detected a strong, positive linear relationship for participation between groups (r = . ), as expressed in the following equation: student participation = . + (. × professor participation) (fig. ), and a moderate, positive power law relationship for hashtag frequency between groups (r = . ), as expressed in the following equation: student frequency = . × professor frequency (fig. ). these relationships reveal that students were somewhat less likely to participate in each viral hashtag than professors and that the frequency with which professors participate in these hashtags is exponentially greater than that of students.

. . rq : what factors predict participants' follower counts?

we had anticipated that several measurable participation factors might influence follower counts, including tweets, following, role, and lifespan. visual inspection of initial scatterplots and curve estimation tests revealed potential power law relationships between most factors, so values of scale variables were recoded logarithmically to allow for further analysis that assumed linearity. visual inspection of logarithmic scatterplots revealed linearity, and bivariate correlation results of raw values were compared to results of logarithmic values to ensure that data recoding improved correlations in data. in all cases, correlations in the data were improved as a result of the logarithmic recoding of scale variables (table ).
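the effect of logarithmic recoding on a correlation can be illustrated with a small sketch. this is not the authors' analysis: the helper `pearson_r`, the exponent, and the data points are assumptions made up for the illustration. the data follow an exact power law, so the pearson correlation of the raw values understates a relationship that becomes perfectly linear after a base-10 log transform.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented data following an exact power law: followers = 3 * following ** 0.5
following = [1, 10, 100, 1000, 10000]
followers = [3 * x ** 0.5 for x in following]

# Correlation of the raw values versus the log-recoded values.
r_raw = pearson_r(following, followers)
r_log = pearson_r([math.log10(x) for x in following],
                  [math.log10(y) for y in followers])

print(f"raw r = {r_raw:.3f}, logarithmic r = {r_log:.3f}")
```

because the invented data follow a power law exactly, the logarithmic correlation equals 1 up to floating-point error, while the raw correlation is lower; real data would show a smaller improvement.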
first, a logarithmic scatter plot of followers to following (fig. ) revealed a strong, positive linear relationship between the two variables (r = . ), as expressed in the following equation: lg(followers) = . × lg(following). this reveals that by following more people, participants will receive more followers but that this rate decreases. for example, if one user follows people, another follows , and a third follows , the first would receive a number of followers representing % of those followed, the second would receive %, the third %, and so forth. the scatterplot also revealed the possibility of outliers, which needed to be considered in later analysis.

next, a logarithmic scatter plot of followers to tweets (fig. ) revealed a strong, positive linear relationship between the two variables (r = . ), as expressed in the following equation: lg(followers) = . + (. × lg(tweets)). this reveals that by tweeting more, participants will receive more followers but that this rate decreases. for example, if one user posts tweets, another posts , and a third posts , the first would gain . followers per tweet, the second would gain . , the third . , and so forth.

[table: estimated marginal means and medians of role-based twitter differences. columns for professors and for graduate students: e.m. mean, std. error, median. rows: followers, listed, linking.]

[table: top hashtags by user role. columns for each role (student, professor): hashtag, % of users, tweets per user. hashtags in the first column group: education, highered, edchat, edtech, ferguson, ff, research, aera, stem, teachers. hashtags in the second column group: education, edchat, highered, ferguson, edtech, research, phdchat, teachers, edreform, ff.]

and third, a logarithmic scatter plot of followers to lifespan (fig. ) revealed a weak, positive linear relationship between the two variables (r = .
) as expressed in the following equation: lg(followers) = . + (. × lg(lifespan)). this reveals that mere time in the medium produces followers but that the rate of return on followers per time interval decreases. for example, a user's first year would garner . followers, the first four years would garner . followers per year, the first eight years would garner . per year, and so forth.

multiple linear regression was utilized to test whether any of the participation factors significantly predicted users' follower counts. the results of the stepwise linear regression indicated that a model of four predictors explained % of the user variance (r = . , f[ ] = . , p < . ), but casewise diagnostics identified nine outliers exceeding three standard deviations from predicted values. of these, seven identified themselves in their profiles as belonging to elite universities, including harvard, princeton, university of pennsylvania, and university of toronto. this suggested that outlier status may be influenced by self-identifying factors as either elite or otherwise. to test this, participant data was coded with two new variables based upon whether the user identified a university in their profile and, if so, whether this was an elite university, where an elite university was considered to be any university that the carnegie classification system described as a “very high research university.” anova comparisons of followers based on these new factors did not reveal significant results. outliers were therefore excluded from the analysis. upon exclusion of these nine cases, the strength of the predictive model increased to % (r = . , f[ ] = . , p < . ). it was found that following (b = . , p < . ), tweets (b = . , p < . ), role (b = . , p < . ), and lifespan (b = . , p < . ) significantly predicted followers.
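the casewise diagnostic described above (flagging cases whose residuals exceed three standard deviations) can be sketched as follows. this is a simplified illustration, not the authors' procedure: it uses a single predictor rather than the four-predictor stepwise model, the helpers `fit_line` and `flag_outliers` are hypothetical, and the data are invented so that exactly one case departs from the fitted line.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def flag_outliers(xs, ys, threshold=3.0):
    """Indices of cases whose residual exceeds `threshold` residual SDs."""
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard deviation with n - 2 degrees of freedom.
    resid_sd = math.sqrt(sum(e * e for e in residuals) / (len(xs) - 2))
    return [i for i, e in enumerate(residuals) if abs(e) > threshold * resid_sd]


# Invented log-scale data: lg(followers) = 0.5 + 0.8 * lg(tweets),
# with one case given ten times the followers the line predicts.
lg_tweets = [i / 10 for i in range(31)]
lg_followers = [0.5 + 0.8 * x for x in lg_tweets]
lg_followers[15] += 1.0

flagged = flag_outliers(lg_tweets, lg_followers)
print("cases flagged as outliers:", flagged)
```

refitting the model after dropping the flagged cases, as the study did with its nine outliers, would be the natural next step; whether exclusion is appropriate remains a judgment call that the code cannot make.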
those scholars who follow more users, have tweeted more, signal themselves as professors, and have been on twitter longer will have more followers (table ). this relationship may be expressed in the following equation: lg(followers) = (. × lg(following)) + (. × lg(tweets)) + (. × role) + (. × lg(lifespan)) − .

. discussion and implications

the results presented in this study reaffirm a number of previous research findings and contribute new insights to current knowledge regarding four aspects of the research questions: participation equity, role differences, scholars' online participation, and scholarly influence.

fig. . plot of linear relationship between hashtag participation of students and professors.

fig. . plot of power law relationship between hashtag frequency of students and professors.

. . participation equity

findings for rq reveal significant variation in how education scholars participate on twitter and the benefits received from participation. as scholars became more active (i.e., increased their number of tweets) and popular (i.e., increased their number of followers) on twitter, they did so exponentially. for instance, the most followed % garnered % of all followers and had an average follower base nearly times larger than that of other scholars, while providing only % of all tweets. this leads us to ask whether the use of social media for scholarly work necessarily leads to new and more egalitarian structures for scholarly dissemination or if it reflects existing, or fosters new, non-egalitarian structures of scholarly practice. results for rq show that being widely followed on social media is impacted by many factors that may have little to do with the actual quality of scholarly work (i.e., following count, tweet count, role, lifespan) and suggest that participation and popularity may be impacted by a number of additional factors unrelated to scholarly merit (e.g., wit, controversy, longevity).
results for rq should lead us to question whether social media metrics can and should be used as proxies for scholarly value, as has been argued by proponents of alternative metrics (e.g., priem & hemminger, ) and to recognize that a scholar's power to disseminate meaningful work in a digitally connected culture is mediated (and therefore can be manipulated) by effective social media strategies. exponential increases in activity and popularity lend empirical support to earlier claims that some individuals may be more capable of exploiting the commons than others (veletsianos & kimmons, ) and suggest that twitter and similar technologies may not necessarily be the democratizing forces they are sometimes claimed to be.

[table: bivariate correlations of factors with raw values vs. logarithmic values. raw values: followers against tweets, following, lifespan, and role. logarithmic values: lg(followers) against lg(tweets), lg(following), lg(lifespan), and role. all eight correlations are marked ⁎⁎, which denotes significance at the p < . level.]

therefore, we recommend future research comparing traditional measures of scholarly outputs (e.g., number of journal articles or citations) to twitter impact metrics to determine to what extent they may or may not be connected. similar research has been conducted in other disciplines (eysenbach, ; shuai et al., ; thelwall et al., ), but none has heretofore been done in the field of education. although measuring social media activity may help determine impact in a manner more relevant to today's society, the fact that participation patterns predict follower counts (as shown in this paper) should elicit questions as to what follower counts actually mean in these contexts. this should lead us to consider what other factors may influence scholars' abilities to share their work in a meaningful manner and to examine what metrics may be meaningful to inspect.

. .
role differences

results for rq also revealed that although participant role impacted follower counts, listed counts, and linking, this distinction accounted for only a very small percentage of variation ( % or less) in each of those factors, suggesting that the social capital traditionally associated with professorial status may not provide much influence on twitter. yet, results for rq reveal that viral hashtag use did skew toward professors, and our analysis also indicates that there may be some qualitative differences between hashtags based on role. based on our scatter plots (figs. and ), professors may be more likely to post on some civil rights issues like gender, sexual orientation, race, and violence and on content-related topics such as history, math, and literacy. students, on the other hand, appeared to be more likely to post on topics specifically related to the graduate student experience, such as dissertations, jobs, and data, and on topics of wider cultural interest, such as events and scandals. it seems that students also tended to refer to technology in generalities (e.g., tech), while professors were more likely to refer to specific technologies and initiatives (e.g., moocs, open access, elearning).

fig. . plot of users' linear relationship between logarithmic follower and following counts.

fig. . plot of users' linear relationship between logarithmic follower and tweet counts.

fig. . plot of users' linear relationship between logarithmic follower counts and lifespan.

these results extend earlier findings in the literature. li and greenhow ( ) suggested that differences in motivations may reflect different roles within conferences, and our research showed that participation patterns vary by roles.
future research in this area should explore the following: role differences within the community of educational scholars to determine if statistically significant differences exist in the hashtag use of students and professors; differences based upon other demographic factors, such as gender, race, and age; and contextualized, qualitative use of hashtags.

. . scholarly online participation

the results of rq also suggest that scholars' participation in and contributions to hashtags are diverse and may extend well beyond traditional notions of scholarship. other researchers have reported that scholars often use twitter in both personal and professional ways (e.g., bowman, ; holmberg & thelwall, ; veletsianos, ) and that events external to conferences can impact the conference hashtag activity (mahrt et al., ). our research contributes to the current understanding by demonstrating that scholars' online participation is influenced by temporal events other than conferences, such as the ferguson events (revealed by the viral presence of the hashtags #ferguson and #blacklivesmatter). our research also shows that peaks in activity (as shown by viral hashtags) can be produced by events of broad societal significance related to the scholarly interests of various subcultures within the community (e.g., critical educators, culture and race researchers) that may not be relevant to all subcultures within the community (e.g., international scholars). furthermore, although some individuals within the sample may have used the ferguson hashtag because it relates to their area of expertise, the sheer volume of tweets pertaining to this topic and the number of individuals contributing to it suggest that at least some of those scholars may not necessarily have had a research connection to the topic.

[table: regression coefficients of factors as predictors of followers. columns: b, std. error, t. rows: (constant), lg(following), lg(tweets), role, lg(lifespan). the b and t values for the constant are negative. lg(lifespan) is marked ⁎⁎ and all other rows are marked ⁎⁎⁎. ⁎⁎ indicates significance at the p < . level; ⁎⁎⁎ indicates significance at the p < . level.]

this last finding also suggests that, by focusing on the use of social media for scholarship, most of the current frameworks used to investigate emergent forms of technology-infused scholarship (i.e., social scholarship, digital scholarship, open scholarship) have focused on a fragment of scholars' online activities (kimmons, ) and have ignored other aspects of online presence (e.g., scholars' expression of identity). researchers need to explore a wider range of scholars' activities to fully understand their online lives and participation. at present, the scholarly community lacks frameworks to make sense of the diversity of scholars' online participation. the research community would benefit from further development and adoption of frameworks to understand scholars' online participation beyond scholarship. future research in this area, for example, might explore the reasons that scholars participate online in the ways that they do and investigate such topics as scholars' online activism, use of humor, and discourse surrounding academic life. in summary, future research should examine the substance of education scholars' tweets qualitatively in order to gain a more in-depth look at how scholars are using twitter.

. . scholarly influence

given that many scholars use twitter to share their work with a broader audience, it has been suggested that follower counts might be a useful metric of success in this regard (cf. marwick & boyd, ).
regardless of whether twitter and similar technologies are equalizing forces, our findings for rq offer several practical suggestions for scholars who would like to increase their followership: (a) tweet often, (b) follow many other users, (c) self-identify as a professor if accurate, and (d) continue using twitter over an extended period. whether one views this advice as gaming the system or legitimate participation in the community may depend on one's own assumptions about the medium. however, if follower counts are considered a metric of impact, one has to question them further, as our results show that education scholars' followership is most strongly predicted by the number of tweets posted and number of people followed. a wide range of variables might impact the number of tweets posted: extroverts might tweet more than introverts and scholars with family responsibilities might have less time to tweet than those without.

. limitations and future research

one major limitation of this study is that results will not necessarily transfer to other online social networks used by scholars, such as researchgate, facebook, or academia.edu (tufecki, ). another is that participants were selected based upon their use of the #aera hashtag and this decision led to the exclusion of education scholars who did not use the hashtag. as a result of this choice, we may be missing nuanced scholarly social media use that might lead education scholars to elect not to participate in popular conference hashtags even though they might have a twitter account. future research can address these two limitations by examining scholars' participation on other social media platforms, and by conducting analyses similar to the ones reported here using additional hashtags as vehicles to identify education scholars. the latter approach would enable researchers to broaden the data source to include professors and students who participated in other education-focused conferences/communities.
additionally, other areas of future research mentioned previously include:
• the comparison of traditional scholarly output measures to twitter impact metrics;
• the analysis of role, gender, race, and age differences regarding hashtag use;
• and the qualitative analysis of scholars' tweets to determine more substantial meanings of use.

. conclusion

this research used a large-scale data set to examine education scholars' participation on twitter. it examined the ways in which doctoral students and professors used twitter, the hashtags that they contributed to, and what factors predicted their follower counts. expanding opportunities to interact with diverse audiences in online settings and the potential of online networks to increase citations, reach, and impact have led many scholars to use social media and online social networks as part of their scholarly activities.

yet, the results of this study indicate that significant variation exists in education scholars' networked participation. while one of the anticipated outcomes of social media use is the democratization of knowledge sharing and participation, the results of this research question such purported egalitarian structures of social media use. significantly, the results reported herein caution researchers and practitioners that theoretical frameworks that focus exclusively on scholarship and overlook the diverse activities that scholars enact online ignore significant aspects of who scholars are when they are online. the richness and complexity of networked scholarship, coupled with the findings reported here, provide a fertile ground for further research on the topic.

acknowledgments

this study was partially funded by a grant from the canada research chairs program and the j.a. & kathryn albertson foundation.

references

bowman, t.d. ( ). differences in personal and professional tweets of scholars. aslib journal of information management, ( ), – .
brown, j., collins, a., & duguid, p. ( ).
situated cognition and the culture of learning. educational researcher, ( ), – .
burbules, n.c., & bruce, b.c. ( ). this is not a paper. educational researcher, ( ), – .
boyd, d., golder, s., & lotan, g. ( ). tweet, tweet, retweet: conversational aspects of retweeting on twitter. proceedings of the rd hawaii international conference on system sciences (retrieved from http://www.danah.org/papers/tweettweetretweet.pdf).
dron, j., & anderson, t. ( ). how the crowd can teach. in s. hatzipanagos, & s. warburton (eds.), handbook of research on social software and developing community ontologies (pp. – ). hershey, pa: igi global information science.
esposito, a. ( ). neither digital or open. just researchers. views on digital/open scholarship practices in an italian university. first monday, (sss).
eysenbach, g. ( ). can tweets predict citations? metrics of social impact based on twitter and correlation with traditional metrics of scientific impact. journal of medical internet research, ( ).
fetterman, d.m. ( ). webs of meaning: computer and internet resources for educational research and instruction. educational researcher, ( ), – .
greenhow, c., robelia, b., & hughes, j.e. ( ). learning, teaching, and scholarship in a digital age: web . and classroom research: what path should we take now? educational researcher, ( ), – .
hall, n. ( ). the kardashian index: a measure of discrepant social media profile for scientists. genome biology, ( ), .
haustein, s., peters, i., sugimoto, c., thelwall, m., & larivière, v. ( ). tweeting biomedicine: an analysis of tweets and citations in the biomedical literature. journal of the association for information science and technology, ( ), – .
holmberg, k., & thelwall, m. ( ). disciplinary differences in twitter scholarly communication. scientometrics, , – .
honeycutt, c., & herring, s.c. ( ). beyond microblogging: conversation and collaboration via twitter.
proceedings of the nd hawaii international conference on system sciences (retrieved from http://ella.slis.indiana.edu/*herring/honeycutt.herring. .pdf).
junco, r., heiberger, g., & loken, e. ( ). the effect of twitter on college student engagement and grades. journal of computer assisted learning, ( ), – .
kimmons, r. ( ). emergent forms of technology-influenced scholarship. in m. khosrow-pour (ed.), encyclopedia of information science and technology (pp. – ) ( rd ed.). igi global.
kimmons, r., & veletsianos, g. ( ). the fragmented educator . : social networking sites, acceptable identity fragments, and the identity constellation. computers & education, , – .
lave, j., & wenger, e. ( ). situated learning: legitimate peripheral participation. cambridge, uk: cambridge university press.
li, j., & greenhow, c. ( ). scholars and social media: tweeting in the conference backchannel for professional learning. educational media international, ( ), – .
lupton, d.a. ( ). 'feeling better connected': academics' use of social media. canberra: news & media research centre (retrieved on november from http://www.canberra.edu.au/about-uc/faculties/arts-design/attachments /pdf/n-and-mrc/feeling-better-connected-report-final.pdf).
mahrt, m., weller, k., & peters, i. ( ). twitter in scholarly communication. in k. weller, a. bruns, j. burgess, m. mahrt, & c. puschmann (eds.), twitter and society (pp. – ). new york, ny: peter lang.
mandavilli, a. ( ). peer review: trial by twitter. nature, , – .
marwick, a., & boyd, d. ( ). to see and be seen: celebrity practice on twitter. convergence, ( ), – .
mewburn, i., & thomson, p. ( ). why do academics blog? an analysis of audiences, purposes and challenges. studies in higher education, ( ), – .
priem, j., & hemminger, b. ( ). scientometrics . : new metrics of scholarly impact on the social web. first monday, ( ).
roblyer, m.d., mcdaniel, m., webb, m., herman, j., & witty, j.v. ( ).
findings on facebook in higher education: a comparison of college faculty and student uses and perceptions of social networking sites. the internet and higher education, ( ), – .
ross, c., terras, m., warwick, c., & welsh, a. ( ). enabled backchannel: conference twitter use by digital humanists. journal of documentation, ( ), – .
rowlands, i., nicholas, d., russell, b., canty, n., & watkinson, a. ( ). social media use in the research workflow. learned publishing, ( ), – .
scheliga, k., & friesike, s. ( ). putting open science into practice: a social dilemma? first monday, ( ), – .
shuai, x., pepe, a., & bollen, j. ( ). how the scientific community reacts to newly submitted preprints: article downloads, twitter mentions, and citations. plos one, ( ).
thelwall, m., haustein, s., larivière, v., & sugimoto, c. ( ). do altmetrics work? twitter and ten other social web services. plos one, ( ).
tufecki, z. ( ). big questions for social media big data: representativeness, validity and other methodological pitfalls. proceedings of the eighth international aaai conference on weblogs and social media. mi: ann arbor.
van noorden, r. ( ). online collaboration: scientists and the social network. nature, ( ), – .
veletsianos ( ). social media in academia: networked scholars. new york, ny: routledge.
veletsianos, g. ( ). higher education scholars' participation and practices on twitter. journal of computer assisted learning, ( ), – .
veletsianos, g., & kimmons, r. ( ). assumptions and challenges of open scholarship. the international review of research in open and distance learning, ( ), – .
veletsianos, g., & kimmons, r. ( ). scholars and faculty members lived experiences in online social networks. the internet and higher education, ( ), – .
wenger, e. ( ). communities of practice. learning, meaning and identity. cambridge, uk: cambridge university press.
de winter, j.c.f. ( ). the relationship between tweets, citations, and article views for plos one articles. scientometrics, ( ), – .
yettick, h. ( ). one small droplet: news media coverage of peer-reviewed and university-based education research and academic expertise. educational researcher, ( ), – .
reading & writing - journal of the reading association of south africa
issn: (online) - , (print) -
http://www.rw.org.za
open access
original research

authors: willie t. chinyamurindi, zikhona dlaza
affiliations: department of business management, university of fort hare, south africa
corresponding author: willie chinyamurindi, chinyaz@gmail.com
dates: received: dec.; accepted: may; published: nov.
how to cite this article: chinyamurindi, w.t. & dlaza, z., , ‘can you teach an old dog new tricks? an exploratory study into how a sample of lecturers develop digital literacies as part of their career development’, reading & writing ( ), a . https://doi.org/ . /rw.v i .
copyright: © . the authors. licensee: aosis. this work is licensed under the creative commons attribution license.

introduction

information communication technology (ict) is becoming popular, especially on university campuses, both as a learning tool (chinyamurindi & shava ) and as a form of communication (shava, chinyamurindi & somdyala ). according to nyambane and mzuki ( ), the rapid growth of ict has brought about remarkable changes in the st century. a notable change, especially within the university campus, is the role of ict in educational delivery. notably, empirical studies (mostly in a south african setting) to this end have commonly been focussed on students as end-users (e.g. chinyamurindi & shava ; shava et al. ). the processes that end-users, such as teachers and faculty members, engage in when adopting and using technology are argued to be key to effective delivery using ict (tiba, condy & junjera ). this warrants further investigation, also given contextual and socio-historical issues surrounding south africa (khunou ).
further, there appears to be a lot of government support in the use of ict, as a way of widening access to education albeit, the landscape of discrimination the country is coming from (south african department of education ). to this end, the need to develop ict policy and skills is argued for (blignaut & els ) by creating life-long learners (south african department of education ). priority in all this is also on developing skills that allow teachers and those in management and administration to help students in using icts. the concept of digital and information communication technology (ict) literacy is receiving renewed empirical attention. this focus is attributed to the changing nature of society and the move towards the ideals of the knowledge-based economy. further, universities in south africa and internationally are encouraging the fusion of technology in how students read and write. this research gives focus to the lecturer, particularly those lecturers who were once resistant to the use of technology as part of teaching instruction. the aim here was to track how these lecturers over a one-year period develop digital and ict literacies to assist their career development. the study adopted an interpretivist philosophy, relying on the qualitative research approach and a series of three interviews over a year period with lecturers at a selected south african university. data were analysed using thematic analysis to generate three central themes. firstly, the source of resistance in using technology as part of teaching and learning emanated from two main subthemes as perceptions: ( ) technology viewed as a fad with little or no impact on actual learning and ( ) challenges concerning institutional technology support as a limitation in integrating technology into teaching and learning. 
secondly, the change of attitude (albeit a reluctant one) towards using technology as part of teaching and learning was driven by factors such as peers, the ‘tech-savvy’ student community and a consideration of future career prospects, as digital and ict literacies are becoming critical skills for career progression. finally, in developing digital and ict literacies, the lecturers relied on: (1) participation in training programmes that encourage digital scholarship, (2) personal investment of time and effort to learn how to develop digital and ict literacies and (3) developing a career and identity management strategy that incorporates digital and ict literacies. implications for teaching and learning practice are drawn from these findings. further, the impact on the individual career development of lecturers is also discussed.

despite the implementation of e-education in south africa, there seems to be a slow rate of adoption and use of technology in the classroom (tiba et al. ). notably, empirical studies pay less attention to how this happens in two contexts: firstly, a context of rapid ict usage, especially amongst students as end-users; and secondly, a context of individual career development. this research pays attention to these two contexts and seeks to explore how a sample of university lecturers develop digital literacies as part of their career development.
literature review
this research is based on a theoretical lens that seeks to understand the adoption of technology amongst lecturers. in achieving this, theoretical consideration is given to the technology acceptance model (tam) (venkatesh & davis ). this model consists of two beliefs, namely perceived usefulness and perceived ease of use of the application. these two beliefs determine attitudes towards the adoption of a new technology. the attitude towards adoption depicts the prospective adopter’s positive or negative orientation or behaviour towards incorporating a new technology (venkatesh & davis ). usage could also be influenced by an individual’s perception of the ability to use the technology (compeau & higgins ). subsequently, all these elements of tam can serve as predictors of human behaviour (fishbein & ajzen ; lee & lehto ) or behavioural intention to use, thus resulting in tam being seen as a useful predictor in explaining human behaviour concerning technology acceptance (chinyamurindi & louw ; chinyamurindi & shava ; saadé, nebebe & tan ).

the tam is built around the theory of reasoned action (tra) (fishbein & ajzen ), which suggests that individual behaviour is initiated by the intention to perform a particular task. behavioural intention is, in turn, shaped by one’s attitude and subjective norms regarding the behaviour in question (fishbein & ajzen ). the tra also posits that intention to act determines behaviour, and a causal link is believed to exist between the two (venkatesh & davis ). the attitude–behavioural intention relationship, as espoused in the tam constructs, assumes that all intentions to use technology are equal and can be formed on the basis of the positive use of the technology.

the rapid advancement in ict has unceasingly changed the nature of teaching and learning in higher education (voogt ).
the traditional teacher-centred approaches are now being replaced by dynamic and interactive student-centred learning environments because of the new emerging technology (davis ). the new generations of teachers are now expected to be not only an excellent source of information about curriculum design and delivery but also an agent who is supposed to minimise the gap between how technology is applied in classroom teaching and the opportunities offered by technology to enhance student learning (northcote & lim ). according to khan, bibi and hason ( ) for teachers to keep up with global pace, they need to be sufficiently competent in using ict in educational settings. a study conducted by buabeng-andoh ( ) found a positive correlation between ict use and competences. this finding is consistent with sorgo, verckovnik and kocijancic ( ) who found a high correlation between frequency of use of ict, perceived value and teachers’ competence in the use of ict amongst science teachers. they further concluded that teachers’ competence and confidence were predictors of the role of ict in teaching. similarly, the findings by petrogiannis ( ) concluded that computer-experienced teachers were more ready to use ict in their classes than non-experienced teachers. thus, the lack of familiarity with advanced functionalities may be the main reason for teachers not attempting to use ict (blin & munro ). teachers are expected to use technology in innovative ways that provide students with an engaging and empowering learning experience to prepare them to interact with a globally networked society (kopcha, rieber & walker ). research indicates that the effective use of technology in higher education teaching and learning is one of the factors contributing towards the improvement in the quality of instruction (khan et al. ). a study conducted by collins ( ) provided evidence about the positive effects of using ict in teaching and learning situations. 
information communication technologies have great potential for knowledge dissemination, effective learning and the development of more efficient educational services (buabeng-andoh ). koehler and mishra ( ) similarly found that ict alone cannot lead students to learn; however, teachers who incorporate ict into their teaching can facilitate students’ learning (khan ).

an essential aspect of the shift in technological processes has been the move towards the acceptance and use of ict for teaching and learning (oye, iahad & ab.rahim ). according to buabeng-andoh ( ), the adoption of ict in education has been seen as a powerful way to contribute to educational change, better prepare students for the information age, improve learning outcomes and competencies of learners, and equip students with survival skills for the information society. therefore, teachers are expected to integrate ict into their teaching and learning processes. for instance, an analysis by hue and ab jalil ( ) of the relationship between lecturers’ attitudes towards ict integration into the curriculum and their use of ict in the classroom suggests that when lecturers’ attitudes towards ict integration into the curriculum improve, there is a possibility that their ict use will also increase. furthermore, the successful initiation and implementation of educational technology in the school programme depend strongly on teachers’ support and attitudes. it is believed that if teachers perceive technology programmes as fulfilling neither their own needs nor their students’ needs, it is likely that they will not integrate the technology into teaching and learning (buabeng-andoh ).

evidence suggests that teachers’ attitudes and beliefs influence successful integration of ict into teaching (hew & brush ; keengwe & onchwari ).
if teachers’ attitudes are positive towards the use of educational technology, then they can easily provide useful insight about the adoption and integration of ict into teaching and learning processes. consequently, students’ learning processes and outcomes are connected to their teachers’ teaching beliefs, understanding and practices (khan ). for instance, previous studies assert that traditional beliefs that teachers and students hold have a negative influence on the use of ict in the classroom (e.g. hermans et al. ; mäkitalo-siegl, kohnle & fischer ). a study conducted by drent and meelissen ( ) on factors that influence the innovative use of ict by teacher educators in the netherlands revealed that a student-oriented pedagogical approach, a positive attitude towards computers, computer experience and the personal entrepreneurship of the teacher educator have a direct positive influence on the innovative use of ict by the teacher.

by contrast, previous studies suggest that a small number of teachers believe that the benefits of ict are not evident (buabeng-andoh ). an empirical survey revealed that one-fifth of european teachers thought that the use of ict in teaching did not benefit their students’ learning (korte & husing ). a survey of united kingdom (uk) teachers also revealed that teachers’ positivity about the possible contributions of ict was moderated as they became ‘rather more ambivalent and sometimes doubtful’ about ‘specific, current advantages’ (becta : ). furthermore, it has been found that lack of access to equipment in the classroom and lack of teachers’ training and skills in the use of the equipment contribute to the low use of ict by teachers (buabeng-andoh ).
similarly, the study conducted by howie, muller and paterson ( ) found that lack of computer literacy amongst teachers, lack of training with regard to integration of ict into teaching and the absence of a properly developed computer skills curriculum were barriers to teachers’ application of the technology. therefore, the inconsistency between teachers’ actual use of ict and their perceptions can be attributed to an inadequate supply of ict resources, lack of access to the right kinds of technology, inadequate ict pedagogical training and insufficient administrative support (buabeng-andoh ).

research methodology
this research adopted a qualitative methodology hinging on the interpretivist philosophy and using an exploratory research design (creswell ; silverman ). the qualitative approach is praised for the way it allows for gaining a deeper understanding of the lived experience (sheard ) and is effective in making sense of this experience (chinyamurindi , a, b).

research context
the research project was conducted at a south african university in the eastern cape province of south africa where the authors of this article are based. this research formed the research component of a postgraduate diploma in higher education for which one of the authors of this article was enrolled. this research followed the institutional ethical requirements of the participating institution.

participant sampling and data collection
a total of participants took part in the study. the breakdown of these participants was as follows: were males ( %); were females ( %); age range was – years; average age was years; black people ( %) and white people ( %). table shows some description of the participants who took part in this study. a convenience sampling approach was used, relying on those participants who were ‘accessible and available’ (cohen, manion & morrison : ). participants had to be full-time members of staff with the participating university.
contact and recruitment of the participants were made through an email sent by the researchers explaining the nature of the project and the expectations. upon agreeing to be part of the study, an interview was arranged at a venue deemed suitable by the prospective respondent. following all this, a semi-structured interview approach was used following a predetermined structure of questions (silverman ). the study (including interviews) was conducted from may to may , and each interview lasted between min and . h.

strategies to ensure data quality
to ensure data quality, four steps were taken. firstly, initial interview questions were pretested with a sample of postgraduate students with similar characteristics to those participants who took part in the study. these students were also teaching, using a blended learning approach that combines technology and traditional classroom delivery. secondly, to ensure credible data, all interview data were recorded and transcribed verbatim within h, as done in previous research (chinyamurindi a, b). thirdly, after data transcription, participants were emailed a copy of the transcript to verify its accuracy. any changes were made to the transcript as per the wish of the participant. lastly, comprehensive notes were taken during the data collection process, and these assisted in the collection and analysis of data. in essence, this was a point of reflexivity (taylor, gibbs & lewins ) and a way of achieving some sensitivity (mays & pope ) in the collection and analysis of data.

data analysis
the interviews were exported into qsr international’s nvivo , a data analysis and management software package useful when dealing with masses of text, graphic, audio and video data (reuben & bobat ). thematic analysis was used as a means of data analysis (braun & clarke ).
thematic analysis is viewed as a flexible and useful research tool because of its ability to potentially provide a rich and detailed yet complex account of data (clarke & braun ). the process of generating themes is believed to happen at two levels: (1) the semantic level and (2) the latent level (braun & clarke ). at the semantic level, themes are identified within the explicit or surface meanings of the data and the analyst is not looking for anything beyond what a respondent has said or written; by contrast, thematic analysis at the latent level seeks to identify or examine underlying ideas that are theorised as shaping or influencing the semantic content of the data (braun & clarke ). a semantic level approach was used for this research. this entailed an analytic process that progresses from description, where the data are organised to show patterns in semantic content and summarised, to interpretation, where there is an attempt to theorise the significance of the patterns and their broader meanings and implications in relation to previous literature (braun & clarke ).

research results
drawing from the data analysis, three main findings emerged. firstly, resistance to using technology as part of teaching and learning emanated from two main perceptions, treated as subthemes: (1) technology viewed as a fad with little or no impact on actual learning and (2) challenges concerning institutional technology support as a limitation in integrating technology into teaching and learning. secondly, the change of attitude (albeit a reluctant one) towards using technology as part of teaching and learning was driven by factors such as peers, the ‘tech-savvy’ student community and a consideration of future career prospects, as digital and ict literacies are becoming critical skills for career progression.
finally, in developing digital and ict literacies, the lecturers relied on: (1) participation in training programmes that encourage digital scholarship, (2) personal investment of time and effort to learn how to develop digital and ict literacies and (3) developing a career and identity management strategy that incorporates digital and ict literacies. table highlights the development of these themes stemming from the initial codes that are generated.

table : participant profiles.
participant (pseudonym) | wp | ict k | g | p
anna | | good | female | lecturer
sly | | good | male | lecturer
jane | | good | female | senior lecturer
danny | | excellent | male | associate professor
dorich | | good | male | lecturer
shivon | | good | male | lecturer
vanessa | | excellent | female | senior lecturer
brian | | excellent | male | lecturer
indor | | good | female | lecturer
gauna | | excellent | male | senior lecturer
chigo | | good | female | senior lecturer
thaimi | | good | male | senior lecturer
jolif | | excellent | female | senior lecturer
kevon | | good | male | associate professor
progress | | good | male | full professor
jacob | | good | male | associate professor
jo | | good | female | associate professor
pat | | good | female | senior lecturer
simone | | excellent | female | associate professor
marjorie | | good | female | associate professor
wp, work experience; ict k, information communication technology knowledge; g, gender; p, position.

source of resistance
based on the data analysis, two main sources of resistance towards technology usage in teaching and learning were identified. table presents these sources of resistance. from table some accounts can be used as examples. for instance, marjorie was resistant towards using technology drawing on the years of work experience that she has.

‘i have been in academia for over ten years now. i would say when the technology movement started i was sceptical. it appears there is always a fascination with tools and yet the core issue of teaching and learning is not changing.’ (marjorie, female, associate professor)

this view by marjorie was confirmed by another participant:

‘it seems we are in a state of flux in higher education today. an endless excitement with tools and pedagogical resources hinging on technology. i understand why we lecturers can be resistant. to keep up with all these tools and resources is a mission. what does not change are the basics, these are the basics from which our teaching and learning are grounded in.’ (brian, male, lecturer)

linked to this subtheme and derived from table are experiences that also reveal a failure of institutional authorities in helping address fears and concerns lecturers have regarding technology usage. for instance, shivon cited an experience:

‘last year, i had a class of from a course that usually hosted students. so i approached the relevant authorities to see how technology can help as i had taught without technology. the whole process took at least three weeks to set-up, and for me, it was just not worth the delay. mainly due to an internal issue that can easily be solved.’ (shivon, male, lecturer)

other participants like vanessa cited an institutional disparity to exist concerning the use of technology:

‘for instance, a student can take eight courses a semester. meaning they have eight different lecturers. each of these has their way of teaching. some use technology and others don’t. do you see the confusion it creates?
in my view, there is more support for traditional classroom learning than the electronic aspect.’ (vanessa, female, senior lecturer)

table : development of themes: initial codes to resultant themes.
initial codes | resultant themes
scepticism of using the technology; negative feelings around technology; questions about technology usage | technology as a fad
internal problems regarding ict support; software challenges; hardware challenges | lack of institutional support
role of peers; staff support; collegial spirit | peer influence
changing type of learners; student influence; 21st century learner needs | tech-savvy students
promotion prospects; tenure; career prospects | future career prospects
training efforts; hrd efforts | participation in training
agency; initiative and drive | personal investment
self-marketing efforts; awareness of oneself | career and identity management
ict, information communication technology.

table : sources of resistance to technology usage.
source: technology as a fad
‘the real metric for me is student throughput rates. if i get there through the traditional approach, it does not matter. technology just makes things complex, yet the metrics are still the same.’ anna
‘if students are passing and there is evidence to this through me standing in front of them, i don’t see what the problem is. technology has helped with keeping useful data, especially around at-risk students. challenges outweigh benefits for me. these affect my attitude towards using technology.’ jolif
‘the resistance appears to emerge from challenges within the system. for instance, the issue of pass rates. we get to a point where we have to choose between the objective and the subjective. pass rates are objective, and experience of using technology is subjective. i prefer that which my career is based on, the former.’ danny
source: lack of institutional support
‘so, what happens when the technology is down – for this is a reality i have experienced and makes me sceptical of technology in teaching instruction.’ indor
‘a help desk should be available hours. for instance, when students need help accessing the learning management system in the early hours of the morning who is there to help them?’ jane
‘we need support, and this starts from the highest office on the campus and trickles to our students. the absence of such support is tantamount to failure.’ thaimi

sources causing the change of attitude
after five months, the second stream of interviews was conducted with the same participants. the aim was to understand their experiences in using technology despite the initial views of resistance. our concern was the sources that may change an attitude to favour technology usage. table presents a list of these sources of change and some quotes to support these.

table : sources causing the change of attitude.
source: peer influence
‘my colleagues have been the greatest influence on my adoption and use of technology over the years. i guess iron sharpens iron and that is what we have had to do as more senior academics.’ marjorie
‘one of my peers took me through a crash course to understand dynamics of blackboard, our learning management system. this made the difference for me and helped change my view of the worth of technology instruction.’ jacob
‘we work in communities of practice in academia. these are also communities of collaboration. so it becomes mandatory to tap into the skills of those who can do what you can’t do. over the years this has been the lesson i learnt given technology usage.’ pat
source: tech-savvy students
‘our students force you to be relevant. none but their use of technology heightens this issue.’ kevon
‘think of standing in a class with students who are information hungry and with devices that make such information easy to access. you have no choice but always to keep up.’ progress
‘it comes down to relevance. to be relevant understand your constituents. this involves understanding the tools they use. without this, you are just irrelevant.’ gauna
source: future career prospects
‘university promotions depend on your ability to prove to your prospect employer that you are relevant to the changing needs of society. relevance to various technological tools becomes mandatory.’ indor
‘your next job is reflective of the way you can show how able you are to change. an inability to change reveals that you are weak and not needed given the wider changes happening in society.’ simone
‘to show technological acumen on your cv is a plus. so, one is really compelled to show and use this for their own survival.’ jo

how lecturers develop digital literacies?
finally, in developing digital and ict literacies, the lecturers relied on: (1) participation in training programmes that encourage digital scholarship, (2) personal investment of time and effort to learn how to develop digital and ict literacies and (3) developing a career and identity management strategy that incorporates digital and ict literacies. table presents the illustrating quotes for this finding.

discussion and conclusion
this study sought to explore how a sample of university lecturers develop digital literacies as part of their career development. the backdrop is a context in which south africa is viewed as a country that places priority on the adoption and use of technology in the classroom (tiba et al. ). in support of the observation of the rapid change brought about by ict (nyambane & mzuki ), this research illustrates factors that affect lecturers as part of the knowledge generation process, albeit uncomfortably in some cases. so in a bid to create life-long learners, blignaut and els ( ) suggested that the focus should also be on understanding how to empower those mandated with the responsibility to develop such learners, and this heightens the focus on university lecturers.

although a possible tension exists, as illustrated by this study, between a traditional approach to learning (voogt ) and a contemporary perspective (khan et al. ), some form of compromise can be arrived at. for instance, the participants of the study narrated their concerns about technology usage and possible reasons for their negative attitude. these concerns should not be dismissed, as they can be useful. firstly, they can be a basis on which interventions to improve teaching delivery using technology may be suggested (sorgo et al. ). secondly, as illustrated by the findings, these concerns improve understanding of the lived experience of technology usage in teaching delivery, which ultimately affects the career prospects of the lecturers. finally, attending to these concerns creates a platform to identify best practices concerning ict use and the competencies needed (petrogiannis ). this can affect full-system utilisation (davis ).

this research supports previous south african research that shows the popularity of the use of technology in teaching and learning (chinyamurindi & shava ). previous research attested to this popularity by focusing on students as end-users (chinyamurindi & shava ; shava et al. ).
this research directs attention to the lecturer as an important stakeholder, not only in the generation of knowledge but also in how it is presented. such a focus on the lecturer as a user of ict for teaching practice gives prominence to the importance of empowering this constituent group as part of the deliverables of their jobs (tiba et al. ). the findings of this research show that contextual issues, as probed in previous south african studies, were key in influencing the adoption of technology amongst teaching staff (khunou ). thus, the lecturing and teaching staff may be considered to be life-long learners as well (blignaut & els ).

the source of resistance by the lecturers to technology could be attributed to the pressures from the context they are operating within (tiba et al. ). emphasis within this context is on the need to meet deliverables to an ever-changing student constituency that is technologically savvy. thus, students indirectly drive this pressure, which requires lecturers not only to catch up with the latest technology but also to be users themselves. the change brought about by the use of icts has therefore created a different atmosphere (nyambane & mzuki ).

the resistance towards technology as found in this study could be indicative of generational challenges that exist within the academy. two contesting views seem to exist: firstly, a traditional teacher-centred approach and, secondly, a more contemporary perspective with emphasis on technology usage (davis ). this generational gap manifests in different approaches to learning and plays a key role in the resistance to icts. this study illustrates how this links to the individual career development of lecturers. guided by the literature, lecturers may need to bridge the identified gaps to allow them to efficiently and effectively meet the needs of the 21st century learner (khan et al. ).
institutions of higher learning may need to come in and assist this process through empowerment initiatives.

the study makes several contributions. firstly, it advances the literature within a south african setting that concerns aspects of ict literacies (chinyamurindi & shava ; shava et al. ). whereas previous studies have focussed on student samples, this research locates its argument by giving voice to university lecturers. this sample is as important as the student sample (yet under-researched), as argued by other researchers (tiba et al. ). further, the research heightens the focus on the role of context in shaping aspects of technology usage (khunou ). the findings of this study highlight those issues from a lecturer’s perspective that can help to create a better working context as far as technology is concerned.

the current work has some limitations. firstly, because of the qualitative nature of the research, the findings are subjective and based on lived experience. thus, the results of this research are not generalisable to the entire population of university lecturers in south africa. we thus urge caution when interpreting the findings and drawing implications from them. secondly, this research does not reveal an accurate demographic profile of the lecturers in the selected university as participants were conveniently selected. despite these limitations, future research can be conducted to improve on them. further, a quantitative research approach can be used to address the subjectivity of this research. this would allow for a more representative sample. inferences can also be made to establish linkages and associations between the variables being studied.

table : how digital and information communication technology literacies develop?
source: participation in training
‘i signed up for a course organised by getsmart, and this helped get the latest skills-set concerning technology usage.’ kevon
‘our teaching and learning centre organises some training workshops and forums that help make us aware of the latest happenings and also share best practices.’ pat
‘when i attended a teaching and learning conference there was a special session of ict and digital skills. ever since i attend this conference every year in durban to hone my skills.’ progress
source: personal investment
‘i make an effort to subscribe to different blogs and viewpoints to empower myself.’ simone
‘you are the master of your destiny. when you realise this, you make an effort to create this destiny by taking proactive steps.’ brian
‘there is no substitute for investment than a personal investment to keep abreast of changes happening.’ jo
source: career and identity management
‘i have now invested in belonging to a range of social networking sites and professional societies as part of managing my career. this allows me to learn from others around the clock.’ thaimi
‘a friend recommended a consult a technologist who is independent as part of trying to find ways to improve my impact not just as a teacher but also as part of my career identity beyond this institution.’ anna
‘part of all this is an awareness of what i am becoming as part of the experience. thus, each experience deserves to be managed with thoughtfulness as it forms part of the whole experience. i have stopped complaining and started managing issues. this has allowed me to embrace technology as a friend than a foe.’ danny
in conclusion, one participant of this study spoke of the challenge they face and a possible solution:

‘can you teach an old dog new tricks? this is an insult to an old dog. you can teach a dog anything, just as long as it wants to learn.’ (progress, male, full professor)

acknowledgements
the authors thank all the participants for taking part in this research.

competing interests
the authors declare that they have no financial or personal relationships which may have inappropriately influenced them in writing this article.

authors’ contributions
w.t.c. collected and analysed the data. z.d. wrote the initial draft of the article and also assisted in data collection.

references
becta, , harnessing technology: schools survey , viewed january , from http://emergingtechnologies.becta.org.uk/uploaddir/downloads/page_documents/research/ht_schools_survey _analysis.pdf
blignaut, a.s. & els, c.j., , ‘comperacy assessment of postgraduate students’ readiness for higher education’, the internet and higher education ( ), – . https://doi.org/ . /j.iheduc. . .
blin, f. & munro, m., , ‘why hasn’t technology disrupted academics’ teaching practices? understanding resistance to change through the lens of activity theory’, computers & education ( ), – . https://doi.org/ . /j.compedu. . .
braun, v. & clarke, v., , ‘using thematic analysis in psychology’, qualitative research in psychology ( ), – . https://doi.org/ . / qp oa
buabeng-andoh, c., , ‘an exploration of teachers’ skills, perceptions and practices of ict in teaching and learning in the ghanaian second-cycle schools’, contemporary educational technology ( ), – .
chinyamurindi, w. & shava, h., , ‘an investigation into e-learning acceptance and gender amongst final year students’, south african journal of information management ( ), – . https://doi.org/ . /sajim.v i .
chinyamurindi, w.t., , ‘an investigation of career change using a narrative and story-telling inquiry’, south african journal of human resource management ( ), – . https://doi.org/ . /sajhrm.v i .
chinyamurindi, w.t., a, ‘using narrative analysis to understand factors influencing career choice in a sample of distance learning students in south africa’, south african journal of psychology ( ), – . https://doi.org/ . /
chinyamurindi, w.t., b, ‘a narrative investigation into the meaning and experience of career success: perspectives from women participants’, south african journal of human resource management ( ), – . https://doi.org/ . /sajhrm. v i .
chinyamurindi, w.t. & louw, g.j., , ‘gender differences in technology acceptance in selected south african companies: implications for electronic learning: original research’, sa journal of human resource management ( ), – . https://doi.org/ . /sajhrm.v i .
clarke, v. & braun, v., , ‘teaching thematic analysis: overcoming challenges and developing strategies for effective learning’, the psychologist ( ), – .
cohen, l., manion, l. & morrison, k., , research methods in education, routledge, new york.
collins, t., , ‘supporting teachers to embed flexible learning technologies in their teaching practice: a case study’, national vocational education and training research program occasional paper, ncver, adelaide.
compeau, d.r. & higgins, c.a., , ‘computer self-efficacy: development of a measure and initial test’, mis quarterly ( ), – . https://doi.org/ . /
creswell, j.w., , ‘mapping the field of mixed methods research’, journal of mixed methods research ( ), – . https://doi.org/ . /
davis, n., , ‘leadership of information technology for teacher education: a discussion of complex systems with dynamic models to inform shared leadership’, journal of information technology for teacher education , – . https://doi.org/ . /
drent, m. & meelissen, m., , ‘which factors obstruct or stimulate teacher educators to use ict innovatively?’, computers & education ( ), – . https://doi.org/ . /j.compedu. . .
fishbein, m. & ajzen, i., , belief, attitude, intention, and behaviour: an introduction to theory and research, addison-wesley, reading, ma.
hermans, r., tondeur, j., van braak, j. & valcke, m., , ‘the impact of primary school teachers’ educational beliefs on the classroom use of computers’, computers & education , – . https://doi.org/ . /j.compedu. . .
hew, k.f. & brush, t., , ‘integrating technology into k- teaching and learning: current knowledge gaps and recommendations for future research’, educational technology research & development , – . https://doi.org/ . / s - - -
howie, s.j., muller, a. & paterson, a., , information and communication technologies in south african secondary schools, hsrc press, pretoria.
hue, l.t. & ab jalil, h., , ‘attitudes towards ict integration into curriculum and usage among university lecturers in vietnam’, international journal of instruction ( ), – .
keengwe, j. & onchwari, g., , ‘computer technology integration and student learning: barriers and promise’, an international journal of science education and technology , – . https://doi.org/ . /s - - -
khan, s.h., , ‘emerging conceptions of ict-enhanced teaching: australian tafe context’, instructional science: an international journal of the learning sciences ( ), – . https://doi.org/ . /s - - -
khan, s.h., bibi, s. & hason, m., , ‘australian technical teachers’ experience of technology integration in teaching’, sage open ( ), – . https://doi.org/ . /
khunou, g., , ‘what middle class? the shifting and dynamic nature of class position’, development southern africa ( ), – . https://doi.org/ . / x. .
koehler, m.j. & mishra, p., , ‘what happens when teachers design educational technology? the development of technological pedagogical content knowledge’, journal of educational computing research ( ), – . https://doi.org/ . / ew - wb-bkhl-qdyv
kopcha, t.j., rieber, l.p.
& walker, b.b., , ‘understanding university faculty perceptions about innovation in teaching and technology’, british journal of educational technology ( ), – . https://doi.org/ . /bjet. korte, w.b. & husing, t., , ‘benchmarking access and use of ict in european schools : results from head teacher and a classroom surveys in european countries’, e-learning papers ( ), – . lee, d.y. & lehto, m.r., , ‘user acceptance of youtube for procedural learning: an extension of the technology acceptance model’, computers and education , – . https://doi.org/ . /j.compedu. . . mäkitalo-siegl, k., kohnle, c. & fischer, f., , ‘computer-supported collaborative inquiry learning and classroom scripts: effects on help-seeking processes and learning outcomes’, learning and instruction , – . https://doi. org/ . /j.learninstruc. . . mays, n. & pope, c., , ‘qualitative research in health care: assessing quality in qualitative research’, bmj: british medical journal ( ), . https://doi. org/ . /bmj. . . northcote, m. & lim, c.p., , ‘the state of pre-service teacher education in the asia-pacific region’, in c.p. lim, k. cock, g. lock & c. brook (eds.), innovative practices in pre-service teacher education: an asia-pacific perspective, pp. – , sense publishers, rotterdam, netherlands. nyambane, c.o. & nzuki, d., , ‘factors influencing ict integration in teaching – a literature review’, international journal of education and research ( ), – . oye, n.d., iahad, n.a. & ab.rohim, n., , ‘adoption and acceptance of ict innovation in nigerian public universities’, international journal of computer science engineering and technology ( ), – . petrogiannis, k., , ‘the relationship between perceived preparedness for computer use and other psychological constructs among kindergarten teachers with and without computer experience in greece’, journal of information technology impact ( ), – . reuben, s. 
& bobat, s., , ‘constructing racial hierarchies of skill – experiencing affirmative action in a south african organisation: a qualitative review’, sa journal of industrial psychology ( ), – . https://doi.org/ . /sajip. v i . saadé, r.g., nebebe, f. & tan, w., , ‘viability of the technology acceptance model in multimedia learning environments: a comparative study’, interdisciplinary journal of knowledge and learning objects ( ), – . https://doi. org/ . / shava, h., chinyamurindi, w. & somdyala, a., , ‘an investigation into the usage of mobile phones among technical and vocational educational and training students in south africa’, south african journal of information management ( ), – . https://doi.org/ . /sajim.v i . sheard, l., , ‘anything could have happened: women, the night-time economy, alcohol and drink spiking’, sociology ( ), – . https://doi.org/ . / http://www.rw.org.za http://emergingtechnologies.becta.org.uk/uploaddir/downloads/page_documents/research/ht_schools_survey _analysis.pdf http://emergingtechnologies.becta.org.uk/uploaddir/downloads/page_documents/research/ht_schools_survey _analysis.pdf https://doi.org/ . /j.iheduc. . . https://doi.org/ . /j.compedu. . . https://doi.org/ . /j.compedu. . . https://doi.org/ . / qp oa https://doi.org/ . /sajim.v i . https://doi.org/ . /sajhrm.v i . https://doi.org/ . / https://doi.org/ . /sajhrm.v i . https://doi.org/ . /sajhrm.v i . https://doi.org/ . /sajhrm.v i . https://doi.org/ . /sajhrm.v i . https://doi.org/ . / https://doi.org/ . / https://doi.org/ . / https://doi.org/ . / https://doi.org/ . /j.compedu. . . https://doi.org/ . /j.compedu. . . https://doi.org/ . /j.compedu. . . https://doi.org/ . /s - - - https://doi.org/ . /s - - - https://doi.org/ . /s - - - https://doi.org/ . /s - - - https://doi.org/ . / https://doi.org/ . / https://doi.org/ . / x. . https://doi.org/ . / x. . https://doi.org/ . / ew - wb-bkhl-qdyv https://doi.org/ . / ew - wb-bkhl-qdyv https://doi.org/ . /bjet. https://doi.org/ . 
/j.compedu. . . https://doi.org/ . /j.learninstruc. . . https://doi.org/ . /j.learninstruc. . . https://doi.org/ . /bmj. . . https://doi.org/ . /bmj. . . https://doi.org/ . /sajip.v i . https://doi.org/ . /sajip.v i . https://doi.org/ . / https://doi.org/ . / https://doi.org/ . /sajim.v i . https://doi.org/ . / https://doi.org/ . / page of original research http://www.rw.org.za open access silverman, d., , doing qualitative research: a practical handbook, sage publications limited, london. sorgo, a., verckovnik, t. & kocijancic, s., , ‘information and communication technologies (ict) in biology teaching in slovenian secondary schools’, eurasia journal of mathematics, science and technology education , – . south african department of education, , white paper on e-education: transforming learning and teaching through information and communication technologies (icts), viewed july , from http://www.info.gov.za/ whitepapers/ /e-education.pdf south african department of education, , annual report / , viewed april , from http://www.education.gov.za/dynamic/dynamic.aspx?pageid = &catid= &category=reports&legtype=null taylor, c., gibbs, g.r. & lewins, a., , quality of qualitative analysis, viewed september , from http://onlineqda.hud.ac.uk/intro_qda/qualitative_ analysis.php tiba, c., condy, j. & junjera, n., , ‘re-examining factors influencing teachers’ adoption and use of technology as a pedagogical tool’, in south african international conference on educational technologies. proceedings, pretoria, april – , , pp. – . venkatesh, v. & davis, f.d., , ‘a theoretical extension of the technology acceptance model: four longitudinal field studies’, management science ( ), – . https://doi.org/ . /mnsc. . . . voogt, j., , ‘teacher factors associated with innovative curriculum goals and pedagogical practices: differences between extensive and non-extensive ict- using science teachers’, journal of computer assisted learning , – . https://doi.org/ . /j. - . . 
draft not yet approved by the european commission

d . : open science policies and resource provisioning in the nordic and baltic countries (second report)

author(s): per-olov hammargren (snic), päivi rauste (csc), maijastiina arvola (csc)
status: draft
version: .
date: / /
document identifier:
deliverable lead: snic
related work package: wp
author(s): anca heinola (fmi), terje vellemaa (etais/ut), troels rasmussen (deic/dtu), ilmars slaidins (rtu), birna guðrún gunnarsdóttir (uice), tomi rosti (uef), jarmo saarti (uef), linas bukauskas (vu), maria francesca iozzi (uninett sigma )
contributor(s):
due date: / /
actual submission date: / /
reviewed by: liisi lembinen, university of tartu, and pirjo kontkanen, university of helsinki
approved by:
dissemination level: public
website: https://www.eosc-nordic.eu/
call: h -infraeosc- -
project number:
start date of project: / /
duration: months
license: creative commons cc-by .
keywords:

table of contents
. purpose and scope of the deliverable
. methodology
. . focus areas
. open science policies
. . update on state of open science
. . policy implemented incentives for fair
denmark, estonia, finland, latvia, lithuania, norway, sweden
. . policies for os training / training for making data fair
denmark, estonia, finland, latvia, lithuania, norway, sweden
. . policies for making other research objects fair
denmark, estonia, finland, latvia, lithuania, norway, sweden
. . summary / analysis
. . . policy implemented incentives for fair
. . . policies for os training / training for making data fair
. . . policies for making other research objects, as software and methodology, fair
. resource provisioning policies
. . policy facilitation of cross border research
denmark, estonia, finland, latvia, lithuania, norway, sweden
. . summary / analysis
. roadmap with eosc perspective
. . roadmap related to open science / fair
. roadmap related to facilitating the cross-border research
list of abbreviations
literature list

executive summary
for this deliverable the authors have surveyed and described policies and similar documentation, such as written guidelines, relating to specific focus areas. the focus areas chosen are policy implemented incentives for fair, policies for open science (os) training / training for making data fair, policies for making other research objects fair, and policies facilitating cross border research. the countries in the scope of the deliverable are denmark, estonia, finland, norway, latvia, lithuania and sweden.

regarding policy implemented incentives for fair, the countries have taken different approaches. policy implemented incentives are either in the form of national policies or law. policies may be authored by different organisations, both organisations with national mandates, such as funders, and organisations with a subnational mandate, such as higher education institutions (hei:s). this also shows that different stakeholders are involved, ranging from ministries and funders to hei:s and libraries. policies for (os) training/training for making data fair, or policies that involve os training, are available in some countries.
in countries where neither policies nor training are available, there is awareness of the importance of policies for os training/training for making data fair, which is reflected in some draft policies. os training is provided by hei:s in a majority of the countries inventoried. the countries differ in that some set policy at a national level, while others set policies in addition to, or only at, the subnational level.

most countries inventoried do not have policies in place for making other research objects fair. a conclusion, as has also been highlighted by other reports (https://www.nature.com/articles/sdata), is that making other research objects fair is an area in need of focus in order to ensure transparency, reproducibility, and reusability of research.

one country inventoried has a policy that mentions facilitation of cross border research in the sense that national e-infrastructure is promoted to international users. the remainder of the inventoried countries have national policies regarding access to resources, which focus on giving researchers with a national affiliation access to services. a conclusion is that in the nordic and baltic region, facilitation of cross border research is not a focus in policies; policies rather have a national scope.

. purpose and scope of the deliverable
the aim of this deliverable is to provide a second assessment of the status of os policies and provisioning of infrastructure resources in the nordic and baltic countries, as well as a roadmap on how the nordic and baltic countries can align with eosc. this deliverable is a follow-up of the first deliverable in the series regarding open science and resource provisioning, d . open science policies and resource provisioning in the nordic and baltic countries.
in the first deliverable, submitted in february , the following observations were made: within the nordic and baltic region, maturity in regard to implementation of open science ranges from countries having laws in place governing the implementation of open science to countries being in the early stages of adopting national strategies and plans for the implementation of open science. the study showed that in countries where a national os policy has yet to be established, some hei:s and funders have established open access policies and, to a lesser extent, open data policies. the previous deliverable showed, in regard to resource provisioning and access policies, that national hpc facilities were not available in some baltic countries. access to resources was found to be solely for academic use. regarding resource provisioning policies, the previous study showed that principles throughout the nordics and baltics differ, ranging from access requiring technical and scientific review to access being granted on demand.

this deliverable builds on the d . deliverable, referred to above, and aims to add to the existing body of knowledge via inventories of focus areas related to os/fair and resource provisioning policies. this deliverable is also to consider eosc developments and deliver a roadmap in which the findings are put into relation to eosc. as such, it relates the findings to relevant eosc policies and recommendations. this may entail policies such as the eosc rules of participation (rop) which, as published guidelines, deal with both open science and resource provisioning within the frame of eosc. it may also entail outputs such as the eu's open science policy with the ambitions: the eosc is a trusted, virtual, federated environment that cuts across borders and scientific disciplines to store, share, process and reuse research digital objects (like publications, data, and software) that are findable, accessible, interoperable and reusable (fair).
eosc brings together institutional, national and european stakeholders, initiatives and infrastructures.

https://www.eosc-nordic.eu/kh-material/testimateriaali/
https://op.europa.eu/en/publication-detail/-/publication/a d - e- eb-b f- aa ed a /language-en/format-pdf/source-
https://ec.europa.eu/info/research-and-innovation/strategy/goals-research-and-innovation-policy/open-science_en#tracking-open-research-trends-open-science-monitor

. methodology
the eosc-nordic work package project group, consisting of participants from the organizations snic, csc, uef, fmi, sigma , ut/etais, deic, uice, rtu, and nordunet, conducted desktop studies during september - february to compile the materials that form this deliverable. the focus of this deliverable has been identified by the authors of this deliverable, stakeholders within the eosc-nordic project, and os stakeholders in the nordics and baltics, as well as from reports by eosc projects and requirements set out in the eosc-nordic project plan.

. . focus areas
in this deliverable inventories will be provided for specific focus areas. the focus areas chosen are:
● policy implemented incentives for fair
● policies for open science (os) training / training for making data fair
● policies for making other research objects (incl. software and methodology) fair
● policies facilitating cross border research

in relation to the adoption of fair, three areas were identified as suitable focus areas. the first focus area is policy implemented incentives for fair. the fair in practice task force of the eosc fair working group has set six recommendations for implementation of fair practice. one of the recommendations is to reward and recognise improvements of fair practice. following a dialogue in our working group and between work packages, it was decided that this deliverable would be suitable for a mapping of policy implemented incentives for fair.

the fairsfair project aims to supply practical solutions for the use of the fair data principles throughout the research data life cycle. within the project a series of practical recommendations for policy enhancement to support the realisation of a fair ecosystem has been prepared. one of the recommendations highlights the need for training: “researchers and data stewards should receive practical guidance on how to implement fair within different domains – particularly with regard to data description using appropriate metadata standards, data tags and ontologies. a commitment is needed from all stakeholders to support and meet training needs relating to open science.” as such, policies for os training / training for making data fair was chosen as a focus area for which this deliverable is able to add value in the form of a mapping.

fair for other research objects has been identified by the fair in practice task force of the eosc fair working group as an area for which there is a need for further studies. regarding fair for other research objects, the report six recommendations for implementation of fair practice asserts that, in order to ensure widespread benefits of the eosc, improvements in fair practices are necessary. one of the recommendations is to develop and monitor adequate policies for fair data and research objects. policies for making other research objects (incl. software and methodology) fair was therefore identified as a focus area that needs further studies.

lastly, regarding resource provisioning policies, the focus in this deliverable is on incentives for cross border collaborations. the aforementioned aspects, open science, including fair, and access to e-infrastructure, are at the core of eosc, as highlighted in the rop. as also mentioned in the eu's open science policy, eosc will enable researchers across disciplines and countries to store, curate and share data. the last focus area in this document was chosen from the need to identify gaps in policies for facilitating cross border research.

throughout this deliverable inventories for each focus area are provided per inventoried country, in which policies, guidelines etc. related to the focus areas are addressed.

https://ec.europa.eu/info/sites/info/files/research_and_innovation/ki enn.pdf
https://www.fairsfair.eu/the-project
https://zenodo.org/record/ #.ybwu-nmxu w
. open science policies
. . update on state of open science
as described in section . ., this deliverable mainly focuses on focus areas related to open science/fair policies and policies facilitating cross border research. as required by the deliverable description, the deliverable also provides an update to deliverable d . , which presents an overview of the state of open science policies in the nordic and baltic region. as such, this section provides an update and an overview of news related to the state of open science policies in the nordics and baltics. specifically, changes after february are provided, as february was the submission date for the previous report. updates are provided per country. participants were asked to describe for their countries any policies, guidelines etc. related to open science. documents in scope were records available via organisations relevant for the specific national contexts.

denmark
how to support fair data management was addressed by a data management task force set up through deic, which in november published its report on a national data management strategy.
estonia
currently estonia does not have an active open science policy; however, the ministry of education and science has been actively developing an open science framework for estonia (working document, in estonian). the document is in the final stage. the framework will be published as an appendix to the estonian r&d, innovation and entrepreneurship development plan for - . the national strategy “estonia ” (in estonian) clearly states the aim of developing open science in estonia: “improving the wider availability and use of research results (including the development of open science) and support the development of a knowledge-based in society, including citizen science.”

https://www.hm.ee/sites/default/files/eesti_avatud_teaduse_raamistik_ .pdf
https://www.hm.ee/sites/default/files/ _taie_arengukava_eelnou_lisad_ . . .pdf
https://www.riigikantselei.ee/sites/default/files/riigikantselei/strateegiaburoo/eesti /riigi_pikaajaline_arengustrateegia_eesti_ _eelnou_uldosa.pdf

finland
the working group appointed by the expert panel on open data has, in cooperation with the national open science and research steering group, drafted principles for the policy on open access to research materials and methods and a policy component on open access to research data. this draft is the first part of the national policy on open access to research materials and methods for the finnish higher education and research community, and it further defines the national objective for open research materials and methods mentioned in the finnish declaration of open science and research – . the policy will be supplemented with a policy component on open research methods in - .
norway
in december , the ministry of research and education published the new national strategy on access to and sharing of research data. the strategy aims at establishing the basic principles for the management and curation of publicly funded research data, and thereby at building the foundation for facilitating the reuse of data for the advancement of knowledge and for the benefit of society as a whole. the strategy stems from three basic principles, namely (i) research data must be as open as possible, as closed as necessary; (ii) research data should be managed and curated to take full advantage of their potential; (iii) decisions concerning archiving and management of research data must be taken within the research community. a change in the underlying culture, increased competence, data management plans, better technical infrastructure, improved national coordination among subject fields and sustainable funding models are the identified requirements in the process of establishing the above-mentioned principles as a national practice.

the strategy also highlights the need to facilitate the reuse of data from statistics norway in research and to identify concrete measures to facilitate the processes of accessing and consuming statistical data. similarly, the linking of public registry data and statistics norway data with health data is envisioned in connection with the health data program. the program started in under the directorate of e-health to improve the utilization of norwegian health data from health registries, population-based surveys and research biobanks.

in early the research council of norway finalised the open science policy after an extensive consultation round with several stakeholders.
the research council’s policy for open science is based on the principle that research and innovation processes are to be “as open as possible, as closed as necessary”. to embrace the various aspects of open science, the policy has three main objectives: i) to contribute to a well-functioning science system; ii) to contribute to sustainable societal development; iii) to strengthen the public trust in science. the policy clarifies the research council’s role in promoting open science through a list of measures. the measures are:
● knowledge about and competence in open science
● testing of open science and innovation in projects
● access to and reuse of scientific results
● data infrastructures for handling and making research data accessible
● career development and research assessment
● transparency in research funding processes
● responsible research and innovation
● openness in innovation processes
● rights to research results
● research as the premise for societal development
● involvement of users and the general public in research and innovation processes through user participation and citizen science.

https://avointiede.fi/en/news/draft-policy-component-open-access-research-data-now-open-comments
https://avointiede.fi/fi/julistus
https://www.regjeringen.no/en/dokumenter/national-strategy-on-access-to-and-sharing-of-research-data/id /
https://www.forskningsradet.no/en/adviser-research-policy/open-science/policy-for-open-science/
research institutes and universities are assigned to build the knowledge and define policies and procedures for data management (including the adoption of data management plans as an integral part of project management). the definition of such policies and their implementation in the different institutions have therefore progressed at different paces. the uit arctic university of norway was the first to publish both a policy for open access and a policy for research data management, covering two essential parts of open science. the university of oslo (uio) published an open science policy in , which was later updated in may . the university of oslo’s policy aims at making research data openly available, but exceptions can be made for data that cannot or should not be made available. the norwegian university of science and technology (ntnu) has published policies on open science and guidelines based on open access to data on for an horizon - , while the university of bergen has recently updated its open science policy (sept. ). policies for open access can also be found at oslo metropolitan university.
https://intranett.uit.no/content/ /principles% and% guidelines% for% research% management% at% uit_ .pdf
https://www.uio.no/english/for-employees/support/research/research-data-management/policies-and-guidelines/index.html
https://innsida.ntnu.no/c/wiki/get_page_attachment?p_l_id= &nodeid= &title=ntnu+open+data&filename=ntnu% open% data_policy.pdf
https://www.uib.no/en/foremployees/ /university-bergen-policy-open-science
https://ansatt.oslomet.no/en/open-access-policy

latvia
the ministry of education and science of latvia (izm) initiated the development of an open science roadmap for latvia, and an activity has reviewed the current state and elaborated recommendations for development. the resulting document (in latvian) was published on june th, and is now used in the ongoing development of a policy document, the latvian national open science strategy. it is planned that open science development in latvia will be based on pillars: open access, fair data and citizen science. the strategy is likely to be approved in q , .

lithuania
during lithuania's legislation system regarding open science or open access publishing did not change.
sweden
during the national library (kb) made available, in english, an inquiry into the need for financial and technical support to enable swedish scholarly journals to transition to open access. furthermore, kb was tasked in by the swedish government with establishing a national platform for swedish scholarly open access journals. kb also compiled and published information on the expenditures of swedish hei:s on scientific publications and on making scientific publications available via open access during . additionally, open access is a focus area in the swedish research bill, as the government mandated that scientific publications financed by public funds be open access starting in . further, the swedish research council (vr) has launched a national data management plan tool. lastly, the mandates of kb and vr are to be further developed in order to accelerate the conversion to an open science system.

https://www.izm.gov.lv/sites/izm/files/petijums-atverta_zinatne_ _ .pdf
https://www.kb.se/samverkan-och-utveckling/nytt-fran-kb/nyheter-samverkan-och-utveckling/ - - -financial-and-technical-support-for-open-access-scholarly-journals.html
https://www.kb.se/samverkan-och-utveckling/nytt-fran-kb/nyheter-samverkan-och-utveckling/ - - -totala-utgifter-for-vetenskaplig-publicering- .html
https://www.vr.se/aktuellt/nyheter/nyhetsarkiv/ - - -digitalt-verktyg-for-datahanteringsplaner-nu-tillgangligt.html
https://www.regeringen.se/ af /contentassets/da af a b dadcfb d /forskning-frihet-framtid--kunskap-och-innovation-for-sverige.pdf
. . policy implemented incentives for fair
the following inventory is provided per country. participating representatives were asked to describe any policy or similar document, such as written guidelines and management policies, relating to policy implemented incentives for fair. documents in scope were documents or similar material available via the organisations that are relevant for the specific national context.

denmark
as mentioned in the previous report, fair is an integral part of the danish national strategy for collaboration on e-infrastructure from . two task forces have been set up in accordance with the overall strategy: one on scientific merit and the other on fair data management. the task force on scientific merit delivered a report in . its task was to recommend how scientists could advance their careers through more than peer review and publications, so that other valuable aspects of science, among them specifically the sharing of data, are not under-prioritized. “lack of transparency and access to data creates structural problems in science, amongst others with reproducibility, testing, reusability and use of data”. however, the report does not mention fair specifically, nor does it address the issues it acknowledged.
the recommendation given is that, in order to support and strengthen strong research environments, university management shall to a higher degree give merit to and acknowledge the breadth of significant input to research results, for example development of datasets, programming, modelling and knowledge sharing. how to support fair data management was addressed by the data management task force, which in november published its report on a national data management strategy. the key principles driving the national fair data management policy supported by the universities and deic are:
- data management shall support fair principles and reuse of data
- at a minimum, datasets must be identifiable through pid and metadata
- valuable datasets must be made openly available unless specific reasons not to do so exist. metadata must always be made openly available
- the relevance and weighting of individual fair principles vary and must be defined within the different fields according to international standards
- it must be possible to preserve all kinds of research objects and file formats in the short and long term
- data deemed to have future value must be stored and made accessible in a technically and organizationally secure data infrastructure
- particularly valuable datasets must be identified and managed with long term preservation in mind
- in the event that available datasets are deleted, the pid shall continue to be available along with metadata on the data and their deletion
- data life cycle data management: methods and tools shall be available and ensure that data and documentation are recorded and stored continuously
- licensing conditions shall be available in either human- or machine-readable formats
- relevant ri and tools must be offered to all scientists independent of research field and institutional affiliation
- data storage and access to data must be open to all scientists
- data in relevant repositories shall fulfil research institutions' obligations towards the national archive
- technical tools shall enable data access in open and standardised formats and shall not be hindered by licensed software
- needed skills and competences shall be available regardless of field or institutional affiliation
- scientists need not be data management experts, but will be supported in the data management work
- knowledge on data management and fair shall be accessible for all scientists at all research performing institutions

estonia
estonia has not yet implemented a national open access/open science policy. however, in , an open science expert group was initiated by the estonian research council to support the drafting of a national open science policy by compiling principles and recommendations for the development of national open science policy ( ). since the end of , the ministry of education and research in estonia has been developing a roadmap for an open science policy framework, which is expected to result in an official policy in . the framework will be published as an appendix to the estonian r&d, innovation and entrepreneurship development plan for - . at the same time, an update of the organisation of research and development act is underway, and open science is part of it. it is expected that in a few years this policy will result in the establishment of an estonian open science competence centre as a central support system for open science implementation in estonia.
the working version of the os framework requires that all publicly funded research data be available based on fair principles. in addition, all research-conducting institutions need to publish their data based on fair principles. the estonian code of conduct for research integrity agreement, signed in by estonian research universities and research institutions, requires researchers to make sure that their research data can be found and used as easily as possible following fair principles.

https://www.etag.ee/wp-content/uploads/ / /open-science-in-estonia-principles-and-recommendations-final.pdf

finland
most higher education institutions have guidelines for considering fair principles when opening research data. the finnish national coordination of open science has published a guide to scientific publication channels for responsible material and data policy development, recalling the fair principles. when higher education institutions and research organisations update their open science policies, the fair principles are usually included in the policy at that time. the data service providers (e.g. csc, fsd) mention the fair principles in their policies. most funders, for example the academy of finland, mention fair principles as part of their guidelines for data management plans. the common digital vision of higher education institutes includes fair principles and the needs of open science as one of its drivers. the national fairdata services provided by csc apply fair principles. the ministry of education and culture is committed to fair principles and incorporates them into its own policies and guidance documents. all higher education institutions have guidance for data management planning and generally follow fair principles.
under the finnish national coordination of open science, a working group is working on incentives for open science. this work began in spring and is still ongoing. there are subgroups for publishing, learning, and research data and methods. incentives are also considered from the perspectives of both the teacher and the researcher. there is a working group on career models and merit-earning methods in the general collective agreement for universities, but the work is just beginning. the finnish advisory board on research integrity has updated its template for a researcher’s curriculum vitae, which now allows the researcher to highlight the promotion of open science as part of the scientific and societal impact of research. in finland, incentives are not set directly at the policy level; instead, the intent is recorded at a more general level. responsibility for the national coordination of open science has been given to the federation of finnish learned societies, which is responsible for national policy work, and these policy documents recommend the implementation of incentives. as funders of higher education institutions and research institutes, various ministries draw up performance agreements with organisations in which the promotion of open science can be addressed at a more general level. the funding model for higher education institutions has incentives mainly for publications, and organisations themselves have the autonomy to adjust their own incentives for open science fields.

https://www.eosc-nordic.eu/kh-material/testimateriaali/
https://avointiede.fi/en/policies/declaration-open-science-and-research- -
https://www.csc.fi/en/data-policy
https://www.fsd.tuni.fi/en/data-archive/documents/fsds-data-management-policy/
https://www.aka.fi/en/research-funding/apply-for-funding/how-to-apply-for-funding/az-index-of-application-guidelines /data-management-plan/data-management-plan/
https://minedu.fi/en/vision-
https://www.fairdata.fi/en/
https://minedu.fi/en/science-and-research
https://avointiede.fi/en/news/draft-policy-component-open-access-research-data-now-open-comments
https://avointiede.fi/en/policies/policies-open-science-and-research-finland
https://tenk.fi/en/advice-and-materials/template-researchers-curriculum-vitae
https://tsv.fi/en

latvia
fair will be an integral part of the national open science strategy, covering rdm practices and incentives, research infrastructures, skills and monitoring. the draft version of the open science roadmap document recognizes that coordinated action in introducing data repositories based on fair principles is one of the priorities. it is planned to build an appropriate e-infrastructure with data management platforms (dmp) and to create a unified service centre to support research institutions and individual researchers.
the latvian council of science will implement standardized research data management procedures for state funded research. to create incentives for researchers and institutions, it is planned to support these actions with appropriate financing, with emphasis on state-of-the-art research projects and international cooperation. the draft version of the latvian national open science strategy requires that all publicly funded research data be available in open access databases based on fair principles.

https://www.izm.gov.lv/sites/izm/files/petijums-atverta_zinatne_ _ .pdf

lithuania
the research council of lithuania is the main institution coordinating open access activities in lithuania. this decision was made on the submission of the ministry of education and science, in response to a request by the secretariat of the lithuanian national commission for unesco to appoint an institution responsible for open access to research information. in , the research council of lithuania became a partner of the international project 'open access policy alignment strategies for european union researchers'. in lithuania, public and private institutions are required by law to make any research results public. article no. of the republic of lithuania law on science and studies incentivizes making the results (deliverables) produced with public funds from the state budget open access to the general public, insofar as this is consistent with the laws regulating intellectual property rights and the protection of commercial, state and public service secrets.
norway
the board for the digitalization of higher education and research, consisting of members from the higher education and research sector, approved in may the action plan for the sector, which includes the target image and several initiatives related to fair research data. the targets are that: i) researchers make results (publications, data and more) easily retrievable and, as far as possible, available for reuse; ii) researchers have access to a clear and user-friendly service offering that supports both academic and administrative tasks. the targets shall be achieved by:
● making available support services that simplify all steps in the research process and ensure that registered information can be reused.
● making available services to support collaboration: researchers must be able to collaborate easily and seamlessly with colleagues nationally and internationally, and across disciplines.
● making research data available: research data from norwegian research institutions must follow the fair principles (findable, accessible, interoperable and reusable).
● making available common services for storage and calculations: researchers shall have access to generic storage and computing services.
unit, the norwegian directorate for ict and joint services in higher education & research, is responsible for implementing the action plan by launching a concept phase project (spring ) for: i) mapping the existing research infrastructures and services and defining guidelines for a service to be considered a national service; ii) mapping the existing services for long-term storage of research data and studying the need to supplement them. in january , unit, in collaboration with the higher education institutions, presented a proposal for a new digitization strategy to the ministry of education.

https://www.unit.no/en/node/
https://www.unit.no/en/about-unit
the document includes open access among the proposed topics. uit the arctic university of norway was the first norwegian university to establish an institutional policy for research data management. at that time, the fair principles had not yet gained their current status as a de facto standard. nevertheless, the uit policy includes several fair-related incentives. among other things, the policy mandates researchers at uit to write a data management plan (dmp) and to share their data in a trustworthy repository as early as possible. in return, uit commits to providing the necessary support services to enable researchers to comply with the policy, including an institutional archive based on dataverse. the university of oslo’s rectorate recently approved updated policies and guidelines for research data management, this time including principles related not only to fair but also to care. the guidelines of the policy demand that research data be made available for re-use at the earliest stage, that research projects have a dmp which also describes the metadata standard adopted, and that research data be deposited in an archive and made discoverable and reusable through a proper license. uio also launched a project in october to explore technical solutions for supporting fair metadata connected with storage. the norwegian university of science and technology (ntnu) demands a dmp for every research project. the university of bergen supports fair by offering an institutional research data archive based on dataverse. the nordi project, owned by nsd (the norwegian centre for research data), was originally established to modernize nsd’s archive data services and has recently been restructured to adjust its scope to the national directive as well as the european union report “turning fair into reality”.
uninett sigma , the national provider of e-infrastructure for massive research data and hpc, has, in alignment with the research council’s and the universities’ policies, required since that a data management plan be submitted with every application for storage resources. it has also promoted the publication of data in sigma ’s open research data archive by not charging for the storage used, regardless of the size of the dataset. this has brought an increase of % in the archive in the period - .

https://www.rd-alliance.org/sites/default/files/care% principles% for% indigenous% data% governance_final_sept% % .pdf
https://www.nsd.no/en/about-nsd-norwegian-centre-for-research-data/projects/the-nordi-project
https://www.sigma .no
https://www.regjeringen.no/en/dokumenter/national-strategy-on-access-to-and-sharing-of-research-data/id /

sweden
the swedish national guidelines for the european research area contain measures aimed at strengthening the priorities agreed on at eu level. the guidelines offer an overview of the responsibilities of actors in the research and innovation system. this includes incentives for fair. the research bill of provided national direction for open science. this included that research products should, to the greatest extent possible, make use of and fulfil the fair principles. starting in , publicly funded research publications are to be available via open access. additionally, vr, as a funder, demands dmp:s for research projects receiving funding, starting in .
furthermore, policy implemented incentives for fair are found in the vr report kriterier för fair forskningsdata, published in , which sets out criteria for reviewing the fair compliance of research data funded by public funds, and in the kb criteria, published in , for reviewing the fair compliance of publications funded by public funds. the swedish research bill states that research data, through eosc, shall be coordinated to meet the fair criteria. a focus area of the vr coordination task for open science is national coordination regarding data management plans, which is seen as a key component in realizing the fair principles.

. . policies for os training / training for making data fair
this section provides an inventory of policies for os training and training for making data fair. the inventory is provided per country. participating representatives were asked to describe, for their countries, any policy or similar document, such as written guidelines and management policies, relating to os training or the training of researchers for making data fair. documents in scope were documents or similar material available via the organizations relevant for producing them in the specific national context, such as funders, government agencies, ministries, research societies and hei:s, as well as national organizations providing application support, such as e-infrastructures.

denmark
in denmark, os training is focused mainly on building a support structure for scientists.
more and more emphasis is put on a new profession, “data stewards”: data specialists with integral knowledge of data management issues, methods and technical tools, who can facilitate the data management processes and make data fair. the basic idea is that these persons are available to researchers and may join research projects as lab technicians, software engineers and librarians would. the scope is wide, from a very generic advisory role to field-specific support at the highest academic level, fully integrated into research.

https://www.regeringen.se/ a ba /contentassets/ ced a ca cb e e /nationell-fardplan-for-det-europeiska-forskningsomradet- .pdf
https://www.vr.se/uppdrag/oppen-vetenskap/oppen-tillgang-till-forskningsdata.html
https://www.vr.se/analys/rapporter/vara-rapporter/ - - -kriterier-for-fair-forskningsdata.html
https://www.kb.se/samverkan-och-utveckling/oppen-tillgang-och-bibsamkonsortiet/oppen-tillgang/fair.html
https://www.regeringen.se/ af /contentassets/da af a b dadcfb d /forskning-frihet-framtid--kunskap-och-innovation-for-sverige.pdf
the new dm strategy of november stipulates that the universities and deic are responsible for ensuring that the competences needed for fair dm are available during the research process. together they must ensure that the area is developed and coordinated nationally and internationally and that fair data stewardship competencies are developed, including through targeted training of it staff and librarians, bachelor's and master's programmes for data stewards, and specific training modules at ph.d. level focusing on generic and field-specific competences. deic is responsible for ensuring critical mass and for coordinating and facilitating the sharing of knowledge across institutions and fields; it will set up a national data stewardship competence centre, and each institution will set up a local data stewardship front office.

estonia
in estonia, the university of tartu (ut) offers os training. ut offers the following courses for master's and phd students: research data management; and research data management and publishing, which includes rdm and fair data sections. topics of open science and research data management are included in the doctoral elective course research integrity: framework requirements, values and principles of action. ut also offers various data science courses for students, for example introduction and methods of data science, and data mining. in addition, through the datacite consortium, training is offered in the largest estonian research universities. training and information materials on research data management and open science have also been created by other universities, such as taltech, tallinn university and the estonian university of life sciences. the ut library has close contacts with the estonian research council regarding data management plans and the evaluation of dmps.

finland
the policy work is managed at the national level by the federation of finnish learned societies.
on the one hand it provides strategy and programme work, and on the other hand it coordinates the efforts of different actors in their daily work. the federation of finnish learned societies also hosts the working groups for special areas, including one on open and fair data. at present, a “national policy on open access to research materials and methods: policy component - open access to research data” (the draft version from is in use here) is being prepared for the finnish research community, the aim being that “research materials and methods are as open as possible and as closed as necessary. materials are managed appropriately in order to achieve the fair principles. research methods and materials, including research data, are recognised as independent research outputs.” it sets the following principles for opening research:
1. research materials are opened only responsibly
2. researchers have access to data management infrastructures and services, and they are developed in a research-driven manner
3. the researchers’ work in good data management practices and opening of research materials is valued in researcher merit criteria
the draft document makes recommendations for the training to be organized, including the time frame in which it should be in place, starting with "by , research organizations provide guidelines, practices and training in data management planning for students, researchers and staff." the schedule for the recommendations will be reviewed again during spring .

https://sisu.ut.ee/andmehaldus/home- ?lang=en
http://htk.tlu.ee/xerte /play.php?template_id=
http://library.emu.ee/en/research/research-data-management/
https://tsv.fi/en
the service providers, especially csc and the finnish social science data archive, provide help and instruction for the use of services that support fair principles and good data management, for example the fairdata services and the data service portal aila. the main education and dissemination efforts for fair data are conducted at the level of individual higher education institutions, especially the universities. there, the libraries have been major actors in providing assistance and education for making data fair (e.g. university of turku and university of eastern finland). at present, the finnish model consists of publicly funded, centralized services; assistance and user education are provided both by the service providers and locally in different types of academic and research institutions, and there is ongoing cooperation between these actors.

https://avointiede.fi/en/policies/policies-open-science-and-research-finland
https://avointiede.fi/en/news/draft-policy-component-open-access-research-data-now-open-comments
https://www.fairdata.fi/en/
https://www.fsd.tuni.fi/en/data/#data-service-portal-aila
https://utuguides.fi/researchdata
https://libguides.oulu.fi/researchdata/fair_principle
https://www.uef.fi/en/research-data-management

latvia
the latvian national open science strategy will require that publications from state-financed research projects be published in green or gold open access, and researchers will be trained on how to use open access platforms and tools. researchers are already allowed to include article processing charges in research project expenses, and other incentives for researchers will be developed. it is planned to support the development of research data management skills at the expert level as well as for “long tail” scientists. the international open access week workshop "taming the research data wisely: a short introduction of how-to-do" was organized by the national open access desk of latvia on october , . it is planned to develop a training programme for data stewards. the latvian national library, in collaboration with universities, is planning to develop a training course on data management for librarians and researchers. additional resources will be planned for upgrading existing data repositories and research information systems to be compliant with fair principles.

https://www.openaire.eu/blogs/workshops-on-research-data-management-gained-high-interest-among-different-stakeholders-in-latvia

lithuania
the working group for open access to research data and the research group under the lithuanian national commission for unesco distributed their 'request to the lithuanian science and higher education institutions on open access to research data', in which they noted attempts to form an open access policy, to prepare provisions and recommendations for open access to university research papers and results, to organise seminars for the science community and to disseminate information on open access and its benefits to society. as part of the body of research data, research publications are tracked by the national digital databases.
the corresponding maintainers of national systems organize seminars and even public challenges aimed at scientists, so that they can learn the benefits of open access to scientific data and the flow of information.
norway
training offerings for fair data in norway are organised at institutional level, and at the time of this writing there is no national coordination of training on fair data. the uit research data policy mandates uit to provide guidance and support in the development of dmps, using basic methods and services for processing, storing, archiving and publishing research data. support is also given for defining licenses for access, reuse, and dissemination of the data, and for establishing third party agreements and contracts if necessary. the uio has established a portfolio of research data management courses and training aimed at different groups, organised by the university library, the university center for information technology and the software carpentry community. recently the uio's digital scholarship center (dsc) was established to help researchers take advantage of digital tools and methods in their research and to assist them in navigating the university's complex digital ecosystem. the uib regularly offers courses on making dmps, while nsd offers support for dmps through a chat channel. uninett sigma provides support for creating dmps and for customising dmp templates to institutional or community-specific data management policies in easydmp.
https://www.ub.uio.no/english/writing-publishing/dsc/carpentry-uio/index.html
https://www.ub.uio.no/english/writing-publishing/dsc/index.html
https://www.uib.no/en/ub/ /introduction-data-management-plan-dmp
https://www.nsd.no/en/create-a-data-management-plan
https://easydmp.sigma .no
sweden
training for and policies related to os and making data fair are touched on in the vr report on criteria for fair research data. the report concludes that previous reports, authored by non-swedish actors, highlight the need for training. furthermore, the swedish national data service (snd) provides overviews of the fair principles related to data. additionally, the snd data access units (dau), distributed across the swedish hei:s in the snd consortium, provide some local support via staff, both domain experts and generalists. the swedish research bill also states that hei libraries play a significant part in facilitating the transition towards open science by providing support and service. lastly, individual swedish hei:s may offer training via internal organizations that cater both to the requirements of the specific hei and to national policies and organisational requirements, such as those of snd, relating to support for fair.
an example of hei policies and training is the uppsala university (uu) internal organization for fair, “fairdriktning – hantering av forskningsdata”, which aims to serve as a uu-wide function: a data office that supports researchers' need for knowledge about handling research data, coordinates offers of resources, serves as a system for managing research data, and conveys knowledge about the demands that various stakeholders have placed on uu regarding research data.
. . policies for making other research objects fair
fair for other research objects has been identified by the fair in practice task force of the european open science cloud fair working group as an area in need of further study. as such, an inventory of policies for fair for other research objects has been performed. the inventory is provided per inventoried country. participating representatives were asked to describe, for their countries, any policy or similar document, such as written guidelines or management policies, relating to making other research objects, such as software and methodology, fair. documents in scope were those available via the organisations relevant for producing such documents in the specific national context, such as funders, government agencies, ministries, research societies, and heis, but also national organisations providing application support, such as e-infrastructures.
https://www.vr.se/download/ .ad e b efab cdc/ /kriterier-fair-forskningsdata_vr_ .pdf
https://dhb.snd.gu.se/wiki/fair-principerna
https://snd.gu.se/en/node/
https://www.regeringen.se/ af /contentassets/da af a b dadcfb d /forskning-frihet-framtid--kunskap-och-innovation-for-sverige.pdf
https://mp.uu.se/documents/ / /projektdirektiv+ufv+ - +fairdrikning+forskningsdata.pdf/ e -b f- d e- - e d f c
https://ec.europa.eu/info/sites/info/files/research_and_innovation/ki enn.pdf
denmark
the danish data management strategy clearly states that methods and tools should be available and must enable proper data and documentation to be collected and stored continuously. licensing for reuse must be available as part of the metadata in either human- or machine-readable format. technical platforms must make it possible to retrieve data in an open and standardised format, and retrieval may not be hindered through the use of licensed software.
estonia
software and code have not been officially regulated in estonia. in general, researchers involved in developing software or code are required to use digital object licensing. universities are in the process of developing regulations on both commercial and open usage of software and code.
the copyright act applies when it comes to the rights to software and other digital objects.
finland
in june , the finnish government published its programme titled “a participatory and knowledgeable finland – a socially, economically and ecologically sustainable society”. the country strives to achieve the best public administration in the world from a democratic as well as an information management policy perspective. the programme is to enhance the use of open-source software in public information systems and highlights open source, open data, and open interfaces. the use of open-source software is promoted as a priority by public administrations. this programme applies largely to all public administrations, including publicly funded research institutes attached to ministries in seven administrative sectors, but not to universities, which fall into different categories. however, the open-source software recommendations within the programme do not refer specifically to research, nor do they mention any obligation to be fair. although the majority of higher education institutions, research institutes and funding agencies acknowledge the importance of open-source software as part of open science and provide information and even recommendations for using open-source software and making their own software open source, very few provide a policy for that (fmi, csc). it is clear that the adoption of policies and fair practices for software and other research objects in finland lags behind open data and open access; still, the number of software repositories and their content suggest that open software is common practice in the finnish research community.
moreover, the fact that the software is associated with metadata and persistent identifiers shows that this kind of research output is findable, accessible, and reusable. in conclusion, the practice of opening software and methods is broadly established in the research culture but not regulated.
http://julkaisut.valtioneuvosto.fi/bitstream/handle/ / /inclusive% and% competent% finland_ _web.pdf?sequence= &isallowed=y
https://en.ilmatieteenlaitos.fi/open-source-code
https://www.csc.fi/en/open-source-policy
latvia
latvia's response to these challenges is not elaborated in detail, as they are currently not in focus. open science and open access principles will be based mainly on the oecd principles and guidelines for access to research data from public funding.
lithuania
software and digital libraries are a part of research data and are required to be published according to the open access strategy. however, no specific policies are mentioned for computer code, as it is partially regulated by the laws on intellectual property.
norway
at the time of this writing there isn’t a national policy or guideline for making digital objects fair, although this might be one of the potential outcomes of the implementation of the action plan approved by the digitalization board (see above). mature communities, for example nordatanet within the climate community, already have de facto best practices for sharing and reusing digital objects. in , the norwegian research council financed biomeddata, a project focused on fair management and sharing of molecular life science data that includes collaboration with national data-generating infrastructures, coordinated by elixir norway and the university of bergen through its computational biology unit.
software is included in research data as broadly defined in the uit research data policy; cf. section : “research data is defined as all registrations, notes, and reporting which are produced or arise in the course of research, and which are regarded as being of scientific interest and/or scientific potential. the format of these may include, but is not limited to, numbers, text, source code, photographs, films, and sound.” as part of the coderefinery project, uit offers training for its students and employees to advance the fairness of software management and development practices. the uio is also disseminating and promoting approaches for the fairization of digital objects by being part of coderefinery and by hosting and actively supporting the carpentry community.
https://www.oecd.org/sti/inno/ .pdf
https://www.nordatanet.no/nb/taxonomy/term/
https://www.cbu.uib.no/new-funding-to-elixir-norway-on-fair-data-management/
https://coderefinery.org/
https://www.ub.uio.no/english/writing-publishing/dsc/carpentry-uio/index.html
https://www.regeringen.se/ a ba /contentassets/ ced a ca cb e e /nationell-fardplan-for-det-europeiska-forskningsomradet- .pdf
sweden
the swedish research bill of stated that research products shall, to the greatest extent possible, make use of and fulfil the fair principles. while not explicitly pointing to software, the scope of the policy document “nationell färdplan för det europeiska
forskningsområdet – ” broadly addresses research products, which include, among other aspects, source code and the research process. additionally, investigations performed by rda-sweden/snd show that the national initiatives in sweden are based on a bottom-up perspective and are initiated by individual groups. lastly, the swedish research bill of states that the shift to an open research system is a major undertaking which entails making as many of the research process phases and tools as possible openly available via the internet. open science as a concept is reasoned to include, among other aspects, source code.
https://www.kb.se/samverkan-och-utveckling/oppen-tillgang-och-bibsamkonsortiet/oppen-tillgang/oppen-tillgang-i-eu.html
https://www.regeringen.se/ a ba /contentassets/ ced a ca cb e e /nationell-fardplan-for-det-europeiska-forskningsomradet- .pdf
https://snd.gu.se/sites/default/files/ - /rda% sweden% webinar% - - .pdf
https://www.regeringen.se/ af /contentassets/da af a b dadcfb d /forskning-frihet-framtid--kunskap-och-innovation-for-sverige.pdf
. . summary /analysis
this section summarizes chapter and describes similarities and differences for the countries inventoried.
table : summary /analysis
incentives for fair | os training | software
yes no draft | yes no draft | yes no draft
denmark: n a n n
estonia: n a n n
finland: n a n n
latvia: n n n
lithuania: n a n n
norway: n a n n
sweden: n a/m n
note: n = national level policy; a = university/institution/service provider level activity; m = mentioned in policy/report at national level
. . . policy implemented incentives for fair
as summarized in table , a similarity between the countries inventoried is that there are national policies, draft policies, or laws which can be considered policy implemented incentives for fair in denmark, estonia, finland, lithuania, and sweden. a difference is that the countries inventoried have taken different approaches. law is used as a regulatory tool in lithuania, while policy is used as a regulatory tool in finland, denmark and sweden. policies may be authored by different organisations, both those with national mandates, such as ministries and funders, and organisations with a subnational mandate, such as heis. this also illustrates that different stakeholders are involved in the countries inventoried, ranging from ministries and funders to heis and libraries. related to eosc, the eosc fair working group has stated that a balance of penalties and rewards is needed for optimum impact in implementing fair practices. policy requirements, and the consequence of not being able to get funding without complying, can be seen as penalties and should not be the only motivation to implement fair. rewards for data sharing should also be in place: “reward and recognise improvements of fair practice” [recommendation in six recommendations for implementation of fair practice].
https://op.europa.eu/en/publication-detail/-/publication/ fa - - eb- a - aa ed a /language-en/format-pdf/source-
it is expected that the academic reward is in balance with the effort made in sharing the data.
. . . policies for os training / training for making data fair
as summarized in table , a similarity is that policies for os training, or policies that involve os training, are very few at national level. in sweden os training is mentioned at national level, and in finland and latvia there is awareness of the importance of os training, and national drafts include provisions for it. a further similarity is that training is provided by heis in a majority of the countries inventoried, and by e-infrastructures in some countries. in many countries there are policies at the subnational level. related to eosc, based on an analysis of the data policy landscape in , fairsfair has prepared a series of practical recommendations for policy enhancement to support the realisation of a fair ecosystem. the first recommendation is to “provide practical guidance to researchers and data stewards on how to implement fair within different domains” and continues: “commitments are needed from all stakeholders to support and meet training needs relating to open science”.
. . . policies for making other research objects, as software and methodology, fair
as summarized in table , a similarity is that a majority of the countries inventoried do not have policies in place for making other research objects, such as software or methodologies, fair. a conclusion for this specific item, as has also been highlighted by the fairsfair synchronisation force, is that making other research objects fair is an area in need of development in the nordic and baltic region. related to eosc, in the original fair principles paper “the fair guiding principles for scientific data management and stewardship”, the authors state that the fair principles apply not only to ‘data’ in the conventional sense, but also to the algorithms, tools, and workflows that led to that data.
all scholarly digital research objects benefit from application of these principles, since all components of the research process must be available to ensure transparency, reproducibility, and reusability. as reported in the second report of the fairsfair synchronisation force, “the way in which fair is applied to software, and the development of any related guidelines and metrics, needs further work and clear recommendations.”
https://op.europa.eu/en/publication-detail/-/publication/ fa - - eb- a - aa ed a /language-en/format-pdf/source-
https://zenodo.org/record/ #.yb pn mxu w
https://zenodo.org/record/ #.ybwu-nmxu w
https://doi.org/ . /zenodo.
https://www.nature.com/articles/sdata
https://doi.org/ . /zenodo.
recommendation number , in six recommendations for implementation of fair practice by the eosc fair working group, is stated as “translate fair guidelines for other digital objects.” according to the document, it is clear that adoption of fair practice for other research objects lags behind research data, and that both funder and publisher mandates will have a key role in improving fair practice. the group has identified that an important measure in this regard would be a requirement to share code as a prerequisite for publication. the eosc rop makes the case that the eosc fair principles shall apply to “individual research objects, such as datasets, publications, software, etc., or repositories of research objects”; as such, both having policies in place and providing training will be important in facilitating this aim.
. resource provisioning policies
. . policy facilitation of cross border research
the previous deliverable, d .
, provided an overview of the state of resource provisioning policies, focusing on high performance computing (hpc), in the nordic and baltic region. the group of writers for this deliverable includes stakeholders in the form of personnel from e-infrastructure providers and personnel working with resource provisioning policies in the nordic and baltic countries. these authors have chosen to focus on facilitation of cross-border research, such as resource provisioning policies. this is because facilitating the aims of eosc, as defined in the eosc rules of participation, entails facilitating cross-border research. the inventory is provided per country. participating representatives were asked to describe, for their countries, any policy or similar document, such as written guidelines or management policies, relating to facilitation of cross-border research. documents in scope were those available via the organisations relevant for producing such documents in the specific national context, such as funders, government agencies, ministries, research societies, and e-infrastructure providers.
denmark
access to nationally funded ri is in general terms open to scientists who are employed by danish research institutions, including foreign nationals. this is a grant requirement for any project funded through the danish research infrastructure pot under the ministry of science and education. most grants are dependent on substantial co-funding from the consortium that applies for funding.
https://op.europa.eu/en/publication-detail/-/publication/ fa - - eb- a - aa ed a /language-en/format-pdf/source-
eosc rules of participation - publications office of the eu (europa.eu): https://op.europa.eu/en/publication-detail/-/publication/a d - e- eb-b f- aa ed a /language-en/format-pdf/source-
no single institution can apply. in order to be defined as national ri, at least two and preferably more institutions must be part of the consortium. on the other hand, no researcher from a danish institution can be denied access to national ri, regardless of whether or not that researcher's institution is part of the consortium. however, in many cases, access to resources for researchers outside the consortium may be subject to direct pay-per-use cost recovery schemes, or to limitations in terms of access to consumable services. there is no one-size-fits-all model for cost recovery. the actual business model applied depends a lot on the nature of the ri, the width of the user base and the financial burdens and risks incurred by the host institution(s). in many cases the consortium is motivated to recover costs by offering services to scientists outside the consortium, and in some cases also by offering services to industry. when that is the case, the ri tends not to discriminate between danish and foreign users. this also tends to be the case when a single university gets a grant from a private funder to install and operate large scale ri. usually there will be mechanisms in place to ensure that not only users from the host institution can get access.
on the other hand, if a consortium exists that includes all danish universities, access for users not affiliated with a danish institution tends to be much more restricted. an all-encompassing consortium tends to be quite specific on who gets access, opting in most cases to allow users from consortium members free access limited only by the ri’s capacity. since all consortium members wish to maximize the return on their investment, any outside access may be considered a loss of resources, particularly if resources are split into parts equivalent to each institution’s share of the cost. this very tight resource management tends to be more common with horizontal it services than with field-specific ri. with field-specific ri, in a small country such as denmark, the key stakeholders tend to know and trust one another and are able to carve out a reasonable consortium agreement with flexible access arrangements. with generic e-infrastructure this trust does not exist to the same degree, and the concern seems to be rather that a host institution will be in a position to increase its skills and competences, or, with database infrastructure, to provide access to interesting collaborators, and thereby attract outside users for scientific collaboration at the expense of the other consortium members. in those cases, the consortium agreement is very specific about access and service protocols and typically specifies that only danish scientists can be granted access. in most cases this does not prohibit an international project from getting access to resources, but only through a danish pi, in order to create, to the extent possible, a level playing field between the institutions with regard to cross-border cooperation.
estonia
according to the development plan of the estonian research council (etag), its mission is to support international research cooperation.
etag advises and trains researchers, entrepreneurs and other interested parties, and creates opportunities for participating in international research cooperation. in cooperation with ministries, etag selects the partnership programmes and research infrastructures that support estonian researchers’ participation; among others, etag is promoting research cooperation with the baltic and nordic countries. the objective of the ‘estonian research international marketing strategy - ’ is to contribute towards the execution of the ‘estonia is active and visible in terms of international rdi cooperation’ sub-objective of the estonian research and development and innovation strategy - , ‘a knowledge-based estonia’ (hereinafter referred to as the rdi strategy). one of the goals of the marketing strategy is to increase international knowledge of the opportunities available in estonia in terms of open cooperation and research infrastructure. most estonian research institutions have various international collaboration projects, and many nationally important research infrastructure units of the estonian research infrastructures roadmap are connected to the european strategy forum on research infrastructures (esfri) and other international research infrastructures. in february, , the estonian research council signed the agreement for estonia to join the nordic e-infrastructure collaboration (neic), which gives estonian research infrastructures the possibility to maintain and enhance their competitiveness and to engage in more international cooperation. in general, access to national and research infrastructure is available to estonian researchers and to foreigners who are working in estonia. at the end of , ut submitted an application to join the eosc association, which will further facilitate international cooperation.
https://www.etag.ee/wp-content/uploads/ / /etag-arengukava- _eng.pdf
finland
the declaration for open science and research (finland) – does not mention facilitating cross-border research. the finnish research community is drafting policies to support the vision and the mission of the declaration. the draft policy document on open access to research data and methods does not mention cross-border research either. in january , the academy of finland published a strategy for national research infrastructures in finland for the years - . the objective of the strategy is to promote the quality, competitiveness and renewal of research, to strengthen the broad-based impact of research environments and to increase national and international cooperation. the strategy was drawn up by the finnish research infrastructure committee and adopted by the board of the academy.
https://www.etag.ee/wp-content/uploads/ / /teadusagentuur_dokument_eng.pdf
https://www.etag.ee/en/funding/infrastructure-funding/estonian-research-infrastructures-roadmap/
https://avointiede.fi/en/policies/declaration-open-science-and-research- -
https://avointiede.fi/en/news/draft-policy-component-open-access-research-data-now-open-comments
https://www.aka.fi/globalassets/ -tutkimusrahoitus/ -ohjelmat-ja-muut-rahoitusmuodot/ -tutkimusinfrastruktuurit/aka_tik_strategia_ _en_digi_a.pdf
many higher education institutions and research
organisations have international collaboration embedded in their strategy (e.g. university of helsinki strategy plan). the right to use national services such as those of csc is based on the user's affiliation to a finnish higher education or research institution. to use these services, the researcher usually needs a project and user accounts. for cross-border collaboration, this means that a finnish partner, with a joint project from a finnish university, applies for a user account. only the finnish researcher can be granted access. this is also the case if the finnish researcher wants to use nordic capacity in a joint project. nowadays commercial tools like google docs and google drive are often used in international projects, as they are easy to use and do not require any university affiliation for access. however, the commercial services guarantee neither data protection nor sustainability. therefore, a policy of making national services available and usable also for international partners, in a controlled manner, should be taken into consideration. in this context, the issues of costs and data protection need to be resolved.
latvia
latvia states that it is open to any developments broadening international cross-border collaboration and looks forward to internationally agreed initiatives and rules. latvia is involved in esfri projects, the horizon europe programme ( - ) and european partnerships. latvia has joined eurohpc and euratom and is planning to join the eosc association. support for international cross-border collaboration in research is based on regulation no.
https://www.helsinki.fi/en/university/strategic-plan- -
https://research.csc.fi/free-of-charge-use-cases
https://likumi.lv/ta/id/ -atbalsta-pieskirsanas-kartiba-dalibai-starptautiskas-sadarbibas-programmas-petniecibas-un-tehnologiju-joma
lithuania
the research council of lithuania does not directly define policies for international bodies or cross-border collaborators. from the law of science and studies, article , it follows that international collaborators and bodies participating in state funded projects are obliged to sign cross-border and/or cross-organisational joint venture agreements with a very specific clause regarding publication and the ownership of intellectual property. in many cases this is regulated by the signing parties or organisations. a very important question for the lithuanian research community is how to identify metrics for research impact evaluation in open access journals and venues. the commercial sources currently used for research impact evaluation, i.e. clarivate web of science, do not sufficiently incentivise publishing in open access journals, as the evaluation directly affects employment at research organisations.
norway
from the new national strategy on access to and sharing of research data: “the infrastructure in place must lay a foundation for cooperation and knowledge-sharing that extends across countries and sectors. it should be easy for international researchers to find norwegian data sets.” this should happen through the adoption of internationally recognised standards and services.
the strategy also invites the research council of norway to continue supporting international collaboration through adequate funding programmes. norway's investment and participation in the eosc and the eurohpc joint undertaking aim to facilitate international collaboration through common services for storing, computing on and sharing data. norway also actively supports neic, the nordic e-infrastructure collaboration, which promotes projects such as coderefinery.

sweden

via the swedish research bill (https://www.regeringen.se/ af /contentassets/da af a b dadcfb d /forskning-frihet-framtid--kunskap-och-innovation-for-sverige.pdf), select research infrastructures are allocated additional funding. facilitation of cross-border research is mentioned in the context that research infrastructure may increase international collaboration by facilitating contacts with researchers from beyond sweden. internationalization is touched upon in the sense that increased funding of national research infrastructures may facilitate increased quality of swedish research, and as such contribute to internationalization. while not a policy, in a referral regarding internationalization of research vr argues that research is dependent on access to ri:s, and that internationalisation of ri:s is a cost-efficient way to finance them and to facilitate synergies which a single country cannot provide and operate on its own. ri:s are viewed as international collaborations in their own right, following from the aim to build international collaboration, which in turn facilitates international research collaborations. 
this applies to some national ri:s, which promote international collaboration; max iv, for example, is available to international researchers from its member institutions outside sweden. lastly, swedish research infrastructure and e-infrastructure financed by vr is bound by a set of common boundary conditions. these entail being openly accessible primarily to researchers, industry and other relevant actors operating in sweden.

https://www.regeringen.se/ af /contentassets/da af a b dadcfb d /forskning-frihet-framtid--kunskap-och-innovation-for-sverige.pdf
https://www.vr.se/download/ . d f ae b d / /en% strategisk% agenda% f%c %b r% internationalisering.% delbet%c %a nkande% av% utredningen% om% %c %b kad% internationalisering% av% universitet% och% h%c %b gskolor
max_iv_general_terms_and_conditions_for_open_user_access_ .pdf
https://www.vr.se/english/mandates/research-infrastructure/what-is-research-infrastructure.html

. . summary /analysis

this section summarizes chapter and describes similarities and differences for the countries inventoried.

table : facilitation of cross-border research

country | policy on facilitating cross-border research
denmark | policy does not consider facilitation of cross-border research
estonia | policy does consider facilitation of cross-border research
finland | national policies do not mention facilitating cross-border research; most organisational strategies do
latvia | policy does consider facilitation of cross-border research
lithuania | policy does not consider facilitation of cross-border research
norway | policy does consider facilitation of cross-border research
sweden | policy does not consider facilitation of cross-border research

as summarized in table , the inventory shows that three of the countries inventoried have policies that mention facilitation of cross-border research. the remaining countries have national policies which focus on access to services for researchers with a national affiliation. these findings are consistent with those of deliverable d . ., which showed that the focus in policies is academic usage by users with national affiliations. a conclusion is that, in the nordic and baltic region, facilitation of cross-border research is not a focus in policies; rather, policies have a national scope. 
regarding eosc outputs on the facilitation of cross-border research, such as the eosc rules of participation (rop), it is worth noting that policies in place in the baltic and nordic region generally do not focus on openness as the eosc rop does, but rather concentrate on national service provisioning.

. roadmap with eosc perspective

. . roadmap related to open science / fair

this section sets the findings of this deliverable into an eosc perspective by relating them to eosc policies and recommendations.

in , the commission proposed to the competitiveness council the creation of a european open science cloud. the aim was to develop a trusted, virtual, federated environment that cuts across borders and scientific disciplines to store, share, process and re-use research digital objects (like publications, data, and software) following the fair principles. in this document, we have focused on investigating how fair-related issues (incentives for fair, os/fair training, making other digital objects fair) have been considered at the policy level in the baltic and nordic region. because the goal for eosc is also a federated environment that cuts across borders and scientific disciplines, we concentrated on investigating how cross-border research is facilitated through policies in the baltic and nordic region. regarding policy-implemented incentives for fair, our investigation shows that in many inventoried countries there are national policies, draft policies, or laws which can be considered such incentives. policies may be authored by different organisations, both those with national mandates, such as ministries, and those with a subnational mandate, such as heis. the eosc fair working group has stated that a balance of penalties and rewards is needed for optimum impact in implementing fair practices. 
attention should still be paid to the development of incentives for fair implementation at both national and subnational levels, and to how to create incentives for researchers to share their research data. our study shows that there is awareness of the importance of os training and of training for making data fair in many inventoried countries. almost every country has a national policy on this, either in place or in draft form. furthermore, there are policies at the subnational level, and training is organized by heis or e-infras. the first of the fairsfair practical recommendations for policy enhancement to support the realisation of a fair ecosystem reads “provide practical guidance to researchers and data stewards on how to implement fair within different domains” and continues “commitments are needed from all stakeholders to support and meet training needs relating to open science”. our working group sees the need to further develop and harmonize policies for the provision of training in order to increase the skills and knowledge needed to store, share, process and re-use research digital objects following the fair principles. most countries inventoried do not have policies in place for making other research objects, including software and methodology, fair. a conclusion for this specific item, as has also been highlighted by other reports, is that making other research objects fair is an area in need of focus in order to ensure transparency, reproducibility, and reusability of research. 
https://ec.europa.eu/info/research-and-innovation/strategy/goals-research-and-innovation-policy/open-science/european-open-science-cloud-eosc_en
https://www.nature.com/articles/sdata
https://op.europa.eu/en/publication-detail/-/publication/ fa - - eb- a - aa ed a /language-en/format-pdf/source-
https://zenodo.org/record/ #.ycjh xmxxb

. roadmap related to facilitating cross-border research

the eosc-hub briefing paper “provision of cross-border services” describes the challenges involved in providing publicly funded rivalrous resources across borders in europe. the paper states that the european research landscape is very fragmented, with a large number of actors involved, often providing or consuming services within the boundaries of their thematic areas and national borders. “the majority of research funding in europe is provided nationally. funding sources are varied, complex and involve a large number of different rules, which contributes to suboptimal use of member states’ investment in research services, particularly in cases of cross-border service usage. research use cases could be aided by providing “choreography” of national and european funding schemes and eligibility criteria across borders.” as our study shows, cross-border research is not yet a focus of policies in the baltic and nordic region. 
according to the eosc-hub briefing paper, providing special funding schemes and eligibility criteria that apply across borders could help with, for example, the cost issues that arise when services are used across borders. the issue of data protection should also be solved. thus it is not only policies that are lacking; the issues of costs and data protection must also be resolved to move forward with cross-border research. the eosc-nordic study “restrictive policies a barrier to cross-border open science – open science in the nordics: legal insights” finds that, in many cases, legislation is not a barrier to cross-border data sharing for research. lastly, outputs from eosc, such as the eosc rules of participation, state that “the principle of openness is central to eosc. eosc is as open as possible, and only as closed as necessary. this applies to users, resources, and the rop themselves.” as previously mentioned, it is worth noting that policies in place in the baltic and nordic region generally do not focus on openness, but rather concentrate on national service provisioning. a conclusion is that, should eosc involve both consumable services and access to data, raising awareness through dialogue at the national level is needed to facilitate understanding of the aims of eosc among a wider audience. 
https://www.eosc-hub.eu/sites/default/files/eosc-hub% briefing% paper% -% provision% of% cross-border% services% -% final_ .pdf
https://www.eosc-nordic.eu/kh-material/deliverable- - -open-science-in-the-nordics-legal-insights/
https://repository.eoscsecretariat.eu/index.php/s/ o fgqwn ekoz n

list of abbreviations

csc | it center for science
coc | code of conduct on research integrity
dau | data access units
dmp | data management plan
dsc | digital scholarship center
deic | danish e-infrastructure cooperation
etais | estonian scientific computing infrastructure
etis | estonian research information system
eosc | european open science cloud
esfri | the european strategy forum on research infrastructures
eu | european union
eurohpc | the european high-performance computing joint undertaking
fair | findable, accessible, interoperable, reusable
fmi | finnish meteorological institute
fsd | finnish social science data archive
gdpr | general data protection regulation
hei:s | higher education institutions
izm | latvian ministry of education and science
kb | national library of sweden
lu | university of latvia
neic | nordic e-infrastructure collaboration
ntnu | the norwegian university of science and technology
nsd | norwegian center for research data
os | open science
rcn | research council of norway
rtu | riga technical university
rop | eosc rules of participation
sdu | university of southern denmark
snic | swedish national infrastructure for computing
snd | swedish national dataservice
tenk | finnish national board on research integrity
uef | university of eastern finland
ut | university of tartu
unit | norwegian directorate for ict and joint services in higher education & research
uio | university of oslo
vr | swedish research council

literature list

academy of finland . finland’s roadmap for research infrastructures. https://www.aka.fi/en/research-and-science-policy/research-infrastructures/finlands-roadmap-for-research-infrastructures/; academy of finland . open science – open access publishing and open data. https://www.aka.fi/en/funding/apply-for-funding/az-index-of-application-guidelines/open-science/; academy of finland . data management plan https://www.aka.fi/en/research-funding/apply-for-funding/how-to-apply-for-funding/az-index-of-application-guidelines /data-management-plan/data-management-plan/ academy of finland . academy of finland publishes strategy for national research infrastructures in finland – . https://www.aka.fi/en/about-us/media/press-releases/ /academy-of-finland-publishes-strategy-for-national-research-infrastructures-in-finland- /; csc – it center for science. https://www.csc.fi/en/home; csc – it center for science. free-of-charge use cases for end users in csc's computing, cloud and storage services. https://research.csc.fi/free-of-charge-use-cases; csc – it center for science. data policy. https://www.csc.fi/en/data-policy csc – it center for science. how to get started with services? https://research.csc.fi//accounts-and-projects; datacite. http://datacite.ut.ee/; dau-handboken. fair-principerna. https://dhb.snd.gu.se/wiki/fair-principerna easydmp. 
https://easydmp.sigma .no/ eosc-hub. https://www.eosc-hub.eu/about-us; eesti teadus- ja arendustegevuse, innovatsiooni ning ettevõtluse arengukava – . https://www.hm.ee/sites/default/files/ _taie_arengukava_eelnou_lisad_ . . .pdf elixir norway. new funding to elixir norway on fair data management. https://www.cbu.uib.no/new-funding-to-elixir-norway-on-fair-data-management/ estonian research council. open science in estonia - open science expert group of the estonian research council principles and recommendations for developing national policy. https://www.etag.ee/wp-content/uploads/ / /open-science-in-estonia-principles-and-recommendations-final.pdf estonian research council. estonian research infrastructures roadmap. https://www.etag.ee/en/funding/infrastructure-funding/estonian-research-infrastructures-roadmap/; estonian research council. estonian research council development plan . https://www.etag.ee/wp-content/uploads/ / /etag-arengukava- _eng.pdf estonian research council. estonian research international marketing strategy – . https://www.etag.ee/wp-content/uploads/ / /teadusagentuur_dokument_eng.pdf estonian research information system. https://www.etis.ee/portal/news/index/?islandingpage=true&lang=eng; estonian scientific computing infrastructure. https://etais.ee/about/; estonian university of life sciences library. research data management. https://library.emu.ee/en/research/research-data-management/ eosc-hub, eosc-hub briefing paper – provision of crossborder services. https://www.eosc-hub.eu/sites/default/files/eosc-hub% briefing% paper% -% provision% of% cross-border% services% -% final_ .pdf eosc-nordic, deliverable . : open science policies and resource provisioning in the nordic and baltic countries (first report) https://www.eosc-nordic.eu/kh-material/testimateriaali/ eosc-nordic, deliverable . 
: open science in the nordics: legal insights, https://www.eosc-nordic.eu/kh-material/deliverable- - -open-science-in-the-nordics-legal-insights/ european commission . open science. https://ec.europa.eu/info/sites/info/files/research_and_innovation/knowledge_publications_tools_and_data/documents/ec_rtd_factsheet-open-science_ .pdf; european commission. six recommendations for implementation of fair practice. https://ec.europa.eu/info/sites/info/files/research_and_innovation/ki enn.pdf european commission. what the european open science cloud is. https://ec.europa.eu/info/research-and-innovation/strategy/goals-research-and-innovation-policy/open-science/european-open-science-cloud-eosc_en fairdata.fi. take care of your research data. https://www.fairdata.fi/en/ fairsfair, https://www.fairsfair.eu/the-project fairsfair . d . policy enhancement recommendations. https://zenodo.org/record/ #.ydept wg-ul fairsfair . d . fair policy landscape analysis. https://zenodo.org/record/ #.xkz vou-nbi; fairsfair . second report of the fairsfair synchronisation force (d . ). https://zenodo.org/record/ #.ydeqinwg-ul federation of finnish learned societies. https://tsv.fi/en; federation of finnish learned societies. draft policy component for open access to research data now open for comments https://avointiede.fi/en/news/draft-policy-component-open-access-research-data-now-open-comments federation of finnish learned societies. declaration for open science and research. https://avointiede.fi/sites/default/files/ - /declaration _ .pdf the finnish social science data archive (fsd). fsd’s data management policy https://www.fsd.tuni.fi/en/data-archive/documents/fsds-data-management-policy/ finnish government : . inclusive and competent finland – a socially, economically and ecologically sustainable society. https://julkaisut.valtioneuvosto.fi/bitstream/handle/ / /inclusive% and% competent% finland_ _web.pdf?sequence= &isallowed=y finnish meteorological institute. 
finnish meteorological institute publishes software as open source code. https://en.ilmatieteenlaitos.fi/open-source-code finnish national board on research integrity (tenk). https://www.tenk.fi/en https://www.tenk.fi/sites/tenk.fi/files/decree.pdf; finnish national board on research integrity (tenk). template for researcher's curriculum vitae https://tenk.fi/en/advice-and-materials/template-researchers-curriculum-vitae mark d. wilkinson, michel dumontier, et al. the fair guiding principles for scientific data management and stewardship. https://www.nature.com/articles/sdata ministry of education and culture. science and research https://minedu.fi/en/science-and-research ministry of education and culture. vision for higher education and research in . https://minedu.fi/en/vision- ministry of education and research . estonian research and development and innovation strategy - . https://www.hm.ee/sites/default/files/estonian_rdi_strategy_ - .pdf; ministry of education and research. national strategy on access to and sharing of research data. https://www.regjeringen.no/en/dokumenter/national-strategy-on-access-to-and-sharing-of-research-data/id /?ch= ; ministry of education and research, estonian research council . estonian research infrastructure roadmap . https://www.etag.ee/wp-content/uploads/ / /etag_research_infrastructure_roadmap_ .pdf; ministry of education and science of latvia. https://www.izm.gov.lv/en/; ministry of education, science and culture . policy and action plan - – the science and technology policy council. https://www.government.is/lisalib/getfile.aspx?itemid= e fff-ac b- e - a- bc d ; national library of sweden. fair-principerna. https://www.kb.se/samverkan-och-utveckling/oppen-tillgang-och-bibsamkonsortiet/oppen-tillgang/fair.html national library of sweden. Öppen tillgång. 
https://www.kb.se/samverkan-och-utveckling/oppen-tillgang-och-bibsamkonsortiet/oppen-tillgang.html national library of sweden . Öppen vetenskap och fair-principerna lyfts i regeringens budgetproposition för . https://www.kb.se/samverkan-och-utveckling/nytt-fran-kb/nyheter-samverkan-och-utveckling/ - - -oppen-vetenskap-och-fairprinciperna-lyfts-i-regeringens-budgetproposition-for- .html; nordforsk . the state of open science in the nordic countries: enabling data science in the nordic region. https://www.nordforsk.org/en/publications/publications_container/state-o-open-science-in-the-nordic-countries/view; ntnu. ntnu open data – ntnu’s policy for open research data - . https://innsida.ntnu.no/c/wiki/get_page_attachment?p_l_id= &nodeid= &title=ntnu+open+data&filename=ntnu% open% data_policy.pdf; nordatanet. https://www.nordatanet.no/nb/taxonomy/term/ norwegian centre for research data. the nordi project. https://www.nsd.no/en/about-nsd-norwegian-centre-for-research-data/projects/the-nordi-project/ norwegian centre for research data. create a data management plan. https://www.nsd.no/en/create-a-data-management-plan/ oecd. principles and guidelines for access to research data from public funding. https://www.oecd.org/sti/inno/ .pdf openaire. https://www.openaire.eu/; openaire. workshops on research data management gained high interest among different stakeholders in latvia. https://www.openaire.eu/blogs/workshops-on-research-data-management-gained-high-interest-among-different-stakeholders-in-latvia open science. https://openscience.fi/; open science. declaration for open science and research - . https://openscience.fi/en/policies/declaration-open-science-and-research- - ; open science. key documents of open science and research initiative. https://openscience.fi/en/policies-and-key-actors/key-documents-open-science-and-research-initiative; oslomet . open access policy. https://ansatt.oslomet.no/en/open-access-policy; oulu university. research data guide. 
https://libguides.oulu.fi/researchdata/fair_principle ‘pētījuma par atvērto zinātni un rīcībpolitikas ceļa kartes izstrādi’ noslēguma ziņojums, .gada .jūnijs https://www.izm.gov.lv/sites/izm/files/petijums-atverta_zinatne_ _ .pdf rda sweden. placing research software into open science - initial results from an rda sweden and eosc nordic collaboration. https://snd.gu.se/sites/default/files/ - /rda% sweden% webinar% - - .pdf regeringskansliet . Öppen digital plattform för svenska vetenskapliga tidskrifter. https://www.regeringen.se/pressmeddelanden/ / /oppen-digital-plattform-for-svenska-vetenskapliga-tidskrifter/; regeringen . regeringens proposition / : : kunskap i samverkan – för samhällets utmaningar och stärkt konkurrenskraft. https://www.regeringen.se/ adad /contentassets/ faaf a af b fde ef b /kunskap-i-samverkan--for-samhallets-utmaningar-och-starkt-konkurrenskraft-prop.- .pdf; regeringskansliet . regeringens proposition / : forskning, frihet, framtid – kunskap och innovation för sverige https://www.regeringen.se/ af /contentassets/da af a b dadcfb d /forskning-frihet-framtid--kunskap-och-innovation-for-sverige.pdf research data alliance international indigenous data sovereignty interest group. (september ). care principles for indigenous data governance. the global indigenous data alliance. gida-global.org https://www.rd-alliance.org/sites/default/files/care% principles% for% indigenous% data% governance_final_sept% % .pdf swedish research council. digitalt verktyg för datahanteringsplaner nu tillgängligt. https://www.vr.se/aktuellt/nyheter/nyhetsarkiv/ - - -digitalt-verktyg-for-datahanteringsplaner-nu-tillgangligt.html swedish research council. en strategisk agenda för internationalisering. delbetänkande av utredningen om ökad internationalisering av universitet och högskolor (sou : ) https://www.vr.se/download/ . 
d f ae b d / /en% strategisk% agenda% f%c %b r% internationalisering.% delbet%c %a nkande% av% utredningen% om% %c %b kad% internationalisering% av% universitet% och% h%c %b gskolor swedish research council. open access to research data. https://www.vr.se/english/mandates/open-science/open-access-to-research-data.html swedish research council. kriterier för fair forskningsdata. https://www.vr.se/analys/rapporter/vara-rapporter/ - - -kriterier-for-fair-forskningsdata.html swedish research council. what is research infrastructure? https://www.vr.se/english/mandates/research-infrastructure/what-is-research-infrastructure.html the university of bergen, policy for open science https://www.uib.no/en/foremployees/ /university-bergen-policy-open-science uit norges arktiske universitet . principles and guidelines for research data management at uit. https://intranett.uit.no/content/ /principles% and% guidelines% for% research% management% at% uit_ .pdf; uit norges arktiske universitet . principles for open access to academic publications at uit the arctic university of norway. https://en.uit.no/content/ /cache= /principles% for% open% access% at% uit_ .pdf; unifi . open science and data – action programme for the finnish scholarly community. https://www.unifi.fi/wp-content/uploads/ / /unifi_open_science_and_data_action_programme.pdf; uninett sigma . https://www.sigma .no/ unit – the norwegian directorate for ict and joint services in higher education & research. about unit. https://www.unit.no/en/about-unit unit – the norwegian directorate for ict and joint services in higher education & research. digitalisering i høyere utdanning og forskning. https://www.unit.no/en/node/ university of bergen library. introduction to data management plan (dmp). https://www.uib.no/en/ub/ /introduction-data-management-plan-dmp university of eastern finland library. research data management. 
https://www.uef.fi/en/research-data-management university of helsinki. strategic plan of the university of helsinki – . https://www .helsinki.fi/sites/default/files/atoms/files/hy _strategia_en.pdf university of oslo . carpentry@uiot. https://www.ub.uio.no/english/writing-publishing/dsc/carpentry-uio/index.html university of oslo . digital scholarship center. https://www.ub.uio.no/english/writing-publishing/dsc/index.html university of oslo . policies and guidelines for research data management. https://www.uio.no/english/for-employees/support/research/research-data-management/policies-and-guidelines/index.html; university of tartu, estonian research council . estonian code of conduct for research integrity. https://www.eetika.ee/sites/default/files/www_ut/hea_teadustava_eng_trukis.pdf; university of tartu. research data management and publishing. https://sisu.ut.ee/andmehaldus/home- ?lang=en utbildningsdepartementet. nationell färdplan för det europeiska forskningsområdet – . https://www.regeringen.se/ a ba /contentassets/ ced a ca cb e e /nationell-fardplan-for-det-europeiska-forskningsomradet- .pdf uppsala universitet. fairdriktning – projektdirektiv. hantering av forskningsdata. https://mp.uu.se/documents/ / /projektdirektiv+ufv+ - +fairdrikning+forskningsdata.pdf/ e -b f- d e- - e d f c

omeka and other digital platforms for undergraduate research projects on the middle ages

research

how to cite: cuenca, esther liberman and maryanne kowaleski. . “omeka and other digital platforms for undergraduate research projects on the middle ages.” digital medievalist ( ): , pp. – , doi: https://doi.org/ . /dm.

published: june

peer review: this is a peer-reviewed article in digital medievalist, a journal published by the open library of humanities.

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

open access: digital medievalist is a peer-reviewed open access journal.

digital preservation: the open library of humanities and all its journals are digitally preserved in the clockss scholarly archive service.

esther liberman cuenca and maryanne kowaleski
fordham university, us
corresponding author: esther liberman cuenca (esther.liberman.cuenca@gmail.com)

this article discusses how digital projects can be employed to encourage undergraduates to think across disciplinary divides, to integrate field and online research, and to confront methodological issues in a more direct way. one of these projects draws on an open-source, web-publishing platform called omeka and was designed for an interdisciplinary course on the archaeology and history of medieval london offered at fordham university’s london centre. the project aimed to give students first-hand experience with the material culture of a medieval city and consisted of two parts. the first, an object report, required each student to research and write a short essay on a single medieval object on display at the museum of london, highlighting the significance of the object within the context of civic, religious, and domestic life in medieval london. in addition, students uploaded images and found illustrations of their objects in medieval manuscripts. 
the second part, a site report, required a visit to a medieval london location (a church, a monastery, or a cemetery, for example) to research its significance in the middle ages. students also uploaded images of their site, which they photographed themselves, and identified the site’s location on a (preferably medieval) map of london. another similar project was designed using the weebly web-editing platform for students taking western tradition i at marymount california university, which does not have access to omeka. both the omeka and weebly projects allowed students to grapple with larger questions about integrating material objects into pre-modern history, but they were especially valuable for teaching students about the importance of being a responsible researcher since students contributed to a digital humanities project that made their research available to a wide public.

keywords: omeka; pedagogy; london; undergraduate research; archaeology; weebly

introduction

many institutions of higher learning are increasingly sponsoring and promoting undergraduate research, although their beneficiaries are largely the stem fields. digital projects, however, can offer new and creative opportunities to promote undergraduate research in the humanities, particularly given their collaborative focus, development of technical skills, self-conscious use of methodology, and interdisciplinarity. this essay outlines a successful classroom project for a course taught by maryanne kowaleski and esther liberman cuenca, centered on the use of an open-access digital platform called omeka, which was especially effective in encouraging interdisciplinary research and engagement with methodological issues (cuenca and kowaleski ). 
this particular project, an online student exhibition of medieval objects and medieval sites for a course on the archaeology and history of medieval london at fordham university (figure ), was also distinguished in two other ways: it emphasized the value of material evidence, and it occurred within the context of a study-abroad course that allowed cuenca and kowaleski to integrate field trips and online research. the results of this student project, as well as a similar project with cuenca as an instructor at marymount california university (mcu) that used weebly (cuenca ), indicate that digital projects allow students to take a different (and public) type of ownership and responsibility for their research than that which occurs with more conventional “off-line” student projects, because of the nature of the data footprint they generate and the potentially large audience for these projects on the web.

medieval london on omeka

omeka is an open-source, web-publishing platform that was developed at george mason university a decade ago for the display of scholarly collections and exhibits (roy rosenzweig center for history and new media ; information about omeka’s metadata can be found at “working with dublin core” [omeka ]). a free but limited version is available on omeka.net, but many institutions, including fordham university, subscribe to the version on omeka.org, which includes a variety of features that enhance the learning and viewing experience, such as plug-ins that let users annotate images, share data and map locations, and create timelines. we decided to employ omeka because it had been successfully used for several digital projects at fordham (center for medieval studies ) and because omeka is inexpensive, easy to use, and well-suited for projects that mix text and image (kucsma, reiss, and sidman ).
kowaleski designed two assignments for the medieval london course that aimed to give students first-hand experience with the material culture of a medieval city and to introduce them to the use of digital tools for humanities research. these assignments coincided with lectures on archaeological methods and several field trips, including a guided tour of the medieval collection at the museum of london and a guided tour of the museum’s collection warehouse and conservation facilities in hackney (museum of london b). the first project, an object report, required each student to choose a single medieval object found in the collection of the museum of london. we compiled a list of medieval objects using the searchable database of the museum of london’s collections, which can be accessed online (museum of london a). after claiming a medieval object on a sign-up sheet placed on the course’s google drive (which is how we shared all materials with students during the course), each student had to research and write a short report of about to words in length, find at least two images, and mount both text and images on the omeka-made website while also filling in the required metadata. the course syllabus and instructions for the assignment are available at fordham library’s digital depository (kowaleski ).

figure : partial screenshot of homepage for the omeka projects for the medieval london course at fordham university’s london centre ( ), http://medievallondon.ace.fordham.edu.
since the class focused on the history and archaeology of the city and the methods employed by archaeologists in the recovery, preservation, and display of medieval material culture, the object reports encouraged students to explore the cultural and historical contexts in which their objects were manufactured, sold, and used. in so doing, the students grappled with, and reflected on, larger historical questions that were raised in the work of arjun appadurai in the social life of things ( ) about the agency of things, by considering how objects have simultaneously been shaped by, but have also influenced, human activities. the object report required students to think about the utility, composition, and social life of their objects, while also reflecting on how archaeologists (and historians interested in material culture writ large) approach and interpret objects for insights into past societies. this exercise, which was supported in lectures and student presentations (kowaleski ) and readings (particularly mcintosh and schofield ), encouraged students not only to focus on historically situating the object but also to reflect on the very process of historicizing the artifact (figure ) and to understand the tools, interpretations, and influence of digital archaeology (richter ; watrall ).

the second virtual exhibit, the site report, required students to focus on the preservation, evolution, and impact of a london site, which students again chose from a list compiled by the instructors. the list of these sites included medieval buildings (ranging from london’s guildhall to medieval parish churches), streets, wider landscapes (including cemeteries), and rivers, some of which are no longer visible but run beneath contemporary london (for example, tan ; de silva ; mccallum ).
depending on the type of site, students had to write a short report addressing a series of questions that encouraged them to think about topography as well as the development, material forms, and functions of different kinds of urban spaces, issues treated in the lectures and course readings, especially caroline m. barron’s ( ) history of late medieval london and john schofield ( ) on the archaeology of medieval london. the site report had to be accompanied by at least two images. one was a photograph of the site as it currently exists, which required students to visit the site in person and to consider the medieval in contemporary london. the second had to be a map on which they located their site. kowaleski discussed how to employ maps and old engravings as primary source evidence in a lecture, and cuenca provided students with images of maps of medieval and early tudor london that they could modify and edit to show the location of their sites (grandiose ). we made clear the advantages of including more pictures in their exhibits, and many students did put effort into finding old engravings of their sites that they could discuss in their reports (figure ). this assignment thus prompted students to recognize a wider complement of primary sources available for reconstructing the material and physical world of medieval london.

figure : partial screenshot of student suzanne forlenza’s object report on a church bell ( ), http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-objects/churchbell.

one of the more valuable components of both assignments was the requirement to add metadata about the text and images used.
omeka employs dublin core standards for the metadata categories (roy rosenzweig center for history and new media ). the instructors created a detailed guide for students about what they should consider entering for various metadata fields, but limited the number of metadata fields to nine, which were: “title,” “subject,” “description,” “source,” “publisher,” “date,” “contributor,” “rights,” and “type” (figure ). the “title” field is for the name of the object or site. for the “subject” field, students had to choose from a list of categories that the instructors compiled. for “date,” students had to choose one of three periods (early, high, or late medieval) if the exact dates or date range for their objects or sites were unknown, and for “contributor” they entered their first and last names. in addition to using these categories as keywords in searches, students had to enter additional keywords, or tags, for users searching their online exhibits.

note: this vectorized map is based on the medieval london map found in william r. shepherd’s historical atlas ( , ).

coming up with their own tags made the students consider how their objects or sites might fall into particular categories rooted in specific historical moments. while students usually excelled at choosing appropriate tags for their objects and sites, some of their keywords were not pertinent to their projects, and many needed to be modified. we used this particular problem to emphasize the extent to which web searches are bound to keyword choices made by web developers and other users. we continually stressed the importance of entering the metadata correctly because that process, much like compiling a bibliography of secondary sources, is crucial to presenting research online in a responsible manner.
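the nine-field scheme described above can be pictured as a simple record check. the following python sketch is illustrative only: it does not use omeka's api, and the function name `check_record`, the sample subject list, the placeholder values, and the rule that a date must either be one of the three periods or contain a digit are all assumptions made for the example, not part of the course guide.

```python
# illustrative sketch only; this is not omeka's api.
# the field names follow the nine dublin core elements listed in the text.
DC_FIELDS = ("title", "subject", "description", "source", "publisher",
             "date", "contributor", "rights", "type")

# the three fallback periods students could choose when exact dates were unknown.
PERIODS = ("early medieval", "high medieval", "late medieval")


def check_record(record, allowed_subjects):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # every one of the nine fields must be filled in.
    for name in DC_FIELDS:
        if not str(record.get(name, "")).strip():
            problems.append(f"missing field: {name}")
    # the subject must come from the instructors' list (hypothetical here).
    subject = str(record.get("subject", "")).strip()
    if subject and subject not in allowed_subjects:
        problems.append(f"subject not in instructors' list: {subject}")
    # assumption: a usable date is a period name or something containing a digit.
    date = str(record.get("date", "")).strip().lower()
    if date and date not in PERIODS and not any(ch.isdigit() for ch in date):
        problems.append(f"date is neither a period nor a dated value: {date}")
    return problems


if __name__ == "__main__":
    subjects = {"religious life", "domestic life", "civic life"}  # hypothetical list
    record = {
        "title": "church bell",
        "subject": "religious life",
        "description": "placeholder for the student's report text",
        "source": "placeholder bibliography entry",
        "publisher": "https://example.org/image-page",    # placeholder url
        "date": "late medieval",
        "contributor": "first last",
        "rights": "https://example.org/image-rights",     # placeholder url
        "type": "still image",
    }
    for problem in check_record(record, subjects):
        print(problem)
```

a check of this kind only catches empty or out-of-list values; whether a tag or a bibliographic entry is actually appropriate still needs the editorial review that the instructors describe.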
into the “description” field the students copied and pasted their object or site reports, which cuenca copy-edited for clarity and grammar. kowaleski spent considerable time talking about how to identify suitable bibliographic sources, including part of a class lecture devoted to identifying what made for a scholarly and, therefore, citable web resource (see also kelly ). she compiled an extensive bibliography to guide students to the best sources (kowaleski ) and made many of them available as pdfs on the course google drive. drawing from these and other resources, the students found appropriate sources to cite in their own bibliographies. despite instructions to format bibliographic entries in the “source” field according to scholarly formatting laid out in the chicago manual of style, many students failed to follow the formatting rules, which then occasioned a discussion of why academic research needs standardized formatting.

figure : partial screenshot of student marina elgawly’s site report on greyfriars ( ), http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-sites/greyfriars.

figure : partial screenshot of the dublin core metadata fields available on omeka ( ), http://omeka.org [no direct link is available].
because many of the images found in these reports came from external resources, such as museum websites and scholarly books, it was imperative that the distribution and copyrights were not only clearly available to the website’s visitors, but that students also appreciated how historical images become available for public use. the links they used to obtain the images of their sites and objects, if they did not personally photograph them, were entered into the “publisher” field. additionally, under “rights,” they provided the url links to pages on museum or academic websites that detail the sharing or distribution rights of their images. finally, for “type,” students entered the type of media (which was usually “still image”) that they had uploaded to their exhibits. overall, the instructors found that making a guide (kowaleski ) on how to enter the metadata for these projects was critical to the students’ success in completing these reports, as only a few students had experience with data entry or even familiarity with the concept of metadata. the instructions compiled for the students included references and links to resources that would help them write their reports, as well as information on how to search for appropriate supplementary images for their projects (kowaleski ). the object and site images were not the only media that students curated for their online exhibits, as they also had to provide their viewers with geographic, artistic, and manuscript contexts for their objects and sites. for their object reports, they uploaded, at minimum, one other image from a manuscript source or piece of art that depicted their object. the image did not necessarily have to represent london life or material culture from that city, but it did have to date from the medieval period. 
for example, a student who wrote his report on a pilgrim’s badge recovered in london included, in his exhibit, a manuscript illumination from a fifteenth-century belgian book in the pierpont morgan library, showing a pilgrim who had visited the shrine of santiago de compostela in spain and who is depicted with several badges in the shape of seashells fastened to his hat (milohnić ). another student, who wrote her report on a set of rosary beads, not only showed examples of rosary beads depicted in two religious paintings, but also provided another image from the metropolitan museum of art in new york of an early sixteenth-century german rosary made from ivory (elgawly ). several of the site reports incorporated old engravings to illustrate a now-destroyed building or street, while others cleverly integrated map material from other sources (for example, lobel ). some of the students chose to upload a section of the agas map of london, named after the mid-sixteenth-century surveyor to whom its cartography was attributed, because it illustrates, in incredible detail, a bird’s-eye view of london’s buildings and streets (jenstad n.d.). in so doing, the students had to think about the social, economic, and political forces that helped shape (between the medieval and early modern periods) the topography and buildings of london over time.

in the fall term, kowaleski taught a version of this course at fordham’s home campus without the field excursions of the original study-abroad course in london. she omitted the site report, but lengthened the object report to require to words, three to four images, and a minimum of four to six secondary sources (kowaleski ).
the assignment worked just as well, but in the future she will require parenthetical references in the online essay in order to accommodate the longer text and greater number of secondary and primary sources; such a requirement will also help students be more conscious about their use (and overuse) of particular sources. she also now has a greater appreciation of the labor costs involved in digital pedagogy projects like this since she did not have access to the significant help provided by cuenca during the london course in terms of technical instruction, editing, and building the actual omeka exhibit (students only have contributor privileges in omeka and thus are not able to combine their collections into an exhibit). another change will be to require specific readings on the implications of digital scholarship itself. although she devoted an entire lecture to using omeka and the digital humanities, and frequently referred to developments in digital archaeology, it is clear that students would benefit from more exposure to the theoretical underpinnings of, and relevant debates in, digital humanities (gold ed. ; gold and klein eds. ), particularly in terms of digital history (for example, cohen and rosenzweig ).

note: this source is downloadable as pdfs in eight map sections from mary d. lobel’s historic towns trust, “map of london in ,” british historic towns atlas ( ), http://www.historictownsatlas.org.uk/atlas/volume-iii/city-london-prehistoric-times-c -volume-iii/view-text-gazetteer-and-maps-early. some students also used the black-and-white version of this map (printed across pages and thus done to a very large scale) in caroline barron’s london in the later middle ages ( ).

other medieval and pre-modern projects

drawing on the experience with the london course, cuenca designed a similar project for her students at mcu, where she taught a survey course on the western tradition from antiquity to the early modern period. she adapted the instructions for the medieval london object report to emphasize a more comparative approach to analyzing the material culture of the pre-modern period and to accommodate the web-building platform weebly (http://weebly.com), which cuenca had used for other student projects in previous classes. weebly is also less expensive and easier to use than omeka. for this project, cuenca took her students on a field trip to the los angeles county museum of art (http://www.lacma.org), or lacma, where the students were required to photograph and then write about an object made prior to circa housed in the european collection of the museum. they uploaded their reports and images to individual, blank webpages that cuenca set up for them on the class website (figure ). the students wrote four short reports. one explored the provenance, materials, manufacturers, and purpose of their pre-modern object at lacma, a second focused on modern objects they may use today that are similar to the lacma pre-modern object, the third was an analysis that compared their lacma object to similar pre-modern ones, from roughly the same period or earlier, found in other museums, and the fourth was a biographical portrait of an individual (an artist, patron, or even a mythical or biblical figure) who was associated in some way with their lacma object.
for example, one student chose to write on a seventeenth-century venetian glass ewer at lacma. she then compared this ewer to pitchers advertised for sale on wayfair and etsy, and to a roman ewer at the brooklyn museum in new york city (shoaf ). she completed the webpage with a report on angelo barovier, a famous fifteenth-century glassmaker who lived on the venetian island of murano, where the highest quality glass was made. students were also required to post a bibliography of works they used in assembling these reports (figure ). these virtual exhibits required students, at minimum, to curate images of their lacma objects, the modern object to which they compared their lacma object, a comparable pre-modern object from a different museum, and an illustration of their biographical figure (or an object associated with this person). ultimately, these exhibits encouraged students to reflect on continuity and change between past and present, and across different artistic and material cultures.

figure : partial screenshot of homepage for the western tradition at the los angeles county museum of arts projects at marymount california university ( ), http://mcuhistory.weebly.com/his- -fall- .html.

figure : partial screenshot of the biographical profile of the glassmaker angelo barovier and works cited of the report by ariana shoaf ( ), http://mcuhistory.weebly.com/shoaf-ariana.html.
since weebly is a simple drag-and-drop website builder, there are no metadata fields to fill out; instead, cuenca instructed the students to provide full captions underneath their images, as well as a complete works cited page with url links at the bottom of their exhibits. weebly is free to use, though it requires a paid “pro” account, which can be purchased in packages ranging from one month to two years, for the privilege of inviting users to contribute and edit webpages. there are other digital platforms, however, which are completely free to use and allow instructors to invite students to build exhibits and also edit their work. these free platforms are primarily used for blogging purposes, but can be deployed for online projects such as these, provided that instructors make the necessary modifications to accommodate the project to the limitations or strengths of a particular platform. the popular tumblr (http://tumblr.com) and wordpress (http://wordpress.com) platforms, for example, not only permit instructors to build websites or blogs and invite their students to contribute their own content or webpages, but also to set certain administrative controls and levels of editorial access over the entire website and individual pages. in the spring term, cuenca drew again on the omeka platform in designing a course at fordham university called medieval hollywood. some students in the course took advantage of a video plug-in to upload relevant film clips as part of their collections on a “medieval” film (cuenca ). this film project had to include at least two images and one primary source text that helped contextualize the historical setting of the film, along with an essay of to words analyzing the film’s representation of women or gender, ethnicity, or social status. 
the pedagogical goals of the project were twofold: first, to allow students to engage with the primary sources (literary, artistic, and material) that have informed filmmakers who adapt medieval stories to the screen; second, to encourage students to examine the implications of the artistic (and often political) choices made by these filmmakers. in addition to the film project, students signed up to give an in-class presentation on another film (provided to them from a list of over options) and write a short blog post (called a “capsule review” on the website) of to words summarizing their presentations. the presentations examined a film’s popular and critical reception, its historical influences, and how historical events (as recorded in particular primary sources) were portrayed onscreen. students filled in the metadata (including bibliographic information) for their items (video clips, images, and text), but cuenca, as editor, categorized the various film projects and capsule reviews into thematic groups, such as the crusades, joan of arc, religious orders, and vikings. while this omeka project prompted students to think about the relationship between medieval history (the primary sources) and medievalism in popular culture (the film), it also serves as a digital repository of searchable scholarly resources for the representation of the middle ages in cinema.

conclusions

it is not enough that students move their research online, as having students simply transfer their work from the “real” world of research papers to a “digital” world of websites or blogs does little on its own to generate any particular skills or new insights. the design of the assignment must encourage them to consider a historical object or site not as a static point of inquiry but as a dynamic one.
the exercises involved in identifying a variety of contemporary and historical images, searching museum websites, finding comparable objects, working with online maps, and filling in metadata or captions for their websites enabled students to transcend a more linear method of learning history. these digital humanities projects allow students to engage in multi-dimensional investigations that go beyond traditional research projects confined to producing text on paper, and to create microhistories of objects and sites that could lead viewers into several possible non-linear tangents. in so doing, the students have written their own historical narratives using text and images but have also opened up possibilities for further investigation. the exercise of “curating” exhibits, in which students are asked to embed their chosen objects within a particular place at a specific time like medieval london, or relate them to a historical figure and similar objects found in other museums, offers undergraduates the opportunity to use digital scholarship not only to recover the past but also to reflect on the act of this recovery and how their choices can affect the historical narratives that scholars tend to privilege. there was thus a dynamic element in these projects that might have been less effective on paper (or likely impossible to implement in that medium).
an especially valuable aspect of students’ foray into digital scholarship is the methodological issues they had to confront, whether in the requirement to enter metadata in omeka; consideration of the credibility and authority of different types of online, print, and visual sources; the value in adopting a comparative perspective; how to document the images they included in their exhibit; or bringing together documentary and online visual sources like manuscript illuminations, engravings, and maps to construct the textual narratives associated with their exhibits. finally, both the omeka- and weebly-based projects underscored an obvious, but often overlooked, premise of digital humanities projects, especially as it relates to the middle ages: the visual representation of objects and sites—aided appropriately by supplementary texts, maps, and images chosen and identified by the student—can transform seemingly inert objects from the pre-modern era into a kind of language with which they can reconstitute the past into stories. these websites, properly curated and constructed, can then become wholly self-contained microhistories of objects with their own assumptions, logic, and interpretations. these projects not only offer students the opportunity to present their research within the context of undergraduate conferences, poster sessions, or journals targeted at undergraduate research, but also make student research available to a wider public while also holding students accountable for how they present their research online (indeed, some students were shocked when they discovered that their authored report would crop up in a google search on their name). digital projects such as these open up digital spaces for students to engage with visual, archaeological, and cartographic sources that take them beyond the textual histories they normally learn in the classroom. 
acknowledgements

the authors thank dr. john harrington, who as dean of faculty at fordham university facilitated, along with dr. eva badowska, dean of the graduate school of arts and sciences at fordham, the graduate assistantship that allowed esther liberman cuenca to help design and implement the digital element of the london study-abroad course. for invaluable support, they also thank fordham it, particularly dr. fleur eshghi, patrick guerrier, and kanchan thoaker, and timothy ryan mendenhall for his help making the fordham course materials available at fordham university’s digital depository, digital research@fordham.

• esther liberman cuenca: elc
• maryanne kowaleski: mak
• conceptualization: mak
• resources: mak
• omeka design, instruction: elc, mak
• weebly design, instruction: elc
• project administration: elc, mak
• writing: elc, mak

competing interests

the authors have no competing interests to declare.

references

appadurai, arjun. (ed.) . “introduction: commodities and the politics of value.” in the social life of things: commodities in perspective, – . cambridge: cambridge university press.
barron, caroline m. . london in the later middle ages: government and people, – . oxford: oxford university press. doi: https://doi.org/ . /acprof:oso/ . .
center for medieval studies, fordham university. . medieval digital projects. new york: fordham university. accessed april , . https://medievaldigital.ace.fordham.edu.
cohen, daniel j., and roy rosenzweig. .
digital history: a guide to gathering, preserving, and presenting the past on the web. philadelphia: university of pennsylvania press. accessed april , . doi: https://doi.org/ . /dh. . .
cuenca, esther liberman. (ed.) . the western tradition at the los angeles county museum of arts. accessed april , . http://mcuhistory.weebly.com/his- -fall- .html.
cuenca, esther liberman. (ed.) . medieval hollywood. new york: fordham university. accessed may , . https://medievalhollywood.ace.fordham.edu.
cuenca, esther liberman, and maryanne kowaleski. (eds.) . medieval london. new york: fordham university. accessed april , . https://medievallondon.ace.fordham.edu.
de silva, lorraine. . “guildhall.” medieval london, esther liberman cuenca, and maryanne kowaleski (eds.). new york: fordham university. accessed april , . http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-sites/westminsterhall.
elgawly, marina. . “rosary.” medieval london, esther liberman cuenca, and maryanne kowaleski (eds.). new york: fordham university. accessed april , . http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-objects/rosary.
forlenza, suzanne. . “church bell.” medieval london, esther liberman cuenca, and maryanne kowaleski (eds.). new york: fordham university. accessed april , . http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-objects/churchbell.
gold, matthew k., and lauren k. klein. (eds.) . debates in the digital humanities . minneapolis: university of minnesota press. accessed april , . http://dhdebates.gc.cuny.edu/book/ .
gold, matthew k. (ed.) . debates in the digital humanities. minneapolis: university of minnesota press. accessed april , . http://dhdebates.gc.cuny.edu/book/ . doi: https://doi.org/ . /minnesota/ . .
grandiose. (ed.) . “file:map of london, .svg.” wikimedia commons. accessed april , . https://commons.wikimedia.org/wiki/file:map_of_london,_ .svg.
jenstad, janelle. n.d. “the agas map.” the map of early modern london. victoria: university of victoria. accessed may , . http://mapoflondon.uvic.ca/map.htm.
kelly, t. mills. . teaching history in the digital age. ann arbor: university of michigan press. accessed april , . doi: https://doi.org/ . /dh. . .
kowaleski, maryanne. . “digital pedagogy: an omeka exhibition on medieval london.” digital research@fordham. new york: fordham university. https://fordham.bepress.com/medieval_mvst /.
kucsma, jason, kevin reiss, and angela sidman. . “using omeka to build digital collections: the metro case study.” d-lib magazine; the magazine of digital library research, : – . accessed april , . http://www.dlib.org/dlib/march /kucsma/ kucsma.html.
lobel, mary d. (ed.) . “map of the city of london, c.
.” in the city of london from prehistoric times to c. , – . the british historic towns atlas, iii. oxford: oxford university press. accessed april , . http://www.historictownsatlas.org.uk/atlas/volume-iii/city-london-prehistoric-times-c -volume-iii/view-text-gazetteer-and-maps-early.
mccallum, m. conner. . “fleet river.” medieval london, esther liberman cuenca, and maryanne kowaleski (eds.). new york: fordham university. accessed april , . http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-sites/westminsterhall.
mcintosh, jane. . the practical archaeologist: how we know what we know about the past. new york: checkmark books, nd ed.
milohnić, jonathan. . “pilgrim badge.” medieval london, esther liberman cuenca, and maryanne kowaleski (eds.). new york: fordham university. accessed april , . http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-objects/pilgrimbadge.
museum of london. a. “museum of london advanced search.” museum of london. accessed april , . http://collections.museumoflondon.org.uk/online/search/#!/advanced.
museum of london. b. “museum of london archaeological archive.” museum of london. accessed april , . https://www.museumoflondon.org.uk/collections/access-and-enquiries/archaeological-archive-access.
omeka. . “working with dublin core.” omeka classic user manual. accessed may , . https://omeka.org/codex/working_with_dublin_core.
richter, ashley. . “so what is digital archaeology? ( july ).” popular archaeology (winter issue, july ). accessed january .
roy rosenzweig center for history and new media. . “omeka.” fairfax, va: george mason university. accessed april , . http://omeka.org.
schofield, john. . london – . the archaeology of a capital city. sheffield: equinox.
shepherd, william r. . historical atlas.
new york: henry holt and company. in the perry-castañeda library map collection, university of texas libraries. accessed april, , . https://legacy.lib.utexas.edu/maps/historical/history_shepherd_ .html.
shoaf, ariana. . “ewer.” the western tradition at the los angeles county museum of arts, esther liberman cuenca (ed.). accessed april , . http://mcuhistory.weebly.com/shoaf-ariana.html.
tan, samantha. . “westminster hall.” medieval london, esther liberman cuenca, and maryanne kowaleski (eds.). new york: fordham university. accessed april , . http://medievallondon.ace.fordham.edu/exhibits/show/medieval-london-sites/westminsterhall.
watrall, ethan. . “archaeology, the digital humanities, and the ‘big tent.’” debates in the digital humanities , matthew k. gold, and lauren f. klein (eds.). minneapolis: university of minnesota press. accessed april , . http://dhdebates.gc.cuny.edu/debates/text/ .

how to cite this article: cuenca, esther liberman and maryanne kowaleski. . omeka and other digital platforms for undergraduate research projects on the middle ages. digital medievalist ( ): , pp. – , doi: https://doi.org/ . /dm.
published: june
copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.
digital medievalist is a peer-reviewed open access journal published by open library of humanities.

scholarly communication to . a brindley snapshot.
colin steele, australian national university

abstract
this chapter attempts a snapshot of the dramatic changes impacting on scholarly information access and delivery in the last forty years through the prism of lynne brindley’s career. this was a period in which historical practices of information and access delivery have been dramatically overturned. in some respects, however, the models of scholarly publishing practice and economics have not changed significantly, arguably because of the dominance of multinational publishers in scholarly publishing, exemplified in the ‘big deals’ with libraries and consortia, and the scholarly conservatism imposed to date by research evaluation exercises and tenure and promotion practices.
the recent global debates on open access to publicly funded knowledge have, however, brought scholarly communication to the forefront of attention of governments and university administrations. the potential exists for scholarly research to be more widely available within new digital economic models, but only if the academic community regains ownership of the knowledge it creates. librarians can and should play a leading role in shaping ‘knowledge creation, knowledge ordering and dissemination, and knowledge interaction’.

it is a truth universally acknowledged that in the twenty-first century, we are witnessing a revolution in communication, both scholarly and social, unparalleled since the invention of the printing press in the fifteenth century. instantaneous internet communication provides unprecedented opportunities and challenges for scholarly communication, libraries, information access and delivery. lynne brindley’s career spans that significant period of change. in her miles conrad lecture for , lynne commented, ‘certainly i have been a participant in the digital library journey, perhaps for more years than i care to recollect and the challenges of libraries and information technologies, information and knowledge management have certainly shaped major parts of my career’. brindley ( a).

oxford -
i first met lynne when she was a sconul trainee at the bodleian library in - . in april , i was privileged to give the after-dinner speech at the th anniversary of the re-founding of the bodleian library. in my speech, i alluded to lynne’s time as a trainee at the bodleian and noted that coming from australia, a quote from south pacific was appropriate: ‘we get packages from home / we get movies, we get shows’, but ‘there is nothin' like a dame’. this prediction of a dameship came true in .
lynne’s election as master of pembroke college oxford from august brings, as lynne put it, ‘a degree of oxford circularity’. now seems almost a library galaxy far far away, when some of the bodleian traditions, particularly in cataloguing, still had their practices in a previous century. there was a bodleian automation activity, however, in the late s, which in ambition, if not in completion, foreshadowed lynne’s work at aston university in the second half of the ’s in campus networked automation.

automation
the bodleian embarked in the late ’s on what can be seen in retrospect as an overly optimistic ocr scanning project. this was to convert the pre- bodleian manuscript catalogue entries, in various languages and hands. the project leaders, in turn, peter brown and john jolliffe, had both come from the british museum library, a library which lynne was later to head when in its new location at st pancras. the complexities in scanning the manuscript entries proved too much for the contracted firm, which went out of business. this automation endeavour may have provided an invaluable case study in terms of communication, project planning and management for lynne when at kpmg in the early ’s.

research libraries and scholarly communication
the bodleian, with its historical tradition of ‘scholar librarians’, reflected that the work of university libraries must be inextricably linked with that of their scholarly communities. the ’s bodleian focus was essentially academic, and occasionally esoteric (scottish herringbone bindings before was the subject of one bodleian librarian’s thesis). research libraries are now, in contrast, in ‘an era of discontinuous change—a time when the cumulated assets of the past do not guarantee future success’. calhoun ( ). research libraries must reconfigure their priorities to participate in and influence emergent, network-level scholarly communication infrastructures.
the digitally connected world has seen scholarly communication evolve in new campus relationships for the library in such areas as e-scholarship, data management, copyright, on-line learning, scholarly publishing, institutional repositories and research metrics and analysis. the use of impact in the wider societal sense for scholarly output in the uk research evaluation framework also brings into play new relationships on campus between libraries and their scholarly campus community. data collection required here goes far beyond traditional citation metrics. an australian study, excellence in innovation. research impacting our nation’s future, highlighted the importance of grey literature in supporting the case studies on impact assessment. in this role, libraries play an essential collecting and repository function. group of eight australian technology network ( ). follett committee and the ‘electronic library’ lynne, as the new librarian of the london school of economics in became a member of the follett committee, established that year to review library provision in higher education. the follett report supported the need for, in general terms, the electronic delivery of documents over networks; the electronic availability of teaching materials for students; the opportunities for resource sharing and practical co- operation; and an integrated approach to information access and delivery in a complex environment. by early , the funding councils had approved the establishment of the follett implementation group on information technology (figit). this committee, chaired by lynne, developed a major three-year uk electronic library initiative. as reg carr commented later, ‘a sense of excitement, and of facilitated change, was in the air, and the level of expectation within the academic library and information community was very high’. carr ( ). the ‘big deal’ part of this initiative included ‘big deal’ contracts with multinational stm publishers. 
at the time, these national agreements, which were replicated in several countries, including australia, captured that excitement in providing a platform for delivery of increased content to libraries and then, increasingly, to the desktops of researchers. as time has elapsed, that excitement has often turned to frustration among librarians over the inflexibility and cost of ‘big deal’ contracts. although the scholarly world is globally linked, much of the scholarly information created by university researchers still remains behind the expensive firewalls of multinational publishers. the debate currently raging on open access to publicly funded research is reflected in the debate on the ‘big deal’. a study of a decade of ‘big deal’ library purchasing in american research libraries notes that ‘interest in research library subscriptions to large-publisher bundles persists for several reasons. perhaps the primary reason is that, for more than a decade, a small group of publishers account for a disproportionate amount of libraries’ materials expenditures . . . content and pricing seem to be trending toward a growing disconnect’ (strieb and blixrud, , p. , ). stuart schieber, director of the office for scholarly communication at harvard university, has outlined the difficulties that even a harvard school has had in reducing the cost of the elsevier ‘big deal’, even when they wanted to cancel a significant number of serials. ‘from the library’s point of view, you can’t win by cancelling journals, because the product is not the journal, it’s the bundle’. schieber ( ). the current financial problems of american university libraries, resulting from funding cutbacks and the decline of the american dollar, are a scholarly blessing in disguise, because they lead universities, like harvard and california, to question the nature of scholarly publishing monopolies and the nature of academic publishing practices in the twenty-first century.
andrew odzlyko in his wide ranging article ‘open access, library and publisher competition, and the evolution of general commerce’, notes that publishers have proved more adept in the control of scholarly publishing in recent decades and ‘in the process they are also marginalizing libraries, and obtaining a greater share of the resources going into scholarly communication. this is enabling a continuation of publisher profits as well as of what for decades has been called “unsustainable journal price escalation”.’ odzlyko ( ). the uk finch report one of the inhibiting factors to scholarly communication change is the inability of much of the academic community to comprehend the new digital publishing environments and an inability to resist the conservative ‘publish or perish’ frameworks, in which they are trapped by their university administrations, national research evaluation exercises and university league tables. nowhere is this better reflected than in the academic reactions to the british finch report on accessibility, sustainability, excellence: how to expand access to research publications. finch ( ). lynne brindley, in her opening address, to a november ‘implementing finch’ conference, reflected that ‘to be provocative, one might argue that there is more trust between a researcher and her publisher than between a researcher and her university acting as gatekeeper of the publication fund, particularly as it is not obvious where all the necessary funds will come from?’ brindley ( ). the finch committee report stimulated significant global commentary, largely stemming from its overall preference on favouring ‘gold’ open access article payments as the best way to implement open access to publicly funded research. 
the finch committee, and especially the rcuk (research councils uk), have been criticised for not thinking through the economic and structural issues of implementing this approach, particularly for the hass (humanities and social sciences) disciplines and for not exploring in depth other models. it has, however, had the dramatic effect of bringing to the attention of university administrations the issues, and particularly the economics, of scholarly communication and publishing. one of the alleged parameters for the finch committee was not to ‘disturb’ the existing scholarly publishing framework, although it is curious why it should be protected, when many other communication industries are being significantly disrupted and transformed in a digital global environment. the dominance of the multinational publishers has only taken place in the last forty or so years, so there is no historical precedent for contemporary protection. a long term retrospective review would have been helpful for the finch committee, not least in examining the original institutional models of the ownership of university scholarship. the british library under lynne brindley has engaged in significant prospective analyses of digital futures and ‘digital natives’. this author attempted some futurology in , when invited to give the follett lectures in that year, ‘new romances or pulp fiction? do libraries and librarians have an internet future?’. in , quite a lot of my ‘neuromancer’ type predictions have proven reasonably accurate, such as in the paragraphs: ‘the network advances have transformed our modes of communication and will result in significant changes in our structures to accommodate organised information access and storage. the world is indeed now increasingly mcluhan's global village. the origin and dissemination of knowledge can just as easily be in australia, austria or albania as america . . . 
the integration of scholarly communication processes from the creation of the article/book with the author, through to the ultimate delivery mechanism, is now requiring a new convergence and interaction of author, publisher, distributor and reader. publishers' print warehouse will be transformed, where relevant to a continuing publisher presence, into electronic delivery mechanisms with data being sent electronically directly to users or to libraries for site wide access and downloading accompanied by secure encrypted monetary transfers’. steele ( a).

university presses and the monograph
i also commented in those lectures, ‘university presses, a declining force in recent years, may well become transformed as they mutate into distributors of information from their own and other universities in electronic format, thereby making available information that was too prohibitively expensive to produce and distribute in conventional form.’ steele ( b). the finch committee did not address in any detail the question of the future of the academic monograph. this topic, at the time of writing, is the subject of an australian government expert reference group, comprising librarians, publishers and academics. it has been given the brief to explore the future of the academic monograph in a digital context and to try to establish sustainable infrastructures for publication, particularly in open access formats. the success of some of the newly established australian university presses, such as the anu e press, founded in , might provide one of the exemplars for open access monograph production. in the anu e press, embedded in the academic institutional framework of the university, had nearly , complete monograph pdf downloads. if this is compared to the average print sale of an academic monograph, estimated in several surveys to be around - copies, the comparison is significant.
in the australian model, the online version is freely available for download but print copies are available for purchase through pod (print on demand). in anu sold nearly , print copies. the anu model of making available its own institutional scholarship through its university press reverts to the model established in the late nineteenth and early twentieth centuries by a number of universities, including johns hopkins, manchester and the university of california.

conclusion
the leadership of lynne brindley at the british library since has encompassed digital developments on all content fronts with significant outreach to communities locally, nationally and globally. lynne has especially been a fervent champion of digital scholarship in the corridors of power, as represented by her membership of a number of influential committees in the uk. librarians, whom one australian vice chancellor once referred to as ‘mice who aspire to be rats’, need to be much more ‘rat-like’ as numerous external global influences impact on their operations. in this process, high level scholarly communication expertise is essential. in , the globally networked university library has to position itself to engage actively with research, learning and civic engagement. it needs to be cognisant of the rapidly changing digital and social media conditions and consequent priorities in teaching and research of online environments. access to and delivery of knowledge cannot remain in the digital st-century within the constrained frameworks imposed by the historical print and reward environments. scholars will need to be increasingly involved in an integrated scholarly research output environment which will begin with the creation of scholarship and end with its widespread distribution. the overall critical factors, in terms of access and distribution of knowledge, will be an emphasis on openness and social productivity.
cliff lynch, director of the us coalition for networked information since , has reflected, in a long interview, on the massive changes that have occurred for libraries over the last thirty years. lynch reflects that we are ‘now in an era when giant, badly behaved telecom incumbents dominate’. lynch ( ). this comment resonates with those of timothy wu in the master switch: the rise and fall of information empires, where wu notes the dominance of apple, google, amazon and microsoft, all of whose activities have dramatically impinged on libraries and information providers. wu demonstrates how new communication mediums arrive on a wave of optimism (like the ‘big deal’?), only to become dominated by monopolies who take control of the ‘master switch’. wu ( ). wu argues that the internet is in danger of following the same path as telegraphy and telephony in the nineteenth century, and film, radio and television in the twentieth. a similar argument is propounded in susan crawford’s captive audience. the telecom industry and monopoly power in the new gilded age. crawford ( ). crawford’s focus is on the control over, and dissemination of, telecommunication infrastructure in the united states. new readers in the bodleian take an oath which includes the words: ‘i hereby undertake not to . . . kindle therein any fire’. will the ever increasing tablets and mobile devices be the mechanisms to kindle the flames of wider access to publicly funded knowledge in the twenty-first century, or will multinational publishers continue their control and pricing dominance of much scholarly content within the conservatism of current scholarly reward practices? as gideon burton has written, ‘the more academia wishes to enjoy the benefits of the digital medium, the less it can hold on to restrictive and closed practices in the production, vetting, dissemination, and archiving of information’. burton ( ).
the vision of the future of scholarly communication within those frameworks could be viewed as either utopian or dystopian depending on one’s perspective. a strong voice and leadership by librarians is needed more than ever and here it is appropriate that lynne brindley should have the last word: ‘opportunities exist for real and vocal leadership in shaping this emerging space, shaping the political economy of higher education, and shaping its interactions with knowledge creation, knowledge ordering and dissemination, and knowledge interaction’. brindley ( b).

references
brindley, lynne ( a/b) ‘challenges for great libraries in the age of the digital native’, http://nfais.org/files/mc_lecture_ .pdf (visited . . ).
brindley, lynne ( ) ‘the transition to finch: the implications for the arts, humanities and social sciences’; in academy of social sciences open access publishing. presentations from the academies ‘implementing finch’ conference. london: academy of social sciences, p. . http://www.acss.org.uk/docs/professional% briefings/professional% briefings% -% jan% % - % open% access% publishing.pdf (visited . . ).
burton, gideon ( ) quoted in david lewis ( ) ‘from stacks to the web: the transformation of academic library collecting’, http://crl.acrl.org/content/early/ / / /crl- .full.pdf (visited . . ).
calhoun, karen ( ) ‘supporting digital scholarship: bibliographic control, library cooperatives and open access repositories’, http://d-scholarship.pitt.edu/ / (visited . . ).
carr, reg ( ) ‘towards the academic digital library in the uk: a national perspective’, in global issues in st century research librarianship. helsinki: nordinfo. http://www.bodley.ox.ac.uk/librarian/rpc/academicdl/academicdl.htm (visited . . ).
crawford, susan ( ) captive audience. the telecom industry and monopoly power in the new gilded age. new haven: yale university press.
finch, janet ( ) accessibility, sustainability, excellence: how to expand access to research publications. report of the working group on expanding access to published research findings. london: research information network. http://www.researchinfonet.org/wp-content/uploads/ / /finch-group-report-final-version.pdf (visited . . ).
group of eight australian technology network ( ) excellence in innovation. research impacting our nation’s future. canberra: group of eight/atn.
lynch, clifford a, elke greifeneder, michael seadle ( ) ‘interactions between libraries and technology over the past years: an interview with clifford lynch . . ’, library hi tech, ( ), p. - . also available at: http://www.cni.org/talks-interviews/interactions-between-libraries-technology-past- -years/ (visited . . ).
odzlyko, andrew ( ) ‘open access, library and publisher competition, and the evolution of general commerce’, http://arxiv.org/abs/ . (visited . . ).
schieber, stuart ( ) ‘why open access is better for scholarly societies’, http://blogs.law.harvard.edu/pamphlet/ (visited . . ).
strieb, karla l. and julia c. blixrud ( ) ‘the state of large-publisher bundles in ’ research library issues: a bimonthly report from arl, cni, and sparc, no. . http://publications.arl.org/rli / (visited . . ).
steele, colin ( ) ‘new romances or pulp fiction? do libraries and librarians have an internet future?’, http://www.ukoln.ac.uk/services/papers/follett/steele/paper.html (visited . . ).
wu, timothy ( ) the master switch: the rise and fall of information empires, london: atlantic.

digitally reconstructing the great parchment book: d recovery of fire-damaged historical documents
kazim pal, ucl department of computer science, university college london, uk
nicola avery, london metropolitan archives, uk
pete boston, headscape ltd, uk
alberto campagnolo, ligatus research centre, university of the arts, uk
caroline de stefani, london metropolitan archives, uk
helen matheson-pollock, university college london, uk
daniele panozzo, courant institute of mathematical sciences, new york university, usa
matthew payne, muniment room, westminster abbey, uk
christian schüller, department of computer science, eth zurich, switzerland
chris sanderson, headscape ltd, uk
chris scott, headscape ltd, uk
philippa smith, london metropolitan archives, uk

correspondence: melissa terras, ucl department of information studies, foster court, university college london, gower street, london, wc e bt. e-mail: m.terras@ucl.ac.uk

digital scholarship in the humanities © the author . published by oxford university press on behalf of eadh. this is an open access article distributed under the terms of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
rachael smither, london metropolitan archives, uk
olga sorkine-hornung, department of computer science, eth zurich, switzerland
ann stewart, national museums liverpool, uk
emma stewart, london metropolitan archives, uk
patricia stewart, oxford english dictionary, oxford university press, uk
melissa terras, ucl department of information studies, university college london, uk; ucl centre for digital humanities, university college london, uk
bernadette walsh, museum and visitor service, derry city and strabane district council, northern ireland
laurence ward, london metropolitan archives, uk
liz yamada, freelance paper conservator
tim weyrich, ucl department of computer science, university college london, uk; ucl centre for digital humanities, university college london, uk

abstract
the great parchment book of the honourable the irish society is a major surviving historical record of the estates of the county of londonderry (in modern day northern ireland). it contains key data about landholding and population in the irish province of ulster and the city of londonderry and its environs in the mid- th century, at a time of social, religious, and political upheaval. compiled in , it was severely damaged in a fire in , and due to the fragile state of the parchment, its contents have been mostly inaccessible since. we describe here a long-term, interdisciplinary, international partnership involving conservators, archivists, computer scientists, and digital humanists that developed a low-cost pipeline for conserving, digitizing, d-reconstructing, and virtually flattening the fire-damaged, buckled parchment, enabling new readings and understanding of the text to be created.
for the first time, this article presents a complete overview of the project, detailing the conservation, digital acquisition, and digital reconstruction methods used, resulting in a new transcription and digital edition of the text in time for the th anniversary celebrations of the building of londonderry’s city walls in . we concentrate on the digital reconstruction pipeline that will be of interest to custodians of similarly fire-damaged historical parchment, whilst highlighting how working together on this project has produced an online resource that has focussed community reflection upon an important, but previously inaccessible, historical text.

introduction
we present here a major, groundbreaking partnership that physically conserved and digitally reconstructed a severely damaged parchment document of significant importance as a source for the city of london’s role in the protestant colonization and administration of the irish province of ulster (now in northern ireland). badly damaged in a fire, the great parchment book was stored under restricted access, unavailable for handling or reading by historians. established conservation methods for flattening were unsuitable due to the parchment’s fragility. standard two-dimensional digitization techniques would not capture the physicality of the text adequately, and existing three-dimensional ( d) scanning methods for documentary material were unsuitable for this particular application. this paper presents a lightweight pipeline for the digital acquisition and generation of a d reconstruction of a severely fire-damaged document that exhibits complex geometry. the reconstruction allows a new, d surrogate of the document to be navigated, and a globally flattened image of the text to be created, enhancing its legibility.
we also introduce a novel quality metric, relating our work to standard best practice in digitization for archival materials, to gauge the effective digitization resolution of our reconstruction approach. in addition, our system addresses the issue of provenance in 3d reconstruction: aiming to document and make transparent the capture and processing elements so that historians can trust whether a feature is present in the original text or whether it is an artefact of the reconstruction pipeline. this digital reconstruction, combined with transcription and online publishing of the resulting digital surrogates, has made the contents of the great parchment book of the honourable the irish society available for the first time in over years. our method is particularly relevant to libraries and archives holding similarly damaged historical documents that cannot be fully restored by conventional physical or digital means. such documents, which are all too common, generally have restricted access due to their fragile nature, giving opportunities for digital representations to allow people to read contents remotely, without physical handling of the originals. additionally, digital representations can allow for virtual reconstruction, including removal of geometric distortions, and corrections of discolouration. we demonstrate how our reconstruction work has contributed to the great parchment book project overall, assisting in the creation of a digital edition of the text in time for the major local and national commemoration of the th anniversary of the plantation of ulster. our project demonstrates how advanced computational methods, when developed to encompass the needs of conservators, archivists, and palaeographers, can be used to enhance our understanding of historic texts, and create a resource to benefit a wide community.

the great parchment book

the great parchment book of the honourable the irish society (hereafter ‘the irish society’) is a major survey of all the estates in the county of derry managed by the city of london through the irish society and the london livery companies in the early th century. the great parchment book was compiled in by a commission instituted by charles i to survey the lands which would now fall under his control. given the relative paucity of archival records for early modern ireland, the manuscript contains key data about landholding and population in a part of ulster at this time, as well as information about the province’s relationship with london. however, the manuscript was severely damaged in a fire in and only limited information about its contents has been available since. in this section, we detail the history of the great parchment book, explain its importance, and describe its current physical state, which made the contents of the manuscript all but inaccessible before the undertaking of our project.

the honourable the irish society

the histories of the city of london, the irish society, and the great parchment book are very much intertwined (bond and vandercom, ). the nine years’ war ( – ), where the forces of gaelic irish chieftains fought against english rule in ireland, ended in gaelic defeat and exile for their leaders, leaving northwest ulster open to colonization (canny, ; curley, ). from , king james vi of scotland and i of england and ireland ( – ) had a policy of settling ulster with english and scottish protestants, known as the ‘plantation’, with the city of london corporation and the london livery companies being reluctantly compelled to administer the plantation of the county of derry (curl, ).
originally established by the city of london’s court of common council on january , the irish society was formally incorporated by the royal charter on march (bond and vandercom, , p. xii) which also gave to the society grants of lands and privileges. that year the irish society was placed in direct control of the city of derry, which was renamed londonderry, and the newly constituted county and estates surrounding it, with the county of coleraine planted and converted into londonderry, with parts of donegal, antrim, and tyrone also part of the wider plantation (moody, ; curl, ). although much has changed since (including the loss of all its estates), the irish society continues to operate to this day, and has always had its administrative centre (the irish chamber) at or near guildhall at the heart of the square mile in london, maintaining ‘a tradition of care of its administrative records’ (the national archives n. d. (a)).

overview of the great parchment book

the great parchment book is the major surviving early record of the irish society relating to irish estate management. the great parchment book was compiled in after charles i ( – ) claimed as forfeit the estates constituting the entire county of londonderry (which london was administering through the irish society and the livery companies) after a politically motivated case in the star chamber ruled that the londoners had not fulfilled their obligations of plantation.
charles i then commissioned a survey intended to gather, in one single volume, full details of all the contracts and rented lands in londonderry and coleraine that he had successfully claimed:

this commission authorised sir ralph whitfield, serjeant-at-law, thomas fotherley, gentleman of the king’s privy chamber, bramhall, bishop of derry and sir william parsons, surveyor general of ireland to collect and receive all sums due to the king in londonderry, to seize on the king’s behalf all castles, manors, lands and tenements lately belonging to the londoners, and to conclude, on terms most profitable to the king, new contracts for leases of estates of inheritance with the existing tenants and others. whitfield and fotherley were the active members of the commission, arriving in ireland in april, and finishing about the beginning of october . attended by a clerk and two assistants, they toured the estates, and copies of all the contracts were entered in a ‘great parchment booke’, thereafter presented to the king. (the national archives, n. d. (b))

bond and vandercom ( ) describe the book thus: ‘the various grants and agreements were engrossed on vellum, and signed by the respective parties, and were preserved, (bound up,) amongst the records of the irish society’ (p. ). after the survey, the book made its way back to london to the irish chamber at guildhall. the great parchment book then represents an important information source for the city of london’s role in the colonization and administration of ulster, containing key data about landholding and population in th-century londonderry: it is commonly described as the domesday book of ulster, or derry (worshipful company of vintners, ; santry, ; reisz, ).

physical condition: barriers to interpretation

in february a fire broke out amongst building works on the north side of guildhall which ‘notwithstanding speedy assistance, burnt so furiously for some time, that the whole of that office was destroyed, together with all the books of accounts, several bonds, and considerable sum in bank-notes’ (lambert, , p. ). a great many of the irish society’s archives were incinerated or (like the great parchment book) badly damaged: very few th-century records of the irish society remain as a result (aldous, ). after the fire, and in spite of its parlous state, the surviving leaves of the great parchment book were carefully preserved because of its importance to the irish society and significance to the history of ulster:

the fragments [. . .] which remain, are valuable and interesting, as they elucidate the titles of the twelve chief companies to their manorial town lands, which are described by name. and they set forth the estimated quantity of land contained in the respective denominations of town lands, granted or demisted by the commissioners. (bond and vandercom, , p. )

in , however, t. w. moody described the book as ‘a mass of scorched and dirty fragments, most of them mere crusts. the details of whitfield’s and fotherley’s work are completely irrecoverable’ (see fig. ) (p. ). the extant manuscript consists of separate parchment membranes, all damaged in the fire. it is unknown how many folios were in the original book, and the initial ordering of the sheets is also unknown: the grouping by livery company presented prior to this project is conjectural. the remaining sheets defy reading under normal conditions. parchment consists of an irregular structure of organic fibres that are sensitive to their environment and can shrink, swell, and buckle if exposed to heat or humidity, creating dramatic and irregular geometric distortions, precisely what has happened in this instance (chahine, ; giurginca et al., ).
this uneven shrinkage, warping, intorsion, and distortion has rendered much of the text of the great parchment book illegible, although it is usually still visible. the physical state of the text (at the corporation of london records office it became affectionately known as the ‘poppadom’ book) combined with its fragility and distorted writing meant that, although it remained part of the city of london’s collections held at london metropolitan archives (lma), the contents of the great parchment book had been, prior to this project, unavailable to interested parties for over years (curl, ) (see fig. ). the motivation for revisiting physical conservation and digital restoration methods came with the approaching th anniversary of the building of the londonderry city walls in , and a planned programme of public engagement and commemoration. the ultimate aim was to make the document available as a central point in an exhibition in derry guildhall in during derry~londonderry’s year as european city of culture.

fig. the great parchment book in its archival storage prior to conservation and re-packaging as part of this project (lma reference: cla/ /em/ / ). image used with permission, the irish society and lma, the city of london corporation

methodology

we knew from the project’s outset that this was an undertaking without a certain result as we were committed to exploring new techniques and technologies, both in physical conservation and digital imaging. each element of the project was a major piece of work in its own right and different funders were approached to support them. a full conservation assessment was first undertaken, to define a measure of the damage to each folio and to establish methods that could make subsequent digital acquisition and processing more effective.
a partnership between lma, ucl department of computer science, and ucl centre for digital humanities established a -year engineering doctorate (engd) in the ucl virtual environments, imaging and visualization programme beginning in september (jointly funded by the engineering and physical sciences research council and lma) with the intention of designing a digital reconstruction workflow and software to capture, process, and display the writing on the parchment. the aim was to make the text legible, by digitally undoing part of the damage and deformation without having to resort to checking the newly conserved book, and ideally to reconstitute the manuscript digitally. the research component of the engd began in september . conservation of the parchment sheets was carried out by september , and transcription of the manuscript was undertaken alongside conservation and digital restoration until spring , with the launch of the website scheduled for the end of may to coincide with the opening of the derry guildhall exhibition (although the website could be updated beyond this). further interactive geometry work was done in collaboration with colleagues in the interactive geometry lab within the institute of visual computing at eth zurich throughout . this was an aggressive schedule which required conservators, archivists, historians, digital humanists, and computer scientists working in tandem on physically conserving, digitally reconstructing, and reading the great parchment book. we provide here a brief overview of the physical conservation methods used, before detailing the digital acquisition and reconstruction approach pursued.

fig. an example of heavy gelatinization, shrinkage, and distortion of the parchment sheet of the great parchment book (lma reference: cla/ /em/ / ). image used with permission, the irish society and lma, the city of london corporation

physical conservation

the treatment of such a degraded and fragile manuscript was challenging: traditional conservation alone would not produce sufficient results to make the manuscript accessible or suitable for exhibition. the parchment itself was too shrivelled to be returned to a readable state, although there had clearly been at least one attempt in the past to do so. no documentation regarding this has been found. a detailed condition assessment was carried out to establish the possible risks to the document’s integrity during storage and handling. the types of damage were identified and a condition rating system devised to establish the overall extent of the damage, which included: planar distortion of the surface caused by fire; deep and stiff creases from heavy gelatinization and denaturation of the parchment; the presence of large and small tears on the edges as a result of shrinking; the presence of a thick layer of calcite on the surface caused by strong dehydration of the skin during the fire; damage to the inks which after testing were revealed to be metallogallic (in some areas the ink had flaked off and in other parts it was completely lost, leaving a lighter colour on the parchment); and severe surface dirt present on many of the sheets. the results confirmed that the parchment sheets were too damaged to be handled safely even after extensive conservation treatment. forced flattening of the entire sheet would have facilitated the digitization process, but damaged the parchment irreversibly. much of the text is visible but distorted. it was decided to humidify the parchment sheets as far as was appropriate to their fragile state to try to release only the creases that were obscuring the text. the ultimate aim was to gain legibility and enable the best access during the digitization phase of the project. the practical conservation of the membranes was the essential first step.
different treatment options were tested on samples to determine the best way to humidify and dry the parchment under tension (see woods, for an overview of this technique, also clarkson, ; singer, ; hassel, ). conservation work on the membranes included cleaning, humidification, and tension drying, using magnets placed on top of the parchment above a metal sheet to hold creases open during the drying process (see smither n. d. for an overview) (see fig. ). the aim was to introduce as little moisture as possible to the parchment and tension it locally taking into account the constraints of daily working hours and the need to make the treated sheets available for digitization within the agreed time-frame. this approach opened out areas of the parchment so that as much of the text as possible could be accessed during the digital acquisition process. dry cleaning of the sheets was performed by means of a soft brush and a chemical sponge only where the inks were not flaking off. repairs were carried out only on areas where handling during digitization would have compromised the integrity of the sheets (avery et al., ; de stefani, , ). the digitization started as soon as the treatments on each sheet were completed. the treated sheets needed to be rehoused because their existing packaging was not of conservation standard and was damaging the sheets further. new packaging was provided for safe and long-term storage by means of an archival box and tyvek sheets to interleave the parchment sheets (london metropolitan archives, ).

fig. conservation treatment. after controlled humidification, the sheet is dried under tension by means of magnets and polyester wadding. any tears are held in place during the drying process to stop them increasing. smither (n. d.) contains further discussion and images of this stage of the work

existing digital approaches to flattening manuscripts

before undertaking any digital acquisition and reconstruction work ourselves, it was first necessary to establish if there were any existing approaches that could be potentially of use, surveying previous work in advanced digitization of historical documents that have complex geometry. when digitizing flat documents, a single top-down image is generally viewed as sufficient to meet scholarly requirements. in the case of our document, a single image would be insufficient to produce a high-quality digital surrogate since folds would occlude regions of the folio, and some raised areas would be imaged with foreshortening effects. for these reasons, each folio needed to be three-dimensionally reconstructed to produce a digitization of sufficient quality. we now provide an overview of prior work in attempting to virtually flatten documents, including an explanation of why existing approaches could not be adopted for our task. previous work addressing the problem of computationally flattening documents typically deals with single images that exhibit small geometric distortions and a specific type of deformation, such as rectifying printed text captured by a flat-bed scanner or a camera, with the aim of reconstructing the shape of modern printed documents to improve the performance of optical character recognition (ocr) algorithms. this approach assumes it is possible to rectify the deformation of a document by observation: extracting features that are used to estimate the underlying changes in the text. for example, wada et al. ( ) and zhang et al. ( ) assume the physical deformations to be caused by the spine of a book, and both propose methods to reconstruct the surface of bound documents captured with a flatbed scanner by using the shading cues in the image.
wu and agam ( ) present a simple method for rectifying a warped image based on tracing lines of text and then using these to generate a deformation mesh. these line tracing methods work well on documents with clear text and good image contrast. schneider et al. ( ) and tian and narasimham ( s) reconstruct the shape of smoothly folded pages by detecting horizontal line directions and vertical strokes in the text, de-warping images of textual documents by estimating the 3d surface of the document. however, this relies on the text being printed on a light background in a regular font so that individual letters and strokes can be detected. the assumptions made by these approaches (that text is printed on a clean white page, with regular font and line spacing, and that the page has not undergone any non-isometric deformation such as warping or buckling) mean that such methods are not robust when applied to our damaged historical parchment since we cannot make such strong assumptions about shading and textual cues from our twisted, handwritten text. brown and seales ( ), sun et al. ( ), brown et al. ( ), and bianco et al. ( ) approach the problem of virtually flattening arbitrarily warped documents, with fewer initial assumptions made about their shape or content. they acquire a 3d reconstruction of the document using a structured-light scanner, where the document surface is illuminated with a known pattern and the distortion of the pattern as seen from a camera is used to compute the 3d shape. they then flatten the resulting triangle mesh using a mass-spring simulation. the mesh is allowed to fall into a planar configuration under a gravity force while spring forces maintain its structure. we observe that this mimics the physical conservation approach of softening the parchments and then stretching them out. brown et al.
( ) also add a photometric correction step to remove baked-in shading from the reconstruction’s texture. this type of approach produces impressive results considering its simplicity. however, the capture process suffers from problems with self-occluding objects since both the camera and projector must be able to see a point on the object to reconstruct it: if there are any regions of the page which cannot be seen by both the camera and the projector, they will not be included. complex objects would require a number of separate scans to be performed and the resulting meshes registered with each other, which is not trivial. the mass-spring approach could also cause fold-overs when applied to documents with high levels of physical distortion, introducing unwanted overlaps. finally, the isometry assumption built into the mass-spring system (i.e. that the document deforms uniformly in all directions) is not appropriate for completely arbitrarily deformed manuscripts which contain regions that have both shrunk and expanded, as is the case with denatured parchment, which is heavily deformed with pronounced folds and creases. ulges et al. ( ), lampert et al. ( ), and koo et al. ( ) capture stereo images of a document from which a 3d surface can be reconstructed. these methods are demonstrated on open books, and assume there is no self-occlusion in the pages. global conformal mapping is used in brown and pisula ( ) to unfold a document; however, this approach still makes strong assumptions about the type of deformations present. samko et al. ( , ) present a method for scanning and virtually unrolling scrolled historical documents written on parchment. the scrolls are scanned using x-ray tomography to produce 3d volumetric data. these data are treated as a set of volumetric slices.
however, this is also unsuitable for our problem because it assumes that the deformation to the document is equal throughout, whereas parchment (as previously discussed) is likely to buckle and twist in non-isometric patterns. in addition to the virtual unfolding of documents, a number of techniques have been developed to correct aspects of historical document degradation other than geometric distortion, namely, fading of the text as the ink degrades, and bleed-through of the text as the writing support deteriorates. these include multispectral imaging (easton et al., ; giacometti et al., ; kim et al., ; klein et al., ). however, preliminary experiments with this approach showed that there would be no improvement in visible text using multispectral imaging with the great parchment book (macdonald, n. d.). ink bleed removal (correcting images from ink seeping through from the other side of the folio; huang et al., ; hanasusanto et al., ; chan and vese, ) is also not a primary concern for us; likewise, the use of intelligent interpretation support systems that propagate likely readings of the text is beyond this project’s scope (terras, ; roued-olsen et al., ). we therefore came to the conclusion that existing methods previously proposed for complex document acquisition suffered from occlusions (narrow areas which could not be captured due to the physicality of the document) or required complex manual alignment of partial scans. developing our own pipeline for digitization and processing of documentary material was therefore necessary and would prove more efficient than adopting others’ systems. however, there are existing, wider approaches in cultural heritage digitization and in computational 3d graphics which have proved useful, allowing us to build on their techniques and tools, and which have informed and enhanced our work.
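
The isometry assumption criticised in this section can be made concrete with a small check. The sketch below is illustrative only and is not part of the project's software: the helper name `edge_stretch`, the three-vertex strip, and both candidate layouts are assumptions chosen for the example. It compares the length of each edge on a 3d surface with the length of the same edge in a flat layout; a ratio of 1 means the edge length was preserved, while a plain top-down view of a raised fold shortens it (the foreshortening effect mentioned earlier).

```python
import math

def edge_stretch(vertices_3d, vertices_2d, edges):
    """Return flattened length / 3d length for every edge (hypothetical helper)."""
    ratios = []
    for i, j in edges:
        length_3d = math.dist(vertices_3d[i], vertices_3d[j])
        length_2d = math.dist(vertices_2d[i], vertices_2d[j])
        ratios.append(length_2d / length_3d)
    return ratios

# A tent-shaped strip: the middle vertex is raised one unit above the table.
strip_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 0.0)]
strip_edges = [(0, 1), (1, 2)]

# Top-down view: simply drop the height coordinate (foreshortened).
top_down = [(x, y) for x, y, _ in strip_3d]

# Isometric unfolding: lay the two edges end to end at their true length.
edge_len = math.sqrt(2.0)
unfolded = [(0.0, 0.0), (edge_len, 0.0), (2.0 * edge_len, 0.0)]

top_down_ratios = edge_stretch(strip_3d, top_down, strip_edges)
unfolded_ratios = edge_stretch(strip_3d, unfolded, strip_edges)
```

An isometric unfolding keeps every ratio at 1, so it reproduces, rather than corrects, any local shrinkage or expansion already present in the surface, which is the limitation noted above for denatured parchment.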

adopting related 3d reconstruction approaches

digitally flattening a document requires that we first capture a digital surrogate, to which flattening and restoration algorithms can be applied. although existing flattening methods for documents did not hold solutions for us, there is much previous work dealing with the general topic of 3d reconstruction of objects, and many different established pipelines for undertaking such work. contact approaches (which acquire the shape of an object by probing it with sensors) are obviously unsuitable for use on fragile historical artefacts. non-contact methods exist which can be categorized into active methods (emitting some form of light or non-visible radiation, then detecting how the light interacts with the object to recover its shape, although conservators can have concerns about the use of lasers and such like in proximity to delicate objects) and passive methods (which analyse reflected ambient radiation). both of these approaches have been much used in the digitization of a range of cultural and heritage objects. the types of 3d representations produced include point clouds (snavely et al., ; wu, ; furukawa and ponce, ), volumetric models (kutulakos and seitz, ; furukawa and ponce, ), and triangle meshes (vergauwen and gool, ; autodesk, ). we therefore had a range of approaches to choose from, and our initial phases of development involved investigating different existing approaches to reconstruction and their suitability for our problem. we initially discussed the construction of a bespoke laser scanner, but during that phase determined that the parchment exhibits sufficient visual texture, given its non-uniformity, to allow structure-from-motion algorithms to perform well.
we therefore turned our attention to multi-view stereo, a technique which is now commonly used in computer graphics ‘to reconstruct a complete 3d image of a model from a collection of images taken from known camera-viewpoints’ (seitz et al., , p. ). a range of high-quality multi-view algorithms have been developed with various computational approaches being used, which can reach ‘remarkable’ accuracy to generate complete 3d object models, if there are enough 2d images available on which they can be based (ibid., p. ). multi-view stereo has been extensively used to record archaeological sites, architectural ruins, and museum objects, usually at larger scale (for example, see pollefeys et al., ; remondino, ; wenzel et al., ) and is now a standard approach in computer vision (see hartley and zisserman, for an overview). the multi-view stereo approach is very well suited for practical, manual acquisition of our deformed parchment, as it allows a user to freely choose viewpoints to reach all parts of the wrinkled surface, capturing a series of 2d digital images that we can then use to generate a 3d model. previous approaches such as top-down cameras or structured-light scanners would not be able to cope with the self-occlusions in the pages and would produce incomplete reconstructions. using a hand-held camera allows us to adapt the acquisition process to the highly varying shapes of the parchment, and guarantees full coverage of the parchment surface. the fact that we only use commonly available hardware makes this approach more accessible: archives and museums are also unlikely to have access to specialized scanning equipment or the expertise required to use it. in addition, there are already existing computational algorithms for multi-view stereo that we can adopt and adapt to fit our task.
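
The geometric idea underlying reconstruction from known camera viewpoints is triangulation: a surface point seen in two images lies close to where the two viewing rays meet. The following is a minimal sketch of midpoint triangulation under assumed inputs; it is not the method implemented in any of the software cited here, and the function name `triangulate_midpoint`, the camera centres, and the ray directions are all hypothetical values chosen for illustration.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _along(origin, direction, t):
    return tuple(o + t * d for o, d in zip(origin, direction))

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + t*d1 and o2 + s*d2."""
    w = _sub(o1, o2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are parallel; no unique closest point")
    t = (b * e - c * d) / denom   # parameter along the first ray
    s = (a * e - b * d) / denom   # parameter along the second ray
    p1 = _along(o1, d1, t)
    p2 = _along(o2, d2, s)
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))

# Two assumed camera centres either side of the folio, both viewing one point.
camera_left = (-1.0, 0.0, 0.0)
camera_right = (1.0, 0.0, 0.0)
ray_left = (1.0, 0.0, 5.0)     # direction from the left camera to the point
ray_right = (-1.0, 0.0, 5.0)   # direction from the right camera to the point

point = triangulate_midpoint(camera_left, ray_left, camera_right, ray_right)
```

With noisy real images the two rays rarely intersect exactly, which is why the midpoint of their closest approach is used here; a point hidden from one of the two views by a fold cannot be triangulated from that pair, so freely chosen hand-held viewpoints help to cover occluded regions.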
there are now free end-to-end web services which compute textured 3d models from an uncalibrated set of 2d images: both arc 3d (vergauwen and gool, ; tingdahl and van gool, ) and autodesk 123d catch ( ) are popular in cultural heritage digitization and other services are also available. however, these are ‘black-box’ approaches where it is not clear how the model was generated from the input 2d images (nguyen et al., ), and it is imperative that we understand how data move through our pipeline to generate surrogate models that we can trust (terras, ). an alternative multi-view stereo approach that is fully documented, open-source, supported by multiple operating systems, and allows users to trace each individual pixel through the pipeline was available in wu’s visualsfm ( ) software which uses wu et al.’s graphics processing unit (gpu) implementation of scale-invariant feature transform (lowe, ; wu, ), and their multi-core bundle adjustment algorithm (wu et al., ) to generate a sparse 3d reconstruction using structure from motion. this is a widely used dense multi-view stereo reconstruction workflow, performing very well on even highly unstructured image sets containing variants in lighting, image exposure, and lens type (remondino et al., ). visualsfm thus provided us with an approach and flexible tools upon which to build our digital reconstruction pipeline.

the great parchment book digital reconstruction pipeline

we first capture a set of high-resolution images and then perform two pre-processing steps: reconstruction, and computation of texture maps. we developed a viewer that allows a user to navigate the surface of this model, and to generate flattened representations of specific, local areas of the document, to aid interpretation.
additional, advanced mesh parameterization was then developed to compute a map that flattens the 3d surface into the 2d plane while introducing as little distortion as possible to produce images of the whole document, virtually recovered to an extent not possible with physical restoration methods. we detail here the capture, reconstruction, and texture-map computation phases. we also discuss how we can assess the quality of our reconstructions by relating them to established digitization standards in the cultural and heritage sector. our digital models can be viewed and shared, allowing the contents of the book to be accessed more easily and without further handling of the original document. we describe both our interactive viewer and our global flattening approach, presenting results of our pipeline at every stage.

reconstruction

capture

first, it should be noted that throughout the capture phase, it was imperative that the great parchment book did not come to any harm. the studio setting at lma was discussed with conservators, and a volunteer, a qualified conservator, assisted with handling the document. we evaluated the use of appropriate lights and supports for the parchment. as noted above, digitization happened soon after conservation, with the folios returning to store in improved housing: the conservation and digitization elements of the project are closely intertwined. the first step of the acquisition process was to capture a set of overlapping 2d images that cover the entirety of the parchment (see fig. ). the folios were imaged using a hand-held digital single-lens reflex (dslr) camera (canon d mark iii). each parchment was placed on a table covered with a black velvet cloth (to provide a matt background) surrounded by three large, evenly spaced diffuse lights to provide uniform illumination and minimize the amount of shade cast on the parchment due to self-shadowing.
a ‘colorchecker color rendition chart’ was placed on the table next to the parchment, providing a measure of scale and colour calibration. for each folio, we first took a set of images (typically between eight and ten) in a circular formation so that the entire parchment is visible in each image. we then took many more close-up images, making sure to cover the entire surface of the folio thoroughly. for highly distorted areas of the parchment where the text had shrunk to a very small size, we used a macro lens to obtain extreme close-up images. although algorithms exist for automatically selecting optimal camera viewpoints (ahmadabadian et al., ), the selection of camera views in our reconstruction workflow is dependent on the judgment and experience of the human operator (see figs and ). in total, there were folio sides to be captured (this is less than double the total number of folios, , because some folios have a blank side), with of these requiring macro capture. we captured , images. typically, an image set for an individual folio will contain between fifty and sixty mp images, but can sometimes be as large as eighty or more images for extremely deformed folios, or as low as twenty or thirty for relatively flat ones.

fig. kazim pal undertaking the digitization process, which required much movement around each folio. you can also see here a set of small ceramic balls placed around the folio which we originally had planned to use for calibration: these were not required as our work progressed but they remain in the capture shots. video (which is available in the online version of this paper) shows a time-lapse of a typical capture sequence, indicating the range of angles required to gain a complete coverage of each folio side, and indicating that the digitization process was non-contact
the most images captured per folio was eighty-nine, and the fewest was eleven, but there was an average of forty-nine images per folio: this indicates the variation in their shape. it took – min to capture each folio side, meaning twelve folios could be captured in day. the entire capture phase took days of work, although these were not consecutive but were dependent both on access to facilities at lma, and fitting in with the conservation procedures.

reconstruction

as described above, we process the image sets with wu’s visualsfm (wu, ) software to generate a sparse 3d reconstruction using structure from motion. we then apply furukawa and ponce’s patch-based multi view stereo (pmvs) algorithm (furukawa and ponce, ) to generate a dense point reconstruction, examples of which are shown in fig. (top). this process also computes, along with the point reconstruction, calibration parameters for each input image (such as focal length and camera rotation), which allow us to determine the camera viewing direction of each input image.

fig. a comprehensive image set for a single parchment folio, showing a range of overview, detail, and macro shots to capture the detail of the surface from various angles

the reconstruction is computed up to an arbitrary scale, so the distances in the resulting object space do not correspond to the true distances in real-life space. to correct for this, we allow a user to mark points on the colorchecker which are a known distance apart, and we then triangulate their positions in object space to compute a scaling factor to allow distances in the model to match those in real-life space. as can be seen in fig. (top), the point clouds contain holes in certain areas. a meshing process smoothly interpolates a surface over holes.
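The scale correction described above reduces to a ratio between a known physical distance and the distance separating the two marked points once they have been triangulated in object space. The following is a minimal sketch under that reading; the function names, the example coordinates, and the 50 mm separation are illustrative assumptions, not values or code from the project.

```python
import math


def scale_factor(point_a, point_b, known_distance_mm):
    """Factor converting object-space units to millimetres.

    point_a and point_b are the triangulated object-space positions of two
    marks on the calibration chart (hypothetical inputs).
    """
    model_distance = math.dist(point_a, point_b)
    if model_distance == 0.0:
        raise ValueError("the two marked points coincide")
    return known_distance_mm / model_distance


def rescale(points, factor):
    """Apply a uniform scale to every point of a reconstruction."""
    return [tuple(coord * factor for coord in p) for p in points]


# Hypothetical example: two marks that are 50 mm apart on the chart
# but 0.5 units apart in the arbitrarily scaled reconstruction.
factor = scale_factor((0.0, 0.0, 0.0), (0.3, 0.4, 0.0), 50.0)
scaled_cloud = rescale([(1.0, 2.0, 3.0)], factor)
```

Because the reconstruction is only ambiguous up to a single global scale, one uniform factor is enough; averaging the factor over several marked pairs would be a natural way to reduce the effect of marking error.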
the next step in our pipeline is therefore to compute a triangle mesh from the dense point cloud, for which we use kazhdan et al.’s ( ) poisson surface reconstruction algorithm. this algorithm requires very little parameter tuning (we use the exact same parameters for every reconstruction), is resilient to noisy data, and is designed to interpolate holes in the point reconstruction. the algorithm makes use of the normal vectors associated with each point that are generated as part of the pmvs output, making it a natural choice to follow pmvs. we use the poisson surface reconstruction implementation provided in meshlab (cignoni et al., ). examples of our reconstructed meshes are shown in fig. (middle).

fig. screenshot of a visualization of the camera positions used to image a single folio side, indicating the range of movement required around the folio to gain enough coverage of the surface to build a robust model

computation of texture maps

the final part of our reconstruction pipeline is to generate texture maps for the triangle meshes. we use the same texture-atlas generation method as esteban and schmitt ( ), originally proposed by schmitt and yemez ( ), since it is simple to implement and avoids having to compute a texture parameterization for the mesh. this step selects the appropriate texture images to place over the mesh model (fig. , bottom). the resulting 3d model contains – mb of data per folio.

fig. top: dense point clouds generated by visualsfm and pmvs. middle: triangle meshes generated by applying poisson surface reconstruction to the point clouds. bottom: textured meshes generated by back-projecting and blending the original images

assessing reconstruction quality

professional archival standards for document digitization describe minimum resolution for raster images (federal agencies digitization initiative, ). in the case of planar 2d artefacts that are imaged with flatbed scanners or in a fronto-parallel camera image, this minimum resolution is usually expressed in dots per inch (dpi): a measure of the sampling frequency of the document being imaged which gives the number of samples (i.e. image pixels) that are taken in the space of one linear inch on the surface of the document. in the case of the great parchment book, the folios should ideally be imaged at dpi, and at a minimum of dpi (fadi, ). however, our generated models are not 2d images. how can we effectively assess their quality in relation to archival standards?

measuring dpi is simple when digitizing a single flat object from a front-facing viewpoint. in our case, however, with a 3d reconstruction texture generated by blending many different images from different viewing distances and viewing angles, the effective sampling density of the reconstructed parchment varies across the surrogate surface, dependent on the acquisition conditions of the images contributing to each surface point. therefore, assigning a single dpi quality label would not sufficiently characterize the data set. also, since we cannot guarantee that every point on the manuscript surface is imaged from a fronto-parallel viewpoint, there will inevitably be a degree of anisotropy (or stretch) in the sampling. instead we want to assess the ‘effective’ dpi of our model: that is, a measure of the frequency at which details on the parchment surface are sampled by the acquisition and reconstruction process. we assessed this by examining a distribution (or histogram) of effective dpi (see fig. ; see pal, , p. – , for details of how this was computed). the histogram of effective dpi shows that the majority of the mesh vertices are sampled at over dpi. it also shows that the distribution is bimodal, with a small second cluster of vertices sampled at around dpi.
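To make the idea of a per-point sampling density concrete, the sketch below estimates an effective dpi for one surface point seen by one camera, using a generic pinhole-camera approximation, and then bins many such values into a histogram. This is not the computation used for the models discussed here (that is described in the cited thesis); the formula, function names, bin width, and example camera values are all assumptions chosen for illustration.

```python
import math
from collections import Counter

MM_PER_INCH = 25.4


def effective_dpi(focal_mm, pixel_pitch_mm, distance_mm, view_angle_deg=0.0):
    """Approximate samples per inch on the surface for a pinhole camera.

    A pixel covers (pixel_pitch * distance / focal) mm on a fronto-parallel
    surface. Viewing the surface obliquely stretches that footprint by
    1 / cos(angle) along the tilt direction, so the density in that
    (worst-case) direction is multiplied by cos(angle).
    """
    pixels_per_mm = focal_mm / (distance_mm * pixel_pitch_mm)
    return pixels_per_mm * MM_PER_INCH * math.cos(math.radians(view_angle_deg))


def dpi_histogram(values, bin_width=100):
    """Count values per bin; keys are the lower edge of each bin."""
    return Counter(int(v // bin_width) * bin_width for v in values)


# Hypothetical camera: 100 mm lens, 6.25 micron pixels, 400 mm from the folio.
head_on = effective_dpi(100.0, 0.00625, 400.0)
oblique = effective_dpi(100.0, 0.00625, 400.0, view_angle_deg=60.0)
histogram = dpi_histogram([head_on, oblique])
```

A histogram built this way makes a bimodal distribution easy to spot, since thoroughly imaged regions and sparsely imaged regions fall into clearly separated bins.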
we can analyse exactly where on the manuscript surface these variations in sampling occur: low-dpi vertices are mostly on the edges of the folio, which were most likely imaged less thoroughly due to the absence of text or other important features. we argue that this analysis provides an effective way to gauge the quality of a data set in terms easily communicated to archivists and conservators, and demonstrates that our models are of high enough quality to be of use to historians and palaeographers who are used to relying on digital images of manuscripts of similar spatial quality.

fig. histogram of effective dpi of a model of one folio: x-axis, size of pixel by effective dpi, y-axis, number of pixels with that count. most of the mesh vertices are sampled over dpi, with a much smaller number of vertices having been captured at dpi

interactive document exploration

with the quality of our reconstructions assured, we could now proceed with exploring and exploiting the models to improve access to the document. our next innovation was to create an interactive system that allows a user to navigate the surface of the 3d reconstruction of the great parchment book, virtually flattening specific areas of interest when required. previously, we discussed how standard metric-preserving surface parameterization algorithms are often not suitable for flattening parchment, because of the way parchment deforms when exposed to heat and/or moisture. we circumvented this difficulty by using an interactive viewer which presents a locally flattened view of the region of text that the user is currently focussed on, by undistorting local subsets of the mesh.
this system aims to improve the accessibility and legibility of text in highly distorted documents, in a manner which does not require a global parameterization: since areas are being independently flattened, reconstruction artefacts elsewhere in the mesh will not affect them. this approach is inspired by the observation that, when transcribing a text, a palaeographer will only ever inspect a small section of a folio at any given time (youtie, ) and is analogous to avoiding the distortions of large map projections when attempting to flatten a globe onto a 2d plane (snyder, ). it is, therefore, unnecessary to un-distort the entire folio at once if the primary goal is to simply expose the content in a form that can be read. instead, if a user looks at a particular region, it should be displayed in a way that is optimal in terms of its readability. text should be visible, should not be distorted, and lines of text should be rectified so that they run horizontally from left to right, as is to be expected for this document. the interface provides the capability of visualizing the text in ways which are impossible with the physical document.

local flattening was accomplished by using two modes: a local-affine mode renders the mesh in 3d and transforms it so that the target region is oriented to face the camera; and local-flattening mode allows the target region to be flattened into 2d independently from the rest of the mesh. the user can pan over the surface of the document, pausing in places of interest which would benefit from local flattening. we can see the results of this system in a selection of flattened sections of parchment from different folios of the great parchment book (see figs and ).

our system also addresses the issue of provenance.
for historians studying the text through a digital representation, it is important to be able to judge whether a feature present in the surrogate was also present in the original text or whether it is an artefact of the reconstruction pipeline. terras ( ) discusses this issue at length, focussing mainly on imaging artefacts, and the london charter for the computer-based visualization of cultural heritage (denard, ) stresses the importance of storing paradata which documents the process of generating visualizations of cultural heritage. in our case the most likely source of error is the 3d reconstruction process. we therefore document the provenance of the reconstruction by providing the user with smart access to the original image collection: for a given 3d view, the system displays the portion of an original image that best depicts the currently observed part of the parchment. by comparing the 3d reconstruction with the original images the user can better assess the content of the text in areas of the 3d reconstruction which seem to contain errors.

fig. the local-affine view and local-flattening mode. between the left and middle image, the user pans slightly to the right. in the right hand image, the user stops panning and local flattening is performed, removing perspective distortions and revealing otherwise hidden text.

video (which is available in the online version of this paper) shows a user panning across the surface of a folio and carrying out local flattening in various areas of interest.

this system was used by the palaeographer who compiled the transcription of the great parchment book, alongside access to the original folios, to gain a fuller understanding of the text contained within the parchment. the provenance feature was demonstrably useful in resolving any ambiguities in the model, and access to the local-flattening tool helped in the interpretation of the text.
fig. sections of parchment folio rendered in, left, local-affine mode and, right, local-flattening mode. the top right and middle right sections show text obscured by a fold which is made visible when the sections are flattened. in the bottom left image a large amount of text is obscured by a fold but this previously occluded section of text becomes visible when the mesh is locally unfolded

global flattening

with a system successfully in place to allow navigating and local flattening of areas of parchment, we were able to return to the difficult (and optional, for our purposes) issue of whether we could produce globally flattened areas of the text that had as little distortion as possible to produce useful 2d images of the unfolded manuscript. mesh parameterization computes a map that flattens a 3d surface into the 2d plane by defining some geometric measure of distortion which the algorithm attempts to minimize in the mapping (see sheffer et al., for an overview of this technique). our task is to estimate the complex deformation of the parchment and invert it, thus restoring the original shape of the parchment. there are various constraints that help us in this approach (as the previous research in this area showed, it is important to understand documentary features to be able to address them). before being damaged, the text in the documents was written in a uniform glyph size, in equally spaced horizontal lines, and with strict vertical page margins. we can therefore attempt to find a scaling field which captures the degree of shrinking or stretching of the text at each point, and an orientation field which captures the text direction on the document surface.
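A scaling field of this kind can be pictured as a local ratio between the expected and the observed glyph size, known at a few sample positions and interpolated everywhere else. The sketch below is a minimal illustration that assumes glyph-height samples at known 2d texture coordinates and uses inverse-distance weighting as a stand-in interpolation scheme; the source does not state how the field is interpolated, and every name here is hypothetical.

```python
import math


def local_scale(observed_height, reference_height):
    """Ratio above 1 means the text has shrunk and must be enlarged."""
    return reference_height / observed_height


def scale_field(samples, query, power=2.0):
    """Inverse-distance-weighted scale estimate at a query position.

    samples is a list of ((x, y), scale) pairs; query is an (x, y) pair.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), scale in samples:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            return scale  # query coincides with a measured sample
        weight = 1.0 / distance ** power
        weighted_sum += weight * scale
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical samples: undamaged text on the left, text shrunk to half
# its reference height on the right.
samples = [((0.0, 0.0), local_scale(3.0, 3.0)),
           ((2.0, 0.0), local_scale(1.5, 3.0))]
midpoint_scale = scale_field(samples, (1.0, 0.0))
```

An orientation field could be handled in the same spirit by interpolating unit direction vectors rather than scalar ratios.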
we use a scale constraint based on identifying a sparse set of single characters (those without ascenders or descenders) and use template-matching-based optical character recognition to compute the bounding box, or x-height, of the text. the relative sizes of characters in different regions indicate the amount of shrinking or expansion that has occurred in that region, given the original text was written in approximately the same size throughout. we found that detecting the letter ‘a’ worked well because it has a distinctive shape and is very common, so we can detect a sufficient number of instances in each folio. we then use a semi-automatic approach to line detection (which is complicated by distortions, discolouration, fading, and ascending and descending glyphs): once a user begins to manually trace a line on the folio, the system continuously proposes a suggestion of the next section of line. the user can refine line identification until they are happy with the resulting transformation, and we use these measures of line and scale to invert the deformation of the text (see figs , and ).

even after successful inverse distortion, the texture of the document still exhibits intensity and colour variations which convey the false impression that the document is still distorted and not flat. these variations are a combination of shading baked into the texture at the time of acquisition, and the genuine discolouration of the parchment which has taken place in the course of the damage. while preserving these observed appearance variations is a useful feature to study the rectified text in the context of the original damage, which mitigates the risk of misinterpreting potential artefacts introduced by over-processing the content (terras, , bentkowska-kafel, ), many readers will prefer a cleaned-up colour appearance in addition to the unwarped geometry. therefore, we optionally remove colour variations by normalizing the parchment texture’s appearance.
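One simple way to picture such a normalization is a per-channel gain that varies across the texture and is estimated only from ink-free pixels. The sketch below is an illustrative approximation, not the project's implementation: it assumes a tile-wise constant gain, a brightness threshold to separate ink from parchment, and an arbitrary target parchment colour, all of which are hypothetical choices.

```python
def normalize_colour(image, tile=2, target=(200.0, 190.0, 170.0),
                     ink_threshold=80.0):
    """Scale each colour channel per tile so ink-free pixels match target.

    image is a list of rows of (r, g, b) tuples. Pixels whose mean
    intensity is at or below ink_threshold are treated as ink and are
    excluded from the gain estimate, but the gain is still applied to them.
    """
    height, width = len(image), len(image[0])
    result = [list(row) for row in image]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            cells = [(y, x)
                     for y in range(top, min(top + tile, height))
                     for x in range(left, min(left + tile, width))]
            background = [image[y][x] for y, x in cells
                          if sum(image[y][x]) / 3.0 > ink_threshold]
            if not background:
                continue  # tile contains only ink; leave it unchanged
            gains = []
            for channel in range(3):
                mean = sum(p[channel] for p in background) / len(background)
                gains.append(target[channel] / max(mean, 1e-6))
            for y, x in cells:
                result[y][x] = tuple(image[y][x][c] * gains[c]
                                     for c in range(3))
    return result


# Hypothetical 2x2 patch: three darkened parchment pixels and one ink pixel.
patch = [[(100.0, 95.0, 85.0), (100.0, 95.0, 85.0)],
         [(100.0, 95.0, 85.0), (10.0, 10.0, 10.0)]]
cleaned = normalize_colour(patch)
```

A smoother result would interpolate the gains between tiles instead of holding them constant, at the cost of a slightly longer implementation.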
the normalization is achieved by independently scaling each colour channel by a spatially varying factor, so that all ink-free regions of the parchment roughly match the same colour (see fig. ). video (which is available in the online version of this paper) walks the user through the deformation, and optional colour correction, as presented in pal et al., . an overview of the processing pipeline, and a selection of successfully generated flattened images, is presented in fig. .

fig. top. the automatic tracing line (green) is misguided by the presence of ascenders and descenders. bottom: after a single user correction, the suggestion moves back to the baselines. in this way, the user can help provide the constraints with which to carry out inverse distortion of the text

outputs and impact

there are various ways in which this project has produced outputs that will have lasting impact. obviously, the focus of this article has been on the acquisition and restoration methods that have enabled the contents of the great parchment book to be accessed by researchers more easily and without further handling of the original, fragile, folios, assisting the production of the new transcript of the text. originally tested on a small subset of six pages of parchment (pal et al., ), our capture, restoration, and flattening process has now been used to virtually restore all remaining pages of the great parchment book.

the project worked with web-designers headscape to develop a website that both kept the user community informed via the means of a project blog, and now hosts a readable and exploitable version of text, comprising a scholarly digital edition which features a searchable transcription as well as a glossary of the manuscript contents. the aim of the website is that it should be accessible and useful to a wide range of people: academic researchers and local and family historians alike.
our new flattened images were integrated into the website of the great parchment book project alongside: images of the folio before conservation treatment; a new scholarly transcription of the original text generated after conservation and digital reconstruction (which has revealed significantly more information on practically every folio, providing a rich, new resource on the history of ulster for historians); and a version of this new transcription more suitable for non-scholarly audiences. the texts, encoded in tei-compliant xml, are fully searchable at http://www.greatparchmentbook.org/folios/. a video providing an overview of all aspects of the conservation, acquisition, digital reconstruction, transcription, and encoding, is available in the online version of this paper.

fig. all steps of our global flattening algorithm (diagram labels: deformed folio, ocr, uniform scaling constraints, user constraints, user refinement, flattened, colour-corrected). the original 3d model of the distorted folio is flattened using the estimate from the ocr analysis and the interactive refinement of constraints provided by the user, until a satisfactory result is achieved. finally, we remove the intensity and colour variations from the texture, should that be required

the virtually reconstructed great parchment book is the centrepiece of an exhibition in derry guildhall which opened in to commemorate the th anniversary of the building of londonderry’s city walls in . an original, newly conserved folio from the book was also displayed during the first ten months of this exhibition (a rotating schedule includes various archival objects to ensure renewed interest). both the museum and visitor service of derry city and strabane district council and lma have used the document in their interpretation and outreach programmes, developing resources for schools and colleges based on the information it contains, with a particular school programme associated with the great parchment book taking place in derry~londonderry during the time the exhibition has been open (stewart, a, b). both the website and the exhibition have been very well received: the exhibition had nearly , physical visitors in its first year and has had over , visitors at time of writing. overall visitor feedback from the exhibition has been very positive, noting that it gives a balanced overview of the plantation for a variety of international audiences who may be learning about it for the first time, with high praise for the audio visual and original artefact and document material (mcconnell, ). it has been a huge success that the exhibition has been so positively received by the derry~londonderry community, leading to discussion and debate around the history of sensitive conflict. the website has received over , page views since it was launched in may , and there is considerable interest in the project from academics who study the period. the newly created resource is used in undergraduate teaching in the school of english and history at the university of ulster and is proving to be a ‘vital postgraduate and post-doctoral research tool’ given that this document can ‘revisit a contentious historical legacy’ (kelly, , p. ). interactions with the research community interested in the great parchment book also occurred regularly in the form of workshops and presentations throughout the duration of this project: these are described as part of the project blog.

fig. on the left, the distorted parchments are annotated with the constraints shown here in blue. on the right, the documents are virtually restored, flattening the folios and removing the distortion
as well as this public engagement success, and the creation of a new scholarly resource, both the conservation and computation approaches in this project have led to further information regarding method and process that will benefit others. the research done on parchment degradation, treatments, sample preparation, trial procedures, results of the tests, and the methodology applied to repair the great parchment book were recorded through photographic and written documentation. regular updates were shared on the great parchment book blog and are now referred to extensively by conservators, providing a resource for others attempting to conserve similarly damaged parchment, who are often in direct contact with lma regarding their approach. lma will offer training courses by the project conservator in the future, following the interest in this from the sector (smith, ).

fig. folio a of the great parchment book flattened by our algorithm, shown with and without shading and discolouration, and with a detail image showing a close-up view of a region of text after full reconstruction

from a computational point of view, the project has had numerous successes. we have developed a pipeline for low-cost acquisition of highly detailed 3d models of a fire-damaged parchment, adapting previously available techniques mostly used for large-scale historic and architectural structures, to provide an accurate representation of a document. we have developed a technique to navigate the surface of this resulting model, allowing further close-reading and analysis: although we have used it to analyse the text we are most interested in here, the viewer can also explore arbitrary 3d models, allowing the user to inspect interesting surface details of objects for forensic or palaeographic examination.
our viewer allows local flattening and manipulation of the underlying mesh to support the type of physical manipulation a historian or palaeographer may wish, but be unable to do, with fragile historical texts or artefacts. our understanding of the techniques of production of the great parchment book has allowed us to generate high-quality globally flattened images of each folio, virtually smoothing and restoring the text, increasing its legibility for both general and specialist audiences. we have done so whilst considering palaeographic best practice and bearing data provenance in mind. the fact that our system allows users both to interact with the generated models and continually to check the veracity of these models by comparing features in the 2d-captured images of the folios, builds trust in our approach. additionally, we have developed a way in which our approach may be compared to established archival standards for the creation of digital surrogates, demonstrating that our resulting models are of similar spatial quality to 2d images acquired through more normative digitization procedures. although we have not carried out detailed user testing of our software with a wide library and archive user community, the member of the project team who relied upon our software to aid in transcribing the great parchment book was wholly positive about their experience in navigating and interrogating the surface of the document via the manipulation of its 3d surrogate, and we have had much positive feedback from the research community attending our events and workshops. there is now ample potential for taking this software out to a wider user community, and the joy of our approach to the digital acquisition of the text (using relatively affordable dslr for capture, without the need for specialist equipment) should mean that this technique can be adopted by others in the library and archive community.
ucl has enabled free access to the digital restoration pipeline through a stand-alone version of our software, providing guidance on acquisition through (interactive) processing via our dedicated viewer, based on best-practice computational approaches. the open-sourcing of ucl’s platform should enable other institutions to access the acquisition and restoration process themselves. meanwhile lma is exploring the possibility of developing their role as a centre of expertise for the conservation, imaging, and digital restoration of distorted parchments, working in tandem with ucl to maintain the trajectory we have built up working on this project together.

the project has also led to further understanding of the structure of the great parchment book itself, aiding in reconstructing the original ordering of its folios. prior to conservation, and in the absence of any previous knowledge of the original make-up of the book, the remaining folios had been arranged in a conjectural order. because of gaps in the volume caused by entirely missing folios, the order of the companies’ sections within the original bound book can never be recovered beyond doubt. however, in the course of the project described here, where a closer reading of the text was possible and fragments were positively identified, we can be confident that our reordering of the surviving folios (as presented on the website, with many of the folios having new folio numbers) within each of the companies is correct, given the numbering of the charters, their arrangement, and the way the text now reads (while still taking into account the existence of missing folios or sections of text within charters). the imaging and reading of individual folios in this project has led to a greater understanding of the document as a whole.

these various successes have meant the project has attracted significant public attention.
lma and ucl were very pleased to receive a commendation of merit in the european succeed awards , which promote the take up and validation of research results in mass digitization (succeed project, ). the project has been featured in a range of newspaper and magazine articles (for example, reisz, ; davis, ). in june , the great parchment book was inscribed to the uk register of the unesco memory of the world, recognizing the document’s importance (great parchment book, ). summarizing the successes of the project, the first minister of northern ireland, the rt hon peter d robinson mla, wrote in a guest blog post on the great parchment book website:

i cannot praise the work of the lma & ucl highly enough. in completing this mammoth project they have succeeded in opening a veritable treasure trove of information relating to a most significant period in the history of ulster; and illustrating as never before the central role played by the london guilds in the creation and preservation of the city of londonderry and its environs. (robinson, ).

conclusion

the conservation, digital reconstruction, and resulting transcription of the great parchment book have provided a lasting resource for historians researching the plantation of ulster in local, national, and international contexts. our work on the computational approach to model, navigate, flatten, and ultimately read the damaged parchment will be applicable to similarly damaged material held elsewhere as we believe we are developing best-practice computational approaches to digitizing highly distorted, fire-damaged, historical documents which are all too common in library and archive collections.
we have demonstrated how other existing approaches to this problem have proved inadequate, and adopted a semi-automatic approach to enable an expert user, such as a palaeographer or historian, to guide the virtual restoration of texts and generate useful, accurate representations of the denatured parchment that allow the text to be more legible. the digital image outputs from our system are of high quality, and the techniques used to generate them are transparent: they can be trusted by those accessing them. the digital images can also be integrated easily into scholarly digital editions when the presence of this new evidence has assisted in the text’s transcription.

the work described here is not theoretical: the conservation activities, digitization and restoration pipeline developed, and resulting online resource created, were accomplished within a societal context in time to provide a focus for a community celebration of the founding of londonderry and the th anniversary of the building of its city walls, and a means of reflection on the lasting legacy of the plantation of ulster. we see here, in this project, how a regional museum and metropolitan archive can work together, whilst also interacting with university research mechanisms, to develop a process that helps our interpretation of primary historical texts, presenting online materials of benefit to a wide range of interested individuals, and engaging in community activities to respond to, and reflect upon, an important local and national anniversary.
it is also important to stress that the pipeline we describe here is not entirely a digital one: we were dependent on the expert work of the conservators and archivists, and the historical and linguistic expertise of the palaeographer who carried out the transcription, and so this project also demonstrates a successful international, interdisciplinary project where aspects of conservation, computational science, and digital humanities research come together to benefit our understanding of an archival object to move forward the interpretation of our cultural inheritance.

we now encourage the scholarly community to make use of the texts and images available at http://www.greatparchmentbook.org/. this resource will be maintained by lma to provide lasting access to a document of significant historical importance, the contents of which were not available until we undertook this novel and groundbreaking interdisciplinary work to conserve, image, and virtually recover the great parchment book of the honourable the irish society.

acknowledgements

this work was financially supported by the ucl engd veiv centre for doctoral training; the engineering and physical sciences research council (grant ep/g / ); the european research council (grant imodel stg- - ); adobe research; derry heritage and museums service (dhms, now museum and visitor service, derry city and strabane district council), the uk’s national manuscripts conservation trust; the marc fitch fund; the honourable the irish society; the city of london corporation, london metropolitan archives, and a number of london livery companies: clothworkers’ company, drapers’ company, fishmongers’ company, goldsmiths’ company, ironmongers’ company, mercers’ company, merchant taylors’ company, skinners’ company.
advice and support were provided by professor james stevens curl, the british library, the national archives, and the trustees of lambeth palace library. references aldous, v. e. ( ). the guildhall fire of , with particular reference to the records of the irish society. city of london, lma: corporation of london record office, col/ac/ / / . ahmadabadian, a. h., robson, s., boehm, j., and shortis, m. ( ). stereo-imaging network design for precise and dense d reconstruction. the photogrammetric record, ( ): – . autodesk ( ). d catch. http://www. dapp.com/ catch. avery, n., campagnolo, a., de stefani, c., pal, k., payne, m., smither, r., stewart, a, stewart, e., stewart, p., terras, m., ward, l., weyrich, t., yamada. l. ( ). the great parchment book. in digital humanities , lincoln: university of nebraska, july . http://dh .unl.edu/schedule- and-events/program/. also available in the journal of digital humanities , vol. . . see http://journal- ofdigitalhumanities.org/ - /great-parchment-book- project/. bentkowska-kafel, a. ( ). i bought a piece of roman furniture on the internet. it’s quite good but low on polygons. visual resources. an international journal of documentation, ( ): – . bianco, g., bruno, f., tonazzini, a., salerno, e., savino, p., zitová, b., sroubek, f., and console, e. ( ). a framework for virtual restoration of ancient documents by combination of multispectral and d imaging. in proceedings of eurographics italian chapter conference, pp. – . bond, e. j. and vandercom, j. f. ( ). a concise view of the origin, constitution, and proceedings of the honorable society of the governor and assistants of london of the new plantation in ulster: within the realm of ireland, commonly called the irish society/ printed by order of the court; compiled principally from their records. london, g. bleaden. http://cata- log.hathitrust.org/record/ . brown, m., and pisula, c. j. ( ). conformal deskewing of non-planar documents. 
in proceedings of ieee conference on computer vision and pattern recognition ( ), vol. , san diego, ca, pp. – .
brown, m. and seales, w. ( ). document restoration using d shape: a general deskewing algorithm for arbitrarily warped documents. in proceedings of ieee international conference on computer vision, ieee, vol. , kauai, hawaii, pp. – .
brown, m. s., sun, m., yang, r., yun, l., and seales, w. b. ( ). restoring d content from distorted documents. ieee transactions on pattern analysis and machine intelligence, ( ): – .
butime, j., corzo, l. g., and gutiérrez, i. ( ). application of computer vision to d reconstruction: a survey of reconstruction methods. saarbrücken: vdm verlag dr. müller.
cains, a. g. ( ). the conservation, repair and preservation of books and manuscripts in trinity college dublin. journal of the european study group on physical, chemical, biological and mathematical techniques applied to archaeology, ravello, : – .
canny, n. ( ). making ireland british - . oxford: oxford university press.
chan, t. f. and vese, l. a. ( ). active contours without edges. ieee transactions on image processing, ( ): – .
clarkson, c. ( ). the preservation and display of single parchment leaves and fragments. in petherbridge, g. (ed.), conservation of library and archive materials and the graphic arts. petherbridge: institute of paper conservation. society of archivists, pp. – .
clarkson, c. ( ). a conditioning chamber for parchment and other materials. the paper conservator, ( ): – .
chahine, c. ( ). changes in hydrothermal stability of leather and parchment with deterioration: a dsc study. thermochimica acta, ( – ): – .
cheyney, e. p. ( ). the court of star chamber. the american historical review, ( ): – .
cignoni, p., corsini, m., and ranzuglia, g. ( ). meshlab: an open-source d mesh processing system. ercim news, : – .
curl, j. s. ( ). the honourable the irish society and the plantation of ulster, - . chichester: phillimore.
curley, w. j. ( ). vanishing kingdoms: the irish chiefs and their families. dublin: lilliput press.
davis, n. ( ). not fade away... how robots are preserving our old newspapers. in the observer, sunday th july . http://www.theguardian.com/books/ /jul/ /british-library-digitising-newspapers-boston-spa.
denard, h. ( ). a new introduction to the london charter. in bentkowska-kafel, a., baker, d., and denard, h. (eds), paradata and transparency in virtual heritage. farnham: ashgate, pp. – .
de stefani, c. ( ). conservation of the great parchment book. paper presented at the ara annual conference, brighton, - august, .
de stefani, c. ( ). great parchment book project. paper presented at opposites attract: science and archives, london metropolitan archives, march .
derry city council ( ). great parchment extract on view in guildhall’s new exhibition space. http://www.derrycity.gov.uk/news/great-parchment-extract-on-view-in-guildhall%e % % s-new, july .
easton, r. l. jr., knox, k. t., and christens-barry, w. a. ( ). multispectral imaging of the archimedes palimpsest. proceedings of nd applied imagery pattern recognition workshop, ieee , seattle, washington, pp. – .
english, r. ( ). irish freedom, the history of nationalism in ireland. london: macmillan.
esteban, c. h. and schmitt, f. ( ). silhouette and stereo fusion for d object modelling. computer vision and image understanding, ( ): – .
federal agencies digitization initiative ( ). technical guidelines for digitizing cultural heritage materials: creation of raster image master files. http://www.digitizationguidelines.gov/guidelines/fadgi_still_image-tech_guidelines_ - - .pdf.
furukawa, y. and ponce, j. ( ). accurate, dense, and robust multiview stereopsis. ieee transactions on pattern analysis and machine intelligence, ( ): – .
georgii, j. and westermann, r. ( ). mass-spring systems on the gpu. simulation modelling practice and theory, ( ): – .
giacometti, a., campagnolo, a., macdonald, l., mahony, s., robson, s., weyrich, t., terras, m., gibson, a. ( ). the value of critical destruction: evaluating multispectral image processing methods for the analysis of primary historical texts. journal of digital scholarship in the humanities. oxford university press. http://dx.doi.org/ . /llc/fqv .
giurginca, m., lacatusu, i., miu, i., and petroviciu, l. ( ). parchment behaviour under extreme heat and fire conditions. materials research innovations, ( ): – .
great parchment book ( ). great parchment book awarded unesco memory of the world status. in great parchment book blog, st june , http://www.greatparchmentbook.org/ / / /great-parchment-book-awarded-unesco-memory-of-the-world-status/.
hanasusanto, g. a., wu, z., and brown, m. s. ( ). ink-bleed reduction using functional minimization. in proceedings of the ieee conference on computer vision and pattern recognition (cvpr), ieee, lixouri, greece, pp. – .
hartley, r., and zisserman, a. ( ). multiple view geometry in computer vision. cambridge, new york, ny: cambridge university press.
hassel, b. ( ). conservation treatment of medieval parchment documents damaged by heat and water. in preprints of the iada conference, copenhagen, pp. – .
huang, y., brown, m. s., and xu, d. ( ). user-assisted ink-bleed reduction.
ieee transactions on image processing, ( ): – .
institute of conservation ( ). shortlist, the pilgrim trust award for conservation. http://conservationawards.org.uk/awards/the-pilgrim-trust-award-for-conservation/ -shortlist/.
jordan, t. ( ). using magnets as a conservation tool: a new look at tension drying damaged vellum documents. american institute for conservation, annual meeting , book and paper session. http://cool.conservation-us.org/coolaic/sg/bpg/annual/v /bp - .pdf.
kazhdan, m., bolitho, m., and hoppe, h. ( ). poisson surface reconstruction. in proceedings of the th eurographics symposium on geometry processing, sgp ’ , eurographics association, pp. – .
kelly, b. ( ). personal communication to philippa smith, reference in support of application to unesco memory of the world committee, th january .
kersten, t. p., and lindstaedt, m. ( a). image-based low-cost systems for automatic d recording and modelling of archaeological finds and objects. in progress in cultural heritage preservation. berlin, heidelberg: springer, pp. – .
kersten, t. p., and lindstaedt, m. ( b). automatic d object reconstruction from multiple images for architectural, cultural heritage and archaeological applications using open-source software and web services. photogrammetrie-fernerkundung-geoinformation, : – .
kim, s. j., zhuo, s., deng, f., fu, c.-w., and brown, m. ( ). interactive visualization of hyperspectral images of historical documents. ieee transactions on visualization and computer graphics, ( ): – .
klein, m. e., aalderink, b. j., padoan, r., bruin, g. d., and steemers, t. a. g. ( ). quantitative hyperspectral reflectance imaging. sensors, ( ): – .
koo, h. i., kim, j., and cho, n-i. ( ). composition of a dewarped and enhanced document image from two view images. ieee transactions on image processing, ( ): – .
kutulakos, k. n., and seitz, s. m. ( ). a theory of shape by space carving. international journal of computer vision, ( ): – .
lambert, b. ( ). the history and survey of london and its environs: from the earliest period to the present time. london: dewick and clarke.
lampert, c. h., braun, t., ulges, a., keysers, d., and breuel, t. m. ( ). oblivious document capture and real-time retrieval. in proceedings of first international workshop on camera-based document analysis and recognition, seoul, korea, pp. – .
larsen, r. ( ). introduction to damage and damage assessment of parchment. in improved damage assessment of parchment (idap): assessment, data collection and sharing of knowledge. luxemburg: european commission, directorate-general for research, directorate i–environment, pp. – .
lenihan, p. ( ). consolidating conquest, ireland - . cambridge: pearson.
lennon, c. ( ). sixteenth century ireland, the incomplete conquest. dublin: gill & macmillan.
li, m., weng, d., li, y., zhang, l., and zhou, h. ( , december). high-accuracy d measurement system based on multi-view and structured light. international conference on optical instruments and technology (oit ). international society for optics and photonics, pp. n- n.
library of congress ( ). technical standards for digital conversion of text and graphic materials. available at http://www.digitizationguidelines.gov/guidelines/fadgi_still_image-tech_guidelines_ - - .pdf.
london metropolitan archives ( ). transcription methodology and conventions. in great parchment book. http://www.greatparchmentbook.org/the-project/transcription-methodology-and-conventions/.
london metropolitan archives ( ). conservation of the great parchment book: technical report. pilgrim’s trust award entry, .
lowe, d. g. ( ). distinctive image features from scale-invariant keypoints. international journal of computer vision, ( ): – .
macdonald, l. (n.d.). enhancing legibility of the great parchment book, london metropolitan archive. white paper, university college london.
mccamy, c. s., marcus, h., and davidson, j. g. ( ). a color-rendition chart.
journal of applied photographic engineering, ( ): – .
mcconnell, r. ( ). personal communication to melissa terras, museum and visitor service, derry city and strabane district council, st december .
moody, t. w. ( ). the londonderry plantation, - : the city of london and the plantation in ulster. belfast: w. mullan and son.
mowry, j. f. ( ). parchment: its manufacture, history, treatment and conservation. guild of bookworker’s journal, ( ): – .
nguyen, h. m., wünsche, b., delmas, p., and lutteroth, c. ( ). d models from the black box: investigating the current state of image-based modelling. in proceedings of the th international conference on computer graphics, visualization and computer vision (wscg ), pilsen, czech republic, june - , .
the national archives (n.d. (a)). irish society. http://discovery.nationalarchives.gov.uk/details/rd/ e -bc - f- dc - dcf cf .
the national archives (n.d. (b)). the commission and the great parchment book, . http://discovery.nationalarchives.gov.uk/details/rd/ e f f-aef - - c - cdebd fa .
pal, k., terras, m., and weyrich, t. ( a). d reconstruction for damaged documents: imaging of the great parchment book. in proceedings of nd intl. workshop on historical document imaging and processing, washington dc, august , .
pal, k., terras, m., and weyrich, t. ( b). interactive exploration and flattening of deformed historical documents. computer graphics forum (proc. eurographics), ( ): – .
pal, k., panozzo, d., schüller, c., sorkine-hornung, o., and weyrich, t. ( ). content-aware surface parameterization for interactive restoration of historical documents. computer graphics forum (proc. eurographics), ( ): – .
pal, k. ( ). digital restoration of damaged historical parchment. engd thesis, department of computer science, university college london.
petherbridge, g. ( ). the conservation of library and archive materials and the graphic arts. london: the institute of paper conservation and society of archivists; london, boston: butterworths.
pollefeys, m., van gool, l., vergauwen, m., cornelis, k., verbiest, f., and tops, j. ( , november). image-based d acquisition of archaeological heritage and applications. in proceedings of the conference on virtual reality, archeology, and cultural heritage, acm, pp. – .
quandt, a. b. ( ).
the conservation of a th century illuminated manuscript on vellum. chicago: american institute for conservation preprints, pp. – .
quandt, a. b. ( ). recent developments in the conservation of parchment manuscripts. in aic book and paper group annual, vol. . american institute for conservation, pp. – . http://cool.conservation-us.org/coolaic/sg/bpg/annualv /bp - .html.
reisz, m. ( ). northern ireland’s ‘domesday book’ deciphered, fire-damaged survey ordered by charles i yields its secrets at last. times higher education, th june . https://www.timeshighereducation.com/news/northern-irelands-domesday-book-deciphered/ .article.
remondino, f., and menna, f. ( ). image-based surface measurement for close-range heritage documentation. international archives of photogrammetry, remote sensing and spatial information sciences, (b - ): – .
remondino, f. ( ). heritage recording and d modeling with photogrammetry and d scanning. remote sensing, ( ): – .
remondino, f., del pizzo, s., kersten, t. p., and troisi, s. ( ). low-cost and open-source solutions for automated image orientation–a critical overview. in progress in cultural heritage preservation, springer berlin heidelberg, pp. – .
robinson, p. d. ( ). from the first minister of northern ireland, the rt hon peter d robinson mla. the great parchment book project, th june . http://www.greatparchmentbook.org/ / / /from-the-first-minister-of-northern-ireland-the-rt-hon-peter-d-robinson-mla/.
rodriguez-echavarria, k., morris, d., and arnold, d. ( ). web based presentation of semantically tagged d content for public sculptures and monuments in the uk. in proceedings of the th international conference on d web technology, acm, pp. – .
roued-olsen, h., tarte, s. m., terras, m., brady, j. m., bowman, a. k. ( ). towards an interpretation support system for reading ancient documents. in digital humanities , university of maryland, june . digital humanities conference abstracts, pp. – .
http://www.mith .umd.edu/dh /wp-content/uploads/dh _conferencepreceedings_final.pdf.
samko, o., lai, y.-k., marshall, d., and rosin, p. ( ). segmentation of parchment scrolls for virtual unrolling. proceedings of the british machine vision conference, bmva press, university of dundee, pp. – .
samko, o., lai, y.-k., marshall, d., and rosin, p. l. ( ). virtual unrolling and information recovery from scanned scrolled historical documents. pattern recognition, ( ): – .
santagati, c., inzerillo, l., and di paola, f. ( ). image-based modeling techniques for architectural heritage d digitalization: limits and potentialities. international archives of the photogrammetry, remote sensing and spatial information sciences, (w ): – .
santry, c. ( ). the trouble with townlands: great parchment book. in irish genealogy online http://www.irishgenealogynews.com/ / /the-trouble-with-townlands-great.html (accessed december ).
schack, m., and fackelmann, m. ( ). bericht über praktische arbeiten am institut für restaurierung der österreichischen nationalbibliothek mit neuen und neu adaptierten methoden. internationale arbeitsgemeinschaft der archiv-, bibliotheks- und graphikrestauratoren berlin , marburg: iada.
schmitt, f. and yemez, y. ( ). d color object reconstruction from d image sequences. in proceedings of the international conference on image processing, kobe, japan, vol. , pp. – .
schneider, d., block, m., rojas, r. ( ). robust document warping with interpolated vector fields. in proceedings of the th international conference on document analysis and recognition, vol. , curitiba, brazil, pp. – .
seitz, s. m., curless, b., diebel, j., scharstein, d., szeliski, r. ( ). a comparison and evaluation of multi-view stereo reconstruction algorithms. ieee computer society conference on computer vision and pattern recognition, vol. , new york, ny, pp. – , . /cvpr. . .
shashkov, m. m., nguyen, c. s., yepez, m., hess-flores, m., and joy, k. ( , july). semi-autonomous digitization of real-world environments. in computer games: ai, animation, mobile, multimedia, educational and serious games (cgames).
louisville, kentucky: ieee, pp. – . doi: . /cgames. . .
sheffer, a., praun, e., and rose, k. ( ). mesh parameterization methods and their applications. foundations and trends in computer graphics and vision, ( ): – .
singer, h. ( ). the conservation of parchment objects using gore-tex laminates. the paper conservator, ( ): – .
smith, p. ( ). the great parchment book project, arc magazine (february ), pp. .
smith, p. ( ). personal communication to melissa terras, london metropolitan archives, nd april .
smither, r. (n.d.). the great parchment book, rachael smither conservation. http://rachaelsmitherconservation.com/the-great-parchment-book/.
snavely, n., seitz, s. m., and szeliski, r. ( ). photo tourism: exploring photo collections in d. acm transactions on graphics, ( ): – .
snyder, j. p. ( ). flattening the earth: two thousand years of map projections. chicago: university of chicago press.
stewart, p. ( ). the great parchment book. paper presented at plantation families: people, records and resources, a family and local history event on the plantation of ulster, belfast, september , .
succeed project ( ). awards . http://succeed-project.eu/succeed-awards/awards- . https://web.archive.org/web/ /http://succeed-project.eu/succeed-awards/awards- .
sun, m., yang, r., yun, l., landon, g., seales, b., and brown, m. ( ). geometric and photometric restoration of distorted documents. in proceedings of the ieee international conference on computer vision, vol. , pp. – .
terras, m. ( ). reading the readers: modelling complex humanities processes to build cognitive systems. literary and linguistic computing, ( ): – .
terras, m. ( ). artefacts and errors: acknowledging issues of representation in the digital imaging of ancient texts. in fischer, f., fritze, c., and vogeler, g. (eds), kodikologie und paläographie im digitalen zeitalter /codicology and palaeography in the digital age . norderstedt: books on demand, pp. – .
tian, y. and narasimhan, s. ( ).
rectification and d reconstruction of curved document images. proceedings of the ieee conference on computer vision and pattern recognition, colorado: colorado springs, pp. – .
thomas, k. ( ). alterations within the structural hierarchy of parchment induced by damage mechanism. phd thesis, cardiff university, cardiff. http://orca.cf.ac.uk/ / /u .pdf.
tingdahl, d. and van gool, l. ( ). a public system for image based d model generation. computer vision/computer graphics collaboration techniques th international conference, rocquencourt, france, mirage , springer berlin heidelberg, pp. – .
trucco, e., and verri, a. ( ). introductory techniques for -d computer vision. englewood cliffs: prentice hall.
ulges, a., lampert, c. h., and breuel, t. ( ). document capture using stereo vision. in proceedings of the acm symposium on document engineering, doceng , acm, pp. – .
vergauwen, m. and gool, l. v. ( ). web-based d reconstruction service. machine vision applications, : – .
wada, t., ukida, h. and matsuyama, t. ( ). shape from shading with interreflections under a proximal light source: distortion-free copying of an unfolded book. international journal of computer vision, ( ): – .
walsh, b. ( ). personal communication to melissa terras, museum and visitor service, derry city and strabane district council, th january .
weng, j., huang, t. s., and ahuja, n. ( ). motion and structure from image sequences, vol. . berlin, new york: springer-verlag.
wenzel, k., rothermel, m., fritsch, d., and haala, n. ( ).
image acquisition and model selection for multi-view stereo. proceedings of international archives of the photogrammetry, remote sensing and spatial information sciences, xl- : – .
wikipedia ( ). color chart. https://en.wikipedia.org/wiki/color_chart.
woods, c. ( ). conservation treatments for parchment documents. journal of the society of archivists, ( ): – .
woods, c. ( ). the conservation of parchment. in kite, m. and thomson, r. (eds), conservation of leather and related materials. oxford: butterworth-heinemann, pp. – .
worshipful company of vintners ( ). members briefing. th february . http://www.vintnershall.co.uk/resource/collection/ b e- e - b-b f - fb a f /member_briefing_ _feb_ .pdf.
wu, c. ( ). siftgpu: a gpu implementation of scale invariant feature transform (sift). university of north carolina at chapel hill. http://www.cs.unc.edu/~ccwu/siftgpu/.
wu, c. ( ). visualsfm: a visual structure from motion system. http://www.cs.washington.edu/homes/ccwu/vsfm/.
wu, c., agarwal, s., curless, b., and seitz, s. m. ( ). multicore bundle adjustment. in proceedings of ieee conference on computer vision and pattern recognition (cvpr), colorado: colorado springs, pp. – .
wu, c. and agam, g. ( ). document image dewarping for text/graphics recognition. in proceedings of the joint iapr international workshop on structural, syntactic, and statistical pattern recognition, springer berlin heidelberg, pp. – .
youtie, h. c. ( ). the papyrologist: artificer of fact. greek, roman, and byzantine studies, ( ): – .
zhang, z., tan, c., and fan, l. ( ). restoration of curved document images through d shape modelling. proceedings of the ieee computer society conference, vol. , washington, dc, pp. – .
notes

this article is predominantly based on the engd thesis of pal ( ) combined with reports on specific technical approaches used in the project such as papers (pal et al., a, b, ; avery et al., ), presentations (de stefani, , ; stewart, ; smith, ), internal project documentation, and additional research to draw together a unique complete overview of this groundbreaking project.
the city of london is a city and county within central london comprising the area of london’s original settlement from the st century ad to the middle ages, around and to the east of st paul’s cathedral. a major business and financial district, it is also often referred to as the square mile. see http://www.cityoflondon.gov.uk for further information.
the plantation of ulster was an organized colonization of one of ireland’s most northerly provinces by english and scottish settlers from onwards. james vi of scotland and i of england and ireland wrote in that this would enable ‘the settling of religion, the introduction of civility, order, and government, among a barbarous and unsubjected people, to be the acts of piety and glory, worthy also a christian prince to endeavour’ (quoted in bond and vandercom, , p. ). highly contentious, this scheme has ramifications to this day. see lennon ( ), canny ( ), and lenihan ( ) for overviews of the plantation of ulster, and english ( ) for a discussion of the legacy of the plantation, which is seen by some as the origin of mutually antagonistic and polarized catholic/irish and protestant/british identities and communities in ulster.
http://www.cityoflondon.gov.uk/
the london livery companies are london-based ancient and modern trade associations and guilds; see http://www.liverycompanies.info/ and http://www.liverycompanies.com/history/ for further information. the pre-eminent great twelve original livery companies were predominantly involved with the plantation.
eight of the great twelve, whose estates appear in the great parchment book, contributed funds towards the research described here: the clothworkers’ company, drapers’ company, fishmongers’ company, goldsmiths’ company, ironmongers’ company, mercers’ company, merchant taylors’ company, and skinners’ company.
http://www.honourableirishsociety.org.uk/about-us/who-we-are
this renaming of both the city of derry and the new county surrounding it to londonderry has resulted in an ongoing naming dispute between irish nationalists and unionists. when the city was uk city of culture in (http://www.cityofculture .com/), the dual name derry~londonderry was used, and this has been appropriated by other media: it is the preferred format we use to refer to the modern-day conurbation, with derry used throughout to describe the city pre-plantation, and londonderry used to name the city in the period immediately after the building of the city walls in . londonderry is used for the county, throughout.
a medieval great hall and surrounding complex which is both the ceremonial and administrative centre for the city of london and its corporation.
some other material does survive, including deputation and visitation books, surveys, rentals, and rent rolls. the catalogue of the irish society archive is available at http://search.lma.gov.uk/scripts/mwimain.dll/ /lma_opac/web_detail/refd+cla~ f ?sessionsearch.
the star chamber was an english court of law sitting at the palace of westminster from the late th to the mid- th century. established to ensure fair enforcement of laws, particularly ‘cases of breach of public order’ and ‘cases of violation of royal commands’ (cheyney, , p. ), it had a reputation for secrecy and ‘its action is generally supposed to have been tyrannical and irregular’ (ibid., p. ). a full overview of the case filed by charles i’s attorney-general against the irish society and governor of the new plantation in ulster, “complaining, amongst other charges, of irregularity and misrepresentation”, can be found in bond and vandercom ( , p. clxxiv–clxxv), but the precise date is not given. see also moody ( ), curl ( ).
for further background information, see moody ( ), p. .
it is consistent with the ordering of a return of rents received ca. held by the national archives (sp / ).
the former record office of the city of london corporation, now merged into london metropolitan archives.
https://www.cityoflondon.gov.uk/things-to-do/london-metropolitan-archives/pages/default.aspx
the exhibition ‘plantation: process, people, perspectives’ was opened at derry guildhall in june (see derry city council, ) and, owing to its continuing popularity, will remain open for some time (walsh, ).
see http://www.cityofculture .com/ for activities surrounding derry~londonderry city of culture .
http://www.cs.ucl.ac.uk/
http://www.ucl.ac.uk/dh/
http://engdveiv.ucl.ac.uk/
https://www.epsrc.ac.uk/
http://igl.ethz.ch/
http://ivc.ethz.ch/
https://www.ethz.ch/en.html
see cains ( ), mowry ( ), petherbridge ( ), quandt ( , ), schack and fackelmann ( ), and woods ( ) for various established approaches that were initially considered before being discounted for this particular document because of its physical condition. a standard method of flattening documents, used in the past, is to stick them onto another surface, such as synthetic polyester fabric, in a stretched state using a starch paste, and to allow them to dry under the tension of the adhesive (woods, ). parchment documents have also been moistened, then dried under a weighted board. these methods have been described as ‘ignoring its life and spirit altogether... sticking it down... struggling with it and pressing it till it stays down, subjected, beaten and defeated’ (clarkson, , p. ).
any change in relative humidity, in fact, has detrimental effects on historical parchment documents, with structural alteration occurring at the microscopic level to the material’s collagen structure (larsen, ; thomas, ). for this reason, conservation treatments involving humidity had to be carefully calibrated to attain the opening of the folds with minimal damage (de stefani, ).
this is a relatively new approach in conservation. see jordan ( ) for an experimental use of the technique, which found that ‘the use of rare earth magnets is an acceptable treatment alternative to current methods of humidifying and tension drying vellum documents. the method is particularly useful with severely cockled, damaged vellum’ (p. ). this prior work informed our approach.
various standards exist to aid in the digitization of documents with non-complex geometry, such as library of congress ( ) and federal agencies digitization initiative ( ).
further explanation of the approaches described here is available in pal ( , p.
– ). mass-spring simulation is a simple technique for computationally simulating the mechanics of deformable objects. see georgii and westermann ( ) for an introduction. further explanation of the approaches detailed here is available in pal ( , p. – ). see pal ( , p. - ), for an overview of various techniques, including triangulation, structured-light scanning, time-of-flight scanning, multi-view-stereo methods, structure from motion, real-time methods, meshing, view dependent texture mapping, and surface parameterization. trucco and verri ( ) and butime et al. ( ) provide introductory texts to this range of technologies. structure from motion refers to a range of methods that estimate d structures from d image sequences, which may use local motion signals to improve results. see weng et al. ( ) for an overview. http://www.arc d.be/ http://www.
dapp.com/catch arc d has been used, for example, to generate d models of sculpture as part of the uk’s public monuments and sculpture association’s national recording project (rodriguez-echvarria et al., ) and its efficacy in capturing cultural heritage models has been tested when compared with other approaches (remondino and menna, ). d catch has been used for a variety of heritage applications including the low-cost d recording of archaeological objects (kersten and lindstaedt, a) and architectural sites (santagati et al., ). these include microsoft photosynth (https://photosynth.net/), my dscanner (http://www.shapeking.com/ d-scan/my dscanner/), and hyper d (http://sourceforge.net/projects/hyper d/). see, for example, li et al. ( ), shashkov et al. ( ), and kersten and lindstaedt ( a, b) for other uses of visualsfm. inserting a colour calibration target into an image capture sequence is a standard method for ensuring accuracy of colour within cultural heritage digitization. for more information about the colorchecker, see http://xritephoto.com/colorchecker-passport-photo, and for colour charts in general, see mccamy et al. ( ) as an introduction, and wikipedia ( ) for an up-to-date overview of industry use. if the hole was caused by poor coverage of that region in the image set, this could cause ghosting artefacts in the texture since the interpolated mesh surface may not be accurate, resulting in blurry text in the reconstruction. however, if the hole was caused by the absence of text in the region (and hence a lack of texture features for the reconstruction algorithm to detect), this was less problematic since we cared most about maintaining accuracy and legibility of text. our computational approach is fully described in pal ( , p. – ) and pal et al. ( a, b). this effective dpi will, understandably, change if the size and quality of the input images increase or decrease: our results here are in part due to the quality of the d image inputs used.
both of our viewing facilities, as well as large parts of our digital restoration pipeline, were new software developments built from scratch, in c++ and using external cross-platform libraries such as qt and opengl. the respective source code, together with all other processing tools developed by us, is available for download under http://reality.cs.ucl.ac.uk/projects/gpb/. the software is still in alpha status, and we continue to refine it to make it accessible to a wider user base. our computational approach is fully described in pal ( , p. – ) and pal et al. ( a, b). expertise provided by colleagues from the interactive geometry lab (http://igl.ethz.ch/) from eth zurich contributed to this phase of the project. our computational approach is covered in pal et al. ( ). our computational approach is covered in pal ( , p. – ) and pal et al. ( ). our computational approach is covered in pal et al. ( ). http://headscape.co.uk/ a description of the transcription methodology and conventions can be found in london metropolitan archives ( ). this element of the project received a grant from the marc fitch fund towards the employment of a palaeographer who also encoded appropriate terms using the text encoding initiative’s guidelines to capture structural and semantic information about the texts, enabling comprehensive searching of the document. for more information about tei see http://www.tei-c.org/index.xml. this is also available online at https://www.youtube.com/watch?v=kuxxkmbzg-m. http://www.derrystrabane.com/subsites/museums-and-heritage/museums-and-visitor-service an alpha version is available from http://reality.cs.ucl.ac.uk/projects/gpb/. the project was also shortlisted for the prestigious institute for conservation (icon)’s pilgrim trust award for conservation (icon ), and has been nominated for the royal historical society public history awards , and uk blog of the year awards .
developing new skills for research support librarians

rebecca brown a*, malcolm wolski a & joanna richardson a
a division of information services, griffith university, australia
* corresponding author
acknowledgement: none

abstract
in recent years there has been considerable discussion about the key role which university libraries can play by engaging with their research community. as a result libraries are scoping, developing and implementing new roles and service models, especially in the relatively new area of research data. this article explores the specific challenges experienced by a traditional academic librarian at griffith university as she moved into a new role as a data librarian. it was found that this transition needed to be underpinned by a skills development program, a mentor/coach and a support network of specialists.
the authors then outline some strategies to facilitate this type of role transition, which include investing in a range of training and staff development activities, leveraging existing core librarian capabilities, and understanding the researcher perspective. the article concludes with a suggestion that several national organisations will continue to have an important role in supporting librarians as they develop new skills.

keywords: data librarian; research data services; research libraries; research data management; library roles

implications for best practice
• while formal skills training is important as librarians move into new research support roles, there is also a critical need for informal training, mentoring and support networks.
• library roles which support research need to be scoped to determine the skills and expertise required within a team, faculty and the institution.
• in-depth knowledge of the research process in specific discipline areas may be required to enable librarians to contribute as a full partner in the research activity.
• in australia national bodies such as caul and ands will continue to have an important role to play in assisting libraries to provide support networks.

introduction: research data is the new gold
there is a new emphasis on re-using and preserving research data. this is a logical consequence of data having become more numerous, more complex, and more important. as the australian national data service (ands ) observes, ‘…the research process is transforming to become more investigative as it is now possible to assemble significant data collections that enable much broader problems to be addressed. thus it is critical that research data is managed, discoverable, and connected to enable innovative re-use’. funding bodies and national governments are seeking an improved return on investment for funded research.
along with the transformative nature of research, this has been a major driver for initiatives aimed at better access to and the sharing of research data, e.g. the uk data archive. in addition, governments are providing access to data as part of their ‘open government’ strategy. the australian government, for example, in its declaration of open government is committed to ‘open government based on a culture of engagement, built on better access to and use of government held information, and sustained by the innovative use of technology’ (tanner ). most australian state governments have followed suit with similar strategies. in australia re-use of data is typically an existing priority in projects funded by national eresearch collaboration tools and resources (nectar) and ands, and is an emerging priority in funding agencies such as the australian research council (arc) and national health and medical research council (nhmrc). in addition, arc applicants are expected to outline their ‘management of data’. at the same time internationally, digital preservation organisations and services are beginning to evolve in recognition of the importance of preserving not only research data but also digital information more generally as part of its lifecycle. walters and skinner ( ) examine in detail some of the types of initiatives (business-driven, community-driven and library-driven) which underpin these operations.

role of university libraries in supporting research data
the recent focus on data has resulted in much discussion within universities, often leading to the development of new services. the library is one of the service providers within a university that is seen by many as having a key role in engaging with the research community, particularly in regard to the management of their data (auckland ; malenfant ; jaguszewski and williams ).
the traditional role of providing information support and training has been expanded to include support in all steps of the research lifecycle (simon fraser university library ). however, many university libraries are grappling with their emerging role in supporting the new area of research data. in a uk survey (lewis ), the starting point was to question whether the specific activity of managing data was actually a role for university libraries. on the one hand, the answer was ‘no’ because the scale of the challenge in terms of infrastructure, skills and culture change requires concerted action by a range of stakeholders, of which university libraries are just one. on the other hand, the answer was ‘yes’ because ‘data from academic research projects represents an integral part of the global research knowledge base, and so managing it should be a natural extension of the university library’s current role in providing access to the published part of that knowledge base’ (p ). how far that current role should be extended has been the topic of much discussion. in the association of college and research libraries (acrl) surveyed a cross section of its members in the united states and canada to provide a baseline assessment of the current state of, and future plans for, research data services (rds) in academic libraries in these countries. in the resultant report (tenopir et al, , - ), two of the key findings were that:
• only a small minority of academic libraries in the united states and canada currently offer research data services (rds), but a quarter to a third of all academic libraries are planning to offer some services within the next two years
• libraries on campuses that receive nsf (national science foundation) funding are more likely to offer or plan to offer rds of any type. this suggests that funding agency requirements are driving the need for rds.
as budget decisions move towards even greater accountability, it is likely that more agencies will dictate responsible data management, so the need for rds on campus is likely to grow. if the library is not actively involved in providing these services, some other unit is likely to be pressed into service, which can diminish the image of the library as an important partner in the research process.

those university libraries which are adopting a broad perspective are currently beginning to undertake a range of research support activities, including
• raising awareness of data issues within institutions and the benefits of actively managing research data
• assisting in developing policies about data management and preservation
• providing advice to researchers about data management early in the research life cycle; influencing the way researchers will be creating their data, the formats they will use and building a commitment to use a repository to publish/preserve their data
• working with it service colleagues to develop appropriate local data storage capacity
• training and introducing data management and curation concepts to research students
• exploring methods of moving data from work-in-progress storage spaces to repositories in more seamless ways

walters and skinner ( , ) have published a report which discusses ‘the emerging practice of digital curation for preservation and how research libraries are fostering curatorial practices in order to ensure that their parent institutions continue to realize their core mission of creating, disseminating, and preserving knowledge’. maccoll ( ) reinforces the idea that a vision of a comprehensive and strategic role for libraries includes the curation and preservation of research outputs.
tenopir et al ( , ) point to ‘a more active and visible role in the knowledge creation process by placing librarians at all stages in the research planning process and by providing expertise to develop data management plans, identify appropriate data description, and create preservation strategies’. in july , liber’s e-science working group (christensen-dalsgaard et al ) released ‘ten recommendations for libraries to get started with research data management’. since then, several libraries across europe either have started to build or have expanded their capacities for research data management, typically combining e-infrastructure and support services. their progress has been documented in case studies, which describe policies and strategies that pave the way for the creation, institutional integration and the running of support services and underlying infrastructures. in addition, challenges and lessons learned are described, and ways-forward outlined. digital scholarship and the challenges associated with research data offer libraries the chance to shed their ‘support service’ label and become research collaborators (corrall ). sarah thomas, vice-president for the harvard library, reinforces the likelihood of this new role: ‘i see us moving up the food chain and being co-contributors to the creation of new knowledge’ (monastersky, , ). in a webinar, three senior information professionals discussed how their respective libraries were offering a growing number of services to support a diverse set of research needs, as both researchers and scholars increasingly move toward data-driven research (mertens et al ). stuart ( ) has explored the potential role for research libraries in a data-centric age. he highlights, for example, the importance of providing training to researchers in accessing, archiving, publishing and managing data. 
in the examples outlined above, each library has had to determine its own approach to supporting research, based on its respective strategic priorities and those of the parent organisation. for most libraries, defining the scope of the problem and then moving from a theoretical discussion to actually developing and implementing a practical response has not been an easy task since it impacts on all aspects of the library organisation and the staff. in the following sections the authors discuss their own experiences and suggest some strategies to facilitate the transition.

the challenge of putting theory into practice
in it became evident at the authors’ university that there was a high-level mandate for change. this was a result of several key drivers: compliance with funding agency requirements; senior academic recognition of the potential value of data as an asset and the practical need to improve data management; and an enterprise need for across-the-board data classification methods to help manage the risk of data loss. this resulted in changed policies and new guidelines. at griffith university the division of information services integrates e-research, library, information and communication technology into a single organisation. a divisional restructure saw the formation of six portfolios, two of which encompassed library operations. the new information management portfolio took over governance responsibility of the traditional acquisitions and library systems function as well as corporate records and institutional repositories for scholarship including research outputs. additionally this portfolio has responsibility for discoverability of content as well as the management role. new teams and staff roles were established to manage these operations. the new library and learning services portfolio assumes the governance of faculty librarians and physical spaces. faculty librarian roles were also modified to address data management.
at the same time investment in further development of systems and infrastructure to manage data underpinned the establishment of new services. development of a discovery portal known as the ‘research hub’ and development of new repositories to accommodate new demands are ongoing (e.g. addressing the need to mint dois). a complementary program of work is underway to implement new services through the faculty librarians. the objective of this program of work is to further engage with researchers and assist researchers in the management of their data. similarly to what was noted in the liber e-science working group study above, most of the griffith staff involved saw a potential for new services in research data but few staff had any practical experience. carlson has observed: ‘the challenges encountered by librarians seeking to engage in data management and curation issues are found at the individual level (acquiring skills and confidence) and at the organizational level (creating a supportive environment). both levels will need to be addressed by libraries seeking to develop data services’ ( , ). carlson’s observations parallel the experiences of one of the authors of this paper. the author moved from a long-time role as an academic services librarian, working with a science and engineering faculty, to a role as data librarian working on a federally funded climate change adaptation project which aimed to provide advice to natural resource management bodies around australia. the data management component involved providing data management advice to nine clusters of researchers from the commonwealth scientific and industrial research organisation (csiro) and academic institutions throughout the project lifecycle, and ensuring that project outputs would be accessible and discoverable at the project’s completion. 
the author’s previous role had included collection development, advanced information retrieval training for higher degree students and academics, bibliometrics reporting, and providing high-level advice to research staff on data management planning. the author considered herself as having a solid, broad-level understanding of data management, including the creation, use, description, storage, and retrieval of data. however the new role proved a substantial challenge. the author did find that having relevant discipline knowledge (in this case a formal background in environmental sciences, including experience as a research assistant) was a distinct advantage. familiarity with scientific terminology and the scientific research process meant that she was comfortable working with climate change researchers from the start, and broadly understood the types of outputs they were producing. initially the author was required to provide written, best-practice advice on a suite of data management topics such as in-project data storage, copyright and licensing (including the complex legalities of extensive data re-use), data description and documentation, data identifiers, and long-term deposit for preservation and discoverability. while this proved to be an excellent, on-the-job introduction to many major concepts in data librarianship, becoming a confident ‘expert’ in a short space of time required the rapid uptake of new skills and knowledge. the author spent considerable time doing background reading and completing a variety of informal training packages (for a list of self-training options and communities of practice for data librarians see simons and searle ( , )) as well as participating in an week, assessment-based mooc on metadata. 
together these tools provided a strong theoretical overview of the field of data management; however the author had difficulty contextualising theory, and felt limited by a lack of both practical data management experience and in-depth understanding of the data on which she was providing advice. as she wrote in a blog post following the early days of her new role, ‘for many academic librarians moving into the data management space, ‘data’ is just a word. how many academic librarians have seen a dataset recently? …. if i could have seen a raw dataset or data collection prepared for sharing and reuse, versioned correctly, saved in an appropriate file format, licensed, assigned a doi, described using an appropriate metadata schema, uploaded into a content management system, and made discoverable for reuse, then i think i could have saved many hours of reading and scratching my head…’ (http://www.samsearle.net/ / /reflections-on-path-to-data.html) the author was also required to provide detailed advice to researchers on creating quality documentation (metadata) for their project outputs, with the aim of storing outputs in the terra nova climate change adaptation information hub (https://terranova.org.au/). she quickly recognised that her previous career focus on information literacy and information retrieval meant she lacked expertise in cataloguing processes, including the use of authority files and standards, which would have been useful for the role. for example, terra nova incorporates a number of geospatial fields and vocabularies from the anzlic metadata profile (itself based on as/nzs iso : geographic information – metadata). rapid upskilling in the use of standards in metadata creation was therefore required. the author was involved in the ongoing development of the terra nova hub. 
in spite of her extensive experience in information retrieval from an end-user perspective, her lack of information technology skills (e.g. lack of technical experience with relational databases and content management systems, web development and design) initially limited the advice she could provide to developers on improving the user experience. the author was assigned an experienced mentor from within the organisation, which proved critical during the induction period of the role. the mentor was able to point the author to a wealth of existing material to assist in writing best-practice advice, and helped to place new concepts into context by providing real-life examples of their application. the most valuable training material proved to be case studies which attempted to contextualise theory, for example completed data management plans (such as those found at the digital curation centre website, http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples), actual data curation profiles (such as those at http://datacurationprofiles.org/) and webinars hosted by the australian national data service, which have provided an opportunity to hear speakers describe their involvement in research data service provision in detail (ands ). the mentor proofread and edited all content written by the author during the first three months of the role. the mentor also helped the author to understand the scope of her position and provided reassurance about the depth of technical expertise required for the job. when the author’s mentor moved to a new position within the division of information services and was no longer able to provide mentoring, the author found herself working alone within a team of programmers, web developers, business analysts and project managers.
at this point the author successfully turned to both local and online communities of practice external to the organization to seek advice and support, for example, the australian national data service general discussion group. the experience of this author, as outlined above, highlights some of the challenges to a successful transition from traditional academic librarian to data librarian, and points to the types and formats of skills development which could ease the journey.

discussion
in the particular case outlined above, the leap from theory into practice has been a large one and has needed to be underpinned by a skills development program, a mentor/coach and a support network of specialists.

training and staff development
when designing training, one needs to consider both the background of the librarian and the details of the expanded role they are expected to play. auckland outlines the wide variety of data management roles in which an academic librarian may be involved: from a largely advisory role (e.g. providing advice and referral on within-project data management, long-term preservation of research outputs and compliance with policy and funding mandates) to a hands-on role (e.g. applying advanced skills in developing metadata schema specific to disciplinary standards and individual research projects) ( , ). the data management roles of academic librarians, along with the associated required knowledge and skills, can be represented as a continuum of increasing complexity:

| advisory role, e.g. academic librarian | eresearch role, e.g. data librarian |
| --- | --- |
| knowledge of the research process and scholarly communication, including an overview of discipline-based knowledge and outputs | advanced understanding of discipline-based research process, outputs and scholarly communication, including an understanding of data types and formats typical of specific disciplines |
| knowledge of legal and regulatory frameworks | advanced understanding of ethics, intellectual property, copyright and licensing |
| overview of good information and data management practices, e.g. safe storage, backup and long-term deposit of data | advanced understanding of safe storage, backup, and transfer of data, including file formats, version control, file authenticity and security |
| overview of metadata concepts and schemas | advanced knowledge of discipline-specific metadata schemas and related standards, at both item-level and collection level; understanding of mark-up languages such as xml, interoperability and crosswalks |
| overview of preservation standards | knowledge of repository certification schemes and standards |
| overview of semantic web and open data | knowledge of semantic web standards, open data platforms |
| communication and outreach skills | high level communication and documentation skills, project management skills, systems design skills, business analysis skills |

table . increasing complexity of librarian roles supporting research data

there also needs to be a continual program of in-service training to provide ongoing skills development. some specific skills needed include negotiation, advocacy and communication. simons and searle ( , p ) have identified three broad training pathways for librarians moving into the data management space: formal (tertiary) education, training courses (in-house or externally provided), and informal learning, either self-directed or supervisor/peer-assisted.
searle ( ) has specifically outlined the benefits of introducing a component of scenario-based learning into introductory research data management workshops for librarians. in addition she provides practical advice as to how to develop scenarios and integrate them within an institutional staff development program. this has highlighted the need for a good support network of specialists to form a virtual team. such a network enables the data librarian to acquire specific knowledge from domain experts on the job. it also allows the data librarian to call in a specialist to work on complex problems beyond their level of expertise. some recent australian examples include:
• secondment of subject librarians to work in specialist teams for a short period to build skills (la trobe university) (huggard et al )
• creation of virtual teams including library staff to develop services for researchers (university of south australia) (healey et al )
• secondment of librarians to work on projects to address data management activities (griffith university) (wolski and richardson )

the common thread is that librarians have been placed in teams with other specialists. a key finding from these examples has been the importance of librarians being able to tap into other librarians’ experiences. as a result national organizations such as ands and caul have facilitated such activities through workshops and webinars. however there is still an unmet need to further develop support networks for librarians working in domain-specific areas, e.g. bioscience, ecology. alex ball ( , ) from the digital curation centre (dcc)/ukoln informatics has recently stated: ‘i am confident we will see more specialists with scientific, technical, engineering and medical backgrounds being brought into the library profession to deal with specialist data issues’. while many data management practices are cross-disciplinary, there are discipline areas in which domain knowledge would be a distinct advantage.
complex datasets resulting from scientific instruments, modelling results, and geographic information system (gis) layers, for example, need accurate and extensive description to be reusable. this level of description would require a librarian to have, or to develop, some expert knowledge. therefore it is important within the library that any librarian roles which support research are scoped to determine the specific skills and expertise required within the team, faculty and institution.

leveraging existing core capabilities
an aspirational role for librarians was affirmed at the may arl membership meeting (association of research libraries ) which described the research library and university of as ‘a rich and diverse learning/research ecosystem’ (p ), with the research library shifting from ‘its role as knowledge service provider within the university to become a collaborative partner that catalyzes evolution’ (p ). as discussed previously, research support librarians generally have to acquire new skills to function effectively in their newly expanded roles. in terms of partnering with researchers, however, they can also draw on the core capabilities they already have by virtue of having been trained as librarians. o’brien and richardson ( ) have discussed the core capabilities that position librarians well to be partners in the process of research. these include structured thinking, knowledge of information management theory, ability to communicate, understanding of knowledge dissemination, awareness of trends, etc. it is their ability to utilise both existing capabilities and newly-acquired skills which helps to establish the librarian as a core member of a research support team.

researcher perspective
whenever you develop a service you need to understand the perspective of the intended audience. this is fundamental to a good communication strategy. in the case of specifically supporting data management, librarians may encounter resistance.
for example, one challenge is a lack of buy-in from the data owners and creators themselves. many of the steps in data management are labour intensive and require time and money, which have often not been factored into grant-based research. one of the questions often asked by researchers is ‘why should i spend my precious time and money doing this?’ ‘so others can build on your research’ may ring alarm bells for researchers working in a highly competitive and financially constrained research environment. ‘so others can cite your data and your citation rate will improve’ is likely to elicit a request for statistically significant proof that data sharing is increasing citation rates. more recently, raising the compliance flag has had some success. ‘because your funding body/publisher says you have to’ is more likely to bring the researcher to the table. but at this point the research librarian must present a complete package of tools and procedures to make data management as seamless and pain-free as possible for the researcher. that is, he or she must ‘know their stuff’, which is a challenge in such a new profession or for those libraries developing a new service. in australia and internationally there has been much discussion about how to improve engagement and support within institutions. for example, a recent survey by research data alliance europe ( ) has made recommendations about how to improve research practice. it should not be assumed that researchers fail to exercise good practice in managing their data; some already do so using a variety of readily available tools and technologies. the challenge is how to bring about a wider change in behaviour so that all researchers understand the importance of continually reassessing their current practices and, if necessary, adopting new practices.
the issue of behavioural change within universities is emerging as a critical factor in responding to rapidly changing research practice (yanosky ; o’reilly et al ; andreoli-versbach and mueller-langer ; wolski and richardson ). the success of assisting researchers in managing data, for example, therefore depends on both the upskilling of library support staff and the willingness of researchers to engage in the process.

conclusion

university libraries are now seen as having a key role in engaging with their research community. as a result, traditional roles providing information support and training have been expanded to include support in all aspects of the research lifecycle. libraries are having to determine their approach to supporting research in their respective institutions in response to their own strategic priorities as well as those of the parent organisation. the challenge is not only to scope the changes required but also to develop and implement an effective support model. in this paper the authors have discussed their own experiences, specifically in transitioning a traditional academic librarian to a new role as a data librarian. several key findings have emerged from addressing training needs and working directly with researchers. firstly, while formal skills training is important as librarians move into new research support roles, there is also a critical need for informal training, mentoring and support networks. secondly, library roles which support research need to be scoped to determine the skills and expertise required within a team, faculty and the institution. this is because not all support librarians will have the same role, and some areas call for particular expertise, e.g. information technology, standards and project management. thirdly, there may be a need to have in-depth knowledge of the research process in specific discipline areas to be able to contribute as a full partner in the research activity.
for example, this means understanding how an ecologist, in contrast with a bioclinician, finds and collects data, and then processes and analyses it through to publication. whatever approach a library takes, there will be opportunities for libraries to respond to a rapidly changing environment through collaboration, especially in providing support networks. in australia, national bodies such as caul and ands will continue to have an important role to play in this respect.

note

. an earlier version of this paper was presented at the vala conference, melbourne, february . the substantially revised paper published in this issue of the australian library journal has been double-blind peer reviewed to meet the department of education’s herdc requirements.

references

andreoli-versbach, patrick, and frank mueller-langer. . open access to data: an ideal professed but not practised. research policy, ( ), - .
association of research libraries. . arl strategic thinking & design: a framework for the organization going forward. washington, dc: arl.
auckland, m. . re-skilling for research: an investigation into the role and skills of subject and liaison librarians required to effectively support the evolving information needs of researchers. london: research libraries uk.
australian national data service. . ands audio & video. http://ands.org.au/presentations/audio-video.html
australian national data service. . research data infrastructure: - : the ands project response to the diisr research infrastructure roadmap discussion paper. caulfield east, australia: ands.
ball, alex. . les métiers liés aux données de la recherche: data librarian. paper presented at e congrès annuel de l’adbu, le havre, september . available online: http://opus.bath.ac.uk/ / /ball_data_librarians.pdf
carlson, jake r. . "opportunities and barriers for librarians in exploring data: observations from the data curation profile workshops." journal of escience librarianship ( ): article . http://dx.doi.org/ . /jeslib. .
christensen-dalsgaard, birte, marc van den berg, rob grim, wolfram horstmann, dafne jansen, tom pollard, and annikki roos. . "ten recommendations for libraries to get started with research data management." final report of the liber working group on e-science/research data management. http://libereurope.eu/wp-content/uploads/the% research% data% group% % v % final.pdf
corrall, sheila. . “designing libraries for research collaboration in the networked world.” paper presented at liber nd annual conference, munich, - june . https://www.liber .de/fileadmin/inhalte_redakteure/ . _corrall.pdf
healey, angelica, ann morgan, and glynn stringer. . “do researchers dream of data management?” paper presented at eresearch australasia , melbourne, - october.
huggard, simon, kerry sullivan, and mina nichols-boyd. . “opening up the dialogue on research data: how librarians at la trobe university are enabling the process”. paper presented at eresearch australasia , melbourne, - october.
jaguszewski, janice m., and karen williams. . new roles for new times: transforming liaison roles in research libraries. washington, dc: association of research libraries.
lewis, martin j. . “libraries and the management of research data.” in envisioning future academic library services, edited by sue mcknight, - . london: facet publishing.
maccoll, john. "library roles in university research assessment." liber quarterly ( ): - .
malenfant, kara j. . “leading change in the system of scholarly communication”, college & research libraries ( ): - .
mertens, mike, kimberly silk, and joel herndon. . beyond data management plans, creative data services in libraries. webinar. http://libraryconnect.elsevier.com/articles/ - /webinar-beyond-data-management-plans-creative-data-services-libraries
monastersky, richard. . “publishing frontiers: the library reboot”, nature ( ): - .
o’brien, linda, and joanna richardson. (in press). “supporting research through partnership.” in creating the st-century academic library, vol. , edited by bradford eden. new york: rowman & littlefield.
o’reilly, kelley, jeffrey johnson, and georgiann sanborn. . improving university research value: a case study. sage open, ( ), .
research data alliance europe. . rda europe data practice analysis. http://admin.icordi.eu/repository/document/scienceworkshops/rda% europe_data% practice% analysis.pdf
searle, samantha. . using scenarios in introductory research data management workshops for library staff. paper presented at eresearch australasia , melbourne, - october.
simon fraser university library. “research commons: research lifecycle for graduate researchers”. last modified august http://www.lib.sfu.ca/research-commons/research/research-lifecycle
simons, natasha, and sam searle. . “redefining 'the librarian' in the context of emerging eresearch services”. paper presented at vala , melbourne, - february. http://www.vala.org.au/vala -proceedings/vala -session- -simons
stuart, david. . “libraries could play key role in managing research data”, research information : -
tanner, lindsay. . declaration of open government. canberra: australian government department of finance and deregulation.
tenopir, carole, ben birch, and suzie allard. . academic libraries and research data services: current practices and plans for the future; an acrl white paper. chicago: association of college and research libraries.
walters, tyler, and katherine skinner. . new roles for new times: digital curation for preservation. washington, dc: association of research libraries.
wolski, malcolm, and joanna richardson. . “terra nova: a new land for librarians?” paper presented at vala , melbourne, - february. http://www.vala.org.au/vala -proceedings/vala -session- -wolski
wolski, malcolm, and joanna richardson. (in press).
“improving data management practices of researchers using behavioural models.” paper to be presented at theta , gold coast, australia, – may .
yanosky, ronald. . institutional data management in higher education. boulder, co: educause center for applied research.

arl digital scholarship institute

sarah melton, boston college; michelle dalmau, indiana university; nora dimmock, university of rochester; daniel g. tracy, university of illinois at urbana-champaign; erin glass, university of california, san diego

introduction

& coalition for networked information (cni) hosts large-scale workshops on establishing and supporting digital scholarship centers; articulates a need for academic librarians to incorporate digital skills and methodologies into practice.

university of rochester hosts a mellon foundation-funded institute for mid-career librarians in digital humanities with a residential and online format; cohort model and basic methodologies were highly valued by participants.

fall association of research libraries (arl) convenes a working meeting with three-member teams from university of rochester, indiana university, university of illinois urbana-champaign, boston college, and university of california, san diego that includes dh and ds librarians, outreach and liaison librarians, and library directors. the team develops a model for a digital scholarship institute to provide a week-long, immersive experience to introduce librarians and staff who had no prior experience in digital scholarship to the methodologies and practices of ds. the planning team becomes the dsi steering committee, admissions committee, curriculum committee, and the pedagogy team.

june inaugural digital scholarship institute is held at boston college under the sponsorship of the arl academy and the five arl partners.
dsi boston participants: attendees & instructors

immersive learning by librarians for librarians

goals and curriculum

from the start, we recognized that we wanted to emphasize building a culture of digital scholarship over specific tools. the advisory group identified seven overarching learning goals for the institute. we wanted participants to be able to:
1. describe how digital scholarship fits into higher education and why academic libraries are engaging in digital scholarship
2. demonstrate confidence in their ability to engage with digital scholarship projects by developing strategies for advancing their roles as contributors, as partners and/or co-creators in digital scholarship projects
3. identify the hallmarks of digital scholarship/critical elements and methodological principles that qualify scholarly work as digital scholarship
4. evaluate different digital scholarship methodologies and tools
5. integrate existing skillsets into those needed for digital scholarship
6. envision digital scholarship as a collaborative endeavor by identifying individual researchers or local institutional units with whom they feel confident working to continue furthering their knowledge and practice of digital scholarship
7. establish an integrated cohort as part of this institute to cultivate ongoing knowledge-sharing, skill-building, and networking during and beyond the institute.

jennifer vinopal’s keynote, “discern, question, and resist,” and alex gil’s introduction to digital scholarship workshop opened the institute. on the final day, we hosted sessions on digital pedagogy and digital scholarship consultations, followed by a debrief of the institute. the institute’s immersive environment allowed participants to learn, debate, and reflect on these concepts as a cohort.

the arl advisory group recognized the importance of assessment, and considered ways to make it as unintrusive as possible.
● a private journal for daily reflection, with the goal of facilitating completion of three brief surveys that corresponded to the workshop days (tuesday - thursday) and a comprehensive survey issued on the last day. we asked attendees to reflect upon:
  ○ what they learned today?
  ○ what they would have liked to learn?
  ○ what worked?
  ○ what didn’t work?
● two flip-charts (“parking lot”) that were set up in a semi-private nook for attendees to note, anonymously, feedback that would be more time-sensitive.
● three short workshop surveys that were disseminated post-workshops each day (tuesday-thursday):
  ○ what is an example(s) of something that you learned in this workshop?
  ○ share one specific way that you might apply what you learned in your library.
  ○ what do you wish you learned today or did you expect to learn?
  ○ any additional feedback about that day: logistics, meals, breaks, workshops.
● a closing -question comprehensive survey that was distributed at the end of the th and final day of the institute.

assessment activities raised several possibilities for changes to the institute:
● streamline pre-institute communications, and set application and notification deadlines further in advance, to improve preparation for travel to and participation in the institute.
● replace hackathon with informal discussion time: participants appreciated time to talk about issues related to providing digital scholarship services.
● consider more time with a smaller set of tools, and more time for the sessions on core related librarianship issues like pedagogy, consultation, and data curation.

planning is currently underway for the arl dsi in january of , which will be hosted at uc san diego with modifications to the curriculum based on the instructional talent available at that campus and nearby. arl is also committed to creating an ongoing community of practice among attendees so as to better support their ongoing efforts in digital scholarship.
thus, there is also a continued discussion about ways to further cultivate community among the cohorts after the duration of the institute.

participants were required to submit a short application in which they detailed their experience and interest in digital scholarship. a total of participants attended. instructors were drawn from the arl dsi advisory group as well as from the library at boston college, which hosted the inaugural institute. the pedagogy team consulted with instructors to help craft lesson plans. course materials were shared on arl’s github repository (https://github.com/tech-at-arl/digital-scholarship-institute). in addition to gathering feedback from attendees, we gathered feedback from instructors at the end of their workshop sessions.

revising the institute

responses revealed participants valued the dsi program and found it useful for their future work, and especially valued the ability to build a cohort of distributed colleagues. many said they would recommend it to a colleague, and in many cases already had.

this was a real overview, and i still found that meeting people was the most important thing i did. the environment, one of isolation coupled with group activities, was very important in developing my feelings of camaraderie among the group. -dsi participant

what worked best? [a] cohort of colleagues who were on the same ds exposure level as that fostered a safe and enjoyable learning environment, / day basic introductory workshops on ds tools (as opposed to many tools at one time or discussions without tools), a nice remote location that helped create a "ds camp" feel and focus, and excursions that enabled time to strengthen ties and create a genuine sharing experience. -dsi participant

the advisory group designed a five-day, face-to-face immersive training environment in which participants learned together, shared experiences, and formed a cohort for continued support and collaboration.
● keynotes introducing digital scholarship and hot topics
● hands-on, half-day workshops on mapping, info viz, digital exhibitions, text analysis, multimodal publishing and text encoding
● hackathon
● scenario walkthroughs and use cases

the arl dsi pedagogy team was convened to help guide instructors in implementing active, measurable learning strategies such as hands-on activities and peer discussion.

assessment: methodology & findings

https://github.com/tech-at-arl/digital-scholarship-institute

/ / aac_finalreport - google docs https://docs.google.com/document/d/ qb j dsmcq rt je zanf_-bpry pemll u_qrjf /edit# /

always already computational: collections as data
final report

thomas padilla (pi)
laurie allen (co-pi)
hannah frost (co-i)
sarah potvin (co-i)
elizabeth russey roke (co-i)
stewart varner (co-i)

this project was made possible by the institute of museum and library services (lg‑ ‑ ‑ ‑ ). the views, findings, conclusions, or recommendations expressed in this publication do not necessarily represent those of the institute of museum and library services or author host institutions.

acknowledgements

the project team would like to acknowledge the institute of museum and library services, whose support made this project possible. program officers trevor owens and emily reynolds provided essential guidance throughout. patricia hswe, formerly at penn state university libraries, now at the andrew w. mellon foundation, helped spark the idea that became reality.
we thank stanford university, texas a&m university, emory university, and the university of pennsylvania for their contributions to the project. project home institutions (the university of california, santa barbara, and later the university of nevada, las vegas) provided crucial support to the project. special thanks to amy gros louis, kee choi, lonnie marshall, maggie farrell, and michelle light.

individuals listed below authored and edited project resources, participated in national forums, and presented or served on program committees for project-initiated events. many others beyond the list below contributed; we are grateful to them all.

ajao, john; almas, bridget; anderson, clifford; arroyo ramirez, elvia; averkamp, shawn; bailey, helen; bailey, jefferson; baumgardt, frederik; becker, devin; butterhof, robin; capell, laura; chassanoff, alex; claeyssens, steven; clement, tanya; coates, heather; coble, zach; collard, scott; craig, kalani; cram, greg; del rio riande, gimena; di cresce, rachel; dickson, eleanor; dombrowski, quinn; elings, mary; enderle, scott; escobar varela, miguel; ferriter, meghan; foreman, gabrielle p.; fowler, daniel; galarza, alex; gniady, tassie; gradeck, bob; green, harriett; guiliano, jennifer; hardesty, julie; harlow, christina; higgins, devin; horowitz, sarah m.; hswe, patricia; ikeshoji-orlati, veronica; jansen, greg; johnston, lisa; jordan, mark; jules, bergis; kashyap, nabil; kaufman, micki; kerchner, dan; kizhner, inna; kouper, inna; leem, deborah; lill, jonathan; lillehaugen, brook; lincoln, matthew; littman, justin; liu, alan; locke, brandon; lynch, katherine; mannheimer, sara; marciano, richard; marcus, cecily; martinez, alberto; mason, ingrid; matienzo, mark; mclaughlin, steve; meredith-lobay, megan; miller, matthew; milligan, ian; mookerjee, labanya; morgan, paige; neatrour, anna; newbury, david; nunes, charlotte; orlowitz, jake; patterson, sarah; phillips, cheryl; pollock, caitlin; porter, dot; posner, miriam; powell, chaitra; rabun, sheila; ridge, mia; rodgers, richard; romary, laurent; ross, denice; roued-cunliffe, henriette; sakr, laila; scates kattler, hannah; schmidt, ben; schwartz, daniel l.; scott weber, chela; senseney, megan; seubert, david; severson, sarah; sherratt, tim; simpson, john; souther, mary; sutton koeser, rebecca; st. onge, timothy; terras, melissa; thomas, deborah; thompson, santi; tomasek, kathryn; tracy, daniel g.;
van tine, lindsay; vejvoda, berenica; vogeler, georg; weigel, tobias; weingart, scott; whitmire, amanda; williams, elliot; wolf, nick; wrubel, laura; yarasavage, nathan; zarafonetis, michael; zastrow, thomas; ziegler, scott; zwaard, kate

scope note
introduction
activities
  about our approach
  collections as data framework (v )
    the santa barbara statement on collections as data
    collections as data facets
    collections as data personas
    things
    methods profiles
    collections as data position statements
    additional resources
  impacts
findings
  collections as data development requires critical engagement with the ethical implications of cultural heritage organization work
  collections as data development is possible at a wide range of organizations
  collections as data development benefits collection users and stewards
  challenges to collections as data development are more organizational than technical
  collections as data development benefits from engaging specific community needs
  collections as data development benefits from collaboration across multiple communities of practice
areas for further investigation
  moving from ethical consideration to action
  conducting more community-specific user studies to inform workflow development
  developing functional requirements in service to user and collection steward needs
  publicly charting and sharing the terms of relationships with commercial entities
  enabling widespread collections as data discovery
  addressing collections as data preservation needs and obstacles
  exploring post-custodial approaches to collections as data
appendices
  appendix : the santa barbara statement on collections as data
  appendix : collections as data facets
  appendix : collections as data personas
  appendix : things
  appendix : collections as data methods profiles
  appendix : national forum position statements
  appendix : forum summaries
  appendix : conference engagements, -
  appendix : digital humanities preconference: shaping humanities data

scope note

from - , always already computational: collections as data documented, iterated on, and shared current and potential approaches to developing cultural heritage collections that support computationally-driven research and teaching. with funding from the institute of museum and library services, always already computational held two national forums, organized multiple workshops, shared project outcomes in disciplinary and professional conferences, and generated nearly a dozen deliverables meant to guide institutions as they consider development of collections as data.

this report documents the activities and impacts of the always already computational project, delineates findings, and identifies areas for further inquiry.

introduction

always already computational: collections as data arose from practical need and a desire to build upon decades of digital collection practice.
while cultural heritage practitioners have broad experience                        replicating the analog experience of watching, viewing, and reading in a digital environment, they less                              commonly share the experience of supporting users who want to work with collections as data ‑ a                                  conceptual orientation to collections that renders them as ordered information, stored digitally, that are                            inherently amenable to computation. these users come from many disciplines and professions, they act                            within and outside of the university, and they share in common a desire to leverage computational                                methods like machine learning, computer vision, text mining, visualization, and network analysis.                        meeting their needs is contingent on the availability of collections, infrastructure, and services that are                              tuned for computational work.     at the time  always already computational  formed, existing experience in this space was difficult to                              discern beyond relatively well‑resourced efforts like the hathitrust research center and the british                          library. without diversification of examples and corresponding paths to doing the work, the viability of                              collections as data efforts ran the risk of being perceived as an elite activity ‑ smaller actors need not                                      apply. it became clear that a broader field of participation was needed. ideally, this field would exhibit                                  variation in institutional resources, collection types, and community responsibilities. all of the above                          would critically contend with the ethical implications of producing and making use of collections as data.                                
from ‑ ,  always already computational  sought to cultivate this field by openly documenting,                          iterating on, and sharing current and potential approaches to developing cultural heritage collections                          that support computationally‑driven research and teaching.     at inception, anticipated project outcomes were as follows: gather key stakeholders to craft a strategic                              direction that leads to  ( ) creation of a collections as data framework that supports pragmatic collection                                transformation and documentation,  ( )  development of computationally amenable collection use cases                      and user stories ( ) identification of methods for making computationally amenable library collections                          more discoverable through aggregation and other means, and (  )  guidance, in the form of functional                              requirements that support development decisions relative to technical feature integrations with                      repository infrastructures.     as synchronous and asynchronous engagements began in earnest, project scope and the shape of                            deliverables morphed accordingly. the tension between creation of particular solutions and universal                        solutions was persistent . given its nature as a broadly conceived community project,  always already                            computational  was not positioned to make overly specific technical recommendations. preference was                        ultimately given to the creation of malleable deliverables that could be shaped to guide engagement                              with particular community needs. we determined that collections as data discoverability and the                          development of specific functional requirements were projects that required independent investigation.                      
ideally these investigations will be tied to specific contexts, a framing distinct from a project like always already computational, which sought cultural heritage community-wide engagement.

always already computational deliverables constitute version of the collections as data framework. this framework includes a range of resources, expressed in different forms, providing multiple points of engagement throughout the process of considering collections as data efforts. for example, the santa barbara statement on collections as data is a set of principles developed with community feedback designed to help guide practitioners through the practical, theoretical, and ethical dimensions of collections as data work. this deliverable does not advance solutions; rather, it raises core questions to be resolved in local contexts. the collections as data facets describe a range of institutional approaches to implementing collections as data. this resource aims to help practitioners see multiple paths into doing the work. the collections as data personas represent high-level role types associated with collections as data development and use. together, the personas, derived from always already computational community engagements and project team experience, aim to surface needs, motivations, and goals in context. compiled at the end of two years of project engagements, the things provide examples of things a practitioner can do to initiate collections as data at their institution.
throughout the course of the project, always already computational was inspired and humbled by the active interest and ingenuity shown by librarians, archivists, museum professionals, researchers, educators, and more as they engaged with collections as data challenges and opportunities. by emphasizing diverse community engagement and documentation over prescriptive recommendations, we hope that we have cultivated, encouraged, and questioned in ways that a wide range of communities find to be useful.

thomas padilla
laurie allen
hannah frost
sarah potvin
elizabeth russey roke
stewart varner

activities

about our approach

from the beginning, always already computational held an expansive view of collections as data work. the project sought to document implications of collections as data work across cultural heritage organization functions, practices, and roles. a national forum with participants representing a broad spectrum of perspectives kicked off project activity. two years of synchronous and asynchronous community engagements spanning a range of professional and disciplinary contexts followed.

project activity was designed to serve three near-term goals: (1) identify cross-cutting issues and bring common themes into focus, (2) scaffold project activity with those issues and themes, and (3) identify special concerns or less clear areas that required deeper investigation. discussions at the first national forum informed overall project goals and direction.
project deliverables were iterated on over the course of the project activity. iteration was by design, given the need to engage, respond to, and incorporate diverse community input. deliverables were shared across a range of venues including but not limited to the digital library federation, american historical association, society of american archivists, the coalition for networked information, association of college and research libraries, nicar, and open repositories.
always already computational community engagements drew inspiration from human-centered design methods. the luma institute handbook of human-centered design methods and the liberating structures toolkit provided a series of generative activities:
● round robin - generate fresh ideas by providing a format for group authorship.
● concept poster - promote an idea and rally support for its development.
● affinity clustering - teams sort items based on perceived similarity, defining commonalities that are inherent but not necessarily obvious.
● importance/difficulty matrix - establish priorities by plotting relative importance and difficulty.
● 1-2-4-all - generate ideas that open with self-reflection in response to a prompt and expand into larger group discussion.
luma institute, innovating for people: handbook of human-centered design methods (pittsburgh, pa: luma institute, ); http://www.liberatingstructures.com/
http://www.liberatingstructures.com/ ‑ ‑ ‑ ‑all/
individual and group perspectives gathered through these activities directly informed the framework described below.
collections as data framework (v )
the santa barbara statement on collections as data
the santa barbara statement on collections as data is a set of principles developed with community feedback designed to help guide practitioners through the practical, theoretical, and ethical dimensions of collections as data work. this deliverable does not advance solutions; rather, it raises core questions to be resolved in local contexts. the first version of the santa barbara statement was inspired by the first collections as data national forum (uc santa barbara, march ‑ ). after its release, the team asynchronously gathered comments on the web via open annotation and sought synchronous feedback across a series of conversations and workshops. the second version of the statement was revised and released based on community feedback.
permanent link: https://doi.org/ . /zenodo.
collections as data facets
collections as data facets, authored by community contributors, document collections as data implementations. an implementation consists of the people, services, practices, technologies, and infrastructure that aim to encourage computational use of cultural heritage collections.
the fifteen facets represent collections as data efforts at museums, academic libraries, societies, and institutions like the library of congress.
permanent link: https://doi.org/ . /zenodo.
collections as data personas
collections as data personas represent high level role types associated with the development and use of collections as data. the personas aim to surface needs, motivations, and goals in context.
permanent link: https://doi.org/ . /zenodo.
 things
 things is designed for practitioners who are seeking to get started with collections as data.  things provides an impetus for exploring, learning from colleagues, deepening knowledge and understanding, and taking that first step. participants at our second national forum (university of nevada las vegas, may ‑ , ) provided the bulk of recommendations.
permanent link: https://doi.org/ . /zenodo.
methods profiles
methods profiles characterize common research methods in relation to the process of collections as data development. they are designed to help collection stewards bridge the gap between research methods and design of workflows that support creation of machine actionable collections.
permanent link: https://doi.org/ . /zenodo.
collections as data position statements (forum )
prepared by invited participants in advance of the first collections as data national forum (uc santa barbara, march ‑ ), the twenty-six position statements describe challenges, opportunities, connections, and gaps in the work of collections as data. these perspectives subsequently informed project activity.
permanent link: https://doi.org/ . /zenodo.
additional resources
● national forum  livestream recording (https://www.youtube.com/watch?v=enapv xmo i)
● collections as data google group - as of may , the google group includes  topics and  members (https://groups.google.com/forum/#!forum/collectionsasdata)
● collections as data group library - as of may , this zotero group includes  items and  members (https://www.zotero.org/groups/ /collections_as_data_-_projects_initiatives_readings_tools_datasets)
● serendipitous collections as data (https://collectionsasdata.github.io/serendipity/)
impacts
always already computational’s primary role, as expressed in the framework, was to highlight existing work, foster conversations, identify gaps, collect feedback, and spark further conversation and adoption in the context of specific community needs. the impact of always already computational is likely best measured by its potential to motivate further development.
over two years of project activity, always already computational saw collections as data:
● … taken up as a strategic priority within the university of california’s shared content leadership group’s plans & priorities for / based on the university of california library collection: content for the st century and beyond
● … incorporated as a feature of the oclc research and learning agenda for archives, special, and distinctive collections in research libraries
● … inform the creation of permanent positions like the digital collections as data manager position at johns hopkins university libraries
● … inform the creation of postdoctoral positions like the british national archives’ ftna postdoctoral research fellowship, focused on unlocking “archival collections as data”
● … identified as a core driver for an international, future of archival science curriculum effort
● … presented as a component of the digital library federation eresearch network
● … inform software preservation network outreach
● … delivered as a week-long collections as data course at the humanities intensive learning and teaching institute
● … inspire reading groups, international hackathons, workshops, and conference sessions that span disciplinary, library, archives, and museum communities.
“ / sclg plans & priorities for / based on the university of california library collection: content for the st century and beyond,” university of california, last modified september , , http://libraries.universityofcalifornia.edu/groups/files/sclg/docs/sclg_ _ % plan.pdf ; chela scott weber. “research and learning agenda for archives, special, and distinctive collections and research libraries.” oclc research, . https://doi.org/ .
/c c f ; “manager of digital collections as data.” https://jobs.jhu.edu/job/baltimore-manager-of-digital-collections-as-data-md- / / ; “developing a computational framework for library and archival education.” developing a computational framework for library and archival education. https://dcicblog.umd.edu/computationalframeworkforarchivaleducation/ ; “ftna postdoctoral research fellowship (datafication) at the national archives,” february , . https://web.archive.org/web/ /http://www.jobs.ac.uk/job/bho /ftna-postdoctoral-research-fellowship-datafication/ ; “eresearch network - dlf wiki.” accessed january , . https://wiki.diglib.org/eresearch_network#webinars ; “events | the software preservation network.” accessed may , . http://www.softwarepreservationnetwork.org/events/ ; padilla, thomas, and mia ridge. “collections as data.” hilt (blog). accessed january , . https://dhtraining.org/hilt/course/collections-as-data- / ; ermolaev, natalia. “cdh reading group: collections as data.” center for digital humanities @ princeton university, september , . https://cdh.princeton.edu/updates/ / / /cdh-reading-group-collections-data/ ; moore institute. “collections as data - hackathon / collaborative workshop - moore institute.” text. nui galway (blog). accessed january , . http://mooreinstitute.ie/event/collections-data-hackathon-collaborative-workshop/ ; dalmau, michelle. “collections as data at indiana university and beyond,” november , . https://libraries.indiana.edu/collections-data-indiana-university-and-beyond ; menendez, rebecca, cheryl miller, andrzej rutkowski, and stacy r. williams. “arlis/na th annual conference: getting started with collections as data.” accessed january ,
● … directly inform collections as data: part to whole, awarded $ , by the andrew w. mellon foundation to foster the development of broadly viable models that support implementation and use of collections as data.
in addition to tracking the various examples of impact above, the always already computational team simply asked through an open survey, “have you used this project?”
we include below a sampling of responses:
more than the resources, which i've referenced and read and looked at off and on during the project's run, we (the digital library folk at idaho) have used the idea(s) promoted through the project to stimulate our own thinking, development, and conversations. i've had other librarians i don't work with that closely bring up the project to me, and that's led to some really interesting conversations.
devin becker, university of idaho
(1) i'm leading data curation work package in a national research and data infrastructure project for humanities, arts and social sciences. i drew on the collections as data facets to augment the advice given to a colleague new to data curation, to help them think about how to make data available e.g. via api or as static snapshots in a sustainable manner, and to think about their collection "as data". (2) the facets have also informed the development of a data curation framework for data sharing and interoperability across multiple platforms (discovery, access, research and archiving).
ingrid mason, australia’s academic and research network (aarnet)
so far, we have developed one research project exploring the use of oral histories as collections as data.
collections as data has also strongly influenced our thinking of how best to digitize and make available a collection of mining records from the early s, which would be best expressed more as a database set up for computational use by researchers in addition to a traditional digital collection.
anna neatrour, university of utah
use of the [collections as data] facets was instrumental in explaining the widespread practice of working with collections as data. before this list of examples, it was a constant struggle to explain the idea and justify the work. i frequently cite the santa barbara statement when writing about the use of data in special collection libraries. i've used the personas somewhat less regularly, but i have referenced them to offer examples for what types of researchers might be interested in different types of data.
scott ziegler, louisiana state university
https://arlisna .sched.com/event/itta/getting-started-with-collections-as-data?iframe=no&w= %&sidebar=yes&bg=no ; neely, liz, anne luther, and chad weinard. “cultural collections as data: aiming for digital data literacy and tool development – mw | boston.” accessed january , . https://mw .mwconf.org/proposal/cultural-collections-as-data-aiming-for-digital-data-literacy-and-tool-development/ ; padilla, thomas, hannah scates kettler, laurie allen, and stewart varner. “collections as data: part to whole.” collections as data - part to whole. accessed january , . https://collectionsasdata.github.io/part whole/ .
i think it helps people draw the connection between digital archives and progressive values. i also think it's a helpful, positive avenue into discussing what resources are necessary in terms of storage, repository infrastructure, etc. in order to archive collections digitally, and why institutions should earmark funds and other resources to support collections as data.
charlotte nunes, lafayette college
findings
below, we share some of the clearest findings that arose from always already computational. as an overarching finding, we cannot emphasize enough the value of collaboration between staff working within and across galleries, libraries, archives, and museums. collections as data development provides concrete, generative opportunities to learn things from one another. in a university context, much is often said about interdisciplinary research and its role in addressing challenges. with collections as data, we have an opportunity to embrace the value of interprofessionalism.
the incorporation of concepts, language, and standards from multiple areas of practice allows for a more nuanced understanding of systems and the ways they can serve us and our users. as each of the findings below is considered, readers are encouraged to think broadly about the kinds of collaborations that would allow forward progress.
collections as data development requires critical engagement with the ethical implications of cultural heritage organization work
collections as data development must critically engage with bias in collection and description, archival silences, and assumptions about collection use. the viability of collections as data efforts demands critical engagement - especially as collection practices leveraging computational means like machine learning, computer vision, and more hold as much potential to harm as to help. archival approaches to provenance, with their focus on documenting the custodial and contextual history of objects, provide one path forward. ethical fault lines are often easier to see when trying to develop new policies and workflows. examination of policies and workflows should support changes in practice. prior harms should be acknowledged and remediated to the extent possible.
collections as data development is possible at a wide range of organizations
collections as data development does not depend on availability of abundant resources - the work is possible at a wide range of organizations. incremental progress is a primary feature of this work.
small scale projects, experiments, and discussions can help establish a more inclusive path forward. while discussions of api development, data download mechanisms, and what technical infrastructure to adopt are important, it is more important that space be created for meaningful collaborations to form within and outside of the cultural heritage organization.
collections as data development benefits collection users and stewards
collections as data development offers clear benefits to collection users and stewards. users gain access to machine-actionable collections that are more readily amenable to research questions, expanded forms of pedagogy, and creative work.
the value of being able to more readily apply computational methods to collections is decidedly not isolated to disciplinary researchers. cultural heritage staff increasingly use similar methods to address core challenges that include, but are not limited to, collection metadata and object remediation, expanding discovery, and critically engaging with collections. for example, always already computational has shown that cultural heritage staff are among the most prolific users of collections as data.
as dot porter, curator of digital research services in the schoenberg institute for manuscript studies at the university of pennsylvania, observed at the second collections as data national forum:
i must have known that i would use it, i just didn’t realize how much i would use it, or how having it available to me would change the way i thought about my work, and the way i worked with the collections. ... having openn as a source for data gives me so much in my curatorial role. i have the flexibility to build the interfaces i want using tools i can understand, and flexibility, easy access, familiar formats.
dot porter. “data for curators: openn and bibliotheca philadelphiensis as use cases.” remarks from the collections as data national forum  event held at the university of nevada, las vegas, may . dot porter digital (blog). http://www.dotporterdigital.org/data-for-curators-openn-and-bibliotheca-philadelphiensis-as-use-cases/ .
challenges to collections as data development are more organizational than technical
collections as data development provides a context for productive destabilization of organizational silos often predicated on the management and use of analog resources. the cultural heritage community has repeatedly lauded the capacity for collections as data work to encourage collaboration between operationally disconnected parts of a cultural heritage organization.
a successful turn towards collections as data development requires inclusive organizational experimentation - spanning archivists, technologists, subject experts, catalogers, and more. collections as data development blurs traditional divisions between cultural heritage organization work. implementation requires a combination of community engagement, domain knowledge, and the capacity for infrastructure development. holistic combination of programs, tools, and services presents the primary challenge. for example, digital scholarship groups have a role to play in catalyzing use of collections as data, but the sustainability of that effort remains challenged by integration with core digital repository efforts. as efforts in this space grow, cultural heritage organizations will need to review divisions of labor and experiment with policies and workflows that foster generative, inclusive collaborations.
collections as data development benefits from engaging specific community needs
collections as data development reaches its true potential when it engages specific community needs. collections as data designed for everyone serve no one. engagement with community needs is never complete - it requires active, ongoing, and sustained effort. what we learn from engagement directly informs programs, services, and partnerships.
beyond the question of collections as data usability, community partnerships help ensure that collections as data efforts do not result in replication or amplification of bias that harms underrepresented communities. while collections as data development will be a new experience for some, it can be an exciting opportunity to develop close collaborative relationships that go beyond the traditional roles of service provider and service consumer.
collections as data development benefits from collaboration across multiple communities of practice
given that community needs are constantly changing, collections as data are varied in implementation. efforts to meet these needs benefit from collaboration across multiple communities of practice. always already computational surfaced many communities of practice that contribute or hold the potential to contribute to collections as data development. collection stewards have deep knowledge of metadata, web archive managers and digital library managers have expertise in packaging subsets of data, historians have experience using and/or developing their own collections as data, and educators are anchored by the experience of teaching with data in a classroom setting. the efforts of a diverse group like this are brought into generative contact by shared statements of principles and tools for communicating, exchanging experience, and collaborating across communities of practice.
santa barbara statement on collections as data. always already computational: collections as data.
https://collectionsasdata.github.io/statement/
areas for further investigation
as always already computational comes to a close, we introduce a series of topics that merit further investigation. in some cases, these topics were determined to be out of scope for the project given the scale of team engagement, composition, and capacity. in other cases, topics were introduced and then reinforced by multiple community engagements, without clear resolution. we resist referring to these topics as “new” and suggest, instead, that these topics align with perennial challenges facing cultural heritage organizations.
1. moving from ethical consideration to action
the cultural heritage community works to become more knowledgeable about the negative potential of producing and using collections as data. taking that knowledge and converting it into actionable strategies, processes, and workflows that can be implemented across various stages of collection acquisition, description, and access is a prime area for further investigation.
2. conducting more community-specific user studies to inform workflow development
the viability of collections as data workflows depends on further investment in community-specific user studies. always already computational encountered repeated calls for practical resources that support collections as data development decisions relative to descriptive practices, alignment with standards, data types, and optimal delivery mechanisms.
creation of more tailored resources in this space depends on deeper understanding of user need.
3. developing functional requirements in service to user and collection steward needs
always already computational focused on documenting and/or creating tools for eliciting, describing, and communicating user and collection steward needs. the next stage of development would benefit from creation of functional requirements that reflect needs in context with specific use cases. in the aggregate, functional requirements should take into account variation in institutional resources required to implement them.
4. publicly charting and sharing the terms of relationships with commercial entities
local and global efforts to develop open data and infrastructure greatly benefit collections as data development and use. with that said, collections as data efforts will often call for interaction with commercial entities. this is likely the case from a collection standpoint given the spread of proprietary data held by contemporary companies like twitter and licensed data held by vendors. optimal practice in this space is often difficult to access, locked within non-public agreements. efforts to improve this situation should be documented and released publicly. the work should be aligned with core values that support openness and equity. securing relationships on these terms must be viable with or without the power of a prestigious institution, consortial heft, or inordinate access to capital.
5. enabling widespread collections as data discovery
approaches to making collections as data easy to find are inconsistent at best. setting aside well-known sites where large volumes of open or licensed data are systematically collected or aggregated for discovery, or institutionally-based static sites promoting their collections as data such as openn or lc labs, it often feels like one needs to know the right people to find collections amenable to computational use. how can ad hoc instances of collections as data be described and indexed for better discovery in a consistent, systematic fashion? are there approaches to encoding description - leveraging schema.org vocabularies for example - that can be developed and standardized for community adoption? are there particular platforms or systems that enable or hinder discovery, and if so, how?
6. addressing collections as data preservation needs and obstacles
as we work to develop collections as data, the matter of long-term stewardship of the products of these efforts - including source data sets and derived data sets - comes to the fore. do current digital preservation policies and resources in institutions adequately cover the requirements for ensuring preservation of collections as data? are there identifiable gaps or misalignments in resources, workflows and practices which hinder preservation, and how can they be overcome?
7. exploring post-custodial approaches to collections as data
cultural heritage organizations are not the primary repositories for collections as data.
it is neither desirable nor feasible for organizations to collect, store, and preserve all data locally, even if libraries took a collaborative approach. how can post‑custodial approaches to preservation and access in archival repositories inform collections as data work?

https://www.data.gov/
http://openn.library.upenn.edu/
https://labs.loc.gov/lc-for-robots/

appendices

appendix : the santa barbara statement on collections as data
may

the santa barbara statement on collections as data was written by the institute of museum and library services supported always already computational: collections as data project team. the first version was based on the collaborative work of participants at the first collections as data national forum (uc santa barbara, march ‑ ). after its release, the team gathered comments from the hypothesis web annotation tool and sought additional feedback across a series of conversations and workshops (april ‑ april ). the current version of the statement was revised based on that community feedback, especially the close, directed feedback provided by workshop participants at the digital library federation forum .

what are “collections as data”? who are they for? why are they needed? what values guide their development?
the santa barbara statement on collections as data poses these questions and suggests a set of principles for thinking through them, as part of a community effort to empower cultural heritage institutions to think of collections as data and consequently to explore what might be possible if cultural heritage seen in this light was more readily open to computation.

the concept of collections as data emerges at – and is grounded by – a particular moment in the recent history of cultural heritage institutions. for decades, cultural heritage institutions have been building digital collections. simultaneously, researchers have drawn upon computational means to ask questions and look for patterns. this work goes under a wide variety of names including but not limited to text mining, data visualization, mapping, image analysis, audio analysis, and network analysis. with notable exceptions like the hathitrust research center, the national library of the netherlands data services & apis, the library of congress’ chronicling america, and the british library, cultural heritage institutions have rarely built digital collections or designed access with the aim to support computational use. thinking about collections as data signals an intention to change that, and efforts like the library of congress’ collections as data: stewardship and use models to enhance access and the multinational digging into data suggest that a broader community shift intentionally scoped to institutions large and small comes at an opportune time.
while the specifics of how to develop and provide access to collections as data will vary, any digital material can potentially be made available as data that are amenable to computational use. use and reuse is encouraged by openly licensed data in non‑proprietary formats made accessible via a range of access mechanisms that are designed to meet specific community needs.

ethical concerns are integral to collections as data. collections as data should make a commitment to openness. at the same time, care must be taken to comply with legal requirements, cultural norms, and the values of vulnerable groups. the scale of some collections may also obfuscate what is hidden or missing in the histories they are perceived to represent. cultural heritage institutions must be mindful of these absences and plan to work against their repetition. documentation should be informed by archival principles and emergent reproducibility practice to ensure that users have the information they need to work with collections responsibly.

principles

1. collections as data development aims to encourage computational use of digitized and born digital collections. by conceiving of, packaging, and making collections available as data, cultural heritage institutions work to expand the set of possible opportunities for engaging with collections.

2. collections as data stewards are guided by ongoing ethical commitments.
these commitments work against historic and contemporary inequities represented in collection scope, description, access, and use. commitments should be formally documented and made publicly available. commitment details will vary across communities served by collections but will share common cause in seeking to address the needs of the vulnerable. collection stewards aim to respect the rights and needs of the communities who create content that constitute collections, those who are represented in collections, as well as the communities that use them.

3. collections as data stewards aim to lower barriers to use. a range of accessible instructional materials and documentation should be developed to support collections as data use. these materials should be scoped to varying levels of technical expertise. materials should also be scoped to a range of disciplinary, professional, creative, artistic, and educational contexts. furthermore the community should be motivated and encouraged to build and share tools and infrastructure to facilitate use of collections as data.

4. collections as data designed for everyone serve no one. specific needs inform collections as data development. these needs may be commonly held by particular user communities. rather than assuming these needs or imagining these communities, stewards should be intentional about who their collections are designed for, work to lower the barriers to use for the people in those communities, and continue to assess these needs over time.
where resources permit, multiple approaches to data development and access are encouraged.

5. shared documentation helps others find a path to doing the work. for example, collections as data work can entail decisions about selection, description, conversion, cleaning, formatting, and delivery mechanisms or platforms that enable discovery and provide access. in order for a range of individuals and institutions to engage collections as data work, it must be possible to locate documentation that demonstrates how and why the work is done. documentation must also attest to the history of how the collection has been treated over time. while no documentation can be fully comprehensive, incomplete or in‑progress documentation is better than no documentation. examples of documentation include human and machine readable metadata schemas, data sheets, workflows, application profiles, deeds of gift, and codebooks. documentation should be publicly accessible by default.

6. collections as data should be made openly accessible by default, except in cases where ethical or legal obligations preclude it. terms of use for collections as data must be made explicit and should align with community‑based practices such as rightsstatements.org and standard licenses such as creative commons, open data commons, and traditional knowledge licenses.

7. collections as data development values interoperability.
interoperability entails alignment with emerging and/or established community standards and infrastructure and eases integration with centralized as well as distributed infrastructure. this approach facilitates collections as data discovery, access, use and preservation.

8. collections as data stewards work transparently in order to develop trustworthy, long‑lived collections. trustworthiness depends upon efforts to ensure and publicly document the technical integrity of the data as well as its provenance. it also requires that data stewards acknowledge absences and areas of uncertainty within the collection as data. trustworthy collections as data should include open, robust metadata, and should be under the care of stewards and institutions committed to their preservation.

9. data as well as the data that describe those data are considered in scope. for example, images and the metadata, finding aids, and/or catalogs that describe them are equally in scope. data resulting from the analysis of those data are also in scope.

10. the development of collections as data is an ongoing process and does not necessarily conclude with a final version. work in progress status can be seen as a virtue when iteration is geared toward developing productive collaborations and integrations between new and existing technologies, workflows, and service models. the ongoing development of collections as data can impact staffing models, workflows, and technical infrastructure.
appendix : collections as data facets
august ‑ august

collections as data facets document collections as data implementations. an implementation consists of the people, services, practices, technologies, and infrastructure that aim to encourage computational use of cultural heritage collections.

facet : mit libraries text and data mining
richard rodgers, massachusetts institute of technology

1. why do it

mit libraries collect, curate, and provide access to numerous digital collections that comprise important research outputs and contributions to the scholarly record. access is typically provided via traditional web applications designed for individual users in browsers. in assessing the patterns of use of these collections, it became apparent that a significant amount of traffic was due to various automated processes that ‘scraped’ the sites, but did not identify themselves as indexing services. at the same time, we began to receive more and more direct requests from individual scholars on campus (and beyond) for bulk delivery of textual corpora in our collections, in order to perform text‑mining on them. it was clear that these ‘alternative’ uses of collections were not well served by existing access methods and systems.
2. making the case

we saw that we needed to explore how better to provide access for these kinds of use, and this need dovetailed with a broader agenda that the libraries were pursuing of reconceiving library services as a ‘platform’: a notion articulated in recommendation of the future of the libraries task force report, which specifically mentions text and data mining as important ‘non‑consumptive’ uses of library‑stewarded material. the platform model emphasizes empowering users to create their own discovery/access/consumption tools by providing open, standards‑based, and performant apis or other services that such tooling can leverage. so the case was made by arguing that an experiment to expose collection data via a new api designed for bulk access would teach us how to build a library platform that would increase the value of all collections.

https://collectionsasdata.github.io/facets/

3. how you did it

based on the analytics, we selected mit’s electronic theses and dissertations as the initial collection to work with: it was highly sought after, fairly extensive (close to k theses, with plans to digitize the entire historical run), and already under management in our institutional repository (dspace@mit). we wrote a formal proposal for a project to design and build a prototype of a new discovery and access service for this collection to enable text and data mining (or other non‑consumptive uses).
the project team consisted of:

● a project manager, who oversaw the scrum‑agile process used to manage the development
● three software developers, who took responsibility for content accession, repository management, and api design and development, respectively
● an analyst, who surveyed the field of existing text and data mining services, and who worked with potential users of the system to understand their needs
● a ui/ux expert, who helped in designing intuitive and effective user interfaces (which complemented and documented the api).

the development project ran for ‑ months, and a functional prototype was built that exposed an api for discovery and bulk access of etheses. the user could request any (or all) of the content representations: the metadata (including an abstract), the thesis as a pdf (which is the approved submission format), and the full (unstructured) text extracted from the pdf.

the service consisted of several cooperating software components: a fedora repository, which held the metadata and textual artifacts; an elasticsearch index, used to query the full‑text as well as the metadata; an api server, which formed the front‑end, exposing the ways users could interact with the index and repository; and various queues and caches to connect these components. each component was deployed in a container to a kubernetes‑orchestrated environment in a cloud service (google container engine).

the project encountered several challenges, to name a few:

● the quality of the pdfs in the collection varied considerably, with numerous encoding and other errors that affected or impaired use.
some etheses were created in digitization workflows from analogue originals, whereas others were ‘born digital’, and both content streams were created over a long span of time using different software, workflow practices, etc. we vacillated between attempting to ‘repair’ the theses and enhancing the metadata with quality indications so that machine use could adjust for it; the final prototype included aspects of both approaches.
● the cloud environment required considerable knowledge of deployment and orchestration tools and platforms that the team lacked. while we were largely able to surmount these deficiencies, we did so at some cost to the overall project deliverables. our initial resource model for the project included a ‘devops’ role (unfilled) that would have greatly assisted.
● it was difficult to identify and attract a broad variety of potential users to help define the product design. we gained valuable insight from those we engaged with, but suspected there were many more research objectives, techniques, requirements, etc that would have beneficially shaped the design of the api and the whole service. this stemmed in part from the fact that we were asking for input without a working system to react to.

4. share the docs

project documents are forthcoming, but the code that was used to run the prototype is available on github.
5. understanding use

the team solicited potential users of the ethesis service, and conducted a small number of interviews to elicit both their intended use and the affordances such a service should provide to researchers.

we learned that the metadata we exposed (academic department, completion year, degree type, etc) were considered useful ways to plumb and select within that particular corpus (etheses), in addition to keyword search over the full‑text.

the service itself was designed to gather data about how it was used, but working against this was the desire to make the data openly available to all, without ‘user tracking’. in the end, the service emerged with a lightly tiered structure: all content was freely available, but certain advanced functions required obtaining an api key (which allowed much better analytics).

6. who supports use

while the cloud‑hosted service compute infrastructure was supported by the libraries technology team, the project required considerable support throughout the libraries and archives. at mit, the responsibility for collecting and curating theses and dissertations falls to the institute archives, who were a key stakeholder in the project. they did extensive research (including soliciting advice from the institute’s legal counsel) on the ip and rights issues surrounding such a new service, since this kind of use was not originally contemplated in the policies governing theses. they also assumed general responsibility for the rare but complex decisions around takedown requests, etc.
since this service obtains content from existing digitization workflows, the digitization team was also closely involved in providing access to scripts, software tools, etc used to create ethesis artifacts.

if the service were launched in production, repository managers would need both to administer the service and to field questions and provide support for end‑users (api key management, etc). in addition, the it operations group would need to follow the standard set of practices for system backup, performance monitoring, etc. we learned that data‑intensive services such as this (where gigabytes of package downloads were routine) had to be managed carefully from a resource perspective.

7. things people should know

one key insight we gained was the need to perform a thorough appraisal of the collection from a data completeness, uniformity, and consistency perspective: when discovery and access are confined to siloed legacy applications, these quality dimensions may be difficult to observe.

8. what’s next

etds were a great candidate collection for understanding the requirements of a text and data mining service, but we have numerous text‑based collections of high value, including our extensive open access articles collection, conference proceedings, technical reports, working papers, etc.
an analysis of these corpora (what are useful metadata discriminators, etc), in light of the insights gained in the etheses prototype, could lead to a general, flexible service for offering the wealth of content the libraries has to new forms of scholarly inquiry.

facet : carnegie museum of art collection data
david newbury, carnegie museum of art; daniel fowler, open knowledge international

1. why do it

as stated on the carnegie museum of art (cmoa) website, the collection data project is meant to be used for “discovery, inspiration, and innovation, allowing people to creatively re‑imagine and re‑engineer our collection in the digital space.” cmoa collection data is stored in emu, a collections management system from axiell. this collections as data facet documents the release of this data: it was exported to both csv and json as a “data dump” and released on github for public consumption to help enable this creative reuse.

cmoa acknowledges that this project is continuously evolving and that the data will be periodically revised to reflect changes in how its curators understand the objects stored in the database. this acknowledgment is reflected in the choice of a platform (github) which natively supports storing version‑controlled data.
cmoa made the choice to publish using csv, json, and github because of their relative ease of use for researchers and developers: these formats and platforms enable easy access to large amounts of data without the need for tools beyond what the researchers already possess, and without requiring potential users to learn an api or write sql against proprietary databases.

in addition to publishing the data itself, it was also important to provide a human‑ and machine‑readable description of the data, its structure, and guidance on how to actually use it. csv, while easy to work with for many users, is a notoriously underspecified format: developers often have differing opinions on what constitutes a “valid” csv file. the data package specification developed by open knowledge international is a “containerization” format for data which is meant to provide a consistent interface (or “wrapper”) to a diverse range of datasets, especially those containing tabular data (e.g. data stored in csv files). a single file, datapackage.json, stored with the dataset, documents where each data file is saved (either on disk or a remote server) as well as its “schema” (number of columns and expected values per column). releasing this dataset as a data package was a good start for providing a minimum machine‑readable description of a dataset for processing.
a growing set of software libraries and tools can read the data package specification so that artists, data analysts, and other users interested in cmoa’s collection can benefit from this consistent interface regardless of the software they use.

a human‑readable version of some of this same information is provided through a supplied “readme” file.

collection data on github: https://github.com/cmoa/collection
data package specification: http://specs.frictionlessdata.io/

https://emu.axiell.com/
https://frictionlessdata.io/specs/

2. making the case

the case to provide the public increased access to museum data was not a difficult one at the carnegie museum of art: the museum considers engagement and education to be a core part of its mission, and firmly believes in open access as essential to museum practice. also, we were helped immensely by the fact that several large institutions, in particular moma, had already done so: rather than having to explain exactly what we were doing in detail, we could tell our administration and board that “we were doing it the way moma did it”. being able to model our work on the previous work and decisions of others helped reassure non‑technical stakeholders that we weren’t doing anything risky or controversial.
the most significant barrier was determining how to coordinate the various expectations across departments: publishing this data required coordination across registrarial, publishing, digital, and curatorial teams. additionally, it was clear that it would be important to provide all stakeholders with the ability to maintain control over their data. we provided at least six months of notice to allow the various departments time to correct any information that they felt was essential, and we also allowed anyone to hold back data that they didn’t feel was ready. all we asked for was a single‑sentence written description of why the information should not be published. this allowed stakeholders to maintain agency, while avoiding the temptation to withhold large amounts of the information by default.

finally, we had many internal discussions about how regular updates would be possible, and we worked with all the departments to craft language within the github documentation communicating that this is living data. this helped set the expectation both inside and out that this is not a publication that had been vetted by a curator for accuracy and completeness.

3. how you did it

the carnegie museum of art collections data publication was an offshoot of the art tracks project at cmoa, a data visualization for provenance.
because of the sensitive nature of provenance, one of the most important goals of the project was to ensure that the professionals with the best understanding of the nuances of the data had control over which works were available for publication. to do so, we worked with travis snyder, the collections database administrator, to craft a series of reports, using filter criteria he devised and fields he approved, that created a collection of xml reports, one per table, from the collections management system. these reports run as needed nightly, and the resulting xml is uploaded to an internal ftp site.

moma collection data on github: https://github.com/museumofmodernart/collection

a second set of custom scripts, written by david newbury, the lead developer of the art tracks project, download and transform the xml, replacing internal field names with friendlier labels and joining data across tables. these scripts also add information that is not explicitly held in our collections management system, such as the urls for the object website and images of the work. these scripts, written in ruby, are run whenever the institution wants to update the publication data.
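
the transformation step described above (relabeling internal field names and joining data across tables) can be sketched roughly as follows. this is a minimal illustration in python, not the museum's ruby scripts: the xml layout, the field names (irn, ObjTitle, CreatorRef, PartyName), and the label mapping are all hypothetical, since the text does not describe the actual report structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-table XML reports. The element and field names are
# invented for illustration; the real export layout is not documented here.
OBJECTS_XML = """
<table name="objects">
  <row><irn>1</irn><ObjTitle>untitled</ObjTitle><CreatorRef>10</CreatorRef></row>
  <row><irn>2</irn><ObjTitle>study</ObjTitle><CreatorRef>11</CreatorRef></row>
</table>
"""

PARTIES_XML = """
<table name="parties">
  <row><irn>10</irn><PartyName>jane doe</PartyName></row>
  <row><irn>11</irn><PartyName>john roe</PartyName></row>
</table>
"""

# Assumed mapping from internal field names to friendlier labels.
FRIENDLY_LABELS = {
    "irn": "id",
    "ObjTitle": "title",
    "CreatorRef": "creator_id",
    "PartyName": "name",
}


def read_rows(xml_text):
    """Parse one per-table report into a list of {field: value} dicts."""
    root = ET.fromstring(xml_text.strip())
    return [
        {field.tag: (field.text or "") for field in row}
        for row in root.iter("row")
    ]


def relabel(row):
    """Replace internal field names with friendlier labels where known."""
    return {FRIENDLY_LABELS.get(key, key): value for key, value in row.items()}


def join_creators(objects, parties):
    """Join object rows to party rows, swapping creator_id for a name."""
    names = {party["id"]: party["name"] for party in parties}
    joined = []
    for obj in objects:
        record = dict(obj)
        creator_id = record.pop("creator_id", None)
        record["creator"] = names.get(creator_id, "")
        joined.append(record)
    return joined


objects = [relabel(row) for row in read_rows(OBJECTS_XML)]
parties = [relabel(row) for row in read_rows(PARTIES_XML)]
records = join_creators(objects, parties)
```

keeping the label mapping in one dictionary means that a renamed or newly approved field only needs a change in one place, which suits a pipeline that is rerun by hand whenever the publication data is updated.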
our intention was to automate this process, but at this point, the benefit of regular, automatic updates is not yet worth the overhead of what is needed to maintain a complex automation system, for example, the time and effort required to provision servers and handle error reporting robustly. instead, the scripts have been wrapped into a single command‑line command using rake, a ruby library designed to automate repetitive tasks for programmers. the single command will download the xml, reprocess the files, generate both the json and csv representations, and then upload the generated files to github. currently, a human runs the export and will notice (and hopefully correct) any problem before erroneous data is published. one interesting fact is that this script also has to update the documentation on github. for example, we provide in the documentation the number of items in the collection.

we’ve included several data formats within the export. first, we include a csv export. in discussions with members of the pittsburgh digital humanities community, csvs were seen as the most readily‑accessible format for researchers interested in quantitative analysis of our collections information. it doesn’t require any programming ability to read it, just a copy of excel, which also means that it’s the version we show internal, non‑technical people. it is, however, somewhat limited: for instance, artworks can have one or more creators, and tabular formats like csv are not designed to handle hierarchical relationships.
we encode this data using an internal microformat (pipe‑separated values), but we’ve learned from watching users that this is confusing and non‑optimal. we’re still working to determine if there’s a better way to handle this sort of data.

the data package descriptor file, datapackage.json, which provides metadata for the csv files in the dataset, is written separately as an encapsulation of the expected output of this csv export pipeline. information about contributors to the dataset, its licensing, and the expected values per column per file is stored here.

we also provide a single large json export of the data. this is designed primarily for developers, who can load it into memory and process it directly. it’s a large file ( mb), but not so large that it can’t be held in memory on a modern computer. when we’ve held hackathons or worked with web technologists, this is the form of the data that they’ve been most comfortable with.

we also provide a directory containing a single json file for each object in the collection. this was created to approximate an api: there’s a single url that will return information about each object, as well as an index file containing a list of ids, titles, and a url to an image. however, our experience has been that this format is too complicated for both developers (who prefer the single json file) and non‑developers (who prefer the csv), and it is not used.

an additional complication for our data is that we have broken out the , + photographs of the teenie harris collection into their own file.
this collection is part of the cmoa collection, but is significantly larger than the rest of the collection combined. we found in exploring other collection data releases, such as the tate london and their collection of j.m.w. turner’s sketchbooks, that large‑scale special collections tended to drown out the rest of the collection in data analysis, and might be best considered separately. we discussed our options with the museum stakeholders, and the decision was made that publishing them as separate files, using the same format and structures and documented the same way in the github repository, was an acceptable pattern.

. share the docs

one of the most important decisions that we made was to treat the documentation for the release as of equal importance to the data. tracey berg‑fulton, the collections database associate and art tracks team member, spent a long time crafting the documentation to be thorough and friendly. friendly was important, because we knew that many of the people who would be looking at this data would be students or members of the public, and we wanted them to feel welcome to use the data. big legal disclaimers and restrictions, or dense technological jargon, might have prevented them from feeling welcome.
we also included within our documentation a table that indicates not just what the field is, but what it means, what type of data you can expect, and a real‑world example of the sort of data that field contains. we wanted to make sure that people were able to find out if our data would meet their needs without having to download it and review it.

once we had completed our documentation, we sent it through several rounds of internal review: not just editorial review, to confirm that we’d spelled everything correctly, and legal review, to make sure that we’d used the correct licenses and disclaimers, but also content review, to make sure that our examples were factual and that our descriptions captured the nuances known to the content experts. this improved the documentation, but even more it fostered the sense that the release belonged to the museum, not just to the art tracks project or the technology department.

beyond internal review, we’ve tried to consult with developers and researchers to verify that the information that we’ve provided is what is actually needed to understand our release. we also explicitly reached out to others in our field with a history of being critical of museum documentation and data, such as matthew lincoln, to critique our documentation and provide feedback on utility, comprehensibility, and completeness. we’ve also monitored other data releases across the museum field, and have worked to integrate good ideas around documentation from our peers.
we also model good collaboration by explicitly linking to and thanking the institutions that helped us through example and direct advice on this project.

finally, we’ve been working with open knowledge international to explore the use of data packages to provide an additional level of documentation for the collections data release. a data package provides a machine‑readable description of the contents of the csv file, which allows software tools and agents to both understand and validate the structural content of the data. we use it as a validation tool to ensure that all of the data published is structurally correct: for instance, that every url is a valid url, that our id numbers are in the correct format, and that every work has an accession date. our hope is that in the future additional software tools will leverage this format, but the most direct benefit to the institution has been as an exhaustive check against our data to verify that the rules that we believe are enforced actually are. we have been regularly surprised by the exceptions that we’ve found.

collection data on github: https://github.com/cmoa/collection

. understanding use

compared to an api, providing access to carnegie museum of art collection data through a data dump is a lower support cost option in terms of time and money. there is no server we need to run: cmoa is, for the moment, hosting the public data on github’s infrastructure.
providing a data dump also benefits users, both academic researchers and software developers, who might not be interested in writing code to hit an api endpoint , times to get , objects. a single file containing all the required data seems to be much easier for certain use cases.

. who supports use

mid‑size museums are not well equipped to offer support for digital resources. unlike, for instance, a library or archive, the information management and technology staff are internally focused, not public‑facing. curators, educators, and docents, who are often the public face of the museum, are often unaware that our digital resources exist.

because of this, we have worked closely with local universities, in particular the university of pittsburgh’s information science program, the carnegie mellon digital humanities program, and the frank‑ratchye studio for creative inquiry. we’ve worked with faculty and staff there, providing one‑on‑one access to curatorial and digital team members to help them enable use of these collections in their programs for teaching, research, and artistic reuse amongst their students.

finally, our hope is that through the standardization work that we’ve been undertaking with open knowledge international, the work of enabling and supporting reuse can be shared across the industry, so that we facilitate working with museum data in general, not just carnegie museum of art data.

.
things people should know

one of the most important decisions we made was to release our data under a creative commons zero (cc ) license. we were strongly influenced in this decision by cooper hewitt and the museum of modern art, as well as by conversations with the digital humanities community. attribution is extremely important to us, and we’re extremely proud of our data. but the case was made convincingly that requiring attribution would be a burden to the most innovative and essential use we wanted to enable: projects that synthesize our data with others to generate new knowledge. if we put any restriction on the reuse of the data, many potential users would feel obligated to involve legal counsel to review their use, and that burden would be sufficient to prevent their use of our data. instead of requiring attribution via a cc‑by license, we made it easy for people to give us credit: we told them how we’d like to be credited, and asked them kindly to do it. in our experience, almost every project that has used our data has credited us in some way or another.

a surprising takeaway for us has been that one of the primary users of our public data has been the museum itself. easy access to our own data has enabled internal projects to be built on top of the published data, both because it’s in an easy‑to‑use form and because of the permissive license.
all of the data available is already approved for public use, so the approval process for remixing it and reusing it is significantly easier. “it’s already public” is a wonderful way to eliminate debate as to the appropriateness of using that information in public presentations.

another important point that we missed in our initial communications is that we didn’t adequately explain how we were using github. github is an essential tool in the open source community, and that community has a set of norms around how to provide feedback and suggestions on work that is released via the tool. typically, if you found a mistake or wanted to improve a project that was available on github, you would do so through a provided mechanism called a “pull request”, where you would create a copy of the work, make the change, and ask the owner to approve merging your new version with the official version. because collections data is not a standard use of github, people were unclear whether or not we would accept corrections to our collections information through this mechanism. matthew lincoln, who originally brought this to our attention, suggested that it wasn’t important what the answer was, as long as it was clear, and so we explicitly indicated that we would not take suggestions this way, and offered an email address that would accept such changes.
this has been entirely satisfactory to all of our users, as well as to our internal staff, who were happy to accept suggestions but were very pleased to learn that they didn’t have to learn how to use github to do so.

open knowledge international is keen to work on pilots with others considering releasing high quality tabular datasets in the open: http://frictionlessdata.io

. what’s next

carnegie museum of art is hoping to release further iterations of its collections data over time. there are also now more tools that consume and generate data packages. it would be an interesting exercise to more deeply integrate features enabled by the data package descriptor. for example, cmoa can now add steps in the workflow that validate the dataset using tools like good tables (http://goodtables.io/) to ensure that the data and the expectations declared in the datapackage.json match before publishing. additionally, given the extra information stored in a data package, semi‑automated export to other backend formats or databases can be offered relatively easily depending on interest.

cmoa and open knowledge international also hope to do work that supports the automatic generation of dataset documentation, to ensure that documentation provided on github through the readme file matches that contained within the datapackage.json.
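as a rough illustration of what checking a csv against the expectations in a datapackage.json can look like, here is a minimal python sketch that uses only the standard library. it is not good tables and not cmoa's workflow: it handles a tiny subset of the table schema vocabulary (the required and pattern constraints, and the date type in its default yyyy-mm-dd form), and the descriptor, column names, id pattern, and sample rows are all invented for the example.

```python
import csv
import io
import json
import re
from datetime import datetime

# Hypothetical descriptor; a real datapackage.json would be read from disk.
DESCRIPTOR = json.loads("""
{
  "name": "example-collection",
  "resources": [{
    "path": "example.csv",
    "schema": {"fields": [
      {"name": "id", "type": "string",
       "constraints": {"required": true, "pattern": "[0-9a-f]{8}"}},
      {"name": "title", "type": "string"},
      {"name": "accession_date", "type": "date",
       "constraints": {"required": true}}
    ]}
  }]
}
""")


def check_rows(csv_text, schema):
    """Return a list of (row_number, field_name, message) problems."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # header is row 1
        for field in schema["fields"]:
            name = field["name"]
            value = (row.get(name) or "").strip()
            constraints = field.get("constraints", {})
            if not value:
                if constraints.get("required"):
                    problems.append((row_number, name, "missing required value"))
                continue
            pattern = constraints.get("pattern")
            # The whole value must match the declared pattern.
            if pattern and not re.fullmatch(pattern, value):
                problems.append((row_number, name, "does not match pattern"))
            if field.get("type") == "date":
                try:
                    datetime.strptime(value, "%Y-%m-%d")
                except ValueError:
                    problems.append((row_number, name, "not a yyyy-mm-dd date"))
    return problems


# Made-up rows: the first is valid, the second breaks two rules.
SAMPLE_CSV = """id,title,accession_date
1a2b3c4d,Untitled,1999-05-01
not-an-id,Study,
"""

problems = check_rows(SAMPLE_CSV, DESCRIPTOR["resources"][0]["schema"])
for row_number, name, message in problems:
    print(f"row {row_number}, {name}: {message}")
```

run before publishing, a check of this kind turns rules that are believed to hold (every work has an accession date, ids follow one format) into a list of concrete exceptions to review. a dedicated validator covers far more of the specification than this sketch does.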
facet : calcofi hydrobiological survey of monterey bay

amanda whitmire, stanford university libraries

. why do it

researchers are beginning to understand the magnitude and complexity of the effects of climate change on our earth system, and all research in this area is grounded in what we know about the past. data collection at sea is labor‑intensive and relatively rare, and technology has lowered that barrier only within the last couple of decades. through this lens, we understand why in the marine sciences, the most valuable data collections are observational time‑series studies, and the older the better.

when i realized the scope of the analog oceanographic data collections being housed at the miller library (a marine biology branch library in the stanford libraries system), there was no question that these materials needed to be digitized and shared openly. there are very few oceanographic time‑series studies from the s ‑ s, and these particular data only exist at our location. these data are an important contribution to studies in the marine sciences, climate change and coastal ecology. our library is located in a tsunami zone, and since we have the only copy of these data, they are at significant risk of being lost.

. making the case

stanford libraries has a digital production group (dpg) whose primary focus is digitization of library collections for the purposes of preservation and access.
given the scientific relevance of the oceanographic data and its risk of being lost, it was not difficult to convince my boss (the associate university librarian for science & engineering) to support digitization of the material.

our process for internally funding digitization projects is kept intentionally simple. any librarian in our science and engineering research group is welcome to write a “collection project proposal” (cpp; limited to a single side of one page) that describes the materials to be digitized, why they are important, what the goals for digitization would be, and an estimate of the costs. our aul reviews these on an annual basis and grants as many requests as are justified and as the budget allows. if a project idea comes up mid‑year, we can also submit a cpp as needed. i proposed a pilot project to digitize a subset of the collection, and it was funded at $ , .

. how you did it

my goals for this collection include moving a step beyond digitization of materials to create actionable datasets, but i am not prepared to address that because i am still investigating how best to accomplish such a task (automated text recognition processes, crowdsourcing, transcription services, etc.). this section will be a lot more interesting once i get there, and the project will make more sense as a cad facet at that time.

for now, i’ll focus on the process of material curation and how the digitization workflow works.
some of the process is being captured in an open science framework project page. in concise terms, this was the curation plan that i made before i started (adapted from a great poster and using common sense), and it has largely been accurate:

. inventory ‑ what do we have? how much do we have? what kinds do we have?
. organize ‑ by cruise, station, variable, year? standardize dates, stations, variables, cruise names…
. appraise ‑ are there duplicates? is anything missing? prioritize: what is most valuable or in the worst shape?
. metadata ‑ create descriptive & administrative metadata to guide digitization process: titles for collections in the digital repository, file names, etc.
. digitization ‑ stanford libraries digital production group has a well‑equipped lab and staff for systematic digitization & deposit into the stanford digital repository (sdr)
. metadata ‑ data need readme files and item‑ & data‑level metadata to facilitate understanding & reuse; metadata from the dpg needs quality assurance and remediation.
. make actionable ‑ conversion from pdf to actionable tabular data is critical for enabling reuse of the data. how do we make it happen at scale?

steps ‑ have been completed for the first batch of materials (data from every third year over the ‑year time‑series). steps ‑ are time‑intensive and the effort logically scales with the size of the collection. the dpg requires relatively little metadata to get the digitization process going, so step was brief. i am fortunate that we are so well supported by the experts in the dpg.
they require submission of a digitization proposal via a standardized form that they provide, which ended up being about pages long. based on the proposal, they provided an estimate of the digitization timeline and costs, and then moved forward.

. share the docs

as mentioned in the previous section, some content can be found at: whitmire, amanda l. . “hopkins marine station calcofi hydrobiological survey of monterey bay, ca: ‑ .” open science framework. november . osf.io/c egt .

the digitized items are not yet in the library catalog (also the discovery layer for the repository), but you can see a few examples of digitized material via direct links:

● a quarterly report: https://purl.stanford.edu/qt cq
● an annual report: https://purl.stanford.edu/dz js
● field data: https://purl.stanford.edu/xj cj
● phytoplankton data: https://purl.stanford.edu/qw yy
● zooplankton data: https://purl.stanford.edu/hy cx

. understanding use

the primary audience for these data is researchers, but i believe that they will not use the data for research purposes unless it is in a format that they can use, meaning text files with tabulated data. that is the main driver behind my desire to move a step beyond digitization (while recognizing that digitization is a critical action for these at‑risk materials).
i believe this because i used to be an oceanographer and i understand both their need for data like this and also the constraints on their time and workflows. pdfs of legacy data are nearly worthless to a marine scientist who seeks to answer research questions.

. who supports use

after the data have been fully documented and converted to spreadsheets, the goal is that they can be used largely unsupported (setting aside the tremendous amount of work that goes into maintaining the digital repository). as a subject specialist and the curator of the collection, i am available to support data users. interacting with ‑dimensional oceanographic data is generally handled in matlab (the software of choice for most oceanographers) or r (an emerging choice in this domain). i expect most users of these data to be outside of stanford.

. things people should know

this project feels important. analog research data is everywhere ‑ everywhere ‑ and we need librarians and archivists to engage with faculty who are retiring to guide them in sorting through the maelstrom. i am focused on facilitating reuse in the digital space because my audience for these data is my former colleagues and i know that’s where they operate. that said, identifying, curating, and archiving analog datasets to facilitate discovery and enable future reuse is critical. in my opinion, collections as data must necessarily extend to the analog world in order to keep up with the upcoming influx of materials from retiring faculty who worked in the pre‑digital era.
this project is an example of how we bring those data into the digital realm, but i encourage anyone interested in this type of work to reach out to faculty regarding their data. do it today.

. what’s next

the most challenging part of this process is next: going from image or pdf to spreadsheets. this is the part of the project that has the potential for real‑world impact. nothing that i’ve accomplished so far is unique (important though it is). we’ve seen crowdsourcing, and we’ve seen transcription. what researchers really need is a way to liberate all of the older, analog data from paper into the digital medium that they use. if i can make progress on addressing how we might be able to do that at scale, i’ll consider this effort a success.

facet : american philosophical society open data projects

scott ziegler, american philosophical society library

. why do it

the american philosophical society library (aps) has been digitizing historic primary sources for just about a decade. we’ve spent a lot of time smoothing out our workflow, and we feel like the process is pretty well developed. however, we’ve known for some time that the audience for these scans is limited. the vast majority of our scanned material is hand‑written (correspondence, diaries, ledgers, account books, for example). reading this handwriting can be slow, and at times is a specialization in its own right.

we wanted to make this material available in a more approachable manner.
we also wanted to give researchers an opportunity to easily interact with the material in different ways, including mapping and text analysis. lastly, we see this as an outreach opportunity. we hope to build tutorials for students at the high school and undergraduate level to learn about visualization creation and digital history.

. making the case

the administrative case for creating datasets from our collection was based entirely on our mission to increase access to our collections. this was a relatively easy case to make. however, there were additional hurdles to overcome.

primarily, there are administrative concerns that the data we put out will have mistakes. this has proven to be the case. we try to include warnings that our datasets are created with attention to detail, but that errors happen. we’re also cautious about how we label these datasets. we tend not to say that they are transcriptions (though, due to a dearth of synonyms, we do use the verb ‘transcribe’). as an organization, we benefit greatly from large and professional transcription projects, including the papers of benjamin franklin and the papers of thomas jefferson. these projects are definitive representations of primary material. our datasets are not. our datasets are our attempt to make our material more usable, and usable for different types of projects.
in making the case for doing these datasets, we agreed to be clear about what we’re putting out, to help draw a distinction between our datasets and professional transcriptions, and to supply feedback options for people who find mistakes.

. how you did it

we identified the requirements for dataset creation to be:

. ability to view a scan of the page being transcribed
. ability to simultaneously view the software that the text is being typed into
. versioning and/or revision history
. ability to share among multiple people

we experimented with a number of crowdsourcing tools, including omeka/scripto (https://github.com/omeka/plugin-scripto), omeka/scribe (https://github.com/ui-libraries/scribe), and scribe project (http://scribeproject.github.io/). however, we quickly realized that the team we were assembling was small enough to rely on more modest tools.

we ended up using google sheets as the primary tool. we used dual monitors to ensure that the person creating the dataset can easily see the scanned page as well as the spreadsheet.

for the historic prison data (https://diglib.amphilsoc.org/data), our first major step toward thinking of our collections as data, we were lucky to have two talented and devoted volunteers: kristina frey and michelle ziogas. kristina assisted in the early stages of the project, and michelle did the majority of the dataset work.

. share the docs

we don’t currently have any documentation, though we expect to create some during future projects.

. understanding use

we understand the use of our data primarily anecdotally.
we think of our datasets as a means of identifying new institutional partners and collaborators. we monitor the use of our data via these partners. for example, we created the historic prison dataset from material in our library related to eastern state penitentiary. as we did this, we contacted the staff of the eastern state historic site, and this has grown into a fruitful partnership. researchers come to our data through them, through our digital repository, and through the various third‑party services we use to host our data. several of these researchers have contacted us to offer their own data, to discuss additional projects, to show what they’re building, and to offer corrections. this has been our principal measure of success.

we do maintain some metrics. the magazine of early american datasets (https://repository.upenn.edu/mead/) records the number of times datasets are downloaded. we also have a count of how many people download from our digital repository. these are helpful and appreciated. however, the motivation continues to be the new connections we make with individuals.

. who supports use

[blank]

. things people should know

when discussing this with people at libraries similar to my own, i tend to focus on the following:

● datasets are easy to create. all you need to get started is a spreadsheet and something to transcribe.
● material is easy to identify. we look for material that will work well as spreadsheets. ledgers, printed forms, tallies, and account books are all good examples because of their recognizable and repeatable format.
● datasets are useful. you can save researchers’ time by removing the challenge of reading handwritten notes; you can put material in a format that makes it easy to map; the material can be sorted, searched, and filtered; you can promote the mission of your library.

however:

● datasets need to be managed: mistakes will slide in, and researchers will point them out; editorial decisions will need to be made, even in the most straightforward-looking material.

https://github.com/omeka/plugin-scripto https://github.com/ui-libraries/scribe http://scribeproject.github.io/ https://diglib.amphilsoc.org/data https://repository.upenn.edu/mead/

what’s next

our flagship project to date – historic prison data – has gotten some positive attention, and we’re eager to keep moving. we’ll be hosting a digital humanities fellow to focus specifically on using the historic prison data. he’ll be exploring various types of visualizations and analysis. we also hope to build a number of tutorials to encourage others to use the data for their own projects.

additionally, we’re working on two other open data projects. one involves a post office book kept by benjamin franklin during his tenure as postmaster of philadelphia. the other will involve a record of indentured individuals arriving in philadelphia during the years of - .
both of these projects will have academic advisory committees to help us strategize use cases and promote the data.

https://diglib.amphilsoc.org/data

facet : openn
dot porter, university of pennsylvania libraries

why do it

we believe that users of manuscript data should have access to first-quality images and metadata free of technical or licensing constraints, and this is what openn provides. first quality means images at the resolution at which they were captured, and authoritative metadata in archival formats presented for easy reuse by humans and machines. everything in openn is licensed as a free cultural work.

making the case

the administrative case for creating datasets from our collection was based entirely on our mission to increase access to our collections. this was a relatively easy case to make. however, there were additional hurdles to overcome.

penn libraries has a commitment to open data, and the study of manuscripts in a digital age is the central mandate of the schoenberg institute for manuscript studies (sims), which is an integral part of the library and was founded in . much of the work of sims involves the reuse of our own digital manuscript materials, and we knew in that we could not do our job without a resource like openn. so we had to make one. the director of sims made the case for openn to the director of libraries, who made the decision to invest in the creation of openn.
how you did it

in penn libraries hired doug emery, who had created systems similar to openn for other projects, and he conceived the framework. penn libraries did not at that time have a repository, so it was not in a position to host openn in an existing system. the director of sims asked the director of libraries if we could set it up through penn central computing. we started to populate openn with existing medieval manuscript image data; this was a challenge because, although most of our manuscripts had already been photographed and cataloged, the master tiff files were scattered across hard drives and servers stored in various corners of penn libraries. this work was very intensive, and was carried out primarily by jessie dummer. we chose the manuscripts because they were central to the mission of sims and because the data was good. doug emery and dot porter designed a package and metadata structure for converting descriptive marc and structural metadata into a tei format designed for use and consumption, integrating metadata with images.

once openn was populated with penn libraries manuscript data, we moved on to a second project. this project took advantage of the openn platform to gather into one location holdings from many different institutions, based around a common theme - th century travel diaries. this project has its own website, but the data served from there is all extracted from openn (http://diaries.pacscl.org/).
openn is now the host for the bibliotheca philadelphiensis project, a project to digitize most of the western medieval manuscripts in philadelphia, which received a $ k grant from clir. sims’s curator for digital research services, dot porter, is a co-pi on this project.

http://openn.library.upenn.edu/ http://openn.library.upenn.edu/ https://creativecommons.org/share-your-work/public-domain/freeworks/

openn was designed to use the simplest and least expensive technologies available for sharing images and metadata. as such, technologically it is nothing more than a webserver with a very large hard drive that runs apache and exposes the directory listings of its content. the content itself is static, comprising only images, tei/xml metadata, text manifests, and html files. this data is exposed for ease of access and ease of movement via simple, well-established internet protocols: http, anonymous ftp, and rsync. one challenge that we had during implementation was convincing our service providers that what we wanted was something as simple as openn, without a query interface or an application programming interface. technologically, openn is more like an old-style software sharing website from the s than it is a modern web application.

however, this approach does have sustainability issues.
penn libraries is currently designing and building a samvera repository, and in the future we would like the data in openn to be stored in this repository, but served in ways similar to how it is done now. storing the data in the repository will help with sustainability, and will also provide additional ways to serve the same data (e.g., using iiif protocols). however, we do plan to keep serving the data as friction-free as possible.

share the docs

we have both a readme and a technical readme file on the openn site:

http://openn.library.upenn.edu/readme.html
http://openn.library.upenn.edu/technicalreadme.html

understanding use

through openn, we provide well-structured standard packages that allow for machine and human reuse without putting any preconditions on how the data may be used. we provide the data; users can do whatever they like. we are undoubtedly openn’s primary user. we have built online bookreaders (generated with scripts from the tei/xml files) that stream image files from the openn server, and we have also built downloadable epub electronic books (also generated with scripts from the tei/xml files) that have copies of the manuscript images as part of the book.

who supports use

isc (penn central computing) maintains the computer and storage; jessie dummer and diane biunno carry out the day-to-day work of managing and adding materials to the openn website. dot porter provides curatorial advice and oversight (and is also a superuser), and doug emery wrote and maintains the software and manages the project.
things people should know

we serve digital assets on openn that represent physical materials that penn libraries doesn’t own. openn is seen by us as an outlet for materials

openn treats digital assets as originals and seeks to build up a distinctive library of assets whether those originals are housed by penn libraries or not. the open licensing in openn allows for easy collaboration with institutions local and international, many of which could not deliver this data in this quantity by themselves. it is a mistake to think that either the licensing or the ease of access to the materials is less important than the other - they are equally important.

what’s next

we are going to move openn to the library’s samvera repository to ensure preservation standards and long-term sustainability and scalability. we will maintain an openn interface to this data, but the same data will also be able to be served through other methods including iiif. we will also be expanding the content of openn from mainly medieval manuscripts to printed books and archival material.

http://bibliophilly.pacscl.org/ https://wiki.duraspace.org/display/samvera/samvera http://openn.library.upenn.edu/readme.html http://openn.library.upenn.edu/technicalreadme.html

facet : chronicling america
robin butterhof, library of congress; deborah thomas, library of congress; nathan yarasavage, library of congress
why do it

american newspapers are a valuable primary source for research and study across a wide variety of disciplines – from political history to economics to epidemiology and more. the primary goal of the national digital newspaper program is to enhance and expand access to american newspapers by providing free and open access to the data selected and gathered from institutional collections around the country to create one unified national collection of historically significant newspapers. by using open data formats, schemas, and communication protocols, and by providing bulk data downloads, we can expose the collection to a very different type of use than through an individual user-based web interface and extend the research value of the collection.

making the case

the administrative case for creating datasets from our collection was based entirely on our mission to increase access to our collections. this was a relatively easy case to make. however, there were additional hurdles to overcome.

the case for providing extended access to data had two aspects. extending uses of the collection beyond the individual user was an opportunity to allow for new and enhanced uses of the content. in addition, the software developed for managing and displaying the data created under the program uses internal apis and standard web protocols for accessing data and communication within the software. exposing these internal mechanisms to external users was a low-barrier way to extend the use of this important federally-funded resource.
how you did it

an important part of envisioning the collection as a dataset was emphasizing consistent and verified technical standardization of the file formats and metadata created under the program. to ensure this outcome, primarily for the purposes of creating a sustainable collection, the program developed highly-detailed technical requirements for data producers and provided a jhove-based java validation tool for ensuring conformance to key requirements. while minor changes have been made over the course of production years so far, the dataset is largely internally consistent. (most changes have been loosening of precise requirements rather than outright changes to technical specifications.) with a long-term vision for the program and specifically scoped goals (eventually involve all states and territories, produce x amount of data per producer per -year grant, etc.), we strove to ensure that the data we received at the end of the program (some years later) would be compatible with the data received early in the program. to that end, maintaining strict data standards using open, well-documented technical formats and a robust inventory management system have allowed us to achieve that goal to date.

https://chroniclingamerica.loc.gov/ https://www.loc.gov/ndnp/

with a reliable and consistent dataset, an access system could be built that both supported broad access to the collection and provided a robust and flexible technical environment.
the current system is built on the django web framework, written in python, and includes implementations of various open data access points and supports others. more information on these access points is available, as is the code base itself.

collaboration is a notable characteristic of the program, not only with regard to the institutions providing data, but also with regard to the staff within the library of congress. developers, digital library staff, program managers, and collection specialists alike had a stake in the development of the web site. various views were created not only to assist programmatic access to the open data for digital humanists and researchers but also for digital library staff, program partners, and collections managers at lc.

share the docs

technical requirements for creation of the dataset are part of the technical guidelines for the national digital newspaper program. the national endowment for the humanities funds state representatives to select and digitize historic newspapers from their collections to conform to technical specifications established by the library of congress. all data created under the program is delivered to the library for aggregation and public presentation, creating a large consistent dataset for historic newspapers (currently million pages/ million files).

harvest and use of the data is documented on the main web interface.
a built-in reporting feature of the django framework provides information and rss feeds supporting use of the data at http://chroniclingamerica.loc.gov/reports/. the django framework and python code itself are available on github. in addition, a listserv, hosted by the library of congress, supports data users through community input.

understanding use

learning about uses of the data is often indirect. as no api key is required to use the data, there is no register of people interested in using the data. on one hand, this is a primary driver for the adoption of the content in, for example, classroom settings. no api key means that it is very quick to get going with the content. on the other hand, it means we must infer use through various alerts and searches, for example, when we see a published article. in addition, as the content is public domain, there are no restrictions on the use of the content. this has led to a wide variety of uses, from commercial harvesting of the site to serving as a test dataset in a digital humanities class.

some methods of finding out about the data use include google alerts for the project name or social media posts, using common #hashtags like #chronam or retweets.
(a former web developer created a twitter bot, @paperbot, that retweets when someone posts a tweet with a link to one of the ndnp pages.) other methods include tracking metrics for the site; a huge traffic spike on a particular day to a particular page turned out to be a popular reddit post. similarly, if the content harvester or researcher is running into problems getting content from the site, they will reach out to us to figure out a better method. researchers will also reach out for information about how to credit the site or ask questions about the parameters of the data, either through direct contact or through the chronam-users listserv.

https://chroniclingamerica.loc.gov/about/api/ https://github.com/libraryofcongress/chronam http://www.loc.gov/ndnp/guidelines/ http://www.loc.gov/ndnp/guidelines/ https://chroniclingamerica.loc.gov/about/api/ http://chroniclingamerica.loc.gov/reports/ https://github.com/libraryofcongress/chronam https://listserv.loc.gov/cgi-bin/wa?a =chronam-users

neh also ran a data challenge in to encourage direct use of the content. this led to some outstanding projects. one tracked how biblical quotations were used within the newspaper context; another combined the data with another dataset (project hal, a national lynching database) to provide more information about specific lynchings. other researchers tracked the etymology of the word “hoosier,” extracted the agricultural news, and created an interactive visualization for following a phrase over time/location.
in the k- arena, an ap history class used digital humanities tools to look at different historical topics in the newspapers.

who supports use

there are a number of different layers that support the use of the data. inside the library of congress, the ndnp program specialists are often the first line of contact. the library of congress site provides an email contact option (ask-a-librarian), and reference specialists typically refer these questions to the ndnp program specialist. (most users review all available documentation first and tend to use email contact as the last possible option.) the ndnp program specialists tend to answer some technical questions (pointing users to csv files), data questions (questions about ocr, limitations of the dataset), and query-tweaking questions (instead of looking for fish pricing, search for specific fish prices in specific markets, such as market price of salmon in portland versus local nearby markets).

for complicated questions, there are a number of other options. sometimes the method the researcher/user is using can impact the performance of the website. in that case, the lc technology staff figure out how the researcher/user can get at the data without impacting performance (like downloading the bulk ocr bags instead of scraping the site). in other cases, the question is best answered by other users of the data. in this case, we recommend that users contact the chronam-users listserv (chronam-users@listserv.loc.gov).
for example, another user might have already figured out a way to visualize given issues in a specific state by year. as more and more users work with the data, we encourage researchers to look at prior research, and point researchers to known current research efforts underway.

publicizing and encouraging the use of the data is also mixed in with encouraging the use of the collection in general. the neh supports the use of the data, such as the data challenge described above. similarly, our education outreach team as well as national history day serve as boosters for the use of the collection in general and the use of the data. as the project is a distributed model, our state project partners (universities, state libraries, and state historical societies) encourage the use of content in the classroom, provide greater awareness of the content and what can be done with it via talks at conferences, etc.

https://twitter.com/paperbot https://www.neh.gov/news/press-release/ - -

things people should know

beyond the features that support individual web browsing, chronicling america also supports access to all data through common web protocols and formats, providing machine-level views of all data for harvesting and large-scale bulk download. as examples, researchers can harvest batched digitized page images as jpeg , pdf and/or mets-alto ocr, or bulk ocr-only batches.
each newspaper page includes embedded linked data using a number of ontologies and supports json and rdf views. us newspaper directory bibliographic records are also available as marcxml. the open api includes industry-standard endpoints like opensearch and supports stable, intelligible urls.

to accommodate data harvesting activities, the chronicling america web site infrastructure and workflow includes several features specifically designed to support such work:

1. during data ingest, additional text-only data sets are created and stored separately, ready for bulk download.
2. to create transparency and ease of access to the bulk downloadable data, feeds for the downloadable files, in both atom and json format, were added. researchers can subscribe to the feed to ensure they get any new data that is added.
3. for the interactive api (json & rdf), caching was added to provide fast responses for pages that need to be created “on the fly” by the server (as opposed to the bulk processed data that exists in flat files).

we intentionally provide access and support to users with a wide variety of needs and skills. for example, a student can download a csv file of all of the digitized newspapers available on the site; the csv file includes information about the title, first issue digitized, final issue digitized, state, etc. a researcher might be interested in large-scale text analysis; for that user, all of the ocr files have been bagged and are available for bulk download.
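as an illustration of the student use case above, here is a minimal sketch of summarizing a downloaded csv listing of digitized titles by state. the column names (`title`, `state`, `first_issue`, `last_issue`) and the sample rows are assumptions made up for this example, not the actual headers or contents of the chronicling america file; check the real file before relying on them.

```python
import csv
import io
from collections import Counter


def titles_per_state(csv_text, state_column="state"):
    """Count rows per state in a csv listing of digitized newspapers.

    `state_column` is a hypothetical header name; pass the real one
    used by the downloaded file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        state = (row.get(state_column) or "").strip()
        if state:  # skip rows with no state value
            counts[state] += 1
    return counts


# Made-up sample rows standing in for a downloaded listing.
SAMPLE = """title,state,first_issue,last_issue
the example gazette,oregon,1900-01-01,1905-12-31
the sample herald,oregon,1910-03-01,1912-06-30
the demo times,ohio,1899-05-05,1901-01-01
"""

if __name__ == "__main__":
    for state, n in titles_per_state(SAMPLE).most_common():
        print(state, n)
```

the same function could be pointed at the text of a real download by reading the file and passing its contents in; large-scale text analysis would start from the bulk ocr downloads instead.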
what’s next

planned infrastructure and interface design upgrades, as well as endeavors to integrate and streamline digital content presentations at the library, present challenges and opportunities related to api access to data collections. planning is underway to integrate the chronicling america dataset into the general digital collections of the library. providing api and bulk data download access to chronicling america data has proven to be a valuable service, and as such, maintaining equivalent or improved access after integration is a priority for the library. many of the available digital collections at the library of congress lack api documentation or bulk data access. leveraging the work done with chronicling america in these areas, more data collections at the library are expected to take advantage of the same approaches used by chronicling america in the near future.

facet : la gaceta de la habana
paige morgan, university of miami libraries; elliot williams, university of miami libraries; laura capell, university of miami libraries

why do it

the university of miami libraries cuban heritage collection (chc) received funding from lamp (latin american materials project) and larrp (latin american research resources project) to digitize its holdings of la gaceta de la habana in .
la gaceta is a significant historical resource, in that it was the paper of record during the spanish colonial occupation of cuba; and the chc holds one of the largest collections of the newspaper outside of cuba, with nearly years of issues (from - ).

as part of our regular digitization workflow, we also create a plain-text file generated through optical character recognition (ocr), in order to make digitized material discoverable through our digital collections user interface. our standard practice within this workflow has been to use uncorrected ocr. however, our digital collections interface (currently contentdm) only allows discovery, rather than any sort of analysis. associate dean for digital strategies sarah shreeves was aware of the increasing interest in text analysis as a result of digital humanities activity, and she suggested that creating a dataset that was easily accessible for use in text analysis tools would be a useful experimental project for a few members of the library’s digital strategies team. everyone involved was aware of the imperfections of the ocr’d files; but we were also aware of the relative scarcity of spanish-language datasets, and aware that if we made high-accuracy ocr a condition for release, we might never reach the point where the files were ready.
at this point in time, we are more interested in learning what is possible with imperfect ocr, and learning how we can make significant small improvements, than we are in striving for perfection on first release.

we think that it is worth emphasizing the creation of this dataset as a learning project on multiple levels. one of those levels was institutional: our goal was to understand how much work was involved in preparing a large dataset (approximately , files), and what specific steps would be part of the workflow, both for la gaceta and potentially for other datasets we might want to release in the future. on another level, it was a learning project for the three of us who were chiefly responsible because of our different backgrounds. as a digital humanities librarian without an mls/is, paige morgan brought hands-on experience with text mining, and with creating and preparing corpora, but lacked experience with corpus creation in the context of library systems for large-scale file management. conversely, elliot williams (metadata librarian) and laura capell (head of digitization) had experience with library file management, but were unfamiliar with the specific needs of researchers who might want to work with the la gaceta materials. this project was an opportunity for all three of us to begin fitting our expertise together and teaching each other enough to be able to produce materials efficiently.
we see this as valuable preparation for future similar projects where we bring in people who may have vital expertise with a particular set of materials, but who may be less familiar with the processes involved in creating machine-readable data.

https://github.com/umiamilibraries/collections-as-data/tree/master/lagaceta https://merrick.library.miami.edu/

making the case

there was considerable enthusiasm for this project from library administrators, chc curators, and library faculty alike, who were excited about providing deeper access to materials than the digital collections interface allowed. la gaceta is a significant set of texts for cuban and colonial studies, and we are excited about being able to introduce interested chc researchers and um students to text-mining techniques with materials that are directly relevant to their studies.

acting on that enthusiasm was not difficult precisely because we deliberately kept this project as low-key and low-resource-intensive as possible: three people were primarily involved, with brief consultations or assistance from three others. generating the ocr’d plain-text files is part of our existing digitization workflow, so the new activity within this project was focused on finding the best way to share the files and document how to use them. our estimate is that the total time spent on this new activity was around - hours.
keeping the project fairly low-stakes and experimental made it a more comfortable site for learning and collaboration for everyone involved. it was also helpful that our goal for this project was not just the end product of the la gaceta dataset, but also a clearer understanding of the work involved, and the resources we might need in the future (i.e., an internal data repository, rather than an external github site).

la gaceta is an interesting test case for text mining release because it’s an imperfect dataset. the paper is thin enough that opposite page images tend to bleed through, and creases and sometimes blurred text complicate the ocr process. the dataset is too large for every page to have its ocr checked individually, which makes it a more interesting test case. and even with imperfect ocr, distant reading still yields interesting results. we’re looking for repetitive errors that might be fixable using a bulk find-and-replace, and hoping that doing so will be another aspect of useful learning for our team.

how you did it

for the initial digitization process, roughly half of the la gaceta volumes were digitized in-house by um libraries personnel, and the other half were outsourced with funding from lamp and larrp. the combined output of this digitization process was approximately . terabytes of tiff files (one file for each page of the newspaper), which were ocr’d in-house.
both the tiff and plain-text files are stored in our dedicated digital collections server for preservation purposes, but for this initial release, we decided to focus on providing just the plain-text files as a bulk download, available through a github repository.

the majority of our work was about deciding how to structure the files, and how they should be named. for all of us, that meant learning about the differences between file management practices within a library context and the context of a dh researcher working with the files in a text analysis tool such as laurence anthony’s antconc or geoffrey rockwell and stefan sinclair’s voyant.

to explain: when our la gaceta holdings were prepared for digitization, they were separated into one-month chunks. within each month, there would be separate text files for each page of the newspaper, so each month would contain about files, since each issue is - pages long. we broke up the newspaper this way because although la gaceta was a daily paper, breaking it down by day would have required substantially more time, enough to be unsustainable within our standard digitization workflow. we experimented with regular expressions to see whether it would be possible to break the months into days using the first few lines, but the results weren’t quite reliable enough to be worthwhile.
one-month chunks of the newspaper worked fine for displaying la gaceta within our digital collections interface. but what would it be like for researchers to navigate those materials in bulk within a text analysis tool?

the question that emerged from this thinking was about the id for each individual .txt file, i.e. each page of the newspaper. our standard digitization workflow also generated a -character filename for each .txt and .tiff file (e.g. chc .txt). this filename is the product of our house schema for internal file management, which has worked very well in that context: library faculty and staff who use it are familiar with how the filename breaks down into segments that identify the repository, collection, object, sequence, and format. however, this filename structure is not easy for external researchers to parse, especially in tools like antconc and voyant. would we need to change the filename to something more human-readable in order to make the dataset useful? what would the stakes of that change be? as a researcher, paige wanted more legible filenames, while laura and elliot were resistant to the idea of multiple filenames for the same object, and to what it would mean for the library to potentially have to develop an alternative filename schema designed for functionality within text analysis tools.
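the regular-expression experiment described above (splitting a month of page files into days by looking at the first few lines) might look roughly like the python sketch below. this is not the code we ran: the date pattern (a spanish-language date such as "3 de enero de 1850"), the number of lines inspected, and the function names are all assumptions made for illustration. it also shows why the approach is fragile, since a single ocr error in the date line means the page is silently attached to the previous day.

```python
import re

# Assumed date format near the top of an issue's first page, e.g.
# "lunes 3 de enero de 1850". This pattern is hypothetical.
MONTHS = ("enero|febrero|marzo|abril|mayo|junio|julio|agosto|"
          "septiembre|setiembre|octubre|noviembre|diciembre")
DATE_RE = re.compile(
    rf"\b(\d{{1,2}})\s+de\s+({MONTHS})\s+de\s+(\d{{4}})\b",
    re.IGNORECASE,
)


def find_issue_date(page_text, head_lines=5):
    """Return (day, month, year) if a date appears in the first few lines."""
    head = "\n".join(page_text.splitlines()[:head_lines])
    match = DATE_RE.search(head)
    if match is None:
        return None
    day, month, year = match.groups()
    return int(day), month.lower(), int(year)


def group_pages_by_day(pages):
    """Group (filename, text) pairs, given in page order, by detected date.

    A page with no detectable date is attached to the most recent date seen;
    pages before the first detected date are grouped under None.
    """
    groups = {}
    current = None
    for name, text in pages:
        found = find_issue_date(text)
        if found is not None:
            current = found
        groups.setdefault(current, []).append(name)
    return groups
```

calling group_pages_by_day on a month of (filename, text) pairs returns a dictionary keyed by date, which makes it easy to spot months where far fewer days were detected than expected.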
making a decision about the filenames was probably the most controversial, high-stakes aspect of this project, since it felt like it had major implications both for users and for the library personnel involved. in the end, for our initial release of la gaceta files, rather than create simplified and human-readable filenames for each document, we created a roster that will allow users to match any filename to its month and year. keeping the -character filename is advantageous since researchers can use the same id number to access the page image through our catalog if they want to check the original image. as we make more releases, the question of a more human-readable filename will almost certainly come up again, and perhaps we’ll work towards that alternate schema that’s designed more for external researchers, rather than for internal library file management.

share the docs

this project is new enough that we’re still in the process of adding more formal documentation. as we have it, we’ll make it available through the um libraries collections as data website. our current introduction to the dataset (including an explanation of the filenames) is here, in our main repository.

https://github.com/umiamilibraries/collections-as-data/

for now, however, we recommend exploring this dataset with laurence anthony’s antconc. we recommend antconc for three main reasons:

1. it’s lightweight and easy to download and run on windows, mac, and linux machines.
2.
the main interface is adjustable in a way that will work well with the la gaceta filenames.
3. antconc is widely used enough that there are plenty of excellent tutorials, and even a corpus linguistics mooc based at lancaster university that features it. in short, there is lots of support for users who might want to use this dataset as they learn more about text mining.

while this dataset could also work with voyant (particularly voyant server, which doesn’t require an internet connection), the experience might be a bit rougher because of the sheer number of files involved, since even a single month includes around pages.

understanding use

because of the early stage of this project, this is an area that we’re still figuring out: we want to learn from what our users do and what they need, and continue refining this dataset or use that information to produce better datasets with future materials. one important aspect of this project is that the local campus community is relatively new to dh, and so getting to the point where we can better understand the use will involve at least some work on our part to model what use looks like. since we released this at the end of the school year, we anticipate more opportunities to figure that out this fall. we understand that our success in this area will depend on how much work we put into making sure that various communities are aware of this dataset and how to use it, and we plan to produce more materials that help them learn what they can do.
we’re very interested in responding to the needs that our users raise, and we welcome feedback and requests.

who supports use

the fully digitized version of la gaceta is supported by university of miami libraries faculty in the cuban heritage collection and faculty who work with our distinctive collections. use of the current release of the plain-text dataset is supported chiefly by paige morgan (digital humanities librarian), in collaboration with laura capell and elliot williams, as we continue to refine the dataset according to user feedback. in addition to making the dataset available for individual researchers, we are also developing lightweight plans that instructors could adapt if they wanted to use the dataset as a smaller or larger unit within a particular course.

https://github.com/umiamilibraries/collections-as-data/
http://www.laurenceanthony.net/software/antconc/
https://www.futurelearn.com/courses/corpus-linguistics
https://voyant-tools.org/

things people should know

our approach might be described as “ambitiously unambitious” in its scope, and that gave us room to think calmly and clearly about the new dataset that we were producing, and how it fit (or didn’t fit) with our existing digital collections and schema, and our local institutional practices, etc.
creating this dataset has helped to make some inchoate questions more explicit, and we think that seeing those questions more clearly is just as valuable as answering them, which we hope to do in future projects. we recommend this approach, especially for any institutions that are hoping to use the collections as data initiative as a means for helping their faculty/staff develop new skills and expertise.

what’s next

in the immediate future, we want to make sure that we put sufficient energy into outreach, promotion, and support for the la gaceta dataset, which should be valuable both as a training object for our local community, and for gathering feedback for future data releases.

we will also be looking for other materials in our collections that could be good candidates to be processed and released in formats that will be useful for digital humanities researchers. one obvious future project will be various parts of the pan american world airlines collection (http://scholar.library.miami.edu/digital/exhibits/show/panamerican), which is in the process of being digitized, but we’re certain that the pan-am collection is just one of many potential projects.

facet : text as data initiative

zach coble, new york university libraries; scott collard, new york university libraries; nicholas wolf, new york university libraries
why do it

as part of a broader text-as-data initiative, new york university (nyu) libraries is in the process of expanding access to the proquest historical newspapers collection. this project involves negotiating with the vendor for access to the corpus as a set of text files, acquiring and storing the data, and creating infrastructure to promote discovery, access, and creative uses of the new collection. at a high level, this is the type of work that librarians do every day, but the technical components of the project have presented a fresh set of challenges.

we are seeing an increasing number of requests for machine-actionable data at nyu libraries, whether in the form of full-text collections, bibliographic metadata, or both, from data researchers seeking corpora to perform topic modeling, network modeling, machine learning, and other natural language processing tests. at our university, interest in these methods has thus far come mostly from political science and the center for data science. we are simultaneously tracking the changes among publishers with regard to api access to collections, provisions for researcher worksets of publisher data, and other affordances for machine-actionable research using previously licensed content.
in anticipation of an emerging trend, several departments at the library, including digital scholarship services, data services, and digital library technology services, are eager to get ahead of this changing landscape, to shape how our relationships with content providers can enable this type of research, and to reconsider what library-provided content will look like in this environment.

making the case

as with all of our new initiatives, this one begins as a pilot. we are interested in exploring several significant questions: what is the best way to provide access to the data? how will researchers use it? a pilot provides a low-stakes mechanism to work through a set of faculty requests in order to answer these questions and then evaluate if and how we want to continue. in our experience, when we are upfront with patrons about the pilot status of a project, and make clear that we are not promising new services and that the whole thing might disappear in, say, six months, they respond favorably and appreciate the candor.

we have also found that pilots are most successful when they have wide-scale buy-in. a project like this has a variety of stakeholders, both internal (liaison, reference, collections management, data services, and metadata librarians) and external (faculty and central it). clear and consistent communication with everyone during the pilot process not only helps prevent surprises but also establishes buy-in through a collaborative work process.
https://cds.nyu.edu/
https://library.nyu.edu/departments/digital-scholarship-services/
https://library.nyu.edu/departments/data-services/
https://library.nyu.edu/departments/digital-library-technology-services/

how you did it

the project began with a faculty member asking a liaison librarian for access to government documents corpora. this prompted us to revisit our licensing terms for similar types of content, such as historical newspapers, and to look for cases where our licensing terms allow us to provide full-text content to our research community. once we realized there was potential to meet an emerging need among scholars and to leverage existing resource agreements, we convened a working group to investigate the issues.

the project has been a joint endeavor bringing together several departments, including digital scholarship services, data services, digital library technology services, subject liaisons, and collection development. each brings strengths to this team project. digital scholarship members speak to the needs of researchers working with content not traditionally seen as “data,” in this case full-text historical content. digital scholarship can also draw on past experiences in digital humanities projects that have developed key techniques in text mining that we can bring to bear on how we shape the form of the data we distribute.
data services team members bring an awareness of how researchers are wrangling, transforming, and analyzing data in data-driven projects, assisting patrons and librarians alike in how they conceive of the data embedded in the full-text content. subject liaisons will have interacted with faculty members and understand the scope of their needs. collections development can speak to the terms of licenses, will often know the institutional history of data collections acquired from vendors (often previous shipments of cd-roms, hard drives, and other storage media), and can help negotiate new terms as vendors begin to take notice of data-driven access requests.

the pilot is also a helpful use case for new mass storage services coming out of research cloud services, a joint initiative from nyu libraries and central it. specifically, we are considering providing access to the collection through nyu’s mountable storage (another pilot!), which provides remotely accessible fast-as-desktop storage that is protected and backed up. here, we will use this new storage service as a distribution point to researchers to enable restricted access that is both convenient and controlled.

share the docs

we do not have any documentation that we have permission to share at this point, although we will share it via our various channels as it becomes available.

understanding use

we have researchers interested in using the historical newspaper corpus for machine learning, topic modeling, network modeling, and other natural-language processing.
to better facilitate a variety of research uses, we are currently investigating ways to reduce the data cleaning and preparation steps that individual researchers are required to perform. one example of this is ocr correction, as preliminary samples indicate there is a fair amount of incorrectly transcribed text. additionally, the library would like to create mechanisms to query the corpus and create subcollections (e.g. by a specific newspaper, timespan, or keyword) to facilitate use by researchers who are interested in working with the content but not in massaging the data. at a broader level, the library sees this pilot as a new and creative approach to library forms of ingest, collection development, and information distribution. we want this use case to help inform our vision for next-generation library services and library collections.

https://wp.nyu.edu/library-drsr/ / / /mountable-storage-pilot-first-impressions/

who supports use

use of the historical newspapers corpus is supported primarily by data services and digital scholarship services. liaison librarians also have a significant role in outreach and patron support.

things people should know

we are still early in the process and are eager to learn from our experiences.
thus far we have found that positioning the initiative as a pilot was helpful in making the administrative pitch because it allows us to try new things and, equally important, gives us room to make mistakes. additionally, bringing in several departments has been helpful in scoping the project as well as getting buy-in from our diverse group of stakeholders.

what’s next

our next steps include plans to improve access, discovery, and outreach for the collection. after our data cleaning and processing work is complete, we want to ensure the collection is discoverable in the library catalog and other primary discovery avenues. finally, we plan to begin outreach for the collection, which could include workshops as well as class-based instructional sessions, as we’ve found that sessions working with pre-packaged data sets work better.

facet : #hackfsm

mary elings, university of california berkeley, bancroft library; quinn dombrowski, university of california berkeley, research it

why do it

in april to celebrate the th anniversary of the free speech movement at uc berkeley, the bancroft library, the research it group in the office of the cio, and the school of information at uc berkeley held #hackfsm, a hackathon around the free speech movement digital archive, as part of the digital humanities @ berkeley initiative.
the event brought together thirteen teams of uc berkeley students to design a new interface for a subset of bancroft’s digital holdings on the free speech movement.

the free speech movement was an appealing, immediately recognizable subject for the hackathon. the free speech movement is felt to be quintessentially “berkeley”, and while most students are aware of the movement, they do not necessarily understand it well. the hackathon offered an opportunity to raise awareness of the subject, and there was an available dataset to work with in the bancroft library’s free speech movement (fsm) digital archive.

making the case

the hackathon served as a valuable opportunity for groups in very different areas of the university, with different priorities and organizational cultures, to work together towards a shared vision. there were areas of administrative overlap, particularly between the library and research it groups, and clearly defining roles and responsibilities was essential. #hackfsm was a highly collaborative and interdisciplinary effort, made possible by the participation of the library systems office, library administration, bids, the school of information, arts & humanities division, social sciences, and the students from various disciplines, in addition to the bancroft library and research it. the relationships formed through participating in this hackathon have continued to benefit campus through the development of new collaborative initiatives.

how you did it

see the white paper (below).
share the docs

#hackfsm: bootstrapping a library hackathon in eight short weeks
http://research-it.berkeley.edu/sites/default/files/publications/hackfsm_bootstrapping_library_hackathon_ .pdf

abstract: this white paper describes the process of organizing #hackfsm, a digital humanities hackathon around the free speech movement digital archive, jointly organized by research it at uc berkeley and the bancroft library. the paper includes numerous appendices and templates of use for organizations that wish to hold a similar event.

http://digitalhumanities.berkeley.edu/fsm-archive-hackathon
http://bancroft.berkeley.edu/fsm/

understanding use

there was never an explicit discussion of “use”; it was left up to the individual student teams to define the audience for their project, and what “use” looked like. responses varied, and included a tool for conducting research, multiple browsing / exploration interfaces, and a few that were more like an exhibit.

who supports use

the hackfsm team included the bancroft library, the research it group in the office of the cio, and the school of information at uc berkeley. the data preparation for the api involved the library systems office and the bancroft library. in order to govern access to the library’s fsm api, research it staff used a common-good campus service (no cost to users) called api central, provided by uc berkeley’s information services and technology department.
the api central service provides a proxy to the solr api, and can be configured to require credentials in order to process an http request (credentials are the values of app_id and app_key headers set in the http request). university it staff, i-school faculty, berkeley alumni, and individuals from local tech companies served as code mentors during the hackathon. eventbrite was used for registration of participants. social media accounts (twitter and facebook) were used to promote the event. during the hacking period, students, mentors, and event organizers communicated via piazza, a free platform that offers a course-based message board, commonly used in stem courses at uc berkeley.

the library administration offered space, as did the new berkeley institute for data science and the uc berkeley school of information, for the opening and closing events. during the hackathon, students were encouraged to make use of physical collaboration space provided by our new social sciences d-lab and library.

things people should know

projects like this are highly collaborative and require technologists as well as content providers. the most successful outcome of the project was student engagement. students from across disciplines came together to build something.

maintaining the winning sites was not successful, and we need better methods and practices to preserve a record of this work.

while the main work product was a website, the greater product was that developers and humanists learned to communicate and work together.
it was humanists and technologists working and talking together, learning from and collaborating with each other in the process of building new scholarly output. hopefully events like hackfsm can prepare them for future collaborations in a research environment where such interdisciplinary projects will be more common.

what’s next

our hope is to prepare more digitized collections as data so they are ready to be used computationally. current ocr could be improved and brought to a point of being “research ready” for computational use. we plan to write a grant to prepare a large recently digitized archival collection, working with local data scientists on the requisite steps we would need to take to get the data to a point of usefulness.

facet : hathitrust research center extracted features dataset

eleanor dickson, university of illinois at urbana champaign

why do it

hathitrust digital library is a massive digital collection, comprising more than . million volumes, and growing. hathitrust aims to leverage the scope and scale of the digital library to the benefit of research and scholarship. the collection includes considerable material under copyright or subject to licensing agreements, which prohibits hathitrust from releasing much of it, either in the form of plain text files or scanned pages, as freely available data.
the hathitrust research center therefore develops tools and services that open the collection to data-driven research while remaining within the bounds of copyright and licensing restrictions, allowing only non-consumptive research.

one way the research center approaches this goal is through tools and technical infrastructure that mediate access to the data, including web algorithms researchers can run on hathitrust data, the hathitrust+bookworm visualization tool, and the htrc data capsule secure computing environment. results from a user-needs assessment for text analysis conducted by the research center, as well as anecdotal evidence from researchers affiliated with htrc, showed the value of flexible, open data for text analysis research. to this end, the research center released the htrc extracted features dataset in , which includes metadata and data derived from the hathitrust corpus. the derived “features” in the dataset include page count, line count, empty line count, counts of characters that begin and end lines, and part-of-speech tagged word counts. the first release (v. . ) included . million public domain volumes from the collection, and the second release (v. . ) opened . million volumes from the collection, representing a snapshot of the entire hathitrust digital library circa .

making the case

the htrc extracted features dataset was in part born from other projects at the research center, including the andrew w. mellon-funded hathitrust+bookworm project, that required the htrc to process full volume text into alternate formats.
the team working on these projects realized that the data they were deriving would likely be useful to researchers and satisfy the htrc’s policy for non‑consumptive research.

much text analysis research begins with the process of generating so‑called features from the original text, which are then counted and calculated to draw conclusions about the data. htrc extracted features aids the researcher by providing the data already in feature format. furthermore, this shift in format from full text to features distills the contents of the volumes into facts and metadata, discarding the original expression of the full text. the extracted features dataset therefore strikes a balance, meeting the needs of researchers in a non‑consumptive manner.

https://www.hathitrust.org/htrc_ncup
https://analytics.hathitrust.org/datasets
https://bookworm.htrc.illinois.edu/develop/

the research opportunities created by the release of htrc extracted features were understood throughout hathitrust and htrc, and after review, the dataset was released.

how you did it

deriving the htrc extracted features was largely the work of peter organisciak (university of denver), boris capitanu (university of illinois), and ted underwood (university of illinois). together they collaborated to create a data model and write code to derive the extracted features.
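as a rough illustration of what deriving page‑level features of this kind can involve, the following python sketch computes line counts, empty line counts, counts of the characters that begin and end lines, and token counts for a single page of text. it is not the htrc feature extractor: the function name and field names are hypothetical, tokenization is a naive whitespace split, and part‑of‑speech tagging is left out because it would require a separate tagger.

```python
import json
from collections import Counter


def page_features(page_text):
    """derive simple count-based features from the text of one page.

    the field names are hypothetical and are not the htrc schema.
    """
    lines = page_text.split("\n")
    # lines that contain at least one non-whitespace character
    non_empty = [line.strip() for line in lines if line.strip()]
    tokens = [tok.lower() for line in non_empty for tok in line.split()]
    return {
        "line_count": len(lines),
        "empty_line_count": len(lines) - len(non_empty),
        # counts of the characters that begin and end each non-empty line
        "begin_line_chars": dict(Counter(line[0] for line in non_empty)),
        "end_line_chars": dict(Counter(line[-1] for line in non_empty)),
        # naive whitespace tokenization; no part-of-speech tagging
        "token_counts": dict(Counter(tokens)),
    }


if __name__ == "__main__":
    sample_page = "the cat sat\n\non the mat"  # made-up page text
    print(json.dumps(page_features(sample_page), indent=2))
```

the output keeps counts but not word order, which is the sense in which a feature representation of this kind discards the original expression of the text.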
the resulting dataset includes:

* for every volume: metadata, including bibliographic metadata, word counts, and page counts.
* for every page in a volume: part‑of‑speech tagged tokens (words) and their counts, plus metadata, including information about the page (number of lines, number of empty lines, counts of characters beginning and ending lines) and the language, which has been computationally determined.

htrc extracted features are available in json format, where each file represents a volume. within the json files, data is organized by page in the volume. json is a hierarchical file format popular for exchanging data, and it lends itself well to representing book data.

htrc extracted features are available using rsync, which hathitrust tends to use to share data and is considered an efficient file transfer protocol. volumes download in pairtree format, a highly‑nested directory structure.

the data can be retrieved with a structured url that includes the standard hathitrust volume identification number. the rsync url format is: data.analytics.hathitrust.org::features/. more information about generating the rsync url can be found here: https://wiki.htrc.illinois.edu/x/oydjaq.

share the docs

the following sources contain more information about htrc extracted features.

code to extract features:
● https://github.com/htrc/htrc-featureextractor

data paper:
● organisciak, p., capitanu, b., underwood, t. & downie, s.j. ( ). “access to billions of pages for large‑scale text analysis.” iconference . wuhan, china.
http://hdl.handle.net/ /

htrc extracted features documentation:
● https://wiki.htrc.illinois.edu/x/wqcgaq

htrc feature reader toolkit:
● python toolkit for interacting with htrc extracted features: https://github.com/htrc/htrc-feature-reader/

https://linux.die.net/man/ /rsync
https://confluence.ucop.edu/display/curation/pairtree

understanding use

the htrc extracted features dataset is useful for both research and teaching. as discussed in section above, the feature format provides the data in a derived manner that aids the research process without over‑mediating access to the data. as structured and pre‑processed data, it does not meet the needs of all users, for example those whose work requires access to bigrams or longer word sequences, though it is useful for research that follows the bag‑of‑words model or that starts from token counts. demonstrated uses have shown the data’s value in large‑scale computational text analysis, such as text classification using machine learning techniques, and in the classroom for teaching data science and digital humanities. exemplary uses are outlined below.

text classification with htrc extracted features

ted underwood at the university of illinois has drawn on htrc extracted features in his research on literary genres.
his work in machine learning uses the features data, including words and word counts, characters, and computationally‑inferred, page‑level metadata, to make inferences about genre in hathitrust. dr. underwood classified volumes into the broad categories of fiction, poetry, drama, nonfiction prose, and paratext. his work classified over , volumes at the page level, and resulted in a derived dataset containing word counts by genre and by year for volumes from ‑ .

more information about this research is available on figshare: http://dx.doi.org/ . /m .figshare. .

pedagogical application of htrc extracted features

chris hench and cody hennesy at the university of california, berkeley have developed a module for the berkeley data science education program that makes use of htrc extracted features. in the first iteration of the module, students documented the use of extracted features in data visualization, mapping, and classification in jupyter notebooks. their notebooks will be re‑used in the classroom over the next year. chris will introduce the curriculum to students in his course, “rediscovering texts as data.” in that multidisciplinary, digital humanities class, students will build on the existing jupyter notebooks as they develop coding skills. chris also imagines using the notebooks in workshops with non‑programmers, where they will provide a legible introduction to text analysis by revealing how python code is used to interact with the data without requiring attendees to program.
the jupyter notebooks are shared on github: https://github.com/ds-modules/library-htrc.

who supports use

use of htrc extracted features is supported by two main groups within the htrc: the htrc tech team and the htrc scholars’ commons. the htrc tech team is composed of research programmers, software engineers, and researchers (faculty, postdocs, and graduate students) affiliated with the university of illinois school of information and the indiana university data to insight center. the htrc scholars’ commons group is made up of librarians from the university of illinois and indiana university who are affiliated with digital scholarly initiatives at their local campuses.

the tech team provides technical support for the data, including writing the code to generate the features, processing data on supercomputers at the university of illinois and indiana university to derive the dataset, and providing reliable access to the data. the htrc scholars’ commons supports research and teaching with the suite of htrc tools and services. the scholars’ commons leads workshops, conducts outreach, and offers support to researchers who have questions about using the dataset. the htrc tech team and scholars’ commons have collaborated on questions of data curation and preservation of the dataset, discussed in more detail in section below.
things people should know

at the scale of hathitrust, challenges to access and storage become particularly acute. crunching feature data for millions of files is computationally expensive and requires access to high‑performance computers. hathitrust is also a non‑static collection: volumes are added daily, and (with less frequency) volumes are removed. for these reasons, htrc has versioned the dataset following a “snapshot” model. due to the time it takes to generate the features, the dataset will never be exactly current with the hathitrust digital library, but instead captures the collection at a moment in time. the research center continues to provide access to both extant versions of the dataset, v. . and v. . , but in the future may have to look to alternate models for access to versions. each version of the dataset is terabytes in size, and storage may prove an issue if every new version includes features for the entire corpus.

others interested in creating derived datasets as a model for opening access to restricted collections should consider what features would be useful to their researcher community. in addition to the token (word) counts, htrc extracted features includes additional metadata, some of it processed from marc records and some calculated during feature extraction, that we hope provides valuable context for researchers who want to make use of the dataset. other collections with other perceived user communities may want to include additional features.
what’s next

as hathitrust continues to grow, the htrc extracted features dataset will be periodically updated with new versions. between the first and second releases of the dataset, significant changes were made to simplify the data model, which required all of the data to be re‑crunched. in future releases, only new or differing files may need to undergo feature extraction. still, there are some issues in the existing data, primarily related to the tokenization of chinese‑, japanese‑, and korean‑language text, that htrc plans to improve on in future releases.

https://ischool.illinois.edu/
https://pti.iu.edu/centers/d i/people.html
https://wiki.htrc.illinois.edu/pages/viewpage.action?pageid=
https://wiki.htrc.illinois.edu/display/com/extracted+features+dataset

facet : beyond penn’s treaty

michael zarafonetis, haverford college; sarah m. horowitz, haverford college

why do it

at haverford, we believe that libraries should move beyond the creation of digital images of original sources. digital materials should allow scholars to do interesting and amazing things with our unique collections beyond what is possible with their physical incarnation rather than trying to replicate the experience of the original. we believe that “digitization” encompasses all of this work, rather than just the creation of images.
as part of our efforts to make our collections available to a wider set of users and to be used in new and interesting ways, we have developed a number of projects that use this expansive definition of digitization with public‑facing websites that facilitate exploration of the collections.

beyond penn’s treaty fits into this effort for a number of reasons. while it includes digital images of materials (primarily journals and letters written by quaker travelers in the late eighteenth and early nineteenth centuries), it also has added value in the form of tei‑encoded and linked text, as well as further information on the people, places, and organizations encoded. the materials from quaker & special collections included in the project are frequently requested, making them good candidates for digitization and wider distribution.

making the case

the types of materials included in this project are some of the most requested by researchers and scholars using quaker & special collections. many of the included documents had only recently been cataloged as part of a grant‑funded project. because much of the work for the project was in‑scope for the digital scholarship team (creating databases, writing code, etc.), we needed only informal approval from the library director. she approved it based on the project’s ability to showcase these newly‑cataloged materials and add to our growing collection of digital collaborations between quaker & special collections and digital scholarship.
https://pennstreaty.haverford.edu/
https://github.com/hcdigitalscholarship/penns_treaty_data

how you did it

we collaborated with colleagues at the friends historical library (fhl) at swarthmore college to add their materials to the digital collection of travel journals and letters. items from haverford and fhl were scanned in their respective departments. the digital scholarship team at haverford, at the time composed of two ds librarians and several student assistants, then migrated the digital objects from a contentdm instance to a locally hosted omeka instance with the scripto/scribe plugin and theme to facilitate transcription. student workers in the library (in both ds and quaker & special collections) transcribed materials during their shifts. summer interns at swarthmore ( ) and haverford ( ) encoded the materials in tei xml and shared those transcriptions in a google drive folder while also producing a master database (google sheet) of biographical, location, and organization records. an additional intern also worked on cleaning geographical data and building maps tracing travel routes recorded in the documents. student interns were overseen by staff from quaker & special collections and digital scholarship with expertise in the subject, technologies used, and metadata.
pat o’donnell at fhl provided subject expertise in quaker biography and history, as well as experience with authority control for quaker records, to help build out the database and provide quality control for the records created. the transcribed and encoded documents are made accessible to the public in a custom‑built django site, beyond penn’s treaty, that provides multiple entry points to the collection. users can explore several maps that trace the routes of quaker travelers and search across the entire collection for person, place, and group names. the encoding of the documents creates future opportunities for visualizing the collection based on researcher interests.

share the docs

the tei xml documents are publicly available in a github repository, as is the code for the django site. we have a google doc with instructions for scanning, transcribing, and encoding materials.

understanding use

like most of our digital scholarship projects, beyond penn’s treaty is outfitted with google analytics to allow us to track basic metrics of use on the page. however, beyond that, our data about use is mostly anecdotal. since we provide all the materials for people to download and use, we only hear about these uses if they get in touch. because the project is relatively new, we are not yet aware of any major uses of this data.

who supports use

use of the data is supported by digital scholarship and quaker & special collections.
the coordinator for digital scholarship and services and the digital scholarship librarian have led the development of the django site, with regular input from the head of quaker & special collections. in the past year, encoding and transcription work and some of the django development have also been managed by our metadata librarian, who has dedicated time for ds projects built into their job responsibilities and is a member of the ds team. special collections and ds staff continue to work together to identify funding opportunities and to create student internships to continue the digitization, transcription, and encoding of new materials.

things people should know

much of the work involved with this project was done by student interns. this is a familiar model for us, and one that works well in an undergraduate liberal arts setting. working with students is not necessarily less work than doing such a project in other ways, however, as they need a good deal of oversight and supervision. such deep opportunities can be transformative experiences for students and rewarding for all those involved in such projects.

while this was a new project for us, it built on other work we had done.
we have used django as the framework for a number of other projects, such as quakers & mental health, and the transcription and transformation process we employed was similar to that of the ticha project. the project also built on the strong collaboration between digital scholarship and quaker & special collections.

https://github.com/hcdigitalscholarship/qi/tree/master/qi
https://docs.google.com/document/d/ amwzchuydaagk -tad fyqcfogxairpuggodnff h a/edit
http://qmh.haverford.edu/

what’s next

since all of the documents in the project are encoded in xml, we can create visualizations of many different kinds to explore the collection as a whole and the connections between people, places, and groups within it. we also hope to integrate the people, places, and organizations that have been encoded into a quaker linked data project that we are building. this application will allow researchers to explore connections across our entire suite of quaker projects.

https://ticha.haverford.edu/en/

facet : ticha: a digital text explorer for colonial zapotec

brook lillehaugen, haverford college; michael zarafonetis, haverford college

why do it

the digitization, transcription, and encoding of these documents is part of dr.
brook lillehaugen’s linguistics research on the zapotec family of languages in the oaxaca region of southern mexico. the documents include printed texts and manuscripts written by spanish monks, bills of sale, religious testaments, land deeds, and other manuscripts that include the spanish, latin, and zapotec languages. the work has been done over the past several years and continues as the project team explores more archival material in mexico. the transcription and encoding are crucial to creating a digital annotated version of colonial‑period texts that include the zapotec language, with morphological analysis included within the texts. additionally, the public interface features a transcription tool that allows the public to transcribe documents, providing avenues for students, other scholars, and indigenous community members to engage with the materials.

making the case

no administrative case needed to be made, as digital scholarship staff in the haverford library support faculty and student research. this project is essential to dr. lillehaugen’s research. the main institutional or administrative barrier is obtaining permission from various mexican archives to make the images publicly available.

how you did it

the project is composed of several workflows. the first is digitization of archival manuscripts (bills of sale, religious testaments, etc.), which is done primarily by project team members: faculty, student research assistants, and librarians. the ticha project employs a postcustodial approach to the creation of the digital archive.
the digital images are organized and stored in a dropbox folder, and uploaded to an omeka instance with the scribe/scripto theme and plugin combination. there they are described by student assistants, and made available for transcription. once the transcriptions are complete, they are visible alongside the image of the manuscript.

for printed texts and bound volumes, transcription and encoding is done by students in dr. lillehaugen’s colonial valley zapotec class. using git and github for version control, students transcribe texts digitized at the internet archive and push their work to a remote repository. making several passes at their assigned sections, they encode for language, outline structure, and formatting in tei xml markup. we chose tei to adhere to an encoding standard for texts, and to draw comparisons across texts in the growing collection. this xml markup is merged with an export of morphological analysis from the fieldworks language explorer (flex), a popular software package in the field of linguistics, and the merged result is then rendered into html for the public site.

https://github.com/hcdigitalscholarship/ticha-xml-tei
https://www .archivists.org/glossary/terms/p/postcustodial-theory-of-archives
https://software.sil.org/fieldworks/

the public website is built in django, a python framework for the web, because many of our student assistants are computer science majors who learn python in their introductory courses.
using the omeka api, we can update the data and metadata in the archival materials section of the site by running a python script. we also provide a download link to the plain text transcriptions of each page on the website. a bulk download option of all texts is coming soon.

share the docs

most of our documentation is in the github repository for the encoded texts.

understanding use

the materials on the site can be used freely under a creative commons attribution and share‑alike license. the encoded transcriptions are of research value to dr. lillehaugen and linguists who study the zapotec family of languages. access to the documents (both the digitized originals and the transcriptions) is important for community members to explore their language and history. it is by soliciting direct input from these community members and from workshops in oaxaca that the public interface facilitates this exploration. we continue to consult our zapotec‑speaking collaborators on design and interface questions.

by providing access to the encoded texts in tei xml, we hope that scholars can find interesting ways of visualizing the collection.

we use google analytics to track usage of the project, and to help us make design decisions.

who supports use

the digital scholarship team in the haverford library provides technical support for the project, with server space for the public interface provided by instructional and information technology services.
mike zarafonetis (coordinator for digital scholarship and services and a project team member) and andy janco (digital scholarship librarian) provide project management and technical support for the project. technical work (tei quality control, django project feature development, etc.) is done by student research assistants and ds student assistants. ds also provides instructional support for dr. lillehaugen’s class, in which students collaboratively transcribe and encode the larger printed texts.

things people should know

this project is very inclusive of undergraduate students in the work of transcribing, encoding, and developing the web platform for the public site. this is a model that is familiar to us in the haverford libraries, and one that is aligned with our goals as a liberal arts institution. these students require a good deal of instruction and supervision, but such deep opportunities can be transformative experiences for them and rewarding for all those involved in such projects.

additionally, members of the project team are very intentional about incorporating feedback from zapotec‑speaking community members. the transcription feature, for example, grew out of a request from speakers of the language who wished to contribute to the project. thinking expansively about our user base, particularly beyond a strictly scholarly audience, is important.

what’s next

we continue to add more archival manuscripts and bound texts to the public interface.
students are currently encoding and transcribing fray leonardo levanto’s arte de la lengua zapoteca, and we hope to have the encoded version completed by the end of . the next printed text for transcription, encoding, and analysis will be juan de cordova’s vocabulario en lengua zapoteca.

we also plan to add interlinear analysis of the zapotec language to the archival manuscripts in the near future, which breaks down glosses by component parts. interlinear analysis is already in place for some of the printed texts (see this example page from juan de cordova’s arte).

https://ticha.haverford.edu/en/texts/cordova-arte/ /original/

facet : vanderbilt library legacy data projects

veronica ikeshoji‑orlati, vanderbilt university

why do it

the jean and alexander heard library has become the repository for dozens of digital projects executed across the university. as stewards of these digital collections (encompassing databases, archives, e‑editions, and exhibitions), it is incumbent upon us to ensure not only the availability, but also the accessibility of these resources to current and future generations. every digital project is the product of hundreds, if not thousands, of hours of intellectual labor. facilitating (re)use of the contributions of digital scholarship pioneers and practitioners requires that their work be thoughtfully curated, documented, and made publicly available.
making the case

the administrative case for instituting a “data‑first” policy of distilling the content and structures of digital projects into machine‑actionable datasets is driven not only by ideological considerations but also by practical ones. fundamentally, the infrastructure to support continued development of sunsetted digital projects without personally invested stakeholders is lacking. the time and expertise required to satisfactorily migrate and maintain all sites built in drupal , for example, are not fiscally viable if the library is to care for an ever‑burgeoning collection of digital projects. in addition, the clir postdoctoral fellowship program in data curation has allowed the library to experiment with integrating digital data curation practices into digital scholarship workflows.

how you did it

the first dataset curated by current clir postdoctoral fellow veronica ikeshoji‑orlati is the e‑edition of raymond poggenburg’s charles baudelaire: une micro‑histoire. poggenburg initially published the micro‑histoire in as an entry‑based chronology of the life of charles baudelaire ( ‑ ). in the early s, an expanded e‑edition of the micro‑histoire was published by the vanderbilt university press and jean and alexander heard library. in , due to the deterioration of the perl framework on which the e‑edition was built and the library’s desire to increase the accessibility of the micro‑histoire’s contents, the data and metadata from the relational database underlying the e‑edition were extracted into csv format.
data cleaning was accomplished with openrefine, and the library of congress metadata object description schema (mods) version . was selected for structuring the data and metadata in xml format. the dataset is currently in a github repository awaiting legal counsel’s approval for public release. the process of curating the micro-histoire dataset was presented at the idcc conference.

http://diglib.library.vanderbilt.edu/baud-search.pl
http://www.loc.gov/standards/mods/
http://www.dcc.ac.uk/sites/default/files/documents/idcc ~/presentations/vai-cba_idcc _presentation.pdf

. share the docs

legacy data curation protocols and institution-wide data management policies are currently being drafted. each project, in its public release through the library github account, is accompanied by documentation specific to that project.

. understanding use

our goal in making vanderbilt’s digital project datasets publicly available under cc , cc-by, or cc-by-nc licenses (as appropriate) is to facilitate (re)use of the data in research and teaching contexts. it is anticipated that the communities currently utilizing the digital projects will engage with the curated datasets for their research purposes. in addition, new users interested in scholarly meta-analyses or large-scale quantitative research may incorporate the library’s datasets into their work.
in the case of the poggenburg micro-histoire dataset, for instance, baudelaire scholars are the most likely audience, but those interested in broader questions in french history and literature may find the data of use, too. while the users for each dataset may differ, it is hoped that the curated datasets will also be of service to teachers working with students to learn how to interrogate humanities and social science data in meaningful and methodologically sound ways.

. who supports use

members of the digital scholarship and scholarly communications team in the jean and alexander heard library are the primary facilitators for data acquisition, curation, publication, and use projects on campus. a new position, the curator of born-digital collections, has been created in order to continue curation efforts on library-housed digital datasets. in order to encourage campus use of the datasets, the digital scholarship team conducts regular workshops and hosts working groups in linked data and the semantic web, tiny data (data curation for the humanities), gis, and xquery to develop a cohort of data-literate faculty, staff, and students around campus.

. things people should know

as many data curators may already know, an overwhelming majority of one’s time is given over to data cleaning and standardization. to successfully run a data curation program within a library, it is critical to translate the lessons learned in curating legacy data sets to training programs in data management for researchers across campus.
the data-driven research projects of today are the data curation challenges of the future, so establishing sound data management practices in current digital projects streamlines the process of ingesting them into the library’s collection when they are completed. in addition, a data curation program must be grown in tandem with digital scholarship education infrastructure in order to arm teachers and researchers with the programming skills required to grapple with the curated datasets.

http://heardlibrary.github.io/
https://www.library.vanderbilt.edu/scholarly/
https://www.nytimes.com/ / / /technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html

. what’s next

currently, veronica ikeshoji-orlati is curating the tv news dataset, a collection of nearly . million abstracts of news broadcasts from abc, cbs, nbc, cnn, and fox news dating back to august , . the vanderbilt television news archive (https://tvnews.vanderbilt.edu/) is one of the richest resources for us news reporting in the th and st century, but access to the metadata is limited by the current web interface.
in order to facilitate not only improved discoverability of news segments but also quantitative analysis of the dataset as a whole, ikeshoji-orlati is collaborating with suellen stringer-hye (linked data and semantic web coordinator), steve baskauf (senior lecturer of biological sciences), zora breeding (cataloguing and metadata team leader), and jacob schaub (music cataloguer) to map the dataset to the iptc newscodes vocabulary (https://iptc.org/standards/newscodes/). in addition, she is working with lindsey fox (gis librarian) to enrich the dataset with geospatial data.

facet : the museum of modern art exhibition index
jonathan lill, moma archives

. why do it

since , the museum of modern art (moma) has been and remains the preeminent art institution in the history of th and st century visual culture. through groundbreaking exhibitions about cubism, abstract art, surrealism, and other art movements, moma led the way in promoting artists who are now household names. moma established a holistic approach to the understanding of modernism by exhibiting and establishing curatorial departments devoted to film, architecture and design, and photography. moma demonstrated that those fields of activity were worthy of critical analysis and appreciation.

the museum archives works continually to tell that history of the museum, and to organize and provide access to the documents and records that evince those decades of activity.
we strongly believe that exhibition history is an important scaffold that can be used to build an understanding of moma’s accomplishments. indexing exhibition artists and curators provides researchers new pathways of exploration while linking archival resources and artworks in the collection. this work helps increase exposure and use of moma archives’ historical collections and the dissemination of moma’s history.

. making the case

in the moma archives received funding to organize and describe moma’s exhibition files, which comprised paper records from all curatorial departments and the museum registrar for exhibitions staged since . we decided that an exhibition index could be built as part of that project workflow. through our experience fielding public and staff inquiries and guiding user research, the archives had developed an appreciation of the utility of an exhibition index. how this data might be made available to researchers was unknown at the inception of the project.

while the archives worked on this project, moma hired a new director of web and video who was given the mandate of radically expanding the museum’s web content. she understood that our data could power the deployment of thousands of new web pages devoted to historical exhibitions, which could then be linked to numerous digital resources such as scanned press releases, exhibition catalogues, and installation photographs. only with the web team pushing this project forward was the archives able to move to completion.
the new exhibition pages launched in september . the data set was published to github at the same time (https://github.com/museumofmodernart/exhibitions).

. how you did it

the moma archives had long maintained a simple list of historical exhibitions. i built an access database, parsed that list, and imported a table of over , artist names from the museum’s collection management system (the museum system, tms, vended by gallery systems). i created a simple interface that allowed interns to connect names to each exhibition using drop-down menus and, when necessary, to create new name records. additional data was gathered from exhibition checklists scanned as part of the larger exhibition files project. the database structure allowed for easy review of the data, error checking, editing, and other maintenance. once the indexing was largely completed, names in the index were reconciled to viaf identifiers using openrefine. the viaf ids were then used to add wikidata qids and getty ulan record numbers. once this data was used to generate web pages, urls for exhibitions and artists were added back into the dataset. gallery systems assisted with importing the data back into tms from the access-generated csv files. the web team extracted data from tms to ingest into the web system as they do with collection objects and other data.
a simple flat version of the data was posted to github.

this project required close collaboration among several departments: the moma archives, the data asset management system administrators who managed all the digital objects to be connected to our new exhibition web pages, the tms administrators, and the digital media team. importantly, this was the first time the archives took responsibility for historical exhibition data in our collection management system and on the web site, involving us more closely in some key museum systems.

. share the docs

all documentation for the exhibition index and moma’s collection is located on github, along with the actual datasets: https://github.com/museumofmodernart/exhibitions

. understanding use

the immediate and most practical use of this data is for answering research inquiries: who was in an exhibition, how many exhibitions an artist has been in, how often two artists have been exhibited together, and so on. this amounts to significant daily usage by library and archival researchers as well as the general public. with basic database or spreadsheet skills, more advanced inquiries can be answered with this data, such as: who was the youngest artist to be given a solo exhibition at moma? which artists have been exhibited most frequently without having works in the collection?

separate from the immediate needs of art historians and scholars, we expect this resource to be of tremendous use in classroom teaching about specific artists, modern art, and museology in america.
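Two of the research inquiries listed above (how many exhibitions an artist has been in, and how often two artists have been exhibited together) can be sketched with a few lines of Python. This is a minimal illustration, assuming a flat file with one row per exhibition and artist pairing; the column names `exhibition` and `artist` and the sample rows are hypothetical and do not reflect the layout of the published dataset.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Hypothetical flat export: one row per (exhibition, artist) pairing.
SAMPLE = """exhibition,artist
Show One,Artist A
Show One,Artist B
Show One,Artist C
Show Two,Artist A
Show Two,Artist B
Show Three,Artist A
"""


def load_index(text):
    """Group artist names by exhibition title."""
    artists_by_show = {}
    for row in csv.DictReader(io.StringIO(text)):
        artists_by_show.setdefault(row["exhibition"], set()).add(row["artist"])
    return artists_by_show


def exhibitions_per_artist(artists_by_show):
    """Count how many exhibitions each artist appears in."""
    counts = Counter()
    for artists in artists_by_show.values():
        counts.update(artists)
    return counts


def co_exhibition_counts(artists_by_show):
    """Count how often each pair of artists shares an exhibition."""
    pairs = Counter()
    for artists in artists_by_show.values():
        # Sorting gives each pair one canonical (alphabetical) ordering.
        pairs.update(combinations(sorted(artists), 2))
    return pairs


index = load_index(SAMPLE)
per_artist = exhibitions_per_artist(index)
together = co_exhibition_counts(index)

print(sorted(index["Show Two"]))
print(per_artist.most_common(1))
print(together[("Artist A", "Artist B")])
```

Storing each exhibition's artists as a set means a name entered twice for the same exhibition is counted once, which suits an index built through manual data entry.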
further, we believe this data can be used to connect digital and archival resources across the web. the exhibition index is less important for the information it contains than for the people, things, and data it allows a user to connect together. its real potential is only realized when connected to wikipedia entries, library union catalogs, and other datasets such as social networks and archival context (snac) or the american art collaborative. ideally, this index can serve as a model for a multi-institution pooling of exhibition and artist data and online archival resources.

https://github.com/museumofmodernart/exhibitions
https://snaccooperative.org/?redirected=

. who supports use

[blank]

. things people should know

to build an exhibition index with any speed, the materials that provide the data must be located and near at hand, preferably digitized, which is why conducting this work alongside a digitization or processing project is ideal. ocr of archival documents does not yield readily usable data. facility with database applications and data manipulation software or programming languages is key. but most important is having labor to perform the data entry. our workflow proved that with a narrowly constructed data-entry interface, precise detailed instructions, and proper supervision and review, this work can be swiftly and effectively performed by non-professional staff and interns.
beginning with imported name records and other data increased efficiency and reduced mistakes. error checking of the data showed that the error rate was within acceptable bounds and that most errors were omissions in data.

. what’s next

our initial funding allowed us to build an exhibition index from through (while primarily processing and opening to the public tens of thousands of folders of paper records). a new round of funding is now allowing us to extend that work through , merge it with more recent data created in tms, and further enrich the data by adding exhibition information such as department of origin, physical location, and subject tags. we are also working to combine this data with the exhibition index of moma ps (constructed as a smaller local project five years ago) and can begin to explore merging this data with that of other institutions such as the new museum, white columns, and other arts institutions.

facet : social feed manager
laura wrubel, software development librarian, george washington university; justin littman, software development librarian, george washington university; dan kerchner, senior software developer, george washington university

. why do it

social media platforms produce and disseminate a record of our cultural heritage and are a source of data for answering research questions from numerous disciplines.
after learning about a george washington university faculty member’s research, which involved collecting tweets using a manual process, we developed prototype software in to connect to twitter’s apis and help her collect data. conversations with our university archivist highlighted use cases for collecting social media in the archives for future researchers. we saw a role for the library to build better tools for our community to conduct social media research. this led us to develop social feed manager, which empowers researchers to build collections and enables libraries to proactively create datasets for use within their community. along with providing data, we offer a consultation service for students, faculty, and researchers, as well as archivists and librarians, to access and use social media data.

. making the case

development of social feed manager started through an imls sparks grant and proceeded with support from the national historical publications and records commission and the council on east asian libraries. library leadership participated in and supported these grants, which defined work proceeding from our existing relationships with faculty and archivists. grant funding and project deliverables, as well as researcher and archivist needs, drove the allocation of staff time from developers, archivists, and librarians to support the work. developing software and building a service supporting social media research might appear to be peripheral to typical library operations.
yet the growing integration of the library’s staff into research projects, including funded research, sfm’s popularity with students at all levels, and the prominence of projects supported by data collected using sfm have become compelling evidence of its value and of how this work supports library strategic goals concerning research and cross-disciplinary collaboration.

https://gwu-libraries.github.io/sfm-ui/
https://www.archives.gov/nhprc
http://www.eastasianlib.org/mellongrants.htm
https://gwu-libraries.github.io/sfm-ui/data-research/#research-using-social-feed-manager

. how you did it

our initial project team in - , funded by a sparks! grant from imls, was small and focused: the library’s director of scholarly technology (who served as project manager and principal investigator), a software developer, our e-resources content manager, and a graduate student developer. in this first phase, we developed a suite of utilities and an administrative interface to manage collecting activities against the twitter public apis. a basic user interface provided access to data from twitter user timelines, one at a time. we collected data of interest to the gw research community and in support of specific faculty and student research projects. this included tweets by members of congress, news outlets, and public sports and entertainment figures.
the project team mediated much of the data collecting and any exporting of data beyond simple downloads of an individual timeline’s tweets.

in our second round of grant funding, from the national historical publications and records commission and the council on east asian libraries, we further developed the software and widened staff involvement in the project. our grant funded the exploration of social media archiving, and thus several of our archivists and our digital services manager participated as team members. the project included a significant software development component, as we added social media platforms, built a user interface to empower researchers to manage their own collections, and added more functionality overall to manage collecting from the twitter, tumblr, flickr, and sina weibo apis. to improve sfm’s usability, our grant from nhprc supported bringing on a ux consultant to conduct an expert review of its interface. we also brought on an experienced digital archivist to review the technical architecture and archival use cases. we wrote documentation and a quick start guide for both end users and other institutions using social feed manager.

as a library, we actively collected tweets related to topics of interest on the gw campus. the largest and most heavily used collection has been our elections collection, containing over million tweets.
to facilitate making this data accessible to the gw community and beyond, a team member created tweetsets, which provides a self-service interface for the gw community to download data and for the broader community to download tweet identifiers.

the changing terms of use for social media platforms and accompanying changes to apis are a challenge both for maintaining working software and for supporting research.

a current challenge is tracking and keeping up with the many research projects that use sfm. we want to be able to tell the story about the students and faculty in a wide range of disciplines and schools who are using sfm, and the contributions our librarians make to this work.

. share the docs

documentation for the social feed manager software.

the following documents are available through the social feed manager project site:

● social media research ethical and privacy guidelines: general guidelines for gw researchers focusing on the collecting, sharing, and publishing of social media data
● social feed manager: guide for building social media archives, christopher j. prom ( )
● building social media archives: collection development guidelines

the details of our software development work are available on github. this includes issue-tracking and prioritization, past and ongoing milestone activity, and release notes. we also publish blog posts with each release, highlighting new features useful to the community and sharing tips for collecting and working with the data.

https://dataverse.harvard.edu/dataset.xhtml?persistentid=doi: . /dvn/pdi in
https://tweetsets.library.gwu.edu/
https://gwu-libraries.github.io/sfm-ui/
https://gwu-libraries.github.io/sfm-ui/resources/social_media_research_ethical_and_privacy_guidelines.pdf
https://gwu-libraries.github.io/sfm-ui/resources/sfmreportprom .pdf
https://gwu-libraries.github.io/sfm-ui/resources/guidelines
https://github.com/gwu-libraries/sfm-ui

. understanding use

our consultation model means that we typically have contact with users of social feed manager and/or social media data and have an ongoing conversation about the analysis methods, findings, and outcomes of their research. this model also supports including discussion about ethical use of social media data.

in addition to being publicly available from tweetsets, several proactively collected datasets are available publicly on dataverse, as sets of tweet identifiers. twitter’s terms of use do not allow full tweet data to be shared, but tweet identifiers may be shared for research purposes. a researcher can pull the full tweet, or “hydrate” it, from twitter’s api. download metrics are available through dataverse, and its collections are highly discoverable via google. we receive occasional follow-up requests or questions and track citations of datasets we’ve published.

within the university, we are tracking schools and departments we’ve interacted with and monitoring for published research, presentations, and posters that use sfm.

.
who supports use

we have a team of software developer librarians who develop social feed manager, provide consultations with faculty and students, teach workshops, and manage related services. our subject specialist librarians are a frequent source of referrals. our data services librarian sometimes participates in consultations, especially where they involve the larger research data lifecycle.

. things people should know

ethical and privacy considerations need to stay at the forefront of this work and are a thread throughout the software development, research consultation, and instructional aspects of this work.

it is not enough to provide a tool for building social media collections: users will need support in understanding and optimizing their collecting parameters, understanding the data, and finding ways to manipulate or reformat it for analysis. we work with freshmen in writing seminars, undergraduates and graduate students from a wide range of disciplines, and faculty, with varying familiarity with csv and json data, social media platforms, and research methods suited to social media data.

https://gwu-libraries.github.io/sfm-ui/blog

social media platforms are constantly changing. terms of use and api affordances are designed for commercial users rather than academic or research use. it’s necessary to spend time understanding social media platforms and researcher needs and staying up to date, since what is available is always changing.
advocacy for researcher needs can sometimes lead to change in platform terms, even if only over the long term.

. what’s next

we are continuing to maintain social feed manager and trying to keep up with changing api affordances. we’re further developing our workshops and outreach on campus. the interest in our elections collection has led to our working with external audiences for this data such as journalists and non-profits, and we participate in conferences related to that work. we’re being proactive about the midterm elections and collecting with future research uses in mind.

appendix : collections as data personas
october - april

collections as data (cad) personas represent an initial set of high level role types associated with collections as data activity. while distinctions are fuzzy in the context of disciplinary and professional praxis, roles represented by personas can generally be understood in alignment with data stewardship or use. on the whole, personas aim to surface needs, motivations, and goals in context. these representations are derived from collections as data project engagements and project team experience.

in agile software development, a persona is used to help develop a broadly shared orientation to user experience. gary geisler has written, “personas offer a way to summarize findings from user research and help determine user requirements and priorities.
these documents help project teams develop a common understanding of a project’s intended audience and priorities. they also serve as a useful reference for design decisions throughout the development process.”

appendix : things

want to support collections as data at your institution, but not sure how to begin? drawing on what we learned from engaging with practitioners and researchers throughout the always already computational project, the project team compiled a list of things you can do to get started. things is intended to open eyes, stimulate conversation, encourage stepping back, generate ideas, and surface new possibilities. if any of that gets traction, then perhaps you can make the case for investing in collections as data at your institution in a meaningful, if not systematic, way.

our best advice: start simple and engage others in the process. you may find some activities listed here are already underway!

about this publication: things was published in october under a cc by-nc-sa . license.

. know how optical character recognition (ocr) output is produced in your digitization workflows. what software is used? what formats are created? what levels of accuracy are produced? where is it stored? is it available for user download?

. create an inventory of full-text collections managed by your institution. document rights status, license status, discoverability, and downloadability. ask the question: are we offering optimal access for computational use of the full-text? how can we make it better?

. migrating a legacy digital collection to a new system or platform? take the opportunity to make the content accessible to researchers that have computational projects in mind.

. interview the archivist, librarian, or curator responsible for a digital collection to document data provenance and decisions made in the course of collection processing and digitization. work to make this information publicly available.

. inventory your data holdings. just make a simple list. and then commit to keeping it up to date, and watch it grow.

.
add new fields to the collection management database to indicate and describe data  components.     . survey your digital collections to identify characteristics ‑‑ good metadata, open access, good  ocr, high usage, relevance to a high‑profile academic program or research area at the institution    https://collectionsasdata.github.io/ https://collectionsasdata.github.io/ / / aac_finalreport - google docs https://docs.google.com/document/d/ qb j dsmcq rt je zanf_-bpry pemll u_qrjf /edit# / ‑‑ which lend themselves to high impact as data.      . recognize and identify the things you need to do differently than have been done for physical  collection objects.    . find out if your digital collection database or access platform has an api available for querying by  the public. if it does not, see if it is possible to develop one. if it does, determine if it is actively  used. if it is actively used, see if you can reach out to users and ask about their usage!     . talk to a colleague responsible for systems that provide networked access to digital collections  about possible approaches to facilitate download of collection data in bulk.    . add a terms of use to your archival finding aids.    . read the language of your organization’s collection deed of gift or purchase agreement to  evaluate whether it allows for providing access to collection content in the form of data.    . review your digital collections metadata and evaluate the rights statements and license  statements in terms of consistency and clarity. are you able to adopt  rightsstatements.org ?    . socialize collections as data as something that can be supported by units and staff across the  library. identify some champions across the organization and people who have skills or position  to do the work.    . talk to people responsible for research data management to encourage planning for data  preservation and other considerations that make it possible for others to reuse the data in the  future.    . 
review your institution’s mission statement or strategic plan documentation, and consider if and  how collections as data activities are aligned with and support it.    . share sample projects with community partners to give them an idea of how their collections  can be used and be relevant to new ways of conducting scholarship.     . network with people who work with data and have the skills or knowledge you need to get your  work done.    . identify barriers and limitations to what services you can offer support, and talk with colleagues  about creative, feasible solutions to overcome them.    . publish or present on "wikidata for librarians," including case studies of libraries working with  wikidata to expand discovery of collections.      http://rightsstatements.org/ / / aac_finalreport - google docs https://docs.google.com/document/d/ qb j dsmcq rt je zanf_-bpry pemll u_qrjf /edit# / . read up on iiif (for example, check out this  useful tutorial ) and determine what hurdles to  implementation exist at your institution. then talk to relevant folks about what it would take to  overcome them.    . read the resources in the always already computational project's  zotero library .    . develop a workshop focused on the use of data in abd about collections; shop it to department  faculty and incorporate it into research orientations for faculty and students.    . mentor a liaison interested in learning a data science skill who is well positioned to identify  datasets and data support needs amongst their researchers.    . conduct user testing of your library’s main discovery environment, with the goal of  understanding how easy or hard it is for a researcher to find the available data collections.    . develop a portal page with a site map specifically for discovering collections at your institution  available for computational use and related support services.    . begin tracking demand for and use of data in and about your collections.     . 
for a collection that cannot be made available openly on the web, investigate if your  organization is able to support mediated access to the data, such as through an offline or  encrypted workstation.    . prepare and provide datasets that are intentionally useful, in terms of size and complexity, for  teaching in semester‑ or quarter‑long classes.    . for classes that draw directly on library collections and generate data, ask the students to submit  their data products back to library, through the institutional repository. normalize the process of  giving back and augmenting the collections with data. this may work particularly well for  collections that are institutionally or regionally focused.    . identify a faculty member who does computational analysis for their own research and find a  way to transfer or replicate the tools and approaches they use to apply them to a library  collections‑as‑data use case.    . if you offer an api to your repository, evaluate the public‑facing documentation to see if it is  clear, current, accurate, and discoverable by researchers.     . publish documentation about how to find, use, and interpret collections as data in multiple  places including blogs, readme files, and libguides.      https://iiif.github.io/training/iiif- -day-workshop/ https://www.zotero.org/groups/ /collections_as_data_-_projects_initiatives_readings_tools_datasets/items? / / aac_finalreport - google docs https://docs.google.com/document/d/ qb j dsmcq rt je zanf_-bpry pemll u_qrjf /edit# / . a dataset should always be accompanied by a readme plain text file that documents basic,  important information about the data. make readmes part of your data documentation  practice. develop one or more template to that can be used by librarians and researchers.    . make an effort to make existing ocr output generated from past scanned text collections  projects more available for computational analysis, such as through bulk download.    . 
when planning your next digitization project, incorporate additional steps for preparing content  files, ocr or transcription text, and metadata for bulk access. document the key issues and  decision points you encounter as you evolve and expand your digitization workflows.    . talk to colleagues involved in taking in deposits to your institutional repository or research data  repository about a process for encouraging and accepting contributions back from users of data  in your collections.    . gain the support of administration by following and supporting the work of third‑party research  groups like oclc that help bolster and highlight the trends in the development of collections as  data.    . provide a resource that shows a data user how to cite a dataset, and that shows a data creator  how to format a preferred citation for an original dataset and a derivative dataset.    . ask a subject specialist at your institution if faculty or students are requesting data about or  derived from library collections.    . take a public services librarian, curator, or archivist out for coffee to talk about collections as  data. ask what they are hearing from faculty, students, and other users of collections about  computational use and which collections have potential for taking action to lower barriers to  computational use.    . investigate how your library is collecting, managing, and making email archives accessible.  consider whether a collections as data approach will serve your institution's goals.    . start small. start with a research question, and choose projects that have promise to be  generalizable for use by future scholars such that the investment is worth the level of  commitment. no one‑offs!    . start with a prototype or proof of concept. it's fine if your collections as data project does not  integrate with institutional repository or formalized infrastructure.     . 
collaborate with subject specialists or instruction librarians to ask scholars about interest in  computational data in and about collections. compile their ideas to make a case, and build a    / / aac_finalreport - google docs https://docs.google.com/document/d/ qb j dsmcq rt je zanf_-bpry pemll u_qrjf /edit# / team for the next opportunity to pursue one of them.    . be thoughtful and strategic about allocating scarce resources to collection digitization projects.  consider prioritizing projects that produce outcomes that are reusable (derivative datasets) and  repeatable (processes, tools, workflows) that can benefit your department and your users again  and again.    . explore what it would take for your organization to contribute subject data to wikidata, drawing  on a local collection and then incorporating the wikidata links into your local discovery  environment.    . test how data gathered in a crowdsourcing project can be associated with the existing source  object data and can also serve as stand‑alone dataset.    . use your favorite search engine to find information about apis provided by museums and read  about the various ways that data about museum collections can be analyzed to discover new  insights.    . keep tabs on the projects emerging in the  collections as data: part to whole project , funded by  the mellon foundation. they are bound to point a way forward for us all!                                            https://collectionsasdata.github.io/part whole/ / / aac_finalreport - google docs https://docs.google.com/document/d/ qb j dsmcq rt je zanf_-bpry pemll u_qrjf /edit# / appendix  : collections as data methods profiles    cad methods profiles are designed to help people who work in libraries, archives and museums gain                                a better understanding of common research methods that make use of cultural heritage collections                            for computational analysis. 
of course, these descriptions are simplified versions of the methods, and are described mostly in the context of their implications for the creation, description, packaging, or distribution of collections as data. profiles should be used in the context of the principles articulated in the santa barbara statement on collections as data.

text mining
laurie allen and scott enderle, university of pennsylvania

what is it?

looking for patterns in text. generally, text mining is done on a corpus of texts rather than a single text. finding and assembling a corpus that is appropriate to the research needs of a project can be one of the trickiest and most time consuming things that a researcher does when approaching a project. there is not currently an agreed upon standard for describing or sharing text corpora, though there are a variety of guides to finding them, and vendors who sell access to text that researchers can assemble to create a corpus.

see a few definitions and links:

● drucker, johanna. data mining and text analysis - introduction to digital humanities. accessed august , .
● underwood, ted. seven ways humanists are using computers to understand text. the stone and the shell (blog), june , .

who uses it?

text mining is used across humanities disciplines (notably language and literature departments, and history) and in the social sciences, especially political science, communications, and business. there are also text corpora used in machine learning applications as well as linguistics.
disciplinary uses of text mining vary both in method of analysis and, importantly, in the kinds of texts included in the corpus of study. for example, a corpus of the front page articles of current major newspapers might be valuable to a political scientist, while a scholar of th c. english novels might want a corpus of literary reviews.

what form of data is most useful for this method?

generally, researchers doing text analysis will want to use plain text (i.e. machine readable, but without markup) in large quantities. they will also need accompanying metadata at a variety of scales. that is, sometimes they’ll want metadata at the book/article level, or at the collection level, and for some uses, it is helpful to have chapter or section level metadata. in linguistic uses, analyses of texts sometimes include annotations down to the specific phoneme level, which makes linguistic corpora less widely produced by libraries/archives/museums.

what might researchers explore when they’re text mining?

they might look for word frequency counts (how often is a particular word used) at the page, article/chapter, or volume level, or use those counts for further analysis. for that reason, a dataset of frequency counts, even in the absence of fulltext, is often useful, especially in cases where the full content of a corpus cannot be made available because of copyright restrictions.
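as a rough illustration of the frequency-count dataset described above, the following python sketch turns a tiny corpus into rows of (document id, year, word, count) that could be shared without the full text. the corpus, the field names (`id`, `year`, `text`), and the simple tokenizer are all assumptions made for the example; real projects usually need more careful tokenization and ocr cleanup.

```python
import re
from collections import Counter

def word_counts(text):
    """Count lowercase word tokens in a text (very simple tokenizer)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Hypothetical two-volume corpus with volume-level metadata.
corpus = [
    {"id": "vol-a", "year": 1850, "text": "The whale. The sea and the whale."},
    {"id": "vol-b", "year": 1900, "text": "A sea of troubles; the sea again."},
]

def frequency_table(docs):
    """Build (id, year, word, count) rows: a derived dataset without full text."""
    rows = []
    for doc in docs:
        counts = word_counts(doc["text"])
        for word, n in sorted(counts.items()):
            rows.append((doc["id"], doc["year"], word, n))
    return rows

table = frequency_table(corpus)
for row in table:
    print(row)
```

keeping the metadata (here, only a year) on every row is a design choice that lets a researcher group counts by date or collection later without needing access to the original files.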
researchers often look for patterns in the data as they relate to features in the metadata (for example, how does the frequency of a word in texts change over time). reliance on both the metadata about each text and the texts themselves makes it important for researchers to know about large inconsistencies in the data or metadata quality. for example, if the ocr quality is inconsistent across a collection, it is very useful to include standard metadata about ocr quality for each text, if it is known. or, if cataloging or metadata creation practices changed over time, those changes should be noted so that researchers can account for them in their analyses.

in some cases, people are interested in the locations of words on pages. if an ocr program has included information about bounding boxes, it would be nice to have multiple versions: one with bounding boxes, and the other without.

common tools used for text mining

most people who do text mining are using scripting languages like python or r.

beyond that, there are a few other tools useful for analysis and teaching, like:
● voyant (https://voyant-tools.org/)
● antconc (http://www.laurenceanthony.net/software/antconc/); see also heather froehlich’s antconc lesson on programming historian (https://programminghistorian.org/en/lessons/corpus-analysis-with-antconc)
● topic modeling tool (https://github.com/senderle/topic-modeling-tool)
● mallet (http://mallet.cs.umass.edu/)

things to look out for when preparing collections for text mining

copyright: this is a big one, for obvious reasons.
where fulltext cannot be provided, some libraries provide word counts or other analytics about the texts.

documentation of text and metadata: multiple versions of texts can be a big source of frustration or confusion in text analysis. for example, a series of reports might have the same first page, which is duplicated across all reports. flagging those kinds of duplications can be valuable in helping researchers cut the time needed to make a corpus usable.

examples of this method in use

underwood, ted, david bamman, and sabrina lee. “the transformation of gender in english-language fiction.” journal of cultural analytics, . https://doi.org/ . / . .

barron, alexander t. j., jenny huang, rebecca l. spang, and simon dedeo. “individuals, institutions, and innovation in the debates of the french revolution.” proceedings of the national academy of sciences, april , , . https://doi.org/ . /pnas. .

examples of collections optimized for this use

“documenting the american south: docsouth data.” accessed august , . https://docsouth.unc.edu/docsouthdata/.

chronicling america: https://chroniclingamerica.loc.gov/

la gaceta de la habana: https://merrick.library.miami.edu/cubanheritage/cubanlaw/lagaceta.php

network analysis

what is it?

network analysis supports quantitative and qualitative study of relationships between entities. entities can be people, places, or things. network analysis is especially helpful for studying multiple levels of complex systems.

a few resources and links:

“network analysis: lesson directory.” programming historian.
https://programminghistorian.org/en/lessons/?topic=network-analysis

easley, david, and jon kleinberg. networks, crowds, and markets: a book by david easley and jon kleinberg. accessed may , . https://www.cs.cornell.edu/home/kleinber/networks-book/.

locke, brandon. humanities data curation record. network graphs and network analysis. . reprint, data praxis, . https://github.com/datapraxis/hdcr.

who uses it?

network analysis is used across a wide range of communities, with some variation in terminology based on discipline. while social network analysis is popular, network analysis is also used to study physical infrastructure, e.g. transmission of energy through an electrical grid, or the flow of traffic. it can also be used for fictional characters in plots. in business, network analysis is used to study how organizations form and how money moves from one place to another. it is also used, famously, in recommendation engines.

what form of data is most useful for it?

researchers need relational information for network analysis, which can be found in many datasets. however, not all networks are useful for analysis, so there can be a fair amount of exploration in finding network datasets. the most basic forms of data for network analysis simply require that each record includes two entities and a relationship: for example, a simple spreadsheet with many rows and three columns. for each row, one person (entity) sent a letter (relationship) to another person (entity), or one publication (entity) was authored (relationship) by a person (entity). other data can become part of network analysis as well, but the simplest notion of the network simply requires entities and relationships.
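the three-column layout described above can be sketched in python as follows. the column names (`source`, `relationship`, `target`), the sample correspondence rows, and the choice to treat letters as undirected ties are assumptions for illustration only; a real project would decide whether direction matters for its research question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical correspondence data: one row per relationship.
EDGE_CSV = """source,relationship,target
alice,sent_letter_to,bob
bob,sent_letter_to,carol
alice,sent_letter_to,carol
erin,sent_letter_to,alice
"""

def read_edges(csv_text):
    """Parse rows of (entity, relationship, entity) from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(r["source"], r["relationship"], r["target"]) for r in reader]

def build_graph(edges):
    """Map each entity to the set of entities it is connected to.

    Assumption: ties are treated as undirected, so a letter links
    both correspondents to each other.
    """
    neighbors = defaultdict(set)
    for source, _relationship, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    return dict(neighbors)

edges = read_edges(EDGE_CSV)
graph = build_graph(edges)

node_count = len(graph)   # number of distinct entities
edge_count = len(edges)   # number of recorded relationships
degree = {node: len(nbrs) for node, nbrs in graph.items()}

print(node_count, edge_count, degree)
```

a plain spreadsheet exported as csv is enough to get this far; dedicated tools take the same kind of edge list as input.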
what data features might researchers explore?

after establishing whether network analysis is the right method, researchers might explore the size of a particular network, either by counting the number of nodes (entities) or the number of edges (relationships). they might ask what percent of the network is isolated from the rest. they may also look at network-level measurements: who is most central, and who are the most important conduits? which people, places, or things have the easiest access to the outer bounds of the network? they may look at the clustering coefficient: do relationships in the network tend to clump together, or are they fairly diffuse?

common tools

palladio http://hdlab.stanford.edu/palladio/ (for very lightweight exploration of networks, designed for historical data)
cytoscape http://www.cytoscape.org/
gephi https://gephi.org/
nodexl https://www.smrfoundation.org/nodexl/
pajek http://mrvar.fdv.uni-lj.si/pajek/

examples of this method in use

warren, christopher n., daniel shore, jessica otis, lawrence wang, mike finegold, and cosma shalizi. “six degrees of francis bacon: a statistical method for reconstructing large historical social networks.” digital humanities quarterly , no. (july , ).

moravec, michelle. “network analysis and feminist artists.” artl@s bulletin , no. (november , ). https://docs.lib.purdue.edu/artlas/vol /iss / .

white, howard d., and katherine w. mccain.
“visualizing a discipline: an author co-citation analysis of information science, – .” journal of the american society for information science , no. ( ): – . https://doi.org/ . /(sici) - ( ) : < ::aid-asi > . .co; - .

bibliography of historical network research http://historicalnetworkresearch.org/bibliography/

examples of collections optimized for this use

the following sources provide directories of network data:
“casos tools: network analysis data | casos.” http://casos.cs.cmu.edu/tools/data .php .
“index of complex networks.” index of complex networks. http://icon.colorado.edu/ .
“stanford large network dataset collection.” http://snap.stanford.edu/data/index.html .

appendix: national forum position statements
march

forum participants were asked to respond to the following prompt:

leading up to the forum, [we] ask that you write a brief position statement derived from direct or related experience salient to the scope of work described in always already computational. we welcome bridging, divergence, and provocation. is there something concrete or conceptual we are missing? are there projects and initiatives this work should be connected to? are there questions and communities we aren’t currently considering? this is an opportunity to highlight aspects of your experience that relate to the project and will to some extent help stage interaction at the face-to-face meeting - and beyond - as the project team works to iteratively refine forum outputs in a range of professional and disciplinary communities.

perspectives represented in the position statements highlight the many directions collections as data work could go. the statements certainly informed the work of the forum, and consequently the iterative, community-based development of project outcomes.
pseudodoxia data: our ends are as obscure as our beginnings
jefferson bailey, internet archive

in his meditation on oblivion and regeneration, w.g. sebald writes, “on every new thing there lies already the shadow of annihilation.” contemplating collections as data evokes a similar correlation -- one where transformation (“this as that”) is less a process of alteration and more one of extraction of key, but possibly opaque, preexistent characteristics (“these from those”). when we consider the computational availability of collections, we begin from a perspective in which collections are an amalgamation of fragmentary elements -- and their decomposition is neither affordance nor flaw, but instead a natural state of flux that allows them to be contextualized anew through a continual state of reconstitution and derivation. this prevailing logic of decomposition distinguishes collections not as data but instead as pieces and processes, with attendant opportunities and entanglements -- collections and data become inseparable, commingled not in operation but instead via a type of consanguinity. likewise, our services supporting computational access to data should match this latent consanguinity.

as a large-scale, online digital library that is also a mission-driven, nonprofit technology developer, the internet archive has long approached collections as data.
being fully online, with no physical reference collections other than those intended for digitization, collections and data are so intertwined as to be indivisible, either in concept, technology, or use. the internet archive’s collections include more than  petabytes of unique data, and the archive has supported computational use of these collections since its beginning, in projects as wide-ranging as semantic analysis of television closed-caption transcripts and network graph study of linking behavior across hundreds of terabytes of web data. in addition, as a self-sustaining non-profit, the internet archive has facilitated this type of research through a service-oriented and sustainable program development approach. developing data-driven approaches to access and binding them to scalable, sustainable programs has elucidated many of the obstacles and potential solutions that emerge from this work. questions that have emerged:

● how can computational research services create better pathways to interpretation through tools and methods for the smooth traversal between “reduction and abstraction” inherent in derivation and aggregation?
● how can new access models help researchers have greater comfort with technical mediation at multiple levels and with an increasing distance between the granularity and totality of the object(s) of study?
● how can programs address the challenges still inherent, even with derived datasets, of limited technical proficiency and local infrastructure?
in testing multiple models internally, and surveying and collaborating with similar efforts in the community, we developed a loose typology of program models for research services, oriented towards, but not exclusive to, very large born-digital collections such as web archives.

● bulk data model: the totality of a domain, global-scale crawl, or large born-digital collection is transferred to researchers via data shipped on drives. analysis takes place locally, usually in a researcher’s own high-performance computing environment.
● cyberinfrastructure model: a custodial/archival institution provides free/subsidized access to its own computing environment that is pre-loaded with data, vms, and other tooling. researchers can do analysis in this remote environment and export results.
● roll your own model: researchers receive support, generally in the form of funded or sponsored services, to create their own tools and leverage existing data platforms for candidate collection building and analysis.
● programming support model: researchers, generally non-technical, are given time with specialized technical support staff (engineers) to collaboratively build or aggregate datasets and perform analysis.
● middleware model: the creation of specific tools and platforms that operate between data hosted with a custodian and advanced analytics tools maintained externally.
● derivative model: provide pre-defined datasets that contain key extracted, derived, or pre-analyzed data culled from specific resources.
the derived datasets support specific research questions, are fungible, and align data and delivery with researcher need.

while the internet archive has pursued many of these models, the most flexible and scalable has proven to be the derivative model, in which key elements are extracted from primary resources and packaged in simple, easy‑to‑use datasets. this preference was the result of many lessons learned in working to support computational use of extremely large digital collections:

● services for computational access are more successful when built on top of, or expanded from, pre‑existing internal systems, processes, and infrastructure. modular, generalized, and interoperable services are preferred; boutique services don’t scale.
● research services should be flexible and, most importantly, content delivered should be disposable to the providing institution and be able to be recreated by existing, ongoing pipelines or frameworks.
● focus on derivation (extract desired data from origin), portability (processes should work on multiple content types or in many areas of the workflow), and access (ease of transfer of data to recipient and ease of use by the recipient).
● focus on scalable partnerships & decentralization in research service support.
● researcher expectations often are not aligned with available custodial resources or services, and research methodologies (conceptual, practical, technical) often are not aligned with target data characteristics, acquisition methods, or management tools.
● service models must be self‑sustaining and scale.
no “grant then gone.”
● continually orient towards mutually reinforcing work, be it with collaborators or researchers, and always allow for generality in partners, technologies, and models.

discovering how these lessons and approaches match, contest, or augment the findings of other efforts will be a particularly informative result of the “collections as data” forum.

experiencing library collections as data
alexandra chassanoff, massachusetts institute of technology

recent empirical research has confirmed that digital tools and technologies are fundamentally changing how scholars work.[ ] yet the inverse of this relationship has received little attention – how is infrastructure changing to support emergent scholarly practice?[ ] as you note in your grant narrative, “predominant digital collection development focuses on replicating traditional ways of interacting with objects in a digital space.” indeed, much of the research examining how scholars find, access, and use materials in digital collections has paid little attention to qualitative factors about the interaction between collection users and environmental aspects.[ ]

my doctoral research focused on this problem – exploring how scholars were searching for, accessing, and using digitized archival photographs as forms of historical evidence. an underlying objective of my research was to explore the interpretive and evaluative practices that scholars bring to bear on non‑textual objects of humanistic inquiry.
the intent was to think about how digitized photographs can function as data, and to provide a perspective on what makes interactions meaningful for scholars working with digital materials.

in my role as the project manager on the bitcurator and bitcurator access projects, i worked with scholars and archivists to develop approaches and methodologies for accessing and using born‑digital materials. at the close of each project, i recall thinking that technology was hardly the difficult part of our work. rather, the challenges we faced seemed to be conceptual in nature. how might we envision ways to access born‑digital materials? relatedly, how might we use born‑digital materials in our research? what kinds of questions could be asked and answered from examination of contents of the so‑called black box?

it seems that we face a similar challenge in considering library collections as data. i am grateful that this forum is explicitly seeking to address this gap, particularly through the enlistment of a diversity of players in the cultural heritage community. technologists, librarians, museum professionals, archivists, and scholars will contribute important and unique perspectives to this conversation. strategic approaches that facilitate access to, and preservation of, library collections as data will need to consider the constant and shifting interplay between infrastructure and emergent scholarly practices. for example, recent research has shown that scholars are using google image search to locate archival photographs.
traditional archival design approaches may not accommodate the serendipitous possibilities of digital space.

in thinking about ways to facilitate use and reuse, i hope to draw on my current research as a clir/dlf software curation postdoctoral fellow. since october, i have been working at the mit libraries to investigate and make recommendations for how institutions can manage software as complex digital objects across generations of technology. software is another type of “data”, albeit one with implicit constraints for access, use and reuse. researchers rely on software for a variety of research activities – as a subject of research itself, a way to operationalize methods, or to reproduce and validate previous results. institutions are increasingly tasked with activities related to the active management of software: from creation through use, dissemination, preservation and reuse. institutional approaches to software collection development must consider software in a variety of contexts: at an intellectual level (e.g. selection and appraisal); in planning for and designing repositories, platforms, services; and in developing staff competencies.

bitcurator: https://www.bitcurator.net/

how can we accommodate the fluid and rapidly changing practices which characterize the current scholarly landscape?
the results of my dissertation research suggest that one part of the puzzle might be to develop an understanding of the factors and qualities that make experiences meaningful in different kinds of interactions. for example, what is it about the experience of (digitized) oral histories that makes them accessible and usable? rather than focusing on delivery mechanisms or crafting explicit methodological approaches, we might do well to consider the myriad ways in which specific types of materials in digital library collections can be experienced.

works cited

[ ] alexandra chassanoff, “historians and the use of primary source materials in the digital age,” the american archivist , no. ( ): ‑ ; jennifer rumer and roger c. schonfeld, supporting the changing research practices of historians, final report from ithaka s+r ( ),

[ ] the important relationship between infrastructure, technology, and scholarship is explored in christine borgman’s scholarship in the digital age: information, infrastructure and the internet (cambridge: mit press, ).

[ ] two notable exceptions in the field of library and information science (lis) are: marcia bates, “the cascade of interactions in the digital library interface,” information processing and management , no. , ; christopher a. lee, “digital curation as communication mediation,” in handbook of technical communication, ed. alexander mehler, laurent romary, and dafydd gibbon (berlin: mouton de gruyter, ), ‑ .
http://www.ils.unc.edu/callee/p -lee.pdf

unsolved problems in the humanities data generation workflow: digitization complexities, undiscoverable audiovisual materials, and limited training for information professionals
tanya clement, university of texas austin

digital humanities has changed rapidly from a field in which we primarily build and create access to resources in the humanities to a field in which we deploy analytics on those resources in accordance with a general move to data analytics. the always already computational initiative is taking an essential step towards bridging the first activity (digitization) to the second (analytics) by focusing on how we structure, bundle, and disseminate digitized or born digital collections and metadata on such collections. this is important and much needed work, but there are three main areas of concern or “unsolved problems” that i would like to introduce into the conversation for the consideration of the group: (1) digitization workflows; (2) av metadata; and (3) pedagogy in terms of training information professionals about data science, data analytics, and data visualization.

digitization workflows are where much library collections “data,” such as descriptive or technical metadata, is born, but these workflows are complicated processes that include selecting collections; establishing performance goals based on standardized measurement protocols; developing efficient test plans; and taking corrective action to maintain quality.
even as cultural heritage institutions continue to rapidly digitize and refine these workflows, our knowledge about new approaches to digitization standards, to schemas for the semantic web, and to increasing our regard for issues of diversity and inclusivity in the digitization of cultural heritage artifacts continues to evolve. newly issued guidelines from fadgi[ ] – an initiative incorporating many entities at the library of congress – challenge librarians and archivists to improve image quality precisely when pressures to digitize everything, including collections that embody inclusivity, are building. consequently, much of the metadata that we may use in a data framework has been generated during an evolving and complex digitization process, which is often a time of increased one‑time funding for the specific digitization job. to what extent will the guidelines that we generate during always already computational take digitization workflows into account? can we advise libraries and archives on how an understanding of an eventual data framework can be integrated into these workflows such that when requests for funding are made our colleagues can anticipate generating the kinds of data that we will need for a data access environment?
second, and a case in point for the first “unsolved” problem, audiovisual materials are notoriously underrepresented in digital humanities precisely because they often lack the detailed data (or metadata) that supports their effective discovery, identification, and use by researchers, students, instructors, or collections staff. in recent years, increased concern over the longevity of physical av formats due to issues of media degradation and obsolescence, combined with the decreasing cost of digital storage, has led libraries and archives to digitize recordings for purposes of long‑term preservation and improved access. however, unlike textual materials, for which some degree of discovery may be provided through full‑text indexing, av materials that lack detailed metadata cannot be found, understood, or consumed. most open source and commercial efforts that attempt to generate computationally‑assisted metadata and to facilitate improved discovery are narrow in focus, non‑scalable, and developed as standalone tools, and they do not address the rights and permissions that collections staff must consider for creating access.
because of the complicated morass of technical and social issues that limit av discovery, descriptive access to audiovisual objects at scale would require a variety of mechanisms for analysis that would need to be linked together with tasks involving human labor in a recursive and reflexive workflow platform that could eventually facilitate compiling, refining, synthesizing, and delivering metadata. colleagues from indiana university and avpreserve and a team of researchers at ut including myself are in the process of developing such a workflow platform, which would allow libraries and archives to bring together and use task‑appropriate tools in a production setting. this work is in direct conversation with the kind of framework that always already computational is proposing, but we believe that av needs, which include generating data about av materials as the sole means of providing access to materials that may never (because of privacy and copyright concerns) be publicly accessible, are distinct from, though complementary with, those needs that correspond to generating data for text collections.
third, while information literacy is today a routine goal of library instruction, data work that includes enabling data discovery and retrieval, maintaining data quality, adding value, and providing for re‑use lags as a topic.[ ] if the library is the laboratory of the humanities, this lag impacts how the digital collections that librarians curate are used in the humanities. rigorous data work requires data “carpentry” knowledge that considers validity, reliability, and usability as well as critical literacies more generally such as data quality, authenticity, and lineage, but humanists and librarians are not traditionally trained on evaluating these aspects of data. the corresponding difficulty of training students and professional academic librarians lies in the ever‑evolving nature of data work, which must respond to changing standards and needs in the context of increasing data in the humanities and of changing infrastructures in libraries. there is work being done in this space including the data science curriculum project, which is meeting just after the always already computational meeting in washington dc with representatives from the american statistical association (asa), the asa business‑higher education forum (bhef), the association for computers and the humanities (ach), the association for computing machinery (acm), the association for information systems (ais), the ieee computer society (ieee‑cs), informs, the icaucus, edison, and the american association for the advancement of science (aaas).
as well, many programs in data science have emerged in recent years at many universities and in many ischools, but there are few programs of study that focus specifically on teaching students with concerns shaped by the humanities in the context of humanities collections. conversations on data science pedagogy are needed to ensure the integration of up‑to‑date resources, theories, and practices in data work in a curriculum that will be geared towards inclusivity and teaching the next generation of our digital workforce about data preparation and analysis in the humanities. again, this work is directly relevant to the always already computational conversation since the data framework proposed requires practitioners who also have some training in data work.

works cited

[ ] federal agencies digitization guidelines initiative. technical guidelines for the still image digitization of cultural heritage materials. september . http://www.digitizationguidelines.gov/ .

[ ] association of college and research libraries. working group on intersections of scholarly communication and information literacy. intersections of scholarly communication and information literacy: creating strategic collaborations for a changing academic environment. chicago, il: association of college and research libraries, .

computing in the dark: spreadsheets, data collection and dh’s racist inheritance
p.
gabrielle foreman and labanya mookerjee, university of delaware

living in a nation of people who decided that their world view would combine agendas for individual freedom and mechanisms for devastating racial oppression presents a singular landscape.
‑toni morrison, playing in the dark

early on in the “always already computational” abstract this assertion appears, underscoring a central assumption of the project: “predominant digital collection development focuses on replicating traditional ways of interacting with objects in a digital space. this approach does not meet the needs of the researcher, the student, the journalist, and others who would like to leverage computational methods and tools to treat digital library collections as data.” not only do the protocols and development of digital collections, of interacting with objects, not meet the needs of various users (let’s call them people or communities) who interact with “objects in digital spaces,” the lexicon itself reproduces particularly freighted ideas for black communities of researchers and students, many of whose ancestors entered the west as chattel property, as people who were both called objects and “leveraged,” that is bartered, mortgaged, sold and listed as such.
in the us, this is true for the almost years of municipal, census, and other records which make up collections and archives during slavery, for records that document the debt peonage that characterizes jim crow, and, one might argue, for ways in which black people are accounted for in a prison industrial complex that again treats members of communities as things to be categorized, as surveilled and recorded objects.

the lexicon of digital collections extends the freighted, fretted relation of categorization and data collection to black subjects and black subjectivity. the term “item,” like “object,” again recalls the ways in which black people appear/ed in public records: as items on manifests, as “losses” on insurance claims, and again as items for sale in newspapers or to be distributed in probate. “fortune” was an th‑century connecticut enslaved man whose very name announces his relation to the capital production, the wealth and fortune, he was meant to produce for his enslaver, dr. preserved porter (this is not a typo). when the doctor died not long after he did, fortune appears in probate records as a skeleton the doctor made from his body, claiming him in death as in life, and literally transforming him into both material object and intellectual prop and property. fortune’s own wife, dinah, still enslaved by the family, was worth less as a living, sentient being in those records than her husband’s skeleton, a skeleton she may have had to dust or clean, the bones of a husband she could not bury.
likewise, the spreadsheet opens up complex analogies to the ledger, as labanya mookerjee, a former exhibits committee co‑chair for the colored conventions project, writes in her “disrupting data viz. & the colored conventions project: interrogating data management methods through disability studies” (http://disruptingdataviz.tumblr.com/post/ /introduction-to-disrupting-data-viz-the-colored), a piece she wrote and published on tumblr for a graduate seminar led by p. gabrielle foreman. storing data in spreadsheets powered by programs such as microsoft excel introduces an additional layer of complications; spreadsheets, as bookkeepers of capitalism, can be traced directly to the history of slave trader ledgers (http://t.umblr.com/redirect?z=http% a% f% fwww.slate.com% fblogs% fthe_vault% f % f % f % fslave_trader_ledger_william_james_smith_accounting_book.html&t=ownizjvjzjlhytu nwmxntkwmgfim u zjk nty mza odfimjq ocw uzhhsfppnq% d% d). the violence of this history runs the risk of being replicated if we continue to use conventional methods of storing data. as many dh critics have now pointed out, the institutional power invested in the process of data collection (the prelude to data visualization) can be discussed alongside conversations on the power in the production of the
archive. computational activity “is contingent on the availability of collections that are tuned for computational work (hughes ),” as the always already computational abstract asserts. “suitability is predicated on form, integrity, and method of access (padilla ).” this points us to the hegemonic logic guiding the selective operations in knowledge production that has been interrogated through studies on the archives (trouillot) and in data visualization (drucker). both trouillot and drucker make a dh community (attuned to archive production as well as archive availability) aware of the need to name the difference between “capta” and “data” and to challenge and counter the institutional powers that authorize “credibility” or “suitability” (padilla).

datasets, when constructed using conventional methods of data collection and organization, run a similar risk of activating institutional power and defining “credibility,” especially when the data is procured from traditional archival sources that too often excise, anonymize and erase certain subjects, transmogrifying them in turn into (almost invisible, ghosting) “objects” and “items.” two examples from the colored conventions movement obtain. first is the challenge of including black women whose names and participation are excised when we use traditional methods of collecting and naming data (from the lists of thousands of delegates over seven decades).
curating a dataset that is reflective of the actual history of women’s involvement has prompted ccp to revisit the logic used to develop the parameters of what qualifies as “participations,” extending the definition of participation from appearing in the minutes, to attendance at the gatherings, and to hosting and curating conversations (following psyche williams‑forson) at boarding houses, eateries etc. where women’s presences or imprints appear. a second example is the work that jim casey, co‑founder of ccp, has done on social network analyses and data visualization between colored conventions and the underground railroad showing a surprising lack of overlap and co‑attendance. “all of this data is vexed,” asserts casey, “shaped by centuries of decisions based on racial hierarchies about what to record, store, and reproduce.” casey uses siebert’s “directory of the [ ] names of underground railroad operators” included in his underground railroad ( ), and boston public library’s anti‑slavery collection data. these sources hew to a historical imaginary that places whites at the center of the ugr and that excises black leadership and involvement; a corrective has only begun to appear in recent scholarship and has not yet produced a directory. based on racially hegemonic raw data, the co‑attendance visualizations don’t capture black ugr involvement by default.

this leads us to this set of questions.
how do we account for (new, collective) data collection that accounts for haunting imprints and outright absences in the archives upon which we depend? what are the implications of a lexicon and set of practices/tools that rely upon and reproduce a colonial language of power and entitlement in the digital humanities as we think collectively about best practices to “leverage computational methods and tools to treat digital library collections as data”?

frictionless collections data
dan fowler, open knowledge foundation

data package is a containerization format for all kinds of data. it provides a framework for “frictionless” data transport by specifying useful metadata that allows for greater automation in data processing workflows. the aim is to provide the minimum amount of information necessary to transfer data from one researcher to another, and, likewise, one data analysis platform to another. after several years developing these specs for general use, it is worth directly examining the extent to which library and museum collections data are amenable to this approach.

new approaches to publishing library and museum collections data are necessary. such data, released on the internet under open licenses, can provide an opportunity for researchers to create a new lens onto our cultural and artistic history by sparking imaginative re‑use and analysis.
for organizations like museums and libraries that serve the public interest, it is important that data are provided in ways that enable the maximum number of users to easily process them. unfortunately, there are not always clear standards for publishing such data, and the diversity of publishing options can cause unnecessary overhead when researchers are not trained in data access/cleaning techniques.

one approach for publishing collections data is via an api (application programming interface) on a record‑by‑record basis. this approach has its advantages: the data is likely structured and well described. however, these services may not map directly to the types of queries or analyses researchers need to run. further, for both the researcher and publisher, it can be tedious and costly to provide large amounts of collections data delivered record‑by‑record. for certain use cases, it is preferable to publish data in bulk format in open standards like csv or json. the metropolitan museum of art and tate gallery, for instance, have released their collections data as sets of text‑based files on github. in this approach, associated documentation is provided via files named by convention, for example, “readme” or “license”. this method of publishing allows users to load data into their own tools without the overhead of programming against an api.

documentation for data published in bulk is often ad hoc. there is often no clear or rigorous documentation of the fields (what types of data are in each column).
reading such data into data analysis programs using the built-in csv ingest mechanisms yields data divorced from context: common date and boolean (“true/false”) columns must be explicitly assigned as such, numeric identifiers may be incorrectly loaded as integers, etc. these datasets are often exported from in-house collections database software, and small errors in the translation of these often large datasets may go unnoticed.

data packages for collections

frictionless data, developed in the open by open knowledge international and members of the open data community, is an ideal framework for publishing this type of bulk data. the data package format, requiring only the addition of a descriptor file called datapackage.json, provides a minimally invasive, but standardized way to provide clear and machine-readable metadata. datasets created as data packages can later be easily exposed as apis given the wealth of metadata provided.

https://gdstechnology.blog.gov.uk/ / / /providing-access-to-datasets-through-apis/
https://github.com/metmuseum/openaccess
https://github.com/tategallery/collection
http://frictionlessdata.io/

as an example, the carnegie museum of art in pittsburgh, pennsylvania has provided its collections data as a downloadable data package. providing the data in this format yields several benefits:

1. users are provided with useful metadata to allow for easy import into their preferred analysis tool. these explicitly defined column types and metadata can eliminate some of the tedious work involved in “wrangling” a dataset.
2. publishers can use tooling like good tables to automatically validate data.
3. basic documentation for how to use the dataset (e.g. what columns mean) can be automatically created from structured metadata.
4. collections data can be licensed in a machine-readable manner.
5. in the absence of data-package-aware tooling, the original data can be read/written as usual.

over the course of this year, with the continued support of a grant from the sloan foundation, we are looking to work with researchers and institutions across a variety of fields to pilot the use of the specifications. this may involve building tools and writing guides to analyse, validate, and/or visualize collections data. through this process we hope to improve the specifications more generally while also providing useful tooling for researchers in digital humanities.
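the benefits listed above can be illustrated with a minimal sketch. it assumes a hypothetical two-row collections export and a hand-written descriptor. the property names (name, licenses, resources, path, schema, fields, type) follow the data package specification, but the dataset, column names, and the small read_with_schema helper are invented for illustration and are not part of any frictionless data library. the point is only that declared column types let a loader keep identifiers as strings and parse dates and booleans without guessing.

```python
import csv
import io
import json
from datetime import date

# Hypothetical sample of a collections export; the columns are illustrative only.
SAMPLE_CSV = """object_id,title,accession_date,on_view
000123,Untitled (Landscape),1998-04-17,true
000124,Portrait of a Woman,2003-11-02,false
"""

# Minimal descriptor of the kind that would be saved as datapackage.json.
# The values are made up for this sketch.
DESCRIPTOR = {
    "name": "example-collection",
    "licenses": [{"name": "CC0-1.0"}],
    "resources": [
        {
            "name": "objects",
            "path": "objects.csv",
            "schema": {
                "fields": [
                    {"name": "object_id", "type": "string"},
                    {"name": "title", "type": "string"},
                    {"name": "accession_date", "type": "date"},
                    {"name": "on_view", "type": "boolean"},
                ]
            },
        }
    ],
}

# Map declared field types to Python conversions (a small subset of possible types).
CASTERS = {
    "string": str,
    "integer": int,
    "date": date.fromisoformat,
    "boolean": lambda value: value.strip().lower() == "true",
}


def read_with_schema(csv_text, resource):
    """Read CSV text and cast each column according to the resource schema."""
    fields = resource["schema"]["fields"]
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        rows.append({f["name"]: CASTERS[f["type"]](raw[f["name"]]) for f in fields})
    return rows


rows = read_with_schema(SAMPLE_CSV, DESCRIPTOR["resources"][0])
descriptor_json = json.dumps(DESCRIPTOR, indent=2)  # text to write to datapackage.json

for row in rows:
    print(row)
```

because object_id is declared as a string, its leading zeros survive the import, which a naive csv reader that infers integers would lose. the csv file itself is unchanged, so tools that know nothing about data packages can still read it as usual.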
https://github.com/cmoa/collection
http://goodtables.okfnlabs.org/
mailto:daniel.fowler@okfn.org

book carts of data: usability and access of digital content from library collections
harriett green, university of illinois at urbana-champaign

not all of the data we create or purchase for library collections comes in neat multi-gigabyte packages of ordered files: we recently discovered that datasets we had purchased as part of a database licensing negotiation were more shelf ready than machine ready: they currently exist as stacks of hard drives, discs, and other bewildering formats sitting on a book cart. how do we provide access to these data collections?

in my extensive work with research teams, graduate students, and faculty members to obtain, generate, and transform data derived from collections in the university of illinois library and far beyond, the question of access and usability consistently rises to the fore. thus, i would ask, how can we conceptualize the full spectrum of data usability? it is not enough for us to digitize the collection materials and for the data to exist on someone’s server: usability encompasses everything from data formats and tool interoperability to the negotiated permissions and rights for researchers to share and manipulate data as they engage in analytic workflows.

data usability means developing data models that take into account the actions that will be performed on our data.
in determining the different types of data models that we can build and implement into our collections, we must consider how humanists and social scientists effectively work with data in their research and teaching.

my work with the hathitrust digital library and hathitrust research center has seen this practice: the htrc has attempted to meet various expertise levels and needs of users in enabling access to the data: on the newcomer end of the spectrum, we provide fully guided access to gathering and using data through our workset builder and the portal with its pre-set algorithms. but researchers frequently express the need for larger-scale data that is more pliable and manipulatable, so the htrc developed the extracted features datasets that allow researchers to generate highly customized and curated datasets. but the barriers to accessing this data can be high in terms of skillsets needed to both access and use the data.

my research explorations on scholarly research practices also have shown me that data usability is critical:

our research for the htrc’s workset creation for scholarly analysis project examined researcher requirements for textual corpora to be useable for research (fenlon et al. , green et al. ).
our interviews with scholars revealed that the core areas of concern for researchers included the conceptualization of collections as reusable datasets and resources for scholarly communications; the ability to break apart collections into various levels of granularity to generate diverse objects of analysis; and the need for enriched metadata. we proposed building out the data model of the “workset,” the htrc-specific term for textual corpora that researchers build.

our subsequent user study for htrc user requirements (green and dickson, ) gave further insights on how researchers used textual corpora and on the scholarly practices that shape their needs for being able to work effectively with text collections in the hathitrust digital library, as well as overall. we learned that scholarly practices and notable challenges when working with our textual collections included the ability to acquire and structure the data; the need for a space to work with various tools and generate results; the ability to share data for research collaborations; and the role of data in teaching and training.

and my recently concluded research study for emblematica online explored how scholars engaged with the digitized emblem books drawn from leading rare book collections at illinois, hab wolfenbuettel, university of glasgow, duke, and the getty institute.
in my examination of how scholars engaged with these multi-institutional collections, their metadata, and the interlinked digital content through interviews and usability testing sessions, we found that the expectations of users when exploring digital collections are complex: they range from the basic need for high-quality reproductions, which emblematica was praised for by all participants; to advanced scholarly concerns such as the ability to distinguish between the types of archival content they are perusing (emblem books versus emblems themselves) and the historical particularities of this specialized genre of emblem studies. respondents frequently expressed the need for context, annotated content, and other functionalities that would allow them to fully engage with the emblem books as an archival source and scholarly area. we considered that this may reveal the needs of interdisciplinary scholarship as researchers take advantage of easy access to vast digital collections of content: the scholarly knowledge base that users approach with digital collections varies widely, and an effective digital collection must welcome all levels and inculcate them into the scholarly domain of the collection.

these are some of the findings from my work to examine what researchers’ needs are as they engage with our library collections in digital formats and make use of these materials as data.
this forum’s discussion can provide critical new avenues for exploring how collections can be accessible, browseable, and extensible for addressing a diversity of emergent uses in research and teaching.

works cited

fenlon k., senseney m., green h., bhattacharyya s., willis c. and downie, j. s. ( ). scholar-built collections: a study of user requirements for research in large-scale digital libraries. proceedings of the american society for information science & technology ( ), – . doi: . /meet. .

green, h. e., fenlon, k., senseney, m., bhattacharyya, s., willis, c., organisciak, p., downie, j.s., cole, t., and plale, b. ( ). using collections and worksets in large-scale corpora: preliminary findings from the workset creation for scholarly analysis prototyping project. poster presented at iconference , berlin, germany.

green, harriett, eleanor dickson, and sayan bhattacharyya. “scholarly requirements for large scale text analysis: a user needs assessment for the hathitrust research center.” digital humanities proceedings, krakow, poland, july – , .

green, harriett, mara wade, timothy cole, and myung-ja han. . “user engagement with digital archives: a case study of emblematica online.” in creating sustainable community: the proceedings of the acrl conference, edited by dawn mueller, – . chicago, il: association for college and research libraries.
historical complications of/for open access computational data
jennifer guiliano, indiana university–purdue university indianapolis

always already computational seeks to support the “development of a strategic approach to developing, describing, providing access to, and encouraging reuse of library collections that support computationally-driven research and teaching.” historically, data in the digital collections sphere has most often been expressed as homogenous datasets falling into one of three primary types: textual, visual, or audio. “scholars” or “researchers” use large scale textual information derived from digitized volumes or the extraction of text only from hypertextual and multimedia environments, or they mine hundreds or even thousands of hours of video or audio materials to extract and analyze subsets. due to the dominance of datasets like those derived from the google books corpus or through webscraping tools that cull text, image, or audio, large or dense cultural datasets are the norm in digital humanities, and are not only homogenous in type but rarely imagine interactions as led by or with intervention from individuals not holding the role of scholar or researcher.
more simply, i am suggesting that the question of creating computationally-accessible datasets is not just the deployment of an ecosystem for development, description, access, and reuse but a recognition that there are potentially multiple ecosystems of research and teaching that must exist simultaneously and be treated as relational computational data. to illustrate this principle, i’ll provide a brief synopsis of the work of edward curtis and how the open access images that are currently available as computationally-accessible data through the library of congress present a complicated consideration of computational data. beginning in , edward s. curtis embarked on a thirty-year career documenting over eighty native communities. participating as part of scientific expeditions and anthropological excursions, he produced roughly volumes of information on native and indigenous life that were accompanied by photographic images as part of his the north american indian series. created primarily as silver-gelatin photographic prints, this series has long held a place of prominence in historical analysis as the images are not only noted for their rarity but for the limited dissemination and reuse throughout the twentieth century as full sets of materials. only sets of the volume series were sold; however, these images as individual objects have seen significant dissemination and reuse since their acquisition by the library of congress.
more than , silver-gelatin photographic prints (of a projected total of , ) were acquired by the library of congress through copyright deposit from about through . about two-thirds ( , ) of these images were not published in curtis's multi-volume work, the north american indian. the collection includes individual and group portraits, as well as photographs of indigenous housing, occupations, arts and crafts, religious and ceremonial rites, and social rituals (meals, dancing, games, etc). more than , of the photographs have been digitized and individually described and are available through the library of congress api as well as via manual download of both jpeg and tiff file formats.

using strategies common to anthropologists working in indigenous communities at the turn of the th century, curtis modified the images he produced to remove signs of modernity and contemporary life. this included providing specific forms of dress that were perceived as being “more traditional” as well as stronger interventionist strategies like removing objects that would signal integration with th century euro-american society. when viewing an image of a piegan lodge on the loc website, the api provides the unretouched negative of an image of two piegan men situated in their lodge with a clock centered between them. a computational dataset would expose the existence of this image, which could allow scholars to run object based visual analysis algorithms to identify the clock in the image and potentially find other images of modernity using shape-segmentation, leading to some conclusions about the interventionism of technology in indigenous life: how widely has technology embedded itself into indigenous life? but in current thinking about computationally-accessible data, what would not be revealed is that this original negative shows an alarm clock between two seated men in a piegan lodge, not the published, retouched image that american audiences would have viewed in the north american indian. curtis physically cut the clock out of the negative. he then retouched the image for publication in the north american indian. it is important for accuracy purposes for the dataset to reflect not just the original photographic negatives but also relational data derived from what was actually published by curtis. otherwise, researchers might conclude that americans were familiar with signs of modernity in indigenous life when, in fact, that conclusion is relatively recent historiographically. other examples of this type of relational computational-data are available with curtis: he depicted a crow war party on horses, even though there had been no crow war parties for years, and he used techniques of focus and duration to induce hue saturation that romanticized images.

http://www.loc.gov/pictures/resource/cph. b /?co=ecur
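one way to picture the relational data described above is a record that holds both versions of an image and an explicit link between them. the following is only a sketch under stated assumptions: the class names, the identifiers neg-1 and pub-1, and the relation vocabulary are all hypothetical and do not correspond to library of congress identifiers or to any existing metadata standard.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ImageVersion:
    version_id: str   # hypothetical local identifier, not a Library of Congress id
    role: str         # e.g. "original_negative" or "published_plate"
    note: str = ""


@dataclass(frozen=True)
class Relation:
    source_id: str    # the version that was altered
    target_id: str    # the version that resulted from the alteration
    kind: str         # hypothetical vocabulary, e.g. "retouched_for_publication"


@dataclass
class ImageRecord:
    title: str
    versions: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def published_versions_of(self, version_id):
        """Return the versions recorded as derived from the given version."""
        targets = {r.target_id for r in self.relations if r.source_id == version_id}
        return [v for v in self.versions if v.version_id in targets]


record = ImageRecord(
    title="piegan lodge (illustrative record)",
    versions=[
        ImageVersion("neg-1", "original_negative", "clock visible between the two men"),
        ImageVersion("pub-1", "published_plate", "clock removed before publication"),
    ],
    relations=[Relation("neg-1", "pub-1", "retouched_for_publication")],
)

for version in record.published_versions_of("neg-1"):
    print(version.role, "-", version.note)
```

with a link of this kind in the dataset, an analysis that detects the clock in the negative could also report that the published plate did not show it, rather than treating the negative as what audiences saw.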
more problematically, for our computational dataset, curtis was also known to photograph religious rituals as part of his excursions. the [oraibi snake dance] image depicts hopi natives who were part of the snake and antelope societies participating in a communal ceremony. performed in august to ensure abundant rainfall to help corn growth, the ritual was the most widely photographed ceremony in the southwest pueblos by non-native observers. in current computationally-accessible form, there are a number of issues to confront: 1) there is no notation that this image is of a religious ritual that is now prohibited from viewing by the non-hopi public (and thus should be pulled from view for reasons of cultural sensitivity); 2) when subjected to computer vision techniques, the derivative images rely on segmentation of physical bodies, a form of disembodied violence that reflects colonial practices where natives are treated as less than human through segmented image representation (e.g. scalps, severed limbs, etc). more holistically, this case illustrates one of the long-term challenges of computationally-enabled access: computers cannot identify culturally-sensitive data nor is there an efficient means to retrieve culturally-sensitive data once it has been distributed in computational form.
while data might be displayed in an integrated manner, when it comes to the processing or analysis of our data, computational analysis has largely existed at a segmented level rather than as an integrated structural process for research and teaching purposes. complex humanities data systems are often artificially layered representations that rely on augmentation of 'found' datasets such as traditional and web archives.

often, human intervention is needed to verify the results of these computational processes, which have a habit of very quickly highlighting contradictions at the level of both object and corpora. an integrated data ecosystem posits that computational analysis is important not only for the core activities of development, description, access, and reuse, but also for the return of data to its originating collection through data correction and relational derivatives. more simply, what is needed is an integrated humanities data ecosystem that recognizes approaches to computationally-accessible data and relies on important characteristics of humanities research data and humanities research practices: 1) humanists tend to create data, not just gather data; 2) some of this data is inherently structured, but most is not; 3) the resulting data is often highly interpretative, which has implications for sharing and re-use; 4) data creation is often iterative and layered with implications for copyright, versioning and active working spaces; and 5) the process is as important as the product.
and, significantly, to envision the broadest potential intervention of computationally-accessible datasets, we cannot assume that the terms “scholar” and “researcher” belong only to the academic or archival communities. we must understand that the communities of origin should be the initiating point for considering development, deployment, access, etc.

http://www.loc.gov/pictures/collection/ecur/item/ /

works cited

[ ] portions of this response appeared in an earlier form in the introduction to “the future of digital methods for complex datasets”, an international journal of arts and humanities computing (ijhac) special edition and as a contribution to a digital library federation panel on humanities data issues. jennifer guiliano and mia ridge, international journal of humanities and arts computing, volume issue , page - . doi: http://dx.doi.org/ . /ijhac. . .

identifying use cases for usable and inclusive library collections as data
juliet l. hardesty, indiana university

a grounded, practical approach to digital projects often centers around concerns of how the project will be useful, how the project can realistically be completed, and what information is necessary to make this project (or the items in a digital project) discoverable and accessible.
based on this approach, there are two sides to making library collections useful as computational data: the collection-holding library has to be able to release the data in a way that allows for computation, and researchers have to be able to find out about this data and do something with it. putting data out there does not mean it will be used, and offering a computational interface does not mean it will fit all research needs.

the grant references the hathitrust research center (htrc) as an example of a computational interface for researchers. it also references hydra-in-a-box as an example of an application that could benefit from computational functionality. this generated the thought of an htrc-in-a-box that could work for libraries to set up their own computational interface for their collections. open government data efforts like code for america or data.gov and ckan.org show how various groups and individuals can come together around a common goal of providing access to computational data and provide ways to access, analyze, and offer data. it would be useful to examine those models when discussing approaches to treating library collections as data.

this project is concerned with all types of digital objects. text, images, audio, video, born-digital, 3-dimensional, all have unique aspects to them that are sometimes computationally available but often are not. sometimes the only way to know about segments on a video or the contents of an image is to have textual description available.
that requires metadata generation or metadata enhancement. this work can be manually intensive but can also be aided by software. efforts such as avpreserve’s plan to enhance metadata in stages for indiana university’s media digitization and preservation initiative move gradually toward more advanced technologies to identify aspects such as people’s faces, beats per minute, and speaker identification in video and audio for the purpose of producing metadata that can then be discovered by researchers.[ ] another project to watch will be wikimedia commons’ structured data project to “develop storage information for media files in a structured way on wikimedia commons, so they are easier to view, translate, search, edit, curate and use.”[ ] this process will not always be just about putting the data out there or making it possible for researchers to access the data; it will also involve producing data about different types of objects than has traditionally been the case in digital libraries. recommendations, tools, and workflows for metadata enhancement will be necessary to create usable computational data.

michelle dalmau, head of digital collections services at indiana university, correctly points out that different use cases are needed for library collections as data.[ ] at indiana university, several digital collections are available as datasets,[ ] largely based on researcher requests. tracking use in the wild is challenging, but datasets are used in the classroom (charles w.
cushman photograph collection) and for research (wright american fiction). looking at how data is used for research compared to how it is used pedagogically for instruction might lead to insights on qualities of data that make collections better suited for teaching versus research. being able to reliably trace the ways in which these data sets are used will demonstrate impact to stakeholders. using metadata about digital collections versus using the collection items themselves for content analysis is something else to consider. the british library offers image collections for analysis separate from bibliographic datasets about their archival holdings. indiana university’s cushman dataset offers only the metadata about the images, not the images themselves.

https://www.codeforamerica.org/

a final point to bring up concerns diversity and inclusion. not only should this project make sure the collections considered for use cases are diverse in format, content, and source, but the project itself needs to have a broad and deep representation of voices and perspectives on computational data. these are not data that are only useful in the academic realm. access to computational data or workflows and tools to allow others to provide access to computational data will be ever more important in the world, particularly if national governments continue to trend toward populism, nationalism, and privatization.
works cited

[ ] rudersdorf, amy and juliet l. hardesty. ( ). “av description with avpreserve and iu: strategies and tools to describe audiovisual materials at scale for indiana university’s media digitization and preservation initiative.” digital library federation forum, milwaukee, wisconsin. https://osf.io/gfazc/

[ ] juliet l. hardesty interviewed michelle dalmau regarding library collections as data in february .

[ ] https://commons.wikimedia.org/wiki/commons:structured_data

[ ] british library. collection guides: datasets for image analysis. http://www.bl.uk/collection-guides/datasets-for-image-analysis

emerging memory institution data infrastructure in the service of computational research
christina harlow, cornell university

in my opinion, the always already computational forum work area rests at the intersection of the understood functionalities of memory institutions’ collection platforms and the needs of researchers working with large-scale or computational data analysis techniques. in thinking about this forum’s scope and my own work, i am struck by possible collaborations not leveraged or mentioned. i would like to explore if my work approach to a facet of a larger data problem could expand and, in turn, be expanded by the forum’s discussion and deliverables on computational research needs and memory institution data practices.
my position for this upcoming forum will mostly fall along these points:

● if library collections, including but not limited to those of digital repository platforms, are considered (digital repositories are the primary target of the proposal), there is a wealth of data and metadata (*data) that already exists. better yet, memory institutions already work with this *data at scale using traditional and emerging technologies that underpin, and are hidden by, delivery and discovery interfaces. how can this underlying ecosystem be better leveraged for computational data analysis by researchers? for example, do we just need to make access to a solr index publicly available? can we plug a public hadoop integration point into our library data etl systems? do we need to better document and expose our existing data apis or data exchange protocols to new communities?

● i would like to surface the functional needs of the research areas alluded to in the proposal, then see where they overlap with existing *data operations work areas in memory institutions. a strategic partnership here means we can strengthen the cases for, collaboration on, and support of the emerging technological, procedural, and organizational frameworks. these are already being built and used to support efforts of memory institutions and their data partners.

● computational or large-scale *data work requires transparency and agreement on a number of points to make it statistically relevant and publicly reliable. these agreement points include but are not limited to:

  ○ machines should be able to understand the models or entities represented by the data;
  ○ this requires having shared specifications around *data representation and contextual meaning of models, datum, types, etc.;
  ○ we need to build and maintain consistent data exposure services, points, or methods so that computational work can be reproduced, iterated, or distributed as needed (for scalability);
  ○ we should recognize that technological frameworks for computational analysis (for example, hadoop) often require significant hardware, software, and maintenance to support. stability of how data is exposed and data provenance can mitigate the technological burden by offering consistency on which multiple partners can build and coordinate efforts on the frameworks;
  ○ and what is the responsibility of the originating memory institution to support capture of that computational data output for the sake of archiving, reproducibility, discoverability, and expanded *data services?

my positions come from my own work on metadata operations within a large and well-funded academic library system. my work focuses on building an efficient and coordinated *data ecosystem among sources including but not limited to:

● a traditional marc catalog with about   million bibliographic records, managed in an ils (integrated library system), a few oracle databases, a perl-based metadata reporting and management interface, and other batch job management and metadata exposure services (apis and data exchange protocols like z39.50 or sru);

● a locally-developed metadata integration layer that takes multiple data representations of authority, bibliographic, and other metadata retrieved via apis, merges them, and indexes them into a number of solr indexes;

● multiple (~  depending on the definition) digital repository applications and services for delivery of data and metadata to user interfaces. these repositories span technology and resource types, from lone fedora   instances for object persistence of primarily text-focused digital surrogates to more traditional dspace installations for user-generated scholarly output;

● a locally-managed authorities and entities interface that deals with both local vocabularies and enhanced representations of currently   large (>  million resources) external metadata sets;

● and *data from archives, preservation, digitization, and many other workflows and systems.

in building a coherent ecosystem for this *data, i work with enterprise data tooling and approaches that perhaps can also support the computational data analysis needs to be surfaced in the always already computational forum. in particular, i am leveraging etl and distributed data management systems that then interact with (and coordinate) existing memory institution *data standards, applications, specifications, and exchange protocols. due to the computational support of the selected distributed data systems, i run a number of processes that parallel some computational data approaches, but for different ends. i would like to outline how we could reuse or expand these existing approaches and services to support the researchers (and their respective areas) who take part in this forum.
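
as one illustration of the question raised above about exposing a solr index to researchers, the sketch below builds read-only query urls against solr's standard select handler. the host `https://example.org`, the core name `catalog`, and both function names are hypothetical; `q`, `fl`, `rows`, `start`, and `wt` are standard solr query parameters. it is a minimal sketch under those assumptions, not a description of any institution's actual service.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, core, query, fields, rows=100, start=0):
    """Build a read-only select URL for a Solr core.

    base_url and core are hypothetical placeholders; q, fl, rows, start,
    and wt are standard Solr query parameters.
    """
    params = {
        "q": query,               # the search expression
        "fl": ",".join(fields),   # fields to return for each document
        "rows": rows,             # page size
        "start": start,           # offset of the first document returned
        "wt": "json",             # response format
    }
    return f"{base_url.rstrip('/')}/solr/{core}/select?{urlencode(params)}"


def page_urls(base_url, core, query, fields, total, rows=100):
    """Yield one URL per page so a result set can be harvested in batches."""
    for start in range(0, total, rows):
        yield build_solr_select_url(base_url, core, query, fields, rows, start)
```

a researcher-facing service would still need the agreements listed above (stable field names, documented models, provenance) before urls like these could support reproducible work; the sketch only shows how small the access layer itself can be.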
on the computational turn in archives & libraries and the notion of levels of computational services
greg jansen and richard marciano, university of maryland

1. the computational turn in archives & libraries

the university of maryland ischool’s digital curation innovation center (dcic) is pursuing a strategic initiative to understand and contribute to the computational turn in archives and libraries. the foundational paper (with partners from ubc, kcl, tacc, and nara) calls for re-envisioning training for mlis students in the “age of big data”. see: “archival records and training in the age of big data”. we argue for a new computational archival science (cas) inter-discipline, with motivating case studies on: (1) evolutionary prototyping and computational linguistics, (2) graph analytics, digital humanities and archival representation, (3) computational finding aids, (4) digital curation, (5) public engagement / interaction with archival content, (6) authenticity, and (7) confluences between archival theory and computational practices: cyberinfrastructure and the records continuum.

deeper experimentation with these new cultural computational approaches is urgently needed and the dcic is developing a cas curriculum that brings together faculty from computer science, archival & library science, and data science. we conduct experiential projects with teams of students to help them gain digital skills, conduct interdisciplinary research, and explore professional development opportunities at the intersection of archives, big data, and analytics. these projects leverage unique types of archival collections: refugee narratives, community displacement, racial zoning, movement of people, citizen internment, and cyberinfrastructure for digital curation. see “practical digital curation skills for archivists in the 21st century” (lee, kendig, marciano, jansen), marac . two workshops on the interplay of computational and archival thinking were held in  april and  december  , and a pop-up session at saa   discussed archival records in the age of big data.

finally, the dcic is developing new cyberinfrastructure, called dras-tic (see  nov. cni talk), that facilitates computational treatment of cultural data. dras-tic stands for digital repository at scale that invites computation (to improve collections), and blends hierarchical archival organization principles with the power and scalability of distributed databases.

our position statement builds on these cas investigations by suggesting a framework for “levels of computational service” to better describe the emerging ecosystem and identify gaps and opportunities.

2. levels of computational service

journalists, researchers, planners, and other patrons support their investigations with new methods of computational analysis. libraries, archives, museums, and scientific data repositories hold data that will inform their disciplines.
it is far easier today to analyze twitter behavior than it is to investigate public life using public data from public institutions, such as government records, cultural heritage, and science data. we strive to make our public data and cultural memory as open to research as twitter.

http://dcicblog.umd.edu/cas/about/ http://bit.ly/ ll et http://bit.ly/ ll et http://dcicblog.umd.edu/cas/ http://dcicblog.umd.edu/cas/ieee_big_data_ _cas-workshop/ https://archives .sched.com/event/ f d https://www.cni.org/topics/digital-curation/drastic-measures-digital-repository-at-scale-that-invites-computation-to-improve-collections

computational analysis happens in various technical environments: on a single server; in distributed clusters; on cloud services. the tools we use have unique requirements, configurations, and hardware. it is said that a data stewardship organization cannot anticipate the uses for their data, but it is equally true that they cannot anticipate the tools used for analysis. organizations need a service strategy that serves a range of users, from the most technically innovative to the most time- and resource-constrained. we describe a range of services for collections as data without losing sight of core services. this is a “maturity model” for stewardship organizations, with levels of computational services that show a clear progression toward full service.

2.1 core service level

shipping datasets into the researcher compute environment remains the critical use case, maximizing flexibility and allowing researchers to link many datasets into one corpus. researchers need to discover, scope, ship and make reference to datasets. though we may also move computational work across them, boundaries are an important place to define stable conditions, such as custody, provenance, security, and concise technical contracts. even the most advanced repository must establish these boundary conditions.

● define license terms: how can we use the data?
● define provenance:
  ○ who produced the data and why?
  ○ how did it arrive here?
  ○ do versions exist elsewhere?
● define dataset scope:
  ○ what makes the corpus complete?
  ○ is it complete?
  ○ is it growing? what is the update history?
● transfer methods with integrity verification and resume from failure
● persistently citable datasets

2.2 protocols service level

● file-by-file transfer through http api (instead of batch downloads, like zips)
● define citable subsets through custom queries or functions
● check for updates to any dataset or subset (via http api)
● http api for navigation of structured collections:
  ○ static site (apache or nginx auto-index of files)
  ○ cloud data management interface (cdmi)
  ○ linked data platform (and fedora api); see https://www.w .org/tr/ /note-ldp-primer- /
● delivery to cloud and cloud-hosted, public datasets

2.3 enhanced service level

● derived data available as subsets:
  ○ plain text for documents and images
  ○ normalized file formats
  ○ tabular data for table-like sources
  ○ linked data for graph-like sources
● machine-readable provenance records
● crowd-sourcing of metadata
● named entity indexing and subsetting (people, places, organizations, dates, events)
● geospatial indexing and subsetting
● consistent and citable random sample subsets (add random seeds to each observation)

2.4 computer room service level

container technologies, such as docker, ship a custom compute environment to the dataset location. a hosted database can be opened up for queries or distributed compute jobs. while not as flexible as the researcher environment, computer room services provide rapid and cost-effective analysis. journalists on deadline benefit most from computer room services.

there are also growing calls, beyond the physical sciences, for analysis of big collections data in journalism and humanities scholarship. the sheer scale of big data makes transfer prohibitive, and provisioning enough storage to host an entire corpus is equally so. at the digital curation innovation center at the university of maryland’s ischool, we are actively developing the dras-tic repository (digital repository at scale that invites computation). through dras-tic we aim to deliver computer room-style services over heterogeneous digital collections and remove the limits of scale.

● run an apache spark job on a defined dataset
● host a compute container with a dataset mounted locally
● sparql query service
● use techniques above to produce a new subset for transfer

3. provisioning the researcher environment

from code notebooks to deployment scripts that provision clusters, it becomes easier to create and share compute environments. research that aims towards publication will also need to track the steps of the research workflow. through machine readable scripts and provenance, we can aim to reproduce an analysis at a different time and place, starting from the cited datasets and well described methods. the curation activities performed by a stewardship organization and the steps taken by the researcher can form an unbroken chain of events leading to a reproducible product.

summary

for verifiable results in scholarship, or public trust in an independent press, we need to provide relevant datasets and services that make it straightforward to trace findings back to their source in the public record. we must confront a rightly skeptical reader, who faces increasingly high-flying visualizations and claims made from them. they are correct to demand links to the underlying evidence and methods. by providing these we enrich public understanding and trust. at the digital curation innovation center (dcic) we have committed to this agenda and pursue it through our research projects, scholarly activities, the active development of the dras-tic software project, and the building of a computational archival community.
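
the enhanced service level above asks for consistent and citable random sample subsets. one way to sketch that idea is to give every observation a stable pseudo-random value and select the records that fall below a chosen fraction. in the example below the value is derived from a hash of the record identifier rather than stored alongside the record, which is an assumption of this sketch and not necessarily what the authors intend; the function names and the `dataset-v1` salt are hypothetical.

```python
import hashlib


def observation_seed(record_id: str, salt: str = "dataset-v1") -> float:
    """Map a record identifier to a stable value in [0, 1).

    The salt is a hypothetical dataset/version label; changing it
    reshuffles the values, so a citation should name it.
    """
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).digest()
    # Keep the top 53 bits so the result is an exact float strictly below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def sample_subset(record_ids, fraction, salt="dataset-v1"):
    """Return the records whose seed falls below `fraction`.

    The same (salt, fraction) pair always selects the same records, and a
    larger fraction always yields a superset of a smaller one.
    """
    return [r for r in record_ids if observation_seed(r, salt) < fraction]
```

under these assumptions a subset can be cited by dataset version, salt, and fraction alone, and anyone holding the same identifiers can regenerate it without the repository shipping a list of members.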
https://saaers.wordpress.com/ / / /building-a-computational-archival-science-community/

partnership recommended – the case of curating research data collections[ ]
lisa johnston, university of minnesota libraries

digitization alone is not enough to support large-scale computational analysis of library collections. rather, the more difficult steps of digital curation will be necessary to prepare our collections for appropriate reuse. partnership may be the key.

take for example the problem of analog data. the extraction of historical climate data from tables and charts and other artifacts (e.g., zooniverse's old weather project) is an ambitious and important undertaking, as these data are undeniably valuable and temporally unique. yet the digitization of data points from the written page is just the first step toward a greater integration of their meaning in modern and future research. in order for computation on these collections to be successful, the digital surrogate must be curated in a number of ways. the data may be transformed, cleaned, normalized, described, and contextualized, and quality assurance measures put in place to ensure trust and track provenance of the work, to name a few. data curation activities prepare and maintain research data in ways that make it findable, accessible, interoperable and reusable (fair).
in our work, the data curation network project (https://sites.google.com/site/datacurationnetwork) has taken steps to better understand the data curation activities mentioned above and identify ways to harness the domain and file format expertise needed to curate research data across a network of partner institutions.[ ] we represent academic library data repository programs that are staffed with curation experts for a range of data domains and data file formats. our goals are to develop practical and transparent workflows and infrastructure for data curation, promote data curation practices across the profession in order to build an innovative community that enriches capacities for data curation writ large, and most importantly, develop a shared staffing model that enables institutions to better support research by collectively curating research data in ways that scale beyond what any single institution might accomplish individually.

we are not alone in this desire to partner on data curation skills, staff, and infrastructure. national examples include the portage network (https://portagenetwork.ca), developed by the canadian association of research libraries (carl), which aims to support library-based data management consultation and curation services across a broader network, and the jisc-funded research data management shared service project (https://www.jisc.ac.uk/rd/projects/research-data-shared-service), which aims to develop a lightweight service framework that can scale to all uk institutions and result in efficiencies by “relieving burden from institutional it and procurement staff.” in the us, partnerships on technological infrastructure are booming. project hydra’s sufia platform (https://projecthydra.org), which builds on the duraspace fedora framework, has been co-developed by numerous institutions that seek to build a better digital repository infrastructure for data. and the hydra-in-a-box project (http://hydrainabox.projecthydra.org/), led in part by another partnership success story for disseminating archival materials, the digital public library of america, aims to provide a networked platform for repository services that will scale for institutions big and small. another inspiring example is the research data alliance (https://rd-alliance.org/), which provides an incubator for collaboration around a range of data-related topics. rda projects to track include the publishing data workflows working group and the newly formed research data repository interoperability working group.

partnerships do not necessarily need to start at the national level. several smaller-scale partnerships underway for sharing curation staff expertise across institutions include the digital liberal arts exchange (https://dlaexchange.wordpress.com/), which facilitates data-related problem solving and communication amongst peers and provides hosting services that allow digital humanities projects to be run on shared infrastructure, and the dataq project, which provides a virtual online forum for expert data staff to discuss and provide solutions for data issues in a collaborative way.

by partnering on data curation efforts like these we may move beyond individualized digital curation strategies toward what i hope will become a robust “network” of digital collections that are computational, but also trusted. and as partners in this effort we may continue a shared dialogue and collectively develop new and improved processes for curating research data and other digital objects. finally, our networked research collections will demonstrate the continuing and important role that libraries and archives have to play in the broader scholarly process.

works cited

[ ] portions of this statement were also published in “concluding remarks” by lisa r. johnston in curating research data volume  : a handbook of current practice (acrl,  ) available as an open access ebook at http://www.ala.org/acrl/publications/booksanddigitalresources/booksmonographs/catalog/publications .

[ ] currently in its planning phase, the data curation network aims to expand into a sustainable entity that grows beyond its initial six partner institutions: the university of minnesota (lead), the university of illinois, cornell university, the university of michigan, penn state university, and washington university in st. louis.
http://researchdataq.org/

ways of forgetting: the librarian, the historian, and the machine
matthew lincoln, getty research institute

jorge luis borges tells us of funes, the memorious: a man distinguished by his extraordinary recall. so precise and complete were funes' memories, though, that it was impossible for him to abstract from the near-infinity of recalled specifics he possessed to general principles for understanding the world:

locke, in the seventeenth century, postulated (and rejected) an impossible idiom in which each individual object, each stone, each bird and branch had an individual name. funes had once projected an analogous idiom, but he had renounced it as being too general, too ambiguous. in effect, funes not only remembered every leaf on every tree of every wood, but even every one of the times he had perceived or imagined it... he was, let us not forget, almost incapable of general, platonic ideas... he was not very capable of thought. to think is to forget a difference, to generalize, to abstract. in the overly replete world of funes there were nothing but details, almost contiguous details. (borges  ,  )

attending to drucker's admonition that all "data" are properly understood as "capta", the story of funes is a potent reminder that it is not only inevitable that we will be selective when capturing datasets from our collections, but that it is actually necessary to be selective (drucker  ). a data set that aims for perfect specificity does so at the expense of allowing any generalizations to be made through grouping, aggregating, or linking to other datasets. for our data to be useful in drawing broad conclusions, it is an imperative to forget.

however, in considering library and museum collections as data, we must grapple with several different frameworks of remembering, forgetting, and abstracting: that of the librarian, the historian, and the machine. these frameworks will often be at cross-purposes:

● the librarian favors data that is standard: forgetting enough specifics about the collection in order to produce data that references the same vocabularies and thesauri as other collection datasets. the librarian's generalization aims to support access by many different communities of practice.

● the historian favors data that is rich: replete with enough specifics that they may operationalize that data in pursuit of their research goals, while forgetting anything irrelevant to those goals. the historian's generalization aims to identify guiding principles or exceptional cases within a historical context. (no two historians, of course, will agree on what that context should be.)
● the machine favors data that is structured: amenable to computation because it is produced in a regularized format (whether as a documented corpus of text, a series of relational tables, a semantic graph, or a store of image files with metadata). in a statistical learning context, the machine seeks generalizations that reduce error in a given classification task, forgetting enough to be able to perform well on new data without over-fitting to the training set.

at the getty research institute, our project to remodel the getty provenance index® as linked open data is compelling us to balance each of these perspectives against the labor required to support them. our legacy data is filled with a mix of transcriptions of sales catalogs, archival inventories, and dealer stock books, paired with editorial annotations that index some of those fields against authorities or other controlled vocabularies. originally designed to support the generation of printed volumes, and then later a web-based interface for lookup of individual records, these legacy data speak mostly about documents of provenance events, and do so for an audience of human readers. to make these data linkable to museums that are producing their own linked open data (following the general cidoc-crm principles of defining objects, people, places, and concepts through their event-based relationships), we are transforming these data into statements about those provenance events themselves. in so doing, we are standardizing the terms referenced, enriching fields by turning them from transcribed strings into uris of things, and explicitly structuring the relationships between these data as an rdf graph.

all this work requires dedicated labor. this leads to hard questions about priorities.

to what extent do we preserve the literal content of these documents, versus standardizing the way that we express the ideas those documents communicate (in so far as we, as modern-day interpreters, can correctly identify those ideas)? to maintain (to remember) plain text notes about, say, an object's materials as recorded by an art dealer, is to grant the possibility of perfect specificity about what our documents say. but not aligning descriptions with authoritative terms for different types of materials and processes forecloses the possibility of generalizing about the history of those materials and processes across hundreds of thousands of objects. remember too much, in other words, and we become funes: incapable of synthetic thought.

capacious collections data must remember enough and forget enough to be useful. for which terms will we expend the effort to do this reconciliation? which edge cases will we try to capture in an ever-more-complex data model? opinions on how to draw that line will frequently set the librarian, the historian, and the machine at cross purposes.
outlining the necessary competencies a collections data production team needs, and the key questions it must ask, in order to navigate these perspectives must therefore be a crucial output of this forum.

works cited

borges, jorge luis.  . “funes, the memorious.” in ficciones, edited by anthony kerrigan,  – . new york: grove press.

drucker, johanna.  . graphesis: performative approaches to graphical forms of knowledge production in the humanities. cambridge: harvard university press.

http://www.getty.edu/research/tools/provenance/provenance_remodel/index.html

assessing data workflows for common data 'moves' across disciplines
alan liu, university of california santa barbara

in considering how library collections can serve as data for a variety of data ingest, transformation, analysis, replication, presentation, and circulation purposes, it may be useful to compare examples of data workflows across disciplines to identify common data "moves" as well as points in the data trajectory that are especially in need of library support because they are, for a variety of reasons, brittle.

we might take a page from current research on scientific workflows in conjunction with research on data provenance in such workflows. scientific workflow management is now a whole ecosystem that includes integrated systems and tools for creating, visualizing, manipulating, and sharing workflows (e.g., wings, apache taverna, kepler, etc.).
at the front end, such systems typically model workflows as directed, acyclic network graphs whose nodes represent entities (including data sets and results), activities, processes, algorithms, etc. at many levels of granularity, and whose edges represent causal or logical dependencies (e.g., source, output, derivation, generation, transformation, etc.) (see fig.). data provenance (or "data lineage" as it has also been called in relation to workflows) complements that ecosystem through standards, frameworks, and tools‑‑including the open provenance model (opm), the w3c's prov model, provone, etc. linked‑data provenance models have also been proposed for understanding data‑creation and ‑access histories of relations between "actors, executions, and artifacts."[1] in the digital humanities, the in‑progress "manifest" workflow management system combines workflow management and provenance systems.[2]

the most advanced research on scientific workflow and provenance now goes beyond the mission of practical implementation to meta‑level analyses of workflow and provenance. the most interesting instance i am aware of is a study by daniel garijo et al. that analyzes workflows recorded in the wings and taverna systems to identify high‑level, abstract patterns in the workflows.[3] the study catalogs these patterns as data‑oriented motifs (common steps or designs of data retrieval, preparation, movement, cleaning/curation, analysis, visualization, etc.)
and workflow‑oriented motifs (common steps or designs of "stateful/asynchronous" and "stateless/synchronous" processes, "internal macros," "human interactions versus computational steps," "composite workflows," etc.). then, the study quantitatively compares the proportions of these motifs in the workflows of different scientific disciplines. for instance, data sorting is much more prevalent in drug discovery research than in other fields, whereas data‑input augmentation is overwhelmingly important in astronomy.

since this usage of the word motifs is unfamiliar, we might use the more common, etymologically related word moves to speak of "data moves" or "workflow moves." a move connotes a combination of step and design. that is, it is a step implemented not just in any way but in some common way or form. in this regard, the russian word motiv for "motif," used by the russian formalists and vladimir propp, nicely backs up the choice of the word move to mean a commonplace data step/design. indeed, propp's diagrammatic analyses of folk narratives (see fig.) look a lot like scientific workflows. we might even generalize the idea of "workflows" in an interdisciplinary way and say, in the spirit of propp, that they are actually narratives. scientists, social scientists, and humanists do not just process data; they are telling data stories, some of which influence the shape of their final narrative (argument, interpretation, conclusion).
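the workflow graphs and motif proportions described above can be made concrete with a minimal sketch. everything in it is an assumption for illustration: the step names, the move labels, and the helper functions are hypothetical and are not taken from wings, taverna, or the garijo et al. study. it models a workflow as a directed acyclic graph, derives a valid execution order, traces the lineage (provenance) of one result, and tallies the proportion of each kind of move.

```python
from collections import Counter
from graphlib import TopologicalSorter

# hypothetical workflow: each step maps to the set of steps it depends on
workflow = {
    "raw_corpus": set(),
    "cleaned_text": {"raw_corpus"},
    "word_counts": {"cleaned_text"},
    "topic_model": {"word_counts"},
    "visualization": {"topic_model", "word_counts"},
}

# hypothetical labels assigning each step to a kind of "data move"
moves = {
    "raw_corpus": "retrieval",
    "cleaned_text": "cleaning",
    "word_counts": "preparation",
    "topic_model": "analysis",
    "visualization": "visualization",
}


def execution_order(graph):
    """Order steps so each one follows its dependencies (raises CycleError on a cycle)."""
    return list(TopologicalSorter(graph).static_order())


def lineage(graph, node):
    """Collect every upstream step that a node was derived from."""
    seen = set()
    stack = list(graph[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


def move_proportions(labels):
    """Share of workflow steps that fall under each kind of move."""
    counts = Counter(labels.values())
    total = sum(counts.values())
    return {move: n / total for move, n in counts.items()}
```

calling `lineage(workflow, "visualization")` returns the four upstream steps, which is the kind of record a provenance model would keep; comparing `move_proportions` across workflows from different fields is a toy version of the motif comparison.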
the takeaway from all the above is that a comparative study of data workflow and provenance across disciplines (including sciences, social sciences, humanities, arts) conducted using workflow modeling tools could help identify high‑priority "data moves" (nodes in the workflow graphs) for a library‑based "always already computational" framework.

one kind of high priority is likely to be very common data moves. for example, imagine that a comparative study showed that in a sample of in silico or data analysis projects across several disciplines over % of the data moves involved r‑based or python‑based processing using common packages in similar sequences (perhaps concatenated in jupyter notebooks); and, moreover, that among this number % were common across disciplinary sectors (e.g., science, social science, digital humanities). then these are clearly data moves to prioritize in planning "always already computational" frameworks and standards.

another kind of high priority may be data moves that involve a lot of friction in projects or in the movement of data between projects. one simple example pertains to researchers at different universities ingesting data from the "same" proprietary database who are prevented from standardizing live references to the original data because links generated through their different institutions' access to the databases are different. friction points of this kind identified through a comparative workflow study are also high value targets for "always already computational" frameworks and standards.
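the first kind of priority, moves that recur across disciplinary sectors, could be measured along the following lines. this is only a sketch under stated assumptions: the sectors, move names, and counts are invented sample data, and the two helper functions are hypothetical rather than part of any existing workflow tool.

```python
from collections import Counter

# invented sample: data moves observed in projects, grouped by sector
observed = {
    "science": ["load_csv", "clean", "regression", "plot"],
    "social science": ["load_csv", "clean", "survey_weighting", "plot"],
    "digital humanities": ["load_csv", "clean", "tokenize", "topic_model", "plot"],
}


def common_moves(by_sector):
    """Moves that occur at least once in every sector."""
    return set.intersection(*(set(moves) for moves in by_sector.values()))


def common_share(by_sector):
    """Fraction of all observed move instances that belong to a cross-sector move."""
    shared = common_moves(by_sector)
    counts = Counter(m for moves in by_sector.values() for m in moves)
    total = sum(counts.values())
    return sum(n for move, n in counts.items() if move in shared) / total
```

with this sample, `common_moves(observed)` contains `load_csv`, `clean`, and `plot`, and `common_share(observed)` reports what fraction of all recorded steps those shared moves account for. a real study would need a controlled vocabulary of moves before such counts mean anything.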
finally, one other kind of high priority data move deserves attention for a combination of practical and sensitive issues. many scenarios of data research involve the generation of transient data products (i.e., data that has been transformed at one or more steps of remove from the original data set). a comparative workflow study would identify common kinds of transient data forms that require holding for reasons of replication or as supporting evidence for research publications. in addition, because some data sets cannot safely be held because of intellectual property or irb issues, transformed datasets (e.g., converted into "bags of words," extracted features, anonymized, aggregated, etc.) take on special importance as holdings. a comparative workflow study could help identify high‑value kinds of such holdings that could be supported by "always already computational" frameworks and standards.

works cited

[1] hartig, olaf. "provenance information in the web of data." in proceedings of the linked data on the web workshop at www, edited by christian bizer, tom heath, tim berners‑lee, and kingsley idehen. http://ceur‑ws.org/vol‑ /ldow _paper .pdf.

[2] kleinman, scott. draft manifest schema. whatevery1says (we1s) project, 4humanities.org.

[3] garijo, daniel, pinar alper, khalid belhajjame, oscar corcho, yolanda gil, and carole goble. "common motifs in scientific workflows: an empirical analysis." ieee th international conference on e‑science (e‑science). doi: . /escience. . .
http://ieeexplore.ieee.org/document/ /?section=abstract

at the intersection of institution and data
matthew miller, new york public library

libraries are awash in data, from the large reservoirs of bibliographic metadata that power discovery and access systems, to boutique datasets created from the documents themselves and even the ephemeral data exhaust produced by staff and patrons conducting research. emerging from practical day‑to‑day work with this type of data, below are some proposed observations and questions around description, distribution and access that are potentially useful and could benefit from closer examination.

the most potentially kinetic, computationally amenable data comes from the conversion and processing of documents themselves. transforming documents into data at the new york public library took the form of small projects that converted special collection materials into datasets through the power of algorithms, staff and the crowd. the result was a domain‑specific dataset, often with a necessarily unique data model. taking stock of the growing number of these datasets, we theorized about their possible integration with our traditional metadata systems. would it be possible to go beyond simply linking to the dataset as a digital asset?
if we were to build an rdf metadata system from the ground up, could we begin thinking of it as an open‑world assumption system where the contents of these datasets could exist alongside traditional bibliographic metadata? as more cultural heritage organizations continue to produce similar datasets, we need to consider how they shape the next generation of our metadata and discovery platforms.

stepping back from this larger question, when thinking about these resources as discrete datasets, what work could be done to improve their use and interoperability? w3c standards such as the void vocabulary provide the means to describe the metadata about datasets. by leveraging such standards and establishing best practices and preferred authorities, could we increase access across humanities datasets? how much work and what sort of resources are required to accomplish this at the dataset level, and perhaps at the data level as well? for example, using common non‑bibliographic authorities such as wikidata uris in the data could facilitate interoperability across datasets and even institutions.

publishing data for others involves a balance between providing access to the data in a format that offers the least friction for adoption and use, and reflecting how knowledge organization systems work within a cultural heritage institution. this often requires preprocessing library metadata, turning it into a more accessible form that does not require extensive domain knowledge.
for example, when releasing the metadata for nypl’s public domain images we did not publish the mods xml metadata, the format in which it is natively stored in our systems. instead we opted to publish it as json and also as simple csv files along with extensive documentation. reducing the complexity of the format reduced the complexity of the tools and skills needed to work with it.

another example, taking this approach a step further, is the linked jazz project, in which we provided access to the data in the form of a sparql endpoint. the data, which is stored as rdf statements, represent a social network of jazz musicians. this dataset lends itself to network analysis using popular tools such as gephi. to make the application of such a tool as simple as possible we added a gephi file export api allowing anyone to quickly download a gexf file of part of or the whole network to import into the software. this sort of scholarly api is geared for delivering the resources needed to begin utilizing the data immediately as opposed to just providing access to the underlying data store.

the topic of preprocessing introduces the question of best practices and standards that could be followed to ensure the broadest access to our datasets. what are some additional use cases that could drive shared best practices or tools for releasing cultural heritage data?
is there more advanced preprocessing that could be done to some of the common archetypical data formats found in libraries, archives and museums? and what sort of resources are required in an organization to process datasets for public consumption?

as institutions increasingly produce and release datasets, establishing some best practices around description, distribution and access can facilitate collaboration between organizations and ensure productive use of these resources by patrons.

metadata and digital repository accessibility issues for library collections as data
anna neatrour, university of utah

in thinking of ways to use library collections as data, i was struck with the theme of accessibility. are researchers genuinely invited to engage with library collections as data? i’m going to focus on this narrowly, looking mainly at aspects of metadata and technical infrastructure in digital repositories.

metadata as invitation to computation

encouraging usage of library collections as data could be embedded in digital collections metadata by including a statement that metadata is free to reuse, providing a cc license, or stating that metadata is open as a policy. one example of this is seen in the harvard policy on open metadata.
many institutions have agreed that their metadata is in the public domain, which is a condition for harvest by dpla, but there is often no metadata reuse statement available at the item or collection level in the source digital repositories for these shared collections. making it clear that we expect metadata to be reused and repurposed improves the accessibility of digital library collections as data. providing an easy way for researchers to download metadata in addition to a digital image might also encourage more research engagement with digital collections metadata. an example of this can be found in the university of hull’s repository, where records are easily downloaded in mods or dublin core. in addition, highlighting investigations undertaken by repurposing library metadata within the digital repository itself could spark additional ideas for research from people who might be encountering this possibility for the first time.

make digital repositories more welcoming

while offering access to digital collections via an api may be an effective way of showing that computation is possible with digital collections, it doesn’t provide a welcoming environment for students or researchers who are at the initial stages of their research and who might not yet have the technical expertise to utilize an api.
providing a portal to a suite of sample apps created with an api, as dpla does, along with the search interface for a digital repository creates a signal that application development and computation utilizing a digital library is both possible and desired.

with libraries everywhere continually being asked to do more with less, curating all digital collections for computational purposes may be impossible. however, developing easy ways of bulk download for both images and metadata outside of an api may open up windows for researchers. providing clear methods to download digital objects across different collections, or to interact with images across repositories through a framework like iiif, could be yet another method for enabling researchers to interact with library collections as data.

digital collection managers may be able to curate new local or regional corpora by thinking creatively about digital items they already own. for example, in my own library at the university of utah, i’ve wondered about the possibility of making our typewritten oral history transcripts available to researchers. these oral histories were scanned as pdfs, and i expect the ocr would be decent enough to support text‑based topic modeling. figuring out how to make these resources accessible to researchers by packaging them in a way that would encourage computational use is a goal of mine.

http://library.harvard.edu/open-metadata
https://hydra.hull.ac.uk/
https://dp.la/apps
what does a digital collections as data repository look like?

providing additional layers and portals that bring computational exploration to existing collections might serve as an intermediate step. imagine if text‑based digital collections also had a voyant‑like layer built into the digital repository itself that researchers could use, along with pre‑populated queries and visualizations so people at the beginning stages of inquiry could see examples of text analysis. this could support an introductory approach to exploring collections as data in the classroom. many digital library repositories leverage visual possibilities for geospatial visualization and browsing, as in the open parks network map that shows thumbnail images of digital items along with map locations. could an interface be built into a digital repository that would enable researchers to easily mash up digital items into a personalized portal that would support geospatial visualization without the need to download metadata, enhance information with coordinate data, and then create a more static map in an external system from that exported data? could our digital repositories provide a mechanism for researchers to curate their own research collections, providing a space where digital library objects could be combined with researcher‑supplied data? any approach has to blend what is pragmatically possible with support for experimentation within the existing infrastructure for our digital repositories.
keeping in mind the idea of accessibility for researchers and library users at all stages of inquiry will hopefully result in an effective blend of solutions for interacting with library collections as data.

i’d like to thank jeremy myntti and jim mcgrath for providing feedback on a draft of this position statement.

http://openparksnetwork.org/map/?b=yes&c=yes&z= &k=pubcode% acaha

actually useful collection data: some infrastructure suggestions
miriam posner

libraries and archives are increasingly making their materials available online, but, as a general rule, these materials aren’t of much use for computational purposes. for the most part, institutions have sought to replicate as closely as possible the experience of being in a reading room with an individual object. we see this in artifacts like skeuomorphic “swishes” on digital page‑turns, mammoth lists of browsable topics, and, what concerns me most here, the inability to download large quantities of object metadata. many of us have learned the basics of webscraping precisely to get around this problem, laboriously writing scripts to harvest metadata that we know must already exist somewhere, as data, in a repository.

there are many good reasons cultural institutions impose these limitations on their metadata. for one thing, it’s not at all clear how many people actually want to treat collections as data.
most patrons aren’t accustomed to encountering data in a cultural institution. so perhaps archives are just being good stewards of limited resources by focusing their attention on simply making digital facsimiles available. but the lack of collection data also limits other people’s imaginations about what they might do with collections’ materials.

i’ve also been told by various institutions that they don’t have the right metadata for researchers to work with ‑‑ that their descriptive information is often schematic, high‑level, and meant for search and discovery, not for visualization and analysis. i agree that this is a concern that we need to take seriously, but i contend that even the most basic metadata is often more useful for understanding a collection than many librarians imagine. simply having author or creator information, or language information, can be very helpful. my impression is that many institutions are holding onto their data tightly, with the hope of cleaning and improving it in the future. but researchers can work with imperfect data, if its limitations are discussed frankly. we can also contribute improved data back to the institution.

going forward, i imagine multiple pieces of infrastructure that could help make the data of cultural institutions as widely usable ‑‑ and widely used ‑‑ as possible:

a workable humanities data repository or registry. a good many open data repositories already exist.
most of them are designed to hold scientific data, although this need not disqualify them for humanities data. humanists are actively contributing data (albeit on a relatively small scale) to general‑use data repositories such as figshare and zenodo. the more troublesome problem is that a) consensus hasn’t formed around one particular repository; and b) absent a central repository, no substitute, such as a data registry, gathers lists of cultural data in one place. what cultural data exists is stored, for the most part, on github ‑‑ fine for downloading, versioning, and contributing data, but a terrible way to discover new datasets. we need a better way to find cultural data.

consideration of apis versus “data dumps.” many cultural institutions, reasonably enough, offer apis as a means of accessing their data. this makes sense for a lot of different reasons, including access to the most recent data and the ability to retrieve institutions’ data in many different ways. the problem here is that many humanists can work with structured data, but not with apis. many common visualization tools require no programming, and so it’s possible for humanists to work with data, even in sophisticated, thoughtful ways, without necessarily knowing how to program.
developers at cultural institutions may feel that learning an api is trivial, but for many people, the availability of simple flat files can be the difference between using and not using a dataset. i therefore hope that cultural institutions will consider the possibility of providing unglamorous flat files, in addition to api access to their data.

really lowbrow thought about data formats. very simply, my students can work with csvs, but not xml or json. visualizing and analyzing the latter two formats takes programming knowledge, while even non‑coders can import csvs into excel and create graphs and charts. obviously, one can convert xml and json to csvs, but doing this requires some knowledge of these formats, and sometimes some programming (or at least command‑line) ability.

case studies. it may seem unlikely, given the recent proliferation of digital humanities journals, but it’s relatively difficult to find vetted, a‑to‑z, soup‑to‑nuts examples of how to build visualizations and analysis from datasets. the aggregation of a number of fairly simple examples would, i believe, go far in demonstrating how people might use datasets in their own work, and would certainly be of great utility in the classroom. the key here would be to keep the examples quite simple, so that people can replicate and build on them with relative ease.
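to give a sense of how much programming the json‑to‑csv conversion mentioned above actually involves, here is a minimal sketch using only the python standard library. the sample records and field names are invented for illustration, and the flattening rules (dotted column names for nested objects, "; " between list values) are one assumption among many possible choices, not a standard.

```python
import csv
import io
import json


def flatten(record, prefix=""):
    """Turn nested objects into dotted column names; join list values with '; '."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = "; ".join(str(item) for item in value)
        else:
            flat[name] = value
    return flat


def json_to_csv(json_text):
    """Convert a json array of records into csv text with one row per record."""
    records = [flatten(record) for record in json.loads(json_text)]
    # keep columns in order of first appearance across all records
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # fields missing from a record are left blank
    return out.getvalue()


# invented sample metadata, not drawn from any real collection
sample = """[
  {"title": "letter", "creator": {"name": "a. author"}, "languages": ["en", "fr"]},
  {"title": "map", "creator": {"name": "b. maker"}, "date": "1850"}
]"""

print(json_to_csv(sample))
```

even this small example requires decisions about nesting and repeated values that a spreadsheet user never sees, which is the argument for institutions publishing the flat file themselves.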
interoperability and community building
sheila rabun, international image interoperability framework (iiif) consortium

i am coming from a non‑traditional background, with a master’s in interdisciplinary folklore studies, having gained the majority of my experience in libraries as the digital project manager and subsequently the interim director of the university of oregon (uo) libraries’ digital scholarship center. among many digital projects, i was responsible for the oregon digital newspaper program, where we made large sets of newspaper ocr data and images available to the public online, following the library of congress’ chronicling america site and open api. while digital newspaper data has been used to create visualizations and other computational projects (for example, the mapping texts collaboration between the university of north texas and stanford university), the learning curve for scholars to find, harvest, and use the data provided remains a challenge. students and faculty from all subject areas are increasingly looking to library and information professionals for guidance on where to find accessible data resources, how to use them, and recommendations on platforms for sharing their work.
in addition to determining best practices for making collections available as data, comprehensive training materials and documentation for end users will be key to lowering the barrier to entry, making it easier for researchers to get started working with data on their own and encouraging wider re‑use and experimentation.

over the past months i have shifted my focus slightly, as the community and communications officer for the international image interoperability framework (iiif) consortium, to improve digital image repository maintenance and sustainability as well as access and functionality for end users. as a community‑driven initiative including national and state libraries, museums, research institutions, software firms, and other organizations across the globe, iiif provides specifications for publishing digital image collection data to allow for interoperability across repositories. iiif specifically addresses the “data silo” problem that has been plaguing the digital repository community, particularly by using existing standards and models such as json‑ld and web annotation that make sharing and re‑use easy. a growing number of digital image repositories are adopting iiif, and the iiif consortium has grown to include institutional members since it was formed in .

the iiif community and specifications are especially relevant to the goals of the always already computational (aac) work, particularly regarding digital images.
iiif has laid a groundwork for creation of                               a library collections as data as an internationally agreed‑upon best practice for making digital image data                                shareable and more usable for study. iiif utilizes json‑ld manifests (representations of a physical object                              such as a book, as described in the  iiif presentation api ), to encourage sharing, parsing, and re‑use of                                    data regardless of differing metadata schemas across collections and repositories. the iiif community                          has built the specifications specifically around  use cases to solve real problems, so far primarily focusing                                on the needs of those both using and making available digitized manuscripts, newspapers, and museum                              collections.    we are currently working on extending the iiif specifications to include interoperability for  audio and/or                              visual materials (with d materials further along the roadmap), as well as improved  discovery of                              iiif‑compatible resources on the web. collaboration with the existing community that has formed                          around iiif will be essential for the work of aac and we welcome new interested parties to get involved,                                        http://chroniclingamerica.loc.gov/ http://chroniclingamerica.loc.gov/about/api/ http://mappingtexts.org/index.html http://iiif.io/ http://iiif.io/community http://iiif.io/technical-details http://iiif.io/community/consortium http://iiif.io/api/presentation/ . 
inform and provide feedback on approaches for discovery and stay informed about new innovations. libraries and museums have been the primary adopters so far, but we have plans to do more outreach to scholars and researchers in all disciplines, stem imaging providers, publishers, and the commercial sector. vendors like contentdm and luna have incorporated iiif into their products, and iiif is gaining speed in open source efforts like the hydra‑in‑a‑box repository product, which is iiif‑compatible. the goals of iiif and aac are in alignment, and there is an exciting potential to work more closely together, leveraging the existing iiif community network and technical framework to create and build upon best practices.

from libraries as patchwork to datasets as assemblages?
mia ridge, british library

the british library's collections are vast, and vastly varied, with ‑ million items in most known languages. within that, there are important, growing collections of manuscript and sound archives, printed materials and websites, each with its own collecting history and cataloguing practices.
perhaps ‑ % of these collections have been digitised, a process spanning many years and many distinct digitisation projects, with an ensuing patchwork of imaging and cataloguing standards and licences. this paper represents my own perspective on the challenges of providing access to these collections and others i've worked with over the years.

many of the challenges relate to the volume and variety of the collections. the bl is working to rationalise the patchwork of legacy metadata systems into a smaller number of strategic systems.[ ] other projects are ingesting masses of previously digitised items into a central system, from which they can be displayed in iiif‑compatible players.[ ]

the bl has had an 'open metadata' strategy since , and published a significant collection of metadata, the british national bibliography, as linked open data in .[ ] some digitised items have been posted to wikimedia commons,[ ] and individual items can be downloaded from the new iiif player (where rights statements allow). the bl launched a data portal, https://data.bl.uk/, in . it's work‑in‑progress ‑ many more collections are still to be loaded, the descriptions and site navigation could be improved ‑ but it represents a significant milestone many years in the making. the bl has particularly benefitted from the work of the bl labs team in finding digitised collections and undertaking the paperwork required to make them freely available.
the bl labs awards have helped gather examples of creative, scholarly and entrepreneurial re‑use of digitised collections, and bl labs competitions have led to individual case studies in digital scholarship while helping the bl understand the needs of potential users.[ ] most recently, the bl has been working with the bbc's research and education space project,[ ] adding linked open data descriptions about articles to its website so they can be indexed and shared by the res project.

in various guises, the bl has spent centuries optimising the process of delivering collection items on request to the reading room. digitisation projects are challenging for systems designed around the 'deliverable item': the digital user may wish to access or annotate a specific region of a page of a particular item, but the manuscript itself may be catalogued (and therefore addressable) only at the archive box or bound volume level. the visibility of research activities with items in the reading rooms is not easily achieved for offsite research with digitised collections. staff often respond better to discussions of the transformational effect of digital scholarship in terms of scale (e.g. it's faster and easier to access resources) than to discussions of newer methods like distant reading and data science.

the challenges the bl faces are not unique.
the cultural heritage technology community has been discussing the issues around publishing open cultural data for years,[ ] in part because making collections usable as 'data' requires cooperation, resources and knowledge from many departments within an institution. some tensions are unavoidable in enhancing records for use externally ‑ for example curators may be reluctant or short of the time required to pin down their 'probable' provenance or date range, let alone guess at the intentions of an earlier cataloguer or learn how to apply modern ontologies in order to assign an external identifier to a person or date field.

while publishing data 'as is' in csv files exported from a collections management system might have very little overhead, the results may not be easily comprehensible, or may require so much cleaning to remove missing, undocumented or fuzzy values that the resulting dataset barely resembles the original. publishing data benefits from workflows that allow suitably cleaned or enhanced records to be re‑ingested, and export processes that can regularly update published datasets (allowing errors to be corrected and enhancements shared), but these are all too rare.
dataset documentation may mention the technical protocols required but fail to describe how the collection came to be formed or what was excluded from digitisation or from the publishing process, let alone mention the backlog of items without digital catalogue records or digitised images. finally, users who expect beautifully described datasets with high quality images may be disappointed when their download contains digitised microfiche images and sparse metadata.

rendering collections as datasets benefits from an understanding of the intangible and uncertain benefits of releasing collections as data and of the barriers to uptake, ideally grounded in conversations with or prototypes for potential users. libraries not used to thinking of developers as 'users' or lacking the technical understanding to translate their work into benefits for more traditional audiences may find this challenging. my hope is that events like this will help us deal with these shared challenges.

works cited

[ ] the british library, ‘unlocking the value: the british library’s collection metadata strategy  ‑ ’.
[ ] the international image interoperability framework (iiif) standard supports interoperability between image repositories. ridge, ‘there’s a new viewer for digitised items in the british library’s collections’.
[ ] deloit et al., ‘the british national bibliography: who uses our linked data?’
[ ] https://commons.wikimedia.org/wiki/commons:british_library
[ ] http://www.bl.uk/projects/british‑library‑labs, http://labs.bl.uk/ideas+for+labs
[ ] https://bbcarchdev.github.io/res/
[ ] for example, the 'museum api' wiki page listing machine‑readable sources of open cultural data was begun in  http://museum‑api.pbworks.com/w/page/ /museum%c %a apis following discussion at museum technology events and on mailing lists.

maintaining the ‘why’ in data: consider user interaction and consumption of library collections
hannah skates kettler, university of iowa

always already computational represents the next hurdle for libraries, archives and museums. now that the profession is comfortable with the notion of digitization, and has reaped the rewards of greater and broader impact (proffitt and schaffner, ), it has turned its focus towards born digital materials. it's not that born digital materials, in , are a new notion; rather, they are a concept the profession has been aware of but has been hesitant to tackle. as a digital humanities professional, i deal with the use and creation of born digital materials every day and adapt to the multiplicitous ways library collections are created and made available, especially in the humanities.

i therefore approach the questions in always already computational with these concepts in mind:

relational datasets:

no library collection is an island.
library collections are not simply a list of ones and zeros that wait to be consumed and reused, then spat out again as something different. at least, not when we want to be able to cite them. data (which henceforth will be a stand in for 'library collections') must be persistent in order to be effectively accessible and reused for research. in order to amalgamate various datasets, an immense amount of time is spent standardizing the data into something that can be cross referenced and used computationally. although our data are unique, it does not necessarily follow that access should be as unique and idiosyncratic. what linked data has provided is a framework to link disparate ideas to each other relationally. i am particularly interested in the possibilities of linked data as it applies to datasets, which would allow one to describe contextual relationships between the data, relationships which typically are entirely use and user based. the aim is to generalize data in a way that is useful in multiple contexts by creating a framework that is flexible enough to accommodate data's multiplicity.

association of paradata:

pulling from experience with d collections, functioning without standards of how to make born digital materials more usable makes interfacing with other datasets much more difficult than with other more traditional data. for example, visual materials are much more reliant on supplemental contextual data than text.
that is not to say there is no context within textual data, but the aforementioned data could include context within it. visual data usually lacks this packaged approach. visuals are associated with text in order to provide that context. beyond catalogues, visual data's supplemental material is separated from and unintentionally disassociated from the visual (think a search result in an image database). few image datasets are accompanied by an explanation of why the image was created. true, one can infer based on the basic metadata included with the object, but without intent, it is much more difficult to make a judgement about why the dataset (as generated by an api for instance) is included and why others were not. it also makes it easier to fake, or misrepresent library data/collections.

cultural constructs of data:

compounding the narrowed context of textual and numerical datasets, problematic visual datasets, and even mixed data sets, you have the social constructs that support data. this aligns very well with the work that i and a group of librarians and museum professionals are doing in association with the digital library federation.
as was mentioned in the october information bulletin from the library of congress, "because there is no analog (physical) version of materials created solely in digital formats, these so‑called 'born‑digital' materials are at much greater risk of either being lost and no longer available as historical resources, or of being altered, preventing future researchers from studying them in their original form." their particular focus for this remark was the preservation of born‑digital data. now that the profession, to some extent, has the ability and focus for preservation of born‑digital, it is time to turn our eye to interoperability (like always already computational) and the cultural context of the data itself. consider the book the intersectional internet: race, sex, class, and culture online by safiya noble and brendesha tynes ( ) which underscores "how representation to hardware, software, computer code, and infrastructures might be implicated in global economic, political, and social systems of control." data without context is meaningless. data with context but without social awareness is deceptively meaningless. with that deception comes, in the worst case, the use and articulation of arguments founded on a lack of understanding and awareness of perpetuating ideas that are intrinsically linked to the creation and curation of said data. a question for this group would be: how do we attempt to preserve that context without overwhelming the user?
the always already computational group can hopefully come together to attempt to solve this and other concerns regarding digital aggregate data.

references

"'born digital': eight institutions and their partners received awards totaling almost $  million from the library to collect and preserve digital materials as part of the national digital information infrastructure and preservation program".  . library of congress information bulletin.  ( ):  ‑ .
noble, safiya umoja, and brendesha m. tynes.  . the intersectional internet: race, sex, class and culture online. isbn:  ‑ ‑ ‑ ‑ .
proffitt and schaffner.  . the impact of digitizing special collections on teaching and scholarship: reflections on a symposium about digitization and the humanities. report produced by oclc programs and research. published online at: www.oclc.org/programs/reports/ .pdf

people and machines both need new ways to access digitized artifacts nonconsumptively
ben schmidt, northeastern university

how can we integrate generations of high‑quality, professionally‑created metadata with electronic versions of the object itself? particularly when copyright comes into play, we can't simply hope for openness; and there's a steep trade‑off between the thoroughness of a well‑thought‑out standard and a simplicity of conception that makes a digital resource useful for (for instance) a graduate student just beginning to get interested in working with large collections.

when we digital humanities researchers say that we're working with the "full text" of a scanned book, it's usually more posturing than truth.
in fact, what datasets like the hathitrust research center's extracted features really do is just radically transform the amount of metadata we have; instead of knowing or things from a marc record (eg: the language, four or five subject headings, the author, the publisher), we just add on an additional several thousand ("how many times does it use the word "aardvark?" "aardvarks?" "abacus?"...). all the rest of the information (even simple stuff like syntax, word order, negation) is thrown out. it's great that organizations like jstor and hathi are starting to release this computationally‑derived metadata. but there's no clear way to incorporate this computational metadata into a traditional library catalog. the technical demands of even downloading something like the htrc ef set exceed both the technical competencies and computing infrastructure of most humanists‑‑i've literally spent several weeks recently, restarting downloads and identifying missing files as i try to fill up a raid array with several terabytes of data. processing these files into the raw material of research is even harder.

so how do we make collections accessible for work? there are two ways that libraries can take more of the burden onto themselves, and distribute (non‑copyright‑violating) distillations of texts that provide an onramp for digital analysis within the reach of mere mortals.
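
to make the earlier point about extracted features concrete, here is a minimal sketch in python of what reducing a page to word counts does. it is not the actual htrc extracted features format; the function name extract_features and the sample sentences are hypothetical, chosen only to show that two pages with opposite meanings can yield identical counts once word order is thrown out.

```python
import re
from collections import Counter


def extract_features(page_text: str) -> Counter:
    """Reduce a page to lowercase token counts, discarding word order.

    Hypothetical helper for illustration; real extracted-features
    datasets carry more fields than a bare count table.
    """
    tokens = re.findall(r"[a-z]+", page_text.lower())
    return Counter(tokens)


# Two sample "pages" that say opposite things with the same words.
page_a = "the dog did not bite the man"
page_b = "the man did not bite the dog"

features_a = extract_features(page_a)
features_b = extract_features(page_b)

# The counts are identical, so who bit whom is no longer recoverable.
same_counts = features_a == features_b
```

the design choice here is the trade‑off described above: counts of this kind can be shared without distributing the readable text, at the cost of losing syntax, word order, and the scope of negation.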
visual exploration

one useful and important way to work with this metadata and full text is by exposing it through visualization; this is what projects like the google ngrams viewer and the hathi+bookworm project i've helped work on under an neh grant do. patrons are able to use this combination of full text and catalog metadata to explore the shapes and contours of vast digital libraries. since they know (sort of!) what any given word means, they can use it to understand how vocabulary changes; find anomalous, interesting, or misclassified items; or understand the limits and constraints of an entire collection, a sorely‑needed form of information literacy. we've built the bookworm platform so the advances we're making with hathi can be used on any smaller (or larger) library, and we hope others will be interested in using it to explore their texts in the context of their metadata.

http://bookworm.htrc.illinois.edu/

hathi trust bookworm browser

low‑dimensional embeddings

i'd also like to put on the radar a farther out‑there idea that extrapolates from the current trends in the world of machine learning: the idea of a shared embedding for digital items that would allow machines to compare items across various collections, times, and artifacts. the basic idea of an embedding is to associate a long list of numbers (maybe a few hundred) with a digital object so that items that are similar have similar lists of numbers.
these are sort of the inverse of the checksums that libraries frequently associate with digital artifacts now, which are designed so that even the slightest change makes a file get a completely different number. a good embedding will do the opposite: allow users and software to find similar items. in a single collection like hathi, i've found in practice that even with a simple embedding it's possible to, for instance, look in the neighborhood of a book like "huckleberry finn" and find, in the immediate neighborhood, dozens of titles like "collected works of mark twain, vol. " that lack proper titles that would identify them; and in the extended neighborhood other novels about american boys on riverboats.

inside a collection, this makes it possible to find works with improbable metadata. (it's sadly common for the wrong scan to be associated with metadata, and this can be extremely hard to catch.) across collections, this makes it possible to engage in the work of comparison and duplicate detection.

perhaps the most interesting thing about embeddings of digital files is that they're not restricted to textual features. image embeddings are just as possible as textual embeddings, as in this landscape visualization of artworks that google recently produced.
http://sappingattention.blogspot.com/ / /literary-dopplegangers-and.html
https://babel.hathitrust.org/cgi/pt?id=uc .$b ;view= up;seq=
https://artsexperiments.withgoogle.com/tsnemap/

when google recently released half a million hours of video, they did it not as image stills but as vectorized features read by a neural network.

these features‑‑essentially, a computer's rough summary of an artifact into a few hundred numbers‑‑could make it possible for researchers and students to immediately engage in computational analysis without having to wade through the preparatory steps. if done according to shared standards, they could make collections interoperable in striking ways even when texts or images can't be distributed. it's probably a few years too early to set a specific embedding for different types of documents, but it is time now to contemplate what it would mean to distribute not documents themselves, but a useful digital shadow of them.
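
the contrast between checksums and embeddings drawn above can be sketched in a few lines of python. this is a toy under stated assumptions, not the embedding used in the hathi experiments: toy_embedding is a hypothetical helper that hashes word counts into 64 buckets, whereas a real shared embedding would use a few hundred learned dimensions. the sample texts are invented for illustration.

```python
import hashlib
import math
import re

DIMENSIONS = 64  # assumption for the toy; real embeddings are learned, not hashed


def checksum(text: str) -> str:
    """SHA-256 digest: any change, however small, gives a different value."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def toy_embedding(text: str) -> list:
    """Hash each word into one of DIMENSIONS buckets and normalise to unit length."""
    vector = [0.0] * DIMENSIONS
    for token in re.findall(r"[a-z]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % DIMENSIONS
        vector[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


original = "huck and jim drift down the mississippi river on a raft"
variant = "huck and jim drift down the mississippi river on a raft."  # one extra character
unrelated = "quarterly report on municipal water treatment budgets"

checksums_differ = checksum(original) != checksum(variant)
near = cosine(toy_embedding(original), toy_embedding(variant))
far = cosine(toy_embedding(original), toy_embedding(unrelated))
```

a single added full stop changes the checksum entirely, while the toy embedding of the variant stays next to the original and the unrelated text lands further away. because only the vectors are compared, a library could in principle publish them without publishing the texts, which is the "digital shadow" idea described above.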
https://research.google.com/youtube m/

repurposing discographic metadata and digitized sound recordings as data for analysis
david seubert, university of california santa barbara

use of sound recordings for research has been slow to develop due to bias against sound recordings as historical documents by textual scholars, lack of descriptive data (discography), and lack of access because of restrictive copyright laws that make it difficult to digitize and provide access to collections. the use of digitized sound recordings or the discographic metadata about sound recordings as data to study is underdeveloped. the ucsb library wants to encourage scholarship of this kind using the data from the american discography project.

the american discography project, presently based at the ucsb library with funding from the packard humanities institute, was originally conceived as the encyclopedic discography of victor recordings by two record collectors in the early s. they began a project to document every classical recording by the victor talking machine company, but eventually broadened their goal to include every victor recording session for rpm discs. in they were granted liberal access to the recording files held by rca victor records (now sony music entertainment) and devoted many thousands of hours to compiling lists of the tens of thousands of victor master recording sessions from around the world.
the american discography project and its principal product, the discography of american historical recordings (dahr), is now a research, publication, and digitization program based at the ucsb library with a goal of documenting disc recordings made during the standard groove era ( ‑ s) by american record companies and digitizing as many as possible for online access. much of the data about a recording (who, what, where, when) is not documented on the recordings themselves, and can only be determined by consulting a published discography or primary source documents like company recording ledgers.

now in its fifth decade, the project has expanded beyond victor to incorporate other published discographies and includes data on recordings made by five early  th century record companies (berliner, victor, zonophone, columbia and okeh) with three more large labels (brunswick, decca, edison) and several smaller ones in the pipeline.

the sheer amount of data documented in the online database is significant. dahr currently contains over
. million data points documenting systematically and comprehensively the first years of american recording history including:

● ,  recording sessions
● ,  recording events (takes)
● ,  physical manifestations (discs)
● ,  names of performers, authors, composers
●  languages
●  recording locations

the initial project design was to document these recordings in a systematic fashion for the purposes of identification and cataloging by libraries and archives, collectors, and others: a bibliography of sound recordings. one of the further goals of the project is to encourage use of sound recordings as primary source documents by scholars in fields beyond the study of music and as the project has grown, we have had growing success in this area. systematically adding audio to the database has allowed scholars to study the recordings in context with authoritative data about their creation.

http://adp.library.ucsb.edu/index.php

sound recordings and the metadata associated with them have not been mined and analyzed the way textual archives have. as the discography of american historical recordings grows in size, it is a prime candidate for manipulation and analysis as data, as it contains standardized elements including language, dates, geographic information (recording locations), genres, names, and titles.
since the project was designed from the outset to be structured data, including authority control and standardized vocabularies for many elements, a potential and as yet unrealized reuse of the metadata as data is now possible. as a participant in the national forum, we hope to be able to further conceptualize how this can be best realized.

the library as virtual reality: a worldbuilding approach
laila shereen sakr, university of california santa barbara

the process of considering digital library collections as data points relies on logics similar to those foundational to the development of virtual reality (vr). imagine the library as a vr film or as a computer ‑‑‑ temporally and spatially. if the goal of the “always already computational: library collections as data” project is to find a common framework among librarians, curators, and researchers that makes digitally‑born scholarship possible, i would like to suggest considering speculative design methodologies, or what alex mcdowell has described as worldbuilding.

alex mcdowell, a deeply influential designer, has shifted how we think about design by fundamentally changing the role design plays in the creative process, potentially altering audiences’ expectations of creative work that ranges from architecture to computer games. drawing on the literary metaphor “worldbuilding” to explain his approach to design, mcdowell’s methods represent a cultural shift in his industry’s production process.
speculating about what the world “might” look like in the future is easy. more challenging, though, is realizing that speculative vision through the design process. mcdowell’s work realizing a future‑world inspired by philip k. dick’s novella in the film minority report is emblematic of a transformation in design process that is made possible through the use of computational media. on minority report, mcdowell led his production design team, which began as a largely analog art department, through a transition in which they became the first fully‑digital art department in the film industry, an example that many other design departments would soon follow and that foreshadowed a broader cultural shift in creative process.

most of the film’s audience will probably remember the gestural interface of the d screens used by the agents in the department: speculative designs that, in turn, have influenced actual technologies ranging from apple’s ipad to microsoft’s kinect. however, minority report’s influence in design reached an even wider array of design cultures, including biometrics (particularly retinal scanning), through other imagined technologies woven throughout the film’s environment and plot.

in other words, mcdowell’s world building integrates interdisciplinary humanistic, scientific, and design inquiry with emerging forms of computational media to fundamentally alter the film production process, blurring boundaries between physical and virtual environments and the distinctions between film and other media forms.
in the digitally designed world of minority report, props could be modeled first as two-dimensional images and later as three-dimensional physical objects. then, through computer-controlled milling, those models could be used to create final props by sculpting and mold-making. bringing direction, cinematography, and design together in the virtual space of the pre-visualization stage, props, actors, and the created world interacted throughout the production process. as a result, minority report and mcdowell’s world building process signaled a transformation in design culture that has not yet fully played out.

one approach to worldbuilding builds upon a procedure of information design that moves from archiving, to visualizing, to rationalizing, and then to governing. this process must take into account matters of scale. taking from both information design and game design, worldbuilding relies on several distinct visual perspectives. one involves drawing a complete world map and filling in as much information as possible, then running the game and letting the players explore that world. this visual perspective operates on a large scale. another perspective begins within a specific town, city, place, or room, and as players explore, more and more of the world is revealed. these are some basic guidelines to consider as one conceptualizes building a virtual world of data.
applying this theoretical framework to a process of speculative design for future library collections could yield interesting results. the practice and ideas of worldbuilding, in mcdowell’s definition, are a clear example of interdisciplinary work connecting the arts, design, media-focused computer science, and elements of the humanities and social sciences. worldbuilding is both the creation of media and a design research practice, and in neither case is its interdisciplinarity a luxury, because the work simply must engage multiple disciplines in order to achieve a coherent vision and to push many fields forward.

the struggle for access
tim sherratt, university of canberra

for me, exposing cultural heritage collections to computational methods raises difficult, important, and interesting questions about the nature of ‘access’ itself. so while we can and should develop best-practice guidelines, i think we should also admit that we will never be, should never be, satisfied with what cultural institutions deliver. we will always want something more. and that’s a good thing.

i’ve spent far too much of my life hacking the web interfaces of libraries and archives in the pursuit of useful data. but while i would gladly take the time back, i recognise the value of the struggle. processes such as screen-scraping and normalisation are often frustrating, but they do at least make you think about the processes by which the data was created, managed, and shared.
so for me, one of the key questions is how we expose data to facilitate the use of computational methods while preserving some of the difficulties and irregularities – the chisel marks in the smooth worked surface – that remind us of its history and humanity.

i’m not sure whether this is a metadata question, or a matter of how we frame the relationship between researcher and institution. if we think of machine-actionable data as a product or service delivered by institutions, then researchers are cast as clients or consumers. but if each dataset is not a product, but a problem, then we open up new spaces for collaboration and critique.

i’ve started to realise that i have very little interest in statistics, or even data visualisation as i understand it. i use computational methods to manipulate the contexts of cultural heritage collections. sometimes this results in useful tools or interfaces, sometimes it’s more akin to art. i’m motivated by the simple desire to see things differently – to poke at the boundaries and limits of systems in the hope that something interesting happens.

what seems to happen fairly regularly is that i find where the systems are broken. for example, while harvesting debates from the australian parliament’s online database, i discovered about sitting days were missing. this sort of thing happens with complex systems, and the staff at the parliamentary library have now fixed the problems.
for me, it’s an example of the fact that we can never simply accept what we’re given – search interfaces lie, and datasets have holes. but it also shows that once you open up channels for the transmission of data, information flows both ways.

we can’t talk about the need for institutions to provide computation-ready data without considering what they might get in return. the struggle for access might not always be comfortable, but it can be productive. if data is a problem to be engaged with, rather than a service to be consumed, then we can see how researchers might help institutions to see their own structures differently. on a practical level, how might we make it easier for institutions to re-ingest the features and derivative structures identified through use?

i’m also a bit suspicious of scale. big solutions aren’t always best. large data dumps are great for researchers with adequate computing power and resources, but apis support rapid experimentation and light-weight interventions. similarly, while articulating best practice for computation-ready data we shouldn’t lose sight of other ways data can be exposed. i want hackable websites as well as downloadable csvs – all that basic stuff like persistent urls, semantic html, and maybe a sprinkle of rdfa or json-ld, which enables data to be discovered everywhere, not just in a designated repository.

as i said, we will always want more. access will never be open and the job will never be done.
we need systems, protocols, guidelines, and collaborations that remind us there is always more to do, and offer the support to continue.

implications for the map in a 'collections as data' framework
tim st. onge, library of congress

i am arriving at the challenge of developing computationally amenable digital library collections from the perspective of a digital cartographer and geospatial analyst. my work for the library of congress as a cartographer primarily involves digital map-making and the analysis of born-digital and made-digital geographic information and maps to serve congressional research requests. my academic and professional backgrounds are based in geographic information science (gis) rather than in library science. however, i am often thinking about how the library of congress can best serve our collections to meet the research and access needs of geographers in a digital age.

all of this is to say that my initial thoughts on developing a “library collections as data” framework are largely shaped by the implications for one type of collection material in particular: the map.

there is enormous potential for the computational analysis of historic maps en masse, with methods that are both text-based (e.g. extracting written text to create gazetteers of place names from certain time periods, cultures, languages, etc.) and image-based (e.g.
extracting map features based on groupings of image pixel values of similar color) (chiang, leyk & knoblock ). for the full integration of historic maps into geographic information systems, processes like georeferencing and feature digitization, which have achieved varying levels of automation potential, must be completed. it is my view that georeferenced versions of scanned maps in library collections are highly appreciated among researchers and should be more standard “collections as data” offerings from libraries. the georeferenced map viewer created by the national library of scotland ( ) demonstrates the tremendous value of this type of data offering.

given the unique challenges of offering historic maps as computationally amenable collections, i admire the objective of the always already computational to conceive of a “collections as data” framework that is multimedia in scope and not only concerned with text analysis of written works (as critically important and valuable as this is).

in my reading of the “statement of need” from the always already computational scope of work document, i interpret four major current problems of computationally amenable collections to be (1) the lack of a common collections-transformation framework across institutions, (2) a lack of solutions for non-text media, (3) technical inadequacies in providing collections at large scale, and (4) no data reuse paradigm for collections.
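as a rough illustration of the image-based approach mentioned earlier in this statement (grouping pixels of similar color to pull out map features), the toy sketch below buckets rgb values so that near-identical shades share a key. it is not the method surveyed by chiang, leyk & knoblock; the function names, the bucket size, and the sample pixel values are all assumptions made for illustration, and a real scanned map would need far more robust handling of noise, shading, and overprinted text.

```python
def quantize(pixel, step=64):
    """Snap each RGB channel down to its bucket edge so similar colors share a key."""
    return tuple((channel // step) * step for channel in pixel)


def group_pixels_by_color(image, step=64):
    """Return a dict mapping each quantized color to the (row, col) positions that have it."""
    groups = {}
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            groups.setdefault(quantize(pixel, step), []).append((r, c))
    return groups


# Hypothetical 2x3 "scanned map": two slightly different blues (water) and one beige (land).
WATER_A = (40, 90, 200)
WATER_B = (50, 100, 210)
LAND = (230, 220, 180)

toy_map = [
    [LAND, LAND, WATER_A],
    [LAND, WATER_B, WATER_A],
]

groups = group_pixels_by_color(toy_map)

# Both blues fall into the same bucket, so the water feature comes out as one group.
water = groups[quantize(WATER_A)]
print(water)
```

the coarse bucket size is a deliberate simplification: it trades precision for tolerance of the small color variations that scanning introduces.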
in addressing the first and second problems, i look forward to hearing more on the needs of computational researchers who are working with image-based collections, including, but not exclusively, scanned and digitized maps. in this needs assessment more broadly, in an abstract way, i imagine a hierarchy of use cases and analysis tools. towards the top are elements that are most readily shared among all kinds of library collections (e.g. all collection items have metadata files in a standard format; all text-based, text-extracted items could undergo analyses like frequency visualization or topic modeling). towards the bottom are more medium-specific elements (e.g. only scanned maps are concerned with georeferencing and geographic projections). in laying out the strongest commonalities among researcher needs in working with library collections, perhaps a framework can be developed that addresses the greatest, unifying needs of collection patrons across diverse uses in the digital humanities and other disciplines. furthermore, i hope that this framework highlights the unique and worthy challenges of devising solutions for researchers of non-text media.

the third problem of providing collections on a large scale is certainly a critical concern to computational research.
if access to collection items is limited to one-by-one downloads or deliveries of physical dvds of data, the “data acquisition” phase alone can be sufficiently burdensome to slow or stop computational analyses before they even begin. the challenges of large-scale collection access appear to be technological and, as is often the case for libraries and the digital humanities, budgetary. the methods of access detailed in the always already computational scope of work document demonstrate the wide variability among different institutions. i am interested to hear from project participants on the merits of these methods from their experience and what technical and budgetary considerations should be made in the process of developing best practices on this issue.

on the fourth problem of the data reuse paradigm, i believe this issue involves not only technological hurdles, but policy ones as well. simply put, when researchers or patrons more broadly want to give back to libraries, libraries should trust them. for example, this can take the form of an online crowdsourced georeferencing tool that allows users to georeference scanned maps from a library collection and share them back to the library, which thereby shares that resource universally as a gis-ready raster image (fleet, kowal, & přidal ).
another example would be for libraries to host hackathons and other events that invite researchers to interrogate their collections as data and present on their findings, thereby allowing libraries to learn lessons about the kinds of computational research that can (or cannot) work with their collections. i believe the archives unleashed series, which focuses on web archive research, is a great model for this kind of project (weber ). any frameworks arising from the always already computational should encourage these kinds of “data sandbox” projects that allow for experimentation that reveals new insights into the computational analysis of collections as data and provide derived content and research directly back to libraries.

i look forward to learning from the diverse array of participants and contributing my insights to the always already computational initiative.

works cited

chiang, y., leyk, s., & knoblock, c. a. ( ) a survey of digital map processing techniques. acm computing surveys, ( ), article (april ), pages. retrieved from http://usc-isi-i .github.io/papers/chiang -acm.pdf.

fleet, c., kowal, k. c., & přidal, p. ( ) georeferencer: crowdsourced georeferencing for map library collections. d-lib magazine, ( / ). retrieved from http://www.dlib.org/dlib/november /fleet/ fleet.html.

national library of scotland ( ) view maps overlaid on a modern map / satellite image. retrieved from http://maps.nls.uk/geo/explore/.

weber, m. s. ( ) archives unleashed! collections as data | september , | library of congress. retrieved from http://digitalpreservation.gov/meetings/documents/dcs / _weber_archives% unleashed.pdf.
considering the user
santi thompson, university of houston

as the forum unfolds, i would encourage participants to question and expand our assumptions of those who (re-)use computational library collection data. in my mind, the identities of users and their motivations for coming to the digital library are just as important to understand as the technical requirements needed to re-use data in interoperable and collaborative ways. knowing your users helps cultural heritage professionals, among other things, to better select content for the future, market the resources and collections available to them, and understand how to describe and make content available to others.[ ]

i was pleased to see that the proposal for always already computational acknowledges the user to some degree, noting that current digital library infrastructure and digital collection paradigms do "not meet the needs of the researcher, the student, the journalist, and others who would like to leverage computational methods and tools to treat digital library collections as data."
as such, part of our forum objectives will be to draft potential user stories and “to apply [data definitions and concepts] to a range of potential user communities.” i find this to be incredibly important because libraries (and most likely other cultural heritage organization types) have not spent a vast amount of time asking and publishing on “who is a digital library user.”

my own research has focused in some narrow ways on better understanding digital library users. my collaboration with other members of the dlf assessment interest group’s user studies working group has found that the assessment of digital library reuse is complicated for a whole host of reasons, including the profession’s inability to systematically identify and understand digital library users.[ ] additional research i have done with a co-author suggests that digital library users (note: not users of computational data) are more frequently (1) from outside of academia and (2) reusing digital library content for a wide array of non-scholarly pursuits.[ ]

i find always already computational to be an exciting opportunity to address major gaps in our current understanding of what a digital library collection is and how it is being used by targeted audiences. while i recognize that demystifying the digital library user is not the primary pursuit of this national forum, i look forward to discussing this as well as other important aspects of the grant with a deeply knowledgeable and inspiring group of participants.
i appreciate the opportunity to contribute to such a discussion.

works cited

[ ] for more on how understanding users and reuses can inform digital library management, see my work with michele reilly: “understanding ultimate use data and its implication for digital library management: a case study,” the journal of web librarianship ( ) ( ): - . doi: http://dx.doi.org/ . / . . .

[ ] in the user studies working group drafted a white paper, “surveying the landscape: use and usability assessment of digital libraries,” that explored the state of research around three assessment topics: user/usability studies, return on investment, and content reuse. a copy can be found here: https://osf.io/uc b / .

[ ] see reilly and thompson, “understanding ultimate use,” and michele reilly and santi thompson, “reverse image lookup: assessing digital library users and reuses,” the journal of web librarianship ( ): - . doi: http://dx.doi.org/ . / . . .

building institutional and national capacity for collections as data
kate zwaard, library of congress

about a year ago, the library of congress created a new division, national digital initiatives, which i am proud to lead. our mission is to maximize the benefit of the digital collection, to incubate innovation, and to encourage national capacity for digital cultural memory.
in a recent new yorker article, the librarian of congress said she wants the library of congress “to get to the point where there’ll still be a specialness, but i don’t want it to be an exclusiveness. it should feel very special because it is very special. but it should be very familiar.” [ ] we in ndi take that message to heart. we believe that an important step in getting users to engage with the library’s digital material and staff is to provoke, explore, tell stories, and invite.

our vision is for ndi to help libraries and patrons explore the edges of possibility. to try things ourselves and share with the profession. to help highlight the treasures we have, here at the library of congress and in our nation’s cultural heritage institutions, and spark people’s imagination around the potential uses of digitized or born digital collection objects. to encourage the curious and help them get answers. to help people understand what a library is.

upon our founding, the director of national and international outreach said “it’s not enough anymore to just open the doors of this building and invite people in. we have to open the knowledge itself for people explore and use.” [ ]

a few things we’ve been working on:
● we organized “collections as data,” [ ] a conference devoted to exploring what’s possible using computation with digital collections.
● we hosted an archives unleashed hackathon, bringing together programmers, librarians, and scholars looking at computational analysis of web archives collections [ ]
● we performed a digital lab proof of concept along with a report exploring how to deliver library of congress digital collections as data to on-site researchers [ ]
● we hosted a software carpentry workshop [ ] to help teach library of congress librarians and others in the neighborhood how to use code to manage and analyze digital collections.
● we’ve started a series of sample code notebooks to help people work with library of congress data [ ]

my background is in software development. before this job, i ran the repository development group [ ] at the library of congress and before that i worked on creating digital preservation software solutions for the government publishing office. my perspective is on the very practical. institutions have spent a lot of time, effort, and money on digitizing collections and establishing policies and infrastructures around a model of access that mimics analog models. transforming the technology, staff, and practice to accommodate data analysis is a second paradigm shift that will be just as difficult. for many knowledge institutions, funding is decreasing and becoming less secure while the volume and complexity of digital information is multiplying and the commitment to analog collections remains. in my view, the only way forward is together:
● leverage connections with physical sciences, social sciences, and journalism. work together on tooling and training.
● highlight digital scholarship projects with easy to understand outcomes to make the case beyond academia.
● support distributed fellowship models (ndsr) for building digital stewardship curation skills and building skills for doing digital research.
● create train-the-trainer programs to help scholars understand what’s possible using computation
● get content, methodologies, and tools to k- educational audiences.
● explore legal, cultural and privacy review models to guide researchers using novel digital content, like a light-weight irb.
● provide space and time for experimentation.

the library of congress “preserves and provides access to a rich, diverse and enduring source of knowledge to inform, inspire and engage you in your intellectual and creative endeavors.” [ ] we are thrilled to be a part of this exciting conversation, and look forward to working together.

works cited

[ ] “the librarian of congress and the greatness of humility” by sarah larson. the new yorker. february , http://www.newyorker.com/culture/sarah-larson/the-librarian-of-congress-and-the-greatness-of-humility
[ ] “data and humanism shape library of congress conference” by mike ashenfelder. the signal. october , http://blogs.loc.gov/thesignal/ / /data-and-humanism-shape-library-of-congress-conference/
[ ] “collections as data report summary” by jaime mears. the signal. february , http://blogs.loc.gov/thesignal/ / /read-collections-as-data-report-summary/
[ ] “co-hosting a datathon at the library of congress” by jaime mears. the signal. july , http://blogs.loc.gov/thesignal/ / /co-hosting-a-datathon-at-the-library-of-congress/?loclr=blogsig
[ ] “library of congress lab: library of congress digital scholars lab pilot project report” by michelle gallinger and daniel chudnov.
december , http://digitalpreservation.gov/meetings/dcs /dchudnov-mgallinger_lclabreport.pdf
[ ] software carpentry at the library of congress https://oulib-swc.github.io/ - - -loc/
[ ] data-exploration github page https://github.com/libraryofcongress/data-exploration
[ ] “yes, the library of congress develops lots of software tools” by leslie johnston. august , https://blogs.loc.gov/thesignal/ / /yes-the-library-of-congress-develops-lots-of-software-tools/
[ ] “about the library” https://www.loc.gov/about/

appendix : forum summaries

forum : march - , | santa barbara, california

the first forum was a gathering of key stakeholders, practitioners, thought leaders, and scholars currently working with collections as data. each participant was asked to prepare a position statement in advance of the forum to help frame the discussion. forum sessions were a mixture of group discussions, presentations, and small group work using human centered design techniques. activities were designed to document current practice, surface problems, and generate new ideas and approaches for collections as data work. although crafting a joint framework and strategic direction for collections as data was an initial goal of the forum, this ultimately proved not to be achievable because of the multiplicity of techniques, approaches, and user needs for collections as data. instead, forum participants crafted the santa barbara statement, which represented a consolidation of the major themes of the forum. these included the complexity of the collections as data landscape, particularly the wide range of consumers and use cases; questions of scalability; open access solutions; ethical concerns; and partnerships.
agenda

march
: breakfast
: welcome & introductions
: project scope overview (thomas padilla)
: project outcomes-focused group discussion
: break
: collections as data panel: existing implementations
miriam posner (ucla), harriett green (uiuc), tim sherratt (university of canberra), mia ridge (british library), jefferson bailey (internet archive), gabrielle foreman (university of delaware)
: idea generation
discussion: you each came with a set of collections as data related ideas, expressed in part through your position statements. you are in a group of people with a range of experiences. during this time we would like you to work to align your experiences to generate ideas that hold the potential to push collections as data work forward. we ask that you focus your discussion on enumerating as many ideas as possible. we do not expect you to create detailed roadmaps whereby these ideas might be pursued. this conversation is purely geared toward getting as many of our ideas to the surface as possible.
: working lunch
: sharing
: break
: play it out
discussion: how might some of the ideas you generated be implemented?
: break
: sharing
: reflection
discussion: this afternoon you spent time reflecting on your collective position statements and discussing ideas that push collections as data work forward. we now ask you to spend a few moments in your focused group to critique all of the ideas that were generated. what do you think are particularly good or useful ideas? what might be easy to implement? what problems and pitfalls exist?
: set stage for day

march
: breakfast
: gather data
activity: in this exercise, you will rely on each other as a sort of “focus group” to gather a set of data about how you engage with collections as data.
please answer the following questions as a  group, recording your answers in this document.  you may choose which set of questions are  most relevant to your perspective in creating/manipulating/consuming collections as data.  some  groups may choose to answer from multiple perspectives.  we will build upon this data the  remainder of the day.      :   break  : story generation  activity:  using the data gathered earlier this morning from all of the groups and your own  personal experiences, write  ‑  user stories.     : lunch  :   story review and critique  activity:  examine and refine the use cases generated by another group.      : prototyping  activity:  using the best or most interesting ideas from the story generation activity, design a  product, system, service or curriculum, etc., that meets the needs of one or more people you  choose from the stories.  you will be presenting your prototype idea using a  concept poster .  be  sure to consider the effect of your solution on other stakeholders to demonstrate viability and  impact.      
break
share prototypes
discussion: implications for libraries
review of day

march

breakfast
discussion: absences
statement creation
break
engagement
closing remarks

attendees

john ajao
  director, systems and repository operations; university of california santa barbara
matthew miller
  head of semantic applications & data research; new york public library
jefferson bailey (http://www.jeffersonbailey.com/)
  head of web archiving programs; internet archive
anna neatrour
  metadata librarian; university of utah
alex chassanoff
  software curation postdoctoral fellow; massachusetts institute of technology
miriam posner
  digital humanities coordinator; university of california los angeles
tanya clement
  assistant professor of information; university of texas austin
sheila rabun
  community and communications officer; stanford university
p. gabrielle foreman
  professor of english and black american studies; university of delaware
mia ridge
  digital curator; british library
daniel fowler
  developer advocate; open knowledge foundation
laila sakr (http://www.filmandmedia.ucsb.edu/people/faculty/shereensakr/shereensakr.html)
  assistant professor of film and media studies; university of california santa barbara
harriett green
  english and digital humanities librarian; university of illinois at urbana champaign
ben schmidt
  assistant professor of history; northeastern university
jennifer guiliano (http://jguiliano.com/)
  assistant professor of history; indiana university‑purdue university indianapolis
david seubert
  curator of performing arts collections; university of california santa barbara
julie hardesty
  metadata analyst; indiana university
tim sherratt
  associate professor of digital heritage; university of canberra
christina harlow
  metadata librarian; cornell university
hannah scates kettler
  digital humanities librarian; university of iowa
greg jansen
  research software architect; university of maryland
timothy st. onge
  cartographer; library of congress
lisa johnston
  research data management/curation lead & co‑director university digital conservancy; university of minnesota
santi thompson
  head of digital research services; university of houston
matthew lincoln
  data research specialist; getty research institute
kate zwaard
  head of national digital initiatives; library of congress
alan liu
  distinguished professor of english; university of california santa barbara

forum : may ‑ , | las vegas, nevada

after spending a year at conferences, workshops, and seminars talking about what collections as data is, we held a second national forum focused on the nuts and bolts of collections as data work, particularly how communities interested in getting started with collections as data work could move forward. the first day of the forum focused on current implementations and how a variety of consumers, from librarians to scholars to the general public, interacted with collections as data resources. this section of the forum was livestreamed and received over live and subsequent views. as in the first forum, the variety of these collections as data implementations once again demonstrated that the collections as data landscape is complex and no one set of solutions will be feasible or even appropriate for everyone. forum participants then focused on reality checks of always already computational deliverables based on their own experiences with collections as data.
agenda

monday, may

breakfast
dean welcome & introductions
project update (thomas padilla)
panel : who is collections as data for?
  building on principle , the forthcoming version of the cad santa barbara statement (https://collectionsasdata.github.io/statement/) will assert that "collections as data designed for everyone serve no one." how has your work with cad been forged around specific people, whether those represented in the collections, built into the design of the dataset, or reflected in your own teaching and/or learning? what work have you done to match cad with populations?
  dot porter (upenn), shawn averkamp (nypl), bergis jules (uc riverside)
break
panel : what is the coolest thing about your collections as data work?
  tell us why you became involved with this work and what motivates your continued dedication or interest. we'd like to show our attendees the spirit and possibilities of collections as data work.
  micki kaufman (cuny), inna kouper (indiana), greg cram (nypl), laurie allen (upenn)
break
panel : how have you implemented collections as data?
  viewers of our livestream are likely interested in how they might participate in or grow collections as data. how have you started, shifted, or institutionalized collections as data? how do you see this work aligning with your institutional/organizational mission? what surprised you about the process, and what do you plan or hope to do next?
  meghan ferriter (loc), mary elings (uc berkeley), helen bailey (mit), veronica ikeshoji‑orlati (vanderbilt)
lunch
introducing the guide
reality check on project deliverables: group‑based discussion and activities (all)
break for dinner (on your own)

tuesday, may

breakfast
future directions: moving stuff forward, group‑based discussion (all)
wrap up (thomas padilla)
end (box lunch provided)

attendees

elvia arroyo ramirez
  assistant university archivist; university of california irvine
micki kaufman
  city university of new york
shawn averkamp
  manager of metadata services; new york public library
inna kouper
  assistant scientist, school of informatics, computer, and engineering; assistant director, data to insight center; indiana university
helen bailey
  engagement data engineer; massachusetts institute of technology
mark matienzo
  collaboration & interoperability architect; stanford university
alex chassanoff
  clir/dlf postdoctoral fellow in software curation; massachusetts institute of technology
jake orlowitz
  head of the wikipedia library; wikimedia foundation
kalani craig
  clinical assistant professor, department of history; co‑director, institute for digital arts & humanities; indiana university
sarah patterson
  lecturer, department of english, university of massachusetts amherst; co‑founder, colored conventions
greg cram
  associate director of copyright and information policy; new york public library
dot porter
  curator of digital research services; university of pennsylvania
mary elings
  head of technical services; the bancroft library, university of california berkeley
chaitra powell
  african american collections and outreach archivist; university of north carolina chapel hill
meghan ferriter
  senior innovation specialist; library of congress
chela scott weber
  director of library and collections; california historical society
devin higgins
  digital library programmer; michigan state university
hannah scates kettler
  digital humanities research and instruction librarian; university of iowa
veronica ikeshoji‑orlati
  clir postdoctoral fellow; vanderbilt university
laura wrubel
  software development librarian; george washington university
bergis jules
  university and political papers archivist; university of california riverside

appendix : conference engagements, ‑

we used conference engagements to expand the conversation beyond the two national forums. with limited funds, we chose to host mini‑forums with user groups not represented at the national forums. these sessions gathered further use cases, invited critique of our assumptions, and re‑emphasized the diversity of experience, capacity, and needs.

ldcx (march ‑ , stanford, california)
● thomas padilla, hannah frost, “supporting end user computation / use of collections” ( hour unconference session)

csvconf (may ‑ , portland, oregon)
● laurie allen (keynote)

texas conference on digital libraries (may ‑ , austin, texas)
● sarah potvin, “almost already computational: an update from the library collections as data effort” (poster)

association of college and research libraries digital humanities interest group webinar (june , online)
● thomas padilla, “what does it mean: library collections as data” ( speakers, minute panel)

american library association (june , chicago, illinois)
● laurie allen, “new kinds of collections: new kinds of collaborations,” on panel for “creating the future of digital scholarship together: collaboration from within your library” ( projects, minute panel)

society of american archivists (july ‑ , portland, oregon)
● alexandra chassanoff, thomas padilla, and elizabeth russey roke, “open forum ‑ always already computational: collections as data” ( minutes)

digital humanities (august , montreal, quebec)
● sarah potvin, thomas padilla, laurie allen, stewart varner, “shaping humanities data” (full‑day preconference symposium)

dlf eresearch network (august )
● thomas padilla, “collections as data” ( minutes)

digital library federation (october ‑ , pittsburgh, pennsylvania)
● thomas padilla, “collections as data: an update” ( minutes)
● thomas padilla, laurie allen, stewart varner, elizabeth russey roke, hannah frost, sarah potvin, “collections as data workshop” ( hours)

samvera connect (november , salt lake city, utah)
● hannah frost, “collections as data and samvera” ( minutes)

coalition for networked information (december , washington, dc)
● thomas padilla, laurie allen, hannah frost, “always already computational: collections as data” ( hour)

american historical association (january , washington, dc)
● laurie allen, stewart varner, “collections as data,” in workshop on “getting started in digital history ” ( hour workshop)

national institute for computer‑assisted reporting (march , chicago, illinois)
● thomas padilla and laurie allen, “cultural heritage data? computational use, needs, and opportunities” ( minutes)

digital public library of america annual members meeting (march ‑ , atlanta, georgia)
● elizabeth russey roke, “dpla as data: collections as data in practice” ( minute workshop)

ldcx (march ‑ , stanford, california)
● hannah frost and kate lynch, “collections as data” ( minutes)

los angeles arts datathon (april , los angeles, california)
● thomas padilla, “collections as data x arts as data” (keynote)

dh + libraries, sidney harman center for polymathic studies, university of southern california (april , los angeles, california)
● thomas padilla, “on a collections as data imperative” ( minutes)

society of american archivists (august ‑ , washington, dc)
● elizabeth russey roke, "collections as data," electronic records section meeting ( minute discussion)

open repositories (june ‑ , bozeman, montana)
● hannah frost and sarah potvin (moderators), mark jordan, katherine lynch, helen bailey, “enabling computational access at scale: are repositories serving collections‑as‑data?” ( minutes)

hilt (june ‑ , philadelphia, pennsylvania)
● thomas padilla and mia ridge, “collections as data” (week long workshop)

dariah beyond europe workshop at library of congress (october ‑ , washington dc)
● laurie allen, stewart varner, “collections as data: digital collections for emerging research methods.” (keynote + workshop hours)

digital library federation (october ‑ , las vegas, nevada)
● thomas padilla, stewart varner, hannah frost, elizabeth russey roke, sarah potvin, “always already computational, never quite automatic: towards a collections as data framework” ( minutes)
● sarah potvin, thomas padilla, santi thompson, liz woolcott, amanda rust, giordana mecagni, “what would the ‘community’ think? three grant‑funded teams reflect on defining community and models of engagement” ( minutes)

appendix : digital humanities preconference: shaping humanities data

description

how can cultural heritage institutions develop and provide access to collections that are more readily amenable to computational use? how does a movement toward thinking about collections as data prompt an opportunity to reframe, enrich, and/or contextualize collections in a manner that expands use while avoiding replication of bias inherent in collection practice? the collections as data project presents shaping humanities data (https://collectionsasdata.github.io/shaping/) as a venue to explore these questions at digital humanities .

shaping humanities data features eleven talks and five demonstrations. talks and demonstrations were solicited through a cfp (https://collectionsasdata.github.io/dh /) and reviewed by an international program committee. the event also includes opportunities for discussion and workshopping collections as data frameworks (https://collectionsasdata.github.io/statement/). the workshop will inform the development of recommendations that aim to support cultural heritage collections as data efforts.

schedule

august ,

● introductions, schedule, project update

● reusable computational processing of large‑scale digital humanities collections (marciano and jansen)
● marcing the boundary: reusing special collections records through the early novels database (kashyap and van tine)
● leveraging core data for the cultural heritage of the medieval middle east (schwartz)

● lessons learned through the smelly london project (leem)
● historical public health data curation: indiana state board of health monthly bulletin project (pollock and coates)
● javanese theatre as data (varela)
● high performance computing for photogrammetry made easy (dombrowski, gniady, simpson, meredith‑lobay)

● using iiif to answer the data needs of digital humanists (di cresce)
● demonstrating a multidisciplinary collections api (almas and baumgardt)

● collections as data workshopping

● umbra search as data: a digital sandbox to cross the digital divide (marcus)
● audio analysis for spoken text collections (clement and mclaughlin)

● facilitating global historical research on the semantic web: medea (tomasek and vogeler)
● mending the vendor: correction and exploratory augmentation of collections as data (locke)
● learning through use: a case study on setting up a research fellowship to learn more about how one of our collections works as computationally amenable data (severson and vejvoda)
● addressing copyright and ip concerns when using text collections as data (senseney, dickson, and tracy)
● libraries as publishers of a new bibliographical unit (claeyssens)

● wrap‑up

program committee members

harriett green, university of illinois at urbana champaign
inna kizhner, siberian federal university
alberto martinez, colegio de méxico
ian milligan, university of waterloo
gimena del rio riande, consejo nacional de investigaciones científicas y técnicas (conicet)‑university of buenos aires
laurent romary, inria and dariah
henriette roued‑cunliffe, university of copenhagen
melissa terras, university college london

presentation abstracts

demonstrating a multidisciplinary collections api
bridget almas and frederik baumgardt, tufts university; tobias weigel, dkrz; thomas zastrow, mpcdf

the collections working group of the research data alliance (rda) is a multidisciplinary effort to develop a cross‑community approach to building, maintaining and sharing machine‑actionable collections of data objects. we have developed an abstract data model for collections and an api that can be implemented by existing collection solutions. our goal is to facilitate cross‑collection interoperability and the development of common tools and services for sharing and expanding data collections within and across disciplines, and within and across repository boundaries. the rda collections api supports create/read/update/delete/list (crud/l) operations. it also supports set‑based operations for collections, such as finding matches on like items, finding the intersection and union of two collections, and flattening recursive collections. individual api implementations can declare, via a standard set of capabilities, the operations available for their collections.
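
the set‑based operations named above can be illustrated with a small in‑memory sketch. this is not the rda collections api itself: the `Collection` class, the function names, and the item identifiers are all hypothetical, and the sketch assumes collections are not nested cyclically.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Collection:
    """Hypothetical collection: members are item ids or nested collections."""
    name: str
    members: List[Union[str, "Collection"]] = field(default_factory=list)


def flatten(collection: Collection) -> Set[str]:
    """Return every item id reachable from a collection, descending into
    nested collections (assumes no cyclic nesting)."""
    items: Set[str] = set()
    for member in collection.members:
        if isinstance(member, Collection):
            items |= flatten(member)
        else:
            items.add(member)
    return items


def union(a: Collection, b: Collection) -> Set[str]:
    """Items that appear in either collection."""
    return flatten(a) | flatten(b)


def intersection(a: Collection, b: Collection) -> Set[str]:
    """Items that appear in both collections."""
    return flatten(a) & flatten(b)


# Illustrative data only; the identifiers are made up.
greek = Collection("greek-annotations", ["urn:item:1", "urn:item:2"])
latin = Collection("latin-annotations", ["urn:item:2", "urn:item:3"])
all_annotations = Collection("all-annotations", [greek, latin, "urn:item:4"])

shared = intersection(greek, latin)
everything = flatten(all_annotations)
```

representing the results as sets of identifiers keeps the union and intersection trivial once a recursive collection has been flattened; a real implementation would also have to decide what counts as a "like item" across repositories.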
the perseids project at tufts university is implementing this api for its collection of annotations on ancient texts. we will review the model and the functionality of the api and demonstrate how we have applied it to manage perseids humanities data. we will also provide examples of how it is being applied for collections of data in other disciplines, including climate computing and geoscience. finally, we will solicit feedback from the participants in the workshop on the api and model and its applicability for other collections of cultural heritage data. (rda collections working group: https://www.rd-alliance.org/groups/pid-collections-wg.html)

libraries as publishers of a new bibliographical unit
steven claeyssens, koninklijke bibliotheek

large‑scale digitisation of historical paper publications is turning libraries into publishers of data collections for machines and algorithms to read. therefore the library should critically (re)consider 1) its new function as a publisher of 2) a new type of bibliographical content in 3) an exclusively digital environment. what does it mean to be both library and publisher? what is the effect of remediating our textual and audiovisual heritage, not as traditional bibliographic publications, but as data and datasets? how can we best serve our patrons, new and old, machines and humans?

in my talk i want to address these questions drawing on my background as a book historian specialized in publishing studies, and on my experience as the curator of digital collections at the national library of the netherlands (kb) responsible for providing researchers with access (data services) to the large collections of data the kb is creating.

at the kb we found there is no one‑way solution to cater to the needs of digital humanists. i will reflect upon their requirements by analysing the requests for data by digital humanists the kb received during the year . what kind of data were they looking for? why did they need the data?

i will identify both valuable and incompatible user requirements, indicating the conflicting expectations and interests of different disciplines and researchers. therefore i argue that 1) a close collaboration between scholars and librarians is essential if we really want to advance the use of large digital libraries in the field of digital humanities, and 2) we need to carefully reconsider our role(s) as a library.

audio analysis for spoken text collections
tanya clement and steve mclaughlin, the university of texas at austin

at this time, even though we have digitized hundreds of thousands of hours of culturally significant audio artifacts and have developed increasingly sophisticated systems for computational analysis of sound, there is very little provision for audio analysis. there is little provision for scholars interested in spoken texts such as speeches, stories, and poetry to use or to even begin to understand how to use high performance technologies for analyzing sound. toward these ends, we have developed a beginner’s audio analysis workshop as part of the hipstas (high performance sound technologies for access and scholarship) project. we introduce participants to essential issues that dh scholars, who are often more familiar with working with text, must face in understanding the nature of audio texts such as poetry readings, oral histories, speeches, and radio programs. first, we discuss the kinds of research questions that humanities scholars may want to explore using features extracted from audio collections: laughter, silence, applause, emotions, technical artifacts, or examples of individual speakers, languages, and dialects, as well as patterns of tempo and rhythm, pitch, timbre, and dynamic range. we will also introduce participants to techniques in advanced computational analysis such as annotation, classification, and visualization, using tools such as sonic visualiser, arlo, and pyaudioanalysis. we will then walk through a sample workflow for audio machine learning. this workflow includes developing a tractable machine‑learning problem, creating and labeling audio segments, running machine learning queries, and validating results.
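
the four workflow steps above can be sketched in a few lines. this is a toy illustration under stated assumptions, not the hipstas workflow or the tools named in the abstract: synthetic sine waves stand in for recorded audio, the two features (rms energy and zero‑crossing rate) and the nearest‑centroid classifier are chosen only for simplicity, and every name in the code is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Features = Tuple[float, float]


def tone(freq_hz: float, amplitude: float, n: int = 800, rate: int = 8000) -> List[float]:
    """Synthetic stand-in for a recorded audio segment (a pure sine wave)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def extract_features(segment: List[float]) -> Features:
    """RMS energy (loudness) and zero-crossing rate (a rough pitch cue)."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    return rms, crossings / (len(segment) - 1)


def train(labeled: List[Tuple[List[float], str]]) -> Dict[str, Features]:
    """Nearest-centroid 'model': the mean feature vector for each label."""
    grouped: Dict[str, List[Features]] = {}
    for segment, label in labeled:
        grouped.setdefault(label, []).append(extract_features(segment))
    return {
        label: (sum(f[0] for f in feats) / len(feats),
                sum(f[1] for f in feats) / len(feats))
        for label, feats in grouped.items()
    }


def predict(centroids: Dict[str, Features], segment: List[float]) -> str:
    """Assign the label whose centroid is closest in feature space."""
    feats = extract_features(segment)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


# 1. Tractable problem: separate near-silent segments from audible ones.
# 2. Create and label segments (labels here are assigned by hand).
training = [
    (tone(100, 0.005), "silence"), (tone(300, 0.01), "silence"),
    (tone(200, 0.5), "sound"), (tone(400, 0.8), "sound"),
]
centroids = train(training)

# 3. Run the query on held-out segments; 4. validate against known labels.
held_out = [(tone(250, 0.002), "silence"), (tone(150, 0.6), "sound")]
correct = sum(predict(centroids, seg) == label for seg, label in held_out)
accuracy = correct / len(held_out)
```

keeping labeled training segments separate from held‑out segments is what makes the validation step meaningful; with real recordings the features would come from an audio library rather than hand‑rolled functions.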
as a result of the workshop, participants will be able to develop potential use cases for which they might use advanced technologies to augment their research on sound, and, in the process, they will also be introduced to the possibilities of sharing workflows for enabling such scholarship with archival sound recordings at their home institutions.

using iiif to answer the data needs of digital humanists
rachel di cresce, university of toronto

how can we provide researchers and instructors with seamless access to dispersed collections, controlled by their formats, frameworks and software, across cultural heritage organizations? how can we allow free movement of this data so it can be analyzed, measured and presented through different lenses? and how can we support this research without placing too high a technical burden on those institutions, especially those with limited resources? these questions have been at the centre of the university of toronto’s mellon‑funded project, digital tools for manuscript study, which aims at integrating the international image interoperability framework (iiif), based on linked data principles, with existing tools to improve the researcher’s experience. essentially, the project shifts focus away from the tool that makes use of the data onto the data itself as a research and teaching tool.

at the core of the project is working with humanists to understand how they conduct their research and what they need in order to do digital scholarship effectively.
we identified, for example, strong needs for data portability, repository interoperability, and tool modularity in scholarly work. we make use of the iiif data standard to support data portability, the mirador image viewer for its suite of tools for image presentation and analysis, and omeka for its wide adoption among digital humanities scholars and cultural heritage organizations. in addition, we have developed a standalone tool set called iiif to go. this is a user‑friendly iiif start‑up kit, designed to support both research and pedagogical uses. this talk will discuss our attempt to democratize an international standard by (1) embedding it in tools with wide traction and low entry barriers in the digital humanities and manuscript studies community, (2) limiting the technical load required to make use of the standard and tools for instruction and research, and (3) looking toward linked data at glam institutions.

high performance computing for photogrammetry made easy
quinn dombrowski, university of california berkeley; tassie gniady, indiana university; john simpson, compute canada; megan meredith‑lobay, university of british columbia

photogrammetry (generating 3d models from a series of partially‑overlapping 2d images) is quickly gaining favor as an efficient way to develop models of everything from small artifacts that fit in a light box to large archaeological sites, using drone photography. stitching photographs together, generating point clouds, and generating the dense mesh that underlies a final model are all computationally‑intensive processes that can take from tens of hours for a small object to weeks for a landscape when run on a high‑powered desktop. using a high‑performance compute cluster can reduce the computation time to about ten hours for human‑sized statues and twenty‑four hours for small landscapes.

one disadvantage of doing photogrammetry on an hpc cluster is that it requires use of the command line and photoscan’s python api. since it is not reasonable to expect that all, or even most, scholars who would benefit from photogrammetry are proficient with python, uc berkeley has developed a jupyter notebook that walks through the steps of the photogrammetry process, with opportunities for users to configure the settings along the way. jupyter notebooks embed documentation along with code, and can serve both as a resource tool for researchers who are learning python, and as a stand‑alone utility for those who want to simply run the code, rather than write it. this offloads the processing to the hpc cluster, allowing users to continue to work on a computer that might normally be tied up by the processing demands of photogrammetry.
marcing the boundary: reusing special collections records through the early novels database  nabil kashyap, swarthmore college, and lindsay van tine, university of pennsylvania    in this presentation, early novels database project (end) collaborators nabil kashyap and lindsay van                            tine will offer perspectives on the possibilities and perils of reframing the special collections catalog as a                                  collaborative datastore for humanities research. among other activities, the end project includes                        curating records from regional special collections, developing standards for enhancing catalog records                        with copy‑specific descriptive bibliography, and publishing open access datasets plus documentation.                      work on end therefore excavates basic questions around what thinking through library holdings as data                              might actually entail. what ultimately constitutes “the data”? what do they do? for whom? starting                              from leigh star’s notion of the boundary object, this presentation explores the theory and praxis of                                marc as a structure of knowledge that can allow “coordination without consensus.”    the marc records at the core of the end dataset, the result of meticulous work on the part of                                      institutional catalogers, serve as “boundary objects”–that is, they serve as a flexible technology that                            both adapts to and coordinates a range of contexts. these contexts, in turn, can have very different                                  needs and values, from veteran catalogers to undergraduate interns, special collections to open source                            repositories, and from projected to actual uptake and reuse of the data in classrooms and research.    these shifting contexts call into question just what the “data” is. 
it will look different to a cataloger, an outside funding organization, a sophomore, a programmer, or an th c. scholar. what might appear straightforward (creating derivatives, for example) instead reveals a host of issues. transforming nested into tabular data brings to light frictions between disparate assumptions as to the unit of study, whether a work, a volume, or a particular copy. privileging certain fields either effaces the specificity of transcription or sacrifices discoverability. there is no transparent “data dump”; instead, every act of transformation reinscribes a set of disciplinary and institutional values. viewing collections as data is as much about opening up data as about actively demonstrating, and to an extent prescribing, research possibilities.

lessons learned through the smelly london project
deborah leem, wellcome trust and university college london

i propose to present the intended aims of the smelly london project; what we achieved; challenges we experienced working with digitised collections; and possible directions for further development. in order to increase the impact and value that cultural heritage digital collections can offer, we believe that their online collections and platforms should be more amenable to emerging technologies and facilitate a new kind of research.

wellcome library, part of wellcome, is one of the world’s major resources for the study of health and histories.
over the past few years wellcome have been developing a world‑class digital library by digitising a substantial proportion of their holdings. as part of this effort, approximately , medical officer of health (moh) reports for london spanning from ‑ were digitised in . since september wellcome have been digitising , more reports covering the rest of the united kingdom (uk) as part of the uk medical heritage library (ukmhl) project in partnership with jisc and the internet archive. however, no digital techniques have yet been applied successfully to add value to this very rich resource.

as part of the smelly london project, the ocr‑ed text of the moh london reports has been text‑mined. through text mining we produced a geo‑referenced dataset of smell types, which can be visualised to explore the data. at the end of the smelly london project the moh smell data will also be available via other platforms, and this will allow the public and other researchers to compare smells in london from the th century to present day. this has the further potential benefit of engaging with the public. however, cultural heritage organisations do not offer platforms that can help researchers share or communicate the data derived from digital collection use.

mending the vendor: correction and exploratory augmentation of collections as data
brandon locke, michigan state university

like many university libraries, michigan state received external hard drives filled with collections they held perpetual licenses to.
as at many university libraries, those collections have remained mostly unused since they were acquired. the data required processing to make them usable, but without demand for specific data from scholars, there was little benefit or reason to make all of the data available.

in an effort to pilot a project to make this data more available and to promote use of the datasets, brandon locke (director of leadr), devin higgins (library programmer), and megan kudzia (digital scholarship technology librarian) embarked on a project to make the papers of fannie lou hamer available for download. hamer’s papers were chosen based on her historical stature and interest to faculty and graduate students in the department of history, and upon the relatively small size of the collection.

https://ukmhl.historicaltexts.jisc.ac.uk/home https://collectionsasdata.github.io/shapingdata_dh _abstracts/www.londonsmells.co.uk

the original scope of the project was for higgins and kudzia to make the plain text files available for download by any msu student, faculty member, or staff member. leadr staff would then experiment with different text and data mining tools to add metadata and create subsets and auxiliary datasets to accompany the collection.

after higgins and kudzia made the plain text files available to the campus community, the leadr staff immediately encountered troubles with named entity recognition.
upon inspection, the ocr on the files was far too flawed for any accurate text mining, and the entire collection had to be redone using the provided page images with close training and manual correction.

this talk will detail some of the shortcomings in the supplied data, discuss opportunities for experimental text and data mining to enhance and augment existing collections datasets, and consider opportunities for collaboration between institutions in improving data quality.

reusable computational processing of large‑scale digital humanities collections
richard marciano and greg jansen, university of maryland

the digital curation innovation center (dcic) at the u. maryland ischool officially launched the “dras‑tic” archiving platform at ipres , in oct. . the name stands for digital repository at scale that invites computation [to improve collections], and the platform is rolled out under a community‑based open source license. the goal is to build out an open source platform into a horizontally scalable archives framework serving the national library, archives, and scientific management communities, as a potential scalable and computational platform for big data management in large organizations in the cultural heritage, business, and scientific research communities.

this digital repository framework can scale to over a billion records and has tools for advanced metadata extraction (including from images), file format conversion, and search within the records and across collections.
the underlying software is based on the distributed nosql database apache cassandra, created to meet the scaling needs of companies like facebook. dras‑tic supports integration by providing a standard restful cloud data management interface (cdmi), a command‑line interface, a web interface, and messaging as contents are changed (mqtt). we are now exploring connecting dras‑tic with a graph database engine to support social network analysis and computing of archival and library collections.

we wish to demonstrate this environment with reusable clustering workflows for grouping digitized forms by their layout, a recurring use‑case in many digital humanities projects. this is a preprocessing step that has the potential to lead to more accurate ocr of regions in images within digitized forms.

umbra search as data: a digital sandbox to cross the digital divide
cecily marcus, university of minnesota libraries

https://listings.lib.msu.edu/fannielouhamer/

publicly launched in , the university of minnesota libraries’ umbra search african american history has been working with partners across the country, from the digital public library of america to yale university to howard university, to facilitate digital access to african american cultural history. more than a search tool, umbra search doesn’t just bring together over , digital materials from , us libraries, archives and museums.
it also promotes the use of these materials through programming                              with students, educators, scholars, and artists, and leads a massive digitization effort of african american                              materials to build out a national digital corpus of african american history. now, umbra search is                                exploring what it means to share the umbra search digital corpus as a data set that helps to bridge the                                        digital divide and promote digital literacy among underrepresented youth and kids of color. by packaging                              curated sets of umbra search data around thematic topics (as well as providing access to the whole of                                    umbra search data) with accessible digital storytelling tools that allow students to make data their own,                                umbra search provides an introduction to digital storytelling and other digital humanities skills through                            the lens of african american history and culture. umbra search’s national digital corpus provides a                              unique opportunity to engage students with steam activities and skill building with culturally relevant                            content that affirms african american history and culture. this talk discusses the rationale for developing                              a digital sandbox that provides libraries with a new model for activating primary source materials and                                digital collections—often considered to be among the more rarefied and inaccessible collections in                          libraries—and digital humanities tools in communities that may not regularly engage with archives,                          primary source digital collections, or digital humanities.    
historical public health data curation: indiana state board of health monthly bulletin project  caitlin pollock and heather coates, indiana university‑purdue university indianapolis    as digital scholarship librarians, enhancing open digital content to facilitate reuse is a key mission of our                                  work. this talk will introduce the work of iupui librarians in curating the indiana state board of health                                    monthly bulletin ( ‑ ). while in circulation, this resource was sent to all health officers and                              deputies in the state, plus individual subscribers. physicians shared information about health and                          wellness, communicable diseases, patent medicines, food safety, and many other topics. as such, the                            bulletin provides a unique historic portrayal of indiana public health practice, fascinating images, and                            regular vital statistics from the early and mid‑ th century. this project brings together the ruth lilly                                medical library and the iupui university library to leverage librarian expertise in digital humanities,                            medical humanities, public health, the history of medicine, and data curation. our initial focus is curating                                a ‑year span ( ‑ ) of these bulletins in order to develop and refine processes that can be                                  adapted for other digital collections. our curation efforts focus on providing greater accessibility to                            students and scholars of indiana and medical history, public health, and hoosiers across the state. we                                are creating three types of products: tei documents; geocoded citizens and professionals, community                          organizations and businesses, and buildings; and vital statistics data. 
data dictionaries are being developed to support analysis of the vital statistics and to capture additional context about historic knowledge of disease and death. project documentation will be developed to support exploration by the public and use by scholars, and to provide transparency with regard to the decisions made during curation. all products generated from the project, including protocols for curation, will be shared openly under a cc‑by license on platforms including github and the tei archiving, publishing and access service (tapas) project.

https://collectionsasdata.github.io/shapingdata_dh _abstracts/umbrasearch.org

leveraging core data for the cultural heritage of the medieval middle east
daniel l. schwartz, texas a&m university

i direct syriaca.org, a core data project for syriac history, literature, and cultures. syriac is a dialect of aramaic once spoken by populations across the middle east and asia. syriac sources document key moments in the interaction of judaism, christianity, and islam and offer unique perspectives on the history of the middle east from the roman period through ottoman rule and into the tumultuous present in iraq, syria, and the levant. syriaca.org has built a core data infrastructure useful to any digital project in the field that is interested in incorporating our uris for persons, places, works, manuscripts, etc. i would like to propose a ‑minute demonstration of three projects that highlight this utility.
1) spear (syriac persons, events, and relations) is a digital prosopography that employs our core data model (uris) to extract and encode data about persons, events, and relationships from primary source texts. the scale enabled by the digital allows extensive treatment of many subaltern groups usually left out of traditional print prosopography. tei encoding and serialization into rdf allow for multiple ways to query and visualize this data. 2) the new handbook of syriac literature is an open‑access digital publication that will serve as both an authority file for syriac works and a guide to accessing their manuscript representations, editions, and translations in digital and analog formats. though still in development, this handbook will more than double the number of works contained in the last publication to attempt something similar, anton baumstark’s geschichte, which is over years old. the handbook is part of syriaca.org’s efforts to produce reference resources that help overcome the colonial biases that informed orientalist organization of the cultural heritage of the medieval middle east. 3) we are developing a uri resolver that any project in the field using our uris can incorporate into its website to show users how many and what types of resources syriaca.org has on the entities included in the project’s data and to provide direct links to those resources.

addressing copyright and ip concerns when using text collections as data
megan senseney, eleanor dickson, and daniel g.
tracy, university of illinois

open source text data mining tools such as voyant and publicly‑available services such as the hathitrust research center (htrc) have brought the potential of new research discoveries through computational analytics within reach of scholars. while the tools for mining and analyzing the contents of digital libraries as data are increasingly accessible, the texts themselves are frequently protected by copyright or other ip rights, or are subject to license agreements that limit access and use.

the htrc recently convened a task force charged to draft an actionable, definitional policy for so‑called non‑consumptive use, which is research use that permits computational analysis while precluding human reading. this year, the htrc released the task force’s non‑consumptive research policy, which is shaping revised terms of service and tool development within the htrc. building on the development of the htrc’s policy, our team is seeking to catalyze a broader discussion around data mining research using in‑copyright and limited‑access text datasets through an imls‑funded national forum that will bring together experts around issues associated with methods, practice, policy, security, and replicability in research that incorporates text datasets that are subject to intellectual property (ip) rights.
the national forum aims to produce an action framework for libraries with recommendations that will include models for working with content providers to facilitate researcher access to text datasets and models for hosting and preserving the outputs of scholars’ text data mining research in institutional repositories and databanks.

this short talk will describe the task force’s work to establish a non‑consumptive research policy for the htrc and outline next steps toward building a more comprehensive research agenda for library‑led access to the wealth of textual content existing just out‑of‑reach in digital collections and databases through the upcoming national forum.

learning through use: a case study on setting up a research fellowship to learn more about how one of our collections works as a computationally amenable dataset
sarah severson and berenica vejvoda, mcgill university library and archives

mcgill university library and archives recently completed a major project to retrospectively digitize all of the dissertations and theses in our collections. once these were added to the institutional repository, the metadata and full text of over , electronic theses and dissertations (etd), from ‑ present, became searchable using the traditional database structure of keywords and full text. with such a large and comprehensive corpus of student scholarship, we wanted to use this collection as our first foray into thinking about ‘collections as data’ and what kinds of research could be done if we opened up the entire raw text corpus.
in order to encourage use and dialogue with the collection, the library created a computational research fellowship through an innovation fund. the fellowship call was left deliberately open in order to learn what people wanted to do with the collection; the only condition was that they share what they learned openly through presentations about their work and host any code in an open environment such as github.

the selected fellow’s project will use python’s natural language toolkit and word2vec (a word embedding algorithm developed by google) to build an application with a web‑based front end that will allow researchers to examine how literary terms have changed over time in usage and context. the project will also include a data visualization component using plotly (a python library) to promote interactive and visually meaningful data displays. more concretely, researchers will be able to enter a concept and a time period of interest and visualize how the context of the concept has evolved over time. by way of example, the concept of “woman” shifts contextually between first‑wave feminism and prior, as well as through subsequent waves of feminism.

this presentation will look at how we are thinking of our etd collection as a computationally amenable dataset; the computational fellowship as a means of engagement; and what we hope to learn about the collection and future library text mining services and support.
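the kind of query described for the fellowship project can be illustrated with a much simpler stand‑in. the sketch below does not use word2vec or the natural language toolkit; it only counts which words appear near a chosen term in each period, using the python standard library. the toy corpus, the half‑century periods, and all function names are assumptions made for illustration and are not taken from the project.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def half_century(year):
    """Label a year by the half-century it falls in (hypothetical periodization)."""
    start = year - year % 50
    return f"{start}-{start + 49}"


def context_by_period(docs, term, period_of, window=2):
    """Count words within `window` tokens of `term`, grouped by period.

    docs      -- iterable of (year, text) pairs
    period_of -- function mapping a year to a period label
    """
    contexts = defaultdict(Counter)
    for year, text in docs:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok != term:
                continue
            lo = max(0, i - window)
            # Neighbours on both sides of the term, excluding the term itself.
            neighbours = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            contexts[period_of(year)].update(neighbours)
    return contexts


# Hypothetical toy corpus; real input would be the full text of the theses.
docs = [
    (1905, "the woman question and the vote"),
    (1910, "every woman demands the vote"),
    (1975, "the woman worker and equal pay"),
]

contexts = context_by_period(docs, "woman", half_century)
for period in sorted(contexts):
    print(period, contexts[period].most_common(3))
```

a word embedding model would replace the raw neighbour counts with learned vectors trained separately on each period, but the shape of the question stays the same: fix a term, split the corpus by time, and compare the contexts.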
facilitating global historical research on the semantic web: medea (modeling semantically enhanced digital edition of accounts)
kathryn tomasek, wheaton college (massachusetts); georg vogeler, centre for information modeling ‑ austrian centre for digital humanities, university of graz

social and economic historians have spent at least the past fifty years creating data sets well suited for analysis using post‑wwii computational tools (spss/sas). contemporary efforts by such historians as patrick manning to aggregate data sets for human systems analysis demonstrate a desire to take advantage of the more recent tools represented by the semantic web. both tomasek and vogeler have explored ontologies that can be integrated into the cidoc‑crm family of event‑based models and used for markup of digital scholarly editions of accounts, a genre of archival documents that support humanities research as well as social science research. this short paper offers a brief introduction to recommendations for producing digital scholarly editions of accounts that include references to a book‑keeping ontology using the tei attribute @ana. vogeler has tested comparability of data across a small sample of such editions for which the references have been transformed into rdf triples. new editions are being added to those stored in the gams repository (geisteswissenschaftliches asset management system) at the university of graz between now and august .
we see these editions in sharp contrast to the example of “page‑turning” simulations referenced in the cfp for the workshop: creating full digital scholarly editions of accounts using tei, the book‑keeping ontology, and rdf triples is an example of shaping humanities data for use and reuse by taking advantage of the affordances of the semantic web.

javanese theatre as data
miguel escobar varela, national university of singapore

the contemporary wayang archive is an archive of indonesian theatre materials. the online portal’s primary goal is to enable users to watch videos alongside transcripts, translations and scholarly notes. however, a new version currently under development will enable users to query the archival materials via apis. the first api will be directed at linguistic queries from the transcript and translation corpus. the goal is to enable data‑driven investigations of the ways javanese and indonesian are used in the performances. although these languages are widely spoken (indonesia is the fourth most populous country in the world and javanese is its most widely spoken regional language), there are almost no machine‑readable resources in these languages that can be used in digital humanities and computational linguistics research projects. a second api is aimed at video processing applications. the api will serve videoframe‑level data that can be used to interrogate and visualize the collection in new ways.
we believe that most theatre projects in dh remain heavily focused on textual data or on numerical data such as revenue numbers, cast sizes and collaboration networks. however, we believe that video processing offers a rich and yet untapped avenue for inquiry [ ]. we aim to encourage further research into this area via our video processing api. this talk will briefly outline the objectives and history of cwa, our goals for the future and the technical and intellectual property rights challenges that we face.

references: [ ] escobar varela, m and g.o.f. parikesit, ‘a quantitative close analysis of a theatre video recording’ in digital scholarship in the humanities (forthcoming), doi: . /llc/fqv

https://collectionsasdata.github.io/shapingdata_dh _abstracts/cwa-web.org

journal of cultural analytics
july ,

on the perceived complexity of literature. a response to nan z. da
fotis jannidis, universität würzburg, germany

article info
article doi: . / c.
journal issn: -

abstract
at the center of nan z. da's article is the claim that quantitative methods cannot produce any useful insights with respect to literary texts: "cls's methodology and premises are similar to those used in professional sectors (if more primitive), but they are missing economic or mathematical justification for their drastic reduction of literary, literary-historical, and linguistic complexity. in these other sectors where we are truly dealing with large data sets, the purposeful reduction of features like nuance, lexical variance, and grammatical complexity is desirable (for that industry's standards and goals).
in literary studies, there is no rationale for such reductionism; in fact, the discipline is about reducing reductionism."

at the center of nan z. da's article is the claim that quantitative methods cannot produce any useful insights with respect to literary texts:

cls's methodology and premises are similar to those used in professional sectors (if more primitive), but they are missing economic or mathematical justification for their drastic reduction of literary, literary-historical, and linguistic complexity. in these other sectors where we are truly dealing with large data sets, the purposeful reduction of features like nuance, lexical variance, and grammatical complexity is desirable (for that industry's standards and goals). in literary studies, there is no rationale for such reductionism; in fact, the discipline is about reducing reductionism. ( )

from this quote, one could assume that the article concentrates on showing the "drastic reduction of literary, literary-historical, and linguistic complexity" at work in the more than studies she mentions. but her criticism, especially in the eight replication studies, is much more traditional, that is, traditional in the field of data science. for example, she points out that ted underwood should have done b instead of a, because a "does nothing for his objective" ( ). ted underwood explained that he actually did b, which da conceded in her answer (da b). in other cases she points to problems due to different corpus sizes or the instability of results due to the usage of different stopword lists. all this is good and has to be done, and errors can and will happen on all sides. but this is not the crucial point.
even if she could have shown that all cls studies she analyzed are in some way flawed according to the standards of inference statistics (and to emphasize this: they are not), there is no path that leads from that finding to her very sweeping statements: "the nature of my critique is very simple: the papers i study divide into no-result papers—those that haven't statistically shown us anything—and papers that do produce results but that are wrong." ( ) her analyses of the studies don't support her claim because the quality of specific studies cannot be the basis of an argument for the question of whether quantitative methods can be applied to literary texts with valid results. the bon mot at the end of the quote cited above, "the discipline is about reducing reductionism", receives its plausibility from the confrontation with the professional sectors: industry dumbs down language for its purposes, while literary scholarship does the opposite. but this opposition immediately loses plausibility when one realizes that these methods are also used in other fields of research for the purpose of research into complex matters, for example computational linguistics, sociology, psychology, bio-informatics and more. most research fields would also claim that one of their goals is not to 'reduce complexity'. the rhetoric of da's article is based on obscuring the boundaries between the very general statements about cls on the one side and the very limited reach of her methodological criticism on the other. basically she says over and over again: all practitioners in the field of computational literary studies are incompetent, and analyzing literary texts with computational means wouldn't work anyway. the fundamental problem of her essay is that she tries to combine these two quite different statements into one argument, where the errors are supposed to prove the point about the futility of the approach.
At the beginning of section , for example, she states: "CLS has no ability to capture literature's complexity." ( ) But in the sentences following this general statement, she delves into one specific paper by Mark Algee-Hewitt, who uses the entropy of bigrams as a measure of literary complexity and finds a correlation with canonicity. Her criticism advances in two steps: she explains what entropy measures and emphasizes the difference between a higher diversity of words and more complex meaning. But as Algee-Hewitt didn't claim to capture "meaning," she concedes the point only to explain how he, in her view, miscalculated the differences. This second step may be fruitful inside the field of CLS, but it doesn't show that "CLS has no ability to capture literature's complexity", because there is no logical way to move from an error in one researcher's calculation to a general statement about the fruitfulness of a whole research field. (This also applies to Da: the fact that a calculation of hers includes an error, as some have mentioned, doesn't say anything about the validity of her arguments in general.) Her analysis of statistical errors does nothing for her objective, if this objective is the argument that CLS is a field of research that cannot use quantitative methods because of the specific quality of its object, literary texts.

If you look at the two sides of her argument, the errors of CLS and the futility of its research program, it is interesting to see that most of the text is about the errors, while there are only a few arguments bolstering the second claim. In the following, I concentrate on this second claim. As far as I can see, there is only one fundamental argument she employs, which we saw quoted above: complexity.
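For readers unfamiliar with the measure discussed above, the following is a minimal sketch of what "entropy of bigrams" can mean in practice: the Shannon entropy of the distribution of adjacent word pairs. It is an illustration only and not a reconstruction of Algee-Hewitt's actual procedure; the function name `bigram_entropy` and the two toy word sequences are assumptions introduced here.

```python
import math
from collections import Counter


def bigram_entropy(tokens):
    """Shannon entropy (in bits) of the distribution of adjacent token pairs."""
    bigrams = list(zip(tokens, tokens[1:]))
    if not bigrams:
        return 0.0
    counts = Counter(bigrams)
    total = len(bigrams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy inputs (hypothetical): one repetitive, one lexically varied sequence.
repetitive = "the cat sat the cat sat the cat sat".split()
varied = "the cat sat on a mat near the old door".split()

print(bigram_entropy(repetitive))  # lower: only three distinct bigrams
print(bigram_entropy(varied))      # higher: nine distinct bigrams, log2(9)
```

The sketch shows why the measure tracks the diversity of word combinations and nothing more: it says how unpredictable the next pair is, not how complex the meaning is, which is exactly the distinction at stake in the passage above.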
And then there is a second group of methodological arguments, which are derived from the papers she analyzes, but which she at least in part transforms into general objections to the application of quantitative methods to literary texts: these center on questions of operationalization, the high dimensionality of language data, and 'proper' inferential statistics.

Complexity

Nan Z. Da doesn't explain why literature is supposed to be especially complex, and maybe she doesn't have to, because it would be fair to assume that this is a shared belief in literary studies. However, to make sense in this context it is not enough that literature is very complex. It has to be singularly complex, given that many other research fields with complex objects use quantitative methods. Sociology tries to describe whole societies, including artists and artworks, just as psychology tries to understand the psyche of individuals as well as groups, including those who produce and perceive works of art. So, yes, a literary text is probably infinitely complex, and the society in which this text is produced, distributed and read is also infinitely complex, as is the human psyche. But these infinities probably don't have the same size, just as the sets of natural and real numbers don't. In light of these comparisons, it is very hard to believe an argument that literature is singularly complex, that only in the case of literature there can be no quantitative research. But this is the underlying assumption which permeates most of the other arguments.

When Da talks about the use of NLP methods, her argument basically goes like this:

1) CLS is not using better and more complex methods than those used in computational linguistics.
2) These methods don't work for literary texts, because they only work with sets of simple and similar texts.
3) Even if they work on the same level as with non-literary texts, this is not enough for literature. "You quickly run into a data scarcity and data complexity problem with literature." ( )

The first point is probably an effect of her sample of CLS texts; I will come back to that at the end. The second point is actually an empirical question. Da states: "Speech tagging is extremely inaccurate for literary texts." ( ) Tellingly, in this reference-rich text there is no reference backing up this claim. Does the complexity of literary texts really reveal itself in something like part-of-speech tagging? Yes, she says: "Lexical, syntactic, and grammatical ambiguities make it difficult for an algorithm to know whether a word is a participle or a gerund, if an adjective is a noun, or if entire phrases are functioning as a single part of speech." ( ) This may be true for modern poetry or experimental prose, but it is not very likely for most fictional prose. But again, it is an empirical question, and I can only offer one piece of empirical evidence here: in our studies on character references in German literary prose, in which we annotated texts from the th and the beginning of the th century, we didn't see especially high error rates after we finished the tedious business of creating training corpora for our domain. What we did find were different distributions compared to non-fictional prose (Krug et al. ). So even if the material is definitely different, at least the one NLP method we used worked fine. The key point here is that these assertions can and ought to be tested.

The third point seems to be a variant of the argument from the complexity of literature: "Tagging errors and imprecision in NLP do not sufficiently degrade the extraction of information in many other contexts, but they do for literature."
( ) The concrete form of the argument is rather unclear, because all of her observations up to this point support argument 2, that NLP tools cannot work as precisely on literary texts as on non-literary texts. But they do not explain why a working NLP tool with high, but not perfect, reliability wouldn't suffice for literary texts. Why % precision with high inter-annotator agreement is good enough for linguistics but not for literary studies remains unclear. This lack of coherence between her arguments is quite typical of the whole text. The first argument attacks CLS for not using more sophisticated methods, methods which, as she claims in her second argument, do not work for literary texts anyway. The same lack of logical connection can be found between arguments two and three. She is just stacking them up like ramparts around the center of a castle to make sure that it is impossible to reach it.

Statistics

As mentioned above, Da repeatedly criticizes the foundation of CLS studies:

"No matter how fancy the statistical transformations, CLS papers make arguments based on the number of times x word or gram appears." ( )

"Therefore all the things that appear in CLS – network analysis, digital mapping, linear and nonlinear regressions, topic modeling, topology, entropy – are just fancier ways of talking about word frequency changes." ( )

"No one has ever said, though, that consistent word frequency is what distinguishes Shakespeare's comedies from tragedies, tragedies from histories, and so on – and no one would ever say that because such distinctions cannot be captured with word frequencies." ( )

Is she right? Yes, a lot of work in CLS and also in computational linguistics uses token frequencies, where a token can be anything from characters to words, taken either as 1-grams, 2-grams, or 3-grams, etc. Is this bad?
Da contrasts it explicitly and implicitly with the specific complexity of literature to make her point that an approach based on token frequencies can never be enough for such a complex subject matter. But she actually merges two aspects into one, which people in CLS usually treat separately, and for good reasons. The first aspect is a theory of a phenomenon. The second aspect is the formal model of an indicator, which is used to test hypotheses that are derived from the theory and are assumed to be directly or indirectly related to the phenomenon. The indicator will tell the researcher something about the phenomenon if the indicator is well chosen. But these indicators don't represent the phenomena in their full complexity. They capture just one single aspect. If, for example, one theory assumes that authorship is solely a discursive phenomenon, and another theory assumes that authorship might be even more complex, involving discursive practices and similar distributions of words in texts, then a simple stylometric test can produce results which are not easily brought into agreement with the first theory. This test does not model the complexity of authorship, but it doesn't have to in order to achieve its goal.

A theory of the research process in CLS doesn't exist yet, but it will resemble in many respects similar work done in the social sciences, where the distinction between theory, hypothesis, and indicator or variable, which can be measured, is widely used. In most introductions to statistics, the description of the research process is more simplified: there is a theory and a hypothesis derived from the theory. So the theory could be that a specific drug influences blood pressure, and the hypotheses to be tested look like this: H0: taking the drug has no effect; H1: taking the drug reduces blood pressure.
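The textbook setup just described, a null hypothesis of no effect against the alternative that the drug reduces blood pressure, can be made concrete with a minimal sketch of an exact one-sided permutation test. This is only one of several ways such hypotheses are tested, and it is not taken from the article; the function name `exact_permutation_p` and the blood-pressure readings are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(control, treated):
    """One-sided exact permutation test.

    Returns the share of all possible relabelings of the pooled values in
    which the difference mean(group A) - mean(group B) is at least as large
    as the observed difference mean(control) - mean(treated).
    """
    observed = mean(control) - mean(treated)
    pooled = control + treated
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(control)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if mean(group_a) - mean(group_b) >= observed:
            extreme += 1
    return extreme / total


# Invented systolic readings (mmHg); not data from any real study.
control = [142, 138, 150, 145, 148]
treated = [128, 131, 135, 126, 133]

p_value = exact_permutation_p(control, treated)
print(p_value)  # 1/252, about 0.004: only the observed split is this extreme
```

Note what the sketch leaves out: it takes the measured numbers as given and says nothing about how reliably they capture blood pressure, which is the measurement question the surrounding argument turns on.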
Handbooks then usually concentrate on the setup of a randomized experiment with control groups etc. They don't talk about the difficulties of measuring blood pressure reliably by using either the auscultatory or the oscillometric method. Blood pressure is a rather complex phenomenon, and the oscillometric method doesn't represent all of its aspects but uses one very specific aspect, the oscillations of the cuff pressure. Da's criticism of the use of word frequencies in CLS basically claims that we cannot use a method because it doesn't represent the complexity of the measured phenomenon in its entirety. As I hope the above examples show, this general claim doesn't adequately represent the complexity of research design in quantitative studies.

To take another example more closely related to the field: there are many differences between genres like romance and science fiction, but you can use word frequencies to distinguish between them with very high reliability. These results are quite robust in a statistical sense. If the question you are interested in is how the prevalence of romance and science fiction changes over time in a set of books, word frequencies may be enough to answer it. If you want to know the difference between gender representations in these genres, word frequencies may not be enough. To assume that word frequencies cannot be the basis for any kind of research question one could ask in relation to literature, because literature is too complex, is simply false. One cannot answer this question in general; you have to look at each research question and design separately: is the chosen indicator enough to answer the research question? The task is made more difficult by our lack of knowledge in this area. Those of us in the field of CLS are also researching which kinds of indicators yield robust information about which literary phenomena.
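The genre example above can be illustrated with a deliberately tiny sketch: each genre is represented by a profile of relative word frequencies, and a new text is assigned to the genre whose profile it resembles most under cosine similarity. This is a toy under stated assumptions, not the method of any study discussed here; the helper names (`rel_freq`, `cosine`, `classify`), the two genre labels, and all example strings are made up, and real studies work with full novels and many more features.

```python
import math
from collections import Counter


def rel_freq(text):
    """Map each lower-cased whitespace token to its relative frequency."""
    tokens = text.lower().split()
    total = len(tokens)
    return {word: count / total for word, count in Counter(tokens).items()}


def cosine(a, b):
    """Cosine similarity between two sparse frequency dictionaries."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, profiles):
    """Return the label of the profile most similar to the text."""
    vector = rel_freq(text)
    return max(profiles, key=lambda label: cosine(vector, profiles[label]))


# Hypothetical genre profiles built from made-up snippets.
profiles = {
    "romance": rel_freq("her heart love kiss heart wedding love letter"),
    "science_fiction": rel_freq("ship planet robot star ship orbit alien planet"),
}

print(classify("the robot left the ship for the planet", profiles))
print(classify("a love letter before the wedding", profiles))
```

Whether such an indicator is sufficient depends, as argued above, on the research question: it may do for tracking the share of two genres in a collection, and it clearly says nothing about, say, gender representation within them.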
Some answers will be generic, for example the use of function words for stylometric purposes, and some will be quite individually tailored to one specific question about some group of texts written at a specific place and time.

There is another variant of Da's argument that also implies that literature is too complex for quantitative methods: "To look for homologies in literature, CLS must eliminate much of high-dimensional data and determine the top drivers of statistically significant variation. This always involves a significant loss of information; the question is whether that loss of information matters." ( f.) As so often in this text, the question is only rhetorical, though it actually is a real question, one which can only be answered empirically in the context of specific research questions. We already know that some questions can be answered reliably with simple representations of texts, but we don't yet know the limitations of these approaches. Da thinks she can deduce from her knowledge about the workings of a procedure like principal component analysis (PCA) that it can never be used with literary texts and yield usable results. Again, she is empirically wrong, as shown this time by the work of researchers like John Burrows or Hugh Craig, who used PCA rather successfully in the context of authorship attribution. It would be really hard, if not impossible, for a layperson to understand from her description how PCA works. But this is not important for her argument, because the aim is to contribute to the main theme of her text, i.e., that literature is too complex for this kind of method:

"It is one thing to statistically identify the shared drivers of a medical illness and another to say that the difference between Immanuel Kant's third Critique and G. W. F. Hegel's Lectures on Aesthetics can be captured in two or three numbers derived from their overlap on two or three vocabulary lists."
( ) Da assumes again that this application of quantitative analysis is supposed to model the whole complexity of the phenomenon. Certainly, 'the' difference cannot be captured this way, but 'a' difference can, and for some questions this will be enough.

There is a puzzling statement in the more general part of her text which shows an idiosyncratic understanding of data science: "Quantitative analysis, in the strictest sense of that concept, is usually absent in this work." ( ) She doesn't explain what she thinks 'quantitative analysis, in the strictest sense of that concept' is (and it is not inferential statistics, because hypothesis testing is the second item on her list of things missing in CLS), but it is easy to show that this is not a widely shared view. Exploratory data analysis is an approach to quantitative analysis which goes back to the statistician John Tukey in the s (Tukey ) and which has gained more and more traction as more data and computing power have become available. It is especially useful for detecting patterns in large collections of data. If you look at recent introductions to data science such as (VanderPlas ), typical methods of exploratory data analysis like clustering or data visualization are usually explained in depth. There is an ongoing discussion in the field about the role of exploratory data analysis in relation to confirmatory data analysis. Tukey states: "Exploratory data analysis can never be the whole story, but nothing else can serve as the foundation stone – as the first step." (Tukey ) Now that large portions of a given population are in some cases available, descriptive methods have a different role to play. It is an unresolved question what role exploratory data analysis has in the research process today, and how to make sure that the patterns which are discovered in visualizations or by similar means are valid and robust.
Additionally, the relation between traditional statistical approaches and what Breiman calls algorithmic modeling is under discussion (Breiman , Underwood ). It is an open question how to integrate statistical methods organized around the concept of falsification with machine-learning approaches organized around the concept of optimization into one common research design framework. But this is very far from Da's claim that 'this work' is not quantitative analysis.

All in all, Da's professed attitude that she is only helping editors in literary studies to make informed decisions is not convincing. If you look, for example, at her 'suggested guidelines for reviewing CLS manuscripts' in the online appendix of her article, she raises the bar for any journal editor willing to publish CLS papers so high, by demanding a replication with the help of a programmer, a statistician and a literary historian, that no journal will have the means to go to that trouble. The guidelines of a journal like Science treat an evaluation of the data as essential, while neither a software review nor a replication is required (Science ). Again, the question of how to evaluate data science is a real issue, and in an ideal world where research has unlimited funding, specialized journals would have that combination of expertise. But as Da wields these requirements, they are just supposed to make sure that no CLS article is published in a journal of literary studies. Nan Z. Da touches on some interesting questions, but because she is so bent on proving that CLS is a failed endeavor in general, she has to act as if she already has all the answers.

Selection Bias and Coda

Da calls her paper a "computational case against computational literary studies", but if you look at her paper from a data science perspective, the research design is quite questionable, as others have also pointed out (Piper ). The main problem is her data.
She has a sample of eight studies, which is supposed to allow her to judge an international field with many works published over the last thirty to forty years. An online bibliography of the field, which Christof Schöch assembled with the help of many in the field, has more than , entries (Schöch et al. ). Eight replication studies is indeed a lot of work, and I am certain the field will learn something in the long run from this endeavour. But as Da has pointed out, this is of no interest to her: "It is not a method paper" (Da b). She wants to show literary studies that CLS is and will be flawed. Her statements, based on this very small sample, condemn the whole field, without any limitations, in the strongest words. Eight data points are an extremely weak basis for that.

But not only is the size of her sample a problem; the selection of the sample is too. The only thing we hear about the selection criteria and the process is that the papers were "chosen for their prominent placement, for their representativeness, and for the willingness of authors to share data and scripts or at least parts of them" ( ). In statistics, representativeness is approximated by a randomized selection of the cases, not by a researcher's impression that these instances would support her hypothesis nicely.

From a European perspective the bias in Da's selection of articles in the field of computational literary studies is rather obvious. The article aims to talk about the field in general, but only quotes North American scholars, especially those who have published in one journal. What about Karina van Dalen, Maciej Eder, Mike Kestemont, Jan Rybicki, or Christof Schöch, to mention just a few? All of them have published extensively in English.
If you start to do research on CLS, it is difficult not to come across the journal Digital Scholarship in the Humanities, the flagship journal of the Alliance of Digital Humanities Organizations, where a number of CLS studies from all continents have been published. All of this is missing in her selection. Additionally, the bias also distorts the image of CLS by excluding those approaches in Europe which belong to the field but are quite different from the examples she quotes. There is, for example, the work of Karina van Dalen and her group in the Netherlands, who are looking into the factors which make it more probable that people view something as good literature, combining methods from social science and CLS (Koolen , Riddell , van Cranenburgh ). Or Evelyn Gius and Jan Christoph Meister in Germany, who are interested in robust markup schemas for narrative phenomena and in how to support scholars in annotating contradictory, unclear or vague information (Gius , Meister ). Or the work done by Maciej Eder and Jan Rybicki to evaluate and apply stylometric methods to small and large text collections (Eder , Rybicki ), or the work done by Mike Kestemont on applying cutting-edge digital methods like deep learning to languages with limited resources like Latin (Kestemont ), or the studies of my research group, which is adapting NLP tools to the domain of literary texts by creating manually annotated corpora of literary texts, for example for character references or speech rendering (Krug ). You will also find more studies about the robustness of specific quantitative methods in Europe (Eder , Evert , Schöch ). So the politically motivated bias of her selection, which in itself has a depressing effect on European readers in the current political climate, has also resulted in a seriously distorted image of the field.

The journal which published Nan Z.
Da's essay offered some of the authors she had criticized the opportunity to respond, and she in turn answered them. Three points are of interest in her answer:

1) She emphasizes that she didn't write her text for CLS but for literary scholars and editors. In my opinion the whole structure of her paper and the selection of the examples are only understandable from this perspective.

2) She seems to be confused about what she really said in her original paper: "It seems unobjectionable that quantitative methods and nonquantitative methods might work in tandem. My paper is simply saying: that may be true in theory but it falls short in practice." (Da b) As I have shown above, she does say much more. She repeatedly makes the point that quantitative methods cannot be applied to literary texts in general, and not only in some specific cases.

3) She changed the outcome of her paper. In the original paper she was very clear in her statement: "The papers I study divide into no-result papers – those that haven't statistically shown us anything – and papers that do produce results but that are wrong." ( ) In her later statement she modified this to: "First, there is statistically rigorous work that cannot actually answer the question it sets out to answer or doesn't ask an interesting question at all. Second, there is work that seems to deliver interesting results but is either nonrobust or logically confused." (Nan Z. Da: Argument, , my emphasis, F.J.) With the addition of "doesn't ask an interesting question at all" she changed the first group from 'no-result papers' to 'no-result or not an interesting question'. Well, 'interesting' is a very loose and quite subjective category. This loophole is probably made necessary by the fact that in at least two cases her criticism was wrong, that is, the statistics are fine and they do show what the author of the paper intended to show.
If we look at the topics of these two studies, Ted Underwood's paper on genre and Hugh Craig's book chapter on co-authorship of Shakespeare's plays, it is clear that they treat questions which have been discussed extensively in non-quantitative studies. It seems to indicate that what is not interesting to Da may be interesting to many others. Hopefully this is also true for the literary scholars and editors whom her study in resentment tried to convince of the opposite.

References

Breiman : Leo Breiman: Statistical Modeling: The Two Cultures. In: Statistical Science , ( ), - .

Da a: Nan Z. Da: The Computational Case against Computational Literary Studies. In: Critical Inquiry (Spring ), - .

Eder : Maciej Eder: Does Size Matter? Authorship Attribution, Small Samples, Big Problem. In: Digital Scholarship in the Humanities , ( ): - .

Evert : Stefan Evert, Thomas Proisl, Fotis Jannidis, Isabella Reger, Steffen Pielström, Christof Schöch, Thorsten Vitt: Understanding and Explaining Delta Measures for Authorship Attribution. In: Digital Scholarship in the Humanities , suppl_ ( ), ii -ii .

Gius : Evelyn Gius, Janina Jacke: The Hermeneutic Profit of Annotation: On Preventing and Fostering Disagreement in Literary Analysis. In: IJHAC ( ): - ( ).

Kestemont : Mike Kestemont and Jeroen de Gussem: "Integrated Sequence Tagging for Medieval Latin Using Deep Representation Learning". In: Journal of Data Mining & Digital Humanities ( ). https://jdmdh.episciences.org/

Koolen : Cornelia W. Koolen: Reading beyond the Female. The Relationship between Perception of Author Gender and Literary Quality. Amsterdam .

Krug : Markus Krug, Frank Puppe, Isabella Reger, Lukas Weimer, Luisa Macharowsky, Stephan Feldhaus, Fotis Jannidis: Description of a Corpus of Character References in German Novels - DROC [Deutsches Roman Corpus]. DARIAH-DE Working Papers Nr. . Göttingen: DARIAH-DE, .
URN: urn:nbn:de:gbv: -dariah- - -

Meister : Jan Christoph Meister: "Crowd Sourcing 'True Meaning'. A Collaborative Markup Approach to Textual Interpretation." In: Willard McCarty, Marilyn Deegan (eds.), Collaborative Research in the Digital Humanities. Festschrift for Harold Short (Ashgate Publishers) , - .

Piper : Andrew Piper: "Do We Know What We Are Doing?" In: Journal of Cultural Analytics. April , .

Rybicki : Jan Rybicki and Maciej Eder: Deeper Delta across Genres and Languages: Do We Really Need the Most Frequent Words? In: Literary and Linguistic Computing , ( ): - .

Science : Instructions for Reviewers of Research Articles. Online: https://www.sciencemag.org/sites/default/files/rainstr .pdf ( . . )

Schöch : Christof Schöch: Stylometry Bibliography. ff. https://www.zotero.org/groups/ /stylometry_bibliography?

Schöch : Christof Schöch: "Zeta für die kontrastive Analyse literarischer Texte. Theorie, Implementierung, Fallstudie." In: Quantitative Ansätze in den Literatur- und Geisteswissenschaften. Systematische und historische Perspektiven, edited by Toni Bernhart, Sandra Richter, Marcus Lepper, Marcus Willand, and Andrea Albrecht, - . Berlin: de Gruyter, .

Tukey : John W. Tukey: Exploratory Data Analysis. Reading, Mass. .

Underwood : Ted Underwood: Algorithmic Modeling: Or, Modeling Data We Do Not Yet Understand. In: Julia Flanders & Fotis Jannidis (eds.): The Shape of Data in Digital Humanities: Modeling Texts and Text-Based Resources. New York: Routledge , - .

van Cranenburgh : Andreas van Cranenburgh, Karina van Dalen-Oskam, Joris van Zundert: Vector Space Explorations of Literary Language. In: Language Resources & Evaluation ( ). https://doi.org/ . /s - - -

van Dalen-Oskam : Allen Riddell & Karina van Dalen-Oskam: Readers and Their Roles: Evidence from Readers of Contemporary Fiction in the Netherlands.
PLOS ONE , ( )

Finding the Connection: Research Data Management and the Office of Research

by Amanda Rinehart

RDAP Review. Bulletin of the Association for Information Science and Technology – October/November – Volume , Number

Amanda Rinehart is assistant professor and data management librarian at The Ohio State University Libraries. She can be reached at rinehart. @osu.edu

Editor's Summary

Academic and research librarians can enhance the services they provide to researchers by collaborating with university offices of research. The requirement that projects have data management plans to qualify for federal funding has stimulated alignment, partnering and pooling of resources between service providers. The author's experiences in data librarian positions illustrate various routes and results of interaction, from demonstrating the value of reusing data for different purposes to promoting data ethics and best practices in data management. Librarians should seek out opportunities to collaborate with their campus research offices, understand commonalities in service to researchers and focus on data management as a complement to research professionals' work.

Keywords: academic communities, academic libraries, research libraries, collaboration, data curation, library and archival services, funding

As our researchers each navigate their own particular research lifecycle and we seek to develop library services to meet their needs, it behooves us to try to understand their entire research process. Fortunately, we're not the only ones interested in our users. Campus offices of research have the mission to assist researchers in getting and managing research funding. These units range from a few people processing grant forms to large and distributed organizations complete with educational programs. In recent years, there have been several case studies of partnerships between the office of research and institutional libraries, both within the United States [ ][ ][ ] and abroad [ ][ ][ ]. By pooling resources, both the libraries and the offices of research get better results. In particular, the data management plan requirement from U.S. federal funding agencies creates a natural fit between data management librarians and research development professionals.

As part of an office of research team, research development professionals work to match researchers to funding opportunities, prepare grant materials, build research teams, interact with funding agencies and provide training [ ][ ]. As external funding has become more competitive and important to institutional revenue, the National Organization of Research Development Professionals (NORDP) was created to support these individuals [ ]. If you peruse their annual conference program, you'll see many topics that overlap with librarian interests: data management, research impact, unique identifiers for researchers, funding agency regulations and digital scholarship.

I've held data librarian positions at three different academic institutions, and the office of research has always been a welcome ally. My first position was at a small, research-intensive institution, right after the National Science Foundation data management plan requirement was announced. I reached out to the local office of research to find out what training opportunities they intended to provide. This contact led to my facilitating a conversation around data ethics at their annual responsible conduct of research training. Topics surrounding data ethics lead to talking about best practices in data management – that poor documentation results in the inability to repeat experiments, that the lack of standards results in isolated data sets and how to implement better practices. Expectations have changed, and now the responsible, ethical researcher consciously engages in good data management practices.

When I took a position at a large, public institution, I again reached out to the office of research, only with different results. At an institution known for developing best teaching practices, data management needed to be embedded into that framework. While there were no obvious joint training opportunities, the director of the office of research suggested that I pursue an internal grant that targeted cross-disciplinary research. In conjunction with a faculty member in political science, I proposed to properly prepare previously digitized U.S. Supreme Court records and demonstrate how this data could be re-used in the classroom. A presentation at the annual university-wide symposium on teaching and learning led to wider campus recognition of how best practices in data management can lead to re-use in the classroom.

In my current position at Ohio State University there are two levels of research development professionals: those at the central office of research and those embedded in specific colleges. Reaching out to both has paid off. Collaborating with the central office of research has resulted in joint workshops, such as "Getting Grants: Finding Funding and Planning for Data Management." At the college level, I have been conducting specialized events, such as presenting on data management in the responsible conduct of research training for the College of Veterinary Medicine and conducting guest lectures at the College of Pharmacy.
How can you engage with your office of research, and in particular, your research development professionals? First, if you haven't had the opportunity to manage a grant, learn a bit about their language [ ] and identify opportunities [ ] for collaboration. Consider services that are adjacent to each other on the research lifecycle [ ], ask your colleagues for introductions or simply go to their office or set up a time to have coffee. Articulate that you aren't looking to duplicate services and define what you mean by data and data management. Like all relationships, it may take a bit of time, but the advantages are multifold: the benefits of targeted contact networks, more direct referrals, better timing of events and information and the potential ability to demonstrate value in terms of direct revenue.

Resources Mentioned in the Article

[ ] Bedard, M., Hendrickson, D., Lubas, R., & Reenen, J. V. ( ). Library information technology collaborations at the University of New Mexico. Journal of Library Administration, ( ), ‐ .
[ ] Black, C., Harris, B., Mahraj, K., Schnitzer, A. E., & Rosenzweig, M. ( ). Collaboration between the University of Michigan Taubman Health Sciences Library and the University of Michigan Medical School Office of Research. Medical Reference Services Quarterly, ( ), ‐ .
[ ] Delserone, L. M., Kelly, J. A., & Kempf, J. L. ( ). Connecting researchers with funding opportunities: A joint effort of the libraries and the university research office. Collaborative Librarianship, ( ). Retrieved from www.collaborativelibrarianship.org/
[ ] Ball, J. ( ). Research data management for libraries: Getting started. Insights: The UKSG Journal, ( ), ‐ . Retrieved from http://insights.uksg.org/ /volume/ /issue/ /
[ ] Clements, A. ( ). Research information meets research data management … in the library? Insights: The UKSG Journal, ( ), ‐ . Retrieved from http://insights.uksg.org/ /volume/ /issue/ /
[ ] Richardson, J., Nolan‐Brown, T., Loria, P., & Bradbury, S. ( ). Library research support in Queensland: A survey. Australian Academic & Research Libraries, ( ), ‐ .
[ ] Levin, J. (March , ). The emergence of the research-development professional. The Chronicle of Higher Education. Retrieved from http://chronicle.com/article/the-emergence-of-the/ /
[ ] National Organization of Research Development Professionals. ( ). What is research development? Retrieved from https://nordp.memberclicks.net/index.php?option=com_content&view=article&id= &itemid=
[ ] National Organization of Research Development Professionals. ( ). Home. Retrieved from https://nordp.memberclicks.net/
[ ] Rinehart, Amanda. ( ). Creating sustainable data management services: The office of research and you [Poster]. ACRL Conference , Portland, OR.
[ ] Rinehart, Amanda. ( ). The two-for-one workshop: Mapping data management services to the research lifecycle [Poster]. ASIS&T RDAP Summit , Minneapolis, MN.
[ ] University of Central Florida Libraries Research Lifecycle Committee. ( ). The research lifecycle at UCF [Online graphic]. Retrieved from www.library.ucf.edu/scholarlycommunication/researchlifecycleucf.php
Functional disambiguation based on syntactic structures

Octavio Santana Suárez, José Rafael Pérez Aguiar, Luis Losada García, and Francisco Javier Carreras Riudavets
University of Las Palmas de Gran Canaria

Abstract

This article presents a disambiguation method which diminishes the functional combinations of the words of a sentence, taking into account the context in which they appear. The process is built in two phases: the first phase is based on the local syntactic structures of the Spanish language and reaches an average yield of [ ]%. The second one is supported by syntactic tree representation and pushes the results up to an approximate high end of [ ]%. This process constitutes the starting point towards an automated syntactic analysis.

Introduction

In the Spanish language, a considerable number of words can play different grammatical functions, and therefore a text analysis would produce an enormous number of combinations unless the function of each word within the context where it appears is considered. Functional disambiguation consists of eliminating the results that do not correspond to the function of the word within the text. This article presents a method of functional disambiguation that reduces the size of the answer through a two-step treatment of the output of a morphological processor.
In the first stage, a functional disambiguation based on local syntactic structures is applied; here the grammatical functions that are not valid in the neighbouring environment of each word within the sentence are discarded. In the second stage, the structural functional disambiguation is performed; at this point the combinations of grammatical functions of the sentence that prevent the generation of syntactic representation trees valid for the whole sentence are discarded.

Correspondence: Francisco Javier Carreras Riudavets, Departamento de Informática y Sistemas, Edificio de Informática y Matemáticas, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Las Palmas, Spain. E-mail: fcarreras@dis.ulpgc.es

Literary and Linguistic Computing, Vol. , No. , . © The Author . Published by Oxford University Press on behalf of ALLC and ACH. All rights reserved. For permissions, please email: journals.permissions@oxfordjournals.org. doi: . /llc/fql

Basic syntactic structures and functional pairs

In the Spanish language, there are basic structures that repeat and combine among themselves to produce the sentences of the discourse. The composition of these structures defines the pairs of grammatical functions that can appear in a sentence within these local structures. When a local study is to be performed, the null symbol is included both at the beginning and at the end of every structure.

The functional behaviours of the following need to be considered: noun, adjective, demonstrative adjective, possessive adjective, adverb, personal pronoun, relative pronoun, remaining pronouns, article, preposition, conjunction, coordinating conjunction and contraction. Some categories are subdivided because they show differences of function and position in the syntactic structures.

Among adjectives it is possible to distinguish the possessive ones from the demonstrative ones; the possessive adjectives can be further separated into those that appear before the nominal head they complement, those that appear after it and those that appear in both positions. Among pronouns we can distinguish the demonstrative adjectives (with adjective function) from the personal pronouns (the unstressed and the tonic pronouns are identified separately) and from the relative pronouns; the remaining pronouns are considered under the denomination of other pronouns. The coordinating conjunctions are treated in a special fashion because they are used to link formal structures of the same syntactic level; all of them are included under the denomination of conjunction. The personal forms, the infinitive, the gerund and the participle can also be distinguished from each other. Among the contracted forms, a combination of a preposition and a determiner and, sometimes, a combination of three elements will be considered. The punctuation marks are also considered, differentiating between a comma and a semicolon.

Homogeneous noun phrase

The homogeneous noun phrase has the following basic structure: null + determiner + nominal head + adjacencies + null. The determiner may be an article, a possessive adjective or a demonstrative adjective. The nominal head is formed by a noun. The adjacency may be an adjective, the preposition de followed by a phrase (prepositional complement of the noun) or a noun (apposition). The determiner, the nominal head and the adjacency must agree in gender and number.

This structure may exhibit certain variations in the presence and position of its elements. The nominal head will always be present; however, the determiner and the adjacency may not appear. Some adjacencies (adjectives) may precede the nominal head and, sometimes, the determiner (possessive adjective) may follow the nominal head. The table of pairs of symbols that form the homogeneous noun phrase shows the configurations formed by consecutive pairs.
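As a minimal sketch (not the authors' implementation), the pair configurations for the homogeneous noun phrase can be encoded as a lookup table and used to test whether every consecutive pair in a sequence of symbols is allowed. The yes/no values are copied from the article's table of pairs for the homogeneous noun phrase; the names `FOLLOWED_BY` and `pairs_allowed` are hypothetical. Passing this check is only a necessary condition, because it inspects pairs and ignores agreement and overall structure.

```python
NULL, DET, HEAD, ADJ = "null", "determiner", "nominal head", "adjacency"

# Outer key: first symbol of the pair; inner key: the symbol that follows it.
# Values copied from the table of pairs for the homogeneous noun phrase.
FOLLOWED_BY = {
    NULL: {NULL: False, DET: True, HEAD: True, ADJ: True},
    DET: {NULL: True, DET: False, HEAD: True, ADJ: True},
    HEAD: {NULL: True, DET: True, HEAD: False, ADJ: True},
    ADJ: {NULL: True, DET: True, HEAD: True, ADJ: True},
}


def pairs_allowed(symbols):
    """Return True if every consecutive pair, including the null
    symbols added at both ends, is marked 'yes' in the table."""
    padded = [NULL, *symbols, NULL]
    return all(FOLLOWED_BY[a][b] for a, b in zip(padded, padded[1:]))


if __name__ == "__main__":
    print(pairs_allowed([DET, HEAD, ADJ]))   # determiner + head + adjacency
    print(pairs_allowed([DET, DET, HEAD]))   # two determiners in a row
```

A pair table keeps the check local: each symbol only needs to know its immediate neighbour, which is what makes this stage cheap compared with building full trees.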
Table [ ]. Pairs of symbols that form the homogeneous noun phrase

Followed by:  | null | determiner | nominal head | adjacency
null          | no   | yes        | yes          | yes
determiner    | yes  | no         | yes          | yes
nominal head  | yes  | yes        | no           | yes
adjacency     | yes  | yes        | yes          | yes

Heterogeneous noun phrase

Heterogeneous noun phrases are combinations of the homogeneous ones: homogeneous noun phrase + connector + homogeneous noun phrase. From the grammatical point of view the connectors are conjunctions, and from the graphical point of view they are realized by the comma (,). The new combinations of symbols that appear are listed in the table of pairs for the heterogeneous noun phrase.

Table [ ]. Pairs of symbols that form the heterogeneous noun phrase

              | Preceded by connector | Followed by connector
determiner    | yes                   | yes
nominal head  | yes                   | yes
adjacency     | yes                   | yes

Substitute noun phrase

The substitute noun phrase appears when the nominal head is realized by a category different from the noun: the pronoun, the adjective or the infinitive preceded by a determiner. With respect to the homogeneous noun phrase, new pairs of functional categories appear as a result of substituting the noun that forms the head by a pronoun (tonic personal pronoun, relative pronoun preceded by an article, or other pronoun), an adjective (preceded by an article or a demonstrative adjective) or an infinitive (preceded by an article, possessive adjective or demonstrative adjective). The head continues to agree with the determiner and the adjacencies in gender and number.

Verb

The simple verbal forms consist of one verb in the active voice with the basic structure: null + simple verbal form + null. A simple verbal form may be a personal verbal form or an infinitive.

A complex verbal form has the following basic structures: null + auxiliary + impersonal form + null and null + proclitic + personal form + null.
The first structure covers the auxiliary, as commonly understood, followed by a participle (compound tenses, passive voice), the auxiliary of direct action followed by an infinitive, and an auxiliary verb followed by a gerund. The second structure represents an unstressed personal pronoun followed by a simple or a compound personal verbal form. In addition, there are two special cases that must be treated: null + indirect incidence auxiliary + conjunction + infinitive + null (where the only acceptable conjunction is que) and null + indirect incidence auxiliary + preposition + infinitive + null (where the acceptable prepositions are a, de, en and por). Finally, the existence of multiple verbal heads has to be taken into account: verb + connector + verb (the connectors are again conjunctions from the grammatical point of view and the comma from the graphical point of view).

Prepositional phrase

The prepositional phrase comprises a preposition plus a noun phrase; the pairs of contributing functional categories are combinations with the preposition preceded by null or followed by a determiner, a nominal head, an adjacency or another preposition (in the case of a double preposition, the first one has to be a or hasta).

Adjectival phrase

The simple adjectival phrase, which exists only with copulative verbs, acts as an attribute, and it is formed by an adjective in the basic structure: null + adjectival phrase head + null. The adjectival phrase head is an adjective. In multiple adjectival phrases, adjectives may appear linked by connectors: null + adjectival phrase head + connector + adjectival phrase head + null; the connector is a coordinating conjunction or a comma.

Adverbial phrase

The syntactic structures of the adverbial phrase are: null + adverb + null, null + adverb + adverb + null, null + adverb + prepositional phrase + null, null + adverb + nominal phrase + null and null + adverbial phrase + null.
The pairs of contributing functional categories are combinations where the adverb is preceded by null or by a preposition (in the case of an adverbial phrase), or followed by another adverb, a preposition, a determiner, a nominal head or an adjacency. The adverb may appear, in some cases, adjacent to an adjective within a noun phrase; for this reason the combination definite article + adverb must be added.

Linkage among several structures

The basic structures are combined in order to generate structures of a larger size. In many cases it is not necessary to use linking particles, but in others it is. When the structures to be linked are clauses, it is necessary to use a linking element. For this reason the pairs null + linking element and linking element + null are added; the linking element can be a conjunction, a comma or a semicolon.

Local functional disambiguation

Local functional disambiguation starts from the following data:

(1) The set S of allowed functional behaviours, described in the section on basic syntactic structures and functional pairs.
(2) The set P of pairs of symbols of the form a + b, where a and b belong to S, that can exist in the local structures of the Spanish language; they have been presented in the same section.
(3) A set of combinations of functional categories that are not allowed. Because of the rules of the form null + category and category + null (beginnings and endings of local structures), disallowed combinations may pass the pair test; to avoid this, a set of prohibited functional combinations has to be defined (see the table of prohibited combinations).
(4) A set of special cases, defined starting from the words that produce the combination:

- When there is more than one verbal form without a link relation, the option is discarded.
- If there is a determiner but no adjacencies, the ambiguity between the adjective and the noun is resolved in favour of the noun because it is the head of the noun phrase.
- In the case of ambiguity between adjective and participle, the adjective is favoured if there are no auxiliary verbs (haber and ser).
- In order to avoid problems of cacophony, the concordance of the definite article with the nominal head which it precedes is not necessary.
- Before mí, ti or sí, only a preposition may appear.
- After a question or exclamation mark, qué will be an other pronoun.
- After a verb or an adverb, que will be a conjunction; after el, la, las, lo, los it will be a relative pronoun; after a comma it will be a pronoun or a conjunction.
- Before a noun, de will be a preposition.
- The word no acts as a noun only after el or un.
- The words sobre and muy do not have the value of a noun before another noun.

The following steps are executed:

(1) The morphological analysis of the sentence is processed to obtain a set of potential functional combinations.
(2) All strings are examined in groups of three elements in order to accept or reject the central element. Given the sequence of functions a + b + c, b is accepted if and only if any one of the following conditions holds:
- {a + b} and {b + c} belong to P
- {null + b} and {b + c} belong to P
- {a + b} and {b + null} belong to P
- {null + b} and {b + null} belong to P
(3) Of the sequences not rejected, those containing any prohibited functional combination are eliminated.
(4) Finally, the special cases are applied to the remaining combinations.

Structural ambiguities

Starting from a combination of functional behaviours of the words of a sentence, it is possible to obtain more than one analysis tree when applying the Spanish grammar considered here, which is formalized with more than [ ] rules; such multiple results denote structural ambiguity.
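Returning briefly to the local stage: the acceptance test of step (2) and the filter of step (3) of the local functional disambiguation procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The function names are hypothetical, the pair set in the example is a toy subset invented for the demonstration and not the article's full set P, and prohibited combinations are simplified to adjacent pairs although the article's table also lists three-element combinations. The four acceptance conditions are logically equivalent to "(a + b or null + b is in P) and (b + c or b + null is in P)", which is the form used in the code.

```python
from itertools import product

NULL = "null"


def accepts(a, b, c, pairs):
    """Step (2): accept the central element b of the triple a + b + c."""
    left_ok = (a, b) in pairs or (NULL, b) in pairs
    right_ok = (b, c) in pairs or (b, NULL) in pairs
    return left_ok and right_ok


def local_disambiguation(candidates, pairs, prohibited=frozenset()):
    """candidates: one list of possible functions per word.
    Returns the functional combinations that survive steps (2) and (3)."""
    kept = []
    for combo in product(*candidates):
        padded = (NULL, *combo, NULL)
        triples = zip(padded, padded[1:], padded[2:])
        if not all(accepts(a, b, c, pairs) for a, b, c in triples):
            continue
        # Simplification: only adjacent prohibited pairs are checked.
        if any(pair in prohibited for pair in zip(combo, combo[1:])):
            continue
        kept.append(combo)
    return kept


if __name__ == "__main__":
    # Toy pair set, assumed for this example only.
    toy_pairs = {
        (NULL, "article"), ("article", "noun"), ("noun", NULL),
        (NULL, "unstressed personal pronoun"),
        ("unstressed personal pronoun", "personal verbal form"),
        ("personal verbal form", NULL),
    }
    # "la casa": each word has two candidate functions.
    words = [
        ["article", "unstressed personal pronoun"],
        ["noun", "personal verbal form"],
    ]
    for combo in local_disambiguation(words, toy_pairs):
        print(combo)
```

With the toy data, two of the four candidate combinations survive, which is the kind of ambiguity the structural stage is then expected to resolve.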
The existence of more than one rule with the same symbol or combination of symbols on the right side is called direct structural ambiguity; the grammar used here comprises more than [ ] direct structural ambiguities that cover about [ ] rules. The direct structural ambiguities lead to primary conflicts. There are also cases of real ambiguity that can lead to more than one valid interpretation of a sentence.

Table [ ]. Prohibited combinations

Prohibited functional combinations:
preceding possessive adjective + preposition
preceding or after possessive adjective + preceding possessive adjective
definite article + preposition
comma + conjunction + punctuation
conjunction + null
null + demonstrative adjective + personal verbal form
pronoun + infinitive
unstressed personal pronoun + adjective
unstressed personal pronoun + adverb
unstressed personal pronoun + determined article
unstressed personal pronoun + conjunction
unstressed personal pronoun + preposition
unstressed personal pronoun + pronoun
unstressed personal pronoun + noun
punctuation + conjunction + comma
noun + participle
...

Solving primary conflicts

In the following paragraphs several proposals for producing rules to resolve conflicts are considered; the superposition of these rules removes the non-acceptable analysis trees. In some cases, the rules may be applied at the very moment a new symbol is added during the analysis, i.e. when the rules depend only upon the symbols of the lower levels; in other cases it is necessary to wait until the tree is complete.

Ambiguities and necessary words

For some of the complements it is not possible to use all the words of a given functional behaviour: not every pronoun can give rise to a direct object, nor can every preposition of a prepositional phrase give rise to an indirect object.
Rule: necessary words
Let S be a non-terminal symbol generated starting from an intermediate symbol IS and let PN(S) be the set of words necessary for S (denoted NW(S) in the tables). The symbol S will be accepted if and only if a word belonging to PN(S) is found among the words generated by IS.

Prepositional phrases

Various structures can be generated from IS = prepositional phrase; however, the allowed prepositions are not the same for all the structures. The direct object, for example, only takes a, and the indirect object takes a or para. In this way conflicts are eliminated in some cases and diminished in the remaining ones. In the case of concatenated prepositions the same rule is applicable, and it is applied to the second preposition. When a contraction appears, the same considerations as for the preposition apply. There are words that, in general, are not recognized as prepositions but have similar functional behaviours; they are the so-called imperfect prepositions.

Unstressed personal pronouns

Various structures can be generated starting from IS = unstressed personal pronoun; however, the allowed pronouns are not the same for all the structures (see the table of unstressed personal pronouns).

Table [ ]. Unstressed personal pronouns in the resolution of conflicts

S = structure                         | NW(S) = necessary words
direct object                         | la/s, lo/s, me, nos, os, se, te
indirect object                       | la/s, le/s, lo/s, me, nos, os, se, te
attribute                             | lo
morpheme of passive construction      | se
morpheme of impersonal construction   | se
...                                   | ...

Other categories

Other categories used in resolving conflicts are shown in the table of other categories.

Table [ ]. Other categories in the resolution of conflicts

S = structure             | IS = category     | NW(S) = necessary words
adjacency                 | adverb            | como, más, menos, no, todo/a
adjacency                 | relative pronoun  | cuyo/a, cuyos/as, que
subordinate connector     | adverb            | apenas, como, conforme, cuanto, donde, mientras, siempre, tal, tan
subordinate connector     | conjunction       | aunque, con que, cuando, cuantos/as, para, porque, que, si
comparative construction  | adverb            | así, como
adjectival group          | adverb            | como
...                       | ...               | ...

Ambiguities and symbols not allowed

If a substitute noun phrase is generated starting from an unstressed personal pronoun, it should not give rise to a direct object, because such pronouns may have a direct object function only when they are preceded by a preposition.

Rule: non-allowed symbols
Let S be a non-terminal symbol generated from an IS symbol and let NAS(S, IS) be a set of symbols that generate IS and are catalogued as non-allowed generators of S. Then S generated by IS will be rejected if IS has been generated by means of some symbol of the set NAS(S, IS) (see the table of non-allowed symbols).

Ambiguities and related symbols

Some symbols may not appear without the existence of other symbols in the same analysis tree.

Rule: necessary symbols
The symbol S is added to the analysis tree only if the NS(S) symbol exists in it (see the table of necessary symbols).

In order to reduce the number of direct objects erroneously recognized as such, it is advisable to take into account the need for a transitive verb. Since copulative verbs are of the intransitive type, the possibility of confusing an attribute with a direct object is also reduced.

Rule: necessary symbols with condition
The symbol S is added to the analysis tree if and only if there exists an NS symbol that complies with the condition C(S, NS).

Ambiguities and incompatible symbols

Here the distinction rests on the non-existence of a symbol rather than on its existence: in order to accept the intransitive sentence symbol, the direct object symbol must not exist. Every tree that includes incompatible symbols is rejected, with the exception of compound sentences, which consist of several predicates.
Rule: incompatible symbols
The symbol S is added to the analysis tree if and only if the INS(S) symbol does not exist (see the table of incompatible symbols).

Table [ ]. Non-allowed symbols (NAS)

S = symbol       | IS = intermediate symbol | NAS(S, IS) = non-allowed symbol
direct object    | noun phrase              | infinitive
indirect object  | noun phrase              | unstressed personal pronoun
subject          | noun phrase              | tonic personal pronoun
noun phrase      | nominal head             | adjective
...              | ...                      | ...

Table [ ]. Necessary symbols (NS(S))

S = new symbol           | NS(S) = necessary symbol
attribute                | copulative verbal head
copulative verbal head   | attribute
passive verbal head      | passive auxiliary
direct object            | verbal head
attributive sentence     | attribute
supplement sentence      | supplement
intransitive sentence    | verbal head
passive sentence         | passive verbal head
transitive sentence      | direct object
...                      | ...

Concordances

Among the different structures which constitute a sentence there are mandatory requirements regarding the concordance of certain characteristics.

Rule: concordances
If S1 and S2 are symbols of an analysis tree, this tree is accepted if and only if there is concordance in the set of characteristics defined for these symbols, CSD(S1, S2).

Description of cases

Concordances that have been checked during the process of local functional disambiguation should be verified again: two elements that appear in the same phrase must agree, and in the local analysis this might not have been the case because the union of local structures was merely assumed.

Concordance between subject and verbal head: in sentences with verbs in a personal form there must be concordance in number and person with the head of the subject structure.
CSD(subject, verbal head) = {number, person}
CSD(subject, passive verbal head) = {number, person}
CSD(subject, copulative verbal head) = {number, person}

Besides the concordance between subject and predicate, concordance must hold between the next descendants of the predicate, always in gender and number:

- Adjacency with nominal head.
- Direct object with direct object.
- Indirect object with indirect object.
- Objective predicative with direct object.
- Subjective predicative with verbal head.
- Subjective predicative with subject.
- Determiner with nominal head.

Semantic information

The analysis of the semantic content of the words leads to the elimination of ambiguities, starting from the information given by ideological dictionaries: to generate the symbol circumstantial complement of tense, there must be something among the words that form it that gives information on tense or moment.

Rule: necessary semantic
If WS is the set of words that are joined to build the symbol S and IMS(S) is the set of ideological meanings associated with the symbol S, then S is rejected if there is no word in WS whose ideological analysis belongs to IMS(S).

To improve the efficiency of the automated disambiguation process, it should be straightforward to create a listing containing the words with the necessary semantics for all the symbols.

To avoid taking into account words whose semantics must not directly intervene in the symbol, the set WS is formed only with words at the highest level of the representation tree for the symbol; if that level contains only irrelevant words (determiners, connectors and prepositions), the procedure descends to lower levels until relevant words are found.
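The necessary-semantic rule can be illustrated with a small lookup, given here as a sketch under assumptions and not as the authors' implementation. The dictionaries `IMS` and `MEANINGS` stand in for the ideological dictionary and are invented toy data; the function name `accept_symbol` is hypothetical, and treating symbols without an entry in `IMS` as unconstrained is an assumption of this example.

```python
# Toy stand-ins for an ideological dictionary (assumed data).
IMS = {"circumstantial complement of tense": {"time"}}
MEANINGS = {"ayer": {"time"}, "casa": {"building"}, "en": set()}


def accept_symbol(symbol, ws, ims=IMS, meanings=MEANINGS):
    """Reject `symbol` when no word in `ws` has an ideological
    meaning that belongs to IMS(symbol)."""
    required = ims.get(symbol)
    if required is None:
        # Assumption: symbols with no IMS entry carry no semantic constraint.
        return True
    return any(meanings.get(word, set()) & required for word in ws)


if __name__ == "__main__":
    s = "circumstantial complement of tense"
    print(accept_symbol(s, ["ayer"]))        # a word with a time meaning
    print(accept_symbol(s, ["en", "casa"]))  # no word with a time meaning
```

In line with the text above, `ws` would hold only the relevant words from the highest level of the symbol's tree; selecting those words is left outside this sketch.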
Ideological relationships and symbols

There are words whose semantic content prohibits the generation of a given symbol from another: according to ideological information, a homogeneous noun phrase cannot generate a direct object if the head is a person.

Table [ ]. Incompatible symbols

S = new symbol          | INS(S) = incompatible symbols
attribute               | attribute; morpheme of passive; direct object; indirect object; objective predicative; subjective predicative; supplement
agent complement        | supplement
morpheme of impersonal  | morpheme of impersonal; morpheme of passive; morpheme of half voice
morpheme of passive     | morpheme of passive; morpheme of half voice
morpheme of half voice  | morpheme of half voice
verbal head             | verbal head
direct object           | attribute; morpheme of passive
indirect object         | attribute; morpheme of passive
attributive sentence    | passive verbal head
supplement sentence     | passive verbal head; direct object
intransitive sentence   | attribute; direct object; supplement
transitive sentence     | attribute; passive verbal head
objective predicative   | attribute; objective predicative; subjective predicative
subjective predicative  | attribute; morpheme of impersonal; subjective predicative
subject                 | morpheme of impersonal
supplement              | attribute; agent complement; supplement
...                     | ...
Rule: incompatible semantic
If WS is the set of words that are joined to form the symbol S starting from the symbol IS, and SIMR(S, IS) is the set of ideological meanings that produce the rejection of the symbol S generated from IS, then S is rejected if there is some word in WS whose ideological analysis belongs to SIMR(S, IS).

Ideological relationships among symbols

Ideological relationships can also hold between symbols, for example between the subject head and the verbal head: if the verbal head implies an action and is in active form, the subject should be a living being.

Rule: ideological relations among symbols
If S1 and S2 are symbols of an analysis tree, this tree is accepted if and only if the ideological concordance is satisfied in the set of characteristics defined for these symbols, IRS(S1, S2).

Special cases

Special circumstances are applied when the previous methods do not solve the problem.

Clauses

The clause, whether coordinate or subordinate, must have a connector, either a subordinator or a coordinator, inserted before or after the clause. Similarly, any type of sentence that allows any of these clauses should comply with the same conditions. The main clause and the subordinate clause differ in the way they are joined to the remainder of the sentence. The subordinate clause of an infinitive should have an infinitive as the verbal head.

Interrogative sentences and exclamatory sentences

The interrogative and exclamatory sentences are differentiated by the punctuation marks that delimit them.

Double direct object

A double direct object (left dislocation) is easily recognizable by the following characteristics: (1) the two elements are found together, (2) the first one is found at the beginning of the sentence, (3) the second one is a pronominal clitic and (4) there must be concordance in gender and number between the two corresponding heads.
Rule: double direct object
If S is a root symbol that covers the whole sentence and two direct object symbols appear, then S is accepted if and only if the direct objects are adjacent, are followed by a verbal head and the second direct object is realized by an unstressed personal pronoun. It must be taken into account that in compound sentences there may be two direct objects for each personal verbal form.

Elimination of options according to the position of the determiners

Only the following symbols appear before the nominal head: demonstrative adjective, definite article and other pronoun. The possessive adjectives can be divided into those that precede the nominal head (mi, mis, tu, tus, su and sus), those that come after the nominal head (mío, mía, míos, mías, tuyo, tuya, tuyos, tuyas, suyo, suya, suyos and suyas) and those that can appear both before and after the nominal head (nuestro, nuestra, nuestros, nuestras, vuestro, vuestra, vuestros and vuestras).

Rule: post-head determiners
If S is a symbol that belongs to the group of noun phrases and is generated starting from a sequence of symbols where the sequence nominal head + determiner appears, S will be accepted if and only if the determiner symbol is found in the PSS (post-head symbol set) group of terminal symbols that can follow the nominal head.

Rule: pre-head determiners
If S is the symbol for the adjective phrase that is generated starting from an adjective symbol, S will not be accepted if it is found after a determiner symbol from the BSS (before symbol set) group of terminal symbols that cannot follow the nominal head.
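The positional split of the possessive adjectives lends itself to a direct lookup. The following sketch uses the three word lists given above; the set names and the function `possessive_position_ok` are hypothetical, and the function is only meant to be called with possessive adjectives (any other word yields `False` in both positions).

```python
# Word lists copied from the classification of possessive adjectives above.
PRE_HEAD = {"mi", "mis", "tu", "tus", "su", "sus"}
POST_HEAD = {
    "mío", "mía", "míos", "mías",
    "tuyo", "tuya", "tuyos", "tuyas",
    "suyo", "suya", "suyos", "suyas",
}
BOTH = {
    "nuestro", "nuestra", "nuestros", "nuestras",
    "vuestro", "vuestra", "vuestros", "vuestras",
}


def possessive_position_ok(word, after_head):
    """Return True if the possessive `word` may occupy the given
    position relative to the nominal head."""
    if word in BOTH:
        return True
    return word in (POST_HEAD if after_head else PRE_HEAD)


if __name__ == "__main__":
    print(possessive_position_ok("mi", after_head=False))      # mi casa
    print(possessive_position_ok("mi", after_head=True))       # casa mi
    print(possessive_position_ok("mía", after_head=True))      # casa mía
    print(possessive_position_ok("nuestra", after_head=True))  # casa nuestra
```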
Connectors

There are a number of combinations of words that give rise to conjunctive locutions: a consecuencia, a distinción de, a fin de, a fin de que, a lo que parece, a medida que, a menos que, a pesar, a pesar de, ahora bien, ahora que, al menos, al objeto de, al objeto de que, al parecer, al paso que, antes bien, así como, así es que, así pues, así y todo, aún cuando, etc.

Other cases

There are situations in which ambiguities can be resolved starting from considerations regarding the words, grammatical categories and intervening objects.

Resolutions of other conflicts

There are rules that, without being directly applied to a given primary conflict, serve to eliminate ambiguities.

Symbols that cannot cover the whole sentence

The structure of a sentence should include a subject and a predicate, or possibly only a predicate. The object of analysis cannot comprise the subject symbol by itself. The main clause and the subordinate clause are symbols that have been defined to generate the analysis of compound sentences; it is for this reason that a main clause symbol has the same structure as a sentence symbol, so that the sentence symbols need not cover the whole sentence.

Rule: total symbols
The non-terminal symbol S that covers the whole sequence to be analysed is accepted if and only if it is found in the set TSS (total symbols set) of symbols that are allowed to cover whole sentences.

Verbal periphrasis

A verbal periphrasis formed by more than two elements generates a complex verbal form only in specific cases: acabar de + infinitive, deber de + infinitive, dejar de + infinitive, echarse a + infinitive, empezar a + infinitive, estar para + infinitive, explotar a + infinitive, haber de + infinitive, haber que + infinitive, ir a + infinitive, llegar a + infinitive, ponerse a + infinitive, romper a + infinitive, tener que + infinitive, venir a + infinitive, volver a + infinitive, etc.
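One way to apply the periphrasis list is as a set of (auxiliary, linking word) pairs, sketched below under assumptions: the morphological analyser is assumed to supply the lemma of the auxiliary, the names `PERIPHRASES` and `is_complex_verbal_form` are hypothetical, and the set holds only the cases enumerated above, which the article itself ends with "etc.".

```python
# (auxiliary lemma, linking word) pairs copied from the list above.
# The article's list is open-ended ("etc."), so this set is not exhaustive.
PERIPHRASES = {
    ("acabar", "de"), ("deber", "de"), ("dejar", "de"), ("echarse", "a"),
    ("empezar", "a"), ("estar", "para"), ("explotar", "a"), ("haber", "de"),
    ("haber", "que"), ("ir", "a"), ("llegar", "a"), ("ponerse", "a"),
    ("romper", "a"), ("tener", "que"), ("venir", "a"), ("volver", "a"),
}


def is_complex_verbal_form(aux_lemma, linker, third_is_infinitive):
    """True when auxiliary + linker + infinitive matches a listed periphrasis."""
    return bool(third_is_infinitive) and (aux_lemma, linker) in PERIPHRASES


if __name__ == "__main__":
    print(is_complex_verbal_form("tener", "que", True))  # tener que + infinitive
    print(is_complex_verbal_form("tener", "de", True))   # not in the list
```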
considerations regarding predicate symbol generation
the rules that define the structures of the predicate are given through the combinations of elements that can appear in it. in a formal definition of structural type it would be necessary to indicate all the possible combinations; because the placement of the majority of the elements is free, the number of possible structures for the predicate would be enormous. it has therefore been decided to permit all combinations and to prohibit those that are not possible; table shows pairs of incompatible symbols in the same predicate. the generation of a predicate symbol should be rejected either when some of its ends are not a beginning or an ending of the generated symbol or when there is an adjacent punctuation mark. it would not be rejected in the case of subordinate sentences, where the existence of subordinate elements would be verified, as would the existence of multiple verbal forms.

other cases
there are specific situations in which ambiguities can be resolved starting from considerations of words, grammatical categories and intervening objects.

experimental results
we analysed selected sentences covering the broadest spectrum of casuistry inherent to spanish grammar. the reliability measure for the disambiguation is given by:

g = (p − ) / (n − )

where p is the total number of functional combinations minus the number of functional combinations accepted and n is the total number of functional combinations provided by the morphological analyser. as can be seen in fig. and fig. , the yield of the functional disambiguation (local and structural) increases with the number of symbols of a sentence. the functional disambiguation based on local syntactic structures has an average yield of % and increases to a high of % after applying the structural conditions.
conclusions
this study does not stop at subsets of the grammar but tackles a whole system of rules for spanish grammar, despite the notable number of combinations needed for the analysis. it contributes towards a solution to the problem of the emergence of functional ambiguities. first, a process of disambiguation based on local syntactic structures is applied; it reaches an average yield of %. subsequently, a disambiguation based on syntactic representation trees is applied. it improves the yield or performance up to a high end of %.

[fig. functional disambiguation goodness. x-axis: number of sentence symbols; y-axis: average goodness; series: after local functional disambiguation, after structural functional disambiguation.]

[fig. number of accepted functional combinations. x-axis: number of sentence symbols; y-axis: accepted functional combinations; series: after local functional disambiguation, after structural functional disambiguation.]

the importance of this work lies in its significant contribution to the development of future applications:
( ) it accelerates the process of syntactic analysis by trimming incorrect structures.
( ) it improves the precision of results in the advanced searching of words.
( ) it allows discarding options not valid in the information extraction process.
( ) it detects grammatical errors in written constructs, etc.

references
bosque, i., demonte, v., and lázaro carreter, f. ( ). gramática descriptiva de la lengua española. madrid: espasa.
gili gaya, s. ( ). curso superior de sintaxis española. barcelona: biblograf s.a.
gómez torrego, l. ( ). análisis sintáctico. teoría y práctica. madrid: s.m.
quesada, j. f. ( ). un modelo robusto y eficiente para el análisis sintáctico de lenguajes naturales mediante árboles múltiples virtuales. centro informático científico de andalucía (cica).
real academia española ( ). esbozo de una nueva gramática de la lengua española. madrid: espasa-calpe.
santana, o., pérez, j., carreras, f., duque, j., hernández, z., and rodríguez, g. ( ). flanom: flexionador y lematizador automático de formas nominales. lingüística española actual xxi, : – .
santana, o., pérez, j., hernández, z., carreras, f., and rodríguez, g. ( ). flaver: flexionador y lematizador automático de formas verbales. lingüística española actual xix, : – .
santana, o., pérez, j., losada, l., and carreras, f. ( ). hacia la desambiguación funcional automática en español. procesamiento del lenguaje natural, (sepln): – .

functional disambiguation based on syntactic structures. literary and linguistic computing, vol. , no. ,

tales of a tool encounter
exploring video annotation for doing media history

susan aasman, university of groningen, research centre for media and journalism studies, oude kijk in ‘t jatstraat, ek groningen, the netherlands, s.i.aasman@rug.nl
tom slootweg, utrecht university, department of media and culture studies, muntstraat a, ev utrecht, the netherlands, t.slootweg@uu.nl
liliana melgar estrada, utrecht university, department of media and culture studies, muntstraat a, ev utrecht, the netherlands, lmelgar@beeldengeluid.nl
rob wegter, university of groningen, research centre for media and journalism studies, oude kijk in ‘t jatstraat, ek groningen, the netherlands, r.wegter@rug.nl

abstract: this article explores the affordances and functionalities of the dutch clariah research infrastructure – and the integrated video annotation tool – for doing media historical research with digitised audiovisual sources from television archives. the growing importance of digital research infrastructures, archives and tools has enticed media historians to rethink their research practices more and more in terms of methodological transparency, tool criticism and reflection. moreover, questions related to the heuristics and hermeneutics of our scholarly work also need to be reconsidered.
the article hence sketches the role of digital research infrastructures for the humanities (in the netherlands), and the use of video annotation in media studies and other research domains. by doing so, the authors reflect on their own specific engagements with the clariah infrastructure and its tools, both as media historians and co-developers. this dual position greatly determines the possibilities and constraints for the various modes of digital scholarship relevant to media history. to exemplify this, two short case studies – based on a pilot project ‘me and myself. tracing first person in documentary history in av-collections’ (m&m) – show how the authors deployed video annotation to segment interpretative units of interest, rather than opting for units of analysis common in statistical analysis. the deliberate choice to abandon formal modes of moving image annotation and analysis ensued from a delicate interplay between the desired interpretative research goals, and the integration of tool criticism and reflection in the research design. the authors found that, due to the formal and stylistic complexity of documentaries, alternative, hermeneutic research strategies also ought to be supported by digital infrastructures and their tools.

keywords: digital humanities, research infrastructures, digital tool criticism, video annotation, documentary history

the growing influence of digital research has not gone unnoticed among media historians. many audiovisual archives have opened their digital collections via large scale infrastructures, including new digital tools that enable scholars to explore, compare or analyse these collections with new research questions – or revisit old ones in new ways. as expected, this digital transformation has raised a debate as to what it might mean for the research field of media history.
eef masson recently argued that the rise of the digital humanities is characterised by a collision of epistemic traditions: hermeneutics and positivism. huub wijfjes has also reflected on the implications of the ‘digital turn’ for media historical research. like masson, he has argued that this ‘turn’ has reignited a longstanding debate on whether “history should hermeneutically focus on understanding and contextualising unique events or on analysing structure and patterns based on quantifiable units and data.” however, the discussion revolves not only around the question whether we should take quantitative or qualitative approaches, but also around the possible transformation of established research practices as a result of changing research environments. what happens if we base our research on digital tools developed within a specific digital infrastructure: are we still able to “achieve our analytical goals”? indeed, do digital tools invite new methodological approaches, for instance related to automated retrieval of metadata out of a larger dataset for further statistical analysis? or can the tools also be deployed traditionally, in more hermeneutic media historical research, by supporting scholars in reconstructing particular historical trends in a digitised corpus of audiovisual archival materials? we arrive at these preliminary questions because of our involvement as research pilot scholars in the common lab research infrastructures for the arts and humanities (clariah core, - ) project, funded by the netherlands organisation for scientific research (nwo). our task was to conduct a research project, entitled me & myself: tracing first person in documentary history in av-collections (m&m), to explore the possibilities and constraints of doing media historical research through our use of the media suite (ms), a research environment of the clariah infrastructure.
the media suite is currently in development and ties into the clariah focus area of media studies. it provides an integrated research environment that makes it possible to search, annotate, analyse and enrich large digitised audiovisual and contextual collections from archives and other cultural heritage institutions across the netherlands. our pilot project aims to explore the added value of video annotation by testing the integrated manual video annotation tool in connection to its use in a research project that addresses a more or less traditional media historical research question. in addition, the m&m research project aims to trace when and where the emergence and rise of an autobiographical and confessional mode of documentary filmmaking can be located on dutch public service television in the last five decades. the main collection used by the project is the digitised audiovisual collection of the netherlands institute for sound and vision (nisv), which, due to the clariah project, has been made available for online access and viewing to researchers in the netherlands for the first time.

eef masson, ‘humanistic data research: an encounter between epistemic traditions,’ in mirko tobias schäfer and karin van es, eds, datafied society: studying culture through data, amsterdam university press, , - , p. .
huub wijfjes, ‘digital humanities and media history: a challenge for historical newspaper research,’ tijdschrift voor mediageschiedenis, , , , - .
wijfjes, ‘digital humanities and media history,’ .
gerben zaagsma, ‘on digital history,’ bmgn – low countries historical review, , , , - , p. .
in this article we refer to version . : released in january . see: https://mediasuite.clariah.nl/documentation/release-notes/v -
video annotation tools have gained a growing influence in the field of film and media studies since the second half of the s. this is predominantly due to the increasing availability of major video collections, whether they are accessible through media archives or are obtainable on social media platforms. the size of those collections, and the fact that they are available as digital data, invite new research opportunities and challenges for which annotation offers a solution. the act of annotation, as a common and longstanding analytical aid to close reading practices of media scholars, is often employed as a means to an end. but annotation can entail so much more, especially when done with digital tools. some video annotation tools produce annotations that are re-usable for the sake of retrieval, enrichment and contextualisation, by creating metadata which allows connecting one’s sources or data to other archival material or scholarly annotations; other tools are particularly useful in preparing the research material for quantitative analysis by using a stricter formalist or rigorous qualitative coding approach. in short, there are several ways of doing digital video annotation and each type of annotation aligns with a different research tradition. the tool at hand – the manual annotation tool in the media suite – hence does not meet every specific research need. this challenge touches on the central issue of this article: we want to explore and explain how the ms video annotation tool influences our research practice while still ‘in development’. however, we are also interested in how we as researchers exercise agency as co-developers of the tool. hence, this article addresses how doing media history is methodologically influenced by a digital tool that aims to assist us in the analysis process. we reflect on the affordances of the tool and on how they impact our scholarly work.
by asking these and other questions about the effects of digital tools on our scholarly work, we follow the suggestions made by marijn koolen, jasmijn van gorp and jacco van ossenbruggen to integrate ‘tool criticism’ and ‘reflection’ as essential elements in the research practice of a digital scholar. these include reflecting on our experiences with the infrastructure at hand, and on the accessibility of data, since our use of data is always mediated via a particular set of tools.

besides the media suite, there are two other ‘work packages’ related to clariah: one supporting linguistics, and the other for social economic history. for more information, see https://clariah.nl
the process of building the media suite is described in the following conference papers: carlos martínez ortiz et al., ‘from tools to “recipes”: building a media suite within the dutch digital humanities infrastructure clariah,’ delivered at the digital humanities benelux , utrecht, https://dspace.library.uu.nl/handle/ / . see also: roeland ordelman et al., ‘challenges in enabling mixed media scholarly research with multi-media data in a sustainable infrastructure,’ dh mexico city, https://hdl.handle.net/ . / db f b - ab - d-bc - afaa
the composition and scope of the collection can be found here: http://mediasuitedata.clariah.nl/dataset/nisv-catalogue
see, for instance, the inventory of tools for audio-visual annotation included in: liliana melgar estrada, eva hielscher, marijn koolen, christian olesen and julia noordegraaf, ‘film analysis as annotation: exploring current tools,’ the moving image: the journal of the association of moving image archivists, , , , - .
marijn koolen, jasmijn van gorp and jacco van ossenbruggen, ‘towards a model for digital tool criticism: reflection as integrative practice,’ digital scholarship in the humanities, fqy , october , , - .
scholarly challenges in building a digital research infrastructure

over the past few years, several large-scale digital infrastructural projects have emerged in europe. well-known examples include dariah – a pan-european infrastructure for arts and humanities scholars working with computational methods – and clarin – the european research infrastructure for language resources and technology. in the netherlands, the clariah project aims to build upon these infrastructures, designed to further explore the wishes of researchers in three focus areas within the humanities: linguistics, socio-economic history and media studies. together these focus areas often use several important types of data in the humanities: text, images, audiovisual material and structured data. different challenges and opportunities emerge for scholars as users, or as part of the creation of these infrastructures. we identify, for instance: the changing role of the scholar as co-developer, the tension between generalisation and specificity in developing information and research services for scholars, the limitations in the breadth of digitised collections, and – most importantly for this article – the need for scholarly reflection and tool criticism. as stated, the arrival of new research infrastructures has affected the role played by the ‘digital scholar.’ not only do researchers assess the usefulness of infrastructures and their tools, they are often also involved as co-developers. the result is that scholars can, to a certain extent, directly influence how the infrastructure and tools are built according to their specific needs.
the latter has been the case regarding the development of the clariah infrastructure, and the media suite, because one of the services of this infrastructure is suitable for media historical research. this new dynamic between the researcher and the infrastructure ‘in development’ seems beneficial, but can also present challenges: one of them is how to align particular research activities with more general institutional implementation roadmaps. the latter often have a significant impact on research practices. even if pilot scholars were asked to list their requirements, and thus were able to make their needs explicit, they also had to adapt to the inherently slow pace of developing new systems. indeed, system development in the context of the research infrastructure is complex due to the accommodation of diverse requirements, not least because alternative solutions often need to be investigated before the services are actually implemented. all of these specific circumstances are relatively novel in the development of dutch digital humanities infrastructures. as wolfgang kaltenbrunner noted, the emergence of digital research infrastructures within the humanities took place relatively late compared to other research domains. concerning the netherlands, he observed that the first contours of a more comprehensive digital research infrastructure in the humanities emerged to reduce “the organizational fragmentation of the humanities.” this fragmentation relates to a larger international debate on whether the digital humanities should strive to become a ‘big tent,’ by unifying the variety of research disciplines and traditions in the humanities into one standardised ontological and epistemic domain and research practice.
this approach enticed joris van zundert to warn about the dangers of what he termed the ‘generalization paradox.’ according to him, the contradiction usually pertains to a desire within infrastructures to cater to various research communities, which pushes designers toward generalisation, while individual researchers hope to find specific methods, tools and data models according to their own requirements.

for an elaborate discussion on the digital scholar, see: martin weller, the digital scholar: how technology is transforming scholarly practice, bloomsbury, .
see for example: franciska de jong, roeland ordelman and stef scagliola, ‘audio-visual collections and the user needs of scholars in the humanities: a case for co-development,’ proceedings of the nd conference on supporting digital humanities (sdh ), centre for language technology, copenhagen, .
wolfgang kaltenbrunner, ‘reflexive inertia: reinventing scholarship through digital practices,’ doctoral thesis, university of leiden, , p. .
stephen robertson, ‘the difference between digital humanities and digital history,’ in matthew k. gold, lauren f. klein, eds, debates in the digital humanities , university of minnesota press, , - , p. . see also: patrik svensson, big digital humanities: imagining a meeting place for the humanities and the digital, university of michigan press, .
idem, p. .
in van zundert’s view, if this path is followed, many burgeoning large-scale infrastructures in the humanities will result in “highways that connect nothing to nowhere.” indeed, one of the effects he fears is a lack of interest by those infrastructural projects to pursue an agenda in which “the existing heuristics and hermeneutics are appropriately translated into their equivalent digital counterparts.” another perspective is to think about infrastructures as a ‘house with many rooms,’ enabling a wide variety of disciplines and traditions to explore novel digital tools and research practices from the context of their respective academic backgrounds. so far, clariah represents the second approach more than the first, which seems to be a positive development. the infrastructure offers a diversity of tools and data, designed for specific research areas, while it also allows researchers to be active contributors to its further development. nevertheless, institutional policies and technological investments greatly determine decision-making, including priorities set for further developing functionalities or eventual copyright limitations. the advantages of clariah as an infrastructure – and the media suite as a workspace – are that they provide authenticated access, via login with university credentials, to audiovisual data collections and related mixed media, for example contextual sources provided by dutch cultural heritage and knowledge institutions. for our m&m project, we were mainly interested in dutch autobiographical documentaries which are part of the digitised historical television collection of the nisv. for this project, we were able to make a selection of relevant items based on an extensive exploratory search process.
the growing accessibility of various data collections is greatly welcomed, and many advances are being made right now in the media suite to ensure that more and more scholars can make use of audiovisual collections thanks to the existence of the research infrastructure and large-scale availability of digitised and digital public service broadcasting heritage. it is nevertheless relevant to keep in mind that there are potential drawbacks of doing media history in the media suite on the basis of available digitised, audiovisual collections. audiovisual heritage was preserved unevenly, and as media historian helle standgaard jensen has noted, “many programs were never kept even in their analogue format (...) and therefore we do not know what was left to digitize in the first place. the problem here is that with no knowledge of the pre-selection, we are left in the dark when it comes to doing critical source analysis in its most basic form.” so, it is important to acknowledge that digitised sources should by no means be taken as representative solely on the basis of their being digitally available. therefore, we stress the inherent hybridity of historical research practice, especially when digitised collections are at play. as gerben zaagsma argued, the discipline of history is faced with the “real challenge (...) to be consciously hybrid and to integrate ‘traditional’ and ‘digital’ approaches in a new practice of doing history.” we are especially interested in whether the clariah media suite is (or is not) facilitating the ‘heuristics’ and ‘hermeneutics’ of traditional media historical research. in other words, we seek to assess whether the media suite – including the integrated tools – might suffer from the ‘generalization paradox’ discussed earlier, and we will return to the issue in the conclusion of this article.

joris van zundert, ‘if you built it, will we come?
large scale digital infrastructures as a dead end for digital humanities,’ historical social research / historische sozialforschung, , , , .
van zundert, ‘if you built it, will we come?,’ .
robertson, ‘the difference between digital humanities and digital history,’ p. .
the television collection consists of percent of the entirety of the broadcast heritage contained within nisv archive. the television collection amounts to , items in total, out of which , items are labelled as documentaries.
for more details on this phase of our project, please consult our contribution to the special dossier on the clariah pilot projects in the forthcoming tmg-journal for media history, .
helle standgaard jensen, ‘doing media history in a digital age: change and continuity in historiographical practices,’ media, culture and society, , , , .
zaagsma, ‘on digital history,’ p. .

digital tool criticism

the project me & myself: tracing first person in documentary history in av-collections explores the video annotation tool as a mode of scholarly description, including a specific approach to analyse items in audiovisual collections whilst reconstructing a historical trend of a sub-genre. in this article we focus on the impact this specific tool has on the analytical research phase when identifying interrelated formal, stylistic and narrative elements within our corpus of documentary tv-programmes. part of our explorations of the potential of the video annotation tool is to explicitly include an active process of tool criticism in the project design. media historians who use digital humanities research methods should be aware that tools – like texts – make an argument. tools can be seen as “theoretical objects and as such can be argued about, analysed and criticised.” in addition, koolen, van gorp and van ossenbruggen strongly recommended reflecting explicitly on the influence of the tool on every phase of the research process.
on the basis of our experiences as pilot scholars, we therefore aim to conduct a form of tool criticism: what does digital video annotation offer, and which of its affordances and functionalities will help us to accomplish our research goals – and which ones are less suitable? how did the tool influence our research methodology, and did it stimulate us to change our research question or strategies? as fred gibbs and trevor owens have argued, it is essential to engage in methodological transparency in our historical writing when using novel means to explore and interpret historical data. this methodological transparency entails a clear understanding of what the tool actually does and what it does not do, and therefore should be discussed, explained and reflected upon together with the results. we argue that such consideration is a distinctly transformative aspect of media historical research in a digital setting. previously, (media) historians did not always include explicit methodological reflections, but digital humanities research requires explicit tool criticism and reflection to fully understand the scope, possibilities and constraints of a particular research question or project. from the start of our project, we became aware of the importance of understanding the functionalities of the tool and how it influenced our research practice. once we entered the media suite, we were immediately confronted with the need for ‘tool-thinking’, because we had to familiarise ourselves with a tool we had not used before. at the same time, our usage also became part of the development of the tool. from the outset, this dual position strongly affected our way of thinking and working, stimulating a reflective scholarly awareness. this particular awareness also helped us to perceive our new encounters with a tool as a necessary experimental moment and, consequently, became an important part of our digital scholarship.
“the research process should include experimentation to find out how digital tools work in terms of modeling and transforming data, and to bring out and refine a scholars’ own assumptions about tools”, as koolen, van gorp and van ossenbruggen have reminded us. in what follows, we will explain more in-depth what happened when we began to experiment with the basic and generic manual annotation tool, how our hands-on experiences challenged us to come to terms with its possibilities and limitations, but also how the tool provided us with alternative strategies helping us to deal with the tool in novel ways.

anastasia dorofeeva, ‘towards digital humanities tool criticism,’ ma-thesis, leiden university, , p. .
idem, p. .
koolen, van gorp, van ossenbruggen, ‘towards a model for digital tool criticism.’
fred gibbs and trevor owens, ‘the hermeneutics of data and historical writing,’ in jack dougherty and kristen nawrotzki, eds, writing history in the digital age, the university of michigan press, , - , p.
koolen, van gorp and van ossenbruggen, ‘towards a model for digital tool criticism,’ p. .

video annotation tools

john unsworth has termed annotation a ‘scholarly primitive’ and regards it as one of the basic research ‘functions’ necessary to any scholar regardless of the respective discipline. annotation includes activities that occur during searching, reading or interpreting, such as selecting or choosing relevant media objects, bookmarking passages or segments, adding notes, writing comments or memos, adding metadata, coding according to personal or common ‘code books’, or creating links or connections between items and segments. these activities can be done manually (as was common practice until recently), but can also be mediated or assisted computationally and, only to a certain extent in the humanities, be fully automated.
since digital libraries became more and more accessible to scholars and other users, information systems aimed to support annotation during reading, basically through highlighting passages or adding notes ‘in the margins.’ in addition, since the widespread use of the vcr in the s and s, and the arrival of dvd in the late s, annotating videos also became a common research practice. moving images thus became more ‘attainable’ for analysis and interpretation by media scholars. today, video annotation tools are abundantly available. in general, these tools offer the option to upload, or stream online, one or more audio or video files. these can then be segmented according to one (or more) criteria, by adding labels (also called ‘tags’ or ‘codes’), and enhanced further by the addition of other types of annotations (e.g. comments). while most of these tools share similar core features, they also differ in how they model data and functionalities, and in the support they provide to generate different data visualisations. across the available tools, these differences have resulted in various affordances that can (or cannot) support scholars in performing their analyses, while at the same time influencing research methods and outcomes. the most well-known audiovisual annotation tools originate from at least four different research and professional traditions. the first relates to the fields of linguistics and communication studies, which often use the elan and anvil software packages. both packages are free of charge and offer a variety of functionalities to annotate on multiple tiers, work from controlled vocabularies, and even work with d motion capture files. second are the so-called qdas (qualitative data analysis software) packages used in ethnography and qualitative analysis, such as atlas.ti and nvivo. these tools enable the user to organise and structure multiple data sources, such as audio, visual or text files of interviews or other documents.
by using these tools, detailed analyses can be made through coding, segmenting and linking of data, aided by visualisation functionalities. the third tool kit pertains to professional applications for video editing, such as final cut pro. the fourth, lastly, consists of web-based tools offered by archival aggregators, such as euscreen or europeana. these aggregator systems, which were originally made for searching across collections, are now progressively integrating video annotation functionalities to support users in common research tasks that occur while working with online collections (for example, with the creation of personal clips, posters, or collections of bookmarks). the first group of tools facilitates audio-visual segmentation, for example by storing timestamps with a certain ‘code,’ or ‘tag,’ attached to them. this kind of annotation has particularly appealed to media scholars, who also went on to contribute to the further development of video annotation with tailor-made tools for film analysis, such as les lignes du temps.

this insight originates from an influential unpublished paper presented at the symposium ‘humanities computing: formal methods, experimental practice’ held at king’s college, london may , . john unsworth, ‘scholarly primitives: what methods do humanities researchers have in common, and how might our tools reflect this?’ retrieved february , , http://www.people.virginia.edu/~jmu m/kings. - /primitives.html
see for example: catherine c. marshall, ‘annotation: from paper books to the digital library,’ in dl ‘ proceedings of the second acm international conference on digital libraries, acm new york, , - .
for raymond bellour’s notion of film as an ‘unattainable’ text for analysis (before the arrival of video) see: raymond bellour, ‘the unattainable text,’ in raymond bellour and constance penley, ed, the analysis of film, indiana university press, , - . for a further discussion on how digital tools make film more ‘attainable’ for analysis, see: melgar estrada et al., ‘film analysis as annotation.’ see also the discussion in this blog post: christian olesen, ‘introducing mimehist: annotating eye’s jean desmet collection,’ film history in the making (blog), august , , https://filmhistoryinthemaking.com/ / / /introducing-mimehist-annotating-eyes-jean-desmet-collection/
liliana melgar estrada and marijn koolen, ‘audiovisual media annotation using qualitative data analysis software: a comparative analysis,’ the qualitative report, , , , - .
while both programs are free of charge, they are not both open-source software. only elan is.
a literature review about the use of these tools for video analysis is offered in: melgar estrada and koolen, ‘audiovisual media annotation using qualitative data analysis software.’
for instance, by following the principles of grounded-theory analyses. see, for instance: alison pickard and susan childs, ‘grounded theory: method or analysis?’ in alison pickard, ed, research methods in information, second edition, neal-schuman, .
an example of the use of this group of tools in scholarship is described in: lea jacobs and kaitlin fyfe, ‘digital tools for film analysis: small data,’ in charles r. acland and eric hoyt, eds, the arclight guidebook to media history and the digital humanities, reframe books, , - .
https://tla.mpi.nl/tools/tla-tools/elan/ http://www.anvil-software.org/
however, even before these digital tools, annotation was of importance to a branch of film studies devoted to counting and labelling shots, most famously practised by film scholar barry salt. in , he proposed a statistical method of stylistic film analysis as an alternative to the more traditional, interpretative practices prevalent in film studies at the time. salt manually counted shot lengths and labelled them according to a coding scheme based on such variables as camera movement, angle and framing. more recently, yuri tsivian and gunars civjans similarly aspired to a more positivist mode of historical film scholarship. they developed the cinemetrics tool to enable researchers to computationally segment and label films, and to store this information in a web-based database. in the digital formalism project ( - ), headed by film historian adelheid heftberger, the video annotation tool anvil was repurposed for the formal study of russian filmmaker dziga vertov’s films, using a coding protocol based on the one used in cinemetrics. the digital formalism project created histograms, key frame visualisations and an annotated dvd edition of its corpus. similarly, film historian barbara flueckiger’s film colour research project ( - ) reused annotation functionalities of the video annotation tool elan to fit its specific project needs. the project resulted in the ‘timeline of historical film colors,’ an online presentation of annotation-based analyses of the aesthetics and technological history of colour in film. flueckiger’s team also used manual and semi-automatic annotations in order to relate specific metadata to the films, ranging from highly fine-grained classifications to linking annotated segmentations to technical film journals or film-theoretical essays. multiple-tier annotation support is another important innovation in the development of video annotation tools and is available in, for example, elan and anvil.
by using multiple-tier annotation it is possible to assign multiple layers of annotations – such as aesthetic aspects in one layer, and objects or characters in another layer – to audiovisual documents. the digital formalism project exploited this functionality to provide a multi-layered description of vertov’s films on the basis of multiple variables (shot-scale, camera movement, etcetera) and a strict code book or ‘protocol.’ tiers, or annotation layers, have proven to be a powerful and necessary feature in digital annotation to enable more complex and interrelated analyses that go beyond solely identifying and annotating a single defined unit of analysis (such as neatly defined categories as shot length, or camera movement). tiers enable analyses in which the grouping of tags, according to specific ‘facets’, becomes possible. this important feature of video annotation, however, is currently unavailable in the clariah media suite. we therefore had to consider and make alternative methodological decisions for our research project in order to deal with this limitation in our tool-use.

for more information see: christian gosvig olesen, ‘film history in the making: film history, digitised archives and digital research dispositifs,’ doctoral thesis, university of amsterdam, ; rob wegter, ‘exploring digital methods for media history: a tool criticism of video annotation,’ ma-thesis, university of groningen, .
salt first reported on the ground principles of his ‘statistical style analysis’ in: barry salt, ‘statistical style analysis of motion pictures,’ film quarterly, , , , – . salt later put his method of analysis to work in an overview of film style and technology that ranges from to . see: barry salt, film style and technology: history and analysis, hobbs the printers, . his datasets can be downloaded on www.starword.com.
yuri tsivian, ‘cinemetrics, part of the humanities’ cyberinfrastructure,’ in michael ross, manfred grauer, bernd freisleben, eds, digital tools in media studies analysis and research. an overview, transcript verlag, , - .
adelheid heftberger, ‘do computers dream of cinema?,’ in david m. berry, ed, understanding digital humanities, palgrave macmillan, , - ; heftberger, digital humanities and film studies: visualising dziga vertov’s work, springer, .
barbara flueckiger, ‘a digital humanities approach to film colors,’ the moving image: the journal of the association of moving image archivists, , , , - .
stephan hahn, ‘filmprotokoll revised ground truth in digital formalism,’ maske und kothurn: internationale beiträge zur theater-, film-, und medienwissenschaft, , , , - .
http://euscreen.eu/ https://www.europeana.eu/portal/en http://ldt.iri.centrepompidou.fr/ldtplatform/ldt/ http://cinemetrics.lv/ http://zauberklang.ch/filmcolors/

m&m and the clariah media suite annotation tool

the media suite annotation tool offers a combination of functionalities, and was originally developed for other projects at the netherlands institute for sound and vision, such as linkedtv, axes, arttube, and the mind of the universe. in the first project, linkedtv, emphasis was put on the automatic extraction of entities (names of persons, locations, buildings) from spoken, written and visual information in media documents. because automatic retrieval proved to be problematic for various reasons, it was decided to focus instead on manual annotation, using automatically extracted information to aid the annotator. the tool’s evolution hence gradually moved towards manual annotation and was eventually adapted for integration within the clariah media suite.
on the basis of the functionalities, affordances and research applications of video annotation discussed in the previous section, the tool developers decided to focus first of all on supporting the most essential tasks, such as segmenting, tagging, commenting, linking, and adding personal metadata via customizable templates. the tool, however, continues to be updated and refined on the basis of new recommendations by scholars as they engage in their research projects. advice and suggestions were also given by the m&m research project. at the start of our project, we recommended developing the tool with multiple-tier annotation support. analysing autobiographical documentaries requires an analytical approach that acknowledges the inherent complexities of this genre. we believe that the possibility to annotate segments on the basis of their multimodal complexity, by identifying a meaningful interrelationship between sound, speech, editing and framing, would benefit our media historical approach. unfortunately, the clariah infrastructural and tool development process prioritised essential functionalities over multiple-tier annotation support. this meant that we had to abandon our intention to use multiple-tier annotation in our project. because the tool currently only allows for single-tier annotation, we had to explore other ways to achieve our analytical goals. we redesigned our methodology by focusing on alternative strategies that would fit our type of research. in order to be able to annotate interrelated features within our dataset of autobiographical documentaries, we devised a methodological workaround that proved to be suitable for our project: making segmentations on the basis of units of interest rather than the strict, codified units of analysis traditionally deployed in the annotation process. the decision was intricate, and needs some further explanation.
formal units of analysis, as used in the cinemetrics projects, are well-suited for statistical analysis of singular elements in hollywood film style and aesthetics that tend to be commonly agreed upon by scholars, such as shot boundaries, framing, cinematography and so on. this approach can be computationally designed due to the highly formalised characteristics and structure of narrative hollywood cinema. throughout film history, hollywood narrative fiction film has aligned to a mode of storytelling that more or less follows the rules of the continuity system. this means that these films are highly suitable for annotation-based analyses that apply a strict coding protocol with well-defined variables.

the history of the annotation tool used in the clariah media suite has been described by its developer jaap blom during personal communications and presentations, for example, at the big video sprint mini-conference at aalborg university in . more information about the linkedtv, axes, and arttube projects can be found on their websites. the “mind of the universe” project uses a preliminary version of the annotation tool discussed in this paper; the output of the annotations generated with this tool can be seen here: http://www.themindoftheuniverse.org/explore
david bordwell, the way hollywood tells it: story and style in modern movies, mpublishing and university of michigan library, .
https://www.linkedtv.eu/ https://web.archive.org/web/ /http:/www.axes-project.eu/ http://www.arttube.nl/

annotating documentaries, however, is another matter, because “[t]hey evolve, change, consolidate, and scatter in unpredictable ways,” following media theorist bill nichols. documentaries are often a compound of different strategies: fictional, non-fictional, cinematic, and non-cinematic.
this means that while one documentary may use the diary form (adopted from literature) in conveying its narrative, others may draw from journalistic (adopted from newspaper and reportage conventions) or other strategies. in other words, the artistic and formal diversity of documentaries is the most relevant common denominator. as nichols has stated, documentaries often tend to have a particular ‘voice’ that determines “the entirety of each film’s audio-visual presence: the selection of shots, the framing of subjects, the juxtaposition of scenes, the mixing of sounds, the use of titles and inter-titles.” due to this idiosyncratic status, documentaries are not necessarily obvious candidates for a cinemetrics-inspired and more formalist analysis, but rather need an approach that does justice to their complexities. as a consequence, we chose to introduce the unit of interest as a methodological way to determine the focus of our video annotation protocol. we define units of interest as a way of segmenting particular moments in a documentary (or in any other moving image document) on the basis of multiple, interrelated formal and aesthetic elements that cannot be subsumed under one strictly defined variable in a code book. in the case of the m&m project, for example, units of interest could refer to segments – similar to the idea of ‘clips’ – in which we found evidence of, or attributed ‘traits’ to, multiple interrelated formal and stylistic manifestations of the autobiographical and confessional. by introducing the unit of interest as our methodological focus, we acquired some flexibility in the segmentation process because we were not bound to strictly formal divisions of singular stylistic or formal categories in our corpus such as shots, sequences or scenes.
moreover, this approach enabled us to integrate our interpretative efforts with the video annotation tool and thus explore video annotation in terms of its possible hermeneutic affordances. furthermore, we believe that by annotating on the basis of units of interest we were able to create evidence to historically reconstruct the development of the confessional mode in dutch documentary history. in contrast to a more formal analysis, the main advantage of this form of reconstruction was that it offered means to map a genre by the identification of a cluster of distinct features or traits. in short, the form of ‘reconstruction’ gave us a flexible methodology that encourages the analysis of the genre via the identification of the interplay of its most distinct features, without favouring or taking the formal dimension as the main basis.

bill nichols, introduction to documentary, second edition, indiana university press, , p. .
nichols, introduction to documentary, p. - .
idem, p. .

testing the tool

with the help of two test cases, the dutch documentaries namens onze ouders (in the name of our parents; directed by monique wolf and hans fels, ikon, ) and pappa is weg en ik wilde nog wat vragen (dad’s gone and i still wanted to ask him something; directed by marijn frank, vpro, ), we will illustrate how we used the tool. our aim is to reflect on how the functionalities enabled us to deal with the formal complexity of documentaries, and the interpretative dynamics involved. we will also show how the segmenting and tagging functionality in particular enhanced traditional media historical research practices, while also bringing up new challenges. on a methodological level, challenges oscillated between the affordances and functionalities given to us by the tool, and our interpretative strategies to comprehend an audiovisual document in its historical and stylistic context. the first phase of our research process consisted of building a corpus of documentaries which we suspected to be autobiographical or confessional. in this phase the media suite (ms) enabled us to build our corpus through the bookmarking functionality. this made it possible to save our selection of documentaries (or other archival documents) to a user project, which consequently became available in the personal workspace. this functionality created an overview of the documents which were created and attached to our user project. it also allowed for searching our corpus on the basis of keywords, periodisation, and a wide range of metadata-based filters. in general, this search functionality became more informative as research progressed with added annotations to the documents in the corpus. the affordances of the workspace, the user projects and the bookmarking functionality all form part of a more general affordance of the ‘digital archive’ and its web-based, networked topology. because of this specific topology the researcher is able to outsource the activity of data storage, for which the responsibility lies with clariah and nisv, who ensure the longevity of the corpus that is built within their infrastructure. on the one hand, this frees media historians of the task of keeping records of their own corpus, via copies, photos, registers, analogue annotations, or other means. on the other, this situation creates a dependency on external providers for sustaining the interface, infrastructure and data.

video . clariah media suite screencast: search (produced by rob wegter, published on youtube: march , ).

our use of the annotation tool yielded some interesting results regarding the challenges that came up when opting for a unit of interest approach. to illustrate these challenges, we will briefly discuss one particular unit of interest from the documentary namens onze ouders.
we have chosen this scene because it contains a very noticeable idiosyncratic mode of expression and a complex formal interplay of visual and auditory elements. all of these elements, moreover, can be found throughout the documentary. the scene under scrutiny also features some of the more general characteristics attributed to the genre of autobiographical and confessional documentaries. in this particular scene, documentary maker hans fels stands on a field at the outskirts of the auschwitz-birkenau concentration and extermination camp, trying to imagine what his parents had experienced there during the holocaust. it is a notable moment in the documentary because we see the filmmaker on-screen, directly addressing the viewer, while explicitly reflecting on his painful family history. this is a moment of soliloquy, a recurring artistic device in the genre of autobiographical documentary that often takes the shape of a film or video diary. this device can be seen as an ‘embryonic instance’ of ‘techno-analysis,’ referring to a particular moment in documentary history “when the camera as confessional instrument is taken up by the confessant herself,” following documentary theorist michael renov. we tagged this segment with ‘filmmaker’s history,’ ‘direct sound,’ ‘filmmaker on-screen’ and ‘family history’ to represent what happened in the sequence. we could, naturally, have continued our tagging, adding for instance ‘auschwitz,’ ‘field,’ ‘medium close-up,’ ‘hand-held,’ all related to the contents of fels’s narration and performance in the scene.

pappa is weg en ik wilde nog wat vragen is another example from our corpus. it represents a more recent example of the evolution of the confessional and autobiographical genre in documentary history. in the documentary, marijn frank chronicles her relationship with her father, who is terminally ill and eventually passes away during production. against the backdrop of her father’s failing health and subsequent death, frank thematises the sudden realisation that she lacks any intimate knowledge of her father and his family history, in part due to his introverted nature. to get closer to her father, she sets out to reconstruct his biography by interviewing relatives, and more importantly, by interrogating herself on their strained relationship. the documentary is rife with self-reflexive, autobiographical moments, captured by frank during intimate moments on her sofa, in bed, or on the toilet. these segments can be tagged with labels such as ‘video diary,’ ‘self-reflexivity,’ and ‘family history.’ there are also other stylistic features that could be tagged, such as the overt presence of the camcorder, and the noise this non-professional equipment produces, but also the moments during which her cat walks in front of the camera, or when frank sneezes on-camera. these confessional moments heighten the sense of authenticity. however, instead of annotating all of the segmentations in which we observed these stylistic elements, our unit of interest approach allows us to single out the most significant ones, and attribute to them a cluster of interrelated features or ‘traits.’

video . clariah media suite screencast: manual video annotation (produced by rob wegter, published on youtube: march , ). https://www.youtube.com/watch?v=kl-yxk oq

for this pilot project, veerle ros collected dutch documentaries on the basis of keywords compiled in boolean search queries used in the nisv immix catalogue (in order to test the media suite, in the beginning of the clariah media suite project, we constantly compared with the original catalogue). we also used as a reference: bert hogenkamp, de nederlandse documentairefilm – : de ontwikkeling van een filmgenre in het televisietijdperk, uitgeverij boom, .
for a conceptual reflection on the networked archive, see: sonja de leeuw, ‘het archief als netwerk,’ tijdschrift voor mediageschiedenis, , , , - .
https://www.youtube.com/watch?v=ca c e-maq
michael renov, the subject of documentary, university of minnesota press, , p. - .

assessing the tool

we can make several observations in relation to how the video annotation tool helped us to reconstruct the autobiographical in the history of dutch documentaries aired on dutch public service broadcasting. having annotations – saved to a personal workspace in the media suite – meant that we could search and browse the tags we had added to the various segmentations. this enabled us, for instance, to easily retrieve segmentations tagged with ‘filmmaker’s history,’ and then do a comparative analysis of different instances where a filmmaker’s own biography, or personal history, is part of documentaries. moreover, we found that our annotations added more in-depth traits and characteristics to our sources. we thus were able to create our own structure, with collections and sub-collections. furthermore, we could enhance and work with our own qualitative metadata parallel to the existing archival metadata. finally, the tool forced us to reflect explicitly on our interpretations, and the underlying methodological considerations, and compare them to more traditional, analogue practices. as an example of analogue research habits, one can imagine a media historian physically visiting an archive, requesting the aid of a technician to watch a film on an editing table or a tv-programme on a specialised video-playback device, taking notes on paper.
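the tag-based retrieval described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the media suite’s actual data model or api: the `Segment` class, the `find_by_tag` helper and the timestamps are hypothetical, and only the tag labels and film titles are taken from the two examples discussed earlier.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Segment:
    """One single-tier annotation: a time span plus a cluster of tags."""
    document: str    # title of the annotated documentary
    start: float     # segment start in seconds (hypothetical value)
    end: float       # segment end in seconds (hypothetical value)
    tags: Set[str] = field(default_factory=set)


def find_by_tag(segments: List[Segment], tag: str) -> List[Segment]:
    """Return every segment whose tag cluster contains `tag`."""
    return [s for s in segments if tag in s.tags]


# Two units of interest, each carrying several interrelated tags at once.
segments = [
    Segment("namens onze ouders", 120.0, 185.0,
            {"filmmaker's history", "direct sound",
             "filmmaker on-screen", "family history"}),
    Segment("pappa is weg en ik wilde nog wat vragen", 40.0, 75.0,
            {"video diary", "self-reflexivity", "family history"}),
]

# Retrieve all segments that share one label, for comparison across films.
for seg in find_by_tag(segments, "family history"):
    print(seg.document, sorted(seg.tags))
```

because each segment carries its whole cluster of tags, a query on one shared label returns segments from both films side by side, which is the kind of comparison a single-tier tool can still support.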
the digitisation of such analogue research activities into (potentially biased) features such as bookmarking, tagging, and segmenting has not only led to affordances such as keyword search; it has also created the possibility for us to use these functionalities to implement a coding regime that can be used for more consistent qualitative data analysis. the fact that our tool was based on single-tier annotation eventually led to our proposal to deploy units of interest. this allowed us to recognise and emphasise the complexity of our research material. in the two cases we presented, we illustrated how we chose to adopt the notion of units of interest in order to sidestep formal reductionism. this decision allowed us to bring together different layers present within a segment, and to capture the complexity of our research material with a cluster of tags. these tags can be seen as a representation of the interpretative dynamics between us, as researchers, and the documentaries under scrutiny. of course, these choices and this strand of research have their shortcomings. we did not, for example, abide by strict formal codes for our annotation strategies. nor did we follow a consistent coding manual coming from broadly agreed upon formal hierarchies and taxonomies. the disadvantage of this open-ended mode of tagging is that further retrieval (or grouping) of segments will be difficult. this challenge will become even more pressing when scaling up annotation projects, having multiple annotators (for multiple years) working on a corpus – which in turn will result in difficult questions like: will it ever be possible to agree on guidelines as to what categories might be used to tag and segment documentaries? how can the basic affordances of annotation tools accommodate disagreement or complementary views? moreover, is it necessary to agree?
we found that these questions touch upon the core of our methodological challenge, because to examine the complexity and the historicity of a scene, by segmenting it according to a cluster of traits, still requires a more elaborate motivation and validation of our interpretative strategies. overall, however, the video annotation tool enabled us to go beyond the available archival metadata and explore our qualitative, hermeneutic pursuit to understand the stylistic history of a genre. the combination of contextual information from, for example, broadcast programme guides, and automatically generated metadata about the cinematic, audio and textual elements is something that needs to be explored further when these options are ready in the clariah media suite.

conclusion

in , jasmijn van gorp et al. published an article in this journal in which they argued that new digital tools will prove essential in unearthing ‘alternative histories’ but also in arousing new methodological questions for media historians in the digital age. our research has not been able to deliver an alternative history, but it certainly inspired us to explore alternative methodological questions and reflect on the impact of digital tools on our research practice. the fact that we entered a specific large-scale infrastructure, including available tools, meant that some of our research routines had to be redefined, or rather, made the need for methodological transparency and reflection more explicit.

jasmijn van gorp, sonja de leeuw, justin van wees and bouke huurnink, ‘digital media archaeology: digging into the tool avresearcherxl,’ view journal for european television history and culture, , , , - .
as koolen, van gorp and van ossenbruggen pointed out, “the explicitness of digital tools prompts scholars to ask questions.” the circumstances in which we found ourselves by using the clariah infrastructure, including the media suite and its tools, implied a different research dynamic towards, and relationship with, our research material. the m&m project turned out to be a valuable experiment for prompting reflection on the use of a tool and infrastructure that are still in development. this meant that we had to adapt to the situation by redesigning our methodology. in retrospect, we have come to appreciate these obstacles as an opportunity to develop a more reflexive attitude towards the affordances of a tool. it has also confirmed the need to develop an integrated research design, offering an opportunity to be “trained in the critical analysis of the creation, enrichment, editing and retrieval of digital data as much as in the classical internal and external source critique.” one general conclusion from our work is a clear interest in seeing how the evolution of (future) research infrastructures – which combine tools, such as video annotation, and access to data sets (collections of films, oral history records or digitised newspapers) – can improve media historical research. the development of the video annotation tool seemed, at first glance, to move towards the ‘generalisation paradox’ discussed by joris van zundert. our role as co-developers might mitigate this ‘paradox,’ because we recommended that clariah enhance the tool with multiple-tier annotation support. a layered approach to moving image analysis, we believe, provides flexibility to various modes and strategies of annotation (automatic, manual, curatorial), and allows for different analytical perspectives. we are also delighted that steps are now being taken to implement our recommendations for further development of the media suite.
as a research infrastructure, clariah seems to embrace a heterogeneous, ‘house with many rooms’ approach, by enabling different types of scholarly work and developing tools that are open to adjustments. at the same time, our exploration also showed that this is not an easy process. there is an inherent tendency within the digital humanities to replace hermeneutic complexity with reductionism, through an increased focus on automated annotation processes and distant viewing. it is nevertheless important to stress that our struggle has resulted in an important lesson, a ‘tale of a tool encounter.’ we were able to keep our ideas intact and to find workarounds. we could also reflect on our own position in relation to the relevance of specific digital tools and the role of (inter)national research infrastructures. by going through this process, we found room to negotiate our way of doing media history, with its distinct potential benefits and pitfalls. one thing is clear, however: the ongoing digitisation of media archives, as well as our contemporary media culture, requires us to keep testing our skills to make sure that we not only follow but are also followed by the infrastructures and the tools.

koolen, van gorp, van ossenbruggen, ‘towards a model for digital tool criticism,’ p. .
andreas fickers, ‘veins filled with the diluted sap of rationality: a critical response to rens bod,’ bmgn: low countries historical review, , , , .

acknowledgments

the research for this article was made possible by the clariah-core project financed by nwo (www.clariah.nl).

biographies

susan aasman is associate professor in the department for media and journalism studies at the university of groningen. her field of expertise is in media history, with a particular interest in documentary, amateur film, and digital archives.
her current research addresses the possibilities of using computational tools for doing media historical research. she is director of the centre for digital humanities at the university of groningen and programme coordinator of the master programme digital humanities. tom slootweg is a postdoctoral researcher in the department of media and culture studies at utrecht university. he is a media historian and wrote a doctoral thesis on the arrival of electronic video in the netherlands between the s and s. his current research focuses on exploring digital tools for teaching and research in media studies in general, and media/television history in particular. liliana melgar estrada holds a phd in information science and currently works as a postdoctoral researcher at utrecht university and the netherlands institute for sound and vision. her research is on scholarly annotations in the humanities, with a special focus on supporting scholarly work with audiovisual collections. she is a researcher at clariah, the dutch national infrastructure for digital humanities research. rob wegter is a junior researcher in the department for media and journalism at the university of groningen. he holds a ma degree in digital humanities and worked as a research assistant for the m&m project (clariah) and the draft project (create at the university of amsterdam). in these projects, he focused on the exploration and development of digital methods for research on first-person documentary, early cinema and digital archiving. view journal of european television history and culture vol. , , doi: . / - . .jethc publisher: netherlands institute for sound and vision in collaboration with utrecht university, university of luxembourg and royal holloway university of london. copyright: the text of this article has been published under a creative commons attribution-noncommercial-no derivative works . netherlands license. 
this license does not apply to the media referenced in the article, which is subject to the individual rights owner’s terms. http://creativecommons.org/licenses/by-nc-nd/ . /nl/deed.en_gb http://dx.doi.org/ . / - . .jethc

profession © by the modern language association of america

on the evaluation of digital media as scholarship

geoffrey rockwell

the author is professor of philosophy at the university of alberta.

in the modern language association task force on evaluating scholarship for tenure and promotion issued a recommendation that “[d]epartments and institutions should recognize the legitimacy of scholarship produced in new media” (report ). that a task force of one of the largest and most prestigious scholarly associations in the humanities would recommend digital work be taken seriously was a dramatic move, one whose effects i witnessed going before a tenure and promotion committee prepared to argue for recognition of digital work for a colleague. when i arrived the committee members all had a photocopy of the recommendation, and i discovered that i had prepared the wrong case. the problem was no longer convincing others that digital work could be scholarly; the problem was that colleagues are unsure how to evaluate digital work, whether a peer-reviewed article in an online journal or an interactive research web site. colleagues and chairs are willing to entertain the case theoretically, but in practical terms they don’t know how to get started, and that is what this essay is about: getting started, a turning toward the sort of dialogue that could mature into a culture of balanced evaluation.

let me start with some definitions. research, for the purposes of this essay, is the activity that leads to scholarship, which is the outcome that can be shared. many researchers use digital methods or digital resources but still share their scholarship in print. others might conduct research in
archives, never looking at a computer screen, but still publish their scholarship in digital form. this distinction suggests that a simple approach to evaluation would be to focus on digital scholarship and develop the case for the evaluation of digital outcomes rather than practices. but committees, especially when hiring and tenuring junior faculty members, are also concerned with the research future of the candidate: from what little has been done they are drawing inferences about what will be done in the future. after all, hiring and tenuring are about making expensive commitments on behalf of the university to the future. further, merit increases are often based on activity rather than outcome; otherwise we would unfairly penalize our colleagues who write books and thus don’t have much to show in the years between publications. in short, it is not enough to evaluate only digital outcomes. evaluators need to consider research activity for digital scholars much as they do for traditional scholars, and that is hard when you don’t have the experience to assess what digital humanists are doing.

the second and more difficult term to define is digital. for the purposes of this essay, what matters is not whether some scholarship is digital at some point in its making. it doesn’t matter if an article in literary and linguistic computing was written with a word processor or if it was written on bookstore receipts i found in my pocket. what matters is that the work is shared with the community in electronic form and, more important, that it is meant to be experienced in electronic form, usually off a computer screen, though some interactive works are presented as installations without a screen. this is what our colleagues have trouble evaluating: those works that address their audience differently, that often have no beginning or end and are therefore frustrating to read.
colleagues are being asked to read differently, and thus they are being asked to evaluate something they may not even understand how to access. they are being asked to evaluate a type of scholarship they haven’t had any experience creating, and they can’t therefore imagine the research done to create it. how are colleagues to feel comfortable evaluating when they can’t imagine the making, the poesis of digital work? this in turn raises the question of why colleagues should have to evaluate work in digital form at all. why not simply subcontract the work to reviewers familiar with digital research?

why colleagues should evaluate digital work

colleagues who want guidance often start with the guidelines for evaluating work with digital media in the modern languages, by the mla’s committee on information technology (cit). these guidelines were developed by the cit specifically to help evaluators and candidates. for the evaluators the guidelines recommend you do the following:

• delineate and communicate responsibilities
• engage qualified reviewers
• review work in the medium in which it was produced
• seek interdisciplinary advice
• stay informed about accessibility issues

while all five guidelines are important, i focus on the third one, to “review work in the medium in which it was produced.” it is this guideline that creates the most work for evaluators, and it is this one that forces a culture change on us. ultimately the others follow from this basic collegial responsibility, so let us examine it closely.

responsibility

the recommendation calls on evaluators to review the work of their colleagues and not simply to review the reviews. it seems an obvious point, but reviewing a colleague’s work in whatever form it comes in is what we are supposed to do when formally evaluating for hiring, merit raise, or tenure.
even when there are external reviewers (and it is a good idea to get informed external advice) the final decision rests with the committee, and for that reason committee members should have some familiarity with the work. it is a renunciation of responsibility to not review a candidate’s work.

experimentation in form

it is common for candidates to prepare cribs to their work for those unwilling to wrestle with cd-roms and strange web sites. when i came up for tenure, i provided the committee with a narrative on my digital work that included screen shots, descriptions, discussions about the nature of my contributions (to coauthored works), and references to associated work that legitimized the digital work. i tried to show how software tools i had worked on that couldn’t be reviewed themselves were reviewed in proxy through grant proposals and conference presentations. i tried to make a “double or nothing” argument that digital work could be treated as scholarship when it was reported back to the research community, so a peer-reviewed conference paper on a web site that was not peer reviewed legitimized the original digital work and doubled the academic credit (so i would get credit for the conference paper and the digital work).

while such narratives are useful to evaluators, and candidates should be encouraged to prepare them, they should never be a substitute for review of the work in the form it was produced in. the originality of digital work is difficult to assess when all you have is a description. digital work is often about processes, interactivity, and interface, and no description (even with screen shots) can do the work justice. many new media works are experiments in form, and that experimentation is lost in translation. digital work to be evaluated as an original scholarly contribution needs to be assessed in such a way that the originality (or lack thereof) is evident.
prose descriptions of projects can help the expert imagine the contribution, but they are likely to mislead the evaluator new to new media. finally, it should be mentioned that some types of digital work, like tools, are about method, and their value comes from how they can be used on different objects. colleagues really should try tools, even if they are difficult to try, precisely because they instantiate methods and hermeneutical processes.

some types of digital work

what then are the types of digital work that need assessment, and how are they different from print work? a complete typology of scholarly digital work is beyond this essay, in part because new forms of digital expression seem to emerge yearly, which of course is the point: the digital permits extraordinary experimentation with form. it also permits rapid experimentation such that each year there seems to be a new technological fad for which digital humanists demand recognition. perhaps it is this mashing change that characterizes the digital in form. still, it is worth starting with a few types of stable work colleagues in the digital humanities have presented to the community and pointing out some of the things evaluators and candidates can discuss to understand the value of the contribution.

online peer-reviewed publication

the least controversial type of digital work is the peer-reviewed online article in a web journal like digital humanities quarterly. where the processes of peer review are comparable to those of a print journal, it is safe to assume that quality control is equivalent to that of print. further, an online article can be easily read by internal evaluators without having to learn about a different scholarly medium: just print it out and ignore the venue. there are, however, issues of credibility and persistence of online materials, issues that haunt the venue.
as long as authors don’t trust online venues or don’t feel the venues have the requisite prestige, then, in a self-fulfilling fashion, the venues will lack the credibility of print journals. this has been addressed partly by attention to the problem of preservation: faculty members (and journals) are encouraged to deposit their work in institutional repositories (chan). there is also evidence that online publications get cited more and therefore have more impact, which is why one could argue that junior scholars should be publishing online (lawrence). finally, there are some innovative approaches to peer review itself that promise to deal with credibility issues, such as open peer review and editing, where anyone can assess and edit a submission.

things to discuss: why was the online journal chosen? how does it handle peer review? are there any statistics on access to the article that can give a sense of the impact of the research? is the work being archived for long-term preservation?

scholarly electronic editions

one of the most useful contributions of digital humanists has been to create online scholarly electronic editions of resources of interest, from historical documents to literary works. while there are many electronic versions of classic literary texts, often put up in a bout of enthusiasm by students, scholarly electronic editions represent significant and informed research work. the work of the electronic editor, like that of the scholarly print editor, is not trivial. peter robinson and kevin taylor, in “publishing an electronic textual edition: the case of the wife of bath’s prologue on cd-rom,” describe the series of decisions, informed by knowledge of the context and of the original, about what to show and hide, how to enrich the material, and how to represent it electronically.
the opportunities and fluidity of the electronic form mean the editor must master two fields, the intellectual context of the original and current practices in digital representation. there is also now a significant literature around scholarly editing in the electronic age that the digital editor should be aware of and possibly contribute to. ironically, if the editor gets the form right so that the electronic version can be searched and easily read, no one will notice, and such delicate work will be unappreciated in evaluation, but this is true of translation and editorial work, whatever the medium. it could be argued that work on scholarly electronic editions is particularly important at this juncture since we are in a transformative epoch when new scholarly resources are being designed and built for the next generation to interpret.

things to discuss: how did the editor think through representing the edition electronically? what was the editor able to do differently in electronic form? what was lost? were others consulted, or was a review solicited? who is using the edition, and are there any statistics on usage? is there a plan for long-term maintenance and preservation? was the work deposited in a trusted digital repository? did the editor also report on the research and decisions behind the edition?

specifications

one of the least appreciated contributions is the work of developing guidelines, standards, and specifications. to the untrained eye this looks like service work on a large scale. i prefer to think of it as an oulipean art, that of designing constraints that encourage controlled innovation. specifications are, after all, a system of suggestions as to what you should and shouldn’t do. they make possible a potential literature, in this case electronic scholarship. and, in the case of guidelines like those of the text encoding initiative, they present a theory of text in a form that has real consequences.
if they aren’t confused (and there are poor specifications), then they instantiate and communicate a theory about what the potential for an electronic representation is.

the problem with specifications is that they aren’t reviewed the way other work is. in some cases specifications are reviewed by standards bodies as they become standards, but the specifications are usually commissioned by the standards body, and the politics of standards review are different from those of academic peer review. there are, however, ways to tackle standards work. standards are often published, and these publications are scholarship. a measure of the impact of a standard is its adoption, and some standards can be shown to be widely adopted. other specifications may be developed as proof of concept or for a particularly innovative project. in those cases we can look at how the innovations were returned to the community through conference papers, clear and useful documentation, and consultations.

things to discuss: what needs and communities do these specifications address? how were the specifications developed? were there formal processes that included outside review? how were the specifications returned to the community, and is there evidence of impact? is there a plan for long-term review and maintenance of the specifications?

research tools

humanities computing, if you survey its journals, has been as much about the representation of humanities evidence in scholarly electronic editions as about developing tools for preparing, publishing, and studying the new editions. since the first issue of computers and the humanities there have been reviews of tools, articles about tool projects, and laments for the lack of tools that can exploit the new evidence. there is also a history of anxiety about the recognition of tool work (sinclair et al.).
software tools that are developed to be used by others, as useful as they may be inside the field, are hard to explain as research outside. how, then, are we to think about such work?

one way is to think of tool development as work in applied methodology. if specifications are an implementation of a theory of content, tools are an implementation of a potential method of research. tools present a theory of the practice of research in a form that others can try. they say something like, “it is useful to do this in this way so we have facilitated the practice in this way.” one of the constraints and opportunities of the digital is that it forces us to be concrete when we imagine potential representation and method. everything on the computer is formalized, which is not to say that man-computer processes are formalized. to create a tool is to have to choose a particular theory of practice, think about it, explore its consequences, and formalize parts of it for others. no amount of paper prototyping or imagining is a substitute for actually trying to implement something that works, which is why theorizing about tools is not the same as building them as a research practice. this is new; in the humanities we are not used to having to take a concrete stand on methods that can be tested by others. more to the point, in the humanities we are suspicious of methods and tools and therefore reluctant to stabilize methods in tools for fear that practices will then freeze and be imposed. one of the ways in which we distinguish the humanities from the social sciences is that our practices are themselves at stake, fluid and woven with evidence. tool work would seem like the trojan horse of scientism in the humanities.

how to evaluate tool work then? like specifications, tools are not typically peer-reviewed for publication, but they are demonstrated, tested, and used, and evaluators can therefore find all sorts of documentation, from manuals to online comments, about them.
in some cases tools are even reviewed. what would a research review of a tool look like? alan galey, stan ruecker, and the inke team have presented case studies in “how a prototype argues” of how experimental design prototypes could be evaluated as reified arguments. they look at how a design presents an argument, how it handles objections, how it is an original contribution, and how it is part of a research trajectory. unfortunately, tools are rarely reviewed in a formal fashion, and this is in part because of how complex it is to review a tool if you are going to go down to the level of a code review. unlike a monograph, tool development is usually reviewed at the start (it is the grant proposal to build it that is reviewed) on the basis of a description of potential, not the finished implementation. grants are critical because the cost of a tool lies in its development (and maintenance), not so much its distribution, so review has to happen earlier. for these reasons, in addition to looking at how the tool has been shared and tested, evaluators should take seriously a history of successful grants as an indication that the peer community is impressed.

things to discuss: what need does the tool meet, and how has it been shown to meet that need? what community is the tool for, and how has the developer engaged the community? how is the tool an improvement on previous tools? is the interface interesting or the algorithm better? what theories or arguments are borne by the tool? how are those documented or shown? is there a maintenance plan, or is this a prototype? was the tool or its development reviewed as a grant proposal or in some other way?

research blogs and web . activity

cathy davidson has argued that we are entering a second phase that can be loosely connected to social media technologies, often given the web . designation (“humanities . ”).
blogs and now twitter are examples of social media that have been adapted for research work in the academy. such emergent forms are particularly hard to evaluate since they don’t resemble any traditional academic form and they are more about process and relationships than finished content. a good blogger (or team of bloggers), however, does a great service to the community by tracking fast-moving issues, linking to new materials, and commenting on those issues. the better blogs will include short reviews, announcements, interesting interventions, and notes about timely matters like exhibits. blogs, as i have learned, require habits of attention. each post might take half an hour to research and post. posts may appear to be light and quick, but the good bloggers learn and practice their craft. in some ways running a blog is like moderating a discussion list. how often does willard mccarty post a provocative note to humanist to promote discussion? the work of facilitating the conversations we value in the humanities should not be dismissed as service; it can be closer to journal editing.

disciplines interested in human expression should take seriously new types of expression. what is really at issue is whether scholars should participate in experiments or take a critical or judgmental stance and only comment on, review, and theorize about the creative work of others. we have encoded in our departmental divisions views about the values and differences of academic work that separate the creative work of the artist from the critical work of the art historian, or the creative work of the writer of fiction from the theoretical work of the literary scholar who studies her or him. we aren’t entirely sure if the fine, design, and performing arts should be in the academy, as the language of most tenure and promotion documents shows.
imagine trying to get creative digital work evaluated when you aren’t in the art department.

the split between “interpretation” or “theoretical” or “analytical” work on the one hand and, on the other, “archival work” or “editing” falls apart when we consider the theoretical, interpretive choices that go into decisions about what will be digitized and how. (davidson, “data mining”)

in addition to the problem of assessing new media work, there is the perception that at best digital scholarship is essentially community work, editorial work, or a form of translation and therefore theoretically light. it needs to be said over and over that there is nothing a priori untheoretical about digital work; it is rather a form of potential theory. i have argued that specifications, for example, instantiate a particular theory of text, and others have argued that prototypes can reify arguments. every decision of the tei about how to encode some phenomenon that we take for granted, like a date, is based on a theory of what a date is for the purposes of textual representation. every research tool bears a theory about the practice of interpretation and the potential for computer-assisted interpretation. specifications and tools can be done well and be appropriately theorized, or done poorly without a view to the fabric of humanities knowledge. if we don’t recognize and support well-theorized specifications and tools, we will have to live with those that emerge from other groups with needs and questions other than those we care about. do we really want our tools to be built only by google and to thus be geared for handling business documentation? likewise, if we don’t recognize the care and work that goes into maintaining the research commons through editing, blogging, and other social research activities, then our public intellectual space will be managed by others (or simply not be there).
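the point that encoding a date embodies a theory of what a date is can be made concrete with a small sketch. it assumes the tei `date` element with its `when` attribute for a normalized value; the sample fragment, the constant names, and the function `extract_dates` are hypothetical and chosen only for illustration, not taken from any project discussed here.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: one date carries a normalized value in the
# `when` attribute, the other is left as an uninterpreted phrase.
SAMPLE = (
    f'<p xmlns="{TEI_NS}">written on '
    '<date when="1665-03-06">the sixth of march</date>, revised '
    "<date>last winter</date>.</p>"
)


def extract_dates(xml_text):
    """Return (surface text, normalized value or None) for each date element."""
    root = ET.fromstring(xml_text)
    results = []
    for el in root.iter(f"{{{TEI_NS}}}date"):
        surface = "".join(el.itertext()).strip()
        # `when` is absent when the encoder declined to normalize the date.
        results.append((surface, el.get("when")))
    return results


for surface, when in extract_dates(SAMPLE):
    print(surface, "->", when)
```

the encoder's choice is visible in the data: the first phrase has been interpreted as a calendar day, while the second has been marked as a date without committing to a value. a tool that sorts or searches by date can only act on the first, which is one way an encoding decision has real consequences.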
i will go further and say that practices of theory that do not, where appropriate, take into account their implementation are unethical, especially when consequences are openly discussed. the old way of doing theory is premised on an unexamined view that the way ideas are transmitted is primarily through chains of books by great men. this is simply no longer true, if it ever was. the epidemiology of ideas (the way ideas are transmitted, explored, refined, and forgotten) is complex and changing. the internet is changing the ecology of transmission. a widely read blog can have measurably more readers than a published book. if what we value is appropriate intervention into the flow of conversation we call the humanities, then we need to be prepared to measure contributions, no matter what their form, in terms of their effectiveness as interventions. counting peer-reviewed books and articles just doesn’t cut it as a measurement of impact, especially with all the problems of peer review and its particular economy.

it should be noted that one relevant feature of the digital is that access to information can be logged and measured in ways that were unthinkable before. viewing statistics are easy to gather for blogs, web sites, tools, and hypermedia. the statistics we can gather have far more detail than the crude metric of peer-reviewed page counts. while neither page counts nor web statistics really tell you whether information is having an effect, one can infer a lot more about readers from google analytics than one can from sales of a peer-reviewed book.

things to discuss: what are the subject and audience of the blog? what is the contribution to the research community of the work? are there statistics that show the reach and impact of the blog? what are some exemplary posts that show the research focus of the blog? are there plans to archive the blog or to repurpose parts as publications?
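as a rough illustration of how viewing statistics might be gathered, the sketch below counts views per page from a simplified access log. the log format (a date followed by a path), the sample lines, and the function name `views_per_page` are all assumptions made for the example; real server logs and analytics services record far more detail.

```python
from collections import Counter

# Hypothetical, simplified access log: "<date> <path>" per line.
SAMPLE_LOG = """\
2011-01-03 /blog/post-a
2011-01-03 /blog/post-b
2011-01-04 /blog/post-a
"""


def views_per_page(log_text):
    """Count how many times each path appears in the log."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        counts[parts[1]] += 1
    return counts


for path, views in views_per_page(SAMPLE_LOG).most_common():
    print(path, views)
```

counts like these show reach rather than effect, which is the caveat raised above: they say that a page was requested, not that it changed anyone's thinking.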
hypermedia and new media works

there is a whole class of new media works that are born digital in the sense that they are authored on and for the computer as creative or expressive works. these works take advantage of the networked computer as an alternative medium for creative and original expression. many of these works, especially those on the web, take advantage of the nonlinear and hypertextual potential of electronic literature, which is why i am gathering this diverse literature under the rubric hypermedia. many of these works are experiments in literary interactivity and can only be viewed if you have access to the right configuration of equipment. others are playful and gamelike. all in all they are a nightmare to review and publish because they are experimental and because they are often technically idiosyncratic. most are therefore either made available online or self-published since there is no viable publishing and review mechanism. such works are to some extent beyond the scope of this essay since they are not so much humanities research as original arts creations that should be assessed as digital or media art.

having complained above about the artificial division of the creative and interpretive, i want to provide ideas about how these can be evaluated. a common way to get research credit for any creative work is to present and publish papers about the making of it. there are all sorts of venues, from conferences to media exhibits, where the work can be demonstrated and the research issues around the creative work discussed. as for the evaluator, an important thing to pay attention to is the use of interactivity (kiousis). the potential for interactivity is what makes the digital work different from other media. the computer makes it possible to program responses, branching, algorithmic visualizations, and computer-generated sound into the work.
in some ways these works are the easiest to evaluate since they are meant to express something that an evaluator could interpret. in other words, they are meant to be played by you. you can approach them as a work of art and bring the interpretive traditions of the humanities to bear on this new media art. as with other arts you can look for the artist’s statement and ask about the genesis of the work. finally, you can ask critics familiar with such work to talk you through it.

things to discuss: how should such a work be interpreted? what is the history of this work, and how is the work responding to other works? how does the work use the computing medium and opportunities for interactivity? is it playful, fun, and responsive? how is it documented? how is it exhibited or shared with its audience?

developing a dialogue

the evaluation of digital work is a process that needs to be developed, if possible long before career-changing decisions have to be made. both candidates and evaluators benefit if there is a dialogue around expectations and evaluation earlier rather than later; candidates will know what they can do to make their case, and evaluators will have a framework for evaluation.

start with the hiring

as the mla guidelines point out, the dialogue starts with the hiring. the job ad is the first gesture from the institution indicating what it is looking for. job ads should accurately reflect what an institution wants and therefore what it will (should) evaluate. statements like “the successful candidate will be expected to run a digital humanities lab” are a public sign that the candidate will be expected to manage a lab. candidates and evaluators should therefore be prepared to assess how well the candidate managed the lab and to take that into account.
job ads that say things like “candidates should submit a portfolio of new media work” are signaling that the institution wants candidates who have created new media and will presumably value that in the future. in short, for the institution the job ad (and other hiring communications) is the first utterance in a dialogue. departments should not put in the ad things they don’t value and should be prepared to take seriously in evaluation anything that they say they value. likewise, for candidates the job ad is a first clear indication of what will be evaluated. things may change in the dialogue, and you may want them to change, but the job ad is a document that you can use to further the dialogue once hired.

candidates and chairs can also negotiate a memorandum of understanding about what the expectations are and how work will be evaluated (see report ). the more conversation there is at the beginning, when tenure review is in the future, the better.

conversations with the chair and department

it is common for the dialogue to lapse after the thrill of hiring. chairs need to move on to other issues, and new hires need time to orient themselves to the new job. the dialogue tends to continue through annual reviews, which can end up being the only formal occasions for ongoing dialogue. needless to say, both parties benefit if the dialogue is pursued more vigorously through standard opportunities like

• discussing digital opportunities at departmental meetings
• presenting digital work to the department and university
• preparing grant proposals, both internal and external
• developing collaborations with colleagues, and
• teaching with technology

broad engagement

the fundamental difference between digital work that is not research and digital work that is research rests in how it contributes to a larger conversation.
original research is responsive to what others have done (and are doing), and it is reported back in ways that inform others in the field. an assessment of the researcher’s contribution to that conversation from his or her own peers is thus especially valuable. while there are few venues for strict peer review of digital work, there are all sorts of ways that researchers can engage and document their engagement in broader academic conversations. they should identify local, national, and international conferences that will allow them to address a research community of peers. chairs should help researchers find funding to attend appropriate conferences, workshops, and venues. the digital humanities is a field, like computer science, where new knowledge is often shared live at conferences because it needs to be shown to be understood. for this reason anyone whose research includes building new media works should find venues to exhibit these works. if there isn’t funding to travel to conferences, then researchers should be encouraged to find ways to share results online through blogs, twitter, discussion lists, and other public forums. if i were an evaluator, i would expect digital work that can’t be peer reviewed and published to then be exhibited or demonstrated in other ways. more generally, a researcher who is not participating in the conversations of such a fast-changing field is not likely to be doing research-level digital development. chairs would do well to advise new researchers to start sharing their work as soon as possible rather than clinging to it because it could be better.

administrative conversations

often digital humanists are expected to provide administrative leadership in the department around things digital, and this creates special and dangerous circumstances where there is greater need for dialogue.
this is especially true when institutions are hiring their first digital humanist and expect that one person to play a “transformative” role. the leadership can take the form of being expected to service the department’s computers, manage the departmental web site, manage a lab, manage staff, run the online presence for a departmental project, apply for grants to get infrastructure for the department, introduce instructional technology, or get colleagues to use technology. such expectations can be manifestly unfair when these junior colleagues, instead of being shielded from administration like their peers, have extra service dumped on them while still being expected to live up to traditional research expectations. for this reason it is especially important that there be a dialogue about administrative expectations. junior faculty members should refuse extraordinary service without reassurances in writing that it will be recognized even if the chair (and the department’s digital agenda) changes. given the importance of good administration in the light of the expense of computing, it is no surprise that departments and individuals are increasingly looking at alternative academic positions where the mix of responsibilities is differently structured to recognize leadership. there are, however, reasons for defining such positions as faculty positions, which is why it is useful to imagine structured conversations that can ensure that there are scholarly outcomes associated with the administration. if a conversation around expectations and opportunities takes place early on, leadership projects can be designed to include a research dimension. instructional technology projects can be designed to have a research dimension, for example.
when providing computing support to an important research project like an institute or an online publication that bestows research credit on others, it is particularly important to negotiate meaningful research components for the digital hire. negotiations should include discussion about shared credit, opportunities for creative exploration, access to resources for managing the project, recognizing project management as a form of research, and coauthorship of papers about the project. i have always found that there are opportunities to weave my own research interests into the project if i ask early on. i also try to make sure that all i have to do for the project is the management and that there is sufficient funding to hire others for the text encoding, programming, design, and testing.
conversations at times of critical evaluation
if there has been a dialogue from the start, then moments of critical evaluation, when the evaluators need to make decisions that go on the record, go much smoother. first, there will be documentation about job expectations so that the evaluators can ask, did the hire do what we discussed? likewise, candidates can use the accumulated documentation to help them structure their case so that it is recognized by the evaluators. second, the departmental evaluators should have been exposed to the candidate’s work over the years, and they should have had chances to ask about the work as it was developed, thus giving them a way of evaluating the making. an evaluator who is skeptical of the value of digital work should have had ample opportunities to ask the candidate respectfully about the value of works demonstrated, and the candidate should have some idea as to what documentation would satisfy the skeptic. third, any large and mission-critical projects for the department can be taken into account in the evaluation if there has been explicit discussion of expectations early on.
the goal is to develop a consensus as to what documented digital work will count so that both sides can anticipate the final review and agree on outcomes. some evaluators may feel that too much discussion lessens their ability to make critical judgments because they will get caught in a web of obligations with the candidate. that shouldn’t be the case if there have been frank discussions of expectations.
conclusion: the resistance of digital media
every field, especially a new one, finds ways to contribute to the larger scholarly effort of its time in unique ways. humanities computing is no exception. i would argue it is the difference in the contributions of computing humanists that, on the one hand, makes the contributions so valuable taken one by one and that, on the other, makes them so hard to classify as scholarship comparable to what other colleagues do. digital research works resist classification and comparison in so many ways, and that is often their value. this is a period of experimentation with scholarly form, and some of the most useful work will not look like anything else that we recognize as scholarly. and that is the way we want it. further, failure is to be expected and valued. no one complains that tim berners-lee’s world wide web was a failed hypertext technology because the software he and colleagues created was left behind. it provided the ideas and the specifications, not the particular instantiation. in sum, few digital research contributions can be assessed the way print contributions can, but we can develop a culture of assessment that includes conversations that are in the tradition of the humanities.
notes
. this paper is also a story woven from a wiki that i started when i was on the mla committee on information technology.
since my first failures to explain digital work to colleagues i have had a chance to practice talking with chairs about how to assess digital work, and this essay lays out some of what i believe we need to tell our colleagues as we work together toward inclusive tenure and promotion processes. above all, i am more than ever convinced that digital work is not “old wine in new bottles” and that we do ourselves a disservice if we try to argue that we are doing the same sort of work, just in digital form. for that reason i started the wiki to experiment with different ways (from fictional cases to a short guide) of introducing colleagues to the difference. the original version of the wiki as it was when i started it is at www.philosophi.ca/pmwiki.php/main/mladigitalwork. although i was responsible for most of it, ronnie apter wrote the links and bibliography section. the wiki was later reproduced on the mla site (http://wiki.mla.org) so that it could be openly edited by any member. it is now a community document that can evolve as we all need it to. . for a project that documents what digital humanists do, see the day in the life of the digital humanities project (day). . for that matter, you can get help. if you are trying to assess a learning object in its original form but don’t know how to install it and run it, then ask for help from the university instructional technology unit. if you want to assess a work of new media art, then ask your colleagues in art or design to help you. getting advice and help from colleagues across campus allows evaluators to review the work in its original form while getting advice from people familiar with the form. . a community-edited version of this discussion of types of digital work is available at the mla wiki, http://wiki.mla.org. . see siemens et al.: this report from looks at the issues from different perspectives and includes a survey of canadian humanities and social science scholars.
in the section “report on responses to the questionnaire,” which i coauthored, we noted that “ % (of respondents) felt that non-electronic outlets were more credible, though % felt that peer review ensures similar quality.” . nature had an online “web debate” on peer review that nicely surveys the alternatives. see nature peer review. . for more, see siemens and schreibman. . see tei-c.org for more on the text encoding initiative. the tei guidelines have become a de facto reference point for anyone representing scholarship in electronic form. one needn’t follow the guidelines, but you should at least be able to explain why. . for a definition of what web . is, see o’reilly. the term came out of conference brainstorming by o’reilly and others. . i have been maintaining a blog since called theoreti.ca at http://theoreti.ca. while i do mention this in annual reports, i don’t expect research credit for it. other blogs, however, are more substantial works. . the humanist discussion list has been going since and has probably done more to build community in the field than any other project. see www.digitalhumanities.org/humanist/. . some exceptions are vectors and eastgate. vectors describes itself as a “journal of culture and technology in a dynamic vernacular.” it works with authors to review and publish interactive new media works: see www.vectorsjournal.org/. eastgate publishes both hypertext editing tools and fiction created with their tools. see www.eastgate.com/. some traditional publishers have also undertaken digital projects. see driscoll and scott. . see boaz and boaz. while cds are no longer the favored way of distributing new media work (the web is), this article tells the “narrative” of the project in a way that could help colleagues understand what it takes. .
because of the project orientation of much digital humanities work, it has become common in the field to negotiate charters at the beginning that make clear the rights and expectations of all parties, especially vulnerable parties like graduate student research assistants. many digital humanists will therefore be used to such negotiations and understand their value. for more on charters, see ruecker and radzikowska. . see bethany nowviskie’s blog post on this subject, which has links and announces a forthcoming collection on the issue. . reasons can range from there not being support for alternative academic positions at the university to wanting to have an integrated position where someone teaches using technology, manages the instructional technology, and conducts pedagogical research around instructional technology. . a general pattern in the academy is that no one respects or budgets for the management of large projects. it is common to be expected to both manage the development of a digital project and do a lot of the work when, for example, student programmers graduate. i try to make it clear that just managing the digital component is a significant task and that support for the programming and other duties is needed.
works cited
boaz, john k., and mildred m. boaz. “t. s. eliot on a cd-rom: a narrative of the production of a cd.” computers and the humanities ( ): – . print. chan, leslie. “supporting and enhancing scholarship in the digital age: the role of open-access institutional repositories.” canadian journal of communication . ( ): n. pag. web. aug. . davidson, cathy n. “data mining, collaboration, and institutional infrastructure for transforming research and teaching in the human sciences and beyond.” ctwatch quarterly . ( ): n. pag. web. aug. . ---. “humanities . : promise, perils, predictions.” pmla . ( ): – . print. day in the life of the digital humanities. taporwiki, u of alberta, apr. . web. july . driscoll, adrian, and brad scott.
“electronic publishing at routledge.” computers and the humanities ( ): – . print. galey, alan, stan ruecker, and the inke team. “how a prototype argues.” literary and linguistic computing . ( ): – . print. guidelines for evaluating work with digital media in the modern languages. modern language association. mla, . web. july . kiousis, spiro. “interactivity: a concept explication.” new media and society . ( ): – . print. lawrence, steven. “online or invisible?” nature . ( ): . print. mccarty, willard. humanities computing. new york: palgrave, . print. nature peer review trial and debate. nature. nature pub. group, dec. . web. july . nowviskie, bethany. #alt-ac: alternate academic careers for humanities scholars. nowviskie, jan. . web. july . blog. o’reilly, tim. “what is web . : design patterns and business models for the next generation of software.” o’reilly. o’reilly media, sept. . web. aug. . report of the mla task force on evaluating scholarship for tenure and promotion. modern language association. mla, . web. july . robinson, peter, and kevin taylor. “publishing an electronic textual edition: the case of the wife of bath’s prologue on cd-rom.” computers and the humanities ( ): – . print. ruecker, stan, and milena radzikowska. “the iterative design of a project charter for interdisciplinary research.” proceedings of the th acm conference on designing interactive systems dis ( ): – . web. aug. . siemens, ray, and susan schreibman, eds. a companion to digital literary studies. oxford: blackwell, . alliance of digital humanities organizations. web. aug. . siemens, ray, et al. the credibility of electronic publishing. n.p., . web. mar. . sinclair, s., et al. “peer review of humanities computing software.” assn. for literary and linguistic computing and the assn. for computers and the humanities. athens, georgia. . address.
white paper report report id: application number: hd project director: bethany nowviskie (bethany@virginia.edu) institution: university of virginia reporting period: / / - / / report due: / / date submitted: / / national endowment for the humanities white paper grant #hd neatline: facilitating geospatial and temporal interpretation of archival collections project directors: bethany nowviskie adam soroka scholars’ lab university of virginia library march introduction in september , with level ii digital humanities start-up funding from the national endowment for the humanities, the scholars’ lab at the university of virginia library initiated neatline, a project to develop a user-friendly tool with which students and scholars could generate interpretive expressions of the literary or historical content of archival collections. such “scholarly expressions” in neatline -- which remains a work in progress -- take the form of customizable, interlinked and interactive timelines and maps, easily publishable to the web. they are built using flexible, open-source geospatial software and standards-based approaches to gis, and they permit users to draw heavily on digitized archival content and standardized metadata created by librarians. that said, each neatline exhibit is imagined as a carefully-designed narrative -- a story told in time and space through small-scale interpretive decision-making by scholars and archivists, rather than (as is more commonly pursued in our era of “big data”) an algorithmically-derived or data-driven geotemporal information visualization. fig. : the primary editing and geo-temporal storytelling interface of neatline. 
this white paper describes: ) the theoretical goals of neatline, including a role for iterative sketching (or graphesis) and hand-craftedness in digital interpretive toolbuilding; ) activity undertaken in the start-up period, including shifts in project scope and planned outcomes that came with a decision to architect neatline as a set of mix-and-match plugins for omeka, an open-source platform for online collections and exhibits; and ) the scholars’ lab’s ongoing work on neatline, beyond the scope of the start-up grant, funded by the library of congress and undertaken in partnership with the center for history and new media at george mason university. in the broadest terms, neatline is conceived as a contribution -- within the visual vernacular -- to multidisciplinary, place-based scholarship using primary sources. as a start-up project, neatline made three claims to innovation or a shifting of the landscape of geo-temporal visualization in the humanities (itself described in more depth in the next section of this report). first, by building on primary resources expressed in ead (“encoded archival description”) metadata -- among other standards, such as vra core -- neatline creates a path for collaborative contribution by libraries and cultural heritage institutions to the hermeneutic scholarly process. although ead has been employed by academics working in concert with archivists, as in the case of the walt whitman archive, its use has typically been straightforwardly bibliographical, as a finding aid and for the production of catalogs of manuscripts and letters. ead has yet to be used by scholars as a stepping-stone to rich, interpretive or theory-based expression (much less visualization) of the content of those primary resources. the neatline project aims to demonstrate the value of archival metadata to interpretive scholarship, and thereby strengthen connections among scholars and collections stewards.
next, neatline aimed to provide a seamless, out-of-the-box experience for scholarly end-users. this was a major argument in our bid for neh funding, and our rationale for shifting from development of a promised stand-alone, downloadable tool (installable as a single, server-side application), to a functionally atomized array of interchangeable plugins for omeka (http://omeka.org) is arguably the most important contribution of the project to the current scene of digital humanities tool production. neatline as initially conceived would be simple to install and use, removing the major hurdle of need for technical support by multimedia developers or gis consultants and allowing for low-risk experimentation and easy web publication by students and scholars engaged with primary resources. as soon as a user imported ead and map data (easy to acquire from libraries and archives), he or she would see a basic timeline and an open geographical field, ready for selection, annotation, and geo-referencing. neatline was imagined as a self-contained, self-service, single-function tool that nonetheless allowed expert users easy access “under the hood” to customize and contribute to its open source code. for a stand-alone tool, this would have been the right approach; but we quickly realized that, with neatline, we had the opportunity to model a more productive and collaborative set of digital humanities software practices. the scholars’ lab has now shared source code for completed or in-progress neatline-related plugins with the omeka developers’ community, and omeka forms the backbone for basic content management functionality in our project.
we feel strongly that our shift to omeka plugin production retains or enhances all of the desired qualities of our originally-proposed system, while adding two great benefits: ) no longer must users assent to the entire, ideal neatline workflow in order to make use of our work -- and in fact we are seeing great interest in and use of individual plugins in omeka user communities and scholarly and archival contexts far removed from those interested in geo-temporal interpretation; and ) our close collaboration with the omeka team and open source developers’ community is leading to advancements in the core code of omeka itself, again benefiting a far wider audience than our start-up grant anticipated. not only were we able to leverage omeka as a technical and social framework for neatline, but our plugins and the improvements our work has prompted in omeka core have made this excellent system a more attractive option for research and special collections libraries -- even those with sophisticated technical and repository infrastructure of their own. finally, neatline makes a theoretical contribution to the digital humanities by emphasizing hand-crafted visualization (a practice we have, after the experimentation of the uva’s now-defunct speclab thinktank and the scholarship of johanna drucker, called graphesis) as a mode of praxis and scholarly inquiry. low-tech sketching and storyboarding is regularly taught as part of the earliest design processes for digital projects in the scholars’ lab. a side effect of the neatline tool is to demonstrate -- to fields like history and literary studies, in which interpretation of visual artifacts themselves is rarely taught and drawing as a way of knowing is infrequently modeled -- the value of iterative interpretation and knowledge-production manifested in visual form.
we have emphasized this by designing, wherever possible within the omeka framework, drawing and editing interfaces that are relatively simple to use and nearly identical to a finished, end-user’s view. scholars who use neatline are always sketching, erasing, and sketching again their arguments on the screen. our map and timeline-related plugins are designed to offer neatline users the ability to model multiple backgrounds independently from the foreground of their critical attention, and express all of these fields and their interrelation visually. the spatial and temporal foreground stands as a place for scholarly commentary (textual annotation as well as intervention in the visual field by means of the freehand drawing of lines and shapes), while user-specified neatline backgrounds can be empirical or unabashedly subjective, absolutely geo-referenced or wholly speculative. for example, in one local (yet unreleased) test case which uses the maps and letters of civil war cartographer jedediah hotchkiss, backgrounds are brought in from modern satellite imagery, from hotchkiss’s own surveyed-and-drawn maps (including not only military plans but also sketches sent home in letters to his family), and from the historical maps that served as the surveyor’s own mental “base layer.” by operating on archival metadata -- itself already an interpretation of a literary or historical collection -- to allow scholars literally to illustrate connections among documents and the spatial and temporal dimensions of their textual content, neatline embodies a theme of much work in the scholars’ lab: that method is a path to argument, and that there are cases when even very traditional interpretive humanities scholarship is best enacted in iterative, visual modes.
project outcomes
concrete products of the grant included advancements, in collaboration with chnm, to the core omeka codebase and wholesale creation by the scholars’ lab of a suite of plugin software for the omeka framework.
these plugins are usable independently but gain value in combination, comprising the neatline system. they are described in much greater depth under the heading “neatline’s architecture and feature set,” below. . ead importer: allows for the easy and easily-adjusted import of encoded archival description finding aid information into an omeka instance. . neatline maps: provides connection to one or more geoserver instances as well as any specification-compliant wms service. . neatline features: offers the ability to encode and edit geospatial shape information using a pleasant graphical interface. . timeline plugin: incorporates the well-known simile timeline javascript framework into omeka to provide chronological visualization of omeka items. . neatline theme: provides facilities for combining neatline and omeka information into unified interactive presentations. in addition, our work on neatline led us to develop five further plugins, not necessarily integral to our initial, proposed use case, but which neatline users may employ as part of their omeka exhibits. these plugins greatly extend the capacity of omeka and make it a more attractive option for better-resourced libraries and cultural heritage institutions -- the very constituency best positioned to contribute further development time to the open source code of neatline and omeka alike. . fedoraconnector: makes it possible to display, comment on, annotate, and otherwise employ objects inheriting behaviors from a fedora commons repository. . genericxmlimporter: permits users to import any arbitrary, flat xml data into omeka. . solrsearch: allows use of the solr search engine with omeka, facilitating improved search and implementing faceted browsing. . teidisplay: allows users to render tei files in html form and attach them to omeka items. this plugin integrates with solrsearch for indexing. . vracoreelementset: allows users to bring the vra core element set (for visual resources) into omeka.
work on all of these (and on additional neatline-related plugins and extensions to omeka) will continue under the rubric of an “omeka + neatline” partnership with the center for history and new media, funded through by a contract with the library of congress. this collaborative partnership, which we consider a major outcome of neh’s investment in the neatline project, is described under “next steps,” below.
contexts: scholarly, technical, and institutional
sophisticated geo-temporal visualization is not new to the humanities – nor is it a phenomenon unique to digital practitioners and their methods. we will not rehearse the full range and critical tradition of map- and timeline-making in print and printed scholarship. minard’s time-based map of napoleonic troop movements has (thanks to edward tufte’s evangelism) become a tired commonplace in design discussions, but these point to a rich set of print-media approaches – many cataloged and analyzed by scholars like j.b. harley, mark monmonier, john krygier, and dennis wood – on which neatline has drawn for visual inspiration. for interpretive examples in history and literary studies, the two initial fields neatline proposed to address, we looked first to the synthetic work of alan baker and of anne knowles in describing historical applications of geography and geographical information systems (gis), and then to the critical uses of geography by scholars like j. hillis miller (on hardy) or john gillies (on shakespeare), as well as to the formal experimentalism of franco moretti in graphing and mapping literary texts for knowledge discovery.
among digital centers, our local partners at uva’s iath (the institute for advanced technology in the humanities) have long been leaders in the production of web-based maps and timelines, and the recent establishment of a “spatial history lab” at stanford and a “digital cultural mapping” program at ucla is a mark of the currency of these approaches to humanities inquiry and pedagogy at other institutions. however, most geo-temporal interface development by scholars has either required use of expensive, proprietary desktop gis software or has been in a “one-off” mode – as with ben ray’s notable flash map of accusations in the salem witch trials – not utilizing community-supported open-source toolsets or building on common standards. furthermore, production of geo-temporal visualization has typically required scholars to partner intensively with technical staff to make any headway in their work. all of these factors have contributed to what martyn jessop, in a article in literary and linguistic computing, termed an “inhibition of geographical information in digital humanities scholarship” -- a provocation that led the scholars’ lab, during the course of our work on neatline, to host three tracks of an neh-funded institute for enabling geospatial scholarship, from which reports and a public, community-driven “spatial humanities” website are forthcoming in early . surprisingly, it is only lately that commercial, desktop gis toolsets – primarily created for use in synchronic contexts in the sciences – have begun to account for time as a matter of course. and relatively recent development of generalized, web-based tools and services for timeline and map creation either continues in the vein of one-off production or has other key failings that limit usefulness and adoption in a humanities context. none of the proliferating tools that build on google maps and simple timeline widgets has enabled easy import of archival metadata for scholarly annotation.
most tools are rigid in their conception of space and modern in their expression of geography. google’s timemap project, for instance, does not allow for the use of historical maps that require higher-order transforms such as re-projection or skewing to match a modern street grid. other, hosted services (like dipity and geocommons) have simple mapping and temporal features, but are not open, extensible, or easy to integrate with historical maps and offline archival materials. although their ease of use and the inherent value of spatial and temporal lenses through which to view the humanities has brought such tools quickly into classroom application, they remain ill-suited to the modeling and close, customizable, and inherently narratological meaning-making needs of scholars. on a more technical note, we have observed that over the past quarter-century of evolution in geospatial technology, a simple but remarkably limiting form of data arrangement has been the norm. the predominant gis tools segregate geometrical types of data (points, lines, polygons, and collections of these objects) into separate data stores and handle their processing and visualization as separate tasks. while technical demands for speed and efficiency once made this practice necessary, it is no longer required. unlike most commercial gis software enterprises, open-source efforts are working in a more modern framework. the spatial technologies on which neatline is built are more intuitive for humanists, enabling them to treat features of all types on an even footing. if a road and a village green are both important in a particular context, the neatline user will be able to reference and examine them with equal ease, and without a conceptual understanding of the different ways these shapes are treated in a gis.
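the contrast drawn above, between one data store per geometry type and a single collection in which all features stand on an even footing, can be sketched in a few lines of python. this is a minimal illustration only and not neatline's implementation: the feature names, the coordinates, and the helper functions positions, bounding_box, and segregate_by_type are all hypothetical, and the dictionary layout is merely modeled on the common geojson convention.

```python
from typing import Iterator

# Hypothetical features, loosely following the GeoJSON layout. Names and
# coordinates are invented for illustration; they come from no Neatline data.
FEATURES = [
    {"name": "village green",
     "geometry": {"type": "Polygon",
                  "coordinates": [[(0.0, 0.0), (2.0, 0.0), (2.0, 1.0),
                                   (0.0, 1.0), (0.0, 0.0)]]}},
    {"name": "mill road",
     "geometry": {"type": "LineString",
                  "coordinates": [(-1.0, 0.5), (3.0, 0.5)]}},
    {"name": "well",
     "geometry": {"type": "Point", "coordinates": (1.0, 0.25)}},
]


def positions(coords) -> Iterator[tuple]:
    """Yield every (x, y) pair, however deeply the geometry nests them."""
    if isinstance(coords[0], (int, float)):
        yield tuple(coords)
    else:
        for part in coords:
            yield from positions(part)


def bounding_box(feature: dict) -> tuple:
    """Return (min_x, min_y, max_x, max_y) for a feature of any geometry type."""
    xs, ys = zip(*positions(feature["geometry"]["coordinates"]))
    return (min(xs), min(ys), max(xs), max(ys))


def segregate_by_type(features: list) -> dict:
    """The older arrangement: split features into one store per geometry type."""
    stores: dict = {}
    for feature in features:
        stores.setdefault(feature["geometry"]["type"], []).append(feature)
    return stores


# One loop serves the polygon, the line, and the point alike.
for feature in FEATURES:
    print(feature["name"], bounding_box(feature))
```

in the mixed collection the same question (here, a bounding box) is asked of the road and the green without the caller knowing which shape is which; segregate_by_type shows the older arrangement, in which each geometry type would be stored and processed on its own.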
the most notable software contribution to the field of geospatial and temporal visualization in the digital humanities is a late- s java applet from the university of sydney’s archaeological computing laboratory, also (like google’s tool) called timemap. it has been very slow to penetrate the humanities community – and particularly to excite developers who might contribute to its source code – despite widespread recognition of the need for such a tool. the sydney timemap is a heavy-duty application-building framework, calling for a hearty infrastructure for development and hosting. many scholars using it will require significant consultation and data entry support. unlike omeka and our simple-to-install neatline plugins, timemap does not lend itself to the easy experimentation that would attract a base of users not already committed to intensive digital work. both the google and sydney timemap applications and a number of other geo-temporal visualization tools – including neatline – were analyzed against a dozen core criteria in as part of the scholars’ lab’s start-up grant bid. criteria for comparison included various measures for ease of use and suitability to the kinds of data (including scanned historical maps) that humanities users – particularly scholars interested in archival materials – require. by the time its core plugins are complete, neatline is projected to meet eleven of these benchmarks. its closest analogue, the sydney timemap tool, met five.
planning for future, more nuanced temporal aspects of the neatline project is informed by project director nowviskie’s experience in the visualization of time for humanities inquiry, gleaned as designer of the speclab prototyping project, “temporal modelling.” this intel-funded initiative ( - ), which began with an interdisciplinary faculty seminar and was conducted in collaboration with artist and visual studies expert drucker, prototyped software to express the subjective experience of time (discontinuous, anti-metrical, cast with prospect and retrospect, inflected by emotion) as we find it in literary and historical documents. temporal modelling was also the framework for much thinking and experimentation about the role that digital tools could play in bringing graphesis to humanities interpretation. drucker and nowviskie presented their research in several public venues and it is documented in nowviskie’s dissertation (speculative computing: instruments for interpretive scholarship), in a co-authored chapter of the blackwell’s companion to digital humanities, and in speclab, a monograph by johanna drucker, published by chicago up. finally, one of neatline’s chief innovations is in providing a concrete link between archival descriptive metadata and second-order interpretive scholarship. there have been remarkably few attempts to build on ead in the creation of scholarly visualizations and interpretive interfaces. a notable effort is the archivez ead visualization tool from the university of maryland, funded by an neh start-up grant. archivez is, however, geared toward algorithmic visualization of the internal relations among ead files conceived (properly) as finding aids, rather than as an interpretive, editorial space for scholarly contribution and output.
still, this investment by neh (and the research the grant references, including adoption studies among libraries and cultural heritage institutions) demonstrates recognition of the growing centrality of ead in archival practice. uva library’s engagement with web-based geospatial visualization dates to the mid- s, when our geospatial and statistical data center began to consult on and produce innovative web applications in collaboration with faculty and with centers such as iath and vcdh, the virginia center for digital history. when geostat – which had largely served the environmental science and architecture fields – folded into the newly-created, humanities- and social science-focused scholars’ lab in , support for gis applications to humanities scholarship gained momentum. through an internal uva library innovation grant awarded to scholars’ lab staff and director bethany nowviskie, we constructed a geospatial data infrastructure using best-of-breed open-source components and web service standards – the same tools that informed our plans for neatline. the design of a “spatial data portal,” built on this infrastructure was informed by a semester-long faculty/grad seminar on the topic of gis in the humanities, hosted by the scholars’ lab and led by gis specialists kelly johnston and chris gist. senior scholars’ lab developer adam soroka served as chief architect for the system, which has now been deployed to a dedicated hardware platform. the purchase of this hardware and major investment of staff time in spatial metadata creation, map georectification, and software development demonstrates uva library’s institutional commitment to these technologies and their use by students and scholars across the disciplines. prior to the start-up grant, our development effort was displayed in greatest detail at code lib, a librarian-technologists' conference held in providence, ri, in . at this event, soroka offered both an invited workshop and a presentation. 
neatline evolved directly from a demonstration application constructed for this workshop, using ead metadata from brown university library's collection of primary resources relating to the horror writer h.p. lovecraft, whose letters and stories meditate on early th- century providence in revelatory ways. this proof-of-concept application, constructed and populated with data over the course of several weeks, demonstrated the interpretive power of geo-temporal visualization of ead-described material and the need for simple spatial data mapping and creation interfaces. in the weeks following code lib, soroka and the scholars’ lab were approached repeatedly by peer research libraries and governmental agencies (such as the virginia legislature and the usda) for guidance in bringing open- source gis to their constituencies. several presentations preceding and during the course of the start-up grant outlined the technical and interpretive possibilities of the larger framework that informs neatline. nowviskie gave invited lectures on spatial humanities and presentations on this project at the royal society in london, the university of canberra, victoria university of wellington, the university of maryland, the neh, the library of congress, and at james madison university. she also gave an invited workshop on gis and historical maps at an ala/ acrl rare books and manuscripts conference in and, along with head of scholars’ lab r&d wayne graham, joined a panel on the “spatial turn” at sts , the meeting of the society for textual scholarship. nowviskie, soroka, gist, johnston, and former head of scholars’ lab public services joseph gilbert formed a related gis panel (“new world ordering”) at digital humanities ’ and nowviskie presented a poster on graphesis at dh ‘ in london. 
the research informing that poster was shared at the gathering of the scholarly communication institute (a three-day institute on “spatial technologies and methodologies”) and at the neh institute for enabling geospatial scholarship, and published in december as “‘inventing the map’ in the digital humanities: a young lady’s primer,” in a special issue of the poetess archive journal. gilbert and graham received feedback from the history community when they attended the meeting of the organization of american historians to discuss our gis infrastructure and its implications for neatline. finally, nowviskie will give an invited talk (“how to play with maps”) and offer a neatline workshop at “space/place/play,” a conference of cwrc, the canadian writing research collaboratory. by the end of , neatline will have been presented in at least venues in countries, on continents.

neatline’s architecture and feature set:

neatline, as a system, is structured to combine multiple interacting services under a unified user interface. it is a simple example of a “service-oriented architecture.” the advantages of such a design include:

1. a loose style of coupling between components that permits their flexible redesign or replacement;
2. easy extension and reuse of components;
3. the possibility to use some but not all of the system;
4. the use of appropriate programming frameworks for different tasks.

the components that make up neatline divide roughly into two categories: the powerful and well-known geospatial webservice engine geoserver, and a collection of plugin software for the omeka framework. the geoserver engine is the product of an ongoing effort from a large and well-distributed international community. its primary goal is to instantiate the suite of geospatial webservices promulgated by the open geospatial consortium and iso technical committee (geographic information / geomatics).
the fact that internationally-recognized standards are at the heart of a key component in neatline has very positive effects, as we shall see. the other category of components comprises plugins for the omeka framework. they will be described in more detail below, but at this point it is appropriate to note that they compose end users’ entire interface to neatline. no user is expected to interact directly with geoserver in normal use. this is in alignment with our goal to emphasize ease-of-use, because working directly with geoserver requires a level of technical expertise that it would be inappropriate to expect of scholars without particular experience in the technologies it incorporates. in line with our fourth point about soa, we can cite an immediate benefit: while geoserver is written in the fast and powerful (but difficult to use) compiled language java, our neatline plugins are written in the relatively easy to use, interpreted language php. it is much easier to learn php or find personnel with php expertise than it is to learn java or find java programmers, and the practical effect is that while neatline users have the benefit of java’s power, they are able to compose their projects’ appearance and behavior with the ease and inexpensiveness of php.

however, we have introduced a less attractive quality into our architecture with this two-pronged formulation. geoserver’s strength is in handling geospatial data, which becomes most interesting and comprehensible to scholars as it acquires graphical representation. but omeka’s strength is in handling textual data, and it offers users a default interface that relies heavily on traditional text-based webforms for entry and editing. bridging this difference became a major theme of this first round of neatline development.

fig : an omeka collections view, showing content using neatline plugins.

we can describe neatline’s general abilities by moving through four areas of scholarly activity:

1. importing archival content and descriptive contexts;
2. storing and providing access to geospatial data;
3. visualizing sequence;
4. and presenting views on combined data.

neatline begins with an omeka plugin named ead importer, which, as its name suggests, makes it easy and straightforward to import encoded archival description documents into an omeka collection. since ead is the premier standard for marking up archival content electronically, this opens the door to a vast universe of archive descriptions published by research institutions, many containing valuable commentary by curators and archivists. by using ead importer, a neatline user can begin a project facing not a blank scholarly field, but a well-populated arena of objects of interest on which to comment and experiment. we expect that this plugin will find considerable use even by parties otherwise disinclined to experiment with the sort of visualizations that neatline affords.

in the second realm, neatline offers extremely powerful operations on both geospatial imagery and discrete geospatial data, through two of its omeka plugins: neatline maps and neatline features. it replaces geoserver’s powerful but arcane interface with one much easier to use and, importantly, one that is fully integrated into omeka’s preexisting forms and workflows. adding a georeferenced map to an omeka collection is now just as easy as adding a simple image – nothing more than a file upload through a web form. the map can then be treated in omeka either as a flat image (the only way maps could be represented in omeka before the neatline project) or, newly, as an interactive, cartographic and spatially-enabled object, able to be combined with other geospatial data and operated upon as true gis information. this ease of use does come with some costs. geospatial imagery can be very large.
uploading a file through web forms and http can be a limiting factor on the size of maps that can be handled with neatline, although geoserver’s abilities go far beyond this limitation. however, we have not found this limitation to be particularly annoying in practice. another limitation arises from the use of geoserver as the engine. geoserver can store and process a very large variety of file formats for geospatial imagery, but that variety is not infinite. it is possible to obtain spatial data or imagery that geoserver will not handle without special processing. again, we have not found this limitation problematic in practice. the variety of options in geoserver by default is wide enough to cover virtually all commonly-used formats.

once a map is available in the omeka/neatline context, it can be used in a number of ways. immediately, it can be examined or exhibited as an interactive map resource, featuring panning, zooming, measurement in scale, and other common gis operations. discrete geospatial data can be overlaid and multiple maps can be combined. the map can be used in omeka exhibits as a first-class resource, like a document or simple image, but with all of its interactive functionality available. in addition, thanks to the geoserver engine powering neatline, when a map is used in combination with other geospatial resources it is automatically reprojected and rescaled to combine appropriately, saving users considerable tedious work and obviating the need for considerable technical knowledge in order to use a map made in one standard projection with a map in a different one -- or to use an historical map with a modern one. lastly, neatline provides access to uploaded imagery through the world-standard web map service (wms) webservice protocol, enabling the sharing of resources not just between neatline projects, but among neatline projects and projects using almost any other kind of geospatial engine.
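As an illustration of what WMS access means for a consuming project, the following minimal sketch builds a standard GetMap request URL. The parameter names are those defined by the OGC WMS 1.1.1 specification; the choice of version 1.1.1, the endpoint URL, the layer name, and the function name are assumptions made for this example and are not taken from the Neatline code.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL requesting one layer as an image.

    bbox is (min_x, min_y, max_x, max_y) in the units of `srs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks the server for its default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and layer name, for illustration only.
    print(build_getmap_url(
        "http://example.org/geoserver/wms",
        "neatline:historic_map",
        (-71.5, 41.7, -71.3, 41.9),
        512, 512,
    ))
```

Because the request is an ordinary URL, any WMS-aware client can fetch the same imagery without knowing anything about Omeka or Neatline.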
this last provision makes it possible for users to publish not only their finished exhibitions, but also the underlying imagery at the foundations of their arguments in an immediately reusable form.

fig : a georectified historic map, shown in omeka item view.

neatline also offers tools to work with discrete geospatial data. feature data (as it is usually called in the world of geospatial technology) is attached to omeka items as metadata like any other piece of metadata. in fact, the same editors that allow easy editing of item metadata seamlessly allow editing of shapes and collections of shapes associated with those items. since omeka’s editors are built on textual web forms which would be infelicitous for editing graphical elements, we have constructed new components for those omeka forms that present graphical interfaces with convenient and powerful drawing and annotation tools. with these means, multiple arbitrarily-complex shapes can be attached to an item as well as distinguished uniquely and annotated with names and descriptions. behind the scenes, those graphical metadata are translated into geography markup language (gml), a powerful (if dense) code for encoding geospatial features and relationships. this markup is then stored in metadata fields directly associated with the omeka item in question for retrieval and rendering or export and other use. gml is another international standard, which allows easy data interchange amongst projects. unlike with imagery, neatline is not currently able to provide a webservice exposing geospatial feature data -- this would be web feature service (wfs), the feature-related “cousin” of wms -- because we do not currently decompose and index the data any further than has been described. this may be an avenue for future development. once features and maps have been made available in a neatline/omeka project, they can be combined (as mentioned above) in visualizations.
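To make the GML encoding step concrete, here is a minimal sketch of how a drawn shape might be serialized as a GML polygon before being stored in an item's metadata field. It assumes GML 2-style elements (`gml:outerBoundaryIs`, `gml:coordinates`), and the function name is hypothetical; the report does not say which GML version or element structure Neatline Features actually emits.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML_NS)


def polygon_to_gml(ring, srs_name="EPSG:4326"):
    """Serialize a list of (x, y) vertices as a GML 2-style Polygon string."""
    ring = list(ring)
    # A GML linear ring must be closed: repeat the first vertex if needed.
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    polygon = ET.Element(f"{{{GML_NS}}}Polygon", {"srsName": srs_name})
    outer = ET.SubElement(polygon, f"{{{GML_NS}}}outerBoundaryIs")
    linear_ring = ET.SubElement(outer, f"{{{GML_NS}}}LinearRing")
    coords = ET.SubElement(linear_ring, f"{{{GML_NS}}}coordinates")
    # GML 2 coordinates: "x,y" pairs separated by spaces.
    coords.text = " ".join(f"{x},{y}" for x, y in ring)
    return ET.tostring(polygon, encoding="unicode")


if __name__ == "__main__":
    # Illustrative footprint of a building, in arbitrary coordinates.
    print(polygon_to_gml([(0, 0), (0, 1), (1, 1), (1, 0)]))
```

The resulting string is plain text, which is why it can be held in an ordinary Omeka metadata field, and also why it is not indexed spatially in that form.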
features can be overlaid on maps to provide commentary on either, and geographic relationships can be made apparent, even when they vary through time.

this brings us to the third major group of neatline functionalities: visualizing sequence. for neatline, we have constructed an omeka plugin that incorporates the simile timeline javascript toolkit for timeline construction. (this was a much-requested addition to omeka’s arsenal of plugins, and is another example of the neatline project’s contribution to digital scholarship far beyond the audience who might be interested in explicitly geo-temporal interpretation of archival collections.) by basing our timeline plugin on the simile code, we inherit simile timeline’s simple, linear, and completely metrical model of time and sequence. adding omeka items to a sequence is no more difficult than assigning a date to them using omeka’s metadata editors, and tagging them using omeka’s built-in tagging infrastructure. then a timeline can be produced (as a first-class item in omeka) which displays all of the items associated with a given tag. those with an assigned date display appropriately on the timeline, while those without specified dates (as can be common in archival collections) are displayed in an auxiliary view. this kind of timeline can then be examined on its own or used in omeka exhibits, much like a neatline map.

lastly, we come to the most important set of abilities in neatline: the presentation of views of combined data. because the interface of a neatline project lives inside an omeka instance, we are able to take advantage of omeka’s machinery for assembling exhibits to produce neatline views. in fact, to produce a neatline exhibit is exactly to produce an omeka exhibit using neatline-powered items and then to enable it with the fifth major neatline omeka component: neatline theme.
neatline theme is a customizable library of javascript and css that, when applied to an omeka exhibit, connects any neatline-powered items in the display with special behaviors for the end user. a good example of these behaviors is the ability to click on any item displayed in a timeline to zoom instantly to its representation on a map and find tools for graphical editing and annotation available -- or instead to go to an omeka metadata editing page to alter or comment on that item. it is here that we find the idea of graphesis returning strongly. in a well-assembled neatline exhibit, the line between examination and annotation in the graphical context almost disappears. to discover a geographical fact that deserves remark is instantly to be able to comment on it in a useful, graphical way. neatline’s first iteration of technical development has been fruitful and fun, but it has real limitations. one of the most crucial is the manner in which discrete geospatial data is currently encoded as metadata attached to omeka items. this design unfortunately makes it impossible to do two things that would be very useful. firstly, we cannot reuse feature-encodings across distinct objects. if a geographical shape (say, the footprint of a house) is of interest in two different item-contexts (say, a hand-drawn map in a letter, and a contemporary sanborn map including the feature), it must be attached to each separately, and thenceforth updated and edited separately if future changes are necessary. it would be better if geographical shapes existed in the omeka framework as first-class objects, with their own metadata. the fact that as of yet they do not prevents us from taking another step (previously mentioned). we cannot currently expose feature data as a useful webservice. this, too, would become much easier if shapes were reified as full-fledged omeka items. 
lastly, this would allow for the application of data-indexing techniques specialized to spatial data, which would improve the performance of searching within collections as well as open the door to new kinds of spatially-based searching. if, however, we were to do this, it would require that omeka support more powerful forms of object linkage and relationship expression than it currently does, a project that the omeka group at chnm has agreed would be worthwhile.

in the realm of presentation, we feel that our current neatline theme module is attractive and well-designed, but know it could be better. more, and more powerful, forms of behavior could be supplied in exhibits, and those already present could be refined and made more flexible. we look to a community of users to guide this kind of development by requesting features and developing their own -- and expect that neatline will also benefit from attention by lead omeka designer and research director jeremy boggs, who is joining the scholars’ lab in the new role of humanities design architect.

finally, we recognize that the model of time we have inherited from the simile timeline framework is very simple. its linearity and metrical constancy limit its expressive power and its ability to encode complex scholarly arguments or to represent those very qualities of humanities information (such as temporal ambiguity, uncertainty, discontinuity, subjectivity, and contingency) that most draw scholars’ attention. we look to the temporal modeling project -- newly revived with the support of the canadian funder sshrc to a cwrc research team led by stan ruecker and including bethany nowviskie and wayne graham of the scholars’ lab -- to provide inspiration as to how we might move beyond this limitation.

challenges and next steps: omeka + neatline

the scholars’ lab has learned a great deal and grown as a center and a team through our work on the neatline project.
we originally intended to produce an application which stood alone and collected its own community of interested scholars. instead, we have been able to “piggyback” on the community of interest already surrounding omeka, which is particularly helpful because that community is naturally aligned to ask the kinds of questions we hoped to encourage and support with neatline. collaborative relationships with chnm, both at the developer and managerial levels, have been greatly strengthened through work on this project, and we feel that both centers have learned from the partnership.

at the technical level, the scholars’ lab was continually challenged to find a balance between offering the complete power of the tools we leveraged (particularly geoserver) and offering an attractively simple experience for the majority of users. it is not yet clear how well we succeeded. using our apparatus requires a considerable degree of initial action from technical support staff (for example, installing and maintaining a java web application container), and it is not yet clear whether that will prove an obstacle to adoption.

in some situations, we were also challenged by the distinction between the fundamentally graphical nature of neatline’s vision and the fundamentally textual underpinnings of omeka. omeka instances expect inputs to their databases to be available as text, particularly as text entered in traditional web forms. neatline, on the other hand, bases its whole experience on a suite of graphical gestures. while in most cases we were able to overcome the difference by clever programming that hides the underlying machine-readable text data behind a graphical interface, there remain places where this breaks down and the user is rudely shunted from operating in a graphical interface to operating through “texty” web forms. this breaks the flow of graphesis and provides a frontier for continued development work.
we were pleased to be able to accomplish so much on neatline, despite an unexpected restriction on the hiring of wage employees at the university of virginia library. the bulk of the neh start-up grant was intended as salary offset for adam soroka, allowing us to hire extra help to compensate for time he and other members of our r&d division (including ethan gruber and wayne graham) spent in development of neatline. shortly after the disbursement of funds, uva library came to the difficult decision – due to the financial crisis in the state of virginia – that wage positions would be phased out and no more temporary employees hired. we sought unsuccessfully for some time to hire student programmers (still permitted) to fill our need, before finding a qualified collaborator in undergraduate sam ebespracher. meanwhile, we continued development of neatline and re-prioritized some other projects in order to allow work to progress. this caused us to underspend slightly on the grant and to devote less time than planned to archival research and content development for our planned exemplar projects in history and literary studies, which have not yet been built out to our satisfaction for public release. as a consequence of our shift to omeka plugin development, we also engaged less with our neatline advisory board than intended. staged releases for their feedback and review were less feasible as development proceeded on many simultaneous fronts with an array of neatline plugins.

in february , the scholars’ lab and chnm announced a collaborative “omeka + neatline” initiative, supported by $ , in funding from the library of congress. this two-year initiative is meant to improve existing plugins, add preservation workflows, and refine the neatline toolset.
on chnm’s side, enhancements to omeka’s core apis, improved documentation, regular “point” releases, and a new exhibit builder under the auspices of the project are intended to strengthen omeka’s user and developer communities. omeka + neatline was one of six contract awards made by the library of congress in a program that aims both to improve loc’s own content management and delivery infrastructure and to contribute to collaborative knowledge sharing among broader communities concerned with the sustainability and accessibility of digital content. in july of , approximately $ , , was targeted toward broad agency announcements covering three areas of research interest related to these goals. technical proposals were openly solicited from expert, multi-disciplinary communities in both academic and commercial settings in the following areas: ingest for digital content, data modeling of legislative information, and open source software for digital content delivery. the neatline + omeka project (guided by nowviskie and chnm managing director tom scheinfeldt) falls into the latter category. it is additionally an opportunity for the scholars’ lab and chnm to document and disseminate a model for open source, developer-level collaborations among library labs and digital humanities centers, and a white paper on this topic is one of the contract deliverables. while the scholars’ lab did succeed at constructing the major technical elements necessary to support the interpretive activity we envisioned in our neh start-up grant, it is true that each of our omeka plugins is very young and has rough edges. we have not yet created a fully polished experience for our users. using the neatline apparatus, at this stage of its development, requires a healthy degree of technical skill and a willingness to experiment. 
in addition, final integration of our plugins into a unified editing environment that embodies and promotes the brand of graphesis we proposed in our application is not yet finished. we expect to complete and articulate the value of that work under the auspices of the library of congress’s funding. we are also turning our attention to improved communications about the project, creating a greatly-improved website, available later this year at http://neatline.org/. this site will replace the current neatline blog and is intended to describe each of our plugins and how they work together, include screenshots and “how- to” screencasts, and link to discussion, documentation, and exemplar applications of the neatline tool in the context of two archival collections -- an effort we continue under the auspices of the neatline + omeka project. we’d like to thank neh for its generous support -- both financial and in terms of the helpful advice and enthusiastic advocacy of the staff of the office of digital humanities -- and to express our continued excitement at what neatline promises to offer scholars and archivists. work on this project is situated exactly at the nexus of what we feel makes the scholars’ lab interesting and special: deep expertise in interdisciplinary digital scholarship and gis (deepening, as we have hosted an neh iatdh institute on “enabling geospatial scholarship” and supported the work of a number of spatially-oriented graduate fellows in digital humanities), coupled with a singular appreciation for archival information and library culture, by virtue of our embeddedness in a major research library with rich and unique holdings. we hope that one side effect of our work on neatline will be increased willingness of other libraries to support staff-initiated research-and-development projects, and to embed r&d teams with the librarians who support methodological work by scholars interested in spatial and text technologies. 
related links:

main informational site for neatline: http://neatline.org/
page detailing our plugins: http://www.scholarslab.org/projects/omeka-plugins/
plugin documentation and open-source code releases enabled by the neatline grant:
http://omeka.org/codex/plugins/eadimporter
http://omeka.org/codex/plugins/timeline
http://omeka.org/codex/plugins/fedoraconnector
http://omeka.org/codex/plugins/genericxmlimporter
http://omeka.org/codex/plugins/neatlinefeatures
http://omeka.org/codex/plugins/neatlinemaps
http://omeka.org/codex/plugins/solrsearch
http://omeka.org/codex/plugins/teidisplay
http://omeka.org/codex/plugins/vracoreelementset
“omeka + neatline” press release: http://www.scholarslab.org/announcements/scholars-lab-and-chnm-partner-on-omeka-neatline/
bethany nowviskie, “‘inventing the map’ in the digital humanities: a young lady’s primer.” poetess archive journal. vol , no ( ): http://paj.muohio.edu/paj/index.php/paj/article/viewarticle/
http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinefeatures&sa=d&sntz= &usg=afqjcnf zqr-grznaqihqkbtxzp abztig http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% 
fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fneatlinemaps&sa=d&sntz= &usg=afqjcnhrzpcfwylj dijq xdtvmrdvpcqq http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% 
fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fsolrsearch&sa=d&sntz= &usg=afqjcnhdyzuckn ptut zfoacij gkg ma http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fomeka.org% fcodex% fplugins% fteidisplay&sa=d&sntz= &usg=afqjcnh cp_cdq _r ifnozkwhkagpqja http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew 
http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew 
http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew 
http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fwww.scholarslab.org% fannouncements% fscholars-lab-and-chnm-partner-on-omeka-neatline% f&sa=d&sntz= &usg=afqjcng aqgzsyxily jmy v_xbwrrhew http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= 
&usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% fpaj.muohio.edu% fpaj% findex.php% fpaj% farticle% fviewarticle% f &sa=d&sntz= &usg=afqjcnenmzupuatpnlsx-ecnd rnts hfw http://www.google.com/url?q=http% a% f% 
durham research online

deposited in dro: february

version of attached file: accepted version

peer-review status of attached file: peer-reviewed

citation for published item: sunderland, luke ( ) 'introduction : medieval libraries, history of the book and literature.', french studies., ( ). pp. - .

further information on publisher's website: https://doi.org/ . /fs/knw

publisher's copyright statement: this is a pre-copyedited, author-produced version of an article accepted for publication in french studies following peer review. the version of record luke sunderland; introduction: medieval libraries, history of the book, and literature. french studies ; ( ): - is available online at: https://doi.org/ . /fs/knw .

additional information:

use policy

the full-text may be used and/or reproduced, and given to third parties in any format or medium, without prior permission or charge, for personal research or study, educational, or not-for-profit purposes provided that:
• a full bibliographic reference is made to the original source
• a link is made to the metadata record in dro
• the full-text is not changed in any way

the full-text must not be sold in any format or medium without the formal permission of the copyright holders. please consult the full dro policy for further details.

durham university library, stockton road, durham dh ly, united kingdom tel : + ( ) | fax : + ( ) https://dro.dur.ac.uk

https://www.dur.ac.uk http://dro.dur.ac.uk/ / https://dro.dur.ac.uk/policies/usepolicy.pdf

unpublished draft – do not cite or circulate – refer to published version ‘introduction: medieval libraries, history of the book and literature’, french studies, . ( ), –

introduction: medieval libraries, history of the book, and literature

medieval libraries are studied as collections of books, but much less frequently as collections of ideas. they are somewhat neglected by literary scholars, who tend to define the parameters of their studies in terms of authors, genres, themes, traditions, or movements, rather than library collections. such critics are interested in where individual texts come from or where they go, and much less in which texts were gathered together in libraries and thus made sense together. studies have increased awareness of the intertextuality of medieval literature, especially of the interplay between literature and philosophy in the later middle ages; medieval literary texts were of course in dialogue with other sorts of knowledge. but the potential for using popular literary texts – the incontournables of medieval libraries – to inform an idea of what those libraries symbolized, or how they were conceived or used, remains unexploited.

the pieces in this special issue were developed at a series of workshops sponsored by durham university and the university of cambridge, – . we would like to thank both institutions for their support. philippe frieden, thomas hinton, and luke sunderland are grateful to durham’s institute of medieval & early modern studies for sponsorship of a project panel at the international congress on medieval studies, kalamazoo, . finally, sunderland would like to thank the university of minnesota for the invitation to speak in , which helped form the ideas expressed in this introduction.

see especially adrian armstrong and sarah kay, knowing poetry: verse in medieval france from the ‘rose’ to the ‘rhétoriqueurs’ (ithaca: cornell university press, ); alain corbellari, la voix des clercs: littérature et savoir universitaire autour des dits du xiiie siècle (geneva: droz, ).
nor has the medieval library been deployed to make sense of the texts found within it, despite the fact that the meaning of any text is inevitably informed by familiarity with the other works alongside which it is found. rather, history of the book has been the main field to tackle medieval libraries. scholars in that domain show well how people needed, made, or used books. patronage, ownership, lending, and production are central foci as the histories of particular manuscripts and the activities of makers and collectors are documented. meaning is found in the physical forms of writing, or in the structure and presentation of manuscripts, including paratextual features, such as rubrics and marginalia. but history of the book’s statistics-driven approach, reliant on empirical data, remains epistemologically cautious and runs up against natural limits where such information is sketchy or unavailable. history of the book is interested in the supply and demand of books as objects; it has concentrated on tracking and measuring book ownership. but it lacks the ‘human contexts for the production and reception of texts’.

these are the subjects dominating the recent compilation of the work of two of the most important scholars in history of the book: richard h. and mary a. rouse, bound fast with letters: medieval writers, readers, and texts (notre dame: university of notre dame press, ). for an overview of the field, see alexandra gillespie, ‘analytical survey : the history of the book’, new medieval literatures, ( ), – .

see the critique of history of the book offered in guglielmo cavallo and roger chartier, ‘introduction’, in a history of reading in the west, ed. by guglielmo cavallo and roger chartier, trans. by lydia g. cochrane (cambridge: polity, ), pp. – ; roger chartier, l’ordre des livres: lecteurs, auteurs, bibliothèques en europe entre xive et xviiie siècle (aix-en-provence: alinea, ).

expanded to the level of the library, the positivist approach of history of the book shows its limitations more clearly. the library is reduced to being a symptom of the processes of production; thus jenny stratford and teresa webber note the tendency to focus on individual books and owners, due to lack of evidence about how books were ‘perceived, acquired, or used as collections’. though sociological questions, such as taste and temperament, can be answered via knowledge of the economics of the book trade, this tends again to tabulated data. codices become mere figures in defined lists of knowable facts, and the library is viewed as the aggregation of these facts, a bigger data set. hanno wijsman’s recent work on the libraries of burgundian nobles, for example, carefully records the ownership of texts in particular languages and genres. he shows how the ducal library, in the later middle ages, was ‘geared to secular use and the representation of worldly power […] the book in the noble library now becomes chiefly a source of secular knowledge and identity and is no longer predominantly in the service of the culture of prayer’. this is precisely the kind of reductive claim about texts – which are made monochromatic – that dominates the field. the nature and dynamics of this ‘secular knowledge’ remain unexamined. elsewhere, the same answers are peddled: aristocratic libraries contained standard corpora; they reflected the noble world-view; they manifested wealth and status; they served political, dynastic, or crusading aims.

gillespie, ‘history of the book’, p. .

‘bishops and kings: private book collections in medieval england’, in the cambridge history of libraries in britain and ireland i: to , ed. by elisabeth leedham-green and teresa webber (cambridge: cambridge university press, ), pp. – (p. ).

luxury bound: illustrated manuscript production and noble and princely book ownership in the burgundian netherlands ( – ) (turnhout: brepols, ), p. .
scholars in history of the book quite rightly criticize the tendency of scholars in other fields to disembody medieval texts – that is, to treat them outside of the context of the books they circulated in – but history of the book in turn focuses on the embodiment of the texts to the detriment of the texts themselves. it flattens texts for which multiple readings and resonances are possible, and sidelines the capacity of works to create and disrupt their own systems of meaning. medieval libraries have thus become part of social and economic history, without being properly integrated into the history of ideas. history of the book fails to show how individual works fitted within a wider body of knowledge, and why books and libraries mattered to medieval readers intellectually. this special issue aims to take steps towards rectifying that. in the rest of this introduction, i will argue that an approach informed by foucault can help illuminate the phenomenon of the medieval library.

see, on the burgundian collection, the ideology of burgundy, ed. by d’arcy jonathan dacre boulton and jan veenstra (leiden: brill, ). deborah mcgrady sees book acquisition as aggressive assertion of power: ‘what is a patron? benefactors and authorship in ms harley , christine de pizan’s collected works’, in christine de pizan and the categories of difference, ed. by marilynn desmond (minneapolis: university of minnesota press, ), pp. – .

simon eliot and jonathan rose, ‘introduction’, in a companion to the history of the book, ed. by simon eliot and jonathan rose (malden: blackwell, ), pp. – .

gillespie, ‘history of the book’, p. .

foucault and libraries

in les mots et les choses, foucault repeatedly turns to the library as a figure of knowledge organized, completed, mastered, totalized. describing renaissance knowledge in terms of resemblance, foucault suggests it forms a second-degree treasure trove, a classification system, a set of marks, referring to the treasure of nature: ‘la vérité de toutes ces marques – qu’elles traversent la nature, ou qu’elles s’alignent sur les parchemins et dans les bibliothèques – est partout la même: aussi archaïque que l’institution de dieu’. he also writes that la croix du maine ‘imagine un espace à la fois d’encyclopédie et de bibliothèque qui permettrait de disposer les textes écrits selon les figures du voisinage, de la parenté, de l’analogie et de la subordination que prescrit le monde lui-même’. the library is, then, a collection of texts which together provide a complete and systematic representation of the world.

foucault writes at more length about libraries in two lesser-known pieces: ‘des espaces autres’ and la bibliothèque fantastique. in the former, foucault develops his idea of ‘heterotopias’, which are defined first of all in contradistinction to utopias. utopias are:

les emplacements qui entretiennent avec l’espace réel de la société un rapport général d’analogie directe ou inversée. c’est la société elle-même perfectionnée ou c’est l’envers de la société, mais, de toute façon, ces utopies sont des espaces qui sont fondamentalement essentiellement irréels.

heterotopias, on the other hand, are simultaneously real and mythical:

des lieux réels, des lieux effectifs, des lieux qui sont dessinés dans l’institution même de la société, et qui sont des sortes de contre-emplacements, sortes d’utopies effectivement réalisées dans lesquelles les emplacements réels, tous les autres emplacements réels que l’on peut trouver à l’intérieur de la culture sont à la fois représentés, contestés et inversés, des sortes de lieux qui sont hors de tous les lieux, bien que pourtant ils soient effectivement localisables.

les mots et les choses: une archéologie des sciences humaines (paris: gallimard, ), p. .

les mots et les choses, p. .

‘des espaces autres’, in dits et écrits iv (paris: gallimard, ), pp. – ; la bibliothèque fantastique: à propos de la ‘tentation de saint antoine’ de gustave flaubert (bruxelles: la lettre volée, ).

‘des espaces autres’, p. .

foucault ventures six arguments about heterotopias. first, all cultures create heterotopias: primitive societies have what he terms crisis heterotopias (privileged, sacred, or forbidden places, reserved for individuals in a state of crisis: menstruating or pregnant women, the elderly). such heterotopias are disappearing today, replaced by a second category: heterotopias of deviation (psychiatric hospitals and prisons). second, particular heterotopias can change function over time. third, ‘l’hétérotopie a le pouvoir de juxtaposer en un seul lieu réel plusieurs espaces, plusieurs emplacements qui sont en eux-mêmes incompatibles’ – the best example is the theatre, which brings onto the stage, successively, a series of places foreign to one another. the cinema works in a similar way, and the garden is the oldest example of this sort of heterotopia: ‘le jardin, c’est la plus petite parcelle du monde et puis c’est la totalité du monde. le jardin, c’est, depuis le fond de l’antiquité, une sorte d’hétérotopie heureuse et universalisante (de là nos jardins zoologiques)’. fourth, temporal discontinuities are also often absorbed, making ‘heterochronias’ such as museums and libraries: ‘il y a […] les hétérotopies du temps qui s’accumule à l’infini, par exemple les musées, les bibliothèques’. fifth, heterotopias are not freely accessible: either one is locked in, such as in prison, or entry requires ritual or purification. and sixth, finally, heterotopias can be communities relying on the exclusion of the outside world, which is denounced as illusory or imperfect; this type is like a realized utopia, such as the jesuit colony.

‘des espaces autres’, pp. – .

‘des espaces autres’, p. .

‘des espaces autres’, p. .

‘des espaces autres’, p. .

worlds in miniature which stand in a variety of problematic, shifting relationships with the real world, heterotopias destabilize any opposition between order and disorder. they contain disorder within an ordered structure, or else hold multiple, competing orders within one space. and the library is for foucault a heterotopia peculiar to modernity:

au xviie, jusqu’à la fin du xviie siècle encore, les musées et les bibliothèques étaient l’expression d’un choix individuel. en revanche, l’idée de tout accumuler, l’idée de constituer une sorte d’archive générale, la volonté d’enfermer dans un lieu tous les temps, toutes les époques, toutes les formes, tous les goûts, l’idée de constituer un lieu de tous les temps qui soit lui-même hors du temps, et inaccessible à sa morsure, le projet d’organiser ainsi une sorte d’accumulation perpétuelle et indéfinie du temps dans un lieu qui ne bougerait pas, eh bien, tout cela appartient à notre modernité. le musée et la bibliothèque sont des hétérotopies qui sont propres à la culture occidentale du xixe siècle.

the library is organized in a way that compensates for the disorder of the real world; it is a place where the discourses of the world – spatially and temporally different – find their place, within one big system. but this vision of the library, as an embodiment of the nineteenth-century desire to know, catalogue, and master, stands in contrast to the one developed in la bibliothèque fantastique, foucault’s reading of flaubert’s tentation de saint-antoine. antoine, living as a hermit in a hut high up on a cliff overlooking the nile and the desert, is overcome by despair and solitude. he prepares to leave his hermitage, but hates himself for weakness, and throws himself down to the ground where he falls into a trance.
as he lies immobile, he is tempted by the devil who deploys a wide variety of illusions. the universe parades before antoine: he sees manifestations of desire, knowledge, power, imagination. visions of plenty (food, drink, wealth, adulation) are followed by the appearance of his disciple hilarion, who represents science and reason, and critiques antoine’s doctrinal knowledge. various heretics appear, as do other gods and beings, such as dwarfs, the sphinx and the chimera. for foucault, la tentation is a library; he speaks of its erudition and mentions some of the many books flaubert consulted on doctrine, heresy, and mythology. scholars have lamented this aspect, claiming the work drowns under the weight of its knowledge. (such critiques are summarized by allan h. pasco, ‘trinitarian unity in la tentation de saint-antoine’, french studies, who also notes critical attempts to find order. mary orr reads the text as a set of dialogues between sacred and secular knowledge: flaubert’s ‘tentation’: remapping nineteenth-century french histories of religion and science (oxford: oxford university press).) but foucault sees it as a comprehensive tour of knowledge types: ‘comme un soleil nocturne, la tentation va d’est en ouest, du désir au savoir, de l’imagination à la vérité, des plus vieilles nostalgies aux déterminations de la science moderne’. however, la tentation is not a library as we know it. the idea of the library as a heterotopia takes on new dimensions in this text: it is entered as a dream, an illusion; it is a sacred, forbidden space of crisis. this library is a fantastic reconciliation of the rational with the irrational, and the dreamlike with the scholarly. in the text, we see the library ‘ouverte, inventoriée, découpée, répétée et combinée dans un espace nouveau’. play and imagination are untrammelled by the barriers of order. foucault contrasts la tentation with don quixote and the marquis de sade’s nouvelle justine: both are books created by absorbing and reacting to other books. but they ironize the books they devour, whereas flaubert’s text holds out quite different possibilities:
la tentation, elle, se rapporte sur le mode sérieux à l’immense domaine de l’imprimé; elle prend place dans l’institution reconnue de l’écriture. c’est moins un livre nouveau, qu’une œuvre qui s’étend sur l’espace des livres existants. elle les recouvre, les cache, les manifeste, d’un seul mouvement les fait étinceler et disparaître. elle n’est pas seulement un livre que flaubert, longtemps, a rêvé d’écrire; elle est le rêve des autres livres: tous les autres livres, rêvants, rêvés, repris, fragmentés, déplacés, combinés, mis à distance par le songe, mais par lui aussi rapprochés jusqu’à la satisfaction imaginaire et scintillante du désir. après, le livre de mallarmé deviendra possible, puis joyce, roussel, kafka, pound, borges. la bibliothèque est en feu.
foucault describes a delirium of literary research here. there is madness within order, and order within madness. he also reconnects here with the garden imagery of ‘des espaces autres’:
la tentation est la première œuvre littéraire qui tienne compte de ces institutions verdâtres où les livres s’accumulent et où croît doucement la lente, la certaine végétation de leur savoir. flaubert est à la bibliothèque ce que manet est au musée.
flaubert captures the library in its essence as a vast space of possibility that no longer just contains knowledge but also facilitates its creation. he unleashes the literary potential of the library.
indeed library scientist gary radford sees in the piece possibilities for ‘an alternative perspective from which the rationalistic assumptions of a positivistic epistemology can be foregrounded, transcended, and critiqued, along with the conception of the academic library which it supports’. for radford, traditional concepts of knowledge, meaning, and communication in library and information science face a crisis because they fail to understand how people experience their interactions with the modern academic library. refusing the idea of absolute knowledge, radford suggests that the library be seen as a route to particular knowledges, each created as a user moves through the collection. users need the freedom to find different itineraries, forming individual rationalities that will never add up to a single, stable order of knowledge. radford’s piece dates to , but his conclusions apply a fortiori to the era of digital scholarship, google and wikipedia, when readers need to ask questions about authority, citation and the nature of truth (rather than being able simply to depend on the fact that the provided content is reliable) but are also allowed to become contributors in their own right. mary franklin-brown begins and ends her study of medieval encyclopaedias with comparisons to wikipedia: both have labyrinthine qualities, polyvocality, and most importantly, a tension between the aspiration to universality and order on the one hand, and unstable practices of citation, cross-referencing, and openness to revision on the other. both medieval encyclopaedias and wikipedia arguably replace and supplement libraries, but also provide a model for understanding them.
indeed the disruptive dimensions of both correspond well to foucault’s imagining in la bibliothèque fantastique of the library as a multifarious, unsettling cavalcade of irreconcilable ideas, rather than as a storehouse of a tamed and fixed savoir; this in turn invites critique of his historicization in ‘des espaces autres’, by conjuring away the shift from the medieval and early modern library as personal collection to the modern library as a culture’s ordered archive. (the works cited above are gary radford, ‘flaubert, foucault, and the bibliotheque fantastique: toward a postmodern epistemology for library science’, library trends, and mary franklin-brown, reading the world: encyclopedic writing in the scholastic age (chicago: university of chicago press).) the nineteenth century, along with other periods, had wildly different conceptions of the library. the middle ages in fact show symptoms of many of the phenomena foucault attributes to the nineteenth century. first, the term ‘library’ must be used carefully: foucault deploys it as much to designate an imaginary space of encounter between books as an institutional space where they are stored. for the medieval period, it is the former sense, closer to the idea of book collection, that is arguably more important: ordered, institutional libraries were still nascent, with many sets of books housed in chests or cabinets rather than rooms with shelves. yet works were circulating with greater intensity and coming into contact, because the proliferation of books is not unique to the era of print, as michael clanchy cautions: ‘a vigorous book-using culture was the precursor to the invention of printing rather than its consequence’.
christopher de hamel even speaks of ‘a sudden excess of knowledge’ from the twelfth century onwards, which accelerated book production; there was ‘an almost relentless stream of new titles in the lists of desiderata, in theology, history, politics, geography, natural history (bestiaries, for example), liturgy (missals rather than the older sacramentaries), and the first ancient greek works translated into latin from arabic intermediaries’. the secular book trade was growing apace; and the prologues to works of the period often speak of the number and diversity of books. there was evidently a bewildering increase in the amount of written material. (clanchy’s remark is from ‘parchment and paper: manuscript culture’, in a companion to the history of the book, ed. by eliot and rose; de hamel’s is from ‘the european medieval book’, in the book: a global history, ed. by michael f. suarez, s.j. and h.r. woudhuysen (oxford: oxford university press); on the secular book trade, richard and mary rouse, manuscripts and their makers: commercial book production in medieval paris (turnhout: brepols).) it is common for authors to start an encyclopaedia or compilation by saying that it has been written because there are many books, such as here when bartholomaeus anglicus explains his reasons for producing his encyclopaedia:
ut simplices et parvuli, qui propter librorum infinitatem singularum rerum proprietates, de quibus tractat scriptura, investigare non possunt, in promptu invenire valeant saltem superficialiter quod intendunt. [so that uneducated people and children, who cannot investigate the properties of individual things that scripture discusses, because of the infinity of books, can easily find what they want, albeit on a superficial level.]
this of course also raises questions about accessibility and changing audiences. new vernacular reading publics were hungry for translations; the book trade grew rapidly in response.
bartholomaeus’s encyclopaedia would itself become vastly popular in its french rendering as well as its latin original. the desire for rationality, order, and truth was also manifest in the way in which books were becoming easier to consult thanks to rubrics, tables of contents, indexes, notae, and running heads. it was a world of more and longer books: the drive to complete and compile explains the vogue, from the thirteenth century onwards, for anthologies, florilegia, miscellanea, codifications, summae, and mammoth literary texts such as arthurian prose cycles. frédéric duval sketches an ideal late medieval francophone library by cataloguing all those texts with more than fifty surviving manuscripts dating to the period to . amongst other works, he notes the presence of bibles historiales; prayerbooks and moral guides; frère laurent’s la somme le roi; the roman de la rose; manuals of hunting and falconry; medical works; encyclopaedias (sidrac, gossouin de metz’s image du monde and brunetto latini’s trésor); great histories (the histoire ancienne jusqu’à césar, the grandes chroniques de france); and works by boethius, guillaume de digulleville, jean gerson, and christine de pizan. (the passage from bartholomaeus is quoted from de proprietatibus rerum: volume i (prohemium, libri i–iv), ed. by baudouin van den abeele and others (turnhout: brepols), in my translation; duval’s anthology is lectures françaises de la fin du moyen âge: petite anthologie commentée de succès littéraires (geneva: droz).) many of these are attempts to summarize and compile certain forms of knowledge (historical, medical, moral, philosophical, or practical), or to dramatize human encounters with the limits of the knowable. three contributors to this special issue have worked on literary cycles, seeking models of completeness and narrative order, as well as moments when disorder imposes itself. such thinking informs our approach to libraries.
cycles, anthologies, and the like are microcosms of libraries, modes of gathering, organizing and making sense of diverse material; they are products of a search for classifiable, totalizable, and manageable knowledge. (the studies of cycles by contributors to this issue are miranda griffin, the object and the cause in the vulgate cycle (oxford: legenda); thomas hinton, the ‘conte du graal’ cycle: chrétien de troyes’s ‘perceval’, the continuations, and french arthurian romance (cambridge: d.s. brewer); luke sunderland, old french narrative cycles: heroism between ethics and morality (cambridge: d.s. brewer).) yet they also betray the fear that knowledge might spiral out of control, or that particular pieces of knowledge might prove impossible to locate. foucault’s heterotopia corresponds to the multiple systems for ordering the things of the world (competing discourses and disciplines are represented and related, without ever tessellating completely) that are found in medieval libraries. yet, in a dimension missing from foucault’s idea of the library, the medieval book collection was also heterotopic because it gathered codices each with a highly individualized history of production, ownership, and use, bound up with practices of gift-giving and inheritance (here, there is great potential for the reconnection of history of the book’s thinking with literary and philosophical work). foucault thinks teleologically (he seems to picture the library as an institution, a fixed place where ever more texts accrete), but medieval libraries were frequently temporary projects, prone to dispersal. collections often broke up after the death of their owners; books were stolen, sold or destroyed; parchment was reused. viewed synchronically at any one moment, a medieval library held varieties of books, texts, and discourses in an imperfect crystallization of knowledge new and old. but considered diachronically, the medieval library appears personal, transitory, and palimpsestic.
the relationship between writer, rewriter, reader, and text varied from manuscript to manuscript. an individual vision of order could always be replaced by a new one when books or collections changed hands; particular elements within the knowledge set retained their distinctiveness, and along with it their potential to combine differently, with other works and other systems, in new locations. many medieval codices gather diverse texts and books within them and could be considered libraries in miniature. in this sense, it is perhaps the encyclopaedia that best represents the library (indeed, foucault uses the ideas of the library and of the encyclopaedia interchangeably as figures of knowledge mastered and completed in les mots et les choses), and my dialogue with foucault in this context is inspired by franklin-brown’s work on the scholastic encyclopaedia, already referenced. franklin-brown reveals how encyclopaedias reproduce other texts and other discourses, juxtaposing historically and generically different ways of knowing, through the scholarly practices of commentary and compilation. attacking foucault’s historical argument, she casts vincent of beauvais’s speculum maius as a medieval heterotopia:
the desire to create such a space is not uniquely modern, a fact to which the speculum maius bears most eloquent, overwhelming testimony. like the modern library, the scholastic encyclopedia compensated for the confusion of volumes piled into the armarium or book cabinet, for the heteroglot murmur of discourses circulating through the walls of the monastery, the school, the university, and the castle, by gathering them all into a single space, the book, and fixing them on the page, in some comprehensible order that, itself, lent order and meaning to the world it represented.
the constitution of a utopia of knowledge, the realization of that utopia in an actual place: it would be difficult to find a better summation of what drove the scholastic encyclopedists to pick up their quills.
foucault’s description of la tentation de saint antoine, however, has other potential links. it could well be applied to the roman de la rose, another product of the opening up of intertextual possibility as a set of books is consumed to create another. the presence of books together in libraries allows for the composition (compilation, anthologization) of other works. but more fundamentally, the existence of private libraries and the increase in private book ownership also meant that there was a newly-formed reading public with heterogeneous tastes that could understand and desire such books. sylvia huot reads the rose as a tissue of citations, as ‘an extensive and intricate tour of latin authors, both ancient and medieval’. because the text contains citations of ancient authors but does not mark them as such, she argues that it ‘presupposes, rather than provides, knowledge of these texts’. (the long quotation above is from franklin-brown, reading the world; huot’s remarks are from dreams of lovers and lies of poets (london: legenda). sarah kay casts christine de pizan’s livre du chemin de longue estude as ‘a creative rereading of christine’s own library’: the place of thought: the complexity of one in late medieval french didactic poetry (philadelphia: university of pennsylvania press). armstrong and kay call the rose ‘a disorderly double of an encyclopedia’ (knowing poetry).) the medieval library is a creative space of possibility: the existence of libraries drove the production and consumption of heterotopic books. many late medieval works are both books and interfaces between traditions. they show the desire to cross the boundaries of knowledge, but also the fear of disorder, the anxiety that knowledge might remain hopelessly elusive, polyvocal, or contradictory.
returning to foucault, we might see such works, and the libraries that contain them, as examples of the limited or composite forms of universality that substitute for the impossible, perfect encyclopaedia. in les mots et les choses, foucault speaks of the eighteenth-century scholar charles bonnet, who imagined a huge library of the universe or the true universal encyclopaedia, and saw all actual libraries and encyclopaedias as necessarily pale imitations: ‘sur ce fond d’une encyclopédie absolue, les humains constituent des formes intermédiaires d’universalité composée et limitée’. library collections contained some texts offering mastery of worldly knowledge in face of discordant discourses, and others revealing the impossibility of a complete grasp on what can be known. if we remain transfixed by the stories of where individual texts or physical books originated or what their dissemination was (and those are the foci of both history of the book and much literary scholarship), then we lose sight of which texts were gathered together in libraries and thus made sense together, as different ways of coping with the imperfections and complications of human knowledge. each medieval library must be grasped as a project, fleeting, heterogeneous, and incomplete, that reaches out in its own more or less systematic way towards the limits of the knowable.
the heterotopic medieval library
through their analysis of particular texts and collections, the pieces offered here open up ways of examining the medieval idea of the library which both concord with and contest foucault’s analysis. libraries are sets of knowledge in various states of completion, partially realized utopias, special spaces to which access is reserved, personal projects, systems juxtaposing incompatible discourses and historically-layered modes of knowing; they at once represent, contest, and invert the values of the culture around them.
the pieces of this special issue move chronologically, from the twelfth century, which saw an acceleration of manuscript production, through the late middle ages, where texts entered into increasingly intense interdisciplinary dialogue, to the early modern period, when a vibrant manuscript culture survived alongside print. first, thomas hinton looks at the metaphors of the garden (also dear to foucault) and the forest as alternative models for knowledge and as responses to the diversity of books, suggesting an oscillation between order and chaos: book collecting parallels cultivation and both are opposed to the wild forest, as hinton demonstrates through readings of reginald of durham and richard de fournival. but the field of history of the book has focused too greatly on monasteries and universities, argues hinton, and there remains an urgent need to move towards private libraries, where cultural historians can help form a new approach. hinton attempts to sketch one here by adopting a margins and centre methodology. outside the centre of institutional libraries, the city, as site of the production of books for private ownership, especially in the vernacular, appears as particularly active. the forest is of course another margin: associated with unsorted knowledge and language in dante and du bellay, it also provides a metaphor in medieval romances by wace, chaucer and benoît de saint-maure for the marvel, the surprise encounter and the unexpected power of books. the idea that reading could potentially lead in unexpected directions is a recurrent concern of vernacular authors, suggesting that the forest could ultimately prove more productive than the kempt garden. this is far from the imposed order of the modern library, and close to foucault’s vision of flaubert’s tentation. like foucault, emma campbell uses literary texts to formulate an idea of the library.
references to real or imaginary sources in medieval vernacular literature are relatively conventional, but some writers locate their sources more precisely, mentioning a work in a library or book collection. drawing on benoît de saint-maure’s roman de troie, thomas of kent’s roman de toute chevalerie, adgar’s le gracial, and chrétien de troyes’s cligès, campbell explores how such references to libraries frame the composition of french texts, and considers the particularities of these references. medieval book collections significantly predate the conceptions of rational totalization which inform many modern notions of the library, including foucault’s thinking in les mots et les choses; the ordering of knowledge with which they are associated consequently looks substantially different from that of the present day. campbell argues that the library as it appears in medieval french literature is connected to hierarchies of knowledge rather than to systems of classification or notions of ordered space; she therefore moves away from the thematics of order and disorder to envisage the library as a set of relationships between writers, readers, and texts. indeed campbell’s final example, adenet le roi’s berte as grans pies, locates its source to the library of saint-denis, a site for intellectual activity with royal and monastic associations, and a contact point between different historical periods. this allows campbell to argue that the medieval library connotes textual and epistemological genealogies, offering a site for the transmission of knowledge through books, and a geographical and historical link in the movements of translatio between disparate cultures and peoples. miranda griffin also picks up on the heterochronic dimension of the library, by casting the ovide moralisé as a microcosm of the book collection. 
like the rose, the ovide moralisé is an interface with classical traditions, a book that helped readers deal with the diversity of discourses by suggesting a model for their integration, whilst also playing on their discordance. in a similar way to la tentation, the ovide moralisé creatively rewrites other books, offering the medieval reader both a translation of ovid’s tales of transformation from latin to french, and a digest moralizing them to bring out their value in transmitting divine truths. it forges links between diverse sets of material by absorbing and rewriting other books in a reading practice involving commentary and translation, all the while overtly conscious of its dependence on, and re-evaluation of, previous instances of authority. griffin focuses on the encounter between two auctores cited in the ovide moralisé: ovid and an author referred to as ‘crestien’. evidence from ovide moralisé codices belonging to the libraries of king philippe vi, his wife, jeanne de bourgogne and their three grandsons (king charles v; jean, duc de berry; and philippe le hardi, duc de bourgogne) suggests that the name crestien, today likely to be identified with chrétien de troyes, would in the fourteenth century likely have connoted the author responsible for a twelfth-century translation and adaptation of the philomela section of book vi of ovid’s metamorphoses, an œuvre which is itself interpolated into the ovide moralisé. the author function of ‘ovide’ is for griffin multifaceted and multiform, and the ovide moralisé both a book resulting from an imaginative approach to a library, and a library itself. the medieval library can, in turn, be viewed as a fantastic encounter with a book collection, as a space where texts refigure, reshape, re-evaluate, and toy with one another.
this focus on book users continues in philippe frieden’s examination of the book collection of charles d’orléans, which demonstrates how particular libraries can be seen as sets of alternative histories, varying with the manufacture, provenance, and ownership of each volume. in foucault’s terms, the collection is a particular, limited universality, an individual, idiosyncratic, or personalized form of completeness. it is heterotopic in marking exclusivity and socio-political distinction, with books as valuable objects and markers of status, and heterochronic because manuscripts passed through different hands, taking on layers of meaning from the collections around them, and also from being annotated palimpsestically. all the same, mastery of knowledge is also expressed. more specifically, the figure of the duke as owner or collector appears in a variety of forms within the collection, and the library can be considered as the reflection of the collector himself, and the embodiment of a series of events and relationships in his life. it offers an unusually rich opportunity to gain a better understanding of the intimate links connecting manuscripts, medieval libraries, and their owners and family. through study of the surviving catalogues of the collection, details held within books made for him, and his own book of poetry, frieden sheds light on the portrait of charles d’orléans as a collector that is provided by these various pieces of evidence. john o’brien’s ‘response’ closes the collection with consideration of the early modern period, arguing that what changes in the library is not so much the knowledge-conception of the space as the sort of works that are allowed access to that space (which remains a heterotopia with selective access) and the place they have in it.
o’brien distinguishes courtly libraries, where medieval romances and epics remained popular, and libraries like the royal library which might include them alongside the new learning, from scholars’ libraries which normally would not contain that vernacular medieval literature, but might easily have theological or philosophical or historical works drawing on the medieval heritage. o’brien shows that the early modern library is not the same entity at every temporal or spatial point in the period, and its relationship to the past, especially the medieval era, is diverse, rich, and many-sided. if history of the book believes that the medieval library is knowable by its codices, then we want to reveal what medievalists might better understand by close reading of its texts, and especially literary texts, because of their capacity to dramatize human encounters with books, in their perplexing variety, to bring divergent discourses into contact, and to reveal the interplay between order and chaos that shaped libraries. it is literary texts that best show the workings of libraries as heterotopias, holding together texts that vary synchronically (by vehicling different modes of knowing, ordering, and writing) and diachronically (by encapsulating different histories of ownership, composition, rewriting, and transmission). this special issue will give a glimpse of the potential for literary texts to provide insight into the cultural work performed by medieval libraries.
to teach, critique, and compose: representing computers and composition through the ciwic/dmac institute
julia voss ∗, santa clara university
computers and composition
abstract
this article examines how the computers in writing-intensive classrooms (ciwic)/digital media and composition (dmac) institute has realized founding director cynthia l. selfe’s commitment to prioritizing people first, then teaching, then technology. i analyze how institute curricula introduce and model pedagogies for teaching digital composing, foster networking among participants, articulate a critical stance toward technology, and encourage newcomers to enter the field as administrators and scholars (as well as teachers). i also draw on participant documents (social media posts, publications, and cvs) to investigate the uptake of these ideas. moving forward, i suggest that in light of the institute’s growing emphasis on digital composing, (1) knowledge-making should be seen as the larger frame for ciwic/dmac work, and (2) research should be added to the institute’s existing articulation of the field in terms of people→teaching→technology.
© the author. published by elsevier inc. this is an open access article under the cc by-nc-nd license (http://creativecommons.org/licenses/by-nc-nd/ . /).
keywords: ciwic; dmac; history; pedagogy; professional development; digital composing; ohio state university; michigan technological university
“i’ve never been to a workshop, meeting, conference, or class where people have been made to feel more enabled, encouraged, and nurtured to work through their ideas.” –ciwic participant
“i learned a few important things about multimodal composing these past few days. ... i also remembered a lot of things i had forgotten about composing in general.” –j. 
james bono, dmac participant
“doing dmac has made me realize for the first time that i am a professional, and that’s a really wonderful feeling.” –kathryn perry, dmac participant
∗ department of english, st. joseph hall, santa clara university, el camino real, santa clara, ca. e-mail address: jvoss@scu.edu. http://dx.doi.org/ . /j.compcom. . .
in the years since its founding, the computers in writing-intensive classrooms (ciwic) institute, and its successor the digital media and composition (dmac) institute, has become such a familiar feature of the computers and composition landscape that people talk about doing ciwic/dmac as a signifier of technological professional development. doing the institute means working for two weeks with founding director cynthia l. selfe, whose work as a researcher, teacher, and mentor embodies a sophisticated, critical, student/teacher-centered stance toward technology. doing the institute signifies familiarity with scholarship at the intersections of technology, literacy, and pedagogy, and implies hands-on experience with digital composing. it means entering a community of teachers and scholars connected by their interest in integrating digital technologies into teaching and research on writing, rhetoric, literacy, and related fields like literary studies.
this project investigates what it means to “do” ciwic/dmac, positioning the institute as an entry and articulation point that helps define computers and composition as a field. i examine the priorities the institute highlights, the pedagogies it models, the professional community it fosters, and the professional activities it encourages. to construct a history from both above and below, i examine how participants take up institute concepts and practices to consider ciwic/dmac’s impact on computers and composition and related fields. this approach adds systematic documentary analysis to existing narrative accounts of ciwic/dmac (see journet) and recommendations of the institute as a site for technology training (see mcgrath; braun).
a brief history of the ciwic/dmac institute, by way of its mission
ciwic was founded in at michigan technological university by selfe and an mtu colleague, billie wahlstrom (selfe, personal communication, june). the call for applications described ciwic as an opportunity for “practical, hands-on experience in the effective use of word processing, style analysis, and spelling programs,” providing “techniques for setting up computer-assisted writing labs; and strategies for training teachers and students to use computers” (“computers in writing-intensive classrooms”). although technology was central, the institute focused on teaching and learning, opening it to participants who had varying levels of technical expertise but shared interests in literacy and composing. as gail e. hawisher, paul leblanc, charles moran, and selfe explained, this period of the history of computers and composition was characterized by a shift toward more critical stances toward technology, informed by social and critical theories. at ciwic, this new
at ciwic, this new orientation translated into interrogating how computers related to composition pedagogy, problematizing technological utopianism, interrogating access inequalities, and working with institutional stakeholders to support writing labs. throughout the s, ciwic began integrating internet-based communication technologies and multimedia into its curriculum as desktop computers became more powerful and gained network connectivity (see hawisher, leblanc, moran, & selfe, , pp. – , – ).

note: during the s, ciwic introduced participants to the world wide web, hypertext, email, and electronic conferencing and taught attendees to compose in image editing software like photoshop and html-editing software like macromedia dreamweaver (see “computers in writing-intensive classrooms,” , , a, b, , & ; selfe, personal communication, june , ).

when anne wysocki joined the ciwic staff in , the institute expanded from a single program to three separate tracks in response to the growing opportunities for multimodal, interactive composing:

• approaches to integrating computers into writing classrooms (ciwic-aic, led by selfe) focused on integrating computers into writing instruction, carrying on ciwic’s original focus
• new media (ciwic-nm, led by wysocki) focused on training participants in building born-digital, multimodal, interactive texts
• independent projects (ciwic-ip) supported scholars working on their own digital projects with ciwic staff and michigan tech resources (“announcing a suite of summer institutes for teachers,” )

selfe took the institute with her to ohio state university (osu) in , where it was renamed dmac: the digital media and composition institute. at osu the institute converged back into a single program, emphasizing rhetorical strategies for multimodal, digital composing in a variety of genres and in light of concerns about access, agency, representation, and other issues. dmac retained ciwic’s commitment to hands-on digital production, shifting focus to integrate multimodality into participants’ teaching and research. this attention to participants’ institutional and personal-professional concerns is another hallmark of the institute, as selfe has argued:

technology is not really as important as the people. so, we ask things like, does the technology get in the way of what we are doing? or does it help us in what we are doing? what is the upshot? (as cited in beck, , p. )

or, as dmac participant rachael ryerson ( a) tweeted, paraphrasing selfe’s parting comments to attendees: “#dmac in a nutshell: . people . instruction . technology.”

. documenting ciwic/dmac’s history

to examine how the institute has represented the field of computers and composition and to investigate how participants have taken up these ideas, i gathered “official” texts produced by the institute (announcements, curricula, and readings) and “unofficial” texts produced by participants (social media posts made during the institute, publications developed at the institute, and online professional documents created by alumni after attending the institute). i analyzed curricula to get a sense of the institute’s content over time, identifying recurrent themes in readings, discussions, instructional methods, assignments, and extracurricular activities. as i will discuss, these themes related to pedagogy, networking, the social context/use of technologies, and participant professionalization. i then examined participants’ social media posts looking for these same themes, adding codes for additional activities (mostly social ones like recommending references, joking with one another, and commenting on the experience of attending the institute) not found in the official documents.
in some cases, the social themes that emerged from the social media posts directed me back to official ciwic/dmac materials to examine the extent to which these interactions were encouraged by institute curricula. finally, i analyzed participants’ professional documents, noting their job titles, the role technology has played in their teaching, any editorial and/or administrative positions they’ve held, and their publishing histories. this approach has drawbacks, related to missing documents and limited causal evidence. i was able to gather complete curriculum records for all dmac institutes, but ciwic materials were not systematically archived. as a result, i’m working with a smattering of archived ciwic web pages and full curricula for years ( , , and ). participant documents are similarly skewed toward dmac because before , ciwic used locally-stored files and conferencing software like daedalus for discussion among participants, which were not archived. records of discussions and reflections by dmac participants are still available online, and consequently play a much larger role in this project. finally, i located alumni professional documents using a complete list of attendees for dmac, but only a partial list of ciwic attendees, who are therefore underrepresented in my quantitative analysis of participants’ work after attending the institute. ciwic/dmac imbalances aside, the kind of evidence i draw on here does not lend itself to claiming direct causal relationships, but instead to demonstrating the diffusion of institute ideas. as a result, the history of ciwic/dmac i present here is partial, and i hope it will encourage others to study the institute and its impact using other methods and perspectives. this account focuses on the pedagogies the institute models, the professional environment it cultivates, the critical and diverse scholarship it features, and how it cultivates participants’ professional development through hands-on digital composing work. 
note: the social media included in this study include comments posted on the official institute blogs and tweeted using the #dmac , #dmac , and #dmac hashtags. the institute maintained official blogs for ciwic – , dmac – , and dmac – . except for the dmac blog, all institute blogs are still online. since the institute has maintained an official twitter profile (dmacinstitute) and encouraged participants to live-tweet during dmac using the official institute hashtags.

note: i define “professional documents” here as texts created by the participants to describe themselves professionally. these include cvs; university profile pages; professional websites; linkedin profiles; and academic networking sites like academia.edu, mendeley, and researchgate.

note: to flesh out and contextualize ciwic in light of scarce documents, i interviewed several people who attended ciwic in the early s about their experiences. while not cited here, these interviews informed my interpretation of ciwic documents.

note: accounting for duplicates (some people attended ciwic more than once), the total number of recorded participants for ciwic and dmac is . this figure includes all dmac attendees ( – ), and all attendees of ciwic – . that leaves years unaccounted for, suggesting a total of ∼ participants over the institute’s -year history.

. presenting and modeling a writing-centered approach to teaching with technology

. . connecting old and new literacies

as a professional development institute for english and writing studies teachers, ciwic/dmac begins with a focus on literacy instruction in light of technological changes in the production, circulation, and reception of texts.
during its second decade, ciwic began with this sequence of sessions:

• student writing characteristics
• assumptions about writing and the teaching of writing
• introducing michigan tech’s center for computer-assisted language instruction [the institute’s home lab]
• integrating computers into writing programs (ciwic, ; ciwic-aic & )

this progression clearly put the teaching of writing first, and positioned technology as a way to support instructional goals. dmac opening session titles indicate a similar stance:

• multimodality and literacy (dmac )
• making a brief/case for multimodal literacy (dmac )
• thinking about multimodal composition (dmac , , , & )
• why english teachers should think about digital technology, design, and multimodal composition (dmac , )
• using audio to think about multimodal composition (dmac, )

considering the rhetorical affordances of digital, multimodal texts (especially ones published online) also stands out as a major feature of the institute overall. sessions like “world wide wit” (ciwic, ), “hypertext fictions” (ciwic, ), and “collaborative web sites” (ciwic-aic, ) highlighted the playful, interactive types of content found online in order to introduce teachers to internet culture and suggest ways they could incorporate new genres (as well as new technologies) into their pedagogies. born-digital texts featured at ciwic-aic ( ), such as poems that go’s interactive flash poetry and early articles from kairos and computers and composition online, went further, modeling how the multimodal affordances and playful conventions of internet communication could weave textual aesthetics together with scholarly rigor. recent dmac institutes ( – ) have extended this focus on digital rhetorics, using susan h. delagrange’s work on visual argument through arrangement and multimodal revision to examine how digital texts redefine rhetoric and the composing process. discussion of delagrange’s work on twitter showed participants taking up her theories as
discussion of delagrange’s work on twitter showed participants taking up her theories as atchwords: • beauty and utility linked via “elegance” in digital projects: code, content, style. #dmac (hagood, c) • delagrange: techne=knowledge in the head + knowledge in the hand #dmac (rodrigue, b) hile design-oriented sessions like delagrange’s encouraged participants to think about production, dmac took p digital activism by way of krista bryson’s west virginia water crisis ( ) to add a focus on the social context, irculation, and reception of digital texts. participants read bryson’s blog, which gathered stories from west virginians ffected by the january elk river chemical spill to create a citizen-journalist counter narrative to corporate and overnment reports downplaying the spill’s severity. participant and staff tweets accompanying bryson’s discussion esponded to the issues her blog provoked about civic rhetoric, vernacular media production, social networking, epresentation, and public intellectualism: • twitter reached the talking heads; facebook reached the people @klbryson #dmac #digital activism (hancock, c) • digital activist pieces embrace interactivity; the “author” relinquishes control #dmac (conatser, ) • from @klbryson, we learn the simple yet potent power of wtinessing [sic] and representing to others. #dmac (selfe, b) • fascinating discussion about @klbryson’s digital activism and rhetorical roles as scholar / citizen / journalist / storyteller #dmac (sloan, ) j. voss / computers and composition ( ) – figure . ciwic/dmac first assignment sequences. these posters considered how digital texts circulate, interrogated the dynamics of ownership when public intellectuals and community members collaborate on digital texts, recalled ethical questions about qualitative research and repre- sentation, and expanded the conception of audience beyond classroom and profession to strive for community impact and political reform. 
using writing instruction as a point of entry, the institute asks participants to consider changing teaching environments (computer labs), new modes of composing (media production software), and new methods of distribution and reception (digital activism). these experiences and examples prompt participants to re-think, as bono’s ( ) epigraph illustrated, what they already know about writing and rhetoric, parsing what changes and what endures when composing moves into digital, interactive spaces.

. . collaborative and studio approaches to teaching with technology

throughout its history the institute’s pedagogical models have addressed teaching with technology as well as the conceptual questions delagrange and bryson raise about design and delivery. in her account of attending the institute, journet ( ) highlighted the applied pedagogical experience conferred by the institute’s approach to digital composing:

dmac provided much-needed support – readings and discussion, hands-on instruction, and one-on-one consultation. but it was not until i actually started to work with cameras and audio recorders, use editing software, or put together clips that i really began to learn how to compose with digital media. (p. )

to foreground producing digital texts, participants begin working on projects right away, using the activities journet mentioned. these assignments range from hand-coding a webpage (ciwic, ) to drafting a visual argument (ciwic-nm, ) to creating a -second audio announcement (dmac, ) to composing a -second video (dmac, ). as illustrated in figure below, these assignment sequences

• begin with a conceptual introduction that situates the project in terms of the rhetorical affordances of different media,
• specify a deliverable,
• offer demonstration and direct instruction, and
• provide lab/studio time for participants to do sustained work.
the experience of producing multimodal texts brings home the semiotic potential of digital media described in institute readings, as dmac participant tony o’keeffe ( ) explained: “what i began to understand after just a couple of days’ work with digital tools is what multimedia theorists commonly refer to as the various technologies’ ‘affordances’ – what each one makes it possible to achieve.” the “couple of days’ work” o’keeffe refers to plays out in lab time, during which attendees work on their projects alongside staff and other participants. live-tweets from dmac that record tips and strategies shared by participants demonstrate how lab time provides the kind of scaffolded, independent composing work journet and o’keeffe describe:

• pro tip from cindy @selfe on recording audio: in a pinch, your car is a pretty soundproof little recording studio. #dmac (dmacinstitute, )
• annotate pdfs with foxit (free) #dmac (rodrigue, a)
• youtube allows a “private with a link” option, so the vid isn’t in the search options #dmac (hancock, a)

although these tweets only capture some of the real-time consulting that goes on during lab/studio time, they speak to the kind of learning afforded by diving into production in the institutes’ resource-rich environment. working side-by-side in the lab also encourages participants to work together, feed off one another’s energy, ask for help, and seek feedback, as indicated by these observations about project development and the lab’s atmosphere:

• day of dmac, and we’ll [sic] already starting to produce interesting artifacts. audio literacy narratives. woot! #dmac (hagood, b)
• dramatic shift in the shape of my dmac project. love the way the collaborative setting encourages organic evolution! #dmac (miller, )
• special thanks to erin [a staff member] for helping so many of us in the lab today! finally started to get that multimodal “writer’s high”.. .
#dmac (parfitt, b)

these tweets echoed the ciwic ( ) participant’s epigraph, emphasizing the benefits of the institute’s supportive work environment. o’keeffe ( ) elaborated on the value of such co-present studio work: “multimedia work suggests an image borrowed from that physiology on which our lives depend: the systole of individual, solitary work which leads inescapably to the diastole of collaborative sharing, for both judgement [sic] and further development.” the institute’s studio environment doesn’t only provide help and trouble-shooting, but as o’keeffe and miller noted, also creates an environment of “supportive judgment” for developing and re-thinking projects.

similar to journet’s observation that she couldn’t really understand multimodal composing until she began creating multimedia texts, experiencing the institute’s studio approach is an important part of its pedagogical professional development. ciwic/dmac’s self-directed, scaffolded studio environment shows teachers how to support their own students’ media production by positioning teachers as students of digital composing. the institute balances direct instruction with individual trial and error, encourages one-on-one consulting with staff and other participants, and allows participants to observe others and experience themselves working at the limits of their competence.

. . considering pedagogical and curricular influence

the impact on individual teachers and scholars is one mark of the influence of the institute. however, tracing how these teachers and scholars have returned to their home institutions to shape curriculum, champion new technologies and labs, and more is another method of analyzing the influence of the institute. many alumni do teaching and administrative work that positions them to put ciwic/dmac ideas and techniques into practice at their home institutions, recorded in table below.
the majority of institute alumni ( ) whose work and profiles i reviewed for this article hold teaching positions, and of them report teaching classes that incorporate technology in a significant way. these figures suggest that alumni have the opportunity to apply ciwic/dmac pedagogies in their teaching. furthermore, table shows that numerous institute alumni also serve as writing program directors and curriculum developers, where their choices about curriculum, staffing, and assessment affect a great number of students. finally, institute alumni who run instructional technology centers and services point to additional ways ciwic/dmac’s approaches to technology and pedagogy can be disseminated across participants’ home institutions, potentially reaching beyond english departments and writing programs.

table . alumni teaching and administrative work after attending ciwic/dmac.
teaching:
• classroom teaching, any course or subject
• teaching courses on digital media composing or theories of media and technology
administration:
• directing writing program, writing across the curriculum program, or writing course
• serving on technology advisory committee or providing technology training
• directing instructional technology program, center, or department

ties to institutional and professional sponsors also encourage this kind of broader impact. many participants receive funding to attend ciwic/dmac, and are therefore accountable to their sponsors for directing technology training workshops, creating resources, or developing curricula once they return home.
bonnie newcomer ( ), who attended ciwic with the support of the kansas association of teachers of english (kate), explained how the responsibility to “bring ciwic home with you” (a recurrent session title throughout the institute’s history) fostered the spread of institute ideas:

the goal of the kate organization was for me to hobnob with english teachers from across the united states so i could bring back ideas to share with kansas teachers via newsletter and a presentation at kate conference [ ]. my goals were much the same with the additional goal of acting as a mentor for kansas teachers who were entering strange terrain as they left the world of paper and entered the world of bits and bytes.

further research is needed to study systematically whether and how institute ideas have been implemented at participants’ home institutions, but participant narratives and professional documents show that the conditions for such influence certainly exist.

. space, time, and socialization among participants

participants’ experience of the institute itself offers one approach to examining ciwic/dmac’s impact on attendees themselves. curricula and real-time comments show how the institute’s structure encourages attendees to socialize and network around the ideas and methods they encounter, deepening their engagement with institute content and increasing the likelihood that they will adopt it. to promote this kind of retention and reflection, the institute includes an extracurriculum of “official” social events, informal gatherings, and recommendations for local activities. indeed, working with others while navigating new technologies, new approaches, and new theoretical frames for thinking about digital writing requires the shaping of community at the institute and beyond (see boyle et al. and stewart in this special issue).
this extracurriculum allows participants to work with and re-encounter institute concepts with peers, potentially (as i will argue) encouraging them to work through and question institute ideas as part of their ongoing professional development.

ciwic’s “official” social events have included a picnic at a local park during the first few days of the institute and a dessert night (sometimes accompanied by a talent show) during the second week. strategically located away from the institute home base at michigan tech, these events provided an informal atmosphere for participants to talk about their work and shared interests, evident in photos from these events (see figure ). after the institute moved to ohio state in , the official social events shifted to include an evening potluck at co-director scott lloyd dewitt’s house and a party at selfe’s house, where (beginning in ) participants screened their short video projects. as participant tweets about the screening suggested, these events foster professional relationships between participants as they view and respond to one another’s work:

• the concepts in were so good! #dmac #proudtoknowallofyou (hancock, b)
• @llcadle your #conceptin was so beautiful. and moving. #dmac (vankooten, a)

figure . photographs from “official” ciwic extracurricular activities: institute picnic (day , ) [left]; participants talking [middle] and singing [right] at dessert night event at selfe’s house (day , ).

note: research by dmac alumni laura mcgrath and letizia guglielmo ( ) on the workshop they led to help english faculty at kennesaw state university to integrate multimodal composing in their teaching provides a rare example of this kind of research. mcgrath and guglielmo traced the pedagogies they presented at ksu all the way from the dmac institute they attended to the classrooms of colleagues who attended their workshop, suggesting at least one way to study institute impact on participants’ home institutions.

note: because of changes in dmac’s assignment structure, participant videos have been screened at evening events hosted by both dewitt ( – ) and selfe ( ).

the institute schedule also includes optional field trips and recommendations for after-hours activities that encourage participants to socialize. examples of these activities have included:

• cruise of the copper county area around houghton, mi (ciwic ; ciwic-aic ; ciwic-aic, )
• exhibition of digital artwork by wysocki (ciwic, , ciwic-aic, )
• drinks at a campus-area bar the first evening of the institute (ciwic, )
• gallery open house night in columbus, oh’s short north arts district (dmac, , , , )

a narrative poem written by laura bartlett, chidsey dickson, doug eyman, and colleen reilly ( ) about their experience at ciwic and set to lou reed’s “take a walk on the wild side” captured how these informal activities help create a convivial familiarity that fosters professional relationships. bartlett et al.’s allusion to reed’s song encompassed the unconventional work the authors did during institute sessions:

cindy [selfe] teaches us to pay attention in a world where there’s not enough invention.. . audacity ruled the aic visual set new media free (p. )

and the poem’s inside jokes captured the playful collegiality the institute encourages among participants:

cheryl [ball, ciwic associate director] never once led us astray she wrote her dissertation in one day drove us around in the van showed us the monks [of holy transfiguration skete in eagle harbor, mi] who made that jam and said hey babe.. . take a walk on the wild side hey doug, don’t you dare walk up that waterfall (p. )

while the bartlett et al. poem was playful and referred only obliquely to the writers’ academic work, contacts made at the institute can also underpin more explicitly professional ties between participants.
this twitter exchange between two dmac participants in anticipation of the computers and writing conference illustrated the kind of networking the institute can facilitate:

• @crystalvk excited to see dmac folk again! #networking #dmac #cwcon (ryerson, b)
• @rachaelryerson totally!! who else is going besides @selfe , @selfe , @harleyferris, and @myergeau? #dmac #cwcon (vankooten, b)
• @crystalvk @rachaelryerson @selfe @selfe @harleyferris @myergeau we will see but i think you got them all! #dmac #cwcon (ryerson, c)

ciwic/dmac’s extracurriculum provides time, space, and colleagues that extend participants’ engagement with institute ideas beyond the daily sessions, cultivating intellectual contacts that can lead to ongoing conversations about technology and literacy.

. critical perspectives on technology and the diversity of the field

while the institute’s extracurriculum focuses specifically on attendees as people, another way in which ciwic/dmac prioritizes people over technology comes from the critical social theories woven into the curriculum. these topics have varied widely over the years, due in particular to the institute’s inclusion of invited speakers and visiting scholars. these guest experts have allowed the institute to address topics ranging from intellectual property (ciwic, ) to usability (ciwic-aic, , ) to community literacies (dmac, , , ) to moocs (dmac, , ), delving into research that complicates and questions trends in computers and composition, higher education, and technology. i focus here on sessions selected from across ciwic and dmac that discussed the relationship of technology to gender, race, and accessibility in order to illustrate the intellectual diversity of the institute curricula, using participant texts to discuss attendees’ reception and uptake of these ideas.
alongside sessions focusing on material and ideological access to technology along lines of race and class, ciwic, featured a session led by guest speaker gail hawisher titled “women writing the web: graphic images at the century’s start.” at the session, hawisher shared research she conducted with patricia sullivan ( ) on the visual features of websites representing women, comparing the images of female bodies found in commercial, academic, and personal/professional sites. despite their interest in exploring the extent to which the web allowed women to cultivate multiple subjectivities and foreground embodied experience, hawisher and sullivan ( ) observed that the majority of commercial websites objectified women’s bodies as sexual objects (pp. – ), while academic sites ignored them to present women as disembodied minds (pp. – ). only a few hand-crafted personal/professional sites engaged in the kind of identity play and virtual embodiment that hawisher and sullivan theorized through the lens of materialist and cyberfeminism ( , pp. – ). presenting this research at ciwic allowed hawisher to alert participants to sexism in online images and call on attendees to resist these norms by foregrounding embodied, gendered experience in their own online digital presences.

note: both ciwic-aic and ciwic-nm participants attended hawisher’s session. the description of the session i provide here draws on hawisher and sullivan’s “fleeting images: women visually writing the web” ( ), which was assigned reading for this session and provided additional detail beyond the brief description of the session given in the ciwic-aic and -nm schedules.

note: the readings for moss and kinloch’s sessions included samantha blackmon’s ( ) research on using digital, multicultural pedagogies in majority-white classes; bruce sinclair’s ( ) work connecting african american history to the history of technology; kinloch’s ( ) research on black youths’ use of art to respond to gentrification in harlem; and moss’s ( ) work on approaches to studying community literacies in the twenty-first century.

similarly, a pair of sessions on race, technology, and representation led by beverly moss and valerie kinloch at dmac examined how research on technological literacy frequently glossed over race. their sessions drew on their own and others’ research to counteract this blind spot by foregrounding difference and exposing bias in the use and study of literacy technologies. in addition to the real-time conversations moss and kinloch facilitated during that day’s sessions, the discussion of race and technology continued after-hours on the institute blog, where participant douglas walls offered a series of reservations about digital media’s potential as a tool for empowerment. walls ( a) foregrounded race as an omnipresent and central aspect of theorizing technology:

• the stakes are higher for people of color dealing with technology in front of other people.
• new media, like literacy, can be used as a form of violence.
• scholars can learn from semiotic systems other than print based that are closer to the rhetoric [sic] structure of new media.
• identity is information and behaves like information in digital systems.
• new media doesn’t do us any good if it just replicates unjust power structures and is continued [sic] to be used to dehumanize folks.
in response, another participant, arr ( ) quoted walls to articulate an aspirational teaching philosophy that drew on his argument for the raced nature of technology:

“folks are categorized, labeled, placed into groups, associated through language, media, and informatics in complex ways.” one of the main ways those associations are made within our society is through race/racist logic; therefore, “the stakes are higher for people of color dealing with technology in front of other people.” in knowing this, as a teacher, i must “understand how information and media does work [to] understand how identity and point of view is constructed” as a way to interrogate my own invisible assumptions that may guide classroom practices.

this exchange showed dmac participants building on institute content and on each other’s responses to it in order to refine their ideas and practices. walls went on to pursue his critique of the emancipatory potential of new media in his dmac project, a visual collage/voiceover video that he later published in kairos. his video questioned the extent to which racist, imperialist tropes like authenticity could be “disrupted, resisted, and remixed into demi-humorous arguments or a quasi-academic piece of new media” (walls, b), rather than simply recirculated as existing racist stereotypes and replicated as existing material inequalities. and if new media is used in these ways, walls stated in impassioned tones, he would resist using them ( b). perhaps owing to the fast-paced and sarcastic tone of his mashup-manifesto (compared to the print articles discussed in moss’s and kinloch’s sessions), walls embraced a more radical stance toward technology than the institute readings or in-class discussions did, demonstrating the potential participant texts have to reinterpret institute ideas and recirculate them critically into the field.
while walls, arr, and other dmac participants considered race and technology in lengthy blog posts, dmac participants engaged visually with access, design, and multimodality. participants read melanie yergeau et al.’s webtext “multimodality in motion: disability & kairotic spaces” ( ), and discussed ableism in academia and elsewhere. following the session, selfe challenged participants: “can you integrate insights from melanie’s talk into your dmac discussions/projects/classes to help enact change? #dmac ” ( a). in response, participants tweeted photos and observations throughout the day to catalog accessibility issues they noticed in light of the morning’s discussion, shown in figure below. these tweets suggest that the participants were seeing the world in new ways, noticing access barriers that were previously invisible to them.

figure . dmac, participant photo-tweets responding to yergeau’s session on access (mathis, [left]; parfitt, a [middle]; andrewjaykinney, [right]).

. professional development at ciwic/dmac

the institute’s scholarly, critical orientation is also reflected in the professional development it provides. at ciwic and ciwic-aic , readings on the relationship between technology and race, class, gender, and sexuality shaped participants’ book reviews, teaching demonstrations, software critiques, and computer lab designs. these projects focused on preparing participants to build writing labs, teach effectively in them, and train others at their institutions to do so. beginning in , ciwic-nm asked participants to use tools like adobe photoshop, macromedia director, and macromedia flash to create interactive, born-digital texts: a visual argument, a sequential argument, a non-linear project ( only), and a final project of their own design. this fast-paced assignment sequence pushed participants to focus on their own (rather than students’) digital composing, marking an institute shift toward media production and research-oriented professional development.

note: for other examples of dmac alumni developing their final projects into digital publications, see lindemann & smith ( ) and omizo ( ). alumni have also published their short concept in videos on multimodality: see burns ( ) and perry ( ).

when the institute moved to ohio state in and became dmac, the new name signaled the central role participant-created media had come to play. dmac followed ciwic-aic’s shift (begun in ) toward incorporating multimodal composing, featuring assignments that paired audio and video recorders with computers. as in ciwic-nm, dmac participants designed and produced a final project, called “thinking about multimodal composition.” this assignment asked participants to create a born-digital multimedia text (using web authoring platforms like sophiebook, ibooks author, wordpress, or dreamweaver) that combined written content with other media assets to demonstrate the potential of multimodal texts (“thinking about multimodal composition,” ). in , the final project shifted again to narrow its focus, asking participants to reflect on the process of producing an earlier institute assignment (a short video) to examine both the process of multimodal composing and their video’s use of multimodal rhetoric (“final project,” ). this prompt paralleled the approach several institute alumni used to build publications around their institute projects by conducting meta-analyses of them (see mondor & rounsaville, ; kimme hea & turnley, ; lackey, ). the dmac final project underscores the scholarly, as well as pedagogical, outcomes of institute work, encouraging participants to see their assignments as potentially publishable projects.

another aspect of recent dmac curricula that highlights the institute’s growing attention to research-oriented professional development is the graduate workshop.
the visiting scholars and journal editors who have led this session since its inception in have used it to introduce participants to exemplary research in computers and composition and to discuss the opportunities and challenges inherent in digital scholarship. participants’ enthusiastic response to a list of digital publishing venues yergeau ( ) shared following the workshop suggested that attendees were taking up dmac’s emphasis on digital publishing, gathering resources to produce their own digital scholarship. the fact that % of dmac alumni compared to % of ciwic alumni report publishing born-digital scholarship suggests that the changes in institute curriculum (along with other factors) may well be fostering multimodal scholarship. the institute also promotes digital scholarship by educating participants about the concerns, costs, benefits, and evaluation of digital scholarship, whether or not these alumni go on to publish digitally themselves. numerous institute alumni are now tenured and senior faculty, responsible for assessing the work of colleagues applying for jobs, tenure, fellowships, and awards. these alumni can serve as the informed digital scholarship judges braun calls for ( , pp. – ) and share their knowledge with other senior colleagues, as journet recommended ( , pp. – ). as many as journal editors have also attended the institute, where they had the opportunity to learn about the demands, standards, and significance of research on computers and composition. it’s especially important to note that institute alumni edit/have edited publications that don’t focus explicitly on technology like college composition and communication, composition studies, the writing instructor, southern discourse, and disability studies quarterly, as well as field-specific publications like kairos and computers and composition. attending the institute helps editors (not just in computers and composition, but in other fields as well) evaluate the kinds of digital scholarship participants produce, helping to extend the reach of this type of work.

note: information about use of video in ciwic-aic beginning in from cheryl ball, dànielle devoss, cindy selfe, and scott dewitt, personal communication, august , .

note: although the name of the dmac final project has changed during this time (called a “literacy documentary project” in , “making a brief case for multimodal composition” in , and “final project” in ), the focus throughout is on using multiple media to make an argument that demonstrates the value of multimodality.

note: graduate workshop leaders have included kara poe alexander ( ), cheryl ball ( , ), kristine blair ( , – ), joseph harris ( ), debra journet ( , ), tony o’keeffe ( ), and melanie yergeau ( – ).

note: yergeau’s tweet (“google doc list of digital publication venues. pls add to lists! http://t.co/w rheqeig #dmac ” ( )) was retweeted times on the #dmac hashtag (by attendees, dmac alumni, and non-attendees) and favorited by people ( attendees, dmac alumni, and non-attendees).

note: these percentages reflect the number of institute alumni who have provided professional information online, not the total number of institute attendees (see note for additional information on institute attendance numbers and professional document totals).

. conclusions

as i argue in this text, institute and participant documents illustrate selfe’s often-quoted people/teaching/technology hierarchy of concerns. this dictum doesn’t seem to fully account, however, for the institute’s increasing emphasis on participants’ digital composing work in recent years.
while this shift is certainly a “personal” concern insofar as it relates to participants’ professional development, composing original texts also calls attendees to create new knowledge about digital composing, rhetoric, and pedagogy. the recent changes in dmac’s final project formalize this development by encouraging participants to look ahead to producing digital scholarship. furthermore, social media conversations and attendees’ publication records suggest that participants are taking up the institute’s invitation to publish their knowledge-making texts. just as the institute has long introduced newcomers to computers and composition through shared interests in pedagogy, its recent overt emphasis on research offers another avenue of entry. the final project, especially its most recent iteration as a meta-analysis of earlier institute assignments, encourages scholarship on learning to work with digital technologies, inviting participants to situate developmental dmac experiences within rhetorical, semiotic, and institutional frameworks. a project like this positions newcomers to contribute to the field by building on existing work in computers and composition as represented, for example, by the critical theories of technology discussed by guest speakers. whether their projects are pedagogical or publication-oriented, creating multimodal digital texts at dmac involves what journet, cheryl e. ball, and ryan trauman ( ) called “the new work of composing,” (re)defining what texts mean and how they work by negotiating the purposive, technical, and generic questions that digital texts pose for their creators. when participants use institute assignments in their teaching, they apply the embodied knowledge about affordances and processes of multimodal production they acquired at the institute, as journet ( ) and o’keeffe ( ) described.
and when participants develop institute assignments into publications, that kind of knowledge-making goes public in the field, producing work like walls’ ( b) video-collage, which plays with scholarly conventions and tone while engaging with critical theories of race, language, culture, and technology.

the institute’s emphasis on knowledge-making not only through teaching but also especially through research suggests adding to and resituating selfe’s people→teaching→technology dictum to better account for the digital texts participants create. figure proposes knowledge-making as a context within which to situate institute work and adds to selfe’s description of the institute’s mission. digital composing functions as the context within which the institute’s people/teaching/technology priorities play out, describing the conceptual and applied work participants do throughout the institute. and the emphasis recent dmacs place on digital publishing suggests that research, as well as teaching, should be represented among the institute’s primary concerns. people still come first, reflected in the community the institute cultivates; its attention to participants’ professional needs; and the prominent role played by critical theories of technology informed by embodied lenses of race, class, gender, and access. part of this attention to people, however, adds research to selfe’s maxim as an increasingly important professional concern for institute attendees (and academics generally) of all levels. i place teaching and research side-by-side in figure both to indicate the frequent overlap between teaching and research in computers and composition scholarship (reflecting the field’s history of classroom-based and pedagogically-oriented work) and to illustrate the prominence of these two topics.
and while technology per se ranks below these concerns, the rhetorical, semiotic, and ethical responsibilities that apply to digital composing are predicated on (balanced on) technical knowledge and choices. these technical decisions are guided by the composing task and commitments to persons and pedagogy/scholarship, however, and so technology remains a tertiary consideration.

figure . situating institute priorities within the context of knowledge-making through digital composing.

although the institute continues to prioritize people and teaching, it increasingly approaches these concerns through digital composing, folding in research as a complement to the pedagogical and administrative professional development ciwic/dmac has long provided. as perry’s (qtd. in hagood, ) epigraph suggested, this kind of comprehensive professional development has become an increasingly important part of “doing dmac,” and the digital production work participants do, whether as a teaching or research activity, plays a central role in fostering attendees’ participation both in the field of computers and composition and in their own institutions.

acknowledgements

i would like to thank many generous colleagues who helped with the research for this project, demonstrating yet again the generosity of this field and its interest in telling its history. thank you to cheryl ball, teena carnegie, trey conatser, dànielle devoss, and patricia lynne who helped me track down ciwic and dmac documents. i’d also like to thank susan delagrange, lisa lebduska, joe pounds, cindy and dickie selfe, and madeleine sorapure for talking to me about their memories of ciwic. and finally, thank you to the special issue editors and my writing group for invaluable feedback on drafts of this project and to my research assistant aidan mahoney, the internet detective who tracked down cvs for countless institute alumni.
julia voss is an assistant professor of english at santa clara university, where she teaches courses on multimodal composition, writing studies, and (digital) literacy. an alumna of dmac , her research has examined the relationship between childhood computer experiences and adult attitudes toward technology and investigated the physical, virtual, and social infrastructures that support multimodal digital composing work. she is currently studying the relationship between high- and low-tech classroom design features and student learning.

references

ciwic-nm calendar. ( ). computers in writing-intensive classrooms institute. houghton, mi.
dmac schedule. ( ). digital media and composition institute. retrieved from http://archive.today/nscqk
andrewjaykinney. ( , may ). how do the visually impaired identify office numbers in this building? #dmac [twitter post]. retrieved from https://twitter.com/andrewjaykinney/status/ /photo/
announcing a suite of summer writing institutes for teachers. ( ). computers in writing-intensive classrooms. retrieved from http://www.hu.mtu.edu/oldsites/ciwic/ /
arr. ( , june ). race/ethnicity and technology/new media studies: personal perspective [web log comment]. retrieved from http://dmac .blogspot.com/ / /raceethnicity-and-technologynew-media.html
bartlett, laura, dickson, chidsey, eyman, doug, & reilly, colleen. ( ). walk on the wild side (at ciwic!). computers and composition, ( ), – .
beck, estee. ( ). reflecting upon the past, sitting with the present, and charting our future: gail hawisher and cynthia selfe discussing the community of computers & composition. computers and composition, ( ), – .
blackmon, samantha. ( ). “but i’m just white” or how “other” pedagogies can benefit all students. in pamela takayoshi, & brian huot (eds.), teaching writing with computers: an introduction (pp. – ). new york, ny: houghton mifflin.
bono, j. james. ( , june ). things i have learned. [web log post]. retrieved from http://dmac .blogspot.com/
braun, catherine c. ( ). cultivating ecologies for digital media work: the case of english studies. carbondale, il: southern illinois university press.
bryson, krista. ( ). west virginia water crisis. [web log]. retrieved from http://wvwatercrisis.com/
burns, hugh. ( ). resolution in seconds. kairos, ( ). retrieved from http://technorhetoric.net/ . /disputatio/burns/index.html
ciwic participant. ( ). announcing a suite of summer institutes for teachers. in computers in writing-intensive classrooms. retrieved from http://www.hu.mtu.edu/oldsites/ciwic/ /index.htm
ciwic schedule. ( ). computers in writing-intensive classrooms institute. houghton, mi.
ciwic-aic schedule. ( ). computers in writing-intensive classrooms institute. houghton, mi.
ciwic-aic schedule. ( ). computers in writing-intensive classrooms institute. houghton, mi.
ciwic-nm schedule. ( ). computers in writing-intensive classrooms institute. houghton, mi.
computers in writing intensive classrooms: a summer workshop for teachers of english june l - , . ( a). computers and composition, ( ), .
computers in writing intensive classrooms: a summer workshop for teachers of english june - , . ( b). computers and composition, ( ), .
computers in writing-intensive classrooms: a summer workshop for english teachers. ( ). computers and composition, ( ), .
computers in writing-intensive classrooms: a summer workshop for teachers of english june – , . ( ). computers and composition, ( ), .
computers in writing-intensive classrooms: a summer workshop for teachers of english june – , . ( ). computers and composition, ( ), .
computers in writing-intensive classrooms: a summer workshop for teachers of english june – , . ( ). computers and composition, ( ), .
conatser, trey. ( , may ). digital activist pieces embrace interactivity; the “author” relinquishes control #dmac [twitter post]. retrieved from https://twitter.com/treyconatser/statuses/
day picnic, ciwic, . ( ). day —picnic, page . computers in writing classrooms. retrieved from http://web.archive.org/web/ /http://www.hu.mtu.edu/ciwic/ /picnic .htm
day dessert night at selfe’s house— people talking. ( ). dessert pics . computers in writing-intensive classrooms . retrieved from http://web.archive.org/web/ im_/http://www.hu.mtu.edu/ciwic/ /dessert.htm
day dessert night at selfe’s house— people singing. ( ). dessert pics . computers in writing-intensive classrooms . retrieved from http://web.archive.org/web/ im_/http://www.hu.mtu.edu/ciwic/ /dessert.htm
dmac schedule . ( ). digital media and composition institute. retrieved from http://archive.today/rqjl
dmac schedule. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/schedule .htm
dmac schedule. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/schedule .htm
dmac schedule. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/schedule .htm
dmac schedule. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/schedule .pdf
dmac schedule. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/schedule .htm
dmac schedule. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/schedule .htm
dmac schedule. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/schedule .htm
dmacinstitute. ( , may ). pro tip from cindy @selfe on recording audio: in a pinch, your car is a pretty soundproof little recording studio. #dmac [twitter post]. retrieved from https://twitter.com/dmacinstitute/status/
final project. ( ). digital media and composition institute. columbus, oh.
final project. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/assignments /dmacfinal.pdf
hagood, grace. ( a). dmac thanks. [video file]. retrieved from http://www.youtube.com/watch?v=kwutfmgkvs
hagood, grace. ( b, may ). day of dmac, and we’ll already starting to produce interesting artifacts. audio literacy narratives. woot! #dmac [twitter post]. retrieved from https://twitter.com/ghagood/status/
hagood, grace. ( c, june ). beauty and utility linked via “elegance” in digital projects: code, content, style. #dmac [twitter post]. retrieved from https://twitter.com/ghagood/status/
hancock, niki. ( a, may ). youtube allows a “private with a link” option, so the vid isn’t in the search options. #dmac [twitter post]. retrieved from http://twitter.com/hancockniki/statuses/
hancock, niki. ( b, may ). the concepts in were so good! #dmac #proudtoknowallofyou [twitter post]. retrieved from https://twitter.com/hancockniki/statuses/
hancock, niki. ( c, may ). twitter reached the talking heads; facebook reached the people @klbryson #dmac #digital activism [twitter post]. retrieved from https://twitter.com/hancockniki/statuses/
hawisher, gail e., & sullivan, patricia a. ( ). fleeting images: women visually writing the web. in gail e. hawisher, & cynthia l. selfe (eds.), passions, pedagogies and twenty-first century technologies (pp. – ). logan, ut: utah state university press.
hawisher, gail e., leblanc, paul, moran, charles, & selfe, cynthia l. ( ). computers and the teaching of writing in american higher education: a history. norwood, nj: ablex.
journet, debra. ( ). inventing myself in multimodality: encouraging senior faculty to use digital media. computers and composition, ( ), – .
journet, debra, ball, cheryl e., & trauman, ryan. ( ). the new work of the book in composition studies: an introduction. in debra journet, cheryl e. ball, & ryan trauman (eds.), the new work of composing. logan, ut: computers and composition digital press/utah state university press. retrieved from http://ccdigitalpress.org/nwc/chapters/journet-et-al/
kimme hea, amy c., & turnley, melinda. ( ). refiguring the interface agent: an exploration of productive tensions in new media composing. in cheryl e. ball, & james kalmbach (eds.), raw: (reading and writing) new media (pp. – ). cresskill, nj: hampton press.
kinloch, valerie. ( ). youth representations of community, art, and struggle in harlem. new directions for adult and continuing education, ( ), – .
lackey, dundee. ( , may ). publication idea? [web log post & comments]. retrieved from http://web.archive.org/web/ /http://www.ryantrauman.net/dmacblog/?
lindeman, lis, & smith, gregory o. ( ). literature and digital illumination. kairos, ( ). retrieved from http://kairos.technorhetoric.net/ . /topoi/lindeman-smith/index.html
literacy documentary project. ( ). digital media and composition institute. retrieved from http://web.archive.org/web/ /http://dmp.osu.edu/dmac/schedule .htm
making a brief/case for multimodal composition. ( ). digital media and composition institute. retrieved from http://web.archive.org/web/ /http://dmp.osu.edu/dmac/schedule .htm
mathis, keri. ( , may ). handicapped? must go around. even though handicapped parking is right here. #dmac [twitter post]. retrieved from https://twitter.com/kerielizabeth/status/ /photo/
mcgrath, laura. ( ). teaching new media, negotiating access. in cheryl e. ball, & james kalmbach (eds.), raw: (reading and writing) new media (pp. – ). cresskill, nj: hampton press.
mcgrath, laura, & guglielmo, letizia. ( ). supporting faculty in teaching the new work of composing: colleague-guided faculty development within an english department. the writing instructor. retrieved from http://www.writinginstructor.com/currentmoment-mcgrath-guglielmo
miller, paula. ( , june ). dramatic shift in the shape of my dmac project. love the way the collaborative setting encourages organic evolution! #dmac [tweet]. retrieved from https://twitter.com/paulammiller/status/
mondor, shannon, & rounsaville, angela. ( ). inventing/producing columbus: a new humanities remix. kairos, ( ). retrieved from http://technorhetoric.net/ . /disputatio/mondor/index.htm
moss, beverly j. ( ). writing in the wider community. in roger beard, debra myhill, jeni riley, & martin nystrand (eds.), the sage handbook of writing development (pp. – ). thousand oaks, ca: sage publications.
newcomer, bonnie. ( ). newcomer attends ciwic‘ : kate scholarship enables teacher to bring back computer-composition training. computers in writing-intensive classrooms . retrieved from http://web.archive.org/web/ /http://www.hu.mtu.edu/ciwic/ /liz.htm
o’keeffe, tony. ( ). mr. secrets. in debra journet, cheryl e. ball, & ryan trauman (eds.), the new work of composing. logan, ut: computers and composition digital press/utah state university press. retrieved from http://ccdigitalpress.org/nwc/chapters/okeeffe/
omizo, ryan. ( ). dmac theory. kairos, ( ). retrieved from http://kairos.technorhetoric.net/ . /disputatio/omizo/
parfitt, beth. ( a, may ). access locations. #dmac [twitter post]. retrieved from https://twitter.com/parfittbeth/status/ /photo/
parfitt, beth. ( b, may ). special thanks to erin for helping so many of us in the lab today! finally started to get that multimodal “writer’s high”. #dmac [twitter post]. retrieved from https://twitter.com/parfittbeth/status/
perry, kathryn. ( ). the movement of composition: dance and writing. kairos, ( ). retrieved from http://kairos.technorhetoric.net/ . /disputatio/perry/
rodrigue, tanya k. ( a, may ). annotate pdfs with foxit (free) #dmac [twitter post]. retrieved from https://twitter.com/tkrisr/status/
rodrigue, tanya k. ( b, may ). delagrange: techne=knowledge in the head + knowledge in the hand #dmac [twitter post]. retrieved from https://twitter.com/tkrisr/statuses/
ryerson, rachael. ( a, may ). #dmac in a nutshell: small potent gestures; .people . instruction . technology. [twitter post]. retrieved from http://twitter.com/rachaelryerson/statuses/
ryerson, rachael. ( b, june ). @crystalvk excited to see dmac folk again! #networking #dmac #cwcon [twitter post]. retrieved from https://twitter.com/rachaelryerson/statuses/
ryerson, rachael. ( c, june ). @crystalvk @rachaelryerson @selfe @selfe @harleyferris @myergeau we will see but i think you got them all! #dmac #cwcon [twitter post]. retrieved from https://twitter.com/rachaelryerson/statuses/
selfe, cynthia l. ( a, may ). dmac challenge: can you integrate insights from melanie’s talk into your dmac discussions/projects/classes to help enact change? #dmac [twitter post]. retrieved from https://twitter.com/selfe /statuses/
selfe, cynthia l. ( b, may ). from @klbryson, we learn the simple yet potent power of wtinessing and representing to others. #dmac [twitter post]. retrieved from https://twitter.com/selfe /statuses/
sinclair, bruce. ( ). integrating the histories of race and technology. in bruce sinclair (ed.), technology and the african american experience: needs and opportunities for study (pp. – ). cambridge, ma: mit press.
sloan, ryan russel. ( , may ). fascinating discussion about @klbryson’s digital activism and rhetorical roles as scholar /citizen /journalist /storyteller #dmac [twitter post]. retrieved from https://twitter.com/rrussellsloan/statuses/
thinking about multimodal composition. ( ). digital media and composition institute. retrieved from http://dmp.osu.edu/dmac/dmacproject .jpg
vankooten, crystal. ( a, may ). @llcadle your #conceptin was so beautiful. and moving. #dmac [twitter post]. retrieved from https://twitter.com/crystalvk/statuses/
http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref 
http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref 
http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://web.archive.org/web/ /http://www.ryantrauman.net/dmacblog/? http://kairos.technorhetoric.net/ . /topoi/lindeman-smith/index.html http://kairos.technorhetoric.net/ . /topoi/lindeman-smith/index.html http://web.archive.org/web/ /http://dmp.osu.edu/dmac/schedule .htm http://web.archive.org/web/ /http://dmp.osu.edu/dmac/schedule .htm https://twitter.com/kerielizabeth/status/ /photo/ http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref 
http://refhub.elsevier.com/s - ( ) - /sbref http://www.writinginstructor.com/currentmoment-mcgrath-guglielmo https://twitter.com/paulammiller/status/ http://technorhetoric.net/ . /disputatio/mondor/index.htm http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://web.archive.org/web/ /http://www.hu.mtu.edu/ciwic/ /liz.htm http://ccdigitalpress.org/nwc/chapters/okeeffe/ http://kairos.technorhetoric.net/ . 
/disputatio/omizo/ https://twitter.com/parfittbeth/status/ /photo/ https://twitter.com/parfittbeth/status/ /photo/ https://twitter.com/parfittbeth/status/ http://kairos.technorhetoric.net/ . /disputatio/perry/ https://twitter.com/tkrisr/status/ https://twitter.com/tkrisr/statuses/ http://twitter.com/rachaelryerson/statuses/ https://twitter.com/rachaelryerson/statuses/ https://twitter.com/rachaelryerson/statuses/ https://twitter.com/selfe /statuses/ http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref http://refhub.elsevier.com/s - ( ) - /sbref 
http://refhub.elsevier.com/s - ( ) - /sbref https://twitter.com/rrussellsloan/statuses/ https://twitter.com/crystalvk/statuses/ v w w y y j. voss / computers and composition ( ) – ankooten, crystal. ( b, june ). @rachaelryerson totally!! who else is going besides @selfe , @selfe , @harleyferris, and @myergeau? #dmac #cwcon [twitter post]. https://twitter.com/rachaelryerson/statuses/ alls, douglas. ( a, june ). race/ethnicity and technology/new media studies: personal perspective [web log post]. retrieved from http://dmac .blogspot.com/ / /raceethnicity-and-technologynew-media.html alls, douglas. ( ). an “a” word production: authentic design. kairos, ( ). retrieved from http://technorhetoric.net/ . /disputatio/ walls/index.htm ergeau, melanie, brewer, elizabeth, kerschbaum, stephanie, oswal, sushil k., price, margaret, salvo, michael j., & howes, franny. ( ). multimodality in motion: disability & kairotic spaces. kairos, ( ). retrieved from http://kairos.technorhetoric.net/ . /coverweb/yergeau- et-al/index.html ergeau, melanie. ( , may ). google doc list of digital publication venues. pls add to lists! http://tinyurl.com/ks hdyh#dmac [twitter post]. retrieved from https://twitter.com/myergeau/statuses/ https://twitter.com/rachaelryerson/statuses/ http://dmac .blogspot.com/ / /raceethnicity-and-technologynew-media.html http://technorhetoric.net/ . /disputatio/walls/index.htm http://technorhetoric.net/ . /disputatio/walls/index.htm http://kairos.technorhetoric.net/ . /coverweb/yergeau-et-al/index.html http://kairos.technorhetoric.net/ . /coverweb/yergeau-et-al/index.html http://tinyurl.com/ks hdyh https://twitter.com/myergeau/statuses/ to teach, critique, and compose: representing computers and composition through the ciwic/dmac institute a brief history of the ciwic/dmac institute, by way of its mission documenting ciwic/dmac's history presenting and modeling a writing-centered approach to teaching with technology . connecting old and new literacies . 
Qu’est-ce qu’un texte numérique? A new rationale for the digital representation of text

Joris J. van Zundert
Royal Netherlands Academy of Arts and Sciences, The Netherlands

Tara L. Andrews
University of Vienna, Austria

Abstract

In this article we aim to provide a minimally sufficient theoretical framework to argue that it is time for a re-conception of the notion of text in the field of digital textual scholarship. This should allow us to reconsider the ontological status of digital text, and that will ground future work discussing the specific analytical affordances offered by digital texts understood as digital texts. Following from the argument of Suzanne Briet regarding documentation, referring to Eco’s understanding of ‘infinite semiosis’, and accounting for the reciprocal effects between carrier technology and meaning observed by McLuhan, we argue that the functions of document and text are realized primarily by their fluid nature and by the dynamic character of their interpretation. To define the purpose of textual scholarship as a ‘stabilisation’ of text is therefore fallacious. The delusive focus on ‘stability’ and discrete ‘philological fact’ gives rise to a widespread belief in textual scholarship that digital texts can be treated simply as representations of print or manuscript texts.
On the contrary, digital texts are texts in and of themselves, in numerous digital models and data structures which may include, but are not limited to, text meant for graphical display on a screen. We conclude with the observation that philological treatment of these texts demands an adequate digital and/or computational literacy.

Correspondence: Joris J. van Zundert, Huygens Institute for the History of the Netherlands, Royal Netherlands Academy of Arts and Sciences, Amsterdam, The Netherlands. E-mail: joris.van.zundert@huygens.knaw.nl

Digital Scholarship in the Humanities, Vol. , Supplement , . © The Author . Published by Oxford University Press on behalf of EADH. All rights reserved. For permissions, please email: journals.permissions@oup.com. doi: . /llc/fqx. Advance Access published on August

Suzanne Briet asked the question ‘Qu’est-ce que la documentation?’, ruminating on what it is that documentation does, what constitutes a document, and what does not. She departed from a linguist–philosophical definition of ‘document’, namely ‘tout indice concret ou symbolique, conservé, ou enregistré, aux fins de représenter, de reconstituer ou de prouver un phénomène ou physique ou intellectuel’ [any concrete or symbolic sign, preserved or recorded for the purpose of representing, reconstituting, or proving a physical or intellectual phenomenon] (Briet, ), and ultimately proposed a new understanding of the concept of document, one much more fluid than the writings on paper that we usually associate with the term.

Une étoile est-elle un document? Un galet roulé par un torrent est-il un document? Un animal vivant est-il un document? Non. Mais sont des documents les photographies et les catalogues d’étoiles, les pierres d’un musée de minéralogie, les animaux catalogués et exposés dans un zoo. (Briet, )

[Is a star a document? Is a pebble rolled by a torrent a document? Is a living animal a document? No. But the photographs and catalogues of stars, the stones in a museum of mineralogy, and the animals catalogued and exhibited in a zoo are documents.]

A rock on the ground, for example, may have no significance for communicating information. The same stone as a part of a museum’s geological collection, on the other hand, may document the type of rock found in a certain geological layer or area. Her most famous example, that of an antelope, becomes documentation of its species as soon as it is captured by an explorer and housed in a zoo; even its corpse can be preserved and thus maintained as a document after its death. Briet in this way expanded our notion of the ontological and epistemological status of the concept of ‘document’.

For Briet, documentation ‘was a scientific activity of the greatest importance’ (Hearns Bishop, , p. ): a positive and unifying force, a foundational work for science, and an inscription technology that allowed knowledge to be codified, connected, and spread (Bede, ). Most salient to our argument is that documentation is situated; it is an act of interpretation made from a particular cultural and historic context (Hearns Bishop, , pp. – ). Thus, when Briet’s antelope becomes a document and becomes a part of the document we call taxonomy, it is also a constituent part of a specific culturally induced worldview. The antelope document inscribes the meaning of an antelope according to a certain specific culture.

Briet evokes the sheer prolific power of documentation to establish these meanings: ‘in our age [. . .] the least event, scientific or political, once it has been brought into public knowledge immediately becomes weighted down under a “vestment of documents”’: a new sub-species of antelope inspires a newspaper item, is described in various scientific articles, a specimen gets loaned to an exhibition, a taxonomic description is made, etc. (Briet, , ; Hearns Bishop, , p. ).
Briet thus advocated a shift in understanding of what a document is, urging her readers to focus on its function rather than the form in which one normally expects to find it. If we take such a purely functional view, then a document is anything that, on the material level, is used by humans to communicate information to other humans.

Briet’s understanding is thus that the concept of document is fluid. Documentation is not merely prolific, it is also transformative. Each act of documentation that sprouts yet more documents (which, as we now understand, can be of any kind) transforms the inscription of knowledge that the source document contained. Bishop, remarking on Briet in this respect, notes that Briet’s perspective is still important today ‘because her theories allow us to view a wide variety of information objects in terms of their relationships’. These relationships are transformative: documentation ‘is a surrogate artifact, it is an interpretation of the artifact. In effect, a new “artifact” in the form of documentation is created to serve as a surrogate for the artifact’ (Hearns Bishop, , p. ).

If, thanks to Briet, the concept of ‘document’ is a fluid one, then so must be the concept of ‘text’; while text is a feature commonly found in documents (indeed documents, in the broad understanding that Briet defined them, are the carriers of text), text is no more bound by the document than documentation is bound by the text it carries. However, that does not mean that text and document do not influence each other. The relationship between medium and message and between technology and meaning is reciprocal (McLuhan, ( )). As Alan Kay ( ) put it: ‘I had a very McLuhanish feeling about media and environments: that once we’ve shaped tools, in his words, they turn around and “reshape us.”’ The means of documentation in part shapes (or contributes to the shaping of) the interpretation that a document inscribes.
A taxonomy, for instance, inclines toward enforcement of its categories: the decision to put a specimen in a certain category depends not only on the judgment of, e.g., a biologist but also on the trade-off to be made between the convenience of the existence of a category that more or less fits and the effort it takes to create one that might perhaps be more fitting, as well as the ramifications that this new category might have for the overall structure of the taxonomy. The medium of any text tends likewise to influence the shaping of the text: inscription in stone demands austerity, and a digital text editor invites prolixity. The medium shapes the text; in the same vein, then, the medium shapes to an extent the meaning of the text.

The prolific nature of documentation Briet pointed to is also the prolific nature of information: cultures use media of any kind to proliferate information. The medial shaping of text is causally connected to the remediation that happens when we push information onto a new medium to communicate, store, transform, or analyze it (Bolter and Grusin, ). But text is volatile and stable in equal measure: even while the medium shapes a text, that text can be conveyed across multiple media and still be recognized and meaningful as the same text. Text thus adapts with great ease to any new medium, from oral to pictorial to inscribed to time- or digitally based storage media, and will permeate nearly any medium within a short amount of time.
Text is constantly moved, copied, translated, paraphrased, re-written, re-contextualized, and re-mixed. Each occurrence of any of these acts produces a new text, the result of the act, distinct from the text or texts that went into the act, and yet recognizable nevertheless as ‘the same text’. This is the textual condition, as it was defined by Jerome McGann ( ). We can understand text as a stable entity, but the harder we try to stabilize it, the more stubbornly it refuses to be bolted down.

Although philologists perceive the unwillingness of texts to be pinned down as ‘philological fact’ as a problem (McGann, ; Van Zundert, , pp. – ), for Umberto Eco this is their most salient feature, as it allows for ‘infinite semiosis’ (Eco, ). Eco argues that the sign has its roots in omens: natural phenomena that could be interpreted as predictors of natural events. Dark clouds forebode rain, and smoke above the forest signals a fire. This dynamic aspect of inference and interpretation is pivotal for the sign function. Only well after the invention of writing does an identity function begin to be attributed to signs as words; this process results from a unification of a theory of signs and one of language that, according to Eco, finds its culmination in the work of Augustine. In written language denotation, i.e. the assertion that a word (or sign) uniquely refers to some real-world antecedent, becomes strongly foregrounded. According to Eco: ‘problems derive from the fact that contemporary theories of sign have been dominated by a linguistic model, and a wrong one at that [. . .] where signs are conceived of as being intentionally emitted and conventionally coded, linked by a bi-conditional bond to their definition, subject to analysis in terms of lesser articulatory components, and syntagmatically disposed according to a linear sequence’ (Eco, , p. ).

He refers to C. S. Peirce to argue that a sign’s primary function is not to signify some identity with a real-world phenomenon; that is, a word as a sign is not strongly linked to one single meaning. Instead, a sign works through inference, or interpretation. Reading, interpretation, and understanding do not operate by scientific induction or deduction to make predictions about the meaning of a sign. Rather, these are abductive processes: given a word, a reader instantly hypothesizes about possible meanings of that word and of its relation to words in its context, based on her own tacit knowledge. Reading a text is therefore not the decoding of a sequence of identity relations from text to real-world objects and events but the construction of meaning through a process, executed by the reader, of structuring hypotheses.

It is this phenomenon of abduction-based reasoning about the meaning of signs in a text that drives the infinite process of interpretation and reinterpretation to which readers subject a text. This same phenomenon can be said to underlie Barthes’ ( ) concept of ‘writerly’ text: as a reader reads, she is constructing a new ‘cognitive’ text from a sequence of words. This text is ‘writerly’ because the reader is in a sense re-writing the text: she is both re-constructing its meaning and therefore also adorning and changing its meaning, because the hypothetical or abductive way of decoding signs allows for (or rather, is the direct cause of) variation in interpretation. This variation would not be possible if words (and signs) really only existed as a unique referential relationship of identity. Thus we cannot read but by interpretation. The text-as-signs is the inscription on the page, on the screen, or on the disk, but that is in a sense the least enticing aspect of text.
It could be argued that the serialization of a text as words on paper or as bits in electronic storage exists merely as an affordance for the reader to hypothesize about the meaning of the ‘actual’ text that is being cognitively constructed as she reads. It is clear that this new ‘writerly’ and mental representation of the text is an ephemeral one, existing only in the mind of the reader. It is moreover a text that is ontologically different from what might be recorded in signs on the page. In order for its meaning to be fully realized, a text must thus undergo an ontological shift, from existence as signs on paper to existence as a cognitive representation in a reader’s mind. How could we say that what is in the mind of the reader is ‘not’ a text?

Where the act of documenting creates a surrogate of a phenomenon or artifact, the act of reading produces a cognitive surrogate of a text that is itself already a surrogate: the signs on paper that represent a cognitive text that once existed in an author’s mind. To exist as communicated meaning, the text has undergone not one but two shifts in its ontological status, and arguably has become three texts along the way; and yet it is said that both author and reader have acted upon the same text.

Documentation codifies and inscribes how a certain culture understands its world. As Briet showed, the volatility of the concept of document and the prolific nature of documentation are fundamental to this function.
Without a rapid proliferation of copies, surrogates, derivatives, and remediations, documentation fails in one of its major purposes. Text inscribes information that codifies some understanding about the world. Its carriage by these rapidly proliferating documents makes text volatile with respect to form, medium, and (following McLuhan, ( )) meaning. The abductive process of interpreting and understanding texts, which according to Eco we cannot escape, makes their meaning yet more volatile. Moreover, to achieve its purpose of communicating some meaning, the volatility of texts (and quite possibly that of documents too) must also ‘fundamentally’ encompass a negotiation of boundaries between modes of being, that is, an ontological shift from being as signs-in-a-medium toward being as cognitive representation.

Thus, the functions of document and text are realized through dynamic processes. The idea, then, that the purpose of textual scholarship is to ‘stabilize’ a text is an audacious one. Even as the philologist works to stamp an authority on a particular version of a text, the text itself replicates cognitively with every exemplar considered, and its meaning shifts with its medium. The end result, of course, is a new text which can claim to be a faithful representation of the cognitive text of the editor, informed by the texts of the exemplars. This new text may stake a claim to supplant prior editions, but it is very difficult to argue that it supplants any of the exemplars, and even its claim to authority over prior editions can be questioned. Essentially, a new set of signs has sprung into existence that will produce yet more texts, each of which may be just as prolific as its siblings and ancestors. Thus in the quest for authority and stabilization of a text, the philologist cannot help but have a multiplicative effect.
in the time before the rise of digital scholarly editions, the sheer audacity of this multiplication in service to authoritative stability was not so clear. for as long as the system of print production and mass distribution of books has endured within textual scholarship, the authority of the particular inscribed version of a text that the philologist seeks to impose has been amplified by the inherent authority of that version gaining access to the academically and commercially controlled channels of distribution and replacing older versions on the shelves of libraries and bookstores.

the scholar who, on the other hand, goes on to make a digital edition drives the proliferation of the ontological status of text even farther, and farther perhaps than he or she realizes. peter shillingsburg has argued that even a simple digital transcription cannot but be an imprecise and often erroneous representation of a written text. the questions he raises defy simple answers:

but especially transcription (even ‘text only’ transcriptions) involves interpretation (is it an i or e? is it underlined or crossed out? is the obscured letter a k or a t? should that upright have been crossed as a t or is it an l? were the bushes lopped or topped?). and these questions about the text can be multiplied if one asks what is the meaning of underlining or italics (is it for emphasis or to indicate a foreign word, title of a book, or name of a ship?). and so, it is asked, can the surrogate be unmediated, representing exactly the original, such that the user need not see the physical document? and if transcription is always interpretive, is all the interpretive analysis by transcribers of a piece? is it futile to distinguish levels of intervention so that the decision about the e/i or t/l, the decision not to include crossed out words, the decision about the emphasis/ship’s name, and the decision to add links to related documents just a continuum of editorial intervention from minimal to unlimited? (shillingsburg, , p. )

qu’est-ce qu’un texte numérique? digital scholarship in the humanities, vol. , supplement

just as eco argues that contemporary theories of sign have been dominated by a linguistic model foregrounding the identity between sign and referent, leaving little space for the more fundamental operation of inference in the formation of meaning, we argue that contemporary theories in textual scholarship have been dominated by a foregrounding of the document and the sign as purely discrete (insofar as they are able to be typeset) attributes of text. shillingsburg’s statements show how even at the most intimate level of the glyph philological editing is seen as a process of abstraction and creation of discrete surrogates of what is in reality a multi-attributed material representation of text. moreover, his discussion of markup code, and his claim that it ‘tends to interfere with repurposing’ of a text, indicates that textual scholars are working in a medium whose properties and qualities they often do not yet fully grasp.

it is clear, as shillingsburg argues, that digital transcriptions are not simply ‘the text’ itself. even less so are digital editions, that is to say, digital versions of texts that go beyond simple transcription, or digital presentation of print-based editions. these are texts whose medium lends them qualities that defy translation to the physical or print medium, as observed by sahle ( ).
notwithstanding the claim to representation of a handwritten or print text that these editions generally make, these digital texts are not simple surrogates or stand-in artifacts for originals (hearns bishop, , p. ) nor are they merely philological evidence (cf. for instance greetham, , pp. , , ). to the degree that we can summarize briet’s argument as ‘infinite proliferation of the document’ and eco’s argument as ‘infinite proliferation of meaning’, we should regard these texts fully as texts in their own right.

despite the risks associated with the use of a medium that is not fully understood, shillingsburg advises us to ‘just do it’: to go forth and create the editions, to create a text which may be intended to represent a physical text but is nevertheless new, and digital. the scholar who takes his advice is confronted with the peculiar fluidity, perhaps the agency even, of the text that she has created. this is a feature of the digital medium in which we have chosen to work, one of its qualities that we, as yet, only dimly understand.

the digital medium can be transformative and discrete in its ambiguity. ascii art is an excellent example of this (e.g. fig. ). what exactly is the text here? how should it be read, and how should it be represented if it is remediated? this is, however, not to say that text must be digital to be fluid, or that the potential for such double entendre resides peculiarly in the digital. the commingling of linguistic sign and visual image that is usually expressed as ascii art today is in fact age old, as can be seen in fig. . this kind of glyphic art stresses not only the transmedial or fluid nature of text; it also points to its continuous character even in this discrete medium. that continuity of interpretation, certainly present within artistic expression before the computational age, becomes even more pronounced in some internet memes.
or consider emojis: in essence iconography, but now registered in the writing system called unicode, pinned down in meaning by a standard but infinitely variable in interpretation depending on the style of implementation of that standard. this is fluid transmediality at its finest: emojis are pictorial in nature and defined as glyphs, while ascii art is composed of glyphs and used to produce something pictorial. the transmedial, fluid, and continuous nature of text and therefore of ‘textuality’ is now far more obvious than it ever was. the digital environment amplifies the continuity of these natures, just as the digital environment seems to amplify all problematic qualities of text (o’donnell, ).

fig. ascii art, produced by the authors using the online text to ascii art generator (http://patorjk.com/software/taag/)

our claim of course is not that text is fluid. this is well known (cf. bryant, ; levy, ). we also consider it well established by now that the very concept of text is fluid. what we claim here is that the digitally enabled humanities have for the most part fallen into a habit of considering digital texts as mere digitized surrogates of non-digitally inscribed texts, that is, as documents. the height of sophistication of digital publishing among publishers still tends to be to offer a digital publication, meaning a pdf or an epub version of a book, even while the idea of so-called ‘born digital’ texts (files produced by text editors, blog posts, tweets, and so forth) has found traction throughout the world.
or, to put another analytic frame on this, for the most part digital textual scholarship seems to be stuck in a paradigm of remediation. bolter and grusin contend that all mediation is remediation, that is, all new media express themselves through the encapsulation of older media. ‘each new medium is justified because it fills a lack or repairs a fault in its predecessor, because it fulfills the unkept promise of an older medium’ (bolter and grusin, , p. ). indeed, the rhetoric surrounding the digital scholarly edition is revolutionary, whereas closer inspection of the practice reveals little novelty; most digital editions seem to be dutiful remediations of print publications (karlsson and malm, ).

when scholars speak of a ‘digital text’ what they usually have in mind is the visible rendering of a digitally inscribed text, which usually takes a form visually very similar to a physical text, allowing the option of re-inscription on, e.g., paper. the on-screen display of the digital representation is technically an interface to the digitally inscribed text, but from the ontological perspective of the scholar, it is ‘really’ an interface to a real or potential physical text. this conforms to the assertion by bolter and grusin that ‘digital media can never reach [a] state of transcendence, but will instead function in a constant dialectic with earlier media’ (bolter and grusin, , pp. – ).

yet perhaps the most significant remediation of text is not occurring at the level of the graphical interface. there may certainly be a mediation (in the original marxist analytic sense of the process of negotiating a balance of power between social groups) between scholars producing conventional print editions and those creating digital editions.
but the more important, yet less apparent, remediation is a similar renegotiation of what text is, between those scholars who understand digital text as the visualization in a graphical interface and those scholars and programmers who write, work with, and experience the digital code and models of text of which these visualizations are merely screen-oriented representations.

fig. fragment of british library, ms harley , f. r. © british library board, reproduced with permission of the british library board

representation for reading purposes only scratches the surface of what a digital text is. as soon as scholars set out to apply the first computational analyses to text and to create the first digital editions, text started to flow into the digital environment. as it did, it brought its textual condition to the digital environment, even as properties of the digital were imparted to the nature of text. the people who worked within the digital environment began to create a particular category of text that was digital in nature. programming languages, for instance, applied algebraic and textual constructs, so that they could be more easily read and applied (anon, ). these executable texts were then used to model databases, which in turn were used to model other texts into these databases (jones, ). computational linguists began compiling vast corpora of texts, using textual tags to annotate them, making them distinctly ‘different’ from the physical texts they were derived from.
all these textual constructs imported into the digital environment became products of their idiosyncratic environment, defined foremost by their ‘digital’ properties. our claim is that these texts belong to a distinct ontological category. they are true digital objects with inalienable digital properties. even a plain-text transcription is not a mere imitation of a real-world text, but should be considered as a text in and of itself.

until now we have not usually considered these texts as being texts in their true digital form. but what happens if we broaden our perspective to accept all of these as texts: databases, xml files in their xml form, source code in its legible form as well as in the form of the results of its execution, whatever visual form those results may take (cf. for instance fig. )? from this perspective, and with the help of hindsight, it becomes clear that the history of digital textual scholarship has been by and large one of ‘patching’ the perceived inadequacies of digital text to allow it to function more like ‘normal’ physical text, thereby inadvertently misunderstanding and disregarding the digital nature and existence of digital text.

the simplest form of digital text is arguably the string: a linear series of binary signals that encode characters according to some predefined table. its origin is connected to the physical and technical requirements of telegraphs and earlier signal transfer technologies such as semaphores (petzold, ). the ability to regard information as a unidimensional stream of discrete dichotomic bits was essential to the work of both turing ( ) and shannon ( ). they and we have nonetheless been aware all along that a linear series of characters can never capture the multidimensional properties of text. it cannot represent structure, semantics, relations, or perspectives internal or external to the text.
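the description of the string as a linear series of binary signals read through a predefined table can be made concrete with a minimal sketch. it assumes utf-8 as the table (the article names none), and the function names `to_bits` and `from_bits` are hypothetical, introduced here purely for illustration.

```python
def to_bits(text: str, encoding: str = "utf-8") -> str:
    """serialize text as one linear stream of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in text.encode(encoding))


def from_bits(bits: str, encoding: str = "utf-8") -> str:
    """cut the stream into octets and look them up in the encoding table."""
    octets = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return bytes(int(octet, 2) for octet in octets).decode(encoding)


# sample wording borrowed from the transcription questions quoted earlier
sample = "were the bushes lopped or topped?"
stream = to_bits(sample)

print(stream[:24])        # the first three octets of the stream
print(from_bits(stream))  # the characters recovered through the table
```

the stream records only the order of the characters. nothing in it marks an underlining, a doubtful letter, or a relation between passages, which is the limitation the paragraph above describes.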
because of this, the computational string was ‘patched’ to become a data structure, initially with typesetting codes to instruct printing machines on how to produce typographically beautified texts (goldfarb, ). markup and hyperlinks were invented at a later stage, patching the string to allow for more multidimensional connections within and between digital texts. markup in the form of html arguably became the most preferred of these patches, alongside xml in general and, in humanistic/scholarly contexts, tei in particular. these ‘patching’ technologies were developed to allow digital text to behave more like analogue text. they helped to remediate what we knew about the properties of ‘real-world’ text in the digital environment.

fig. examples of four texts: a so-called plain text, a json encoding of a manuscript transcription, the text of a javascript source code document, and a graph depiction of textual markup

arguably the most advanced ‘patch’ we have come up with so far is the knowledge graph. the graph as an interface to a real-world text has been gaining currency (andrews and macé, ; dekker et al., ; schmidt and colomb, ; van zundert and andrews, ). as a model for the representation of the multidimensionality of text, the graph model takes us far beyond the limitations of the linearity of the string. it also takes us well beyond the limitations of the hierarchy of the string-segmented-into-a-tree that is markup. this has considerable advantages for the digital ‘representation’ of real-world text, especially where that representation must encompass multidimensional aspects, such as ambiguity, narrative structure, variance, annotation, and so forth. nevertheless there is a further, more fundamental step beyond re-representation that should be taken.
this is a step that was in fact already taken when text crossed into the digital medium and a new ontological category was created, but we as textual scholars have failed to acknowledge it. as long as we keep treating digital texts as ‘models’ of text, digital models moreover whose only purpose of being is to depict themselves as digital re-representations of analogue texts, we deny these models their ontological status of actually ‘being-a-text’ in and of themselves. this is what we claim: the graph, the database, and the json-ld file that now are regularly created and maintained to function as data structures for the representation of text are in fact texts, and they should be considered as that: as texts.

that most scholars do not regard the idiosyncratic aspects of databases, graphs, text files, and so forth, as idiosyncratic properties of a kind and category of texts in its own right, is a consequence of digital text production still being rooted very firmly in a representational philosophy. almost all digital text production is geared toward recreating, within a digital environment, in a familiar guise, the comfortable and familiar aspects of continuous and fluid texts-in-the-physical-world. even while we have used digital text in this way, the properties of these digital ‘versions’, which is to say the digital properties of these texts in their own right, were unintentionally neglected. graphs, markup, and strings seen solely as representations of text-in-the-real-world will always strike us as inadequate on some level. as shillingsburg argued, it is not possible to make a perfect translation or copy of an analogue text into the digital realm. kirschenbaum ( ) has convincingly shown that digital texts are physical too and that we must acknowledge their materiality.
along the way he confirms that our digital models have a full claim to the status of texts, for they too are material texts: ones that require machine mediation to be read, and have therefore a different sort of materiality, but nevertheless still material and still ‘texts’.

in jerome mcgann, apparently driven to despair about the perceived volatility and ephemerality of digital texts, argued that textual scholars should regroup toward the philological–physical fact of the glyph on paper (mcgann, ). we argue here that scholarship should rather venture in the opposite direction, embracing digital texts for what they are: texts adorned with properties that are both inalienably textual and inalienably digital. david berry ( ) argues for the need to critically examine digital objects such as digital information streams, now that these objects increasingly help to constitute contemporary society and culture (cf. also jones, ). we would add to this the argument that code and digital data structures are included among the digital texts that increasingly constitute contemporary cultural artifacts and scholarship. these texts are thus worthy of our philological consideration. we call attention here to their ontological and epistemological status and import within textual scholarship.

we should indeed go even further: where berry (and others) call for the consideration of the ‘surface’ or the ‘interfaces’ of these data structures as digital objects in themselves, we contend that the data structures and models are themselves the objects worthy of our scholarly scrutiny. these are, after all, texts in themselves. when a scholar has modeled the semantics, the structure, or for that matter any characteristics of a text in a database, and she has added some logic or style sheets to depict a visualization of those characteristics onto the ‘canvas’ of a computer screen, then that depiction may be the representation of the text of some physical exemplar. in the process, however, new cognitive texts and new documentary evidence of these, in the form of those very database models and style sheets, were created. not only the visualization but also the digital objects that produce the visualization have become documents and texts in themselves.

the fluidity of document and the infinite semiosis of text cause a proliferation of documents and texts that each have inalienable unique properties that may be bound to the specific materiality and medium of the document and the text. in a scholarly context it is negligent not to acknowledge these idiosyncratic properties, and to regard them as mere inconvenient and unsatisfying incongruencies between the physical print text, the digital representation, and the digital model. these incongruencies are what make digital texts texts in their own right, and they point toward the differing ontological status of digital and print text. these texts cannot be and, in fact, actively resist being identical. claiming that they are, or can be, and that they are only representations of physical texts, and nothing more, is epistemologically shortsighted. none of the texts we produce can have an inherent scholarly primacy over the others, simply on the basis of its form: the print text says things that the tei encoding does not, the tei encoding says things that the json does not; the json says things that the graph does not; and saliently: vice versa. they are all texts, and their forms are intrinsically bound up in the expression of their essence.
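the claim that each form says things the others do not can be illustrated with a small sketch. everything in it is an assumption made for illustration: the sample reading (borrowed from the lopped/topped question quoted earlier), the field names, and the layout of the variant graph are hypothetical and are not taken from any edition or from the authors’ own data.

```python
import json

# 1. plain string: a single linear reading and nothing else
plain = "the bushes were lopped"

# 2. json: the same reading, plus the editor's note on a doubtful glyph
record = {
    "text": plain,
    "uncertain": [{"word": "lopped", "alternative": "topped"}],
}
encoded = json.dumps(record, sort_keys=True)

# 3. graph: both readings side by side, with no preferred one.
#    tokens serve as node names, which only works because none repeats.
graph = {
    "the": ["bushes"],
    "bushes": ["were"],
    "were": ["lopped", "topped"],
    "lopped": [],
    "topped": [],
}


def readings(node, graph):
    """enumerate every path from node to a node without successors."""
    if not graph[node]:
        return [[node]]
    return [[node] + rest for nxt in graph[node] for rest in readings(nxt, graph)]


all_readings = [" ".join(path) for path in readings("the", graph)]
print(all_readings)
```

in this sketch the json form records which reading the editor preferred, a statement the graph does not make, while the graph holds both readings as equals, which the plain string cannot do. none of the three can be derived from another without adding or discarding something.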
one reason that scholars have paid less attention to digital data structures and information models as texts in their own right may be that digital texts require their own specific literacy to be read and written. digital structures and objects are texts that contain programming code, or require programming code to be created, analyzed, visualized, etc. that is, these texts are made up in part of signs whose meanings scholars will recognize from other sorts of texts (e.g. characters, words, and syntactic and semantic structures), but they also consist of signs still rather alien to scholars without programming experience, such as string denotations, punctuation semantics, variables, loops, and subroutine statements. these signs, innate to the realm of code and computation, require a different, additional literacy to be fully understood and interpreted. the abilities to code and encode are necessary prerequisites, but computational literacy goes beyond learning the syntax and semantics of a particular programming language. as annette vee noted: ‘but, unfortunately, when “literacy” is connected to programming, it is often in unsophisticated ways: literacy as limited to reading and writing text; literacy divorced from social or historical context; literacy as an unmitigated form of progress’ (vee, , p. ).

vee argues that literacy refers to a set of skills without which one is no longer able to navigate one’s world. code and digital texts as technologies are not yet infrastructurally critical to textual scholarship. however, the textual scholar who does want to engage with digital texts as ‘digital’ texts requires a specific literacy. epigraphical literacy, codicological literacy, and computational literacy are essential in the understanding, respectively, of a stone inscription, of a medieval manuscript, and of a digital text, each one in its specific mode of being.
vee’s argument is the most recent in a discourse spanning at least four decades, which includes inter alia stephen ramsay ( ), john unsworth ( ), friedrich kittler ( ), and donald knuth ( ): a discourse that puts forth the argument that working with digital texts requires some proficiency in coding, and that this proficiency is easily recognizable as literacy: reading and writing, but of a different kind. roots of the argument can be traced back to the work of adele goldberg and alan kay, who were involved with the creation of smalltalk, which can be regarded as the mother of all object-oriented programming languages. kay and goldberg were specifically interested in how programming could be taught, an experience that profoundly influenced goldberg’s thinking on literacy, convincing her ‘that literacy should involve computing-based technologies and the expectation that our knowledge and skills will continually change, rather than define literacy as being pencil/paper/book-based’ (goldberg, , p. ).

however, literacy (be it computational or written-language literacy) cannot be reduced to the skills of reading and writing. kay’s sobering observation was that, after years of experience, the success of teaching computing literacy still depended on the ‘“hacker phenomenon”, that, for any given pursuit, a particular % of the population will jump into it naturally, while the % or so who can learn it in time do not find it at all natural’ (kay, , p. ). more salient however is another observation he makes: ‘the connection to literacy was painfully clear. it isn’t enough to just learn to read and write.
there is also a literature that renders ideas. language is used to read and write about them, but at some point the organization of ideas starts to dominate mere language abilities’. that is, literacy does not only consist of the basic skills of reading and writing a certain set of symbols. following eco, interpretation and understanding come from tacit knowledge-based inference. reading, writing, and thus also coding are about fluency of words and of symbols, whereas the fluency we need is a fluency in ideas and concepts. in the case of coding literacy, this means an experienced understanding of basic algorithms, coding constructs, and programming patterns, and it is a literacy that requires a number of years in training and experience, rather than a few months.

it is hard for scholars who lack this literacy to conceive of code and data structures as just another semiotics, another meaningful way to express texts. it is clear that being non-literate in code and encoding makes it extremely hard to appreciate ‘digital’ texts as what they are essentially: texts. what is then left is the mere use of code and data structures as another tool for representational approaches, for the depiction of a print or manuscript text in a digitized guise mimicking the exemplar as closely as possible. we lose sight of the fact that there are ‘native’ digital ways of looking at and working with digital texts, reading and writing them, when we remain within our representational philosophical confines. that limited understanding not only provokes us to concentrate almost exclusively on standards for representing texts; it also prohibits us from investigating the textual nature of the digital text. just as briet argued the epistemological status of the rock as document, we should grant the proper ontological and epistemological status to the digital objects that we have so far used merely for textual representation.
just as a rock can be a document, a serialization or a source code is certainly a text. references andrews, t.l. and macé, c. ( ). beyond the tree of texts: building an empirical model of scribal variation through graph analysis of texts and stemmata. literary and linguistic computing, ( ), – . anon ( ). preliminary report: specifications for the ibm mathematical formula translating system, fortran. new york, ny: international business machines cooperation. http://www.computerhistory. org/collections/catalog/ (accessed november ). barthes, r. ( ). s/z: an essay. new york, ny: hill and wang. bede, m.w. ( ). what is documentation? english translation of the classic french text, by suzanne briet (lanham, md: scarecrow, ). college and research libraries, , – . berry, d.m. ( ). critical theory and the digital. new york, ny; london; new delhi etc.: bloomsbury academic. bolter, j.d. and grusin, r. ( ). remediation: understanding new media. cambridge, ma: mit press. briet, s. ( ). qu’est-ce que la documentation? paris: édit. briet, s. ( ). what is documentation? english translation of the classic french text. lanham, md: scarecrow press. http://ella.slis.indiana.edu/�roday/ briet.htm (accessed october ). bryant, j. ( ). the fluid text: a theory of revision and editing for book and screen. university of michigan press. http://books.google.nl/books?id¼ w wpodpbu c. dekker, r., van hulle, d., middell, g., neyt, v., van zundert, j. ( ). computer supported collation of modern manuscripts: collatex and the beckett digital manuscript project. literary and linguistic computing, ( ), – . eco, u. ( ). the theory of signs and the role of the reader. the bulletin of the midwest modern language association, ( ), – . goldberg, a. ( ). oral history of adele goldberg. http:// archive.computerhistory.org/resources/access/text/ / / - - -acc.pdf (accessed november ). goldfarb, c.f. ( ). the roots of sgml—a personal recollection. http://www.sgmlsource.com/history/roots. 
htm (accessed august ). greetham, d. ( ). textual scholarship: an introduction. new york & london: garland publishing inc. hearns bishop, m. ( ). briet’s antelope: some thoughts on suzanne briet ( - ) and conserva- tion documentation. waac newsletter, ( ), – . qu’est-ce qu’un texte numérique? digital scholarship in the humanities, vol. , supplement , ii downloaded from https://academic.oup.com/dsh/article-abstract/ /suppl_ /ii / by guest on november deleted text: : deleted text: `` deleted text: '' deleted text: : deleted text: is deleted text: ; http://www.computerhistory.org/collections/catalog/ http://www.computerhistory.org/collections/catalog/ http://ella.slis.indiana.edu/∼roday/briet.htm http://ella.slis.indiana.edu/∼roday/briet.htm http://ella.slis.indiana.edu/∼roday/briet.htm http://books.google.nl/books?id= w wpodpbu c http://books.google.nl/books?id= w wpodpbu c http://archive.computerhistory.org/resources/access/text/ / / - - -acc.pdf http://archive.computerhistory.org/resources/access/text/ / / - - -acc.pdf http://archive.computerhistory.org/resources/access/text/ / / - - -acc.pdf http://www.sgmlsource.com/history/roots.htm http://www.sgmlsource.com/history/roots.htm jones, s.e. ( ). the emergence of the digital humanities. new york, ny; london: routledge. jones, s.e. ( ). roberto busa, s.j., and the emergence of humanities computing: the priest and the punched cards. new york, ny; london: routledge, taylor & francis group. karlsson, l. and malm, l. ( ). revolution or remedi- ation? a study of electronic scholarly editions on the web. human it, ( ), – . kay, a.c. ( ). the early history of smalltalk. acm sigplan notices, ( ), – . kirschenbaum, m. ( ). mechanisms: new media and the forensic imagination. cambridge, ma: mit press. kittler, f. ( ). es gibt keine software. in draculas vermächtmis. leipzig: reclam verlag, pp. – . knuth, d.e. ( ). literate programming. the computer journal, ( ), – . levy, d.m. ( ). fixed or fluid? 
document stability and new media. in echt proceedings of the acm european conference on hypermedia technology. edinburgh; new york, ny: acm press, pp. – . https://pdfs.semanticscholar.org/f /b af a e - c e d c da dcaf f .pdf (accessed march ). manovich, l. ( ). software takes command, vol. . new york, ny; london; new delhi etc.: bloomsbury academic. mcgann, j. ( ). the textual condition. princeton: princeton university press. mcgann, j. ( ). philology in a new key. critical inquiry, ( ), – . mcluhan, m. ( ). understanding media: the extensions of man (critical edition). in gordon, w. t. (ed.) (first published ). berkeley: gingko press. o’donnell, d.p. ( ). a first law of humanities com- puting? blog. http://people.uleth.ca/�daniel.odonnell/ blog/the-first-law-of-humanities-computing (accessed june ). petzold, c. ( ). code: the hidden language of computer hardware and software. redmond: microsoft press. ramsay, s. ( ). reading machines: toward an algorithmic criticism (topics in the digital humanities). chicago: university of illinois press. sahle, p. ( ). about ‘‘a catalog of: digital scholarly editions’’. http://www.digitale-edition.de/vlet-about. html (accessed november ). schmidt, d. and colomb, r. ( ). a data structure for representing multi-version texts online. international journal of human-computer studies, ( ), – . shannon, c.e. ( ). a mathematical theory of communi- cation. the bell system technical journal, ( ), – . shillingsburg, p. ( ). from physical to digital textual- ity: loss and gain in literary projects. cea critic, ( ), – . turing, a.m. ( ). on computable numbers, with an application to the entscheidungsproblem. proceedings of the london mathematical society, ( ), – . unsworth, j. ( ). what is humanities computing and what is not? in braungart, g., gendolla, p. and jannidis, f. (eds), jahrbuch für computerphilologie, . http://computerphilologie.digital-humanities.de/jg / unsworth.html (accessed july ). van zundert, j.j. ( ). 
author, editor, engineer – code & the rewriting of authorship in scholarly editing. interdisciplinary science reviews, ( ), – .
van zundert, j.j. and andrews, t.l. ( ). apparatus vs. graph: new models and interfaces for text. in hadler, f. and haupt, j. (eds.) interface critique. kaleidogramme. berlin: kulturverlag kadmos.
vee, a. ( ). understanding computer programming as a literacy. lics, ( ), – .
woolgar, s., and cooper, g. ( ). do artefacts have ambivalence? moses’ bridges, winner’s bridges and other urban legends in s&ts. social studies of science, ( ), – .

note

following the debate surrounding the potential politics and agency of artifacts (cf. woolgar and cooper, ), some form of agency for documents and (thus) texts can be assumed. although we would not attribute direct agency to texts, artifacts (and thus documents and texts) may effectuate a ‘deferred’ agency of, for instance, an author. this notion of deferred agency relates to poetic notions such as catharsis and, e.g., bertolt brecht’s ideas on theater as a political forum. among others manovich ( ), berry ( ), and van zundert ( ) argue that ideas on such deferred agency actually are very relevant with respect to the recent form of text that software code is.

during the victorian period. based on our current preoccupation with defining what we mean when we talk about victorian materiality through “thing theory” and other object-based methodologies, this labor continues to haunt us today.

notes

daniel hack, the material interests of the victorian novel (charlottesville: university of virginia press, ), .
jacques derrida, specters of marx: the state of debt, the work of mourning and the new international, trans. peggy kamuf (new york: routledge, ), .
verax [j. j. g. wilkinson], “evenings with mr. home and the spirits,” spiritual herald (february ): .
j[ames] burns, “the work of the spiritualist, and how to do it?” medium and daybreak, november , , – .
rev. w. mountford, “thoughts on spiritualism,” spiritual magazine (november ): – .
m. a. oxon [william stainton moses], “after all, is there any such thing as matter?” human nature (may ): .
oxon, “after all,” .
epes sargent, the proof palpable of immortality, nd ed. (boston: colby and rich, ), .
“how do spirits make themselves visible?” spiritual magazine (june ): .
karl marx, capital, vol. , trans. ben fowkes (new york: penguin, ), .
media
alison byerly

although the term “media” postdates the victorian period, victorian culture was suffused with media. in fact, mediation, broadly defined, was a defining aesthetic of the period, and one could argue that the field of media studies properly begins with the nineteenth century.

downloaded from https://www.cambridge.org/core. carnegie mellon university, on apr at : : , subject to the cambridge core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/ . /s

many victorian art forms sought to expand the boundaries of their medium by incorporating other media. in the world of visual art, pre-raphaelite painters created pictures based on poems, or, like dante gabriel rossetti, wrote poems to accompany paintings. later in the century, photographer julia margaret cameron created photographs to illustrate alfred tennyson’s idylls of the king, among other literary works. these artworks aligned representation with reproduction by disseminating the original work far beyond its original instantiation, a popularizing move that anticipated later technologies of transmission. as martin meisel demonstrated in realizations: narrative, pictorial, and theatrical arts in nineteenth-century england, theater was profoundly influenced by visual art and vice versa. theatrical renderings of famous paintings led to an in-home version of this form of mediation, tableaux vivants, a kind of parlor game in which guests posed as famous paintings. a pivotal scene in george eliot’s daniel deronda adds another layer of mediation by offering a literary depiction of an amateur actress posing as a painting. nineteenth-century “program music” sought to create narratives or evoke scenes, as in hector berlioz’s symphonie fantastique ( ) or paul dukas’s the sorcerer’s apprentice ( ).

commercial media also blossomed during this period.
the reduced cost and improved quality of printing led to a boom in mass distribution of paper advertising products, such as leaflets, brochures, and pamphlets, as well as paper novelties, like paper dolls, cardboard toy theaters, fold-out panoramas, greeting cards, and cartes de visite featuring photographic portraits. trade cards, which first came into use in the eighteenth century, became more lavishly illustrated forms of advertisement. as anne mcclintock, thomas richards, and jennifer wicke have shown, the influence of advertising media on literature and society was far-reaching.

the victorians also invented entirely new forms of mass media, such as the panorama or diorama, an entertainment staple of the period. enormous -degree paintings filled large venues with painstakingly detailed renderings of scenes depicting great cities, such as london, paris, rome, or constantinople; great battles; or even actual journeys, through what became known as “moving panoramas” that recreated a trip down the rhine or mississippi. often, a narrator would offer a reminiscence or commentary that enhanced the documentary value of the representation. other theatrical trappings sought to create a “you are there” sense of immersion in the scene. panoramas were often accompanied by lectures, performances, guidebooks, maps, and other ancillary productions, making them one of the earliest instances of multiplatform entertainment.
i have argued that many forms of victorian textual representation should also be considered media, and that the confluence and overlap among these forms gives literature a place within the trajectory of media development that leads from panoramas, through cinema, to contemporary virtual reality and similarly immersive media experiences. all of these forms show an evolution towards increasing realism and sense of presence. the major strategies of nineteenth-century fiction strive for these same qualities. the ingratiating stance of the narrator, the cinematic rendering of landscape, and, above all, the self-reflexivity of victorian fiction contribute to a sense of immersion in the text that is analogous to many of the strategies performed by other media.

victorian fiction reflects the great interest in emerging technologies of communication. telegraphy plays a prominent role in works by arthur conan doyle, thomas hardy, henry james, bram stoker, and others. as stephen arata noted in his seminal article on dracula, telegrams, typewriters, and stenographic machines are crucial to generating the texts that form the basis of the novel. a number of victorian scholars have compared victorian communication networks, such as the telegraph and the postal service, to contemporary communication systems.

jay david bolter and richard grusin popularized the term “remediation” to describe the tendency of new media to reconceptualize and refashion old media forms. new media, they claimed, do not kill off their antecedents but rather absorb them into new modes of representation. we see this process at work in the victorian period and beyond, in the way that photographs mimic the qualities of visual and theatrical art, while turn-of-the-century cinema continues to employ many of the strategies of the midcentury panorama display. these transitions are consistent with the kind of evolution described by henry jenkins and david thorburn, who see media change as an “accretive, gradual process . . . in which emerging and established systems interact, shift, and collude with one another.”

the victorian obsession with media may account for the twenty-first century obsession with re-presenting victorian texts and themes in the most contemporary media. in addition to a continuing stream of jane austen and charles dickens film adaptations, there is the wildly popular bbc sherlock, which takes media technology as a central theme and translates it from the nineteenth century to the present. holmes’ frequent telegrams become texts, and sherlock’s laptop computer serves as a visual metaphor for his encyclopedic brain. there have also been a number of victorian video games, including sherlock holmes: crimes and punishments (focus home interactive, ), victoria: an empire under the sun (paradox interactive, ), and victoria ii (paradox, ). and, of course, the growing field of digital scholarship related to the victorian period is a further testament to the forward compatibility of victorian art.

the field of victorian studies has always recognized the dynamic interconnections among different forms of art and culture in the period. treating these forms of representation as “media” highlights their reflexivity, broad dissemination, and focus on engaging audiences, and it underscores the degree to which they foreshadow the evolution of many contemporary technologies of communication and representation.

notes
martin meisel, realizations: narrative, pictorial, and theatrical arts in nineteenth-century england (princeton: princeton university press, ).
george eliot, daniel deronda (harmondsworth: penguin, ).
anne mcclintock, imperial leather: race, gender, and sexuality in the colonial contest (new york: routledge, ); thomas richards, the commodity culture of victorian england: advertising and spectacle, to (stanford: stanford university press, ); jennifer wicke, advertising fictions: literature, advertising, and social reading (new york: columbia university press, ).
alison byerly, are we there yet? virtual travel and victorian realism (ann arbor: university of michigan press, ); richard altick, the shows of london (cambridge: harvard university press, ).
stephen arata, “the occidental tourist: dracula and the anxiety of reverse colonization,” victorian studies , no. ( ): – .
laura otis, networking: communicating with bodies and machines in the nineteenth century (ann arbor: university of michigan press, ); catherine golden, posting it: the victorian revolution in letter writing (gainesville: university press of florida, ); richard menke, telegraphic realism: victorian fiction and other information systems (stanford: stanford university press, ); jonathan grossman, charles dickens’ networks: public transport and the novel (oxford: oxford university press, ).
jay david bolter and richard grusin, remediation: understanding new media (cambridge: mit press, ).
henry jenkins and david thorburn, rethinking media change: the aesthetics of transition (cambridge: mit press, ), x.
medicine
mary wilson carpenter

steven shapin has observed that although we live in a scientific culture, most of this culture’s inhabitants have little idea of what scientists do and know. by contrast, not only do we live in a medicalized culture, but as charles e. rosenberg comments, “for most of us today, physicians and lay persons alike, medicine is what doctors do and what doctors believe (and what they prescribe for the rest of us).” most of us today have direct, personal knowledge of what doctors do and know.

this major cultural difference between “science” and “medicine” emerged in the nineteenth century when medical practice became part of everyday life. science inhabited a much more elite sphere. victorians read about science and scientists, but they did not have a family scientist who practiced science on them. they did have family doctors or, if they were poor, poor law doctors. the victorian poor were also likely to experience hospital medicine, as more and more voluntary hospitals, supported by donations and open to the poor, were founded. by the last quarter of the century, more and more middle- and upper-class patients were also entering hospitals as private patients.

it was in the nineteenth century that a medical profession first emerged as such. in the early part of the century, medicine and surgery were practiced by a conglomerate bunch of apothecaries, apprentice-trained surgeons who might or might not have had any formal instruction in surgery or experience in hospitals, and oxbridge physicians who were erudite in greek and latin medicine but might never have treated a live patient until they went into practice. by the end of the nineteenth century, legislation had imposed standards requiring university medical education and hospital training, and efforts (largely unsuccessful) were made to define and exclude “quacks.”

collaborative authorship in the twelfth century: a stylometric study of hildegard of bingen and guibert of gembloux

mike kestemont
institute for the study of literature in the low countries & clips computational linguistics group, university of antwerp, belgium

sara moens and jeroen deploige
history department, ghent university, belgium

abstract

hildegard of bingen ( – ) is one of the most influential female authors of the middle ages. from the point of view of computational stylistics, the oeuvre attributed to hildegard is fascinating. hildegard dictated her texts to secretaries in latin, a language whose grammatical subtleties she had not fully mastered. she therefore allowed her scribes to correct her spelling and grammar. hildegard’s last collaborator in particular, guibert of gembloux, seems to have considerably reworked her works during his secretaryship. whereas her other scribes were only allowed to make superficial linguistic changes, hildegard would have permitted guibert to render her language stylistically more elegant. in this article, we focus on two shorter texts: the visio ad guibertum missa and visio de sancto martino, both of which hildegard allegedly authored during guibert’s secretaryship. we analyze a corpus containing the letter collections of hildegard, guibert, and bernard of clairvaux using a number of common stylometric techniques.
we discuss our results in the light of the synergy hypothesis, suggesting that texts resulting from collaboration can display a style markedly different from that of the collaborating authors. finally, we demonstrate that guibert must have reworked the disputed visionary texts allegedly authored by hildegard to such an extent that style-oriented computational procedures attribute the texts to guibert.

correspondence: mike kestemont, institute for the study of literature in the low countries & clips computational linguistics group, university of antwerp, belgium. email: mike.kestemont@gmail.com

digital scholarship in the humanities, vol. , no. , . © the author . published by oxford university press on behalf of allc. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqt advance access published on october .

introduction

since the end of the s, literary studies have seen a clear shift of focus from the analysis of authorial intentions to reader-oriented criticism. the repudiation of the modern idea of autonomous authorship has perhaps gone furthest in medieval studies, with the rise, since the late s, of material philology (nichols, ). medievalists have become increasingly aware of the importance of manuscript culture in their understanding of texts: medieval texts should not primarily be studied, it is argued, as abstract entities resulting from authorial ambitions, but rather as tangible objects, materialized in specific manuscript contexts. every material manifestation of a text is unique, because the acts of copying and compiling nearly always resulted in textual changes, from minor changes in orthography to complete rewritings.
our modern post-romantic conception of authorship therefore seems profoundly anachronistic with respect to the middle ages (cerquiglini, , p. – ). yet, even if medieval culture did not share our present-day view on the significance of original authorship, the middle ages have known many respected and authoritative individuals who were recognized by their contemporaries and later readers as producers of very specific literary works. some kind of correlation even existed between the degree to which texts were susceptible to alterations and the religious and intellectual authority of their authors (deploige, ). this did not mean, however, that such recognized authors were necessarily acting individually in the process of conceiving their treatises or narratives; quite the contrary.

writing in the middle ages meant entering into a dialogue with a long line of predecessors, whether through citations, paraphrasing, or allusions. in the actual process of literary composition too, medieval authors only seldom worked alone. a ‘new’ text could be the result of drafts on wax tablets copied by professional scribes, of processes of dictation and subsequent correction, etc. a twelfth-century authority like the cistercian abbot bernard of clairvaux ( – ), one of the most prolific and influential medieval authors, is known to have been surrounded by a team of secretaries. for his sermons and letters in particular, he was assisted by a number of collaborators to whom he could dictate his messages or who were asked to produce texts in accordance with his own views. some of his collaborators were even trained in imitating his writing style, thus facilitating bernard’s work of final editing or correcting (leclercq, ; , pp. – ).

in the case of the remarkably few medieval female authors known to us, the role of secretaries and collaborators is even more intricate.
women writers like the german nuns hildegard of bingen ( – ) or elizabeth of schönau ( – ) were considered unlearned and incapable of independently writing down their visionary experiences, even if these were ‘divinely inspired’. these women therefore had to be assisted by male collaborators, often also serving as their spiritual directors. the precise nature and implications of such cross-gender collaborations remain a topic of scholarly debate.

the immediate incentive for the present article is the preparation of a new critical edition of two lesser known texts attributed to hildegard of bingen, supposedly dating from the last years of her life: the visio de sancto martino, which is conceived as a letter addressed to the worshippers of saint martin, and the visio ad guibertum missa, containing spiritual advice to an anonymous monk-priest, generally identified as her last secretary, guibert of gembloux ( – ) (deploige and moens, forthcoming). among the few scholars who paid attention to these texts, there is still no consensus as to the extent to which they should be attributed to either hildegard herself or to her collaborator guibert. as neither traditional stylistic analysis nor contextual historical research has so far been able to resolve the problem, we will approach this issue through a stylometric analysis.

we will focus on three research questions. first, does stylometry allow for an authorial differentiation between the writings of twelfth-century latin authors, belonging to highly similar intellectual circles? to answer this question, we will investigate the letter collections or epistolaria of hildegard of bingen, her secretary guibert of gembloux, and their famous contemporary, bernard of clairvaux.
our aim is to assess to what extent we can distinguish stylistic profiles for these authors, despite the marked variance within medieval manuscript culture (cerquiglini, ), as well as the fact that these authors, like many of their contemporaries, were often assisted by secretaries. next, we wish to analyze in more detail to what extent we can discern the influence of her last secretary, guibert of gembloux, in hildegard’s epistolary work. did her style undergo detectable stylistic changes under the editorial assistance of guibert, or does the same homogeneous authorial voice appear throughout her epistolary work? finally, we will assess the complex question to which author we should attribute, at least on stylistic grounds, the visiones at stake in this article.

in answering these research questions, we do not aim to develop novel stylometric techniques. the originality of this research is to be found in our application of a number of well-established techniques to assess their feasibility when dealing with medieval latin texts, a textual tradition that until now has only rarely received attention in computational authorship attribution. before addressing these issues, we will first briefly introduce the state of research with respect to the so-called mitarbeiter problem in the hildegard scholarship.

‘uneducated in the art of grammar’

the benedictine nun hildegard of bingen was one of the most productive female authors of the middle ages (newman, ). after a youth as anchoress at the abbey of the monks of disibodenberg in the rhineland near mainz, she ended up as abbess of her own convent at the nearby rupertsberg. her extensive oeuvre includes genres as diverse as visionary books, letters, hagiographical texts, treatises on monastic life, musical compositions, and some works on physics and medical healing.
considered a true prophetess, receiving revelations and admonitions from god, she enjoyed a special status, even in the highest ecclesiastical milieux. her extensive circle of correspondents, comprising, among others, popes and the emperor, testifies to her prophetic reputation. she was therefore able to gain an authority unprecedented for a woman, enabling her to even criticize the male clergy of her time. among the first to approve her visionary gift was bernard of clairvaux, in a letter answering her request for support. her female authorship was built on her recognition as a mouthpiece of god, which caused her to present herself during her entire life as a poor and uneducated woman, uneducated precisely because she was a woman (deploige, ). in one of her vitae, her biographer guibert of gembloux specifies that she was ‘uneducated as to her schooling in the art of grammar’ (derolez, – , p. ).

her status, both as a woman and an allegedly unlearned prophetess who may not have had the same type of schooling as young monks, meant that throughout her life hildegard had to be assisted by secretaries (ferrante, ). her first and principal secretary was volmar of disibodenberg, who remained her close associate until his death in . he assisted in the redaction of the majority of her works. as we can learn from a famous miniature in the now lost manuscript (henceforth ms) wiesbaden, landesbibliothek, , dating from the end of her life, hildegard dictated and wrote drafts on wax tablets, which were subsequently copied on parchment and linguistically ‘polished’ in accordance with the rules of grammar (fig. ). in addition, several rupertsberg nuns must have aided their abbess as scribes during this period, given the number of known manuscripts produced in rupertsberg under hildegard’s supervision (embach, , p. , – , , – ; herwegen, , p. – ).
after volmar’s death, hildegard had to complete her last major visionary cycle, the liber divinorum operum (‘book of the divine works’), with more occasional assistance by a number of different collaborators from her immediate circle of spiritual acquaintances (herwegen, , p. – ). at the very end of her life, however, she was unexpectedly joined by guibert, a monk from the abbey of gembloux in brabant (present-day belgium). himself a fervent letter writer and hagiographer (moens, ), he served as her secretary from until her death in (delehaye, ; ferrante, , p. – ).

while even the authenticity of her authorship did not go uncontested until the seminal work by schrader and führkötter ( ), much scholarly effort has also been devoted to the precise role of hildegard’s secretaries. just as for other female writers working under the direction of father confessors (coakley, ), the question has been raised to what extent hildegard’s secretaries interfered with the final versions of her works, possibly generating male, clerical interpretations rather than original female viewpoints. following the pioneering research by herwegen ( ), most specialists now agree that the role of hildegard’s collaborators was restricted to minor grammatical and stylistic alterations. generally speaking, they had to copy her words verbatim unless they received hildegard’s explicit authorization for corrections (schrader and führkötter, , p. – ; ferrante, , p. ).

fig. ms wiesbaden, landesbibliothek, , fol. r. (lost since ). photo: rheinisches bildarchiv köln

it is generally assumed, however, that hildegard must have granted a somewhat greater liberty to guibert, who only entered into her life when she was already at the very advanced age of . although their involvement was short, guibert nevertheless had a significant impact on hildegard’s literary legacy. for example, he may have assisted her as one of the correctors in the final redaction of the liber divinorum operum, of which ms ghent, university library, (fig. ), can be considered the autograph copy most true to hildegard’s own words (derolez and dronke, , pp. xci–xciv). he also aided her in both the writing and compilation of portions of her epistolarium. on the basis of manuscript evidence, content, and dating, we can distinguish in hildegard’s letter collection a part that must have been written and compiled with the help of volmar and another group of letters that must have been written or transmitted under guibert’s supervision. last but not least, guibert is also thought to have directed the compilation of the so-called riesenkodex (ms wiesbaden, landesbibliothek, ), the manuscript in which, by the end of her life, hildegard had collected all the authorized versions of her works (van acker, , pp. – ).

two suspect visions

the visio de sancto martino (‘vision of saint martin’) and visio ad guibertum missa (‘vision sent to guibert’), which are at stake in this article, cannot be found in the riesenkodex. they are only preserved in three manuscripts that can be linked to the abbey of gembloux and guibert’s own oeuvre. therefore, both texts are traditionally not included in the core of hildegard’s canon (schrader and führkötter, , p. ; embach, , p. ). whereas the titles in the manuscripts (fig. ), as well as guibert’s accompanying letters, firmly attribute these visiones to hildegard, there are good reasons to suspect that guibert must have been extensively involved in their final redaction. the figure of saint martin for instance, the main topic of the visio de sancto martino, is entirely absent from hildegard’s oeuvre.
guibert, on the other hand, developed a lifelong fascination for this saint and devoted nearly half of his life to spreading his cult. the visio ad guibertum missa discusses the role of the priest as well as the topic of literary collaboration, both issues of direct relevance to guibert. moreover, the end of the latter text contains a passage of particular interest in which hildegard grants guibert the exceptional right to revise her texts more fundamentally than simply at the level of style and grammar:

when you correct [the visio de sancto martino] and the other works, in the emending of which your love kindly supports my deficiency, you should keep to this rule: that adding, subtracting, and changing nothing, you apply your skill only to make corrections where the order or the rules of correct latin are violated. or if you prefer (and this is something i have conceded in this letter beyond my normal practice) you need not hesitate to clothe the whole sequence of the vision in a more becoming garment of speech, preserving the true sense in every part. for even as foods nourishing in themselves do not appeal to the appetite unless they are seasoned somehow, so writings, although full of salutary advice, displease ears accustomed to an urbane style if they are not recommended by some color of eloquence (translated by newman, , p. ).

with this statement, hildegard allegedly granted guibert editorial privileges that she had not allowed any other previous collaborator. the passage also prompted scholars to have a closer look at the authorship, style, and content of these visionary texts. already in his edition, pitra voiced doubts with respect to hildegard’s alleged authorship. he stated that guibert, if not their original author altogether, must at least have reworked the texts profoundly.
pitra based his verdict on a number of syntactical features, on metaphors which he considered typical of guibert, and on the extensive insertion of biblical quotations (pitra, , p. – , ). herwegen remained more cautious: although he accepted that guibert had refined the texts stylistically, he still discerned hildegard’s authorial voice shimmering through guibert’s multiple corrections. he recognized hildegard’s genius in the overall structure of the visions and in some typically hildegardian vocabulary. he also rejected pitra’s assertion that the numerous biblical quotations could only have been inserted by guibert (herwegen, , p. – ). newman recently stated that the visio ad guibertum missa was ‘written by guibert in hildegard’s persona’ (newman, , p. ), although van acker ( , p. ) and coakley ( , p. ) continued to consider hildegard as the text’s author and guibert as a mere stylistic reviser.

these assertions concerning the authorship of the visiones seem to have been predominantly based on subjective appreciations of style and content, and the arguments used in this debate remain, at best, intuitive. the appearance of a new critical edition of the visiones once more put the question of their authorship at the forefront: should the texts be regarded as hildegardian or pseudo-hildegardian? stylometric methods may provide a more objective basis for disentangling the issue and for reassessing the nature of guibert’s secretaryship.

corpus preparation

for the present study, brepols publishers generously provided a digital corpus containing the nearly complete works of hildegard, guibert, and bernard of clairvaux. we obtained these texts in raw format, corresponding to the way they are included in the brepols electronic library of latin texts, on the basis of modern critical editions.
fortunately, these editions are all based on manuscripts that were compiled under the supervision of the original authors or at least in their close vicinity, so that we do not have to worry about major scribal interventions. the fact that all three authors in our corpus were productive letter writers rendered their epistolaria an attractive point of departure. moreover, the two short visionary texts of dubious origin that are at issue in this article are mostly comparable with hildegard’s letters with respect to length, topics, and manuscript tradition. obviously, we restricted our authors’ letter collections to the letters they wrote themselves, leaving aside the letters that were merely addressed to them and that were usually contained in the same manuscripts (constable, ). for bernard, this resulted in a sub-corpus of , words and for guibert of , words. hildegard’s letter collection contained , words, , of which are contained in the part compiled with the help of her first secretary volmar, while the remaining , words constitute the letters that, as discussed earlier, have most probably been edited in some way by guibert.

fig. ms brussels, royal library, – , fol. v. epistula domine hildegardis magistre cenobii sancti roberti pinguensis de excellentia beati martini episcopi – ‘letter of lady hildegard, magistra of the monastery of saint rupert in bingen, on the excellence of the blessed bishop martin’

medieval latin is characterized by unstable orthography. as even a single scribe often used different spellings for the same word, modern editors already tend to silently normalize minor orthographic variants. we have normalized the orthography in our corpus even further via lemmatization, a useful procedure in stylometry for medieval texts (kestemont et al., ). the texts were first tokenized using the natural language toolkit (bird et al., ).
the coordinating conjunction -que (‘and’) was not realized as a separate word in medieval latin, but it was appended to the preceding word (e.g. terra aquaque, ‘land and water’). to automatically isolate the clitic, we have stripped the suffix (‘xque’) from every word that did not occur in a list of words proposed by schinke et al. ( , p. – ). we have also split up the medieval contraction of the reflexive pronoun se and the idiomatic reinforcement ipsum in seipsum (or teipsum, teipsam, etc.).

a number of specific character combinations were freely interchangeable in medieval latin, such as ph for f, v for u, oe or ae for e (or for ę, the so-called ‘e caudata’) (rigg, ). we have therefore lifted the difference between v and u, as well as between ae, oe, and e, by replacing every v with u and every ae and oe with e. for the substitution of ae and oe by e, this actually meant that we were sometimes forced to erase the distinction between grammatically important morphemes (e.g. between the masculine vocative singular domine and the feminine nominative plural dominae). yet, this was unavoidable, as a good deal of the aes and oes in our corpus were already contracted to es, making it nearly impossible to automatically normalize them the other way round.

subsequently, we checked whether the surface tokens in our corpus were present in a large and representative word list from the perseus project (tufts university). when a token was not, we used a permutation algorithm to generate plausible spelling variants for it. if one of these newly generated forms was contained in the word list, the original form was replaced by its newly generated counterpart. to generate these variants, we constructed an array with all possible variations for the consecutive character groups. next, we combined these options through the cartesian product in the matrix by means of a permutation algorithm (kestemont et al., ).
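the normalization steps described above (clitic isolation, splitting of pronoun contractions, and the v/u and ae/oe/e substitutions) can be illustrated with a minimal python sketch. it is not the code used for the study: the exception list `QUE_EXCEPTIONS` is a short hypothetical stand-in for the list of schinke et al., the pronoun prefixes are an illustrative subset, and emitting the clitic as a separate `xque` token is an assumption based on the label used in the text.

```python
# Hypothetical stand-in for the exception list of Schinke et al.:
# words ending in -que that are NOT "word + enclitic".
QUE_EXCEPTIONS = {"atque", "quoque", "usque", "neque", "itaque", "quisque"}

# Illustrative subset of pronouns that contract with a form of ipse.
PRONOUN_PREFIXES = ("se", "te", "me")


def normalize_spelling(token):
    """Collapse v/u and ae/oe/e, as described in the text."""
    token = token.lower().replace("v", "u")
    return token.replace("ae", "e").replace("oe", "e")


def split_token(token):
    """Normalize one token and split off contractions and the -que clitic."""
    token = normalize_spelling(token)
    # seipsum -> se + ipsum, teipsam -> te + ipsam, ...
    for prefix in PRONOUN_PREFIXES:
        rest = token[len(prefix):]
        if token.startswith(prefix) and rest.startswith("ips") and len(rest) >= 4:
            return [prefix, rest]
    # aquaque -> aqua + xque, unless the word is a listed exception
    if token.endswith("que") and len(token) > 3 and token not in QUE_EXCEPTIONS:
        return [token[:-3], "xque"]
    return [token]


def normalize(tokens):
    """Apply split_token to every token and flatten the result."""
    out = []
    for token in tokens:
        out.extend(split_token(token))
    return out


print(normalize(["terra", "aquaque", "seipsum", "dominae", "vidi", "atque"]))
```

note that the substitution step also turns dominae into domine, which is exactly the loss of a grammatical distinction acknowledged in the text.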
table lists the series of common alternative character combinations we have considered, loosely based on rigg ( ). an example matrix for a word like chirographum would be: {[c], [h | Ø], [i | y], [r], [o], [g], [r], [a], [ph | f], [u], [m]}. all unique, alternative word spellings that can be generated on the basis of the matrix are: chirographum, cirographum, chyrographum, cyrographum, chirografum, cirografum, chyrografum, and cyrografum.

finally, we automatically annotated the tokens with lemmas using the medieval index thomisticus treebank (it-tb: passarotti and dell’orletta, ) as training material (ca. , tokens; ca. , sentences). for the lemmatization of our corpus we have used morfette (chrupala et al., ). unlike other popular lemmatization tools, such as treetagger (schmid, ), morfette also lemmatizes input tokens that the tagger did not already encounter verbatim in the training data. morfette considers pairs of input tokens and lemmas in the training material. from these pairs it learns ‘shortest edit scripts’ or ways to transform tokens into their lemmas using character insertions, deletions, and replacements. an annotated sample from the visio ad guibertum missa is listed as an example (table ), illustrating how this procedure did not manage to identify all lemmas correctly. especially content words that are not typical of thomas aquinas’s scholastic vocabulary were not always recognized. for the function words used in our analyses (see below), this problem was fortunately hardly an issue.

table interchangeable medieval latin character combinations allowed in our permutation algorithm
ci vs. ti
ch vs. h
ph vs. f
h vs. Ø
w vs. uu vs. vv vs. uv vs. vu
i vs. j vs. y
k vs. c vs. ch
g vs. gu
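the variant generation for chirographum can be reproduced with a short python sketch. the matrix is copied from the example in the text; the function names and the word-list lookup are illustrative assumptions rather than the original implementation, and the empty string stands for Ø.

```python
from itertools import product

# Slot-by-slot options for "chirographum", taken from the example matrix.
# The empty string represents Ø (the character may be absent).
CHIROGRAPHUM_MATRIX = [
    ["c"], ["h", ""], ["i", "y"], ["r"], ["o"], ["g"],
    ["r"], ["a"], ["ph", "f"], ["u"], ["m"],
]


def spelling_variants(matrix):
    """Pick one option per slot (Cartesian product) and return unique spellings."""
    return sorted({"".join(choice) for choice in product(*matrix)})


def first_known_variant(matrix, word_list):
    """Return the first generated variant present in word_list, or None."""
    for variant in spelling_variants(matrix):
        if variant in word_list:
            return variant
    return None


print(spelling_variants(CHIROGRAPHUM_MATRIX))
print(first_known_variant(CHIROGRAPHUM_MATRIX, {"chirographum"}))
```

three slots with two options each yield 2 × 2 × 2 = 8 spellings, which matches the eight variants listed in the text.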
feature selection

today’s stylometry has become an umbrella term for a still growing number of techniques for authorship analysis. each of these has been the subject of both criticism and praise, making it hard to discern a consensus on best practice in this field. for this research too, we had to balance the pros and cons of a number of tried and tested methodologies. recent studies still tend to agree on the undeniable methodological advantages of using function words in authorship attribution (binongo, , p. ). an author’s use of function words is said, for instance, to be relatively unaffected by a text’s topic or genre. (dis-)similarities between texts regarding function words are therefore to a certain extent content-independent and can be more easily associated with authorship than e.g. content words or other topic-specific stylistics (juola, , p. – ). numerous empirical studies have effectively demonstrated that analyses of the high-frequency strata of function words yield reliable indications about a text’s authorship (koppel et al., , p. – ; stamatatos, , p. – ). in this research, we have therefore restricted our analyses to function words, using a number of approved methods, many of them implemented in the publicly available script suite ‘stylometry with r’ (eder et al., ).

preliminary analyses showed that the upper tail of the frequency spectrum in our corpus still contained a good deal of content-rich lemmas. among the ca. most frequent lemmas in our entire corpus, listed in table , we came across multiple topic-specific nouns like deus, dominus, sanctus, . . . and verbs like facio, uideo, uiuo, . . . the inclusion of such lemmas obviously reflects the corpus’s fairly specific, religious semantics. it is also related, however, to the simple fact that a highly inflected language like latin with its many declensions makes less use of function words than weakly inflected languages like english.
a third explanatory factor might be the fact that we worked with the frequencies of lemmas instead of surface forms. it thus seemed advisable to remove these content words from our data tables. the content-rich words we chose to remove are marked by a hashtag (#) in table . the words followed by an asterisk (*) in the same table are non-reflexive personal pronouns, which are also often culled in stylometry to avoid the intrusion of genre-related or topic-specific features. naturally, a collection of letters will contain more instances of the second-person pronouns tu/vos (‘you’) or tuus/vester (‘your’) than a saint’s life. in our analyses, we have deleted this kind of pronoun. just as in table , one can still distinguish a certain number of wrongly lemmatized tokens in table . the surface form sui, for example, often seems to have remained unchanged, whereas it should have been transformed into suus. this particular error, however, is neutralized by our elimination of non-reflexive personal pronouns. in sum, our culling of the lemmas in table resulted in function words, which form the basis for the actual analyses.

it should be noted, however, that character n-grams might have been an attractive additional feature type for our research, as these have often been shown to be excellent features in authorship attribution (koppel et al., , p. – ; stamatatos, , p. – ). this method, which does not require any kind of normalization or lemmatization, segments texts into consecutive, partially overlapping groups of n characters: the word ‘bigram’ for instance contains the bigrams ‘_b’, ‘bi’, ‘ig’, ‘gr’, ‘ra’, ‘am’, ‘m_’.
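the segmentation into character n-grams can be made concrete with a few lines of python. this is a minimal sketch; the function name and the use of `_` as word-boundary marker follow the example in the text but are otherwise our own choices.

```python
def char_ngrams(word, n=2, pad="_"):
    """Return the overlapping character n-grams of a word, with boundary padding."""
    padded = pad * (n - 1) + word + pad * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


print(char_ngrams("bigram"))
# ['_b', 'bi', 'ig', 'gr', 'ra', 'am', 'm_']
```

because every n-gram is cut from the surface string, two spellings of the same word (for instance a form with ae and its counterpart with e) produce different feature sets, which is the sensitivity to orthography at stake in this section.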
contrary to a word-level approach, character n-grams are also sensitive to stylistic information below the word level, like case endings or other grammatical morphemes that are not realized as separate words (rybicki and eder, , p. ). latin, for instance, is a heavily inflected language that makes use of affixes to mark the grammatical functions of words, ‘by iron, not by sword’ being for example ‘ferro non gladio’ (sapir, , ch. vi). therefore, it would have made sense to additionally study the character n-grams in the corpus. however, one runs into the aforementioned problem that historical languages are characterized by unstable orthography (piotrowski, ). although latin spelling variation seems to have been less pronounced than in vernacular medieval languages, it does constitute a serious issue. when comparing two texts written by the same author, surviving in manuscripts with a strongly divergent orthography, stylometric methods may detect artificially large differences. conversely, and likewise due to scribal interference, texts of non-identical authorial provenance may show artificial similarities when they survive in manuscripts with a similar orthographical profile. in medieval manuscripts, we might even find inconsistent word spellings for the same words throughout the same text (rigg, ). this ultimately implies that an approach based on character n-grams is inadvisable for medieval latin (cf. kestemont and van dalen oskam, ).

table example of lemmatization based on morfette
original      lemma       translation
in            in          ‘in’
uisionem      uisio       ‘vision’
anime         anima       ‘soul’
mee           meus        ‘my’
,             /           /
uidi          uideo       ‘i see’
ingentem      ingentem    not recognized [ingens = ‘gigantic’]
rutilantis    rutilo      ‘glow’
ignis         ignis       ‘fire’
nubem         nubem       not recognized [nubes = ‘cloud’]
translation: ‘in a vision of my soul, i saw a gigantic cloud of glowing fire.’
unfortunately, this means that our approach based on lemmatization cannot take into account stylistic subtleties below the word level (e.g. indicative versus subjunctive mood, as expressed in inflectional endings). however, we will demonstrate that our method is still able to harvest sufficient stylistic information from the texts. indirectly, our results will therefore even serve to emphasize how much grammatical information is in fact still expressed by isolated function words in medieval latin.

table most frequent lemmas in the corpus (# = content words; * = non-reflexive pronouns)
et             e              quoniam        #caritas       #consilium     contra
qui            uel            #uerbum        #uenio         #rex           #pono
in             #possum        aut            quasi          dum            #amicus
#sum           pro            idem           scilicet       #talis         #honor
non            quam           super          #causa         #ceterus       #nomen
#tu*           #uester*       #terra         #manus         #caro          uelut
#is*           autem          #uolo          #iustitia      #fides         ante
#ego*          #multus        nunc           #modus         #res           #ta
#deus          #habeo         iam            #primus        #paruus        #iudicium
ad             ne             #uita          semper         apud           usque
hic            #sanctus       ac             #audio         #pax           quantum
sed            enim           #cor           #mundus        #salus         #lex
ut             etiam          #nam           #debeo         siue           #fidelis
de             #noster*       #do            #uiuo          #eternus       #sol
#suus*         #uerus         #solus         #cado          #inuenio       #celestis
#ille*         #uideo         unde           inter          #frater        #potior
a              sicut          quidem         #o             #uir           uidelicet
cum            #alius         tam            #diligo        magis          tunc
quod           ita            propter        #uoluntas      #fors          #angelus
ipse           tamen          #quidam        #gloria        #us            #diuinus
#tuus*         #filius        #bonus         quoque         #certus        #summus
#omnis         #spiritus      ergo           atque          #loquor        #ideo
si             #christus      #tempus        #aliqui        #uox           #prior
#sui*          #bonum         sine           #malum         #iustus        #populus
per            #ecclesia      nisi           #mens          post           #episcopus
#facio         #opus          #unus          #oculus        #misericordia  #similis
#homo          xque           #dies          #nihil         #celum         #os
#dico          sic            #nullus        #secundum      adhuc          #nouus
quia           #magnus        ubi            #pars          #domus         #tantum
#dominus       #iste*         #corpus        #mors          #uis           #uia
#meus*         #anima         #locus         #peccatum      #beatus        licet
nec            #pater         #uirtus        #scio          #quomodo       #predico
#quis          #gratia        #totus         #hildegars     #ueritas       #fratres
#duo           #quero
testing principal components analysis

the first stylometric technique we adopt is principal components analysis (pca), a procedure derived from multivariate statistics and commonly used to reduce the dimensionality of a data set (binongo, ). by combining the original variables of a data table into new, uncorrelated compound variables or ‘principal components’, pca is able to summarize large and complex data sets into insightful lower-dimensional scatterplots. when applied to the frequencies of high-frequency items in texts, this technique often successfully reveals the authorial structure in a data set. pca’s good performance in authorship attribution is due to the fact that it explicitly tries to model correlations between word frequencies. especially the frequencies of function words show complex correlations that are related to stylistic, arguably authorial choices between small sets of alternative options. a mere visual inspection of the samples’ positions in pca scatterplots often shows that samples written by the same author will cluster, whereas groups of samples written by distinct authors lie further apart.

because of the considerable size of the epistolaria in the corpus, we could start with a large sample size of , lemmatized words per sample. recent research has demonstrated that the accuracy of most authorship attribution techniques is likely to increase when larger samples are taken (eder, ; luyckx and daelemans, ). our selection of the epistolaria of exactly three authors (hildegard of bingen, guibert of gembloux and bernard of clairvaux) respects the fact that it is theoretically inadvisable to include more than three authors in a pca, especially when the discussion of the results is restricted to the two first principal components (pcs) (binongo and smith , p. ). as is customary since burrows ( ), our pca is based on the correlation matrix, appropriately scaling the original word frequencies. fig.
shows the scatterplot that results from our first experiment. each author’s samples are visualized as black letter combinations: the first letter of the author’s name is followed by a digit, indicating the sample’s indexed position in the respective epistolaria. g_ep- , for instance, is the fourth sample of , lemmatized words taken from guibert’s epistolarium. at this stage, we are restricting hildegard’s epistolarium to the letters that are not associated in any way with guibert’s secretaryship.

fig. displays a remarkably clear authorial separation of the samples. guibert’s samples (g_ep) are concentrated in the upper-right quadrant, whereas the samples from hildegard’s epistolarium (h_epng) are invariably positioned to the left. finally, bernard’s samples (b_ep) form a tight cluster of samples in the lower-right half of the plot. the density of this last cluster thus points to a clear stylistic unity, despite the fact that, as noted earlier, bernard must have been assisted in his epistolary work by a true personal chancellery consisting of at least five different collaborators (leclercq, , p. – ).

additionally, the plot in fig. contains a series of high-frequency items in light grey, the ‘component loadings’, visualizing how strongly the lemmas have contributed to the creation of the pcs. if a word can, for instance, be found to the far left of the scatterplot, this demonstrates that it is relatively more frequent in samples with a similar position in the plot. our first scatterplot thus shows that the use of et (‘and’) and a (‘from’) is surprisingly typical of guibert’s writings, whereas the use of the preposition in (‘in’) is very characteristic of the hildegard samples. in comparison, the use of the lemmas non or si seems to be relatively more typical of bernard’s writing.
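to make the procedure tangible, the following python sketch runs a correlation-matrix pca on a toy table and returns sample scores and component loadings for the first two pcs. it is only an illustration under stated assumptions: the six rows of counts for et, in, and non are invented, the labels are hypothetical, and the eigenvectors are obtained by simple power iteration with deflation rather than by the routines of the ‘stylometry with r’ suite.

```python
import math

# Invented counts of (et, in, non) per sample; NOT data from the study.
LABELS = ["g1", "g2", "h1", "h2", "b1", "b2"]
COUNTS = [[60, 20, 10], [58, 22, 12], [30, 50, 11],
          [32, 48, 9], [35, 25, 30], [33, 27, 28]]


def standardize(rows):
    """Z-score every column; PCA on z-scores equals PCA on the correlation matrix.
    Assumes no column is constant (its standard deviation would be zero)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]


def correlation_matrix(z):
    n, m = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)]
            for i in range(m)]


def top_component(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector (symmetric matrix)."""
    m = len(matrix)
    vec = [float(i + 1) for i in range(m)]  # asymmetric start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(m))
                for i in range(m))
    return value, vec


def pca(rows, n_components=2):
    z = standardize(rows)
    corr = correlation_matrix(z)
    m = len(corr)  # trace of a correlation matrix = number of features
    components = []
    for _ in range(n_components):
        value, vec = top_component(corr)
        scores = [sum(r[j] * vec[j] for j in range(m)) for r in z]
        components.append({"explained": value / m, "loadings": vec, "scores": scores})
        # Deflation: remove the component just found before extracting the next.
        corr = [[corr[i][j] - value * vec[i] * vec[j] for j in range(m)]
                for i in range(m)]
    return components


for number, comp in enumerate(pca(COUNTS), start=1):
    print("pc", number, "explains", round(100 * comp["explained"], 1), "%")
```

plotting the two score lists against each other gives the scatterplot of samples; plotting the loadings in the same plane shows which lemmas pull samples in which direction. the sign of each component is arbitrary, so only relative positions are meaningful.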
the scatterplot does not reveal any anomalies and it is safe to assume that the high-frequency grammatical lemmas argue in favor of a clear stylistic differentiation between our authors.

the remarkable stylistic differences with respect to a number of specific lemmas used by our authors can be highlighted in another way. the boxplots in fig. visualize information about the absolute frequencies (medians, quartiles, etc.) for three interesting function words (in, et, and non) in samples of , words. in boxplot (a) concerning the use of in, the primary column refers to the counts in hildegard; in the second boxplot (b) dealing with et, the left column concerns guibert; and in boxplot (c), with the results for non, bernard’s results are displayed in the first column. the secondary column in all three boxplots refers to the material by the two other authors, e.g. guibert and bernard in boxplot (a).

these boxplots indeed reveal unmistakable differences between the respective epistolaria with respect to the frequency of these important function words. interestingly, these differences coincide with stylistic observations that have been made in traditional philological research. given the visionary discourse developed in much of her writings, even in her letters, it is not surprising to come across an intensive use of the preposition in in hildegard’s letters. she repeatedly sees things in divine visions; she continuously searches the allegorical meanings buried in the multitude of details that she discovers in her visions (dronke, ). guibert’s writings are especially notorious for their all too inflated and artificial style, and for his wearisome tendency to compose extremely long sentences, full of coordinating conjunctions (see also derolez, , p. v and ix). bernard’s frequent use of non can be related to the didactic nature of his epistolary expositions in which he very often relies on an antithetical style to illustrate his thoughts (mohrmann, ; pranger, , p. ).

fig. pca of the epistolaria by hildegard, guibert, and bernard ( , lemmas/sample). [scatterplot of the b_ep, g_ep, and h_epng samples on the first two principal components, with function-word loadings; mfw culled @ %, pronouns deleted, correlation matrix; inset: proportion of variance explained (in %) per principal component]

testing delta

for our pca displayed in fig. , we have been working with extremely generous sample sizes of , lemmas each. because the ultimate goal of this article remains the attribution of the visio ad guibertum missa and the visio de sancto martino of which the authorship seems very questionable, the problem of sample size needs to be put forward (eder, ; luyckx and daelemans, ): while the first disputed visio at stake in this article still contains , lemmas, the latter only counts , words. the scatterplots in fig. a and b show the results of the same procedure as in fig. but using sample sizes of , and , lemmas, respectively. this clearly illustrates the decrease in discriminatory performance of our pca when we reduce the sample size in our experiments. fig.
b demonstrates that the authorial discrimination becomes less powerful, in particular between guibert and bernard in the vertical component. to what extent will we be able to rely on pca for a fairly solid attribution of a text, like the visio de sancto martino, of only ca. , words? although the scatterplots in the previous section demonstrate the general validity of the stylometric approach for our corpus, it makes sense to apply a second attribution technique to our corpus to validate the outcome of the pca more precisely. because it is infeasible to generate new scatterplots for every small change in parameter settings, such as the sample size in our experiments, we additionally apply burrows’s delta ( ) to the epistolaria.

fig. (a–c) boxplots of the absolute frequencies of in, et, and non in epistolary samples of , lemmas. [panels (a) ‘in’ and (b) ‘et’: absolute frequency per slice, primary versus secondary column; wilcoxon rank sum: p < . ]

in its traditional implementation, delta offers a similarity metric to determine the authorship of anonymous works. based on the frequencies of a small set of high-frequency items, delta computes the stylistic distance between an unknown sample and a set of samples written by a series of candidate authors. it will attribute the anonymous sample to the author of the (single) sample in the data set to which it is closest in style according to the metric. as such, delta uses a ‘nearest neighbor’ reasoning (argamon, ). we can apply a ‘leave-one-out validation’ with delta as follows. we can temporarily treat each sample in our collection as anonymous.
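the delta distance and the leave-one-out check introduced in this passage can be sketched in python as follows. this is a minimal illustration, not the implementation used for the study: the counts for et, in, and non and the author labels are invented, and the z-scores are computed once over the whole table (including the held-out sample), which is a simplifying assumption.

```python
import math

# Invented counts of (et, in, non) per sample; NOT data from the study.
AUTHORS = ["guibert", "guibert", "hildegard", "hildegard", "bernard", "bernard"]
COUNTS = [[60, 20, 10], [58, 22, 12], [30, 50, 11],
          [32, 48, 9], [35, 25, 30], [33, 27, 28]]


def zscore_table(samples):
    """Z-score each feature column over all samples."""
    n, m = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(m)]
    sds = [math.sqrt(sum((s[j] - means[j]) ** 2 for s in samples) / (n - 1))
           for j in range(m)]
    return [[(s[j] - means[j]) / sds[j] for j in range(m)] for s in samples]


def delta(z_a, z_b):
    """Burrows's delta: mean absolute difference between two z-scored profiles."""
    return sum(abs(a - b) for a, b in zip(z_a, z_b)) / len(z_a)


def attribute(index, z, labels):
    """Give sample `index` the label of its nearest neighbour (itself excluded)."""
    others = [i for i in range(len(z)) if i != index]
    nearest = min(others, key=lambda i: delta(z[index], z[i]))
    return labels[nearest]


def leave_one_out_accuracy(samples, labels):
    """Treat every sample as anonymous in turn; return the share attributed correctly."""
    z = zscore_table(samples)
    hits = sum(attribute(i, z, labels) == labels[i] for i in range(len(samples)))
    return hits / len(samples)


print(leave_one_out_accuracy(COUNTS, AUTHORS))
```

repeating the last call for tables built from smaller and smaller samples gives an accuracy value per sample size, which is the kind of curve a cross-validation plot summarizes.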
next, we can have delta attribute the anonymized sample to one of the candidate authors and check whether the suggested attribution is successful or not. if at the end of this procedure, we divide the number of correct attributions by the total number of samples in the data set, we get a percentage that offers a useful approximation of the general effectiveness of our technique, should it, for instance, be applied to real-world samples of unknown provenance.

fig. shows the result of this leave-one-out validation for various sample sizes (multiples of lemmas, ranging from to , ). it is obvious that larger sample sizes invariably lead to higher accuracies in cross-validation. yet, whereas the initial accuracies are fairly low (even < %), the attribution success quickly rises above the psychological barrier of % (sample sizes > , lemmas) and becomes entirely flawless when dealing with sample sizes of ca. , lemmas or more. for a text counting , lemmas, like the visio de sancto martino, we might well reach an attribution accuracy of about %. moreover, because these numbers are in line with earlier reports concerning modern languages (eder, ; luyckx and daelemans, ), fig. again demonstrates that even a highly inflected language like latin contains a satisfying amount of useful stylistic information in its grammatical lemmas alone.

by now, we can assume that, when applied cautiously, pca should offer enough solid ground to make conjectures about the authorship of the visions in the corpus traditionally attributed to hildegard. following a nearest neighbor reasoning (argamon, ), we can plot unseen, anonymous texts together with the works of established authorial origin and investigate to which of the authorial clusters the unseen work is most similar in style. however, before moving on to the analysis of the visions, we have first tested this attribution procedure. in the pca scatterplot in fig.
, we have added a new, ‘anonymous’ sample (amounting to , lemmas) by author ‘x’ to equal-sized samples from the aforementioned epistolaria. the new sample turns out to be stylistically much more similar to bernard’s samples than to those by hildegard or guibert. had this sample been truly anonymous, the analysis would have offered firm grounds for the conjecture that the text from which the sample is derived was actually authored by bernard of clairvaux. in this specific case, this reasoning would have led to a historically sound attribution, as the anonymous text we have questioned is in reality the sermo in festo sancti martini, written by bernard around . an interesting fact about this example is that even though the topic and genre of this text are perhaps quite different from the epistolary material of our candidate authors (viz. a sermon about the aforementioned saint martin), it is clear that our pca procedure allows for solid conclusions. although one should perhaps not always expect such clear-cut stylistic, authorial differentiation in historical corpora, this promising example clearly illustrates the benefits of the present methodology for (future) research.

[fig. continued. panel (c) ‘non’: absolute frequency per slice, primary versus secondary column; wilcoxon rank sum: p < . ]

fig. (a and b) pcas with reduced sample sizes ( , and , lemmas/sample). [panel (a): b, g, and h samples on the first two principal components; mfw culled @ %, pronouns deleted, correlation matrix]

guibert’s secretaryship: synergy and beyond?
as discussed earlier, we have discerned two groups of letters in hildegard’s epistolarium: one that must have originated at the time when volmar was still hildegard’s secretary and that bears no potential traces of guibert’s interference, and another containing the letters that are likely to have been revised by guibert. if we confront samples of , lemmas from both portions, labeled here h_epng and h_epg, respectively, in a pca, we get the result in fig. . we notice that the first, horizontal pc captures an impressive % of the original variation in our data and primarily relates to the stylistic differentiation between guibert’s own letter collections (g_ep) and the anterior portion of hildegard’s epistolarium (h_epng). interestingly, we see that the second pc in the right half of the plot (still capturing . % of the original variation) discriminates between hildegard’s non-guibertian letters and her letters that can be associated with guibert’s secretaryship.

[fig. continued. panel (b): b, g, and h samples on the first two principal components; mfw culled @ %, pronouns deleted, correlation matrix]

these results thus suggest that there do indeed exist stylistic differences between the oldest portion of hildegard’s epistolarium and the letters in which we expected to discern guibert’s editorial fingerprints. they also confirm what can be deduced from the surviving manuscript evidence. the so-called autograph copy of the liber divinorum operum mentioned earlier offers unique insight into the way in which hildegard’s collaborators must have edited her texts under her supervision (derolez, ). fig. , showing a number of lines from a randomly selected page of ms ghent, university library, , makes it clear that it was the function words in particular that were often altered by hildegard’s correctors: tam being erased, quod being replaced by ut or quia, ad being added, etcetera.
a collaborator, especially guibert, who is known to have had a great deal of freedom in his editorial work, may thus have had a notable impact on hildegard’s stylistic profile. however, in fig. , we see that the samples from hildegard’s epistolarium that bear the influence of guibert’s interference do not seek the company of guibert’s own writings in the scatterplot. after all, they continue to be somewhat more similar to hildegard’s style. this result is reminiscent of the synergy hypothesis, recently discussed by pennebaker ( ). pennebaker puts forward three hypotheses concerning the stylistic effect of collaborations between different authors. such projects can produce a language that is ( ) similar to the one produced by a single person writing alone, ( ) the average of the two writers, or ( ) unlike either of the styles that the collaborating authors would produce on their own. based on exploratory research on the federalist papers and beatles songs, pennebaker ultimately argues in favor of the last, so-called ‘synergy view’ on collaborative authorship, not refuting however the possibility that one of the collaborating authors might have remained more influential with respect to the end product (cf. petrie et al., ). this synergy hypothesis thus might be applicable to a certain extent to the hildegard–guibert ‘collaboration’, where the result of the creative process does not fit in with the other letter samples written by hildegard or guibert individually, although the result is somewhat more similar to hildegard.

fig. cross-validation using delta (dotted lowess line fitted). [plot: cross-validation accuracy (%) against sample size]

fig. attribution of an anonymized sermo x to the bernardian corpus. [scatterplot of the b, g, h, and x samples on the first two principal components; mfw culled @ %, pronouns deleted, correlation matrix]

more can be learned about the stylistic dichotomy in hildegard’s epistolarium by applying a mann–whitney test to the lemmas occurring at least twice in , lemma samples. here, we temporarily leave the realm of high-frequency lemmas and venture into the lower-frequency strata of the lexical spectrum. hence, this test will not particularly emphasize the discriminatory power of high-frequency lemmas, as was the case with our other tests (kilgarriff, ). fig. contrasts the words that were predominantly used in hildegard’s letters written under volmar’s secretaryship with those that become typical when guibert took over the editorial work in the preservation of her letters. the lemmas have been ranked and plotted according to the u test statistic obtained for each lemma. fig. shows how the use of the relative pronoun qui (‘who’) for instance only becomes prominent in letters edited by guibert, who is indeed notorious for constructing eloquent but complex sentences with a lot of embedded relative clauses. moreover, this latter group of letters is also characterized by a drier and more stereotypical ecclesiastical vocabulary (omnipotens, sanctus, spiritus, verus, . . . ), whereas the letters not influenced by guibert betray a more direct and lively narrative style (sed, tunc, nunc, dico, ergo, deinde, . . . ), possibly more true to hildegard’s own preferred way of expressing herself. we might thus be inclined to agree with newman ( , p. ) when she stated: ‘purists can at least rejoice that the collaboration [between guibert and hildegard] began only after the seer’s major works were completed’.
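the ranking of lemmas by their mann–whitney u statistic can be sketched in python as follows. this is a hedged illustration only: the per-sample counts for qui and sed are invented, the function names are our own, and tied counts receive average ranks, which is the usual convention but not something the text specifies.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two lists of per-sample counts of one lemma."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


def rank_lemmas(before, during):
    """Sort lemmas by U of the 'before' group: low = typical of the 'during' group."""
    scores = {lemma: mann_whitney_u(before[lemma], during[lemma])[0]
              for lemma in before}
    return sorted(scores.items(), key=lambda item: item[1])


# Invented per-sample counts; NOT data from the study.
BEFORE = {"qui": [3, 4, 5], "sed": [9, 8, 7]}
DURING = {"qui": [8, 9, 10], "sed": [2, 3, 4]}
print(rank_lemmas(BEFORE, DURING))
```

a u value of 0 for the first group means that every sample of that group has a lower count than every sample of the other group; the maximum value (the product of the two group sizes) means the reverse.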
From the methodological point of view, these results also show that the discriminatory effects in lower-frequency strata correspond with the stylistic dichotomy present in the high-frequency vocabulary, thus corroborating the performance of the latter methodology.

[Fig. : PCA of the epistolarium of Guibert, of the letters of Hildegard transmitted without Guibert’s editorial assistance, and of the Guibertian letters in Hildegard’s epistolarium ( , lemmas/sample). Principal components analysis, correlation matrix; MFW culled @ %; pronouns deleted. Sample labels: g_ep, h_epg, h_epng. Inset axes: principal components; proportion of variance explained (in %).]

[Fig. : MS Ghent, University Library, , p. (detail). Reproduced with permission.]

Let us finally turn to the original incentive for the present article, namely, the authorship discussion concerning two texts of dubious provenance: the relatively short Visio de sancto Martino about Saint Martin ( , lemmas) and the somewhat longer Visio ad Guibertum missa ( , lemmas). Fig. offers the result of three PCAs in which we have compared both ‘dubia’ (hence d_mart and d_missa) with the previously discussed epistolary collections, again using the same lemmas and a sample size of , lemmas. Fig. a considers all texts by all authors; Fig. b excludes Bernard’s texts; Fig. c only considers Guibert’s epistolarium and the ‘anonymous’ visionary texts. All subplots in Fig. clearly show that both visions tightly cluster with Guibert’s epistolarium, instead of with Hildegard’s. This effect is perhaps least prominent in Fig. a, where d_mart and d_missa display modest similarities to some of the epistolary samples from the portion of Hildegard’s epistolarium that was revised by Guibert. In all three plots, however, the visions are generally speaking far more similar to Guibert’s writings than to Hildegard’s. Significantly, most samples resulting from the combined authorial voices of Hildegard and Guibert again do not display any significant rapprochement to the epistolaria of the individual authors. These observations seem to reinforce the synergy hypothesis. Moreover, the visions’ quasi-random position in the final subplot (Fig. c) reveals no pronounced stylistic differences with Guibert’s letters, regarding the high-frequency lemmas analyzed. They invariably cluster with Guibert’s epistolary oeuvre, making him a much more plausible author than Hildegard, at the very least from a stylistic point of view.

[Fig. : Results of Mann–Whitney test (U statistic) comparing the vocabulary in Hildegard’s epistolarium before and during Guibert’s secretaryship. Panels: ‘before Guibert’ and ‘with Guibert’; axis: Mann–Whitney U.]

An important, yet inconspicuous, last feature of Fig. 
a is that it includes the Sermo in festo sancti Martini, even though it can hardly be spotted among Bernard’s other samples. This sermon deals, just like the Visio de sancto Martino, with Saint Martin. Both texts were even clearly influenced by the same late antique hagiographical narratives concerning this saint, namely, the works of his first hagiographers Sulpicius Severus (c. – ) and Gregory of Tours ( – ). It is interesting to note that despite their interwovenness within the same intertextual tradition, they are still clearly distinguished and therefore demonstrate that topic-related stylistics hardly interferes with the author-related differences. The visionary texts under investigation thus betray Guibert’s stylistic influence to such an advanced extent that we could wonder whether we should not entirely attribute these texts to Guibert, instead of arguing for any form of ‘synergetical collaboration’, as was still possible for the portion of the epistolarium over which both Hildegard and Guibert labored.

[Fig. : PCAs including the Visio de sancto Martino and the Visio ad Guibertum missa (three panels, continued across pages). Principal components analysis, correlation matrix; MFW culled @ %. Sample labels: b_ep, b_mart, d_mart, d_missa, g_ep, h_epg, h_epng.]

Conclusions

It is obvious that the experiments reported in this article represent only the tip of the iceberg of the research on Hildegard’s complicated authorship, to say nothing of the exciting, broader topic of twelfth-century Latin writing. As stated in our introduction, individuality and authorship remain complex issues when it comes to medieval literature. Even an authoritative and highly idiosyncratic author like Bernard of Clairvaux is known to have been assisted by a team of collaborators. It is moreover clear that medieval scribes often gradually introduced errors and deviations when successively copying exemplars, thus possibly altering the original authors’ style in the surviving copies of texts. Nevertheless, we hope to have demonstrated that these issues need not imply that stylometry, when applied cautiously, cannot yield valid research results in the field of medieval philology.

First, we showed that authorial discrimination was possible in the corpus studied. Although samples had to be big enough to yield correct attributions, stylometric methods were generally able to model the overall differences in writing style. This suggests that superficial interference from scribes (or even later editors) can be bypassed to a certain extent, for instance through lemmatization. Interestingly, we obtained satisfying results with a word-level approach, notwithstanding the fact that Latin is a highly inflected language. Although other strategies might increase attribution accuracies in the future, this shows that even in highly inflected languages, plenty of stylistic information can already be harvested at the word level.

In the course of our research, we have also touched on collaborative authorship, an issue that has recently raised considerable interest in stylometry (Reynolds et al., ). Our methodology enabled us to discover clear stylistic differences in Hildegard of Bingen’s epistolary work between those letters for which she had relied on the modest assistance of her first collaborator Volmar and the letters that have been compiled and copy-edited by Guibert of Gembloux. Interestingly, the letter samples influenced by the collaboration between Hildegard and Guibert formed an isolated cluster that did not display advanced stylistic similarities to Hildegard’s former epistolary oeuvre, nor to that of Guibert. These results argue in favor of what Pennebaker ( ) has called the synergy hypothesis: when two authors are involved in the same texts, the end result need not resemble the writing style of one of the two individually; the result might rather resemble that of a ‘new’, third author. The evidence offered in this particular case study is valuable in this light, but at the same time still too scant to come to a final verdict on this fascinating topic.
Finally, with respect to our initial research question, we hope to have convincingly disputed the authorship of two texts attributed to Hildegard: the Visio de sancto Martino and the Visio ad Guibertum missa. We argued that these visions are, stylistically speaking, completely in line with the writing style of Guibert de Gembloux, Hildegard’s last secretary. These results offer quantitative support to suspicions voiced in earlier, traditional philological research: if Guibert is not to be considered their original author altogether, it is clear that he reworked these texts so profoundly that hardly anything of Hildegard’s writing style is still discernible in them. In fact, it is noteworthy that our analyses could not offer any stylistic evidence at all that Hildegard once authored (even a preliminary or simply oral version of) these texts, although this remains of course an interesting historical possibility.

Acknowledgements

We thank the Corpus Christianorum Library & Knowledge Centre of Brepols (Turnhout) and in particular Luc Joqué for generously putting at our disposal the corpora analyzed in this article. Marco Passarotti (Università Cattolica del Sacro Cuore, Milan) generously provided us with the IT-TB, while Helma Dik (University of Chicago) provided the word list from the Perseus Project (Tufts University). We are moreover very grateful for the valuable feedback from Albert Derolez, Wim Verbaal, Antoon Bronselaer, and Guy De Tré.
In addition, we thank the anonymous reviewers of the Digital Humanities conference for their helpful comments on this research project, as well as the anonymous reviewers of this journal, in particular for their extensive feedback on the normalization procedures described. Mike Kestemont developed the stylometric methodology for this article. Sara Moens brought in her domain expertise concerning Guibert of Gembloux and medieval epistolography. Jeroen Deploige, who took the initiative for this collaborative research, contributed from his involvement with Hildegard scholarship. All three authors contributed equally to the end result.

Funding

This work was supported by the Research Foundation – Flanders, of which both Sara Moens and Mike Kestemont are fellows, and by the Flemish Hercules Foundation, which finances the project ‘Sources from the Medieval Low Countries (SMLC)’, directed by Jeroen Deploige.

References

Argamon, S. ( ). Interpreting Burrows’s Delta: geometric and probabilistic foundations. Literary and Linguistic Computing, ( ): – .
Binongo, J. ( ). Who wrote the th Book of Oz? An application of multivariate analysis to authorship attribution. Chance, ( ): – .
Binongo, J. and Smith, W. ( ). The application of principal components analysis to stylometry. Literary and Linguistic Computing, ( ): – .
Bird, S., Klein, E., and Loper, E. ( ). Natural Language Processing with Python. Analyzing Text with the Natural Language Toolkit. Sebastopol: O’Reilly.
Burrows, J. ( ). Computation into Criticism. A Study of Jane Austen’s Novels and an Experiment in Method. Oxford: Clarendon Press.
Burrows, J. ( ). ‘Delta’: a measure of stylistic difference and a guide to likely authorship. Literary and Linguistic Computing, ( ): – .
Cerquiglini, B. ( ). In Praise of the Variant: A Critical History of Philology. Baltimore: JHU Press.
Chrupala, G., Dinu, G., and van Genabith, J. ( ). Learning morphology with Morfette. Proceedings of the International Conference on Language Resources and Evaluation, LREC , - May . Marrakech, Morocco: European Language Resources Association, pp. – .
Coakley, J. ( ). Women, Men and Spiritual Power: Female Saints and Their Male Collaborators. New York: Columbia University Press.
Constable, G. ( ). Letters and Letter-Collections. Turnhout: Brepols.
Delehaye, H. ( ). Guibert, abbé de Florennes et de Gembloux, XIIe et XIIIe siècles. Revue des questions historiques, : – .
Deploige, J. ( ). In nomine femineo indocta. Kennisprofiel en ideologie van Hildegard van Bingen ( - ). Hilversum: Verloren.
Deploige, J. ( ). Anonymat et paternité littéraire dans l’hagiographie des Pays-Bas méridionaux (ca. - ca. ). Autour du discours sur l’‘original’ et la ‘copie’ hagiographique au Moyen Âge. In Renard, E., Trigalet, M., Hermand, S., and Bertrand, P. (eds), Scribere sanctorum gesta. Recueil d’études d’hagiographie médiévale offert à Guy Philippart. Turnhout: Brepols, pp. – .
Deploige, J. and Moens, S. (eds), Visio de sancto Martino et Visio ad Guibertum missa. In Deploige, J., Embach, M., Evans, C., Gärtner, K., and Moens, S., Hildegardis Bingensis Opera minora. Pars secunda. Turnhout: Brepols, forthcoming.
Derolez, A. ( ). The genesis of Hildegard of Bingen’s Liber divinorum operum. The codicological evidence. In Gumbert, J.P. and de Haan, J.M. (eds), Litterae textuales. Essays Presented to Gerard I. Lieftinck. II: Texts & Manuscripts. Amsterdam: Van Ghent, pp. – .
Derolez, A. (ed.) ( – ). Guiberti Gemblacensis Epistolae: quae in codice B.R. Brux. - inveniuntur. Turnhout: Brepols.
Derolez, A. and Dronke, P. (eds), ( ). Hildegardis Bingensis Liber divinorum operum. Turnhout: Brepols.
Dronke, P. ( ). The allegorical world-picture of Hildegard of Bingen: revaluations and new problems. In Burnett, C. and Dronke, P. (eds), Hildegard of Bingen: The Context of Her Thought and Art. London: The Warburg Institute.
Eder, M. ( ). Does size matter? Authorship attribution, small samples, big problem. Digital Humanities . Conference Abstracts. King’s College London, pp. – .
Eder, M., Kestemont, M., and Rybicki, J. ( ). Stylometry with R: a suite of tools. Digital Humanities . Conference Abstracts. University of Nebraska-Lincoln, pp. – .
Embach, M. ( ). Die Schriften Hildegards von Bingen. Berlin: Akademie Verlag.
Ferrante, J. ( ). Scribe quae vides et audis. Hildegard, her language, and her secretaries. In Townsend, D. and Taylor, A. (eds), The Tongue of the Fathers. Gender and Ideology in Twelfth-Century Latin. Philadelphia: University of Pennsylvania Press, pp. – .
Herwegen, I. ( ). Les collaborateurs de Ste. Hildegarde. Revue bénédictine, : – ; – ; – .
Juola, P. ( ). Authorship attribution. Foundations and Trends in Information Retrieval, ( ): – .
Kestemont, M. and van Dalen-Oskam, K. ( ). Predicting the past: memory-based copyist and author discrimination in medieval epics. In Calders, T., Tuyls, K., and Pechenizkiy, M. (eds), Proceedings of BNAIC . Eindhoven: Benelux Association for Artificial Intelligence, pp. – .
Kestemont, M., Daelemans, W., and De Pauw, G. ( ). Weigh your words—memory-based lemmatization for Middle Dutch. Literary and Linguistic Computing, ( ): – .
Kilgariff, A. ( ). Comparing corpora. International Journal of Corpus Linguistics, ( ): – .
Klaes, M. (ed.) ( ). Hildegardis Bingensis Epistolarium. Pars III. Turnhout: Brepols.
Köhler, R. ( ). Synergetic linguistics. In Köhler, R., Altmann, G., and Piotrowski, R. G. (eds), Quantitative Linguistik/Quantitative Linguistics. Ein internationales Handbuch/An International Handbook. Berlin, New York: Walter de Gruyter, pp. – .
Koppel, M., Schler, J., and Argamon, S. ( ). Computational methods in authorship attribution. Journal of the American Society for Information Science and Technology, ( ): – .
Leclercq, J. ( ). Saint Bernard et ses secrétaires. In Recueil d’études sur saint Bernard et ses écrits, vol. . Rome: Edizioni di Storia e Letteratura, pp. – .
Leclercq, J. ( ). Lettres de S. Bernard: histoire ou littérature? In Recueil d’études sur saint Bernard et ses écrits, vol. . Rome: Edizioni di Storia e Letteratura, pp. – .
Leclercq, J. and Rochais, H. (eds), ( – ). Epistolae in Sancti Bernardi Opera, vols – . Rome: Editiones Cistercienses.
Leclercq, J., Talbot, C. H., and Rochais, H. (eds), ( – ). In Sancti Bernardi Opera. Rome: Editiones Cistercienses.
Luyckx, K. and Daelemans, W. ( ). The effect of author set size and data size in authorship attribution. Literary and Linguistic Computing, ( ): – .
Moens, S. ( ). Twelfth-century epistolary language of friendship reconsidered. The case of Guibert of Gembloux. Revue belge de philologie et d’histoire, ( ): – .
Mohrmann, C. ( ). Observations sur la langue et le style de saint Bernard. In S. Bernardi Opera, vol. . Rome: Editiones Cistercienses, pp. ix–xxxiii.
Newman, B. ( ). Sister of Wisdom. St. Hildegard’s Theology of the Feminine. LA: University of California Press.
Newman, B. (ed.) ( ). Voice of the Living Light: Hildegard of Bingen and Her World. LA: University of California Press.
Nichols, S. ( ). Why material philology? Some thoughts. Zeitschrift für deutsche Philologie, : – .
Passarotti, M. and Dell’Orletta, F. ( ). Improvements in parsing the Index Thomisticus Treebank. Revision, combination and a feature model for medieval Latin. In Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M., and Tapias, D. (eds), Proceedings of the International Conference on Language Resources and Evaluation, LREC , – May . Valetta: European Language Resources Association, pp. – .
Pennebaker, J. ( ). The Secret Life of Pronouns. What Our Words Say About Us. NY: Bloomsbury.
Petrie, K., Pennebaker, J., and Sivertsen, B. ( ). Things we said today: a linguistic analysis of the Beatles. Psychology of Aesthetics, Creativity, and the Arts, ( ): – .
Pitra, J. B. ( ). Analecta sacra et classica Spicilegio Solesmensi parata, vol. . Paris: A. Jouby et Roge.
Piotrowski, M. ( ). Natural Language Processing for Historical Texts. California: Morgan & Claypool Publishers.
Pranger, B. ( ). Bernard the writer. In McGuire, B.P. (ed.), A Companion to Bernard of Clairvaux. Leiden: Brill, pp. – .
Reynolds, N., Schaalje, G., and Hilton, J. ( ). Who wrote Bacon? Assessing the respective roles of Francis Bacon and his secretaries in the production of his English works. Literary and Linguistic Computing, ( ): – .
Rigg, A. ( ). Orthography and pronunciation. In Mantello, F. and Rigg, A. (eds), Medieval Latin: An Introduction and Bibliographical Guide. Washington: The Catholic University of America Press, pp. – .
Rybicki, J. and Eder, M. ( ). Deeper Delta across genres and languages: do we really need the most frequent words? Literary and Linguistic Computing, ( ): – .
Sapir, E. ( ). Language: An Introduction to the Study of Speech. New York: Harcourt, Brace & Co.
Schinke, R., Greengrass, M., Robertson, A. M., and Willett, P. ( ). A stemming algorithm for Latin text databases. Journal of Documentation, ( ): – .
Schmid, H. ( ). Probabilistic part-of-speech tagging using decision trees. Proceedings of the International Conference on New Methods in Language Processing. Manchester, UK.
Schrader, M. and Führkötter, A. ( ). Die Echtheit des Schrifttums der heiligen Hildegard von Bingen. Quellenkritische Untersuchungen. Keulen–Graz: Böhlau Verlag.
Stamatatos, E. ( ). A survey of modern authorship attribution methods. Journal of the American Society for Information Science and Technology, ( ): – .
Van Acker, L. ( ). Der Briefwechsel der heiligen Hildegard von Bingen. Vorbemerkungen zu einer kritischen Edition. Revue bénédictine, : – .
Van Acker, L. (ed.) ( – ). Hildegardis Bingensis Epistolarium. Turnhout: Brepols.
Notes

Among the letters written with the help of Volmar, we count those in MS Wien, Österreichische Nationalbibliothek, (theol. ), which offers a copy of a collection compiled by Volmar before (Van Acker, , p. xxvi), and the limited number of letters that can be found distributed over MS Stuttgart, Württembergische Landesbibliothek, cod. theol. phil. ; MS Wien, Österreichische Nationalbibliothek, ; MS Berlin, Staatsbibliothek Preussischer Kulturbesitz, cod. theol. lat. fol. ; MS London, British Library, cod. add. ; MS Paris, Bibliothèque nationale, nouv. acquis. lat. ; MS Trier, Stadtbibliothek, cod. / and MS Kynžvart, cod. . Among the letters compiled and edited under Guibert’s supervision, we count those in the Riesenkodex Wiesbaden, Landesbibliothek, (dating from - / ), that are not also found in MS Wien, Österreichische Nationalbibliothek, (theol. ) (Van Acker, , p. xxvii), as well as those copied in MS Berlin, Staatsbibliothek Preussischer Kulturbesitz, cod. lat. , which bear traces of Guibert’s editorial assistance (Klaes, , p. xvii). Among the letters contained in the latter group, compiled under Guibert’s supervision, we obviously encounter all Hildegard’s letters addressed to Guibert and the ones that have been written in the years in which he stayed in Rupertsberg.

MSS Brussels, Royal Library, – and – (both originating from Gembloux, early thirteenth century) and MS Brussels, Royal Library, – (originating from Sint-Maartensdal near Louvain, fifteenth century).

See www.brepolis.net. The critical editions of the works of both Hildegard of Bingen and Guibert of Gembloux are published in several volumes in Brepols’s own Corpus Christianorum series. For the works of Bernardus, the Brepols Library of Latin Texts relies on Leclercq et al. ( – ).

Bernard’s letters, edited by Leclercq and Rochais ( – ), contain the ‘official’ epistolarium, compiled shortly after Bernard’s death, as well as letters transmitted elsewhere. Guibert’s letters were edited by Derolez ( – ) on the basis of MS Brussels, Royal Library, – .

See note . Hildegard’s letters are edited by Van Acker ( – ) and by Klaes ( ).

We supplemented this list with three words (plerumque, utrumque, and quicumque), yet did not allow any of these items into the restrictive set of function words we list below. We did not consider other, much less frequent clitics (e.g. –ne (‘if’) or –ve/ue (‘or’)), because it is difficult to detect these automatically using a simple rule-based approach and to distinguish them from e.g. the –ne in deuotione or the –ue in serue.

We have described our approach in a generic way for future reference. It should be noted, however, that there still remains a small number of possible spelling variants in medieval Latin that are hard to deal with but that were not relevant for the present research because we worked with critical editions that have already normalized orthography to a large extent. One can think here of the interchangeability of –mqu– and –nqu– in some words and the problem of single/double consonants (as e.g. in litera and littera). A less frequent, yet still important, orthographical variant that we leave unaddressed is (–)exs– versus (–)ex–, because it is difficult to detect it automatically using a rule-based approach. Nevertheless, this variant hardly affects any of the function words to which we have restricted our analyses.

In these training data too, we have replaced every v with u and every ae/oe with e.

Note that licet, which strictly speaking derives from the impersonal verb licere, is considered a function word because it is primarily used as a subordinating concessive conjunction.

Other errors in the lemmatization displayed in Table are ‘hildegars’, ‘us’, and ‘ta’.
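The orthographic substitutions mentioned in the note on the training data can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ code: it assumes the direction v → u and ae/oe → e, which matches lemma spellings found elsewhere in the article (uel, ualde, precipio), and it additionally lowercases the token. The function name `normalize_orthography` is hypothetical.

```python
import re


def normalize_orthography(token: str) -> str:
    """Lowercase a Latin token, then map v -> u and ae/oe -> e."""
    token = token.lower().replace("v", "u")
    # Blind substitution: every 'ae' or 'oe' sequence becomes 'e'.
    return re.sub(r"ae|oe", "e", token)


examples = ["Verus", "praecipio", "poena", "uel"]
normalized = [normalize_orthography(token) for token in examples]
print(normalized)
```

A blanket rule of this kind also rewrites sequences in which ae or oe are not diphthongs (poeta would become peta), so it is only suitable where, as in the note, the same substitution is applied consistently to the training data and to the texts under analysis.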
Note that from this point onwards, we will express the size of textual samples in terms of the number of consecutive lemmatized words they contain (a number which, after tokenization, need not be identical to the original number of surface forms in the original texts). For the sake of conceptual clarity we shall keep Pennebaker’s original terminology, although it should be stressed that our present use of the term ‘synergy hypothesis’ is completely unrelated to the concept of ‘synergetic linguistics’ in the field of quantitative linguistics (Köhler, ).

Why Marriage Matters: A North American Perspective on Press / Library Partnerships

Charles Watkinson, University of Michigan Library, University of Michigan, Ann Arbor, USA
Email: watkinc@umich.edu
ORCID: - - -

Key points:
• Around thirty percent of campus-based members of the Association of American University Presses now report to libraries, more than double the number five years ago.
• Beyond reporting relationships, physical collocation and joint strategic planning characterize the most integrated press/library partnerships.
• The main mutual advantages of deep press/library collaboration are economic efficiency, greater relevance to parent institutions, and an increased capacity to engage with the changing needs of authors in the digital age.
• There is emerging interest in collaboration at scale among libraries and presses that may extend the impact of press/library collaboration beyond single institutions.

Introduction

In a post on the Society for Scholarly Publishing’s popular Scholarly Kitchen blog, consultant Joe Esposito explored “Having Relations with the Library: A Guide for University Presses” (Esposito, ).
He wrote that “every way that you look at the relationship between a press and a library, you come away with little or nothing to support an organizational marriage. Presses are great things, libraries are great things, but they are not better things by virtue of having been put into the same organization.” He concluded, “Both libraries and presses are better off pursuing their own aims, cooperating when useful, working separately when it is not. Surely it is not out of line to ask: why can’t we just be friends.” In this article I argue the case for “marriage,” with its connotations of long-term, deeply embedded partnership; a case that the rapidly growing number of university presses that report into libraries in North America will recognize.

As mission-driven, non-profit organizations, university presses and academic libraries should be natural allies in the quest to create a more equitable scholarly publishing system. Expert in scholarly information management, situated on university and college campuses, supported to a varying degree by the same funding sources, and sharing many philosophical ideals, librarians and university press publishers seem to be logical partners in supporting the production of knowledge. But it is only recently that the idea has gained much traction.

This article is protected by copyright. All rights reserved. This is the author manuscript accepted for publication and has undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the version of record. Please cite this article as doi: . /leap.

While the opportunities for publishing collaborations had been a topic of low-level discussion for many years (e.g., Day, ), a particular focus on this issue arose in the late s.
Between and , several important reports (Brown, Griffiths, and Rascoff, ; Crow, ; Hahn, ) examined the opportunities for campus publishing partnerships, highlighting a few major initiatives that had started to emerge. These early experiments did not immediately appear to stimulate emulation, and a period of relatively little apparent activity ensued. For example, a survey of library publishing activity across a wide range of North American institutions conducted in found that fewer than percent of the responding libraries that had access to a potential university press partner within their parent institutions were engaged in any form of collaboration (Mullins et al., , ), a number that had changed little from a similar survey three years earlier (Hahn , ).

This article proposes that we are now, however, seeing a resurgence of interest in the idea of library/press collaboration and that this time the movement is more sustainable since it is much more broadly based in character, with a diverse group of institutions involved. In the “AAUP Biennial Reporting Structure Survey,” thirty out of the one hundred and thirty-three members of the Association of American University Presses (AAUP) reported to libraries, representing a doubling over five years (see Table ). Since AAUP includes some learned society, museum, and public policy publishers among its membership, it can reasonably be claimed that almost a third of campus-based university presses in North America now report to libraries.

Table : Presses reporting to libraries. Data from AAUP Biennial Press Reporting Structure Survey, the most recent results accessible at: http://www.aaupnet.org/images/stories/data/ _reporting_structure_ .pdf

Presses listed, with the number of survey columns in which each appears in parentheses: Abilene Christian (1), Akron (1), Alberta (5), Arizona (4), Calgary (5), Concordia (1), Delaware (1), George Mason (1), Georgia (4), Indiana (1), Kentucky (2), Marquette (5), Michigan (4), MIT (5), Nebraska (1), New England (2), New York (5), North Texas (3), Northwestern (5), Oregon State (5), Penn State (5), Purdue (5), Southern Illinois (1), Stanford (5), Syracuse (5), Temple (4), Texas Christian (5), Texas Tech (1), Utah (5), Utah State (2), West Virginia (1), Wilfrid Laurier (1), Yale (1).

A continuum of types of relationship

A reporting relationship is one thing; truly leveraging the synergies that collaboration between a university press and library can offer is another. Collaborations on campuses are not only increasing in number but also becoming richer in nature. This trend can be analyzed in the context of a proposed taxonomy of relationship types (articulated in more detail in Watkinson, ) in which three drivers seem to particularly affect where a press falls. These are (a) whether the press reports to the library; (b) whether press staff members are physically collocated with library colleagues; and (c) whether the press and library engage in strategic planning together.
• type , little evidence of currently active relationships between press and library
• type , good relationships between the press and one or more libraries, but no reporting
• type , reporting and joint projects, but relative autonomy and no physical collocation
• type , physical collocation, reporting, but relative autonomy
• type , more integrated, shared vision approaches

should the taxonomic outline above be understood as snapshots of different stages along a process, where relationships move from collaboration to integration, or as representing different models appropriate in different contexts? arguments could be made for both suggestions. on the one hand, some organizational models may make progression beyond the type category, in which collaborations exist but there is a lack of reporting relationship, difficult. a particular structural challenge faces presses that are tied to a university system rather than a specific campus. the system-based university presses of florida, kansas, north carolina, and mississippi, for example, have extremely positive relationships with libraries but publishing responsibilities across many different institutions. such an organizational structure may make integrated relationships with any one campus challenging. at other institutions, a clear progression can be seen as a press is moved into a reporting relationship with a library for administrative reasons but then the two partners find increasing synergies. at both purdue university and the university of michigan the presses were “rescued” by far-sighted library directors at a time when large deficits had been accrued and the provosts had become concerned about lack of oversight. from such inauspicious beginnings, however, a process of movement from collaboration to integration can be shown as various opportunities were explored, with the relationship developing from type (prior to / ) through type to type today.
both examples highlight the importance of reporting, physical collocation, and shared strategic planning as the main taxonomic delineators. while initially the press staff and library staff were in different buildings, collaboration increased dramatically when they were moved into the same location. in the case of purdue, the press moved from the periphery of campus to an attractive central location, in close proximity to the dean of libraries’ office. at michigan, librarians from the scholarly publishing office were relocated to a library facilities building at the edge of campus to join press staff. joint strategic planning exercises were the next step, with an important part of these being the increasing inclusion of the press director in library senior leadership meetings. type situations are often reflected by the press director also having a position within the library, represented in the individual’s title: for example, “aul for publishing and director of university of michigan press,” “director of purdue university press and head of scholarly publishing services, purdue libraries,” “executive director temple university press and scholarly communications officer, university libraries”, “director, indiana university press and digital publishing”, and most extremely “donald and delpha campbell university library and oregon state university press director.” even where titles may not reflect it, press directors can be highly involved in library leadership decisions as at mit press where the press director and library director have set out an ambitious joint agenda around the transformation of scholarly communication (mit, ). the variation in the types of relationship represented by press/library collaborations was on display at a recent meeting convened by aaup, the association of research libraries, and coalition for networked information and held at temple university in philadelphia on may – , . sponsored by the andrew w. 
mellon foundation this p l (presses to libraries) summit brought together directors of university presses and deans/directors of libraries from most of the institutions with a reporting relationship. after establishing a common understanding of the barriers to and possibilities for alignment, the participants focused on the opportunities their partnerships might offer for system-wide approaches to managing the total cost of the scholarly publishing system and better supporting the needs of digital scholarship. it is to these benefits (the reasons that press/library collaborations once established tend to progress along the continuum) that we now turn.

why marry?

while many press/library collaborations are initiated by anticipated “economic” benefits, the partners increasingly find “sociopolitical” advantage which is often closely linked to “technological” opportunity in an environment where the need to sustain digital scholarship is an increasing theme. these three themes are discussed below. the benefits realized are not only relevant to the two partners, of course, but also allow them together to better serve the scholarly communication needs of institutional faculty, staff, and students and to develop powerful solutions for particular disciplinary communities whose subject interests align with the strategic strengths of the parent university—an idea strongly focused on in the recommendations of the ithaka s&r report on university publishing in a digital age (brown, griffiths, and rascoff ). economic: in the economic sphere, the reasons why a university press could benefit from closer relationships with the library may initially be clearer than the advantages for libraries. as described in a number of reports, university presses have long been suffering from the declining market for scholarly books and increased financial scrutiny from their institutions (thompson , – ).
reducing expenses is a priority, and opportunities to share overhead costs with campus partners are beneficial. as libraries increasingly either de-accession or remove print materials to remote storage, subsidized or “free” physical space is becoming available that may be suitable for press occupancy, although presses interested in a central campus location will often have to wrestle with other priority needs (especially those focused on student learning) when lobbying for premium library space. other opportunities for synergy frequently come in the areas of it services, combined human resource and business office support, and shared legal counsel. in a survey conducted by aaup’s library relations committee in , % of libraries provided some form of cash subsidy to university presses, while % of libraries provided some other kinds of service. this included rent-free space but also support for basic office functions, digitization, metadata enrichment, and preservation services. both libraries and presses share specific needs in these areas that would not be well accommodated by other campus partners. for example, it specialists in the library tend to understand the metadata standards needed for bibliographic information and the demands of digital preservation, hr recruiters are often advertising in similar venues for library and press staff, and legal expertise in areas such as intellectual property is desirable for both partners (even if they may sometimes approach the law from different angles). while many of the business office functions needed by the partners are similar, some challenges can emerge in this area. these are mostly related to handling a
revenue-generating unit whose income and expenditure fluctuate over a multiyear cycle (e.g., expenses incurred on a book in one financial year may not be recouped until the following financial year) rather than a library, which spends down an annually renewed budget over a single financial year, and having to track cash flow. indeed, while many press/library collaborations have found synergies in back-office operations related to expenditure, it has been much harder to merge systems related to revenue, including the time-consuming demands of royalty tracking. a less tangible area of economic opportunity for both presses and libraries is in developing a better mutual understanding of the economic challenges facing the scholarly communication ecosystem in order to develop more informed strategies for intervention. one example of this lies in the area of open-access publishing, where questions about the “real cost” of publishing both journal articles and, increasingly, books are at the center of library strategies to support this emerging field. university presses, over % of which publish journals, can help untangle the issues and inform an understanding of what might constitute a fair level of subsidy. with the growing interest in open-access monographs, questions of what constitutes a reasonable first copy cost are again coming to the fore, and the opportunities to work through cost components in an environment of mutual trust are invaluable. where university press staff members are involved in discussions about collections development choices, presses gain insights into the processes by which libraries choose what and what not to buy. these are valuable for decision-making locally and may give a library-based university press a competitive advantage, but there are also ripple effects as informed press directors and staff spread an understanding of the constraints libraries are operating under within the publishing community more broadly. 
perhaps even more important than back-office efficiencies, there are perceptual advantages (especially for smaller presses) in having university press budgets incorporated into those of a larger parent organization on campus. because they produce sales revenue, university presses generally are classified by their parent institutions as “auxiliary” operations alongside entities such as student housing, catering, and sometimes even athletics. not only are academic publishing revenues dwarfed by those other sources of earned income, but the metrics of success for such units tend to primarily be financial rather than mission-related. libraries, meanwhile, are classified as core academic units. funds spent on the library and its subsidiary units are classified as “designated” for pursuit of the academic mission of the university. by changing its classification from “auxiliary” to “designated” in university accounts (the exact terms used will vary by institution), the press’s appearance under the library’s financial accounting umbrella can change the way in which the parent institution’s senior administrators understand the purpose of supporting an academic publishing unit – to the advantage of the university press. no more being called before the provost to account for yet another year of deficit! sociopolitical: as libraries move from stewarding collections to providing services, academic librarians are eager to acquire expertise in serving the needs of faculty as “authors” rather than “users” of scholarly information. even though the individuals may be the same, the attitudes and expectations of faculty as authors and as users of scholarly content are as different as “dr. jekyll and dr. hyde” (mabe and amin, ).
the development of data management services and library publishing services are two manifestations of this change in emphasis, but it has become clear that libraries are struggling to gain acceptance by faculty members in these new “research support” roles, as reflected in the results of the latest ithaka us faculty survey which suggests little advance in the library’s credibility as a research partner vs. increasing perception of its value in supporting students (wolff, rod, and schonfeld, ). while the credibility of the university press as a partner to authors may be greatest in humanities and social science disciplines, an association between a press and a library can advance the reputation of the library in this space and provide valuable access to knowledge about effective ways to solicit and work with authors. a perennial challenge for university presses has been in demonstrating relevance to their parent institutions. focused on the needs of specific disciplines across institutions rather than on a single institution, university presses provide a public good that is clear at the system level but is much less apparent to administrators evaluating the local benefits of their investments. partnership with the library allows the press to create programs that demonstrate alignment with the needs of the institution, while also advancing the ambitions of the library in areas such as scholarly communication and information literacy instruction. these successes can be represented to senior administration by the dean or director of libraries who, unlike the press director, is a visible presence in institutional leadership meetings. a particularly interesting opportunity for collaboration lies in finding ways for the university press and library to engage with students in new ways. a number of university presses are working with their parent libraries to create open and/or affordable textbooks (e.g., indiana, temple, purdue, oregon state). 
meanwhile, under the banner of “publishing as pedagogy” (alexander, colman, kahn, peters, watkinson, & welzenbach, ), others are working to integrate the experience of publishing student work into the experiential learning opportunities that are increasing in number on north american campuses. scholarly communication curricula built around the production of the graduate-produced michigan journal of medicine (http://www.michjmed.org/) or the undergraduate-run journal of purdue undergraduate research (http://docs.lib.purdue.edu/jpur/) are examples. as well as completing the scholarly communication cycle and providing a tangible output that students can use in their future careers, involvement in a publishing process also involves the application of a number of high impact learning experiences that can be shown to have a positive impact on student success (weiner and watkinson, ). technological: as faculty members increasingly apply digital tools to their research, their need for support in publishing the full record of their work electronically is growing. the evidence-based study by the ithaka organization on “university publishing in the digital age” identified four emerging needs for scholars whose modes of information production and consumption are increasingly electronic. these are that everything must be electronic, that scholars will rely on deeply integrated electronic research/publishing environments, that multimedia and multi-format delivery will become increasingly important, and that new forms of content will enable different economic models (brown, griffiths, and rascoff , – ).
almost a decade later, it is clear that university presses are seeing these needs expressed by almost every author, not just “digital humanists.” press/library collaborations have the capacity to effectively meet these needs by not only harnessing the complementary skills of publishers and librarians but also enabling university presses to connect peer-reviewed scholarship with less formally produced material, the idea of publishing “across the continuum” described by daniel greenstein ( ). the inclination to experiment, which at many university presses has been suppressed by the need to constantly look to the bottom line, can be released by the financial relief that being part of the library can offer, enabling new opportunities to be explored. while a recent round of grants given by the andrew w. mellon foundation to improve university press capacity to support digital scholarship in the humanities has gone to presses with a range of organizational structures, a disproportionate number of recipients represent library/press partnerships. the projects proposed by presses reporting to libraries have characteristics that leverage the relative strengths of each party and emphasize the logic of deep collaboration. for example, new york university’s enhanced network monograph project focuses on issues of the discoverability of digital projects, especially open access publications, an area of joint concern to libraries and presses (nyu enm, ). the university of michigan’s fulcrum platform (fulcrum.org), meanwhile, leverages library-based work to develop data repositories using the open source hydra/fedora framework to serve the needs of humanists for long-term digital preservation of the digital research outputs they wish to link to their monographs (um hydra, ). michigan is working on this project with three other presses strongly linked to their libraries (indiana, northwestern, and penn state) and one that is not (minnesota).

why not just good friends?

achieving some of the benefits of the sorts of collaboration described above does not absolutely require an integrated press/library structure. there are good examples of collaboration where the press and library have different reporting lines, or even are at different institutions, such as duke university press and cornell university libraries for project euclid (ehling and staib, ) or oxford university press and university of utah library in hosting supplemental content for a faculty member’s book (anderson, ). university of north carolina press especially has shown leadership in creating relationships with its system libraries to advance initiatives such as the creation of open educational resources through its office of scholarly publishing services (ruff, ). some university presses that report to libraries continue to maintain self-conscious separation of functions: stanford university press has chosen to collaborate with the university of richmond’s digital scholarship lab rather than its parent library to create its mellon-funded digital scholarship platform (stanford, ). it is also important not to dismiss the real challenges that integrating two organizations with different cultures and traditions poses, especially since the historical relationship of client/vendor has built-in tensions. cultural differences between librarians and publishers that make collaborating on joint projects challenging have sometimes been exemplified by the idea that “libraries are service organizations whose funding comes in part from their success in anticipating needs, they tend to say yes” while “publishers, working to break even in a highly competitive business, evaluating many potential projects, and with quantifiable limits on their productivity, tend to say no” (mccormick, , ).
meanwhile, the need to pursue business strategies that cover most costs through earned revenue and the razor-thin margins most university presses operate on are often overlooked by libraries, and university press directors often feel unfairly picked on when libraries accuse them of dragging their feet on open access or being “disconnected from the academic values of their parent institutions,” a common refrain in debate around the georgia state university lawsuit (smith, ). however, as the above discussion has hopefully illustrated, the deep partnership required to truly unleash the power of the complementary skills and infrastructure that exist in university presses and academic libraries can only develop when press and library staff are collocated and share a common vision. only in such “marriages” can resources be gifted and received, uncertain futures explored without risk, and the cultural differences between the partners truly appreciated and valued. just good friends is not good enough.

references

aaup library relations committee. ( ). press and library collaboration survey. new york: association of american university presses.
alexander, l., colman, j., kahn, m., peters, a., watkinson, c., & welzenbach, r. ( ). publishing as pedagogy: connecting library services and technology. educause review, january . http://er.educause.edu/articles/ / /publishing-as-pedagogy-connecting-library-services-and-technology
anderson, r. ( , july ). another perspective on library-press ‘partnerships’ [web log post]. retrieved from http://scholarlykitchen.sspnet.org/ / / /another-perspective-on-library-press-partnerships
brown, l., griffiths, r. j., & rascoff, m. ( ). university publishing in a digital age. new york: ithaka s&r. retrieved from http://www.sr.ithaka.org/publications/university-publishing-in-a-digital-age/
crow, r. ( ). campus-based publishing partnerships: a guide to the critical issues.
washington, dc: sparc. retrieved from http://digitalcommons.bepress.com/repository-research/
day, c. . the need for library and university press collaboration. collection management , no. – : – . doi: . /j v n _ .
ehling, t., and staib, e. ( ). the coefficient partnership: project euclid, cornell university library, and duke university press. against the grain , no. (december–january), – .
esposito, j. ( , july ). having relations with the library: a guide for university presses [web log post]. retrieved from https://scholarlykitchen.sspnet.org/ / / /having-relations-with-the-library-a-guide-for-university-presses/
greenstein, d. ( ). next-generation university publishing: a perspective from california. journal of electronic publishing , no. (fall). doi: . / . . .
hahn, k. l. ( ). research library publishing services: new options for university publishing. washington, dc: association of research libraries.
mabe, m. a., and amin, m. ( ). dr jekyll and dr hyde: author–reader asymmetries in scholarly publishing. aslib proceedings , no. : – .
mccormick, m. ( ). learning to say maybe: building nyu’s press/library collaboration. against the grain , no. : – .
mit ( ). hacking the library-publisher partnership at mit. inside higher ed, february . https://www.insidehighered.com/blogs/higher-ed-beta/hacking-library-publisher-partnership-mit
mullins, j. l., murray rust, c., ogburn, j. l., crow, r., ivins, o., mower, a., nesdill, d., newton, m. p., speer, j., & watkinson, w. ( ). library publishing services: strategies for success. final research report. washington, dc: sparc. http://wp.sparc.arl.org/lps.
nyu enm ( , june). nyu libraries, nyu press lead project to develop innovative open access monographs. retrieved may , , from https://www.nyu.edu/about/news-publications/news/ / / /nyu-libraries-nyu-press-lead-project-to-develop-innovative-open-access-monographs.html
ruff, c.
( ) unc press to offer publishing services for professors’ diy textbooks. chronicle of higher education, may . http://chronicle.com/article/unc-press-to-offer-publishing/ /
smith, k. ( , september) gsu and university presses [web log post]. retrieved from: http://blogs.library.duke.edu/scholcomm/
stanford ( , january ) stanford university press awarded $ . million for the publishing of interactive scholarly works. retrieved may , , from http://library.stanford.edu/news/ / /stanford-university-press-awarded- - million-publishing-interactive-scholarly-works
thompson, j. b. ( ). books in the digital age: the transformation of academic and higher education publishing in britain and the united states. cambridge: polity press.
um hydra ( , april ). mellon grant funds u-m press collaboration on digital scholarship. retrieved may , , from http://www.publishing.umich.edu/ / / /mellon-grant-funds-u-m-press-collaboration-on-digital-scholarship/
watkinson, c. ( ). from collaboration to integration: university presses and libraries. in getting the word out: academic libraries as scholarly publishers. eds. maria bonn and mike furlough (pp. - ). association of college and research libraries: chicago, il.
wolff, c., rod, a. b., & schonfeld, r. c. ( , april ). ithaka s+r us faculty survey . retrieved from http://sr.ithaka.org?p=

book reviews and the consolidation of genre

kent chang, yuerong hu, wenyi shang, aniruddha sharma, shubhangi singhal, ted underwood, jessica witte, peizhen wu

a paper presented at the virtual panel “cultural analytics and the book review: models, methods and corpora,” adho , july , .

introduction

book reviews clearly cast new light on reception: on literary judgment, for instance, and prestige.
but reviews may also give us an opportunity to test claims about the significance of patterns in the reviewed books themselves. for instance, literary scholars have recently claimed that predictive models can measure the strength of the boundaries that separate different cultural categories: different genres of fiction, say, or market segments. but the evidence supporting this argument comes purely from the texts themselves. the works in a particular literary genre may be relatively easy (or hard) to distinguish from others, because they possess (or lack) a distinctive diction. interpreting this textual boundary as evidence about the strength of a cultural distinction has seemed questionable to many readers. one can imagine cultural categories that would be salient and distinctive for human readers even though they don't leave the kind of traces that can be captured in a model of word frequency. so do textual models really tell us anything about the boundaries between cultural categories? it is hard to resolve this question with a single experiment, because it isn't immediately clear what counts as ground truth about the strength of a cultural boundary. josé calvo tello has compared predictive accuracy to the level of human consensus about different genres (expressed, for instance, in bibliographies). ted underwood has compared predictive accuracy to the degree of overlap or separation between genres. (that is, we might expect pairs of genre labels that are often assigned to the same works to be closer to each other than those that rarely overlap.) both studies suggest that the accuracy of a textual model does correlate with the behavior of human observers. but both studies are still open to the objection that they rely purely on explicit labels. this could produce a subtle kind of false confirmation. perhaps the conscious labeling behavior of bibliographers and catalogers is governed by categories overtly signaled in the diction of a literary work, while ordinary readers care more, in practice, about other categories that are less clearly registered in diction.

dan sinykin, “how capitalism changed american literature,” public books, july , , https://www.publicbooks.org/how-capitalism-changed-american-literature/.
richard jean so and edwin roland, “race and distant reading,” pmla . (jan ): - .
this is, for instance, one of the questions raised by nan z. da, “the computational case against computational literary studies,” critical inquiry (spring ): - .
josé calvo tello, “genre classification in spanish novels: a hard task for humans and machines?” european association for digital humanities , https://eadh .exordo.com/programme/presentation/ .

book reviews give us a way to address this remaining source of doubt. reviewers may or may not explicitly assign books to a genre: in the nineteenth- and early-twentieth-century period we will discuss here, explicit genre categorization is unusual. but reviews do presumably reflect the tacit concepts and categories that organize the landscape of fiction for a particular reader. it seems likely that books with similar reviews were perceived in similar ways. so if the textual boundaries between groups of literary works do really correlate with the responses of ordinary readers, reviews of those texts ought to reveal the same groupings and distinctions. that is the primary hypothesis we set out to test in this paper. are the subject or genre categories most strongly marked in fiction also the categories most strongly marked in reviews of fiction?

data

to construct a corpus of paired literary texts and book reviews we aligned extracted features from hathitrust research center with book reviews from proquest's british periodicals collection, matching on both the author and the title of the original work. we also used predictive modeling to filter the book reviews for reviews of fiction.
review metadata is imperfect, and title matches are often ambiguous, so without this filtering step it would have been difficult to have confidence that we were really pairing books of fiction with their reviews.

ted underwood, “the historical significance of textual distances,” proceedings of the second joint workshop on computational linguistics for cultural heritage, social sciences, humanities and literature, santa fe, , https://www.aclweb.org/anthology/w - /.
boris capitanu, ted underwood, peter organisciak, timothy cole, maria janina sarol, j. stephen downie ( ). the hathitrust research center extracted feature dataset ( . ) [dataset]. hathitrust research center, http://dx.doi.org/ . /j x jt .

when the filtering process was complete we had pairs of books and reviews. the books' dates of first publication extend from to , but the vast majority (more than ) were published after , and about after . the reviews dated from to . most were published very shortly after the book in question, although we do have a few nineteenth-century reviews of don quixote. when a book had multiple volumes, we aggregated the texts; when we had multiple reviews of the same book, we also aggregated the reviews to produce a single composite review-text. word counts for the books and reviews are available through a github repository documenting the project.

methods

our overall hypothesis was (generally) that similar books will have similar reviews and (more specifically) that categories of books with closely-knit textual similarity will also have reviews that resemble each other closely. we preregistered an initial plan to test this hypothesis in two ways: using supervised predictive models (which have worked well for this problem in the past, but require relatively large groups of works), and using word mover's distance.
distance metrics are easy to apply to small groups of texts, and we hoped a distance-measuring approach would allow us to explore this question across a wider range of genres, including genres with few examples. while other distance metrics are more familiar, we thought word mover's distance might be preferable for short texts like reviews, since it uses word embeddings rather than one-hot encoding and thereby produces a less sparse feature space. we did find that our preregistered hypotheses were confirmed using word mover's distance. for instance, to take the simplest example, we measured wmd between random pairs of books and the corresponding pairs of reviews. we found a statistically significant relationship between the two measures (r = . , p < . ). but regular cosine distance on the frequencies of the most frequent words showed an even stronger relationship in the same sample (r = . , p < . ). in subsequent experiments, we also found that measuring category strength with cosine distance produced results that echoed predictive models more closely than wmd did. so we reverted to cosine distance in subsequent experiments. since this is already the dominant distance measure in text analysis, we didn’t feel there was a great risk of tailoring our methods to a particular sample or problem.

metadata for the books used in this experiment is available at our github repository, https://github.com/tedunderwood/reviews/tree/master/bpo/corexperiment. metadata for the reviews (and word counts for both the books and reviews) is available at the supporting open science framework site: https://osf.io/a /.
yuerong hu, wenyi shang, and william e. underwood, “book reviews and the consolidation of genre: first registration,” open science framework, october , , https://osf.io/j ycz.
matt j. kusner, yu sun, nicholas i. kolkin, “from word embeddings to document distances,” proceedings of the nd international conference on machine learning, lille, france, .
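The pairwise comparison described above (distance between two books set against distance between their two reviews) can be sketched in a few lines. This is a minimal illustration, not the project's code: the toy texts, the four-word vocabulary, and the helper names `relative_freqs`, `cosine_distance`, and `pearson_r` are all assumptions made for the example, and the z-score scaling mentioned in the notes is left out for brevity.

```python
import math
from collections import Counter


def relative_freqs(text, vocab):
    """Relative frequency of each vocabulary word in a whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[word] / total for word in vocab]


def cosine_distance(u, v):
    """1 minus cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical toy corpus: three "books" and one composite "review" for each.
vocab = ["the", "murder", "love", "ship"]
books = {
    "a": "the murder the detective the murder clue",
    "b": "the murder the inspector clue murder",
    "c": "the love the ship love the sea",
}
reviews = {
    "a": "a murder story the murder is clever",
    "b": "the murder plot the detective",
    "c": "a love story the ship the sea love",
}
pairs = [("a", "b"), ("a", "c"), ("b", "c")]

# Distance between each pair of books, and between the matching pair of reviews.
book_dists = [
    cosine_distance(relative_freqs(books[i], vocab), relative_freqs(books[j], vocab))
    for i, j in pairs
]
review_dists = [
    cosine_distance(relative_freqs(reviews[i], vocab), relative_freqs(reviews[j], vocab))
    for i, j in pairs
]

# Correlation between book distances and review distances across the pairs.
r = pearson_r(book_dists, review_dists)
print(round(r, 3))
```

With a real corpus the vocabulary would be the most frequent words in the collection and the pairs would be sampled at random; three pairs are far too few to support a significance test.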
as part of our preregistered hypothesis, we used metadata in contemporary libraries to define twenty-four genre or subject categories. these categories could be viewed as a source of anachronism (since they were mostly defined by librarians half a century or more after the publication of the original works). but the anachronism in question is helpfully orthogonal to the question explored here. in other words, we're not positing that these twenty-four categories are the best categories for english fiction - , or that they precisely align with real divisions in literary culture. instead we are asking whether the relative clarity of different categories (in the literary texts themselves) correlates with the relative clarity of the same categories (in reviews of the texts). to fully test this hypothesis, it might actually be good if some of our categories were anachronistic, and did fail to align clearly with real boundaries between literary practices.

we then tested our central hypothesis about genre in several different ways. first, we trained classifiers to distinguish literary works (and their reviews) from other works and reviews in our corpus. (we used the scikit-learn implementation of regularized logistic regression.) we found that the accuracy of the book classifiers correlated with the accuracy of the review classifiers, r = . and p < . . in this part of the experiment, we could only use fourteen large categories, because predictive models become unstable with small training sets. so some of the categories in figure are all-encompassing (e.g. “random”), or very general (works labeled “novel” or “romance” in their titles), or defined through subject headings rather than genres (e.g. works about “britain” or “north america”).

instead of using tf-idf, we scale features by converting each column of the matrix to a z-score, which is equivalent to using burrows's delta.
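the column-wise z-score scaling mentioned in this note can be sketched as follows. this is an illustrative stand-in rather than the project's code: `zscore_columns` is a hypothetical helper, and it uses the population standard deviation, which is also what scikit-learn's `StandardScaler` uses by default.

```python
import math


def zscore_columns(matrix):
    """Scale each column of a row-major matrix to mean 0 and standard deviation 1.

    Each row is one text and each column is one word's frequency.
    Constant columns are mapped to 0.0 rather than dividing by zero.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        mean = sum(column) / n_rows
        # population standard deviation (divide by n, not n - 1)
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / n_rows)
        for i in range(n_rows):
            scaled[i][j] = (matrix[i][j] - mean) / std if std > 0 else 0.0
    return scaled
```

after this step every word contributes on the same scale, so a handful of very frequent words no longer dominates a distance computed on the rows.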
for empirical evidence that this distance measure works well for many problems in text analysis, see stefan evert et al., “understanding and explaining delta measures for authorship attribution,” digital scholarship in the humanities . (december ): pp. - , https://doi.org/ . /llc/fqx .
“scikit-learn: machine learning in python,” pedregosa et al., jmlr , pp. - , .

figure . correlation between predictive models of books and of reviews.

but we also observed the same pattern across a larger set of categories that are closer to groups ordinarily called “genres,” using distance measurements between pairs of texts. we selected random pairs of works in the same genre and measured both the in-genre distance (between mystery a and mystery b) and the out-of-genre distances (e.g. from mystery a to a randomly selected work published in the same year as mystery b). distances were measured as cosine distances for the , most common words, scaled using burrows's delta (which is in effect the standardscaler in scikit-learn). by subtracting the in-genre distance from the out-of-genre distance for each pair, we obtained a measurement of how much closer works in each genre are to each other than to the rest of the corpus. again, we found that closely-knit genres produce closely-knit groups of reviews, r = . , p < . .

figure . correlation between distance-differences for books and reviews. in each case a pair of books in the same category are cross-compared to a pair of books outside the category, but published in the same years.

conclusions and future work

we conclude that the similarities and differences between texts (measured, for instance, by cosine similarity) do correlate with similarities and differences in reception, or at any rate in book reviews. when we look at individual pairs of books, the relationship may not be very strong; perhaps r ≈ . . but if we back up, and gather books in categories, the aggregate relationship is stronger.
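the distance-difference measurement described in the methods (out-of-genre distance minus in-genre distance) can be sketched as below. the function name, the triple format and the choice to average the differences within a genre are assumptions made for illustration; the text does not say how the per-pair differences were aggregated.

```python
def genre_cohesion(comparisons, distance):
    """Mean of (out-of-genre distance - in-genre distance) over sampled comparisons.

    comparisons: list of (work_a, work_b, control) triples, where work_a and
        work_b share a genre and control is a work from outside the genre,
        chosen by the caller to match work_b's publication year.
    distance: any function taking two works and returning a number, for
        example a cosine distance on scaled word frequencies.

    A larger positive value means works in the genre sit closer to each
    other than to the rest of the corpus.
    """
    differences = [
        distance(work_a, control) - distance(work_a, work_b)
        for work_a, work_b, control in comparisons
    ]
    return sum(differences) / len(differences)
```

running the same function once on book texts and once on the matching review texts yields one value per genre for each, and those two series are what would then be correlated.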
closely-knit genres also produce clusters of closely-related reviews. this could be to some extent a verbal accident, if book reviews were distinguished by exactly the same unusual words overrepresented in the book texts. for instance, we might imagine that references to “valor” and “surrender” would characterize war stories (as well as reviews written about them). but inspection of the most strongly marked categories in our corpus does not lead us to credit this explanation. for instance, books are likely to be folklore when they mention fairy, witches, or invisible; those are some of the strongest features in our predictive model. but reviews are likely to be about folklore when they mention traditions, collected, and popular. in both contexts, folklore is marked by a distinctive diction—but it is a different diction in each context. so we suspect the coherence of these categories is not a purely verbal accident, but reflects an underlying social distinction. books of folklore are genuinely unlike other kinds of fiction; that social distinctiveness is reflected (in different ways) both in their texts and in their reviews. we also tested this hypothesis in several other ways. for instance, we found that the correlation between review-similarity and book-similarity holds (more weakly, r = . ) even if we use two different subsets of works in each genre: one to test the similarity of book-texts, and a different, disjoint set to test the similarity of review-texts. having validated this measurement of generic distinctiveness, we then used it, experimentally, to measure a broad structural change in fiction between and . we measured the difference between in-genre and out-of-genre comparisons, and dated each pair of books to the midpoint of the two publication dates (since we precisely matched out-of-genre comparisons to the dates of the in-genre pair, the midpoint date was always the same). 
we found that genres became more closely knit across this period: that is, works of fiction became more similar to other works in the same genre than they were to randomly selected works from the same publication year.

figure . the consolidation of genre?

this pattern is open to different interpretations. it could reveal a process of differentiation (if we focus on the growing differences between genres) or consolidation (if we focus on the strengthening of in-genre similarity). but since the categories we are using are drawn from late twentieth-century librarians' judgments, it could also be that works of fiction simply fit those judgments better as we move closer to the late twentieth century. to decide between these interpretations, we have subsequently repeated the experiment with categories inferred from a topic model of twentieth-century book reviews. but that's a different project and a separate paper.

biron - birkbeck institutional research online
mckim, joel and leslie, esther ( ) life remade: critical animation in the digital age. [editorial/introduction] downloaded from: http://eprints.bbk.ac.uk/id/eprint/ /

life remade: critical animation in the digital age
esther leslie, birkbeck, university of london
joel mckim, birkbeck, university of london

animation and contemporary life are enmeshed like never before.
a growing number of the media images we consume are in animated form (from fully animated features to cgi-laden blockbusters and advertisements); recourse to common animation software and aesthetic approaches significantly blurs the lines between previously distinct artistic and design practices (from video games, to special effects, to architecture and contemporary art); and through techniques of computational modelling and visualization, animation is increasingly fundamental to processes of knowledge production and the creation of various modes or elements of life. this appears therefore to be a particularly ‘critical’ moment to ponder animation’s expanded cultural and political role. this special issue also provides an opportunity to consider animation’s own powers of critique – the ways in which the digital animated image is increasingly being deployed explicitly as a means of intervening in social and political arenas ranging from human rights advocacy to ecological activism. and finally, we hope this collection of essays serves to further the already rich examination of the politics of more traditional forms of animation in the current digital age. this special issue thus builds upon recent scholarship that has already begun to contend with animation’s expanded presence and its inherent political and critical significance, including suzanne buchan’s insightful explorations of the contemporary ‘pervasiveness’ of animation ( ), karen beckman’s call to finally bring animation out of the ‘margins’ of film theory ( ), and last year’s excellent special issue of this journal edited by eric herhuth addressing ‘the politics of animation’ ( ), from labour conditions to national identity formation.
the claim that the current ubiquity of digital technologies is largely responsible for animation’s elevation in status from a somewhat marginal aesthetic tradition (associated primarily with children’s entertainment or experimental film) to arguably the dominant contemporary media form is by now a familiar one. paul wells notes that in the digital era ‘the dividing line between live action and animation is essentially effaced’ ( : ); lev manovich asserts that ‘the new media of d computer animation has ‘eaten up’ the dominant media of the industrial age – lens based photo, film and video recording’ ( : ); and alan cholodenko views this situation as the confirmation of animation’s position as the ‘paradigm of all forms of cinema’ ( : ), to take but three prominent examples. these commentators quite rightly, and often presciently, draw attention to digital animation’s gradual expansion into numerous previously distinct moving image domains. what is perhaps less often emphasized or elucidated is animation’s particularly privileged relationship to computational information – its position as one of the primary or default modes of visually representing digital data. german media philosopher friedrich kittler provocatively suggested over thirty years ago that the general digitization of information and the expansion of computer to computer communication would reduce human-oriented interfaces of sound and image to mere ‘surface effects’ or ‘eyewash’ ( : ) – a theme that current theorists such as mark b. n. hansen ( ) and bernard stiegler ( ) have developed further. yet while we acknowledge that computational data (the binary code processed by machines) is in fundamental ways unreadable or unavailable to human senses, this only serves to highlight the role that animation often plays in translating this digital information into human-oriented visual forms – literally bringing data to life by allowing it to enter into the realm of human experience. 
it’s this function of mediating between the digital and the human senses that pushes animation into the epistemic and design realms of data visualization, modelling, simulation and rendering – forms of contemporary representation with enormous political and social resonances. taking critical account of how the rise of the digital has both transformed existing practices of animation and produced entirely new domains of animated image work requires that animation studies engage to a greater degree with new scholarship emerging from digital media studies, as well as contemporary political theory. we hope that the contributions to this special issue will help further this already developing exchange. while it’s difficult to imagine an area of animation that has not been touched by the shift towards the digital, we’ve highlighted five areas of exploration for this special issue: the expansion of animation into new non-entertainment oriented domains; the emergence of digital animation as a key aesthetic technique within contemporary art; the impact of the digital on traditional spheres of animation; the importance of material considerations of animation infrastructure and interfaces; and the critical histories and futures being made possible by digital animation. many of the contributions to this special issue do not fit neatly or exclusively within any one of these categories, but speak instead to issues that range across these areas of investigation.

expansion of animation

one of the most dramatic impacts of the shift towards the digital has been the expansion of animation into considerably more facets and areas of contemporary life. as a primary means of representing computational information, digital animation has moved from aesthetic and cultural contexts into the sphere of knowledge production and visual argumentation.
as a rhetorical tool, a data visualization technique and a means of information exchange, animation is now employed in such diverse disciplines as life sciences, engineering and law. within these various technical fields of application, digital animation is often a method of making visible phenomena and temporal processes that would otherwise be unrepresentable. in this special issue, for example, the anthropologist natasha myers describes how protein modellers produce digital animations of nano-scale molecular structures invisible to human sight. scientists employ digital animations to visualize climate simulations involving complex variables and extended time scales (doyle, ), and forensic animations are mobilized within the courtroom as evidentiary re-enactments of past events (ma, zheng and lallie, ). these emerging moving image forms animate digital information, bringing computational data into the realm of human understanding and discussion. while the use of animated moving images in non-entertainment or epistemic contexts extends back much further in time, the expansion of animation into the domain of ‘technical images’ (bredekamp, dünkel and schneider, ) has certainly intensified in the digital age. in their contributions to this special issue, both pasi väliaho and thomas elsaesser discuss the us military’s use of animation and computer-generated images for both training and therapeutic purposes, referencing harun farocki’s sustained investigation of these and other ‘operational’ images used to shape contemporary human psychic life. väliaho inserts these practices into a much longer media archaeological trajectory of animation’s ‘power over the plasticity of our minds.’ elsaesser suggests that the computer-generated animations of digital post-production shift the logic of cinema from a visual capturing of reality to a ‘harvesting, extraction, and manipulation’ of reality akin to the genetic or molecular management of bio-engineering.
digital animation, in other words, does not capture an already existing reality; it produces, cultivates or ‘grows’ its own. joel mckim’s contribution to this issue explores the emergence of digital animation in contexts ranging from architectural design to post-conflict human rights investigations. mckim argues that, in the hands of certain artists and designers, digital animation ‘makes possible new responses to the present moment of urban crisis.’

animation in the art world

it is not uncommon for contemporary artists to use animation as their medium. the use of animation specifically as a medium is not quite synonymous with the production of art animation or experimental animated films. artists who use animation today where they might once have used paint or bronze are frequently fascinated by the long history of animation as the contrary of art, a ‘lowbrow’ form to which a kitsch quality adheres. animating artists play with this history – for example, in the way mark leckey has by returning repeatedly to the character felix the cat, who appears as a sign of the beginnings of merchandising, of media fascination and of the original moment of tv broadcast (a revolving maquette of felix was a test broadcast in the us). but artists are also enticed by how a cartoon character, such as felix, in his shape-shifting abilities, proposes a tantalising form that undermines the very notion of form, poses posing at its very core, revealing the constructedness of all things, their artifice and contingency. alex charnley’s essay in this collection considers a contemporary artist who has made several animated artworks which deal with self-enstaging, the adoption of typecasting, the prevalence of stock imagery and faking it.
charnley’s analysis of jordan wolfson explores the ways in which wolfson’s recent use of the animated medium – replete with stereotypes, pratfalls and incongruities – lends itself to a complex reconstruction of the shadowy underbelly of present-day popular culture, specifically the mobilisation of nasty humour in the guise of the cartoonic in current neo-rightist online culture. animation is shown here as a locus for exploring the curious side-shoots of contemporary us political discourse, which are effervescent in the wake of trump. esther leslie’s essay explores the significance of the cloud as a frequently recurring image (or even character) within cgi-based art work. although an aesthetic preoccupation dating back to constable’s paintings and further still, the clouds now appearing in everything from studio aka advertisements to the post-pop productions of the artist group friendswithyou all have the cloud, the figure of our omnipresent contemporary digital surround, looming behind them.

politics of traditional animation in the digital age

patrick crogan’s essay is directed more towards mainstream us culture, seeking at the core of animation’s technical procedures, including the deep structure of contemporary software, a clue as to why the medium, as deployed in the hollywood blockbusters, quite against its historical orientation towards potential, is imbricated in spectacular visions of destruction and the cauterising of any future other than that of capitalist consumerism. the sophisticated integrations of such a consumerist future, still imperfectly employed but open to adaptation, are spelt out vividly in marc steinberg’s contribution on media mix and the fascistic ‘total mobilization’ of all areas of economy, technology and desire in anime designed for children. crogan’s essay also considers what happens when two modes – the analogical and the digital – meet.
similarly, annabelle honess roe explores a long relationship between animated sections and live action film, specifically in order to consider how this translates into the contemporary environment of mainly digital film making. at issue here is how the codes of realism are buffeted or supported in the digital epoch through the evocation of animated styles familiar from analogue animation. animation, the animation that distances itself from photorealism or other ‘connective’ strategies, retains a power to disrupt, to introduce a critical or political note into the unfurling of documentary film with its illusions of conveying the truth of the world. how might such deployments of interjecting animation, of the digital in the guise of the analogue even, marry with thomas elsaesser’s observation here that ‘the digital image is now the primary reference point for all kinds of images, including analogue images, in just the way that gramophone records have had to be relabelled ‘vinyl’, because they are henceforth seen from the implied perspective of the cd or the mp download’?

digital infrastructures

despite its growing pervasiveness, the production of digital animation remains, for the most part, extremely capital- and resource-intensive. in a media moment partly characterized by democratized access to media production and cheap reality television, feature-length animation represents an expensive, technologically demanding and labour-heavy counterpoint to these trends (see for example herhuth, ). emphasizing the material and infrastructural networks that underpin the apparently ‘ephemeral’ and ‘wireless’ computational technologies we have come to take for granted has been one of the most dynamic areas of contemporary digital scholarship.
considerations of current animation production, with its server banks, post-production studios and render-farms, would do well to draw from these material investigations of the data centres (hu, and holt and vonderau, ) and fibre-optic cable systems (starosielski, ) that support our digital culture. media scholars have also crucially highlighted the significant ecological impact of the technological devices used to both produce and view digital animation (cubitt, ), tracing their life cycle from the geological extraction of rare minerals and precious metals required to construct them (parikka, ) to their eventual transformation into toxic e-waste residing in landfills and dumping sites in the developing world (gabrys, ). a material consideration of digital animation might also take into account the tools, interfaces and techniques inextricably bound up in its process of production and consumption, an area of study that has been productively developed by both animation scholars and digital theorists more widely. rather than engage in textual media analysis, these scholars seek to better understand the software – from flash (salter and murray, ) to autodesk (wood, ) – codecs (cubitt, ), compositing techniques (lamarre, ) and platforms (gillespie, and bratton, ) that enable the creation and distribution of contemporary digital media. in this issue sean cubitt considers the implications of the constellation of material objects, infrastructures and formats that must come together (from vector graphics to data servers) to create a digital character like gore verbinski’s animated chameleon rango.
our encounter with rango’s world, he claims, is an ‘ethical compact’ in which ‘we have the responsibility as audience to oversee the material conditions of its existence.’ leslie’s aforementioned contribution explores both the material and metaphorical implications of the cloud, with global cloud computing emerging as an economic and technological necessity for large scale animation. ‘the cloud . . . has implications for animation,’ she argues, ‘which is now peculiarly susceptible to the rapid technological changes in computing.’

critical histories and futures

the final question that this special issue explores concerns the forms of critical temporality made possible by digital animation – the ability of animation to introduce new or alternative histories and futures. in our current situation there is a general feeling, we suggest, that politics enacted in the present often appears to arrive too late. there is a sense, in other words, that the conditions of present action have in some ways already been predetermined by those with the means to shape the future – via systems and technologies of modelling, simulation, prediction and speculation (amoore, and berns and rouvroy, ). it is as if by the time we arrive at what we hoped would be a better future, we find that it has long since been developed, partitioned and monetized. in this political context of temporal colonization, animation remains a crucial access point to the future. digital animation allows for future worlds or alternative versions of this world to be both envisioned and argued for. but animation has also become a very important way to connect with or ‘re-animate’ the past. through the animated image we recreate past events or bring to life otherwise unavailable histories, often with an explicitly political dimension. increasingly, the critical imperatives we face also involve time scales that extend beyond human lifespans and challenge the human political imagination.
with an issue such as climate change, for example, we struggle to comprehend notions of deep time that extend backwards to geological eras prior to human impact (haraway, and hamilton, ) and forwards to the possibility of a time after human existence (danowski and de castro, and paglen, ). timothy morton uses the term ‘hyperobjects’ to describe ‘things that are massively distributed in time and space relative to humans’ ( : ). difficult-to-conceive phenomena like global warming are hyperobjects, according to morton, but so are styrofoam cups and plastic bags, disposable items that will long outlast their human manufacturers. morton points to the work of animation artist marina zurkow (whose two ecologically themed mesocosm pieces each have a duration of over hours) as one example of how aesthetic interventions may help us to begin to engage with the fundamental incomprehensibility of hyperobjects. digital animation has undoubtedly played a central role in the attempt to visualize and apprehend the extended-timescale events that make up our current climate crisis. to take but one example from august of this year, antti lipponen, a researcher at the finnish meteorological institute, tweeted a visually simple, but rhetorically devastating animated gif depicting the escalating number of temperature anomalies in countries from to . this one visualization indicates the potentially powerful political relationship between digital data and animation. roe’s aforementioned contribution to this issue identifies the critical and ‘disruptive’ capacities of animated interjections into live action documentary films, highlighting the respective low-tech and victorian styles of bowling for columbine’s sardonically violent ‘history of america’ digital segment and the animated account of a disastrous colonial ‘war for resources’ in the climate themed the age of stupid.
she argues that increasingly available digital technologies are allowing live-action documentary filmmakers to avail themselves of the ‘rhetorical potential’ and ‘critical, political possibilities of animation’ in relation to both the past and the impending future. väliaho’s already mentioned contribution provides a siegfried zielinski-like deep time archaeology of animated media that runs from ignatius of loyola to athanasius kircher to sergei eisenstein. and the previously introduced contribution by mckim highlights the use of digital animation in both historical urban reconstructions and speculative designs of possible urban futures.

the genesis of this special issue was a two-day symposium held at birkbeck, university of london in june of , entitled: life remade: the politics and aesthetics of animation, simulation and rendering. we would like to thank all of the participants of that symposium, many of whom are present in this issue either as writers or references. they are: erika balsom, suzanne buchan, sean cubitt, thomas elsaesser, anselm franke, kitano keisuke, gillian rose, susan schuppli, richard squires, hito steyerl, toshiya ueno, pasi väliaho, eyal weizman and liam young. we would also like to thank the birkbeck institute for the humanities for sponsoring that symposium and suzanne buchan for editorial guidance on this special issue.

references

amoore, l. ( ) the politics of possibility: risk and security beyond probability. durham: duke university press.
beckman, k. (ed.) ( ) animating film theory. durham: duke university press.
bratton, b. h. ( ) the stack: on software and sovereignty. cambridge: mit press.
bredekamp, h., dünkel, v. and schneider, b. ( ) the technical image: a history of styles in scientific imagery. chicago: university of chicago press.
buchan, s. (ed.) ( ) pervasive animation. london: routledge.
cholodenko, a. ( ) ‘“first principles” of animation’, in beckman, k. (ed.) animating film theory. durham: duke university press, pp.
– .
cubitt, s. ( ) finite media: environmental implications of digital technologies. durham: duke university press.
cubitt, s. ( ) the practice of light: a genealogy of visual technologies from prints to pixels. cambridge: mit press.
danowski, d. and castro, e. b. v. de ( ) the ends of the world. translated by r. nunes. cambridge: polity press.
doyle, j. ( ) mediating climate change. farnham: ashgate.
gabrys, j. ( ) digital rubbish: a natural history of electronics. ann arbor: university of michigan press.
gaycken, o. ( ) devices of curiosity: early cinema and popular science. oxford: oxford university press.
gillespie, t. ( ) ‘the politics of “platforms”’, new media & society, ( ), pp. – .
hamilton, c. ( ) defiant earth: the fate of humans in the anthropocene. cambridge: polity press.
hansen, m. b. n. ( ) feed-forward: on the future of twenty-first-century media. chicago: university of chicago press.
haraway, d. j. ( ) staying with the trouble: making kin in the chthulucene. durham: duke university press.
herhuth, e. ( ) ‘the politics of animation and the animation of politics’, animation, ( ), pp. – .
herhuth, e. ( ) pixar and the aesthetic imagination: animation, storytelling, and digital culture. berkeley: university of california press.
hu, t.-h. ( ) a prehistory of the cloud. cambridge: mit press.
kittler, f. ( ) gramophone, film, typewriter. translated by g. winthrop-young and m. wutz. stanford: stanford university press.
lamarre, t. ( ) the anime machine: a media theory of animation. minneapolis: university of minnesota press.
leslie, e. ( ) ‘animation and history’, in beckman, k. (ed.) animating film theory. durham: duke university press, pp. – .
ma, m., zheng, h. and lallie, h. ( ) ‘virtual reality and d animation in forensic visualization’, journal of forensic sciences, ( ), pp. – .
manovich, l. ( ) software takes command: extending the language of new media. london: bloomsbury publishing.
morton, t.
( ) hyperobjects: philosophy and ecology after the end of the world. minneapolis: university of minnesota press.
paglen, t. ( ) the last pictures. berkeley: university of california press.
parikka, j. ( ) a geology of media. minneapolis: university of minnesota press.
parks, l. and starosielski, n. (eds) ( ) signal traffic: critical studies of media infrastructures. urbana: university of illinois press.
rouvroy, a. and berns, t. ( ) gouvernementalité algorithmique et perspectives d’émancipation, réseaux, ( ), pp. - .
salter, a. and murray, j. ( ) flash: building the interactive web. cambridge: mit press.
starosielski, n. ( ) the undersea network. durham: duke university press.
stiegler, b. ( ) automatic society. translated by d. ross. cambridge: polity press.
wells, p. ( ) basics animation: scriptwriting. lausanne: ava pub.
wood, a. ( ) software, animation and the moving image: what’s in the box? houndmills: palgrave macmillan.

oliver gaycken, for example, examines the historical trajectory of ‘the practice of modelling, which provides a rich vein of overlap between scientific visualization and animation techniques’ ( : ).
as esther leslie argues elsewhere, ‘animation does to history what it does to nature. animation evokes history, plays with it, undermines it, subverts it, but it does not have it, just as it does not have nature. it has second nature. or different nature. it has different history. it models the possibility of possibility’ ( : ).

perspectives on university library automation and national development in uganda

by mr. robert s. buwule, kyambogo university, uganda; university of pretoria, south africa
ass. prof. shana r. ponelis, university of wisconsin-milwaukee, usa; university of pretoria, south africa

abstract

academic libraries in universities store large volumes of research that can be used for development purposes to support teaching, learning, research, innovation, community outreach and partnerships.
library automation incorporates the adoption of integrated library systems (ils). effective adoption of an ils enables broad-based access to global and local knowledge sources to solve local, regional and national development challenges. using a sequential mixed methods approach in a case study of a ugandan public university, kyambogo university, this study investigated the perceptions of librarians, information workers and other university stakeholders with respect to library automation and the contribution thereof to national development. the results confirmed that the ils improved library operations and plays an important role in supporting national development. this study also highlights the continued challenges of adopting an ils in developing countries such as uganda, which, if addressed, could further improve information service delivery for a nation’s socio-economic transformation.

keywords

academic library, university library, library automation, integrated library systems, technology adoption, access to information, national development, uganda, kyambogo university

introduction

academic libraries in universities store large volumes of research conducted at universities and other research institutions that can be used for development purposes to support teaching, learning, research, innovation, community outreach and partnerships (bossaller and atiso, : ). as such, public (and private) academic libraries can play a central role in the collection and dissemination of both international and local content by providing access not only to students, faculty and researchers but also to the broader community and society. information and communication technologies (icts) are central in facilitating the effective storage, communication and dissemination of information. the world summit on the information society (wsis) declared that “local authorities should play a major role in the provision of ict services for the benefit of their populations” (wsis, ).
fostering digital opportunities strengthens capacities for scientific research, information sharing, cultural creations and exchanges of knowledge (unesco, ). the international federation of library associations and institutions (ifla, ) further believes that increasing access to information and knowledge across society with the help of available icts greatly supports sustainable development and contributes to improving people’s lives. icts hold the potential to bridge socio-economic divides (bossaller and atiso, : ) and those in positions of authority have a responsibility to ensure that they do. library automation is the direct application of ict to library functions such as acquisition, circulation, cataloguing and serials control (amekuedee, ). libraries automate their services using integrated library systems (ils) to improve efficiency and enhance access to library resources (webber and peters, ). the effective adoption of icts such as an ils in academic libraries will “accelerate the level of knowledge acquisition and consequently improve national development” (ani et al., : ). librarians and information workers were among the first to realise the importance of the internet in the provision of information services to the public (de saulles, : ). librarians therefore partly fuelled the expansion in the quantity and communicability of information by adopting icts such as ilss and the internet in their libraries. the libraries’ ability to make information available electronically directly facilitates interaction with information seekers in a more cost-effective manner (amekuedee, : ). individuals thus play an important role in ict adoption, a role that is reflected in virtually all technology adoption models, such as the perceived usefulness and perceived ease of use constructs in davis et al.’s ( ) technology acceptance model (tam) and in rogers’ diffusion of innovations (doi) theory.
the purpose of this article is to determine the perceptions of the librarians and information workers involved in library automation, and of other stakeholders in uganda, with respect to library automation and its contribution to national development. this paper is structured as follows: after a discussion of the role of universities in africa and uganda, a brief history of library automation is presented, followed by the methodology of the study, the results and a discussion. we conclude the paper with the key perspectives on the role of library automation in national development.

the role of universities in africa and uganda
the former united nations secretary-general, kofi annan, stated that “the university must become a primary tool for africa’s development in the st century” (annan, ). according to sutz ( ), “to increase their contribution to development through the production and distribution of knowledge, universities in developing countries need to transform themselves into developmental universities.” among other things, such developmental universities must clarify, analyse and solve local, regional and national problems in partnership with government, industry, community and other research organizations and make the resulting developmental knowledge available and accessible to the broader society regardless of socio-economic status (fredua-kwarteng, ; ). the african union (au) agreed on a set of goals that all african countries are expected to achieve by (african union commission, : ). rooted in pan-africanism, agenda provides a robust framework for addressing past injustices and for building the infrastructure that supports accelerated integration and growth, technological transformation and development through african integration.
in an effort to develop the capacity of africa’s citizens to be effective change agents for the continent’s sustainable development as envisioned by the au and its agenda , the african union commission has developed a comprehensive ten-year continental education strategy for africa. on january , heads of state attending the th african union summit in addis ababa approved the continental education strategy for africa - (cesa - ). the objectives of cesa - include building, rehabilitating and supporting education infrastructure and revitalising and expanding tertiary education, research and innovation, particularly at a postgraduate level, to address continental challenges and promote global competitiveness. uganda is a landlocked east african country with a geographical area of , square miles and a population of million people (bbc, ) that gained independence in . although uganda is a multi-lingual country with more than different languages, english was inherited from the colonial period as the language of government and education. by sub-saharan african standards, uganda has a vibrant media sector, with nearly private radio stations, dozens of television stations and print outlets. by , uganda had . million internet users (bbc, ). there are seven public universities licensed by the uganda national council of higher education. the majority of public and private universities are located in central uganda. public universities have been established in other regions in recent years, the most recent being lira university ( ) and muni university ( ) in the northern region, and soroti university ( ) in the eastern region.
although some universities are entirely new, several universities were formed by upgrading and/or merging tertiary and technical colleges; for example, bishop tucker theological college ( ) became the uganda christian university in (chartered in ), and kyambogo university (established in ) was formed by merging the uganda polytechnic-kyambogo (upk), the institute of teacher education-kyambogo (itek) and the uganda national institute of special education (unise). the cabinet of the republic of uganda ( : iii) approved a national vision for “a transformed ugandan society from a peasant to a modern and prosperous country within thirty years.” the republic of uganda ( : ) therefore plans to facilitate and nurture human resources and skills development to support national development. to achieve this, emphasis will be put on research and development, acquisition of modern scientific knowledge and technology, and building of knowledge networks. this will be supplemented by building partnerships with local and international institutions. none of this can be achieved without the dissemination of scientific information and knowledge through academic libraries that are automated and networked by means of an ils.

brief history of library automation
library automation initiatives were first reported in the united states of america in the s and later in western europe, but took major strides in the s when the library of congress designed the machine readable catalogue (marc) format for communicating library bibliographic data on magnetic tape (saffady, : ; borgman, : ). thereafter libraries started using key punched cards in combination with sorters, collators, and other unit record equipment to replace manual record keeping practices (saffady, : ). in the s more libraries started automating by sharing computer infrastructure, which was very expensive at the time.
this was the period that saw the writing of single-purpose software for cataloguing, circulation, serials control and other library functions. by this time other countries had developed their own marc formats and ifla convened an international meeting to develop the universal marc (unimarc) standard. the late s and the s saw the transition from in-house built library systems to integrated library systems. through their local area networks and dial-up modems, libraries set up online catalogues while others shared bibliographic data and set up union catalogues (borgman, : ). it was also around this time that this technology started to be adopted in some leading african libraries, especially in south africa. in sub-saharan africa the library automation campaign was spearheaded by the united nations educational, scientific and cultural organization (unesco). the computerised documentation service/integrated set of information systems (cds/isis) was developed and distributed to many academic and public libraries in this region to automate their bibliographic information (mutula, : ). cds/isis was not only used in africa but was widely used by libraries, information and documentation centres throughout the world where no predefined library automation structures or standards were available (de smet, : ). much like their african counterparts, many libraries in asia and in central and eastern europe entered the s with minimal library automation, mostly bibliographic databases and applications on standalone computers (borgman, : ). this is confirmed by husain and ansari ( : ), who report that library automation gained some momentum in india in the s, driven mainly by the sharply falling prices of computer hardware, the availability of ilss and the increased enthusiasm of library professionals for embracing technology. haider ( ) reports the same scenario in pakistan.
the emergence of the internet in the s added a new dimension to library automation as both library workers and users started to access the library catalogue remotely (mutula, : ). the twenty-first century brought free and open source software (foss) as a response to the digital divide that had by this time been created between the automated libraries in the developed world and the traditional, manual libraries in developing countries like uganda. a number of foss ilss were developed around this time, such as koha and evergreen (mutula, : ). unesco also upgraded cds/isis, which was dos-based, to winisis for ms windows and later to abcd for a web-based environment (de smet, : ). today significant advancements in sophisticated icts have triggered dynamic changes in software, hardware, networks and mobile technology which are constantly increasing the capacities of ilss (rosa and storey, ). cloud computing is creating highly scalable platforms that facilitate quick access to hardware and software over the internet, in addition to easy management and access by non-expert users (romero, : ). coupled with the mass penetration of smartphones and tablets, broad-based adoption and access is increasingly possible.

methodology
a single case, kyambogo university (kyu), was selected and studied in-depth using a sequential exploratory mixed methods design (shown in figure ) to elicit the perspectives of the target population, kyu staff and students, on library automation and the role the ils plays in supporting national development. although a single case limits the generalizability of the study, the access and familiarity of the lead author ensured the case was information rich with respect to the topics under investigation (patton, ).

figure . research design

in the qualitative strand, semi-structured interviews were conducted in english with purposively selected participants, representing . % of the target population.
semi-structured interviews allowed the researchers to ask standard questions supplemented by individually tailored questions to probe research subjects’ reasoning or to get more clarification (leedy and ormrod, : ). in the subsequent quantitative strand a questionnaire, also in english, based on results from the first strand was used to survey the entire target population, for which the overall response rate was . %. table provides the detail on the target population and sampling strategies for each strand.

table . target population and sample response rates
kyu section/unit total population interviews (n) (%) survey (n) (%)
kyu top management .
kyuls management team .
ict unit . .
general library staff . .
kyu library committee members . .
kyu ict unit (e-kampus)
the students guild . .
total . .
kyu: kyambogo university
kyuls: kyambogo university library service

document review was used to inform and support the development of data collection instruments as well as the analysis and interpretation of the data. documents included published and unpublished reports from kyu, administrative documents, newspapers, journal articles and books in print and electronic form (nieuwenhuis, : ). the document data gathering technique helped in reconstructing events, critical incidents and their social relationships. sequential mixed data analysis was used to analyse the data collected from the interviews (teddlie and tashakkori, ). all quantitative data were analysed using spss statistical software to generate descriptive statistics.

background on the case
kyambogo university (kyu) is one of the seven public universities of uganda (uganda national council of higher education, ). as mentioned above, kyu was established in by amalgamating three existing institutions. following the merger, the three institutional libraries were merged to form the kyambogo university library service (kyuls) (kyambogo university, ).
hardly anything has been reported about automation initiatives before the merger. mutula ( ) briefly reports that the upk library was accessing the african virtual university digital library, which was searchable via the internet. attempts at automating the kyambogo university library catalogue post-merger with an in-house built solution were initiated by staff members in using microsoft access, based on the local needs of the three branch libraries. the respective databases were never integrated as each branch used a stand-alone computer and there was no network linking the branches of kyuls. in kyuls started entertaining the idea of adopting an ils. this idea was given further credibility by the consortium of uganda university libraries (cuul), which organised training in koha in for its member institutions, including kyu. koha is an open source, predominantly web-based ils that is being widely adopted around the world by academic libraries with limited budgets. cuul has actively encouraged academic libraries across east africa to adopt koha and has conducted several training sessions on koha since (adoma and ponelis, ). the kyuls staff member who attended cuul’s koha training was supposed to train other staff members at kyu, and a one-week koha training session was organised at kyu for a few key staff in april . koha software was then installed on the university server and customised beginning in june . another koha orientation for a selected number of library staff was carried out in august and thereafter koha was launched at kyu. study programs and student intakes are increasing yearly. kyu’s current enrolment is over , students on campus and over , students from affiliated institutions, mainly primary and secondary teacher training colleges. kyu’s physical library space cannot house the , students, staff members and external users, and the only alternative is to access the library online.
the adoption of koha opened up kyu to a number of the benefits of ilss enumerated by ayankola and ajala ( ), including flexibility, speed, ease of updating and manipulating bibliographic data, and remote access to an item by multiple users simultaneously. however, as in the university of malawi’s academic libraries, there is a low level of computer technology replacement (mapulanga, ). the few computers and other ict equipment donated to support library automation initiatives in kyuls are almost obsolete.

study results

contribution of library automation to library operations
the majority of respondents indicated that library automation contributes to academic library operations ( respondents, %). respondents were asked to identify scenarios of where and how automation of academic libraries and information centres improves library information services (table ).

table . how library automation supports library operations
theme frequency (n) (%, n= respondents) (%, n= responses)
tracks library records .
saves time . .
increases productivity .
eases work . .
increases efficiency . .
allows interoperability .
saves costs .
ensures accuracy .
increases library’s influence . .

contribution of library automation to national development
in the first strand, interviewees were asked to assess whether library automation contributes to national development, directly or indirectly, and the majority ( %) stated that it does. those interviewees who felt that library automation contributes to national development were asked to elaborate on the reasons for their affirmation. a list of possible impacts of automation in university libraries was provided, from which interviewees selected the following as the top contributions:
• speeds up service delivery;
• improves access for those with physical disabilities;
• promotes the library as a centre of excellence;
• increases the visibility of the nation’s local content; and
• aligns the library with the national vision.
survey respondents in the second strand were provided with a list of the millennium development goals (mdgs) and asked about the contribution of library automation to the achievement of the mdgs. (at the time of data collection, the united nations had not yet launched the sustainable development goals or sdgs.) according to the results of the study, . % of the respondents affirmed that library automation directly or indirectly contributed to the achievement of the mdgs. these respondents further elaborated on the different ways library automation has contributed to the achievement of the mdgs. as shown in table , the primary contribution of library automation to the mdgs is seen to be enabling faster and easier access to large stores of information, with support for research and innovation a secondary contribution.

see the united nations development programme’s overview and achievement of the mdgs at http://www.undp.org/content/undp/en/home/sdgoverview/mdg_goals.html.

table . library automation and impact on the millennium development goals
positive impact frequency (n) (%, n= respondents)
provides faster and easier access to information .
enables mass storage of information .
facilitates research for development .
supports scientific innovation .
provides a competitive advantage for kyuls .
kyuls: kyambogo university library service

survey respondents in the second strand were also asked whether library automation facilitates collaboration and networking among library workers and users for purposes of development both nationally and internationally. as shown in table , the majority of respondents ( . %) believe that library automation creates a firm foundation of international cooperation, networking and collaboration.
the respondents who believed that library automation enhances collaboration and networking reasoned that library automation allows library staff members and users to collaborate with developmental experts, researchers, innovators and (potential) investors online and generally share experiences of accessing information online. these respondents were requested to rate their prior experience of collaboration through library automation. all respondents rated their experience as very good, good or fair, with . % assessing their experience as very good (table ).

table . library automation facilitates collaboration and networking for national development
answer frequency (n) (%)
yes .
no / no opinion .
total

table . experience of external collaboration as a result of library automation
level of experience frequency (n) (%)
very good .
good .
fair .
poor - -
very poor - -
total

challenges to adopting integrated library systems in developing countries
challenges confounding successful adoption and use of the ils were mentioned by several interviewees. survey respondents were thus asked to indicate challenges encountered in library automation. table lists the barriers to library automation at kyuls.

table . barriers to library automation
barrier frequency (n) (%, n= respondents)
internet instability .
procurement bureaucracies .
inadequate top management support .
lack of infrastructure .
shortage of funding .
inadequate staff skills .
ever-changing technology .
technophobia* .
lack of initiative* .
initial stage not well managed* .
lack of champions* .
* as specified under ‘other’ option in questionnaire responses

discussion
the results show that library staff at kyu reported that the ils improved library operations, including the ability to catalogue their holdings for users’ benefit, thereby improving information service provision.
the relatively recent automation of kyuls gives the staff a good basis of comparison in terms of improvements to library operations before and after the ils implementation. the results also confirm the benefits of library automation and implementation of an ils that have been widely reported in the literature; for example, walsh ( : ) reports how libraries have migrated from sending paper messages to sending them by email or sms using their ilss, which saves time and postage costs and reduces the need for delivery staff and space. respondents in this study demonstrated support for the ils as well as its role in supporting national development, particularly by making knowledge, both global and local, widely accessible and by supporting research and innovation. thanks to ilss, countries and their citizens can expect their public academic libraries to be able to provide seamless connections to information sources, facilitate remote access to local and international databases to meet their various information needs, support research and innovation and, ultimately, contribute to their national development agendas. furthermore, the ils’s facilitation of collaboration and sharing across institutional boundaries is also seen as contributing to national development efforts. when library users find it easy to access information through the ils, the nation’s education sector is supported as thousands of students, investors, policy makers and other information seekers remotely access the information they need to execute their duties for the nation’s benefit. healy ( : ) posits that university libraries, especially public ones, are uniquely positioned to increase access to information since they provide services such as internet access to members of the community. providing access to the internet coupled with a functional information searching and retrieval tool, the ils, is a significant step toward providing access to information for national development.
universities and institutions of higher learning have long been the primary producers of original research that has led to many world-changing innovations (de saulles, : ). traditionally universities have disseminated this research through peer-reviewed journals; however, the trend is shifting to institutional repositories linked to ilss, which ease access to research and innovation information in universities. conversely, the lack of timely access to textbooks and other relevant information materials leads to students losing interest in learning, so that they no longer invest time in reading. this has a negative impact on their innovativeness, literacy and general knowledge (uncst, ). as reflected in the results above, library automation enhances collaboration and networking with other academic institutions, which can be used to develop the career paths of information and knowledge seekers through peer learning support, mentorship and placement services. for example, according to mutula ( : ), library automation offers access to a diversity of electronic resources from different providers and this digital scholarship encourages collaboration and networking in all aspects of the library users’ learning experience. library automation further contributes to knowledge dissemination since kyuls’ online public access catalog (opac) enables worldwide access and thus increases the visibility of a nation’s local content. librarians at kyuls should explore creative ways to increase the visibility of their academic libraries by working with different faculties, schools and colleges to promote the use of ilss (liu and luo, : ). as library users are inspired to read, learn, complete assignments, reflect and think, this impact can be felt even at the national level. missingham and moreno ( : ) make a related point: national development needs to involve all sectors of a nation.
players in national development have to put in place platforms like ilss which allow all-inclusive participation and resource sharing on local and international networks. bossaller and atiso ( : ) state that stable and sustained ilss allow scientists in developing nations to share their work, create collaborative relationships, and create solutions for the most pressing issues in their countries. academic libraries, as leading agencies for disseminating knowledge, create an egalitarian platform for all (eklow, : ). the results show that kyu staff believe that library automation contributed to the achievement of the mdgs. with the introduction of the sdgs, libraries continue to have an essential role in providing the public with access to information to support development (bradley, : ). the library further uses its icts to preserve and ensure ongoing access to developmental information for future generations. because ict plays such a central and highly visible role in library automation, it is easy to forget that library automation as an innovation encompasses more than just the hardware and ils software. successful adoption and adaptation of information systems such as an ils requires the diffusion of supporting, more context-dependent innovations around techniques, procedures, education and training, funding policies and governance policies (lor, ). the respondents’ concerns with respect to the challenges of ils implementation and use are similar to those widely reported in the literature on library automation in sub-saharan africa: a lack of infrastructure (reliable power supply, equipment, connectivity), skills base (lis education and training), management support and supporting policies and procedures (operational, funding, procurement).
consider, for example, the ugandan government’s plan to provide free solar-powered laptops to all ugandan university students, which would reduce costs associated with buying printed textbooks and photocopying teaching materials, and improve access for persons with disabilities (anguyo, : ). this plan, however, is questionable since there is no corresponding infrastructure, such as broadband connectivity, to enable access to information. the biggest impediment is funding, but while funding is necessary it is not sufficient. given sufficient funds, any library can purchase icts, but it takes considerable skill to implement an ils because it changes the way people work, the processes of the library, and users’ relationship with the library (negash et al., : ). without skills and leadership, library automation may still not deliver access to information to the relevant stakeholders. according to scheeder and witt ( : ), ifla proposes reskilling as the first level of its change agenda if library automation is to significantly influence the achievement of national and international development. librarians must personally acquire the skills needed to use an ils to offer services that can positively transform society. reskilling calls for high levels of creativity and innovation among librarians. it may also entail lis training institutions equipping students with up-to-date skills in the use of ilss. in order to empower academic libraries to positively influence national development, leadership development should be considered. mutula ( : ) suggests that these common challenges can be overcome by partnering with consortia and by providing continuous pedagogic training in library automation to academic library staff. the lack of appreciation by policy-makers of the role libraries play in the development of uganda, reported by okello-obura and kigongo-bukenya ( ), is a critical issue.
libraries and university management should play a leading role in advocating for libraries. creating greater awareness of the central role that academic libraries play in national development is vital, particularly given initiatives to transform higher education through the au’s agenda and cesa - and the ugandan government’s vision . the recently released omaswa task force report (nakkazi, ) recommends the formation of ‘innovation universities’ in uganda to serve as engines of industrialisation and economic growth, requiring substantial restructuring of uganda’s nine public universities. furthermore, the minister for higher education, science and technology, hon. tickodri-togboa, stated that higher education institutions must be transformed into “vehicles of industrialisation, employment-wealth creation, inclusive and sustainable development and socio-economic transformation in line with uganda vision ” (nakkazi, ). libraries and their institutional leadership need to ensure that they are adequately empowered and enabled to support a successful transformation. challenges that inhibit library adoption and ils implementation and use are unlikely to be adequately addressed without a greater appreciation of the important role that libraries play. given that inadequate top management support was seen as a major barrier, librarians should begin by advocating internally to canvass support for initiatives and also work to increase awareness among policy-makers with respect to development. libraries are also not mentioned in any of the sdgs, in spite of the lyon declaration on access to information and development launched at the world library and information congress in and the vital role libraries and other information centres can play in support of all sdgs (ifla, ).

conclusion and implications
academic libraries are among the leading agencies supporting knowledge production and disseminating knowledge within a country.
this study shows that academic librarians in research and tertiary institutions such as kyu can make a difference that will support national development by ensuring operational ilss in their libraries that connect all citizens through library automation so that they can communicate and share knowledge with others locally, nationally, regionally and globally. academic libraries should lobby their university management and policy-makers in national governments to support libraries and, in general, to implement ilss. it will also be necessary to have an accompanying infrastructure with updated computers, connectivity, and lis education and training. there can be no library automation without investment in the supporting infrastructure. governments and development partners should fund academic libraries so they may improve information service delivery through their ils to assist with socio-economic transformation. ultimately, such funding for university libraries is vital for the achievement of uganda’s vision and, more broadly, the au’s agenda .

declaration of conflicting interests
the authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

funding
the financial assistance of the carnegie corporation of new york (ccny) that made this study possible is hereby acknowledged. opinions expressed and conclusions arrived at are those of the authors and are not to be attributed to the ccny.

references
adoma p and ponelis sr ( ) open source integrated library systems in academic libraries in uganda: initial results. ifla wlic it section satellite meeting, stellenbosch university, stellenbosch, south africa, - august .
african union commission ( ) agenda : the africa we want. popular ed. addis ababa: african union commission. available at: http://www.un.org/en/africa/osaa/pdf/au/agenda .pdf (accessed april ).
amekuedee jo ( ) an evaluation of library automation in some ghanaian university libraries.
the electronic library ( ): - . anguyo i ( ) is government provision of computers to varsity student realistic? new vision. december, p. . annan, k ( ) statement by h. e. mr. kofi annan, secretary-general of the united nations. opening statement presented at the second phase of the world summit on the information society, tunis, tunisia, november . available at: http://www.itu.int/net/wsis/tunis/statements/docs/io-un-opening/ .html (accessed april ). ani oe, esin je and edem n ( ) adoption of information and communication technology (ict) in academic libraries. the electronic library ( ): – . ayankola ia and ajala sf ( ) the challenges and frustration of software adoption in nigeria libraries: a survey of some selected libraries. philosophy and practice (e- journal): - . bbc ( ) uganda country profile. web document. available at: http://www.bbc.com/news/world-africa- (accessed november ). borgman cl ( ) from acting locally to thinking globally: a brief history of library automation. the library quarterly ( ): – . bossaller j and atiso k ( ) sharing science: the state of institutional repositories in ghana. ifla journal, ( ): – . bradley f ( ) ―a world with universal literacy‖: the role of libraries and access to information in the un agenda. international federation of library associations and institutions ( ): - . davis fd, bagozzi rp and warshaw pr ( ) user acceptance of computer technology: a comparison of two theoretical models. management science ( ): - . de saulles m ( ) information literacy among uk smes: an information policy gap. aslib proceedings: new information perspectives ( ): – . de saulles m ( ) information . : new models of information production, distribution and consumption. london: facet publishing.de smet e ( ) abcd: a new foss library automation solution based on isis. information development ( ): – . eklow s ( ) re-positioning the library in the university: aligning library acquisitions. 
in: the standing conference of eastern, central and southern african library associations. nairobi, kenya.: kenya library association. fredua-kwarteng e ( ) the case for developmental universities. university world news, october. available at: http://www.universityworldnews.com/article.php?story= (accessed april ). fredua-kwarteng e ( ) ethics and the developmental university, university world news, september, available at: http://www.universityworldnews.com/article.php?story= (accessed april ). haider sj ( ) library automation in pakistan. international information and library review, ( ): – . healy am ( ) medlineplus go local: connecting at-risk populations with health care services. in: charbonneau dh (ed) global information inequalities: bridging the information gap. oxford: chandos publishing, pp. - . husain s and ansari ma ( ) library automation software packages in india: a study of the cataloguing modules of alice for windows, libsys and virtual. annals of library and information studies (september): - . international federation of library and information associations and institutions (ifla) ( ) access and opportunity for all: how libraries contribute to the united nations agenda. the hague, netherlands: ifla.available at: http://www.ifla.org/files/assets/hq/topics/libraries-development/documents/access-and- opportunity-for-all.pdf. kyambogo university ( ) history of the university. available at: http://kyu.ac.ug/index.php/contact-us/logo/history. leedy pd and ormrod je ( ) practical research: planning and design. ninth edition. upper saddle river, nj: merrill. liu z and luo l ( ) a comparative study of digital library use: factors, perceived influences, and satisfaction. the journal of academic librarianship ( ): - . lor pj ( ). understanding innovation, policy transfer and policy borrowing: implications for lis in africa. in: th annual public lecture on african librarianship in the st century, university of south africa, pretoria, may . 
available at: https://pjlor.files.wordpress.com/ / /understanding-innovation.pdf (accessed april ). mapulanga p ( ) swot analysis in the planning of information services and systems in university libraries: the case of the university of malawi strategic plans. the bottom line ( ): - . missingham m and moreno r ( ) resource sharing in australia: evaluation of national initiatives and recent developments. interlending and document supply ( ): - . mutula sm ( ) it development in eastern and southern africa: implications for university libraries. library hi-tech ( ): - . mutula sm ( ) library automation in sub saharan africa: case study of the university of botswana program. electronic library and information systems ( ): - . nambogga j ( ) embrace ict, migereko advises youth. new vision, february, p. . nakkazi e ( ) sweeping university reforms to emphasise innovation. university world news, june , . available at: http://www.universityworldnews.com/article.php?story= (accessed april ). negash s, anteneh s and watson rt ( ) a phd in information systems for emerging economies: the addis ababa university model. information technology for development ( ): – . nieuwenhuis j ( ) qualitative research designs and data gathering techniques. in: maree k et al. (eds) first step in research. pretoria: van schaik publishers, pp. - . okello-obura c and kigongo-bukenya imn ( ) library and information science education and training in uganda: trends, challenges and the way forward. education research international : - . patton mq ( ) qualitative evaluation and research methods. third edition. newbury park, california: sage publications. republic of uganda. ( ). uganda vision . kampala: national planning authority. available at: http://npa.ug/wp-content/themes/npatheme/documents/vision .pdf (accessed april ). romero nl ( ) ―cloud computing‖ in library automation: benefits and drawbacks. the bottom line ( ): – . available at: http://dx.doi.org/ . / . 
rosa k and storey t ( ) american libraries in : creating their future by connecting, collaborating and building community. ifla journal, ( ): - . saffady w ( ) library automation: an overview. library trends ( ): – . scheeder d and witt s ( ) libraries: a call to build the action agenda. ifla journal ( ): - . sutz j ( ) the role of universities in knowledge production. himalayan journal of sciences ( ): - . teddlie c and tashakkori a ( ) foundation of mixed methods research: integrating quantitative and qualitative approaches in the social and behavioural sciences. thousand oaks, california: sage publications. uganda national council of higher education ( ) public universities. available at: http://www.unche.or.ug/institutions/public-universities (accessed april ). uncst ( ) uganda national council of science and technology: the quality of science https://pjlor.files.wordpress.com/ / /understanding-innovation.pdf http://npa.ug/wp-content/themes/npatheme/documents/vision .pdf http://www.unche.or.ug/institutions/public-universities education in uganda, kampala uganda. unesco ( ) towards knowledge societies. paris, france: unesco. walsh a ( ) using mobile technology to deliver library services: a handbook. london: facet publishing. webber d and peters a ( ) integrated library systems: planning selecting and implementing. santa barbara, california: libraries unlimited. world summit on the information society (wsis) ( ) wsis declaration of principles: building the information society: a global challenge in the new millennium (wsis- /geneva/doc/ -e), geneva, december . available at: http://www.itu.int/net/wsis/docs/geneva/official/dop.html (accessed april ). 
building research data management infrastructure in canada from the bottom-up
charles (chuck) humphrey, university of alberta

where’s canada today?
the following three statements best summarize the situation in canada today around research data management and preservation.
1. a strategic shift has occurred over the past decade from building a national data preservation institution to building national research data management infrastructure.
2. building this national research data infrastructure is taking place from the bottom-up.
3. building this infrastructure from the bottom-up requires intentional, collaborative actions. the driving principle is one of cooperation, not control.

1. shift from institution to infrastructure
national data archive consultation, - oecd access to publicly funded research data, canadian digital information strategy, - consultation on access to scientific research data, international data forum, unesco charter on preservation of digital heritage,

understanding infrastructure
[c]yberinfrastructure is the set of organizational practices, technical infrastructure, and social norms that collectively provide for the smooth operation of scientific work at a distance (p ).
understanding infrastructure: dynamics, tensions, and design. p. edwards, s. jackson, g. bowker and c. knobel. january

research data management infrastructure
• rdmi is the configuration of staff, services, and tools assembled to support data management across the research lifecycle and more specifically to provide comprehensive coverage of the stages making up the data lifecycle. it can be organized locally and/or globally to support research data activities across the research lifecycle.
capitalizing on big data: toward a policy framework for advancing digital scholarship in canada. appendix : definitions

2. bottom-up development
• the brewster kahle principle: just build it
• november , carl directors’ meeting in ottawa
• levels of data stewardship responsibilities
• the research project level
• the local institutional level
• the wider stakeholder level
• across regions, canada, and the globe
• across domains and research programs
• across sectors

stewardship levels and the research lifecycle
[figure: institutional research lifecycle. paul jefferys. data management at oxford. march .]

policy at the institutional level

at the wider research stakeholder level
• a few examples:
• individual institution across sectors
• canada’s international polar year and the development of the ipy data assembly centre network and its transformation into the canadian polar data network
• consortia within region
• ocul/sp cloud storage project
• ocul/sp dataverse network
• regional consortia
• the canadian social science research data private lockss network
• shared functionality enhancements to archivematica for research data
• national membership
• the carl research data management institute

3. successful bottom-up characteristics
• capitalize on the energy driving the sense of urgency around sharing and preserving research data, which is resulting in potential partners across sectors and institutions.
• as we identify potential partners, we have begun to change the metaphors that we use to describe the organizational representations of research data infrastructure. we have gone from “data landscapes” to “data ecosystem.”

[figure: the data landscape. labels: access function (individual centric, domain centric, institutional centric); long-term access, short to mid-term access, immediate access; websites, ftp sites, domain web portals, data centres, domain archives, data libraries, staging repositories, institutional repositories; sustainability]

[figure: research data ecosystem]

building of a successful collaboration
• take steps to build trust among partners, which doesn’t always come from the tops of organizations. let those passionate within organizations find solutions, working together with their counterparts across organizations.
• prepare a charter that expresses the norms for working together and the common commitment to the task of research data preservation. a good example is the charter of the canadian polar data network (see http://polardatanetwork.ca/wp-content/uploads/ cpdn_governance.pdf)
• develop a set of policies to serve as a foundation for the shared research data management infrastructure.

[figure: data policy document framework]

• develop blueprints for new research data management infrastructure in teams across institutions and do what is possible now while laying the groundwork for what can be incorporated in the future. the carl canadian national collaborative data infrastructure proposal as a blueprint.
• pool resources to get infrastructure in place.
• the cwap project is an example of this approach. this is a jointly funded initiative between the university of alberta, ubc, and sfu to add research data preservation functionality to archivematica, a tool developed by artefactual inc in vancouver that produces high quality archival information packages.
more recently, ocul/sp has expressed interest in contributing to this development, and the university of saskatchewan has a project to extend functionality between islandora and archivematica.
• integrate and be open to new partners, including a variety of designated user communities.

“by what authority …”
• build trust among the communities being served.
• demonstrate competencies in delivering services.
• develop a positive reputation around trust and competencies.
• operate from a data culture that incorporates norms of best practice and in which rewards are only part of the reason for engaging.

an rdmi agenda for the canadian research data management network
• rdm policy and resource coordination
• develop, promote, interpret, and review rdm policies
• collaboratively raise resources for joint rdm projects
• services
• coordinate service delivery across the data lifecycle: planning, managing, sharing, discovering, repurposing, and preserving research data
• tools and technology
• identify, evaluate, and develop tools supporting dm across the research lifecycle
• identify, evaluate, and develop preservation tools and technology
• expertise
• upgrade and train dm skills for stakeholders across the lifecycle
• develop and advance rdm specializations
• local or globally

bulgarian dialectology as living tradition: a labor of love
quinn dombrowski, division of literatures, cultures, and languages & stanford university libraries, stanford university, palo alto, ca, usa; orcid: - - -
ronelle alexander, department of slavic languages and literatures, uc berkeley, berkeley, ca, usa
vladimir zhobov, department of slavonic philology, sofia university "st.
kliment ohridski", sofia, bulgaria

bulgarian dialectology as living tradition (bdlt) has been one of the longest-running slavic digital humanities projects in the united states. initially conceived in as a series of printed volumes, the digital project was built upon the foundation of a long-term international collaboration dating to the ’s. as bdlt nears completion in , this paper reflects on the trajectory of its development and its sustainability as an unfunded digital humanities project, and the ways it can serve as both a model and cautionary tale for others who seek to undertake similar work.

keywords: digital humanities; digital preservation; content management systems; dialectology; bulgarian language

. introduction
bulgarian dialectology as living tradition (bdlt) has been one of the longest-running slavic digital humanities projects in the united states. initially conceived in as a series of printed volumes, the digital project was built upon the foundation of a long-term international collaboration between ronelle alexander (uc berkeley) and various bulgarian scholars dating to the mid- ’s. a serendipitous conversation in between alexander and quinn dombrowski, a digital humanist with a background in slavic linguistics, and in library and information science, transformed the focus of the project from preparing word documents for eventual publication, to preparing data for entry into a database. soon after that, vladimir zhobov of sofia university became the bulgarian research director of the new digital project. in , zhobov and alexander decided to open the previously password-protected website to the public, even as basic data entry was still in progress, and in , the project officially launched after many years “in beta”. while data entry is not yet finished for all aspects of the project, it is rapidly nearing completion.
at eight years old, bdlt hardly ranks among the most longstanding digital projects with a focus on slavic materials (cf. various national corpora-building efforts; manuscript markup and display environments such as http://manuscripts.ru/; the database of russian birchbark letters at http://gramoty.ru; and dialectological databases, such as those on russian dialects found at http://www.parasol.corpus.org/pushkino and http://www.rureg.hs-bochu.de, a site on bulgarian diaspora dialects at http://www.corpusbdr.info, and a comprehensive site on polish dialects at http://www.dialektologia.uw.edu.pl/index.php? =start, etc.). nonetheless, the institutional and financial circumstances for us-based slavists undertaking digital projects are vastly different from those of their colleagues situated in countries where such projects can be framed and funded as valuable efforts to bolster the national language in a digital environment where english predominates. as more us-based slavists engage with digital tools and methodologies that are transformative in their application to slavic studies, but may not be perceived as sufficiently “innovative” on the technical level to successfully compete for national-level grant funding (e.g. from the neh, acrl, etc.), bdlt can serve as a replicable model for project development that depends much more on time than money.

. overview
bulgarian dialectology as living tradition (bdlt, http://bulgariandialectology.org/) is a searchable database of oral speech representing the full range of bulgarian dialects. it comprises excerpts (henceforth called “texts”), drawn from a large corpus of material recorded in bulgarian villages over the period - .
bdlt is the digital embodiment of a scholarly project with two goals: first, to make both the discipline and the material of bulgarian dialectology available to a broader, international audience; and second, to bring the focus of dialectology back to the natural, spontaneous speech which constitutes the basic data for dialectological research. . background and source material bdlt emerged out of a multi-year collaboration between the american slavist and dialectologist ronelle alexander and several bulgarian dialectologist colleagues. as early as , alexander began to discuss the desirability of joint fieldwork with bulgarian colleagues todor bojadžiev and maksim mladenov; in , two additional bulgarian colleagues, georgi kolev and vladimir zhobov, joined this conversation. such work was not possible during the socialist period, but as soon as the government changes occurred, various members of this group took short field trips: alexander and kolev recorded material in the razlog region (one village) in , and alexander and mladenov recorded material in the ihtiman, panagjurište, and velingrad regions (five villages total) in . these trips were followed by longer expeditions in and , which visited many more locations and gathered the bulk of the material underlying bdlt. these ventures were supported by the international research and exchanges board (irex), and the field expeditions were directed jointly by alexander, kolev and zhobov. when bulgarian dialectology as living tradition arose as a project and publication, the material from these field trips was augmented with similar work done by members of the research team, their colleagues, students and associates, in order to increase the geographic coverage and obtain a more representative set of transcripts from bulgaria as a whole. 
in order to place primary emphasis on natural spontaneous speech, audio clips from the actual field recordings have been made available along with each text, and both they and the transcriptions are presented in as “natural” a frame as possible. the audio files have undergone very little sound editing; only certain loud and distracting noises have been removed. in the transcription, every utterance has been included (including those by bystanders when relevant to the conversation) as well as any non-linguistic sounds when there was even the slightest possibility that they may have influenced the flow of conversation, e.g. by distracting the speaker. in addition, overlapping speech by several informants has been transcribed. such transcription of “natural speech”, therefore, makes the material available for linguistic analysis at several different levels beyond the word itself (the focus of nearly all the maps in dialect atlases). topics which are rarely, if ever, addressed in dialectological research, such as word order, functional sentence perspective, conversational analysis, narrative structures, and intonation, could now be studied on the basis of this material.

. “revitalizing bulgarian dialectology”
one of the conditions of the irex grant supporting the field expedition was the publication of a volume summarizing results of the expedition. this volume, entitled revitalizing bulgarian dialectology, was published in under the editorship of alexander and zhobov, in association with the university of california press, as an open-access pdf manuscript available through the california digital library’s escholarship platform. the goal of the expedition had been to “revitalize” bulgarian dialectology both in bulgaria and the west by means of putting bulgarian and american students together in the field and creating situations where they could learn not only from their teachers but also from each other.
the resulting volume included not only articles by the teachers (alexander, kolev and zhobov) but also research papers by each participant, student and teacher alike, based on dialect material recorded during the expedition. the volume was published in california to underscore the importance of making bulgarian dialectal data available at the international level, and it was published electronically and open-access to maximize availability, especially in eastern europe. (ronelle alexander and vladimir zhobov, eds. revitalizing bulgarian dialectology. university of california press. http://escholarship.org/uc/item/ hc x hp)

. bdlt as audio-based chrestomathy
although revitalizing bulgarian dialectology had made public some outcomes of the most recent expedition, the ultimate goal of the research team was to devise a way to make available the actual field material gathered on this and previous expeditions, and to do so in such a way as to make this material more accessible to outsiders. realizing that it would not be possible to transcribe the entire amount of recorded material (over hours), they decided to choose representative excerpts and create an audio-based chrestomathy; in order to make the chrestomathy more fully representative of the broad scope of variation throughout bulgaria, they also decided to include material from previous trips undertaken by zhobov and kolev prior to their collaboration with alexander. the plan was not only to transcribe each excerpt but also to provide it with interlinear glosses and an english translation; each excerpt would also be accompanied by a streaming audio file, a clip from the actual field recordings. the goal of the resulting publication was to make actual field data maximally available (including in audio form) at the international level.
furthermore, since the excerpts were chosen not only for linguistic value but also for content, the volume would give a representative picture of both linguistic variation and traditional cultural phenomena throughout bulgaria.

. bdlt as digital humanities project
at aatseel , alexander discussed the audio-based chrestomathy with quinn dombrowski. after receiving an ma in slavic linguistics as well as an mlis, dombrowski had found employment as it staff in the academic technologies unit of the university of chicago’s central it organization. dombrowski had experience with developing digital humanities projects across a number of fields, and at the time was on the program staff of project bamboo, a mellon-funded digital humanities cyberinfrastructure initiative. having previously attempted an xml markup project to capture dialectal variation in subsets of the data published in the bŭlgarski dialekten atlas, dombrowski had a personal interest in working with a different kind of bulgarian dialectology material, and making it as accessible and reusable as possible.
(andrew dombrowski and quinn dombrowski. “an xml-based approach to dialectological data: the development of syllabic liquids in bulgarian.” presented at the th balkan and south slavic conference at the university of ohio. http://quinndombrowski.com/blog/ / / /bulgarian-dialect-atlas-at-the- th-balkan-and-south-slavic-conference)
(stojko stojkov et al., ed. - . institut za bǎlgarski ezik. bǎlgarski dialekten atlas i-iv. sofia: izdatelstvo na bălgarska akademia na naukite, - .)

alexander shared early drafts of bdlt with dombrowski in the form of microsoft word documents, where each line of the text was transcribed and translated into english, and each token was annotated with linguistic metadata. dombrowski noted that the high degree of structure in these word files was more reminiscent of a database than a traditional scholarly monograph. moreover, the process required to generate the word files involved significant duplication of work, as each token would need to be glossed and annotated anew every time it occurred. not only was this inefficient, it also increased the risk of inconsistencies. dombrowski felt that the rigidity of a print-oriented pdf end product would also limit its audience. the transcripts touch on a wide range of topics, from folklore and traditions, to agricultural practices, to personal stories depicting daily life in rural bulgaria. these narratives could be valuable in a wide variety of contexts, within and beyond the academy, but the formatting of the word documents -- where the narrative was visually interrupted every few words by a block of linguistically-oriented data -- significantly impeded the narrative’s readability and accessibility. this could be remedied by the production of another set of word documents that presented the narratives as continuous text, but there, too, choices would have to be made about whether to include the original bulgarian (and how: inline, in parallel columns, or separately), and whether to include a transliteration along with the cyrillic (and, again, how).

converting the project’s structure to a database would eliminate these issues. tokens could be entered, glossed, and annotated once, and these token entries would then be referenced in each text where they appear. rather than committing to a single display format, database queries could enable any number of displays, in order to accommodate various audiences’ needs and interests. a linguistics-oriented view could display all the tokens and their metadata (much like the original word files); multiple narrative-oriented views could display the text without interruption, and in any combination of writing systems. a database would allow users to not only view the linguistic metadata on tokens but also to use it as a means of querying the transcripts: e.g.
pulling up all lines that include a lexeme of interest, or all lines that include a particular verb form. a database would also facilitate augmenting the transcripts with additional metadata to support discoverability and analysis -- for instance, individual lines could be tagged with thematic content, and tokens could be grouped into phrases that show noteworthy linguistic features. in short, moving from a print-oriented workflow to a database would vastly increase the research potential of the corpus, in addition to making the content more accessible to the broadest possible international audience. dombrowski offered to create a prototype of bdlt as an online database. she had previously built web-based digital humanities projects using the open-source content management system drupal, and saw it as being well suited to this project as well. at the time, drupal had a large, international developer community creating and maintaining modules (pieces of add-on functionality for the core drupal platform) that could fulfill the project’s technical requirements of storing and querying structured metadata, storing and presenting audio files, and importing and exporting text. this would allow dombrowski to quickly develop a complex web application that would be highly customized to the specific data model of bdlt, without writing any code. (see “drupal and other content management systems” for further discussion of drupal and other content management systems.) drupal was released shortly before dombrowski began to develop the pilot version of bdlt, and the drupal development philosophy supports api-breaking changes between major versions . as a consequence, there is always a delay between the release of a new major version of drupal core, and the point when it becomes usable for complex projects, as module developers need time to refactor their code if they intend to continue supporting their modules. 
for that reason, dombrowski chose to build bdlt in drupal -- a decision that had long-term consequences, even as it was unavoidable at the time.
(quinn dombrowski. “drupal and other content management systems” in doing digital humanities: practice, training and research, ed. c. crompton, r.j. lane, and r. siemens. routledge.)
(dries buytaert. “backwards compatibility”. may , . https://dri.es/backward-compatibility)

it took approximately hours of work to develop the initial prototype of bdlt, which included an interface for entering and editing texts and annotating tokens, a text display equivalent to the original word files, a map display of locations, as well as data structures for linking tokens to lexemes, annotating thematic content, and browsing all tokens, organized by lexeme. for the sake of expediency, dombrowski manually entered the data for a few example texts, but anticipated that the existing word files could be imported into the system without much difficulty. in february , dombrowski demoed the prototype for alexander. after conferring with zhobov, alexander decided to move ahead with implementing bdlt as a database, with dombrowski acting as the project’s technical staff.

. technical implementation
the pilot version of the site that dombrowski developed in remains largely unchanged to this day, though a few additional displays and features have accrued over the course of the site’s development. “digital humanities development without developers: bulgarian dialectology as living tradition” provides a detailed description of the technical underpinnings and data model for bdlt as of , and is inclusive of all of the site’s major features, with the exception of the more recent “phrases” content types and displays.
(ronelle alexander and quinn dombrowski. "digital humanities development without developers: bulgarian dialectology as living tradition". proceedings of dh-case ii (doceng workshop). doi: . / . .)

. structural overview
in brief, there are seven content types (location, contents, text, token, line, lexeme, phrase) and five search functions (wordform, lexeme, linguistic trait, thematic content, phrase). the results from any search query can be exported as a csv or microsoft word file, and the site provides a map display as one output from linguistic trait, wordform, or phrase search queries.
• locations. each village visited is located on a map on the home page and is represented by a page of its own, accessible either by a link from a list on the home page or a tab on the map. each location page gives basic metadata about the village (administrative region, dialect group, date visited), and provides a lengthy prose description of the relevant dialect subgroup. salient traits of the group are illustrated by examples taken from the site itself. links to the text(s) representing this village are also available on this page.
• contents. this page displays basic information about each of the excerpts, or texts, which the site contains: text name, dialect group, duration of audio file, number of lines of text, number of tokens of informant speech, and a brief synopsis of thematic content. data entry status for content not yet completely entered is also noted. data can be sorted on all columns except the audio length, and texts can be accessed from the text link on this page.
• texts. each text has its own page: it contains a sidebar with a small map locating the village, a photo image of the village, and metadata (date of recording, word count, physical context of recording, name(s) of investigator(s), and synopsis of thematic content). each text is broken into lines for ease of data retrieval; each line is numbered, coded to identify the speaker, and provided with a timecode to facilitate location of the transcribed portion within the accompanying audio file.
each text is presented in three different views: glossed view gives a translated text with interlinear tags, comprising grammatical and lexical tags placed underneath each token; line view gives simply the transcribed text with english translation; and cyrillic line view simply gives the text in cyrillic transcription (it is assumed that bulgarian users need neither translation nor interlinear glosses). the audio link is available in all three views, and it follows the text as the user scrolls down the page.
• tokens. each token has its own page, which lists all the tags assigned to that token, and all the lines throughout the database where that token occurs, with each line identified by text name and line number.
• lines. each line has its own page, which lists all its tokens, any thematic content tags assigned to that line, and any identified phrases associated with that line.
• lexemes. each lexeme has its own page, with links to all the tokens associated with that lexeme. note: a “lexeme” is the lemma in standard bulgarian associated with the dialectal token; if no such lemma exists, then one is created and tagged as a “dialectal lexeme”. lexemes are also tagged for etymological and other information.
• phrases. a unique feature of this site design is the ability to isolate grammatically significant groups of words, or phrases. each phrase has its own page, listing all the tags assigned to it, as well as the line of its occurrence and any other lines in which it occurs.
• wordform search. this search page allows users to select any combination of grammatical/pragmatic tags and/or the english translation and/or the bulgarian lexeme, and see all the lines on the site which display the tokens so identified; the geographical distribution of the selected tokens is displayed on a map. each selected token is displayed within the line of its occurrence; users may then follow a link to the text with the token in question to see the larger context and hear the audio.
• lexeme search. this search page allows the user to see all the phonetic representations throughout the site of any one lexical item. users can also isolate words with particular prefixes or suffixes (using “begins with…”, “ends with…” buttons). users can also search for lexemes within categories of special interest (such as dialectal lexeme, loanword source), and for instances of lexical variation (the occurrence of more than one dialectal term for a particular item or action).
• linguistic trait search. this page, by allowing the user to search for any one of a very large number of linguistically significant traits, enabled the linguistic tagging of tokens at a much more complex level than that marked by the interlinear tags which form the basis of the wordform search. here, the user makes hierarchically embedded choices to isolate the trait in question; this allows very complex searches at both the synchronic and diachronic level. each selected token is listed in the context of its line, and the geographical distribution of selections is displayed on a map.
• thematic content search. this search page allows users to find chunks of text (identified by text and line number) where the recorded conversation concerns a particular topic. the search page allows one to locate the desired topic either through a thematically ordered ethnographic list with many subdivisions in each category, or by an alphabetical list of every single tag regardless of its place in the hierarchical listing.
• phrase search. this search page allows the user to find instances of grammatically significant groups of words at a number of levels. this is particularly useful for scholars of bulgarian and balkan linguistics, since many of the traits characterizing the balkan sprachbund must be defined in phrasal terms. because there was no way to mark these traits at the token level, this additional content type was devised specifically for this site.
as in other searches, results give the context of the full line, and display the geographical distribution on a map.

. hosting

hosting is a perennial challenge for web-based digital humanities projects. digital humanities thought leader miriam posner has characterized obtaining server space as “the most hilariously awful problem in doing dh at a university, and almost nobody has got this figured out. i know people who are secretly running servers under their desks, buying their own server space, or running projects off google drive.” universities continue to struggle with questions of what campus organization, if any, should be responsible for providing web hosting for digital projects. many central it organizations, including those at uc berkeley, follow a model of offering inflexible, standardized services in order to reduce support costs when those services are made available to the campus as a whole. as a result, they are a poor fit for digital scholarly projects, which are unlikely to resemble standard templates for departmental websites, faculty profiles, etc. at some institutions, the library has stepped in to fill the need for hosting for scholarly projects, but when web hosting is seen as an indefinite commitment, the ongoing costs of server hardware and -- more significantly -- the staff time necessary for patching and updating software can become a drain on library resources.
miriam posner. “here and there: creating dh community”. september , . http://miriamposner.com/blog/here-and-there-creating-dh-community/ jennifer vinopal & monica mccormick. “supporting digital scholarship in research libraries: scalability and sustainability”. journal of library administration: vol. , , issue . p. - . https://doi.org/ . / . .
as a result, some organizations that take supporting digital scholarship as their mandate have retreated to a position of offering advice on commercial hosting options, with the costs (financial and technical upkeep) to be borne by the scholar . over the course of its development, bdlt has navigated three of the most common hosting scenarios for digital projects. dombrowski built bdlt using a general-purpose shared web hosting account already purchased for use with multiple different projects. within four years, the site needed to be migrated to a different environment after the hosting service threatened to shut it down due to an excess number of tables in the mysql database. the large number of tables that drupal generates as part of creating its content types (data structures) is a frequent criticism of the system , and it became a technical barrier for hosting the site using low-cost, general-purpose hosting. by , when the site was threatened with eviction from its hosting environment, dombrowski had moved to the research it organization at uc berkeley (alexander’s institution), and was overseeing the hosting services offered by that campus’s digital humanities program. dombrowski initially arranged for bdlt to move to the drupal-specific commercial hosting service that had partnered with uc berkeley’s it organization to provide hosting for drupal sites. this move was ultimately short-lived: recognizing that hosting would disappear when the digital humanities program’s funding ran out, dombrowski and alexander sarah kalikman lippincott. “digital scholarship at harvard: current practices, opportunities, and ways forward”. june , . https://projects.iq.harvard.edu/files/dsi/files/harvard_ds_final-report_ _v .pdf “drupal schema – why this methodology?” january , . drupal forums. 
https://www.drupal.org/forum/general/general-discussion/ - - /drupal-schema-why- this-methodology took advantage of the berkeley language center’s offer to move the site to their server, with system-level support from that unit’s sysadmin. under this model, hosting for the site would be guaranteed for at least as long as alexander was an active or emerita faculty member at uc berkeley. . technical staffing technical development of digital humanities projects can quickly become costly, even when reusing existing code as part of a configurable open source content management system such as drupal. professional technical expertise commands a premium. for that reason, self-funded projects such as bdlt particularly benefit from having a core team of personally committed collaborators that includes at least one individual with the technical expertise to implement the project. while restricting the technical scope to what a core member of the team can personally accomplish may limit the project’s scholarly ambitions, the alternative involves waiting for a significant influx of funding that may not be feasible, particularly if the project involves applying established methodologies in a new domain. some projects attempt to overcome this hurdle by hiring professional technical staff to work on the project piecemeal as smaller amounts of funding (e.g. university-internal microgrants, etc.) become available, but this approach becomes more expensive overall as the project pays for the start-up costs of professional developers re-familiarizing themselves with the project at the beginning of each new phase of work. it also risks the project being left half-completed if the scholar is unable to secure further grants. dombrowski has served as the primary technical developer on this site from its inception to the present day. like alexander, dombrowski has never been paid for work on the project, instead contributing out of personal interest and commitment. 
however, particularly because bdlt represents volunteer work, dombrowski’s availability to direct time to the project has fluctuated, and changes in institution, job, job scope, and life circumstances (including the birth of three children over four years) have all had an impact. alexander has used her research funds to pay graduate students with technical knowledge of drupal (including some trained by dombrowski) to implement site configuration changes during times when dombrowski has been unavailable. however, those graduate students themselves have taken on this work as one among many conflicting priorities, including finishing their dissertations, leading to periods where they have fallen incommunicado for weeks or months at a time. in august , alexander took a weeklong workshop on drupal offered by dombrowski at the digital humanities at berkeley summer institute, with the goal of developing sufficient technical proficiency to serve as her own technical backstop for the project, and reduce the turnaround time needed to make minor configuration changes on the website. . migrations and code changes one disadvantage of building a project using a content management system is that it closely ties the project’s lifecycle to the support lifecycle for that version of the content management system. a major version upgrade is a non-trivial undertaking on any such platform, but drupal’s api-breaking design philosophy further exacerbates these challenges. building bdlt in early necessitated the use of drupal , but this choice guaranteed that the site would have to be migrated to a new version of drupal within the medium term, when the drupal open source project stopped providing security updates for that version. in alexander and dombrowski , the authors anticipated a migration directly from drupal to drupal , with the expectation that version -- which had not yet been given a release date -- would provide more robust technical underpinnings for the site in the long term. 
instead, the drupal project’s decision to jettison much of drupal’s own architecture and replace it with the enterprise php framework symfony had the effect of alienating many smaller-scale developers, including those who typically work on digital humanities projects. the resulting lag in module availability has been tremendous, and many modules with significant adoption in digital projects across a wide variety of disciplines (e.g. biblio, which provides a data structure for importing, exporting, storing, and displaying bibliographic references) have still seen no significant movement towards a drupal port as of . by summer the release of drupal was imminent, and drupal would only be given a three-month grace period after its release before security updates were no longer provided. discussions in the developer forums suggested that many general-purpose modules would not be available concurrently with drupal ’s release, to say nothing of scholarly-oriented modules. in light of this, dombrowski advised cammeron girvin, a graduate student working with alexander, on a site upgrade to drupal . while girvin had served as a project manager for bdlt for some years, the upgrade was his first experience interfacing directly with the technical underpinnings of the site (i.e. the filesystem and mysql database). the upgrade was difficult, requiring multiple attempts and a downtime spanning the entire summer before the site was again available online; furthermore, it took nearly an additional six months to resolve all the bugs related to the upgrade. by , drupal had seen significant uptake among digital humanists, and all of the widely used drupal modules were available for drupal at that point, or dries buytaert. “why the big architectural changes in drupal ?” september , . https://dri.es/why-the-big-architectural-changes-in-drupal- bibliography module – issues – drupal port. march , . drupal module issue queue. 
https://www.drupal.org/project/biblio/issues/ replaced with improved alternatives. unfortunately, early in the development of bdlt, dombrowski had selected a niche module called editview as the primary interface for data entry. that module had been abandoned by its developer after drupal , despite user requests for a drupal version starting in . rather than completely reconceptualizing data entry for the site, alexander contracted with agile humanities agency (http://agilehumanities.ca/), a digital humanities-oriented development firm created by former english professor dean irvine, to write a drupal version of editview. this piece of technical development work represented a significant financial investment for bdlt, but it also has served as a locus of broader impact for the project within the digital humanities community. the drupal editview module has subsequently been adopted by other projects with similar tabular data entry needs, including the george washington financial papers project (http://financial.gwpapers.org/). . data entry and labor for bdlt and similar projects, the amount of time dedicated to developing the technical infrastructure is dwarfed by the enormity of the task of data entry. dombrowski’s expectation that data could be parsed from the word files and imported into drupal to seed the database was quickly shown to be overly optimistic. despite consulting with developer colleagues at the university of chicago who offered elaborate examples of using regular expression syntax to capture some of the words and linguistic annotation, it was ultimately too error-prone to use and all the lines, tokens, and annotations had to be manually entered into drupal. editview module – issues – d port of editview. april , . drupal module issue queue. 
https://www.drupal.org/project/editview/issues/ in some respects, entering data into drupal was not dissimilar from work alexander and zhobov already anticipated undertaking in microsoft word as part of their audio chrestomathy. it may have been easier, as there was no need to fuss over table spacing and formatting in word for the linguistic annotations. the work was, nonetheless, slow, and became slower as the database grew. the growing number of annotated tokens led to an increasing lag in the site’s autocomplete functionality, which was necessary to ensure that new texts were able to reference existing tokens, rather than creating new database entries. the possibility of additional metadata beyond what the word documents would have supported represented another data entry task, and the ways that the database throttled the speed of data entry (e.g. through waiting for the autocomplete) represented a significant increase in the overall time needed to put the material in its final format. the challenge of data entry at scale is endemic to digital humanities projects. the need for large-scale, low-cost data entry has led projects to adopt practices vis-à-vis undergrad labor that have drawn critique from others in the field. a survey described in “student labour and training in digital humanities” shows that the vast majority of digital humanities projects are funded by federal and/or institution-internal grants, which are necessary to offset the many costs of developing these projects, not least among them the cost of paying student workers. only three of the projects surveyed indicated that they received no funding. spencer keralis. “disrupting labor in digital humanities; or, the classroom is not your crowd”. in disrupting the digital humanities. dorothy kim and jesse stommel, eds. , punctum books. katrina anderson, lindsey bannister, janey dodd, deanna fong, michelle levy, and lindsey seatter. “student labour and training in digital humanities”. 
digital humanities quarterly, , vol. , no. . http://www.digitalhumanities.org/dhq/vol/ / / / .html the first point of the student collaborators’ bill of rights states that “as a general principle, a student must be paid for his or her time if he or she is not empowered to make critical decisions about the intellectual design of a project or a portion of a project (and credited accordingly). students should not perform mechanical labor, such as data-entry or scanning, without pay.” for bdlt, the lack of project funding combined with the city of berkeley’s steep increases in minimum wage over the course of the project ($ /hour as of , up from $ /hour in ) made hiring undergraduates for data entry unfeasible. instead, alexander has worked with cohorts of students through a longstanding uc berkeley program, urap, which connects undergraduates with faculty members doing research. while it may not be the ideal solution, alexander has devoted significant thought and energy towards collaborating with those students in ways that align with the collaborators’ bill of rights. . urap program since , uc berkeley has offered the undergraduate research apprenticeship program (urap) as an institutional framework “to assist faculty in reconciling their commitments to research with their responsibilities for undergraduate education. by promoting faculty-student research collaboration, urap works to invigorate undergraduate education and to contribute to the sense of intellectual community on campus.” faculty who wish to participate in urap submit a project description to an online portal, and students can submit a statement of interest to up to three different haley di pressi, stephanie gorman, miriam posner, raphael sasayama, and tori schmitt. “a student collaborator’s bill of rights”. june , . ucla humtech. https://humtech.ucla.edu/news/a-student-collaborators-bill-of-rights/ “what is the undergraduate research apprenticeship program?” urap website. 
http://urap.berkeley.edu/program-intro projects. faculty members interview the students and can select any number of them to collaborate on the project. the bdlt project was ideally suited for this framework. the project description alexander submitted to the portal outlined the nature of the project, stressing both its linguistic and ethnographic aspects, stated that knowledge of basic linguistic structure was highly desirable but not required, and that knowledge of bulgarian, or indeed any slavic language, was not necessary. during interviews with interested students, alexander gave students an overview of the site and explained how data entry was done. both the instructor, and the students who decided to choose this project, then completed a “learning contract”: the instructor committed to providing a research experience for the students and the students committed to a minimum number of work hours per week throughout the semester, for which they could receive course credit (one credit per three hours of work a week, up to four credits per semester). the project was first listed with urap in january , and has been listed every semester since then (except for fall when alexander was doing research abroad the entire semester). the largest number to join the project in any one semester was eight, and the smallest was two. because of the enormous amount of data to be entered, there was never a lack of work for students. students did data entry on their own time, keeping track of their work hours, and then participated in regular group meetings for discussion of research goals. 
the student collaborators’ bill of rights states that “course credit is generally not sufficient ‘payment’ for students’ time, since courses are designed to provide students with learning experiences.” urap is one of a few programs at uc berkeley that provides course credit for non-traditional work; another, decal, grants course credit for student-run courses on topics ranging from “decode silicon valley startup success” and “sign language in healthcare” to “cal pokémon academy” and a master course in the board game “settlers of catan” . providing course credit specifically in exchange for work on faculty research makes the nature of the exchange clear to the student upfront, in contrast to traditional departmental courses that incorporate student labor as a class assignment. data entry, for all its tedium, is a very authentic research experience, and one that the project directors also engaged with as part of data preparation. in order to make the project available to as many students as possible, regardless of their knowledge of bulgarian, the project directors provided the data to the students in plain text files with all the tags for coding -- not unlike the original word files of the audio chrestomathy. in order to make the project as meaningful as possible, alexander met twice monthly with student apprentices as a group. in addition to discussing any problems with data entry, these meetings were an opportunity for students to learn more about the history and development of the project, and its importance for bulgarian dialectology as a research field. since one of the goals of the very minimal bdlt research budget has been to bring the bulgarian project director (zhobov) to berkeley once a year, some of the student apprentices have been able to meet with him as well and to learn first-hand about the bulgarian aspects of the collaborative project. 
this aligns with the student collaborators’ bill of rights principle that “at a minimum, internships for course credit should be offered as learning experiences, with a high level of mentorship.” students have also been able to participate in project development: their input has been sought on certain aspects of project design, and there was more than one occasion when a student volunteered an idea that led to a particular breakthrough. whether such contributions amount to students’ being “empowered to make critical decal courses. spring . https://decal.berkeley.edu/courses decisions about the intellectual design of a project or a portion of a project” is arguable, but the project directors’ willingness and enthusiasm for reworking the site in response to solicited and unsolicited student input has given these students more agency in the project than simply doing data entry. . impact on students to date, thirty-one undergraduate students have worked on the project through urap. their importance to the project is inestimable, a fact of which they are reminded at the celebratory dinner at the end of each semester. in more lasting terms, their contributions are acknowledged on the site’s project team page (http://bulgariandialectology.org/project-team), which gives a small list of the names of “active apprentices” and an ever-growing list of the names of “alumni apprentices”. this is in alignment with the student collaborators’ bill of rights point # , which states that “if students have made substantive (i.e., non-mechanical) contributions to the project, their names should appear on the project as collaborators”. many of the apprentices keep in touch after graduation. two have gone on to graduate work in linguistics, listing participation in this project as a major deciding factor in their career choices. 
of the graduate students who have worked on the project, two were specialists in bulgarian, and project directors created shorter field expeditions in bulgaria with them in mind, so that each was able to get first-hand experience of the process through which field data are acquired. in addition, the graduate student who was most involved in project design was able to cite his work on this project as an important qualification for his current alternative-academic career path. the students who have been least satisfied working on the project are those in their final year as computer science majors. it is understandable that they wish to be doing cutting-edge technical work, rather than staying at the level of using (and not even modifying the code for) a php/mysql content management system. their dismay and frustration, while somewhat disorienting for other students, has been instructive as a concrete illustration of the ways in which those in the humanities do research with very limited financial resources. . international collaboration the bdlt project has been international from the outset, since it grew out of collaborative work between one american scholar and a group of bulgarian scholars, and has been maintained over the last decade through collaboration between the two project directors, one american (alexander) and one bulgarian (zhobov). the two are in constant electronic contact, consulting over issues of data preparation, data entry, and site design issues (especially with the most recent design additions, concerning “phrases”). they visit each other’s universities frequently for purposes of on-site collaboration; zhobov’s visits to berkeley are especially useful for student apprentices to learn more about international aspects of the project. they have also presented joint research papers about the project at various venues in europe and russia. most data entry takes place in berkeley, because of faster computer speed and more modern equipment. 
some types of data entry need zhobov’s specialized knowledge, however, and must be done in bulgaria, despite the fact that it takes place more slowly as a result of network delays, and computers with less memory and older browsers that often perform poorly with the site’s ajax-based data entry interface. . research outcomes scholars worldwide have become aware of the rich data resource which bdlt provides, and several research projects are currently utilizing data from the bdlt site. in particular, both zhobov and alexander have recently produced major research papers drawing on material in the bdlt site. although zhobov’s work on dialectal vocalism could have been prepared directly from the original field tapes, the choice to focus his analyses on texts from the site, and to cite only examples which could then be consulted directly on the site, increases the value of his work to other researchers. alexander’s work, by contrast, on accentuation in certain word groups, derives directly from the data organization in the “phrases” section of the site. indeed, it is anticipated that this part of the site will be especially valuable to balkan linguists once the full set of data is entered: they will be able to access dialect data about word order sequences, pronoun reduplication, instances of “evidential” usage, and similar topics. before the availability of the bdlt site, scholars could only collate data on these topics by laboriously combing through whatever dialectal “texts” had been included as supplementary material to published dialect descriptions: now they will be able to easily search for such material due to the site’s search interface. from a certain angle, it is difficult to argue that the project itself is research. 
while the investigation of any one research question would necessitate a focused subset of the data preparation involved in creating bdlt, those research questions require scholars to limit the scope of their curation and annotation, and move on to analysis and write-up. in the name of developing a resource of value beyond any one inquiry, or even any one person’s research agenda, the project directors have spent the majority of the last eight years focusing on data curation and annotation, which has unavoidably come at a cost with regard to their scholarly output vis-à-vis the kind of dialectological work that is the focus of their pre- scholarship -- work that can now resume fully as bdlt nears completion (nevertheless, the completion of the two major research ventures noted above proceeded alongside work on bdlt).
zhobov, vladimir. new approaches to bulgarian dialectal vocalism; alexander, ronelle. bulgarian dialectal accent, a new approach. both articles are slated to be published in the forthcoming monograph: alexander & zhobov, bulgarian dialects, living speech in the digital age.
for scholars who are principally interested in disciplinary research, but who are drawn to the promise of what a resource equivalent to bdlt in their field could provide, the reality is that they will get substantially more research done if they engage in the traditional research practices of focusing their data collection and curation to those materials that contribute directly to a specific inquiry. building a site such as bdlt is an act of hope, and of generosity for the future scholars who will receive all of the benefit without sacrificing years of their professional lives to data preparation. it is not an undertaking for early-career scholars who can ill afford the impact on their publication rate. 
at the same time, late-career scholars who are giving thought to the nature and impact of their legacy may find that a project like bdlt, which generates richly annotated data that can jumpstart research for subsequent generations of scholars, is a meaningful gift to the future that reaches far beyond any new monograph. . sustainability for the project’s potential to be realized, the materials need to remain available. while the primary value is in the texts and linguistic annotations, the interface has been specifically designed to facilitate access to the data, reducing the effort necessary to extract the subsets of the corpus relevant for particular research questions. the recent change in licensing terms for the google map api, which broke the map-based navigation that the site had used since its inception and necessitated a complete rebuild of the site’s geospatial functionality, was a stark reminder of how bdlt is vulnerable to decisions made by large corporations whose interests and priorities diverge from those of the project team. collaborating with students has added an ethical dimension to the question of sustainability; per the student collaborators’ bill of rights: “senior scholars should recognize that projects on which students have collaborated represent important components of students’ scholarly portfolios. senior scholars should thus make every reasonable effort to either sustain a “live” project or, failing this, either transfer its ownership to student collaborators or distribute to students an archived version or snapshot of the project.” with the data entry phase of bdlt winding down, and the full picture of the contents of the project becoming clear, the project team is taking a multi-pronged approach to sustainability. . 
website the drupal project has announced end-of-life for drupal in november ; ironically, this will also be the end-of-life for drupal , aligning with the end-of-life for the version of the symfony php framework that replaced drupal’s previous technical underpinnings. moving directly from drupal to drupal , as bdlt initially planned, would not have bought the project any additional time between migrations. the digital humanities community’s response to drupal has not been enthusiastic. in addition to the increased difficulty for coders whose skill set does not align with “enterprise php development” to build drupal modules and themes, the server requirements for drupal to perform adequately outstrip what is typically available in shared low-cost commercial hosting environments. as a result, many digital humanities projects built in drupal will face a decision point in the next few years, and drupal is not an obvious choice. one compelling alternative that has emerged in the wake of drupal is backdrop cms (https://backdropcms.org/), a fork of drupal aimed at nonprofits and small businesses that prioritizes stability and a positive user dries buytaert. “drupal , and ”. september , . https://dri.es/drupal- - -and- experience for non-programmers who build such sites, over technical advancements in the core apis. the skill set of backdrop’s target audience, and the project’s overall priorities, align well with the needs of digital humanities projects. backdrop has incorporated many drupal modules that previously had to be maintained and updated separately into its core code, and backdrop core includes an option to enable automatic updates in order to reduce ongoing support costs and minimize the risk of the site being hacked as a result of delayed installation of security updates. 
while additional work is needed on backdrop ports of some of bdlt’s modules, dombrowski has been working with the backdrop developer community to ensure this functionality is in place in time to migrate bdlt to backdrop before drupal ’s end-of-life. the berkeley language center sees it within their purview to provide access to bdlt indefinitely through their server and sysadmin, but the scope and priorities of organizations such as a language center are subject to change, particularly in a context of public disinvestment in higher education. dombrowski and alexander are also working with digital preservation specialists at uc berkeley to capture and preserve the site with web recording software once data entry is complete. this is not a substitute for the full functionality of a live database that can support any combination of queries, but it will generate a moderately interactive surrogate of the site that can be used in perpetuity for some kinds of information retrieval needs, even if the site itself ceases to exist online. . data the bdlt website includes functionality for exporting any search query as a csv. with an eye towards capturing the full extent of the data for potential use in computational research, and/or in other interfaces if the website is no longer available, dombrowski has generated a set of csv files that include all fields from all content types, and will update these files with new versions once data entry is complete. drupal automatically generates a unique id for each node (instantiation of a content type), and stores references between nodes (e.g. a pointer from a token to a lexeme, or a line to a token) using that unique id. including this id in all exports will make it possible to reconstruct the network of relationships between the various content types in future data analyses and interfaces.
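as a rough illustration of how the exported node ids could be used to rebuild relationships between content types, the following python sketch joins a token export to a lexeme export. the column names (`nid`, `form`, `lexeme_nid`, `lemma`) and the placeholder rows are assumptions made for this example; they are not the actual bdlt field names or data.

```python
import csv
import io

# Hypothetical miniature exports. Column names and values are placeholders
# invented for illustration, not the real bdlt export schema.
TOKENS_CSV = """nid,form,lexeme_nid
101,form_a,501
102,form_b,501
103,form_c,502
104,form_d,999
"""

LEXEMES_CSV = """nid,lemma
501,lemma_x
502,lemma_y
"""


def read_rows(text):
    """Parse csv text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def index_by_id(rows, id_field="nid"):
    """Build a lookup table from node id to row."""
    return {row[id_field]: row for row in rows}


def join_tokens_to_lexemes(tokens, lexemes):
    """Resolve each token's lexeme reference through the shared node id."""
    lexeme_index = index_by_id(lexemes)
    joined = []
    for token in tokens:
        lexeme = lexeme_index.get(token["lexeme_nid"])
        joined.append({
            "token_nid": token["nid"],
            "form": token["form"],
            # None marks a dangling reference instead of hiding it.
            "lemma": lexeme["lemma"] if lexeme else None,
        })
    return joined


tokens = read_rows(TOKENS_CSV)
lexemes = read_rows(LEXEMES_CSV)
joined = join_tokens_to_lexemes(tokens, lexemes)
```

keeping unresolved references as `None` rather than dropping them is a design choice in this sketch: it makes gaps in an export visible to whoever reuses the data later.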
while uc berkeley does not have a track record of providing an institutional repository for data, the project team anticipates depositing the data sets in such an environment if it is established. the tromsø repository of languages and linguistics (https://site.uit.no/trolling), which is affiliated with the clarin european research infrastructure, is appealing as a disciplinary repository. in addition to formally accessioning the data to these data repositories, dombrowski has followed the common digital humanities practice of putting the csv files, along with some example analysis code, on the code repository platform github (https://github.com/quinnanya/bdlt-data). . print the “texts” on the site are valuable pieces of data, even in the absence of the metadata that enables the various search options. to make sure that these texts are preserved at multiple levels, print copies will be produced of all three versions of each text: that with the grammatical and lexical glosses, that in latin transcription with english translation, and that in cyrillic transcription. it is particularly important to have both of the latter, since bulgarian dialectology uses a different set of transcription symbols than those currently used in the west. . conclusion over the course of eight years, the bulgarian dialectology as living tradition project has navigated the full digital humanities project lifecycle, from idea to archiving, without support from external funding. while others may disagree with any of the decisions made during the course of the project’s implementation, this paper has served to explicate the motivations, context, and constraints that informed those decisions, to serve as a point of discussion for the development of future digital humanities projects within the broad field of slavic studies, and beyond. 
as bdlt transitions from a project to a scholarly resource, the directors hope that this undertaking can lay the foundation for the emergence of a richer understanding of bulgarian language and culture, and that scholars working in other areas may be inspired to undertake similar endeavors to make materials available digitally -- but only at the right time and place in their careers where such work becomes feasible. . acknowledgements the authors would like to express their deep appreciation for all their collaborators on this project as of april : senior associate research team member georgi kolev; associate research team members roslyn burns, cammeron girvin, snejana iovtcheva, kea johnston, eric prendergast, vesela simeonova, traci speed, and john sylak; apprentice research team members jessica adams, richa bhandal, zuhra bholat, gabrielle bozmarova, nina chang, jessica chapman, lana cosic, katie crowe, stephanie deleon, naomi francisco, austin frenes, dimiana georgieva, emmanuella hristova, siyana hristova, andrew kuznetsov, kathleen lamont, mckayla major, kelsey mota, grace newsom, jerry nikolaev, nadia nizetich, siyao [logan] peng, stella petkova, charles rosencrans, elizabeth sawyer, john sockolov, jeffrey stock, aleksandrina stoyanova, vanessa taylor, milena tintcheva, and emma wilcox; fieldwork core team members georgi kolev and the late maksim mladenov; fieldwork contributors elena uzeneva and georgi mitrinov; fieldwork assistants krasimir mirchev, radko shopov, and ivan vankov; fieldwork student apprentices cammeron girvin, marieta nikolova, traci speed lindsey, matthew baerman, jonathan barnes, tanya delcheva, elisabeth elliott, kamen petrov, and petŭr shishkov. for a complete and current list of project collaborators, see http://bulgariandialectology.org/project- team. this paper is dedicated to the late maksim mladenov, who has been the project’s guiding light and guardian angel. 
zotero: a free and open-source reference manager
correspondence to: julie courraud, institut régional du cancer de montpellier (icm) val d’aurelle, rue des apothicaires, montpellier cedex , france julie.courraud@icm.unicancer.fr
julie courraud, clinical research department, institut régional du cancer de montpellier (icm), val d’aurelle, montpellier, france
abstract
zotero is a free, open-source reference management program compatible with linux®, mac®, and windows® operating systems. libraries are backed up online allowing sharing between computers and even multiple users. zotero makes it easy to keep your reference library organised and ‘clean’. reference libraries are compatible with other reference management programs, and difficulties can be quickly addressed via online forums. for these reasons, zotero can be a valuable resource to medical writers.
keywords: reference manager, bibliography, medical writing, open source, freeware
you may remember the time when inserting references in a text was one of the most time-consuming and arbitrary tasks. with time and projects flying by, your bibliography may have become a jungle where finding a specific article began to resemble cave exploration. fortunately, reference management programs have been developed to make writers’ lives easier. several programs are available, including mendeley® (acquired in by elsevier), endnote® (thomson reuters), and biblioscape (cg information). in contrast to previous articles comparing several reference managers, this article focuses on zotero, a free and open-source program originally developed by the center for history and new media at george mason university and the corporation for digital scholarship in the usa.
what you can do with zotero
collecting and organising references
zotero is useful at all stages of citing sources, from conducting bibliographic searches to writing documents. when you read citations on a computer, tablet, or mobile device, zotero automatically collects ‘metadata’ (details such as authors, title, and date) that are stored in your electronic library. metadata can also be recovered from the pdfs of the published articles (at least dating from around ) or by using article identification numbers. libraries can be organised in collections as you would organise file folders on your computer (figure ), and files, notes, or links, even full texts, can be associated with each reference. pdf files are automatically downloaded when available (open access articles, for instance) and zotero can rename them according to first author, year, and title. zotero includes a search bar that helps find references within your library. this search function even includes the text within pdf files. zotero also has a ‘locate’ button that helps find items online. libraries from other reference managers may be imported into zotero.
inserting citations into a document
zotero is compatible with microsoft word®, libreoffice, openoffice, and neooffice. when writing a document, you may easily insert citations by clicking on zotero buttons. your reference list is also automatically created and you can switch citation format as often as needed. a style repository containing more than styles is available online; if the style you need is not on the list, you may find a similar style using the ‘search by example’ tool. in-line citations and bibliographies may be personalised (e.g. remove author or add page numbers), and journal titles may be automatically abbreviated when needed. some styles also include a translator that adapts terms in cited references (e.g. ‘available at’, ‘accessed’, etc.) to a specific language.
saving, synchronising, and sharing libraries
zotero works much like google docs® or dropbox®. when you create a personal account on zotero, you receive mb of free online storage, although this can be increased for a small fee.
zotero synchronises your computer with your online account so that all of your references are backed up on zotero servers. you can even synchronise several
© the european medical writers association doi: . / z. medical writing vol. no.
the impact of copyright permissions culture on the us visual arts community: the consequences of fear of fair use
@article{aufderheide theio, title={the impact of copyright permissions culture on the us visual arts community: the consequences of fear of fair use}, author={p. aufderheide and t. milosevic and bryan bello}, journal={new media & society}, year={ }, volume={ }, pages={ - } }
p. aufderheide, t. milosevic, bryan bello. published in new media & society (psychology, computer science).
as digital opportunities emerge in the visual arts (to produce multimedia art and digital scholarship, publish online, and hold online museum exhibitions), old copyright frustrations have worsened in a field where getting permissions is routine. a national survey of visual arts professionals, combined with in-depth interviews of visual arts practitioners throughout the united states, explored how visual arts professionals use the us copyright doctrine of fair use.
results showed…
symposium
communications principles for inviting inquiry and exploration through science and data visualization
eric rodenbeck
stamen design, mission street no. , san francisco, ca , usa
from the symposium “science through narrative: engaging broad audiences” presented at the annual meeting of the society for integrative and comparative biology, january – , at san francisco, california.
e-mail: erode@stamen.com
synopsis
science, in the popular imagination, is about finding answers to questions. scientists make discoveries, develop theories, and deliver those discoveries and theories to audiences with an interest in the truth as backed up by science. well-designed data visualization (dataviz), by contrast, can generate and address not only new questions but new kinds of questions. it allows its viewers, users, and makers to generate new inquiries, and puts them in a better place to answer them. dataviz offers esthetic and interactive platforms for discussion and inquiry that can help scientists to both do their work and better communicate their work to broader audiences. here i will illustrate and examine case studies from multiple points along the rich and varied possibility space that opens up when science and dataviz work together. i will also introduce three communication principles that i have learned from my involvement with hundreds of dataviz projects over the years. well-designed dataviz can help scientists and those involved with science find ways to navigate the multiple competing interests and priorities inherent in both communication to non-scientists and exploratory data-rich interfaces.
introduction
the focus of dataviz can be understood to exist along a spectrum of abstraction, from facts at the most concrete end, to wisdom, knowledge, and even vision as the most aspirational place for dataviz to work (fig. ).
each of these kinds of work requires a different approach, and each uses a different kind of raw material and has unique characteristics and outputs. much of what is widely considered dataviz by practitioners from edward tufte to cole nussbaumer knaflic focuses exclusively on the data row in fig. of this paper. through our client-facing and research practice at stamen, we engage in multiple kinds of dataviz approaches across this spectrum. much of this work is done for and with scientists across a broad range of fields, from metagenomics to the study of human emotions. thinking of dataviz as more than the communication of facts in the clearest way possible to everyone who looks at it is crucial to unlocking the full communication potential of the medium. there is more to dataviz than communicating simply. one example of this is the visualization of complex metagenomic data that the scientists at the banfield laboratory at the university of california use to analyze new landscapes of genetic diversity. their work is difficult to explain to lay audiences, due to its complexity. there is a significant gap between how most people think about metagenomic sequencing and what the science involves. banfield scientists use data visualizations that stamen built for them for “hypothesis generation and experimentation” (stamen design c). this is data visualization for highly skilled and experienced scientists. it requires their personalized experience, knowledge, and judgment in aiding their work, and belongs more properly in the wisdom row of fig. than in the data row. the interfaces are very difficult for lay people to understand, but serve a crucial role in helping the scientists to look “at all the recovered organisms and all metabolic reactions of these organisms simultaneously” (stamen design c). meeting your audience where they are, whether at a very low or very high degree of scientific literacy or knowledge abstraction, is key to designing interfaces and visualizations that will help achieve your communications goals. what works in one row will likely not work as well in another.
advance access publication august , © the author(s) . published by oxford university press on behalf of the society for integrative and comparative biology. this is an open access article distributed under the terms of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. integrative and comparative biology, volume , number , pp. – doi: . /icb/icy
the intersection of science, data, and the internet
we are living through an astonishing transformation in the amount and availability of data to people everywhere, both inside and outside of academic institutions. this transition is changing the way that science is done and communicated. as an example, consider the paper “magnetic alignment in grazing and resting cattle and deer.” researchers analyzed the position of thousands of grazing animals found on google earth and found that deer and cattle tend to align their bodies in a north–south direction: “amazingly, this ubiquitous phenomenon does not seem to have been noticed by herdsmen, ranchers, or hunters” (begall et al. ). humans have been looking at deer and cattle for a long time, and yet apparently never noticed the direction in which they tend to stand. the emergence of fast, free, and easy access to an accurate and often-updated library of satellite imagery enables observers to ask different kinds of questions than have been previously addressable.
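to make the idea of a shared body alignment concrete, here is a minimal python sketch of one standard way to summarise axial data, where a heading of 0° and a heading of 180° count as the same north–south axis. it doubles each angle, averages the resulting unit vectors, and halves the mean angle. this is a generic circular-statistics technique shown under stated assumptions; it is not presented as the method used in the cited papers, and the function name is hypothetical.

```python
import math


def axial_mean(angles_deg):
    """Return (mean_axis_deg, resultant_length) for axial orientations.

    Axial data treats theta and theta + 180 as the same orientation, so
    each angle is doubled before averaging and the mean is halved after.
    The resultant length runs from 0 (no shared axis) to 1 (identical axes).
    """
    if not angles_deg:
        raise ValueError("need at least one angle")
    doubled = [math.radians(2.0 * a) for a in angles_deg]
    c = sum(math.cos(t) for t in doubled) / len(doubled)
    s = sum(math.sin(t) for t in doubled) / len(doubled)
    resultant = math.hypot(c, s)
    mean_axis = (math.degrees(math.atan2(s, c)) / 2.0) % 180.0
    return mean_axis, resultant


# Invented sample headings in degrees clockwise from north, for illustration.
clustered = [0, 180, 10, 350, 170, 190]
scattered = [0, 45, 90, 135]

clustered_axis, clustered_r = axial_mean(clustered)
scattered_axis, scattered_r = axial_mean(scattered)
```

a resultant length near 1 for the clustered sample and near 0 for the scattered one is the kind of contrast that a large, freely available image library makes it possible to test.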
while scientists don’t yet understand the proximate mechanisms behind this behavior, a statistically sufficient sample dataset now exists for other researchers to build on this work. a paper, also using google earth imagery as the source material, found that “mutual distances between individual animals within herds (herd density) affect their n–s preference” (slaby et al. ). in both these projects, the amount of free and easily available data was the key factor in allowing these insights. there are many other areas where the amount of data available has grown dramatically in recent years, and the field of data visualization is emerging as a set of practices around doing, communicating about, and otherwise dealing with this rapidly changing landscape and possibility space. this paper presents three communication principles and projects drawn from stamen’s client services practice visualizing scientific data. these principles can be useful for scientists who want to engage the broader public with science through data visualization.
fig. matrix of data visualization abstraction (stamen design ).
principle : public conversations about science are never just about the truth. it’s wise to plan for this, and not shrink from it
big glass microphone (fig. ) is a commission stamen received from the victoria and albert museum in london (stamen design ). relying on the work of biondo bondi and eileen martin at stanford, the stanford exploration project and the school of earth, energy & environmental sciences, the project uses data from a km long fiber optic cable buried about a meter under the ground at stanford university. light shines through the cable, which responds to vibrations in the environment by changing its shape very slightly. the stanford team has shown that it’s possible to convert the vibrations of the perturbed optical fiber strands into information about the direction and magnitude of seismic events.
we received a -min long sample of data from the stanford team. we visualized the data in an interactive, dynamic interface overlaid on a map of the portion of the stanford campus under which the cable is buried. the data are split into different frequency bands, ranging from . to hz. each of these bands can be turned on and off, and perturbations in the light waves are clearly visible as bright spots along the length of the cable. for example, gasoline-powered cars driving on the road above the cable are visible across a range of the frequency bands. electric cars, which are mostly silent to the naked ear, leave a noticeably different signature. other perturbation sources, like bicycles, pedestrians, seismic activity, construction equipment, and air conditioners, have different profiles, moving at different speeds than cars. some perturbations don’t move at all in space, but show significant variability of vibration across multiple frequency bands. by comparing the location of this pattern to the geographical map, we determined that one of these stationary but highly varied objects was a fountain, burbling outside the earth sciences building. if a fiber optic cable buried under the ground can be used to detect fountains and cars, what else can be used as a sensor? what is equivalent to the google earth example above, when this kind of information is as ubiquitous as satellite imagery? what kinds of problems can we use these sensing apparatuses to investigate? what kinds of new questions could they enable? big glass microphone is a dataviz provocation, done under the guise of art and design, designed to provoke more questions than it answers. it received significant press attention, some of which presented the science behind the project in ways that were not quite accurate and which certainly would not withstand peer review.
in particular, some of the articles suggested that the cable could pick up and distinguish human speech, which the researchers have emphasized is not the case. some of these articles introduced the idea that scientific instrumentation can be used in ways that it was not originally intended for, and that this can result in some interesting new kinds of observations: “the fiber optic loop under stanford’s campus was originally installed last august for seismic research, but the stanford team, with the help of optasense, decided to turn the noise into signals” (saplakoglu ). the project also opened up the idea for people that this data could be everywhere, and “at a larger scale, imagine how valuable this type of data would be for an engineer corralling traffic, or a mobility service sniffing out customers” (bliss ).
fig. big glass mic. stamen design. commissioned by the victoria and albert museum ( ). a provocation that uses material from the “data” row in fig. as a source for a project in the wisdom row, in order to evoke a sense of wonder and mythic imagination applied to unseen infrastructure.
from stamen’s perspective, this is a part of the process of engaging in public conversation: not everyone gets all the facts right, some articles are outright wrong, and information can be taken out of context. articles written in the lay press have a significantly lower threshold for veracity than those accepted in peer-reviewed journals, by design. communication with these outlets requires different strategies than those commonly deployed by the scientific press, and different strategies are needed to have conversations with those outside the academy.
a key part of this strategy is acknowledging that a natural part of a big public conversation is that not everyone who writes about a project will get every detail exactly correct. it’s not important for every journalist to understand every factual argument that a scientist makes in order to start a useful conversation about science. “you can’t turn a no to a yes without a maybe in between,” frank underwood says in house of cards (foley ). one of the main things i hope to help scientists understand is that they can learn from frank underwood when it comes to communicating about their work. the press will always get something “wrong,” from a scientific perspective. the magnetic alignment projects at the beginning of this article have received widespread press coverage. the paper occasioned an article (bates ) about both of the projects in wired magazine titled “cow compass points the way north.” it’s a goofy, catchy title about a complex topic involving real science and sophisticated research, which is exactly the point. it’s not entirely accurate. but it’s in wired. the public is engaged with the work. from a communications perspective, that’s more important than whether they get every single detail about the project correct the first time. big glass microphone directly engages with data from a fiber optic cable and might seem to belong in the data row of fig. , but it is more useful to think about it as a deliberate treatment of a data project and as a wisdom project from a communications perspective. it is intended to evoke a sense of wonder and mythic imagination applied to unseen infrastructure.
principle : data visualization can invite more questions than it answers
american panorama (fig. ) is an interactive atlas of american history, designed and built by stamen (stamen design a), and commissioned by the university of richmond’s digital scholarship lab (dsl).
the project uses maps and data visualizations to enable discussion of, and citation of, spatial relationships in different historical contexts. built by and for expert historians, this project and the principle that informs it also belong more in the knowledge row of fig. than in the facts row. one of the maps, foreign born, displays the number of americans counted in each census who were born outside the united states, subdivided into counties. viewers can see that in in san francisco close to half the population was born outside the country, with the largest numbers of this group coming from ireland, germany, and china. these numbers and proportions remain relatively stable through the census for san francisco. note that each link takes you to a different state of the map, an important detail when citing these maps. in , chinese immigrants, who were for several decades one of the top three foreign-born groups in san francisco, suddenly disappear from the map. the same thing occurs in , , and . there appear to be no chinese at all. we see them reappear in much-reduced numbers in the census. the census groups all americans of asian descent into asia (unspecified). chinese are listed as the biggest group of foreign-born americans in san francisco in , where they remain until the census, which is the most recent as of this writing. we know that chinatown in san francisco was an active site of chinese activity during these years. it made no sense that the data showed zero chinese-born americans during this time. we therefore thought there was a bug in our code. perhaps we’d spelled something wrong in the latest compile. but when we looked at the code, everything checked out. we went in and looked at the data: the row for china was empty in the data we’d received from the dsl. we finally asked our clients at the university for clarification. the chinese exclusion act of (dunigan ) not only severely restricted immigration from countries like china.
the act also forbade non-white foreign born people from being counted in the census. they therefore are literally off the map. the scholars at the dsl asked us to remain open to the possibilities of letting project viewers ask questions of the material. we decided together to leave the gap in the data in the project, as a way to invite inquiry into the material. this was deemed more aligned with the project’s goals as a tool for researchers than explaining every aspect of what the data showed. sometimes (as in this instance) blank spaces on maps are as important as the parts that are filled in. sometimes noise is as interesting as signal.

fig. american panorama: foreign born. census figures for foreign born population of san francisco in (above) and (below). stamen design. commissioned by the dsl at the university of richmond ( ).

e. rodenbeck

principle : data visualization communication is never context-free. there’s no neutral or correct way to do this work

we worked with behavioral scientists paul and eve ekman on the atlas of emotions (fig. ), commissioned by the dalai lama. his holiness and paul have written several books together about emotions, bringing their differing world views to bear on the subject of emotions to the benefit of both. the dalai lama, for example, learned about the concept of mood, an emotional state that causes people to interpret various events through the lens of an emotion that may not be the appropriate one for the task at hand. this was a new notion for him because, in tibetan buddhism, there is no concept for a bad mood (lama and ekman ). and paul learned about the tibetan concept of attachment as a kind of fulcrum point between emotional aversion and emotional attraction.

communication principles in data visualization

the dalai lama knows that, while he can speak to a certain kind of audience whose disposition might lead them to listen to what he has to say, others, perhaps those more scientifically minded, might be suspicious of the message a buddhist monk might bring. the dalai lama therefore asked paul to design for him an atlas of what science knows about how emotions work to address this need. paul asked us to help him design and build it. this work belongs in the vision row of fig. , though the principle that informs it can be equally well-applied to any of the rows in that figure.

fig. the atlas of emotions. each emotion presented as an independent continent (above) and the states of enjoyment displayed along with their attendant actions (below). stamen design. commissioned by the lama and ekman ( ).

paul’s response was to design a survey to uncover the consensus among scientists who study emotion, and what they disagree about. according to the survey, scientists agree that all humans share five emotions: anger, fear, sadness, disgust, and enjoyment. we then worked with paul and eve to design a visual representation of what these data showed, drawing on these scientists’ understanding of the structure and nature of human emotions. the project identifies emotional states within each primary emotion, organized by their felt intensity. among the multiple states of anger, for example, annoyance is felt only at the lower levels of intensity. exasperation and bitterness, two other states of anger, bridge very low levels of intensity and very high levels of intensity. fury, the most intense state of anger, is only ever felt at the highest levels. you cannot be slightly furious.
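the intensity structure described above (each state of an emotion occupying a band of felt intensity) can be pictured as a small data model. the following is a minimal sketch in python, not the atlas’s actual implementation: the 1 to 10 scale, the numeric ranges, and the names `EmotionState`, `ANGER_STATES`, and `states_at` are all assumptions made for illustration. only the state names and their rough ordering come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionState:
    """One state of a primary emotion and the intensity band where it is felt."""
    name: str
    low: int   # lowest intensity at which the state occurs (hypothetical 1-10 scale)
    high: int  # highest intensity at which the state occurs


# Hypothetical ranges: the text only says annoyance sits at the low end,
# exasperation and bitterness span from very low to very high,
# and fury exists only at the top of the scale.
ANGER_STATES = [
    EmotionState("annoyance", 1, 3),
    EmotionState("exasperation", 2, 9),
    EmotionState("bitterness", 2, 9),
    EmotionState("fury", 10, 10),
]


def states_at(intensity, states=ANGER_STATES):
    """Return the names of the states that can be felt at a given intensity."""
    return [s.name for s in states if s.low <= intensity <= s.high]


if __name__ == "__main__":
    for level in (1, 5, 10):
        print(level, states_at(level))
```

under these assumed ranges, `states_at(1)` returns only annoyance and `states_at(10)` returns only fury, which mirrors the observation that you cannot be slightly furious.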
paul had never looked at his work this way before, which was a surprise to both him and us. this renowned scientist, whose work laid the foundation for the modern scientific understanding of emotions, had never taken the time to count or map the relative intensities of the different emotional states he’d been studying for years. far from indicating a gap in paul’s work, this is, we feel, a powerful example of an opportunity for designers and scientists to work together to help bring a new level of visual thinking and accessibility to the vital work of science. paul later commented that this collaboration “. . . wasn’t just about discovering things i didn’t know about my own research . . . i also learned things that i didn’t think it was possible to know about my work” (stamen design b).

the project was designed for an english-speaking, western audience, as the dalai lama had asked us to do. this included color choices, which happened to be the same colors chosen for the five emotions that live in the head of the character riley in the recent pixar movie inside out. we chose green for disgust, blue for sadness, orange for enjoyment, purple for fear, and red for anger.

when talking about the project, i often used the example of how red is a symbol of good luck in china to illustrate what we assumed was large variance of opinion on color-emotional correspondence in other parts of the world. surely the chinese would have a different color for anger. how could an emotion generally associated with negativity also be associated with good luck? to my surprise, when i finally had the chance this year to ask a group of chinese speakers what color they associated with anger, they all said the same thing: red. they also told me that they associate the color blue with sadness, but that this was likely only true in china since the introduction of elvis presley’s music.
conclusion
in this new world of data ubiquity, scientific and dataviz communication is like any other kind of communication. there are many opportunities available and examples to choose from as you decide how to work with data and communicate with it, both to your scientific peers and to the public. evaluating dataviz science communication based only on whether the public immediately understands all the important aspects of the science can serve as a barrier to more widespread understanding of the work. the continuum of types of dataviz (fig. ) can serve as a useful framing device for making decisions about how to communicate in different ways to different kinds of audiences. it’s important to note that these principles and data types are by no means definitive. as has been discussed in relation to big glass mic, useful results can come from applying the lessons from one technique to a project that would seem to better fit in another. nevertheless, by considering these principles and actively employing them when communicating about their work to lay audiences, scientists can have a greater impact on the world than by adhering to the same strict principles of scientific accuracy that we expect them to deploy in their research.

acknowledgments
special thanks are due to martin karrenbach and john williams at optasense for their support on the big glass mic project and to the society for integrative and comparative biology for the opportunity to publish this manuscript, and especially to sara elshafie for inviting me to participate in this community (http://sicb.org/meetings/ /abstracts/correctsymposia.php).

references
bates m. cow compass points the way north. wired magazine (https://www.wired.com/ / /cow-compass-points-the-way-north/).
begall s, cerveny j, neef j, vojtech o, burda h. . magnetic alignment in grazing and resting cattle and deer. proc natl acad sci u s a : – (doi: . /pnas. ).
bliss l. . beneath a bustling university campus, a big cable is listening (https://www.citylab.com/design/ / /beneath-a-bustling-university-campus-a-big-cable-is-listening/ /).
dunigan g. . the chinese exclusion act: why it matters today. susquehanna univ pol rev , article (http://scholarlycommons.susqu.edu/supr/vol /iss / ).
ekman p. . what scientists who study emotion agree about. perspect psychol sci (doi: . / ).
foley j. . house of cards. season , episode , chapter (https://www.youtube.com/watch?v=svhhwdjptte).
lama d, ekman p. . emotional awareness: overcoming the obstacles to psychological balance and compassion. new york: henry holt & company.
lama d, ekman p. . atlas of emotions (https://hi.stamen.com/in- -the-dalai-lama-asked-his-friend-scientist-dr- a f c bd ).
saplakoglu y. . is the ground beneath the stanford campus listening to you? san jose mercury news (https://www.mercurynews.com/ / / /is-the-ground-on-the-stanford-campus-listening-to-you/).
slaby p, tomanova k, vacha m. . cattle on pastures do align along the north–south axis, but the alignment depends on herd density. j comp physiol a : – (doi: . /s - - - ).
stamen design. a. american panorama (http://dsl.richmond.edu/panorama/foreignborn/).
stamen design. b. atlas of emotions (https://stamen.com/work/atlas-of-emotions/).
stamen design. c. new images of complex microbiome environments visualized by berkeley metagenomics lab and stamen design. interview with jill banfield, uc berkeley (https://hi.stamen.com/uc-berkeley-metagenomics-lab-releases-new-images-of-complex-microbiome-environment-discovered-a c ).
stamen design. . big glass microphone (https://www.vam.ac.uk/bigglassmic/).

white paper report
report id:
application number: hd
project director: bonnie robinson (brobinson@northgeorgia.edu)
institution: north georgia college and state university
reporting period: / / - / /
report due: / /
date submitted: / /

cover sheet
type of report: final
grant number: hd- -
title of project: encouraging digital scholarly publishing in the humanities
name of project director: bonnie robinson
name of grantee institution (if applicable): university of north georgia (formerly north georgia college & state university)
date report is submitted: / /

encouraging digital scholarly publishing in the humanities: white paper

abstract
this project, led by the university press of north georgia and funded by a digital start-up grant from the national endowment for the humanities, focused on exploring the peer review process and increasing its usefulness to presses and scholars publishing digitally. by exploring these issues we have made recommendations for best practices in digital publishing, specifically for small academic presses. through surveys and a workshop of key stakeholder groups (press directors, college administrators, humanities faculty, and library/technology center directors), we found a strong investment in the “gold standard” of double- or single-blind peer review. working within the current academic publishing structure (including publishing in print) was a priority, even to presses and faculty members who were actively exploring digital publishing and open access models. on closer inspection, we realized that the various stakeholders valued the current peer review process for different reasons. we also found that the value of peer review goes beyond vetting the quality of scholarship and manuscript content. based on these findings, we considered ways to obtain these benefits within the current academic structure through innovative peer review processes. at the same time, we looked for ways of offsetting potential risks associated with these alternative methods. we considered cost effective ways to accommodate the needs of the disparate constituencies involved in academic publishing while allowing room for digital publishing.
while our findings focus primarily on small academic presses, they also have significant implications for the open access community.

table of contents
introduction
project design
conclusions
leveraging shared interests
recommendations for small presses
our next steps
appendix a: initial survey
appendix b: workshop description
appendix c: workshop agenda
appendix d: workshop summary
appendix e: follow up survey

introduction
this project grew out of discussions that began among the members of the consortium for open access textbooks at the association of american university presses in june . dr. bonnie robinson, project director, and director of the university press of north georgia, found that other academic presses were struggling with issues related to the transition to electronic publishing. she used this existing group to query other press directors about their procedures and practices related to peer review and electronic publishing. members of the consortium expressed strong interest in developing a model to share resources related to the peer review process. based on these initial conversations, dr. robinson applied for and received a digital start-up grant from the national endowment for the humanities. the funded project focused on examining peer review and electronic publishing of single-author digital monographs. monographs are detailed scholarly studies focused on a single subject and usually offering new, original research.

why monographs?
monographs are a particular area of interest for both academic presses and institutions. these carefully researched texts have a small audience, resulting in short print runs of , copies or fewer; nevertheless, they have a significant scholarly impact due to their original theses and pioneering contributions to their respective fields of knowledge.
presses and institutions also often share the costs and labor related to publishing monographs. technological changes facilitate the short runs characteristic of monographs. by drastically reducing production and distribution costs, digital born monographs can make publishing scholarly monographs economically feasible for small presses.

peer review and digital publishing
we began by exploring the peer review process to find ways to increase its usefulness and reliability to presses and scholars publishing digitally. we hoped to identify best practices for presses incorporating innovative peer review processes to support digital publishing. the cost savings of digital publishing motivated our focus on supporting small presses in increasing their digital offerings.

we discovered a strong resistance to digital-only publishing from both presses and scholars.
• although content is king for presses, delivery methods constantly change and their longevity is suspect.
• as long as readers demand print, presses need to continue publishing in print in order to remain competitive.
• scholars need to meet administrative tenure and promotion expectations where traditional print publishing remains the standard.
we considered ways to accommodate these needs cost effectively while allowing room for digital publishing.

project design
we started by thinking about questions we wanted to answer and identifying key stakeholders. we decided to use surveys and a face-to-face workshop that would bring together these key stakeholders to gather more information. we developed an initial survey directed to a small number of university press directors participating in an open access textbook consortium; the survey questions focused on digital publishing policies and peer review. from the results of this survey, we derived questions to consider at the workshop. participants at the workshop included press directors, faculty, administrators, and it personnel.
workshop discussion identified concerns and possibilities that warranted further examination and that shaped our second set of surveys disseminated to administrators, faculty, publishers, library/technology center directors, and it personnel. for further description of the project design, see appendices a-e.

conclusions

print or digital?
initial survey results indicated that presses use the same peer review process for both digital and print monographs, but % of scholars still believe that the process differs. half ( %) of scholars believe that if a scholarly monograph is peer reviewed, then its delivery method does not matter. % of scholars think that digital born monographs should count toward promotion and tenure. but % of scholars, and % of publishers strongly agree or somewhat agree that promotion and tenure committees prefer print over digital born publications. % of scholars are affiliated with institutions that require peer reviewed (blind or double blind) publication for promotion and tenure. just over half ( %) of publishers believe that monographs need to be published in print, but % think that if a monograph is peer reviewed, then its delivery method does not matter. of the scholars responding to this question, % prefer print only; % prefer digital only; and % prefer both. none of the librarians said that scholarly monographs need to be in print.

the peer review process
we discovered a strong resistance to changing the “gold standard” of double- or single-blind peer review from presses, scholars, and administrators. in other words, we soon realized that there existed a real interest in maintaining the current academic publishing structure, even among presses who were exploring digital publishing and open access models. on closer inspection, we realized that the various stakeholders valued the current peer review process for different reasons. we found that the value of peer review goes beyond vetting the quality of scholarship and manuscript content.
• scholars value the current peer review process in terms of improving the manuscript’s quality (argument, structure, and clarity).
• presses value the competitive edge that the peer review process (and gatekeeping role) gives them in academia.
• institutions (and their administrators) value the quality control of this gatekeeping in terms of building prestige.
• scholars value the prestige of their work being placed at top tier, well respected presses.
based on these findings, we considered ways to obtain these benefits within the current academic structure through innovative peer review processes. at the same time, we looked for ways of offsetting potential risks associated with these alternative methods.

leveraging shared interests
libraries, presses, faculty, and administrators each form what kathleen fitzpatrick in planned obsolescence ( ) calls “communities of practice.” the areas where their independent goals overlap present the best opportunities to change views and expectations regarding digital born monographs.

i. collaboration
economics drive the need for small presses to collaborate to share costs. to quote from a workshop participant: “scholarship is not commercial. we can’t draw a direct line from scholarship to dollars.” workshop participants and follow up survey respondents saw opportunities for certain types of collaboration, i.e., marketing and distribution; however, most did not see how collaboration would reduce the heaviest direct costs. to quote from the first survey: “cost sharing would work logistically for some portions of our operating costs, just not the most substantial ones: editorial and production. many presses remain in competition with each other for the best authors, grants, and publicity, so it’s not a normal state for publishers to work together in collaborative ways.” the workshop participants pointed out that we need to collaborate in cost- and labor-effective ways.
they asked, “can you reduce the whole system (for both presses and institutions) costs down? or are the costs there, and can you move them around?” one means of reducing costs is to cut out the middlemen. of the services provided by a press that are seen as valuable, editing/copyediting, production, and marketing and distribution are related to high direct costs. these services would therefore seem potentially rewarding for collaboration. some survey respondents do outsource marketing and distribution services to larger presses and/or vendors. a highly valued service that has low (or potentially low) direct costs is peer review, a process performed and/or facilitated by faculty boards and scholar reviewers.

a. editing/copyediting seems a viable area of collaboration. yet % of publishers do not share editors among presses. to quote from a survey respondent, “editors build lists at single presses.” if they do share editors, it is for copyediting.

b. production is seen as a valuable service provided by a press. the follow up survey asked publishers if they shared production activities among presses. % said no. “but we work with vendors who certainly serve other presses.” and, “we share digital activities but not printing and binding activities.” most publishers do not collaborate in that area due to branding.

c. faculty boards and peer reviewers are intrinsically collaborative. workshop participants pointed out that faculty and university presses already collaborate through faculty boards and peer reviewers. could those resources be used to further assist small presses with limited resources, i.e., through shared reviewers or a faculty board consortium?
some survey comments on this latter possibility included the following: “it depends on the board, its reputation, where the publication will circulate, etc.,” and, “i wouldn’t submit to a press if i weren’t sure i’d get good quality specialist readers.” also, publishers in the survey commented that, “such consortia have no sense of responsibility, and historically have shown themselves willing to approve nearly anything put in front of them.” and, “faculty boards are not as critical as good reviewers.” these results indicate an area for collaboration through faculty boards, providing that common concerns – accountability, credentialing, legitimizing specialists – were addressed. ii. peer review few publishers see opportunities for collaborating in terms of peer reviewers or the peer review process. to quote from the survey: “editorial programs are what really distinguish presses from each other. shared reviews would break that down. and in the case of negative reviews, they would be unduly harmful to authors. no one or two reviewers should be able to kill a project at multiple presses.” what is valuable in the peer review process for published monographs? % of the survey respondents believed it helped with their research or expanded their research methods; % believed it strengthened the argument; % thought it helped in revising the structure; % found it helpful as copyediting; and % found it no help at all – except with preparing “blurbs” (which could be useful for marketing). in terms of innovative forms of peer review, the survey responses reveal what stakeholders value about peer review itself. frankness and lack of bias are values. 
because of that, most publishers find open peer review, for example, a disincentive for potential reviewers: “academia can be a uniquely political and contentious venue;” “for overall quality assessment, public reviews bring too much pressure on the reviewers to be less frank in judging a work…we live in a litigious society.” respondents doubted that blog discussions, webmetrics, and crowdsourcing could provide “sustained attention, including many specific criticisms as well as assessment of a work’s coherence as an argument overall.” the data and information provided through these methods were valuable, depending upon who used them and how they were used. a sample response: “ideally, we care whether scholarship is getting read and used. metrics is a good system for determining these things. they should count, but the data needs to be analyzed carefully.” these responses indicate an inclination to diversify the peer review process towards hybridicity and transparency, providing that what is valued in traditional peer review is preserved in these new processes. but publishers prefer/like traditional peer review also because it gives them a competitive edge over commercial presses, since “peer review is a default to promotion and tenure committees.” so publishers might resist change here. iii. digital born publishing when asked in the survey whether digital born monographs should count towards tenure and promotion, % of scholars and administrators responding to the follow up survey said no. a sampling of their responses revealed their perception that digital born monographs would not undergo rigorous peer review. 
to quote: “peer review is an important part of good scholarship; if it were to change for digital monographs, i would need more information about how it would change.” workshop participants noted that publishers see a number of ways that digital advances have not yet had as big an impact in publishing as possible because their specific target audiences have not fully embraced improvements in: digital review copies, electronic peer review, permissions, agreements, and digital born (identified as e-only) publications. our survey to press directors revealed that publishers use the same peer review process for digital as for print products. in other words, there is no difference between the two. as one respondent wrote, “content, not format, is [the] object of reviews.”

recommendations for small presses

1. assure scholars and administrators that double blind peer review will remain in place, while at the same time encouraging innovative models. assuring scholars and administrators of the quality of digital born products is essential to encourage their growth. since double or single blind peer review is highly valued, focusing editing and copy editing efforts on that process has a very strong return on investment. according to our survey, most peer reviewers consider their work as service to their field; many do not receive or expect an honorarium. if small presses could encourage reviewers to forgo honoraria, then the main cost of traditional peer review is in management. management costs themselves can be further reduced if reviewers are encouraged to use electronic manuscripts. coinciding with this traditional review process, small presses can explore blogs, comment communities of discipline-specific scholars, and other digital forms of collaborative review that enhance the quality of a manuscript at little cost.

2. continue to use a hybrid (digital and print) publishing model, while anticipating a digital-only future. digital publishing can comprise the majority of a press’s products, and print copies of scholarly monographs can be made available through print on demand. altering perceptions of digital born products would be a means to lessen the need for print products. with current perceptions, publishing still needs to be hybrid. small presses have an advantage in their agility in adapting to and adopting this hybridicity. their overhead is comparatively small. and they can collaborate with various entities and constituencies; besides libraries, small presses can partner with learned societies, discipline-specific scholars, etc.

3. consider collaborative models for the peer review process. concerns for sustainability and cost recovery drive this interest in potential collaboration and partnerships. viable options include collaborating in the peer review process itself with faculty board consortia and/or learned societies and discipline-specific scholars/reviewers. small presses, however, need to be very careful to leave potential sources of cost recovery entirely out of the manuscript vetting/peer review process itself, for example, in author fees and/or required subsidies.

4. ensure the quality of digital or hybrid products through rigorous and transparent peer review processes. small presses need to explain how and why they choose peer reviewers; what questionnaires or rubrics, if any, they require their reviewers to use and, with rubrics, whether or not comments are encouraged; what review/assessment system, if any, they use; how they interpret their reviewers’ comments and implement any requested revisions; what related communication, if any, occurs between the reviewer and the project editor; whether and/or how often they re-use reviewers and whether or not their reviewers are financially compensated. all their peer reviewers need to be experts in the field, who are able to evaluate high level scholarly work and to write focused reviews. archive all records of draft versions, revisions, and accompanying comments.

5. allocate resources wisely. small presses can allocate most of their resources to those branding activities that are both highly valued and comparatively low cost, such as double or single blind peer review and design/production. small presses need to enhance their reputation by assuring the high quality of their publications, so they should allocate their resources towards the activities that achieve the most valued outcomes of the traditional peer review process. in other words, small presses should take greatest advantage of elements in the traditional peer review process scholars believe enhance the quality of the monograph: micro and macro sustained analysis that strengthens the argument, revises its structure, or improves its clarity. small presses can also encourage those innovative peer review processes that promote these particular goals.

6. consider using innovative practices at strategic points in the review process. think about using new methods at different points of the peer review process (pre, during, and post publication). for example, blogs can foster sustained scholarly conversation over a period of time. these contributions can be collected and can assist a work’s development phase. collaborative peer review with different reviewers focusing on different aspects of a text can also assist a work’s development. and open peer-to-peer review or review within a closed community of scholars (who do not dissipate/disseminate a scholar’s research) can assist at the revision and editing stages. digital publishing and post publication review also especially facilitate quick and responsive revision/editing.

our next steps
by identifying potential concerns and new possibilities in peer review and digital publishing, our project is moving this conversation forward and encouraging change.
we will continue to contribute to the ongoing conversations about digital publishing in the humanities through blogs, newsletters, journals, and conferences. for example, we will align our findings with the discussions and recommendations of the recent jisc collections and oapen open access monographs in humanities and social sciences conference. in addition, the recently released white paper from ithaka s+r focusing on “campus services to support historians,” shows how issues related to digital publishing and the promotion and tenure process are playing out in a specific discipline within the humanities. we will continue to disseminate sample policies, best practices, and ideas for new possibilities in peer review and digital publishing to small presses to enable their greater leadership in and impact on digital publishing. and, we will continue to reach out to the academic and publishing communities (both nationally and internationally) to encourage collaborations to increase the digital output of small presses. we are specifically interested in supporting faculty board consortia and acquisitions editors. collaboration with the open access community is a priority to us. the “sanctioning” by academic presses of scholarly monographs through peer review can be a goal in itself, and this vetting process can be transferred to open access publishing. some project findings that apply to open access publishing include: finding ways to maximize high impact factors like quality-control, production and design, and marketing while minimizing risks (i.e. loss of prestige and respect, plagiarism, longevity, inaccessibility). this is the most promising convergence we have discovered among the common interests of our various constituencies. 
appendix a: initial survey

the purpose of the initial survey was to explore common and unique practices among university presses of disparate size and individualized editorial programs in terms of the peer review process, publishing costs, collaborations and methods of cost recovery, and digital publishing policies and practices. the survey was sent to a targeted group (n= ) of university press directors that were part of the consortium for open access textbooks. this group of university presses formed to share costs and profits in the development and marketing of peer reviewed open educational resources. the response rate for this survey was %. through this survey we gathered valuable information to guide the conversation at the workshop. we learned that few operating costs were shared through partnerships with other presses, presses compete with each other for manuscripts and readers as well as with commercial presses, peer review gives university presses a competitive edge over commercial presses, the same peer review process is used for print and digital products, end users determine media format, and scholarly end users distrust the quality of digital born publishing.

initial survey questions:
• in which forms – digital and/or print – do scholars prefer to publish monographs?
• did peer review improve/enhance a published monograph before its publication?
• do you think peer review of digital born monographs differs in any way from the peer review of print scholarship?
• what is the role of university presses?
• do scholarly monographs need to be published in print?
• does digital-only content have longevity?
• could blogs be peer reviewed in the same manner as scholarly monographs?
• can the peer review process be crowd-sourced?
• is the future of peer review open peer review within a closed community of credentialed scholars?
• has digital publishing increased the usefulness of peer reviewed shorter forms of scholarship, such as shorts, essays, web pages, and blogs?
• do promotion and tenure committees prefer traditional to new models of publishing?
• if a scholarly monograph is peer reviewed, does its delivery method matter?
• if a scholarly monograph is peer reviewed, does its place of publication matter?

appendix b: workshop description

the purpose of the workshop was to bring together a group of key stakeholders (press directors, administrators involved in the promotion and tenure process, scholars, and it professionals) to focus on the peer review process, digital publishing, and the role of academic presses. specifically, the group was tasked with developing focused questions for the follow-up surveys, and outlining next steps.

key issues discussed at the workshop included the following:
• what is the role of the press in scholarly communications?
• are innovative and traditional forms of peer review mutually exclusive?
• what is the value of print over digital publication (and vice versa)?
• what are the purposes of peer review?
• how do we persuade promotion and tenure committees of the value of digital born scholarship?
• how can university presses collaborate in order to reduce costs and so encourage small presses and start up presses?

the workshop discussion elucidated the view that many factors besides peer review, or vetting the quality of scholarship, “branded” a monograph as itself of high quality. such “branding” also involves the author’s credentials, expertise, and recognition; the author’s affiliate academic institution’s reputation; the publisher’s reputation; and the monograph’s production layout and design, front and back matter, etc. scholarly communication involves disparate skills among various entities and is labor intensive.
those involved in branding scholarship as high quality include academic institutions, who compete with other academic institutions for an international audience; university presses, competing with each other for authors, titles, reviewers, and reputation; libraries, building collections; learned societies, indexing discipline-specific scholarship; and professional publishing societies, determining best practices. within these entities, publishers see their role as devising a type of peer review for born digital scholarship and communicating that to academic institutions. - % of publishing costs goes towards peer review itself. yet peer review is an important (though not the sole) factor in what determines quality content and facilitates the additional branding elements of editing, production, etc. publishers prefer blind (double or single) peer review because it assures scholars the freedom to analyze and evaluate manuscripts without fear of retaliation; reviewers have the option for anonymity; candid reviews are thereby ensured. blind review supports presses by giving university presses a competitive advantage over commercial presses; it enhances a press’s reputation because it is important for a press to ensure the integrity of the scholarship. publishers use the same peer review process for print and digital products. innovative forms of peer review, such as biblio- or webmetrics, can be useful for post-publication review. their reliability is open to question since they can be “gamed.” most publishers do not use open pre-publication peer review because it remains a disincentive for potential reviewers who fear backlash and subjectivity. open review seems unreliable in terms of credentialing reviewers. also, promotion and tenure committees trust blind review over open peer review. open peer review can be useful for those authors who are looking for revision guidance. open peer review might be useful once a press decides whether or not to publish a manuscript. 
some presses believe that peer review will become more and more transparent in the next years and that reviews will become more available to public scrutiny. the publishers at the workshop believed that we have already moved beyond the need to build respect for digital scholarship and that the larger questions are related to cost savings and collaboration. they also believe that the same factors that brand print scholarship as high quality do so for digital born scholarship, with peer review being the key factor and “gold standard.” the administrators and faculty workshop participants confirmed the view that we have moved beyond the need to build respect for digital scholarship, providing that the content undergoes peer review. if a work is peer reviewed to ensure its quality, then its publishing format does not matter (i.e. in terms of promotion and tenure).

appendix c: workshop agenda

agenda: encouraging digital publishing in the humanities workshop ( / / )

general review of preliminary survey responses (power point presentation)

guiding questions to consider:
• what is/will be the role of the press in scholarly communications?
• what is the value of print over digital publication (and vice versa)?
• what are the purposes of peer review?
• is the value of rigorous, traditional peer review available to innovative forms of peer review?
• how do we persuade tenure-and-promotion committees and college/university administrations of the value of digital born scholarship?

examine/discuss focused concerns:
1. print vs digital
2. innovative/peer review processes
   a. double and single blind
   b. open/peer-to-peer
   c. transparent: reports/annotations shared
   d. collaborative: community of scholars
   e. flexible bibliometrics
   f. combining values/determining best practices
3. stakeholders (who need to be considered in determining value)
   a. non-tenured faculty
   b. tenure and promotion committees
   c. college/university administrations
   d. expert reviewers/scholars
   e. university presses
4. adapting collaborative models (that technology/social media, etc., encourage)
   a. open access
   b. digital platforms/sharing/production
   c. pre-publication availability
   d. competition
   e. peer review management systems
   f. sustainability models for peer review processes and for publishing
5. partnerships with scholarly societies (mla, aha, etc.)
6. partnerships with press groups (aaup)
7. scholarly websites

brainstorm next steps; draft plan of next steps; develop questions and recipients for next surveys

appendix d: workshop summary

participants:
• thomas bacher (director, university of akron press)
• dr. tanya bennett (professor of english, department of english, north georgia college & state university)
• mick gusinde-duffe (assistant director of acquisitions, and editor in chief, university of georgia press)
• dr. markus hitz (professor of computer sciences, department of math and computer sciences, ngcsu)
• jane hoener (director, wayne state university press)
• alex holzman (director, temple university press)
• dean chris jespersen (school of arts & letters, ngcsu)
• meredith morris babb (director, university press of florida)
• dr. bj robinson (director, university press of north georgia and department of english, ngcsu)
• dr. denise young (assistant vice president, office of institutional effectiveness, ngcsu)

working questions (focusing on monographs/digital born monographs):
• what is/will be the role of the university press in scholarly communications?
• what is the value of print over digital publication (and vice versa)?
• what are the purposes of peer review?
• is the value of rigorous, traditional peer review available to innovative forms of peer review?
• how do we persuade tenure-and-promotion committees and college/university administrations of the value of digital born scholarship?
• how can university presses collaborate in order to reduce costs and so encourage/assist small and start up university presses?
main categories for discussion: collaboration, digital vs print publication, peer review, and stakeholders

goals: new questions and ideas; outline next steps; identify stakeholders; develop focused questions for data- and information-gathering; develop possibilities for neh digital start up level ii funding application (in terms of actual digital products, like website)

collaboration

presses and universities/academic institutions have much in common, in terms of valuing peer review and scholarly communication. both are concerned with finding/making and ensuring quality contributions to scholarship. both are concerned with profits and/or cost recovery – which involve “leveraging the brand.” both want to protect/enhance their reputation and competitive edge. presses and institutions share the following “participants” in publishing/published monographs: scholars/faculty, faculty boards, libraries, readers, and costs (since universities subsidize presses). the cost of scholarly communications is growing. presses need to establish leadership within their parent campus community. the collection of expertise in a university press, combined with library expertise (presses know about costs while libraries know about prices), can really do something.

besides vetting the quality of scholarship (trusted venue for high-quality scholarship), “branding” a monograph involves author’s credentials/expertise/recognition; affiliate academic institution’s reputation; publisher/press’s reputation plus the monograph’s copyediting, production layout and design, front and back matter, etc.

scholarship is not commercial. scholarship is the commons of human culture. we can’t draw a direct line from scholarship to dollars. we ought to return scholarly outcomes to the public. we need to find a way to synthesize disparate skills in scholarly communication. we should consider engaged scholarship as a form of service learning because scholarly communication is labor intensive.
question: can you bring down the costs of the whole system (for both presses and institutions)? or are the costs simply there, so that you can only move them around? one means of reducing costs is to cut out the middle-men.

mission critical question: what would stakeholders be willing to do without in order to get a scholarly (digital born) monograph published?

question: who does the “branding” of scholarship?

answer (complete?): academic institutions (competing with other academic institutions for an international audience); university presses (competing with each other for authors, titles, reviewers, and reputation); libraries (building collections); learned societies (indexing discipline-specific scholarship); professional publishing societies (aaup)?

summary: the publishers see opportunities for collaboration in the following areas: editing, production, and marketing. they consider the costs of peer review almost ‘negligible’ (consisting of honorarium to external reviewer). the largest (indirect) costs are personnel in all departments and production, so sharing those costs would be helpful. few publishers see opportunities for collaborating in terms of the peer review process itself: “editorial programs are what really distinguish presses from each other. shared reviews would break that down. and in the case of negative reviews, they would be unduly harmful to authors. no one or two reviewers should be able to kill a project at multiple presses.”

possibility: content consortium for open access digital projects. offer publication services (book packaging) consortium/publishing coalition. need to monetize the disparate skill sets. synthesis itself is a contribution: editors, scholars, librarians – pull together.

digital vs print publication

digital publishing reduces costs in the following areas: printing, distribution, warehousing, overhead.
it allows for enhanced content, discoverability/searchable content, additional modes of content, pre- publication availability, quicker access to content once the work is published, easier collaboration among presses, and easier transition to multiple/open peer review processes. digital also helps the author integrate external reviews, thus improving developmental editing/revision. publishers see a number of ways that digital advances have not yet had as big an impact as possible because their specific target audiences have not fully-embraced improvements in: digital review copies, electronic peer review, permissions agreements, and e-only publications. the end-users determine formats. art, archeology, and anthropology, for example, work best in print (due to complex layout). presses could create a content addendum website for such discipline-specific monographs. the material on such content addendum websites could be made available for open peer review. publishers expressed concerns over longevity of digital platforms and later conversions. also, publishers note that many promotion-and-tenure committees in the humanities still regard printed books as the “gold standard.” quality print on demand publishing could/should therefore partner with digital publishing since scholars can be satisfied with as few as ten print copies of their monographs. question: can a press be all-digital? 
answer: peer review is the biggest obstacle to this possibility: “we send manuscripts and receive reviews electronically but do not use an electronic review platform because we have too low a volume to make that cost efficient.” also, “ new monographs a year do not provide sufficient volume to make a peer review management system effective.”

summary: most presses publish in multiple formats because “content is king.”

possibility: add to “book packaging” publishing services consortium the possibility of cloud storage – distributed all over the world; eliminate storing redundancies

peer review

publishers see their role as coming up with a type of peer review for born digital scholarship and communicating that to academic institutions. - % of publishing costs goes toward peer review. yet peer review is what determines quality content. “we still rely on peer review to determine the quality of the scholarship, regardless of its media format.” publishers use the same peer review process for print and digital products. publishers prefer blind (double or single) peer review for the following reasons: assures scholars the freedom to analyze and evaluate without fear of retaliation; reviewers have the option for anonymity; ensures candid reviews; helps keep out bias. “for all its faults, blind review is still the best way to get an honest, forthright assessment of a project.” blind review also supports presses in the following ways: it gives university presses a “competitive advantage over commercial presses;” enhances a press’s reputation because it is important for a press to ensure the integrity of the scholarship: “peer review is about quality control…digital and print both warrant making sure they are quality publications.
presses want to be assured of the scholarly integrity of what we publish and place our imprint on.” single blind review can at times penalize younger scholars, though, if the reader is a senior scholar who might be negatively disposed toward controversial work (if a writer is junior). peer review processes in the form of bibliometrics can be useful for post-publication review. important caveat concerning their reliability: they can be gamed. most publishers currently do not use open peer review because it remains a disincentive for potential reviewers: “academia can be a uniquely political and contentious venue.” also, “for overall quality assessment, public reviews bring too much pressure on the reviewers to be less frank in judging a work…we live in a litigious society.” also, open review seems unreliable in terms of credentialing the reviewers. and promotion-and-tenure committees trust blind review over open peer review: “p & t committees get skittish with new models.” open peer review can be useful for those authors who are looking for revision guidance. open peer review might be useful once a press decides whether or not to publish a manuscript. some presses believe that peer review will become more and more transparent in the next years and that reviews will become more available. question: if print and digital publications undergo the same processes of blind peer review, then why do promotion-and-tenure committees prefer print? if content is king, then why does media/format matter? answer: we have already moved beyond the need to build respect for digital scholarship. the larger question is on cost savings and collaboration. question to administrators/deans/department heads/promotion-and-tenure committee members: if a work is peer-reviewed to ensure quality, does its publishing format (or even place of publication ie personal website) matter in terms of promotion and tenure? answer from representatives at workshop: no. 
question: are presses themselves the middle-men that can be eliminated? can what presses offer indeed be likened to “book packaging”?

summary: blind review is a trusted means of ensuring quality scholarship. its sustainability is open to question. presses foresee a need to combine/layer means of peer review.

possibility: learned and professional societies could host an institutional repository for credentialed/quality scholarship, i.e., aaup could host a ‘deluxe,’ branded, trusted institutional repository of digital born scholarship/open access material for collaborating presses, universities, libraries: brand the quality of scholarship, then “convert” everyone else (“if content is king, then good metadata is the workhorse”).

stakeholders

university press faculty boards are often overlooked, especially in terms of the peer review process. faculty boards are a check in the process. faculty board members are appointed by the press’s parent institution’s president or provost. they are also recommended by past board members. members comprise names that have been evaluated by colleagues – diverse representation that includes a sort of institutional history. they are themselves a type of community of scholars (open peer review). they are a gateway to good scholarship. engaged boards understand research and appreciate good writing.

peer reviewers themselves need more attention. presses often use the same reviewers over and over. presses have their own roster of peer reviewers (closed community of scholars?). institutions generally believe that senior scholars have an “obligation” to do peer review. many publishers value reviews from full professors only. there is a move to accept reviews from associate/assistant professors: “junior” professors are very close to new research.
summary: stakeholders include faculty, promotion-and-tenure committees, academic administrators, libraries, learned societies, information technology departments/committees, distance learning departments/committees, digital scholarship centers, university presses, editorial boards, acquisition editors, students possibility: develop a consortium faculty board (shared faculty board ie academic council of educators). such a board could be a national editorial gateway (university-administration driven). in order to develop such a board, need to plan for exceptions/voting bias/rewards. collaborating presses must exercise due diligence (including clearing participation with department heads who see it as valued service). possibility: develop consortium advisory editorial board that is a hybrid of faculty from other universities plus expert laypeople plus publishing professionals. besides surveys, investigative ‘next steps’ include examining the technologies/infrastructures of the following: netgalleys (put up content; open it up to reviewers, or have editors make requests for reviews. editorial control through netgalley peer review. online assessment system. embedded rubrics. rubrics that you follow); ehistory/historyebooks.org (heb advisory board and title review board); bepress peer review management system. big idea possibility: develop a website/clearing house of external peer reviews/reviewers (askanexpert.com; directories in specific areas). develop a means for an open exchange of reviewers, where electronic publishers can ‘bid’ for reviewers (ecommerce; quality rankings; paid value of peer review; credentialing; managing). website can be a means of sharing reports once a publishing decision is made. such a website could expose/display the peer review process itself as well as ensure quality scholarship of digital born monographs. important step: need to talk to stakeholders ahead of time: making a case for it vs just putting it out on the web. 
process must include dialogue/discussion. broadcast some articles and information on the clearinghouse’s peer review to allow for user testing.

appendix e: follow-up survey

using the information developed at the workshop, we designed the follow-up survey. the purpose of this survey was to solicit the views/attitudes/perspectives on and concerns about peer review and digital publishing among a larger number of stakeholders; we intended to gather the information needed to shape recommendations to small presses for best practices in digital publishing and peer review. this survey was sent to a larger group (n= , ), that included administrators, faculty, publishers, library/technology center directors, and it personnel. the response rate was around %. we benefitted from the survey results by learning that a significant number of respondents value digital born publishing, innovative peer review processes, and new publishing models; we benefitted also by learning that constraints, concerns, and perceptions (versus reality) inhibited receptivity to changes in delivery methods and peer review.

the combined interests in sustainability and adaptability identified in the workshop discussion shaped our subsequent surveys. in addition to our original focus on the peer review process and innovative ways to facilitate publishing digital born monographs, we expanded our survey to include questions related to cost recovery and sustainability. we sent the surveys to the three main stakeholder groups (faculty, presses, and library/technology center directors). from the results/data gathered, we learned that:
• some of our questions needed clearer grounding/preparation due to the diversity of knowledge about peer review processes and digital publishing among the recipients.
• we could have asked for clearer self-identification in terms of background knowledge of/experience in digital publishing.
• we could have targeted more specialized groups within the larger constituencies; i.e., digital humanities scholars, centers of scholarly communication, etc.
• we could have delineated more clearly the nuances among the various categories of questions addressing digital publishing, cost recovery opportunities, and peer review.

survey response rates

when | who | response rate
pre-workshop ( - - to - - ) | directors of university presses (n = ) | % (n= )
post workshop; follow-up ( - - to - - ) | editors at university presses (n= ) | . % (n= ); . % partial completion (n= )
 | faculty in the humanities (n= , ) | . % (n= ); . % partial completion (n= )
 | library/technology center directors | . % (n= )

faculty included % instructors, % assistant professors, % associate professors, % full professors, and % none of the above. we included deans in the survey but did not request in the survey differentiation between their scholarly and administrative roles. similarly, we included digital technology center directors in the library directors but did not request in the survey differentiation between the two.

follow-up survey questions:

questions for faculty

which of the following describes your rank? a. instructor; b. assistant professor; c. associate professor; d. full professor; e. none of the above branch: as an _________, do you feel obligated to peer review scholarly work? a. yes; b. no (please explain) why do you agree to write peer reviews? a. financial compensation; b. professional responsibility; c. access to current research; d. other? would you peer review scholarship in order to earn a reduction on a learned society’s dues? a. yes; b. no what do you think are the role(s) of learned societies (check all that apply)? a. networking; b. research opportunities; c. conferences; d. publishing opportunities; e.
setting of professional standards if a learned or professional society, collaborating with a consortium of presses, sponsored a site for discoverable open access materials, would you make your scholarly work available on such a site? a. yes; b. no (please explain) as an author, would you consider publishing your scholarly monograph with a press that requires subvention, that is, your own monetary support, ie grant awards, etc.? a. yes; b. no (please explain) would you serve as a peer reviewer for a press that shares peer reviewers among a consortium of university presses? a. yes; b. no (please explain) as an author, in which of the following forms do you prefer your scholarly monograph to be published? a. print; b. digital; c. both have you published a scholarly monograph that has an addendum website? a. yes; b. no branch: if you published a scholarly monograph that has an addendum website, was the website peer reviewed? a. yes; b. no branch: if an addendum website to a scholarly monograph was peer reviewed, then in what manner was it peer reviewed? a. open; b. closed; c. other (please explain) if you have published a monograph with a scholarly press, do you know how the press selected the manuscript’s peer reviewers? a. yes; b. no (please explain) if your monograph has undergone peer review, did the published text benefit from the peer review process in any of the following ways (please check all that apply)? a. research or expanded research methods; b. strengthened the argument; c. revised its structure; d. copyediting; e. other (please explain) if your scholarly work has undergone peer review, with which of the following do you have more general experience in terms of the reports? – a. reports were balanced and effective; b. reports were limited; c. reports were biased; d. other (please explain) if you have peer reviewed a scholarly work, did you generally make an effort to be balanced and objective in your report? a. yes; b. 
no (please explain) if you have peer reviewed scholarly work, did you write the report with any of the following in mind (please check all that apply)? a. the author; b. the press; c. the potential readers if you were to write a peer review report on a scholarly monograph, in which of the following forms would you prefer to read the manuscript (please check all that apply): a. print; b. digital file (i.e., .pdf, .doc, etc.) if you were to write a peer review report on a scholarly monograph, would you be willing to use a system similar to netgalley’s, that is, a site that’s hosted by a consortium of university presses from which you can download a manuscript/galley? a. yes; b. no (please explain) if you were to write a peer review on a scholarly monograph, would you be willing to work with an online assessment system (i.e., online rubrics)? a. yes; b. no (please explain) have you ever worked with an acquisitions editor? a. yes; b. no branch: if you have ever worked with an acquisitions editor in publishing your scholarly monograph, did the acquisitions editor assist you in revising your manuscript? a. yes; b. no (please explain) in order to get your scholarly monograph published, what would you be willing to do without, in terms of services from a university press (check all that apply)? a. peer review; b. editing; c. copyediting; d. production (layout & design); e. marketing

for faculty and administrators

how much do you agree with this statement: in emerging fields, sometimes the junior professors are the most informed scholars? a. completely agree; b. somewhat agree; c. neither agree nor disagree; d. somewhat disagree; e. strongly disagree how much do you agree with this statement: university presses should ask only full professors for peer reviews? a. completely agree; b. somewhat agree; c. neither agree nor disagree; d. somewhat disagree; e. strongly disagree do you think that peer reviewers should be financially compensated? a. yes; b.
no (please explain) do you think that writing peer reviews should be considered expected professional activity? a. yes; b. no (please explain) should authors suggest people to review their own work? a. yes (please explain); b. no (please explain) do you think that the peer review process can be crowdsourced? a. yes; b. no (please explain) does your university credit peer reviewed work that has not been published through a university press? a. no; b. yes (please explain) does your university credit non-peer-reviewed publications towards promotion and tenure? a. no; b. yes (please provide examples) if your university requires peer reviewed publication for promotion and/or tenure, then in which form is the scholarship required to be reviewed (please check all that apply)? a. single blind; b. double blind; c. open do you think digital born monographs should count towards tenure and promotion? a. yes; b. no (please explain) do you think the peer review of digital born monographs differs in any way from the peer review of print scholarship? a. yes; b. no (please explain) do you think that a university press’s faculty board is the equivalent of a community of scholars? a. yes; b. no (please explain) if a university press offered publication services such as editing, peer review, layout & design, and web hosting for open access materials, would you pay for the service? a. yes; b. no (please explain) branch: if your scholarly monograph were published by a university press that offered for-fee publication services, including peer review, and if your monograph’s peer review reports were made available to a tenure and promotion committee, would your university credit the monograph towards promotion and/or tenure? a. yes; b. no (please explain); c. don’t know if an author could pay a press to do a double blind review on a monograph that the author then made open access rather than published through a university press, would the monograph be credited for promotion and/or tenure? 
a. yes; b. no (please explain)
if an author could pay a press to do a double blind review on a monograph that the author then made open access rather than published through a university press, should the monograph be credited for promotion and/or tenure, in your opinion? a. yes (please explain); b. no (please explain)
does your university credit service learning/engaged scholarship towards promotion and tenure? a. yes; b. no
have you ever served on a university press faculty board? a. yes; b. no
branch: if you have ever served on a university press faculty board, were you appointed to the service? a. yes; b. no (please explain how you joined the faculty board)
branch: if you have ever served on a university press faculty board, was there a term limit to your service? a. yes; b. no (please explain)
branch: if you have ever served on a university press faculty board, what were your reasons for doing so (please check all that apply)? a. as an honor; b. to build your reputation; c. to build your resume; d. as a position for distinguished faculty; e. as service/appointment
which of the following do you think should serve on a university press faculty board (please check all that apply)? a. full professors; b. associate professors; c. assistant professors; d. expert lay people; e. library staff
if your university has a university press, do you know any, or all, of its faculty board members? a. yes; b. no
branch: do you know the role of a faculty board in your university’s press? a. yes; b. no
branch: do you have contact with the faculty board of your university’s press? a. yes; b. no
branch: do you know how the faculty board of your university’s press is selected? a. yes; b. no

questions for faculty, administrators, university press editors, and library/technology center directors
how well do you agree with the following statement: the purpose of university presses is to disseminate scholarship, not necessarily administer peer review for peer review’s sake. a.
completely agree; b. somewhat agree; c. neither agree nor disagree; d. somewhat disagree; e. strongly disagree
should scholarly monographs be funded through any of the following means? a. library subscription fees; b. university subsidies; c. both; d. neither (please explain)
should scholarly monographs be funded through subventions, that is, money supplied by the author, ie through grants? a. yes; b. no (please explain)
do monographs need to be of a certain length (ie up to pp)? a. yes (please explain); b. no (please explain)
do you think that a single-author monograph is the highest form of scholarship in the humanities? a. yes; b. no (please explain)
do monographs need to be published in print? a. yes (please explain); b. no (please explain)
do you think the advent of digital publishing has increased the usefulness of peer reviewed shorter forms of scholarship, such as shorts, essays, web pages, blogs, etc.? a. yes; b. no (please explain)
does blogging lend itself to constant, in depth scholarly conversation? a. yes (please explain); b. no (please explain); c. maybe (please explain)
could blogs be peer reviewed in the same manner as scholarly monographs? a. yes; b. no (please explain)
who are the consumers of scholarly monographs (check all that apply)? a. libraries; b. faculty; c. both equally; d. other (please explain)
how much do you agree with the following statement: the future of peer review is open peer review within a closed community of credentialed scholars? a. completely agree; b. somewhat agree; c. neither agree nor disagree; d. somewhat disagree; e. strongly disagree
how much do you agree with the following statement: promotion and tenure committees prefer traditional to new models of publishing. a. completely agree; b. somewhat agree; c. neither agree nor disagree; d. somewhat disagree; e. strongly disagree
if a scholarly monograph is peer reviewed, does its delivery method (ie print, digital) matter? a. yes; b.
no (please explain)
if a scholarly monograph is peer reviewed, does its place of publication matter? a. yes (please explain); b. no (please explain)
would you accept work published at a press that uses a national faculty board, that is, a faculty board with members from a consortium of universities? a. yes; b. no (please explain)
digital publishing allows for tracking views, downloads, and trackbacks. do you think this type of information should influence promotion and tenure? a. yes (please explain); b. no (please explain)
with new formats constantly appearing, do you think that digital-only content has longevity? a. yes; b. no (please explain); c. maybe (please explain)
do you think that cloud storage (with all its redundancies) has a role in the long-term preservation of digital monographs? a. yes; b. no (please explain); c. maybe (please explain)

for publishers and editors
do you share editors among presses? (yes/no/please explain)
do you share marketing among presses? (yes/no/please explain)
do you share production activities among presses? (yes/no/please explain)
do you share distribution activities among presses? (yes/no/please explain)
would you use an editorial management system (such as bepress offers)? (yes/no/please explain)

questions for administrators and librarians
do you think that libraries determine the quality of published scholarship? (yes/no/please explain)
do you know what an acquisitions editor does? (yes/no)
do you know an acquisitions editor? (yes/no)
have you worked with an acquisitions editor? (yes/no)
would you consider allocating library funds to subsidize press publications? (yes/no/please explain)
branch: if your library allocated funds to subsidize press publications, would you require that publication to be open access? (yes/no/please explain)

questions for library/technology center directors
if you work for a library at a university with a press, do you know its faculty members?
(yes/no)
branch: do library staff serve on your university press’s faculty board? (yes/no)
if collaborating university presses developed a cloud consortium, would your library sign up for it to access material? (yes/no/please explain)

king’s research portal
doi: . /
document version: peer reviewed version
citation for published version (apa): spence, p. ( ). the academic book and its digital dilemmas. convergence (london), ( ), - . https://doi.org/ . /
the academic book and its digital dilemmas
paul spence
paul.spence@kcl.ac.uk
king’s college london, uk
submitted to the journal convergence: the international journal of research into new media technologies
author pre-print version, after peer review
copyright notice: copyright rests with the journal convergence: the international journal of research into new media technologies. journal volume number and issue number still pending at time of submission. copyright © . reprinted by permission of sage publications. doi: . / journals.sagepub.com/home/con

abstract

the future of the academic book has been under debate for many years now, with academic institutional dynamics boosting output, while actual demand has moved in the opposite direction, leading to a reduced market which has felt like it is in crisis for some time. while journals have experienced widespread migration to digital, scholarly monographs in print form have been resilient and digital alternatives have faced significant problems of acceptance, particularly in the arts and humanities. focusing in particular on the arts and humanities, this article asks how, and under what conditions, the digitally mediated long-form academic publication might hold a viable future. it examines digital disruption and innovation within humanities publishing, contrasts different models, and outlines some of the key challenges facing scholarly publishing in the humanities.
this article examines how non-traditional entities, such as digital humanities research projects, have performed digital publishing roles and reviews possible implications for scholarly book publishing’s relationship to the wider research process. it concludes by looking at how digital or hybrid long-form publications might become more firmly established within the scholarly publishing landscape.

introduction

in his article “scholarship: beyond the paper” in nature a few years ago, jason priem argued that “we are witnessing the transition to … another scholarly communication system – one that will harness the technology of the web to vastly improve dissemination” ( : ). while such arguments are not new, and impassioned claims about the transformative powers of digital technology in publishing have often proven to be premature or unrealistic, it seems clear that our relationship to scholarly publication is susceptible to change at every level of its existence, from conception to final reception, and beyond, as a result of digital mediation.

whereas academic journals have experienced many changes already, predictions of the imminent demise of print in academic publishing have proven to be misplaced, particularly in the arts & humanities (and to some extent in the social sciences), where the print monograph continues to hold significant cultural and symbolic value. discussions about the future of the academic book face a series of contradictory dynamics: the enduring cultural value of the book for some scholarly sectors, which however currently rests on an economic model that seems untenable; the preference for print for some kinds of reading versus the enormous potential in digital discovery and annotation; and the concerns of many publishers, keen to engage with digital agendas and yet anxious to avoid the pitfalls experienced by the music industry. in any case, there seems to be little doubt that further (and substantial) change is coming.
in her exploration of the impact of digital on the academic market, frania hall calls the monograph “the scholarly publisher’s next challenge” ( : ). the enduring importance of deep, reflective reading currently better suited to reading in print form and fears about the effect of digital migration have deferred major transformations, but sooner or later the scholarly monograph is likely to undergo a much closer engagement with (and transformation through) digital social mediation, data-driven dynamics and network effects.

focusing in particular on the arts and humanities (although many of its arguments are applicable to scholarly book publishing in other fields), this article asks how, and under what conditions, the digitally mediated long-form academic publication might hold a viable future. it examines digital disruption and innovation within humanities publishing, contrasts different models, and outlines some of the key challenges facing scholarly publishing in the humanities.

debating the future of the academic book

academic publishing was already “at the crossroads” in , notes thompson, by which time a steady increase in outputs, fuelled by the pressure to publish (to get onto, or move up, the academic ladder), stood in stark contrast to the actual market for academic books ( : ). thompson points to important regional differences, for example between the u.s. markets, dominated by university presses whose mission was often underwritten by their institutions, and uk-based academic publishing, where the larger university presses like oup and cup had achieved greater market diversification, had greater global reach, and thus were less financially vulnerable to the immediate effects of a downturn in book sales. nevertheless, the reality was that the field as a whole was “thinning out” ( : ), and everyone now operated in a restricted economic space, where digitally mediated innovation seemed tempting, but had so far been largely elusive.
in recent years there have been numerous reports, publications and initiatives examining the current state and future of the academic book. these have been especially visible in, although not limited to, regions of the world where scholarly publishing is highly developed in commercial or infrastructural terms, such as the united kingdom or north america, and in many countries these debates are part of processes of reflection dating back decades. special issues in academic journals on publishing have examined this from different perspectives: as part of wider reviews of the scholarly publishing landscape; through calls to rethink the university press; with a particular focus on digital publishing for the humanities and social sciences; and as calls to ‘disrupt’ the existing scholarly landscape as a whole. a series of initiatives in the united states, many of them funded by the andrew w. mellon foundation, have attempted to address the particular challenges facing university presses there, from policy and infrastructural perspectives, as described by anthony watkinson in his report on ‘the academic book in north america’ for the academic book of the future project ( ). many of these have produced reports and have left traces in scholarly journals, offering various proposals on how to address what is widely seen as a ‘crisis’ in scholarly book publishing and covering a wide range of issues including business models, open access, infrastructure and the relationship of university presses to their local library and faculty (brown et al., ; elliott, ). more recently, the uk’s arts & humanities research council, in collaboration with the british library, invited “collaborative proposals to explore the academic book of the future in the context of open access publishing and the digital revolution”. 
the result of this was the two-year ‘academic book of the future’ project, led by dr samantha rayner at university college london (ucl) and colleagues at ucl and king’s college london, which initiated a community coalition and a series of activities that formally ended in september . of particular note is the academic book week, which has evolved into a self-sustaining event beyond the life of the project.

special issue of nature exploring transformations in scientific publishing https://www.nature.com/news/the-future-of-publishing-a-new-page- .
special issue of the journal of electronic publishing, volume issue , on ‘reimagining the university press’ (fall ) or special issue of learned publishing, volume , on ‘the university press redux’.
special issue of the journal of scholarly publishing, volume issue on ‘digital publishing for the humanities and social sciences’.
special issue of the journal of electronic publishing, volume issue , on ‘disrupting the humanities: towards posthumanities’.
http://www.ahrc.ac.uk/funding/opportunities/archived-opportunities/academicbookofthefuture/
https://academicbookfuture.org/
https://acbookweek.com/

while by no means uniform in their conclusions, the body of evidence emerging from these initiatives points consistently towards a number of factors affecting the future of scholarly book publishing:
1. contradictions around supply and demand for scholarly books (in the u.s. and uk at least – monograph output in the humanities has increased in recent years, while actual sales per title have dropped)
2. continuing anxiety around open access (with national and international dynamics complicating things further)
3. divergent attitudes towards new digital media and ecologies, and their implications for credit and promotion
4. an ongoing sense that the future of the academic book is “at a major crossroad” and “uncertain” (in the words of an ahrc press release about the academic book of the future project) but without widespread consensus on what the problems, or at least the solutions, really are

digital culture and technology (henceforth ‘digital’) are not the only factor here, but they have introduced new opportunities or challenges, and accentuated many of the difficulties which already existed.

digital mediations

in his examination of the state of digital scholarship, and its affordances or limitations, weller explores how digital technology is transforming scholarly communications as a whole, underlining some dynamics of digital culture which profoundly influence the future of the academic book in digital form ( ). the combined effect of the transition from information scarcity to information abundance, debates about copyright and networked interactions, or user-generated, mobile and mutable content - to name just a few factors - has fundamentally altered many areas of human life in the last twenty years or so, and these provide a context with which discussions of academic book publishing have still not fully engaged, in particular in those areas (such as the humanities) where wider engagement with digital practices is still undergoing negotiation. for some, the globally networked, digital and open cultures which have emerged as a result of the world wide web seem to point to a target of sorts for scholarly publishing, whereby geographic, institutional and social divides can be resolved through digital infrastructures which, moreover, enable scholarship to be more fully integrated with wider knowledge structures, thus facilitating wider public engagement: “[d]igital humanities scholarship .. promises to expand the constituency of serious scholarship and engage in a dialogue with the world at large” (burdick et al., : ).
these digital transformations are both facilitated and complicated by processes of disintermediation, globalization and media convergence (phillips, : xiii-xiv) and by competing dynamics between popular and commercial interests in the digital space, or between ‘open’ and proprietorial ‘walled garden’ approaches to digital infrastructure.

ahrc press release: http://www.ahrc.ac.uk/newsevents/news/the-academic-book-of-the-future /

publishing as a whole has seen many instances of digital innovation, from “interactive digital products experimenting with narrative structures”, innovative funding/pricing models, aggregation models or user-generated content, to new entrants in publishing (hall, ). geolocation, virtual reality, linked data, data-driven analysis and artificial intelligence are just some of the many opportunities for content, but how can these work for the scholarly monograph? while scholarly publishing has arguably experimented ‘digitally’ more than other sectors like trade publishing, in part due to anxiety over its future, many argue that scholarly monographs are the least amenable to digital transformation, at least with regards to content (thompson, : - ). some argue for the ongoing primacy of print in scholarly book publishing – which will “draw on digital capabilities” but in a “subordinate”, non-“disrupter” role (esposito, ), while others argue that ‘digital’ holds the key to understanding the future, and that our thinking on this subject should “rip off the physical covers of the ‘book’ and move swiftly into the digital realm” (pinter, : ). one barrier to engagement is the fact that the stakeholders and participants in scholarly publishing are highly heterogeneous, representing often radically different starting points, which influence the variety in responses to digital transformation.

‘print first’ or ‘digital first’?
in l’édition électronique, dacos and mounier broadly divide visions of digital publishing into two: one strand which understands it as a simple substitution from print to screen, with no fundamental change in the overall concept or apparatus of publishing (they maintain that this position was hard to maintain, even in ); and another, which views digital publishing as part of a “new era” of knowledge production, a “revolution” in text comparable to the arrival of the printing press and its effects on humanity. tellingly, the latter view contemplates “the disappearance of the book as we know it” (dacos and mounier, : - , my translation). applying this division to long-form digital publications we have: those which effectively follow print models to produce what are, basically, digital remediations of the printed book and those whose processes, functionality, forms and/or formats are fundamentally different, because they are conceived for digital. the division is not watertight, since each “digital book” may draw on traditional or disruptive models to differing degrees, but, as a general principle, it is a useful point of comparison in the current landscape. the first model – long-form publications simulating the print book, with, at best, modest application of digital affordances - dominates the digital output of long-form academic publications at present. electronic text has existed in publishing since the s, and publishers (and publishing) played a key role in the development of electronic markup standards such as xml, but digital innovations have generally been received with caution, and even where there is dual print-digital workflow, the conceptual models for publication, design parameters, publishing systems, editing flows, supporting infrastructures and wider expectations of the scholarly community are still largely predicated on the print model by default. 
the current general consensus around what constitutes an ebook, moreover, is a far more limited, and print-centric, view than that which circulated in its early history (and which pointed to an altogether more ambitious concept of ‘electronic book’). these less ambitious, to use mrva-montoya’s phrase, ‘tradigital’ books (mrva-montoya, ), in pdf or epub format have been easier to produce because they do not fundamentally undermine existing models, and as a result, they represent a limited engagement with digital modes and affordances. in a similar vein, prescott, in asking if we are “doomed to a world of pdfs?”, expresses concern that “the future publishing landscape is a bleak one” and argues that the scholarly environment it is supposed to serve is “less media rich” now than it was a few decades ago (prescott, ). even the epub format, which is (by default) flowable and in theory allows for rich, interactive publications – more like websites than books– is, argues mcguire, constrained by the application of drm and device/platform-specific restrictions ( : - ) which, in their current implementations, severely limit digitally mediated interactivity across books. we are still far from the modular, highly structured, dynamically interactive, ‘crowd collaborative’, social and networked views of the academic book which digital culture and technology might allow for. to re-appropriate language used by craig mod, the first vision responds to the question “how do we change books to make them digital?”, whereas the second asks “how does digital change books?” (mod, : ). the first model presupposes moderate change to the current landscape; the publisher model adapts to ‘digital’, but otherwise stays broadly the same; the second model consists of a much more radical transformation in models for scholarly dissemination. at present, academic book publishing has largely stayed with the first model for a number of reasons. 
the enduring attachment of many scholars to physical books and preference for reading print is a key factor, although this will probably change as reading technology improves, wider reading habits evolve, and viable and alternative models of the ‘book’ emerge in digital form. while publishers are increasingly starting to look at digital-first systems and workflows to produce both digital and physical books, a paradigmatic shift which challenges the assumption that a ‘print-like’ object will be developed first (or perhaps even at all) means that changes in author perceptions are likely to take longer. for now, at least, authors and editors “have relatively little experience in enriching their texts to take advantage of the opportunities opened up by digital technologies” (jubb, : ), although again this is likely to evolve. similarly, scholarly monographs, particularly in the humanities and social sciences, are likely to remain broadly ‘linear’ in the short term, even if complementary non-linear modes are slowly emerging over time.

in spite of all these caveats, a digital transformation in academic book production seems inevitable. bhaskar argues that the arrival of the “digital network means, over the long term, that there can be no such thing as business as usual” for publishing as a whole (bhaskar, : ) and looking at the study habits and practices of our students today (as opposed to the habits and practices of those teaching them), it seems highly improbable that, in ten or twenty years, the scholarly media ecology will remain unchanged.

how might a digital long-form publication which could truly rival the printed academic book emerge? at present, we are very much at the stage of experimentation. there are many challenges of technical sustainability and preservation, education and training, not to mention effective business models and integration into the wider fabric of scholarly communications.
but perhaps the most serious challenge is to explore how the digital long-form publication might become an effective vehicle for scholarly argument and interpretation to rival the print monograph. i now turn to a research field within the humanities which has a track record in research into new models and frameworks for digital publication.

the digital humanities and scholarly publishing

the ‘digital humanities’ is a transdisciplinary field with a history of experimentation with, and critique of, the interactions between computational tools and methods, digital culture and the humanities (often straying into the social sciences) stretching back over years. digital humanists have been involved in numerous publishing-related initiatives, including: the academic book of the future project (where the host departments in the two co-coordinating institutions both have long-standing history in ‘dh’); many of the mellon-funded north american initiatives mentioned earlier; various digital publishing tools and frameworks, whether general purpose (scalar and manifold), function/technology-specific (tapas) or field-specific (papyri.info and perseids); markup frameworks (xml and tei); and the production of multiple digital editions, resources, databases and other forms which either qualify as, or occupy the same intellectual space as, long-form publications.
disclaimer: i work for one of them
http://scalar.usc.edu/scalar/
http://manifold.umn.edu/
http://tapasproject.org/
http://papyri.info/
http://sites.tufts.edu/perseids/
https://www.w .org/xml/
http://www.tei-c.org/index.xml

in spite of this activity, scholarly book publishing has not featured particularly prominently as a topic (except as a by-product of other scholarly activities, such as editing) in many of the better known digital humanities publications. to take just one example, in the first edition of the landmark ‘blackwell companion to digital humanities’ (schreibman et al., ), books and publishing do feature, but generally in relation to some other topics such as electronic markup (renear, ) or electronic scholarly editing (smith, ). on one level this is hardly surprising; the field’s proximity to these themes is clear from the copious literature which it has produced on markup and scholarly editing as significant areas of both study and practice. later volumes, including the substantially revised second edition of the blackwell companion (schreibman et al., ), come closer to addressing the current state (and future) of publishing, although they still tend to address the issue within wider discussions about subjects such as scholarly communications or digital scholarship.

in spite of this general preference for focussing on wider scholarly frameworks over publishing, and thus on ‘digital resources’ rather than ‘digital publications’, researchers in the digital humanities have often addressed issues relating to publishing, and how they fit into wider discussions about the future of the academic book. what follows is a short review of four common themes within the ‘dh’ view on publishing.

• modelling and publishing.
in their review of ‘digital publishing [as] seen from the digital humanities’, blanke, pierazzo and stokes locate publishing close to another of dh’s historic areas of strength, namely ‘modelling’. for them, publishing “needs to be understood as a range of modelling activities that aim to develop and communicate interpretations” - perhaps symbolically, one of their subheadings is “[n]ot publishing but modelling” ( : ). the implied venue for this kind of modelling activity is the non-narrative-based publication of digitised content, most commonly published in scholarly editions or archive-based publications, but the article raises important wider questions about what we consider to be “faithful reproduction” and proposes that we free ourselves from “skeumorphic representations” of non-digital content in a digital environment, which apply to all kinds of publication (blanke et al., : , ).

• process versus product. in a very different vein, in her chapter ‘scholarly publishing in the digital age’ kathleen fitzpatrick reflects on her experience with media commons, which she also used for the preparation of her monograph ‘planned obsolescence’ (fitzpatrick, ), as an experiment in networked scholarly publishing which aimed to facilitate social editing, community creation, public engagement and peer review. the richer interactions between peers which this editing/publishing model enables place the focus less on the final outcomes of research publishing (“the product”) and more on “the process” (fitzpatrick, : - ), which draws attention to publishing as part of a wider research ecosystem.

• scholarly research infrastructure. digital humanities research has often been involved in “building” scholarly infrastructure – both for critical interpretation and as a community-building exercise – resulting in publishing functions which are embedded within wider scholarly research systems.
this is evident, for example, in crane et al.’s early call to build “the infrastructure for ephilology”. the digital resource/publication argued for in that case: can be disseminated to anyone, anywhere, at any time; is hypertextual, facilitating connection between scholarly narrative and supporting evidence; can be dynamically remixed for different people/uses; is capable of learning by itself through “documents that learn from each other”, using machine generated information from external datasets; is able to “learn from their human readers” by analysing their digital habits; and is customisable to individual users and their settings (crane et al., ). many of these attributes may become desirable for scholarly publications of the future, but does this describe a digital resource, or a publication, or potentially both? as publication, in this scenario, increasingly merges into a larger research infrastructure, it becomes more important to establish clear dividing lines between research and publication, a topic i will return to later.

• re-thinking the academy. finally, it is not uncommon to see the digital humanities invoked to support more radical re-alignments of the scholarly landscape – for cathy davidson, “dh is … about realigning traditional relationships between disciplines, between authors and readers, between scholars and a general public, and, in other ways, re-envisioning the borders and missions of twenty-first century education” (davidson, : ).

that gives some sense of how the digital humanities views publishing; in what ways does it actually perform publishing functions or roles? with a few notable exceptions (fitzpatrick, ), this does not generally involve discussions about publishing mission or sustainability.
digital humanists are frequently involved in “building” resources, and as such these typically have many of the following attributes: they are experimental; they combine text with other media in dynamic interplay; they involve interdisciplinary, multi-author, inter-institutional collaboration; they are networked; they are closely connected to communities of practice (not just digital humanities, but also, say, epigraphers, or early modernists); they encourage curation, open access and sharing; they may be conceived with public engagement in mind. i do not for a moment intend to suggest here that digital innovation is limited to the digital humanities. there are many new media, digital arts and electronic literature experiences in relation to publishing which deserve a fuller treatment, but which i do not analyse in detail here for reasons of space. it is clear from all of this that, in many ways, the digital humanities are already deeply involved in some publishing practices, including those which produce long-form publications, but also that their role is poorly defined precisely because of their range, a point i will expand on later. i will now outline the key challenges i believe we need to address in order to connect the different visions around digitally-mediated long-form publishing in the humanities.

projections of the digitally mediated academic book

what projections exist for digital futures of the book, and what criteria are used to describe them? kapaniaris et al. present a spectrum based on degrees of interaction, ranging from ebooks in pdf form at one end, to book apps at the other ( ).
a report by an emory working group to the mellon foundation on ‘the future of the monograph in the digital era …’ presents a print/digital continuum from traditional print-based books to digital only and identifies four models: (a) print monographs, (b) digital long-form publications “with a strong resemblance to print monographs”, (c) significantly enhanced long-form publications in digital form and (d) long-form publications which are conceived, and can only realistically operate, digitally (elliott, ). enhancements, in this definition, might include images, sound, or references to other content and complex navigational structures. key criteria for dividing categories might be whether or not the work is linear or non-linear, and whether it is ‘stable’ or ‘updateable’. at the more interactive end of the spectrum, it is not always clear how to distinguish between a digitally enhanced ebook and other text-based electronic resources, and even where that distinction is clear, the “complex relationship” which the university press system (and indeed scholarly publishing as a whole) “maintains … to the plethora of electronic research and reference databases that are ever-more essential to supporting scholarship” (lynch, ) is often an obstacle to differentiation between scholarly ‘publications’ and supporting ‘resources’. there is also some overlap here with debate regarding the future of other scholarly forms, such as the journal article, and it may be necessary to take a wider view across the full range of possible scholarly outputs. for example, breure et al. suggest a similar taxonomy based on a spectrum which distinguishes between: text-driven and image-driven interfaces; linear and non-linear dynamics; and limited multimedia support or visual narratives sustained by full immersion/interactivity connected to research datasets (breure et al., ). this may be equally relevant to books and journals, and everything in-between. one key outcome of the andrew w.
mellon foundation’s strategic investment in long-form scholarly publishing, which began in , has been the development of a set of features to describe the “monograph of the future” (understood to be digital and open access) which are ambitious in scope and which very much favour an ‘enhanced’ view of the academic book. in this formulation, the academic book should be: “fully interactive and searchable online” with primary and other sources; portable across reader applications; able to support usage metrics which protect user privacy; updated, managed and preserved digitally; economically sustainable and amenable to device-neutral user annotations, while meeting scholarly standards of rigour, able to function within existing systems of professional recognition and marketable as an object belonging directly to its reader (waters, ). this is an ambitious ‘wish-list’, implemented in part across a number of its funded research projects, and still in need of further testing and debate, but it provides important material for thought on how to develop new publishing models and infrastructure, and whether they are most effectively instantiated at institutional, national, commercial or disciplinary levels. how is the book changing as a ‘system’ for creating and disseminating knowledge? in order to understand that properly, we need to better understand how digitally mediated academic long-form publications work, or might work, and how they affect knowledge production ‘systems’. writing from a book design perspective, craig mod argues that we need to contemplate the book, not as a fixed object, but as a combination of systems: a pre-artefact system (conception, authoring and editing); the system of the artefact itself (‘the published book’); and a post-artefact system (“the space in which we engage with the artefact”).
digital culture disrupts all of these systems: the pre-artefact system is no longer limited to interactions between author and editor and may include other forms of co-creation and ‘community’ editing; the book itself can be manifested in multiple forms, each with a different set of affordances; and the post-artefact system may include “digital marginalia”, namely comments, notes and interactions between an (in our case scholarly) community around a piece of writing (mod, : - ) and, in this sense, ‘digital’ functions as “scaffolding between the pre- and post-artefact systems” (mod, ). despite the challenges, and while there is significant variation across disciplines and geographies, scholarly communications have been, and continue to be, transformed by digital culture and technology. thanks to social media effects, public/private and formal/informal boundaries are no longer as clear as they used to be. research objects increasingly circulate in digital form or through digital channels and “[i]n the web era, scholarship leaves footprints” (priem, : ). our expectations about how we gather information (speed, access, broader interpretations of what constitute ‘valid’ sources) and then process/disseminate it (the sharing economy, collective intelligence and online publication modes) have been dramatically changed by digital culture. the pervasive influence of social media on dissemination in today’s society, where the smartphone often constitutes the primary mode of access to information (and for companies, a crucial means to accessing information on user/reader behaviour) is another element altering the knowledge landscape, creating new structures and signifiers of symbolic value. these factors have so far still not had a major impact on scholarly outputs, but it is very unlikely these outputs will remain unaffected in future. 
research ecologies in some disciplines, for example in the arts and humanities, still depend very much on ‘print’ era models, but this is increasingly being contested (kelly, ), even if the path of progression is by no means clear yet. given all of this, we might expect more mutual overlap in debates about the future of ‘research’ and ‘publishing’ respectively: many of the discussions around research ecosystems and infrastructure seem to treat publishing as an afterthought, or merely as a ‘digital button’ to press to produce output, while much of the debate around the future of publishing takes little account of evolving scholarly communication cycles and research ecosystems. we need to better understand the ‘digital book’ (or its alternatives) as an intellectual system, but also how it fits into wider knowledge and research systems, including those which operate beyond the academy.

long-form publications, networked scholarship and new knowledge objects

digital publications have often raised interesting questions, but they do not, as yet, constitute coherent and readily identifiable modes of scholarly expression and as such, their location in existing scholarly communication circuits remains under-articulated. one early attempt to articulate a ‘digital’ future for scholarly content was darnton’s pyramid, which envisaged knowledge being represented in different layers, including (top to bottom): ( ) a concise view of a topic; ( ) supporting argument arranged in chunked and non-sequential form; ( ) documentation and its accompanying analysis; ( ) theoretical discussion; ( ) pedagogical materials; and ( ) interactions between authors and readers (darnton, ). early visions of this type were sometimes criticised as being utopian or techno-deterministic in character.
nevertheless, increasing evidence of a ‘networked research cycle’ (weller, : ) in some areas of academia suggests changes in the research process that will start to effect greater changes in how publications are conceived and produced. this implies, as i have noted, a change in focus from ‘product’ to ‘process’, and this greater connection between research and publication ecosystems points towards two effects. on the one hand, it theoretically makes it possible to produce publications faster, and with a greater connection between analysis and evidence (data; models; visualisations); on the other, in some cases, it makes it harder to see the distinction between ongoing research and stable research outputs. brown et al. believe that publishing will look “very different” in the future, and now that the online mediation of journals is well established, they “believe the next stage will be the creation of new formats … ultimately allowing scholars to work in deeply integrated electronic research and publishing environments that will enable real-time dissemination, collaboration, dynamically-updated content, and usage of new media.” (brown et al., : ). but these new formats are unlikely to evolve merely on the grounds of technological possibility and affordance; if they do develop in any significant way, they will likely grow from scholarly need, grounded in changes in the way that we produce knowledge. one thing which stands out from many of the reports produced about the future of the book is that, while there is abundant literature on practical aspects (such as open access or business models), and a good understanding of how academic structures (validation/promotion systems or research evaluation programmes) drive expectations about format, there are relatively few studies regarding how digital publication actually facilitates or encourages new forms of knowledge production.
in his ‘theses on the epistemology of the digital’, alan liu explores how ‘the digital’ affects our understanding of what knowledge consists of, and how it potentially transforms its systems of production and dissemination. it introduces new knowledge objects (such as ‘algorithm’, ‘multimedia’ and ‘data’) and challenges the preference for “acts of rhetoric and narrative” in some (often humanities-based) disciplines (liu, ). it also increasingly encourages us to question whether a monograph, or even a book in the more general sense, is always the best way to communicate a given argument. by this logic, if we stop looking at digital books as, necessarily, simple digital mediations of a print original and take full advantage of the communicative capacity of the digital medium, we are better placed to find critical arguments which can only be made digitally and which make better use of the digital space as a site of creativity, co-creation and generative knowledge. how well are we currently placed to commit to such challenges? where i work, in the humanities, there are different opinions regarding the level of engagement of researchers with the theoretical or practical aspects of digital culture and technology. whereas some argue that today’s humanities researchers are “well versed in modern digital practices” (deegan, : ), others argue that, by their inability to engage with digital innovation nearly as fluidly as they typically engage with print monographs, “the arts and humanities are not embracing the culture of transformation that these fields pretend to embody” (o’sullivan, : ). smiljana antonijević’s wide-reaching ethnographic study of scholars across institutions in the us and europe seemed to indicate that there remain both anxieties and practical barriers to full engagement of humanities with the affordances of ‘the digital’, although generational differences exist (antonijević, : - ).
beyond the digital humanities, we can observe little evidence of humanities researcher involvement, or interest, in the design of the research and publication tools which they adopt, with the very real danger that “humanities scholars will develop the same consumer relationship to digital content that they have had to print” (prescott, : - ). this is part of a wider problem, in the humanities, linked to the fact that digital resources carry less prestige, which sets up a certain circular dynamic where digital resources are used to support research, but are then under-cited because of the preference for print (hitchcock, ). finally, it also takes us back to challenges which derive from the growing density of the media landscape and difficulties in delimiting new forms of publication within a broader, digitally mediated research ecosystem. as we have seen, digital publishing blurs boundaries, and (at least potentially) replaces a finite set of publication types with a seemingly fluid spectrum populated with multiple ‘publication points’. distinctions between ongoing research and stable outputs, or between ‘digital resource’ and ‘digital publication’ are not always clear in this scenario, and some digital practitioners have been reluctant to sacrifice the flexibility in definition which the digital medium provides, but in many ways they would be better served by making clearer formal distinctions. the acts of maintaining dynamic digital resources and providing snapshots for evaluation/accreditation are not mutually exclusive, as those of us who have submitted digital outputs to the uk’s research excellence framework can attest. 
there is a wider set of questions around digital resources, and their ‘equivalence’ to the academic book, which is beyond the scope of this article, but issues such as preservation, stability of record and how to integrate knowledge objects such as evidentiary datasets or dynamic visualizations within digital long-form publications (either embedded or as external ‘appendices’) will be a key part of that discussion.

rearticulating publishing forms

definitions and categorisations of academic books are often illustrative of the competing claims and pressures on them. there are no universal definitions for the academic book, but deegan’s description of the book as a “long-form publication, a monograph, the result of in-depth academic research … making an original contribution” is a good starting point, and traditional distinctions with the shorter journal article (which is often more limited in scope) still stand, although as she points out, they are “becoming increasingly blurred” ( ) and the emerging mini-monograph format (palgrave pivot and stanford briefs) adds to the erosion of the boundaries between forms. her inclusion of an approximate word length for the monograph ( - , words) is, of course, a print legacy, and we might question whether parameters of length (or indeed structure, format and use of non-textual media) will always be so significant, but for now, no other models constitute scalable alternatives in the scholarly mainstream. in part, this is a reflection of cultural status: monographs “are deeply woven into the way that academics think of themselves as scholars” (deegan : ), but this assumption, and the print model which accompanies it, is increasingly disputed – pinter, for example, argues that, in future the book will be defined more by its function than any other feature and that we will move beyond the “sunken investments in existing scaffolding” to engage with evolving new media ecologies (pinter, : ).
many terms exist to describe digitally mediated forms of the long-form publication, including ‘enhanced ebook’, ‘enhanced monograph’, ‘networked book’ or ‘book apps’. digital terms are also notoriously fluid: originally the term ‘ebook’ covered more ambitious visions of the book in electronic form, but it has been largely appropriated, as a result of commercial usage, to represent remediated print content in epub or pdf formats with relatively limited functionality. there is also an important point to make about the formulation of terms. print-based terms at least loosely describe, or stand in as signifiers for, their scholarly purpose – the monograph, a single authored piece of research; the edited collection, bringing together different writing about a given theme; or the scholarly edition, providing a critical interpretation of a given work – whereas terms used for new digital long-form publication types merely imply something about the format or functionality – it is ‘enhanced’ or ‘networked’ (we are rarely told to what purpose) – or in the case of ‘book app’, they offer information about its delivery platform. what is more, at its core the language used for these ‘new’ forms is resolutely tied to print – the terms used simultaneously seek to appropriate the cultural baggage of the print book and to liberate themselves from it – which helps to explain the conceptual challenges in making them viable alternatives to the printed book in the short term. digital forces us to think about distinctions in form, content, platform or device which are either not relevant or not negotiable for the printed book and it is unlikely that we will see stable terms emerge in the short term to describe these new instances of the ‘book’ (or its partial replacement). nevertheless, until stable terms for new scholarly publishing concepts arise, it may remain harder for them to gain traction beyond the margins, and so this requires attention.
as we have already seen, a vast array of terminology for digital outputs exists, and this has been fuelled in part by the nature of digital affordances themselves (which may influence new ‘fashions’ in digital research), but also in large part by the pressure to present new forms as being ‘innovative’. i would also contend that the terms used so far for long-form digital publications and/or other research outcomes have generally had more to do with cultural and political context than any substantive element related to functionality or cultural representation. the cultural baggage of common words such as ‘archive’, ‘edition’ or ‘database’ varies according to sector and locale. some have argued for the symbolic force of the ‘database’ (manovich, ) while the concept of ‘archive’ has considerable currency in many areas of the humanities, although their relation to publication seems unclear. (see also (drucker, ) for earlier terms such as “expanded book”, the “hyper-book” or “the book emulator”. ken price is unusual in giving serious attention to “the genres we are now working in” as he explores various terms in relation to his experience on the whitman project (price, ).) in their projection of possible new cultural forms which might be generated by the digital humanities, burdick et al. suggest new terms such as ‘augmented editions’, ‘animated archive’ or ‘database documentaries’ ( : , , ); these have the virtue that they provide meaning to otherwise overused and ambiguous terms, but the question is whether or not these, or the many other terms currently in circulation, will have the coherence and consensus to be adopted more broadly. to some extent, stable terms will emerge organically over time and it would be counter-productive to overly force the issue, but greater discussion among the various constituencies of scholarly publishing would surely be beneficial for all.
a crucial aspect of this conversation will be to find greater alignment between the terminology used at different stages of the scholarly communications cycle, in particular around validation and promotion processes. so, whereas ‘enhanced monograph’ seems to be used by various academics and people involved in discussions about the future of publishing, it does not appear, for example, anywhere in the extensive list of admissible output collection formats used in the last uk research excellence framework exercise (ref ), where we see, under the list of admissible ‘digital artefacts’, the terms ‘software’, ‘website content’, ‘digital or visual media’ and ‘research datasets and databases’. moreover, a clear boundary still does not really exist between, on the one hand, innovative/experimental forms and, on the other, stable forms worthy of inclusion as outputs equivalent to the journal article or monograph. while the experimental, ‘laboratory’ function of much work typically carried out in the digital humanities will continue to be important in pushing the boundaries of scholarly communications (and a fundamental part of the research agenda of that field), we also need to establish clearer genres, descriptors and/or labels around digital publications across the spectrum (from ‘short form’ to ‘long’ form) so that they can be evaluated fairly. in ‘imagining a university press system to support scholarship in the digital age’ lynch argues for greater standardization and for ‘templates’ ( ), which would fix particular genres, facilitating scholarly validation, circulation and credit systems.
thomas iii actually goes on to tentatively propose terminology we might use for this purpose: interactive scholarly works (isws), which by his definition are more “tightly defined” digital outputs combining archives, tools and argument; digital projects or thematic research collections (trcs; after caroline palmer’s proposed use of the term (palmer, )), which cover more “capacious” outputs drawing together heterogeneous tools, models and datasets in open-ended, multi-author research collaborations; and digital narratives, which are born-digital works of highly structured and interpretative scholarly narrative (thomas iii, : - ). while we might argue about the precise division or nomenclature, the need for clearer categorisation of digital works - for formal publishing and evaluation purposes - and a more consistent terminology, seems clear. this is, moreover, a conversation which needs to include a wide range of actors, and to be multi-disciplinary and global in outlook. it is also to be hoped that discussions around terms which affect both academic standing and career advancement will become less national and more global over time. while these differences in terminology exist, digital alternatives to the book will continue to be undermined by difficulties in formal academic validation.

http://www.ref.ac.uk/about/guidance/submittingresearchoutputs/

making ‘print’ and ‘digital’ work together

part of the answer may lie in gaining a better understanding of how print and digital work together. how does scholarship function differently in the digital environment – what is lost, what is gained, and how does this influence choices about digital and print channels? we are only just starting to understand the answer to these questions, but we need to identify which aspects of scholarly communication are better served by digital or print, and how they might fit together better in future.
the recent recovery of print versus ebook sales in trade publishing suggests a broader 'cooling' of public attitudes towards ‘digital’ reading after a period of high expectations (and sometimes hyperbole) for digital formats, and in scholarly publishing, numerous sources seem to confirm that print publications hold enduring significance for academic researchers (wolff-eisenberg et al., ), especially in areas like the humanities and social sciences where narrative-based argument is at the core (deegan, ). academic books are a key feature of the publishing landscape, particularly in the humanities and social sciences, for a number of reasons, which include their cultural symbolism, ability to communicate a coherent and sustained narrative, phenomenological resonances/power, readability, and finally, underlying academic credit and promotion mechanisms (deegan, ). by contrast, ‘digital’ mediations of the book have faced significant problems of acceptance for a number of reasons, and so are generally limited to ebook remediations of print monographs, special cases (such as digital scholarly editions) or new media experiments. that said, and while early enthusiasm (and at times proselytism) regarding the potential of digital technology to transform academic book publishing has waned as the practical limitations have become more apparent, the major challenges of sustainability in current models of supply and demand (jubb, : ), along with wider questions about how ‘the academy’ should re-adjust to new modes of knowledge production, mean that it nonetheless seems inevitable that ‘digital’ will play a significant part in re-thinking its future. dunleavy, speaking from a social sciences perspective, has argued for a ‘new renaissance’ of books based on emerging realities such as the digital reading list, which favours chunkable content which can easily be downloaded, annotated or added (by students) and which can be added to at the last minute, on demand (by lecturers).
highlighting the growing awareness that it may not be practical to continue marketing books as single entities, he argues that the book may be better thought of as part of a large, high-quality library which can be navigated, rather in the way that we navigate journal collections (dunleavy, ). in this scenario, print and digital need to work together as part of a seamless experience, allowing users to experience content as they prefer, on paper or on screen. it is to be expected, then, that ‘digital’ and ‘print’ may be seen as less oppositional in future. the recent reader survey by the oxford university studies in the enlightenment confirmed what we already know from various sources: that readers “seek portability and immediate accessibility of scholarly resources” and yet do not generally favour ‘digital only’ access. rather, they prefer hybrid print-digital access, according to the kind of activity they are carrying out. we are still far from having stable and sustainable business models for hybrid long-form publications, but from a scholarly perspective the requirement is clearly there.

https://www.theguardian.com/books/ /mar/ /ebook-sales-continue-to-fall-nielsen-survey-uk-book-sales

conclusions

in earlier times, digital publishing was sometimes presented as making publishing simpler in some way: whether through the immediacy and potential global reach of posting content to the web or through the promise of ‘single-source publishing’ which often accompanied the early proposition of xml for editing/publishing. far from simplifying publishing, digital culture and technology have made it far more complex in many respects, with new content types, more technical formats, competing workflows and hugely divergent business models.
there are clearly many advantages to moving content into digital-first workflows, and this may become more common in future even in scholarly book publishing, but the adoption barriers are significant, and the increasing use of mobile phones and tablets has only complicated things further (mcilroy, ). this is likely to make more adventurous long-form digital publications harder to sustain in business terms, in the short term, and yet from a scholarly perspective, this shift towards a richer range of outputs has already started, and it is something which needs to be understood properly and integrated into the current publishing landscape. as the recent study of arts and humanities outputs submitted to the uk’s research evaluation exercise showed, monographs carry great weight, but there is also greater variation in research outputs, with the suggestion that scholars (in the arts and humanities) are more likely to see digital media as “central to their research output and scholarly experience” (tanner, : ), even beyond more obviously receptive fields such as art and design, the performing arts, communication studies, new media studies or library and information management. we are also at a stage of intense contradiction in terms of geographic scope, where on the one hand, the effects of a global network facilitate stronger connections between scholars around the world, while on the other hand digital media effects exacerbate historic geo-economic and social divides. while some aspects of academic publishing display global characteristics, debates about the future of the academic book are still largely operating along national lines, as the example of debates in the u.s. and the uk demonstrates, tied to local funding landscapes and systems of credit and evaluation.
a book published digitally is, in theory, open to wider and more democratic dissemination systems, but in practice its fate is often firmly tied to national systems for academic validation, localised (and often inconsistent) licensing dynamics and unevenly stacked international knowledge flows. as inefuku has argued, “[t]rue democratization and globalization of knowledge cannot exist without a critical examination of the systems that contribute to the production of scholarship”, and initiatives to develop global publishing platforms need to involve global south perspectives from the start (inefuku, ). redefining scholarly publishing so that it is genuinely inclusive, collaborative and based on true reciprocity will be an important part of the academic book of the future. various pieces of research, including the recent academic book of the future project, have demonstrated the enduring appeal and importance of the long-form narrative-based scholarly monograph, while highlighting the ongoing challenges facing the academic book. in many fields, the academic book has been replaced by databases or side-lined as the currency of the journal article, dominant in the sciences, has grown, and some might argue that the digital mediation of the academic book has reached its limits. i have argued here that, while change may be slow, such a position is untenable in view of changing media expectations and habits. it is crucial, however, to gain greater common understanding of the motivation and dynamics which bind together (and sometimes separate) different actors in the scholarly book communication circuit, and of the way that relationships are changing.
there are a number of different stakeholders involved in scholarly publishing – including academics (as authors and consumers), librarians, publishers, digital media companies, digital practitioners and wider publics – and discussion regarding the future of scholarly publishing “has too often failed to transcend the self-interest of individual groups of stakeholders” (anthony cond of liverpool university press, quoted in samantha rayner's preface to deegan, : ). there does, nevertheless, appear to be a sense now that roles are changing, with, for example, publishers “shifting their position in the value chain, and redefining themselves as they go, into training and assessment, information systems, networked bibliographic data, and learning services” (goldsworthy, ). along with this, there is a growing awareness in some quarters that partnerships are going to be crucial in bridging the gaps which exist between different stakeholders. this includes the digital humanities. the digital humanities already plays a semi-informal role as “exploratory laboratory” for publishing along the lines proposed by svensson for its role in relation to the humanities more generally (svensson, ), but if this role were more consistently negotiated with (and recognised by) other stakeholders (such as other humanities academics, publishers and libraries) it would benefit all involved. initiatives such as the recent call for novel publications “blending cutting-edge technology with high quality scholarship” by the king’s digital lab and stanford university press will help to redefine complex narrative argument within a digital or hybrid setting. 
it is perhaps understandable that a field which is constantly in transition (in part due to changes in digital culture and technology, and in part due to its fluid/unstable status within the academy) should strive to make a wide set of claims influencing everything from policy to innovation, but i would like to argue here that both digital humanities and publishing sectors would mutually benefit from greater analysis and clarity about the field’s actual (and potential) contributions to debates about the future of publishing in the humanities. william g. thomas iii points out that the field has produced “innovative and sophisticated hybrid works of scholarship, blending archives, tools, commentaries, data collections and visualizations”, but that many of these outputs have faced serious problems in terms of recognition, credit and absorption into the wider scholarly fabric (thomas iii, : ). these gaps in understanding about the nature and status of new digital outputs constitute as much a problem for the humanities as a whole (and indeed scholarly publishing) as they do for the digital humanities. but what if these outputs were viewed (and recognised) more fully as part of the process of exploration in the ongoing transformation of scholarly publishing in the humanities? i have proposed here a vision of the academic book in the humanities which is globally inclusive, shaped by actual scholarly needs (rather than by the histories of print or web technologies), re-articulated for current media landscapes, more closely aligned to emerging research ecosystems and with greater integration of the needs of the different stakeholders. it is possible to imagine digital long-form arts and humanities publications developing in a number of different ways in future. firstly, and although i have not had space to contemplate it properly here, the concept of ‘publishing the archive’ will increasingly be important, especially around chunked book content.
this seems likely to manifest itself in how established publishers find new ways to make digital assets which are currently ‘book-bound’ available as part of self-managed or aggregated online platforms. nor have i addressed content managed by galleries, libraries and museums, which naturally connects to many areas in the humanities thematically. secondly, new ‘digital’ forms will develop and stabilise which will contain their own network-native systems of knowledge formation, academic certification and filtering. these will take a lot longer to emerge, because they depend on a level of critical digital literacy, and consensus around media effects, in the humanities which it will take time to develop. the third route will involve moving beyond digital simulation of print monographs, or concepts of ‘enhanced’ monographs, to hybrid publications which aim to take full advantage of the affordances of each medium (see https://www.kdl.kcl.ac.uk/blog/call-expressions-interest-your-novel-idea-publication/). this mixed ecology provides many challenges – not least how we apportion different roles and functionality to the ‘print’ and ‘digital’ manifestations of a particular ‘book’ – but also many opportunities in fully integrating complex scholarly argument into a potentially more connective, participatory and visually expressive medium.

references

antonijević, s. ( ) amongst digital humanists: an ethnographic study of digital knowledge production. basingstoke, hampshire; new york: palgrave macmillan. bhaskar, m. ( ) the content machine: towards a theory of publishing from the printing press to the digital network. london; new york: anthem press. blanke, t. et al. ( ) digital publishing seen from the digital humanities. logos. ( ), – . breure, l. et al. ( ) rich internet publications: ‘show what you tell’. journal of digital information. ( ). brown, l.
et al. ( ) reimagining the digital monograph: design thinking to build new tools for researchers, a jstor labs report. available from: https://hcommons.org/deposits/item/hc: / (accessed august ). brown, l. et al. ( ) university publishing in a digital age. available from: http://sr.ithaka.org/research-publications/university-publishing-digital-age (accessed august ). burdick, a. et al. ( ) digital_humanities. cambridge, mass.: mit press. crane, g. et al. ( ) ‘ephilology: when the books talk to their readers’, in susan schreibman & ray siemens (eds.) companion to digital literary studies. blackwell companions to literature and culture. hardcover oxford: blackwell publishing professional. dacos, m. & mounier, p. ( ) l’édition électronique. paris: la découverte. darnton, r. ( ) the new age of the book. available from: http://www.nybooks.com/articles/ / / /the-new-age-of-the-book/ (accessed august ). davidson, c. ( ) ‘why yack needs hack (and vice versa): from digital humanities to digital literacy’, in patrik svensson & david theo goldberg (eds.) between humanities and the digital. cambridge, massachusetts: mit press. pp. – . deegan, m. ( ) the academic book of the future project report: a report to the ahrc and the british library, london. available from: https://academicbookfuture.org/end-of-project- reports- / (accessed july ). drucker, j. ( ) ‘the virtual codex from page space to e-space’, in susan schreibman & ray siemens (eds.) companion to digital literary studies. blackwell companions to literature and culture. hardcover oxford: blackwell publishing professional. pp. – . dunleavy, p. ( ) ebooks herald the second coming of books in university social science. lse review of books. available from: http://blogs.lse.ac.uk/lsereviewofbooks/ / / /ebooks- herald-the-second-coming-of-books-in-university-social-science/ (accessed november ). elliott, m. ( ) the future of the monograph in the digital era: a report to the andrew w. mellon foundation. 
journal of electronic publishing ( ). esposito, j. ( ) the multifarious book. the scholarly kitchen. available from: https://scholarlykitchen.sspnet.org/ / / /the-multifarious-book/ (accessed august ). fitzpatrick, k. ( ) planned obsolescence publishing, technology, and the future of the academy. new york: new york university press. fitzpatrick, k. ( ) ‘scholarly publishing in the digital age’, in patrik svensson & david theo goldberg (eds.) between humanities and the digital. cambridge, massachusetts: mit press. pp. – . goldsworthy, s. ( ) the future of scholarly publishing. available from: http://blog.oup.com/ / /future-scholarly-publishing/ (accessed august ). hall, f. ( ) digital convergence and collaborative cultures. logos ( ), – . hall, f. ( ) the business of digital publishing an introduction to the digital book and journal industries. london; new york: routledge. hitchcock, t. ( ) confronting the digital: or how academic history writing lost the plot. cultural and social history ( ), – . inefuku, h. w. ( ) globalization, open access, and the democratization of knowledge. educause review. available from: http://er.educause.edu/articles/ / /globalization-open- access-and-the-democratization-of-knowledge (accessed july ). jubb, m. ( ) academic books and their futures: a report to the ahrc & the british library. available from: https://academicbookfuture.org/end-of-project-reports- / (accessed august ). kapaniaris, a. et al. ( ) digital books taxonomy: from text e-books to digitally enriched e- books in folklore education using the ipad. mediterranean journal of social sciences. ( ), . kelly, j. ( ) an ecology for digital scholarship. available from: http://ihrdighist.blogs.sas.ac.uk/ / / / / (accessed august ). liu, a. ( ) theses on the epistemology of the digital: advice for the cambridge centre for digital knowledge. available from: http://liu.english.ucsb.edu/theses-on-the-epistemology-of- the-digital-page/ (accessed august ). lynch, c. 
( ) imagining a university press system to support scholarship in the digital age. journal of electronic publishing. ( ). available from: http://hdl.handle.net/ /spo. . . . manovich, l. ( ) the language of new media. cambridge, mass.: mit press. mcguire, h. ( ) ‘why the book and the internet will merge’, in hugh mcguire & brian o’leary (eds.) book: a futurist’s manifesto: a collection of essays from the bleeding edge of publishing. edition boston, ma: o’reilly media. pp. – . mcilroy, t. ( ) mobile strategies for digital publishing: a practical guide to the evolving landscape. blue ash, ohio: the future of publishing and digital book world. mod, c. ( ) ‘designing books in the digital age’, in hugh mcguire & brian o’leary (eds.) book: a futurist’s manifesto: a collection of essays from the bleeding edge of publishing. edition boston, ma: o’reilly media. pp. – . mrva-montoya, a. ( ) beyond the monograph: publishing research for multimedia and multiplatform delivery. journal of scholarly publishing. ( ), – . o’sullivan, j. ( ) scholarly equivalents of the monograph? an examination of some digital edge cases. available from: https://cora.ucc.ie/handle/ / (accessed september ). palmer, c. l. ( ) ‘thematic research collections’, in susan schreibman et al. (eds.) companion to digital humanities. blackwell companions to literature and culture. oxford: blackwell publishing professional. pp. – . phillips, a. ( ) turning the page: the evolution of the book. london; new york: routledge. pinter, f. ( ) ‘the academic “book” of the future and its function’, in rebecca e. lyons & samantha rayner (eds.) the academic book of the future. st ed. edition new york, ny: palgrave macmillan. pp. – . prescott, a. ( ) are we doomed to a world of pdfs? digital riffs. available from: https://medium.com/digital-riffs/are-we-doomed-to-a-word-of-pdfs- f edaf (accessed august ). prescott, a. ( ) consumers, creators or commentators? problems of audience and mission in the digital humanities. 
arts and humanities in higher education. ( – ), – . price, k. m. ( ) edition, project, database, archive, thematic research collection: what’s in a name? digital humanities quarterly ( ). available from: http://www.digitalhumanities.org/dhq/vol/ / / / .html (accessed september ). priem, j. ( ) scholarship: beyond the paper. nature. ( ), – . renear, a. ( ) ‘text encoding’, in susan schreibman et al. (eds.) companion to digital humanities. blackwell companions to literature and culture. hardcover oxford: blackwell publishing professional. pp. – . available from: http://www.digitalhumanities.org/companion/. schreibman, s. et al. (eds.) ( ) a new companion to digital humanities. nd revised edition edition. chichester, west sussex, uk: wiley-blackwell. schreibman, s. et al. (eds.) ( ) companion to digital humanities. blackwell companions to literature and culture. hardcover. oxford: blackwell publishing professional. available from: http://www.digitalhumanities.org/companion/. smith, m. n. ( ) ‘electronic scholarly editing’, in susan schreibman et al. (eds.) companion to digital humanities. blackwell companions to literature and culture. hardcover oxford: blackwell publishing professional. pp. – . available from: http://www.digitalhumanities.org/companion/. svensson, p. ( ) the landscape of digital humanities. digital humanities quarterly ( ). available from: http://digitalhumanities.org/dhq/vol/ / / / .html (accessed august ). tanner, s. ( ) an analysis of the arts and humanities submitted research outputs to the ref with a focus on academic books. available from: https://kclpure.kcl.ac.uk/portal/en/publications/an-analysis-of-the-arts-and-humanities- submitted-research-outputs-to-the-ref -with-a-focus-on-academic-books( cfc - e - d - b b- e )/export.html (accessed september ). thomas iii, w. g. ( ) ‘the promise of the digital humanities and the contested nature of digital scholarship’, in susan schreibman et al. (eds.) a new companion to digital humanities. 
nd revised edition. chichester, west sussex, uk: wiley-blackwell. pp. – . thompson, j. b. ( ) books in the digital age: the transformation of academic and higher education publishing in britain and the united states. cambridge, u.k.; malden, mass.: polity press. thompson, j. b. ( ) merchants of culture: the publishing business in the twenty-first century. new york, new york: plume. waters, d. j. ( ) monograph publishing in the digital age. shared experiences. available from: https://mellon.org/resources/shared-experiences-blog/monograph-publishing-digital-age/ (accessed august ). watkinson, a. ( ) the academic book in north america: report on attitudes and initiatives among publishers, libraries, and scholars. available from: https://academicbookfuture.org/academic-book-north-america-watkinson/ (accessed august ). weller, m. ( ) the digital scholar: how technology is transforming scholarly practice. london: bloomsbury. wolff-eisenberg, c. et al. ( ) ithaka s+r us faculty survey . available from: http://www.sr.ithaka.org/publications/ithaka-sr-us-faculty-survey- / (accessed august ).

author biography

paul spence is a senior lecturer in digital humanities at king's college london. his research currently focuses on digitally mediated knowledge creation, digital publishing, global perspectives on digital scholarship and the potential interplay between modern languages and digital culture. he was joint creator of the multi-platform publishing framework xmod (since renamed as kiln http://kcl-ddh.github.io/kiln/), and now leads the 'digital mediations' strand on the language acts and world-making project (https://languageacts.org/).
updating the agenda for academic libraries and scholarly communications. c. lynch. coll. res. libr. doi: . /crl. . .
notes from the field: student perspectives on digital pedagogy

research

colette colligan and kandice sharren, simon fraser university, ca
corresponding author: kandice sharren (ksharren@sfu.ca)

how to cite: colligan, colette and kandice sharren. . “notes from the field: student perspectives on digital pedagogy.” digital studies/le champ numérique ( ): , pp. – . doi: https://doi.org/ . /dscn.

published: december

peer review: this is a peer-reviewed article in digital studies/le champ numérique, a journal published by the open library of humanities.

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

open access: digital studies/le champ numérique is a peer-reviewed open access journal.

digital preservation: the open library of humanities and all its journals are digitally preserved in the clockss scholarly archive service.

this special collection on digital pedagogy features essays by student researchers within the digital pedagogy network (dpn). dpn is an informal interdisciplinary training network formed to foster the transfer of digital humanities (dh) knowledge and skills and to build connections between simon fraser university (sfu) and university of victoria (uvic) faculty, students, librarians, educational partners, and the public. central to the network has been the participation and experience of students, who have shared their digitally-focused work in a series of showcases and symposia that have alternated between sfu and uvic. what emerged during these events were student perspectives on current pedagogical practices in digital humanities, both inside and outside the classroom, as well as for the degree and beyond. our special collection builds on these perspectives, featuring student authors addressing issues that over the past five years have been central to their dh learning and training. these student perspectives gather into four topic clusters, namely 1) collaboration with galleries, libraries, archives, & museums (glam); 2) digital doctorates; 3) major research projects; and 4) transforming dh pedagogy.

keywords: digital humanities (dh); digital pedagogy; glam institutions; digital projects; student labour; digital activism

résumé (translated from the french): this special collection on digital pedagogy consists of essays written by student researchers within the digital pedagogy network (dpn). the dpn is an informal interdisciplinary training network created to foster the transfer of knowledge and skills related to the digital humanities and to develop links between the faculty, students, librarians, and educational partners of simon fraser university (sfu) and the university of victoria (uvic) and the public. the participation and experiences of students, who have shared their digitally focused work in a series of presentations and symposia alternating between sfu and uvic, have played a central role in this network. what emerged during these events were student perspectives on current pedagogical practices in the digital humanities, both inside and outside the classroom, for degree programs and beyond. our special collection builds on these perspectives, including student authors who address the issues essential to their studies and training in the digital humanities. these student perspectives fall into four thematic clusters, namely 1) collaboration with glam institutions; 2) digital doctorates; 3) major research projects; and 4) the transformation of digital humanities pedagogy.

this special collection on digital pedagogy features essays by student researchers within the digital pedagogy network (dpn). dpn is an informal interdisciplinary training network formed to foster the transfer of digital humanities (dh) knowledge and skills and to build connections between simon fraser university (sfu) and university of victoria (uvic) faculty, students, librarians, educational partners, and the public.
central to the network has been the participation and experience of students, who have shared their digitally-focused work in a series of showcases and symposia that have alternated between sfu and uvic. what emerged during these events were student perspectives on current pedagogical practices in digital humanities, both inside and outside the classroom, as well as for the degree and beyond. our special collection builds on these perspectives, featuring student authors addressing issues that over the past five years have been central to their dh learning and training and to driving change in the field. these student perspectives gather into four topic clusters, namely 1) collaboration with glam institutions; 2) digital doctorates; 3) major research projects; and 4) transforming dh pedagogy. across these four clusters, familiar discussions within digital humanities emerge, including those on collaboration (deegan and mccarty ), new skills and training (ramsay and rockwell ), innovative forms of dissemination (jagoda ), and the distribution of labour (anderson et al. ; boyles et al. ; logsdon et al. ; siemens ). this special collection’s experiential approach to digital pedagogy, in which students guide their own learning through hands-on opportunities, foregrounds the complementary nature of in-class and extracurricular learning, as well as the variety of roles students inhabit in dh at various levels of study. the authors of the following essays are not just students enrolled in a degree program, but are collaborators, research assistants, mentors, project managers, and leaders of their own research projects.
their on-the-ground perspectives reveal the excitement that comes from having a sense of agency in their education and developing practical skills and professional connections, as well as a critical sense of the problems that can accompany the emergence of new structures and relationships. each of our four topic clusters gathers these student perspectives and includes a response from another member of the network, among them faculty, librarians, and other students. as with our face-to-face events, our aim with this special issue is to foster dialogue among various actors working with digital humanities approaches and methods. by centering student perspectives, informed by concrete practical experience as well as critical approaches, this collection works to advance the discussions of digital pedagogy that have taken place in recent years. these have included practical guides (battershill and ross ), technical how-tos (the programming historian ), assignment and keyword repositories (davis et al. ), as well as metalevel critical discussions on the state of digital pedagogy (gold and klein ; anderson et al. ; stommel et al. ). we start from the position that pedagogical situations offer a place to create a scholarly community that welcomes a broad range of participants. at the same time, they offer a place to explore and address larger structural concerns in dh about interdisciplinary and intersectoral collaborations, contingent labour, and institutional limitations. digital humanities is a field that is grappling with how to create sustainable projects using limited resources, how to navigate relationships with different stakeholders, how to create a more accessible and inclusive scholarly community, and how to bring in political and cultural critique. our collection reveals how students are engaging through practice and critique with the state of the art in digital pedagogy, and, in turn, advancing the field of dh. if pedagogy is a principal concern of dh, as matthew k. gold and lauren f. klein note in the introduction to debates in digital humanities , then students at all levels need to be central in any such discussions.

our first cluster, “collaborating with glam institutions,” features three essays by students reflecting on their experiences collaborating with glams (short for galleries, libraries, archives, and museums). the collaborative digital curation and exhibits at the heart of these essays exemplify how digital pedagogy can connect students, faculty, and glam professionals working at the intersection of dh, archival fieldwork, and public humanities. christina hilburger, donna langille, and melissa nelson open this cluster by describing their work on a digital exhibit for the redpath museum to earn credit for their mcgill information studies course in digital curation. alessandra bordini, a masters in publishing graduate from simon fraser university, follows by discussing her involvement, first as a student research assistant and then as a project manager, in digitizing and describing a collection of incunabula by the printer-publisher aldus manutius held at sfu special collections. finally, josie ann greenhill describes her experiences as an undergraduate student at uvic undertaking an extracurricular digital exhibit of pre-raphaelite books in collaboration with uvic special collections and the electronic textual cultures lab. each essay foregrounds a different student experience (undergraduate and graduate), disciplinary perspective (information studies, publishing, and art history), and academic purpose (an assignment, a research project, and a practicum). lisa goddard and rebecca dowson, academic librarians from uvic and sfu respectively, summarize these perspectives through the lens of an emerging form of librarianship in which librarians take a more active role in teaching and research.
together these essays highlight the importance of collaborating with glam institutions to create public-facing digital scholarly resources. they confirm the pedagogy of building as a way of knowing, described, for example, by stephen ramsay and geoffrey rockwell. one epistemological gain from making these resources is an enhanced sense of the full digital life cycle of cultural artifacts as they move from creation to dissemination. greenhill describes how digital curation brings new attention to unique cultural materials, as well as to the collection bias and cultural mediation that their curation involves. bordini gains new appreciation for the analytical power of descriptive metadata in making the social processes of book production from the past discoverable to new publics. hilburger, langille, and nelson turn their attention to digital preservation, not the usual concern of student projects. their insights into the processes of digital cultural production and transmission importantly result in an enhanced sense of student agency. the authors of these essays emphasize how their collaboration with glams enabled them to become decision makers and problem solvers in digital curation, while directing their own digital skill development and participating in scholarly production in ways that hold more meaning than the typical student assignment. these reflections reveal the variety of models of student-glam collaborations currently in practice as well as evolving student roles in public-facing digital scholarship. such descriptions of learning-in-action, however, also expose the need for critical inquiry into the configurations and effects of these collaborations. as roles for students, faculty, and glam personnel are rapidly being reconfigured, and arguably democratized, they also introduce the potential for ill-defined and unsustainable roles.
if student agency is prioritized in these collaborations, more thought must be paid to how student agency continues and evolves over the lifespan of digital projects. hilburger, langille, and nelson as well as greenhill cast reflective glances back on finite projects that had clear endings, but bordini’s role has evolved from that of student researcher to project manager, highlighting the need to plan for changing student roles over the lifespan of a collaborative digital project, as well as labour practices more generally. the discourse of the “mutual benefit” derived from dh collaborations between student researchers and cultural institutions potentially masks the use of free student labour and other unfair labour practices that may detract from the achievements of these collaborations. collaborating with glam institutions is not the only way for students to gain practical experience with digital methods. our second cluster, “digital doctorates,” addresses curriculum design from the perspectives of ma and phd students whose capstone projects and dissertations integrate digital research methods. by asking how digital humanities projects might be accounted for in graduate programs, these essays explore the rewards, as well as the risks, of integrating digital research into degree requirements. randa el khatib opens the cluster with an argument in favour of a digital dissertation, which draws on the portfolio format common to science phds, which consists of six peer-reviewed articles or book chapters that draw on her research into the geospatial elements of milton’s epic poem paradise lost. reese alexandra irwin uses her experience developing a diplomatic digital edition of the first print edition of jane austen’s unfinished novella sanditon to consider the institutional and administrative complications of integrating digital research into graduate programs.
in her essay, she contends that the library is essential to supporting graduate student digital projects, but that to be effective it must be treated as a pedagogical partner by the student’s home department. while both el khatib and irwin discuss digital projects that are central to their graduate work, caroline winter discusses a satellite project; her digital edition of mary shelley’s gothic tales complements a monograph-style doctoral dissertation. for winter, the satellite project is an opportunity for graduate students to develop digital skills, explore different modes of research, and experience being part of a strong community of practice; however, participating in a satellite project can also increase the time to completion, putting students at risk of running out of funding. the different strategies that these authors outline for incorporating dh projects into their graduate research rely on personal initiative, as well as supervisory and institutional support. in her response, michelle levy weighs the risks of the various approaches to digital projects outlined in these essays and concludes that the institutions that house these students must offer greater support by adapting to the changing and increasingly digital landscape of humanities disciplines. however, beyond practical questions about how best to support independent digital research in the context of a graduate program, this cluster also asks how supervisors, institutions, and hiring committees assess this research, which often takes non-traditional forms. cumulatively, then, this cluster focuses on student research projects to demonstrate their potential to expand learning and outreach, and also to present tactics for working with weaknesses in institutional support and assessment. but not all student engagement with digital research is part of a curriculum, as our third cluster shows.
with the increasing frequency of large-scale digital humanities projects, research assistant work has taken on a new form for graduate students in the humanities, often involving large teams. our third cluster, which addresses student labour on “major research projects,” consists of two essays addressing how these kinds of projects offer the opportunity for students to take on new roles. anna mukamal, a past project manager for the modernist archives publishing project (mapp) and phd student at stanford university, begins this cluster by exploring the benefits for students for working on projects that collaborate across institutions. mapp, a critical digital archive focused on early twentieth-century publishing history, involves a number of different processes and initiatives, and, as project manager, mukamal was tasked with facilitating them. mukamal’s experience speaks to the professional and intellectual opportunities that come from working with a network of scholars based in institutions across north america and the uk, especially the rewards of intergenerational mentorship. kate moffatt and kandice sharren follow mukamal, focusing on the role of unseen labour in major dh projects, in terms of both the amount of effort that goes into metadata collection and the affective labour that goes into managing a team, through reference to their work as editors of the women’s print history project (wphp), a bibliographical database that seeks to account for women’s involvement in print between and . moffatt and sharren address the limitations of the records they use to recover the mostly forgotten women who owned printing and bookselling firms, exploring how collective forms of knowledge production, whether disseminated through eighteenth-century print or twenty-first century databases, privilege some actors and types of labour over others. 
in her response, mapp co-principal investigator claire battershill reflects on the need for the directors of major digital projects to take into consideration the ways in which their project structures interact with existing social and institutional hierarchies. the key theme in this cluster is how student labour is used in large-scale digital projects, particularly for students whose work may be adjacent to their research interests but does not necessarily advance their degrees. as christina boyles et al. have argued, dh initiatives involving large teams can rely on the labour of early career researchers in temporary positions who are called on to perform administrative duties and support the work of faculty in ways that may detract from their ability to pursue their own research agendas. graduate student research assistants experience this precarity in heightened ways; often, the positions they occupy are informal, less clearly defined, and dependent upon intermittent funding. although work on major digital projects can fund graduate degrees and provide the opportunities to develop skills and networks that are otherwise not part of their programs, it also runs the risk of distracting graduate students from their degree requirements. in the case of this cluster, all of the authors are involved in projects with feminist aims, both in terms of the data they make available and the structures of the projects themselves. as feminist projects, they outline a strategy for robust documentation practices that capture the full spectrum of labour that goes into projects such as mapp and the wphp and make the significance of that labour visible to those outside of the project. our final cluster closes with two essays by students and faculty collaborators seeking to transform the field of digital humanities by rethinking pedagogical practices and spaces.
nadine boulay discusses the design and development of a teaching resource game that shows the experience of transgender, non-binary, and gender nonconforming youth. underlying the development of this game environment is a carefully articulated theory of intersectionality “wherein categories such as race, gender identity, sexuality, and class cannot be understood as separate axes, but as mutually-constituting and interconnected.” ashley morford, arun jacob, and kush patel similarly focus on the importance of pedagogical spaces for developing an anti-colonial dh pedagogy and transmitting this work via citational practices and networks. they emphasize the significance of uvic’s digital humanities summer institute, as well as other digital spaces, as sites for developing and teaching a theory of inclusive and activist digital pedagogy that is explicitly intersectional and socially inclusive. each essay reveals the importance of pedagogy in bringing social justice to the digital humanities. by integrating anti-colonial and intersectional practices into digital pedagogy, they help drive the field of dh toward social innovation. taken together, these essays demonstrate that transforming dh into a politically engaged, socially just, and inclusive field involves ongoing critical attention to pedagogical spaces and practices. kimberly o’donnell responds to these papers as a graduate student and digital fellow at simon fraser university, offering her own perspective on the importance of bringing cultural critique and intersectional approaches to digital pedagogy. covid-19 struck just as we finished writing this introduction.
amidst widespread disruptions and shifts to online and remote teaching, what has emerged is the extent to which face-to-face interactions and engagement with material objects remain, in many cases, an essential component of digital pedagogy, whether students are digitizing and curating exhibits of holdings in a glam institution, working with librarians to develop a digital project, building a research team for a major project, or developing and facilitating workshops. while one of the frequently cited goals of dh is to make materials and knowledge accessible to a wider range of people via digital technologies, the digital turn does not mean that all dh work can be completed digitally. indeed, the fundamentally collaborative and material nature of many of the projects and initiatives described in the following essays reveals that much of dh scholarship is rooted in immediate social and geographical relationships -- relationships that have been rapidly overturned during the pandemic. after all, this collection emerged from the in-person symposia and showcases we held in victoria and vancouver to celebrate student achievement and foster connections. digital pedagogy often focuses on questions of tools and methods, but our present situation reminds us that we must attend to the social and cultural processes that condition their use. the student-centered experiential and critical insights advanced in the chapters that follow do just that, by bringing attention to issues of collaboration, degree requirements, research training, and social justice activism in dh – all the more pressing during our present public health and social crises. our hope is that this special issue will help students, faculty, librarians, and academic administrators critically navigate this new pedagogical landscape amidst powerful pressures to go rapidly digital and adapt online.
acknowledgements we would like to acknowledge lisa goddard, michelle levy, rebecca dowson, alyssa arbuckle, claire battershill, matt huculak, deanna reder, stephen ross, jentery sayers, and raymond siemens, who helped create the digital pedagogy network, with the support of a social sciences and humanities research council connections grant. michelle levy has additionally played an important role in conceptualizing this special issue for digital studies, along with its three co-editors, colette colligan, kimberly o’donnell, and kandice sharren. finally, we wish to thank the participants at our dpn events for their thoughtful contributions as well as the anonymous peer reviewers for their feedback on the essays gathered in this issue. competing interests the authors have no competing interests to declare. author note authors are listed alphabetically. references anderson, katrina, lindsey bannister, janey dodd, deanna fong, michelle levy, and lindsey seatter. . “student labour and training in digital humanities.” digital humanities quarterly ( ): – . accessed june , . https://scholar. google.ca/scholar?hl=en&as_sdt= % c &q=anderson% c+katrina% c+linds ey+bannister% c+janey+dodd% c+deanna+fong% c+michelle+levy% c+a nd+lindsey+seatter.+ .+“student+labour+and+training+in+digital+hum anities.”+digital+humanities+quarterly+ +% % .+&btng= battershill, claire, and shawna ross. . using digital humanities in the classroom a practical introduction for teachers, lecturers, and students. london: bloomsbury. 
boyles, christina, anne cong-huyen, carrie johnston, jim mcgrath, and amanda phillips. . “precarious labor and the digital humanities.” american quarterly ( ): – . doi: https://doi.org/ . /aq. .
davis, rebecca frost, matthew k. gold, katherine d. harris, and jentery sayers, eds. . digital pedagogy in the humanities. . modern language association. accessed june . https://digitalpedagogy.hcommons.org/.
deegan, marilyn, and willard mccarty, eds. . collaborative research in the digital humanities. abingdon; new york: routledge.
gold, matthew k., and lauren f. klein. .
“digital humanities: the expanded field.” in debates in digital humanities , edited by matthew k. gold and lauren f. klein, ix–xvi. minnesota: university of minnesota press.
jagoda, patrick. . “gaming the humanities: digital humanities, new media, and practice-based research.” differences: a journal of feminist cultural studies ( ): – . doi: https://doi.org/ . / -
logsdon, alexis, amy mars, and heather tompkins. . “claiming expertise from betwixt and between: digital humanities librarians, emotional labor, and genre theory.” college & undergraduate libraries ( – ): – . doi: https://doi.org/ . / . .
ramsay, stephen, and geoffrey rockwell. . “developing things: notes toward an epistemology of building in the digital humanities.” debates in the digital humanities, edited by matthew k. gold, – . minnesota: university of minnesota press. doi: https://doi.org/ . /minnesota/ . .
siemens, lynne. . “‘it’s a team if you use “reply all”’: an exploration of research teams in digital humanities environments.” literary and linguistic computing ( ): – . doi: https://doi.org/ . /llc/fqp
stommel, jesse, chris friend, and sean michael morris. . critical digital pedagogy: a collection. hybrid pedagogy inc. accessed september . https://cdpcollection.pressbooks.com/
the programming historian. . “the programming historian.” accessed june . https://programminghistorian.org/en/.

how to cite this article: colligan, colette and kandice sharren. . “notes from the field: student perspectives on digital pedagogy.” digital studies/le champ numérique ( ): , pp. – . doi: https://doi.org/ . /dscn.
submitted: may accepted: june published: december

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

digital studies/le champ numérique is a peer-reviewed open access journal published by open library of humanities.

geographical patterns of formality variation in written standard california english

costanza asnaghi, dirk speelman and dirk geeraerts
quantitative lexicology and variational linguistics research unit, university of leuven, belgium

abstract

formality variation in the written use of lexical words in the relational sphere in california english is analyzed on a geographical level for the first time in this article. linguistic data for word alternations including a formal and an informal term for a specific concept are gathered from newspaper web sites written in english through site-restricted web searches across california (asnaghi, an analysis of regional lexical variation in california english using site-restricted web searches. joint ph.d. dissertation, università cattolica del sacro cuore and university of leuven, milan, italy and leuven, belgium, ) and analyzed with a series of spatial statistical analyses (grieve et al. a statistical method for the identification and aggregation of regional linguistic variation.
language variation and change, : – , ). urban versus rural and north versus south tendencies are detected in the language choices of california journalists. these tendencies are rooted in the history of the golden state as well as in its socio-economical structure (starr and procter. americans and the california dream, – . history: reviews of new books, ( ): – , ; hayes, historical atlas of california: with original maps. berkeley/los angeles/london: university of california press, ).

introduction

the geographical distribution on the california territory of a group of linguistic variables formed by variants denoting different degrees of formality in addressing or describing people is surveyed here. the linguistic data under scrutiny were collected from online newspapers from across the state of california. this is the first evaluation to provide a quantitative analysis of regional formality variation in the lexical domain of written standard california english. although california is the most populous state in the usa, previous large-scale dialectology studies have never paid much attention to it. studies reporting the phonetic situation of california cities or areas have been conducted (e.g. decamp, ; hinton et al., ; moonwomon, ; hagiwara, ; eckert, ; waksler, ; bucholtz et al., ; hall-lew, ; podesva, ; kennedy and grama, ); however, no studies have attempted to give a big picture of language variation in the state. david reed and allan metcalf did attempt to produce a linguistic atlas of the pacific coast in the s (reed and metcalf, ), aiming at a description of the language in california and nevada.
the interviews conducted for the atlas demonstrated indeed that california english is an independent dialect, as reported by elizabeth s. bright ( ); nonetheless, an atlas with the results of these inquiries was never published.

correspondence: asnaghi costanza, quantitative lexicology and variational linguistics research unit, university of leuven, belgium. e-mail: costanza.asnaghi@kuleuven.be. digital scholarship in the humanities © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com. doi: . /llc/fqu. digital scholarship in the humanities advance access published december , .

this article intends to investigate formality variation in the newspaper register on a large scale in a region that has been linguistically explored only to a limited extent before. comprehensive dialect studies of american english have surveyed california as a small portion of the contiguous usa (carver, ; labov et al., ; grieve, ). in particular, grieve’s ( ) analysis was conducted in a quantitative way and included california cities. a further study (grieve, ) referred to the same set of data focusing on the results for contraction, which is a feature of informal writing. comparisons between grieve’s study and the present research will be discussed below. formality variation, i.e. variation in the form of linguistic choices of words in accordance with the conventions of the social context of use, has been extensively acknowledged by various foundational sociolinguistics studies. labov ( b) describes the ‘principle of formality’ as a formal linguistic context obtained whenever a speaker monitors his or her language production. in the context of newspaper writing, language is supposed to be somewhat ‘formal’ by labov’s definition. in fact, written language, especially when targeted for publication, is usually monitored by the writer.
douglas biber ( ) describes linguistic variation as a continuum, and sees formality as a dimension of that continuum. one of the aims of this article is to show that a single dimension is indeed sufficient to pinpoint variation between more formal and less formal linguistic styles. on the quantitative geographical variational level, szmrecsanyi ( ) compares quantitative studies based on linguistic atlas data to quantitative studies based on linguistic frequency data (i.e. corpora), concluding that frequency-based approaches provide more realistic linguistic evidence. only a few previous studies have attempted to analyze formality variation quantitatively in a geographical perspective. examples are grieve ( ) and the lexicon-based sociolectometrical approach introduced in geeraerts et al. ( ) and further developed in speelman et al. ( ) and ruette et al. ( ). grieve ( ) demonstrates that measures of contraction rate, e.g. would not and wouldn’t, are regionally patterned in written standard american english. the investigation that this article presents differs from grieve’s study mainly in that it examines the behavior of lexicon rather than contraction rates. lexicon is, according to peter trudgill ( ), the level of the english language where stylistic differences are most evident. the remainder of the article is structured as follows: section presents the employed method of data gathering, section presents two spatial statistical analyses, and section provides an overview of the results. finally, section reviews the results: an interpretation for the detected regional patterns is provided, and a direction for future developments of this study is also suggested.
linguistic data

online newspapers are the selected register for this study, corresponding more generally to the written standard english register, where the term ‘register’ is used according to charles ferguson’s ( ) interpretation, or ‘the communicative situation’, and the term ‘standard english’ follows trudgill’s ( : ) definition according to which ‘all newspapers that are written in english are written in standard english’. online newspapers publish a great amount of freely available text in a computer-readable form and are annotated for place of publication. while online newspaper texts do not cover all possible registers of written standard english, they can nonetheless be representative indicators of style for a particular region. in fact, online newspaper texts contain a wide variety of articles and sections, encompassing local, national, and international news, sport reports, arts and culture columns, travel tips, business insights, and in some cases even fiction; they also often include advertisements and readers’ opinions. in biber’s ( ) research on register relations, limited to the dimension of ‘involved’ versus ‘informational’ text production that is comparable to the focus of this article, text categories in the newspaper sphere (press reviews, press reportage, and editorials), although very different from a limited category of written texts (i.e. personal letters), share the same formality level with a wide number of texts (i.e. biographies, academic prose, science fiction, religion, humor, and popular lore); newspaper language is also relatively close to further categories of written texts (i.e. official documents and fiction).
moreover, the online newspaper archives that were reached for this research contain text in such a quantity that a distinction among genres is not essential for the determination of patterns of regional lexical variation at a national level of resolution. the most efficient technique to date to gather linguistic frequencies from online texts is site-restricted web searches (grieve et al., ). starting from a list of suitable newspaper web sites based in the geographical area to be investigated and from a list of lexical alternation variables formed by variants denoting different degrees of formality, the google search engine was queried for the number of hits for each variant of the selected variables in the entire archive of each newspaper. the list of newspaper web sites included online newspapers based in different california locations (see asnaghi, for the full list of newspapers). daily and weekly california online newspapers written in english with substantial reports on local facts were considered suitable for this study. university, entertainment, and parish papers were excluded. hits from online newspapers published in the same location were summed. the list of the lexical alternation variables was formed by nouns denoting people in the family sphere: dad/father, mom/mother, grandpa/grandfather, grandma/grandmother, folks/parents, and kid/child. for each variable of this list, the first variant is less formal, and the second variant is more formal. other variants for these concepts exist in english. for example, when asked for nicknames for maternal grandmothers, over , harvard dialect survey respondents provided a varied range of answers, the most frequent being grandma ( . %), while the other variants (nana, . %; grandmother, . %; granny, . %; grammy/grammie/grammi, . %; mimi, . %; other undisclosed nouns, . %) were infrequent, with a frequency rate lower than % in all cases.
those low-frequency variants were not analyzed in this study. in fact, the emphasis of this study is on high-frequency variants. although some sociolinguists insist that all variables, including low-frequency ones, should be included in a dialect study (labov, a; kretzschmar, ), in this case, the inclusion or exclusion of low-frequency variants would return a similar proportion in the count of linguistic variation for a specific concept. for example, in sonoma, california, represented here by the online newspaper sonomawest.com, grandma occurs times ( . %), grandmother occurs times ( . %), nana occurs times ( . %), grammie occurs only once ( %), grammi and granny do not occur at all in the newspaper archive ( %) (grammy and mimi were omitted from this test because they are too ambiguous for an effective search through site-restricted web searches: a search for grammy returned hits meaning ‘grammy awards’, or the annual award given by the american national academy of recording arts and sciences for achievement in the record industry; a search for mimi returned different proper names). excluding low-frequency variants does not dramatically change the proportion for the high-frequency ones: in this case, if only the hits for grandma and grandmother are calculated, grandma accounts for . % of the results, while grandmother accounts for . % of the results, affecting the percentage only slightly ( . % change for grandma, . % change for grandmother). moreover, the focus here is on the relational sphere for no specific reason other than accuracy in data collection and interpretation. in fact, dad/father, mom/mother, grandpa/grandfather, grandma/grandmother, folks/parents, and kid/child are relatively unambiguous terms, which is a requirement for good performance in site-restricted web searches.
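the comparison between including and excluding low-frequency variants can be sketched in python as below. this is only an illustration of the arithmetic: the function name variant_shares is an assumption, and the hit counts are hypothetical placeholders, not the figures reported for sonomawest.com.

```python
def variant_shares(counts, keep=None):
    """Return each variant's share of the total hits for one concept.

    counts: dict mapping variant -> hit count.
    keep:   optional iterable of variants to retain; all other variants
            are dropped before the shares are computed.
    """
    if keep is not None:
        keep_set = set(keep)
        counts = {v: c for v, c in counts.items() if v in keep_set}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no hits for any retained variant")
    return {v: c / total for v, c in counts.items()}


# Hypothetical hit counts for one location (NOT the figures from the study).
hypothetical = {"grandma": 60, "grandmother": 37, "nana": 2, "grammie": 1}

all_variants = variant_shares(hypothetical)
high_freq_only = variant_shares(hypothetical, keep=["grandma", "grandmother"])

# Shift in the share of the informal variant caused by the exclusion.
shift = high_freq_only["grandma"] - all_variants["grandma"]
```

with these placeholder counts the share of grandma moves from 0.60 to roughly 0.62, which is the kind of small shift the passage describes.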
other near-synonyms were considered, among which were nab/arrest, shiv/knife, and cool/excellent. nonetheless, these word pairs would not perform well in site-restricted web searches, the words arrest, shiv, and cool being highly polysemous. furthermore, in word pairs such as display/manifest, quick/rapid, stipend/emolument, wisecrack/joke, determine/ascertain, and many others that were considered for research, a formality gradient between the first and the second term does exist, but it is not guaranteed that one of the terms is actually used with informal or formal intentions (see brooke et al., ). therefore, formality variation in the lexicon can be cautiously retrieved only from a vocabulary that corresponds to an aware choice of the writer toward a relatively formal or a relatively informal linguistic variant to identify a specific concept. this is best represented by relational terms, where the alternative between the formal way to address a member of a family, i.e. father, and the informal way to address the same person, i.e. dad, is well demarcated. the reason why we opted for site-restricted web searches over more conventional corpus linguistic techniques to mine newspaper articles is that site-restricted web searches allow for the examination of a vaster quantity of text. site-restricted web searches are queries conducted through a web search engine that look for a specific term or expression in a specific web site. google search was used in this case, although other web search engines would also apply. the specific web search through google was conducted by entering the tag site: immediately followed by the web site domain (e.g. sonomanews.com) and by the target term expressed inside quotation marks (e.g. ‘mother’). quotation marks prevent automatic stemming, i.e.
a search for pages containing not only the selected term but also closely related variants of the term such as its plural form, etc., and force google search to return results that exactly match the searched term. the www. prefix was removed from the web site addresses so that the search engine would search the entire domain, including pages with a different prefix such as sports.sonomanews.com. the same search was conducted automatically through a python script for each term of the six word alternation variables. regional linguistic variation in this study is observed in natural language discourse that is produced by a sample of language users (i.e. journalists) taken from the entire population, rather than just by a few long-term residents, as is the case in traditional american dialect surveys. online newspaper language, restricted to letters to the editor, has been previously analyzed in dialectology (grieve, ). online newspapers do not usually disclose explicit information on the provenance of the informants. nonetheless, while it is possible that journalists are not residents of the city where a newspaper is published, this may occur only for a limited number of cases; therefore a substantial share of the texts will be written by local or near-local authors, whereas the rest will be from all over the place, which we consider as noise in the data. in addition, nonlocal journalists writing for a local newspaper produce text that can be either considered as noise for our study, or, more usefully, text produced in and for a local audience, therefore valuable data. for example, the san francisco chronicle archive contains a letter to the editor from a newspaper reader in seattle, wa. is that letter part of the san francisco speech community or the seattle speech community? probably neither, or both, but the communicative situation for that specific letter is in san francisco.
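the query format described above (the site: tag, the domain without its www. prefix, and the target term in quotation marks) can be sketched in python as follows. this is a minimal sketch under stated assumptions, not the authors’ script: the function names are hypothetical, the code only builds the query strings, and it does not contact any search engine or retrieve hit counts.

```python
# The six word alternation variables named in the text:
# (less formal variant, more formal variant).
VARIABLES = [
    ("dad", "father"),
    ("mom", "mother"),
    ("grandpa", "grandfather"),
    ("grandma", "grandmother"),
    ("folks", "parents"),
    ("kid", "child"),
]


def normalize_domain(address):
    """Reduce a web address to a bare domain: no scheme, path, or www. prefix."""
    domain = address.strip().lower()
    for scheme in ("https://", "http://"):
        if domain.startswith(scheme):
            domain = domain[len(scheme):]
    domain = domain.split("/", 1)[0]
    if domain.startswith("www."):
        domain = domain[len("www."):]
    return domain


def build_query(address, term):
    """Build a site-restricted query; the quotation marks request an exact match."""
    return f'site:{normalize_domain(address)} "{term}"'


def queries_for_site(address):
    """Return one query per variant (twelve in total) for one newspaper site."""
    return [build_query(address, term) for pair in VARIABLES for term in pair]
```

for instance, build_query("www.sonomanews.com", "mother") yields the string site:sonomanews.com "mother". dropping the www. prefix is what lets the query also cover subdomains such as sports.sonomanews.com.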
Therefore, the cases in which journalists are not based in the same location as the newspaper that they write for do not represent a threat to the validity of the site-restricted web searches method. It should be noted once again that this study aims at measuring regional patterns in the newspaper register of California English rather than at making assumptions on the general speech community of California. Moreover, the quantity of linguistic data collected through this method as well as the statistical techniques used to interpret the data (see Section ) smooth out the scattered cases of nonresident journalists.

The site-restricted web searches method may seem deceptive: web searches will count potentially nonequivalent uses of the searched variants. Additionally, search engines return the number of pages in which a target term can be found, rather than the number of actual occurrences of the term in the web site. Furthermore, Google Search returns only an estimated number of results, and it is hardly possible to define a search boundary according to a specific genre or register other than by selecting specific web sites (newspaper web sites in this case). Despite the potential noise in the collected data, the site-restricted web searches method was validated through an evaluation across the USA, in which the distribution of lexical word alternation variables obtained with this method was compared with the distribution attested in previous American English research. Site-restricted web searches returned linguistic distribution results that were comparable to results obtained through traditional linguistic data collections (see details in Grieve et al., ). The validity of the method was therefore established despite the potential noise. It should be noted that other linguistic collections focusing on lexical variation such as Hans Kurath’s ( ), E.
Bagby Atwood’s ( ), and Cassidy and Hall’s (Cassidy and Hall, , ; Hall and Cassidy, ; Hall, , ; Hall and von Schneidemesser, ) involved noise in dialect data too. Nonetheless, the noise in the data did not undermine the surveys.

Site-restricted web searches obtain a considerable quantity of linguistic data through automated series of computational instructions as opposed to costly traditional data collection methods. For example, for the variable grandpa/grandfather, in the newspaper Redding Record Searchlight based in Redding, CA, we found , hits for the variant grandpa and , hits for the variant grandfather. Given the very large amount of data, in terms of both the number of newspapers examined and the number of occurrences of each variant, it would be unrealistic to pursue manual analysis as in traditional corpus linguistics. Any deeper analysis of the textual distribution of the terms, the identification of collocational patterns, the examination of sub-genre differences, or a thorough cleaning of false hits from the analysis would require the application of very refined distributional methods.

Statistical analysis

The frequencies for the six lexical alternation variables were counted through site-restricted web searches across the locations. The results were then calculated as proportions, providing continuous data for the analysis presented in the sections below. The computation for the value of every single variable followed Equation 1, in which $t$ is the value of the variable to be obtained, $n_1$ is the value of the first variant of the variable, and $n_2$ is the value of the second variant of the variable.

\[ t = \frac{n_1}{n_1 + n_2} \tag{1} \]

The collected data sorted as proportions for each linguistic variable at each location were analyzed through Moran’s I, a statistical technique for the measurement of global spatial autocorrelation (Moran, ; Odland, ; Grieve, ).
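The collection and normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script used in the study: the function names are hypothetical, the hit counts are invented, and no request is sent to a search engine. It only builds the query strings (the site: tag, the domain without its www. prefix, and the term in double quotation marks) and computes the proportion of the first variant from a pair of hit counts.

```python
import math

# The six lexical alternation variables examined in the study.
VARIABLES = [
    ("mom", "mother"),
    ("dad", "father"),
    ("grandpa", "grandfather"),
    ("folks", "parents"),
    ("kid", "child"),
    ("grandma", "grandmother"),
]


def site_restricted_query(domain: str, term: str) -> str:
    """Return a query string of the form: site:<domain> "<term>"."""
    host = domain.strip().lower()
    # Drop the www. prefix so that the whole domain is searched.
    if host.startswith("www."):
        host = host[len("www."):]
    # Quotation marks request exact matches (no stemming).
    return f'site:{host} "{term}"'


def variant_proportion(n1: int, n2: int) -> float:
    """Proportion of the first variant, t = n1 / (n1 + n2)."""
    total = n1 + n2
    if total == 0:
        # Assumption: a location with no hits at all is treated as missing.
        return math.nan
    return n1 / total


if __name__ == "__main__":
    for first, second in VARIABLES:
        print(site_restricted_query("www.sonomanews.com", first))
        print(site_restricted_query("www.sonomanews.com", second))
    # Hypothetical hit counts, for illustration only.
    print(variant_proportion(300, 700))
```

Returning a missing value for a location without hits is a design choice of this sketch; the article does not say how such cases were handled.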
Moran’s I studies phenomena having a random probability distribution in more than one dimension in space. Its foundation is in the cross-product statistic ($\Gamma$, Equation 2), but it differs from the cross-product statistic (Hubert et al., ) in that it takes into consideration multiple dimensions. The equation for the cross-product statistic is as follows:

\[ \Gamma = \sum_i \sum_j w_{ij}\, c_{ij}, \tag{2} \]

where $i$ and $j$ are any pair of locations, $w_{ij}$ is the weight between observations $i$ and $j$, also called spatial weight matrix or neighboring function (Paradis, ), and $c_{ij}$ is the measure of the distance between the values at $i$ and $j$. In the cross-product statistic, $c_{ij}$ is calculated according to a certain measure of distance (such as Euclidean distance, Manhattan distance, spherical distance, etc.; see Sawada, ); in Moran’s I, $c_{ij}$ is calculated as displayed in Equation 3, namely as the product of the distance of the value $x_i$ at location $i$ and of the value $x_j$ at location $j$ from the global mean $\bar{x}$ of the z-values.

\[ c_{ij} = (x_i - \bar{x})(x_j - \bar{x}) \tag{3} \]

Also, as for the Pearson statistic, Moran’s I includes a scaling factor (expressed here in Equation 4) that is not present in the cross-product statistic:

\[ m = \frac{n}{W \sum_i (x_i - \bar{x})^2}, \tag{4} \]

where $n$ is the number of locations and $W$ is the sum of all the weights $w_{ij}$. The complete formula of Moran’s I is provided in Equation 5 as follows:

\[ I = \frac{n \sum_i \sum_j w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{W \sum_i (x_i - \bar{x})^2}. \tag{5} \]

Moran’s I results typically range between $-1$ and $+1$ for each variable, where scores toward $-1$ denote dispersion, scores toward $+1$ denote clustering, and scores near 0 indicate random distribution. The p-values correspond to a one-tailed . alpha level. In this study, Moran’s I measured the level of significance of each lexical alternation variable. In particular, a one-tailed t test assessed positive global spatial autocorrelation, establishing whether each variable evinces regional clustering.
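As a worked illustration of the complete formula of Moran’s I given above, the following sketch computes the statistic in plain Python for a toy example. The four locations, their values, and the binary neighbor weights are assumptions made for illustration only; they are not data from the study, and the significance test is not reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix.

    weights[i][j] is the spatial weight between locations i and j;
    the diagonal (i == j) is ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    total_weight = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    squared = sum(d * d for d in dev)
    return n * cross / (total_weight * squared)


if __name__ == "__main__":
    # Toy example: four locations on a line, each one a neighbor
    # of the locations immediately next to it.
    line = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(morans_i([1, 1, 0, 0], line))  # similar values adjacent: 1/3
    print(morans_i([1, 0, 1, 0], line))  # alternating values: -1.0
```

The two calls show the interpretation stated in the text: similar values at neighboring locations give a positive score, while a strictly alternating arrangement gives the minimum score.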
In Table , the scores of Moran’s I significance test of global spatial autocorrelation, the z-scores, and the p-values are displayed, ranging from highly significant to less significant. In general, the significance at the global level for all selected variables was considerable.

After the analysis of global spatial autocorrelation, a test of local spatial autocorrelation was conducted. The main difference between global and local spatial autocorrelation statistics is that a global measure of spatial autocorrelation returns a number for each variable of the data set, while a local measure of spatial autocorrelation returns a number associated with each observation unit, as a quantitative expression of Waldo Tobler’s ( : ) first law of geography: ‘everything is related to everything else, but near things are more related than distant things’. In order to calculate local spatial autocorrelation, a spatial weighting function has to be defined. A spatial weighting function is a protocol that specifies the weight assigned to the comparison of every pair of locations. The analyses reported here are based on a reciprocal weighting function, which is a common weighting function that assigns a weight to a comparison based on the reciprocal of the distance between the two locations, so that weight decreases with distance (Odland, ).

The test of local spatial autocorrelation Getis-Ord Gi followed Equation 6 (Ord and Getis, ; Grieve, ):

\[ G_i(d) = \frac{\sum_j w_{ij}(d)\, x_j - W_i\, \bar{x}(i)}{s(i) \left\{ \left[ (n-1) S_{1i} - W_i^2 \right] / (n-2) \right\}^{1/2}}, \quad j \neq i, \tag{6} \]

where $W_i = \sum_{j \neq i} w_{ij}(d)$, $S_{1i} = \sum_j w_{ij}^2$ ($j \neq i$), and $\bar{x}(i)$ and $s^2(i)$ denote the sample mean and variance computed over the locations other than $i$.

Getis-Ord Gi examined each linguistic variable for significant levels of positive or negative local spatial autocorrelation. The goal of this analysis is to determine what distributional values of the variables are found in the surroundings of the chosen locations. Getis-Ord Gi returned a z-score for each variable at each location. Variables returning a z-score value larger than or equal to ± .
were considered locally significant. The z-scores were considered significant at a one-tailed . alpha level.

Getis-Ord Gi scores were positively significant or negatively significant for locations surrounded by other locations with similar values or with dissimilar values, respectively. In particular, positive Getis-Ord Gi scores indicated that the first lexical variant was relatively more frequent, while negative Getis-Ord Gi scores indicated that the second lexical variant was relatively more frequent in that neighborhood. Getis-Ord Gi scores approximating to zero indicated a region of variability between a preference for the first lexical variant and a preference for the second lexical variant in that neighborhood.

Table Global spatial autocorrelation results

Alternation            Moran’s I   z-score   p-value (one tail)
mom/mother             .           .         .
dad/father             .           .         .
grandpa/grandfather    .           .         .
folks/parents          .           .         .
kid/child              .           .         .
grandma/grandmother    .           .         .

For example, Los Angeles city, Hollywood, and Beverly Hills are locations close to each other: the distance among the three cities can be represented by a triangle of sides of, respectively, , , and miles. For the variable kid/child, the results of the Gi analysis are very similar in the three cities (approximating the results to two decimal places, Los Angeles Gi = . , Hollywood Gi = . , and Beverly Hills Gi = . ). Since positive scores mark a relative preference for the first variant, the positive results indicate that kid is relatively more used than child in the newspapers of that area.

Figure is an example of a map of California on which the surveyed locations were plotted in dots filled with shades according to the raw proportion values, while Figure is an example of an autocorrelated map. The map in Figure displays the probability of the first variant relative to the second variant of the continuous lexical alternation variable dad/father in California English.
A dot filled with a lighter color indicates that the first variant is more common in the location identified by that specific dot. A dot filled with a darker color indicates that the second variant is more common in the identified location.

Fig. Probability of dad relative to father. A color version of this figure is available online.

The map in Figure represents again the probability of the first variant relative to the second variant of the continuous lexical alternation variable dad/father, this time by plotting the Gi z-scores on the California map. In these maps, a darker dot (or a red dot, with reference to the online color version of the map) indicates that the identified location was associated with a positive Gi z-score, and therefore the first variant occurs relatively more frequently in that location. A lighter dot (or a blue dot, with reference to the online color version of the map) indicates that the identified location was associated with a negative Gi z-score, and therefore the second variant occurs relatively more frequently in that location. A grey dot (or a white dot, with reference to the online color version of the map) shows a region of fluctuation in the preference for the first or second variant. Going back to the example, Los Angeles, Hollywood, and Beverly Hills are represented by darker/red dots in the autocorrelated map for the variable kid/child (see Fig. ) due to the close relatedness of the positive results that the Gi analysis returned at those locations.

Fig. Probability of dad relative to father: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

Figure is visually clearer than Figure , and provides an example of the powerful smoothing effect of the employed statistical autocorrelation technique.
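The smoothing effect of the local statistic can be illustrated with a small sketch. The code below is a minimal reading of the Getis-Ord Gi statistic in the form that excludes the focal location, combined with a reciprocal weighting function. Several assumptions apply: the coordinates are treated as planar and distances as Euclidean (a real survey would more plausibly use great-circle distances), the six locations and their proportions are invented, and the function names are hypothetical. It is not the implementation used in the study.

```python
import math


def reciprocal_weights(coords):
    """Weight matrix with w[i][j] = 1 / distance(i, j) and a zero diagonal.

    Assumes planar coordinates and no two locations at the same point.
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weights[i][j] = 1.0 / math.dist(coords[i], coords[j])
    return weights


def getis_ord_gi(values, weights):
    """Gi z-score for every location, computed over all other locations."""
    n = len(values)
    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        mean_i = sum(values[j] for j in others) / (n - 1)
        var_i = sum(values[j] ** 2 for j in others) / (n - 1) - mean_i ** 2
        w_sum = sum(weights[i][j] for j in others)
        s1 = sum(weights[i][j] ** 2 for j in others)
        numerator = sum(weights[i][j] * values[j] for j in others) - w_sum * mean_i
        spread = ((n - 1) * s1 - w_sum ** 2) / (n - 2)
        scores.append(numerator / (math.sqrt(var_i) * math.sqrt(spread)))
    return scores


if __name__ == "__main__":
    # Hypothetical data: two groups of three nearby locations each.
    coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    proportions = [0.90, 0.80, 0.85, 0.20, 0.10, 0.15]
    for score in getis_ord_gi(proportions, reciprocal_weights(coords)):
        print(round(score, 2))
```

In this toy example the three locations of the first group receive positive scores and the three of the second group negative scores, which is the kind of regional grouping that the autocorrelated maps display.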
This analysis is the quantitative equivalent of isogloss identification (Grieve et al., ). In fact, local spatial autocorrelation is a direct way to decipher the linguistic data more clearly, leveling the noise that was present in the raw data.

Results

The maps of the Getis-Ord Gi z-scores (Figs – ) exhibit two main tendencies in the language choices of California journalists, namely a north/south and an urban/rural distinction, as described in the following paragraphs. For an overview of the urban/rural areas of California, Figure presents population density information throughout the state.

Fig. Probability of mom relative to mother: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

Pattern A, north/south

The map for the variable grandpa/grandfather displays a clear distinction between the usage of the terms in Northern and Southern California. In particular, the less formal realization for the concept ‘the father of one’s father or mother’ is relatively more frequent in Northern California, while the more formal one is relatively more frequent in the lower part of Southern California, with a few weak outliers in the central part of the state (see Fig. ). With the variable folks/parents (Fig. ), the pattern resembles the one in grandpa/grandfather. In fact, the less formal realization for the concept ‘a person’s father and mother’, folks, is more common in the north, while the more formal realization is more common in the south.

Fig. Probability of grandpa relative to grandfather: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

Pattern B, urban/rural

The variables dad/father and mom/mother reveal a clear usage distinction in written online newspaper language between urban and rural California environments. In particular, for the variable dad/father, the variant dad is relatively more common in the metropolitan areas around San Francisco and Los Angeles; the variant father is relatively more common in the central and northern rural parts of California (Fig. ). For the variable mom/mother, the term mom is relatively more frequent in the San Francisco Bay region and in the Los Angeles area; the term mother is relatively more frequent in the more rural eastern part of Northern California (Fig. ).

For the variable grandma/grandmother, the term grandma, which is the less formal realization for the concept ‘the mother of one’s father or mother’, is relatively more frequent in Southern California, especially in the Los Angeles urban area, as well as in San Francisco and in the Silicon Valley; the more formal term grandmother is relatively more frequent in Northern California, with a prevalence in the central part of the state, as well as on the coast of Northern California (see Fig. ). Finally, although folks/parents displays dialect patterns mainly on a north/south dimension, the variable has a notably strong correlation in the urban area of Greater Los Angeles (Fig. ).

Fig. Probability of grandma relative to grandmother: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

Pattern testing

In order to verify the two identified patterns and test them with further data, we compared the six distributional patterns for the lexical variables under examination in this article with the distributional patterns for another set of eight variables, namely contraction rate variables (hasn’t/has not, haven’t/have not, doesn’t/does not, don’t/do not, wasn’t/was not, weren’t/were not, couldn’t/could not, won’t/will not).
For comparability, data for this comparison were retrieved and analyzed following the same criteria as the ones we detailed in Sections and . Notably, the variables doesn’t/does not, don’t/do not, couldn’t/could not, and won’t/will not followed a north/south structure as in Pattern A (Section . ; Fig. ), and the variables hasn’t/has not, haven’t/have not, wasn’t/was not, and weren’t/were not followed an urban/rural structure as in Pattern B (Section . ; Fig. ).

Fig. Probability of kid relative to child: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

Fig. Population distribution in California (source: US Census).

Discussion

As an attempt to answer Labov’s ( ) call for a quantification of the dimension of style, situating the quantification on a geographical level, the analysis of the distribution of the selected lexical variables, formed by word alternations on different levels of formality, led to the conclusion that written formality in the English language is regionally patterned, as Grieve’s ( ) analysis of American English demonstrated before. Therefore, regional linguistic variation on a formality level exists in written Standard California English. In particular, two very strong patterns of variation emerged for California English, namely an urban/rural dimension and a north/south dimension.

A historical motivation underlies the language usage distinction between the north and the south of California. In fact, in the mid-nineteenth century, while Northern California was growing rapidly as a consequence of the Gold Rush, Southern California continued to be a pastoral Hispanic region until the s, when, with the development of irrigation and the aqueduct system, the Imperial Valley saw the increase of the farm population. The year-round favorable weather and the relaxed lifestyle drew people to Southern California, boosting the real estate industry (Starr and Procter, ; Hayes, ). While San Francisco was the most populated city in the state until , Los Angeles grew considerably in the following years and became three times as big as San Francisco in . Therefore, the residents of Northern California have been settled longer than the residents of Southern California. The different use of the language between north and south that is evidenced in this research can be a result of the historical settlement patterns. This study confirms what Reed pointed out years ago: through the example of the distribution of the term chesterfield, Reed provided evidence that a north/south dialect distinction already existed in California in the s (Reed and Bradley, ).

Fig. Probability of folks relative to parents: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

Fig. Pattern A (north/south) testing: probability of haven’t relative to have not: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

The straightforward language distinction between the rural and the urban areas of California is based on the socioeconomic structure: a metropolis influences people’s lifestyles differently from the way agricultural cities shape people’s habits. Language in general, including written language as demonstrated here, is gradually adapting toward informality, a ‘shift to a more speech-like style’ that Geoffrey Leech defines as ‘colloquialization’ (Leech et al., : ).
Urban areas are normally motors of innovation, in language as elsewhere, while the slower-paced countryside is reluctant to adopt changes. In fact, the results of this research show that metropolitan areas tend toward more informal language than rural areas. Notably, although California encompasses five urban agglomerations (Greater Los Angeles, the San Francisco Bay, San Diego-Tijuana, Greater Sacramento, and Metropolitan Fresno), according to this research, only two of those agglomerations are involved in a different use of the language compared to the rest of the state. In particular, the two urban areas that emerge from this study are Greater Los Angeles and the San Francisco Bay. Greater Los Angeles and the San Francisco Bay are in fact different from the rest of the California metropolitan areas from a variety of points of view. To pick one, the gross domestic product (GDP) of Greater Los Angeles and the San Francisco Bay is much higher than the GDP of any other California urban agglomeration.

Fig. Pattern B (urban/rural) testing: probability of couldn’t relative to could not: map of local spatial autocorrelated Getis-Ord Gi z-score values. A color version of this figure is available online.

Fig. Reproduction of Grieve’s contraction measures maps, California section (source: Grieve, ): 1. be not contraction; 2. do not contraction; 3. have not contraction; 4. modal not contraction; 5. be contraction; 6. have contraction; 7. modal contraction; 8. them contraction; 9. to contraction; 10. non-standard ‘not’ contraction; 11. double contraction.

A comparison between the results of the present study and previous research shows that Grieve’s ( ) arguments align with this study. The two studies are similar on the basis of goals, methods, and coverage of California.
In fact, the goal of both this study and Grieve’s study is to analyze language variation in written Standard American English as it appears in online newspapers, and the methods used are somewhat comparable, if not in the data collection, at least in the statistical analyses. Both studies analyze English dialect variation on a quantitative basis, and both provide dialect maps. Also, although encompassing the US, Grieve’s coverage includes California, which is also the state surveyed here.

However, a comparison with Grieve’s results is possible only to a limited extent. The comparability is limited due to the different number of observations, the different geographical zoom of the two studies, and the different nature of the analyzed variables. With regard to the number of observations, Grieve’s sample for California contains cities, while this study analyzes locations. As for the geographical zoom of the two studies, Grieve’s survey focuses on the contiguous USA, while this research focuses on only one out of those states. Also, the variables analyzed by Grieve are full grammatical forms versus contracted grammatical forms, such as the alternation in writers’ choice between is not and isn’t, based on the assumption that contractions are prevalent in informal writing, while they tend to be avoided in more formal writing; this study analyzes two lexical forms for each onomasiological concept, based on the assumption that one lexical form is used in a more formal way whereas the other lexical form is used in a less formal way. Finally, the newspaper sample is limited to letters to the editor in Grieve’s survey, while the sample for this research encompasses the whole archive that online newspapers make available.

Six out of the eleven contraction variables analyzed by Grieve show variation in California (features , , , , , , Fig. ), four of which display very weak patterns (features , , , , Fig. ).
The most successful variable in terms of regional variation in California is the non-standard ‘not’ contraction (feature , Fig. ). In particular, the non-standard ‘not’ contraction variable presents great variation from the upper part of Northern California (Redding and Chico) to the San Francisco Bay Area, Sacramento, and Fresno; moreover, the non-standard ‘not’ contraction variable behaves similarly in the San Francisco Bay Area, Sacramento, Fresno, Los Angeles, Riverside, and San Diego, while in Bakersfield the variable displays a different behavior, more similar to the one detected in the northern rural area. It should be noted that these two different pattern regions relative to the results obtained from this variable are both rural; one cluster of observations is in the north, while one other observation is in the south. The patterns resulting from Grieve’s survey for the non-standard ‘not’ contraction variable seem comparable to the results of the study reported in this article.

The territory under investigation could be expanded in a future study. For example, once the entire US territory is surveyed for patterns of lexical formality in written Standard American English, it would be interesting to compare lexical results to contraction rate patterns and to previously established general dialect patterns in formality variation in American English.

Acknowledgments

Many thanks to Jack Grieve and Thomas Wielfaert for their comments on the article.

References

Asnaghi, C. ( ). An Analysis of Regional Lexical Variation in California English Using Site-Restricted Web Searches. Joint Ph.D. dissertation, Università Cattolica del Sacro Cuore and University of Leuven, Milan, Italy and Leuven, Belgium.
Atwood, E. B. ( ). The Regional Vocabulary of Texas. Austin, TX: University of Texas Press.
Biber, D. ( ). Variation across Speech and Writing. Cambridge, UK: Cambridge University Press.
Bright, E. S. ( ). A Word Geography of California and Nevada, volume . Berkeley, California: University of California Press.
Brooke, J., Wang, T., and Hirst, G. ( ). Inducing lexicons of formality from corpora. In Bel, N., Daille, B., and Vasiljevs, A. (eds), Proceedings of the LREC Workshop. Valletta, Malta, pp. – .
Bucholtz, M., Bermudez, N., Fung, V., Edwards, L., and Vargas, R. ( ). Hella Nor Cal or totally So Cal? The perceptual dialectology of California. Journal of English Linguistics, : – .
Carver, C. M. ( ). American Regional Dialects: A Word Geography. Ann Arbor, MI: University of Michigan Press.
Cassidy, F. G. and Hall, J. H. (eds), ( ). Dictionary of American Regional English. Vol. A–C. Cambridge, MA: Belknap Press of Harvard University Press.
Cassidy, F. G. and Hall, J. H. (eds), ( ). Dictionary of American Regional English. Vol. D–H. Cambridge, MA: Belknap Press of Harvard University Press.
DeCamp, D. ( ). The pronunciation of English in San Francisco. In Williamson, J. V. and Burke, V. M. (eds), A Various Language: Perspectives on American Dialects. New York: Holt, Rinehart and Winston, Inc.
Eckert, P. ( ). Linguistic Variation as Social Practice: The Linguistic Construction of Identity in Belten High. Vol. . Hoboken, NJ: Wiley-Blackwell.
Ferguson, C. A. ( ). Dialect, register and genre: working assumptions about conventionalization. In Biber, D. and Finegan, E. (eds), Sociolinguistic Perspectives on Register. Oxford, UK: Oxford University Press, pp. – .
Geeraerts, D., Grondelaers, S., and Speelman, D. ( ). Convergentie en divergentie in de Nederlandse woordenschat. Een onderzoek naar kleding- en voetbaltermen. Amsterdam: Meertens Instituut.
Grieve, J. ( ). A Corpus-Based Regional Dialect Survey of Grammatical Variation in Written Standard American English. Ph.D. thesis, Flagstaff, Arizona: Northern Arizona University.
Grieve, J. ( ). A regional analysis of contraction rate in written Standard American English. International Journal of Corpus Linguistics, ( ): – .
Grieve, J., Asnaghi, C., and Ruette, T. ( ). Site-restricted web searches for data collection in regional dialectology. American Speech, : – .
Grieve, J., Speelman, D., and Geeraerts, D. ( ). A statistical method for the identification and aggregation of regional linguistic variation. Language Variation and Change, : – .
Hagiwara, R. ( ). Acoustic Realizations of American /r/ as Produced by Women and Men, vol. . Los Angeles: Phonetics Laboratory, Department of Linguistics, UCLA.
Hall, J. H. (ed.) ( ). Dictionary of American Regional English, vol. . P–Sk. Cambridge, MA: Belknap Press of Harvard University Press.
Hall, J. H. (ed.) ( ). Dictionary of American Regional English, vol. . Sl–Z. Cambridge, MA: Belknap Press of Harvard University Press.
Hall, J. H. and Cassidy, F. G. (eds), ( ). Dictionary of American Regional English, vol. I–O. Cambridge, MA: Belknap Press of Harvard University Press.
Hall, J. H. and von Schneidemesser, L. (eds), ( ). Dictionary of American Regional English, volume : Contrastive Maps, Index of Entry Labels, Questionnaire, and Fieldwork Data. Cambridge, MA: Belknap Press of Harvard University Press.
Hall-Lew, L. (ed.) ( ). Ethnicity and Phonetic Variation in a San Francisco Neighborhood. Ph.D. thesis, Stanford University.
Hayes, D. ( ). Historical Atlas of California: With Original Maps. Berkeley/Los Angeles/London: University of California Press.
Hinton, L., Moonwomon, B., Bremner, S., Luthin, H., Clay, M. V., Lerner, J., and Corcoran, H. ( ). It’s not just the Valley Girls: a study of California English. Proceedings of the Annual Meeting of the Berkeley Linguistics Society, vol. .
Hubert, L. J., Golledge, R. G., and Constanzo, C. M. ( ). Generalized procedures for evaluating spatial autocorrelation. Geographical Analysis, : – .
Kennedy, R. and Grama, J. ( ). Chain shifting and centralization in California vowels: an acoustic analysis. American Speech, ( ): – .
Kretzschmar, W. ( ). The Linguistics of Speech. Cambridge, UK: Cambridge University Press.
Kurath, H. ( ). A Word Geography of the Eastern United States. Ann Arbor, MI: University of Michigan Press.
Labov, W. ( a). Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.
Labov, W. ( b). Some principles of linguistic methodology. Language in Society, : – .
Labov, W., Ash, S., and Boberg, C. ( ). Atlas of North American English: Phonetics, Phonology, and Sound Change. New York: Mouton de Gruyter.
Leech, G., Hundt, M., Mair, C., and Smith, N. ( ). Change in Contemporary English: A Grammatical Study. Cambridge, UK/New York: Cambridge University Press.
Moonwomon, B. ( ). Sound Change in San Francisco English. Ph.D. thesis, Berkeley: University of California.
Moran, P. ( ). The interpretation of statistical maps. Journal of the Royal Statistical Society Series B (Methodological), ( ): – .
Odland, J. ( ). Spatial Autocorrelation. London: Sage Publications.
Ord, J. K. and Getis, A. ( ). Local spatial autocorrelation statistics: distributional issues and an application. Geographical Analysis, ( ): – .
Paradis, E. ( ). Moran’s autocorrelation coefficient in comparative methods. http://cran.r-project.org/web/packages/ape/vignettes/morani.pdf (accessed September ).
Podesva, R. J. ( ). The California vowel shift and gay identity. American Speech, ( ): – .
Reed, D. W. and Bradley, F. W. ( ). Eastern Dialect Words in California (special issue of: American Speech: A Quarterly of Linguistic Usage ed.), vol. . Ann Arbor, MI: Publication of the American Dialect Society.
Reed, D. W. and Metcalf, A. A. ( ). Linguistic Atlas of the Pacific Coast. Microfilm: Bancroft Library of the University of California at Berkeley.
Ruette, T., Speelman, D., and Geeraerts, D. ( ). Measuring the lexical distance between registers in national varieties of Dutch. In Soares da Silva, A. T., Torres, A., and Gonçalves, M. (eds), Línguas Pluricêntricas. Variação Linguística e Dimensões Sociocognitivas. Braga, Portugal: Publicações da Faculdade de Filosofia, Universidade Católica Portuguesa, pp. – .
Sawada, M. ( ). Global spatial autocorrelation indices – Moran’s I, Geary’s C and the general cross-product statistic. Research paper from the Laboratory for Paleoclimatology and Climatology at the University of Ottawa. http://www.lpc.uottawa.ca/publications/moransi/moran.htm.
Speelman, D., Grondelaers, S., and Geeraerts, D. ( ). Profile-based linguistic uniformity as a generic method for comparing language varieties. Computers and the Humanities, : – .
Starr, K. and Procter, B. ( ). Americans and the California dream, – . History: Reviews of New Books, ( ): .
Szmrecsanyi, B. ( ). Forests, trees, corpora, and dialect grammars. In Szmrecsanyi, B. and Wälchli, B. (eds), Aggregating Dialectology, Typology, and Register Analysis. Berlin: De Gruyter, pp. – .
Tobler, W. ( ). A computer movie simulating urban growth in the Detroit region. Economic Geography, ( ): – .
Trudgill, P. ( ). Standard English: what it isn’t. In Bex, T. and Watts, R. J. (eds), Standard English: The Widening Debate. London: Routledge, pp. – .
Waksler, R. ( ). A hella new specifier. ling.ucsc.edu/ jorge.

Notes

The complete list of online newspapers, including the location where the selected newspapers are based, the communities and topics they cover, the frequency of their publication, the circulation, and other notes can be found in Asnaghi, .
Only in this case the plural form was searched (folks/parents) to avoid ambiguous cases that would have occurred in the case of a search for folk/parent, such as hits for folk meaning ‘folk music’ and ‘folk art’.
A brief survey was sent to a sample of California newspaper editors.
about editors replied, confirming that: ‘almost all [journalists] are local residents’ (becky o’malley, editor, berkeley daily planet) or ‘we [i.e. the journalists] are all residents of the city’ (judi bowers, editor, big bear grizzly).
san francisco chronicle, december , http://www.sfgate.com/opinion/letterstoeditor/article/letters-to-the-editor- .php, retrieved on july .
“the most immediate problem to be solved in the attack on sociolinguistic structure is the quantification of the dimension of style” (labov, : ).
information retrieved from quickfacts.census.gov on december .
international journal of e-planning research, ( ), - , january-march

copyright © , igi global. copying or distributing in print or electronic forms without written permission of igi global is prohibited.

abstract

academic blogging has typically been a form of digital scholarship that is under-utilized in academia. although there are both costs and benefits to blogging at different stages in an academic’s career, blogs can provide a rewarding platform for bringing research and academic perspectives to a wide-reaching and broader audience. this note explores the different experiences of each of the co-authors in terms of using blogs for their scholarly communication. the experiences and lessons gained are of particular relevance to urban planners, sociologists, and anthropologists, who study the social, economic, and historical elements of the city. the findings suggest that the motivations and approaches of scholarly blogging are diverse but overall add value to the academic community. moreover, each testimony in this note provides examples of the benefits of blogging for research, collaboration, and engagement.

blogging the city: research, collaboration, and engagement in urban e-planning.
critical notes from a conference

pierre clavel, cornell university, ithaca, ny, usa
kenneth fox, cornell university, ithaca, ny, usa
christopher leo, department of political science, university of winnipeg, winnipeg, mb, canada
anabel quan-haase, university of western ontario, london, on, canada
dean saitta, department of anthropology, university of denver, denver, co, usa
ladale winling, department of history, virginia tech, blacksburg, va, usa

keywords: academia, blogging, history, scholarly communication, social media, tenure

doi: . /ijepr.

introduction

social media have transformed how scholars in all disciplines share ideas, consult and collaborate with colleagues, and disseminate their research findings (bonetta, ; quan-haase, ). not all academics, however, are embracing the move toward digital communication with the same enthusiasm. some argue that established, traditional means of publishing (via print books or journals) are more meaningful and allow for better quality control than digital counterparts (hurt & yin, ). despite these criticisms, many academics are utilizing social media for connecting with collaborators and students, discussing important topics, obtaining feedback, and disseminating their research findings. in a widely cited blog, dave parry, chair of communication and digital media at saint joseph’s university, argues that social media has become an essential part of scholarship “[n]ot because social media is the only way to do digital scholarship, but because…social media is the only way to do scholarship period” (parry, , para. ). this opinion is widely shared by social media advocates.
they see these tools as playing a central role in their scholarly practice, not only as a means to connect to other scholars but, rather, as a means of “networking across sectors throughout the entire research process” (sprain, endres, & petersen, , p. ).

there is no single definition of the term social media. a wide range of different tools are aggregated under its rubric, including microblogging, blogs, social networking sites, and video sharing and streaming websites (hogan and quan-haase, ). despite this proliferation in social media tools, blogs have perhaps played the most central role in scholarly practice. most scholarly writing on academic blogs notes that academic blogging is a distinctly different form of communication than traditional academic writing. for instance, blogging historian juan cole ( , p. ) writes:

i consider blogging to be a genre of writing, which can be endowed with academic attributes, even if it is not like the genre of the academic article. a blog entry is intended to intervene in a debate raging in the blogosphere, and it is best if it is dashed off quickly, incorporating as much original thinking and analysis as possible, and based on the best information, given the constraints of immediacy.

one of the most comprehensive reports on blogging, conducted by nielsen reports in , shows that there are more than million blogs on the internet (nielsen online reports, ). the same study estimates that . million users publish blogs on blogging platforms and another million publish blogs on social media platforms (nielsen, ). while blogging has often been dismissed as a pastime activity for teens, in recent years this perception has certainly changed.
a report by the pew internet and american life project (lenhart, purcell, smith, & zickuhr, ) shows that between and , blogging has moved away from being a medium primarily utilized by teens for self-expression toward a tool for exchanging credible, accurate, and current information among all age groups (scale & quan-haase, in press).

moreover, academic blogging is distinct from other kinds of blogging. while the content of academic blogs varies from discipline to discipline, three common benefits can be seen across academic blogging as a whole. these are (1) interactive communication; (2) timeliness and personal tone; and (3) broad dissemination of research results.

interactive communication. blogs, unlike print books and journals, allow for interactive means of engaging with a wider readership. most blogs allow readers to add comments, and many blogs feature exchanges between readers and blog authors. kim and abbas ( ) in their study of internet use in academic libraries refer to blogs as a “user-initiated knowledge function” permitting both blogger and reader to share knowledge. the authors note that blogs excel at making communication between blogger and reader a two-way exchange, such that % of the academic libraries they surveyed use blogging as a form of knowledge exchange. perhaps most importantly, blogs allow for immediate feedback and for the readership to add not only their opinions, but also original content in the form of data, images, or text; resources in the form of urls, pdfs, and jpegs; and additional information on an object, person, or event.

timeliness and personal tone. blogs are perceived as a supplemental means of publishing academic work and not necessarily as a substitute. there are two reasons why blogs are seen as supplemental and advantageous. first, blogs are timely. there are no delays in the publishing process as is the case with print books and journal articles.
this is an important advantage, especially in fields where knowledge becomes obsolete quickly and where publication delays can span years. secondly, many scholars note that the informal writing style of academic blogging differs from traditional academic writing in a way that makes ideas more accessible. heather cox richardson ( ) believes academic blogging helps her “distill” complex ideas, allowing her to better communicate them to a wider public. since many academic blogs have larger and more diverse readerships than scholarly publications, a number of academics feel that this wider audience produces an informal version of peer review that opens their work to public comment and critique (cohen, ; cole, ; kaufman, ; lindgren, ).

broad dissemination of research results. academics are using blogs to disseminate their research to a wider audience. this includes not only academics in related fields, but also the general public who may have an interest in the data or findings. cohen states that “[w]riting a blog lets you reach out to an enormous audience beyond academia. some professors may not want that audience, but i believe it’s part of our duty as teachers, experts, and public servants. it’s great that the medium of the web has come along to enable that communication at low cost” (cohen, ). this move is, for instance, reflected in the call put forward by the society for american archaeology to encourage its members to blog about their work. similarly, lynn goldstein, the director of michigan state university’s campus archaeology program (cap), has also discussed the importance of blogging to their academic program.
she writes that all cap students are required to blog about their work for the purpose of gaining access to a variety of different publics, academic and non-academic (meyers, ). in addition, many see blogging as working hand in hand with traditional publishing and not in opposition. lindgren ( ) notes that “probably the most important contribution of blogging to legal scholarship is informing readers both inside and outside the legal academy of recent work published in a law review or posted to a website service, such as the social science research network (ssrn).”

despite the many benefits of blogging, the genre brings some risks. for instance, goldstein points out that inexperienced bloggers, such as students, may risk offending groups within these publics by sharing sensitive cultural materials (meyers, ). hurt and yin ( ) suggest that risks can be particularly acute for pre-tenured scholars if they transmit inaccurate information that has not been properly peer-reviewed. they further note that blogging is not a part of the three primary categories commonly employed to evaluate tenure candidates (i.e. print publications, teaching, and service), and that it may take time away from these primary activities. vivienne raper ( ), while more supportive of academic blogging than hurt and yin ( ), shares their opinion that blogging can be detrimental to one’s academic career. she notes that some institutions will value blogs that support academic activities toward tenure and promotion, but that most will see blogging as a “harmless hobby.”

still, many academics, particularly those in historical fields, value blogging. the canadian website active history (http://activehistory.ca) encourages historians to publicly blog about their research with the following mission statement: “we seek a practice of history that emphasizes collegiality, builds community among active historians and other members of communities, and recognizes the public responsibilities of the historian.” heather cox richardson writes that despite the perceived risks associated with academic blogging, “blogging lets you develop a sense of humor in your writing. hell, it encourages you to. and that is maybe the key to why blogging and tweeting are good for the historian’s craft. they’re fun” (richardson, ).

thus blogging, like any form of communication, offers specific possibilities and drawbacks. for many academics, blogging provides a platform to hone academic writing and to communicate with a larger public. while blogging has the possibility to offend and may not further one’s academic career, for many scholars the potential drawbacks of blogging are greatly outweighed by its benefits.

the rest of this paper discusses these benefits as framed by each of our co-authors, written in their first-person voice. all of these scholars participated in a session called “scholarship blogging: what? why?” organized by pierre clavel for the meetings of the society for american city and regional planning history (sacrph) held in toronto. dean saitta chaired the session and anabel quan-haase served as discussant. saitta and quan-haase teamed up to collect and synthesize the session contributions for this paper.

pierre clavel: progressivecities.org

blogging is given purpose by the blog’s content.
www.progressivecities.org starts from research and writing about “progressive” neighborhood planning in american cities, especially where it concerns the redistribution of resources to poor neighborhoods and the opening of city halls to wider public participation. it aims to produce and preserve an archived collection that can be used by scholars and activists alike.

the collection

in i was anticipating five years of half-time work, then retirement. i had ten boxes of documents and interview transcripts from writing about progressive cities like berkeley, hartford, cleveland, santa monica, and burlington; later chicago and boston. i found that the cornell library’s archives, officially the division of rare and manuscript collections (rmc), would take them as part of its already large planning collection (papers of some - planners). i named it thematically, as the “progressive cities and neighborhood planning” collection. i sent them the boxes, and by we had a nice collection. it grew substantially over time as other people gave materials.

at one point an archivist, virginia krumholz in cleveland, argued that it was wrong to collect documents and bring them to cornell. rather, it was better that they stay in the city where they were created. i was persuaded of this, and tried to promote local archives. in one of our students, crystal launder, organized a trip to burlington where several students collected reports and other materials and, after making copies, we gave a box of these to their library. there was a big effort in berkeley in - when kathryn kasch and later karen westmont organized meetings for six months until they suspended operations. i also tried to get something started in binghamton, with no result. this was reaching out, in contrast to focusing single-mindedly on our collection and research.
website and blog

the collection has been terrific for the scholars already focused on the topic, and there is a nicely designed catalogue on the library website (http://rmc.library.cornell.edu/ead/htmldocs/rma .html). however, we wanted to supplement the catalogue with brief descriptions of the cities, samples of documents, and commentary. that was the reason we put together the progressive cities website. this led us eventually to install the blogging feature on the site. we sought to include a series of short essays commenting on the whole project, and to invite participation more broadly. the website has two main sections: the “blog” and “the project.” there are short sections on “the collection,” and “bibliography.” we subsequently added “contributors and contact,” a section that is about how people in different places can work on their own archives, and with contact addresses for people in each city. we only had - people listed, but hoped to add others in each city covered by the collection. at one point we added this paragraph:

i would like to see people work on this website from their own places and perhaps with special angles that cut across places. what might these be? initially, take a look at the website and see if this is something you fit with, could see doing. think about it – see if it appeals to you and you have time. look at the contact page and see who is listed there. write us and we can talk about it. i’d like to get about six other people involved in this website and project, so i’ll see what i can do over the next few months.

outreach is difficult, but worth it

because the progressive cities website originated from a preservation effort, i developed more of a focus on the historical material than on current survivals and implications.
however, this created a conflict with the desire to reach out: who, other than the occasional scholar, will be interested in these archives at cornell, or the website?

progress was slow for generating interest and collections in the cities. but virginia krumholz’s idea bore fruit in another way: it began to generate interaction with scholars and activists in the cities we studied. the blog was potentially a way to sustain such interaction. while i found the blog a convenient place to write commentary on the site in general, it was hard to get a response. i tried to touch on topics that i thought people in our cities would find interesting and worth responding to. here i ran into a problem: it was hard to find people that were interested in the topic, including in the cities i had written about. i encountered many vague memories, while others said they were burned out from the conflict that had been daily fare when they were active.

on the other hand, new potentials began to come from new places and people. a south african research team began collecting oral histories from “anti-apartheid planners” who had hoped to implement reforms in johannesburg. a number of cities in the u.s. were considering the use of eminent domain to relieve “underwater” mortgages that were depleting tax bases and slowing recovery from the - recession. on many fronts activists were finding traction for arguments and devices to combat recession at state and local levels. perhaps by enlisting these activists as bloggers, we can contribute historical depth to their struggles. many nations now wonder about increasing income inequality, but the cities found ways to attack the problem. the collection and blog can reinforce current efforts as they emerge, and increase consciousness at the top.

the collection is also important because it provides a focus and boundaries for the project as a whole. people want to know: what is the “progressive city”?
i think the best answer is the focus on inequality, tied to a participatory opening of government. this has much to do with neighborhood housing issues but other things as well: rent control transferred millions to renters in santa monica; burlington’s support of land trusts helped create a situation where percent of the housing stock was “permanently affordable,” i.e. protected from market fluctuations. in the s, boston enacted “linkage” fees on real estate developers, supporting an affordable housing trust fund. chicago took a series of steps to save relatively high paying manufacturing jobs.

these collections required a special focus because progressive city activists worked against mainstream institutions and opinion. they won elections that gave a mandate to redistributive measures, but implementation encountered strong resistance and conflict. in chicago, harold washington encountered two years of “council wars” from a city council majority that fought on racial grounds, but also saw threats in diversion of resources from downtown office construction to neighborhood jobs goals. boston developers argued the city would kill off a downtown office construction boom if it enacted linkage. it seemed important to collect documents and interview testimony focused around these redistributive issues; mainstream opinion would take care of the other side.

archivists were allies in these efforts. i noticed a potential division of labor: libraries and archivists represented a whole new world: very professional, terrific at preservation, such that there could be a complementary function. my co-researchers and i provided a focused set of collections that had thematic coherence, while fitting nicely within a much broader set of planning collections and documents.

we began to involve rmc in an outreach function. we had in the collection a set of “readers” put together by the conference on alternative state and local policies in and . this included short reports, internal memos, policy ideas, news reports, and legislation collected from activists who had gotten into local and state governments and had organized for purposes of sharing ideas. we thought that by putting these online they would attract attention to the larger collection, making it more useful. we researched copyrights, scanned the documents, and got clearance for the great majority of the items (many were in the “public domain” in any case). the items went online through ecommons at the beginning of .

christopher leo: christopherleo.com

filling a niche that academia ignores

before i went to graduate school, i spent some three years working for a series of daily newspapers. i was only years old when i started, and i loved the work. being a newspaper reporter gave me a license to pick up the phone and ask anyone any question that interested me. in those three years, i worked a number of beats: business and labour in marshalltown, iowa; education in lancaster, pennsylvania; and in york, pennsylvania (at the late gazette and daily, reputedly the only left-wing daily in the united states), city hall and the courthouse.

as a young journalist, i had opportunities that young people rarely enjoy. i became acquainted with bank officials, the president of a manufacturing corporation, labor leaders, judges, city officials, and prominent lawyers. it was fascinating learning about the worlds they inhabited and, through them, gaining a better understanding of business, politics, and law.

as time went on, however, i began to chafe at the daily midnight deadline. on most working days, i had an hour or two to research a “story” (a newspaper article) and write it. after that, it was on to the next piece.
i became increasingly conscious of the fact that i was just scratching the surface of each issue i researched and i wanted to go deeper. so i applied for graduate school, got accepted to political science and political economy programs at a number of universities, and quit the gazette.

i liked graduate school even better than newspaper work, but nothing is perfect. i loved researching african politics and city politics, my chosen areas of concentration, but as i began publishing my research, it became clear to me that i was missing something i had taken for granted as a newspaperman: a readership. the bleak reality of academic publishing is that most of us get very few readers. most of our readers are the same few people we meet at conferences, plus our students, many of whom would not have read what we wrote had we not assigned it to them.

then, a decade or two ago, the internet changed everything. by then i was a senior academic and secure enough to be able to try something unconventional without having to fear career death. combining an academic research career with a few hours a week devoted to blogging has allowed me to scratch my journalism itch (the unfulfilled desire to go deeper) while addressing my dissatisfaction, as an academic, with the inability to reach more than a few readers.

blogging is a wonderful opportunity for academics to fill a niche that conventional academic writing ignores. academic journals are full of facts and ideas that are bound to be interesting to many non-academics, but that potential readership only rarely delves into journal articles or monographs, because from a layperson’s point of view, academics take forever to get to the point. interesting facts or ideas that emerge from academic research cannot be laid out the way journalistic research is, for very good reasons.
a body of academic findings must be placed in a theoretical context and an academic article or book must set out explicitly how the findings in question are related to the literature. theory and literature reviews are unlikely to be perceived by a non-academic as desirable reading material.

blogging allows us to gather the interesting things we have learned in our academic research, strip them of theoretical discussions and literature reviews, and lay them out for a wider readership. this is not only an interesting and satisfying thing to do, but it is also a potentially important asset for academia as a whole. although we live in the wealthiest society in world history, attempts to allocate some of those vast resources to public purposes face heavy resistance. blogging gives us an opportunity, by presenting our findings to the public at large, to demonstrate that the resources universities command are valuable to society.

christopherleo.com: an overview and some samples

my blog consists of two parts: short articles that deal with research findings and commentary on those articles. a column entitled “the passing scene” contains short entries about interesting material i encounter daily in my research and reading, as well as links to that material. among the issues i’ve addressed in my blog:

• a blog entry, in an earlier iteration of my blog, documented the techniques a developer used to get city council to agree to substantial and ultimately unjustifiable government subsidies for a major downtown development. such techniques are widely used and it is important for participants in or observers of local politics to understand them.
• another blog entry showed how city council was misled into agreeing to a bridge project that turned out to be far more expensive than promised.
• in a comparative study of housing and homelessness in three canadian cities, my research assistants and i showed why a federal government program that made sense in vancouver was ill suited to winnipeg and saint john, new brunswick, and what that, in turn, teaches us about differences among cities.
• in recent years, in a later iteration of my blog, i’ve found myself devoting more attention to commentary on current affairs. thus in november , i offered some critical comments on the urban expansion policies that were being pursued by the city of winnipeg. a few months earlier, i commented critically on the way the city government communicates (or rather fails to communicate) with its constituents.

academic blogs and academia

blogging is a service senior academics can perform without worrying unduly about career consequences. junior academics need to be more careful. if you can write quickly and well, so that periodic blogs don’t take away much of the time you need for career-building activities, you might be able to blog without paying a significant career penalty. if not, you may well want to wait until you’ve reached a less vulnerable career stage.

a more fundamental solution to the time-allocation problem is the formal recognition of blogs containing serious academic findings or discussions as legitimate career activities. on no account should blogs be credited equally with refereed research, but in my view, it would be reasonable to count them the way we count community service, or non-refereed publications.

that will not happen overnight. my attempts before i retired to make a similar case in departmental meetings fell on deaf ears. academia is a cumbersome battleship that turns slowly. however, when and if it does turn in a direction more favorable to blogging, the benefits will soon become apparent.
kenneth fox: merton- columbiaproject.com

“website” better describes what i am posting online than “blog.” i am making my project public while research and writing are in progress. when the project appeared relevant to our sacrph conference topic, i posted discussion of how the two were related, including original documents of interest. in this intermediate form i am making material available that can be referenced and linked, as so much material on wikipedia is linked to websites.

the relationship of evolving sociological theory of the s, and city planning of the s and prior, was not one i had been thinking about, so relating the two has been very valuable for me. i expect it will be valuable for planning history of the s and following. while sociologists were reading planning theory and news, planners were undoubtedly reading sociology. if my site becomes known to planning historians, i want to know who they may be, and to correspond if they wish. robert merton is fairly unique in this context. a columbia sociologist of the next generation, herbert gans, was even more involved with planning and published frequently in planning journals. gans concerned himself with planning “ideology”, a tendentious point of view.

blogging creates a new relationship with researchers, professionals and activists. its mutual informality and trust encourage publicizing discoveries of interest mid-course in the project. since our sacrph presentation i have discovered that merton and some associates undertook a major study of two planned communities of the s: addison terrace within the city of pittsburgh, and winfield park in central new jersey. this research was underwritten by the fred l. lavanburg foundation, of which clarence stein was one of the trustees.
the foundation was established in to develop ways that planned housing could ameliorate racial discrimination, juvenile delinquency and poverty. the results of the surveys of the two communities were never published, which accounts for the lack of knowledge about it among both planners and sociologists. my discussion of this discovery in blog form on the website will hopefully generate responses from readers with additional knowledge of merton’s work on this.

there are pitfalls to be aware of in publicizing projects such as mine, some fairly serious. while pierre clavel’s enterprise includes manuscript repositories he is involved in managing, much of my research thus far has been in the robert merton papers housed at columbia university’s rare book and manuscript library. rights to the material are controlled by a merton literary estate and by the library. i am posting selected documents of theirs on my site. each document must be reviewed and permission granted. the estate and the library are concerned that online availability will undermine their control. any heightened risk requires their time and attention. if sites such as mine are to flourish, as i hope they will, libraries will need to determine how to allocate scarce staff time and resources.

another danger is the effect of online posting on subsequent publication, especially of monograph books. despite all the technical advances of recent decades, scholarly book production remains extremely expensive. university presses have become concerned that prior online availability of an author’s ideas will reduce sales. yet withholding new discovery, theorizing, analysis and argument cannot be the solution. given that the vast majority of scholarly monographs are purchased by libraries, progress may come through new relationships between scholars, publishers, libraries, and audiences. wikipedia is one instance of unexpected, remarkable developments.
wikipedia contributors must conform to a set of requirements, yet contributions remain anonymous. this has not deterred participation, however. i have become qualified as a wikipedia contributor, which is not difficult, but i have not found time to provide any contributions.

here are a few words in closing about fear. one must trust web hosting enterprises that exist to be profitable. having dealt with two thus far, i have found them surprisingly helpful with technical details. they also offer methods of promotion, such as how to make your site rank high on google searches. i have been cautious so far. i was also afraid that negative effects might follow from posting scanned documents, leading the estate and the library to withhold further permissions. so far, i have cultivated a spirit of partnership and they have responded positively. i believe cooperation of this sort is going to be central to further progress. protective as they can be, literary estates and manuscript libraries want their treasures to be known and appreciated. we, the site founders and bloggers, can become their best new allies in this enterprise.

ladale winling: urbanoasis.com

blogging by scholars can de-center the traditional emphasis among historians on published research while simultaneously producing a new creative process that yields greater productivity and new outlets and facilitates connections to a wider array of sources and disciplinary influences. in short, blogging can make better historians by undermining and reordering traditional scholarly processes.
scholars such as william turkel have written on ways to navigate “the infinite archive,” the increasing amounts of literature, data, and sources informing historical research; blogging serves as a kind of infinite notepad or personal press for discussing and disseminating research.

my blogging project at www.urbanoasis.org mirrors my own professional and scholarly development. created before i had earned any graduate degrees, the site and blog’s genre would best be considered “placeblogging” rather than history blogging. this effort took the contemporary built environment and development politics of my local communities (the college towns of kalamazoo and ann arbor, michigan) as starting points necessitating historical inquiry for context and prompted a search for eclectic sources in open, interactive discussion with a local blogging community. in ann arbor, in particular, the community news blog arbor update (formerly www.arborupdate.com) and the local commentary site ann arbor is overrated (formerly www.annarborisoverrated.com), both now defunct, provided a rich foundation with robust readership to the local blogging community.

as i proceeded through my historical training, blogging and scholarly productivity evolved hand in hand in an idiosyncratic fashion. where blog posts for the non-historian may start with a news tidbit from the press, a personal anecdote, or another blogger’s prompting, scholarly blogging is just as likely to start with a discovery from an archive, whether digital or analog. as research became central to my graduate career and my historian’s mentality, discussion of research and writing output became central to my professional life. blogging before graduate research had helped me create a community for feedback and had set me on a writing schedule: a day or a week without a post was a day i would find no readers and would showcase no research discoveries.
a day that i could bash out a half-formed idea in a blog post was a day that prepared me to refine and incorporate that idea into the sub-argument of an article, chapter, or review. thus, the regularity of blogging helped me become a more productive historical writer and the informality of blogging, often a point of criticism, helped keep me from becoming too much of a perfectionist.

many colleagues have noted that time spent writing for the web took away from other types of writing. i came to believe that a day that i was too busy to write a blog post was a day that i was simply too busy. what friedrich nietzsche called the “windless calm of the soul,” boredom even, that enabled productive writing and thinking was impossible in a day chockablock with meetings and errands. blogging was never the obstacle to writing; taking on too much other work always was.

as pierre clavel writes of his progressive cities project, the historians’ website and blog allow for both the creation and discussion of digital materials, so that historians can both draw upon and contribute to the infinite archive. blogging offers the opportunities to both highlight existing historical materials in digital archives and to create and showcase one’s own digital objects and sources (both born-digital and digitized analog sources). one of my efforts has been to digitize and host city security maps from the home owners’ loan corporation. this new deal program, created to help rebuild the housing market during the depression by financing and guaranteeing home mortgages, institutionalized the practice of “redlining” by rating neighborhoods with african americans, jews, and immigrants as risky investments. through several trips to the national archives in college park, maryland, where the maps are housed, i have collected several dozen digital maps, the largest single collection available on the web. scholars and students contact me about these regularly and in discussions on my site i plumb these materials when i am thinking about cities in the th century. just as blogging is a first step toward publication, digitizing and hosting these maps is the first step in a broader digital history project on holc currently in the works.

a blog also helps scholars to manage their identity on the web in addition to its productivity and project sandbox functions, in part because of these two other functions. colleagues, acquaintances, students, long lost relatives, and prospective employers will search for and find you on the web using google or another search engine. that is no longer a question. the real question is, what will they find and what role will you play in prioritizing what people find? who manages your identity on the web: you or google? one could easily start with dan cohen’s tips on search engine optimization from way back in to choose an appropriate domain name and work to get authoritative links by creating high-quality, compelling content on your blog. but the most important step is to first decide that taking ownership of your online identity and participating in this digital culture is a worthwhile, even a necessary venture. once a scholar makes that decision, a whole array of options on what to do next cascades out in front of her: choosing to pursue a highly designed web site, emphasizing social media, or simply sticking to basic writing on the web via a blog. regardless of the choice, deciding to participate in creation of digital content through a platform like blogging ensures that your activities rise to the top of what the world finds out about you when it searches for you on the web.
dean saitta: interculturalurbanism.com

it was a pleasure for me to chair the scholarship blogging session in toronto that produced the collaboration that led to this paper. i am one of those society for american archaeology members, mentioned earlier, who has embraced blogging as a way to reach a wider public. my blog intercultural urbanism is, in many ways, a logical outgrowth of my training as an archaeologist. archaeologists study the material remains of human societies. it has become commonplace for archaeologists to view cities as the most significant cultural artifact produced by human beings, one that reflects and reproduces human relationships, values, and aspirations. in other words, the city is a receptacle of cultural meaning (kostof, ). thus, the urban built environment can either enhance or erode the commitments that people make to the places where they live and, of course, the commitments they make to each other.

my blog explores the territory where culture, public policy, urban design, and built environment intersect. it takes stock of the cultural values that shape how ethnically diverse groups of citizens create, use, and respond to the urban built environment. it is predicated on the notion that the more sensitive that urban designers, planners, architects, and developers are to the role that culture plays in how people interact with landscape and built space (especially in today’s increasingly diverse urban communities), the better the chances for building neighborhoods and cities that are more inclusive and environmentally and culturally sustainable. the formulation is inspired by the work of phil wood and charles landry on the intercultural city (wood & landry, ), ash amin on the good city ( ), leonie sandercock on cosmopolis ( ), chase, crawford and kaliski ( ) on everyday urbanism, and many others.

i took to blogging in because i have always admired scholars who write for the general public (e.g., see saitta, ).
having been part of the typical scholarly grind for the first years of my career (writing for academic journals, edited volumes, and other print publications), the blog has been a perfect outlet for my late career interest in interdisciplinary and publicly engaged scholarship. these aspects of academic work are not always rewarded by the traditional disciplines. blogging provides an opportunity to range widely across archaeology, anthropology, history, geography, sociology, ecology, evolutionary science, art, architecture, literature, communication, business, and other fields. it offers an opportunity to export anthropological knowledge to these other scholarly domains and to explore ways of thinking that are transdisciplinary in nature.

writing regularly for intercultural urbanism has both liberated and sharpened my thinking about the city. it has also produced thousands of words and over a hundred essays, already in pretty good substantive and grammatical shape, that i plan to eventually merge into a monograph. thus, the blog is not only the most fun i have had writing as an academic, but it is also setting me up to complete a more traditional academic project down the road.

perhaps most importantly, the blog has created multiple opportunities to connect with people in diverse areas of urban research, policy-making, and practice. the invitation to chair the scholarship blogging session at sacrph, a professional meeting that i had never before attended, was one such opportunity. in my blog drew the attention of leaders at the council of europe’s intercultural cities program, which produced a request for participation in an international seminar on culturally conscious urban placemaking.
as a consequence, connections were made with individual scholars whose work, up to now, i have only admired from afar. in late the website sustainable cities collective began re-posting some of my essays, and in so doing has given them much wider exposure. in i was invited to be a regular blogger for the public interest urban planning and design website planetizen. writing for planetizen has doubled the number of daily visits to interculturalurbanism.com. none of this would have happened if i had limited myself to traditional forms of writing within my particular academic discipline.

my teaching has also benefitted from my blogging. to echo christopher leo, blogging has helped me better focus what i do with students in the classroom. all of the major assignments in my course on the anthropology of the city send students into the community to do original research on various aspects of development in denver. i will sometimes synthesize and report the results of this work on the blog, something that i pitch as faculty-student co-production of knowledge. to the extent that the urban planning profession wants to know what today’s population of educated young adults (i.e., the “millennials”) desires in urban living, several of these essays have been re-posted to other urbanist websites. students appreciate this attention; several have commented to me that my course-based blogging gives them a sense that their thinking and writing about the city really matters beyond the classroom.

conclusion

the scholarship blogging session at the sacrph conference demonstrated that there is no single best way to do academic blogging. each of the presenters had different goals and approaches. the session produced some consensus about advantages and drawbacks. still, there was strong agreement about the overriding virtues of scholarly blogging.
blogging is a perfect medium for collecting and preserving scholarly knowledge about the city, especially knowledge that, for whatever reason, has been purposely forgotten or simply fallen between the cracks. it allows dissemination of such work to a broader audience. it is an excellent way for scholars to build up a personal archive of work. blogging can invigorate a scholar’s writing and boost their productivity. it allows individual scholars to create, manage, and actively shape their online identity; that is, to literally write themselves into being (sundén, ). blogging also has pedagogical utility by providing reading material for classes and facilitating faculty-student co-production of knowledge. thus, the question for scholars is not whether to blog, but rather, how to find the right balance between traditional forms of writing and other forms of digital engagement.

references

amin, a. ( ). the good city. urban studies (edinburgh, scotland), ( - ), – . doi: . /
bonetta, l. ( ). scientists enter the blogosphere. cell, ( ), – . doi: . /j.cell. . . pmid:
chase, j., crawford, m., & kaliski, j. ( ). everyday urbanism. new york: the monacelli press.
cohen, d. ( ). professors, start your blogs. retrieved march , , from http://www.dancohen.org/blog/posts/professors_start_your_blogs
cole, j. ( ). blogging current affairs history. journal of contemporary history, ( ), – . doi: . /
hogan, b., & quan-haase, a. ( ). persistence and change in social media: a framework of social practice. bulletin of science, technology & society, ( ), – . doi: . /
hurt, c., & yin, t. ( ). blogging while untenured and other extreme sports. wash. ul rev., , – .
kaufman, s. e. ( ). an enthusiast’s view of academic blogs. inside higher ed, november.
kim, y.-m., & abbas, j. ( ).
adoption of library . functionalities by academic libraries and users: a knowledge management perspective. journal of academic librarianship, ( ), – . doi: . /j.acalib. . .
kostof, s. ( ). the city shaped: urban patterns and meanings through history. london: bulfinch press.
lenhart, a., purcell, k., smith, a., & zickuhr, k. ( ). social media and young adults. the pew internet and american life project. the pew internet and american life project website: http://www.pewinternet.org/reports/ /social-media-and-young-adults.aspx
lindgren, j. ( ). is blogging scholarship-why do you want to know. wash. ul rev., , .
meyers, k. ( ). saa : blogging in archaeology, week . msu campus archaeology program. retrieved march , , from http://campusarch.msu.edu/?p=
nielsen online reports. ( ). buzz in the blogosphere: millions more bloggers and blog readers. retrieved march , , from http://www.nielsen.com/us/en/newswire/ /buzz-in-the-blogosphere-millions-more-bloggers-and-blog-readers.html
parry, d. ( ). be online or be irrelevant. retrieved march , , from http://academhack.outsidethetext.com/home/ /be-online-or-be-irrelevant/
quan-haase, a. ( ). research and teaching in real-time: / collaborative networks. in d. rasmussen neal (ed.), social media for academics (pp. – ). sawston, uk: chandos. doi: . /b - - - - . -
raper, v. ( , january , ). science blogging and tenure. sciencemag. retrieved march , , from http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/ _ _ /caredit.a
richardson, h. cox ( ). what blogging, twitter, and texting do for the historian’s craft. the historical society. retrieved march , , from http://histsociety.blogspot.ca/ / /what-blogging-twitter-and-texting-do.html
saitta, d. ( ). stephen jay gould: in memoriam. rethinking marxism, ( ), – . doi: . /
sandercock, l. ( ). cosmopolis ii: mongrel cities in the st century. london: continuum.
scale, m. s., & quan-haase, a. (in press).
categorizing blogs as information sources: implications for libraries and information science. in m. khosrow-pour (ed.), encyclopedia of information science and technology, rd e. hershey, nj: igi global.
sprain, l., endres, d., & petersen, t. r. ( ). research as a transdisciplinary networked process: a metaphor for difference-making research. communication monographs, ( ), – . doi: . / . .
sundén, j. ( ). material virtualities: approaching online textual embodiment. new york: peter lang publishing.
wood, p., & landry, c. ( ). the intercultural city. london: earthscan.

pierre clavel is professor emeritus, department of city and regional planning, cornell university.

dean j. saitta is professor of anthropology and director of the urban studies program at the university of denver.

center of pressure characteristics from quiet standing measures to predict the risk of falling in older adults: a protocol for a systematic review and meta-analysis

protocol, open access

flavien quijoux*, aliénor vienne-jumeau, françois bertin-hugault, marie lefèvre, philippe zawieja, pierre-paul vidal, and damien ricard

abstract

background: falling is the most common accident of daily living and the second most prevalent cause of accidental death in the world. the complex nature of risk factors associated with falling makes those at risk amongst the elderly population difficult to identify.
commonly used clinical tests have limitations when it comes to reliably detecting the risk of falling, but existing laboratory tests, such as force platform measurements, represent one way of overcoming these limitations. despite their widespread use, however, center of pressure (cop) signal analysis techniques vary and there is currently no consensus on which features should be used diagnostically. our objective is to identify, through a systematic review and meta-analysis, the cop characteristics of older adults (≥ years old) during quiet bipedal stance which will allow fallers to be distinguished from non-fallers.

methods: the systematic review will include both prospective and retrospective articles. five databases will be searched: pubmed, cochrane central, embase, and sciencedirect. in addition, a search of gray literature will be performed using google scholar and clinicaltrials.gov. searches will be circumscribed to include only older adults (aged over years) who underwent a bipedal quiet standing measure of their balance and for whom the number of falls was reported. two authors will independently assess the risk of bias for each included article using a -item checklist. funnel plots will be drawn to assess possible publication bias for each cop parameter. the results will be synthesized descriptively and a meta-analysis will be undertaken. when trial methodological heterogeneity is too great for pooling of the data into a meta-analysis, evidence strength will be evaluated using best evidence analysis.

discussion: despite the numerous advantages of posturography, the diversity of studies exploring balance in older fallers has led to uncertainty regarding the method’s ability to reliably identify fall-prone older adults.
it is expected that the findings from this systematic review will help clinicians use bipedal quiet standing measures as a diagnostic test and allow researchers to explore cop characteristics to create better models for fall prevention care.

systematic review registration: prospero crd

keywords: older adults, fallers, quiet standing, cop, prediction, risk of falling

© the author(s). open access this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the creative commons public domain dedication waiver (http://creativecommons.org/publicdomain/zero/ . /) applies to the data made available in this article, unless otherwise stated.

* correspondence: f.quijoux@orpea.net
cnrs, umr cognition and action group, paris, france
orpéa group, puteaux, france
full list of author information is available at the end of the article

quijoux et al. systematic reviews ( ) : https://doi.org/ . /s - - -

background

the risk of falling amongst older people is a major healthcare issue representing one of the primary causes of injury and death in this demographic group [ ]. in elderly people, even non-lethal falls can often lead to severe injuries such as hip fracture or traumatic brain injury [ – ]. over % of people aged over and % of those over years will fall multiple times in the course of year [ ].
this risk and the consequences could be reduced by improving the screening tools used to detect fall-prone older people. at the time of writing, most commonly used clinical screening tools, such as the timed up and go test (tug), stratify, performance oriented mobility assessment (poma), and the berg balance scale (bbs), are retrospectively correlated to a patient’s fall history from previous months [ ], but they have been repeatedly shown to lack the sensitivity and accuracy needed to be used prospectively to identify fall-prone older adults [ – ]. furthermore, these tests are also usually unable to follow changes in balance capacity with age in older people or the kind of changes that occur in the early stages of neurological diseases [ ]. the bbs, for example, requires an eight-point downgrade over points in order to be meaningful [ , ], and it is also prone to errors due to both floor and ceiling effects [ , ]. such drawbacks can also be seen in other tests whose subcomponents cannot be separated, which also makes them unsuitable for highly handicapped patients [ ]. other clinical tests, such as the stratify tool and the tug test, do not have well-defined thresholds for classifying patients as fallers or non-fallers. due to these weaknesses, the predictive sensitivity and specificity of these tests are lowered [ ]. as a consequence, they are more likely to be used as mere fall history questionnaires [ ]. in addition to their subjective nature, results from clinical tests to evaluate balance need to be combined in order to identify the risk of falling [ ]. finally, even if the aforementioned tests are widely available in gerontology services, they cannot be used to discriminate fallers from non-fallers [ ].

quantitative posturographic tests, however, which assess balance by recording center of pressure (cop) oscillations [ ], could provide a means to overcome these issues.
the cop signal, usually assessed with force platforms, contains features that allow characterization of a patient’s postural strategies and modifications [ , ]. posturography also provides additional information on specific balance control mechanisms [ ] and thus constitutes a clinically useful tool to identify those at risk of falling [ ]. a better understanding of stabilization responses should therefore allow a more targeted management of the causes of imbalance in older people [ ]. cop analysis has been used to determine motor strategies for fall prevention [ , ], to reliably distinguish pathologies [ ] and to link fear of falling with posturographic parameters [ ]. studies have indicated that some sway characteristics of a quiet stance, especially in the mediolateral direction, are significantly different between non-fallers and fallers and could therefore be good indicators of those at increased risk of future falls [ ]. amongst healthy, older adults who live in the community, balance and sway measurements have been shown to be strong predictors of fall risk [ , ].

despite this work, however, to date, there has been no study to summarize those cop features which best discriminate fallers from non-fallers amongst older people aged over . in , piirtola and era [ ] concluded that some cop parameters during bipedal quiet stance could help to predict risk of falls in the elderly. unfortunately, the results of the nine articles included were contradictory and the measurement protocols used varied widely. similarly, the narrative review by pizzigalli et al. [ ] reported some cop parameters as fall risk predictors. however, the contradictory results and the absence of quantitative analysis in these two articles limit the application of their conclusions in clinical practice.
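as an illustration of the kind of sway characteristics discussed above, the sketch below computes a few descriptors that are commonly derived from a cop trajectory: range and rms amplitude in the mediolateral (ml) and anteroposterior (ap) directions, total path length, and mean velocity. the choice of descriptors, the function name `sway_descriptors`, the units, and the sample values are assumptions made for illustration only; they are not the parameters that this review will select.

```python
import math


def sway_descriptors(ml, ap, fs):
    """Summarize a CoP trajectory.

    ml, ap: equal-length sequences of CoP positions (assumed here to be in mm)
            along the mediolateral and anteroposterior axes.
    fs:     sampling frequency in Hz (assumed constant).
    """
    n = len(ml)
    if n != len(ap) or n < 2:
        raise ValueError("ml and ap must have the same length (at least 2 samples)")
    if fs <= 0:
        raise ValueError("fs must be positive")

    # Center each axis on its mean so that RMS reflects sway amplitude only.
    mean_ml = sum(ml) / n
    mean_ap = sum(ap) / n
    rms_ml = math.sqrt(sum((x - mean_ml) ** 2 for x in ml) / n)
    rms_ap = math.sqrt(sum((y - mean_ap) ** 2 for y in ap) / n)

    # Path length: sum of distances between consecutive samples.
    path = sum(
        math.hypot(ml[i + 1] - ml[i], ap[i + 1] - ap[i]) for i in range(n - 1)
    )
    duration = (n - 1) / fs  # seconds covered by the recording

    return {
        "range_ml": max(ml) - min(ml),
        "range_ap": max(ap) - min(ap),
        "rms_ml": rms_ml,
        "rms_ap": rms_ap,
        "path_length": path,
        "mean_velocity": path / duration,
    }


# Hypothetical three-sample recording at 2 Hz, for illustration only.
example = sway_descriptors([0.0, 3.0, 0.0], [0.0, 4.0, 0.0], fs=2.0)
```

reporting the ml and ap directions separately keeps a direction-specific difference between fallers and non-fallers visible, which a single combined amplitude would hide.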
we hope that a more exhaustive literature search, and a quantitative study based on different recording protocols, will establish which parameters, and under what conditions, are associated with an increased risk of falling. we will seek to minimize protocol heterogeneity in order to draw conclusions that can be applied in practice. a bipedal quiet stance is a simple test to study balance motor strategy in older adults [ , ] that, unlike unipedal or more complex tests, is more inclusive for an older population as it has a reduced incidence of participant exclusion due to falls during recording [ , ]. nevertheless, ways exist to make the test more challenging: one can add a double cognitive task [ ], use a soft support with a foam pad [ , ], or ask the participant to close their eyes [ ].

therefore, the main aim of this systematic review is to extract the best biomarkers from cop bipedal quiet stance displacement data in order to (retrospective study) distinguish fallers from non-fallers and so (prospective study) predict fall risk. the second aim is to evaluate the accuracy of currently available predictive and classification models using these biomarkers.

objectives

this systematic review protocol was designed to address the following questions:

– which features of the statokinesigram in older patients (≥ years) during a bipedal quiet stance test differ between fallers and non-fallers?
– how well can the risk of falling in older adults be predicted from cop characteristics and analysis?
– which parameters should be included in a predictive or a classification model of fall risk assessment for an older population?

methods

research protocol

this literature search and analysis was designed according to the prisma (preferred reporting items for systematic reviews and meta-analyses) [ ] and moose (meta-analysis of observational studies in epidemiology) [ ] guidelines. this protocol was registered in the prospero database under the number crd .

search strategy

an electronic database search of titles and abstracts will be performed between march and july to identify all published articles that include fall data for older people and their cop recordings. five databases (pubmed, cochrane central, embase, and sciencedirect) will be used as sources for published articles. the search will be performed for articles published without date restriction until july , , using associations of keywords (table ) from the pico methodology. the following mesh terms will also be used: “accidental falls/prevention & control,” “accidental falls/statistics & numerical data*,” “aged,” “postural balance/physiology*,” “posture/physiology*,” “predictive value of tests,” and “regression analysis.” the main database search will be supplemented by a review of gray literature which will be conducted through web searches on google scholar and clinicaltrials.gov. in addition, all reference lists and bibliographies of included studies will themselves be reviewed for relevant studies that were not picked up through any electronic search.

inclusion and exclusion criteria

randomized control trials (rcts), non-randomized control trials, and observational studies will all be eligible for inclusion. due to the risk of bias arising from only including data from published rcts [ , ], data from gray literature will also be included provided that they have met the inclusion criteria (table ). exclusion criteria will also be set (see table ).

paper review process

potentially eligible studies will be screened for inclusion eligibility independently by two review authors (fq and av) based on their title, abstract, and full text.
articles will first be imported into the zotero® bibliographic database (corporation for digital scholarship and the roy rosenzweig center for history and new media, usa) before screening so that all articles can be reviewed from the same source in order to select those that meet the criteria. if there is disagreement between the reviewers, the study will be discussed until a consensus is reached. papers that are eligible will then be subjected to data extraction and a “risk of bias” evaluation, as described below.

risk of bias evaluation
a quality/risk of bias assessment will be performed using a -item checklist based on the work of downs et al. [ ] (additional file ). the checklist will retain items unchanged from the previous version of this checklist [ ], while another three items will be removed and two extra items added. the final risk of bias assessment will also include a further six items from the original checklist that have been modified in order to evaluate the reliability of both the cop measures and the predictive models. in order to create and modify items, the critical appraisal skills programme (casp) checklist for evaluating a clinical prediction rule (v . . ) will be used. quality assessment for each article will be performed by two assessors (fq and av), and each assessor will be blind to the score given by the other until both have completed the evaluation. any disagreement over the final score for each article will be discussed; if no agreement can be reached, the rounded mean of both scores will be used.

table . keywords from the p.i.c.o. framework (components are combined with and; keywords within a component are combined with or)
population: “older adults”, “community-dwelling people”, “elderly”, “seniors”, “outpatient”, “fall prone elder”, “nursing home”, “institutional care”
intervention: “balance”, “equilibrium”, “quiet standing”, “stance”, “standing”, “stability”, “posture”, “postural stability”
comparison: “posturography”, “fall*”, “risk of falling”, “center of pressure”, “centre of pressure”, “cop trajectory”, “cop displacement”, “sway”, “statokinesigram”, “stabilogram”, “force platform”
outcomes: “predict*”, “diagnos*”, “classif*”, “disting*”, “differenc*”

table . inclusion criteria (domain, followed by the explicit criteria)
general criteria
- published before july , .
- related to the main topic: “the risk of falling in elderly people.” articles not related to this topic will not be included, based on the two-reviewer evaluation system.
language criteria
- no language criteria are applied. however, for non-french, non-english, or non-spanish articles, we will contact professional translators if no french, spanish, or english version is found. such translations will be indicated in the main article.
- all full papers will be retrieved (or translated) and used.
type-of-study criteria
- retrospective and prospective clinical trials, randomized or not.
- observational, time series, and cross-sectional studies.
participants criteria
- older patients (aged ≥ years) considered to be otherwise healthy, that is, without a neurological disease that could affect their posture, as determined by a diagnostic assessment (or any specification from the authors). such diseases include (but are not limited to) parkinson disease (pd), multiple sclerosis (ms), hemiplegia, paraplegia, stroke, or brain trauma. participants with orthopedic disorders affecting balance, such as recent arthroplasty or amputation, will also not be included in the review.
intervention criteria
- articles analyzing balance through cop recordings during quiet standing with both feet on the ground and evaluating the risk of falling by the number of falls during a period of time (retrospectively or prospectively).
- any article measuring the risk of falling without an estimation of the number of falls per participant (i.e., indirect assessment through fear of falling tests or epidemiologic data only) or not related to the risk of falling (comparing elderly vs. young, for example) will be discarded.
- if training (e.g., exercise training or a physiotherapy program) is a part of the intervention, the article will be discarded unless a baseline of the quiet standing capacities is recorded. in this case, only the data from the baseline will be used.
comparison criteria
- fallers versus non-fallers (this can include “healthy elderly people” versus “fall prone elderly,” “low risk elderly” versus “high risk elderly,” “single fallers” versus “multiple fallers,” or “infrequent fallers” versus “recurrent fallers”).
outcomes criteria
- primary outcomes will be the features in the cop analysis and their differences between the groups (odds ratio for dichotomous outcomes and mean differences for continuous outcomes).
- secondary outcomes will be the precision of the prediction (or the model) of the risk of falling, such as sensitivity, specificity, area under the curve (auc) of receiver operating characteristic (roc) curves, number of true/false positives and negatives, positive predictive value (ppv), negative predictive value (npv), odds ratio, or other evaluation of the system.

table . exclusion criteria (domain, followed by the explicit criteria)
human criteria
- all animal or pendulum-based studies will be discarded.
intervention criteria
- all studies quantifying activities other than quiet standing (e.g., gait and equivalent, using a moving platform or moving environment for assessment, obstacle dodging, external destabilization, functional reach tests, one leg standing, or any form of balance assessment other than standing upright) will be discarded.
- the romberg coefficient (difference between eyes open and eyes closed) will be accepted, as will standing on foam if there is a comparison with a firm surface.
- cognitive tasks that do not require movement (e.g., counting or memorizing) will be accepted.
- a standardized posture is not an exclusion criterion but will be noted.
outcome criteria
- a cop recording is mandatory for inclusion. all studies that do not compute any parameter to quantify balance through cop data but focus on sway measurement only through a sway meter, a cumulative balance score (e.g., sensory organization test), or motion capture will be discarded. studies using center of mass (com) without a cop recording will be discarded too.
equipment criteria
- there are no equipment criteria as long as the research recorded cop displacement over time. force platforms, pressure insoles, or any other cop recording systems are all accepted but will be noted.
population criteria
- all studies including young (< years old), healthy people without a comparison group of older people will be discarded.
- the presence of a neurologic pathology that could influence posture will be an exclusion criterion.
- all studies including recently post-operative participants will be discarded.
comparison criteria
- all studies that do not compare elderly fallers and non-fallers but focus on methodological issues (e.g., cop feature reliability, force platform methodology and validation, biomechanical model validation) will be discarded.

data extraction and analysis
following inclusion of the articles for analysis, the text from each reference will be imported into microsoft excel (version , microsoft corp., redmond, wa) for data extraction. one assessor (fq) will extract and collate information following the recommendations of the joanna briggs institute reviewers’ manual [ ]. another assessor (av) will verify the extracted data from the included articles in order to confirm coherence of the data.
key characteristics to be extracted will include information about the study itself such as author(s), title, year of publication, inclusion and exclusion criteria, sample size, study methodology (retrospective or prospective fall evaluation), study duration, rate of falls, and mention of any adverse events that occurred during the study (additional file ). population characteristics will also be recorded, including demographic and biometric data such as participants’ gender, age, weight, height, bmi, and cognitive capacities (e.g., as assessed by a mini mental state examination, mmse). data gathered about the falls will include each study’s definition of a fall and how falls were evaluated, as well as the geographical location of the work (country, region, and establishment where the measures took place). quiet standing test parameters to be collected will include the test conditions for participants, such as whether they wore shoes or were barefoot, had their eyes open or closed, and were asked to use a comfortable or standardized foot position. for the test itself, data will be recorded on the type of standing surface used (e.g., firm or foam), whether it was a cognitive dual or single task, test duration, who performed the tests, the time interval between the different test parts, the data collection methods (type of tools, sampling frequency, and filter characteristics), and the cop features. for predictive (in prospective studies) or classification (in retrospective studies) models, their characteristics and level of accuracy will also be extracted when a statistical model has been used. when these data are unavailable from the main text, additional file will be examined for more information. when data on the force platforms or other kinds of equipment (such as the sampling frequency or the provider) are not available even in additional file , the specifications will be sought from other articles by the same author(s).
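the accuracy measures named among the secondary outcomes (sensitivity, specificity, ppv, and npv) all derive from a 2x2 table of predicted versus observed fall status, so they can be recomputed during extraction whenever a study reports the four counts. the following is a minimal python sketch under that assumption; the names `ConfusionCounts` and `accuracy_measures` are hypothetical and the counts are illustrative only, not data from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of predicted vs. observed fall status (hypothetical container)."""
    tp: int  # predicted faller, observed faller
    fp: int  # predicted faller, observed non-faller
    fn: int  # predicted non-faller, observed faller
    tn: int  # predicted non-faller, observed non-faller


def accuracy_measures(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as fractions in [0, 1]."""
    def ratio(num: int, den: int) -> float:
        # An empty denominator means the measure is undefined for this table.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),  # fallers correctly flagged
        "specificity": ratio(c.tn, c.tn + c.fp),  # non-fallers correctly cleared
        "ppv": ratio(c.tp, c.tp + c.fp),          # flagged participants who fell
        "npv": ratio(c.tn, c.tn + c.fn),          # cleared participants who did not fall
    }


if __name__ == "__main__":
    # Illustrative counts only.
    example = ConfusionCounts(tp=30, fp=10, fn=20, tn=40)
    for name, value in accuracy_measures(example).items():
        print(f"{name}: {value:.2f}")
```

recomputing the measures from raw counts, rather than copying reported percentages, makes it easier to spot studies whose reported accuracy does not match their own tables.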
for experimental studies, the available cop data will be extracted from the baseline measurements that were taken before any intervention had been implemented, as long as the history of falls is also available (retrospective classification). if the cop parameters before the intervention are not included, the article will not be analyzed. for observational studies with prospective evaluation of falls, data recorded before the follow-up assessment will be used in the analysis; if measurements were not performed before follow-up, the article will be excluded. using software (like plot digitizer) to obtain data from figures was not considered as an option to extract data since this technique has been shown to be flawed concerning inter-rater reliability, with only a % agreement between both raters and an agreement of % with the original data even for trained raters [ ]. in addition to the time required for two authors to extract data independently, there is no guidance for this kind of extraction so far [ ]. therefore, we prefer not to extract data from graphs, to avoid introducing new biases. finally, authors will be contacted via e-mail up to three times to request missing data when they are not available in the main text or from other sources as described above.

strategy for data synthesis
extracted data from included articles will be presented descriptively, especially study characteristics, population characteristics, cop features used, and the risk of bias. the risk of bias will be assessed using the value of the percentage scores from the -item checklist: score distribution will also be studied to look for a gaussian distribution or, on the contrary, a trend in favor of the studies included in the meta-analysis. the quality scores will also be used as a parameter of the cop heterogeneity level in the meta-analysis. for pooling predictor data from cop recordings, at least three studies must have used the same feature.
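the three-study rule for pooling can be applied mechanically once each extracted cop feature is tagged with the study it came from. the sketch below is one possible way to do this in python; the function name, the record layout, and the feature names in the usage example are assumptions made for illustration and are not part of the protocol.

```python
from collections import defaultdict

MIN_STUDIES = 3  # pooling threshold stated in the protocol


def poolable_features(records):
    """Select features reported by at least MIN_STUDIES distinct studies.

    records: iterable of (study_id, feature_name) pairs (hypothetical layout).
    Returns a dict mapping each eligible feature to its sorted study ids.
    """
    studies_by_feature = defaultdict(set)
    for study_id, feature in records:
        # A set ignores repeated reports of one feature within the same study.
        studies_by_feature[feature].add(study_id)
    return {
        feature: sorted(studies)
        for feature, studies in studies_by_feature.items()
        if len(studies) >= MIN_STUDIES
    }


if __name__ == "__main__":
    # Illustrative records only; feature names are placeholders.
    extracted = [
        ("study_a", "sway_area"), ("study_b", "sway_area"), ("study_c", "sway_area"),
        ("study_a", "mean_velocity"), ("study_b", "mean_velocity"),
    ]
    print(poolable_features(extracted))
```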
if the included studies show consistency between their protocols, particularly with regard to the homogeneity of patient populations and the quiet standing test conditions, a meta-analysis of the aggregated data will be considered. for features that cannot be aggregated into a meta-analysis, a “best evidence synthesis” will be the preferred method of evaluating the strength of the studies’ evidence [ ]. if data cannot be aggregated into a meta-analysis or if the results seem contradictory, the best evidence analysis will favor articles with the highest score in the risk of bias assessment. particular care will be taken to ensure that the methodological quality of the studies and the consistency of their results are reported.

if a meta-analysis is indicated, the method will follow the cochrane collaboration handbook recommendations [ ]. means and standard deviations (sd) of measures will be used to compare the effect size of each parameter on the risk of falling and to allow the creation of forest plots. if sd data remain unavailable, even after contacting the authors, but standard errors or confidence intervals are available, we will calculate standard deviation values [ ]. effect size (es) will be calculated using eq. 1 [ , ]:

$$\mathrm{ES} = \left(1 - \frac{3}{4(n_1 + n_2) - 9}\right)\frac{\bar{y}_1 - \bar{y}_2}{s} \tag{1}$$

$\mathrm{ES}$ is the unbiased effect size proposed by hedges, corrected for the sample sizes $n_1$ and $n_2$; $\bar{y}_1$ and $\bar{y}_2$ are the means of each group and $s$ is the pooled within-group standard deviation. the estimated within-study variance of $\mathrm{ES}$ is computed from eq. 2:

$$\hat{\sigma}^2 = \frac{n_1 + n_2}{n_1 n_2} + \frac{\mathrm{ES}^2}{2(n_1 + n_2)} \tag{2}$$

assuming a fixed-effects model, the weighting coefficient will be computed from eq. 3:

$$\hat{w}_{FE} = \frac{1}{\hat{\sigma}^2} \tag{3}$$

if a random-effects model is preferred, the weighting coefficient will be computed from eq. 4:

$$\hat{w}_{RE} = \frac{1}{\hat{\sigma}^2 + \hat{\tau}^2} \tag{4}$$

$$\hat{\tau}^2 = \frac{Q - (k - 1)}{c} \tag{5}$$

in eq. 5, $\hat{\tau}^2$ is the estimated between-studies variance; $Q$ is the heterogeneity statistic of the $k$ independent studies and $c$ the coefficient computed from eq. 6:

$$c = \sum \hat{w}_{FE} - \frac{\sum \hat{w}_{FE}^2}{\sum \hat{w}_{FE}} \tag{6}$$

a fixed-effects model will be chosen if the heterogeneity is low to moderate ($I^2$ < %) [ ]; otherwise, a random-effects model will be used. finally, as shown in eq. 7, the data will be pooled for meta-analysis in case of clinical, methodological, and statistical homogeneity to assess the mean effect size of a cop feature according to:

$$\overline{\mathrm{ES}} = \frac{\sum \hat{w}\,\mathrm{ES}}{\sum \hat{w}} \tag{7}$$

confidence in cumulative evidence
sensitivity analyses will explore the impact of recording settings on the cop results during the quiet standing measurement, such as whether patients had their eyes open or closed, their foot position, and the firmness of the standing surface, as well as whether the study was prospective or retrospective. the impact of cop measurement variability, due to factors like recording duration or sampling frequency [ ], will also be discussed. inter- and intra-participant reliability for the different cop parameters will also be discussed in order to assess their usefulness in clinical practice [ – ]. if the data are detailed enough, the causes of falls will be investigated further to determine whether external factors independent of balance disorders were involved in the fall/non-fall status; such external factors could weaken the overall ability of cop measures to predict falls. if the heterogeneity for a given cop parameter within the meta-analysis is too great (as measured by $I^2$ > %), we will test whether it decreases when studies that use a particular cop recording configuration (for example, different equipment from the other studies included for this parameter) are removed; the decrease in heterogeneity will then be discussed in relation to the study or studies removed. if subgroups exist, e.g., recurrent fallers vs infrequent fallers, microsoft excel (ibid.) will be used for their analysis.
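the effect-size and weighting equations above can be chained into a single pooling routine. the python sketch below is an illustration under stated assumptions, not the authors' implementation: it takes q to be the standard cochran statistic computed with the fixed-effects weights, truncates the between-studies variance at zero (the usual convention, which the protocol does not spell out), and reports i² as (q − (k − 1))/q expressed as a percentage. all function names are hypothetical.

```python
def hedges_g(mean1, mean2, pooled_sd, n1, n2):
    """Unbiased standardized mean difference with the small-sample correction."""
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return correction * (mean1 - mean2) / pooled_sd


def es_variance(es, n1, n2):
    """Estimated within-study variance of the effect size."""
    return (n1 + n2) / (n1 * n2) + es ** 2 / (2.0 * (n1 + n2))


def pool(effects, variances):
    """Pool per-study effect sizes under fixed- and random-effects models."""
    k = len(effects)
    w_fe = [1.0 / v for v in variances]  # fixed-effects weights
    sum_w = sum(w_fe)
    es_fe = sum(w * e for w, e in zip(w_fe, effects)) / sum_w

    # Assumption: Q is Cochran's statistic based on the fixed-effects weights.
    q = sum(w * (e - es_fe) ** 2 for w, e in zip(w_fe, effects))
    c = sum_w - sum(w ** 2 for w in w_fe) / sum_w

    # Assumption: negative variance estimates are truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    es_re = sum(w * e for w, e in zip(w_re, effects)) / sum(w_re)
    return {"es_fe": es_fe, "es_re": es_re, "q": q, "tau2": tau2, "i2": i2}


if __name__ == "__main__":
    # Illustrative summary statistics only: (mean1, mean2, pooled sd, n1, n2).
    studies = [(12.0, 10.0, 4.0, 10, 10), (11.0, 10.5, 3.0, 25, 30), (15.0, 11.0, 5.0, 18, 20)]
    effects = [hedges_g(*s) for s in studies]
    variances = [es_variance(e, s[3], s[4]) for e, s in zip(effects, studies)]
    print(pool(effects, variances))
```

the routine returns both pooled estimates together with i², so the choice between the fixed-effects and random-effects result can be made afterwards against whichever heterogeneity threshold is adopted.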
if enough rcts and interventional studies can be included, the overall quality of the evidence for each outcome will be presented using the grade (grading of recommendations, assessment, development and evaluation) criteria as per the cochrane collaboration [ ]. otherwise, the cumulative evidence will be assessed using our own rating system, which is based on the grade system and was created to overcome the limitations of using grade on non-interventional observational studies. this system will give a score for each outcome based on (1) the mean risk of bias from every study included for that outcome, (2) the total number of studies used to pool the data, (3) a classification of heterogeneity from low to high, and (4) the overall sample size (table ). each outcome can then be graded as either “high,” “moderate,” “low,” or “very low.”

to visualize possible publication bias, funnel plots will be used to represent the estimated effect size of each article against its standard error, which is plotted on the vertical axis. a symmetric inverted funnel shape suggests no publication bias. a funnel plot will be drawn for each cop parameter with respect to the type of study (retrospective or prospective).

discussion
this systematic review is expected to provide a valuable means of predicting and so preventing falls in older individuals by providing robust, evidence-based guidelines for the clinical and laboratory evaluation of risk of falling using a simple and reliable bipedal cop test.

the proposed study will retrieve and extract data from clinical trials and observational studies. it will report the spatio-temporal parameters of the center of pressure displacements during a bipedal quiet standing task in older people who are then classified as “fallers” and “non-fallers.” we have purposefully
unipedal tests, which are more difficult to perform, tend to exclude frailer individuals who find themselves unable to stand on one leg [ ]. we do not think that conducting a sensitivity study based on this subgroup of people would be feasible due to a lack of individual data. we also chose to focus only on bipedal tests to reduce the diversity of recording methods used in the articles analyzed; including other methods for other tests would only further complicate the task of analyzing such already-heterogeneous data to obtain reliable results. finally, we consider it possible that the motor strategies used to maintain balance during a one-legged stance are different from those used during bipedal stance [ ] and, hence, a multivariate analysis of unipedal cop tests would be more suited as the topic of a separate, equally specific, systematic review.

non-systematic reviews from other publications in this field have indicated that the reliability of the bipedal cop measurements appears to be high across the different study protocols [ , , ], and it thus seems reasonable to assume that the repetition of measures will only increase this reliability. biomechanical factors (such as height and weight) and acquisition settings are known to have a moderate to high influence on cop parameters [ , ], and so particular attention will need to be paid to these factors in order to pool the data without bias.

one conceivable, and potentially major, limitation of this systematic review would be a lack of this participant and test protocol information in the included articles. in particular, fall circumstances can be key confounding variables: some cop measures might be associated with falls only under particular circumstances and not others. for parameters where the data are available, we will carry out a sub-analysis stratified by fall circumstances.
we will also try to reduce these risks of bias by taking into account the quality of each study and by extracting information regarding the definition and evaluation of “a fall,” as well as data about adverse events gathered during the follow-up after each acquisition protocol.

additional file
additional file : -items quality checklist. extracted data ordered by domain of interest. (docx kb)

abbreviations
auc: area under the curve; bmi: body mass index; com: center of mass; cop: center of pressure; grade: grading of recommendations, assessment, development, and evaluation; mmse: mini-mental state examination; ms: multiple sclerosis; pd: parkinson’s disease; rct: randomized control trial; roc: receiver operating characteristic

acknowledgements
the authors would especially like to thank jean-philippe régneaux from the cochrane collaboration for his help and advice in the preparation of this protocol. we would also like to thank jennifer dandrea palethorpe for her review of the english language to ensure the quality of this manuscript.

authors’ contributions
fq, av, pz, fbh, ml, ppv, and dr collaborated to develop and refine the conception, design, and writing of the study protocol. fq, av, and dr contributed to the search strategy and the quality appraisal. fq and av wrote the draft of the study protocol with inputs from pz and dr. all authors critically reviewed the manuscript and approved the final version.

funding
in the context of a cifre thesis, this systematic review with meta-analysis protocol has been funded by orpea group.

availability of data and materials
not applicable.

ethics approval and consent to participate
not applicable.

consent for publication
not applicable.

competing interests
the authors declare that they have no competing interests.

author details
cnrs, umr cognition and action group, paris, france. orpéa group, puteaux, france. hangzhou dianzi university, hangzhou , zhejiang, china.
service de neurologie de l’hôpital d’instruction des armées de percy, service de santé des armées, clamart, france. ecole du val-de-grâce, service de santé des armées, paris, france.

received: november accepted: august
https://doi.org/ . /s - - -

table . cumulative evidence scale (low is rated , moderate , and high for each item. the final rating is very low (< ), low ( – ), moderate ( – ), high (> ))
quality | risk of bias score (mean of the -score) | number of studies (n) | heterogeneity (i²) | cumulative sample size
high | > | > | < % (low heterogeneity) | >
moderate | – | – | – % (moderate) | –
low | < | – | > % (high heterogeneity) | <
score | | | |
total | | | |

references
. world health organization, editor. who global report on falls prevention in older age. geneva: world health organization; . . luukinen h, herala m, koski k, honkanen r, laippala p, kivelä s-l. fracture risk associated with a fall according to type of fall among the elderly. osteoporos int. ; ( ): – . . stevens ja, corso ps, finkelstein ea, miller tr. the costs of fatal and non-fatal falls among older adults. inj prev. ; ( ): – . . harvey la, close jct. traumatic brain injury in older adults: characteristics, causes and consequences. injury. ; ( ): – . . has. Évaluation et prise en charge des personnes âgées faisant des chutes répétées. . . schoene d, wu sm-s, mikolaizak as, menant jc, smith st, delbaere k, et al. discriminative ability and predictive validity of the timed up and go test in identifying older people who fall: systematic review and meta-analysis. j am geriatr soc. ; ( ): – . . ambrose af, cruz l, paul g. falls and fractures: a systematic approach to screening and prevention. maturitas. ; ( ): – . . da costa br, rutjes aws, mendy a, freund-heritage r, vieira er. can falls risk prediction tools correctly identify fall-prone elderly rehabilitation inpatients? a systematic review and meta-analysis. baradaran hr, editor. plos one. ; ( ):e . .
perell kl, nelson a, goldman rl, luther sl, prieto-lewis n, rubenstein lz. fall risk assessment measures: an analytic review. j gerontol a biol sci med sci. ; ( ):m – . . raîche m, hébert r, prince f, corriveau h. screening older adults at risk of falling with the tinetti balance scale. lancet. ; ( ): – . . gates s. systematic review of accuracy of screening instruments for predicting fall risk among independently living older adults. j rehabil res dev. ; ( ): . pmid: . . pajala s, era p, koskenvuo m, kaprio j, törmäkangas t, rantanen t. force platform balance measures as predictors of indoor and outdoor falls in community-dwelling women aged – years. j gerontol ser a. ; ( ): – . . yelnik a, bonan i. clinical tools for assessing balance disorders. neurophysiol clin neurophysiol. ; ( ): – . . downs s, marquez j, chiarelli p. the berg balance scale has high intra-and inter-rater reliability but absolute reliability varies across the scale: a systematic review. j physiother. ; ( ): – . . pardasaney pk, ni p, slavin md, latham nk, wagenaar rc, bean j, et al. computer-adaptive balance testing improves discrimination between community-dwelling elderly fallers and nonfallers. arch phys med rehabil. ; ( ): – .e . . blum l, korner-bitensky n. usefulness of the berg balance scale in stroke rehabilitation: a systematic review. phys ther. ; ( ): – . . mancini m, horak fb. the relevance of clinical balance assessment tools to differentiate balance deficits. eur j phys rehabil med. ; ( ): . . beauchet o, fantino b, allali g, muir sw, montero-odasso m, annweiler c. timed up and go test and risk of falls in older adults: a systematic review. j nutr health aging. ; ( ): – . . lusardi mm, fritz s, middleton a, allison l, wingood m, phillips e, et al. determining risk of falls in community dwelling older adults: a systematic review and meta-analysis using posttest probability. j geriatr phys ther. ; ( ): – . . barry e, galvin r, keogh c, horgan f, fahey t. 
is the timed up and go test a useful predictor of risk of falls in community dwelling older adults: a systematic review and meta-analysis. bmc geriatr. ; ( ): . . błaszczyk jw. the use of force-plate posturography in the assessment of postural instability. gait posture. ; : – . . bauer c, gröger i, rupprecht r, meichtry a, tibesku co, gaßmann k-g. reliability analysis of time series force plate data of community dwelling older adults. arch gerontol geriatr. ; ( ):e – . . berger l, chuzel m, buisson g, rougier p. undisturbed upright stance control in the elderly: part . postural-control impairments of elderly fallers. j mot behav. ; ( ): – . . hof al. the equations of motion for a standing human reveal three mechanisms for balance. j biomech. ; ( ): – . . williams hg, mcclenaghan ba, dickerson j. spectral characteristics of postural control in elderly individuals. arch phys med rehabil. ; ( ): – . . maki be, mcilroy we. the role of limb movements in maintaining upright stance: the “change-in-support” strategy. phys ther. ; ( ): – . . maki be, mcilroy we. control of rapid limb movements for balance recovery: age-related changes and implications for fall prevention. age ageing. ; (suppl_ ):ii – . . hsiao-wecksler et, katdare k, matson j, liu w, lipsitz la, collins jj. predicting the dynamic postural control response from quiet-stance behavior in elderly adults. j biomech. ; ( ): – . . könig n, taylor wr, baumann cr, wenderoth n, singh nb. revealing the quality of movement: a meta-analysis review to quantify the thresholds to pathological variability during standing and walking. neurosci biobehav rev. ; : – . . dueñas l, balasch i bernat m, mena del horno s, aguilar-rodríguez m, alcántara e. development of predictive models for the estimation of the probability of suffering fear of falling and other fall risk factors based on posturography parameters in community-dwelling older adults. int j ind ergon. ; : – . . 
pizzigalli l, micheletti cremasco m, mulasso a, rainoldi a. the contribution of postural balance analysis in older adult fallers: a narrative review. j bodyw mov ther. ; ( ): – . . piirtola m, era p. force platform measurements as predictors of falls among older people – a review. gerontology. ; ( ): – . . bigelow ke, berme n. development of a protocol for improving the clinical utility of posturography as a fall-risk screening tool. j gerontol a biol sci med sci. ; ( ): – . . masani k, popovic mr, nakazawa k, kouzaki m, nozaki d. importance of body sway velocity information in controlling ankle extensor activities during quiet stance. j neurophysiol. ; ( ): – . . lichtenstein mj, shields sl, shiavi rg, burger mc. clinical determinants of biomechanics platform measures of balance in aged women. j am geriatr soc. ; ( ): – . . maki be, holliday pj, topper ak. fear of falling and postural performance in the elderly. j gerontol. ; ( ):m – . . bergamin m, gobbo s, zanotto t, sieverdes jc, alberton cl, zaccaria m, et al. influence of age on postural sway during different dual-task conditions. front aging neurosci. ; available from: http://journal.frontiersin.org/article/ . /fnagi. . /abstract. cited aug . . fujimoto c, egami n, demura s, yamasoba t, iwasaki s. the effect of aging on the center-of-pressure power spectrum in foam posturography. neurosci lett. ; : – . . hong sk, park jh, kwon sy, kim j-s, koo j-w. clinical efficacy of the romberg test using a foam pad to identify balance problems: a comparative study with the sensory organization test. eur arch otorhinolaryngol. ; ( ): – . . howcroft jd, kofman j, lemaire ed, mcilroy we. static posturography of elderly fallers and non-fallers with eyes open and closed. in: world congr med phys biomed eng june - tor can, vol. ; . p. – . . moher d, liberati a, tetzlaff j, altman dg, the prisma group. preferred reporting items for systematic reviews and meta-analyses: the prisma statement. ann intern med. ; ( ): – . .
stroup df, berlin ja, morton sc, et al. meta-analysis of observational studies in epidemiology: a proposal for reporting. jama. ; ( ): – . . russo mw. how to review a meta-analysis. gastroenterol hepatol. ; ( ): . . shamseer l, moher d, clarke m, ghersi d, liberati a, petticrew m, et al. preferred reporting items for systematic review and meta-analysis protocols (prisma-p) : elaboration and explanation. bmj. ; (jan ):g . . downs sh, black n. the feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. j epidemiol community health. ; ( ): – . . vienne a, barrois rp, buffat s, ricard d, vidal p-p. inertial sensors to assess gait quality in patients with neurological disorders: a systematic review of technical and analytical challenges. front psychol. ; available from: http://journal.frontiersin.org/article/ . /fpsyg. . /full. cited jun . . the joanna briggs institute. the joanna briggs institute reviewers’ manual : the systematic review of studies of diagnostic test accuracy: the joanna briggs institute; . available from: www.joannabriggs.org. cited mar . jelicic kadic a, vucic k, dosenovic s, sapunar d, puljak l. extracting data from figures with software was faster, with higher interrater reliability than manual extraction. j clin epidemiol. ; : – . . vucic k, jelicic kadic a, puljak l. survey of cochrane protocols found methods for data extraction from figures not mentioned or unclear. j clin epidemiol. ; ( ): – . . mathew sa, heesch kc, gane e, mcphail sm.
risk factors for hospital re-presentation among older adults following fragility fractures: protocol for a systematic review. syst rev. ; ( ) available from: http://systematicreviewsjournal.biomedcentral.com/articles/ . /s - - - . cited dec . . chandler j, higgins jp, deeks jj, davenport c, clarke mj. cochrane handbook for systematic reviews of interventions version . . (updated february ), cochrane, . . marín-martínez f, sánchez-meca j. weighting by inverse variance or by sample size in random-effects meta-analysis. educ psychol meas. ; ( ): – . . hedges lv. distribution theory for glass’s estimator of effect size and related estimators. j educ stat. ; ( ): . . higgins jp, thompson sg, deeks jj, altman dg. measuring inconsistency in meta-analyses. bmj. ; ( ): . . ruhe a, fejer r, walker b. the test–retest reliability of centre of pressure measures in bipedal static task conditions – a systematic review of the literature. gait posture. ; ( ): – . . worthen-chaudhari lc, monfort sm, bland c, pan x, chaudhari amw. characterizing within-subject variability in quantified measures of balance control: a cohort study. gait posture. ; : – . . baltich j, von tscharner v, zandiyeh p, nigg bm. quantification and reliability of center of pressure movement during balance tasks of varying difficulty. gait posture. ; ( ): – . . lin d, seol h, nussbaum ma, madigan ml. reliability of cop-based postural sway measures and age-related differences. gait posture. ; ( ): – . . ryan r, hill s. how to grade the quality of the evidence. cochrane consumers and communication group, editor. . . michikawa t, nishiwaki y, takebayashi t, toyama y. one-leg standing test for elderly populations. j orthop sci. ; ( ): – . . jonsson e, seiger a, hirschfeld h. one-leg stance in healthy young and elderly adults: a measure of postural steadiness? clin biomech bristol avon. ; ( ): – . . li z, liang y-y, wang l, sheng j, ma s-j.
reliability and validity of center of pressure measures for balance assessment in older adults. j phys ther sci. ; ( ): . . swanenburg j, de bruin ed, favero k, uebelhart d, mulder t. the reliability of postural balance measures in single and dual tasking in elderly fallers and non-fallers. bmc musculoskelet disord. ; ( ) disponible sur: http:// bmcmusculoskeletdisord.biomedcentral.com/articles/ . / - - - . cité mars . . chiari l, rocchi l, cappello a. stabilometric parameters are affected by anthropometry and foot placement. clin biomech. ; ( ): – . . schmid m, conforto s, camomilla v, cappozzo a, d’alessio t. the sensitivity of posturographic parameters to acquisition settings. med eng phys. ; ( ): – . publisher’s note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. quijoux et al. systematic reviews ( ) : page of http://systematicreviewsjournal.biomedcentral.com/articles/ . /s - - - http://systematicreviewsjournal.biomedcentral.com/articles/ . /s - - - http://systematicreviewsjournal.biomedcentral.com/articles/ . /s - - - http://bmcmusculoskeletdisord.biomedcentral.com/articles/ . / - - - http://bmcmusculoskeletdisord.biomedcentral.com/articles/ . / - - - http://bmcmusculoskeletdisord.biomedcentral.com/articles/ . 
a new polemic: libraries, moocs, and the pedagogical landscape – in the library with the lead pipe

aug nora almeida

“i mooc” photo by flickr user ilonkatallina (cc-by-nc-sa . )

in brief: the massive open online course (mooc) has emerged in the past few years as the poster child of the online higher education revolution. lauded and derided, moocs (depending on who you ask) represent the democratization of education on a global scale, an overblown trend, or the beginning of the end of the traditional academic institution. moocs have gained so much critical traction because they have succeeded in unmooring educational exchanges and setting them adrift in the sea of the internet.
although the mooc is a new and evolving platform, it has already upended facets of education in which librarians are heavily invested, including intellectual property, digital preservation, and information delivery and curricular support models. consequently, to examine the mooc as a microcosm is also to explore how the scope of academic librarianship is changing and will continue to change. librarians and information professionals (who serve as bibliographers, purchasing managers, access advocates, copyright and preservation experts, and digital pioneers on many campuses) are uniquely situated to mediate this disruption and to use this opportunity to develop strategies for navigating an environment in flux.

by nora almeida

surely some revelation is at hand

i just signed up for my first massive open online course (mooc), a class on “globalizing higher education and research for the knowledge economy,” co-taught by two university of wisconsin-madison faculty. the whole registration process took less than a minute and resembled countless other internet transactions i’ve conducted: i filled out a form with my name and email address, chose a password, checked a box indicating i agreed to their standard terms of service, and then clicked “sign up.” i did not have to, as i did in graduate school, log on at : am to ensure i could enroll in courses before they filled up, crossing my fingers as my browser refreshed. i did not have to worry about prerequisites, financial aid, or even when the course starts; they’ll send me an email reminder. moocs are the latest incarnation of the online higher education revolution, yet it is still too soon to tell whether they represent a real step towards the democratization of education, a fleeting phenomenon, or the dissolution of the academy as we know it.
what we do know is that the mooc—conceived in a perfect storm of open education, digital pedagogy, crowdsourcing, globalization cross-currents—is suddenly the centerpiece of discussions about the changing landscape of higher education. part of the fascination with moocs, for skeptics and champions alike, has to do with timing. although moocs have attracted millions of students and garnered unprecedented attention outside of higher education, jesse stommel ( ), digital humanist and founder of hybrid pedagogy, reminds us that the moocs phenomenon “didn’t appear last week, out of a void, vacuum-packed.” broad critical interest in moocs is partly due to a ricochet effect; education costs have peaked, enrollment numbers continue to grow, student loan debts are staggering, and the job market has been slow to rebound from a long recession (waldrop, ). while moocs are not a direct response or solution to these salient issues, they are part of the larger conversation that has emerged about the future of higher education; a future that almost certainly involves discussions about economics and changing relationships between technology, learning, and information. moocs  are not so different from other historical pedagogical innovations.  in fact, “a mooc isn’t a thing at all, just a methodological approach [and arguably, an emerging business model], with no inherent value except insofar as it’s being used” (stommel, ). and moocs are being used as critical instruments by scholars, librarians, op-ed columnists, publishers, programmers, bloggers, teachers, and students.  a mooc polarizes precisely because it is nebulous, less ‘a thing’ than a massive open umbrella term. the ‘mooc’ brand has become synonymous with such an exhausting variety of pedagogical modes—as long as they are delivered in a ‘massive’ ‘open’ ‘online’ format—that virtually all moocs arguments start as definitional arguments.  
those of us with a vested interest in how moocs are affecting higher education have a real stake in ensuring that the definition that sticks is one that we can stand behind. in practice, moocs can have vastly different pedagogical agendas, graphic design solutions, audiences, and objectives. moocs can be structured as traditional lectures, interactive discussions, or dynamic mixed-media environments. there are remedial moocs, professional development moocs, and recreational moocs. there are niche moocs on special topics and moocs on classical subjects ranging from poetics to physics. there are foundational moocs on the basics of academic writing and iterative moocs about pedagogical theory. there are even moocs about moocs. in spite of the spectrum of perspectives, variety of mooc incarnations, and the fact that the legitimacy of a mooc (essentially a scalable curricular support tool) as a true transformative technological phenomenon is debatable, moocs still deserve another look. here’s why: the exploration of the mooc as a catalyst for critical inquiry (a kind of operant) may offer some perspective on why higher education is changing and how librarians can play an active role in shaping what higher education becomes.

moocs as disruptive technology

at a spring oclc research conference, “moocs and libraries: massive opportunity or overwhelming challenge?,” jim michalko used the phrase “disruptive technology” to capture the systemic changes that moocs introduce into the way that universities and, by extension, university libraries work. the phrase “disruptive technology” was coined by clayton m. christensen in a harvard business review article to characterize the kind of game-changing innovations that can throw markets into a tailspin.
these technologies are disruptive in two senses: first, they are likely to catch on and change the direction of an industry fundamentally; second, they are difficult to integrate into established business models and are not immediately profitable (p. ). moocs ‘disrupt’ existing practices in higher education in both of these senses and have the capacity to alter the way we think and talk about higher education. moocs up-end a lot of foundational assumptions about what constitutes a ‘course’, what it means to be a ‘student’, and what constitutes an educational interaction. when basic, definitional precepts no longer apply, many institutional stakeholders left in the wake of disruption are wondering: where do we go from here? in the first place, we should recognize that the ‘mooc’ may be disruptive, but it is not unprecedented or isolated. this particular innovation is conceivable both as a technological outgrowth and as a product of american capitalistic dogma that tows adages about necessity and invention. as librarians, we have an opportunity to use this ‘crisis’ to reimagine our roles in the institutions and communities that are adopting moocs. we can begin by engaging with other institutional and community stakeholders and by building flexible infrastructures for information delivery, rights management, instruction, and curricular support that can withstand and even improve in the face of change. librarianship, which has undergone its fair share of ‘disruption’ in the past few decades, is a field that is (perhaps uniquely) primed for change. in the context of online instruction, librarians have new opportunities to expand the realm of their work.
in practice, this may mean taking on more active roles as co-instructors and content creators, educating faculty about open access scholarship, authoring best practice guidelines for intellectual property management, facilitating intra- and inter-institutional networks, or developing new controlled vocabularies and preservation protocols for archiving and repurposing moocs.

obstacles and implementation

we must recognize that any true ‘disruption’ introduces obstacles alongside opportunities. the legal hurdles to “making educational content available to people unaffiliated with traditional educational institutions” (vogl et al., , p. ) in partnership with businesses (namely, edx and coursera, currently the two leading platform providers) pose challenges for both institutional stakeholders and publishers. moocs also raise complex ethical questions about how partnerships with commercial entities may impact, complicate, or erode instructors’ intellectual property rights. logistically, providing an academic support infrastructure for students with different socio-economic and cultural backgrounds has proved to be a major hurdle, if mooc retention rates are any indicator. these and other challenges are only compounded by the scale of moocs, which boast enrollment numbers in the tens (and sometimes hundreds) of thousands.
implementation approaches so far have ranged from cautious to ambitious: penn state has been careful to differentiate between their five incubator moocs that “showcase faculty expertise and engage with prospective students from around the world” and their “online world campus,” where the focus is “helping traditional campus-based students to complete degree programs” (smutz, ); brown university’s instructional design team has involved “the university counsel’s office, media services, and the university library” in mooc implementation decisions (howard, ); stanford university’s center for legal informatics has developed a scalable intellectual property exchange (sipx) “copyright registry, marketplace and clearing engine,” in part to support open online instruction and which they incorporated in spring (vogl et al., , p. ). most universities are approaching moocs with some trepidation and are not yet offering college credit or direct access to copyrighted resources.  there are some fledgling efforts to monetize moocs and offer accreditation options , a trend that is only likely to continue as moocs gain cultural and academic legitimacy. the trajectory seems headed towards a freemium business model with some options for certification or college credit. there has been some push-back against these efforts from academics who warn that accrediting moocs will affect american scholarship in ways that haven’t yet been examined.  some open education advocates have also voiced concern over the monetization of a model that is largely defined by its ‘open-ness’. although most moocs are not (yet) accredited, moocs have ignited debates about current accreditation processes and whether they stifle “new education paradigms” (dennis, , p. ) and should be reevaluated. 
for most universities, the focus is still on compiling data and analyzing the shifting software platforms and delivery protocols while simultaneously exploring possible implementation scenarios that weigh complex licensing, privacy, and cost facets. many universities, in recognition of the impact moocs have on different facets of education, are involving stakeholders from across campus and, in some cases, are using cross-institutional partnerships to develop best practices beyond a specific implementation scenario: “librarians from all of the edx partner institutions have formed two working groups […] one group is looking into the issue of access to content; the other is talking about the research skills that moocs require and how librarians can help students develop those skills” (howard, ). the association of research libraries weighed in on the topic in october with the release of “moocs legal and policy issues for research libraries,” which outlines “strategic considerations for research libraries” (butler, ). authored by brandon butler, director of public policy initiatives, this arl issue brief falls short of a formal best practices guide but asserts that libraries, which already have established curricular support and copyright advisory roles on many campuses, can help shape “the way their parent and partner institutions approach the mooc phenomenon” (butler, ). butler is conservative in his assessment of the potential impact that librarians may have on moocs and, in turn, of how innovations like moocs are affecting librarianship. take, for example, the recent announcement that syracuse ischool instructor r. david lankes will run a “new librarianship mooc” that addresses a “vision for a new librarianship [that goes] beyond finding library-related uses for information technology and the internet” (ross, ).
if this course and the general move in librarianship towards a hybrid instruction model are any indicator, one of the ways that librarians can play a more active role in shaping how ‘institutions approach the mooc phenomenon’ is through direct participation as students, instructors, and content creators. librarians can also build upon existing professional association infrastructures and create networks devoted to exploring online instruction and developing solutions to the problems it introduces. librarians, who have more disciplinary autonomy than departmental faculty, can also reach out to institutional stakeholders to spearhead mooc planning initiatives on their own campuses.

open access and the publishing racket

moocs, because they are part of a larger cohort of open education initiatives, offer an opportunity for inter-institutional information exchange and implicitly make a case for open access publishing. library journal contributor meredith schwartz ( ) notes that moocs are “helping with open access advocacy, as professors [involved with moocs] see the need to make their own writings accessible” (p. ). the trend towards open access that moocs promote by virtue of their open-ness has fittingly accelerated the pace of the critical dialogue about the mooc phenomenon itself; this recursive property demonstrates one of the ways moocs work to ‘disrupt’ publishing. situating a conversation about open scholarship on platforms ranging from ted videos, academic blogs, and newspaper editorials to autonomously released academic white papers, professional organization briefs, and peer-reviewed open access journals allows for a consolidation of different levels of discourse. the moocs conversation has fostered collaborations in digital communications as scholars and bloggers are able to come together to collectively comment on developments in online instruction and on each other’s comments, ad infinitum.
open access (oa) is not a new concept in higher education, but significant resistance from academic publishers, faculty, and institutions entrenched in inflexible publishing and resource delivery models has made the practical transition to oa difficult. in his book, open access (just released in an open access format after a one-year embargo), peter suber ( ) identifies “failure of imagination” (p. ) as the primary obstacle to oa adoption and notes that academics who “support oa in theory” often don’t “understand how to pay for it, how to support peer review, how to avoid copyright infringement, how to avoid violating academic freedom, or how to answer many other long-answered objections and misunderstandings” (p. ). academic librarians have long been oa advocates, in part because they have a better understanding of how much toll-access resources and licenses cost than many other departmental faculty do and because they are generally more aware of new oa initiatives and delivery platforms through exposure. in the arl issue brief, butler indicates that the new pedagogical context of a mooc may prompt institutions to develop “a new strategy of adopting carefully crafted open access policies” ( ). librarians can be (and often are) the primary drivers behind institutional oa initiatives by providing platforms for oa publishing and funding for faculty who publish in oa journals, by educating faculty about oa resources in their fields, and by negotiating flexible license terms with toll-access publishers. beyond oa publications, moocs have also begun to disrupt the academic publishing status quo. mooc students (i.e. millions of consumers worldwide with vested interests in educational resources) have prompted academic publishers to rethink their own delivery strategies.
in may , bookseller reported that several academic publishers—“cengage learning, macmillian higher education, oxford university press, sage, and wiley”—have begun “experimenting with offering coursera students versions of their e-textbooks” (page, , p. ). as with the moocs accreditation option, the option to access copyrighted resources (beyond authorized excerpts or previews) will likely develop into a freemium business model. the decision by select publishers to work with mooc platform providers and develop a delivery model that can work in a ‘massive’ ‘open’ context should not necessarily be viewed as a move towards oa, but rather an attempt by publishers to explore a (vast) new potential market.  however, it is encouraging that publishers are anticipating academic innovations and willing to rethink policies and delivery models. many academic libraries still accommodate restrictive licenses and expensive scholarship but rising access fees and shrinking acquisition budgets have prompted many libraries to look for sustainable alternatives. recent innovations in licensing models and oa peer review processes have already heralded major shifts on the information delivery horizon and this trend is only continuing.  as more publishers and content creators see oa as a viable alternative and as more rights holders develop creative solutions to provide affordable resources to new audiences in new contexts, content providers that refuse to adapt or join the conversation will likely be shut out of emerging markets. it has taken time and a shift in cultural attitudes towards oa publishing for many academics to stop equating cost and exclusivity with quality. however, oa advocates are optimistic that oa resources can increasingly “coexist” with “toll-access” publications (suber, , p. ). 
librarians can play an active role in this shift by engaging in faculty outreach, advocating for institutional adoptions of oa publishing and delivery infrastructures, and, in extreme cases, boycotting ‘toll-access’ providers who refuse to negotiate reasonable rates.

reimagining information and delivery

aside from prompting a shift to oa resources and heralding developments in the commercial publishing sector, moocs may implicitly change information delivery processes in other subtle ways. in a blog post on “moocs, distance education, and copyright,” kenneth crews ( ), director of the columbia copyright advisory office, indicates that within current copyright statutes there are creative solutions to copyright problems if we learn to ask the right questions. when it comes to information delivery options for copyrighted material, instructors should embrace flexibility and examine how some lesser-used exemptions (like the teach act) might apply to moocs. if we keep in mind that each mooc has a unique context and pedagogical methodology, it becomes clear that there is no blanket solution that can apply to every situation. maintaining an open dialog about digital rights that involves all stakeholders becomes paramount. kevin smith, the scholarly communications officer at duke university, underscores the importance of collaboration between librarians, “faculty and others on the production team to make sure that embedded materials are only what’s needed for the specific pedagogical purpose” (profitt, ). the advice smith offers here is relevant not only in terms of copyright compliance but also in terms of pedagogical culpability; shouldn’t course materials always have ‘a specific pedagogical purpose’?
if some of the obstacles presented by mooc platforms force a close evaluation of course content and instructional approach, the impact may extend to other (analog) educational contexts as well; this argument echoes sentiments that digital pedagogues have been advocating for years. whether we acknowledge it or not, the medium of the internet has changed the way that we interact with information and the sheer volume of text most of us sift through daily has changed how we read and absorb knowledge: “unlike a book […] a digital document exists in an electronic flux which is constantly being dissolved and reassembled for our consumption” (latham, , p. ).  we have more control over texts and over digitally delivered instructional content which can be manipulated to accommodate different kinds of learners.  scholar and open education advocate, dave cormier ( ), argues that “[n]ew communication technologies and the speeds at which they allow the dissemination of information” have changed how we codify knowledge and “has encouraged us to take a critical look at where [knowledge] can be found and how it can be validated.” cormier ( ), who has co-facilitated several moocs and is a proponent of social constructivism, has also warned that some of the conversation we should be having about changes in pedagogy and knowledge construction has been overshadowed by “a flurry of discussion about intellectual property rights.” it’s not that intellectual property rights aren’t important, but they are, in some respects, beside the point. to ignore the possibilities for critical scholarship introduced by digital publishing is to also ignore the pedagogical possibilities introduced by new kinds of textual interpretation, research processes, and “new techniques of reading no longer beholden to traditional interpretive authority” (latham, , p. ). many librarians find it difficult to reimagine information and its relationship to learning. 
however, such a reimagining will free us from reliance on outmoded information delivery processes that simply don’t work in online education environments. as an academic librarian whose primary responsibility is to facilitate resource delivery to faculty and students, i believe that it is possible to facilitate information delivery to mooc students. librarians can do this through a combined effort to advocate for more flexible delivery models in our conversations with content providers, to educate faculty about fair use and its limitations, and, most importantly, to revise our conception of what constitutes an academic resource. this argument takes on new relevance when you consider that mooc students are not necessarily looking for a traditional education experience. these students are interested enough in digital scholarship to enroll in an online course and may be best served by instructors who harness the inherent possibilities offered by the medium of the web, who can serve as curators of publicly accessible information, who can advocate for affordable copyrighted resources, and who can quickly and expertly offer a combination of open access materials, links, citations and minimal embedded pieces of scholarship to students all over the world for free.

moocs as intellectual property

an exploration of the relationships between intellectual property (ip) and moocs is further complicated by the fact that moocs are not just resource delivery vehicles, but are themselves generative and substantive resources. a mooc is a unique copyrighted object that can be repurposed, licensed, and sold. aside from the intellectual content of the course supplied by an instructor, there is also a huge amount of peripheral material, including discussion board posts, student-contributed content, and data that exists as a byproduct of a mooc.
taking this dimension of intellectual property into account, moocs have the potential to create a new pedagogical context that is part instructional forum, part web-publishing platform, part data-generator, part resource-aggregator, and part intellectual property object. instead of focusing exclusively on unilateral content ownership, columbia’s kenneth crews ( ) suggests that we acknowledge the many stakeholders involved in the production of a mooc and take a step back to “view the copyright in [and of] online courses not as a legal assertion, but as a set of rights to be shared and managed.”  librarians and digital archivists are in a unique position to advise faculty and administrators about the complex intellectual property issues that should be considered before jumping headlong into the fray.  in his arl whitepaper, brandon butler ( ) touches upon the importance of evaluating usage rights before signing a license agreement with a mooc platform provider.  institutional librarians and archivists, who are often responsible for the management of locally generated digital assets and for digital repository planning, can ensure that universities take the long view when it comes to negotiating flexible licenses that anticipate the reuse and repurposing of moocs course content as platforms, audiences, and formats develop. professional organizations (like the arl) can serve as an ideal forum for the creation and dissemination of comprehensive best practices guides for mooc ip management. academic librarians currently working with ip issues at their home institutions can collaborate to develop working ip standards that can be applied in a variety of online education contexts. such standards would be beneficial to librarians on the ground and more importantly, would prevent commercial platform providers from eroding rights that should belong to content creators. 
designating a mooc as a holistic, reusable, intellectual object also means that technical production and preservation protocols must also be considered.  much of the current literature on moocs and libraries overlooks the role that information professionals might play in authoring protocols for creating, preserving and managing digital content to ensure that moocs courses are reusable from a technical standpoint as well as a legal one. ideally, every mooc should come with its own digital preservation protocol that addresses version control, metadata, hosting and archiving recommendations.  this will ensure not only that intellectual objects are secure and reusable, but that the “evolution of the [moocs] form” (schwartz, , p. ) and history of this educational phenomenon are recorded for future education scholars. as librarians, we should promote our bibliographic and preservation knowledge in terms of how we can help facilitate a multifaceted institutional digital management strategy for moocs.  additionally, we should devote more time and attention to another dimension of the intellectual property object conversation: technical support for the creation and maintenance of moocs.  in an increasingly saturated market the lifespan of any given mooc rests not only on its legality and digital stability, but also on its substantive and technical quality.  one of the salient points introduced at the oclc research conference was the necessity for universities to support faculty in the production of moocs in order to ensure that their courses are compelling and competitive.  the library—“often already providing instructional support and access to the same technology for students and for faculty who are experimenting with ‘flipping’ their in-person classrooms”(schwartz, , p. )—is the obvious locus for technical production support, which makes librarians the obvious candidates to serve as technical intermediaries between faculty (i.e. 
content creators) and mooc platform providers. in practice, this will mean that librarians will have to dedicate staff, equipment, and space to the technical production and support of moocs. for this reason, it is imperative that librarians involve themselves in mooc initiatives before institutional adoption so they can draft implementation and management workflows, advocate for new funding streams, and, in some cases, redefine the mission and focus of library departments and redistribute staff to ensure that online educational initiatives are well supported. in terms of the importance of advocacy, butler ( ) argues that librarians also “have a more general stake where moocs are concerned, which is the continuing relevance of librarians and library collections to university teaching” (p. ). butler is correct in his assumption that contributions by librarians are often undervalued; however, his defensive intimation that librarians need to advocate for their own relevance is shortsighted. if librarians adopt active institutional roles and offer tangible solutions to problems that moocs introduce, they can demonstrate (rather than argue for) the importance of ‘librarians and library collections.’

moocs and the future

in a chronicle of higher education article from , media scholar and mooc skeptic siva vaidhyanathan (who likens the difference between a “real college course” and a mooc to the “difference between playing golf and watching golf”) concedes that the emergence and unprecedented popularity of the mooc are critically significant: “if we would all just take a breath and map out the distance between current moocs and real education, we might be able to chart a path towards some outstanding improvements in pedagogical techniques” (p. ).
In an article on the rippling effect of online education in academic culture, Nature contributor Marshall Waldrop ( ) cites Chris Dede, a Harvard educational technologist, who sees a similar opportunity for pedagogical culpability through technological innovation: “Real gains in the productivity and effectiveness of learning will not come until universities radically reshape [existing educational] structures and practices to take full advantage of the technology” (p. ). Yes, the digital pedagogues say, because we’ve not only changed practices ‘to take advantage of the technology’; the technology has already changed us and educational practices, irreparably, insidiously, and hopefully for the better. For Hybrid Pedagogy contributors Sean Michael Morris and Jesse Stommel ( ) there is no going back: “We need to worry for the entire enterprise of education, to be unnerved in order to uncover what’s going on now,” to “stop thinking of education as requiring stringent modes and constructs, and embrace it as invention, metamorphosis, deformation, and reinvention.” It is true that we have no choice but to confront MOOCs, and we will, in the same way we’ve confronted and adapted to other ‘disruptive’ innovations that have transformed how we learn, interact, and access information.

Librarians are uniquely well situated to play an active role in how MOOCs are applied at the institutional level and also in how MOOCs are ultimately defined by and within the larger context of the emergent ‘future’ of higher education. As the audiences and objectives of MOOCs evolve, so too do the roles and positions of power that competing stakeholders occupy. Stakeholders (including universities, teachers, and librarians as well as corporations, lawyers, and publishers) are grappling to define MOOCs in relation to their own priorities and visions of where higher education is headed. Librarians should play an active role in defining MOOCs and reshaping the facets of higher education that they disrupt.
We can start by participating in this conversation and by reimagining our own profession in light of the future.

Coda

Librarians have the capacity to become involved in MOOC initiatives within their communities and institutions, but the scope of the arena and the pace of developments can be overwhelming for individuals who want to contribute but don’t know where to begin. If you want to play a role in defining MOOCs, the first thing you should do is sign up for a MOOC to see how it really works. You can also engage in (or start) conversations at your home institution, community library, or local professional network through a listserv or special interest group. While you may not be able to create a MOOC production studio, reassign library staff, or redefine the parameters of your own position overnight, there are smaller, achievable measures you can take depending on your position and institutional goals. Whether you advocate or write a grant for OA publishing funds, create a LibGuide to promote OA resources to faculty and students, work with institutional legal departments to draft a university IP policy, or collaborate with colleagues to create a digital preservation protocol, you can effectively impact your community and generate a progressive atmosphere.

Acknowledgements

I’m very much indebted to the knowledgeable editors at In the Library with the Lead Pipe and in particular to Emily Ford for her helpful insights and grammatical wizardry. Much gratitude to Silvia Cho, my colleague at Baruch College’s Newman Library, for enduring many iterations of this article, for her pragmatism and her ability to provide clarity amid chaos.

References

Bower, J. L., & Christensen, C. M. ( ). Disruptive technologies: Catching the wave. Harvard Business Review ( ), - .
Butler, B. ( , October ). Massive open online courses: Legal and policy issues for research libraries. Association of Research Libraries Issue Brief. Retrieved from http://www.arl.org/
Cormier, D. ( ).
rhizomatic education: community as curriculum. innovate: journal of online education , . retrieved from  http://www.innovateonline.info/pdf/vol _issue /rhizomatic_education community_as_curriculum.pdf crews, k. ( , november ). moocs, distance education, and copyright: two wrong questions to ask. retrieved from http://copyright.columbia.edu/copyright/ / / /moocs-distance-education-and-copyright-two-wrong-questions-to-ask/ dennis, m. j. ( ). the impact of moocs on higher education. college and university, - . fain, p. ( ). gates, moocs and remediation. inside higher ed. retrieved from http://www.insidehighered.com/news/ / / /gates-foundation- solicits-remedial-moocs fyfe, p. ( ). digital pedagogy unplugged. digital humanities quarterly . , - . retrieved from          http://digitalhumanities.org/dhq/vol/ / / / .html howard, j. ( , march ). for libraries, moocs bring uncertainty and opportunity [blog post]. wired campus. retrieved from http://chronicle.com/blogs/wiredcampus/for-libraries-moocs-bring-            uncertainty-and-opportunity/ koller, d. ( ). what we’re learning from online education [video file]. retrieved from         http://www.ted.com/talks/daphne_koller_what_we_re_learning_from_online_education.html kolowich , s. ( , september ). moocing on site. inside higher ed. retrieved from http://www.insidehighered.com/news/ / / /site-based-testing-deals-strengthen-case-granting-credit-mooc-students latham, s. ( ) new age scholarship: the work of criticism in the age of digital reproduction. new literary history . , - . retrieved from muse.jhu.edu/journals/new_literary_history/v / . latham.html morris, s. m., & stommel, j. ( , november ). a mooc is not a thing: emergence, disruption, and higher education. hybrid pedagogy. retrieved from www.hybridpedagogy.com/journal page, b. ( , may ). publishers in moocs pilot. bookseller. retrieved from http://go.galegroup.com/ pappano, l. ( , november ). the year of the mooc. 
new york times.retrieved from http://www.nytimes.com/ / / /education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html?smid=pl-share proffitt, m. ( , april ). moocs and libraries: new opportunities for librarians [blog post]. retrieved from http://hangingtogether.org/?p= ross, j.d. ( , june ). registration open for new librarianship mooc. ischool news. retrieved from           http://ischool.syr.edu/newsroom/index.aspx?recid= schwartz, m. ( , may ). massive open opportunity: supporting moocs in public and academic libraries. library journal. retrieved from http://lj.libraryjournal.com/ / /library-services/massive-open-opportunity-supporting-moocs/#_ smutz, w. ( , april ). moocs are no education panacea, but here’s what can make them work. forbes leadership forum. retrieved from http://www.forbes.com/sites/forbesleadershipforum/ / / /moocs- are-no-education-panacea-but-heres-what-can-make-them-work/ stommel, j. ( , july ). the march of the moocs: monstrous open online courses. hybrid pedagogy. retrieved from www.hybridpedagogy.com/journal suber, p. ( ). future. in open access (pp. - ). cambridge, ma: mit press. vaidhyanathan, s. ( , july ). what’s the matter with moocs? [blog post]. the chronicle of higher education. retrieved from http://chronicle.com/blogs/innovations/whats-the-matter-with-moocs/ vogl, r., lee, f., russell, m., & genesereth, m. ( , june). sipx: addressing the copyright law barrier in higher education – access-to-clean-content technology in the st century. [whitepaper] stanford center for legal informatics. retrieved from http://mediax.stanford.edu/pdf/mxjune copyrightclearance.pdf waldrop, m. ( , march ). online learning: campus . . nature. retrieved from http://www.nature.com/news/online-learning-campus- - - . what are moocs? 
according to an article published in nature earlier this year, they’re “internet-based teaching programmes designed to handle thousands of students simultaneously, in part using the tactics of social-networking websites” (waldrop, ). [↩] those interested in learning more about the origin of the mooc should watch daphne koller’s ted talk, “what we’re learning from online education.” koller, a stanford computer scientist and coursera co-founder. she talks about her goal to develop a platform for delivering high quality educational content to anyone with an internet connection and offers a range of examples to demonstrate how moocs work and what they look like. [↩] in the years since the publication of christensen’s original article, the term “innovation” has eclipsed and supplanted the term “technology,” importantly shifting the focus from the disruption itself to the systemic effect of the disruption. this semantic shift may also reflect the broad adoption of christensen’s ideas. [↩] “udacity is experimenting with charging $ for courses that come with credit from sjsu […and] the american council on education, which advises college presidents on policy, recently endorsed five moocs from coursera for credit” (schwartz, ). [↩] many scholarly oa journals operate on a business model that requires contributors to pay a fee that serves to offset production costs. institutions and scholarly organizations are increasingly designating funding streams for oa publishing initiatives. [↩] copyright, future, instruction, librarianship, publishing, teaching, technology ending a harpercollins boycott (february , -august , ) two-way libraries, open catalogues and the future of sharing culture responses joe grobelny – – at : am thanks for bringing up the need for digital preservation in light of moocs: http://birdswithteeth.wordpress.com/ / / /libraries-the-mooc-trois-on-disruption/ nora_almeida – – at : am thanks, joe grobelny. 
seems like the digital preservation component is lacking from a lot of discussions. staff and expertise required makes it worth thinking about from the get-go. barnaby hughes – – at : pm thanks for this timely piece. we definitely need to be proactive in defining what moocs are. it saddens me that so many groups and institutions, including the ala, are already abusing the designation by capping enrollments or charging fees. such courses might be online, but they are neither open nor massive. nora_almeida – – at : am thanks, barnaby hughes. it seems as thought the free-to-freemium model is the inevitable fate of many an “open” platform. taking the reigns to define what moocs are now and where they are headed is key. the future of our institutions and our profession at large is up for grabs. pingback : in the library with the lead pipe » a new polemic: libraries, moocs, and the pedagogical landscape | flexibility enables learning mark_mcguire – – at : am thanks for such a well researched and well referenced piece, nora. leaders at many universities seem prepared to ignore moocs if they don’t think they are an immediate threat to their current business model and strategic plan. they are only interested in innovations that maintain and strengthen the status quo — i.e. the non-disruptive kind. such a fortress mentality is self-serving and is designed to protect those who are already on the inside. if we were really interested in advancing teaching, learning and research, we would be taking advantages of innovations that enable us to connect to, and engage with, the wider world beyond the moat. pingback : feminism + mooc=docc | amber n. welch amy – – at : am i am looking to see if there is anyone interested in conducting a program on mooc’s at the utah library association’s annual conference on april th-may nd . please let me know. pingback : the future of moocs - oedb.org pingback : a tale of two moocs | querying libraries this work is licensed under a cc attribution . 
license. issn - about this journal | archives | submissions | conduct una revista signaficativa para los estudios de semiÓtica en espaÑa open digital humanities journals: revista de humanidades digitales. a framework for the construction of an academic field clara martínez cantón (uned) gimena del río riande (conicet, argentina) romina de león (conicet, argentina) ernesto priani saisó (unam, méxico) why a(nother) digital humanities journal? dh, languages and open access open access in dh golden open access (hybrid journals) • digital philology: a journal of medieval cultures • digital scholarship in the humanities • international journal of humanities and arts computing (ijhac) • journal of computing and cultural heritage • language resources and evaluation (formerly: computers and the humanities ( - ). • new review of hypermedia and multimedia diamond open acces (no apc’s) • digital humanities quarterly • digital medievalist • digital studies / le champ numÉrique • humanist studies and the digital age • journal of data mining and digital humanities • journal of open humanities data • journal of the text encoding initiative (jtei) • zeitschrift fÜr digitale geisteswissenschaften (zfdg) • computational linguistics • digital classics online • journal of the japanese association for digital humanities languages in dh journals revista de humanidades digitales defining a new journal a spanish journal, a language inclusive journal: revista de humanidades digitales ◉ spanish ◉ english ◉ portuguese ◉ italian ◉ french editorial team ◉ gimena del río riande (conicet, argentina) ◉ clara martínez cantón (uned) ◉ romina de león (conicet, argentina) ◉ ernesto priani saisó (unam, méxico) editorial oa policies ◉ immediate and free open access with no commercial purposes editorial oa policies ◉ immediate and free open access with no commercial purposes ◉ journal management through open access journals (ojs) ◉ open editorial policees ◉ creative commons licenses ◉ preservation with 
lockss ◉ adhered to cope's ethical standards ◉ authors preserve their copyright wide range of publications rhd accepts: ◉ academic articles ◉ data articles ◉ reviews ◉ publications in other experimental formats (interviews and other formats that include audio, video or interactive material, etc.) challenges open access, profesionalization, continuity, indexation professionalization of the researcher as an editor ◉ building and developing of the website ◉ publication process of new issues ◉ peer-review management ◉ indixation ◉ proofreading and editing articles ◉ advertising (social networks, lists, etc.) professionalization of the researcher as an editor professionalization of the researcher as an editor ◉ ensuring the continuity of the publication ◉ bridging the gap in disciplinary and linguistic geographic asymmetries in the publication of scientific journals where to find us http://revistas.uned.es/index.php/rhd/index http://revistas.uned.es/index.php/rhd/index any question? for further communications: rhd@linhd.uned.es ¡thank you! mailto:rhd@linhd.uned.es utilities introduction to the utilities peter bol* harvard university *corresponding author. email: pkbol@fas.harvard.edu doi: . /jch. . a variety of databases, tools and platforms have created the foundation for digital schol- arship in chinese studies. the creators of some open-access projects introduce their work below, but first i offer some notes on the kinds of utilities that make up the expanding digital universe. searchable text databases are the most widely used resource. beginning in with the dynastic histories, academia sinica’s institute of history and philology has set the highest standard for the creation of digital texts, and since then other institutions have followed suit. a proportion of scripta sinica is open access. since then, in addi- tion to many commercial text databases, three other collections have been established that are entirely open access. 
CBETA, the Chinese Electronic Tripitaka Collection, began over twenty years ago. The most recent is the Kanseki Repository of premodern Chinese texts, popularly known as Kanripo; it is overseen by Christian Wittern of Kyoto University and has , texts. The largest is Donald Sturgeon’s ctext, the Chinese Text Project, currently holding over thirty thousand titles and more than five billion characters. The text repository is only one part of the ctext platform, which includes a variety of tools, as Sturgeon explains in his introduction to the platform.

The most convenient way to discover whether a text is available in one of these three repositories (or the Zhonghua jingdian gujiku 中华经典古籍库) is to consult textref.org; icons show whether a text is open to view, search, or download and whether a scanned image is available. The scanned image is important because, as Sturgeon explains in the case of ctext, the searchable text is created by applying optical character recognition (OCR) to the scanned image. At present textref.org has , records and is an important contribution to the cyberinfrastructure for Chinese studies. All providers of digital text, whether open- or licensed-access, can choose to reveal the titles they have without losing their proprietary rights.

My thanks to Kwok Leong Tang for his suggestions.
http://hanchi.ihp.sinica.edu.tw/. For an English-language survey of digital resources from Taiwan see http://sinology.ascdc.sinica.edu.tw/index_en.html, http://thdl.ntu.edu.tw/index.html and www.digital.ntu.edu.tw/en/achievements.jsp.
www.cbeta.org/. www.kanripo.org. https://ctext.org. https://textref.org/.

© Cambridge University Press. Journal of Chinese History ( ), , –. Downloaded from https://www.cambridge.org/core (Carnegie Mellon University), subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms.

The Ming Qing Women’s Writings Digital Archive and Database (MQWW), discussed below by its director Grace Fong, is an example of a digital archive of selected writings with an online scholarly apparatus. MQWW is valuable for its collection of rare texts. It also illustrates how a database focused on a particular set of texts and people can make use of other online utilities. In this case MQWW uses an application programming interface (API) to call up information from the China Biographical Database, thus relieving it of the need to keep track of the kin and social relations of the writers themselves.

The China Biographical Database (CBDB), discussed by two of its managers, Wang Hongsu and Tsui Lik Hang, is an open-access biographical database, like the Dharma Drum Buddhist College (DDBC) Buddhist Studies Authority Database 佛學規範資料庫 and the Database of Names and Biographies 人物傳記資料庫 from the Institute of History and Philology. One can discover persons in these databases through biogref.org, which currently has almost , records. All three databases provide categorized biographical data. There is an important difference, however, in that CBDB is a relational database composed of code tables and data tables. This allows it to be used in complex queries covering large numbers of persons. The existence of code tables for offices, people, places, and so on also means that CBDB code tables, accessed through its APIs, can be used to mark up or “tag” texts on other platforms.

There are specialized datasets that provide code tables for tagging texts, such as the China Historical Geographic Information System (CHGIS), which also has an API to be used by online systems.
The most important use of CHGIS, as described in my “Visualization and Analysis of Historical Space” in this section, is the provision of data layers of administrative units for , years of China’s history for use in GIS software. The CHGIS project also provides other valuable spatial datasets, including G. William Skinner’s nineteenth- and twentieth-century datasets.

Platforms are online systems that allow users to upload their own data (or retrieve data that is already available on or through the platform) and use the capabilities of the platform to analyze the data. MARKUS, introduced by Hilde De Weerdt, is a platform that allows users to upload text and tag it, using code tables from CBDB, CHGIS, and other databases or by creating their own lists of terms. Tagging words in a text allows them to be extracted from the text and analyzed and visualized in diverse ways. Li Bin and his collaborators provide an example of this with their online system for the basic annals of Sima Qian’s Records of the Grand Historian. In this case they tagged the text manually for person names and place names, thus allowing users to visualize the connections between persons and between persons and places statistically and geographically. Doing this manually with single texts is manageable, but only an automated system would make it possible across large text corpora.

As Chen Shih-pei explains, the Local Gazetteers Research Tools (LoGaRT) is a platform devoted to extracting all manner of data from local gazetteers. Gazetteers are databases-to-be: they contain structured information using common categories. But in fact there is just enough variation in the original texts to make this complicated. In her contribution to this section, on the composition of the Qing bureaucracy, Chen Bijia notes how challenging this can be in discussing the mining of the roster of appointments (jinshen lu).
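The tagging workflow described above for MARKUS and the Records of the Grand Historian system can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny code table, its identifiers (`P001`, `L001`, and so on), the sample passages, and the function names `tag_text` and `cooccurrences` are invented, and the simple longest-match lookup is not the method any of these platforms is documented to use. It shows only the general idea that once terms are matched against a code table, they can be extracted and counted.

```python
from collections import Counter
from itertools import combinations

# Hypothetical miniature "code table": surface string -> (type, id).
# Real code tables (e.g. from CBDB or CHGIS) are far larger; these IDs are invented.
CODE_TABLE = {
    "司馬遷": ("person", "P001"),
    "漢武帝": ("person", "P002"),
    "長安": ("place", "L001"),
}


def tag_text(text, code_table):
    """Return (start, end, surface, type, id) tuples, preferring longer terms."""
    terms = sorted(code_table, key=len, reverse=True)
    tags = []
    i = 0
    while i < len(text):
        for term in terms:
            if text.startswith(term, i):
                kind, ident = code_table[term]
                tags.append((i, i + len(term), term, kind, ident))
                i += len(term)  # skip past the matched term
                break
        else:
            i += 1  # no term starts here; move one character on
    return tags


def cooccurrences(passages, code_table):
    """Count how often two tagged entities appear in the same passage."""
    counts = Counter()
    for passage in passages:
        ids = sorted({tag[4] for tag in tag_text(passage, code_table)})
        counts.update(combinations(ids, 2))
    return counts


# Invented sample passages, for illustration only.
PASSAGES = ["司馬遷至長安", "漢武帝居長安", "司馬遷見漢武帝於長安"]

if __name__ == "__main__":
    for tag in tag_text(PASSAGES[0], CODE_TABLE):
        print(tag)
    print(cooccurrences(PASSAGES, CODE_TABLE))
```

The co-occurrence counts are the kind of data that could feed a network or map visualization of persons and places; a production system would also need to handle ambiguous names, which this sketch ignores.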
The LoGaRT system is quite powerful, but the platform must be installed on a local server using texts available to that institution, or users must arrange to spend time at the Max Planck Institute in Berlin. Currently most searchable gazetteers are in licensed text databases.

Ming Qing Women’s Writings: http://digital.library.mcgill.ca/mingqing/. DDBC and IHP databases: http://authority.dila.edu.tw/ and http://archive.ihp.sinica.edu.tw/ttsweb/html_name/.

PhiloLogic is another powerful platform or framework for text analysis at scale; when it is on a local server users can create their own instances with their own corpora. Jeffrey Tharsen and Clovis Gladstone explain its capabilities with the example of the twenty-four Chinese histories; they have opened their instance to readers.

Various platforms provide various kinds of tools for analyzing text corpora. The text tools in ctext allow comparison between chosen texts, visualizing similarity and proximity based on statistical analysis, and more. In addition to the introduction to the ctext platform and text tools for the analysis, the website offers detailed introductions and guides. Some of the same capabilities are also part of PhiloLogic and they are being built into MARKUS.

A different kind of platform is , Rooms, as introduced by Nicholas Frisch, which is aimed at enabling users to upload and annotate images of texts. This facilitates the study of historical editions of texts that may exist in multiple woodblock editions.
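As a rough illustration of what a statistical comparison between texts can look like, the sketch below computes Jaccard similarity over character bigrams. This is a generic technique chosen for the example, not a description of how the ctext text tools, PhiloLogic, or MARKUS work internally; the function names `char_ngrams` and `jaccard` are hypothetical. Character n-grams have the convenient property of requiring no word segmentation.

```python
def char_ngrams(text, n=2):
    """Return the set of overlapping character n-grams, ignoring whitespace."""
    chars = "".join(text.split())
    return {chars[i:i + n] for i in range(len(chars) - n + 1)}


def jaccard(a, b, n=2):
    """Jaccard similarity of two texts over their character n-gram sets."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two texts too short to yield n-grams are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    # Two variant readings differing in the final character.
    print(jaccard("學而時習之", "學而時習也"))  # 3 shared bigrams out of 5 distinct
```

A score near 1 suggests heavily overlapping wording, as between two editions of the same passage, while a score near 0 suggests unrelated text; real comparison tools typically add alignment and weighting on top of a measure like this.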
Platforms typically allow users to upload texts for one-time analysis and, with registration, to store texts (taken from ctext or Kanripo, for instance) on the platform’s servers, employ various tools, download data, and produce visualizations. The Digital Humanities Research Platform at Academia Sinica and DocuSky from National Taiwan University provide these services to registered users. DocuSky, first developed by Tu Hsieh-chang and discussed by Hsiang Jieh, is unusual in that it is a system created to enable users to transform the texts and spreadsheets they may have on their own computer into their own online database. It provides a specific XML format which acts as a bridge between content and tools. Users can convert texts and spreadsheets into the XML format and use them with the tools provided by DocuSky or other open-access platforms, such as MARKUS. In addition it is meant to create a link between researchers and developers.

The text corpora, databases, datasets, code tables, APIs, tools, and platforms discussed here are examples of the kinds of utilities and digital resources available for the study of China’s history. The list is not exhaustive. I have not covered the various software packages being used in data analysis. Of the several challenges in using digital resources I will draw attention here to one: the lack of a reliable means of segmenting words or phrasemes in texts written in literary Chinese. The lack of white space between words is a problem for East Asian texts generally, which is exacerbated by the lack of punctuation and the ambiguity of the status of a string of characters as a word, although there is a parser for modern spoken Chinese. There are open-access utilities that have had some success with punctuating literary Chinese and identifying parts of speech.

The increasing number of searchable text databases, most of them commercial, presents researchers with new challenges.
First, researchers would like to search metadata (that is, information about the text such as author, title, edition) across databases, even if the local library does not have a license to the content. The HOLLIS catalog at Harvard, for example, reveals metadata for those Erudition databases it licenses, although unaffiliated users cannot access the content. The CrossAsia Fulltext Search catalog from the State Library in Berlin does this as well for the domain of Asian studies. The lack of interconnectivity across the digital universe of Chinese studies led to the Shanghai conference on “Cyberinfrastructure for Historical China Studies.” At this point there is no one agreed path forward, but there are several possibilities. The Max Planck Institute for the History of Science has developed the Research Infrastructure for the Study of Eurasia (RISE) which, through its API, is meant to enable institutions to create secure linkages between third-party research tools and various third-party textual collections.

https://ctext.org/instructions; https://ctext.org/tools; https://ctext.org/digital-humanities.
數位人文研究平台 http://dh.ascdc.sinica.edu.tw.
https://docusky.org.tw.
https://nlp.stanford.edu/software/lex-parser.shtml#tools.
Long Quan Temple, punctuation: http://gj.cool/gjcool/index; Beijing Normal University, punctuation: https://seg.shenshen.wiki/; Academia Sinica, part-of-speech tagging: https://ckip.iis.sinica.edu.tw/service/ckiptagger/; Nanjing Normal University, part-of-speech tagging: http:// . . . /suiyuan/index.php.
The organizers of the conference, together with major libraries and research institutes in China and around the world, are working with the Chaoxing Group to see whether a sophisticated, wide-ranging search, retrieval, and analysis system could be the basis for a common multilingual platform of open and licensed content. Another approach, represented by the aforementioned textref.org and biogref.org, is for database providers to agree on a common standard for the basic metadata necessary to identify texts and individuals in their systems. The challenge is to build this into library and database workflow so that new data is entered automatically. A third option will take shape with the sixth and final edition of Endymion Wilkinson’s Chinese History: A New Manual, to appear in – . The manual will then also appear as a curated online database that can continue to evolve, a kind of hub whose spokes are links through APIs to library catalogs and other databases, at the same time that internal hyperlinks make it easy to explore the rich content of the book itself.

https://crossasia.org/de/service/crossasia-lab/crossasia-itr/.
The program and other materials for the conference are available on the ctext website: https://ctext.org/digital-humanities/shanghai
https://rise.mpiwg-berlin.mpg.de/
With thanks to the support of Mr. Shi Chao. The model being considered is based on an open-source version of 超星发现 at www.chaoxing.com/.

Digitizing Premodern Text with the Chinese Text Project

Donald Sturgeon
Durham University, email: donald.j.sturgeon@durham.ac.uk
doi: . /jch. .

Abstract

The widespread availability of digitized premodern textual sources, together with increasingly sophisticated means for their manipulation, has brought enormous practical benefits to scholars whose work relies upon reference to their contents. While great progress has been made with the construction of ever more comprehensive database systems and archives, far more remains not only possible but also realistically achievable in the near

UKSG Annual Conference: a report from the sponsored early career professionals

Insights – , UKSG

Sponsorship by SAGE made it possible for six early career professionals to attend the UKSG Annual Conference and Exhibition in April in Glasgow. They are Isabel Benton (University of York), Jessica Edwards (Gale, a Cengage company), Alice Hughes (Goldsmiths, University of London), Laura Palmer (University of Huddersfield), Salah Seoudy (The American University in Cairo) and Seth Thompson (Glasgow Caledonian University).

Keywords: UKSG Annual Conference and Exhibition

Photo: the early career professionals on the first morning of the conference, already in good company. From left to right: Steve Sharp (co-editor, ‘Insights’), Seth Thompson, Lorraine Estelle (co-editor, ‘Insights’), Alice Hughes, Jessica Edwards, Salah Seoudy, Isabel Benton, Laura Palmer and Anna Grigson (UKSG’s Education and Events Officer).

Alice Hughes explained that the number of people her institution can send to the conference is limited and usually reserved for people in more senior positions. She said, ‘The sponsorship provided the chance to attend and take part in wider conversations within the knowledge community.
i would like to stress just how valuable these sponsored places are, not only for the individuals personally, although that benefit is clear, but also in terms of widening conversations and engaging every level of the community.’ the venue for the first-timers’ sunday evening buffet was hung with attractive strings of sparkling lights that lit up the river clyde outside the window. it was already abuzz with chatter as jessica edwards stuck on her sticky name label and helped herself to a refreshingly cool glass of wine. she told us, ‘there were lots of friendly faces – many of whom were members of the uksg committee, and i quickly bumped into fellow sponsored attendees who were a lovely bunch and great company over the next few days.’ she also chatted with two friendly staff from psi (publisher solutions international) who run theipregistry.org and ip-intrusion.org and learnt more about the use (and misuse) of ip addresses, as well as career progression in sales and marketing at other companies in the publishing industry. monday dawned a bit grey and chilly, but not to worry – the sponsored attendees were lucky enough to be accommodated within a stone’s throw of the scottish event campus, where the conference was held. walking towards it on the first day, jessica was led to ponder how useful it is that architects of important buildings seem to ensure their buildings are bizarre enough to be unmissable; spotting the giant armadillo-croissant structure, she knew she must be heading in the right direction. her morning kicked off – not quite with the fascinating opening plenary (though that was to follow shortly) – but with taking photos of the inside of her bag, her delegate badge and, of course, straight down at her shoes. (how else does one start a conference?!) following this, she was extremely excited to see her face pop up on the click photo game leader board: she was in eighth place already! 
[photo: it may have been scotland in april, but first timers were able to take advantage of the riverfront terrace on a balmy sunday evening. bottom photo: steve sharp]

laura palmer was also in a competitive mood and loving the jove photo challenge. she found it made a good ice-breaker for talking to exhibitors. as a result, she had some interesting and enlightening conversations with publishers and suppliers whom she might not otherwise have thought to approach. her ‘best stand activity’ photo was of the colouring canvas provided by gale cengage. ‘tapping into the adult colouring craze by asking delegates to colour in the protagonists from their newly-digitized stuart archival papers was really clever and attracted plenty of delegates seeking to relax after taking in so much information in the sessions.’ laura had quite a lot of success in the exhibition hall. she won a prize draw for champagne and finnish chocolates from lm information delivery (and is keeping the champagne for her graduation from her lis ma!). if that was not enough, she also won £ ’s worth of library books from oxford university press.

the conference programme was well constructed, with just enough room swapping and breakouts to split the day into manageable chunks, and to keep each new day fresh and energised. although the early career professionals (ecps) were familiar to some degree with many of the topics and issues discussed, a great many were new to them (and certainly they had moments of feeling completely out of their depth in a few technical talks!). however, they learnt fast and, with similar issues and arguments appearing in different guises, or from different angles and perspectives throughout the conference, they found their knowledge and understanding built cumulatively. jessica described it as a ‘web or network of ideas’ and found she was making new connections of understanding as the conference progressed. 
unsurprisingly, this applied particularly to discussion around open access (oa), which was one of the hottest, most frequently discussed topics, appearing consistently throughout the conference – from the political considerations, incentives and directives, to exactly what it means for researchers. ralf schimmer’s plenary session talk, ‘just how open are we?’, attracted a lot of interest on twitter. he used visual slides and language to great effect to advocate advancing the oa cause beyond its current state, providing some memorable soundbites, including ‘the paywall is the stagnant inertia at the eye of the open access storm’ and ‘the paywall must come down’. isabel benton noted that ralf schimmer’s presentation offered an interesting alternative to the current model of paywalls by suggesting that money should be redirected away from subscriptions and towards openness. the presentation developed her knowledge of this topic, which she had previously been aware of but did not know a great deal about.

[photo: getting the important photos out of the way first. photos: jessica edwards]

[photo: the bumble bees from manchester university press to promote their new content delivery platform manchester hive also rated highly on quirkiness! tweet shared from @shellyt]

[photo: jessica trying out the gale cengage colouring activity (photo: laura palmer) and alice taking advantage of the civic reception to catch up with exhibitors (and competitions)]

the breakout sessions

alice found the breakout sessions were particularly helpful as she could immediately see how they could be applied in her institution, and how the ideas discussed could directly improve working practices in her department. they also provided her with the chance to talk collaboratively, not only with librarians, about ways to improve services. 
seth thompson, salah seoudy and isabel found matt borg’s ‘user experience in libraries’ breakout session absorbing and amusing. matt explained several techniques to gather user experiences, or ‘ux’, and spoke passionately about the need for libraries to place the user at the centre of service developments. seth really liked the idea of asking library service users to write a ‘love letter’ or a ‘break-up letter’ to the library to identify something they really like, or something that they would like to change. he thought this was an engaging approach to capture library users’ thoughts and feelings about services. ‘your library is people’ was a quote from matt’s presentation that really resonated with him. salah rated this session as possibly his favourite of the whole conference. it gave him new insights and ideas on service improvement and development that could be implemented at his university library. he enjoyed the illustrations used by matt to demonstrate differences between user experience (ux) design and user interface (ui) design and also between behavioural and attitudinal research in libraries. as part of her internship, isabel will be working on a ux project and so found this session not only incredibly relevant, but inspiring. it was useful in exploring methodologies and how they have been put into practice, and she particularly liked the idea of digital mapping using the ‘visitor or resident’ framework, as well as the concept of a contextual interview. attending this session raised some interesting questions, which isabel thinks will prove helpful when exploring ux in the context of her own workplace. jessica attended nikki rowe’s session which drew on her experience negotiating chest agreements. her light-hearted depiction of a librarian’s relationship with publishers was welcomed; though highly amusing, it was equally insightful. 
nikki conveyed this relationship as a tale of romance (and heartbreak), complete with ‘moving to first base’, ‘meeting the parents’, ‘moving in together’ and ‘renewing your vows’. creatively illustrated with emotive images of romance blossoming and breaking down, it was certainly unique! slotted throughout were numerous hot tips about how to make library-publisher relationships work better (and why not to despair if they don’t), such as appreciating when a publisher is approaching their year-end, knowing who’s who in their organization, never slamming the door if a negotiation breaks down, and what may cause pressure points, from fte changes to staff movements. as with all the talks, jessica found viewing publishers (with whom she currently works) through the eyes of the librarian a fascinating and valuable exercise. as nikki’s presentation clearly demonstrated – we’re all human, and ‘companies don’t talk to one another; people do’. the talk which challenged jessica’s existing perceptions the most was that by helen dobson on predatory publishers. she had previously heard the term and, influenced by the negativity of the term ‘predatory’ (as many people probably are), saw such publishers in a largely one-dimensional way. although helen’s talk did not entirely shift jessica’s sense that predatory publishers can be sinister, it certainly widened her perspective and made her think about whether there is also a serious (sometimes unjustified) bias about what constitutes a predatory publisher. beall’s list was, after all, compiled through one individual’s subjective interpretation of what it means to be a predatory publisher; many accusations may be more down to opinion than the behaviour of the accused publisher. 
the fact that many journals starting out in non-western nations may be more in need of support and guidance from the publishing establishment on their methods of practice, rather than being ostracized and blacklisted, was a thought-provoking challenge to the traditional depiction of predatory publishers. indeed, the suggestion of power in the term ‘predatory’ would be turned on its head if this alternative interpretation is to be believed, because these new, non-western publishers are far from powerful. is any established publisher entirely free of unethical behaviour? helen provided great food for thought.

seth found ted spilsbury’s breakout session ‘reducing waste on e-book acquisition to zero (pda)’ fascinating as the project ted spoke about placed students from the university of the west of england, bristol at the centre of e-book acquisition processes. he thought the use of patron-driven acquisition as the major e-book acquisition strategy was an innovative and user-focused approach to purchasing e-books.

[photo: hot tips in nikki rowe’s breakout session on how to improve library-publisher relationships. photos: jessica edwards]

there were so many breakout sessions to choose from! seth, salah and isabel all attended magaly bascones and amy staniforth’s breakout session ‘what is all the fuss about? is wrong metadata really bad for libraries and their end users?’ seth found it extremely insightful and applicable to his current role as a library assistant. the recommendation to negotiate with publishers regarding a basic level of required metadata, and possibly utilizing software such as knowledge base+, were two suggestions he had not considered, but which he will be passing on to his colleagues. salah usually deals face to face with patrons as part of his job, so he was glad to see more about what happens (and can go wrong) behind the scenes. 
the session gave him a comprehensive view of the different functions of an academic library, of how metadata is used differently and for varying purposes, and of how essential accurate metadata is to the mission of his library.

laura’s favourite breakout session was joanna ball and bethany logan’s ‘(book)sprinting towards open publishing: developing strategy and tools to support digital scholarship’. this focused on how the university of sussex’s research hive is finding innovative ways to support researchers and work with them to provide oa publishing opportunities which meet their career aims. open access is generally supported by librarians, but laura found it interesting to see how libraries can work with researchers to provide oa publishing models that are tailored to their needs, allowing them to make the most of its benefits. it provided useful material for her assignment discussing library oa support for researchers too, as it was an example of going beyond simple advocacy to involving the researchers in making the system work for them.

the uksg quiz

the annual uksg quiz proved to be another lovely evening. jessica had been warned to expect an intense, winner-takes-all mentality from some participants, but only one quiz answer was challenged (undoubtedly because quiz master richard gedye had ensured his quiz was watertight!). sharing a table with several long-standing members of uksg, the ecps learnt of the gossip and scandals of previous uksg conferences, from rowdy publisher tiffs to being awakened by the snoring of senior librarians blasting through the thin walls of the student digs previously used to accommodate delegates. a far cry from the sparkling glasses and rich carpets they were enjoying at the glasgow marriott! even so, she sensed a certain nostalgia for the early days of uksg, including the cramped university common room disco. 
[photo: essential refuelling before engaging brains for the quiz]

the plenary sessions

[photo: the twittersphere was buzzing during adam blackwell’s plenary around fake news. photo: @christine_phd on twitter]

the second day plenary session included a very topical presentation by adam blackwell from proquest on the fake news debate. the interest generated by his title ‘guns, lies and sex tapes: how the primacy of emotions over reason gave us fake news (and trump!)’ was not wasted – adam’s talk turned out to be just as engaging. as is often the case, an anecdote went a long way in capturing the interest of the audience. in this case it was the strange case of dr lott, which was almost as unbelievable and outlandish as that of dr jekyll. even after lott forgot the names of every single survey participant, lost his data in computer crashes and fabricated his own reviews and publication credits, academics continue to cite him. via this anecdote, adam presented a powerful and thought-provoking explanation for the prevalence of fake news, which is sometimes as prevalent in academia as in the frenzied press we are so quick to condemn. it made for a compelling argument for why it is necessary to help students understand the role their emotions play in their interpretations. as adam argued, it is the primacy of emotions over reason that creates the climate for fake news. consequently, he concluded, the solution to fake news won’t be tweaks to facebook algorithms; it will be educators, not engineers. (his presentation was stuffed with just such juicy soundbites!)

it is perhaps testament to the importance of this topic that this was echoed in the final plenary session. ‘you’re the people who connect common sense and knowledge’, said vijaya nath of the leadership foundation for higher education, ‘you have a hugely significant civic leadership role’. 
presentations from david mcmenemy and seeta peña gangadharan made seth think about the amount of data we, as library professionals, present to external or third parties and the complexity of this information flow. david and seeta’s talks highlighted the importance not only of the librarian’s role when holding other people’s personal data, but also in educating our library service users in good practices to protect their personal data. an idea presented in david’s talk was that, if learner analytics are not used appropriately, it could possibly be construed that the library might be ‘spying’ on patrons’ activities. this made seth consider the approaches libraries take to service analysis and whether the feedback they gather is too personal. privacy in relation to the library patron is an ongoing ethical challenge as we develop valuable services while ensuring that we respect privacy and personal data.

[photo: vijaya nath fired delegates up to go out and fulfil their civic leadership role]

[photo: seeta peña gangadharan encouraging her audience to confess their position on privacy]

food, glorious food

the ecps learnt that at uksg you will never be hungry! not only were they consuming a hearty breakfast at their hotel each morning (the delicious, creamy porridge particularly impressed jessica – maybe this ‘scott’s porage oats’ business really does equate to porridge-excellence), followed by numerous refreshment breaks and chocolate from exhibition stands, but some even found themselves consuming two lunches. after some delicious door-stop ham and mustard sandwiches at a publisher presentation lunch, finding fried, cheesy gnocchi still being served at the main lunch counter was too much to resist! 
(although jessica admitted to slightly regretting this decision as she sat through all that afternoon’s sessions feeling stuffed to the rafters …)

more breakout sessions

not only did it enable two lunches, popping into an additional lunchtime event allowed them to hear all about the wiley digital archives collection ‘royal anthropological institute of great britain and ireland’ (rai). one of the things which impressed jessica was the deep, personal relationship an archivist develops with their archive after working so closely with the material for many years. sarah walpole, archivist and photo curator at the rai, spoke very highly of other individuals who had worked on and dedicated a large part of their life to the rai collection; there was clearly a great bond of respect and affection both between the archivists and for their collection.

this was also perceptible at the adam matthew/university of sussex talk about mass observation online in which the curator of the mass observation archive, fiona courage, admitted that, although excited more people would get to experience the archive, the idea of digitizing the collection and sending it out to the world to be used and explored without the protective hand of an archivist was also a daunting and slightly disquieting prospect for someone who had spent so long preserving and protecting the documents. she had, after all, long been gatekeeper and protector of the legacy of the individuals who had contributed to the mass observation projects. isabel particularly liked how the presenters were making the past more accessible using modern technologies such as handwritten text recognition technology. salah had direct experience of this project, having helped with inputting the biographical data of the volunteers and writers in the project during his final year at the university of sussex. 
however, he thought that the presentation did not delve enough into the ethical issues and agreed with fiona courage’s view that ‘not all archives should be digitized’.

the lightning talks

[photo: katherine stephan’s talk inspired her listeners to try new ideas]

seth, laura and salah enjoyed katherine stephan’s lightning talk ‘creating communities with research cafés: how libraries can connect the university’. katherine is clearly passionate about her role in library research support, so provided a valuable and motivating insight. the research cafés at ljmu are a seemingly simple but highly effective way to engage researchers, by providing them with valuable presentation opportunities and encouraging networking and social support for researchers across the university. seth, who was inspired by this session, feels the ideas and strategies outlined in katherine’s talk are something he would try to implement as a research librarian to help foster a togetherness between the library and researchers.

the conference dinner

the ecps, who had by now become friends, attended the conference dinner. it was an atmospheric evening – dinner in a grand scottish hotel with bagpipes and ceilidh, and a fun and appropriate way to round off the glasgow uksg experience. laura’s food highlight was the spiced lamb, a great warmer on a blustery day, although the cranachan, whisky jelly and cheesecake dessert was very artistic, and hearing the bagpipes take on some iconic rock tunes was also memorable! alice is certain that the ceilidh was the most confused yet courageous attempt at mass-organized dancing by the knowledge community that she will ever witness. salah’s favourite performance of the night by the scotsmen in kilts was their cover of ‘wonderwall’ by oasis.

[photo: the ecps at the conference dinner, with a detail of their favourite spiced lamb dish in all its glory. photo (left): jessica edwards.]
[photos of the dinner and ceilidh: the conference dinner was rounded off by energetic dancing in every style imaginable (and some not) to a very eclectic mix of tunes]

last words

this report of the uksg conference would not be complete without mentioning the excellent use of a video clip in mike cannon’s plenary session on the last day, ‘from journal production to content marketing: transforming roles in a changing landscape’. the clip showed a man gaining momentum on a giant swing in an attempt to go all the way round: to check the video out, just search for ‘kiiking’. it grabbed everyone’s attention and was the perfect way to wake everyone up after the conference dinner the night before. the final word goes to alice, who said, ‘in a fairly cheesy way the analogy can be extended to the uksg glasgow conference, as a great deal of preparation from the organizers and speakers led up to an exciting and energising three days, which have hopefully provided the momentum for us all to implement new ideas in our daily working roles.’

all photos are by simon williams (simon@simonwilliamsphotography.co.uk) unless otherwise stated

article copyright: © lorraine estelle. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.

lorraine estelle
co-editor, insights
e-mail: lorraine.estelle@counterusage.org

to cite this article: estelle l, uksg annual conference – a report from the sponsored early career professionals, insights, , : , – ; doi: https://doi.org/ . /uksg.

published by uksg in association with ubiquity press on july
what’s so important about peer review?

editorial

there has been a lot of dialogue about peer review recently. there were a couple of presentations and a number of discussions about it at ala’s annual conference in orlando. since then, it has been a topic that has caught my eye whenever it comes up, which seems to be more often. lately, there has been commentary popping up in scholarly communication prompted by offbeat perspectives of peer review: a russian sociologist has gone so far as to fund a monument to peer reviewers on kickstarter, which he says will ultimately be a sculpture of a rolling die with the various traditional outcomes of peer review on each side (https://www.kickstarter.com/projects/ /monument-to-an-anonymous-peer-reviewer). it is even receiving some recognition in such venues as nature (http://www.nature.com/news/moscow-monument-proposed-to-immortalize-peer-review- . ) and the chronicle of higher education, among others. this particular example demonstrates the perceived randomness of peer reviewer feedback.

there is also the recent maneuver by the largest scientific publisher in the world to patent their online peer review process, in such a broadly worded fashion that the electronic frontier foundation awarded it the “stupid patent of the month” (https://www.eff.org/deeplinks/ / /stupid-patent-month-elsevier-patents-online-peer-review). the response to this event has been extreme in some cases, with conjecture that this is the publisher’s effort to shut down other journals. despite the ongoing rhetoric about the publisher’s desire for world domination, i believe it is unlikely that this will adversely impact other scholarly efforts, but it has certainly prompted some strong opinions. 
even in the academy, peer review, oddly, prompts some controversy and in some cases, outright disdain. a university provost recently made the statement that “peer review is the gold standard for academia”; while a bit of a truism, this declaration met with a fair bit of consternation from faculty. some responded that they felt like peer review was out of touch with the real world, underscoring the divide between academia and practice, between knowledge and the application of knowledge to make the world a better place. in addition, some scholars maintain that traditional peer review is the only standard – that journal-based publication is all that should be considered for quality (and cited reference the indicator for impact). certainly, many guidelines for promotion and/or tenure consider journal quality and impact primary metrics. tenure and promotion guidelines are what send the signal to researchers about how their work is acknowledged and rewarded, and ultimately, determine who remains in universities to do research. as these guidelines may be slow to change, research, and in turn, peer review, may also be slow to evolve.

the peer review process itself does seem to engender some aggravation. certainly the formal process of peer review can appear as a barrier to publishing new discoveries in a timely and unmediated manner. there is some aggravation at the perceived arbitrariness of decisions and reviews or, at times, the downright contradictory feedback from reviewers. in a simplified model with peer reviewers, there are possible permutations in the outcome; of these, the reviewers would theoretically agree % of the time. the reality is that they disagree more than % of the time. a quick analysis of the papers submitted to college & research libraries and reviewed from through indicates complete agreement between peer reviewers only % of the time; in some cases, one reviewer recommends revise and resubmit and the other chooses reject or accept, but in % of the cases, the reviewers are diametrically opposed, with one saying accept while the other recommends reject.

doi: . /crl. . .

a simplified illustration of the review and decision points:

submitted papers
reviewed by editor (and editorial team)
reviewed by peer reviewers
final decision
% desk reject
% acceptance rate

the disparate feedback does not indicate that either one is right or wrong; in fact, having the various perspectives is very valuable as they may point out different factors or nuances and because they also represent how readers may approach a paper differently. however, editors work to reconcile this variation by balancing and clarifying comments as well as providing additional guidance, and possibly seeking another reviewer. it is also an opportunity to improve documentation of the journal standards as well as selection and onboarding of peer reviewers.

college & research libraries invites scholars to be peer reviewers through a couple of methods. the most formal process is through the solicitation for committee interest or participation that ala sends out to the membership once a year. 
service on the c&rl is one of the opportunities listed and individuals can self-nominate for inclusion on the editorial board; however, editorial preference is that members of the editorial board already have some experience serving as peer reviewers for the journal, so they gain experience with the journal operations and can bring that perspective to the board. that list of individuals is then passed along to the publication coordinating committee of acrl, the c&rl editor and the editorial board for consideration; a number of those listed may not be selected for the board but may be selected as reviewers. the other way in which reviewers are chosen for the journal is through the discretion of the editor, which may be a positive response to a scholar inquiring about getting involved with the journal or an invitation to an expert with a specific expertise that fills a niche for the journal or corresponds with a submission that may have a very rare or specialized focus. in considering people as peer reviewers, the editor and editorial board will review the names, their areas of expertise, professional experience and record of research and publication. overall, priority is given to emerging subject areas, representation across types of libraries, geography and other factors that will provide a broad and diverse expertise.

for all of the questioning of the peer review process in scholarly journals, the concept is sound: at its foundation is the ideal that acknowledged and experienced experts within a discipline have a significant role to play in reviewing new research in an objective manner that is consistent with the standards of the discipline. in doing so, they assess the research question, validate the methodology, consider the findings and the way in which they contribute to knowledge or innovate practice. traditionally, this has been done through the literature with a blind peer review process. 
it can only be conjectured that at least some of the concern in the academy is about the very traditional model of peer review: but, like so many other aspects of scholarly communication and publishing, peer review can, and absolutely should, evolve. to my mind, peer review refers broadly to the evaluation of knowledge, innovation or practice by someone with recognized expertise in the discipline. taking an equally broad view of scholarship, i find boyer’s model provides a cohesive way to look at different kinds of scholarship:

• scholarship of discovery is original research and what is often considered the traditional model which takes the form of scholarly books, journal articles, reference works.
• scholarship of integration involves synthesis of information across disciplines, across topics within a discipline or identifies trends over time. interdisciplinary research projects or scholarly conferences are such examples.
• scholarship of application or engagement is the application of knowledge to solve real world problems. the “grand challenges” that the us president charged higher education to address is symbolic of this type of research. this may take the form of collaboration with community, industry or service organizations, development of policy or educational outreach or services or the contribution of processes or products to improve the world.
• scholarship of teaching and learning looks at the scholarship and transmission of knowledge around pedagogy and learning within a discipline or more broadly. this may be formally published materials, innovative teaching materials or the use of emerging technology for instruction.

a number of colleges and universities have framed their promotion and tenure documents around boyer’s model, allowing the advancement of knowledge to inform practice and the educational process, cross disciplinary boundaries and contribute to society. 
this inclusive definition of scholarship is one that lends itself to emerging models of scholarly communication that break down barriers.

the next few editorials in college & research libraries will explore evolving models of peer review, from the process itself to the application of expert review in new areas. several editorials in the coming issues will address different aspects and kinds of peer review, especially as they relate to the emerging scholarly environment.

• megan hodge and colleagues will be addressing the process of peer review used in ala’s primo (peer-reviewed instructional materials online) project. this effort models peer review of best practices, a term which is widely used but rarely backed by a formal process for identifying such practices.
• in this era of data-driven decision and emphasis on big data, it is ironic that research data itself is rarely reviewed. in most cases, the article is reviewed as a proxy for the dataset. morten wendelbo will discuss the necessity of peer review for datasets and the expertise needed to do so effectively. editors from the journal of peace research, a premier journal in its field and one of the first to apply a peer review model to data (not to mention including the dataset with the publication of the article).
• emily ford will be discussing open peer review and the models that a more transparent process facilitates, including the benefits of developmental peer review.
• ideally, we would like to also include editorials on peer review of professional skills, peer review of grant/funding proposals and peer review of digital scholarship. however, we are still identifying scholars with the related experience willing to write on these topics. 
• lastly, sarah potvin asks the question “who will review the reviewers?” the foundation of the quality of peer review rests firmly on the expertise and competence of those doing the reviewing: the process of identifying and selecting reviewers and the standards to which they are held are critical.

these guest editorials will also, once revised and expanded, serve as anchoring chapters for a collection to be published by acrl that will address the evolving models of peer review. additional contributions will be solicited through the call for papers below.

wendi arant kaspar
texas a&m university

cfp: evolving models of peer review (monograph collection to be published by acrl in )

with emerging environments in scholarly communication and initiatives such as open access impacting research activity and venues, the process of peer review plays a critical role in assessing value and quality. however, it is necessary for models of peer review to align with new scholarly efforts and formats, maintaining the validation by experts but demonstrating the flexibility needed for emerging research.

we invite submissions of papers examining best practices and innovative models in peer review for inclusion in a monograph collection. while studies within the field of librarianship are preferred, compelling and original cases outside of the discipline will be considered (e.g., journal of peace research’s process for peer review of data). submissions should focus on specific cases, applications of models or best practices. note the scope of the guest editorials: similarly innovative venues, formats or subjects of review are encouraged.

deadlines
• november , : submission of proposed paper topic, to words. submissions will be reviewed as received and selected authors will be notified by january .
• march , : submission of final papers. please use the instructions for authors from college & research libraries.
• may : final collected manuscript is due.
inquiries and submissions may be made to wendi arant kaspar at warant@tamu.edu with the subject line: peer review collection.

engaging the museum space: mobilizing visitor engagement with digital content creation

c. ross, steven gray, j. ashby, m. terras, a. hudson-smith and c. warwick
published in digit. scholarsh. humanit.
doi: . /llc/fqw

in recent years, public engagement is increasingly viewed as more than an ‘additional extra’ in academia. in the uk, it is becoming more common for research projects to embrace public engagement with the belief that it informs research, enhances teaching and learning, and increases research impact on society. therefore, it is becoming increasingly important to consider ways of incorporating public engagement activities into digital humanities research.
this article discusses public engagement…
katherine rawson
presentation for “digital scholarship in action: research”
mla convention | austin, texas

today, i am going to talk about building humanities data sets for research. humanities data sets are often made by cobbling or accretion, sometimes both. building these data sets is often an activity that involves many people.
they are
- institutionally built,
- made by collections of scholars,
- made with the help of the public through crowdsourcing.

examples:
hathitrust worksets - digitizing libraries, hathitrust, scholars
curating menus - librarians, librarians, crowds of transcribers, scholars

the peopled nature of this work can be a virtue or a complication (often both).

so if data sets are made by many people: what are the stakes for humanities researchers? as scholars, we want to:
- maintain research value
- act ethically around the work of others
- attend to the ways that the perspective of those involved shaped the data set

i am going to focus on the latter two (in ways that i think support the first).

attend to the work of others

we want to act in ways that are ethically and intellectually responsible. things like collaborators’ bills of rights can help:
neh: http://mcpress.media-commons.org/offthetracks/part-one-models-for-collaboration-career-paths-acquiring-institutional-support-and-transformation-in-the-field/a-collaboration/collaborators%e % % -bill-of-rights/
ucla: http://www.cdh.ucla.edu/news-events/a-student-collaborators-bill-of-rights/

but sometimes the work is already done. what about when collaborators are not living or are not named? in this case, we might consider narratives of acknowledgement. for example, for curating menus, which relies on the work of anonymous people who contributed to nypl’s what’s on the menu project and on librarians and donors who are now deceased, we decided to write the story of these people and to create a data dictionary that includes a place for agents. for each kind of data, we not only describe what it is, but who made it.

finally, if we are using the work of others, we should — as much as possible — share our data. this is not unlike the notion of the public trust that lisa rhody discussed yesterday — only the bar is lower. it’s much easier to share data than maintain tools and projects.
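as a minimal sketch of the kind of data dictionary described above, assuming python and entirely hypothetical field and role names (none of them are taken from curating menus itself), each entry could record both what a kind of data is and which agents made it, with unnamed contributors kept in the record rather than dropped:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Agent:
    """someone who helped make a kind of data; the name may be unknown."""
    role: str                    # hypothetical role label, e.g. "transcriber"
    name: Optional[str] = None   # None when the contributor is anonymous

    def credit(self) -> str:
        who = self.name if self.name else "anonymous"
        return f"{who} ({self.role})"


@dataclass
class DictionaryEntry:
    """one data dictionary entry: what the data is, and who made it."""
    field_name: str
    description: str
    agents: List[Agent] = field(default_factory=list)

    def acknowledgement(self) -> str:
        # say so explicitly when nothing is known about the makers
        if not self.agents:
            return f"{self.field_name}: makers not recorded"
        credits = "; ".join(agent.credit() for agent in self.agents)
        return f"{self.field_name}: {credits}"


if __name__ == "__main__":
    # hypothetical entry, for illustration only
    entry = DictionaryEntry(
        field_name="dish_name",
        description="name of a dish as transcribed from a menu",
        agents=[Agent("transcriber"), Agent("cataloguer", "example librarian")],
    )
    print(entry.acknowledgement())
```

keeping agents as a list on every entry, rather than in a single credits note for the whole data set, is one way to make the acknowledgement travel with each kind of data when the data set is shared.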
attend to the perspective of others

how do we attend to the ways that the perspectives of the people who made them shape our data sets? miriam posner raised this at keystone dh as she discussed the nature of diversity and contingency of critical humanities discourse in comparison to the data sources we sometimes have to work with — what do we do about the fact that the census (and numerous agents involved in the census) say that gender is a binary? i think this is an important and promising question for digital humanities scholars. today, i am going to offer thoughts on one approach.

in answering these kinds of questions for curating menus, trevor munoz and i did a few things. first, we researched and wrote about where our data came from. then we began experimenting with data structures. we have been working with indexing, because this allows us to maintain the frameworks of the other people who made the data (and to explicitly acknowledge their frameworks), while adding our own. instead of modifying the data that came before us, we are layering on top of it. the goal is to do this within a linked open data framework, so that many people can transparently add to, connect, and manipulate the data, while being able to see the hands of others in it.

nearchos. networked archaeological open science: advances in archaeology through field analytics and scientific community sharing

nicolò marchetti • ivana angelini • gilberto artioli • giacomo benati • gabriele bitelli • antonio curci • gustavo marfia • marco roccetti

published online: november
© the author(s) . this article is an open access publication

abstract

the full release and circulation of excavation results often takes decades, thus slowing down progress in archaeology to a degree not in keeping with other scientific fields.
the nonconformity of released data for digital processing also requires vast and costly data input and adaptation. archaeology should face the cognitive challenges posed by digital environments, changing in scope and rhythm. we advocate the adoption of a synergy between recording techniques, field analytics, and a collaborative approach to create a new epistemological perspective, one in which research questions are constantly redefined through real-time, collaborative analysis of data as they are collected and/or searched for in an excavation. since new questions are defined in science discourse after previous results have been disseminated and discussed within the scientific community, sharing evidence in remote with colleagues, both in the process of field collection and subsequent study, will be a key innovative feature, allowing a complex and real-time distant interaction with the scholarly community and leading to more rapid improvements in research agendas and queries.

nicolò marchetti
nicolo.marchetti@unibo.it

department of history and cultures, alma mater studiorum – university of bologna, piazza s. giovanni in monte , bologna, italy
department of geosciences, university of padua, via gradenigo , padua, italy
department of civil, chemical, environmental, and materials engineering, alma mater studiorum – university of bologna, viale del risorgimento , bologna, italy
department for life quality studies, alma mater studiorum – university of bologna, corso d’augusto , rimini, italy
department of computer science and engineering, alma mater studiorum – university of bologna, mura anteo zamboni , bologna, italy

j archaeol res ( ) : –
https://doi.org/ . /s - - -

keywords digital archaeology · open access · data sharing · cyber-infrastructure · archaeological method and theory

introduction

archaeology is a systematic search for knowledge about the past through the study of the material correlates of the past. fieldwork data are used for building interpretations of past social dynamics, both in the field and afterwards. the process consists of many steps of repeated standardized operations characterized by some degree of consistency. digging and recording in archaeology involve unearthing, observing, and interpreting material remains (hummler ). within this process that mixes subjectivity and objectivity, accuracy is measured in terms of the traceability of each step and in the richness of the recorded evidence. since the digging of archaeological remains is not repeatable, it is obviously crucial that the recording operations are as accurate as possible. scientific replicability in this field is guaranteed uniquely by the access to primary data (atici et al. , p. ).

at present, the stages of this process are not recorded in full, and primary data are seldom if ever published integrally (kintigh et al. ). an excavation is a destructive as well as a unique operation, and the interpretation of the excavator, albeit a necessary task, is not the only one possible. a fresh analysis should be carried out on the basis of all recorded data (roosevelt et al. ), but in fact an excavation report is almost never an integral one. this hampers the quality of the reevaluation of a unique piece of information, the archaeological context, which is a quintessential epistemological operation since research goals and techniques are always changing.

excavation teams are becoming increasingly multidisciplinary, and the contribution of technology to every step of the excavation and recording process is growing exponentially.
in particular, digital and portable technologies are effectively changing the pace and accuracy of some of the steps of the digging process, such as the use of drones and laser scanners for surveying and photogrammetry or archaeometric and chemical analyses of finds. thus multidisciplinary research teams produce highly differentiated datasets (paper files, digital files, databases, etc.), of which only a small part is formally disseminated (kansa and whitcher-kansa ).

the digital and web-based revolution that we are, in general, experiencing in science has not yet considerably impacted the dissemination of archaeological datasets, which still relies on the production of written publications in which only a small selection of the data produced is represented. furthermore, at present, there are no readily available protocols or infrastructures for integrating and interconnecting in a meaningful way the widely different datasets that are produced in the field.

the application of digital technologies coupled with the open data and open science approaches to field archaeology has several positive correlates: cutting costs and lags in producing traditional publications while keeping high-quality standards, reducing the gap between data acquisition and dissemination, providing the means to publish in full and share complex archaeological datasets, and obtaining real-time feedback. we advocate that a digital, networked, and open field archaeology is what we need to make archaeology a sustainable, performing, and agile social science in the st century.

here we propose an open approach—which we call the nearchos approach: networked archaeological open science—to the creation, integration, discussion, and dissemination of archaeological datasets from multidisciplinary field research (and we do not dwell here on defining issues aptly raised by smith ).
consequently, we first review recent developments in archaeological practices and academia, and then we propose a set of strategies that can be of help in enhancing data recording in the field and openly elaborate and disseminate the results via the web.

current best practices and trends

in recent decades, rigorous methodologies for analyzing past dynamics have been developed by means of a scientific epistemology and scientific techniques (smith et al. ). data recording and elaboration techniques have been updated exponentially in the last decade, mainly thanks to increasingly affordable and diffused digital and d technologies (giligny et al. ; roosevelt et al. ). in addition, the sophisticated application of methods from informatics and natural sciences has become crucial for building reliable models of past economic phenomena, to infer the places of origin of raw materials, reconstruct ancient technologies, and date sites and artifacts (killick , p. ; smith et al. , p. ).

from a theoretical perspective, there has recently been an attempt at systematizing archaeological field practices and interpretation processes, the so-called ‘‘reflexivity’’ trend (see berggren ; londoño ). reflexivity approaches—both theoretical and as applied field practices (at çatalhöyük in turkey and the city tunnel project in malmö, sweden [berggren , p. ])—target the role of archaeologists as producers of data and the context of data production and interpretation in the field. by focusing on finding ad hoc solutions for enhancing data recording and interpretation in the field, we can, at least, try to mitigate the shortcomings that result from archaeological fieldwork and to elaborate strategies that are continuously in tune with the needs and constraints of excavation teams in their daily work. this is particularly important since, as stated above, field archaeology is becoming more and more multidisciplinary.
reflexivity approaches put to the fore the idea that interaction between specialists and excavators in the field should be a close and continuous flow of information. of this trend, we stress particularly the importance given to shortening the distance between laboratories and excavation areas, and we propose below a series of ideas that aim at integrating these different aspects through a new learning structure.

nevertheless, field archaeology is still generally performed according to traditional paradigms. data recording practices generally used in the field are a mixture of analog and digital techniques that are affected by limitations and subjectivity at every stage (roosevelt et al. ). each step requires skills and accuracy that are often undercut by contingent or logistic reasons (cost-efficiency, time saving, subjectivity, environment, etc.).

partiality no more

as stated in a recent manifesto article (kintigh et al. ), many cultural processes involve nonlinear relationships in which cause and effect are not readily distinguishable. consequently, a rich understanding of the past requires sophisticated interpretative strategies built on fine-grained primary datasets from systematic archaeological fieldwork. this can be made possible only if data are recorded and published in full. kintigh et al. ( ) conclude their article stating that sophisticated archaeological research is hindered by the diffused unavailability of contextual information since primary datasets remain largely unpublished or not available online. to this we add that, if data are recorded traditionally, only a small subset of the information is available to other researchers.

there are three main correlates to this situation. the first is epistemological; research questions cannot be fully singled out (and answered) without full access to primary data. the second is ethical.
because most archaeological excavations are carried out with public funds, ethical issues can arise from the withholding of data. the third is operational: academic publishing is slow, costly, and requires an enormous amount of work (as well as extra funding, personnel, equipment, facilities) for converting analog data into digital and then paper formats (not to speak of readers who must, on an individual basis, convert paper contents into digital archives back again in order to make a systematic use of them).

as do other disciplines in the research and academic worlds, archaeology still relies almost totally on publications as a way to disseminate knowledge and achieve career advancement (fig. ; kansa and whitcher-kansa ). printed publications, consisting most often of secondary datasets, may take years to reach interested researchers who are not always able to keep pace with the growing mass of relevant literature (even when they are in pdf format, since they need to become part of the citation basis). at the same time, another important barrier is the availability and high price of specialized publications (smith ). this outcome is ethically unacceptable, since we deal, in most cases, with research carried out with public funding. open access commercial publishing moves part of the problem upstream, requiring authors to expend more-or-less substantial additional financial investments in order to publish.

in addition, traditional publishing—be it paper-based, e-book, or digital open access—still responds to a logic of selectivity in presenting the materials (i.e., the personal preferences of the excavator, time, and cost-saving variables, among others). this logic could be justified in the pre-digital era by production costs, but now it represents a self-imposed reduction in the presentation of the recorded evidence.
even in the pre-digital era, archaeologists did attempt to produce outputs different than the book/journal format, obtaining in some cases results comparable to modern databases—for example, the human relations area files or the research of gardin ( , , , ; see also dallas ). these experiments certainly had an impact on systems of data organization, leading to the creation of digital databases, but they did not change data dissemination practices in archaeology.

it is a truism that traditional archaeological research is becoming less and less sustainable (e.g., kansa and whitcher-kansa , p. ). new questions, entailing the search for new kinds of data, are defined in the scientific and academic workflows after previous results have been disseminated and discussed within the scholarly communities; if the stages of the process are lagged and datasets always partial, then progress is slow and the knowledge produced is heavily biased.

digital humanities initiatives and web-based approaches are effectively changing the way that data are produced and reused in the realm of cultural heritage (kansa et al. ; snow et al. ) and also in scholarly publishing and dissemination of knowledge in all fields. the use of digital data for archiving, managing, and conserving cultural heritage is growing exponentially (see, for example, the journal of heritage in the digital era, the digital applications in archaeology and cultural heritage website, the open context project; table ). of course, the production of large collections of digital data also poses problems of structure, access, and long-term preservation, given the tendency of electronic technologies to become obsolete relatively quickly.

numerous projects and online services have been created in the last few years to ensure quality, availability, and long-term storage of digital archaeological datasets.
repositories such as tdar (the digital archaeological record), digital antiquity, and archaeology data service (ads) aim at storing, curating, and preserving digital datasets and also broadening their access (but with high maintenance costs). although very promising, these advances require organized digital collections that are simply lacking for many past and present excavations, large funds, and the will to make archaeological datasets available to the public. for these and other reasons, at present, excavation teams that adopt these services are exceptions.

fig. current research and dissemination cycle

table. web projects and sites with their respective urls

project cited in the paper | urls
american historical association digital publication guidelines | https://www.historians.org/teaching-and-learning/digital-history-resources/evaluation-of-digital-scholarship-in-history
arcadia fund | http://www.arcadiafund.org.uk/media/ /open-access.pdf
archaeology data service (ads) | http://archaeologydataservice.ac.uk
ariadne project | http://www.ariadne-infrastructure.eu
arxiv | https://arxiv.org
community research tools | https://polymathprojects.org ; http://www.crawdad.org ; http://www.orbit-lab.org
digital antiquity | https://www.digitalantiquity.org/
digital applications in archaeology and cultural heritage | http://www.journals.elsevier.com/digital-applications-in-archaeology-and-cultural-heritage/
erc open access policies and guidelines | https://erc.europa.eu/funding-and-grants/managing-project/open-access
human relations area files | http://hraf.yale.edu
journal of heritage in the digital era | http://www.multi-science.co.uk/ijhde.htm
national endowment for the humanities open government and open data policies | https://www.neh.gov/about/legal/open-government
national geographic space archaeology project | http://nationalgeographic.org/projects/space-archaeology/
open context project | https://opencontext.org
open science framework | https://cos.io/our-products/open-science-framework/
socarxiv | https://osf.io/preprints/socarxiv
society of american archaeology’s open science interest group | https://osf.io/ dfhz/#
stanford forma urbis project | http://formaurbis.stanford.edu/index.html
the digital archaeological record (tdar) | http://core.tdar.org
çatalhöyük project database | http://www.catalhoyuk.com/research/database

going open

within this rapidly evolving landscape, open access, open data, and big data approaches are hot topics that, however, still remain at the margins of archaeological practice (kansa ). to change the scholarly information infrastructure, we need to apply not only new forms of data recording but also change dissemination practices. the use of modern digital media has the power to do so only if we apply an open perspective. we believe that open data and open methods, entailing online digital publication of primary datasets, can have an enormous impact on how data are created in the field and how they are reused by the scholarly community to build new research (e.g., marwick et al. ; wilson and edwards ). our baseline assumption is that we need to record data more consistently in the field and to release them in full as structured open data since archaeological excavations are a public good that should be shared, and the primary data from the excavation are necessary for making advancement in the discipline (atici et al. ; kansa and whitcher-kansa ).

scientific elaboration is hampered not only by limited access to research data but also by the fact that, in most cases, the data are not ready for digital processing. this entails the need for vast, costly data input and adaptation, and even more when several different publications are involved. born-digital and encoded data can improve access to scholarly content, which in turn allows researchers to cut costs, make easier, faster, and more productive research and teaching material, obtain higher citation rates, and overall make it easier to transfer knowledge between sectors (see the open access manifesto by the arcadia fund).

in this light, we stress two trends. first, more and more research is conducted online. second, academic research is becoming increasingly collaborative. the rapid change in participation in academic social media and the need for online access to archaeological datasets also should be taken into consideration.
the online dissemination of scholarly content and search for online peer review (academia.edu, researchgate.net, google scholar, in addition to online repositories such as jstor) increased enormously in the last years. furthermore, much cultural heritage already has been turned into digital collections (virtual museums, digital libraries, scientific repositories), making the production of digital data an asset for cultural data documentation, management, and conservation (ikeuchi and miyazaki ).

beyond archaeology, a case that deserves mention is the polymath project created by a group of mathematicians who collaborate online to solve open mathematical problems (cranshaw and kittur ). the polymath blog has shown that problems can be solved quickly through a collaborative web-based approach. problem solving through communal effort and discussion is more efficient since it hinges on the availability of more people and different skillsets. the polymath project also demonstrated that participation needs to be incentivized by affording proper academic attributions to the contributions of individual scholars. a digital open and networked science therefore entails that data sharing should be encouraged and online contributions should receive academic credit.

in computer science, it is possible to access resources shared by specific groups, resources that are then put to good use by other research labs to generate new findings and research reports (crawdad, orbit-lab, see table ). excavations, due to their unique peculiarities, pose even more challenging and interesting research problems; here the ‘‘real-time’’ variable could really be the most important and innovative part, even when compared to other disciplines.

the open data model is still struggling to receive validation in archaeology since jobs and career advancement are awarded on the basis of research published in traditional venues.
j archaeol res ( ) : – new criteria for allocating credit for the production and reuse of digital data—for example, metadata with embedded authorship, as proposed by the ariadne project; extensive use of dois linking publications and primary data; or the ‘‘one url per potsherd’’ model proposed by open context—should be evaluated in order to make the open release of primary digital data the customary output of archaeological excavations (the ‘‘data sharing as publication’’ model, kansa and whitcher-kansa , p. ). the guidelines for evaluation of digital scholarship, recently issued by the american historical association (june ; see table ), are a positive example; they do not, however, move past a formal commitment to equate paper-based and digital publications (see also the erc and neh open data/ open access policies in table ). we advocate that open digital data policies should be elaborated by funding bodies in relation to field archaeology. data sharing, reuse, and collaboration should be considered not only as criteria for awarding research grants but also for contributing to academic advancement. appropriate metrics, the tracking of data reuse, and new policies for peer review of digital contents should be elaborated for awarding credit and for encouraging production of digital media. employing sophisticated technologies is only functional and subservient to operational protocols to be applied in the field, so a different attitude about how to carry out field research for generating an information flow within a networked environment, enhancing data elaboration, and their evaluation needs to be adopted. online full, real-time, and free dissemination is a requirement, but, of course, a new information architecture must be designed; the book/journal format that we are all familiar with must be reworked into a format appropriate to a digital interactive environment. 
networked elaboration and consistent access to data will help bring forward research questions that may be tested and validated in the field together with those that arise during first-stage research. it thus becomes possible to radically change the way that archaeologists work in the field and how data are produced, the way that archaeologists elaborate their datasets, and the way that fruitful scientific criticism can be incorporated into ongoing projects.

short-circuiting field archaeology

better science is made by singling out new research questions, leading to the production of new, different data. to do so, we have to operate at two distinct, yet interrelated levels: data production and data dissemination and evaluation. we need to apply the open knowledge approach (e.g., molloy ) to field archaeology in order to pursue a more global view of the excavation process. the ultimate goal of this approach is to build a new architecture of knowledge for field archaeology through a new learning strategy. first, as a set of routine applications in the field, we should use digital surveying and recording techniques, carrying out at the same time a wide array of scientific analyses almost in real time, by bringing laboratories to excavation areas. having readily available information from sampling may allow archaeologists to assess excavation and sampling strategies accordingly during fieldwork, significantly reducing post-excavation processing and data elaboration. this is achievable by experimenting with portable technologies to reduce the gap between data extraction and analysis. advanced techniques for fieldwork comprise digital photogrammetry, microstratigraphy and micromorphology, systematic soil sampling, and chemical/mineralogical analysis of the finds. complex contextual information and multidisciplinary datasets from this array of analyses are difficult to manage both with traditional methods and in digital format.
digital data require specific digital environments for proper display, integration, and exploitation. traditional publications in paper and/or pdf are not adequate formats for displaying digital datasets: what is the utility of building a complex 3d model if the only published outcome of this effort is a small black/white screenshot on the page of a journal? the same can be extended to databases, flowcharts, videos, gis files, agent-based modeling, and so forth. meeting these needs requires cyber-infrastructures (kintigh et al. ) to record the full array of data produced in the field as born-digital data and to publish openly the primary information on the web. we consider data sharing as a form of publication. to this end, digital datasets, models, databases, and instrument logs should be shared on the web and published in full through preliminary reports (or technical reports). digital data need to be freely usable but marked by dois and encoded authorship information to give academic credit to the producers. open access policies need to be implemented not only in post-processing but also from the very inception of a project. this approach will result in web-based projects in which linked and integrated information from all disciplines concerning the excavation can be managed through open source software adapted and developed for the project. then, a second new scholarly behavioral step needs to take place. sharing evidence remotely in real time with colleagues, both in the process of field collection and subsequent study, may represent a key innovative feature for field archaeology (in a way that can be compared to contemporary ‘‘live surgeries’’ in medical sciences). this will allow real-time remote interaction with the scholarly community, leading to rapid improvements in research agendas and queries. the by-product of this data-sharing mode will be the modification of researchers’ behaviors.
making available complete datasets from archaeological excavations, both as raw and refined data, in a single searchable platform will speed up scholarly work significantly. we believe that a stream of uninterpreted data may have represented a relatively insurmountable problem until some years ago, since there was no way to process and analyze large amounts of unstructured (and analog) information in a timely manner. this barrier no longer exists today, as demonstrated by the ‘‘big data’’ approaches (kitchin ). it also is possible to imagine that, ideally, more and more opportunities to process increasingly large and complex amounts of digital data will arise in the future, making big data analysis a key rationale in archaeology. we are confident that this approach will make archaeology more cost-effective and tuned with the speed of contemporary developments. freely usable digital data then will enhance the fruition of the information retrieved through digging and will provide a competitive advantage for creating new knowledge, new research questions, and overall a more articulated understanding of the past. our hope is that this contribution will stimulate alternative production of data (a better descriptive term than collection) within explanatory pathways generated by community-based scientific forums (a process that we call the nearchos approach).

open strategies for a networked field archaeology

here we review a series of techniques and strategies that, if applied consistently in the field, have the power to change the pace and accuracy of data collection and recording procedures, enhancing all four domains of archaeological recording (hummler , p. ): location (position), elements observed and not kept (strata), elements kept (finds, samples), and monitoring (checking the recorded evidence).
these techniques do not necessarily form a custom ‘‘package’’; they can be scaled and customized according to the particular needs, constraints, and budget of each excavation team. open source software allows scholars to avoid the payment of costly licenses (ranging from hundreds to thousands of euros/dollars) and the limitations of locked-in proprietary software, while retaining the quality of commercial software (although in some cases, with less user-friendly environments). costs of digital devices vary widely, from a few hundred or a few thousand euros/dollars for drones, digital cameras, and portable scanners, to several tens of thousands of euros/dollars for the most up-to-date portable devices; some of the strategies discussed below require an investment in equipment. this of course needs to be in line with the excavation strategies and goals devised by each team. however, if we couple open source software and low-cost digital devices, we can, on the one hand, obtain high-resolution digital outputs that are, in most cases, almost publication ready and, on the other hand, dramatically cut costs and post-excavation processing workflows.

data creation strategies

there are several techniques that may be used to structure the datasets that are created during fieldwork and then to disseminate them openly. scientific analyses, digital web-based technologies, and 3d tools (minto and remondino ) have the potential not only to overcome some of the flaws detailed above but also to greatly improve data recording operations in the field (fig. ). we can apply a truly holistic approach to field archaeology by targeting a complete interconnection between the study of the environment and artifacts. there is no doubt that a steep learning curve may exist when new and different tools are employed during excavation.
it also is true that in many cases this same problem is already present but simply is pushed out of mind, as very often some type of digitization is already integrated in the analysis of the finds. the downside of this second approach is that such processes are performed, in many cases, far from the excavation areas, hence providing no opportunity for adjustments, integrations, and additional collection of information. the availability of analytical data during excavation and the improvement of user abilities in the field may help the directing/managing of the excavation itself and even provide key parameters to confirm or modify the strategy of archaeological fieldwork. another important step forward would be the reduction of the distance between labs and the field. setting up field labs equipped to perform a broad range of analyses (preliminary and definitive) and to make steady use of portable instruments should become a standard requirement.

fig. turning material evidence into digital: photogrammetric documentation of the south gate from the iron age in the inner town at karkemish, turkey (courtesy of the turco-italian archaeological expedition at karkemish)

by bringing experts (with their instruments) closer to the field, it is possible to curtail the learning curves for the scientific tools, to reduce equipment costs for the excavation team, and to have readily available (preliminary) results from analyses in the field, the quality of which is controlled by experts. we believe the data from excavation, surveys, and scientific analyses should be stored, interconnected, and managed through e-infrastructures and shared online in order to openly disseminate the datasets created and get feedback in real time, both from different members of the excavation team and from external experts. interaction and discussion during the excavation process ought to be increased.
at present, there is no readily available protocol for doing so, but several research teams are heading toward this very goal. significant headway has been made by teams working at çatalhöyük (berggren et al. ; forte et al. ), kaymakçı in turkey (roosevelt et al. ), and pompeii in italy (dell’unto et al. ), among others. the systematic use of digital and 3d technologies through the recording process positively impacted the in-field praxis in the above-mentioned endeavors, greatly improving the amount and accuracy of data obtained from the excavations (see fig. ). the creation of gis files incorporating 3d features and integrating digital models of the excavated features with databases containing spatial, artifactual, and stratigraphic data are some of the most promising results from these experiments. these achievements are a turning point for practices in the field since they provide protocols and cost-effective techniques for considerably improving not only data recording but also access to excavation data, visualization, and research. we need to build on these successful experiments and make systematic use of a d archaeological approach throughout all stages of the excavation process. to do so, digital technologies need to be applied consistently and new learning frames need to be built. two methodological domains are affected by this approach. the first is digital documentation of the digging process. digital photogrammetry tools provide the means to create, almost in real time and with relatively cheap instruments, high-precision 3d models (digital surface models, or dsms), allowing the creation of a digital history of the progress of the excavation through time (e.g., berggren et al. ; doneus and neubauer ; forte ). webgis architectures allow the creation of integrated platforms for storing and handling large archives of georeferenced data, be it geographical, contextual, artifactual, or ecofactual, linked with os d web players.
time-resolved segmentation of the excavation ‘‘movie’’ should produce a dynamic picture of the investigated architectural and landscape environments and yield a far better contextualization of the excavated finds and the related human activities. the second is the filing of finds and archaeological units. high-resolution images and 3d models allow archaeologists to measure automatically the morphological characteristics of the finds, replacing analog approaches, eliminating subjectivity from the process, and ultimately compressing post-processing times (e.g., roosevelt et al. , p. ). data extracted in this way can be exported in a variety of formats, making possible web exploration, the creation of online catalogs, virtual museums, and even replicas with 3d printers (e.g., the stanford forma urbis project). systematic campaigns of micromorphological analyses can be used to characterize scientifically the deposits that form the stratigraphic sequence (in terms of granulometry, color, compactness, etc.), limiting the pitfalls generated by visual determination of depositional units. to go beyond the visual approach to a quantitative characterization of materials, we add the archaeometric dimension to the study of these features. a range of techniques and methodologies can be implemented in the field to meet the requirements presented above. the first, although not yet fully developed, is 3d gis architecture for recording activities and finds in the field. one of the main tasks in rethinking archaeological fieldwork is that of designing a cyber-infrastructure capable of managing the complete pipeline of data collection in the field and the sharing and mining of data for research (snow et al. ). the main objective here is to fully integrate artifactual and contextual information with 3d infographic representations of the excavated features, possibly within a 3d gis architecture.
these can be georeferenced and interconnected with finds, archaeological features, and stratigraphic units to make data retrieval and filtering possible and practical. the platform should be capable of handling geographic features and allow queries and spatial analyses, supporting multi-user access and interaction and automatic metadata extraction from both born-digital data and digitized paper-based literature. this will not only provide a fundamental tool for recording an excavation in greater detail (indeed, one day we will get a full d archive of the excavation process itself, which we can now get only at discrete time intervals), it also will considerably cut post-processing time and costs. the simple use of metadata and algorithms capable of grouping information based on such metadata (which, of course, implies raising questions about models and interrogation procedures) should facilitate queries and large-scale data mining immensely. the software infrastructure must be capable of supporting two coexisting requirements: real-time cooperation and interaction between researchers in the field and remote researchers, and remote access to high-quality information. high-definition 3d shape acquisition systems allow rapid and precise measurement of the morphology of small-scale artifacts and high-quality virtual rendering (by means of laser scan, tomography, etc.). these techniques can be applied to document cultural artifacts, bioarchaeological deposits, and archaeological and landscape features (e.g., jiménez fernández-palacios et al. ). another approach is photogrammetry, such as d digital surface models to systematically document the progress of excavation, landscape features, and monuments to create georeferenced time-tagged digital records of the excavation process. the d survey of the archaeological process allows us to better document structures, deposits, and finds, to reduce fieldwork time, and to improve the interpretation of archaeological materials.
the data acquired via a 3d-modeling pipeline permit researchers to assess volumetric information for stratigraphy, the topographic position of the finds, and the correlation between spatially distant objects/points, and to gain a better understanding of the formation processes of the deposits and their excavation history over time (in fact, it is the contextual associations that define the functions of objects, which are given and not inherent, e.g., appadurai ).

aerial and ground photogrammetric surveys, carried out through drones or telescopic rods, are processed with specific software to obtain point clouds or meshes (currently, our team at karkemish, turkey, can process an area of . ha in hours from high-resolution photographs to the final textured 3d model). the resulting 3d hybrid model (i.e., terrain model plus archaeological or building structures) has embedded spatial, volumetric, and color/texture information. the outcome of this work is the production of high-precision 3d-textured models, almost in real time, significantly reducing the use of total stations and optimizing and speeding up operations in the field (fig. ). the joint application of photogrammetry and terrestrial 3d scanning, in this case, results in the detailed description of a given site at different scales, from large structures to small finds (i.e., micromorphology). digital photogrammetry can be obtained from ground surveys or from cameras on drones; 3d scanning can be performed by different instrumentation, from terrestrial laser scanners to structured-light devices. these surveys have the potential not only to improve the acquisition of data but also to enhance interpretation of archaeological contexts and occupational phases through high-definition textured surface models. they can be used to document a variety of subjects: archaeological and landscape features, monuments, rock art, sculptures, degradation, collapses, etc.
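as a minimal sketch of how the volumetric information mentioned above could be derived, the following python function estimates the volume of deposit removed between two digital surface models of the same area recorded at different times. the grid layout, the cell size, and the name `excavated_volume` are illustrative assumptions and not part of any workflow described here; both surfaces are assumed to be already georeferenced and resampled to the same regular grid.

```python
from typing import List

# hypothetical representation: a dsm is a list of rows of elevations in meters
Grid = List[List[float]]


def excavated_volume(before: Grid, after: Grid, cell_size_m: float) -> float:
    """Return the volume (cubic meters) removed between two elevation grids.

    Both grids must cover the same area with the same regular cell size.
    Only cells whose elevation decreased are counted, so backfill or
    measurement noise that raises a cell is ignored.
    """
    same_rows = len(before) == len(after)
    same_cols = all(len(rb) == len(ra) for rb, ra in zip(before, after))
    if not (same_rows and same_cols):
        raise ValueError("grids must have identical shape")

    cell_area = cell_size_m ** 2
    volume = 0.0
    for row_before, row_after in zip(before, after):
        for z_before, z_after in zip(row_before, row_after):
            drop = z_before - z_after
            if drop > 0:
                volume += drop * cell_area
    return volume


# toy example: a 2 x 2 grid with 0.5 m cells
surface_day_1 = [[10.0, 10.0], [10.0, 10.0]]
surface_day_2 = [[9.5, 9.5], [10.0, 9.0]]
print(excavated_volume(surface_day_1, surface_day_2, 0.5))  # 0.5
```

summing per-cell elevation differences is the simplest possible estimate; a real pipeline would also have to deal with missing cells, overhangs, and registration error between the two surveys.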
aerial imagery from drones also allows us to obtain high-resolution imagery and topographical maps that can be used for a wide array of visual and machine-based interpretations (e.g., vector plots, orthophoto mosaics, or dsms with texture mapping). remote sensing can be used to obtain multispectral/hyperspectral imagery at different spatial and spectral resolution. image analysis provides elements to build a general frame for sites, to obtain a better characterization of landscape and materials, and to support new archaeological investigation in the areas of interest. advanced processing procedures should be adopted for an effective data-fusion process, taking into account radiometric and geometric calibration workflows developed and tested in previous archaeological experiences (e.g., lasaponara and masini ). optical and radar images can be used alternately or together, depending on the aims of the analysis and the environmental context; old declassified images can furthermore be used to provide background documentation for large areas and to analyze their evolution through time (bitelli and girelli ; ur ). airborne laser-scanning data can provide high-resolution and high-precision dsms on larger areas to support the research before and during excavation; these kinds of data can be essential in forested areas (doneus and briese ). microstratigraphic sequences with traditional thin-section analysis coupled with portable probes such as confocal optical microscopy, fourier-transform infrared spectroscopy (ftir), and/or x-ray fluorescence (xrf) (e.g., wilson et al. ) should be obtained from archaeological deposits in order to carry out micromorphological and pedological studies (weiner ). this would allow a precise definition of the nature of the deposits, which is critical for a better definition of sedimentary, pedological, and anthropological processes that affected/created the archaeological deposits.
another important methodology involves systematic soil sampling for bioarchaeological remains. today it is possible to develop new strategies to carry out and speed up systematic collection and analyses of bioarchaeological remains in the field. these include the systematic use of d digital and confocal microscopy to carry out microscopic-level analyses and 3d technologies (‘‘structure from motion’’) to document the exact setting of finds in the field and create large databases of characterized materials and digital reference collections (curci ; fanti et al. ). a final approach is the systematic archaeometric analysis of archaeological finds. chemical, mineralogical, and physical characterizations of organic and inorganic finds are not only powerful analytical tools for assessing past activities, they also provide information critical for the proper excavation, interpretation, and conservation of excavated materials (artioli ; artioli and angelini ; pollard and heron ). on-site analyses (optical microscopy and field instruments such as xrf, ftir, x-ray powder diffraction [xrd], 3d digitization techniques) yield readily available information on the production and use of artifacts. this can help archaeologists working in the field to interpret more readily the features they are excavating and consequently to plan their excavation strategies more accurately. cross-referencing archaeometric, bioarchaeological, and geological analyses in the field would greatly aid the efficient filing of archaeological finds. this procedure is, at the moment, a subjective and empirical process that relies exclusively on the intuition of the excavators. by obtaining early, precise distribution and characterization of finds, we can gather insights not only on depositional processes in their archaeological setting but also on manufacture and usage patterns.
this procedure also will eventually allow us to overcome some of the flaws involved in the process of filing archaeological finds.

open data dissemination strategies

here we discuss the dissemination of primary archaeological datasets. digitally produced datasets should be handled mainly through web portals that grant access to the 3d gis of the excavations (see below) and the digital reference collections that are created for characterized artifacts. all objects that are produced (datasets, workflows, models, images, etc.) should be structured and encoded with dois, authorship information, and other metadata. as stressed by kitchin ( ), the exponential growth in the production of digital data creates challenges related to the handling, processing, storing, and interpretation of the data. the problem is that as the amount of data increases enormously, the percentage of these data that gets analyzed and processed shrinks. another issue is data quality; scarce or poor-quality data will definitely undermine research efforts and produce weak scholarship. in this light, data organization through coherent infrastructures and quality control become key requirements in the task of changing how we produce knowledge and disseminate the findings of archaeological research. the issue of data quality is vast and cannot be addressed properly in this paper (kintigh et al. , p. ; kitchin ); rather we concentrate on the creation of cyber-infrastructures for interconnecting and accessing fieldwork datasets.

the ideal environment to interconnect all the data produced in the field is that of a gis with 3d capabilities (e.g., carver ). spatial data and images can easily be integrated with other datasets from excavations, such as catalogs of archaeological finds and data from scientific analyses, by using giss (e.g., berggren et al. , pp. – ; landeschi et al. ).
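to make the requirement of encoding every object with a doi, authorship information, and other metadata more concrete, here is a minimal sketch of such a record in python. the class names, the field names, the license default, and the doi shown are hypothetical placeholders chosen for illustration; a real project would follow the metadata schema of its chosen repository.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Contributor:
    name: str
    role: str  # e.g. "surveyor", "ceramic specialist" (free text, assumed)


@dataclass
class DatasetRecord:
    doi: str
    title: str
    object_type: str  # e.g. "dataset", "workflow", "model", "image"
    contributors: List[Contributor] = field(default_factory=list)
    license: str = "CC-BY-4.0"  # assumed default, not prescribed by the text

    def to_json(self) -> str:
        """Serialize the record so it can travel with the data file."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def credit_line(self) -> str:
        """Build a citable string naming every contributor."""
        names = ", ".join(c.name for c in self.contributors)
        return f"{names}. {self.title} [{self.object_type}]. doi:{self.doi}"


record = DatasetRecord(
    doi="10.0000/example.0001",  # placeholder, not a registered doi
    title="trench a photogrammetric survey",
    object_type="model",
    contributors=[
        Contributor("a. example", "surveyor"),
        Contributor("b. sample", "data curator"),
    ],
)
print(record.credit_line())
print(record.to_json())
```

keeping the contributor list inside the record, rather than only in a report, is what lets reuse of a single model or image be credited to the people who produced it.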
after the creation of intrasite 3d gis systems, the next step is the creation of virtual research environments (vres). the implications of creating and using vres are two-fold: they can provide enriched datasets and interlinked data, and they can be used for real-time feedback, evaluation, and criticism from the scholarly community. interaction with data and models is, in fact, crucial for the interpretation of the archaeological evidence. vres provide the means for publishing open digital datasets online and for obtaining real-time feedback between excavators and outside experts while the fieldwork is ongoing. in this light, solutions for creating networks of experts and scholars need to be explored and implemented (fig. ). the combination of e-infrastructure and optimized metadata will facilitate information retrieval, interpretation, and reuse of data by combining spatial and conceptual search parameters through a user-friendly interface (web portal). the creation of a vre for data sharing and online collaboration at various levels, from streamlining work in the field to developing solutions to complex research problems, should become a standard output of digital archaeological endeavors. for this to happen, web tools and social academic media tools need to be explored and developed to create interactive structures, live feeds, review sessions, and annotated workflows. technical reports (with copyright, id, dois, etc.) also may be used to share readily available primary data with the scholarly community.

fig. prospective digital pipeline of field archaeological research organized through a virtual research network
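the idea of combining spatial and conceptual search parameters can be illustrated with a short python sketch. everything here is a simplifying assumption: finds are reduced to a category label and planar coordinates, the spatial parameter is a plain bounding box, and the names `Find` and `search` are invented for the example rather than taken from any existing platform.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# (min_x, min_y, max_x, max_y) in the site's local grid, in meters (assumed)
BoundingBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Find:
    find_id: str
    category: str  # conceptual parameter, e.g. "pottery", "bone"
    x: float       # spatial parameters: position in the site grid
    y: float


def search(finds: Iterable[Find], category: str, box: BoundingBox) -> List[Find]:
    """Return finds of the given category that lie inside the bounding box."""
    min_x, min_y, max_x, max_y = box
    return [
        f
        for f in finds
        if f.category == category
        and min_x <= f.x <= max_x
        and min_y <= f.y <= max_y
    ]


catalog = [
    Find("f001", "pottery", 2.0, 3.5),
    Find("f002", "bone", 2.5, 3.0),
    Find("f003", "pottery", 12.0, 8.0),
]
hits = search(catalog, "pottery", (0.0, 0.0, 5.0, 5.0))
print([f.find_id for f in hits])  # ['f001']
```

a production web portal would delegate this to a spatial database and a controlled vocabulary, but the query has the same two parts: a conceptual filter and a spatial one.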
following the example of the arxiv platform (cornell university) for physics, mathematics, computer science, quantitative biology, quantitative finance, and statistics, the recently opened platform socarxiv, connected to the open science framework initiative, now provides a service for producing and openly sharing technical reports, papers, and datasets (with citation keys and indexed by google scholar) in the humanities and social sciences (see http://blogs.lse.ac.uk/impactofsocialsciences/ / / /developing-socarxiv-an-open-archive-of-the-social-sciences/). free and open technical reports/working papers/preprints are, in fact, widely used in the natural sciences and in some social sciences, such as economics, to promptly disseminate research results; they are relatively absent in archaeology. this model is intended not only to challenge the time lags and pay walls of academic publishing but also to stimulate feedback and criticism by specialists and eventually generate academic papers and further research. all considered, these platforms appear to provide ideal venues for sharing preliminary results and primary datasets from archaeological fieldwork and also the means for reliable citation and reuse of information, giving credit to individual contributors. one problem that remains is quality control. in reply to our enquiry, the arxiv administrator responded to us on august , : ‘‘the creation of a new subject class requires considerable support from the community that will use it…. we require a commitment from a significant group of researchers to submit papers using the proposed subject class. this should include promises to submit a number of initial papers to get the subject class going. we also need a volunteer to moderate the class by reviewing daily submissions and flagging inappropriate submissions.
this moderator should also review a significant number of already archived papers, looking for submissions that can be cross-listed to the new subject class and contact authors encouraging them to do so.’’ repositories such as arxiv.org, not unlike peer-reviewed journals, use moderators and scientific boards to evaluate submitted papers. but open reviews and open session tools also may be experimented with to involve specialists in reviewing the datasets that have been produced, thus curtailing the time-consuming and often one-sided/biased process of academic peer review. we stress that vres can be used following different dissemination strategies: the vre can be restricted to team members, encompass a select group of external experts and scholars, or be selectively/completely open to the public. the selected strategy should take into account the needs of local communities that, in some cases, may not want to disseminate sensitive data (e.g., the location of burial grounds) to avoid the risk of looting (e.g., mueller ). we note, however, that the most dramatic cases of looting, such as those carried out on the archaeological sites in southern iraq between and (emberling and hanson ), exploited the weaknesses of local control, impunity for perpetrators, and strong market demand, and were not organized following any online data stream.

on a more positive note, it is worth citing çatalhöyük in turkey, where a digital environment for direct engagement of small teams of researchers with models and datasets created in the field has been recently created through the means of immersive reality (forte , figs. – ). the main goal of this effort was to make the excavation process virtually reversible in a simulated environment.
immersive reality may represent one of the most groundbreaking applications for the study of archaeological data and, at the same time, provide the means for guiding the work of archaeologists in the field through a new type of interaction between excavators and networks of experts/scholars, based on continuous feedback (e.g., roosevelt et al. , p. ). the visualization of data is, in fact, a very active topic in all disciplines. it is worth mentioning that the way that we normally access imagery is likely to change very soon. it will soon be possible to navigate through 3d spaces relying on new advanced virtual reality technologies, as low-cost, high-performance, head-mounted displays are expected to be on the market very soon (salomoni et al. ). as they say, ‘‘an image is worth a thousand words,’’ much the same way ‘‘a 3d voyage is worth a thousand images.’’ in essence, this technological shift will potentially pave the way to widespread access to 3d data for the dissemination and use of archaeological 3d models. this simply means that, for example, a movie could be seen with the eyes of an actor, or a soccer match from different points of view on the field. we cannot exclude today that something like this may happen, and it could be very useful for archaeological fieldwork and/or carrying out research based on it. the discussion about data sharing among vres and the public entails, albeit not as a core issue, the question of data durability and software obsolescence. this is, of course, a most significant matter due to the enormous efforts that have been poured into the digitization of data and the software architectures needed for managing them. yet it is a question that does not have a direct bearing on our purpose, which is about creating new science through the collection and sharing of data within a relatively short time span.
beyond limitations of codification, archaeological practices may rely on digital infrastructures whose hardware and software will ultimately wear down. hence, the results of digitizing those practices may not only change but, in some sense, also deteriorate. over time, we run the risk that future generations will no longer be able to access the data. preservation in the digital realm may entail disruption and loss of data. from cave painting to cataloging, people have built cultural records of ritual activity and embodied knowledge. during the process of documentation, archaeologists code their data into imagery, text, tables, and audio to hold onto some aspects of their original appearance and throw away others. in this sense, the question of what to document and preserve (and how) should strike us as nothing new. yet, with digitization comes something more. we have possibilities for learning and interacting that were not available before. nonetheless, digitization comes at a cost: a continuous effort to adhere to digital format standards. if one wants to avoid the paradox of destroying finds by digital preservation, the research community should be committed to preserving standards and keeping them up to date (rosner et al. ). the use of open source software, or of fee-paying platforms that take care of conversion updates for hosted data as tdar does, provides a certain degree of assurance about data durability and accessibility. the key factor is that data should not be abandoned in a virtual box but constantly cared for by a system manager. data perusal and exploration must be ensured by constant data availability and access. free cloud space for data storage is unstable over time because of changing commercial policies, so either a dedicated server or a fee-paying service should be considered for that purpose.
interconnecting networks for new dissemination schemes
as laid out above, field archaeology relies on a set of operations that are limited by physical, technological, and human constraints, which slow the process to a degree not compatible with the development rate of other sciences. we believe we have to change the way we perform data collection in the field and the way we present the results of archaeological fieldwork. digital technologies can provide affordable means to reduce the time and costs of these operations. in addition, increasingly cheap and portable scientific instruments allow researchers to set up field laboratories and to make steady use of these technologies during all stages of fieldwork. digital technologies are instrumental in the creation of digital-born data, which through web-based storage and publication methods can overcome the blocks that presently affect the dissemination of primary datasets from archaeological fieldwork. the change in field practices must be accompanied by a change in publication and research behaviors (and, before that, in academic evaluation criteria). much research is already carried out mostly online, and this trend is only set to increase. we have to think about web-based methods of publication as the first output of archaeological research, and we advocate that the scholarly community strive to find ways of awarding academic credit for the production of digital contributions. these methods not only promise to speed up the publication pipeline; they can also significantly expand the resources available for research. by disseminating primary datasets from archaeological fieldwork promptly in the form of structured digital data, we can also change research in scope and rhythm and, as a consequence, make better science.
digital technologies can provide the means to build a new architecture of data production in the field, but on this platform, which each team can design according to its funds and abilities, we need to build a new epistemological approach, one that is shaped by collaboration, networking, and openness in all its stages, what we synthetically call the nearchos approach.
acknowledgments
we acknowledge the help of giampaolo luglio for preparing fig. and of massimo bozzoli for preparing fig. . we are grateful to two anonymous reviewers for their constructive insights.
open access
this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made.
references cited
appadurai, a. (ed.) ( ). the social life of things: commodities in cultural perspective, cambridge university press, cambridge.
artioli, g. ( ). scientific methods and cultural heritage: an introduction to the application of materials science to archaeometry and conservation science, oxford university press, oxford.
artioli, g., and angelini, i. ( ). mineralogy and archaeometry: fatal attraction. european journal of mineralogy : – .
atici, l., whitcher-kansa, s., lev-tor, j., and kansa, e. c. ( ). other people’s data: a demonstration of the imperative of publishing primary data. journal of archaeological method and theory : – .
berggren, å. ( ). reflexive approaches in archaeology, development of. in smith, c. (ed.), encyclopedia of global archaeology, springer, new york, pp. – .
berggren, å., dell’unto, n., forte, m., haddow, s., hodder, i., issavi, j., lercari, n., mazzucato, c., mickel, a., and taylor, j. s. ( ).
revisiting reflexive archaeology at çatalhöyük: integrating digital and d technologies at the trowel’s edge. antiquity : – .
bitelli, g., and girelli, v. a. ( ). metrical use of declassified satellite imagery for an area of archaeological interest in turkey. journal of cultural heritage : e –e .
carver, g. ( ). archaeological information systems (ais): adapting gis to archaeological contexts. in international congress of cultural heritage and new technologies–workshop ‘‘archäologie und computer’’, phoibos verlag, vienna (cd-rom).
cranshaw, j., and kittur, a. ( ). the polymath project: lessons from a successful online collaboration in mathematics. in proceedings of the sigchi conference on human factors in computing systems chi ’ , association for computing machinery (acm), new york, pp. – .
curci, a. ( ). working with d data in zooarchaeology: potential and perspectives. ocnus : – .
dallas, c. ( ). jean-claude gardin on archaeological data, representation and knowledge: implications for digital archaeology. journal of archaeological method and theory : – .
dell’unto, n., landeschi, g., leander touati, a. m., dellepiane, m., callieri, m., and ferdani, d. ( ). experiencing ancient buildings from a d gis perspective: a case drawn from the swedish pompeii project. journal of archaeological method and theory : – .
doneus, m., and briese, c. ( ). airborne laser scanning in forested areas: potential and limitations of an archaeological prospection technique. in cowley, d. c. (ed.), remote sensing for archaeological heritage management: proceedings of the th eac heritage management symposium (reykjavík ), european archaeological council (eac), brussels, pp. – .
doneus, m., and neubauer, w. ( ). laser scanners for d documentation of stratigraphic excavations. in baltsavias, e. p., gruen, a., van gool, l., and pateraki, m.
(eds.), recording, modeling and visualization of cultural heritage: proceedings of the international workshop, centro stefano franscini, monte verita, ascona, switzerland, may – , , taylor and francis, london, pp. – .
emberling, g., and hanson, k. ( ). catastrophe! the looting and destruction of iraq’s past, museum publication no. , oriental institute, chicago.
fanti, f., cau, a., cantelli, l., hassine, m., and auditore, m. ( ). new information on tataouinea hannibalis from the early cretaceous of tunisia and implications for the tempo and mode of rebbachisaurid sauropod evolution. plos one. https://doi.org/ . /journal.pone. .
forte, m. ( ). d archaeology: new perspectives and challenges: the example of çatalhöyük. journal of eastern mediterranean archaeology and heritage studies : – .
forte, m., dell’unto, n., issavi, j., onsurez, l., and lercari, n. ( ). d archaeology at çatalhöyük. international journal of heritage in the digital era : – .
gardin, j.-c. ( ). code pour l’analyse des cylindres orientaux, centre d’analyse documentaire pour l’archéologie, paris.
gardin, j.-c. ( ). four codes for the description of artifacts: an essay in archeological technique and theory. american anthropologist : – .
gardin, j.-c. ( ). code pour l’analyse des formes de poteries, centre national de la recherche scientifique (cnrs), paris.
gardin, j.-c. ( ). code pour l’analyse des ornements, centre national de la recherche scientifique (cnrs), paris.
giligny, f., djindjian, f., costa, l., moscati, p., and robert, s. (eds.) ( ). caa . st century archaeology. concepts, methods and tools: proceedings of the nd annual conference on computer applications and quantitative methods in archaeology, archaeopress, oxford.
hummler, m. ( ). recording in archaeology. in smith, c. (ed.), encyclopedia of global archaeology, springer, new york, pp. – .
ikeuchi, k., and miyazaki, d. ( ).
digitally archiving cultural objects, springer, new york.
jiménez fernández-palacios, b., nex, f., and remondino, f. ( ). arcube—the augmented reality cube for archaeology. archaeometry : – .
kansa, e. c. ( ). openness and archaeology’s information ecosystem. world archaeology : – .
kansa, e. c., and whitcher-kansa, s. ( ). we all know that a is a sheep: data publication and professionalism in archaeological communication. journal of eastern mediterranean archaeology & heritage studies : – .
kansa, e. c., whitcher-kansa, s., and watrall, e. (eds.) ( ). archaeology . : new approaches to communication and collaboration, cotsen digital archaeology series , cotsen institute of archaeology press, los angeles. available at: http://escholarship.org/uc/item/ r tb.
killick, d. ( ). the awkward adolescence of archaeological science. journal of archaeological science : – .
kintigh, k. w., altschul, j. h., beaudry, m. c., drennan, r. d., kinzig, a. p., kohler, t. a., limp, w. f., maschner, h. d., michener, w. k., pauketat, t. r., peregrine, p., sabloff, j. a., wilkinson, t. j., wright, h. t., and zeder, m. a. ( ). grand challenges for archaeology. american antiquity : – .
kintigh, k. w., altschul, j. h., kinzig, a. p., limp, w. f., michener, w. k., sabloff, j. a., hackett, e. j., kohler, t. a., ludäscher, b., and lynch, c. a. ( ). cultural dynamics, deep time, and data: planning cyberinfrastructure investments for archaeology. advances in archaeological practice ( ): – . https://doi.org/ . / - . . . .
kitchin, r. ( ). the data revolution: big data, open data, data infrastructures and their consequences, sage publications, thousand oaks, ca.
landeschi, g., dell’unto, n., lundqvist, k., ferdani, d., campanaro, d. m., and leander touati, a. m. ( ). d-gis as a platform for visual analysis: investigating a pompeian house. journal of archaeological science : – .
lasaponara, r., and masini, n. (eds.) ( ). satellite remote sensing: a new tool for archaeology, springer, new york.
londoño, w.
( ). reflexivity in archaeology. in smith, c. (ed.), encyclopedia of global archaeology, springer, new york, pp. – .
marwick, b., d’alpoim guedes, j., barton, m., bates, l. a., baxter, m., bevan, a., bollwerk, a., bocinsky, r. k., brughmans, t., carter, a. k., conrad, c., contreras, d. a., costa, s., crema, e. r., daggett, a., davies, b., drake, l., dye, t. s., france, p., fullagar, r., giusti, d., graham, s., harris, m. d., hawks, j., heath, s., huffer, d., kansa, e. c., whitcher kansa, s., madsen, m. e., melcher, j., negre, j., neiman, f. d., opitz, r., orton, d. c., przystupa, p., raviele, m., riel-salvatore, j., riris, p., romanowska, i., strupler, n., ullah, i. i., van vlack, h. g., watrall, e. c., webster, c., wells, j., winters, j., and wren, c. d. ( ). open science in archaeology. the society for american archaeology archaeological record ( ): – .
minto, s., and remondino, f. ( ). online access and sharing of reality-based d models. scires-it - scientific research and information technology : – . https://doi.org/ . /i v n p .
molloy, j. ( ). the open knowledge foundation: open data means better science. plos biology. https://doi.org/ . /journal.pbio. .
mueller, t. ( ). how tomb raiders are stealing our history. national geographic : – .
pollard, a. m., and heron, c. (eds.) ( ). archaeological chemistry, nd ed., royal society of chemistry, cambridge.
roosevelt, c. h., cobb, p., moss, e., olson, b. r., and ünlüsoy, s. ( ). excavation is destruction digitization: advances in archaeological practice. journal of field archaeology : – .
rosner, d., roccetti, m., and marfia, g. ( ). the digitization of cultural practices. communications of the association for computing machinery : – . https://doi.org/ . / . .
salomoni, p., prandi, c., roccetti, m., casanova, l., marchetti, l., and marfia, g. ( ). diegetic user interfaces for virtual environments with hmds: a user experience study with oculus rift. journal on multimodal user interfaces : – .
snow, d., gahegan, m., giles, c., hirth, k., milner, g., mitra, p., and wang, j. ( ). information science, cybertools and archaeology. science : – .
smith, m. e., feinman, g. m., drennan, r. d., earle, t. k., and morris, i. ( ). archaeology as a social science. proceedings of the national academy of sciences usa : – .
smith, m. e. ( ). do publishing trends collide with the grand challenges of archaeology? the saa archaeological record : .
smith, m. e. ( ). social science and archaeological inquiry. antiquity : – .
ur, j. ( ). spying on the past: declassified intelligence satellite photographs and near eastern landscapes. near eastern archaeology : – .
weiner, s. ( ). microarchaeology: beyond the visible archaeological record, cambridge university press, cambridge.
wilson, a. t., and edwards, b. (eds.) ( ). open source archaeology: ethics and practice, de gruyter, berlin. available at: http://www.degruyter.com/view/product/ .
wilson, c. a., davidson, d. a., and cresser, m. s. ( ). multi-element soil analysis: an assessment of its potential as an aid to archaeological interpretation. journal of archaeological science : – .
bibliography of recent literature
albarella, u. (ed.) ( ). environmental archaeology: meaning and purpose, kluwer academic publishers, dordrecht.
aspöck, e., and masur, a. ( ). digitizing early farming cultures customizing the arches heritage inventory & management system. in proceedings of digital heritage international congress , . sept.– . oct., granada, spain. ieee press, new york, pp. – . https://doi.org/ . /digitalheritage. . .
bitelli, g., girardi, f., and girelli, v. a. ( ). digital enhancement of the d scan of suhi i’s stele from karkemish. orientalia : – .
curci, a., urcia, a., lippiello, l., and gatto, m.
c. ( ). using digital technologies to document rock art in the aswan-kom ombo region (egypt). sahara : – .
de reu, j., de smedt, p., herremans, d., van meirvenne, m., laloo, p., and de clercq, w. ( ). on introducing an image-based d reconstruction method in archaeological excavation practice. journal of archaeological science : – .
forte, m., and pietroni, e. ( ). d collaborative environments in archaeology: experiencing the reconstruction of the past. international journal of architectural computing : – . https://doi.org/ . / .
grosman, l., karasik, a., harush, o., and smilanksy, u. ( ). archaeology in three dimensions: computer-based methods in archaeological research. journal of eastern mediterranean archaeology and heritage studies : – .
hunt, a. m., and speakman, r. j. ( ). portable xrf analysis of archaeological sediments and ceramics. journal of archaeological science : – .
ioannides, m., hadjiprocopis, n., doulamis, a., doulamis, e., protopapadakis, k., and makantasis, p. ( ). online d reconstruction using multi-images available under open access. isprs annals of photogrammetry, remote sensing and spatial information sciences : – .
martinón-torres, m., and rehren, t. (eds.) ( ). archaeology, history and science: integrating approaches to ancient materials, left coast press, walnut creek, ca.
matsumoto, g. ( ). fill in the gap between theory and practice: making a gis-based digital map of pachacamac. in wilkins, j., and anderson, k. (eds.), tools of the trade: methods, techniques and innovative approaches in archaeology, university of calgary press, calgary, pp. – .
pearce, n., weller, m., scanlon, e., and ashleigh, m. ( ). digital scholarship considered: how new technologies could transform academic work. in education : – .
rathje, w., shanks, m., and witmore, c. (eds.) ( ). archaeology in the making: conversations through a discipline, routledge, london.
redman, c. l., grove, j. m., and kuby, l. h. ( ). integrating social science into the long-term ecological research (lter) network: social dimensions of ecological change and ecological dimensions of social change. ecosystems : – .
remondino, f., and campana, s. ( ). d recording and modelling in archaeology and cultural heritage: theory and best practices, bar international series s , archaeopress, oxford.
rosenbauer, c., rutishauser, s., trachsel, t., kilchör, f., and wittlin, e. ( ). the virtual cilicia project: how to use google earth as a visualization environment in an archaeological context. in sieck, j. (ed.), kultur und informatik: visual worlds & interactive spaces, werner hülsbusch, berlin, pp. – .
tringham, r. ( ). households through a digital lens. in parker, b. j., and foster, c. p. (eds.), new perspectives on household archaeology, eisenbrauns, winona lake, wi, pp. – .
vincent, m. l., kuester, f., and levy, t. e. ( ). opendig: contextualizing the past from the field to the web. mediterranean archaeology and archaeometry : – .
d-lib magazine march/april volume , number /
a mobile interface for dspace
elías tzoc
miami university libraries
tzoce@miamioh.edu
doi: . /march -tzoc
abstract
academic libraries were among the first adopters of mobile websites in universities, but much of the early development was focused exclusively on traditional library content such as the library's homepage, catalog, contact information, etc. as libraries continue to work on new technology developments, a mobile interface for their institutional repositories can be a good new way to reach out to faculty and other interested parties. miami university's scholarly commons runs on dspace as part of a shared infrastructure administered by ohiolink. dspace is used at academic institutions, research and resource centers, museums, national libraries, and government and commercial organizations. with over a thousand installations in more than countries, dspace is the most widely used open source repository platform by any measure. the steady popularity of dspace suggests that many institutions will benefit from an out-of-the-box mobile interface. this article describes the development and implementation of the first mobile interface developed for dspace using the jquery mobile framework.
keywords: mobile interface, dspace, institutional repositories, jquery mobile
i. introduction
academic libraries were among the early adopters of mobile interfaces in universities, although most of those mobile developments were focused exclusively on limited types of services such as a library's homepage, catalog, contact information, etc. in the article "on the move with the mobile web: libraries and mobile technologies", the author provided a summary of major university libraries' mobile initiatives and stated, "libraries are mastering the mobile web to bring patrons a new set of services, services that their users are coming to expect from their communities and content providers" (kroski, ).
in , library journal conducted a mobile library survey in which respondents indicated that " % of academic libraries and % of public libraries currently offer some type of mobile services to their customers; two out of five libraries of all types, academic and public, report plans to 'go mobile' in the near future" (thomas, ). another study published in presented the types of services available on association of research libraries' mobile sites (aldrich, ). in that publication, the author also compared the results with the literature identifying what mobile web users desired and provided an initial benchmark for comparisons with other institutions. the development of library services and sites tailored to mobile users has clearly gained momentum in the last four years, and to many it may seem to have reached a level of maturity; however, libraries' content goes beyond what we now see in current library mobile sites. a recent publication on mobile sites for other types of library content is the article "developing mobile access to digital collections", in which the authors presented the "findings from in-depth case studies of four selected institutions and university libraries" that offer mobile services for their digital collections or cultural heritage collections (mitchell & suchy, ). as libraries continue to work on new technology developments, a mobile interface for their institutional repositories can be a good new way to reach out to faculty and other parties interested in scholarly communication. in this article, we describe the development and implementation of the first mobile interface for dspace, an open source application used worldwide for institutional repositories.
ii. why a mobile theme for dspace?
dspace was developed by the massachusetts institute of technology (mit) libraries and the hewlett-packard labs; it was first released in and has become one of the most widely used platforms for open access digital repositories at academic institutions, research and resource centers, national libraries, and government and commercial organizations. as of august , the "usage of open access repository software" report in the directory of open access repositories (opendoar) indicates that ( . %) of the registered repositories are using dspace. on the same date, the dspace registry reported , live dspace instances. some of the benefits of using dspace for scholarly content include: built-in workflows for submitting data in any file format, international standards for metadata, access to an active community of developers, availability of the xmlui framework for creating customizable front-ends, and a growing list of service providers. one of the most recent and significant projects using dspace is the world bank open knowledge repository, which was released in april . as for mobile sites for institutional repositories in general, little to nothing seems to have been written or done. in the section 'possible research directions' of the article "institutional repositories: features, architecture, design and implementation technologies", the authors briefly discussed mobile access, saying "of all the eleven ( ) ir platforms reviewed, only greenstone supports access via mobile devices" (adewumi & ikhu-omoregbe, ). the steady popularity of dspace in the last few years suggests that many institutions will benefit from an out-of-the-box mobile interface. one of the dilemmas in providing mobile access is deciding whether to create an app for every major mobile device in the market or to create a mobile-optimized website.
for us and many others, the best option was to develop a mobile-optimized website that can work on a wide range of mobile devices. the jquery mobile (jqm) framework is one of the better tools currently available for this approach. jqm is designed so that sites work on low-capability phones as well as on high-end touch smartphones. the use of html standards in jqm allows developers and web designers to create a single, highly branded website that is intended to work on all popular smartphone, tablet, and desktop platforms. miami university libraries' institutional repository, scholarly commons, is currently running on dspace . and is one of many instances hosted at the ohio library and information network (ohiolink) in a shared infrastructure. since the deployment of the first website in , the digital initiatives team has been using the xmlui framework, initially developed by the texas a&m university libraries, to implement several front-end customizations. in early , given our previous experience in creating dspace themes, the need for a mobile interface for scholarly commons, and the features available in the jqm framework, we decided to propose the creation of a mobile theme for dspace. a theme is a modular interface layer that allows developers to create customized web interfaces for a dspace repository, community or collection. a quick check of the dspace-tech list archive confirmed that a mobile theme was added to the list of new features in september , but nothing further had been done after that. with this in mind, and in the hope that our work can help some of the organizations already using dspace, we decided to add the mobile theme as a goal for the summer of .
iii. development and implementation
the real adventure started in late may with a message to the dspace-tech listserv with a couple of basic questions regarding the xmlui webapp.
the replies provided useful suggestions for setting up the required files for a mobile theme experiment. the next step included two key activities:
a) creating a wireframe for the mobile interface. figure below illustrates the first design created using the drag-and-drop ui builder available on the jqm's homepage. even in this early part of the project it was important to stick to basic rules: data entry on mobile devices should be minimized, and the design should be kept simple and clean. because of those rules, and since the target audience for the mobile theme is end-users and not dspace administrators, we decided to remove the entire "ds-options" sidebar, which contains all of the administrative functions in dspace.
b) researching mobile "best practices". the first document we reviewed was "library/mobile: tips on designing and developing mobile web sites" (griggs, bridges & gascho, ), where the authors provide key design and development strategies for building mobile websites. reading and becoming familiar with the wealth of information on the jqm's "demos & documentation" page was also important. it was very useful to watch the "mobile web design & development fundamentals" tutorial by joe marini, especially the chapters on mobile web development guidelines, setting up a development environment, and putting it all into practice.
figure : wireframe for dspace mobile theme
in june, we incorporated proper html elements as specified in the mobile page structure template in the default miami.xsl file. after completing the structure for the front page, the next step was to create a mobile theme using the themeroller site. at this point, we had a decent working mobile-optimized site, but we also noticed that even with the new page structure and the generated css files, several dspace elements that are dynamically generated and context sensitive (e.g. collection list or view) were still displaying in a non-mobile style.
a first fix was to create a new javascript file and use the .html() method to modify the html on the fly; however, this solution in jqm requires a manual page refresh, and we cannot expect users to refresh a page every time they move to a new page. plan b was to create a new css file and add all the required css tweaks to give the entire site a mobile look-and-feel. below is the basic template for the front page. another section of the mobile.xsl file that required a major customization was the template for displaying individual items. figure illustrates an example of a typical item page and its five main elements: title; share button; metadata with six fields (title, author/s, description/abstract, url, date, and related items in google scholar); thumbnail and link for downloading files; and a link back to the item's collection or community. in mid july, we had a mobile theme running on a second copy of the xmlui webapp on a dspace . installation. we later learned this is probably not the best way to do this, but back then it was the only way we got it to work, and it allowed us to continue testing and tweaking some css, xml, xsl, and js files. another lesson learned during this phase of the project was to always devote enough time to testing your site on different devices or in every major device emulator available. for us, the opera mobile emulator and the android sdk were very useful in testing and adjusting the site to small screen sizes (even × ); the ios simulator worked just fine, but the real headache was the windows phone emulator, which simply did not work and kept displaying a "not loading" error. after considerable time spent researching the error, we learned that it was caused by an html element without a closing tag. we spent the rest of july adjusting and testing the mobile theme in every device and emulator available.
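the article does not say how the missing closing tag was eventually located. as a rough illustration only, the following javascript sketch shows one way generated markup could be screened before it reaches an emulator. the function name findUnclosedTags and the sample markup are assumptions made for this example and are not part of the dspace mobile theme. the check is a simple tag matcher rather than a real html parser, so it is stricter than html itself (it also flags optional end tags such as a missing </li>) and it does not cope with a ">" inside an attribute value or with script content.

```javascript
// Hypothetical helper for illustration; not part of the DSpace mobile theme.

// HTML elements that never take a closing tag.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "param", "source", "track", "wbr",
]);

// Returns a list of human-readable problems found in the markup.
function findUnclosedTags(markup) {
  // Comments may contain tag-like text, so drop them first.
  const withoutComments = String(markup).replace(/<!--[\s\S]*?-->/g, "");
  // Captures: optional "/" (closing tag), tag name, optional "/" (self-closing).
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9:-]*)\b[^>]*?(\/?)>/g;
  const open = [];      // stack of currently open element names
  const problems = [];
  let match;
  while ((match = tagPattern.exec(withoutComments)) !== null) {
    const [, closing, rawName, selfClosing] = match;
    const name = rawName.toLowerCase();
    if (VOID_ELEMENTS.has(name) || selfClosing) continue;
    if (!closing) {
      open.push(name);
      continue;
    }
    const index = open.lastIndexOf(name);
    if (index === -1) {
      problems.push("unexpected </" + name + ">");
      continue;
    }
    // Everything opened after the matching start tag was never closed.
    for (const unclosed of open.splice(index).slice(1)) {
      problems.push("unclosed <" + unclosed + ">");
    }
  }
  // Anything still on the stack at the end has no closing tag at all.
  for (const unclosed of open) problems.push("unclosed <" + unclosed + ">");
  return problems;
}

// Sample markup loosely modelled on a jQuery Mobile page wrapper.
console.log(findUnclosedTags('<div data-role="page"><span>text</div>'));
```

running a check of this kind against the html produced by a theme is much cheaper than cycling through emulators, although a full validator remains the more reliable option.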
figure : example of an item's page
in early august, we ran some final tests and decided it was time to share the link with others and start collecting external feedback. we posted a comment and a link to a first zip file on the duraspace jira ticket page. at the same time, we wanted to test the new interface on an ohiolink machine, and this was when we learned that duplicating the xmlui webapp was neither effective nor sustainable, especially in a shared infrastructure. james russell, ohiolink developer, came up with a solution that required a second domain name, and it turned out to be a robust solution for us. figure illustrates four screenshots of the new dspace mobile interface as viewed on four different emulators. on september th, we were pleased to see it up and running on a production machine. to access the site, visit http://sc.lib.muohio.edu/. if your device is not detected as mobile, try the mobile site http://mobile.sc.lib.muohio.edu/.
figure : dspace mobile site on ios, android, opera and windows phone emulators
for anyone interested in installing and further customizing the dspace mobile theme, a copy of the released version is available. a working copy is available on the dspace site on github. it was added to dspace version . , released in november . details about the theme file structure and an installation guide are available in the appendix at the end of this article.
iv. conclusions
overall, this summer project turned out to be quite exciting and positive, as it allowed us to implement a mobile interface for our scholarly commons site and, perhaps most importantly, the project was added to the core code of dspace . , which will potentially benefit many dspace users. as noted earlier, this first mobile theme is exclusively focused on end-users of the dspace repository whose main activities are likely to be searching and browsing for scholarly content.
among the future activities that can help to further develop the mobile theme for dspace are: a) developing a usability test of the current theme; b) developing mobile-friendly structures for complex pages such as the "advanced search" page; and c) evaluating the feasibility of a mobile interface for dspace administrators, which may incorporate the sword (simple web-service offering repository deposit) protocol to easily ingest items into a repository. finally, as we worked on this mobile project, we learned how hard it can be to stay compatible with the majority of the mobile devices and screen sizes currently on the market. in fact, many web designers and developers are quickly getting to the point where they are unable to keep up with the rapidly growing number of new devices and screen sizes. they argue that, for many websites, creating a version for each resolution and new device would be impractical. the discussion seems to suggest that the solution lies in "responsive web design", an approach in which the site is created to provide an optimal user experience regardless of the end-user's device. thus, an interesting question for the dspace community to consider is whether it would be feasible to explore the possibilities of responsive web design for dspace.

acknowledgements

the author would like to thank james russell, ohiolink developer, for helping with the implementation of a second domain for the mobile theme; and ivan masár, member of the dspace committer team, for providing technical assistance in adding the code to the dspace . release.

references

[ ] adewumi, a. o. & ikhu-omoregbe, n. a. ( ). institutional repositories: features, architecture, design and implementation technologies. journal of computing, vol. no. .
[ ] aldrich, a. w. ( ). universities and libraries move to the mobile web. educause quarterly, vol. no. .
[ ] griggs, k., bridges, l.m., & gascho, h. ( ).
library/mobile: tips on designing and developing mobile web sites. code lib journal, issue .
[ ] kroski, e. ( ). on the move with the mobile web: libraries and mobile technologies. library technology reports, american library association.
[ ] mitchell, c. & suchy, d. ( ). developing mobile access to digital collections. d-lib magazine vol. , no. / . http://dx.doi.org/ . /january -mitchell
[ ] thomas, c. l. ( ). gone mobile? (mobile libraries survey ). library journal, issue .

appendix

the mobile theme file structure is:

+-- mobile
|   +-- lib
|   |   +-- cookies.js
|   |   +-- detectmobile.js
|   |   +-- images
|   |   |   +-- ajax-loader.gif
|   |   |   +-- default-thumbnail.png
|   |   |   +-- icons- -black.png
|   |   |   +-- icons- -white.png
|   |   |   +-- icons- -black.png
|   |   |   +-- icons- -white.png
|   |   +-- m-tweaks.css
|   |   +-- sc-mobile.css
|   |   +-- sc-mobile.min.css
|   |   +-- mobile.xsl
|   |   +-- sitemap.xmap
|   |   +-- themes.xmap
|   +-- readme.txt

the installation process is as follows:

get a new domain name that is an alias of the existing domain name for your dspace installation, e.g., if your current domain is yoursite.edu, your new domain name might be mobile.yoursite.edu. (these instructions assume that the new domain name starts with 'mobile.' if it is something else, you will need to make a change in step .)

copy the mobile theme folder into your xmlui theme folder, e.g., ../dspace/webapps/xmlui/themes/.

copy the messages_mobile.xml file into the default i n folder, e.g., ../dspace/webapps/xmlui/i n. this is a workaround that guarantees that only the mobile theme reads this file, as it contains new/short labels for the mobile interface.

add a call for the detectmobile.js and cookies.js files in the header of your current main theme.xsl file. it should look like this example: in this file, you can also add a "view mobile site" link in the footer section, which will allow users to view the full site on their mobile devices. the cookies.js file saves this preference, but it is erased when the session is closed.

open the detectmobile.js file and enter your new mobile domain at the end of the function call, e.g., mobile.yoursite.edu. if you choose a domain name or theme name other than "mobile", make sure to update the settings in the sitemap.xmap.

in mobile.xml, find the link "view full website" and replace the references to yoursite.edu with the domain name for your main site. check for lines - .

replace or edit the themes.xmap file located in your default theme folder, e.g., ../dspace/webapps/xmlui/themes/. the code for setting up the properties for the domain is in lines - . this will need to be changed if the domain name for your mobile site starts with something other than 'mobile.'

restart tomcat. you should be able to see the mobile theme in action; to change the look-and-feel, go to http://jquerymobile.com/themeroller/ and either create your own files or import/upgrade the uncompressed sc-mobile.css file.

about the author

elías tzoc, originally from guatemala, assists the head of the center for digital scholarship to provide miami university scholars with the facilities, services, and expertise to support the creation and use of digital scholarship in all its forms. his current/recent work includes: developing and prototyping web interfaces for digital projects using contentdm, dspace, wordpress, ojs, and omeka; researching new access points and mobile apps for digital library programs using the jquery mobile framework; researching and publishing on technical issues and open source applications for libraries; writing and co-leading grant-funded projects; and developing web plug-ins using php, html , xslt, css, and jquery.

copyright © elías tzoc

scholarly electronic publishing bibliography
charles w. bailey, jr.
digital scholarship
houston, tx

scholarly electronic publishing bibliography copyright © by charles w. bailey, jr. cover photographs (before alteration) by nasa. this work is licensed under the creative commons attribution-noncommercial . united states license. to view a copy of this license, visit http://creativecommons.org/licenses/by-nc/ . /us/ or send a letter to creative commons, second street, suite , san francisco, california, , usa. digital scholarship, houston, tx. http://www.digital-scholarship.org/

the author makes no warranty of any kind, either express or implied, for information in the scholarly electronic publishing bibliography , which is provided on an "as is" basis. the author does not assume and hereby disclaims any liability to any party for any loss or damage resulting from the use of information in the scholarly electronic publishing bibliography .

in memory of paul evan peters ( - ), founding executive director of the coalition for networked information, whose visionary leadership at the dawn of the internet era fostered the development of scholarly electronic publishing.

table of contents

preface
economic issues
electronic books and texts
  electronic books and texts: case studies and history
  electronic books and texts: general works
  electronic books and texts: library issues
  electronic books and texts: research
electronic serials
  electronic serials: case studies and history
  electronic serials: critiques
  electronic serials: electronic distribution of printed journals
    early experimental projects
      core, cornell university
      red sage project, university of california, san francisco
      superjournal project, elib
      tulip, elsevier science
    jstor
    other projects
    project muse, johns hopkins university
  electronic serials: general works
  electronic serials: library issues
  electronic serials: research
general works
  general works: research (multiple-types of electronic works)
legal issues
  legal issues: digital copyright
  legal issues: license agreements
library issues
  library issues: digital libraries
    early digital library projects
      alexandria project, university of california, santa barbara
      digital library initiative, university of illinois at urbana-champaign
      informedia, carnegie mellon university
      mercury project, carnegie mellon university
      stanford digital library project
      uc berkeley digital library project
      university of michigan digital library project
    general
    national digital library, library of congress
    other projects and systems
  library issues: digital preservation
  library issues: general works
  library issues: metadata and linking
new publishing models
publisher issues
  publisher issues: digital rights management and user authentication
repositories, e-prints, and oai
appendix a. related bibliographies
appendix b. about the author
preface

the scholarly electronic publishing bibliography presents over , articles, books, and a limited number of other textual sources that are useful in understanding scholarly electronic publishing efforts on the internet. the bibliography is selective. all included works are in english. the bibliography does not cover digital media works (such as mp files), e-mail messages, letters to the editor, presentation slides or transcripts, unpublished e-prints, or weblog postings. most sources have been published from through ; however, a limited number of key sources published prior to are also included. the bibliography includes links to many freely available versions of included works. such links, even to publisher versions and versions in disciplinary archives and institutional repositories, are subject to change. urls may alter without warning (and often without automatic forwarding) or they may disappear altogether. inclusion of links to works on authors' personal websites is highly selective. note that e-prints and published articles may not be identical.

economic issues

anglada, lluis, and nuria comellas. "what's fair? pricing models in the electronic era." library management , no. / ( ): - .
baker, david, and wendy evans, eds. digital library economics. oxford: chandos, .
bannerman, ian. "pricing on-line journals." serials (march ): - .
bauer, kathleen. "cost analysis of a project to digitize classic articles in neurosurgery." journal of the medical library association (april ): - . http://www.pubmedcentral.gov/picrender.fcgi?action=stream&blobtype=pdf&artid=
bennett, scott. "just-in-time scholarly monographs." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . .
besser, howard. "digital image distribution: a study of costs and uses." d-lib magazine (october ). http://www.dlib.org/dlib/october / besser.html
bide, mark, charles oppenheim, and anne ramsden. "charging mechanisms for digitized texts."
learned publishing (april ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
blixrud, julia c., and timothy d. jewell. "understanding electronic resources and library materials expenditures: an incomplete picture." arl: a bimonthly newsletter of research library issues and actions, no. ( ): - . http://www.arl.org/bm~doc/expend.pdf
bonn, maria. "benchmarking conversion costs: a report from the making of america iv project." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature
bonn, maria s., wendy p. lougee, jeffrey k. mackie-mason, and juan f. riveros. "a report on the peak experiment: context and design." d-lib magazine (june ). http://www.dlib.org/dlib/june / bonn.html
bot, marjolein, johan burgemeester, and hans roes. "the cost of publishing an electronic journal: a general model and a case study." d-lib magazine (november ). http://www.dlib.org/dlib/november / roes.html
bowen, william g. "jstor and the economics of scholarly communication." journal of library administration , no. / ( ): - .
boyce, peter b. "costs, archiving and the publishing process in electronic stm journals." against the grain (december -january ): - .
butler, meredith a., and bruce r. kingma. the economics of information in the networked environment. washington, dc: association of research libraries, .
byrd, sam, glenn courson, elizabeth roderick, and jean marie taylor. "cost/benefit analysis for digital library projects: the virginia historical inventory project (vhi)." the bottom line: managing library finances , no. ( ): - .
cavaleri, piero, michael keren, giovanni b. ramello, and vittorio valli. "publishing an e-journal on a shoe string: is it a sustainable project?" economic analysis and policy , no. ( ): - . http://www.eap-journal.com.au/download.php?file=
chen, frances l., paul wrynn, and judith l. rieke. "electronic journal access: how does it affect the print subscription price?"
bulletin of the medical library association (october ): - . http://www.pubmedcentral.gov/picrender.fcgi?action=stream&blobtype=pdf&artid=
ching, steve h., maria w. leung, margarret fidow, and ken l. huang. "allocating costs in the business operation of library consortium: the case study of super e-book consortium." library collections, acquisitions, and technical services , no. ( ): - .
clarke, roger. "the cost profiles of alternative approaches to journal publishing." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ /
connaway, lynn silipigni, and stephen r. lawrence. "comparing library resource allocations for the paper and the digital library: an exploratory study." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /connaway/ connaway.html
cooper, michael d. "the costs of providing electronic journal access and printed copies of journals to university users." the library quarterly , no. ( ).
courant, paul n. "scholarship and academic libraries (and their kin) in the world of google." first monday , no. / ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/
day, colin. "the economics of electronic publishing: some preliminary thoughts." in gateways, gatekeepers, and roles in the information omniverse: proceedings of the third symposium, ed. ann okerson and dru mogge, - . washington, dc: office of scientific and academic publishing, association of research libraries, . http://eric.ed.gov/ericwebportal/contentdelivery/servlet/ericservlet?accno=ed
———. "the economics of publishing: the consequences of library and research copying." journal of the american society for information science (december ): - . http://hdl.handle.net/ . /
———. "judging journal prices: a cost index for academic journals." journal of scholarly publishing , no. ( ): - .
———. "pricing electronic products." in filling the pipeline and paying the piper: proceedings of the fourth symposium, ed. ann okerson, - .
washington, dc: office of scientific and academic publishing, association of research libraries, .
fisher, janet h. "comparing electronic journals to print journals: are there savings?" in technology and scholarly communication, ed. richard ekman and richard e. quandt, - . berkeley: university of california press, . http://eric.ed.gov/ericwebportal/contentdelivery/servlet/ericservlet?accno=ed
———. "the true costs of an electronic journal." serials review , no. ( ): - .
fisher, julian h. "scholarly publishing re-invented: real costs and real freedoms." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . .
fox, peter. "archiving of electronic publications—some thoughts on cost." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
frantsvåg, jan erik. "the role of advertising in financing open access journals." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/
frey, kelly l. "business models and pricing issues in the digital domain." journal of library administration , no. ( ): - .
getz, malcolm. "electronic publishing: an economic view." serials review , no. - ( ): - .
———. "evaluating digital strategies for storing and retrieving scholarly information." journal of library administration , no. ( ): - .
ginn, claire. "calculating pricing models choices: rising to the challenge." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
gorman, g. e. "google print and the principle of functionality." online information review , no. ( ): - .
greco, albert n., robert francis jones, robert m. wharton, and hooman estelami. "the changing college and university library market for university press books and journals: - ." journal of scholarly publishing , no. ( ).
green, toby. "can the monograph help solve the library 'serials' funding crisis?" serials , no. ( ): - .
grycz, czeslaw jan. "economic models for networked information."
serials review , no. - ( ): - .
halliday, leah, and charles oppenheim. "comparison and evaluation of some economic models of digital-only journals." journal of documentation (november ): - .
———. "economic models of digital-only journals." serials (july ): - .
hahn, karla. "tiered pricing: implications for library collections." portal: libraries and the academy , no. ( ): - .
hardy, rachel, charles oppenheim, and iris rubbert. "pelican: a pricing mechanism for electronic distribution of materials to the higher education community." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
haar, john m. "project peak: vanderbilt's experience with articles on demand." the serials librarian , no. / ( ): - .
harnad, stevan. "electronic scholarly publication: quo vadis?" serials review , no. ( ): - . http://cogprints.org/ /
hide, branwen. "how much does it cost, and who pays? the global costs of scholarly communication and the uk contribution." serials: the journal for the serials community , no. ( ): - .
holmes, aldyth. "electronic publishing in science: reality check." canadian journal of communication , no. / ( ): - . http://www.cjc-online.ca/index.php/journal/article/view/ /
holmstrom, jonas. "the cost per article reading of open access articles." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /holmstrom/ holmstrom.html
———. "the return on investment of electronic journals—it is a matter of time." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /holmstrom/ holmstrom.html
houghton, john w. "crisis and transition: the economics of scholarly communication." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
houghton, john w., and charles oppenheim. "the economic implications of alternative publishing models: views from a non-economist." prometheus: critical studies in innovation , no. ( ): - .
houghton, john, bruce rasmussen, peter sheehan, charles oppenheim, anne morris, claire creaser, helen greenwood, mark summers, and adrian gourlay. economic implications of alternative scholarly publishing models: exploring the costs and benefits. london: jisc, . http://www.jisc.ac.uk/media/documents/publications/rpteconomicoapublishing.pdf
houghton, john, and peter sheehan. the economic impact of enhanced access to research findings. melbourne: centre for strategic economic studies, victoria university, . cses working paper no. . http://eprints.vu.edu.au/ /
houghton, john, colin steele, and peter sheehan. research communication costs in australia: emerging opportunities and benefits. melbourne: centre for strategic economic studies, victoria university, . http://www.dest.gov.au/nr/rdonlyres/ acb f-ea d- faf-b f - f b / /dest_research_communications_cost_report_sept .pdf
hunter, karen. "the effect of price: early observations." in technology and scholarly communication, ed. richard ekman and richard e. quandt, - . berkeley: university of california press, . http://www.eric.ed.gov/ericwebportal/contentdelivery/servlet/ericservlet?accno=ed
kahin, brian, and hal r. varian, eds. internet publishing and beyond: the economics of digital information and intellectual property. cambridge, ma: mit press, .
keyhani, andrea. "innovations in cost recovery." in filling the pipeline and paying the piper: proceedings of the fourth symposium, ed. ann okerson, - . washington, dc: office of scientific and academic publishing, association of research libraries, .
kiernan, vincent. "paying by the article: libraries test a new model for scholarly journals." the chronicle of higher education, august , a -a .
king, donald w. "some economic aspects of the internet." journal of the american society for information science (september ): - .
king, donald w., and frances m. alvarado-albertorio. "pricing and other means of charging for scholarly journals: a literature review and commentary."
learned publishing , no. ( ): - .
king, donald w., peter b. boyce, carol hansen montgomery, and carol tenopir. "library economic metrics: examples of the comparison of electronic and print journal collections and collection services." library trends , no. ( ): - . http://hdl.handle.net/ /
king, donald w., and jose-marie griffiths. "economic issues concerning electronic publishing and distribution of scholarly articles." library trends (spring ): - . http://hdl.handle.net/ /
king, donald w., and carol tenopir. "evolving journal costs: implications for publishers, libraries, and readers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
kingma, bruce r. "the costs of print, fiche, and digital access: the early canadiana online project." d-lib magazine (february ). http://www.dlib.org/dlib/february /kingma/ kingma.html
kyrillidou, martha. "the impact of electronic publishing on tracking research library investments in serials." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr serials.pdf
lesk, michael. "pricing electronic information." serials review , no. - ( ): - .
lustig, harry. "electronic publishing: economic issues in a time of transition." astrophysics and space science , no. - ( ): - .
lynch, clifford a. "scholarly communication in the networked environment: reconsidering economics and organizational missions." serials review , no. ( ): - .
mackie-mason, jeffrey k., and alexandra l. l. jankovich. "peak: pricing electronic access to knowledge." library acquisitions: practice & theory , no. ( ): - .
mackie-mason, jeffrey k., juan f. riveros, maria s. bonn, and wendy p. lougee. "a report on the peak experiment: usage and economic behavior." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /mackie-mason/ mackie-mason.html
marks, robert h. "the economic challenges of publishing electronic journals." serials review , no.
( ): - .
maron, nancy l., k. kirby smith, and matthew loy. sustaining digital resources: an on-the-ground view of projects today. new york: ithaka, . http://www.ithaka.org/ithaka-s-r/strategy/ithaka-case-studies-in-sustainability
meadows, jack, david pullinger, and peter such. "the cost of implementing an electronic journal." journal of scholarly publishing (july ): - .
metz, paul, and paul m. gherman. "serials pricing and the role of the electronic journal." college & research libraries (july ): - .
meyer, richard w. "monopoly power and electronic journals." the library quarterly (october ): - .
montgomery, carol hansen. "measuring the impact of an electronic journal collection on library costs." d-lib magazine (october ). http://www.dlib.org/dlib/october /montgomery/ montgomery.html
———. "print to electronic: measuring the operational and economic implications of an electronic journal collection." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
montgomery, carol hansen, and donald w. king. "comparing library and user related costs of print and electronic journal collections: a first step towards a comprehensive analysis." d-lib magazine (october ). http://www.dlib.org/dlib/october /montgomery/ montgomery.html
morris, sally. "the true costs of scholarly journal publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
nicholas, david, paul huntington, tom dobrowolski, and ian rowlands. "ideas on creating a consumer market for scholarly journals." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
odlyzko, andrew. "the economics of electronic journals." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/
okerson, ann. "a librarian's view of some economic issues in electronic scientific publishing."
paper presented at the unesco invitational meeting on the future of scientific information, paris, february . http://www.library.yale.edu/~okerson/unesco.html
oppenheim, charles. "will pelican fly?" serials (july ): - .
orsdel, lee c. van, and kathleen born. "serial wars." library journal, april . http://www.libraryjournal.com/article/ca .html
peters, paul evan. "cost centers and measures in the networked information value-chain." journal of library administration , no. / ( ): - .
quandt, richard e. "scholarly materials: paper or digital?" library trends , no. ( ): - . http://hdl.handle.net/ /
rhind-tutt, stephen. "what a tangled web we weave: a review of pricing models and the forces that drive them." against the grain (february ): , - .
robnett, bill. "online journal pricing." the serials librarian , no. / ( ): - .
scholarly communication and technology. washington, dc: association of research libraries, .
schonfeld, roger c., donald w. king, ann okerson, and eileen gifford fenton. "library periodicals expenses: comparison of non-subscription costs of print and electronic formats on a life-cycle basis." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /schonfeld/ schonfeld.html
———. the nonsubscription side of periodicals: changes in library operations and costs between print and electronic formats. washington, dc: council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf
schroter, sara, leanne tite, and ahmed kassem. "financial support at the time of paper acceptance: a survey of three medical journals." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
scigliano, marisa. "consortium purchases: case study for a cost-benefit analysis." the journal of academic librarianship , no. ( ): - .
sens, jean-mark. "moving digits in serials life." library philosophy and practice , no. ( ). http://www.webpages.uidaho.edu/~mbolin/sens.html
houghton, john, and peter sheehan. "estimating the potential impacts of open access to research findings." economic analysis and policy , no. ( ): - . http://www.eap-journal.com.au/download.php?file=
snijder, ronald. "the profits of free books: an experiment to measure the impact of open access publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
sosteric, mike. "electronic journals: the grand information future?" electronic journal of sociology , no. ( ). http://www.sociology.org/content/vol . /sosteric.html
sosteric, mike, yuwei shi, and olivier wenker. "electronic first: the upcoming revolution in the scholarly communication system." the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . .
sqw limited. costs and business models in scientific research publishing: a report commissioned by the wellcome trust. london: the wellcome trust, . http://www.wellcome.ac.uk/stellent/groups/corporatesite/@policy_communications/documents/web_document/wtd .pdf
———. economic analysis of scientific research publishing: a report commissioned by the wellcome trust. london: the wellcome trust, .
stern, david. "pricing models: past, present, and future." the serials librarian , no. / ( ): - .
———. "pricing models and payment schemes for library collections." online (september/october ): - .
stoller, michael, robert christopherson, and michael miranda. "the economics of professional journal pricing." college & research libraries (january ): - .
tanner, simon, and marilyn deegan. "exploring charging models for digital cultural heritage in europe." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /tanner/ tanner.html
varian, hal r. "pricing electronic journals." d-lib magazine (june ). http://www.dlib.org/dlib/june / varian.html
varian, hal r., and brian kahin. internet publishing and beyond: the economics of digital information and intellectual property. cambridge, ma: the mit press, .
white, sonya, and claire creaser.
trends in scholarly journal prices - . loughborough: lisu, . http://www.lboro.ac.uk/departments/dis/lisu/pages/publications/oup .html
willer, mirna, tanja buzina, karolina holub, jasenka zajec, miroslav milinovic, and nebojša topolšcak. "selective archiving of web resources: a study of processing costs." program: electronic library and information systems , no. ( ): - .
willinsky, john. "scholarly associations and the economic viability of open access publishing." journal of digital information , no. ( ). http://jodi.tamu.edu/articles/v /i /willinsky/

electronic books and texts

electronic books and texts: case studies and history

albanese, andrew richard. "the e-book enterprise: netlibrary's digital mission." library journal, february , - .
bailey, charles w., jr. "evolution of an electronic book: the scholarly electronic publishing bibliography." the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . .
balakrishnan, n., raj reddy, madhavi ganapathiraju, and vamshi ambati. "digital library of india: a testbed for indian language research." tcdl bulletin , no. ( ). http://www.ieee-tcdl.org/bulletin/v n /balakrishnan/balakrishnan.html
bazerman, charles, david blakesley, mike palmquist, and david russell. "open access book publishing in writing studies: a case study." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ /
bearman, david. "jean-noël jeanneney's critique of google: private sector book digitization and digital library policy." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /bearman/ bearman.html
blumenstyk, goldie. "digital-library company plans to charge students a fee for access." the chronicle of higher education, december , a .
butter, karen, robin chandler, and john kunze. "the cigarette papers: issues in publishing materials in multiple formats." d-lib magazine (november ). http://www.dlib.org/dlib/november / butter.html
carlson, scott. "questia lays off half its employees." the chronicle of higher education, december , a .
cherry, joan m., and wendy m. duff. "studying digital library users over time: a follow-up survey of early canadiana online." information research , no. ( ). http://informationr.net/ir/ - /paper .html
chesnutt, david r. "the model editions partnership: 'smart text' and beyond." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / chesnutt.html
connaway, lynn silipigni. "a web-based electronic book (e-book) library: the netlibrary model." library hi tech , no. ( ): - .
crane, gregory. "the perseus project and beyond: how building a digital library challenges the humanities and technology." d-lib magazine (january ). http://www.dlib.org/dlib/january / crane.html
crawford, walt. "building the econtent commons." econtent , no. ( ): .
dames, k. matthew. "library organizations should support google book search." online , no. ( ): - .
doane, andrea. "the artfl project: an introduction." d-lib magazine (january ). http://www.dlib.org/dlib/january /briefings/ artfl.html
duguid, paul. "inheritance and loss? a brief survey of google books." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/
eaves, morris. "behind the scenes at the william blake archive: collaboration takes more than e-mail." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . .
editors and staff, william blake archive. "the persistence of vision: images and imaging at the william blake archive." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature
elliott, laura. "how the oxford english dictionary went online." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /oed-tech/
fernandes, derrick. "the safari e-book route through the ict jungle: experiences at hillingdon libraries." program: electronic library and information systems , no. ( ): - .
flowers, janet l. "netlibrary.com: cautious optimism/views from a research library and a university press." against the grain (november ): , .
foster, andrea l. "an online library struggles to survive." the chronicle of higher education, september , a -a .
galloway, edward a., and gabrielle v. michalek. "the heinz electronic library interactive online system (helios): building a digital archive using imaging, ocr, and natural language processing technologies." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /gall n .html
———. "the heinz electronic library interactive on-line system (helios): an update." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /gall n .html
gaunt, marianne i. "center for electronic texts in the humanities." information technology and libraries (march ): - .
gayton, cynthia m. "alexandria burned—securing knowledge access in the age of google." vine: the journal of information and knowledge management systems , no. ( ): - .
green, toby. "publishing e-books: oecd's pay-per-view and e-library services - ." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art
guernsey, lisa. "digital presses transform librarians into entrepreneurs." the chronicle of higher education, may , a -a .
———. "on-line whitman archive sings with works of the poet, making the 'body electric.'" the chronicle of higher education, april , a .
haarhoff, leith. "books from the past: an e-books project at culturenet cymru." program: electronic library & information systems , no. ( ): - .
hamilton, denise. "hart of the gutenberg galaxy." wired (february ): - . http://www.wired.com/wired/archive/ . /esgutenberg.html
hart, michael s. "project gutenberg: access to electronic texts." database (december ): - .
henry, charles. "rice university press: fons et origo." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . .
hockey, susan.
"developing access to electronic texts in the humanities." in the evolving virtual library: visions and case studies, ed. laverna m. saunders, - . medford, nj: information today, . holder, warren. "e-books—reinventing the wheel?" serials , no. ( ): - . holz, dayna. "technologically enhanced archival collections: using the buddy system." journal of archival organization , no. / ( ): - . hyatt, shirley, and lynn silipigni connaway. "utilizing e-books to enhance digital library offerings." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /netlibrary/ ingalls, zoe. "a web site grows new poems, sometimes right before readers' eyes." the chronicle of higher education, july , a -a . james, ryan. "an assessment of the legibility of google books." journal of access services , no. ( ): - . kiernan, kevin s. "digital preservation, restoration, and dissemination of medieval manuscripts." in gateways, gatekeepers, and roles in the information omniverse: proceedings of the third symposium, ed. ann okerson and dru mogge, - . washington, dc: office of scientific and academic publishing, association of research libraries, . http://eric.ed.gov/ericwebportal/contentdelivery/servlet/ericserv let?accno=ed ———. "the electronic beowulf." computers in libraries (february ): - . kiernan, vincent. "an ambitious plan to sell electronic books." the chronicle of higher education, april , a -a . lackie, robert j. "from google print to google book search: the controversial initiative and its impact on other remarkable digitization projects." the reference librarian , no. ( ): - . lavoie, brian, lynn silipigni connaway, and lorcan dempsey. "anatomy of aggregate collections: the example of google print for libraries." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /lavoie/ lavoie.html lee, stuart. "digitizing wilfred." interview by philip hunter. ariadne, no. ( ). http://www.ariadne.ac.uk/issue /digiwilf/intro.html leetaru, kalev. 
"mass book digitization: the deeper story of google books and the open content alliance." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / maccoll, john. "google challenges for academic libraries." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /maccoll/ milne, ronald. "the google mass digitisation project at oxford." liber quarterly: the journal of european research libraries , no. / ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf morrison, a. "delivering electronic texts over the web: the current and planned practices of the oxford text archive." computers and the humanities , no. ( ): - . mühlberger, günter. "ebooks on demand (eod): a european digitization service." ifla journal , no. ( ): - . mylonas, elli. "the perseus project." in scholarly publishing on the electronic networks: the new generation: visions and opportunities in not-for-profit publishing: proceedings of the second symposium, ed. ann okerson, - . washington, dc: office of scientific and academic publishing, association of research libraries, . neuman, michael, and paul mangiafico. "providing and accessing information via the internet: the georgetown catalogue of projects in electronic text." the reference librarian, no. / ( ): - . pack, thomas. "bringing literature alive: early english books online reshape research opportunities." econtent (december ): - . pang, alex soojung-kim. "the work of the encyclopedia in the age of electronic reproduction." first monday , no. ( ). http://www.firstmonday.org/issues/issue _ /pang/index.html parkes, david. "e-books from ebrary at staffordshire university: a case study." program: electronic library and information systems , no. ( ): - . pochoda, phil. "scholarly publication at the digital tipping point." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . poe, marshall. "note to self: print monograph dead; invent new publishing model." 
the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . polding, robert, josé miguel baptista nunes, and bernard kingston. "assessing e-book model sustainability." journal of librarianship and information science , no. ( ): - . prescott, andrew. "constructing electronic beowulf." in towards the digital library: the british library's initiatives for access programme, ed. leona carpenter, simon shaw, and andrew prescott, - . london: the british library, . sandler, mark. "academic and commercial roles in building 'the digital library.'" collection management , no. / ( ): - . schiff, lisa. "creating the mark twain project online." learned publishing , no. ( ): - . seales, w. brent, james griffioen, kevin kiernan, cheng jiun yuan, and linda cantara. "the digital atheneum: new techniques for restoring and preserving old documents." computers in libraries (february ): - . http://www.infotoday.com/cilmag/feb /seales.htm skarstein, vigdis moe. "the bookshelf: digitisation and access to copyright items in norway." program: electronic library and information systems , no. ( ): - . stauffer, andrew m. "tagging the rossetti archive: methodologies and praxis." the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . taylor, alison. "e-books from myilibrary at the university of worcester: a case study." program: electronic library and information systems , no. ( ): - . thibadeau, robert, and evan benoit. "antique books." d-lib magazine (september ). http://www.dlib.org/dlib/september /thibadeau/ thibadeau.html viscomi, joseph. "digital facsimiles: reading the william blake archive." computers and the humanities (february ): - . wilkins, valerie. "managing e-books at the university of derby: a case study." program: electronic library and information systems , no. ( ): - . willett, perry. "the victorian women writers project: the library as a creator and publisher of electronic texts." 
the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /will n .html willinsky, john. "toward the design of an open monograph press." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . wittenberg, kate. "the gutenberg-e project: opportunities and challenges in publishing born-digital monographs." learned publishing , no. ( ): - . young, jeffrey r. "netlibrary files for bankruptcy protection." the chronicle of higher education, november , a . zalta, edward. "the stanford encyclopedia of philosophy: a university/library partnership in support of scholarly communication and open access." college & research libraries news , no. ( ): - , . . electronic books and texts: general works abdullah, noorhidawati, and forbes gibb. "students' attitudes towards e-books in a scottish higher education institute: part ." library review , no. ( ): - . anuradha, k. t., and h. s. usha. "e-books access models: an analytical comparative study." the electronic library , no. ( ): - . armstrong, c. j., and r. e. lonsdale. "scholarly monographs: why would i want to publish electronically?" the electronic library , no. ( ): - . arnold, kenneth. "the scholarly monograph is dead. long live the scholarly monograph." in scholarly publishing on the electronic networks: the new generation: visions and opportunities in not-for-profit publishing: proceedings of the second symposium, ed. ann okerson, - . washington, dc: office of scientific and academic publishing, association of research libraries, . balk, hildelies, and lieke ploege. "impact: working together to address the challenges involving mass digitization of historical printed text." oclc systems & services: international digital library perspectives , no. ( ): - . ball, rafael. "e-books in practice: the librarian's perspective." learned publishing , no. ( ): - . basch, reva. "books online: visions, plans, and perspectives for electronic text." online (july ): - . bennett, linda. 
"infinite riches in a little room: how can we manage, market and modernize the e-books phenomenon?" serials , no. ( ): - . bonn, maria. "free exchange of ideas: experimenting with the open access monograph." college & research libraries news no. ( ): - . brown, gary j. "beyond print: reading digitally." library hi tech , no. ( ): - . burk, roberta. "e-book devices and the marketplace: in search of customers." library hi tech , no. ( ): - . burrows, toby. "electronic texts, digital libraries, and the humanities in australia." library hi tech , no. ( ): - . ———. the text in the machine: electronic texts in the humanities. new york: the haworth press, . carvajal, doreen. "racing to convert books to bytes: evolving market for e-titles." the new york times, december , c , c . cawkell, tony. "electronic books." aslib proceedings (february ): - . chen, ya-ning. "application and development of electronic books in an e-gutenberg age." online information review , no. ( ): - . choudhury, g. sayeed, tim dilauro, robert ferguson, michael droettboom, and ichiro fujinaga. "document recognition for a million books." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /choudhury/ choudhury.html cisler, steve. "letter from san francisco: the internet bookmobile." first monday (october ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ cleyle, susan. "e-books: should we be afraid?" the serials librarian , no. / ( ): - . connaway, lynn silipigni. "e-books—new opportunities and challenges." technicalities (september/october ): - . coyle, karen. "mass digitization of books." the journal of academic librarianship , no. ( ): - . ———. "stakeholders and standards in the e-book ecology: or, it's the economics, stupid!" library hi tech , no. ( ): - . crane, gregory, and alison jones. "text, information, knowledge and the evolving record of humanity." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /jones/ jones.html davy, tom. 
"e-textbooks: opportunities, innovations, distractions and dilemmas." serials: the journal for the serials community , no. ( ): - . dietrich, dianne. "automated metadata formatting for cornell's print-on-demand books." the code lib journal, no. ( ). http://journal.code lib.org/articles/ ditlea, steve. "the real e-books." technology review (july/august ): - . http://www.technologyreview.com/infotech/ /page / dorman, david. "the e-book: pipe dream or potential disaster?" american libraries (february ): - . eisenberg, daniel. "problems of the paperless book." scholarly publishing (october ): - . esposito joseph j. "the processed book." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ ferwerda, eelco. "new models for monographs—open books: based on a paper presented at the rd uksg conference, edinburgh, april ." serials: the journal for the serials community , no. ( ): - . fischer, ruth, and rick lugg. "e-book basics." collection building , no. ( ): - . garrod, penny. "ebooks in uk libraries: where are we now?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /garrod/ goebel, ralf, and sebastian meyer. "the dfg viewer for interoperability in germany." liber quarterly: the journal of european research libraries , no. / ( ). http://liber.library.uu.nl/publish/articles/ /article.pdf harris, siân. "revolutionising background research." research information (may/june ). http://www.researchinformation.info/features/feature.php?feature_id = hawkins, donald t. "electronic books: a major publishing revolution. part : general considerations and issues." online (july/august ): - . ———. "electronic books: a major publishing revolution. part : the marketplace." online (september/october ): - . ———. "electronic books: reports of their death have been exaggerated." online (january/august ): - . http://www.onlinemag.net/jul /hawkins.htm hernon, peter, rosita hopper, michael r. leach, laura l. saunders, and jane zhang. 
"e-book use by students: undergraduates in economics, literature, and nursing." the journal of academic librarianship , no. ( ): - . herring, mark y. "here lies the book, r.i.p.: the report of its death has been greatly exaggerated." against the grain (december -january ): , - , . herther, nancy k. "the e-book industry today: a bumpy road becomes an evolutionary path to market maturity." the electronic library , no. ( ): - . hillesund, terje. "will e-books change the world?" first monday (october ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ hillesund, terje, and jon e. noring. "digital libraries and the need for a universal digital publication format." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . hockey, susan. "electronic texts: the promise and the reality." american council of learned societies newsletter (february ). ———. "evaluating electronic texts in the humanities." library trends (spring ): - . http://hdl.handle.net/ / hughes, carol ann. "the myth of 'obsolescence': the monograph in the digital library." portal: libraries in the academy , no. ( ): - . jaffe, neil. "print on demand (pod): an important step in the change to a digital distribution model for books." against the grain (june ): , - . jensen, michael. "e-books and retro glue protect the vested interests of publishing." the chronicle of higher education, june , a . jones, charles e., and david schloen. "electronic publication of ancient near eastern texts." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /epanet/ just, peter. "electronic books in the usa—their numbers and development and a comparison to germany." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / kaufman, peter b., and jeff ubois. "good terms—improving commercial-noncommercial partnerships for mass digitization: a report prepared by intelligent television for rlg programs, oclc programs and research." d-lib magazine , no. / ( ). 
http://www.dlib.org/dlib/november /kaufman/ kaufman.html landoni, m., r. wilson, and f. gibb. "looking for guidelines for the production of electronic textbooks." online information review , no. ( ): - . levine-clark, michael. "electronic books and the humanities: a survey at the university of denver." collection building , no. ( ): - . looney, michael a., and mark sheehan. "digitizing education: a primer on ebooks." educause review (july/august ): - . http://www.educause.edu/ir/library/pdf/erm .pdf loughran, tom. "some trends in electronic publishing." against the grain (june ): - . lowry, anita. "electronic texts in english and american literature." library trends (spring ): - . ———. "electronic texts in the humanities: a selected bibliography." information technology and libraries (march ): - . lynch, clifford. "the battle to define the future of the book in the digital world." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ ———. "electrifying the book." library journal, net connect supplement ( october ): - . ———. "electrifying the book, part ." library journal, net connect supplement (january ): - . malama, chrysanthi, monica landoni, and ruth wilson. "what readers want: a study of e-fiction usability." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /wilson/ wilson.html mattison, david. "alice in e-book land: a primer for librarians." computers in libraries (october ): - . maxymuk, john. "digitized books." the bottom line: managing library finances , no. ( ): - . meadow, charles t. "on the future of the book, or does it have a future?" journal of scholarly publishing (july ): - . morgan, eric lease. "electronic books and related technologies." computers in libraries (november/december ): - . morgan, greg. "a word in your ear: library services for print disabled readers in the digital age." the electronic library , no. ( ): - . nelson, mark r.
"e-books in higher education: nearing the end of the era of hype?" educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf press, larry. "from p-books to e-books." communications of the acm (may ): - . ramaiah, chennupati k. "an overview of electronic books: a bibliography." the electronic library , no. ( ): - . rao, siriginidi subba. "electronic book technologies: an overview of the present situation." library review , no. ( ): - . ———. "electronic books: a review and evaluation." library hi tech , no. ( ): - . ———. "electronic books: their integration into library and information centers." the electronic library , no. ( ): - . ———. "familiarization of electronic books." the electronic library , no. ( ): - . sandler, mark, kim armstrong, and bob nardini. "market formation for e-books: diffusion, confusion or delusion?" the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . smith, david a. "debabelizing libraries: machine translation by and for digital collections " d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /smith/ smith.html smith, dinitia. "hoping the web will rescue young professors: in the publish-or-perish world can they live on the internet?" the new york times, june , a , a . sottong, stephen. "don't power up that e-book just yet." american libraries (may ): - . ———. "e-book technology: waiting for the 'false pretender.'" information technology and libraries (june ): - . http://www.ala.org/ala/mgrps/divs/lita/ital/ sottong.cfm steele, colin. "scholarly monograph publishing in the st century: the future more than ever should be an open book." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . sutherland, juliet. "a mass digitization primer." library trends , no. ( ): - . taylor, david. "e-books and the academic market: the emerging supply chain." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art terry, ana arias. 
"demystifying the e-book: what is it?, where will it lead us, and who's in the game?" against the grain (november ): , . thompson, john b. books in the digital age: the transformation of academic and higher education publishing in britain and the united states. cambridge: polity, . tonkery, dan. "e-books come of age with their readers." research information (august/september ). http://www.researchinformation.info/riaugsep ebooks.html tonkin, emma. "ebooks: tipping or vanishing point?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /tonkin/ van der velde, wouter, and olaf ernst. "the future of ebooks? will print disappear? an end-user perspective." library hi tech , no. ( ): - . van hoorebeek, mark. "napster clones turn their attention to academic e-books." new library world , no. / ( ): - . vasileiou, magdalini, richard hartley, and jennifer rowley. "an overview of the e-book marketplace." online information review , no. ( ): - . whalley, w. brian. "e-books for the future: here but hiding?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /whalley/ wheatcroft, andrew. " / vision? e-books in practice and theory." serials , no. ( ): - . williams, peter, iain stevenson, david nicholas, anthony watkinson, and ian rowlands. "the role and future of the monograph in arts and humanities research." aslib proceedings , no. ( ): - . winkler, karen j. "academic presses look to the internet to save scholarly monographs." the chronicle of higher education, september , a , a . zivkovic, daniela. the electronic book. berlin: bibspider, . . electronic books and texts: library issues anderson, rick. "the espresso book machine: the marriott library experience." serials: the journal for the serials community , no. ( ): - . anuradha, k. t., and h. s. usha. "use of e-books in an academic and research environment: a case study from the indian institute of science." program: electronic library and information systems , no. ( ): - . 
http://eprints.iisc.ernet.in/archive/ / armstrong, chris, and ray lonsdale. "challenges in managing e-books collections in uk academic libraries." library collections, acquisitions, & technical services , no. ( ): - . badke, william b. "questia.com: implications of the new mclibrary." internet reference services quarterly , no. ( ): - . ball, david. "innovative models for procuring e-books." serials , no. ( ): - . bell, lori, virginia mccoy, and tom peters. "e-books go to college." library journal, may : - . http://www.libraryjournal.com/article/ca .html bennett, linda, and monica landoni. "e-books in academic libraries." the electronic library , no. ( ): - . berube, linda. "e-books in public libraries: a terminal or termination technology?" interlending & document supply , no. ( ): - . bhatt, jay, w. charles paulsen, lisa g. dunn, and amy s. van epps. "science and technology libraries partnering with knovel." science & technology libraries , no. / ( ): - . http://hdl.handle.net/ / blummer, barbara. "e-books revisited: the adoption of electronic books by special, academic, and public libraries." internet reference services quarterly , no. ( ): - . buczynski, james a. "library ebooks: some can't find them, others find them and don't know what they are." internet reference services quarterly , no. ( ): - . burk, roberta. "don't be afraid of e-books." library journal, april , - . case, beau david. "love's labour's lost: the failure of traditional selection practice in the acquisition of humanities electronic texts." library trends (spring ): - . http://hdl.handle.net/ / chan, gayle r. y. c., and janny k. lai. "shaping the strategy for e-books: a hong kong perspective." library collections, acquisitions, & technical services , no. ( ): - . connaway, lynn silipigni, and heather l. wicht. "what happened to the e-book revolution?: the gradual integration of e-books into academic libraries." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . 
cox, john. "e-books: challenges and opportunities." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /cox/ cox.html crawford, walt. "mp audiobooks: a new library medium?" american libraries (august ): - . ———. "nine models, one name: untangling the e-book muddle." american libraries (september ): - . dewey, barbara i., and carol ann hughes. "sharing minds: creating the iowa scholarly digital resources center." information technology and libraries (june ): - . díez, luisa alvite, and blanca rodríguez bravo. "e-books in spanish academic libraries." the electronic library , no. ( ): - . dillon, dennis. "e-books: the university of texas experience, part ." library hi tech , no. ( ): - . ———. "e-books: the university of texas experience, part ." library hi tech , no. ( ): - . ———. "e-books: the ut-austin experience." texas library journal (fall ): - . dinkelman, andrea, and kristine stacy-bates. "accessing e-books through academic library web sites." college & research libraries , no. ( ): - . dougherty, william c. "the google books project: will it make libraries obsolete?" the journal of academic librarianship , no. ( ): - . ellis, steven. "toward the humanities digital library: building the local organization." college & research libraries (november ): - . engle, michael. "the social position of electronic text centers." library hi tech , no. - ( ): - , . foust, jill e., phillip bergen, gretchen l. maxeiner, and peter n. pawlowski. "improving e-book access via a library-developed full-text search tool." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/picrender.fcgi?artid= &blobtype=pdf garrod, penny. "e-books: are they the interlibrary lending model of the future?" interlending & document supply , no. ( ): - . gaunt, marianne. "ceth, electronic text centers, and the humanities community." library hi tech , no. - ( ): - . ———. "literary text in an electronic age: implications for library services."
in advances in librarianship, vol. , ed. irene godden, - . san diego: academic press, . ———. "machine-readable literary texts: collection development issues." collection management , no. / ( ): - . gibbons, susan. "ebooks: some concerns and surprises." portal: libraries and the academy , no. ( ): - . gibbs, nancy j. "ebooks two years later: the north carolina state university perspective." against the grain (december -january ): , , . ———. "e-books: report on an ongoing experiment." against the grain (december -january ): - . gibson, matthew, and christine ruotolo. "beyond the web: tei, the digital library, and the ebook revolution." computers and the humanities , no. ( ): - . giesecke, joan r., beth mcneil, and gina l. b. minks. "electronic text centers: creating research collections on a limited budget, the nebraska experience." journal of library administration , no. ( ): - . http://digitalcommons.unl.edu/libraryscience/ / goldenberg-hart, diane y. "library technology centers and community building: yale university library electronic text center." library hi tech , no. - ( ): - . hodges, dracine, cyndi preston, and marsha j. hamilton. "resolving the challenge of e-books." collection management , no. / ( ): - . huarng, kun-huang, and hui-chuan winnie wang. "a survey study of the chinese e-books consortium." library management , no. / ( ): - . jantz, ronald. "e-books and new library service models: an analysis of the impact of e-book technology on academic libraries." information technology and libraries (june ): - . johnson, richard k. "in google's broad wake: taking responsibility for shaping the global digital library." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr digprinciples.pdf joint, nicholas. "the electronic book: a transformational library technology?" library review , no. ( ): - . ———. "the google book settlement and academic libraries." library review , no. 
( ): - . langston, marc. "the california state university e-book pilot project: implications for cooperative collection development." library collections, acquisitions, and technical services , no. ( ): - . long, sarah ann. "the case for e-books: an introduction." new library world , no. / ( ): - . lugg, rick, and ruth fischer. "the host with the most: ebook distribution to libraries." against the grain (december -january ): - , , . lynch, clifford. "what do digital books mean for libraries?" journal of library administration , no. ( ): - . lynch, mary-alice. "nylink's shared collection: a collaborative introduction of a new technology." against the grain (december -january ): , , . mcluckie, ann. "e-books in an academic library: implementation at the eth library, zurich." the electronic library , no. ( ): - . medeiros, norm. "every (e)book its (e)reader: book collections at a crossroads." oclc systems & services: international digital library perspectives , no. ( ): - . ormes, sarah. "it's the end of the world as we know it (and i feel fine) or how i learned to stop worrying and love the e-book." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /e-book/ park, yeon-hee. "a study of consortium models for e-books in university libraries in korea." collection building , no. ( ): - . peters, thomas a. "gutterdammerung (twilight of the gutter margins): e-books and libraries." library hi tech , no. ( ): - . pitti, daniel v. "encoded archival description: an introduction and overview." d-lib magazine (november ). http://www.dlib.org/dlib/november / pitti.html pomerantz, sarah. "the availability of e-books: examples of nursing and business." collection building , no. ( ): - . powell, christina kelleher. "opac integration in the era of mass digitization: the mbooks experience." library hi tech , no. ( ): - . powell, christina kelleher, and nigel kerr. "sgml creation and delivery: the humanities text initiative." d-lib magazine (july/august ). 
http://www.dlib.org/dlib/july /humanities/ powell.html price-wilkin, john. "a gateway between the world-wide web and pat: exploiting sgml through the web." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /pricewil. n ———. "text files in libraries: present foundations and future directions." library hi tech , no. ( ): - . ———. "text files in rlg academic libraries: a survey of support and activities." the journal of academic librarianship (march ): - . ———. "using the world-wide web to deliver complex electronic documents: implications for libraries." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /pricewil. n ramirez, diana, and suzanne d. gyeszly. "netlibrary: a new direction in collection development." collection building , no. ( ): - . seaman, david. "the electronic text center: a humanities computing initiative at the university of virginia." the electronic library (june ): - . ———. "'a library and apparatus of every kind': the electronic text center at the university of virginia." information technology and libraries (march ): - . ———. "the user community as responsibility and resource: building a sustainable digital library." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / seaman.html sharp, steve, and sarah thompson. "'just in case' vs. 'just in time': e-book purchasing models: based on a breakout session held at the rd uksg conference, edinburgh, april ." serials: the journal for the serials community , no. ( ): - . shepherd, peter t. "the counter code of practice for books and reference works." serials , no. ( ): - . shreeves, edward. "between the visionaries and the luddites: collection development and electronic resources in the humanities." library trends (spring ): - . http://hdl.handle.net/ / smith, natalia, and helen r. tibbo. "libraries and the creation of electronic texts for the humanities." college & research libraries (november ): - .
snowhill, lucia. "e-books and their future in academic libraries: an overview." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /snowhill/ snowhill.html soules, aline. "the shifting landscape of e-books." new library world , no. / ( ): - . spornick, charles d. "emory electronic text projects: the role of the full-text center in building partnerships." library hi tech , no. - ( ): - . sutton, brett, ed. literary texts in an electronic age: scholarly implications and library services: papers presented at the clinic on library applications of data processing. urbana-champaign, il: graduate school of library and information science, . http://www.ideals.uiuc.edu/handle/ / wallace, randy. "safari tech books online as supplementary reserve materials." science & technology libraries , no. / ( ): - . warner, beth forrest, and david barber. "building the digital library: the university of michigan's umlibtext project." information technology and libraries (march ): - . willett, perry. "building support for a humanities electronic text center: the experience at indiana university." library hi tech , no. - ( ): - . . electronic books and texts: research abdullah, noorhidawati, and forbes gibb. "students' attitudes towards e-books in a scottish higher education institute: part ." library review , no. ( ): - . anuradha, k. t., and h. s. usha. "e-books access models: an analytical comparative study." the electronic library , no. ( ): - . ———. "use of e-books in an academic and research environment: a case study from the indian institute of science." program: electronic library and information systems , no. ( ): - . http://eprints.iisc.ernet.in/archive/ / armstrong, chris, and ray lonsdale. "challenges in managing e-books collections in uk academic libraries." library collections, acquisitions, & technical services , no. ( ): - . 
berg, selinda adelle, kristin hoffmann, and diane dawson. "not on the same page: undergraduates' information retrieval in electronic and print books." the journal of academic librarianship , no. ( ): - . bierman, james, lina ortega, and karen rupp-serrano. "e-book usage in pure and applied sciences." science & technology libraries , no. / ( ): - . bonn, maria. "free exchange of ideas: experimenting with the open access monograph." college & research libraries news no. ( ): - . bucknell, terry. "the 'big deal' approach to acquiring e-books: a usage-based study." serials: the journal for the serials community , no. ( ): - . christianson, marilyn, and marsha aucoin. "electronic or print books: which are used?" library collections, acquisitions, & technical services , no. ( ): - . clark, dennis t. "lending kindle e-book readers: first results from the texas a&m university project." collection building , no. ( ): - . croft, rosie, and corey davis. "e-books revisited: surveying student e-book usage in a distributed learning academic library years later." journal of library administration , no. / ( ): - . dearnley, james, and cliff mcknight. "the revolution starts next week: the findings of two studies considering electronic books." information services & use , no. ( ): - . estelle, lorraine, and hazel woodward. "the national e-books observatory project: examining student behaviors and usage." journal of electronic resources librarianship , no. ( ): - . foote, jody bales, and karen rupp-serrano. "exploring e-book usage among faculty and graduate students in the geosciences: results of a small survey and focus group approach." science & technology libraries , no. ( ): - . grudzien, pamela, and anne marie casey. "do off-campus students use e-books?" journal of library administration , no. / ( ): - . herlihy, catherine s., and hua yi. "e-books in academic libraries: how does currency affect usage?" new library world , no. / ( ): - .
huarng, kun-huang, and hui-chuan winnie wang. "a survey study of the chinese e-books consortium." library management , no. / ( ): - . hughes, carol ann, and nancy l. buchanan. "use of electronic monographs in the humanities and social sciences." library hi tech , no. ( ): - . jamali, hamid r., david nicholas, and ian rowlands. "scholarly e-books: the views of , academics: results from the jisc national e-book observatory." aslib proceedings , no. ( ): - . langston, marc. "the california state university e-book pilot project: implications for cooperative collection development." library collections, acquisitions, and technical services , no. ( ): - . lin, chiun-sin, gwo-hshiung tzeng, yang-chieh chin, and chiao-chen chang. "recommendation sources on the intention to use e-books in academic digital libraries." the electronic library , no. ( ): - . lindquist, thea, and heather wicht. "pleas'd by a newe inuention?: assessing the impact of early english books online on teaching and research at the university of colorado at boulder." the journal of academic librarianship , no. ( ): - . littman, justin. "a circulation analysis of print books and e-books in an academic research library." library resources & technical services , no. ( ): - . http://www.oclc.org/research/publications/archive/ /littman-con naway-duke.pdf ———. "a preliminary comparison of electronic book and print book usage in colorado." colorado libraries (fall ): - . lonsdale, ray, and chris armstrong. "electronic books: challenges for academic libraries." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "electronic scholarly monographs: issues and challenges for the uk." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "promoting your e-books: lessons from the uk jisc national e-book observatory." program: electronic library and information systems , no. ( ): - . maccall, steven l. 
"online medical books: their availability and an assessment of how health sciences libraries provide access on their public websites." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/picrender.fcgi?artid= &blobt ype=pdf mallett, elizabeth. "a screen too far? findings from an e-book reader pilot." serials: the journal for the serials community , no. ( ): - . maynard, sally, and cliff mcknight. "children's comprehension of electronic books: an empirical study." the new review of children's literature and librarianship ( ): - . ———. "electronic books for children in uk public libraries." the electronic library , no. ( ): - . mcknight, cliff, and james dearnley. "electronic book use in a public library." journal of librarianship and information science , no. ( ): - . mcknight, cliff, james dearnley, and anne morris. "making e-books available through public libraries: some user reactions." journal of librarianship and information science , no. ( ): - . nicholas, david, ian rowlands, david clark, paul huntington, hamid r. jamali, and candela ollé. "uk scholarly e-book usage: a landmark survey." aslib proceedings , no. ( ): - . park, yeon-hee. "a study of consortium models for e-books in university libraries in korea." collection building , no. ( ): - . pattuelli, m. cristina, and debbie rabina. "forms, effects, function: lis students' attitudes towards portable e-book readers." aslib proceedings , no. ( ): - . rowlands, ian, david nicholas, hamid r. jamali, and paul huntington. "what do faculty and students really think about e-books?" aslib proceedings: new information perspectives , no. ( ): - . safley, ellen. "demand for e-books in an academic library." journal of library administration , no. / ( ): - . shelburne, wendy allen. "e-book usage in an academic library: user attitudes and behaviors." library collections, acquisitions, and technical services , no. / ( ): - . slater, robert. 
"e-books or print books, 'big deals' or local selections—what gets more use?" library collections, acquisitions, and technical services , no. ( ): - . sprague, nancy, and ben hunter. "assessing e-books: taking a closer look at e-book statistics." library collections, acquisitions, and technical services , no. / ( ): - . sukovic, suzana. "references to e-texts in academic publications." journal of documentation , no. ( ): - . summerfield, mary, and carol a. mandel. "on-line books at columbia: early findings on use, satisfaction, and effect." in technology and scholarly communication, ed. richard ekman and richard e. quandt, - . berkeley: university of california press, . summerfield, mary, carol mandel, and paul kantor. "perspectives on scholarly online books: the columbia university online books evaluation project." journal of library administration , no. / ( ): - . ———. "the potential for scholarly online books: views from the columbia university online books evaluation project." publishing research quarterly (fall ): - . williams, karen carter, and rickey best. "e-book usage and the choice outstanding academic book list: is there a correlation?" the journal of academic librarianship , no. ( ): - . wilson, ruth. "e-books for students: eboni." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /e-books/ ———. "electronic books for everyone: designing for accessibility." vine, no. ( ): - . wilson, ruth, monica landoni, and forbes gibb. "a user-centred approach to e-book design." the electronic library , no. ( ): - . http://www.cis.strath.ac.uk/research/publications/papers/strath_cis_p ublication_ .pdf ———. "the web book experiments in electronic textbook design." journal of documentation , no. ( ): - . http://eprints.cdlr.strath.ac.uk/ / wynne, martin. "evaluation in the arts and humanities data service." vine , no. ( ): - . http://eprints.ouls.ox.ac.uk/archive/ / electronic serials . 
electronic serials: case studies and history ackerman, laurens v., and alphonse simonaitis. "rsna electronic journal: beyond paper images: radiology on the web." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . adair, james r. "tc: a journal of biblical textual criticism: a modern experiment in studying the ancients." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . amiran, eyal, and john unsworth. "postmodern culture: publishing in the electronic medium." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /amiran. n anderson, terry, and brigette mcconkey. "development of disruptive open access journals." canadian journal of higher education , no. ( ): - . http://ojs.library.ubc.ca/index.php/cjhe/article/view/ /pdf_ apfel, robert e. "arlo: a free, peer-reviewed electronic journal that is not free." serials review , no. ( ): - . arroyo, cristina márquez, laura munoa, fernando a. navarro, maría verónica saladrigas, and karen shashok. "panace@—a successful open access journal from the stm translation community." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art bachrach, steven m., darin c. burleigh, and anatoli krassivine. "designing the next-generation chemistry journal: the internet journal of chemistry." issues in science and technology librarianship, no. (winter ). http://www.library.ucsb.edu/istl/ -winter/article .html bachrach, steven m., and stephen r. heller. "the internet journal of chemistry: a case study of an electronic chemistry journal." serials review , no. ( ): - . bailey, charles w., jr. "electronic (online) publishing in action . . . the public-access computer systems review and other electronic serials." online (january ): - . björk, bo-christer, and ziga turk. "electronic journal of information technology in construction (itcon): an open access journal using an un-paid, volunteer-based organization."
information research , no. ( ). http://informationr.net/ir/ - /paper .html botticelli, peter, robin dale, carla demello, barbara berger eden, richard entlich, anne r. kenney, and nancy mcgovern. "rlg diginews: taking stock at five years." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature brown, genevieve, and beverly j. irby. "fourteen lessons: initiating and editing an online professional refereed journal." the journal of electronic publishing (august ). http://hdl.handle.net/ /spo. . . cassar, mark. "open access helps when disciplines overlap." research information (december /january ). http://www.researchinformation.info/ridec jan product_focus.ht ml collins, mauri p., and zane l. berge. "ipct journal: a case study of an electronic journal on the internet." journal of the american society for information science , no. ( ): - . coulter, gerry. "launching (and sustaining) a scholarly journal on the internet: the international journal of baudrillard studies." journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . cyburt, richard h., sam m. austin, timothy c. beers, alfredo estrade, ryan m. ferguson, alexander sakharuk, hendrik schatz, karl smith, and scott warren. "the virtual journals of the joint institute for nuclear astrophysics." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /cyburt/ cyburt.html davey, frank. "swiftcurrent: a canadian experiment in on-line literary texts." the serials librarian , no. ( ): - . dykhuis, randy. "the promise of electronic publishing: oclc's program." computers in libraries (november/december ): - . ensor, pat, and thomas wilson. "public-access computer systems review: testing the promise." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . farney, tabatha a., and suzanne l. byerley. "publishing a student research journal: a case study." portal: libraries and the academy , no. ( ): - . fisher, janet h. 
"electronic journal update: cjtcs." the serials librarian , no. / ( ): - . friedlander, amy. "d-lib magazine: publishing as the honest broker." the serials librarian , no. / ( ): - . ———. "really years old?" d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /friedlander/ friedlander.html glanz, james. "e-journal: delayed but still a force." science, august , . haggerty, kevin d. "taking the plunge: open access at the canadian journal of sociology." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html hahn, karla l. electronic ecology: a case study of electronic journals in context. washington, dc: association of research libraries, . hardy, i. trotter. "starting an electronic journal in law." the journal of information, law and technology, no ( ). http://www .warwick.ac.uk/fac/soc/law/elj/jilt/ _ /hardy harrison, teresa m., and timothy d. stephen. "the electronic journal as the heart of an online scholarly community." library trends (spring ): - . http://hdl.handle.net/ / haschak, paul g. "the 'platinum route' to open access: a case study of e-jasl: the electronic journal of academic and special librarianship." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html haynes, john. "new journal of physics: a web-based and author-funded journal." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art heyworth, mike, julian richards, alan vince, and sandra garside-neville. "internet archaeology: a quality electronic journal." antiquity (december ): - . hickey, thomas b., and terry noreault. "the development of a graphical user interface for the online journal of current clinical trials." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /hickey. n holling, c. s. (buzz). "lessons for sustaining ecological science and policy through the internet." 
the journal of electronic publishing (june ). http://hdl.handle.net/ /spo. . . holoviak, judy, and keith l. seitter. "earth interactions: transcending the limitations of the printed page." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . hugo, jane, and linda newell. "new horizons in adult education: the first five years ( - )." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /hugo. n jackson, allyn. "the slow revolution of the free electronic journal." notices of the ams (october ): - . http://www.ams.org/notices/ /fea-eljnl.pdf jankowska, maria anna. "a library's contribution to scholarly communication and environmental literacy: the case of an open-access environmental journal." the serials librarian , no. ( ): - . jennings, edward m. "ejournal: an account of the first two years." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /jennings. n joa, harald. "a case study in e-journal developments: the scandinavian position." against the grain (february ): - , - . jul, erik. "present at the beginning." computers in libraries (april ): - . kamada, hitoshi. "kiyo journals and scholarly communication in japan." portal: libraries and the academy , no. ( ): - . kelly, robert a. "digital archiving in the physics literature: author to archive and beyond—the american physical society." the serials librarian , no. / ( ): - . http://authors.aps.org/eprint/files/ /sep/aps sep _ . /main.html keyhani, andrea. "the online journal of current clinical trials: an innovation in electronic journal publishing." database (february ): - . kiernan, vincent. "why do some electronic-only journals struggle, while others flourish?" the chronicle of higher education, may , a -a . kirriemuir, john. "the professional web-zine and parallel publishing: ariadne: the web version." d-lib magazine (february ).
http://www.dlib.org/dlib/february /ariadne/ kirriemuir.html lowry, charles b., susan k. martin, and gloriana st. clair. "portal: a new model for the digital future." college & research libraries news (may ): - , . mathis, philip m., judith n. hankins, deborah c. clark, and john d. clark. "launching a campus-based electronic periodical— scientia: the journal of student research." journal of college science teaching (may ): - . mckiernan, gerry. "perspectives in electronic publishing: an open access-dynamic-virtual electronic journal." library hi tech news (october ): - . meera, b. m., and rehana ummer. "open access journals: development of a web portal at the indian statistical institute." the electronic library , no. ( ): - . moothart, tom. "charles w. bailey, jr.: editor, publisher, innovator." serials review , no. ( ): - . morasch, bruce. "electronic social psychology." serials review (summer/fall ): - . moret, bernard m. e. "acm's journal of experimental algorithmics: bridging the gap between theory and practice." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . nadasdy, zoltan. "electronic journal of cognitive and brain sciences: a truly all-electronic journal: let democracy replace peer review." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . naylor, bernard, and marilyn geller. "a prehistory of electronic journals: the eies and blend projects." in advances in serials management, vol. , ed. marcia tuttle and karen d. darling, - . greenwich, ct: jai press, . oakeshott, priscilla. "the 'blend' experiment in electronic publishing." scholarly publishing (october ): - . o'donnell, james j. "five years of bryn mawr classical review." the serials librarian , no. / ( ): - . ———. "going electronic: the bryn mawr classical review." surfaces ( ). http://www.pum.umontreal.ca/revues/surfaces/vol /odonnel.html Özek, yvonne hultman. 
"lund virtual medical journal makes self-archiving attractive and easy for authors." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /ozek/ ozek.html persing, bob. "two little e-journals and how they grew." serials review , no. ( ): - . peters, stuart. "presenting a successful electronic journal subscription model." first monday (september ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ peters, stuart, and nigel gilbert. "the electronic alternative: sociological research online." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art pöschl, ulrich. "interactive journal concept for improved scientific publishing and quality assurance." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "interactive open access publishing and public peer review: the effectiveness of transparency and self-regulation in scientific quality assurance." ifla journal , no. ( ): - . ———. "interactive peer review enhances journal quality." research information (september/october ). http://www.researchinformation.info/risepoct openaccess.html pullinger, d. "the blend network and electronic journal project." program (july ): - . robison, david f. w. "the changing states of current cites: the evolution of an electronic journal." computers in libraries (june ): - . robison, elwin c. "architecture, graphics, and the net: a short history of architronic, a peer-reviewed e-journal." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /robi n .html savage, lon. "the journal of the international academy of hospitality research." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /savage. n shum, simon buckingham, and tamara sumner. "jime: an interactive journal for interactive media." first monday , no. ( ). http://www.firstmonday.org/issues/issue _ /buckingham_shum/inde x.html simoni, robert d. 
"serving science while paying the bills: the history of the journal of biological chemistry online." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art singleton, alan. "journals and the electronic programme of the institute of physics." the serials librarian , no. / ( ): - . snyder, kerala j. "electronic journals and the future of scholarly communication: a case study." notes (september ): - . solomon, david j. "medical education online: a case study of an open access journal in health professional education." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html ———. "strategies for developing sustainable open access scholarly journals." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ stalker, matt. "launching an online-only journal." learned publishing , no. ( ): - . steinberger, mark. "electronic mathematics journals." notices of the american mathematical society (january ): - . http://www.ams.org/notices/ /steinberger.pdf stojanovski, jadranka, jelka petrak, and bojan macan. "the croatian national open access journal platform." learned publishing , no. ( ): - . stover, mark. "the librarian as publisher: a world wide web publishing project." computers in libraries (october ): - . sumner, tamara, and simon buckingham shum. "open peer review & argumentation: loosening the paper chains on journals." ariadne, no. ( ). http://www.ukoln.ac.uk/ariadne/issue /jime/ tomlins, christopher l. "just one more 'zine? maintaining and improving the scholarly journal in the electronic present: a view from the humanities." learned publishing (january ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art turner, judith axler. "mickey, judy, colin, and me." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ turoff, murray, and starr roxanne hiltz. 
"electronic information exchange and its impact on libraries." in the role of the library in an electronic society: papers presented at the clinic on library applications of data processing, ed. f. wilfrid lancaster, - . urbana-champaign, il: graduate school of library science, . http://hdl.handle.net/ / ———. "the electronic journal: a progress report." journal of the american society for information science (july ): - . tuttle, marcia. "the newsletter on serials pricing issues." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /tuttle. n ———. "the newsletter on serials pricing issues: teetering on the cutting edge." in advances in serials management: a research annual, vol. , ed. marcia tuttle and jean g. cook, - . greenwich, ct: jai press, . valauskas, edward j. "waiting for thomas kuhn: first monday and the evolution of electronic journals." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . ward, kevin. "the katharine sharp review." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /katharine-sharp/ weibel, stuart, eric miller, jean godby, and ralph le van. "an architecture for scholarly publishing on the world wide web." computer networks and isdn systems (december ): - . wheary, jennifer, and bernard f. schutz. "living reviews in relativity: making an electronic journal live." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . wheary, jennifer, lee wild, bernard schutz, and christina weyher. "living reviews in relativity: thinking and developing electronically." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . widzinski, lori j. "the evolution of mc journal: a case study in producing a peer-reviewed electronic journal." serials review , no. ( ): - . ———. "mc journal: the journal of academic media librarianship." ariadne, no. ( ). http://ukoln.bath.ac.uk/ariadne/issue /academic-media/ willinsky, john, and ranjini mendis. 
"open access on a zero budget: a case study of postcolonial text." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html wilson, bonita, and allison l. powell. "a tenth anniversary for d-lib magazine." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /wilson/ wilson.html wilson, david l. "a journal's big break: national library of medicine will index an electronic journal on medline." the chronicle of higher education, january , a , a . wilson, tom d. "information research: a case study in the free electronic publication of research." vine, no. ( ): - . wilson, thomas c. "the origins of ter: ten years after." technology electronic reviews , no. ( ). young, jeffrey r. "stanford-based high wire press transforms the publication of scientific journals." the chronicle of higher education, may , a -a . youngen, ralph. "resources for mathematicians: the evolution of e-math." the serials librarian , no. / ( ): - . . electronic serials: critiques crawford, walt. "here's the content—where's the context?" american libraries (march ): - . ewing, john h. "no free lunches: we should resist the push to rush research online." the chronicle of higher education, october : b . jacobson, michael w. "biomedical publishing and the internet: evolution or revolution?" journal of the american medical informatics association (may/june ): - . kling, rob, and lisa covi. "electronic journals and legitimate media in the systems of scholarly communication." the information society , no. ( ): - . http://www.chass.utoronto.ca/epc/chwp/kling/ lawal, ibironke. "science resources: does the internet make them cheaper, better?" the bottom line , no. ( ): - . ovadia, steven. "self-published electronic journals: not quite the wave of the future." the serials librarian , no. ( ): - . piternick, anne b. "attempts to find alternatives to the scientific journal: a brief review." the journal of academic librarianship (november ): - . ———. 
"electronic serials: realistic or unrealistic solution to the journal 'crisis'?" the serials librarian , no. / ( ): - . ———. "serials and new technology: the state of the 'electronic journal.'" canadian library journal , no. ( ): - . quinn, frank. "roadkill on the electronic highway? the threat to the mathematical literature." publishing research quarterly (summer ): - . raney, keith r. "into a glass darkly." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . rowland, fytton. "electronic journals: neither free nor easy." ejournal , no. ( ). http://www.ucalgary.ca/ejournal/archive/rachel/v n /article .html ———. "the need for information organizations and information professionals in the internet era." serials review , no. ( ): - . schaffner, ann c. "the future of scientific journals: lessons from the past." information technology and libraries (december ): - . slagell, jeff. "the good, the bad, and the ugly: evaluating electronic journals." computers in libraries (may ): - . stankus, tony. "the key trends emerging in the first decade of electronic journals in the sciences." science & technology libraries , no. / ( ): - . stoller, michael e. "electronic journals in the humanities: a survey and critique." library trends (spring ): - . http://hdl.handle.net/ / tomlins, christopher l. "the wave of the present: the printed scholarly journal on the edge of the internet." journal of scholarly publishing (april ): - . http://archives.acls.org/op/ _wave_of_the_present.htm wallenius, leila i. t. "are electronic serials helping or hindering academic libraries?" the acquisitions librarian , no. / ( ): - . woodward, hazel, fytton rowland, cliff mcknight, jack meadows, and carolyn pritchett. "electronic journals: myths and realities." library management , no. ( ): - . . electronic serials: electronic distribution of printed journals . . early experimental projects . . . core, cornell university entlich, richard. 
"electronic chemistry journals: elemental concerns." the serials librarian , no. / ( ): - . entlich, richard, lorrin garson, michael lesk, lorraine normore, jan olsen, and stuart weibel. "making a digital library: the chemistry online retrieval experiment." communications of the acm (april ): . entlich, richard, lorrin garson, michael lesk, lorraine normore, jan olsen, and stuart weibel. "making a digital library: the contents of the core project." acm transactions on information systems (april ): - . entlich, richard, lorrin garson, michael lesk, lorraine normore, jan olsen, and stuart weibel. "testing a digital library: user response to the core project." library hi tech , no. ( ): - . weibel, stuart. "the core project: technical shakedown phase and preliminary user studies." oclc systems & services , no. ( ): - . . . . red sage project, university of california, san francisco deloughry, thomas j. "effort to provide scholarly journals by computer tries to retain the look and feel of printed publications." the chronicle of higher education, april , a -a . lucier, richard e., and robert c. badger. "red sage project." the serials librarian , no. / ( ): - . lucier, richard e., and peter brantley. "the red sage project: an experimental digital journal library for the health sciences." d-lib magazine (august ). http://www.dlib.org/dlib/august /lucier/ lucier.html . . . superjournal project, elib baldwin, christine. "superjournal update." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /superjournal/ dawson, heather. "putting the super into journal: the superjournal project at the british library of political and economic science." vine, no. ( ): - . eason, ken, and susan harker. "psychological processes in the use of electronic journals." serials (july ): - . eason, ken, sue richardson, and liangzhi yu. "patterns of use of electronic journals." journal of documentation (september ): - . mabe, michael. "superjournal: the publisher's perspective." serials (july ): - . 
mcknight, cliff, yu liangzhi, and susan harker. "librarians in the delivery of electronic journals: roles revisited." journal of librarianship and information science (september ): - . pullinger, david. "academics and the new information environment: the impact of local factors on use of electronic journals." journal of information science , no. ( ): - . ———. "learning from putting electronic journals on superjanet: the superjournal project." interlending & document supply , no. ( ): - . ———. the superjournal project: electronic journals on superjanet. bristol, england: institute of physics publishing, . pullinger, david, and christine baldwin. "superjournal: a project in the uk to develop multimedia journals." d-lib magazine (january ). http://www.dlib.org/dlib/january /briefings/ super.html . . . tulip, elsevier science cattey, bill, and greg anderson. "tulip at the massachusetts institute of technology." library hi tech , no. ( ): - . dougherty, william c., and edward a. fox. "tulip at virginia tech." library hi tech , no. ( ): - . jordan, william. "tulip at the university of washington." library hi tech , no. ( ): - . lynch, clifford a. "the tulip project: context, history, and perspective." library hi tech , no. ( ): - . mostert, paul. "tulip at elsevier science." library hi tech , no. ( ): - . mostert, paul, and peter fransen. "technical and functional aspects of electronic journal systems developed in tulip and ees projects." library acquisitions: practice & theory , no. ( ): - . needleman, mark. "tulip at the university of california, part i: implementation and the lessons learned." library hi tech , no. ( ): - . smith, earl c., and lynn j. davis. "tulip at the university of tennessee, knoxville." library hi tech , no. ( ): - . troll, denise a., charles b. lowry, and barbara g. richards. "tulip at carnegie mellon." library hi tech , no. ( ): - . wanat, camille. "tulip at the university of california, part ii: the berkeley experience and a view beyond."
library hi tech , no. ( ): - . willis, katherine. "tulip at the university of michigan." library hi tech , no. ( ): - . willis, katherine, ken alexander, william a. gosling, gregory r. peters, jr., robert schwartzwalder, and beth forrest warner. "tulip—the university licensing program: experiences at the university of michigan." serials review , no. ( ): - . wilson, david l. "major scholarly publisher to test electronic transmission of journals." the chronicle of higher education, june , a , a . worona, steven l., and john m. saylor. "tulip at cornell university." library hi tech , no. ( ): - . . . jstor brindley, lynne, and kevin m. guthrie. "jstor and the joint information systems committee: an international collaboration." serials (march ): - . carlson, david. "aaas and jstor: anatomy of a successful initiative." college & research libraries news , no. ( ): - . carlson, scott. "jstor's journal-archiving service makes fans of librarians and scholars." the chronicle of higher education, july , a -a . chapman, karen. "an examination of the usefulness of jstor to researchers in finance." behavioral & social sciences librarian , no. ( ): - . chepesiuk, ron. "jstor and electronic archiving." american libraries (december ): - . deloughry, thomas j. "journal articles dating back as far as a century are being put on line." the chronicle of higher education, december , a , a . garlock, kristen l., william e. landis, and sherry piontek. "redefining access to scholarly journals: a progress report on jstor." serials review , no. ( ): - . gauger, barbara j., and carolyn kacena. "jstor usage data and what it can tell us about ourselves: is there predictability based on historical use by libraries of similar size?" oclc systems & services , no. ( ): - . guthrie, kevin m. "archiving in the digital age: there's a will, but is there a way?" educause review (november/december ): - . http://www.educause.edu/ir/library/pdf/erm .pdf ———. 
"challenges and opportunities presented by archiving in the electronic era." portal: libraries and the academy , no. ( ): - . ———. "jstor and the university of michigan: an evolving collaboration." library hi tech , no. ( ): - , . ———. "jstor: from project to independent organization." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / guthrie.html ———. "lessons from jstor: user behavior and faculty attitudes." journal of library administration , no. ( ): - . guthrie, kevin m., and wendy p. lougee. "the jstor solution: accessing and preserving the past." library journal, february , - . krueger, stephanie, and irina lynden. "jstor's work in the russian federation: a case study." slavic & east european information resources , no. ( ): - . murphy, alison. "jstor usage." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /jstor/ schonfeld, roger c. "jstor: a case study in the recent history of scholarly communications." program: electronic library and information systems , no. ( ): - . spinella, michael p. "jstor: past, present, and future." journal of library administration , no. ( ): - . ———. "jstor and the changing digital landscape." interlending & document supply , no. ( ): - . sully, sarah e. "jstor: an ip practitioner's perspective." d-lib magazine (january ). http://www.dlib.org/dlib/january / sully.html thomas, spencer w., ken alexander, and kevin guthrie. "technology choices for the jstor online archive." computer (february ): - . walker, ben, dan schoonover, and raimonda margjoni. "creating a statewide jstor repository: initial steps taken by the florida state university system." journal of interlibrary loan, document delivery & electronic reserve , no. ( ): - . . . other projects anderson, cokie g. "digitizing scientific articles: special challenges." science & technology libraries , no. ( ): - . anderson, kent. "a journal publishing hybrid: creating electronic pages for pediatrics." journal of scholarly publishing (october ): - . ———. 
"from paper to electron: how an stm journal can survive the disruptive technology of the internet." journal of the medical informatics association (may/june ): - . andrews, nick. "the oxford journals online archives: the purpose and practicalities of a major print digitization program." serials review , no. ( ): - . arlitsch, kenning, l. yapp, and karen edge. "the utah digital newspapers project." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /arlitsch/ arlitsch.html atkinson, roderick d., and laurie e. stackpole. "torpedo: networked access to full-text and page-image representations of physics journals and technical reports." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /atki n .html beddall, jane, sue malin, and kim hallett. "seamless and integrated access to the world of electronic journals." the serials librarian , no. / ( ): - . bidigare, sarah a., and leena n. lalwani. "the peak project and strategies for remote user support." online (january/february ): - . boyce, peter b., and heather dalterio. "electronic publishing of scientific journals." physics today (january ): - . boyce, peter b., evan owens, and chris biemesderfer. "electronic publishing: experience is telling us something." serials review , no. ( ): - . brailsford, hugo. "parallel publishing for transactions." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /ppt/ cerdeira, hilda a. "ejournals delivery service: an email to internet experiment." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art clark, kathleen a. "implementation of isi's electronic library project at purdue university: criteria for selection and publisher pricing schemes." the serials librarian , no. ( ): - . cole, timothy, william h. mischo, thomas g. habing, and robert h. ferrer. "using xml and xslt to process and render online journals." library hi tech , no. ( ): - . dixon, anne. "the twelve ages of electronic journals." vine, no. 
( ): - . donohue, robert e. "pubscience: accessing scientific and technical journal information at the desktop." serials review , no. ( ): - . eichhorn, g. "the digital library of the astrophysics data system." astrophysics and space science , no. - ( ): - . fishel, martha, and carol j. myers. "the pubmed central archive and the back issue scanning project." journal of interlibrary loan, document delivery & electronic reserve , no. ( ): - . goodvin, renee, and brooke lippy. "eml: taking mississippi libraries into the st century." d-lib magazine (february ). http://www.dlib.org/dlib/february /goodvin/ goodvin.html hobohm, hans-christoph. "changing the galaxy: on the transformation of a printed journal to the internet." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ hoffman, melia m., lawrence o'gorman, guy a. story, james q. arnold, and nina h. macdonald. "the rightpages service: an image-based electronic library." journal of the american society for information science (september ): - . howells, matthew, ashleigh bell, nicholas everitt, and jennifer mcmillan. "digitizing journal archives: the experience of taylor & francis." learned publishing , no. ( ): - . huber, charles f. "electronic journal publishers: a reference librarian's guide." issues in science and technology librarianship (summer ). http://www.library.ucsb.edu/istl/ -summer/article .html hunter, karen. "elsevier digitises scientific heritage." serials (november ): - . ———. "going 'electronic-only': early experiences and issues." journal of library administration , no. ( ): - . ———. "sciencedirect." the serials librarian , no. / ( ): - . joseph, heather. "an economic model for web enhancements to a print journal." the journal of electronic publishing (april ). http://hdl.handle.net/ /spo. . . kelly, robert a. "the american physical society and the torpedo ultra project." the serials librarian , no. / ( ): - . kimberly, robert. 
"electronic journal distribution: a prototype study." the electronic library (august ): - . kirstein, peter, and goli montasser-kohsari. "the c-oda project: online access to electronic journals." communications of the acm (june ): - . kluiters, christiaan c. p. "towards electronic journal articles: the publisher's technical point of view: implementation of elsevier science electronic subscriptions (ess) at the university of tilburg." ifla journal , no. ( ): - . koltay, zsuzsa, and h. thomas hickerson. "project euclid and the role of research libraries in scholarly publishing." journal of library administration , no. / ( ): - . kurtz, michael j., guenther eichhorn, alberto accomazzi, carolyn s. grant, markus demleitner, and stephen s. murray. "the nasa ads abstract service and the distributed astronomy digital library." d-lib magazine (november ). http://www.dlib.org/dlib/november / kurtz.html lewis, gregory s., and john w. edwards. "home on the electronic range: bringing the journal of animal science online." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art lowry, charles b., and denise a. troll. "carnegie mellon university and university microfilms international 'virtual library project.'" the serials librarian , no. / ( ): - . mcgeachin, robert b. "preservation of the texas agricultural experiment station bulletin in the digital repository." journal of agricultural & food information , no. ( ): - . moothart, tom. "migration to electronic distribution through oclc's electronic journals online." serials review , no. ( ): - . nilges, chip. "evolving an integrated electronic journals solution: oclc firstsearch electronic collections online." the serials librarian , no. / ( ): - . peek, robin, jeffrey pomerantz, and stephen paling. "the traditional scholarly journal publishers legitimize the web." journal of the american society for information science (september ): - . reilly, bernard f., and james simon. 
"shared digital access and preservation strategies for serials at the center for research libraries." the serials librarian , no. / ( ): - . rowland, fytton, cliff mcknight, and jack meadows. "elvyn: the delivery of an electronic version of a journal from the publisher to libraries." journal of the american society for information science (september ): - . seib, renate. "exilpresse digital: the deutsche bibliothek's digitization of selected german exile periodicals and newspapers from the - period." the serials librarian , no. ( ): - . smith, philip n. et al. "journal publishing with acrobat: the cajun project." electronic publishing (december ): - . stackpole, laurie e. "the u.s. naval research laboratory and the torpedo ultra project." the serials librarian , no. / ( ): - . stanley, tracey. "the internet library of early journals project." the serials librarian , no. ( ): - . seitter, keith l., and kenneth f. heideman. "whither print? staying nimble in the face of uncertainty." learned publishing , no. ( ): - . story, guy a., lawrence o'gorman, david fox, louise levy schaper, and h. v. jagadish. "the rightpages image-based electronic library for alerting and browsing." computer (september ): - . tagler, john. "recent steps toward full-text electronic delivery at elsevier science." the serials librarian , no. / ( ): - . thomas, timothy. "archives in a new paradigm of scientific publishing: physical review online archives (prola)." d-lib magazine (may ). http://www.dlib.org/dlib/may / thomas.html ———. "physical review online archives (prola): an image archive for the journal physical review." d-lib magazine (june ). http://www.dlib.org/dlib/june / thomas.html tucker, amy. "electronic journals: fulfilling a mission at the institute of physics." serials review , no. ( ): - . turner, judith axler. "lessons from the chronicle: pioneering an online newspaper." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . van steenberg, michael e. 
"nasa stelar experiment." the serials librarian , no. / ( ): - . wallis, jake. "facilitating scottish cultural publishing online." library review , no. ( ): - . http://eprints.cdlr.strath.ac.uk/ / . . project muse, johns hopkins university lewis, susan. "from earth to ether: one publisher's reincarnation." the serials librarian , no. / ( ): - . lewis, susan, and todd kelley. "project muse: tackling journals." in filling the pipeline and paying the piper: proceedings of the fourth symposium, ed. ann okerson, - . washington, dc: office of scientific and academic publishing, association of research libraries, . neal, james g. "the serials revolution: vision, innovation, tradition." the serials librarian , no. / ( ): - . schaffner, melanie b., judy luther, and october ivins. "project muse's new pricing model: a case study in collaboration." serials review , no. ( ): - . . electronic serials: general works abate, tom. "publishing scientific journals online." bioscience , no. ( ): - . amiran, eyal. "the rhetoric of serials at the present time." the serials librarian , no. / ( ): - . amiran, eyal, elaine orr, and john unsworth. "refereed electronic journals and the future of scholarly publishing." in advances in library automation and networking, vol. , ed. joe a. hewitt, - . greenwich, ct: jai press, . http://hdl.handle.net/ / anderson, kent. "the mutant journal: how adaptations to online forces are forcing stm journals to mutate." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art arms, william y. "what are the alternatives to peer review? quality control in scholarly publishing on the web." the journal of electronic publishing (august ). http://hdl.handle.net/ /spo. . . ashcroft, linda. "win-win-win: can the evaluation and promotion of electronic journals bring benefits to library suppliers, information professionals, and users?" library management , no. ( ): - . bailey, charles w., jr. "network-based electronic serials." 
information technology and libraries (march ): - . http://www.digital-scholarship.org/cwb/ital n .htm berin, andrew. "unbundled journals: trying to predict the future." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art bollag, burton. "east african universities will gain journal access in new online project." the chronicle of higher education, march , a . bosch, stephen. "buy, build, or lease: managing serials for scholarly communications." serials review , no. ( ): - . boyce, peter b. "scholarly journals in the electronic world." the serials librarian , no. / ( ): - . brown, david j. "scholarly journal publishing: coming to terms with the internet culture." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art brown, elizabeth w., and andrea l. duda. "electronic publishing programs in science and technology part : the journals." issues in science and technology librarianship (fall /winter ). http://www.library.ucsb.edu/istl/ -fall/brown-duda.html buckley, chad, marian burright, amy prendergast, richard sapon-white, and anneliese taylor. "electronic publishing of scholarly journals: a bibliographic essay of current issues." issues in science and technology librarianship (spring ). http://www.library.ucsb.edu/istl/ -spring/article .html bucknell, terry. "usage statistics for big deals: supporting library decision-making." learned publishing , no. ( ): - . butler, declan. "the writing is on the web for science journals in print." nature, january , - . butler, h. julene, ed. "abstracts of papers presented at the international conference on refereed journals, october ." serials review , no. ( ): - . clement, gail. "evolution of a species: science journals published on the internet." database (october/november ): - . coonin, bryna. "establishing accessibility for e-journals: a suggested approach." library hi tech , no. ( ): - . cowhig, jerry. 
"electronic article and journal usage statistics (eajus): proposal for an industry-wide standard." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art cox, john. "pricing electronic information: a snapshot of new serials pricing models." serials review , no. ( ): - . eisenberg, daniel. "the electronic journal." scholarly publishing (october ): - . elbeck, matthew, and jean mandernach. "expanding the value of scholarly, open access e-journals." library & information science research , no. ( ): - . felts, john w., jr. "now you can get there from here: creating an interactive web application for accessing full-text journal articles from any location." library collections, acquisitions, & technical services , no. ( ): - . fink, j. lynn, and philip e. bourne. "reinventing scholarly communication for the electronic age." ctwatch quarterly , no. ( ). http://www.ctwatch.org/quarterly/articles/ / /reinventing-schol arly-communication-for-the-electronic-age/index.html frankel, mark s., roger elliott, martin blume, and jean-manuel bourgois. "defining and certifying electronic publication in science." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art fry, jenny, and sanna talja. "the cultural shaping of scholarly communication: explaining e-journal use within and across academic fields." in asist : proceedings of the th asist annual meeting, edited by linda schamber and carol l. barry, - . medford , nj: information today, . gaines, brian r. "an agenda for digital journals: the socio-technical infrastructure of knowledge dissemination." journal of organizational computing , no. ( ): - . galvin, jeanne. "the next step in scholarly communication: is the traditional journal dead?" electronic journal of academic and special librarianship , no. ( ). http://southernlibrarianship.icaap.org/content/v n /galvin_j .ht m geller, marilyn. "the real cost and price of ejournals." against the grain (june ): - . 
glover, s. w. "the impact of the internet and electronic journals on biomedical publishing." health libraries review , no. ( ): - . greco, albert n., robert m. wharton, hooman estelami, and robert francis jones. "the state of scholarly journal publishing: - ." journal of scholarly publishing , no. ( ): - . guernsey, lisa, and vincent kiernan. "journals differ on whether to publish articles that have appeared on the web." the chronicle of higher education, july , a -a . halliday, leah, and charles oppenheim. "developments in digital journals." journal of documentation (march ): - . hammond, tony, timo hannay, and ben lund. "the role of rss in science publishing: syndication and annotation on the web." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /hammond/ hammond.html harrison, teresa m., timothy stephen, and james winter. "online journals: disciplinary designs for electronic scholarship." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /harrison. n hawkins, les. "conser cooperative open access journal project." serials review , no. ( ): - . ———. "network accessed scholarly serials." the serials librarian , no. / ( ): - . hellriegel, patricia, and kaat van wonterghem. "package deals unwrapped. . . or the librarian wrapped up? 'forced acquisition' in the digital library." interlending & document supply , no. ( ): - . henley, jane, and sarah thompson. "journalsonline: the online journal solution." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /cover/ hennig, nicole. "improving access to e-journals and databases at the mit libraries: building a database-backed web site called 'vera.'" the serials librarian , no. / ( ): - . hickey, thomas b. "present and future capabilities of the online journal." library trends (spring ): - . http://hdl.handle.net/ / hitchcock, steve, leslie carr, and wendy hall. "web journals publishing: a uk perspective." serials , no. ( ): - . 
http://eprints.ecs.soton.ac.uk/ / hurd, julie m., deborah d. blecic, and ann e. robinson. "performance measures for electronic journals: a user-centered approach." science & technology libraries , no. / ( ): - . hyldegaard, jette, and piet seiden. "my e-journal—exploring the usefulness of personalized access to scholarly articles and services." information research , no. ( ). http://informationr.net/ir/ - /paper .html johnson, qiana. "user preferences in formats of print and electronic journals." collection building , no. ( ): - . johnson, richard k., and judy luther. "are journal publishers trapped in the dual-media transition zone?" arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arl-br- -journals.pdf ———. the e-only tipping point for journals: what's ahead in the print-to-electronic transition zone. washington, dc: association of research libraries, . http://www.arl.org/bm~doc/electronic_transition.pdf kidd, tony. "are print journals dinosaurs?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /main/ kim, hak joon. "the transition from paper to electronic journals: key factors that affect scholars' acceptance of electronic journals." the serials librarian , no. ( ): - . kingma, bruce r. "electronic journal publishing in mathematics." the bottom line: managing library finances , no. ( ): - . kircz, joost g. "new practices for electronic publishing : will the scientific paper keep its form?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "new practices for electronic publishing : new forms of the scientific paper." learned publishing , no. (january ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art kling, rob, and ewa callahan. "electronic journals, the internet, and scholarly communication." in annual review of information science and technology, vol. , ed. blaise cronin, - . medford, nj: information today, inc., . 
http://rkcsi.indiana.edu/archive/csi/wp/wp - b.html koteswara rao, mamidi. "scholarly communication and electronic journals: issues and prospects for academic and research libraries." library review , no. ( ): - . langschied, linda. "electronic journal forum: vpiej-l: an online discussion group for electronic journal publishing concerns." serials review , no. ( ): - . lawal, ibironke. "scholarly communication at the turn of the millennium: a bibliographic essay." journal of scholarly publishing (april ): - . leslie, jacques. "goodbye, gutenberg." wired (october ): - . lougee, wendy p. "scholarly journals in the late th century." library collections, acquisitions, & technical services , no. ( ): - . lukesh, susan s. "revolutions and images and the development of knowledge: implications for research libraries and publishers of scholarly communications." the journal of electronic publishing (april ). http://hdl.handle.net/ /spo. . . luther, judy. "full text journal subscriptions: an evolutionary process." against the grain (june ): , , , . lynch, clifford. "shape of the scientific article in the developing cyberinfrastructure." ctwatch quarterly , no. ( ). http://www.ctwatch.org/quarterly/articles/ / /the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/index.html ———. "technology and its implications for serials acquisition." against the grain (february ): , - . macdonald, ross. "what are the factors that will shape peer review in e-journals?" library hi tech news , no. ( ): - . machovec, george s. "electronic journal market overview— ." serials review , no. ( ): - . maxymuk, john. "electronic journals redux." the bottom line: managing library finances , no. ( ): - . mckiernan, gerry. "e is for everything: the extra-ordinary, evolutionary [e-]journal." the serials librarian , no. / ( ): - . ———. "embedded multimedia in electronic journals." multimedia information and technology , no. ( ): - . ———. 
"the static and the dynamic: embedded multimedia in electronic journals." technicalities (july/august ): , - . mcknight, cliff. "electronic journals—past, present . . . and future?" aslib proceedings (january ): - . meadows, jack. "can we really see where electronic journals are going?" library management , no. ( ): - . mi, jia, and frederick nesta. "the missing link: context loss in online databases." the journal of academic librarianship , no. ( ): - . moothart, tom. "american mathematical society demonstrates progressive innovation with e-journals." serials review , no. ( ): - . ———. "journals online expands the options for e-journal search and retrieval." serials review , no. ( ): - . morris, sally. "when is a journal not a journal? a closer look at the doaj." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art norris, michael, charles oppenheim, and fytton rowland. "finding open access articles using google, google scholar, oaister and opendoar." online information review , no. ( ): - . o'donnell, michael j. "electronic journals: scholarly invariants in a changing medium." journal of scholarly publishing (april ): - . okerson, ann. "the electronic journal: what, whence, and when?" the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /okerson. n ———. "oh lord, won't you buy me a mercedes benz. or, there is a there there." surfaces ( ). http://www.library.yale.edu/~okerson/surfaces.html ———. "publishing through the network: the s debutante." scholarly publishing (april ): - . panzera, don, and evelinde hutzler. "e-journal access through international cooperation: library of congress and the electronic journals library ezb." serials review , no. ( ): - . peek, robin p., and jeffrey p. pomerantz. "electronic scholarly journal publishing." in annual review of information science and technology, vol. , ed. martha e. williams, - . medford, nj: information today, inc., . peters, john. 
"the hundred years war started today: an exploration of electronic peer review." internet research: electronic networking applications and policy , no. ( ): - . peters, stuart. "epress: changing the way electronic journals work." vine, no. ( ): - . plutchak, t. scott. "the landscape shifts: new opportunities for collaboration arise as the primacy of the traditional journal article fades." serials: the journal for the serials community , no. ( ): - . pope, liz. "emerging trends in journal publishing." the serials librarian , no. / ( ): - . rentschler, cathy. "indexing electronic journals." the serials librarian , no. / ( ): - . resh, vincent h. "science and communication: an author/editor/user's perspective on the transition from paper to electronic publishing." issues in science and technology librarianship (summer ). http://www.library.ucsb.edu/istl/ -summer/article .html riley, cheryl a. "libraries, aggregator databases, screen readers and clients with disabilities." library hi tech , no. ( ): - . roberts, peter. "scholarly publishing, peer review and the internet." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ robertson, r. john. "stargate: exploring static repositories for small publishers." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /robertson/ rodgers, david l. "scholarly journals in ." the serials librarian , no. / ( ): - . rowe, richard r. "the transformation of scholarly communication and the future of serials." serials review (summer ): - . rowland, fytton. "scholarly journal publishing in new zealand." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "who will buy my bells and whistles? the true needs of users of electronic journals." serials (july ): - . rowlands, ian, and david nicholas. "the missing link: journal usage metrics." aslib proceedings , no. ( ): - . rowley, jennifer. "the question of electronic journals." library hi tech , no. ( ): - . 
rusbridge, chris. "new relationships in scholarly publishing." in networking and the future of libraries : managing the intellectual record, ed. lorcan dempsey, derek law, and ian mowat, - . london: library association publishing, . siegel, elliot r., donald a.b. lindberg, glen p. campbell, william g. harless, and c. rory goodwin. "defining the next generation journal: the nlm-elsevier interactive publications experiment." information services and use , no. / ( ): - . silver, keith. "pressing the 'send' key—preferential journal access in developing countries." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art singh, jagtar, fytton rowland, and jack meadows. "electronic journals on library and information science." oclc systems & services , no. ( ): - . smith, richard k. "online scholarly publishing in canada: technology and systems for the humanities and social sciences." canadian journal of communication , no. ( ). http://www.cjc-online.ca/index.php/journal/article/view/ / solomon, david j. developing open access electronic journals: a practical guide. oxford: chandos publishing, . steinberger, mark. "the demands on electronic journals in the mathematical sciences." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . stern, david. "new knowledge management systems: the implications for data discovery, collection development, and the changing role of the librarian." journal of the american society for information science and technology , no. ( ): - . stix, gary. "the speed of write." scientific american (december ): - . strickland, peter r., brian mcmahon, and john r. helliwell. "integrating research articles and supporting data in crystallography." learned publishing , no. ( ): - . sweeney, linden. "the future of academic journals: considering the current situation in academic libraries." new library world , no. ( ): - . tenopir, carol. "the complexities of electronic journals." 
library journal, february , - . ———. "electronic or print: are scholarly journals still important?" serials (july ): - . tenopir, carol, and donald w. king. towards electronic journals: realities for scientists, librarians, and publishers. washington, dc: special libraries association, . treloar, andrew. "electronic scholarly publishing and the world wide web." journal of scholarly publishing (april ): - . van brakel, pieter a. "electronic journals: publishing via internet's world wide web." the electronic library (august ): - . van marle, gerald a. j. s. "electronic serial publishing and its effect on the traditional information chain." serials (march ): - . ware, mark. "e-only journals: is it time to drop print?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art weller, ann c. "editorial peer review for electronic journals: current issues and emerging models." journal of the american society for information science , no. ( ): - . wilson, david l. "testing time for electronic journals." the chronicle of higher education, september , a -a . willinsky, john. "open journal systems: an example of open source software for journal management and publishing." library hi tech , no. ( ): - . http://pkp.sfu.ca/node/ willinsky, john, and larry wolfson. "the indexing of scholarly journals: a tipping point for publishing reform?" the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . wood, dee. "online peer review?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "project update: electronic submission and peer review— an update on the espere project." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art wusteman, judith. "formats for the electronic library." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /electronic-formats/ ———. "xml and e-journals." oclc systems & services , no. ( ): - . ———. 
"xml and e-journals: the state of play." library hi tech , no. ( ): - . . electronic serials: library issues abdulla, ali dualeh. "the development of electronic journals in the united arab emirates university (uaeu)." collection building , no. ( ): - . alan, robert, and nan butkovich. "libraries in transition: impact of print and electronic journal access." against the grain , no. ( ): , . allen, barbara mcfadden. "the cic-ejc as a model for management of internet-accessible e-journals." library hi tech , no. - ( ): - . aparna zambare, anne marie casey, john fierst, david ginsburg, judith o'dell, and timothy peters. "assuring access: one library's journey from print to electronic only subscriptions." serials review , no. ( ): - . atkinson, ross. "cornell and the future of the big deal: an interview with ross atkinson." interview by ellen finnie duranceau. serials review , no. ( ): - . bell, steven j. "the new digital divide: dissecting aggregator exclusivity deals." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /bell/ bell.html bevan, simon, satu nieminen, ruth hunn, and michelle sweet. "replacing print with e-journals: can it be done? a case study." serials (march ): - . http://dspace.lib.cranfield.ac.uk: /handle/ / bevis, mary d., and john-bauer graham. "the evolution of an integrated electronic journals collection." the journal of academic librarianship , no. ( ): - . biemiller, lawrence. "california state u. adopts new model to pay for journals." the chronicle of higher education, july , a -a . blosser, john, harriet lightman, william a. mchugh, and anna ren. "aggregator services evaluation: not an easy comparison." the serials librarian , no. ( ): - . bluh, pamela, ed. managing electronic serials: essays based on the alcts electronic serials institutes - . chicago: american library association, . born, kathleen. "role of the aggregator in the emerging electronic environment." journal of library administration , no. ( ): - . boyle, frances. 
"veni, vidi, non vici: e-journals management at the university of liverpool." serials (march ): - . bracke, marianne stowell, and jim martin. "developing criteria for the withdrawal of print content available online." collection building , no. ( ): - . brandsma, terry w., elizabeth r. bernhardt, and dana m. sally. "journal finder, a second look: implications for serials access in today's library." serials review , no. ( ): - . ———. "journal finder: a solution for comprehensive and unmediated access to journal articles." serials review , no. ( ): - . brower, stewart. "teaching e-journals: building a workshop for an academic health sciences library." serials review , no. ( ): - . brown, andrew, and neil smyth. "serials solutions and linkfinderplus at the university of wales swansea." program: electronic library & information systems , no. ( ): - . brunskill, kate, margaret kinnell, cliff mcknight, and anne morris. "switching on serials: the electronic serials in public libraries project." serials , no. ( ): - . burrows, suzetta. "a review of electronic journal acquisition, management, and use in health sciences libraries." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/picrender.fcgi?artid= &blobt ype=pdf butina, ingbritt. "electronic journals—the danish model." serials , no. ( ): - . cameron, robert d. "not just e-journals: providing and maintaining access to serials and serial information through the world-wide web." the serials librarian , no. / ( ): - . chadwell, faye a., and sara brownmiller. "heads up: confronting the selection and access issues of electronic journals." the acquisitions librarian, no. ( ): - . chambers, mary beth, and sooyoung so. "full-text aggregator database vendors and journal publishers: a study of a complex relationship." serials review , no. ( ): - . chan, liza. "electronic journals and academic libraries." library hi tech , no. ( ): - . chan, winnie. 
"creative applications of a web-based e-resource registry." science & technology libraries , no. / ( ): - . chandler, adam, and tim jewel. "key issue: the standardized usage statistics harvesting initiative (sushi)." serials , no. ( ): - . chen, xiaotian. "assessment of full-text sources used by serials management systems, openurl link resolvers, and imported e-journal marc records." online information review , no. ( ): - . ———. "embargo, tasini, and 'opted out': how many journal articles are missing from full-text databases." internet reference services quarterly , no. ( ): - . christie, anne, and laurel kristick. "developing an online science journal collection: a quick tool for assigning priorities." issues in science and technology librarianship, no. ( ). http://www.library.ucsb.edu/istl/ -spring/article .html chrzastowski, tina e. "making the transition from print to electronic serial collections: a new model for academic chemistry libraries?" journal of the american society for information science and technology , no. ( ): - . chudnov, daniel, cynthia crooker, and kimberly parker. "jake: overview and status report." serials review , no. ( ): - . cochenour, donnice. "cicnet's electronic journal collection." serials review , no. ( ): - . cochenour, donnice, and tom moothart. "relying on the kindness of strangers: archiving electronic journals on gopher." serials review , no. ( ): - . cole, louise. "a journey into e-resource administration hell." the serials librarian , no. / ( ): - . http://eprints.whiterose.ac.uk/ / ———. "usage data—the academic library perspective." serials (july ): - . colvin, john, and judith keene. "supporting undergraduate learning through the collaborative promotion of e-journals by library and academic departments." information research , no. ( ). http://informationr.net/ir/ - /paper .html cox, andrew, peter godwin, and robin yeates. "towards a checklist for choosing electronic journal aggregation services." vine, no. ( ): - . 
davies, j. eric. "counting on serials: management and serials metrics." serials (march ): - . degener, christie t., and marjory a. waite. "fools rush in . . . thoughts about, and a model for, measuring electronic journal collections." serials review , no. ( ): - . diedrichs, carol pitts. "e-journals: the ohiolink experience." library collections, acquisitions, & technical services , no. ( ): - . dollar, daniel m., john gallagher, janis glover, regina kenny marone, and cynthia crooker. "realizing what's essential: a case study on integrating electronic journal management into a print-centric technical services department." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= dorn, knut, and katharina klemperer. "e-journal aggregation systems: only part of the big picture." library collections, acquisitions, & technical services , no. ( ): - . duranceau, ellen finnie. "beyond print: revisioning serials acquisitions for the digital age." the serials librarian , no. / ( ): - . ———. "e-journal package-content tracking services." serials review , no. ( ): - . ———. "tracking content changes at aggregated websites for serials." serials review , no. ( ): - . duranceau, ellen finnie, and marilyn geller. "report of the task team on processing electronic journals in the mit libraries." serials review , no. ( ): - . duranceau, ellen, margret lippert, marlene manoff, and carter snowden. "electronic journals in the mit libraries: report of the e-journal subgroup." serials review , no. ( ): - . echeverria, mercedes, and pilar barredo. "online journals: their impact on document delivery." interlending & document supply , no. ( ): - . edwards, judith. "electronic journals: problem or panacea?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /journals/ eells, linda l. "for better or for worse: the joys and woes of e-journals." science & technology libraries , no. / ( ): - . ellis, kathryn d. 
"acquiring electronic journals." the acquisitions librarian, no. ( ): - . ———. "the revolt against journal publishers." the electronic library , no. ( ): - . felts, john. "now you can get there from here: creating an interactive web application for accessing full-text journal articles from any location." journal of library administration , no. / ( ): - . ferguson, christine l., maria d. d. collins, and jill e. grogg. "finding the perfect e-journal access solution . . . the hard way." technical services quarterly , no. ( ): - . frazer, stuart l., and pamela d. morgan. "electronic-for-print journal substitutions: a case study." serials review , no. ( ): - . frazier, kenneth. "the librarian's dilemma: contemplating the costs of the 'big deal.'" d-lib magazine (march ). http://www.dlib.org/dlib/march /frazier/ frazier.html friend, frederick j. "big deal—good deal? or is there a better deal?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art galloway, laura. "innovative interfaces' electronic resource management as a catalyst for change at glasgow university library." the serials librarian , no. ( ): - . http://eprints.gla.ac.uk/ / geffner, mira, and bonnie macewan. "a learning experience: the cic electronic journals collection project." the serials librarian , no. / ( ): - . gatten, jeffrey n., and tom sanville. "an orderly retreat from the big deal: is it possible for consortia?" d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /gatten/ gatten.html gibbs, nancy j. "walking away from the 'big deal': consequences and achievements." serials: the journal for the serials community , no. ( ): - . goodman, david. "should scientific journals be printed? a personal view." online information review , no. ( ): - . ———. "a year without print at princeton, and what we plan next." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art graves, tonia, and michael a. arthur. 
"developing a crystal clear future for the serials unit in an electronic environment: results of a workflow analysis." serials review , no. ( ): - . grogg, jill e. "using a subscription agent for e-journal management." journal of electronic resources librarianship , no. / ( ): - . gyeszly, suzanne d. "electronic or paper journals? budgetary, collection development, and user satisfaction questions." collection building , no. ( ): - . hahn, karla l., and lila a. faulkner. "evaluative usage-based metrics for the selection of e-journals." college & research libraries (may ): - . hamaker, chuck. "chaos—journals electronic style." against the grain (december -january ): - . hartmann, helmut. "electronic journals library: a german university's access and management platform for e-serials goes international." serials (july ): - . hawkins, les. "title access to full text journal content available in aggregator services." serials review , no. ( ): - . healy, leigh watson. "new bottles for old wine? california state university initiates an electronic core journals collection." educom review (may/june ): - . hennig, nicole. "improving access to e-journals and databases at the mit libraries: building a database-backed web site called 'vera.'" the serials librarian , no. / ( ): - . hoffman, william. "systems must change to help knowledge management." research information (december /january ). http://www.researchinformation.info/ridec jan standards.html hudson, laura, and laura windsor. "providing access to electronic journals: the ohio university experience." against the grain (june ): , , . hunter, karen. "the end of print journals: (in)frequently asked questions." journal of library administration , no. ( ): - . hurd, julie m. "serials management: adrift during a sea change?" journal of library administration , no. ( ): - . inger, simon. "the importance of aggregators." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art johns, cecily. 
"collection management strategies in a digital environment." issues in science and technology librarianship, no. ( ). http://www.library.ucsb.edu/istl/ -spring/article .html kalyan, sulekha. "non-renewal of print journal subscriptions that duplicate titles in selected electronic databases: a case study." library collections, acquisitions, & technical services , no. ( ): - . kaplan, richard, marilyn steinberg, and joanne doucette. "retention of retrospective print journals in the digital age: trends and analysis." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= keating, lawrence r., ii, christa easton reinke, and judi a. goodman. "electronic journal subscriptions." library acquisitions: practice & theory , no. ( ): - . keyhani, andrea. "coping with the digital shift: four of the thorniest issues." the serials librarian , no. / ( ): - . kichuk, diana. "electronic journal supplementary content, browser plug-ins, and the transformation of reading." serials review , no. ( ): - . kidd, tony. "electronic journal usage statistics in practice." serials (march ): - . ———. "electronic journals: their introduction and exploitation in academic libraries in the united kingdom." serials review (spring ): - . kiernan, vincent. "university libraries debate the value of package deals on electronic journals." the chronicle of higher education, september , a -a . knibbe, andrew. "a subscription agent's role in electronic publishing." the journal of electronic publishing (june ). http://hdl.handle.net/ /spo. . . knudson, frances l., nancy r. sprague, douglas a. chafe, mark l. b. martinez, isabel m. brackbill, and miriam e. blake. "leveraging the marc record: automatic generation of electronic journal web pages." serials review , no. ( ): - . knudson, frances l., nancy r. sprague, douglas a. chafe, mark l. b. martinez, isabel m. brackbill, vicky a. musgrave, and kathleen a. pratt. 
"creating electronic journal web pages from opac records." issues in science and technology librarianship (summer ). http://www.library.ucsb.edu/istl/ -summer/article .html kobulnicky, paul. "pork bellies and silk purses." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf koehler, amy e. c. "some thoughts on the meaning of open access for university library technical services." serials review , no. ( ): - . kwasik, hanna. "qualifications for a serials librarian in an electronic environment." serials review , no. ( ): - . lam, vinh-the. "organizational and technical issues in providing access to electronic journals." the serials librarian , no. ( ): - . lee, leslie a., and michelle m. wu. "do librarians dream of electronic serials? a beginner's guide to format selection." the bottom line , no. ( ): - . litchfield, charles. "local storage and retrieval of electronic journals: training issues for technical services personnel." serials review , no. ( ): - . liu, weiling, and fannie m. cox. "tracking the use of e-journals: a technique collaboratively developed by the cataloging department and the office of libraries technology at the university of louisville." oclc systems & services , no. ( ): - . luther, judy. "white paper on electronic journal usage statistics." the serials librarian , no. ( ): - . macewan, bonnie, and mira geffner. "the cic electronic journals collection project." the serials librarian , no. / ( ): - . ———. "the committee on institutional cooperation electronic journals collection (cic-ejc): a new model for library management of scholarly journals published on the internet." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /mace n .html manoff, marlene. "electronic journals: postmodern dream or nightmare." academic and library computing (september ): - . manoff, marlene, d. scott brandt, carter snowden, and carol zoppel. "wais/electronic journal evaluation task force report." 
serials review , no. ( ): - . manoff, marlene, eileen dorschner, marilyn geller, keith morgan, and carter snowden. "report of the electronic journals task force mit libraries." serials review , no. - ( ): - . martin, rebecca a. "finding free and open access resources: a value-added service for patrons." journal of interlibrary loan, document delivery & electronic reserve , no. ( ): - . mcdonald, john. "'no one uses them so why should we keep them?'—scenarios for print issue retention." against the grain , no. ( ): , . mckay, sharon cline. "accessing electronic journals." database (april/may ): - . mcmillan, gail. "embracing the electronic journal: one library's plan." the serials librarian , no. / ( ): - . ———. "technical processing of electronic journals." library resources & technical services (october ): - . ———. "technical services for electronic journals today." serials review , no. ( ): - . metcalf, cameron. "an open source solution to managing electronic journal links with database-generated web pages." serials librarian , no. ( ): - . metz, paul. "electronic journals from a collection manager's point of view." serials review , no. ( ): - . miller, rush, and sherrie schmidt. "e-metrics: measures for electronic resources." serials (march ): - . miran, julie, and norm medeiros. "glory days: managing scientific journals in a liberal arts college." issues in science and technology librarianship (summer ). http://www.library.ucsb.edu/istl/ -summer/article .html mischo, william h., michael a. norman, wendy allen shelburne, and mary c. schlembach. "the growth of electronic journals in libraries: access and management issues and solutions." science & technology libraries , no. / ( ): - . misiek, marte, and gerry oxford. "electronic resources project." in electronic publishing: its impact on publishing, education, and reading: canadian association for information science proceedings of the th annual conference, ed. charles t.
meadow, maggie weaver, and francoise hebert, - . ontario: canadian association for information science, . mitchell, anne. "tracking aggregator coverage with spreadsheets." the serials librarian , no. ( ): - . montgomery, carol hansen. "'fast track' transition to an electronic journal collection: a case study." new library world, no. ( ): - . http://idea.library.drexel.edu/handle/ / montgomery, carol hansen, and joanne l. sparks. "the transition to an electronic journal collection: managing the organizational changes." serials review , no. ( ): - . moothart, tom. "providing access to e-journals through library home pages." serials review (summer ): - . morgan, eric lease. "description and evaluation of the 'mr. serials' process: automatically collecting, organizing, archiving, indexing, and disseminating electronic serials." serials review , no. ( ): - . nabe, jonathan. "e-journal bundling and its impact on academic libraries: some early results." issues in science and technology librarianship, no. ( ). http://www.library.ucsb.edu/istl/ -spring/article .html naylor, bernard. "what librarians want: transplanting yesterday into tomorrow." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art neie, phillipp, and heather steele. "infomediaries in the internet era: subscription agents as intermediaries and aggregators in the electronic publishing world—agents of change and tradition." the serials librarian , no. / ( ): - . newton-smith, carol. "when the electronic journal comes to the campus." in the electronic journal: the future of serials-based information, ed. brian cook, - . new york: the haworth press, inc., . nisonger, thomas e. "electronic journal collection management issues." collection building , no. ( ): - . nowick, elaine, and claudine arnold jenda. "libraries stuck in the middle: reactive vs. proactive responses to the science journal crisis." issues in science and technology librarianship, no. ( ). 
http://www.istl.org/ -winter/article .html oberg, steve. "which route do i take? a viewpoint on locally developed versus commercially available journal management solutions." serials review , no. ( ): - . http://eprints.rclis.org/archive/ / parang, elizabeth, and laverna saunders. electronic journals in arl libraries: issues and trends. spec kit . washington, dc: office of management services, association of research libraries, . ———. electronic journals in arl libraries: policies and procedures. spec kit . washington, dc: office of management services, association of research libraries, . peters, thomas a. "collaborative print retention pilot projects." against the grain , no. ( ): , , . prior, albert. "managing electronic serials: the development of a subscription agent's service." the serials librarian , no. / ( ): - . prowse, stephen, and catrin sly. "stock checking e-journals: the experience of king's college london." serials: the journal for the serials community , no. ( ): - . publicker, stephanie, and kristin stoklosa. "reaching the researcher: how the national institutes of health library selects and provides e-journals via the world wide web." serials review , no. ( ): - . robbins, laura pope. "creating an integrated periodicals listing using microsoft access and asp scripts." oclc systems & services , no. ( ): - . rockliff, sue. "e-journals: the queen elizabeth hospital library experience." the electronic library , no. ( ): - . rowland, fytton, and mari connal. "research into libraries' purchasing and access requirements." serials , no. ( ): - . rowse, mark. "the hybrid environment: electronic-only versus print retention." against the grain , no. ( ): , , . rupp-serrano, karen, sarah robbins, and danielle cain. "canceling print serials in favor of electronic: criteria for decision making." library collections, acquisitions, & technical services , no. ( ): - . sale, arthur. "a challenge for the library acquisition budget." d-lib magazine , no. / ( ). 
http://www.dlib.org/dlib/may /sale/ sale.html sanville, thomas j. "a method out of the madness: ohiolink's collaborative response to the serials crisis." serials (july ): - . sasse, margo, and b. jean winkler. "electronic journals: a formidable challenge for libraries." in advances in librarianship, vol. , ed. irene p. godden, - . san diego: academic press, . savory, richard. "managing electronic e-journal access: the tdnet solution." serials (november ): - . schmidt, krista, and nancy newsome. "the changing landscape of serials: open access journals in the public catalog." the serials librarian , no. / ( ): - . schulz, nathalie. "e-journal databases: a long-term solution?" library collections, acquisitions, & technical services , no. ( ): - . shepherd, peter. "the feasibility of developing and implementing journal usage factors: a research project sponsored by uksg." serials: the journal for the serials community , no. ( ): - . shim, wonsik, and charles r. mcclure. "improving database vendors' usage statistics reporting through collaboration between libraries and vendors." college & research libraries (november ): - . shouse, daniel l., nick crimi, and janice steed lewis. "managing journals: one library's experience." library hi tech , no. ( ): - . sitko, michelle, narda tafuri, gregory szczyrback, and taemin park. "e-journal management systems: trends, trials, and trade-offs." serials review , no. ( ): - . smith, malcolm. "hanging on to what we've got: economic and management issues in providing perpetual access in an electronic environment." serials (july ): - . stackpole, laurie e., and richard james king. "electronic journals as a component of the digital library." issues in science and technology librarianship (spring ). http://www.library.ucsb.edu/istl/ -spring/article .html stalberg, erin. "bibliographic access to titles in aggregator databases: one library's experience." the serials librarian , no. ( ): - . stankus, tony. 
"electronic journal concerns and strategies for aggregators, subscription services, indexing/abstracting services, and electronic bibliographic utilities." science & technology libraries , no. / ( ): - . stefancu, mircea, alex bloss, and jay lambrecht. "all about doller: managing electronic resources at the university of illinois at chicago library." serials review , no. ( ): - . suhr, karl. "compiling a collective, searchable list of full text titles for multiple databases." internet reference services quarterly , no. ( ): - . tobia, rajia c., jude a. lynch, bonnie c. o'connor, and thomas j. raymond, jr. "electronic journals: experiences of an academic health sciences library." serials review , no. ( ): - . tucker, natalie a., and robert p. holley. "digital infrastructure development within a nonprofit polymer science library: an analysis of the transition to digital serials at the michigan molecular institute." serials review , no. ( ): - . von ungern-sternberg, sara, and mats g. lindquist. "the impact of electronic journals on library functions." journal of information science , no. ( ): - . wang, jue, and alan t. schroeder, jr. "the subscription agent as e-journal intermediary." serials review , no. ( ): - . wilkinson, frances c. "electronic journal access for libraries: what some companies are doing to help—part i." against the grain (november ): - , . ———. "electronic journal access for libraries: what some companies are doing to help—part ii." against the grain (february ): - , - . withers, rob, rob casson, and aaron shrimplin. "creating web-based listings of electronic journals without creating extra work." library collections, acquisitions, & technical services , no. ( ): - . womack, ryan. "bel jour: a discipline-specific portal to periodicals." information technology and libraries (june ): - . woodward, hazel. "electronic journals—the librarian's viewpoint." serials (november ): - . woodward, hazel, and cliff mcknight. 
"electronic journals: issues of access and bibliographic control." serials review , no. ( ): - . zhang, xiaoyin, and michaelyn haslam. "movement toward a predominantly electronic journal collection." library hi tech , no. ( ): - . zhang, xiaoyin, and toby murray. "binding in the electronic environment." serials review , no. ( ): - . zimerman, martin. "periodicals: print or electronic?" new library world , no. / ( ): - . . electronic serials: research anderson, kent, john sack, lisa krauss, and lori o'keefe. "publishing online-only peer-reviewed biomedical literature: three years of citation, author perception, and usage experience." the journal of electronic publishing (march ). http://hdl.handle.net/ /spo. . . antelman, kristin. "do open-access articles have a greater research impact?" college & research libraries , no. ( ): - . ashcroft, linda. "issues in developing, managing and marketing electronic journals collections." collection building , no. ( ): - . bar-ilan, judit, and noa fink. "preference for electronic format of scientific journals—a case study of the science library users at the hebrew university." library & information science research , no. ( ): - . bar-ilan, judit, bluma c. peritz, and yecheskel wolman. "a survey on the use of electronic databases and electronic journals accessed through the web by the academic staff of israeli universities." the journal of academic librarianship , no. ( ): - . barker, anne l., and lucy a. tedd. "the ariadne project: an evaluation of a print and web magazine for library and information science professionals." journal of information science , no. ( ): - . bauer, kathleen, and nisa bakkalbasi. "an examination of citation counts in a new scholarly communication environment." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /bauer/ bauer.html bennett, denise beaubien, and amy g. buhler. "browsing of e-journals by engineering faculty." issues in science & technology librarianship, no. ( ). 
http://www.istl.org/ -spring/refereed .html berge, zane l., and mauri p. collins. "ipct journal readership survey." journal of the american society for information science (september ): - . berteaux, susan s., and peter brueggeman. "electronic journal timeliness: comparison with print." the serials librarian , no. ( ): - . bhat, mohammad hanief. "open access publishing in indian premier research institutions." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html bishop, ann peterson. "scholarly journals on the net: a reader's assessment." library trends (spring ): - . http://hdl.handle.net/ / björk, bo-christer, annikki roos, and mari lauri. "scientific journal publishing: yearly volume and open access availability." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html black, steve. "impact of full text on print journal use at a liberal arts college." library resources & technical services , no. ( ): - , . blessinger, kelly, and maureen olle. "content analysis of the leading general academic databases." library collections, acquisitions, & technical services , no. ( ): - . bonorino, adina gonzález, and valeria e. molteni. "electronic journals collections in argentine private academic libraries." the electronic library , no. ( ): - . bonthron, karen, christine urquhart, rhian thomas, chris armstrong, david ellis, jean everitt, roger fenton, ray lonsdale, elizabeth mcdermott, helen morris, rebecca phillips, sian spink, and alison yeoman. "trends in use of electronic journals in higher education in the uk—views of academic staff and students." d-lib magazine , no. ( ). http://www.dlib.org/dlib/june /urquhart/ urquhart.html borrego, Àngel, lluís anglada, maite barrios, and núria comellas. "use and users of electronic journals at catalan universities: the results of a survey." the journal of academic librarianship , no. ( ): - . 
http://www.recercat.net/handle/ / boukacem-zeghmouri, chérifa, and joachim schöpfel. "on the usage of e-journals in french universities." serials: the journal for the serials community , no. ( ): - . bravo, blanca rodríguez, maría luisa alvite díez, leticia barrionuevo almuzara, and maría antonia morán suárez. "patterns of use of electronic journals in spanish university libraries." serials review , no. ( ): - . brazzeal, bradley, and amanda clay powers. "electronic access to agricultural journals: an agronomy case study." serials review , no. ( ): - . bremholm, tony l. "challenges and opportunities for bibliometrics in the electronic environment: the case of the proceedings of the oklahoma academy of science." science & technology libraries , no. / ( ): - . brennan, martin j., julie m. hurd, deborah d. blecic, and ann c. weller. "a snapshot of early adopters of e-journals: challenges to the library." college & research libraries (november ): - . butler, h. julene. "research into the reward system of scholarship; where does scholarly electronic publishing get you?" in filling the pipeline and paying the piper: proceedings of the fourth symposium, ed. ann okerson, - . washington, dc: office of scientific and academic publishing, association of research libraries, . ———. "where does scholarly electronic publishing get you?" journal of scholarly publishing (july ): - . case, mary m. "a snapshot in time: arl libraries and electronic journal resources." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/resources/pubs/br/asit.shtml chaudhuri, jayati, and mariyam thohira. "usage of open-access journals: findings from eleven top science and medical journals." serials librarian , no. - ( ): - . cheng, weihong, and shengli ren. "evolution of open access publishing in chinese scientific journals." learned publishing , no. ( ): - . christianson, marilyn. 
"ecology articles in google scholar: levels of access to articles in core journals." issues in science & technology librarianship, no. ( ). http://www.istl.org/ -winter/refereed.html chu, heting. "promises and challenges of electronic journals: academic libraries surveyed." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art cochenour, donnice, and tom moothart. "e-journal acceptance at colorado state university: a case study." serials review , no. ( ): - . collins, cheryl s., and william h. walters. "open access journals in college library collections." serials librarian , no. ( ): - . connell, tschera harkness, sally a. rogers, and carol pitts diedrichs. "ohiolink electronic journal use at ohio state university." portal: libraries and the academy , no. ( ): - . http://hdl.handle.net/ / cooper, mindy m. "the importance of gathering print and electronic journal use data: getting a clear picture." serials review , no. ( ): - . cox, john. "scholarly publishing practices: a case of plus Ça change, plus c'est la même chose?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art crawford, walt. "free electronic refereed journals: getting past the arc of enthusiasm." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "getting past the arc of enthusiasm." cites & insights: crawford at large (may ): - . http://citesandinsights.info/civ i .pdf ———. "getting past the arc of enthusiasm (feedback and following up)." cites & insights: crawford at large (june ): - . http://citesandinsights.info/civ i .pdf cronin, blaise, and kara overfelt. "e-journals and tenure." journal of the american society for information science (october ): - . crummett, courtney, ellen finnie duranceau, tracy a. gabridge, remlee s. green, erja kajosalo, michael m. noga, howard j. silver, and amy stout. "publishing practices of nih-funded faculty at mit." 
issues in science and technology librarianship, no. ( ). http://www.istl.org/ -summer/refereed .html cummings, joel. "full-text aggregation: an examination of metadata accuracy and implications for resource sharing." serials review , no. ( ): - . davis, philip m. "author-choice open-access publishing in the biological and medical literature: a citation analysis." journal of the american society for information science and technology , no. ( ): - . ———. "patterns in electronic journal usage: challenging the composition of geographic consortia." college & research libraries (november ): - . davis, philip m., and leah r. solla. "an ip-level analysis of usage statistics for electronic journals in chemistry: making inferences about user behavior." journal of the american society for information science and technology , no. ( ): - . http://hdl.handle.net/ / de groote, sandra l. "impact of online journals on citation patterns of dentistry, nursing, and pharmacy faculty." journal of the medical library association , no. ( ): - . http://www.ncbi.nlm.nih.gov/pmc/articles/pmc / de groote, sandra l., and josephine l. dorsch. "measuring use patterns of online journals and databases." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/picrender.fcgi?action=stream&blobtype=pdf&artid= de groote, sandra l., mary shultz, and marceline doranski. "online journals' impact on the citation patterns of medical faculty." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= dilek-kayaoglu, hulya. "use of electronic journals by faculty at istanbul university, turkey: the results of a survey." the journal of academic librarianship , no. ( ): - . dillon, irma f., and karla l. hahn. "are researchers ready for the electronic-only journal collection?: results of a survey at the university of maryland." portal: libraries and the academy (july ): - . http://hdl.handle.net/ / way, doug.
"the open access availability of library and information science literature." college & research libraries , no. ( ): - . dow, ronald f. "editorial gatekeepers confronted by the electronic journal." college & research libraries (march ): - . duy, joanna, and liwen vaughan. "can electronic journal usage data replace citation data as a measure of journal use? an empirical examination." the journal of academic librarianship , no. ( ): - . edgar, brian d, and john willinsky. "a survey of scholarly journals using open journal systems." scholarly and research communication , no. ( ). http://journals.sfu.ca/src/index.php/src/article/view/ emrani, ebrahim, amin moradi-salari, and hamid r. jamali. "usage data, e-journal selection, and negotiations: an iranian consortium experience." serials review , no. ( ): - . erdman, jacquelyn marie. "image quality in electronic journals: a case study of elsevier geology titles." library collections, acquisitions, & technical services , no. / ( ): - . ford, charlotte e., and stephen p. harter. "the downside of scholarly electronic publishing: problems in accessing electronic journals through online directories and catalogs." college & research libraries (july ): - . fosmire, michael. "scan it and they will come . . . but will they cite it?" science & technology libraries , no. / ( ): - . http://docs.lib.purdue.edu/lib_research/ / fosmire, michael, and elizabeth young. "free scholarly electronic journals: what access do college and university libraries provide?" college & research libraries (november ): - . fosmire, michael, and song yu. "free scholarly electronic journals: how good are they?" issues in science and technology librarianship (summer ). http://www.library.ucsb.edu/istl/ -summer/refereed.html frandsen, tove faber. "attracted to open access journals: a bibliometric author analysis in the field of biology." journal of documentation , no. ( ): - . ———. 
"the integration of open access journals in the scholarly communication system: three science fields." information processing & management , no. ( ): - . http://www.hprints.org/hprints- /en/ gandhi, subash. "growth, characteristics, and distribution patterns of chemistry and biochemistry e-journals: a feasibility study for cuny libraries." serials review , no. ( ): - . gardner, susan. "the impact of electronic journals on library staff at arl member institutions: a survey and critique of the survey methodology." serials review , no. / ( ): - . gomes, suely, and jack meadows. "perceptions of electronic journals in british universities." journal of scholarly publishing (april ): - . goodman, david, sarah dowson, and jean yaremchuk. "open access and accuracy: author-archived manuscripts vs. published articles." learned publishing , no. ( ): - . guruprasad, r., and khaiser nikam. "e-journals and their usage patterns amongst the indian aerospace scientists and engineers in bengaluru." desidoc journal of library & information technology , no. ( ): - . http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ hahn, karla. "the state of the large publisher bundle: findings from an arl member survey." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ). http://www.arl.org/bm~doc/arlbr bundle.pdf hamilton, richard. "patterns of use for the bryn mawr reviews." in technology and scholarly communication, ed. richard ekman and richard e. quandt, - . berkeley: university of california press, . harnad, stevan, and tim brody. "comparing the impact of open access (oa) vs. non-oa articles in the same journals." d-lib magazine , no. ( ). http://www.dlib.org/dlib/june /harnad/ harnad.html harter, stephen p. "the impact of electronic journals on scholarly communication: a citation analysis." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /hart n .html ———. 
"scholarly communication and electronic journals: an impact study." journal of the american society for information science , no. ( ): - . harter, stephen p., and charlotte e. ford. "web-based analyses of e-journal impact: approaches, problems, and issues." journal of the american society for information science (november ): - . harter, stephen p., and hak joon kim. "accessing electronic journals and other e-publications: an empirical study." college & research libraries (september ): - . ———. "electronic journals and scholarly communication: a citation and reference study." information research , no. ( ). http://informationr.net/ir/ - /paper a.html hawkins, donald t. "bibliometrics of electronic journals in information science." information research , no ( ). http://informationr.net/ir/ - /paper .html hedlund, turid, tomas gustafsson, and bo-christer björk. "the open access scientific journal: an empirical study." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art hitchcock, steve, les carr, wendy hall, steve harris, steve probets, david evans, and david brailsford. "linking electronic journals: lessons from the open journal project." d-lib magazine (december ). http://www.dlib.org/dlib/december / hitchcock.html hitchcock, steve, leslie carr, and wendy hall. "a survey of stm online journals - : the calm before the storm." in directory of electronic journals, newsletters and academic discussion lists, th ed., ed. dru mogge, - . washington, dc: association of research libraries, . hitchcock, steve, freddie quek, leslie carr, wendy hall, andrew witbrock, and ian tarr. "towards universal linking for electronic journals." serials review (spring ): - . http://eprints.ecs.soton.ac.uk/ / jacsó, péter. "electronic shoes for the cobbler's children: treatment of digital journals in library and information science databases." online (july/august ): - . 
http://www.onlinemag.net/ol /jacso _ .html jamali, hamid r., david nicholas, and paul huntington. "the use and users of scholarly e-journals: a review of log analysis studies." aslib proceedings: new information perspectives , no. ( ): - . jeon-slaughter, haekyung, andrew c. herkovic, and michael a. keller. "economics of scientific and biomedical journals: where do scholars stand in the debate of online journal pricing and site license ownership between libraries and publishers?" first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ joglekar, neelambari, and bharati sen. "evaluation of electronic journals in library and information science." information studies (july ): - . johnson, ian m. "electronic publishing in librarianship and information science in latin america—a step towards development?" information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html johnson, ian, hong wang, and fei nie. "electronic journal provision and use in china: an initial study." serials: the journal for the serials community , no. ( ): - . jordan, mark, and dave kisly. "how does your library handle electronic serials? a general survey." serials (march ): - . joseph, lura e. "image and figure quality: a study of elsevier's earth and planetary sciences electronic journal back file package." library collections, acquisitions, & technical services , no. / ( ): - . karasözen, bülent, ayhan kaygusuz, and hacer (bati) Özen. "patterns of e-journal use within the anatolian university library consortium." serials: the journal for the serials community , no. ( ): - . http://eprints.rclis.org/archive/ / kaur, baljinder, and rama verma. "use and impact of electronic journals in the indian institute of technology, delhi, india." the electronic library , no. ( ): - . keller, alice. "delphi survey on the future development of electronic journals." serials (july ): - . ———.
"future development of electronic journals: a delphi survey." the electronic library , no. ( ): - . khan, abdul mannan, and naved ahmad. "use of e-journals by research scholars at aligarh muslim university and banaras hindu university." the electronic library , no. ( ): - . kichuk, diana. "degrees of separation: linking and link distribution in cnslp publisher e-journal packages." the serials librarian , no. ( ): - . kim, hak joon. "motivations for hyperlinking in scholarly electronic articles: a qualitative study." journal of the american society for information science (august ): - . king, donald w., and carol hansen montgomery. "after migration to an electronic journal collection: impact on faculty and doctoral students." d-lib magazine (december ). http://www.dlib.org/dlib/december /king/ king.html king, donald w., carol tenopir, songphan choemprayong, and lei wu. "scholarly journal information-seeking and reading patterns of faculty at five us universities." learned publishing , no. ( ): - . king, donald w., carol tenopir, and michael clarke. "measuring total reading of journal articles." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /king/ king.html king, donald w., carol tenopir, carol hansen montgomery, and sarah e. aerni. "patterns of journal use by faculty at three diverse universities." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /king/ king.html kirlidog, melih, and didar bayir. "the effects of electronic access to scientific literature in the consortium of turkish university libraries." the electronic library , no. ( ): - . http://eprints.rclis.org/archive/ / knowlton, steven a. "continuing use of print-only information by researchers." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/picrender.fcgi?artid= &blobtype=pdf koehler, wallace, paulita aguilar, sharon finarelli, charles gaunce, susan hatchette, rebecca heydon, emily mcewen, wendy mahsetky-poolaw, charles t.
melson, rory patterson, mark stahl, mary ann walker, joanna wall, and gabe wingfield. "a bibliometric analysis of select information science print and electronic journals in the s." information research (october ). http://informationr.net/ir/ - /paper .html kokkonen, oili, and eva ijas. "availability of journals in electronic form." inspel , no. ( ): - . kortelainen, terttu. "an analysis of the use of electronic journals and commercial journal article collections through the finelib portal." information research , no. ( ). http://informationr.net/ir/ - /paper .html kousha, kayvan, and mike thelwall. "the web impact of open access social science research." library & information science research , no. ( ): - . lamothe, alain raymond. "electronic serials usage patterns as observed at a medium-size university: searches and full-text downloads." partnership: the canadian journal of library and information practice and research , no. ( ). http://journal.lib.uoguelph.ca/index.php/perj/article/view/ / lancaster, f. w. "attitudes in academia toward feasibility and desirability of networked scholarly publishing." library trends (spring ): - . http://hdl.handle.net/ / lee, wade m., and robin n. sinn. "scientists and the journal article: choices for access." journal of interlibrary loan, document delivery & information supply , no. ( ): - . lester, frank. "backlinks: alternatives to the citation index for determining impact." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . liew, chern li, schubert foo, and k. r. chennupati. "a study of graduate student end-users' use and perception of electronic journals." online information review , no. ( ): - . llewellyn, richard d., lorraine j. pellack, and diana d. shonrock. "the use of electronic-only journals in scientific research." issues in science & technology librarianship, no. ( ). http://www.istl.org/ -summer/refereed.html lorimer, rowland, and adrienne lindsay. 
"canadian scholarly journals at a technological crossroads." canadian journal of communication , no. ( ). http://www.cjc-online.ca/index.php/journal/article/view/ / mahe, annaig, christine andrys, and ghislaine chartron. "how french research scientists are making use of electronic journals: a case study conducted at pierre et marie curie university and denis diderot university." journal of information science , no. ( ): - . mayernik, matthew. "the prevalence of additional electronic features in pure e-journals." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . mcclanahan, kitty, lei wu, carol tenopir, and donald w. king. "embracing change: perceptions of e-journals by faculty members." learned publishing , no. ( ): - . mckibbon, k. ann, r. brian haynes, r. james mckinlay, and cynthia lokker. "which journals do primary care physicians and specialists access from an online service?" journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= mcveigh, marie e., and james k. pringle. "open access to the medical literature: how much content is available in published journals?" serials , no. ( ): - . mukherjee, bhaskar. "evaluating e-contents beyond impact factor—a pilot study selected open access journals in library and information science." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . ———. "the hyperlinking pattern of open-access journals in library and information science: a cited citing reference study." library & information science research , no. ( ): - . mukherjee, bhaskar, and uttar pradesh aranasi. "do open-access journals in library and information science have any scholarly impact? a bibliometric study of selected open-access journals using google scholar." journal of the american society for information science and technology , no. ( ): - . nicholas, david, and paul huntington. "electronic journals: are they really used?" 
interlending & document supply , no. ( ): - . nicholas, david, paul huntington, and hamid r. jamali. "the impact of open access publishing (and other access initiatives) on use and users of digital scholarly journals." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "open access in context: a user study." journal of documentation , no. ( ): - . ———. "the use, users, and role of abstracts in the digital scholarly environment." the journal of academic librarianship , no. ( ): - . nicholas, david, paul huntington, hamid r. jamali, and carol tenopir. "finding information in (very large) digital libraries: a deep log approach to determining differences in use according to method of access." the journal of academic librarianship , no. ( ): - . nicholas, david, paul huntington, and ian rowlands. "open access journal publishing: the views of some of the world's senior authors." journal of documentation , no. ( ): - . nicholas, david, paul huntington, bill russell, anthony watkinson, hamid r. jamali, and carol tenopir. "the big deal— ten years on." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art nicholas, david, paul huntington, and anthony watkinson. "digital journals, big deals and online searching behaviour: a pilot study." aslib proceedings: new information perspectives , no. / ( ): - . ———. "scholarly journal usage: the results of deep log analysis." journal of documentation , no. ( ): - . nicholas, david, hamid r. jamali m, paul huntington, and ian rowlands. "in their very own words: authors and scholarly journal publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art nicholas, david, ian rowlands, paul huntington, hamid r. jamali, and patricia hernández salazar. "diversity in the e-journal use and information-seeking behaviour of uk researchers." journal of documentation , no. ( ): - . 
nicholas, david, peter williams, ian rowlands, and hamid r. jamali. "researchers' e-journal use and information seeking behaviour." journal of information science , no. ( ): - . norris, michael, charles oppenheim, and fytton rowland. "the citation advantage of open-access articles." journal of the american society for information science and technology , no. ( ): - . olsen, janette r. "implications of electronic journal literature for scholars." ph.d. diss., cornell university, . oppenheim, charles, clare greenhalgh, and fytton rowland. "the future of scholarly journal publishing." journal of documentation (july ): - . park, ji-hong. "motivations for web-based scholarly publishing: do scientists recognize open availability as an advantage?" journal of scholarly publishing , no. ( ): - . park, ji-hong, and jian qin. "exploring the willingness of scholars to accept open access: a grounded theory approach." journal of scholarly publishing , no. ( ): - . park, taemin kim. "d-lib magazine: its first years " d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /park/ park.html pavliscak, pamela. "trends in copyright practices of scholarly electronic journals." serials review , no. ( ): - . pedersen, sarah, and rosemary stockdale. "what do the readers think? a look at how scientific journal users see the electronic environment." journal of scholarly publishing (october ): - . poworoznek, emily l. "linking of errata: current practices in online physical sciences journals." journal of the american society for information science and technology , no. ( ): - . prabha, chandra. "shifting from print to electronic journals in arl university libraries." serials review , no. ( ): - . quinn, brian. "mainstreaming electronic journals through improved indexing: prospects for the social sciences." serials review , no. ( ): - . reich, vicky. "electronic publishing: the publisher's view." the serials librarian , no. / ( ): - . rich, linda a., and julie l. rabine. 
"the changing access to electronic journals: a survey of academic library websites revisited." serials review , no. / ( ): - . ———. "how libraries are providing access to electronic serials: a survey of academic library web sites." serials review , no. ( ): - . riel, steven j., tom auger, sophie bogdanski, marian burright, bao-chu chang, janice christopher, ibironke lawal, kavita mundle, anthony oddo, and irwin weintraub. "perceived successes and failures of science & technology e-journal access: a comparative study." issues in science & technology librarianship, no. ( ). http://www.istl.org/ -summer/article .html robertson, victoria. "the impact of electronic journals on academic libraries: the changing relationship between journals, acquisitions and inter-library loans department roles and functions." interlending & document supply , no. ( ). rogers, sally a. "electronic journal usage at ohio state university." college & research libraries (january ): - . rowland, fytton, ian bell, and catherine falconer. "human and economic factors affecting the acceptance of electronic journals by readers." canadian journal of communication , no. ( ): - . http://www.cjc-online.ca/index.php/journal/article/view/ / rowlands, ian, and dave nicholas. new journal publishing models: an international survey of senior researchers. london: school of library, archive, and information studies, university college, . http://www.ucl.ac.uk/ciber/ciber_ _survey_final.pdf rudner, lawrence m., marie miller-whitehead, and jennifer s. gellmann. "who is reading on-line education journals? why? and what are they reading?" d-lib magazine (december ). http://www.dlib.org/dlib/december /rudner/ rudner.html rusch-feja, diann, and uta siebeky. "evaluation of usage and acceptance of electronic journals: results of an electronic survey of max planck society researchers including usage statistics from elsevier, springer and academic press." d-lib magazine (october ). 
http://www.dlib.org/dlib/october /rusch-feja/ rusch-feja-summary.html salisbury, lutishoor, and emilio noguera. "usability of e-journals and preference for the virtual periodicals room: a survey of mathematics faculty and graduate students." electronic journal of academic and special librarianship , no. - ( ). http://southernlibrarianship.icaap.org/content/v n /salisbury_l .htm sathe, nila a., jenifer l. grady, and nunzia b. giuse. "print versus electronic journals: a preliminary investigation into the effect of journal format on research processes." journal of the medical library association (april ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= sawant, sarika. "the current scenario of open access journal initiatives in india." collection building , no. ( ): - . schauder, don. "electronic publishing of professional articles: attitudes of academics and implications for the scholarly communication industry." journal of the american society for information science (march ): - . scigliano, marisa. "measuring the use of networked electronic journals in an academic library consortium: moving beyond mines for libraries in ontario scholars portal." serials review , no. ( ): - . serotkin, patricia b., patricia i. fitzgerald, and sandra a. balough. "if we build it, will they come? electronic journals acceptance and usage patterns." portal: libraries and the academy , no. ( ): - . shemberg, marian, and cheryl grossman. "electronic journals in academic libraries: a comparison of arl and non-arl libraries." library hi tech , no. ( ): - . siebenberg, tammy r., betty galbraith, and eileen e. brady. "print versus electronic journal use in three sci/tech disciplines: what's going on here?" college & research libraries , no. ( ): - . smart, pippa. "e-journals: developing country access survey." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art smith, alastair g.
"citations and links as a measure of effectiveness of online lis journals." ifla journal , no. ( ): - . soong, samson c. "measuring citation advantages of open accessibility." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /soong/ soong.html sotudeh, hajar, and abbas horri. "the citation performance of open access journals: a disciplinary investigation of citation distribution models." journal of the american society for information science and technology , no. ( ): - . ———. "tracking open access journals evolution: some considerations in open access data collection validation." journal of the american society for information science and technology , no. ( ): - . speier, cheri, jonathan palmer, daniel wren, and susan hahn. "faculty perceptions of electronic journals as scholarly communication: a question of prestige and legitimacy." journal of the american society for information science , no. ( ): - . sprague, nancy, and mary beth chambers. "full-text databases and the journal cancellation process: a case study." serials review , no. ( ): - . srivastava, sandhya, and paolina taglienti. "e-journal management: an online survey evaluation." serials review , no. ( ): - . standera, o. l. "electronic publishing: some notes on reader response and costs." scholarly publishing (july ): - . stemper, james a., and janice m. jaguszewski. "usage statistics for electronic journals: an analysis of local and vendor counts." collection management , no. ( ): - . stewart, linda. "user acceptance of electronic journals: interviews with chemists at cornell university." college & research libraries (july ): - . suber, peter, and caroline sutton. "society publishers with open access journals." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#list swan, alma, and sheridan brown. "authors and electronic publishing: what authors want from the new technology." learned publishing , no. ( ): - .
http://www.ingentaconnect.com/content/alpsp/lp/ / / /art sweeney, aldrin e. "tenure and promotion: should you publish in electronic journals?" the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . taler, izabella. "lis open access e-journal 'where are you?'" webology , no. ( ). http://www.webology.ir/ /v n /a .html talja, sanna, and hanni maula. "reasons for the use and non-use of electronic journals and databases: a domain analytic study in four scholarly disciplines." journal of documentation , no. ( ): - . talja, sanna, pertti vakkari, jenny fry, and paul wouters. "impact of research cultures on the use of digital library resources." journal of the american society for information science and technology , no. ( ): - . taylor, donald. "looking for a link: comparing faculty citations pre and post big deals." electronic journal of academic and special librarianship , no. ( ). http://southernlibrarianship.icaap.org/content/v n /taylor_d .htm tenopir, carol. "electronic publishing: research issues for academic librarians and users." library trends , no. ( ): - . http://hdl.handle.net/ / tenopir, carol, brenda hitchcock, and ashley pillow. use and users of electronic library resources: an overview and analysis of recent research studies. washington, dc: council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf tenopir, carol, and donald w. king. "designing electronic journals with years of lessons from print." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . ———. "electronic journals and changes in scholarly article seeking and reading patterns." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /tenopir/ tenopir.html ———. "managing scientific journals in the digital era." information outlook (february ): - . ———. "reading behaviour and electronic journals." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———.
"setting the record straight on journal publishing: myth vs. reality." library journal, march , - . ———. "trends in scientific scholarly journal publishing in the united states." journal of scholarly publishing (april ): - . ———. "the use and value of scientific journals: past, present and future." serials (july ): - . tenopir, carol, donald w. king, peter boyce, matt grayson, and keri-lynn paulson. "relying on electronic journals: reading patterns of astronomers." journal of the american society for information science and technology , no. ( ): - . tenopir, carol, donald w. king, peter boyce, matt grayson, yan zhang, and mercy ebuen. "patterns of journal use by scientists through three evolutionary phases." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /king/ king.html tenopir, carol, donald w. king, and amy bush. "medical faculty's use of print and electronic journals: changes over time and in comparison with scientists." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/picrender.fcgi?action=stream&blobtype=pdf&artid= tenopir, carol, donald w. king, sheri edwards, and lei wu. "electronic journals and changes in scholarly article seeking and reading patterns." aslib proceedings , no. ( ): - . tomney, hilary, and paul f. burton. "electronic journals: a study of usage and attitudes among academics." journal of information science , no. ( ): - . tonta, yasar. "scholarly communication and the use of networked information sources." ifla journal , no. ( ): - . tschider, charlotte. "investigating the 'public' in the public library of science: gifting economics in the internet community." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ vakkari, pertti, and sanna talja. "searching for electronic journal articles to support academic tasks. a case study of the use of the finnish national electronic library (finelib)."
information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html vaughan, k. t. l. "changing use patterns of print journals in the digital age: impacts of electronic equivalents on print chemistry journal use." journal of the american society for information science and technology , no. ( ): - . vilar, polona, and maja Žumer. "comparison and evaluation of the user interfaces of e-journals." journal of documentation , no. ( ): - . vlachaki, assimina, and christine urquhart. "use of open access journals in biomedicine in greece." library management , no. / ( ): - . voorbij, henk, and hilde ongering. "the use of electronic journals by dutch researchers: a descriptive and exploratory study." the journal of academic librarianship , no. ( ): - . waddell, pam. "the potential for electronic journals in uk academia." in libraries and it: working papers of the information technology sub-committee of the hefc's libraries review, - . bath, uk: the office for library and information networking, . warkentin, erwin. "consumer issues and the scholarly journal." canadian journal of communication , no. ( ): - . http://www.cjc-online.ca/index.php/journal/article/view/ / warlick, stefanie e, and k. t. l. vaughan. "factors influencing publication choice: why faculty choose open access." biomedical digital libraries (article ). http://www.bio-diglib.com/content/ / / williams, peter, david nicholas, and ian rowlands. "e-journal usage and impact in scholarly research: a review of the literature." new review of academic librarianship , no. ( ): - . willinsky, john. "open access is public access: helping policymakers read research." canadian journal of communication , no. ( ). http://www.cjc-online.ca/index.php/journal/article/view/ / wilson, mary dabney. "flying first class or economy? classification of electronic titles in arl libraries." portal: libraries and the academy , no. ( ): - . wood, d. j. 
"peer review and the web: the implications of electronic peer review for biomedical authors, referees and learned society publishers." the journal of documentation (march ): - . woodward, hazel. "electronic serials: the uk electronic libraries (elib) programme." serials review (spring ): - . woodward, hazel, fytton rowland, cliff mcknight, carolyn pritchett, and jack meadows. "café jus: an electronic journals user survey." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ woolfrey, sandra. "subscriber interest in electronic copy of printed journals." journal of scholarly publishing (april ): - . worlock, kate. "electronic journals: user realities—the truth about content usage among the stm community." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art wulff, judith l., and neal d. nixon. "quality markers and use of electronic journals in an academic health sciences library." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/articlerender.fcgi?artid= xia, jingfeng. "a longitudinal study of scholars attitudes and behaviors toward open-access journal publishing." journal of the american society for information science and technology , no. ( ): - . youngen, gregory k. "citation patterns to traditional and electronic preprints in the published literature." college & research libraries (september ): - . yue, paoshan w., and millie l. syring. "usage of electronic journals and their effect on interlibrary loan: a case study at the university of nevada, reno." library collections, acquisitions, & technical services , no. ( ): - . zainab, a. n., a. r. huzaimah, and t. f. ang. "using journal use study feedback to improve accessibility." the electronic library , no. ( ): - . zhang, yin. "the impact of internet-based electronic resources on formal scholarly communication in the area of library and information science: a citation analysis." 
journal of information science , no. ( ): - . zhang, zhongdong. "evaluating electronic journals services and monitoring their usage by means of www server log file analysis." vine, no. ( ): - .

general works

allison, arthur, james currall, michael moss, and susan stuart. "digital identity matters." journal of the american society for information science and technology , no. ( ): - . baker, gavin. "student activism: how students use the scholarly communication system." college & research libraries news , no. ( ): - . banks, marcus a. "the excitement of google scholar, the worry of google print." biomedical digital libraries (article ). http://www.bio-diglib.com/content/pdf/ - - - .pdf benkler, yochai. the wealth of networks: how social production transforms markets and freedom. new haven: yale university press, . http://www.benkler.org/wealth_of_networks/index.php/download_pdfs_of_the_book bennett, scott. "re-engineering scholarly communication: thoughts addressed to authors." journal of scholarly publishing (july ): - . björk, bo-christer. "a model of scientific communication as a global distributed information system." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html ———. "a lifecycle model of the scientific communication process." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art boock, michael h. "a faculty led response to the crisis in scholarly communications." electronic journal of academic and special librarianship , no. ( ). http://southernlibrarianship.icaap.org/content/v n /boock_m .html borgman, christine l. "data, disciplines, and scholarly publishing." learned publishing , no. ( ): - . ———. "the invisible library: paradox of the global information infrastructure." library trends , no. ( ): - . http://hdl.handle.net/ / ———. scholarship in the digital age: information, infrastructure and the internet. cambridge, ma: mit press, . borman, stu.
"advances in electronic publishing herald changes for scientists." chemical & engineering news , no. ( ): - . bowen, william g. "'new times always; old time we cannot keep.'" arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr bowen.pdf brogan, martha l., and daphnée rentfrow. a kaleidoscope of digital american literature. washington, dc: council on library and information resources and digital library federation, . http://www.clir.org/pubs/abstract/pub abst.html bugeja, michael j. "the advent of print on demand . . . but make sure you read the fine print." the chronicle of higher education, march , b -b . butterworth, i., ed. the impact of electronic publishing on the academic community: an international workshop organized by the academia europaea and the wenner-gren foundation. london: portland press, . byerley, suzanne l., and mary beth chambers. "accessibility of web-based library databases: the vendors' perspectives." library hi tech , no. ( ): - . chodorow, stanley. "scholarship and scholarly communication in the electronic age." educause review (january/february ): - . http://www.educause.edu/ir/library/pdf/erm b.pdf clarke, roger. "freedom of information? the internet as harbinger of the new dark ages." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ clews, john. "digital language access: scripts, transliteration, and computer access." d-lib magazine (march ). http://www.dlib.org/dlib/march /sesame/ clews.html clyde, laurel a. "weblogs—are you serious?" the electronic library , no. ( ): - . cole, timothy w. "publishing mathematics on the web." science & technology libraries , no. / ( ): - . covi, lisa m. "material mastery: situating digital library use in university research practices." information processing and management (may ): - . crawford, walt. "talking about public access—pacs-l's first decade."
information technology and libraries (september ): - . dalbello, marija. "is there a text in this library? history of the book and digital continuity." journal of education for library and information science , no. ( ): - . daniel, hans-dieter. "publications as a measure of scientific advancement and of scientists' productivity." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art davies, j. eric, and helen greenwood. "scholarly communication trends—voices from the vortex: a summary of specialist opinion." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art dempsey, lorcan, and maria heijne. "scientific information supply—building networked information systems." the electronic library (august ): - . desmarais, norman. "e ink and digital paper." against the grain (december -january ): - . dilevko, juris, and lisa gottlieb. "print sources in an electronic age: a vital part of the research process for undergraduate students." the journal of academic librarianship , no. ( ): - . doldi, luisa m., and erwin bratengeyer. "the web as a free source for scientific information: a comparison with fee-based databases." online information review , no. ( ): - . domier, sharon h. "listservs within the pantheon of written materials." in advances in serials management, vol. , ed. marcia tuttle and karen d. darling, - . greenwich, ct: jai press, . dong, peng, marie loh, and adrian mondry. "the 'impact factor' revisited." biomedical digital libraries (article ). http://www.bio-diglib.com/content/ / / doty, philip, and ann p. bishop. "the national information infrastructure and electronic publishing: a reflective essay." journal of the american society for information science , no. ( ): - . duff, alistair s. "four 'e'pochs: the story of informatization." library review , no. ( ): - . duguid, paul. "limits of self-organization: peer production and 'laws of quality.'" first monday , no. ( ). 
http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ eisend, martin. "the internet as a new medium for the sciences? the effects of internet use on traditional scientific communication media among social scientists in germany." online information review , no. ( ): - . ekman, richard, and richard e. quandt, eds. technology and scholarly communication. berkeley: university of california press, . elam, barbara. "readiness or avoidance: e-resources and the art historian." collection building , no. ( ): - . ernest, douglas j., and holley r. lange. "electronic publishing: a bibliography, - ." in library high tech bibliography, vol. , ed. c. edward wall. ann arbor: pierian press, . fernandez, leila. "scholarly communication in the sciences—a third world perspective." internet reference services quarterly , no. ( ): - . fisher, william. "now you see it; now you don't: the elusive nature of electronic information." library collections, acquisitions, & technical services , no. ( ): - . gabriel, michael r. a guide to the literature of electronic publishing: cd-rom, desktop publishing, and electronic mail, books, and journals. greenwich, ct: jai press, . the getty art history information program. research agenda for networked cultural heritage. santa monica: the getty art history information program, . gordon, g. e., and fytton rowland, eds. scholarly publishing in an electronic era, international yearbook of library and information management. london: facet publishing, . graham, thomas w. "scholarly communication." serials (march ): - . greenstein, daniel. "the arts and humanities data service three years' on." d-lib magazine (december ). http://www.dlib.org/dlib/december /greenstein/ greenstein.html greenstein, daniel, and leigh watson healy. "print and electronic information: shedding new light on campus use." educause review (september/october ): - . http://www.educause.edu/ir/library/pdf/erm .pdf grycz, czeslaw jan, ed.
promises & pitfalls—an aap/psp briefing paper on internet publishing. new york: association of american publishers, inc., . guernsey, lisa. "scholars who work with technology fear they suffer in tenure reviews." the chronicle of higher education, june , a -a . guthrie, kevin. "something in the water: scholarly communications in a rapidly changing information economy: based on paper presented at the st uksg conference, torquay, april ." serials: the journal for the serials community , no. ( ): - . halliday, leah. "scholarly communication, scholarly publication and the status of emerging formats." information research (july ). http://informationr.net/ir/ - /paper .html heimpel, rod. "legitimizing electronic scholarly publications: a discursive proposal." surfaces , no. ( ). http://www.pum.umontreal.ca/revues/surfaces/vol /heimpel.pdf hirshon, arnold. "a diamond in the rough: divining the future of e-content." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf hovde, karen. "you can't get there from here: student citations in an ephemeral electronic environment." college and research libraries , no. ( ): - . hurd, julie m. "scientific communication: new roles and new players." science & technology libraries , no. / ( ): - . igun, stella e. "implications for electronic publishing in libraries and information centres in africa." the electronic library , no. ( ): - . jacobson, thomas l. "the electronic publishing revolution is not 'global.'" journal of the american society for information science (december ): - . kasdorf, william e. the columbia guide to digital publishing. new york: columbia university press, . kebede, gashaw. "the changing information needs of users in electronic information environments." the electronic library , no. ( ): - . kiernan, vincent. "rewards remain dim for professors who pursue digital scholarship." the chronicle of higher education, april , a -a . kling, rob, and geoffrey mckim. 
"not just a matter of time: field differences and the shaping of electronic media in supporting scientific communication." journal of the american society for information science , no. ( ): - . http://arxiv.org/abs/cs.cy/ ———. "scholarly communication and the continuum of electronic publishing." journal of the american society for information science , no. ( ): - . http://arxiv.org/abs/cs.cy/ kling, rob, geoffrey mckim, and adam king. "a bit more to it: scholarly communication forums as socio-technical interaction networks." journal of the american society for information science and technology , no. ( ): - . https://scholarworks.iu.edu/dspace/html/ / /wp - b.html kovacs, diane k., kara l. robinson, and jeanne dixon. "scholarly e-conferences on the academic networks: how library and information science professionals use them." journal of the american society for information science (may ): - . kovacs, michael j., and diane k. kovacs. "the state of scholarly electronic conferencing." electronic networking: research, applications and policy (winter ): - . kubota, teruzo. "how are electronic journals and cd-roms being accepted in japan?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art lancaster, f. w. "electronic publishing." library trends (winter ): - . http://hdl.handle.net/ / ———. "the evolution of electronic publishing." library trends (spring ): - . http://hdl.handle.net/ / ———. "the paperless society revisited." american libraries (september ): - . ———. toward paperless information systems. new york: academic press, . langston, lizbeth. "scholarly communication and electronic publication: implications for research, advancement, and promotion." in untangling the web: proceedings of the conference sponsored by the librarians association of the university of california, santa barbara and friends of the ucsb library, ed. andrea l. duda. santa barbara: university of california, santa barbara library, . 
http://www.library.ucsb.edu/untangle/langston.html lawlor, bonnie. "abstracting and information services: managing the flow of scholarly communication—past, present, and future." serials review , no. ( ): - . lederberg, joshua. "options for the future." d-lib magazine (may ). http://www.dlib.org/dlib/may / lederberg.html lippincott, joan k. "beyond coexistence: finding synergies between print content and digital information." journal of library administration , no. ( ): - . liu, ziming, and david g. stork. "is paperless really more? rethinking the role of paper in the digital age." communications of the acm (november ): - . lucier, richard e. "knowledge management: refining roles in scientific communication." educom review (fall ): - . ———. "librarians and publishers as collaborators and competitors." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf lyman, peter, and hal r. varian. "how much information?" the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . lyons, patrice a. "the world meets the internet." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /lyons/ lyons.html maher, james v. "the research university and scholarly publishing: the view from a provost's office." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr provost.pdf mann, charles c. "electronic paper turns the page." technology review (march ): - . http://www.technologyreview.com/infotech/ / marks, jayne, and timo hannay. "evolving scholarly communication." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "a potency of life: scholarship in an electronic age." the serials librarian , no. / ( ): - . maynard, sally, and ann o'brien. "scholarly output: print and digital—in teaching and research." journal of documentation , no. ( ): - . mcgrath, eileen l., winifred fordham metz, and john b. rutledge.
"h-net book reviews: enhancing scholarly communication with technology." college & research libraries , no. ( ): - . mogge, dru. "seven years of tracking electronic publishing: the arl directory of electronic journals, newsletters and academic discussion lists." library hi tech , no. ( ): - . newby, gregory b. "a prognosis for continued disarray in electronic scholarly communication." canadian journal of communication , no. / ( ): - . http://www.cjc-online.ca/index.php/journal/article/view/ / newman, kathleen a., deborah d. blecic, and kimberly l. armstrong. scholarly communication education initiatives, spec kit . washington, dc: association of research libraries, . http://www.arl.org/bm~doc/spec book.pdf.zip o'connor, steve. "economic and intellectual value in existing and new paradigms of electronic scholarly communication." library hi tech , no. ( ): - . odlyzko, andrew. "the rapid evolution of scholarly communication." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art o'donnell, james j. "st. augustine to nren: the tree of knowledge and how it grows." the serials librarian , no. / ( ): - . okerson, ann, ed. filling the pipeline and paying the piper: proceedings of the fourth symposium. washington, dc: office of scientific and academic publishing, association of research libraries, . ———. scholarly publishing on the electronic networks: the new generation: visions and opportunities in not-for-profit publishing: proceedings of the second symposium. washington, dc: office of scientific and academic publishing, association of research libraries, . okerson, ann, and dru mogge, eds. gateways, gatekeepers, and roles in the information omniverse: proceedings of the third symposium. washington, dc: office of scientific and academic publishing, association of research libraries, . http://eric.ed.gov/ericwebportal/contentdelivery/servlet/ericservlet?accno=ed palmer, carole l. "scholarly work and the shaping of digital access."
journal of the american society for information science and technology , no. ( ): - . peek, robin p. "where is publishing going? a perspective on change." journal of the american society for information science , no. ( ): - . peek, robin p., and gregory b. newby, eds. scholarly publishing: the electronic frontier. cambridge, ma: the mit press, . peters, thomas a. "was that the rubicon, lethe, or styx we just crossed? access conditions for e-content." library collections, acquisitions, & technical services , no. ( ): - . pinfield, stephen. "what do universities want from publishing?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art pistotti, vanna. "electronic publishing in medicine: where are we?" journal of the pancreas , no. ( ): - . http://www.joplink.net/prev/ / _ .pdf quandt, richard e. "electronic publishing and virtual libraries: issues and an agenda for the andrew w. mellon foundation." serials review (summer ): - . schaffner, bradley l. "electronic resources: a wolf in sheep's clothing?" college & research libraries , no. ( ): - . schamber, linda. "what is a document? rethinking the concept in uneasy times." journal of the american society for information science (september ): - . schmiede, rudi. "upgrading academic scholarship—challenges and chances of the digital age." library hi tech , no. ( ): - . schonfeld, roger c., and brian f. lavoie. "books without boundaries: a brief tour of the system-wide print book collection." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . schwartz, charles a. "scholarly communication as a loosely coupled system: reassessing prospects for structural reform." college & research libraries (march ): - . ———. "the strength of weak ties in electronic development of the scholarly communication system." college & research libraries (november ): - . searing, susan e., and leigh s. estabrook. 
"the future of scientific publishing on the web: insights from focus groups of chemists." portal: libraries and the academy , no. ( ): - . sellitto, carmine. "the impact of impermanent web-located citations: a study of scholarly conference publications." journal of the american society for information science and technology , no. ( ): - . shulenburger, david e. "on scholarly evaluation and scholarly communication: increasing the availability of quality work." college & research libraries news , no. ( ): - . siriginidi, subba rao. "chemical information in the electronic era." collection building , no. ( ): - . smith, alastair g. "web links as analogues of citations." information research , no. ( ). http://informationr.net/ir/ - /paper .html soete, george j. transforming libraries: issues and innovations in electronic scholarly publication. spec kit . washington, dc: office of management services, association of research libraries, . sowards, steven w. "novas, niches, and icebergs: practical lessons for small-scale web publishers." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . steele, colin, linda butler, and danny kingsley. "the publishing imperative: the pervasive influence of publication metrics." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art stemper, jim, and karen williams. "scholarly communication: turning crisis into opportunity." college & research libraries news , no. ( ): - . stephen, timothy, and teresa m. harrison. "comserve: moving the communication discipline online." journal of the american society for information science , no. ( ): - . ———. "intensive disciplinarity in electronic services for research and education: building systems responsive to intellectual tradition and scholarly culture." the journal of electronic publishing (august ). http://hdl.handle.net/ /spo. . . tannehill, robert s., jr. 
"emerging standards on the citation of electronic documents: a status report." information standards quarterly (october ): - . tenopir, carol. "authors and readers: the keys to success or failure for electronic publishing." library trends (spring ): - . http://hdl.handle.net/ / terry, ana arias. "electronic ink technologies: showing the way to a brighter future." library hi tech , no. ( ): - . wagner, a. ben. "managing tradeoffs in the electronic age." journal of the american society for information science and technology , no. ( ): - . walker, janice r. "citing serials: online serial publications and citation systems." the serials librarian , no. / ( ): - . warner, simeon. "the transformation of scholarly communication." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art waters, donald j. "'doing much more than we have so far attempted'." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf ———. "managing digital assets in higher education: an overview of strategic issues." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr assets.pdf whitworth, brian, and rob friedman. "reinventing academic publishing online. part i: rigor, relevance and practice." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / ———. "reinventing academic publishing online. part ii: a socio-technical vision." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ wittenberg, kate. "collaborators in communication: publishers, scholars, and information technologists." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf worona, steve. "project cupid: a romance between networking and publishing." educom review (march/april ): - . http://www.educause.edu/resources/projectcupidaromancebetwe ennet/ . 
general works: research (multiple-types of electronic works) al-aufi, ali, and paul genoni. "an investigation of digital scholarship and disciplinary culture in oman." library hi tech , no. ( ): - . asefeh asemi, and nosrat riyahiniya. "awareness and use of digital resources in the libraries of isfahan university of medical sciences, iran." the electronic library , no. ( ): - . atakan, cemal, dogan atilgan, özlem bayram, and sacit arslantekin. "an evaluation of the second survey on electronic databases usage at ankara university digital library." the electronic library , no. ( ): - . bhatt, r. k. "use of ugc-infonet digital library consortium resources by research scholars and faculty members of the university of delhi in history and political science: a study." library management , no. / ( ): - . björk, bo-christer, and ziga turk. "how scientists retrieve publications: an empirical study of how the internet is overtaking paper media." the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . boumarafi, behdja. "electronic resources at the university of sharjah medical library: an investigation of students' information-seeking behavior." medical reference services quarterly , no. ( ): - . crawford, john. "the use of electronic information services and information literacy: a glasgow caledonian university study." journal of librarianship and information science , no. ( ): - . de vicente, angel, john crawford, and stuart clink. "use and awareness of electronic information services by academic staff at glasgow caledonian university." library review , no. ( ): - . deng, hepu. "emerging patterns and trends in utilizing electronic resources in a higher education environment: an empirical analysis." new library world , no. / ( ): - . franklin, brinley, and terry plum. "library usage patterns in the electronic information environment." information research , no. ( ). http://informationr.net/ir/ - /paper .html friedlander, amy.
dimensions and use of the scholarly information environment: introduction to a data set assembled by the digital library federation and outsell, inc. washington, dc: digital library federation and council on library and information resources, . http://www.clir.org/pubs/reports/pub /contents.html haridasan, sudharma, and majid khan. "impact and use of e-resources by social scientists in national social science documentation centre (nassdoc), india." the electronic library , no. ( ): - . harley, diane. "use and users of digital resources." educause quarterly , no. ( ): - . http://www.educause.edu/ir/library/pdf/eqm .pdf harley, diane, sarah earl-novell, jennifer arter, shannon lawrence, and c. judson king. "the influence of academic values on scholarly publication and communication practices." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . heterick, bruce. "faculty attitudes toward electronic resources." educause review (july/august ): - . http://www.educause.edu/ir/library/pdf/erm .pdf holmes, aldyth. "publishing trends and practices in the scientific community." canadian journal of communication , no. ( ). http://www.cjc-online.ca/index.php/journal/article/view/ / houghton, john w. "changing research practices and research infrastructure development." higher education management and policy , no. ( ): - . houghton, john w, colin steele, and margaret henty. "research practices, evaluation and infrastructure in the digital environment." australian academic & research libraries , no. ( ): - . http://alia.org.au/publishing/aarl/ . /full.text/houghton.html jankowska, maria anna. "identifying university professors' information needs in the challenging environment of information and communication technologies." the journal of academic librarianship , no. ( ): - . khan, abdul mannan, s. mustafa zaidi, and safay zaffar bharati. 
"use of on-line databases by faculty members and research scholars of jawaharlal nehru university (jnu) and jamia millia islamia (jmi), new delhi (india): a survey." the international information & library review , no. ( ): - . kriebel, leslie, and leslie lapham. "transition to electronic resources in undergraduate social science research: a study of honors theses bibliographies, - ." college and research libraries , no. ( ): - . http://www.ftrf.org/ala/mgrps/divs/acrl/publications/crljournal/ / may/kriebel.pdf kumar, b. t. sampath, and g. t. kumar. "perception and usage of e-resources and the internet by indian academics." the electronic library , no. ( ): - . kushkowski, jeffrey d. "web citation by graduate students: a comparison of print and electronic theses." portal: libraries and the academy , no. ( ): - . madhusudhan, margam. "use of electronic resources by research scholars of kurukshetra university." the electronic library , no. ( ): - . marcum, deanna b., and gerald george. "who uses what? report on a national survey of information users in colleges and universities." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /george/ george.html maron, nancy l., and k. kirby smith. "current models of digital scholarly communication: results of an investigation conducted by ithaka strategic services for the association of research libraries." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . ———. "digital scholarly communication: a snapshot of current trends." research library issues, no. ( ): - . http://www.arl.org/bm~doc/rli- -ithaka.pdf min, shao, and yang yi. "e-resources, services and user surveys in tsinghua university library." program: electronic library and information systems , no. ( ): - . nicholas, david, paul huntington, and hamid r. jamali. "diversity in the information seeking behaviour of the virtual scholar: institutional comparisons." the journal of academic librarianship , no. ( ): - . 
nicholas, david, paul huntington, hamid r. jamali, ian rowlands, and maggie fieldhouse. "student digital information-seeking behaviour in context." journal of documentation , no. ( ): - . niu, xi, bradley m. hemminger, cory lown, stephanie adams, cecelia brown, allison level, merinda mclure, audrey powers, michele r. tennant, and tara cataldo. "national study of information seeking behavior of academic researchers in the united states." journal of the american society for information science and technology , no. ( ): - . http://onlinelibrary.wiley.com/doi/ . /asi. /full oduwole, adebambo adewale, and olatundun oyewumi. "accessibility and use of web-based electronic resources by physicians in a psychiatric institution in nigeria." program: electronic library and information systems , no. ( ): - . ramlogan, rabia, and lucy a. tedd. "use and non-use of electronic information sources by undergraduates at the university of the west indies." online information review , no. ( ): - . http://cadair.aber.ac.uk/dspace/handle/ / renwick, shamin. "knowledge and use of electronic information resources by medical sciences faculty at the university of the west indies." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/picrender.fcgi?action=stream&blobtype=pdf&artid= ritchie, ann, and paul genoni. "print v. electronic reference sources: implications of an australian study." the electronic library , no. ( ): - . rowlands, ian, and dave nicholas. "the changing scholarly communication landscape: an international survey of senior researchers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "scholarly communication in the digital environment: the survey of journal author behaviour and attitudes." aslib proceedings: new information perspectives , no. ( ): - . rowlands, ian, dave nicholas, and paul huntington. "scholarly communication in the digital environment: what do authors want?"
learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art sheeja, n. k. "undergraduate students' perceptions of digital library: a case study." the international information & library review , no. ( ): - . shuling, wu. "investigation and analysis of current use of electronic resources in university libraries." library management , no. / ( ): - . swain, dillip k., and k. c. panda. "use of electronic resources in business school libraries of an indian state: a study of librarians' opinion." the electronic library , no. ( ): - . ———. "use of e-services by faculty members of business schools in a state of india: a study." collection building , no. ( ): - . tahir, muhammad, khalid mahmood, and farzana shafique. "use of electronic information resources and facilities by humanities scholars." the electronic library , no. ( ): - . university of california office of scholarly communication, california digital library escholarship program, and greenhouse associates. faculty attitudes and behaviors regarding scholarly communication: survey findings from the university of california. oakland, ca: university of california office of scholarly communication, . http://osc.universityofcalifornia.edu/responses/materials/osc-survey-summaries- .pdf vibert, nicolas, jean-françois rouet, christine ros, mélanie ramond, and bruno deshoullieres. "the use of online electronic information resources in scientific research: the case of neuroscience." library & information science research , no. ( ): - . zhang, yin. "scholarly use of internet-based electronic resources." journal of the american society for information science and technology (june ): - . ———. "scholarly use of internet-based electronic resources: a survey report." library trends (spring ): - . http://hdl.handle.net/ / legal issues . legal issues: digital copyright adler, prudence. "copyright and intellectual property legislation and related activities: new challenges for libraries."
journal of library administration , no. ( ): - . ———. "the role of fair use in libraries and education." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr fairuse.pdf adler, prudence s., g. jaia barrett, patricia brennan, mary case, mary e. jackson, and duane e. webster, comps. copyright and the nii: resources for the library and education community. washington, dc: association of research libraries, . akmon, dharma. "only with your permission: how rights holders respond (or don't respond) to requests to display archival materials online." archival science , no. ( ): - . alexander, adrian w. "wither fair use? a library consortium viewpoint." portal: libraries and the academy , no. ( ): - . alexander, adrian w., and julie s. alexander. "intellectual property rights and the 'sacred engine': scholarly publishing in the electronic age." in advances in library resource sharing, vol. , ed. jennifer cargill and diane j. graves, - . westport, ct: meckler publishing, . anderson, byron. "first sale, digital copyright, and libraries." behavioral & social sciences librarian , no. ( ): - . ang, steven. "agenda for change: intellectual property rights and access management—a framework for discussion on the relationship between copyright and the role of libraries in the digital age." library review , no. / ( ): - . appel, andrew w., and edward w. felten. "technological access control interferes with noninfringing scholarship." communications of the acm (september ): - . http://www.cs.princeton.edu/research/techreps/tr- - bailey, charles w., jr. "strong copyright + drm + weak net neutrality = digital dystopia?" information technology and libraries , no. ( ): - , . http://www.ala.org/ala/mgrps/divs/lita/ital/ /number september/bailey.pdf bald, margaret. "the case of the disappearing author." serials review , no. ( ): - . band, jonathan.
"armageddon on the potomac: the collections of information antipiracy act." d-lib magazine (january ). http://www.dlib.org/dlib/january / band.html ———. the google library project: the copyright debate. washington, dc: office for information technology policy, american library association, . http://www.ala.org/ala/issuesadvocacy/copyright/googlebooks/the % google% library% project% policy% brief.pdf ———. a guide for the perplexed: libraries & the google library project settlement. washington, dc: association of research libraries and the american library association, . http://www.arl.org/bm~doc/google-settlement- nov .pdf band, jonathan, and jonathan s. gowdy. "sui generis database protection: has its time come?" d-lib magazine (june ). http://www.dlib.org/dlib/june / band.html barlow, john perry. "the economy of ideas: a framework for rethinking patents and copyrights in the digital age (everything you know about intellectual property is wrong)." wired (march ): - , - . http://www.wired.com/wired/archive/ . /economy.ideas.html ———. "property and speech: who owns what you say in cyberspace?" communications of the acm (december ): - . bennett, scott. "author's rights." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . ———. "copyright and innovation in electronic publishing: a commentary." the journal of academic librarianship (may ): - . ———. "the copyright challenge: strengthening the public interest in the digital age." library journal, november , - . beger, gabriele. "copyright law in the european union, with special reference to germany." library review , no. ( ): - . besek, june m. copyright issues relevant to digital preservation and dissemination of pre- commercial sound recordings by libraries and archives. washington, dc: council on library and information resources and the library of congress, . http://www.clir.org/pubs/abstract/pub abst.html ———. 
copyright issues relevant to the creation of a digital archive: a preliminary assessment. washington, dc: council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf ———. "copyright: what makes a use 'fair'?" educause review , no. ( ): - . http://www.educause.edu/educause+review/educauserevie wmagazinevolume /copyrightwhatmakesausefair/ bide, mark. "arrow—steps towards resolving the 'orphan works problem'." serials: the journal for the serials community , no. ( ): - . ———. "copyright and the network." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art blumenstyk, goldie. "academic groups say copyright legislation in congress would impede scholarship." the chronicle of higher education, may , a . ———. "after years, academics and publishers reach no clear conclusions on 'fair use.'" the chronicle of higher education, may , a -a . ———. "copyright law closes loophole on distribution of software." the chronicle of higher education, january , a -a . ———. "educators and publishers reach agreement on 'fair use' guidelines for cd-roms." the chronicle of higher education, october , a . boyle, james. "expanding the public domain." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr pubdomain.pdf branscomb, anne wells. "public and private domains of information: defining the legal boundaries." bulletin of the american society for information science (december/january ): - . burke, edmund. "database copyrights." educom review (march/april ): - . http://www.educause.edu/resources/databasecopyrights/ burrell, robert, and allison coleman. copyright exceptions: the digital impact. cambridge: cambridge university press, . buttler, dwayne k. "confu-sed: security, safe harbors, and fair-use guidelines." journal of the american society for information science (december ): - . byrd, gary d. 
"protecting access to the intellectual property of the health sciences." bulletin of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/picrender.fcgi?artid= &ac tion=stream&blobtype=pdf campbell, jerry d. "intellectual property in a networked world: balancing fair use and commercial interests." library acquisitions: practice & theory , no. ( ): - . campbell, james. "reactions to the enclosure of the information commons: - ." bulletin of the american society for information science and technology , no. ( ): - . http://www.asis.org/bulletin/oct- /campbell.html carlson, scott. "once-trustworthy newspaper databases have become unreliable and frustrating." the chronicle of higher education, january , a -a . carter, howard. "library faculty publishing and intellectual property issues: a survey of attitudes and awareness." portal: libraries and the academy , no. ( ): - . cave, mike, marilyn deegan, and louise heinink. "copyright clearance in the refugee studies centre digital library project." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature charlesworth, andrew. "digital curation, copyright, and academic research." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / cheverie, joan f. "the changing economics of information, technological development, and copyright protection: what are the consequences for the public domain?" the journal of academic librarianship , no. ( ): - . clark, charles. "in what are we trading? author's rights and publishers' rights in traditional and digital media." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art clark, jeff. "libraries and the fate of digital content." library journal, june , - . http://www.libraryjournal.com/article/ca .html coleman, anita sundaram. "self-archiving and the copyright transfer agreements of isi-ranked library and information science journals." 
journal of the american society for information science and technology , no. ( ): - . cottrell, terry. "a copyright primer for small undergraduate libraries." community & junior college libraries , no. ( ): - . covey, denise troll. acquiring copyright permission to digitize and provide open access to books. washington, dc: council on library and information resources and digital library federation, . http://www.clir.org/pubs/abstract/pub abst.html crews, kenneth d. copyright, fair use, and the challenge for universities: promoting the progress of higher education. chicago: the university of chicago press, . ———. copyright law for librarians and educators: creative strategies and practical solutions. nd ed. chicago: american library association, . ———. "electronic reserves and fair use: the outer limits of confu." journal of the american society for information science (december ): - . ———. "what qualifies as 'fair use'?" the chronicle of higher education, may , b -b . crews, kenneth d., and georgia k. harper. "the immunity dilemma: are state colleges and universities still liable for copyright infringements?" journal of the american society for information science (december ): - . crews, kenneth d., and gerard van westrienen. "copyright, publishing, and scholarship: the 'zwolle group' initiative for the advancement of higher education." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /crews/ crews.html davis, jinnie y. "fair use after confu." college & research libraries (may ): - . deazley, ronan. rethinking copyright: history, theory, language. cheltenham, uk: edward elgar publishing ltd., . deloughry, thomas j. "copyright in cyberspace." the chronicle of higher education, september , a , a . desmarais, norman. "copyright and fair use of multimedia resources." the acquisitions librarian, no. ( ): - . downes, daniel m. "new media economy: intellectual property and cultural insurrection." the journal of electronic publishing , no. ( ). 
http://hdl.handle.net/ /spo. . . dryden, jean. "copyright issues in the selection of archival material for internet access." archival science , no. ( ): - . duggan, mary kay. "copyright of electronic information: issues and questions." online (may ): - . dusollier, severine. "fair use by design in the european copyright directive of ." communications of the acm , no. ( ): - . dyson, esther. "intellectual value." wired (july ): - , - . http://www.wired.com/wired/archive/ . /dyson.html eisenschitz, tamara. "moral rights and information content in published works." aslib proceedings: new information perspectives , no. ( ): - . elliott, roger. "who owns scientific data? the impact of intellectual property rights on the scientific publication chain." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ewing, john. "copyright and authors." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ fernández-molina, j. carlos. "laws against the circumvention of copyright technological protection." journal of documentation , no. ( ): - . ———. "the legal protection of databases: current situation of the international harmonisation process." aslib proceedings: new information perspectives , no. ( ): - . fernández-molina, j. carlos, and j. augusto chaves guimarães. "the wipo development agenda and the contribution of the international library community." the electronic library , no. ( ): - . fernández-molina, j. carlos, and eduardo peis. "the moral rights of authors in the age of digital information." journal of the american society for information science and technology , no. ( ): - . field, thomas g., jr. "copyright in e-mail." the journal of electronic publishing (september ). http://hdl.handle.net/ /spo. . . fisher, janet h. "copyright: the glue of the system." the journal of electronic publishing (january ). http://hdl.handle.net/ /spo. . . fitzgerald, brian, and kylie pappalardo. 
"the law as cyberinfrastructure." ctwatch quarterly , no. ( ). http://www.ctwatch.org/quarterly/articles/ / /the-law-as-cyberi nfrastructure/index.html foster, andrea l. "scholars and libraries want permission to copy electronic materials." the chronicle of higher education, december , a . frankel, mark s. "seizing the moment: scientists' authorship rights in the digital age." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art frazier, kenneth. "protecting copyright and preserving fair use in the electronic future." the chronicle of higher education, june , a . ———. "what's wrong with fair-use guidelines for the academic community?" journal of the american society for information science (december ): - . friedman, jonathan a., and francis m. buono. "using the digital millennium copyright act to limit potential copyright liability online." the richmond journal of law & technology (winter - ). http://law.richmond.edu/jolt//v i /article .html friend, frederick j. "zwolle's contribution to good copyright relationships." serials , no. ( ): - . gadd, elizabeth. "copyright clearance for the digital library: a practical guide." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art gadd, elizabeth, charles oppenheim, and steve probets. "the intellectual property rights issues facing self-archiving: key findings of the romeo project." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /gadd/ gadd.html ———. "the romeo project: protecting metadata in an open access environment." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /romeo/ ———. "romeo studies : the impact of copyright ownership on academic author self-archiving." journal of documentation , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "romeo studies : how academics want to protect their open-access research papers." journal of information science , no. ( ): - . http://eprints.rclis.org/archive/ / ———. 
"romeo studies : how academics expect to use open-access research papers." journal of librarianship and information science , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "romeo studies : an analysis of journal publishers' copyright agreements." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "romeo studies : ipr issues facing oai data and service providers." the electronic library , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "romeo studies : rights metadata for open archiving." program: electronic library & information systems , no. ( ): - . http://eprints.rclis.org/archive/ / garcia, linda d. "information exchange: the impact of scholarly communication." educom review (fall ): - . garlick, mia. "a review of creative commons and science commons." educause review , no. ( ): - . http://www.educause.edu/educause+review/educauserevie wmagazinevolume /areviewofcreativecommonsandsci/ garrett, john r., and m. stuart lynn. "storerights, access rights, and copyright law: the base of the iceberg." serials review , no. ( ): - . gasaway, laura n. "changes in copyright ownership." serials review , no. ( ): - . ———. "copyright considerations for electronic reserves." in managing electronic reserves, ed. jeff rosedale, - . chicago: american library association, . ———. "copyright considerations for fee-based document delivery services." journal of interlibrary loan, document delivery and information supply , no. ( ): - . ———. "copyright in the electronic era." the serials librarian , no. / ( ): - . ———. "copyright, the internet, and other legal issues." journal of the american society for information science (september ): - . ———. "guidelines for distance learning and interlibrary loan: doomed and more doomed." journal of the american society for information science (december ): - . ———. "libraries, educational institutions, and copyright proprietors: the first collision on the information highway." 
the journal of academic librarianship (september ): - . ———. "scholarly publication and copyright in networked electronic publishing." library trends (spring ): - . ———. "serials ." the serials librarian , no. / ( ): - . ———. "tasini: did authors win?" against the grain (february ): , . ———. "the white paper, fair use, libraries and educational institutions." the serials librarian , no. / ( ): - . geist, michael, ed. in the public interest: the future of canadian copyright law. toronto: irwin law, . george, carole a. "testing the barriers to digital libraries: a study seeking copyright permission to digitize published works." new library world , no. / ( ): - . gibby, richard, and andrew green. "electronic legal deposit in the united kingdom." new review of academic librarianship , no. / ( ): - . gillespie, tarleton. wired shut: copyright and the shape of digital culture. cambridge, ma: mit press, . ginsburg, jane c. "what to know before reissuing old titles as e-books." communications of the acm (september ): - . givler, peter. "seeking balance: rights and exceptions in section of the us copyright act." learned publishing , no. ( ): - . gladney, henry m. "digital dilemma: intellectual property synopsis and views on the study by the national academies' committee on intellectual property rights and the emerging information infrastructure." d-lib magazine (december ). http://www.dlib.org/dlib/december / gladney.html gorman, robert a. "intellectual property: the rights of faculty as creators and users." academe (may-june ): - . grillot, ben. "pubmed central deposit and author rights: agreements between publishers and the authors subject to the nih public access policy." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arl-br- -pubmed.pdf gross, robin d. "digital millennium copyright act's impact on freedom of expression, science, and innovation." in advances in librarianship, vol. ., ed.
frederick c. lynden. san diego: academic press, . grosso, andrew. "the promise and problems of the no electronic theft act." communications of the acm (february ): - . gurman, diane. "why lakoff still matters: framing the debate on copyright law and digital publishing." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ guy, marieke, and brian kelly. "qa focus information for digital libraries: a case study of cc implementation." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= hahn, karla. "two new policies widen the path to balanced copyright management: developments on author rights." college & research libraries news , no. ( ): - . halbert, martin. "copyright, digital media, and libraries." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /halbert. n hamma, kenneth. "public domain art in an age of easier mechanical reproducibility." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /hamma/ hamma.html hannay, william m. "legal implications of the digital future." library resources & technical services (october ): - . harper, georgia k. "copyright endurance and change." educause review (december ): - . http://net.educause.edu/ir/library/pdf/erm .pdf ———. "oa and ip: open access, digital copyright and marketplace competition." learned publishing , no. ( ): - . hatfield, amy. "content analysis of restrictive publisher copyright policies for electronic reserves." journal of interlibrary loan, document delivery & information supply , no. ( ): - . henry, geneva. "on-line publishing in the st century: challenges and opportunities." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /henry/ henry.html hess, charlotte, and elinor ostrom, eds. understanding knowledge as a commons: from theory to practice. cambridge, ma: mit press, . hilton, james. "copyright assumptions and challenges." educause review (november/december ): - . 
http://www.educause.edu/educause+review/educausereviewmagazinevolume /copyrightassumptionsandchallen/ hirtle, peter b. "author addenda: an examination of five alternatives." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /hirtle/ hirtle.html ———. "copyright renewal, copyright restoration, and the difficulty of determining copyright status." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /hirtle/ hirtle.html hirtle, peter, and tricia donovan. "removing all restrictions: cornell's new policy on use of public domain reproductions." research library issues, no. ( ): - . http://www.arl.org/bm~doc/rli- -cornell.pdf hoorn, esther. "repositories, copyright and creative commons for scholarly communication." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /hoorn/ hoorn, esther, and maurits van der graaf. "copyright issues in open access research journals: the authors' perspective." d-lib magazine , no. ( ). http://www.dlib.org/dlib/february /vandergraaf/ vandergraaf.html horava, tony g. "webpages on copyright in canadian academic libraries." partnership: the canadian journal of library and information practice and research , no. ( ). http://journal.lib.uoguelph.ca/index.php/perj/article/view/ hudson, emily, and andrew t. kenyon. "without walls: copyright law and digital collections in australian cultural institutions." script-ed , no. ( ). http://www.law.ed.ac.uk/ahrc/script-ed/vol - /kenyon.asp hugenholtz, p. bernt. "copyright vs. freedom of scientific communication." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art hughes, carol ann. "the case for scholars' management of author rights." portal: libraries and the academy , no. ( ): - . hyams, peter. "legal deposit of electronic publications." online & cdrom review (october ): - . isenberg, doug. gigalaw guide to internet law. new york: random house, . jacobson, robert l. "the furor over 'fair use.'" the chronicle of higher education, may , a , a , a . ———.
"no copying." the chronicle of higher education, march , a -a . jenkins, celia, charles oppenheim, steve probets, and bill hubbard. "romeo studies : creation of a controlled vocabulary to analyse copyright transfer agreements." journal of information science , no. ( ): - . jenkins, celia, steve probets, charles oppenheim, and bill hubbard. "romeo studies : self-archiving: the logic behind the colour-coding used in the copyright knowledge bank." program: electronic library and information systems , no. ( ): - . jensen, mary brandt. does your project have a copyright problem? a decision-making guide for librarians. jefferson, nc: mcfarland & company, inc., . ———. "making copyright work in electronic publishing models." serials review , no. - ( ): - . jeweler, robin. the google book search project: is online indexing a fair use under copyright law? washington, dc: congressional research service, library of congress, . http://assets.opencrs.com/rpts/rs _ .pdf johns, adrian. piracy: the intellectual property wars from gutenberg to gates. chicago: the university of chicago press, . johnson, liz. "managing intellectual property for distance learning." educause quarterly , no. ( ): - . http://www.educause.edu/educause+quarterly/educausequarterlymagazinevolum/managingintellectualpropertyfo/ joint, nicholas. "risk assessment and copyright in digital libraries." library review , no. ( ): - . kahin, brian. "the copyright law: how it works and new issues in electronic settings." the serials librarian , no. / ( ): - . kapitzke, cushla. "rethinking copyrights for the library through creative commons licensing." library trends , no. ( ): - . kim, minjeong. "the creative commons and copyright protection in the digital era: uses of creative commons licenses." journal of computer-mediated communication , no. ( ). http://jcmc.indiana.edu/vol /issue /kim.html kirtley, jane e., rebecca daugherty, and leslie ann reis.
"world intellectual property organization: comments on the basic proposal for the substantive provisions of the treaty on intellectual property in respect of databases." the journal of academic librarianship (march ): - . kleinman, molly. "the beauty of 'some rights reserved.'" college & research libraries news , no. ( ): - . korn, naomi, and charles oppenheim. "creative commons licences in higher and further education: do we care?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /korn-oppenheim/ koulouris, alexandros. "access and reproduction policies of university digital collections." journal of librarianship and information science , no. ( ): - . lastowka, f. gregory. "free access and the future of copyright." rutgers computer & technology law journal , no. ( ): - . http://papers.ssrn.com/sol /papers.cfm?abstract_id= lavoie, brian, and lorcan dempsey. "beyond : characteristics of potentially in-copyright print books in library collections." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /lavoie/ lavoie.html law, d. g., r. l. weedon, and m. r. sheen. "universities and article copyright." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art leibowitz, wendy r. "national research council's copyright report discusses issues without settling any." the chronicle of higher education, november , a . lessig, lawrence. code and other laws of cyberspace. new york: basic books, . ———. free culture: how big media uses technology and the law to lock down culture and control creativity. new york: penguin press, . http://www.free-culture.cc/ ———. the future of ideas: the fate of the commons in a connected world. new york: random house, . http://thefutureofideas.s .amazonaws.com/lessig_foi.pdf levering, mary. "what's right about fair-use guidelines for the academic community." journal of the american society for information science (december ): - . lichtenberg, james. 
"of steeds & stalking horses: academics meet publishers on the field of copyright." educom review (may/june ): - . http://net.educause.edu/apps/er/review/reviewarticles/ .html lipinski, tomas a. "the climate of distance education in the st century: understanding and surviving the changes brought by the teach (technology, education, and copyright harmonization) act of ." the journal of academic librarianship , no. ( ): - . ———. the complete copyright liability handbook for librarians and educators. new york: neal-schuman publishers, . ———. "the myth of technological neutrality in copyright and the rights of institutional users: recent legal challenges to the information organization as mediator and the impact of the dmca, wipo, and teach." journal of the american society for information science and technology , no. ( ): - . litman, jessica. digital copyright. amherst, ny: prometheus books, . long, maurice. "authors and their rights." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art long, sarah ann. "us copyright law: the challenge of protection in the digital age." new library world , no. / ( ): - . lopez, xavier r. "new developments in intellectual property rights: implications for geographic information systems." the journal of academic librarianship (november ): - . lowry, charles b. "fair use and digital publishing: an academic librarian's perspective." portal: libraries and the academy , no. ( ): - . lu, kathleen. "technological challenges to artists' rights in the age of multimedia: the future of moral rights." reference services review , no. ( ): - . lutzker, arnold p., "in the curl of the wave: what the digital millennium copyright act and term extension act mean for the library and education community." arl: a bimonthly report on research libraries issues and actions from arl, cni, and sparc, no. (april ): - . http://www.arl.org/bm~doc/curl.pdf madieha, ida, and abdul ghani azmi. 
"institutional repositories in malaysia: the copyright issues." international journal of law and information technology , no. ( ): - . mahesh, g., and rekha mittal. "digital content creation and copyright issues." the electronic library , no. ( ): - . marley, judith l. "guidelines favoring fair use: an analysis of legal interpretations affecting higher education." the journal of academic librarianship (september ): - . maxwell, terrence a. "is copyright necessary?" first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ ———. "parsing the public domain." journal of the american society for information science and technology , no. ( ): - . mcginnis, leah g. "bringing order out of chaos: the challenge of managing e-reserves copyright permissions." journal of interlibrary loan, document delivery & information supply , no. ( ): - . mcgreal, rory. "stealing the goose: copyright and learning." international review of research in open and distance learning , no. ( ). http://www.irrodl.org/index.php/irrodl/article/view/ / medeiros, norm. "smack down: copyright cases head to court (part )." oclc systems & services: international digital library perspectives , no. ( ): - . metcalfe, amy, veronica diaz, and richard wagoner. "academe, technology, society, and the market: four frames of reference for copyright and fair use." portal: libraries and the academy , no. ( ): - . miller, brett i. "recent lessons from the courts: the changing landscape of copyright in a digital age." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature morris, sally. "authors and copyright." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art muir, adrienne. "legal deposit and preservation of digital publications: a review of research and development activity." journal of documentation (september ): - . murray, laura j. 
"protecting ourselves to death: canada, copyright, and the internet." first monday , no. ( ). http://www.firstmonday.org/issues/issue _ /murray/index.html neal, james g. "copyright is dead . . . long live copyright." american libraries (december ): - . nollan, richard. "campus intellectual property policy development." reference services review , no. ( ): - . ober, john. "facilitating open access: developing support for author control of copyright." college & research libraries news , no. ( ): - . ogden, robert s. "copyright issues for libraries and librarians." library collections, acquisitions, & technical services , no. ( ): - . okerson, ann. "copyright in the year : no longer an issue for scholarly electronic publishing." serials review , no. ( ): - . ———. "the current national copyright debate: its relationship to the work of collections managers." journal of library administration , no. ( ): - . ———. "whose article is it anyway? copyright and intellectual property issues for researchers in the s." notices of the american mathematical society (january ): - . http://www.ams.org/notices/ /okerson.pdf ———. "whose work is it anyway? perspectives on the stakeholders and the stakes in the current copyright scene." the serials librarian , no. / ( ): - . ———. "with feathers: effects of copyright and ownership on scholarly publishing." college & research libraries (september ): - . http://www.library.yale.edu/~okerson/feathers.html oppenheim, charles. "does copyright have any future on the internet?" journal of documentation (may ): - . ———. "moral rights and the electronic library." learned publishing , no. ( ): - . orlans, harold. "fair use in us scholarly publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art o'rourke, maureen a. "is virtual trespass an apt analogy?" communications of the acm (february ): - . o'sullivan, maureen. 
"creative commons and contemporary copyright: a fitting shoe or 'a load of old cobblers'?" first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / ou, carol. "technology and copyright issues in the academic library: first sale, fair use and the electronic document." portal: libraries and the academy , no. ( ): - . padfield, tim. copyright for archivists and record managers. rd ed. new york: neal-schuman publishers, . patry, william. moral panics and the copyright wars. new york: oxford university press, . peters, paul evan. "networked intellectual property: brain-ache of the decade." educom review (may/june ): - . http://net.educause.edu/apps/er/review/reviewarticles/ .html powell, david j. "voluntary deposit of electronic publications: a learning experience." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art pritcher, lynn. "ad*access: seeking copyright permissions for a digital age." d-lib magazine (february ). http://www.dlib.org/dlib/february /pritcher/ pritcher.html quick, rebecca. "can't get there from here may be web's new motto: companies start to curb links to their sites." the wall street journal, july , b . rimmer, matthew. "the dead poets society: the copyright term and the public domain." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie warticle/ ———. "robbery under arms: copyright law and the australia-united states free trade ageeement." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ rivera-morales, noemi a. "fair-use guidelines: a selected bibliography." journal of the american society for information science (december ): - . roos, j. w. "libraries for the blind as accessible content publishers: copyright and related issues." library trends , no. ( ): - . http://hdl.handle.net/ / rosenberg, victor. "will new information technology destroy copyright?" 
the electronic library (october ): - . russell, carrie. "copyright and the public domain." texas library journal (spring ): - . salo, dorothea. "who owns our work?: based on a paper presented at the rd uksg conference, edinburgh, april ." serials: the journal for the serials community , no. ( ): - . samuelson, pamela. "big media beaten back." wired (march ): - , - . ———. "copyright and digital libraries." communications of the acm (april ): - , . ———. "the copyright grab." wired (january ): - , , - . http://www.wired.com/wired/archive/ . /white.paper_pr.html ———. "copyright law and electronic compilations of data." communications of the acm (february ): - . ———. "copyright's fair use doctrine and digital data." communications of the acm (january ): - . ———. "digital media and the law." communications of the acm (october ): - . ———. "encoding the law into digital libraries." communications of the acm (april ): - . ———. "good news and bad news on the intellectual property front." communications of the acm (march ): - . ———. "intellectual property rights and the global information economy." communications of the acm (january ): - . ———. "legal protection for database contents." communications of the acm (december ): - . ———. "the nii intellectual property report." communications of the acm (december ): - . ———. "on authors' rights in cyberspace: questioning the need for new international rules on authors' rights in cyberspace." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ ———. "preserving the positive functions of the public domain in science." data science journal ( ): - . http://www.jstage.jst.go.jp/article/dsj/ / / /_pdf ———. "why the anticircumvention regulations need revision." communications of the acm (september ): - . schiesel, seth. "global agreement reached to widen law on copyright." the new york times, december , , . schöpfel, joachim. 
"the new french law on author's rights and related rights in the information society." interlending & document supply , no. ( ): - . schragis, steven. "do i need permission? fair use rules under the federal copyright law." publishing research quarterly (winter ): - . seadle, michael. "copyright in the networked world: author's rights." library hi tech , no. ( ): - . ———. "copyright in the networked world: digital legal deposit." library hi tech , no. ( ): - . ———. "copyright in the networked world: new rules for images." library hi tech , no. ( ): - . ———. "copyright in the networked world: orphaned copyrights." library hi tech , no. ( ): - . sheat, kathy. "libraries, copyright and the global digital environment." the electronic library , no. ( ): - . sheppard, tamara. "putting the public in the public domain: the public library's role in the re-conceptualization of the public domain." new library world , no. / ( ): - . sherman, chris. "napster: copyright killer or distribution hero?" online (november/december ): - . shkolnikov, tanya. "to link or not to link: how to avoid copyright traps on the internet." the journal of academic librarianship , no. ( ): - . shuler, john a. "distance education, copyrights rights, and the new teach act." the journal of academic librarianship , no. ( ): - . smith, donny. "a copyright primer for electronic reserve: copyright for harried electronic reserves staff." journal of interlibrary loan, document delivery & information supply , no. ( ): - . smith, kay hogan, rajia c. tobia, t. scott plutchak, lynda m. howell, sondra j. pfeiffer, and michael s. fitts. "copyright knowledge of faculty at two academic health science campuses: results of a survey." serials review , no. ( ): - . smith, kevin l. "copyright renewal for libraries: seven steps toward a user-friendly law." portal: libraries and the academy , no. ( ): - . ———. "managing copyright for nih public access: strategies to ensure compliance." 
arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arl-br- -copyright.pdf smith, millison. "fair use and distance learning in the digital age." the journal of electronic publishing (june ). http://hdl.handle.net/ /spo. . . spinello, richard a. "intellectual property rights." library hi tech , no. ( ): - . st. clair, gloriana, and sanford g. thatcher. "changing copyright legislation: two views." library acquisitions: practice & theory , no. ( ): - . stevens, joann. "the multimedia guidelines." journal of the american society for information science (december ): - . stix, gary. "some rights reserved." scientific american , no. ( ): . strong, william s. the copyright book: a practical guide, th ed. cambridge, ma: the mit press, . ———. "copyright in a time of change." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . suber, peter. "balancing author and publisher rights." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#balancing sundt, christine l. "testing the limits: the confu digital-images and multimedia guidelines and their consequences for libraries and educators." journal of the american society for information science (december ): - . suthersanen, uma. "creative commons—the other way?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art tennant, roy. "the copyright war." library journal, june , - . http://www.libraryjournal.com/article/ca .html terry, ana arias. "author care: the rights publishers offer and what authors think." against the grain (june ): , , . thatcher, sanford g. "fair use: a double-edged sword." journal of scholarly publishing (october ): - . ———. "fair use in theory and practice: reflections on its history and the google case." journal of scholarly publishing , no. ( ): - . ———. "on the author's addendum." journal of scholarly publishing , no. 
( ): - . theriault, leah. "doa at the online ramp." the acquisitions librarian, no. ( ): - . trln copyright policy task force. "model university policy regarding faculty publication in scientific and technical scholarly journals: a background paper and review of the issues." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /trln. n u.s. copyright office. "report on legal protection for databases." bulletin of the american society for information science (december/january ): - . vaidhyanathan, siva. copyrights and copywrongs: the rise of intellectual property and how it threatens creativity. new york: new york university press, . ———. "the state of copyright activism." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ van borm, julien. "the long tail, copyright and libraries." liber quarterly: the journal of european research libraries , no. ( ): - . velterop, jan. "copyright and research: a different perspective." script-ed , no. ( ). http://www.law.ed.ac.uk/ahrc/script-ed/vol - /velterop.asp vickery, jim. "the legal deposit of electronic publications." against the grain (february ): , - . vogele, colette, mia garlick, and berkman center clinical program in cyberlaw. podcasting legal guide: rules for the revolution. san francisco: creative commons, . http://mirrors.creativecommons.org/podcasting_legal_guide.pdf wagner, karen i. "intellectual property: copyright implications for higher education." the journal of academic librarianship (january ): - . wilkinson, margaret ann, and natasha gerolami. "the author as agent of information policy: the relationship between economic and moral rights in copyright." government information quarterly , no. ( ): - . willinsky, john. "copyright contradictions in scholarly publishing." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ wirth, andrea a., and faye a. chadwell. 
"rights well: an authors' rights workshop for librarians." portal: libraries and the academy , no. ( ): - . zhang, wende. "digital library intellectual property right evaluation and method." the electronic library , no. ( ): - . xiaofeng, guo, and li ying. "federated content rights management for research and academic publications using the handle system." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /guo/ guo.html . legal issues: license agreements allen, barbara mcfadden. "negotiating digital information system licenses without losing your shirt or your soul." journal of library administration , no. ( ): - . balkwill, richard. "digital licensing—a role for the publishers licensing society." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art best, rickey d. "is the 'big deal' dead?" the serials librarian , no. ( ): - . bide, mark, rajveen dhiensa, hugh look, charles oppenheim, and steve probets. "requirements for a registry of electronic licences." the electronic library , no. ( ): - . bide, mark, charles oppenheim, and anne ramsden. "some proposals regarding copyright clearance and digitisation in higher education." journal of information science , no. ( ): - . bley, robert, and ross macintyre. "nesli—the national electronic site license initiative." vine, no. ( ): - . blosser, john. "vendors and licenses: adding value for customers." the serials librarian , no. / ( ): - . borin, jacqueline. "site license initiatives in the united kingdom: the psli and nesli experience." information technology and libraries (march ): - . brennan, patricia, karen hersey, and georgia harper. licensing electronic resources: strategic and practical considerations for signing electronic information delivery agreements. washington, dc: association of research libraries, . buchanan, nancy l. "navigating the electronic river: electronic product licensing and contracts." the serials librarian , no. / ( ): - . butina, ingbritt. 
"electronic journals—the danish model." serials , no. ( ): - . carlson, amy, and barbara m. pope. "the 'big deal': a survey of how libraries are responding and what the alternatives are." the serials librarian , no. ( ): - . carpenter, todd a. "onix for publications licenses: getting an electronic grip on license information." the serials librarian , no. - ( ): - . carter, penny. "site licence concept: a view of the uk pilot site licence initiative." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art case, mary m. "library associations endorse principles for licensing electronic resources." arl: a bimonthly newsletter of research library issues and actions, no. ( ): - . http://www.arl.org/bm~doc/licensing.pdf cave, francis, brian green, and david martin. "onix for licensing terms: standards for the electronic communication of usage terms." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /green-et-al/ chamberlain, clinton k. "breaking the bottleneck: using seru to facilitate the acquisition of electronic resources." college & research libraries news , no. ( ): - . chang, sheau-hwang. "the dlf electronic resource management initiative." oclc systems & services , no. ( ): - . christou, corilee, and gail dykstra. "through a 'content looking glass'—another way of looking at library licensing of electronic content." against the grain , no. ( ): , , . clarke, roger. "a proposal for an open content licence for research paper (pr)eprints." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ cleary, colleen. "why the 'big deal' continues to persist." the serials librarian , no. ( ): - . cole, louise. "the e-deal: keeping up to date and allowing access to the end user." the serials librarian , no. ( ): - . costello, diane. "the role of caul (council of australian libraries) in consortial purchasing." journal of library administration , no. / ( ): - . cox, john. "licensing serials." 
serials (july ): - . ———. "model generic licenses: cooperation and competition." serials review , no. ( ): - . ———. "standard licences: simplifying the acquisitions process." serials (july ): - . crawford, amy r. "licensing and negotiations for electronic content." resource sharing & information networks , no. / ( ): - . crews, kenneth d. "licensing for information resources: creative contracts and the library mission." in virtually yours: models for managing electronic resources and services, ed. peggy johnson and bonnie macewan, - . chicago: american library association, . davis, philip m. "fair publisher pricing, confidentiality clauses and a proposal to even the economic playing field." d-lib magazine , no. ( ). http://www.dlib.org/dlib/february /davis/ davis.html davis, trisha l. "legal issues: the negotiator's perspective for getting to the heart of the license." in virtually yours: models for managing electronic resources and services, ed. peggy johnson and bonnie macewan, - . chicago: american library association, . ———. "license agreements in lieu of copyright: are we signing away our rights?" library acquisitions: practice & theory , no. ( ): - . debruijn, deb. "the canadian national site licensing project." journal of library administration , no. / ( ): - . desmarais, norman. "chaos—ucita: a bad law protecting bad software." against the grain (february ): - . duranceau, ellen finnie. "license compliance." serials review , no. ( ): - . ———. "license tracking." serials review , no. ( ): - . ———. "using a standard license for individual electronic journal purchases: results of a pilot study in the mit libraries." serials review , no. ( ): - . ———. "why you can't learn license negotiation in three easy lessons: a conversation with georgia harper, office of general counsel, university of texas." serials review , no. ( ): - . duranceau, ellen, and ivy anderson. "author-rights language in library content licenses." research library issues, no. ( ): - . 
http://www.arl.org/bm~doc/rli- -author-rights.pdf durrant, fiona. negotiating licences for digital resources. london: facet publishing, . eason, ken. "evaluation of the national site license initiative (nesli)." serials (july ): - . euler, ellen. "licences for open access to scientific publications— a german perspective." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= failing, patricia. "scholars face hefty fees and elaborate contracts when they use digital images." the chronicle of higher education, may , b -b . fox, david, and vinh-the lam. "canadian national site licensing project: getting ready for cnslp at the university of saskatchewan library." the serials librarian , no. ( ): - . friedgood, beverley. "the uk national electronic site licensing initiative." serials (march ): - . garlick, mia. "creative humbug? bah the humbug, let's get creative!" indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= gerhardt, paul. "creative archive." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /gerhardt/ giavarra, emanuella. "licensing digital resources: how to avoid the legal pitfalls." serials (july ): - . green, brian. "helping libraries manage digital rights: standards for the electronic communication of licensing terms." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= gregory, vicki l. "ucita: what does it mean for libraries?" online (january/february ): - . http://www.onlinemag.net/ol /gregory _ .html grover, diane, and theodore fons. "the innovative electronic resource management system: a development partnership." serials review , no. ( ): - . guernsey, lisa. "california state u. tries to create a new way to buy on-line journals." the chronicle of higher education, january , a -a . hahn, karla. "do i have to negotiate a license for every e-resource i buy? developing a best practice option." 
arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): . http://www.arl.org/bm~doc/arlbr licenseopt.pdf ———. "seru (shared electronic resource understanding)." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /hahn/ hahn.html harris, lesley ellen. licensing digital content: a practical guide for librarians. nd ed. chicago: ala, . harwood, paul. "nesli: an agent for change or changing the agent?" the electronic library , no. ( ): - . horava, tony. "access policies and licensing issues in research libraries." collection building , no. ( ): - . ———. "licensing e-resources for alumni: reflections from a pilot project." college & research libraries news , no. ( ): - . hormia-poutanen, kristiina. "licensing electronic journals in finland." serials (july ): - . iannella, renato. "managing licenses in an open access community." ercim news, no. ( ): . http://www.ercim.org/publication/ercim_news/enw /iannella.html jacobson, robert l. "checking the fine print on superhighway licenses." the chronicle of higher education, july , a , a -a . jamtgaard, laurel. "licenses and information policy: an update on ucc article b." arl: a bimonthly newsletter of research library issues and actions, no. ( ): - . http://www.arl.org/bm~doc/ucc b.pdf jewell, tim, trisha l. davis, diane grover, and jill e. grogg. "mapping license language for electronic resource management." the serials librarian , no. / ( ): - . kaye, laurie. "owning and licensing content—key legal issues in the electronic environment." journal of information science , no. ( ): - . kohl, david. "consortial licensing vs. tradition: breaking up is hard to do." learned publishing (january ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art lawrence, eileen. "licensing: a publisher's perspective." the serials librarian , no. / ( ): - . leffler, jennifer j., and heidi a. zuniga. 
"development and use of license forms for libraries with and without electronic resource management systems." technical services quarterly , no. ( ): - . mcginnis, suzan d. "selling our collecting souls: how license agreements are controlling collection management." journal of library administration , no. ( ): - . meera, b. m., and k. t. anuradha. "contractual solutions in electronic publishing industry: a comparative study of license agreements." webology , no. ( ). http://www.webology.ir/ /v n /a .html menzel, j., k. metzner, and e. pope. "ideal and appeal: a model for consortium licensing of electronic journal collections." astrophysics and space science , no. - ( ): - . morris, sally. "copyright and licensing—the changing scene." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art neal, james g. "the fight against ucita." library journal, september , - . needleman, mark. "the niso license expression working group." serials review , no. ( ): - . nielsen, henning p., and jane whittall. "model licensing. key elements and specific needs in electronic journal licensing for the pharmaceutical industry." serials (july ): - . okerson, ann. "buy or lease? two models for scholarly information at the end (or the beginning) of an era." daedalus , no. ( ): - . http://www.library.yale.edu/~okerson/daedalus.html ———. "copyright or contract?" library journal, september , - . ———. "the liblicense project and how it grows." d-lib magazine (september ). http://www.dlib.org/dlib/september /okerson/ okerson.html ———. "scholarly communication and the licensing of electronic publications." in the impact of electronic publishing on the academic community: an international workshop organized by the academia europaea and the wenner-gren foundation, ed. i. butterworth, - . london: portland press, . ———. 
"what academic libraries need in electronic content licenses: presentation to the stm library relations committee, stm annual general meeting, october , ." serials review , no. ( ): - . http://www.library.yale.edu/~okerson/stm.html olivieri, rene. "publishing economics: the site license effect." journal of scholarly publishing (january ): - . ———. "site licenses: a new economic paradigm." the serials librarian , no. / ( ): - . peters, paul evan. "making the market for networked information: an introduction to a proposed program for licensing electronic uses." serials review , no. - ( ): - . phelan, daniel. "canadian national site licensing project." against the grain (february ): , . pike, george h. "the delicate dance of database licenses, copyright, and fair use." computers in libraries (may ): - , - . http://www.infotoday.com/cilmag/may /pike.htm prior, albert. "nesli—progress through collaboration." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art rens, andrew. "managing risk and opportunity in creative commons enterprises." first monday , no. ( ). http://firstmonday.org/issues/issue _ /rens/index.html richards, rob. "licensing agreements: contracts, the eclipse of copyright, and the promise of cooperation." the acquisitions librarian, no. ( ): - . roberts, michael, tony kidd, and lynn irvine. "the impact of the current e-journal marketplace on university library budget structures: some glasgow experiences." library review , no. ( ): - . rogers, sam. "survey and analysis of electronic journal licenses for long-term access provisions in tertiary new zealand academic libraries." serials review , no. ( ): - . samuelson, pamela. "legally speaking: does information really want to be licensed?" the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . seadle, michael. "copyright in the networked world: compulsory licensing." library hi tech , no. ( ): - . soete, george. 
"licensing electronic resources: state of the evolving art." arl: a bimonthly newsletter of research library issues and actions, no. (february ): - . http://www.arl.org/bm~doc/licensing- .pdf strauch, bruce, and adam chesler. "a licensing survival guide for librarians." journal of electronic resources in medical libraries , no. ( ): - . tóth, péter benjamin. "creative humbug. personal feelings about the creative commons licenses." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= turner, rollo. "agents, intermediaries, and journal licensing." journal of the medical library association (january ): - . http://www.pubmedcentral.gov/picrender.fcgi?action=stream&blobt ype=pdf&artid= verhagen, nol. "the licensing battlefield: consortia as new middlemen between publishers, agents and libraries—a view from the continent." serials: the journal for the serials community , no. ( ): - . webb, john. "managing licensed networked electronic resources in a university library." information technology and libraries (december ): - . wise, alicia. "nesli—implications outside the he community." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /jisc-content/ wiseman, leanne. "digital copying and the statutory licences in australian universities." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art woodward, hazel. "nesli—gathering momentum." the serials librarian , no. ( ): - . ———. "the uk's national electronic site licensing initiative (nesli)." journal of library administration , no. / ( ): - . wyatt, anna may. "ucita's impact on library services." journal of library administration , no. ( ): - . xenidou-dervou, claudine. "consortial journal licensing: experiences of greek academic libraries." interlending & document supply , no. ( ): - . yang, chyan, hsien-jyh liao, and chung-chen chen. "implementing digital copyright on the internet through an enhanced creative common licence protocol." 
the electronic library , no. ( ): - . library issues . library issues: digital libraries . . early digital library projects . . . alexandria project, university of california, santa barbara borgman, christine l., laura j. smart, kelli a. millwood, jason r. finley, leslie champeny, anne j. gilliland, and gregory h. leazer. "comparing faculty information seeking in teaching and research: implications for the design of digital libraries." journal of the american society for information science and technology , no. ( ): - . buttenfield, barbara. "usability evaluation of digital libraries." science & technology libraries , no. / ( ): - . frew, james, michael freeston, randall b. kemp, jason simpson, terence smith, alex wells, and qi zheng. "the alexandria digital library testbed." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /alexandria/ frew.html goodchild, michael f. "the alexandria digital library project: review, assessment, and prospects." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /goodchild/ goodchild.html hill, linda l., larry carver, mary larsgaard, ron dolin, terence r. smith, james frew, and mary-anna rae. "alexandria digital library: user evaluation studies and system design." journal of the american society for information science , no. ( ): - . hill, linda l., james frew, and qi zheng. "geographic names: the implementation of a gazetteer in a georeferenced digital library." d-lib magazine (january ). http://www.dlib.org/dlib/january /hill/ hill.html hill, linda l., greg janee, ron dolin, james frew, and mary larsgaard. "collection metadata solutions for digital library applications." journal of the american society for information science , no. ( ): - . janee, greg, james frew, and linda l. hill. "issues in georeferenced digital libraries." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /janee/ janee.html larsgaard, mary lynette, and larry carver. "accessing spatial data online: project alexandria." 
information technology and libraries (june ): - . liu, ying, and anita s. coleman. "communicating digital library services to scientific communities." libres , no. ( ). http://libres.curtin.edu.au/libres n /index.htm smith, terence r. "a brief update on the alexandria digital library project: constructing a digital library for geographically-referenced materials." d-lib magazine (march ). http://www.dlib.org/dlib/march /briefings/smith/ smith.html ———. "a digital library for geographically referenced materials." computer (may ): - . smith, terence r., and james frew. "alexandria digital library." communications of the acm (april ): - . . . . digital library initiative, university of illinois at urbana-champaign bishop, ann peterson. "document structure and digital libraries: how researchers mobilize information in journal articles." information processing and management (may ): - . ———. "measuring access, use, and success in digital libraries." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . bishop, ann peterson, laura j. neumann, susan leigh star, cecelia merkel, emily ignacio, and robert j. sandusky. "digital libraries: situating use in changing information infrastructure." journal of the american society for information science , no. ( ): - . ferrer, robert. "university of illinois the federation of digital libraries: interoperability among heterogeneous information systems." science & technology libraries , no. / ( ): - . schatz, bruce. "building the interspace: the illinois digital library project." communications of the acm (april ): - . schatz, bruce, william mischo, timothy cole, ann bishop, susan harum, eric johnson, laura neumann, hsinchun chen, and dorbin ng. "federated search of scientific literature." computer (february ): - . schatz, bruce, william h. mischo, timothy w. cole, joseph b. hardin, ann p. bishop, and hsinchun chen. "federating diverse collections of scientific literature." computer (may ): - . . . . 
informedia, carnegie mellon university christel, m., t. kanade, m. mauldin, r. reddy, m. sirbu, s. stevens, and h. wactlar. "informedia digital video library." communications of the acm (april ): - . kukulska-hulme, agnes, robert van der zwan, terry dipaolo, vanessa evers, and sarah clarke. "an evaluation of the informedia digital video library system at the open university." journal of educational media , no. ( ): - . wactlar, howard d., michael g. christel, yihong gong, and alexander g. hauptmann. "lessons learned from building a terabyte digital video library." computer (february ): - . wactlar, howard d., takeo kanade, michael a. smith, and scott m. stevens. "intelligent access to digital video: informedia project." computer (may ): - . . . . mercury project, carnegie mellon university arms, william y., thomas dopirak, parviz dousti, joseph rafail, and arthur w. wetzel. "the design of the mercury electronic library." educom review (november/december ): - . lowry, charles b., and barbara g. richards. "courting discovery: managing transition to the virtual library." library hi tech , no. ( ): - . richards, barbara g. "project mercury: the virtual library infrastructure at carnegie mellon university." in the evolving virtual library: visions and case studies, ed. laverna m. saunders, - . medford, nj: information today, inc., . troll, denise a. "library information system ii: progress report and technical plan." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /troll. n ———. "the mercury project: meeting the expectations of electronic library patrons." in advances in online public access catalogs, vol. , ed. marsha ra and regina rega, - . westport, ct: meckler publishing, . ———. "research on the distributed electronic library." in advances in library automation and networking, vol. , ed. joe a. hewitt and charles w. bailey, jr., - . greenwich, ct: jai press, inc., . . . . 
stanford digital library project paepcke, andreas, michelle q. wang baldonado, chen-chuan k. chang, steve cousins, and hector garcia-molina. "using distributed objects to build the stanford digital library infobus." computer (february ): - . paepcke, andreas, steve b. cousins, hector garcia-molina, scott w. hassan, steven p. ketchpel, martin roscheisen, and terry winograd. "using distributed objects for digital library interoperability." computer (may ): - . the stanford digital libraries group. "the stanford digital library project." communications of the acm (april ): - . . . . uc berkeley digital library project ogle, virginia, and robert wilensky. "testbed development for the berkeley digital library project." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /berkeley/ ogle.html van house, nancy a., mark h. butler, virginia ogle, and lisa schiff. "user-centered iterative design for digital libraries: the cypress experience." d-lib magazine (february ). http://www.dlib.org/dlib/february / vanhouse.html wilensky, robert. "toward work-centered digital information services." computer (may ): - . ———. "uc berkeley's digital library project." communications of the acm (april ): . . . . university of michigan digital library project atkins, daniel e., william p. birmingham, edmund h. durfee, eric j. glover, tracy mullen, elke a. rundensteiner, elliot soloway, jose m. vidal, raven wallace, and michael p. wellman. "toward inquiry-based education through interacting software agents." computer (may ): - . crum, laurie. "university of michigan digital library project." communications of the acm (april ): - . . . general allen, robert b., and edie rasmussen, eds. proceedings of the nd acm international conference on digital libraries: acm digital libraries ' , philadelphia, pa, july - , . new york: the association for computing machinery, . almasy, edward, david sleasman, and rachael bower. 
"software for building a full-featured discipline-based web portal: the scout portal toolkit." d-lib magazine (november ). http://www.dlib.org/dlib/november /almasy/ almasy.html andrews, judith, and derek law, eds. digital libraries: policy, planning and practice. aldershot, england: ashgate, . arko, robert a., kathryn m. ginger, kim a. kastens, and john weatherley. "using annotations to add value to a digital library for education." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /arko/ arko.html arms, william y. digital libraries. cambridge, ma: the mit press, . ———. "key concepts in the architecture of the digital library." d-lib magazine (july ). http://www.dlib.org/dlib/july / arms.html ———. "a viewpoint analysis of the digital library." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /arms/ arms.html arms, william y., christophe blanchi, edward a. overly. "an architecture for information in digital libraries." d-lib magazine (february ). http://www.dlib.org/dlib/february /cnri/ arms .html atkinson, ross. "library functions, scholarly communication, and the foundation of the digital library: laying claim to the control zone." the library quarterly (july ): - . ayris, paul. "the status of digitisation in europe." liber quarterly: the journal of european research libraries , no. / ( ). http://liber.library.uu.nl/publish/articles/ /article.pdf bailey, charles w., jr. "integrated public-access computer systems: the heart of the electronic university." in advances in library automation and networking, vol. , ed. joe a. hewitt, - . greenwich, ct: jai press, . baker, david. "digital library futures: a uk he and fe perspective." interlending & document supply , no. ( ): - . barton, jane. "digital librarians: boundary riders on the storm." library review , no. ( ): - . bawden, david, and polona vilar. "digital libraries: to meet or manage user expectations." aslib proceedings: new information perspectives , no. ( ): - . 
bekaert, jeroen, patrick hochstenbach, and herbert van de sompel. "using mpeg- didl to represent complex digital objects in the los alamos national laboratory digital library." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /bekaert/ bekaert.html biswas, goutam, and dibyendu paul. "an evaluative study on the open source digital library softwares for institutional repository: special reference to dspace and greenstone digital library." international journal of library and information science , no. ( ): - . http://www.academicjournals.org/ijlis/pdf/pdf /feb/biswas% and% paul.pdf blandford, ann, and george buchanan. "usability of digital libraries: a source of creative tensions with technical developments." tcdl bulletin , no. ( ). http://www.ieee-tcdl.org/bulletin/v n /blandford/blandford.html boock, michael, and ruth vondracek. "organizing for digitization: a survey." portal: libraries and the academy , no. ( ): - . http://hdl.handle.net/ / borgman, christine l. "digital libraries and the continuum of scholarly communication." journal of documentation (july ): - . ———. from gutenberg to the global information infrastructure: access to information in the networked world. cambridge, ma: the mit press, . ———. "multi-media, multi-cultural, and multi-lingual digital libraries: or how do we exchange data in languages?" d-lib magazine (june ). http://www.dlib.org/dlib/june / borgman.html ———. "what are digital libraries? competing visions." information processing and management (may ): - . boyd, kate, and douglas king. "south carolina goes digital: the creation and development of the university of south carolina's digital activities department." oclc systems & services: international digital library perspectives , no. ( ): - . brantley, peter. "architectures for collaboration: roles and expectations for digital libraries." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf breaks, michael. 
"building the hybrid library: a review of uk activities." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art bültmann, barbara, rachel hardy, adrienne muir, and clara wictor. "digitized content in the uk research library and archives sector." journal of librarianship and information science , no. ( ): - . caldera-serrano, jorge. "changes in the management of information in audio-visual archives following digitization: current and future outlook." journal of librarianship and information science , no. ( ): - . campbell, debbie. "how the use of standards is transforming australian digital libraries." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /campbell/ cantara, linda. "building a cyberinfrastructure for the humanities." oclc systems & services , no. ( ): - . caplan, priscilla. "oh what a tangled web we weave: opportunities and challenges for standards development in the digital library arena." first monday (june ). http://www.firstmonday.org/issues/issue _ /caplan/index.html carpenter, leona, simon shaw, and andrew prescott, eds. towards the digital library: the british library's initiatives for access programme. london: the british library, . cathro, warwick. "digitization in australasia." serials: the journal for the serials community , no. ( ): - . chapman, stephen. "managing text digitisation." online information review , no. ( ): - . chapman, stephen, and anne r. kenney. "digital conversion of library research materials: a case for full informational capture." d-lib magazine (october ). http://www.dlib.org/dlib/october /cornell/ chapman.html chen, hsueh-hua. "digital library projects in taiwan." tcdl bulletin , no. ( ). http://www.ieee-tcdl.org/bulletin/v n /chen/chen.html choi, youngok, and edie rasmussen. "what is needed to educate future digital librarians: a study of current practice and staffing patterns in academic and research libraries." d-lib magazine , no. ( ). 
http://www.dlib.org/dlib/september /choi/ choi.html ———. "what qualifications and skills are important for digital librarian positions in academic libraries? a job advertisement analysis." the journal of academic librarianship , no. ( ): - . choudhury, sayeed, benjamin hobbs, mark lorie, and nicholas flores. "a framework for evaluating digital library services." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /choudhury/ choudhury.html chowdhury, g. g., and sudatta chowdhury. "digital library research: major issues and trends." journal of documentation (september ): - . chowdhury, sudatta, monica landoni, and forbes gibb. "usability and impact of digital libraries: a review." online information review , no. ( ): - . chun, susan, and michael jenkins. "why digital asset management? a case study." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature clark, john r. "states of preservation: the maine memory project and similar public access digital history resources in the united states." behavioral & social sciences librarian , no. ( ): - . cole, timothy w. "creating a framework of guidance for building good digital collections." first monday (may ). http://firstmonday.org/issues/issue _ /cole/index.html cole, timothy w., and michelle m. kazmer. "sgml as a component of the digital library." library hi tech , no. ( ): - . coleman, anita. "interdisciplinarity: the road ahead for education in digital libraries." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /coleman/ coleman.html collier, mel. "strategic change in higher education libraries with the advent of the digital library during the fourth decade of program." program: electronic library and information systems , no. ( ): - . crane, gregory. "georeferencing in historical collections." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /crane/ crane.html dahl, mark, kyle banerjee, and michael spalti.
digital libraries: integrating content and systems. oxford: chandos, . dalbello, marija. "cultural dimensions of digital library development, part i: theory and methodological framework for a comparative study of the cultures of innovation in five european national libraries." the library quarterly , no. ( ): - . ———. "cultural dimensions of digital library development, part ii: the cultures of innovation in five european national libraries (narratives of development)." the library quarterly , no. ( ): - . ———. "institutional shaping of cultural memory: digital library as environment for textual transmission." the library quarterly , no. ( ): - . das, anup kumar, chaitali dutta, and b. k. sen. "information retrieval features in indian digital libraries: a critical appraisal." oclc systems & services , no. ( ): - . deegan, marilyn, emil steinvel, and edmund king. "digitizing historic newspapers: progress and prospects." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature saidis, kostas, and alex delis. "type-consistent digital objects." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /saidis/ saidis.html dempsey, lorcan. "the (digital) library environment: ten years after." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /dempsey/ dietrich, dianne, jennifer doty, jen green, and nicole scholtz. "reviving digital projects." the code lib journal, no. ( ). http://journal.code lib.org/articles/ digital libraries ' : proceeding of the second annual conference on the theory and practice of digital libraries. college station, tx: texas a&m university, . http://csdl.tamu.edu/dl / dorner, daniel g., chern li liew, and yen ping yeo. "a textured sculpture: the information needs of users of digitised new zealand cultural heritage resources." online information review , no. ( ): - . eschenfelder, kristin r., and michelle caswell. "digital cultural collections in an age of reuse and remixes." first monday , no. ( ).
http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ esler, sandra l., and michael l. nelson. "evolution of scientific and technical information distribution." journal of the american society for information science (january ): - . falk, howard. "developing digital libraries." the electronic library , no. ( ): - . fox, edward a. "digital libraries initiative (dli) projects - ." bulletin of the american society for information science (october/november ): - . http://www.asis.org/bulletin/oct- /fox.html fox, edward a., reagan w. moore, ronald l. larsen, sung hyon myaeng, and sung-hyuk kim. "toward a global digital library: generalizing us-korea collaboration on digital libraries." d-lib magazine (october ). http://www.dlib.org/dlib/october /fox/ fox.html fox, robert. "mining the digital library." oclc systems & services: international digital library perspectives , no. ( ): - . frumkin, jeremy. "the need for a digital library service registry." oclc systems & services , no. ( ): - . goh, dion hoe-lian, alton chua, davina anqi khoo, emily boon-hui khoo, eric bok-tong mak, and maple wen-min ng. "a checklist for evaluating open source digital library software." online information review , no. ( ): - . gorman, g. e. "digitisation: still the preserve of preservationists rather than users." online information review , no. ( ): - . graham, peter s. "requirements for the digital research library." college & research libraries (july ): - . green, david. "beyond word and image: networking moving images: more than just the 'movies.'" d-lib magazine (july/august ). http://www.dlib.org/dlib/july / green.html greenstein, daniel. "digital libraries and their challenges." library trends (fall ): - . http://hdl.handle.net/ / greenstein, daniel, and gerald george. "digital reproduction quality: benchmark recommendations." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#featured greenstein, daniel, and suzanne e. thorin.
the digital library: a biography. washington, dc: digital library federation, council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf griffin, stephen m. "funding for digital libraries research: past and present." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /griffin/ griffin.html ———. "nsf/darpa/nasa digital libraries initiative: a program manager's perspective." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / griffin.html griscom, richard. "distant music: delivering audio over the internet." notes , no. ( ): - . http://repository.upenn.edu/library_papers/ / hamilton, val. "sustainability for digital libraries." library review , no. ( ): - . http://eprints.rclis.org/archive/ / han, yan. "digital content management: the search for a content management system." library hi tech , no. ( ): - . harter, stephen p. "scholarly communication and the digital library: problems and issues." journal of digital information, , no. ( ). http://journals.tdl.org/jodi/article/viewarticle/ herring, susan davis. "journal literature on digital libraries: publishing and indexing patterns, - ." college & research libraries (january ): - . holley, rose. "developing a digitisation framework for your organisation." the electronic library , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "how good can it get? analysing and improving ocr accuracy in large scale historic newspaper digitisation programs." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /holley/ holley.html hughes, lorna m. digitizing collections: strategic issues for the information manager. london: facet publishing, . ieee international forum on research and technology advances in digital libraries: adl ' , may - , , washington, d.c. los alamitos, ca: ieee computer society press, . innocenti, perla, giuseppina vullo, and seamus ross. "towards a digital library policy and quality interoperability framework: the dl.org project." 
new review of information networking , no. ( ): - . iwhiwhu, basil enemute, and elvis ovietobore eyekpegha. "digitization of nigerian university libraries: from technology challenge to effective information delivery." the electronic library , no. ( ): - . jacobson, robert l. "librarians agree on coordination of digital plans." the chronicle of higher education, may , a . james-gilboe, lynda. "the challenge of digitization: libraries are finding that newspaper projects are not for the faint of heart." the serials librarian , no. / ( ): - . jones, dan. "digitization: the view from the national archives." serials: the journal for the serials community , no. ( ): - . jones, michael l. w., geri k. gay, and robert h. rieger. "project soup: comparing evaluations of digital collection efforts." d-lib magazine (november ). http://www.dlib.org/dlib/november / jones.html jordan, mark. putting content online: a practical guide for libraries. oxford: chandos, . kahn, robert, and robert wilensky. a framework for distributed digital object services. reston, va: corporation for national research initiatives, . http://www.cnri.reston.va.us/home/cstr/arch/k-w.html kani-zabihi, elahe, gheorghita ghinea, and sherry y. chen. "digital libraries: what do users want?" online information review , no. ( ): - . kaplan, deborah. "choosing a digital asset management system that's right for you." journal of archival organization , no. / ( ): - . kilker, julian, and geri gay. "the social construction of a digital library: a case study examining implications for evaluation." information technology and libraries (june ): - . king, edmund. "digitisation of newspapers at the british library." the serials librarian , no. / ( ): - . klemperer, katharina, and stephen chapman. "digital libraries: a selected resource guide." information technology and libraries (september ): - . kochtanek, thomas r., and karen k. hein. "delphi study of digital libraries." information processing and management (may ): - . 
kovacevic, ana, vladan devedzic, and viktor pocajt. "using data mining to improve digital library services." the electronic library , no. ( ): - . lack, rosalie. "the importance of user-centered design: exploring findings and methods." journal of archival organization , no. / ( ): - . lagoze, carl, dean b. krafft, sandy payette, and susan jesuroga. "what is a digital library anymore, anyway? beyond search and access in the nsdl." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /lagoze/ lagoze.html lally, ann m., and carolyn e. dunford. "using wikipedia to extend digital collections." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /lally/ lally.html landon, george v. "toward digitizing all forms of documentation." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /landon/ landon.html law, derek. "remembering history: the work of the information services sub-committee of the joint information systems committee in the uk." program: electronic library and information systems , no. ( ): - . lee, hyuk-jin. "collaboration in cultural heritage digitisation in east asia." program: electronic library and information systems , no. ( ): - . lesk, michael. "going digital." scientific american (march ): - . ———. "the organization of digital libraries." science & technology libraries , no. / ( ): - . ———. practical digital libraries: books, bytes, and bucks. san francisco: morgan kaufmann publishers, . levy, david m. "digital libraries and the problem of purpose." d-lib magazine (january ). http://www.dlib.org/dlib/january / levy.html levy, david m., and catherine c. marshall. "going digital: a look at assumptions underlying digital libraries." communications of the acm (april ): - . liew, chern li. "digital library research - : organisational and people issues." journal of documentation , no. ( ): - . linden, julie, and ann green. "don't leave the data in the dark: issues in digitizing print statistical publications." d-lib magazine , no. ( ). 
http://www.dlib.org/dlib/january /linden/ linden.html lagoze, carl, and david fielding. "defining collections in distributed digital libraries." d-lib magazine (november ). http://www.dlib.org/dlib/november /lagoze/ lagoze.html lopatin, laurie. "library digitization projects, issues and guidelines: a survey of the literature." library hi tech , no. ( ): - . lowry, charles b., and barbara g. richards. "courting discovery: managing transition to the virtual library." library hi tech , no. ( ): - . lund, william. "digital object library products." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature lynch, clifford a. "research libraries engage the digital world: a us-uk comparative examination of recent history and future prospects." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /lynch/ ———. "today and tomorrow: what the digital library really means for collections and services." in virtually yours: models for managing electronic resources and services, ed. peggy johnson and bonnie macewan, - . chicago: american library association, . ———. "where do we go from here? the next decade for digital libraries." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /lynch/ lynch.html lynch, clifford, and hector garcia-molina. interoperability, scaling, and the digital libraries research agenda: a report on the may - , iita digital libraries workshop. http://www-diglib.stanford.edu/diglib/pub/reports/iita-dlw/main.html manaf, zuraidah abd. "the state of digitisation initiatives by cultural institutions in malaysia: an exploratory survey." library review , no. ( ): - . marchionini, gary, and hermann maurer. "the roles of digital libraries in teaching and learning." communications of the acm (april ): - . marcum, deanna b. "digital libraries: for whom? for what?" the journal of academic librarianship (march ): - . ———. "requirements for the future digital library." the journal of academic librarianship , no. ( ): - .
matusiak, krystyna k. "information seeking behavior in digital image collections: a cognitive approach." the journal of academic librarianship , no. ( ): - . mccray, alexa t., and marie e. gallagher. "principles for digital library development." communications of the acm (may ): - . mckay, sally. "digitization in an archival environment." electronic journal of academic and special librarianship , no. ( ). http://southernlibrarianship.icaap.org/content/v n /mckay_s .htm mcmenemy, david. "less conversation, more action: putting digital content creation at the heart of modern librarianship." library review , no. ( ): - . mischo, william h. "digital libraries: challenges and influential work." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /mischo/ mischo.html miller, rush g. "shaping digital library content." the journal of academic librarianship , no. ( ): - . mitchell, steve, margaret mooney, julie mason, gordon w. paynter, johannes ruscheinski, artur kedzierski, and keith humphreys. "ivia open source virtual library system." d-lib magazine (january ). http://www.dlib.org/dlib/january /mitchell/ mitchell.html moxley, joseph m. "universities should require electronic theses and dissertations." educause quarterly , no. ( ): - . http://www.educause.edu/ir/library/pdf/eqm .pdf mugridge, rebecca l. managing digitization activities. spec kit . washington, dc: association of research libraries, . http://www.arl.org/bm~doc/spec web.pdf nelson, michael l., and b. danette allen. "object persistence and availability in digital libraries." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /nelson/ nelson.html newman, alan, and peter dueker. "digital image asset management at the national gallery of art (us)." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article nilsen, dianne. "in pursuit of efficiency: traversing the boundaries of a collection information system." rlg diginews , no. ( ).
http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article novara, elizabeth a. "digitization and researcher demand: digital imaging workflows at the university of maryland libraries." oclc systems & services: international digital library perspectives , no. ( ): - . ooghe, bart, and dries moreels. "analysing selection for digitisation: current practices and common incentives." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /ooghe/ ooghe.html oppenheim, charles, and daniel smithson. "what is the hybrid library?" journal of information science , no. ( ): - . paepcke, andreas, chen-chuan k. chang, hector garcia-molina, and terry winograd. "interoperability for digital libraries worldwide." communications of the acm (april ): - . paepcke, andreas, hector garcia-molina, and rebecca wesley. "dewey meets turing: librarians, computer scientists, and the digital libraries initiative." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /paepcke/ paepcke.html peterson, elaine. "evaluation of digital libraries using snowball sampling." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ pinfield, stephen, jonathan eaton, catherine edwards, rosemary russell, astrid wissenburg, and peter wynne. "realizing the hybrid library." d-lib magazine (october ). http://www.dlib.org/dlib/october / pinfield.html pisciotta, henry a., michael j. dooris, james frost, and michael halm. "penn state's visual image user study." portal: libraries and the academy , no. ( ): - . poll, roswitha. "digitisation in european libraries: results of the numeric project." liber quarterly: the journal of european research libraries , no. / ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf ———. "numeric: statistics for the digitisation of european cultural heritage." program: electronic library and information systems , no. ( ): - . pomerantz, jeffrey, sanghee oh, seungwon yang, edward a. fox, and barbara m.
wildemuth. "the core: digital library education in library and information science programs." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /pomerantz/ pomerantz.html pope, nolan f. "digital libraries: future potentials and challenges." library hi tech , no. - ( ): - . price-wilkin, john. "just-in-time conversion, just-in-case collections: effectively leveraging rich document formats for the www." d-lib magazine (may ). http://www.dlib.org/dlib/may /michigan/ pricewilkin.html prinsen, jola g. b. "a challenging future awaits libraries able to change: highlights of the international summer school on the digital library." d-lib magazine (november ). http://www.dlib.org/dlib/november /prinsen/ prinsen.html prinsen, jola g. b., and hans geleijnse. "the international summer school on the digital library: experiences and plans for the future." d-lib magazine (october ). http://www.dlib.org/dlib/october /prinsen/ prinsen.html puglia, steven, and erin rhodes. "digital imaging—how far have we come and what still needs to be done?" rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article ray, joyce. "connecting people and resources: digital programs at the institute of museum and library services." library hi tech , no. ( ): - . ———. "digitization grants and how to get one: advice from the director, office of library services, institute of museum and library services." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#technical reese, terry, and j. kyle banerjee. building digital libraries: a how-to-do-it manual for librarians. new york: neal-schuman publishers, . reidy, denis v. "the electronic icarus: some problems and some solutions in digitisation." information services & use , no. ( ): - . riley, jenn, and ichiro fujinaga. "recommended best practices for digital image capture of musical scores." oclc systems & services , no. ( ): - . riley, jenn, and kurt whitsel.
"practical quality control procedures for digital imaging projects." oclc systems & services: international digital library perspectives , no. ( ): - . rowlands, ian, and david bawden. "building the digital library on solid research foundations." aslib proceedings (september ): - . rusbridge, chris. "towards the hybrid library." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /rusbridge/ rusbridge.html rydberg-cox, jeffrey a. digital libraries and the challenges of digital humanities. oxford: chandos publishing, . salsich, anne cuyler. "collaboration: paradigm of the digital cultural content environment." journal of archival organization , no. / ( ): - . saracevic, tefko. "digital library evaluation: toward an evolution of concepts." library trends (fall ): - . http://hdl.handle.net/ / schatz, bruce r. "information retrieval in digital libraries: bringing search to the net." science, january , - . schwartz, candy. "digital libraries: an overview." the journal of academic librarianship (november ): - . seaman, david. "deep sharing: a case for the federated digital library." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf shiri, ali. "digital library research: current developments and trends." library review , no. ( ): - . http://eprints.rclis.org/archive/ / simon, scott james. "information architecture for digital libraries." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ singh, gian, rekha mittal, and moin ahmad. "a bibliometric study of literature on digital libraries." the electronic library , no. ( ): - . smith, abby. strategies for building digitized collections. washington, dc: digital library federation, council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf soergel, dagobert. "a framework for digital library research: broadening the vision." d-lib magazine (december ). 
http://www.dlib.org/dlib/december /soergel/ soergel.html spink, amanda, and colleen cool. "education for digital libraries." d-lib magazine (may ). http://www.dlib.org/dlib/may / spink.html suleman, hussein, and edward a. fox. "a framework for building open digital libraries." d-lib magazine (december ). http://www.dlib.org/dlib/december /suleman/ suleman.html sutton, shan. "navigating the point of no return: organizational implications of digitization in special collections." portal: libraries and the academy , no. ( ): - . tammaro, anna maria. "a curriculum for digital librarians: a reflection on the european debate." new library world , no. / ( ): - . tanner, simon, trevor muñoz, and pich hemy ros. "measuring mass text digitization quality and usefulness: lessons learned from assessing the ocr accuracy of the british library's th century online newspaper archive." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /munoz/ munoz.html tennant, roy. "the digital library federation." library journal, march , - . ———. managing the digital library. new york: reed press, . thomas, charles f. "memory institutions as digital publishers: a case study on standards and interoperability." oclc systems & services , no. ( ): - . ———. "replication: the forgotten component in digital library interoperability?" technicalities (july/august ): - . tsai, chih-fong. "a review of image retrieval methods for digital cultural heritage resources." online information review , no. ( ): - . tsakonas, giannis, and christos papatheodorou, eds. evaluation of digital libraries: an insight to useful applications and methods. oxford: chandos, . virkus, sirje, getaneh agegn alemu, tsigereda asfaw demissie, besim jakup kokollari, liliana m. melgar estrada, and deepak yadav. "integration of digital libraries and virtual learning environments: a literature review." new library world , no. / ( ): - . walshe, emily. "the dark side of digitization." portal: libraries and the academy , no. 
( ): - . waters, donald j. "building on success, forging new ground: the question of sustainability." first monday , no. ( ). http://www.firstmonday.org/issues/issue _ /waters/index.html ———. "developing digital libraries: four principles for higher education." educause review (september/october ): - . http://www.educause.edu/ir/library/pdf/erm .pdf waugh, andrew. "the design and implementation of an ingest function to a digital archive." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /waugh/ waugh.html wiederhold, gio. "digital libraries, value, and productivity." communications of the acm (april ): - . witten, ian h., and david bainbridge. how to build a digital library. san francisco: morgan kaufmann publishers, . witten, ian h., david bainbridge, and stefan j. boddie. "greenstone: open-source digital library software." d-lib magazine (october ). http://www.dlib.org/dlib/october /witten/ witten.html witten, ian h., michel loots, maria f. trujillo, and david bainbridge. "the promise of digital libraries in developing countries." the electronic library , no. ( ): - . zhang, allison, and don gourley. creating digital collections: a practical guide. oxford: chandos, .
. . national digital library, library of congress
arms, caroline r. "historical collections for the national digital library: lessons and challenges at the library of congress." d-lib magazine (april ). http://www.dlib.org/dlib/april /loc/ c-arms.html ———. "historical collections for the national digital library: lessons and challenges at the library of congress." d-lib magazine (may ). http://www.dlib.org/dlib/may /loc/ c-arms.html becker, herbert s. "library of congress digital library effort." communications of the acm (april ): . dalbello, marija. "a phenomenological study of an emergent national digital library, part i: theory and methodological framework." the library quarterly , no. ( ): - . ———.
"a phenomenological study of an emergent national digital library, part ii: the narratives of development." the library quarterly , no. ( ): e -e . http://eprints.rclis.org/archive/ / lamolinara, guy. "metamorphosis of a national treasure." american libraries (march ): - . ———. "the national digital library program." in the bowker annual library and book trade almanac, th ed., ed. catherine barr. new providence, nj: r. r. bowker, .
. . other projects and systems
adly, noha. "bibliotheca alexandrina: a digital revival." educause review , no. ( ): - . http://net.educause.edu/ir/library/pdf/erm .pdf alam, md nurul, and pragya pandey. "design and development of prototype astronomical digital image library using greenstone digital library software." desidoc journal of library & information technology , no. ( ): - . http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ anderson, ian g. "pure dead brilliant?: evaluating the glasgow story digitisation project." program: electronic library and information systems , no. ( ): - . arlitsch, kenning. "digitizing sanborn fire insurance maps for a full color, publicly accessible collection." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /arlitsch/ arlitsch.html arlitsch, kenning, and jeff jonsson. "aggregating distributed digital collections in the mountain west digital library with the contentdm multi-site server." library hi tech , no. ( ): - . bailey-hainer, brenda, and richard urban. "the colorado digitization program: a collaboration success story." library hi tech , no. ( ): - . barber, david. "ohiolink: a consortial approach to digital library management." d-lib magazine (april ). http://www.dlib.org/dlib/april / barber.html bartolo, laura m., cathy s. lowe, louis z. feng, and brook patten. "matdl: integrating digital libraries into scientific practice." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ bartolo, laura m., cathy s. lowe, donald r.
sadoway, adam c. powell, and sharon c. glotzer. "nsdl matdl: exploring digital library roles." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /bartolo/ bartolo.html bayne, pauline s., and chris hodge. "digital audio reserves: a collaborative project at the university of tennessee." journal of interlibrary loan, document delivery & information supply , no. ( ): - . bevan, simon j. "electronic thesis development at cranfield university." program: electronic library & information systems , no. ( ): - . https://dspace.lib.cranfield.ac.uk/handle/ / bond, trevor james. "sustaining a digital collection after the grants: the early washington maps project." oclc systems & services , no. ( ): - . boock, michael. "organizing for digitization at oregon state university: a case study and comparison with arl libraries." the journal of academic librarianship , no. ( ): - . borgman, christine l., anne j. gilliland-swetland, gregory h. leazer, richard mayer, david gwynn, rich gazen, and patricia mautone. "evaluating digital libraries for teaching and learning in undergraduate education: a case study of the alexandria digital earth prototype (adept)." library trends (fall ): - . http://hdl.handle.net/ / brahms, ewald. "digital library initiatives of the deutsche forschungsgemeinschaft." d-lib magazine (may ). http://www.dlib.org/dlib/may /brahms/ brahms.html budhu, muniram, and anita coleman. "the design and evaluation of interactivities in a digital library." d-lib magazine (november ). http://www.dlib.org/dlib/november /coleman/ coleman.html burns, maureen a. "from horse-drawn wagon to hot rod: the university of california's digital image service experience." journal of archival organization , no. / ( ): - . byamugisha, helen m. "digitizing library resources for new modes of information use in uganda." library management , no. / ( ): - . castelli, donatella. "digital libraries of the future—and the role of libraries." library hi tech , no. ( ): - . ceynowa, klaus.
"mass digitization for research and study: the digitization strategy of the bavarian state library." ifla journal , no. ( ): - . chambers, sally, and wouter schallier. "bringing research libraries into europeana: establishing a library-domain aggregator." liber quarterly: the journal of european research libraries , no. ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf chapman, stephen, and william comstock. "digital imaging production services at the harvard college library." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature chavez, robert, timothy w. cole, jon dunn, muriel foulonneau, thomas g. habing, william parod, and thornton staples. "dlf-aquifer asset actions experiment: demonstrating value of actionable urls." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /cole/ cole.html chrzastowski, tina e., and alexander scheeline. "asdl: the analytical sciences digital library taking the next steps." science & technology libraries , no. / ( ): - . chudnov, daniel. "dspace: durable digital documents." serials (november ): - . http://hdl.handle.net/ . / chun, susan, and michael jenkins. "why digital asset management? a case study." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article clark, john r. "states of preservation: the maine memory project and similar public access digital history resources in the united states." behavioral & social sciences librarian , no. ( ): - . coleman, ross. "australian co-operative digitisation project, - ." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /digitisation/ collier, mel. "the business aims of eight national libraries in digital library co-operation: a study carried out for the business plan of the european library (tel) project." journal of documentation , no. ( ): - . concordia, cesare. "not just another portal, not just another digital library: a portrait of europeana as an application program interface."
ifla journal , no. ( ): - . cook, matthew. "economies of scale: digitizing the chicago daily news." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature crane, gregory. "'hypermedia' and scholarly publishing." scholarly publishing (april ): - . crane, gregory, robert f. chavez, anne mahoney, thomas l. milbank, jeffrey a. rydberg-cox, david a. smith, and clifford e. wulfman. "drudgery and deep thought." communications of the acm (may ): - . cronin, christopher, kathryn lage, and holley long. "the flight plan of a digital initiatives project: providing remote access to aerial photographs of colorado." oclc systems & services , no. ( ): - . d'alessandro, michael p., jeffrey r. galvin, stephana i. colbert, donna m. d'alessandro, teresa a. choi, brian d. aker, william s. carlson, and gay d. pelzer. "solutions to challenges facing a university digital library and press." journal of the american medical informatics association (may/june ): - . dawei, wei, and sun yigang. "the national digital library project." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /dawei/ dawei.html dodsworth, eva. "university of waterloo's historical air photo digitization project." partnership: the canadian journal of library and information practice and research , no. ( ). http://journal.lib.uoguelph.ca/index.php/perj/article/view/ erway, ricky l. "digital initiatives of the research libraries group." d-lib magazine (december ). http://www.dlib.org/dlib/december /rlg/ erway.html ———. "a view on europeana from the us perspective." liber quarterly: the journal of european research libraries , no. ( ): - . fenske, david e., and jon w. dunn. "the variations project at indiana university's music library." d-lib magazine (june ). http://www.dlib.org/dlib/june /variations/ fenske.html fifarek, aimee. "celebrating history and innovation: the louisiana purchase digital library project at louisiana state university." oclc systems & services , no.
( ): - . flecker, dale. "harvard's library digital initiative: building a first generation digital library infrastructure." d-lib magazine (november ). http://www.dlib.org/dlib/november /flecker/ flecker.html foulke, kathleen, nancy milnor, melissa watterworth, and thomas wilsted. "the power of partnering: the cooperative creation of digital collections." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ fox, edward a., john l. eaton, gail mcmillan, neill a. kipp, paul mather, tim mcgonigle, william schweiker, and brian devane. "networked digital library of theses and dissertations: an international effort unlocking university resources." d-lib magazine (september ). http://www.dlib.org/dlib/september /theses/ fox.html fox, edward a., john l. eaton, gail mcmillan, neill a. kipp, laura weiss, emilio arce, and scott guyer. "national library of theses and dissertations: a scalable and sustainable approach to unlock university resources." d-lib magazine (july/august ). http://www.dlib.org/dlib/september /theses/ fox.html fox, sean, cathy manduca, and ellen iverson. "building educational portals atop digital libraries." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /fox/ fox.html france, fenella g., doug emery, and michael b. toth. "the convergence of information technology, data, and management in a library imaging program." the library quarterly , no. ( ): - . garrison, william a. "retrieval issues for the colorado digitization project's heritage database." d-lib magazine (october ). http://www.dlib.org/dlib/october /garrison/ garrison.html gemmill, laurie, and angela o'neal. "ohio memory online scrapbook: creating a statewide digital library." library hi tech , no. ( ): - . gladney, henry m., fred mintzer, fabio schiattarella, julian bescos, and martin treu. "digital access to antiquities." communications of the acm (april ): - . górny, miroslaw, john catlow, and rafal lewandowski. 
"the state of development of digital libraries in poland." program: electronic library and information systems , no. ( ): - . grewal, dilawar, and fred heath. "the emerging digital library: a new collaborative opportunity on the academic campus." journal of library administration , no. ( ): - . griffin, stephen m. "digital libraries initiative—phase ." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / griffin.html grivell, les. "e-biosci, digital archives, databases, and the changing face of publishing." serials (july ): - . grotke, robert w. "digitizing the world's largest collection of natural sounds: key factors to consider when transferring analog-based audio materials to digital formats." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article guerard, genie, and robin l. chandler. "california cultures: implementing a model for virtual collections." journal of archival organization , no. / ( ): - . guillope, laurent. "mathematics and databases: open access." information services & use , no. - ( ): - . hampson, andrew. "case study: practical experiences of digitisation in the builder hybrid library project." program (july ): - . hartman, cathy nelson, dreanna belden, nancy reis, daniel gelaw alemneh, mark phillips, and doug dunlop. "development of a portal to texas history." library hi tech , no. ( ): - . hodges, doug, and carrol d. lunau. "the national library of canada's digital library initiatives." library hi tech , no. ( ): - . hughes, carol ann. "lessons learned: digitization of special collections at the university of iowa libraries." d-lib magazine (june ). http://www.dlib.org/dlib/june /hughes/ hughes.html hunt, leta, and philip j. ethington. "the utility of spatial and temporal organization in digital library construction." the journal of academic librarianship (november ): - . hunter, nancy chaffin, kathleen legg, and beth oehlerts. 
"two librarians, an archivist, and , images: collaborating to build a digital collection." the library quarterly , no. ( ): - . hurley, bernard j., john price-wilkin, merrilee proffitt, and howard besser. the making of america ii testbed project: a digital library service model. washington, dc: digital library federation, council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf iannella, renato. "australian digital library initiatives." d-lib magazine (december ). http://www.dlib.org/dlib/december / iannella.html jackson, allyn. "the digital mathematics library." notices of the ams , no. ( ): - . http://www.ams.org/notices/ /comm-jackson.pdf jacoby, joann, and mary s. laskowski. "measurement and analysis of electronic reserve usage: toward a new path in online library service assessment." portal: libraries and the academy , no. ( ): - . jones, r. arwel. "a marathon not a sprint: lessons learnt from the first decade of digitisation at the national library of wales." program: electronic library and information systems , no. ( ): - . jones, steve, matt jones, malcolm barr, and te taka keegan. "searching and browsing in a digital library of historical maps and newspapers." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ kaplan, nancy r., and michael l. nelson. "determining the publication impact of a digital library." journal of the american society for information science , no. ( ): - . klavans, judith l. "new center at columbia university for digital library research: fostering interdisciplinary research and bridging cultural clashes." d-lib magazine (march ). http://www.dlib.org/dlib/march /klavans/ klavans.html klijn, edwin. "the current state-of-art in newspaper digitization: a market perspective." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /klijn/ klijn.html kott, katherine, jon dunn, martin halbert, leslie johnston, liz milewicz, and sarah shreeves.
"digital library federation (dlf) aquifer project." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /kott/ kott.html krueger, stephanie, and philip ponella. "dram/variations : a music resource case study." library hi tech , no. ( ): - . kucsma, jason, kevin reiss, and angela sidman. "using omeka to build digital collections: the metro case study." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /kucsma/ kucsma.html kunze, john a., and brian n. warling. "recent developments in galen ii: evolution of a digital library for the health sciences." d-lib magazine (march ). http://www.dlib.org/dlib/march / galen .html kurtz, michael j., guenther eichhorn, alberto accomazzi, carolyn grant, markus demleitner, and stephen s. murray. "worldwide use and impact of the nasa astrophysics data system digital library." journal of the american society for information science and technology , no. ( ): - . lanz, daniel, frederick zarndt, stefan boddie, tracy powell, and vishal salgotra. "the new papers past: an international collaboration between new zealand, india, germany, and the united states." oclc systems & services: international digital library perspectives , no. ( ): - . lee, stuart. "digitizing intellectual property: the oxford scoping study." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /oxford-mellon/ lee, stuart d., and kate lindsay. "if you build it, they will scan: oxford university's exploration of community collections." educause quarterly , no. ( ). http://www.educause.edu/educause+quarterly/educausequarterlymagazinevolum/ifyoubuildittheywillscanoxford/ lesk, michael. "perspectives on dli- —growing the field." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / lesk.html levi, peter. "digitising the past: the beginning of a new future at the royal tropical institute of the netherlands." program: electronic library and information systems , no. ( ): - . ling, ted. "why the archives introduced digitisation on demand." rlg diginews , no.
( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature liu, ying. "geo-referenced digital libraries: experienced problems of purpose and infrastructure." library philosophy and practice , no. ( ). http://www.webpages.uidaho.edu/~mbolin/liu.html lu, shiyong, dapeng liu, farshad fotouhi, ming dong, robert reynolds, anthony aristar, martha ratliff, geoff nathan, joseph tan, and ronald powell. "language engineering for the semantic web: a digital library for endangered languages." information research , no. ( ). http://informationr.net/ir/ - /paper .html lougee, wendy p. "the university of michigan digital library program: a retrospective on collaboration within the academy." library hi tech , no. ( ): - . lucier, richard e. "building a digital library for the health sciences: information space complementing information place." bulletin of the medical library association (july ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= lutz, marilyn. "the maine music box: a pilot project to create a digital music library." library hi tech , no. ( ): - . lutz, marilyn, and curtis meadow. "evolving an in-house system to integrate the management of digital collections." library hi tech , no. ( ): - . maccoll, john. "electronic theses and dissertations: a strategy for the uk." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /theses-dissertations/ macías-virgós, e., and r. de la viesca. "digitization projects in spain." mathematics in computer science , no. ( ): - . mankita, isaac, ellen meltzer, and james harris. "a handful of things: calisphere's themed collections from the california digital library." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /mankita/ mankita.html marchionini, gary. "evaluating digital libraries: a longitudinal and multifaceted view." library trends (fall ): - . http://hdl.handle.net/ / marchionini, gary, and gary geisler. "the open video digital library." d-lib magazine (december ). 
http://www.dlib.org/dlib/december /marchionini/ marchionini.html maslin, jon, and elizabeth lyon. "project patron—audio and video on demand at the university of surrey." the journal of academic librarianship (november ): - . mccarthy, cavan, and murilo bastos da cunha. "digital library development in brazil." oclc systems & services , no. ( ): - . mcglamery, patrick. "building a globally distributed historical sheet map set of austro-hungarian topographic maps, - ." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article mieczkowska, suzanne, and kathryn pryor. "digitised newspapers at norfolk and norwich millennium library." collection building , no. ( ): - . mijajlović, Žarko, zoran ognjanović, and aleksandar pejović. "digitization of mathematical editions in serbia." mathematics in computer science , no. ( ): - . mintzer, f. c., l. e. boyle, a. n. cazes, b. s. christian, s. c. cox, f. p. giordano, h. m. gladney, j. c. lee, m. l. kelmanson, a. c. lirani, k. a. magerlein, a. m. b. pavani, and f. schiattarella. "toward on-line, worldwide access to vatican library materials." ibm journal of research and development (march ): - . mischo, william h. "the digital engineering library: current technologies and challenges." science & technology libraries , no. / ( ): - . moen, william e. "accessing distributed cultural heritage information." communications of the acm (april ): - . nelson, michael l., gretchen l. gottlich, david j. bianco, sharon s. paulson, robert l. binkley, yvonne d. kellogg, chris j. beaumont, robert b. schmunk, michael j. kurtz, alberto accomazzi, and omar syed. "the nasa technical report server." internet research: electronic network applications and policy , no. ( ): - . nicholson, dennis, and george macgregor. "learning lessons holistically in the glasgow digital library." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /nicholson/ nicholson.html nikisch, jan andrzej, and miroslaw górny.
"regional digital libraries in poland." the electronic library , no. ( ): - . nitecki, danuta a., and william rando. "a library and teaching center collaboration to assess the impact of using digital images on teaching, learning, and library support." vine , no. ( ): - . ober, john. "the california digital library." d-lib magazine (march ). http://www.dlib.org/dlib/march / ober.html oomen, johan, and vassilis tzouvaras. "providing access to european television heritage." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /ooman-tzouvaras/ palmer, david t. "the pacific rim library: a surprising pearl." serials review , no. ( ): - . payette, sandra, christophe blanchi, carl lagoze, and edward a. overly. "interoperability for digital objects and repositories: the cornell/cnri experiments." d-lib magazine (may ). http://www.dlib.org/dlib/may /payette/ payette.html purday, jonathan. "the british library's initiatives for access projects." communications of the acm (april ): - . raitt, david. "digital library initiatives across europe." computers in libraries (november/december ): - . http://www.infotoday.com/cilmag/nov /raitt.htm rao, ramana, jan o. pedersen, marti a. hearst, jock d. mackinlay, stuart k. card, larry masinter, per-kristian halvorsen, and george g. robertson. "rich interaction in the digital library." communications of the acm (april ): - . rusch-feja, diann, and hans jurgen becker. "global info: the german digital libraries project." d-lib magazine (april ). http://www.dlib.org/dlib/april / rusch-feja.html russell, kelly. "the jisc electronic libraries programme." computers and the humanities , no. ( ): - . rydberg-cox, jeffrey a. "cultural heritage language technologies: building an infrastructure for collaborative digital libraries in the humanities." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /rydberg-cox/ ———. "the cultural heritage language technologies consortium." d-lib magazine , no. ( ). 
http://www.dlib.org/dlib/may /rydberg-cox/ rydberg-cox.html rydberg-cox, jeffrey a., robert f. chavez, david a. smith, anne mahoney, and gregory r. crane. "knowledge management in the perseus digital library." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /rydberg-cox/ saylor, john m., and carol minton-morris. "the national science digital library: an update on systems, services and collection development." science & technology libraries , no. / ( ): - . schlabach, martin l., and susan j. barnes. "the mann library gateway system." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /schlabac. n schmidt, heidi, karen butter, and cynthia rider. "building digital tobacco industry document libraries at the university of california, san francisco library/center for knowledge management." d-lib magazine (september ). http://www.dlib.org/dlib/september /schmidt/ schmidt.html schmidt, janine, and louise o'neill. "the 'dod' and 'pod' project in context at mcgill: part of digitizing collections to preserve content, provide access and enrich research." serials: the journal for the serials community , no. ( ): - . shaw, caroline. "creating the charles booth online archive: from nineteenth century london poverty to twenty-first century digital riches." library review , no. ( ): - . shaw, elizabeth j. "building a digital library: a technology manager's point of view." the journal of academic librarianship (november ): - . shaw, elizabeth j., and sarr blumson. "making of america: online searching and page presentation at the university of michigan." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /america/ shaw.html simon, rebecca. "scan: scholarship from california on the net." the serials librarian , no. / ( ): - . solbakk, svein arne. "critical technological and architectural choices for access and preservation in a digital library environment." library review , no. ( ): - . starr, susan s.
"building the collections of the california digital library." issues in science and technology librarianship, no. (winter ). http://www.library.ucsb.edu/istl/ -winter/article .html stevens, kimberly weatherford, and bethany latham. "giving voice to the past: digitizing oral history." oclc systems & services: international digital library perspectives , no. ( ): - . suleman, hussein, anthony atkins, marcos a. goncalves, robert k. france, edward a. fox, vinod chachra, murray crowder, and jeff young. "networked digital library of theses and dissertations: bridging the gaps for global access—part : mission and progress." d-lib magazine (september ). http://www.dlib.org/dlib/september /suleman/ suleman-pt .html ———. "networked digital library of theses and dissertations: bridging the gaps for global access—part : services and research." d-lib magazine (september ). http://www.dlib.org/dlib/september /suleman/ suleman-pt .html sullivan, mark, and marilyn n. ochoa. "digital library of the caribbean: a user-centric model for technology development in collaborative digitization projects." oclc systems & services: international digital library perspectives , no. ( ): - . symonds, emily, and cinda may. "documenting local procedures: the development of standard digitization processes through the dear comrade project." journal of library metadata , no. / ( ): - . takle, marianne. "the norwegian national digital library." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /takle/ terpstra, judith a. k., frederick zarndt, david ongley, and stefan boddie. "the tundra times newspaper digitization project." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article thompson, larry a. "electronic theses and dissertations at virginia tech." science & technology libraries , no. ( ): - . turner, adrian l. "committing to memory: a project to publish and preserve california local history digital resources." journal of archival organization , no. 
/ ( ): - . underhill, karen j., and bruce palmer. "archival content anywhere@anytime." internet reference services quarterly , no. / ( ): - . vandecreek, drew. "'webs of significance': the abraham lincoln historical digitization project, new technology, and the democratization of history." digital humanities quarterly , no. ( ). http://www.digitalhumanities.org/dhq/vol/ / / .html walker, kizer. "integrating a free digital resource: the status of making of america in academic library collections." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature wanat, thomas. "indiana program strives to digitize music without sacrificing the quality of sound." the chronicle of higher education, may , a . weig, eric, kopana terry, and kathryn lybarger. "large scale digitization of oral history: a case study." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /weig/ weig.html wenqing, wang. "building the new-generation china academic digital library information system (cadlis): a review and prospectus." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /wenqing/ wenqing.html wiseman, norman, chris rusbridge, and stephen m. griffin. "the joint nsf/jisc international digital libraries initiative." d-lib magazine (june ). http://www.dlib.org/dlib/june / wiseman.html wisser, katherine. "meeting metadata challenges in the consortial environment: metadata coordination for north carolina exploring cultural heritage online." library hi tech , no. ( ): - . witten, ian h. "customizing digital library interfaces with greenstone." tcdl bulletin , no. ( ). http://www.ieee-tcdl.org/bulletin/v n /witten/witten.html witten, ian h. "examples of practical digital libraries: collections built internationally using greenstone." d-lib magazine (march ). http://www.dlib.org/dlib/march /witten/ witten.html witten, ian h., rodger j. mcnab, steve jones, mark apperley, david bainbridge, and sally jo cunningham. 
"mastering complexity in a distributed digital library." computer (february ): - . wooldridge, brooke, laurie taylor, and mark sullivan. "managing an open access, multi-institutional, international digital library: the digital library of the caribbean." resource sharing & information networks , no. / ( ): - . wykoff, leslie, laurie mercier, trevor bond, and alan cornish. "the columbia river basin ethnic history archive: a tri-state online history database and learning center." library hi tech , no. ( ): - . xiaodong, qiao, liang bing, and yao changqing. "china national science and technology digital library (nstl)." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /xiaodong/ xiaodong.html young, jeffrey r. "requiring theses in digital form: the first year at virginia tech." the chronicle of higher education, february , a -a . zhang, allison b. "creating online historical scrapbooks with a user-friendly interface: a case study." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /zhang/ zhang.html zhang, allison, and don gourley. "building digital collections using greenstone digital library software." internet reference services quarterly , no. ( ): - . zhang, yin, kyiho lee, and bum-jong you. "usage patterns of an electronic theses and dissertations system." online information review , no. ( ): - . zhen, xihui. "overview of digital library developments in china." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /zhen/ zhen.html zhou, qian. "the development of digital libraries in china and the shaping of digital librarians." the electronic library , no. ( ): - . zia, lee l. "the nsf national science, mathematics, engineering, and technology education digital library (nsdl) program." d-lib magazine (october ). http://www.dlib.org/dlib/october /zia/ zia.html ———. "the nsf national science, technology, engineering, and mathematics education digital library (nsdl) program: new projects in fiscal year ." d-lib magazine (november ).
http://www.dlib.org/dlib/november /zia/ zia.html ———. "the nsf national science, technology, engineering, and mathematics education digital library (nsdl) program: new projects from fiscal year ." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /zia/ zia.html . library issues: digital preservation abrams, stephen, patricia cruse, and john kunze. "preservation is not a place." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ abrams, stephen, sheila morrissey, and tom cramer. "'what? so what': the next-generation jhove architecture for format-aware characterization." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ abrams, stephen l., and bruce rosenblum. "xml for e-journal archiving." oclc systems & services , no. ( ): - . adams, geoffrey. "partners go dutch to preserve the minutes of science." research information (september/october ). http://www.researchinformation.info/risepoct archiving.html adams, wright r. "archiving digital materials: an overview of the issues." journal of interlibrary loan, document delivery & electronic reserve , no. ( ): - . altenhöner, reinhard. "data for the future: the german project 'co-operative development of a long-term digital information archive.'" library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / altenhöner, reinhard, and tobias steinke. "kopal: cooperation, innovation and services: digital preservation activities at the german national library." library hi tech , no. ( ): - . altman, micah, margaret o. adams, jonathan crabtree, darrell donakowski, marc maynard, amy pienta, and copeland h. young. "digital preservation through archival collaboration: the data preservation alliance for the social sciences." american archivist , no. ( ): - . anderson, richard, hannah frost, nancy hoebelheinrich, and keith johnson.
"the aiht at stanford university: automated preservation assessment of heterogeneous digital collections." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /johnson/ johnson.html anderson, rick. "is the digital archive a new beast entirely?" serials review , no. ( ): - . anderson, w. l. "some challenges and issues in managing, and preserving access to, long-lived collections of digital scientific and technical data." data science journal ( ): - . http://www.jstage.jst.go.jp/article/dsj/ / / /_pdf angevaare, inge. "taking care of digital collections and data: 'curation' and organisational choices for research libraries." liber quarterly: the journal of european research libraries , no. ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf arms, caroline r. "keeping memory alive: practices for preserving digital content at the national digital library program of the library of congress." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature arms, william y. "preservation of scientific serials: three current examples." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . arms, william y., selcuk aya, pavel dmitriev, blazej kot, ruth mitchell, and lucia walle. "a research library based on the historical collections of the internet archive." d-lib magazine , no. ( ). http://www.dlib.org/dlib/february /arms/ arms.html aschenbrenner, andreas. "the bits and bites of data formats— stainless design for digital endurance." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article atkinson, ross. "text mutability and collection administration." library acquisitions: practice & theory , no. ( ): - . bailey, steve, and dave thompson. "ukwac: building the uk's first public web archive." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /thompson/ thompson.html barnes, ian. preservation of tex/latex documents. 
canberra: australian partnership for sustainable repositories, . http://www.apsr.edu.au/publications/latex-preservation.pdf ———. the preservation of word processing documents. canberra: australian partnership for sustainable repositories, . http://www.apsr.edu.au/publications/word_processing_preservation.pdf baudoin, patsy. "uppity bits: coming to terms with archiving dynamic electronic journals." the serials librarian , no. ( ): - . beagrie, neil. "the continuing access and digital preservation strategy for the uk joint information systems committee (jisc)." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /beagrie/ beagrie.html ———. "the digital curation centre." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "digital curation for science, digital libraries, and individuals." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / ———. "digital preservation: best practice and its dissemination." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /beagrie/ ———. "the jisc digital preservation focus and the digital preservation coalition." the new review of academic librarianship ( ): - . ———. national digital preservation initiatives: an overview of developments in australia, france, the netherlands, and the united kingdom and of related international activity. washington, dc: council on library and information resources and the library of congress, . http://www.clir.org/pubs/reports/pub /pub .pdf ———. "preserving uk digital library collections." program (july ): - . beagrie, neil, robert beagrie, and ian rowlands. "research data preservation and access: the views of researchers." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /beagrie-et-al/ bearman, david. "intellectual property conservancies." d-lib magazine (december ). http://www.dlib.org/dlib/december /bearman/ bearman.html ———. "reality and chimeras in the preservation of electronic records."
d-lib magazine (april ). http://www.dlib.org/dlib/april /bearman/ bearman.html bearman, david, and jennifer trant. "authenticity of digital resources: towards a statement of requirements in the research process." d-lib magazine (june ). http://www.dlib.org/dlib/june / bearman.html beebe, linda, and barbara meyers. "the unsettled state of archiving." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . bellekom, chris. "building preservation functionality in a digital archive: the national library of the netherlands." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art berger, marilyn. "digitization for preservation and access: a case study." library hi tech , no. ( ): - . berman, fran, ardys kozbial, robert h. mcdonald, and brian e. c. schottlaender. "the need to formalize trust relationships in digital repositories." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf berthon, hilary, susan thomas, and colin webb. "safekeeping: a cooperative approach to building a digital preservation resource." d-lib magazine (january ). http://www.dlib.org/dlib/january /berthon/ berthon.html besser, howard. "collaboration for electronic preservation." library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / bethune, alec, butch lazorchak, and zsolt nagy. "geomapp: a geospatial multistate archive and preservation partnership." journal of map & geography libraries , no. ( ): - . boyce, peter b. "who will keep the archives? wrong question!" serials review , no. ( ): - . boyle, frances, alexandra eveleigh, and heather needham. "preserving local archival heritage for ongoing accessibility." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /boyle-et-al/ bradley, kevin. "defining digital sustainability." library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / bradley, rachael. "digital authenticity and integrity: digital cultural heritage documents as research resources." 
portal: libraries and the academy , no. ( ): - . brancolini, kristine r. "selecting research collections for digitization: applying the harvard model." library trends (spring ): - . https://www.ideals.uiuc.edu/handle/ / brandt, larry, valerie gregg, and sue stendebach. "the national science foundation digital government research program's role in the long-term preservation of digital materials." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature brichford, maynard, and william maher. "archival issues in network electronic publications." library trends (spring ): - . http://www.ideals.uiuc.edu/handle/ / brodie, nancy. "authenticity, preservation and access in digital collections." the new review of academic librarianship ( ): - . ———. "building a national electronic collection for long-term access." the serials librarian , no. / ( ): - . brody, tim, leslie carr, jessie m. n. hey, adrian brown, and steve hitchcock. "pronom-roar: adding format profiles to a repository registry to inform preservation services." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / brown, adrian. "automating preservation: new developments in the pronom service." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article ———. "developing practical approaches to active preservation." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ building a national strategy for preservation: issues in digital media archiving. washington, dc: council on library and information resources and library of congress, . http://www.clir.org/pubs/reports/pub /pub .pdf buonora, paolo, and franco liberati. "a format for digital preservation of images: a study on jpeg file robustness." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /buonora/ buonora.html burrows, toby. 
"preserving the past, conceptualising the future: research libraries and digital preservation." australian academic & research libraries , no. ( ): - . butler, meredith a. "issues and challenges of archiving and storing digital information: preserving the past for future scholars." journal of library administration , no. ( ): - . cain, mark. "being a library of record in a digital age." the journal of academic librarianship , no. ( ): - . cantara, linda. "long-term preservation of digital humanities scholarship." oclc systems & services , no. ( ): - . caplan, priscilla. "the florida digital archive and daitss: a model for digital preservation." library hi tech , no. ( ): - . ———. "the florida digital archive and daitss: a working preservation repository based on format migration." international journal on digital libraries , no. ( ): - . http://www.fcla.edu/digitalarchive/pdfs/ijdl_article.pdf ———. "premis—preservation metadata implementation strategies update . implementing preservation repositories for digital materials: current practice and emerging trends in the cultural heritage community." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article ———. "the preservation of digital materials." library technology reports , no. ( ). carlson, scott. "library of congress buys electronic archive of physics society's journals." the chronicle of higher education, february , a . ———. "stanford project will test an approach for preserving digital journals." the chronicle of higher education, march , a . carpenter, leona. "supporting digital preservation and asset management in institutions." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /carpenter/ chapman, stephen. "counting the costs of digital preservation: is repository storage affordable?" journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ choudhury, g. sayeed. "case study in data curation at johns hopkins university." library trends , no. 
( ): - . choudhury, sayeed, tim dilauro, alex szalay, ethan vishniac, robert j. hanisch, julie steffen, robert milkey, teresa ehling, and ray plante. "digital data preservation for scholarly publications in astronomy." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / chowdhury, gobinda. "from digital libraries to digital preservation research: the importance of users and context." journal of documentation , no. ( ): - . clareson, tom. "nedcc survey and colloquium explore digitization and digital preservation policies and practices." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article cloonan, michèle valerie. "the moral imperative to preserve." library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / ———. "the preservation of knowledge." library trends (spring ): - . http://www.ideals.uiuc.edu/handle/ / cloonan, michèle valerie, and shelby sanett. "the preservation of digital content." portal: libraries and the academy , no. ( ): - . connertz, thomas. "long-term archiving of digital documents: what efforts are being made in germany?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art conway, paul. "preservation in the age of google: digitization, digital preservation, and dilemmas." the library quarterly , no. ( ): - . ———. "tec(h)tonics: reimagining preservation." college & research libraries news , no. ( ): - . ———. "yale university library's project open book: preliminary research findings." d-lib magazine (february ). http://www.dlib.org/dlib/february /yale/ conway.html council on library and information resources, and library of congress. capturing analog sound for digital preservation: report of a roundtable discussion of best practices for transferring analog discs and tapes. washington, dc: council on library and information resources and library of congress, . http://www.clir.org/pubs/abstract/pub abst.html cothey, viv. 
"digital curation at gloucestershire archives: from ingest to production by way of trusted storage." journal of the society of archivists , no. ( ): - . cramer, tom, and katherine kott. "designing and implementing second generation digital preservation services: a scalable model for the stanford digital repository." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /cramer/ cramer.html crawford, walt. "bits is bits: pitfalls in digital reformatting." american libraries (may ): - . crook, edgar. "for the record: assessing the impact of archiving on the archived." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article currall, james, peter mckinney, and claire johnson. "the world is all grown digital. . . . how shall a man persuade management what to do in such times?" international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ dale, robin l. "making certification real: developing methodology for evaluating repository trustworthiness." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article darlington, jeffrey. "pronom—a practical online compendium of file formats." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature day, michael. "e-print services and long-term access to the record of scholarly and scientific research." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /metadata/ ———. "online serials: preservation issues." the serials librarian , no. / ( ): - . ———. preservation of electronic information: a bibliography. bath: uk office for library and information networking, . http://homes.ukoln.ac.uk/~lismd/preservation.html de lusenet, yola. "tending the garden or harvesting the fields: digital preservation and the unesco charter on the preservation of the digital heritage." library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / deegan, marilyn. 
"management of the life cycle of digital library materials." liber quarterly , no. ( ): - . deegan, marilyn, and simon tanner, eds. digital preservation. london: facet publishing, . deken, jean marie. "preserving digital libraries: determining 'what?' before deciding 'how?'" science & technology libraries , no. / ( ): - . deloughry, thomas j. "panel urges saving digital data for posterity." the chronicle of higher education, september , a . dilauro, tim, mark patton, david reynolds, and g. sayeed choudhury. "the archive ingest and handling test: the johns hopkins university report." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /choudhury/ choudhury.html dobratz, susanne, and heike neuroth. "nestor: network of expertise in long-term storage of digital resources—a digital preservation initiative for germany." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /dobratz/ dobratz.html dobratz, susanne, and astrid schoger. "digital repository certification: a report from germany." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article donaldson, devan ray, and paul conway. "implementing premis: a case study of the florida digital archive." library hi tech , no. ( ): - . dougherty, william c. "preservation of digital assets: one approach." the journal of academic librarianship , no. ( ): - . downs, robert r., and robert s. chen. "self-assessment of a long-term archive for interdisciplinary scientific data as a trustworthy digital repository." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ doyle, julie, herna viktor, and eric paquet. "long-term digital preservation: preserving authenticity and usability of -d data." international journal on digital libraries , no. ( ): - . duranceau, ellen finnie. "archiving and perpetual access for web-based journals: a look at the issues and how five e-journal providers are addressing them." serials review (summer ): - . 
duranti, luciana, and kenneth thibodeau. "the concept of record in interactive, experiential and dynamic environments: the view of interpares." archival science , no. ( ): - . eastwood, terry. "appraising digital records for long-term preservation." data science journal ( ): - . http://www.jstage.jst.go.jp/article/dsj/ / / /_pdf ekman, richard h. "can libraries of digital materials last forever?" change (march/april ): - . entlich, richard, and ellie buckley. "digging up bits of the past: hands-on with obsolescence." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article erwin, tracey, and julie sweetkind-singer. "the national geospatial digital archive: a collaborative project to archive geospatial data." journal of map & geography libraries , no. ( ): - . farquhar, adam, and helen hockx-yu. "planets: integrated services for digital preservation." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / fenton, eileen gifford. "an overview of portico: an electronic archiving service." serials review , no. ( ): - . ———. "preserving electronic scholarly journals: portico." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /fenton/ flanders, julia. "trusting the electronic edition." computers and the humanities , no. ( ): - . flecker, dale. "digital archiving: what is involved?" educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf ———. "preserving scholarly e-journals." d-lib magazine (september ). http://www.dlib.org/dlib/september /flecker/ flecker.html fox, robert. "the double bind of e-journal collections." oclc systems & services , no. ( ): - . friedlander, amy. "the national digital information infrastructure preservation program: expectations, realities, choices and progress to date." d-lib magazine (april ). http://www.dlib.org/dlib/april /friedlander/ friedlander.html garrett, john r. "task force on archiving of digital information." 
d-lib magazine (september ). http://www.dlib.org/dlib/september / garrett.html geller, marilyn. "planning for the digital archive: the harvard e-journal experience." serials (november ): - . gertz, janet. "selection for preservation in the digital age." library resources & technical services (april ): - . giaretta, david. "the caspar approach to digital preservation." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ gladney, h. m., and j. l. bennett. "what do we mean by authentic? what's the real mccoy?" d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /gladney/ gladney.html glick, kevin, eliot wilczek, and robert dockins. "fedora and the preservation of university records project." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article gracy, karen f. "moving image preservation and cultural capital." library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / graham, peter s. intellectual preservation: electronic preservation of the third kind. washington, dc: the commission on preservation and access, . http://www.clir.org/pubs/reports/graham/intpres.html ———. "long-term intellectual preservation." collection management , no. / ( ): - . graham, rebecca a. "evolution of archiving in the digital age." serials review , no. ( ): - . granger, stewart. "digital preservation and deep infrastructure." d-lib magazine (february ). http://www.dlib.org/dlib/february /granger/ granger.html ———. "emulation as a digital preservation strategy." d-lib magazine (october ). http://www.dlib.org/dlib/october /granger/ granger.html ———. "metadata and digital preservation: a plea for cross-interest collaboration." vine, no. ( ): - . griepke, gertraud, bernd wegner, and seyed hasan. "the electronic mathematics archives network initiative (emani)." serials (november ): - . guenther, rebecca s. "battle of the buzzwords: flexibility vs. interoperability when implementing premis in mets." 
d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /guenther/ guenther.html guthrie, kevin. "developing a digital preservation strategy for jstor." interview by anne r. kenney and oya y. rieger. rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature gutmann, m., k. schürer, d. donakowski, and hilary beedham. "the selection, appraisal, and retention of social science data." data science journal ( ): - . http://www.jstage.jst.go.jp/article/dsj/ / / /_pdf habing, thomas, janet eke, matthew a. cordial, william ingram, and robert manaster. "developments in digital preservation at the university of illinois: the hub and spoke architecture for supporting repository interoperability and emerging preservation standards." library trends , no. ( ): - . halbert, martin. "comparison of strategies and policies for building distributed digital preservation infrastructure: initial findings from the metaarchive cooperative." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ harada, hisayoshi. "digitizing, archiving, and preserving japanese cultural heritage." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature harvey, ross, and dave thompson. "automating the appraisal of digital materials." library hi tech , no. ( ): - . haynes, david, and david streatfield. "a national co-ordinating body for digital archiving?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /digital/ hedstrom, margaret. "digital preservation: a time bomb for digital libraries." computers and the humanities , no. ( ): - . http://deepblue.lib.umich.edu/handle/ . / hedstrom, margaret, and clifford lampe. "emulation vs. migration: do users care?" rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature heidorn, p. bryan. "shedding light on the dark data in the long tail of science." library trends , no. ( ): - . higgins, sarah. 
"dcc diffuse standards frameworks: a standards path through the curation lifecycle." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ hitchcock, steve, tim brody, jessie m. n. hey, and leslie carr. "digital preservation service provider models for institutional repositories: towards distributed services." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /hitchcock/ hitchcock.html hitchcock, steve, david tarrant, adrian brown, ben o'steen, neil jefferies, and leslie carr. "towards smart storage for repository preservation services." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ hixson, carol. "when just doing it isn't enough: the university of oregon takes stock." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article hockx-yu, helen. "digital curation centre—phase two." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ ———. "digital preservation in the context of institutional repositories." program: electronic library and information systems , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "establishing a uk lockss pilot programme." serials , no. ( ): - . http://eprints.rclis.org/archive/ / hodge, gail m. "best practices for digital archiving: an information life cycle approach." d-lib magazine (january ). http://www.dlib.org/dlib/january / hodge.html hoeven, jeffrey van der, bram lohman, and remco verdegem. "emulation for digital preservation in practice: the results." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / holdsworth, david, and paul wheatley. "emulation, preservation, and abstraction." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature honey, sadie l. "preservation of electronic scholarly publishing: an analysis of three approaches." 
portal: libraries and the academy , no. ( ): - . howell, alan. "perfect one day—digital the next: challenges in preserving digital information." australian academic & research libraries , no. ( ): - . hunter, jane. "scientific publication packages—a selective approach to the communication and archival of scientific output." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / hunter, jane, and sharmin choudhury. "panic: an integrated approach to the preservation of composite digital objects using semantic web services." international journal on digital libraries , no. ( ). http://espace.library.uq.edu.au/view.php?pid=uq: hunter, karen. "digital archiving." serials review , no. ( ): - . inger, simon. "production and content management implications for archival projects: a snapshot in may ." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art janée, greg, james frew, and terry moore. "relay-supporting archives: requirements and progress." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ jantz, ronald. "an institutional framework for creating authentic digital objects." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ johnson, jane d. "mic (moving image collections)." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article johnson, peggy. "libraries and the preservation of electronic information." technicalities (june ): - . joint, nicholas. "legal deposit and collection development in a digital world." library review , no. ( ): - . jones, maggie. "the digital preservation coalition: building a national infrastructure for preserving digital resources in the uk." the serials librarian , no. ( ): - . ———. "e-journals—what do you get for your money?" serials , no. ( ): - . ———. "a workbook for the preservation management of digital materials." 
the new review of academic librarianship ( ): - . kahle, brewster. "preserving the internet." scientific american (march ): - . kalusopa, trywell, and saul zulu. "digital heritage material preservation in botswana: problems and prospects." collection building , no. ( ): - . kanyengo, christine wamunyima. "managing digital information resources in africa: preserving the integrity of scholarship." the international information & library review , no. ( ): - . keller, alice, jonathan mcaslan, and claire duddy. "long-term access to e-journals: what exactly can we promise our readers?" serials: the journal for the serials community , no. ( ): - . kennedy, marie r. "reformatting preservation departments: the effect of digitization on workload and staff." college & research libraries , no. ( ): - . kenney, anne r. "surveying the e-journal preservation landscape." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr preserv.pdf kenney, anne r., and ellie buckley. "developing digital preservation programs: the cornell survey of institutional readiness, - ." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article kenney, anne r., and paul conway. "from analog to digital: extending the preservation tool kit." collection management , no. / ( ): - . kenney, anne r., richard entlich, peter b. hirtle, nancy y. mcgovern, and ellie l. buckley. e-journal archiving metes and bounds: a survey of the landscape. washington, dc: council on library and information resources, . http://www.clir.org/pubs/abstract/pub abst.html kenney, anne r., and oya y. rieger. "preserving digital assets: cornell's digital image collection project." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ keyhani, andrea. "creating an electronic archive: who should do it and why?" the serials librarian , no. / ( ): - . kirchhoff, amy j. 
"digital preservation: challenges and implementation." learned publishing , no. ( ): - . knight, steve. "early learnings from the national library of new zealand's national digital heritage archive project." program: electronic library and information systems , no. ( ): - . kulovits, hannes, andreas rauber, anna kugler, markus brantl, tobias beinert, and astrid schoger. "from tiff to jpeg ? preservation planning at the bavarian state library using a collection of digitized th century printings." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /kulovits/ kulovits.html lavoie, brian f. "the fifth blackbird: some thoughts on economically sustainable digital preservation." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /lavoie/ lavoie.html ———. "implementing metadata in digital preservation systems: the premis activity." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /lavoie/ lavoie.html ———. "premis with a fresh coat of paint: highlights from the revision of the premis data dictionary for preservation metadata." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /lavoie/ lavoie.html lavoie, brian, and lorcan dempsey. "thirteen ways of looking at . . . digital preservation." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /lavoie/ lavoie.html leggate, peter, and mike hannant. "the archiving of online journals." learned publishing (october ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art lehmann, klaus-dieter. "making the transitory permanent: the intellectual heritage in a digitized world of knowledge." daedalus , no. ( ): - . littman, justin. "actualized preservation threats: practical lessons from chronicling america." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /littman/ littman.html lor, peter johan. "preserving african digital resources: is there a role for repository libraries?" library management , no. ( ): - . http://www.up.ac.za/dspace/handle/ / lukesh, susan s. 
"e-mail and potential loss to future archives and scholarship or the dog that didn't bark." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ lynch, clifford. "canonicalization: a fundamental tool to facilitate preservation and management of digital information." d-lib magazine (september ). http://www.dlib.org/dlib/september / lynch.html ———. "the integrity of digital information: mechanics and definitional issues." journal of the american society for information science (december ): - . ———. "rethinking the integrity of the scholarly record in the networked information age." educom review (march/april ): - . http://net.educause.edu/apps/er/review/reviewarticles/ .html ———. "the role of digitization in building electronic collections: economic and programmatic choices." collection management , no. / ( ): - . lynn, m. stuart. "digital preservation and access: liberals and conservatives." syllabus (november/december ): - , - . lyons, susan. "preserving electronic government information: looking back and looking forward." the reference librarian , no. ( ): - . malinconico, s. michael. "digital preservation technologies and hybrid libraries." information services & use , no. ( ): - . marcum, deanna b. "the preservation of digital information." the journal of academic librarianship (november ): - . marcum, deanna, and amy friedlander. "keepers of the crumbling culture: what digital preservation can learn from library history." d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /friedlander/ friedlander.html martin, julia, and david coleman. "change the metaphor: the archive as an ecosystem." the journal of electronic publishing (april ). http://hdl.handle.net/ /spo. . . masaneès, julien. "towards continuous web archiving: first results and an agenda for the future." d-lib magazine (december ). http://www.dlib.org/dlib/december /masanes/ masanes.html ———, ed. web archiving. new york: springer, . mason, ingrid. 
"virtual preservation: how has digital culture influenced our ideas about permanence? changing practice in a national legal deposit library." library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / maxymuk, john. "preservation and metadata." the bottom line: managing library finances , no. ( ): - . mcdonald, robert h., and tyler o. walters. "restoring trust relationships within the framework of collaborative digital preservation federations." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ mcdonough, jerome, and mona jimenez. "video preservation and digital reformatting: pain and possibility." journal of archival organization , no. / ( ): - . mcgovern, nancy y. "a digital decade: where have we been and where are we going in digital preservation?" rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#article mcgovern, nancy y., anne r. kenney, richard entlich, william r. kehoe, and ellie buckley. "virtual remote control: building a preservation risk management toolbox for web resources." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /mcgovern/ mcgovern.html mcgovern, nancy y., and aprille c. mckay. "leveraging short-term opportunities to address long-term obligations: a perspective on institutional repositories and digital preservation programs." library trends , no. ( ): - . mellor, phil. "camileon: emulation and bbc domesday." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature meyer, lars. safeguarding collections at the dawn of the st century: describing roles & measuring contemporary preservation activities in arl libraries. washington, dc: arl, . http://www.arl.org/bm~doc/safeguarding-collections.pdf michaels, jan. "here today, gone tomorrow? why we should preserve electronic documents." in advances in preservation and access, vol. , ed. barbra buckner higginbotham, - . medford, nj: learned information, inc., . 
milne, ronald, and john tuck. "implementing e-legal deposit: a british library perspective." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /milne-tuck/ minor, david, don sutton, ardys kozbial, brad westbrook, michael burek, and michael smorul. "chronopolis digital preservation network." international journal of digital curation , no. ( ). moghaddam, golnessa galyani. "archiving challenges of scholarly electronic journals: how do publishers manage them?" serials review , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "preserving scientific electronic journals: a study of archiving initiatives." the electronic library , no. ( ): - . morris, sally. "archiving electronic publications: what are the problems and who should solve them?" serials review , no. ( ): - . morris, steven p. "the north carolina geospatial data archiving project: challenges and initial outcomes." journal of map & geography libraries , no. ( ): - . morrissey, sheila. "the economy of free and open source software in the preservation of digital artifacts." library hi tech , no. ( ): - . neavill, gordon b. "electronic publishing, libraries, and the survival of information." library resources & technical services (january/march ): - . neavill, gordon b., and mary ann sheble. "archiving electronic journals." serials review , no. ( ): - . nelson, michael l., johan bollen, giridhar manepalli, and rabia haq. "archive ingest and handling test: the old dominion university approach." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /nelson/ nelson.html nelson, michael l., frank mccown, joan a. smith, and martin klein. "using the web infrastructure to preserve web pages." international journal on digital libraries , no. ( ): - . noonan, daniel w., amy mccrory, and elizabeth l. black. "pdf/a: a viable addition to the preservation toolkit." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november / contents.html ockerbloom, john mark. "archiving and preserving pdf files." 
rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature oclc/rlg premis working group. implementing preservation repositories for digital materials: current practice and emerging trends in the cultural heritage community. dublin, oh: oclc online computer library, inc., . http://www.oclc.org/research/projects/pmwg/surveyreport.pdf o'donohue, kate, and rick j. block. "the accessing and archiving of electronic journals: challenges and implications within the library world." the serials librarian , no. / ( ): - . oltmans, erik, and nanda kol. "a comparison between migration and emulation in terms of costs." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article oltmans, erik, and adriaan lemmen. "the e-depot at the national library of the netherlands." serials , no. ( ): - . oltmans, erik, and hilde van wijngaarden. "the kb e-depot digital archiving policy." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / ovadia, steven. "the need to archive blog content." the serials librarian , no. ( ): - . pace, andrew k. "digital preservation: everything new is old again." computers in libraries (february ): - . http://www.infotoday.com/cilmag/feb /pace.htm park, eun g. "perspectives on access to electronic journals for long-term preservation." serials review , no. ( ): - . park, eun g., and ho nam choi. "korean electronic site license initiative: archiving of electronic journals." online information review , no. ( ): - . pennock, maureen. "supporting institutional digital preservation and asset management: a summary of the jisc dpam programme synthesis." new review of information networking , no. ( ): - . phillips, margaret e. "the national library of australia: ensuring long-term access to online publications." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . pinfield, stephen, and hamish james. "the digital preservation of e-prints." 
d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /pinfield/ pinfield.html pope, jackson, and philip beresford. "iipc web archiving toolset performance testing at the british library." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /pope-beresford/ pozo, nick del, andrew stawowczyk long, and david pearson. "'land of the lost': a discussion of what can be preserved through digital preservation." library hi tech , no. ( ): - . "preserving the digital archive." online & cd-rom review (october ): - . ras, marcel. "the kb e-depot: building and managing a safe place for e-journals." liber quarterly: the journal of european research libraries , no. ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf rauber, andreas, andreas aschenbrenner, oliver witvoet, robert m. bruckner, and max kaiser. "uncovering information hidden in web archives: a glimpse at web analysis building on data warehouses." d-lib magazine (december ). http://www.dlib.org/dlib/december /rauber/ rauber.html reich, vicky, and david s. h. rosenthal. "lockss (lots of copies keep stuff safe)." the new review of academic librarianship ( ): - . ———. "lockss: a permanent web publishing and access system." d-lib magazine (june ). http://www.dlib.org/dlib/june /reich/ reich.html ———. "lots of copies keep stuff safe as a cooperative archiving solution for e-journals." issues in science and technology librarianship (fall ). http://www.istl.org/ -fall/article .html reilly, bernard f., jr. "the library and the newsstand: thoughts on the economics of news preservation." journal of library administration , no. ( ): - . rhodes, sarah, and dana neacsu. "preserving and ensuring long-term access to digitally born legal information." information & communications technology law , no. ( ): - . rieger, oya y., and william r. kehoe. "enduring access to digitized books: organizational and technical framework." international journal of digital curation , no. ( ). 
http://www.ijdc.net/index.php/ijdc/article/view/ rlg diginews staff. "watch this space: ten promising digital preservation initiatives." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article rosenthal, david s. h. "bit preservation: a solved problem?" international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ ———. "format obsolescence: assessing the threat and the defenses." library hi tech , no. ( ): - . rosenthal, david s. h., thomas lipkis, thomas s. robertson, and seth morabito. "transparent format migration of preserved web content." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /rosenthal/ rosenthal.html rosenthal, david s. h., and vicky reich. "lockss, a permanent web publishing and access system: brief introduction and status report." serials (november ): - . rosenthal, david s. h., thomas robertson, tom lipkis, vicky reich, and seth morabito. "requirements for digital preservation systems: a bottom-up approach." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /rosenthal/ rosenthal.html ross, seamus. "the role of erpanet in supporting digital curation and preservation in europe." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /ross/ ross.html ross, seamus, and andrew mchugh. "audit and certification of digital repositories: creating a mandate for the digital curation centre (dcc)." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article rowe, richard r. "holding moonbeams: the challenge of preserving scientific knowledge." serials (november ): - . rumsey, sally, and ben o'steen. "oai-ore, preserv and digital preservation." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /rumsey-osteen/ rusbridge, adam, and seamus ross. "the uk lockss pilot programme: a perspective from the lockss technical support service." international journal of digital curation , no. ( ). 
http://www.ijdc.net/ijdc/article/view/ / rusbridge, chris. "excuse me. . . some digital preservation fallacies?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /rusbridge/ russell, ann. "training professionals to preserve digital heritage: the school for scanning." library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / russell, kelly. "cedars: long-term access and usability of digital resources." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /cedars/ ———. "digital preservation and the cedars project experience." the new review of academic librarianship ( ): - . russell, kelly, ellis weinberger, and andy stone. "preserving digital scholarship: the future is now." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art sanett, shelby. "toward developing a framework of cost elements for preserving authentic electronic records into perpetuity." college & research libraries (september ): - . seadle, michael. "a social model for archiving digital serials: lockss." serials review , no. ( ): - . selingo, jeffrey. "a new archive and internet search engine may change the nature of on-line research." the chronicle of higher education, march , a -a . senserini, alessandro, robert b. allen, gail hodge, nikkia anderson, and daniel smith, jr. "archiving and accessing web pages: the goddard library web capture process." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /hodge/ hodge.html seville, catherine, and ellis weinberger. "intellectual property rights lessons from the cedars project for digital preservation." the new review of academic librarianship ( ): - . shenton, helen. "from talking to doing: digital preservation at the british library." the new review of academic librarianship ( ): - . shirky, clay. "aiht: conceptual issues from practical tests." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /shirky/ shirky.html sierman, barbara. "the jigsaw puzzle of digital preservation—an overview." 
liber quarterly: the journal of european research libraries , no. ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf smith, abby. "digital preservation: an individual responsibility for communal scholarship." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf ———. "distributed preservation in a national context ndiipp at mid-point." d-lib magazine , no. ( ). http://www.dlib.org/dlib/june /smith/ smith.html smith, joan a., and michael l. nelson. "creating preservation-ready web resources." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /smith/ smith.html smith, mackenzie. "curating architectural d cad models." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ smith, mackenzie, and reagan w. moore. "digital archive policies and trusted digital repositories." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ spence, jaqueline. "preserving the cultural heritage: an investigation into the feasibility of the oais model for application in small organisations." aslib proceedings , no. ( ): - . spivey, catherine. "online archiving with the british library—the emerald experience." serials , no. ( ): - . stanescu, andreas. "assessing the durability of formats in a digital preservation environment: the inform methodology." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /stanescu/ stanescu.html ———. "assessing the durability of formats in a digital preservation environment: the inform methodology." oclc systems & services: international digital library perspectives , no. ( ): - . steenbakkers, johan f. "digital archiving: a necessary evil or new opportunity?" serials review , no. ( ): - . ———. "permanent archiving of electronic publications." serials , no. ( ): - . ———. "treasuring the digital records of science: archiving e-journals at the koninklijke bibliotheek." rlg diginews , no. ( ). 
http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article steinhart, gail, dianne dietrich, and ann green. "establishing trust in a chain of preservation: the trac checklist applied to a data staging repository (datastar)." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /steinhart/ steinhart.html task force on archiving of digital information. preserving digital information. mountain view, ca: commission on preservation and access and research libraries group. tennant, roy. "time is not on our side: the challenge of preserving digital materials." library journal, march , - . http://www.libraryjournal.com/article/ca .html teper, thomas h., and beth kraemer. "long-term retention of electronic theses and dissertations." college & research libraries (january ): - . terry, ana arias. "digital archiving: a work in progress." against the grain (june ): , , . thompson, dave. "a pragmatic approach to preferred file formats for acquisition." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /thompson/ trehub, aaron, and thomas c. wilson. "keeping it simple: the alabama digital preservation network (adpnet)." library hi tech , no. ( ): - . treloar, andrew, david groenewegen, and cathrine harboe-ree. "the data curation continuum: managing data objects in institutional repositories." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /treloar/ treloar.html van der werf-davelaar, titia. "long-term preservation of electronic publications: the nedlib project." d-lib magazine (september ). http://www.dlib.org/dlib/september /vanderwerf/ vanderwerf.html van nuys, carol. "the paradigma project." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature waibel, gunter. "like russian dolls: nesting standards for digital preservation." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature waller, martin, and robert sharpe. 
mind the gap: assessing digital preservation needs in the uk. heslington, uk: digital preservation coalition, . http://www.dpconline.org/docs/reports/uknamindthegap.pdf walters, tyler o. "data curation program development in u.s. universities: the georgia institute of technology example." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ walters, tyler o., and katherine skinner. "economics, sustainability, and the cooperative model in digital preservation." library hi tech , no. ( ): - . watry, paul. "digital preservation theory and application: transcontinental persistent archives testbed activity." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ / waters, donald j. "choices in digital archiving: the american experience." in the impact of electronic publishing on the academic community: an international workshop organized by the academia europaea and the wenner-gren foundation, ed. i. butterworth, - . london: portland press, . ———. "transforming libraries through digital preservation." collection management , no. / ( ): - . watson, jennifer. "towards a preserved national collection of selected australian digital publications." the new review of academic librarianship ( ): - . ———. "you get what you pay for? archival access to electronic journals." serials review , no. ( ): - . webb, colin. "saving digital heritage—a unesco campaign." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#feature ———. "towards a preserved national collection of selected australian digital publications." the new review of academic librarianship ( ): - . wiggins, richard. "digital preservation: paradox & promise." library journal, net connect (spring ): - . williamson, andrew. "awareness of quality assurance procedures in digital preservation." library review , no. ( ): - . http://eprints.cdlr.strath.ac.uk/ / ———. 
"strategies for managing digital content formats." library review , no. ( ): - . http://eprints.cdlr.strath.ac.uk/ / witt, michael, jacob carlson, d. scott brandt, and melissa h. cragin. "constructing data curation profiles." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ yakel, elizabeth. "digital curation." oclc systems & services: international digital library perspectives , no. ( ): - .

. library issues: general works

aiguo, li. "calis: acquiring electronic resources." library collections, acquisitions, & technical services , no. ( ): - . albanese, andrew richard. "moving from books to bytes." library journal, september , - . http://www.libraryjournal.com/article/ca .html albitz, rebecca s. "electronic resource librarians in academic libraries: a position announcement analysis, - ." portal: libraries and the academy , no. ( ): - . aldrich, duncan m., and greggory stefanelli. "library services for a digital future." educause quarterly , no. ( ): - . http://www.educause.edu/educause+quarterly/educausequarterlymagazinevolum/libraryservicesforadigitalfutu/ algenio, emilie r. "a how-to guide for electronic reserves; or, if i knew then what i know now." journal of interlibrary loan, document delivery & information supply , no. ( ): - . anderson, douglas. "allocation of costs for electronic products in academic library consortia." college & research libraries , no. ( ): - . appleton, leo. "perceptions of electronic library resources in further education." the electronic library , no. ( ): - . atkinson, ross. "managing traditional materials in an online environment: some definitions and distinctions for a future collection management." library resources & technical services (january ): - . ———. "networks, hypertext, and academic information services: some longer-range implications." college & research libraries (may ): - . ———. "toward a redefinition of library services." 
in virtually yours: models for managing electronic resources and services, ed. peggy johnson and bonnie macewan, - . chicago: american library association, . ———. "uses and abuses of cooperation in a digital age." collection management , no. / ( ): - . austin, brice. "a brief history of electronic reserves." journal of interlibrary loan, document delivery & information supply , no. ( ): - . bailey, charles w., jr. "bricks, bytes, or both? the probable impact of scholarly electronic publishing on library space needs." in information imagineering: meeting at the interface, ed. milton t. wolf, pat ensor, and mary augusta thomas, - . chicago: american library association, . http://www.digital-scholarship.org/cwb/bricks.htm bailey, charles w., jr., and dana rooks, eds. "symposium on the role of network-based electronic resources in scholarly communication and research." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /bailey . n baker, gayle, and carol tenopir. "managing the unmanageable: systematic downloading of electronic resources by library users." journal of library administration , no. / ( ): - . bakker, theodora a., and marcus a. banks. "scholarly communication initiatives at georgetown university: lessons learned." oclc systems & services: international digital library perspectives , no. ( ): - . ball, david. "public libraries and the consortium purchase of electronic resources." the electronic library , no. ( ): - . http://eprints.rclis.org/archive/ / bergman, barbara j. "looking at electronic resources librarians: is there gender equity within this emerging specialty?" new library world , no. / ( ): - . blecic, deborah d., joan b. fiscella, and stephen e. wiberley, jr. "measurement of use of electronic resources: advances in use statistics and innovations in resource functionality." college & research libraries , no. ( ): - . blummer, barbara. "opportunities for libraries with print-on-demand publishing." 
journal of access services , no. ( ): - . bracke, paul j. "access to remote electronic resources at the university of arizona." science & technology libraries , no. / ( ): - . branin, joseph, frances groen, and suzanne thorin. "the changing nature of collection management in research libraries." library resources & technical services (january ): - . bravo, blanca rodríguez, and maría luisa alvite díez. "survey of the providers of electronic publications holding contracts with spanish university libraries." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /alvite/ alvite.html brindley, lynne j. "the development of jisc strategy on electronic collections." library review , no. / ( ): - . ———. "joint funding councils' libraries review group (the 'follett') report: the contribution of the information technology sub-committee." program: electronic library and information systems , no. ( ): - . brooks, sam, and thomas j. dorst. "issues facing academic library consortia and perceptions of members of the illinois digital academic library." portal: libraries and the academy , no. ( ): - . butler, meredith. "electronic publishing and its impact on libraries: a literature review." library resources & technical services (january/march ): - . calhoun, karen. "from information gateway to digital library management system: a case analysis." library collections, acquisitions, & technical services , no. ( ): - . campbell, jerry d. "changing a cultural icon: the academic library as a virtual destination." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf casserly, mary. "developing a concept of collection for the digital age." portal: libraries and the academy (october ): - . caudle, dana m., and cecilia m. schmitz. "web access to electronic journals and databases in arl libraries." journal of web librarianship , no. ( ): - . chand, prem, and jagdish arora. 
"access to scholarly communication in higher education in india: trends in usage statistics via inflibnet." program: electronic library and information systems , no. ( ): - . chunrong, luo, wang jingfen, and zhou zhinong. "regional consortia for e-resources: a case study of deals in the south china region." program: electronic library and information systems , no. ( ): - . chrzastowski, tina e. "electronic reserves in the science library: tips, techniques, and user perceptions." science & technology libraries , no. / ( ): - . cline, nancy m. "virtual continuity: the challenge for research libraries today." educause review (may/june ): - . http://www.educause.edu/ir/library/pdf/erm .pdf cochrane, lynn scott. "if the academic library ceased to exist, would we have to invent it?" educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf cole, louise. "making the invisible visible: bringing e-resources to a wide audience." serials , no. ( ): - . conyers, angela. "e-measures: developing statistical measures for electronic information services." vine , no. ( ): - . covey, denise troll. usage and usability assessment: library practices and concerns. washington, dc: digital library federation, council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf crawford, walt. being analog: creating tomorrow's libraries. chicago: american library association, . crawford, walt, and michael gorman. future libraries: dreams, madness, and reality. chicago: american library association, . croneis, karen s., and pat henderson. "electronic and digital librarian positions: a content analysis of announcements from through ." the journal of academic librarianship , no. ( ): - . cummings, anthony m., marcia l. witte, william g. bowen, laura o. lazarus, and richard h. ekman. university libraries and scholarly communication: a study prepared for the andrew w. mellon foundation. washington, dc: association of research libraries, . 
http://etext.lib.virginia.edu/subjects/mellon/ davenport, nancy. "place as library?" educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf davis, trisha l. "the evolution of selection activities for electronic resources." library trends (winter ): - . http://hdl.handle.net/ / delsey, tom. "the national library's role in facilitating scholarly communications." canadian journal of communication , no. / ( ): - . http://www.cjc-online.ca/index.php/journal/article/view/ / dorner, daniel g. "the impact of digital information resources on the roles of collection managers in research libraries." library collections, acquisitions, & technical services , no. ( ): - . dunne, siobhán. "the irish research electronic library initiative: levelling the playing-field?" library management , no. / ( ): - . duy, joanna, and liwen vaughan. "usage data for electronic resources: a comparison between locally collected and vendor-provided statistics." the journal of academic librarianship , no. ( ): - . ekman, richard h., and richard e. quandt. "scholarly communication, academic libraries, and technology." change (january/february ): - . esterhazy, jonathan. "providing authenticated access to web resources." college & research news (september ): - . farrell, elizabeth f., and florence olsen. "a new front in the sweatshop wars?" the chronicle of higher education, october , a -a . faulkner, lila a., and karla l. hahn. "selecting electronic publications: the development of a genre statement." issues in science and technology librarianship, no. ( ). http://www.library.ucsb.edu/istl/ -spring/article .html feeney, m., and jill newby. "model for presenting resources in scholar's portal." portal: libraries and the academy , no. ( ): - . ferullo, donna l. "the challenge of e-reserves." netconnect (summer ): - . http://www.libraryjournal.com/article/ca .html fisher, william. "the electronic resources librarian position: a public services phenomenon?" 
library collections, acquisitions, and technical services , no. ( ): - . fons, theodore a., and timothy d. jewell. "envisioning the future of erm systems." the serials librarian , no. / ( ): - . forsythe, kathleen, and steve shadle. "university of washington libraries digital registry." journal of internet cataloging , no. ( ): - . french, beverlee. "the economics and management of digital resources in a multi-campus, multi-library university: the shared digital collection." collection management , no. / ( ): - . friend, frederick j. "uk access to uk research." serials: the journal for the serials community , no. ( ): - . http://eprints.ucl.ac.uk/ / frumkin, jeremy. "the wiki and the digital library." oclc systems & services: international digital library perspectives , no. ( ): - . galloway, laura. "innovative interfaces' electronic resource management as a catalyst for change at glasgow university library." the serials librarian , no. ( ): - . gandel, paul b. "libraries: standing at the wrong platform, waiting for the wrong train?" educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf gherman, paul m. "found money: trading infrastructure for information." journal of library administration , no. ( ): - . goerwitz, richard. "pass-through proxying as a solution to the off-site web-access problem." d-lib magazine (june ). http://www.dlib.org/dlib/june /stg/ goerwitz.html goodram, richard j. "the e-rbr: confirming the technology and exploring the law of 'electronic reserves': two generations of the digital library system at the sdsu library." the journal of academic librarianship , no. ( ): - . grace, claire, and alison bremner. "getting the value from evaluation: where to get the data and what you can do with it." vine , no. ( ): - . gray, edward, and anne langley. "public services and electronic resources: perspectives from the science and engineering libraries at duke university." issues in science & technology librarianship, no. ( ). 
http://www.istl.org/ -summer/article .html greenstein, daniel. "research libraries' costs of doing business (and strategies for avoiding them)." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf grycz, czeslaw jan. "resource sharing in the systematic context of scholarly communication." library trends (winter ): - . http://hdl.handle.net/ / harper, tim, and barbara p. norelli. "the business of collaboration and electronic collection development." collection building , no. ( ): - . hartland-fox, rebecca, and stella thebridge. "electronic information services evaluation: current activity and issues in uk academic libraries." serials , no. ( ): - . hartnett, eric, apryl price, jane smith, and michael barrett. "opening a can of werms: texas a&m university's experiences in implementing two electronic resource management systems." journal of electronic resources librarianship , no. / ( ): - . hecker, thomas e. "the post-petroleum future of academic libraries." journal of scholarly publishing , no. ( ): - . hendricks, arthur. "sushi, not just a tasty lunch anymore: the development of the niso committee su's sushi standard." library hi tech , no. ( ): - . hilton, christopher, and dave thompson. "further experiences in collecting born digital archives at the wellcome library." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /hilton-thompson/ hiremath, uma. "electronic consortia: resource sharing in the digital age." collection building , no. ( ): - . hulseberg, anna, and sarah monson. "strategic planning for electronic resources management: a case study at gustavus adolphus college." journal of electronic resources librarianship , no. ( ): - . intner, sheila s. "impact of the internet on collection development: where are we now? where are we headed? an informal study." library collections, acquisitions, & technical services , no. ( ): - . jaffe, lee david. "the information revolution is over." serials review , no. ( ): - . jasper, richard p. 
"collaborative roles in managing electronic publications." library collections, acquisitions, & technical services , no. ( ): - . jewell, timothy d. selection and presentation of commercially available electronic resources: issues and practices. washington, dc: digital library federation, council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf jewell, timothy d., ivy anderson, adam chandler, sharon e. farb, kimberly parker, angela riggio, and nathan d. m. robertson. electronic resource management: the report of the dlf initiative. washington, dc: digital library federation, . http://www.diglib.org/pubs/dlfermi / joint, nicholas. "choosing between print or digital collection building in times of financial constraint." library review , no. ( ): - . ———. "digital libraries and the future of the library profession." library review , no. ( ): - . kaufman, paula. "whose good old days are these? a dozen predictions for the digital age." journal of library administration , no. ( ): - . keller, michael a., victoria a. reich, and andrew c. herkovic. "what is a library anymore, anyway?" first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ kellerman, l. suzanne. "out-of-print digital scanning: an acquisitions and preservation alternative." library resources & technical services (january ): - . kennedy, marie r. "dreams of perfect programs: managing the acquisition of electronic resources." library collections, acquisitions, & technical services , no. ( ): - . kibbey, mark, and nancy h. evans. "the network is the library." educom review (fall ): - . kidd, tony. "collaboration in electronic resource provision in university libraries: shedl, a scottish case study." new review of academic librarianship , no. ( ): - . koehler, barbara m., and nancy k. roderer. "scholarly communications program: force for change." biomedical digital libraries (article ). 
http://www.bio-diglib.com/content/ / / koehler, wallace. "digital libraries, digital containers, 'library patrons,' and visions for the future." the electronic library , no. ( ): - . koehn, shona l., and suliman hawamdeh. "the acquisition and management of electronic resources: can use justify cost?" the library quarterly , no. ( ): - . krieb, dennis. "you can't get there from here: issues in remote access to electronic journals for a health sciences library." issues in science and technology librarianship (spring ). http://www.library.ucsb.edu/istl/ -spring/article .html landes, sonja. "electronic reserves at milne library suny geneseo." journal of interlibrary loan, document delivery & information supply , no. ( ): - . landesman, margaret, and johann van reenen. "consortia vs. reform: creating congruence." the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . lee, hur-li. "what is a collection?" journal of the american society for information science (october ): - . lee, stuart d., and frances boyle. building an electronic resource collection: a practical guide. london: facet publishing, . leggate, peter. "acquiring electronic products in the hybrid library: prices, licenses, platforms and users." serials (july ): - . lewis, david w. "inventing the electronic university." college & research libraries (july ): - . http://hdl.handle.net/ / ———. "what if libraries are artifact-bound institutions?" information technology and libraries (december ): - . http://hdl.handle.net/ / linberger, peter, lori jean fielding, and frank j. bove. "developing a web-based evaluation tool for purchasing electronic resources: a librarian-faculty-student partnership." electronic journal of academic and special librarianship , no. ( ). http://southernlibrarianship.icaap.org/content/v n /linberger_p .html lingle, virginia a., and cynthia k. robinson. "conversion of an academic health sciences library to a near-total electronic library: part ." 
journal of electronic resources in medical libraries , no. ( ): - . ———. "conversion of an academic health sciences library to a near-total electronic library: part ." journal of electronic resources in medical libraries , no. ( ): - . liu, guoying. "erm system implementation in a consortium environment." library management , no. / ( ): - . lougee, wendy p. "beyond access: new concepts, new tensions for collection development in a digital environment." collection building , no. ( ): - . ———. diffuse libraries: emergent roles for the research library in the digital age. washington, dc: council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf loughry, patricia, and amy w. shannon. "managing selection and implementation of electronic products: one tiny step in organization, one giant step for the university of nevada, reno." serials review , no. ( ): - . lynch, clifford. "from automation to transformation: forty years of libraries and information technology in higher education." educause review (january/february ): - . http://www.educause.edu/ir/library/pdf/erm .pdf ———. "life after graduation day: beyond the academy's digital walls." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf ———. "the technological framework for library planning in the next decade." new directions for higher education, no. ( ): - . ———. "the transformation of scholarly communication and the role of the library in the age of networked information." the serials librarian , no. / ( ): - . lynden, frederick c. "budgeting for collection development in the electronic environment." journal of library administration , no. ( ): - . ———. "tradeoffs or not: the pros and cons of the electronic library." collection management , no. ( ): - . manoff, marlene. "hybridity, mutability, multiplicity: theorizing electronic library collections." library trends (summer ): - . http://hdl.handle.net/ / ———. "revolutionary or regressive? 
the politics of electronic collection development." in scholarly publishing: the electronic frontier, ed. robin p. peek and gregory b. newby, - . cambridge, ma: the mit press, . ———. "the symbolic meaning of libraries in a digital age." portal: libraries and the academy , no. ( ): - . maple, amanda. "online music services and academic libraries." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr music.pdf marcum, deanna b. "research questions for the digital era library." library trends , no. ( ): - . http://hdl.handle.net/ / markuson, barbara evans, and elaine w. woods, eds. networks for networkers ii: critical issues for libraries in the national network environment. new york: neal-schuman publishers, inc., . martin, robert sidney, ed. scholarly communication in an electronic environment: issues for research libraries. chicago: rare books and manuscript section, association of college and research libraries, american library association, . mcclure, charles r. "strategies for collecting networked statistics: practical suggestions." vine , no. ( ): - . mcdonald, jenny, and adrienne kebbell. "access in an increasingly digital world." the electronic library , no. ( ): - . mcgrath, mike. "interlending and document supply: a review of recent literature—xliv." interlending & document supply , no. ( ): - . mcmillan, gail. "librarians as publishers: is the digital library an electronic publisher?" college & research libraries news (november ): - . mcnicol, sarah. "the evalued toolkit: a framework for the qualitative evaluation of electronic information services." vine , no. ( ): - . mercer, linda s. "measuring the use and value of electronic journals and books." issues in science and technology librarianship (winter ). http://www.library.ucsb.edu/istl/ -winter/article .html metz, paul. "principles of selection for electronic resources." library trends (spring ): - . 
http://hdl.handle.net/ / ———. "the view from a university library." change (january/february ): - . miller, paul. "web . : building the new library." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /miller/ mirsky, phyllis s. "the university of california's collection development collaboration: a campus perspective." collection management , no. / ( ): - . missingham, roxanne. "electronic resources australia: a national approach to purchasing." library management , no. / ( ): - . molyneux, robert. "counter, xml, and online serials measurement." against the grain , no. ( ): - . morrisey, locke. "data-driven decision making in electronic collection development." journal of library administration , no. ( ): - . murdock, dawn. "relevance of electronic resource management systems to hiring practices for electronic resources personnel." library collections, acquisitions, and technical services , no. ( ): - . neal, james g. "chaos breeds life: finding opportunities for library advancement during a period of collection schizophrenia." journal of library administration , no. ( ): - . nicholas, david, paul huntington, hamid r. jamali, and carol tenopir. "what deep log analysis tells us about the impact of big deals: case study ohiolink." journal of documentation , no. ( ): - . okerson, ann. "are we there yet? online e-resources ten years after." library trends (spring ): - . http://hdl.handle.net/ / packer, donna. "acquisitions allocations: fairness, equity and bundled pricing." portal: libraries and the academy , no. ( ): - . pesch, oliver. "sushi: simplifying the delivery of usage statistics." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art peters, thomas a. "what's the use? the value of e-resource usage statistics." new library world , no. / ( ): - . petrick, joseph. "the electronic library: responses from the state university of new york (suny)." oclc systems & services , no. ( ): - . pilston, anna klump, and richard l. hart. 
"student response to a new electronic reserves system." the journal of academic librarianship , no. ( ): - . pomerantz, jeffrey, and gary marchionini. "the digital library as place." journal of documentation , no. ( ): - . pomerantz, sarah b. "the role of the acquisitions librarian in electronic resources management." journal of electronic resources librarianship , no. / ( ): - . ruth, lisa boxill. "license mapping for erm systems: existing practices and initiatives for support." serials review , no. ( ): - . sadeh, tamar, and mark ellingsen. "electronic resource management systems: the need and the realization." new library world , no. / ( ): - . samson, sue, sebastian derry, and holly eggleston. "networked resources, assessment and collection development." the journal of academic librarianship , no. ( ): - . sánchez vignau, bárbara susana, and ileana lourdes presno quesada. "collection development in a digital environment: an imperative for information organizations in the twenty-first century." collection building , no. ( ): - . sapp, gregg, and ron gilmour. "a brief history of the future of academic libraries: predictions and speculations from the literature of the profession, to —part one, to ." portal: libraries and the academy , no. ( ): - . ———. "a brief history of the future of academic libraries: predictions and speculations from the literature of the profession, to —part two, to ." portal: libraries and the academy , no. ( ): - . sauer, anne. "why archivists should be leaders in scholarly communication." journal of archival organization , no. / ( ): - . schäffler, hildegard. "how to organise the digital library: reengineering and change management in the bayerische staatsbibliothek, munich." library hi tech , no. ( ): - . schatzle, chad. "a proposed solution to the scholarly communications crisis." journal of access services , no. ( ): - . schneider, anette. "a nationwide solution for the management of electronic resources." serials , no. ( ): - . 
schonfeld, roger c., and kevin m. guthrie. "the changing information services needs of faculty." educause review , no. ( ): - . http://www.educause.edu/apps/er/erm /erm .asp secker, jane, martin oliver, and martin reid. "electronic reserves at university college london: understanding the needs of academic departments." journal of interlibrary loan, document delivery & information supply , no. ( ): - . sellen, mary, and brenda hazard. "user assessment of electronic reserves and implications for digital libraries." journal of interlibrary loan, document delivery & information supply , no. ( ): - . shepherd, peter t. "counter: from conception to compliance." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "counter: towards reliable vendor usage statistics." vine , no. ( ): - . ———. "counter : a new code of practice and new applications of counter usage statistics." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "information counting gathers pace." research information (october/november ). http://www.researchinformation.info/rioctnov counter.html ———. "keeping count." library journal, february , - . http://www.libraryjournal.com/article/ca .html ———. "project counter: a new international initiative to provide online usage statistics that are credible, compatible and consistent." serials (july ): - . shepherd, peter t., and denise m. davis. "electronic metrics, performance measures, and statistics for publishers and libraries: building common ground and standards." portal: libraries and the academy , no. ( ): - . shim, wonsik, and charles r. mcclure. "data needs and use of electronic resources and services at academic research libraries." portal: libraries and the academy , no. ( ): - . singh, s. p. "collection management in the electronic environment." the bottom line: managing library finances , no. ( ): - . 
skaggs, bethany latham, jodi welch poe, and kimberly weatherford stevens. "one-stop shopping: a perspective on the evolution of electronic resources management." oclc systems & services: international digital library perspectives , no. ( ): - . stachokas, george. "electronic resources and mission creep: reorganizing the library for the twenty-first century." journal of electronic resources librarianship , no. / ( ): - . stewart, lou ann. "choosing between print and electronic resources: the selection dilemma." the reference librarian, no. ( ): - . suber, peter. "california against nature." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#uc-np g sykes, phil. "on-demand publishing in the humanities: project." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art thomas, sarah e. "publishing solutions for contemporary scholars: the library as innovator and partner." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / törmä, sanna, and pertti vakkari. "discipline, availability of electronic resources and the use of finnish national electronic library—finelib." information research , no. ( ). http://informationr.net/ir/ - /paper .html town, stephen. "e-measures: a comprehensive waste of time?" vine , no. ( ): - . http://hdl.handle.net/ / treloar, andrew. "rethinking the library's role in publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art waaijers, leo. "from libraries to 'libratories.'" first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ watts, louise. "document supply: the evolving needs of the library." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art white, gary w. "collaborative collection building of electronic resources: a business faculty/librarian partnership." collection building , no. ( ): - . zhang, yvonne w. "measurement and assessment of networked resources and services in academic libraries." the serials librarian , no. ( ): - .

library issues: metadata and linking

allinson, julie. "describing scholarly works with dublin core: a functional approach." library trends , no. ( ): - . allinson, julie, pete johnston, and andy powell. "a dublin core application profile for scholarly works." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /allinson-et-al/ anderson, bill, and les hawkins. "development of conser cataloging policies for remote access computer file serials." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /ande n .html apps, ann, and ross macintyre. "why openurl?" d-lib magazine , no. ( ). http://www.dlib.org/dlib/may /apps/ apps.html arms, william et al. "uniform resource names: a progress report." d-lib magazine (february ). http://www.dlib.org/dlib/february / arms.html armstrong, chris. "metadata, pics and quality." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /pics/ atkins, helen, catherine lyons, howard ratner, carol risher, chris shillum, david sidman, and andrew stevens. "reference linking with dois: a case study." d-lib magazine (february ). http://www.dlib.org/dlib/february / risher.html baker, gayle, and eleanor j. read. "vendor-supplied usage data for electronic resources: a survey of academic libraries." learned publishing , no. ( ): - . baker, thomas. "a grammar of dublin core." d-lib magazine (october ). http://www.dlib.org/dlib/october /baker/ baker.html ———. "languages for dublin core." d-lib magazine (december ). http://www.dlib.org/dlib/december / baker.html baker, thomas, and makx dekkers. "identifying metadata elements with uris: the cores resolution." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /baker/ baker.html banski, erika. "implementation of dublin core at the university of alberta libraries." oclc systems & services , no. ( ): - . beall, jeffrey. "the death of metadata."
the serials librarian , no. ( ): - . ———. "free books: loading brief marc records for open-access books in an academic library catalog." cataloging & classification quarterly , no. / ( ): - . ———. "geographical research and the problem of variant place names in digitized books and other full-text resources." library collections, acquisitions, and technical services , no. / ( ): - . beam, joan t., and nora s. copeland. "electronic resources in union catalogs: urls and accessibility issues." serials review , no. / ( ): - . bearman, david, eric miller, godfrey rust, jennifer trant, and stuart weibel. "a common model to support interoperable metadata: progress report on reconciling metadata requirements from the dublin core and indecs/doi communities." d-lib magazine (january ). http://www.dlib.org/dlib/january /bearman/ bearman.html beisler, amalia, and glee willis. "beyond theory: preparing dublin core metadata for oai-pmh harvesting." journal of library metadata , no. / ( ): - . belanger, jacqueline. "cataloguing e-books in uk higher education libraries: report of a survey." program: electronic library and information systems , no. ( ): - . beit-arie, oren, miriam blake, priscilla caplan, dale flecker, tim ingoldsby, laurence w. lannom, william h. mischo, edward pentz, sally rogers, and herbert van de sompel. "linking to the appropriate copy: report of a doi-based prototype." d-lib magazine (september ). http://www.dlib.org/dlib/september /caplan/ caplan.html blake, miriam e., and frances l. knudson. "metadata and reference linking." library collections, acquisitions, & technical services , no. ( ): - . blanchi, christophe, and jason petrone. "distributed interoperable metadata registry." d-lib magazine (december ). http://www.dlib.org/dlib/december /blanchi/ blanchi.html boock, michael, and sue kunda. "electronic thesis and dissertation metadata workflow at oregon state university libraries." cataloging & classification quarterly , no. / ( ): - . 
boydston, jeanne m. k., and joan m. leysen. "internet resources cataloging in arl libraries: staffing and access issues." the serials librarian , no. / ( ): - . brand, amy. "crossref turns one." d-lib magazine (may ). http://www.dlib.org/dlib/may /brand/ brand.html ———. "publishers joining forces through crossref." serials review , no. ( ): - . brownlee, rowan. "research data and repository metadata: policy and technical issues at the university of sydney library." cataloging & classification quarterly , no. / ( ): - . bueno-de-la-fuente, gema, tony hernández-pérez, david rodríguez-mateos, eva m. méndez-rodríguez, and bonifacio martín-galán. "study on the use of metadata for digital learning objects in university institutional repositories (moderi)." cataloging & classification quarterly , no. / ( ): - . burk, alan, muhammad al-digeil, dominic forest, and jennifer whitney. "new possibilities for metadata creation in an institutional repository context." oclc systems & services: international digital library perspectives , no. ( ): - . burke, gerald, carol anne germain, and mary k. van ullen. "urls in the opac: integrating or disintegrating research libraries' catalogs." the journal of academic librarianship , no. ( ): - . burnett, kathleen, kwong bor ng, and soyeon park. "a comparison of the two traditions of metadata development." journal of the american society for information science , no. ( ): - . byrne, gillian, and lisa goddard. "the strongest link: libraries and linked data." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november / contents.html calanag, maria luisa, koichi tabata, and shigeo sugimoto. "linking preservation metadata and collection management policies." collection building , no. ( ): - . cameron, robert d. "bibliographic protocol: fine-grained integration of library services with the web." the serials librarian , no. / ( ): - . caplan, priscilla. "cataloging internet resources." the public-access computer systems review , no. ( ): - . 
http://epress.lib.uh.edu/pr/v /n /caplan. n ———. "controlling e-journals: the internet resources project, cataloging guidelines, and usmarc." the serials librarian , no. / ( ): - . ———. "a lesson in linking." library journal net connect (fall ): - . ———. metadata fundamentals for all librarians. chicago : american library association, . ———. "reference linking for journal articles: promise, progress, and perils." portal: libraries and the academy , no. ( ): - . ———. "to hel(sinki) and back for the dublin core." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /capl n .html ———. understanding premis. washington, dc: library of congress network development and marc standards office, . http://www.loc.gov/standards/premis/understanding-premis.pdf ———. "u-r-stars: standards for controlling internet resources." the serials librarian , no. / ( ): - . ———. "you call it corn, we call it syntax-independent metadata for document-like objects." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /capl n .html caplan, priscilla, and william y. arms. "reference linking for journal articles." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /caplan/ caplan.html caplan, priscilla, and rebecca guenther. "metadata for internet resources: the dublin core metadata elements set and its mapping to usmarc." cataloging & classification quarterly , no. / ( ): - . caplan, priscilla, and stephanie haas. "metadata rematrixed: merging museum and library boundaries." library hi tech , no. ( ): - . chan, lois mai, and marcia lei zeng. "metadata interoperability and standardization—a study of methodology part i: achieving interoperability at the schema level." d-lib magazine , no. ( ). http://www.dlib.org/dlib/june /chan/ chan.html chandler, adam, and elaine l. westbrooks. "distributing non-marc metadata: the cugir metadata sharing project." library collections, acquisitions, & technical services , no. 
( ): - . chandrakar, rajesh. "digital object identifier system: an overview." the electronic library , no. ( ): - . chapman, ann. "rda: a new international standard." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /chapman/ chapman, john w., david reynolds, and sarah a. shreeves. "repository metadata: approaches and challenges." cataloging & classification quarterly , no. / ( ): - . chaudhri, talat. "assessing frbr in dublin core application profiles." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /chaudhri/ chaudhri, talat, julian cheal, richard jones, mahendra mahey, and emma tonkin. "towards a toolkit for implementing application profiles." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /chaudhri-et-al/ chaudhry, abdus sattar, and makeswary periasamy. "a study of current practices of selected libraries in cataloguing electronic journals." library review , no. ( ): - . chen-gaffey, aiping. "marc standards and opac display of records for web-based resources." the serials librarian , no. ( ): - . chilvers, alison. "the super-metadata framework for managing long-term access to digital data objects: a possible way forward with specific reference to the uk." journal of documentation , no. ( ): - . chudnov, daniel, richard cameron, jeremy frumkin, ross singer, and raymond yee. "opening up openurls with autodiscovery." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /chudnov/ clair, kevin. "developing an audiovisual metadata application profile: a case study." library collections, acquisitions, and technical services , no. ( ): - . coburn, erin, elisa lanzi, elizabeth o'keefe, regine stein, and ann whiteside. "the cataloging cultural objects experience: codifying practice for the cultural heritage community." ifla journal , no. ( ): - . colati, jessica branco, robin dea, and keith maull. "describing digital objects: a tale of compromise." cataloging & classification quarterly , no. / ( ): - . cole, timothy w. 
"qualified dublin core metadata for online journal articles." oclc systems & services , no. ( ): - . colleran, edward. "digital object identifiers: just an idea or an innovation of value?" against the grain , no. ( ): - . cook, anita, and thomas dowling. "linking from index to primary source: the ohiolink model." the journal of academic librarianship , no. ( ): - . copeland, ann. "e-serials cataloging in the s: a review of the literature." the serials librarian , no. / ( ): - . ———. "works and digital resources in the catalog: electronic versions of book of urizen, the kelmscott chaucer and robinson crusoe." cataloging & classification quarterly , no. / ( ): - . costanza, jane, r. cecilia knight, and hsianghui liu-spencer. "metadata implementation for building cross-institutional repositories: lessons learned from the liberal arts scholarly repository (lasr)." journal of library metadata , no. / ( ): - . coyle, karen. "understanding metadata and its purpose." the journal of academic librarianship , no. ( ): - . cromwell-kessler, willy. "dublin core metadata in the rlg information landscape." d-lib magazine (december ). http://www.dlib.org/dlib/december / cromwell-kessler.html culling, james. "link resolvers and the serials supply chain: a research project sponsored by uksg." serials: the journal for the serials community , no. ( ): - . cummings, joel, and ryan johnson. "the use and usability of sfx: context-sensitive reference linking." library hi tech , no. ( ): - . curran, mary. "step one in formalizing the rules in aacr for cataloguing e-serials: chapter and the anglo-american cataloguing rules amendments package." the serials librarian , no. ( ): - . daniel, ron, jr., and carl lagoze. "extending the warwick framework: from metadata containers to active digital objects." d-lib magazine (november ). http://www.dlib.org/dlib/november /daniel/ daniel.html dappert, angela, and markus enders. "using mets, premis and mods for archiving ejournals." d-lib magazine , no. 
/ ( ). http://www.dlib.org/dlib/september /dappert/ dappert.html day, michael. "metadata for digital preservation: an update." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /metadata/ day, michael, rachel heery, and andy powell. "national bibliographic records in the digital information environment: metadata, links and standards." the journal of documentation (january ): - . dekkers, makx, and stuart l. weibel. "dublin core metadata initiative progress report and workplan for ." d-lib magazine , no. ( ). http://www.dlib.org/dlib/february /weibel/ weibel.html ———. "state of the dublin core metadata initiative, april ." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /weibel/ weibel.html dempsey, lorcan, and rachel heery. "metadata: a current view of practice and issues." the journal of documentation (march ): - . http://www.ukoln.ac.uk/metadata/publications/jdmetadata/ dempsey, lorcan, and stuart l. weibel. "the warwick metadata workshop: a framework for the deployment of resource description." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / weibel.html deng, sai. "optimizing workflow through metadata repurposing and batch processing." journal of library metadata , no. ( ): - . dillon, martin, and erik jul. "cataloging internet resources: the convergence of libraries and internet resources." cataloging & classification quarterly , no. / ( ): - . doyle, mark. "pragmatic citing and linking in electronic scholarly publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art dulock, michael, and christopher cronin. "providing metadata for compound digital objects: strategic planning for an institution's first use of mets, mods, and mix." journal of library metadata , no. / ( ): - . dunsire, gordon. "distinguishing content from carrier: the rda/onix framework for resource categorization." d-lib magazine , no. / ( ). 
http://www.dlib.org/dlib/january /dunsire/ dunsire.html duval, erik, wayne hodgins, stuart sutton, and stuart l. weibel. "metadata principles and practicalities." d-lib magazine (april ). http://www.dlib.org/dlib/april /weibel/ weibel.html elings, mary w., and günter waibel. "metadata for all: descriptive standards and metadata sharing across libraries, archives and museums." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ el-sherbini, magda. "metadata and the future of cataloging." library computing , no. / ( ): - . emtage, alan. "the why and what of urls and urns." serials review , no. ( ): - . flannery, melinda reagor. "cataloging internet resources." bulletin of the medical library association (april ): - . http://www.pubmedcentral.nih.gov/picrender.fcgi?artid= &blobtype=pdf foulonneau, muriel, and jenn riley. metadata for digital resources: implementation, systems design and interoperability. oxford: chandos publishing, . fox, robert. "cataloging our information architecture." oclc systems & services: international digital library perspectives , no. ( ): - . geller, marilyn. "a better mousetrap is still a mousetrap." serials review , no. ( ): - . gerhard, kristin h. "cataloging internet resources: practical issues and concerns." the serials librarian , no. / ( ): - . gerrity, bob, theresa lyman, and ed tallent. "blurring services and resources: boston college's implementation of metalib and sfx." reference services review , no. ( ): - . http://escholarship.bc.edu/library_pubs/ / godby, carol jean. "what do application profiles reveal about the learning object metadata standard?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /godby/ godby, carol jean, jeffrey a. young, and eric childress. "a repository of metadata crosswalks." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /godby/ godby.html goldsmith, beth, and frances knudson. 
"repository librarian and the next crusade: the search for a common standard for digital repository metadata." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /goldsmith/ goldsmith.html gorman, michael. "metadata dreaming: the keynote speech at the canadian metadata forum, september ." the serials librarian , no. ( ): - . ———. "metadata or cataloging? a false choice." journal of internet cataloging , no. ( ): - . graham, crystal, and rebecca ringler. "hermaphrodites & herrings." serials review , no. ( ): - . gravett, karen. "the cataloguing of e-books at the university of surrey." serials: the journal for the serials community , no. ( ): - . green, brian, and liam earney. "the importance of linking electronic resources and their licence terms: a project to implement onix for licensing terms for uk academic institutions." serials: the journal for the serials community , no. ( ): - . greenberg, jane. "metadata extraction and harvesting: a comparison of two automatic metadata generation applications." journal of internet cataloging , no. ( ): - . http://www.ils.unc.edu/mrc/pdf/greenberg metadata.pdf ———. "metadata generation: processes, people and tools." bulletin of the american society for information science and technology (december/january ): - . http://www.asis.org/bulletin/dec- /greenberg.html ———. "theoretical considerations of lifecycle modeling: an analysis of the dryad repository demonstrating automatic metadata propagation, inheritance, and value system adoption." cataloging & classification quarterly , no. / ( ): - . greenberg, jane, hollie c. white, sarah carrier, and ryan scherle. "a metadata best practice for a scientific data repository." journal of library metadata , no. / ( ): - . grenci, mary. "the impact of web publishing on the organization of cataloging functions." library collections, acquisitions, & technical services , no. ( ): - . https://scholarsbank.uoregon.edu/dspace/handle/ / grenier, gerry. 
"stm x-ref: a link service for publishers and readers." the serials librarian , no. / ( ): - . grogg, jill e. "land of linking." the serials librarian , no. ( ): - . grogg, jill e., debra k. andreadis, and rachel a. kirk. "full-text linking: affiliated versus nonaffiliated access in a free database." college & research libraries (may ): - . guenther, rebecca s. "mods: the metadata object description schema." portal: libraries and the academy , no. ( ): - . guenther, rebecca, and sally mccallum. "new metadata standards for digital resources: mods and mets." bulletin of the american society for information science and technology (december/january ): - . http://www.asis.org/bulletin/dec- /guenthermccallum.html guenther, rebecca, and leslie myrick. "archiving web sites for preservation and access: mods, mets and minerva." journal of archival organization , no. / ( ): - . guinchard, carolyn. "dublin core use in libraries: a survey." oclc systems & services , no. ( ): - . hakala, juha. "linking articles and bibliographic records with uniform resource names." the serials librarian , no. / ( ): - . ———. "the seven levels of identification: an overview of the current state of identifying objects within digital libraries." program: electronic library and information systems , no. ( ): - . han, myung-ja, christine cho, timothy w. cole, and amy s. jackson. "metadata for special collections in contentdm: how to improve interoperability of unique fields through oai-pmh." journal of library metadata , no. / ( ): - . hawkins, les. "cataloging web-based integrating resources." serials review , no. / ( ): - . ———. "refinement of cataloging tools." serials review , no. ( ): - . ———. "serials published on the world wide web: cataloging problems and decisions." the serials librarian , no. / ( ): - . hawthorne, dalene. "administrative metadata to support the acquisition of continuing e-resources." serials review , no. ( ): - . hayes, allene, and carolyn larson. 
"use of oclc's corc at the library of congress." serials review , no. ( ): - . heery, rachel. "naming names: metadata registries." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /metadata/ hellman, eric. "openurl: making the link to libraries." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art henderson, helen. "institutional identifiers and the journal supply chain efficiency improvement pilot." serials: the journal for the serials community , no. ( ): - . hendricks, arthur. "the development of the niso committee ax's openurl standard." information technology and libraries , no. ( ): - . herrera, gail, and lynda aldana. "integrating electronic resources into the library catalog: a collaborative approach." portal: libraries and the academy , no. ( ): - . hickey, thomas b. "collaboration in corc." journal of internet cataloging , no. / ( ): - . hillmann, diane i., and elaine l. westbrooks, eds. metadata in practice. chicago: american library association, . hinton, mellissa j. "on cataloging internet resources: voices from the field." journal of internet cataloging , no. ( ): - . hitchcock, steve, donna bergmark, tim brody, christopher gutteridge, les carr, wendy hall, carl lagoze, and stevan harnad. "open citation linking: the way forward." d-lib magazine (october ). http://www.dlib.org/dlib/october /hitchcock/ hitchcock.html hoffman, diane j. "think links: full-text linking projects." online (january/february ): - . ———. "think links: revisited a year later." serials (march ): - . holdsworth, michael. "onix—a transforming standard." against the grain (june ): , , . hoogcarspel, annelies. "the rutgers inventory of machine-readable texts in the humanities: cataloging and access." information technology and libraries (march ): - . hruska, martha. "remote internet serials in the opac?" serials review , no. ( ): - . hsieh-yee, ingrid, and michael smith. "the corc experience: survey of founding libraries. part i." 
oclc systems & services , no. ( ): - . ———. "the corc experience: survey of founding libraries. part ii." oclc systems & services , no. ( ): - . hurlbert, terry, and linda l. dujmic. "cataloging department participation in digital initiatives." technical services quarterly , no. ( ): - . hurt, charlene, and william gray potter. "corc and the future of libraries: two university librarians' perspectives." journal of internet cataloging , no. / ( ): - . iannella, renato. "turnip: the urn interoperability project." d-lib magazine (march ). http://www.dlib.org/dlib/march /briefings/ turnip.html jackson, mary e. "the 'bigger deal' is openurl." interlending & document supply , no. ( ): - . james, culling. link resolvers and the serials supply chain: final report for uksg. newbury, uk: uk serials group, . http://www.uksg.org/projects/linkfinal johnston, pete. "'what are your terms?'" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /johnston/ jones, ed. "serials in the realm of the remotely-accessible: an exploration." serials review , no. ( ): - . jones, simon. "the links in the information chain." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art jones, wayne. "we need those e-serial records." serials review , no. ( ): - . kahn, robert, and robert wilensky. "a framework for distributed digital object services." international journal on digital libraries , no. ( ): - . kennedy, marie r. "nine questions to guide you in choosing a metadata schema." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ lagace, nettie, and janet k. chisman. "how did we ever manage without the openurl?" the serials librarian , no. / ( ): - . lagoze, carl. "the warwick framework: a container architecture for diverse sets of metadata." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /lagoze/ lagoze.html landesman, betty. "keeping the jell-o nailed to the wall: maintaining and managing the virtual collection." 
the serials librarian , no. / ( ): - . lasher, rebecca. "new model needed for locating and describing networked information." serials review , no. ( ): - . lewis, nicholas. "'i want it all and i want it now!': managing expectations with metalib and sfx at the university of east anglia." serials , no. ( ): - . li, ying, dick r. miller, and mary buttner. "bibliographic data mining: automatically building component part records for e-journal articles on the internet." journal of internet cataloging , no. ( ): - . li, yiu-on, and shirley w. leung. "computer cataloging of electronic journals in unstable aggregator databases: the hong kong baptist university library experience." library resources & technical services (october ): - . lindquist, mats g. "not your father's references: citations in the digital space." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . little, david. "sharing history of science and medicine gateway metadata using oai-pmh." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /little/ liu, jia. "metadata development in china: research and practice." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /liu/ liu.html livingston, jill, deborah sanford, and dave bretthauer. "a comparison of openurl link resolvers: the results of a university of connecticut libraries environmental scan." library collections, acquisitions, & technical services , no. / ( ): - . http://digitalcommons.uconn.edu/libr_pubs/ / lopatin, laurie. "metadata practices in academic and non-academic libraries for digital projects: a survey." cataloging & classification quarterly , no. ( ): - . lubas, rebecca l. "defining best practices in electronic thesis and dissertation metadata." journal of library metadata , no. / ( ): - . lynch, clifford. "the dublin core descriptive metadata program: strategic implications for libraries and networked information access." arl: a bimonthly newsletter of research library issues and actions, no. . 
http://www.arl.org/bm~doc/dublin.pdf ———. "identifiers and their role in networked information applications." bulletin of the american society for information science (december/january ): - . http://www.asis.org/bulletin/dec- /lynch.htm ———. "uniform resource naming: from standards to operational systems." serials review , no. ( ): - . lynch, clifford a., and cecilia m. preston. "describing and classifying networked information resources." electronic networking: research, applications and policy (spring ): - . ma, jin. "managing metadata for digital projects." library collections, acquisitions, & technical services , no. / ( ): - . ———. "metadata in arl libraries: a survey of metadata practices." journal of library metadata , no. / ( ): - . ———. metadata, spec kit . washington, dc: association of research libraries, . http://www.arl.org/bm~doc/spec web.pdf macintyre, ross. "nesli marc records: an experiment in creating marc records for e-journals." the serials librarian , no. / ( ): - . marko, lynn, and christina powell. "descriptive metadata strategy for tei headers: a university of michigan library case study." oclc systems & services , no. ( ): - . martin, charity k., and paul s. hoffman. "do we catalog or not? how research libraries provide bibliographic access to electronic journals in aggregated databases." the serials librarian , no. ( ): - . http://digitalcommons.unl.edu/libraryscience/ / massart, david, elena shulman, nick nicholas, nigel ward, and frédéric bergeron. "taming the metadata beast: ilox." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /massart/ massart.html mcclelland, marilyn, david mcarthur, sarah giersch, and gary geisler. "challenges for service providers when importing metadata in digital libraries." d-lib magazine (april ). http://www.dlib.org/dlib/april /mcclelland/ mcclelland.html mccollum, kelly. "publishers of on-line journals plan to link millions of science footnotes." the chronicle of higher education, november , a . 
mccutcheon, sevim, michael kreyche, margaret beecher maurer, and joshua nickerson. "morphing metadata: maximizing access to electronic theses and dissertations." library hi tech , no. ( ): - . mcdonough, jerome p. "mets: standardized encoding for digital library objects." international journal on digital libraries , no. ( ): - . http://hdl.handle.net/ / mcmillan, gail. "electronic theses and dissertations: merging perspectives." cataloging & classification quarterly , no. / ( ): - . medeiros, norm. "metadata for e-commerce: the onix international standard." oclc systems & services , no. ( ): - . miller, paul. "i am a name and a number." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /metadata/ miller, paul, and tony gill. "dc : the search for santa." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /metadata/ milstead, jessica, and susan feldman. "metadata projects and standards." online (january/february ): - , . missingham, roxanne. "reengineering a national resource discovery service: mods down under." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /missingham/ missingham.html mitchell, anne m., and brian e. surratt. cataloging and organizing digital resources: a how-to-do-it manual for librarians. london: facet publishing, . morgan, cliff. "journals metadata: information about content." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "metadata for stm journal publishers: a review of the current scene." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art morgan, eric lease. "adding internet resources to our opacs." serials review , no. ( ): - . ———. "possible solutions for incorporating digital information mediums into traditional library cataloging services." cataloging & classification quarterly , no. / ( ): - . ———. "mr. serials revisits cataloging: cataloging electronic serials and internet resources." the serials librarian , no. / ( ): - . 
morris, wayne, and lynda thomas. "single or separate opac records for e-journals: the glamorgan perspective." the serials librarian , no. / ( ): - . morrow, anne, and allyson mower. "university scholarly knowledge inventory system: a workflow system for institutional repositories." cataloging & classification quarterly , no. / ( ): - . nagamori, mitsuharu, and shigeo sugimoto. "metadata schema registry as a tool to enhance metadata interoperability." tcdl bulletin , no. ( ). http://www.ieee-tcdl.org/bulletin/v n /nagamori/nagamori.html naun, chew chiat, and susan m. braxton. "developing recommendations for consortial cataloging of electronic resources: lessons learned." library collections, acquisitions, & technical services , no. ( ): - . needleman, mark h. "onix (online information exchange)." serials review , no. / ( ): - . ———. "the openurl: an emerging standard for linking." serials review , no. ( ): - . nero, lorraine m. "cataloguing digital resources: the experience of the university of the west indies, st. augustine campus." library review , no. ( ): - . neumeister, susan m. "cataloging internet resources: a practitioner's viewpoint." journal of internet cataloging , no. ( ): - . nicholas, nick, nigel ward, and kerry blinco. "abstract modelling of digital identifiers." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /nicholas-et-al/ nichols, david m., gordon w. paynter, chu-hsiang chan, david bainbridge, dana mckay, michael b. twidale, and ann blandford. "experiences in deploying metadata analysis tools for institutional repositories." cataloging & classification quarterly , no. / ( ): - . niu, jingfang. "a metadata framework developed at the tsinghua university library to aid in the preservation of digital resources." d-lib magazine (november ). http://www.dlib.org/dlib/november /niu/ niu.html palowitch, casey, and lisa horowitz. "meta-information structures for networked information resources." cataloging & classification quarterly , no. / ( ): - . 
park, eun g. "building interoperable canadian architecture collections: initial metadata assessment." the electronic library , no. ( ): - . park, jung-ran. "language-related open archives: impact on scholarly communities and academic librarianship." electronic journal of academic and special librarianship , no. / ( ). http://southernlibrarianship.icaap.org/content/v n /park_j .htm ———. "metadata quality in digital repositories: a survey of the current state of the art." cataloging & classification quarterly , no. / ( ): - . park, jung-ran, and eric childress. "dublin core metadata semantics: an analysis of the perspectives of information professionals." journal of information science , no. ( ): - . park, jung-ran, and yuji tosaka. "metadata quality control in digital repositories and collections: criteria, semantics, and mechanisms." cataloging & classification quarterly , no. ( ): - . paskin, norman. "digital object identifiers for scientific data." data science journal ( ): - . http://www.jstage.jst.go.jp/article/dsj/ / / /_pdf ———. "doi: a progress report." d-lib magazine , no. ( ). http://www.dlib.org/dlib/june /paskin/ paskin.html ———. "doi: current status and outlook may ." d-lib magazine (may ). http://www.dlib.org/dlib/may / paskin.html ———. "e-citations: actionable identifiers and scholarly referencing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "information identifiers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art patra, chandana. "digital repository in ceramics: a metadata study." the electronic library , no. ( ): - . patton, mark, david reynolds, g. sayeed choudhury, and tim dilauro. "toward a metadata generation framework: a case study at johns hopkins university." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /choudhury/ choudhury.html pearce, judith, david pearson, megan williams, and scott yeadon. 
"the australian mets profile—a journey about metadata." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /pearce/ pearce.html pentz, ed. "crossref: a collaborative linking network." issues in science and technology librarianship (winter ). http://www.library.ucsb.edu/istl/ -winter/article .html ———. "crossref at the crossroads." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art phillips, margaret e., and paul koerbin. "pandora, australia's web archive: how much metadata is enough?" journal of internet cataloging , no. ( ): - . porter, g. margaret, and laura bayard. "including web sites in the online catalog: implications for cataloging, collection development, and access." the journal of academic librarianship (september ): - . powell, andrea. "linking to full text: the secondary publisher's perspective." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art powell, andy. "dublin core management." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /dublin/ powell, andy, and ann apps. "encoding openurls in dublin core metadata." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /metadata/ pullinger, david. "instant linking—delayed use: setting provider expectations." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art rapple, charlie. "after the goldrush—the golden age of reference linking." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art rees, louise b., and bridget arthur clancy. "cataloging electronic journals: learning to weave the web." internet reference services quarterly , no. ( ): - . reilly, william, robert wolfe, and mackenzie smith. "mit's cwspace project: packaging metadata for archiving educational content in dspace." international journal on digital libraries , no. ( ): - . http://dspace.mit.edu/handle/ . / rettig, patricia j. 
"administrative metadata for digital images: a real world application of the niso draft standard." library collections, acquisitions, & technical services , no. ( ): - . rettig, patricia j., shu liu, nancy hunter, and allison v. level. "developing a metadata best practices model: the experience of the colorado state university libraries." journal of library metadata , no. ( ): - . reynolds, regina. "inventory list or information gateway? the role of the catalog in the digital age." serials review , no. ( ): - . riley, jenn, john chapman, sarah shreeves, laura akerman, and william landis. "promoting shareability: metadata activities of the dlf aquifer initiative." journal of library metadata , no. ( ): - . riley, jenn, and michelle dalmau. "the in harmony project: developing a flexible metadata model for the description and discovery of sheet music." the electronic library , no. ( ): - . riley, jenn, and muriel foulonneau. metadata for digital resources: implementation, systems design and interoperability. oxford: chandos publishing, . robertson, r. john. "metadata quality: implications for library and information science professionals." library review , no. ( ): - . http://eprints.cdlr.strath.ac.uk/ / salo, dorothea. "name authority control in institutional repositories." cataloging & classification quarterly , no. / ( ): - . schroeder, kathrin. "persistent identification for the permanent referencing of digital resources—the activities of the epicur project: enhanced uniform resource name urn management at die deutsche bibliothek." the serials librarian , no. ( ): - . searle, sam, and dave thompson. "preservation metadata: pragmatic first steps at the national library of new zealand." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /thompson/ thompson.html sha, vianne t. "cataloguing internet resources: the library approach." the electronic library (october ): - . shadle, steve. "identification of electronic journals in the online catalog." 
serials review (summer ): - . ———. "reflections on wrapping paper: random thoughts on aacr and electronic serials." serials review , no. ( ): - . ———. "a square peg in a round hole: applying aacr to electronic journals." the serials librarian , no. / ( ): - . shadle, steven, bill anderson, thomas champagne, and leslie o'brien. "electronic serials cataloging: now that we're here, what do we do?" the serials librarian , no. / ( ): - . sigrist, barbara, and andreas heise. "cataloging and retrieving e-journals in the zeitschriftendatenbank, in the german serials database." the serials librarian , no. ( ): - . simpson, pamela, and robert seeds. "electronic journals in the online catalog: selection and bibliographic control." library resources & technical services (april ): - . sleeman, allison mook. "cataloging remote access electronic serials." serials review , no. ( ): - . smith, adam j. "developing handle system web services at cornell university." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /smith/ smith.html soderdahl, paul a. "implementing the sfx link server at the university of iowa." information technology and libraries , no. ( ): - . http://www.ala.org/ala/mgrps/divs/lita/ital/ soderdahl.cfm sollins, karen r. "the hard problems are not all solved." serials review , no. ( ): - . sommer, dorothea. "persistent identifiers: the 'urn granular' project of the german national library and the university and state library halle." liber quarterly: the journal of european research libraries , no. / ( ). http://liber.library.uu.nl/publish/articles/ /article.pdf stengel, mark g. "using sfx to identify unexpressed user needs." collection management , no. ( ): - . su, siew-phek t., yu long, and daniel e. cromwell. "e m: automatic generation of marc-formatted metadata by crawling e-publications." information technology and libraries (december ): - . http://www.ala.org/ala/mgrps/divs/lita/ital/ su.cfm sukantarat, wichada. 
"digital initiatives and metadata use in thailand." program: electronic library and information systems , no. ( ): - . sun, li. "a metadata manager's role in collaborative projects: the rutgers university libraries experience." the electronic library , no. ( ): - . surratt, brian e., and dustin hill. "etd marc: a semiautomated workflow for cataloging electronic theses and dissertations." library collections, acquisitions, & technical services , no. ( ): - . taylor, arlene g. "where does aacr fall short for internet resources?" journal of internet cataloging , no. ( ): - . tennant, roy. "a bibliographic metadata infrastructure for the twenty-first century." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "marc must die." library journal, october , - . http://www.libraryjournal.com/article/ca .html terry, ana arias. "reference-linking: today's realities, tomorrow's promises." library hi tech , no. ( ): - . thiele, harold. "the dublin core and warwick framework: a review of the literature, march -september ." d-lib magazine (january ). http://www.dlib.org/dlib/january / thiele.html thomas, charles f., and linda s. griffin. "who will create the metadata for the internet?" first monday , no. ( ). http://www.firstmonday.org/issues/issue _ /thomas/index.html thorburn, colleen. "cataloging remote electronic journals and databases." the serials librarian , no. / ( ): - . thunell, allen, and lisa robinson. "conventional language for cataloging remote access electronic resources: the time is now!" oclc systems & services , no. ( ): - . todd, chris. "metadata mayhem: cataloguing electronic resources in the national library of new zealand." the electronic library , no. ( ): - . van ballegooie, marlene. "metadata for archival collections: the university of toronto's 'barren lands' project." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewe r/file .html#feature van de sompel, herbert, and oren beit-arie. 
"generalizing the openurl framework beyond references to scholarly works: the bison-futé model." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /vandesompel/ vandesompel.html ———. "open linking in the scholarly information environment using the openurl framework." d-lib magazine (march ). http://www.dlib.org/dlib/march /vandesompel/ vandesompel.htm l vellucci, sherry l. "metadata." in annual review of information science and technology, vol. , ed. martha e. willams, - . medford, nj: information today, inc., . ———. "metadata and authority control." library resources & technical services (january ): - . vitiello, giuseppe. "identifiers and identification systems: an informational look at policies and roles from a library perspective." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /vitiello/ vitiello.html vizine-goetz, diane. "classification schemes for internet resources revisited." journal of internet cataloging , no. ( ): - . wagner, harry, and stuart weibel. "the dublin core metadata registry: requirements, implementation, and experience." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ wakimoto, jina choi, david s. walker, and katherine s. dabbour. "the myths and realities of sfx in academic libraries." the journal of academic librarianship , no. ( ): - . walker, jenny. "crossref and sfx: complementary linking services for libraries." new library world , no. ( ): - . http://www.exlibrisgroup.com/files/publications/crossrefandsfx.p df ———. "open linking for libraries: the openurl framework." new library world , no. / ( ): - . http://www.exlibrisgroup.com/files/publications/openlinkingforlibr aries.pdf ———. "what is sfx?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art wang, jue. "digital object identifiers and their use in libraries." serials review , no. ( ): - . ward, diane. "internet resource cataloging: the suny buffalo libraries' response." 
oclc systems & services , no. ( ): - . warner, simeon. "author identifiers in scholarly repositories." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ waters, james, and robert b. allen. "music metadata in a new key: metadata and annotation for music in a digital world." journal of library metadata , no. ( ): - . weibel, stuart l. "border crossings: reflections on a decade of metadata consensus building." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /weibel/ weibel.html ———. "metadata: the foundations of resource description." d-lib magazine (july ). http://www.dlib.org/dlib/july / weibel.html ———. "the state of the dublin core metadata initiative april ." d-lib magazine (april ). http://www.dlib.org/dlib/april / weibel.html ———. "the world wide web and emerging internet resource discovery standards for scholarly literature." library trends (spring ): - . http://hdl.handle.net/ / weibel, stuart, and juha hakala. "dc- : the helsinki metadata workshop: a report on the workshop and subsequent developments." d-lib magazine (february ). http://www.dlib.org/dlib/february / weibel.html weibel, stuart, renato iannella, and warwick cathro. "the th dublin core metadata workshop report: dc- ; march - , ; national library of australia, canberra." d-lib magazine (june ). http://www.dlib.org/dlib/june /metadata/ weibel.html weibel, stuart l., and traugott koch. "the dublin core metadata initiative: mission, current activities, and future directions." d-lib magazine (december ). http://www.dlib.org/dlib/december /weibel/ weibel.html weibel, stuart, and eric miller. "image description on the internet: a summary of the cni/oclc image metadata workshop; september - , ; dublin, ohio." d-lib magazine (january ). http://www.dlib.org/dlib/january /oclc/ weibel.html weng, cathy, and jia mi. "towards accessibility to digital cultural materials: an frbrized approach." oclc systems & services: international digital library perspectives , no. 
( ): - . westbrooks, elaine l. "remarks on metadata management." oclc systems & services: international digital library perspectives , no. ( ): - . whalen, maureen. "developing a rights metadata dictionary for digital surrogates." journal of library metadata , no. / ( ): - . wilson, andrew. "how much is enough: metadata for preserving digital data." journal of library metadata , no. / ( ): - . wool, gregory. "a meditation on metadata." the serials librarian , no. / ( ): - . wright, michael. "oclc's corc service: a user's perspective." the serials librarian , no. / ( ): - . younger, jennifer a. "resources description in the digital age." library trends (winter ): - . http://hdl.handle.net/ / younghusband, david. "electronic journals and link resolver implementation." serials , no. ( ): - . zavalina, oksana l., carole l. palmer, amy s. jackson, and myung-ja han. "evaluating descriptive richness in collection-level metadata." journal of library metadata , no. ( ): - . zeng, marcia lei. "metadata elements for object description and representation: a case report from a digitized historical fashion collection project." journal of the american society for information science , no. ( ): - . zeng, marcia lei, and lois mai chan. "metadata interoperability and standardization—a study of methodology part ii: achieving interoperability at the record and repository levels." d-lib magazine , no. ( ). http://www.dlib.org/dlib/june /zeng/ zeng.html zeng, marcia lei, jaesun lee, and allene f. hayes. "metadata decisions for digital libraries: a survey report." journal of library metadata , no. / ( ): - . zeng, marcia lei, and jian qin. metadata. new york: neal-schuman, . xu, fei. "the sfx citation linker and its enhancements." library hi tech , no. ( ): - .

new publishing models

albanese, andrew richard. "revolution or evolution." library journal, november , - . albert, karen m. "open access: implications for scholarly publishing and medical libraries." 
journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/articlerender.fcgi?artid= alexander, adrian, and marilu goodyear. "the development of bioone: changing the role of research libraries in scholarly communication." the journal of electronic publishing (march ). http://hdl.handle.net/ /spo. . . ———. "'la jolla confidential': the inside story of bioone." the serials librarian , no. / ( ): - . allinson, julie, sebastien françois, and stuart lewis. "sword: simple web-service offering repository deposit." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /allinson-et-al/ anderson, byron. "open access journals." behavioral & social sciences librarian , no. ( ): - . anderson, ivy. "the audacity of scoap ." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arl-br- -scoap .pdf apt, krzysztof r. "one more revolution to make: free scientific publishing." communications of the acm (may ): - . armbruster, chris. "moving out of oldenburg's long shadow: what is the future for society publishing?" learned publishing , no. ( ): - . arunachalam, subbiah. "open access to scientific knowledge." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ / ayris, paul. "new wine in old bottles: current developments in digital delivery and dissemination." european review , no. ( ): - . association of american universities research libraries project in collaboration with the association of research libraries. reports of the aau task forces on: acquisition and distribution of foreign language and area studies materials: a national strategy for managing scientific and technological information: intellectual property rights in an electronic environment. washington, dc: association of research libraries, . atkinson, ross. "a rationale for the redesign of scholarly information exchange." 
library resources & technical services (april ): - . bailey, charles w., jr. "the coalition for networked information's acquisition-on-demand model: an exploration and critique." serials review , no. - ( ): - . http://www.digital-scholarship.org/cwb/cni.htm bailey, charles w., jr. open access bibliography: liberating scholarly literature with e-prints and open access journals. washington, dc: association of research libraries, . http://www.digital-scholarship.org/oab/oab.htm ———. "scholarly electronic publishing on the internet, the nren, and the nii: charting possible futures." serials review , no. ( ): - . http://www.digital-scholarship.org/cwb/schpub.htm ———. transforming scholarly publishing through open access: a bibliography. houston: digital scholarship, . http://www.digital-scholarship.org/tsp/transforming.htm bachrach, steven, r. stephen berry, martin blume, thomas von foerster, alexander fowler, paul ginsparg, stephen heller, neil kestner, andrew odlyzko, ann okerson, ron wigington, and anne moffat. "who should own scientific papers?" science magazine, september , - . http://www.sciencemag.org/cgi/content/full/ / / baker, gavin. "open access: advice on working with faculty senates." college & research libraries news , no. ( ): - . bakker, theodora a., and marcus a. banks. "scholarly communication initiatives at georgetown university: lessons learned." oclc systems & services: international digital library perspectives , no. ( ): - . banks, marcus a., and gail l. persily. "campus perspective on the national institutes of health public access policy: university of california, san francisco, library experience." journal of the medical library association , no. ( ): - . http://www.ncbi.nlm.nih.gov/pmc/articles/pmc / banks, peter. "open access: a medical association perspective." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art barnett, molly c., and molly w. keener. 
"expanding medical library support in response to the national institutes of health public access policy." journal of the medical libary association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= bergman, sherrie s. "the scholarly communication movement: highlights and recent developments." collection building , no. ( ): - . bernius, steffen. "the impact of open access on the management of scientific knowledge." online information review , no. ( ): - . bernius, steffen, matthias hanauske, wolfgang könig, and berndt dugall. "open access models and their implications for the players on the scientific publishing market." economic analysis and policy , no. ( ): - . bird, claire. "continued adventures in open access: perspective." learned publishing , no. ( ): - . ———. "oxford journals' adventures in open access." learned publishing , no. ( ): - . björk, bo-christer, and turid hedlund. "two scenarios for how scholarly publishers could change their business model to open access." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . blixrud, julia c. "sparc: setting sail into the seas of competition." the serials librarian , no. / ( ): - . boettcher, jennifer. "framing the scholarly communication cycle." online , no. ( ): - . bolman, pieter. "open access: marginal or core phenomenon? a commercial publisher's view." information services & use , no. - ( ): - . booth, andrew. "the politics of e-access and e-funding in the library environment." serials , no. ( ): - . bosc, hélène, and stevan harnad. "in a paperless world a new role for academic libraries: providing open access." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art branin, joseph j., and mary case. "reforming scholarly publishing in the sciences: a librarian perspective." notices of the ams (april ): - . braun, kim. 
"gap—german academic publishers: a network approach to scholarly publishing " canadian journal of communication , no. ( ). brazzeala, bradley, and patrick l. carr. "the potential impact of 'public access' legislation on access to forestry literature." serials review , no. ( ): - . brent, doug. "stevan harnad's 'subversive proposal': kick-starting electronic scholarship." ejournal , no. ( ). http://www.ucalgary.ca/ejournal/archive/rachel/v n /article.html brody, tim, les carr, yves gingras, chawki hajjem, stevan harnad, and alma swan. "incentivizing the open access research web publication-archiving, data-archiving and scientometrics." ctwatch quarterly , no. ( ). http://www.ctwatch.org/quarterly/articles/ / /incentivizing-the- open-access-research-web/index.html buckholtz, alison. "declaring independence: returning scientific publishing to scientists." the journal of electronic publishing (august ). http://hdl.handle.net/ /spo. . . burke, marianne. "pubmed central: be careful what you ask for." college & research libraries news (january ): - . byrd, gary d., shelley a. bader, and anthony j. mazzaschi. "the status of open access publishing by academic societies." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= callaghan, sarah, fiona hewer, sam pepler, paul hardaker, and alan gadian. "how to publish data using overlay journals: the ojims project." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /callaghan-et-al/ ———. "overlay journals and data publishing in the meteorological sciences." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /callaghan-et-al/ cameron, robert d. "a universal citation database as a catalyst for reform in scholarly communication." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ campbell, paulette walker. "nih may use the internet to distribute findings of research financed by its grants." 
the chronicle of higher education, may , a . canessa, e., and m. zennaro, eds. science dissemination using open access: a compendium of selected literature on open access. trieste, italy: science dissemination unit, abdus salam international centre for theoretical physics, . http://sdu.ictp.it/openaccess/scidissopenaccess.pdf canhos, vanderlei, leslie chan, and barbara kirsop. "bioline publications: how its evolution has mirrored the growth of the internet." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art carpenter, todd a., heather joseph, and mary waltham. "a survey of business trends at bioone publishing partners and its implications for bioone." portal: libraries and the academy , no. ( ): - . case, mary m. "arl promotes competition through sparc: the scholarly publishing & academic resources coalition." arl: a bimonthly newsletter of research library issues and actions, no. ( ): - . http://www.arl.org/bm~doc/sparc- .pdf ———. "partners in knowledge creation: an expanded role for research libraries in the digital future." journal of library administration , no. ( ): - . ———. "principles for emerging systems of scholarly publishing." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/principles.pdf ———. "promoting open access: developing new strategies for managing copyright and intellectual property." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/access- .pdf ———. "public access to scientific information: are , scientists wrong?" college & research libraries news , no. ( ): - . ———. "scholarly communication: arl as a catalyst for change." portal: libraries and the academy , no. ( ): - . cassella, maria. "new journal models and publishing perspectives in the evolving digital environment." ifla journal , no. ( ): - . chang, chen chi. 
"business models for open access journals publishing." online information review , no. ( ): - . chantavaridou, elisavet. "contributions of open access to higher education in europe and vice versa." oclc systems & services: international digital library perspectives , no. ( ): - . chodorow, stanley. "the tempe principles in practice." the serials librarian , no. / ( ): - . cockerill, matthew. "establishing a central open access fund." oclc systems & services: international digital library perspectives , no. ( ): - . coleman, ross. "publishing and the digital library: adding value to scholarship and innovation to business." learned publishing , no. ( ): - . conley, john p., and myrna wooders. "but what have you done for me lately? commercial publishing, scholarly communication, and open-access." economic analysis and policy , no. ( ): - . http://www.eap-journal.com.au/download.php?file= cope, william w., and mary kalantzis. "signs of epistemic disruption: transformations in the knowledge system of the academic journal." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / corbett, hillary. "the crisis in scholarly communication, part i: understanding the issues and engaging your faculty." technical services quarterly , no. ( ): - . ———. "the crisis in scholarly communication, part ii: internal impacts on the library, with a focus on technical services." technical services quarterly , no. ( ): - . corrado, edward m. "the importance of open access, open source, and open standards for libraries." issues in science and technology librarianship, no. ( ). http://www.istl.org/ -spring/article .html correia, ana maria ramalho, and josé carlos teixeira. "reforming scholarly publishing and knowledge communication: from the advent of the scholarly journal to the challenges of open access." online information review , no. ( ): - . crawford, walt. "library access to scholarship: oa controversies." cites & insights: crawford at large , no. 
( ): - . http://citesandinsights.info/v i d.htm creaser, claire. "open access to research output—institutional policies and researchers' views: results from two complementary surveys." new review of academic librarianship , no. ( ): - . crow, raym. "publishing cooperatives: an alternative for non-profit publishers." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ / davis, philip m. "how the media frames 'open access'." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . day, michael. "the scholarly journal in transition and the pubmed central proposal." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /pubmed/ denning, peter j. "the acm electronic publishing plan and interim copyright policies." the serials librarian , no. / ( ): - . dobratz, susanne, peter rödig, uwe m. borghoff, björn rätzke, and astrid schoger. "the use of quality management standards in trustworthy digital archives." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ dominguez, magaly báscones. "economics of open access publishing." serials , no. ( ): - . dryburgh, alastair. "open access—time to stop preaching to the converted?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art edmonds, bruce. "a proposal for the establishment of review boards." the journal of electronic publishing (june ). http://hdl.handle.net/ /spo. . . english, ray. "scholarly communication and the academy: the importance of the acrl initiative." portal: libraries and the academy , no. ( ): - . english, ray, and larry hardesty. "create change: shaping the future of scholarly journal publishing." college & research libraries news (june ): - . english, ray, and peter suber. "public access to federally funded research: the cornyn-lieberman and cures bills." college & research libraries news , no. ( ): - . esposito, joseph j. "open access . 
: access to scholarly publications moves to a new phase." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . falk, howard. "journal publishing is ripe for change." the electronic library , no. ( ): - . ———. "open access gains momentum." the electronic library , no. ( ): - . ———. "the revolt against journal publishers." the electronic library , no. ( ): - . fang, conghui, and xiaochun zhu. "the open access movement in china." interlending & document supply , no. ( ): - . fernandez, leila. "open access initiatives in india—an evaluation." partnership: the canadian journal of library and information practice and research , no. ( ). http://journal.lib.uoguelph.ca//index.php/perj/article/view/ / fisher, julian h. "fixing the broken toaster: scholarly publishing re-imagined." science & technology libraries , no. ( ): - . forsman, rick b., and charles denison. "life and death on the coral reef: an ecological perspective on scholarly publishing in the health sciences." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/picrender.fcgi?action=stream&bl obtype=pdf&artid= franklin, jack. "open access to scientific and technical information: the state of the art." information services & use , no. - ( ): - . frazier, ken. "sparc: encouraging new models of disseminating knowledge." collection building , no. ( ): - . friend, frederick j. "alternatives to commercial publishing for scholarly communication." serials (july ): - . ———. "how can there be open access to journal articles?" serials , no. ( ): - . fuller, steve. "cybermaterialism, or why there is no free lunch in cyberspace." the information society , no. ( ): - . ———. "cyberplatonism: an inadequate constitution for the republic of science." the information society , no. ( ): - . fyffe, richard. "technological change and the scholarly communications reform movement: reflections on castells and giddens." library resources & technical services (april ): - . 
fyffe, richard, and david e. shulenburger. "economics as if science mattered: the bioone business model and the transformation of scholarly publishing." library collections, acquisitions, & technical services , no. ( ): - . gardner, william. "the electronic archive: scientific publishing for the s." psychological science (november ): - . gargouri, yassine, chawki hajjem, vincent larivière, yves gingras, les carr, tim brody, and stevan harnad. "self-selected or mandated, open access increases citation impact for higher quality research." plos one , no. ( ): e . http://www.plosone.org/article/info% adoi% f . % fjournal .pone. garrett, marie. "newfound press: participating in the future of scholarly publishing." college & research libraries news , no. ( ): - . gass, andy. "paying to free science: costs of publication as costs of research." serials review , no. ( ): - . gass, steven. "transforming scientific communication for the st century." science & technology libraries , no. / ( ): - . getz, malcolm. "open-access scholarly publishing in economic perspective." journal of library administration , no. ( ): - . ghosh, s. b., and anup kumar das. "open access and institutional repositories—a developing country perspective: a case study of india." ifla journal , no. ( ): - . gibson, ian. "overview of the house of commons science and technology select committee inquiry into scientific publications." serials , no. ( ): - . ginsparg, paul. "can peer review be better focused?" science & technology libraries , no. / ( ): - . http://people.ccmr.cornell.edu/~ginsparg/blurb/pg pr.html ———. "next-generation implications of open access." ctwatch quarterly , no. ( ). http://www.ctwatch.org/quarterly/articles/ / /next-generation-implications-of-open-access/index.html ———. "scholarly information architecture, - ." data science journal ( ): - . http://www.jstage.jst.go.jp/article/dsj/ / / /_pdf goodman, david. "open access: what comes next?" learned publishing , no. ( ): - . 
http://www.ingentaconnect.com/content/alpsp/lp/ / / /art goodyear, marilu, and adrian w. alexander. "bioone: a new model for scholarly publishing." journal of library administration , no. / ( ): - . graham, tom. "scientific publications: free for all? the academic library viewpoint." serials , no. ( ): - . gradmann, stefan. "figaro and open access to electronic information objects." information services & use , no. - ( ): - . greyson, devon, kumiko vézina, heather morrison, donald taylor, and charlyn black. "university supports for open access: a canadian national survey." canadian journal of higher education , no. ( ): - . http://ojs.library.ubc.ca/index.php/cjhe/article/view/ / guédon, jean-claude. "beyond core journals and licenses: the paths to reform scientific publishing." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/resources/pubs/br/br /br guedon.shtml ———. "the budapest initiative for open access." information services & use , no. - ( ): - . ———. "electronic journals, libraries, and university presses." in filling the pipeline and paying the piper: proceedings of the fourth symposium, ed. ann okerson, - . washington, dc: office of scientific and academic publishing, association of research libraries, . ———. in oldenburg's long shadow: librarians, research scientists, publishers, and the control of scientific publishing. washington, dc: association of research libraries, . http://www.arl.org/resources/pubs/mmproceedings/ guedon.shtml ———. "mixing and matching the green and gold roads to open access—take ." serials review , no. ( ): - . ———. "research libraries and electronic scholarly journals: challenges or opportunities?" the serials librarian , no. / ( ): - . guernsey, lisa. "library groups, decrying 'excessive pricing,' demand new policies on electronic journals." the chronicle of higher education, april , a -a . ———. 
"a provost challenges his faculty to retain copyright on articles." the chronicle of higher education, september , a -a . ———. "some on-line journals make ends meet by charging authors instead of readers." the chronicle of higher education, february , a . gul, sumeer, tariq ahmad shah, and tariq ahmad baghwan. "culture of open access in the university of kashmir: a researcher's viewpoint." aslib proceedings , no. ( ): - . hackman, tim. "what's the opposite of a pyrrhic victory? lessons learned from an open access defeat." college & research libraries news , no. ( ). hahn, karla. "new tools for new times: remodeling the scholarly communication system." college & research libraries news , no. ( ): - . ———. research library publishing services: new options for university publishing. washington, dc, . http://www.arl.org/bm~doc/research-library-publishing-services.pd f ———. "research library publishing services: new options for university publishing and new roles for libraries." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arl-br- -res-lib-pub.pdf ———. "seeking a global perspective on scholarly communication: contributions from the uk." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr scholcom.pdf ———. "talk about talking about new models of scholarly communication." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . haider, jutta. "of the rich and the poor and other curious minds: on open access and 'development'." aslib proceedings , no. / ( ): - . hamaker, chuck, and brad spry. "google scholar." serials , no. ( ): - . harnad, stevan. "fast-forward on the green road to open access: the case against mixing up green and gold." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /harnad/ ———. "free at last: the future of peer-reviewed journals." d-lib magazine (december ). 
http://www.dlib.org/dlib/december / harnad.html ———. "how to fast-forward learned serials to the inevitable and the optimal for scholars and scientists." the serials librarian , no. / ( ): - . http://cogprints.org/ / ———. "the implementation of the berlin declaration on open access: report on the berlin meeting held february- march , southampton, uk." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /harnad/ harnad.html ———. "implementing peer review on the net: scientific quality control in scholarly electronic journals." in scholarly publishing: the electronic frontier, ed. robin p. peek and gregory b. newby, - . cambridge, ma: the mit press, . http://cogprints.org/ / ———. "interactive publication: extending the american physical society's discipline-specific model for electronic publishing." serials review , no. - ( ): - . http://cogprints.org/ / ———. "the invisible hand of peer review." nature, november , web matters section. http://www.nature.com/nature/webmatters/invisible/invisible.html ———. "learned inquiry and the net: the role of peer review, peer commentary and copyright." antiquity (december ): - . http://cogprints.org/ / ———. "minotaur: six proposals for freeing the refereed literature online: a comparison." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /minotaur/ ———. "no-fault peer review charges: the price of selectivity need not be access denied or delayed." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /harnad/ harnad.html ———. "on-line journals and financial fire walls." nature, september , - . ———. "the paper house of cards (and why it's taking so long to collapse)." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /harnad/ ———. "the postgutenberg galaxy: how to get there from here." the information society , no. ( ): - . http://cogprints.org/ / ———. "post-gutenberg galaxy: the fourth revolution in the means of production of knowledge." the public-access computer systems review , no. ( ): - . 
http://epress.lib.uh.edu/pr/v /n /harnad. n ———. "publish or perish—self-archive to flourish: the green route to open access." ercim news, no. ( ): - . http://www.ercim.org/publication/ercim_news/enw /harnad.html ———. "resetting our intuition pumps for the online-only era: a conversation with stevan harnad." interview by ellen finnie duranceau. serials review , no. ( ): - . ———. "scholarly skywriting and the prepublication continuum of scientific inquiry." psychological science (november ): - . http://cogprints.org/ / ———. "self-archive unto others as ye would have them self-archive unto you." jekyll.comm, no. ( ). http://jcom.sissa.it/archive/ / /f / ———. "the self-archiving initiative." nature web debates, april . http://www.nature.com/nature/debates/e-access/articles/harnad.html ———. "sorting the esoterica from the exoterica: there's plenty of room in cyberspace." the information society , no. ( ): - . http://cogprints.org/ / harnad, stevan, and matt hemus. "all or none: no stable hybrid or half-way solutions for launching the learned periodical literature into the post-gutenberg galaxy." in the impact of electronic publishing on the academic community: an international workshop organized by the academia europaea and the wenner-gren foundation, ed. i. butterworth, - . london: portland press, . harnad, stevan, and alma swan. "india, open access, the law of karma and the golden rule." desidoc journal of library and information technology , no. ( ). http://eprints.ecs.soton.ac.uk/ / hawkins, brian l. "creating the library of the future: incrementalism won't get us there!" the serials librarian , no. / ( ): - . ———. "information access in the digital era. challenges and a call for collaboration." educause review (september/october ): - . http://www.educause.edu/ir/library/pdf/erm .pdf heath, malcolm, michael jubb, and david robey. "e-publication and open access in the arts and humanities in the uk." ariadne, no. ( ). 
http://www.ariadne.ac.uk/issue /heath-et-al/ hedlund, turid, and ingegerd rabow. "scholarly publishing and open access in the nordic countries." learned publishing , no. ( ): - . helmes, leni. "project lays foundations for future scholarly communication." research information (december /january ). http://www.researchinformation.info/ridecjan coll.html henneken, edwin a., michael j. kurtz, guenther eichhorn, alberto accomazzi, carolyn s. grant, donna thompson, elizabeth bohlen, stephen s. murray, paul ginsparg, and simeon warner. "e-prints and journal articles in astronomy: a productive co-existence." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art herb, ulrich. "sociological implications of scientific publishing: open access, science, society, democracy and the digital divide." first monday , no. ( ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ hernández-borges, angel a., raúl cabrera-rodríguez, abián montesdeoca-melián, begoña martínez-pineda, maria luisa torres-Álvarez de arcaya, and alejandro jiménez-sosa. "awareness and attitude of spanish medical authors to open access publishing and the 'author pays' model for electronic content." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= hibbitts, bernard j. "e-journals, archives and knowledge networks: a commentary on archie zariski's defense of electronic law journals." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ ———. "from law reviews to knowledge networks: legal scholarship in the age of cyberspace." serials review , no. ( ): - . ———. "last writes? re-assessing the law review in the age of cyberspace." new york university law review , no. ( ): - . http://www.law.pitt.edu/hibbitts/lastrev.htm ———. "yesterday once more: skeptics, scribes and the demise of law reviews." akron law review (winter ): - . 
http://www.law.pitt.edu/hibbitts/akron.htm hindawi, ahmed. " : a publishing odyssey: based on a paper presented at the nd uksg conference, torquay, march/april ." serials: the journal for the serials community , no. ( ): - . ho, adrian k., and charles w. bailey, jr. "open access webliography." reference services review , no. ( ): - . http://www.digital-scholarship.org/cwb/oaw.htm ho, adrian k., and daniel r. lee. "recognizing opportunities: conversational openings to promote positive scholarly communication change." college & research libraries news , no. ( ): - . hood, anna k. open access resources, spec kit . washington, dc: association of research libraries, . http://www.arl.org/bm~doc/spec web.pdf hunter, karen. "the national site license model." serials review , no. - ( ): - , . imboden, dieter m. "scientific publishing: the dilemma of research funding organisations." european review , no. ( ): - . jacobs, neil, ed. open access: key strategic, technical and economic aspects. oxford: chandos, . jacobson, robert l. "research universities consider plan to distribute scholarly work online." the chronicle of higher education, november , a . jacsó, peter. "open access to scholarly full-text documents." online information review , no. ( ): - . ———. "open access to scholarly indexing/abstracting information." online information review , no. ( ): - . johnson, richard k. "a question of access: sparc, bioone, and society-driven electronic publishing." d-lib magazine (may ). http://www.dlib.org/dlib/may /johnson/ johnson.html ———. "whither competition?" arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/whither.pdf joint, nicholas. "the antaeus column: does the 'open access' advantage exist? a librarian's perspective." library review , no. ( ): - . ———. "the 'author pays' model of open access and uk-wide information strategy." library review , no. ( ): - . ———. 
"current research information systems, open access repositories and libraries: antaeus." library review , no. ( ): - . joseph, heather. "bioone: building a sustainable alternative publishing model for non-profit publishers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "fair to whom?" college & research libraries news , no. ( ): - . ———. "from advocacy to implementation: the nih public access policy and its impact." journal of library administration , no. ( ): - . ———. "the scholarly publishing and academic resources coalition: an evolving agenda." college & research libraries , no. ( ): - . joseph, heather, and adrian w. alexander. "two years after the launch: an update on the bioone electronic publishing initiative." college & research libraries news , no. ( ): - . joseph, heather, and todd a. carpenter. "bioone's business model shift: balancing the interests of libraries and independent publishers." serials review , no. ( ): - . kahin, brian. "a cooperative framework for enhancing research communication in science and technology." serials review , no. ( ): - . kahle, brewster, rick prelinger, and mary e. jackson. "public access to digital material." d-lib magazine (october ). http://www.dlib.org/dlib/october /kahle/ kahle.html kaser, dick, and marydee ojala. "open access forum." online , no. ( ): - . kiernan, vincent. "a new on-line database will link dozens of journals in the biological sciences." the chronicle of higher education, july , a -a . ———. "nih proceeds with on-line archive for papers in the life sciences." the chronicle of higher education, september , a . ———. "scholars seek new copyright rule to ease dissemination of research on the web." the chronicle of higher education, september , a . ———. "university libraries join with chemical society to create a new, low-cost journal." the chronicle of higher education, july , a . kim, jihyun. "faculty self-archiving: motivations and barriers." 
journal of the american society for information science and technology , no. ( ): - . king, donald w. "an approach to open access author payment." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /king/ king.html ———. "should commercial publishers be included in the model for open access through author payment?" d-lib magazine , no. ( ). http://www.dlib.org/dlib/june /king/ king.html kirsop, barbara. "open access to publicly funded research information: the race is on." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ kirsop, barbara, subbiah arunachalam, and leslie chan. "access to scientific knowledge for sustainable development: options for developing countries." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /kirsop-et-al/ kling, rob, lisa b. spector, and joanna fortuna. "the real stakes of virtual publishing: the transformation of e-biomed into pubmed central." journal of the american society for information science and technology , no. ( ): - . the knight higher education collaborative. "op. cit." policy perspectives (december ): - . http://www.thelearningalliance.info/docs/jun /doc- jun . .pdf kohl, david f. "starting a library-based university press." reference services review , no. ( ): - . kosavic, andrea. "the york digital journals project: strategies for institutional open journal systems implementations." college & research libraries , no. ( ): - . kousha, kayvan. "characteristics of open access scholarly publishing: a multidisciplinary study." aslib proceedings , no. ( ): - . kousha, kayvan, and mahshid abdoli. "the citation impact of open access agricultural research: a comparison between oa and non-oa publications." online information review , no. ( ): - . krichel, thomas, and christian zimmermann. "the economics of open bibliographic data provision." economic analysis and policy , no. ( ): - . 
http://www.eap-journal.com.au/download.php?file= kroth, philip j., erinn e. aspinall, and holly e. phillips. "the national institutes of health (nih) policy on enhancing public access: tracking institutional contribution rates." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/articlerender.fcgi?artid= kuchma, iryna. "open access, equity, and strong economy in developing and transition countries: policy perspective." serials review , no. ( ): - . kumari, g. lalitha. "global access to indian research: indian stm journals online." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ ———. "global access to indian research: indian stm journals online." issues in science and technology librarianship, no. ( ). http://www.istl.org/ -spring/article .html kwasik, hanna, and pauline o. fulda. "open access and scholarly communication—a selection of key web sites." issues in science & technology librarianship, no. ( ). http://www.istl.org/ -summer/internet.html la manna, manfredi. "the economics of publishing and the publishing of economics." library review , no. ( ): - . la manna, manfredi, and jean young. "the electronic society for social scientists: from journals as documents to journals as knowledge exchanges." interlending & document supply , no. ( ): - . lal, krishan. "open access: major issues and global initiatives." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ lamb, christine. "open access publishing models: opportunity or threat to scholarly and academic publishers?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art lambert, jill. "developments in electronic publishing in the biomedical sciences." program: electronic library and information systems , no. ( ): - . law, derek. "delivering open access: from promise to practice." 
ariadne, no. ( ). http://www.ariadne.ac.uk/issue /law/ law, d. g., r. l. weedon, and m. r. sheen. "universities and article copyright." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art lawrence, janna c. "libraries as journal publishers." journal of electronic resources in medical libraries , no. ( ): - . lewis, david w. "library budgets, open access, and the future of scholarly communication." college & research libraries news , no. ( ): - . look, hugh. "open access: look both ways before crossing." serials , no. ( ): - . ludwig, deborah. "open access at the university of kansas: toward a campus initiative." college & research libraries news , no. ( ): - . lynch, clifford a. "improving access to research results: six points." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr sixpoints.pdf ———. "reaction, response, and realization: from the crisis in scholarly communication to the age of networked information." serials review , no. - ( ): - . mabe, michael a. "caveat auctor: let the author beware! some sceptical thoughts on open access." serials , no. ( ): - . magner, denise k. "seeking a radical change in the role of publishing." the chronicle of higher education, june , a -a . malenfant, kara j. "leading change in the system of scholarly communication: a case study of engaging liaison librarians for outreach to faculty." college and research libraries , no. ( ): - . malina, barbara. open access opportunities and challenges: a handbook. brussels: european commission and the german commission for unesco, . http://ec.europa.eu/research/science-society/document_library/pdf_ /open-access-handbook_en.pdf markovitz, barry p. "biomedicine's electronic publishing paradigm shift: copyright policy and pubmed central." journal of the american medical informatics association (may/june ): - . maron, nancy l., and k. kirby smith. 
current models of digital scholarly communication: results of an investigation conducted by ithaka for the association of research libraries. washington, dc: association of research libraries, . http://www.arl.org/bm~doc/current-models-report.pdf martin, susan k. "acrl takes up the challenges of scholarly communication." college & research libraries news , no. ( ): - , . ———. "a wedge in the door of scholarly communication." portal: libraries and the academy , no. ( ): vii-xx. matsubayashi, mamiko, keiko kurata, yukiko sakai, tomoko morioka, shinya kato, shinji mine, and shuichi ueda. "status of open access in the biomedical field in ." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= mccollum, kelly. "nih proposal for on-line publication draws fire." the chronicle of higher education, june , a . mcculloch, emma. "taking stock of open access: progress and issues." library review , no. ( ): - . mccullough, b. d. "open access economics journals and the market for reproducible economic research." economic analysis and policy , no. ( ): - . http://www.eap-journal.com.au/download.php?file= mckiernan, gerry. "open access and retrieval: liberating the scholarly literature." in e-serials collection management: transitions, trends, and technicalities, edited by david c. fowler, - . new york: haworth information press, . ———. "scholar-based initiatives in publishing." science & technology libraries , no. / ( ): - . ———. "scholar-based innovations in publishing. part i: individual and institutional initiatives." library hi tech news , no. ( ): - . ———. "scholar-based innovations in publishing. part ii: library and professional initiatives." library hi tech news , no. ( ): - . ———. "scholar-based innovations in publishing. part iii: organizational and national initiatives." library hi tech news , no. ( ): - . mcmillan, gail. "scholarly communications project: publishers and libraries." 
in filling the pipeline and paying the piper: proceedings of the fourth symposium, ed. ann okerson, - . washington, dc: office of scientific and academic publishing, association of research libraries, . medeiros, norm. "of budgets and boycotts: the battle over open access publishing." oclc systems & services , no. ( ): - . mele, salvatore. "open access publishing in high-energy physics." oclc systems & services: international digital library perspectives , no. ( ): - . mele, salvatore, heather morrison, dan d'agostino, and sharon dyas-correia. "scoap and open access." serials review , no. ( ): - . mercieca, paul. "integration and collaboration within recently established australian scholarly publishing initiatives." oclc systems & services: international digital library perspectives , no. ( ): - . metcalfe, amy scott, samuel esseh, and john willinsky. "international development and research capacities: increasing access to african scholarly publishing." canadian journal of higher education , no. ( ): - . http://ojs.library.ubc.ca/index.php/cjhe/article/view/ /pdf_ michalak, sarah c. "the evolution of sparc." serials review , no. ( ): - . mizzaro, stefano. "quality control in scholarly publishing: a new proposal." journal of the american society for information science and technology , no. ( ): - . montonen, claus. "the european physics publications scene: avant-garde and traditionalism." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art morgan, peter. "alive and kicking: a progress report on open access, institutional repositories, and health information." he@lth information on the internet , no. ( ): - . morris, sally. "open publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "open publishing: how publishers are reacting." information services & use , no. / ( ): - . ———. "scholarship-friendly publishing." liber quarterly , no. ( ). morrison, heather g. 
"the dramatic growth of open access: implications and opportunities for resource sharing." journal of interlibrary loan, document delivery & electronic reserve , no. ( ): - . ———. "professional library & information associations should rise to the challenge of promoting open access and lead by example." library hi tech news , no. ( ): - . ———. scholarly communication for librarians. oxford: chandos publishing, . morton, bruce. "is the journal as we know it an article of faith? an open letter to the faculty." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /mort n .html moskovkin, v. m. "institutional policies for open access to the results of scientific research." scientific and technical information processing , no. ( ): - . ———. "open access hybrid journals." scientific and technical information processing , no. ( ): - . ———. "open access to scientific knowledge. who receives dividends?" scientific and technical information processing , no. ( ): - . murray-rust, peter. "open data in science." serials review , no. ( ): - . navin, john c., and jay starratt. "does open access really make sense? a closer look at chemistry, economics, and mathematics." college and research libraries , no. ( ): - . newman, kathleen a., deborah d. blecic, and kimberly l. armstrong. scholarly communication education initiatives, spec kit . washington, dc: association of research libraries, . http://www.arl.org/bm~doc/spec book.pdf.zip nikam, khaiser, and rajendra babu. "moving from script to science . for scholarly communication." webology , no. ( ). http://www.webology.ir/ /v n /a .html odlyzko, andrew. "competition and cooperation: libraries and publishers in the transition to electronic scholarly journals." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . ———. "on the road to electronic publishing." euromath bulletin , no. ( ): - . http://www.dtc.umn.edu/~odlyzko/doc/tragic.loss.update ———. 
"tragic loss or good riddance? the impending demise of traditional scholarly journals." international journal of human-computer studies , no. ( ): - . okerson, ann. "back to academia? the case for american universities to publish their own research." logos , no. ( ): - . http://www.library.yale.edu/~okerson/case.html ———. "the law is the true embodiment of everything that's excellent': mandates—a view from the united states: based on a presentation given at the uksg seminar 'mandating and the scholarly journal article: attracting interest on deposits?', london, october ." serials: the journal for the serials community , no. ( ): - . ———. "the missing model: a 'circle of gifts.'" serials review , no. - ( ): - . ———. "open access: reflections from the united states." serials , no. ( ): - . oppenheim, charles. "electronic scholarly publishing and open access." journal of information science , no. ( ): - . organ, michael. "download statistics—what do they tell us? the example of research online, the open access institutional repository at the university of wollongong, australia." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /organ/ organ.html orsdel, lee c. van. "the state of scholarly communications: an environmental scan of emerging issues, pitfalls, and possibilities." the serials librarian , no. / ( ): - . packer, abel l. "the scielo open access: a gold way from the south." canadian journal of higher education , no. ( ): - . http://ojs.library.ubc.ca/index.php/cjhe/article/view/ /pdf payne, doug. "a revolutionary idea in publishing: economists plan online venture to challenge dominance of academic-journal companies." the chronicle of higher education, march , a -a . perciali, irene, and aaron edlin. "journals at bepress: new twists on an old model." learned publishing , no. ( ): - . peters, paul. "going all the way: how hindawi became an open access publisher." learned publishing , no. ( ): - . ———. 
"redefining scholarly publishing as a service industry." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . peterson, elaine. "librarian publishing preferences and open-access electronic journals." e-jasl: the electronic journal of academic and special librarianship , no. ( ). http://southernlibrarianship.icaap.org/content/v n /peterson_e . htm pikowsky, robert a. "a snapshot of electronic journals, august ." the serials librarian , no. ( ): - . ———. "electronic journals as a potential solution to escalating serials costs." the serials librarian , no. / ( ): - . pinfield, stephen. "journals and repositories: an evolving relationship?" learned publishing , no. ( ): - . ———. "a mandate to self archive? the role of open access institutional repositories." serials , no. ( ): - . ———. "paying for open access? institutional funding streams and oa publication charges." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art plotin, stephanie l. "legal scholarship, electronic publishing, and open access: transformation or steadfast stagnation?" law library journal , no. ( ): - . http://www.aallnet.org/products/pub_llj_v n / - .pdf plutchak, t. scott. "the impact of open access." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/articlerender.fcgi?artid= ———. "what's a serial when you're running on internet time?" the serials librarian , no. / ( ): - . pope, liz. "pubmed central: a barrier-free repository for the life sciences." the serials librarian , no. / ( ): - . pöschl, ulrich. "interactive open access publishing and public peer review: the effectiveness of transparency and self-regulation in scientific quality assurance." ifla journal , no. ( ): - . prosser, david c. "from here to there: a proposed mechanism for transforming journals from closed to open access." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. 
"the next information revolution—how open access repositories and journals will transform scholarly communications." liber quarterly , no. ( ). http://eprints.rclis.org/archive/ / ———. "on the transition of journals to open access." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/openaccess- .pdf ———. "scholarly communication in the st century—the impact of new technologies and models." serials , no. ( ): - . http://eprints.rclis.org/archive/ / ———. "the view from europe: creating international change." college & research libraries news , no. ( ): - . puplett, dave. "version identification: a growing problem." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /puplett/ quinn, frank. "a role for libraries in electronic publication." ejournal , no. ( ). http://www.ucalgary.ca/ejournal/archive/rachel/v n /article.html quinn, frank, and gail mcmillan. "library copublication of electronic journals." serials review , no. ( ): - . rae, victoria, and fytton rowland. "is there a viable business model for commercial open access publishing?" serials: the journal for the serials community , no. ( ): - . rambler, mark. "do it yourself? a new solution to the journals crisis." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . rawlins, gregory j. e. "the new publishing: technology's impact on the publishing industry over the next decade." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /rawlins . n /a http://epress.lib.uh.edu/pr/v /n /rawlins . n reich, vicky. "discipline-specific literature bases: a view of the aps model." serials review , no. - ( ): - , . richard, jennifer, denise koufogiannakis, and pam ryan. "librarians and libraries supporting open access publishing." canadian journal of higher education , no. ( ): - . http://ojs.library.ubc.ca/index.php/cjhe/article/view/ / rogers, sharon j., and charlene s. hurt. 
"how scholarly communication should work in the st century." college & research libraries (january ): - . roth, dana l. "frpaa and nih mandate: a blessing in disguise for scientific society publishers?" science & technology libraries , no. ( ): - . rowland, fytton. "electronic publishing: non-commercial alternatives." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "the peer-review process." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "print journals: fit for the future?" ariadne, no. ( ). http://www.ukoln.ac.uk/ariadne/issue /fytton/ russell, jill, and tracy kent. "paved with gold: an institutional case study on supporting open access publishing: based on a paper presented by jill russell at the rd uksg conference, edinburgh, april ." serials: the journal for the serials community , no. ( ): - . sack, john. "highwire press: ten years of publisher-driven innovation." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art sale, arthur. "the patchwork mandate." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /sale/ sale.html sathyanarayana, n.v. "open access and open j-gate." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ savenije, bas. "the figaro project: a new approach towards academic publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art schroeder, robert, and gretta e. siegel. "a cooperative publishing model for sustainable scholarship." journal of scholarly publishing , no. ( ): - . schmidt, krista d., pongracz sennyey, and timothy v. carstens. "new roles for a changing environment: implications of open access for libraries." college and research libraries , no. ( ): - . schöpfel, joachim, and hélène prost. "document supply of grey literature and open access: an update." 
interlending & document supply , no. ( ): - . schultz, t. d. "a world physics information system: an online, highly interactive, discipline-oriented facility." serials review , no. - ( ): - . schwartz, charles a. "reassessing prospects for the open access movement." college & research libraries , no. ( ): - . shao, xiaorong. "perceptions of open access publishing among academic journal editors in china." serials review , no. ( ): - . shelton, victoria. "scientific research: the publication dilemma." issues in science and technology librarianship, no. ( ). http://www.istl.org/ -spring/article .html shepherd, peter t., and julia m. wallace. "peer: a european project to monitor the effects of widespread open access archiving of journal articles: based on a presentation given at the uksg seminar 'mandating and the scholarly journal article: attracting interest on deposits?', london, october ." serials: the journal for the serials community , no. ( ): - . shieber, stuart m. "equity for open-access journal publishing." plos biology , no. ( ). http://www.plosbiology.org/article/info:doi/ . /journal.pbio. shulenburger, david e. "improving access to publicly funded research: what's in it for the institution? can we make the case?" arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/arlbr institution.pdf ———. "moving with dispatch to resolve the scholarly communication crisis: from here to near." arl: a bimonthly newsletter of research library issues and actions, no. ( ): - . http://www.arl.org/bm~doc/shulenburger.pdf singer, peter. "when shall we be free?" the journal of electronic publishing (december ). http://hdl.handle.net/ /spo. . . smith, john w. t. "the deconstructed journal—a new model for academic publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "the deconstructed journal revisited—a review of developments." 
in proceedings of the th iccc/ifip international conference on electronic publishing, edited by sely maria de souza costa, joao alvaro carvalho, ana alice baptista and ana cristina santos moreira. braga, portugal: universidade do minho, . http://elpub.scix.net/data/works/att/ .content.pdf ———. "the deconstructed (or distributed) journal—an emerging model?" in proceedings of the th online information conference, edited by jonathon lewis, - . london: learned information europe ltd, . http://library.kent.ac.uk/library/papers/jwts/dordjem.pdf ———. "prolegomena to any future e-publishing model." in electronic publishing ' : redefining the information chain, new ways and voices: proceedings of an iccc/ifip conference held at the university of karlskrona/ronneby, ronneby, sweden - may , ed. john w. t. smith, anders ardo, and peter linde, - . washington dc: iccc press, . http://library.ukc.ac.uk/library/papers/jwts/prolegomena.htm ———. "reinventing journal publishing." research information (may/june ). http://www.researchinformation.info/rimayjun djmodel.html solomon, david j. "talking past each other: making sense of the debate over electronic publication." first monday (august ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ sosteric, michael. "the international consortium for the advancement of academic publication—an idea whose time has come (finally!)." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art "sparc and chemists to collaborate on new reduced-cost journals." arl: a bimonthly newsletter of research library issues and actions, no. (august ): - . http://www.arl.org/bm~doc/acs.pdf steele, colin. "phoenix rising: new models for the research monograph?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art stern, david. "open access or differential pricing for journals: the road best traveled?" online , no. ( ): - . stimson, nancy f. 
"national institutes of health public access policy assistance: one library's approach." journal of the medical library association , no. ( ): - . http://www.ncbi.nlm.nih.gov/pmc/articles/pmc / stodolsky, david s. "consensus journals: invitational journals based upon peer review." the information society , no. ( ): - . suber, peter. "abridgment as added value." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#abrid gment ———. "after the november election." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#electi on ———. "another oa mandate: the federal research public access act of ." sparc open access newsletter, no. ( ). http://www.earlham.edu/% epeters/fos/newsletter/ - - .htm#fr paa ———. "a bill to overturn the nih policy." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#nih ———. "creating an intellectual commons through open access." ( ). http://dlc.dlib.indiana.edu/dlc/handle/ / ———. "discovery, rediscovery, and open access. part ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#redisc overy ———. "discovery, rediscovery, and open access. part ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#redisc overy ———. "elsevier offers hybrid journals." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#elsevi er ———. "a field guide to misunderstandings about open access." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#fieldg uide ———. "flipping a journal to open access." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#flip ———. "four analogies to clean energy." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#energ y ———. 
"frpaa introduced in the us house of representatives." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#frpaa ———. "germany's dfg adopts an open access policy." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#dfg ———. "good facts, bad predictions." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#facts ———. "gratis and libre open access." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#gratis -libre ———. "how should we define 'open access'?" sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm ———. "the ides of february in europe: the european commission plan for open access." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#ec ———. "the ides of february in the us: the national day of action and other preparation for frpaa." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#us ———. "implementing the new nih policy." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#nih ———. "knowledge as a public good." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#public good ———. "lessons from maryland." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#maryl and ———. "mandate momentum in ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#mand ate ———. "the mandates of january." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#mand ates ———. "the mandates of october." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#mand ates ———. "the mandates of october ." sparc open access newsletter, no. 
( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#mand ates ———. "the many-copy problem and the many-copy solution." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#many copy ———. "nine questions for hybrid journal programs." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#hybri d ———. "the oa policies of june." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#june ———. "oa wrap-up on the last congress." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#congr ess ———. "open access and quality." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ / ———. "open access and quality." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#qualit y ———. "open access and the google book settlement." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#gbs ———. "open access and the last-mile problem for knowledge." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#lastmi le ———. "open access and the self-correction of knowledge." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#selfco rrection ———. "open access builds momentum." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/openaccess.pdf ———. "open access for digitization projects." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#digiti zation ———. "open access in ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm# ———. "open access in ." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. 
. . ———. "open access in ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm# ———. "open access in ." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . ———. "open access in ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm# ———. "open access in ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm# ———. "the open access mandate at harvard." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#harva rd ———. "an open access mandate for the nih." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#nih ———. "open access overview: focusing on open access to peer-reviewed research articles and their preprints." http://www.earlham.edu/~peters/fos/overview.htm ———. "open access policy options for funding agencies and universities." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#choic epoints ———. "open access to electronic theses and dissertations." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ ———. "open access to the scientific journal literature." journal of biology , no. ( ): . http://www.earlham.edu/~peters/writing/jbiol.htm ———. "the open access tracking project (oatp)." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#oatp ———. "an open letter to the next president of the united states." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#openl etter ———. "predictions for ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#predic tions ———. "predictions for ." sparc open access newsletter, no. ( ). 
http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#predic tions ———. "predictions for ." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#predic tions ———. "a primer on open access to science and scholarship." against the grain , no. ( ): - . http://www.earlham.edu/~peters/writing/atg.htm ———. "problems and opportunities (blizzards and beauty)." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#probl ems ———. "progress toward an oa mandate at the nih, one more time." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#nih ———. "re-introduction of the bill to kill the nih policy." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#conye rs ———. "removing barriers to research: an introduction to open access for librarians." college & research libraries news , no. ( ): - , . ———. "the return of frpaa." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#frpaa suber, peter. "self-archiving diary." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#self-a rchiving ———. "signs of spring." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#acron yms ———. "ten challenges for open-access journals." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#challe nges ———. "ten lessons from the funding agency open access policies." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#lesson s ———. "three gathering storms that could cause collateral damage for open access." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#collat eral ———. "three principles for university open access policies." sparc open access newsletter, no. ( ). 
http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#princi ples ———. "trends favoring open access." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#trends ———. "twelve reminders about frpaa." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#frpaa ———. "update on the bill mandating oa at the nih." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#nih ———. "victory in the senate: update on the bill to mandate open access at the nih." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#nih ———. "what we don't know about open access: research questions in need of researchers." sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#questi ons ———. "where does the free online scholarship movement stand today?" arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/scholar- .pdf ———. "will open access undermine peer review?" sparc open access newsletter, no. ( ). http://www.earlham.edu/~peters/fos/newsletter/ - - .htm#peerre view swan, alma. "open access for indian scholarship." desidoc journal of library and information technology , no. ( ). http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/ ———. "what is new in open access?" liber quarterly: the journal of european research libraries , no. / ( ). swan, alma, and sheridan brown. "authors and open access publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. jisc/osi journal authors survey report. london: jisc and the open society institute, . http://www.jisc.ac.uk/uploaded_documents/acf .pdf "task force report looks at future of information services." bulletin of the american physical society (april ): - . tenopir, carol.
"online journals and developing nations." library journal, november , - . thatcher, sanford g. "re-engineering scholarly communication: a role for university presses?" journal of scholarly publishing (july ): - . thorn, sue, sally morris, and ron fraser. "learned societies and open access: key results from surveys of bioscience societies and researchers." serials: the journal for the serials community , no. ( ): - . till, james e. "success factors for open access." journal of medical internet research , no. ( ). "to publish and perish." policy perspectives (march ): - . turner, judith axler. "pubmed central: a good idea." the journal of electronic publishing (march ). http://hdl.handle.net/ /spo. . . turtle, elizabeth c., and martin p. courtois. "scholarly communication: science librarians as advocates for change." issues in science & technology librarianship, no. ( ). http://www.istl.org/ -summer/article .html uhlir, paul f. "re-intermediation in the republic of science: moving from intellectual property to intellectual commons." information services & use , no. - ( ): - . utter, timothy, and robert p. holley. "the scholarly communication process within the university research corridor (michigan state university, the university of michigan, and wayne state university): a case study in cooperation." resource sharing & information networks , no. / ( ): - . utulu, samuel c. avemaria, and omolara bolarinwa. "open access initiatives adoption by nigerian academics." library review , no. ( ): - . van de sompel, herbert, john erickson, sandy payette, carl lagoze, and simeon warner. "rethinking scholarly communication: building the system that scholars deserve." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /vandesompel/ vandesompel. html velterop, jan. "the golden route to open access." ercim news, no. ( ): - . http://www.ercim.org/publication/ercim_news/enw /velterop.html ———. "open access: principle, practice, progress." serials , no. ( ): - . ———. 
"open access publishing." information services & use , no. - ( ): - . ———. "should scholarly journals embrace open access (or is it the kiss of death)?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art waaijers, leo. "publish and cherish with non-proprietary peer review systems." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /waaijers/ waaijers, leo, bas savenije, and michel wesseling. "copyright angst, lust for prestige and cost control: what institutions can do to ease open access." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /waaijers-et-al/ wagner, a. ben. "a&i, full text, and open access: prophecy from the trenches." learned publishing , no. ( ): - . ———. "open access citation advantage: an annotated bibliography." issues in science and technology librarianship, no. ( ). http://www.istl.org/ -winter/article .html walker, thomas j. "two societies show how to profit by providing free access." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art waltham, mary. "open access—the impact of legislative developments." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art waters, donald j. "the metadata harvesting initiative of the mellon foundation." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/resources/pubs/br/br /br waters.shtml ———. "open access publishing and the emerging infrastructure for st-century scholarship." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . watkinson, anthony. "a publishing view of the sparc initiative." against the grain (december /january ): , . watson, linda a., ivan s. login, and jeffrey m. burns. "exploring new ways of publishing: a library-faculty partnership." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.gov/picrender.fcgi?action=stream&blobt ype=pdf&artid= weller, ann c. 
"electronic scientific information, open access, and editorial peer review: changes on the horizon." science & technology libraries , no. ( ): - . wheeler, david l. "a scholar outlines a plan to use electronic publishing to change peer review." the chronicle of higher education, february , a . williams, karen. "the acrl scholarly communications toolkit now online: a resource for administrators, faculty, and librarians." college & research libraries news , no. ( ): - . willinsky, john. the access principle: the case for open access to research and scholarship. cambridge ma: mit press, . ———. "the nine flavours of open access scholarly publishing." the journal of postgraduate medicine , no. ( ): - . http://www.jpgmonline.com/article.asp? issn= - ;year= ;volume= ;issue= ;spage= ;epage= ;aulast= ———. "the stratified economics of open access." economic analysis and policy , no. ( ): - . http://www.eap-journal.com.au/download.php?file= ———. "the unacknowledged convergence of open source, open access, and open science." first monday , no. ( ). http://firstmonday.org/issues/issue _ /willinsky/index.html wilson, robin. "provosts push a radical plan to change the way faculty research is evaluated." the chronicle of higher education, june , a -a . wittenberg, kate. "the electronic publishing initiative at columbia (epic): a university-based collaboration in digital scholarly communication." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art xia, jingfeng. "library publishing as a new model of scholarly communication." journal of scholarly publishing , no. ( ): - . yavarkovsky, jerome. "a university-based electronic publishing network." educom review (fall ): - . young, peter r. "national corporation for scholarly publishing: presentation and description of the model." serials review , no. - ( ): - . zariski, archie. "'never ending, still beginning': a defense of electronic law journals from the perspective of the e law experience." 
first monday , no. ( ). http://www.firstmonday.org/issues/issue _ /zariski/index.html zuccala, alesia. "open access and civic scientific information literacy." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html

publisher issues

alexander, johanna olson. "alliance building in the information and online database industry." portal: libraries and the academy , no. ( ): - . anderson, kent. "the useful archive." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ball, mary alice. "libraries and university presses can collaborate to improve scholarly communication or 'why can't we all just get along?'" first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ bartlett, rebecca ann, richard brown, kathleen keane, bruce wilcox, ni pfund, and thomas bacher. "university press forum: variations on a digital theme (and other matters)." journal of scholarly publishing , no. ( ): - . baveye, philippe c. "sticker shock and looming tsunami." journal of scholarly publishing , no. ( ): - . beebe, linda, and barbara meyers. "digital workflow: managing the process electronically." the journal of electronic publishing (june ). http://hdl.handle.net/ /spo. . . bederson, b., and h. lustig. "electronic publishing: the role of a large scientific society." astrophysics and space science , no. - ( ): - . bennett, liz. "electronic publishing in the new millennium." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art bennett, scott. "repositioning university presses in scholarly communication." journal of scholarly publishing (july ): - . bol, jennifer l. "online journal marketing strategies." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art boyce, p. b. "the aas program of electronic publication." astrophysics and space science , no. - ( ): - . brantley, peter.
"terroir—the hypervisor press." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . brown, laura, rebecca griffiths, and matthew rascoff. university publishing in a digital age. new york: ithaka, . http://www.ithaka.org/strategic-services/university-publishing bryant, eric. "reinventing the university press." library journal, september , - . chesler, adam, and susan king. "tier-based pricing for institutions: a new, e-based pricing mode." learned publishing , no. ( ): - . cooney-mcquat, sarah, stefan busch, and deborah kahn. "open access publishing: a viable solution for society publishers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art courant, paul n. "what might be in store for universities' presses." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . courtney, keith. "library/vendor relations: an academic publisher's perspective." journal of library administration , no. / ( ): - . cox, john. "the changing economic model of scholarly publishing: uncertainty, complexity and multi-media serials." against the grain (april ): , - . ———. "globalization, consolidation and the growth of the giants: scholarly communication, the individual, and the internet." the serials librarian , no. / ( ): - . ———. "new models for serials: redefining the serial and the licensing environment." the serials librarian , no. / ( ): - . ———. "publisher-library relationships in the digital environment." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "publishers, publishing and the internet: how journal publishing will survive and prosper in the electronic age." the electronic library (april ): - . ———. "the role of the paper-based journal in an era of electronic information." the serials librarian , no. / ( ): - . ———. "valuing and protecting our intellectual property: the lifeblood of our business." learned publishing , no. ( ): - . 
http://www.ingentaconnect.com/content/alpsp/lp/ / / /art deloughry, thomas j. "university presses try to ride the wave of electronic publishing." the chronicle of higher education, march , a -a . dixon, a. "electronic publishing at institute of physics." astrophysics and space science , no. - ( ): - . donovan, bernard. "learned societies and electronic publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art doyle, mark. "surviving the transition." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art dougherty, peter j. "reimagining the university press: a checklist for scholarly publishers." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . dryburgh, alastair. "a new framework for digital publishing decisions." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ehling, terry. "the development of an open source publishing system at cornell and penn state universities." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/bm~doc/opensource.pdf ekman, richard h. "technology and the university press: a new reality for scholarly communication." change (september/october ): - . esposito, joseph j. "stage five book publishing." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . fisher, janet h. " : a publisher's view into the crystal ball." the serials librarian , no. / ( ): - . freeman, lisa. "big challenges face university presses in the electronic age." the chronicle of higher education, april , a . ———. "university presses and scholarly communication: dilemmas and prospects in the new age." library acquisitions: practice & theory , no. ( ): - . gannon, frank. "open access: scientists as paradoxical consumers." learned publishing , no. ( ): - . 
http://www.ingentaconnect.com/content/alpsp/lp/ / / /art gottwald, matthias, henning nielsen, roger brown, and oliver renn. "proposals for quality standards for electronic stm journals." serials , no. ( ): - . gotze, dietrich. "electronic journals—market and technology." publishing research quarterly (spring ): - . greenstein, daniel. "next-generation university publishing: a perspective from california." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . guernsey, lisa, and vincent kiernan. "journals see the internet as a tool in the peer review system." the chronicle of higher education, april , a -a . habing, h. j., and j. lequeux. "the project of electronic publication of astronomy and astrophysics." astrophysics and space science , no. - ( ): - . harter, stephen, and taemin kim park. "impact of prior electronic publication on manuscript consideration policies of scholarly journals." journal of the american society for information science (august ): - . hemingway, clive. "successful journal publishing on the internet: hit or myth?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art hunter, karen a. "concerns carried into the third millennium." against the grain (february ): , . ———. "critical issues in the development of stm journal publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "electronic journal publishing: observations from inside." d-lib magazine (july/august ). http://www.dlib.org/dlib/july / hunter.html ———. "issues and experiments in electronic publishing and dissemination." information technology and libraries (june ): - . ———. "looking back to look forward: 'chicken little redux' or strategic lessons learned." information services & use , no. ( ): - . ———. "a publisher's perspective." library acquisitions: practice & theory , no. ( ): - . ———. "setting journal priorities by listening to customers." 
journal of library administration , no. ( ): - . ———. "sleepless nights redux." against the grain (february ): - , . ———. "surviving another year." against the grain (february ): , . ———. "things that keep me awake at night." against the grain (february ): - , . hunter, karen, scott virkler, and rafael sidi. "disruptive technologies: taking stm publishing into the next era." serials: the journal for the serials community , no. ( ): - . jacobson, robert l. "publishers and the net." the chronicle of higher education, june , a -a . jones, ruth. "journals e-publishing: outsourced solutions for professional, scholarly and society publishers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art kaser, richard t. "if information wants to be free . . . then who's going to pay for it?" d-lib magazine (may ). http://www.dlib.org/dlib/may /kaser/ kaser.html king, tim. "critical issues for providers of network-accessible information." educom review (summer ): - . ———. "the impact of electronic and networking technologies on the delivery of scholarly information." the serials librarian , no. / ( ): - . kutz, myer. "the scholars' rebellion against scholarly publishing practices: varmus, vitek, and venting." searcher (january ): - . http://www.infotoday.com/searcher/jan /kutz.htm lamm, donald s. "libraries and publishers: a partnership at risk." daedalus , no. ( ): - . lieb, thom. "q.a.: how about a little privacy?" the journal of electronic publishing (august ). http://hdl.handle.net/ /spo. . . litchfield, malcolm. ". . .but presses must stress ideas, not markets." the chronicle of higher education, june , b -b . lowe, chrysanne. "a publisher's view of e-journal services." serials review (spring ): - . lynch, clifford. "imagining a university press system to support scholarship in the digital age." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . mabe, michael a. "scholarly publishing." european review , no. 
( ): - . marks, jayne, and rolf a. janke. "the future of academic publishing: a view from the top." journal of library administration , no. ( ): - . mcpherson, tara. "scaling vectors: thoughts on the future of scholarly communication." the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . meyers, barbara, bob bovensculte, and ann lowry. "tossin' and turnin' all night: publishers' dreams and nightmares." against the grain (december -january ): , , , , . morris, sally. "mapping the journal publishing landscape: how much do we know?" learned publishing , no. ( ): - . ———. "open publishing: how publishers are reacting." information services & use , no. - ( ): - . ———. "who needs publishers?" journal of information science , no. ( ): - . olivieri, rené. "business, science and the common good." journal of library administration , no. / ( ): - . pochoda, phil. "university of michigan press: the future of scholarly communication: on the other side of the digital tipping point." journal of scholarly publishing , no. ( ): - . ———. "up . : some theses on the future of academic publishing." journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . prosser, david c. "between a rock and a hard place: the big squeeze for small publishers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art pullinger, david. "quality in on-line journals." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "open access: evidence-based policy or policy-based evidence? the university press perspective." serials , no. ( ): - . richardson, martin. "post-print archives: parasite or symbiont." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art rous, bernard. "how to succeed in online markets: acm: a case study." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . rowland, fytton. 
"the royal society of new zealand's journals: how can they cope with the changing serials environment?" serials , no. ( ): - . scupola, ada. "the impact of electronic commerce on the publishing industry: towards a business value complementarity framework of electronic publishing." journal of information science , no. ( ): - . shaw, d. f. "the icsu press programme on electronic publishing in science." astrophysics and space science , no. - ( ): - . siler, jennifer m. "from gutenberg to gateway: electronic publishing at university presses." journal of scholarly publishing (october ): - . singleton, alan. "open access and learned societies." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "opportunities, threats and myths in journals." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art slowinski, f. hill, and patrick bernuth. "how 'free distribution' impacts your business model: is it really free?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art stankus, tony. "electronic journal concerns and strategies of science publishers." science & technology libraries , no. / ( ): - . tananbaum, greg, and lyndon holmes. "the evolution of web-based peer-review systems." learned publishing , no. ( ): - . thatcher, sanford g. "the challenge of open access for university presses." learned publishing , no. ( ): - . ———. "towards the year ." scholarly publishing (october ): - . waltham, mary. "challenges to the role of publishers." learned publishing (january ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ———. "learned society business models and open access: overview of a recent jisc-funded study." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art white, martin. "electronic rhetoric or electronic reality?" serials (july ): - . wittenberg, kate. "reimagining the university press." 
the journal of electronic publishing , no. ( ). http://dx.doi.org/ . / . . wood, dee. "online peer review: perceptions in the biological sciences." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art

publisher issues: digital rights management and user authentication

agnew, grace. digital rights management: a librarian's guide to technology and practice. oxford: chandos publishing, . alrashid, tareq m., james a. barker, brian s. christian, steven c. cox, michael w. rabne, elizabeth a. slotta, and luella r. upthegrove. "safeguarding copyrighted contents: digital libraries and intellectual property management: cwru's rights management system." d-lib magazine (april ). http://www.dlib.org/dlib/april / barker.html arms, william yeo. "implementing policies for access management." d-lib magazine (february ). http://www.dlib.org/dlib/february /arms/ arms.html barrow, e. "protecting published science." astrophysics and space science , no. - ( ): - . berghel, hal, and lawrence o'gorman. "protecting ownership rights through digital watermarking." computer (july ): - . bide, mark, and alicia wise. " st-century rights management: why does it matter and what is being done?" learned publishing , no. ( ): - . boettcher, judith v., robert brentrup, and john douglass. "digital certificates: coming of age." educause review (january/february ): - . http://www.educause.edu/ir/library/pdf/erm .pdf böhner, dörte. "digital rights description as part of digital rights management: a challenge for libraries." library hi tech , no. ( ): - . caplan, priscilla. "doi or don't we?" the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /capl n .html calow, duncan, and rebecca egan. "is the answer still in the machine: do publishers need digital rights management?" learned publishing , no. ( ): - . carvajal, doreen. "an electronic sheriff to battle book rustling." the new york times, september , c , c . cohen, julie e.
"drm and privacy." communications of the acm , no. ( ): - . cornish, graham p. "electronic copyright management systems: dream, nightmare or reality?" ifla journal , no. ( ): - . coyle, karen. "the 'rights' in digital rights management." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /coyle/ coyle.html ———. "rights management and digital library requirements." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /coyle/ ———. "the role of digital rights management in library lending." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= cox, john. "digital rights management: old hat or new wrinkle?—ready or not, drm is dramatically altering today's publishing landscape." against the grain , no. ( ): , , . davidson, lloyd a., and kimberley douglas. "digital object identifiers: promise and problems for scholarly publishing." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . davis, denise m., and tim lafferty. "digital rights management: implications for libraries." the bottom line , no. ( ): - . duranceau, ellen finnie. "examining the user registration model for e-journal access." serials review , no. ( ): - . erickson, john s. "a digital object approach to interoperable rights management: fine-grained policy enforcement enabled by a digital object infrastructure." d-lib magazine (june ). http://www.dlib.org/dlib/june /erickson/ erickson.html eschenfelder, kristin r. "every library's nightmare? digital rights management, use restrictions, and licensed scholarly digital resources." college and research libraries , no. ( ): - . ———. "fair use, drm, and trusted computing." communications of the acm , no. ( ): - . ———. "technologies employed to control access to or use of digital cultural collections: controlled online collections." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /eschenfelder/ eschenfelder.html farmakis, charalampos, evangelia kopanki, and dracoulis martakos.
"managing access to electronic subscriptions." vine, no. ( ): - . felten, edward w. "a skeptical view of drm and fair use." communications of the acm , no. ( ): - . fox, barbara l., and brian a. lamacchia. "encouraging recognition of fair uses in drm systems." communications of the acm , no. ( ): - . garrett, john r., and patrice a. lyons. "toward an electronic copyright management system." journal of the american society for information science (september ): - . gervais, daniel j. "electronic rights management and digital identifier systems." the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . gladney, henry m. "safeguarding digital library contents and users: document access control." d-lib magazine (june ). http://www.dlib.org/dlib/june /ibm/ gladney.html ———. "safeguarding digital library contents and users: interim retrospect and prospects." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /gladney/ gladney.html gladney, h. m., and arthur cantu. "authorization management for digital libraries." communications of the acm (may ): - . gladney, h. m., and j. b. lotspiech. "safeguarding digital library contents and users: assuring convenient security and data quality." d-lib magazine (may ). http://www.dlib.org/dlib/may /ibm/ gladney.html ———. "safeguarding digital library contents and users: storing, sending, showing, and honoring usage terms and conditions." d-lib magazine (may ). http://www.dlib.org/dlib/may /gladney/ gladney.html gladney, henry m., fred mintzer, and fabio schiattarella. "safeguarding digital library contents and users: digital images of treasured antiquities." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /vatican/ gladney.html guenther, kim. "knock, knock, who's there? authenticating users." computers in libraries , no. ( ): - . herzberg, amir. "safeguarding digital library contents: charging for online content." d-lib magazine (january ). 
http://www.dlib.org/dlib/january /ibm/ herzberg.html hilton, james l. "digital asset management systems." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf hudomalj, emil, and avgust jauk. "authentication and authorisation infrastructure for the mobility of users of academic libraries: an overview of developments." program: electronic library and information systems , no. ( ): - . iannella, renato. "digital rights management (drm) architectures." d-lib magazine (june ). http://www.dlib.org/dlib/june /iannella/ iannella.html isaias, pedro. "electronic copyright management systems: aspects to consider." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /ecms/ ———. "technology issues and electronic copyright management systems." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /ecms/ joint, nicholas. "recent trends in authentication and national information management policy in the uk." library review , no. ( ): - . kidd, yvonne. "intellectual property rights management in the digital environment: an overview of developments and initiatives." information standards quarterly (july ): - . knopf, dominik, and christoph sorge. "model-oriented analysis of user-right holder relations and possible impacts of drm." information services & use , no. ( ): - . kohl, ulrich, jeffrey lotspiech, and marc a. kaplan. "safeguarding digital library contents and users: protecting documents rather than channels." d-lib magazine (september ). http://www.dlib.org/dlib/september /ibm/ lotspiech.html lichtenberg, james. "inching toward e-commerce." publishers weekly, december , - . magnussen, amanda. "electronic rights management in the united kingdom." library management , no. ( ): - . mann, david. "digital rights management and people with sight loss." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= martin, mairead, grace agnew, david l. kuhlman, john h. mcnair, william a. rhodes, and ron tipton. 
"federated digital rights management: a proposed drm solution for research and education." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /martin/ martin.html may, christopher. digital rights management: the problem of expanding ownership rights. oxford: chandos publishing, . ———. "digital rights management and the breakdown of social norms." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ mcleish, simon. "installing shibboleth." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /mcleish/ millman, david. "cross-organizational access management: a digital library authentication and authorization architecture." d-lib magazine (november ). http://www.dlib.org/dlib/november /millman/ millman.html mintzer, fred, jeffrey lotspiech, and norishige morimoto. "safeguarding digital library contents and users: digital watermarking." d-lib magazine (december ). http://www.dlib.org/dlib/december /ibm/ lotspiech.html mooney, stephen. "interoperability: digital rights management and the emerging ebook environment." d-lib magazine (january ). http://www.dlib.org/dlib/january /mooney/ mooney.html morgan, r. l., scott cantor, steven carmody, walter hoehn, and ken klingenstein. "federated security: the shibboleth approach." educause quarterly , no. ( ): - . http://www.educause.edu/ir/library/pdf/eqm .pdf morris, sally. "metadata and rights." vine, no. ( ): - . needleman, mark. "the shibboleth authentication/authorization system." serials review , no. ( ): - . neylon, eamonn. "first steps in an information commerce economy: digital rights management in the emerging ebook environment." d-lib magazine (january ). http://www.dlib.org/dlib/january /neylon/ neylon.html nicholson, denise rosemary. "digital rights management and access to information: a developing country's perspective." libres: library and information science research electronic journal , no. ( ). 
http://libres.curtin.edu.au/libres n /nicholson_essyop.pdf oberknapp, bernd, ato ruppert, franck borel, and jochen lienhard. "from a pile of ip addresses to a clear authentication and authorization with shibboleth." serials: the journal for the serials community , no. ( ): - . olsen, florence. "do 'digital certificates' hold the key to colleges' on-line activities?" the chronicle of higher education, december , a -a . ———. "legal concerns delay publication of research on 'digital watermarks.'" the chronicle of higher education, february , a . pack, thomas. "digital rights management: can the technology provide long-term solutions?" econtent (may ): - . paschoud, john. access and identity management: controlling access to online information. new york: neal-schuman publishers, . paskin, norman. "on making and identifying a 'copy.'" d-lib magazine (january ). http://www.dlib.org/dlib/january /paskin/ paskin.html powell, andy, and david recordon. "openid: decentralised single sign-on for the web." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /powell-recordon/ poynder, richard. "the role of digital rights management in open access." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= ramsden, anne. "copyright management technologies: the key to unlocking digital works?" ariadne, no. ( ). http://www.ariadne.ac.uk/issue /copyright/ rezmierski, virginia, and aline soules. "security vs. anonymity: the debate over user authentication and information access." educause review (march/april ): - . http://www.educause.edu/ir/library/pdf/erm .pdf robiette, alan. "managing access to electronic information: progress and prospects." serials (november ): - . roos, j. w. "copyright protection as access barrier for people who read differently: the case for an international approach." ifla journal , no. ( ): - . rosenblatt, bill. "the digital object identifier: solving the dilemma of copyright protection online." 
the journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . ———. "drm, law and technology: an american perspective." online information review , no. ( ): - . russell, carrie. "fair use under fire." library journal, august , - . http://www.libraryjournal.com/article/ca .html rust, godfrey. "metadata: the right approach: an integrated model for descriptive and rights metadata in e-commerce." d-lib magazine (july/august ). http://www.dlib.org/dlib/july /rust/ rust.html sairamesh, j., c. nikolaou, d. f. ferguson, and y. yemini. "economic framework for pricing and charging in digital libraries." d-lib magazine (february ). http://www.dlib.org/dlib/february /forth/ sairamesh.html samuelson, pamela. "drm {and, or, vs.} the law." communications of the acm , no. ( ): - . schutzer, daniel. "a need for a common infrastructure: digital libraries and electronic commerce." d-lib magazine (april ). http://www.dlib.org/dlib/april / schutzer.html sirbu, marvin a. "creating an open market for information." the journal of academic librarianship (november ): - . spitzer, stephan. "better control of user web access of electronic resources." journal of electronic resources in medical libraries , no. ( ): - . tyrväinen, pasi. "fair use licensing in library context." indicare monitor , no. ( ). http://www.indicare.org/tiki-read_article.php?articleid= wiseman, norman. "implementing a national access management system for electronic services: technology alone is not enough." d-lib magazine (march ). http://www.dlib.org/dlib/march /wiseman/ wiseman.html

repositories, e-prints, and oai

adamick, jessica, and rebecca reznik-zellen. "representation and recognition of subject repositories." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /adamick/ adamick.html ———. "trends in large-scale subject repositories." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /adamick/ adamick.html afshari, fereshteh, and richard jones.
"developing an integrated institutional repository at imperial college london." program: electronic library and information systems , no. ( ): - . http://eprints.imperial.ac.uk/handle/ / / agnew, grace, and yang yu. "the rutgers workflow management system: migrating a digital object management utility to open source." the code lib journal, no. ( ). http://journal.code lib.org/articles/ alexander, martha latika, and j. n. gautam. "institutional repositories for scholarly communication: indian initiatives." serials: the journal for the serials community , no. ( ): - . allard, suzie, thura r. mack, and melanie feltner-reichert. "the librarian's role in institutional repositories: a content analysis of the literature." reference services review , no. ( ): - . allen, james. "interdisciplinary differences in attitudes towards deposit in institutional repositories." manchester metropolitan university, . http://eprints.rclis.org/archive/ / allinson, julie, and elizabeth harbord. "sherpa to yodl-ing: digital mountaineering at york." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /allinson-harbord/ allinson, julie, and roddy macleod. "building an information infrastructure in the uk." research information (october/november ). http://www.researchinformation.info/rioctnov digital.html anderson, greg, rebecca lasher, and vicky reich. "the computer science technical report (cs-tr) project: a pioneering digital library project viewed from a library perspective." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /ande n .html andré, francis, muriel foulonneau, anne-marie badolato, and daniel charnay. "the repository jigsaw." research information (may/june ). http://www.researchinformation.info/features/feature.php?feature_id = andreoni, antonella, maria bruna baldacci, stefania biagioni, carlo carlesi, donatella castelli, pasquale pagano, carol peters, and serena pisani. 
"the ercim technical reference digital library: meeting the requirements of a european community within an international federation." d-lib magazine (december ). http://www.dlib.org/dlib/december /peters/ peters.html andrew, theo. "trends in self-posting of research material online by academic staff." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /andrew/ androulakis, steve, ashley m buckle, ian atkinson, david groenewegen, nick nicholas, andrew treloar, and anthony beitz. "archer—e-research tools for research data management." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ anscombe, nadya. "archive programmes gain momentum." research information (october/november ). http://www.researchinformation.info/rioctnov repositories.html arl digital repository issues task force. the research library's role in digital repository services: final report of the arl digital repository issues task force. washington, dc: association of research libraries, . http://www.arl.org/bm~doc/repository-services-report.pdf armbruster, chris, and laurent romary. "comparing repository types: challenges and barriers for subject-based repositories, research repositories, national repository systems and institutional repositories in serving scholarly communication." international journal of digital library systems , no. ( ): - . http://hal.inria.fr/inria- /en/ arms, caroline r. "available and useful: oai and the library of congress." library hi tech , no. ( ): - . arms, william y., naomi dushay, dave fulker, and carl lagoze. "a case study in metadata harvesting: the nsdl." library hi tech , no. ( ): - . asamoah-hassan, helena. "alternative scholarly communication: management issues in a ghanaian university." library management , no. ( ): - . aschenbrenner, andreas, tobias blanke, david flanders, mark hedges, and ben o'steen. "the future of repositories? patterns for (cross-)repository architectures." d-lib magazine , no. / ( ). 
http://www.dlib.org/dlib/november /aschenbrenner/ aschenbrenner.html aschenbrenner, andreas, tobias blanke, marc w. küster, and wolfgang pempe. "towards an open repository environment." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ ashworth, susan, morag mackie, and william j. nixon. "the daedalus project, developing institutional repositories at glasgow university: the story so far." library review , no. ( ): - . http://eprints.gla.ac.uk/ / asner, haya, and tsviya polani. "electronic theses at ben-gurion university: israel as part of the worldwide etd movement." portal: libraries and the academy , no. ( ): - . avdeeva, nina. "innovative services for libraries through the virtual reading rooms of the digital dissertation library, russian state library." ifla journal , no. ( ): - . averkamp, shawn, and joanna lee. "repurposing proquest metadata for batch ingesting etds into an institutional repository." the code lib journal, no. ( ). http://journal.code lib.org/articles/ awre, chris. "the jisc's fair programme: disclosing and sharing institutional assets." learned publishing , no. ( ): - . http://eprints.rclis.org/archive/ / awre, chris, and alma swan. "linking repositories: scoping the development of cross-institutional user-oriented services." oclc systems & services: international digital library perspectives , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art ayris, paul. "open archives: institutional issues." vine, no. ( ): - . bailey, charles w., jr. "the role of reference librarians in institutional repositories." reference services review , no. ( ): - . http://www.digital-scholarship.org/cwb/reflibir.pdf bailey, charles w., jr., karen coombs, jill emery, anne mitchell, chris morris, spencer simons, and robert wright. institutional repositories. spec kit . washington, dc: association of research libraries, . http://www.arl.org/bm~doc/spec web.pdf bankier, jean-gabriel, connie foster, and glen wiley.
"institutional repositories—strategies for the present and future." the serials librarian , no. - ( ): - . bankier, jean-gabriel, and irene perciali. "the institutional repository rediscovered: what can a university do for open access publishing?" serials review , no. ( ): - . baptista, ana alice, and miguel ferreira. "tea for two: bringing informal communication to repositories." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /baptista/ baptista.html baruch, pierre. "open access developments in france: the hal open archives system." learned publishing , no. ( ). http://hal.archives-ouvertes.fr/hal- /en/ barwick, joanna. "building an institutional repository at loughborough university: some experiences." program: electronic library and information systems , no. ( ): - . http://dspace.lboro.ac.uk/dspace/handle/ / bekaert, jeroen, emiel de kooning, and herbert van de sompel. "representing digital assets using mpeg- digital item declaration." international journal on digital libraries , no. ( ): - . http://arxiv.org/abs/cs.dl/ barton, mary r. creating an institutional repository: leadirs workbook. cambridge, ma: mit, . http://hdl.handle.net/ . / barton, mary r., and julie harford walker. "building a business plan for dspace, mit libraries' digital institutional repository." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ bell, jonathan, and stuart lewis. "using oai-pmh and mets for exporting metadata and digital objects between repositories." program: electronic library and information systems , no. ( ): - . http://cadair.aber.ac.uk/dspace/handle/ / bell, suzanne, nancy fried foster, and susan gibbons. "reference librarians and the success of institutional repositories." reference services review , no. ( ): - . https://urresearch.rochester.edu/handle/ / bell, suzanne, and nathan sarr. "case study: re-engineering an institutional repository to engage users." new review of academic librarianship , no. s ( ): - . 
http://www.informaworld.com/smpp/ftinterface~db=all~content=a ~fulltext= benjelloun, rida. "archimède: a canadian solution for institutional repository." library hi tech , no. ( ): - . bevan, simon j. "developing an institutional repository: cranfield queprints—a case study." oclc systems & services , no. ( ): - . bhat, mohammad hanief. "interoperability of open access repositories on computer science and it—an evaluation." library hi tech , no. ( ): - . ———. "open access repositories in computer science and information technology: an evaluation." ifla journal , no. ( ): - . blake, michelle. "economists online: user requirements for a subject repository." serials: the journal for the serials community , no. ( ): - . blumenstyk, goldie, and vincent kiernan. "idea of on-line archives of papers sparks debate on future of journals." the chronicle of higher education, july , a -a . blythe, erv, and vinod chachra. "the value proposition in institutional repositories." educause review , no. ( ): - . http://www.educause.edu/ir/library/pdf/erm .pdf bonilla-calero, a. i. "scientometric analysis of a sample of physics-related research output held in the institutional repository strathprints ( - )." library review , no. ( ): - . boock, michael. "improving dspace@osu with a usability study of the et/d submission process." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /boock/ bostanci, adam. "biomedical archive is first step towards national hub." research information (may/june ). http://www.researchinformation.info/features/feature.php?feature_id = bravo, blanca rodríguez, and ma luisa alvite díez. "e-science and open access repositories in spain." oclc systems & services: international digital library perspectives , no. ( ): - . breeding, marshall. "understanding the protocol for metadata harvesting of the open archives initiative." computers in libraries , no. ( ): - . breytenbach, amelia, and ria groenewald. 
"the african elephant: a digital collection of anatomical sketches as part of the university of pretoria's institutional repository—a case study." oclc systems & services: international digital library perspectives , no. ( ): - . brogan, martha l. contexts and contributions: building the distributed library. washington, dc: digital library federation, . http://www.diglib.org/pubs/dlf /index.htm brown, cecelia. "the coming of age of e-prints in the literature of physics." issues in science and technology librarianship (summer ). http://www.library.ucsb.edu/istl/ -summer/refereed.html ———. "the e-volution of preprints in the scholarly communication of physicists and astronomers." journal of the american society for information science and technology , no. ( ): - . ———. "the role of electronic preprints in chemical communication: analysis of citation, usage, and acceptance in the journal literature." journal of the american society for information science and technology , no. ( ): - . brown, cecelia, and june m. abbas. "institutional digital repositories for science and technology: a view from the laboratory." journal of library administration , no. ( ): - . brown, david j. "repositories and journals: are they in conflict?: a literature review of relevant literature." aslib proceedings , no. ( ): - . buehler, marianne a., and adwoa boateng. "the evolving impact of institutional repositories on reference librarians." reference services review , no. ( ): - . https://ritdml.rit.edu/dspace/handle/ / buehler, marianne a., and marcia s. trauernicht. "from digital library to institutional repository: a brief look at one library's path." oclc systems & services: international digital library perspectives , no. ( ): - . burrows, toby. "developing a digital repository for a humanities research network: the pioneer project." new review of academic librarianship , no. / ( ): - candee, catherine h. "the california digital library and the escholarship program." 
journal of library administration , no. / ( ): - . caplan, priscilla. "repository to repository transfer of enriched archival information packages." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /caplan/ caplan.html carim, lara. "serial killers: how great is the e-print threat to periodicals publishers?" learned publishing (april ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art carlson, jake, alexis e. ramsey, and j. david kotterman. "using an institutional repository to address local-scale needs: a case study at purdue university." library hi tech , no. ( ): - . carr, leslie, and tim brody. "size isn't everything: sustainable repositories as evidenced by sustainable deposit profiles." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /carr/ carr.html carr, les, steve hitchcock, wendy hall, and stevan harnad. "a usage based analysis of corr." journal of computer documentation (may ): - . http://eprints.ecs.soton.ac.uk/ / carriveau, kenneth l. "a brief history of e-prints and the opportunities they open for science librarians." science & technology libraries , no. / ( ): - . cassella, maria. "institutional repositories: an internal and external perspective on the value of irs for researchers' communities." liber quarterly: the journal of european research libraries , no. ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf chan, diana l. h. "an integrative view of the institutional repositories in hong kong: strategies and challenges." serials review , no. ( ): - . chan, diana l. h., catherine s. y. kwok, and steve k. f. yip. "changing roles of reference librarians: the case of the hkust institutional repository." reference services review , no. ( ): - . http://repository.ust.hk/dspace/handle/ . / chan, leslie. "supporting and enhancing scholarship in the digital age: the role of open access institutional repository " canadian journal of communication , no. ( ): - . 
http://eprints.rclis.org/archive/ / chan, leslie, and barbara kirsop. "open archiving opportunities for developing countries: towards equitable distribution of global knowledge." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /oai-chan/ chantavaridou, elisavet. "open access and institutional repositories in greece: progress so far." oclc systems & services: international digital library perspectives , no. ( ): - . charnay, daniel. "the centre for direct scientific communication." information services & use , no. / ( ): - . cocciolo, anthony. "can web . enhance community participation in an institutional repository? the case of pocketknowledge at teachers college, columbia university." the journal of academic librarianship , no. ( ): - . cole, timothy w. "using oai: innovations in the sharing of information." library hi tech , no. ( ): - . cole, timothy w., and muriel foulonneau. using the open archives initiative protocol for metadata harvesting. westport, ct: libraries unlimited, . cole, timothy w., and sarah l. shreeves. "search and discovery across collections: the imls digital collections and content project." library hi tech , no. ( ): - . https://www.ideals.uiuc.edu/handle/ / coleman, anita, paul bracke, and s. karthik. "integration of non-oai resources for federated searching in dlist, an eprints repository." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /coleman/ coleman.html coleman, anita, and joseph roback. "open access federation for library and information science: dlist and dl-harvest." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /coleman/ coleman.html connell, tschera harkness, and thomas cetwinski. "the impact of institutional repositories on technical services." technical services quarterly , no. ( ): - . copeland, susan, andrew penman, and richard milne. "electronic theses: the turning point." program: electronic library & information systems , no. ( ): - . 
https://openair.rgu.ac.uk/handle/ / correia, ana maria ramalho, and miguel de castro neto. "the role of eprint archives in the access to, and dissemination of, scientific grey literature: liza—a case study by the national library of portugal." journal of information science , no. ( ): - . covey, denise troll. "self-archiving journal articles: a case study of faculty practice and missed opportunity." portal: libraries and the academy , no. ( ): - . creaser, claire. "open access to research outputs—institutional policies and researchers' views: results from two complementary surveys." new review of academic librarianship , no. ( ): - . creel, james s., jack r. koenig, and robert mcgeachin. "automating the importation of a historic scientific serial into a digital repository." oclc systems & services: international digital library perspectives , no. ( ): - . cullen, rowena, and brenda chawner. "institutional repositories: assessing their value to the academic community." performance measurement and metrics , no. ( ): - . daly, rebecca, and michael organ. "research online: digital commons as a publishing platform at the university of wollongong, australia." serials review , no. ( ): - . darby, r. m., c. m. jones, l. d. gilbert, and s. c. lambert. "increasing the productivity of interactions between subject and institutional repositories." new review of information networking , no. ( ): - . davis, james r. "creating a networked computer science technical report library." d-lib magazine (september ). http://www.dlib.org/dlib/september / davis.html davis, james r., and carl lagoze. "ncstrl: design and deployment of a globally distributed digital library." journal of the american society for information science , no. ( ): - . davis, philip m., and matthew j. l. connolly. "institutional repositories: evaluating the reasons for non-use of cornell university's installation of dspace." d-lib magazine , no. / ( ).
http://www.dlib.org/dlib/march /davis/ davis.html delserone, leslie m. "at the watershed: preparing for research data management and stewardship at the university of minnesota libraries." library trends , no. ( ): - . devakos, rea. "towards user responsive institutional repositories: a case study." library hi tech , no. ( ): - . deng, sai, and terry reese. "customized mapping and metadata transfer from dspace to oclc to improve etd work flow." new library world , no. / ( ): - . dill, emily, and kristi l. palmer. "what's the big idea? considerations for implementing an institutional repository." library hi tech news , no. ( ): - . https://idea.iupui.edu/dspace/handle/ / dillon, cy. "philpapers breaks new ground for discipline based repositories." college & undergraduate libraries , no. ( ): - . dobratz, susanne, and birgit matthaei. "open archives activities and experiences in europe: an overview by the open archives forum." d-lib magazine (january ). http://www.dlib.org/dlib/january /dobratz/ dobratz.html dobratz, susanne, and frank scholze. "dini institutional repository certification and beyond." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / doctor, gayatri. "capturing intellectual capital with an institutional repository at a business school in india." library hi tech , no. ( ): - . drury, caroline. "building institutional repository infrastructure in regional australia." oclc systems & services: international digital library perspectives , no. ( ): - . dunsire, gordon. "collecting metadata from institutional repositories." oclc systems & services: international digital library perspectives , no. ( ): - . duranceau, ellen finnie. "the 'wealth of networks' and institutional repositories: mit, dspace, and the future of the scholarly commons." library trends , no. ( ): - . duranceau, ellen finnie, and richard rodgers. "automated ir deposit via the sword protocol: an mit/biomed central experiment." 
serials: the journal for the serials community , no. ( ): - . http://uksg.metapress.com/app/home/contribution.asp?referrer=parent&backto=issue, , ;journal, , ;linkingpublicationresults, : , estlund, karen, and anna neatrour. "utah digital repository initiative: building a support system for institutional repositories." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /neatrour/ neatrour.html fabian, carole ann. "ubdigit: a repository infrastructure for digital collections at the university at buffalo." rlg diginews , no. ( ). http://worldcat.org/arcviewer/ /occ/ / / / /viewer/file .html#article feijen, martin, and annemiek van der kuil. "a recipe for cream of science: special content recruitment for dutch institutional repositories." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /vanderkuil/ ferreira, miguel, eloy rodrigues, ana alice baptista, and ricardo saraiva. "carrots and sticks: some ideas on how to create a successful institutional repository." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /ferreira/ ferreira.html fleming, dan. "the garden of forking paths—forms of scholarship and the 'formations' pre-print system for cultural studies and related fields." computers and the humanities , no. ( ): - . foster, nancy fried, and susan gibbons. "understanding faculty to improve content recruitment for institutional repositories." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /foster/ foster.html foulonneau, muriel, and francis andre. investigative study of standards for digital repositories and related services. amsterdam: amsterdam university press, . http://dare.uva.nl/document/ foulonneau, muriel, timothy w. cole, charles blair, peter c. gorman, kat hagedorn, and jenn riley. "the cic metadata portal: a collaborative effort in the area of digital libraries." science & technology libraries , no. / ( ): - . foulonneau, muriel, thomas g. habing, and timothy w. cole. 
"automated capture of thumbnails and thumbshots for use by metadata aggregation services." d-lib magazine , no. ( ). http://www.dlib.org/dlib/january /foulonneau/ foulonneau.html french, james c., edward a. fox, kurt maly, and alan l. selman. "wide area technical report service: technical reports online." communications of the acm (april ): . fyffe, richard, and william c. welburn. "etds, scholarly communication and campus collaboration." college & research libraries news , no. ( ): - . garner, jane, lynne horwood, and shirley sullivan. "the place of eprints in scholarly information delivery." online information review , no. ( ): - . http://eprints.infodiv.unimelb.edu.au/archive/ / gedye, richard. "measuring the usage of individual research articles: based on a presentation given at the uksg seminar 'mandating and the scholarly journal article: attracting interest on deposits?', london, october ." serials: the journal for the serials community , no. ( ): - . genoni, paul. "content in institutional repositories: a collection management issue." library management , no. ( ): - . http://espace.lis.curtin.edu.au/archive/ / gerber, anna, and jane hunter. "authoring, editing and visualizing compound objects for literary scholarship." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ ghosh, maitrayee. "e-theses and indian academia: a case study of nine etd digital libraries and formulation of policies for a national service." the international information & library review , no. ( ): - . gierveld, heleen. "considering a marketing and communications approach for an institutional repository." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /gierveld/ ginsparg, paul. "as we may read." the journal of neuroscience , no. ( ): - . http://www.jneurosci.org/cgi/reprint/ / / ———. "electronic research archives for physics." 
in the impact of electronic publishing on the academic community: an international workshop organized by the academia europaea and the wenner-gren foundation, ed. i. butterworth, - . london: portland press, . ———. "winners and losers in the global research village." the serials librarian , no. / ( ): - . http://xxx.lanl.gov/blurb/pg unesco.html goodyear, marilu, and richard fyffe. "institutional repositories: an opportunity for cio campus impact." educause review , no. ( ): – . http://www.educause.edu/ir/library/pdf/erm .pdf graham, john-bauer, bethany latham skaggs, and kimberly weatherford stevens. "digitizing a gap: a state-wide institutional repository project." reference services review , no. ( ): - . gray, andrew. "institutional repositories for creative and applied arts research: the kultur project." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /gray/ green, ann g., and myron p. gutmann. "building partnerships among social science researchers, institution-based repositories and domain specific data archives." oclc systems & services , no. ( ): - . http://hdl.handle.net/ . / green, richard, and chris awre. "the remap project: steps towards a repository-enabled information environment." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /green-awre/ ———. "repomman: delivering private repository space for day-to-day use." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /green-awre/ ———. "towards a repository-enabled scholar's workbench: repomman, remap and hydra." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /green/ green.html green, richard, ian dolphin, chris awre, and robert sherratt. "the repomman project: automating workflow and metadata for an institutional repository." oclc systems & services , no. ( ): - . greene, joseph. "project management and institutional repositories: a case study at university college dublin library." new review of academic librarianship , no. s ( ): - . 
http://www.informaworld.com/smpp/ftinterface~db=all~content=a ~fulltext= greig, morag. "achieving an 'enlightened' publications policy at the university of glasgow: based on a presentation given at the uksg seminar 'mandating and the scholarly journal article: attracting interest on deposits?', london, october ." serials: the journal for the serials community , no. ( ): - . ———. "implementing electronic theses at the university of glasgow: cultural challenges." library collections, acquisitions, & technical services , no. ( ): - . http://eprints.gla.ac.uk/ / greig, morag, and william j. nixon. "daedalus: delivering the glasgow eprints service." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /greig-nixon/ groenewegen, david, and andrew treloar. "arrow and the rqf: meeting the needs of the research quality framework using an institutional research repository." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /groenewegen-treloar/ ———. "the arrow project: a consortial institutional repository solution, combining open source and proprietary software." oclc systems & services: international digital library perspectives , no. ( ): - . guédon, jean-claude. "open access archives: from scientific plutocracy to the republic of science." ifla journal , no. ( ): - . gunnarsdóttir, kristrún. "on the role of electronic preprint exchange in the distribution of scientific literature." social studies of science , no. ( ): - . guy, marieke, andy powell, and michael day. "improving the quality of metadata in eprint archives." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /guy/ hagedorn, katerina. "looking for pearls." research information (march/april ). http://www.researchinformation.info/rimarapr oaister.html ———. "oaister: a 'no dead ends' oai service provider." library hi tech , no. ( ): - . hagedorn, kat, and joshua santelli. "google still not indexing hidden web urls." d-lib magazine , no. / ( ). 
http://www.dlib.org/dlib/july /hagedorn/ hagedorn.html hahn, karla. 
"achieving the full potential of repository deposit policies." research library issues, no. ( ): - . http://www.arl.org/bm~doc/rli- -repositories.pdf halbert, martin. "the metascholar initiative: americansouth.org and metaarchive.org." library hi tech , no. ( ): - . halls, peter. "pros and cons of online archive data for academic research." serials: the journal for the serials community , no. ( ): - . halpern, joseph y. "a computing research repository." d-lib magazine (november ). http://www.dlib.org/dlib/november / halpern.html ———. "corr: a computing research repository." journal of computer documentation (may ): - . http://arxiv.org/abs/cs.dl/ hamb, christopher p., matthew a. cordial, and thomas g. habing. "design and implementation of a custom oai search and discovery service." science & technology libraries , no. / ( ): - . haque, asif-ul, and paul ginsparg. "last but not least: additional positional effects on citation and readership in arxiv." journal of the american society for information science and technology , no. ( ): - . http://arxiv.org/abs/ . ———. "positional effects on citation and readership in arxiv." journal of the american society for information science and technology , no. ( ): - . harnad, stevan, les carr, tim brody, and charles oppenheim. "mandated online rae cvs linked to university eprint archives: enhancing uk research impact and assessment." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /harnad/ harris, evan. "institutional repositories: is the open access door half open or half shut?" learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art henneken, edwin a., michael j. kurtz, guenther eichhorn, alberto accomazzi, carolyn grant, donna thompson, and stephen s. murray. "effect of e-printing on citation rates in astronomy and physics." journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . henty, margaret. 
"ten major issues in providing a repository service in australian universities." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /henty/ henty.html herb, ulrich, and matthias müller. "the long and winding road: institutional and disciplinary repository at saarland university and state library." oclc systems & services: international digital library perspectives , no. ( ): - . http://eprints.rclis.org/archive/ / hey, jessie. "targeting academic research with southampton's institutional repository." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /hey/ hey, tony, and jessie hey. "e-science and its implications for the library community." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / hirwade, mangala anil, and mohini t. bherwani. "facilitating searches in multiple bibliographical databases: metadata harvesting service providers." liber quarterly: the journal of european research libraries , no. ( ): - . http://liber.library.uu.nl/publish/articles/ /article.pdf hogenaar, arjan. "enhancing scientific communication through aggregated publications environments." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /hogenaar/ horová, iva, and radim chvála. "non-text theses as an integrated part of the university repository: a case study of the academy of performing arts in prague." liber quarterly: the journal of european research libraries , no. ( ). http://liber.library.uu.nl/publish/articles/ /article.pdf horwood, lynne, shirley sullivan, eve young, and jane garner. "oai compliant institutional repositories and the role of library staff." library management , no. / ( ): - . hu, changping, yaokun zhang, and guo chen. "exploring a new model for preprint server: a case study of cspo." journal of academic librarianship , no. ( ): - . hulse, bruce, joan f. cheverie, and claire t. dygert. "aladin research commons: a consortial institutional repository." oclc systems & services , no. ( ): - . 
http://www.istl.org/ -fall/viewpoints.html hunter, philip, and marieke guy. "metadata for harvesting: the open archives initiative, and how to find things on the web." the electronic library , no. ( ): - . hutchinson, alvin. "federal repositories: comparative advantage in open access?" issues in science and technology librarianship, no. ( ). huwe, terence k. "social sciences e-prints come of age: the california digital library's working paper repository." online (september/october ): - . hyatt, shirley, and jeffrey a. young. "oclc research publications repository." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /hyatt/ hyatt.html jacobs, neil. "digital repositories in uk universities and colleges." freepint, no. ( ). http://www.freepint.com/issues/ .htm#feature jacobs, neil, amber thomas, and andrew mcgregor. "institutional repositories in the uk: the jisc approach." library trends , no. ( ): - . jackson, allyn. "from preprints to e-prints: the rise of electronic preprint servers in mathematics." notices of the american mathematical society (january ): - . http://www.ams.org/notices/ /fea-preprints.pdf jamali, hamid r., and david nicholas. "e-print depositing behavior of physicists and astronomers: an intradisciplinary study." the journal of academic librarianship , no. ( ): - . jantz, ronald. "public opinion polls and digital preservation: an application of the fedora digital object repository system." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /jantz/ jantz.html jantz, ronald, and michael giarlo. "digital archiving and preservation: technologies and processes for a trusted repository." journal of archival organization , no. / ( ): - . jantz, ronald c., and myoung c. wilson. "institutional repositories: faculty deposits, marketing, and the reform of scholarly communication." the journal of academic librarianship , no. ( ): - . jayakanth, francis, filbert minj, usha silva, and sandhya jagirdar. 
"eprints@iisc: india's first and fastest growing institutional repository." oclc systems & services: international digital library perspectives , no. ( ): - . jeffery, keith, and anne asserson. "institutional repositories and current research information systems." new review of information networking , no. ( ): - . jenkins, barbara, elizabeth breakstone, and carol hixson. "content in, content out: the dual roles of the reference librarian in institutional repositories." reference services review , no. ( ): - . https://scholarsbank.uoregon.edu/dspace/handle/ / jerez, henry, giridhar manepalli, christophe blanchi, and laurence w. lannom. "adl-r: the first instance of a cordra registry." d-lib magazine , no. ( ). http://www.dlib.org/dlib/february /jerez/ jerez.html jewell, christine, william oldfield, and sharon reeves. "university of waterloo electronic theses: issues and partnerships." library hi tech , no. ( ): - . johnson, gareth j. "supporting the research base: the research information network and scholarly communications in the united kingdom." new review of academic librarianship , no. / ( ): - . johnson, richard k. "institutional repositories: partnering with faculty to enhance scholarly communication." d-lib magazine (november ). http://www.dlib.org/dlib/november /johnson/ johnson.html johnston, leslie. "development and assessment of a public discovery and delivery interface for a fedora repository." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /johnston/ johnston.html ———. "an overview of digital library repository development at the university of virginia library." oclc systems & services , no. ( ): - . joint, nicholas. "institutional repositories, self-archiving and the role of the library." library review , no. ( ): - . http://eprints.cdlr.strath.ac.uk/ / ———. "online digital thesis collections and national information policy: antaeus." library review , no. ( ): - . ———. "practical digital asset management and the university library." 
library review , no. ( ): - . joki, sverre magnus elvenes. "pepia: a norwegian collaborative effort for institutional repositories." oclc systems & services , no. ( ): - . jones, catherine. "collecting research output." research information (december /january ). http://www.researchinformation.info/ridec jan repositories.html ———. institutional repositories: content and culture in an open access environment. oxford: chandos publishing, . jones, richard. "dspace vs. etd-db: choosing software to manage electronic theses and dissertations." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /jones/ ———. "the tapir: adding e-theses functionality to dspace." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /jones/ jones, richard, theo andrew, and john maccoll. the institutional repository. oxford: chandos publishing, . ———. "open access, open source and e-theses: the development of the edinburgh research archive." program: electronic library & information systems , no. ( ): - . http://www.era.lib.ed.ac.uk/handle/ / jordan, mark. "the carl metadata harvester and search service." library hi tech , no. ( ): - . http://ir.lib.sfu.ca/handle/ / kaczmarek, joanne, patricia hswe, janet eke, and thomas g. habing. "using the audit checklist for the certification of a trusted digital repository as a framework for evaluating repository software applications." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /kaczmarek/ kaczmarek.html kaloyanova, stefka, gian luigi betti, francesco castellani, and johannes keizer. "achieving oai-pmh compliancy for cds/isis databases." the electronic library , no. ( ): - . kelly, john c. "creating an institutional repository at a challenged institution." oclc systems & services , no. ( ): - . kelly, julia, and louise letnes. "agecon search: a case study on the differences between operating a subject repository and an institutional repository." journal of digital information , no. ( ). 
http://journals.tdl.org/jodi/article/view/ kennan, mary anne, and concepción wilson. "institutional repositories: review and an information systems perspective." library management , no. / ( ): - . kennan, mary anne, and danny a. kingsley. "the state of the nation: a snapshot of australian institutional repositories." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ kiernan, vincent. "'open archives' project promises alternative to costly journals." the chronicle of higher education, december , a -a . kim, hyun hee, and yong ho kim. "usability study of digital institutional repositories." the electronic library , no. ( ): - . kingsley, danny. "those who don't look don't find: disciplinary considerations in repository advocacy." oclc systems & services: international digital library perspectives , no. ( ): - . kling, rob, lisa spector, and geoff mckim. "locally controlled scholarly publishing via the internet: the guild model." the journal of electronic publishing (august ). http://hdl.handle.net/ /spo. . . knowles, jacqueline. "collaboration nation: the building of the welsh repository network." program: electronic library and information systems , no. ( ): - . koenig, jack, and adam mikeal. "creating complex repository collections, such as journals, with manakin." program: electronic library and information systems , no. ( ): - . koopman, ann, and dan kipnis. "feeding the fledgling repository: starting an institutional repository at an academic health sciences library." medical reference services quarterly , no. ( ): - . krevit, leah, and linda crays. "herding cats: designing digitalcommons @ the texas medical center, a multi-institutional repository." oclc systems & services , no. ( ): - . kroth, philip j., holly e. phillips, and gale g. hannigan. "institutional repository access patterns of nontraditionally published academic content: what types of content are accessed the most?" 
journal of electronic resources in medical libraries , no. ( ): - . kushkowski, jeffrey d. "web citation by graduate students: a comparison of print and electronic theses." portal: libraries and the academy , no. ( ): - . lagoze, carl, and james r. davis. "dienst: an architecture for distributed document libraries." communications of the acm (april ): . lagoze, carl, sandy payette, edwin shin, and chris wilper. "fedora: an architecture for complex objects and their relationships." international journal on digital libraries , no. ( ): - . http://arxiv.org/abs/cs.dl/ lawal, ibironke. "scholarly communication: the use and non-use of e-print archives for the dissemination of scientific information." issues in science and technology librarianship (fall ). http://www.istl.org/ -fall/article .html leiner, barry m. "the ncstrl approach to open architecture for the confederated digital library." d-lib magazine (december ). http://www.dlib.org/dlib/december /leiner/ leiner.html lercher, aaron. "a survey of attitudes about digital repositories among faculty at louisiana state university at baton rouge." the journal of academic librarianship , no. ( ): - . lewis, stuart, leonie hayes, vanessa newton-wade, antony corfield, richard davis, tim donohue, and scott wilson. "if sword is the answer, what is the question?: use of the simple web-service offering repository deposit protocol." program: electronic library and information systems , no. ( ): - . lim, edward. "preprint servers: a new model for scholarly publishing?" australian academic & research libraries (march ): - . liu, xiaoming, tim brody, stevan harnad, les carr, kurt maly, mohammad zubair, and michael l. nelson. "a scalable architecture for harvest-based digital libraries: the odu/southampton experiments." d-lib magazine (november ). http://www.dlib.org/dlib/november /liu/ liu.html liu, xiaoming, kurt maly, michael l. nelson, and mohammad zubair. "lessons learned with arc, an oai-pmh service provider." 
library trends , no. ( ): - . http://www.ideals.uiuc.edu/handle/ / llorens, faraón, juan josé bayona, javier gómez, and francisco sanguino. "the university of alicante's institutional strategy to promote the open dissemination of knowledge." online information review , no. ( ): - . lo, meikiu, and leah m. thomas. "creating an institutional repository for state government digital publications." code lib journal, no. ( ). http://journal.code lib.org/articles/ luce, richard e. "e-prints intersect the digital library: inside the los alamos arxiv." issues in science and technology librarianship (winter ). http://www.istl.org/istl/ -winter/article .html ———. "learning from e-databases in an e-data world." educause review , no. ( ): - . http://connect.educause.edu/library/educause+review/learning fromedatabasesina/ ———. "the open archives initiative: interoperable, interdisciplinary author self-archiving comes of age." the serials librarian , no. / ( ): - . lynch, clifford a. "institutional repositories: essential infrastructure for scholarship in the digital age." portal: libraries and the academy , no. ( ): - . ———. "metadata harvesting and the open archives initiative." arl: a bimonthly report on research library issues and actions from arl, cni, and sparc, no. ( ): - . http://www.arl.org/resources/pubs/br/br /br mhp.shtml lynch, clifford a., and joan k. lippincott. "institutional repository deployment in the united states as of early ." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /lynch/ lynch.html ma, jianxia, yuanming wang, zhongming zhu, and runhuan tang. "an attempt of data exchange between the institutional repository and the information environment for the management of scientific research—arp." library collections, acquisitions, and technical services , no. ( ): - . maccoll, john, and stephen pinfield. "climbing the scholarly publishing mountain with sherpa." ariadne, no. ( ). 
http://www.ariadne.ac.uk/issue /sherpa/ mackie, morag. 
"filling institutional repositories: practical strategies from the daedalus project." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /mackie/ manepalli, giridhar, henry jerez, and michael l. nelson. "fedcor: an institutional cordra registry." d-lib magazine , no. ( ). http://www.dlib.org/dlib/february /manepalli/ manepalli.html maness, jack m., tomasz miaskiewicz, and tamara sumner. "using personas to understand the needs and goals of institutional repositories." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /maness/ maness.html manghi, paolo, marko mikulicic, leonardo candela, donatella castelli, and pasquale pagano. "realizing and maintaining aggregative digital library systems: d-net software toolkit and oaister system." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /manghi/ manghi.html manuel, kate. "the place of e-prints in the publication patterns of physical scientists." science & technology libraries , no. ( ): - . marcial, laura haak, and bradley m. hemminger. "scientific data repositories on the web: an initial survey." journal of the american society for information science and technology , no. ( ): - . http://onlinelibrary.wiley.com/doi/ . /asi. /full marcondes, carlos henrique, and luis fernando sayao. "the scielo brazilian scientific journal gateway and open archives: a report on the development of the scielo-open archives data provider server." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /marcondes/ marcondes.html marill, jennifer l., and edward c. luczak. "evaluation of digital repository software at the national library of medicine." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/may /marill/ marill.html markey, karen, soo young rieh, beth st. jean, jihyun kim, and elizabeth yakel. census of institutional repositories in the united states: miracle project research findings. washington, dc: council on library and information resources, . 
http://www.clir.org/pubs/abstract/pub abst.html markey, karen, beth st. jean, soo young rieh, elizabeth yakel, and jihyun kim. "institutional repositories: the experience of master's and baccalaureate institutions." portal: libraries and the academy , no. ( ): - . markey, karen, beth st. jean, soo young rieh, elizabeth yakel, jihyun kim, and yong-mi kim. "nationwide census of institutional repositories: preliminary findings." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ / feijen, martin, wolfram horstmann, paolo manghi, mary robinson, and rosemary russell. "driver: building the network for accessing digital repositories across europe." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /feijen-et-al/ martin, kristin e. "moving into the digital age: a conceptual model for a publications repository." internet reference services quarterly , no. ( ): - . martin, ruth. "eprints uk: developing a national e-prints archive." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /martin/ masako, suzuki, and sugita shigeki. "from nought to a thousand: the huscap project." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /suzuki-sugita/ maslov, alexey, james creel, adam mikeal, scott phillips, john leggett, and mark mcfarland. "adding oai-ore support to repository platforms." journal of digital information ( ). http://journals.tdl.org/jodi/article/view/ maslov, alexey, adam mikeal, and john leggett. "cooperation or control? web . and the digital library." journal of digital information , no. ( ). https://journals.tdl.org/jodi/article/view/ mcdonald, john. "a recipe for a successful digital archive: collection development for digital archives." against the grain , no. ( ): - . mcdowell, cat s. "evaluating institutional repository deployment in american academe since early : repositories by the numbers, part ." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /mcdowell/ mcdowell.html mckay, dana. 
"institutional repositories and their 'other' users: usability beyond authors." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /mckay/ mckiernan, gerry. "arxiv.org: the los alamos national laboratory e-print server." the international journal on grey literature , no. ( ): - . ———. "open archives initiative data providers. part i: general." library hi tech news , no. ( ): - . ———. "open archives data providers part ii: science and technology." library hi tech news , no. ( ): - . ———. "open archives initiative service providers. part i: science and technology." library hi tech news , no. ( ): - . ———. "open archives initiative service providers. part ii: social sciences and humanities." library hi tech news , no. ( ): - . ———. "open archives initiative service providers. part iii: general." library hi tech news , no. ( ): - . medeiros, norm. "e-prints, institutional archives, and metadata: disseminating scholarly literature to the masses." oclc systems & services , no. ( ): - . melero, remedios, ernest abadal, francisca abad, and josep manel rodríguez-gairín. "the situation of open access institutional repositories in spain: report." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html mercer, holly, brian rosenblum, and ada emmett. "a multifaceted approach to promote a university repository: the university of kansas' experience." oclc systems & services , no. ( ): - . http://kuscholarworks.ku.edu/dspace/handle/ / misek, marla. "escholars of the world, unite! the university of california revolutionizes publishing paradigm." econtent , no. ( ): - . mittal, rekha, and g. mahesh. "digital libraries and repositories in india: an evaluative study." program: electronic library and information systems , no. ( ): - . moed, henk f. "the effect of 'open access' on citation impact: an analysis of arxiv's condensed matter section." journal of the american society for information science and technology , no. ( ): - . 
http://arxiv.org/abs/cs.dl/ mondoux, julie, and ali shiri. "institutional repositories in canadian post-secondary institutions: user interface features and knowledge organization systems." aslib proceedings , no. ( ): - . mongin, larry, yueyu fu, and javed mostafa. "open archives data service prototype and automated subject indexing using d-lib archive content as a testbed." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /mongin/ mongin.html morris, sally. "will the parasite kill the host? are institutional repositories a fact of life—and does it matter?" serials: the journal for the serials community , no. ( ): - . moyle, martin, rebecca stockley, and suzanne tonkin. "sherpa-leap: a consortial model for the creation and support of academic institutional repositories." oclc systems & services , no. ( ): - . http://eprints.ucl.ac.uk/ / mullen, laura bowering. "increasing impact of scholarly journal articles: practical strategies librarians can share." the electronic journal of academic and special librarianship , no. ( ). http://southernlibrarianship.icaap.org/content/v n /mullen_l .html müller, eva. "e-theses and the nordic e-theses initiative. the impact of the joint work on the role of the library." liber quarterly: the journal of european research libraries , no. / ( ). http://liber.library.uu.nl/publish/articles/ /article.pdf müller, eva, uwe klosa, stefan andersson, and peter hansson. "the diva project—development of an electronic publishing system." d-lib magazine , no. ( ). http://www.dlib.org/dlib/november /muller/ muller.html müller, uwe, thomas severiens, robin malitz, and peter schirmbacher. "oa network: an integrative open access infrastructure for germany." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /mueller/ mueller.html needham, paul. "magic—shining a new light on a grey area." serials , no. ( ): - . needleman, mark. "the open archives initiative." serials review , no. ( ): - . 
nelson, michael l., joanne rocker, and terry l. harrison. "oai and nasa's scientific and technical information." library hi tech , no. ( ): - . nentwich, michael. "the european research papers archive: quality filters in electronic publishing." the journal of electronic publishing (september ). http://hdl.handle.net/ /spo. . . neuhaus, chris, ellen neuhaus, alan asher, and clint wrede. "the depth and breadth of google scholar: an empirical study." portal: libraries and the academy , no. ( ): - . nixon, william j. "daedalus: freeing scholarly communication at the university of glasgow." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /nixon/ ———. "daedalus: initial experiences with eprints and dspace at the university of glasgow." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /nixon/ ———. "the evolution of an institutional e-prints archive at the university of glasgow." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /eprint-archives/ nolan, christopher w., and jane costanza. "promoting and archiving student work through an institutional repository: trinity university, lasr, and the digital commons." serials review , no. ( ): - . okerson, ann shumelda, and james j. o'donnell, eds. scholarly journals at the crossroads: a subversive proposal for electronic publishing. washington, dc: office of scientific and academic publishing, association of research libraries, . http://www.arl.org/bm~doc/subversive.pdf organ, michael, and helen mandl. "outsourcing open access: digital commons at the university of wollongong, australia." oclc systems & services: international digital library perspectives , no. ( ): - . http://ro.uow.edu.au/asdpapers/ / palmer, carole l., lauren c. teffeau, and mark p. newton. "strategies for institutional repository development: a case study of three evolving initiatives." library trends , no. ( ): - . park, eun g., young-joon nam, and sanghee oh. "integrated framework for electronic theses and dissertations in korean contexts." 
the journal of academic librarianship , no. ( ): - . paterson, moira, david lindsay, ann monotti, and anne chin. "dart: a new missile in australia's e-research strategy." online information review , no. ( ): - . peters, thomas a. "digital repositories: individual, discipline-based, institutional, consortial, or national?" the journal of academic librarianship , no. ( ): - . phillips, holly, richard carr, and janis teal. "leading roles for reference librarians in institutional repositories: one library's experience." reference services review , no. ( ): - . pieper, dirk, and friedrich summann. "bielefeld academic search engine (base): an end-user oriented institutional repository search service." library hi tech , no. ( ): - . http://eprints.rclis.org/archive/ / pinfield, stephen. "can open access repositories and peer-reviewed journals coexist?" serials: the journal for the serials community , no. ( ): - . http://eprints.nottingham.ac.uk/ / ———. "creating institutional e-print repositories." serials , no. ( ): - . http://eprints.nottingham.ac.uk/ / ———. "how do physicists use an e-print archive? implications for institutional e-print services." d-lib magazine (december ). http://www.dlib.org/dlib/december /pinfield/ pinfield.html ———. "open archives and uk institutions: an overview." d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /pinfield/ pinfield.html pinfield, stephen, mike gardner, and john maccoll. "setting up an institutional e-print archive." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /eprint-archives/ piorun, mary, and lisa a. palmer. "digitizing dissertations for an institutional repository: a process and cost analysis." journal of the medical library association , no. ( ): - . http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid= piorun, mary e., lisa a. palmer, and jim comes. "challenges and lessons learned: moving from image database to institutional repository." oclc systems & services , no. ( ): - . platt, alice. 
"developing a repository at southern new hampshire university: a case study." microform & imaging review , no. ( ): - . http://academicarchive.snhu.edu/xmlui/handle/ / polydoratou, panayiota. "use and linkage of source and output repositories: interviews with chemistry researchers " aslib proceedings , no. ( ): - . ———. "use of digital repositories by chemistry researchers: results of a survey." program: electronic library and information systems , no. ( ): - . powell, andy. "a brief overview of the oai protocol and it's potential impact." information services & use , no. / ( ): - . prieto, adolfo g. "from conceptual to perceptual reality: trust in digital repositories." library review , no. ( ): - . prom, christopher j. "reengineering archival access through the oai protocols." library hi tech , no. ( ): - . probets, steve, and celia jenkins. "documentation for institutional repositories." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art prosser, david. "institutional repositories and open access: the future of scholarly communication." information services & use , no. - ( ): - . proudman, vanessa. "the nereus international subject-based repository: meeting the needs of both libraries and economists." library hi tech , no. ( ): - . prudlo, marion. "e-archiving: an overview of some repository management software tools." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /prudlo/ pryor, graham. "attitudes and aspirations in a diverse world: the project store perspective on scientific repositories." international journal of digital curation , no. ( ). http://www.ijdc.net/ijdc/article/view/ ———. "project store: making the connections for research." oclc systems & services , no. ( ): - . puplett, dave. "the economists online subject repository—using institutional repositories as the foundation for international open access growth." new review of academic librarianship , no. s ( ): - . 
http://www.informaworld.com/smpp/ftinterface~db=all~content=a ~fulltext= ramirez, marisa. "ferpa and student work: considerations for electronic theses and dissertations." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /ramirez/ ramirez.html read, malcolm. "libraries and repositories." new review of academic librarianship , no. / ( ): - . reilly, sean, and robert tupelo-schneck. "digital object repository server: a component of the digital object architecture." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /reilly/ reilly.html ribaric, tim. "automatic preparation of etd material from the internet archive for the dspace repository platform." the code lib journal, no. ( ). http://journal.code lib.org/articles/ richardson, w. ryan, venkat srinivasan, and edward a. fox. "knowledge discovery in digital libraries of electronic theses and dissertations: an ndltd case study." international journal on digital libraries , no. ( ): - . rieger, oya y. "select for success: key principles in assessing repository models." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /rieger/ rieger.html rieh, soo young, karen markey, beth st. jean, elizabeth yakel, and jihyun kim. "census of institutional repositories in the u.s.: a comparison across institutions at different stages of ir development." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /rieh/ rieh.html rieh, soo young, beth st. jean, elizabeth yakel, karen markey, and jihyun kim. "perceptions and experiences of staff in the planning and implementation of institutional repositories." library trends , no. ( ): - . ringersma, jacquelijn, karin kastens, ulla tschida, and jos van berkum. "a principled approach to online publication listings and scientific resource sharing." code lib journal, no. ( ). http://journal.code lib.org/articles/ robertson, r. john, mahendra mahey, and phil barker. 
"a bug's life?: how metaphors from ecology can articulate the messy details of repository interactions." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /robertson-et-al/ robinson, michael. "promoting the visibility of educational research through an institutional repository." serials review , no. ( ): - . rockman, ilene f. "distinct and expanded roles for reference librarians." reference services review , no. ( ): - . rodr, nerea guez-armentia, and carlos b. amat. "is it worth establishing institutional repositories? the strategies for open access to spanish peer-reviewed articles." learned publishing ( ): - . rodriguez, marko a., johan bollen, and herbert van de sompel. "the convergence of digital libraries and the peer-review process." journal of information science , no. ( ): - . rogers, sally a. "developing an institutional knowledge bank at ohio state university: from concept to action plan." portal: libraries and the academy , no. ( ): - . rogers-urbanek, jenica p. "closing the repository gap at small institutions." portal: libraries and the academy , no. ( ): - . roosendaal, hans e. "driving change in the research and he information market." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art royster, paul. "the institutional repository at the university of nebraska-lincoln: its first year of operations." oclc systems & services , no. ( ): - . http://digitalcommons.unl.edu/libraryscience/ / ———. "publishing original content in an institutional repository." serials review , no. ( ): - . rusch-feja, diann. "the open archives initiative and the oai protocol for metadata harvesting: rapidly forming a new tier in the scholarly communication infrastructure." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art russell, jill. "ethos: from project to service." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /russell/ ———. "ethosnet: building a uk e-theses community." ariadne, no. ( ). 
http://www.ariadne.ac.uk/issue /russell/ ———. "ethos: progress towards an electronic thesis service for the uk." serials , no. ( ): - . http://eprints.bham.ac.uk/ / russell, rosemary, and michael day. "institutional repository interaction with research users: a review of current practice." new review of academic librarianship , no. s ( ): - . http://www.informaworld.com/smpp/ftinterface~db=all~content=a ~fulltext= sale, arthur. "the acquisition of open access research articles." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ ———. "advice on filling your repository." serials: the journal for the serials community , no. ( ): - . http://uksg.metapress.com/app/home/contribution.asp?referrer=parent&backto=issue, , ;journal, , ;linkingpublicationresults, : , ———. "comparison of content policies for institutional repositories in australia." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ ———. "de-unifying a digital library." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ salo, dorothea. "innkeeper at the roach motel." library trends , no. ( ): - . schmitz, dawn. the seamless cyberinfrastructure: the challenges of studying users of mass digitization and institutional repositories. washington, dc: council on library and information resources, . http://www.clir.org/pubs/archives/schmitz.pdf sefton, peter. "re-discovering repository architecture: adding discovery as a key service." new review of information networking , no. ( ): - . ———. "towards scholarly html." serials review , no. ( ): - . sefton, peter, and jim downing. "ice-theorem—end to end semantically aware eresearch infrastructure for theses." journal of digital information , no. ( ). http://journals.tdl.org/jodi/article/view/ shearer, kathleen. "the carl institutional repositories project: a collaborative approach to addressing the challenges of irs in canada."
library hi tech , no. ( ): - . shin, eun-ja. "the challenges of open access for korea's national repositories." interlending & document supply , no. ( ): - . ———. "implementing a collaborative digital repository: the dcollection experience in south korea." interlending & document supply , no. ( ): - . shoeb, zahid hossain. "developing an institutional repository at a private university in bangladesh." oclc systems & services: international digital library perspectives , no. ( ): - . shreeves, sarah l., thomas g. habing, kat hagedorn, and jeffrey a. young. "current developments and future trends for the oai protocol for metadata harvesting." library trends , no. ( ): - . https://www.ideals.uiuc.edu/handle/ / shreeves, sarah l., joanne s. kaczmarek, and timothy w. cole. "harvesting cultural heritage metadata using the oai protocol." library hi tech , no. ( ): - . simeoni, fabio. "the case for metadata harvesting." library review , no. ( ): - . simons, gary, and steven bird. "building an open language archives community on the oai foundation." library hi tech , no. ( ): - . http://arxiv.org/abs/cs/ simpson, pauline, and jessie hey. "repositories for research: southampton's evolving role in the knowledge cycle." program: electronic library and information systems , no. ( ): - . http://eprints.soton.ac.uk/ / sitas, anestis. "cdsware (cern document server software)." library hi tech , no. ( ): - . smith, arthur p. "the journal as an overlay on preprint databases." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art smith, kathlin. "institutional repositories and e-journal archiving: what are we learning?" journal of electronic publishing , no. ( ). http://hdl.handle.net/ /spo. . . smith, mackenzie, mary barton, mick bass, margret branschofsky, greg mcclellan, dave stuve, robert tansley, and julie harford walker. "dspace: an open source dynamic digital repository." d-lib magazine (january ). 
http://www.dlib.org/dlib/january /smith/ smith.html solla, leah. "building digital archives for scientific information." issues in science and technology librarianship (fall ). http://www.istl.org/ -fall/article .html sompel, herbert van de, ryan chute, and patrick hochstenbach. "the adore federation architecture: digital repositories at scale." international journal on digital libraries , no. ( ): - . sompel, herbert van de, and carl lagoze. "interoperability for the discovery, use, and re-use of units of scholarly communication." ctwatch quarterly , no. ( ). http://www.ctwatch.org/quarterly/articles/ / /interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/index.html spedding, vanessa. "the infrastructure is there: time to populate." research information (july/august ). http://www.researchinformation.info/special selfarchiving.html stanger, nigel, and graham mcgregor. "eprints makes its mark." oclc systems & services , no. ( ): - . staples, thornton, ross wayland, and sandra payette. "the fedora project: an open-source digital object repository management system." d-lib magazine , no. ( ). http://www.dlib.org/dlib/april /staples/ staples.html stevenson, valerie, and sue hodges. "setting up a university digital repository: experience with digitool." oclc systems & services: international digital library perspectives , no. ( ): - . sugita, shigeki, kunie horikoshi, masako suzuki, shin kataoka, e. s. hellman, and keiji suzuki. "linking service to open access repositories." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/march /sugita/ sugita.html suleman, hussein, and edward a. fox. "leveraging oai harvesting to disseminate theses." library hi tech , no. ( ): - . http://pubs.cs.uct.ac.za/archive/ / ———. "the open archives initiative: realizing simple and effective digital library interoperability." journal of library administration , no. / ( ): - . summers, ed. "building oai-pmh harvesters with net::oai::harvester."
ariadne, no. ( ). http://www.ariadne.ac.uk/issue /summers/ sutradhar, b. "design and development of an institutional repository at the indian institute of technology kharagpur." program: electronic library and information systems , no. ( ): - . swain, dillip k. "global adoption of electronic theses and dissertations." library philosophy and practice ( ). http://unllib.unl.edu/lpp/dillip-swain.htm swan, alma, and sheridan brown. open access self-archiving: an author study. truro, uk: key perspectives limited, . http://cogprints.org/ / /jisc .pdf swan, alma, and leslie carr. "institutions, their repositories and the web." serials review , no. ( ): - . swan, alma, paul needham, steve probets, adrienne muir, charles oppenheim, ann o'brien, rachel hardy, fytton rowland, and sheridan brown. "developing a model for e-prints and open access journal content in uk further and higher education." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art tananbaum, greg. "adventures in open data." learned publishing , no. ( ): - . tarrant, david, ben o'steen, tim brody, steve hitchcock, neil jefferies, and leslie carr. "using oai-ore to transform digital repositories into interoperable storage and services applications." the code lib journal, no. ( ). http://journal.code lib.org/articles/ taubes, gary. "aps starts electronic preprint service." science, july , . ———. "publication by electronic mail takes physics by storm." science, february , - . ternier, stefaan, erik duval, david massart, alessandro campi, and stefano ceri. "interoperability for searching learning object repositories: the prolearn query language." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /ceri/ ceri.html thomas, amber, and andrew rothery. "online repositories for learning materials: the user perspective." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /thomas-rothery/ thomas, chuck, and robert h. mcdonald. 
"measuring and comparing participation patterns in digital repositories: repositories by the numbers, part ." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/september /mcdonald/ mcdonald.html thomas, gwenda. "evaluating the impact of the institutional repository, or positioning innovation between a rock and a hard place." new review of information networking , no. ( ): - . till, james e. "predecessors of preprint servers." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art traylor, terry dennis. "the preprint network: a new dynamic in information access from the u.s. department of energy." journal of government information , no. ( ): - . treloar, andrew. "design and implementation of the australian national data service." international journal of digital curation , no. ( ). http://www.ijdc.net/index.php/ijdc/article/view/ treloar, andrew, and david groenewegen. "arrow, dart and archer: a quiver full of research repository and related projects." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /treloar-groenewegen/ troman, anthony, neil jacobs, and susan copeland. "a new electronic service for uk theses: access transformed by ethos." interlending & document supply , no. ( ): - . van de sompel, herbert, thomas krichel, michael l. nelson, patrick hochstenbach, victor m. lyapunov, kurt maly, mohammad zubair, mohamed kholief, xiaoming liu, and heath o'connell. "the ups prototype: an experimental end-user service across e-print archives." d-lib magazine (february ). http://www.dlib.org/dlib/february /vandesompel-ups/ vandesom pel-ups.html van de sompel, herbert, and carl lagoze. "the santa fe convention of the open archives initiative." d-lib magazine (february ). http://www.dlib.org/dlib/february /vandesompel-oai/ vandesomp el-oai.html van de sompel, herbert, carl lagoze, jeroen bekaert, xiaoming liu, sandy payette, and simeon warner. "an interoperable fabric for scholarly value chains." d-lib magazine , no. ( ). 
http://www.dlib.org/dlib/october /vandesompel/ vandesompel.html van de sompel, herbert, michael l. nelson, carl lagoze, and simeon warner. "resource harvesting within the oai-pmh framework." d-lib magazine , no. ( ). http://www.dlib.org/dlib/december /vandesompel/ vandesompel.html van de sompel, herbert, jeffrey a. young, and thomas b. hickey. "using the oai-pmh . . . differently." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/july /young/ young.html van der graaf, maurits. "driver: seven items on a european agenda for digital repositories." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /vandergraf/ van der graaf, maurits, and kwame van eijndhoven. the european repository landscape: inventory study into present type and level of oai compliant digital repository activities in the eu. amsterdam: amsterdam university press, . http://dare.uva.nl/document/ van der kuil, annemiek, and martin feijen. "the dawning of the dutch network of digital academic repositories (dare): a shared experience." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /vanderkuil/ van deventer, martie, and heila pienaar. "south african repositories: bridging knowledge divides." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /vandeventer-pienaar/ van westrienen, gerard, and clifford a. lynch. "academic institutional repositories: deployment status in nations as of mid ." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /westrienen/ westrienen.html vernooy-gerritsen, marjan, gera pronk, and maurits van der graaf. "three perspectives on the evolving infrastructure of institutional research repositories in europe." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /vernooy-gerritsen-et-al/ vijayakumar, j. k., t. a. v. murthy, and m. t. m. khan. "electronic theses and dissertations and academia: a preliminary study from india." the journal of academic librarianship , no. ( ): - . waaijers, leo.
"the dare chronicle: open access to research results and teaching material in the netherlands." ariadne, no. ( ). http://www.ariadne.ac.uk/issue /waaijers/ wallace, loretta. starting, strengthening, and managing institutional repositories: a how-to-do-it-manual. new york: neal-schuman publishers, . walsh, maureen p. "batch loading collections into dspace: using perl scripts for automation and quality control." information technology and libraries , no. ( ): - . walters, tyler o. "reinventing the library—how repositories are causing librarians to rethink their professional roles." portal: libraries and the academy , no. ( ): - . http://smartech.gatech.edu/handle/ / ———. "strategies and frameworks for institutional repositories and the new support infrastructure for scholarly communications." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /walters/ walters.html ward, jewel. "unqualified dublin core usage in oai-pmh data providers." oclc systems & services , no. ( ): - . ware, mark. "institutional repositories and scholarly publishing." learned publishing , no. ( ): - . http://www.ingentaconnect.com/content/alpsp/lp/ / / /art warner, simeon. "e-prints and the open archives initiative." library hi tech , no. ( ): - . watson, sarah. "authors' attitudes to, and awareness and use of, a university institutional repository." serials: the journal for the serials community , no. ( ): - . https://dspace.lib.cranfield.ac.uk/handle/ / watterworth, melissa. "planting seeds for a successful institutional repository: role of the archivist as manager, designer, and policymaker." journal of archival organization , no. / ( ): - . weenink, kasja, leo waaijers, and karen van godtsenhoven, eds. a driver's guide to european repositories. amsterdam: amsterdam university press, . http://dare.uva.nl/document/ westell, mary. "institutional repositories: proposed indicators of success." library hi tech , no. ( ): - . wheatley, paul. 
institutional repositories in the context of digital preservation, dpc technology watch series report. london: digital preservation coalition, . - . http://www.dpconline.org/docs/dpctwf word.pdf winterbottom, anna, and james north. "building an open access african studies repository using web . principles." first monday , no. ( ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/vie w/ wise, marie, lisa spiro, geneva henry, and sidney byrd. "expanding roles for the institutional repository." oclc systems & services , no. ( ): - . witt, michael. "institutional repositories and research data curation in a distributed environment." library trends , no. ( ): - . witten, ian h., david bainbridge, robert tansley, chi-yu huang, and katherine j. don. "stoned: a bridge between greenstone and dspace." d-lib magazine , no. ( ). http://www.dlib.org/dlib/september /witten/ witten.html wleklinski, joann m. "studying google scholar: wall to wall coverage?" online , no. ( ): - . wojciechowska, anna. "analysis of the use of open archives in the fields of mathematics and computer science." oclc systems & services , no. ( ): - . http://archivesic.ccsd.cnrs.fr/sic_ /en/ wolpers, martin, martin memmel, joris klerkx, gonzalo parra, bram vandeputte, erik duval, rafael schirru, and katja niemann. "bridging repositories to form the mace experience " new review of information networking , no. ( ): - . wong, gabrielle k. w. "exploring research data hosting at the hkust institutional repository." serials review , no. ( ): - . wrenn, george, carolyn j. mueller, and jeremy shellhase. "institutional repository on a shoestring." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/january /wrenn/ wrenn.html xia, jingfeng. "assessment of self-archiving in institutional repositories: across disciplines." the journal of academic librarianship , no. ( ): - . ———. "a comparison of subject and institutional repositories in self-archiving practices." 
the journal of academic librarianship , no. ( ): - . ———. "disciplinary repositories in the social sciences." aslib proceedings: new information perspectives , no. ( ): - . ———. "personal name identification in the practice of digital repositories." program: electronic library and information systems , no. ( ): - . xia, jingfeng, and li sun. "assessment of self-archiving in institutional repositories: depositorship and full-text availability." serials review , no. ( ): - . ———. "factors to assess self-archiving in institutional repositories." serials review , no. ( ): - . xiang, xiaorong, and eric lease morgan. "exploiting 'light-weight' protocols and open source tools to implement digital library collections and services." d-lib magazine , no. ( ). http://www.dlib.org/dlib/october /morgan/ morgan.html yakel, elizabeth. "digital assets for the next millennium." oclc systems & services , no. ( ): - . yiotis, kristin. "electronic theses and dissertation (etd) repositories: what are they? where do they come from? how do they work?" oclc systems & services: international digital library perspectives , no. ( ): - . yu, shien-chiang, hsueh-hua chen, and huai-wen chang. "building an open archive union catalog for digital archives." the electronic library , no. ( ): - . zuber, peter a. "a study of institutional repository holdings by academic discipline." d-lib magazine , no. / ( ). http://www.dlib.org/dlib/november /zuber/ zuber.html zuccala, alesia, charles oppenheim, and rajveen dhiensa. "managing and evaluating digital repositories." information research: an international electronic journal , no. ( ). http://informationr.net/ir/ - /paper .html appendix a. related bibliographies bailey, charles w., jr. digital curation and preservation bibliography. houston: digital scholarship, . http://www.digital-scholarship.org/dcpb/dcpb.htm ———. "electronic publishing on networks: a selective bibliography of recent works." the public-access computer systems review , no. ( ): - . 
http://epress.lib.uh.edu/pr/v /n /bailey. n ———. "electronic publishing on networks: part ii of a selective bibliography." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /bailey. n ———. electronic theses and dissertations bibliography. houston: digital scholarship, - . http://www.digital-scholarship.org/etdb/etdb.htm ———. google books bibliography. houston: digital scholarship, - . http://www.digital-scholarship.org/gbsb/gbsb.htm ———. institutional repository bibliography. houston: digital scholarship, - . http://digital-scholarship.org/irb/irb.html ———. "network-based electronic publishing of scholarly works: a selective bibliography." the public-access computer systems review , no. ( ): - . http://epress.lib.uh.edu/pr/v /n /bail n .html ———. open access bibliography: liberating scholarly literature with e-prints and open access journals. washington, dc: association of research libraries, . http://www.digital-scholarship.org/oab/oab.htm ———. transforming scholarly publishing through open access: a bibliography. houston: digital scholarship, . http://www.digital-scholarship.org/tsp/transforming.htm appendix b. about the author charles w. bailey, jr. is the publisher of digital scholarship (http://digital-scholarship.org/). from to , he was the assistant dean for digital library planning and development at the university of houston libraries. from to , he served as assistant dean/director for systems at the university of houston libraries. from - , he served as a media librarian at a university media center, a technical writer at a bibliographic utility, a systems librarian at a university research library, and the head of the systems department at a university medical library. he holds master's degrees in information and library science and instructional media and technology. 
in , bailey established pacs-l, a mailing list about public-access computers in libraries, and the public-access computer systems review, one of the first open access journals published on the internet. he served as pacs-l moderator until november and as the founding editor-in-chief of the public-access computer systems review until the end of . in , bailey and dana rooks established public-access computer systems news, an electronic newsletter, and bailey co-edited this publication until . in , he founded the pacs-p mailing list for announcing the publication of selected e-serials, and he moderated this list until . in recognition of his early electronic publishing efforts, bailey was given a network citizen award by the apple library in and the first lita/library hi tech award for outstanding achievement in communicating to educate practitioners within the library field in library and information technology in . in , he established the scholarly electronic publishing bibliography (sepb), an open access book that has been updated over times. in , he added scholarly electronic publishing resources, a directory of relevant websites, to sepb. in , he added the scholarly electronic publishing weblog, which announces relevant new publications, to sepb. in , he was selected as a team member of current cites, and he has subsequently been a frequent contributor of reviews to this monthly e-serial. bailey was profiled in the movers & shakers : the people who are shaping the future of libraries supplement to the march , issue of library journal. in , bailey established digital scholarship, which provides information and commentary about digital copyright, digital curation, digital repositories, open access, scholarly communication, and other digital information issues. digital scholarship's digital publications are open access. both print and digital publications are under versions of the creative commons attribution-noncommercial license. 
in , he also established digitalkoans, a weblog that covers the same topics as digital scholarship. in , he also published the open access bibliography: liberating scholarly literature with e-prints and open access journals with the association of research libraries (a paperback, a pdf file, and an xhtml website), the electronic theses and dissertations bibliography, the google book search bibliography, and the "open access webliography" (with adrian k. ho). in , he published author's rights, tout de suite and institutional repositories, tout de suite. in , he published the scholarly electronic publishing bibliography: annual edition (a paperback, a kindle e-book, and a pdf file) and the institutional repository bibliography. in , he published the digital curation and preservation bibliography, digital scholarship (a paperback and a pdf file), and transforming scholarly publishing through open access: a bibliography (a paperback, a pdf file, and an xhtml website). the charleston advisor gave bailey a best content by an individual award for transforming scholarly publishing through open access: a bibliography and his other digital publications. with the exception of the open access bibliography: liberating scholarly literature with e-prints and open access journals and transforming scholarly publishing through open access: a bibliography, bailey periodically updates his digital scholarship bibliographies. for more details, see the "digital scholarship publications overview" (http://digital-scholarship.org/about/overview.htm). bailey has written numerous papers about open access, scholarly electronic publishing, and other topics. see "selected publications of charles w. bailey, jr." for a more complete description of his publications (http://digital-scholarship.org/cwb/bailey.htm). his e-mail address is digitalscholarship at gmail.com.
the lives and after lives of data

christine l. borgman

harvard data science review • .
published on: jun , updated on: oct ,
doi: . / f . a bdb

abstract

the most elusive term in data science is ‘data.’ while often treated as objects to be computed upon, data is a theory-laden concept with a long history. data exist within knowledge infrastructures that govern how they are created, managed, and interpreted. by comparing models of data life cycles, implicit assumptions about data become apparent. in linear models, data pass through stages from beginning to end of life, which suggest that data can be recreated as needed. cyclical models, in which data flow in a virtuous circle of uses and reuses, are better suited for irreplaceable observational data that may retain value indefinitely. in astronomy, for example, observations from one generation of telescopes may become calibration and modeling data for the next generation, whether digital sky surveys or glass plates. the value and reusability of data can be enhanced through investments in knowledge infrastructures, especially digital curation and preservation. determining what data to keep, why, how, and for how long, is the challenge of our day.

keywords: astronomy, curation, data, digital curation, life cycles, observations, preservation, reuse, science, stewardship

introduction

as an interdisciplinary journal of data science whose goal is to provoke dialog among diverse stakeholders, the harvard data science review is an ideal venue to explicate concepts whose terminological simplicity masks highly contested territory. ‘data’ is the most elusive term of all.
data are often treated as objective entities to be computed upon, defined as facts or numbers, or operationalized by lists of examples. in practical business situations where correlation matters more than causation, such declarative simplicity may suffice. in scholarly contexts, however, data, facts, information, and knowledge are theory-laden concepts with long and contentious histories (blair, ; buckland, ; case, ; leonelli, ; meadows, ; rosenberg, ). researchers are exceedingly clever at treating almost anything as data, be it the air we breathe, clothes we wear, traces of our digital lives, or photons captured by astronomical instruments. in scientific contexts, data can be viewed as “entities used as evidence of phenomena for the purposes of research or scholarship” (borgman, , p. ). from a humanities perspective, “the concept of data as a given has to be rethought through a humanistic lens and characterized as capta, taken and constructed. … rooted in a co-dependent relation between observer and experience” (drucker, ).

data and infrastructure

whether in science, humanities, business, or government contexts, data are a human construct. people decide what are data for a given purpose, how those data are to be interpreted, and what constitutes appropriate evidence. one scientist’s signal is another’s noise. one politician’s fact is another’s fake news. data exist within knowledge infrastructures that govern how they are created, managed, used, and interpreted (edwards et al., ). as infrastructures evolve, so do the characteristics and usability of data embedded within them. the notion of ‘data life cycle’ reflects the array of knowledge infrastructures that govern the flows of data.
the term life cycle originated in biology in the th century as a linear model (“oxford english dictionary,” ): “the sequence of stages through which an individual organism passes from origin as a zygote to death, or through which the members of a species pass from the production of gametes by one generation to that by the next.” life cycle is used similarly in business and economic contexts to span processes from their beginning through decay or ending. an example is personnel records that are created when a person is hired and destroyed at the end of a legally defined records retention cycle.

the common alternative to a linear data life cycle is a circular model, where data flow continually through stages. these models are common in scholarly communication and in other areas that benefit from the ability to mine and combine data indefinitely. figure , a ‘research life cycle’ from a library perspective, illustrates the flow of scholarly products. in the planning stage of a project, researchers typically describe a problem and determine the research design. in the implementation stage, assets such as data are collected, organized, described, and analyzed. the next stage is to publish the resulting work, which may include depositing associated datasets for public access. once published, the research findings may be disseminated further through social media, indexing and abstracting services, and various ‘impact’ mechanisms. the next stage in figure is preservation, which includes reliable storage and migration to new technologies that ensure continuous availability. the last and connecting stage is reuse, when research products become input to the planning and implementation of new research projects.
the idea behind the life cycle model in figure is to encourage researchers to think in terms of a virtuous circle wherein their work has greater impact, for longer periods of time, through dissemination and preservation of their research products. libraries provide essential elements of the knowledge infrastructure for this virtuous circle, such as dissemination, curation, preservation, and access. in principle, a student or other researcher could begin an inquiry at any point in the cycle or could skip a stage or two. questions provoked by the dissemination process could lead to reuse of data, as could datasets stored in archives, for example.

conversely, projects may proceed only through parts of this research life cycle. researchers may fail to complete a project or fail to publish their findings. publications may or may not receive citations from other authors. only a minority of researchers preserve their datasets in ways that the data remain findable and accessible. even if datasets are available, those data may not be reused by others.

figure , a much more complex model that is widely adopted in the digital archiving community, also focuses on keeping digital data alive for long periods of time. books and other paper objects often can survive indefinitely by benign neglect, given adequate storage conditions. digital records, in contrast, require active management. the digital curation life cycle model in figure , explained more fully in higgins ( ) and on the dcc site (digital curation centre, ), identifies activities that keep data available, useful, and usable. during reappraisal, archivists determine whether to continue investment in a dataset, such as migrating it to new formats and media, or whether to dispose of the dataset.
figure : research life cycle (university of california, irvine, libraries, digital scholarship services, ). reprinted with permission of the uci libraries.

digital data archives of scholarly content, such as icpsr in the social sciences, gbif for biodiversity, uniprot for protein sequences, heasarc for high energy astrophysics, or dans for humanities and archaeology, all invest in data curation in a manner similar to that of the dcc model (data archiving and networked services, ; gbif, ; “heasarc: nasa’s archive of data on energetic phenomena,” ; “inter-university consortium for political and social research,” ; “uniprot,” ). lacking these investments in data curation and preservation, data fade away through neglect, benign or otherwise, as storage media fail and as software versions become obsolete (borgman, , ).

figure : digital curation center curation lifecycle model (higgins, ). reprinted with permission of the digital curation centre, u.k.

the stark contrast between the popularity of linear life cycles in technical areas of data science and cyclical life cycles in the digital curation community reveals competing assumptions about data and infrastructure. if data exist only from the time they are generated de novo to when they are interpreted (wing, ; wing, janeja, kloefkorn, & erickson, ), they are ephemeral objects produced for a specific purpose. they can be discarded without further investment. in contrast, if data are entities humans created as evidence of a particular phenomenon, they may have enduring value. if those data are to be reused, they must be reusable, which requires considerable investment in the infrastructure necessary for documentation, interpretation, curation, and access.

another implicit assumption about data that distinguishes these life cycle models is whether data can be recreated. experiments and computational models can be re-executed, social media streams can be resampled, and even genome sequences can be recreated if the original tissue is available and viable. observational data, in contrast, cannot be recreated. the census of cannot be conducted again, nor can infrared images of tonight’s sky be taken tomorrow, nor can the weather conditions of july , , be observed again with modern instruments. these are time-specific observations that may be valuable indefinitely. one never steps in the same river twice, because the water continues to flow. that said, not all observational data can be kept alive, nor are all worth keeping.

open science and data stewardship

research policy initiatives for open science, open access to publications, data management plan requirements, and deposit of data associated with publications are predicated on assumptions that research data are valuable assets that should be preserved for reuse by others, whether for reproducibility, reuse for new questions or innovations, mining and integration, or other purposes (national academies of sciences, ; “numfocus,” ; wilkinson et al., ). implicit in these policies are assumptions that research data should be curated and preserved to become part of the virtuous circle presented in figure .
astronomy offers numerous examples of cyclical data life cycles in which reuse is essential, as each round of observations and instrumentation lays the foundation for the next. human observations of the cosmos long predate the written record, and the cosmos long predates humans. a contemporary case to consider is the large synoptic survey telescope (lsst), which is in its final stages of construction in chile. “engineering first light” is due in fy and science operations are due to begin in fy , commencing years of data collection (ivezic et al., ; large synoptic survey telescope, ).

many milestones could be chosen to mark the beginning of lsst. concept development and proposals began in the s, long before funding for the telescope instrument was obtained. countless design decisions and compromises were made by the time the glass was poured for the mirror, thus hardening the path to data collection. many of these design decisions are based on data obtained by earlier surveys and instruments. observations from the sloan digital sky survey, a ground-based survey that saw first light in and entered routine operations in (“sloan digital sky surveys,” ), are among those used to calibrate lsst.

more than half of the one billion dollar budget of the lsst project is devoted to data management because those data are expected to remain valuable to several generations of astronomers. the science is in the data. major astronomy missions such as chandra and hubble report that more new papers are being published from their archival data than from new observations (“chandra data archive,” ; “hubble legacy archive,” ).
old observational data yield new forms of evidence and new baselines for current evidence. lsst is expected to benefit greatly from dasch, a project begun in to digitize the harvard observatory’s collection of a half-million glass plates, acquired over a period of more than a century. because the irreplaceable observations captured on these plates represent the first complete map of the sky, they are an essential baseline comparison for lsst and other sky surveys. the scientific value of dasch lies in the infrastructure that encompasses carefully curated data, high resolution imaging, and computational features that enable astronomers to explore and visualize time-domain astronomy in ways inconceivable when these data were collected in the th and th centuries (digital access to a sky century @ harvard, ; grindlay, tang, los, & servillat, ; sobel, ).

the lives and afterlives of data depend upon many factors, such as their perceived value and the efforts invested in their curation. glass plates fell into disuse for scientific purposes when charge-coupled devices (ccds) became a viable technology. these plates are large and fragile objects that are expensive to maintain, and thus many were discarded by the time that astronomy became digital. harvard, despite the continuing specter of fires, floods, and budget cuts, managed to keep their plate collection and catalogs intact. the dedication of a core group of individuals facilitated the digital archive that is now openly available to the international community.

knowledge infrastructures for the long term

data life cycles, whether viewed as linear or cyclical processes, are necessarily reductionist. paths from data creation to interpretation and back tend to look more like a random walk than a perfect line or circle. infrastructures, by their nature, tend to be most visible when they break down. they build on an installed base and are embedded in the social practices of their communities (star & ruhleder, ).
data are selected, collected, organized, and generated by humans, using the knowledge infrastructures available to them at the time. some of those data may be short-lived, discarded when they have served their purpose, and readily recreated if later needed. other data, such as observations of the natural world, may be long-lived, with value apparent from their initial capture. much else falls in between, including observations lost before their value was recognized, duplicative material that can be done without, and sensitive data that should be destroyed regularly due to privacy and ethics risks. in data science, we ignore knowledge infrastructures at our peril. identifying principles for what to keep, why, how, and for how long, is the challenge of our day.

acknowledgements

research on astronomy data practices reported here was supported by the alfred p. sloan foundation (“if data sharing is the answer, what is the question?”, sloan # - , christine l. borgman, pi) and by the harvard-smithsonian center for astrophysics as a visiting scholar. thanks to john p. renaud of university of california, irvine, libraries for permission to use figure and to kevin ashley of the u.k. digital curation centre for permission to use figure . bernadette boscoe, michael scroggins, morgan wofford, and peter darch of ucla center for knowledge infrastructures provided comments on an earlier draft.

references

blair, a. m. ( ). too much to know: managing scholarly information before the modern age. new haven, ct: yale university press.
borgman, c. l. ( ). big data, little data, no data: scholarship in the networked world. cambridge, ma: mit press.
borgman, c. l. ( , june ). not fade away: social science research in the digital era [the social science research council].
retrieved from parameters website: http://parameters.ssrc.org/ / /not-fade-away-social-science-research-in-the-digital-era/
buckland, m. k. ( ). information as thing. journal of the american society for information science, ( ), – . https://doi.org/ . /(sici) - ( ) : < ::aid-asi > . .co; -
case, d. o. ( ). looking for information: a survey of research on information seeking, needs, and behavior ( nd ed.). san diego: academic press.
chandra data archive. ( ). retrieved january , , from http://cxc.harvard.edu/cda/
data archiving and networked services. ( ). dans: organisation and policy. retrieved july , , from https://dans.knaw.nl/en/about/organisation-and-policy
digital access to a sky century @ harvard. ( ). dasch data release. retrieved april , , from http://dasch.rc.fas.harvard.edu/index.php
digital curation centre. ( ). retrieved november , , from http://www.dcc.ac.uk/
drucker, j. ( ). humanities approaches to graphical display. digital humanities quarterly, ( ). retrieved from http://www.digitalhumanities.org/dhq/vol/ / / / .html
edwards, p. n., jackson, s. j., chalmers, m. k., bowker, g. c., borgman, c. l., ribes, d., … calvert, s. ( ). knowledge infrastructures: intellectual frameworks and research challenges (p. ). retrieved from university of michigan website: http://hdl.handle.net/ . /
global biodiversity information facility. ( ). retrieved april , , from https://www.gbif.org/
grindlay, j., tang, s., los, e., & servillat, m. ( ). opening the -year window for time-domain astronomy.
proceedings of the international astronomical union, (s ), – . https://doi.org/ . /s
heasarc: nasa’s archive of data on energetic phenomena. ( ). retrieved april , , from https://heasarc.gsfc.nasa.gov/
higgins, s. ( ). the dcc curation lifecycle model. international journal of digital curation, ( ), – . https://doi.org/ . /ijdc.v i .
hubble legacy archive. ( ). retrieved april , , from http://hla.stsci.edu/
inter-university consortium for political and social research. ( ). retrieved october , , from https://deepblue.lib.umich.edu/handle/ . /
ivezic, z., tyson, j. a., abel, b., acosta, e., allsman, r., alsayyad, y., … collaboration, for the l. ( ). lsst: from science drivers to reference design and anticipated data products. arxiv: . [astro-ph]. retrieved from http://arxiv.org/abs/ .
large synoptic survey telescope. ( , march ). lsst project schedule. retrieved march , , from https://www.lsst.org/about/timeline
leonelli, s. ( ). what counts as scientific data? a relational framework. philosophy of science, ( ), – . retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/pmc /
life cycle. ( ). in oxford english dictionary (online). retrieved from www.oed.com
meadows, j. ( ). understanding information. munchen: k. g. saur.
national academies of sciences, e. ( ). open science by design: realizing a vision for st century research. https://doi.org/ . /
numfocus: open code = better science. ( ). retrieved october , , from numfocus website: https://numfocus.org/
rosenberg, d. ( ). data before the fact. in l. gitelman (ed.), “raw data” is an oxymoron (pp. – ). cambridge ma: mit press.
sloan digital sky surveys. ( ). retrieved march , , from https://www.sdss.org/surveys/
sobel, d. ( ). the glass universe: how the ladies of the harvard observatory took the measure of the stars (reprint edition). new york: penguin books.
star, s. l., & ruhleder, k. ( ). steps toward an ecology of infrastructure: design and access for large information spaces. information systems research, ( ), – . https://doi.org/ . /isre. . .
uniprot. ( ). retrieved april , , from https://www.uniprot.org/
university of california, irvine, libraries, digital scholarship services. ( ). research life cycle. retrieved february , , from https://www.lib.uci.edu/dss
wilkinson, m. d., dumontier, m., aalbersberg, ij. j., appleton, g., axton, m., baak, a., … mons, b. ( ). the fair guiding principles for scientific data management and stewardship. scientific data, , . retrieved from http://dx.doi.org/ . /sdata. .
wing, j. m. ( , january ). the data life cycle | data science institute. retrieved march , , from https://datascience.columbia.edu/data-life-cycle
wing, j. m., janeja, v. p., kloefkorn, t., & erickson, l. c. ( ). data science leadership summit: summary report. usa: national science foundation.

this article is © by christine l. borgman. the article is licensed under a creative commons attribution (cc by . ) international license (https://creativecommons.org/licenses/by/ . /legalcode), except where otherwise indicated with respect to particular material included in the article. the article should be attributed to the author identified above.
a digital humanities bibliography

compiled by john taormina, duke university, with assistance from alexander strecker, katherine mccusker, and michael o’sullivan

aahc. “tenure guidelines.” american association for history and computing, n.d. http://theaahc.org/about/tenure-guidelines/.
aarseth, espen j. cybertext. perspectives on ergodic literature. baltimore, md: johns hopkins university press, .
abbate, j. inventing the internet. cambridge, ma: mit press, .
abelson, hal, ken ledeen, and harry lewis. blown to bits: your life, liberty, and happiness after the digital explosion. new york, ny: addison-wesley professional, .
“about the emory center for digital scholarship.” emory center for digital scholarship. http://digital scholarship.emory.edu/about/index.html.
abrams, s., j. kunze, and d. loy. “an emergent micro-services approach to digital curation infrastructure.” international journal of digital curation ( ). - . . /ijdc.v il. .
ackoff, r.l. “from data to wisdom.” journal of applied systems analysis, ( ): - .
acland, charles r. residual media. minneapolis, mn: university of minnesota press, .
adair, bill, benjamin filene, and laura koloski, eds. “throwing open the doors.” in letting go?: sharing historical authority in a user-generated world. - . left coast press, . http://arthistory .doingdh.org/readings/
adams, jennifer, and kevin b. gunn. “keeping up with … digital humanities.” association of college and research libraries, april .
adams, randy, steve gibson, and stefan muller, eds. transdisciplinary digital art: sound, vision and the new screen. heidelberg, germany: springer-verlag publications, .
adams, robyn. “bodley diplomatic correspondence project.” textal. http://www.textal.org/clouds/ f eaa.
agar, jon. the government machine: a revolutionary history of the computer. cambridge, ma: mit press, .
agosti, m, m. manfioletti, n. orio, and c. ponchia.
“enhancing end user access to cultural heritage systems: tailored narratives and human-centered computing.” in new trends in image analysis and processing: iciap international workshops, naples, italy, september . eds. a. petrosino, l. maddalena, and p. papa. - . berlin, germany: springer, .
ahlberg, kristin, william s. bryans, constance b. schulz, debbie ann doyle, kathleen franz, john r. dichtl, edward countryman, gregory e. smoak, and susan ferentinos. tenure, promotion and the publicly engaged historian. aha/ncph/oah working group on evaluating public history scholarship, , updated . http://ncph.org/cms/wp-content/uploads/engaged-historian.pdf.
aldenderfer, m. and h.d.g. maschner. anthropology, space, and geographic information systems. oxford, uk: oxford university press, .
alexander, bryan, and rebecca frost davis. “should liberal arts campuses do digital humanities? process and products in the small college world.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, .
alexander, marc. “patchworks and field-boundaries: visualizing the history of english.” digital humanities . https://www.academia.edu/ /patchworks_and_field-_boundaries_visualizing_the_history_of_english.
allen, k.m.s., s.w. green, and e.b.w. zubrow. interpreting space: gis and archaeology. london, uk: taylor and francis, .
allington, daniel, sarah brouillette, and david golumbia. “neoliberal tools (and archives): a political history of digital humanities.” los angeles review of books. may , .
alston, robin. “the eighteenth century short title catalogue: a personal history to .” http://web.archive.org/web/ /http:/www.r-alston.co.uk/estc.htm.
alvarado, rafael. “are moocs part of the digital humanities?” the transducer. january , . http://transducer.ontoligent.com/?p= .
alvarado, rafael. “the digital humanities situation.” in debates in the digital humanities. ed. matthew k. gold. - .
minneapolis, mn: university of minnesota press, .
alvarado, rafael. “start calling it digital liberal arts.” the transducer, ( ).
amelunxen, h, ed. photography after photography: memory and representation in the digital age. munich, germany: g+b arts, .
american council of learned societies. computing and the humanities: summary of a roundtable meeting. occasional paper, no. . chicago: acls, .
american council of learned societies. our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies, .
american historical association. american historical association statement on policies regarding the embargoing of completed history phd dissertations. https://www.historians.org/publications-and-directories/perspectives-on-history/summer- /american-historical-association-statement-on-policies-regarding-the-embargoing-of-completed-history-phd-dissertations.
amsterdam centre for digital humanities. “modeling crowdsourcing for cultural heritage.” http://cdh.uva.nl/projects- - /m.o.c.c.a.html
anderson, chris. makers: the new industrial revolution. new york, ny: crown, .
anderson, deborah lines, ed. digital scholarship in the tenure, promotion, and review process. armonk, ny: m.e. sharpe, .
anderson, deborah lines. “introduction.” in digital scholarship in the tenure, promotion, and review process. ed. deborah lines andersen. - . armonk, ny: m.e. sharpe, .
anderson, erin r., and trisha n. campbell. “ethics in the making.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, minnesota: university of minnesota press, .
anderson, richard. “is a rational discussion of open access possible?” (transcript url: http://discussingoa.wordpress.com/ ; video url: http://library.si.edu/webcasts/rick-anderson-rational-discussion-open-access.)
anderson, richard.
“print on the margins.” in library journal, , no. ( ): - . url: http://lj.libraryjournal.com/ / /academic-libraries/print-on-the-margins-circulation-trends-in-major-research-libraries/.
anderson, steve. “what are research infrastructures?” international journal of humanities and arts computing ( - ) ( ): - .
anderson, steve, and tara mcpherson. “engaging digital scholarship: thoughts on evaluating multimedia scholarship.” profession ( ): – . url: http://www.mlajournals.org/doi/abs/ . /prof. . . . .
andrews, t.l. “the third way: philology and critical edition in the digital age.” variants ( ): - .
ankersmit, f.r. historical representation. stanford, ca: stanford university press, .
antoniou, g., and f. van harmelen. a semantic web primer. cambridge, ma: mit press, . http://www.dcc.fc.up.pt/-zp/aulas/ /pde/geral/bibliografia/mit.press.a.semantic.web.primer.ebook-tlfebook.pdf.
appleford, simon, and jennifer guliano. devdh: development for the digital humanities. .
applehans, w., a. globe, and g. laugero. managing knowledge: a practical web-based approach. reading, ma: addison-wesley, .
arango, j. “architectures.” journal of information architecture , ( ): - .
arazy, ofer, eleni stroulia, stan ruecker, cristina arias, carlos fiorentino, veselin ganev, and timothy yau. “recognizing contributions in wikis: authorship categories, algorithms, and visualizations.” journal of the american society for information science and technology . ( ): - .
archer, dawn. “digital humanities : when two became many.” literary and linguistic computing , no. (april , ): - .
archer, dawn, ed. what’s in a word-list? investigating word frequency and keyword extraction. farnham, uk: ashgate, .
arctur, david, and michael zeiler. designing geodatabases: case studies in gis data modeling. redlands, ca: esri press, .
arl/nsf workshop on long-term stewardship of digital data collections. association of research libraries, september .
url: http://www.arl.org/pp/access/nsfworkshop.shtml.
arms, w. and larsen, r. “building the infrastructure for cyberscholarship.” report of a workshop held in phoenix, arizona, national science foundation, .
arnold, m. culture and anarchy. oxford, uk: oxford university press, .
arthur, p.l., and katherine bode, eds. advancing digital humanities: research, methods, theories. basingstoke, uk: palgrave macmillan.
arts-humanities.net: guide to digital humanities and arts. http://arts-humanities.net/
artstor digital library. www.artstor.org
ashton, k. “that ‘internet of things’ thing.” journal ( ). http://www.rfidjournal.com/articles/view? .
association of college and research libraries. “changing roles of academic and research libraries.” association of college and research libraries, november . url: http://www.ala.org/ala/mgrps/divs/acrl/issues/value/changingroles.cfm.
association for literary and linguistic computing. www.allc.org
auerbach, eric. mimesis: the representation of reality in western literature. translated by w. trask. new york, ny: doubleday anchor, .
aufderheide, patricia, et al. copyright, permissions, and fair use among visual artists and the academic and museum visual arts communities: an issues report. college art association, . http://www.collegeart.org/pdf/fairuseissuesreport.pdf (pdf)
avery, j.m. “the democratization of metadata: collective tagging, folksonomies and web . .” library student journal.
ayers, edward l. “the academic culture and the it culture: their effect on teaching and scholarship.” educause review , no. ( ): - . http://www.educause.edu/educause+review/educauserreviewmagazinevolume /theacademiccultureandtheitcult/ .
ayers, edward l. “does digital scholarship have a future?” in educause review/ , no. ( ): . http://www.educause.edu/ero/article/does-digital-scholarship-have-a-future.
ayers, edward l. history in hypertext. charlottesville, va: university of virginia press, .
ayers, edward l.
“the past and futures of digital history.” virginia center for digital history, . http://www.vcdh.virginia.edu/pastsfutures.html. bady, aaron. “the mooc moment and the end of reform.” in the new inquiry. may , . http://thenewinquiry.com/blogs/zunguzungu/the-mooc-moment-and-the-end-of-reform/. bailey, moya z. “all the digital humanists are white, all the nerds are men, but some of us are brave.” journal of digital humanities , no. ( ). http://journalofdigitalhumanities.org/ - /all-the-digital-humanists-are-white-all-the-nerds-are-men-but-some-of-us-are-brave-by-moya-z-bailey/. bailey, moya, anne cong-huyen, alexis lothian, and amanda phillips. “reflections on a movement: #transformdh, growing up.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . bailey, trevor c., and anthony c. gatrell. interactive spatial data analysis. harlow: longman, . bair, sheila, and sharon carlson. “where keywords fall: using metadata to facilitate digital humanities scholarship.” journal of library metadata . ( ): - . university libraries faculty and staff publications, paper . western michigan university, january . http://scholarworks.wmich.edu/cgi/viewcontent.cgi?article= &context=library_pubs. baird, d. thing knowledge: a philosophy of scientific instruments. berkeley, ca: university of california press, . baker, christopher w. scientific visualization: the new eyes of science. brookfield, ct: millbrook press, . baker, n. double fold: libraries and the assault on paper. new york, ny: random house, . baker, n. the size of thoughts: essays and other lumber. new york, ny: random house, . ball, a. preserving computer-aided design (cad). dpc technology watch. digital preservation coalition. ball, cheryl e. “show, not tell: the value of new media scholarship.” computers and composition , no. ( ): - . ball, cheryl e., and douglas eyman.
“digital humanities scholarship and electronic publication.” in rhetoric and the digital humanities. eds. william hart-davidson and jim ridolfo. chicago, il: university of chicago press, . - . balmer, j. “review: digital hadrian’s villa project.” journal of the society of architectural historians ( ) ( ): - . baltes, elizabeth p. “dedication and display of portrait statues in hellenistic greece: spatial practices and identity politics.” phd dissertation, duke university, . balsamo, anne marie. designing culture: the technological imagination at work. durham, nc: duke university press, . balsamo, anne marie. “videos and frameworks for ‘tinkering’ in a digital age.” spotlight on digital media and learning. http://spotlight.macfound.org/blog/entry/anne-balsamo-tinkering-videos. banz, david a. “the values of the humanities and the values of computing.” in humanities and the computer: new directions. ed. david s. miall, - . oxford, uk: clarendon press, . barab, sasha, and kurt squire. “design-based research: putting a stake in the ground.” the journal of the learning sciences , no. ( ): – . barab, sasha, et al. “making learning fun: quest atlantis, a game without guns.” educational technology research & development ( ), ( ): - . barateiro, j., g. antunes, f. freitas, and j. borbinha. “designing digital preservation solutions: a risk management-based approach.” international journal of digital curation ( ) ( ): - . . /ijdc.v il. . barbour, kim. “hiding in plain sight: street artists online.” journal of media and communication , no. ( ): - . barnett, fiona. “the brave side of digital humanities.” differences , no. ( ): - . barnett, fiona, zach blas, micha cárdenas, jacob gaboury, jessica marie johnson, and margaret rhee. “queer os: a user’s manual.” in debates in the digital humanities. eds. matthew gold and lauren klein. - . minneapolis, mn: university of minnesota press, . barribeau, susan.
“enhancing digital humanities at uw-madison: a white paper.” university of wisconsin at madison, . http://dighum.wisc.edu/facultyseminar/index.html. barthes, roland. camera lucida: reflections on photography. translated by richard howard. new york, ny: farrar, straus and giroux, . barthes, roland. “from work to text.” in image, music, text. trans. stephen heath. - . new york, ny: hill and wang, . bartscherer, thomas and roderick coover. switching codes: thinking through digital technology in the humanities and the arts. chicago, il: university of chicago press, . bates, david. “peer review and evaluation of digital resources for the arts and humanities.” institute of historical research – digital resources, n.d. http://www.history.ac.uk/projects/digital/peer-review. batley, s. information architecture for information professionals. oxford, uk: chandos, . battles, matthew, and michael maizels. “collections and/of data: art history and the art museum in the dh mode.” in debates in the digital humanities. eds. matthew gold and lauren klein. - . minneapolis, mn: university of minnesota press, . bartle, r.a. designing virtual worlds. indianapolis, in: new riders, . baym, nancy k. personal connections in the digital age. cambridge, uk: polity press, . baym, nancy k., and danah boyd. “socially mediated publicness: an introduction.” journal of broadcasting and electronic media, , no. (september ): - . beagrie, n. “the digital curation centre.” learned publishing ( ): - . bearman, david and jennifer trant. “authenticity of digital resources: towards a statement of requirements in the research process.” d-lib magazine , no. (june ). beckett, c. supermedia: saving journalism so it can save the world. london, uk: wiley-blackwell, . becker, jonathan. “scholar . : public intellectualism meets the open web.” ucea review , no. . (june , ): - . url: http://www.ucea.org/special_feature_ _ _pcp/ / / /scholar- -public-intellectualism-meets-the-open-web.html belfiore, e.
and a. upchurch, eds. humanities in the twenty-first century: beyond utility and markets. new york, ny: palgrave macmillan, . benedict, b.m. curiosity: a cultural history of early modern inquiry. chicago, il: university of chicago press, . benjamin, walter. “theses on the philosophy of history.” in illuminations. trans. h. zohn. - . london, uk: fontana, . benjamin, walter. “the work of art in the age of mechanical reproduction.” in illuminations. ed. hannah arendt. trans. harry zohn. new york, ny: schocken books, . benkler, yochai. the wealth of networks: how social production transforms markets and freedom. new haven, ct: yale university press, . bentkowska-kafel, anna, hugh denard, and drew baker, eds. paradata and transparency in virtual heritage. digital research in the arts and humanities. burlington, vt: ashgate, . bentkowska-kafel, anna, trish cashen, and hazel gardiner, eds. digital art history: a subject in transition. bristol, uk: intellect, . berens, kathi inman. “judy malloy’s seat at the (database) table: a feminist reception history of early hypertext literature.” literary & linguistic computing . ( ): - . berman, merrick lex. “boundaries or networks in historical gis: concepts of measuring space.” historical geography ( ): - . berens, kathi inman. “interface.” in digital pedagogy in the humanities: concepts, models, and experiments. eds. rebecca frost davis, matthew k. gold, katherine d. harris, and jentery sayers. new york: modern language association, . https://digitalpedagogy.commons.mla.org/keywords/interface/. berg, a.j. “a gendered socio-technical construction: the smart house.” in the social shaping of technology. eds. d. mackenzie and j. wajcman. - . buckingham, uk: open university press, . berg, maggie, and barbara seeber. the slow professor: challenging the culture of speed in the academy. toronto, on: university of toronto press, . berger, john. ways of seeing. new york, ny: penguin, . berlin, isaiah.
“the divorce between the sciences and the humanities.” in the proper study of mankind. - . new york, ny: farrar, straus and giroux, . bernardi, joanne, and nora dimmock. “creative curating: the digital archive as argument.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers, - . minneapolis, mn: university of minnesota press, . benardou, a., p. constantopoulos, c. dallas, and d. gavrilis. “understanding the information requirements of arts and humanities scholarship: implications for digital curation.” international journal of digital curation . ( ): - . berners-lee, tim, and mark fischetti. weaving the web: the original design and ultimate destiny of the world wide web by its inventor. san francisco, ca: harper, . berners-lee, t., j. hendler, and o. lassila. “the semantic web.” scientific american ( ) ( ): - . bernstein, m.c. “hypertext and the linearity of history.” in hypertextnow: remarks on the state of hypertext, - . . berry, david m. “the computational turn: thinking about the digital humanities.” culture machine . . http://www.culturemachine.net/index.php/cm/article/view/ / /. berry, david m. “critical digital humanities.” author’s blog. http://stunlaw.blogspot.com/ / /critical-digital-humanities.html berry, david m. “the computational turn: thinking about the digital humanities.” culture machine . ( ): - . http://www.culturemachine.net/index.php/cm/article/viewarticle/ . berry, david m. copy, rip, burn: the politics of copyleft and open source. london, uk: pluto press, . berry, david m. and anders fagerjord. digital humanities. cambridge, uk: polity press, . berry, david m. the philosophy of software: code and mediation in the digital age. london, uk: palgrave macmillan, . berry, david m., ed. understanding digital humanities. new york, ny: palgrave macmillan, . bescoby, d.j.
“detecting roman land boundaries in aerial photographs using radon transforms.” journal of archaeological science ( ): - . besser, howard. the past, present, and future of digital libraries. oxford, uk: blackwell, . best, stephen, and sharon marcus. “surface reading: an introduction.” representations . ( ): – . bevan, andrew, and james conolly. “gis, archaeological survey, and landscape archaeology on the island of kythera, greece.” journal of field archaeology , no. ½. ( ): - . bianco, jamie “skye.” “this digital humanities which is not one.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . biber, douglas. “representativeness in corpus design.” literary and linguistic computing , no. ( ): – . bijker, wiebe e., thomas p. hughes, and trevor pinch, eds. the social construction of technological systems: new directions in the sociology and history of technology. cambridge, ma: mit press, . billinghurst, mark, adrian clark, and gun lee. a survey of augmented reality. hanover, ma: now publishers, . bimber, oliver and ramesh raskar. spatial augmented reality. merging real and virtual worlds. wellesley, ma: peters, . binder, jeffrey m. “alien reading: text mining, language standardization, and the humanities.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . binkley, richard. “new tools, new recruits, for the republic of letters.” robert c. binkley, – / life, works, ideas. http://www.wallandbinkley.com/rcb/works/new-tools-new-recruits-for-the-republic-of-letters.html. bird, steven, ewan klein, and edward loper. natural language processing with python. beijing, china: o’reilly, . birkerts, sven. the gutenberg elegies. the fate of reading in an electronic age. boston, ma: faber and faber, . bissell, t. extra lives: why video games matter. new york, ny: pantheon books, . bjork, olin.
“digital humanities and the first year writing course.” digital humanities pedagogy: practices, principles and policies. ed. brett d. hirsch. - . open book publishers, . blackwell, christopher, and thomas r. martin. “technology, collaboration, and undergraduate research.” dhq: digital humanities quarterly , no. . http://digitalhumanities.org/dhq/vol/ / / / .html. blair, ann. too much to know: managing scholarly information before the modern age. new haven, ct: yale university press, . blais, joline, jon ippolito, and owen smith. new criteria for new media. new media department, university of maine, (january ). http://newmedia.umaine.edu/interarchive/new_criteria_for_new_media.html. blaney, jonathan. “citing digital resources.” sect: sustaining the eebo-tcp. bodleian library. https://blogs.bodleian.ox.ac.uk/eebotcp/sect/. blanke, tobias. digital asset ecosystems: rethinking clouds and crowds. oxford, uk: chandos publishing, . blanke, tobias, and m. hedges. “scholarly primitives: building institutional infrastructure for humanities e-science.” future generation computer systems ( ) ( ): - . blei, david m. “topic modeling and digital humanities.” journal of digital humanities , no. (winter ). http://journalofdigitalhumanities.org/ - /topic-modeling-and-digital-humanities-by-david-m-blei. blevins, cameron. “digital history’s perpetual future tense.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . blevins, cameron. “space, nation, and the triumph of region: a view of the world from houston.” journal of american history , no. (june ): - . block, sharon. “doing more with digitization: an introduction to topic modeling of early american sources.” common-place , no. (january ). http://www.common-place.org/vol- /no- /tales/. blum, andrew. tubes: a journey to the center of the internet. new york, ny: ecco/harper collins, . blustain, harvey, and donald spicer.
“digital humanities at the crossroads: the university of virginia.” ecar case studies. boulder, colorado: educause, . net.educause.edu/ir/library/pdf/ers /cs/ecs .pdf. boast, r., m. bravo, and r. srinivasan. “return to babel: emergent diversity, digital resources, and local knowledge.” information society , ( ): - . bode, katherine. reading by numbers: recalibrating the literary field. london, uk: anthem press, . bode, katherine. “resourceful reading: a new empiricism in the digital age?” in resourceful reading: the new empiricism, eresearch, and australian literary culture. eds. katherine bode and robert dixon. - . sydney, australia: university of sydney press, . bodenhamer, d.j. “narrating space and place.” in spatial narratives and deep maps. eds. d.j. bodenhamer, j. corrigan and t.m. harris. - . bloomington, in: indiana university press, . bodenhamer, david j., j. corrigan, and t.m. harris, eds. spatial narratives and deep maps. bloomington, in: indiana university press, . bodenhamer, david j. “the potential of spatial humanities.” in the spatial humanities: gis and the future of humanities scholarship. eds. david j. bodenhamer, john corrigan, and trevor m. harris. - . bloomington, in: indiana university press, . bodenhamer, david j., john corrigan, and trevor harris, eds. the spatial humanities: gis and the future of humanities scholarship. bloomington, in: indiana university press, . brodersen, lars. geo-communication and information design. frederikshavn, denmark: tankegang, . boellstorff, tom, bonnie nardi, celia pearce, t.l. taylor, and george e. marcus. ethnography and virtual worlds: a handbook of method. princeton, nj: princeton university press, . boeva, yana, devon elliott, edward jones-imhotep, shean muhammedi, and william j. turkel. “doing history by reverse engineering electronic devices.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - .
minneapolis, mn: university of minnesota press, . boggs, jeremy, jennifer reed, and j.k. purdom lindblad. “making it matter.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . boggs, jeremy. “participating in the bazaar: sharing code in the digital humanities.” clioweb. june , . http://clioweb.org/ / / /participating-in-the-bazaar-sharing-code-in-the-digital-humanities/. bogost, ian. “the cathedral of computation.” the atlantic, january , . http://www.theatlantic.com/technology/archive/ / /the-cathedral-of-computation/ /. bogost, ian. “gamification is bullshit.” the atlantic, august , . www.theatlantic.com/technology/archive/ / /gamification-is-bullshit/ /. bogost, ian. persuasive games: the expressive power of videogames. cambridge, ma: mit press, . bogost, ian, and nick montfort. “platform studies: frequently questioned answers.” digital arts and culture conference proceedings ( - december ): - . bogost, ian. “the turtlenecked hairshirt.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . bohon, cory, jennifer guiliano, james smith, george williams, and amanda visconti. “‘making the digital humanities more open’: modeling digital humanities for a wider audience.” journal of the digital humanities no. ( spring): . bol, peter k., and jianxiong ge. “china historical gis.” historical geography ( ): - . bolter, j. david. “critical theory and the challenge of new media.” in eloquent images: word and image in the age of new media. eds. mary e. hocks and michelle r. kendrick. - . cambridge, ma: mit press, . bolter, j. david. “ekphrasis, virtual reality, and the future of writing.” in the future of the book. ed. geoffrey nunberg. - . berkeley, ca: university of california press, . bolter, j. david. writing space: the computer, hypertext, and the history of writing. boston, ma: houghton mifflin, .
bolter, j. david. writing space: computers, hypertext, and the remediation of print. taylor & francis, . bolter, j. david, and richard grusin. remediation: understanding new media. cambridge, ma: mit press, . bonacchi, chiara, ed. archaeologists and the digital: towards strategies of engagement. london, uk: archetype publications, . bonds, e. leigh. “listening in on the conversations: an overview of digital humanities pedagogy.” cea critic , no. (july ). https://muse.jhu.edu/login?auth= &type=summary&url=/journals/cea_critic/v / . .bonds.pdf. booch, grady, james rumbaugh, and ivar jacobson. the unified modeling language user guide. upper saddle river, nj: addison-wesley, . borenstein, greg. making things see: d vision with kinect, processing, arduino, and makerbot. sebastopol, ca: maker media, . borgman, christine l. big data, little data, no data: scholarship in the networked world. cambridge, ma: mit press, . borgman, christine l. “the digital future is now: a call to action for the humanities.” digital humanities quarterly , no. ( ). http://works.bepress.com/borgman/ /. borgmann, albert. holding on to reality: the nature of information at the turn of the millennium. chicago, il: university of chicago press, . borgman, christine l. scholarship in the digital age. cambridge, ma: mit press, . börner, k. the atlas of science: visualizing what we know. cambridge, ma: mit press, . bornstein, george, and ralph g. williams, eds. palimpsest: editorial theory in the humanities. ann arbor, mi: university of michigan press, . bornstein, george and theresa tinkle. the iconic page in manuscript, print, and digital culture. ann arbor, mi: university of michigan press, . bosak, jon and tim bray. “xml and the second-generation web.” scientific american ( may ). bouchard, matt, and andy keenan. “from theory to experience to making to breaking: iterative game design for digital humanists.” in doing digital humanities: practice, training, research. eds.
constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . bowker, geoffrey c., and susan leigh star. sorting things out: classification and its consequences. cambridge, ma: mit press, . boyack, kevin w., brian n. wylie, and george s. davidson. “a call to researchers: digital libraries need collaboration across disciplines.” d-lib magazine , no. (october ). http://www.dlib.org/dlib/october /boyack/ boyack.html. boyd, danah, and kate crawford. “critical questions for big data: provocations for a cultural, technological, and scholarly phenomenon.” information, communication & society , no. ( ): - . boyd, danah, scott golder, and gilad lotan. “tweet, tweet, retweet: conversational aspects of retweeting on twitter.” hawaii international conference on system sciences, , kauai, hawaii. boyd, jason, and lynne siemens. “project management.” dhsi@congress . . boyle, james. “a closed mind about an open world.” financial times. august , . http://www.it.com/home/us.path:search;boyle closed mind. boyle, james. the public domain: enclosing the commons of the mind. new haven, ct: yale university press, . brabham, d.c. crowdsourcing. mit press essential knowledge series. cambridge, ma: mit press, . bradley, john. “no job for techies: technical contributions to research in the digital humanities.” in collaborative research in the digital humanities. eds. m. deegan and w. mccarty. - . farnham, uk: ashgate, . bradshaw, jeffrey, ed. software agents. cambridge, ma: mit press, . bradshaw, roy, and robert j. abrahart. “widening participation in historical gis: the case of digital derby .” rgs-ibg annual international conference, london. september , . bradwell, p. the edgeless university: why higher education must embrace technology. london, uk: demos, . brennan, sheila a. “let the grant do the talking.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /let-the-grant-do-the-talking-by-sheila-brennan/.
brennan, sheila a. “navigating dh for cultural heritage professionals.” lot . january , , http://www.lotfortynine.org/ / /navigating-dh-for-cultural-heritage-professionals/. brennan, sheila a. “public, first.” in debates in the digital humanities. eds. matthew gold and lauren klein. - . minneapolis, mn: university of minnesota press, . brett, guy. “the computers take to art.” the times, august ( ): . brett, megan r. “topic modeling: a basic introduction.” journal of digital humanities ( : ). http://journalofdigitalhumanities.org/ - /topic-modeling-a-basic-introduction-by-megan-r-brett/ brier, stephen. “where’s the pedagogy? the role of teaching and learning in the digital humanities.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . britton, lauren. “democratized tools of production: new technologies spurring the maker movement.” technology & social change group. seattle, wa: university of washington information school, . britton, lauren. “examining the maker movement through discourse analysis: an introduction.” technology & social change group. seattle, wa: university of washington information school, . britton, lauren. “power, access, status: the discourse of race, gender, and class in the maker movement.” technology & social change group. seattle, wa: university of washington information school, . britton, lauren. “stem, dastem, and steam in making: debating america’s economic future in the st century.” technology & social change group. seattle, wa: university of washington information school, . brosnan, mark. technophobia: the psychological impact of information technology. london, uk: routledge, . brook, t. “mapping knowledge in the sixteenth century: the gazetteer cartography of ye chunji.” the [princeton university, gest] east asian library journal : ( ): - . brooke, collin. lingua fracta: toward a rhetoric of new media. new york, ny: hampton press, . brown, bill.
“thing theory.” critical inquiry . (autumn ): - . brown, james jr. “crossing state lines: rhetoric and software studies.” in rhetoric and the digital humanities. ed. jim ridolfo and william hart-davidson, - . chicago, il: university of chicago press, . brown, john seely and douglas thomas. a new culture of learning: cultivating the imagination for a world of constant change. createspace independent publishing platform, . brown, john seely and paul duguid. the social life of information. cambridge, ma: harvard business school press, . brown, john seely and paul duguid. “universities in the digital age.” change, . ( ): - . brown, laura, rebecca griffiths, and matthew rascoff. university publishing in a digital age. new york, ny: ithaka, . brown, paul, charlie gere, nicholas lambert, and catherine mason, eds. white heat cold logic: british computer art - . cambridge, ma: mit press, . brown, susan. “cwrc-writer.” the canadian writing research collaboratory. http://www.dh .uni-hamburg.de/conference/programme/abstracts/cwrc-writer-an-in-browser-xml-editor/. brown, susan. “towards best practices in collaborative online knowledge production.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . brown, susan, and john simpson. “the curious identity of michael field and its implications for humanities research with the semantic web.” ieee big humanities data ( ): - . brown, susan, john simpson, the inke research group, and cwrc project team. “the changing culture of humanities scholarship: iteration, recursion, and versions in scholarly collaboration environments.” scholarly and research communication . ( ). brown, s., and m. greengrass. “research portals in the arts and humanities.” literary and linguistic computing, vol. , no. ( ): - . brown, vincent.
“mapping a slave revolt: digital tools and the historian’s craft.” american historical association, new york city, january - , . https://aha.confex.com/aha/ /webprogram/paper .html. browne, simone. dark matters: on the surveillance of blackness. durham, nc: duke university press, . bruns, axel, and hallvard moe. “structural layers of communication on twitter.” in twitter and society. eds. katrin weller, axel bruns, jean burgess, merja mahrt, cornelius puschmann. - . new york, ny: peter lang, . bruns, axel, and stefan stieglitz. “quantitative approaches to comparing communication patterns on twitter.” journal of technology and human services , nos. - ( ): - . bruns, axel, and stefan stieglitz. “towards more systematic twitter analysis: metrics for tweeting activities.” international journal of social research methodology ( ). bruzelius, caroline. preaching, building and burying: friars and the medieval city. london, uk: yale university press, . bruzelius, caroline. “teaching with visualization technologies: how information becomes knowledge.” material religion ( ): - . bruzelius, caroline. “visualizing venice: an international collaboration.” in lo spazio narrabile. scritti di storia in onore di donatella calabi. eds. rosa tamborrino and guido zucconi. - . venice, italy: quodlibet, . bryant, levi. the democracy of objects. ann arbor, mi: open humanities, . bryson, tim. “digital humanities.” spec kit, - . washington, dc: association of research libraries, ( ): . buckland, michael k. “information as thing.” journal of the american society for information science , no. ( ): - . bulger, monica, eric meyer, grace de la flor, melissa terras, sally wyatt, marina jirotka, katherine eccles, and christine mccarthy madsen. “reinventing research? information practices in the humanities.” information practices in the humanities. a research information network report ( ). burdette, alan r.
“evia digital archive project.” online humanities scholarship: the shape of things to come. ed. jerome mcgann. - . houston, tx: rice university press, . burdick, anne, johanna drucker, peter lunenfeld, todd presner, and jeffrey schnapp. digital_humanities. cambridge, ma: mit press, . burgess, helen j., and jeanne hamming. “new media in the academy: labor and the production of knowledge in scholarly multimedia.” dhq: digital humanities quarterly , no. (summer ). http://digitalhumanities.org/dhq/vol/ / / / .html. burgoyne, john ashley, ichiro fujinaga, and j. stephen downie. “music information retrieval.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley blackwell, . burke, timothy. “the humane digital.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . burnard, l., k. o’brien o’keeffe, and j. unsworth, eds. electronic textual editing. - . new york, ny: modern language association. burrows, john f. computation into criticism. oxford, uk: clarendon press, . burrows, john. “textual analysis.” in a companion to digital humanities. http://nora.lis.uiuc.edu: /companion/view?docid=blackwell/ / .xml&chunk.id=ss - - &toc.depth= &toc.id=ss - - &brand= _brand burrows, t. “a data-centered ‘virtual laboratory’ for the humanities: designing the australian humanities networked infrastructure (huni) service.” literary and linguistic computing ( ) ( ): - . burton, matt. “the joy of topic modeling.” http://mcburton.net/blog/joy-of-tm/. buurma, rachel sagner, and anna tione levine. “the sympathetic research imagination: digital humanities and liberal arts.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . buzzetti, dino. “digital representation and the text model.” new literary history : ( ): - . buzzetti, dino, and jerome mcgann.
“critical editing in a digital horizon.” in electronic textual editing. eds. lou burnard, katherine o’brien o’keeffe, and john unsworth, – . new york, ny: modern language association, . byron, mark. “digital scholarly editions of modernist texts: navigating the text in samuel beckett’s watt manuscripts.” sydney studies in english ( ): - . callahan, v. reclaiming the archive: feminism and film history. detroit, mi: wayne state university press, . campbell, timothy. wireless writing in the age of marconi. minneapolis, mn: university of minnesota press, . cantara, linda. “long-term preservation of digital humanities scholarship.” oclc systems and services , no. . ( ): - . carey, craig. “and: marks, maps, media, and the materiality of ambrose bierce’s style.” american literature , no. ( ): - . carey, james w. communication as culture: essays on media and society. new york-london: routledge, . carr, nicholas. “is google making us stupid?” the atlantic. july/august . http://www.theatlantic.com/magazine/archive/ / /is-google-making-us-stupid/ /. carr, nicholas. the shallows: what the internet is doing to our brains. new york, ny: w. w. norton, . carr, patricia. “serendipity in the stacks: libraries, information architectures, and the problems of accidental discovery.” college and research libraries. association of college and research libraries, . http://crl.acrl.org/content/early/ / / /crl - .full.pdf. carter, paul. the road to botany bay: an essay in spatial history. london, uk: faber & faber, . carusi, a., a.s. hoel, t. webmoor, and s. woolgar, eds. visualization in the age of computerization. new york, ny: routledge, . castells, manuel. the rise of the network society. cambridge, ma: blackwell, . causer, t. and m. terras. “crowdsourcing bentham: beyond the traditional boundaries of academic history.” international journal of humanities and arts computing ( ) ( ): - . cavanagh, sheila.
“living in a digital world: rethinking peer review, collaboration and open access.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /living-in-a-digital-world-by-sheila-cavanagh/. cázes, hélène, and j. matthew huculak. “understanding the pre-digital book: ‘every contact leaves a trace’.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, and ray siemens. - . new york, ny: routledge, . cecire, natalia. “the visible hand.” works cited. http://nataliacecire.blogspot.com/ / /visible-hand.html. cecire, natalia. “when digital humanities was in vogue.” journal of digital humanities , no. ( ): - . center for digital research in the humanities, university of nebraska-lincoln. “promotion & tenure criteria for assessing digital research in the humanities.” center for digital research in the humanities. http://cdrh.unl.edu/articles/eval_digital_scholar.php. center for digital research in the humanities, university of nebraska-lincoln. “recommendations for digital humanities projects.” center for digital research in the humanities, n.d. http://cdrh.unl.edu/articles/best_practices.php. chabries, d.m., s.w. booras, and g.h. bearman. “imagining the past: recent applications of multispectral imaging technology to deciphering manuscripts.” antiquity ( ): , - . chachra, debbie. “beyond making.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . chachra, debbie. “why i am not a maker.” the atlantic, january , . www.theatlantic.com/technology/archive/ / /why-i-am-not-a-maker/ . champion, erik. critical gaming: interactive history and virtual heritage. new york, ny: routledge, . chan, anita say, and harriet green. “practicing collaborative digital pedagogy to foster digital literacies in humanities classrooms.” educause review. . chan, seb.
“spreadable collections: measuring the usefulness of collection data.” mu- seums and the web : proceedings. toronto, on: archives & museum informatics, . http://www.archimuse.com/mw /papers/chan/chan.html. chang, k-t. introduction to geographic information systems. boston, ma: mcgraw-hill, . chartier, roger. the order of books. trans. lydia g. cochrane. stanford, ca: stanford university press, . chassanoff, alexandra. “historians and the use of primary sources in the digital age.” the american archivist , no. ( ): - . cheal, c. “second life: hype or hyperlearning?” on the horizon (pt. ), ( ): - . chen, chaomei. information visualization: beyond the horizon. nd ed. new york, ny: springer, . chenhall, r.g. “the archaeological data bank: a progress report.” computers and the humanities , no. ( ): - . chenhall, r. g. “the description of archaeological data in computer language.” ameri- can antiquity , no. ( ): - . chenhall, r.g. “the impact of computers on archaeological theory: an appraisal and projection.” computers and the humanities , no. ( ): - . chernaik, w., c. davis, and m. deegan, eds. the politics of the electronic text. london, uk: university of london centre for english studies, . chrisman, nicholas. exploring geographic information systems, d ed. new york, ny: john wiley & sons, inc., . christen, k. “ara irititja: protecting the past, accessing the future-indigenous memories in a digital age.” museum anthropology , ( ): - . christensen, christian. “twitter revolutions? addressing social media and dissent.” communication review , no. ( ): - . chui, m., m. löffler, and r. roberts. “the internet of things.” mckinsey quarterly. mckinsey & company, . chun, wendy hui kong. control and freedom: power and paranoia in the age of fiber optics. cambridge, ma: mit press, . chun, wendy hui kyong. “introduction: did somebody say new media?” in new media, old media: a history and theory reader. ed. wendy hui kyong chun and thomas kee- nan. - . new york, ny: routledge, . 
chun, wendy hui kyong, and matthew fuller. programmed visions: software and memory. cambridge, ma: mit press, . chun, wendy hui kyong, and lisa marie rhody. “working the digital humanities: uncovering shadows between the dark and the light.” differences: a journal of feminist cultural studies , no. ( ): - . ciula, a., and Øyvind eide. “reflections on cultural heritage and digital humanities: modeling in practice and theory.” in proceedings of the first international conference on digital access to textual cultural heritage. - , new york: acm. http://doi.acm.org/ . / . . cilevitz, adam. “the digital chastity belt.” . http://criticalmedia.uwaterloo.ca/crimelab/?p= . clavert, frédéric. “the digital humanities multicultural revolution did not happen yet.” l’histoire contemporaine à l’ère numérique. n.p., . clement, tanya. “the ground truth of dh text mining.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . clement, tanya. “half-baked: the state of evaluation in the digital humanities.” american literary history . ( ): - . ebscohost. clement, tanya. “multiliteracies in the undergraduate digital humanities curriculum: skills, principles, and habits of mind.” in digital humanities pedagogy: practices, principles, and politics. ed. brett d. hirsch. cambridge, ma: open book publishers, . http://www.openbookpublishers.com/htmlreader/dhp/chap .html. clement, tanya. “text analysis, data mining, and visualizations in literary scholarship.” in literary studies in the digital age: a methodological primer. eds. k. price and r. siemens. new york, ny: mla commons, . clement, tanya. “‘a thing not beginning and not ending’: using digital tools to distant-read gertrude stein’s ‘the making of americans’.” literary and linguistic computing ( ), ( ): - . clement, tanya. “welcome to hipstas.” hipstas. https://blogs.ischool.utexas.edu/hipstas/ / / /welcome-to-hipstas/. clement, tanya e.
“when texts of study are audio files: digital tools for sound studies in digital humanities.” in a new companion to digital humanities. eds. susan schreib- man, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell. . clement, tanya e. “where is methodology in digital humanities?” in debates in the digi- tal humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: uni- versity of minnesota press, . clement, tanya, s. steger, j. unworth, and k. uszkalo. “how not to read a million books.” http://www .isrl.illinois.edu/-unsworth/hownot read.html#sdendnote sym. clement, tanya, wendy hagenmaier, and jennie levine knies. “toward a notion of the archive of the future: impressions of practice by librarians, archivists, and digital hu- manities scholars.” the library , no. ( ): - . clouston, nicole, and jentery sayers. “fabrication and research-creation in the arts and humanities.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . coble, zach. “evaluating dh work: guidelines for librarians.” journal of digital humani- ties , no. (fall ). http://journalofdigitalhumanities.org/ - /evaluating-digital-hu- manities-work-guidelines-for-librarians-by-zach-coble. codd, e.f. “a relational model of data for large shared data banks.” communications of the acm . (june ): - . pdf. cohen, daniel j. “creating scholarly tools and resources for the digital ecosystem: building connections in the zotero project.” first monday . ( ). cohen, daniel j. “from babel to knowledge: data mining large digital collections.” d-lib magazine , no. ( ). http://www.dlib.org/dlib/march /cohen/ cohen.html. cohen, daniel j. “introducing digital humanities now.” in debates in the digital humani- ties. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . cohen, daniel j. 
“the ivory tower and the open web: introduction: burritos, browsers, and books [draft].” dan cohen, july , . http://www.dancohen.org/ / / /the-ivory-tower-and-the-open-web-introduction-burritos-browsers-and-books-draft/. cohen, daniel j. “searching for the victorians.” dan cohen’s digital humanities blog. october , . http://www.dancohen.org/ / / /searching-for-the-victorians/. cohen, daniel j. “the social contract of scholarly publishing.” in debates in the digital humanities, ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . cohen, daniel j. “welcome to the digital public library of america.” in digital public library of america. april , . http://dp.la/info/ / / /message-from-the-executive-director/. cohen, daniel j., j. frabetti, d. buzzetti, and j.d. rodriguez-velasco. defining the digital humanities. . http://academiccommons.columbia.edu/catalog/ac% a . cohen, daniel j., m. frisch, p. gallagher et al. “interchange: the promise of digital history.” journal of american history ( ), - . . cohen, daniel j. and roy rosenzweig. digital history: a guide to gathering, preserving, and presenting the past on the web. philadelphia, pa: university of pennsylvania press, . cohen, daniel j., and roy rosenzweig. “to mark up, or not to mark up.” in digital history: a guide to gathering, preserving, and presenting the past on the web. university of pennsylvania press, . http://chnm.gmu.edu/digitalhistory/digitizing/ .php. cohen, daniel j. and tom scheinfeldt, eds. hacking the academy: new approaches to scholarship and teaching from digital humanities. ann arbor, mi: university of michigan press, . cohen, julie. configuring the networked self. new haven, ct: yale university press, . cohen, patricia. “humanities scholars embrace digital technology.” new york times, november , . http://www.nytimes.com/ / / /arts/ digital.html. cohoon, j.m. and w. aspray. women and information technology: research on underrepresentation.
cambridge, ma: mit press, . coleman, b. hello avatar: rise of the networked generation. cambridge, ma and lon- don uk: mit press, . coletta, cristina della. “guidelines for promotion and tenure committees in judging digital work.” evaluating digital scholarship – nines/neh summer institutes: - . . http://institutes.nines.org/docs/ -documents/guidelines-for-promo- tion-and-tenure-committees-in-judging-digital-work/. college art association intellectual property resources. http://www.collegeart.org/ip/ college art association and the society of architectural historians. "guidelines for the evaluation of digital scholarship in art and architectural history.” . collins, harry, robert evans, and michael e. gorman. “trading zones and international expertise.” trading zones and international expertise: creating new kinds of collabora- tion. ed. michael e. gorman. - . cambridge, ma: mit press, . collins, nicolas. handmade electronic music: the art of hardware hacking. nd ed. new york, ny: routledge, . cong-huyen, anne. “thinking through race (gender, class, & nation) in the digital hu- manities: the #transformdh example.” anne cong-huyen (blog), january , . http://anitaconchita.org/uncategorized/mla -presentation/. cong-huyen, anne. “toward a transnational asian/american digital humanities: a #transformdh invitation.” in between humanities and the digital. eds. patrik svensson and david theo goldberg. - . cambridge, ma: mit press, . connor, w.r. “scholarship and technology in classical studies.” in scholarship and tech- nology in the humanities. proceedings of a conference held at elvetham hall. hamp- shire, uk, - may. ed. may katzen, - . london, uk: british library research, bowker saur, . consalvo, mia. cheating: gaining advantage in videogames. cambridge, ma: mit press, . conway, p. “preservation in the age of google: digitization, digital preservation, and di- lemmas.” library quarterly: information, community, policy, , ( ): - . cook, t. 
“archival science and postmodernism: new formulations for old concepts.” ar- chival science , ( ): - . cook, t. “evidence, memory, identity, and community: four shifting archival para- digms.” archival science ( ): - . cook, t. “fashionable nonsense or professional rebirth: postmodernism and the prac- tice of archives.” archivaria ( ): - . cooley, heidi rae, and duncan a. buell. “building humanities software that matters: the case of the ward one mobile app.” in making things and drawing boundaries: ex- periments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: uni- versity of minnesota press, . cooper, andrew, and michael simpson. “looks good in practice, but does it work in theory? rebooting the blake archive.” wordsworth circle , no. (winter ): - . cooper, d., c.d. donaldson and p. murrieta-flores, eds. literary mapping in the digital age. aldershot, uk: ashgate, . cordell, ryan. “how not to teach digital humanities.” in debates in the digital humani- ties. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . cordell, ryan. “how to start tweeting and why you might want to.” april, . http://chronicle.com/blogs/profhacker/how-to-start-tweeting-and-why-you-might- want-to/ cordell, ryan. “new technologies to get your students engaged.” chronicle of higher education (may ). cosgrave, mike, anna dowling, lynn harding, róisín o’brien, and olivia rohan. “evaluat- ing digital scholarship: experiences in new programmes at an irish university.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /eval- uating-digital-scholarship-experiences-in-new-programmes-at-an-irish-university/. coté, mark. “data motility: the materiality of big social data.” culture studies review , no. ( ). coté, mark. “the prehistoric turn? networked new media, mobility and the body.” in the international companions to media studies: media studies futures. ed. kelly gates. - . oxford, uk: blackwell, . 
coté, mark. “technics and the human sensorium: rethinking media theory through the body.” theory and event , no. ( ). cottom, tressie mcmillan. “more scale, more questions: observations from sociology.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . council on library and information resources. “library as place: rethinking roles, rethinking space.” washington, dc: council on library and information resources, . http://www.clir.org/pubs/abstract/pub abst.html. council on library and information resources. “no brief candle: reconceiving research libraries for the st century.” washington, dc: council on library and information resources, . http://www.clir.org/pubs/abstract/pub abst.html. council on library and information resources. “working together or apart: promoting the next generation of digital scholarship.” washington, dc: council on library and information resources, . http://www.clir.org/pubs/reports/pub /pub .pdf. cowgill, george l. “computer applications in archaeology.” computers and the humanities , no. ( ): - . cox, gary w., and jonathan n. katz. elbridge gerry’s salamander: the electoral consequences of the reapportionment revolution. cambridge, uk: cambridge university press, . cox, r.j. archives & archivists in the information age. new york, ny: neal-schuman publishers, . craig, a.b., w.r. sherman, and j.d. will. developing virtual reality applications: foundations of effective design. burlington, ma: morgan kaufmann, . craig, hugh, and arthur kinney. shakespeare, computers, and the mystery of authorship. cambridge, uk: cambridge university press, . crampton, jeremy. the political mapping of cyberspace. chicago, il: university of chicago press, . crane, g. “the humanities in the digital age.” paper presented at big data & uncertainty in the humanities, university of kansas, . http://www.youtube.com/watch?v=svdoaygu qa. crane, gregory, and alison jones.
“text, information, knowledge and the evolving rec- ord of humanity.” d-lib magazine , no. . http://www.dlib.org/dlib/march /jones/ jones.html. crane, g., d. bamman, l. cerrato, et al. “beyond digital incunabula: modeling the next generation of digital libraries?” european conference on digital libraries. . http://www.eecs.tufts.edu/-dsculley/papers/incunabula.pdf. crane, g., b. seales, and m. terras. “cyberinfrastructure for classical philology.” dhq: digital humanities quarterly ( ). ( ). cranny-francis, anne. multimedia: texts and contexts. london, uk: sage, . “creating your web presence: a primer for academics.” profhacker. february , . http://chronicle.com/blogs/profhacker/creating-your-web-presence-a-primer-for-aca- demics/ creative commons. creativecommons.org. crofts, n. “museum informatics: the challenge of integration.” university of geneva. http://archive-ouverte.unige.ch/unige: . . crogan, patrick. gameplay mode: war, simulation, and technoculture. minneapolis, mn: university of minnesota press, . crompton, constance, richard j. lane, and ray siemens, eds. doing digital humanities: practice, training, research. - . new york, ny: routledge, . crowther, p. phenomenology of the visual arts (even the frame). stanford, ca: stanford university press, . croxall, brian. “all things google: google maps.” profhacker. april , . http://chron- icle.com/blogs/profhacker/all-things-google-google-maps-labs/ . croxall, brian. “build your own interactive timeline.” briancroxall.net, . http://bri- ancroxall.net/timelinetutorial/timelinetutorial.html. croxall, brian. “tired of tech: avoiding tool fatigue in the classroom.” writing and ped- agogy , no. ( ): - . cubitt, sean. “cybertime: ontologies of digital perception.” in society for cinema stud- ies. chicago, il: march . cudworth, a.l virtual world design: creating immersive virtual environments. boca ra- ton, fl: crc press, . “cultural analytics.” software studies initiative. 
http://lab.softwarestudies.com/p/cul- tural-analytics.html. (watch the intro video, scroll down to the description of the work at the software studies lab, and explore some of the examples.) cuny digital humanities resource guide. http://commons.gc.cuny.edu/wiki/in- dex.php/the_cuny_digital_humanities_resource_guide curry, michael r. “the digital individual in the private realm.” annals of the association of american geographers ( ): - . curry, michael r. digital places: living with geographic information systems. london, uk: routledge, . curry, michael r. “rethinking privacy in a geocoded world.” in geographic information systems: principles and applications, ( nd ed). eds. paul a. longley, michael f. goodchild, david j. maguire, and david w. rhind. — . new york, ny: john wiley and sons, inc., . dahlström, m., j. hansson, and u. kjellman. “’as we may digitize’-institutions and docu- ments reconfigured.” liber quarterly, : - ( ): - . darnton, robert. “google and the future of books.” new york review of books, febru- ary , . http://www.nybooks.com/articles/archives/ /feb/ /google-the-fu- ture-of-books/. date, c.j. an introduction to database systems. reading, ma: addison-wesley, . david rumsey map collection. http://www.davidrumsey.com/. davidson, cathy n. “how can a digital humanist get tenure?” hastac. september , . http://hastac.org/blogs/cathy-davidson/ / / /how-can-digital-humanist- get-tenure. davidson, cathy n. “humanities and technology in the information age.” the oxford dictionary of interdisciplinarity. eds. robert frodeman, julie thompson klein, and carl mitcham. - . oxford and new york: oxford university press, . davidson, cathy n. “humanities . : promise, perils, predictions.” in debates in the digi- tal humanities. ed. matthew k. gold. - . minneapolis, mn: university of minne- sota press, . davidson, cathy n. now you see it: how the brain science of attention will transform the way we live, work, and learn. new york, ny: penguin, . davidson, cathy n. 
“we can’t ignore the influence of digital technologies.” chronicle of higher education review. (march , ): b . davidson, cathy n., and david theo goldberg. the future of thinking: learning institutions in a digital age. cambridge, ma: mit press, . davies, john, dieter fensel, and frank van harmelen. towards the semantic web: ontology-driven knowledge management. hoboken, nj: j. wiley, . davies, mark. “a corpus-based study of lexical developments in early and late modern english.” in handbook of english historical linguistics. eds. merja kytö and päivi pahta. cambridge, uk: cambridge university press. davies, mark. “expanding horizons in historical linguistics with the million word corpus of historical american english.” corpora , no. ( ): – . davis, robin camille. “gephi + mallet + emda.” robin camille davis’ blog. http://www.robincamille.com/ - - -gephi-emda/. davis, robin camille. “testing out the nltk sentence tokenizer.” robin camille davis’ blog. http://www.robincamille.com/ - - -nltk-sentence-tokenizer/. davies, robin, and michael nixon. “digitization fundamentals.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . davis, rebecca frost. “learning from an undergraduate digital humanities project.” techne. december , . http://blogs.nitle.org/ / / /learning-from-an-undergraduate-digital-humanities-project/. dawson, ashley. “academic freedom and the digital revolution.” aaup journal of academic freedom ( ). dawson, p. “’breaking the fourth wall’: d virtual worlds as tools for knowledge repatriation in archaeology.” journal of social archaeology ( ) ( ): - . dear, michael, jim ketchum, sarah luria, and doug richardson, eds. geohumanities: art, history, text at the edge of place. new york, ny: routledge, . debord, guy. the society of the spectacle. trans. donald nicholson-smith. new york, ny: zone books, . deegan, marilyn.
“a world of possibilities: digitisation and the humanities.” in research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - . edinburgh, uk: edinburgh university press, . deegan, m. and k. sutherland, eds. text editing, print and the digital world. aldershot, uk: ashgate. deegan, marilyn and willard mccarty, eds. collaborative research in the digital human- ities. farnham, uk: ashgate, . deleuze, gilles. cinema : the movement image. trans. hugh tomlinson and barbara habberjam. minneapolis, mn: university of minnesota press, . deleuze, gilles. cinema : the time image. translated by hugh tomlinson and barbara habberjam. minneapolis, mn: university of minnesota press, . de man, paul. “the resistance to theory.” in the resistance to theory. minneapolis, mn: university of minnesota press, . derose, s.j., d.g. durand, e. mylonas et al. “what is text, really?” journal of computing in higher education ( ) ( ): - . the design-based research collective. “design-based research: an emerging paradigm for educational inquiry.” educational researcher , no. ( ): – . deutschmann, mats, anders steinvall, and anna lagerström. “raising language aware- ness using digital media: methods for revealing linguistic stereotyping.” in research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - . edinburgh, uk: edinburgh university press, . deuze, mark. media work. cambridge, uk: polity, . dictionary of art historians. http://arthistorians.info. dieter, michael, and geert lovink. “theses on making in the digital age.” in critical making. ed. garnet hertz. hollywood, ca: garnet hertz, . digital art history society. https://digitalarthistorysociety.org digging into data challenge. . http://www.diggingintodata.org/ digital curation centre university of edinburgh. dcc curation lifecycle model. http://www.dcc.ac.uk/digital-curation/what-digital-curation. 
digital curation centre university of edinburgh. what is digital curation. http://www.dcc.ac.uk/digital-curation/what-digital-curation. http://www.digitalhumanities.org/companiondls/. the digital humanities manifesto . . . http://www.humanitiesblast.com/mani- festo/manifesto_v .pdf. digital humanities now. digitalhumanitiesnow.org. digital humanities quarterly. alliance of digital humanities organizations. http://digi- talhumanities.org/dhq/. digital humanities questions & answers. http://digitalhumanities.org/answers/. digital humanities summer institute statement of ethics and inclusion. led by jacquel- ine wernimont and angel david nieves. http://www.dhsi.org/events.php#ethics+inclu- sion. “digital humanities and the undergraduate: campus projects recognized.” national in- stitute for the technology in liberal education. october , . http://www.ni- tle.org/live/news/ -digital-humanities-and-the-undergraduate-campus. “digital humanities at the university of washington.” simpson center for the humani- ties, university of washington. http://depts.washington.edu/uwch/docs/digital_human- ities_case_statement.pdf. “digital humanities at yale: about.” digital humanities at yale. http://digital humani- ties.yale.edu/. digital labor reference library. digital labor working group. cuny graduate center. https://digitallabor.commons.gc.cuny.edu/digital-labor-reference-library/. digital librarians initiative. “role of librarians in digital humanities centers.” white pa- per. emory university library, august . http://docs.google.com/doc?do- cid= azbw qx_a jpzgm owdrdzzfmtmycwrnchjwbwo&hl=en. digital library federation. diglib.org. digital public library of america (dpla). https://dp.la/. digital research infrastructure for the arts and the humanities. www.dariah.eu. digital research tools wiki (dirt). https://digitalresearch- tools.pbworks.com/w/page/ /frontpage. digital roman forum: http://dlib.etc.ucla.edu/projects/forum/. digital scholarship lab. university of richmond, . 
http://dsl.richmond.edu/. digital studies/le champ numérique. www.digitalstudies.org. dillon, sheila, and elizabeth palmer baltes. “honorific practices and the politics of space on hellenistic delos.” american journal of archaeology ( ): - . dillon, sheila, and timothy d. shea. “sculpture and context: towards an archaeology of greek statuary.” in greek art in context. ed. d. rodríguez perez. new york, ny: routledge, . “discussion area, archived.” internet shakespeare editions. http://internetshake- speare.uvic.ca/annex/discussion.html#toc_on_line_numbering_in_the_electronic_edi- tion. dobrzynski, judith h. “modernizing art history.” the wall street journal. http://online.wsj.com/news/arti- cles/sb doel, ronald e., and pamela m. henson. “reading photographs: photographs as evi- dence in writing the history of recent science.“ in writing recent science. eds. ronald e. doel and thomas söderquist. london, uk: routledge, : - . dombrowski, quinn. “drupal and other content management systems.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . dombrowski, quinn. “what ever happened to project bamboo?” literary and linguistic computing , no. (december ): . dombrowski, quinn. “when not to use drupal.” drupal for humanists, http://dru- pal.forhumanists.org/book/when-not-use-drupal donahue-wallace, kelly, laetitia la follette, and andrea pappas, eds. teaching art his- tory with new technologies: reflections and case studies. cambridge, uk: cambridge scholars publishing, . dörk, marian, christopher collins, patrick feng, and sheelagh carpendale. “critical info- vis: exploring the politics of visualization.” chi extended abstracts. paris, . dorn, sherman. “is (digital) history more than an argument about the past?” in writing history in the digital age. eds. kristen nawrotzki and jack dougherty. ann arbor, mi: university of michigan press, . dooley, jackie. 
“ten commandments for special collections librarians in the digital age.” rbm: a journal of rare books, manuscripts and cultural heritage , no. ( ): - . dougherty, jack, and kristen nawrotzki, eds. writing history in the digital age. ann ar- bor, mi: university of michigan press, . douglas, j. yellowlees. the end of books—or books without end? ann arbor, mi: univer- sity of michigan press, . dourish, paul, and genevieve bell. divining a digital future: mess and mythology in ubiquitous computing. cambridge, ma: mit press, . downey, greg. “virtual webs, physical technologies, and hidden workers: the spaces of labor in information internetworks.” technology and culture , no. . - . . “downgrading your website, or why we are moving to wordpress.” smithsonian cooper-hewitt museum, http://labs.cooperhewitt.org/ /downgrading-your-web- site-or-why-we-are-moving-to-wordpress/ drain, adam. ”design anthropology: working on, with, and for technologies.” in digital anthropology. ed. heather a. horst and daniel miller. - . new york, ny: berg, . draxler, bridget. “digital humanities symposium: the scholar, the library and the digital future.” hastac, february . http://hastac.org/blogs/bridget-draxler/digital-hu- manities-symposium-scholar-library-and-digital-future. drucker, johanna. graphesis: visual forms of knowledge production. cambridge, ma: harvard university press, . drucker, johanna. “graphical approaches to the digital humanities.” in a new compan- ion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . drucker, johanna. “humanistic theory and digital scholarship.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . drucker, johanna. “humanities approaches to graphical display.” digital humanities quarterly , no. ( ). drucker, johanna. “is there a digital art history?” visual resources ( - ) : - . drucker, johanna. 
“performative materiality and theoretical approaches to interface.” dhq: digital humanities quarterly ( ). drucker, johanna. speclab: digital aesthetics and projects in speculative compu- ting. chicago, il: university of chicago press, . drucker, johanna. “theory as praxis: the poetics of electronic textuality.” modern- ism/modernity , no. ( ): - . drucker, johanna, and emily mcvarish. graphic design history. nd edition. boston, ma: pearson, . duguid, paul. “material matters: aspects of the past and the futurology of the book.” in the future of the book. ed. geoffrey nunberg. - . berkeley, ca: university of cali- fornia press, . duguid, paul. “material matters: the past and futurology of the book.” in the future of the book. ed. by geoffrey nunberg. berkeley and los angeles, ca: university of califor- nia press, . duke university libraries digital humanities research guide. http://guides.li- brary.duke.edu/content.php?pid= &sid= dumbill, ed. “what is big data? an introduction to the big data landscape.” o’reilly ra- dar. . http://radar.oreilly.com. duncan, j., and p.l. main. “the drawing of archaeological sections and plans by com- puter.” science & archaeology ( ): - . dunne, anthony, and fiona raby. speculative everything: design, fiction, and social dreaming. cambridge, ma: mit press, . dziuban, charles, charles r. graham, and anthony g. picciano, eds. “blended learning.” research perspectives, vol. . new york, ny: routledge, . earhart, amy e. “can information be unfettered? race and the new digital humanities canon.” in debates in the digital humanities. ed. matthew k. gold. - . minneap- olis, mn: university of minnesota press, . earhart, amy e. “challenging gaps: redesigning collaboration in the digital humanities.” in the american literature scholar in the digital age. eds. amy earhart and andrew jew- ell. - . ann arbor, mi: university of michigan press, . earhart, amy e. 
recovering the recovered text: diversity, canon building, and digital studies-amy earhart. , video url. http://www.youtube.com/watch?v= ui pijdreo&feature=youtube_gdata_player. earhart, amy e., and andrew jewell. the american literature scholar in the digital age. ann arbor, mi: university of michigan press and university of michigan library, . earhart, amy e. and toneisha l. taylor. “pedagogies of race: digital humanities in the age of ferguson.” in debates in the digital humanities. eds. matthew k. gold and lau- ren klein. - . minneapolis, mn: university of minnesota press, . eder, maciej. “visualization in stylometry: cluster analysis using networks.” digital scholarship in the humanities . (december ). edmond, jennifer. “collaboration and infrastructure.” in a new companion to digital humanities. ed. susan schreibman, ray siemens, and john unsworth. - . west sus- sex, uk: wiley-blackwell, . edmond, jennifer. “the role of the professional intermediary in expanding the humani- ties computing base.” literary and linguistic computing ( ) ( ): - . edwards, richard. “creating the center for digital research in the humanities.” univer- sity of nebraska-lincoln, july , . http://cdrh.unl.edu/articles/creatingcdrh.php. edwards, richard. “the digital humanities and its users.” in debates in the digital hu- manities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . egan, gabriel, and john jowett. “review of the early english books online (eebo).” inter- active early modern literary studies (january ): – . eggert, p. “text-encoding, theories of the text, and the ‘work-site’.” literary and lin- guistic computing ( ) ( ): - eisenstein, elizabeth l. the printing press as agent of change. cambridge, uk: cam- bridge university press, . eisenstein, elizabeth l. the printing press as agent of change: communications and cul- tural transformations in early modern europe. cambridge, uk: cambridge university press, . eisenstein, elizabeth l. 
the printing revolution in early modern europe. cambridge, uk: cambridge university press, . electronic literature organization. eliterature.org eliot, simon, and jonathan rose. a companion to the history of the book. malden, ma: blackwell publishing, . elliott, d., r. macdougall, and w.j. turkel. ”new old things: fabrication, physical com- puting, and experiment in historical practice.” canadian journal of communication ( ). - . elliot, tom and richard talbert. “mapping the ancient world.” in past time, past place: gis for history. ed. anne kelly knowles. redlands, ca: esri press, : - . emerson, lori. reading writing interfaces: from the digital to the bookbound. minneap- olis, mn: university of minnesota press, . emirbayer, mustafa, and jeff goodwin. “network analysis, culture and the problem of agency.” american journal of sociology , no. ( ): - . endres, bill. “a literacy of building: making in the digital humanities.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis,mn: university of minnesota press, . ensign, r. “historians are interested in digital scholarship but lack outlets.” chronicle of higher education. wired campus blog, october , . http://chroni- cle.com/blogs/wiredcampus/historians-are-interested-in-digital-scholarship-but-lack- outlets/ ensslin, astrid. canonizing hypertext: explorations and constructions. london, uk: bloomsbury press, . ensslin, astrid. literary gaming. cambridge, ma: mit press, . eposs. internet of things in : a roadmap for the future. brussels, belgium: euro- pean commission. . ernst, w. digital memory and the archive. minneapolis, mn: university of minnesota press. . erway, r. swatting the long tail of digital media: a call for collaboration. dublin, oh: oclc research, . http://www.oclc.org/research/publications/library/ / - .pdf. ethington, philip j. "los angeles and the problem of urban historical knowledge." amer- ican historical review , no. ( ): . 
ethington, p. “placing the past: ‘groundwork’ for spatial theory of history.” rethinking history ( ), ( ): - . europeana. www.europeana.eu. evans, mel. “curating the language of letters: historical linguistic methods in the museum.” in research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - . edinburgh, uk: edinburgh university press, . everett, anna. digital diaspora: a race for cyberspace. albany, ny: suny press, . eyman, douglas. “are you a digital humanist?” in computers and writing. ann arbor, mi: university of michigan, may , . eyman, douglas. digital rhetoric: theory, practice, and method. ann arbor, mi: university of michigan press, . ezell, m.j.m. social authorship and the advent of print. baltimore, md: johns hopkins university press, . fair cite initiative. faircite.wordpress.com. farman, jason. “mapping the digital empire.” new media and society . ( ): - . farman, jason. mobile interface theory: embodied space and locative media. new york and london: routledge, . faull, katherine and diane jakacki. “digital learning in an undergraduate context: promoting long term student-faculty collaboration.” in digital scholarship in the humanities. oxford, uk: oxford university press, . favro, diane. “wagging the dog in the digital age: the impact of computer modeling on architectural history.” paper presented at the computer symposium: the once and future medium for the social sciences and the humanities. brock university, toronto. may , . favro, diane, and willeke wendrich. “digital karnak.” university of california, berkeley, - . fayyad, usama, georges grinstein, andreas wierse. information visualization in data mining and knowledge discovery. san francisco, ca: morgan kaufmann, . fayyad, usama, g. piatetsky-shapiro, and p. smyth. “from data mining to knowledge discovery in databases.” ai magazine , ( ): - . fedora commons. www.fedora-commons.org. feigenbaum, gail.
“unlocking archives through digital tech.” the getty iris. june , : http://blogs.getty.edu/iris/unlocking-archives-through-digital-tech/ felluga, dino franco. “addressed to the nines: the victorian archive and the disappear- ance of the book.” victorian studies , no. ( ): - . http://muse.jhu.edu/journals/victorian_studies/v / . felluga.html ferster, bill. interactive visualization: insight through inquiry. cambridge, ma: mit press, . findlen, paula. “how google rediscovered the th century.” chronicle of higher educa- tion ( ). https://www.chronicle.com/blogs/conversation/ / / /how- google-rediscovered-the- th-century/. finger, anke, and danielle follett, eds. the aesthetics of the total artwork: on borders and fragments. baltimore, md: johns hopkins university press, . finnegan, r. participating in the knowledge society: research beyond university walls. basingstoke, uk: palgrave macmillan, . finneran, richard j. the literary text in the digital age. ann arbor, mi: university of michigan press, . fiormonte, domenico. “toward a cultural critique of digital humanities.” in debates in the digital humanities. - . eds. matthew k. gold and lauren klein. minneapolis, mn: university of minnesota press, . fiormonte, domenico. “towards a monocultural (digital) humanities.” infolet, july , . http://infolet.it/ / / /moncultural-humanities/. fischer, c. “all tech is social.” boston review, august . http://www.bostonre- view.net/blog/claude-fischer-all-tech-is-social. fish, stanley. “the digital humanities and the transcending of mortality.” new york times: opinionator. http://opinionator.blogs,nytimes.com/ / / /the-digital-hu- manities-and-the-transcending-of-mortality. fish, stanley. is there a text in this class? cambridge, ma: harvard university press, . fish, stanley. “mind your p’s and b’s: the digital humanities and interpretation.” new york times, january , . http://opinionator.blogs.nytimes.com/ / / /mind- your-ps-and-bs-the-digital-humanities-and-interpretation/?_r=o. 
fister, barbara. “getting serious about digital humanities (peer to peer review).” li- brary journal. may , . http://www.libraryjournal.com/arti- cle/ca .html?nid= &source=title&rid=#reg_visitor_id#. fitch, catherine a. and steven ruggles. “building the national historical geographic in- formation system.” historical methods : (winter ): - . fitzpatrick, kathleen. “beyond metrics: community authorization and open peer re- view.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . fitzpatrick, kathleen. “giving it away: sharing and the future of scholarly communica- tion.” in planned obsolescence: publishing, technology, and the future of the academy. new york, ny: new york university press, . fitzpatrick, kathleen “the humanities, done digitally.” in debates in the digital humanities. ed. matthew k. gold. minneapolis, mn. university of minnesota press, . fitzpatrick, kathleen. “peer review.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley- blackwell, . fitzpatrick, kathleen. “peer review, judgment, and reading.” profession ( ): – . http://www.mlajournals.org/doi/abs/ . /prof. . . . . fitzpatrick, kathleen. planned obsolescence: publishing, technology, and the future of the academy. new york, ny: new york university press, . fitzpatrick, kathleen, and rowe, katherine. “keywords for open review.” logos: the journal of the world book community , no. - ( ): - . flanagan, mary. critical play. cambridge, ma: mit press, . flanders, julia. “the body encoded: questions of gender and the electronic text.” in electronic text: investigations in method and theory. ed. k. sutherland. - . ox- ford, uk: clarendon press, . flanders, julia. “collaboration and dissent: challenges of collaborative standards for digital humanities.” in collaborative research in the digital humanities. ed. marilyn deegan and willard mccarty. - . farnham, uk: ashgate, . 
flanders, julia. “the literary, the humanistic, the digital: toward a research agenda for literary studies.” in literary studies in the digital age: an evolving anthology. eds. kenneth m. price and ray siemens. new york, ny: modern language association, . flanders, julia. “the productive unease of st-century digital scholarship.” dhq: digital humanities quarterly , no. (summer ). http://digitalhumanities.org/dhq/vol/ / / / .html. flanders, julia. “time, labor, and ‘alternate careers’ in digital humanities knowledge work.” debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . flanders, julia, and fotis jannidis. “data modeling.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . flanders, julia, syd bauman, and sarah connell. “text encoding.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . flanders, julia, syd bauman and sarah connell. “xslt: transforming our xml data.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . flanders, j., & t. muñoz. “an introduction to humanities data curation.” in dh curation guide: a community resource guide to data curation in the digital humanities. . flanders, julia, wendell piez, and melissa terras. “welcome to digital humanities quarterly.” digital humanities quarterly , no. ( ). http://digitalhumanities.org/dhq/vol/ / / / .html. fletcher, pamela and anne helmreich, with david israel and seth erickson. “local/global: mapping nineteenth-century london’s art market.” nineteenth century art worldwide : (autumn ). http://www. thc-artworldwide.org/index.php/autumn /fletcher-helmreich-mapping-the-london-art-market. flew, t. new media: an introduction. rd edition.
melbourne, australia: oxford univer- sity press, . flynn, b. “v-embodiment for cultural heritage.” digital heritage international congress, - . marseille: ieee, . folsom, e. ed. “database as genre: the epic transformation of archives.” pmla , no. (october ): - . folsom, e., & k.m. price. the walt whitman archive. . http://www.whitman- archive.org. fong, deanna, katrina anderson, lindsey bannister, janey dodd, lindsey seatter, and michelle levy. “students in the digital humanities: rhetoric, reality and representa- tion.” university of victoria, dhsi colloquium . forer, p., and d. unwin. “enabling progress in gis and education.” in geographical infor- mation systems. eds. paul longley, michael f. goodchild, david j. maguire, and david w. rhind. - . new york, ny: john wiley & sons, inc., . foresman, timothy w. ed. the history of geographic information systems: perspectives from the pioneers. upper saddle river, nj: prentice hall, . forte, maurizio. virtual archaeology. new york, ny: harry n. abrams, . forte, maurizio. “virtual archaeology: communication in d and ecological thinking.” in beyond illustration: d and d digital technologies as tools for discovery in archaeol- ogy. bar international series. eds. b. frischer and a. dakouri-hild. - . oxford, uk: archaeopress, . forte, maurizio and stafano campana. digital methods and remote sensing in archaeol- ogy. cham, switzerland: springer, . foster, a.l. “second life: second thoughts and second doubts.” chronicle of higher ed- ucation , ( ): - . foster, a.l. “professor avatar.” chronicle of higher education , , ( ): - . foster, hal. “the archive without museums.” october (summer ): - . fotheringham, a. stewart, chris brundson, and martin charlton. geographically weighted regression: the analysis of spatially varying relationships. chichester, uk: john wiley & sons, inc., . fotheringham, a. stewart. quantitative geography: perspectives on spatial data analy- sis. london: sage, . foucault, michel. 
“the discourse on language.” in the archaeology of knowledge. trans. a.m. sheridan smith. . new york, ny: pantheon books, . foulonneau, muriel, and jenn riley. metadata for digital resources: implementation, systems design and interoperability. oxford, uk: chandos, . fountain, kathleen carlisle. “to web or not to web? the evaluation of world wide web publishing in the academy.” in digital scholarship in the tenure, promotion, and review process. ed. deborah lines anderson. - . armonk, ny: m.e. sharpe, . fox, andrea. “bit by bit: tapping into big data.” library of congress, digital preservation, march , .http://digitalpreservation.gov/documents/big-data-report-andrea- fox .pdf. fox, nichols. against the machine: the hidden luddite tradition in literature, art, and individual lives. washington, dc: island press, . foys, martin. virtually anglo-saxon: old media, new media, and early medieval studies in the late age of print. gainesville, fl.: university press of florida, . frabetti, federica. “rethinking the digital humanities in the context of originary tech- nicity.” culture machine ( ). fraistat, neil. “the function of digital humanities centers at the present time.” in de- bates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: uni- versity of minnesota press, . fraistat, neil, and steven e. jones. “immersive textuality.” text ( ): - . fraistat, neil. “the question(s) of digital humanities.” maryland institute for technology in the humanities, february , . http://mith.umd.edu/the-questions-of-digital-hu- manities/. freedman, jonathan, n. katherine hayles, jerome mcgann, meredith l. mcgill, peter stallybrass, and ed folsom. “responses to ed folsom’s ‘database as genre: the epic transformation of archives’.” pmla , no. (october ): - . french, amanda. “make ‘ ’ louder; or, the amplification of scholarly communication.” amandafrench.net. http://amandafrench.net/blog/ / / /make- -louder/. friedlander, amy. 
“foreword.” in a survey of digital humanities centers in the united states. washington, dc: council on library and information resources, . friedlander, amy. “preface.” in a survey of digital humanities centers in the united states. washington, dc: council on library and information resources, . friendly, michael. “datavis.ca.” gallery of data visualization. new york university. frischer, b., and a. dakouri-hild, eds. beyond illustration: d and d digital technologies as tools for discovery in archaeology. bar international series. . oxford, uk: archaeopress. froehlich, heather. “we’re up all night playing with docuscope.” early modern digital agendas. folger shakespeare library. https://earlymoderndigitalagendas.wordpress.com/ / / /were-up-all-night-playing-with-docuscope/. (january , ). froehlich, heather. “how many female characters are there in shakespeare?” http://hfroehlich.wordpress.com/ / / /how-many-female-characters-are-there-in-shakespeare/. frost davis, r. “crowdsourcing, undergraduates, and digital humanities projects.” . http://rebeccafrostdavis.wordpress.com/ / / /crowdsourcing-undergraduates-and-digital-humanities-projects. fry, ben. visualizing data: exploring and explaining data with the processing environment. sebastopol, ca: o’reilly media, . fuchs, christian. digital labour and karl marx. new york, ny: routledge, . fuchs, christian. internet and society: social theory in the information age. new york, ny: routledge, . furht, borko, ed. handbook of augmented reality. new york, ny: springer, . fuller, matthew. media ecologies: materialist energies in art and technology. cambridge, ma: mit press, . fuller, matthew. “software studies workshop.” . http://pzwart.wdka.hro.nl/mdr/seminars /softstudworkshop. fuller, matthew. software studies: a lexicon. cambridge, ma: mit press, . fuller, s. “humanity: the always already-or never to be-object of the social sciences?” in the social sciences and democracy. ed. j.w. bouwel.
london: palgrave macmillan, . fuller, s. the new sociological imagination. london, uk: sage, . funkhouser, c.t. new directions in digital poetry. new york, ny: continuum press, . fyfe, paul. “digital pedagogy unplugged.” digital humanities quarterly , no. ( ). http://digitalhumanities.org/dhq/vol/ / / / .html. fyfe, paul. “electronic errata: digital publishing, open review, and the futures of cor- rection.” in debates in the digital humanities. ed. matthew k. gold. - . minneap- olis, mn: university of minnesota press, . fyfe, paul. “mid-sized digital pedagogy.” in debates in the digital humanities. - . eds. matthew k. gold and lauren klein. minneapolis, mn: university of minnesota press, . gabrys, jennifer. digital rubbish: a natural history of electronics. ann arbor, mi: uni- versity of michigan press, . gadd, ian. “the use and misuse of early english books online.” literature compass ( ): – . gaddis, j.l. the landscape of history: how historians map the past. new york, ny: ox- ford university press, . gaffney, vincent. “in the kingdom of the blind: visualization and e-science in archaeol- ogy, the arts and humanities.” in the virtual representation of the past. eds. mark greengrass and lorna hughes. - . farnham, uk: ashgate, . galarza, alex, jason heppler, and douglas seefeldt. “a call to redefine historical schol- arship in the digital turn.” journal of digital humanities , no. (fall ). http://jour- nalofdigitalhumanities.org/ - /a-call-to-redefine-historical-scholarship-in-the-digital- turn/. galey, alan, and stan ruecker. “how a prototype argues.” literary and linguistic com- puting , no. ( ): - . galina, isabel. “is there anybody out there? building a global digital humanities com- munity.” humanidades digitales. wordpress, . gallon, kim. “making a case for the black digital humanities.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . galloway, alexander r., and eugene thacker. 
the exploit: a theory of networks. minne- apolis, mn: university of minnesota press, . galloway, alexander r., e. thacker, and m. wark. excommunication: three inquiries in media and mediation. chicago, il: university of chicago press, . galloway, alexander r. the interface effect. cambridge, uk: polity, . gallway, p. “retrocomputing, archival research, and digital heritage preservation: a computer museum and school collaboration.” library trends ( ). - . gamelsberger, g. ed. from science to computational sciences: studies in the history of computing and its influence on today’s sciences. zürich: diaphanes. . gantz, john, and david reinsel. “the digital universe decade: are you ready?” interna- tional data corporation. gardin, j.-c. “the structure of archaeological theories.” in mathematics and infor- mation science in archaeology: a flexible framework. ed. a. voorrips. studies in modern archaeology . bonn, germany: holos, ( ): - . gardiner, eileen and ronald g. musto. the digital humanities: a primer for students and scholars. cambridge, uk: cambridge university press, . gardner, chelsea a.m., gwynaeth mcintyre, kaitlyn solberg, and lisa tweten. “looks like we made it, but are we sustaining digital scholarship?” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . garfinkel, susan. “dialogic objects in the age of -d printing: the case of the lincoln life mask.” in making things and drawing boundaries: experiments in the digital humani- ties. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . garrison d.r. & kanuka, h. “blended learning: uncovering its transformative potential in higher education.“ internet and higher education ( ): - . gatrell, anthony c. “any space for spatial analysis?” in the future of geography. ed. ronald j. johnston. - . london, uk: methuen, . gatrell, simon. “electronic hardy.” in the literary text in the digital age. 
ed. richard finneran. - . ann arbor, mi: university of michigan press, . gavin, michael and k.m. smith. “an interview with brett bobley.” in debates in the digi- tal humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . gee, james paul. what video games have to teach us about literacy and learning. new york, ny: palgrave macmillan, . gershenfeld, n. fab: the coming revolution on your desktop: from personal computers to personal fabrication. new york, ny: basic books, . geroimenko, vladimir; chaomei chen, eds. visualizing the semantic web: xml-internet and information visualization. new york, ny: springer, . gerschenfeld, neil, raffi krikorian, and danny cohen. “the internet of things.” scientific american (december , ): - . gershon, n., and w. page. “what storytelling can do for information visualization.” acm , ( ): - . “getting started with topic modeling.” digital humanities . ucla. june , . web. august , . gibbs, fred. “critical discourse in the digital humanities.” journal of digital humani- ties , no. (winter ). http://journalofdigitalhumanities.org/ - /critical-discourse- in-digital-humanities-by-fred-gibbs/. gibbs, fred. digital methods for the humanities. albuquerque, nm: university of new mexico, . http://fredgibbs.net/courses/digital-methods/. gibson, james j. the ecological approach to visual perception. hillsdale, nj: lawrence erlbaum, . gibson, william. neuromancer. new york, ny: ace books, . gil, alex. “interview with ernesto oroza.” in debates in the digital humanities. eds. mat- thew gold and lauren klein. - . minneapolis, mn: university of minnesota press, . gil, alex, and Élika ortega. “global outlooks in digital humanities: multilingual practices and minimal computing.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . gilbert, l.s. 
“going the distance ‘closeness’ in qualitative data analysis software.” in- ternational journal of social research methodology, , ( ): - . gillen, julia, and david barton. digital literacies: a research briefing by the technology enhanced learning phase of the teaching and learning research programme. . lon- don, uk: london knowledge lab, institute of education, university of london, . gillespie, tarleton. “the relevance of algorithms.” in media technologies: essays on communication, materiality, and society. ed. tarleton gillespie, pablo boczkowski, and kirsten foot. - . cambridge, ma: mit press, . gillespie, tarleton, pablo boczkowski, and kirsten foot, eds. media technologies: essays on communication, materiality, and society. cambridge, ma: mit press, . gilliland, jason. “imag(in)ing london’s past into the future with historical gis.” paper presented at the annual association of canadian geographers. toronto, june , . gillings, mark and david wheatley. spatial technology and archaeology: the archaeo- logical applications of gis. london, uk: taylor and francis, . giordano, a., k. huffman lanzoni, & c. bruzelius. eds. visualizing venice: mapping and modeling time and change in a city. new york and london: routledge, . gitelman, lisa. always already new: media, history, and the data of culture. cambridge ma: mit press, . gitelman, lisa. paper knowledge: toward a media history of documents (sign, storage, transmission). durham, nc: duke university press books, . gitelman, lisa. “raw data” is an oxymoron (infrastructures). cambridge, ma: mit press, . gitelman, lisa, and geoffrey b. pingree. “introduction: what’s new about new media?” in new media, - . eds. lisa gitelman and geoffrey b. pingree. xi-xxiv. cam- bridge, ma: mit press, . gladney, h.m. “long-term digital preservation: a digital humanities topic?” historical social research/historiche sozialforschung . ( ): - . glazier, los pequeno. digital poetics: the making of e-poetries. 
tuscaloosa, al: univer- sity of alabama press, . gleick, james. “books and other fetish objects.” the new york times, july , , sec. opinion/sunday review. http://www.nytimes.com/ / / /opinion/sun- day/ gleick.html?_r= . gleick, james. the information: a history, a theory, a flood. new york, ny: pantheon, . global outlook: digital humanities. http://www.globaloutlookdh.org/. gold, harvey, and shirley e. gold. “implementation of a model to improve productivity of interdisciplinary groups.” in managing high technology: an interdisciplinary perspec- tive. eds. brian w. mar, william t. newell, and borje o. saxbeg. - . amsterdam: elsevier, . gold, matthew k., ed. debates in the digital humanities. minneapolis, mn: university of minnesota press, gold, matthew k. “looking for whitman: a grand, aggregated experiment.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press . gold, matthew k. “looking for whitman: a multi-campus experiment in digital peda- gogy.” in digital humanities pedagogy: practices, principles and politics. ed. brett d. hirsch, - . open book publishers, . http://www.openbookpublish- ers.com/reader/ . gold, matthew k. “whose revolution? towards a more equitable digital humanities.” the lapland chronicles, january , . http://blog.mkgold.net/category/presenta- tions/. gold, matthew k. and lauren f. klein. debates in the digital humanities. minneapolis, mn: university of minnesota press, . gold, matthew k. and lauren f. klein. “introduction.” in debates in the digital humani- ties. ed. gold, matthew and klein, lauren. - . minneapolis, mn: university of min- nesota press, . gold, matthew k. and lauren f. klein. “series introduction and editors’ note.” in de- bates in the digital humanities. eds. gold, matthew and klein, lauren. - . minne- apolis, mn: university of minnesota press, . goldstein, evan r. “digitally incorrect.” chronicle of higher education, october , . 
http://chronicle.com/article/digitally incorrect/ /. goldstone, andrew, and ted underwood. “the quiet transformations of literary stu- dies: what thirteen thousand scholars could tell us.” new literary history , no. ( ): - . doi: . /nlh. . . golumbia, david. the cultural logic of computation. cambridge, ma: harvard university press, . gombrich, e.h. “the evidence of images.” in interpretation: theory and practice. ed. charles singleton. - . baltimore, md: johns hopkins university press. . goodchild, michael f. “geographical information science.” international journal of geo- graphical information systems ( ): - . goodchild, michael f. “geographic information systems and spatial analysis in the so- cial sciences.” in anthropology, space, and geographic information systems. eds. m. aldenerfer and h.d.g. maschner. - . new york, ny: oxford university press, . goodchild, michael f. introduction to spatial autocorrelation. concepts and techniques in modern geography. . norwich, uk: geoabstracts, . goodchild, michael f., and donald g. janelle, eds. spatially integrated social science. oxford, uk: oxford university press, . goodchild, michael f., and n.s.-n.lam. “areal interpolation: a variant of the traditional spatial problem.” geo-processing ( ): - . gooding p., c. warwick, and m. terras. “the myth of the new: mass digitization, distant reading and the future of the book.” in digital humanities , hamburg. . http://www.dh .uni-hamburg.de/conference/programme/abstracts/the-myth-of- the-new-mass-digitization-distant-reading-and-the-future-of-the-book. .html. goodrick, glyn thomas, and mark gillings. “constructs, simulations and hyperreal worlds: the role of virtual reality (vr) in archaeological research.” in on the theory and practice of archaeological computing. eds. g.r. lock and k. smith. - . oxford, uk: oxbow, . goodrum, abby. “the ethics of hacktivism.” journal of information ethics ( ): - . gordon, eric, and adriana de souza e silva. 
net locality: why location matters in a net- worked world. chichester, west sussex, uk: wiley-blackwell, . gorman, michael. “introduction: trading zones, interactional expertise, and collabora- tion.” in trading zones and interactional expertise: creating new kinds of collaboration. ed. michael e. gorman. - . cambridge, ma: mit press, . gorman, michael, ed. trading zones and interactional expertise: creating new kinds of collaboration. cambridge, ma: mit press, . gosden, c., and y. marshall. “the cultural biography of objects.” world archaeology, . ( ): - . gouglas, s., g. rockwell, v. smith, s. hoosin, and h. quamen. “before the beginning: the formation of humanities computing as a discipline in canada.” digital studies/le champ numérique . ( ). gradmann, s., and j.c. meister. “digital document and interpretation: re-thinking “text” and scholarship in electronic settings.” poiesis & praxis ( ) ( ): - . grafton, anthony. “apocalypse in the stacks: the research library in the age of google.” daedelus , no. (winter ): - . grafton, anthony. the footnote: a curious history. cambridge, ma: harvard university press, . graham, shawn, ian milligan, and scott weingart. “principles of information visualiza- tion.” in the historian’s macroscope – working title. under contract with imperial col- lege press, . http://www.themacroscope.org/?page_id= . grau, oliver. mediaarthistories. cambridge, ma: mit press, . grau, oliver. virtual art: from illusion to immersion. cambridge, ma: mit press, . green, karen. “naughty bits.” comixology. . http://www.aca- demia.edu/ /naughty_bits. greenbaum, joan m., and morten kyng. design at work: cooperative design of com- puter systems. boca raton, fl: crc press, . greenberg, hope, elli mylonas, scott hamlin, and patrick yott. “supporting digital hu- manities research: the collaborative approach.” northeast regional computing pro- gram, march . net.educause.edu/ir/library/pdf/ncp .pdf. greene, m.a. 
“the power of meaning: the archival mission in the postmodern age.” the american archivist , ( ): - . greenfield, adam. everyware: the dawning age of ubiquitous computing. berkeley, ca: new riders, . greengrass, mark and lorna hughes. the virtual representation of the past. eds. mark greengrass and lorna hughes. london, uk: ashgate, . greenshow, christine, and benjamin gleason. “social scholarship: reconsidering schol- arly practices in the age of social media.” british journal of educational technology . ( ): - . greenspan, brian. “are digital humanists utopian?” in debates in the digital humani- ties. - . eds. matthew k. gold and lauren klein. minneapolis, mn: university of minnesota press, . greenstein, daniel, and suzanne e. thorin. “the digital library: a biography.” washing- ton, d.c.: digital library federation/council on library and information resources, . http://www.clir.org/pubs/abstract/pub abst.html. greetham, david. “the resistance to digital humanities.” in debates in the digital hu- manities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . greetham, d.c. textual scholarship: an introduction. new york, ny: garland, . gregory, derek. geographical imaginations. cambridge, ma: blackwell, . gregory, ian n. a place in history: a guide to using gis in historical research. oxford, uk: oxbow books, . gregory, ian n., c. bennett, v.l. gilbam, and h.r. southall. “the great britain historical gis project: from maps to changing human geography.” cartographic journal : ( ): - . gregory, ian, c. donaldson, p. murrieta-flores and p. rayson. “geoparsing, gis and tex- tual analysis: current developments in spatial humanities research.” international jour- nal of humanities and arts computing ( ): - . gregory, ian n., and paul s. ell. historical gis: technologies, methodologies, and scholar- ship. cambridge, uk: cambridge university press. gregory, ian, and p.s. ell. historical gis: technologies, methodologies, scholarship. 
cam- bridge, uk: cambridge university press, . gregory, ian n., and paul s. ell, eds. history and computing , ( ). gregory, ian, and patricia murrieta-flores. “geographical information systems as a tool for exploring the spatial humanities.” in doing digital humanities: practice, training, re- search. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . gregory, ian, n. karen kemp, and ruth mostern. “geographical information and histori- cal research: current progress and future directions.” history and computing ( ): - . gregory, ian, and r.g. healey. “historical gis: structuring, mapping and analyzing geog- raphies of the past.” progress in human geography ( ): - . griffey, jason. “ d printers for libraries: types of plastics.” library technology reports . ( ): - . griffin, g. and m. hayler, eds. research methods for reading digital data in the digital humanities. edinburgh, uk: edinburgh university press, . grigar, dene. “curating electronic literature as critical and scholarly practice.” digital humanities quarterly , ( ). grigar, dene. “electronic literature and digital humanities: opportunities for practice, scholarship and teaching.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . grigar, dene. “electronic literature: where is it?” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . grigar, dene. “the present [future] of electronic literature.” transdisciplinary digital art: sound, vision and the new screen. eds. randy adams, steve gibson, and stefan muller. - . heidelberg, germany: springer-verlag publications, . grigar, dene and stuart moulthrop. pathfinders: documenting the experience of early digital literature. electronic literature organization, . grimes, sara m., and andrew feenberg. 
“rationalizing play: a critical theory of digital gaming.” the information society . ( ): - . gronlund, melissa. contemporary art and digital culture. new york, ny: routledge, . gruber, david. “new materialism and a rhetoric of scientific practice in the digital humanities.” in rhetoric and the digital humanities. eds. jim ridolfo and william hart-davidson. - . chicago, il: university of chicago press, . guiliano, jennifer. “i’ll see your open access and raise you two book contracts: or why the aha should re-think its policy.” jennifer guiliano’s blog. cyber chimps. http://jguiliano.com/blog/ / / /can-we-get-a-re-do-please-the-aha-policy-on-embargoing-dissertations-or-why-im-disappointed-in-my-professional-organization/. guldi, jo. “spatial turn in art history.” spatial humanities. http://spatial.scholarslab.org/spatial-turn/the-spatial-turn-in-art-history/index.html. guldi, jo. “what is the spatial turn?” spatial humanities, . http://spatial.scholarslab.org/spatial-turn/. gurak, laura, and smiljana antonijevic. “digital rhetoric and public discourse.” in the sage handbook of rhetorical studies. eds. andrea lunsford, kirt h. wilson, and rosa a. eberly. - . thousand oaks, ca: sage, . habermas, jürgen. the structural transformation of the public sphere: an inquiry into a category of bourgeois society. trans. thomas burger, with frederick lawrence. cambridge, ma: mit press, . haegler, simon, pascal müller, and luc van gool. “procedural modeling for digital cultural heritage.” eurasip journal on image and video processing, ( ): - . hagood, j. “brief introduction to data mining projects in the humanities.” bulletin of the american society for information science and technology, . ( ): - . hai-jew, shalin, ed. data analytics in digital humanities. cham, switzerland: springer, . hale, constance, ed. wired style: principles of english usage in the digital age. new york, ny: hardwired, . hayles, n. katherine.
how we became posthuman: virtual bodies in cybernetics, literature, and informatics. chicago, il: university of chicago press, .
hall, gary. “the digital humanities beyond computing: a postscript.” culture machine ( ). http://www.culturemachine.net/index.php/cm/article/view/ / .
hall, gary. digitize this book! the politics of new media, or why we need open access now. minneapolis and london: university of minnesota press, .
hall, gary. “has critical theory run out of time for data-driven scholarship?” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, .
hall, gary. “there are no digital humanities.” debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, .
hall, gary. “toward a postdigital humanities: cultural analytics and the computational turn to data-driven scholarship.” american literature , no. ( ): - .
hall, stephen s. mapping the next millennium. new york, ny: random house, .
hall, stuart. “emergence of cultural studies and the crisis of the humanities.” october, ( ): - .
hall, stuart. “encoding/decoding.” in culture, media, language. eds. stuart hall, dorothy hobson, andrew lowe, paul willis. - . london, uk: hutchinson, .
halpern, orit. beautiful data: a history of vision and reason since . durham, nc: duke university press, .
hamburger, j. the visual culture of a medieval convent. berkeley, ca: university of california press, .
hamming, richard. numerical analysis for scientists and engineers. new york, ny: mcgraw-hill, .
han, j., m. kamber, and j. pei. data mining: concepts and techniques. burlington, ma: morgan kaufmann, .
hancher, michael. “re: search and close reading.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
hannigan, lee, aurelio meza, and alexander flamenco.
“reading series matter: performing the spokenweb project.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
hansen, derek l., ben shneiderman and marc a. smith. analyzing social media networks with nodexl: insights from a connected world. burlington, ma: morgan kaufmann, .
hansen, mark b.n. “affect as medium or the ‘digital-facial-image’.” journal of visual culture , no. ( ): - .
hansen, mark b.n. embodying technesis: technology beyond writing. ann arbor, mi: university of michigan press, .
hansen, mark b.n. new philosophy for new media. cambridge, ma: mit press, .
haraway, d. “a cyborg manifesto: science, technology, and socialist-feminism in the late twentieth century.” in simians, cyborgs, and women: the reinvention of nature. - . new york, ny: routledge, .
hardt, michael, and antonio negri. multitudes. new york, ny: penguin, .
hardy, molly o’hagan. “‘black printers’ on white cards: information architecture in the data structures of the early american book trades.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
harley, j. brian. “deconstructing the map.” cartographica ( ): - .
harley, j. brian. “maps, knowledge, and power.” in the iconography of landscape. eds. denis cosgrove and stephen daniels. - . cambridge, uk: cambridge university press, .
harley, j. brian. the new nature of maps. ed. paul laxton. baltimore, md: johns hopkins university press, .
harley, diane, jonathan henke, and shannon lawrence, et al. use and users of digital resources: a focus on undergraduate education in the humanities and social sciences. berkeley’s center for studies in higher education, april , . http://cshe.berkeley.edu/publications/publications.php?id= .
harley, diane, and university of california, berkeley.
assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines. berkeley, ca: center for studies in higher education, .
harley, j.b. deconstructing the map. http://hackitectura.net/osfavelados/ _proyectos_eventos/ _cartografia_ciudadana/harley _maps.pdf
harrell, d.f. phantasmal media: an approach to imagination, computation, and expression. cambridge, ma: mit press, .
harris, katherine. “explaining digital humanities in promotion documents.” the journal of digital humanities , no. ( ).
harris, katherine. “explaining digital humanities in promotion documents.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /explaining-digital-humanities-in-promotion-documents-by-katherine-harris/.
harris, katherine d. “let’s get real with numbers: the financial reality of being a tenured professor.” https://triproftri.wordpress.com/ / / /lets-get-real-with-numbers-the-financial-reality-of-being-a-tenured-professor/.
harris, trevor m. “gis in archaeology.” in past time, past place: gis for history. ed. anne kelly knowles, - . redlands, ca: esri press, .
harrower, mark. “representing uncertainty: does it help people make better decisions?” white paper prepared for ucgis workshop: geospatial visualization and knowledge discovery workshop. national conference center, lansdowne, virginia. november - , .
hartman, j. et al. “preparing the academy of today for the learner of tomorrow.” educause. http://net.educause.edu/ir/library/pdf/pub f.pdf
hartman, kate. wearable electronics: design, prototype, and wear your own interactive garments. sebastopol, ca: maker media, .
harvard library digital humanities café. http://guides.hcl.harvard.edu/digitalhumanities
harvey, francis, marianna pavlovskaya, and mei-po kwan. “introduction to critical gis.” cartographica : ( ): - .
harvey, r. digital curation: a how-to-do-it manual. new york, ny: neal-schuman, .
harpham, geoffrey galt.
the humanities and the dream of america. chicago, il: university of chicago press, .
hassan, robert, and julian thomas, eds. the new media theory reader. maidenhead, uk: open university press, .
hastac (humanities, arts, sciences, and technology advanced collaboratory). www.hastac.org
hatch, mark. the maker movement manifesto: rules for innovation in the new world of crafters, hackers, and tinkerers. new york, ny: mcgraw hill, .
hatfield, j. “imagining future gardens of history.” camera obscura ( / ) ( ): - .
hathitrust digital library. www.hathitrust.org
hawkins, d.t., ed. personal archiving: preserving our digital heritage. medford, nj: information today, .
hawkins, ann r. “making the leap: incorporating digital humanities into the english classroom.” in cea critic , no. (july ). https://muse.jhu.edu/login?auth= &type=summary&url=/journals/cea_critic/v / . .hawkins.pdf.
haworth, k.m. “archival description: content and context in search of structure.” in encoded archival description on the internet. eds. d.v. pitti, & w.m. duff. - . binghamton, ny: haworth information press, .
hayler, matt, and gabriele griffin. “introduction.” research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - . edinburgh, uk: edinburgh university press, .
hayler, matt and gabriele griffin, eds. research methods for creating and curating data in the digital humanities. edinburgh, uk: edinburgh university press, .
hayles, n. katherine. “cybernetics.” in critical terms for media studies. eds. w.j.t. mitchell and mark b. n. hansen. - . chicago, il: university of chicago press, .
hayles, n. katherine. electronic literature: new horizons for the literary. notre dame, in: university of notre dame press, .
hayles, n. katherine. “electronic literature: what is it?” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, .
hayles, n. katherine.
“elit: what is it?” electronic literature organization. . retrieved: october .
hayles, n. katherine. how we became posthuman: virtual bodies in cybernetics, literature, and informatics. chicago, il: university of chicago press, .
hayles, n. katherine. how we think: digital media and contemporary technogenesis. chicago, il: university of chicago press, .
hayles, n. katherine. “how we think: transforming power and digital technologies.” in understanding the digital humanities. ed. d. m. berry. london, uk: palgrave, .
hayles, n. katherine. “how we read: close, hyper, machine.” ade bulletin ( ): - . http://www.mla.org/adefl_bulletin_c_ade_ _ .
hayles, n. katherine. my mother was a computer: digital subjects and literary texts. chicago, il: university of chicago press, .
hayles, n. katherine. “print is flat, code is deep: the importance of media-specific analysis.” poetics today ( ). – .
hayles, n. katherine. “speech, writing, code: three worldviews.” in my mother was a computer: digital subjects and literary texts, - . chicago, il: university of chicago press, .
hayles, n. katherine. writing machines. cambridge, ma: mit press, .
hayles, n. katherine and jessica pressman, eds. comparative textual media: transforming the humanities in the post-print era. minneapolis, mn: university of minnesota press, .
healey, richard g., and trem r. stamp. “historical gis as a foundation for the analysis of regional economic growth: theoretical, methodological, and practical issues.” social science history : ( ): - .
heasley, lynne. “shifting boundaries on a wisconsin landscape: can gis help historians tell a complicated story.” human ecology : ( ): - .
heath, t., and c. bizer. linked data: evolving the web into a global data space. san rafael, ca: morgan & claypool, .
heller, margaret. “lazy consensus and libraries.” acrl tech connect, march , . http://acrl.ala.org/techconnect/?p=
heidegger, m. “the question concerning technology.” in martin heidegger: basic writings. d.f.
krell ed. - . london, uk: routledge, .
hellqvist, björn. “referencing in the humanities and its implications for citation analysis.” journal of the american society for information science and technology , no. ( ).
hendren, sara. “all technology is assistive: six design rules on disability.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
henry, chuck. “removable type.” in online humanities scholarship: the shape of things to come. ed. jerome mcgann. - . houston, tx: rice university press, .
henry, shawn lawton. “very briefly: scalable reading.” scalable reading. wordpress, june . web. aug. .
herbert, james. “masterdisciplinarity and pictorial turn.” the art bulletin . ( ): - .
hertz, g. “methodologies of reuse in the media arts: exploring black boxes, tactics, and archaeologies.” phd dissertation, university of california irvine, .
hess, charlotte, and elinor ostrom. understanding knowledge as a commons: from theory to practice. cambridge, ma: mit press, .
higgin, tanner. “cultural politics, critique and the digital humanities.” mediacommons. may, . http://www.tanneerhiggin.com/ / /cultural-politics-critique-and-the-digital-humanities/.
higgin, tanner. “how do you define humanities computing/digital humanities?” in day of digital humanities. march , . http://tapor.ualberta.ca/taporwiki/index.php/how_do_you_define_humanities_computng _/_digital_humanitites% f.
higgins, s. “the dcc curation lifecycle model.” international journal of digital curation ( ) ( ): - . http://ijdc.net/index.php/ijdc/article/view/ .
hill, linda. georeferencing: the geographic associations of information. cambridge, ma: mit press, .
hilyard, stephen. “the object and the event: time-based digital simulation and illusion in the fine arts.” in research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - .
edinburgh, uk: edinburgh university press, .
himanen, pekka. the hacker ethic. new york, ny: random house, .
hindley, meredith. “the rise of the machines.” humanities , no. ( ).
hirsch, brett d. ed. digital humanities pedagogy: practices, principles and politics. cambridge, uk: open book publishers, .
hitchcock, tim. “big data, small data and meaning.” historyonics (blog). november , . http://historyonics.blogspot.com/ / /big-data-small-data-and-meaning_ .html.
hitchcock, tim. “digital searching and re-formulation of knowledge.” in the virtual representation of the past. eds. mark greengrass and lorna hughes. - . london, uk: ashgate, .
hitchcock, tim. “digitising british history since .” in making history: the changing face of the profession in britain. institute for historical research, . http://www.history.ac.uk/makinghistory/resources/articles/digitisation_of_history.html.
hockey, susan. electronic texts in the humanities. oxford, uk: oxford university press, .
hockey, susan. “the history of humanities computing.” in a companion to digital humanities. eds. s. schreibman, r. siemens, and j. unsworth. oxford, uk: blackwell, . http://www.digitalhumanities.org/companion .
hockey, susan. “living with google: perspectives on humanities computing and digital libraries.” literary and linguistic computing , no. (march , ): - .
hockey, susan. “towards a model for web-based language documentation and description: some contributions from digital libraries and humanities computing research.” web-based language learning workshop, philadelphia. december - , .
hockey, susan. “workshop on teaching computers and the humanities courses.” literary and linguistic computing . ( ): - .
hocks, mary, and michelle kendrick, eds. eloquent images: word and image in the age of new media. cambridge, ma: mit press, .
holley, r. “crowdsourcing: how and why should libraries do it?” in d-lib magazine ( / ) ( ). http://www.dlib.org/dlib/march /holley/ holley.html
holt, jim.
“two brains running.” the new york times. november , : sunday book review.
horst, heather a., and daniel miller, eds. digital anthropology. london and new york: bloomsbury academic, .
hoover, david l. “argument, evidence, and the limits of digital literary studies.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
hoover, david l., jonathan culpeper, and kieran o’halloran. digital literary studies: corpus approaches to poetry, prose, and drama. london, uk: routledge, .
hoover, david l. “making waves: algorithmic criticism revisited.” dh , university of lausanne and ecole polytechnique fédérale de lausanne, - july .
hopes, d. “digital cops and robbers: communities of practice and the use of digital artefacts.” museum management and curatorship, . ( ): - .
howard, jennifer. “digital materiality; or learning to love our machines.” wired campus blog at the chronicle of higher education. august , . http://chronicle.com/blogs/wiredcampus/digital-materiality-or-learning-to-love-our-machines/ .
howard, jennifer. “the mla convention in translation.” chronicle of higher education. http://chronicle.com/article/the-mla-convention-in/ /.
howe, jeff. “the rise of crowdsourcing.” wired.com, condé nast digital, june . http://www.wired.com/.
hsu, mei-ling. “the qin maps: a clue to later chinese cartographic development.” imago mundi ( ): - .
hsu, wendy f. “digital ethnography toward augmented empiricism: a new methodological framework.” journal of digital humanities , no. ( ).
hsu, wendy f. “lessons on public humanities from the civic sphere.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
huffman, kristin l., andrea giordano, and caroline bruzelius, eds. visualizing venice: mapping and modeling time and change in a city. oxford, uk: routledge, .
hughes, lorna, panos constantopoulos, and costis dallas.
“digital methods in the humanities: understanding and describing their use across the disciplines.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
huhtamo, erkki. illusions in motion: media archaeology of the moving panorama and related spectacles. cambridge, ma: mit press, .
huhtamo, e. and j. parikka, eds. media archaeology: approaches, applications, and implications. berkeley and los angeles, ca: university of california press, .
hui kyong chun, wendy, richard grusin, patrick jagoda, and rita raley. in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
huitfeldt, claus. “scholarly text processing and future markup systems.” forum computerphilologie . http://computerphilologie.uni-muenchen.de/jg /huitfeldt.html
humanist discussion group. www.digitalhumanities.org/humanist.
hunter, john, katherine faull, and diane jakacki. “reifying the maker as humanist.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
hunter, m. editing early modern texts: an introduction to principles and practice. new york, ny: palgrave macmillan, .
hunyadi, laszlo. “collaboration in virtual space in digital humanities.” in collaborative research in the digital humanities. eds. marilyn deegan and willard mccarty. - . farnham, uk: ashgate, .
hutchison, coleman. “breaking the book known as q.” pmla ( ): – .
hypercities. http://www.hypercities.com.
idc. digital universe study. december . http://www.emc.com.
igoe, t. making things talk: using sensors, networks, and the arduino to see, hear, and feel your world. nd edition. sebastopol, ca: o’reilly, .
ihde, don. postphenomenology and technoscience: the peking university lectures. albany, ny: suny press, .
inkpen, deborah. “munfla: digitizing the past.” gazette january , : .
inscho, jeffrey. “guest post: oh snap! experimenting with open authority in the gallery.” museum . . march , . http://museumtwo.blogspot.com/ / /guest-post-oh-snap-experimenting-with.html.
institute for the future of the book. www.futureofthebook.org.
institute of museum and library services. www.imls.gov.
“interchange: the promise of digital history.” journal of american history , no. ( ): - . http://www.journalofamericanhistory.org/issues/ /interchange/.
international journal for digital art history. http://www.dah-journal.org
itō, mizuko. hanging out, messing around, and geeking out: kids living and learning with new media. cambridge, ma: mit press, .
jakacki, diane, and katherine faull. “doing dh in the classroom: transforming the humanities curriculum through digital engagement.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, .
jackson, s.j. “rethinking repair.” in media technologies: essays on communication, materiality and society. eds. t. gillespie, p. boczkowski, and k. foot. cambridge, ma: mit press, .
jackson, william a. “some limitations of microfilm.” papers of the bibliographical society of america ( ): – .
jah. “interchange: the promise of digital history.” the journal of american history. retrieved december , . http://www.journalofamericanhistory.org/issues/ /interchange/index.html.
jagoda, patrick. “gamification and other forms of play.” boundary , no. (summer ): - .
jagoda, patrick. “gaming in the humanities.” differences: a journal of feminist cultural studies , no. ( ): - .
jameson, fredric. postmodernism, or the cultural logic of late capitalism. durham, nc: duke university press, .
jannidis, fotis. “tei in a crystal ball.” literary and linguistic computing ( ), : - .
jannidis, fotis et al. “an encoding model for genetic editions.” tei guidelines. http://www.tei-c.org/vault/tc/tcw .html.
jannidis, fotis et al. “ch.
: representation of primary sources.” tei guidelines. http://www.tei-c.org/release/doc/tei-p -doc/en/html/ph.html.
jarmon, l., t. traphagan, et al. “virtual world teaching, experimental learning and assessment: an interdisciplinary communications course in second life.” computers in education , ( ): - .
jaschik, scott. “an open, digital professoriat.” inside higher ed. january , . http://www.insidehighered.com/news/ / / /mlaa_embraces_digital_humanities_and_blogging.
jaskot, paul b. “commentary: art-historical questions, geographic concepts, and digital methods,” historical geography ( ): - .
jaskot, paul b. and ivo van der graaff, “historical journals as digital sources: mapping architecture in germany, - ,” journal of the society of architectural historians , no. (december ): - .
jaskot, paul b. and anne kelly knowles, “architecture and maps, databases and archives: an approach to institutional history and the built environment in nazi germany,” the iris ( february ): http://blogs.getty.edu/iris/dah_jaskot_knowles/
jaskot, paul b., anne kelly knowles, andrew wasserman, stephen whiteman, and benjamin zweig, “a research-based model for digital mapping and art history: notes from the field,” artl@s bulletin , no. (spring ): - .
jaskot, paul b., anne kelly knowles, and chester harvey, with benjamin perry blackshear, “visualizing the archive: building at auschwitz as a geographic problem,” in tim cole, alberto giordano and anne kelly knowles, eds., geographies of the holocaust. bloomington, in: indiana university press, : - .
jebara, tony. machine learning: discriminative and generative. new york, ny: springer, .
jenkins, henry. “bringing critical perspectives to the digital humanities: an interview with tara mcpherson (part three).” confessions of an aca-fan, blog, march , .
jenkins, henry. convergence culture: where old and new media collide. new york, ny: new york university press, .
jenson, jennifer, stephanie fisher, and suzanne de castell.
“disrupting the gender order: leveling up and claiming space in an after-school video game club.” international journal of gender, science and technology . ( ).
jenstad, janelle. “restoring place to the digital archive.” in teaching early modern english literature from the archives. eds. heidi brayman hackel and ian frederick moulton. - . new york, ny: modern language association, .
jenstad, janelle, and joseph takeda. “making the ra matter: pedagogy, interface, and practices.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
jessop, martyn. “the inhibition of geographical information in digital humanities scholarship.” literary and linguistic computing ( ). - .
jessop, martyn. “the visualization of spatial data in the humanities.” literary and linguistic computing ( ), - .
jockers, matthew l. “digital humanities: methodology and questions.” matthew l. jockers. april , , http://www.stanford.edu/-mjockers/cgi-bin/drupal/node/ .
jockers, matthew l. macroanalysis: digital methods and literary history. urbana, il: university of illinois press, .
jockers, matthew l. and ted underwood. “text-mining the humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
johanson, christopher. “making virtual worlds.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
johanson, christopher. “visualizing history: modeling in the eternal city.” visual resources: an international journal of documentation ( ) ( ): . doi: . / .
johnson, i. “putting time on the map: using timemap for map animation and web delivery.” geoinformatics ( ) ( ): - .
johnson, jessica marie. diaspora hypertext. https://diasporahypertext.com/.
johnson, l.
“topic maps: from information to discourse architecture.” journal of information architecture , ( ): - .
johnson, steven. where good ideas come from: the natural history of innovation. london, uk: penguin, .
johnston, john. the allure of mechanic life: cybernetics, artificial life, and the new ai. cambridge, ma: mit press, .
jones, m., and n. beagrie. preservation management of digital materials: a handbook. london, uk: the british library for resource, the council for museums, archives and libraries, .
jones, r., and c. hafner. understanding digital literacies: a practical introduction. london, uk: routledge, .
jones, steven e. the emergence of the digital humanities. new york and london: routledge, .
jones, steven e. “the emergence of the digital humanities (as the network is everting).” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
jones, steven e. against technology: from the luddites to neo-luddism. new york, ny: routledge, .
jones, steven e. “new media and modeling: games and the digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
jones-imhotep, edward, and william j. turkel. “image mining for the history of electronics and computing.” in seeing the past: augmented reality and computer vision. ed. kevin kee. ann arbor, mi: university of michigan press, .
jones-kavalier, barbara r., and suzanne l. flannigan. “connecting the dots: literacy of the st century.” educause quarterly, no. (january ): - .
jordan, tim. activism! direct action, hacktivism and the future of society. london, uk: reaktion books, .
jørgensen, finn arne. “the internet of things.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
joyce, michael. of two minds: hypertext pedagogy and poetics.
ann arbor, mi: university of michigan press, .
juola, patrick. “killer applications in digital humanities.” literary and linguistic computing, . ( ): - .
journal of digital humanities: http://journalofdigitalhumanities.org/, particularly the issue on evaluation: http://journalofdigitalhumanities.org/ - /
journal of interactive pedagogy, http://jitp.commons.gc.cuny.edu/
jurgenson, nathan. “digital dualism versus augmented reality.” cyborgology. the society pages. february , . http://thesocietypages.org/cyborgology/ / / /digital-dualism-versus-augmented-reality/.
juul, jesper. a casual revolution: reinventing video games and their players. cambridge, ma: mit press, .
kadushin, c. understanding social networks: theories, concepts, and findings. new york, ny: oxford university press, .
kalas, gregor, diane favro, and chris johanson. “visualizing statues in the late antique forum.” inscriptions. http://inscriptions.etc.ucla.edu
kalay y.e., t. kvan, and j. affleck, eds. new heritage: new media and cultural heritage. london and new york: routledge, .
kallinikos, jannis, aleksi aaltonen, and attila marton. “a theory of digital objects.” first monday , no. ( ).
kamada, hitoshi. “digital humanities: roles for libraries?” college & research libraries news , no. (october ): - .
kasik, d.j., d. ebert, g. lebanon, h. park, and w.m. pottenger. “data transformations and representations for computation and visualization.” information visualization ( ) - .
kafai, y.b., m.s. cook, and d.a. fields. “‘blacks deserve bodies too!’: design and discussion about diversity and race in a tween virtual world.” games and culture ( ), ( ): - . doi: . / .
kearney, patrick j., and g. legman. the private case: an annotated bibliography of the private case erotica collection in the british (museum) library. london, uk: j. landesman, .
kee, kevin, ed. pastplay: teaching and learning history with technology. ann arbor, mi: university of michigan press, .
keeling, kara.
the witch’s flight: the cinematic, the black femme, and the image of common sense. durham, nc: duke university press, .
keim, d.a., f. mansmann, j. schneidewind, and h. ziegler. “challenges in visual data analysis.” proceedings in information visualization iv . - . london, uk: ieee.
kelland, lara. “the master’s tools, . .” in public history commons. may , . http://publichistorycommons.org/the-masters-tools- - /.
keller, michael. “response to rotunda: a university press starts a digital imprint.” in online humanities scholarship: the shape of things to come. - . ed. jerome mcgann. houston, tx: rice university press, .
kelley, victoria. “time, wear and maintenance: the afterlife of things.” in writing material culture. eds. anne gerritsen and giorgio riello. london, uk: bloomsbury, .
kelly, t. mills. “making digital scholarship count (part i- of iii).” edwired, june , . http://edwired.org/ / / /making-digital-scholarship-count/.
kelly, t. mills. teaching history in the digital age. ann arbor, mi: university of michigan press, .
kelly, t. mills. “visualizing information.” edwired. october , . http://edwired.org/ / / /visualizing-information/.
kelly, t. mills. “visualizing millions of words.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, .
kelty, christopher m. two bits: the cultural significance of free software. durham, nc: duke university press, .
kemman, max, martijn kleppe, and stef scagliola. “just google it.” in proceedings of the digital humanities congress . eds. clare mills, michael pidd, and esther ward. sheffield: hri online publications, . http://www.hrionline.ac.uk/openbook/chapter/dhc -kemman.
kenderdine, s., j. shaw, and t. gremmler. “cultural data sculpting: omnidirectional visualization for cultural datasets.” in knowledge visualization currents: from text to art to culture. eds. e.t. marchese and e. banissi. - . london, uk: springer, .
kenderdine, s.
“speaking in rama: panoramic vision in cultural heritage visualization.” in digital cultural heritage: a critical discourse. ed. f. cameron and s. kenderdine. - . cambridge, ma: mit press.
kenderdine, sarah. “embodiment, entanglement, and immersion in digital cultural heritage.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
kennicott, p. “pure land tour: for visitors virtually exploring buddhist cave, it’s pure fun.” washington post, november , .
kenny, anthony. the computation of style. oxford, uk: oxford university press, .
kenny, anthony. computers and the humanities. ninth british library research lecture. british library, london, uk, .
keramidas, kimon. “interactive development as pedagogical process: digital media design in the classroom as a method for recontextualizing the study of material culture.” museums and the web : proceedings. museum and the web. http://mw .museumsandtheweb.com/paper/interactive-development-as-pedagogical-process-digital-media-design-in-the-classroom-as-a-method-for-recontextualizing-the-study-of-material-culture/
kernighan, brian, and rob pike. the unix programming environment. englewood cliffs, nj: prentice-hall, .
kernighan, brian, and d.m. ritchie. the c programming language. englewood cliffs, nj: prentice-hall, . reprint, .
kernighan, brian w. d is for digital: what a well-informed person should know about computers and communications. createspace independent publishing platform, .
ketelhut, d.j. “the impact of student self-sufficiency on scientific inquiry skills: an exploratory investigation in river city, a multi-user virtual environment.” journal of science education & technology. , , ( ): - .
kilbride, william. “saving the bits: digital humanities forever?” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
kim, david.
“archives, models, and methods for critical approaches to identities: representing race and ethnicity in the digital humanities.” phd dissertation, university of california los angeles, .
kim, david. “‘data-izing’ the images: process and prototype.” in performing archive: curtis + the vanishing race. eds. jacqueline wernimont, beatrice schuster, amy borsuk, david j. kim, heather blackmore, and ulia gusart (popova). scalar, .
kinder, marsha, and tara mcpherson, eds. transmedia frictions: the digital, the arts, and the humanities. berkeley, ca: university of california press, .
kirsch, adam. “technology is taking over english departments: the false promise of the digital humanities.” new republic, may , . http://www.newrepublic.com/article/ /limits-digital-humanities-adam-kirsch.
kirschenbaum, matthew g. “ancient evenings: retrocomputing in the digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
kirschenbaum, matthew g. “bookscapes: modeling books in electronic space.” human-computer interaction lab th annual symposium. - . may , .
kirschenbaum, matthew g., et al. “collaborators’ bill of rights.” off the tracks workshop. january , .
kirschenbaum, matthew g. “digital humanities as/is a tactical term.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota, .
kirschenbaum, matthew. “done: finishing projects in the digital humanities.” dhq: digital humanities quarterly , no. (spring ). http://digitalhumanities.org/dhq/vol/ / / / .html.
kirschenbaum, matthew g. “hello worlds.” the chronicle of higher education. . http://chronicle.com/article/hello-worlds/ .
kirschenbaum, matthew g. mechanisms: new media and the forensic imagination. cambridge, ma: mit press, .
kirschenbaum, matthew g.
“what is digital humanities?” ade bulletin ( ): - http://mkirschenbaum.wordpress.com/ / / /what-is-digital-humanities/. kirschenbaum, matthew g. “what is digital humanities and what’s it doing in english departments?” ade bulletin ( ): - . kirschenbaum, matthew g. “what is ‘digital humanities’ and why are they saying such terrible things about it?” differences: a journal of feminist cultural studies , no. ( ): - . kirschenbaum, matthew g., bethany nowviskie, tom scheinfeldt, and doug reside. “collaborators’ bill of rights.” maryland institute for technology and the humanities, january , . http://mith.umd.edu/offthetracks/recommendations/. kirschenbaum, matthew g., richard ovenden, and gabriela redwine. “digital forensics and born-digital content in cultural heritage collections.” council on library and infor- mation resources. december . http://www.clir.org/pubs/ab- stract/pub abst.html. kirton, isabella and melissa terras. “where do images of art go once they go online? a reverse image lookup study to assess the dissemination of digitized cultural herit- age.” museums and the web : proceedings. museum and the web. . http://mw .museumsandtheweb.com/paper/where-do-images-of-art-go-once-they- go-online-a-reverse-image-lookup-study-to-assess-the-dissemination-of-digitized-cul- tural-heritage/ kissane, erin. the elements of content strategy. new york, ny: a book apart, . kitchin, rob, and martin dodge. code/space: software and everyday life. cambridge, ma: mit press, . kittler, friedrich. discourse networks / . trans. chris metteer with chris cul- lens. stanford, ca: stanford university press, . klein, julie thompson. “the boundary work of making in digital humanities.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery say- ers. - . minneapolis, mn: university of minnesota press, . klein, julie thompson. crossing boundaries: knowledge, disciplinarities, and interdisci- plinarities. 
charlottesville, va: university of virginia press, . klein, julie thompson. creating interdisciplinary campus centers. san francisco, ca: jossey-bass and association of american colleges and universities, . klein, julie thompson. humanities, culture, and interdisciplinary: the changing ameri- can academy. albany, ny: state university of new york press, . klein, julie thompson. interdisciplinarity: history, theory, and practice. detroit, mi: wayne state university press, . klein, julie thompson. interdisciplining digital humanities: boundary work in an emerg- ing field. ann arbor, mi: university of michigan press, . klein, lauren f., and matthew k. gold. “digital humanities: the expanded field.” in de- bates in the digital humanities. ed. matthew k. gold and lauren f. klein. ix-xv. minne- apolis mn: university of minnesota press, . klein, lauren f. “the image of absence: archival silence, data visualization, and james hemings.” american literature , no. ( ): - . kline, m.-j., and s.h. perdue. a guide to documentary editing. charlottesville, va: uni- versity of virginia press, . kling, rob, and lisa b. spector. “rewards for scholarly communication.” in digital schol- arship in the tenure, promotion, and review process. ed. deborah lines andersen. - . armonk, ny: m.e. sharpe, . knight, kim. “mla paper for ‘the institution(alization) of digital humanities’.” kim knight. january , , http://kimknight.com/?p= . knochel, aaron d., and amy papaelias. “placeable: a social practice for place-based learning and co-design paradigms.” in making things and drawing boundaries: experi- ments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: univer- sity of minnesota press, . knowles, anne k. “a case for teaching geographic visualization without gis.” carto- graphic perspectives ( ): - . knowles, anne k. “a cutting-edge second look at the battle of gettysburg.” smithsonian magazine. june , . 
http://www.smithsonianmag.com/history-archaeology/a-cutting-edge-second-look-at-the-battle-of-gettysburg.html. knowles, anne k. “introduction to the special issue: historical gis: the spatial turn in social science history.” social science history : ( ): - . knowles, anne k., ed. past time, past place: gis for history. redlands, ca: esri press, . knowles, anne k., ed. placing history: how gis is changing historical scholarship. redlands, ca: esri press, . knowles, anne k., ed. “reports on national historical gis projects.” emerging trends in historical gis, historical geography ( ): - . knowles, anne k., and richard g. healey. “geography, timing, and technology: a gis-based analysis of pennsylvania’s iron industry, - .” journal of economic history : ( ): - . kocsis, a., and s. kenderdine. “i sho u: an innovative method for museum visitor evaluation.” in digital heritage and culture: strategy and implementation. eds. h. din and s. wu. singapore: world scientific publishing co., . koh, adeline. “the challenges of digital scholarship.” the chronicle of higher education. profhacker, january , . http://chronicle.com/blogs/profhacker/the-challenges-of-digital-scholarship/ . koh, adeline. “first look: textal, a free smartphone app for text analysis.” the chronicle of higher education. https://www.chronicle.com/blogs/profhacker/first-looks-textal-a-free-smartphone-app-for-text-analysis/ . koh, adeline. “a letter to the humanities: dh will not save you.” hybrid pedagogy (april , ). http://www.hybridpedagogy.com/journal/a-letter-to-the-humanities-dh-will-not-save-you/. koh, adeline. “niceness, building, and opening the genealogy of the digital humanities: beyond the social contract of humanities computing.” differences , no. ( ): - . kopas, merrit. “what are games good for? videogame creation as social, artistic, and investigative practice.” mkopas, . http://mkopas.net/files/talks/uvic talk-whataregamesgoodfor.pdf kraemer, harald.
“art is redeemed, mystery is gone: the documentation of contemporary art.” in theorizing digital cultural heritage. eds. fiona cameron and sarah kenderdine. - . cambridge, ma: mit press, . kramer, michael. “what does digital humanities bring to the table?” issues in digital history, september , . http://www.michaeljkramer.net/issuesindigitalhistory/blog/?p= . kretzschmar, william a. “large-scale humanities computing projects: snakes eating tails, or every end is a new beginning?” dhq: digital humanities quarterly ( ) ( ). kretzschmar, william a., and william gray potter. “library collaboration with large digital humanities projects.” literary and linguistic computing , no. (december , ): - . krug, steven. don’t make me think!: a common sense approach to web usability. berkeley, ca: new riders publishers, . kuhn, t. s. the structure of scientific revolutions. chicago, il: university of chicago press, . kulesz, octavio. digital publishing in developing countries. paris: international alliance of independent publishers/prince claus fund for culture and development, . http://alliance-lab.org/etude/?lang=en. kumar, vijay. design methods: a structured approach for driving innovation in your organization. hoboken, nj: wiley, . kurgan, l. close up at a distance: mapping, technology and politics. new york, ny: zone books, . kvamme, k.l. “geographic information systems in regional archaeological research and data management.” archaeological method and theory ( ): - . kwan, mei-po. “feminist visualization: re-envisioning gis as a method in feminist geographic research.” annals of the association of american geographers ( ): - . kwan, mei-po, and j. lee. “geo-visualization of human activity patterns using -d gis: a time-geographic approach.” spatially integrated social science. - . new york, ny: oxford university press, . lakatos, i. methodology of scientific research programmes. cambridge, uk: cambridge university press, . lake, m.w., p.e. woodman, and s.j.
mithen. “tailoring gis software for archaeological applications: an example concerning viewshed analysis.” journal of archaeological sci- ence ( ): - . lancaster, lewis r., and david j. bodenhamer. “the electronic cultural atlas initiative and the north american region atlas.” in past time, past place: gis for history. - . redlands, ca: esri press, . landow, george p. hypertext: the convergence of contemporary critical theory and technology. baltimore, md: johns hopkins university press, . landow, george p. hyper/text/theory. baltimore, md: johns hopkins university press, . landow, george p. hypertext . . baltimore, md: johns hopkins university press, . landow, george p. hypertext . : critical theory and new media in an era of globaliza- tion. baltimore, md: johns hopkins university press, . landow, george p. “what’s a critic to do? critical theory in the age of hypertext.” in hyper/text/theory. - . baltimore, md: johns hopkins university press, . langran, gail. time in geographic information systems. london, uk: taylor & francis, . lanier, jaron. you are not a gadget: a manifesto. new york, ny: vintage, . lantham, e. & zhou, w. “cultural issues in online learning - is blended learning a possi- ble solution?” international journal of computer processing of oriental languages , ( ): - . lanzoni, kristin, mark olson, and victoria szabo. "wired! and visualizing venice: scaling up digital art history.” artl@s bulletin ( ) ( ). purdue, in. http://docs.lib.pur- due.edu/artlas/vol /iss / / lascarides, michael and ben vershbow. “what’s on the menu?: crowdsourcing at the new york public library.” crowdsourcing our cultural heritage. ed. mia ridge. surrey, uk: ashgate, . lascarides, michael, ben vershbow, and trevor owens. “digital cultural heritage and the crowd.” curator: the museum journal , no. ( ): - . latour, bruno. “tarde’s idea of quantification.” in the social after gabriel tarde: de- bates and assessments. ed. m. candea. london, uk: routledge, . latour, bruno. 
"visualization and cognition: drawing things together." logos ( ): - . latour, bruno. “visualization and cognition: thinking with eyes and hands.” knowledge and society, vol. ( ): - . latour, bruno. “the landscape of digital humanities.” digital humanities quarterly , no. ( ). http://digitalhumanities.org/dhq/vol/ / / / .html. latour, bruno. “why has critique run out of steam? from matters of fact to matters of concern.” critical inquiry , no. ( ). latour, bruno, and peter weibel. making things public: atmospheres of democracy. cambridge, ma: mit press, . latour, bruno, and tomas sanchez-criado. “making the ‘res public’.” ephemera ( ) ( ): – . laufer, roger, and domenico scavetta. texte, hypertexte et hypermedia. paris, france: puf, . lawless, séamus, owen conlan, and cormac hampton. “tailoring access to content.” in a new companion to digital humanities. eds. by susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . lazer, d., et al. “computational social science.” science / . (february, ): - . learning through digital media: experiments in technology and pedagogy. . http://learningthroughdigitalmedia.net/ lee, c.a. i, digital: personal collections in the digital era. chicago, il: society of ameri- can archivists, . lee, c.a., m. kirschenbaum, a. chassanoff, p. olsen, & k. woods. “bitcurator: tools and techniques for digital forensics in collecting institutions.” d-lib magazine , / ( ). lee, c.a., h. tibbo. “digital curation and trusted repositories: steps toward success.” journal of digital information , ( ). lee, c.a., k. woods, m. kirschenbaum, & a. chassanoff. “from bitstrams to heritage: putting digital forensics into practice in collecting institutions.” bitcurator, . lee, maurice s. “searching the archive with dickens and hawthorne: databases and aesthetic judgment after the new historicism.” elh . (fall ): - . lee, rainie, and barry wellman. networked: the new social operating system. cam- bridge, ma: mit press, . 
lehman, robert s. “allegories of rending: killing time with walter benjamin.” new literary history , no. ( ): - . lerman, n., a.p. mohun, and r. oldenziel. technology and culture, ( ). special issue: gender analysis and the history of technology ( ). lesk, michael. practical digital libraries: books, bytes, and bucks. san francisco, ca: morgan kaufmann publishers, . lesk, michael. understanding digital libraries. san francisco, ca: morgan kaufmann publishers, . lessig, lawrence. code and other laws of cyberspace. version . . new york, ny: basic books, . lessig, lawrence. the future of ideas: the fate of the commons in a connected world. new york, ny: random house, . levinson, stephen c. space in language and cognition: explorations in cognitive diversity. cambridge, uk: cambridge university press, . levmore, saul, and martha craven nussbaum. the offensive internet: speech, privacy, and reputation. cambridge, ma: harvard university press, . levy, david m. scrolling forward: making sense of documents in the digital age. new york, ny: arcade, . lévy, p. collective intelligence. london, uk: perseus, . lévy, pierre, and robert bononno. collective intelligence: mankind's emerging world in cyberspace. london, uk: perseus, . library of congress. american memory: historical collections for the national digital library. library of congress. “metadata for digital content (mdc). developing institution-wide policies and standards at the library of congress.” . liestøl, gunnar, andrew morrison, and terje rasmussen, eds. digital media revisited: theoretical and conceptual innovations in digital domains. cambridge, ma: mit press, . lilley, keith, chris lloyd, and steven trick. mapping the medieval urban landscape: edward ’s new towns of england and wales. http://www.qub.ac.uk/urban_mapping. lima, manuel. visual complexity: mapping patterns of information. princeton, nj: princeton architectural press, . lind, rebecca ann, ed. “producing theory in a digital world .
: the intersection of au- diences and production in contemporary theory.” digital formations, vol. . peter lang, . lindhé, cecilia. “medieval materiality through the digital lens.” in between humanities and the digital. eds. d.t. goldberg and p. svensson. - . cambridge, ma: mit press, . linley, margaret. “ecological entanglements of dh.” in debates in the digital humani- ties. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . lipson, h. and m. kunman. fabricated: the new world of d printing. indianapolis, in: john wiley & sons, inc., . lipson, h., f.c. moon, j. hai, and c. paventi. “ -d printing the history of mechanisms.” journal of mechanical design. ( ) ( ): - . literary and linguistic computing. oxford journals. llc.oxfordjournals.org. liu, alan. “digital humanities and academic change.” english language notes , no. ( ): - . liu, alan. "escaping history: new historicism, databases, and contingency." digital ret- roaction conference, university of california at santa barbara, pp. - . . liu, alan. “friending the past: the sense of history and social computing.” new literary history , no. ( ): - . liu, alan. “imagining the new media encounter.” in a companion to digital literary studies. eds. susan schreibman and ray siemens. oxford, uk: blackwell, .. liu, alan. the laws of cool: knowledge work and the culture of information. chicago, il: university of chicago press, . liu, alan. local transcendence: essays on postmodern historicism and the database. chicago, il: university of chicago press, . liu, alan. “manifesto for the digital humanities.” in thatcamp paris . hypotheses. june , . http://tcp.hypotheses.org/ . liu, alan. “the meaning of digital humanities.” pmla , no. ( ): - . liu, alan. “n+ : a plea for cross-domain data in the digital humanities.” in debates in the digital humanities. ed. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota, . liu, alan. 
“sidney’s technology.” in local transcendence: essays on postmodern historicism and the database. - . chicago, il: university of chicago press, . liu, alan. “the state of the digital humanities: a report and a critique.” arts and humanities in higher education , no. ( ): - . liu, alan. “theses on the epistemology of the digital: advice for the cambridge centre for digital knowledge.” author’s blog, august , . http://liu.english.ucsb.edu/theses-on-the-epistemology-of-the-digital-page liu, alan. “transcendental data: toward a cultural history and aesthetics of the new encoded discourse.” critical inquiry ( ). - . liu, alan. “where is cultural criticism in the digital humanities?” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota, . liu, lydia. the freudian robot: digital media and the future of the unconscious. chicago, il: university of chicago press, . the living net. “project snapshot: vibrant lives present.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . llobera, marcos. “building past landscape perception with gis: understanding topographic prominence.” journal of archaeological science ( ): - . lock, g.r., and k. smith, eds. on the theory and practice of archaeological computing. oxford, uk: oxbow, . lock, gary. using computers in archaeology: towards virtual pasts. london, uk: routledge, . long, christopher. “performative publication.” http://cplong.org/ / /performative-publication/. long, p.o. openness, secrecy, authorship: technical arts and the culture of knowledge from antiquity to the renaissance. baltimore, md: johns hopkins university press, . longley, paul a., michael f. goodchild, david j. maguire, and david w. rhind, eds. geographic information systems and science. new york, ny: john wiley & sons inc., . lopez, andrew, fred rowland, and kathleen fitzpatrick.
“on scholarly communication and the digital humanities: an interview with kathleen fitzpatrick.” in the library with the lead pipe. self-published, . losh, elizabeth. “hacktivism and the humanities: programming protest in the era of the digital humanities.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota, . losh, elizabeth, jacqueline wernimont, laura wexler, and hong-an wu. in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota, . lotan, gilad, erhardt graeff, mike ananny, devin gaffney, ian pearce, and danah boyd. “the revolutions were tweeted: information flows during the tunisian and egyptian revolutions.” international journal of communication ( ): - . lothian, alexis, and amanda phillips. “can digital humanities mean transformative critique?” journal of e-media studies , no. ( ). https://journals.dartmouth.edu/cgi-bin/webobjects/journals.woa/xmlpage/ /article/ . lothian, alexis. “marked bodies, transformative scholarship, and the question of theory in digital humanities.” journal of digital humanities ( ). http://journalofdigitalhumanities.org/ - /marked-bodies-transformative-scholarship-and-the-question-of-theory-of-digital-humanities-by-alexis-lothian/. lovink, geert. zero comments: blogging and critical internet culture. new york, ny: routledge, . luff, paul, jon hindmarsh, and christian heath. workplace studies: recovering work practice and informing system design. cambridge, uk: cambridge university press, . lum, casey man kong. “notes toward an intellectual history of media ecology.” in perspectives on culture, technology, and communication: the ecology tradition. ed. casey man kong lum. - . cresskill, nj: hampton, . lupton, julia reinhard. “blur building: softscape.” shakespeare & hospitality. https://folgerpedia.folger.edu/julia_reinhard_lupton. lynch, c.a.
“institutional repositories: essential infrastructure for the digital age.” arl bimonthly report ( ): - . lynch, michael. “science in the age of mechanical reproduction: moral and epistemic relations between diagrams and photographs.” biology and philosophy , no. ( ): - . lyotard, jean-françois. the postmodern condition: a report on knowledge. manchester, uk: manchester university press, . maack, mary niles. “toward a new model of the information professions: embracing empowerment.” journal of education for library and information science , no. ( ): - . macarthur foundation. reports on digital media and learning. cambridge, ma: mit press, - . www.scribd.com/collections/ /john-d-and-catherine-t-macar- thur-foundation-reports-on-digital-media-and-learning. macdonald, bertram h., and fiona a. black. “using gis for spatial and temporal anal- yses in print culture studies: some opportunities and challenges.” social science history : ( ): - . maceachren, alan m. “visualization quality and the representation of uncertainty.” in some truth with maps: a primer on symbolization & design. washington, dc: associa- tion of american geographers, . maceachren, alan m., and fraser taylor. visualization in modern cartography. london, uk: elsevier, . mackenzie, adrian. cutting code: software and sociality. oxford, uk: peter lang, . mackenzie, e.s., j. mclaughlin, a. moore, and k. rogers. “digitising the middle ages: the experience of the ‘lands of the normans’ project.” international journal of humani- ties & arts computing , no. ½ (march ): - . mackey, thomas p., and trudi e. jacobson. “reframing information literacy as a metalit- eracy.” college & research libraries. ( ): . mackey, wendy e. “augmented reality: linking real and virtual worlds. a new para- digm for interacting with computers.” proceedings of the workshop on advanced visual interfaces avi ( ): - . madden, l. 
“applying the digital curation lessons learned from american memory.” international journal of digital curation . ( ). maeda, john. creative code: aesthetics and computation from the mit media lab. london, uk: thames & hudson, . maeroff, g.i. a classroom of one: how online learning is changing our schools and colleges. basingstoke, uk: palgrave, . maher, jimmy. the future was here: commodore amiga. cambridge, ma: mit press, . mahony, simon and elena pierazzo. “teaching skills or teaching methodology?” digital humanities pedagogy: practices, principles and policies. ed. brett d. hirsch. - . cambridge, uk: open book publishers, . maier, andrew. “digital literacy, part : cadence.” ux booth. october , . http://www.uxbooth.com/articles/digital-literacy-part- -cadence/. maling, d.h. measurements from maps: principles and methods of cartometry. new york, ny: pergamon, . mak, bonnie. “archaeology of a digitization.” journal of the american society for information science and technology. , no. ( ): - . doi: . /asi. maker lab in the humanities. http://maker.uvic.ca/. mandell, laura c. breaking the book: print humanities in the digital age. oxford, uk: wiley-blackwell, . mandell, laura c. “gendering digital literary history: what counts for digital humanities.” in a new companion to digital humanities. eds. susan schreibman and ray siemens. oxford, uk: blackwell, . mandell, laura c. “promotion and tenure for digital scholarship.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /promotion-and-tenure-for-digital-scholarship-by-laura-mandell/. manoff, marlene. “archive and database as metaphor: theorizing the historical record.” portal: libraries and the academy , no. ( ): - . manoff, marlene. “theories of the archive from across the disciplines.” portal: libraries and the academy , no. (january ): - . manovich, lev. “cultural analytics: analysis and visualization of large cultural data sets.” calit white paper, .
www.manovich.net/cultural_analytics.pdf manovich, lev. “database as a genre of new media.” ai & society , no. (june , ). http://time.arts.ucla.edu/ai_society/manovich.html. manovich, lev. “database as symbolic form.” convergence: the international journal of research into new media technologies , no. ( ): - . manovich, lev. "info-aesthetics.” new media / culture / software. may . http://www.manovich.net. manovich, lev. the language of new media. cambridge, ma: mit press, . manovich, lev. “the language of new media (what is new media?), “the interface,” and “the forms.” in the language of new media. ed. lev manovich. cambridge, ma: mit press, . manovich, lev. software takes command. new york, ny: continuum publishing corpo- ration, . manovich, lev. “trending: the promises and the challenges of big social data.” in de- bates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: uni- versity of minnesota, . manovich, lev. “visualizing large image collections for humanities research.” in media studies futures. ed. kelly gates. oxford, uk: blackwell, . http://moanovich.net/docs/media_visualization. .pdf. manovich, lev. “what is new media?” the new media theory reader. eds. robert has- san and julian thomas. - . maidenhead, uk: open university press, . mansfield, elizabeth, ed. art history and its institutions: foundations of a discipline. new york, ny: psychology press, . mantovani, f. et al. “virtual reality training for health-care professionals.” cyber psy- chology & behaviour. , , ( ): - . map of early modern london (moeml). https://mapoflondon.uvic.ca/. “mapping the stacks: a guide to chicago’s hidden archives.” http://mts.lib.uchi- cago.edu/. “mapping initiatives.” united states holocaust memorial. http://www.ushmm.org/maps/. “marc in xml.” library of congress. http://www.loc.gov/marc/marcxml.html. marcus, manfred. “article contents.” wright’s english dialect dictionary computerised: towards a new source of information. university of helsinki, dec. 
. marche, stephen. “literature is not data: against digital humanities.” los angeles re- view of books. october , . https://lareviewofbooks.org/essay/literature-is-not- data-against-digital-humanities. marchionini, g., c. plaisant, & a. komlodi. “the people in digital libraries: multifaceted approaches to assessing needs and impact.” in digital library use: social practice in de- sign and evaluation. eds. bishop, a. p. et al., – . cambridge, ma: mit press, . marcum, deanna, and amy friedlander. “keepers of the crumbling culture.” d-lib ma- gazine , no. (may ). http://www.dlib.org/dlib/may /friedlander/ friedlan- der.html. marcus, leah s. “the silence of the archive and the noise of cyberspace.” in the renais- sance computer: knowledge technology in the first age of print, eds. neil rhodes and jonathan sawday. – . london and new york: routledge, . marcuse, herbert. “some social implications of modern technology.” in the essential frankfurt school reader. eds. andrew arato and erike gebhardt. - . new york, ny: continuum, marino, mark c. ”why we must read the code: the science wars, episode iv.” in de- bates in the digital humanities. eds. matthew k. gold and lauren klein. - . minne- apolis, mn: university of minnesota press, . maron, nancy, k. kirby smith, and matthew loy. “sustaining digital resources: an on- the-ground view of projects today.” ithaka case studies in sustainability. ithaka s+r, july . http://www.ithaka.org/ithaka-s-r/research/ithaka-case-studies-in-sustainabil- ity/report/sca_ithaka_sustainingdigitalresources_report.pdf. maron, n.l. and s. pickle. sustaining the digital humanities; host institution support be- yond the start-up phase. ithaka s+r. . http://www.sr.ithaka.org/research-publica- tions/sustaining-digital-humanities. marsh, leslie. “review of ‘natural-born cyborgs: minds, technologies, and the future of human intelligence’.” cognitive systems research ( ): – . marshall, catherine c. reading and writing the electronic book. 
san rafael, ca: morgan & claypool, . martin, kim, beth compton, and ryan hunt. “disrupting dichotomies: mobilizing digital humanities with the makerbus.” in making things and drawing boundaries: experi- ments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . martinec, r., and t. van leeuwen. the language of new media design: theory and practice. new york, ny: routledge, . marwick, alice e., and danah boyd. “i tweet honestly, i tweet passionately: twitter us- ers, context collapse, and the imagined audience.” new media and society , no . ( ): - . marx, vivien. “data visualization: ambiguity as a fellow traveler.” nature methods , no. (july ): - . doi: . /nmeth. . massey, doreen. “space-time, ‘science,’ and the relationship between physical geogra- phy and human geography.” transactions of the institute of british geographers: new series ( ): - . mateas, m. “procedural literacy: educating the new media practitioner.” beyond fun: serious games and media. ed. d. davidson. pittsburgh, pa: etc press, - . . mattern, shannon christine. “evaluating multimodal work, revisited.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /evaluating- multimodal-work-revisited-by-shannon-mattern/. mauro, aaron. “digital liberal arts and project-based pedagogies.” doing digital hu- manities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . maya mapping project. maya atlas: the struggle to preserve maya land in southern be- lize. berkeley, ca: north atlantic books, . mccarty, willard. “becoming interdisciplinary.” in a new companion to digital humani- ties. eds. by susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . mccarty, willard. “being reborn: the humanities computing and styles of scientific reasoning.” new technology in medieval and renaissance studies , - . . mccarty, willard. 
“collaborative research in the digital humanities.” in collaborative research in the digital humanities. ed. marilynn deegan and willard mccarty. - . farnham, uk: ashgate, . mccarty, willard. “digital knowing, not digital knowledge.” humanist , no. ( ). mccarty, willard. "finding implicit patterns in ovid's metamorphoses with tact." digital studies/le champ numérique ( ). mccarty, willard. “the future of digital humanities is a matter of words.” in a companion to new media dynamics. eds. j. hartley, j. burgess, and a. bruns. chichester, uk: john wiley & sons ltd., . mccarty, willard. “getting there from here: remembering the future of digital humanities.” roberto busa award lecture . literary and linguistic computing ( ) ( ): - . mccarty, willard. “humanities computing: essential problems, experimental practice.” literary and linguistic computing , no. (april , ): - . mccarty, willard. introduction to humanities computing. basingstoke, uk: palgrave macmillan, . mccarty, willard. “modeling.” in humanities computing. ed. willard mccarty - . basingstoke, uk: palgrave macmillan, . mccarty, willard. “what is humanities computing? toward a definition of the field.” http://ilex.cc.kcl.ac.uk/wlm/essays.what/. mccarty, willard. “modeling: a study in words and meanings.” in a companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. oxford, uk: blackwell, . mccarty, willard. “the ph.d. in digital humanities.” in digital humanities pedagogy: practices, principles and policies. ed. brett d. hirsch. - . cambridge, uk: open book publishers, . mccarty, willard. “a telescope for the mind?” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota, . mccarty, willard, ed. text and genre in reconstruction: effects of digitalization on ideas, behaviors, products and institutions. cambridge, uk: open book publishers, . mccarty, willard, and matthew kirschenbaum.
“institutional models for humanities computing.” literary and linguistic computing , no. (november , ): - . mccloud, scott. understanding comics: the invisible art. new york, ny: harpercollins, . mccullough, malcolm. ambient commons: attention in the age of embodied infor- mation. cambridge, ma: mit press, . mccullough, malcolm. digital ground: architecture, pervasive computing, and environ- mental knowing. cambridge, ma: mit press, . mcdonough, j., r. olendorf, m. kirschenbaum, k. kraus, d. reside, r. donahue, a. phelps, c. egert, h. lowood, and s. rojo. preserving virtual worlds final report. decem- ber , . http://www.ideals.illinois.edu/handle/ / . mcenery, tony, and andrew hardie. corpus linguistics: method, theory and practice. cambridge, uk: cambridge university press, . mcgann, jerome. “culture and technology: the way we live now, what is to be done?” new literary history , no. ( ): - . http://muse.jhu.edu/jour- nals/new_literary_history/v / . mcgann.html mcgann, jerome. “electronic archives and critical editing.” literature compass ( ) ( ): - . mcgann, jerome. “imagining what you don’t know: the theoretical goals of the ros- setti archive.” institute for advanced technology in the humanities. . http://www .iath.virginia.edu/jjm f/old/chum.html mcgann, jerome. “making texts of many dimensions.” in a new companion to digital humanities. eds. by susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . mcgann, jerome. a new republic of letters: memory and scholarship in the age of digi- tal reproduction. cambridge, ma: harvard university press, . mcgann, jerome, ed. online humanities scholarship: the shape of things to come. hou- ston, tx: rice university press, . mcgann, jerome. “philology in a new key.” critical inquiry , no. ( ): – . mcgann, jerome. radiant textuality: literature after the world wide web. new york, ny: palgrave, . mcgann, jerome. 
“the rationale of hypertext.” in radiant textuality: literature after the world wide web. ed. jerome mcgann. - . new york, ny: palgrave, . mcgann, jerome. “the rossetti archive and image-based electronic editing.” in the literary text in the digital age. ed. richard finneran. - . ann arbor, mi: university of michigan press, . mcgann, jerome. “visible and invisible books: hermetic images in n-dimensional space.” in the future of the page. eds. peter stoicheff and andrew taylor. - . toronto, on: university of toronto press, . mcgann, jerome, andrew stauffer, dana wheeles, and michael pickard. “abstract of roger bagnall, ‘integrating digital papyrology’.” online humanities scholarship: the shape of things to come. ed. jerome mcgann. . houston, tx: rice university press, . mcgann, jerome and bethany nowviskie. “nines: a federated model for integrating digital scholarship.” mcgonigal, jane. reality is broken: why games make us better and how they can change the world. new york, ny: penguin press, . mcgrail, anne b. “the ‘whole game’: digital humanities at community colleges.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . mckenzie, d.f. bibliography and the sociology of texts. cambridge, uk: cambridge university press, . mckenzie, jon. “enhancing digital humanities at uw-madison: a white paper.” http://www.labster .net/wp-content/uploads/ / /fds_white_paper.pdf mckeon, richard. “the uses of rhetoric in a technological age: architectonic productive arts.” the prospect of rhetoric: report of the national development project. eds. lloyd f. bitzer and edwin black. upper saddle river, nj: prentice hall, . mclafferty, sara. “women and gis: geospatial technologies and feminist geographies.” cartographica : ( ): - . mcluhan, marshall. the gutenberg galaxy: the making of typographic man. toronto, on: university of toronto press, . mcluhan, marshall. understanding media: the extensions of man.
. ed. lewis lapham. cambridge, ma: mit press, . mcluhan, marshall and quentin fiore. the medium is the massage. berkeley, ca: gingko press, . mcpherson, tara. “introduction: media studies and the digital humanities.” cinema journal , no. ( ): - . http://muse.jhu.edu/journals/cj/summary/v / . .mcpherson.html. mcpherson, tara. “media studies and the digital humanities.” cinema journal ( ) ( ): - . mcpherson, tara. “u.s. operating systems at mid-century: the intertwining of race and unix.” in race after the internet. eds. lisa nakamura and peter a. chow-white. - . new york, ny: routledge, . mcpherson, tara. “why are the digital humanities so white? or thinking the histories of race and computation.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . mcpherson, tara. reconstructing dixie: race, place and nostalgia in the imagined south. durham, nc: duke university press, . mediacommons. mediacommons.futureofthebook.org. medieval kingdom of sicily image database. http://kos.aahvs.duke.edu/index.php. meeks, elijah. “the digital humanities as imagined community.” digital humanities specialist. september , . http://dhs.stanford.edu/the-digital-humanities-as/the-digital-humanities-as-imagined-community/. meeks, elijah. “more networks in the humanities or did books have dna?” digital humanities specialist. stanford university libraries. https://dhs.stanford.edu/visualization/more-networks/. meeks, elijah, and scott b. weingart. “the digital humanities contribution to topic modeling.” journal of digital humanities , no. (april , ). http://journalofdigitalhumanities.org/ - /dh-contribution-to-topic-modeling/. meeks, e. and k. grossner. “orbis: an interactive scholarly work on the roman world.” journal of digital humanities ( ) ( ). http://journalofdigitalhumanities.org/ - /orbis-an-interactive-scholarly-work-on-the-roman-world-by-elijah-meeks-and-karl-grossner.
mendoza, marcelo, barbara poblete, and carlos castillo. “twitter under crisis: can we trust what we rt?” first workshop on social media analytics (soma ’ ). washington dc, . metcalfe, a.s. knowledge management and higher education: a critical analysis. london, uk: information science, . metraux, stephen. “waiting for the wrecking ball: skid row in postindustrial philadelphia.” journal of urban history : ( ): - . muehrcke, phillip c. “the logic of map design.” in cartographic design: theoretical and practical perspectives. - . new york, ny: john wiley & sons, inc., . meyer, eric t. and ralph schroeder. knowledge machines: digital transformations of the sciences and humanities. cambridge, ma: mit press, . miall, david s. humanities and the computer: new directions. oxford, uk: clarendon press, . michel, jean baptiste, yuan kui shen, aviva presser aiden, adrian veres, matthew k. gray, the google books team, joseph p. pickett, dale hoiberg, dan clancy, peter norvig, jon orwant, steven pinker, martin a. nowak, and erez lieberman aiden. “quantitative analysis of culture using millions of digitized books.” science : (january , ). milgram, p., h. takemura, a. utsumi, and f. kishino. “augmented reality: a class of displays on the reality-virtuality continuum.” proceedings of telemanipulator and telepresence technologies , - . . miller, j.h. and s.e. page. complex adaptive systems: an introduction to computational models of social life. princeton, nj: princeton university press, . miller, peter, ed. cultural histories of the material world. ann arbor, mi: university of michigan press, . milic, l. “the next step.” computers and the humanities : ( ): - . millon, emma. “project bamboo: building shared infrastructure for humanities research.” maryland institute for technology in the humanities blog. july , . http://mith.umd.edu/project-bamboo-building-shared-infrastructure-for-humanities-research/. milton, n. knowledge management for teams and projects.
oxford, uk: chandos pub- lishing, . mirzoeff, nicholas. an introduction to visual culture. london and new york: routledge, . mirzoeff, nicholas, ed. the visual culture reader. london and new york: routledge, . mirzoeff, nicholas. “what is visual culture?” in the visual culture reader. ed. nicholas mirzoeff. - . london and new york: routledge, . mitchell, don. the right to the city: social justice and the fight for public space. new york, ny: guilford press, . mitchell, e.t., ed. library linked data: research and adoption. chicago, il: ala tech- source, . mitchell, e.t. “metadata developments in libraries and other cultural heritage institu- tions.” in library linked data: research and adoption. ed. e.t. mitchell. - . chicago, il: ala techsource, . mitchell, marilyn. library workflow redesign: six case studies. washington d.c.: coun- cil on library and information resources, . http://www.clir.org/pubs/ab- stract/pub abst.html. mitchell, w.j.t. picture theory. chicago, il: university of chicago press, . mitchell, w.j.t., and mark b.n. hansen. critical terms for media studies. chicago, il: university of chicago press, . mitchell, william j., alan s. inouye, marjory s. blumenthal, eds. beyond productivity: in- formation technology, innovation, and creativity. may . http://newton.nap.edu/html/beyond_productivity/. mod, craig. “the digital-physical: on building flipboard for iphone & finding the edges of our digital narratives.” @craigmod. https://craigmod.com/journal/digital_physical/. modern language association. “documenting a new media case.” journal of digital hu- manities , no. (fall ). http://journalofdigitalhumanities.org/ - /documenting-a- new-media-case-evaluation-wiki-from-the-mla/. modern language association. “guidelines for editors of scholarly editions.” modern language association, n.d. http://www.mla.org/resources/documents/rep_schol- arly/cse_guidelines. modern language association. 
“guidelines for evaluating work in digital humanities and digital media.” journal of digital humanities , no. (fall ). http://journalofdig- italhumanities.org/ - /guidelines-for-evaluating-work-in-digital-humanities-and-digital- media-from-the-mla/. modern language association. report of the mla task force on evaluating scholarship for tenure and promotion. . http://www.mla.org/tenure_promotion. mohl, raymond. “planned destruction: the interstates and central city housing.” in from tenements to the taylor homes. - . university park, pa: pennsylvania state university press, . monmonier, mark. drawing the line. new york, ny: henry holt, . monmonier, mark. spying with maps. chicago, il: university of chicago press, . monmonier, mark. how to lie with maps, nd edition. chicago, il: university of chicago press, . montfort, nick. “beyond the journal and the blog: the technical report for communica- tion in the humanities.” amodern ( ): http://amodern.net/article/beyond-the- journal-and-the-blog-the-technical-report-for-communication-in-the-humanities. montfort, nick. “exploratory programming in digital humanities pedagogy and re- search.” in a new companion to digital humanities. eds. by susan schreibman, ray sie- mens, and john unsworth. - . west sussex, uk: wiley-blackwell, . montfort, nick, and ian bogost. racing the beam: the atari video computer system. cambridge, ma: mit press, . montfort, nick. twisty little passages: an approach to interactive fiction. cambridge, ma: mit press, . moran, joe. interdisciplinarity. london and new york: routledge, . moravec, michelle. “teaching with pinterest.” http://historyinthecity.blog- spot.com/ / /teaching-students-in-pinterest.html moretti, franco. “conjectures on world literature.” new left review. (jan-feb. ): - . moretti, franco. distant reading. london and new york: verso, . moretti, franco. graphs, maps, trees: abstract models for a literary history. london and new york: verso, . moretti, franco. 
“network theory, plot analysis.” literary lab. pamphlet , may , . https://litlab.stanford.edu/literarylabpamphlet .pdf. morgan, paige. “how to get your digital humanities project off the ground.” http://www.paigemorgan.net/how-to-get-a-digital-humanities-project-off-the-ground/ morozov, evgeny. to save everything, click here: the folly of technological solutions. new york, ny: public affairs, . moore, r. “towards a theory of digital preservation.” international journal of digital curation . ( ): - . moore, suzanne. “grayson perry’s tapestries: weaving class and taste.” the guardian ( ). https://www.theguardian.com/books/ /jun/ /grayson-perry-tapestries- class-taste. moretti, franco. “network theory, plot analysis.” new left review (march-april ): - . pdf. morgan, coleen leah. “emancipatory digital archaeology.” phd dissertation, university of california, . morris, kief. infrastructure as code: managing servers in the cloud. sebastopol, ca: maker media, . mortensen, p. “the place of theory in archival practice.” archivaria ( ): - . mossberger, karen, caroline j. tolbert, and mary stansbury. virtual inequality: beyond the digital divide. washington, dc: georgetown university press, . mostern, ruth, and elana gainor. “traveling the silk road on a virtual globe: pedagogy, technology, and evaluation for spatial history.” digital humanities quarterly , no. ( ). mueller, martin. “about the future of the tei.” august , . http://ariadne.north- western.edu/mmueller/teiletter.pdf. mueller, martin. “collaborative curation of early modern plays by undergradu- ates.” scalable reading ( ). mueller, martin. “digital shakespeare, or towards a literacy informatics.” shakespeare , no. (december ): - . mueller, martin. “how to fix , errors.” scalable reading ( ). mueller, martin. “what is a young scholar edition.” scalable reading ( ). mukurtu. www.mukurtu.org. mullen, lincoln. “digital humanities is a spectrum: or, we’re all digital humanists now.” in backward glance. april , . 
http://lincolnmullen.com/ / / /digital-hu- manities-is-a-spectrum-or-were-all-digital-humanists-now/. mullen, lincoln. “these maps show how slavery expanded across the united states.” smithsonian.com. http://www.smithsonianmag.com/history/maps-reveal-slavery-ex- panded-across-united-states- /?no-ist. muñoz, trevor. “in service? a further provocation on digital humanities research in li- braries.” dh + lib. june , . http://acrl.ala.org/dh/ / / /in-service-a-further- provocation-on-digital-humanities-research-in-libraries. munster, anna. an aesthesia of networks: conjunctive experience in art and technol- ogy. cambridge, ma: mit press, . muri, allison. “the grub street project.” online humanities scholarship: the shape of things to come. ed. jerome mcgann. - . houston, tx: rice university press, . murray, janet h. hamlet on the holodeck: the future of narrative in cyberspace. cam- bridge, ma: mit press, . murray, k.m.e. caught in a web of words. new haven, ct: yale university press, . murray, susan “digital images, photo-sharing, and our shifting notions of everyday aes- thetics.” journal of visual culture , no. ( ): - . mussell, james. “doing and making: history as digital practice.” in history in the digital age. ed. toni weller. - . london, uk: routledge, . nakamura, lisa. digitizing race: visual cultures of the internet. minneapolis, mn: uni- versity of minnesota press, . nakamura, lisa and peter chow-white. race after the internet. new york, ny: routledge . nardi, b.a. my life as a night elf priest. ann arbor, mi: university of michigan press, . national information standards organization (niso). understanding metadata. be- thesda, md: niso press, . national initiative for a networked cultural heritage (ninch). the ninch guide to good practice in the digital representation and management of cultural heritage materials. national initiative for a networked cultural heritage, . www.nyu.edu/its/pubs/pdfs/ninch_guide_to_good_practice.pdf. naughton, j. 
from gutenberg to zuckerberg: what you really need to know about the internet. london, uk: quercus, . nawrotzki, kristen, and jack dougherty. “introduction.” writing history in the digital age. eds. jack dougherty and kristen nawrotzki. - . ann arbor, mi: university of michigan press, . neal, mark anthony. “race and the digital humanities.” left of black (webcast), season , episode , john hope franklin center, september , . https://www.youtube.com/watch?v=aqth _-qnj . negroponte, nicholas. being digital. new york, ny: alfred a. knopf, . neh office of digital humanities. www.neh.gov/odh. nelson, b. “exploring the use of individualised, reflective guidance in an educational multi-user environment.” journal of science education & technology, , ( ): - . nelson, theodor. computer lib/dream machines. redmont, wa: tempus books, . nelson, theodor. “a file structure for the complex, the changing, and the indetermi- nate.” in the new media reader. ed. noah-wardrip fruin. cambridge, ma: mit press, . nelson, theodore. literary machines: the report on, and of, project xanadu concerning word processing, electronic publishing, hypertext, thinkertoys, tomorrow’s intellectual revolution, and certain other topics including knowledge, education, and freedom. - . san antonio, tx: t.h. nelson, . nelson, robert. “the slide lecture, or the work of art ‘history’ in the age of mechanical reproduction.” critical inquiry , no. (spring, ): - . nesmith, t. “seeing archives: postmodernism and the changing intellectual place of ar- chives.” the american archivist, . ( ): - . netz, r., and w. noel. the archimedes codex: revealing the secrets of the world’s greatest palimpsest. london, uk: weidenfeld & nicolson, . newfield, christopher. “ending the budget wars: funding the humanities during a crisis in higher education.” profession ( ): – . new york public library. “digital humanities and the future of libraries (multimedia conference proceedings).” new york public library, june , . 
http://www.nypl.org/events/programs/ / / /digital-humanities-and-future-li- braries ngata, w., h. ngata-gibson, and h. salmond, “te ataakura: digital taonga and cultural innovation.” journal of material culture, . ( ): - . nichols, stephen g. “time to change our thinking: dismantling the silo model of digital scholarship.” ariadne, no. (january , ). http://www.ariadne.ac.uk/is- sue /nichols/. notes from thatcamp digital humanities & libraries. topics include “starting a dh pro- gram in the library.” “re-skilling librarians for dh,” and “dht.” novak, peter. that noble dream: the “objectivity” question and the american historical profession. cambridge, uk: cambridge university press, . nowviskie, bethany. “digital humanities in the anthropocene.” bethany nowviskie (blog), july , . http://nowviskie.org/ /anthropocene/. nowviskie, bethany. “eternal september of the digital humanities.” debates in the digi- tal humanities. - . ed. matthew k. gold. minneapolis, mn: university of minne- sota press, . nowviskie, bethany. “evaluating collaborative digital scholarship (or, where credit is due).” journal of digital humanities , no. (fall ). http://journalofdigitalhumani- ties.org/ - /evaluating-collaborative-digital-scholarship-by-bethany-nowviskie/. nowviskie, bethany. “mapping the catalog of ships.” university of virginia library. http://scholarslab.org/blog/mapping-the-catalogue-of-ships/. nowviskie, bethany. “a skunk in the library.” june , . http://nowvis- kie.org/ /a-skunk-in-the-library/. nowviskie, bethany. “skunks in the library: a path to production for scholarly r&d.” journal of library administration , no. ( ): - . nowviskie, bethany. “on the origin of ‘hack’ and ‘yack.’” in debates in the digital hu- manities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . nowviskie, bethany. “reality bytes.” june , . http://nowviskie.org/ /reality- bytes/. nowviskie, bethany. 
“resistance in the materials.” in debates in the digital humanities. eds. matthew k. gold, and lauren klein. - . minneapolis, mn: university of min- nesota press, nowviskie, bethany. “what do girls dig?” in debates in the digital humanities. ed. mat- thew k. gold. - . minneapolis, mn: university of minnesota press, . nowviskie, bethany. “where credit is due: preconditions for the evaluation of collabo- rative digital scholarship.” profession ( ): - . nowviskie, bethany, and dot porter. “graceful degradation survey findings: managing digital humanities projects through times of transition and decline?” digital humani- ties conference abstract, june . http://dh .cch.kcl.ac.uk/academic-pro- gramme/abstracts/papers/html/ab- .html. nuffield foundation. interdisciplinarity. london, uk: nuffield foundation, . nunberg, geoffrey. “counting on google books.” the chronicle of higher education. the chronicle review. december , . http://chronicle.com/article/counting-on- google-books/ /. nunberg, geoffrey. the future of the book. berkeley, ca: university of california press, . nygren, zephyr frank, nicholas bauch and erik steiner. “connecting with the past: op- portunities and challenges in digital history.” in research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - . edinburgh, uk: edinburgh university press, . nyhan, julianne, andrew flinn, and anne welsh. “oral history and the hidden histories project: towards histories of computing in the humanities.” digital scholarship in the humanities . ( ): - . web. june . nyhan, j., and o. duke-williams. “joint and multi-authored publication patterns in the digital humanities.” literary and linguistic computing ( ) ( ): - . nyhan, julianne, melissa m. terras, and claire warwick. digital humanities in practice. facet publishing in association with ucl centre for digital humanities, . o’donnell, angela n., and sharon j. derry. 
“cognitive processes in interdisciplinary groups: problems and possibilities.” in interdisciplinary collaboration: an emerging cog- nitive science. eds. sharon derry, christopher d. schunn, and morton a. gernsbacher. - . mahwah, nj: earlbaum, . o’donnell, daniel paul, katherine l. walter, alex gil, and neil fraistat. in a new com- panion to digital humanities. eds. by susan schreibman, ray siemens, and john un- sworth. - . west sussex, uk: wiley-blackwell, . o’donnell, james j. “engaging the humanities: the digital humanities.” daedalus, . ( ): - . o’gorman, marcel. “the making of a digital humanities neo-luddite.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . ohya, k. “programming with arduino for digital humanities.” journal of digital humani- ties ( ). http://journalofdigitalhumanities.org/ - /programming-with-arduino-for-digi- tal-humanities oishi, l. “what does second life have to do with real-life learning?” technology & learning , ( ): . old maps online. http://oldmapsonline.org/. oldman, dominic, martin doerr, and stefan gradmann. ”zen and the art of linked data: new strategies for a semantic web of humanist knowledge.” in a new companion to digital humanities. eds. by susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . olsen, p. “building a digital curation workstation with bitcurator (update).” bitcurator. august , . http://www.bitcurator.net/building-a-digital-curation-worskstation- with-bitcurator-update. olsen, m. “signs, symbols, and discourses: a new direction for computer-aided litera- ture studies.” computers and humanities ( - ) ( ): - . olsen, mark. “what can and cannot be done with electronic text in historical and liter- ary research.” paper for “modeling literary research methods by computer”. modern language association annual meeting. olson, mark j.v. 
“hacking the humanities: st century literacies and the ‘becoming- other’ of the humanities.” in humanities in the twenty-first century: beyond utility and markets. eds. e. belfiore and a. upchurch. new york, ny: palgrave macmillan, . omeka. http://omeka.net. ong, walter j. interfaces of the word. ithaca, ny: cornell university press, . ong, walter j. orality and literacy: the technologization of the word. london, uk: me- thuen, . ong, walter. “writing restructures consciousness.” orality and literacy: the technolo- gizing of the word. – . london and new york: routledge, . open access directory. oad.simmons.edu. orr, julian e. talking about machines: an ethnography of a modern job. ithaca, ny: ilr press, . ortiz, santiago. “ ways to communicate two quantities.” visual.ly. ( ). https://vis- ual.ly/blog/ -ways-to-communicate-two-quantities/. orwant, john. “our commitment to the digital humanities.” the official google blog. july , . http://googleblog.blogspot.com/ / /our-commitment-to-digital-hu- manities.html. o’sullivan, david, and david unwin. geographic information analysis. chichester, uk: john wiley & sons, inc., . o’sullivan, david, and t. igor. physical computing: sensing and controlling the physical world with computers. new york, ny: thomson, . o’sullivan, james, christopher p. long, and mark a. mattson. “dissemination as cultiva- tion: scholarly communications in a digital age.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . otty, lisa, and tara thomson. “data visualisation and the humanities.” in research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - . edinburgh, uk: edinburgh university press, . owen, j.b., and laura woodworth-ney. “envisioning a master’s degree program in geo- graphically integrated history.” journal of the association for history and computing : ( ): n.p. owens, trevor. 
“defining data for humanists: text, artifact, information or evidence?” journal of digital humanities . ( ). owens, trevor. “the public course blog: the required reading we write ourselves for the course that never ends.” - . in debates in the digital humanities. ed. matthew k. gold. minneapolis, mn: university of minnesota press, . owens, trevor, and j. bailey. “viewshare: digital interfaces as scholarly activity.” perspectives on history. american historical association, . padrón, ricardo. the spacious word: cartography, literature, and empire in early modern spain. chicago, il: university of chicago press, . palamidese, patrizia. scientific visualization: advanced software techniques. new york, ny: ellis horwood, . palen, leysia, kate starbird, sarah vieweg, and amanda hughes. “twitter-based information distribution during the red river valley flood threat.” bulletin of the american society for information science and technology , no. ( ): - . pallasmaa, j. the embodied image: imagination and imagery in architecture. hoboken, nj: john wiley & sons, inc., . palmer, carole l. “thematic research collections.” in online humanities scholarship: the shape of things to come. ed. jerome mcgann. - . houston, tx: rice university press, . pannapacker, william. “digital humanities triumphant?” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . pannapacker, william. “the mla and the digital humanities.” chronicle of higher education. december , . http://chronicle .com/blog/author/brainstorm/ /william-pannapacker/ /. pannapacker, william. “stop calling it ‘digital humanities’.” chronicle of higher education. february , . http://chronicle.com/article/stop-calling-it-digital/ /. papacharissi, zizi a. a private sphere: democracy in a digital age. cambridge, uk: polity, . papacharissi, zizi a. “conclusion: a networked self.” in a networked self: identity, community, and culture on social network sites. ed.
zizi papacharissi. - . new york, ny: routledge, . pappano, laura. “the year of the mooc.” nytimes, november , . http://www.nytimes.com/ / / /education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html parikka, jussi. what is media archaeology? cambridge, uk: polity, . parker, cornelia. cold dark matter: an exploded view. london: tate modern, . http://www.tate.org.uk/learn/online-resources/cold-dark-matter parker, patricia a. “othello and hamlet: spying, discovery and secret faults.” in shakespeare from the margins: language, culture, context. chicago, il: university of chicago, . parks, lisa. culture in orbit: satellites and the televisual. durham, nc: duke university press, . parry, david. “be online or be irrelevant.” academhack. january , . http://academhack.outsidethetext.com/home/ /be-online-or-be-irrelevant/. parry, david. “the digital humanities or a digital humanism.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . parry, ross, ed. museums in a digital age. new york, ny: routledge, . pasztory, esther. thinking with things: toward a new vision of art. austin, tx: university of texas press, . pastorino, cesare. “the mine and the furnace: francis bacon, thomas russell, and early stuart mining culture.” early science and medicine , no. ( ): – . pearce, celia. communities of play: emergent cultures in multiplayer games and virtual worlds. cambridge, ma: mit press, . pearce-moses, r., ed. a glossary of archival and records terminology. saa, . pearson, alastair w., and peter collier. “agricultural history with gis.” in past time, past place, gis for history. - . redlands, ca: esri press, . penzias, arno. ideas and information: managing in a high-tech world. new york, ny: w.w. norton & company, . perkins, david. future wise: educating our children for a changing world. san francisco, ca: jossey-bass, . peters, john durham.
the marvelous clouds: toward a philosophy of elemental media. chicago, il: university of chicago press, . peters, john durham. speaking into the air: a history of the idea of communication. chi- cago, il: university of chicago press, . petroski, henry. the pencil: a history of design and circumstance. new york, ny: knopf, . petroski, henry. the toothpick: technology and culture. new york, ny: knopf, . petzold, charles. code: the hidden language of computer hardware and software. st edition. microsoft press, . peuquet, donna j. representations of space and time. new york, ny: guilford, . phillips, whitney. in this is why we can’t have nice things: mapping the relationship between online trolling and mainstream culture. cambridge, ma: mit press, . pickering, a. the cybernetic brain: sketches of another future. chicago, il: university of chicago press, . pickles, john. “arguments, debates, and dialogues: the gis-social theory debate and the concern for alternatives.” in geographic information systems. - . new york, ny: johns wiley & sons, inc., . pickles, john, ed. ground truth: the social implications of geographic information sys- tems. new york, ny: guilford press, . pickles, john. a history of spaces: cartographic reason, mapping, and the geo-coded world. new york, ny: routledge, . pickles, john. “representations in an electronic age: geography, gis, and democracy.” in ground truth: the social implications of geographic information systems. - . new york, ny: guilford press, . pierazzo, elena. “digital humanities: a definition.” . http://epierazzo.blog- spot.co.uk/ / /digital-humanities-definition.html. pierazzo, elena. “digital documentary editions and the others.” scholarly editing, ( ). pierazzo, elena. ”textual scholarship and text encoding.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . pinch, trevor j., and wiebe e. bijker. 
“the social construction of facts and artifacts: or how the sociology of science and the sociology of technology might benefit each other.” in the social construction of technological systems: new directions in the soci- ology and history of technology. eds. wiebe e. bijker, thomas p. hughes, and trevor pinch, - . cambridge, ma: mit press, . piper, andrew. book was there: reading in electronic times. chicago, il: university of chicago press, . pitti d.v., and w.m. duff. encoded archival description on the internet. binghamton, ny: haworth information press, . plants, s. zeroes and ones: digital women and the new technoculture. new york, ny: doubleday, . plewe, brandon. “the nature of uncertainty in historical geographic information.” transactions in gis : ( ): - . polefrone, phillip r., john simpson, and dennis yi tenen. “critical computing in the hu- manities.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . poole, a. “now is the future now? the urgency of digital curation in the digital human- ities.” dhq: digital humanities quarterly, ( ). . http://www.digitalhumani- ties.org/dhq/vol/ / / / .html. poole, steven. “green’s dictionary of slang by jonathon green and guardian style by david marsh & amelia hodsdon–review.” the guardian, . https://www.theguard- ian.com/books/ /dec/ /dictionary-slang-guardian-style-review. posner, miriam. “here and there: creating dh community.” in debates in the digital humanities. eds. matthew k. gold, and lauren klein. - . minneapolis, mn: univer- sity of minnesota press, . posner, miriam. “no half measures: overcoming common challenges to doing digital humanities in the library.” journal of digital humanities : (january ). posner, miriam. “think talk make do: power and the digital humanities.” journal of dig- ital humanities . ( ). posner, miriam. 
“what’s next: the radical, unrealized potential of digital humanities.” in debates in the digital humanities. ed. matthew gold and lauren klein. - . minne- apolis, mn: university of minnesota press, . postcolonial digital humanities. http://dhpoco.org. potter, r. “literary criticism and literary computing.” computers in the humanities ( ) ( ): . potter, claire. “putting the humanities in action: why we are all digital humanists, and why that needs to be a feminist project.” keynote presentation, women’s history in the digital world conference, bryn mawr college, . http://repository.bryn- mawr.edu/greenfield_conference/ /thursday/ /. potter, r.g. “statistical analysis of literature: a retrospective on computers and the hu- manities, - .” computers and the humanities , no. ( ): - . potter, w. james. media literacy. los angeles, ca: sage, . powell, daniel. “dispatches from capitol hill: # .ʺ http://djp .com/dispatches-from- capitol-hill- /. powell, daniel. “dispatches from capitol hill: # , or eebo and the infinite weird- ness.” http://djp .com/dispatches-from-capitol-hill- -or-eebo-and-the-infinite- weirdness/. powell, daniel. “dispatches from capitol hill: # , or xml and tei are scary.” http://djp .com/dispatches-from-capitol-hill- /. powell, daniel. “dispatches from capitol hill: # , or what is transcription, re- ally?” http://djp .com/dispatches-from-capitol-hill- /. power, eugene. edition of one. ann arbor, mi: university of michigan, . prady lougee, wendy. diffuse libraries: emergent roles for the research library in the digital age. washington, dc: council on library and information resources, . http://www.clir.org/pubs/abstract/pub abst.html. pratt, vernon. thinking machines: the evolution of artificial intelligence. oxford, uk: basil blackwell, . prescott, andrew. “an electric current of the imagination.” digital humanities: works in progress. http://blogs.cch.kcl.ac.uk/wip/ / / /an-electric-current-of-the-imagina- tion. prescott, andrew. 
“beyond the digital humanities center: the administrative landscapes of the digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . prescott, andrew. “consumers, creators or commentators? problems of audience and mission in digital humanities.” arts and humanities in higher education , nos. - ( ): - . prescott, andrew. “an electric current of the imagination: what the digital humanities are and what they might become.” journal of digital humanities, june . prescott, andrew. “riffs on mccarty.” digital riffs. http://digitalriffs.blogspot.com/ / /riffs-on-mccarty.html. presner, todd. “critical theory and the mangle of digital humanities.” in between humanities and the digital. eds. patrik svensson and david theo goldberg. - . cambridge, ma: mit press, . presner, todd. “the ethics of the algorithm: close and distant listening to the shoah foundation visual history archive.” in history unlimited: probing the ethics of holocaust culture. cambridge, ma: harvard university press, . presner, todd. “how to evaluate digital scholarship.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /how-to-evaluate-digital-scholarship-by-todd-presner/. presner, todd. “hypercities.” in online humanities scholarship: the shape of things to come. ed. jerome mcgann. - . houston, tx: rice university press, . presner, todd. “remapping german-jewish studies: benjamin, cartography, modernity.” the german quarterly , no. ( ): - . presner, todd, and chris johanson. “the promise of digital humanities: a whitepaper. march , -final version.” http://www.itpb.ucla.edu/documents/ /promiseofdigitalhumanities.pdf. presner, todd, david shepard, yoh kawano. hypercities: thick mapping in the digital humanities (metalabprojects). cambridge, ma: harvard university press, . presner, todd, and david shepard.
“mapping the geospatial turn.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . presner, todd, j. schnapp, and p. lunenfeld. the digital humanities manifesto . . . http://www.humanitiesblast.com/manifesto/manifesto_v@.pdf. price, jacob. “recent quantitative work in history: a survey of the main trends.” history and theory ( ): - . price, kenneth m. “civil war washington project.” online humanities scholarship: the shape of things to come. ed. jerome mcgann. - . houston, tx: rice university press, . price, kenneth m. “collaborative work and the conditions for american literary scholarship in a digital age.” the american literature scholar in the digital age. eds. amy e. earhart and andrew w. jewell. - . ann arbor, mi: university of michigan press, . price, kenneth m. “digital scholarship, economics, and the american literary canon.” literature compass , no. ( ): - . http://onlinelibrary.wiley.com/doi/ . /j. - . . .x/full. price, kenneth m. “edition, project, database, archive, thematic research collection: what’s in a name?” dhq: digital humanities quarterly ( ). price, kenneth m. “electronic scholarly editions.” in a companion to digital literary studies. eds. r.g. siemens, and s. schreibman. - . oxford, uk: blackwell, . price, kenneth m. and r. siemens, eds. literary studies in the digital age: a methodological primer. new york, ny: mla commons. price, kenneth. “social scholarly editing.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . pritchard, d. “working papers, open access, and cyber-infrastructure in classical studies.” literary and linguistic computing ( ). - . proctor, nancy. “digital: museum as platform, curator as champion, in the age of social media.” curator: the museum journal , no. (january , ): – . http://arthistory .doingdh.org/readings/. project gutenberg.
www.gutenberg.org. project bamboo. . http://www.projectbamboo.org/. promey, sally m., and miriam stewart. “digital art history: a new field for collaboration.” american art , no. (july , ): – . http://www.jstor.org/stable/ . proot, goran, and leo egghe. “estimating editions on the basis of survivals…” papers of the bibliographic society of america, , no. ( ): – . prown, jules. “the art historian and the computer.” art as evidence: writings on art and material culture. new haven, ct: yale university press, . public knowledge project. pkp.sfu.ca. pumfrey, paul, paul rayson and john mariani. “experiments in th century english: manual versus automatic conceptual history.” literary and linguistic computing , no. ( ): – . purdue university. “evaluation criteria for the scholarship of engagement.” n.d. http://www.vet.purdue.edu/engagement/files/documents/evaluationcriterion.pdf. puschmann, cornelius, and jean burgess. “the politics of twitter data.” hiig discussion paper series - ( ). quamen, harvey, and jon bath. “databases.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . quan-haase, anabel, juan luis suarez, and david m. brown. “collaborating, connecting, and clustering in the humanities: a case study of networked scholarship in an interdisciplinary, dispersed team.” american behavioral scientist . ( ): - . race: the floating signifier. dir. sut jhally, with stuart hall and media education foundation. northampton, ma: media education foundation, . radford, marie l., and pamela snelson. “academic library research: perspectives and current trends.” acrl publications in librarianship no. . chicago: association of college and research libraries, . raessens, joost. “computer games as participatory media culture.” in handbook of computer games studies. eds. j. raessens and j. goldstein. - . cambridge, ma: mit press, . raffaelle, simone.
“the body of the text.” in the future of the book. ed. geoffrey nunberg. - . berkeley, ca: university of california press, . rahtz, s. “storage, retrievals, and rendering.” in electronic textual editing. eds. l. burnard, k. o’brien o’keeffe, and j. unsworth. - . new york, ny: modern language association, . raley, rita. “digital humanities for the next five minutes.” differences , no. ( ): - . raley, rita. tactical media. minneapolis, mn: university of minnesota press, . rambsy, kenton. “african american literature and digital humanities.” january , . http://www.culturalfront.org/ / /african-american-literature-and-digital.html. ramsay, stephen. “algorithmic criticism.” in a companion to digital literary studies. eds. ray siemens and susan schreibman. oxford, uk: blackwell, . ramsay, stephen. “care of the soul.” literatura mundana, october , . http://lenz.unl.edu/wordpress/?p= . ramsay, stephen. “centers are people.” april . http://lenz.unl.edu/papers/ / / /centers-are-people.html. ramsay, stephen. “databases.” in a companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. oxford: blackwell, . ramsay, stephen. “hard constraints: designing software in the digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . ramsay, stephen. “the hermeneutics of screwing around; or what you do with a million books.” in pastplay: teaching and learning history with technology. ed. kevin lee. - . ann arbor, mi: university of michigan press, . ramsay, stephen. “humane computation.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . ramsay, stephen. “in praise of pattern.” text technology ( ): - . ramsay, stephen. “on building.” stephen ramsay (author’s blog). january , . http://stephenramsay.us/text/ / / /on-building/. ramsay, stephen.
reading machines: toward an algorithmic criticism (topics in the digital humanities). urbana-champaign, il: university of illinois press, . ramsay, stephen. “rules of the order: the sociology of large, multi-institutional software developmental projects.” digital humanities . . ramsay, stephen. “toward an algorithmic criticism.” literary and linguistic computing . ( ): - . ramsay, stephen. “who’s in and who’s out.” stephen ramsay blog. january , . http://lenz.unl.edu/papers/ / / /whos-in-and-whos-out.html. ramsay, stephen, and geoffrey rockwell. “developing things: notes toward an epistemology of building in the digital humanities.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . raper, jonathan. multidimensional geographic information science: extending gis in space and time. new york, ny: taylor & francis, . ratto, m. “critical making: conceptual and material studies in technology and social life.” information society ( ): - . ratto, m., s. wylie, and k. jalbert. “introduction to the special forum on critical making as research program.” information society , ( ): - . real, l.a. “collaboration in the sciences and the humanities: a comparative phenomenology.” arts and humanities in higher education ( ): - . reed, ashley. “managing an established digital humanities project: principles and practices from the twentieth year of the william blake archive.” digital humanities quarterly , no. ( ). reichardt, j. robots: fact, fiction, and prediction. london, uk: thames & hudson, . reid, alexander. “the creative community and the digital humanities.” digital digs. october , . http://www.alex-reid.net/ / /the-creative-community-and-the-digital-humanities.html. reid, alexander. “digital digs: the digital humanities divide.” digital digs. february , . http://www.alex-reid.net/ / /the-digital-humanities-divide.html. reid, alexander. “digital humanities: two venn diagrams.” digital digs. march , .
http://www.alex-reid.net/ / /digital-humanities-two-venn-diagrams.html. reid, alexander. “graduate education and the ethics of the digital humanities.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . rieger, oya y. “framing digital humanities: the role of new media in humanities scholarship.” first monday , no. ( ). renear, allen. “text encoding.” in a companion to digital humanities. eds. s. schreibman, r. siemens, and j. unsworth. oxford, uk: blackwell, . http://www.digitalhumanities.org/companion. renear, allen, david dubin, c. m. sperberg-mcqueen, claus huitfeldt. “xml semantics and digital libraries.” international conference on digital libraries. washington, dc: . resh, gabby, dan southwick, isaac record, and matt ratto. “thinking as handwork: critical making with humanistic concerns.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . resig, john. “using computer vision to increase the research potential of photo archives.” http://ejohn.org/research/computer-vision-photo-archives/. rettberg, scott. “electronic literature as digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . rheingold, howard. smart mobs: the next social revolution. new york, ny: basic, . rhody, lisa marie. “why i dig: feminist approaches to text analysis.” in debates in the digital humanities. eds. matthew k. gold, and lauren klein. - . minneapolis, mn: university of minnesota press, . rhyne, charles s. “images as evidence in art history and related disciplines.” in mw : museums and the web . ridolfo, jim, and william hart-davidson, eds. rhetoric and the digital humanities. - . chicago, il: university of chicago press, . rieger, oya y.
“framing digital humanities: the role of new media in humanities scholarship.” first monday , no. (october , ). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / . rigney, a. “when the monograph is no longer the medium: historical narrative in the online age.” history and theory, theme issue (december ). - . riley, jenn, and david becker. “seeing standards: a visualization of the metadata universe.” indiana university libraries. . www.dlib.indiana.edu/-jentrile/metadatamap/. rimmer, jon, claire warwick, ann blandford, jeremy gow, and george buchanan. “an examination of the physical and the digital qualities of humanities research.” information processing & management , no. (may ): - . risam, roopika. “navigating the global digital humanities: insights from black feminism.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, . rivera monclova, marta. “towards an open digital humanities.” in thatcamp southern california . january , . http://socal .thatcamp.org/ / /opendh/. rizzo, mary. “every tool is a weapon: why the digital humanities movement needs public history.” public history commons, november , . http://publichistorycommons.org/every-tool-is-a-weapon/. robbins, k. & webster, f. times of the technoculture: from the information society to the virtual life. london, uk: routledge, . roberts, colin h., and t.c. skeat. the birth of the codex. london, uk: oxford university press, . robertson, stephen. “the difference between digital humanities and digital history.” in debates in the digital humanities. eds. matthew gold and lauren klein. - . minneapolis, mn: university of minnesota press, . robertson, stephen. “putting harlem on the map.” in writing history in the digital age. eds. jack dougherty and kristen nawrotzki: http://writinghistory.trincoll.edu/evidence/robertson- -spring. robertson, stephen, shane white, and stephen garton.
“harlem in black and white: mapping race and place in the s.” journal of urban history , no. ( ): - . robinson, arthur h. the look of maps. madison, wi: university of wisconsin press, . robinson, arthur h., and barbara bartz petchenik. the nature of maps: essays toward understanding maps and mapping. chicago, il: university of chicago press, . robinson, p. “digital humanities: is bigger better?” in advancing digital humanities: research, methods, theories. eds. p.l. arthur and k. bode. - . basingstoke, uk: palgrave macmillan, . robinson, peter. “response to roger bagnall, ‘integrating digital papyrology.’” in online humanities scholarship: the shape of things to come. ed. jerome mcgann. - . houston, tx: rice university press, . rockenbach, barbara. “digital humanities in libraries: new models for scholarly engagement.” journal of library administration : (january ). rockwell, geoffrey. “crowdsourcing the humanities: social research and collaboration.” collaborative research in the digital humanities. eds. marilyn deegan and willard mccarty. - . farnham, uk: ashgate, . rockwell, geoffrey, and s. sinclair. hermeneutica: the rhetoric of text analysis. cambridge, ma: mit press, . rockwell, geoffrey. “humanities computing challenges.” theoreti.ca ( ). rockwell, geoffrey. “inclusion in the digital humanities.” philosophi.ca. june , . http://www.philosophi.ca/pmwiki.php/main/inclusioninthedigitalhumanities. rockwell, geoffrey. “on the evaluation of digital media as scholarship.” profession ( ): - . rockwell, geoffrey. “serious play at hand: is gaming serious research in the humanities?” text technology ( ), - . rockwell, geoffrey. “short guide to evaluation of digital work.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /short-guide-to-evaluation-of-digital-work-by-geoffrey-rockwell. rockwell, geoffrey.
“thinking-through the history of computer-assisted text analysis.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . rockwell, geoffrey. “the visual concordance: the design of eye-contact.” technology , no. ( ): - . rockwell, geoffrey. “what is text analysis, really?” literary and linguistic computing. . ( ): - . rockwell, geoffrey, and stefan sinclair. “acculturation and the digital humanities community.” digital humanities pedagogy: practices, principles and politics. ed. brett d. hirsch. - . cambridge, uk: open book publishers, . rodowick, d.n. the virtual life of film. cambridge, ma: harvard university press, . roegiers, s., and f. truyen. “history is d: presenting a framework for meaningful historical representation in digital media.” in new heritage: new media and cultural heritage. eds. y.e. kalay, t. kvan, & j. affleck. - . london and new york: routledge, . rogers, melissa. “making queer feminisms matter: a transdisciplinary makerspace for the rest of us.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . rogers, richard. digital methods. cambridge, ma: mit press, . rogoff, irit. “studying visual culture.” in the visual culture reader. ed. nicholas mirzoeff. - . new york, ny: routledge, . rorabaugh, pete. “twitter theory and the public scholar.” hybrid pedagogy. march . rosenfeld, gabriel. “why do we ask ‘what if?’ reflections on the function of alternative history.” history and theory (december ): - . rosenfeld, l., and p. morville. information architecture for the world wide web. nd ed. beijing, china: o’reilly, . rosenzweig, roy. clio wired: the future of the past in the digital age. new york, ny: columbia university press, . rosenzweig, roy.
“the road to xanadu: public and private pathways on the history web.” journal of american history , (september ). rosenzweig, roy. “scarcity or abundance? preserving the past in a digital era.” american historical review , (june ): - . http://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid= . rosner, daniela k., and sarah e. fox. “legacies of craft and the centrality of failure in a mother-operated hackerspace.” new media & society , no. ( ): - . ross, andrew. “hacking away at the counterculture.” postmodern culture , no. ( ). ross, nancy. “teaching twentieth-century art history with gender and data visualizations.” journal of interactive technology and pedagogy, issue . http://jitp.commons.gc.cuny.edu/teaching-twentieth-century-art-history-with-gender-and-data-visualizations/. ruecker, stan. “interface as mediating actor for collection access, text analysis, and experimentation.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . ruecker, stan, luciano frizzera, milena radzikowska, geoff roeder, ernesto pena, teresa dobson, geoffrey rockwell, susan brown, the inke research group. “visual workflow interfaces for editorial processes.” literary and linguistic computing . ( ): - . ruecker, stan, milena radzikowska, and stéfan sinclair. “hackfests, designfests, and writingfests: the role of intense periods of face-to-face collaboration in international research teams.” digital humanities . . ruecker, stan, and milena radzikowska. “the iterative design of a project charter for interdisciplinary research.” in proceedings of the th acm conference on designing interactive systems – dis ‘ , - . cape town, south africa, . http://dl.acm.org/citation.cfm?id= . ruecker, s., milena radzikowska, and s. sinclair. visual interface design for cultural heritage: a guide to rich-prospect browsing. farnham, uk: ashgate, . ruecker, stan, and jennifer roberts-smith.
“experience design for the humanities: activating multiple interpretations.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . rumsey, david, and edith m. punt. cartographica extraordinaire: the historical map transformed. redlands, ca: esri press, . rumsey, david, and meredith williams. “historical maps in gis.” in past time, past place: gis for history. - . redlands, ca: esri press, . rush, matthew. new media in late th century art (world of art). london, uk: thames and hudson, . rushkoff, d. program or be programmed: ten commands for a digital age. new york, ny: or books, . russell, isabel galina. “case study: digital humanities in mexico.” in digital humanities in practice. eds. claire warwick, melissa terras, and julianne nyhan. - . london, uk: facet in association with ucl center for digital humanities, . russell, john. “teaching digital scholarship in the library: course evaluation.” dh + lb. acrl digital humanities discussion group, . russo, a., and j. watkins. “digital cultural communication: audience and remediation.” in theorizing digital cultural heritage: a critical discourse. eds. f. cameron, and s. kenderdine. - . cambridge, ma: mit press, . ryan, marie-laure, ed. cyberspace textuality: computer technology and literary theory. bloomington, in: indiana university press, . ryan, m.-l. “defining narrative media.” image and narrative: online magazine of the visual narrative, ( ). http://www.imageandnarrative.be/inarchive/mediumtheory/marielaureryan.htm. rybicki, jan, maciej eder, and david l. hoover. “computational stylistics and text analysis.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . rydberg cox, jeffrey a. digital libraries and the challenges of digital humanities. chandos information professional series.
oxford, uk: chandos publishing, . sabharwal, arjun. digital curation in the digital humanities: preserving and promoting archival and special collections. oxford, uk: chandos publishing, . sabharwal, arjun. “digital directions in academic knowledge management: visions and opportunities for digital initiatives at the university of toledo.” special libraries association annual conference & info-expo. . sabharwal, arjun. “digital representations of disability history: developing a virtual exhibition at the ward m. canaday center, university of toledo.” archival issues: journal of the midwest archives conference , ( ): - . saenger, paul. space between words: the origin of silent reading. stanford, ca: stanford university press, . saint-martin, fernande. semiotics of visual language. translated by ferande saint-martin. bloomington, in: indiana university press, . saklofske, jon, estelle clements, and richard cunningham. “on the digital future of humanities.” in digital humanities pedagogy: practices, principles, and policies. ed. brett d. hirsch. - . cambridge, ma: open book publishers, . saklofske, jon, estelle clements, and richard cunningham. “they have come, why won’t we build it? on the digital future of the humanities.” in digital humanities pedagogy: practices, principles, and politics. ed. brett d. hirsch. cambridge, ma: open book publishers, . http://www.openbookpublishers.com/htmlreader/dhp/chap .html. salen, katie, and eric zimmerman. rules of play: game design fundamentals. cambridge, ma: mit press, . saler, michael. “the hidden cost: review of to save everything, click here, by evgeny morozov.” the times literary supplement (may , ): – . salter, anastasia. what is your quest?: from adventure games to interactive books. iowa city, ia: university of iowa press, . salter, c. “entangled: technology and the transformation of performance.” cambridge, ma: mit press. . salway, benet.
“travel, itinerary and tabellaria.” in travel and geography in the roman empire. eds. colin adams and ray laurence. - . london and new york: routledge, . sample, mark. “difficult thinking about the digital humanities.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota, . sample, mark. “the digital humanities is not about building, it’s about sharing.” sample-reality (blog), may , . http://www.samplereality.com/ / / /the-digital-humanities-is-not-about-building-its-about-sharing/. sample, mark. “on the death of the digital humanities center.” sample reality (blog). march , . http://www.samplereality.com/ / / /on-the-death-of-the-digital-humanities-center/. sample, mark. “renetworking house of leaves in the digital humanities.” sample reality (blog). august , . http://www.samplereality.com/ / / /the-digital-humanities-is-not-about-building-its-about-sharing/. sample, mark. “resisting technology: the right idea for all the wrong reasons.” works and days , no. - ( ): - . sample, mark. “tenure as a risk-taking venture.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /tenure-as-a-risk-taking-venture-by-mark-sample/. sample, mark. “unseen and unremarked on: don delillo and the failure of the digital humanities.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota, . sample, mark. “what’s wrong with writing essays.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota, . sanchez, elie. fuzzy logic and the semantic web. new york, ny: elsevier, . sandweiss, martha a. “artifacts as pixels, pixels as artifacts: working with photographs in the digital age.” perspectives on history (november ). sandweiss, martha a. “image and artifact: the photograph as evidence in the digital age.” journal of american history ( ): - . sau-dufrene, bernadette, ed.
heritage and digital humanities: how should training practices evolve? lit verlag, . sayers, jentery. “dropping the digital.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota, . sayers, jentery. how text lost its source: magnetic recording cultures. phd dissertation, university of washington, . sayers, jentery. “i don’t know all the circuitry.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . sayers, jentery, ed. making things and drawing boundaries: experiments in the digital humanities. minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: ‘aids quilt touch’: virtual quilt browser.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: bibliocircuitry and the design of the alien everyday, - .” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: designs for foraging: fruit are heavy, - .” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: fashioning circuits, -present.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: glitch console.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . sayers, jentery.
“project snapshot: loss sets.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: made: technology on affluent leisure time.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: mashbot.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: mic jammer.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. . minneapolis, mn: university of minnesota press, . sayers, jentery. “project snapshot: movable party.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, . sayers, jentery. “prototyping the past.” visible language , no. ( ): - . sayers, jentery. teaching and learning multimodal communications. . http://scalar.usc.edu/maker/english- /index. sayers, jentery. “technology.” in keywords for american cultural studies. nd edition. eds. b. burnett and g. hendler. new york, ny: new york university press. http://hdl.handle.net/ . /rr xh x. sayers, jentery. “why do makerspaces matter for the humanities? for writing centers?” two year college association pacific-northwest, october , . http://www.maker.uvic.ca/pnwca /#/title. sayers, jentery, devon elliott, kari kraus, bethany nowviskie, and william j. turkel. “between bits and atoms: physical computing and desktop fabrication in the humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . sayers, jentery, j. boggs, d. elliott, and w.j. turkel.
“made to make: expanding digital humanities through desktop fabrication.” digital humanities. scalar. http://scalar.usc.edu/. schaffner, j., and r. erway. “does every research library need a digital humanities center?” dublin, oh: oclc research. http://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-digital-humanities-center- .pdf. schama, simon. landscape and memory. new york, ny: random house, . schantz, h. the history of ocr, optical character recognition. manchester center, vt: recognition technologies users association, . scheinfeldt, tom. “the dividends of difference: recognizing digital humanities’ diverse family tree/s.” found history. april , . http://foundhistory.org/ / /the-dividends-of-difference-recognizing-digital-humanities-diverse-family-trees/. scheinfeldt, tom. “stuff digital humanists like: defining digital humanities by its values.” found history. december , . http://www.foundhistory.org/ / / /stuff-digital-humanists-like/. scheinfeldt, tom. “sunset for ideology, sunrise for methodology?” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . scheinfeldt, tom. “‘where’s the beef?’ does digital humanities have to answer questions?” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . scheinfeldt, tom. “why digital humanities is ‘nice.’” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . schell, j. the art of game design: a book of lenses. amsterdam and boston: elsevier/morgan kaufmann. schmidt, desmond. “the inadequacy of embedded markup for cultural heritage texts.” literary and linguistic computing , no. ( ): - . schmidt, benjamin. “do digital humanists need to understand algorithms?” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
schmidt, benjamin. “words alone: dismantling topic models in the humanities.” journal of digital humanities , no. ( ). http://journalofdigitalhumanities.org/ - /words-alone-by-benjamin-m-schmidt/. schnapp, jeffrey and matthew battles. the library beyond the book (metalabprojects). cambridge, ma: harvard university press, . shneiderman, ben. leonardo’s laptop: human needs and the new computing technologies. cambridge, ma: mit press, . schöch, christof. “big? smart? clean? messy? data in the humanities.” journal of digital humanities . ( ): - . scholle, david. “resisting disciplines: repositioning media studies in the university.” communication theory, ( ): - . scholz, sandra, and robert chenhall. “archaeological data banks in theory and practice.” american antiquity , no. ( ): - . scholz, r. trebor, ed. digital labor: the internet as playground and factory. new york, ny: routledge, . scholz, r. trebor. digital labor: new opportunities, old inequalities. re:public, . may , . video. http://www.youtube.com/watch?v= cqkir rvm. scholz, r. trebor. learning through digital media. new york, ny: institute for distributed creativity, . schreibman, susan. “computer-mediated texts and textuality: theory and practice.” computers and the humanities , no. ( ): - . http://www.jstor.org/pss/ . schreibman, susan. “digital scholarly editing.” literary studies in the digital age: an evolving anthology. eds. kenneth m. price and ray siemens. modern language association, . schreibman, susan, ray siemens, and john unsworth. a companion to digital humanities. west sussex, uk: wiley-blackwell, . www.digitalhumanities.org/companion. schreibman, susan, ray siemens, and john unsworth, eds. a new companion to digital humanities. xxiii-xxvii. west sussex, uk: wiley-blackwell, . schreibman, susan, laura mandell, and stephen olsen. “introduction: evaluating digital scholarship.” profession ( ): - . http://www.mlajournals.org/doi/abs/ . /prof. . . . .
schuler, douglas, and aki namioka. participatory design: principles and practices. mahwah, nj: erlbaum, .
schulz, kathryn. being wrong: adventures in the margin of error. new york, ny: harper collins, .
schulz, kathryn. “what is distant reading?” new york times sunday book review, june , . http://www.nytimes.com/ / / /books/review/the-mechanic-muse-what-is-distant-reading.html?_r= .
schuurman, n. “trouble in the heartland: gis and its critics in the s.” progress in human geography, , ( ): - .
scoreahit. “the hit equation.” http://scoreahit.com/thehitequation/.
seaman, david. “gis and the frontier of digital access: application of gis technology in the research library.” paper presented at future foundations: mapping the past-building the greater philadelphia geohistory network. chemical heritage foundation, philadelphia, pa. .
segel, edward, and jeffrey heer. “narrative visualisation: telling stories with data.” tvcg , , ( ): - .
selfe, cynthia. “computers in english departments: the rhetoric of technopower.” ade bulletin ( ): - . http://www.mla.org/adefl_bulletin_c_ade_ _ &from=adefl_bulletin_t_ade _ .
selfe, cynthia, and g. hawisher. literate lives in the information age: narratives of literacy from the united states. mahwah, nj: lawrence erlbaum, .
selfe, cynthia. technology and literacy in the twenty-first century: the importance of paying attention. carbondale, il: southern illinois university press, .
selisker, scott. “digital humanities knowledge: reflections on the introductory graduate syllabus.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press.
senchyne, jonathan. “between knowledge and metaknowledge: shifting disciplinary borders in digital humanities and library and information studies.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
shep, sydney j.
“digital materiality.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
sheppard, eric. “knowledge production through critical gis: genealogy and prospects.” cartographica : ( ): - .
sherman, erica. “urban agents: confraternities, devotion and the formation of a new urban state in eighteenth-century minas gerais.” phd dissertation, duke university, .
sherratt, tim. “it’s all about the stuff: collections, interfaces, power and people.” discontents. november . http://journalofdigitalhumanities.org/ - /its-all-about-the-stuff-by-tim-sherratt/
shillingsburg, peter l. from gutenberg to google: electronic representations of literary texts. cambridge, uk: cambridge university press, .
shillingsburg, peter l. “principles for electronic archives, scholarly editions, and tutorials.” in the literary text in the digital age. ed. richard j. finneran. - . ann arbor, mi: university of michigan press, .
shields, r. “the virtual.” in key ideas. london and new york: routledge, .
shirazi, roxane. “reproducing the academy: librarians and the question of service in the digital humanities.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
sholette, gregory. “disciplining the avant-garde: the united states versus the critical art ensemble.” circa ( ): - . http://www.jstor.org/pss/ .
shopes, linda. “making sense of oral history.” oral history in the digital age. http://ohda.matrix.msu.edu/ / /making-sense-of-oral-history/.
shore, daniel. “wwjd? the genealogy of a syntactic form.” critical inquiry. , no. ( ): – .
short, h., and j. nyhan. “‘collaboration must be fundamental or it’s not going to work’: an oral history.” dhq: digital humanities quarterly. ( ) ( ).
showers, ben. “does the library have a role to play in the digital humanities?” jisc digital infrastructure team, february , .
http://infteam.jiscinvolve.org/wp/ / / /does-the-library-have-a-role-to-play-in-the-digital-humanities/.
siebert, loren. “using gis to document, visualize, and interpret tokyo’s spatial history.” social science history : ( ): - .
siefring, judith. “sect (sustaining the eebo-tcp corpus in transition).” jisc. ( ). https://www.webarchive.org.uk/wayback/archive/ /http://www.jisc.ac.uk/whatwedo/programmes/preservation/sect.aspx.
siemens, lynne. “project management and the digital humanist.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, .
siemens, lynne. “‘it’s a team if you use “reply all”’: an exploration of research teams in digital humanities environments.” literary and linguistic computing , no. (june , ): - .
siemens, lynne, ray siemens, richard cunningham, teresa dobson, alan galey, stan ruecker, and claire warwick. “inke administrative structure, omnibus document.” new knowledge environments , no. . . http://journals.uvic.ca/index.php/inke/article/view/ / .
siemens, lynne, richard cunningham, wendy duff, and claire warwick. “a tale of two cities: implications of the similarities and differences in collaborative approaches within the digital libraries and digital humanities communities.” literary and linguistic computing , no. ( ): - .
siemens, raymond, and s. schreibman, eds. a companion to digital literary studies. oxford, uk: blackwell, .
siemens, raymond, and j. sayers. “toward problem-based modeling in the digital humanities.” in between humanities and the digital. eds. p. svensson and d.t. goldberg. cambridge, ma: mit press, .
siemens, raymond, et al. “human-computer interface/interaction and the book: a consultation-derived perspective on foundational e-book research.” in collaborative research in the digital humanities. eds. marilyn deegan and willard mccarty. - . farnham, uk: ashgate, .
silberschatz, a., h.f. korth, and s. sudarshan, eds.
database system concepts, rd edition. new york, ny: mcgraw-hill, .
simon, herbert a. “understanding the natural and the artificial worlds.” in the sciences of the artificial, rd ed., – . cambridge and london: the mit press, .
simon, nina. the participatory museum. http://www.participatorymuseum.org/.
simsion, g. data modeling: theory and practice. bradley beach, nj: technics publications, .
sinclair, s., s. ruecker, and m. radzikowska. “information visualization for humanities scholars.” in literary studies in the digital age: a methodological primer. eds. k. price and r. siemens. new york, ny: mla commons, .
sinclair, stéfan, and geoffrey rockwell. “text analysis and visualization: making meaning count.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
sinclair, stéfan, and geoffrey rockwell. “towards an archaeology of text analysis tools.” digital humanities . .
sinclair, stéfan, stan ruecker, and milena radzikowska. “information visualization for humanities scholars.” literary studies in the digital age: an evolving anthology. eds. kenneth m. price and ray siemens. mla commons. modern language association of america. .
sinton, diana s., and jennifer j. lund. understanding place: gis and mapping across the curriculum. redlands, ca: esri press, .
slack, jennifer daryl, and john macgregor wise. culture and technology: a primer. new york, ny: peter lang, .
slade, g. made to break: technology and obsolescence in america. cambridge, ma: harvard university press, .
smith, h., and r. dean. practice-led and research-led practice. edinburgh, uk: edinburgh university press, .
smith, i.g., ed. the internet of things . new horizons. internet of things european research cluster, .
smith, j.b. “computer criticism.” style xii, ( ): - .
smith, j.b.
“image and imagery in joyce’s portrait: a computer-assisted analysis.” directions in literary criticism: contemporary approaches to literature. eds. s. weintraub and p. young. - . university park, pa: the pennsylvania state university press, .
smith, j.b. “a new environment for literary analysis.” perspectives in computing , / , ( ): - .
smith, james. “working with the semantic web.” doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, .
smith, martha nell. “electronic scholarly editing.” in a companion to digital humanities. eds. ray siemens, john unsworth, and susan schreibman. oxford, uk: blackwell, . http://www.digitalhumanities.org/companion/.
smith rumsey, abby. “creating value and impact in the digital age through translational humanities.” washington, dc: council on library and information resources. .
smith rumsey, abby. “report of the scholarly communication institute : emerging genres in scholarly communication.” scholarly communication institute, university of virginia library, july .
smithies, james. “evaluating scholarly digital outputs: the layers approach.” journal of digital humanities , no. (fall ). http://journalofdigitalhumanities.org/ - /evaluating-scholarly-digital-outputs-by-james-smithies/.
smithies, james. “full stack dh: building a virtual research environment.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
smithies, james. “introduction to digital humanities.” march , . http://jamessmithies.org/ / / /introduction-to-digital-humanities/.
smithsonian. “smithsonian digital volunteers.” smithsonian digital volunteers. https://transcription.si.edu.
smithsonian social media policy. . http://www.si.edu/content/pdf/about/sd/sd- .pdf.
sneha, p.p.
“making humanities in the digital: embodiment and framing in bichitra and indiancine.ma.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
snow, c.p. the two cultures and the scientific revolution. new york, ny: cambridge university press, .
snyder, susan. the comic matrix of shakespeare’s tragedies: romeo and juliet, hamlet, othello, and king lear. princeton, nj: princeton university press, .
soja, edward. postmodern geographies: the reassertion of space in critical social theory. london, uk: verso, .
somerson, r., and m. hermano, eds. the art of critical making: rhode island school of design on creative practice. hoboken, nj: john wiley & sons, inc, .
sorapure, madeleine. “between modes: assessing student new media compositions.” kairos , no. ( ): - .
“sorting algorithms as dances.” . https://www.i-programmer.info/news/ -training-a-education/ -sorting-algorithms-as-dances.html. (january , ).
sousanis, nick. unflattening. cambridge, ma: harvard university press, .
southall, humphrey r. “applying historical gis beyond the academy: four use cases for the great britain hgis.” in toward spatial humanities. bloomington, in: indiana university press, .
spatial humanities. spatial.scholarslab.org.
speck, r., and p. links. “the missing voice: archivists and infrastructures for humanities research.” in international journal of humanities and arts computing ( - ) ( ): - . doi: . /ijhac. . .
sperberg-mcqueen, c.m. “classification and its structures.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
spiro, lisa. “collaborative authorship in the humanities.” digital scholarship in the humanities. april , . http://digitalscholarship.wordpress.com/ / / /collaborative-authorship-in-the-humanities/.
spiro, lisa.
“computing and communicating knowledge: collaborative approaches to digital humanities projects.” http://ccdigitalpress.org/cad/ch _spiro.pdf.
spiro, lisa. digital research tools (dirt) wiki. https://digitalresearchtools.pbworks.com/w/page/ /frontpage.
spiro, lisa. “examples of collaborative digital humanities projects.” digital scholarship in the humanities, june , . http://digitalscholarship.wordpress.com/ / / /examples-of-collaborative-digital-humanities-projects/.
spiro, lisa. “getting started in digital humanities.” journal of digital humanities, vol , no. ( ). http://journalofdigitalhumanities.org/ - /getting-started-in-digital-humanities-by-lisa-spiro/
spiro, lisa. “getting started in the digital humanities.” digital scholarship in the humanities. october , . http://digitalscholarship.wordpress.com/ / / /getting-started-in-the-digital-humanities/.
spiro, lisa. “opening up digital humanities education.” digital scholarship in the humanities. september , . http://digitalscholarship.wordpress.com/ / / /opening-up-digital-humanities-education/.
spiro, lisa. “‘this is why we fight’: defining the values of the digital humanities.” in debates in the digital humanities. ed. matthew k. gold. minneapolis, mn: university of minnesota press, .
spiro, lisa. “tips on writing a successful grant proposal.” digital scholarship in the humanities, september , . http://digitalscholarship.wordpress.com/ / / /tips-on-writing-a-successful-grant-proposal/.
srinivasan, ramesh. “taking power through technology in the arab spring.” al jazeera. october , . http://www.aljazeera.com/indepth/opinion/ / / .html.
srinivasan, ramesh, katherine m. becvar, robin boast, and jim enote. “diverse knowledges and contact zones within the digital museum.” science, technology, and human values , no. ( ): - .
srinivasan, r., j. enote, k. becvar, and r. boast. “critical and reflective uses of new media in tribal museums.” museum management and curatorship, , ( ): - .
srinivasan, ramesh, and jeffrey huang. “fluid ontologies for digital museums.” international journal on digital libraries , no. ( ): - .
staley, david j. brain, mind and internet: a deep history and future. basingstoke, uk: palgrave pivot, .
staley, david j. computers, visualization, and history: how new technology will transform our understanding of the past. armonk, ny: m.e. sharpe, .
staley, david j. “historical visualizations.” journal of the association for history and computing , no. ( ).
staley, david j. “on the ‘maker turn’ in the humanities.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
staley, david j. “visual historiography: toward an object-oriented hermeneutics.” the american historian. https://tah.oah.org/content/visual-historiography/.
staley, david j., scott a. french, and bill ferster. “visual historiography: visualizing ‘the literature of a field’.” journal of digital humanities , no. (spring ).
steinkuehler, constance. “massively multiplayer online gaming as a constellation of literacy practices.” e-learning . ( ): - .
steinberg, s. h. five hundred years of printing. new york, ny: criterion books, .
sternfeld, j. “archival theory and digital historiography: selection, search, and metadata as archival processes for assessing historical contextualization.” the american archivist , ( ): - .
stertzer, jennifer. “foundations for digital editing, with focus on the documentary tradition.” in doing digital humanities: practice, training, research. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, .
stommel, jesse. “the twitter essay.” hybrid pedagogy (january ).
suber, peter. open access. cambridge, ma: mit, .
suda, brian, and sam hampton smith. “the best tools for data visualization.” creative bloq. future publishing limited, . https://www.creativebloq.com/design-tools/data-visualization- .
sullivan, elaine, angel david nieves, and lisa m. snyder. “making the model: scholarship and rhetoric in -d historical reconstructions.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
sukovic, suzana. “beyond the scriptorium: the role of the library in text encoding.” d-lib magazine , no. (january ). http://www.dlib.org/dlib/january /sukovic/ sukovic.html.
suri, v.r. “the assimilation and use of gis by historians: a socio-technical interaction networks (stin) analysis.” international journal of humanities and arts computing, , ( ): - .
stafford, barbara maria. good looking: essays on the virtue of images. cambridge, ma: mit press, .
stauffer, andrew. “digital scholarly resources for the study of victorian literature and culture.” victorian literature and culture ( ): - .
stauffer, andrew. “my old sweethearts: on digitalization and the future of the print record.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
sterling, bruce. shaping things. cambridge, ma: mit press, .
stern, fritz, ed. the varieties of history: from voltaire to the present. new york, ny: vintage books, .
sterne, jonathan. mp : the meaning of a format. durham, nc: duke university press, .
sternfeld, joshua. “pedagogical principles of digital historiography.” in digital humanities pedagogy: practices, principles and policies. ed. brett d. hirsch. - . cambridge, uk: open book publishers, .
stone, a.r. the war of desire and technology at the close of the mechanical age. cambridge, ma: mit press, .
stone, michael. “map or be mapped.” whole earth (fall ): .
stone, s. “humanities scholars: information needs and uses.” journal of documentation ( ) ( ): - .
strate, lance. “studying media as media: mcluhan and the media ecology approach.” mediatropes ( ): - .
http://www.mediatropes.com/index.php/mediatropes/article/view/ / .
sturm, sean, and stephen francis turner. “digital caricature.” digital humanities quarterly , no. ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html.
suchman, lucille alice. human-machine reconfigurations: plans and situated actions. cambridge and new york: cambridge university press, .
sui, daniel z. “gis, cartography, and the ‘third culture’: geographic imaginations in the computer age.” professional geographer ( ): - .
sula, chris alen. “digital humanities and libraries: a conceptual model.” journal of library administration : (january ).
summit on digital tools for the humanities. the institute for advanced technology in the humanities – university of virginia, . http://www.iath.virginia.edu/dtsummit/summittext.pdf.
sunstein, cass r. infotopia: how many minds produce knowledge. new york, ny: oxford university press, .
“sustainable economics for a digital planet: ensuring long-term access to digital information.” washington, dc: blue ribbon task force on sustainable digital preservation and access, february . http://brtf.sdsc.edu/biblio/brtf_final_report.pdf.
svensson, patrik. “beyond the big tent.” in debates in the digital humanities. ed. matthew k. gold. minneapolis, mn: university of minnesota press, .
svensson, patrik. big digital humanities: imagining a meeting place for the humanities and the digital. ann arbor, mi: university of michigan press, .
svensson, patrik. “the digital humanities as a humanities project.” arts and humanities in higher education ( - ) ( ): - .
svensson, patrik. “humanities computing as digital humanities.” digital humanities quarterly , no. ( ). http://digitalhumanities.org/dhq/vol/ / / / .html
svensson, patrik. “the landscape of digital humanities.” dhq: digital humanities quarterly , no. (summer ). http://digitalhumanities.org/dhq/vol/ / / / .html
svensson, patrik.
“sorting out the digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
svensson, patrik. “a visionary scope of the digital humanities.” humlab blog. february , . http://blog.humlab.umu.se/?p= .
svensson, patrik and david theo goldberg, eds. between humanities and the digital. cambridge, ma: mit press, .
swafford, joanna. “messy data and faulty tools.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
szabo, victoria. “transforming art history research with database analytics: visualizing art markets.” art documentation : ( ): - .
tadirah. “tadirah: taxonomy of digital research activities in the humanities.” dariah. . http://tadirah.dariah.eu/vocab/index.php.
tally, r. melville, mapping and globalization: literary cartography in the american baroque writer. london, uk: continuum, .
tanner, simon. “inspiring research, inspiring scholarship. the value and benefits of digitized resources for learning, teaching, research and enjoyment.” proceedings of archiving . - . arlington, va: society for imaging science and technology, .
tanner, simon. measuring the impact of digital resources: balanced value impact model. london, uk: king’s college, october . http://www.kdcs.kcl.ac.uk/innovation/impact.html.
tanner, simon, and g. bearman. “digitising the dead sea scrolls.” proceedings of archiving . - . arlington, va: society for imaging science and technology, .
tanner, simon, laura gibson, rebecca kahn, and geoff laycock. “choices in digitisation for the digital humanities.” research methods for creating and curating data in the digital humanities. eds. matt hayler and gabriele griffin. - . edinburgh, uk: edinburgh university press, .
tenopir, carol, et al. trust and authority in scholarly communications in the light of the digital transition: final report.
university of tennessee and ciber research ltd, .
tate, nicholas j., and peter m. atkinson, eds. modelling scale in geographical information science. chichester, uk: wiley, .
taylor, pamela. “critical thinking in and through interactive computer hypertext and art education.” innovate: journal of online education , no. ( ): - .
taylor, tina l. play between worlds: exploring online game culture. cambridge, ma: mit press, .
teboul, ezra. “electronic music hardware and open design methodologies for post-optimal objects.” in making things and drawing boundaries: experiments in the digital humanities. ed. jentery sayers. - . minneapolis, mn: university of minnesota press, .
tei (text encoding initiative consortium). http://www.tei-c.org.
tei: text encoding initiative. “a gentle introduction to xml.” http://www.tei-c.org/release/doc/tei-p -doc/en/html/sg.html.
tempelman-kluit, nadaleen, and alexa pearce. “invoking the user from data to design.” college & research libraries ( ).
tenen, dennis. “blunt instrumentalism: on tools and methods.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
terras, melissa. “being the other.” collaborative research in the digital humanities. eds. marilyn deegan and willard mccarty. - . farnham, uk: ashgate, .
terras, melissa. “crowdsourcing in the digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
terras, melissa. defining digital humanities: a reader. farnham, uk: ashgate, .
terras, melissa. digital images for the information professional. aldershot, uk: ashgate, .
terras, melissa. “disciplined: using educational studies to analyze humanities computing.” literary and linguistic computing, . ( ): - .
terras, melissa. “digitization and digital resources in the humanities.” in digital humanities in practice. eds.
claire warwick, melissa terras, and julianne nyhan. - . london, uk: facet in association with ucl center for digital humanities, .
terras, melissa, and julianne nyhan. “father busa’s female punch card operatives.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
terras, melissa. “the impact of social media on the dissemination of research: results of an experiment.” journal of digital humanities, vol. , no. (summer ), http://journalofdigitalhumanities.org/ - /the-impact-of-social-media-on-the-dissemination-of-research-by-melissa-terras/.
terras, melissa. “peering inside the big tent: digital humanities and the crisis of inclusion.” author’s blog. july , . http://melissaterras.blogspot.com/ / /peering-inside-big-tent-digital.html.
terras, m. “present, not voting: digital humanities in the panopticon: closing plenary speech, digital humanities .” literary and linguistic computing , no. ( ): - .
thacker, eugene. biomedia. minneapolis, mn: university of minnesota press, .
thacker, eugene. “networks, swarms, multitudes: part one.” ctheory. may , . http://dhdebates.gc.cuny.edu/debates/text/ .
thacker, eugene. “networks, swarms, multitudes: part two.” ctheory. may , . http://dhdebates.gc.cuny.edu/debates/text/ .
thaller, m., ed. controversies around the digital humanities. historical social research/historische sozialforschung . . köln, germany: quantum and zentrum für historische sozialforschung.
thatcamp: the humanities and technology camp. thatcamp.org.
thomas, douglas and john seely brown. a new culture of learning: cultivating the imagination for a world of constant change. createspace independent publishing platform, .
thomas, lindsay, and dana solomon. “active users: project development and digital humanities pedagogy.” cea critic , no. (july ). http://muse.jhu.edu/login?auth= &type=summary&url=/journal/cea_critic/v / . .thomas.html.
thomas iii, william g.
“blazing trails toward digital history scholarship.” social history/histoire sociale , no. ( ): - .
thomas iii, william g., and elizabeth lorang. “the other end of the scale: rethinking the digital experience in higher education.” educause review , no. ( ). http://www.educause.edu/ero/article/other-end-scale-rethinking-digital-experience-higher-education.
thomas iii, william g. “the promise of the digital humanities and the contested nature of digital humanities.” in a new companion to digital humanities. eds. susan schreibman, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, .
thompson, ann. “teena rochfort smith, frederick furnivall, and the new shakspere society’s four-text edition of hamlet.” shakespeare quarterly , no. ( ): – .
thompson klein, julie. interdisciplining digital humanities. ann arbor, mi: university of michigan press, .
tiffany, daniel. toy medium: materialism and modern lyric. berkeley, ca: university of california, .
tiles, mary, and hans oberdiek. “conflicting visions of technology.” in living in a technological culture: human tools and human values, – . london and new york: routledge, .
tillman, r l. “pirensi: now in -d.” printeresting. warhol foundation. october . web. august .
tolman, e.c. “cognitive maps in rats and men.” psychological review , no. ( ): - .
townsend, r.b. “how is new media reshaping the work of historians?” perspectives on history. november .
trahey, tara m. “a black-figure vase in the nasher museum: visualizing an iconographic network between athens and vulci in the th century bce.” ba honors thesis, duke university, .
“#transformdh: this is the digital humanities.” http://transformdh.tmblr.com/.
troyano, joan fragaszy. “discovering scholarship on the open web: communities and methods.” april , , http://pressforward.org/discovering-scholarship-on-the-open-web-communities-and-methods/. http://www.lotfortynine.org/ / /navigating-dh-for-cultural-heritage-professionals- -edition/.
tryon, chuck. “using video annotation tools to teach film analysis.” profhacker. http://chronicle.com/blogs/profhacker/using-video-annotation-tools-to-teach-film-analysis/ .
tuan, yi-fu. “images and mental maps.” annals of the association of american geographers. , no. ( ): - .
tuan, yi-fu. space and place: the perspective of experience. reprint. minneapolis, mn: university of minnesota press, .
tufte, edward. envisioning information. cheshire, ct: graphics press, .
tufte, edward. “powerpoint is evil.” wired. ( ). https://www.wired.com/ / /ppt /.
tufte, edward. the visual display of quantitative information. nd ed. cheshire, ct: graphics press, .
tufts university. perseus digital library. http://www.perseus.tufts.edu/hopper/help/versions.jsp.
tunkelang, daniel. faceted search. san rafael, ca: morgan & claypool, .
turkel, william j. “hacking history, from analog to digital and back again.” rethinking history ( ) - .
turkel, william j., shezan muhammedi, and mary beth start. “grounding digital history in the history of computing.” ieee annals of the history of computing ( ): .
tukey, john w. exploratory data analysis. reading, ma: addison-wesley, .
turkle, sherry. alone together: why we expect more from technology and less from each other. new york, ny: basic books, .
turkle, sherry. life on the screen: identity in the age of the internet. new york, ny: simon and schuster, .
turner, fred. from counterculture to cyberculture: stewart brand, the whole earth network, and the rise of digital utopianism. chicago, il: university of chicago press, .
tversky, barbara, and paul u. lee. “pictorial and verbal tools for conveying routes.” in spatial information theory: cognitive and computational foundations of geographical information science: international conference cosit ’ , stade, germany, - august: proceedings. eds. christian freksa and david mark. - . berlin, germany: springer verlag, .
“the best tools for data visualization.” creative bloq.
future publishing limited. march , .
tweten, lisa, gwynaeth mcintyre, and chelsea gardner. “from stone to screen: digital revitalization of ancient epigraphy.” digital humanities quarterly , no. ( ).
twycross, m. “virtual restoration and manuscript archaeology.” in the virtual representation of the past. eds. m. greengrass and l. hughes. - . farnham, uk: ashgate, .
ucla library digital humanities research guide. http://guides.library.ucla.edu/digitalhumanities.
underwood, ted. “distant reading and recent intellectual history.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
underwood, ted. “hold on loosely, or gemeinschaft and gesellschaft on the web.” in debates in the digital humanities. eds. matthew k. gold and lauren klein. - . minneapolis, mn: university of minnesota press, .
underwood, ted. “how much dh can you fit in a literature department?” the stone and the shell. http://tedunderwood.com.
underwood, ted. “seven ways humanists are using computers to understand text.” the stone and the shell. http://tedunderwood.com.
underwood, ted. “we don’t already understand the broad outlines of literary history.” the stone and the shell. http://tedunderwood.com
underwood, ted. “where to start with text mining.” the stone and the shell. http://tedunderwood.com.
underwood, ted. “why digital humanities isn’t actually ‘the next thing in literary studies.’” the stone and the shell. http://tedunderwood.com.
university of texas libraries. “using the four factor fair use test.” fair use. ( ). http://guides.lib.utexas.edu/copyright#test.
unsworth, john, raymond george siemens, and susan schreibman, eds. a companion to digital humanities. blackwell companions to literature and culture . malden, ma: blackwell pub, .
unsworth, john. “evaluating digital scholarship, promotion & tenure cases.” university of virginia college and graduate school of arts and sciences – office of the dean, n.d.
http://artsandsciences.virginia.edu/dean/facultyemployment/evaluating_digital_scholarship.html.
unsworth, john. “the state of digital humanities, .” talk manuscript. digital humanities summer institute, june . http://www .isrl.illinois.edu/-unsworth/state.of.dh.dhsi.pdf.
unsworth, john. “university . .” the tower and the cloud: higher education in the age of cloud computing. ed. r. n. katz. washington, dc: educause, .
unsworth, john. “what is humanities computing and what is not?” graduate school of library and information sciences. illinois informatics institute, university of illinois, urbana. http://computerphilologie .uni-muenchen.de/jg /unsworth.html.
urban, richard, and marla misunas. “a brief history of the museum computer network.” encyclopedia of library and information sciences. boca raton, fl: crc press, .
urban, r., p. marty, and m. twidale. “a second life for your museum: d multi-user virtual environments and museums.” museums and the web conference, san francisco. ( ). www.archimuse.com/mw /papers/urban/urban.html.
vaidhyanathan, siva. “afterword: critical information studies.” cultural studies , no. - ( ): - .
vaidhyanathan, siva. the googlization of everything (and why we should worry). oakland, ca: university of california press, .
van zundert, jj., c. van den heuvel, b. brumfield, ed. “text theory, digital documents, and the practice of digital editions.” digitize humanities, .
van der weel, adriaan. changing our textual minds: towards a digital order of knowledge. manchester, uk: manchester university press, .
vandendorpe, christian. from papyrus to hypertext: toward the universal digital library. vol. . urbana, il: university of illinois press, .
vanhemert, kyle. “artist turns a year’s worth of tracking data into a haunting record.” wired. ( ). https://www.wired.com/ / /a-years-worth-of-location-data-transformed-into-a-beautiful-record/.
vanhoutte, e.
“traditional editorial standards and in the digital edition.” in learned love: proceedings of the emblem project utrecht conference on dutch love emblems and the internet (november ). eds. e. stronks and p. boot. - . the hague: dans- data archiving and networked services, . various authors. “reports on national historical gis projects.” historical geography ( ): - . vaughan-nichols, steven j. “augmented reality: no longer a novelty?” computer : ( ): - . vectors: journal of culture and technology in a dynamic vernacular. www.vectorsjour- nal.org. verbeek, peter-paul. moralizing technology: understanding and designing the morality of things. chicago, il: university of chicago press, . verhoeven, deb. “doing the sheep good: facilitating engagement in digital humanities and creative arts research.” in advancing digital humanities: research, methods, theo- ries. eds. paul longley arthur and katherine bode, - . new york, ny: palgrave macmillan, . vershbow, ben. “nypl labs: hacking the library.” journal of library administration, ( ): - . vesna, victoria, ed. database aesthetics: art in the age of information overflow. minne- apolis, mn: university of minnesota press, . vickers, jill. “diversity, globalization, and ‘growing up digital’: navigating interdiscipli- narity in the twenty-first century.” history of intellectual culture, . ( ). http://www.ucalgary.ca/hic/issues/vol . vinopal, jennifer. “supporting digital humanities in the library: creating sustainable & scalable services.” library sphere, june , . http://vinopal.org/ / / /sup- porting-digital-humanities-in-the-library-creating-sustainable-scalable-services/. vinopal, jennifer and monica mccormick. “supporting digital scholarship in research libraries: scalability and sustainability.” journal of library administration : (january ). vinopal, jennifer. “why understanding the digital humanities is key for libraries.” li- brary sphere, february . 
http://vinopal.org/ / / /why-understanding-the- digital-humanities-is-key-for-libraries/. visconti, amanda. “‘songs of innocence and of experience:’ amateur users and digital texts.” ann arbor, mi: university of michigan, . http://hdl.han- dle.net/ . / . voyant tools. voyant-tools.org. wajcman, judy. feminism confronts technology. oxford, uk: polity, . wajcman, judy. “reflections on gender and technology studies: in what state is the art?” social studies of science ( ) ( ): - . walk, paul. “linked, open, semantic?” ( ). http://www.paulwalk.net. wallace, david foster. “tense present: democracy, english and the wars over us- age.” harper’s magazine, . waltzer, luke. “digital humanities and the ‘ugly stepchildren’ of american higher edu- cation.” debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . wands, b. art of the digital age. london, uk: thames and hudson, . wankel, c. & kingsley, j., eds. higher education in virtual worlds: teaching and learning in second life. bradford, uk: emerald, . ware, colin. information visualization: perception for design. san francisco, ca: morgan kaufman, . warwick, claire. “the end of the beginning: building, supporting and sustaining digital humanities institutions.” digital humanities summer institute, victoria, . waters, d. “an overview of the digital humanities.” research library issues ( ): - . warburtone, s. “second life in higher education: assessing the potential for the barriers to deploying virtual worlds in learning and teaching.” british journal of educational technology, ( ), ( ): - . warde, beatrice. “the crystal goblet.” first delivered in as “printing should be in- visible.” in the crystal goblet: sixteen essays on typography. london, uk: sylvan press, . wardrip-fruin, noah, and p. harrigan, eds. first person: new media as story, perfor- mance, and game. cambridge, ma: mit press, . wardrip-fruin, noah. 
“reading digital literature: surface, data, interaction, and expres- sive processing.” in a companion to digital literary studies. eds. by ray siemens and su- san schreibman. oxford, uk: blackwell, . warwick, claire. “building theories or theories of building? a tension at the heart of di- gital humanities.” in a new companion to digital humanities. eds. by susan schreib- man, ray siemens, and john unsworth. - . west sussex, uk: wiley-blackwell, . warwick, claire. “institutional models for digital humanities.” in digital humanities in practice. eds. claire warwick, melissa terras, and julianne nyhan. - . london, uk: facet in association with ucl center for digital humanities, . warwick, claire, isabel galina, melissa terras, paul huntington, and nikoleta pappa. “the master builders: lairah research on good practice in the construction of digital humanities projects.” literary and linguistic computing , no. ( ): - . warwick, claire, melissa terras, and julianne nyhan. “introduction.” in digital humani- ties in practice. eds. claire warwick, melissa terras, and julianne nyhan. - . london, uk: facet in association with ucl center for digital humanities, . warwick, claire, melissa terras, and julianne nyhan, eds. a practical guide to the digital humanities. london, uk: facet publishing, . watrall, ethan. “archaeology, the digital humanities, and the ‘big tent’.” in debates in the digital humanities. eds. matthew k. gold, and lauren klein. - . minneapolis, mn: university of minnesota press, . watts, reggie. “beats that defy boxes.” ted conference, february . lecture. ted: ideas worth spreading. https://www.ted.com/talks/reggie_watts_disori- ents_you_in_the_most_entertaining_way. weber, max. "science as a vocation." from max weber: essays in sociology. trans. h. h. gerth, c. wright mills. new york, ny: oxford university press, . - . weibel, peter, and timothy druckrey, eds. net condition art and global media. cam- bridge, ma: mit press, . weible, robert. 
“defining public history: is it possible? is it necessary?” in perspectives on history, march . http://www.historians.org/pubications-and-directories/per- spectives-on-history/march- /defining-public-history-is-it-possible-is-it-necessary. weinberger, david. everything is miscellaneous. new york, ny: henry holt and com- pany, . weir, george r. s., and marina livitsanou. “playing textual analysis as music.” corpus, ict, and language education. eds. weir, george r. s., and shinʼichirō ishikawa. glasgow, uk: university of strathclyde press, . weiser, mark. “the computer for the twenty-first century.” scientific american, sep- tember, - . . weiser, mark. “ubiquitous computing.” computer science lab at xerox parc, . www.ubiq.com/ubicomp. weiss, sholom m., nitin indurkhya, tong zhang, and fred j. damerau. text mining: pre- dictive methods for analyzing unstructured information. new york, ny: springer, . weller, martin. the digital scholar: how technology is transforming scholarly practice. london, uk: bloomsbury academic, . wellmon, chad. organizing enlightenment: information overload and the invention of the modern research university. baltimore, md: johns hopkins university press, . werner, sarah. “fetishizing books and textualizing the digital.” sarahwerner.net, july , . http://sarahwerner.net/blog/index.php/ / /fetishizing-books-and-textu- alizing-the-digital/. wernimont, jacqueline. “feminist digital humanities: theoretical, social, and material engagements around making and breaking computational media.” june , . http://jwernimont.wordpress.com/ / / /feminist-digital-humanities-theoretical- social-and-material-engagements-around-making-and-breaking-computational-media/. wernimont, jacqueline. “whence feminism? assessing feminist interventions in digital literary archives.” dhq: digital humanities quarterly, ( ) ( ). http://digitalhumani- ties.org: /dhq/vol/ / / / .html. wernimont, jacqueline and j. flanders. 
“feminism in the age of digital archives: the women writers project.” tulsa studies in women’s literature ( ), - . wernimont, jacqueline, and elizabeth losh. “problems with white feminism: intersec- tionality and digital humanities.” in doing digital humanities: practice, training, re- search. eds. constance crompton, richard j. lane, ray siemens. - . new york, ny: routledge, . westphal, b. geocriticism: real and fictional spaces. trans. r. tally. new york, ny: pal- grave macmillan, . wetmorland, b.k., ragas, m.w. et al. “assessing the value of virtual worlds for post- secondary instructors, early adopters and the early majority in second life.” interna- tional journal of humanities and social sciences, ( ) ( ). whallon, robert, jr. “the computer in archaeology: a critical survey.” computers and the humanities , no. ( ): - . wheatley, d. and m. gillings. spatial technology and archaeology: the archaeological applications of gis. london, uk: taylor & francis, . white, john w., and heather gilbert. laying the foundation. west lafayette, in: purdue university press, . white, richard. “what is spatial history?” stanford university spatial history project. . http://www.stanford.edu/group/spatialhistory/cgi-bin/site/pub.php?id= . whitelaw. mitchell. “generous interfaces for digital cultural collections.” digital hu- manities quarterly , no. ( ). whitson, roger. “critical making in digital humanities: a mla special session pro- posal.” washington state university, . wickham, hadley. “tidy data.” journal of statistical software. http://vita.had.co.nz/pa- pers/tidy-data.pdf. wiener, nobert. cybernetics: or control and communication in the animal and the ma- chine. cambridge, ma: mit press, . wiener, norbert. “men, machines, and the world about.” in the new media reader. ed. noah wardrip-fruin and nick montfort. - . cambridge, ma: mit press, . wikipedia statistics. en.wikipedia.org/wiki/special:statistics. wilkens, matthew. 
“canons, close reading, and the evolution of method.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . wilkinson, lane. “join the digital humanities…or else.” sense & reference (blog). janu- ary , . http://senseandreference.wordpress.com/ / / /join-the-digital-hu- manities-or-else/. williams, george h. “disability, universal design and the digital humanities. day of dh: defining digital humanities.” in debates in the digital humanities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . williams, joseph c. “architectural practice in the medieval mediterranean: the church of s. corrado in molfetta.” phd dissertation, duke university, . williams, raymond. keywords: a vocabulary of culture and society. revised edition. new york, ny: oxford university press, . williams, william proctor, and william baker. “caveat lector. english books – and the electronic age.” analytical & enumerative bibliography ( ): – . williford, christa and charles henry. one culture: computationally intensive research in the humanities and social sciences. a report on the experiences of first respondents to the digging into data challenge. washington, dc: council on library and information resources, . willinsky, john. technologies of knowing. boston, ma: beacon press, . wilson, greg. “software carpentry: lessons learned.” cornell university library. ( ). https://arxiv.org/abs/ . . wilson, stephen. information arts: intersections of art, science, and technology. cam- bridge, ma: mit press, . winchester, simon. the map that changed the world: william smith and the birth of modern geology. new york, ny: harpercollins, . winesmith, k., and a. carey. “why build an api for a museum collection?” san francisco museum of modern art, . http://www.sfmoma.org/about/research_pro- jects/lab/why_build_an_api. winn, james anderson. the pale of words: reflections on the humanities and perfor- mance. 
new haven, ct: yale university press, . winter, michael. “specialization, territoriality, and jurisdiction in librarianship.” library trends, . ( ): - . wired! group, duke university. wired! @ (years): visualizing the past at duke univer- sity. visual resources association bulletin : (may ): - . witcomb, andrea. “the materiality of virtual technologies: a new approach to thinking about the impact of multimedia in museums.” theorizing digital cultural heritage. eds. fiona cameron and sarah kenderine. - . cambridge, ma: mit press, . withington, phil. society in early modern england: the vernacular origins of some pow- erful ideas. cambridge, uk: polity press, . witmore, michael. “fuzzy structuralism.” wine dark sea (blog). . http://winedarksea.org/?p= . witmore, michael. “text: a massively addressable object.” in debates in the digital hu- manities. ed. matthew k. gold. - . minneapolis, mn: university of minnesota press, . witmore, michael. “the ancestral text.” in debates in the digital humanities. ed. mat- thew k. gold. - . minneapolis, mn: university of minnesota press, . witten, ian h., david bainbridge, and david m. nichols, eds. how to build a digital li- brary. san francisco, ca: morgan kaufmann publishers, . wood, denis. the power of maps. new york, ny: guilford press, . wood, denis. rethinking the power of maps. new york, ny: guilford press, . woodley, mary s. digital project planning & management basics: instructor manual. . woodward, david, et al., eds. the history of cartography. vol. and vol. , books , , . chicago, il: university of chicago press. - . worthy, glen. “literary texts and the library in the digital age, or, how library dh is made.” stanford digital humanities. march , . https://digitalhumanities.stan- ford.edu/literary-texts-and-library-digital-age-or-how-library-dh-made. wosh, peter j., cathy moran hajo, and esther katz. 
“teaching digital skills in an archives and public history curriculum.” in digital humanities pedagogy: practices, principles and politics. ed. brett d. hirsch. cambridge, ma: open book publishers, . wouters, paul, and rodrigo costas. users, narcissism and control – tracking the impact of scholarly publications in the st century. surf foundation, february . http://www.surf.nl/nl/publicaties/documents/users% narcis- sism% and% control.pdf. wright, alex. glut: mastering information through the ages. ithaca, ny: cornell univer- sity press, . wu, tim. “book review: ‘to save everything, click here’ by evgeny morozov.” the wash- ington post. . https://www.washingtonpost.com/opinions/book-review-to-save- everything-click-here-by-evgeny-morozov/ / / / e a- ac - e - a - eb c c _story.html?noredirect=on&utm_term=. e b b f . wust, markus. “augmented reality.” doing digital humanities: practice, training, re- search. eds. constance crompton, richard j. lane, ray siemens. - . new york: routledge, . wynne, martin. “archiving, distribution and preservation,” in developing linguistic cor- pora: a guide to good practice. eds. m. wynne. oxford, uk: oxbow books: – . yakel, e. “digital curation.” oslc systems & services , ( ) - . yakel, e., p. conway, m. hedstrom, & d. wallace. “digital curation for digital natives.” journal of education for library & information science , ( ): - . yan, l., y. zhang, l.t. yang, and h. ning. the internet of things: from rfid to the next- generation pervasive networked systems. boca raton, fl: auerbach publications, . young, j.r. “virtual reality on a desktop hailed as a new tool in distance educa- tion.” chronicle of higher education , , ( ): - . zeldman, jeffrey. “understanding web design.” a list apart. november , . http://alistapart.com/article/understandingwebdesign. zhang, jingxiong, and michael f. goodchild. uncertainty in geographical information. london and new york: taylor & francis, . zimmer, ben. 
“rowling and “galbraith”: an authorial analysis.” language log. linguistic data consortium. ( july ). http://languagelog.ldc.upenn.edu/nll/?p= . ziemer, tom. “collaborative project pushes discovery in humanities, computer sci- ences.” university of wisconsin-madison college of arts & science: news. university of wisconsin-madison, . zorich, diane m. “the ‘art’ of digital art history.” presented at the digital world of art history, princeton university, june , . http://ica.princeton.edu/digitalbooks/digi- talworldofarthistory / .d.zorich.pdf. zorich, diane m.“ digital humanities centers: loci for digital scholarship.” washington, dc: council on library and information resources, november . http://www.clir.org/activities/digitalscholar /zorich.pdf. zorich, diane m. a survey of digital humanities centers in the united states. clir publi- cation no. . washington, dc: council on library and information resources, . zorich, diane m. a survey of digital cultural heritage initiatives and their sustainability concerns. washington, dc: council on library and information resources, june . http://www.clir.org/pubs/reports/pub /contents.html. zorich, diane m. “transitioning to a digital world: art history, its research centers, and digital scholarship; a report to the samuel h. kress foundation and the roy rosenzweig center for history and new media.” may . http://www.kressfoundation.org/re- search/default.aspx?id= . zoran, a., and l. buechley. “hybrid reassemble: an exploration of craft, digital fabrica- tion and artifact uniqueness.” leonardo , - . zotero. https://www.zotero.org/. zubrow, ezra. “digital archaeology: a historical context.” in digital archaeology. bridg- ing method and theory. eds. patrick daly and thomas l. evans. - . london, uk: routledge, . zundert, joris j. van. “screwmeneutics and hermenumericals: the computationality of hermeneutics.” in a new companion to digital humanities. eds. by susan schreibman, ray siemens, and john unsworth. - . 
west sussex, uk: wiley-blackwell, . / / an aegean history and archaeology written through radiocarbon dates ( ) overview context the aegean area has so far lagged behind several other parts of europe and the mediterranean in not offering any major listing of its considerable radiocarbon record, despite decades of radiocarbon sampling at major sites and worldwide radiocarbon-led debates, such as over the dating of the santorini eruption (e.g. [ – ]). the dataset provided here is the outcome of a project “an aegean prehistory written in radiocarbon dates” and it offers the most complete list so far of published radiocarbon dates from greece. some c dates from sites located in greece have been discovered or cross-checked via a combination of harmonizing records from several existing radiocarbon databases, searching original publications and checking preliminary reports from both international and greek sources. the project was designed to complement/enhance wider research agendas considering the interplay between human population, land use and long-term environmen- tal processes, especially a leverhulme trust funded pro- ject known as “changing the face of the mediterranean” (rpg- - , pi neil roberts) to which the current radiocarbon data were used as part of regional case study paper ([ ], for the special issue, see [ ]). in a variety of contexts worldwide, the assessment of radiocarbon date- lists as aggregate times series, often via summed prob- ability distributions, has become popular for modelling human population change (e.g. [ ]). their collation with additional archaeological records, such as pollen cores and site data, has offered further opportunities to detect regional differences in the long-term socio-ecological development. although the original aim of the project was to retrieve as exhaustive a list of published radiocarbon dates in the aegean area from only the mesolithic to iron age (ca. 
– kya), it became evident that the number of published dates covering the area of modern greece was far lower data paper an aegean history and archaeology written through radiocarbon dates markos katsianis , andrew bevan , giorgos styliaras and yannis maniatis department of history and archaeology, university of patras, gr institute of archaeology, university college london, uk laboratory of archaeometry, ncsr demokritos, gr corresponding author: markos katsianis (mkatsianis@upatras.gr) this dataset is the outcome of an instap-funded project “an aegean prehistory written in radiocarbon dates”. it includes c dates from sites in greece and reflects an attempt to exhaustively collect and cross-check all published radiocarbon dates from existing databases, original publications and preliminary reports using both international and greek sources ( sources in total). although originally targeting prehistoric dates, all dates coming from archaeological or environmental sampling were inte- grated in the final dataset regardless of chronological period. sites have been identified and positioned as accurately as possible, while additional information on sampling procedures, sample material and strati- graphic context have been recorded. keywords: environment; archaeology; radiocarbon dating; greece; holocene funding statement: the institute for aegean prehistory (instap) provided the core funding for the “an aegean prehistory written in radiocarbon dates” project which ran between – . previous sup- port by instap to yannis maniatis for the radiocarbon dating of early neolithic settlements in greece in ncsr demokritos radiocarbon laboratory allowed for several dates to be processed and included in the present work. work related to data cleaning and terminology mapping was implemented as part of the dataset preparation process to be ingested in the ariadneplus portal. 
ariadneplus (advanced research infrastructure for archaeological database networking in europe) is a project funded by the european commission under the h programme, contract no. h -infraia- - - . the views and opinions expressed in this publication are the sole responsibility of the authors and do not necessarily reflect the views of the european commission. katsianis m, et al. an aegean history and archaeology written through radiocarbon dates. journal of open archaeology data, : . doi: https://doi.org/ . /joad. mailto:mkatsianis@upatras.gr https://doi.org/ . /joad. katsianis et al: an aegean history and archaeology written through radiocarbon datesart.  , pp.  of than expected. four main reasons are identified for this issue: – there is a small core of radiocarbon dates especially for aegean prehistory originally published in eng- lish and re-used extensively in subsequent attempts by researchers to better define individual chronolog- ical subperiod boundaries, period outsets (e.g. the beginning of neolithic) or important events (e.g. the santorini eruption). – a substantial number of measurements (ca. %) come from purely or partially non-anthropic con- texts as part of investigation strategies involving boreholes towards the reconstruction of environ- mental or geomorphological conditions in the past. – in the greek literature, there has been a marked ten- dency for archaeologists to report calibrated dates without clear reference to conventional (pre-calibra- tion) radiocarbon ages or supplementary data (e.g. context details). – later periods (after about ~ bp) are under- represented (far fewer dates) due to the lack of an academic tradition in collecting radiocarbon dates for classical, medieval and more recent periods of archaeology. in this respect, the final dataset includes all dates encoun- tered in the literature regardless of research context (archae- ological, environmental or material conservation studies) or chronological period. 
contextual information regarding the sampling procedure has also been recorded and sites have been identified and located as accurately as possible. a lot of effort has been directed towards cleaning data and refining terminology, with a view to data ingest into the ariadneplus portal. as a result, we hope that in terms of both data structure and content, the current date-list aggregate will form an important radiocarbon data reference for greece and continue to grow through further input and re-use.

spatial coverage

description: the dataset covers the area of modern greece. figure shows the study area and the distribution of sites with archaeological and environmental samples. the coordinates of the minimum-bounding box defined by site coordinates are given in wgs84 decimal degrees.

northern boundary: . (promachonas-topolnitsa, central macedonia)
southern boundary: . (gavdos, crete)
eastern boundary: . (kallithies a, rhodes, south aegean)
western boundary: . (sidari, corfu, ionian islands)

figure : study area and distribution of sites with archaeological and environmental samples within administrative divisions of greece (basemap sources: esri, here, garmin, fao, noaa, usgs, © openstreetmap contributors and the gis user community).

temporal coverage

dates range from the middle/late palaeolithic (ca. , cal. bc) to early modern times.

(2) methods

the creation of this dataset was only possible due to the growing availability of open data records and relevant digital scholarship [ ]. to approach the data collection, we combined secondary sources of already compiled radiocarbon datasets with other available online sources that might be screened and harvested for radiocarbon data.

steps

more specifically, we have extracted lists of greek dates from available 14c date lists and databases [ – ].
the lead author further screened digital versions of original publications and paper-based sources to find new dates or check, enhance and georeference those already listed by others. journal sources that were extensively screened include but are not limited to: acta archaeologica, annual of the british school at athens, antiquity, archaeological and anthropological sciences, archaeometry, bulletin de correspondance hellénique, bsa archaeological reports, eurasian prehistory, european journal of archaeology, geoarchaeology, géomorphologie, hesperia, international journal of nautical archaeology, international journal of osteoarchaeology, journal of archaeological research, journal of archaeological science, journal of archaeological science reports, journal of european archaeology, journal of field archaeology, journal of human evolution, journal of the royal anthropological institute, journal of world prehistory, méditerranée, mediterranean archaeology and archaeometry, oxford journal of archaeology, plos one, proceedings of the prehistoric society, radiocarbon, mediterranean historical review, science, vegetation history and archaeobotany, volumes of the archaeological work in macedonia and thrace annual conference, world archaeology. all the above were checked for all those volumes that were available online up to .

radiocarbon lists were further extracted from several monographs, chapters in edited volumes, websites and site reports. finally, y. maniatis (one of the authors here) provided clarifications for partially published (i.e. published by archaeologists only as calibrated timespans) radiocarbon dates from the ncsr demokritos radiocarbon laboratory.
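the harmonizing step described above can be sketched as follows. this is only a minimal illustration under stated assumptions, not the procedure the authors actually ran: the function names, the record layout (dictionaries with `labid`, `cra` and `error` keys) and the toy values are all hypothetical. the idea is to reduce lab codes to one spelling so that the same date listed in several databases collapses to one entry, and to flag entries whose sources disagree so they can be checked by hand.

```python
import re


def normalise_lab_id(raw):
    """Reduce variants such as 'xx 101' or 'XX.101' to a single form, 'XX-101'.

    Hypothetical helper: real lab codes may need extra rules.
    """
    cleaned = raw.strip().upper()
    match = re.match(r"^([A-Z]+(?:/[A-Z]+)?)[\s\-_.]*(\d+[A-Z]?)$", cleaned)
    if not match:
        # Leave unrecognised codes untouched (apart from case and whitespace).
        return cleaned
    return f"{match.group(1)}-{match.group(2)}"


def merge_date_lists(sources):
    """Group records from several date lists by normalised lab code.

    `sources` maps a source name to a list of dicts with the assumed keys
    'labid', 'cra' and 'error'. Entries reported with different age or
    error values across sources are flagged with conflict=True.
    """
    merged = {}
    for source_name, records in sources.items():
        for record in records:
            key = normalise_lab_id(record["labid"])
            entry = merged.setdefault(key, {"labid": key, "readings": []})
            entry["readings"].append((source_name, record["cra"], record["error"]))
    for entry in merged.values():
        distinct = {(cra, err) for _, cra, err in entry["readings"]}
        entry["conflict"] = len(distinct) > 1
    return merged


# Toy values, invented purely for illustration.
sources = {
    "database_a": [
        {"labid": "xx 101", "cra": 5000, "error": 50},
        {"labid": "XX-102", "cra": 4200, "error": 40},
    ],
    "database_b": [
        {"labid": "XX-101", "cra": 5000, "error": 60},
    ],
}
merged = merge_date_lists(sources)
for lab_id, entry in sorted(merged.items()):
    print(lab_id, entry["conflict"], entry["readings"])
```

flagged entries are not resolved automatically here; deciding which reading to keep is left to manual checking against the publications.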
sampling strategy

in addition to providing basic and alternative lab codes (stored in “labid” and “othlabid” respectively) as well as date codes in searched radiocarbon dating databases (“otherdatecode”), conventional (pre-calibration) radiocarbon age (“cra”) and standard deviation error (“error”), we have further collected several data fields per date containing:

– isotopic fractionation of the stable carbon isotope carbon-13 (δ13c) for allowing clear assessment of fractionations and reservoir effects, but also for understanding changing water-stress across regions and through time (“dc ”),
– other measurements related to the reported data, e.g. percent modern carbon (pmc) (“oth measures”),
– notes on the technique/method used to process the sample (“datemethod”),
– basic information on the sample material (“material”) as well as genus or species level identifications where possible (“species”).

for each date, we report its original publication and all subsequent works referencing it, including online databases or publicly accessible data archives. in this respect, a considerable amount of contextual and supplementary information associated with each date has been included.

the collection procedure focused on data quality control, by cross-checking all attributes associated with the radiocarbon dates and addressing possible inconsistencies in the published records. in cases where conflicting statements (e.g. sample age, deviation error) were encountered in the sources, we based our final database entry on the most complete/detailed descriptions, a preference for original publications over compiled secondary sources (e.g. databases), and a comparison with later (paper) publications to check for possible measurement revisions (e.g. [ ]).
problematic cases are reported in the “problems” field, while alternative measurements, together with links to their respective references, have been included in the “comments” field, both contained in the “c samples” table.

the geographic location of samples has been assigned latitude/longitude coordinates in decimal degrees (“longtitude” and “latitude”) recorded under the wgs84 ellipsoid (epsg 4326). each location has been coded according to its perceived accuracy using four different assessment levels (a: sub-site quality +/– m, b: within +/– km, c: moderate accuracy within admin region, d: unknown accuracy within country). in cases of large sites (e.g. knossos) where it was possible to locate samples within smaller or neighbouring research areas (e.g. unexplored mansion), this differentiation is also reflected in the “sitename” field. however, the published dataset includes downgraded location coordinates that have been grouped by site name, in line with looting prevention policies by the greek ministry of culture and tourism. researchers wishing to obtain the accurate coordinates can contact the lead author. current greek administrative divisions (“adminregion” and “country”) have been used to group samples.

in terms of contextual information, recordings include typological distinctions between site types (“sitetype”), notes on the stratigraphic context of each sample (“sitecontext”), chronological distinctions related to intra-site phasing (“sitephase”) and broader chronological periods related to the sample’s cultural context (“culturalperiod”).

quality control

after the completion of the data collection stage we undertook painstaking steps in data cleaning and checking. to solve data discrepancies, we had to re-visit and cross-check already screened sources.
in descriptive fields, such as “sitecontext”, readily available information was edited to achieve a standardized contextual notation for each site and mitigate differences in site context reporting between publications (e.g. franchthi cave). in fields where a term list could be established we tried to standardize entries as much as possible and map resulting terms to reference vocabularies or thesauri (see relevant tables). we used the getty art and architecture thesaurus (aat) to map terms in the fields of “sitetype”, “material” and “species”. we also used the getty thesaurus of geographic names (tgn) to standardize individual entries for the “sitename” and “adminregion” fields. finally, we employed periodo and the greek historical periods vocabulary (uri: http://semantics.gr/authorities/vocabularies/historical-periods) from the national documentation centre (ndc) [ ] to further normalize the chronological periods reported in the “culturalperiod” field. all mappings are also being made available to ariadneplus to enhance data interoperability in the ariadneplus portal.

constraints

in every respect, the dataset remains far from ideal. for example, almost % of the dates do not have associated δ13c values, whereas ca. % are not associated with any defined chronological period. radiocarbon dates have been reported in the literature with varying degrees of associated information. although the most recent publications are obviously more detailed, in many aspects reporting continues to vary significantly between laboratories or individual reports. although we have given special emphasis to quality control while moving through the labyrinth of different paper and digital sources, a feeling persists that certain constraints arise from the original data reporting sources. in this regard, it remains up to the user to assess data reliability and proceed with appropriate caution.
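one simple way for a user to gauge the gaps described above is to compute, per field, the share of records with no value. the sketch below is a hedged illustration only: the function name and the key `delta13c` are hypothetical (the dataset's own field label for δ13c values is not reproduced here), and the records are invented toy values rather than figures from the dataset.

```python
def missing_share(records, field):
    """Return the fraction of records whose value for `field` is absent or blank."""
    if not records:
        return 0.0
    missing = 0
    for record in records:
        value = record.get(field)
        # Treat None and whitespace-only strings as missing; keep numeric zero.
        if value is None or str(value).strip() == "":
            missing += 1
    return missing / len(records)


# Invented toy records; 'delta13c' is an assumed key, not the dataset's label.
toy_records = [
    {"delta13c": "-25.1", "culturalperiod": "neolithic"},
    {"delta13c": "", "culturalperiod": "  "},
    {"culturalperiod": "bronze age"},
    {"delta13c": 0, "culturalperiod": None},
]

for field in ("delta13c", "culturalperiod"):
    print(field, missing_share(toy_records, field))
```

running such a check before any aggregate analysis makes it explicit how many dates would be excluded by a filter on a given field.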
the data archive to which this paper points should be considered a versioned first release: we encourage all users to inform the lead author of any possible errors and we will seek to produce and update a more dynamic online repository accordingly.

( ) dataset description
the dataset contains a single tab delimited text file (.txt) of the c dates for greece (c samples) plus a bibtex format bibliography (references). a further tab delimited text file has been included to document the main file fields and their domain values (c greece_fields). the project’s relational database was originally in ms access and a version of this has been made available as a sql dump file containing ddl (data definition language) and dml (data manipulation language) queries for reconstructing the database (c greece_dump). apart from the main c samples table, the full relational database contained seven more tables with standardized domain values for the following fields contained in the main table: ) admin region, ) culturalperiod, ) material, ) sitename, ) sitetype, ) species, ) source. values from tables and have been mapped to tgn, those of table to periodo and ndc chronological periods, while the values from , and to the aat. table included the original transcription of the source reference, which eventually resulted in the bibtex file, but was also maintained in the original database.

object name
c samples.txt – single file (tab delimited text, utf encoding) providing the data for all c samples. it corresponds to the original database main table.
c greece_fields.txt – single file (tab delimited text, utf encoding) containing field type definitions and domain values for all content included in the project’s database.
references.bib – single bibtex file containing references cited for all c samples recorded.
c greece_dump.sql – single file containing the main data table and seven additional tables for domain values and references.
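as a minimal sketch of how the main tab delimited file might be read, the python example below parses a small in-memory table and counts records that lack a cultural period. the column names follow the field names quoted in this paper (including the spelling “longtitude”); the two rows are made-up placeholders and not records from the dataset, and the helper names `read_samples`, `load_samples` and `count_missing` are hypothetical. the utf-8 encoding passed to `open` is also an assumption about the published file.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT records from the dataset.
# Column names follow the field names quoted in the paper.
SAMPLE = (
    "sitename\tsitetype\tculturalperiod\tlatitude\tlongtitude\n"
    "example site a\tsettlement\tneolithic\t0.0\t0.0\n"
    "example site b\tcave\t\t0.0\t0.0\n"
)


def read_samples(handle):
    """Parse a tab-delimited table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def load_samples(path):
    """Read a tab-delimited file from disk (the encoding is an assumption)."""
    with open(path, encoding="utf-8", newline="") as handle:
        return read_samples(handle)


def count_missing(rows, field):
    """Count rows whose value for `field` is absent, empty or whitespace."""
    return sum(1 for row in rows if not (row.get(field) or "").strip())


rows = read_samples(io.StringIO(SAMPLE))
print(len(rows), count_missing(rows, "culturalperiod"))  # prints "2 1"
```

the same `count_missing` check could be pointed at other fields to gauge how complete a given column is before analysis, in keeping with the caution recommended under constraints.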
data type
secondary data

format names and versions
txt utf , sql, bibtex

creation dates
most of the records were created in – as part of the instap funded project “an aegean prehistory written in radiocarbon dates”. between – records underwent cleaning and standardization in the framework of the ariadneplus project, while a small number of additions were implemented.

dataset creators
the researcher responsible for data entry was markos katsianis. online radiocarbon listings and literature sources were provided by andrew bevan, who supervised the data recording and standardisation process. records were restructured, cleaned and standardized by giorgos styliaras. yannis maniatis provided additional records from the ncsr demokritos radiocarbon laboratory and helped solve discrepancies between entries. terminology mappings were performed by giorgos styliaras and markos katsianis.

language
english. greek literature has been included in modern greek. site context descriptions from greek sources may contain greek characters (e.g. sector Φ).

license
https://creativecommons.org/licenses/by/ . /.

repository location
https://doi.org/ . / / .v

publication date
/ / .

( ) reuse potential
this dataset comprises the largest single collection so far of radiocarbon data for the aegean region, covering the equivalent area of modern greece. it provides a comprehensive resource for accessing detailed chronological data from specific sites or wider regions within greece. sites have been located to the highest possible accuracy and although their coordinates have been downgraded to ca. m.
radius to discourage illegal uses, their positions can still be used in archaeological site mappings. the creation and circulation of radiocarbon databases of this kind follows wider efforts in sharing open licensed, georeferenced large-scale datasets in archaeology and beyond. on a broader level, the dataset allows the greek radiocarbon listings to be added to data collections of other kinds from the aegean region and to be used in comparative agendas of greater geographical scope. one key reuse potential relates to the use of aggregate lists of anthropogenic radiocarbon data as a proxy for human population change, for example via summed probability distributions (spds). further potential might relate to enhanced interpretation of aegean prehistoric archaeological sequences at regional levels or chronological comparison between different regions. also, the juxtaposition of large lists of radiocarbon dates with other scientific data, such as pollen cores, macrobotanical remains, skeletal assemblages or archaeological settlement survey datasets, offers further opportunities to approach long-term socio-environmental trajectories and questions.

note
project website: https://ariadne-infrastructure.eu/, ariadne portal: http://portal.ariadne-infrastructure.eu/.

acknowledgements
we would like to thank instap for the funding behind this project as well as colleagues involved in the changing the face of the mediterranean project who provided extra impetus for data collection. we likewise thank the ariadneplus project for allowing us to include this dataset in the ariadne portal. we would also like to acknowledge the various researchers and developers involved in creating other online radiocarbon databases and individual datelists with dates from the aegean region that are openly published and have been used in our research (see methods). access to several unavailable sources has been provided by Ž. tankosić and m. ntinou.
competing interests
the authors have no competing interests to declare.

references
aitken, mj the thera eruption: continuing discussion of the dating. i: resume of dating iii: further arguments against an early date iv: addendum. archaeometry, ( ): – . doi: https://doi.org/ . /j. - . .tb .x
bruins, hj, van der plicht, j and macgillivray, j a the minoan santorini eruption and tsunami deposits in palaikastro (crete): dating by geology, archaeology, c, and egyptian chronology. radiocarbon, ( ): – . doi: https://doi.org/ . /s x
bruins, hj and van der plicht, j the thera olive branch, akrotiri (thera) and palaikastro (crete): comparing radiocarbon results of the santorini eruption. antiquity, ( ): – . doi: https://doi.org/ . /s x
friedrich, wl, kromer, b, friedrich, m, heinemeier, j, pfeiffer, t and talamo, s santorini eruption radiocarbon dated to – b.c. science, ( ): . doi: https://doi.org/ . /science.
manning, sw the bronze age eruption of thera: absolute dating, aegean chronology and mediterranean cultural interrelations. journal of mediterranean archaeology, ( ): – . doi: https://doi.org/ . /jmea.v i .
manning, sw, bronk ramsey, c, doumas, c, marketou, t, cadogan, g and pearson, c new evidence for an early date for the aegean late bronze age and thera eruption. antiquity, ( ): – . doi: https://doi.org/ . /s x
manning, sw, höflmayer, f, moeller, n, dee, m w, bronk ramsey, c, fleitmann, d, higham, t, kutschera, w and wild, em dating the thera (santorini) eruption: archaeological and scientific evidence supporting a high chronology. antiquity, ( ): – . doi: https://doi.org/ . /s x
maniatis, y radiocarbon dating of the late cycladic building and destruction phases at akrotiri: new evidence. european physical journal plus, : . doi: https://doi.org/ . /epjp/i - -y
weiberg, e, bevan, a, kouli, k, katsianis, m, woodbridge, j, bonnier, a, engel, m, finné, m, fyfe, r, maniatis, y, palmisano, a, panajiotidis, s, roberts, n and shennan, s long-term trends of land use and demography in greece: a comparative study. the holocene, ( ): – . doi: https://doi.org/ . /
bevan, a, palmisano, a, woodbridge, j, fyfe, r, roberts, cn and shennan, s the changing face of the mediterranean – land cover, demography and environmental change: introduction and overview. the holocene, ( ): – . doi: https://doi.org/ . /
shennan, s, downey, s, timpson, a, edinborough, k, colledge, s, kerig, t, manning, k and thomas, mg regional population collapse followed initial agriculture booms in mid-holocene europe. nature communications, : . doi: https://doi.org/ . /ncomms
bevan, a the data deluge. antiquity, ( ): – . doi: https://doi.org/ . /aqy. .
reingruber, a and thissen, l c database for the aegean catchment (eastern greece, southern balkans and western turkey) , – cal bc. in: lichter, c and meriç, r (eds) how did farming reach europe?
anatolian-european relations from the second half of the th through the first half of the th millennium cal bc: proceedings of the international workshop, istanbul, – may (byzas ). İstanbul: ege yayınları, – .
hinz, m, furholt, m, müller, j, raetzel-fabian, d, rinne, c, sjögren, k-g and wotzka, h-p radon – radiocarbon dates online . central european database of c dates for the neolithic and the early bronze age. journal of neolithic archaeology, : – . doi: https://doi.org/ . /jna. . . database available at: https://radon.ufg.uni-kiel.de/ (accessed august ).
manning, k, colledge, s, crema, e, shennan, s and timpson, a the cultural evolution of neolithic europe. euroevol dataset : sites, phases and radiocarbon data. journal of open archaeology data, : e , doi: https://doi.org/ . /joad.
brami, m and zanotti, a modelling the initial expansion of the neolithic out of anatolia. documenta praehistorica, : – . doi: https://doi.org/ . /dp. .
centre de datation par le radiocarbone de lyon (cdrc) banadora (banque nationale de données radiocarbone pour l’europe et le proche orient). cdrc. http://www.arar.mom.fr/banadora/ (accessed august ).
reingruber, a and thissen, l the sea project. a c database for southeast europe and anatolia ( , – , cal bc). http://www. sea.org/ _dates.html (accessed august ).
orau. oxford radiocarbon accelerator unit (orau) database. https://c .arch.ox.ac.uk/ (accessed august ).
weninger, b cologne radiocarbon calibration and paleoclimate research package (calpal). https://monrepos-rgzm.de/forschung/ausstattung.html (accessed january ).
manning, s w the absolute chronology of the aegean early bronze age: archaeology, radiocarbon and history. sheffield: sheffield academic press.
georgiadis, h, papanoti, a, paschou, m, roubani, a, hardouveli, d and sachini, e the semantic enrichment strategy for types, chronologies and historical periods in searchculture.gr.
in: garoufallou, e, virkus, s, siatri, r and koutsomiha, d (eds) metadata and semantic research. mtsr . communications in computer and information science, vol . cham: springer. doi: https://doi.org/ . / - - - - _

how to cite this article: katsianis m, bevan a, styliaras, g and maniatis, y an aegean history and archaeology written through radiocarbon dates. journal of open archaeology data : . doi: https://doi.org/ . /joad.

published: august

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

journal of open archaeology data is a peer-reviewed open access journal published by ubiquity press.

digital humanities

the importance of pedagogy: towards a companion to teaching digital humanities

hirsch, brett d.
brett.hirsch@gmail.com
university of western australia

timney, meagan
mbtimney.etcl@gmail.com
university of victoria

the need to “encourage digital scholarship” was one of eight key recommendations in our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences (unsworth et al). as the report suggested, “if more than a few are to pioneer new digital pathways, more formal venues and opportunities for training and encouragement are needed” ( ). in other words, human infrastructure is as crucial as cyberinfrastructure for the future of scholarship in the humanities and social sciences. while the commission’s recommendation pertains to the training of faculty and early career researchers, we argue that the need extends to graduate and undergraduate students.

despite the importance of pedagogy to the development and long-term sustainability of digital humanities, as yet very little critical literature has been published. both the companion to digital humanities ( ) and the companion to digital literary studies ( ), seminal reference works in their own right, focus primarily on the theories, principles, and research practices associated with digital humanities, and not pedagogical issues. there is much work to be done.

this poster presentation will begin by contextualizing the need for a critical discussion of pedagogical issues associated with digital humanities. this discussion will be framed by a brief survey of existing undergraduate and graduate programs and courses in digital humanities (or with a digital humanities component), drawing on the “institutional models” outlined by mccarty and kirschenbaum ( ).
the growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn about the sorts of “transferable skills” and “applied computing” that digital humanities offers (jessop ), and the desire of practitioners to consolidate and validate their research and methods.

we propose a volume, teaching digital humanities: principles, practices, and politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. we plan to structure the volume according to the four critical questions educators should consider, as emphasized recently by mary breunig, namely:
- what knowledge is of most worth?
- by what means shall we determine what we teach?
- in what ways shall we teach it?
- toward what purpose?

in addition to these questions, we are mindful of henry a. giroux’s argument that “to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak” ( ). consequently, we will encourage submissions to the volume that address these wider concerns.

references
breunig, mary ( ). 'radical pedagogy as praxis'. radical pedagogy. http://radicalpedagogy.icaap.org/content/issue _ /breunig.html.
giroux, henry a. ( ). 'rethinking the boundaries of educational discourse: modernism, postmodernism, and feminism'. margins in the classroom: teaching literature. myrsiades, kostas, myrsiades, linda s. (eds.). minneapolis: university of minnesota press, pp. - .
schreibman, susan, siemens, ray, unsworth, john (eds.) ( ). a companion to digital humanities.
malden: blackwell.
jessop, martyn ( ). 'teaching, learning and research in final year humanities computing student projects'. literary and linguistic computing. . ( ): - .
mccarty, willard, kirschenbaum, matthew ( ). 'institutional models for humanities computing'. literary and linguistic computing. . ( ): - .
unsworth et al. ( ). our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies.

natural language processing for historical texts
michael piotrowski (leibniz institute of european history)
morgan & claypool (synthesis lectures on human language technologies, edited by graeme hirst, volume ), , ix+ pp; paperbound, isbn - .

hal id: hal-
https://hal.inria.fr/hal-
submitted on jun

laurent romary

to cite this version: laurent romary. natural language processing for historical texts michael piotrowski (leibniz institute of european history) morgan & claypool (synthesis lectures on human language technologies, edited by graeme hirst, volume ), , ix+ pp; paperbound, isbn - ..
computational linguistics, massachusetts institute of technology press (mit press), , ( ), pp. - . ⟨hal- ⟩

book review

natural language processing for historical texts
michael piotrowski
leibniz institute of european history
morgan & claypool (synthesis lectures on human language technologies, edited by graeme hirst, volume ), , ix+ pp; paperbound, isbn -

reviewed by laurent romary
inria & humboldt university berlin

the publication of a scholarly book is always the conjunction of an author’s desire (or need) to disseminate his experience and knowledge and the interest or expectations of a potential community of readers to gain benefit from the publication itself. michael piotrowski has indeed managed to optimise this relation by bringing to the public a compendium of information, which i think has been eagerly awaited by many scholars having to deal with corpora of historical texts. the book covers most topics related to the acquisition, encoding and annotation of historical textual data, seen from the point of view of their linguistic content. as such, it does not address issues related, for instance, to scholarly editions of these texts, but conveys a wealth of information on the various aspects where recent developments in language technology may help digital humanities projects to be aware of the current state of the art in the field.

still, the book is not an encyclopedic description of such technologies. it is based on the experience acquired by the author within the corpus development projects he has been involved in, and reflects in particular the specific topics on which he has made more in-depth explorations. it is thus written more as a series of reports on experience than as a systematic resource to which one would want to return after its initial reading.

the book is organized as a series of short chapters, which i describe below.
in the first two (very short) chapters, the author presents the general scope of the book and provides an overview of the reasons why nlp has such an entrenched position in digital humanities at large and the study of historical text in particular. citing several prominent projects and corpus initiatives that have taken place in the last few decades, piotrowski defends the thesis, which i share, that a deep understanding of textual documents requires some basic knowledge of language processing methods and techniques. chapter in particular (“nlp and digital humanities”) could be read as an autonomous position paper, which, independently of the following chapters, presents the current landscape of infrastructural initiatives and scholarly projects that shape this convergence between the two fields.

chapter (“spelling in historical texts”; pp. – ) describes the various issues related to spelling variations in historical text. it shows how difficult it may be to deal with both diachronic (e.g. in comparison to modern standardised spellings) and synchronic variations (degree of stabilisation of historical spellings), especially in the context of the uncertainty brought about by the transcription process itself. this is particularly true for historical manuscripts and piotrowski goes deeply into this, showing some concrete examples of the kind of hurdles that a scholar may run into. this is the kind of short introduction i would recommend for anyone, in particular students, wanting to gain a first understanding in the domain of historical spelling.

computational linguistics volume , number

chapter is the longest chapter in the book (“acquiring historical texts”; pp. – ) and covers various aspects of the digitization workflow that needs to be set up for creating a corpus of historical texts. the chapter is quite difficult to read as one single unit because of its intrinsic heterogeneity.
indeed, it covers quite a wide range of topics: presentation of existing digitisation projects worldwide, technical issues related to scanning, comparison of various optical character recognition systems for various types of scripts, the potential role of lexical resources, crowd-sourcing for ocr post-processing, manual or semi-automatic keying. getting an overview of the various topics is even more difficult because of the way the author has followed his own personal experience, and alternates between general considerations and in-depth presentations of concrete results. pages – , for instance, form one single subsection on the comparison of ocr outputs that goes into so much detail that it breaks the continuity of the argument, although in itself this subsection could be really interesting for a specialized reader. as we shall see in the conclusion, this chapter illustrates that the content of this book would benefit from being published in a more modern and open setting.

data representation aspects are covered in chapter (“text encoding and annotation schemes”; pp. – ), which tackles two specific issues, namely character and document encoding. on these two, the author presents what could be considered best practices. for character encoding, the book rightly focuses on the advantages that the move towards iso /unicode has brought to the community. the corresponding subsection actually covers three different aspects: it first makes an extensive presentation of the history of character encoding standards (from ascii/iso to unicode/iso ), provides insights into the current coverage and encoding principles (e.g. utf- vs. utf- ) of iso , and finally focuses on the specific difficulties occurring in historical texts both from the point of view of legacy ascii based transcription languages and the management of characters that are not present in unicode.
although well documented, these three topics should have been more clearly separated so that readers interested in one or the other could directly refer to it. this is a typical case where, given the great expertise of the author in the subject, i could imagine the corresponding texts being published online as separate entries in a blog. the second half of the chapter focuses on the role of the tei guidelines for the transcription and encoding of historical text. it covers the various representation levels that may be concerned (metadata, text structure, surface annotation) and insists on the current difficulty of linking current nlp tools to tei encoded documents. while this is indeed still an issue in general, it might have been interesting to refer to standards (iso – maf) and initiatives (textgrid core encoding at the token level; the txm platform for text mining) that have started to provide concrete sustainable answers to the issue.

the following chapter (“handling spelling variations”; pp. – ) provides a series of short studies describing possible methods for dealing with ocr errors or spelling variation as described in chapter . independently from the fact that i find it strange to see the two chapters quite far from one another, the present one distinguishes itself with its profound heterogeneity. while several sections do have the most appropriate level of detail and topicality for historical texts (in particular those on canonicalization), some sections seem to be completely off-topic (section . , “edit distance”, describes what i would consider as background knowledge for such a book). it is all the more disappointing that the author shows here a very high level of expertise and, as in the case of chapter , i would strongly recommend the reading of the relevant sections to newcomers in the field.

in contrast with the previous chapter, chapter (“nlp tools for historical languages”; pp. – ) is more coherent and focused.
it mainly addresses the morpho-syntactic analysis of historical text and presents, through concrete deployment scenarios, possible methods to constrain the appropriate parsers, in a context where hardly any existing tools can be simply re-used. the chapter is very well documented and refers to most of the relevant initiatives in the domain of morphology for historical text, at least on the european scene. this focus may also be misleading since recent work on named entity recognition on historical texts is not mentioned at all and is probably, in my view, one of the most promising directions for enhanced digital scholarship.

the last chapter (“historical corpora”; pp. – ) is a compendium, sorted by language, of the major historical corpora available worldwide. it shows the dynamic that currently exists in the community and is an essential background resource to both understanding who is active in maintaining historical corpora and discerning the most relevant resources. the chapter as a whole provides an interesting “historical” perspective on the progress made by most text-based projects in using the tei guidelines as their reference standard. it seems quite difficult now to imagine an initiative which would not take tei for granted, and would not build inside the tei framework. on another issue, namely copyright, piotrowski also provides an interesting analysis on the difficulty of re-using old editions which have been recently re-edited on paper, and thus fall into some publisher’s copyright restrictions. the conclusion could have been a little tougher here though, and probably should have recommended putting a hold on any paper publication of historical sources by a private publisher, unless it is guaranteed that the electronic material can be used freely, under an appropriate open license.

as a whole, the book leaves the reader with a mixed feeling of enthusiasm and disappointment.
enthusiasm, because the content is so rich that it should serve as background reference (and indeed be quoted) for any further work on the creation, management and curation of historical corpora. still, i cannot help thinking that the editorial setting as a book is not the most appropriate setting for such content. the variety of topics that are addressed as well as the heterogeneous levels of details provided through the different chapters would benefit from a more fragmented treatment. indeed, this would be the perfect content for a series of blog entries (for instance in a scholarly blog such as those on the hypotheses.org platform) which in turn would allow an interested reader to discover exactly the topics he wants information about and cite the corresponding entries. with the bibliography in zotero and relevant pointers to the corresponding online corpora or tools, i could imagine the resulting content soon becoming one of the most cited online resources. i am sure the author would gain more visibility in doing so than having the material hidden on a library shelf or behind a paywall. not knowing the exact copyright transfer agreement associated with the book, i cannot judge if it is too late for the author to think in these terms, but this could be a lesson for scholars who are now planning to write such an introductory publication. is the book still the best medium?

this book review was edited by pierre isabelle

laurent romary is directeur de recherche at inria, france, and guest scientist at the humboldt university in berlin. he has been involved for many years in language resource modeling activities and in particular in standardization initiatives in the tei consortium and iso committee tc /sc (language resource management). he is the director of the european dariah digital infrastructure in the humanities. email: laurent.romary@inria.fr.

doi: . /rlt.v .
research article different views on digital scholarship: separate worlds or cohesive research field? juliana e. raffaghelli, stefania cucchiara, flavio manganello and donatella persico* institute for educational technology, national research council of italy, genova, italy (received april ; final version received november ) this article presents a systematic review of the literature on digital scholarship, aimed at better understanding the collocation of this research area at the crossroad of several disciplines and strands of research. the authors analysed articles in order to draw a picture of research in this area. in the first phase, the articles were classified, and relevant quantitative and qualitative data were analysed. results showed that three clear strands of research do exist: digital libraries, networked scholarship and digital humanities. moreover, researchers involved in this research area tackle the problems related to technological uptake in the scholar’s profession from different points of view, and define the field in different � often complementary � ways, thus generating the perception of a research area still in need of a unifying vision. in the second phase, authors searched for evidence of the disciplinary contributions and interdisciplinary cohesion of research carried out in this area through the use of bibliometric maps. results suggest that the area of digital scholarship, still in its infancy, is advancing in a rather fragmented way, shaping itself around the above-mentioned strands, each with its own research agenda. however, results from the cross-citation analysis suggest that the networked scholar- ship strand is more cohesive than the others in terms of cross-citations. keywords: digital scholarship; digital humanities; networked scholarship; digital libraries; systematic review . 
*corresponding author. email: persico@itd.cnr.it
research in learning technology vol. ,
research in learning technology . © j.e. raffaghelli et al. research in learning technology is the journal of the association for learning technology (alt), a uk-based professional and scholarly society and membership organisation. alt is registered charity number . http://www.alt.ac.uk/. this is an open access article distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), allowing third parties to copy and redistribute the material in any medium or format and to remix, transform, and build upon the material for any purpose, even commercially, provided the original work is properly cited and states its license.
citation: research in learning technology , : - http://dx.doi.org/ . /rlt.v . (page number not for citation purpose) http://www.researchinlearningtechnology.net/index.php/rlt/article/view/
1. introduction
the digital era is challenging all knowledge workers to develop new skills and literacies to work effectively within digital spaces (goodfellow ). the academic profession is no exception (borgman ; pearce et al. ; weller ), as digital technology offers unprecedented affordances to improve both research and teaching performance. the concept of digital scholarship emerged early in the 21st century (andersen and trinkle ; ayers ) and, according to wikipedia, refers to the use of information and communication technology to achieve scholarly and research goals. among the scholars’ activities that take advantage of technological affordances are: collecting evidence, carrying out investigations and research, publishing and disseminating results and preserving and making available outcomes. however, a fly-through of the landscape of digital scholarship reveals that, although the term has become quite
popular, it does not seem to have a widely agreed definition across the research disciplines that have contributed to its evolution. according to borgman ( ), for example, the concept of digital scholarship is tightly connected to the discourse about cyber infrastructures supporting new forms of doing research and science, namely eresearch and escience, which involves the progressive digitisation of institutional infrastructures and impacts on scholars’ practices in dealing with information and communication processes. borgman’s work, in fact, is deeply rooted in the field of information science whose primary aim is to improve the way libraries curate digital content and support scholarly work of all subject areas. this field of work also deals with the way scholars use the libraries’ digital facilities to increase their reputation (andersen ; holliman ; quigley et al. ; zhao ). at the same time, an important role in this field of research has been played by social science scholars who work at the crossroads between the humanities and digital technologies, thus identifying a new field of research, the digital humanities (terras, nyhan, and vanhoutte ), which is also strictly related to digital scholarship. as these authors point out, ‘digital humanities as a term (...) provides a big tent for all digital scholarship in the humanities’ (p. ). these scholars have worked intensely to define the borders of this field of research (unsworth ), which embraces both the theory and the practices concerning the new forms of representation of cultural heritage, including history, arts and literature, through the digital medium (bentkowska-kafel ; gardiner and musto ; kaltenbrunner ). moreover, the term ‘digital humanities’ encompasses the area of debate about changing research methods and required professionalism in the humanities and the interdisciplinary dialogue with digital technologies (klein ).
under the influence of the ideas of open science and open access (den besten, david, and schroeder ; suber ), the interest in the concept of digital scholarship has spread to social science researchers interested in investigating the complexities of the technological uptake by institutions and users as a cultural and social phenomenon. socio-technical studies played a highly important role in this case by expanding the focus of digital scholarship research in a direction different from those described above (borgman , p. ). this strand of research relates to academics’ professional learning and identity in the digital era and is tightly connected to educational technology research. its focus is on the ways scholars strive to do (practices) and to be (identity) in the changing context of higher education, which pushes them, sometimes in rather conflictive and contradictory ways, to keep pace with innovations in digital, open and networked contexts (goodfellow ). the conundrum of opening up science and education is hereby faced through the exploration of professional learning by open, digital and networked scholars, who not only adopt technologies as a means but also reflect on the nature and ethics of research, through their deontological position, and create new scenarios of practice (costa ; scanlon ; veletsianos and kimmons b). this approach aligns with socio-technical studies going beyond technological determinism (pearce et al. ). for this group of researchers, the research problems of digital scholarship are connected with the adoption of social media to do and share research, social scholarship (greenhow and gleason ; manca and ranieri ; veletsianos ); with emerging forms of reputation based on general and bespoke media tools (nicholas, herman, and jamali ; weller ); with fluid processes of collaborative research entailing interdisciplinary dialogue, teaching and dissemination (veletsianos and
kimmons a); and with a vision of open science that engages public audiences in the making of science, by extending the forms of participation along with the research process (grand et al. ). the whole debate is connected to the need for improving scholars’ literacy to participate in digital, networked and open contexts of scholarship (goodfellow and lea ; veletsianos and kimmons b). the work of this group of researchers is rooted in the model by boyer ( ) of the academic profession and suggests that boyer’s four dimensions (discovering, integration, application and teaching) are being enhanced and transformed by openness and networking, thus creating new professional ways of collaboration across geographical and institutional frontiers based on the affordances provided by web 2.0 (greenhow and gleason ; nicholas, herman, and jamali ; weller ).
the above picture lets us appreciate that digital scholarship is a complex research area, guided by different research aims, rooted in several conceptual and methodological bases, and informed by diverse disciplinary traditions. moreover, it appears that the concept of digital scholarship is rather fuzzy, embracing different concerns and using a variety of research methods, professional practices and scholars’ identities. this blurred picture stimulated the authors of this article to analyse the literature on the topic in order to identify more clearly the different areas of research involved and better understand their relative importance, the reciprocal influences, the common concerns and the specificities, in terms of the problems tackled, the topics dealt with and, more generally, the interplay of the disciplines involved.
to this end, a systematic review of the literature on digital scholarship has been carried out, complemented with bibliometric maps aimed at revealing and investigating the main views on digital scholarship, the keywords used and the extent to which they build upon each other’s results. the research aim is to explore whether, and to what extent, the emerging landscape depicts a unitary and cohesive research topic, or a fragmented disciplinary vision. as a result, our study should contribute to inform the evolution of this research topic, clarifying the areas where there is a need for better convergence of research problems and questions, and of connected constructs and methodological approaches.
2. methodological approach
set out as a classic systematic literature review (petticrew and roberts ), this study encompassed an initial identification of a significant sample of publications concerning the field of digital scholarship, followed by the construction of a database where such publications are classified according to relevant categories. then, bibliometric maps have been used to identify the relationships between the papers and to spot existing agglomerates, corresponding to different strands of research. both the systematic review and the bibliometric maps were adopted to explore the relationships between the three strands of research identified, namely digital libraries, networked scholarship and digital humanities. we searched for juxtapositions in the classification of research areas, the research aims, the methodological approaches adopted, the citations between contributing authors and the most frequently used concepts (keywords) to achieve a better picture of digital scholarship as a research topic. in the following subsections, we will describe the sample, the data collection process and the methods adopted for the data analysis.
2.1. sample selection
the sample analysed comprised papers of relevant scholarly literature published during the period january –march . the sample, derived from the initial exploration of six specialised databases, namely web of science (wos; % of all papers were found in this database), scopus ( % of papers), the directory of open access journals (doaj; %), educational resources information centre (eric; %), editlib digital library ( %) and google scholar ( %), was arrived at through a search for the term ‘digital scholarship’ in the title, the abstract and the keywords. the search yielded papers, which were filtered by eliminating (1) duplicated papers, due to overlaps between databases; (2) papers with full text in languages other than english; (3) pieces of work other than research papers (reports, position papers, magazine articles, etc.); and (4) proceedings papers. technically, the authors searched for pieces of work representing consolidated research, thereby highlighting phenomena as well as conceptualisations that have passed a rigorous process of evaluation. this process led to the sample of journal papers indexed by at least one of the above-mentioned databases. the complete information of every article is documented in annex – references used for the review.
2.2. data collection process and analysis
2.2.1. first phase: classification of articles
according to the systematic review approach, the next step consisted in defining the structure of a database destined to host the relevant information about the papers. the database records were structured as reported in table 1, according to a procedure previously used elsewhere by raffaghelli, cucchiara, and persico ( ). table 1 shows the dimensions that the researchers deemed relevant for the analysis of the field of digital scholarship.
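as a rough illustration of the record structure and of the filtering steps listed in the sample selection, the following python sketch may help. it is not the authors' tooling: the class name `PaperRecord`, the fields `language` and `doc_type`, and the title-plus-year duplicate key are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PaperRecord:
    """Hypothetical record loosely mirroring the article identification fields."""
    title: str
    journal: str
    authors: list[str]
    year: int
    keywords: list[str] = field(default_factory=list)
    indexed_in: set[str] = field(default_factory=set)  # e.g. {"wos", "scopus"}
    language: str = "en"                # assumed field, used for the language filter
    doc_type: str = "journal article"   # assumed field, used for the type filters


def normalise_title(title: str) -> str:
    """Lowercase the title and collapse punctuation and whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def select_sample(records: list[PaperRecord]) -> list[PaperRecord]:
    """Merge duplicates found in several databases, then keep English journal papers."""
    merged: dict[tuple[str, int], PaperRecord] = {}
    for rec in records:
        key = (normalise_title(rec.title), rec.year)  # assumed duplicate criterion
        if key in merged:
            # Same paper found in another database: remember both sources.
            merged[key].indexed_in |= rec.indexed_in
        else:
            merged[key] = rec
    return [
        rec for rec in merged.values()
        if rec.language == "en" and rec.doc_type == "journal article"
    ]
```

merging rather than discarding duplicates keeps the information about which databases index each paper, which is what the ‘scientific database’ field of the record is meant to capture.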
while the way the first three data fields of each record were filled in does not require further discussion, since researchers only had to report the data as found in the paper or in database sources, fields 4 and 5 require some additional explanations. field 4, corresponding to the dimension ‘view on digital scholarship’, refers to the three main perspectives on digital scholarship research described in the introduction of this paper. the first one was ‘networked scholarship’ and included all the papers that adopted social networks and other informal methods to disseminate research and teaching, as well as those that dealt with open science and open educational resources. the second one was ‘digital libraries’ and included papers analysing the digital infrastructures and their affordances, and the stakeholders’ policies with regard to them. the third category was ‘digital humanities’ and included papers on new research methods to capture or represent research objects within the humanities. although these three categories are consistent with the trends outlined in the analysis of the literature, in principle, some papers may simultaneously belong to two or even all three of the above-mentioned categories. for this reason, four hybrid categories were also created. however, no papers were found to lie at the intersection of all three categories. as for the ‘research approach’, the sub-field ‘research topic’ was an open field, and it was processed through a ‘thematic analysis’ procedure (guest, macqueen, and namey ), a widely used qualitative research method based on an inductive approach.
table 1. database fields and values assigned.
data field | data sub-field | assigned values | type of data
1 – article identification | article title | title as published | text/as found in source paper
1 – article identification | journal title | | text/as found in source paper
1 – article identification | author(s) | author(s) name and surname | text/as found in source paper
1 – article identification | publication date | year | text/as found in source paper
1 – article identification | key words | keywords as published | text/as found in source paper
1 – article identification | abstract | full abstract as published | text/as found in source paper
2 – scientific database | wos (https://apps.webofknowledge.com/); scopus (http://www.scopus.com/); eric (http://eric.ed.gov/); doaj (https://doaj.org/); editlib (http://www.editlib.org/); google scholar (https://scholar.google.it/) | ‘the article is indexed in the database’ true/false ( / ) | value/as found
3 – research area on the scientific database | research area on scientific database | classification of research as extracted from the scientific database | text/as found
4 – view on ds | | classification of research taking into consideration the research area as well as the theoretical approach: a. ns; b. dh; c. dl; d. ns/dl; e. ns/dh; f. dl/dh; g. dl/dh/ds | text-label/upon researcher’s interpretation
5 – research approach | research topic | the topic of research, if declared | text/upon researcher’s interpretation
5 – research approach | research aim | the guiding research purpose | text/upon researcher’s interpretation
dh, digital humanities; dl, digital libraries; ds, digital scholarship; ns, networked scholarship.
the full text of the articles under analysis was explored by two researchers according to the following procedure: (1) the research topics were extracted by one researcher who created ‘subcategories’ in a first round of classification of articles (free codification); (2) the results of phase (1) were shared between the two researchers (member-checking); (3) both researchers independently coded five papers
using the agreed-upon sub-categories and the inter-rater analysis was carried out; and (4) both researchers proceeded with the classification by adopting the existing sub-categories as themes covering one or more free codes, which in this case represent the research topics. in order to deal with possible biases in the researchers’ judgement of database fields 4 and 5, the classification of the papers consisted of three steps: the first step of joint ‘training’ was followed by the second step, where both researchers independently classified the same five articles ( % of the whole sample), and the third step, where the inter-rater agreement between the two raters was calculated. the inter-rater percentage of agreement was %. cohen’s kappa coefficient was also calculated, obtaining a value of . , which can be considered a high level of agreement (hayes and krippendorff ). controversial cases were then discussed until a consensus was reached.
2.2.2. second phase: bibliometric maps production and analysis
while the first phase of this study was meant to allow the authors to identify the main areas of investigation, the focus of the studies on digital scholarship and the type of research carried out, the second phase was based on bibliometric maps and aimed at investigating the relationships amongst the disciplinary perspectives. bibliometric maps are a form of representation of scientific networks (van eck and waltman ) used in scientometrics as a means to understand connections between researchers and their work.
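the two inter-rater measures reported for the first phase, percentage of agreement and cohen’s kappa, can be computed as in the minimal python sketch below. the labels in the usage note are invented for illustration and are not the authors’ coding data; the function names are likewise assumptions.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items to which both raters assigned the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_observed = percent_agreement(rater_a, rater_b)
    n = len(rater_a)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters pick the same label
    # if each labelled independently according to their own label frequencies.
    labels = set(rater_a) | set(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)
```

for two hypothetical raters labelling five papers as `["ns", "ns", "dl", "dh", "dl"]` and `["ns", "dl", "dl", "dh", "dl"]`, the observed agreement is 0.8, the chance agreement is 0.36, and kappa is 0.44 / 0.64 = 0.6875. kappa is lower than raw agreement because part of the agreement would be expected by chance alone.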
bibliometric maps are based on three main elements: statistical analysis of written publications (often including text and data mining); different methods of visualisation (distance-based, graph-based and timeline-based); and digital tools supporting analysis and visualisation. bibliometric maps are graphs consisting of nodes and edges; while the nodes may represent publications, journals, researchers or keywords, the edges represent relationships between the nodes. depending on the type of nodes, the focus of analysis and the resulting map differ. the most frequent types of relationship studied through bibliometric maps are: citations among papers (to explore connections between publications), co-authorship relations (to explore connections inside a network of researchers) and keyword co-occurrences (providing information about the distribution of topics) (van eck et al. ). some forms of visualisation explore static relationships, highlighting groups (clusters) of nodes that are ‘closer’, while others explore their evolution in time. in this research, bibliometric maps were used to analyse the sample of papers in order to: (1) study the keywords characterising the field and differentiating agglomerates of papers and their relationships (i.e. central/peripheral, related/not related). the operational hypothesis guiding this analysis was that the three main groups of keywords, respectively, connected to the three digital scholarship views would emerge as clusters within the semantic universe connected to the construct of digital scholarship, and (2) study the relationships between bibliographic items in terms of citations. the operational hypothesis here was that the distinction between the three ‘views’ on digital scholarship would be reflected by intense cross-citations within clusters of papers and few cross-citations between papers of different clusters.
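to make the notion of a keyword co-occurrence map concrete, the python sketch below counts how often two keywords appear in the same paper; each pair becomes an edge and its count an edge weight. this is a simplified illustration under the assumption that keywords are already available per paper. it does not reproduce the term extraction, normalisation or clustering performed by dedicated bibliometric software, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(papers):
    """Count, for every pair of keywords, the number of papers containing both.

    `papers` is an iterable of keyword collections, one per paper.
    Returns a Counter keyed by alphabetically ordered keyword pairs.
    """
    counts = Counter()
    for keywords in papers:
        # De-duplicate within a paper and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts
```

with three invented papers tagged `{"library", "open access"}`, `{"library", "open access", "network"}` and `{"network", "twitter"}`, the pair `("library", "open access")` receives a weight of 2 and every other pair a weight of 1. keywords linked by heavier edges would tend to end up in the same cluster of a map.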
after a careful analysis of existing tools for bibliometric maps analysis and visualisation, the authors selected two software tools to carry out this phase of the study: vosviewer, for the analysis and visualisation of keywords, and citnetexplorer, for the analysis and visualisation of cross-citations. in the case of the cross-citations bibmap, the original sample consisting of the papers studied in the first phase was supplemented with three books (borgman ; boyer ; weller ), since they were highly cited by the papers. furthermore, these books were perceived to be relevant to define the background and hence the relationships between the views.
3. results
3.1. first phase: ‘characterising disciplinary contributions to digital scholarship’
this section presents the results of the first phase of work, the systematic review. figure 1 shows that digital scholarship appears to be a fairly recent field of research, dating back to (although rooted in previous literature on scholarship), featuring a significant increase in papers on scientific productivity in the years and (the yearly number of papers almost doubled between and and doubled again in ); this highlights a fast emerging field of research. in addition, the papers are well distributed amongst several journals belonging to different subject areas, which confirms the relevance of the topic for different disciplines. figure 2 shows the distribution of research topics as they emerged from the ‘thematic analysis’ procedure described in the methodological approach section.
we note the prevalence ( %) of the group of papers dealing with the issue of (academic) professional practices tightly connected to educational research; these papers deal with research in the field of higher education and focus on digital scholarship as a problem of professional learning and innovation. this is followed by a number of papers ( %) concerning the themes of ‘openness, democratisation of education’ and the ‘participatory culture’ of the web. less represented is the topic of ‘digital identity, interaction, social networks (sns) and social media’. besides, ‘e-publishing and library services’ ( %), as well as ‘digital art and history’ ( %) deserve interest. the least represented topics are ‘e-science and information and communication technologies (ict)’ and ‘multimedia and innovation’ ( %).
during this process of analysis, the researchers observed co-occurrences of keywords between papers dealing with the topics: ‘professional practices, educational practices’; ‘openness and democratisation, participatory culture’; and ‘digital identity, interaction, sns and social media’. these three topics together represent % of the sample. besides, the topics of ‘e-publishing, library services’ shared several keywords with ‘e-science and ict, multimedia, innovation’, representing together % of the sample. lastly, the topic of digital art and history appeared to be a stand-alone category.
[figure 1. evolution of scientific production in the field: articles on ds per year ( is excluded because data were collected only for the first semester).]
[figure 2. topics of research. legend labels recoverable from the chart: e-science and ict, multimedia, innovation; identity, interaction, sns and social media; professional practices, educational practices; e-publishing, library services; openness and democratization, participatory culture.]
[figure 3. views on digital scholarship. categories shown: dl/ns, ns, dh, dh/ns, dl, dh/dl/ns, dh/dl.]
the above situation relating to research areas as well as research topics revealed that the expected three main views were present in the sample: the view of digital scholarship as a networked process of collaboration on the open web connected to the scholars’ endeavour to transform their own practices with a new deontology of scholarship; the view of digital infrastructures (libraries) leading researchers to adopt new affordances to do their work and hence requiring professional interventions to organise new, complex technological contexts; and the view of digital humanities as a strand of research focused on technological settings and objects supporting research in the humanities, but also transforming it. it was observed that most papers falling in the research area of ‘social sciences’ belonged to the first group; that in the case of the second group, the papers could be placed amongst the two research areas ‘information sciences’ and ‘computer sciences’; and in the third case, the papers were distributed between ‘computer sciences’ and ‘humanities’, showing the separations in the disciplines contributing to the ‘views on digital scholarship’. figure 3 illustrates the distribution of papers per ‘view’, including papers with overlapping or ‘mixed’ visions.
according to figure 3, most papers in our sample are distributed between the dominant visions of networked scholarship ( %) and digital libraries ( %), with less presence of the digital humanities ( %). only % of the examined papers are ‘hybrids’ and simultaneously belong to two visions. no paper belongs to the intersection of the three visions. the set of papers belonging to the field of digital humanities, besides being smaller, is also more isolated than the other two ( % overlap with the other two).
3.2. second phase: ‘exploring disciplinary relationships within digital scholarship’
3.2.1. the map of keywords co-occurrences
the map of co-occurrences of keywords is a representation based on the number of occurrences of keywords within the ‘corpus’ of terms extracted from all the titles, keywords and abstracts of the articles within the sample. the software vosviewer extracts all the ‘noun-phrases’ from the corpus; therefore, the terms are organised by topics automatically generated by the software, namely the keywords. in this case, from the original corpus, the software extracted , relevant keywords from a sample of , terms. a total number of nodes emerged; however, only ( %) of these keywords are considered by the software for representational purposes. moreover, the authors removed irrelevant or ambiguous keywords from the representation, such as overly general terms (e.g. issue, author, purpose, role, publishing, scholar, academic, survey) or terms which conditioned the visualisation of a cluster, such as teaching, publication, implication, challenge and collaboration. the final representation, composed of keywords/nodes, is shown in figure 4, where three bigger clusters and two smaller clusters are identifiable. table 2 introduces the details of keywords for each cluster, while in the second column, we have associated each cluster with the relevant perspective.
cluster 1 (in red at the top of figure 4) contains eight nodes, with ‘library’ as its main node, and was connected semantically with the view of digital libraries research, focusing on the role of infrastructures allowing new ways of scientific production, the problem of preserving and using content, and the role of libraries and librarians in scientific information. on the right-hand side in figure 4 (in green), cluster 2, with the central word ‘network’, contains eight nodes. this cluster appears to be semantically connected with the perspective of ‘networked (and open) scholarship’ seen as scholars’ endeavour to embrace the social web with all its affordances to promote new practices (such as opening up education and research) in line with a new deontology of public engagement. cluster 5 (in purple), in between clusters 1 and 2, features ‘openness’ as the main node and relates to the issue of open access to content and to open scholarship as professional practice. hence, this small cluster brings some evidence of the existence of contaminations between the digital libraries and the networked scholarship perspectives. the third biggest cluster, cluster 3 (in blue), at the bottom-right of the map, is composed of seven nodes, and its main nodes are ‘humanities’ and ‘collaboration’. we can assume that this cluster aligns with the perspective of digital humanities, dealing with how researchers interact with new digitised objects within the humanities as well as how the field evolves as an interdisciplinary field, between computer science and the humanities. cluster 4 (in yellow), a small cluster whose main node is ‘history’, is tightly connected with the digital humanities perspective.
in figure 4, this cluster is rather isolated, and specifically there are no nodes that can be attributed to the view networked scholarship/digital humanities, which seems to be in line with the very small overlap between these two perspectives already shown in figure 3. the clusters described above can be clearly put in relation with the ‘views’ on digital scholarship identified in the first phase of the study. one question that could be raised is whether the mere existence of clusters 1, 2 and 3 as separate clusters reflects little mutual awareness deriving from the respective disciplinary viewpoints; and whether the connections observed (cluster and ) can be regarded as a sort of starting point for interdisciplinary analysis of the topic of digital scholarship. clearly, the keywords map is not informative enough to answer this question, while the cross-citation bibmap described in the next section can shed more light on it.
[figure 4. bibliometric map of keywords (colours are attributed to nodes by vosviewer to highlight clusters).]
table 2. clusters of keywords and connected perspectives on digital scholarship.
cluster | keywords | connected perspective
1 - red | access, digital age, librarian, library, literacy, open access, tool, web | dl
2 - green | digital scholar, engagement, habitus, network, participatory web, scholarly practice, social medium, twitter | ns
3 - blue | collaboration, digital humanity, humanities, humanity, infrastructural inversion, social science, visualisation | dh/dl
4 - yellow | digital art history, digitalisation, discipline, history | dh
5 - purple | education, open scholarship, openness | ns/dl
dh, digital humanities; dl, digital libraries; ds, digital scholarship; ns, networked scholarship.
3.2.2. cross-citations bibliometric map
a cross-citation bibliometric map was built to understand the relationships between cited and citing papers, that is, to understand whether the authors built upon the work of each other. more generally, this type of map allowed us to focus on the extent to which each research strand is aware of the work of the others. the software used for this purpose (citnetexplorer) visualises the relevant publications of our sample as well as their citational relationships across a time span. in our case, the time span ( – ) is the one covered by our sample, consisting of papers plus three highly cited books (borgman ; boyer ; weller ). figure 5 shows the bi-dimensional representation of the citation network per year, organised in clusters of publications based on their citational relationships. in figure 5, a cluster is identified and its nodes highlighted. the parameter ‘minimum number of citation links’ was set at 3, which means that documents receiving fewer than three citations from other documents of the sample are not visualised in the map. this is a low value in typical bibliometric problems, but adequate for this small set of documents; in any case, the situation observed is typical of very specific research fields, as well as of the application of bibliometric indicators in the humanities and in educational research (hammarfelt ).
[figure 5. bibliometric map of cross-citations. multiple papers with the same first author can be distinguished by the year, that is, the position along the y axis.]
in figure 5, it is possible to observe one main cluster of publications, and some isolated nodes. the main cluster corresponds to papers belonging to the networked scholarship group, which are at the centre of the cluster (i.e. labelle, weller, veletsianos, costa, goodfellow), but it also includes publications belonging to ‘hybrid’ categories (i.e. wolski & richardson, pasquini, holliman from digital libraries/
no couples of papers/books with the same first author in the same year are present in the sample. research in learning technology citation: research in learning technology , : - http://dx.doi.org/ . /rlt.v . (page number not for citation purpose) http://www.researchinlearningtechnology.net/index.php/rlt/article/view/ http://dx.doi.org/ . /rlt.v . networked scholarship; najmi from digital humanities/networked scholarship ). in this cluster, the only publication which was classified as digital humanities is that by kaltenbrunner. however, it is crucial to highlight that all of these articles cite two core books in the cluster: boyer ( ) and borgman ( ). while the first author pioneered the debate on the need for revolutionising the academic profession in the very early nineties, the second has become a crucial landmark in the research about the changing cyber infrastructures supporting (and questioning) scholarship. boyer’s work is particularly considered as a model to understand academics’ professional practices. instead, borgman’s book is a pillar of the debate about the scholarly communication paradigms that the academics have to face. indeed, an analysis performed removing these two books shows a cluster of authors mainly belonging to the networked scholarship view ( ), and the rest of publications completely scattered and isolated. another important book for the scientific community exploring the topic of digital scholarship is weller ( ) that can be seen at the centre of the cluster, with less cross-citations due to the fact that it is more recent than the other two. besides, there are very few cross-citations (lateral lines) between authors within the network. this emerges from the identification of the core publications ( in total); these are publications that have at least a certain minimum number of citation relations with other core publications, taking into account that incoming and outgoing citation relations are treated identically. 
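The two filtering ideas described above (hiding documents that receive too few citations from within the sample, and keeping only "core publications") can be illustrated with a small sketch. This is not the citnetexplorer implementation: the function names, the toy document labels and the edge list are hypothetical, and the core-publication step is written here as an iterative peeling on undirected citation relations, which is one way to satisfy the definition quoted in the text.

```python
from collections import defaultdict


def filter_by_min_citations(edges, min_links):
    """Return documents cited by at least `min_links` other documents of the sample.

    `edges` is an iterable of (citing, cited) pairs; self-citations are ignored.
    """
    citers = defaultdict(set)
    for citing, cited in edges:
        if citing != cited:
            citers[cited].add(citing)
    return {doc for doc, who in citers.items() if len(who) >= min_links}


def core_publications(edges, min_relations):
    """Return documents with at least `min_relations` citation relations
    to other core documents, treating incoming and outgoing links identically.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbours[a].add(b)  # outgoing relation
            neighbours[b].add(a)  # incoming relation, counted the same way
    core = set(neighbours)
    changed = True
    while changed:  # repeat until no document falls below the threshold
        changed = False
        for doc in list(core):
            if len(neighbours[doc] & core) < min_relations:
                core.discard(doc)
                changed = True
    return core


# Hypothetical toy sample: (citing document, cited document).
edges = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "A"), ("E", "D"), ("F", "A")]

visible = filter_by_min_citations(edges, min_links=3)
core = core_publications(edges, min_relations=2)
print(sorted(visible), sorted(core))
```

In this toy sample only document A receives at least three citations, while A, B and C form the core because each keeps two relations to other core documents once the weakly connected documents are removed.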
the publications identified mainly coincide with the networked scholarship perspective identified in the prior phase. with regard to the isolated publications ( ), the situation is mixed between digital libraries and digital humanities, which means that there are few citations between these perspectives, and the work considering the construct of digital scholarship in these two areas is not cohesive.

to sum up this part of the analysis, one could have expected a citation map clearly showing the three clusters (networked scholarship, digital libraries and digital humanities), each consisting of publications reciprocally citing each other inside the same view, with fewer citations across clusters. this does not seem to be the case. however, this analysis brings to light issues that are consistent with the prior analysis. the first is the low number of cross-citations, which supports the idea that the field of research is rather fragmented and that most contributions do not take into account the three disciplinary perspectives. the exception is provided by the publications belonging to the networked scholarship perspective, that is, those that explore academic professional practices and scholars. the existence of this cluster seems to confirm that scholars who study networked scholarship are actually more ‘networked’ than the others, and the identity of this field of research should and perhaps could be built on their shoulders. however, the weak connection with the other perspectives suggests that this group could be rather unaware of the contributions coming from the other two perspectives, their problems and their research agenda; as a result, we can conclude that interdisciplinary collaboration in this area is not strong enough.

discussion and conclusions

this study was aimed at exploring and mapping a set of selected papers on digital scholarship.
most of these articles aimed to define the concept and to study related phenomena (‘in the wild’), that is, the academics’ practices and the supporting infrastructures in a digital, open and networked context of activity. at first sight, the authors observed the coexistence of several interpretations of the term, each reflecting different disciplinary research perspectives on the construct. consequently, the whole study was set up to investigate digital scholarship by identifying the main research strands involved and their relationships, including common epistemological roots, reciprocal awareness, as well as key topics and concerns of the strands and their overlaps. the results of the study show not only the fragmentation of research efforts across three main disciplinary strands of research but also a relatively low degree of cohesion inside each strand, which might be due to the early stage of development of this research field (even if the first paper dates back to , the field actually took off around ). the three main strands, networked scholarship, digital libraries and digital humanities, seem to differ in their disciplinary background (respectively, social sciences, information sciences and humanities). in spite of the isolation observed, our exploration revealed some partial overlap through the thematic analysis of keywords, as well as the bibliometric maps of keywords. this probably indicates that research problems and discourses are connected to some extent. in fact, networked scholarship is connected with some of the assumptions of digital libraries, while digital humanities seems loosely connected to digital libraries and networked scholarship.
the authors could not classify any paper at the intersection of the three. the cross-citation map shows a rather fragmented panorama, rooted in some previous seminal books, with more citations between publications of the ‘networked scholarship’ strand and a few cross-citations between publications of the other two. besides, there are only a few citations between strands and very few citations between digital libraries and digital humanities. in this regard, the cross-citation map was not completely convergent with the researchers’ manual classification and the thematic analysis: the isolation observed was even higher than expected. the above considerations confirm that the construct of digital scholarship encompasses three strands of research with a rather clear focus, and raise the question of whether there is a lack of reciprocal awareness, possibly preventing scholars from building on prior efforts towards interdisciplinary collaboration. the division between the disciplinary fields contributing to the topic of digital scholarship presented here is not new in the literature and has been pointed out by several authors (goodfellow ; quan-haase, suarez, and brown ; scanlon ). however, this analysis contributes to the discourse by highlighting both the forms of fragmentation assumed by the literature and the existing attempts to overcome this fragmentation. above all, the coexistence of different definitions of digital scholarship and the conceptual fragmentation of the field cause an entropic situation that hinders further empirical research. for example, it makes it difficult to identify what is innovative and to put forward recommendations for practice (e.g. proposals for the training of scholars) and for policy-making (e.g. prioritising efforts of investment in scholars’ career development, in supporting infrastructures and in the evaluation systems based on scientific productivity).
another important issue relates to the values attached to the research undertaken across the three ‘views’ of digital scholarship. while many studies, particularly within the networked scholarship perspective, focus on the positive ethical value of open scholarship, based on avant-garde practices and pioneering scholars, other studies bring to light the lack of participation of scholars in innovative practices, emphasising the limited concern about the need to change the practices of scholarship as well as the frictions between innovation and tradition in academic research and teaching (costa ). solving these issues requires an increase in the level of awareness among scholars, the adoption of convergent research methods and visions of the field, and more interdisciplinary dialogue between researchers.

notes
http://www.vosviewer.com/
http://www.citnetexplorer.nl/

references
andersen, d. l., ed. ( ) digital scholarship in the tenure, promotion, and review process, m.e. sharpe, london.
andersen, d. l. & trinkle, d. ( ) ‘valuing digital scholarship in the tenure, promotion, and review process – a survey of academic historians’, in digital scholarship in the tenure, promotion and review process, ed. d. andersen, m.e. sharpe, london, pp. – .
borgman, c. l. ( ) scholarship in the digital age, mit press, cambridge.
boyer, e. l. ( ) scholarship reconsidered: priorities of the professoriate, vol. , carnegie foundation for the advancement of teaching, wiley, new york.
den besten, m., david, p. & schroeder, r. ( ) ‘research in e-science and open access to data information’, in international handbook of internet research, eds. j. hunsinger, l. klastrup & j. allen, london, new york, springer, pp. – . doi: . / - - - -
gardiner, e.
& musto, r. g. ( ) the digital humanities: a primer for students and scholars, harvard university press, cambridge, ma.
goodfellow, r. & lea, m. ( ) literacy in the digital university: critical perspectives on learning, scholarship, and technology, routledge, london.
grand, a., et al. ( ) ‘open science: a new “trust technology”?’, science communication, vol. , no. , pp. – . doi: http://dx.doi.org/ . /
greenhow, c. & gleason, b. ( ) ‘social scholarship: reconsidering scholarly practices in the age of social media’, british journal of educational technology, vol. , no. , pp. – . doi: http://dx.doi.org/ . /bjet.
guest, g., macqueen, k. m. & namey, e. e. ( ) applied thematic analysis, vol. , sage, london.
hammarfelt, b. ( ) ‘using altmetrics for assessing research impact in the humanities’, scientometrics, vol. , no. , pp. – . doi: http://dx.doi.org/ . /s - - -
hayes, a. f. & krippendorff, k. ( ) ‘answering the call for a standard reliability measure for coding data’, communication methods and measures, vol. , no. , pp. – . doi: http://dx.doi.org/ . /
klein, j. t. ( ) interdisciplining digital humanities: boundary work in an emerging field, university of michigan press, ann arbor, mi. [online] available at: http://quod.lib.umich.edu/cgi/t/text/text-idx?cc=dh;c=dh;idno= . . ;rgn=full% text;view=toc;xc= ;g=dculture
manca, s. & ranieri, m. ( ) ‘“yes for sharing, no for teaching!”: social media in academic practices’, the internet and higher education, vol. , pp. – . doi: http://dx.doi.org/ . /j.iheduc. . .
nicholas, d., herman, e. & jamali, h. r. ( ) emerging reputation mechanisms for scholars, european commission jrc science and policy report, seville, spain. doi: http://dx.doi.org/ . /
pearce, n., et al. ( ) ‘digital scholarship considered: how new technologies could transform academic work’, in education, [online] available at: http://ineducation.ca/ineducation/article/view/ /
petticrew, m. & roberts, h. ( ) systematic reviews in the social sciences, a practical guide, blackwell, oxford, uk.
raffaghelli, j. e., cucchiara, s. & persico, d. ( ) ‘methodological approaches in mooc research: retracing the myth of proteus’, british journal of educational technology, vol. , no. , pp. – . doi: http://dx.doi.org/ . /bjet.
suber, j. p. ( ) open-access timeline, [online] available at: http://legacy.earlham.edu/~peters/fos/timeline.htm
terras, m., nyhan, j. & vanhoutte, e., eds. ( ) defining digital humanities: a reader, ashgate, london.
unsworth, j. ( ) ‘what is humanities computing and what is not?’, in defining digital humanities: a reader, eds. m. terras, j. nyhan & e. vanhoutte, ashgate, london, pp. – .
van eck, n. j. & waltman, l.
( ) ‘visualizing bibliometric networks’, in measuring scholarly impact: methods and practice, eds. y. ding, r. rousseau & d. wolfram, springer, london, pp. – .
van eck, n. j., et al. ( ) ‘a comparison of two techniques for bibliometric mapping: multidimensional scaling and vos’, digital libraries; physics and society, [online] available at: http://arxiv.org/abs/ .
weller, m. ( ) the digital scholar: how technology is transforming scholarly practice, bloomsbury academic, london.

annex – references used for the literature review
antonijević, s. & cahoy, e. s. ( ) ‘personal library curation: an ethnographic study of scholars’ information practices’, portal-libraries and the academy, vol. , no. , pp. – .
ayers, e. ( ) ‘doing scholarship on the web: ten years of triumphs – and a disappointment’, journal of scholarly publishing, vol. , no. , pp. – . doi: http://dx.doi.org/ . /jsp. . .
bennett, l. & folley, s. ( ) ‘a tale of two doctoral students: social media tools and hybridised identities’, research in learning technology, vol. , p. . doi: http://dx.doi.org/ . /rlt.v .
bentkowska-kafel, a. ( ) ‘i bought a piece of roman furniture on the internet. it’s quite good but low on polygons. – digital visualization of cultural heritage and its scholarly value in art history’, visual resources, vol. , no. – , pp. – . doi: http://dx.doi.org/ . / . .
burdick, a. & willis, h. ( ) ‘digital learning, digital scholarship and design thinking’, design studies, vol. , no. , pp. – .
christie, a. ( ) ‘interdisciplinary, interactive, and online: building open communication through multimodal scholarly articles and monographs’, scholarly and research communication, [online] available at: http://src-online.ca/index.php/src/article/view/ /
costa, c. ( ) ‘the habitus of digital scholars’, research in learning technology, vol. , p. . doi: http://dx.doi.org/ . /rlt.v i .
costa, c.
( ) ‘outcasts on the inside: academics reinventing themselves online’, international journal of lifelong education, vol. , no. , pp. – . doi: http://dx.doi.org/ . / . .
dority baker, m. l. ( ) ‘using buttons to better manage online presence: how one academic institution harnessed the power of flair’, journal of web librarianship, vol. , no. , pp. – . doi: http://dx.doi.org/ . / . .
goodfellow, r. ( ) ‘scholarly, digital, open: an impossible triangle?’, research in learning technology, vol. , p. . doi: http://dx.doi.org/ . /rlt.v .
hartelius, e. j. & mitchell, g. r. ( ) ‘big data and new metrics of scholarly expertise’, review of communication, vol. , no. – , pp. – . doi: http://dx.doi.org/ . / . .
heap, t. & minocha, s. ( ) ‘an empirically grounded framework to guide blogging for digital scholarship’, research in learning technology, [online] available at: http://www.researchinlearningtechnology.net/index.php/rlt/article/view/ /xml
holliman, r.
( ) ‘from analogue to digital scholarship: implications for science communication researchers’, journal of science communication, vol. , no. , p. c .
hswe, p. ( ) ‘what you don’t know will hurt you: a slavic scholar’s perspective on the practicality, practicability, and practice of digital scholarship’, slavic & east european information resources, vol. , no. , pp. – . doi: http://dx.doi.org/ . /j v n _
jakubowicz, a. ( ) ‘bridging the mire between e-research and e-publishing for multimedia digital scholarship in the humanities and social sciences: an australian case study’, webology, vol. , no. , [online] available at: http://www.webology.org/ /v n /a .html
kaltenbrunner, w. ( ) ‘scholarly labour and digital collaboration in literary studies’, social epistemology, vol. , no. , pp. – . doi: http://dx.doi.org/ . / . .
kaltenbrunner, w. ( ) ‘infrastructural inversion as a generative resource in digital scholarship’, science as culture, vol. , no. , pp. – . doi: http://dx.doi.org/ . / . .
kiefer, r. s. ( ) ‘digital preservation of scholarly content, focusing on the example of the clockss archive’, insights: the uksg journal, vol. , no. , pp. – . doi: http://dx.doi.org/ . /uksg.
labelle, c., anderson-wilk, m. & emanuel, r. ( ) ‘leveraging new media in the scholarship of engagement: opportunities and incentives’, journal of extension, vol. , no. , [online] available at: http://www.joe.org/joe/ december/a .php
llona, e. ( ) ‘what slavic scholars need to know about archiving digital materials in institutional repositories’, slavic & east european information resources, vol. , no. , pp. – . doi: http://dx.doi.org/ . /
manoff, m. ( ) ‘unintended consequences: new materialist perspectives on library technologies and the digital record’, portal: libraries and the academy, vol. , no. , pp. – . doi: http://dx.doi.org/ . /pla. .
martin, s.
( ) ‘collaboration in electronic scholarly communication: new possibilities for old books’, journal of the association for history and computing, vol. , no. .
najmi, a. & keralis, s. d. c. ( ) ‘engaging the twitter backchannel as digital scholarship: methods for analyzing scholarly engagement in alternative media’, [online] available at: http://digital.library.unt.edu/ark:/ /metadc /citation/
nowviskie, b. ( ) ‘a scholar’s guide to research, collaboration, and publication in nines’, romanticism and victorianism on the net, no. . doi: http://dx.doi.org/ . / ar
pasquini, l. a., wakefield, j. s. & roman, t. ( ) ‘impact factor: early career research & digital scholarship’, techtrends, vol. , no. , pp. – . doi: http://dx.doi.org/ . /s - - -
pearce, n. ( ) ‘a study of technology adoption by researchers’, information, communication & society, vol. , no. , pp. – . doi: http://dx.doi.org/ . /
pfannenschmidt, s. l. & clement, t. e. ( ) ‘evaluating digital scholarship: suggestions and strategies for the text encoding initiative’, journal of the text encoding initiative, no. . doi: http://dx.doi.org/ . /jtei.
quan-haase, a., suarez, j. l. & brown, d. m. ( ) ‘collaborating, connecting, and clustering in the humanities: a case study of networked scholarship in an interdisciplinary, dispersed team’, american behavioral scientist, vol. , no. , pp. – . doi: http://dx.doi.org/ . /
quigley, d. s., et al. ( ) ‘scholarship and digital publications: where research meets innovative technology’, visual resources, vol. , no. – , pp. – . doi: http://dx.doi.org/ . / . .
romero-frías, e. & del-barrio-garcía, s. ( ) ‘una visión de las humanidades digitales a través de sus centros’, el profesional de la informacion, vol. , no. , pp. – . doi: http://dx.doi.org/ . /epi. .sep.
scanlon, e. ( ) ‘digital futures: changes in scholarship, open educational resources and the inevitability of interdisciplinarity’, arts and humanities in higher education, vol. , no. – , pp. – .
doi: http://dx.doi.org/ . /
scanlon, e. ( ) ‘scholarship in the digital age: open educational resources, publication and public engagement’, british journal of educational technology, vol. , no. , pp. – . doi: http://dx.doi.org/ . /bjet.
stafne, e. t. ( ) ‘a view of digital scholarship in extension’, journal of extension, vol. , no. .
stewart, b. e. ( ) ‘in abundance: networked participatory practices as scholarship’, the international review of research in open and distributed learning, vol. , no. , pp. – .
thomas, w. j. ( ) ‘the structure of scholarly communications within academic libraries’, serials review, vol. , no. , pp. – . doi: http://dx.doi.org/ . / . .
trehub, a. ( ) ‘slavic studies and slavic librarianship in the united states: a post-cold war perspective (excerpts)’, slavic & east european information resources, vol. , no. , pp. – . doi: http://dx.doi.org/ . /
veletsianos, g. ( ) ‘higher education scholars’ participation and practices on twitter’, journal of computer assisted learning, vol. , no. , pp. – .
doi: http://dx.doi.org/ . /j. - . . .x
veletsianos, g. & kimmons, r. ( a) ‘networked participatory scholarship: emergent techno-cultural pressures toward open and digital scholarship in online networks’, computers & education, vol. , no. , pp. – . doi: http://dx.doi.org/ . /j.compedu. . .
veletsianos, g. & kimmons, r. ( b) ‘assumptions and challenges of open scholarship’, the international review of research in open and distance learning, vol. , no. , pp. – .
vilar, p., juznic, p. & bartol, t. ( ) ‘information-seeking behaviour of slovenian researchers: implications for information services’, international journal on grey literature, vol. , no. , pp. – .
vinopal, j. & mccormick, m. ( ) ‘supporting digital scholarship in research libraries: scalability and sustainability’, journal of library administration, vol. , no. , pp. – . doi: http://dx.doi.org/ . / . .
weller, m. ( ) ‘digital scholarship and the tenure process as an indicator of change in universities’, rusc, revista de universidad y sociedad del conocimiento, vol. , no. , pp. – . doi: http://dx.doi.org/ . /rusc.v i .
wolski, m. & richardson, j. ( ) ‘a model for institutional infrastructure to support digital scholarship’, publications, vol. , no. , pp. – . doi: http://dx.doi.org/ . /publications
zhao, l. ( ) ‘riding the wave of open access: providing library research support for scholarly publishing literacy’, australian academic & research libraries, vol. , no. , pp. – . doi: http://dx.doi.org/ . / . .
zorich, d. m. ( ) ‘digital art history: a community assessment’, visual resources, vol. , no. – , pp. – . doi: http://dx.doi.org/ . / . .
cm&r : (march) hmorn – selected abstracts

c-d - : governing access to a distributed research network’s data resources

beth l syat, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; kimberly lane, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; jeffrey s brown, phd, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; david magid, md, mph, institute for health research, kaiser permanente colorado; joe v selby, md, mph, division of research, kaiser permanente northern california; richard platt, md, ms, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; andrew nelson, mph, healthpartners research foundation

to answer many public health questions, it is essential to use information from more than one electronic data system, and efficient ways are needed to securely access and use data from multiple organizations while respecting the regulatory, legal, proprietary, and privacy implications of this data use and access. one approach centers on the development of distributed research networks that allow data owners to maintain confidentiality and physical control over their data, while permitting authorized users to ask essential questions. once such a network is fully operating and key elements are in place, sharable data resources can be made available to approved network users, under approved conditions. for instance, data from a large cohort of hypertensive patients with five years of utilization (a hypertension cohort) could be available on the network.
the following questions will need to be addressed: who can have access? under what conditions should access be granted? what policies/procedures are required? to address the specific needs associated with governance of a network’s resource(s), the authors call for the establishment of user eligibility requirements, policies to deal with funders (i.e., access rules for study funders), clear standard operating procedures, and guidelines for accessing the network. recommendations to meet those needs include: establishing data oversight policies; defining responsibilities for data resource access; defining responsibilities for data owners at each site (i.e., responding to queries when requests come in); creating standard operating procedures for the data resource; creating collaboration guidelines for external partners; and monitoring overall resource use. for the purpose of this poster, we propose to illustrate responsibilities for data owners at each site.

ps - : digital scholarship: scientific publishing at the crossroads

virginia d scobba, mls, ma, group health center for health studies, group health cooperative

background/aims: scholarly communication is the system through which research and other scholarly writings are created, evaluated for quality, disseminated to the scholarly community, and preserved for future use. the traditional formal means of interchange, publication in peer-reviewed journals, is at the core of the communication infrastructure. however, the structures and processes by which scholars communicate have undergone a major transformation in recent years with the advent of the digital age. new electronic technologies for access to information appear to be revolutionizing scholarly publishing, a change aptly captured by the term digital scholarship. current trends in the chaotic scholarly publishing market can be perceived as both opportunities for and threats to digital scholarship.
methods: digital scholarship is in a state of unprecedented upheaval as publishers, librarians, legislators, scholarly societies, scientists and other scholars engage in tactics to propel change in directions that promote their individual goals. strategies involve remodeling the publishing market, modifying academic and research institutional procedures, and influencing public policy.

results: emerging digital publishing technologies, an increasing volume of scholarly works, and decreasing satisfaction with a costly and dysfunctional economic model are changing the fundamental structure of scholarly publishing. research institutions, as well as government and funding agencies, are implementing or exploring strategies which promote free and open access to research results. these include alternative copyright arrangements, e-print archives and digital repositories.

conclusion: scholars, researchers, and society at large gain tremendous benefits from the expanded dissemination of research findings. however, several factors have impeded the progress of digital scholarship, including efforts to protect publishing revenues and profits, legal licensing restrictions, and the traditional culture of academia. it is therefore critical that the scientific community is actively engaged to ensure that the advancement of scholarship takes priority in the development of new publishing models.

ps - : developing an analytical tool for assessing the adequacy of state health information exchange laws

randy mcdonald, jd, lovelace clinic foundation; maggie gunter, phd, lovelace clinic foundation; shelley carter, rn, mph, lovelace clinic foundation; bob mayer

aims: to develop and test an analytic legislative tool that provides states with the ability to analyze and propose reform to laws related to the exchange of electronic health information.
background: through extensive research, the multi-state harmonizing security and privacy law collaborative (hsplc) found myriad barriers to health information exchange in laws and business practices. in some cases, barriers are beneficial because they protect people’s privacy. however, barriers can be problematic when they prevent the timely exchange of information needed for the treatment of patients. there are many inconsistencies in state and federal laws and among state statutes in their definitions, organizational structure, and content. some states have adopted new legislation addressing the exchange of health information that may further exacerbate differences among states and impede interstate exchange of electronic health information.

methods: hsplc developed a set of analytical tools and a narrative guide, the roadmap, to assist states in implementing an effective legal framework for the review and adoption of legislation that supports health information exchange (hie). the tools and roadmap were created through extensive research to identify best practices for identifying, evaluating, and reforming state laws related to the disclosure of electronic health information.

results: hsplc found that various state resources (legal, legislative, healthcare policy, healthcare providers, and consumers) are necessary for successful completion of the roadmap to identify opportunities for legislative reform. hsplc believes that states will have a greater likelihood of success in achieving legislative reform if they use the roadmap and reach out to other states contemplating a change in legislation. interstate collaboration and coordination are essential if we are to achieve a national legal and technical infrastructure that facilitates health information exchange.

conclusions: legislation in most states does not adequately address the exchange of electronic health information.
drafting of legislation must take into account a state’s unique environment and culture, and the needs and support of stakeholders. the goal of using the analytic tool is to protect health information while removing barriers that impede the exchange of vital information. the hsplc roadmap provides a step-by-step process to analyze and reform state legislation.

ps - : optimizing health informatics interventions from the patient’s perspective: focus group on improving safe nsaid use
douglas w roblin, phd, kaiser permanente georgia; richard m shewchuk, phd, university of alabama at birmingham; jeroan j allison, md, msc, university of alabama at birmingham; renny varghese, mph, kaiser permanente georgia; suzanne baker, mph, university of alabama at birmingham; catarina i kiefe, md, phd, university of alabama at birmingham
background: patient-provider messaging in an electronic medical record (emr) system provides an opportunity to create and sustain productive patient-provider interactions. we elicited patient perspectives on design, benefits, and concerns to improve usability and efficacy of a proposed health informatics intervention to support surveillance of, and provider feedback on, over-the-counter (otc) non-steroidal anti-inflammatory drug (nsaid) use. methods: we conducted four focus groups involving kaiser permanente georgia (kpg) adults – years old who had a medical condition for which nsaids should be used cautiously or had a recent prescription for nsaids. the focus groups elicited information regarding: otc nsaid use (including recognition of risks and side effects), design of an otc nsaid survey to be delivered via kp.org (the secure kp internet portal for patient-physician messaging), benefits and concerns about transmission of this information via electronic messaging to their primary care physicians, and willingness to participate in a health informatics intervention to improve safe nsaid use.
we then developed a concept map and classification scheme for analyzing transcripts from the focus groups. two trained coders labeled the transcripts using atlas.ti. results: forty-eight kpg adults participated in the focus groups: % female; % african american; median age years. forty-seven participants indicated current or recent use of otc nsaids. easy access to otc nsaids (low cost, no prescription required) promoted their use; however, self-medication strategies often combined multiple otc nsaids or increased otc nsaid dosing to obtain pain relief. participants acknowledged that the proposed intervention would benefit their health care through more complete reporting and documentation of otc nsaid use in their emr. concerns were expressed about: keeping this information up-to-date, whether the information would be used or (if used) evaluated by a qualified provider on their health care team, and the mode (e-mail or telephone) and timeliness with which they would be informed about potential risks from their otc nsaid use. conclusions: consistent with the chronic care model, participants acknowledged that the proposed intervention would create productive interactions with their providers and likely improve their health outcomes. their perspectives also yielded some unexpected insights (e.g., importance of timely updating of otc nsaid use) and have resulted in modifications to the overall intervention design.

ps - : research mentor: a web-based reference for planning and preparing a research proposal
ingrid glurich, phd, office of scientific writing and publication, marshfield clinic research foundation; marie fleisner, office of scientific writing and publication, marshfield clinic research foundation
background: national emphasis on interdisciplinary and translational research as research priorities has created new challenges in grantsmanship.
the office of scientific writing and publication at marshfield clinic research foundation utilized an informatics approach to create a comprehensive educational resource to assist new and established investigators engaged in research design. an interactive website accessible on the institutional intranet was designed to provide links to information, resources and support personnel to assist with navigating the central and peripheral processes required for successful procurement of institutional and external grants. methods: research mentor was designed to provide comprehensive guidance to research fundamentals in a conveniently accessible, interactive, user-friendly, online format. the website included resources and links to guidelines for grant development including feasibility analysis and study design planning, biostatistical considerations, peer review, grantsmanship, intellectual property protection, regulatory policies, computer-based training, institutional and national policies governing research, and access to funding agencies, forms, and appropriate support staff. the website was beta-tested by physicians and scientists with varying degrees of research experience and refined based on user comments before the website was launched in spring of . results: research mentor proved to be an effective orientation tool for researchers by enhancing grantsmanship skills and providing access to research resources. research mentor has been effective in linking the researchers with appropriate support personnel who offer further assistance to researchers in producing competitive proposals. tools custom-designed for research mentor to assist in project planning and design have been frequently accessed by investigators and reduce time spent by support staff on assisting with project planning.
conclusions: informatics venues such as interactive, user-friendly online educational websites can offer step-by-step guidance to research design and processes by providing a comprehensive cross-disciplinary research resource that offers value to new and established investigators alike. these tools promote networking with experienced support personnel to facilitate production of competitive grants.

ps - : use of web-based rheumatology practice visual display tools with an electronic health record (ehr)
eric d newman, md, department of rheumatology, geisinger clinic; virginia r lerch, mph, geisinger center for health research; jb jones, phd, mba, geisinger center for health research; walter f stewart, phd, mph, geisinger center for health research
background: a variety of different types of data (i.e., patient-reported, lab, imaging, clinician-documented) are required to guide and improve rheumatologic treatment decisions. although these data elements are available in the electronic health record (ehr), the demands of a busy practice do not allow sufficient time to effectively review all sources of data. moreover, the ehr does not offer a facility to bring relevant but disparate data together in an integrated visual display. we developed a novel web-based software program, rheum-pacer (patient centric electronic redesign), that displays relevant data in a web-based dashboard format. we report on the results of the first phase of implementing rheum-pacer, i.e., identifying key data elements and designing the user interface. methods: rheum-pacer is a web-based program that obtains, aggregates, and/or exchanges information from/with four sources: patients, nurses, rheumatologists, and the ehr. it is separate from, but accessed seamlessly from within, the ehr. an iterative consensus process was used to identify the data elements desired by/from each of these four sources.
results: the rheum-pacer dashboard comprises four key tabs, each of which allows the provider to complete a specific task within a single interface. the “outcomes general” tab displays parallel temporal trends of patient-reported outcomes (pro), labs, and rheumatic medications. the “outcomes composite” tab displays temporal trends of composite pro scores and physician-recorded data (e.g., tender joint counts), labs, and rheumatic medications over a -month period. the “demographics” tab visually parses rheumatic versus other diagnoses and medications and allows for entry of data not typically found in the ehr (i.e., date of first rheumatic disease diagnosis). the “construction” tab is used to construct a visit progress note. this tab incorporates pre-populated patient-reported data (e.g., events since last visit, review of systems) and ehr data (e.g., medications, lab values) and allows for entry of nurse- and physician-derived measures (e.g., physical exam, global scores). conclusions: web-based software tools that are external to, but which interact with, the ehr have the potential to improve clinical practice and clinical decision-making by providing clinicians with information that is aggregated, formatted, and presented in a way that reflects their cognitive clinical decision-making process.
pharmacoepidemiology

c-a - : computerized clinical decision support during drug ordering for long-term care residents with renal insufficiency
terry s field, dsc, meyers primary care institute; paula rochon, md, kunin-lunenfeld applied research unit; monica lee, pharmd, kunin-lunenfeld applied research unit; linda gavendo, rph, kunin-lunenfeld applied research unit; joann l baril, bs, meyers primary care institute; jerry h gurwitz, md, meyers primary care institute
objective: to determine whether a computerized clinical decision support system (cdss) providing patient-specific recommendations in real time improves the quality of prescribing for long-term care residents with renal insufficiency. design: a randomized trial within the long-stay units of a large long-term care facility. randomization was within blocks by unit type. alerts related to medication prescribing for residents with renal insufficiency were displayed to prescribers in the intervention units and hidden but tracked in control units. measurement: the proportions of final drug orders that were appropriate were compared between intervention and control units within alert categories: 1) recommended medication doses; 2) recommended administration frequencies; 3) recommendations to avoid the drug; 4) warnings of missing information. results: the rates of alerts were nearly equal in the intervention and control units: . per resident days in the intervention units and . in the control units. the proportions of dose alerts for which the final drug orders were appropriate were similar between the intervention and control units (relative risk .
, % confidence interval
the habitus of digital scholars
cristina mendes da costa (cristina.costa@uwe.ac.uk), associate professor in learning and teaching

abstract
this article concerns the participatory web and the impact it has on academic researchers’ perceptions of digital scholarship practices. the participatory web, as a space of active involvement, presence and socialisation of knowledge, has the potential to introduce significant changes to scholarly practice and to diversify it. this article draws on the findings of a narrative inquiry study that investigated the habitus of digital scholars. the study uses bourdieu’s concepts of habitus, field, and social and cultural capital as a research lens.
one of the main findings to come out of the study was that research participants’ approaches to digital scholarship practices are highly influenced by their online social capital, the online networks that influence their thinking and outlook on scholarly practices, including their advocacy of openness and transparency of academic practice. this article concludes by highlighting the dispositions digital scholars display in an attempt to characterise the values and beliefs that underpin their scholarly practices.

citation: costa, c. ( ). the habitus of digital scholars. research in learning technology, , . https://doi.org/ . /rlt.v i .
journal article type: article
acceptance date: dec ,
publication date: jan ,
journal: research in learning technology
print issn: -
publisher: taylor & francis open
peer reviewed: peer reviewed
doi: https://doi.org/ . /rlt.v i .
keywords: digital scholarship, habitus, social capital, cultural capital, the participatory web, pierre bourdieu
public url: https://uwe-repository.worktribe.com/output/
publisher url: https://doi.org/ . /rlt.v i .
related public urls: https://journal.alt.ac.uk/index.php/rlt/article/view/
licence: http://creativecommons.org/licenses/by/ .
toward a model for digital tool criticism: reflection as integrative practice
authors: marijn koolen (marijn.koolen@huygens.knaw.nl) royal netherlands academy of arts and
sciences - humanities cluster; jasmijn van gorp (j.vangorp@uu.nl) utrecht university - department of media and culture studies; jacco van ossenbruggen (jacco.van.ossenbruggen@cwi.nl) centrum wiskunde & informatica, vu university amsterdam - network institute

abstract
over the past decade, an increasing number of digital tools have been developed with which digital sources can be selected, analyzed and presented. many tools go beyond keyword search and perform different types of analysis, aggregation, mapping and linking of data selections, which transforms materials and creates new perspectives, thereby changing the way scholars interact with and perceive their materials. these tools, together with the massive amount of digital and digitized data available for humanities research, put a strain on traditional humanities research methods. currently, there is no established method of assessing the impact of the digital tools deployed in a specific digital research trajectory. there is no consensus on what questions researchers should ask themselves to evaluate digital sources beyond those of traditional analog source criticism. this article aims to contribute to a better understanding of digital tools and the discussion of how to evaluate and incorporate them in research, based on findings from a digital tool criticism workshop held at the digital humanities benelux conference. the overall goal of this article is to provide insight into the actual use and practice of digital tool criticism, offer a ready-made format for a workshop on digital tool criticism, give insight into aspects that play a role in digital tool criticism, propose an elaborate model for digital tool criticism that can be used as common ground for further conversations in the field, and finally, provide recommendations for future workshops, researchers, data custodians and tool builders.
this is the author copy of: marijn koolen, jasmijn van gorp, jacco van ossenbruggen; toward a model for digital tool criticism: reflection as integrative practice, digital scholarship in the humanities, october , https://doi.org/ . /llc/fqy

introduction
over the past decade, an increasing number of digital tools have been developed with which digital sources can be selected, analysed and presented. many tools go beyond keyword search and perform different types of analysis, aggregation, mapping and linking of data selections, which transforms materials and creates new perspectives, thereby changing the way scholars interact with and perceive their materials. these tools, together with the massive amount of digital and digitized data available for humanities research, put a strain on traditional humanities research methods. currently, there is no established method of assessing the impact of the digital tools deployed in a specific digital research trajectory. there is no consensus on what questions researchers should ask themselves to evaluate digital sources beyond those of traditional analog source criticism. while source criticism is common practice in many academic fields, awareness of the biases inherent in digital tools and of their influence on research tasks needs to be increased. when it comes to the criticism of data or sources, source criticism is an established method for historians and humanities scholars. the literature in the humanities on source criticism is primarily aimed at analogue research and is not yet up to date with digital research in the heritage domain. lara putnam describes the shift from consulting analogue archives to keyword searching digital archives (putnam, ).
current methods in historical research in physical archives are shaped around leafing through large volumes of materials to identify documents of relevance, with two important consequences. first, the scholar is confronted with the large number of unrelated materials, which demonstrates the relative importance of their topic. second, they are made more aware of what other related and unrelated topics were competing for attention at the time. this prompts the question of how scholars can use digital tools to get a similar understanding of a topic’s relative importance and connections with other topics in a digital archive. moreover, many digital tools allow scholars to transform, aggregate, count, classify, link and visualize the underlying data. with these modelling steps they further change the materials they are studying. there is as yet little common understanding within and across humanities disciplines of how these steps affect the relation between research questions and materials and how these activities differ from traditional practice in terms of interpreting and contextualizing digital data. some scholars (e.g. guiliano, ; underwood, ; gibbs and owens, ) have pointed out the importance of reporting on these parts of the research process to start conversations around how to incorporate them in humanities research. this article aims to contribute to a better understanding of digital tools and the discussion of how to evaluate and incorporate them in research, first by reporting on two experiments held during a workshop at the dh benelux conference (digital humanities in the benelux conference: https://dhbenelux .eu/) with participants of different digital humanities backgrounds, and, second, by synthesizing the theoretical background of the workshop with a review of relevant literature and an analysis of the workshop outcomes. we aim to formulate a set of assessment criteria (or building blocks for the conceptualisation) of digital tool criticism.
at the workshop we invited the participants to experiment with tools and explicitly asked them to question and criticize the tools at hand. the overall goal of this article is to provide insight into the actual use and practice of digital tool criticism during the workshop and more specifically:
1) offer a ready-made format for a workshop on digital tool criticism, including assignments, tools and methods for analysis, that can be reused for training and education (cf. section )
2) give insight into all aspects, both reported during the workshop and deriving from our own discussions, that play a role in digital tool criticism (cf. section )
3) propose an elaborate model for digital tool criticism that can be used as common ground for further conversations in the field (cf. section )
4) provide recommendations for future workshops, researchers, data custodians and tool builders (cf. section )
different disciplines may use different methods and may evaluate and reflect on digital tools differently, so there may not be a single common understanding of how digital tools fit in scholarly practice. but we think that a workshop with participants from diverse disciplines, working on the same semi-structured assignments, openly discussing their findings and reflections, and focusing on the exploratory phase in which scholars design their research around questions, materials and methods, is a good starting point for developing meaningful and shareable ways of doing digital tool criticism.

literature on digital tools and their impact on research
in information science, the research practices of humanities scholars have often been an object of research. the research cycle of the social sciences is characterized by bhattacherjee ( ) and kendall ( ), while the research cycle of the humanities is characterized as an iterative process that continuously revisits all phases (marshall and rossman, ).
bron et al ( ) distinguish three research phases in media studies research: exploration, contextualisation and presentation. in our conceptualisation of digital tool criticism, it is important to relate the tools and assessment criteria to the phase of research. if we look at the literature on digital tool criticism, the majority of it can be situated at the first phase of research: exploration. most of the literature that discusses the use of digital tools in humanities scholarship focuses on search interfaces around digital collections. timothy burke lists a number of recommendations for scholars to guide their discovery and exploration in digital collections (burke, ). in the exploratory phase, they should exploit the quick responses of keyword search systems to rapidly iterate through multiple keyword searches, with which they can explore the viability of the collection and the search interface for their research. for this initial phase, simple interfaces should be preferred over advanced interfaces, as the latter require some expertise of the collection, how it is structured and how the search system makes use of that structure to organize search results. scholars should consciously develop heuristics to evaluate and make sense of search result lists, and develop strategies to gather sets of keywords. we follow this recommendation by requesting our participants to take notes during their research practice. another aspect according to burke ( ) is assessing the quality and authority of found results, which touches on source criticism, but through the lens of digital tools. in our workshop we explicitly asked participants to reflect on this relation between tool and source criticism.
huistra and mellink ( ) provide a critical discussion of full-text searches on historical newspaper archives, specifically the dutch national library’s newspaper database, and offer three recommendations on how to conduct different types of searches to achieve different types of goals. among other things, they advise scholars to keep track of and report the steps they took to select their sources, including which search tools, queries and filters were used to retrieve those sources. moreover, they write that scholars should discuss these steps with colleagues across disciplines to reach a better understanding both of how these digital technologies influence their research practice and how they can or should adjust their practice when incorporating these tools. this recommendation is incorporated in our workshop format by bringing together researchers with different backgrounds. although search may seem a well-understood finding aid, there are many subtleties that scholars should take into account, which makes experimentation an important element of the research process (gibbs and owens, ; underwood, ). gibbs and owens ( ) argue that scholars should make their data interactions transparent to explain how these interactions contribute to making sense of the historical record. keyword searches are effective finding aids, but many digital archives and libraries offer additional sense-making tools to get a better understanding of what a digital corpus contains and does not contain and how it is structured, with which scholars can critically evaluate the archive as a whole. these can be indices of topics, persons or periods, faceted classifications based on various metadata fields, timeline visualizations, and documentation that provides details on selection criteria, data formats and search functionalities. jennifer guiliano ( ) argues for a move toward recognized methodologies for digital sport history.
'for every affordance the personal computer could offer, as many problems and limitations would be introduced to the practice of research' (p. ). similar to gibbs and owens ( ), she mentions experimentation with digital tools as an important part of digital scholarship. she illustrates this with an example of using text mining on digital archives of th century newspapers. automatic sentiment analysis using algorithms trained on modern social media data such as tweets, blogs and online user reviews might give unusable results. adjusting the algorithms by training on th century newspaper articles or trying different algorithms that better fit that genre of texts constitutes a form of experimentation that guiliano considers a core activity (p. ). we incorporated this recommendation in the workshop by having experimentation as the main format. in 'confronting the digital', tim hitchcock argues that the digital makes sources different and there is a need for more than 'being explicit about our use of keyword searching - it is about moving beyond a traditional form of scholarship to data modelling and to what franco moretti calls “distant reading”' (hitchcock, , p. ). data modelling is an intellectual activity to determine what elements the data consist of and what these elements represent. when searching through digital collections, scholars should be aware that data modelling has already taken place to make sources searchable, such as indexing of words and phrases for full-text search, or decisions about what to do with metadata that is missing, incomplete or uncertain such as 'circa '. but scholars also add further layers of data modelling when using digital tools to aggregate, link and visualize data. in 'exploring big historical data: the historian’s macroscope', graham et al ( ) discuss several tools and techniques to analyze large data sets to extract aggregated information that is hard to see by reading and searching.
examples are algorithmic topic modelling to identify what the major topics are in a set of textual documents and which documents cover which topics, or network analysis of how people, places or topics mentioned in metadata records are connected to each other through co-occurrence. to interpret this aggregated information in a meaningful way, scholars need to consider the process by which it was generated, the selection of sources that were included in or excluded from the analysis, and how the algorithm determines when chunks of data in different documents refer to the same thing. this holds regardless of whether they did the aggregation themselves or used information previously aggregated by some tool. reflecting on the choices that were made for identifying elements of interest in the data (such as topics, keywords or person names), and on what alternative choices are possible, can help scholars to consider how the actual choices focus the analysis on certain aspects and push others to the background. in our workshop we explicitly asked participants to take these choices into account in assessing their use of tools.

research by bron et al ( ) has shown that humanities researchers refine, leave out and change their research questions based on the availability of data and transparency of tools:

‘due to the abundance of material that seems to be available, at first sight a researcher may think that a particular research question can be answered. [...] another factor are the tools used to gather material. these often lack transparency in terms of how documents are retrieved in response to search terms, which part of a collection is indexed, and which preprocessing steps have been applied, for example, exclusion of a particular field a researcher expected to be present’ (bron et al , p. ).
this aspect of changing and refining research questions based on tool and data limitations was chosen as a focal point of the workshop assignments, to encourage participants to reflect on this part of the research process.

format of the workshop

theoretical working definitions

as the first part of the workshop, we provided the participants with a shared theoretical framework. the slides are available online. we are aware that we primed the participants by providing working definitions. we do believe, however, that it is important to start with a common understanding of concepts in order to be able to criticize and deconstruct them during the experiments and the discussion session. in the workshop, we focused on the exploratory phase of the research process, in which researchers determine their goals, shape their research questions and gather their materials. to help participants frame this phase, we asked them to read a text by trevor owens ( ) as preparation for the workshop. owens argues that researchers can develop their research designs from different starting points, which can be one or more research questions, a collection of research materials, a set of preferred methods, or a specific conceptual framework. the adoption of digital tools affects many aspects of the research, including the research questions, the selection of materials to study and analyse, and the methods employed to study them. regardless of where the researcher starts, these aspects influence each other: choosing to adopt certain methods may prompt the researcher to modify their research questions and materials, and changing the question forces them to reconsider which conceptual frameworks and methods are appropriate. digital tools mediate between method and materials, such that choosing a specific tool affects what methods are appropriate and what form of materials or data can be used as input for the tool.
indirectly, tool choice thereby affects the research questions and conceptual frameworks. vice versa, choices in materials, methods and questions affect what tools are appropriate. in practice, the research design and choices are made interactively and iteratively as the researcher explores different ways in which the available materials, methods and tools can be brought together into a coherent and appropriate design. owens adopts the research design model from joe maxwell ( ) that connects five elements of research design: questions, materials, methods, conceptual framework and validity (see figure ). note that tools are not explicitly mentioned in maxwell’s framework. they are related to, but not the same as, research methods. methods are modes of inquiry, and tools afford certain modes more than others, so choosing a tool requires reflection on how it affords a method appropriate for a research question. for a certain method there may be multiple tools that are appropriate, to varying extents. similarly, the data that is used in the inquiry should fit its mode. for the purpose of digital tool criticism, therefore, we provided the participants with a new model (figure ). in our view, it is useful to include data and tools as additional aspects of the framework, which are directly connected to methods in an interdependent network. we also added ‘researcher’ to the model in order to encourage the participants to reflect on their own role and the role of their peers in the research process. (note: url of the workshop slides: http://bit.ly/ ohsssk)

fig. . an interactive model of research design, as developed by maxwell ( )

fig. . a model of interdependent concepts of digital tool criticism as made by us and presented to the workshop participants

besides the theory of owens and the two models, we also provided the participants with a working definition of ‘source criticism’ as a starting point for demarcating tool criticism.
source criticism is a method or approach common in the humanities and specifically in historical research for evaluating information sources (cf. fickers, ). internal source criticism focuses solely on the content of the text itself and excludes external aspects. external source criticism, on the other hand, focuses on the metadata of the text, i.e. contextual aspects. fickers posits five basic questions that are essential for historical source criticism:
■ who created the text?
■ what kind of document is it?
■ where was it made and distributed?
■ when was it made?
■ why was it made?

we argue it is important to also address the open question whether ‘digital’ source criticism is different from ‘analogue’ source criticism and in what way. the same basic questions can be asked of digital sources, whether these sources were born digital or are digitized versions of analogue sources. tool criticism adds a question for source criticism to the list of five, namely: how was a (version of a) source made? this question can be translated into questions about the tool itself:
■ who made the tool?
■ what kind of tool is it?
■ when was it made?
■ why was it made?
■ how does the tool function?

this prompts further questions, such as: what makes digital tool criticism different from digital source criticism? and to what extent are digital tool criticism and digital source criticism entangled? we added that when thinking about why a tool was made and what it was developed to do, it is important to take into account that it can be, and often is, used for purposes other than its intended one.

before discussing the methodology of the workshop, we also provided a working definition for digital tools. tools can be studied and evaluated from different perspectives: as research instruments, as methods and as platforms. in the workshop we equated the concept of digital tool with that of computational tool.
this can be a tool which is available and used online, that is, the computations are performed remotely on a server that hosts the tool, not locally on the researcher’s own computer. a tool can also be software installed locally (such as excel or gephi). more specifically, we used the working definition by van eijnatten et al. ( ): ‘digital tools are used in opening up, presenting and curating textual and multi-media sources, in heuristic techniques of retrieval and accumulation of digitised data, in data analysis, in various forms of visualisation and in enhanced and multi-media publications of research results.’ this working definition proved to be a fruitful one, as it fits our aim of linking tool criticism to stages of humanities research in the heritage domain.

(note: the part on digital source criticism above is derived from the following book chapter that co-author van gorp was writing at the time of the workshop: van gorp, j. & de leeuw, j.s. ( ) methods of data collection with/in digital television archives: digital television historiography. in van den bulck h. et al (eds.) palgrave handbook for media policy methods. forthcoming.)

experimental setup

we based our experimental setup on recommendations in the existing literature, as elaborated in section . to investigate how tools affect the exploratory phase of research (bron et al, ), we chose a flexible experimental setup in which participants could start from any of the aspects mentioned in figure and work out a research design that has a research question, a method of investigation and a set of digital sources and tools to investigate. we wanted the participants to investigate and reflect on the role and impact of digital tools during the exploratory phase, both in establishing a research question and in the selection of digital sources to be used in addressing that research question.
therefore, we ran two short experiments covering different steps in the exploratory research phase, in which participants worked in small groups and wrote down their steps, the choices they made and their findings. in the first experiment they explored data sets and tools to establish a research question, in the second they selected appropriate digital data and tools for their research questions. we also decided to have a single research theme that participants were encouraged to adopt to give direction to their exploratory research steps, so that they could compare their findings relatively easily. the theme was ‘migration in europe’, although they were allowed to ignore this theme and choose their own. in each part of the experiment, we asked participants to keep a logbook of their research process, in which to keep track of the chosen goals, framework, questions, methods and their validity during the experiments. we provided them with post-it notes and a google document per group to write down any questions they had about the tools and datasets they used, as well as any reflections and insights. we advised them to appoint one person to log considerations, choices, questions and observations. participants could take screenshots and photos to document their research process. we also encouraged participants to talk out loud and discuss with each other during this process. we asked participants to think during the experiments about the following questions, related to fickers’ five ws:
● which tools do you use, and why? when do you switch, and why?
● what type of use was the tool intended for?
● who is the intended audience or user group of a tool?
● what should you know about a tool w.r.t. the access, presentation and transformation of data?
● do digital tools change our research, and if so, how? in shaping research questions, in selecting or analysing materials?
● to what extent can digital source criticism and digital tool criticism be separated?
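the logbook described above was kept on post-it notes and in a google document. as an illustration only, the same kind of record (which tool, which query and filters, and what was decided or noticed) could also be kept in a structured form. the sketch below is not part of the workshop; the class names `LogEntry` and `Logbook`, the field names, the aspect labels and the example entries are all hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# hypothetical aspect labels, chosen for this sketch only
ASPECTS = {"question", "method", "data", "tool", "reflection"}


@dataclass
class LogEntry:
    aspect: str                      # which part of the research design the note is about
    note: str                        # free-text consideration, choice or observation
    tool: str = ""                   # tool in use, if any
    query: str = ""                  # search query, if any
    filters: dict = field(default_factory=dict)  # facets or filters applied
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self):
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect!r}")


@dataclass
class Logbook:
    group: str
    entries: list = field(default_factory=list)

    def log(self, aspect, note, **details):
        """Append one entry; details may include tool, query and filters."""
        entry = LogEntry(aspect=aspect, note=note, **details)
        self.entries.append(entry)
        return entry

    def tools_used(self):
        """Return the tools mentioned in the log, in order of first use."""
        seen = []
        for entry in self.entries:
            if entry.tool and entry.tool not in seen:
                seen.append(entry.tool)
        return seen

    def to_json(self):
        """Serialise the whole logbook so it can be shared or archived."""
        return json.dumps(
            {"group": self.group, "entries": [asdict(e) for e in self.entries]},
            indent=2,
        )


# invented example entries, for illustration only
book = Logbook(group="example group")
book.log("tool", "first keyword search to gauge coverage",
         tool="example search tool", query="migration",
         filters={"type": "newspaper article"})
book.log("reflection", "full-text search seems to cover only part of the period")
book.log("data", "switched collection after few results", tool="example search tool")
print(book.to_json())
```

a record like this makes it possible to report afterwards which tools, queries and filters led to a selection of sources, which is the kind of transparency that the literature discussed earlier asks for.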
after both experiments, participants were asked to analyse their written notes and post-it notes and to create a simple poster to present to the other groups. specifically, we asked each group to address the following questions. what are the most important questions on specific tools and tool use? what are important considerations, reflections and insights? how did the tools you used influence or steer your exploration and analysis?

data and tools

we introduced a limited number of digital tools to give participants an idea of what is available and to ensure that there was some overlap in the tools used by multiple groups of participants, so that we could compare experiences. again, participants could choose other tools as well, so as not to constrain their explorations. in the workshop we focused on online digital heritage collections, which are many and diverse, and for which different types of tools are available, both tools that are specific to individual collections and tools that are generic and can be used on many different collections. we provided a list of current tools, both generic tools into which data can be imported, and tools that are tied to and built around specific datasets.

tools for specific datasets:
● cultural heritage
  ○ europeana (https://www.europeana.eu/): a digital platform giving access to heritage collections from more than european heritage and memory institutions.
  ○ european library (http://www.theeuropeanlibrary.org/): gives access to the digital collections of national libraries in europe. users can search through million metadata records and over million pages of full-text content.
● broadcast media
  ○ euscreenxl (eu) (http://euscreen.eu/): gives access to european audiovisual heritage, with over million metadata records and over , media items.
  ○ delpher newspaper collection (nl) (https://www.delpher.nl/): a faceted search interface for a range of collections of the national library of the netherlands, including million newspaper articles of the dutch historical newspaper archive, digitized books and journals and radio bulletins.
  ○ avresearcherxl (nl) (http://avresearcher.clariah.beeldengeluid.nl/): a comparative search tool that gives access to the dutch television and radio archive of the netherlands institute for sound and vision and the dutch historical newspaper archive offered by the delpher tool described above. the tool offers two search boxes so users can compare queries. each search box is connected to its own search results list and to a combined timeline view that shows the number of search results per year for the two queries.
● politics
  ○ parliamentary debate search (http://search.politicalmashup.nl/): a faceted and structured querying interface on top of archives of parliamentary debates from seven european countries. users can narrow the search by political party or party member, and analyse search results through a number of visualisations and aggregations, such as word clouds and timelines.
  ○ talk of europe (http://www.talkofeurope.eu/data/): a platform for querying a linked data representation of the same parliamentary debates described above. users can search the collection using sparql queries and download result sets for further analysis in other tools.
  ○ migration flows - europe (http://migration.iom.int/europe/): a platform that visualizes european data on migration on a geographical map, including migrant registrations, transit routes and relocations, and a map of migration offices. the site also gives access to the statistical reports on which the visualizations are based.
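a combined timeline view such as the one described for avresearcherxl above amounts to counting search results per year for each query and aligning the two series on the same years. the following is a minimal sketch of that idea, assuming that the hits of a query are available as (year, title) pairs. the function names and the example data are invented for illustration; this is not the implementation or the api of any of the listed tools.

```python
from collections import Counter


def results_per_year(hits):
    """Count search results per year; `hits` is an iterable of (year, title) pairs."""
    return Counter(year for year, _title in hits)


def combined_timeline(hits_a, hits_b):
    """Align the per-year counts of two queries on one gap-free range of years.

    Returns a list of (year, count_a, count_b) tuples.
    """
    counts_a = results_per_year(hits_a)
    counts_b = results_per_year(hits_b)
    years = set(counts_a) | set(counts_b)
    if not years:
        return []
    return [
        (year, counts_a.get(year, 0), counts_b.get(year, 0))
        for year in range(min(years), max(years) + 1)
    ]


# invented example hits for two queries on two hypothetical collections
hits_newspapers = [(1960, "article a"), (1960, "article b"), (1962, "article c")]
hits_broadcasts = [(1962, "programme d"), (1963, "programme e")]

timeline = combined_timeline(hits_newspapers, hits_broadcasts)
for year, n_newspapers, n_broadcasts in timeline:
    print(year, n_newspapers, n_broadcasts)
```

a zero in such a timeline can mean either that nothing matched in that year or that the collection does not cover that year at all. the counts alone cannot tell these cases apart, which is one reason why documentation on the coverage of a collection matters when interpreting the view.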
generic tools:
● voyant tools (https://voyant-tools.org/): an online text analysis tool in which users can create a text corpus by uploading documents or providing lists of urls. the tool parses the text of documents and offers a range of statistical tables and visualizations for analysis.
● openrefine (http://openrefine.org/): a desktop application in which users can upload tabular data and perform data cleaning and aggregation. the tool keeps track of the steps taken, so users can see how a particular view on the data was reached and repeat those steps as a recipe on similar data.
● digital methods initiative tools (https://wiki.digitalmethods.net/dmi/tooldatabase)
● digital research tools directory (https://dirtdirectory.org/): a directory of digital tools that organizes a long list of research tools by type of access and use.

in addition, we encouraged participants to use any tools they knew well, such as ms excel and google spreadsheet.

participants

participants worked collaboratively in small groups, so that they could share their experiences, ideas and questions regarding data and tools. the workshop was attended by participants. after a short introduction about the workshop, each participant introduced themselves and described their background, experience and expectations. the group was very heterogeneous, representing many humanities disciplines (historical sciences, media studies, literary studies, linguistics, (digital) heritage) and library and information science. some had little experience with digital tools and digital research, others had years of experience with many different tools and methodologies. the participants split up into six groups, five groups of three participants each and one of four participants.
method of analysis

each group kept notes of their explorations in a google document, so it is possible to compare how different groups developed their research questions, how they chose their methods of analysis and how they selected data and tools. to analyse the research process in terms of these activities, we categorized phrases in the participants’ collaborative notes by five aspects, and color-coded the phrases with a different color for each aspect: research question (blue), method (red), data (green), tools (pink), and reflection (yellow). to visually analyze how groups shift between these aspects, we created versions in which we removed white space to collect the notes of a group on a single page. this offers a form of distant reading of these notes that reveals patterns that might otherwise go unnoticed. we call these visualisations ‘research-process-visualisations’.

results

general trends in research processes

we observed that groups interpreted the note-taking process differently, with some groups writing down each step in exploring and reflecting in chronological order, while others summarised at the end of each experiment. even when taking these procedural differences into account, the notes show some interesting patterns. figure shows the color-coded notes of the groups.

fig. . research-process-visualisations: research process notes of the six groups, color coded by research aspect. the numbers on the left of the images correspond to the numbers of the groups.

first, the amount of text devoted to critical reflection (yellow) differs from group to group. it dominates the notes of group and is almost absent in group . also, the position of the discussions around the research topic (blue) is remarkable. given the two parts of the workshop, with establishing a research question as the explicit task in the first part, one would expect the most blue-coded phrases in the top half of the notes.
this is indeed the case for group , and , but it is clearly in the middle of the notes of group and in the second half of the notes of group and . this observation is in line with maxwell’s claim that the formation of a research question is an iterative process influenced by multiple aspects of the research design (see figure ). we also observed that, since many datasets are only available through a specific tool, discussions about data (green) are often mixed with discussions around the associated tool (pink). this observation also supports the idea that digital source criticism and digital tool criticism are hard to separate. likewise, the functionality of a tool (pink) is often discussed in terms of the research method (red), to the extent that the two become hard to distinguish. this corroborates our earlier claim that tools mediate between data and methods. in some cases it is clear that participants are discussing specific aspects of a tool, such as what features it has or does not have or to what extent they are configurable. in other cases they are talking about a method of analysis in general, without considering specific types of tools. but in many parts of the notes these aspects blend into each other. this demonstrates that tool and method are clearly interdependent, but should nonetheless be considered separate aspects in a model of digital tool criticism, as we elaborate in section .

impact of data & tools on research question refinement

the first task given to the six groups was to refine the given theme 'migration in europe’ into a more specific research question. as first steps in this exploratory phase, all participant groups used rapid searches to establish whether a given data set or tool was suitable for a certain line of inquiry, and iteratively adjusted questions, tools and data selections until they were aligned enough to warrant further exploration in a specific direction.
once they had established a fruitful direction, they used the same strategy 'to rapidly test and refine questions and hypotheses' (solberg , p. ).

group started with questions around a chosen topic of interest, then looked for about pages of tools to see which ones give access to the data required for these questions. having found that the parliamentary debate search (pds) and talk of europe (toe) tools give access to recent materials, and having obtained promising results from an initial keyword search, they tried several related keywords to get a feel for the extent of the relevant data. their overall research goal, 'compare discussion of migration in broadcast media and in european parliamentary debate speeches - ', was formulated relatively early in the process, and in terms of the corpora of the investigated tools.

group investigated perception and stereotyping of immigrants and refugees by different political parties, and initially used keyword searches to establish which historical periods best fit this investigation. once they had focused on a specific period, from the geneva convention in until (as more recent newspapers were not available due to copyright), they used ‘pearl growing’ (drabenstott, ; yakel, ), or what burke ( ) calls ‘keyword harvesting’, as a manual form of topic modeling to investigate the evolution of terminology around the main topic. this is also reflected in the central role that terms play in the research question formulated by this group: ‘in which ways do the terms that are used in newspapers and parliamentary debates to describe immigrants and refugees from distinct nationalities evolve between and ?’.

group started with the tool avresearcherxl, which gives access to two collections: a dutch radio and television archive and a dutch newspaper archive. it allows users to run two keyword queries side-by-side, either on the same collection or on different collections.
the group quickly realised that what at first seemed to be an affordance of the tool, comparative analysis, is in fact difficult, because the two collections do not fully overlap in the periods covered (for copyright reasons) and because the newspaper archive includes full-text search whereas the radio and television archive only uses metadata. this group’s research question is somewhat similar to that of the previous group: ‘how did word usage of migration changed over time?’. the comparative nature of the tool is, however, clearly reflected in the research method formulated by this group: ‘using the parliamentary debates via the parliamentary debate search system as a baseline to trace the development of word usage, how can other data sets be used to characterize the developments?’.

group is relatively brief in their notes. they explicitly address the question whether their research question may or may not depend on available data and tools:

‘we struggled with the scope of the question: should we adapt it to the sources we have at hand right away? or do we want to make up a question that we are not sure we can answer, because we might not be able to extrapolate from the materials that we have available (because of limitation of the sources)? it is likely that when we do the latter, we end up more with tool criticism than with actual answers to questions.’

their reflections on their own research question, ‘how is the topic migration present in cultural expression? comparing end of s with s’, follow a similar pattern. they noted: ‘we started with ambitious research questions. through bumping into limitations, research question slowly disappeared from view’.

group ’s notes are hardly about tools, data, method and research questions directly, but mainly reflections on these topics, as indicated by the yellow color.
for example: ‘the type of questions we think of is already influenced by what we expect to be possible with the tools (‘how did people think about’ became ‘what terms were used’, so this is based on available metadata/presentation of the material).’ the resulting research question is indeed term-centric: ‘what were the terms used for migrants around the time of suriname’s independence in ? taking a five year window from to ’.

group quickly started with keyword searches related to 'migrants' and 'integration' to identify which specific topics were viable for inquiry. once they had established that 'integration' was more fruitful, they used explorations around this topic to address questions about how the tool constrains them and steers them towards specific questions and analyses. their lab notes suggest that part of the time during the workshop was used to try to carry out the actual research, with the goal of finding answers to the research question discussed. for some queries it is unclear to what extent these were still intended to help in refining the research question. they formulated their research question as: 'in what way can we use word frequencies in parliamentary speeches as an indicator for political viewpoints on integration?'

the main point in the process at which questions changed was when scholars identified the boundaries of the available corpus and the properties of the (meta)data. in all cases, questions around the discussion of migration and refugees were refined by zooming in on either specific organizations (e.g. the specific dutch political parties pvv and vvd), specific regions (suriname), specific periods ( - , - , late s and s) or specific topics (assimilation).

meta-discussion about the workshop

the workshop closed with a general discussion in which participants were asked to reflect on the value of the format and outcomes of the workshop.
one of the main points raised is that, in using digital tools, scholars do not always reflect on and question what they are doing. participants who had worked on the same datasets in previous projects realized that back then they did not reflect in the same way, nor ask the questions they asked themselves in this workshop. the participants agreed that explicit reflection on tool use in the format of a workshop, where they work together and can discuss findings on the same or similar assignments, tools and datasets, is an effective way to critically assess the use of digital tools. interestingly, analogue tools such as post-its and pen and paper can help to stimulate this reflection, as they pull scholars out of the environment of digital research.

the importance of documentation was another major topic in the discussion. one group mentioned that they explicitly looked for documentation on the digital tools they considered, to find out how these tools work, what data they give access to or what formats they accept, how they transform data and for what purposes these tools were made. such documentation is often limited or not present at all, but it is crucial in understanding whether a tool does what a user thinks it does. digital tools are boxes that can be opened up to a certain extent by tool builders, either by providing source code or documentation, or by working directly with (other) scholars and discussing how the tools work. another group noted that scholars often attempt to use a tool for a specific part of the research but, upon hitting the limitations of that tool, come up with workarounds. these are often very useful but rarely documented. one participant said he would like to know what workarounds others have developed, so that he can possibly reuse them.

the third main topic that was discussed is data literacy and the complex interactions between digital tools and data.
some participants argued that the opacity of tools means they only get in the way of getting to grips with the data: 'we don’t want a tool, we want the raw data.' they felt that researchers should have a basic understanding of data and how it is structured. they noticed that in using digital tools for research, they keep going back to the data and metadata, and to the underlying structures and schemes used. 'being able to look at a sparql query and maybe not being able to write it yourself but at least to understand what it‘s doing … that is the literacy that we certainly should have.' 'the more directly you are able to query data, the more confident you are about what you get out.' (note: sparql is a structured query language for linked data. see https://www.w .org/tr/rdf-sparql-query/)

this points to the difficulty of separating tools and data. once you separate the digital tool from the digital data, whatever you do with the data will involve some other tool, as interacting with digital data always requires some tool, however rudimentary, to mediate. 'tools are intimately related to the data.' before choosing a tool to perform data transformation or analysis, a researcher has to critically evaluate the data they use as input to the tool. the question remains, however, to what extent one can separate data criticism from tool criticism, because one of the aspects of digital data criticism is to assess how the data was created and shaped by previous digital technologies in the first place. this prompted the question: 'what actually is the raw data?' a long chain of tools is involved, even in digitization alone. when confronted with a digitized data set, there are already many questions regarding the digitization process, especially around ocr and text interpretation. did the ocr process use language-specific models and parameters in deciding between candidate characters or words?
how did the digitization process deal with aspects like image noise, marginalia, tilted scans, missing fragments, cuts and holes in the page, etc.? furthermore, critiquing the chain of tools that is involved in creating an online keyword search interface for a large digitized archive blends naturally with critiquing the analogue processes of constructing that archive. one question is how the metadata formats, institutional cataloging policies, selection criteria for materials to include and the cataloging choices and behaviors of individual cataloguers and documentalists have changed over the decades or centuries of an institute’s history.

this led to a suggestion that our workshop set-up had also primed: work out a method of digital data and tool criticism in phases that follow the phases of the research process, e.g. exploration, analysis, presentation (bron et al, ). in each phase, criticism should focus on tool use as a chain of steps or interactions. in analyzing data that is presented in a particular tool at a particular step, it is important to understand what previous data interactions and transformations led to that view on the data and how that process shapes what a user sees.

discussion: reflection as integrative practice

digital tool criticism forces us to step back and assess how tools fit in our research methodologies. we chose to focus on the exploratory phase to draw out the questions around digital tools in the initial steps. the most important lesson learned in this workshop is that the choice to have participants work in groups and write down their steps encouraged them to reflect on their own research process and the role of tools in it. by introducing the model of maxwell ( ) and owens’ discussion of its role in digital humanities research, participants could easily separate tools, data and methods, and question and reflect on each aspect individually and in interaction with each other.
digital tool criticism requires scholars to relate the choice and use of tools to the phase of their research. scholarly publications should not only focus on what we have learned about e.g. migration through using digital tools, but also reflect on the process by which we learn and generate new knowledge and insights. therefore, we consider reflection the central concept in digital tool criticism. reflection as practice integrates all elements of research to critically assess and use digital tools: research questions, methods, tools and data are interdependent and choices regarding them are shaped in an interactive and reflective research process. why are particular data, tools and functionalities chosen? why are certain directions discarded in favour of different directions? what insights led to a change in direction, and what new insights does that give?

our analysis of the notes and posters made by the participants suggests that research method should be included as a separate concept in a model for digital tool criticism. at the same time, the role of the researcher is not mentioned in the notes and posters, but only came up in the closing discussion of the workshop, when participants were reflecting on the workshop and on digital tool criticism as a method, so we argue that researcher makes less sense as an explicit concept in the model. these considerations lead to a different model, shown in figure , in which research method is brought back into the model and reflection is added to replace ‘researcher’ and is considered an integrative practice encompassing all other concepts.

fig. . an interactive model of digital tool criticism, where reflection integrates the four concepts of research questions, methods, data and tools as interactive and interdependent parts of the research process.

adopting this type of reflection in research practice has consequences for how we conduct and organise our work. in other words, it affects our methodologies.
much like research in the late th and early th century, we have to reflect on how tools organize, access and analyse our materials before we can apply them in researching the materials. as scheinfeldt ( ) argues, late th and early th century scholarship was dominated not by big ideas, but by methodological refinement and disciplinary consolidation. denigrated in the later th century as unworthy of serious attention by scholars, the th and early th century, by contrast, took activities like philology, lexicology, and especially bibliography very seriously. serious scholarship was concerned as much with organizing knowledge as it was with framing knowledge in an ideological construct.

the explicitness of digital tools prompts scholars to ask questions about them that may not always have been obvious when working with analogue tools. questions regarding the selection, normalization and organization of data in indexes have correspondences with questions about traditional access tools for archives, libraries and heritage collections. this goes beyond recognizing the politics and rhetorical construction of archives (finnegan, , p. ), to understanding the history of collection creation, organization and management. an institution’s history of gathering and organizing materials into collections, and changes in institutional policy regarding these activities, are rarely documented in great detail, and are also rarely considered or reported in research that makes use of these collections. examples are how selection criteria and topical or subject indexing of archival materials have changed over time, how indexers applied the chosen controlled vocabularies and conducted their document analysis, and how different indexers made different interpretive choices regarding the relevance of index terms. all these affect the accessibility of archival materials. yet with digital tools and data, these types of questions are posed frequently.
perhaps the disconnect between distant reading perspectives and established close reading methods prompts scholars to question how to make sense of such reductive views on the data and how these views relate to a scholar’s expectations derived from background knowledge. for instance, seeing search results represented as a frequency graph on a timeline, a scholar might see a peak or a dip in a certain period and wonder how it relates to what they know about that period, but also how it relates to the history of the collection being searched.

the main questions center on the complex relationship between tools and data in a digital environment. the first aspect is how tools select, filter and give access to data. tool limitations may form a barrier to having full access to a set of data because a tool may be the only way to access them, as with web-based tools that give access to digital archives and heritage collections. access to digital sources is often mediated through digital tools, which suggests an integrated criticism of tools and sources. another issue with many digital tools working on integrated data sets is that they lack information about what data is accessible through the tool, how that data has been selected and how tool features include or exclude certain parts of the data. this makes it hard for scholars to judge whether what they see is all there is, or whether other data has been filtered out or is simply not available in the tool.

the second aspect is how tools transform the data they operate on and thereby can change the nature of the data and how they can be interpreted. in order to critically evaluate the suitability of digital tools for a particular research scope and approach, a scholar needs to have a basic understanding of how they work and what they do and don’t do. we agree with benjamin schmidt ( ) that this need not necessarily be at the level of algorithmic detail, but at the level of data transformations.
some tools are extremely complex, with hundreds of algorithms, and some require advanced mathematical knowledge to fully comprehend, although that knowledge is not necessary to use the tool meaningfully in research. however, at the level of data transformations, the workings of tools represent data interpretations and directly affect methodology. in this sense, the selection and filtering of data discussed above are also transformative. keyword search not only selects or filters, but also reorganizes data sources, taking them out of their individual contexts and placing them together in a list of search results, often ordered by algorithmically determined relevance. this also makes it clear that the choices made by the researcher to use certain keywords or to use certain tools in a particular order should be included in the critical assessment of a tool, and that this is an important reflective step in the research process.

another aspect of tools is their interfaces. interfaces are often introduced with comments about how easy they are to use. incorporating digital tools in research is never easy and always requires critical reflection on how they mediate between researchers and their materials of study. attractive and intuitive interfaces make it easy to forget that under the hood, many choices are made based on implicit or explicit assumptions of the creators of the tools, which may or may not align with the assumptions of their users.

this has led to the following definition or demarcation of the concept of digital tool criticism:

with digital tool criticism we mean the reflection on the role of digital tools in the research methodology and the evaluation of the suitability of a given digital tool for a specific research goal. the aim is to understand the impact of any limitation of the tool on the specific goal, not to improve a tool’s performance. that is, it means ensuring, as a scholar, that one is aware of the impact of a tool on research design, methods, interpretations and outcomes.
this requires researchers, data custodians and tool providers to understand issues from different perspectives. researchers need to be trained to anticipate and recognize tool bias and its impact on their research results. data custodians and tool providers, on the other hand, have to make information about the potential biases of the underlying processes more transparent. this includes processes such as collection policies, digitization procedures, data enrichment and linking, quality assessment, error correction and search technologies (traub and van ossenbruggen, ).

reflection on tool use in a research process suggests an element of experimentation, the latter being widely considered an important element in digital tool use (cf. section ). one way to critically evaluate a tool for a given purpose is to experiment with different ways of applying the tool. this allows evaluation from multiple experiences and perspectives. a concrete example is the simple heuristic of testing alternative keyword queries and comparing the number of results or analysing the overlap in results, which can reveal the inner workings of tools. experimentation is a skill in the sense that there are good and bad ways to experiment with a tool to assess its impact on data and interpretation. experimentation also helps scholars to reflect on and challenge their own assumptions regarding tools and data.

reflection on procedure and method does not come naturally while doing research, especially when interfaces resemble those we use every day. this is where collaborative sessions are useful, with each person bringing their own experiences and skills. for digital tool criticism it helps to have both scholars and tool developers involved in the discussion. collaboration also affords brainstorming ideas and coming up with experiments to quickly test hypotheses.
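the heuristic of testing alternative keyword queries, mentioned above, could be sketched roughly as follows. this is a minimal illustration, not a method used in the workshop: the function name compare_queries, the toy document identifiers and the example keywords are all assumptions made for the sake of the example, and the result sets would in practice have to be collected from whatever search tool is being examined.

```python
from itertools import combinations


def compare_queries(results_by_query):
    """Compare the result sets of alternative keyword queries.

    results_by_query maps each query string to the set of document
    identifiers that a search tool returned for it (hypothetical input;
    how these sets are collected depends on the tool under study).
    """
    report = []
    # look at every pair of queries, in alphabetical order of the query text
    for q1, q2 in combinations(sorted(results_by_query), 2):
        r1, r2 = results_by_query[q1], results_by_query[q2]
        shared = r1 & r2   # documents returned by both queries
        union = r1 | r2    # documents returned by at least one query
        # jaccard overlap: shared documents as a fraction of all documents
        jaccard = len(shared) / len(union) if union else 1.0
        report.append({
            "queries": (q1, q2),
            "counts": (len(r1), len(r2)),
            "shared": len(shared),
            "jaccard": round(jaccard, 2),
        })
    return report


# toy result sets, invented purely for illustration
results = {
    "migrant": {"doc1", "doc2", "doc3", "doc4"},
    "migrants": {"doc2", "doc3", "doc4", "doc5"},
    "immigrant": {"doc6"},
}

report = compare_queries(results)
for row in report:
    print(row)
```

in a sketch like this, a high overlap between a singular and a plural form might hint that the tool normalizes word forms, while disjoint result sets might hint at exact matching; either reading would be a hypothesis to test further, not a conclusion about the tool.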
at the same time, collaborative research raises the issue of being less involved in the entire process, especially in presenting parts of scholarly work that were done by others. in the case of humanities scholars and computer scientists, it may be difficult to establish to what extent they understand each other’s contributions.

list of recommendations

based on the discussion points above, we provide a list of recommendations for conducting digital tool criticism for ( ) tool creators and maintainers and ( ) humanities scholars.

first, creators and maintainers that give access to data sets, stand-alone tools and tools built around data sets should provide documentation describing a range of details of these data sets and tools:

- for data sets it is important to describe the selection criteria and any data processing and transformations performed on the selected data before it is made available. selections, normalizations, aggregations and other steps that affect the input data need to be described, at least at a high level, so that researchers can reason about what is in the data sets and what is not, and how the transformations affect the ways they can validly interpret the data.
- for tools it is important to describe what functionalities are available and how each of these selects, filters and transforms data, so scholars can reason about how they change the nature and scope of the data from input to output.

from the workshop discussion came the recommendation for tool builders to have an “about” page with each digital tool that covers these aspects.

second, humanities scholars using digital tools in their research should reflect and report on their choices for those tools. we make the following recommendations:

- digital tool criticism should analyse and discuss tools at the level of data transformations. reflect on how inputs and outputs differ and what this means for interpreting the transformed data.
- source criticism, tool criticism and data criticism (as output of the tools they used) should be integrated and incorporated in the research process. scholars should reflect and report on how these three aspects contribute to the scope of the data and how that aligns with the scope of the research questions.
- scholars should document and share the workarounds they develop in dealing with limitations of tools. aspects to document are the types of activities that a tool does not support well and what alternative steps with the same or other tools have been taken.
- the research process should include experimentation to find out how digital tools work in terms of modelling and transforming data, and to bring out and refine scholars’ own assumptions about tools.

a good way to perform digital tool and data criticism is to use a checklist of questions to ask about the tools and data:

- questions to ask about digital data: where does the data come from? who made the data? who made the data available? what selection criteria were used? how is it organised? what preprocessing steps were used to make the data available? if digitized from analogue sources, how does the digitized data differ from the analogue sources? are all sources digitized or only selected materials? what are known omissions/gaps in the data?
- questions about digital tools: which tools are available and relevant for your research? which tool best fits the method you want to use? how does the tool fit the method you want to use? for which phase of your research is this tool suitable? what kind of tool is it? who made the tool, when, why and what for? how does the tool transform the data that it works upon? what are the potential consequences of this?
- questions about digital search tools: what search strategies does the tool allow? what feedback about matching and non-matching documents does the tool provide?
what ways does the tool offer for sense-making and getting an overview of the data it gives access to?
- questions about digital analysis tools: what elements of the data does the tool allow you to analyze qualitatively or quantitatively? what ways of analyzing does the tool offer, and what ways to contextualize your analysis?

although there are also digital publication tools, we did not look into these within the confines of the workshop. the workshop focused on tools for exploration and also on tools for analysis, as exploration often incorporates different forms of analysis.

conclusion and future steps

in this article we argued that reflection can be seen as an integrative practice. our research is based on the outcomes of a workshop in which we brought together people with an interest in digital humanities research. one of the findings was that collaborative note taking and reflection is an effective way to make scholars more aware of the limitations of data and tools but, more importantly, of their own research process and the questions, considerations and choices they have. in that sense, the format of the workshop was a success. therefore, we are planning further iterations of this workshop in which we tighten the protocol for tracking the research process. for instance, we will try to let our future participants make their own ‘research-process-visualisations’, since we expect these visualisations to be a great help in their reflection process. we also plan to include logging of system interactions in future workshops, so that participants can connect the steps in their research process to specific interactions with tools and also see when they switch between tools. a challenge of any workshop is to find a balance between priming participants by providing working definitions, tools and assignments and enabling them to draw conclusions on the outcomes of the workshop in a collaborative fashion.
we believe it is important to build on existing knowledge and experiences, and therefore we plan to share this article with all future participants, so we can build an even more broadly shared framework for digital tool criticism. in a follow-up workshop to the one discussed in this paper, we have to think about a way to let participants also co-author guidelines, perhaps by letting them write and test guidelines during the workshop and/or by creating a voting system by which guidelines can be ranked according to their perceived importance. moreover, our workshop focused on the first phase of research, exploration, and related tools. it would be valuable to repeat the workshop for all other phases as well to test our model of reflection as integrative practice.

funding

this work was supported by the vre eic project, a project that has received funding from the european union's horizon research and innovation program under grant agreement no and by the clariah-core project financed by nwo (www.clariah.nl).

notes

references

bhattacherjee, a. ( ). social science research: principles, methods, and practices. tampa, fl: global text project.
bhavnani, s., k. drabenstott, and d. radev. ( ). towards a unified framework of ir tasks and strategies. in: proceedings of the american society for information science annual meeting , pp. - .
burdick, a., drucker, j., lunenfeld, p., presner, t. and schnapp, j. ( ). digital_humanities. mit press.
burke, t. ( ). how i talk about searching, discovery and research in courses. [blog] easily distracted. available at: https://blogs.swarthmore.edu/burke/blog/ / / /how-i-talk-about-searching-discovery-and-research-in-courses/ [accessed mar. ]
drabenstott, k.m. ( ). web search strategy development. online, vol. , no. .
eijnatten, j. van, pieters, t. and verheul, j. ( ). big data for global history: the transformative promise of digital humanities. bmgn - low countries historical review. ( ), pp. – . doi: http://doi.org/ . /bmgn-lchr.
fickers, a. ( ). towards a new digital historicism? doing history in the age of abundance. view journal, volume ( ). http://orbilu.uni.lu/bitstream/ / / / - - -pb.pdf
finnegan, c.a. ( ). what is this a picture of?: some thoughts on images and archives. rhetoric & public affairs, ( ), pp. - .
gibbs, f. and owens, t. ( ). the hermeneutics of data and historical writing. in kristen nawrotzki; jack dougherty. writing history in the digital age. university of michigan press, . doi: http://dx.doi.org/ . /dh. . . .
graham, s., milligan, i. and weingart, s. ( ). exploring big historical data: the historian's macroscope. london, imperial college press.
guiliano, j. ( ). toward a praxis of critical digital sport history. journal of sport history, volume , number , summer , pp. - .
hitchcock, t. ( ). confronting the digital - or how academic history writing lost the plot. cultural and social history, volume , issue , pp. - . https://doi.org/ . / x
huistra, h., and melink, b. ( ). phrasing history: selecting sources in digital repositories. historical methods: a journal of quantitative and interdisciplinary history, vol. , no. , - . doi: . / . .
kendall, d. ( ). sociology in our times. belmont, canada: cengage learning.
marshall, c. and rossman, g.b. ( ). designing qualitative research. london: sage.
maxwell, j. ( ). qualitative research design: an interactive approach, rd edition. sage publications.
owens, t. ( ). where to start? on research questions in the digital humanities. [blog] trevor owens.
http://www.trevorowens.org/ / /where-to-start-on-research-questions-in-the-digital-humanities/ [accessed mar. ]
putnam l. ( ). the transnational and the text-searchable: digitized sources and the shadows they cast. american historical review, volume , number , pp. - .
scheinfeldt, t. ( ). sunset for ideology, sunrise for methodology? [blog] found history. http://foundhistory.org/ / /sunset-for-ideology-sunrise-for-methodology/ [accessed mar. ]
schmidt, b. ( ). do digital humanists need to understand algorithms? debates in the digital humanities, edition.
solberg, j. ( ). googling the archive: digital tools and the practice of history. advances in the history of rhetoric, volume , pp. - .
traub, m.c, and van ossenbruggen, j.r. ( ). workshop on tool criticism in the digital humanities. cwi techreport https://ir.cwi.nl/pub/
underwood, t. ( ). theorizing research practices we forgot to theorize twenty years ago. representations, vol. no. , summer ; (pp. - ) doi: . /rep. . . . .
van gorp, j. and de leeuw, j.s. ( ) methods of data collection with/in digital television archives: digital television historiography. in van den bulck h. et al. ( ) palgrave handbook for media policy methods. forthcoming.
yakel, e. ( ). searching and seeking in the deep web: primary sources on the internet. in working in the archives: practical research methods for rhetoric and composition, pp. - . southern illinois university press.
http://foundhistory.org/ / /sunset-for-ideology-sunrise-for-methodology/ powerpoint presentation louise spiteri school of information management lis education in the st century: leadership, vision, and management landscape for academic libraries association of college and research libraries five-year goals, value of academic libraries academic libraries demonstrate alignment and impact on institutional outcomes leverage existing research to articulate and promote the value of academic and research libraries. develop and deliver responsive professional development programs that build the skills and capacity for leadership and local data-informed and evidence-based advocacy. influence national conversations and activities focused on the value of higher education. association of college and research libraries five-year goals, student learning librarians transform student learning, pedagogy, and instructional practices through creating and innovative collaborations build capacity for the librarians’ role in supporting faculty development and the preparation of graduate students as instructors. increase collaborative programs that leverage partnerships with other organizations in order to support and encourage local and national team approaches. articulate and advocate for the role of librarians in setting, achieving, and measuring institutional learning outcomes. build librarian capacity to create new learning environments (physical and virtual) and instructional practices. association of college and research libraries five-year goals, research and scholarly environment librarians accelerate the transition to a more open system of scholarship create and promote new structures that reward and value open scholarship. influence scholarly publishing policies and practices toward a more open system. 
enhance members’ ability to address issues related to digital scholarship and data management model new dissemination practices transformational change in the information landscape collection size is losing importance traditional library metrics do not capture value to academic mission rising journal costs alternative publishing models alternatives to the library have faster growth and easier access declining demand for traditional library services new client demands stretch budget and organizational culture building digital collections ebook adoption reaching a tipping point large-scale digital collections offer widespread, low-cost access use restrictions and copyright are the largest obstacles to access patron-driven acquisition models allow “just in time” purchasing repurposing library space local print collections are large, expensive, and rarely used avoid unnecessary duplication through collaborative storage and acquisition plans repurpose library space to support collaborative learning redeploying library staff tiered reference services free up librarian time crowd-sourced reference matches supply to decreased demand students in need of information literacy beyond “library ” embedded librarians and services offer on-demand, online to students and faculty preparing academic librarians skills in advocacy: libraries in higher education increased concerns about the quality and affordability of higher education. how well do colleges and universities prepare students for future and productive careers? increased demand for public scrutiny can bring into question traditional notions of self-regulation, institutional autonomy, and peer review. librarians must ensure that they amplify the mission of their host institutions and, ultimately, the mission of the university system at a national and international level. 
skills in promotion: librarians and assessment of learning collaborate with university administrators, academic staff, and faculty from across the institution to create effective student learning and assessment of learning. understand the value and impact of the library in relation to dimensions of student learning and success. articulate and promote the importance of assessment competencies necessary for documenting and communicating library impact on student learning and success skills as educators: engage in the learning process librarians must be involved in all aspect of the learning process, for both faculty and staff. engagement in curriculum design and measures of assessment information literacy should be embedded in all curricular activities. librarians should be involved in the creation of all learning-related programs. skills as scholars: engage in scholarly output increased demand for evidence-based decision making means that it is increasingly important for librarians to research their operations systematically. in institutions where librarians have faculty status, it is crucial that librarians contribute actively and significantly to scholarly knowledge. • increased emphasis on research skills. skills as publishers: disseminators of knowledge librarians need to be increasingly involved in the dissemination of knowledge beyond the traditional publishing model: organizational digital repositories of knowledge created by the institution open access journals and textbooks integration of different aspects of a work into one site, e.g., research articles can be integrated with primary source material, commentaries, learning objects, blog postings, etc. skills as technologists: technological fearlessness in a rapidly-changing technological environment, it is difficult to teach the current tools in two-year programs. education must go beyond teaching how to use the tools, but how to apply them to serve the mission of the library and its parent institution. 
librarians need to have a fearless approach towards technological innovation and to be self-motivated to learn how new tools work and how they can be applied. vision, leadership, and management dalhousie university’s faculty of management, which includes programs in business, public administration, information management, marine affairs and resource and environmental studies, offers a holistic and values-based approach to management education and research. our vision is to be the acknowledged centre of values-based management whose graduates become private sector, public sector and civil society leaders who manage with integrity, focus on sustainability and make things happen. the school of information management, in the faculty of management, develops and nurtures dynamic, innovative, and practical information professionals who are skilled in the management of information and technology, and who provide leadership and vision in a knowledge-based society. in collaboration with clients and stakeholders, graduates will promote and advocate the values-based concepts of sustainability and social responsibility in the management of knowledge and information. the school advances the discipline of information management by pursuing creative multidisciplinary research. further information http://sim.management.dal.ca louise spiteri school of information management http://about.me/louisespiteri http://sim.management.dal.ca/ http://about.me/louisespiteri references acrl. ( ). acrl plan for excellence. retrieved from www.ala.org/acrl/aboutacrl/strategicplan/stratplan brown, k., & malenfant, k. j. ( ). connect, collaborate, and communicate: a report from the value of academic libraries summits. retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/value/val_summit.pdf dillon, a. ( ). accelerating learning & discovery: refining the role of academic librarians. retrieved from http://www.clir.org/pubs/reports/pub /dillon.html kennedy, m. 
r., & brancolini, k. r. ( ). academic librarian research: a survey of attitudes, involvement, and perceived capabilities. college & research libraries (pre-print). http://crl.acrl.org/content/early/ / / /crl- .full.pdf+html
ridley, m. ( , may ). librarians, crisis, higher education: the real challenge. posted to exploring the information ecology. retrieved from http://michaelridley.ca/ / /real-challenge/
university leadership council. ( ). redefining the academic library: managing the migration to digital information services. retrieved from http://www.educationadvisoryboard.com/pdf/ -eab-redefining-the-academic-library.pdf
sources for images
http://www.bcieurobib.com/wp-content/uploads/ / /open-space-library-design.jpg
http://lonewolflibrarian.files.wordpress.com/ / /librarian- .jpg
http://thesologuide.com/wp-content/uploads/ / /bigstockphoto_research_ .jpg
http://www.futuretechnologypredictions.org/wp-content/uploads/future-technology.jpg
http://surajatreyadigital.files.wordpress.com/ / /digital_publishing.png
louise spiteri, school of information management, louise.spiteri@dal.ca
number further information references sources for images

final draft / page

title: libraries and it: are we there yet?
author(s): deborah m. ludwig, jeffrey s. bullington. the authors represent ku libraries and information technology. deborah ludwig is a librarian and the director of enterprise academic systems for the information technology division of information services. jeffrey s. bullington is data services and government information librarian for ku libraries, a division of information services.
journal: reference services review
issn: - year: doi:

abstract:
purpose – this study looks at how merger affects university library and information technology services for users, present and future.
design/methodology/approach – the authors examined user survey data from the early ’s through the libqual survey, collected information through interviews with faculty and information services, and examined the national science foundation and the american council of learned societies reports on cyberinfrastructure.
practical implications – this article is useful to others thinking about the organizational relationship between libraries and campus information technology.
findings – while the merged information services organization is not yet a resounding success from the perspectives of staff in information services or faculty, it is a brave attempt to respond to the future.
originality/value –
keywords: academic libraries, computing centers, technology, merger, cyberinfrastructure
article type: case study

title: libraries and it: are we there yet?

creativity is not the finding of a thing, but the making something out of it after it is found. - james russell lowell

technology infuses today’s library services, affecting how we find information, how it is delivered, and what we create or do with it once in our possession.
the technology our students and scholars use to record, interpret, and imprint data with their own experience and knowledge permeates the higher education experience. technology enhances or threatens the prospect that someone can with certainty return to a piece of information or its subsequent repurposing as time goes by. the fragile nature of digital creativity and scholarship challenges libraries and technology centers to reconsider traditional roles and collaborative models necessary to support teaching, learning, and research today and tomorrow.

because libraries and their services depend heavily on technology, the organizational marriage of technology and libraries may seem the most expedient model for channeling streams of data into navigable bodies of scholarly information. merger, however, can be every bit as difficult as the literature of the last quarter century suggests. measuring the impact on faculty and students is also difficult. the benefits or harms to users may not be readily apparent or may take time to materialize and may not apply equally to faculty and students. surveys are a common way to assess the user experience in an institutional context. dialog with users is another. fully understanding the potential of an it/library merger to support research, teaching, and learning requires not only understanding the user experience today, but also reconnoitering in the direction today’s institutional values may compel us tomorrow.

in this article, the authors will examine and interpret the impact of it and library merger at the university of kansas by looking at historical and current information found in the literature on merger, data from ku library user surveys, the perceptions of faculty and leadership in the merged organization gleaned through interviews, and reflection on future needs to support research and scholarship with cyberinfrastructure.
the university of kansas

the university of kansas (ku) is a state-funded, doctoral-granting institution with a carnegie classification profile that includes very high research activity. ku has academic departments, , students and , faculty and is an association of research libraries (arl) institution with branch libraries. the ku libraries and central information technology (it) organizations have a long history of close collaboration and organizational overlap. while this overlap has never represented a deeply integrated organization at many unit levels, these entities work together under an administrative framework known as information services. this organizational merger began in with the appointment of the first vice chancellor of information services, who also served as dean of the libraries. today information services is three distinct but administratively and functionally interconnected branches: libraries, information technology, and networking and telecommunications. the most closely merged units and programs are the library’s instructional services unit which provides both bibliographic and technology instruction, public computing support for labs and library workstations, scholarly digital initiatives, the academic data research services alliance which supports statistical data analysis, use of data sets, and geographic information systems support, and the enterprise academic systems unit which supports library and digital library systems as well as those for learning management and campus communication.

a decade of literature on library and it mergers in higher education

the literature of library and computing center mergers from through is well established in “an issue in search of a metaphor, readings on the marriageability of libraries and computing center” (freeman ) found in books, bytes and bridges (hardesty ). the latter includes a broad set of writings on library and it merger and is recommended reading.
in the last decade, there have been several publications written about the merged organizational model. hardesty ( ) interviewed computer center administrators and librarians at small colleges to study their differences, similarities, and relationship and the innate difficulty merger represents. hirshon ( ) provided a comprehensive summary of the growth in number of campus it and library mergers, their organizational models, cio leadership, and other pragmatic issues. bolin ( ) conducted a similar review of land grant universities and found that % of these institutions had traditional organizations with the dean of the library reporting directly to the provost and the computing center director reporting either to a provost or other administrative official while % had non-traditional organizational patterns grouped into models. renaud ( and ) wrote of the complexity brought on by degree of merger, the different cultures of libraries and computing centers, the difference in the compensation and status of people working as librarians from those working in it, the predominance of mergers in private liberal arts colleges and potential complexity of mergers in large institutions, issues of leadership, and alignment with governance. lewis and sexton ( ) examined organizational issues and cultural differences in merger in the u.k. at the university of sheffield. some authors have linked the need for it and library collaboration or merger to the changing and future needs of users for technology-based services and resources. 
herro ( ) covered the literature of merger from the user services perspective and surveyed cios at small institutions with merged organizations in to “determine why their institutions converged, how services to users have improved following convergence, and if institutions would converge again.” foley ( ) discussed the methodology of merger at lehigh university, the challenges and issues, and the use of virtual functional teams and client interest groups. frand and bellanti ( ) wrote about the merger of computing and library services at the anderson graduate school of management at ucla and creation of a library “without walls.” ferguson, spencer, and metz ( ) wrote of the dimensions of merger (administrative, physical, collaborative/operational, and cultural) necessary for understanding the potential for successful integration. ferguson ( ) wrote of the leadership required to face the massive changes ahead of libraries in transitioning from print to digital and the need to create viable frameworks for this transition within a higher education environment that is also rapidly changing. ku’s information services organization, in its present iteration, was documented by goodyear, russell, and ames-oliver ( ). recent reorganization efforts put into practice concepts from the literature of organizational development, change management, and process facilitation to create campus-wide engagement about services and infrastructure resulting in greater collaboration and service delivery particularly between information services (it and libraries) and student success.

a look past and present through surveys

surveys are snapshots in time. they expose perceptions, desires, and experiences at a moment in time and may point to satisfaction or gaps with current services, but they do not necessarily tell us where we are headed or how to move forward strategically during times of rapid change.
as ku libraries have increased their reliance on technology and as organizational merger has knitted libraries and it together, the most readily available historical and current snapshots of user perception about services come from surveys. ku libraries have a long history of user assessment. from through , ku libraries conducted a “general satisfaction survey” of users based on an acrl model survey. in the libraries undertook a substantial student survey. in , , and the libraries participated in what is now known as libqual+ developed for libraries by arl.

library user surveys –

the - general satisfaction surveys were completed by , users and netted , comments. of those comments ( %) specifically mentioned technology. technology at that time consisted primarily of the library catalog and a cdrom network of electronic databases. while the results were not tabulated specifically with technology in mind, an early picture emerges in these and future surveys of insatiable appetite for more and better electronic resources, for improved tools to access, deliver and make sense of information, for fast and unfettered technology infrastructure, and for helpful people to steer the course through this new electronic world. user comments in - already showed uneasiness with quality and quantity of electronic information: “[the] online catalog is not up to date with what is in the stacks,” “i think the cd-rom database system is extremely helpful for research. it would be nice to have more years of data in the biological abstracts,” “flipping through the avery index is a pain, but since periodicals aren't online, it's a necessary evil,” and “the best new thing in the library is mathsci on cd-rom. 
it really helps my work, in both teaching and research.” the early ’s also revealed both the precocious technology pessimist, “the computer offers little possibilities,” as well as the technology optimist who implicitly trusted what he saw online: “…one may find anything on the online catalog.” the tools for finding content challenged users: “i feel like the on-line system is a bit difficult for me,” and, “we need an online catalog that allows keyword searches. journals and proceedings are sometimes nearly impossible to find because they are listed in only one way.” frustration with computing infrastructure, facilities, equipment, and network, was evident in a few comments: “psych-lit [sic] was working very slowly. i had to reboot twice,” “computers went down,” and “we desperately need a printer hooked up to the on-line system.” the perception of library staff as helpers ran the gamut from perceived animosity when asked for help with copiers or computers to glowing satisfaction: “everyone (staff) is really helpful.” one user summarized, “i love the library, clean, quiet--tons of computer support.”

survey of students

in the student survey, ku libraries gathered information from graduates and undergraduates. themes of content, tools, computing infrastructure, and staff resources further emerged in comments and quantitative data from these surveys. electronic content and services were a primary reason that % of undergraduates and % of graduate students used the libraries. when asked to select the top three spending priorities for ku library, graduate students ( %) asked for more electronic databases while both undergraduates and graduates wanted the catalog to better index print collections. “i would like on-line text available on the periodical databases,” “add lots more on-line services available hrs. 
a day,” and “internet services would be great!” users’ comments about tools became more sophisticated, asking for “boolean logic on the on-line catalog; remote access to cd-roms; grad student access to oclc & pre- mla cd-rom.” dissatisfaction with computing infrastructure occasionally surfaced: “[the] on-line catalog is too slow,” and “i wish the library had a computer lab.” many users were still either unaware of or uninterested in the availability of modern technology. . % of users indicated they were unaware or had not used internet from library terminals and . % had not used or were unaware of remote access to library databases. library staff, who garnered high marks for providing traditional library services, appeared less savvy or available to help with technology in the eyes of some users: “[there was] no reference librarian to help w/medicine search,” “librarians do not know how to work electronic devices at times,” and “i don't get verbal steps to follow when i actually need demonstration.” users asked for “better instruction in the use of specific library tools i.e., cd_rom database,” “short classes explaining how to use some of the software on the computers,” “guided tours, demonstrations on how to use electronic equipment,” and “[a] more user-friendly way of easily teaching students how to obtain info from computer sources.” students also wanted assistance from the library staff with diverse technologies including “internet access, e-mail, [and] classes about what they are and how to use [them].”

while user surveys between and do not provide a consistent set of quantitative inputs and outputs, the authors see in the comments an early rationale for thinking about it and libraries as a combined organization at ku. 
the needs amplified by users, for more electronic content, better tools for discovery, robust computing infrastructure for speedy and reliable access on and off-campus, and staff well-versed in using technology and interpreting electronic content, were known and may have influenced the administrative and organizational changes that led to the creation of ku information services and the eventual integration of technology and bibliographic instruction, library and campus technology systems, and combined lab/library public computing support services. (university ) (university of kansas information services “history” )

libqual+ surveys, -

in , , and again in , ku libraries began to take advantage of new standardized criteria to measure library performance and user satisfaction using the arl libqual+ survey and to compare results with other participating institutions. these surveys were directed at faculty, staff and students and, in the iteration, looked at dimensions of library service in three areas: information control (printed and electronic resources and the infrastructure to support their use), library as place, and the affect of service (the nature and quality of service provided by library staff). perceived service levels were measured in relation to a user’s minimum expected and desired levels of service. the library summarized its libqual results as user desire for electronic and print content in the form of journals and library materials, for easy-to-use tools, and for infrastructure to support convenient access to library collections, including access from home or office, and modern equipment for easy access. (university of kansas information services “ku libraries” ) in , the appetite for electronic and print content, particularly journals, showed no abatement, and library tools for remote access as well as physical access to collections remained important. 
data from institution-specific questions showed that . % of faculty accessed library resources through the library web site daily, up from . % in . even so, faculty perceived levels of electronic and print resources as lower than the minimum they expected. at the same time % of faculty used resources on the library premises weekly, up from . % in , and daily use of library facilities by all users increased . % in the same period. a curiosity is that faculty perceived the service level for “community space for group learning and group study” as actually exceeding their desired level. the number of public workstations in the library system increased roughly % between and to fill the entry levels of the largest libraries with desktop pcs as well as laptops to borrow and use in the library. at the same time, the library opened a storage annex and began physically moving materials offsite. while any interpretation of these statistics by the authors is speculative, some comments seemed to reflect faculty disagreement with the library’s choice in provisioning library space as technology-centric commons: “please, prioritize substance over space,” “a library should be a space for private study. group work can take place in many other venues,” and “with most students having their own laptops or home computers, it is wrong to devote so much first floor space to computer terminals.” others indicated they simply do not use the physical library: “i primarily use the library to request journal articles -- either thru [sic] the electronic journals or by ill. i have only set foot in the library once, to put a text on reserve for my students.” one summarized the shifting definition of the library in an increasingly virtual world, “…my use of the library is . % through electronic journals. 
does electronic use constitute ‘library premises?’”

students responding in to libqual+, both undergraduates and graduates, perceived issues of content and tools (information control), library as place, and library staff (affect of service) differently than faculty. student expectations were met at least at the minimal levels in all areas except graduate student expectations for print and electronic journal collections. students were broadly satisfied with library as place and with the technology found in these places although it was not necessarily used for access to the library’s electronic resources: “i have only used the computers inside the library for work on blackboard, (which could also be done from home.) i have not used the library for anything else to date.” a graduate student highlighted the social aspects of library spaces, “[the library] is a great environment for studying; also it is a good place to meet with people you know or just walk around looking for people in your classes to glean information from them.” a graduate student commented on helpful research assistants and library “specialists more than willing to assist me, and [they] have made individual appointments with me to show me databases that are particularly helpful for the discipline i am researching.”

in summary, libqual results from and reveal that ku libraries met the expectations of most students at some level while pointing to possible tensions with some faculty over the purposing of library facilities as technology commons and group meeting spaces. based on surveys, the authors infer that the most visible and tangible current value to library users afforded by merger is probably the development and support for public lab and library workstations in the technology commons. 
while many academic libraries can and do provide technology commons for their users without the support of a central it organization, combined support for lab and library computing at ku is a sensible and scalable synergy. this approach maximizes the use of student technology employees who may work in either lab or library, enables mass deployment of row upon row of computer workstations cloned from a basic image, and unifies the presentation platform for users whether in library or public lab. in educating the net generation, oblinger and oblinger ( ) explain why these commons environments are important for learning:

interaction [for learning] is not limited to classroom settings. informal learning may comprise a greater share of students’ time than learning in formal settings. the type of interaction, peer-to-peer instruction, synthesis, and reflection that takes place in informal settings can be critically important. in fact, the full range of students’ learning styles is undercut when interaction is limited to classroom settings.

these technology-filled spaces are also important for library staff. they create an opportunity to interact with students. the extent and quality of interaction deserve more study. one possible indicator of quantity of interaction is found in reference statistics: questions increased by % between - and - following a decade of decline. it is still too early to tell if other merged units will yield tangible and visible benefits.

in arnold hirshon wrote about the convergence of computing and communications technologies affecting entertainment and popular information content. 
he predicted this convergence would also permeate the realm of scholarly content with the expectation that “the time for e-content will be always, the place will be everywhere, and the demand will become insatiable” (hirshon, ). closely aligned library and campus it organizations would seem well suited to meet these challenges for support of new modes of delivering or accessing scholarly content in diverse formats from sources perhaps less conventional. libraries bring knowledge and historical responsibility for collecting and organizing scholarly content while campus it may be best prepared to support interactive and mobile technologies and to provision the computing infrastructure required for the high-demand highly-mobile environment hirshon envisioned.

a look at the present and future: faculty and is leadership perspectives

[ku] is a research university. doing research is your first responsibility [and] we expect that you will make significant new discoveries throughout your career. this is hard work, but merely making those discoveries is not adequate. you must share them with the wider world, and we require that you do this in two ways: publish your discoveries so that they will have an impact nationally and internationally; and bring your discoveries into the classroom so as to have an impact on your students. both of these are required for a successful career. (lariviere )

these were the convocation remarks of a new provost to faculty, followed by an interview in the same month in which he stated, “the most fundamental [economic development role for ku] is that every year we give to the world , new graduates who will go out and change the world.” the provost also recognized the need for robust computing and information infrastructure in a goal put forth for ku with deep impact for information services. we will create a “truly first-class information technology infrastructure” to support research and teaching. 
(provost )

to better understand campus present perspectives and future directions for research and teaching in relation to library and technology services, the authors interviewed faculty and information services leaders in the spring of . the questions are found in appendix a. conversations focused on the ku environment, finding and creating information, and the role of the university in supporting “cyberinfrastructure.” definitions of cyberinfrastructure vary in the literature, but the authors defined it as something different and broader than the facilities, network, systems and software that make up computing infrastructure. in talking about cyberinfrastructure with faculty and is leadership, the authors relied on the acls ( ) definition of cyberinfrastructure as the shared information, expertise, standards, policies, tools, and services developed to support scholarship. the observations of those interviewed provided insight into faculty and is leadership thinking about the support required for research and digital scholarship and whether or not that support might be enhanced by a merged it and library organization.

interviews on research and scholarship at ku in

in talking with faculty and is leaders about current perspectives of research and scholarship, one interviewee summarized the growth of research at ku in the ’s as going from a “small liberal arts college on steroids …to a major research university.” interviewees noted that ku’s rigorous emphasis on research and on becoming a top- university (hemenway ) are “ratcheting up research [and the] importance of obtaining grants” with implications for promotion and tenure processes. one interviewee spoke of different expectations by different schools, with publication in peer-reviewed journals the primary focus for some and alternative or additional forms of dissemination and scholarship appropriate for others such as software creation, data sets, and simulations. 
interviewees concurred that the biggest disciplinary footprint for research at ku is in the sciences, particularly the life sciences. it is technology intensive, requiring not only facilities and instrumentation but also “big pipes” (the network), “big iron” (high-end computing platforms) and a strong basis of it support. the humanities at ku were viewed by some as well supported through a dedicated research center and endowment fund. the social sciences were viewed by some as less supported. multidisciplinary research was mentioned as increasingly important. faculty and is leadership noted the need for better connections and cooperation between the medical and main campus and between disciplines. one interviewee spoke of the need for renewed connections between the sciences, humanities and social sciences much as there had been in the ’s when research was previously in the university limelight. another said, “ku should operate as one campus [and] multiple [research] sites … should not serve as barriers.” certain disciplines, certain areas within those disciplines, and the ultimate applicability of research results were all seen as factors that impact what research is funded. locally, the ku center for research (kucr) and its research centers were mentioned by some as “our historic strategy and priorities” for research support and funding, impacting funding and influencing or impeding the development of technology infrastructure through its control of grant overhead funding. there was considerable tension expressed over how research funding is controlled and used. 
globally, one person noted that “we are hampered by nsf/nih funding models,” and another described the “sweet spot for research” in the social sciences as the venn-diagram intersection existing between “good ideas and fundable ideas, and what the funding agencies will support.” several believed that research in the social sciences was less funded and supported when compared to the sciences and humanities. one noted a diminishing market for publications in the humanities and social sciences which in turn would eventually affect the discipline itself and begin to shift the quality of the graduate experience. another reflected on the difficulty of publication for faculty in specialized areas such as management information systems that have only a few peer-reviewed journals to serve as outlets for publication. technology transfer was seen as focusing support on the marketability of research. funding and economic factors impact scholarship. this is not unique to ku.

although the provost’s messages about the importance of research at ku did not specifically mention the role of the libraries, the services of libraries, the work of librarians, and print and electronic collections were characterized by one is leader as important in meeting the “library challenge to fill a great need for bringing information to community in ways that helps [faculty] innovate, create, imagine, without barriers” and to “shape new generations of scholars both as graduate students and as new faculty at ku.” faculty and is leaders recognized in positive terms the traditional role of libraries as they emphasized the continuing drive of scholars to find, use, and create data and to connect with traditional library resources, tools, and content. at the same time, they recognized growing reliance on resource discovery outside institutional control. 
one interviewee was almost apologetic in preferring google as a search tool saying, “i know [google] has flaws, but it is so much faster [than library tools].” fast, flexible, and comprehensive access to scholarly content, particularly in electronic form, was deemed crucial. organization and dissemination of research data produced by ku scholars was considered challenging especially when there were interim products of research to be shared, when research relied on software and hardware tools that would have to be migrated over time, or when alternative formats for disseminating research results were the outcome. one faculty member mused that while technology has changed the capabilities for accessing information and analyzing information in creative ways, the essential directions and questions endure. the traditional role of libraries was understood while at the same time there appeared to be growing awareness of external partnerships that may affect how scholarly content is discovered, organized and made available over time. faculty recognized the push for big pipes, big iron, and big dollars, while also expressing concern for support of individual researchers as an overlay on the robust technology base layer. as one faculty interviewee put it, “success takes people - people you have a long-term relationship with, who know … the differing situations for people”. interviewees saw within libraries a service orientation and capacity for individual relationship building missing in the it organization. they bristled at their perception of a “one size fits all” desktop support model that doesn’t recognize individual needs and at the notion of all contact with it funneled through a help desk. individual researchers commented on the financial strain created by technology charges for essentials like network ports and data storage. 
in some ways this mirrors the frustration expressed in library surveys over perceived inadequacy in collections of print and electronic journals. one interviewee complained about having to purchase or subscribe individually to scholarly content not freely available through the libraries. in thinking more broadly about the support needs for researchers, one interviewee noted that teaching and research are very integrated and faculty require a single environment for storing and sharing research coupled with individual control in managing the digital rights. the authors interpret these comments as faculty expectation for freely available and unfettered technology access, robust collections of scholarly materials accessible anywhere and anytime, and desire for individual control of the technology environment as it relates to their own research priorities.

faculty and is leadership perceptions of the merged it/library organization

does the marriage of information technology and libraries at ku contribute to the effective support of faculty and students as they seek and use information? from the perspective of some leading information services it does. for others, it is the collaboration rather than the organization that is most important. is leaders commented, “it's all about the information,” and “information and the delivery mechanism can’t be split.” one remarked, “[technology] breaks down the 'brick and mortar' distinction. [the] library for example is not just the building, but also available globally and locally in new ways.” another noted that in helping scholars, “the key is whether it and libraries collaborate, not whether we have a single organization. people will find a way [to work together]. the organizational structure forces the issue and shows that we are in it for the long haul.” yet another characterized information services as a “mosaic not a melting pot.” one is leader urged we do more. 
“libraries could benefit from more experimentation. it could benefit from more user focus. expand the type of information that libraries deal with. we haven’t pushed the model far enough.” others in information services pointed to the challenges of bringing together staff in such a diverse organization and that an is-like model, while good and desirable, may not be scaleable to a large and complex institution like ku. most successful mergers have involved smaller institutions. one is leader stated that librarians “have to be seen as essential partners in solving problems [and as] parts of research teams” while another observed that we don’t have much depth in staffing and referred to information services as “a thin veneer layer” possibly not capable of substantive support in its current state. interviewees within and outside of information services noted the historic under-funding of technology and the mark it has left on the current is organization. yet there was also recognition that print and electronic collections have historically faced funding issues as well. one person voiced anxiety that libraries might be losers in the information services organization, stating that “the convergence of it and libraries is problematic. technology is driving things.” it/library convergence was characterized by another as “loss of identity” for libraries. despite the long years of organizational overlap between ku it and libraries, information services still contains two mostly distinct organizational halves with specialists comfortable working in both viewed as more of an anomaly than a probability. however, one is leader summed up today as transition, “partnering [between it and libraries] while building is important. in years we will develop people who can do both [work in it and library].” one person summarized it thus, “the merged organization works in spite of itself. 
it isn’t about organizational structure, it is about working together.”
in general faculty were ambivalent about whether or not a combined library/it model was important. they recognized the dependency of libraries on it and the value of some level of partnership or connection regardless of organizational structure. several commented on the service orientation that libraries provide as a needed model for it. for one, clarity of purpose for the merged organization was at issue as well as “how libraries define themselves” and their role within the research process beyond archiving the resulting books and journal articles. one interviewee noted that in the future research areas are “all going to be massively data-driven. the role of technology is paramount.... focus needs to be on information technology and this requires enormous data collection and analysis capability. we must accommodate the data.”
in summary, interviews with information services leadership and faculty tell us that the “jury is still out” on whether or not the combined information services unit contributes to the effective support of faculty and students. moving forward, information services may offer new roles for both librarians and technologists and opportunities for staff to work with researchers, to foster collaborative connections, to support innovation, and to evolve the traditional library roles of organization, access, and preservation in the emerging digital environment. the is organization may allow us to “push the envelope” and engage both the library and it halves in creating a first class information and technology environment in partnership with research centers and others who support the learning and teaching environment. thinking differently about our organization and about ourselves creates both anxiety and hope. the question for librarians and technologists alike is how to step up to this challenge.
the answer to this challenge may lie in moving beyond physical and technology infrastructure to engage is in building and supporting a truly first-class cyberinfrastructure. as one is leader reminded us in our interviews, we are a young organization and “is has only started learning what it can do together.” perhaps the true value that the integrated information services mosaic provides lies in addressing the future.
cyberinfrastructure and the future of the it/library merger
…a new age has dawned in scientific and engineering research, pushed by continuing progress in computing, information, and communication technology; and pulled by the expanding complexity, scope, and scale of today’s research challenges. the capacity of this technology has crossed thresholds that now make possible a comprehensive “cyberinfrastructure” on which to build new types of scientific and engineering knowledge environments and organizations and to pursue research in new ways and with increased efficacy. the cost of not doing this is high, both in opportunities lost and through increasing fragmentation and balkanization of the research communities. (nsf )
this report is therefore primarily concerned not with the technological innovations…, but rather with institutional innovations that will allow digital scholarship to be cumulative, collaborative, and synergistic…the widespread social adoption of computing is transforming the very subjects of humanistic inquiry. in most expressions of human creativity in the united states—writing, imaging, music—will be “born digital.” the intensification of computing as a cultural force makes the development of a robust cyberinfrastructure an imperative for scholarship in the humanities and social sciences.
(acls )
these two excerpts from the respective reports of the national science foundation and american council of learned societies on cyberinfrastructure illustrate some large questions for all disciplines (sciences, social sciences, and humanities) that extend beyond the simpler questions of technology infrastructure: how to adequately build and support effective research environments for the future? how to discover and explore new research questions? how to preserve the record of research and human expression? further analysis reveals the complementary nature of the conversations: each highlights issues of particular importance to its target community, yet when brought together they help to articulate the comprehensive needs.
researchers want to collaborate with their colleagues regardless of physical proximity or institutional affiliation, and they want systems that will afford fast communications, information sharing, and increased productivity. (nsf ) the primary mode of connecting to the latest developments in many disciplines is shifting into the web and only later into more traditional (and slower) modes of publishing such as preprints or the final published work. (nsf ) access to data is increasingly important for conducting research, and the amount of available data is growing. (nsf ) data, and other information, should be held in well curated data repositories and digital libraries that are widely accessible via the internet. (nsf ) the world’s cultural heritage should also be more effectively placed within reach of people. (acls ) in achieving this vision of near comprehensive access to information, there are enormous issues to be worked out regarding adequate preservation, copyright and other rights management issues, and effective methods for keeping digital information, and digital information tools, alive and useable into the future.
(acls ) effective cyberinfrastructure can break down disciplinary boundaries and afford new means of analyzing and creating information. for the sciences in particular, the traditional research methods of theory and experimentation have been joined by capabilities for simulations and modeling via computational environments. (nsf ) researchers will begin exploring new questions and areas as a result of the additional tools, capabilities, and information available through cyberinfrastructure. (acls ) the information economy and needs for a workforce trained with new skills, and capabilities to participate in that economy, are critical drivers for creating this cyberinfrastructure. (nsf ) and this development of new skills should not be driven only by technological or scientific advances, but also by understanding and sensitivity to humanistic, cultural, and social dynamics. (acls ) the building and maintaining of such infrastructure requires complex and close collaboration among a wide variety of stakeholders. those stakeholders will add their own unique, and yet complementary, skills, interests, and desired outcomes for cyberinfrastructure. we must account for the nsf report’s comment about the ‘push-pull’ dynamic of technological progress and complexity of questions, together with the acls report’s wish for a ‘cumulative, collaborative, synergistic’ form of scholarship and the recognition that current knowledge creation is primarily ‘born digital’. and we should recall the desires expressed by users in ku libraries surveys and interviews from the ’s onward, and echoed in the cyberinfrastructure reports, to more readily access, create, house, share, and preserve created knowledge in ways that afford flexibility, customization, new capabilities, and new benefits.
we might then consider that the two main campus resources for managing information (the information itself as well as the means and capabilities of transmitting the information), information technology and libraries, ought to be working more closely together. this need appears as a recurring theme throughout our analysis. users want their work to be fluid, fast, and possible wherever they are. “these phenomena point to the need for the library and it organization to work together to support today’s scholars and students in a much more seamless fashion…a growing potential for integration [between libraries and it] exists on all campuses.” (ferguson ). the nsf and acls reports both evoke a public goods model for cyberinfrastructure: such developments should be built for wide access and use, and serve as a foundation upon which individuals or groups can customize their own environments with additional tools, content, or other resources that will afford interoperability and connectedness. this public goods approach for cyberinfrastructure is further reinforced where the nsf report notes, “although good infrastructure is often taken for granted and noticed only when it stops functioning, it is among the most complex and expensive thing that society creates.” (nsf )
benefits and harms for users, providers, and the organization
as we move forward with scholarship, teaching, and learning, the intertwining of information technology and information content is a reality. in truth, it has never been any different. we should continually remind ourselves that tools and processes are in constant development and evolution. in their time, scrolls, books, typewriters, computers, and the internet were (are) all new means to capture, create, and convey information. tables of contents, indexing, and databases were (are) new ways to organize and manage information.
libraries and data centers were (are) new ways to house and preserve that information. each wave in its turn has presented challenges, frustrations, learning, support needs, wonder, delight, and potential for users and providers alike. conventional wisdom reminds us “there is nothing new under the sun” and paradoxically “times change, and we with time”. the promise of a merged organization is in the cross-fertilization of knowledge, ideas, experimentation, and services in support of the university. it by itself can be seen as just an information carrier, a ‘pipe’. the library by itself can be seen as just a collection of content, ‘a bucket of water’. success hinges on the ability of the merged organization to give priority to the ‘true’ information agenda: getting the water through the pipes, to the users, and supporting users to transform, share, transport, and save that information. the challenge is to create an effective centralized organization that is still capable of understanding, and responding to, the more specialized and unique needs of different parts of the target audiences. efforts at combining and integrating library and information technology through ku information services groups, instructional services, scholarly digital initiatives, and adrsa, are recent experiments in meeting the needs of users as researchers and creators of information through this interaction of previously separate and disconnected staff, tools, processes, and objectives. the potential harms resulting from a merger of libraries and it seem almost the flip side of the benefits: that ‘library issues’ will mask and distract attention away from it (research computing) issues; that ‘it issues’ will excessively dominate library directions and uses; and, finally, that the information services organization will be perceived as an unnecessary, irrelevant, and confusing administrative structure.
are we there yet?
so, “are we there yet, are we there yet?” no, but close enough to holler “he’s leaning on me.” “she’s taking up too much room.” “he threw my books out the window.” “make her stop looking at me that way.” the promise of the merged organization lies in the future, not in our difficult adolescence today. when librarians can work collegially alongside it professionals and not feel it lessens their status on the faculty playing field; when those same it staff intuitively understand why it is important to build terabytes and terabytes of secure data and invest in ensuring its integrity and future access; when researchers are supported by a collaborative information services team able to address the full spectrum of information and technology needed for a research or teaching project; when it support staff can scale desktop solutions to meet the differing needs of the librarian, gis specialist, or researcher in the social sciences; when archivists have a place at the table as we talk about the future of the campus email or student records system; when budgeting for building the big network pipes doesn’t feel like throwing the books out the window; and when the management and curation of data is as important as the subset of practices needed for data security, then we will be much closer to our destination.
have students and faculty benefited or been harmed by the merger of it and libraries? if you ask many librarians who value the more traditional roles of librarianship, they may say that the information services organization has eroded traditional library roles and the benefits those roles provide to library users. the libraries have lost their identity and librarians are in danger of being reduced to technologists.
if you ask teaching faculty, you would learn that some of them struggle with student aversion to print and microforms, but they are moderating their instruction to accommodate student preference; after all, student preference mirrors their own for electronic access and e-delivery. if you ask the research community, they will likely say the organizational structure either does not matter or does not make sense. libraries are customers and consumers of it, just as they are. the value of libraries for research is perceived in their collections, service orientation, and at the end of the research cycle in providing access to and preservation of the historical record. the value of it is in enabling the conduct of research and its dissemination in many forms. if you ask students, they might acknowledge that finding a quiet study area in the library can be challenging; but the open spaces with row upon row of computer workstations serve their broad information-seeking and learning needs and provide ubiquitous space for meeting and gathering in both the real and virtual realms.
while the merged information services organization is not yet a resounding success from most perspectives, it is a brave attempt to anticipate the future. the growth in networked content, capabilities, and digitally-driven scholarship and learning has created more facets for libraries, it, faculty, and students to influence and manage while still offering traditional services. from here, the mosaic grows only more complex. perhaps it will be our legacy to the next generation of students and scholars.
references
acls (the american council of learned societies) ( ), “our cultural commonwealth: the report of the american council of learned societies commission”, available at http://www.acls.org/cyberinfrastructure/ourculturalcommonwealth.pdf
bolin, m.k.
( ), “the library and the computer center: organizational patterns at land grant universities”, the journal of academic librarianship, volume , number (january ), pages – . available at http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article= &context=libraryscience
ferguson, c. ( ), "whose vision? whose values? on leading information services in an era of persistent change" in wittenborg, k., ferguson, c., and keller, m. a. reflecting on leadership, council on library and information resources, available at http://www.clir.org/pubs/reports/pub /pub .pdf
ferguson, c., spencer, g., and metz, t. ( ), "greater than the sum of its parts: the integrated it/library organization" educause review, vol , no , pp - , available at http://www.educause.edu/ir/library/pdf/erm .pdf
foley, t.j. ( ), “the metamorphosis of libraries, computing, and telecommunications into a cohesive whole”, lehigh university, available at http://www.educause.edu/ir/library/pdf/cmr .pdf
frand, j. and bellanti, r. ( ), “collaborative convergence: merging computing and library services at the anderson graduate school of management at ucla.” journal of business & finance librarianship v. no ( ) p. - .
freeman, r.s., mandernack, scott b., tucker, j. m. ( ), “an issue in search of a metaphor: readings on the marriageability of libraries and computing centers.” in hardesty, l. (ed.) books, bytes, and bridges, chicago, american library association, pp - .
goodyear, m., russell, k., and ames-oliver, keith ( ), “change at the university of kansas: process, experimentation, and collaboration” ecar research bulletin, vol no , available at http://www.educause.edu/ir/library/pdf/erb .pdf
hardesty, l. (ed.) ( ), books, bytes, and bridges, chicago, american library association
hardesty, l. ( ), “computer center-library relations at smaller institutions: a look from both sides” cause/effect vol no , , pp. - , available at http://www.educause.edu/librarydetailpage/ ?id=cem
hemenway, r.e.
( ), [monday messages, october , ] available at http://www.chancellor.ku.edu/messages/ /october .shtml
herro, s. ( ), "the impact on user services of merging academic libraries and computing services", available at http://www.educause.edu/ir/library/pdf/csd .pdf
hirshon, a. ( ), “integrating computing and library services: an administrative planning and implementation guide for information resources” cause professional paper series, no , available at http://www.educause.edu/librarydetailpage/ ?id=pub
hirshon, a. ( ), “a diamond in the rough: divining the future of e-content”, educause review, vol no , pp. - [also] available at http://www.nelinet.net/ahirshon/diamond-in-the-rough.pdf
lariviere, r. ( ), [speech given at convocation on august , ] available at http://www.provost.ku.edu/reports/speeches/convocation .shtml
lewis, m.j. and sexton, c. ( ), “the full monty: two mutually incompatible views of organizational convergence,” educause paper, presented at conference october - , in nashville, available at http://www.educause.edu/ir/library/pdf/edu .pdf
nsf (national science foundation) (u.s.) & blue-ribbon advisory panel on cyberinfrastructure ( ), “revolutionizing science and engineering through cyberinfrastructure: report of the national science foundation blue-ribbon advisory panel on cyberinfrastructure.” available at http://www.nsf.gov/od/oci/reports/toc.jsp
oblinger, d.g. and oblinger, j.l. (eds.) ( ), educating the net generation. educause, available at http://www.educause.edu/educatingthenetgen/
“provost sets down kansas roots, articulates a global outlook” ( ), oread, vol no , available at http://www.oread.ku.edu/ /august/ /lariviere.shtml
renaud, r.e. ( ), “what happened to the library? when the library and the computer center merge.” college & research libraries news, vol no , p. -
renaud, r.e.
( ), "shaping a new profession: the role of librarians when the library and computer center merge," library administration & management, vol no , pp. -
university of kansas information services ( ), “ku libraries: we listen to you!” available at http://www.informationservices.ku.edu/assessment/libqual/
university of kansas information services ( ), “history, – present,” available at http://www.informationservices.ku.edu/~iserv/aboutis_history.shtml
university of kansas libraries ( ), “student survey, conducted spring , summary and highlights” [unpublished memo]
for further reading:
berman, f.d., brady, h.e. & national science foundation (u.s.) ( ), “final report nsf sbe-cise workshop on cyberinfrastructure and the social sciences.” available at http://vis.sdsc.edu/sbe/reports/sbe-cise-final.pdf
blaustain, h., braman, s., katz, r.n. & salaway, g. ( ), “key findings: it engagement in research” available at http://www.educause.edu/ir/library/pdf/ecar_so/ers/ers /ekf .pdf
braman, s. ( ), “what researchers want from it: educause live!” november , , available at http://www.educause.edu/live
bush, v. ( ), “as we may think” atlantic monthly, vol , no , july, , pp. - . available at http://www.theatlantic.com/doc/ /bush
cyberinfrastructure council, national science foundation (u.s.) ( ), “cyberinfrastructure vision for st century discovery” available at http://www.nsf.gov/pubs/ /nsf /index.jsp
friedlander, a., adler, p., association of research libraries & national science foundation (u.s.) ( ), “to stand the test of time: long-term stewardship of digital data sets in science and engineering” available at http://www.arl.org/bm~doc/digdatarpt.pdf
goldenberg-hart, d. ( ), "libraries and changing research practices: a report of the arl/cni forum on e-research and cyberinfrastructure", arl, available at http://www.arl.org/bm~doc/arlbr .pdf
hacker, t.j. & wheeler, b.c., ( ), "making research cyberinfrastructure a strategic choice", educause quarterly vol. , no. , pp.
- , available at http://www.educause.edu/ir/library/pdf/eqm .pdf
lewis, d.w. ( ), a model for academic libraries - , available at http://hdl.handle.net/ /
messerschmitt, d.g. ( ), "opportunities for research libraries in the nsf cyberinfrastructure program", arl, no. , available at http://www.arl.org/resources/pubs/br/br /br cyber.shtml
national science board (u.s.) ( ), “long-lived digital data collections: enabling research and education in the st century” available at http://www.nsf.gov/pubs/ /nsb /nsb .pdf
oldenburg, r. ( ), the great good place: cafes, coffee shops, community centers, beauty parlors, general stores, bars, hangouts, and how they get you through the day, paragon house, new york.
pothen, p. ( ), “developing the uk's e-infrastructure for science and innovation,” available at http://www.nesc.ac.uk/documents/osi/report.pdf
appendix a. interview questions
"going enterprise: merging campus and library technology services at the university of kansas" research article project
format - minute interview
interview questions:
as we think back - - years, it is obvious that technology has driven many changes in how we learn, teach, and conduct research. we have some questions for you to address with respect to technology’s impact and significance in your work in higher education.
. ku environment. describe how you see ku’s strategic directions and role of some of the major information technology and information content providers with respect to those strategic directions.
• what do you see (from your perspective) as the strategic directions for scholarship at the university of kansas?
• how large a role does/will technology play in achieving those strategic directions successfully?
• what roles do libraries play in achieving those strategic directions successfully?
• what role do external players have (google for example) in achieving those strategic directions?
. finding information. describe how you use information today in your role with the university in either teaching, researching, or managing information.
• when you need to find information on a specific topic in your field, how do you do it? describe briefly the process, steps, and tools you might use.
• is the organization of information in your field changing? how is it changing?
• do libraries, or the services they extend to you through the internet, play a part in your current use of information? ...for that of your students? how do the ‘virtual’ or internet aspects of library services matter to you or your students?
• how relevant and how successful are libraries in creating an environment that is effective for users? please explain.
. creating information. talk to us about your role as a creator of information.
• what kinds of information do you create in your profession? please provide some specifics.
• how do you share and disseminate information within your profession and with others?
• is the role of formal publishing changing in your field? if so, how?
• are you concerned about the availability of the information you create for its intended audience today? …for future users?
• do you have available to you the support (tools, resources, services, support) that you require as a creator of information?
• what do you require that is not easily supported through the university’s current resources? what do you see as some of the most challenging aspects of your work for a centralized, university technology group and/or libraries to support adequately?
. role of the university.
have you had an opportunity to read either the dec report "our cultural commonwealth: the report of the acls commission on cyberinfrastructure for the humanities and social sciences" or the previous parallel nsf-sponsored report "revolutionizing science and engineering through cyberinfrastructure: report of the national science foundation blue-ribbon advisory panel on cyberinfrastructure"?
the acls commission report offers this definition: cyberinfrastructure is defined as the "layer of information, expertise, standards, policies, tools, and services that are shared broadly across communities of inquiry but developed for specific scholarly purposes: cyberinfrastructure is something more specific than the network itself, but it is something more general than a tool or a resource developed for a particular project, a range of projects, or, even more broadly, for a particular discipline. so, for example, digital history collections and the collaborative environments in which to explore and analyze them from multiple disciplinary perspectives might be considered cyberinfrastructure, whereas fiber-optic cables and storage area networks or basic communication protocols would fall below the line for cyberinfrastructure.
• ku libraries and ku information technology are part of a combined organization called information services. do you believe the marriage of these organizations contributes to the effective support of faculty and students as they seek and use information?
• given the rapid growth and development of technology and its direct influence on the environment for teaching, learning, and research, what steps must ku undertake now to provide cyberinfrastructure (information, expertise, standards, policies, tools, services) for scholarly purposes?
• is the concept of digital curation (adequate preservation and forward migration of information) as important for the university as curation has been in the print and analog world?
• what should faculty, students, researchers - years into the future expect us to do today?
information services & use ( ) – doi . /isu- ios press
leadership to advance data and information science at virginia tech library
julie griffin a,b,∗
a associate dean for research & informatics, university libraries, virginia polytechnic institute and state university, blacksburg, va , usa
b senior associate dean, university libraries, virginia polytechnic institute and state university, blacksburg, va , usa
abstract. libraries have adapted and are continuing to adapt to the rapidly changing information and data needs of the communities they support. the following paper presents the evolution of virginia tech university libraries, with special attention given to program and service developments in data and information science. the author describes leadership strategies for advancing services, shares stories about developing sustainable, human-centered, and technology-enhanced service infrastructures, and paints a picture of aspirations to enable individuals’ success through active campus engagement. the paper will end with an overview of challenges and opportunities for future collaboration within higher education and beyond.
keywords: data science, information science, libraries, leadership
. introduction
university library services necessarily adapt to constituents’ changing needs, which are impacted to a growing degree by transdisciplinary approaches, the drive to use data to address technical and societal problems, the inherent desire to make sense of—and derive meaning from—data, and by the influence of commercial data and information services and products on user expectations.
libraries vary in their response to these and other changes, including the degree to which they engage their local and campus communities in the change process, the degree to which they learn together and from each other, and the emphasis placed on leadership, specifically change leadership. the following paper will describe how a community of the virginia tech libraries has responded to changing user needs in the ways described in order to position their library faculty, staff, and students for success in data and information science. dumas and beinecke [ , p. ] describe effective change leaders as individuals who, “encourage their organizations to learn, innovate, experiment, and question.” this description of leadership as an intentional cultivation of organizational curiosity resonates with me as an associate dean overseeing research, learning and informatics (rl&i) programs at virginia tech library, where change is and has been a constant due to developments in practice at the professional and library levels. however, as we know, developments impacting libraries extend beyond the walls of the information profession.
*e-mail: gjulie@vt.edu.
- / /$ . © – ios press and the authors. this article is published online with open access and distributed under the terms of the creative commons attribution non-commercial license (cc by-nc . ).
j. griffin / leadership to advance data and information science at virginia tech library
in , virginia tech launched a beyond boundaries visioning process [ ], which included, among other areas of emphasis, a focus on advancing the university’s existing research strengths, expanding the university’s global land grant mission, and helping people develop both disciplinary knowledge and transdisciplinary experiences. data and decision sciences emerged as one of nine transdisciplinary strengths of the university.
. community & leadership
community & leadership during this time of change, the libraries had been for several years developing new services and infrastructures to support open access, open education [ ], digital scholarship, data management and analytics [ , ], digital curation, and research collaboration and impact. our new services and collaboration interests began to align nicely with opportunities presented during beyond boundaries conversations. the question became how we would simultaneously achieve our goals to advance library services and also engage with partners to realize the university’s more expansive vision. university libraries are, and have been for some time, a highly engaged organization within the larger academic enterprise at virginia tech. our community engagement dates back to a program established in that reimagined the role of liaison as embedded partner and service provider for the colleges [ ]. over the last several years, however, the libraries have increased both the volume of credit-bearing courses taught and the vigorous pursuit of external funding for research projects. in both cases our involvement is often in cross-disciplinary collaborations. we serve as senior investigators and co-principal investigators on grants with other faculty, co-teach credit bearing courses with other faculty, co-author papers with other faculty, and serve as co-directors and associate directors of labs and centers established by or based in other departments, and this is in addition to conducting our own research and directing our own research center. these activities represent an increase in partnerships as colleagues and peers in the teaching and research process. 
when i joined the organization a few years before the beyond boundaries vision was established, with a strong foundation and culture of partnership and innovation to build upon, my focus was given to reimagining the role of the library through connected infrastructures, multi-year cluster hiring to expand our areas of internal consulting expertise and service, and change leadership to ensure successful new program outcomes and retention of diverse talent and expertise.

a specific example of an effort to connect university infrastructures and to expand services would be an initiative to connect the libraries’ open access repository, vtechworks [ ], to the university’s electronic faculty activity reporting system. the initiative represents a cross-campus partnership with the office of the executive vice president and provost, as well as with the virginia tech division of information technology and the office of research and innovation. through this partnership, we took an established library service and experimented with embedding the repository service in an existing faculty workflow to increase adoption of open access practices. over time, the role of the libraries in supporting the university’s research information ecosystem expanded from being a provider of open access archiving services to becoming a partner-provider of a suite of researcher services, including research impact, researcher identifiers, researcher profiles, research analytics, metadata, and open access services. we took intentional steps to expand our support for faculty success by experimenting with new roles, adopting new frameworks for presenting our services, and developing and sustaining new university partnerships. the partnerships we sustained in this case, as well as in most other cases, manifest as active cross-library and cross-campus teams, and are characterized by an alignment with our values, a desire to make a positive impact and to support our users, and, in the case of our services, a desire to leverage interoperability standards to enable data use and reuse. another example includes our partnership with faculty in the department of computer science to develop digital library infrastructure to support data reuse [ ].

working effectively in teams is critical in this kind of dynamic, highly networked and partnership-based environment. i encourage individuals in rl&i to share work, roles, and leadership as needed, and to support each other when necessary to strengthen the impact of our contributions. i model collaboration and leadership practices that emphasize support, inclusiveness, sharing, and teamwork. with department directors, i distribute a community-wide email newsletter with notes of thanks and recognition of achievements and contributions. i encourage individual and group participation as much as possible, for example, inviting rl&i community members to name the new community, to develop the community’s mission, vision, and values statements, and to plan and lead retreats and team building workshops. i try to respond to expressions of interest in leadership, create opportunities, and adjust my role in projects as needed. all of these activities are designed to establish connections for individuals across seemingly disparate concepts, services, and activities.
the successful translation of potentially conflicting individual-level campus engagement and program-level goals into a thriving community of people providing a diverse set of services to an expanding university comes down to five leadership activities: ( ) listening and being open to new ideas and suggestions, ( ) creating a culture of distributed thought leadership, ( ) leading community towards, and creating space to find, commonality across activities and programs, ( ) celebrating and respecting different interests, approaches, strategies, and perspectives, and ( ) supporting creativity through learning and experimentation.

. learning & service philosophy

what guides my leadership philosophy is a desire to advance creativity, innovation, and research education at the university by helping people and teams establish fruitful connections. through these connections, whether made via access to library services, learning spaces, resources, experiences, or information and data science talent, my hope is that individuals and teams are able to develop new and deeper knowledge, to contextualize experiences, to engage in community, to share in the celebration of achievements, to identify potential in themselves and others, and to practice leadership. when we do these well and when we support others in doing these well, i believe we see positive outcomes. at the university or community level, this translates to success in academics and research, more impactful scholarship, and a more adaptable organization. we also see people thinking and engaging beyond disciplinary and methodological boundaries, all of which leads to greater success in data and information science.
in terms of how the libraries help to connect the individual experience with community-level outcomes, we help individuals develop digital literacy skills [ ], provide value-added data and information services, and leverage our talents and skill sets to help people, whether in a role as facilitator, data analyst, designer, scholar, research partner, or curator. we do what we have always done well; the only difference is that in a technology-enriched environment we choose to take an even more user- and human-centered approach to our work. the frameworks through which individuals can engage with our services (e.g. through use of spaces and services, and through partnerships in research and scholarship) are focused on providing personalized experiences as much as possible. the common theme across frameworks and engagement opportunities is a focus on applying to the best of our ability our data and information science knowledge and talents to advance research and scholarship. we do this best when we can help users make connections across services and service models. establishing connections can take the form of referrals, co-consulting, joint programs within the library and across campus, and services aimed at helping users succeed in collaboration and transdisciplinarity.

while many of our services are offered as campus-wide support, increasingly our focus on longitudinal impact has us involved as partners in projects. we approach scaling these efforts in different ways: hiring and cross-training to develop capacity to reach a larger number of faculty, staff, and students; collaborating and sharing responsibility (as previously mentioned); and developing service partnerships. i encourage exploration of all of the above at different levels of effort for different services as part of the change process.
learning is key to preparing for the opportunities presented by data and information science. the process of learning, perhaps most importantly, will help us understand how our skills, expertise, and priorities need to evolve. to deepen learning engagement we must give ourselves time to explore new ideas, to take risks, and learn together and with the communities we support. as part of this process, we need to practice the competencies and skills that we strive to help others develop and seek out learning opportunities that help to contextualize our experiences, and assist others in doing the same. all of these types of learning experiences contribute to a more understanding community and adaptable service organization. active partnerships are also critically important for success. libraries are undergoing a similar kind of change as other academic units. it would make sense, for example, for us to explore with our campus partners how we measure success in a transdisciplinary, highly collaborative, rapidly evolving, partnership-based environment. because data is an important part of the storytelling process, we should bring our data expertise to bear when working with partners to find ways to cultivate a data way of thinking and working, and carefully consider privacy and ethics issues with data collection, management, use, and sharing. designing services and developing partnerships to support teams is also important. we need to work with others to develop infrastructures that help community members address and overcome problems. we do this by designing and implementing change frameworks when needed, and by supporting the change frameworks designed and implemented by community members. we should pursue partnerships that reinforce our global land grant mission, including international research partnerships and library exchange programs. 
and, as a land grant institution library, we need to partner to increase public engagement, make education more affordable, and make our experiences and services more accessible to all. to move forward requires changing the way we work and interact, creative approaches to problem solving, and empathetic leadership. if we explore all of the above within our libraries and with partners in/outside of our institution, we are advancing data and information science.

. conclusion

in a beyond boundaries library that advances data and information science, we must be able to make connections and empower others to make connections across disciplines, communities, organizations, and the world. by removing traditional barriers in the provision of services, the rl&i community practices making the scholarly record more open, facilitates transdisciplinarity, practices upholding our professional values and ideals to promote “equitable access to information” [ ], and integrates services to facilitate the institutional stewardship (and future use and reuse) of content. we also strive to enable connections that facilitate beyond boundaries through adaptive leadership approaches, collaborative learning, and partnership-based services. it is my belief that in doing so, we position the libraries for success in enabling members of the vt community to effectively leverage data and information to make a positive impact in the world now and into the future.

acknowledgements

all of university libraries have helped to shape my leadership approach and philosophy.
i would like to thank the leadership team in rl&i for directing the programs described in the paper, including open access, open education, publishing, data, teaching and learning engagement, digital literacy, digital collections, digital preservation, immersive environments, and learning environments, and also the rl&i community for delivering programs and services and engaging directly with library users.

about the author

julie griffin, university libraries senior associate dean, leads a team of individuals offering faculty, staff, and students at virginia tech unique research education opportunities, experiences, and connections across a diverse portfolio of information and data services. phone: + ( ) ; email address: gjulie@vt.edu.

references

[ ] c. dumas and r. h. beinecke, change leadership in the st century, journal of organizational change management ( ) ( ), – , available from: doi: . /jocm- - - .
[ ] r. bleizner, a. grant and t. rikakis, envisioning virginia tech beyond boundaries: a vision. retrieved from: http://hdl.handle.net/ / (accessed may , ).
[ ] a. walz, open and editable: exploring library engagement in open educational resource adoption, adaptation and authoring, virginia libraries : ( ), – , retrieved from: http://hdl.handle.net/ / (accessed may , ).
[ ] a.l. ogier, a.m. brown, j. petters, a. hilal and n. porter, enhancing collaboration across the research ecosystem: using libraries as hubs for discipline-specific data experts, in: proceedings of the practice and experience on advanced research computing (pearc ’ ). acm, new york, ny, usa. available from: doi: . / . .
[ ] n.h. seamans and p. metz, virginia tech’s innovative college librarian program, college & research libraries ( ) ( ), available from: doi: . /crl. . . .
[ ] vtechworks is virginia tech’s institutional repository, available at https://vtechworks.lib.vt.edu.
[ ] z. xie and e. fox, advancing library cyberinfrastructure for big data sharing and reuse, information services and use ( ) ( ), – , doi: . /isu- . available at: https://content.iospress.com/download/information-services-and-use/isu ?id=information-services-and-use% fisu (accessed june , ).
[ ] j. feerrar, literacies & campus context: leading the campus conversation, in: proceedings of the innovative library classroom, tilc , may - , , radford, virginia, usa. retrieved from: http://hdl.handle.net/ / (accessed may , ).
[ ] american library association’s core values, key action areas and strategic directions. retrieved from: http://www.ala.org/aboutala/ (accessed february , ).

d-lib magazine may/june volume , number /

an assessment of institutional repositories in the arab world

scott carlson
rice university
scarlson@rice.edu
orcid.org/ - - -
doi: . /may -carlson

abstract

compared to those of the western world, institutional repositories from the arabic-speaking countries of the middle east have been described by scholars of the region as occupying an "infancy stage".
in this article, repositories from countries in the arab world were selected and assessed in terms of accessibility and transparency from the viewpoint of an external user. a set of assessment criteria was formed by analyzing trends and similarities in established repositories from the rest of the world, in hopes of analyzing the "infancy stage" appraisal. the results not only provide a current view of digital scholarship and institutional memory in the middle east, but may also provide a helpful set of criteria for developing repositories for the rest of the world.

keywords: open repositories, institutional repositories, middle east, repository assessment, open access, arab world, arabian gulf

introduction and background

since the release of eprints, dspace, and other software packages in the early s, the history and development of the digital repository in western archives and libraries have been well documented. this is less true for other regions of the world, especially for arabic-speaking countries and territories. a article by syed sajjad ahmed and saleh al-baridi of the king fahd university of petroleum and minerals library in dhahran, saudi arabia, as well as a conference presentation by mohamed boufarss of the petroleum institute in abu dhabi, note several factors that have contributed to institutional repositories in the arabian gulf region remaining at an "infancy stage." these factors include a lack of published literature on the topic originating from the region [ ], the majority of academics in the arab world having little knowledge of or experience with institutional repositories [ ], and the lack of a coherent long-term repository preservation strategy among regional institutions [ ]. in fact, ahmed and al-baridi noted that much of their research on the state of repositories was gathered from colleagues and acquaintances from other institutions in the region.

i was no stranger to this "infancy stage."
my interest in institutional repositories in the arab world began in late in a project that combined professional interest with class work for a certificate in digital stewardship. between and , i worked in the technical services department of the library of the american university of sharjah (aus), an accredited, multicultural institution in the united arab emirates. not long after joining aus, my role in working with the library's institutional repository (formally launched in march, , the same month i joined) increased, as did faculty interest in the repository. in , a series of working papers created by faculty from the university's school of business administration was implemented; later that year, a test community was set up to deposit student work produced by the university's english for engineering class. despite this interest, feedback from the faculty community indicated that the school's repository was difficult to locate online, as was information about the repository and its policies on the library's web site. thus, a plan was initiated to visit institutional repositories in the arab world, as if they were being visited by anonymous external users, and compare trends or similarities found to see if this "infancy stage" appraisal was apparent through repository assessment. (this comparison plan was inspired by a association of research libraries conference presentation, wherein robert h. mcdonald and charles thomas argued that institutional repositories do not exist as "stand-alone phenomena." [ ]) originally, the selection of repositories was to begin by visiting the directory of open access repositories (opendoar); however, both the ahmed/al-baridi and boufarss pieces pointed out that opendoar lacks information on a significant number of existing repositories in the region. currently, opendoar lists only registered repositories between egypt, iraq, saudi arabia, lebanon, the sudan and qatar. 
meanwhile, the ranking web of repositories (an initiative of the consejo superior de investigaciones científicas (csic), a public research body in spain) lists only repositories in their "arab world" category, with some overlap to opendoar. all together, opendoar, the ranking web, and the registry of open access repositories (roar) contained information on the repositories of individual institutions in countries of the arab world. ultimately, this pool of candidate repositories was supplemented by an informal survey of the major universities and institutions within the arabian gulf region, which was found to be under-represented in the registered and ranked repositories. the final pool of candidates grew to repositories in countries, from which were selected for assessment, representing each one of the countries from the pool.

assessment criteria

the foundations of the assessment criteria for this investigation were found in the metrics developed by the center for research libraries (crl), which tend to focus on the transparency of repositories. the crl's ten principles, developed along with the u.k.'s digital curation center, digitalpreservationeurope, and germany's nestor in january of , spell out a number of expectations for digital preservation repositories, including policy frameworks, stated ingestion criteria, and requisite metadata on digital objects, before and during preservation [ ]. these principles would be further refined in the trustworthy repositories audit & certification: criteria and checklist (trac), developed and used by the center for research libraries in auditing and certification of digital repositories, specifically sections a (organizational structure & staffing), a (procedural accountability & policy framework), b (ingest), b and b (preservation planning), and b (information management) [ ].
additionally, the latest edition of institutional digital repository benchmarks was consulted for the web visibility metrics contained in its marketing section [ ]. specifically, the benchmarks survey focused on whether repository web sites provide url links back to either the institution's library website or main website. the survey questions, however, effectively discount the ways that users might arrive at the repository. the arguable assumption is that libraries tend to either be given or willfully take on the responsibility of managing the institutional repository, and that users are thus often shuttled to repositories via the institution's library web site. this, too, would become an assessment metric.

however, because the assessment would also focus on the accessibility of repositories, the assessment would take place much in the same way a critic would visit a restaurant (i.e., without announcement and without interviews). thus, it seemed appropriate to conduct more informal surveying, this time of repositories outside of the arab world, for accessibility trends. repositories that were already long-established (in relative terms, as the digital archiving community is itself still a blooming field) could provide insight in assessing the comparatively "younger" pool of the arab world; baseline repositories were chosen, each founded sometime between and and representing institutions of the united states, canada, the united kingdom, australia, china, and the netherlands.

the full assessment criteria were structured in order to answer the following questions:

- what is the name of the repository's home institution, country of origin, and software platform?
- is the repository registered with opendoar or roar? if so, what is the link?
- does the repository have a dedicated "landing page" with associated information?
- does the repository have a connecting url link from the institution's library web page?
- does the repository or the library site offer contact information for the repository staff or administrators? if so, who is listed and how many?
- is there a stated submission policy for ingested objects?
- is there a stated preservation plan?
- is there a stated collection policy?
- is there a stated metadata policy?
- are there stated standards on what file types can be deposited?
- is self-deposit allowed for the repository user community?
- is the repository subject to some kind of institutional mandate?
- is there a mandate registered with the registry of open access repositories mandatory archiving policies (roarmap)?
- does the repository or library offer open access information?
- is there an open access policy?

assessment results

the full results of the investigation of arab world repositories were recorded in a google sheets document which is freely available for reference. the data collected from the baseline repositories, which informed the assessment criteria, is also freely available for reference.

. web presence and accessibility

as mentioned previously, one of the main focuses of this investigation was the web presence of each sampled repository and the process of reaching each of them, starting at the institution's library website. all of the baseline repositories were reachable from somewhere within the library's web site, mostly from the front page, or through minimal searching. roughly half ( , or percent) opted to utilize ir landing pages in their library website: pages where information about the repository "lives" on the site, as opposed to storing relevant information on the repository itself. of the repositories from the arab world, ( percent) were directly linked to from the university library's web page; only three of those ( percent of the overall group) elected to utilize landing pages in their library website.
a further four repositories were searchable via a portal on the library web page; this left six repositories which either were not linked to directly from the library site (instead being linked from somewhere else on the institutional web site) or, in some cases, appeared not to be linked anywhere at all.

also investigated was whether the repository (or its associated institutional web presence) provided concrete contact information for those responsible for the repository's administration or oversight. section a of trac (procedural accountability & policy framework) recommends that a trustworthy repository "ensure[s] that feedback from producers and users is sought and addressed over time" and "commits to transparency and accountability in all actions supporting the operation and management of the repository" [ ]; again, all of the baseline repositories included at least one specific point of contact. however, less than half of the sampled repositories ( , or percent) provided a contact email for the responsible party (or parties). this number rises slightly to ( percent) if non-email feedback forms are included; however, this still leaves five repositories that included no form of communication at all.

more troubling, though, were the connectivity issues affecting a handful of the sampled repositories. over the course of research in , khartoumspace, the repository of the university of khartoum in the sudan, seemed to suffer many server errors, resulting in unpredictable access to the repository and its contents. the repository of khalifa university in abu dhabi has appeared to suffer from seemingly permanent connectivity errors for months, rendering the archive unavailable since at least the beginning of ; meanwhile, the url for the british university in egypt merely points to a welcome page for internet information services (iis) , a microsoft windows web server.
the total inaccessibility of the latter two ultimately dropped the total of the sample pool down to repositories.

. transparency of policies

as noted in the assessment statements above, the repositories in this study and their institutions were analyzed for formal, publicly-available policies and procedures, specifically covering the submission of ingested objects, preservation, detailed scope of collection, metadata, recommended file types of objects [ ], and the explicit allowance of users to self-deposit their work. an individual analysis of the specifics of each policy, however, was not undertaken; rather, the focus was placed on whether or not these policies exist via explicit references or availability on the library/repository websites.

table clearly demonstrates the low frequency of such policies and procedures in the arab world repositories; only egypt's american university in cairo, saudi arabia's king abdulaziz university and king abdullah university, and tunisia's université virtuelle de tunis made their policies available, and even then, not all of the policies that were investigated. only the king abdullah university of science and technology in saudi arabia offered an application profile, or the available fields of metadata, for their digital archive; that being said, their faq page maintains that "the only compulsory field [for submitted objects] is title." similarly, only the american university in cairo offered recommendations on file types of submitted objects that moved beyond standard boilerplate repository language. none of the repositories offered written policy on the methods of preservation of their materials, apart from general statements detailing their overall goals. (it should be noted that these repositories are not assumed to be lacking in formal policies on their day-to-day operation and long-term goals; this is simply a report of the public availability of those institutional policies.)

repository name | submission policy? | collection policy? | metadata policy? | file types policy? | preservation plan? | self deposit allowed?
depot institutionnel de l'universite kasdi merbah ouargla | no | no | no | no | no | unknown
dépôt institutionnel de l'université de biskra (institutional repository) | no | no | no | no | no | unknown
dépôt institutionnel de l'université de biskra (phd theses) | no | no | no | no | no | unknown
bibliothèque virtuelle de l'université d'alger | no | no | no | no | no | unknown
american university in cairo digital archive and research repository | yes | no | no | yes | no | yes
alexandria scholarly publication repository portal | no | no | no | no | no | unknown
thi qar university repository | no | no | no | no | no | unknown
yu-dspace | no | no | no | no | no | unknown
american university of beirut scholarworks | no | no | no | no | no | unknown
lebanese american university ecommons | no | no | no | no | no | unknown
université mohammed v - rabat | no | no | no | no | no | unknown
qu institutional repository | no | no | no | no | no | unknown
king abdulaziz university digital repository of information science | yes | yes | no | no | no | yes
king abdullah university of science and technology digital archive | yes | yes | yes | no | no | yes
king fahd university of petroleum and minerals eprints | no | no | no | no | no | unknown
king saud university repository | no | no | no | no | no | unknown
sudan university for science and technology institutional repository | no | no | no | no | no | unknown
sali library english literature collection | no | no | no | no | no | unknown
university of khartoum khartoumspace | no | no | no | no | no | unknown
université virtuelle de tunis edoc | no | yes | no | no | no | yes
british university in dubai bspace | no | no | no | no | no | unknown
american university of sharjah | no | no | no | no | no | unknown
masdar institute of science and technology | no | no | no | no | no | unknown

table : general policies and procedures of the assessed arab world repositories

these results were compared to the availability of policies from the baseline repositories.
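the frequencies summarized in the table above can be tallied mechanically from its rows. the following is a minimal python sketch, not the author's method: the short column keys, the function names `tally_policies` and `share`, and the row total of 23 (counted from the table rows as reproduced here) are assumptions introduced for illustration. only the four repositories with at least one "yes" are listed; every other row is "no" for all five policies and "unknown" for self-deposit.

```python
from collections import Counter

# Column order follows the policy table; the short keys are illustrative names.
POLICY_COLUMNS = [
    "submission", "collection", "metadata",
    "file_types", "preservation", "self_deposit",
]

# Rows of the table that contain at least one "yes" (transcribed from the table).
ROWS_WITH_POLICIES = {
    "american university in cairo": {"submission", "file_types", "self_deposit"},
    "king abdulaziz university": {"submission", "collection", "self_deposit"},
    "king abdullah university of science and technology": {
        "submission", "collection", "metadata", "self_deposit",
    },
    "université virtuelle de tunis": {"collection", "self_deposit"},
}

# Assumption: number of rows in the table as reproduced in this text.
TOTAL_REPOSITORIES = 23


def tally_policies(rows, columns):
    """Count how many repositories answer "yes" for each policy column."""
    counts = Counter()
    for policies in rows.values():
        counts.update(policies)
    # Columns nobody answered "yes" to are reported as zero.
    return {column: counts.get(column, 0) for column in columns}


def share(count, total):
    """Percentage of the sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for column, n in tally_policies(ROWS_WITH_POLICIES, POLICY_COLUMNS).items():
        pct = share(n, TOTAL_REPOSITORIES)
        print(f"{column}: {n} of {TOTAL_REPOSITORIES} ({pct}%)")
```

running the script prints one line per policy column. the shares are derived from the table rows only and are not figures reported in the article.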
seventy-eight percent of the baseline group ( repositories, which vary by actual policy) included collection and document submission policies, as well as self-deposit, while just over half ( repositories, or percent) included standards for file formats. far less common in the baselines were explicit policies on metadata ( , or one-third) and preservation ( , or percent); in the case of preservation, a further three repositories explicitly mentioned techniques such as migration from obsolete formats, maintaining regular backups, and checking bit integrity, but without any substantial policy or planning information accompanying these tidbits, they were judged to be no more helpful or informative than boilerplate statements about best practices.

. open access and mandate information

open access information was also collected; repository, library and institution websites were directly investigated for both general information on the concept of open access and institution-specific policies, including any references to mandated deposits. the registry of open access repositories mandatory archiving policies (roarmap) was then checked for any specific institutional entries found in the repository pool and cross-referenced with the collected data. while all of the baseline repositories included some basic information on open access (via either the repository, library or institutional website), over half ( repositories, or percent) carried an explicit open access policy and/or registered some kind of mandate with roarmap. (interestingly, having one did not necessarily guarantee having the other.)

table shows the collected data for the arab world repositories. only one (king abdullah university of science and technology in saudi arabia) explicitly references an institutional open access policy, along with a registered entry at roarmap.
one might be tempted to conclude that similar mandates otherwise do not exist within the other repositories; the problem is that at least four of the other repositories (bibliothèque virtuelle de l'université d'alger, the american university in cairo, the lebanese american university, and the american university of sharjah) employ language, on either the library websites or the repositories themselves, that implies the existence of thesis mandates for graduating students. in the case of aus, direct experience proves this to be the case, but for the other three, without clearly defined policies, we are left to read between the lines with statements such as "all theses published after are open access" [ ], and "you will not receive your graduation invitations until you submit an electronic final approved version of your thesis to the repository" [ ].

institution & repository name | open access information? | open access policy? | stated mandate? | mandate type | roarmap link
depot institutionnel de l'universite kasdi merbah ouargla | no | unknown | no | n/a | n/a
dépôt institutionnel de l'université de biskra (institutional repository) | no | unknown | no | n/a | n/a
dépôt institutionnel de l'université de biskra (phd theses) | no | unknown | no | n/a | n/a
bibliothèque virtuelle de l'université d'alger | yes | unknown | likely | thesis? | n/a
american university in cairo digital archive and research repository | yes | unknown | likely | thesis? | n/a
alexandria scholarly publication repository portal | no | unknown | no | n/a | n/a
thi qar university repository | no | unknown | no | n/a | n/a
yu-dspace | no | unknown | no | n/a | n/a
american university of beirut scholarworks | yes | unknown | no | n/a | n/a
lebanese american university ecommons | yes | unknown | likely | thesis? | n/a
université mohammed v - rabat | no | unknown | no | n/a | n/a
qu institutional repository | yes | unknown | no | n/a | n/a
king abdulaziz university digital repository of information science | yes | unknown | no | n/a | n/a
king abdullah university of science and technology digital archive | yes | yes | yes | institutional | http://roarmap.eprints.org/ /
king fahd university of petroleum and minerals eprints | no | unknown | no | n/a | n/a
king saud university repository | no | unknown | no | n/a | n/a
sudan university for science and technology institutional repository | no | unknown | no | n/a | n/a
sali library english literature collection | no | unknown | no | n/a | n/a
university of khartoum khartoumspace | no | unknown | no | n/a | n/a
université virtuelle de tunis edoc | yes | unknown | no | n/a | n/a
british university in dubai bspace | no | unknown | no | n/a | n/a
american university of sharjah | no | unknown | likely | thesis? | n/a
masdar institute of science and technology | no | unknown | no | n/a | n/a

table : open access policies and mandates of the assessed arab world repositories

ahmed and al-baridi identify an overall lack of open access discourse in the arabian gulf region [ ], which may offer a possible explanation for the lack of open access and mandate policy information in the surveyed repositories. however, a healthy dose of skepticism may be in order for this particular assessment; the freely available metadata of the contents of the directory of open access journals lists more than individual open access journals founded in countries in the arab world (though almost of them originate from egypt) [ ].

conclusions and further applications

the impetus for this research was to assess the progress of the modern institutional repository of the arab world, and to assess its appraisal as relatively "youthful" in comparison to repositories of the rest of the world. many of the repositories in the arab world have taken the initiative to build their own unique collections, make their presence felt within the community, and begin building a case for their necessity.
however, these comparisons seem to bear witness to ahmed and al-baridi's appraisal of a community in relative infancy. obviously, not all procedural documentation requires full disclosure, but a sizable portion of the sampled repositories seem to occupy a plane of existence where material is deposited for safekeeping and public circulation, but without any public acknowledgment or information on how those goals will be met. and while the focus of investigation rested on the repositories from the arab world, the sampled baseline repositories were notably lacking in explicit policies on file formats ( percent), metadata ( percent) and preservation ( percent), revealing that even these repositories have areas in need of development. to once again refer to trac, "only a repository that exposes its design, specifications, practices, policies, and procedures for risk analysis can be trusted" [ ]. a further indication of the need for serious commitment to the development and management of institutional repositories in the arab world is that in the short time between when this article was written and its publication in d-lib magazine, the accessibility of a number of the assessed repositories has fluctuated greatly: universite kasdi merbah ouargla, thi qar university, lebanese american university, université virtuelle de tunis, and the american university of sharjah. the path to developing a secure, trustworthy repository — whether assessed through trac, drambora, or another auditing method — often requires a serious commitment of time and resources. hopefully, the assessment criteria developed for this research project can be of use to repositories while preparing certification proper, especially relatively young repositories.   references & notes [ ] syed sajjad ahmed and saleh al-baridi, "an overview of institutional repository developments in the arabian gulf region." oclc systems & services: . http://doi.org/ . / [ ] mohamed boufarss, "if we build it, will they come? 
a survey of attitudes toward institutional repositories among faculty at the petroleum institute." special libraries association-arabian gulf chapter th annual conference. http://www.ceser.in/ceserp/index.php/ijls/article/view/
[ ] ahmed and al-baridi, .
[ ] robert h. mcdonald and charles thomas, "cross-institutional repository assessment: a standardized model for institutional research assessment." arl assessment conference. association of research libraries, seattle, august .
[ ] center for research libraries, digital curation centre, digital preservation europe, and competence network for digital preservation, "ten principles." .
[ ] center for research libraries, trustworthy repositories audit & certification: criteria and checklist. version . . center for research libraries, chicago, .
[ ] institutional digital repository benchmarks. ed. new york: primary research group, inc., : - . print.
[ ] trustworthy repositories audit & certification: .
[ ] the dspace platform includes a boilerplate list of supported file formats in its help guide. this study differentiated this standard list from a policy provided by the institution that specifies preferred or disallowed file formats available for submission to the repository.
[ ] available here.
[ ] available here.
[ ] ahmed and al-baridi: .
[ ] metadata can be acquired here.
[ ] trustworthy repositories audit & certification: .

about the author

scott carlson is the metadata coordinator at rice university's fondren library. between march and august , he was the cataloging and metadata librarian at the american university of sharjah, an accredited, multicultural institution in the united arab emirates. he received his mlis from dominican university in river forest, illinois, and recently completed an archives certificate in digital stewardship from simmons college.

copyright © scott carlson
analysis of variation significance in artificial traditions using stemmaweb

tara l andrews, universität bern
email: firstname.lastname@kps.unibe.ch
correspondence: digital humanities, muesmattstrasse , ch- bern.

the role of the scholar’s intuition in textual scholarship is a subject that has occasioned impassioned debate at times over the last century or more. is textual criticism a science or an art: should it be pursued with methodical rigor or with intellectual inspiration? nowhere is this conflict more pointed than in the sub-field of text stemmatology. while nearly all textual scholars agree that, particularly in the era before the printing press, texts were copied and changed in both intentional and unintentional ways, not all of them admit both the possibility and the utility of deriving a stemma of its transmission. those who would do so, either for the purposes of text reconstruction or simply to study its history, must align themselves on an ideological spectrum that ranges from the superiority of human intellect and judgment represented by the method of lachmann, to the wholehearted embrace of empirics and statistics represented by phylogenetic methods.

since the nineteenth century, the process of stemma construction has been more or less codified and methodical. for all the formalization it has undergone, however, at the core of stemmatics there still lies the question of what role, precisely, philological judgment should play. while modern computational methods allow philologists to delay
judgment until most of the analysis is done (in the case of neo-lachmannian binary tree construction) or even to suspend it altogether (in the case of purely phylogenetic trees presented as stemmata), there has been little assessment of the positive difference that philological intuition makes to the recovery of the transmission history of a text.

here we report on an experiment designed to assess the weight that can be given to philological judgment in three cases, all artificial traditions in which the true stemma of the text is known. we shall give an overview of each of these traditions, discuss the methods and tools used for experimentation, examine the results that were obtained, and draw some general conclusions.

background

in his recent study of the development of humanistic method, rens bod (bod, ) writes approvingly that ‘stemmatic philology appears to be the only humanities discipline to have become a “normal science”’. this statement might come as something of a surprise to stemmatologists, many of whom are embroiled in an on-going conflict between the desire for empiricism and falsifiability in stemmatic method on the one hand, and the belief on the other hand that mechanical process simply cannot replace human intuition as a means to divine the ‘signal’ in textual variation from the ‘noise’.

the history of textual criticism since roughly the time of lachmann can certainly be understood as a story of attempts to create bod’s “normal science”, that is, to formalize and generalize the restoration of a text into something approaching a scientific method (e.g. greg, ), and of reactions against these attempts by scholars who believed that no mechanistic approach could ever rival the work produced by the intuition that a genuine master of textual scholarship should possess (e.g. housman, ) or indeed who believed that stemmatic methods tend to produce specious nonsense (e.g. bédier, ).

i am very grateful to the reviewers of this article for their numerous helpful comments, and in particular to matthew spencer for his suggestions concerning statistical analysis of the results. their feedback has vastly improved this paper.

the middle ground after over a century of these debates is perhaps stated most succinctly by west ( ), who explains how a stemma should be created:

    the investigator will not put off the question of the interrelationships of the manuscripts till he has finished collating them: he will be considering it while he collates them, forming and modifying hypotheses all the time. this will not only make the work considerably more interesting to do (which will make him more alert and accurate while doing it), it will also shorten it, as will be explained presently.
as the use of cladistic and other phylogenetic methods accelerated in the last decades of the twentieth century, and as software for automatic collation began to be available, the prevailing attitude changed again: many scholars today (andrews, a; robinson, ; wattel, ) have advocated best-practice methods in which the collation is produced before any analytical judgment is made concerning the relationships between the texts, on the basis of all available textual information, with as little human interference as possible (although opinion remains divided as to whether the collations should be normalized for orthography, punctuation, and so forth). only when the collation is finished should the analysis begin. this attitude itself represents a shift in textual criticism back in the direction of ‘science’ from ‘art’, insofar as interpretation is separated from that which can be done in a mechanical way with reasonable and undisputed accuracy. even so, while some scholars have wholeheartedly embraced cladistics to such a degree that they no longer attempt even the orientation of a phylogenetic tree into a more traditional stemma, most others prefer a ‘happy marriage of our human philological judgment with the computing power of our algorithm’ (roelli and bachmann, ). cladistic methods do not make any inherent distinction or judgment concerning the significance of a variant; while arbitrary weightings can certainly be supplied by scholars to be used in the algorithm (howe, connolly, and windram, ), at present these weightings tend to arise from philological judgment rather than any computable property of the text.
rather than simply the increasing separability of collatio and recensio, however, bod seems to draw his impression of stemmatology-as-a-science from multiple studies that appeared in the late s and early s (e.g. salemans, , ; schøsler, ; smelik, ) in which attempts were made to derive formal categories of text variation and assign relative text-genealogical weights to different categories.

the most well-known of these is the work of salemans ( ), who proposed a strict set of formal guidelines for the categorization of textual variation and the selection of those variants that should be deemed ‘text-genealogical’, that is, significant enough to form the basis for construction of a text-stemmatic tree. salemans is straightforward about how he constructed these guidelines. some of them are drawn from his own philological intuition, informed by the common wisdom of philologists who came before him, for identifying those sorts of variants that are unlikely to occur by chance; others, which appear more strangely restrictive, are meant to ensure that the algorithm he uses can draw up a neat binary tree, as free of contradiction as possible. a few examples of these rules are listed here:

• a place of variation in the text occurs where there are two or more ‘competing’ readings of the text, while the surrounding readings agree in all text versions; these places should be as small as possible.
• a place of variation suitable for the construction of a stemma is one that contains exactly two competing variants, each attested by at least two witnesses.
• reordering of words (assuming the reordering is grammatically correct) may be used as a text-genealogical variation, so long as there are at least three words being reordered, none of which are adverbs.
• nouns and verbs are the most suitable types of readings for creation of a stemma.

the primary concern of salemans was to exclude the possibility (so far as it can be done) that the scholar might compromise his or her stemma by inadvertently assigning text-genealogical significance to a variant that in fact arose coincidentally in parallel in unrelated manuscripts; in order to avoid this possibility, the method tends to discard the vast majority of observed variation from consideration.

cautious as it is, does the method of salemans work? he used it to produce a plausible stemma for the text of lanseloet van denemerken, but as salemans himself affirms in a long discussion of the merits of deductive reasoning, he has used his own textual intuition and prejudices to build up a set of rules for avoiding those very textual prejudices. as schmid ( ) points out, this has produced a result that conforms very nicely to the intuition by which it is shaped. it is an interesting deductive experiment but there is little in the way of falsifiability in the result.

in the same article, schmid observes that salemans ‘certainly pinned down [the types of variant readings] that are predominantly suspect of accidental variation’. in other words, salemans has done an excellent job of codifying the shared philological common wisdom of his time; he has not provided additional evidence that the common wisdom is actually justified.
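as a rough illustration of the second rule listed earlier (exactly two competing variants, each attested by at least two witnesses), the check could be sketched in python as follows. the function name and the dictionary layout are assumptions made for this example; nothing here is taken from salemans's own software.

```python
from collections import defaultdict


def is_stemma_candidate(readings_by_witness):
    """Return True if a place of variation has exactly two competing
    readings and each of them is attested by at least two witnesses.

    readings_by_witness maps a witness siglum to the reading it carries
    at this place of variation (a hypothetical layout for illustration).
    """
    witnesses_by_reading = defaultdict(set)
    for witness, reading in readings_by_witness.items():
        witnesses_by_reading[reading].add(witness)
    if len(witnesses_by_reading) != 2:
        return False
    return all(len(ws) >= 2 for ws in witnesses_by_reading.values())


# Two readings, two witnesses each: usable under the rule.
balanced = {"A": "honour", "B": "honour", "C": "horror", "D": "horror"}
# One reading is carried by a single witness only: rejected.
singleton = {"A": "honour", "B": "honour", "C": "honour", "D": "horror"}

print(is_stemma_candidate(balanced))   # True
print(is_stemma_candidate(singleton))  # False
```

the second case shows why the rule discards so much material: any reading found in only one witness is dropped, however striking it may be.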
schmid  goes  on  to  demonstrate  not  only  that  ‘suspected     accidental’  variation  is  not  always  coincidental,  but  also  that  variation  that  ought  to  be     safely  genealogical  by  the  standard  of  salemans  is  not  necessarily  so!  this  has  called  into     sharp  question  the  reliability  of  philological  common  sense  in  the  first  place.       schmid’s  findings  on  the  potential  significance  of  ‘insignificant’  variance  have     been  corroborated  elsewhere  (blake  and  thaisen,   ;  spencer,  mooney,  et  al.,   );     it  is  clear  that,  if  we  discount  these  entirely,  we  are  losing  potentially  valuable     information.  what  has  not  so  far  been  tested  in  any  real  way  is  the  philological  judgment     that  is  at  the  heart  of  all  the  classification  systems  that  have  been  proposed.       between    and    a  computational  object  model  was  developed,     implemented  as  a  perl  library,  to  represent  a  given  tradition  together  with  the  variation     in  its  witnesses  as  an  interlinked  graph;  a  companion  model  was  developed,  again  based     conceptually  on  a  graph,  to  represent  arbitrarily  complex  manuscript  transmission.  use     of  these  models  made  it  possible  to  perform  empirical  analysis  on  a  variety  of  stemmata     produced  using  different  methods  (andrews  and  macé,   ).  the  models  also  provide     the  underlying  framework  for  a  set  of  software  tools  that  were  used  to  perform  the     analysis  and  subsequently  made  available  to  other  textual  scholars  for  their  own  use     (andrews,   b).  
one  tool  allows  the  categorization  and  annotation  of  the  way  in     which  individual  variant  readings  are  related,  another  allows  the  specification  of  one  or     more  stemma  hypotheses,  and  a  third  performs  an  analysis  and  cross-­‐correlation  of     reading  variants  with  their  consequences  for  any  of  the  existing  stemma  hypotheses.     the  initial  experiments  conducted  using  these  tools  also  corroborated  the  findings  that     ‘insignificant’  variation  was  surprisingly  likely  to  follow  text-­‐genealogical  transmission     patterns  in  both  artificial  text  traditions  and  genuine  traditions  for  which  reasonable     certainty  of  the  stemma  can  be  had;  we  concluded  that  the  application  of  syntactically-­‐   based  categories  of  the  sort  that  are  relatively  straightforward  to  identify  automatically     using  linguistic  analysis  parsers  (e.g.  spelling  variation,  grammatical  variants  of  the     same  word,  variants  that  involve  different  words  fulfilling  the  same  grammatical       function,  which  were  termed  ‘lexical’  variants  in  the  tools)  does  not  tend  to  pick  out  the     sorts  of  variation  that  are  more  or  less  likely  to  indicate  the  copying  history  of  the  text.     with  these  tools  in  place,  however,  and  with  a  set  of  texts  for  which  the  stemma  is     known  (such  as  the  corpus  of  artificial  text  traditions),  we  can  instead  attempt  a  much     simpler  categorization:  to  indicate  those  variants  which,  in  the  scholarly  judgment  of  a     philologist,  are  likely  to  be  stemmatically  significant.  from  there  we  can  assess  the     results:  how  often  was  the  philologist  correct,  and  how  often  did  the  copyist  produce  an     unexpected  surprise?       
the artificial traditions

in roughly the last decade there have been a number of ‘artificial traditions’ made for the purposes of stemmatological experimentation; these are texts that were copied by volunteers, so that the actual order of transmission is known and a true stemma can be drawn. three of these were used in the experiment described here.

the first is a french translation of a swedish work, notre besoin de consolation est impossible à rassasier. the archetype text, first dictated to a non-native french speaker and then corrected by a native speaker without reference to the printed edition, is words long; it has been made available in copies from different hands (see fig. for the stemma). one of the texts was copied both before and after being mutilated; the first of these copies was itself copied before being ‘lost’, and the second used a different exemplar to replace the missing text. this was done to simulate both the loss of texts in a copying history and the phenomenon of ‘contamination’ of the stemma.

this tradition was created for the comparison of several different methods for computational stemmatology (baret, macé, and robinson, ); this experiment is the only one to date for which the results of ‘classical’, non-computational methods of stemma creation were included alongside the computational versions.
in the published experiment, one of the two non-computational methods came closest to reproducing the true stemma, although the computational methods (none of which are able to infer the sort of contamination that was present in the true stemma) were assessed on the basis of the raw output of the algorithm, without any interpretative intervention. the authors note that ‘most philologists’ were easily able to observe the shift of exemplar from the collation alone, which suggests that, had the computational methods been subject to interpretation, the outcome may well have been different.

fig. : stemma for the notre besoin artificial tradition

the second artificial tradition is an english translation of a portion of the medieval german epic poem parzival. this text is words long, copied by an unknown number of volunteer scribes, and is available in versions (see fig. for the stemma). although the text is a little shorter than notre besoin, the somewhat archaic language gave rise to more frequent variation within copies. the parzival artificial text was used to test the applicability of phylogenetic methods from evolutionary biology on textual data (spencer, davidson, barbrook, and howe, ). no attempt to reconstruct the stemma by hand was reported for this experiment.

fig. : stemma for the parzival artificial tradition

the third artificial tradition is a text in old finnish, piispa henrikin surmavirsi. this text, also known as the “heinrichi” tradition, is roughly words long and was copied by volunteer scribes.
copies were made, of which were made available for analysis (see fig. for the stemma). the creators of this tradition wished to simulate medieval copying conditions as far as possible in the modern era; in service to that goal they chose a text in an archaic language that was only imperfectly known to most of their scribes (speakers of the modern language), they produced a far larger set of manuscript texts, they had some of the volunteers make two or three copies from different exemplars, and several of the copies were mutilated after the volunteer work of copying had finished to simulate damage to manuscripts that tends to occur over time. this tradition was the primary data set used in a ‘computer-assisted stemmatology challenge’ run in (roos and heikkilä, ); both the notre besoin and the parzival artificial traditions were also provided to challenge entrants. no attempt at a stemma reconstruction by hand of the heinrichi text was reported during the challenge.

fig. : stemma for the available texts of the heinrichi artificial tradition

the experiment

for each of the artificial traditions, a volunteer philologist agreed to use the stemmaweb software (andrews, b) to categorize the textual variants according to whether, in his or her opinion, the variation was stemmatically significant; in the case of the parzival text, two volunteers were found.
the  volunteers  were  chosen  both  for  their  experience     in  the  practice  of  philological  reconstruction  of  medieval  texts  and  for  their  native  or     near-­‐native  familiarity  with  the  language  of  the  text.  if  there  were  more  than  two       readings  in  a  variant  location,  then  the  determination  had  to  be  made  for  each  pair  of     readings  with  respect  to  each  other  at  that  location.  since  the  philologist  did  not  consult     the  stemma,  it  was  impossible  to  have  any  external  verification  of  which  reading  in  a  set     of  variant  readings  came  from  the  archetype,  and  which  were  derivative  readings.         the  premise  to  be  tested  is  this:  a  trained  philologist  should  be  able  to  choose  variants     as  ‘significant’  that  do,  in  fact,  genealogically  follow  the  true  stemma.  the  converse  is  not     true;  the  philologist  should  not  be  expected  to  choose  with  any  certainty  those  variants     that  positively  contradict  the  stemma;  to  call  a  variant  ‘insignificant’  merely  means  that     it  cannot  be  relied  upon  to  provide  text-­‐genealogical  information.  a  great  many  so-­‐called     ‘insignificant’  variations  happen  to  follow  the  stemma  in  all  three  of  the  texts.     the  notre  besoin  and  parzival  texts  were  not  normalized  in  any  way;  the  heinrichi  text,     due  to  its  sheer  size  and  complexity,  was  normalized  for  spelling.  since  spelling  variation     is  almost  universally  considered  not  to  be  stemmatically  significant,  it  was  felt  that  this     normalization  would  not  harm  the  philologist’s  chances  of  choosing  ‘significant’     variation.       
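the pairwise requirement mentioned above (a separate determination for each pair of readings at a location with more than two readings) grows quickly: n readings give n(n-1)/2 pairs. a minimal python sketch of the enumeration follows; the function name is hypothetical and the readings are only illustrative.

```python
from itertools import combinations


def reading_pairs(readings):
    """Return every unordered pair of distinct readings at one location."""
    return list(combinations(sorted(set(readings)), 2))


pairs = reading_pairs(["turns", "twins", "turn"])
print(pairs)
# [('turn', 'turns'), ('turn', 'twins'), ('turns', 'twins')]
```

a location with three readings therefore asks for three judgments, and one with five readings would ask for ten.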
the stemmaweb text annotation interface presents the variant texts as a unified ‘variant graph’, in which textual alternatives are represented relative to each other in a continuous presentation of the entire text (cf. andrews and macé, ; dekker, hulle, middell, neyt, and zundert, ; schmidt and colomb, ). the user may create a relationship between two analogous reading nodes, and define several properties of the relationship (see fig. ). in this case the philologist had the option of providing any or all of the following information:

• how the readings were related syntactically (e.g. whether it was a spelling, grammatical, or some other sort of variation; whether the readings were variant grammatical forms; whether they were different words filling the same grammatical role in the sentence).
• whether the variation was significant (possible answers were “yes”, “maybe”, and “no”).
• whether the variation was unlikely to have occurred coincidentally.
• whether a scribe, upon seeing reading a, might ‘correct’ it to match reading b without reference to another exemplar (or vice versa).

fig. : variant classification interface for stemmaweb: creating a relationship between the parallel readings ‘honour’ and ‘horror’.

there is currently a deficiency in the stemmaweb software, so that there is no way to indicate whether a gap (or addition) in the text is stemmatically significant. the volunteer philologists were made aware of this deficiency at the outset of the experiment, and each of them was asked to keep a list of which addition/omission variants might be significant.
two such lists were received, both for the parzival text; for the other texts, the philologists working on the texts simply stated guidelines to be applied for these variants. in both cases they advised that such variants were likely to be significant, unless it was purely a question of easily-replaceable readings such as punctuation.

once annotated, the text variation was compared against the true stemma for each tradition. for this, the text is subdivided into variant locations: these are places in the text where variation occurs, and in terms of the graph a variant location occurs wherever more than one reading occurs at the same rank (that is, the same number of readings distant from the nearest shared prior reading) in the graph. in order to avoid artificially inflating the number of variants, each graph was compressed before analysis, so that individual sequences of readings that did not vary between witnesses, and for which no individual relationships had been made to parallel readings, were treated as a single reading. three examples of a graph with compression rules applied are given in fig. . in the example marked a, the relationship between βλασφημίας and βλαςφημία prevents compression, so that βλασφημία[ς] is treated as one reading, and the omission of ἀπορία in witness q is treated as a separate reading. in example b, on the other hand, the entire phrase ὡς οὐκ οἶδε is treated as a single omission in witness p(a.c.), and in example c the two words καθαίρει αὐτὸν are treated as a single reading with the alternative καθεαυτὸν in witness s.

fig. : examples of reading compression before analysis.
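the effect of compression on the variant count can be sketched with a simplified model. the assumptions here are considerable: the collation is represented as a table with one row per rank and one column per witness, rather than as the graph the actual library uses, and "did not vary between witnesses" is approximated by "the witnesses split into the same groups at adjacent ranks". all names and the sample text are hypothetical.

```python
def partition(row):
    """Group the witnesses by the reading they carry at one rank."""
    groups = {}
    for witness, reading in row.items():
        groups.setdefault(reading, set()).add(witness)
    return frozenset(frozenset(ws) for ws in groups.values())


def compress(rows, protected=frozenset()):
    """Merge adjacent ranks at which the witnesses split the same way.

    rows: list of dicts mapping witness -> reading (None for a gap);
          every row is assumed to list the same witnesses.
    protected: rank indexes that carry an annotated relationship and
          therefore must not be merged with their neighbours.
    """
    merged = []
    prev_key = None
    for rank, row in enumerate(rows):
        key = None if rank in protected else partition(row)
        if merged and key is not None and key == prev_key:
            last = merged[-1]
            for witness, reading in row.items():
                parts = [p for p in (last.get(witness), reading) if p]
                last[witness] = " ".join(parts) or None
        else:
            merged.append(dict(row))
        prev_key = key
    return merged


def variant_locations(rows):
    """Return the rows at which more than one reading occurs."""
    return [row for row in rows if len(set(row.values())) > 1]


# Witness P omits a three-word phrase that A and B both carry.
rows = [
    {"A": "the", "B": "the", "P": "the"},
    {"A": "king", "B": "king", "P": "king"},
    {"A": "as", "B": "as", "P": None},
    {"A": "he", "B": "he", "P": None},
    {"A": "knew", "B": "knew", "P": None},
]

print(len(variant_locations(rows)))            # 3
print(len(variant_locations(compress(rows))))  # 1
```

without compression the omitted phrase would count as three variants; after compression it counts as one, which is the behaviour described for example b.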
for  each  distinct  variant  location  within  the  text,  an  individual  instance  of     variation  was  counted  when  one  reading  was  changed  by  one  or  more  copyists  into  a     different  reading.  in  the  example  given  in  fig.    for  a  set  of  non-­‐genealogical  variants,  the     original  reading  turns  has  been  modified  two  different  ways:  in  witnesses  p  and  p  it     became  twins,  and  in  witness  p  it  became  turn.  the  reading  turn  itself  was  modified     again,  reverting  to  turns  in  witness  p .  three  instances  of  variation  are  thus  counted:     turns  -­‐>  twins,  turns  -­‐>  turn,  and  turn  -­‐>  turns.  as  a  result,  coincidental  variation  is     counted  as  a  single  instance  of  variation  (turns  -­‐>  twins),  but  the  phenomenon  of     reading  reversion,  wherein  a  scribe  uses  his  or  her  intuition  to  correct  the  reading  of  the     exemplar  to  match  an  ancestral  reading  that  the  scribe  did  not  personally  see,  is  counted     as  two  instances  of  variation  (turns  -­‐>  turn  and  turn  -­‐>  turns).     the  analysis  of  variant  locations  against  the  stemma  is  done  using  a  pair  of  graph     calculation  programmes  that  were  developed  for  the  purpose  (andrews  et  al.,   );  the     programmes  first  determine  whether  the  specific  occurrence  of  readings  can  be     explained  by  genealogical  adherence  to  a  given  stemma,  and  then  calculate  the  minimum     set  of  manuscripts  (the  ‘roots’)  in  which  each  reading  could  have  independently  arisen     (that  is,  without  having  been  copied  directly  from  the  exemplar.)  in  the  calculation,  a       particular  reading  is  classified  as  ‘genealogical’  if  and  only  if  there  is  a  single  ‘root’  for     the  reading  in  the  stemma;  for  archetypal  readings,  the  ‘root’  will  always  be  the     archetype.  
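both the counting of variation instances and the ‘root’ test can be sketched in a few lines of python. this is a simplified illustration and not the graph calculation programmes themselves: it assumes a tree-shaped stemma in which the reading of every node is known, whereas real stemmata contain lost witnesses whose readings must be inferred. the witness names and the tree shape below are invented and do not reproduce the parzival stemma.

```python
def variation_instances(parent, reading):
    """Distinct (exemplar reading, copy reading) changes along the stemma."""
    return {
        (reading[parent[w]], reading[w])
        for w in parent
        if reading[parent[w]] != reading[w]
    }


def roots(parent, reading):
    """For each reading, the witnesses in which it arises independently,
    i.e. the archetype or any copy whose exemplar reads differently."""
    result = {}
    for witness, r in reading.items():
        exemplar = parent.get(witness)
        if exemplar is None or reading[exemplar] != r:
            result.setdefault(r, set()).add(witness)
    return result


def genealogical(parent, reading):
    """A reading is genealogical if and only if it has a single root."""
    return {r: len(ws) == 1 for r, ws in roots(parent, reading).items()}


# Hypothetical stemma: copy -> exemplar ("arch" is the archetype).
parent = {"w1": "arch", "w2": "arch", "w3": "w1", "w4": "arch", "w5": "w4"}
reading = {
    "arch": "turns",
    "w1": "turns",
    "w2": "twins",  # changed independently ...
    "w3": "twins",  # ... and again, from an exemplar reading "turns"
    "w4": "turn",
    "w5": "turns",  # reversion to the ancestral reading
}

print(sorted(variation_instances(parent, reading)))
# [('turn', 'turns'), ('turns', 'turn'), ('turns', 'twins')]
print(genealogical(parent, reading))
# {'turns': False, 'twins': False, 'turn': True}
```

because the instances are collected in a set, the coincidental change to twins in two unrelated witnesses is counted once, while the reversion contributes two separate instances.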
no attempt was made to detect potential reading reversions; these were treated simply as separate variants.

since the philologists were working without reference to a stemma, there are several pairs of variants that were categorized in the interface but did not occur in the final analysis, because there was no instance of variation between the readings that formed the pair. in our example above, any categorization of the pair turn – twins would thus be disregarded, although the philologist may well have expressed an opinion, because according to the stemma no copyist read ‘turn’ and wrote ‘twins’ or vice versa.

fig. : analysis of a variant location in parzival. three instances of variation are recorded: turns -> turn by witness p , turn -> turns by witness p , and turns -> twins by witnesses p and p .

results

how, then, did our scholarly intuition fare? taking into account the difficulty with recording significance of addition/omission variants, the traditions were analysed according to three different scenarios:

1. addition/omission variants were excluded from the analysis.
2. additions were treated as significant unless the added readings were punctuation-only, in which case they were treated as insignificant. deletions were treated as possibly-significant, unless they were punctuation-only. in the case of the parzival text, the addition/deletion significance information that was provided directly by the philologist was used instead.
3. additions were treated as significant (except for the parzival text), and deletions were excluded from analysis.
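the three scenarios amount to a small decision rule, sketched here in python. the function and parameter names are hypothetical, and two points are assumptions because the text does not spell them out: punctuation-only deletions are treated as insignificant, and punctuation-only additions are treated as insignificant in the third scenario as well as the second. the scenarios are numbered in the order listed above.

```python
def significance(kind, punctuation_only, scenario, philologist_verdict=None):
    """Return "yes", "maybe", "no", or None when the variant is excluded.

    kind: "addition" or "deletion".
    scenario: 1, 2, or 3, in the order the scenarios are listed.
    philologist_verdict: an explicit verdict supplied by the philologist
    (as for the parzival text); when given, it replaces the blanket rule.
    """
    if scenario == 1:
        return None                      # additions and omissions excluded
    if kind == "deletion" and scenario == 3:
        return None                      # deletions excluded in scenario 3
    if philologist_verdict is not None:
        return philologist_verdict       # explicit list overrides the rule
    if punctuation_only:
        return "no"                      # assumption for deletions, see above
    return "yes" if kind == "addition" else "maybe"
```

writing the rule out this way makes its bluntness visible: every addition in a text receives the same verdict regardless of context.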
as well as the question of additions and deletions, there was the question of orthographic normalization of the text. due to the sheer size of the heinrichi tradition, the text was normalized for spelling and punctuation before the experiment began; the other two traditions were not normalized beforehand. in order to provide an adequate basis for comparison, the analysis for these two texts was run both with and without normalization in the relevant scenarios.

table shows the aggregate results. for each text (normalized or not) in each scenario, the number of total variants assigned to each of the significance values “yes”, “maybe”, and “no” is given, as well as the number of variants in each category that were found to follow the stemma in a genealogical fashion. reading the table, for instance, we can see that within the non-normalized parzival tradition there were variants in total, of which were deemed significant and were deemed potentially-significant. / ( %) of the readings deemed significant were in fact genealogical according to the stemma; / ( . %) of the readings deemed potentially-significant were genealogical.

a list of those variants marked significant for each text is given in tables – . we have omitted additions and deletions from the list, as well as “type- ” variation: this is a term for variant locations in which only a single manuscript, copied by no others, differed from the rest in its reading. for each relationship link the exemplar and copy reading is listed, along with whether the variation conforms genealogically to the stemma or is an instance of parallel/coincidental variation.
there was a somewhat surprising situation to be found within the notre besoin data: when the text was normalized, the number of variants counted went up and the accuracy went down. this was due to the set of readings at rank in the graph (see fig. ): the potential variants included the words “nime”, “cime”, “cîme”, “scime”, and an illegible word that was either “nime” or “scime”. if the two readings “cime” and “cîme” were treated as separate variants, then the variants could be arranged genealogically on the stemma so that each spelling arose from the reading in witness c; if, however, they were treated as spelling variants of the same word, then it was a parallel variation, in which witnesses u and s independently read ‘cime’ from their exemplars (a and c respectively)! this was an interesting specific counter-example to the prevailing wisdom that texts should be normalized for orthography before analysis.

fig.
: a variant location that is genealogical only before normalization

table : aggregate results of variant analysis for the three texts. the table has three parts (including addition/deletion assumptions; excluding addition/deletion assumptions; excluding only deletion assumptions), each with the columns parzival, parzival normalized, parzival, parzival normalized, notre besoin, notre besoin normalized, and heinrichi normalized, and the rows total yes, total maybe, total no, genealogical yes, genealogical maybe, and genealogical no.

table : list of significant variants in notre besoin (excluding addition/deletion)
| text position | genealogical? | exemplar reading | copy reading | note |
| | yes | je n'ai | jai | |
| | yes | minspire | m'inspirent | |
| | no | m'inspirent | minspire | reverted reading |
| | yes | nime or scime | cime | |
| | yes | arche | arc | |
| | yes | abandées | à bander | |
| | yes | au deu dieu | odieux | |
| | yes | avides | arides | |
| | no | arides | avides | reverted reading |
| | yes | la cèse | l'ascèse | |
| | yes | perds | prends | |
| | no | joie | jour | reverted reading |
| | yes | jour | joie | |
| | no | du | au | |
| | no | au | du | |
| | yes | tout | tour | |
| | yes | coup | tour | |
| | yes | des | pour | |
| | yes | être humain | lézard | |
| | yes | lézard | être humain | |

table : list of significant variants in parzival
| text position | genealogical? | exemplar reading | copy reading | note |
| | yes | rue | use | |
| | yes | rue | see | |
| | yes | clash | dash | |
| | yes | where | with | |
| | yes | hare | horse | |
| | no | reveal | several | |
| | yes | oh | ok | |
| | yes | its | his | |
| | yes | rate | note | |
| | no | note | rate | |
| | no | odd | old | |
| | no | cum | and | |
| | yes | cum | over | |

table : list of significant variants in parzival
| text position | genealogical? | exemplar reading | copy reading | note |
| | yes | heart | heat | |
| | no | heat | heart | reverted reading |
| | yes | rue | see | |
| | yes | hare | horse | |
| | no | reveal | several | |
| | yes | is | in | |
| | yes | rate | note | |
| | no | note | rate | reverted reading |
| | no | cum | and | |

table : list of significant variants in heinrichi
| text position | genealogical? | exemplar reading | copy reading | note |
| | no | wainen | nainen | |
| | yes | carcot | carkuhun | |
| | yes | gongarita | gangista | |
| | yes | gongarita | gangistu | |
| | yes | gongarita | amvanta | |
| | yes | suin | nin | |
| | yes | paljon | tuhansia | |
| | yes | enämbi | erämki | |
| | yes | cotiani | cariani | |
| | yes | pane | pahe | |
| | no | ohjat | olijat | |
| | yes | suoniset | puaniset | |
| | yes | harman | harwan | |
| | yes | orhilda | ahtialda | |
| | yes | iduilta | iavialta | |
| | yes | lihainen | likainen | |
| | yes | luocka | kuokka | |
| | yes | harjallen | haijuillen | |
| | yes | hyvän | kywän | |
| | yes | hyvän | luocka | |
| | no | aiella | siellä | |
| | yes | wiritti | wintti | |
| | yes | juoxemahan | juotemahan | |
| | yes | laulajtta | kaulojtta | |
| | yes | laulajtta | laukijitta | |
| | yes | wirguttamahan | weigottamahan | |
| | yes | wirguttamahan | wingottumahan | |
| | yes | rauta-cahlehisa | routa-cahlehisa | |
| | yes | rautainen | rantainen | |
| | yes | rautainen | tauroinen | |
| | yes | kukersi | kaukan | |
| | yes | walcoinen | waleoinen | |
| | no | fildin | tildin | |
| | yes | njn | siju | |
| | yes | wandi | wanki | |
| | yes | wandi | waneli | |
| | yes | takoa | kackoa | |
| | yes | takoa | tokra | |
| | yes | pannahinen | lallinlainen | |
| | yes | kiukahalda | luikahalda | |
| | yes | kiukahalda | kirikahalda | |
| | yes | parku | lauleli | |
| | yes | wielä | sulle | |
| | yes | se | olutta | |
| | yes | sun | tarjoapi | |
| | no | sun | suu | |
| | yes | wielä | wiila | |
| | yes | päänsi | leiwän | |
| | yes | päänsi | päänni | |
| | yes | päristelepi | päällystelepi | |
| | yes | sirgotelepi | virgotelepi | |
| | yes | heittelepi | heittelemi | |
| | yes | kijruhti | lähti | |
| | yes | lalloi | lakoi | |
| | yes | cuin | ain | |
| | yes | walehteli | certoili | |
| | yes | heitti | kejtti | |
| | yes | tuhkia | luhkia | |
| | yes | siwui | silleni | |
| | no | lahtarinsa | lahtaunsa | |
| | yes | pitkän | pilkän | |
| | yes | wuoldu | wuceldu | |
| | no | suxen | suten | |
| | no | siasta | piasta | |
| | yes | sitten | sinen | |
| | no | sinen | sitten | reverted reading |
| | yes | wandi | wouti | |
| | yes | corkuhujnen | dorkuhujjnen | |
| | yes | tacoa | tuloa | |
| | yes | tacoa | taloa | |
| | no | kuhunga | kuhunsa | |
| | yes | luuni | luuhi | |
| | yes | luuni | kuuni | |
| | yes | lendelepi | laudelepi | |
| | no | suaneni | suoleni | |
| | no | oroin | aroin | |
| | yes | nousiaisten | pargahisten | |
| | no | hieta-cungahan | hieta cangahan | |
| | yes | haudattihin | handotti | |
| | yes | kewät | kewät [ ] | transposition |
| | no | sijne | sijtte | reverted reading |
| | yes | sijtte | sijne | |
| | no | ja | jo | |

in all texts but heinrichi, the philological determination of stemmatic significance fared surprisingly poorly.
if human intuition is to be a reasonably reliable and accurate tool for assessing variation, one would expect the proportion of text-genealogical variation to be much higher among variants marked as significant than among those marked as potentially-significant; the proportion among the “maybes” should probably, in turn, be higher again than among those not marked as significant at all.

how, in this instance, do we define ‘poorly’? one way to examine the data is through use of a chi-square analysis on each of the text scenarios: if our philologists are successful at identifying genealogical variation, we should expect to find that there is a positive correlation between ‘genealogical’ and ‘significant’. if, on the other hand, the philologists are not successful, we will not be able to demonstrate the correlation with any degree of certainty. the chi-square test is not foolproof, both because the amount of variation classed significant is fairly low for most of our texts, and because it may not be safe to assume that each variant is entirely independent of the others in whether or not it is genealogical. it can nevertheless work as a first approximation.

table shows the results of the chi-square analysis across texts and scenarios. the only text to show a strong correlation between ‘significant’ and ‘genealogical’ is heinrichi. in the case where additions and/or deletions are included, however, this extremely strong correlation is highly negative! these are the scenarios where text additions are usually assumed to be significant, and deletions are usually assumed to be in the ‘maybe’ category. if we refer back to the numbers in table , however, we find that / ( .
%) of significant variants are genealogical, as compared to / ( %) of insignificant variants but only / ( . %) of possibly-significant! in this case, the decision to treat additions and deletions in this categorical manner has had a disastrous impact on the result.

table : results of chi-square analysis across all text scenarios. the table has three parts (all variants; excl. addition/deletion; excl. deletions), each giving the χ² value and p-value for parzival, parzival normalized, parzival, parzival normalized, notre besoin, notre besoin normalized, and heinrichi normalized.

once addition and deletion are excluded, the news for heinrichi is much improved: we can say with roughly % certainty that there is indeed a positive correlation between ‘genealogical’ and ‘significant’. for the other two texts, the chi-square test rather spectacularly fails to demonstrate any correlation at all!

an objection to the chi-square test could be raised here, however: the text that demonstrated a convincing correlation also happens to be the text for which an order of magnitude more variation existed to be analyzed.
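for readers who want to see what such a test computes, here is a minimal python sketch of a pearson chi-square calculation on a table of significance category (yes, maybe, no) against genealogical status. it is an illustration, not the analysis code behind the table: the counts are invented, the function names are hypothetical, and the closed-form p-value holds only for two degrees of freedom, which is what a three-by-two table gives. the sketch also reports the smallest expected cell count, on the assumption that the usual textbook form of the minimum-count criterion is meant.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and smallest expected count for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    smallest_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
            smallest_expected = min(smallest_expected, expected)
    return statistic, smallest_expected


def p_value_two_df(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2)


# Invented counts: rows are yes / maybe / no, columns are
# genealogical / not genealogical.
counts = [[20, 10], [15, 15], [10, 20]]
statistic, smallest_expected = chi_square(counts)
p_value = p_value_two_df(statistic)
```

the statistic measures only the strength of association; whether the association is positive or negative has to be read from the proportions in each row.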
the test is not usually recommended unless all combinations of category contain at least instances, and that criterion is not quite met by any of the texts besides heinrichi. in the case of parzival in particular, the philologist has marked relatively few variants as significant at all.

we might thus apply a simpler test: to compare the success rates of the ‘significant’ and ‘possibly-significant’ categories to the mean success rate of the text as a whole. we can treat this situation as a binomial distribution (with the same caveat concerning the independence of genealogical variants), and analyze the ‘significant’ group as a sample drawn from the whole. in this case, the successful philologist should have constructed a sample of ‘significant’ variants with a markedly higher mean success rate than the wider population of variants. (the same analysis can be performed on the population of ‘possibly-significant’ variants, but we would not expect such a marked difference in the success rate, so we omit that analysis here.) the specific question we ask is: what is the probability that a random sample of variants would have at least the same number of genealogical variants as our significant sample?

table shows the results of our binomial distribution. since in every case except for that of heinrichi there were fewer than genealogical variants classed as significant, the ‘plus-four’ rule has been applied to the data in order to compensate for the small sample size (moore, craig, and mccabe, ).
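the question posed above can be written as a binomial tail probability. the following python sketch shows that calculation together with the plus-four estimate of a proportion, which adds two successes and two failures to the sample. it is an illustration with hypothetical function names; how exactly the plus-four adjustment was combined with the binomial figures in the table is not stated in the text, so the two functions are shown separately.

```python
from math import comb


def tail_probability(successes, sample_size, base_rate):
    """P(X >= successes) for X ~ Binomial(sample_size, base_rate).

    base_rate is the share of genealogical variants in the whole text;
    sample_size is the number of variants marked significant.
    """
    return sum(
        comb(sample_size, k)
        * base_rate ** k
        * (1 - base_rate) ** (sample_size - k)
        for k in range(successes, sample_size + 1)
    )


def plus_four_proportion(successes, sample_size):
    """Plus-four estimate of a proportion: add two successes and two failures."""
    return (successes + 2) / (sample_size + 4)


# Invented figures: 8 of 10 'significant' variants are genealogical in a
# text where 60% of all variants are genealogical.
likelihood = tail_probability(8, 10, 0.6)
adjusted = plus_four_proportion(8, 10)
```

a small tail probability means that a random sample of the same size would rarely contain as many genealogical variants, which is what a successful selection should look like.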
table : ‘significant’ variants treated as samples from a binomial distribution. the table has three parts (all variants; excl. addition/deletion; excl. deletion), each giving the % mean genealogical, % genealogical significant, likelihood of randomness, and std. deviation for parzival, parzival normalized, parzival, parzival normalized, notre besoin, notre besoin normalized, and heinrichi normalized. in the std. deviation column, the values are negative for notre besoin and notre besoin normalized in all three parts, for the second parzival and parzival normalized rows in the parts excluding addition/deletion and excluding deletion, and for heinrichi normalized in the part excluding deletion.

with this analysis, we can see a differentiation of results between the three texts. the results for notre besoin were by far the worst: there was no scenario where the variants treated as significant were more likely than average to be genealogical.
both parzival texts fared slightly better when additions and deletions were taken into account; since these were the two texts for which a positive list of additions and deletions was received, and in light of the overall small sample size, this is not particularly surprising. heinrichi again appears to be the most convincing case of success, when additions and deletions are disregarded; the philologist was correct about % of the time, as opposed to the % that random chance might yield.

conclusions

what are we to make of these rather surprising results? above all it is important to bear in mind that the experiment was done using artificial traditions. particularly for the notre besoin text, many of whose copyists were themselves philologists, there is a real risk that the volunteers consciously or semi-consciously introduced innovations into their copies in order to make the resulting tradition “interesting”. on the other hand, also in the case of notre besoin, at the time of the original experiment a philologist using classical methods was able to reconstruct a stemma that was not very different from the true stemma. is this a case of one philologist simply being better than the other? while that is possible, it is not tremendously likely; over % of all variation within notre besoin followed the stemma, which made its reconstruction a comparatively straightforward task no matter what method was used.
the results of that experiment bore this out: they showed that every one of the attempted methods, including the computational methods whose results were not manipulated into a ‘normal’ rooted stemma, could correctly identify the main manuscript groupings. that does in itself raise another question: how accurate must we be in choosing significant variation in order to reconstruct an accurate stemma? although none of the volunteers in this study attempted to draw a stemma, one of the two philologists for parzival provided a set of observations concerning which manuscripts should be grouped together; these were broadly accurate, even though the selection of individual significant variants was often wide of the mark. it is also worth noting that the philologist quite often cited variants as examples of group affinity that were not judged significant!

compared to the rest, the heinrichi artificial tradition fared well. the overall mean rate of genealogical variation in that text was rather lower than in the other two texts, at just under %. the heinrichi corpus includes - copies per scribe, which increases the possibility of horizontal transmission (particularly for spelling and grammatical idiosyncrasies) in a different way; on the other hand, that tradition appears to have contained many more genuine errors, and the philologist who did the work was accordingly more accurate (leaving aside the question of additions and deletions) in detecting whether variation was significant.
the creators of heinrichi seem to have had more success than the others in creating a tradition that is reasonably close to the ‘real-world’ situation of a medieval text widely copied.

one substantial conclusion to be found in the data, and one that reinforces findings made previously, is that ‘insignificant’ variation is really not that insignificant at all. we have seen that some philologists prefer to exclude it entirely; others (e.g. wattel and van mulken, ) include the information but give it as low a weighting as possible. this experiment, together with several others, strongly suggests that our practices for handling this sort of ‘insignificant’ variation are in dire need of revision.

a second conclusion concerns the effect of the adoption of blanket generalizations: in this case, the guidelines from two of the philologists for how to handle certain variants. they advised that, “in general”, additions and deletions should be treated in a certain way; when these rules were duly applied in a general fashion, the resulting proportion of “significant” genealogical variation was badly impacted. this aspect of the experiment suggests that we must be extremely careful before adopting any sort of rule-based guideline for the classification of variants, especially if the guidelines are meant to be applied in a regular computational way. it is far too easy to be led blindly into poor results.

finally, this experiment makes clear that stemmatology has some way to go before it can claim the title of a ‘normal science’ that rens bod has offered.
our systems of categorization are suspect; our very philological sense of what is or is not significant has not fared as well as we ought to expect in the test against artificial traditions. we have more work to do than bod’s simple ‘problem-solving’; we have yet to capture in any formal, demonstrable, or falsifiable way the essence of what scribes were likely to copy and what they were likely to change. if stemmatology is indeed to become a science, this is the next task that needs to be done.

references

andrews, t. l. ( a). the third way: philology and critical edition in the digital age. variants, : – .
andrews, t. l. ( b). stemmaweb - a collection of tools for analysis of collated texts. http://byzantini.st/stemmaweb/ (accessed april ).
andrews, t. l., blockeel, h., bogaerts, b., bruynooghe, m., denecker, m., de pooter, s., … ramon, j. ( ). analyzing manuscript traditions using constraint-based data mining. in cocomile - combining constraint solving with mining and learning. montpellier. http://cocomile.disi.unitn.it/ /papers/cocomile _manuscript.pdf.
andrews, t. l., and macé, c. ( ). beyond the tree of texts: building an empirical model of scribal variation through graph analysis of texts and stemmata. literary and linguistic computing, ( ): – . . /llc/fqt .
baret, p., macé, c., and robinson, p. ( ). testing methods on an artificially created textual tradition. in the evolution of texts: confronting stemmatological and genetical methods. pisa; rome: istituti editoriali e poligrafici internazionali, pp. – .
bédier, j. ( ). la tradition manuscrite du lai de l’ombre. réflexions sur l’art d’éditer les anciens textes. romania, : – , – .
blake, n., and thaisen, j. ( ). spelling’s significance for textual studies. nordic journal of english studies, ( ): – . (accessed march ).
bod, r. ( ). a new history of the humanities: the search for principles and patterns from antiquity to the present. oxford university press.
dekker, r. h., hulle, d. van, middell, g., neyt, v., and zundert, j. van. ( ). computer-supported collation of modern manuscripts: collatex and the beckett digital manuscript project. literary and linguistic computing, fqu . . /llc/fqu .
greg, w. w. ( ). the calculus of variants: an essay on textual criticism. oxford: clarendon press.
housman, a. e. ( ). the application of thought to textual criticism. proceedings of the classical association, : – .
howe, c. j., connolly, r., and windram, h. f. ( ). responding to criticisms of phylogenetic methods in stemmatology. studies in english literature - , ( ): – . . /sel. . .
moore, d. s., craig, b. a., and mccabe, g. p. ( ). introduction to the practice of statistics ( th ed., international ed.). new york: w. h. freeman.
robinson, p. ( ). making electronic editions and the fascination of what is difficult. linguistica computazionale, – : – .
roelli, p., and bachmann, d. ( ). towards generating a stemma of complicated manuscript traditions: petrus alfonsi’s dialogus. revue d’histoire des textes, n.s. : – .
roos, t., and heikkilä, t. ( ). evaluating methods for computer-assisted stemmatology using artificial benchmark data sets. literary and linguistic computing, ( ): – . . /llc/fqp .
salemans, b. j. p. ( ). cladistics or the resurrection of the method of lachmann. in van reenen, p. t., van mulken, m., and dyk, j. w. (eds.), studies in stemmatology. amsterdam; philadelphia: benjamins, pp. – .
salemans, b. j. p. ( ). building stemmas with the computer in a cladistic, neo-lachmannian, way: the case of fourteen text versions of lanseloet van denemerken. ph.d. thesis, katholieke universiteit nijmegen.
schmid, u. ( ). genealogy by chance! on the significance of accidental variation (parallelisms). in van reenen, p. t., den hollander, a., and van mulken, m. (eds.), studies in stemmatology ii. amsterdam: benjamins, pp. – .
schmidt, d., and colomb, r. ( ). a data structure for representing multi-version texts online. international journal of human-computer studies, : – .
schøsler, l. ( ). scribal variations: when are they genealogically relevant - and when are they to be considered as instances of “mouvance”? in van reenen, p. t., den hollander, a., and van mulken, m. (eds.), studies in stemmatology ii. amsterdam: benjamins, pp. – .
smelik, w. f. ( ). trouble in the trees! variant selection and tree construction illustrated by the texts of targum judges. in van reenen, p. t., den hollander, a., and van mulken, m. (eds.), studies in stemmatology ii. amsterdam: benjamins, pp. – .
spencer, m., davidson, e. a., barbrook, a. c., and howe, c. j. ( ). phylogenetics of artificial manuscripts. journal of theoretical biology, : – .
spencer, m., mooney, l., barbrook, a., bordalejo, b., howe, c. j., and robinson, p. ( ). the effects of weighting kinds of variants. in van reenen, p. t., den hollander, a., and van mulken, m. (eds.), studies in stemmatology ii. amsterdam: benjamins, pp. – .
wattel, e. ( ). constructing initial binary trees in stemmatology. in van reenen, p. t., den hollander, a., and van mulken, m. (eds.), studies in stemmatology ii. amsterdam: benjamins, pp. – .
wattel, e., and van mulken, m. ( ). weighted formal support of a pedigree. in van reenen, p. t., van mulken, m., and dyk, j. w. (eds.), studies in stemmatology. amsterdam; philadelphia: benjamins, pp. – .
west, m. l. ( ). textual criticism and editorial technique: applicable to greek and latin texts. stuttgart: b. g. teubner.

[pdf] visual knowledge: textual iconography of the quixote, a hypertextual archive | semantic scholar
doi: . /llc/fql corpus id:
@article{urbina visualkt, title={visual knowledge: textual iconography of the quixote, a hypertextual archive}, author={e. urbina and r. furuta and s. smith and neal audenaert and j. deng and carlos monroy}, journal={lit. linguistic comput.}, year={ }, volume={ }, pages={ - } }
e. urbina, r. furuta, + authors carlos monroy published sociology, computer science, art lit. linguistic comput.
ever since its initial publication four hundred years ago, thousands of editions, most often illustrated, have been published of cervantes' masterpiece, don quixote. imagery has become an integral part of the reception and interpretation of the text. to date, a comprehensive collection of these images, the textual iconography of the quixote, has not been published.
we report in this paper on overcoming two key obstacles: limitations on the availability of materials and limitations due to the… expand view via publisher cervantes.tamu.edu save to library create alert cite launch research feed share this paper citationsbackground citations methods citations view all figures and topics from this paper figure figure quixote archive interpretation (logic) floor and ceiling functions don woods (programmer) citations citation type citation type all types cites results cites methods cites background has pdf publication type author more filters more filters filters sort by relevance sort by most influenced papers sort by citation count sort by recency don quixote illustrated : an international digital humanities project e. urbina, f. moreno, + authors stephanie elquist pdf view excerpt, cites methods save alert research feed visualizing the quixote: a digital humanities archive for teaching and research e. urbina, f. moreno art pdf save alert research feed la colección de quijotes ilustrados del proyecto cervantes: catálogo de ediciones y archivo digital de imágenes f. moreno, e. urbina, r. furuta, jie deng art pdf save alert research feed locating thematic pinpoints in narrative texts with short phrases: a test study on don quixote j. deng, r. furuta, e. urbina computer science jcdl ' save alert research feed texts, illustrations, and physical objects: the case of ancient shipbuilding treatises carlos monroy, r. furuta, f. castro computer science ecdl view excerpt, cites background save alert research feed historia y prácticas iconográficas del quijote juvenil ilustrado en el siglo diecinueve e. urbina, f. moreno art save alert research feed facilitating reading through a theme-driven approach j. 

text mining at an institution with limited financial resources
d-lib magazine july/august volume , number /
drew e. vandecreek northern illinois university libraries drew@niu.edu
doi: . /july -vandecreek
(this opinion piece presents the opinions of the authors. it does not necessarily reflect the views of d-lib magazine, its publisher, the corporation for national research initiatives, or the d-lib alliance.)

abstract
the digital humanities are now coming to the attention of a growing number of scholars and librarians, including many at medium-sized and small institutions that lack significant financial resources. should these individuals seek to explore text mining, one of the digital humanities' core activities, they are likely to confront the fact that their library cannot afford the typical expensive database products that contain large volumes of materials suitable for analysis. in this opinion piece, i suggest that vendors would benefit from increasing their customer base by offering potential users the opportunity to purchase discrete portions of data sets individually. this approach may prove practicable for libraries able to muster relatively modest sums for the purchase of single items.
it also may represent a new source of revenue for vendors, or at least an opportunity to build trust and goodwill in the digital humanities community.

the problem
the digital humanities' increasing prominence in academic life, marked by such things as the advertisements seeking applications for new positions and calls for papers, has brought it to the attention of a large number of humanities scholars, librarians and administrators not employed at the larger institutions that have heretofore often led the field's development. many have expressed an interest in the field. these individuals often do not have access to as many financial resources as the field's leaders often enjoy. this shortfall makes itself apparent in any number of ways: the lack of a technical infrastructure robust enough to support many types of digital humanities work; a lack of information technology professionals that understand, appreciate and can support the work; and an inability to attend professional development workshops at other institutions. another potential problem to be faced by this new group of practitioners at non-elite institutions with limited resources will arise when they undertake text mining, one of the digital humanities' core activities, and confront the expense of acquiring a corpus of data to mine. in this article i discuss the problem, and propose a partial solution which, while far from ideal, could allow these practitioners to begin.

text mining: the cost of getting started
i attended the university of michigan's "beyond ctrl+f: text mining across the disciplines " workshop on february , . i want to thank the university of michigan libraries for organizing and hosting the event. i enjoyed it. it must have taken a great deal of work. when the workshop first came to my attention, i noticed that participants could attend at no charge. this was too good to be true.
working at a state university in the bankrupt state of illinois, i of course have access to no financial support for professional development activities. i happily drove to ann arbor and stayed overnight at my own expense, then took part in the workshop. without the free-admission policy, i might not have gone to the event. the workshop began with a session devoted to "finding your corpus." this seemed reasonable. no one can perform text mining until they have some text. the session featured representatives of several vendors of subscription products providing access to large amounts of textual materials: proquest, jstor, gale, alexander street press (full disclosure — i edited an online product for alexander street press and have cashed their checks) and several others. it dawned on me that the no-charge policy resulted, of course, from these vendors' sponsorship of the event. as sponsors, they enjoyed the opportunity to pitch their products to members of a captive audience who had expressed an interest in text mining. vendor representatives described how scholars and students might use their products for text-mining projects. they presented an impressive set of resources, but they did emphasize that library users were not simply to bring up one of their databases and begin to download the very large bodies of text they wanted to use. vendors of online library resources typically offer their products for subscription with the proviso that library patrons not use them too much. from a vendor's point of view, a database user might download a very large amount of text and then turn around and put it on the web for free use. thus, they monitor their product's use, and terminate access if they detect that a patron is downloading too much material. 
vendor representatives at the ctrl+f event explained that their policies direct prospective text miners to use their products to discover potentially suitable text materials, then submit a request for a specific corpus, which they will then prepare and deliver for an extra fee in the range of $ -$ , . this made something very apparent to me: text mining is in many cases only practicable at its intended scale at institutions commanding the financial resources necessary to subscribe to these products and then go on to pay the additional fee. of course open access entities like hathitrust make text materials available at the scale required for text mining activities at no cost, but it is important to recognize that vendors of subscription-based products like those discussed at the ctrl+f event also represent a major source of text materials that scholars will likely find very attractive. i noticed that the "beyond ctrl+f" event drew a significant number of scholars employed at institutions well outside the vendors' target audience of university libraries with budgets allowing them to purchase or subscribe to high-cost digital resources in the humanities. those with whom i conversed often emphasized that they were happy to attend such an introductory-level event hosted by a major institution of high reputation. it offered an opportunity to get oriented in the field, to get started in the work. i suspect that a number of these individuals must have reached the same conclusion that i did: "i can only do this if i can find text available at no charge. i must direct my research toward questions that can be answered by reference to free-use data alone."

my experience
i attended the ctrl+f event as a digital humanities professional responsible for the encouragement and support of activities like text mining at my university. i am also a scholar of nineteenth and early twentieth century american intellectual and political history.
i am interested in language and rhetoric in american political development. more specifically, i am interested in how americans have talked about the federal government. what did they have to say about its scope of activity? how might americans have understood what it did, or did not do? what language did they use to argue for more, or less, government involvement in the american economy and society? did their language reflect the influence of major intellectual traditions like liberalism and republicanism in political thought, or perhaps romanticism and sensibility in literature and culture? i turned to speeches and debates in congress as a good source of arguments for and against specific state activities. this led me to the congressional record, a very large set of text that is available in a searchable text format from several sources. the library of congress' a century of lawmaking for a new nation web site provides free access to full-text versions of the congressional record beginning with the year . i needed access to full-text versions of the record from the nineteenth century. this led me to proquest congressional, a subscription product providing a variety of congressional materials. unfortunately, my university library's subscription to proquest congressional did not include materials from the congressional record before . when our acquisitions department contacted proquest to inquire about the matter they learned that we might purchase the back file materials for the nineteenth-century congressional record for a one-time payment of approximately $ , . this was an all-or-nothing proposition: purchase the entire back file, or purchase nothing. proquest's price was a complete non-starter at my financially strapped university. 
i asked librarians at several institutions with large library budgets if they might acquire materials for me, in effect providing an inter-library loan, but found that vendors' contracts restrict use to individuals defined as members of an individual institution's user community. i attempted to resolve my problem by asking vendors if they would sell me my preferred chunk of data by itself (the congressional record, - ), rather than an entire database product or back file, at a more reasonable price. proquest declined to negotiate, but hein online (another vendor of digitized government documents) agreed. i bought, at my own expense, the text of the congressional record for the period - for a price i could accept. i now have it available for research. upon completing this transaction, i discovered that the university of north texas libraries, which present a digitized version of the entire congressional record, would provide me with their uncorrected text data at no charge. i thank the university of north texas libraries for the use of their data, and recommend them to other students and scholars. their collections include a large number of digitized texas newspapers, as well as records of the federal communications commission. however, like other not-for-profit providers of text data, north texas offered uncorrected copy. with two versions of the same data in hand, i may have an opportunity to compare the results they produce in text-mining work. in any event, corrected text is clearly more useful than uncorrected materials.

the vendors' perspective
as i pondered the situation, i tried to take proquest's point of view. i understand that most library vendors are private concerns and need to make a profit for their investors. their representatives sell those products in order to earn a living. nevertheless, the congressional record is a government publication available at no charge in libraries and other depositories of federal materials.
how could proquest charge so much for the use of it? i imagined that from proquest's perspective, they are not selling access to a government publication in the public domain. they are selling access to a value-added version of it: a digitized, full-text searchable version of the materials available in an online format. their costs include funds devoted to the initial digitization of materials originally published in an analog format; the markup and other technical work required to prepare the text for use with a search engine; the storage and preservation of the materials on a technical infrastructure requiring maintenance and upgrades; and the online service of the digital materials themselves, again on an infrastructure requiring maintenance and regular upgrades. of these costs, those devoted to digitization itself deserve specific discussion. many librarians and humanities scholars have taken some part in the digitization of materials at some point in their career. experience with the process reveals that the various software products that convert type-set, analog materials to a digital format are far from foolproof. they often produce enough errors to compromise the materials' usefulness, at least to some degree. this is especially true of older materials, in which ink has often faded and pages have yellowed with age. in my experience nineteenth-century materials digitized from an analog format usually have a very high error rate. i examined a small sample of proquest's congressional record materials, which they courteously provided me. it contained a very small amount of scanning errors, significantly fewer than those found in the portion of the unt data that i reviewed, and about the same as the hein materials. i tentatively determined that in my case vendors provide access to better text than that available for free. 
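For readers who have never seen one, a script for patching recurring scanning errors and for gauging how far an uncorrected text sits from a corrected one can be quite short. The following is a minimal sketch in Python, not a description of any vendor's or library's actual workflow. The substitution patterns, the sample sentence, and the function names `correct` and `disagreement` are all hypothetical and chosen only for illustration; a real project would build its pattern list from errors observed in its own corpus.

```python
import difflib
import re

# Hypothetical substitution table of recurring OCR confusions.
# These patterns are illustrative assumptions, not taken from any real data set.
REPLACEMENTS = [
    (re.compile(r"\btlie\b"), "the"),             # "h" misread as "li"
    (re.compile(r"\bCongrefs\b"), "Congress"),    # long s misread as "f"
    (re.compile(r"(\w)-\s*\n\s*(\w)"), r"\1\2"),  # rejoin words split across line breaks
]


def correct(text: str) -> str:
    """Apply every find-and-replace rule in order and return the patched text."""
    for pattern, replacement in REPLACEMENTS:
        text = pattern.sub(replacement, text)
    return text


def disagreement(a: str, b: str) -> float:
    """Return 1 minus difflib's similarity ratio: 0.0 means the texts are identical."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return 1.0 - matcher.ratio()


if __name__ == "__main__":
    corrected_copy = "the Congress shall meet"     # stands in for a vendor's corrected text
    uncorrected_copy = "tlie Congrefs shall meet"  # stands in for free, uncorrected text
    print("before:", disagreement(uncorrected_copy, corrected_copy))
    print("after: ", disagreement(correct(uncorrected_copy), corrected_copy))
```

A rule list of this kind only catches errors someone has already noticed, so it narrows the gap between uncorrected and corrected text without closing it. The similarity ratio is likewise a rough indicator rather than a formal character error rate.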
if a researcher were to attempt to bring the freely available data up to the quality of the proquest materials, s/he would have to find a way to fix many of the errors in it, most likely by using a script that finds and replaces common scanning errors in a document. in my experience most humanities scholars and students cannot write search and replace scripts, nor do they know how to find them online, ready to use, and implement them as many technologists and programmers do. i certainly do not. most libraries at medium-sized and smaller institutions with limited resources lack access to this type of technical expertise. thus, when hein and proquest charge fees for materials in the public domain, they charge for access to more accurate digitized text.

a measure of progress
my experience with hein online led me to draw a parallel to another experience i had with a vendor in a somewhat similar, but not identical, situation. in the past several years i have taken part in the activities of the digital powrr project, an imls-funded activity that produced a study of digital preservation challenges and potential solutions at medium-sized and smaller colleges and universities lacking large financial resources. our study included the review of a number of applications and tools available for use in digital preservation activities. among them we found a comprehensive, all-in-one product called preservica. its vendor made no pricing information available online. we had to call for a quote. when we contacted a preservica sales representative to ask if they might make the product available to our study for testing at little or no cost, they immediately rejected us, explaining that preservica is a version of a digital preservation product that the company originally sold to large corporations such as banks.
they have now begun to market it to other very large institutions with need to preserve digital materials that have suitable budgets, ranging from universities to state and national governments. apparently, medium-sized and smaller institutions with little money did not represent an attractive market segment. the digital powrr project published a white paper resulting from the study, "from theory to action: good enough digital preservation for under-resourced cultural heritage institutions". it recommended that institutions unable to afford a product like preservica adopt a one-step-at-a-time approach to digital preservation activities using sets of open-source tools in combinations suited to their particular needs. another thing occurred in the process of conducting the study. through a frank and open exchange of views with members of the digital powrr team, preservica executives became aware that they were leaving money on the table by adopting a call-for-quote stance and pricing their product at a level that put it well out of reach of smaller, less prosperous institutions. we urged them to adopt a more transparent pricing policy and become aware of this other market, which the response to our study has shown is vast. there are only so many institutions with the resources necessary to buy preservica at their initial price level. what happens when they all have acquired or constructed a satisfactory digital preservation application? where does the company find growth then? preservica executives changed their position, instituting a transparent, online pricing policy and devising versions of their product priced to suit more modest budgets. i want to suggest that vendors of large sets of humanities text materials do the same.   
my recommendation
i suggest that vendors of library database products recognize that they can contribute to future scholarship, ease a major, obvious inequity in the field and, perhaps, find a new source of revenue by making chunks of text data available for sale on an à la carte basis. in many cases, this would require them to offer libraries that do not subscribe to their products a free trial-period use so that researchers might identify materials of interest. it would also require the additional administrative work involved in processing a number of transactions involving lesser amounts of funds than those to which they are accustomed. i understand that vendors will raise these objections, but i believe they should investigate this potential sales model in a systematic fashion and determine if they can earn profits with it. i submit that vendors would not need to understand this approach as a charity measure. i suspect that purveyors of large, online humanities text databases may well confront a situation similar to that which the digital powrr team perceived in preservica's case. once they have sold their products to the limited number of institutions able to afford them, where do they find growth? of course they can grow by introducing new products, but do they not want to find revenue growth in legacy products as well? representatives of a number of vendors may reply to this observation by noting that they price their products on the basis of an institution's number of full-time enrolled students, or offer access to a limited number of simultaneous logins, measures that can help a smaller institution. this is not enough. it may prove to be a benefit to smaller institutions to some degree, but it is only a partial measure. it certainly does not help cases like mine (a large institution lacking the budget level to buy even these versions of products), and there are many such institutions.
if vendors do not recognize and respond to the market made up of medium-sized and smaller institutions of lesser financial means, i fear that they will make a powerful contribution to the perpetuation of the existing situation: students and scholars at the wealthiest colleges and universities can do text mining work with access to very large collections of suitable materials, while others may never find their corpus. those vendors will also, in my estimation, leave money on the table. even if they cannot earn any profit from this type of sale, it may be worthwhile for them to sell materials at a modest loss in order to earn the trust and goodwill of the scholars, librarians, and other practitioners populating the digital humanities. i ask vendors to consider the above proposition, and digital humanists and librarians at institutions of all sizes and financial conditions to raise these issues associated with access to their materials with vendors' sales representatives.

acknowledgements
the author thanks jim millhorn of northern illinois university libraries and alix keener of the university of michigan libraries for help in gathering information for this article.

about the author
drew e. vandecreek is director of digital scholarship and co-director of the digital convergence lab at northern illinois university libraries. he holds a ph.d. in american history from the university of virginia. he has secured funding for and directed the development of a number of online resources exploring nineteenth-century american history, available from the university libraries digital collections.

copyright ® drew e. vandecreek

awareness of altmetrics among lis scholars and faculty
© journal of education for library and information science vol. , no. – doi: . /jelis. . - .
sarah sutton, emporia state university ssutton @emporia.edu
rachel miles, kansas state university ramiles@ksu.edu
stacy konkiel, altmetric.com stacy@altmetric.com

altmetrics track the attention paid to scholarship via mentions in social media, the press, and other non-traditional venues. for library and information science (lis) faculty, altmetrics are also a new and important area for research and teaching. we conducted a survey of lis faculty teaching in us and canadian graduate lis programs accredited by the american library association in which we asked about their familiarity with and awareness of measures of research impact, including altmetrics. our results indicate that while most lis faculty in our sample had some awareness of altmetrics, they reported greater familiarity with traditional measures of research impact such as citation counts and usage statistics. we also confirmed that, among our sample, there was a relationship between years of teaching experience and awareness of altmetrics, as well as among familiarity with altmetrics, familiarity with citation counts, and familiarity with usage statistics. among the robust, global body of research related to the use of new measures of research impact among scientists and scholars, there are few studies that use survey methods and focus on faculty scholars within a specific discipline. the results of this study contribute new knowledge to the existing body of research on altmetrics and may contribute to the development of lis graduate curricula devoted to measures of research impact and their application in practice.

keywords: altmetrics, bibliometrics, faculty, library and information science, lis education, research impact, survey

stakeholders use measures of scholarly research impact across academia and the public sector for a variety of purposes. journal publishers use them as a measure of the influence of their publications.
institutions of higher education use them to measure their research output and its impact. librarians use them to measure the benefit of their collections to their users. scholars use them to identify the impact of their own research and, often, to make the case for their promotion and tenure. because of the widespread focus on measures of research impact in libraries and the institutions of which they are a part, the topic is one that should not be overlooked in lis education.

measures of research impact
traditional measures of journal-level impact include the journal impact factor (jif) and journal-level usage statistics. traditional measures of a scholar’s research impact include citation counts, article-level impact, and the author h-index. altmetrics are a relatively new type of data that can indicate journal, article, and author-level research impact, including the attention paid to research online (“what are altmetrics?,” ). altmetrics are measures of mentions of research and scholars made in non-traditional venues such as social media (e.g., twitter, facebook, blogs, etc.), inclusion in reference managers (e.g., mendeley), expert peer-review and recommendation services (e.g., publons and faculty of prime), and mentions in mainstream media and public policy documents. generally, altmetrics are portrayed as complementary to traditional measures of research impact (costas, zahedi, & wouters, ; priem, taraborelli, groth, & neylon, ; thelwall, haustein, larivière, & sugimoto, ; “what are altmetrics?,” ). altmetrics have the advantage of providing impact data within days or even hours of the release of a publication and of measuring the influence of a wide variety of research outputs among many audiences (priem et al., ).
they are, however, relatively new and have not yet gained the same level of acceptance within academia as is afforded to more traditional measures of scholarly impact (bonnici & julien, ; gruzd, staves, & wilk, ) such as citation counts and author h-index.

key points
• for lis scholars, measures of research impact like altmetrics are both a topic of research and an area of teaching expertise as well as having potential importance for career advancement.
• lis scholars who responded to this survey are more familiar with more long-standing and widely recognized measures of research impact such as citation and usage counts than they are with altmetrics.
• more years of experience as lis faculty is related to having greater familiarity with altmetrics.

significance of measuring research impact in library and information science
library and information science (lis) scholars who teach in lis graduate programs have a somewhat unique position regarding measures of research impact. as is the case with scholars in other academic disciplines, measuring research impact is often important to lis scholars’ career advancement. but unlike most scholars in other disciplines, for lis scholars, measures of research impact are also a topic of research and an area of teaching expertise. while other disciplines may include measures of research impact as a curricular topic for graduate students seeking academic careers, lis graduate students need instruction in the use of measures of research impact because they are likely to encounter them in professional practice.
in addition to the possible need to use measures of research impact for career advancement, practicing librarians should recognize the usefulness of measures of research impact to collection development and how such tools can help them identify resources with the most impact in the disciplines and subject areas the library supports. they may also be called upon to identify measures of research impact as an area in which to expand services to scholars (desanto & nichols, ; reed, mcfarland, & croft, ; tran & lyon, ). exposure to measures of research impact as a part of lis graduate programs enhances the practitioner’s ability to perform professional responsibilities. it is therefore incumbent upon lis educators to include some coverage of measures of research impact in the lis curriculum. there is a small body of literature devoted to examining scholars’ beliefs about and uses of measures of research impact to gauge directions for adding library support services for scholars, such as information about the use of author identifiers (tran & lyon, ) and the creation and maintenance of scholarly profiles (reed et al., ). this work points to the need to understand disciplinary differences in beliefs about and uses of measures of research impact. however, to date, such studies have often used different disciplinary units of analysis; some examine very broad disciplinary categories such as sciences, social science, and arts, while others drill down to more specific disciplines such as romance languages, psychology, and political science, making cross-disciplinary comparisons difficult. focusing an entire study on scholars within a single discipline like lis will establish a clear picture of beliefs and uses of measures of research impact within the discipline, enabling future interdisciplinary comparisons.

aims of the current study
the central aim of this study was to assess the awareness of research-impact metrics among lis scholars teaching in ala-accredited lis graduate programs. the study examined lis scholars’ awareness of altmetrics while conducting their own research, while evaluating others’ research, and while teaching. the questions we sought to answer through this study were the following:
1. what level of familiarity with and awareness of altmetrics do lis scholars report themselves to have?
2. are there relationships between their self-reported levels of familiarity with and awareness of altmetrics and their appointment type, tenure status, and teaching experience?
3. how do their familiarity with and awareness of altmetrics compare to their familiarity with, and awareness of, other measures of research impact?
although there have been national and international bibliometric studies of lis scholars’ behavior with regard to research metrics, this study is one of the first to seek to understand us and canadian lis scholars’ familiarity with and awareness of emerging research-impact metrics in the course of their teaching and research. we anticipate that this study will contribute not only to the body of literature on research metrics and their use by lis scholars and researchers but also to the further development of lis graduate curricula devoted to measures of research impact.

literature review
altmetrics is a relatively new topic, but there is a growing global body of literature associated with it. much of this literature is rhetorical, perhaps because of questions related to the appropriateness of altmetrics as a measure of scholarly impact. however, another subset of the literature on altmetrics is research-based.
within the second set reside reports of research focused on lis scholars’ behavior toward and awareness of altmetrics, in- cluding both quantitative studies in the tradition of bibliometrics that seek correlations between and among measures of research impact including altmetrics, and qualitative studies that seek self-reports of awareness of alt- metrics via surveys and interviews. the focus of this review of the literature is limited to altmetrics research, both quantitative and qualitative, focused on lis scholars’ behavior toward, and awareness of, altmetrics. while not all quantitative studies bear it out, many have identified correlations between altmetrics measures of research impact and tra- ditional measures of research impact among the work of lis scholars. for example, bornmann’s ( ) meta-analysis of correlations between three altmetrics (twitter, reference managers, and blogging) and citation counts demonstrated that there are different types of altmetrics and that reference managers have the most correlation with citation counts. an- other such study suggested that “altmetrics may indeed reflect impact not reflected in citation counts” (haustein, peters, bar-ilan, priem, shema, & terliesner, , p. ). meho and yang ( ) suggest that using citation counts from scopus and google scholar together with citation counts from web of science provides a more accurate view of scholarly impact among lis scholars. supporting evidence appears in a study of articles published in jasist (the journal of the american society for information science and technology) between and (bar-ilan, ), a longer period than most studies of this type normally cover. this study suggests that there are significant correlations between mendeley readership, an altmetric, and citations in web of science, scopus, and google scholar. while many recent quantitative studies in the tradition of bibliometrics use citations as sutton, miles, konkiel jelis vol . 
their traditional measure of research impact, martín-martín, orduña-malea, ayllon, and lópez-cózar’s ( ) research among scholars in bibliometrics, scientometrics, informetrics, webometrics, and altmetrics found strong correlations between altmetrics and citations, usage, and h-index. bornmann ( , p. ) suggests that “studies of correlation appear to be frequently done because they are easily produced, not because the correlation between citation counts and altmetrics is the most pertinent question to examine.”

luckily, several studies have explored lis scholars’ views of altmetrics. some of them focus on lis scholars’ perceptions of and preferences among altmetrics tools. haustein et al. ( ) surveyed the bibliometrics community of scholars about their use of social bookmarking services and reference managers. mendeley and citeulike were the most popular, and “although use of altmetric platforms was quite low among survey participants, . % thought that altmetrics had some potential in author or article evaluation” (p. ). gruzd, staves, and wilk ( , p. ) interviewed members of asist “among whom [online social media] tools are used as a complementary resource to traditional information resources . . . [and use] is mainly focused on finding information rather than disseminating it.”

one of the reasons often given for the lack of uptake of altmetrics is that they don’t “count” toward promotion and tenure. building on the work of gruzd et al. ( ), bonnici and julien ( ) surveyed lis program deans, directors, and chairs. this study suggested that altmetrics had not been adopted as measures of research impact in promotion and tenure decisions. in a follow-up study, bonnici and julien ( , p.
) concluded “that altmetrics are a low priority for most faculty members in lis, and are considered only supplemental to traditional metrics.” gruzd et al.’s ( ) study further suggested that untenured faculty use online social networking tools more often than tenured faculty and that untenured faculty are using online social media to build social networks and “creating a higher profile,” a reversal of a previous trend for senior faculty to “embrace new technologies,” presumably after they are tenured and feel safer in their positions (p. ). gruzd et al. suggest that the new trend of junior faculty adopting online social media will result in the adoption of altmetrics as indicators of research impact as they become more senior and have more say in setting standards.

methodology

to assess lis scholars’ awareness and current usage of research metrics in the course of their work, we conducted a survey of lis faculty in the us and canada who were associated with american library association (ala)–accredited master of library and information science (mlis) and master of library science (mls) programs. we obtained participants’ email addresses from public institutional web pages. the survey population (n = , ) included both full- and part-time faculty.

the survey was developed based upon a similar survey of academic librarians’ awareness and use of research-impact metrics conducted in (konkiel, sutton, & levine-clark, ; miles, sutton, & konkiel, ; sutton, miles, & konkiel, ). the original survey was pilot tested for reliability on a random sample of invited participants. based on the results of the pilot test, several questions on the original survey were adjusted. the survey of lis scholars was revised only so that job responsibilities would be pertinent to scholar/teachers rather than librarians. the study, including the survey instrument, was approved by the institutional review board at emporia state university.
it consisted of questions, not all of which were asked of every respondent because the survey employed skip logic to ask respondents only those follow-up questions that were relevant to their earlier answers. we obtained responses, which represents a . % response rate. because of this relatively low response rate and the consequent inability to establish goodness of fit between our sample and the population, we also examined confidence intervals (cis) for some of our results. data analysis consisted of both descriptive and non-parametric statistics. the use of surveys and descriptive statistics is consistent with similar studies in which faculty were asked to self-report familiarity with scholarly metrics (desanto & nichols, ; tran & lyon, ). the chi-square test for independence was applied to the categorical data collected via likert-scale–based survey questions about familiarity with research-impact metrics to identify relationships between those data and data describing respondents’ years of experience, appointment type, tenure status, and years of teaching experience.

results

the focus of this survey was to gauge lis scholars’ awareness of and familiarity with altmetrics. we examined the survey results through the lens of several factors that might influence awareness and familiarity: appointment type (whether respondents were employed in full- or part-time teaching positions), tenure status (whether they were tenured, on a tenure track, or neither), and teaching experience (measured by the number of years they had been teaching in lis).
we also examined our respondents’ familiarity with, and awareness of, other research-impact metrics, because our previous attempts to identify and measure correlations between familiarity with and awareness of altmetrics and other bibliometrics had been inconclusive (konkiel et al., ; miles et al., ; sutton et al., ), because promotion and tenure guidelines still focus most often on traditional metrics (julien & bonnici, ), and because we were interested in comparing familiarity with and awareness of altmetrics with familiarity with and awareness of other measures of research impact.

awareness of altmetrics

we asked the respondents to identify their level of awareness of altmetrics on a five-point likert scale. the results are illustrated in figure . most of the survey respondents ( . %, n = ) had heard of altmetrics. only . % (n = ) reported never having heard of them. almost % (n = ) considered themselves experts in altmetrics. the majority, . % (n = ), reported their awareness of altmetrics to be somewhere in the middle of the two extremes.

influence of appointment type

of the respondents who identified themselves as being either full- or part-time faculty, the majority, % (n = ), were full-time, and % (n = ) were part-time faculty. of the part-time lis faculty, % (n = ) reported that they held another full-time position besides a traditional faculty role. eight worked in academic libraries, two in public libraries, one in a school library, and three in special libraries. twelve respondents worked full-time in other types of positions, including higher-education administration, software design, and management. figure illustrates the differences in our respondents’ familiarity with altmetrics depending on whether they were full- or part-time faculty.
although it appears that full-time faculty may be more familiar with altmetrics than their part-time counterparts, the difference in our data is not statistically significant (χ²( ) = . , p = . ).

influence of tenure status

of the respondents in full-time lis faculty positions, % (n = ) reported being in tenure-track positions, % (n = ) reported not being in tenure-track positions, and % (n = ) chose not to answer this question. of tenure-track respondents, % (n = ) reported their awareness of altmetrics at the two highest levels, and on the likert scale, whereas only % (n = ) of non–tenure-track respondents reported their awareness of altmetrics at level , and none reported an awareness of altmetrics at level . figure illustrates these results. while descriptive statistics suggest that there may be an effect of tenure status upon researchers’ awareness of altmetrics, because of the small number of non–tenure-track respondents, it was not possible to conduct an accurate chi-square test of independence to find a statistically significant relationship between the two.

[figure : lis scholars’ and faculty awareness of altmetrics; scale from “never heard of them” to “i'm an expert”]

[figure : familiarity with altmetrics by appointment type; series: full time (n = ), part time (n = ); scale from “never heard of them” to “i'm an expert”]

[figure : familiarity with altmetrics by tenure status; series: tenure track (n = ), non-tenure track (n = ); scale from “never heard of them” to “i'm an expert”]

influence of experience as an lis faculty member

the largest percentage of respondents ( . %, n = ) reported having one to five years of teaching experience (table ). however, the number of responses to self-reported familiarity with altmetrics (n = ) was too small to conduct an accurate chi-square test of independence among other respondent categories. to address this problem, we collapsed years of faculty experience from five categories to two: five or fewer years of faculty experience, and six or more years of faculty experience, as depicted in figure . in this form, the data met the assumptions of the chi-square test of independence, and the results indicate a statistically significant relationship between having more years of faculty experience and familiarity with altmetrics (χ²( , n = ) = . , p = . at alpha = . ), although the effect size is low (cramér’s v = . ).

table : years of teaching experience
years of experience    number of faculty    percent of total
< year                                      . %
– years                                     . %
– years                                     . %
– years                                     . %
> years                                     . %

[figure : familiarity with altmetrics by years of teaching experience; series: >= years (n = ), <= years (n = ); scale from “never heard of them” to “i'm an expert”]

familiarity with other metrics

we went on to explore lis faculty members’ familiarity with other measures of research impact. figure depicts respondents’ ratings of their familiarity with citation counts, usage statistics, and the author h-index as measures of article-level impact. of the respondents, % (n = of responses) reported expert or almost expert levels of familiarity with citation counts. almost % (n = of responses) reported expert or almost expert levels of familiarity with usage statistics. some % (n = of responses) reported expert or almost expert levels of familiarity with the author h-index. only % (n = of responses) reported expert or almost expert levels of familiarity with altmetrics.
these results suggest that our respondents were most familiar with citation counts and usage statistics and that they were also more familiar with the author h-index than with altmetrics. to test for statistically significant relationships among familiarity with altmetrics, citation counts, usage statistics, and author h-index, we again collapsed the data to compensate for small numbers of responses. in this case, we collapsed responses for lower levels of familiarity with all metrics, (never heard of them) and . we found statistically significant relationships between familiarity with altmetrics and familiarity with citation counts (χ²( , n = ) = . , p = . , cramér’s v = . ) and between altmetrics and familiarity with usage statistics (χ²( , n = ) = . , p = . , cramér’s v = . ). we did not find a statistically significant relationship between familiarity with altmetrics and familiarity with author h-index (χ²( , n = ) = . , p = . ).

[figure : familiarity with types of research-impact metrics; series: altmetrics, citation counts, usage counts, author h-index; scale from “never heard of them” to “i'm an expert”]

discussion

given the response rate to our survey, generalizing based on our results should be undertaken with care. because of the relatively low response rate and the consequent inability to establish goodness of fit between our sample and the population, we also examined cis for some of our results. in most cases, at a % level of confidence, cis for our results were broad, averaging plus or minus five percentage points. however, our results lend support to other studies on this topic, as will be apparent in the discussion below.
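The category collapsing, chi-square test of independence, and Cramér's V effect size reported in the results can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' analysis: the counts in `toy_table` are invented, the helper names `collapse_rows` and `chi_square_independence` are hypothetical, and the p-value is computed only for one degree of freedom, where a closed form exists.

```python
import math


def collapse_rows(table, groups):
    """Merge rows of a contingency table; each group lists the row indices to sum."""
    n_cols = len(table[0])
    return [[sum(table[i][j] for i in group) for j in range(n_cols)] for group in groups]


def chi_square_independence(table):
    """Pearson chi-square test of independence for an r x c table of counts.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    # Cramér's V rescales chi-square to the range 0..1.
    k = min(len(table), len(table[0]))
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    # With df == 1 the upper-tail probability reduces to erfc(sqrt(chi2 / 2)).
    # Other df values need an incomplete gamma function, omitted in this sketch.
    p_value = math.erfc(math.sqrt(chi2 / 2)) if df == 1 else None
    return {
        "chi2": chi2,
        "df": df,
        "p_value": p_value,
        "cramers_v": cramers_v,
        "min_expected": min_expected,
    }


# Invented counts: five experience bands (rows) by low/high familiarity (columns).
toy_table = [[4, 2], [6, 3], [5, 5], [3, 7], [2, 8]]
# Collapse five bands into two, mirroring the "<= 5 years" / ">= 6 years" split.
collapsed = collapse_rows(toy_table, [[0, 1], [2, 3, 4]])
result = chi_square_independence(collapsed)
print(collapsed, result)
```

Collapsing sparse categories raises the expected count in each cell, which is why it is a common remedy when a table is too thin for the chi-square approximation. A frequently cited rule of thumb asks for expected counts of about five or more per cell, and `min_expected` is returned so that this can be checked.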
what level of familiarity with and awareness of altmetrics do lis faculty report themselves to have?

among the lis faculty responding to our survey, % (n = , % ci [ . , . ]) report having at least heard of altmetrics. desanto and nichols ( ) surveyed faculty at a single institution on their familiarity with scholarly metrics but reported their results by large groups of faculty in the sciences and social sciences; % of their respondents had at least heard of altmetrics. reed et al. ( , p. ) interviewed a small number of faculty also at a single institution and reported that “the term ‘altmetrics,’ and associated tools, were new to most participants.” our results suggest that lis faculty may have greater familiarity with altmetrics than do faculty in the social sciences and faculty as a whole. this could be because the topic of measuring research impact is of greater interest to and more central to the discipline of lis than it is to other disciplines. further research is needed to confirm whether the trends found in our data hold true across a more representative sample of lis faculty.

our respondents were significantly more familiar with traditional measures of article-level research impact such as citation counts and usage statistics than they were with altmetrics. this may be because altmetrics are nascent, or because the use of altmetrics is not yet as well established for purposes such as promotion and tenure (bonnici & julien, ). this supports the idea that even among members of a discipline that has a strong focus on metrics as a topic of research, altmetrics do not have the established credibility that more traditional metrics enjoy. again, more research is needed to confirm our initial findings in a larger and more representative sample of lis faculty.
are there relationships between respondents’ self-reported levels of familiarity with and awareness of altmetrics and their appointment type, tenure status, and teaching experience?

our results suggest that there is no relationship between familiarity with altmetrics and appointment type within our sample (χ²( ) = . , p = . ). however, because our response rate among part-time faculty was low, the number of responses from part-time faculty may not accurately reflect this sub-group’s familiarity with altmetrics; at best, our results are inconclusive. because of the small number of non–tenure-track respondents, it was not possible to conduct an accurate chi-square test of independence to determine the existence of a significant relationship between tenure status and familiarity with altmetrics. however, our results suggest support for those of gruzd et al. ( , p. ), who found that untenured lis scholars use altmetric sources to “create a higher profile.” because most of our respondents ( %) were on the tenure track, these results lend additional support to the notion that “most faculty learn about scholarly metrics when scholarly metrics become important to their career advancement” (desanto & nichols, , p. ).

among our sample, our results indicate a statistically significant relationship between years of faculty experience and familiarity with altmetrics (χ²( , n = ) = . , p = . at alpha = . ), although the effect size is low (cramér’s v = . ). the distribution of familiarity with altmetrics by years of teaching experience, as illustrated in figure , suggests that lis researchers with five or fewer years of teaching experience have less awareness of altmetrics than do those with six or more years of teaching experience. this is the opposite of what gruzd et al.
( ) suggested might be the case when they commented that the new trend of junior faculty adopting online social media would result in the adoption of altmetrics as indicators of research impact as they become senior and have more say in setting standards. bonnici and julien ( , ) concluded that there is little support for the use of altmetrics in promotion and tenure decisions and that altmetrics are considered supplemental at best. this might suggest that even senior faculty struggle to effect change in the academy’s view of appropriate measures of research impact. again, all initial findings among our respondents require study using a larger, more measurably representative sample of lis faculty.

in analyzing these survey results, we make some assumptions about the impact of promotion and tenure requirements on faculty’s familiarity with and use of altmetrics. haustein et al. ( ) suggest that even though researchers’ reported use of altmetrics was low, many of their respondents believed that article downloads or views could be useful in the evaluation of impact.

how do respondents’ familiarity with and awareness of altmetrics compare to their familiarity with and awareness of other measures of research impact?

when we collapsed the categorical data for familiarity with altmetrics, citation counts, usage statistics, and author h-index, we found statistically significant relationships between familiarity with altmetrics and familiarity with citation counts (χ²( , n = ) = . , p = . , cramér’s v = . ) and between altmetrics and familiarity with usage statistics (χ²( , n = ) = . , p = . , cramér’s v = . ) among our respondents. we did not find a statistically significant relationship between familiarity with altmetrics and familiarity with author h-index (χ²( , n = ) = . , p = . ).
this supports the conclusion that our respondents are more familiar with more long-standing and widely recognized measures of research impact such as citation and usage counts than they are with altmetrics. while the author h-index is considered a traditional measure of research impact in this study, anecdotal evidence suggests that it is less easily obtained and understood than citation counts or usage counts, which may contribute to our respondents’ reported lack of familiarity with it.

limitations

in analyzing the results of our survey, we recognized several limitations that should be considered in the interpretation of our results. although we took care to exclude non-lis faculty working in hybrid programs, some of the responses indicated that we were not entirely successful in this effort. it was also apparent from those responses that part-time faculty in particular did not understand that we were interested in responses from part-time faculty. this misunderstanding is potentially why there were relatively few responses from that group.

in an effort to compare the demographics of our respondents to the population, the survey included questions in which the respondents were asked to select their areas of teaching and research interest from the lis research areas classification scheme (alise, ), which we planned to use to identify goodness of fit between our pool of respondents and the population of lis faculty. unfortunately, the small number of respondents, combined with the large number of areas in the classification, made this impossible. for this reason, along with the overall small number of respondents, our results cannot be generalized to the population of lis faculty, and consideration of cis for each finding should be interpreted with care.
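The confidence intervals mentioned above attach a margin of error to a reported percentage. The following is a minimal sketch of how such an interval can be computed for a survey proportion. It assumes the simple normal-approximation (Wald) interval and a default 95% confidence level; the text does not state which method or level the authors used. The function name `wald_interval` and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Invented example: 50 of 100 respondents report having heard of a metric.
low, high = wald_interval(50, 100)
print(f"{low:.3f} to {high:.3f}")
```

The margin shrinks with the square root of the sample size, so small respondent groups produce wide intervals. For proportions near 0 or 1, or for very small groups, the Wald interval is known to behave poorly, and a Wilson score interval is a common alternative.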
conclusion

for lis scholars, measures of research impact are both a topic of research and an area of teaching expertise, and they have potential importance for career advancement. the literature suggests that altmetrics are complementary to traditional measures of research impact and should be used to supplement rather than replace them. traditional measures of research output are often already covered in courses related to collection development and user services. the literature also suggests a growing interest among practicing librarians in providing user services related to measuring research impact. since the majority of lis faculty in our survey report at least having heard of altmetrics, it would not be unrealistic to incorporate instruction in altmetrics into those courses alongside instruction in other measures of research output. given that the results of our study suggest that lis faculty with six or more years of experience have greater familiarity with altmetrics, it is these faculty who are positioned to take the lead in this endeavor.

however, it is also clear that there are a great many questions still to be answered with regard to lis faculty awareness of scholarly research metrics. our study examined one discipline’s familiarity and awareness, but there is evidence in the literature of disciplinary differences in the use of altmetrics, which suggests the need for studies of other disciplines’ awareness of and familiarity with altmetrics. the lack of consistent use of similar units of analysis related to academic disciplines restricts the comparison of one study to another. therefore, we recommend further examination of cross-disciplinary differences in familiarity with and awareness of altmetrics.
it is also clear from previous studies that a strong influence on faculty awareness and use of altmetrics is promotion and tenure (desanto & nichols, ; reed et al., ; tran & lyon, ), just as it influences more traditional measures of research impact. reed et al. report that low institutional value on research corresponded to lower “incentive to track influence” (p. ). this, combined with julien and bonnici’s work (bonnici & julien, , ; julien & bonnici, , a, b), suggests that one fruitful follow-up to the current study would be a longitudinal study of lis faculty promotion and tenure guidelines, focusing on which measures of research impact they mention, if any, at both the institutional and departmental levels.

unlike the research from which the survey instrument was drawn (konkiel et al., ; miles et al., ; sutton et al., ), in the current study we did not ask respondents if they covered measures of research impact in their teaching. however, the question of whether a correlation exists between scholars’ familiarity with and awareness of altmetrics and whether they teach in the area of measures of research impact would clearly also make an excellent follow-up study.

sarah w. sutton has years of experience in libraries. she has published and presented in multiple venues on the topic of altmetrics and currently teaches in the school of library and information management at emporia state university.

rachel miles is a digital scholarship librarian at kansas state university with a focus on copyright education and outreach. she has been involved in multiple open access (oa) and copyright projects at k-state and has published, presented, and taught workshops on oa and copyright. she has also researched, published, and presented on the topic of altmetrics at regional and national conferences.

stacy konkiel is the director of research & education at altmetric, a data science company that uncovers the attention that research receives online.
stacy has written and presented widely about altmetrics and library services.

references

association for library and information science education (alise). ( ). lis research areas classification schema. retrieved from http://www.alise.org/index.php?option=com_content&view=article&id=

bar-ilan, j. ( ). jasist@mendeley. altmetrics.org. retrieved from http://altmetrics.org/altmetrics /bar-ilan/

bonnici, l., & julien, h. ( , september). sooner or later?: the diffusion and adoption of social media metrics to measure scholarly productivity in lis faculty. paper presented at the sm&s: social media and society international conference, halifax, nova scotia. retrieved from https://smsociety .sched.com/event/ ehkn /sooner-or-later-the-diffusion-and-adoption-of-social-media-metrics-to-measure-scholarly-productivity-in-lis-faculty

bonnici, l., & julien, h. ( ). altmetrics: an entrepreneurial approach to assessing impact on scholarship and professional practice. paper presented at the alise (association for library and information science education), philadelphia. retrieved from https://ali.memberclicks.net/assets/documents/conf_ /abstracts/ _juried_papers.pdf

bornmann, l. ( ). alternative metrics in scientometrics: a meta-analysis of research into three altmetrics. arxiv: . [physics]. retrieved from http://arxiv.org/abs/ . ; https://doi.org/ . /s - - -y

costas, r., zahedi, z., & wouters, p. ( ). do “altmetrics” correlate with citations? extensive comparison of altmetric indicators with citations from a multidisciplinary perspective. journal of the association for information science and technology, ( ), – . https://doi.org/ . /asi.

desanto, d., & nichols, a. ( ). scholarly metrics baseline: a survey of faculty knowledge, use, and opinion about scholarly metrics. college & research libraries, ( ). https://doi.org/ . /crl. . .

gruzd, a., staves, k., & wilk, a. ( ). tenure and promotion in the age of online social media. proceedings of the american society for information science and technology, ( ), – . https://doi.org/ . /meet. .

haustein, s., peters, i., bar-ilan, j., priem, j., shema, h., & terliesner, j. ( ). coverage and adoption of altmetrics sources in the bibliometric community. scientometrics, ( ), – . https://doi.org/ . /s - - -

julien, h., & bonnici, l. ( , september). altmetrics in library and information science: trickle or tsunami? paper presented at the sm&s: social media and society international conference, toronto.

julien, h., & bonnici, l. ( a). the times, they are a-changin’: attitudes towards altmetrics in higher education. proceedings of the annual conference of cais. retrieved from https://journals.library.ualberta.ca/ojs.cais-acsi.ca/index.php/cais-asci/article/view/

julien, h., & bonnici, l. ( b, july). altmetrics in academe: bottom up or policy driven? paper presented at the sm&s: social media and society international conference, toronto. retrieved from https://smsociety .sched.com/event/ f s/altmetrics-in-academe-bottom-up-or-policy-driven

konkiel, s., sutton, s., & levine-clark, m. ( ). myth vs. reality: altmetrics & librarians. paper presented at the altmetrics conference, amsterdam.

martín-martín, a., orduña-malea, e., ayllon, j.m., & lópez-cózar, e.d. ( ). the counting house: measuring those who count. presence of bibliometrics, scientometrics, informetrics, webometrics and altmetrics in the google scholar citations, researcherid, researchgate, mendeley & twitter. arxiv preprint arxiv: . . retrieved from https://arxiv.org/abs/ .

meho, l. i., & yang, k. ( ). impact of data sources on citation counts and rankings of lis faculty: web of science versus scopus and google scholar. journal of the american society for information science & technology, ( ), – . https://doi.org/ . /asi.

miles, r.a., sutton, s., & konkiel, s. ( ). scholarly communication librarians’ relationship with research impact metrics. retrieved from http://krex.k-state.edu/dspace/handle/ /

priem, j., taraborelli, d., groth, p., & neylon, c. ( ). alt-metrics: a manifesto. retrieved from http://altmetrics.org/manifesto/

reed, k., mcfarland, d., & croft, r. ( ). laying the groundwork for a new library service: scholar-practitioner & graduate student attitudes toward altmetrics and the curation of online profiles. evidence based library and information practice, ( ), – . https://doi.org/ . /b j

sutton, s., miles, r., & konkiel, s. ( ). the future of impact metrics use among collection development librarians. qualitative and quantitative methods in libraries, , – .

thelwall, m., haustein, s., larivière, v., & sugimoto, c.r. ( ). do altmetrics work? twitter and ten other social web services. plos one, ( ), e . https://doi.org/ . /journal.pone.

tran, c.y., & lyon, j.a. ( ). faculty use of author identifiers and researcher networking tools. college & research libraries, ( ), – . https://doi.org/ . /crl. . .

what are altmetrics? ( , june ). retrieved from https://www.altmetric.com/about-altmetrics/what-are-altmetrics/
the case of the bold button: social shaping of technology and the digital scholarly edition

joris j. van zundert
huygens institute for the history of the netherlands, royal netherlands academy of arts and sciences, the hague, the netherlands

abstract

the role and usage of a certain technology is not imparted wholesale on the intended user community; technology is not deterministic. rather, a negotiation between users and the designers of the technology will result in its particular form and function. this article considers a side effect of these negotiations.
when a certain known technology is used to convey a new technological concept or model, there is a risk that the paradigm associated by the users with the known technology will eclipse the new model and its affordances in part or in whole. the article presents a case study of this ‘paradigmatic regression’ centering on a transcription tool of the huygens institute in the netherlands. it is argued that similar effects also come into play at a larger scale within the field of textual scholarship, inhibiting the exploration of the affordances of new models that do not adhere to the pervasive digital metaphor of the codex. an example of such an innovative model, the knowledge graph model, is briefly introduced to illustrate the point.

first, let us observe two things missing from almost all electronic scholarly editions made to this point. the first missing aspect is that up to now, almost without exception, no scholarly electronic edition has presented material which could not have been presented in book form, nor indeed presented this material in a manner significantly different from that which could have been managed in print.

these are words by peter robinson, who spoke and wrote them in (robinson, ). i think little has changed in the years since and the observation still more or less holds. at the time, robinson argued vehemently for digital scholarly editions that would move decisively beyond the realm of the possibilities of print publication. he was, and is, by no means the only one who has been advocating for such a shift. in fact, many have wondered how the digital medium, or the virtual environment, would change the nature and appearance of the scholarly edition.
for that matter, grand perspectives on paradigmatic change due to medium change are not unique to textual scholarship. the introduction of a new medium or technology has always inspired great debate between advocates and antagonists of the next big thing. self-proclaimed supporters of digital media usually advocate revolutionary changes. in the case of textual scholarship, for example, one may hear it proclaimed that the book is dead; good riddance, the advocates of ‘the next big thing’ (bod, : ) judge, for it was a clumsy, static, institutionally bounded, difficult to use, and outdated interface. give way to open access, process orientation, dynamic interfaces, intuitive interaction, fluid text, social editing, etc. (cf., for instance, siemens et al., ). with similar and undaunted zeal, luddites lament the waning of solid scholarly practice: concentration span, close reading, philological interpretation, editorial practice, and convention (fish, ), all sacrificed to the ‘bitch goddess, quantification’ (sic) as bridenbaugh once put it (bridenbaugh, ).

correspondence: joris j. van zundert, huygens institute for the history of the netherlands, royal netherlands academy of arts and sciences, the hague, the netherlands. e-mail: joris.van.zundert@huygens.knaw.nl

digital scholarship in the humanities © the author . published by oxford university press on behalf of eadh. this is an open access article distributed under the terms of the creative commons attribution non-commercial license (http://creativecommons.org/licenses/by-nc/ . /), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. for commercial re-use, please contact journals.permissions@oup.com

doi: . /llc/fqw
digital scholarship in the humanities advance access published march ,
the screaming and kicking of luddites aside, the proponents of change do not really seem to get what they want. after many years of development of digital technology, the book is as alive as it ever was. we scarcely find digital editions, scholarly or otherwise, resembling the advanced models of dynamic, fluid, collaborative, and social texts such as those proposed by mcgann ( ), drucker (lunenfeld et al., : ), shillingsburg (jones et al., ), robinson ( ), van hulle ( ), siemens (siemens et al., ), and myself (boot and van zundert, ). e-books are certainly impacting the market (aap, ; cain miller and bosman, ), but e-books are pure digital metaphors of the print book. digital scholarly editions hardly have any impact (porter, ), but what is more important is that they are a far cry from what many expected them to be. we could suppose that this state of affairs is due to a lack of knowledge, skills, and technology support, as has indeed been suggested before (cf. courant et al., ). and it is probably true that there are severe problems of teaching and training in our field, given that master’s and ph.d. programs truly oriented toward the digital humanities are only lately coming into existence. yet, i think there might be more to the matter. maybe we need to answer borgman’s call: ‘why is no one following digital humanities scholars around to understand their practices, in the way that scientists have been studied for the last several decades?’ (borgman, ). what do we see if we step back for a while from our work as textual scholars and digital humanities researchers and look at what is happening from the perspective of the social sciences, in particular that of science and technology studies? science and technology studies suggest, inter alia, studying technology development in its social context.
in the past few years, i have studied the creation and development of the digital scholarly edition within the laboratory-like setting of the huygens institute for the history of the netherlands. here we find a relatively large (for humanities contexts in any case) it research and development (r&d) group of on average sixteen persons working together with about sixty historians, textual scholars, and digital archivists. the research context consists of a dozen senior researchers, a similar number of non-senior and associate researchers, a similar number of ph.d. candidates with various contracts ranging from predominantly full-time added staff to volunteer workers, and of course non-it r&d supporting staff. the adoption and application of technology is as much a social as it is a technical process. these processes are inevitably intertwined: technology does not determine but operates within and is operated upon in a complex social field (bijker et al., ). the manifestation of such intertwined processes is directly visible in the field of digital humanities and in the development of the digital scholarly edition. of course, the digital scholarly edition is a digital artifact brought to life in a context of heavy interaction between technology (computer science and digital humanities) and a non-technological context (textual scholarship and humanities in general). this intricate and intensive interaction is a daily practice at the huygens institute. one of my tasks is to guide the interaction between it r&d, documentary editors, textual scholars, and researchers of literature and history, and to facilitate the ongoing methodological discussion between these cultures. i have had the privilege to study these processes from many angles: methodology, technology, model, role, audience, development, and so on.
as has happened in many similar research contexts, a transcription tool was developed at the huygens institute to support the basic work of turning non-ocr-able texts from early printed works and medieval and modern manuscripts into their digital machine-processable counterparts. the development of this tool, elaborate (cf. https://www.elaborate.huygens.knaw.nl), was based on a strategy of encapsulating and hiding xml markup (to be transformed to tei encoding behind the scenes) with a graphical interface. in this way, the tool was meant to present minimal barriers to transcribers who came with varying levels of expertise in encoding. this indeed resulted in successful participation of significant numbers of volunteers unskilled in xml over a large set of projects. the encapsulation of technicalities also greatly facilitated the focus on community and project management (beaulieu et al., ). here i am not so much interested in the features or particulars of elaborate. instead i want to focus on one particular researcher–developer interaction i witnessed that, i think, stands as an example of a general and strong tendency in the scholarly community at large. the usability principle behind elaborate is that any encoding or markup is treated as an annotation on arbitrary regions within the text. to this end, when a user has selected a certain region in the text with the mouse, a pop-up dialog appears allowing the user to enter annotative tags, comments, etc. the interface thus closely mimics a concept (using a highlighter and pen to create annotations) that is known and tangible to anyone who has basic experience in working with scholarly texts.
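the principle of treating markup as annotations on arbitrary regions of a text can be illustrated with a minimal sketch. it is not elaborate's actual data model, which the article does not describe; the class names, fields, and character-offset convention below are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """A tag plus optional comment attached to a character range (hypothetical model)."""
    start: int          # inclusive character offset into the text
    end: int            # exclusive character offset
    tag: str
    comment: str = ""


@dataclass
class Transcription:
    """A plain text with annotations stored alongside it rather than inline."""
    text: str
    annotations: list = field(default_factory=list)

    def annotate(self, start, end, tag, comment=""):
        # Corresponds to selecting a region and filling in the pop-up dialog.
        if not (0 <= start < end <= len(self.text)):
            raise ValueError("region lies outside the text")
        annotation = Annotation(start, end, tag, comment)
        self.annotations.append(annotation)
        return annotation

    def region(self, annotation):
        # The stretch of text an annotation refers to.
        return self.text[annotation.start:annotation.end]

    def annotations_at(self, offset):
        # Every annotation covering one character position; regions may overlap.
        return [a for a in self.annotations if a.start <= offset < a.end]


transcription = Transcription("in the beginning was the word")
emphasis = transcription.annotate(7, 16, "emphasis")
note = transcription.annotate(3, 16, "note", "opening phrase")
print(transcription.region(emphasis))        # beginning
print(len(transcription.annotations_at(8)))  # 2
```

because the annotations live outside the text, two regions may overlap freely, which is one reason such a design can later be serialized to more than one encoding.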
the clear downside of this principle, if dogmatically applied, is that a user is left with an enormous number of click-and-point-and-type annotation tasks. especially in cases of seemingly insignificant but frequent markup, such as the indication of boldface print, this approach strikes the user as tediously pedantic. the result of this usability agony was a recurring and strong push in the user community to have a button labeled ‘bold’ (in fact to have several such buttons for italics, underline, and other common, very frequently appearing properties of text), lowering the volume of tedious annotation. i remain to this day convinced that we should not have implemented that button as we did. the root cause for my conviction is of course that these buttons violate the rationale for xml over html, namely the strict and intentional separation of representational and semantic information. the most common interpretation of boldface type is that it is a material manifestation of the concept of emphasis. even this is not universal: many other concepts may also be expressed by the use of boldface type. thus, the provision of a button to record that some text is in boldface type introduces principal ambiguity in a descriptive system. there is no way to tell what the function of the bold print was: it arbitrarily covers any use, without delineating which of the several possible textual concepts might apply. more important for my argument here, however, is that the implementation of this simple button reveals how technology is indeed shaped through its social context. the intent of elaborate’s approach was paradigmatic: its purpose was to allow editors of text to change from a representational paradigm to a semantic paradigm. we could have done this by forcing our users to become competent xml authors.
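the ambiguity described here can be made concrete with a small sketch. the semantic categories below are hypothetical examples, not taken from elaborate or from any tei schema; the point is only that a mapping from meaning to rendering is many-to-one, so recording the rendering alone cannot be inverted to recover the meaning.

```python
# Hypothetical semantic categories and the typographic rendering each receives in print.
SEMANTIC_TO_RENDERING = {
    "emphasis": "bold",
    "headword": "bold",
    "speaker-name": "bold",
    "foreign-term": "italic",
}


def render(semantic_tag):
    """Going from meaning to appearance is always well defined."""
    return SEMANTIC_TO_RENDERING[semantic_tag]


def possible_meanings(rendering):
    """Going back from appearance to meaning yields a set of candidates, not an answer."""
    return sorted(tag for tag, r in SEMANTIC_TO_RENDERING.items() if r == rendering)


print(render("headword"))           # bold
print(possible_meanings("bold"))    # ['emphasis', 'headword', 'speaker-name']
print(possible_meanings("italic"))  # ['foreign-term']
```

a transcription that stores only "bold" keeps the first column of information and discards the second; no later processing step can tell which of the candidate meanings the transcriber saw.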
our users judged xml tedious and complicated, however, and complexity is a well-known ‘fail factor’ working against the adoption of any new technology (rogers, ). thus, to move our users gently into the new paradigm, we had to create an interface that offered a clear and substantial advantage over existing technology, but that at the same time did not seem overly complex. the annotation ‘highlighter’ pop-up seemed a good solution, trying to balance paradigm innovation with ease of use and compatibility with a known paradigm. however, the annotation pop-up led to a tedious routine that severely constrained ease of use. when ease of use is compromised to such an extent, the new possibilities inherent in a technology do not lead to a change of routine to accommodate the technology, and thus the adoption of a new paradigm does not occur. instead, the perceived constraints lead to a change in the technology (leonardi, ). this is exactly what happened in the interaction between developers, users, researchers, and technology in the case of elaborate. a bold button was introduced to remedy usability constraints: social shaping of technology at work. as an unintended consequence, as robert merton would have it, of this social shaping of elaborate, the paradigmatic intent of the innovation was now black boxed. this is not meant in the sense of latour’s definition, which defines a black box according to general acceptance of the correctness of the inner mechanism (latour, ), but in the sense that the innovative aspect of the new paradigm was now completely unobservable and thus effectively unknowable to its intended audience. the unobservability of such a black-box model is also a known ‘fail factor’ for innovation (marinova and phillimore, ; rogers, ).
this is an unintended and usually unrecognized effect i have often found interfaces to have, and it is a problem that particularly affects graphical interfaces. a graphical user interface suggests a transparency of model and paradigm that is not truly there; in fact the graphical interface is as much an opaque barrier to the internal paradigm of a system as it is a means of engaging with that very system. analogous to robinson ( ) and others, i would argue that software interfaces, such as the interfaces to digital text editions, are an intellectual argument about the internal model of a system rather than a direct communication of that model to any user. when (as a result of the interaction between developer and user/researcher) the interface undergoes social shaping, that is also an expression of an intellectual argument by the user about the model. in the case of the bold button, the user has not merely molded convenience into the interface. what also happened was that the intended paradigm, that of semantically oriented xml, was expressed in a paradigm which was more familiar to most users, that of representationally oriented html. but this effectively prevented the user from engaging with and getting to know the new paradigm, or at least a part of it. the bold button hid a class of semantically expressive potential behind a single representational ‘wrapper’. as an extension of the meno paradox (nickles, ), not only were the users unable to negotiate new knowledge, they had shaped the technology in a way that made it now impossible to engage at all with the new paradigm. user-centered design had led to the users shaping new technology so that it was congruent with the paradigm they were familiar with. the new was expressed in the ways of the old, but also turned into something inaccessible and irrelevant.
this unintended effect of an intended paradigm being encapsulated and effectively hidden by a more familiar paradigm is caused by what i will call paradigmatic regression: the social shaping of a technological interface such that it can no longer express essential properties of an intended paradigm. the pivotal error that was made with the introduction of the ‘bold’ button was that the button does not express the digital paradigm. instead, we did exactly the opposite: we facilitated the scholarly users’ regression toward the paradigm of the book metaphor known to them. thereby we confirmed that nothing had changed, that print convention was still the paradigm to use. as proponents of digital scholarship, we may tend to think we are free from this sort of paradigmatic regression. but we are not. most if not all digital scholarly editions are still solidly rooted in book metaphors and print conventions, and i think it is exactly because of this silent regression. a brief history of humanities computing may be telltale. the beginnings of humanities computing and the development of the digital scholarly edition are usually dated to the seminal work of father busa (hockey, ). roberto busa demonstrated the first practical applications of computational text processing by automating the tasks of indexing and context retrieval. however, the result was presented in a form already well known to scholarly editing: a fifty-six-volume print publication concordance. the computational aspect was used simply to automate and scale a tedious and error-prone editorial task. the utility and sense of that of course goes without question. what interests me here, however, is that the automation was geared toward reiterating on a larger scale a scholarly task that was in essence well known and rehearsed; computational power was harnessed to produce an instrument well within the confines of the existing paradigm of print text and its scholarly applications.
the advent of the database and later the relational database prompted the curation and publication of several catalogs and indices of textual metadata, as well as the first repositories of text. this was of course a major enhancement of the capacity for discovery of texts and related metadata. databases allowed for efficient and convenient discovery of text through the use of matching selection queries. scholars such as jerome mcgann, peter robinson, dino buzzetti, manfred thaller, and others began to envision different forms of engagement with text made possible by the availability of full-text repositories and metadata. despite all this, the database did not change the essential way scholars engaged with the actual texts. even if, for instance, buzzetti and thaller argued that a digital edition’s ‘liability to processing’ is the essential feature that sets it apart from conventional editions (buzzetti and rehbein, ), texts were still perceived predominantly as intentionally ordered strings of words for human interpretation. thus, notwithstanding ideas on how to engage with text in new ways separate from the reading, commentary, and interpretation that has traditionally been handled by humans, the digital scholarly editions produced in the last part of the th century have again presented text to us essentially as a digitized book. according to hockey, in the early to mid- s a great deal of interest and discussion arose in the scholarly community concerning what an electronic edition might look like. however, with the ‘notable exception of work carried out by peter robinson’, few of these publications were realized in an actual implementation.
once ‘theory had to be put into practice and projects were faced with the laborious work of entering and marking up text and developing software, attention began to turn elsewhere’ (hockey, ). as with the bold button example, we find that a new technology turned out to provide too little practical facility to lead to successful innovation. yet there is more to the matter. the ‘next big thing’ of the last decade of the th century was the world wide web, founded on the technologies of the internet and hypertext. as landow has pointed out, ‘computer hypertext—text composed of blocks of words (or images) linked electronically by multiple paths, chains, or trails in an open-ended, perpetually unfinished textuality described by the terms link, node, network, web, and path’ precisely matches roland barthes’ ideal textuality (landow, ). if we need to point to a single moment and opportunity in history when the very fabric of a new technology was made suitable to a scholarly community for the expression of relations and structures, not just within single texts but especially between texts, it was the moment of the invention of hypertext. that the opportunity arose cannot have been surprising, as the essential mechanism of hypertext, the hyperlink, was the technological implementation of a long-standing idea that knowledge and information are interlinked. pioneers such as paul otlet could already, in the early th century, contemplate information systems that would link knowledge in the form of formalized multidimensional relations between documents (rayward, ). what is actually rather surprising is that such long-standing epistemological knowledge about the relation of different chunks of information within documents, and congruent ideas from post-structuralist literary criticism such as kristeva’s intertextual references (mitra, ), found so little expression in digital scholarly editions. the expressive power of that single pivotal element of the original html .
specification, the html a element with its invaluable href property, implemented by tim berners-lee and itself an echo of theodor nelson’s ideas of transclusion (nelson, ), should have reverberated within the scholarly community. here was its opportunity to give expression to the linked and intertwined natures of cultures of text, literary criticism, and (digital) textual materiality that go to the heart of the field (van mierlo, ). the hyperlink created a native digital expression for the act of referencing, an expression of knowledge very much at the core of textual description, interpretation, and criticism. thus, here was a unique opportunity to change from a paradigm of print publication to a paradigm of interconnected texts expressing knowledge. the scholarly editing community, however, adopted the ‘markup’ rather than the ‘hyper’ of the hypertext markup language, by developing goldfarb’s sgml eventually into the tei-xml descriptive standard (goldfarb, ; renear, ). at the time, these dialects of markup technology were used primarily to mark up texts as they are represented in books; the fact that, i think, no one has more than flippantly suggested marking up web pages in tei-xml may stand to prove the point. the scholarly community predominantly turned hypertext markup into a descriptive model of the book, and we have produced digital book metaphors as digital scholarly editions ever since. as with the bold button, a new technology was not explored but rather encapsulated by a known paradigm. the hyperlink was meant not to be a descriptive tool, but to link information in different documents. yet its foremost use in scholarly editing has been to link contents, chapter headings, and indices to pages in self-contained digital editions.
roberto busa had ‘a vision and imagination that reach beyond the horizons of many of the current generation of practitioners who have been brought up with the internet’. he imagined scholarly editions on the internet combined with analysis tools (hockey, ), a horizon that has been reiterated by many (cf. for instance buzzetti, ). however, digital editions developed in a completely different direction. the processing involved is mostly aimed at rendering the text for consumption by human readers. to defy the intent of the hyperlink has been in my view among the most remarkable feats of paradigmatic regression in the textual scholarship community. one can wonder though whether this is a bad thing. if we accept the bilateral dynamic between audience and innovation, then why would we care when some innovations do not succeed? if the book metaphor paradigm suffices for our needs, does this not indeed suffice? to answer that we must ask: to whose needs do digital scholarly editions actually cater? given the designation, they should cater to scholars and researchers, but do they? the latest developments in digital scholarly editing are linked to the possibilities created for computer-supported cooperative work (cscw), a term that was coined by the ibm research group headed by greif ( ), by the internet and the rise in computer literacy. essentially cscw is a label that can be put on any collaborative activity that is supported by web or web . means. crowdsourcing as a means of dividing large workloads has been around for a while and has been a specific implementation of cscw ever since web . technologies turned into web . technologies.
many have proclaimed crowdsourcing to be the advent of the social edition (most prominently ray siemens; siemens et al., ), which redefines the editor’s role to be that of a team leader concerned with proper workflow, quality control, and overseeing managerial and funding aspects (sahle, ), whereas concrete editorial tasks are delegated to social communities formed around specific texts. questions have been raised about the actual effectiveness of crowdsourcing (causer et al., ). but more importantly, recent studies show that the old rule of thumb of the collaborative internet, that % of the workforce provides % of the labor (cf. brumfield et al., ), still holds for any open collaborative project, implying that many crowdsourced editions are not in fact truly social. moreover, when peter robinson said ‘all readers may become editors too’, he was not simply referring to a cheap labor force for source transcription, to be conveniently discarded the moment a transcription phase is done (robinson, ). instead, as ray siemens proposed, he envisioned a ‘social edition’ that embodies the ideas of open notebook science (cf. shaw et al., ) and renders all aspects of the editorial process (e.g. annotation, commentary, and interpretation) open to public engagement (siemens et al., ). but we in the scholarly community are not at all at ease with letting go of our presumption that scholarly editing is a highly skilled practice that does not provide for easy delegation of tasks. it is challenging to truly consider the extent to which we can open up the scholarly process of creating a digital edition to leave the tedious tasks typically associated with high quality scholarly inference to the wisdom of the crowds; in the case of literary analysis, this often includes, for instance, the painstaking tracing of names, annotation of plot, and clarification of meaning.
in current practice, however, the digital scholarly editorial tasks beyond the transcription phase remain reserved either for the single authoritative author or for a small group of qualified editors. in this way, most scholarly digital editions adhere to an authoritative publication paradigm. we use big all-encompassing words like ‘social’, ‘open’, and ‘community’, but in fact we are again regressing to authoritative processes that remain well within the paradigm of the print edition. although on the verge of being harsh, it is nevertheless fair to state that digital scholarly editions cater to the needs of the scholarly editors, not to users and researchers as knowledge producers. along another tangent: edward vanhoutte pointed out the possibilities of targeting different audiences with different visualizations of the same editions (vanhoutte, ). in his view, minimal editions should target a broader audience, while maximal editions with a far larger number of scholarly bells and whistles should provide for the needs of researchers. several digital scholarly editions do show signs of this sort of differentiation. we can point to the van gogh letters (jansen et al., ) as something of a midpoint between the minimal and maximal edition. the samuel beckett digital manuscript project (van hulle and nixon, ) and the pre-production version of the digital faust edition (brüning et al., ) that i have been allowed to see certainly should qualify as maximal editions. however, these and virtually all digital scholarly editions again reiterate in the gui the metaphors of the ‘read-only’ book.
no digital scholarly edition provides what i think is paramount for true interaction with editions or scholarly text resources: the capacity to negotiate the edition and its text as data over web-serviced application programming interfaces (apis). apis allow for computer-to-computer negotiation of texts, opening them up to algorithmic processing and reuse. my primary reason for arguing that we need our digital scholarly editions as api accessible texts is not, as some may expect, to enable quantified computational approaches such as those that matthew jockers and franco moretti have taken (jockers, ; moretti, ), or the stylometric analysis desired by many others (van dalen-oskam and van zundert, ; kestemont, ). it is highly useful and convenient to have the text of scholarly editions available as an open web service, so that my computational colleagues and i can do our principal component analyses, bootstrap trees, clustering analyses, and any other analysis that can possibly be envisioned. there is another reason, in my view more important yet overlooked, to consider anchoring digital scholarly editions on a data model that is not oriented around a book metaphor. this motivation derives from the growing and increasingly unsettling gap i find between the close reading of scholars using conventional hermeneutic approaches and the ‘big data’ driven distant reading supported by probabilistic approaches, a discrepancy which is also signaled by others (capurro, ). on the one hand, we see a conventional scholarly approach in which texts are mindfully and meticulously produced, detailed, and interpreted. on the other hand, we find a deterministic and probabilistic approach whose focus is large-scale data analysis and which is, through its statistical aspect, reductive in nature.
to the hermeneutic scholar, distant reading approaches are therefore ‘lossy’, prone to discarding some of the substance, and quite incapable of capturing essential hermeneutic knowledge (cf. ramsay, ). it is often the statistical outliers and not just patterns of similarities that are telltale to textual scholars and historians in their hermeneutic explorations. at present there is no model connecting these worlds of close and distant reading. rather, the distance between them is growing, which threatens not only to set the scholarly community of textual and literary studies against itself, but also to waste the opportunity for a true and meaningful advance in our capabilities for computationally based humanities research. if we are to close this gap, we need a model for digital text that allows for both hermeneutic and statistical approaches so that these approaches can truly inform each other. to this end we need to revisit and reconsider how we anchor digital editions on the hypertext model. the slavish adherence to the book metaphor, even in xml form, will not take us into a realm where texts and editions are published as online apis for processing by computational means. yet models of quantification also fall short, as they are narrowly defined for statistical methodology. because such models are not data models, they do nothing to express description, encoding, or annotation. we are in need of a model that actually provides for all of the above. that is, a model that provides for the capturing, encoding, and annotating of a text and also for processing the edited or raw resource to enable analyses by both conventional hermeneutics and quantified approaches. lastly, this model must be recursive: it must be able to capture all resulting information from an analysis and add that information into the model itself.
only then can new knowledge gained from the model be used ‘natively’ for a next cycle of both qualitative and quantitative analysis. such a model captures all editorial and research aspects and outputs of scholarly activity in an encompassing lifecycle. but even more important: only such a model provides for a way to bridge the widening gap that is coming into existence between the hermeneutic tradition and new quantified means. computational method can do far more than just counting, averaging, and comparing histograms. but currently computational approaches ignore many of the properties of text and textual materiality that are important to hermeneutic engagement. current quantified approaches therefore lack the ability to model and computationally process the close reading aspects of text engagement. thus what we lack is something we could, tongue in cheek, call ‘near distant’ or ‘near close’ reading. more formally and in line with current debate, i think we should qualify what we lack as an enabler of computational heuristics for capta (drucker, ). but arguably ‘near close reading’ and ‘near distant reading’ both capture in their own ambiguity exactly the properties of textual scholarly data and knowledge that quantified approaches tend to overlook: extremity of sparseness, inconsistency, vagueness, ambiguity, multi-interpretability, and uncertainty. there is no readily available means for such qualitative computing. qualitative modeling and computing are still highly explorative fields (cf. forbus, ), and yet, abilities to compute and reason over qualitative data are coming into existence.
as the creators and providers of the raw materials that such qualitative computational approaches should operate on, editors of digital scholarly editions should consider how text as data is to be provided. knowledge graphs are, i think, extremely well suited for this. graphs are not new to us, nor to our field. the world wide web is a graph, a network of nodes and edges connecting information. in a sense, every digital scholarly edition put online has therefore in fact been made part of a graph. in recent years, graphs have also found various more explicit applications in the field of digital humanities, most notably as a data model for describing textual variation between different witnesses of the same text (schmidt and colomb, ). the properties of the graph model, however, allow it to be a generic model capturing the information tied to a digital scholarly edition on all conceivable levels of granularity. two examples may show this potential conceptually. imagine a knowledge graph as a network with nodes and edges. in this hypothetical graph, we designate three nodes to represent texts a, b, and c. an interface to the graph allows us to add edges and nodes to this network. what is essential here is that the underlying model is a graph; the graphical display may take many forms and need not itself be a visual network. suppose now a textual scholar x states that text a was conceived before text c. this statement can be represented as a directed relational edge (or predicate if you like) ‘precedes’ between a and c as depicted in fig. . now assume another researcher y at another point in time, not necessarily even knowing anything about text a and independently of researcher x, concludes that text b was conceived after text c. this statement can be captured by putting an edge ‘precedes’ between c and b. the tiny graph as depicted in fig. now holds the accumulated knowledge.
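the two statements can be written down as a very small directed graph and walked programmatically. the python sketch below is only an illustration under assumptions that are not in the article: the function names `state` and `precedes`, the `source` attribute, and the adjacency-set storage are hypothetical choices, not the interface of any existing edition platform.

```python
from collections import defaultdict

# Hypothetical mini knowledge graph. The node labels and the 'precedes'
# predicate follow the example in the text; everything else is illustrative.
edges = defaultdict(set)


def state(subject, predicate, obj, source):
    """Record one scholarly statement as a labelled, attributed edge."""
    edges[subject].add((predicate, obj, source))


def precedes(start):
    """Walk 'precedes' edges from start and return every text reached."""
    reached, stack = set(), [start]
    while stack:
        node = stack.pop()
        for predicate, obj, _source in edges[node]:
            if predicate == "precedes" and obj not in reached:
                reached.add(obj)
                stack.append(obj)
    return reached


# Two independent observations, made by different people at different times.
state("a", "precedes", "c", source="scholar x")
state("c", "precedes", "b", source="researcher y")

print(sorted(precedes("a")))  # ['b', 'c']
```

neither contributor ever stated anything about a and b together; the relation appears only when the graph is walked. keeping the `source` on each edge is one possible way to remember who asserted what.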
however, note that the combination of independent observations now adds up to more than just the sum of its parts, for reasoning, walking, or computing over the graph (all three verbs essentially express the same operation of inferring knowledge from the graph) gives us the added knowledge that a must have preceded b.

[fig. . nodes in a conceptual knowledge graph]
[fig. . edges multiply knowledge]

the second example is taken from collatex, which is a tool to automatically collate variant texts (cf. http://collatex.net/). the result of such comparisons can be stored as graphs, e.g. fig. . such graphs cannot be said to be quantified; rather, they express the qualitative word variance between texts. but the application of the graph stretches wider. as in the previous example, we can add statements (knowledge) about this text to the graph by adding nodes and edges. the example in fig. shows two statements made by superseding nodes on partly overlapping regions of the text. they express in a hypothetical fashion how these regions should look for a reader of an epub serialization of the text to be read on an ereader. note how overlap, a well-discussed problem for hierarchical models (sperberg-mcqueen, ), is not relevant to such a non-two-dimensional graph model.

[fig. . conceptual knowledge graph representing textual variation in two texts a and b]
[fig. . overlapping semantic and representational knowledge added to the graph of figure ]

it should be carefully pointed out that knowledge graphs as a model are not to be equated with the currently popular ideas on semantic web and rdf. rdf can necessarily only be a static representation of a certain state of such a graph. the relation between rdf/semantic web and graph models is analogous to the relation between tei and xml. a tei conformant xml document is a singular instantiation of (a part of) the tei model. the tei model itself however is represented by the dynamic set of guidelines defined for the description of text and document structures.

these knowledge graphs can grow dauntingly complex very quickly, as may be inferred from fig. . because such complexity also poses a problem for querying and performance on the computer science side of things, we have never seen wide application of graphs, let alone as a model for humanities data. meanwhile, however, knowledge graphs in the same fashion as shown in these tiny examples back the social network applications of, for instance, companies like facebook and google. graph databases like neo j (http://en.wikipedia.org/wiki/neo j) and infogrid (http://infogrid.org/) are making application-level models feasible. this paves the way toward exploring the potential of graphs for expressing the information and knowledge represented in digital scholarly editions. in reality when putting text and editions on a graph, as users we may not experience them as graphs, but rather as any visualization or data representation we want to derive from the graphs. by footing such representations and visualizations on a graph model, we provide an underlying truly generic and interoperable means for representing, editing, annotating, and visualizing text, its relations, its multiperspectivity, and its materiality in digital scholarly editions. at the same time and through the same data model we provide a means for qualitative and quantitative computing over the information contained in the graphs representing our editions.
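the overlapping statements described above can also be sketched in code. the following python fragment is a hypothetical illustration, not the collatex data model: the `Graph` class, the edge labels `next` and `covers`, the `render` attribute, and the sample tokens are all assumptions made for the sake of the example. it shows that two statement nodes may cover partly overlapping token ranges without any conflict, because neither has to nest inside the other.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Very small property graph: nodes carry attributes, edges are triples."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs
        return node_id

    def add_edge(self, source, label, target):
        self.edges.append((source, label, target))

    def targets(self, source, label):
        return [t for s, lbl, t in self.edges if s == source and lbl == label]


g = Graph()

# Illustrative text only: one node per token, chained by 'next' edges.
words = ["alpha", "beta", "gamma", "delta", "epsilon"]
token_ids = [g.add_node(f"t{i}", text=w) for i, w in enumerate(words)]
for left, right in zip(token_ids, token_ids[1:]):
    g.add_edge(left, "next", right)


def annotate(note_id, covered, **attrs):
    """Add a statement node that points at the tokens it covers."""
    g.add_node(note_id, **attrs)
    for token_id in covered:
        g.add_edge(note_id, "covers", token_id)


# Two statements on partly overlapping regions (t1..t3 and t2..t4).
annotate("n1", token_ids[1:4], render="bold")
annotate("n2", token_ids[2:5], render="italic")

# The overlap is simply the set of tokens reached from both statement nodes.
overlap = sorted(set(g.targets("n1", "covers")) & set(g.targets("n2", "covers")))
print(overlap)  # ['t2', 't3']
```

a result of an analysis (for instance the computed overlap) could be written back as a further node with its own `covers` edges, which is one way to read the requirement that the model be recursive.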
thus, with a graph model, we provide a more expressive data model for digital scholarly editions, allowing for the modeling and computation of both statistical and hermeneutic approaches. providing a digital scholarly edition with the backbone of a network graph would mean anchoring text on a fundamentally different model than that of the book metaphor. all digital book metaphors are until now essentially closed-off, inconvenient mixtures of multiple page- and string-oriented hierarchical models. what we cannot achieve through the book paradigm is walking the various alternatives of the graph that expresses interpretations and knowledge about the document under consideration. that is, we cannot algorithmically get at and process the text with all its annotations, comments, and additional information on authorship, materiality, interpretation, etc. the reason for this is that the book paradigm keeps us locked in and focused on a finite representational state of the text: it is oriented toward closing down the text. in contrast, graph models provide an elegant open way to connect information to the text in an infinitely extensible fashion. whether machine negotiated or by human interpretation, new information can be attached to any particular item in the graph in the same way, thus becoming information that can be processed by both scholar and algorithm. thus, the essential difference is that the same model can cater to capturing hermeneutic inference and computational analysis results. but we will only successfully explore such potential if we quit the social habit of shaping new models back into old paradigms.

[fig. . a graph representing a bible verse in various redactions]

references
aap. ( ). aap reports october book sales. aap the association of american publishers. http://www.winterdigital.com/work/aap_final/main/presscenter/archicves/ _dec/aapreportsoctoberbooksales.htm.
beaulieu, a., van dalen-oskam, k., and van zundert, j. ( ). between tradition and web . : elaborate as a social experiment in humanities scholarship. in takševa, t. (ed.), social software and the evolution of user expertise: future trends in knowledge creation and dissemination. hershey: igi global, pp. – .
bijker, w., hughes, t., and pinch, t. (eds.) ( ). the social construction of technological systems: new directions in the sociology and history of technology. cambridge, ma: mit press.
bod, r. ( ). het einde van de geesteswetenschappen . . http://staff.science.uva.nl/~rens/oratierens.pdf (accessed february ).
boot, p. and van zundert, j. ( ). the digital edition . and the digital library: services, not resources. bibliothek und wissenschaft, : – .
borgman, c. ( ). the digital future is now: a call to action for the humanities. digital humanities quarterly, ( ). www.digitalhumanities.org/dhq/vol/ / / / .html.
bridenbaugh, c. ( ). the great mutation. american historical review, ( ): – .
brumfield, b., klevan, d. and vershbow, b. ( ). sharing public history work using crowdsourcing of both data and sources. http://www.imls.gov/about/webwise.aspx (accessed june ). also: brumfield, b. ( ). crowdsourcing at imls webwise . collaborative manuscript transcription. http://manuscripttranscription.blogspot.nl/ / /crowdsourcing-at-imls-webwise- .html (accessed june ).
brüning, g., henzel, k., and pravida, d. ( ). multiple encoding in genetic editions: the case of “faust”. journal of the text encoding initiative, . http://jtei.revues.org/ .
buzzetti, d. and rehbein, m. ( ). textual fluidity and digital editions. in dobreva, m. (ed.), text variety in the witnesses of medieval texts. sofia: institute of mathematics and informatics, pp. – . http://www.denkstaette.de/files/buzzetti-rehbein.pdf.
buzzetti, d. ( ). digital editions and text processing. in deegan, m. and sutherland, k. (eds), text editing, print and the digital world. farnham/burlington: ashgate, pp. – . http://www.academia.edu/ /digital_editions_and_text_processing (accessed september ).
causer, t., tonra, j., and wallace, v. ( ). transcription maximized; expense minimized? crowdsourcing and editing the collected works of jeremy bentham. literary and linguistic computing, ( ): – .
cain miller, c. and bosman, j. ( ). e-books outsell print books at amazon. new york times. http://www.nytimes.com/ / / /technology/ amazon.html?_r= (accessed september ).
capurro, r. ( ). digital hermeneutics: an outline. ai & society, ( ): – .
courant, p. n. et al. ( ). our cultural commonwealth: the report of the american council of learned societies’ commission on cyberinfrastructure for humanities and social sciences. new york: american council of learned societies.
drucker, j. ( ). humanities approaches to graphical display. digital humanities quarterly, ( ). http://digitalhumanities.org/dhq/vol/ / / / .html (accessed august ).
fish, s. ( ). the old order changeth. new york times. http://opinionator.blogs.nytimes.com/ / / /the-old-order-changeth/ (accessed february ).
forbus, k. d. ( ). qualitative modeling. in van harmelen, f., lifschitz, v., and porter, b. (eds), handbook of knowledge representation. foundations of artificial intelligence. amsterdam, boston, heidelberg etc.: elsevier, pp. – .
goldfarb, c. f. ( ). the roots of sgml -- a personal recollection. http://www.sgmlsource.com/history/roots.htm (accessed september ).
greif, i. (ed.) ( ). computer-supported cooperative work: a book of readings. san mateo, ca: morgan kaufmann publishers, inc.
haslhofer, b. et al. ( ). the open annotation collaboration (oac) model. arxiv. http://arxiv.org/abs/ . (accessed september ).
hockey, s. ( ). the history of humanities computing. in schreibman, s., siemens, r., and unsworth, j. (eds), a companion to digital humanities. oxford: blackwell. http://www.digitalhumanities.org/companion/ (accessed september ).
jansen, l., luijten, h., and bakker, n. (eds) ( ). vincent van gogh: the letters. amsterdam: amsterdam university press. http://www.vangoghletters.org/ (accessed september ).
jockers, m. l. ( ). macroanalysis: digital methods and literary history. urbana, chicago, springfield: ui press.
jones, s., shillingsburg, p., and thiruvathukal, g. ( ). e-carrel: an environment for collaborative textual scholarship. journal of the chicago colloquium on digital humanities and computer science, ( ). https://letterpress.uchicago.edu/index.php/jdhcs/article/view/ / (accessed september ).
kestemont, m. ( ). het gewicht van de auteur. een onderzoek naar stylometrische auteursherkenning in de middelnederlandse epiek. universiteit antwerpen, faculteit letteren en wijsbegeerte, departementen taal- en letterkunde.
landow, g. p. ( ). hypertext . : critical theory and new media in an era of globalization. rev. ed. of hypertext . . baltimore: the johns hopkins university press.
latour, b. ( ). science in action: how to follow scientists and engineers through society. cambridge, ma: harvard university press.
leonardi, p. m. ( ). when flexible routines meet flexible technologies: affordance, constraint, and the imbrication of human and material agencies. mis quarterly, ( ): – .
lunenfeld, p. et al. ( ). digital_humanities. cambridge, ma/london: mit press. http://mitpress.mit.edu/books/digitalhumanities- (accessed november ).
marinova, d. and phillimore, j. ( ). models of innovation. in shavinina, l. v. (ed.), the international handbook on innovation. kidlington, oxford: elsevier science ltd., pp. – .
mcgann, j. ( ). electronic archives and critical editing. literature compass, : – .
mitra, a. ( ). characteristics of the www text: tracing discursive strategies. journal of computer-mediated communication, ( ). http://onlinelibrary.wiley.com/doi/ . /j. - . .tb .x/full (accessed october ).
moretti, f. ( ). graphs, maps, trees: abstract models for literary history. london: verso.
nelson, t. h. ( ). the heart of connection: hypermedia unified by transclusion. communications of the acm, ( ): – .
nickles, t. ( ). evolutionary models of innovation and the meno problem. in shavinina, l. v. (ed.), the international handbook on innovation. kidlington, oxford: elsevier science ltd., pp. – .
nowviskie, b. ( ). too small to fail. bethany nowviskie. http://nowviskie.org/ /too-small-to-fail/ (accessed october ).
porter, d. ( ). medievalists and the scholarly digital edition. scholarly editing: the annual of the association for documentary editing, . http://www.scholarlyediting.org/ /essays/essay.porter.html (accessed march ).
rayward, w. b. ( ). visions of xanadu: paul otlet ( - ) and hypertext. jasis, , pp. – .
ramsay, s. ( ). reading machines: toward an algorithmic criticism (topics in the digital humanities). chicago: university of illinois press.
renear, a. ( ). text encoding. in schreibman, s., siemens, r., and unsworth, j. (eds), a companion to digital humanities. oxford: blackwell. http://www.digitalhumanities.org/companion/ (accessed september ).
robinson, p. ( ). where we are with electronic scholarly editions, and where we want to be. http://computerphilologie.uni-muenchen.de/jg /robinson.html (accessed september ).
robinson, p. ( ). five desiderata for scholarly editions in digital form. in digital humanities conference . lincoln, nb. http://dh .unl.edu/abstracts/ab- .html (accessed september ).
rogers, e. m. ( ). diffusion of innovations, nd edn. new york, london: the free press.
sahle, p. ( ). digitale editionsformen, zum umgang mit der überlieferung unter den bedingungen des medienwandels – befunde, theorie und methodik. norderstedt: norderstedt books on demand.
schmidt, d. and colomb, r. ( ). a data structure for representing multi-version texts online. international journal of human-computer studies, ( ): – .
shaw, r., buckland, m. and golden, p. ( ). open notebook humanities: promise and problems. in digital humanities conference . lincoln, nb. http://www.academia.edu/ /open_notebook_humanities_promise_and_problems (accessed september ).
siemens, r. et al. ( ). toward modeling the social edition: an approach to understanding the electronic scholarly edition in the context of new and emerging social media. literary and linguistic computing, ( ): – .
sperberg-mcqueen, c. m. ( ). what matters? in proceedings of extreme markup languages. extreme markup languages. montréal, canada. http://conferences.idealliance.org/extreme/html/ /cmsmcq /eml cmsmcq .html (accessed september ).
van hulle, d. ( ). editing samuel beckett. jnul.huji.ac.il/eng/docs/israel_interedition_nli_ .pdf (accessed september ).
van hulle, d. and nixon, m. ( ). samuel beckett digital manuscript project. samuel beckett digital manuscript project. http://www.beckettarchive.org/ (accessed september ).
van mierlo, w. ( ). textual scholarship and the material book. variants: the journal of the european society for textual scholarship, . http://www.academia.edu/ /introduction_to_textual_scholarship_and_the_material_book (accessed september ).
van dalen-oskam, k. and van zundert, j. ( ). delta for middle dutch: author and copyist distinction in “walewein.” literary and linguistic computing, : – .
vanhoutte, e. ( ). so you think you can edit? the masterchef edition. http://edwardvanhoutte.blogspot.nl/ _ _ _archive.html (accessed september ).
winkler, m. ( ). interpretatie en/of patroon? over ‘het einde van de geesteswetenschappen . ’ en het onderscheid tussen kritiek en wetenschap. vooys, ( ): – .

notes
a recent example in the dutch literary and linguistics theatre is professor rens bod proclaiming the end of humanities . (bod, ), and ph.d. student marieke winkler sincerely questioning that (winkler, ).
as in many other contexts (cf. nowviskie, ), the relationship between an it r&d group and scientific staff is a matter of some internal debate in the institute. in part, this role is supporting; in part, it is collaborative at the research level.
there is likely a distinction to be made here between senior scholars as transcribers and non-academic volunteer ‘crowd sources’. although i lack any statistically viable data, anecdotal evidence suggests that volunteer transcribers may in fact attach hundreds of tiny and similar annotations without complaint, but the senior researcher will feel put at odds with his experience and practice when invited to do so.
i kindly thank moritz wissenbach from würzburg university, who is among other occupations the technical lead for the development of the digital faust edition, for allowing me to share this example which he originally conceived.
initiatives such as the open annotation collaboration are proposing extensions to the world wide web and semantic web models to support annotation of linked data including temporal ‘aware’ annotations (haslhofer et al., ). it is out of scope of this article to examine whether such models would provide for the needed reciprocality and dynamics for graph model-based digital scholarly editions. as the web in its current form is not real-time read/write enabled, it is hard to imagine though how it would provide for such highly dynamic webs of knowledge interaction.
digital humanities
the importance of pedagogy: towards a companion to teaching digital humanities
hirsch, brett d. brett.hirsch@gmail.com university of western australia
timney, meagan mbtimney.etcl@gmail.com university of victoria

the need to “encourage digital scholarship” was one of eight key recommendations in our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences (unsworth et al). as the report suggested, “if more than a few are to pioneer new digital pathways, more formal venues and opportunities for training and encouragement are needed” ( ). in other words, human infrastructure is as crucial as cyberinfrastructure for the future of scholarship in the humanities and social sciences. while the commission’s recommendation pertains to the training of faculty and early career researchers, we argue that the need extends to graduate and undergraduate students.

despite the importance of pedagogy to the development and long-term sustainability of digital humanities, as yet very little critical literature has been published. both the companion to digital humanities ( ) and the companion to digital literary studies ( ), seminal reference works in their own right, focus primarily on the theories, principles, and research practices associated with digital humanities, and not pedagogical issues.
there is much work to be done. this poster presentation will begin by contextualizing the need for a critical discussion of pedagogical issues associated with digital humanities. this discussion will be framed by a brief survey of existing undergraduate and graduate programs and courses in digital humanities (or with a digital humanities component), drawing on the “institutional models” outlined by mccarty and kirschenbaum ( ). the growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn about the sorts of “transferable skills” and “applied computing” that digital humanities offers (jessop ), and the desire of practitioners to consolidate and validate their research and methods.

we propose a volume, teaching digital humanities: principles, practices, and politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. we plan to structure the volume according to the four critical questions educators should consider, as emphasized recently by mary breunig, namely:
- what knowledge is of most worth?
- by what means shall we determine what we teach?
- in what ways shall we teach it?
- toward what purpose?
in addition to these questions, we are mindful of henry a. giroux’s argument that “to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak” ( ). consequently, we will encourage submissions to the volume that address these wider concerns.

references
breunig, mary ( ). 'radical pedagogy as praxis'. radical pedagogy. http://radicalpedagogy.icaap.org/content/issue _ /breunig.html.
giroux, henry a. ( ). 'rethinking the boundaries of educational discourse: modernism, postmodernism, and feminism'. margins in the classroom: teaching literature. myrsiades, kostas, myrsiades, linda s. (eds.). minneapolis: university of minnesota press, pp. - .
schreibman, susan, siemens, ray, unsworth, john (eds.) ( ). a companion to digital humanities. malden: blackwell.
jessop, martyn ( ). 'teaching, learning and research in final year humanities computing student projects'. literary and linguistic computing. . ( ): - .
mccarty, willard, kirschenbaum, matthew ( ). 'institutional models for humanities computing'. literary and linguistic computing. . ( ): - .
unsworth et al. ( ). our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies.

top fair data & software things
february ,
sprinters: reid otsuji, stephanie labou, ryan johnson, guilherme castelao, bia villas boas, anna-lena lamprecht, carlos martinez ortiz, chris erdmann, leyla garcia, mateusz kuzak, paula andrea martinez, liz stokes, natasha simons, tom honeyman, chris erdmann, sharyn wise, josh quan, scott peterson, amy neeser, lena karvovskaya, otto lange, iza witkowska, jacques flores, fiona bradley, kristina hettne, peter verhaar, ben companjen, laurents sesink, fieke schoots, erik schultes, rajaram kaliyaperumal, erzsebet toth-czifra, ricardo de miranda azevedo, sanne muurling, john brown, janice chan, lisa federer, douglas joubert, allissa dillman, kenneth wilkins, ishwar chandramouliswaran, vivek navale, susan wright, silvia di giorgio, akinyemi mandela fasemore, konrad förstner, till sauerwein, eva seidlmayer, ilja zeitlin, susannah bacon, chris erdmann, katie hannan, richard ferrers, keith russell, deidre whitmore, and tim dennis.
organisations: library carpentry/the carpentries, australian research data commons, research data alliance libraries for research data interest group, foster open science, openaire, research data alliance europe, data management training clearinghouse, california digital library, dryad, aarnet, center for digital scholarship at the leiden university, dans, the netherlands escience center, university utrecht, uc san diego, dutch techcentre for life sciences, embl, university of technology, sydney, uc berkeley, university of western australia, leiden university, go fair, dariah, maastricht university, curtin university, nih, nlm, ncbi, zb med, csiro, and ucla.
top fair data & software things: table of contents
about
oceanography
research software
research libraries
research data management support
international relations
humanities: historical research
geoscience
biomedical data producers, stewards, and funders
biodiversity
australian government data/collections
archaeology
https://librarycarpentry.org/top- -fair

top fair data & software things: about
the top fair data & software global sprint was held online over the course of two days ( - november ), where participants from around the world were invited to develop brief guides (stand-alone, self-paced training materials), called "things", that can be used by the research community to understand fair in different contexts but also as starting points for conversations around fair. the idea for "top data things" stems from initial work done at the australian research data commons or ardc (formerly known as the australian national data service).

the global sprint was organised by library carpentry, australian research data commons and the research data alliance libraries for research data interest group in collaboration with foster open science, openaire, rda europe, data management training clearinghouse, california digital library, dryad, aarnet, center for digital scholarship at the leiden university, and dans. anyone could join the sprint and roughly groups/individuals participated from the netherlands, germany, australia, united states, hungary, norway, italy, and belgium. see the full list of registered sprinters.

sprinters worked from a primer that was provided in advance, together with an online ardc webinar introducing fair and the sprint, titled "ready, set, go! join the top fair data things global sprint." groups/individuals developed their things in google docs, which could be accessed and edited by all participants. the sprinters also used a zoom channel, provided by ardc, for online calls and coordination, and a gitter channel, provided by library carpentry, to chat with each other throughout the two days. in addition, participants used the twitter hashtag #top fair to communicate with the broader community, sometimes including images of the day.

participants greeted each other throughout the sprint and created an overall welcoming environment.
as the sprint shifted to different timezones, it was a chance for participants to catch up. the zoom and gitter channels were a way for many to connect over fair but also discuss other topics. a number of participants did not know what to expect from a library carpentry/carpentries-like event but found a welcoming environment where everyone could participate. the top fair data & software things repository and website hosts the work of the sprinters and is meant to be an evolving resource. members of the wider community can submit issues and/or pull requests to the things to help improve them. in addition, a published version of the things will be made available via zenodo and the data management training clearinghouse in february . https://librarycarpentry.org/top- -fair https://librarycarpentry.org/blog/ / /top-ten-fair-announcement/ /users/cerdmann/downloads/topfair/_posts/(https:/www.ands.org.au/working-with-data/skills/ -research-data-things/ -medical-and-health-things) https://librarycarpentry.org/ https://ardc.edu.au/ https://www.rd-alliance.org/groups/libraries-research-data.html https://www.fosteropenscience.eu/ https://www.openaire.eu/ https://www.rd-alliance.org/rda-europe http://dmtclearinghouse.esipfed.org/ http://dmtclearinghouse.esipfed.org/ https://www.cdlib.org/ http://datadryad.org/ https://www.aarnet.edu.au/ https://www.library.universiteitleiden.nl/research-and-publishing/centre-for-digital-scholarship https://www.library.universiteitleiden.nl/research-and-publishing/centre-for-digital-scholarship https://dans.knaw.nl/nl https://docs.google.com/spreadsheets/d/ qq mpxp orue whewac hxxfid_g vvkw dmmtum m/edit?usp=drive_web&ouid= https://docs.google.com/document/d/ twjybutvavez tcq_bdzd kdkmvy wivlue unbr bs/edit https://www.slideshare.net/australiannationaldataservice/ready-set-go-join-the-top- -fair-data-things-global-sprint https://www.slideshare.net/australiannationaldataservice/ready-set-go-join-the-top- -fair-data-things-global-sprint 
https://monash.zoom.us/j/ https://gitter.im/librarycarpentry/top fair https://twitter.com/search?f=tweets&vertical=default&q=% top fair&src=typd https://github.com/librarycarpentry/top- -fair https://librarycarpentry.org/top- -fair/ https://zenodo.org/ http://dmtclearinghouse.esipfed.org/

top fair data & software things: oceanography
sprinters: reid otsuji, stephanie labou, ryan johnson, guilherme castelao, bia villas boas (uc san diego)
https://orcid.org/ - - - https://orcid.org/ - - - https://library.ucsd.edu/about/contact-us/librarians-and-subject-specialists/ryan-johnson.html https://scripps.ucsd.edu/profiles/gpimentacastelao http://orcid.org/ - - -

table of contents
findability: thing : data repositories; thing : metadata; thing : permanent identifiers; thing : citations
accessibility: thing : data formats; thing : data organization and management; thing : re-usable data
interoperability: thing : permanent identifiers; thing : data organization and management; thing : metadata; thing : apis and apps
reusability: thing : tools of the trade; thing : reproducibility; thing : apis and apps

description: oceanographic data encompasses a wide variety of data formats, file sizes, and states of data completeness. data of interest may be available from public repositories, collected on an individual basis, or some combination of these, and each type has its own set of challenges. this “ things” guide introduces topics relevant to making oceanographic data fair: findable, accessible, interoperable, and reusable.
audience:
• library staff and programmers who provide research support
• oceanographers
• oceanography data stewards
• researchers, scholars and students in oceanography
goal: the goal of this lesson is to introduce oceanographers to fair data practices in their research workflow through guided activities.
things thing : data repositories there are numerous data repositories for finding oceanographic data. many of these are from official “data centers” and generally have well-organized and well-documented datasets available for free and public use. • nsf / earth cube • clivar - cchdo • clivar - hawaii adcp • clivar - jodc adcp data • noaa - nodc • noaa - ncdc • noaa - ngdc • nsidc http://www.nsf.gov/geo/earthcube/ http://cchdo.ucsd.edu/ http://ilikai.soest.hawaii.edu/sadcp/clivar.html http://www.jodc.go.jp/goin/adcp.html http://www.nodc.noaa.gov/ http://www.ncdc.noaa.gov/oa/ncdc.html http://www.ngdc.noaa.gov/ http://nsidc.org/ • cdiac • bcodmo • geotraces • r r • samos • argo data • nasa - po.daac • world ocean database (wod) • spray underwater glider at some point, you may want or need to deposit your own data into a data repository, so that others may find and build upon your work. many funding agencies now require data collected or created with the grant funds to be shared with the broader community. for instance, the national science foundation (nsf) division of ocean sciences (oce) mandates sharing of data as well as metadata files and any derived data products. finding the “right” repository for your data can be overwhelming, but there are resources available to help pick the best location for your data. for instance, oce has a list of approved repositories in which to submit final data products. activity : • go to re data.org and search for a data repository related to your research subject area. how many results did you get? which of these repositories looks most relevant to your research area? is it easy to find a dataset in those repositories that covered the california coast (or any other region of your choice) during the last year? activity : • what is the next journal you would like to publish in? (alternatively: what is a top journal in your field?) can you find the data submission requirements for this journal? 
thing : metadata high quality metadata (information about the data, such as creator, keywords, units, flags, etc.) significantly improves data discovery. while metadata is most often for the data itself, http://cdiac.ornl.gov/ http://bcodmo.org/ http://www.geotraces.org/ http://www.rvdata.us/ http://samos.coaps.fsu.edu/html/ http://www.argo.ucsd.edu/argo_data_and.html https://podaac.jpl.nasa.gov/ https://www.nodc.noaa.gov/oc /wod/pr_wod.html https://spraydata.ucsd.edu/ https://www.nsf.gov/pubs/ /nsf /nsf .jsp https://www.nsf.gov/pubs/ /nsf /nsf .jsp https://www.nsf.gov/geo/oce/oce-data-sample-repository-list.jsp https://www.re data.org/ metadata can also include information about machines/instruments used, such as make, model, and manufacturer, as well as process metadata, which would include details about any cleaning/analysis steps and scripts used to create data products. using controlled vocabularies in metadata allows for serendipitous discovery in user searches. additionally, using a metadata schema to mark up a dataset can make your data findable to the world. activity : • using schema.org markup, look at the metadata elements pertaining to scholarly articles: https://schema.org/scholarlyarticle. imagine you have an article you have hosted on your personal website, and you would like to add markup so that it could be more readily indexed by google dataset search. what metadata elements would be most important to include? (this resource will help you: https://developers.google.com/search/docs/data-types/dataset) activity : • openrefine example for making data fair. read this walkthrough of how to “fairify” a dataset using the data cleaning tool openrefine: https://docs.google.com/document/d/ hq kbnmpqq - hqnva ar v esk brlg nvnnzjuapq/edit#heading=h.v puannmxh u discussion: • if you had thousands of keywords in a dataset you wanted to associate with a controlled vocabulary relevant to your field, what would be your approach? 
what tools do you think would best automate this task?

thing : permanent identifiers
permanent identifiers (pids) are a necessary step for keeping track of data. web links can break, or “rot”, and tracking down data based on a general description can be extremely challenging. a permanent identifier like a digital object identifier (doi) is a unique id assigned to a dataset to ensure that properly managed data does not get lost or misidentified. additionally, a doi makes it easier to cite and track the impact of datasets, much like cited journal articles. identifiers exist for researchers as well: orcid is essentially a doi for an individual researcher. this ensures that if you have a common name, change your name, change your affiliation, or otherwise change your author information, you still get credit for your own work and maintain a full, identifiable list of your scientific contributions.
https://schema.org/ https://toolbox.google.com/datasetsearch https://developers.google.com/search/docs/data-types/dataset https://docs.google.com/document/d/ hq kbnmpqq -hqnva ar v esk brlg nvnnzjuapq/edit#heading=h.v puannmxh u https://www.doi.org/ https://orcid.org/
activity : go to re data.org and search for a data repository related to your research subject area. from the repository you choose, pick a dataset. does it have a doi? what is it? who is the creator of that dataset? what is the orcid of the author?
activity : you’ve been given this doi: . /j n kq
• what would you do to find the dataset this doi references?
• using the approach you just identified, what is associated with this doi? who was the creator of this dataset? when was it published? who funded that research?
activity :
• go to the orcid website and create an orcid if you do not have one already. can you identify the creator associated with the doi from activity ?
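for the doi activities above, a common first step is to turn a doi into a web address that a resolver can follow. the sketch below is only an illustration: it assumes the public resolver at https://doi.org/, the function name `doi_to_url` is hypothetical, and the doi `10.1234/example-dataset` is made up for the example and does not refer to a real dataset.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # assumed public DOI resolver

# Prefixes people commonly paste in front of a DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI given bare or with a common prefix."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # A DOI has the shape "10.<registrant>/<suffix>".
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode anything unusual in the suffix, but keep the slash.
    return DOI_RESOLVER + quote(cleaned, safe="/")


# Made-up DOI, for illustration only.
print(doi_to_url("doi:10.1234/example-dataset"))
```

opening the resulting address in a browser should lead to the landing page of the dataset, where creator, publication date, and funder are usually listed.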
discussion:
• what would be a benefit of having a personal persistent id such as orcid? are there any drawbacks or concerns?

thing : citations
citing data properly is as important as citing journal articles and other papers. in general, a data citation should include: author/creator, date of publication, title of dataset, publisher/organization (for instance, noaa), and unique identifier (preferably doi).
activity :
• read through this overview of citing data from dataone. it has information applicable to any data citation, as well as guidelines specific to dataone.
• think of the last dataset you worked with. is it something you collected, or was it from a public repository? how would you cite this data?
• websites/data repositories will often provide the text of a preferred citation, but you may have to search for it. how would you cite the world ocean database? how would you cite data from the multibeam bathymetry database?
https://orcid.org/ https://www.dataone.org/citing-dataone https://www.nodc.noaa.gov/oc /wod/pr_wod.html
discussion: long-term data stewardship is an important factor in keeping data open and accessible.
• after completing the last activity, discuss: how open is data in your discipline? are there long-term considerations and protocols for the data that is produced?
tip: resources that can help make your data more open and accessible, or help protect your data:
• open science framework
• figshare
• oceanographic data centers

thing : data formats
oceanographic data can include everything from maps and images to high dimensional numeric data. some data are saved in common, near-universal formats (such as csv files), while others require specialized knowledge and software to open properly (e.g., netcdf).
explore the intrinsic characteristics of the dataset that influence the choice of format, such as a time series versus a regular -d grid of temperature varying in time; robust ways to connect the data with metadata; size factors; and binary versus ascii files. also think about why a format used to store/archive data is not necessarily the best way to distribute data.
discussion :
• what are the most common data formats used in your field? what level of technical/domain knowledge is required to open, edit, and interact with these data types?
discussion :
• what are the advantages and disadvantages of storing data in plain ascii, like a csv file, versus a binary format, like netcdf? do the characteristics of the data influence that decision, i.e. would the preferred format for a time series be different from that for a numerical model output or a gene sequence?

thing : data organization and management
good data organization is the foundation of your research project. data often has a longer lifespan than the project it is originally associated with and may be reused for follow-up projects or by other researchers. data is critical to solving research questions, but a lot of data is lost or poorly managed. poor data organization can directly impact the project or future reuse.
https://osf.io/ https://figshare.com/
activity : considerations for basic data organization and management
group discussion :
• is your data file structure something that a new lab member could easily learn, or are datasets organized in a more haphazard fashion?
• do you have any documentation describing how to navigate your data structures?
group discussion :
• talk about where/how you are currently storing the data you are working with. would another lab member be able to access all your data if needed?
activity : identifying vulnerabilities
• scenario : your entire office/lab building burns down overnight.
no one is harmed, because no one was there, but all electronics in the building perish beyond hope of repair. the next morning, can you access any of your data?
• scenario : the cloud server you use (everything from google drive to github) crashes. can you still access your most up to date data?
discussion :
• in either of the two scenarios, can your data survive a disaster? what do you think you are doing incorrectly that could lead to data loss?
discussion :
• think about a time when you had or potentially had a data disaster - how could the disaster have been avoided? what, if anything, have you changed about your data storage and workflow as a result?

the data management plan (dmp)
some research institutions and research funders now require a data management plan (dmp) for new research projects. let's talk about the importance of a dmp and what a dmp should cover. think about it: would you be able to create a dmp?
what is a dmp? a data management plan (dmp) documents how data will be managed, stored and shared during and after a research project. some research funders are now requesting that researchers submit a dmp as part of their project proposal.
activity :
• start by watching the dmptool: a brief overview second video to see what the dmptool can do for researchers and data managers.
• next, review this short introduction to data management plans.
• now browse through some public dmps from the dmptool, choose one or two of the dmps related to oceanography and read them to see the type of information they capture.
activity : there are many data management plan (dmp) templates in the dmptool.
• choose one dmp funder template in the dmptool that you would potentially use for a grant proposal. spend - minutes starting to complete the template, based on a research project you have been involved with in the past.
discussion:
• you will have noticed that dmps can be very short, or extremely long and complex.
what do you think are the two or three pieces of information essential to include in every dmp and why?
• after completing the second activity, what are the strengths and weaknesses of your chosen template?

thing : re-usable data
there are two aspects to reusability: reusable data, and reusable derived data/process products.
reusable data
reusable data is the result of successful implementation of the other “things” discussed so far. reusable data ( ) has a license which specifies reuse scenarios, ( ) is in a domain-suitable format and an “open” format when possible, and ( ) is associated with extensive metadata consistent with community and domain standards.
https://youtu.be/xt by-p juw https://www.ands.org.au/working-with-data/data-management/data-management-plans https://dmptool.org/public_plans
process/derived data products
what is often overlooked in terms of reusability are the products created to automate research steps. whether it’s using the command line, python, r, or some other programming platform, automation scripts in and of themselves are a useful output that can be reused. for example, data cleaning scripts can be reapplied to datasets that are continually updated, rather than starting from scratch each time. modeling scripts can be re-used and adapted as parameters are updated. additionally, these research automation products make any data-related decision you made explicit: if future data users have questions about exclusions, aggregations, or derivations, the methodology used is transparent in these products.
discussion :
• how many people have made public or shared part of their research automation pipeline? if you haven’t shared this, what prevented you from sharing?
discussion :
• are there instances where your own research would have been improved if you had access to other people’s process products?
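the point about data cleaning scripts making exclusion decisions explicit can be sketched in a few lines of python. everything specific here is an assumption made for illustration: the column names (`time`, `temperature`), the missing-value flag `-999`, the valid range, and the sample rows do not come from any real dataset.

```python
import csv
import io

MISSING_FLAG = "-999"        # assumed placeholder for missing values
VALID_RANGE = (-2.0, 40.0)   # assumed plausible sea temperature range, deg C


def clean_rows(lines):
    """Yield (time, temperature) pairs, skipping missing or implausible values."""
    for row in csv.DictReader(lines):
        raw = row["temperature"].strip()
        if raw == "" or raw == MISSING_FLAG:
            continue  # exclusion rule 1: value is missing
        value = float(raw)
        if not VALID_RANGE[0] <= value <= VALID_RANGE[1]:
            continue  # exclusion rule 2: value is outside the valid range
        yield row["time"], value


# Made-up sample standing in for a file that is updated over time.
sample = io.StringIO(
    "time,temperature\n"
    "2018-01-01,14.2\n"
    "2018-01-02,-999\n"
    "2018-01-03,75.0\n"
    "2018-01-04,13.9\n"
)
cleaned = list(clean_rows(sample))
print(cleaned)
```

because the rules live in the script rather than in manual edits, the same function can be rerun whenever the source file is updated, and a later reader can see exactly which values were dropped and why.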
thing : tools of the trade
when working with your data, there is a selection of proprietary and open source tools available to conduct your research analysis.
why open source tools?
open source tools are software tools whose source code is openly available and published for use and/or modification by anyone free of charge. there are many advantages to using open source tools:
• low software costs
• low hardware costs
• wide community development support
• interoperable with other open source software
• no vendor control
• open source licensing
caution: be selective with the tools you use
there are additional benefits you may hear about when using open source tools:
• higher quality software
• greater security
• frequent updates
keep in mind that in an ideal world these three benefits are what we all wish for, but not every open source tool delivers them. when selecting an open source tool, choose a package with a large community of users and developers that has proven long-term support.
things to consider when using open source tools
benefits:
• open source tools often have an active development community. quality for end users is usually higher because the community members are users of the software being developed. in turn, development costs for open source are lower.
• with a larger development community, security problems and vulnerabilities are discovered and fixed quickly. another major advantage of open source is the possibility to verify exactly which procedures are being applied, avoiding the use of "black-boxes" and allowing for a thorough inspection of the methods.
issues:
• open source tools are only as good as the community that supports them. unlike commercial software, there is no official technical support. additionally, not all open source licenses are permissive.
• training time can be significant.
if open source tools are not an option and commercial software is necessary for your project, there are benefits and issues to consider when using proprietary or commercial software tools.
benefits:
• this type of software often comes with official technical support such as a customer service phone number or email.
issues:
• proprietary or commercial tools are often quite expensive at the individual level.
• universities may have campus-wide licenses, but if you move institutions, you may find yourself without the software you had been using.
discussion:
• think about the tools you use for conducting data clean up, analysis, and for creating visualizations and reports for publications. what were the deciding factors for selecting the applications you used or are using for your project?

thing : reproducibility
can you or others reproduce your work? reproducibility increases impact, credibility, and reuse. read through the following best practices to make your work reproducible.
best practices: making your project reproducible from the start of the project is ideal.
• documenting each step of your research - from collecting or accessing data, to data wrangling and cleaning, to analysis - is the equivalent of creating a roadmap that other researchers can follow. being explicit about your decisions to exclude certain values or adjust certain model parameters, and including your rationale for each step, help eliminate the guesswork in trying to reproduce your results.
• consider open source tools. this allows anyone to reproduce the research more easily, and avoids the need to work out who has the right license for the software used.
reproducibility is useful not only for anyone else who wants to test your analysis - often the primary beneficiary is you! a research project often takes months, if not years, to complete, so by starting with reproducibility in mind from the beginning, you can often save yourself time and energy later on.
discussion: think about a project you have completed or are currently working on.
• what are some of the best practices you have adopted to make your research reproducible for others?
• were there any pain points that you encountered or are dealing with now?
• is there something you can do about it now?
• what are the most relevant "things" previously mentioned in this document that you could use to make your research more reproducible?

thing : apis and applications (apps)
apis (application programming interfaces) allow programmatic access to many databases and tools. they can directly access or query existing data, without the need to download entire datasets, which can be very large. certain software platforms, such as r and python, often have packages available to facilitate access to large, frequently used database apis. for instance, the r package “rnoaa” can access and import various noaa data sources directly from the r console. you can think of it as using an api from the comfort of a tool you’re already familiar with. this not only saves time and computer memory, but also ensures that as databases are updated, so are your results: re-running your code automatically pulls in new data (unless you have specified a more restricted date range).
activity: on the erddap server for spray underwater glider data, select temperature data for the line (https://spraydata.ucsd.edu/erddap/tabledap/binnedcugn .html).
• restrict it to measurements at m or shallower.
• choose the format of your preference and, instead of submitting the request, generate a url.
• copy and paste the generated url into your browser.
discussion:
• think about the last online data source you accessed. is there an api for this data source? is there a way to access this data from within your preferred analysis software?
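the url generated in the erddap activity above can also be assembled in code, which is what makes the request repeatable. the sketch below follows the general tabledap pattern (dataset id, file type, a comma-separated variable list, then `&`-separated constraints). the dataset id `exampleDataset`, the variable names, and the depth limit of 100 are placeholders assumed for illustration, and the helper name `tabledap_url` is hypothetical; check the dataset page on the server for the real names.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, file_type, variables, constraints=()):
    """Build an ERDDAP tabledap request URL.

    constraints is an iterable of (variable, operator, value) tuples,
    for example ("depth", "<=", 100).
    """
    query = ",".join(variables)
    for name, operator, value in constraints:
        # Percent-encode "<" and ">" in the operator and anything unusual
        # in the value; "=" is left readable.
        query += "&" + name + quote(operator, safe="=") + quote(str(value), safe="")
    return f"{server.rstrip('/')}/tabledap/{dataset_id}.{file_type}?{query}"


url = tabledap_url(
    "https://spraydata.ucsd.edu/erddap",
    "exampleDataset",                  # hypothetical dataset id
    "csv",
    ["time", "depth", "temperature"],  # assumed variable names
    [("depth", "<=", 100)],            # assumed depth limit
)
print(url)
```

the resulting string can be pasted into a browser or passed to whatever download or table-reading function your analysis software provides, so re-running the script repeats exactly the same query.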
https://ropensci.org/tutorials/rnoaa_tutorial/ https://spraydata.ucsd.edu/erddap/tabledap/binnedcugn .html https://spraydata.ucsd.edu/erddap/tabledap/binnedcugn .html top fair data & software things: research software sprinters anna-lena lamprecht, carlos martinez ortiz, chris erdmann, leyla garcia, mateusz kuzak, paula andrea martinez description: the fair data principles are widely known and applied today. what the fair principles mean for (scientific) software is an ongoing discussion. however, there are some things on which there is already agreement that they will make software (more) fair. in this document, we go for some ‘low hanging fruit’ and describe easy fair software things that you can do. to limit the scope, “software” here refers to scripts and packages in languages like r and python, but not to other kinds of software frequently used in research, such as web-services, web platforms like myexperiment.org or big clinical software suites like openclinica. a poster summarizing these fair software things is also available. audience: • researchers who develop software • research software engineers goals: translate fair principles to applicable actions for scientific software what is fair for software in the context of this document, we use the following simple definition of fair for software: findable software with sufficiently rich metadata and unique persistent identifier accessible software metadata is in machine and human readable format. software and metadata is deposited in trusted community approved repository. 
https://www.uu.nl/staff/allamprecht https://github.com/c-martinez https://github.com/libcce https://github.com/ljgarcia https://github.com/mkuzak https://github.com/orchid / https://www.go-fair.org/fair-principles/ file://///top- -fair/files/poster_ things_fairsoftware.pdf https://researchsoftware.org/
interoperable software uses community accepted standards and platforms, making it possible for users to run the software.
reusable software has a clear licence and documentation.

things
findability
thing : create a description of your software
the name alone does not tell people much about your software. in order for other people to find out if they can use it for their purpose, they need to know what it does. a good description of your software will also help other people to find it.
activity: think of the minimum set of information (metadata) that will help others find your software. this can include a short descriptive text and meaningful keywords. codemeta is a set of keywords used to describe software and a way to structure them in a machine-readable way. for examples of codemeta used in software packages see:
• https://github.com/nlesc/boatswain/blob/master/codemeta.json
• https://github.com/datacite/maremma
edam is an example of an ontology that provides terminology that can be used to describe bioinformatics software. take the oss lesson episode about metadata and registries and walk through the exercise. see also this example: http://r-pkgs.had.co.nz/description.html#description

thing : register your software in a software registry
people search for research software using search engines like google. registering your software in a dedicated registry will make it findable by search engines, because the registries take care of search engine optimization etc. the registries will usually ask you to provide descriptions (metadata) as above.
activity: think of the registries most used in your domain. do you know about any?
how and where do you usually find software? what kind of keywords do you use when searching?
https://codemeta.github.io/terms/ https://github.com/nlesc/boatswain/blob/master/codemeta.json https://github.com/datacite/maremma http://edamontology.org/page https://softdev research.github.io/ oss-lesson/ -use-registry/index.html http://r-pkgs.had.co.nz/description.html#description
here are some examples of research software registries:
* bio.tools
* research software directory (check if your institution hosts one)
* ropensci project
* zenodo
see also: oss lesson episode about metadata and registries

thing : get and use a unique and persistent identifier for your software
a unique and persistent identifier will help others find and access a particular version of your software. unique means that the identifier points to one and only one version and location of your software. persistent means that it will keep pointing to the same version and location for a long, specified amount of time. for example, zenodo provides you with a doi (digital object identifier) that will be resolvable for at least the next years. recent initiatives, such as software heritage, propose associating software with a permalink based on an intrinsic sha identifier (see the example with the id swh: :dir: bc c a a ede f a adcde a / and the permalink https://archive.softwareheritage.org/swh: :dir: bc c a a ede f a adcde a /).
activity: if you have registered your software in a registry, chances are good that the registry provides a unique and persistent identifier. if not, obtain an identifier from another organization. if you have multiple identifiers, choose one to use as your main identifier. make sure you use it consistently when referring to your software, e.g. on your own website, in your code repository or in publications.
see also: making your code citable with zenodo

accessibility
thing : make sure that people can download your software
in order for anyone to use your software, they need to be able to download an executable version along with documentation.
for interpreted languages like python and r, the code is also the executable version. for compiled languages like java and c, the executable version is a binary file, and the code might not be accessible. downloading the software and documentation is possible, for instance, from a project website, a git repository or from a software registry. activity: using the identifier as your starting point, ask a colleague to try to get your software (binary/script). can he/she download it? does he/she also have access to the documentation? is there anything preventing him/her from getting to it? is it hosted on a reliable platform (long term persistent, such as zenodo, pypi, cran)? https://bio.tools/ https://github.com/research-software-directory/research-software-directory https://ropensci.github.io/ https://ropensci.github.io/ https://zenodo.org/ https://softdev research.github.io/ oss-lesson/ -use-registry/index.html https://archive.softwareheritage.org/swh: :dir: bc c a a ede f a adcde a / https://archive.softwareheritage.org/swh: :dir: bc c a a ede f a adcde a / https://guides.github.com/activities/citable-code/ interoperability thing : explain the functionality of your software your software performs one or more operations that take an input and transform it into the output. to help people use your software, provide a clear and concise description of the operations along with the corresponding input and output data types. for example, the wc (word count) command line tool takes a text as input, counts the number of words in it and gives the number of words as output. the clustalw tool takes a set of (gene or protein) sequences as input, aligns them and returns a multiple sequence alignment as output. activity: list all operations that your software provides, and describe them along with corresponding input and output data types. if possible, use terms from a domain ontology like edam. 
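the wc example above shows what a description of an operation with its input and output data types can look like. a minimal python sketch of the same idea follows; the function name `count_words` and the docstring layout are illustrative assumptions, and this is not the implementation of the actual wc tool.

```python
def count_words(text: str) -> int:
    """Count the words in a text.

    operation: word count
    input:     text (plain text, str)
    output:    number of whitespace-separated words (int)
    """
    return len(text.split())


print(count_words("oceanographic data can be fair"))
```

the docstring states the operation, input, and output in one place, so the description travels with the code and can be mapped to ontology terms later.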
thing : use standard (community agreed) formats for inputs and outputs
in order for people to use your software, they need to know how to feed data to it -- standard formats are easy ways to exchange data between different pieces of software. by sticking to standards, it is possible to use the output from another piece of software as an input to your software (or the other way around). for example, fasta is a format for representing molecular sequences (dna, rna, protein, …) that most sequence analysis tools can handle. netcdf is a standard file format used for sharing array-oriented scientific data.
activity: what are the relevant standards in your field? which groups/organizations are responsible for standards in your field? is there a place where you can find the relevant standards and a detailed description? what other tools use these standards? if possible, use such standard formats as input/output of your software and state which ones you are using. (avoid defining your own standards! http://imgs.xkcd.com/comics/standards.png)

reusability
thing : document your software
your software should include sufficient documentation: instructions on how to install, run and use your software. all dependencies of your software should be clearly stated. provide sufficient examples of how to execute the different operations your software offers, ideally along with example data. the write the docs page explains what good documentation is and gives examples.
http://man .org/linux/man-pages/man /wc. .html https://www.genome.jp/tools-bin/clustalw http://imgs.xkcd.com/comics/standards.png https://www.writethedocs.org/guide/writing/beginners-guide-to-docs/
activity: ask a colleague to look at your software’s documentation. is he/she able to install your software? can he/she run it? can he/she produce the expected results?
thing : give your software a license
a license tells your (potential) users what they are allowed to do with your software (and what not to do), and can protect your intellectual property. without a license people may spend time trying to figure out if they are allowed to use your software -- make things easy for them. therefore, it is important that you choose a software license that meets your intentions. the choose a license website provides a simple guide for picking the right license for your software.
activity:
* follow the oss lesson to learn more about licenses and their implications.
* read the oss paper
thing : state how to cite your software
you want to get credit for your work. by providing a citation guideline you will help users of your software to cite your work properly. there is no single right way to do it. the software sustainability institute website provides more information and discussion on this topic in the blog post how to cite and describe software.
activity: read the “software citation principles” paper. read the documentation of the citation file format and create a cff file for your software.
thing : follow best practices for software development
reusability benefits from good software quality. there are a number of actions you can take to improve the quality of your software: make your code modular, have code level documentation, provide tests, follow code standards, use version control, etc. there are several guidelines which you can use to guide you in the process, such as the escience center guide, the best practices and the good enough practices.
activity: familiarize yourself with the guides provided above. have a look at your software and create a list of actions which you could follow to improve the quality of your software. ideally, follow these practices from the very beginning.
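returning to the citation file mentioned under the citation thing above, the sketch below shows what a minimal cff file can look like by generating one from python. the keys used (cff-version, message, title, authors, version, date-released) come from the citation file format; the helper name make_cff and all values (title, author name, version, date) are placeholders invented for this example, and the quoting is deliberately naive.

```python
def make_cff(title, family_names, given_names, version, date_released):
    """Return the text of a minimal CITATION.cff file.

    Naive sketch: values are wrapped in double quotes without escaping,
    so a title that itself contains double quotes would need extra care.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
        f'  - family-names: "{family_names}"',
        f'    given-names: "{given_names}"',
        f'version: "{version}"',
        f"date-released: {date_released}",
    ]
    return "\n".join(lines) + "\n"


# placeholder values, purely illustrative
cff_text = make_cff("my-research-tool", "Doe", "Jane", "0.1.0", "2020-01-01")
print(cff_text)
```

the resulting text would normally be saved as a file named CITATION.cff in the root of the software repository; the cff initializer linked from the citation file format website offers a form-based alternative to writing it by hand.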
https://choosealicense.com/ https://softdev research.github.io/ oss-lesson/ -use-license/index.html https://f research.com/articles/ - /v https://www.software.ac.uk/how-cite-software https://www.force .org/software-citation-principles https://citation-file-format.github.io/ https://citation-file-format.github.io/cff-initializer-javascript/ https://guide.esciencecenter.nl/ https://journals.plos.org/plosbiology/article?id= . /journal.pbio. https://journals.plos.org/ploscompbiol/article?id= . /journal.pcbi.
top fair data & software things: research libraries
sprinters: liz stokes, natasha simons, tom honeyman (australian research data commons), chris erdmann (library carpentry/the carpentries/california digital library), sharyn wise (university of technology, sydney), josh quan, scott peterson, amy neeser (uc berkeley)
description: to translate fair principles into useable concepts for research-facing support staff (e.g. librarians).
audience:
• library staff who provide research support
• those who want to know more about fair and how it could be applied to libraries
goals:
• translating fair speak to library speak (what is it? why do i need to know? what do i tell researchers?)
• identifying ways to improve the ‘fairness’ of your library
• understanding that fair data helps us be better stewards of our own resources
things
thing : why should librarians care about fair?
there’s a lot of hype about the fair data principles. but why should librarians care? for starters, libraries have a strong tradition in describing resources, providing access and building collections, and providing support for the long-term stewardship of digital resources. building on their specific knowledge and expertise, librarians should feel confident with making research data fair. so how can you and your library get started with the fair principles?
activity: .
read liber’s implementing fair principles: the role of libraries at https://libereurope.eu/wp-content/uploads/ / /liber-fair-data.pdf ( minute read)
https://librarycarpentry.org/top- -fair https://twitter.com/ragamouf https://twitter.com/n_simons https://www.linkedin.com/in/tom-honeyman- / https://twitter.com/libcce https://orcid.org/ - - - x https://github.com/wrathofquan https://github.com/scottcpeterson https://twitter.com/pseudoamyloid
consider:
* where is your library at in regard to the section on ‘getting started with fair’?
* where are you at in your own understanding of the fair data principles?
thing : how fair are your data?
the fair principles are easily understood in theory but more challenging when applied in practice. in this exercise, you will be using the australian research data commons (ardc) data self-assessment tool to assess the 'fairness' of one of your library’s datasets.
activity:
. select a metadata record from your library’s collection (e.g. your institutional repository) that describes a published dataset.
. open the ardc fair data assessment tool and run your chosen dataset against the tool to assess its ‘fairness’.
consider:
* how fair was your chosen dataset?
* how easy was it to apply the fair criteria to your dataset?
* what things need to happen in order to improve the ‘fairness’ of your chosen dataset?
want more? try your hand at other tools like the csiro star data rating tool and the dans fair data assessment tool.
thing : do you teach fair to your researchers?
how fair aware are your researchers? does your library incorporate fair into researcher training?
activity: go to existing data management/data sharing training you provide to graduates, higher degree researchers (hdrs) or other researchers. for example, review the duke graduate school’s responsible conduct of research topics page.
review how well the fair principles are covered in this training and adjust accordingly.
thing : is fair built into library practice and policy?
your library may do a great job advocating the fair data principles to researchers but how well have the principles been incorporated into library practice and policy?
activity:
. review your library or institutional policies regarding research data management and digital preservation with the fair principles in mind. consider that in most cases library policy will have been written before the advent of fair. are revisions required?
. review the data repository managed by your library. how well does it support fair data?
. review your library’s data management planning tool. does it have features that support the fair data principles or are changes required?
https://www.ands-nectar-rds.org.au/fair-tool https://doi.org/ . / / a f b https://www.surveymonkey.com/r/fairdat https://gradschool.duke.edu/professional-development/programs/responsible-conduct-research/rcr-topics https://www.force .org/group/fairgroup/fairprinciples
thing : are your library staff trained in fair?
activity:
* conduct a skills and knowledge audit regarding fair with your library team.
* based on the audit, identify gaps in fair skills and knowledge.
* design a training program that can fill the identified gaps. to help build your program, read the blog post, a carpentries based approach to teaching fair data and software principles.
consider: reusing the wide range of openly available training materials on the fair data principles, e.g. you could start here.
thing : are digital libraries fair?
while the fair principles are designed for data, considering their potential application in a broader context is useful. for example, think about what criteria might be applied to assess the ‘fairness’ of digital libraries. considerations might include:
* persistent identifiers
* open access vs. paid access
* provenance information / metadata
* author credibility
* versioning information
* license / reuse information
* usage statistics (number of times downloaded)
activity:
. select one of these digital libraries (or another of your choice):
* british library
* national digital library of india
* europeana
* national library of australia’s trove
. search/browse the catalogue of items.
consider:
* does the library display reuse permissions/licenses on how to use the item?
* is there provenance information?
* are persistent identifiers used?
thing : does your library support fair metadata?
a number of fair principles make reference to “metadata”. what is metadata, how is it relevant to fair and does your library support the kind of metadata specified in the fair data principles?
https://www.ands.org.au/working-with-data/fairdata/training https://uc .cdlib.org/ / / /a-carpentries-based-approach-to-teaching-fair-data-and-software-principles/ https://www.bl.uk/ https://ndl.iitkgp.ac.in/ https://www.europeana.eu/ https://trove.nla.gov.au/
activity:
. watch this video in which the metadata librarian explains metadata ( mins)
. select three metadata records at random for datasets held in your library or repository collection.
. open the checklist produced for use at the eudat summer school and see if you can check off those that reference metadata against the records you selected.
.
make a list of what metadata elements could be improved in your library records to enable better support for fair.
thing : does your library support fair identifiers?
the fair data principles call for open, standardised protocols for accessing data via a persistent identifier. persistent identifiers are crucial for the findability and identification of research and researchers, and for tracking impact metrics. so how well does your library support persistent identifiers?
activity: find out how well your library supports orcids and dois:
* do your library systems support the identification of researchers via an orcid? do you authenticate against the orcid registry? do you have an orcid?
* do your library systems, such as your institutional repository, support the issuing of digital object identifiers (dois) for research data and related materials?
consider:
* what other types of persistent identifiers do you think your library should support? why or why not?
want more? if your library supports the minting of dois for research data and related materials, is there more that you could do in this regard? check out a data citation roadmap for scholarly repositories and determine how much of the roadmap you can check off your list and how much is yet to do.
thing : does your library support fair protocols?
for (meta)data to be accessible it should ideally be available via a standard protocol. think of protocols in terms of borrowing a book: there are a number of expectations that the library lays out in order to proceed. you have to identify yourself using a library card, you have to bring the book to the checkout desk, and in return you walk out of the library with a demagnetised book and a receipt reminding you when you have to return the book. accessing the books in the library means that you must learn and abide by the rules for accessing books.
https://www.youtube.com/watch?v=abf fvspvye http://doi.org/ . /zenodo. https://orcid.org/ http://www.doi.org/ https://doi.org/ . /
activity:
* familiarise yourself with apis by completing thing of the ands (research data) things
* consider the apis your library provides to enable access to (meta)data for data and related materials. are they up to scratch or are improvements required?
thing : next steps for your library in supporting fair
in thing you read liber’s implementing fair principles: the role of libraries. you considered what your library needed to do in order to better support fair data. in thing we will create a list of outstanding action items.
activity:
. write a list of what your library is currently doing to support and promote the fair data principles.
. now compare this to the list in the liber document. where are the gaps and what can you do to fill these?
. create an action plan to improve fair support at your library!
consider:
* incorporate all that you learnt and the progress that you made in “doing” this top fair things!
https://www.ands.org.au/working-with-data/skills/ -research-data-things/all /thing-
top fair data & software things: research data management support
sprinters: lena karvovskaya, otto lange, iza witkowska, jacques flores (research data management (rdm) support at utrecht university)
description: this is an umbrella-like document with links to various resources. the aim of the document is to help researchers who want to share their data in a sustainable way. however, we consider the border between librarians and researchers to be a blurred one. this is because, ultimately, librarians support researchers who would like to share their data. we primarily wish to target researchers and support staff regardless of their experience: those who have limited technical knowledge and want to achieve a very general understanding of the fair principles and those who are more advanced technically and want to make use of more technical resources. the resources we will link to for each of the fair things will often be on two levels of technicality.
audience: our primary audience consists of researchers and support staff at utrecht university. therefore, whenever possible we will use the resources available at utrecht university: the institutional repositories and resources provided at the rdm support website.
things
thing : why bother with fair?
background: the advancement of science thrives on the timely sharing and accessibility of research data. timely and sustainable sharing is only possible if there are infrastructures and services that enable it.
. read up on the role of libraries in implementing the fair data principles. think about the advantages and opportunities made possible by digitalization in your research area. think about the challenges. have you or your colleagues ever experienced data loss? is the falsification/fabrication of data an issue with digital data? how easy is it to figure out if the data you found online is reliable? say you found a very useful resource available online and you want to refer to it in your work; can you be sure that it will still be there several years later?
. for more information, you can refer to this detailed explanation of fair principles developed by the dutch techcentre for life sciences (dtls).
https://librarycarpentry.org/top- -fair https://www.uu.nl/medewerkers/ekarvovskaya https://www.uu.nl/staff/oalange https://www.uu.nl/staff/imwitkowska https://www.uu.nl/staff/jpflores https://www.uu.nl/en/research/research-data-management https://libereurope.eu/wp-content/uploads/ / /liber-fair-data.pdf
thing : metadata
background: metadata are information about data. this information allows data to be findable and potentially discoverable by machines. metadata can describe the researchers responsible for the data, when, where and why the data was collected, how the research data should be cited, etc.
. if you find the discussion on metadata too abstract, think about a traditional library catalogue record as a form of metadata.
a library catalogue card holds information about a particular book in a library, such as author, title, subject, etc. library cataloging, as a form of metadata, helps people find books within the library. it provides information about books that can be used in various contexts. now, reflect on the differences in functionality between a paper catalogue card and a digital metadata file.
. reflect on your own research data. if someone who is unfamiliar with your research wants to find, evaluate, understand and reuse your data, what would he/she need?
. watch this video about structural and descriptive metadata and reflect on the example provided in the video. if the video piqued your interest in metadata, watch a similar video on the ins and outs of metadata and data documentation by utrecht university.
thing : the definition of fair metrics
background: fair stands for findable, accessible, interoperable and re-usable.
https://www.go-fair.org/fair-principles/ https://www.youtube.com/watch?v=l vog ncwe&feature=youtu.be https://www.youtube.com/watch?v=h oz swbtj &
. take a look at the image above, provided by the australian research data commons (ardc). reflect on the images chosen for various aspects of the fair acronym. if we consider this video, already mentioned in thing , how would you describe the photography example in terms of fair?
. go to datacite and choose data center "utrecht university". select one of the published datasets and evaluate it with respect to fair metrics. in evaluating the dataset, you can make use of the fair data self-assessment tool created by ardc. which difficulties do you experience while trying to do the evaluation?
thing : searchable resources and repositories
background: to make objects findable we have to commit ourselves to at least two major points: ) these objects have to be identifiable at a fixed place, and ) this place should be fairly visible.
when it comes to finding data this is where the role of repositories comes in.
. utrecht university has its own repository yoda, short for "your data". it is possible to publish a dataset in this repository so that it becomes accessible online. try to search for one of the datasets listed on yoda in google dataset search. take "chronicalitaly" as an example. was it difficult to find the dataset? now try to search for one of the databases stored at the meertens institute using google dataset search. why are the results so different?
https://www.youtube.com/watch?v=l vog ncwe&feature=youtu.be https://search.datacite.org/ https://www.ands-nectar-rds.org.au/fair-tool https://yoda.sites.uu.nl/ https://public.yoda.uu.nl/i-lab/uu /t ymow.html https://www.meertens.knaw.nl/cms/en/collections/databases
. take a look at the storage solutions suggested by utrecht rdm support. identify searchable repositories among these solutions.
thing : persistent identifiers
background: a persistent identifier is a permanent and unique reference to an online digital object, independent of (a change in) the actual location. an identifier should have an unlimited lifetime, even if the existence of the identified entity ceases. this aspect of an identifier is called "persistency".
. read about the digital object identifier (doi) system for research data provided by the australian national data service (ands).
. watch the video "persistent identifiers and data citation explained" by research data netherlands. read about persistent identifiers on a very general level (awareness).
thing : documentation
. browse through the general overview of data documentation as provided by the consortium of european social science data archives. think of the principal differences between object-level documentation of quantitative and qualitative data.
thing : formats and standards
. take a look at data formats recommended by dans. which of these formats are relevant for your subject area and for your data?
do you use any of the non-preferred formats? why? . read the background information about file formats and data conversion provided by the consortium of european social science data archives. reflect on the difference between short-term and long-term oriented formats. think of a particular example of changing from a short-term processing format to a long-term preservation format, relevant for your field. thing : controlled vocabulary background: the use of shared terminologies strengthens communities and increases the exchange of knowledge. when the researchers refer to specific terms, they rely on common understanding of these terms within the relevant community. controlled vocabularies are concerned with the commitment to the terms and management standards that people use. . browse controlling your language: a directory of metadata vocabularies from jisc in the uk. reflect on possible issues that may arise if there is no agreement on the use of a controlled vocabulary within a research group. https://www.uu.nl/en/research/research-data-management/tools-services/tools-for-storing-and-managing-data/storage-solutions https://www.ands.org.au/__data/assets/pdf_file/ / /digital-object-identifiers.pdf https://www.youtube.com/watch?v=pgqtiy oz k https://www.ands.org.au/guides/persistent-identifiers-awareness https://www.cessda.eu/training/training-resources/library/data-management-expert-guide/ .-organise-document/documentation-and-metadata https://dans.knaw.nl/en/deposit/information-about-depositing-data/before-depositing/file-formats https://www.cessda.eu/training/training-resources/library/data-management-expert-guide/ .-process/file-formats-and-data-conversion https://www.cessda.eu/ https://www.webarchive.org.uk/wayback/archive/ /http:/www.jiscdigitalmedia.ac.uk/guide/controlling-your-language-links-to-metadata-vocabularies . 
consider the following example from earth science research: "to be able to adequately act in the case of major natural disasters such as earthquakes or tsunamis, scientists need to have knowledge of the causes of complex processes that occur in the earth's crust. to gain necessary insights, data from different research fields are combined. this is only possible if researchers from different applicable sub-disciplines 'speak the same language'". choose a topic within your research interests that requires combining data from different sub-disciplines. think about some differences in vocabularies between these sub-disciplines.
thing : use a license
background: a license states what a user is allowed to do with your data and creates clarity and certainty for potential users.
. take a look at various creative commons licences. which licenses put the least restrictions on data? you can make use of the creative commons guide to figure this out.
. watch this video about creative commons licences.
thing : fair and privacy
background: the general data protection regulation (gdpr) and its implementation in the netherlands, called algemene verordening gegevensbescherming (avg), require parties handling data to provide clarity and transparency where personal data are concerned.
. take a look at the handling personal data guide from the utrecht university rdm website. reflect on how personal data can be fair.
https://www.uu.nl/en/research/research-data-management/tools-services/designing-metadata-schemes https://creativecommons.org/licenses/ http://creativecommons.org/choose/ https://www.youtube.com/watch?v=hywdenq fo https://gdpr-info.eu/ https://autoriteitpersoonsgegevens.nl/nl/onderwerpen/avg-nieuwe-europese-privacywetgeving/algemene-informatie-avg https://www.uu.nl/en/research/research-data-management/guides/handling-personal-data top fair data & software things: international relations sprinter: fiona bradley, unsw library, and university of western australia (phd candidate) description: international relations researchers increasingly make use of and create their own datasets in the course of research, or as part of broader research projects. the funding landscape in the discipline is mixed, with some receiving significant grants subject to open access and open data compliance while others are not funded for specific outputs. datasets have many sources, they may be derived from academic research, or increasingly, make use of large-n datasets produced by polling organisations such as yougov, gallup, third-party datasets produced by non-governmental organisations or ngos that undertake human rights monitoring, or official government data. there is a wide range of licensing arrangements in place, and many different places to store this data. what is fair data? fair data is findable, accessible, interoperable and reusable. for more information, take a look at the force definition. audience: international relations and human rights researchers goal: help researchers understand fair principles things thing : getting started is there a difference between open and fair data? 
find out more: https://www.go-fair.org/faq/ask-question-difference-fair-data-open-data/
https://librarycarpentry.org/top- -fair http://orcid.org/ - - - https://www.force .org/group/fairgroup/fairprinciples
activity: are there examples in your own research where you have used or created data that may be fair, but may not necessarily be open?
* does the material you used or created include personal information?
* does it include culturally sensitive materials?
* does it reveal information that endangers or reveals the location of human rights defenders, whistleblowers, or other people requiring protection?
* does it involve material subject to commercial agreements?
thing : discovering data
united nations (un) agencies, international organisations, governments, ngos, and researchers all produce and share data. some data are very easy to use - they are well-described, and a comprehensive code book may be supplied. other data may need significant clean up especially if definitions or country borders have changed over time, as they will in longitudinal datasets. a selection of the types of datasets available are linked below:
• polity iv dataset
• world bank open data
• itu global it statistics
• freedom house reports
• american journal of political science (ajps) dataverse
• uk dataset guidelines (provides advice on using many open datasets)
• icpsr: inter-university consortium for political and social research
thing : data identifiers
a unique, permanent link helps make it easy to identify and find data. a digital object identifier (doi) is a widely used identifier, but not the only one available. if you are contributing a dataset to an institutional repository or discipline repository, these services may ‘mint’ a doi for you to attach to your dataset.
zenodo is an example of an open repository that will provide a doi for your dataset. the ajps dataverse and uk data service, linked in thing , both use dois to identify datasets.
thing : data citation
using someone else’s dataset? or want to make sure you are credited for use of data? the make data count initiative and datacite are developing guidelines to ensure that data citations are measured and credited to authors, in the same way as other research outputs. currently many researchers, ngos, and organisations contribute data to the un system or at national level to show progress on the un agenda for sustainable development, including the sustainable development goals. there are several initiatives aimed at strengthening national data including national statistical office capacity, disaggregated data, third-party data sources, and scientific evidence.
http://www.systemicpeace.org/polityproject.html https://data.worldbank.org/ https://www.itu.int/en/itu-d/statistics/pages/stat/default.aspx https://freedomhouse.org/reports https://dataverse.harvard.edu/dataverse/ajps https://www.ukdataservice.ac.uk/use-data/guides/dataset-guides https://www.icpsr.umich.edu/icpsrweb/ https://zenodo.org/ https://makedatacount.org/ https://www.datacite.org/ https://sustainabledevelopment.un.org/ http://www.data sdgs.org/
thing : data licensing
depending on your funder, publisher, or purpose of your dataset, you may have a range of data licensing compliance requirements, or options. creative commons is one licensing option. the australian research data commons (formerly known as the australian national data service) provides a guide with workflows for understanding how to licence your data in australia.
activity: when might a creative commons licence not be appropriate for your data? for example:
* when you are working on a contract and the contracting body does not permit it?
* when you are producing data for a body with a more permissive licence or different licensing scheme in place?
* when you are producing data on behalf of a body with an open government data licence? (linked example is for uk)
* are there other examples?
thing : sensitive data
data collected by human rights researchers, by scholars studying regime change in fragile and conflict states, or through interviews with security officials are among the cases where data may be sensitive and need to be handled carefully. in these cases, the procedures used in collecting the data must remain secure, and the data may be fair but not open, or may require specific access protocols and mediated access. see:
• a human-rights based approach to data, un ohchr
• data security, uk data service
thing : data publishing
data sharing policies in political science and international relations journals vary widely. see:
• data policies of highly-ranked social science journals
activity:
* what might some general data requirements look like for international relations? are the data access, production transparency, and analytic transparency guidelines for apsr (american political science review) helpful?
* or do you prefer a less defined set of criteria, such as that set out by international organization?
https://wiki.creativecommons.org/wiki/data_and_cc_licenses https://www.ands.org.au/guides/research-data-rights-management https://datacatalog.worldbank.org/public-licenses http://www.nationalarchives.gov.uk/doc/open-government-licence/version/ / https://www.ohchr.org/documents/issues/hrindicators/guidancenoteonapproachtodata.pdf https://www.ukdataservice.ac.uk/manage-data/store/security https://osf.io/preprints/socarxiv/ h ay https://www.apsanet.org/apsr-submission-guidelines http://iojournal.org/data-archive/
thing : funder requirements
funder requirements vary.
gary king has compiled the policies of most major social science funders (and journals, see thing ). thing : data sharing your funder or publisher may set requirements for data sharing, either as ‘supplementary data’, or in a data repository. but, what if you aren’t funded, and aren’t required to provide supplementary data or comply with data publishing conditions? make it a habit and practice to prepare and release your datasets as fair data when appropriate. choose a repository, claim an identifier (thing ), and licence it appropriately (thing ). add links to your homepage and orcid profile. see: • guide to social science data preparation and archiving thing : learn more the carpentries provide training and workshops on fundamental data skills for research. https://gking.harvard.edu/pages/data-sharing-and-replication https://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/ https://carpentries.org/ top fair data & software things: humanities: historical research sprinters: kristina hettne, peter verhaar (centre for digital scholarship at leiden university), ben companjen, laurents sesink, fieke schoots (centre for digital scholarship at leiden university, reviewer), erik schultes (go fair, reviewer), rajaram kaliyaperumal (leiden universitair medisch centrum, reviewer), erzsebet toth-czifra (dariah, reviewer), ricardo de miranda azevedo (maastricht university, reviewer), sanne muurling (leiden university library, reviewer). description: this document offers a concise overview of the ten topics that are most essential for scholars in the field of historical research who aim to publish their data set in accordance with the fair principles. in historical research, research data mostly consists of databases (spreadsheets, relational databases), text corpora, images, interviews, sound recordings or video materials. 
things

findable
to ensure that data sets can be found, scholars need to deposit their data sets and all the associated metadata in a repository which assigns persistent identifiers.

thing : data repositories
data repositories enable researchers to share their data sets. the following data repositories accept data sets in the field of history:
• dans easy
• figshare
• zenodo
• b share
a number of additional data repositories can be found by going to re data.org and clicking on browse > browse by subject > history.

https://librarycarpentry.org/top- -fair
https://twitter.com/kristinahettne
https://twitter.com/pverhaar?lang=en
https://www.universiteitleiden.nl/en/staffmembers/ben-companjen
https://www.universiteitleiden.nl/en/staffmembers/laurents-sesink#tab-
https://www.universiteitleiden.nl/en/staffmembers/fieke-schoots#tab-
https://orcid.org/ - - - x
https://www.lumc.nl/org/humane-genetica/medewerkers/rajaram-kaliyaperumal?setlanguage=english&setcountry=en
https://openmethods.dariah.eu/erzsebet-toth-czifra/
https://www.linkedin.com/in/ricardo-de-miranda-azevedo-b b b /
https://www.universiteitleiden.nl/en/staffmembers/sanne-muurling#tab-
https://easy.dans.knaw.nl/ui/home
https://figshare.com/
https://zenodo.org/
https://b share.eudat.eu/records/new
https://www.re data.org/

choosing a repository that complies with the coretrustseal criteria for long-term repositories is recommended, because this helps to ensure the durable findability of the data.

activities:
1. study the data set that can be found via https://doi.org/ . /dans-zw -fkxb. how can the dataset be downloaded? which formats are available?

thing : metadata
once a data repository has been selected, the data set can be submitted, together with the metadata describing it. metadata is commonly described as data about data.
in the context of data management, it is structural information about a data set which describes characteristics such as the quality, the format and the contents. most repositories require a minimum set of metadata, such as the name of the creator, the title and the year of creation. check what kind of metadata the repository you choose asks for. remember that the effort you put into metadata will contribute to the findability of your dataset. metadata are often captured using a fixed metadata schema. a schema is a set of fields which can be used to record a particular type of information. the format of the metadata is often prescribed by the data repository which will manage the data set.

activities:
1. read the digital scholarship @ leiden blog to learn about metadata for humans and machines.
2. log in at zenodo.org and click on upload > new upload. on the web page that appears, take stock of the various metadata fields that need to be completed.

zenodo is an international repository. different countries and institutions might have other preferred repositories, such as dans easy. dans easy lists the following specific requirements for historical sciences: 1) a description of the (archival) sources; 2) the selection procedure used; 3) the way in which the sources were used; and 4) which standards or classification systems (such as hisco) were used. read more at https://dans.knaw.nl/en/deposit/information-about-depositing-data/before-depositing

thing : persistent identifiers
datasets need to be deposited in repositories that assign persistent identifiers (pids) to ensure that online references to publications, research data, and persons remain available in the future. a pid is a specific type of uniform resource identifier (uri), managed by an organisation that links a persistent identification code with the most recent uniform resource locator (url). academic journals mostly work with dois.
dois are globally unique identifiers that provide persistent access to publications, datasets, software applications, and a wide range of other research results. doi has been an iso standard since . a typical doi looks as follows: http://doi.org/ . /dans-x b-uy q. when users click on this doi, the doi is resolved to an actual web address.

https://www.coretrustseal.org/
https://doi.org/ . /dans-zw -fkxb
https://digitalscholarshipleiden.nl/articles/metadata- -machines-help-you-find-and-reuse-relevant-research-data
https://zenodo.org/
https://easy.dans.knaw.nl/
https://en.wikipedia.org/wiki/uniform_resource_identifier

next to identifiers for data sets and for publications, it is also possible to create pids for people. open researcher and contributor identifier (orcid) is an international system for the persistent identification of academic authors. it is a non-proprietary system, managed by an international consortium consisting of universities, national libraries, research institutes and data repositories. when your research results are associated with an orcid, this information can be exchanged effectively across databases, across countries and across academic disciplines. you always retain full control over your own orcid id. it is the de facto standard when submitting a research article or grant application, or depositing research data.

activities:
1. watch the video “persistent identifiers and data citation explained” by research data netherlands.
2. watch the video “what are persistent identifiers” for an example of how they are used in digital heritage.
3. if you don’t have one, request an orcid. add all your information as completely as possible.
4. read alice meadows’ blog post six things to do now you have an orcid id.
5. go to a data record and click on the doi to see how the doi can be resolved to the current url of the data set: http://dx.doi.org/ . /dans-x b-uy q.
6. read “digital object identifier (doi) system for research data”.
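the resolution step described above can be sketched in a few lines of python. this is only an illustration and not part of the original guide: the function name `doi_to_url` and the example doi `10.1234/example-dataset` are hypothetical, and the sketch merely builds the address at the public resolver (https://doi.org/) without contacting the network.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver

# Spellings that people commonly put in front of a DOI.
_KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a DOI written in any common spelling."""
    cleaned = doi.strip()
    lowered = cleaned.lower()
    for prefix in _KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Every DOI starts with "10." and has a prefix/suffix pair split by "/".
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, but keep "/".
    return DOI_RESOLVER + quote(cleaned, safe="/")


# Hypothetical DOI, used for illustration only.
print(doi_to_url("doi:10.1234/example-dataset"))
```

storing the bare doi and building the resolver address only when it is needed keeps a reference usable even if the preferred resolver address changes.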
https://orcid.org/
https://www.youtube.com/watch?v=pgqtiy oz k
https://www.youtube.com/watch?v=auvmlzdgb y&feature=youtu.be
https://orcid.org/register
http://orcid.org/blog/ / / /six-things-do-now-you%e % % ve-got-orcid-id

accessible

thing : open data
the fair principles stipulate that data and metadata ought to be “retrievable by their identifier using a standardised communication protocol” (requirement a ). this requirement does not necessarily imply that the data should be fully available in open access. it principally means that there needs to be a protocol that users may follow to obtain the data set. there can be many good reasons for limiting access to a file. public accessibility may be difficult because of privacy laws or copyright protection regulations, for example. the accessibility of the data may occasionally be complicated by the fact that the data have been stored in a so-called proprietary format, i.e. a format that is owned exclusively by a single company. for formats which are associated with specific software applications, it can be difficult to guarantee long-term usability, accessibility and preservation. for this reason, the dans easy archive in the netherlands works with a list of ‘preferred formats’.

activities:
1. read the article on the website of dans about preferred formats, and about what you can do to improve the durability of non-preferred formats.
2. read the web page on open data on the ands website.
3. consider the following three articles. to what extent can the data sets
that are mentioned in the articles be accessed? are the data sets also in preferred formats?
* https://doi.org/ . / x. .
* http://dx.doi.org/ . /journal.pone.
* http://doi.org/ . /lang.
4. look at the data set that can be found via https://doi.org/ . /dans-x u-usxj. what is needed to access the data?

https://www.ands.org.au/__data/assets/pdf_file/ / /digital-object-identifiers.pdf
https://www.go-fair.org/fair-principles/ - /
https://dans.knaw.nl/en/deposit/information-about-depositing-data/before-depositing/file-formats
https://www.ands.org.au/working-with-data/articulating-the-value-of-open-data/open-data

interoperable

thing : data structuring and organisation
well-structured and well-organised data can be reused much more easily. this section explains how researchers can organise their data in such a way that they can be analysed effectively with data science tools. many historians capture their data in spreadsheets. as broman and woo ( ) explain, there are a number of important principles to bear in mind when you work with spreadsheets.
• be consistent. use the same terminology throughout.
• avoid empty cells. use a consistent code for data which is unavailable, such as the ‘na’ used in r.
• use a regular format for dates, such as yyyy-mm-dd.
• use each cell to capture atomic data. do not place multiple values in a single cell. every value that you may want to use in calculations or in other analyses needs to be available separately.
• organise the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row).
• do not use colours to indicate properties of data. represent all data that you need as actual values in the spreadsheet.
• do not include calculations in the raw data files.
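some of the spreadsheet principles listed above can be checked automatically once the data are exported as csv. the following python sketch is an illustration under stated assumptions rather than a tool recommended by the guide: the function `check_rows`, the column names and the sample rows are all hypothetical, and the date check only tests the yyyy-mm-dd shape, not whether the date exists in the calendar.

```python
import csv
import io
import re

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")  # shape of yyyy-mm-dd
MISSING = "NA"  # one consistent code for unavailable values


def check_rows(csv_text, date_columns=()):
    """Return a list of (row_number, column, problem) tuples."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so the first data row is row 2.
    for row_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if column is None:
                # DictReader stores surplus cells under the key None.
                problems.append((row_number, "(extra)", "more cells than header columns"))
                continue
            value = (value or "").strip()
            if value == "":
                problems.append((row_number, column, "empty cell"))
            elif column in date_columns and value != MISSING:
                if not ISO_DATE.fullmatch(value):
                    problems.append((row_number, column, "date not yyyy-mm-dd"))
    return problems


# Invented sample data, for illustration only.
sample = """person,birth_date,occupation
jan jansz,1612-03-04,baker
pieter claesz,4 march 1615,
anna de wit,NA,weaver
"""

for problem in check_rows(sample, date_columns=("birth_date",)):
    print(problem)
```

in this sample the second data row is reported twice: once for the free-text date and once for the empty occupation cell. the third row passes, because ‘NA’ is accepted as the agreed code for a missing value.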
once you have developed a suitable data model, you are also advised to develop a data dictionary which documents the model. this document may contain the following information:
• a list of all the column names used in the data spreadsheet.
• a description of the purpose and the contents of these columns.
• if applicable, an indication of the units of measurement.
• if applicable, a description of the measures that have been taken to ensure the correctness and the consistency of the data.
• an explanation of abbreviations or notational conventions that have been used in the data set.

activities:
read karl broman and kara h. woo, “data organization in spreadsheets”.

thing : controlled vocabularies and ontologies
tim berners-lee, the inventor of the web, argued that there are five levels of open data. creators of data can earn five stars by following the steps below.
1. a data set can be awarded one star if it has been made public. this is clearly the case for data which have been published under an open license in a data repository.
2. to earn a second star, the open data need to be made available as machine-readable data. this criterion can be satisfied by providing access to an excel spreadsheet, for instance.
3. one disadvantage of an excel spreadsheet is that users need proprietary software to open the data. the third star can be awarded to data sets which are captured using open formats, such as csv or txt.
4. a fourth star can be awarded when the entities in the data set are identified using persistent identifiers. such pids allow other researchers to link to the data set effectively.
5. the fifth star can be earned by linking the data to entities in other data sets via pids.
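the five steps above can be read as a checklist in which each star builds on the previous one. the python sketch below encodes that reading. it is an interpretation for illustration only: the class `DatasetProfile`, its field names and the rule that counting stops at the first unmet criterion are assumptions, not part of berners-lee’s scheme as quoted here.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    published_openly: bool      # made public under an open license
    machine_readable: bool      # e.g. a spreadsheet rather than a scanned page
    open_format: bool           # e.g. csv or txt instead of a proprietary format
    entities_have_pids: bool    # entities identified with persistent identifiers
    linked_to_other_data: bool  # entities linked to other data sets via pids


def star_rating(profile: DatasetProfile) -> int:
    """Count stars, stopping at the first criterion that is not met."""
    criteria = [
        profile.published_openly,
        profile.machine_readable,
        profile.open_format,
        profile.entities_have_pids,
        profile.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# A csv file deposited in a repository under an open license.
csv_in_repository = DatasetProfile(True, True, True, False, False)
print(star_rating(csv_in_repository))  # 3
```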
when researchers have published their well-structured and well-organised data set in a data repository under a public license, as explained in things to above, they will have arrived at a data set that can be awarded three stars, according to berners-lee’s scheme. this section and the following section explain how researchers can enhance the interoperability of their data sets even further by working with rdf and with persistent identifiers.

https://www.tandfonline.com/doi/full/ . / . .
https:// stardata.info/en/
http://www.cidoc-crm.org/concept-search
http://id.loc.gov/
http://id.loc.gov/authorities/names/n

as a first step, it can be useful to explore whether some of the general topics that you focus on have already been assigned persistent identifiers or uris. many researchers and institutions have developed shared vocabularies and ontologies to standardise terminology. in many cases, the terms which have been defined have also been assigned persistent identifiers. such shared vocabularies can make it clear that we are talking about the same thing when we exchange knowledge. historical research often concentrates on people, events, organisations and locations. the following ontologies and shared vocabularies concentrate on entities such as these:
• the cidoc conceptual reference model (crm) concept search.
• wikidata, which assigns identifiers to a wide range of entities, including people, locations and organisations.
• the library of congress name authority files, e.g. http://id.loc.gov/authorities/names/n .
• viaf (virtual international authority file, https://viaf.org/).
• identifiers for books published in dutch or in the netherlands can be found via the stcn, whose contents are available as linked open data.
• the unesco history thesaurus.
• aspects of books can be described using terms from the bibliographic ontology and the fabio ontology.
• geonames assigns persistent identifiers to locations, e.g. https://www.geonames.org/ /leiden.html.
• tadirah and bartoc (basel register of thesauri, ontologies & classifications) also offer valuable overviews of the ontologies that have been developed within specific disciplines.
• one of the ways to describe the provenance of data sets is by so-called nanopublications, each of which is a set of resource description framework (rdf) triples (subject-predicate-object tuples). although you do not need nanopublications to describe provenance, they are a way of combining argument and provenance in a single package. nanopublications rely on the provenance ontology to express provenance. you can read more about them and their application in historical research in this paper by patrick golden and ryan shaw: nanopublication beyond the sciences: the periodo period gazetteer.

where possible, try to use terms that have been defined in these existing ontologies in your own data set. an example where a specific vocabulary (the voc glossary) was used to mark up a dataset can be found here. the dataset is part of a project to reconstruct the domestic market for colonial goods in the dutch republic.

activities:
1. try to find one or two terms that are relevant to your research using the resources that are mentioned above. you can also use swoogle to search for vocabularies related to your research.
2. search for a term related to your research in the cidoc conceptual reference model (crm) concept search. were you able to find it?
tip : search for “person” to get an idea of how the thesaurus works.
tip : all the terms used can be found in the last release of the model: http://www.cidoc-crm.org/get-last-official-release.

thing : fair data modelling
the fourth and the fifth star in berners-lee’s model can be awarded when the data are stored in a format in which the topics, their properties and their characteristics are identified using uris whenever possible. more concretely, this implies that you record your data using the resource description framework (rdf) format.
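as a rough illustration of what recording data as rdf involves, the python sketch below writes two statements as subject-predicate-object triples and prints them in the n-triples notation. it is a simplified sketch and not the method of any tool named in this guide: the `http://example.org/` uris and the `bornIn` property are hypothetical placeholders, only the `rdfs:label` uri is a real vocabulary term, and the helper treats every value that starts with http:// or https:// as a uri and everything else as a plain text literal.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def to_ntriples(subject, predicate, obj):
    """Serialise one subject-predicate-object statement as an N-Triples line."""

    def term(value):
        # Simplifying assumption: anything that looks like a web address is a URI.
        if value.startswith(("http://", "https://")):
            return f"<{value}>"
        # Otherwise write a plain literal, escaping backslashes, quotes and newlines.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        return f'"{escaped}"'

    return f"{term(subject)} {term(predicate)} {term(obj)} ."


# Hypothetical identifiers, for illustration only.
triples = [
    ("http://example.org/person/1", RDFS_LABEL, "jan jansz"),
    ("http://example.org/person/1",
     "http://example.org/vocab/bornIn",
     "http://example.org/place/leiden"),
]

for triple in triples:
    print(to_ntriples(*triple))
```

in a real data set the example.org placeholders would be replaced by identifiers from shared vocabularies, such as the ones listed in the previous thing, so that other data sets can point at the same entities.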
rdf, simply put, is a technology which enables you to publish the contents of a database via the web. it is based on a simple data model which assumes that all statements about resources can be reduced to a basic form, consisting of a subject, a predicate and an object. rdf assertions are also known as triples. in a fair data model, elements of data are organised and identified using pids. the same goes for the relations between these elements. the fair data model is a graphical view of the data that acts as a metadata key to a spreadsheet, but it can also be used as a guide to expose data as a linked data graph in rdf format. existing data sets can be converted to rdf by making use of the fairifier software. this application is based on openrefine. other examples of tools to generate rdf are karma and rml. in the fairifier, it is possible to upload a csv file. after this, the data set can be connected to elements from existing ontologies.

https://viaf.org/
http://openvirtuoso.kbresearch.nl/sparql
http://vocabularies.unesco.org/browser/thesaurus/en/page/concept
http://bibliontology.com/
https://sparontologies.github.io/fabio/current/fabio.html
https://www.geonames.org/
https://www.geonames.org/ /leiden.html
http://tadirah.dariah.eu/vocab/index.php
https://bartoc.org/
https://peerj.com/articles/cs- /
http://resources.huygens.knaw.nl/vocglossarium/index_html_en
https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:
http://swoogle.umbc.edu/ /
http://www.cidoc-crm.org/concept-search
http://www.cidoc-crm.org/get-last-official-release

activities:
1. learn about the basics of rdf modelling by going through the first slides of the webinar about the unesco thesaurus.
2. dig in deep by exploring the fairifier for a dataset you already have available in csv.

reusable

thing : licensing
a license describes the conditions under which your data or software is (re)usable.
picking a license can be a daunting process because of the common feeling that if you do not pick the right license something will go wrong. however, keep in mind that if you do not choose a license for your data or software, it cannot be used or reused by others. a copyright expert can help you, but to get you going you can try out the activities listed below.

activities:
1. try to pick a license for a data set you are working on by using the creative commons license picker.
2. try to pick a license for a piece of software or code you are working on by using the choose a license picker.
3. learn more about licensing your data by reading this guide from the digital curation centre.

if you deposit your data in a repository there will be default options available.

thing : data citation
when you have made use of someone else’s data, you are strongly recommended to attribute the original creators of these data by including a proper reference. data sets, and even software applications, can be cited in the same way as textual publications such as articles and monographs. structured data citations can also be used to calculate metrics about the reuse of the data. data citations, regardless of citation style, typically contain the authors, the year, the title, the publisher and a persistent identifier.

https://github.com/dtl-fairdata/fairifier/wiki
http://openrefine.org/
http://usc-isi-i .github.io/karma/
http://rml.io/
http://dublincore.org/resources/training/asist-webinar- /webinar-en.pdf
http://vocabularies.unesco.org/browser/thesaurus/en/
https://creativecommons.org/choose/
https://choosealicense.com/
http://www.dcc.ac.uk/resources/how-guides/license-research-data

activities:
1. read the ands guide on data citation.
2. read the force data citation principles.
3. study the following data set on figshare: https://doi.org/ .
/m .figshare. .v . note that there is the possibility to generate a data citation, under the link “cite”, in the citation style of your choice.
4. consider the following publication: https://doi.org/ . /journal.pone. . note that the article has a “data availability” statement.
5. explore citeas by typing in the figshare doi from above ( . /m .figshare. .v ).

context

thing : policies
policies for data availability can come from publishers, funders and universities. these policies are listed on the respective websites, but finding them is not always straightforward. fairsharing is a repository for standards, databases and policies with the possibility to filter on information for a specific research domain. it started as an initiative for the life sciences but is rapidly expanding its content to other disciplines as well.

activities:
1. start by going to fairsharing.
2. click on the blue “policies” button at the top.
3. in the left side menu under “subjects”, click on “show more” and select “humanities”.
4. scroll down to the taylor and francis data policy.
5. which databases and standards are mentioned in this policy?
6. go to the specific policy for the “european review of history” journal.
7. does it differ from the general taylor and francis policy?
8. try to find the data policy for your favorite journal.

https://www.ands.org.au/__data/assets/pdf_file/ / /data-citation.pdf
https://www.force .org/datacitationprinciples
http://citeas.org/
https://fairsharing.org/
https://fairsharing.org/bsg-p /
https://www.tandfonline.com/action/authorsubmission?journalcode=cerh &page=instructions&#dsp

top fair data & software things: geoscience

sprinters: john brown, janice chan, niamh quigley (curtin university, perth, western australia)

audience: researchers

things

findable
thing : data sharing and discovery
thing : vocabularies for data description
thing : identifiers and linked data
thing : spatial data

accessible
thing : long-lived data: curation & preservation
thing : data citation for access & attribution
thing : dois and citation metrics

interoperable
thing : dois and citation metrics
thing : vocabularies for data description
thing : identifiers and linked data
thing : exploring apis and apps

reusable
thing : licensing data for reuse
thing : what are publishers & funders saying about data?

https://librarycarpentry.org/top- -fair
https://staffportal.curtin.edu.au/staff/profile/view/john.brown
https://github.com/icecjan/

thing : data sharing and discovery

activity : data discovery
data repositories enable others to find existing data by publishing data descriptions ("metadata") about the data they hold, much like a library catalogue describes the resources held in a library. repositories also often provide access to the data itself, and some even provide ways for users to explore that data. many research funding requirements ask researchers to deposit their data in data repositories (which we’ll discuss later in thing ). data portals or aggregators draw together research data records from a number of repositories. because of the huge amounts of data available they sometimes focus on data from one discipline or geographic region. the eu open data portal is an example that aggregates metadata records from over european national data repositories and the us government’s open data portal data.gov aggregates from over us government agencies.
1.
look at this data.gov.au record from geoscience australia: lord howe rise marine survey .
• examine the description and additional info fields to see the ways that geoscience australia has made this record findable to other researchers. if you knew about this data portal, would you be able to easily find this dataset if it was relevant to your research?
2. spend a few minutes exploring the scottish spatial data infrastructure metadata portal.
• try browsing or searching on a topic of interest.
• explore a record and see where it came from and if there’s a way to contact the creator.
• have a look at the map and see if you can find and add a map layer relating to fishing.
3. look at earthchem.
• have a look at some of the data in earthchem. would it be a good place to contribute the data from your own research?

consider: if your research appeared in the right data portal or repository, what might result from that for you? what about your discipline?

activity : finding data repositories
1. choose one of the specialised data repositories below, or find another data repository on re data.org (perhaps one outside your particular focus area) and spend some time browsing around your chosen repository to get a feel for the data available.
• worldclim
• southern california earthquake centre
• mopitt (atmospheric science data centre)
• international service of geomagnetic indices
• scientific drilling database
• alberta geological survey
2. think about how the data here differs from data you are familiar with, for example, in format, size and access method.

consider: could you apply a dataset from one of these repositories to your own work? would you need to change file formats or learn a new software package?

https://data.europa.eu/euodp/en/home
https://www.data.gov/
https://data.gov.au/
https://data.gov.au/dataset/lord-howe-rise-marine-survey- -ga- -kr - c-bathymetry-grids
https://www.spatialdata.gov.scot/geonetwork/srv/eng/catalog.search#/home
https://www.spatialdata.gov.scot/geonetwork/srv/eng/catalog.search#/map
http://www.earthchem.org/
http://www.earthchem.org/data/templates
https://www.re data.org/browse/by-subject/

thing : long-lived data: curation & preservation

activity : preserving born digital objects
information sources that were commonly used in the past, such as maps and handwritten observation notes, can easily survive for years, decades or even centuries. however, because most current research is done on computers, it’s important to remember that digital items require special care to keep them usable over time.
1. this video ( . min) from the us library of congress shows the vulnerability of “born digital” objects like research data: they are fragile; they are dependent on software and hardware; and they require active management.
2. look at the ands page on file formats.

consider: if your research was put into a time capsule and unearthed in years' time, would future researchers be able to determine if your research is still useful to them?
if you were allowed to update the time capsule every years, what would you change to make it easier for those unearthing it?

activity : readme files
one way that researchers can ensure their data is useful in the future is to package their data with an explanation that can be opened without any special software. these explanatory files mean that anyone who finds the data will know whether the data is useful to them and, ideally, won’t have any questions for the original researcher, who may not be available or may not remember the details. the files are usually called “readme” files in the hope that by reading the file, all the important questions will be answered.
1. read the guide to writing “readme” style metadata from the cornell research data management service group and create a readme.txt file for one of your own datasets. don’t forget to include notes on software versions used, methodology and any special things you’d tell a colleague if you were giving them the data yourself!

http://worldclim.org/
https://www.scec.org/
https://eosweb.larc.nasa.gov/project/mopitt/mopitt_table
http://isgi.unistra.fr/index.php
http://www.scientificdrilling.org/
https://geology-ags-aer.opendata.arcgis.com/
http://www.bl.uk/learning/timeline/large .html
http:// .bp.blogspot.com/-a pzzicrac /vhmmgigndqi/aaaaaaaafa /fkk e veaja/s /img_ .jpg
https://youtu.be/qemmeffafus
https://www.ands.org.au/working-with-data/data-management/data-preservation
https://data.research.cornell.edu/content/readme

thing : data citation for access & attribution

activity : citing research data
when authors cite an article they have used ideas from, they formally and publicly acknowledge the work of the earlier author. data citation works in the same way: when authors cite the data created by earlier researchers, those researchers get formal and public credit for their contribution to the new work.
along with books, journals and other scholarly works, it is now possible to formally cite research datasets and even the software that was used to create or analyse the data.
1. have a look at the geophysical, hydraulic and mechanical properties of synthetic versus natural sandstones under variable stress conditions dataset from the british geological survey (https://www.bgs.ac.uk/services/ngdc/citeddata/catalogue/a b - e f- -b ff- b.html). if someone wanted to use this dataset for further research, would they know how to give credit to the creator of the original dataset?
2. find a doi of a dataset from one of the repositories you found in thing and enter it into the doi citation formatter: https://citation.crosscite.org/. if you saw the citation, would you know how to go about accessing the data?
3. read the article, “sharing detailed research data is associated with increased citation rate”. why would it be that papers that make their data openly available get better citation counts? would you feel more confident citing another person’s work if you knew?

consider: data citation is a relatively new concept in the scholarly landscape and, as yet, is not routinely done by researchers or demanded by journals. what could be done to encourage routine citation of research data and software associated with research outputs?

activity : citing software
the increase in available computational power over the last years has led to a massive increase in the use of computational analysis methods in geoscience. as such techniques become more commonplace, it’s important to distinguish between the data itself, the tools used to analyse data and any discrete components within those tools. in some cases, a particular function of the software is critical to the analysis process; in other cases the critical part is an interchangeable block of code within that software package.
recognising the difference between these two is important as it changes who gets credit for their previous work and who gets left unsung. it’s not always easy to know which to cite, but giving recognition for the creation of software and software components can have a huge impact on the career of a researcher, especially if they create scientific software!
1. read the how to cite software guide from the mit libraries (https://libguides.mit.edu/c.php?g= &p= ).
2. read the adding citation to your r package blog post.

consider: if you wrote a package of code for a computer program to run and made it freely available to your colleagues to solve a problem in your field, would they know how they could give you credit in their work? would they think that you would want attribution?

https://doi.org/ . /journal.pone.

thing : dois and citation metrics
dois are unique identifiers that enable data citation, metrics for data and related research objects, and impact metrics. citation analysis and citation metrics are important to the academic community. find out where data fits in the citation picture.

activity : dois
digital object identifiers (dois) are a type of ‘persistent identifier’. dois are unique identifiers that provide persistent access to published articles, datasets, software versions and a range of other research inputs and outputs. there are over million digital object identifiers (dois) in use, and in dois were “resolved” (clicked on) over billion times! each doi is unique but a typical doi looks like this: http://dx.doi.org/ . / / f ba
1. start by watching this short .
-minute video persistent identifiers and data citation explained from research data netherlands. it gives you a succinct, clear explanation of how dois underpin data citation.
2. have a look at the poster building a culture of data citation and follow the arrows to see how dois are attached to data sets and are used in data citation.
3. let’s go to a research data australia data record which shows how dois are used. click on this doi to ‘resolve’ the doi and take us to the record: http://dx.doi.org/ . / / f ba .
4. click on the cite icon on the upper left of the record (under the green access the data tab). no matter where the doi appears, it always resolves back to its original dataset record to avoid duplication, i.e. many records, one copy.
5. dois can also be applied to grey literature, a term that refers to research that is either unpublished or has been published in non-commercial form, such as government reports. for example, reports like this: http://doi.org/ . / / d b .

https://www.r-bloggers.com/adding-citation-to-your-r-package/
https://youtu.be/pgqtiy oz k
https://www.ands.org.au/__data/assets/pdf_file/ / /data_citation_poster.pdf

activity : igsns
international geo sample numbers (igsns) are designed to provide an unambiguous, globally unique persistent identifier for physical samples. they facilitate the location, identification, and citation of physical samples used in research. each igsn is unique but a typical igsn looks like this: ieevb c . the first five characters of the igsn represent a name space (a unique user code) that uniquely identifies the person or institution that registers the sample. the last characters of the igsn are a random string of alphanumeric characters ( - , a-z).
1. start by reading this brief introduction to igsn.
2.
review the scope and capability of each igsn allocation agent listed on the igsn website and consider which allocation agent is most appropriate for your samples.
3. have a look at an igsn record, https://app.geosamples.org/sample/igsn/ieevb c , which displays the information that was recorded about the sample.
4. now have a look at how igsns are referenced in a dataset record: http://get.iedadata.org/doi/

consider: how are you managing your physical samples? the ands igsn minting service may be used by australian researchers at no cost. do you know of a service provider in your region?

activity: altmetrics

data citation best practice, as discussed in thing , enables citation metrics for data to be tracked and analysed. data citations are available from the clarivate data citation index, which is a commercial product. altmetrics are an alternative measure to help understand the influence of your work. the term refers to metrics such as the number of views, downloads, and mentions in policy documents, social media, and social bookmarking platforms associated with any research output that has a doi or other persistent identifier. because of their immediacy, altmetrics can be an early indicator of the impact or reach of a dataset, long before formal citation metrics can be assessed.

1. start by looking at the altmetrics for this phylogenomics article published in science. note the usage statistics, including number and pattern of downloads, for this article since it was published in november .
2. now click on the “donut” or the link to ‘see more details’ to see the wealth of information available.

https://app.geosamples.org/sample/igsn/ieevb c
https://www.ands.org.au/working-with-data/citation-and-identifiers/igsn
http://www.igsn.org/register-your-samples
http://get.iedadata.org/doi/
http://www.sciencemag.org/articleusage?gca=sci; / /

3.
look also at the associated data in dryad, noting that the data has been assigned a doi. can you see how many times the data has been downloaded and the record viewed (scroll down to the bottom of the record)?

by way of comparison, as of early november :
* the same dataset had been cited once in web of science data citation index
* the article had been cited times in web of science

consider: do you think altmetrics for data have value in academic settings? why, or why not?

thing: licensing data for reuse

understand the importance of data licensing, learn about creative commons and find out how enabling reuse of data can speed up research and innovation.

activity: why license research data?

consider this scenario: you’ve found a dataset you are interested in. you’ve downloaded it. excellent! but do you know what you can and cannot do with the data? the answer lies in data licensing. licensing is critical to enabling data to be reused and cited.

1. start by reading this brief introduction to licensing research data.
2. now watch this creative commons licensing introductory video or have a closer look at the understanding cc licences poster.
3. check out the licence chooser from creative commons, which walks you through the decision of which licence is appropriate for your purpose.

consider: if you were licensing a dataset that may have commercial value to others, what licence would you apply?

activity: data licences: unlock data for innovation

enabling reuse of data can speed up research and innovation. licensing is critical to enabling data reuse.

1. start by watching this . mins video in which dr kevin cullen from the university of new south wales explains their approach to licensing, which aims to strengthen the university’s relationship with business and industry.
2. check out the data standards of geoscience australia, which refers to the australian government policy on public data.
which creative commons licence is applied to government data by default?
3. since november , geoscience australia has officially adopted creative commons attribution as the default licence for its website. that means thousands of products and datasets available through the website are free to be reused.

http://datadryad.org/resource/doi: . /dryad. c f
http://apps.webofknowledge.com/full_record.do?product=wos&search_mode=generalsearch&qid= &sid=e hcr sig gepv octf&page= &doc=
https://www.ands.org.au/working-with-data/publishing-and-reusing-data/licensing-for-reuse
https://youtu.be/fsto ink oi
https://www.ands.org.au/__data/assets/image/ / /ccposter.png
https://creativecommons.org/choose/
https://youtu.be/lmyzf ijp e?list=plg fmbdlra qh _yynsgzkqotbvstk r
http://www.ga.gov.au/data-pubs/datastandards
https://pmc.gov.au/resource-centre/public-data/australian-government-public-data-policy-statement
http://creativecommons.org.au/blog/ / /more-on-government-data-geoscience-australia-goes-cc/

4. see the range of data products and licences available at the british geological survey. does your institution have policies or guidelines around data licensing?

activity: data licensing in practice

not all research data that is shared is licensed for reuse. it should be!

1. explore the following data repositories:
• research data australia
• auscope geonetwork portal
• earthchem
2. or review the following example records:
• darwin harbour marine habitats
• mineral occurrences - south australia
• whole rock composition data for garnet pyroxenites from arizona
3. do all data repositories or metadata catalogues enable users to refine a search by licence? look closely at the specific licensing information on a small sample of those records with ‘open’ licences. how easy or difficult is it to work out whether the data can or can’t be reused, e.g. for commercial purposes? with international collaborators?
consider: assigning open licenses is not routine. suggest one tip for encouraging uptake of 'open' licensing.

thing: vocabularies for data description

in addition to selecting a metadata standard or schema, whenever possible you should also use a controlled vocabulary.

activity: what is controlled vocabulary?

a controlled vocabulary provides a consistent way to describe data - location, time, place name, subject. read this short explanation of controlled vocabularies. controlled vocabularies significantly improve data discovery. they make data more shareable with researchers in the same discipline because everyone is ‘talking the same language’ when searching for specific data, e.g. plants, animals, medical conditions, places etc.

if you have time, have a look at controlling your language: a directory of metadata vocabularies from jisc in the uk. make sure you scroll down to . conclusion - it’s worth a read.

https://www.bgs.ac.uk/data/licensing/home.html
https://researchdata.ands.org.au/
http://portal.auscope.org/geonetwork
http://www.earthchem.org/
https://researchdata.ands.org.au/darwin-harbour-marine-habitats/
http://portal.auscope.org/geonetwork/srv/eng/catalog.search;jsessionid= f dd d fb e cf a #/metadata/ b f cd d d ec d b c
http://get.iedadata.org/doi/
https://stats.oecd.org/glossary/detail.asp?id=
http://www.webarchive.org.uk/wayback/archive/ /http:/www.jiscdigitalmedia.ac.uk/guide/controlling-your-language-links-to-metadata-vocabularies

activity: controlled vocabularies in action

we are going to see some controlled vocabularies in action in the atlas of living australia (ala).

1. do a search in the ala search engine. type “whale” in the search box and click on search. choose one of the records listed and click on the (red text) view record link.
2. any metadata field where you see supplied...
tells you that the information supplied by the person who submitted the record (often a 'citizen scientist') has been changed to the controlled vocabulary being used in metadata fields, e.g. observer, record date and common name.
3. have a scroll down the record and consider how many of the metadata fields probably have a controlled vocabulary in use (e.g. taxonomy, geospatial etc.).

if you have time: have a browse around the stunning level of data description and data contained in the atlas of living australia.

activity: geoscience vocabularies

explore some examples of vocabularies used in geoscience:
• american geosciences institute georef thesaurus
• geological survey of western australia geoscience thesaurus (gempet)
• geoscience australia vocabularies register
• british geological survey vocabularies

consider: do you use controlled vocabularies to describe your data? how would you encourage other researchers to use them?

thing: identifiers and linked data

orcid is a unique identifier for researchers. many research data repositories record your orcid when you submit research data for publication.

activity: check your orcid

in your orcid record, datasets you have published will be displayed in the works section. log into orcid now and check your details are up to date, including:
* email address
* biography
* research keywords
* other ids such as scopus author id.

if you don’t already have an orcid, you can get one. this curtin university webpage has information on how to get the most out of your orcid.
https://www.ala.org.au/data-sets/
http://www.ala.org.au/
https://www.americangeosciences.org/georef/georef-thesaurus-lists
http://www.dmp.wa.gov.au/geoscience-thesaurus-gempet- .aspx
http://ldweb.ga.gov.au/def/voc/ga/
https://www.bgs.ac.uk/data/vocabularies/home.html
https://orcid.org/
https://orcid.org/signin
http://libguides.library.curtin.edu.au/c.php?g= &p=

activity: get more from your orcid

orcid populates your orcid record from many sources, one of which is peer review activities. publishers such as the american geophysical union publications now send details of peer review activities to orcid.
• look at your orcid record. if you have undertaken peer review activities, are they listed?
• why do you think linking peer review activities to orcids could be useful?

activity: identifiers and linked data

because they are unique identifiers, orcids can be used to link data from different datasets together. geolink is a network of linked data from multiple data repositories.

1. go to the portal for the geolink demo.
2. choose an entity, e.g. datasets, cruises, vessels, instruments, researchers, and explore! the help guide is here.

thing: what are publishers & funders saying about data?

geoscience research data is a world heritage. researchers share the responsibility with research institutions and funders of ensuring their data is well-documented, preserved and openly available. many publishers have special requirements for the citation of data in publications. this can be in the form of compliance with a data policy, author guidelines or the completion of a data availability statement.

activity: research data and scholarly publishing

have a look at the nature data availability statement examples or the plos data availability policy to get an idea of what publishers expect.
copdess is the coalition for publishing data in the earth and space sciences, and they have collected links to author instructions and data policies for some geoscience journals, publishers and funders.

activity: research funders and data sharing

the previous activity has shown us that it’s becoming more common for journals and publishers to demand your data be made available when you seek to publish. however, if your research is publicly funded it’s almost guaranteed that your grant and funding obligations will require you to make your data publicly available at the end of your project – the outputs of research funded by a population should be made available to that population.

https://eos.org/agu-news/agu-opens-its-journals-to-author-identifiers
http://demo.geolink.org/
http://demo.geolink.org/help/index.html
https://sciencepolicy.agu.org/files/ / /agu-data-position-statement-final- .pdf
https://www.nature.com/authors/policies/data/data-availability-statements-data-citations.pdf
https://journals.plos.org/plosone/s/data-availability
http://www.copdess.org/datapolicies/

the australian research council’s data management requirements state that funded researchers are expected to follow the oecd principles and guidelines for access to research data from public funding. similar principles are outlined by uk research and innovation (ukri) in their guidance on best practice in the management of research data document.

consider: if you were on a funding panel and were asked to assess a grant with a clear plan for making the data openly available, would you rate the future impact of that proposal better or worse than one with a poorly defined plan?

thing: exploring apis and applications

geoscience has many specialised services, applications and apis which can be used to directly access and harness existing research data.
some are free, and some are subscription-based, but your research institution may have access.

activity: try an app

• the wa geology app, created by the western australian government, can be used in a mobile web browser and provides multiple layers of geoscience information for western australia.
• the british geological survey has created the free igeology app to explore hundreds of british maps.

activity: apis

apis (application programming interfaces) are software services that allow you to access structured data or systems held by someone else. they are usually provided so that developers can access data held by an organisation on demand, rather than having to hold an entire dataset themselves (which may not be possible because of security or space requirements, or because the dataset is constantly changing). some companies charge for using their apis, but many research-oriented organisations provide their apis for free so that other organisations can link in to their knowledge.
• the nasa earth data developer portal provides data from the nasa earth science data portal.
• the natural history museum api provides a range of data from their collections.

consider: if you could systematically access and integrate the data provided from one of the sources above, can you think of a way you could enrich the outputs of your own research?

https://www.arc.gov.au/policies-strategies/strategy/research-data-management
http://www.oecd.org/sti/inno/ .pdf
https://www.ukri.org/about-us/
https://www.ukri.org/files/legacy/documents/rcukcommonprinciplesondatapolicy-pdf/
http://www.dmp.wa.gov.au/geology-mapping-app-for-mobile- .aspx
https://www.bgs.ac.uk/igeology/
https://developer.earthdata.nasa.gov/
https://earthdata.nasa.gov/
http://data.nhm.ac.uk/about/download
http://www.nhm.ac.uk/our-science/collections.html

thing: spatial data

the importance of spatial data is ever increasing.
many of the societal challenges we face today, such as food scarcity and economic growth, are inherently linked to big spatial data. in fact, it is often said that % of all research data has a geographic or spatial component. it is useful, then, for all of us to have an understanding of spatial data.

activity: spatial data: maps and more

1. start by watching this incredible, inspiring video ( . min) from the university of wollongong’s petajakarta project. it shows innovative ways of combining social media and geospatial data to save lives.
2. now read the application of geographic information science in earth sciences.
3. this video combines a range of different data visualisations depicting the human impacts on our environment.
4. geospatial data is fundamental to australia’s economic future. check out this very short article about how geoscience australia is mapping the mineral potential of our continent - a world first!

just for fun: enter your address in the atlas of living australia and see what birds and plants have been reported in your street or suburb. you may be surprised at how ‘alive’ your street is!

consider: why do you think these geospatial visualisations are so powerful?

activity: spatial data concepts

there are many types and sources of geospatial data. if you are new to the world of geospatial data, you will probably appreciate some ‘busting’ of the jargon of geospatial data.

1. start by reading this fundamentals chapter to learn more about maps, projections, coordinate systems, datums and gis.
2. want more? continue with this blog about finding and making sense of geospatial data on the internet, which explains some basic geospatial data file formats and concepts.
3. prefer watching? most of these concepts are also explained in this video.
4. read more about two important aspects of spatial data: scale and resolution.

consider: how would you explain two new terms you have just learnt?
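To make the idea of a projection concrete, here is a minimal sketch that converts longitude/latitude in degrees into the planar coordinates used by most web maps (spherical Web Mercator, commonly labelled EPSG:3857). It is an illustration rather than part of the original activity: the function name and the latitude cut-off constant are assumptions chosen for this example, and a real project would normally rely on a dedicated GIS library.

```python
import math

# Assumption for this sketch: the sphere radius used by Web Mercator,
# equal to the WGS 84 semi-major axis in metres.
EARTH_RADIUS_M = 6378137.0
# Web Mercator is conventionally cut off near this latitude, because the
# projection stretches towards infinity at the poles.
MAX_LATITUDE_DEG = 85.051129


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project geographic coordinates (degrees) to Web Mercator metres."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    if abs(lat_deg) > MAX_LATITUDE_DEG:
        raise ValueError("latitude is outside the range Web Mercator can show")
    # Easting grows linearly with longitude.
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    # Northing uses the Mercator stretch, which grows faster near the poles.
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


if __name__ == "__main__":
    for lon, lat in [(0.0, 0.0), (10.0, 45.0), (10.0, -45.0)]:
        x, y = lonlat_to_web_mercator(lon, lat)
        print(f"lon={lon:7.2f} lat={lat:7.2f} -> x={x:14.2f} m, y={y:14.2f} m")
```

Running the script shows that equal steps in latitude do not give equal steps in northing, which is one reason why scale varies across a Mercator map.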
activity: using and visualising spatial data

spatial data can be used in many ways, and there are many tools that you can use to manipulate and display spatial data.

https://www.youtube.com/watch?v= v bo _rhwi&feature=youtu.be
https://gis.usc.edu/blog/the-application-of-geographic-and-information-science-gis-in-earth-sciences/
http://spatial.ly/ / /climate-change-state-science/
http://www.ga.gov.au/news-events/news/latest-news/continental-scale-mapping-of-mineral-potential-wins-top-award
https://biocache.ala.org.au/explore/your-area
http://vcgi.vermont.gov/sites/vcgi/files/training/chapter_ .pdf
https://blog.openshift.com/finding-and-making-sense-of-geospatial-data-on-the-internet/
https://www.youtube.com/watch?v=lelnsbj vwo&t= s
http://desktop.arcgis.com/en/arcmap/latest/manage-data/raster-and-images/cell-size-of-raster-data.htm

you can try one of the tools below. do one, or do them all and compare the results.

1. free gis software options: map the world in open source
• browse through this site for ideas for free, open source geospatial software; the descriptions often include discipline specific advice. download one and try your hand at mapping.
2. spatial data visualisation with r: for those who have done the r modules in software carpentry - this might be a good activity to flex your r muscles! want more? here are some more r tutorials.
3. create a map using google fusion tables: this offers lots of features, but you need a google account. the excellent google fusion tutorial uses butterfly data to show you how to import data, map the data and customise your map.

the open geospatial consortium (ogc) is an international not-for-profit organization that develops open standards for the geospatial community. through its dedicated global members, ogc has developed several standards for sharing geospatial data.
some of the most commonly used standards are:
1. web map service (wms): a standard web protocol to query and access geo-registered static map images as a web service. the outputs are images that can be displayed in a browser application.
2. web feature service (wfs): a standard web protocol to query and extract the geographic features of a map; these are typically attributes of a map. the latest version of wfs ( . , dec ) has created a lot of excitement in the community.
3. web coverage service (wcs): provides access to geospatial information representing phenomena that are variable over space and time, such as satellite images or aerial photos. the service delivers a raster image that can be further interpreted and processed.

geoserver is the most popular open source reference implementation of wms, wfs and wcs standards.

consider: the data world is hungry for geospatial tools and metadata and there is growing demand for people with these skills. how can these skills be encouraged in your institution?
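As an illustration of what a WMS request looks like in practice, the sketch below assembles a GetMap URL from the standard WMS 1.3.0 query parameters. The server address and layer name are hypothetical placeholders, not a real service, and the helper function is an assumption made for this example; nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_wms_getmap_url(base_url, layer, bbox, width, height,
                         image_format="image/png"):
    """Return a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees. The sketch uses
    CRS:84, whose axis order is longitude then latitude.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bbox must be (min_lon, min_lat, max_lon, max_lat)")
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",          # empty value asks for the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,        # size of the returned image in pixels
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer name, used only to show the URL structure.
example_url = build_wms_getmap_url(
    "https://example.org/geoserver/wms",
    "demo:geology",
    (112.0, -44.0, 154.0, -10.0),
    800,
    600,
)

if __name__ == "__main__":
    print(example_url)
```

Pasting such a URL into a browser, with a real server and layer substituted, would return a static map image, which is the behaviour the WMS description above refers to. WFS and WCS requests follow the same key-value pattern with different SERVICE and REQUEST values.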
references:
ands (research data) things https://www.ands.org.au/working-with-data/skills/ -research-data-things/all
eco data things https://www.ands.org.au/__data/assets/pdf_file/ / / -eco-data-things_handout.pdf

https://gisgeography.com/free-gis-software/
https://www.r-bloggers.com/spatial-data-visualization-with-r- /
https://www.researchgate.net/publication/ _spatial_data_visualisation_with_r
http://pakillo.github.io/r-gis-tutorial/
https://support.google.com/fusiontables/answer/ ?hl=en&ref_topic=
http://www.opengeospatial.org/
http://www.opengeospatial.org/standards/wms
http://www.e-cartouche.ch/content_reg/cartouche/webservice/en/html/wfs_whatwfsis.html
https://medium.com/@cholmes/wfs- - -get-excited-yes- e fdbcc
http://www.opengeospatial.org/standards/wcs
http://geoserver.org/

top fair data & software things: biomedical data producers, stewards, and funders

sprinters: lisa federer (national library of medicine), douglas joubert (national institutes of health library), allissa dillman (national center for biotechnology information), kenneth wilkins (national institute of diabetes and digestive and kidney diseases), ishwar chandramouliswaran (national institute of allergy and infectious diseases), vivek navale (nih center for information technology), susan wright (national institute on drug abuse)

audience:
• biomedical researchers
• data stewards
• funding organizations

things

thing: metadata creation and curation

beginner activity:
1. learn about the various types of metadata. dataone defines metadata as "documentation about the data that describes the content, quality, condition, and other characteristics of a dataset.
more importantly, metadata allows data to be discovered, accessed, and reused" - dataone education module.
• descriptive
• technical
• administrative
• provenance
2. work through the dataone metadata educational module: lesson - metadata.
3. explore the use of controlled vocabularies and common data elements (cde). a cde is a "data element that is common to multiple data sets across different studies." the nih common data element (cde) resource portal has identified cdes for use in particular types of research or research domains after a formal evaluation and selection process.
• take the nih cde interactive tour to learn how to use the site.
• browse the cdes to explore how these might be used in your discipline.

https://librarycarpentry.org/top- -fair
https://github.com/informationista
https://github.com/doujoudc
https://twitter.com/dchackathons
https://www.niddk.nih.gov/about-niddk/staff-directory/biography/wilkins-kenneth
https://www.linkedin.com/in/ishwarc/
https://www.rd-alliance.org/users/vivek-navale
https://www.drugabuse.gov/about-nida/organization/divisions/division-basic-neuroscience-behavioral-research-dbnbr/office-director-od
https://www.ncbi.nlm.nih.gov/pmc/articles/pmc /
https://www.dataone.org/education-modules
https://www.nlm.nih.gov/cde/glossary.html#cdedefinition
https://www.nlm.nih.gov/cde/
https://cde.nlm.nih.gov/home?tour=yes

intermediate activity:
1. think about ways you can standardize minimal/core metadata to use across disciplines (for example, a crosswalk between standards).
2. automated metadata creation can "help improve efficiency in time and resource management within preservation systems, and alleviate the problems associated to the 'metadata bottleneck'".
3. review the digital curation centre (dcc) automated metadata generation primer page.
4.
download the dcc digital curation reference manual and think about the ways you might be able to automate metadata creation at your organization.
5. watch the alcts session : automating descriptive metadata creation: tools and workflows webinar, which examines workflows for automating the creation of descriptive metadata.

thing: use of standard data models

1. explore the omop common data model (cdm), which allows for the systematic analysis of disparate observational databases.
2. review one of the omop community meeting presentations and think about how this might align to the work of your organization.
3. familiarize yourself with one of the observational health data sciences and informatics github repositories.

thing: exploring unique, persistent identifiers

beginner activity:

globally unique and persistent identifiers remove ambiguity in the meaning of your published data by assigning a unique identifier to every element of metadata and every concept/measurement in your dataset (gofair).

1. explore the go fair f webpage to see examples of globally unique and persistent identifiers.
2. learn how a digital object identifier (doi) can be used to create a unique reference to your data. watch a video that explains what dois are, how they work, and how they benefit managers of digital content.
3. read the digital preservation handbook to learn about all of the elements that comprise a persistent identifier.
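The structure of a DOI can be explored with a few lines of code. The sketch below splits a DOI into its prefix (the part before the first forward slash, which identifies the registrant) and its suffix, and builds the corresponding resolver URL. The regular expression is a pragmatic assumption for illustration, not the full DOI specification, and the DOI used in the demonstration is a made-up placeholder.

```python
import re

# Pragmatic pattern (an assumption for this sketch): "10.", a numeric
# registrant code, a slash, then a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^(10\.\d{4,9})/(\S+)$")

# Common ways a DOI is written in references and web pages.
KNOWN_LEADS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def parse_doi(text):
    """Split a DOI (bare, 'doi:' prefixed, or a resolver URL) into (prefix, suffix)."""
    value = text.strip()
    lowered = value.lower()
    for lead in KNOWN_LEADS:
        if lowered.startswith(lead):
            value = value[len(lead):]
            break
    match = DOI_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a DOI: {text!r}")
    return match.group(1), match.group(2)


def doi_to_url(text):
    """Return the https://doi.org/ resolver URL for a DOI."""
    prefix, suffix = parse_doi(text)
    return f"https://doi.org/{prefix}/{suffix}"


if __name__ == "__main__":
    # Hypothetical DOI, used only to show the prefix/suffix split.
    prefix, suffix = parse_doi("doi:10.1234/example.dataset.v1")
    print("prefix:", prefix)
    print("suffix:", suffix)
    print("url:   ", doi_to_url("10.1234/example.dataset.v1"))
```

Because DOIs are case-insensitive, two DOI strings should be compared after converting both to the same case.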
https://cde.nlm.nih.gov/cde/search
https://www.ands.org.au/online-services/rif-cs-schema/crosswalks-transform-your-metadata
http://www.dcc.ac.uk/resources/curation-reference-manual/completed-chapters/automated-metadata-extraction
http://www.dcc.ac.uk/webfm_send/
http://www.ala.org/alcts/events/ac/ /vc-sess
https://www.ohdsi.org/data-standardization/the-common-data-model/
https://www.ohdsi.org/resources/presentations/community-meeting-presentations/
https://github.com/ohdsi
https://www.go-fair.org/fair-principles/f -meta-data-assigned-globally-unique-persistent-identifiers/
http://www.doi.org/driven_by_doi.html
https://www.dpconline.org/handbook/technical-solutions-and-tools/persistent-identifiers

intermediate activity:

orcid allows you to create persistent digital identifiers for authors.

1. create an orcid id.
2. link your orcid with crossref and datacite.
3. then, go through the steps included in the getting started with orcid integration guide.
4. test the orcid application programming interface (api).
5. as a best practice, use orcids from the start of data creation. for example, you can attach the data creator's name and orcid to a dataset as a metadata field. include orcids with datasets in repositories (e.g. in sequence read archive (sra), include the orcid for the data creator). this allows for the tracking of your research and enables citation of your data.
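Attaching an ORCID to a dataset is only useful if the identifier is typed correctly. The final character of an ORCID iD is a check character computed with the ISO 7064 MOD 11-2 algorithm, so a typo can often be caught before the metadata is stored. The sketch below validates an iD and then builds a creator entry; the dictionary layout is loosely modelled on DataCite-style creator metadata and should be read as an assumption for illustration, not as any repository's required format.

```python
def orcid_check_character(base_digits):
    """ISO 7064 MOD 11-2 check character for the first 15 digits of an ORCID iD."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the 16-character iD has the right shape and check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    base = compact[:15]
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == compact[15]


def creator_entry(name, orcid):
    """Build a creator metadata field (illustrative layout, see lead-in)."""
    if not is_valid_orcid(orcid):
        raise ValueError(f"invalid ORCID iD: {orcid!r}")
    return {
        "name": name,
        "nameIdentifiers": [
            {
                "nameIdentifier": f"https://orcid.org/{orcid}",
                "nameIdentifierScheme": "ORCID",
            }
        ],
    }


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
    print(creator_entry("Example Researcher", "0000-0002-1825-0097"))
```

Storing the full https://orcid.org/ form rather than the bare digits keeps the identifier resolvable wherever the metadata travels.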
thing: versioning and data "retirement"

beginner activity:

a source-code repository is a file archive and web hosting facility where a large amount of source code, for software, web pages, and other resources, is kept, either publicly or privately. advantages of versioning include:
1. persistence of identifiers pointing to different/earlier versions.
2. maintaining previous versions of code, software, and data.
3. sharing various levels of processed data (primary, secondary, or raw/clean/processed, etc.).
4. de-accessioning of data that has reached the end of its life cycle.

intermediate activity:
1. github is one of the most popular options for code hosting. explore alternative options for code hosting.
2. work through the library carpentry introduction to github module.

thing: linking research objects

beginner activity:
1. read the following article on managing digital research objects.
2. read the linking data crossref page.

intermediate activity:
1. using a github code repository or zenodo, try to find data that goes with a published paper. then answer some of the following questions:
• where is the data or code stored (for example, github repo or zenodo)?
• who created the objects (orcid)?
• was there proper documentation? license information (regarding commercial use)?

https://orcid.org/
https://orcid.org/register
https://orcid.org/members/ g c dneiaz-crossref
https://orcid.org/members/ g g qiuia -datacite
https://members.orcid.org/api/getting-started
https://orcid.org/content/register-client-application-sandbox
https://www.ncbi.nlm.nih.gov/sra
https://en.wikipedia.org/wiki/comparison_of_source-code-hosting_facilities
https://en.wikipedia.org/wiki/list_of_most_popular_websites
https://opensource.com/article/ / /github-alternatives
https://librarycarpentry.org/lc-git/
https://datascience.codata.org/articles/ . /dsj- - /
https://www.crossref.org/community/linking-data/
https://github.com/
https://zenodo.org/

thing: human and machine readability

1.
read about the fair principles for making your code both human and machine readable, and the fair guiding principles article.
2. read the following report jointly designing a data fairport from the lorentz center.
3. having code that is both human and machine readable supports:
• api access
• automatic integration of multiple datasets
• use of standard formats widely accepted in the discipline

thing: maintain/preserve entire research environment (e.g. software)

1. familiarize yourself with best practices for scientific computing. read good enough practices in scientific computing, and top metrics for life science software good practices to familiarize yourself with the topics of containers, software preservation, and software emulation.
2. read more about the long-term preservation of biomedical research data.

thing: indexing repositories to enable findability

1. re data.org is a global registry of research data repositories that covers research data repositories from different academic disciplines. register your dataset with re data.org.
2. schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data.
• read their getting started guide to get indexed with google.
3. link your orcid account to fairsharing.org, verify your email address, and create a public profile.
• familiarize yourself with their standards, databases, policies, and collections.

thing: broad consent

informed consent for human subjects should be broad enough to make reuse possible. see broad consent for research with biological samples: workshop conclusions. also see recommendations for broad consent guidance from the office for human research protections.

https://github.com/
https://zenodo.org/
https://www.force .org/fairprinciples
https://www.ncbi.nlm.nih.gov/pmc/articles/pmc /
https://www.lorentzcenter.nl/lc/web/ / /info.php ?wsid=
https://journals.plos.org/ploscompbiol/article?id= . /journal.pcbi.
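To make the schema.org step in the indexing Thing above more concrete, the following sketch builds a small schema.org Dataset description and prints it as a JSON-LD script block of the kind that can be embedded in a dataset landing page. Every value in the record (title, DOI, licence URL, names) is a hypothetical placeholder, and the helper function is an assumption for this example; the property names themselves (name, description, identifier, license, creator, sameAs) are standard schema.org terms.

```python
import json


def dataset_jsonld(name, description, doi_url, license_url,
                   creator_name, creator_orcid_url):
    """Return a minimal schema.org Dataset description as a dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": doi_url,
        "license": license_url,
        "creator": {
            "@type": "Person",
            "name": creator_name,
            "sameAs": creator_orcid_url,
        },
    }


# Placeholder values only; replace them with the details of a real dataset.
record = dataset_jsonld(
    name="Example sequencing run",
    description="Illustrative record showing machine-readable dataset metadata.",
    doi_url="https://doi.org/10.1234/example.dataset.v1",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    creator_name="Example Researcher",
    creator_orcid_url="https://orcid.org/0000-0002-1825-0097",
)

if __name__ == "__main__":
    print('<script type="application/ld+json">')
    print(json.dumps(record, indent=2))
    print("</script>")
```

The same record is readable by a person and parseable by a search engine crawler, which is the point made in the human and machine readability Thing.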
https://journals.plos.org/ploscompbiol/article?id= . /journal.pcbi.
https://f research.com/articles/ - /v
https://www.re data.org/about
https://schema.org/
https://schema.org/docs/gs.html
https://fairsharing.org/
https://www.ncbi.nlm.nih.gov/pmc/articles/pmc /
https://www.hhs.gov/ohrp/sachrp-committee/recommendations/attachment-c-august- - /index.html

thing: application of metrics to evaluate the fairness of (data) repositories

beginner activity:
1. explore the work of the fair metrics group. explore their proposed fair metrics.
2. read the following paper: evaluating fair-compliance through an objective, automated, community-governed framework.
3. explore the design framework for exemplar metrics for fairness.

intermediate activity:
1. explore the make data count project, where you can learn about counter code of practice as well as the code of practice for research data usage metrics.
2. learn how zenodo and dataone have responded to the make data count recommendations.
http://fairmetrics.org/ https://github.com/fairmetrics/metrics https://www.biorxiv.org/content/biorxiv/early/ / / / .full.pdf https://www.biorxiv.org/content/biorxiv/early/ / / / .full.pdf https://www.nature.com/articles/sdata https://makedatacount.org/ https://www.projectcounter.org/code-of-practice-sections/general-information/ https://www.projectcounter.org/code-of-practice-sections/general-information/ https://peerj.com/preprints/ / http://blog.zenodo.org/ / / / - - -usage-statistics/ https://www.dataone.org/news/new-usage-metrics top fair data & software things: biodiversity sprinters: silvia di giorgio, akinyemi mandela fasemore, konrad förstner, till sauerwein, eva seidlmayer (zb med - information center for life science, cologne, germany), ilja zeitlin, susannah bacon, chris erdmann (library carpentry/the carpentries,/california digital library) audience: researchers things findability thing : identifiers to make data findable, it has to be uniquely and persistently stored with an identifier. • a digital object identifier (doi) is a unique, case-insensitive, alphanumeric character sequence and can be very helpful for this purpose. you can reach the identified digital object by using the doi as a url. just fill in the doi in the address bar (e. g. https://doi.org/ . / . ). also, see: ands guide: digital object identifier (doi) system for research data. note: the distributing datacite-agency (i.e. issues dois) for life sciences is publisso: https://www.publisso.de/wir-fuer-sie/doi-service/ exercise: for easy look up, we have a list of dois below. can you match the right document to the appropriate doi? hint: start from here https://www.doi.org/! . . /physrev. . . . /bhl.title. 
• on the origin of species
• the particle problem in the general theory of relativity

https://librarycarpentry.org/top- -fair https://twitter.com/digiorgiosilvia https://sea-region.github.com/fasemoreakinyemi https://twitter.com/konradfoerstner https://twitter.com/tillsauerwein https://sea-region.github.com/evaseidlmayer https://rd-alliance.org/users/ilja-zeitlin https://twitter.com/ardcsbacon https://twitter.com/libcce https://www.ands.org.au/__data/assets/pdf_file/ / /digital-object-identifiers.pdf

which of these is not a valid doi?
. . /arc . . /fmicb. . . . /
hint: check the prefix (before the forward slash)!

which part indicates the publishing institution: the prefix or the suffix of a doi?

orcid exercise:
orcid is a persistent identifier that authors register for themselves to avoid author name ambiguity. use orcids from the start of data creation, i.e. attach the data creator's name and orcid to the dataset as a metadata field. include orcids with datasets in repositories (e.g. in the sequence read archive (sra), include the orcid of the data creator). this allows data provenance (the origins, custody, and ownership of research data) to be tracked. go through the getting started with orcid integration guide: https://members.orcid.org/api/getting-started

thing : citations
zenodo, for example, is a tool that makes scientific data and publications easier to cite. it supports various data and license types. it also supports source code from github repositories. see https://zenodo.org/

exercise:
* use the zenodo sandbox to upload an example dataset, software program, etc. https://sandbox.zenodo.org/

questions:
1. which metadata fields do you have to add when uploading data, and why?
2. which fields are mandatory and which ones are not?
3. what identifiers can you use?
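The prefix and suffix questions in the DOI exercise above can also be explored in code. The following is a minimal Python sketch, not an official validator. It assumes the common convention that a DOI prefix is `10.` followed by a numeric registrant code, and it relies on the case-insensitivity of DOIs stated earlier. The function names and the DOI `10.1234/example.5678` are hypothetical placeholders and are not taken from the exercise.

```python
import re

# Assumption: a heuristic pattern, not the full DOI specification.
# Prefix: "10." plus a 4-9 digit registrant code (optionally subdivided with dots).
# Suffix: any non-empty string without whitespace.
DOI_PATTERN = re.compile(r"^(10\.\d{4,9}(?:\.\d+)*)/(\S+)$")


def split_doi(doi):
    """Return (prefix, suffix) for a syntactically plausible DOI, else None."""
    match = DOI_PATTERN.match(doi.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def doi_url(doi):
    """Build the resolver URL that can be pasted into a browser address bar."""
    parts = split_doi(doi)
    if parts is None:
        raise ValueError("not a plausible DOI: " + repr(doi))
    prefix, suffix = parts
    return "https://doi.org/" + prefix + "/" + suffix


def same_doi(first, second):
    """Compare two DOIs without regard to case, since DOIs are case-insensitive."""
    return first.strip().lower() == second.strip().lower()


# Hypothetical placeholder DOIs, used only to exercise the functions.
print(split_doi("10.1234/example.5678"))
print(split_doi("11.1234/example.5678"))
print(doi_url("10.1234/example.5678"))
print(same_doi("10.1234/EXAMPLE.5678", "10.1234/example.5678"))
```

A check like this only tests the shape of the string. Whether a DOI is actually registered can only be confirmed by resolving it.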
https://members.orcid.org/api/getting-started https://zenodo.org/ https://sandbox.zenodo.org/ uploading to zenodo (sandbox) thing : wikidata wikidata provides a common source of open data which can be used by wikimedia projects such as wikipedia, and by anyone else, under a public domain license. exercise: go to wikidata and find the publication date of the book “on the origin of species”. • switch over to the linked dataset of the author of the book and see his other publications. • what did he publish in ? thing : registry of research data repositories (re data) this project aims to accelerate scientific discovery and enhance the integrity, transparency, and reproducibility of data. to enable fair data sharing, data need to be deposited in a repository that is taking steps to make data as open and fair as possible. it's not clear-cut what is fair at this time, there is no such thing as a fair stamp - although the coretrustseal certification provides a good indication. therefore, under the auspices of the enabling fair data project, american geophysical union (agu), re data, and datacite, https://www.wikidata.org/wiki/wikidata these organisations have decided to develop new tools to assist researchers with finding an appropriate repository for their data: • browse subject repositories • repository finder exercise: . how many entries are returned for the query specific for your research topic on re data? . if you filter under "subject", what do you find? . do you think something is missing from the results? if so, suggest a repository. try the "browse by subject" entry to the re data-database since this gives a great overview on the wide landscape of research data repositories: https://www.re data.org/browse/by- subject/ accessibility thing : bioschemas bioschemas.org aims to improve data interoperability in the life sciences. 
it does this by encouraging people in the life sciences to use schema.org markup, so that their websites and services contain consistently structured information (metadata). this structured information then makes it easier to discover, collate and analyse distributed data. exercises can be found on the bioschemas website under "tutorials" and "how to".
• https://bioschemas.gitbook.io/training-portal/

thing : licenses
knowing the appropriate licenses to use for your data can help others understand how they can use your data and can also help with improving accessibility.
• open source licenses: https://opensource.org/licenses
• data and creative commons licenses: https://wiki.creativecommons.org/wiki/data_and_cc_licenses
• how to license research data: http://www.dcc.ac.uk/resources/how-guides/license-research-data

exercise: use the creative commons license tool (https://creativecommons.org/choose) to select the appropriate license for the following intention: allow your work to be adapted and also allow it to be used commercially.

https://www.re data.org/browse/by-subject/ https://repositoryfinder.datacite.org/ https://www.re data.org/ https://www.re data.org/suggest http://bioschemas.org/ https://schema.org/

thing : availability via torrents
the era of big data is finally upon us. a prerequisite for accessibility is availability. well-established sharing protocols like torrents help keep data available without the constraints of time and space.
using the torrent protocol for scientific data offers the following advantages:
• immutability
• distribution capabilities (lower cost for distributing the data)
• no sole maintainer (we don’t have to rely on one specific maintainer, because data can be cloned and maintained across the peer network)

the magnet uri scheme defines the format of magnet links, a de facto standard for identifying files by their content, via a cryptographic hash value, rather than by their location. including a magnet link directly in the publication makes the data accessible to readers.

for more information, read:
• academic torrents: http://academictorrents.com/
• magnet uri scheme: https://en.wikipedia.org/wiki/magnet_uri_scheme

exercise:
1. upload any small data set of your choice using the academic torrents link above.
2. share with a colleague a link to access it over torrent.

interoperability:

thing : elixir platforms
standardisation of life science data helps ensure interoperability across different subfields. elixir is an intergovernmental organisation that brings together life science resources from across europe.
• elixir interoperability platform: https://www.elixir-europe.org/platforms/interoperability

exercise: use the elixir software bio.tools (https://bio.tools/) to find the author of the rna-seq python pipeline "reademption".

thing : research data management
bio rdf is a large network of linked data for the life sciences. the database provides interlinked life science data using semantic web technologies. to learn more about bio rdf, read bio rdf: towards a mashup to build bioinformatics knowledge systems (https://www.ncbi.nlm.nih.gov/pubmed/ ).
• http://bio rdf.org/

the german federation for biological data (gfbio) is the authoritative, national contact point for issues concerning the management and standardisation of biological and environmental research data during the entire data life cycle (from acquisition to archiving and data publication).
gfbio mediates expertise and services between the gfbio data centers and the scientific community, covering all areas of research data management.
• https://www.gfbio.org/

thing : machine-readability
make the data accessible via an api, in a structured data format that can be automatically read and processed by a computer. see the open data handbook glossary - machine readable: http://opendatahandbook.org/glossary/en/terms/machine-readable/

exercise - crossref:
1. pick the doi of a publication of your choice.
2. open a web browser and enter the url below.
3. https://api.crossref.org/works/doi <= replace doi with the doi of the publication.
example: https://api.crossref.org/works/ . /journal.pcbi.

exercise - datacite:
1. pick the doi of a dataset in zenodo.
2. open https://api.datacite.org/works/doi <= replace doi with the doi of the zenodo entry.
example: https://api.datacite.org/works/ . /zenodo.

reusability

thing : digitalization
if the methods to record complex experiments are prone to error, so that reproducible results cannot be guaranteed, how can you ever be sure you’re dealing with real insights and not random information? the electronic lab notebook provides the missing infrastructure for data recording, retrieval and integrity. an electronic lab notebook must be able to create, import, store and retrieve all important data types in digital format.

for more information, read:
• kanza, samantha et al. “electronic lab notebooks: can they replace paper?” journal of cheminformatics vol. , . may. , doi: . /s - - -
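The Crossref exercise under machine-readability can also be done from a script instead of a browser. The following is a minimal Python sketch using only the standard library. The function names are hypothetical, the DOI `10.1234/example.5678` is a made-up placeholder, and the sample payload is a trimmed-down illustration of the JSON shape (a `message` object with a `DOI` string and a `title` list), not a real record.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi):
    """Build the API URL from the exercise by appending the DOI to the endpoint."""
    # Keep "/" unescaped so the URL has the same form as the example in the text.
    return CROSSREF_WORKS + quote(doi, safe="/")


def fetch_work(doi, timeout=10):
    """Download and decode the JSON record for a DOI (requires network access)."""
    with urlopen(crossref_url(doi), timeout=timeout) as response:
        return json.load(response)


def summarize(payload):
    """Pull the DOI and first title out of a decoded response."""
    message = payload.get("message", {})
    titles = message.get("title") or [""]
    return {"doi": message.get("DOI", ""), "title": titles[0]}


# Hypothetical sample standing in for a live response, so the sketch runs offline.
sample = json.loads(
    '{"status": "ok", "message": {"DOI": "10.1234/example.5678", '
    '"title": ["A hypothetical title"]}}'
)
print(crossref_url("10.1234/example.5678"))
print(summarize(sample))
```

To query a real publication, call `summarize(fetch_work(doi))` with a DOI of your choice. The point of the exercise is that the same record a person reads in the browser is structured data a program can process without manual steps.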
https://www.ncbi.nlm.nih.gov/pmc/articles/pmc / https://www.ncbi.nlm.nih.gov/pmc/articles/pmc / • electronic lab notebook matrix exercise: explore the demo lab notebook at https://demo.elabftw.net/experiments.php thing : containers in a scientific field, most of the time we have to deal with large amounts of data that have to be processed before publication. one important aspect of the reproducibility challenge is ensuring computational analysis can be reproduced, even in different environments. for more information, read: • grüning, björn, et al. "practical computational reproducibility in the life sciences." cell systems . ( ): - . exercise: learn docker & containers using interactive browser-based scenarios: https://www.katacoda.com/courses/docker thing : blockchain for life science blockchain technology has the potential to be a technical solution to the current reproducibility crisis in science, and could "reduce waste and make more research results true". see: • mapping the blockchain for science landscape • blockchain for science and knowledge creation: a technical fix to the reproducibility crisis? living document example: see blockchain for open science – the living document: https://www.blockchainforscience.com/ / / /blockchain-for-open-science-the-living- document/ supplementary information: research data infrastructure for the life sciences (nfdi life) nfdi life brings together research communities across the life sciences domain in the context of the planned national research data infrastructure (nfdi). 
as a response to the increasing scientific and societal demand for data and data analysis, nfdi life brings together scientific communities and research data infrastructures broadly covering the life sciences with particular focus on the subdomains biology, medicine (with veterinary https://datamanagement.hms.harvard.edu/electronic-lab-notebooks https://demo.elabftw.net/experiments.php https://www.cell.com/cell-systems/pdf/s - ( ) - .pdf https://www.cell.com/cell-systems/pdf/s - ( ) - .pdf https://www.katacoda.com/courses/docker https://hackernoon.com/mapping-the-blockchain-for-science-landscape- b bfbd https://zenodo.org/record/ /files/zenodoblockchainforscienceknowledgecreation.pdf https://zenodo.org/record/ /files/zenodoblockchainforscienceknowledgecreation.pdf https://www.blockchainforscience.com/ / / /blockchain-for-open-science-the-living-document/ https://www.blockchainforscience.com/ / / /blockchain-for-open-science-the-living-document/ medicine), epidemiology, nutrition, agricultural and environmental science as well as biodiversity research. • https://www.nfdi life.de/ carpentries community the carpentries develops and teaches workshops on the fundamental data skills needed to conduct research. • https://carpentries.org/ go-fair-initiative go fair follows a bottom-up open implementation strategy for the european open science cloud (eosc) as part of a broader global internet of fair data & services. • https://eosc-portal.eu/ • https://www.go-fair.org/ fairdom fairdom supports researchers, students, trainers, funders and publishers to make their data, operating procedures and models, findable, accessible, interoperable and reusable (fair). 
• https://fair-dom.org/about-fairdom/ https://www.nfdi life.de/ https://carpentries.org/ https://eosc-portal.eu/ https://eosc-portal.eu/ https://fair-dom.org/about-fairdom/ top fair data & software things: australian government data/collections sprinters: katie hannan, data librarian (csiro), richard ferrers, research data analyst (ardc), keith russell, manager engagements (ardc) fair data see ardc image summarising what fair means; see also force definition. figure ; fair in a nutshell. image: ardc - cc-by . . description: governments have a mandate to make non-sensitive data open. for example, the australian government public data policy statement says “australian government entities will ... make non-sensitive data open by default...make high value data available for use by the public, https://librarycarpentry.org/top- -fair http://orcid.org/ - - - https://twitter.com/valuemgmt https://www.rd-alliance.org/users/kgrussell https://www.ands.org.au/__data/assets/image/ / /fair-data-image-map-graphic-v - px.png https://www.force .org/group/fairgroup/fairprinciples https://www.pmc.gov.au/resource-centre/public-data/australian-government-public-data-policy-statement industry and academia... ensure non-sensitive publicly funded research data is made open for use and reuse... to extend the value of public data for the benefit of the australian public.” fair data is a way to extend the value of data. the largest nations, the g , agreed to make open data principles a priority at the meeting in turkey, saying “transparency... global transformation, facilitated by technology, fuelled by data and information.. open data is at the center of this global shift.” (p. ). audience: government data custodians goal: help government data custodians to understand fair data principles nb: nomenclature and data: where “data” is used here, we also mean collections such as cultural collections, historical collections, documents, artefacts and other valuable collections. table of contents . 
thing - why is data important? . thing - open data vs fair data . thing - data discovery (f) . thing - describing your data (fai) . thing - identifiers (f) . thing - licensing (r) . thing - dirty data (r) . thing - sensitive data (a) . thing - vocabularies (i) . thing - data impact (r) things thing : why is data important? read g , australian and states policies on open data http://www.g .utoronto.ca/ /g -anti-corruption-open-data-principles.pdf figure ; data sharing drivers source: katie hannan, , cc-by. beginner activity: international g : open government forum; g turkey . “transparency... global transformation, facilitated by technology, fuelled by data and information.. open data is at the center of this global shift.” (p. ) read and consider g open data principles. familiarise yourself with your state or territories data policy. see links in appendix . australia * public data policy statement office of the australian information commissioner: principles on open public sector information * principle : open access to information — a default position “information held by australian government agencies is a valuable national resource. if there is no legal need to protect the information it should be open to public access.” * principle : effective information governance “ensuring agency compliance with legislative and policy requirements on information management and publication” * principle : robust information asset management * principle : discoverable and useable information “ensure that information published online is in an open and standards-based format and is machine-readable” “attach high quality metadata to information so that it can be easily located and linked to similar information using standard web search applications” * principle : clear reuse rights “the economic and social value of public sector information is enhanced when it is made available for reuse on open licensing terms.” see appendix for a list of australian state open data policies. 
intermediate activity: the following legislation may apply to the management of government data: mailto:katie.hannan@csiro.au http://www.g .utoronto.ca/ /g -anti-corruption-open-data-principles.pdf https://www.pmc.gov.au/public-data/public-data-policy https://www.oaic.gov.au/information-policy/information-policy-resources/principles-on-open-public-sector-information https://www.oaic.gov.au/information-policy/information-policy-resources/principles-on-open-public-sector-information • archives act - https://www.legislation.gov.au/details/c c • freedom of information - http://my.csiro.au/support-services/legal/foi.aspx • privacy - http://my.csiro.au/support-services/legal/privacy-law.aspx • australian government intellectual property rules - https://www.communications.gov.au/policy/policy-listing/australian-government- intellectual-property-rules • records disposal authority - an agency-specific records authority may have advice that you need to follow. find your agency here - http://www.naa.gov.au/information- management/records-authorities/types-of-records-authorities/agency-ra/index.aspx • new australian government sharing and release legislation (open for public comment, shows where legislation is going): https://www.pmc.gov.au/sites/default/files/publications/australian-government-data- sharing-release-legislation_issues-paper.docx advanced activity: if your organisation doesn’t have a policy on open data, who are the key stakeholders that you would need to work with to prepare an open data policy? what main headings would you need to include as part of your data policy? thing : open data vs fair data read https://www.go-fair.org/faq/ask-question-difference-fair-data-open-data/ can you think of examples of data you deal with that cannot be made open but can be made fair? list some advantages in making this data fair. does the current wording in the policy for open data encourage making the data fair? where do you see gaps? 
see slide here https://www.slideshare.net/sjdcc/open-fair-data-and-rdm beginner activity: see how geoscience australia implement the fair data principles in their work. geoscience australia describe themselves as “the nation's trusted advisor on the geology and geography of australia” (ga ). advanced activity: how fair is your data? - https://www.ands-nectar-rds.org.au/fair-tool suggest using this now, and then finishing off the modules, making some changes to a data collection and then testing again using the fair data tool. thing : data discovery • what’s a data repository? • what’s a data portal? • where to find data? • where to store data? • data.gov.au (and search.data.gov.au!) - find - this is an aggregator see https://data.gov.au/dataset/list-of-australian-government-data-portals for a list of australian government data portals (current as of march ). some other data portals appear on https://data.gov.au/harvest. • csiro dap - find/store • national map - find • re data.org - registry of research repositories (etc) international government data portals: • united kingdom - https://data.gov.uk/ • new zealand - https://www.data.govt.nz/ • canada - https://open.canada.ca/en/open-data • united states of america - https://www.data.gov/ • india - https://data.gov.in/ • finland - https://vm.fi/en/opendata • singapore - https://data.gov.sg thing : describing your data or collection • including a description of data. what should go in a description? • what makes a good description? see ands content providers guide on descriptions -> best practice -> writing good descriptions • write the description for a reader who has a general familiarity with a research area but is not a specialist—this will make data more accessible for cross-disciplinary use. • don't use specialist acronyms or obscure jargon. • don't assume a reader has specialist knowledge. 
some reusable content here - https://ecu.au.libguides.com/ -marine-science-rdm- things/thing https://data.csiro.au/dap/home?execution=e s https://nationalmap.gov.au/ https://documentation.ands.org.au/display/doc/description beginner activity: read a data description on data.gov.au eg arts victoria, abc or research data australia eg national archive of australia, australian antarctic data centre, csiro (commonwealth scientific and industrial research org), geoscience australia. reflection: could you understand the description? can you think of someone for whom this data or collection would be useful? was it clear where to go next to access the data, or to ask for more information about this data or collection? what else would you like to know about this data/collection? activity: post your questions or responses to the reflection above to: the data custodian, or the comments section at data.gov.au. intermediate activity; if you are a data custodian/researcher, consider your five most important datasets, that you have contributed to or that you manage. pick the most important dataset to describe. . start with: title, author, year, institution, location/url. this is the minimum description required to get a doi (a permanent identifier). the url for a doi is the home page for the dataset description. if you don’t have one, make a person’s contact the url. • (hint: if you get stuck with the description, copy the abstract of a paper or conference paper or annual report, which uses or references your dataset. edit the abstract to talk only about the data.) q: what type of data identifier does a government data custodian have? . add more rich description to your data description eg subjects, grant ids (where applicable - rda; the australian national data catalogue, has permanent urls for australian arc and nhmrc grants). include a significant statement about why the dataset is important. . ask a colleague in a related field if they can understand your description. 
this helps make the description readable by someone who is not deeply knowledgeable in your field.

advanced activity: publish your data description on your resume, especially if it is online, e.g. linkedin (https://www.linkedin.com/). send your data description to your data librarian for addition to your institutional repository or data portal. alternatively, post your description to a public cloud service such as zenodo (https://zenodo.org/), figshare (https://figshare.com/) or data dryad (https://datadryad.org/). no data need be included. a description record is valuable in itself because it reveals the existence of data that was previously unknown and inaccessible.

https://www.data.gov.au/organization/artsvictoria https://www.data.gov.au/organization/australianbroadcastingcorporation https://researchdata.ands.org.au/contributors/national-archives-of-australia https://researchdata.ands.org.au/contributors/australian-antarctic-data-centre https://researchdata.ands.org.au/contributors/commonwealth-scientific-and-industrial-research-organisation https://researchdata.ands.org.au/contributors/geoscience-australia https://researchdata.ands.org.au/grants

thing : identifiers
to make data findable, it has to be uniquely and persistently identified. a digital object identifier (doi) is a unique, case-insensitive, alphanumeric character sequence and can be very helpful for this purpose. see also the ands guide: digital object identifiers (doi) system for research data (https://www.ands.org.au/__data/assets/pdf_file/ / /digital-object-identifiers.pdf). see who mints ands dois, including nsw office of heritage and environment, bureau of meteorology, csiro, geoscience australia, dept of environment.
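The minimum description from the intermediate activity above (title, author, year, institution, location/URL) can be checked automatically before a record is sent off for a DOI. The following is a minimal Python sketch. The field names, the function name and the example record are hypothetical and do not follow any particular repository's schema.

```python
# Assumption: field names chosen for illustration, mirroring the minimum
# description listed in the intermediate activity.
REQUIRED_FIELDS = ("title", "author", "year", "institution", "url")


def missing_fields(record):
    """Return the required fields that are absent or blank in a description record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical record with one blank field.
record = {
    "title": "Hypothetical survey dataset",
    "author": "A. Custodian",
    "year": 2019,
    "institution": "",
    "url": "https://example.org/datasets/survey",
}
print(missing_fields(record))
```

An empty result means the record carries the minimum description. Richer fields such as subjects or grant ids can then be added on top.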
types of persistent identifiers:
• doi
• handle
• igsn

videos
watch the video persistent identifiers and data citation explained by research data netherlands - https://youtu.be/pgqtiy oz k

read about persistent identifiers on a very general level (awareness): https://www.ands.org.au/guides/persistent-identifiers-awareness

a doi requires five fields: author, title, year, publisher, and url of the doi landing page.

beginner activity: visit http://www.doi.org/ and try resolving these doi numbers: . / bf ea a . / b b c

thing : licensing
see the licensing guide: https://www.ands.org.au/guides/research-data-rights-management

what is the appropriate licence for data produced by a government agency? refer to the australian government data statement (https://www.pmc.gov.au/public-data/public-data-policy): “at a minimum, australian government entities will publish appropriately anonymised government data by default: ...under a creative commons by attribution licence (ie cc_by licence) unless a clear case is made to the department of the prime minister and cabinet for another open licence.”

specific cc licences, which require dpc approval, include nc - non-commercial, sa - share alike, and the very restrictive nd - no derivatives allowed (not recommended by ands).

examples of licensing statements: http://www.bom.gov.au/waterdata/index.shtml?selected=copyright

thing : dirty data
why is “clean” data important? public policy, changes to medical protocols and economic decisions all depend on accurate and complete data. see further the ecu resource, which looks at the why and what of “dirty data”: https://ecu.au.libguides.com/ -marine-science-rdm-things/thing

beginner activity: read this case study. the data retriever automates the tasks of finding, downloading, and cleaning up publicly available data, and then stores them in a variety of databases and file formats. this lets data analysts spend less time cleaning up and managing data, and more time analysing it.
https://frictionlessdata.io/articles/the-data-retriever/ • bad data guide - https://github.com/quartz/bad-data-guide • releasing data or statistics in spreadsheets - http://www.clean-sheet.org/ • how to share data with a statistician - https://github.com/jtleek/datasharing • a gentle introduction to data cleaning - https://schoolofdata.org/courses/#introdatacleaning • tidy data for librarians - https://librarycarpentry.org/lc-spreadsheets/ advanced activity: • open refine - https://librarycarpentry.org/lc-open-refine/ • clean your data: getting started with openrefine [video] - https://www.youtube.com/watch?v=wgvtycv ss thing : working with sensitive data what is sensitive data? fair data doesn’t need to be published as open data. see thing . reuse: https://www.ands.org.au/working-with-data/skills/ -research-data-things/ - medical-and-health-things/m-and-h-thing- useful resource: csiro data the de-identification decision-making framework - https://publications.csiro.au/rpr/download?pid=csiro:ep &dsid=ds indigenous knowledge: issues for protection and management - https://www.ipaustralia.gov.au/sites/g/files/net /f/ipaust_ikdiscussionpaper_ march . pdf additional resources (from library-research-support-top- -fair-things_draft) • despite being written for human research ethics committees, the ands human research ethics committees guide is a handy overview for people interested in making personal data fair: https://www.ands.org.au/__data/assets/pdf_file/ / /hrec_guide.pdf key points: “.....” • nhmrc national statement on ethical conduct of human research ( ) - ch . 
element https://nhmrc.gov.au/about-us/publications/national-statement-ethical-conduct-human-research- -updated- #element_ __data_collection_and_management key points: “....”
• guiding principles for ethical research (u.s. national institutes of health) - https://www.nih.gov/health-information/nih-clinical-research-trials-you/guiding-principles-ethical-research

thing : vocabularies - assisting with interoperability

beginner activity: controlled vocabularies for data description
in addition to selecting a metadata standard or schema, whenever possible you should also use a controlled vocabulary. a controlled vocabulary provides a consistent way to describe data: location, time, place name, and subject. controlled vocabularies significantly improve data discovery. they make data more shareable with researchers in the same discipline because everyone is ‘talking the same language’ when searching for specific data, e.g. plants, animals, medical conditions, places, etc.

start by browsing controlling your language: a directory of metadata vocabularies from jisc in the uk. make sure you scroll down to . conclusion - it’s worth a read.

advanced activity: have a browse around the stunning level of data description and data contained in the atlas of living australia.
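The idea of everyone ‘talking the same language’ can be illustrated with a small script that maps free-text keywords onto preferred terms. The following is a minimal Python sketch. The vocabulary, its variants and the function names are hypothetical and far smaller than any real controlled vocabulary.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
VOCABULARY = {
    "plants": {"plant", "flora", "vegetation"},
    "animals": {"animal", "fauna", "wildlife"},
    "places": {"place", "location", "locality"},
}


def normalise_term(term):
    """Map a free-text keyword to its preferred term, or None if it is unknown."""
    key = term.strip().lower()
    for preferred, variants in VOCABULARY.items():
        if key == preferred or key in variants:
            return preferred
    return None


def normalise_keywords(keywords):
    """Split keywords into preferred terms (deduplicated, order kept) and unknowns."""
    accepted, rejected = [], []
    for keyword in keywords:
        preferred = normalise_term(keyword)
        if preferred is None:
            rejected.append(keyword)
        elif preferred not in accepted:
            accepted.append(preferred)
    return accepted, rejected


accepted, rejected = normalise_keywords(["Flora", "wildlife", "fauna", "minerals"])
print(accepted)  # ['plants', 'animals']
print(rejected)  # ['minerals']
```

Keywords that land in the rejected list are the ones a searcher using the shared vocabulary would not find. They are candidates for rewording or for proposing a new term to the vocabulary maintainers.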
other examples: * geosciences australia - http://ldweb.ga.gov.au/def/voc/ga/ * national environmental information infrastructure - http://www.neii.gov.au/vocabulary/vocabulary- providers * australian governments' interactive functions thesaurus (agift) - http://www.naa.gov.au/information-management/managing-information-and- records/describing/agift/index.aspx (of interest to australian government linked open data working group) http://www.ala.org.au/ http://www.ala.org.au/ http://www.linked.data.gov.au/ data dictionaries standardised, accepted terms and protocols used for data collection • australian institute of health and welfare - http://meteor.aihw.gov.au/content/index.phtml/itemid/ • australian business register - https://abr.gov.au/for-government-agencies/accessing- abr-data/abr-data-dictionary/ • health.vic - https://www .health.vic.gov.au/about/reporting-planning-data/data- dictionaries • south australian electronic forms data dictionary - https://www.sa.gov.au/editors/electronic-forms-platform/data-dictionary • growing up in australia data dictionary - https://growingupinaustralia.gov.au/data- and-documentation/data-dictionary • department of social services settlement database data dictionary - https://www.dss.gov.au/our-responsibilities/settlement-services/programs- policy/settlement-services/settlement-reporting-facility/help-for-settlement- reports/data-dictionary thing data impact: data reuse - it is hard to check/track when you don’t have persistent identifiers and there’s not much of a data citation culture. web stats selected data.gov.au web analytics - https://search.data.gov.au/dataset/ds-dga- fa bfda- b - - a - af b/details?q=data.gov.au some old uses of open data: https://data.gov.au/showcase use in govhack(au) - https://twitter.com/govhackau?lang=en tracking identifiers - data citation beginner activity: looking at the broader impact of how the data has been used and the benefits it has brought to society, industry, economy, etc. 
is a richer source of impact evidence than just looking at citations. https://www.ands.org.au/working-with-data/articulating-the-value-of-open-data/data-engagement-and-impact

postscript: other topics to consider:
• data people - data technologists, data librarians, data trainers, data leaders, data scientists
• data governance - policy, procedure, planning, improving systems, request funding, build business cases for change
• data training - when: induction, checkups, when problems occur; what: store, describe, how and why to do data; advanced topics, e.g. sensitive data, spatial data, vocabularies, provenance.

see for example slide in this data readiness slideshow as well as the th edition of share.
https://www.slideshare.net/richardferrers/the-national-eresearch-and-data-management-landscape-cdu-data-readiness-training-nov-
https://www.ands.org.au/news-and-events/share-newsletter/share-
people in data

references:
• government data links
• public records office victoria

appendix: list of australian state/territory government open data policies:
australian federal government: refer policy at dept of prime minister and cabinet. see also national data commissioner, “responsible for implementing a simpler data sharing and release framework”.
victoria data access policy “the victorian government recognises the benefits from and encourages the availability of victorian government data for the public good.
the datavic access policy has been developed to support this recognition.”
new south wales policy (nsw) “the objectives of this policy are to assist nsw government agencies to: release data for use by the community, research, business and industry; accelerate the use of data to derive new insights for better public services; embed open data into business-as-usual...”
queensland policy
tasmania policy
south australia policy
western australia policy
australian capital territory policy
northern territory policy (darwin)

https://toolkit.data.gov.au/index.php/main_page
https://www.prov.vic.gov.au/about-us/partnerships-and-collaborations/open-data
https://www.data.vic.gov.au/policy-and-standards-
https://www.finance.nsw.gov.au/ict/resources/nsw-government-open-data-policy
https://www.oic.qld.gov.au/publications/policies/open-data-strategy
http://www.egovernment.tas.gov.au/stats_matter/open_data/tasmanian_government_open_data_policy
https://digital.sa.gov.au/resources/topic/open-data/open-data-declaration
https://data.wa.gov.au/open-data-policy
http://www.cmd.act.gov.au/__data/assets/pdf_file/ / / -proactive-release-of-data-open-data-policy.pdf
https://www.darwin.nt.gov.au/sites/default/files/publications/attachments/policy_no_ _-_open_data.pdf

top fair data & software things: archaeology

sprinters: deidre whitmore, tim dennis (ucla)

description: this guide brings concepts surrounding fair data principles and the (research data) things program to the archaeological research domain with the aim of fostering better data practices and stewardship throughout the discipline.

audience: researchers, scholars, employees, students, volunteers -- anyone working with or around data collected for archaeological research and management.

how to use this guide?
you don’t have to do all of the things, and in fact, you may not be able to do every thing. however, familiarize yourself with each thing and implement those which suit your work and interests.
try to schedule time to learn more about a thing regularly and work through how you could integrate it into your own research practices.

why this guide?
archaeological data is costly to collect, difficult or impossible to re-collect, and frequently lacks the context or documentation needed for reuse. because of this, the domain has not yet coalesced around standards, though guidelines and data services are gaining traction. this guide introduces these services and calls out resources that can facilitate the adoption of leading practices.

data in archaeology:
archaeologists collect and work with a wide range of data types: textual, visual (raster, vector), tabular (spreadsheets, databases), spatial, audio, 3d, etc. this makes the creation and adoption of standards surrounding data management challenging but also even more necessary, as these varied types frequently need to be analyzed together and shared among collaborators.

https://librarycarpentry.org/top- -fair https://github.com/deidrewhitmore https://github.com/jt den

after working through the things below you’ll know how to:
• plan and prepare for data collections so that the data that are collected are fair
• document collection processing analyses to support fair data
• draft and refine a data model
• find training or data specialists that can assist you in your work
• identify the multiple roles in the interdisciplinary project
• plan for a field season that integrates best practices for data management
• cite data, publish your data so that it can be cited, and explain why it is important to do so
• write a good data management plan
• identify the major data repositories in archaeology
• reference the guides to good practice and know when to do so (at the start of a project and prior to collecting data!)
• evaluate tools that exist and can be used for humanities data

things

thing : understanding the lifecycle of research data

getting started
* read planning for the creation of digital data in the digital antiquity guides to good practice.
* consider the types of data collected and used within your own work. how many file formats do you work with regularly? how many files have become inaccessible to you over the years? to your colleagues or collaborators?

learn more
* watch the short film on the lifecycle of research data at https://www.ukdataservice.ac.uk/manage-data/lifecycle.
* map out the lifecycle of data on your most recent project. what processes and workflows have gotten you to the stage you are at currently? what can you do to facilitate the ongoing use and reuse of your data?

challenge me
* read project documentation and project metadata in the digital antiquity guides to good practice.
* draft documentation for your most recent project or a forthcoming project. include information about the background, methodology employed or to be employed, a narrative on the site and its context (historically, archaeologically, culturally, etc.). this documentation will not only facilitate the eventual dissemination of your data but also any proposals or publications about the work itself.
* review the metadata for this project, document in a single location what metadata you currently record or plan to record, and compare it to the metadata tables at http://guides.archaeologydataservice.ac.uk/g gp/createdata_ - . are you missing any project metadata? file-level metadata (general and technical)? how can you fill in any gaps?
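
the metadata review in the last challenge can be partly automated. the python sketch below compares the metadata fields a project currently records against a checklist and lists the gaps. the checklist `REQUIRED_PROJECT_FIELDS` and the example record are hypothetical placeholders, not the fields from the guides to good practice; swap in the fields from whichever metadata tables you are comparing against.

```python
# hypothetical checklist of project-level metadata fields; replace these
# with the fields from the metadata tables you are comparing against
REQUIRED_PROJECT_FIELDS = frozenset(
    {"title", "description", "creator", "dates", "location", "methodology"}
)


def find_missing_fields(recorded, required=REQUIRED_PROJECT_FIELDS):
    """return the required fields that are absent or empty in `recorded`."""
    # a field counts as present only if it has a non-empty value;
    # field names are compared case-insensitively
    present = {
        name.strip().lower()
        for name, value in recorded.items()
        if value is not None and str(value).strip()
    }
    return sorted(set(required) - present)


# made-up example: metadata currently recorded for a project
recorded_metadata = {
    "Title": "example survey",
    "creator": "j. doe",
    "dates": "",  # recorded as a field but left empty, so it counts as a gap
    "location": "example valley",
}

print(find_missing_fields(recorded_metadata))
# prints ['dates', 'description', 'methodology']
```

the same function can be run once per metadata level (project, file-level general, file-level technical) by passing a different `required` set each time.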
thing : preservation

getting started
* browse the websites for archaeological data repositories and preservation services: archaeology data service (http://archaeologydataservice.ac.uk/), tdar (https://www.tdar.org/about/), and open context (https://opencontext.org/).
* identify which service(s) contain data of interest to your work. get familiar with searching the services.
* read why deposit data (http://archaeologydataservice.ac.uk/deposit/why.xhtml) and consider what is significant about your data, what requirements you need to meet, and which reasons resonate with your work and beliefs.

learn more
* dig into the deposit instructions and criteria for each repository and service and identify which is the best fit for your own data.
* contact the service and discuss your project and data with them. document their recommendations and determine how you can update your current workflow to support deposit.

challenge me
* select a dataset you can deposit and go through the process of depositing in a repository.

thing : training and community

getting started
* review online resources and training materials for archaeological data management such as datatrain's 'open access post-graduate teaching materials in managing research data in archaeology' at http://archaeologydataservice.ac.uk/learning/datatrain.xhtml.

learn more
* attend a workshop at an upcoming archaeology conference that focuses on data management or a session on the topic.

challenge me
* attend a conference or program on data and scholarly communication such as force 's scholarly communication institute (https://www.force .org/fsci) and/or asist's annual meeting (https://www.asist.org/events/annual-meeting/).
* if you are in a position to do so, incorporate archaeological data management and preservation lessons into courses you teach. consider inviting a data librarian or information specialist who is familiar with archaeological data to be a guest speaker.

thing : data management plan (dmp) tools

getting started
* review the guidelines for dmps from funding agencies you are considering or have applied to in the past: neh, nsf, aia, etc. see sparc's browse data sharing requirements by federal agency (http://datasharing.sparcopen.org/data).

learn more
* check if your institution is participating in the dmp tool (meaning they have customized the tool to point to institutional resources and services) at https://dmptool.org/public_orgs.
* read through publicly available dmps at https://dmptool.org/public_plans and consider what makes them strong/weak. take notes on what aspects are important to include when writing your own.

challenge me
* use a dmp tool to create a dmp for a project you are currently working on or planning to start.
* ask a data librarian or specialist at your institution to review your dmp.

thing : describing data

getting started
* learn more about metadata schema, controlled vocabularies and why describing data is a good practice. read what are metadata standards from the digital curation centre at http://www.dcc.ac.uk/resources/briefing-papers/standards-watch-papers/what-are-metadata-standards and preparing datasets - metadata from ads at http://archaeologydataservice.ac.uk/advice/preparingdatasets.xhtml#metadata.
* consider your current metadata practices - do they follow any schema or incorporate any vocabularies? are your metadata fields described and documented explicitly?
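
one way to describe and document metadata fields explicitly is a small data dictionary that records, for each field, its type, its definition, and any controlled vocabulary it draws on. the python sketch below shows the idea and checks a record against the dictionary. the field names, definitions, and vocabulary terms are invented for illustration and do not come from any published schema or thesaurus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """one row of a data dictionary."""
    name: str
    dtype: type
    definition: str
    vocabulary: Optional[frozenset] = None  # None means free text


# hypothetical data dictionary for a finds table
DATA_DICTIONARY = [
    FieldSpec("find_id", str, "unique identifier assigned to the find"),
    FieldSpec("material", str, "primary material of the find",
              frozenset({"ceramic", "bone", "metal", "stone"})),
    FieldSpec("depth_cm", float, "depth below surface in centimetres"),
]


def validate_record(record, dictionary=DATA_DICTIONARY):
    """return a list of problems found when checking `record` against the dictionary."""
    problems = []
    for spec in dictionary:
        if spec.name not in record:
            problems.append(f"{spec.name}: missing")
            continue
        value = record[spec.name]
        if not isinstance(value, spec.dtype):
            problems.append(f"{spec.name}: expected {spec.dtype.__name__}")
        elif spec.vocabulary is not None and value not in spec.vocabulary:
            problems.append(f"{spec.name}: '{value}' not in controlled vocabulary")
    return problems


print(validate_record({"find_id": "f-001", "material": "ceramic", "depth_cm": 12.5}))
print(validate_record({"find_id": "f-002", "material": "glass"}))
```

the first call reports no problems; the second flags the out-of-vocabulary material and the missing depth. keeping the dictionary as data rather than prose means the same definitions can document the dataset and check it.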
learn more
* review some of the vocabularies and thesauri related to archaeological data including getty vocabularies at http://www.getty.edu/research/tools/vocabularies/index.html and periodo at http://perio.do/en/.
* consider whether these vocabularies could be incorporated into your data practices and workflow.

challenge me
* create a data dictionary (metadata field, type, definition, controlled vocabulary status) for a current or future project based on the metadata recommendations in the guides to good practice.
* do this for each type of data you plan to collect or have collected that has an associated guide (e.g. raster images, geophysics, gis).

thing : cleaning, processing, and documentation

getting started
* learn about processing and documentation in ‘data selection: preservation intervention points’ at http://guides.archaeologydataservice.ac.uk/g gp/archivalstrat_ - .
* consider your own workflow and the different stages at which your data is transformed. write down the equipment and instruments you use to collect data and the process for obtaining the data from those instruments (e.g. calibrating, exporting).

learn more
* investigate tools that facilitate data cleaning and documentation such as open refine at http://guides.archaeologydataservice.ac.uk/g gp/archivalstrat_ - .
* attend a workshop or go through a tutorial to learn how to use the tool and its features, including exporting the record of the cleaning, etc.

challenge me
* choose a recent dataset you've collected and go through the processing and cleaning workflow. be sure to document every step and follow conventions for file names, file formats, and backup creation.

thing : sharing

getting started
* learn more about why sharing data matters in archaeology. explore publications on archaeological data, reuse, and publishing, including openness and archaeology's information ecosystem at https://escholarship.org/uc/item/ tq jg and other people's data: a demonstration of the imperative of publishing primary data at https://escholarship.org/uc/item/ nt v n
* consider times you haven't been able to access data associated with your research. how did you address this issue?
* consider times you have tried to use collaborators' or colleagues' data in your own research. what steps did you have to take to make sense of the data, to incorporate it into your own dataset, or to analyze it? what might have made this process easier?

learn more
* learn more about sensitive data and what you can do to protect it while still making it accessible from resources such as ands (https://www.ands.org.au/working-with-data/sensitive-data/sharing-sensitive-data) and ads (http://archaeologydataservice.ac.uk/advice/sensitivedatapolicy.xhtml)
* consider whether there are any ethical or legal restrictions around data in your own work. discuss these considerations with the appropriate representatives and determine what the best plan for sharing data is for all relevant parties.
challenge me
* learn more about the differences between publishing and sharing data, then either:
* prepare a dataset of your own for sharing with a colleague or collaborator and ask them to report back on any issues they faced understanding the data, accessing files or information, and what you could have done to simplify their use of the dataset.
* or publish a dataset of your own. this can be done either in association with an article or book, as a data paper with a journal that specializes in data publication, or through a data publishing service. consider the challenges you faced as you prepared the dataset and what you can do to simplify the process next time, then incorporate these practices into your workflow.

thing : citation

getting started
* data citation continues the tradition of acknowledging other people’s work and ideas. along with books, journals and other scholarly works, it is now possible to formally cite research datasets and even the software that was used to create or analyze the data. consider if there are times your data was reused by someone else and whether you received scholarly credit.
* read the force joint declaration of data citation principles at https://www.force .org/datacitationprinciples.
* watch this video on persistent identifiers and data citation at https://www.youtube.com/watch?v=pgqtiy oz k.
* search data repositories and services such as ads, tdar, and open context and see how their recommended citations are formatted.
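
recommended citations generally combine the same few elements. the python sketch below assembles a citation string in the pattern creator (year). title. publisher. identifier, which is similar to the format datacite recommends; individual repositories may order or punctuate these elements differently, so treat the pattern as an assumption and check each repository's own guidance. the dataset, creator names, repository, and doi are made up for illustration.

```python
def format_data_citation(creators, year, title, publisher, identifier):
    """build 'creator; creator (year). title. publisher. identifier'."""
    if not creators:
        raise ValueError("at least one creator is required")
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {publisher}. {identifier}"


# hypothetical dataset; the doi uses a placeholder prefix and does not resolve
citation = format_data_citation(
    creators=["doe, j.", "roe, r."],
    year=2020,
    title="example excavation finds database",
    publisher="example data repository",
    identifier="https://doi.org/10.1234/example",
)
print(citation)
```

keeping the persistent identifier as the last element, written as a full url, lets readers and citation-tracking services resolve the dataset directly.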
learn more
* consider how many times you've read research papers and felt the data was either insufficient or inaccessible and how this impacted your interpretation.
* have a discussion with your colleagues about their perspectives on publishing data so that it is findable, in formats that are accessible, and with enough descriptive metadata and documentation to be reusable. have any of them ever cited a dataset? why or why not? what would be needed for this to become a common practice in archaeology?

challenge me
* include citations to datasets, not just scholarly articles and books, relevant to your work in your next publication.
* consider whether persistent identifiers (pids) should be routinely applied to all research outputs. remember that pids carry an expectation of persistence (maintenance costs, etc.) but can be used to collect metrics as well as link articles and data (evidence of impact).

thing : licensing

getting started
* research how research data is licensed in your country and which set of licenses is used most commonly.
* discuss with colleagues if they have licensed their data and what their experience has been.

learn more
* read through the licensing agreements and policies for data services and repositories, starting with ads (http://archaeologydataservice.ac.uk/advice/termsofuseandaccess.xhtml), tdar (https://www.tdar.org/about/policies/contributors-agreement/), and open context (https://opencontext.org/about/publishing). consider whether these policies align with your datasets and obligations.

challenge me
* determine which license is appropriate for your data and if possible, release one of your own datasets by depositing into an archive or repository. consider consulting with a data service representative or data librarian about your selection.
thing : fair in archaeology

getting started
* read through the fair data principles at https://www.go-fair.org/fair-principles/.
* consider what these principles mean in practice and how each of the things you are implementing supports fair archaeological data. what would it mean if every archaeologist followed these principles?

learn more
* watch the webinar enabling fair data at https://www.dataone.org/webinars/enabling-fair-data or are we fair yet? at https://rd-alliance.org/webinar-are-we-fair-yet.

challenge me
* assess the fairness of one of your recent datasets using the fair self-assessment tool from ardc (https://www.ands-nectar-rds.org.au/fair-tool). what did you learn about your data? how can you do better?
department of engineering and architecture
fair research data management study school
parma, - july .

theme: data stewardship
data stewardship is defined as: “the process and attitudes that makes one deal responsibly with one’s own and other people data throughout and after the initial scientific creation and discovery cycle”.

the focus of the fair data management study school at parma university will be on the fair principles related to interoperability and re-using research data. staff will acquire knowledge, expertise and practical experience in the following outcomes:
• planning a research outputs management campaign
• reusing research data for the full research lifecycle

a laboratory on data carpentry is planned on - july.
other optional activities include staff visits to university libraries managing research data in bologna and (virtually) in venice.

venue map
main teaching building of the school of engineering (sede didattica di ingegneria)
university of parma, campus of science and technology
parco area delle scienze, /a (in front of last bus or stop)
parma
room: teaching room n.
https://www.google.com/maps/place/parco+area+delle+scienze,+ ,+ +parma+pr/@ . , . , z/data=! m ! b ! m ! m ! s x b e : xfcb d a ee bf ! m ! d . ! d . ?hl=en-us
https://dia.unipr.it/it/didattica/gestione-aule-e-spazi/sede-didattica

agenda
each day will be an appropriate mixture of lectures, independent learning, group work and participant presentation. each day will start at . am and finish at . pm (italy time). there will be no less than one hour for lunch, usually at . - . pm

arrival on sunday june

monday st july
. - . welcome, introduction to the study school, data stewardship core
stefano caselli, anna maria tammaro, janet and david anderson
. – . data stewardship ( ): introduction and overview of fair
peter burnhill
. – . data stewardship ( ): focus on the data user. provision for access. relationship to articles, books and web resources
peter burnhill
: - : group work
this is a group task and the results will be submitted by mail on july

tuesday july
. – . behaviours and technical recommendations of the coar next generation repositories working group - part - behaviours
behaviours and technical recommendations of the coar next generation repositories working group - part - technologies and implementations
susanna mornati
. – . openaire and rda: community driven tools and support for research data access and reuse
emma lazzeri
: - : library system organisation for rdm: university of bologna
marialaura vignocchi (voluntary participation - to be confirmed)

wednesday july
. – . copyright, creative commons, privacy issues for research output management
janet anderson and david anderson
. - . data stewardship ( ). use, preservation and citation of 'web resources': a look into the future
peter burnhill
. – : single point of entry: rdm management at university of venice
marisol occioni (videoconference)
: - . group work

thursday-friday - july
data carpentry group work
marianne corvellec, nilani ganeshwaran
day : : am- pm : introduction to data, and best practices in using spreadsheets
pm- : pm : cleaning data with openrefine
day : : am - pm : introduction to r for data analysis
pm - : pm : using r for data visualization and generating reports

more information at: http://fair-rdm.unipr.it

speakers

anderson delve janet
janet anderson (phd. history of mathematics): professor of digital humanities at the university of brighton, a field she has been researching for the last years, developing fundamentally new methods/technologies to keep alive our digital cultural heritage: digital art, computer games or 3d models of archaeological sites. her interests lie in digital archiving, using big data techniques for archiving databases. she was the co-ordinator of the european commission e-ark project ( - ), developing a digital archiving infrastructure and standards for national archives, governments and business. she also researches using emulation to replicate old computing platforms as a digital preservation strategy.

anderson david
david anderson (ph.d. artificial intelligence) is professor of digital humanities at the university of brighton where he leads the cultural informatics research and enterprise group (cireg). he is editor-in-chief of the new review of information networking, and a member of the international federation of information processing (ifip) panel ( . ). david coordinated the quality control and wrote the extensive legal study for the e-ark project.
he was the pi at the university of portsmouth for the fp keep project, and was co-pi on the jisc-pocos project on preserving complex digital objects. david has authored five books and numerous articles on computing and digital preservation, and wrote the keep layman’s guide to the legal studies. david is treasurer and secretary of the dlm forum.

burnhill peter
peter served as director of edina and as the first director of the digital curation centre (dcc) as part of a varied -year career at the university of edinburgh. with a background as a statistician, researcher and senior lecturer, he strayed into information science. he is a past president of iassist and an honorary fellow of the royal scottish geographical society. contributions include leading the set-up of the world's first national online mapping system for university staff and students (digimap) and the first uk serials union catalogue (suncat). he has also helped ensure continuing access and integrity of the digital scholarly record, as a founder member of clockss, through the keepers registry of preserved e-serials and in research to highlight the threat and potential remedy for 'reference rot'. now with more time available he is keen to share expertise and experience more widely, as well as carrying out his own research into the use of contemporary sources now found online for historical enquiry, such as the impact of c th socio-demographic change, locally and globally.

caselli stefano
professor of computer engineering at the university of parma since . director of the phd school in engineering and architecture since , with extended prior experience as chairman of undergraduate and graduate degrees in computer, electronics, and communication engineering. designated representative of the university of parma in the scientific committee of the digital agenda of regione emilia-romagna during the previous regional legislature. i enjoy lecturing in the classroom, advising students, and observing over the years their cultural and professional growth, before as well as after their graduation. i also promote placement of graduates in ict engineering in local firms (indeed an easy task) to support local development. starting from a long-standing research interest in robotics and machine intelligence, i have been involved in the challenges of modern agriculture . , including its quest for smart, sensor-driven irrigation equipment optimizing water usage. as romor project coordinator for the university of parma, i am delighted by the success of the project in the palestinian partner universities and i promote its key concepts among phd students and young researchers in my own university.

ganeshwaran nilani
i am a digital library applications software developer working at the university of manchester library. i work very closely with librarians on a day-to-day basis and have a good understanding of how much of their day job can be improved by the software skills covered by the library carpentry lessons. my connection with software and data carpentries began a few years ago through my involvement in digital scholarship as part of my current role. library carpentry was beginning to emerge at that time; i was fascinated by the idea and looked for the possibility to introduce the concept at manchester. i am now a certified carpentry instructor and have taught carpentry lessons for our librarians. i am very interested in bringing library carpentry higher up the agenda, demystifying the concept for librarians and increasing their participation.

lazzeri emma
emma lazzeri is a researcher at the institute of information science and technologies of the italian national research council in pisa, italy. she is an open science manager working on defining strategies and tools and on disseminating open science. she is one of the italian national open access desks (noads) of openaire and contact point for the italian research data alliance node. she is also involved in eoscsecretariat.eu, an eu funded project that supports the european open science cloud (eosc) governance and co-creation. she is a member of the open science monitor expert group of the european commission. her research interests are in open science, including policies, best practices, strategies. she holds a phd in innovative technology - telecommunications from scuola superiore sant’anna, pisa, italy and an msc and bsc in telecommunication engineering from università di pisa, italy.

marianne corvellec
marianne corvellec has worked as an industry data scientist since , specializing in the tidyverse (r) and scipy (python) stacks. a physics phd, she left academia in to join the vibrant montreal startup scene. notably, she worked with plotly as a developer. a certified instructor with the carpentries, she speaks or teaches at local and international events on a regular basis. she is an advocate of free software and open standards.

mornati susanna
susanna mornati is coo at science, italy. she has extensive experience in the design and implementation of information systems for research, gained in thirty years spent at the university of milan, cern and university consortia for ict. with her vast expertise in the research domain, in she directed the program of implementing dspace-cris (iris) at italian he and research institutions and the iride project for orcid adoption at the national level in italy. both projects involved over , researchers and were successfully achieved in just a few months. susanna has gained an international reputation in the open science communities, participating in scientific boards and committees, and speaking at numerous events.
she is a member of the research data alliance (rda), the coar controlled vocabularies board, the dspace leadership and steering groups, the eurocris cris-irs task group, the italian association for open science (aisa), the italian open science support group (iossg). occioni marisol director of the digital library of the ca' foscari university of venice, collaborates in the management of the institutional repository and phaidra, a platform for long-term archiving and dissemination of the university digital collections. supports researchers in open science and she is member of the ca'foscari data monitoring board. member of aisa (a non-profit organization that undertakes to advance open access to knowledge) and iossg, italian open science support group. tammaro anna maria anna maria tammaro has been teaching at the international master in digital library learning (dill), joint master of tallinn university and university of parma until . she has been the president of open edition italy and of the interdepartmental centre unipr colab. she is an international librarian, collaborating with ifla, asis&t and euclid (european association of lis teachers). she is member of open education italy. you can read her blog “bibliotecari internazionali” here: http://annamariatammaro.wordpress.com. homepage: https://works.bepress.com/annamaria_tammaro vignocchi maria laura head librarian of the digital library of the university of bologna – almadl - since , her main interests are in digital libraries, institutional repositories, scholarly communication, library publishing services, open access and open science. she is a member of the open access working group of the conference of the rectors of the italian universities and a member of the informal working group iossg (italian open science support group). 
in the last five years she has been focusing on research data management and sharing while exploring the impact of fair data principles on research praxis and university support services and infrastructures.

. arrival info

how to reach parma:
● airport -> shuttle bus (or taxi) to railway station -> train to parma railway station -> taxi (or city bus) in parma. as an alternative, visitors could rent a car at the airport and then reach parma directly.
● the closest airports are bologna and milan linate. milan malpensa and milan-bergamo orio al serio are considerably farther away. from any of these airports, visitors should reach the railway station using a shuttle bus (or a taxi).
● milan linate to stazione ferroviaria milano centrale (milan central railway station) is about ' and euro
● bergamo orio al serio to stazione ferr. milano centrale (milan central railway station) is about ' and euro
● milano malpensa to stazione ferr. milano centrale (milan central railway station) is about h and euro
● from bologna airport to stazione ferroviaria bologna centrale (bologna central railway station): ' and euro
● from the milan or bologna railway station, take the train to parma railway station. parma railway station is very close to the historical city center; however, either a city bus ( , euro) or a taxi is needed to reach one of the hotels.
● the university campus, where the workshop will be held, is outside the city center, but it is well connected by the city bus network to the city center, where the hotels are located. please note that since attendees will be using the bus multiple times, it may be convenient to purchase a multiple travel ticket (perhaps travels for euro). bus lines that reach the university campus are n. , , .
● the web site to check the train schedule is: http://www.trenitalia.com/
● queen alia airport in amman, and cairo airport, may not be directly connected to bologna, but one can fly to rome or athens or other airports and then connect from there to bologna.
● flying to bologna rather than malpensa will likely save or hours of time in reaching parma.

. accommodation

the city center can be reached in about minutes by bus (bus ticket . euro) or by taxi.

hotels: hotel torino http://www.hotel-torino.it/en/index.php is at the very center of the historical part of the city; however, hotel toscanini (ibis) https://www.accorhotels.com/gb/hotel- -ibis-styles-parma-toscanini/index.shtml is also at ' walking distance from the center and directly on a bus line to the campus. using standard hotel or b&b reservation platforms, one may find other * or * hotels or b&bs in the center of parma offering similar services and costs. for easier connection with the campus and the city center, we suggest not booking a hotel or b&b north of parma railway station. please check a maps resource and look at the geography beforehand to understand where each hotel is located with respect to the city center and the university campus where the study school will be held.

. contact us

prof. stefano caselli: http://en.unipr.it/ugov/person/ stefano.caselli@unipr.it prof.
anna maria tammaro: https://www.linkedin.com/in/anna-maria-tammaro- annamaria.tammaro@unipr.it http://en.unipr.it/ugov/person/ https://www.linkedin.com/in/anna-maria-tammaro- cm&r : (march) hmorn – selected abstracts c-d - : governing access to a distributed research network’s data resources beth l syat, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; kimberly lane, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; jeffrey s brown, phd, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; david magid, md, mph, institute for health research, kaiser permanente colorado; joe v selby, md, mph, division of research, kaiser permanente northern california; richard platt, md, ms, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; andrew nelson, mph, healthpartners research foundation to answer many public health questions, it is essential to use information from more than one electronic data system, and efficient ways are needed to securely access and use data from multiple organizations while respecting the regulatory, legal, proprietary, and privacy implications of this data use and access. one approach centers on the development of distributed research networks that allow data owners to maintain confidentiality and physical control over their data, while permitting authorized users to ask essential questions. once such a network is fully operating and key elements are in place, sharable data resources can be made available to approved network users, under approved conditions. for instance, data from a large cohort of hypertensive patients with five years of utilization (a hypertension cohort) could be available on the network. the following questions will need to be addressed: who can have access? under what conditions should access be granted? 
what policies/procedures are required? to address the specific needs associated with governance of a network’s resource(s), the authors call for the establishment of user eligibility requirements, policies to deal with funders (i.e., access rules for study funders), clear standard operating procedures, and guidelines for accessing the network. recommendations to meet to those needs include: ) establishing data oversight policies; ) defining responsibilities for data resource access; ) defining responsibilities for data owners at each site (i.e., responding to queries when requests come in); ) creating standard operating procedures for the data resource; ) creating collaboration guidelines for external partners; and ) monitoring overall resource use. for the purpose of this poster, we propose to illustrate responsibilities for data owners at each site. ps - : digital scholarship: scientific publishing at the crossroads virginia d scobba, mls, ma, group health center for health studies, group health cooperative background/aims: scholarly communication is the system through which research and other scholarly writings are created, evaluated for quality, disseminated to the scholarly community, and preserved for future use. the traditional formal means of interchange, publication in peer reviewed journals, is at the core of the communication infrastructure. however, the structures and processes by which scholars communicate have undergone a major transformation in recent years with the advent of the digital age. new electronic technologies for access to information appear to be revolutionizing scholarly publishing, aptly defined by the term, digital scholarship. current trends in the chaotic scholarly publishing market can be perceived as both opportunities for and threats to digital scholarship. 
methods: digital scholarship is in a state of unprecedented upheaval as publishers, librarians, legislators, scholarly societies, scientists and other scholars engage in tactics to propel change in directions that promote their individual goals. strategies involve remodeling the publishing market, modifying academic and research institutional procedures, and influencing public policy. results: emerging digital publishing technologies, increasing volume of scholarly works, and decreasing satisfaction with a costly and dysfunctional economic model are changing the fundamental structure of scholarly publishing. research institutions, as well as government and funding agencies, are implementing or exploring strategies which promote free and open access to research results. these include alternative copyright arrangements, e -print archives and digital repositories. conclusion: scholars, researchers, and society at large gain tremendous benefits from the expanded dissemination of research findings. however, several factors have impeded the progress of digital scholarship, including efforts to protect publishing revenues and profits, legal licensing restrictions, and the traditional culture of academia. it is therefore critical that the scientific community is actively engaged to ensure that the advancement of scholarship takes priority in the development of new publishing models. ps - : developing an analytical tool for assessing the adequacy of state health information exchange laws randy mcdonald, jd, lovelace clinic foundation; maggie gunter, phd, lovelace clinic foundation; shelley carter, rn, mph, lovelace clinic foundation; bob mayer aims: to develop and test an analytic legislative tool that provides states with the ability to analyze and propose reform to laws related to the exchange of electronic health information. 
background: through extensive research, the multi -state harmonizing security and privacy law collaborative (hsplc) found myriad barriers to health information exchange in laws and business practices. in some cases, barriers are beneficial because they protect people’s privacy. however, barriers can be problematic when they prevent the timely exchange of information needed for the treatment of patients. there are many inconsistencies in state and federal laws and among state statutes in their definitions, organizational structure, and content. some states have adopted new legislation that addresses the exchange of health information that may further exacerbate differences among states and impede interstate exchange of electronic health information. methods: hsplc developed a set of analytical tools and a narrative guide, the roadmap, to assist states in implementing an effective legal framework for the review and adoption of legislation that supports health information exchange (hie). the tools and roadmap were created through extensive research to identify best practices for identifying, evaluating, and reforming state laws related to the disclosure of electronic health information. results: hsplc found that various state resources (legal, legislative, healthcare policy, healthcare providers, and consumers) are necessary for successful completion of the roadmap to identify opportunities for legislative reform. hsplc believe that states will have greater likelihood of success in achieving legislative reform if they use the roadmap and reach out to other states contemplating a change in legislation. interstate collaboration and coordination are essential if we are to achieve a national legal and technical infrastructure that facilitates health information exchange. conclusions: legislation in most states does not adequately address the exchange of electronic health information. 
drafting of legislation must take into account a state’s unique environment and culture, and the needs and support of stakeholders. the goal of using the analytic tool is to protect health information while removing barriers that impede the exchange of vital information. the hsplc roadmap provides a step by step process to analyze and reform state legislation. ps - : optimizing health informatics interventions from the patient’s perspective: focus group on improving safe nsaid use douglas w roblin, phd, kaiser permanente georgia; richard m shewchuk, phd, university of alabama at birmingham; jeroan j allison, md, msc, university of alabama at birmingham; renny varghese, mph, kaiser permanente georgia; suzanne baker, mph, university of alabama at birmingham; catarina i kiefe, md, phd, university of alabama at birmingham background: patient- provider messaging in an electronic medical record (emr) system provides an opportunity to create and sustain productive patient- provider interactions. we elicited patient perspectives on design, benefits, and concerns to improve usability and efficacy of a proposed health informatics intervention to support surveillance of, and provider feedback on, over the counter (otc) non-steroidal anti -inflammatory drug (nsaid) use. methods: we conducted four focus groups involving kaiser permanente georgia (kpg) adults – years old who had a medical condition for which nsaids should be used cautiously or had a recent prescription for nsaids. 
the focus group elicited information regarding: otc nsaid use (including recognition of risks and side effects), design of an otc nsaid survey to be delivered via kp.org (the secure kp internet portal for patient- physician messaging), benefits and concerns about transmission of this information via electronic messaging to their primary care physicians, and the hkust institutional repository publication information copyright information notice http://repository.ust.hk/ir/ this version is available at hkust institutional repository via if it is the author’s pre-published version, changes introduced as a result of publishing processes such as copy-editing and formatting may not be reflected in this document. for a definitive version of this work, please refer to the published version. adaptive leadership in academic libraries wong, gabrielle ka wai; chan, diana l.h. library management , v.  ,  / ,  , p.  - pre-published version https://doi.org/ . /lm- - - emerald publishing limited     ©  emerald publishing limited  . published by emerald publishing limited. licensed re- use rights only. http://hdl.handle.net/ . / adaptive leadership in academic libraries structured abstract purpose this article outlines the core ideas of adaptive leadership and relates them to challenges confronting academic libraries. design/methodology/approach the paper provides an overview of adaptive leadership model and highlights the key concepts. recent initiatives at the hong kong university of science and technology library are used as cases to illustrate how the model may guide our focus to finding leverage points. findings using the model, the key role of positional leaders shift from the traditional sense of giving direction and protection to followers, to one that orchestrates the change process with the team through difficulties and uncertainties, and to build culture and structure that facilitate adaptive changes. 
practical implications academic librarians can use the concepts and framework of adaptive leadership to design change strategies and manage change processes. originality/value this is the first article introducing adaptive leadership model to academic libraries. this is the pre-published version introduction the traditional metaphor of seeing a library as “the heart of a university” is losing ground. a modern academic library may be more like a crossroads community, in which users are provoked and enabled to challenge their current knowledge (fowler, ). how can librarians lead academic libraries to adapt to changing environments, and evolve from a traditional collection-centric organization to an engaging, dynamic “crossroads community”? the model of adaptive leadership, developed by professor ronald heifetz of harvard university since , emphasizes leading changes when organizations have to adapt to a radically altered environment (heifetz, ). challenges that confront academic libraries nowadays rarely come with clear boundaries; nor do they present themselves with any pre-defined paths to solutions. libraries must explore the ways to tackle them, not only to evolve to new roles, but also thrive in these new roles. adaptive leadership is therefore a practice with good potential to guide academic libraries through complex challenges and changes. this article outlines the core ideas of adaptive leadership and relates them to library challenges. recent initiatives at the hong kong university of science and technology library (hkust) are used as cases to illustrate how the model may guide our focus to finding leverage points. key elements in adaptive leadership model adaptive leadership is the practice of mobilizing people to tackle tough challenges and thrive (heifetz et al., ). 
it embraces complexity and ambiguity in situations, and actively pursues innovative solutions via organizational learning, creative problem solving, experiments, and collaboration (kezar and holcombe, ). the roles of adaptive leaders differ from those in the traditional view, which focuses on providing vision, solutions, and directions to relatively passive followers under the leaders’ protection. instead, adaptive leaders work together with the team to bring out tough issues, challenge established practices, and involve people at all levels in learning their way to solutions. followers are actively engaged in the change process to experiment and to learn. therefore, in the adaptive model leadership is a practice rather than a position or a job. the breadth of the adaptive leadership model and practice is beyond the scope of a journal article. yet, a brief outline of the core concepts should benefit academic librarians as an introduction to this approach. the key ideas of adaptive leadership include the importance of diagnosing complex systemic challenges, engaging stakeholders to inspect organizational practices and values, and navigating the change process collectively through inevitable resistance, potential losses and trade-offs. the following sections capture three important elements of adaptive leadership based on heifetz’s writings (heifetz, ; heifetz et al., , ; heifetz and laurie, ):
1. the concept of adaptive challenges
2. the nature of adaptive changes
3. practices to implement changes

1. identifying adaptive challenges

when libraries face a new situation, do we afford ample time and effort to diagnose the problem, or do we tend to respond with a quick fix using the tools we already have? one core element of adaptive leadership is to make an explicit effort to analyse the situation and distinguish the types of problems. the model introduces the terms adaptive challenges versus technical challenges.
technical problems have clear definitions and known solutions that can be implemented with current knowledge, through application of existing professional expertise, or using the organization’s current structures or procedures. on the other hand, adaptive challenges are ones for which the experts or organizational leaders have not yet developed an adequate response. they do not have clearly defined problems; further learning is needed to identify problems and find solutions. the two types of situation also differ in terms of who can implement the solutions. technical problems can usually be tackled sufficiently by someone who has the authority or technical expertise; however, tackling adaptive challenges requires everyone to work in new ways. authority leaders must share responsibilities with others for a fuller understanding of problems, and experiment with members to find solutions. this usually involves deeper changes in people's priorities, beliefs, and habits, and it takes time for the change to be implemented. because the diagnosis naturally takes greater effort and attention, mistaking an adaptive problem for a technical problem is common. adaptive and technical problems differ in the nature of the problems; technical problems are not necessarily easier to solve. some technical problems may not be solvable due to lack of resources or other reasons. technical solutions usually bring incremental changes; adaptive solutions often incur transformational changes, which relate to shifts in mind-sets, beliefs, and long-established habits. in academic libraries, we face both types of challenges in many areas. for example, building a collection for a new course can be a technical move, while supporting new pedagogy or program structure in the university is adaptive. the former can be tackled with existing expertise and procedures in the library, although the availability of funding may make it a difficult challenge.
the latter situation is very different; it may call for a review of library programs, facilities, and service priorities. another example could be digitizing a special collection using an existing infrastructure and workflow, which is a technical solution, as compared with creating a digital scholarship project from scratch, which requires adaptive leadership.

2. the nature of adaptive changes

tackling adaptive challenges is a process of change that takes time, hard work and persistence. the idea of “being adaptive and thriving in the new environment” comes from the analogy to biological evolution. evolutionary change is an adaptation, which builds on preserving what is important in the existing system and changing what is expendable or dated. therefore, successful adaptation is both conservative and progressive, in the sense that it is not about simply giving up the old ways, but about distinguishing what is essential in an organization’s tradition from what can or should be renewed or removed. what does thriving mean for libraries? in biology, thriving means propagation. for organizations such as academic libraries, signs of thriving may include quality user services, increases in transactional library use, high staff morale, and positive impact in the support of teaching, learning and research in the parent institutions. in this sense, adaptive success in an academic library may start with effort that engages library staff and stakeholders to prioritize organizational values and define thriving. in the process of adaptive changes, an experimental mind-set plays a key role. using the analogy of natural evolution, variation and diversity are essential for a system to generate innovations for adaptation in new environments. the same applies in the process of designing and implementing adaptive changes in organizations. experimentation produces variations; some may succeed and some will fail.
an experimental mind-set allows adaptive leaders to expect failure, and learn to improvise as they go. at the same time, adaptive leaders build a culture that values diverse views, a culture that relies less on central planning and the expertise of the few at the top. in an adaptation process, some organizational heritage is conserved, but there are also losses in certain legacy practices, traditional values, professional identities and others. the process in essence requires tough choices and trade-offs to be made. adaptive leadership requires the awareness of those losses, and the expectation of potential resistance at individual and systemic levels. adaptive changes take time to consolidate into new sets of norms and processes; therefore, adaptive leadership calls for persistence and the tolerance of uncertainties.

3. practices for effective actions

adaptive leaders diagnose situations, design change processes, and navigate with their teams through the processes. the adaptive leadership model highlights different actions that help leaders to focus their attention on important issues in handling adaptive changes. four adaptive practices are presented below.

get on the balcony

this is an analogy to illustrate how leaders have to be able to view patterns as if they were on a balcony rather than in the field of action. only from a distant point of view can a leader observe the context of actions and see the connection between different forces in complex systems. to build a systemic view, a leader attempts to view problems as caused by system weakness rather than individuals’ failure. multiple perspectives are usually generated to broaden a problem diagnosis. for example, if a study area in a library has a noise issue, we may interpret it as a “user behaviour” problem; if we try to take a systemic approach, perhaps the cause is not primarily the users but the furniture type or the surroundings.
the interpretation that shifts the “blame” from the users to the environment leads the library to a very different approach to tackling the issue.

give the work back to people

adaptive changes are systemic in nature; the new state is a result of new ways of thinking and practices, implying evolved elements such as new rules, mind-sets, values and workflows. to achieve meaningful changes, authority leaders must involve all members in the change process. letting people take the initiative in defining and solving problems means that managers need to learn to support rather than control. at the same time, team members need to learn to take responsibility. adaptive leaders should know how to create wide-spread engagement, instill confidence among team members, and back them up if they make mistakes.

regulate distress using a “holding environment”

adaptive changes are tough for the people involved. adaptive leaders strike a delicate balance between having people feel the pressure to change and not having them feel overwhelmed by change. a state of disequilibrium can motivate team members to make productive change, but too much discomfort leads to burnout or work avoidance. in adaptive leadership, a “holding environment” is a physical or virtual structure where team members can find support and protection. it is a safe “place” where frustration can be expressed, failures are understood, and ideas are exchanged. a typical practice in regulating pressure may be making opportunities for diverse groups to share practice and discuss issues and progress; it helps to create mutual support, clarify assumptions and reconcile competing perspectives.

maintain disciplined attention

adaptive work is difficult and disturbing; it usually implies certain loss. it is normal that team members may respond with avoidance and resistance. adaptive leaders learn to counteract distractions and guide people to regain focus.
they help team members to handle tough trade-offs in values, procedures, operating styles and power. conflicts arise in adaptive changes; adaptive leaders have the role of bringing out conflicts among team members and resolving conflicts with them, rather than for them. for a team to navigate through adaptive change, the persistence needed to succeed comes from the discipline to stay focused on tackling the challenge despite avoidance, resistance and conflicts.

adaptive leadership in library literature

cases in education leadership

in higher education, and academic librarianship in particular, reported cases or discussions of the practice of adaptive leadership are few. one extraordinary example of the power of adaptive leadership in effecting social change is the story of how three charitable foundations successfully brought about improvement in the public school system in pittsburgh (heifetz et al., ). on management of higher education, randall and coakley contrast two cases in academia through the lens of adaptive leadership (randall and coakley, ): a four-year college failed at crisis management because the president applied an unsuccessful technical solution to an adaptive situation; the college did not draw stakeholders’ attention to the challenges, failed to engage stakeholders, and did not create a sense of responsibility for the problem. in the second case, a graduate program at a university was successfully rebuilt by adopting strategies that align with the adaptive approach.

papers in lis fields

it appears that many libraries have initiated and are managing adaptive changes without awareness of the adaptive leadership approach. so far there is very little discussion in the library literature about the concepts of adaptive leadership.
in the context of proposing a transformational change to a sustainable, responsive information literacy culture in academic libraries, wilkinson & bruch (wilkinson and bruch, ) made reference to an adaptive method of holding open dialogue that aimed at resolving competing priorities and beliefs. for lis education, the concept of adaptive change was used to frame the demand for training librarians towards higher levels of mental complexity (yukawa, ). in the field of health information management, leaders of information governance found that the adaptive framework suited them best in facing the complex environment of health information governance (sheridan and watzlaf, ).

challenges for academic libraries
what kinds of challenges are academic libraries facing in the changing information and scholarly environment? for academic libraries, traditional roles such as collection gatekeeping and information mediation have been challenged by communication technology; traditional functions are becoming obsolete, or are being performed by other units on campus or by services on the internet. librarians have been proactively responding to these challenges. while defending the core jurisdiction of information access, academic librarians advance new areas such as information literacy teaching and research outputs management (cox and corrall, ). many libraries redesign and expand their operations and service scope. a study in projected a set of four new core responsibilities of academic librarians: consulting services; information lifecycle management; collaborative print and electronic collection building; and information mediation and interpretation (goetsch, ). to experiment with new roles and to operate versatile services, many academic libraries undergo organizational restructuring (franklin, ); new job descriptions and educational roles of librarian positions have been emerging (vassilakaki and moniarou-papaconstantinou, ).
libraries are challenged to develop new initiatives and skills to fulfil the new purpose. innovation and creativity in their services, systems and facilities become essential to face new challenges (walton and webb, ). at the same time, libraries are expected to demonstrate their value in accountable ways (oakleaf and association of college and research libraries, ). the majority of these challenges are adaptive rather than technical. they call for libraries to examine their traditional services and values, rethink their priorities, revamp existing practices, and reinvent their expertise. these cannot be addressed by incremental, technical solutions, but only by adaptive, systemic changes.

initiatives in the hkust library
the hong kong university of science and technology (hkust) is a vibrant university with teaching and research programs in the disciplines of science, engineering, business and management, social sciences and humanities. it is recognized as a leading university internationally, as reflected in its high rankings by different agencies (http://www.ust.hk/about-hkust/rankings/). the hkust library provides active support to the university’s teaching and research. change is a constant in our library. this section examines selected initiatives at hkust library using the lens of adaptive leadership. these projects are at different stages of the change process; for instance, the library learning space was successfully transformed to a well-received commons model after the learning commons opened in ; while digital scholarship is still in an exploration phase at the time of writing. these cases aim to highlight features of adaptive changes in the development processes.

learning space: from traditional to the commons model
the transition of learning space at hkust library was catalysed by the opportunity to extend the building in .
as a consequence of the extended floor space, the library repurposed space use and created a new learning commons (the lc). the change was not only in physical space, but also in how we support learning in the space. it involved a new service mind-set, and required operations which were not readily reproducible from existing library workflows. library staff had to go through learning at different levels to evolve with the new environment. it was an adaptive change. adaptive processes invoke new practices that may clash with existing values and beliefs. shortly before the lc opened, we conducted a promotion program that engaged campus units, many of which subsequently collaborated with the library to offer a variety of learning activities in the lc (chan and spodick, ). traditionally, the library staff had been proud of a well-maintained study space in a more controlled mode. once we opened our space to a wider range of learning activities, we started to see how the more proactive model of learning support brought up conflicts with long-held rules and values. for instance, how much control of the space and facilities did we let go when a learning event was hosted by a campus partner? was there any potential resistance or negative emotion that frontline staff might have when they saw student activities that were not allowed in the library before? the adaptive leadership model helps us to expect such difficulties in the change process, and reminds us to resolve conflicts and resistance rather than to avoid them.

research data management: experimenting in a holding environment
research data management (rdm) has become a strategic priority of many academic research libraries. what roles libraries may have in terms of rdm throughout the research life-cycle is still being explored and experimented with.
studies of academic libraries in the uk and north america found that libraries had been developing technology support, such as repositories for data storage or curation, and informational support, such as advisory services and training for researchers (cox and pinfield, ; tenopir et al., ). developing rdm service is an adaptive challenge, which calls for changes in librarians’ skills, resources and service culture. as highlighted in the adaptive leadership model, tackling an adaptive challenge requires the whole team to take on learning in order to understand the problem. at hkust library, rdm was first explored in june when we formed the internal group called scholarly communications committee (scc); the membership included the university librarian and more than professional librarians from different functional units. members contribute to the exploration using their own expertise, which ranges from metadata management, digital infrastructure and user education to service design. scc is the platform where we plan for systematic self-learning about rdm through online tutorials, seminars, and data training. we complement each other’s learning, share experiences, toy around with new ideas, and develop new services at scc. the nature of research data is complex, many stakeholders are involved, and there are many possible models of rdm service. one can expect the process of learning and service development to be nonlinear and continuous. throughout the process, scc has become a “holding environment” where we can find support, feedback, and experimentation. our model of rdm support evolves gradually. at the time of writing, the main components of rdm services at hkust library include a portal showcasing the components of rdm, the rdm service kit, the data repository dataspace@hkust (https://dataspace.ust.hk/) and rdm seminars for researchers. our learning and experimentation continue.
digital scholarship: in search of an adaptive path
while rdm could be viewed as an extension of existing library services such as information advisory and institutional repositories, digital scholarship is new ground for most academic libraries. hkust library is in an early phase in the development of a digital scholarship service model. using the adaptive leadership vocabulary, we are taking a “balcony” angle to generate a good view of the “playing field”. the library has substantial experience in creating digital projects, which are primarily driven by library collections and activities. for example, the special collections were digitized and made available via the rare & special ezone (http://lbezone.ust.hk/rse/), and library exhibitions are usually captured with digital images, videos, and a platform for users’ comments (http://library.ust.hk/exhibitions/). these digital initiatives align with the library’s strategic priorities, but do not relate to specific research needs or interests of scholars in the institution. how do we transition from a system that enables such library-driven digital projects to a new model supporting digital scholarship? what does that new model look like? vinopal and mccormick proposed a -tier framework (vinopal and mccormick, ) that can provide a structure when we explore what services we can support with the consideration of scalability and sustainability. yet, for the time being, we are first taking our time to understand our digital scholarship context. the adaptive leadership model emphasises the importance of spending time to diagnose the situation rather than jumping to quick solutions. in our diagnostic process, we survey the diverse modes of digital scholarship services at pioneering libraries, including those from hong kong as well as overseas. we scan the research interests of faculty at hkust and identify researchers and research topics that may be of interest for digital projects.
we continue to talk with potential collaborators to explore if and how they may take advantage of digital scholarship. internally, we take stock of the library’s existing strengths and weaknesses, in terms of digital tools, services, and staffing. while staying mostly on the balcony, we also get down to the field of action. in the spring term of , two undergraduate courses in the school of humanities collaborated with the library through our exhibitions. students’ coursework included guiding library gallery tours as well as writing essays about the exhibits. the library took this opportunity and proposed to the professors that we create a web project connecting students’ intellectual work with the exhibition items. this experimentation will give us a different learning experience from what we accumulated from developing library-driven, collection-based digital projects. such projects help us prepare for more research-based digital scholarship projects in the future.

information literacy course enhancement: collaborating under uncertainties and diversity
under a government-sponsored project on information literacy, each university in hong kong was given a few course enhancement funds (cef) for teaching staff to develop information literacy elements in their courses. this fund supports teaching staff to co-design teaching and learning elements with librarians in order to enhance students’ information literacy. in the - academic year, instructional librarians at hkust initiated and engaged in four collaborations using the first round of cef. this was a brand new challenge for us, in terms of the level of outreach to course instructors and the depth of engagement in the course delivery. compared with the ways many librarians organize traditional one-shot classes, initiating and carrying out a cef collaboration with course instructors requires a very different mind-set. again, this is an adaptive situation rather than a technical one.
to face the challenge, instructional librarians drew on previous experiences in deep collaboration, and went through a training program covering marketing and collaboration skills. in the exploration with instructors of different courses, librarians encountered different ideas and demands from them; some were creative while some were very tough to meet. our librarians had to keep an open, flexible mind to handle the diverse situations, and to determine how much ambiguity, uncertainty and risk we could withstand. as a consequence of such experimental effort, the instructional librarians successfully engaged the four courses with a wide variety of learning activities. information literacy was instilled through different channels and media, including libguides, learning objects, videos, coaching sessions, database searching, referencing workshops, intellectual property workshops, poster design workshops, mini-conference poster presentation, learn-by-doing exercises, and group project consultations. these various deliverables were impressive to all stakeholders: students, librarians, course instructors and library administrators. collaboration happened not only in teaching, but also in assessment and rubric design. such multi-dimensional interactions between teaching staff and librarians were a good exposure for us to embed information literacy teaching in the course context. throughout the challenging and ambiguous process, our instructional librarians brought back problems to the team’s meetings, which served as a holding environment to share frustration, brainstorm creative solutions, and find mutual support.

shared ils workflow reengineering: wide-spread engagement
the eight government-funded academic libraries in hong kong embarked on a shared integrated library system project in . this was a significant transformation from eight ils to a single instance, cloud-based, next generation system.
the migration of million bibliographic records to a new shared system was an enormous challenge not only of a technical nature, but also of adaptive change. the project was led by an implementation team and seven functional working groups with delegates from the eight libraries. they had frequent meetings to guide the changes in areas such as acquisitions, metadata, fulfilment and user experience. it was a deep collaboration between sister institutions; across institutions the focus was on standardizing and simplifying inconsistent issues and policies. at the local level, however, we emphasized embracing the change with a positive attitude. a central task force was formed at hkust library to oversee the implementation progress: it took the “balcony view” of issues using multiple perspectives, discussed issues at the macro level and built a systemic view to solve problems. a key part of the transformation was workflow changes. for this, a change manager was hired by the consortium to guide the eight libraries through the process. at hkust library, he conducted sessions on business process re-engineering for all library staff. the sessions applied brain-storming exercises to guide participants to rethink given workflows, and to come up with innovative ways to redesign practices to achieve set targets. at the time of writing, the change manager was planning more workshops that would focus on mapping out “as is” and “should be” workflows in various library functional areas. the workflow change was adaptive: staff at all levels need to make trade-offs in values, procedures and operating styles; conflicts have to be resolved among themselves. team leaders need to maintain a disciplined mind to focus on tackling the challenge, resistance and conflicts, and help their team members to do the same.
building an adaptive culture
once they have learnt how to differentiate adaptive situations from technical situations, academic librarians may notice that most challenges they face are in fact adaptive. it becomes important that libraries develop a facilitative organizational culture that can handle adaptive changes. heifetz et al. (heifetz et al., ) suggested that authoritative leaders can cultivate an adaptive culture by:
● making “naming the elephants” the norm – key issues that need attention should not be avoided even if they are uncomfortable to discuss openly; “troublemakers” should be protected
● nurturing shared responsibility for the organization – people feel a shared sense of responsibility when rewards are based mostly on the performance of the entire organization and not on an individual only; other indicators are whether they feel comfortable sharing resources, ideas, insights and lessons across boundaries in the organization
● encouraging independent judgment – distribute leadership so that everyone seizes the opportunity to take initiative in mobilizing adaptive work in their own roles; prepare team members to develop a tolerance for ambiguity, and recognize the fact that authority does not have all the answers
● developing leadership capacity among members – through on-the-job experience with appropriate challenge, feedback and support
● institutionalizing reflection and continuous learning – develop the group norms to ask difficult reflective questions, honour risk taking and experimentation, and foster a taste for action.

although adaptive leadership emphasizes the engagement of the whole team in the change process, the role of positional leaders is still a major determinant of success. they are important not in the traditional sense of giving direction and protection to followers, but in their roles to orchestrate the change process with the team through difficulties and uncertainties, and to build culture and structure that facilitate adaptive changes.
academic libraries are operating in a continuously altering environment. the adaptive leadership model equips librarians with a framework, a set of vocabulary, and a systemic perspective; it does not give us easy solutions, but can guide us to confront tough challenges with strategies and courage.

references
chan, d., spodick, e., . space development: a case study of hkust library. new libr. world , – .
cox, a.m., corrall, s., . evolving academic library specialties. j. am. soc. inf. sci. technol. , – .
cox, a.m., pinfield, s., . research data management and libraries: current activities and future priorities. j. librariansh. inf. sci. , – .
fowler, g.j., . the essence of the library at a public research university as seen through key constituents’ lived experiences (phd). old dominion university.
franklin, b., . surviving to thriving: advancing the institutional mission. j. libr. adm. , – .
goetsch, l.a., . reinventing our work: new and emerging roles for academic librarians. j. libr. adm. , – .
heifetz, r.a., . leadership without easy answers. harvard university press.
heifetz, r.a., grashow, a., linsky, m., . the practice of adaptive leadership: tools and tactics for changing your organization and the world. harvard business press.
heifetz, r.a., kania, j.v., kramer, m.r., . leading boldly. stanf. soc. innov. rev. , – .
heifetz, r.a., laurie, d.l., . the work of leadership. harv. bus. rev. , – .
kezar, a.j., holcombe, e.m., . shared leadership in higher education: important lessons from research and practice. american council on education, washington, dc.
oakleaf, m., association of college and research libraries, . the value of academic libraries: a comprehensive research review and report. association of college and research libraries, chicago.
randall, l.m., coakley, l.a., . applying adaptive leadership to successful change initiatives in academia. leadersh. organ. dev. j. , – .
sheridan, p.t., watzlaf, v., . adaptive leadership in information governance. j. ahima american health inf. manag. assoc. , .
tenopir, c., sandusky, r.j., allard, s., birch, b., . research data management services in academic research libraries and perceptions of librarians. libr. inf. sci. res. , – .
vassilakaki, e., moniarou-papaconstantinou, v., . a systematic literature review informing library and information professionals’ emerging roles. new libr. world , – .
vinopal, j., mccormick, m., . supporting digital scholarship in research libraries: scalability and sustainability. j. libr. adm. , – .
walton, g., webb, p., . leading the innovative and creative library workforce: approaches and challenges, in: innovation in libraries and information services. emerald group publishing limited, pp. – .
wilkinson, c.w., bruch, c., . building a library subculture to sustain information literacy practice with second order change. commun. inf. lit. , – .
yukawa, j., . preparing for complexity and wicked problems through transformational learning approaches. j. educ. libr. inf. sci. , – .

a digital humanities reading list: part , cooperation between libraries and research communities
liber’s digital humanities & digital cultural heritage working group is gathering literature for libraries with an interest in digital humanities. four teams, each with a specific focus, have assembled a list of must-read papers, articles and reports. the recommendations in this article (the second in the series) have been assembled by the team in charge of cooperation and relationship between libraries and research communities, led by liam o’dwyer of dublin city university.
https://libereurope.eu/working-group/digital-humanities-digital-cultural-heritage/
https://web.archive.org/web/ /http://libereurope.eu/blog/dt_team/liam-odwyer/

the second theme: cooperation between libraries & research communities
as digital humanities (dh) evolves, the role of libraries and librarians working in the field continues to develop. a core factor in realising the opportunities that dh presents for libraries – and that libraries present for dh – is the level and nature of cooperation between libraries and their research communities. how do libraries find their dh research communities? how do we let ‘them’ find ‘us’? how are these connections best facilitated and fostered? a significant body of literature focuses on this aspect of dh librarianship and this post results from an appropriately collaborative attempt to list must-reads.

the digital in the humanities: an interview with bethany nowviskie
in melissa dinsman’s interview, nowviskie identifies the field of library and information science (lis) as being of most benefit to dh. expertise in digitisation, data curation, digital stewardship, metadata, discovery and data visualisation and analysis are called out as key offerings. these are augmented by the established liaison and consultative roles of libraries.

communicating new library roles to enable digital scholarship: a review article, john cox
in his consideration of academic libraries’ approaches to dh, cox notes the importance of language and terminology in broadcasting skillsets, for example in job titles and team names. it may be more apt for the library to present itself as partner or collaborator as opposed to service or support provider.
cox calls for a focused communications strategy to embed libraries in digital scholarship and create new perceptions of their role as enabling partners, one “that focuses on inserting the library into digital scholarship communities, mirroring their experimental mindset, and projecting a confident, ‘can-do’ outlook”.

no half measures: overcoming common challenges to doing digital humanities in the library, miriam posner and digital humanities in the library isn’t a service, trevor munoz
posner also acknowledges the importance of language and framing. she concurs with trevor munoz, who argues that support may be unsuited to dh, where projects typically need collaborators rather than supporters. in her piece, posner identifies recurring challenges and opportunities for libraries working in dh and investigates common factors of success and failure. among her conclusions are the importance of institutional commitment and openness to new models and workflows.

https://lareviewofbooks.org/article/digital-humanities-interview-bethany-nowviskie/
https://aran.library.nuigalway.ie/bitstream/handle/ / /communicating% new% library% roles% to% enable% digital% scholarship% nral% final% draft% prepublication.pdf?sequence= &isallowed=y
http://miriamposner.com/posnerjla.pdf
http://trevormunoz.com/notebook/ / / /doing-dh-in-the-library.html

evolving in common: creating mutually supportive relationships between libraries and the digital humanities, micah vandegrift & stewart varner
in this piece vandegrift and varner use texts by lisa spiro, matthew kirschenbaum, stephen ramsay and bethany nowviskie to present and discuss a variety of perspectives on the subject of library engagement in dh. they emphasise the need for deep collaboration and echo the importance of acting as equal partner and overcoming any reluctance or “timidity” in this regard. the potential of library as space is signaled as particularly pertinent for dh activity and relationship building.
building capacity for digital humanities, ecar working group
the ecar working group paper outlines categories to assess institutions in terms of capacity and readiness for dh and suggests practical approaches and next steps. different structural approaches to facilitate dh collaboration are explored – centralised, hub and spoke, mesh and consortial. they stress the importance of local context in their consideration of how to best foster dh growth. the ecar recommendation of a tailored approach recurs frequently in this literature, responding to the local dh environment, available resources and strategic goals. performing a needs assessment or environmental scan is repeatedly advocated as an appropriate first step to inform how a library should engage with its researchers.

research libraries & digital humanities tools, rluk
rluk’s report on the role of research libraries in the creation, archiving, curation, and preservation of tools for the digital humanities documents the outcomes of a survey of uk research libraries, presenting a broad range of models used and approaches taken. it reinforces views found elsewhere here, such as the cautioning against a one-size-fits-all approach and the shifting role of libraries from service provider to active participant.
digital humanities in the library / of the library, caitlin christian-lamb, sarah potvin & thomas padilla
acrl’s special issue digital humanities in the library / of the library contains many articles broaching the topic of research cooperation. do dh librarians need to be in the library?: librarianship in academic units by locke and mapes explores how models of embedded librarianship within a faculty can help position the librarian as an active partner. other articles discuss the task of bringing library dh labour to light. when metadata becomes outreach focuses on the importance of communicating library skills, where metadata can become “the heartbeat making dh projects usable, robust, preservable, sustainable, and scalable”. in another piece, by huculak and goddard, a tension is identified between priorities – the scholar focusing on theory/prototype/output and the librarian on practice/preservation/standardisation.

http://diginole.lib.fsu.edu/islandora/object/fsu: /datastream/pdf/view
https://library.educause.edu/~/media/files/library/ / /ewg .pdf
https://web.archive.org/http://www.rluk.ac.uk/news/rluk-report-the-role-of-research-libraries-in-the-creation-archiving-curation-and-preservation-of-tools-for-the-digital-humanities/
https://web.archive.org/web/ /http://acrl.ala.org/dh/ / / /introduction/
the reciprocal benefits of library researcher-in-residence programs, virginia wilson
this paper looks at how the use of library researcher-in-residence programs can enhance the research culture of the library and help foster a collaborative culture between library and faculty.

digital humanities: what can libraries offer? shun han rebekah wong
wong undertakes a quantitative analysis of authorship in dh journals to investigate library involvement in the field. she presents libraries as central to dh realizing its potential while acknowledging the complexity and challenges of relationship building.

special report: digital humanities in libraries, stewart varner and patricia hswe
varner and hswe’s survey and report of digital humanities in libraries reflects uncertainty in how to best respond to the expanding scope of activity in the field. many themes and recommendations recur: an engaged, agile, responsive approach, leveraging of existing library strengths.

the research librarian of the future: data scientist and co-investigator
lse’s the research librarian of the future looks at emerging roles and opportunities for liaison librarians. meeting emerging research requirements (e.g. around data) can drive collaboration, another example of the agile approach – looking for researchers’ knowledge gaps and where they overlap with library strengths. the need for a strategic approach, supporting upskilling and committing resources, is highlighted.
http://acrl.ala.org/dh/ / / /do-dh-librarians-need-to-be-in-the-library/
https://web.archive.org/web/ /http://acrl.ala.org/dh/ / / /do-dh-librarians-need-to-be-in-the-library/
http://acrl.ala.org/dh/ / / /when-metadata-becomes-outreach/
http://acrl.ala.org/dh/ / / /a-case-for-care-and-repair/
https://journals.library.ualberta.ca/eblip/index.php/eblip/article/view/ /
https://preprint.press.jhu.edu/portal/sites/ajm/files/ . wong.pdf
https://cdr.lib.unc.edu/record/uuid: a ad - - bce- c - c f f
http://blogs.lse.ac.uk/impactofsocialsciences/ / / /the-research-librarian-of-the-future-data-scientist-and-co-investigator/

it is somewhat reassuring that across these writings there are recurring themes, and interesting to see how they relate and intersect. library skills and functions are a natural fit for dh, yet a reframing of roles can help communicate their relevance to the field. dh offers great potential as an area of growth but strategic alignment and commitment of resourcing are essential for that potential to be realised. while it may no longer be new, dh remains decidedly different in the challenges and opportunities it poses for libraries – and particularly how libraries and researchers collaborate. posner acknowledges the reality of much dh scholarship as “eccentric, unpredictable, bespoke, and prone to failure. it will not match up neatly with a library’s existing workflows”. these truths, however unpalatable to the dh-enthused librarian, indicate that libraries need to adjust and experiment to succeed here. as posner again puts it, “dh is not, and cannot be, business as usual for a library”.

for further reading
there are of course many more comprehensive listings than this post covers.
in the course of our discussions, the following were mentioned:
● miriam posner’s digital humanities and the library (http://miriamposner.com/blog/digital-humanities-and-the-library/)
● the acrl’s dh+lib list of relevant readings on dh+libraries (http://acrl.ala.org/dh/dh /readings/)

debating digital art history
anna bentkowska-kafel

abstract: this paper offers a few reflections on the origins, historiography and condition of the field often referred to as digital art history (dah), with references, among others, to the activities of the computers and the history of art group (chart, est. ) and my personal experience, spanning over years, first as a postgraduate student, then doctoral researcher and eventually lecturer in dah. the publications and teaching activities of scholars connected to chart are seen as indicative of the evolution of the field internationally. personal experience, or a reality check, is limited to higher education in the uk. the key argument here concerns the questionable benefit of promoting dah as a discrete discipline and detaching digital practices from the mainstream history of art and its institutions. when introduced in the late s, the ‘dah’ served to indicate a dramatic shift in the way art history could be practiced, taught, studied and communicated. the changes were brought about by widening access to computers and information technology. dah was suggested – “perhaps a little ahead of time – as a new kind of intellectual fusion” (w. vaughan). it is no longer necessary to argue for the wise use of computers. digital technology has become part and parcel of teaching, learning and research. it is the history of art and its more traditional research methods and critical perspectives that are seen at risk of neglect. the theories of crisis, even ‘death’ of art history have contributed to general anxiety over the discipline’s future.
however, a discipline has “the ability and power to control and judge its borders” (r. nelson). the discipline of art history is richer and stronger through the fusion of digital scholarship with, not separation from, more traditional methodologies and critical canons. the need to continue with the ‘digital’ distinction is questionable.

keywords: art history, arts computing, digital art history, historiography

invited article
debating digital art history dah-journal, issue ,

digital art history. a new or old field?

hair – history of art information and resources; haggis – history of art group for information systems; and hacks – history of art, computers, knowledge, slides, were among many names proposed in for a group, which eventually established itself internationally under the name of computers and the history of art, or chart. the acronym chimera was also considered, in the same light-hearted spirit, but was rejected on the grounds of ‘enough anxieties about our ontological status already’. thirty years on, does this anxiety not sound familiar to those engaged in art-historical computing?

figure : a cartoon drawing by an unknown hand, in chart newsletter, ( ): . (© chart. reproduced by permission)

after a few years of intense activity and debate, in chart published its first scholarly overview of the field. the book was titled, predictably, computers and the history of art. a bibliographic record, located in what appears to be an early online library catalogue, reads ‘no discipline assigned’ (fig. ). it shows the bibliographer’s inability to assign the title to any discipline known at the time. why the bibliographer did not classify this book under the history of art, which features in the title, gives food for thought. the present new journal and numerous recent and upcoming international events are indicative of the renewed interest in digital art history (dah).
four institutes held in the us in the summer of led to the belief that ‘digital art history takes off’. this has been a frustratingly long ‘take-off’. the tendency is to discuss and define this field through its presumed novelty and in opposition to art-historical scholarship and its dissemination formats that do not rely on digital media. digital humanities (dh) has been engaged in a similar debate. the blurred relationship between dah and dh has been noted on many occasions, for example in the digital art history workshop organized by the getty research institute and the university of málaga in . the resulting publication, with additional material, includes the burning question, on this occasion raised by johanna drucker, ‘is there a “digital” art history?’ why do we continue raising questions concerning the ontological status of dah? are we asking the wrong questions? or, being engaged in this field in one way or another, are we simply asking for recognition? those who are new to this debate, students in particular, may find this continued scrutiny of the place of digital technology in the art-historical practice and critical inquiry confusing and perhaps even pointless. these few personal reflections on the origins, historiography and condition of dah are addressed to them.

figure : computers and the history of art ( ) and the book record at http://www.getcited.org/pub/ (accessed . . ).

am i a digital humanist or a digital art historian or, simply, an art historian?

the big question for this journal (what is dah?) has been recurring since the late s. the desire to define the field anew has been the reason for convening the aforementioned recent international events. what it takes to become a digital art historian and pursue a career in this field is an interrelated question.
in most disciplines the level of professionalism is normally determined by a degree or another recognized qualification after a period of training. if one practices medicine without a diploma, one is a charlatan; if one paints without having studied fine art, one is a dilettante. is it necessary to have a degree in dah to be considered a professional digital art historian? in the department of the history of art at birkbeck college, university of london, introduced an ma in computer applications for the history of art, later renamed ma dah. postgraduate students were taught by the art historian william vaughan, photography expert anthony hamber and art imaging scientist kirk martinez, among others. these academics were engaged at the time ( – ) in the european esprit ii project, best known under the acronym vasari (visual art system for archiving and retrieval of images). the project was a collaboration between birkbeck, the national gallery in london, bramuer ltd. uk, telecom paris, the doerner institute in munich and other institutions. benefiting from the funding of around us$ million, the project developed a prototype scanner and a methodological basis for accurate color reproduction of paintings, for the purpose of recording and conservation. apart from the expertise of the teachers and their infectious enthusiasm for computing, birkbeck’s students benefited from a departmental vasari computer lab. it was well-equipped with networked mac and ibm computers, a silicon graphics workstation for imaging and d work, scanners and a wide range of software. the syllabus could be envied by many art history departments even today. the emphasis was on critical discussion of the value of using computational methods in art-historical investigations.
essay/exam questions included, for example: ‘to what extent have imaging techniques for pictorial analysis yielded concrete results for the study of art history?’; ‘discuss the value of using statistical methods in the study of history of art, using specific examples.’ [my emphasis] of course, to be able to answer such questions, it was mandatory for the student to have a background in art history, as well as acquire practical computing skills, including basic coding. i arrived at birkbeck with a master’s degree in ‘straightforward’ ‘old’ history of art and several years of curatorial museum experience. the reading list drew on a considerable body of specialist literature published in the s, with a significant number of titles published by chart and the getty art history information program (ahip). the course is no longer offered.

having graduated from birkbeck in , with an ma in computer applications for the history of art, i went on to do a phd in digital iconology. i located a small body of some early-modern paintings, drawings and prints representing nature in human form. i undertook to establish, mainly through sixteenth- and seventeenth-century cosmological texts, the purpose and meaning of such anthropomorphic representations for the contemporary beholder. i was curious to find out why a number of mediocre artists depicted landscape as a human figure; how many such works have survived, in what form and where. i wanted to describe, classify, date and attribute these double images to particular schools and propose an indexing system independent of ambiguous subject classifications. i was also driven by a determination to prove a prominent critic of my chosen computational methods wrong. i owe him my gratitude. every stage of my ‘old-fashioned’ research (pre-iconographical, iconographical and iconological) benefited from digital tools, computer graphics, pattern recognition and image processing in particular.
in the course of my unconventional career i have had the opportunity to slowly, but steadily introduce classes in dah. first, in , to a ba (hons) art and design history course at southampton institute, then to the graduate and postgraduate programs at birkbeck and the centre for computing in the humanities at king’s college london. i renamed the king’s module to digital arts and culture, making it more approachable to students. in – it is being offered for the last time. king’s digital humanities has offered me a stimulating academic environment; a scholarly community of distinction with critical enthusiasm for arts computing. from – i also worked at the courtauld institute of art on the british academy’s corpus of romanesque sculpture in britain and ireland. regrettably, there was no interest in embedding this or any other large-scale computer-based project hosted by the institute in the teaching curricula, to enable students to learn from the then cutting-edge digitization practices. project teams endeavored, in collaboration with external specialists, to produce digital images of medieval stained glass and sculpture of the highest resolution possible, coded records of objects in xml, automated some of the editorial processes, designed databases and managed large sets of data, while postgraduate students and academics continued to rely on the slide library and print reproductions in the conway and witt libraries, renowned for the custom-made, red and green filing boxes. the situation at king’s centre for computing in the humanities (now the department of digital humanities) was quite the opposite. postgraduate teaching has always revolved around scholarly computer-based projects, which established the reputation of the department. this has been a computer-friendly environment, but my art-historical specialism, with its emphasis on visual arts, rather than text, felt out of place.
it was the recognition of digital visualization as a scholarly method of digital humanities that provided a welcome context to my research, and extended teaching and training opportunities to include historical visualization and virtual museums. through experimentation with digital tools and processes my students and i have been able to better understand the complexity of human perception. the opportunity to experience and discuss, for example, the potential cognitive value of machine haptics in simulating touch and handling of museum objects that is normally not possible, made us more aware of the extent to which art-historical appreciation and museum education privilege the role of visual experience (fig. ). despite benefiting from affiliation to dh, i believe the place of dah is within academic art institutions, ideally with access to teaching art collections.

digital art history. a history

art history has been described by robert nelson as “a discipline that typically studies the histories of everything but itself, conveniently forgetting that it, too, has a history and is history.” an early use of the phrase ‘dah’ is in by sally m. promey and miriam stewart in “digital art history: a new field for collaboration”, published in american art. the authors describe teaching and learning with digital images, and recognize "the larger implications of new electronic technologies for visual education and scholarship in the museum and the academy". there is no mention of dah other than in the title, but the authors offer a number of insightful observations concerning the subject. since its initiation in , chart "has set out to promote interaction between the rapidly developing new it and the study and practice of art. [over the years] it has become increasingly clear that this interaction has led, not just to provision of new tools for carrying out of existing practices, but to the evolution of unprecedented activities and modes of thought.
it was in recognition of this change that we decided, in to hold a conference entitled 'dah' [a subject in transition: opportunities and problems], suggesting – perhaps a little ahead of time – a new kind of intellectual fusion”, explains william vaughan. the subject of the conference proved extremely controversial. therefore, the following year chart convened, again at the british academy, the conference digital art history? exploring practice in a network society, adding a question mark and the emphasis on the impact of the internet on art and ah. chart's voice was international and far-ranging, but not unanimous in the understanding of dah.

one may argue that the founding principles and methods of dah were laid down decades ago. the vision and achievements of pioneers of arts computing deserve proper recognition. some key concepts were developed well before the advent of personal computers and the internet, in anticipation of information communication technology as it is known today. “a worldwide museum information network for research, [...] lectures and simulated exhibitions (in audio/visual form) delivered electronically, upon request, to a classroom console or even to the home” was everett ellin’s vision already in the mid- s. significant considerations and applications of computer technology, demonstrating its benefit to the study of art, go back to the s. the second conference in automatic processing of art history data and documents, held in pisa in , set the international research agenda for years to come. the need to learn programming languages seemed then inevitable and frightened most art historians, but not william vaughan. in the s he initiated the development of early pattern recognition software for matching and retrieval of images of paintings.
using the university of cambridge (uk) mainframe computer, the architectural historian tim benton of the open university created a database of le corbusier’s architectural drawings and notes. he went on to enhance this resource with tools for scaling and comparing the drawings in a way not possible with paper originals. the resource is not widely available, but the insights into the architect’s creative process it has enabled are evidenced in benton’s writings. the pioneering work of marilyn aronberg lavin in the course of her research into “the narrative disposition of medieval and renaissance mural decoration”, since , involved the creation of a database of some fresco cycles and construction of a computer model of the cappella maggiore of san francesco in arezzo, decorated with piero della francesca’s the legend of the true cross. a later version of the d model is, remarkably, still available online.

figure : understanding touch and its value in art studies; a postgraduate class taught by anna bentkowska-kafel, king’s college london, and david prytherch, birmingham institute of art and design, – . panel labels: non-invasive d recording; simulated handling of virtual artefacts through machine haptics (photo: a. bentkowska-kafel); classroom; virtual museum; virtual lab; virtual artefacts museum. laser scanning of objects in the potteries museum, stoke-on-trent, uk, to create virtual d surrogates (photos: l. hewett). a screen grab of a d model of a carved ivory inlaid box, scanned and optimised for haptic display, created as a demonstration for warwick castle, uk (image: d. prytherch, user-lab, city university, birmingham). a postgraduate course in digital arts and culture, department of digital humanities, king's college london, uk. © anna bentkowska-kafel and david prytherch

when we talk about the nature and significance of dah, we recognize the rise in the status of this field.
some of the earlier concerns over art history “not being at the helm of the sweeping visualization revolution” have been resolved, although not entirely satisfactorily. however, defining the nature of dah, in all its cognitive and methodological complexity, proves more difficult. it is relatively straightforward to look at the applications of digital technology, past and current, to art practice, art scholarship, conservation and education. they give us a good picture of how the field has evolved over the years, and help to foresee its possible future directions. whether applied dah has led to establishing a theoretical basis that could set the field firmly within or apart from mainstream ah is an open question. there is no area of dah that cognitively would be distinct from ah. evolving digital analytical methods facilitate the discovery of new knowledge and review of earlier scholarship. it is particularly satisfying when this discovery comes from students, as in the case of ryan egel-andrews’s original, visualization-based research into piet mondrian’s experiments with architectural space. it challenges earlier assumptions about the artist’s lack of interest in the third dimension. a three-dimensional computer model of the artist’s studio supported the reading of mondrian’s writings and interpretation of neoplastic principles. a photo-realistic recreation of architectural space was not the aim of this visualization.

figure : visualization of piet mondrian’s studio at rue de coulmiers, paris. south wall view with and without easel. (© ryan egel-andrews, )

digital art history has been mainly promoted through applications of digital technology. little effort has been made to conceptualize this practice; to connect projects and evaluate patterns in emerging methodologies and critical perspectives. digital art history has not established its own canon of critical texts.
when asked to identify the most significant written works about new media art – , lev manovich proposed a list of ten titles. literature on applied dah abounds, but i would find it difficult to identify critical texts that have made a lasting impact.

reconnecting digital art history to art history

in the introduction to his popular anthology of critical texts in art history and its methods ( st ed. ), eric fernie refutes the apparent 'death' of art history. he addresses a need to present a history of the methods, “which art historians have found appropriate or productive in studying the objects and ideas which constitute their discipline [believing that] undergraduates might welcome a discussion of the range of approaches available to them for the study of their subject […]". when referring to the present, fernie notes ‘versatility and potential’. there is no mention of the computer. no text concerning its use or impact on key concepts is included in the anthology. while the addition of digital practice and more recent texts would be welcome in future editions (similarly to the anthology edited by donald preziosi), my identification of the lack of theoretical writings concerned explicitly with dah is not a criticism. in his keynote address to the first chart conference dedicated to dah, held in , eric fernie was not only provocative, but also right to question the very concept of dah as a subject separate from the traditional history of art. dah scholarship has investigated intrinsically ‘mainstream’ art-historical questions, such as the narrative schemes in italian renaissance wall decoration, and artistic principles of mondrian’s neoplasticism. digital iconology needs panofsky. the study of digital aesthetics would be poorer without kant or goodman. a phenomenological critique of virtual historical environments may only benefit from the writings of wilhelm dilthey.
walter benjamin’s the work of art in the age of mechanical reproduction [ ] is probably one of the most frequently cited texts in discussions of digital culture. critical perspectives of dah are well served by a much broader canon.

art history has always been interdisciplinary and always aware of broader theoretical contexts. serious art-historical arguments not only require, but necessitate erudite knowledge of, variably, the history of ideas, philosophy, history, literature, religion and beliefs, etc. earlier attempts at defining dah have been only partly successful, because they sought the differences rather than affinities with established methodologies and conventions. it is impossible to address art-historical questions, whether philosophical, social, political, formal or aesthetical, without drawing on the history of human thought and artistic practice. digital research into art and cultural heritage that has not been informed by professional art-historical knowledge and rigorous scholarly methodology often yields findings of inferior or uncertain cognitive value. examples include historical visualization that does not show the difference between known facts and hypotheses.

digital art history is not a discrete discipline, but an umbrella name for methods that involve digital tools, techniques and processes of analysis and interpretation, ranging from basic statistics to complex applications of artificial intelligence (computer vision, pattern recognition, automation, etc.). these tools and techniques are not unique to art history; they are uni-methods. the zurich declaration on digital art history ( ) reads like recommendations for digital scholarship in general.
its eight points (on methodology, authority data, archives and collections, big data, digital workspace, open access, legal matters and sustainability) describe the conditions that are necessary to practice many other disciplines. like ‘new media’ and ‘digital humanities’, ‘dah’ is a temporary name that has served its purpose. by continuing to emphasize the ‘digital’, rather than ‘art’ and ‘history’, we are contributing to further ontological disruption of the discipline. we should instead stress the significance of earlier thought and methods.

figure : students of digital arts and culture at michael takeo magruder’s de|coding the apocalypse exhibition, somerset house, king’s college london. (photo: a. bentkowska-kafel, )

hans belting believed that "both the artist and the art historian have lost faith in a rational, teleological process of artistic history, a process to be carried out by the one and described by the other". the twentieth-century rift between art-historical scholarship and art practice (about which belting argued so eloquently, if controversially) is alleviated when an art form is also a means of scholarly inquiry. the de|coding the apocalypse exhibition (somerset house, ) may serve as an example of art that has the power to reconnect artistic practice with scholarly enquiry and learning. this particular collaboration was between the computer artist, michael takeo magruder, programming and digital technology specialists, and theology scholars. visiting the exhibition has inspired the students of digital arts and culture to decode the book of revelation of st john the divine and interpret it for their own time.
according to critics, the crisis of academic art history is partly due to changing education needs and students’ loss of interest in historical art; the tendency to ignore historical sources; increasing neglect of fieldwork and archival research; “denigration of critical thinking as practiced in the pre-digital age”. it is therefore counter-productive to continue to differentiate between dah and ah. the emphasis should be on erudite historical knowledge, including earlier digital scholarship and its historiography. art, rather than application of digital technology, should be seen as the incentive for acquiring this knowledge. dah should drop the ‘digital’ label, which will soon become irrelevant anyway. the embrace of digital technology in the best possible manner and in intellectual fusion with, not in opposition to, the critical and methodological traditions of the discipline is a way of demonstrating that there is no ‘crisis’ and no ‘lagging behind’ of the kind that continues to plague the reputation of the academic history of art and discourages new students.

students are interested in history when it is presented as relevant and in a way they find appealing. the classroom-based model of teaching, with the typical projection of images of art, away from art being the subject of study, is now an inferior mode of teaching and learning. although not without logistical problems, a class at the de|coding the apocalypse exhibition, led by the artist, is a perfect scenario. students responded with equal enthusiasm, and eagerness to learn, when they visited the national gallery, london to study hans holbein the younger’s so-called ambassadors ( ), in the vicinity of other works of the artist and best examples of western painting. “what would be a digital modern equivalent to the holbein image?” is a question that in the early days of my teaching career i would not have asked of postgraduate students.
today such a question inspires international students of the google and wikipedia generation to learn about the making, meaning and provenance of holbein’s masterpiece; the art, music, science, religion and politics of the time. the students typically represent different cultural backgrounds and very different levels of general knowledge; some are unfamiliar with the european renaissance. in the case under discussion, the inspiration to learn history and digital technology came primarily from the sixteenth-century work of art. the digital collage that resulted from student collaboration was based on a thorough study of sources, surprisingly also books in print. the collage employed a variety of media, including an original musical composition. it was creative and funny, but also thoughtful and critical of the past and present. the students also learned about copyright restrictions that are preventing a public showing of their coursework.

the future of the history of art is in training of the observant eye and knowledgeable, critical mind, using digital tools when useful. chart’s early idea of hacks requires only one revision: history of art, computers, knowledge, seriously.

notes

acknowledgements: i wish to thank the editors of the international journal for digital art history, harald klinke and liska surkemper, for the invitation to share these comments with the readers of the first issue of their journal. i am grateful to the editors and trish cashen, neil grindley, hubertus kohle and jeremy pilcher for reading the manuscript and for their excellent critical comments. the paper is partly based on my unpublished talk, mapping digital art history. the missing chapter, presented to the digital art history laboratory at the getty research institute, ca, held – march .

see computers and the history of art (chart), www.chart.ac.uk. all urls active on april , unless stated otherwise.
dave guppy, will vaughan and charles ford, eds., computers and the history of art newsletter, ( ): .
anthony hamber, jean miles and william vaughan, eds., computers and the history of art (london and new york: mansell pub., ).
for example, digital art history, mellon research initiative, convened by jim coddington at the institute of fine arts, new york university, november – december ; http://www.nyu.edu/gsas/dept/fineart/research/mellon/mellon-digital.htm; digital art history laboratory held at the getty research institute, ca, – march ; digital art history: challenges and prospects, international conference, held at the swiss institute for art research (sik-isea), zürich, – june ; http://www.gta.arch.ethz.ch/events/digital-art-history-challenges-and-prospects.
anne collins goodyear and paul b. jaskot, “digital art history takes off”, caa news, college art association, october , http://www.collegeart.org/news/ / / /digital-art-history-takes-off/
digital art history: challenges, tools & practical solutions, university of málaga and the getty research institute (gri), malaga, – september , http://digitalarthistory.weebly.com/
visual resources. an international journal of documentation, special issue on digital art history edited by murtha baca et al., . ( ). j. drucker’s opening article, 'is there a "digital" art history?', – .
anthony hamber, “computer applications in the history of art. a perspective from birkbeck college, university of london”, in la revue informatique et statistique dans les sciences humaines, université de liège, . – ( ): – .
anna bentkowska, “computer-aided iconological analysis of anthropomorphic landscapes in western art, c. - ” (doctoral thesis; multimedia cd-rom, nottingham trent university, ); “ikonologia cyfrowa – nowe oblicze starej metody” [english summary: ‘digital iconology. a new approach to the old method’], in ars longa, published in memory of professor jan białostocki, ed. teresa hrankowska (warsaw: arx regia, ), – .
anna bentkowska-kafel, “electronic corpora of artefacts: the example of the corpus of romanesque sculpture in britain and ireland”, in the virtual representation of the past, eds. mark greengrass and lorna hughes (farnham: ashgate, ), – .
martyn jessop, “visualization as a scholarly activity”, literary and linguistic computing, , no. ( ): – .
anna bentkowska-kafel, “‘i bought a piece of roman furniture on the internet. it’s quite good but low on polygons’ – digital visualization of cultural heritage and its scholarly value in art history,” visual resources. an international journal of documentation, special issue on digital art history ed. murtha baca et al., , no. ( ): – .
robert s. nelson, “the map of art history”, the art bulletin, , no. ( ): .
sally m. promey and miriam stewart, “digital art history: a new field for collaboration”, american art, , no. ( ): – .
ibid, .
william vaughan, “introduction. digital art history?”, in digital art history – a subject in transition, ed. anna bentkowska-kafel, trish cashen and hazel gardiner (bristol and portland: intellect, ), .
the programmes for both conferences are available at http://www.chart.ac.uk/chart programme.html and http://www.chart.ac.uk/cfp .html respectively. selected papers have been published in two volumes of proceedings online and in book format, op. cit., note above.
everett ellin, “museums and the computer. an appraisal of new potentials,” computers and the humanities ( ): – .
laura corti and marilyn schmitt, eds., international conference in automatic processing of art history data and documents, held at the scuola superiore de pisa, – sept , scuola superiore de pisa and the getty trust ahip, santa monica ( ). the conference was first held in .
tim benton, “le corbusier and his drawings: an integrated database and drawing package”.
abstract of paper presented to digital environments: design, heritage and architecture, fifteenth annual conference of computers and the history of art, – september , university of glasgow, available at http://www.chart.ac.uk/chart /benton.html
i wish to thank marilyn aronberg lavin for clarifying the nature and scope of her collaborative computing work; the citation is to email communication of jan ; see her “piero della francesca: legend of the true cross: -d walkthrough, realtime, interactive computer model,” . rivista della fondazione piero della francesca ( ): – , revised in http://www.archimuse.com/mw / http://projects.ias.edu/pierotruecross
barbara maria stafford in: kathleen cohen, james elkins, marilyn aronberg lavin et al., “digital culture and the practices of art and art history.” art bulletin , no. ( ): .
ryan egel-andrews, “paradata in art-historical research. a visualization of piet mondrian’s studio at rue de coulmiers”, in paradata and transparency in virtual heritage, ed. anna bentkowska-kafel and hugh denard (farnham: ashgate, ): – . abstract at https://visualizationparadata.wordpress.com/ - /
lev manovich, “ten key texts on digital art: – ,” leonardo , no. ( ): – and – , also available at http://manovich.net/index.php/projects/key-texts-on-new-media-art
art history and its methods, a critical anthology, selection and commentary by eric fernie (london: phaidon, ).
ibidem, .
donald preziosi, ed., the art of art history: a critical anthology (oxford university press, ).
eric fernie’s keynote address has not been published. see anna bentkowska-kafel, editorial, digital art history – a subject in transition: opportunities and problems, chart conference proceedings online, ( ), http://www.chart.ac.uk/chart /papers/noframes/editorial.html
zürich declaration on digital art history ( ), http://www.gta.arch.ethz.ch/events/digital-art-history-challenges-and-prospects
hans belting, the end of the history of art? [ ], trans. christopher s.
wood (chicago and london: the university of chicago press, ): ix.
argula rublack, “exploring theology with digital art.” a student review of the de/coding the apocalypse exhibition, based on postgraduate coursework, cassone. the international online magazine of art and art books, april , http://www.cassone-art.com/magazine/article/ / /exploring-theology-with-digital-art/?psrc=photography-and-media
patricia mainardi, “the crisis in art history. introduction.” visual resources: an international journal of documentation , no. ( ): .
maxwell l. anderson, “the crisis in art history: ten problems, ten solutions.” ibidem, .
similar findings in diane m. zorich, transitioning to a digital world. art history, its research centers, and digital scholarship, a report to the samuel h. kress foundation and the roy rosenzweig center for history and new media, george mason university ( ), available at http://www.kressfoundation.org/uploadedfiles/sponsored_research/research/zorich_transitioningdigitalworld.pdf
bibliography
anderson, maxwell l. “the crisis in art history: ten problems, ten solutions.” visual resources. an international journal of documentation , no. ( ): – .
aronberg lavin, marilyn. “piero della francesca: legend of the true cross: -d walkthrough, realtime, interactive computer model”, in . rivista della fondazione piero della francesca, ( ): – , revised in http://www.archimuse.com/mw /
belting, hans. the end of the history of art? [ ] translated by christopher s. wood. chicago and london: the university of chicago press, .
bentkowska, anna. “computer-aided iconological analysis of anthropomorphic landscapes in western art, c. - .” doctoral thesis; multimedia cd-rom, nottingham trent university, .
bentkowska, anna. “ikonologia cyfrowa – nowe oblicze starej metody” [english summary: ‘digital iconology. a new approach to the old method’].
in ars longa, published in memory of professor jan białostocki, edited by teresa hrankowska, – . warsaw: arx regia, .
bentkowska-kafel, anna. “electronic corpora of artefacts: the example of the corpus of romanesque sculpture in britain and ireland.” in the virtual representation of the past, edited by mark greengrass and lorna hughes, – . farnham: ashgate, .
bentkowska-kafel, anna. “‘i bought a piece of roman furniture on the internet. it’s quite good but low on polygons’—digital visualization of cultural heritage and its scholarly value in art history.” visual resources. an international journal of documentation, special issue on digital art history edited by murtha baca et al., , no. ( ): – .
cohen, kathleen, james elkins, marilyn aronberg lavin, barbara maria stafford et al., “digital culture and the practices of art and art history.” art bulletin , no. ( ): .
collins goodyear, anne and paul b. jaskot, “digital art history takes off”, caa news, college art association, october , http://www.collegeart.org/news/ / / /digital-art-history-takes-off/
corti, laura and marilyn schmitt, eds. international conference in automatic processing of art history data and documents, held at the scuola superiore de pisa, – sept . scuola superiore de pisa and the getty trust ahip, santa monica, .
drucker, johanna. “is there a ‘digital’ art history?”, visual resources. an international journal of documentation, special issue on digital art history. murtha baca et al. eds., . ( ): – .
egel-andrews, ryan. “paradata in art-historical research. a visualization of piet mondrian’s studio at rue de coulmiers.” in paradata and transparency in virtual heritage, edited by anna bentkowska-kafel and hugh denard, – . farnham: ashgate, .
ellin, everett, “museums and the computer. an appraisal of new potentials.” computers and the humanities ( ): – .
fernie, eric, ed. art history and its methods, a critical anthology. london: phaidon, .
guppy, dave, will vaughan and charles ford, eds., computers and the history of art newsletter, ( ): .
hamber, anthony, jean miles and william vaughan, eds. computers and the history of art. london and new york: mansell pub., .
hamber, anthony. “computer applications in the history of art. a perspective from birkbeck college, university of london.” la revue informatique et statistique dans les sciences humaines, université de liège, . – ( ): – .
jessop, martyn. “visualization as a scholarly activity.” literary and linguistic computing, , no. ( ): – .
mainardi, patricia. “the crisis in art history. introduction.” visual resources. an international journal of documentation , no. ( ): – .
manovich, lev. “ten key texts on digital art: – .” leonardo , no. ( ): – and – , also available at http://manovich.net/index.php/projects/key-texts-on-new-media-art
nelson, robert s. “the map of art history.” the art bulletin, , no. ( ): .
preziosi, donald, ed. the art of art history: a critical anthology. oxford university press, .
promey, sally m. and miriam stewart. “digital art history: a new field for collaboration.” american art, , no. ( ): – .
rublack, argula. “exploring theology with digital art.” review of michael takeo magruder’s exhibition, de/coding the apocalypse. cassone. the international online magazine of art and art books, april , http://www.cassone-art.com/magazine/article/ / /exploring-theology-with-digital-art/?psrc=photography-and-media
vaughan, william. “introduction. digital art history?” in digital art history – a subject in transition, edited by anna bentkowska-kafel, trish cashen and hazel gardiner, – . bristol and portland: intellect, .
zorich, diane m. transitioning to a digital world. art history, its research centers, and digital scholarship, a report to the samuel h.
kress foundation and the roy rosenzweig center for history and new media, george mason university ( ), available at http://www.kressfoundation.org/uploadedfiles/sponsored_research/research/zorich_transitioningdigitalworld.pdf
zürich declaration on digital art history ( ), http://www.gta.arch.ethz.ch/events/digital-art-history-challenges-and-prospects
selected websites (accessed april )
computers and the history of art (chart), www.chart.ac.uk
digital art history challenges, tools & practical solutions, university of málaga and the getty research institute (gri), malaga, – september , http://digitalarthistory.weebly.com/
digital art history, mellon research initiative convened by jim coddington at the institute of fine arts, new york university, november – december ; http://www.nyu.edu/gsas/dept/fineart/research/mellon/mellon-digital.htm
digital art history laboratory held at the getty research institute, ca, – march , http://digitalarthistory.weebly.com/agenda.html
digital art history: challenges and prospects, international conference, held at the swiss institute for art research (sik-isea), zürich, – june ; http://www.gta.arch.ethz.ch/events/digital-art-history-challenges-and-prospects
piero della francesca: legend of the true cross. san francesco, arezzo, italy: d computer model, http://projects.ias.edu/pierotruecross
anna bentkowska-kafel is an independent scholar and part-time lecturer in digital art history in the department of digital humanities, king's college london, uk. she has been a longstanding committee member and editor for computers and the history of art (chart, est. ). she co-organized two chart conferences on digital art history held at the british academy in and , and co-edited the proceedings published by intellect.
correspondence e-mail: anna.bentkowska@kcl.ac.uk bentkowska.wordpress.com
understanding academics: a ux ethnographic research project at the university of york
version: accepted version
article: blake, michelle and gallimore, vanya ( ) understanding academics: a ux ethnographic research project at the university of york. new review of academic librarianship. issn - https://doi.org/ . / . .
new review of academic librarianship
issn: - (print) - (online) journal homepage: http://www.tandfonline.com/loi/racl
to cite this article: michelle blake & vanya gallimore ( ): understanding academics: a ux ethnographic research project at the university of york, new review of academic librarianship, doi: . / . .
accepted author version posted online: apr .
publisher: routledge
michelle blake, head of relationship management, university of york, york yo dd, michelle.blake@york.ac.uk
vanya gallimore, academic liaison team manager, university of york, york yo dd, vanya.gallimore@york.ac.uk
abstract
in spring the university of york launched a research project to better understand academic staff. ambitiously titled ‘understanding academics’, the project centred around the use of specific ethnographic methodologies and in particular two ux techniques: cognitive mapping followed by semi-structured interviews. the use of ux methodologies put the academics at the centre of the interviews, focussing on what they wanted to talk about rather than working through a pre-determined set of questions. following the interviews, a five-stage methodology for managing and analysing the research data was developed. ultimately, the research has led to a number of key outcomes: a set of ‘quick wins’; a set of longer-term practical recommendations; an evidence-based synthesis which seeks to define and explain academic life and understand the key motivations, frustrations and aspirations for academics; and finally an analysis of the key themes from the interview data.
keywords: academic staff, university libraries, usability, ux, ethnography
introduction
in spring the university of york launched a research project to better understand academic staff. ambitiously titled ‘understanding academics’, the project had three aims: to gain a better understanding of how academics at york approach their research and teaching activities; to consider how library services currently facilitate and support those activities; and to integrate the ‘academic voice’ into future service planning and development of support for academics, ensuring that the library continues to engage departments in innovative ways that respond to both current and future needs.
the project centred around the use of specific ethnographic methodologies and in particular two ux techniques: cognitive mapping followed by semi-structured interviews carried out by academic liaison librarians (alls). the project was firmly rooted in evidence and what was happening at york, focussing on what practical steps the library could take to support its academic community. the interviews carried out in spring are a snapshot in time and encapsulate and represent the commentary and thoughts of a set of academics at york at that particular moment. since then, events and circumstances have moved on, both in the world of academia and beyond (the tef and brexit, for example).
whilst this is not a literature review, the project reflected on the shifting trends and practices in academia nationally and internationally. lanclos, for example, writes about the “mess of academia” today and explores changing landscapes in academia, highlighting how new and emerging digital practices are fundamentally shifting the ways in which academics research, teach and access resources. the research carried out by lanclos into these ‘messy’ shifts is reflected in the much broader uk survey of academics, the ‘ithaka’ survey. the survey analysed responses from nearly uk researchers to a questionnaire about behaviours and expectations of researchers in today’s scholarly environment. the results from the understanding academics project aligned with findings from the ithaka survey and research undertaken by lanclos amongst others, defining new directions in which academia is moving and ways in which information services are starting to respond.
lanclos, d. m. ( ). ethnographic approaches to the practices of scholarly communication: tackling the mess of academia. insights, ( ), p .
the project had two key outputs: a synthesis of what it means to be an academic at york (motivations, frustrations and aspirations) and an analysis of the key themes emerging from the interviews. due to the sheer volume of data generated from the interviews it is not possible to present the themed analysis as part of this paper. instead this article focuses on the first output, namely the synthesis of academic life at york.
wolff-eisenberg, c., rod, a. b., & schonfeld, r. c. ( june ). uk survey of academics : ithaka s+r | jisc | rluk. retrieved from ithaka s+r website http://www.sr.ithaka.org/publications/uk-survey-of-academics- /
including adams ( ), bazar ( ), boyd, p., & smith, c. ( ), darabi, m., macaskill, a., & reidy, l. ( ), gregory, m. s.-j., & lodge, j. m. ( ), hoffman, a. j. ( ). for a full list please see the bibliography.
methodology
user experience or ux, as it is defined in the library context, is a suite of techniques based around first understanding and then improving the experiences people have when using our library services. it utilises ethnography and design to achieve this. andy priestner says ethnography “is simply a way of studying cultures through observation, participation and other qualitative techniques with a view to better understanding the subject’s point of view and experience of the world. applied to the library sector, it’s about user research that chooses to go beyond the default and largely quantitative library survey, with a view to obtaining a more illuminating and complex picture of user need. these are often hidden needs that our users do not articulate, find it difficult to describe, are unwilling to disclose, or don’t even know that they have – which special ethnographic approaches are perfect for drawing out.”
this project used two ethnographic techniques: cognitive maps and semi-structured interviews.
to prepare for the project, alls received training in neurolinguistic programming, business analysis tools and effective questioning and listening skills. the methodologies were tested on three volunteer academics. minor changes were subsequently made to the process and interviews were carried out across all three faculties (an average of four per department). the project drew on an additional interviews that had been carried out immediately prior to the formal project using more traditional interview techniques. all academic departments took part in the project.
priestner, a. ( , may). uxlibs: a new breed of conference. cilip update, p. .
for an explanation of cognitive maps and how to use them see asher, a., & miller, s. ( ). so you want to do anthropology in your library? or a practical guide to ethnographic research in academic libraries. the erial project. retrieved from http://www.erialproject.org/publications/toolkit/
all departments were included with the exception of the hull york medical school (hyms).
at the start of the interviews, academics were asked to draw a cognitive map of how they prepare for a new module or a new research project, showing each of the key stages, along with the systems or tools needed to make them work and how they link together. the task was deliberately broad in order to understand how academics worked, what they prioritised in their thinking, where the library fitted in with research and teaching activities and where any missed opportunities for support might be.
academics were asked to talk through their cognitive map and a semi-structured interview took place using open questions to facilitate discussions based on what academics wanted to talk about rather than going through a pre-prepared set of questions.
in order to help process, analyse and manage all the project data, a five-stage methodological approach was developed which involved: conducting and writing up the ethnography; coding and analysing the data in nvivo qualitative software (against a set of key themes); assigning themes for further analysis; developing project outputs and recommendations; and finally disseminating results for wider comment. the key themes of resources, digital skills and tools, research support, and digital and virtual spaces were central to the project and created all the main streams of work and subsequent actions.
synthesis at york
academic life at york is at once varied, rich, challenging, pressured, all-consuming, stressful, energising and motivating. each academic interviewed for the project experiences their job and the university in their own individualised way depending on the department they are in, whether they are carrying out research or teaching (or both), what their previous experience in academia has been, how well supported they feel in their professional development, what feedback they receive from students (either individually or through local and nationalised surveys), and how individual personal circumstances impact on their work-life balance.
it is perhaps not advisable to overly generalise about what it means to be an academic at york; however there were enough similarities emerging from the interviews to be able to articulate some broad themes and issues that help identify and explain the shifting scholarly environment in which academics now operate.
motivations
the opportunity to be part of a scholarly research community, and in particular working with students and colleagues, is central to what motivates and enthuses many of the academics interviewed in the project.
academics talked with real passion about their research areas in particular, their collaborations with colleagues around the world, and being able to encourage and motivate new students into that world:
“the academic conversation is extremely important and, following it, taking some of it and pitching it to students.” social sciences researcher
the nature of the academic role provides some academics with a challenging and stimulating balance between their own research activities and teaching students:
“i don’t think i’m a researcher who is forced to do teaching, i don’t think of myself as a teacher who fits in a bit of research. i think they’re pretty equally important to me and i enjoy them both ...the idea of just being a researcher would send me round the bend, and the idea of just teaching would send me round the bend. i like being an academic. i like the public, sociable side and i like then going away and shutting the door and getting on with my work.” humanities researcher
academics are genuinely motivated by their research activities and providing their students with the best possible student experience. that may partly be the inevitable result of national surveys such as the nss and ptes/pres, as well as internal university targets and drivers; however there was an overwhelming sense from the academics interviewed that they enjoy working with their students, that they are committed to teaching them in the best ways possible, and that they invest a huge amount of time and energy in supporting and developing departmental teaching agendas and initiatives. many put the needs of their students above their own needs, working long hours to accommodate student requests, respond to feedback, prepare for classes and mark assignments.
“start of this term was manic. i had practicals this term on monday and friday that finished at pm in the evening, that takes out big chunks and there’s nothing you can do.
you either work out of hours or it doesn’t get done.” science academic
motivations around research and teaching may help to explain the long-hours culture within which many academics operate; yet that long-hours culture is often an accepted part of academic life.
the library as an enduring, appealing physical space was also commented on by many academics interviewed. they enjoy working in the library when they get the chance to because it is possible to get more work done without being interrupted by a knock on the office door. some said that they would like to work more in the library but were not able to due to noise levels or guilt about taking precious study space away from students.
“i think it’s amazing [the library] and i sometimes come here just to work, to get away from the department. at the moment it’s a bit full so i have been over to the library in town instead.” social science academic
the importance of york’s physical library is captured by one of the academics in a recent blog post (beer, ).
beer, d. ( ). writing in the library: some brief reflections on an evocative writing space. medium. retrieved from https://medium.com/@davidgbeer/writing-in-the-library- ef f
frustrations
many frustrations were raised during the interviews, with staff acknowledging that the interviews themselves were a cathartic process. there was a genuine appreciation from academics that someone was taking an active interest in them, and affording them the opportunity to reflect on their work and their lives as academics.
in particular, a number of academics interviewed outlined the range of different roles, tasks and responsibilities that they are increasingly expected to take on in response to a new financial reality and rising student expectations.
for academics who have been in the profession for many years, there has been a significant shift in responsibilities and accountabilities which can feel overwhelming at times.
“if you speak to most academics they will probably be at capacity or overworked. we are a very different place and academics like me have seen things change rapidly.” humanities academic
many academics are juggling a variety of demands at once and this can be particularly acute for academics in smaller departments who still have the same number of demands on them but demands which have to be carried out by fewer people:
“we’ve got the same number of significant admin roles as a big department but far fewer people to spread them around. they roll them around a lot more. they tend to say - spend two years in this role and then you can have time without an admin role but that’s just not happening with us.” humanities academic
in particular, increasing administrative tasks are placing a considerable burden on academics who have to balance these with their ongoing research and teaching activities.
“i’m spending most of my life at the moment doing all the planning for this department’s research strategy and other performance supervision and all sorts of other admin. it’s that, that’s the big issue. and of course having time for that is very, very difficult because often you’re in permanent crisis mode. you think you’ve got time for some research then suddenly it’s ‘by the way we need this document by tomorrow’ and suddenly that time is gone.” humanities academic
departments have workload allocation models which allocate time to academics for administrative tasks, research and teaching activities. some departments seem to be better at organising these allocations for their staff than others.
the reality for most academics, in fact, seems to be much more pressured, with many finding it difficult to maintain an even balance, particularly at certain times of the year. term-time is inevitably focused on teaching and administrative work, with research activities often falling during the long summer vacations. academics evolve their own systems for managing workloads and spreading things out across the year.
“i’ve been writing an article for two years now. it could take ten years to write a [named subject] monograph. people don’t realise that. but that’s also with the very, very heavy teaching loads we have in [name of department]. effectively the biggest time for research i get in a teaching year is august, and if you’ve got a family, the time is very restricted.” humanities academic
“research really only tends to happen at the weekends. for a normal member of staff, the week is pretty filled with things which are not necessarily conducive to research, especially during term-time, it’s rare that you finish a paper. you’re always hoping for these breaks. i think just for this reason, research is always where you can take time away from urgent things that need to be done. i’m not sure how good we are at noting when people take their annual leave but there’s a well-known tendency that people tend to work - hours per week and have maybe half their holidays just in order to bolt on the ordinary demands, some research activity, because that’s always the thing that comes last. you can never say ‘i’m not going to mark these scripts because i have some research to do’, you can never say that, there’s always a deadline for marking, it’s always tomorrow, you can’t say ‘i’ll bring them in two days later’ to do some research.
but the other way is perfectly legitimate and it’s up to you how you do your research.” science researcher
linked to these pressures are the challenges of having to publish a specific number of research outputs in any one year, where to publish, metrics as an individual and bringing in research income which is vital for individual departments and the institution overall:
“... there’s a certain subset of journals, three or five journals, that everyone should publish in. it’s good for your cv, good for your promotion and everything, to publish in those journals.” social sciences researcher
“if you’re bringing in money, that’s all they care about...you’re only worth something if you’re bringing in money.” humanities researcher
“and they’re constantly ref ref ref funding funding funding, and you end up working all your evenings and weekends.” humanities researcher
such pressures and demands can ultimately impact on academic creativity and innovation:
“it is very important to say that the creativity is the hardest part because as an academic you get almost no time to think. i work part time, most of my time is taken up with admin, teaching and supervising students.” science academic
“absolutely, we have guidelines as to how many papers we have to publish every year because of these assessment exercises so, yes, but we try to implement them very softly as research is something which isn’t happening linearly, research requires that you just sit there and stare into the air for hours. and then after three weeks, maybe something happens! you can’t force it. the more you are under stress, the less creative people are so it’s really hard to find these empty spaces for continuous reflection on the problem.” science researcher
“i taught a course on [topic] which is a notoriously difficult piece of work, it’s brilliant but difficult.
and for that i was reading through with or commentaries every week, reading and writing lectures for and a half days solidly. that was a different kind of preparation, it was actually quite exhilarating but you can only do that if you have no other responsibilities which at the time i didn’t. when you’re constantly seeing students, you’re constantly interrupted, you’ve got administrative responsibilities, you have a stack of emails waiting to be answered, you just can’t do that.” humanities academic
there would appear to be no easy solution to the pressures of time and workloads. nearly every academic interviewed for the project talked at some point about the challenges and pressures of modern academic life.
“it’s very difficult to do anything in term time because of teaching and admin. it must do that for everyone and there isn’t an easy solution. if there is i’d love to know it!” social sciences academic
there were a number of comments about dealing with a new generation of students coming through and some of the associated frustrations that academics feel when teaching them. academics are adjusting to a generation who are used to reading in a very different way, if at all! interactions with the library have changed and those changes need to be recognised. the extent to which academics must then adapt their teaching to accommodate such changes remains unclear.
“very difficult to get students to read whole books - they just read a section or an article but not the whole book, the wider context.” social sciences academic
“i don’t really know how to describe it. it’s kind of like we’ve kind of given them a drug or something and they are never going to go back now to looking at bookshelves. they just won’t do it. maybe if i was stricter or something but what’s the point in that?
if i did that they just wouldn’t come to the seminar and then it’s just a losing battle so it’s got to be presented in a, i understand in a way it should be, really accessible, in an appealing and systematic way...but if we can’t teach them to, that’s such a skill, one of the best things about a [named humanities] degree is to know nothing about a subject and then have a grooved way of approaching the material which includes asking the right questions, looking in the right places, being able to form a picture then analysing it and writing about it and that’s what we teach them. so if they don’t learn to use the library then they are just not learning to do a big section of that.” humanities academic
aspirations
academics have aspirations for themselves, their colleagues and their students. for themselves, many aspire to have more time for research activities in particular. there is a strong sense that york should be a research-led teaching institution and that teaching is greatly improved and informed by a strong research culture. aside from comments about administrative burdens, the importance of a continued focus on research-led teaching was one of the key themes raised by academics across all three faculties.
“research-led teaching? we like to think so. sometimes it happens in rd or th year. although there are modules i can think of that the research edge is too far from the material that needs to be taught.” science researcher
“all of this makes my teaching easy because all of the subjects i teach tend to be rapidly moving. everything we have been using for teaching has been published within the last year. it’s just moving so fast. therefore i’m using very current papers to design my teaching, and i’m thinking – i’m thinking about what conferences i have been to. all the time i’m melding my research and teaching, that’s why i have so much time compared to other academics!
it is so collaborative and content-rich that i am constantly learning from my students.”
humanities researcher

conversations with students can be hugely stimulating and productive, and can ultimately help feed into the wider research process for academics:

“can the teaching shape your research? can the students add to this? i’m hoping so! because, based on my experience with rd year students, doing dissertations etc., the biggest contributions are not so much questions asked but case studies. if i think about the questions they asked, they weren’t earth-shattering, great, but it was kind of the selection of empirical cases or specific questions they asked about those things that were interesting and illuminating. the questions they ask are often better than ma students! the kind of conversations i’ll have, especially with three seminars, will actually be really helpful. now how will they concretely help me and affect how i do research? maybe not, but i’m sure two or three years from now i can look back after having done an interview, would i have asked that question if i hadn’t had that experience in a seminar years ago? i found that days after i do seminars in the morning, i often do much more writing that afternoon even though it’s often unrelated.”
social sciences researcher

one academic in the social sciences is actively working on new ways of developing research-based teaching in the department to ensure that the course is attractive to current and incoming students. the department had previously made much more of a distinction between teaching activities and research activities, but now the two are working together in a much more blended and integrated way.
postgraduate students taking this particular course are able to experience live-action, research-based teaching, as they are taken through practical research projects and encouraged to develop real world knowledge and a new skill-set which is already proving more engaging to the students. real research is inherently interesting for student recruitment and creates attractive, practical courses. more academics are wanting to work in these ways.

a number of academics expressed a need for better personal support for their work and their professional development. to some extent, academics feel left to get on with things themselves which, for new academics in particular, can feel stressful.

“i’m finding the balance of teaching and all the other responsibilities of different jobs that i’ve been given and new modules and all sorts of things, have severely restricted what i’ve been able to do. and also getting a sense from anywhere of what is sensible and what you should be able to achieve, how to do that? you might get as far as ‘you should be writing two articles a year’ but how do i write those, should it be projects i’m already working on, do i need to find time to go and start some research?”
humanities researcher

for many academics, active collaborations with colleagues both within and outside the institution are highly valued and help make them feel less isolated; indeed working with others is often seen as a strong motivation and aspiration. across all three faculties, collaboration has become a central and highly active part of the academic research process:

“there’s pretty much no project i do that isn’t collaborative.
there’s almost nothing i sit and do on my own.”
science researcher

collaboration is often built in inherently from the start of a research project and is part of the initial thinking around the project:

“is it small-scale enough to do on my own or would it involve others, if so who?”
humanities researcher

academics talked about the importance of being able to discuss new projects with their colleagues; however time pressures have meant that for some academics, they simply do not have the same opportunities for casual conversations with colleagues about their research. where internal departmental collaboration works well seems to be where departments have actively considered how well their research themes work across the department as a whole, and have put in place various platforms for staff to discuss their research activities so that everyone knows what everyone else is doing. research funding is often the driver behind many collaborative research projects: “grants make research a much more sociable business!” (humanities researcher).

outside specific research projects, academics draw on wider research communities for discussion, networking and informal collaborations. a number of academics talked about the importance of attending conferences to find out and discuss new research taking place:

“you don’t want someone who is doing exactly what you’re doing. what you want is complementary but non-competing [research]”
science researcher

membership of societies is another way of engaging with the wider research community. academics talked about their use of internal and external email groups to facilitate regular discussion on their research fields. this can be particularly helpful for small research groups who are working in highly specialist areas.
benefits and value of the research

the academic synthesis was a key outcome of the project and has enabled the library to gain in-depth knowledge and understanding of academic practice and needs. having analysed the data about the lives of academics, it became apparent that many of their motivations, frustrations and aspirations have little or nothing to do with the library, yet they can help us to understand what is going on in their world, what their key issues and priorities are, how these may impact on their relationship with the library and ultimately what they need from us. those that did relate to the library formed the basis of a series of developments and initiatives to improve services and support for academics over the past year.

one of the key goals of ux is to be responsive and to be able to make immediate changes where you can rather than waiting for the project to finish. the library implemented a series of ‘quick wins’ designed to address frustrations academics expressed. an example of a ‘quick win’ at york was changing the borrowing system to give all academics the same package as part-time staff, in essence giving them longer to return items if recalled.

another significant issue arising from the interviews was around our reading lists system which was an old, in-house system no longer fit for purpose. data from the interviews was synthesised into user requirements. these were subsequently used as part of the supplier selection. the library has subsequently launched the new reading list system and a recent focus group with academic staff confirmed that the new software is meeting their needs.

alongside the ‘quick wins’ are a series of longer-term recommendations which the library is continuing to address. the themed analysis was condensed into three key areas which have formed the basis for the new library strategy ( - ):

space: how do we make the physical library virtual?
how do we create a go-to online presence that does the job of a physical library but in a virtual space?

scholarship: how do we make information available in an increasingly open, virtual and collaborative scholarly environment? how can our strategy reflect and anticipate new directions in accessing and using resources?

skills: what does it mean for us to be at the forefront of learning delivery? how will we ensure a culture of digital skills curiosity and engagement, and who ultimately has responsibility for developing digital skills literacies in staff and students across the university?

the project has had a number of benefits and confirmed why it has been so important to undertake. firstly, and perhaps most importantly, we have built up our relationships with academics across the university, many of whom we had not met before. we have new connections, and new and exciting possibilities for working with academics on emerging areas like digital scholarship. we have developed an nvivo dataset of academic views across a range of topics which we use in other library projects to ensure that our vision for integrating the academic voice into our planning becomes a reality.

one of the most tangible benefits of the project has been developing the confidence and experience of the alls. the team have built up a lot of knowledge and experience of ux techniques and can see how it works in practice. ux is now key to our approach and we have established a group of staff to oversee its use across information services and to promote its wider application across the university community. our focus on ux has been highly commended by assessors for the customer services excellence award (cse) which information services has achieved over the past few years, with the assessor in commenting that: “the use of ethnographical research to help develop customer insight has come to fruition with recommendations in a number of projects.”
finally, we are in discussions to partner with others to produce user personas based on behaviour (rather than based on role). this will maximise use of the data and allow further integration of the academic user voice into future service development both at york and at other uk he institutions.

conclusion

the project has been endlessly fascinating, opening a window into the world of academics and understanding more about how modern libraries can and should be supporting the academic endeavour in the twenty-first century. it has demonstrated the importance of effective relationships between the library and its academic community, based on sound knowledge, understanding, evidence, respect and trust. such relationships underpin all of the work that we do and demonstrate an ongoing commitment to the university’s drive for excellence for all.

publications
article
a vision for open cyber-scholarly infrastructures
costantino thanos
institute of information science and technologies (isti) of the italian national research council (cnr), via g. moruzzi , pisa , italy; costantino.thanos@isti.cnr.it
academic editor: craig smith
received: february ; accepted: may ; published: may

abstract: the characteristics of modern science, i.e., data-intensive, multidisciplinary, open, and heavily dependent on internet technologies, entail the creation of a linked scholarly record that is online and open. instrumental in making this vision happen is the development of the next generation of open cyber-scholarly infrastructures (ocis), i.e., enablers of an open, evolvable, and extensible scholarly ecosystem. the paper delineates the evolving scenario of the modern scholarly record and describes the functionality of future ocis as well as the radical changes in scholarly practices including new reading, learning, and information-seeking practices enabled by ocis.

keywords: scholarly record; linked scholarly record; semantic publishing; enhanced publication; linked data; scientific article models; information exploration; topic map; reading practices; learning practices; information seeking
introduction modern science has undergone deep transformations due to recent advances in information technology, computer infrastructures, and the internet as well as the development of new high-throughput scientific instruments, telescopes, satellites, accelerators, supercomputers, and sensor networks that are generating huge volumes of research data. modern science is increasingly based on data-intensive computing; it tries to solve complex problems not within a discipline but across disciplines (multidisciplinary/interdisciplinary science); and it is conducted by scientists at different locations at the same (synchronous) or different (asynchronous) times by collapsing the barrier of distance and removing geographic location as an issue. finally, there is an emerging consensus among the members of the academic research community that the practices of modern science should be congruent with “open science”. global scientific collaboration takes many forms, but from the various initiatives around the world a consensus is emerging that collaboration should aim to be “open” or at least should include a substantial measure of “open access” to the results of research activities. this new science paradigm and a revolutionary process of digitization of information have created enormous pressure for radical changes in scholarly practices. they have induced changing demands and created expectations of scholars that are significantly different than they were just a few years ago. today one of the main challenges faced by scholars is to make the best use of the world’s growing wealth of scientific information. 
scholars need to be able to find the most authoritative, comprehensive, and up-to-date information about an important topic; to find an introduction to a topic that is organized by an expert; to conduct perspective analyses of scientific literature (for example, what arguments are there to refute this article?); to conduct a lineage analysis (for example, where did this idea come from?); etc. they also need to be able to navigate through an information-rich environment in order to discover useful knowledge from data, i.e., to extract high-level knowledge from low-level data in the context of huge volume datasets.

a further pressure for radical changes in scholarly practice is caused by the revolutionary changes underway in scientific communication. these include:
• scientific data are becoming key to scientific communication; as such they must be integrated with scientific publications in order to support repeatability, reproducibility, and re-analyses.
• scientific data and publications have to cross disciplinary boundaries; therefore, in order to maintain the interpretative context they must be semantically enhanced, i.e., semantic mark-up of textual terms with links to ontologies/terminologies/vocabularies, interactive figures, etc. semantic services will help readers to find actionable data, interpret information, and extract knowledge.
• scientific literature is increasingly online. the digital form of the article has permitted the definition of new scientific article models, based on modularization techniques, which overcome the traditional linear form of the scientific article.
• advanced linking technology allows the meaningful interconnection of datasets and article modules in many different ways.
this permits the creation of several patterns of interest for scholars and scientists.

instrumental in making radical changes in scholarly practices happen is the development of the next generation of open cyber-scholarly infrastructures (ocis). in a previous paper [ ] i delineated the future of digital scholarship and argued that connectivity is its technological foundation. in this paper i elaborate this concept further and argue that building future ocis will contribute to radical changes in the way scholars create, communicate, search for, and consume scientific information. ocis have the potential to completely reshape scientific research.

the paper is organized as follows. it first describes the evolving scenario of the modern scholarly record, and then a linked scholarly record that meets the needs of modern science. it goes on to describe the radical changes in scholarly practices enabled by connectivity and semantic technologies, followed by the functionality of the future ocis. it ends with some concluding remarks.

the modern scholarly record

a scholarly record is taken to be an aggregation of scientific journals, gray literature, and conference presentations plus the underlying datasets and other evidence to support the published findings. moreover, the communications of today’s scholars encompass not only journal publications and underlying datasets but also less formal textual annotations and a variety of other work products, many of them made possible by recent advances in information technology and the internet. this evolving scholarly record can also include news articles, blog posts, tweets, video presentations, artworks, patents, computer code, and other artifacts. this record is highly distributed across a range of libraries, institutional archives, publishers’ archives, discipline-specific data centers, and institutional repositories.
it is also poorly connected, and this constitutes a major obstacle to full engagement by scholars. two of the main constituents of the modern scholarly record are the scientific dataset and the scientific article.

the scientific dataset

there is no single well-defined concept of dataset. informally speaking, we can think of a dataset as a meaningful collection of data that is published and maintained by a single provider, deals with a certain topic, and originates from a certain experiment/observation/process. in the context of the linked data world, a dataset means a set of rdf triples that is published, maintained, or aggregated by a single provider. the concept of collection also suggests that there is an intentional collecting of the constituents of a dataset. in [ ] different kinds of relatedness among the grouped data have been identified:
• circumstantial relatedness: a dataset is thought of as consisting of data related by time, place, instrument, or object of observation.
• syntactic relatedness: data in a dataset are typically expected to have the same syntactic structure (records of the same length, field values in the same places, etc.).
• semantic relatedness: data in a dataset may be about the same subject or make assertions similar in content.

a dataset, once accepted for deposit and archived, is assigned a digital object identifier (doi) by a registration agency. a doi is a unique name (not a location) within a name space of a networked data environment and provides a system for persistent and actionable identification of datasets. it must: (i) unambiguously identify the dataset; (ii) be globally unique; and (iii) be associated with a naming resolution service that takes the name as input and shows how to find one or more copies of the identical dataset.
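The distinction between a name and a location, and the role of the resolution service, can be illustrated with a small sketch. This is a toy in-memory registry under stated assumptions, not the actual DOI infrastructure: the class `NameResolver`, its methods, the identifier string, and the `example.org` addresses are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NameResolver:
    """Toy resolution service: maps a persistent name to the locations of its copies."""

    _registry: dict = field(default_factory=dict)

    def register(self, name: str, location: str) -> None:
        # The name is the stable key; locations are added as copies appear.
        locations = self._registry.setdefault(name, [])
        if location not in locations:
            locations.append(location)

    def relocate(self, name: str, old: str, new: str) -> None:
        # A copy moves to a new address; the name used to cite it does not change.
        locations = self._registry[name]
        locations[locations.index(old)] = new

    def resolve(self, name: str) -> list:
        # Unknown names fail loudly rather than returning a guess.
        if name not in self._registry:
            raise KeyError(f"unregistered identifier: {name}")
        return list(self._registry[name])


resolver = NameResolver()
dataset_name = "10.99999/hypothetical.dataset"  # made-up identifier, not a real DOI
resolver.register(dataset_name, "https://archive-a.example.org/ds/42")
resolver.register(dataset_name, "https://mirror-b.example.org/ds/42")
resolver.relocate(
    dataset_name,
    "https://archive-a.example.org/ds/42",
    "https://archive-a.example.org/datasets/42",
)
copies = resolver.resolve(dataset_name)
```

After the relocation, a citation that uses `dataset_name` still resolves to both copies, which is the persistence property the three requirements above are meant to guarantee.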
a dataset must be accompanied by metadata, which describes the information contained in the dataset, details of data formatting and coding, how the dataset was collected and obtained, associated publications, and other research information. metadata formats range from a text “readme” file, to elaborate written documentation, to systematic computer-readable definitions based on common standards. the doi as a long-term linking option from data to source publication is of fundamental importance. dois could logically be assigned to every single data point in a dataset; in practice, the assignment of a doi is more likely to be to a meaningful set of data following the index principle of functional granularity: identifiers should be assigned at the level of granularity appropriate for the functional use that is envisaged. however, having the ability to make references to subsets of datasets would be highly desirable. datasets may be subdivided by row, by column, or both. devising a simple standard for describing the chain of evidence from the dataset to the subset would be highly valuable. the task of creating subsets is relatively easy and is done in a large variety of ways by researchers. with respect to the versioning problem, i.e., how to treat subsequent versions of the same dataset, it is recommended to treat them as separate datasets. an emerging “best practice” in the scientific method is the process of publishing scientific datasets. dataset publication is a process that allows the research community to discover, understand, and make assertions about the trustworthiness and fitness for purpose of the dataset. in addition, it should allow those who create datasets to receive academic credit for their work. the ultimate aim of dataset publication is to make scientific datasets available for reuse both within the original disciplines and the wider community. 
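One way to picture a reference to a subset of a dataset, divided by row, by column, or both, is a small record that names the parent dataset and the selection applied to it. The sketch below is only an assumption about what such a description might look like; it is not a proposed standard, and the class `SubsetReference`, the identifier, and the sample observations are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubsetReference:
    """Hypothetical description of a subset: parent identifier plus row/column selection."""

    parent_id: str        # identifier of the full dataset (made-up value below)
    rows: tuple = ()      # zero-based row positions to keep; empty means all rows
    columns: tuple = ()   # column names to keep; empty means all columns

    def apply(self, table: list) -> list:
        # Select rows first, then project onto the requested columns.
        selected = [table[i] for i in self.rows] if self.rows else list(table)
        if not self.columns:
            return [dict(row) for row in selected]
        return [{name: row[name] for name in self.columns} for row in selected]


# Invented sample data standing in for a published dataset.
observations = [
    {"station": "A", "temp_c": 11.5, "salinity": 30.1},
    {"station": "B", "temp_c": 12.0, "salinity": 29.8},
    {"station": "C", "temp_c": 10.9, "salinity": 31.2},
]

# Each version of a dataset is treated as a separate dataset, so the version
# is part of the parent identifier rather than a field of the reference.
reference = SubsetReference(
    parent_id="hypothetical-id/coastal-observations-v1",
    rows=(0, 2),
    columns=("station", "temp_c"),
)
subset = reference.apply(observations)
```

Because the reference is immutable and carries the parent identifier, it records the step from dataset to subset and can be re-applied by anyone holding the same version of the parent.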
the dataset publication process is composed of a number of procedures that together implement its overall functionality. in particular, they should support the following functionality relevant for achieving dataset reusability: (i) dataset peer-reviewing; (ii) dataset discoverability; (iii) dataset understandability; and (iv) dataset assessability.

the modern scientific article

we foresee that in an increasingly online and interconnected scientific world the structure, functionality, and presentation of the scientific article is destined to change radically. the article will become a window for scientists and scholars, allowing them to not only actively understand a scientific result, but also to reproduce it or extend it; it will act as an access point for, or interface to, any type of global networked resource. another way of viewing the modern article is as an interface through which authors and readers interact. the future digital scientific article will feature several important characteristics:

first, modularization or disaggregation of the scientific article, i.e., the linear form of the scientific article will be overcome and it will be presented as a network of modules meaningfully connected by relations. in essence, a new modular model of scientific article will emerge with two main constituents: modules and relations. the modules are conceptual information units representing self-contained, though related types of information. they can contain organizational information concerning the structural aspects of an article as well as scientific discourse information concerning hypotheses made by the author of an article, evidence for the hypotheses, underlying datasets, findings, pointers to future research, etc. modules could be located, retrieved, and consulted separately as well as in combination with related modules.
different types of relations between the modules can be established: organizational relations that are based on the structure of the article; discourse relations that define the reasoning of the argument; causal relations that establish a causal connection between premise and conclusion; comparison relations where the relation is one of contradiction, similarity, or resemblance. deconstructing a scientific article into semantically typed modules will enable scientists and scholars to access and manipulate individual modules, such as hypotheses, conclusions, references, etc. it will allow a reader to compile her/his own version, depending on interests and background; the reader becomes a creator of her/his individual reading versions. in essence, modularization will allow a more flexible interaction between article author and reader. second, a digital scientific article is intrinsically dynamic, i.e., mutable. this characteristic of the digital article allows the author to update it at any time, to revise it by changing its content, and expand it by adding annotations, hyperlinks, comments, etc. it also allows the inclusion of non-static information types such as animation, moving images, and sound. third, a digital scientific article can have embedded software that can allow one to, for example, compute a formula and visualize the results while reading the article; or to link to the underlying datasets, thus allowing the reader to perform additional analyses on the data. an example of the use of embedded software in articles is the concept of a multivalent document. several models for representing a scientific article have appeared in the literature. the name used to indicate these models is enhanced publication. enhanced publication is a dynamic, versionable, identifiable compound of objects combining an electronic publication with embedded or remote research data, extra materials, post publication data, database records, and metadata. 
it is an umbrella concept that embraces many different article models. the conceptual model of an enhanced publication includes a mandatory text body and a set of interconnected sub-parts. several instantiations of this model have been proposed in the literature. these instantiations, essentially, regard the way the mandatory text body is organized, the type of the sub-parts, and the way they are connected to the text. a first instantiation regards the case where the sub-parts are essentially supplementary material along with the mandatory text. examples include presentation slides, appendixes to the text, tables, etc. in this case, generally, the sub-parts do not have an identifier and are not described by metadata. a second instantiation regards the case where the mandatory text body is not a single block of text but is structured in a number of interconnected modules, such as abstract, sections, bibliography, etc. a third instantiation regards the case where the sub-parts are scientific datasets external to the publication, i.e., stored in discipline-specific data centers/repositories with their own identifiers (dois). in this case, the scientific datasets are cited from within the text using a doi system. a fourth instantiation regards the case where some sections or modules of the text body or some sub-parts are live, meaning that they can be activated in order to produce visual content, video streaming, etc. finally, a fifth instantiation regards the case where some sections or modules of the text body or sub-parts can be dynamically executed at run time.

a generalization of the concept of “enhanced publication” is the concept of research object (ro). informally, a research object is intended as a semantically rich aggregation of resources that possesses
it should allow a principled publication of the results of research activity in a self-contained manner that facilitates the sharing and reuse of these objects. an ro bundles together all the essential information relating to a scientific investigation, i.e., article, data produced/used, methods used to produce and analyze that data, as well as the people involved in the investigation. in addition, an ro includes additional semantic information that allows one to link its components in a meaningful way. scientific articles are increasingly being assigned dois that provide live links from online citing articles to the cited articles in their reference lists. in addition, they should be enriched with appropriate metadata. dois could logically be assigned to every single article module; having the possibility to make references to article modules would be highly desirable. . linked scholarly record the scholarly record is poorly interconnected. this is in opposition to modern science, which requires the establishment of discipline-specific linked scientific records in order to effectively support scholarly inquiry. in fact, scientists and scholars need to be able to move from hypotheses to evidence, from article to article, from dataset to dataset, and from article to dataset and conversely. they need to discover potentially significant patterns and ways to make meaningful connections between parts of the scholarly record. from a conceptual point of view, a linked scholarly record means that its single parties, i.e., a dataset, an article module, etc. constitute single nodes of a networked scholarly record that can be accessed by any scholar, anytime, anywhere. the two pillars of the modern scholarly communication are discipline-specific data centers and research digital libraries, whose technologies and organizations allow researchers to store, curate, discover, and reuse the data and publications they produce. 
made to implement complementary phases of the scientific research and publication process, they are poorly integrated with one another and do not adopt the strengths of the other. such a dichotomy hampers the realization of a linked scholarly record. however, i am confident that the recent technological advances in many fields of information technology will make it happen.

linked discipline-specific data spaces

new high-throughput scientific instruments, telescopes, satellites, accelerators, supercomputers, sensor networks, and running simulations are generating massive amounts of data. the availability of huge volumes of data is revolutionizing the way research is carried out and leading to a new data-centric way of thinking, organizing, and carrying out research activities. the most acute challenge stems from research teams relying on a large number of diverse and interrelated datasets but having no way to manage their scientific data spaces in a principled fashion. an example taken from [ ] illustrates the requirement for interlinking scientific datasets.

“consider a scientific research group working on environmental observation and forecasting. they may be monitoring a coastal ecosystem through weather stations, shore- and buoy-mounted sensors, and remote imagery. in addition, they can be running atmospheric and fluid-dynamics models that simulate past, current, and near-future conditions. the computations may require importing data and model outputs from other groups, such as river flows and ocean circulation forecasts. the observations and simulations are the inputs to programs that generate a wide range of data products, for use within the group and by others: comparison plots between observed and simulated data, images of surface-temperature distributions, animations of salt-water intrusion into an estuary. such a group can easily amass millions of data products in just a few years.
soon, such groups will need to federate with other groups to create scientific data spaces of regional or national scope. they will need to easily export their data in standard scientific formats, and at granularities that do not necessarily correspond to the partitions they use to store the data.”

therefore, there is a need for mechanisms and approaches that allow the linking of datasets produced by diverse research teams. linking a dataset refers to the capability of linking it to other external datasets, which in turn can be linked to from external datasets. linking data will allow the sharing of scientific data on a global scale and interconnect data between different scientific sources. it also makes data access, i.e., search and exploration, and data exploitation, i.e., integration and reuse, much easier. the process that enables the linking of datasets is known as data publishing.

a generalization of the linking data concept leads to the creation of linked scientific data spaces of disciplinary or interdisciplinary scope. a data space can be considered as an abstraction for the management of linked datasets. the concept of scientific data spaces responds to the rapidly-expanding demands of “data-everywhere”.
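The notion of a data space as a managed collection of linked datasets can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the names `Dataset`, `crawl`, and `linked_from`, and all identifiers such as `ds:a`, are hypothetical and not taken from any standard or from the cited work. It shows datasets that link out to other datasets, can be linked to from others, and can be discovered by following links.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A dataset in a data space (hypothetical minimal record)."""
    identifier: str                              # e.g. a DOI or URI; made up here
    description: str
    links: list = field(default_factory=list)    # identifiers of linked datasets


def crawl(space: dict, start_id: str) -> list:
    """Discover datasets reachable from start_id by following links (breadth-first)."""
    seen = [start_id]
    queue = deque([start_id])
    while queue:
        current = space[queue.popleft()]
        for target in current.links:
            # Ignore links leaving the known space and datasets already visited.
            if target in space and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


def linked_from(space: dict, dataset_id: str) -> list:
    """Identifiers of datasets in the space that link to dataset_id."""
    return [d.identifier for d in space.values() if dataset_id in d.links]


# Illustrative data space; identifiers and descriptions are invented.
space = {
    "ds:a": Dataset("ds:a", "buoy sensor readings", ["ds:b"]),
    "ds:b": Dataset("ds:b", "regional averages", ["ds:c", "ext:unknown"]),
    "ds:c": Dataset("ds:c", "shore sensor readings"),
    "ds:d": Dataset("ds:d", "river flow forecasts", ["ds:a"]),
}
discovered = crawl(space, "ds:a")
incoming = linked_from(space, "ds:a")
```

Starting from `ds:a`, the crawl reaches `ds:b` and `ds:c` but not `ds:d`, which only links into `ds:a`; this is why incoming links are computed separately.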
a linked discipline-specific data space should enjoy the following properties:
• it contains datasets specific to a scientific discipline;
• any scientific community belonging to this discipline can publish on the scientific data space;
• dataset creators are not constrained by the choice of vocabularies with which to represent them;
• datasets are connected by links creating a global data graph that spans datasets and enables the discovery of new datasets;
• datasets are self-describing;
• datasets are strictly separated from formatting and presentational aspects;
• the scientific data space is open, meaning that applications do not have to be implemented against a fixed set of datasets, but can discover new datasets at run time by following the data links.

a managed linked data space will enable researchers to start browsing in one dataset and then navigate to related datasets; or it can support data search engines that crawl the data space by following links between datasets. however, in order to be able to implement a linked discipline-specific data space, the ability to meaningfully and formally describe the datasets that participate in the linked data space, as well as the links among them, is of paramount importance. metadata is the descriptive information about datasets that explains the measured attributes, their names, units, precision, accuracy, data layout, and ideally a great deal more. most importantly, metadata should include the dataset lineage, i.e., how the dataset was measured, acquired, or computed. equally important is the concept of the dataset identifier, i.e., a doi (or uri), as a mechanism for referring to datasets on which multiple data providers agree. modeling the many kinds of relationships existing between datasets is equally important. we need to define metadata models for describing links.
we must be able to model, for example, dataset b as a temporal/spatial abstraction of dataset a; or show that datasets a and b are generated independently but both reflect the same observational or experimental activity; or that datasets a and b were generated at the same time and by the same organization, etc. in order to be able to exploit the full potential of the linked data space, it is important to make sense of the heterogeneous datasets that constitute a linked data space. this can be achieved by adopting formalisms for representing discipline-specific ontologies. an initiative that implements the concept of linked data space by using the semantic web technologies is linked data [ ].

linked scientific articles: linked literature spaces

above i have described the network-centric nature of the future scientific article. deconstructing the scientific article into semantically typed modules allows the structuring of the scientific information as a multitude of interlinked modules. this will allow us to answer questions like: (i) what is the evidence for this claim? (ii) was this prediction accurate? (iii) what are the conceptual foundations for this idea? (iv) who has built on this idea? (v) who has challenged this idea, and using what kind of arguments? (vi) are there distinctive perspectives on this problem? and (vii) are there inconsistencies within this school of thought?

in essence, it will also enable the author to create paths of reasoning within the article as well as between articles. on the other hand, by following such paths the reader is enabled, for example, to assess the validity of a claim by gaining insight into its empirical backing. in computational linguistics the structure and the relations of discourse have been extensively studied, as well as the relationship between discourse semantics and information packaging (modularization).
some studies have suggested that the modularization of discourse is not based purely on semantics but that the rhetorical nature of discourse relations must also be taken into consideration when deconstructing a scientific article. it must also be pointed out that some segments of discourse play a subordinate role relative to previous segments they are connected to, while others are considered on a par; for example, the result module has a coordinating role while the explanation module is a subordinate one. this distinction, often called subordinating/coordinating, must also be considered when an article is decomposed into a number of modules. in essence, breaking a scientific article into different modules is a difficult conceptual operation as it should take into consideration the discourse structure and relations. in the literature many models of discourse relations have been proposed; as an example, a small set of eight relations has been proposed in order to support a principled modularization of a scientific article and a realistic scientific reasoning:
• proves/refutes
• supports/contradicts
• agrees/disagrees
• suggests/does not suggest

the discourse relations are materialized by explicitly labeled links. a link can be defined as a uniquely characterized, explicit, directed connection between modules that represents one or more different kinds of relations. we can have different types of links: semantic links implement relations of similarity, contrast, part of, etc.; rhetorical links implement relations of definition, explanation, illustration, etc.; and pragmatic links implement relations of prerequisite, usage, example, etc. modularization and linking will enable scientific information to become part of a global, universal, and explicit network of knowledge. literature will be modeled as a network of modules.
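To illustrate how explicitly labeled, directed links between modules could support questions such as "what is the evidence for this claim?", here is a minimal Python sketch. It encodes the eight relations listed above as an enumeration; everything else (the `Link` record, the module identifiers such as `art1#claim`, and the two query helpers) is a hypothetical illustration rather than a model proposed in the literature.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """The eight discourse relations named in the text."""
    PROVES = "proves"
    REFUTES = "refutes"
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    AGREES = "agrees"
    DISAGREES = "disagrees"
    SUGGESTS = "suggests"
    DOES_NOT_SUGGEST = "does not suggest"


@dataclass(frozen=True)
class Link:
    """An explicit, directed, labeled connection between two article modules."""
    source: str          # identifier of the module making the statement
    target: str          # identifier of the module being addressed
    relation: Relation


def evidence_for(links, claim_id):
    """Modules that prove or support the given claim module."""
    backing = {Relation.PROVES, Relation.SUPPORTS}
    return [l.source for l in links if l.target == claim_id and l.relation in backing]


def challenges_to(links, claim_id):
    """Modules that refute, contradict, or disagree with the given claim module."""
    opposing = {Relation.REFUTES, Relation.CONTRADICTS, Relation.DISAGREES}
    return [l.source for l in links if l.target == claim_id and l.relation in opposing]


# Invented module identifiers of the form "<article>#<module>".
links = [
    Link("art1#results", "art1#claim", Relation.SUPPORTS),
    Link("art2#results", "art1#claim", Relation.CONTRADICTS),
    Link("art3#proof", "art1#claim", Relation.PROVES),
]
supporting = evidence_for(links, "art1#claim")
opposing = challenges_to(links, "art1#claim")
```

Because links are directed and typed, questions (i) and (v) from the list of reader questions reduce to filtering on the target module and the relation label.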
a generalization of the network centrality of scientific information leads to the creation of a linked scientific literature space of disciplinary or interdisciplinary scope. a scientific contribution thus becomes a rigorously connected, substantiated node or region in a linked scientific literature space. however, in order to be able to implement a linked, discipline-specific literature space, it is important to meaningfully and formally describe the article modules that participate in the linked literature space as well as the links among them. we need semantically rich metadata models to describe the article modules as well as the relations between them. equally important is the concept of the module identifier, i.e., a doi (or uri), as a mechanism for referring to article modules on which multiple publishers agree.

linking literature spaces with data spaces

the need to link datasets to scientific publications is starting to be held as a key practice, underpinning the recognition of data as a primary research output, rather than as a byproduct of research. linking data to publications will enable scientists, while reading an article, to go off and look at the underlying data and even redo analyses in order to reproduce or verify results. the distinction between data and publication is destined to disappear as both are made increasingly available in electronic form. it is the task of the linking technology to support the next step, i.e., their integration.

publishers are beginning to embrace the opportunity to integrate data with scientific articles, but barriers to the sustainability of this practice include the sheer volume of data and the huge variety of data formats. several levels of integration can be achieved, ranging from tight to weak integration. a tight integration is achieved when datasets are contained within peer-reviewed articles.
in this publishing model, the publisher takes full responsibility for the publication of the article and the aggregated data embedded in it and the way it is presented. the embedding of the dataset into the publication makes it citable and retrievable. however, the reusability of the dataset is limited as it is difficult to find it separate from the publication. this publishing model is not appropriate when the embedded dataset is too large to fit into the traditional publication format. in addition, the preservation of these enhanced articles is more demanding than for traditional articles. a less tight integration is achieved when the datasets reside in supplementary files added to the scientific article. the publisher offers authors the option of adding supplementary files to their article containing any relevant material that will not fit the traditional article format or its narrative, such as datasets, multimedia files, large tables, animations, etc. there are some issues related to this publishing model: they mainly concern the preservation of the supplementary files as well as the ability to find them independently from the main publication. a weak integration is achieved when the datasets reside in institutional data repositories or in discipline-specific data centers with bi-directional linking to and from articles. in this publishing model the article should include a citation and links to the dataset. the data preservation is the responsibility of the administrators of the institutional repository or data center. in this model the datasets become better discoverable and can be reused separately from the publication and in combination with other datasets. however, this publishing model depends very much on the existence of proper and persistent linking mechanisms enabling bi-directional citation. in the big data era it is obvious that only the weak integration scheme is viable. 
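The weak integration model depends on links that can be followed in both directions. As a rough sketch of what bi-directional citation means in practice, the Python below keeps both directions consistent in a single operation. The class `CitationRegistry`, its method names, and the identifiers are hypothetical; real repositories and publishers would rely on persistent identifiers and shared registration services rather than an in-memory structure.

```python
from collections import defaultdict


class CitationRegistry:
    """Toy registry that records article-dataset links in both directions."""

    def __init__(self):
        self._datasets_of = defaultdict(set)   # article id -> dataset ids
        self._articles_of = defaultdict(set)   # dataset id -> article ids

    def link(self, article_id: str, dataset_id: str) -> None:
        # Recording both directions together keeps the link bi-directional.
        self._datasets_of[article_id].add(dataset_id)
        self._articles_of[dataset_id].add(article_id)

    def datasets_cited_by(self, article_id: str) -> list:
        return sorted(self._datasets_of.get(article_id, ()))

    def articles_citing(self, dataset_id: str) -> list:
        return sorted(self._articles_of.get(dataset_id, ()))


# Invented identifiers standing in for persistent identifiers such as DOIs.
registry = CitationRegistry()
registry.link("article:1", "dataset:x")
registry.link("article:2", "dataset:x")
registry.link("article:1", "dataset:y")
```

A reader of `article:1` can reach both datasets, and a user who finds `dataset:x` in a data center can reach every article that builds on it, which is what makes the dataset reusable separately from any one publication.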
unfortunately, due to technological and policy reasons, discipline-specific data centers and research libraries currently do not interoperate. linking publications to the underlying data can produce significant benefits:
• make the data more discoverable
• make the data more interpretable
• give the author better credit for the data
• add depth to the article and facilitate better understanding.

unifying all scientific datasets with all scientific literature to create a world in which data and literature interoperate, as in jim gray’s vision, implies the capability to link literature spaces with data spaces, i.e., the capability to create a linked scholarly record. linking literature spaces with data spaces will increase scientific “information velocity” and will enhance scientific productivity as well as data availability, discoverability, interpretability, and reusability.

the main mechanism enabling the linking between datasets and articles in the scientific communication workflow is data citation. data citation is the practice of providing a reference to a dataset, intended as a description of the dataset properties that enable discovery of, interlinking with, and access to the dataset. as such, proper citation mechanisms rely on the assignment of persistent identifiers to datasets, together with a description (metadata) of the dataset, which allows for discovery and, to some extent, reuse of the data. several standards exist for citing datasets, and practices vary across different disciplines and data repositories, supported by initiatives in various fields of application.

semantic enhancement of the scientific record

a multidisciplinary approach to research problems draws from multiple disciplines in order to redefine a research problem outside of the usual boundaries and reach solutions based on a new understanding of complex situations.
scientific communication across disciplinary boundaries needs semantic enhancements in order to make the text intelligible to a broad audience composed of specialists in different scientific disciplines. this need motivated the current development of semantic publishing. by semantic publishing we mean the enhancement of the meaning of an online research article by automatically disambiguating and semantically defining specialist terms. this can be achieved by linking to discipline-specific ontologies and standard terminology repositories, by linking to other information sources of relevance to the article, and by direct linking to all of the article’s cited references. semantic mark-up of text is a technology that would facilitate increased understanding of the underlying meaning. sophisticated text mining and natural language processing tools are currently being developed to recognize textual instances and link them automatically to domain-specific ontologies. additional semantic enhancements can be obtained by intelligently linking scientific texts to third-party commentaries, archived talks, and websites. semantic publishing facilitates the automated discovery of an article, enables its linking to semantically related articles, provides access to data within the article in actionable form, or facilitates integration of data between papers. it demands the enrichment of the article with appropriate metadata that are amenable to automated processing and analysis. the semantic enhancements increase the intrinsic value of scientific articles, by increasing the ease by which information, understanding, and knowledge can be extracted. semantic technologies are enabling technologies for semantic publishing. in the context of multidisciplinary research, communities of research and data collections inhabit multiple contexts.
when datasets move across contexts, there is a risk that their representations will be interpreted differently because the interpretative context has been lost. this can lead to a phenomenon called “ontological drift” as the intended meaning becomes distorted when the datasets move across semantic boundaries (semantic distortion). this risk arises when a shared vocabulary and domain terminology are lacking.

scientists nowadays face the problem of accessing existing large datasets by means of flexible mechanisms that are both powerful and efficient. ontologies describe the domain of interest at a high level of abstraction and allow for expressing, at the intensional level, complex kinds of semantic conditions over such a domain. they are, thus, widely considered to be a suitable formal tool for sophisticated data access. providing ontology-based access to data demands the creation of a conceptual view of data and presenting it to the scientist-user. this view is expressed in terms of an ontology and presents the unique access point for the interaction between the users and the system that manages the dataset. the challenge is to link the ontology to a dataset that exists autonomously and has not been necessarily structured with the purpose of storing the ontology instances. in this case, the conceptual view and the datasets are at different levels of abstraction and are expressed in terms of different formalisms. for example, while logical languages are used to specify the ontology, datasets are usually expressed in terms of a data model. therefore, there is a need for specific mechanisms for mapping the data to the elements of the ontology. in summary, in ontology-based data access, the mapping is the formal tool by which we determine how to link data to ontology, i.e., how to reconstruct the semantics of datasets in terms of the ontology.
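A toy example may help make the role of the mapping concrete. In the Python sketch below, the dataset is stored with its own column names, the conceptual view uses ontology-style concept and property names, and the mapping translates a request phrased in ontology terms into a selection over the stored rows. All names (`BuoySensor`, `hasIdentifier`, the columns `stn`, `t_k`, `kind`) are invented for illustration, and the approach is a deliberate simplification: actual ontology-based data access systems specify ontologies in logical languages and use formal query rewriting, which this sketch does not attempt.

```python
# Rows as held by the system that manages the dataset; column names are
# unrelated to the ontology vocabulary (all values are invented).
rows = [
    {"stn": "B1", "t_k": 288.2, "kind": "buoy"},
    {"stn": "S1", "t_k": 291.0, "kind": "shore"},
    {"stn": "B2", "t_k": 285.5, "kind": "buoy"},
]

# Mapping, part 1: each ontology concept is tied to a condition on stored rows.
concept_mapping = {
    "BuoySensor": lambda row: row["kind"] == "buoy",
    "ShoreSensor": lambda row: row["kind"] == "shore",
}

# Mapping, part 2: each ontology property is tied to a stored column.
property_mapping = {
    "hasIdentifier": "stn",
    "hasTemperatureKelvin": "t_k",
}


def answer(concept: str, properties: list) -> list:
    """Answer a query posed in ontology terms by translating it via the mapping."""
    is_instance = concept_mapping[concept]
    columns = [property_mapping[p] for p in properties]
    return [tuple(row[c] for c in columns) for row in rows if is_instance(row)]


buoy_temperatures = answer("BuoySensor", ["hasIdentifier", "hasTemperatureKelvin"])
```

The client never sees `stn`, `t_k`, or `kind`; swapping in a dataset with a different layout would only require a new mapping, not a new query.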
the main reason for a functionality that supports ontology-based access to data is to provide high-level services to the scientist-clients. the most important service is query answering. clients express their queries in terms of the conceptual view (the ontology), and the mapping should translate the request into suitable queries posed to the system that manages the dataset.

in the context of a networked multidisciplinary scientific world, in order to maintain the interpretative context of data when crossing semantic boundaries, there is the need for aligning domain-specific ontologies that support the ontology-based access to distributed datasets. these ontologies are not standalone artifacts. they relate to each other in ways that can affect their meaning, and are distributed in a network of interlinked datasets, reflecting their dynamics, modularity, and contextual dependencies. their alignment is crucial for effective and meaningful data access and usability. it is achieved through a set of mapping rules that specify a correspondence between various entities, such as objects, concepts, relations, and instances.

innovation in scholarly practices

we are entering a new era characterized by the availability of huge collections of scientific articles. it is estimated that at least million english-language scientific documents are accessible on the web. of these, it is estimated that at least million are freely available. moreover, high-throughput scientific instruments, telescopes, satellites, accelerators, supercomputers, and sensor networks are generating massive amounts of scientific data. this information explosion is making it increasingly difficult for scholars to meet their information needs. in addition, the availability of huge amounts of scientific information has caused scholars to significantly extend their search goals.
we expect that advanced semantic linking, information modeling, and searching technologies will contribute to the emergence of new scholarly practices that will enable scholars to successfully face the challenges of the information deluge era. below are described some innovations in scholarly practices that will be enabled, in the near future, by the cyberscholarly infrastructures described in section .

new discovery practices

the availability of huge amounts of scientific information will produce a shift in the traditional scientific method: from hypothesis-driven advances to advances driven by connections and correlations found between diverse types of information resources. discovering previously unknown and potentially useful scientific information requires the discovery of patterns within a linked scholarly record. given a discipline-specific linked scholarly record, a pattern is defined as a path composed of a number of (sub)datasets and article modules meaningfully connected by relations that are materialized by links. the relations can be expressed by:
• mathematical equations when they relate numeric fields of two (sub)datasets;
• logical relationships among article modules and (sub)datasets;
• semantic/rhetoric relationships among modules of articles.

a pattern describes a recurring information need in terms of relationships among some components of the scholarly record (datasets, articles) and suggests a solution. the solution consists of two or more components of the scholarly record that work together in order to satisfy a scholar’s information needs. a pattern forms a causal chain and the discovery process can take complex forms. it is expressed in high-level language and constitutes the input to a knowledge-based search engine. in many cases, it is very useful to identify relationships among individual patterns, thus creating a connected pattern space. such a space allows scholars to navigate from one pattern to a set of related patterns.
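Since a pattern is defined as a path of datasets and article modules connected by typed relations, one simple way to picture pattern discovery is as path matching over a labeled graph. The sketch below is a minimal illustration, assuming the pattern is given as a sequence of relation labels; the function `match_pattern`, the relation labels, and the node identifiers are all hypothetical, and a knowledge-based search engine of the kind described above would need far richer pattern specifications.

```python
def match_pattern(graph: dict, start: str, relations: list) -> list:
    """Return every path from start whose successive links carry the given labels.

    graph maps a node identifier to a list of (relation label, target) pairs.
    """
    paths = [[start]]
    for relation in relations:
        extended = []
        for path in paths:
            for label, target in graph.get(path[-1], []):
                if label == relation:
                    extended.append(path + [target])
        paths = extended          # only paths that matched this step survive
    return paths


# Invented fragment of a linked scholarly record: article modules and datasets.
graph = {
    "art1#claim": [("supported_by", "art1#results")],
    "art1#results": [("derived_from", "ds:a"), ("derived_from", "ds:b")],
    "ds:a": [("abstraction_of", "ds:raw")],
}

# Pattern: "which datasets ultimately back this claim?"
found = match_pattern(graph, "art1#claim", ["supported_by", "derived_from"])
```

Each returned path is one instance of the pattern: a claim, the results module that supports it, and a dataset those results were derived from.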
given the extremely large dimensions of the modern scholarly record, a search engine can better assist scholars and scientists in finding the interesting patterns they are looking for by clearly identifying and understanding the intent behind a pattern specification. the user intent is represented in the pattern specification and contained in the query submitted to the search engine. enabling a search engine to understand the intent of a query requires addressing the following problems: (i) precisely defining the semantics of the query intent representation; and (ii) precisely delineating the semantic boundary of the intent domain. two broad categories of user intent can be identified:
• targeted intent: when the desired pattern is precisely described; and
• explorative intent: when the desired pattern is described in vague terms, i.e., the user does not know exactly what s/he is looking for.

a new paradigm of information seeking: information exploration

in the era of scientific information deluge, the amount of information exceeds the capabilities of traditional query processing and information retrieval technologies. new paradigms of information seeking will emerge that allow scholars to:
• surf the linked scholarly information space following suitable patterns;
• explore the scholarly record searching for interesting patterns; and
• move rapidly through the linked scholarly record and identify relevant information on the move.

information exploration is an emerging paradigm of information seeking. exploration of a large information space is performed when scholars are searching for interesting patterns, often without knowing a priori what they are looking for. exploration-based systems should help scholars to navigate the information space. in essence, a scholar supported by exploration-based systems becomes a navigator of the scientific information space.
exploration can be conducted in two ways:
• navigational querying: in the navigational querying mode, the exploration is conducted with a specific target node in mind. in this style of exploration, the key point is the process of selecting where to go next. in order to improve the effectiveness of this process, it is important to increase the awareness of the structure of the information space.
• navigational browsing: in the navigational browsing mode, a scholar is looking at several nodes of a linked information space in a casual way, in the hope that s/he might find something interesting. in essence, in this style of exploration the user (scholar) is not able to formulate her/his information need as a query; however, s/he is able to recognize relevant information when s/he sees it. in this style of exploration, the most efficient strategy is a two-step approach: first, the user navigates to the topic neighborhood in a querying mode and then browses the information space within that neighborhood. in order to increase the effectiveness of browsing, it is important to assist the user in the process of choosing between different patterns.

browsing is distinguished from querying by the absence of a definite target in the mind of the scholar. therefore, the distinction between browsing and querying is not determined by the actions of the scholar, or by the functionality of an information exploration system, but by the cognitive state of the scholar. it is difficult to have a clear distinction between these two styles of exploration. presumably, there is a continuum of user behaviors varying between knowing exactly what a user wants to find (querying) and having only an extremely vague idea of what s/he is looking for (browsing). a wide range of exploration strategies can be defined based on the degree of target specificity in the mind of the scholar.
at one extreme of the range the starting point of the exploration is the target identification, and at the other extreme the starting point is the context identification.

topic maps: a tool for producing conceptual views on top of a linked scholarly record

a technology that can support scholars in finding useful information in a linked scientific information space is topic maps, a standard for connecting knowledge structures to information resources. a topic map is a way of representing networked knowledge in terms of topics, associations, and occurrences.
• a topic is a machine-processable representation of a concept. the topic maps standard does not restrict the set of concepts that can be represented as topics in any way. topics can represent any concept: in a scientific context, they can represent any scientific outcome: an article, an author, a dataset, a data mining/visualization tool, an experiment, etc. typically topics are used to represent electronic resources (such as documents, web pages, and web services) and non-electronic resources (such as people or places).
• associations represent hyper-graph relationships between topics: an article that suggests a thesis can connect to another article that supports this thesis, a data mining tool can connect to a mined dataset, etc.; and
• occurrences represent information resources relevant to a particular topic.

in a topic map, each concept connects to another and links back to the original concept. topics, associations, and occurrences can all be typed. types are defined by the creator of the topic map(s). the definitions of allowed types constitute the ontology of the topic map. each topic involved in an association is said to play a role, which is defined by the association type. topic maps are a way to develop logical thinking by revealing connections and helping scholars see the lineage of an idea and how individual ideas form a larger whole.
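The three building blocks (typed topics, typed associations in which each topic plays a role, and occurrences pointing to resources) map naturally onto a small data structure. The Python sketch below is a simplified illustration only: the class and method names, the type and role labels, and the `example.org` locator are hypothetical, and the code does not follow the data model or any serialization defined by the topic maps standard.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A typed topic with occurrences: (occurrence type, resource locator) pairs."""
    topic_id: str
    topic_type: str
    occurrences: list = field(default_factory=list)


@dataclass
class Association:
    """A typed association; roles maps each role name to the topic playing it."""
    assoc_type: str
    roles: dict


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, topic: Topic) -> None:
        self.topics[topic.topic_id] = topic

    def associate(self, assoc_type: str, **roles: str) -> None:
        missing = [t for t in roles.values() if t not in self.topics]
        if missing:
            raise KeyError(f"unknown topics: {missing}")
        self.associations.append(Association(assoc_type, dict(roles)))

    def related(self, topic_id: str) -> list:
        """(association type, role, topic id) for every topic associated with topic_id."""
        result = []
        for assoc in self.associations:
            if topic_id in assoc.roles.values():
                for role, other in assoc.roles.items():
                    if other != topic_id:
                        result.append((assoc.assoc_type, role, other))
        return result


# Invented topics, types, roles, and locator.
tm = TopicMap()
tm.add_topic(Topic("article-1", "article",
                   [("full-text", "https://example.org/article-1")]))
tm.add_topic(Topic("article-2", "article"))
tm.add_topic(Topic("dataset-1", "dataset"))
tm.associate("supports", supporter="article-2", thesis="article-1")
tm.associate("uses-data", consumer="article-1", source="dataset-1")
```

Because an association is not owned by either participant, `tm.related("article-1")` and `tm.related("article-2")` both see the `supports` association, which reflects the statement that each concept connects to another and links back to the original concept.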
the topic map can act as a high-level overview of the domain knowledge contained in a set of resources. in this way it can serve not only as a guide to locating resources for the expert, but also as a way for experts to model their knowledge in a structured way. this allows non-experts to grasp the basic concepts and their relationships before diving down into the resources that provide more detail. topic maps are often described as a kind of superimposed semantic metadata layer for indexing (often dispersed and heterogeneous) information resources. an information architecture based on topic maps may be said to have two layers: a knowledge layer (topic space) representing the objects in the domain being described and a content layer (resource space) holding information about these objects. with some thoughtful modeling it is even possible to create different layers of detail in a topic map. another way of looking at topic maps is to consider them as enablers of knowledge arenas, that is, virtual spaces where scholars and learners may explore what they know and what they do not know. in fact, a topic map might be employed in an e-learning system to organize distributed learning resources on the web. here the individual topics would represent digital “learning objects” like articles, video lectures, or slides. a topic map can be created by a human author or automatically. the manual creation of topic maps guarantees high-quality, rich topic maps. however, even the automatic production of topic maps from a linked information space can give good results. topic maps make information findable by giving every concept in the information space its own identity and providing multiple redundant navigation paths through the linked information space. these paths are semantic, and all points on the way are clearly identified with names and types that tell you what they are. this means you always know where you are. 
therefore, topic maps can act as a gps of the information universe. in essence, topic maps can be used to create personalized semantic views on top of a linked scholarly record that satisfy scholars’ reading, learning, and research needs. the standardization of topic maps is taking place under the umbrella of the iso/iec jtc /sc /wg committee (iso/iec joint technical committee , subcommittee , working group - document description and processing languages - information association). the topic maps (iso/iec ) reference model and data model standards are defined in a way that is independent of any specific serialization or syntax. it is desirable to have a way to arbitrarily query the data within a particular topic maps store. many implementations provide a syntax by which this can be achieved (somewhat like ‘sql for topic maps’) but the syntax tends to vary a lot between different implementations.

new reading practices

the creation of linked scientific spaces, together with the growing quantity of published articles and the limited time for reading, is increasingly modifying reading practices in two main directions: focused reading vs. horizontal/explorative reading.

focused reading

due to the continuously increasing quantity of scientific articles and data and the limited time for reading, scientists strive to avoid older and less relevant literature. they want to read only the relevant parts of a small number of core articles. therefore, they tend to narrow the literature space to be browsed (tuned vision). a number of indicators of the relevance of an article are used: indexing and citations as indicators of relevance, abstracts and literature reviews as surrogates for full papers, and social networks of colleagues as personal alerting services.

horizontal reading/exploration

another form of reading consists in surfing the linked literature space, not in order to find a specific article or a core set of articles to read, but rather to find, assess, and exploit a wide range of information by scanning portions of many articles, i.e., horizontal reading. horizontal reading is the exploration of large quantities of relevant information.

strategic reading

both directions lead to a new reading practice: strategic reading. strategic reading is the reading of the different modules of an article in relevance order rather than narrative order.

new learning practices

a linked scientific information space has the potential for increasing the learning capacity of scholars as it supports their cognitive processes. cognitive processes involve the creation of links between concepts. this implies the ability to create meaning by establishing patterns, relationships, and connections. a linked space enables the construction of meaningful learning patterns that allow the acquisition of new knowledge or the modification of existing knowledge. following pre-constructed learning patterns makes possible the exploration and comparison of ideas, the identification or resolution of disagreements, the tracking of contributions by an individual researcher, the tracing of the lineage of an idea, etc. more importantly, a linked space facilitates the establishment of meaningful connections between several information elements: interpretation, prediction, causality, consistency, prevention, (supporting/challenging) argumentation, etc.

open cyber-scholarly infrastructures

by cyber-scholarly infrastructure we mean a managed networked environment that incorporates capabilities instrumental to supporting the activities conducted by scholars.
it is an enabler of an open, evolvable, and extensible learned ecosystem composed of digital libraries, publishers’ repositories, institutional repositories, data repositories, data centers, and communities of scholars. it enables interoperation between data and literature, thus creating an open, distributed, and evolvable linked scholarly record. it provides an enabling framework for data, information, and knowledge discovery, advanced literature analyses, and new scholarly practices based on linking and semantic technologies. cyber-infrastructure-enhanced discovery, analysis, reading, and learning are especially important as they encourage broadened participation and wider diversity along individual, geographical, and institutional dimensions.

in particular, future open cyber-scholarship infrastructures should support:

• a scholarly linking environment that:
  • provides a core set of linking services that create discipline-specific linked literature spaces and discipline-specific linked data spaces, connect literature spaces with data spaces, and build connections between diverse discipline-specific literature spaces.
  • supports the creation, operation, and maintenance of a core set of linkers. a linker is a software module that exploits encoded knowledge and metadata information about certain datasets or articles in order to build a relation between modules and/or datasets. the linking process is a two-phase process: the first phase provides assistance in locating and understanding resource capabilities; the second phase focuses on linking the identified resources. different types of linkers should be supported in order to implement the different types of relations between article modules and datasets.
    these include linkers that connect modules related by a causality relationship, by a similarity relationship, by an “aboutness” relationship, or by a generic relationship; linkers that connect an article with the underlying dataset; linkers that connect a dataset with the supported articles; etc.
• a mediating environment that:
  • provides a core set of intermediary services that make the holdings of discipline-specific repositories and data centers, data archives, research digital libraries, and publishers’ repositories discoverable, understandable, and (re)usable.
  • supports the creation, operation, and maintenance of mediators. a mediator is a software module that exploits encoded knowledge and metadata information about certain datasets or articles in order to implement an intermediary service. a core set of mediators should include: data discovery mediators, article module discovery mediators, mapping mediators, matching mediators, consistency checking mediators, data integration mediators, etc.
  • maintains data dictionaries, discipline-specific ontologies, and terminologies.
• a navigational environment that:
  • offers the possibility for scholars to start browsing in one dataset/article module and then navigate along links into related datasets/article modules, and/or supports search engines that crawl the linked information space by following links between datasets/article modules and provide expressive query capabilities over aggregated data.
  • maintains article module metadata registries.
  • maintains link metadata registries.
• a scholarly reading and/or learning environment that:
  • supports the creation, operation, and maintenance of a core set of scholarly workflows. scholars should be enabled to describe an abstract workflow by specifying a number of abstract tasks. these tasks include identity resolution, text analysis, literature analysis, lineage analysis, reproducibility of work, repeatability of experiments, etc.
    the abstract workflow, or workflow template, is mapped into a concrete workflow using mappings that, for each task, specify a linker, a mediator, or a service to be used for its implementation. an abstract workflow is an acyclic graph in which the nodes are tasks and the edges represent links that connect the output of a given task to the input of another task, specifying that the artifacts produced by the former are used by the latter. the instantiation of a workflow results in a scholarly reading/learning pattern. by scholarly reading/learning pattern we mean a set of meaningfully linked article modules and datasets that support a scholarly activity (reading/learning/research). in essence, scholarly reading/learning patterns draw paths within the linked scholarly record.
  • supports the creation and maintenance of reading and learning profiles in order to enable the creation of “personalized reading/learning patterns.”
  • supports the creation and maintenance of virtual information spaces (topic maps) where scholars and learners may explore what they know and what they do not know.
  • enables scholars, readers, and learners to find the scientific information they are looking for and correctly interpret it by allowing them to surf the linked scholarly record, following suitable scholarly patterns.
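
the workflow description in this section (abstract tasks arranged in an acyclic graph, each task bound to a linker, a mediator, or a service) can be illustrated with a small sketch. it is only an illustration under stated assumptions, not the infrastructure the article envisages: the task names come from the text, but the dependency edges, the component names (`matching`, `text-miner`, `similarity`, `causality`), and the helpers `make_component` and `instantiate` are hypothetical. python's standard `graphlib` module is used for the dependency ordering.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical abstract workflow: each task maps to the set of tasks
# whose output it consumes (the edges of the acyclic graph).
ABSTRACT_WORKFLOW = {
    "identity_resolution": set(),
    "text_analysis": {"identity_resolution"},
    "literature_analysis": {"text_analysis"},
    "lineage_analysis": {"identity_resolution", "literature_analysis"},
}


def make_component(kind, name):
    """Build a stand-in linker/mediator/service that records what it consumed."""
    def run(inputs):
        return {"produced_by": f"{kind}:{name}", "used": sorted(inputs)}
    return run


# Hypothetical mapping that turns the abstract workflow into a concrete one:
# every abstract task is bound to a linker, a mediator, or a service.
MAPPING = {
    "identity_resolution": make_component("mediator", "matching"),
    "text_analysis": make_component("service", "text-miner"),
    "literature_analysis": make_component("linker", "similarity"),
    "lineage_analysis": make_component("linker", "causality"),
}


def instantiate(workflow, mapping):
    """Run every task after the tasks it depends on; return order and artifacts.

    Raises graphlib.CycleError if the workflow is not acyclic.
    """
    order = list(TopologicalSorter(workflow).static_order())
    artifacts = {}
    for task in order:
        # The artifacts produced by upstream tasks are the inputs of this task.
        inputs = {dep: artifacts[dep] for dep in workflow[task]}
        artifacts[task] = mapping[task](inputs)
    return order, artifacts


order, artifacts = instantiate(ABSTRACT_WORKFLOW, MAPPING)
for task in order:
    print(task, "<-", artifacts[task]["used"], "via", artifacts[task]["produced_by"])
```

in this sketch the returned `artifacts` dictionary plays the role of the resulting reading/learning pattern: it records, for each task, which upstream outputs were used and which component produced the result. a cyclic task description is rejected with `CycleError`, which mirrors the requirement that the abstract workflow be acyclic.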
concluding remarks

we expect that the building of the next generation of cyber-scholarly infrastructures will have a considerable impact on:

• accelerating the transition towards an extended system of scholarly communication that allows us to create, disseminate, and sustain unprecedented new forms of scholarly inquiry by utilizing the innovative capabilities of digital technologies;
• bringing to maturity digital publishing business models that support promotion and tenure practices that systematically reward digital publishing efforts;
• making scholarly knowledge freely available to anyone and opening up the process of knowledge discovery as early as possible;
• changing the scholarly publication: making the research outcomes reproducible, replicable, and transparent; making explicit hidden aspects of knowledge production;
• overcoming the distinction between the two cultures of the contemporary scientific world (i.e., the culture of data and the culture of narrative) by tightly linking datasets and narrative;
• enabling open scholarship;
• enabling reputation management;
• bringing into closer working alignment scholars, libraries, and publishers;
• shifting the scientific method from hypothesis-driven to data-driven discovery; and
• enabling analysis of research dynamics as well as macro-analyses of research data of interest to universities, funding bodies, academic publishers, and companies.

future cyber-scholarly infrastructures will realize jim gray’s vision of a world in which all scientific literature and all scientific data are online and interoperating.

acknowledgments: this paper has been inspired by a number of papers, listed in the appendix, that have addressed some of the issues discussed. the author is grateful to the authors of these papers.

conflicts of interest: the author declares no conflict of interest.

appendix

alexander, k.; cyganiak, r.; hausenblas, m.; zhao, j. describing linked datasets.
in proceedings of the linked data workshop at www , madrid, spain, september . . altman, m.; king, g. a proposed standard for the scholarly citation of quantitative data. d-lib magazine march, april . . bardi, a.; manghi, p. enhanced publications: data models and information systems. liber q. , , – . . bechhofer, s.; de roure, d.; gamble, m.; goble, c.; buchan, i. research objects: towards exchange and reuse of digital knowledge. in proceedings of the future of the web for collaborative science (fwcs ), raleigh, nc, usa, july . . belhajjame, k.; corcho, o.; garijo, d.; zhao, j.; missier, p.; palma, r.; bechhofer, s.; garcía, e.; gómez-pérez, j.m.; klyne, g.; et al. workflow-centric research objects: first class citizens in scholarly discourse. in proceedings of the eswc workshop on the future of scholarly communication in the semantic web (sepublica ), heraklion, greece, may . . bizer, c. interlinking scientific data on a global scale. data sci. j. , , grd –grd . publications , , of . bizer, c.; heatth, t. linked data: evolving the web into a global data space ( st edition). synthesis lectures on the semantic web: theory and technology, : , - . morgan & claypool. . borgman, c. data, disciplines, and scholarly publishing. learn. publ. , doi: . / x . . bourne, p.; clark, t.; dale, r.; de waard, a.; herman, i.; hovy, e.; shotton, d. (eds.) improving the future of research communications and e-scholarship. dagstuhl manif. , doi: . / dagman. . . . . bourne, p. will a biological database be different from a biological journal? plos comput. biol. , , – . . shun, s.b.; simon, j.; li, v.u.; sereno, b.; mancini, c. modeling naturalistic argumentation in research literatures: representation and interaction design issues. int. j. intell. syst. , doi: . /int. . . shun, s.b. net-centric scholarly discourse? available online: http://slidesha.re/qvoqou (accessed on february ). . clark, t.; ciccarese, p.; goble, c. 
micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications. j. biomed. semant. , doi: . / - - - . . burke, r.; hammond, k.; young, b. knowledge-based navigation of complex information spaces. in proceedings of the thirteenth national conference on artificial intelligence and eighth innovative applications of artificial intelligence conference, portland, or, usa, – august . . castelli, d.; manghi, p.; thanos, c. a vision towards scientific communication infrastructures. int. j. digit. libr. , , – . . decker, s. from linked data to networked knowledge. available online: http://videolectures. net/eswc _decker_networked_knowledge/ (accessed on february ). . de vocht, l.; coppens, s.; verborgh, r.; sande, m.v.; mannens, e.; van de walle, r. discovering meaningful connections between resources in the web of data. in proceedings of the ldow , rio de janeiro, brazil, may . . de waard, a. from proteins to fairytales: directions in semantic publishing. ieee intell. syst. , , – . . de waard, a.; buckingham, s.; carusi, a.; park, j.; samwald, m.s. hypotheses, evidence and relationships: the hyper approach for representing scientific knowledge claims. in proceedings of the th international semantic web conference, workshop on semantic web applications in scientific discourse, lecture notes in computer science, washington, dc, us, – october ; springer verlag: berlin, germany, . . de waard, a.; kircz, j. modeling scientific research articles—shifting perspectives and persistent issues. in proceedings of the elpub conference on electronic publishing, toronto, on, canada, – june . . dillon, a.; richardson, j.; mcknight, c. navigation in hypertext: a critical review of the concept; diaper, d., gilmore, d., cockton, g., shackel, b., eds.; human interaction—interact’ : amsterdam, the netherlands, ; pp. – . . evans, j. electronic publication and the narrowing of science and scholarship. science , doi: . /science. . . 
fink, l.; fernicola, p.; chandran, r.; parastatidis, s.; wade, a.; naim, o.; quinn, g. word add-in for ontology recognition: semantic enrichment of scientific literature. bioinfrmatics , , . . ginsparg, p. text in a data-centric world. in the fourth paradigm: data intensive scientific discovery; microsoft: redmond, wa, usa, . . goble, c.; de roure, d. the impact of workflows on data-centric research. in the fourth paradigm: data intensive scientific discovery; hey, t., tansley, s., tolle, k., eds.; microsoft research: redmond, wa, usa, . publications , , of . gray, j.; szalay, a.; thakar, a.; stoughton, c.; van de berg, j. online scientific data: curation, publication and archiving; technical report msr-tr- - ; microsoft research: redmond, wa, usa, . . gruber, t. towards principles for the design of ontologies used for knowledge sharing. int. j. hum. comput. stud. , , – . . halevy, a.; franklin, m.; maier, d. principles of dataspace systems. in proceedings of the pods’ , chicago, il, usa, – june . . harmsze, f. a modular structure for scientific articles in an electronic environment. ph.d. thesis, university of amsterdam, amsterdam, the netherlands, . . herman, i.; clark, t.; hovy, e.; de waard, a. report on the “future of research communications” workshop; dragstuhl research online publication server: dagstuhl, germany, – august ; doi: . /dagrep. . . . the fourth paradigm: data intensive scientific discovery; hey, t., tansley, s., tolle, k., eds.; microsoft research: redmond, wa, usa, . . hu, j.; wang, g.; lochovsky, f.; sun, j.; chen, z. understanding user’s query intent with wikipedia. in proceedings of the www , madrid, spain, – april . . hunter, j. scientific models—a user—oriented approach to the integration of scientific data and digital libraries. in proceedings of the vala , melbourne, australia, february . . idreos, s. big data exploration. in big data computing; taylor and francis: abingdon, uk, . . johnsen, l. topic maps. j. inf. archit. 
july , issn: - . . kavuluru, r.; thomas, c.; sheth, a.; chan, v.; wang, w.; smith, a. an up-to-date knowledge-based literature search and exploration framework for focused bioscience domains. in proceedings of the ihi — nd acm sighit international health informatics symposium, new york, ny, usa, – january . . khabsa, m.; giles, c.l. the number of scholarly documents on the web. plos one , , e . . kircz, j.; harmsze, f. modular scenarios in the electronic age. in proceedings of the conferentie informatiewetenschap : de doelenutrecht, april ; pp. – . . kircz, j.g. new practices for electronic publishing—new forms of the scientific paper. learn. publ. , doi: . / . . lagoze, c.; van de sompel, h. the oai protocol for object reuse and exchange. available online: http://www.openarchives.org/ore (accessed on february ). . lynch, c. jim gray’s fourth paradigm and the construction of the scientific record. in the fourth paradigm: data intensive scientific discovery; microsoft: redmond, wa, usa, . . owen, j.s.m. the scientific article in the age of digitization. ph.d. thesis, university of amsterdam, amsterdam, the netherlands, . . mcpherson, t. scaling vectors: thoughts on the future of scholarly communication. j. electron. publ. , doi: . / . . . . microsoft. patterns & practices. available online: https://msdn.microsoft.com/en-us/library/ ff .aspx (accessed on february ). . microsoft. an introduction to topic maps. available online: https://msdn.microsoft.com/ en-us/library/aa (d=printer) (accessed on february ). . nicholas, d.; huntington, p.; jamali, h.; rowlands, i.; dobrowoski, t. viewing and reading behavior in a virtual environment. available online: https://www.emeraldinsight.com/ – x.htm (accessed on february ). . nicholas, d.; huntington, p.; jamali, h.; rowlands, i.; dobrowoski, t. characterizing and evaluating information seeking behavior in a digital environment: spotlight on the ‘bouncer’. inf. process. manag. , , – . . paskin, n. 
digital object identifier for scientific data. data sci. j. , , – , doi: . / dsj. . . publications , , of . phelps, t.; wilensky, r. toward active, extensible, networked documents: multivalent architecture and applications. in proceedings of the acm digital libraries ‘ /bethesda, bethesda, md, usa, – march . . poggi, a.; lembo, d.; calvanese, d.; de giacomo, g.; lenzerini, m.; rosati, r. linking data to ontologies. j. data semant. x, lncs , pages – ; springer-verlag: berlin/heidelberg, germany, . . porter, b.; souther, a. “knowledge-based information retrieval” in aaai technical report fs- - , . available online: http://www.aaai.org/papers/symposia/fall/ /fs- - / fs - - .pdf (accessed on february ). . renear, a.; palmer, c. strategic reading, ontologies, and the future of scientific publishing. science , , – . . seringhaus, t.; gerstein, m. publishing perishing? towards tomorrow’s information. bmc bioinform. , , . . shotton, d. semantic publishing: the coming revolution in scientific journal publishing. learn. publ. , doi: . / . . simon, b.; miklos, z.; nejdl, w.; sintek, m.; salvachua, j. smart space for learning: a mediation infrastructure for learning services. available online: https://wwwconference.org/ www /cdrom/papers/alternate (accessed on february ). . taylor, i.; gannon, d.; shields, m. (eds.) workflows for e-science; springer –verlag: london, uk, . . tenopir, c.; king, d.; edwards, s.; wu l. electronic journals and changes in scholarly article seeking and reading patterns. aslib proc. , , – , doi: . / . . thearling, k. an introduction to data mining. available online: http://www.thearling.com/ dmintro/dmintro_ .htm (accessed on february ). . vieu, l. on the semantics of discourse relations. available online: http://www.irit.fr/publis/ lilac/v-drsemantics-cid .pdf (accessed on february ). . waterworth, j.; chignell, m. a model for information exploration. available online: http://www .informatik.umu.se/~jwworth/infomodel.pdf (accessed on february ). . 
white, c. data exploration and discovery: a new approach to analytics. bi res. , doi:not available. . woutersen-windhouwer, s.; brandsma, r.; verhaar, p.; hogenaar, a.; hoogerwerf, m.; doorenbosch, p.; durr, e.; ludwig, j.; schmidt, b.; sierman, b. enhanced publications; vernooy-gerritsen, m., surf foundation, eds.; amsterdam university press: amsterdam, the netherlands, .

references

. thanos, c. the future of digital scholarship. proced. comput. sci. , , – .
. renear, a.; sacchi, s.; wickett, k. definitions of dataset in the scientific and technical literature; asist: pittsburgh, pa, usa, .
. franklin, m.; halevy, a.; maier, d. from databases to dataspaces: a new abstraction for information management. in sigmod record; acm: new york, ny, usa, ; volume , pp. – .
. bizer, c.; heath, t.; berners-lee, t. linked data - the story so far. int. j. semant. web inf. syst. , , – .

© by the author; licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc-by) license (http://creativecommons.org/licenses/by/ . /).
libraries for teaching and learning
workshop, june , | : – : cest

workshop programme part i

welcome and introduction (hilde van wijngaarden, vrije universiteit amsterdam)

three paths for libraries to embrace teaching and learning

facilitation in makerspaces
• supporting the creation and usability of new forms of content (sylvia moes, vrije universiteit amsterdam)
• incentives for new services at linköping university library: the rise of digimaker (anneli friberg, linköping university library)

open educational resources (oer)
• unesco guidelines and network of european open education librarians (vanessa proudman, sparc)
• examples of oer support in spanish libraries (gema santos hermosa, library open university cataluna)

workshop programme part ii

librarian as a guide to the future
• artificial intelligence bazars at university of oslo library (andrea gasparini, university of oslo)
• teaching in the brave new open world: how can libraries help (monique schoutsen, library radboud university nijmegen)

break

breakup groups: discussion on the three paths

general discussion: what is the role of the library in teaching and learning? how do we utilise our traditional competencies in this new role?
thank you for participating!

supporting the creation and usability of new forms of content
sylvia moes, innovation manager education support, university library

vu network for teaching & learning (vu nt&l) is the education network of the vu, with the mission to be a thriving and inspiring vu educational community. learn! academy, university library, student & educational affairs, it, and audiovisual centre work together in the network to achieve excellent education at the vu.

dissemination lab
makerspace
madelon simons
jacob cornelisz. van oostsanen, selfie ( )
https://sites.google.com/vu.nl/vu-teaching-learning-tips/teaching-tips-educational-technologies/ d-printing
3d printers

virtual reality
view of student with vr gear: virtual courtroom
• student in groningen in front of green screen
• student with vr gear in amsterdam
• point of view: lawyer in courtroom (student with vr gear), plus personal annotations (keywords for pleading)
• students meet each other in virtual courtroom
• live stream of student in groningen

learning in three dimensions
https://library.educause.edu/~/media/files/library/ / /ers .pdf?la=en

but could the library be queen?

incentives for new services at linköping university library: the rise of digimaker
liber workshop: libraries for teaching and learning, june ,
anneli friberg, linköping university library
anneli.friberg@liu.se, @fribban on twitter

our digimaker journey
➢ : digimaker project
➢ : opening of digimaker
➢ : digimaker part of the regular library activity

excel, word, python, git

research support
❖ digitization
❖ processing
❖ analyzing
❖ visualization

some challenges
✓ acceptance among colleagues
✓ students need to prioritize their studies
✓ it’s difficult to make long term plans

www.liu.se
thanks for your attention! any questions?
vanessa proudman, director, sparc europe
liber conference workshop : libraries for teaching & learning, june

making open the default

international policy development
libraries can help governments deliver on their promises
the unesco oer recommendation

key international policy developments
• the cape town declaration, &
• open government partnership,
• un sustainable development goals,
• unesco oer recommendation,
liber online : libraries for teaching and learning

unesco recommendation on oer
http://portal.unesco.org/en/ev.php-url_id= &url_do=do_topic&url_section= .html

unesco oer recommendation
• essential for decision-makers and innovators in learning and education
• guidance for national governments on oer policies and practices -> countries to report on efforts and progress
• increasing action, strategy & legislation
• a standard-setting instrument

aims and objectives
• achieve sustainable development goal (sdg) : ensure inclusive and equitable quality education and promote lifelong learning opportunities for all
• promote and adopt more open licensing
• apply oer for engagement and innovation amongst educators and learners
• collaborate and advocate for oer, for the evaluation of quality oer, and for optimal investment

areas of action
• building capacity of stakeholders to create, access, re-use, adapt and redistribute oer;
• developing supportive policy;
• encouraging inclusive and equitable quality oer;
• nurturing the creation of sustainability models for oer;
• facilitating international co-operation

monitoring
• measure the effectiveness and efficiency of oer policies and incentives
• collect and share progress, good practices, innovation and reports on oer and its consequences on teaching and learning
• develop strategies to monitor the educational effectiveness and long-term efficiency of oer

policy to implementation process
policy, wgs, roadmap, dynamic coalition
>> now inviting libraries to engage in debate and action in the working groups

launch of the dynamic coalition
• sparc europe a partner
• meeting with over participants: unesco ms, inter-governmental organisations, oe expert organisations, the private sector, publishers, foundations
• invitation to feed back on areas of action
• developing a multi-stakeholder roadmap
• roadmap to implement the policy

oer dynamic coalition
https://en.unesco.org/themes/building-knowledge-societies/oer/dynamic-coalition

how libraries can help implement the unesco oer recommendation
• “libraries are really important”
• named as stakeholders (due to ifla & sparc europe engagement)
• wide experience with digitisation and organising access to knowledge

capacity building
– advocate for oe across the institution, showcase good practices and oe champions
– share and provide access to existing oers
– build on optimising connected repositories

inclusive, equitable, accessible, quality oer
– help create a multilingual federated discovery system for oer, using open standards and formats

how
libraries can help implement the unesco oer recommendation

policy
– provide an evidence base for policy-making
– engage in policy-making (institutional & national)

sustainability
– optimise oer as public good through new models
– engage with publishers/service providers to open

sparc europe & the oer rec
• capacity-building amongst academic libraries in europe: eoenl network growth and engagement
• policy development
• research
• oer advocacy through champions
• sustaining oe, a public good

how ready are we?
oe european survey results presentation, tomorrow, at am, opening up knowledge session: “opening up knowledge in higher education. survey results: supporting open education in european libraries today”
– how future fit we are
– what academic libraries are doing and how
• what the opportunities are
• what the challenges are

please join the network of oe librarians
oer@sparceurope.org
we can do this together

examples of oer support in spanish libraries
gema santos-hermosa, universitat oberta de catalunya (uoc) library
sparc eu network of european open education librarians (neoel)
@libereurope

library in the dissemination of oer: from the upv to the world
, oers from the teaching collection of riunet repository (universitat politècnica de valència, upv) are being sent and shared in merlot
https://www.merlot.org/merlot/
https://riunet.upv.es/
http://www.upv.es/
library supporting the creation of oer: collaboration and incentives for teachers (uji & udl)
• call for a selective procedure to support the development of open teaching materials at universitat jaume i (uji) -> incentives for creating oer and publishing them in the uji repository and ocw
• collaboration between the library and the teaching activity support and advice unit at universitat de lleida (udl)
https://www.uji.es/?urlredirect=https://www.uji.es/&url=/
http://www.udl.es/ca/en/

library helping faculty in understanding oer: oer toolkit in spanish for the rebiun network
from rebiun (the spanish network of university libraries) and its action - oer, the oer toolkit (ontario libraries) is being translated into spanish and adapted to the spanish context. still in progress … available soon at: https://rebiun.libguides.com/
https://www.rebiun.org/
https://tlp-lpa.ca/oer-toolkit

library involvement in oer strategies & policies: uoc open knowledge plan & policy
at the universitat oberta de catalunya (uoc), the open knowledge plan ( ) includes a specific area for open learning & oers. currently uoc is working on updating its oa policy ( ), which is becoming a global institutional open knowledge policy ( ). the uoc library is co-coordinating this plan and new policy (together with the globalization and cooperation unit).
more information: santos-hermosa, g. ( ). open learning at uoc open knowledge action plan. http://hdl.handle.net/ /
https://www.uoc.edu/portal/en/index.html
https://www.uoc.edu/portal/_resources/en/documents/coneixement-obert/pla-accio-coneixement-obert.pdf

"libraries are supporting the unesco oer recommendation"
http://portal.unesco.org/en/ev.php-url_id= &url_do=do_topic&url_section= .html

thank you!
msantoshe@uoc.edu
@gsantoshe, https://twitter.com/gsantoshe

artificial intelligence bazars at university of oslo library
liber conference, workshop teaching and learning
dr. andrea gasparini, digital services, university of oslo library, norway

the use of ai in the academic library
• why: many concurring ai-based tools are entering the market
• how: funding from the national library of norway
• what: using and developing ai-based tools; research bazars in and
drawing: university of oslo library

research bazar : exploring research data with ai and design thinking
• full day
• fully booked
• participants, mostly phd candidates and some researchers
• session : data organization tasks, e.g. finding and acquiring data, cleaning the data, filtering the data, etc.
• session : design thinking
• session : a brief introduction to machine learning; the data was then used to “learn” models prepared in advance by the organizers
• session : each team presented the solution it had developed
the approach used for ai is named xai (explainable ai)

general feedback
- participants were eager to use ai and dt in their research
- some of the solutions were creative

research bazar : hands-on activity on the explore and focus tool inside iris
• half day
• fully booked
• participants, mostly phd candidates, some researchers and some librarians

feedback from the iris research bazar
"good place to start collecting literature for a research project" (researcher from sintef affiliated with university of oslo)
"i'll share this with my colleagues and students as soon as i'm back in place."
“we must understand how to use ai with library services.
ai is here to stay” (librarian)

some critical feedback:
• “the "explore" section of iris is problematic as all my relevant resources are behind pay-wall.”
• the "focus" section of iris was perceived as unclear in the functionality and meaning of the individual steps.

what we learned from the research bazars
• researchers are positive and interested in ai tools that help them with research
• ai may give additional perspectives to researchers
• ai tools can be used in addition to traditional ways of doing literature search
• small research groups need someone to help them; they need single-point access to information
• academic libraries should help researchers with all aspects of ai: trust, bias, ethics and so on
• library staff need arenas and incentives to learn ai x libraries

future work
• more «proof of concept»
• test new services with users
• connect with specialists (again)

thank you very much! (tusen takk!)

teaching in the brave new open world: how can libraries help?
a short introduction
monique schoutsen, coordinator information literacy, radboud university (nijmegen)
m.schoutsen@ubn.ru.nl

information scarcity > information abundance & complexity & openness
finding > publishing: from helping researchers and students with finding information to helping them with publishing information
research support > teaching support: from a focus on just research support to a focus on teaching support as well

projects that fit into these trends
• publishing (workshops on adobe illustrator)
• information abundance (anyone can publish): fake news
• supporting the teacher: use open data in teaching

workshops in adobe illustrator
escape room on fake news
udit project

helene n.
andreassen artic university of norway senior academic librarian arctic university of norway helene.n.andreassen@uit.no torstein låg artic university of norway f norway torstein.lag@uit.no harrie van der meer university of amsterdam (the netherlands) monique schoutsen radboud university (the netherlands) coordinator lnformation literacy radboud university m.schoutsen@ubn.ru.nl twitter: @festinaatje mijke jetten radboud university (the netherland) coordinator open sciencepport radboud university m.jetten@ubn.ru.nl to encourage and help teachers in higher education to start using open research data in their teaching, and to share their experience and teaching material with the teacher community. objective of the project ● further the open educational resources movement ● further the open science movement intended side objectives udit oer commons udit website udit module udit module https://www.fosteropenscience.eu/learning/use-open-data-in-teaching https://www.fosteropenscience.eu/learning/use-open-data-in-teaching the platform ● oer commons ○ https://www.oercommons.org ○ facilitates access to teaching and learning material for all parties ○ facilitates visibility and proper attribution by making the material discoverable and citable ● group: use open data in teaching ○ https://www.oercommons.org/groups/ use-open-data-in-teaching/ / ○ platform for uploading activities or links to already published ones ○ template which facilitates inclusion of all relevant information https://www.oercommons.org/ https://www.oercommons.org/groups/use-open-data-in-teaching/ / the dream engage the students in research without having to leave the classroom. next step: finding ambassadors in every field https://vimeo.com/ #t= s https://vimeo.com/ #t= s https://vimeo.com/ #t= s questions? . workshop # - libraries for teaching and learning libraries for teaching and learning libraries for teaching and learning libraries for teaching and learning thank you for participating! . 
the interactive library as a virtual working space
vol. , no. ( ) – | e-issn: - x
this work is licensed under a creative commons attribution . international license
uopen journals | http://liberquarterly.eu/ | doi: . /lq.
liber quarterly volume issue

andreas degkwitz
humboldt university berlin, germany
andreas.degkwitz@ub.hu-berlin.de, orcid.org/ - - -

abstract
the internet and new digital media are challenging the traditional business model of academic libraries, and they enable new capabilities of information provisioning and new forms of collaboration between librarians and users. to meet the demands and expectations of the many users whose information behaviour is heavily influenced by the internet, a new business model for academic libraries has to be designed urgently. the present paper tries to analyse the requisites of such a design and to develop a framework for setting up a pilot study that identifies the organizational and technical requirements of a business model for the future library, based on the potential of the internet and new media. the result should be a pilot study about the interactive, multi-user driven library as the future business model for libraries.

key words: digital libraries; scholarly makerspaces; virtual working environment; digital transformation

introduction

the logistics of printed books and journals have influenced all the processes and structures of libraries since the age of gutenberg. our core processes are linear: acquisition, cataloguing, short- and long-term availability and usage.
by the implementation of it-driven library systems and collections of e-books and e-journals in pdf, the former analogue processes and printed materials have been transferred and transformed into a digital environment – in other words: they are emulated! this is part of the transformation process, but not the main part of the development, because the components of the logistics of digital materials are interaction, collaboration, multimedia and global networking. do we identify these items in the libraries that we call or define as digital libraries (degkwitz, )?

the organization and workflows of libraries are still influenced by the traditional patterns. are there any networked structures beyond the cooperation between libraries – e.g. with patrons and users? the roles of librarians and users have not changed for many years. where are the collaborative approaches that the internet and new media are offering? print-oriented e-books and e-journals (emulations of printed patterns) are the focus of library collections and services. what about the integration of research data and multimedia objects in research publications and scholarly communication (burpee, glushko, goddard, kehoe, & moore, ; dempsey, malpas & lavoie, )? local ip-based licenses of electronic books and journals are the main path of accessibility. do we really have global access to research results and data? there are many “gaps” between the options of academic support and the researchers’ demands and needs. do we really aim at appropriate forms of deep exchange and strong interaction?

which changes are happening?
a number of changes show that internet-driven change has started, as the examples below demonstrate:
• patron driven acquisition models: users are choosing the materials that they demand and need.
• digital resources like e-book and e-journal packages don’t have to be recorded by librarians in a traditional way. rather, the related metadata delivered by the publishers are prepared technically and loaded into the index of the discovery system.
• e-books, e-journals and databases of commercial publishers are in general already on the web. there are no major transaction costs, just the costs for activating the license.
• the numbers of scholarly materials and objects outside the familiar scope of books and journals are permanently increasing. more and more libraries are dealing with so-called “enhanced publications,” which cover data and objects beyond the text.
• users and researchers are providing repositories or information hubs by themselves. these resources could and should be harvested and indexed by the library’s search engine as part of its collection.

the interactive, multi-user driven library: against that background we are in a position to exploit the digital potential of the internet and new media with much more effort. as a result we should allow and enable more interaction and collaboration between librarians and users. therefore we have to reshape and open up the roles of the librarians and the users in an explicitly collaborative way (moravec & killorn, ). why do we distinguish so formally between the librarians and the users?
we had better talk about “multi-users”: “multi-user driven acquisition,” “multi-user driven collection building,” “multi-user driven indexing,” “multi-user driven funding,” “multi-user driven availability.” such an approach could move us forward and should be done as follows:
• acquiring and collecting: librarians and users are allowed to acquire materials and objects or to transmit them into the collection of the libraries with different rights and/or in their particular repositories. the scope of materials and material types covers everything related to scholarly communication: books, journals, digitized items, research data, software tools, audios, pictures, videos, simulations, etc.
• cataloguing and enriching: librarians and users are allowed to create and/or to enrich the metadata of scholarly materials and objects for loading them into the index of the (central) search engine with different competencies and rights. enrichments may range from name authorities, classifications and subject headings up to semantic relationships. in this way more user-oriented access and search facilities can be established.
• usage and availability: librarians and users are allowed to define operation and usage of acquired/collected materials and objects up to the time limits of their availability. the rights and roles handed over have to conform to the governance rules of the library policy. the principles of open access are generally applied.
• funding and sourcing: librarians and users own different funds for paying acquisitions and licenses of materials. contrary to the practice of today these sources have to cover the material’s “maintenance” too – that means: cataloguing, indexing, availability, operation, preservation, etc., unless this will be done by the users themselves. long term archiving is a basic option, which is free to a certain extent.
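
the four areas above all rest on the same idea: librarians and users act on the collection with different rights, governed by the library policy. a minimal sketch of how such a policy might be modelled is given below. the role labels, task names and the `POLICY` table are hypothetical and chosen only for illustration; the paper does not prescribe any particular data model.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Task(Enum):
    """The four areas of multi-user activity named in the text."""
    ACQUIRE = auto()
    CATALOGUE = auto()
    SET_AVAILABILITY = auto()
    FUND = auto()


@dataclass(frozen=True)
class MultiUser:
    name: str
    role: str  # hypothetical labels: "librarian" or "researcher"


# Hypothetical governance rules: which role may perform which task.
# A real library policy would define these rights itself.
POLICY = {
    "librarian": {Task.ACQUIRE, Task.CATALOGUE, Task.SET_AVAILABILITY, Task.FUND},
    "researcher": {Task.ACQUIRE, Task.CATALOGUE, Task.FUND},
}


def is_allowed(user: MultiUser, task: Task) -> bool:
    """Check a task against the policy; unknown roles get no rights."""
    return task in POLICY.get(user.role, set())


@dataclass
class CollectionItem:
    title: str
    metadata: dict = field(default_factory=dict)
    contributors: list = field(default_factory=list)

    def enrich(self, user: MultiUser, key: str, value: str) -> None:
        """Add one metadata field if the policy lets this user catalogue."""
        if not is_allowed(user, Task.CATALOGUE):
            raise PermissionError(f"{user.name} may not enrich metadata")
        self.metadata[key] = value
        self.contributors.append(user.name)
```

in this sketch a researcher could enrich a record with a subject heading, while defining availability would stay with the librarian role until the policy table is changed. keeping the rights in one table means the governance rules can be adjusted without touching the collection code.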
creating a virtual working space

digital technologies are influencing scholarship and scholarly communication and place immediate and essential requirements on academic support and on the service portfolios of the libraries. academic and research libraries – especially in the fields of the humanities and the social sciences – play a crucial role as the laboratories of these disciplines.

the specific impact of digitalization on researchers’ methodologies and working patterns is based on the dynamic capabilities of linking, operating and processing digital objects like pictures, texts and further datasets (fowler, stanley, murray, jones, & mcnamara, ). the mass of digitized resources is increasing permanently through the digitization of materials and through born-digital data and texts. these materials are findable and accessible in a systematic way. hence we can assume that digital scholarship will increase significantly in cultural studies and the humanities during the next years.

this development will entail that digital materials and resources will no longer be collected, recorded and made available from the traditional background as “local” collections. rather, these materials and resources must be curated and prepared for researchers’ purposes and scholarly use. therefore academic and research libraries are more and more in a position to liaise and to actively offer appropriate services and tools, which is what digital scholars need and will increasingly expect. the library as the conventional intermediary is more and more challenged to meet these requirements and to deliver services enabling easy access and use.

the interactive – multi-user driven – library is proving to be a virtual working space as an ongoing result of the collaboration between librarians and users.
the digital public library of america, the german digital library, europeana, hathitrust, the internet archive, and many other hubs and platforms like google scholar, mendeley and wikipedia are not interactive libraries as such. but these information hubs and platforms demonstrate collaborative and interactive approaches, components and procedures of virtual working spaces, into which digital libraries are destined to be developed.

facing the potential and the opportunities of the internet and new digital media, the shape of libraries has to be re-designed and re-organized. in our times the library has to integrate and include the users in its developments. from that point of view we will create and provide an appropriate and urgently needed virtual working space, which is the future business model for libraries based on the capabilities of the internet and new media. but how can we implement and establish this?

scholarly makerspaces

for creating interactive, virtual working spaces, libraries are in a position to take up the approach of scholarly makerspaces. following the internationally known approach of “makerspaces” in public libraries, scholarly makerspaces are digital working environments where digital resources and tools are combined and made available. the service portfolios of scholarly makerspaces are provided and supported by academic libraries collaborating and interacting with researchers and third party providers of digital data, materials and tools according to the disciplinary needs. the virtual environments of the scholarly makerspaces are hosted on a work station or – even better and more often – on a web based platform to make access as comfortable as possible (dellot, ; goldenson & hill, ; willett, ; willingham & de boer, ).

what can we do in scholarly makerspaces? what really happens in them?
for example, records of aggregated objects can be searched and the related objects transferred into an environment that makes text and data mining possible. the data can then be processed further at a quality level which significantly exceeds the regular bibliographical level. a deep findability of data and objects can be achieved, which is impossible with the traditional methods of library cataloguing.

moreover, further tools will be made available corresponding to the research approaches and projects of the individual disciplines. these tools are used for annotations, encoding procedures, mapping and measurement, visualization, publishing, etc. good reasons exist to establish cooperation with partners inside and outside of the universities to offer services and tools (hilf & severiens, ; kaden, ).

we expect that scholarly makerspaces cover a basic set of tools like xml editors and/or annotation tools for digital humanists. for more complex and high level tools the library should act as an intermediary or a broker for external services and tool providers, such as the associations clarin or dariah in the field of the humanities. in these scenarios the library connects local groups with external experts or networks of expertise with regard to expertise, content, resources and tools.

to sum up: libraries should provide and support virtual scholarly makerspaces as an open, dynamic and interactive infrastructure oriented to the disciplinary demands.
the purposes of the necessary redesign of library services are as follows:
• to meet the demands and requirements of digitally working scholars and students by local services providing expertise, infrastructures, resources, training and tools,
• to give researchers enhanced access to and an overview of existing methods and resources concerning e-research,
• to share digital procedures and tools with students and the young researcher generation,
• to complete and to gain expertise about new technologies as well as about what the disciplines are demanding,
• to get deeper insights and immediate impetus for the further development of academic support.

the framework of the pilot study

with the outline of the virtual scholarly makerspaces, the key issues of the further development of academic libraries are identified. now the framework of the pilot study can be described, to explore a valid concept of an organization and process model including cost calculations for realizing scholarly makerspaces. through the impact of this new working environment and the resulting services, the intended business model will influence the entire library as well. the study should prepare the development of the virtual makerspaces, but not the prototype itself. the prototype will be built if the study can demonstrate a viable implementation at reasonable cost. the study will outline the framework for the implementation and the production operation of the makerspaces. the implementation is influenced by:

to meet disciplinary requirements in practice we cooperate with researchers of the humboldt university in the fields of german literature, cultural studies and social anthropology as well as with representatives of clarin, dariah and the university library of mainz. all the colleagues are familiar with digital environments and the corresponding working patterns (süptitz, weis, & eymann, ).
the tasks of the libraries providing scholarly makerspaces cover the acquisition, preparation and dissemination of content, resources and tools as well as analysing and communicating acceptance, demands and use. the library as a scholarly makerspace will be established as an active broker or intermediary between researchers and the providers of content and services. during the period of transformation to digital scholarship, training in information and media competence plays a crucial role. embedding these skills in appropriate courses and curricula will make a big difference.

the configuration of scholarly makerspaces will include the following components (kaden & rieger, ):
• availability of tools (on platforms) or software (on work stations) dedicated to e-research and digital publishing (“enhanced publications”),
• providing content for digital scholarship by libraries and information hubs,
• sharing expertise and training in the necessary competencies,
• creating real and virtual spaces for experiments,
• low-threshold communication facilities by blogs, wikis, repositories, etc. and building communities,
• standardized procedures of monitoring user experience and needs,
• permanently enlarging and improving the service portfolios.

as the aim of the pilot study, the requirements for scholarly makerspaces will be identified in specific modules and details. the legal and technical prerequisites of brokering and reusing existing services and tools play an important role and have to be clarified. at the same time the offered infrastructures and services must be evaluated with regard to low-threshold usability and intuitive operation.
the traditional organizational patterns of libraries, focusing mainly on information provisioning, have to be changed to an organizational model which enhances the libraries’ mission of providing information by the services of scholarly makerspaces. digital content and materials will be made available and embedded in the virtual working environment of the makerspaces to integrate resources into the researchers’ processes of analysing and operating. for building up and upgrading the scholarly makerspaces, the existing structures of libraries’ organization have to be re-designed and oriented to the enhanced and extended mission of libraries to support the research and education life cycles and to meet the requirements which are needed for this. interactive procedures between digital scholars and librarians as well as professionally conducted collaboration work in the scholarly makerspaces will improve the capabilities of libraries and optimize the processes and the results of research activities (gold & klein, ).

conclusions

an analysis of the current state of libraries clearly shows that the impact of the internet and new media is not taken into account by the organisational patterns of libraries at a sufficient scale. the outline of the scholarly makerspaces shows that the basic procedures of the librarians’ business have to be changed in depth. through the described framework of the intended pilot study the key issues of this development should be explored. as a result, a new shape of libraries will be aimed at and designed, exploiting the potential of the internet and digital media. establishing libraries as virtual working spaces by the scholarly makerspaces’ approach is a great opportunity and the right place of libraries in the digital world. of course, the library staff members have to be skilled and trained for the challenges and tasks related to the new mission and the new shape of the library.
librarians must be prepared for their new roles as liaisons and partners of scholarly collaboration and interaction. digital scholars and researchers must be enabled to act as knowledge workers and to take over library tasks to a certain extent. both parts of the library world are required to work together on an equal footing. the new relations between librarians and users must be regulated and established, even legally.

the importance and roles of third party cooperation are crucial for setting up the new library model, because the libraries will not just deliver their own or the campus’ resources and tools; materials and services from outside the campus or the local host will also be brokered and procured by the library. libraries act primarily as intermediaries or interfaces in these new service scenarios. this concerns the support by expertise and competences as well. regarding the implementation of scholarly makerspaces under these aspects, the calculation of costs plays an important role in the pilot study as well. if the pilot study is completed successfully, the new comprehensive library model will have a big impact on the digitalization of research and education.

acknowledgement

i thank both my colleagues, ben kaden and michael kleineberg (library of the humboldt university of berlin), for the many considerations and discussions about the idea of the virtual scholarly makerspaces.

references

burpee, k.j., glushko, b., goddard, l., kehoe, i., & moore, p. ( ). outside the four corners: exploring non-traditional scholarly communication. scholarly and research communication, ( ), – . https://doi.org/ . /src. v n a .
degkwitz, a. ( ). “yes, we can” - if we take over future tasks! in l. bultrini, s. mccallum, w. newman & j. sempéré (eds.), knowledge management in libraries and organizations, ifla publications series, (pp. – ). berlin/munich: de gruyter saur. retrieved july , , from http://library.ifla.org/ / / -degkwitz-en.pdf (pre-print version).
https://doi.org/ . / - .
dellot, b. ( ). ours to master. how makerspaces can help us master technology for a more human end. london: rsa, action and research centre. retrieved july , , from https://www.thersa.org/discover/publications-and-articles/reports/ours-to-master.
dempsey, l., malpas, c., & lavoie, l. ( ). collection directions: the evolution of library collections and collecting. portal: libraries and the academy, ( ), – . retrieved july , , from https://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-collection-directions-preprint- .pdf.
fowler, z., stanley, g., murray, j., jones, m., & mcnamara, o. ( ). research capacity-building with new technologies within new communities of practice: reflections on the first year of the teacher education research network. professional development in education, ( ), – . https://doi.org/ . / . . .
gold, m.k., & klein, l.f. ( ). debates in the digital humanities . minneapolis, london: university of minnesota press. retrieved july , , from http://dhdebates.gc.cuny.edu.
goldenson, j., & hill, n. ( , july ). making room for innovation. library journal. retrieved july , , from http://lj.libraryjournal.com/ / /future-of-libraries/making-room-for-innovation/.
hilf, e.r., & severiens, t. ( ). vom open access für dokumente und daten zu open content in der wissenschaft. in r. kuhlen, w. semar & d. strauch (eds.), grundlagen der praktischen information und dokumentation (pp. – ). berlin: de gruyter saur. https://doi.org/ . / . .
kaden, b. ( , april ). zur epistemologie digitaler methoden in den geisteswissenschaften. zenodo. https://doi.org/ . /zenodo. .
kaden, b., & rieger, s. ( ).
usability in forschungsstrukturen für die geisteswissenschaften: erfahrungen und einsichten aus textgrid iii. in a. rapp & s. söring (eds.), textgrid: von der community – für die community; eine virtuelle forschungsumgebung für die geisteswissenschaften (pp. – ). glückstadt: hülsbusch.
moravec, j.w., & killorn, k.e. ( ). designing the future of research libraries and special libraries in knowmad society. paper prepared for congreso amigos, mexico. education futures. retrieved july , , from https://educationfutures.com/blog/ / /designing-the-future-of-research-libraries-and-special-libraries-in-knowmad-society/.
süptitz, t., weis, s.j.j., & eymann, t. ( ). was müssen virtual research environments leisten? - ein literaturreview zu den funktionalen und nichtfunktionalen anforderungen. in r. alt & b. franczyk (eds.), proceedings of the th international conference on wirtschaftsinformatik (wi ), volume (pp. – ). retrieved july , , from http://www.wi .de/proceedings/wi % -% track% % -% sueptitz.pdf.
willett, r. ( ). making, makers, and makerspaces. a discourse analysis of professional journal articles and blog posts about makerspaces in public libraries. the library quarterly ( ), – . https://doi.org/ . / .
willingham, t., & de boer, j. ( ). library technology essentials. in , kroski, e. (ed.). makerspaces in libraries. lanham, md: rowman & littlefield.

notes
https://www.clarin.eu.
www.dariah.eu/ and https://de.dariah.eu/.

publications
article
balancing multiple roles of repositories: developing a comprehensive repository at carnegie mellon university
david scherer ,* and daniel valen
university libraries, carnegie mellon university, pittsburgh, pa , usa
figshare, cambridge, ma , usa; dan@figshare.com
* correspondence: daschere@andrew.cmu.edu; tel.: + - - -
received: february ; accepted: april ; published: april

abstract: many academic and research institutions today maintain multiple types of institutional repositories operating on different systems and platforms to accommodate the needs and governance of the materials they house. often, these institutions support multiple repository infrastructures, as these systems and platforms are not able to accommodate the broad range of materials that an institution creates. announced in , the carnegie mellon university (cmu) libraries implemented a new repository solution and service model.
built upon the figshare for institutions platform, the kilthub repository has taken on the role of a traditional institutional repository and institutional data repository, meeting the disparate needs of its researchers, faculty, and students. this paper will review how the cmu libraries implemented the kilthub repository and how the repository services were redeveloped to provide a more encompassing solution for traditional institutional repository materials and research datasets. additionally, this paper will summarize how the cmu libraries surveyed the current repository landscape, decided to implement figshare for institutions as a comprehensive institutional repository, revised their previous repository service model to accommodate the influx of new material types, and identified what needed to be developed for campus engagement. this paper is based upon a presentation of the same title delivered at the open repositories conference held at montana state university in bozeman, montana.

keywords: institutional repositories; research data; carnegie mellon university; scholarly communications; open access; open scholarship; open science; open data; engagement

introduction

over the last two decades, since the creation of the dspace repository platform by mit and hewlett-packard in [ ], academic and research institutions have developed and implemented a wide range of institutional repositories. increasingly, institutional repositories have become a dynamic tool for scholarly communication, and a necessary resource for managing institutional research and knowledge [ ]. this has included multiple repositories focused on maintaining and housing the wide range of materials that required unique environments to accommodate them digitally. likewise, some repositories were designed for set purposes, such as electronic thesis/dissertation (etd) repositories, open access publication repositories, and research data repositories.
as the creation of research data has increased, so too has the need to support its creation and management. michael witt noted that academic and research libraries have taken a more active role in the research data management services and infrastructure provided by institutions to handle the increase in data output [ ]. the expansion of roles for academic libraries has often led to their expanded integration in the research cycle of their institutions. witt further elaborates this point, detailing that libraries can collaborate with their campus communities to understand what tools, services, and support will be necessary to support services for data [ ]. as tenopir et al. explained in their study, this can lead to libraries becoming invested partners in all aspects of the research process, from data collection to publication, and to the preservation of research outputs [ ].

in his sparc position paper, raym crow noted that an institutional repository (ir) could be implemented to demonstrate the visibility, reach, and overall significance of an institution’s research, thereby providing both short-term and long-term benefits [ ]. in contrast, clifford lynch, in his arl briefing, described an ir not as a single entity or service, but rather as “a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members” [ ].

publications , , ; doi: . /publications www.mdpi.com/journal/publications
more recently, the research data alliance’s (rda) data foundations and terminology working group presented a more precise definition of a repository, especially with regard to research data. rda defines a repository as follows: “a repository (aka data repository or digital data repository) is a searchable and queryable interfacing entity that is able to store, manage, maintain and curate data/digital objects. a data repository provides a service for human and machine to make data discoverable/searchable through collection(s) of metadata” [ ].

as institutional and research repositories have grown in adoption and usage, the conceptual thinking of what a repository is has also grown. this rethinking includes the range of different repository platforms and service models. just as repositories can be defined as a dynamic service and tool, the overall scholarly communication ecosystem can also be defined as a set of related and sometimes interrelated tools and services designed to create, maintain, publish, disseminate, and assess the data and other scholarly outputs created during the research lifecycle.

founded in by steel magnate and philanthropist andrew carnegie, carnegie mellon university is a private nonprofit research university located in pittsburgh, pennsylvania, classified as r : doctoral universities – very high research activity [ ]. cmu is home to nearly faculty and , students representing seven academic colleges, spanning business, computer science, fine arts, engineering, humanities and social sciences, information systems and public policy, and the sciences [ ]. with campuses located nationwide in pittsburgh, new york, and silicon valley, and globally in qatar, rwanda, and australia, cmu is well represented and situated within the global communities of research and practice. cmu students represent countries, with faculty also representing forty-two countries.
the cmu alumni network includes over , thousand living members representing different countries [ ]. with such strong and dynamic emphases in both the arts and stem, and with such a large and diverse global reach, the research data and other scholarly outputs produced by its campus community are equally diverse.
in , carnegie mellon university (cmu) began evaluating its own institutional repository platform and service models. cmu concluded that a new repository was needed to support the wide range of materials it produces, including research data and other forms of scholarly outputs. beyond focusing on the repository and service models, cmu also focused on the overall scholarly communication ecosystem. this additional focus included examining the expanded role of the ir. cmu sought a partnership with the open data repository platform figshare to examine the development of a new repository that could comprehensively serve an academic institution or research entity by meeting the multiple needs required of a new generation of repositories, while also expanding the role a repository could play in the broader research lifecycle for the individual and the institution. this paper is based upon a presentation of the same title delivered at the open repositories conference held at montana state university in bozeman, montana [ ].
figshare.com to figshare for institutions
figshare was launched in by mark hahnel while he was finishing his phd in stem cell biology at imperial college london [ ]. through a chance meeting during the beyond impact workshop at the wellcome trust in , figshare was offered funding by digital science to grow the company with the mission of making research data citable, shareable, and discoverable [ ].
the platform initially operated under a ‘freemium’ model, allowing individual researchers to create free accounts and upload research data, regardless of whether it was associated with a published paper or contained negative results. in , government initiatives and funding agencies across the globe pushed for public access to research (seen in activity ranging from open access efforts in australia to the ostp memo from the obama administration in the us) [ , ]. during this same year, figshare announced an enterprise tool and began working to support universities and academic publishers in research data management. with figshare for institutions, universities were given a way to publish research data in any file type and encourage collaboration, sharing of research, and data reuse. today, figshare works with over enterprise partners globally and hosts over million files on the figshare platform [ ].
since its inception, figshare has aligned itself and its mission with the wider open access and open research communities. larger open research initiatives like the research data alliance (rda) and force have helped provide standards and guidelines for the community around best practices for managing scholarly content at the individual and institutional level [ , ]. one particular best practice for trusted research repositories is to have a clearly stated and public mission [ ]. figshare founder and ceo mark hahnel published a copy of the figshare mission and beliefs in [ ]. the figshare core beliefs are as follows.
• academic research outputs should be as open as possible, as closed as necessary
• academic research outputs should never be behind a paywall
• academic research outputs should be human and machine readable/queryable
• academic infrastructure should be interchangeable
• academic researchers should never have to put the same information into multiple systems at the same institution
• identifiers for everything
• the impact of research is independent of where it is published and what type of output it is
these core beliefs are the foundation of all the work figshare does to support the wider scholarly communication ecosystem and ensure that the platform’s tools align with community standards. figshare, as the name suggests, aims to ensure that all research is made openly available in a discoverable manner under the most liberal licenses available for reuse. none of the content on the platform is behind a paywall, and the figshare team maintains a public, openly documented api to ensure that content is accessible not only to humans but also programmatically to machines [ ]. the open api also ensures that content on figshare can be queried by computers, migrated, or fed into other university systems. all researchers on figshare can sync their accounts with orcid, and every public item on figshare receives a persistent identifier (usually a doi) [ ]. finally, the driving mission, and one of the reasons hahnel created the platform, was to create a larger commons and community, ensure that research data are treated as first-class research objects, and allow researchers to get credit for their work outside of the existing academic publishing process. these beliefs helped shape and continue to drive the development of figshare for institutions.
repository landscape at carnegie mellon university
prior to , the university libraries at cmu maintained only two repositories.
the archival repository provided repository services for archival and special collection materials, while the traditionally focused institutional repository housed the materials conventionally held in an ir. at this time, there was no repository service designated for research data that could adequately address the needs of researchers’ data.
the mission of the university archives at cmu is to document, preserve, and provide access to the records documenting life at cmu and the contributions of its students and faculty [ ]. the university archives maintains an archival repository, implemented in , for its digital collections. the archival repository is built upon the hosted platform knowvation (formerly known as archivalware) offered by progressive technology federal systems, inc. (ptfs) [ ]. the digital collections at cmu comprise twenty-six collections from the university archives. these digital collections include digitized campus publications; large archival collections, such as the herb simon papers; rare books from the posner collection; projects digitized in partnership with the carnegie library of pittsburgh and the heinz history center; and fully-digitized archival collections made available for researcher access [ ].
built upon digital commons, the hosted ir and publishing platform offered by bepress (now elsevier), research showcase served as the ir for cmu from october of to june [ ]. as a traditionally focused ir, research showcase provided online access to materials produced by members of the cmu faculty, staff, and students. these materials included green and gold open access versions of published works, gray literature such as white papers and technical reports, academic posters, conference papers, presentation slide decks, undergraduate honors theses, and graduate student electronic theses and dissertations.
while used primarily as a traditionally focused ir, research showcase was also lightly used as a publishing platform. between and with the publishing of volume issues , the journal of privacy and confidentiality was published on research showcase through the relationship between the journal and one of its three founding editors, the late professor stephen fienberg [ ]. while the journal was published on research showcase, it did not utilize the journal publishing module built into digital commons. since the publication of volume issue , the journal has moved its operations to the labor dynamics institute at cornell university [ ].
in alignment with the . % of institutions examined by ayoung yoon and teresa schultz in their content analysis study of academic library websites [ ], the university libraries began offering research data consultation services around with no data repository in place. these consultation services included data management plan development, search, reuse, sharing methods, and reviews of required or appropriate venues for data publishing.
evolution of repository services at cmu
with the expansion of data sharing requirements by the national institutes of health in and the national science foundation in , academic research libraries explored how to provide the technical infrastructure and services necessary to help researchers meet these newly mandated requirements [ , ]. neither the archival repository nor the traditional ir was designed to handle the complexity of research datasets. that said, a data repository service had not previously been requested at cmu. this changed in february with the u.s. white house office of science and technology policy (ostp) memorandum, which directed federal agencies with more than $ million in research and development expenditures to prepare policies to make federally funded research results publicly available within months of publication [ ].
in february of , the university libraries were asked if they could help make a research dataset publicly available so that a researcher could comply with their funder’s data sharing requirements. the university libraries were able to assist the faculty member, but only with an unconventional, short-term solution: depositing the dataset in the archival repository as a stop-gap. this dataset has since been migrated to the new repository [ ]. this use case and stop-gap solution provided the basis and laid out the needs for a new repository platform that would meet the needs for data publishing and sharing across campus. it also presented an opportunity to evaluate the current repository landscape at cmu and to ascertain whether a new solution could be implemented not only to meet the growing, unmet needs of emerging forms of scholarly outputs, but also to better serve the needs already addressed by the current repository solution.
published in the fall of , the carnegie mellon university strategic plan included a strategic recommendation for the creation of a st century library that would serve as a cornerstone of world-class research and scholarship from cmu. one important goal tied to this strategic recommendation was to develop services and infrastructure that would “steward the evolving scholarly record and champion new forms of scholarly communication” [ ]. the university libraries took up this goal and began evaluating repository platforms and repository service models. prior to the publication of the cmu strategic plan, the university libraries published an internal report in early that was based upon its evaluation of current repository solutions for a new institutional repository. the report covered several discussions and evaluations similar to those conducted by peer institutions that had evaluated their own institutional repository or data repository needs [ ].
this internal report on “cmu’s institutional repository, research data repository, and digital collections platforms” focused on determining the requirements for a replacement ir platform and a potential data repository [ ]. additionally, the report included a review of the challenges and issues a new repository platform would present to the university libraries from technical, organizational, and service perspectives. the report presented some of the internal use cases and requirements that were based upon current capabilities provided by digital commons, which included self-deposit and deposit by proxy, arrangement and description of content by academic hierarchy, the ability to deposit content in various file formats with accompanying metadata, and the ability to monitor usage statistics (e.g., altmetric data, views, and downloads). likewise, the report presented several aspirational features and capabilities, including a system that could generate dois during the submission and publishing workflows, the ability to accept larger (> gb) files, and a way for users to preview content before downloading it. the report presented an evaluation of possible repository solutions, based upon currently known systems and implementation examples from peer and aspirational peer institutions using similar systems. the systems that were evaluated included fedora, dspace, eprints, islandora, hydra (now known as samvera), invenio (formerly known as cdsware), sobekcm, and zentity from microsoft [ ]. each platform evaluation included a summary of its background, history, technical overview, features, and implementations found at other institutions. the report concluded by presenting a possible plan for implementation of each proposed system, as well as a discussion of the challenges and concerns each new system would present.
overall, the report found that while digital commons lacked some of the technical capabilities necessary for a data repository, it possessed several features that were useful and beneficial to users and administrators. similarly, while several open source platforms offered potential solutions that met the proposed data repository needs, they presented their own challenges. because many of the open source solutions were written in various programming languages, the university libraries lacked personnel with the background and knowledge to work with them. likewise, these systems would present additional needs for hosting and infrastructure that the university libraries could not sufficiently provide at that time. while the internal report on institutional repository evaluations and possible data repository solutions was being developed, the university libraries became aware of the figshare for institutions platform as a possible data solution, which had not been included in the original internal report given its timing and availability.
utilizing a licensed repository solution was appealing for several reasons. first, as already discussed, the university libraries lacked the technical expertise to manage and support the most commonly used open source solutions. secondly, the operational costs for the new repository were comparable to the costs associated with the current institutional repository, which was also a licensed solution. lastly, the university libraries already had a critical need for a repository, and waiting to hire the necessary personnel would have delayed a solution beyond the timeline campus leadership expected for results.
using figshare for institutions as the data repository solution also appealed to the university libraries because of what the product would provide functionally and technically. as an open platform available freely for anyone to use via figshare.com, it was a repository solution that campus community members would potentially already be accustomed to using. upon examining the data published publicly, the university libraries identified several datasets deposited by campus faculty and graduate students. this meant that the university libraries could draw on the existing name recognition and workflows to highlight that their new repository would already be familiar to users. by highlighting that the cmu repository would be “powered by figshare,” the university libraries could utilize figshare’s relationship with the campus community to provide its own repository services. with a metadata record based upon dublin core, the submission process required to make deposits presented a simple, straightforward workflow that would not overburden users. in addition, like figshare.com, figshare for institutions possessed several avenues for interoperability and integrations with necessary research mechanisms. users were not restricted from uploading certain file formats, and they could conduct deposits through the system’s user interface, a desktop plugin, or the platform’s open api [ , ]. through its integrations with github and the doi registration authorities ezid and datacite, users could easily sync their current workflows to push datasets from a working space to the repository to be published with a recognized data citation and doi for future citability [ ].
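as an illustration of the api route for deposits described above, the following python sketch prepares (but does not send) a request that would create a draft item. it assumes that figshare’s public rest api is served at https://api.figshare.com/v2, that private items are created with a post to /account/articles, and that a personal token is passed in an authorization header; the function name, the placeholder token, and the metadata values are hypothetical, and the current api documentation should be checked before relying on any of these details.

```python
from __future__ import annotations

import json
import urllib.request

# Assumption: base URL of the public Figshare REST API (version 2).
FIGSHARE_API = "https://api.figshare.com/v2"


def build_deposit_request(
    token: str, title: str, description: str, keywords: list[str]
) -> urllib.request.Request:
    """Prepare, without sending, a request that would create a draft item.

    Hypothetical helper for illustration only; the endpoint path and the
    payload field names are assumptions about the Figshare API.
    """
    payload = {"title": title, "description": description, "keywords": keywords}
    return urllib.request.Request(
        url=f"{FIGSHARE_API}/account/articles",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_deposit_request(
        token="PLACEHOLDER_TOKEN",  # hypothetical; never hard-code a real token
        title="example dataset",
        description="illustrative deposit metadata",
        keywords=["example"],
    )
    # Show what would be sent; actually sending is left to the caller.
    print(request.get_method(), request.full_url)
```

building the request separately from sending it keeps the sketch usable offline; an actual deposit would pass the request to urllib.request.urlopen and then upload files in further calls, which this sketch does not cover.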
lastly, because figshare for institutions was a hosted repository solution, with storage maintained by amazon web services, the technical infrastructure necessary for hosting the repository and its materials would not be left to the university libraries to manage or maintain [ ].
the internal report revealed that further evaluation of repositories was needed. this led to the formation of the digital repository task force (drtf) within the university libraries in october of . like similar groups organized at other institutions, such as the task force formed at the university of minnesota to develop the data repository for the university of minnesota (drum), the university libraries’ task force comprised librarians, archivists, and staff from around the university libraries [ ]. all identified team members possessed some level of knowledge or expertise in repositories, and were also identified as individuals who would have a vested interest in the repository once implemented. the drtf included members from the archives, research data management unit, scholarly communications and research curation unit, libraries it, and postdoctoral fellows from the university libraries’ council on library and information resources (clir) postdoctoral program. the task force’s goal was to take the information gathered from the previous internal report and combine it with new analyses on a new repository solution. part of this goal was also to define a new repository and related service that could be targeted towards multiple and diverse audiences. as the university of minnesota study found, this diverse audience could include researchers/data authors, pis, campus administrators, and institutional research stakeholders [ ].
as the university libraries further evaluated repository solutions, the university also began evaluating the research information management (rim) system landscape. from october of to may of , the university evaluated several rim systems.
these included pure from elsevier, converis from clarivate, and symplectic elements from digital science. the university chose not to evaluate digital measures from watermark, because it was a solution already implemented at the individual college/school level: the college of engineering and the tepper school of business both had their own licenses to digital measures. both units were ready to evaluate a new rim system, especially if that new system was going to be maintained and supported university-wide. the evaluation of the rim landscape involved a number of individuals from around the university, and could be described as a “collaboration of stakeholders” [ ]. the evaluation was conducted by members of the university libraries, campus administration, college and school deans and associate deans, members of the faculty, campus computing services, the office of the vice provost for research, sponsored programs, and the general counsel’s office. all members of the rim evaluation group were invested in the way in which research conducted at cmu was developed, completed, reported, verified, published, and preserved. beyond focusing on just a rim, the campus rim evaluation group also looked at other systems, tools, platforms, and services that could have a potential connection to the rim, which included new repository system(s). after evaluating each of the rims, and several other potentially interrelated systems, the university selected symplectic elements as its rim in february . in addition to selecting symplectic elements, the university also chose to license a suite of services from digital science, including altmetric.com and dimensions. the university also decided that figshare’s figshare for institutions repository platform would become the new repository platform.
beyond utilizing figshare as a data repository, it was decided that the new figshare for institutions repository would also become the new institutional repository platform. this decision was not just a matter of setting forth a plan, but also included the investment in and purchasing of these new services, which came from newly added funds provided by the provost’s office. this new repository, including its related services, would not just be a grassroots effort of the university libraries. the repository should be both a top-down- and bottom-up-focused endeavor [ ]. a key factor found in the study conducted by lagzian, abrizah, and wee was the importance placed on management support of the ir [ ]. the purchase of the new repository was not just an investment made by the university libraries, but an investment that was integral to the university, thus providing the university libraries the necessary means to expand research support and services across the university.
cmu and figshare were both very interested in exploring how the repository could be implemented as more than a traditional data repository. during the examination of figshare for institutions as a data repository, the university libraries recognized that the technical and functional capabilities needed to implement a new institutional repository were already present in figshare for institutions. additionally, because the figshare repository would be treated as a repository at an institution, the data would be published and arranged in collections and series that would reflect the organizational structure of the academic colleges, schools, departments, research centers, and institutes at cmu, which is exactly how the ir was already arranged. because figshare.com already permitted users to submit any file format, many users were already depositing materials that, from a collection development perspective, would have been deposited to an ir.
lastly, figshare for institutions possessed the functional and technical capabilities for the university libraries to implement curation workflows ensuring that the content published in the repository was reflective of the research and scholarship of the cmu community, and was permitted for open dissemination in an open access repository. with these common functions and capabilities, the university libraries questioned why users had not previously thought to use figshare for institutions as the ir.
with the repository serving as both a data repository and institutional repository, cmu referred to its new repository as the comprehensive repository. this new comprehensive repository would offer a robust and reliable place to curate research data and other scholarly outputs, ensuring compliance with open data and open access mandates from funders and publishers and promoting a culture of open sharing of research and scholarship from cmu. additionally, this consolidated repository service would decrease the number of locations campus partners would have to interact with for depositing their content. by developing a single repository that combined common and parallel goals, and thereby limiting the number of repositories and interaction points, the university libraries could avoid offering multiple overlapping repositories that create points of competition, such as those seen at the university of minnesota and penn state university [ ].
there were several use cases that digital science and cmu wanted to jointly explore by taking advantage of the interoperable nature of these systems. these use cases and shared interests moved the relationship between cmu and digital science beyond a traditional licensed-product relationship between vendor and customer, and towards a partnership to explore and design possible solutions for these use cases.
in february of , cmu and digital science announced the creation of a strategic development partnership agreement [ ]. through the implementation of a suite of products from the digital science portfolio, cmu unveiled a broad solution to capture, analyze, and showcase the research and scholarship of its faculty, staff, and students using continuous, automated methods of capturing data from multiple internal and external sources. this includes publication data and associated citations, altmetric data, grant data, and research data itself. this partnership and its common goals gave cmu the mechanisms to provide its faculty, funders, and decision-makers with a more accurate, timely, and holistic examination of the institution’s research and outputs. through the shared goal of championing new forms of scholarly communications, cmu brought together these services and tools from digital science, alongside other service solutions from within the university and other external service providers, to develop its own scholarly communications ecosystem, one that would rely heavily on the new comprehensive repository platform from figshare.
the kilthub repository
having a repository built upon the figshare for institutions platform presented both advantages and challenges to the university libraries. first, the university libraries had known from earlier interactions and meetings with faculty and students that many were already familiar with the service provided by figshare.com. this service offering included the ability to deposit a wide range of file types, many of which had file previewers, manipulation tools, and plug-ins built into the user interface. additionally, from a data creation standpoint, figshare already had integration with github, which allowed users to pull data from their github accounts and publish the data to figshare.
beyond traditional publishing, a distinguishing capability is the ability to version data during the data publishing process [ ]. figshare already allowed users to version their data, regardless of whether this was initiated directly through the user interface or through the versioning of data provided from the user’s github account integration.
after the announcement of the strategic partnership, the university libraries knew they needed to name and brand their new repository to reflect its ties to cmu. simply calling the repository “figshare @ cmu” or the “cmu figshare repository” would not work. while simple, these names would not allow the university to market the repository as a true repository solution and service offered by cmu. marketing a repository in a way that highlights its capabilities, services, value, and impact is crucial to ensure campus awareness and to develop the necessary incentives for internal and external stakeholders [ ]. the name needed to reflect that it was more than just a portal to figshare.com filtered to cmu material. likewise, it was important that the name convey the intended nature of the new platform: a comprehensive repository that combined a data repository and traditional ir into one single repository. with figshare traditionally being seen as a data repository, it was important that users understand that the new repository would be much more. the repository would have more than one primary focus. this new repository would account for both lynch’s and crow’s definitions of an ir. novak and day described these definitions as the “thesis and antithesis” to the two foundational principles of irs as primarily serving the needs for green open access or new forms of digital scholarship [ ]. kilthub’s focus would be to serve as a repository that reflected both perspectives.
by reflecting both foundational perspectives, kilthub would serve as a proposed solution to what clifford lynch described, in the introduction to making repositories work, as the “unresolved dialectic” [ ]. lynch’s dialectic could also be carried through between figshare.com and figshare for institutions. while this new repository would be “powered by figshare,” it would provide more than what users experienced from the public figshare.com service. it would reflect the capabilities of a repository maintained at an institution, including the additional layers of curation services provided by repositories at similar universities. although kilthub was a unique repository built on a platform that many counterparts had not adopted, its capabilities and services would be comparable to those seen at other institutions. when compared to the six repositories examined by johnston, carlson, and hswe in their study from the data curation network, kilthub would provide the same types of pre-ingest curation, deposit support and mechanisms, approval, publication, and post-ingest curation services [ ]. additionally, cmu wanted the repository to fall into the traditional “institutional repository” category. this could cause the repository to be categorized with the same associated pitfalls and limitations linked with a traditional ir.
to achieve these goals, the university libraries ran a campus-wide naming contest for the new repository between february and march [ ]. the contest was open to all faculty, staff, and students. a prize of five hundred dollars towards a research or travel grant (for faculty), or a piece of technology of equal value from the campus computing store, was offered for the winning entry. entrants were required to submit an original and distinctive name, and could submit multiple unique entries.
although not required to enter the contest, entrants were also encouraged to submit taglines and proposed logos to use for marketing and promotional purposes. once the entry period had ended, a selection committee was formed, comprising representatives of the university libraries, the faculty senate university libraries committee, and students. in total, the contest received entries from faculty, staff, and undergraduate, graduate, and phd students representing the pittsburgh, silicon valley, and qatar campuses. the winning entry was submitted by an associate teaching professor of hispanic studies from the department of modern languages within the dietrich college of humanities and social sciences. on april , during national library week, the university libraries announced the new name of its repository: kilthub [ ]. the kilthub name was selected for two main reasons: first, the name reflected the scottish connection the university has maintained with its founder. second, the name alluded to the central “hub-like” role repositories can play by collecting and disseminating research data and other scholarly outputs of the entire institution. as the comprehensive repository for cmu, the kilthub repository “collects, preserves, and provides stable, long-term global online access to a wide range of research data and scholarly outputs created by the faculty, staff, and students of carnegie mellon university in the course of their research and teaching” [ ]. in addition to implementing the repository, the university libraries developed a parallel information portal [ ]. the information portal provides additional information, contact information, and several user guides. the user guides cover several topics, such as using the repository, depositing scholarly outputs, preparing data, and completing the readme.txt file, which is required for each data deposit submission.
kilthub repository teams as researchers have become more accustomed to sharing their data, working with these materials has also matured. as data and other forms of digital scholarship and research expand, librarianship to support these objects and activities will also mature [ ]. while systems will need to mature, the repository service must rely upon those providing these services. as rowena cullen and brenda chawner discuss in their study, the “build it they will come” philosophy has never been truly justified [ ]. the repository is more than its technology. there has to be a strong supporting infrastructure of support services, including support personnel, to enrich the repository experience and its usefulness to its users. to maintain a repository with such a dynamic purpose, and to provide a suite of repository services, a service model utilizing three sets of teams of individuals from within the university libraries was implemented. the service model adopted by these teams of individuals would need to scalable and manageable for the university libraries [ ]. the composition of the kilthub repository teams was reflective of the findings of the research data services survey from the dataone project, which highlighted the usage of a diverse set of individuals and teams to provide these services [ ]. as tenopir et al. discussed, while the need to grow the core data services team was core to the research data services component of the repository, the larger bulk of expansion of research data services at cmu have been around shifting current library faculty and staff in association to research data publications , , of services, as well as depending upon individuals having secondary support roles, based upon on-the-job exposure, training, and data deposit-related responsibilities [ ]. the university libraries developed three teams, which combined would constitute the kilthub repository team. 
these three teams were classified as the repository services team, the data services team, and the liaison librarian team.

. . repository service team

as lee and stvilia present in their study on practices of research data curation in irs, in many cases the repository staff are the first to interact with users, and can coordinate the next steps in the workflow and any additional service providers who may be involved in the data deposit [ ]. the repository services team is the first to interact with deposits, and coordinates the stages of the curation workflow with any future involvement of the additional repository teams. the repository services team is composed of three individuals: the scholarly communications and research curation consultant, the repository specialist, and the data deposit coordinator. the scholarly communications and research curation consultant is the faculty librarian who has overall responsibility for the repository and its related services. they are responsible for the communication and interactions within the repository team, and serve as a liaison between the university and the vendor. they also oversee the overall mission and future goals for the repository. the repository specialist is a fte university libraries staff member, and serves as the primary repository manager. they serve as the day-to-day lead of the repository and site-level administration, and oversee deposits and questions from users. they also liaise with the university libraries cataloging unit to oversee the submission of electronic theses and dissertations to kilthub and to other etd services, such as proquest. the data deposit coordinator is a full fte staff member with the university libraries, but is only a . fte with kilthub. they are responsible for overseeing data deposit submissions to the repository.
they ensure that each data deposit complies with both the general requirements for deposit and the requirements specific to data deposits. the data deposit coordinator has received additional training and development specifically related to best practices and standards for the deposit and dissemination of research data. in this model, the organizational makeup and composition of the repository services team is comparable to the core service teams seen at other institutions, such as the illinois data bank at the university of illinois, urbana-champaign, and deep blue data from the university of michigan [ ].

. . the data services team

academic libraries have become an important stakeholder in, and builder of, the culture and infrastructure for research data services [ ]. in their final report, the dmci at the university of minnesota found that a successful and significant repository service would be built around capacities for data management and curation to coalesce in operational effectiveness [ ]. because of this, it was important that the faculty and staff within the university libraries who provide research data services would be the second core team to the kilthub service. the data services team is composed of three individuals: the research data management consultant and the two clir fellows currently serving in their post-doctoral fellowships at cmu in data curation in the sciences and data visualization and curation. the research data management consultant is the faculty librarian who has overall responsibility for research data management services within the university libraries. they liaise with the repository services team to ensure that the repository is adhering to data best practices for the deposit and dissemination of research data.
they also serve as an additional layer of engagement, directing campus community members to utilize the repository for their data deposit needs when required or permitted by a funder or publisher in connection with a grant or publication. while members of the data services team, the two clir fellows are not operational members of the data services or workflows. the clir program is a two-year post-doctoral program offering recent ph.d. graduates an opportunity to develop new tools, resources, and services, while exploring potential career opportunities [ ]. because of their short-term status, the university libraries did not want to design operational services around postdocs. additionally, the clir fellows in the cmu cohort at the time of the development of the workflows were involved in data services and software curation, which is why their involvement is of note. this may not be the case with future clir fellows, which further underscores their supporting role to the research data management consultant. the clir fellows supply additional support for the research data management consultant by providing additional outreach and overview support for data deposit, especially when reviewing unique content, such as software and code. the clir fellows have also assisted in reviewing interactions between the three teams within the university libraries, as well as between the university libraries and campus constituents, and have aided in the development of additional outreach and engagement resources specifically geared towards improving the stream of information for reviewing data deposits. the data services team provides a secondary layer of support to the data deposit coordinator during the review of data to ensure that data deposit best practices are being exercised.

. . liaison librarian team

given their close relationship to both the discipline and their faculty, subject/liaison librarians have been found in many studies to be key stakeholders in research data management and the foundation for repository services that require high levels of collaboration [ , ]. the third team within the university libraries that supports the repository is the liaison librarian team. the liaison librarians are the faculty librarians who serve as liaisons and subject specialists to the schools, departments, research centers, and institutes around the university. in most cases, a liaison serves in a direct one-to-one relationship with a particular campus unit, but in other cases, they are responsible for multiple units and programs. the liaison serves as a bridge between the university libraries and the rest of campus. they market university libraries’ tools and services to university constituents and promote engagement with them. they also recommend kilthub as a repository solution for research data and other scholarly outputs, as appropriate and as permitted by the requirements of a funder or publisher. in recent times, the university libraries has expanded its liaison corps, both by filling vacant positions with new hires and by creating new positions to fill needed roles. many of these new hires are information professionals who hold phds within the disciplines they liaise to, and are able to discuss the outputs of those communities more directly with their constituents through their shared backgrounds and experiences. the liaison team serves as another level of support to the data services team and the data deposit coordinator if there are any questions on how the data submitted may have been collected, described, and arranged. the liaisons also provide information and background on any disciplinary best practices that should be accounted for when depositing unique forms of data.
in a few cases, when a liaison is interested in serving as a repository administrator for their liaising units, these liaisons take over the administrative roles for deposits normally overseen by the repository specialist and the data deposit coordinator. while the liaison may request to take on these roles, the repository specialist and data deposit coordinator still provide overall oversight of the work being done by the liaison administrators. the liaison administrator role is an optional responsibility, but more liaisons are taking on this role as a means to further engage with their constituents, and to stay abreast of their current research.

. streamlining workflows

with so many parties invested in the repository, it was critical to understand the potential roles that were required within a streamlined repository deposit workflow. beyond understanding the roles of the various parties, developing a coherent workflow also highlighted the services and expertise offered during the various stages in the workflow. in this way, the workflows are not just a set of tasks to be reviewed and completed, but they are also a suite of services tailored to address key components of the data life cycle [ ]. beyond ensuring that the deposit is satisfactorily completed, the workflow also ensures that librarians and library staff have the opportunity to address any concerns with the deposit, and that the deposit itself follows best practices. during the various stages of the workflow, those assigned to each task can ensure that the deposit is adhering to certain standards, such as the fair guiding principles, which will ensure that the deposit is prepared and maintained in a way that makes the dataset findable, accessible, interoperable, and reusable [ ].

. . workflow roles

the workflow serves as a means to curate, document, and review the deposit, thus ensuring and enhancing the value of the deposit and the final published work [ ].
the workflow is intended to act as a means to review the materials and information submitted by the user. since the creation of the deposit workflow, the kilthub repository teams have yet to receive a deposit that met the full set of requirements for deposit and thus needed no review or enhancement from the three kilthub repository teams. when assessing the necessary roles required to maintain the streamlined workflows, the university libraries assessed team member involvement based upon a minimum involvement model that focused on particular roles within the workflow. additionally, the workflow was reviewed for its ability to adapt to the inclusion of additional team members when and if necessary. the workflow is intended to create a process of review that ensures the deposit meets the minimum set of requirements. the assessment of the workflow was based first on an initial evaluation of work and involvement, but evolved to its current model after assessment of early deposit use case examples. as noted by michael witt, no workflow is without review or revision, as workflows themselves are designed in iteration [ ]. the deposit workflow has several distinct roles. these roles are activated depending on the type of material being deposited. for example, for a dataset deposit, kilthub has five distinct roles. in the data deposit workflow, the repository administrator, data deposit administrator, data services team, liaison librarians, and the research data management consultant will all have a potential role to play in the deposit. all user-submitted deposits begin in the same manner. once a user has added the appropriate required and optional metadata and uploaded their data files and the required readme.txt file, the user will then click submit. the metadata that comprises the submission metadata record can be broken down into required and optional metadata [ ]. both sets of metadata are built using qualified and unqualified dublin core.
the required metadata includes the deposit’s title, author listing, categories taxonomy, file type, keywords, description/abstract, and appropriate copyright license. the categories taxonomy is based upon the australian and new zealand standard research classification (anzsrc) [ ]. the repository offers a wide range of copyright licenses, including the full suite of creative commons licenses as well as gpl, mit, and apache licenses [ ]. kilthub also permits users to select “in copyright” for items that cannot be deposited utilizing an open license. when this option is selected, users must enter the copyright statement in the publisher statement field, which is a requirement for deposit if the “in copyright” license is utilized. the optional metadata that can be supplied by the user includes related funding information (grant name and number/id), references to related content, and date. a date is not required by kilthub because, if no information is provided in the date field, the repository will assume that the date the items are published to the repository is the official date of the items. once the user clicks submit, they are informed that their dataset submission will be reviewed by the site-level administrators. the site-level administrators are either the repository services team or the liaison librarian who has taken on the job of repository administrator for their school or department. once the user clicks ‘publish’, a deposit notification is sent by the system to all site-level administrators, including the repository specialist. unless administrative review has been assigned to another individual, such as the site’s liaison librarian, the repository specialist is the reviewer for that site. the repository specialist begins by conducting an initial review of the content from the notification. their responsibilities include reviewing the submission metadata that accompanies the deposit and verifying the files attached to the deposit.
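the required and optional metadata rules described above lend themselves to a simple automated pre-check. the following python sketch is illustrative only: the field names, the `missing_requirements` and `effective_date` helpers, and the dictionary layout are hypothetical and do not reflect the actual figshare for institutions schema or api. it assumes a deposit is represented as a plain dictionary plus a list of file names.

```python
from datetime import date

# Hypothetical field names; they mirror the required metadata listed in the
# text, not the actual Figshare for Institutions schema.
REQUIRED_FIELDS = (
    "title", "authors", "categories", "content_type",
    "keywords", "description", "license",
)


def missing_requirements(metadata: dict, filenames: list) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if not metadata.get(name)
    ]
    # "In copyright" deposits must carry a copyright statement.
    if (metadata.get("license") == "in copyright"
            and not metadata.get("publisher_statement")):
        problems.append("missing publisher statement for in-copyright item")
    # Dataset deposits must be accompanied by a readme.txt file.
    if metadata.get("content_type") == "dataset":
        if "readme.txt" not in {name.lower() for name in filenames}:
            problems.append("missing readme.txt")
    return problems


def effective_date(metadata: dict, published_on: date) -> date:
    """Use the supplied date if present, else fall back to the publication date."""
    return metadata.get("date") or published_on
```

a check of this kind could only flag missing pieces; judging whether a description or readme.txt file is adequate would still fall to the reviewers described in the workflow.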
the data deposit workflow is initiated once a user makes a submission in the repository that has been marked with the content type ‘dataset’ within the figshare for institutions content-type metadata field. this metadata field is a required default metadata field for all deposits made to figshare for institutions repositories. at cmu, the university libraries made the decision that all datasets, regardless of file types, would be marked as ‘dataset’, rather than utilizing other content types that were more representative of the file extensions one may associate with those content types (e.g., using ‘filesets’ for tabular data/spreadsheets). if the submission is identified as a dataset, the repository specialist assigns the deposit to be reviewed further by the data deposit coordinator. this will trigger a notification to be sent to the data deposit coordinator to begin the data deposit workflow.

. . data deposit workflow

the data deposit workflow begins as soon as the data deposit coordinator is assigned to the dataset. from this point forward, the deposit workflow is best described as an “intricate dance of communication, verification, and iteration” [ ]. as illustrated in figure , once assigned to the dataset, the data deposit coordinator begins by reviewing the deposit metadata for deposit requirement consistency. this is to ensure that all of the required metadata that must accompany a data deposit to kilthub has been provided. once the deposit metadata is checked and verified, the data deposit coordinator reviews the dataset files and the readme.txt file. the readme.txt file is a text file that must accompany all dataset deposits. the file includes additional metadata about the dataset, and the coordinator verifies its contents for data deposit consistency. if the dataset meets all the deposit requirements, the data deposit coordinator will approve the dataset for deposit.
once the dataset is approved, the system sends an automatic notification to the researcher that their dataset has been published in kilthub. approval also completes the registration of the dataset’s doi with the doi registering authority, after which the doi can be used for citation and discovery purposes. if the dataset does not meet all deposit requirements, the data deposit coordinator will email the researcher to make initial contact. in their message, the data deposit coordinator informs the researcher that they are reviewing their dataset and may be in further contact with questions regarding the deposit. the coordinator will also contact the researcher’s liaison librarian to confer on questions or concerns they wish to raise and review.

figure . the kilthub repository workflow. the steps in bold represent the minimally required steps within the workflow.

with input from the liaison librarian, and if additional information or expertise is required, the data deposit coordinator will contact the research data management consultant to involve the data services team in the review of the dataset. once the coordinator has conferred with the data services team and liaison librarian, the researcher is contacted again. depending on the liaison’s preference, either the data deposit coordinator or the liaison librarian initiates this contact. the email sent to the researcher will summarize what revisions or additions are necessary for the dataset to be approved for deposit. all parties involved in the workflow to this point are cc’d on the email to the researcher. this is to maintain the flow of information between all team members involved in the deposit. this team-based approach to the data deposit workflow relies heavily upon the communication between the team members and the author of the dataset [ ]. because so many are involved, no one person is left to provide all that is necessary for the deposit. as skill sets amongst members may differ, relying upon the expertise of the collective service providers is essential in delivering a cohesive repository-based data management service. depending on what needs to be revised or added, the work is completed by either the researcher or the data deposit coordinator. if the decision is to allow the coordinator to conduct the work, they will utilize their repository-level administrative privileges to access the researcher’s account and make the changes. if the decision is for the researcher to conduct the work, the dataset will be rejected. this is so that the dataset can be released from the review process and returned to the researcher for revision.
an internal comment is left attached to the deposit detailing to the researcher what work is required. this note will accompany the rejection notice in the form of an email to the researcher. once the researcher receives the rejection notice, they can begin to make the changes to their dataset detailed in the rejection comments. after the researcher makes the requested changes to the dataset, they can resubmit the dataset for a second review. in the second review, the dataset is reevaluated to ensure that all of the necessary changes were indeed made by the researcher. this last stage of the workflow is considered iterative, as the researcher may not have made all changes requested when they resubmitted the dataset to be reevaluated. as figure one details, this last stage is repeated as necessary, but only for a certain number of iterations. the kilthub repository team has determined that, as long as the minimum set of requirements for deposit has been met, this last iterative stage of working towards a complete and “perfect” deposit will be cycled for a maximum of three iterations. as long as the minimum requirements are met, the dataset will be accepted for deposit by the data deposit coordinator after the third iteration, even if not all detailed revisions were made by the researcher. similar to the findings of the university of minnesota implementation report, the university libraries repository service model focuses on four primary service model outcomes: self-deposit, curated workflows, policy-driven decision-making, and “freemium” services where costs can be written into grants when necessary [ ]. while the service does have a means to provide cost recovery, the core repository offering is absorbed by the university libraries as the initial burden of service for the institution.
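the capped review cycle described above can be expressed as a small decision routine. the python sketch below is a hypothetical model, not a description of any kilthub or figshare tooling: the `Resubmission` record, the `review` function, and the outcome labels are invented for illustration. the text does not state what happens when the minimum requirements are still unmet after the third iteration, so that case is simply reported as pending.

```python
from dataclasses import dataclass

MAX_ITERATIONS = 3  # cap on review iterations stated in the text


@dataclass
class Resubmission:
    meets_minimum: bool       # minimum deposit requirements are satisfied
    all_revisions_made: bool  # every requested revision was addressed


def review(resubmissions):
    """Walk through resubmissions in order; return (decision, rounds_used)."""
    rounds = 0
    for rounds, sub in enumerate(resubmissions[:MAX_ITERATIONS], start=1):
        if sub.meets_minimum and sub.all_revisions_made:
            return "accepted", rounds
        if rounds == MAX_ITERATIONS and sub.meets_minimum:
            # Cap reached: accept even though some revisions are outstanding.
            return "accepted with outstanding revisions", rounds
    # Either fewer than three rounds have happened so far, or the minimum is
    # still unmet after three; the text does not specify the latter outcome.
    return "pending", rounds
```

for example, three resubmissions that each meet the minimum requirements but leave some requested revisions undone would end in acceptance on the third round, which matches the policy of not holding a deposit back indefinitely.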
since the creation of the current deposit workflow and the increased involvement of the service-providing team members within the university libraries, the repository has seen an increase in deposits. likewise, consultations for data deposits have also increased. consultations have taken place in person during scheduled meetings and weekly repository office hours, as well as digitally through the shared repository and data services email accounts that connect the service teams to one another. additionally, information gathered during these consultations is shared and disseminated through synchronous communication provided by a shared slack service, as well as during biweekly meetings held with the research data services units.

. balancing requirements with ease of deposit

the decisions to limit the number of times the repository team must communicate with the researcher and to establish a minimum level of requirement for deposit were made in recognition that the repository needed to balance the requirements for deposit against the ease of use and deposit to the repository. part of this balance was to ensure that there was a clear and articulated set of minimum requirements for deposit, since the materials within the repository would be considered curated content versus freely available content [ ]. the minimum requirements for a deposit include the submission metadata, the readme.txt file, and a proper file naming convention applied to the dataset files. the requirement of accompanying files, such as the readme.txt file, is not unique to cmu’s deposit workflow. in dong joon lee and besiki stvilia’s survey on research data curation in institutional repositories, several responding institutions indicated that they also required such additional files, with many including additional domain-specific and data collection-specific metadata not found within the item’s primary dublin core-based descriptive metadata record [ ].
the last requirement is the usage of appropriate file types for access and preservation. this may still include proprietary file types, depending on the data, but open file formats are recommended whenever possible. the requirements are kept to a minimum so that researchers do not feel as if the repository or the university libraries are asking for more information or a higher level of completeness than what is expected to be supplied within a disciplinary setting. likewise, the deposit process and services have to be smooth and easy for users to understand, and should also highlight the benefits of deposit. these include a quick and easy submission process, responsive communication turnaround times, and mechanisms, such as doi generation and holding, for a deposit. the doi is available for generation and holding before the dataset is published, allowing researchers to embed the doi citation for their work in publications and funder documents while the materials are being developed. additionally, the depositing and publishing of the dataset ensures that the deposit can comply with requirements from publishers and funders. the second layer of concern for kilthub arises because it is built on a system that users may already be accustomed to using. since figshare for institutions is based upon figshare.com, users will recognize these preexisting requirements and processes. the requirements or steps to ease deposit that were implemented at cmu could not be seen as being higher, more strenuous, or extensively different than the same expectations within figshare.com. if kilthub had stricter requirements, or provided fewer offerings to ease deposit, it could turn the cmu community towards using figshare.com over its own campus-based solution. for example, one of the main differences is that the minimum amount of information necessary for deposit in kilthub is greater than that of figshare.com.
this increased amount of information necessary for deposit increases the amount of time necessary to complete a deposit, which, as austen et al. noted, is seen as a major disincentive for sharing data via repositories [ ]. while the figshare for institutions vs. figshare.com comparison was a concern for cmu, the risk of having a campus solution seen as stricter or harder to use than a freely available or alternative solution should be a concern for any institution offering these types of services. ultimately, the repository service cannot overburden the researcher with too many requirements or implement a deposit workflow or submission process that would be viewed as overly complicated. without taking these points into account, ensuring these requirements are kept to what is essential, and developing a repository service focused on ease of use, either the ir service will go unused in favor of an alternative service or, worse, the university libraries could be accused of not thinking of the best interests of their campus community.

. institutional repositories and repositories at institutions

balancing the deposit requirements for the new comprehensive repository, and the services offered as a benefit of the repository or to ease deposit, centered on the notion of what it meant for kilthub to be an institutional repository (ir) versus a repository at an institution. when considering what it means for a repository to be an ir, there are several questions that one can ask. the first question is what does it mean to be defined as an ir? does it mean the repository will hold textual materials and not data? or will the repository contain academic and research outputs, but not other materials that may have been housed in other types of repositories? related to these previous questions, what will be the role of the ir? how will its role affect its limitations and capabilities?
if the ir can accommodate the needs of the content and its users, why should it be limited in role or responsibility? this ultimately questions the types of material it can collect from both a technical and contextual perspective. should the repository collect content that it cannot accommodate, or for which it cannot provide users a means to use or preview? lastly, since we have seen that irs have evolved, both in their technical and contextual capabilities, is it time to reevaluate how we define an ir’s mission as well? can an ir serve a larger role than just as a home for outputs of a particular institution? by examining the repository beyond its own confines, can the ir serve a larger role? the university libraries addressed these questions during its evaluation, selection, and implementation of kilthub and its related services. in the future, as kilthub and its services mature, the university libraries will continue asking these questions to ensure that its approach to each question is in alignment with its mission, capabilities, and focus in providing such tools and services. the other perspective within this conversation concerns general repositories maintained at an institution. if the repository is truly “institutional,” how should it be administered? what level of control or administration does an institution need over its repositories? does this level of control and oversight alter how one decides to administer the repository? how should an institution provide a repository as a service? if the repository is an institutional service, how does it fit within the broader system of services and mechanisms offered by the institution? can a repository be a part of the broader scholarly communication ecosystem at an institution? if a repository is a part of the broader scholarly communication ecosystem, what type of role can it potentially play, especially when considering how researchers’ outputs may transition amongst the various stages of the research lifecycle?
repositories at an institution are more than just the software they are built upon. the entire repository comprises several components, which may include its technical infrastructure, policies, staff, and partnerships [ ]. together, these components are what make a repository at an institution a comprehensive repository solution.

figshare ir advisory board

in , coar published a report on the behaviors and technical recommendations that the next generation repository should consider [ ]. a number of the requirements aligned with the figshare vision for the future of repository infrastructure. for example, interoperability between academic systems within the university is as important as interoperability between research outputs. this alignment helped shape how figshare looked at its platform and community and spurred the creation of the figshare ir advisory board in november of . comprising figshare for institutions partners as well as universities that did not use the platform, the ir advisory board helped shape figshare’s development towards the next generation repository coar outlined in its report. the ir advisory board met over the course of several months, reviewed existing figshare for institutions functionality, and looked into what makes a successful ir [ ]. the figshare roadmap is largely client-driven, and this opportunity to work with the library community outside of figshare for institutions partners was a way to ensure the tool meets the requirements, recommendations, and best practices of the scholarly communication ecosystem.

the ir and current research information system

as in any other technical space, a plethora of rim solutions is now available. rims are traditionally viewed as a service focused towards academic leadership within faculty affairs or the research office.
with that being said, many institutions see benefits in having the implementation and overall oversight of these systems provided by university libraries. in the choice white paper “the evolving institutional repository landscape,” judy luther identifies many of the new and emerging roles associated with repositories. these roles include their relationship to new systems being adopted in academic institutions today, including rims [ ]. additionally, john novak and annette day argue that the new administrative role of working with rims presents repositories with another new dialectic [ ]. cmu is no different in this respect, and has begun to align this new face for its repository by integrating it with the cmu rim system, symplectic elements. because rims are designed to gather information on faculty activities and outputs, they have a mission similar to that of irs, and many rim systems have developed integrations of various sorts to connect to an institution’s ir. while the placement of rims may differ, it is important that librarians understand the role they can have in interacting with these systems. additionally, because these systems are designed to ingest a wide range of information about a faculty member, including their publications and research activities, librarians are poised to provide expertise and professional knowledge on how best to gather and verify this information. likewise, the repository has an opportunity to provide useful information on materials not found within other sources synced to the rim, such as gray literature, software, and research data. the cmu university libraries recognized these opportunities through the implementation of kilthub and symplectic elements.
through the interoperable capabilities of these two systems, there could be mutually beneficial outcomes that would ultimately improve the mission and functional capabilities of each system.

ir to rim

as the rim harvests publication information from multiple sources, the repository can also be integrated to serve as an additional publication source. through crosswalks established between kilthub and elements, the metadata records of content published in the repository are matched to the corresponding publication records found within elements. additionally, both systems use the same user feed to create and maintain user accounts, based on the cmu personnel identifier (andrew id) and cmu email address (andrewid@andrew.cmu.edu). by matching this information from kilthub to the information located within elements, the faculty member’s publication records are further validated. because both systems use the same author profile information, this matching does not require authors to claim the publication within elements or kilthub. the new record found within kilthub is connected to the records found from other sources, thereby allowing a faculty member to review where elements found the information from each source. additionally, the repository is used as a harvesting source for publications and other materials not found within elements. this matching and connection is made using the dois published by kilthub and the dois found within the metadata of items found in the elements content indexing searches. these materials are found by the elements indexing searches through name identification, as well as through other known author identifiers found within the data sources, such as orcid, scopus id, and researcher id [ ].
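the doi-based matching described above can be illustrated with a short sketch. it is not the elements or figshare implementation: the record shapes (RepoItem, RimRecord), the field names, and the rule that a match requires both the same campus identifier and the same doi are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class RepoItem:
    """A published repository item (hypothetical shape, not a real schema)."""
    title: str
    doi: str
    depositor_id: str  # campus identifier shared by both systems


@dataclass
class RimRecord:
    """A publication record held by the RIM (hypothetical shape)."""
    title: str
    dois: set
    owner_id: str
    sources: list = field(default_factory=list)


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip common resolver prefixes before comparing."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def match_items(items, records):
    """Link repository items to RIM records that share an owner and a DOI.

    Returns (matched, unmatched): matched holds (item, record) pairs and
    unmatched holds the items for which no record was found.
    """
    # Index every RIM record by (owner, normalized DOI) for constant-time lookup.
    index = {}
    for record in records:
        for doi in record.dois:
            index[(record.owner_id, normalize_doi(doi))] = record

    matched, unmatched = [], []
    for item in items:
        record = index.get((item.depositor_id, normalize_doi(item.doi)))
        if record is None:
            unmatched.append(item)
        else:
            record.sources.append("repository")  # note the extra source
            matched.append((item, record))
    return matched, unmatched
```

because the lookup key includes the shared campus identifier, no separate claiming step is modeled here; what happens to the unmatched items is left to the caller.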
when an item found within kilthub is not matched to an existing record within elements, the system generates a new record for that item within elements. this record sits within the faculty member’s full listing of publications and is identified as originating from the kilthub repository. further work is being done to improve this connection and content identification. because kilthub creates a doi for all published records, items added to kilthub through green open access have two dois. one doi is the publisher doi for the version of record, and the second doi represents the deposit in kilthub (i.e., the repository doi). additional work between cmu, figshare, and symplectic is currently in progress to ensure that elements can understand the nuances of these two dois and the connection they have to one another if both are present on a single record. this additional work will ensure that elements does not add the kilthub record for a publication as a new record, but recognizes that the record found in kilthub is the record of a publication it may have already indexed.

rim to ir

because the rim is a hub for publication information (information that it has been able to search, query, and collect through automated processes), these same processes can be extended to provide a mechanism to gather content that may be suitable and permissible for deposit to kilthub. because the data has been gathered through machine-readable processes, the metadata can be checked with additional tools to determine whether the publisher permits repository deposit, by comparing the source information with that found within sherpa/romeo [ ]. within elements, the repository tools module allows publication data to be checked for deposit. when a publication is claimed by an author, it can then enter the repository review process. in this process, a publication’s source is reviewed through the connections between elements and sherpa/romeo.
if a publisher allows deposit to a repository, elements indicates this to the user and gives them the opportunity to attach the version of the article permitted by the publisher. once the user supplies the appropriate file, they can also agree to the terms of the repository’s deposit agreement. once the user agrees to the terms and clicks submit, the metadata record from within elements is supplied as the repository submission record along with the publication files, and both are passed to the repository to be reviewed by the repository specialist. this process allows the full metadata record that was used to populate the elements publication record to be used to complete the repository submission. because the deposit still enters the traditional repository review workflow, the content is still curated and verified by the repository specialist. since these two systems use the same user feed and account verification mechanisms, the systems can again ensure that the publication will be associated with the correct authors once the deposit is completed. this feature currently provides deposit for publications, but cmu is working with symplectic and figshare to see how this functionality could be extended to other content types, such as research data. once available, this extension of the publication deposit process will allow users to deposit their content to the repository regardless of which system they primarily interact with.

ir and rim ecosystem

as figure illustrates, by having the repository and rim interconnected through a two-way integration, both systems are able to combine their primary purposes towards a new shared service model of monitoring the levels of open access. as anna clements noted in her article, “research information meets research data management . . .
in the library?”, we can change the perception of these platforms from just systems to interrelated services [ ]. additionally, austen et al. found that these integrated workflows resulted in the scholarly objects found in these two systems being connected, linked, citable, and persistent, allowing researchers to navigate smoothly within these systems and enabling reuse [ ]. whether or not it was mandated by a funder, the open status of any individual faculty member, campus unit, or the overall university could be analyzed and potentially extended when permitted. additionally, while cmu has neither an open access mandate nor any extenuating government requirements to validate that it has complied with open access requirements, the university’s open access status could still be reviewed and enhanced through this dual connection. by using the rim to monitor open access, the repository services team can use the rim as an additional layer of engagement. just as the repository specialist would review a faculty member’s cv or faculty website, the faculty member’s elements profile would verify which publications were authored by the faculty member, as well as provide the necessary information from the metadata record that would be linked via sherpa/romeo to confirm the actions taken to make that publication openly available within the repository. by having the ir and rim connected by a two-way channel that enables both platforms to fulfill a primary and supporting role for one another, the ecosystem created by these two connections allows the systems to become interoperable. this also allows the passage of information and content to be seen as a seamless connection by both users and administrators.

figure . the institutional repository (ir)-research information management (rim) system integration at carnegie mellon university (cmu).
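the rim-to-ir deposit path shown in the figure can be sketched as a small state machine. this is an illustration under stated assumptions, not the elements repository tools module: the stage names, the POLICIES table standing in for a sherpa/romeo lookup, and the rule that an unknown journal is treated as not permitted are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Stage(Enum):
    CLAIMED = auto()        # author has claimed the publication in the RIM
    ELIGIBLE = auto()       # a publisher policy allows repository deposit
    NOT_PERMITTED = auto()  # no permitting policy was found
    FILE_ATTACHED = auto()  # author supplied the permitted version
    SUBMITTED = auto()      # metadata and file passed on for specialist review
    PUBLISHED = auto()      # repository specialist approved the deposit


@dataclass
class Deposit:
    title: str
    journal: str
    stage: Stage = Stage.CLAIMED
    permitted_version: Optional[str] = None
    file_name: Optional[str] = None


# Hypothetical policy table; a real system would query a publisher-policy service.
POLICIES = {"journal of examples": "accepted manuscript"}


def check_policy(deposit: Deposit, policies: dict = POLICIES) -> Deposit:
    """Mark the deposit eligible if a policy permits some version of the article."""
    permitted = policies.get(deposit.journal.lower())
    if permitted is None:
        deposit.stage = Stage.NOT_PERMITTED  # assumption: unknown means not permitted
    else:
        deposit.permitted_version = permitted
        deposit.stage = Stage.ELIGIBLE
    return deposit


def attach_file(deposit: Deposit, file_name: str, version: str) -> Deposit:
    """Accept a file only when it is the version the publisher permits."""
    if deposit.stage is not Stage.ELIGIBLE:
        raise ValueError("deposit is not eligible for a file upload")
    if version != deposit.permitted_version:
        raise ValueError(f"publisher permits only the {deposit.permitted_version}")
    deposit.file_name = file_name
    deposit.stage = Stage.FILE_ATTACHED
    return deposit


def submit(deposit: Deposit, agreed_to_terms: bool) -> Deposit:
    """Send the metadata record and file to the repository review queue."""
    if deposit.stage is not Stage.FILE_ATTACHED:
        raise ValueError("attach a permitted file before submitting")
    if not agreed_to_terms:
        raise ValueError("the deposit agreement must be accepted")
    deposit.stage = Stage.SUBMITTED
    return deposit


def approve(deposit: Deposit) -> Deposit:
    """Record the repository specialist's approval of a submitted deposit."""
    if deposit.stage is not Stage.SUBMITTED:
        raise ValueError("only submitted deposits can be approved")
    deposit.stage = Stage.PUBLISHED
    return deposit
```

keeping review as its own stage mirrors the point that a deposit started in the rim still passes through the repository specialist before it is published.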
full utilization of the connections and potential benefits of the rim requires the repository staff to be cross-trained on its usage. additionally, the repository staff need a certain level of access to faculty profiles to engage with faculty publication listings. by ensuring the repository services staff are cross-trained and have the right level of access to the rim, the rim can be seen as an additional layer of open access services. viewed as an additional service layer, it can be marketed to faculty as a service that will improve the discovery and verification of their scholarship without creating another layer of undue burden. if librarians and other information professionals can focus on how these services are interconnected, and can improve how they provide their services, they can provide a level of expertise and knowledge that disciplinary faculty may not have. likewise, because of their expertise, they can also free faculty from having to focus on such matters. as nicholas joint points out, this allows the disciplinary faculty to focus on their own research, and not be concerned with learning additional systems or understanding how to conduct self-deposit [ ]. this further illustrates the value that the libraries and librarians provide to both the institution and faculty, and how they can develop additional services from preexisting systems being used across their institutions.

the ir in the scholarly communication ecosystem

the role of the repository will continue to change and evolve as the needs of researchers and administrators evolve. in this way, the repositories’ dialectic will continue to add additional “faces” [ ]. the number of tools and services designed to assist with scholarly communications is expanding. new tools and services have been created to fill voids where previous support either did not exist or was inadequately provided.
in , jeroen bosman and bianca kramer presented their findings on their “innovations in scholarly communications” at the force meeting [ ]. to date, bosman and kramer have identified over different systems, tools, and services [ ]. their work also identified that these innovations could be classified into six different stages (discovery, analysis, writing, publication, outreach, and assessment), which represent a researcher’s workflow. in comparison, the university libraries has classified its tools and services into a similar model also reflecting the researcher’s workflow. as figure illustrates, the university libraries differs from kramer and bosman in that it identifies only five stages. the five stages represented at cmu are discover, organize, create, share, and impact.
with these five stages, the university libraries has classified the services, tools, and platforms that it maintains, supports, or licenses to support the endeavors of its faculty and students throughout the stages of their research and scholarship.

figure . the scholarly communication ecosystem at cmu.

in creating this ecosystem, the university libraries has focused on reviewing options within a particular space to ensure that any new additions to the ecosystem are beneficial to as much of the campus community as possible in a financially sustainable way, by requiring interoperability beyond a single vendor’s solutions. a focus of this ecosystem has been to select tools and services that break away from siloing activities and that support transitions to other stages. just as an author wants to transition from writing, to publishing, to disseminating, the tools that support these actions should also allow a researcher to transition their outputs from one stage to the next. whether moving from hosting project files in a cloud-based storage solution, to publishing these materials in the repository, to then measuring the impact of the dissemination in the form of altmetrics and traditional citation metrics, researchers want fluidity and ease of movement in their workflows and in the systems that support their research. this also means that tools and services should be able to integrate and interoperate with multiple solutions, regardless of their classified stages. for example, a researcher should not be expected to have to move from create to share, but can move from organize to share to impact if they do not require create. by ensuring the greatest amount of fluidity, the researcher can control their workflows without having to create additional workarounds or external solutions.
by recommending the usage and creation of the same identifiers used by the repository, the user’s connection to their materials across the ecosystem can be further interconnected, allowing content to be utilized and synced across multiple platforms. additionally, by having systems that are interoperable and vendor-agnostic, institutions can feel empowered to review and select the tools, services, and platforms that are the right fit for their institution. institutions should not feel forced or obligated to select services because of big-ticket buy-in or integration limited to systems from the same vendor or service provider.

conclusions: weaving the fabric of research

the kilthub repository was designed to serve as a comprehensive repository for the materials produced by members of carnegie mellon university and their collaborators. while the repository can accept a wide range of materials, these materials fall into two major categories: research data and other scholarly outputs. these two categories embody the repository’s comprehensive nature and intention.
by having a single repository that can gather, publish, and link these types of materials together, researchers have a single tool to collect and disseminate the outputs of their research endeavors. the scottish tartan museum defines a tartan as a pattern of interlocking stripes, running both in the warp and weft (vertical and horizontal patterns) within cloth [ ]. just as the warp and weft are woven together by a skilled professional, the university libraries and the repository can help a researcher weave their own warp and weft, in the form of their research data and other scholarly outputs, to produce their own research narrative. as tartans have come to represent scottish clans, the research narrative produced by a researcher represents their own professional narrative. because of this, and to continue its scottish alignment, kilthub has adopted the phrase “weaving the fabric of your research” to convey the value and service it provides to the cmu community.

scholarly communications is shifting. the repository is no longer a tool that has to be a separate silo, or that uses workflows and processes separate from those employed directly by the researcher as they progress through the research lifecycle. the repository was designed and implemented to avoid being seen as a “roach motel,” as dorothea salo described [ ]. instead, it has become an active component of the research lifecycle offered by cmu. it can be the loom that allows the researcher to bring together their work, and make it publicly available when it might otherwise not be. likewise, the repository is the tool that weaves together both technical needs and service-based expertise and knowledge. repositories are not just a technical enterprise. they are a multifaceted sociotechnical endeavor that is drawn from its community and serves as a representation of the professionals that provide them as a service [ ].
in these ways, the repository can become a foundational piece of the research lifecycle, serving as the mechanism to accomplish several goals and to facilitate the beginning and conclusion of other lifecycle stages. by doing so, the repository may come to be seen as one of the strongest tools available to researchers as the roles and dialectics of repositories continue to expand.

author contributions: d.s.: conceptualization, project administration, supervision, visualization, writing – original draft, writing – review and editing. d.v.: conceptualization, data curation, software, writing – original draft, writing – review and editing.

funding: this research received no external funds.

acknowledgments: the authors would like to thank their colleagues from the carnegie mellon university libraries, digital science, and figshare for their contributions, review, and collaboration.

conflicts of interest: the authors declare no conflicts of interest.

references

smith, m.; barton, m.; bass, m.; branschofsky, m.; mcclellan, g.; stuve, d.; tansley, r.; walker, j.h. dspace: an open source dynamic digital repository. d-lib mag. , , . available online: http://www.dlib.org/dlib/january /smith/ smith.html (accessed on november ). [crossref]
jain, p. new trends and future applications/directions of institutional repositories in academic institutions. libr. rev. , , – . [crossref]
witt, m. co-designing, co-developing, and co-implementing an institutional data repository service. j. libr. adm. , , – . [crossref]
tenopir, c.; hughes, d.; allard, s.; frame, m.; birch, b.; baird, l.; sandusky, r.; langseth, m.; lundeen, a. research data services in academic libraries: data intensive roles for the future? j. esci. librariansh. , , – . [crossref]
crow, r. the case for institutional repositories. arl bimon. rep. , . available online: http://www.sparc.arl.org/sites/default/files/media_files/instrepo.pdf (accessed on november ).
lynch, c.
institutional repositories: essential infrastructure for scholarship in the digital age. arl bimon. rep. , . available online: http://old.arl.org/resources/pubs/br/br /br ir.shtml (accessed on november ).
research data alliance data foundation and terminology working group. data repository. term definition tool. available online: https://smw-rda.esc.rzg.mpg.de/index.php?title=data_repository (accessed on april ).
the carnegie classification of institutions of higher education. carnegie mellon university, . available online: http://carnegieclassifications.iu.edu/lookup/lookup.php (accessed on january ).
carnegie mellon university factsheet. available online: https://www.cmu.edu/assets/pdfs/cmufactsheet.pdf (accessed on december ).
scherer, d.; zilinski, l.; valen, d. balancing multiple roles of repositories: developing a comprehensive institutional repository at carnegie mellon university. in proceedings of the open repositories conference, montana state university, bozeman, mt, usa, – june . [crossref]
fenner, m. figshare interview with mark hahnel. available online: https://blogs.plos.org/mfenner/ / / /figshare-interview-with-mark-hahnel/ (accessed on february ).
thainey, k. welcoming figshare, an open data project, to the digital science family. available online: https://www.digital-science.com/blog/news/welcoming-figshare-an-open-data-project-to-the-digital-science-family/ (accessed on february ).
steele, c. open access in australia: an odyssey of sorts? insights , , – . [crossref]
stebbins, m.
expanding public access to the results of federally funded research. available online: https://obamawhitehouse.archives.gov/blog/ / / /expanding-public-access-results-federally-funded-research (accessed on february ).
the figshare knowledge portal. available online: https://knowledge.figshare.com (accessed on february ).
the research data alliance. available online: https://www.rd-alliance.org/ (accessed on february ).
force . available online: https://www.force .org/ (accessed on february ).
coretrust seal data repository certification. available online: https://www.coretrustseal.org/ (accessed on february ).
hahnel, m. mission statement & core beliefs. available online: https://knowledge.figshare.com/articles/item/mission-statement-and-core-beliefs (accessed on february ).
figshare api documentation. available online: https://docs.figshare.com (accessed on february ).
hahnel, m. figshare orcid integration. available online: https://figshare.com/blog/figshare_orcid_integration/ (accessed on february ).
carnegie mellon university archives. available online: https://library.cmu.edu/find/unique/archives (accessed on december ).
knowvation. available online: http://www.ptfs.com/knowvation (accessed on december ).
scherer, d.; corrin, j. (carnegie mellon university libraries). personal communication, .
research showcase. available online: https://repository.cmu.edu (accessed on july ).
journal of privacy and confidentiality. volume , no . available online: https://journalprivacyconfidentiality.org/index.php/jpc/issue/view/ (accessed on december ).
journal of privacy and confidentiality. about the journal. available online: https://journalprivacyconfidentiality.org/index.php/jpc/about (accessed on december ).
yoon, a.; schultz, t. research data management services in academic libraries in the us: a content analysis of libraries’ websites. coll. res. libr. , , – . [crossref]
university of minnesota libraries.
the supporting documentation for implementing the data repository for the university of minnesota (drum): a business model, functional requirements, and metadata schema. . available online: http://hdl.handle.net/ / (accessed on march ).
holden, j. memorandum for the heads of executive departments and agencies. executive office of the president, office of science technology policy. available online: https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_ .pdf (accessed on january ).
cohen, s.; deverts, d.; doyle, w.j. aggregated cold studies (studies – ) . available online: https://doi.org/ . /rdl/ (accessed on december ).
carnegie mellon university. cmu strategic plan : creating a st century library. available online: https://www.cmu.edu/strategic-plan/strategic-recommendations/ st-century-library.html (accessed on march ).
https://obamawhitehouse.archives.gov/blog/ / / /expanding-public-access-results-federally-funded-research https://obamawhitehouse.archives.gov/blog/ / / /expanding-public-access-results-federally-funded-research https://knowledge.figshare.com https://www.rd-alliance.org/ https://www.force .org/ https://www.coretrustseal.org/ https://knowledge.figshare.com/articles/item/mission-statement-and-core-beliefs https://knowledge.figshare.com/articles/item/mission-statement-and-core-beliefs https://docs.figshare.com https://figshare.com/blog/figshare_orcid_integration/ https://figshare.com/blog/figshare_orcid_integration/ https://library.cmu.edu/find/unique/archives http://www.ptfs.com/knowvation https://repository.cmu.edu https://journalprivacyconfidentiality.org/index.php/jpc/issue/view/ https://journalprivacyconfidentiality.org/index.php/jpc/issue/view/ https://journalprivacyconfidentiality.org/index.php/jpc/about https://journalprivacyconfidentiality.org/index.php/jpc/about http://dx.doi.org/ . /crl. . . http://hdl.handle.net/ / https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_ .pdf https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_ .pdf https://doi.org/ . /rdl/ https://doi.org/ . /rdl/ https://www.cmu.edu/strategic-plan/strategic-recommendations/ st-century-library.html publications , , of . johnston, l. data repositories: the answer that actually came with a question. university of massachusetts and new england area librarian e-science symposium, . available online: https://doi.org/ . / tfak-y (accessed on march ). . kellen, c.s. research data repository and digital collections—overview and plan; internal report; cmu’s institutional repository, carnegie mellon university: pittsburgh, pa, usa, . . about figshare. available online: https://figshare.com/about (accessed on march ). . figshare tools. available online: https://figshare.com/tools (accessed on march ). . 
figshare features. available online: https://figshare.com/features (accessed on march ). . figshare for institutions. available online: https://knowledge.figshare.com/institutions (accessed on march ). . lagzian, f.; abrizah, a.; wee, m.c. critical success factors for institutional repositories implementation. electron libr. , , – . [crossref] . johnston, l.r.; carlson, c.r.; hswe, p.; hudson-vitale, c.; imker, h.; kozlowski, w.; olendorf, r.k.; stewart, c. data curation network: how do we compare? a snapshot of six academic library institutions’ data repository and curation services. j. esci. librariansh. , , – . [crossref] . carnegie mellon university partners with digital science to create st century library. available online: https://library.cmu.edu/about/publications/news/digital-science-partnership (accessed on december ). . austen, c.c.; bloom, t.; dallmeier-tiessen, s.; khodiyar, v.k.; murphy, f.; nurnberger, a.; raymond, l.; stockhause, m.; tedds, j.; vardigan, m.; et al. key components of data publishing: using current best practices to develop a reference model for data publishing. int. j. digit. libr. , , – . [crossref] . scherer, d. incentivizing them to come: strategies, tools, and opportunities for marketing an institutional repository. in making institutional repositories work, st ed.; callicott, b., scherer, d., wesolek, a., eds.; purdue university press: west lafayette, in, usa, ; pp. – . . novak, j.; day, a. the ir has two faces: positioning institutional repositories for success. j. acad. librariansh. , , – . [crossref] . lynch, c. foreward: a few reflection on the evolution of institutional repositories. in making institutional repositories work, st ed.; callicott, b., scherer, d., wesolek, a., eds.; purdue university press: west lafayette, in, usa, ; pp. xi–xiii. . introducing kilthub. available online: https://www.library.cmu.edu/about/publications/news/introducing- kilthub (accessed on december ). . about the kilthub repository. 
available online: https://library.cmu.edu/kilthub/about (accessed on december ). . cullen, r.; chawner, b. institutional repositories, open access, and scholarly communication: a study of conflicting paradigms. j. acad. librariansh. , , – . [crossref] . lee, d.j.; stvilia, b. practices of research data curation in institutional repositories: a qualitative view from repository staff. plos one , , – . [crossref] [pubmed] . the council on library and information resources (clir) postdoctoral fellowship program. available online: https://www.clir.org/fellowships/postdoc/ (accessed on april ). . raboin, r.; reznik-zellen, r.c.; salo, d. forging new service paths: institutional approaches to providing research data management services. j. esci. librariansh. , , – . [crossref] . force . the fair data principles. available online: https://www.force .org/group/fairgroup/fairprinciples (accessed on april ). . kilthub deposit guide. available online: https://libwebspace.library.cmu.edu/libraries-and-collections/ kilthub_deposit_guide.pdf (accessed on april ). . . —australian and new zealand standard research classification (anzsrc). . available online: http://www.abs.gov.au/ausstats/abs@.nsf/lookup/ . main+features (accessed on april ). . next generation repositories: behaviours and technical recommendations of the coar next generation repositories working group. available online: https://www.coar-repositories.org/files/ngr-final- formatted-report-cc.pdf (accessed on february ). . splawa-neyman, p. what makes a successful ir? literature review. in proceedings of the figshare advisory board meeting, melbourne, australia, march . [crossref] . luther, j. the evolving institutional repository landscape. acrl/choice report, , . available online: https://www.research.net/r/pnr vqp (accessed on december ). https://doi.org/ . /tfak-y https://doi.org/ . 
/tfak-y https://figshare.com/about https://figshare.com/tools https://figshare.com/features https://knowledge.figshare.com/institutions http://dx.doi.org/ . /el- - - http://dx.doi.org/ . /jeslib. . https://library.cmu.edu/about/publications/news/digital-science-partnership http://dx.doi.org/ . /s - - - http://dx.doi.org/ . / . . https://www.library.cmu.edu/about/publications/news/introducing-kilthub https://www.library.cmu.edu/about/publications/news/introducing-kilthub https://library.cmu.edu/kilthub/about http://dx.doi.org/ . /j.acalib. . . http://dx.doi.org/ . /journal.pone. http://www.ncbi.nlm.nih.gov/pubmed/ https://www.clir.org/fellowships/postdoc/ http://dx.doi.org/ . /jeslib. . https://www.force .org/group/fairgroup/fairprinciples https://libwebspace.library.cmu.edu/libraries-and-collections/kilthub_deposit_guide.pdf https://libwebspace.library.cmu.edu/libraries-and-collections/kilthub_deposit_guide.pdf http://www.abs.gov.au/ausstats/abs@.nsf/lookup/ . main+features https://www.coar-repositories.org/files/ngr-final-formatted-report-cc.pdf https://www.coar-repositories.org/files/ngr-final-formatted-report-cc.pdf http://dx.doi.org/ . /m .figshare. .v https://www.research.net/r/pnr vqp publications , , of . symplectic elements data sources. available online: https://symplectic.co.uk/products/elements- /data- sources/ (accessed on april ). . sherpa/romeo: about sherpa. available online: http://sherpa.mimas.ac.uk/romeo/about.php?la=en& fidnum=\t \textbar{}&mode=simple (accessed on april ). . clements, a. research information meets research data management . . . in the library? insights , , – . [crossref] . joint, n. current research information systems, open access repositories and libraries: antaeus. libr. rev. , , – . [crossref] . kramer, b.; bosman, j. innovations in scholarly communication—the changing research workflow. in proceedings of the force meeting, university of oxford, oxford, uk, – january . [crossref] . kramer, b.; bosman, j. 
© by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /).
the prague spring archive at the university of texas at austin

ian goodale
ut libraries, the university of texas at austin, austin, texas, usa

journal of web librarianship, issn: - (print) - (online). journal homepage: http://www.tandfonline.com/loi/wjwl

to cite this article: ian goodale ( ) the prague spring archive at the university of texas at austin, journal of web librarianship, : - , - , doi: . / . .
to link to this article: https://doi.org/ . / . .
published online: nov .

article history: received march ; accepted august

abstract

few primary historical documents related to the prague spring, one of the key moments in the cold war, are available online. there is therefore a need for an online resource that presents these primary documents in their entirety, allowing researchers and the public to engage with the materials in a user-friendly, open access format. the prague spring archive, a new project at the university of texas at austin, fills this gap. this article addresses the development, promotion, and future steps of the project.

keywords: digital archives; digital humanities; digital scholarship; east european studies; global studies; metadata; slavic studies

background

the prague spring archive project is a collaboration between the university of texas at austin libraries and the center for russian, east european, and eurasian studies at ut austin, using documents from the lyndon b. johnson presidential library. the project makes important primary documents on the prague spring openly accessible, giving the public and academics greater access to the documents through an online portal created in scalar, an online publishing platform from the alliance for networking visual culture.

the university of texas at austin has the nation’s fifth largest academic library, with over million volumes and library locations (the lyndon b. johnson presidential library n.d.). the university also has over colleges and schools served by these libraries, with more than , teaching faculty and over , students from more than degree programs. the perry-castañeda library is the main library within the university of texas system, and contains major holdings in a variety of subject areas, including humanities and social sciences.
the center for russian, east european, and eurasian studies (creees) was established at the university of texas at austin in and now includes over faculty members from over different departments and administrative units across campus (the university of texas at austin n.d.). creees is committed to reaching out to the campus community, as well as the broader region, to provide access to speakers from russia, eastern europe, and eurasia, and activities that will promote interest in the region. as part of the largest university in the state of texas, the center has a special responsibility to support continued international development and to educate students who can play a fundamental role in an international community in which russia, eastern europe, and eurasia are critical players.

contact: ian goodale, iangoodale@utexas.edu, european studies and digital scholarship librarian, the university of texas at austin, e. st st., pcl . l, austin, tx , usa.
color versions of one or more of the figures in this article can be found online at www.tandfonline.com/wjwl.
© ian goodale

situated on the university of texas at austin campus, the lyndon b. johnson presidential library houses million pages of historical documents, , photos, and , hours of recordings from lyndon b. johnson’s political career, including about hours of his recorded telephone conversations (the lyndon b. johnson presidential library n.d.). some collections (both textual and audiovisual) have been digitized for the web, but many more are only available for research in person at the library. the lbj presidential library is part of a system of presidential libraries administered by the national archives and records administration.
scalar was chosen for the project due to its reputation as a platform for scholarly publishing in a digital format. the platform’s tools made it easy to replicate, in digital form, the experience of navigating through a physical archive, maintaining the archive’s original structural integrity. it also allowed for the seamless integration of additional features exclusive to the digital portal for the archive (figure ). key documents and figures are curated and highlighted to aid research, an interactive timeline (figure ) was created to introduce the basic structure of the prague spring crisis to those unfamiliar with its history, and images from the lyndon b. johnson library’s photographic archive are paired with other site content to provide a visual reference for figures and events mentioned in the archival documents. educational activities for high school students are in development to further outreach with the archival materials, including activities that could easily integrate into curricula on the cold war. for researchers who would like to explore what is available in the physical collections of the lbj presidential library, the finding aid for the entire archival collection is also available on the site.

the prague spring was a period of political liberalization in communist czechoslovakia that marked a significant point of resistance against the soviet regime, and which was eventually suppressed by an invasion of warsaw pact forces. it is important to researchers examining the cold war as an example of an eastern bloc country undergoing a process of peaceful revolution from within, as well as a case study in the militaristic suppression of such a revolution by the soviet union.
the crisis would foreshadow the eventual fall of the soviet union, and the documents in the project have still broader appeal as examples of how the united states government developed political strategy in response to both the initial liberalization and the soviet union’s harsh repression. the declassified primary documents we have digitized reveal u.s. attitudes toward communist governments, its attempts to further u.s. interests in east europe, and its approaches to public relations and diplomacy in the face of a perceived communist threat.

work on the project first began in . with funding from a u.s. department of education title vi national resource centers grant and the texas chair in czech studies, digitization work on an initial selection of archival boxes was completed by undergraduate students from the creees and msis graduate students working at the university of texas libraries. digitization work is ongoing, with new materials being photographed, processed, and added to texas scholarworks, the institutional repository at the university of texas at austin, by msis graduate student nicole marino and the author, the european studies and digital scholarship librarian.

to help maintain the archival integrity of the materials in their digitized format, extensive metadata was created to accompany the documents within the texas scholarworks repository. the metadata allows researchers working with the materials within texas scholarworks to easily search the documents, and can be downloaded by anyone through the repository. the metadata was prepared using excel spreadsheets to assist with batch uploads to the repository, with rows corresponding to individual documents and columns corresponding to various dublin core metadata elements that would become searchable once online.
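the spreadsheet layout described above can be illustrated with a minimal python sketch. it assumes hypothetical column names (a filename identifier plus dublin core-style fields for title, date created, subject, description, type, creator, and language) and an assumed "||" separator for multiple subject headings. the article does not give the repository's actual template, and the sample record is a placeholder invented for illustration.

```python
import csv
import io

# Hypothetical column names: a filename identifier plus Dublin Core-style
# fields. The repository's real batch-upload template may differ.
COLUMNS = [
    "filename", "dc.title", "dc.date.created", "dc.subject",
    "dc.description", "dc.type", "dc.creator", "dc.language",
]


def build_batch_sheet(records):
    """Return CSV text with one row per document and one column per element."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = dict(record)
        # Several subject headings share one cell, joined with "||"
        # (an assumed separator, not one documented in the article).
        row["dc.subject"] = "||".join(record.get("dc.subject", []))
        # Refuse incomplete rows so every document is searchable on every field.
        missing = [column for column in COLUMNS if not row.get(column)]
        if missing:
            raise ValueError(f"{record.get('filename', '?')}: missing {missing}")
        writer.writerow(row)
    return buffer.getvalue()


# Placeholder record, invented purely for illustration.
sample = {
    "filename": "box01_folder01_doc001",
    "dc.title": "example memorandum",
    "dc.date.created": "1968",
    "dc.subject": ["example heading one", "example heading two"],
    "dc.description": "placeholder description",
    "dc.type": "correspondence",
    "dc.creator": "example creator",
    "dc.language": "en",
}

sheet = build_batch_sheet([sample])
print(sheet)
```

the completeness check is an illustrative design choice rather than a documented step of the project's workflow: it reflects the idea that a batch upload is only searchable field by field when every row fills every column.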
these dublin core elements included the title, date created, a number of subject fields with library of congress standardized subject headings, a description, the type of the document (e.g., “correspondence”), its creator, and the language of the document. the project team also included non-dublin core filename identifiers, which were used by staff in the digitization department and the institutional repository to link the uploaded files to their respective metadata. full text of the documents will soon be added in xml format to accompany the archival pdfs, increasing searchability and providing an additional resource for working with the documents, for example by making digital humanities practices such as text mining or sentiment analysis easier to accomplish. the xml files are generated through the optical character recognition (ocr) program docworks, and are monitored for accuracy and edited as needed by msis graduate students working with the university of texas libraries.

promoting the archive

a variety of strategies were used to facilitate discovery of the archive online. the author sent e-mails to listservs read widely by scholars and librarians in subject areas relevant to the project, which succeeded in raising its profile. after publicizing the project in this way, the author was contacted by researchers who invited him to publish an additional write-up in two academic newsletters, czech language news and aseees newsnet, which further broadened the audience for the project. links to the project were also added to libguides administered by librarians at a number of institutions in the united states and canada, expanding the reach of the project. as a result, the project has been advertised through channels the author did not set up himself, carrying it to a still wider audience.
it has also allowed the project team to build relationships with scholars working in relevant fields, as well as librarian colleagues.

the author contributed write-ups to raise the project’s profile on his home campus as well. mary neuburger, the director of the creees, spoke with joan neuburger, the director of the not even past (nep) website, about publicizing the project there. to that end, mary and the author collaborated on an article describing the project’s timeline, scope, and goals for the future that was featured on the nep site. in addition, the author wrote a description of an important document from the archival collection that was likewise published on the nep website. this write-up gave historical context for the document, providing background on the importance of the collection as a whole, while simultaneously foregrounding the project and further increasing interest in its materials.

the author also contributed a write-up for the university of texas libraries’ website. this helped raise the profile of the project on campus, and provided the communications staff at the perry-castañeda library with information that could be used to advertise the project elsewhere. through this connection with the communications staff, the project was later featured on the library journal’s infodocket, which in turn led to further coverage on mit’s hcd insights page. in addition, the author added links to the portal to his libguides, integrating the resource into his own outreach to students.

figure . the homepage of the prague spring archive.

the author was active in promoting the project on social media. the author promoted the project using his professional twitter account, providing links to the project and tagging appropriate parties in his tweets.
by mentioning scalar, the author was able to garner attention from the official account of the platform, which retweeted the link and extended the project’s reach. the author also worked with a faculty member to post about the project in a facebook group for slavic digital humanities, which broadened its audience further. this outreach on twitter led to contact and discussions of future collaboration with socialism realised, an online project aimed at providing an understanding of people’s daily lives under communist governments, whose administrators reached out to the author after seeing one of his tweets. the publicity generated online also enabled the author to more easily make the case for collaboration with baylor university’s keston center, whose staff agreed to digitize relevant archival materials for the project team as a way to supplement the prague spring archive’s online collection of documents. the team plans to link to these newly digitized materials, which will be hosted by baylor, from the prague spring archive’s online portal. this will benefit both institutions by increasing access to their collections and broadening their impact with researchers looking for primary source materials online.

figure . an interactive timeline of the events preceding, during, and following the prague spring. this particular slide includes an embedded video of dubček pledging czechoslovak independence.

the project’s online presence and mention in other outlets also led to the project team being contacted by staff from the national czech and slovak museum, who are planning to use a document from the collection in an upcoming curriculum for public school students in iowa. this will lead not only to direct use of the materials by students studying east european history, but also to a general broadening of the project’s audience by expanding its profile outside of academia.
furthermore, the team hopes to develop its own educational materials for use by public school educators in the future, which will be made available freely on the project’s online portal and similarly increase its reach. mary neuburger will use the project again in a future iteration of her graduate seminar, as well, which will introduce the materials to yet another new class of graduate students and researchers on campus.

figure . a page highlighting key documents from box , folder in the prague spring archive, with text contributed by graduate students.

building the archive

librarians have been increasingly involved in digital humanities projects that serve both a pedagogical purpose and a research purpose (varner ). this project was a conscious attempt to increase the active participation of the libraries at the university of texas at austin with such projects on campus, and to build relationships with departments and nearby cultural heritage institutions in the process. this type of relationship building has been identified as a key value for cultural heritage institutions working on digital projects in the past (rizzo ), and is something that the project team wanted to cultivate as a way to strengthen its ties to other institutions, simultaneously enhancing the prague spring archive and helping other institutions.

as a result of this effort, the prague spring site has been an important aspect of embedded librarianship at the university of texas libraries. the author worked with graduate students in a graduate seminar taught by mary neuburger, ree : russian, east european, and eurasian civilizations and cultures, to have the students contribute text for incorporation into the online portal, visiting multiple class sessions to teach about the project and serve as a contact for digital scholarship and metadata-related questions the students might have.
the students also selected key documents from archival folders to be highlighted on the portal (figure ), and provided input on the site’s design and features throughout its development. professors mary neuburger and vlad beronja contributed their input on design and content, helping to write descriptions of archival materials and select key documents to profile. the finished portal was then presented to the class for additional feedback, and more content from future iterations of the class will be added shortly. this collaborative approach has been used in digital archives projects in the past (norcia ).

students use the materials as an integral part of the graduate seminar, going through metadata created by librarians and graduate student research assistants at the library and selecting their documents. this assignment was created to serve multiple purposes, as it increases their familiarity with the prague spring events, contributes to the development of content for the online archive, and enables them to gain familiarity with metadata standards and experience working on a digital project. this increased their information literacy while giving them experience working with primary archival sources, and allowed them to gain experience navigating the materials in the university of texas’s institutional repository, as well. this likewise served to strengthen the libraries’ relationship with the community of scholars and students on campus while simultaneously introducing new graduate students to possible areas of interest for their research.

students will use the archive’s materials for assignments in future classes, working with professors and librarians to contribute content to the site and identify documents of interest to their personal research in the archive. the project team hopes that through publicizing and promoting the archive, the materials it contains will be of use not only to students, but to researchers with interest in these documents who will be able to easily access the materials without traveling to the physical archives in austin.

the author also carried out the entirety of the web design component of the project as a way to expedite and simplify the process of the portal’s creation. the author was able to incorporate his skills in photoshop, html and css, and web design to create an attractive, easy-to-use portal that could be effectively tested with users for further refinement. early prototypes of the site were shown to students in a graduate seminar for feedback, which was incorporated into its final iteration. faculty input from mary and vlad was also helpful and incorporated into the website’s final design.

an important debate in the field of digital humanities is the way its utilization of technologies noted for their ease of access and participation relates to the idea of democratizing the humanities (hunter ). scalar was chosen in part due to its ease of access and possibilities for open collaboration between graduate students and librarians, so the possibilities for democratization and equal participation in digital humanities work were an important consideration for the project team during the project’s conception. the development plan for the project was intentionally constructed as a collaborative process, with a division of labor that best suited the strengths of all the project’s collaborators. msis graduate students worked on creating metadata that conformed to an agreed upon set of standards, which ensured that metadata created for the project’s items remained as uniform as possible.
maintaining a standardized language and format was central to the team’s approach to creating effective metadata: keeping the metadata uniform ensured that every item in the project remained equally accessible. digitization work was carried out by a small number of graduate students who were trained by staff in the digital stewardship department of the libraries. effective project management ensured the quality of the content being captured and processed while simultaneously streamlining and equitably dividing labor. graduate students in the department of russian and east european studies helped identify key documents and provide text that was later adapted for inclusion on the website.

collaboration with the institutional repository has also been a key aspect of institutional support for the project. the author has worked closely with both the digitization department of the libraries and the managers of the repository to ensure the successful completion of the backend of the project, and they have helped spread the word about the work to others, which has further broadened the project’s audience. the repository provides a stable, institution-specific solution to document hosting that the prague spring archive portal can easily interface with.

interface design

the author wrote custom html to alter the appearance of elements in the project’s pages, allowing him to take advantage of scalar’s extensive options for customization. the author altered the positioning of navigational elements on the page, creating a custom interface separate from scalar’s built-in navigational options.
the author’s custom series of clickable images, with text overlaid on historical photographs available either in the public domain or under a creative commons license, both increased the visual appeal of the homepage and provided users with a readily comprehensible, visual interface with which to navigate the site. this increased usability, allowing users to utilize built-in navigation menus (accessible by clicking a thumbnail in the corner of the screen) or the site’s custom buttons, available directly on the homepage, to navigate the site according to their preference.

the author used photoshop to create customized header images for each page, utilizing images in the public domain or available under a creative commons license as material that could be edited as necessary. to improve the appearance of the banners within scalar’s interface, images were cropped, brightness and contrast were altered, text was added, and sections of the images were separated into their own layer within photoshop. this separation allowed for the altering of these sections to improve the appearance of the text placed over them, namely by lightening and altering the contrast on these sections to make the text appear more clearly. the clickable images used as navigation buttons on the homepage were created in photoshop using a similar process.

enhancing searchability

adding full text of the documents in xml format will increase accessibility of the documents by making their full text searchable in the university of texas’s institutional repository. the xml files are being generated by running digitized copies of the documents through the ocr program docworks, and this work is still underway. a graduate student is manually correcting the ocr text generated by the program, then exporting the clean xml files to a local server, where they will live until they can be uploaded to the online repository.
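before the corrected transcripts are uploaded, it may help to confirm that every archival pdf has an xml file staged for it. the sketch below is one way to do that in python, assuming (hypothetically) that a pdf and its transcript share a filename stem. the article does not describe the project's naming scheme, and the filenames shown are placeholders.

```python
from pathlib import PurePath


def pair_transcripts(pdf_names, xml_names):
    """Match each PDF to the XML transcript that shares its filename stem.

    Returns (pairs, unmatched): a dict mapping PDF name to XML name, and a
    list of PDFs that have no staged transcript yet.
    """
    xml_by_stem = {PurePath(name).stem: name for name in xml_names}
    pairs = {}
    unmatched = []
    for pdf in pdf_names:
        xml = xml_by_stem.get(PurePath(pdf).stem)
        if xml is None:
            unmatched.append(pdf)
        else:
            pairs[pdf] = xml
    return pairs, unmatched


# Placeholder filenames, invented for illustration only.
pdfs = ["box01_folder01_doc001.pdf", "box01_folder01_doc002.pdf"]
xmls = ["box01_folder01_doc001.xml"]

pairs, unmatched = pair_transcripts(pdfs, xmls)
print("ready to link:", pairs)
print("still awaiting a transcript:", unmatched)
```

a report of unmatched pdfs gives whoever is correcting the ocr output a simple worklist, though the project may track this differently.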
once in the repository, they will be linked to the archival pdfs of their respective documents and directly accessible through their respective documents’ pages.

the author felt it was important to use a free, open source tool like scalar for the project, as supporting such tools by using them for large, institutionally-supported projects both strengthens their profiles and serves his institution’s commitment to supporting open information in digital scholarship. the project team seeks, by utilizing such technologies, to open the physical archive and make its information freely, easily accessible to all, regardless of whether they are able to visit the physical repository where the actual archival documents reside. as such, the team hopes to join a community of librarians, scholars, and researchers who seize the opportunities digital humanities and open access provide for making knowledge freely available to as broad an audience as possible (suber ).

future directions

the project was somewhat experimental, but inspired the project team both to continue its own development and to explore future directions for digital initiatives within the university of texas libraries. one of the key aspects of the project’s success was the implementation of collaborative workflows combined with having one person hone in on specific elements of the project (e.g., the author took control of web design and metadata creation, while msis candidate nicole marino performed digitization work). it was vital to keep work on the project collaborative, while allowing individuals to explore their individual strengths and successfully apply themselves to areas on which they could singularly focus. this approach will continue to be implemented in this project as it develops, and will be applied in other projects at the libraries as well.
one unsuccessful strategy implemented early on in the project’s lifespan was the division of metadata creation among different graduate students. despite the standardized format of the metadata, individual differences in writing styles for the item descriptions and titles resulted in metadata that required some correction to make fully uniform. for this reason, the author plans to either create the metadata on his own or to have one student under his direct supervision create the metadata to the project’s specifications in the future.

this project stands out due to its collaborative nature across institutions; its integration of instruction and work between professors, librarians, and graduate students; and its appeal to scholars working in a variety of areas. the work undertaken by the graduate students not only contributed content to the site, but was designed to increase their familiarity and literacy with metadata formats and working directly with primary documents. the subject of information literacy was considered especially important due to the seminar being composed entirely of first-year graduate students, whose work directly impacted the content of the online portal.

this project also stands out in that while it was conceived of and worked on primarily by slavists, its scope is broad enough to be of interest to researchers working in a variety of other areas, such as history or political science. while many digital projects have been carried out in the humanities, the team sees the prague spring archive as both a humanities project and one that, while multidisciplinary, also fits into the growing number of slavic digital humanities projects being undertaken by research libraries (trehub ). while these projects represent a significant step forward in the development of digital projects in the slavic field, only one other project specifically addresses primary documents from the cold war (lawrence ), and none focus on the prague spring.
while projects such as the wilson center archive reference documents from u.s. sources related to the cold war, none make a specific corpus of such primary sources available online as this project does (remnek ). thus, the prague spring archive project represents an addition to an existing lineage of diverse slavic digital humanities projects (digital humanities in the slavic field ), as well as a unique contribution to the field.

about the author

ian goodale is the european studies and digital scholarship librarian at the university of texas at austin. he is a member of the collection development subcommittee within the association for slavic, east european and eurasian studies' (aseees) committee on libraries and information resources. he is interested in the intersections of librarianship, digital humanities, archives, and ux/ui design.

orcid

ian goodale http://orcid.org/ - - -

references

austin, the university of texas at. n.d. facts and figures. accessed march , . https://www.utexas.edu/about/facts-and-figures.

digital humanities in the slavic field. . digital humanities in the slavic field. accessed july . http://www.slavic-dh.org/directory/.

hunter, a. . the digital humanities and democracy. canadian journal of communication .

lawrence, m. a. . cold war international history project digital archive. journal of american history : – . doi: . /jahist/jat .

norcia, m. a. . out of the ivory tower endlessly rocking: collaborating across disciplines and professions to promote student learning in the digital archive. pedagogy : – .

remnek, m. b. . access to east european and eurasian culture: publishing, acquisitions, digitization, metadata. binghamton, ny: haworth information press.

rizzo, m. . history at work, history as work: public history’s new frontier. american quarterly : – . doi: . /aq. . .

suber, p. . open access. cambridge, ma: mit press.

the lyndon b. johnson presidential library. n.d. about. accessed march , . https://www.lbjlibrary.org/page/library-museum/.

the university of texas at austin. n.d. the center for russian, east european, and eurasian studies. accessed march , . https://www.liberalarts.utexas.edu/slavic/creees/about-creees.php.

the university of texas libraries. n.d. about the libraries. accessed march , . https://www.lib.utexas.edu/about.

trehub, a. . ‘slavic studies and slavic librarianship’ revisited: notes of a former slavic librarian. slavic & east european information resources .

varner, s. . library instruction for digital humanities pedagogy in undergraduate classes. in laying the foundation: digital humanities in academic libraries, ed. john w. white and heather gilbert, . west lafayette, in: purdue university press.

new criteria for new media

jon ippolito, joline blais, owen f. smith, steve evans and nathan stormer

jon ippolito (artist, professor), chadbourne hall, the university of maine, orono, me - , u.s.a. e-mail: .
joline blais (writer, educator), chadbourne hall, university of maine, orono, me - , u.s.a. e-mail: .
owen f. smith (educator, artist), chadbourne hall, university of maine, orono, me , u.s.a. e-mail: .
steve evans (educator, literary critic), national poetry foundation, neville hall, university of maine, orono, me , u.s.a.
e-mail: .
nathan stormer (educator), dunn hall, university of maine, orono, me , u.s.a. e-mail: .

part : introduction by jon ippolito

© jon ippolito

in , benjamin weil curated an exhibition for london’s institute for contemporary art called web classics. the title was both ironic—the web had only been around for years at the time—and prophetic. weil, a co-founder of the influential site ada’web and later curator at the san francisco museum of modern art, once opined that every calendar year corresponded to three web years. weil was right that internet art has grown up quickly, at least to judge from the frequency of e-mails popping into my inbox from masters’ and ph.d. students researching ada’web and its contemporaries. in recognition of the speedy maturation of networked media, a new generation of fledgling new media scholars—and an aging generation of digital trailblazers—will soon establish a tenured foothold in academic departments worldwide.

or will they? the university, an institution that dates back to the th century bc, operates by calendar years rather than web years, and academic review committees still expect candidates for promotion and tenure to hand them stacks of books and periodicals rather than a list of urls.

nevertheless, i hope that being a new media scholar means more than publishing books with the word “digital” or “internet” in the title. marxism and feminism were also revolutionary discourses, but they failed to change the way history and other academic disciplines do business. by that i mean that even in universities where marxism or feminism influence scholarship, the broadcast paradigms are still in place: professors “instructing” students, scholars competing for publication in prestigious journals, attention-constraining media such as print and powerpoint enforcing the one-way flow of information.
new media hold out the promise of toppling these behavioral hierarchies, rather than merely changing the subjects taught according to them. whether this effort succeeds will depend on whether we, as a group of scholars and activists, can point out the hypocrisy of preaching decentralization from powerpoint slides or closed-access journals and investigate and contribute to networked modes of sharing knowledge.

consider scholarly publication, for example. books and print journals do have some advantages over virtual ink. for one thing, paper is much more backward compatible; it is easier to find a university library with a century-old book than a working floppy drive. but research universities are supposed to represent the future as well as the past, and the future is about connecting rather than storing knowledge.

fortunately, new media offer plenty of ways for scholars to connect. thoughtmesh [ ], a project craig dietrich and i have developed for the still water network for art and culture at the university of maine and university of southern california’s vectors program, gives readers a tag-based navigation system that uses keywords to connect excerpts of essays published on different web sites. for example, the reader of an essay on modern art can pick a single term out of that essay’s tag cloud, such as “nam june paik” and view a list of all the sections from that essay that relate to paik. or one can view a list of sections of other articles tagged with “nam june paik” and jump right to one of those sections. one can also combine tags to narrow the search: “nam june paik” + “fluxus” + “ .” related efforts include still water research fellow john bell’s distributed publication system re:paik [ ], which allows scholars and critics to ferret out and share contemporary signs of the legacy of this “grandfather of video art” in everything from museum exhibitions to pop music.
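the tag-combination step described above for thoughtmesh can be illustrated with a small sketch. this is not thoughtmesh's actual code or data model, which the text does not describe; it only assumes that each excerpt carries a set of tags and that combining tags means keeping the excerpts that carry all of them. the site names, section ids, tags, and function names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: each excerpt is identified by (site, section)
# and carries a set of author-assigned tags. None of these identifiers
# come from ThoughtMesh itself.
EXCERPTS = {
    ("site-a/modern-art", "section-1"): {"nam june paik", "video art"},
    ("site-a/modern-art", "section-3"): {"nam june paik", "fluxus"},
    ("site-b/intermedia", "section-2"): {"fluxus", "george maciunas"},
    ("site-c/tv-sculpture", "section-1"): {"nam june paik", "fluxus", "television"},
}


def build_tag_index(excerpts):
    """Map each tag to the set of excerpt ids that carry it."""
    index = defaultdict(set)
    for excerpt_id, tags in excerpts.items():
        for tag in tags:
            index[tag.lower()].add(excerpt_id)
    return index


def find_excerpts(index, *tags):
    """Return the ids of excerpts tagged with every one of the given tags."""
    if not tags:
        return set()
    # Copy the first tag's matches so the index itself is never mutated.
    matches = set(index.get(tags[0].lower(), set()))
    for tag in tags[1:]:
        # Each extra tag narrows the result by set intersection.
        matches &= index.get(tag.lower(), set())
    return matches


index = build_tag_index(EXCERPTS)
print(sorted(find_excerpts(index, "nam june paik")))
print(sorted(find_excerpts(index, "nam june paik", "fluxus")))
```

under these assumptions, a single tag lists every matching excerpt across sites, and each added tag can only shrink that list, which is the narrowing behavior the text describes.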
recognizing new-media researchers’ need to get information into the collective ether as quickly as possible, leonardo has embarked on leonardo transactions (http://www.leonardotransactions.com/), a “fast track” section of its venerable print journal, which subjects two-page papers to a faster referee process than most peer-reviewed journals can muster. of course, academics can also circulate ideas quickly and widely by blogging, contributing to wikipedia, or at least publishing in open access repositories.

unfortunately, few new-media academics are going to bother with these innovations if their departments’ criteria for promotion and tenure recognize only dead-tree journals. that is why these criteria have to change. it will not be easy; the most conservative constituents of university hierarchies often control these criteria. times are changing, however: not only is tenure irrelevant in many universities worldwide, but even in countries such as the u.k. and the u.s. traditional criteria are becoming overshadowed by “research assessment exercises” and other metrics. by publishing the following criteria developed by still water, the research arm of the university of maine’s new media department, we hope to influence these fledgling developments—if only philosophically—and remind scholars of all generations that impact in our field can and should be measured differently.

abstract

this paper argues for redefining evaluation criteria for faculty working in new media research and makes specific recommendations for promotion and tenure committees in u.s. universities.

leonardo, vol. , no. , pp. – , © isast. individual article sections copyright as indicated. published under creative commons attribution (cc-by) license. all rights not granted thereunder to the public are reserved to the publisher and may not be exercised without its express written permission.

part : new criteria for new media ( january )

© jon ippolito et al.
authors: joline blais, jon ippolito, and owen smith in collaboration with steve evans and nathan stormer.

introduction

recognition and achievement in the field of new media must be measured by standards as high as but different from those in established artistic or scientific disciplines. as the reports from the american council of learned societies [ ], the modern language association [ ], and the university of maine [ ] recommend, promotion and tenure guidelines must be revised to encourage the creative and innovative use of technology if universities are to remain relevant in the st century.

the following points summarize some of the key areas in which new media research departs from traditional academic scholarship, with the aim of providing a rationale for specific criteria for universities with u.s.-style promotion and tenure policies.

new form and content

the differences between traditional and new media excellence lie in both form and content. the hard-copy format of traditional review documentation, such as photocopies or slides, is insufficient for evaluating new media work; screenshots do little justice to electronic projects based on innovative interactive or participatory design. as the mla puts it, “evaluative bodies should review faculty members’ work in the medium in which it was produced. for example, web-based projects should be viewed online, not in printed form” [ ].

further complicating the evaluation of new media achievements is the fact that they are often interdisciplinary, as reflected by the current university of maine new media faculty, whose backgrounds range from engineering to computer science to fine art to photojournalism to literature. established faculties with ties to new media may signal themselves as exclusively critical or creative, as in the distinction between art history and studio art, respectively.
new media’s brief history, however, often requires its practitioners to develop a critical context for their own creative work. this is why the majority of first-generation new media critics are also artists [ ]. it is also why new media research spans numerous genres, from critical essays to political activism to community-building to software design. new media faculties may profit by examining and borrowing criteria from practice-based departments such as journalism and architecture.

limitations of academic journals

these differences may require evaluators of new media artist-researchers to look beyond the usual standards applicable in other disciplines. as noted by a national academies report:

because the field of [information technology and creative practices] is young and dynamic, itcp production is hard to evaluate. traditional review panels . . . may be hampered by their members’ ties to single disciplines and the absence of a time-tested consensus about what constitutes good work in itcp and why [ ].

ironically, the national academies study found that the highest benchmark for success in traditional academic departments, publication in peer-reviewed journals, is less relevant to success in new media—and empirically a less accurate measure of stature in the field—than more supple or timely forms of intellectual exposition:

the gold standard for academia—and the criterion most easily understood by parties outside a given subdiscipline—is the so-called archival journal (often published by scholarly or professional societies) that involves considerable editorial selection plus prepublication review and revision, which function as a screening system for quality. but the long lead time for such publications poses problems for subdisciplines in which timeliness—quickly getting an idea into the field—matters [ ].
leonardo journal (mit press) is as of this writing the only print journal with a longstanding track record as a peer-reviewed journal about new media. there is currently a handful of new peer-reviewed journals devoted to new media, such as leonardo electronic almanac (cambridge), fibreculture (sydney), first monday (chicago), vectors (los angeles), and digital creativity (copenhagen). yet the field’s most prominent print publishers and research archivists [ ] have acknowledged a – year lag and limited exposure that makes print publications far less relevant for new media research. although promising new paradigms for distributed publication are on the horizon, at the time of this writing these systems are only in the planning stage [ ].

finally, as the mla warns, participation in electronic scholarship should not place extra demands on a researcher [ ]; an accomplishment in new media research should substitute for a print article or monograph, not merely supplement it.

alternative recognition measures

given the accessibility and timeliness required for new media research, the following measures of recognition should be prioritized in the evaluation of new media research candidates:

invited/edited publications

invitations to publish in edited electronic journals or printed magazines and books should be recognized as the kind of peer influence that in other fields would be signaled by acceptance in peer-reviewed journals.

live conferences

the national academies study concludes that conferences on new media, both face-to-face and virtual, offer a more useful and in some cases more prestigious venue for exposition than academic journals:

[the sluggishness of journal publications] is offset somewhat by a flourishing array of conferences and other forums, in both virtual and real space, that provide a sense of community and an outlet as well as feedback [ ] . . .
the prestige associated with presentations at major conferences actually makes some of them more selective than journals [ ].

new forms of conference archiving—such as archived webcasts—add value and exposure to the research presented at conferences.

citations

citations are a valuable and versatile measure of peer influence because they may come from or point to a variety of genres, from web sites to databases to books in print. examples include citations in:

a. electronic archives and recognition networks, such as the publicly accessible databases maintained by the daniel langlois foundation (montreal), the v organization (rotterdam), the database of virtual art (berlin), and the media art net database (karlsruhe).

b. books, printed journals, and newspapers. these are easier to find now, thanks to google scholar, google print, and amazon’s “look inside the book” feature.

c. syllabi and other pedagogical contexts. google searches on .edu domains and citations of the author’s work in syllabi from outside universities can measure the academic currency of an individual researcher or her ideas. in the sciences, readings or projects cited on a syllabus are likely to be popular textbooks, but in an emerging field such as new media, such recognition is a more valid marker of relevance.

download/visitor counts

downloads and other traffic-related statistics represent a measure of influence that has gained importance in the online community recently. as a open access study [ ] concludes:

whereas the significance of citation impact is well established, access of research literature via the web provides a new metric for measuring the impact of articles—web download impact.
download impact is useful for at least two reasons: ( ) the portion of download variance that is correlated with citation counts provides an early-days estimate of probable citation impact that can begin to be tracked from the instant an article is made open access and that already attains its maximum predictive power after months. ( ) the portion of download variance that is uncorrelated with citation counts provides a second, partly independent estimate of the impact of an article, sensitive to another form of research usage that is not reflected in citations [ ].

impact in online discussions

email discussion lists are the proving grounds of new media discourse. they vary greatly in tone and substance, but even the least moderated of such lists can subject their authors to rigorous—and at times withering—scrutiny [ ]. measures such as the number of list subscribers, geographic scope, the presence or absence of moderation, and the number of replies triggered by a given contribution can give a sense of the importance of each discussion list [ ].

impact in the real world

while magazine columns and newspaper editorials may have little standing in traditional academic subjects, one of the strengths of new media is their relevance to a daily life that is increasingly inflected by the relentless proliferation of technologies. even counting google search returns on the author’s name or statistically improbable phrases can be a measure of real-world impact [ ]. by privileging new media research with direct effect on local or global communities, the university can remain relevant in an age where much research takes place outside the ivory tower.

net-native recognition metrics

peer-evaluated online communities may invent their own measures of member evaluation, in which case they may be relevant to a researcher who participates in those communities.
examples of such self-policing communities include slashdot, the pool, open theory, and the distributed learning project. the mla pins the responsibility for learning these new metrics on reviewers rather than the reviewed [ ]. given the mutability of such metrics, however, promotion and tenure candidates may be called upon to explain and give context to these metrics for their reviewers. again, efforts to educate a scholar’s colleagues about new media should be considered part of that scholar’s research, not supplemental to it.

reference letters/committees

letters of recommendation from outside referees are an important compensation for the irrelevance of traditional recognition venues. nevertheless, it is insufficient merely to solicit such letters from professors tenured in new media at other universities, since so few exist. more valuable is to use the measures outlined in this document to identify pre-eminent figures in new media, or to require new media promotion and tenure candidates to identify such figures and supply evidence that they qualify according to the criteria above. it has also been suggested that the membership of review committees for researchers in new media should represent a balance of critical and creative experts with standing in both the academic and the outside world.

part : criteria by category

© university of maine

the following criteria formulated by the university of maine’s new media department offer one example of how universities can adapt their standards of recognition to reflect the growing importance of electronic scholarship in the st century. because of the rapid pace of innovation in electronic formats, this list must remain partial, since it is impossible to predict what new recognition mechanisms may be relevant a few years from now.

i. teaching and instructional activities

new media pedagogy must be light on its feet to stay relevant.
below are some instructional activities that serve as important supplements to regular courses on the new media curriculum.

a. other teaching activities

independent study, directed research, etc. (list by course number)

because new media’s tools and topics proliferate too quickly to be captured by any one curriculum, faculty are encouraged to teach independent studies when students want to explore research areas not on a current syllabus.

in addition, new media student and faculty projects often reach beyond the walls of the classroom into the real world. the new media program recognizes the value of directed research in which faculty involve students in outside collaborations for artistic or commercial purposes, as well as faculty members who facilitate students’ exposure to or participation in national and international exhibitions, conferences, and other venues.

b. curriculum and course development

curriculum

during its building years, the new media program expects its faculty to contribute more to curriculum development than expected in other departments. this work may take the form of course proposals, curriculum proposals, or curriculum subcommittee membership.

courses

given the quick pace of new media evolution, the program recognizes exceptional value in developing courses that explore new pedagogies or emerging technologies.

it is understood that new media faculty may spend a significant portion of their research or course preparation time learning an emerging technology, such as a new programming language, with the understanding that such knowledge may lay the groundwork for future research or new courses. this groundwork is not “brushing up on skills,” but experimenting with promising yet unproven systems, codes, or devices.

ii. research and scholarly activities

good collaborators are critical to thriving research ecosystems.
candidates are encouraged to list any collaborative roles they have played in publications and other activities, such as conceptual architect, approach designer, release engineer, or matchmaker (e.g., introducing two other researchers whose collaboration results in a publication). each new media department may choose to weight these various roles according to its own priorities.

a. publications

books/monographs

networked or rich-media publications such as extended blogs, dvds, or cd-roms should be included if they constitute a sustained investigation of a particular topic.

refereed journal articles

in a new media context, a “closed peer-review” article includes invited contributions to edited print journals and networked journals. the format of these contributions may go beyond the form of a written essay to include podcasts, videoblogs, and other forms of archival media. an “open peer-review” article includes contributions to self-policing publication networks, where the quality or relevance of contributions is subject to community debate and evaluation.

chapters of books/monographs (please indicate if invited or juried)

essays or chapters in edited volumes are more important in new media than the sciences, for these edited volumes establish standards for discourse in emergent subdisciplines of new media.

this category should also include invited contributions to edited, single-issue networked publications.

edited volumes

this category includes coordinating or managing a multi-user discussion list, whether accessible via email or web.

this category also includes the conception, design, engineering, and/or editing of organized media collections, including film festivals, networked databases, and publications.

technical reports/book reviews

this category includes networked reports and reviews.

other publications (e.g., editorials, working papers, etc.)
this category includes essays published to email lists, including all contributions to discussions sparked by the publication of that essay.

b. creative activities, exhibitions, and performance-related activities

(please indicate whether regional, international, national, solo, group, invited or juried)

exhibitions

this category includes networked exhibitions hosted by brick-and-mortar institutions or independent organizations, and can include online exhibitions as well as physical installations.

a. participating
b. curated

performance related activities

this category includes political design, social software, and interactive performance.

creative writing and poetry

this category includes literature in all its forms, both analogue and digital, in print or online.

c. professional presentations and posters

(please indicate if regional, national, or international)

conferences and discussions organized

researchers in new media at this point in its development are actively filling in gaps in the awareness of new media’s own history, a critical vocabulary, and other intellectual frameworks already in place in other fields. the new media program recognizes the value that organizing private and public events has for the field as a whole and, when local, for our students.

presentations

as studies of new media have argued, presenting research at prestigious conferences can be more important than publishing it.

while there is no substitute for in-person gatherings, teleconferences are gradually becoming an important venue for conference presentations, though they vary in degree of formality and organization.

iii. service

a. service to university

department

as a fledgling program with a high student-to-teacher ratio, the new media program requires an unusual amount of innovation and labor from its faculty, which should be taken into consideration when evaluating faculty contributions to other areas.
university

because new media promise to change the methods of many academic disciplines, faculty are encouraged to lend their voice to interdisciplinary committees and work with other departments to envision and develop programs that integrate new media into their own practices.

b. service to the public

(e.g., service on state commissions, public schools, civic groups, consulting, media interviews, public presentations)

new media can be especially effective in transforming local cultures as well as global ones. faculty research in this area can be distinguished from traditional academic “service” by its innovative, activist, or performative character.

iv. special recognition/awards/honors received

a. press

given the limitations of publishing new media research in academic journals, recognition from the press in the form of articles or interviews about a researcher’s work can be a valuable indicator of influence.

print and broadcast press

this category includes outside sources such as general-interest newspapers, radio or tv spots, and specialized journals or magazines.

electronic press

this category includes articles in online journals as well as blogs.

b. citations

only general citations go here; citations to document the relevance and achievement of specific projects should accompany the entries on that research above.

print citations

although they are not as timely as electronic citations, citations in books on new media can suggest a measure of a researcher’s influence and relevance to the field.

electronic citations

one measure of influence in academia can be suggested by citations in other university syllabi. (see the breakdown in part .)

part : note from roger f. malina, leonardo executive editor

© isast

the problem discussed by jon ippolito is one that faces many young professionals in academic institutions internationally.
over the years we have been contacted by chairs of promotion and tenure committees at a number of institutions who want to understand whether leonardo’s scholarly publications use peer review (they do), and what kind (we use single-blind review). yet traditional peer review is evolving in science and engineering, not only to take into account the proliferation of examples of fraud and plagiarism surviving peer review, but also to open up the process to counter obstacles to interdisciplinary scholarship. we have been asked for impact and citation statistics (leonardo is in the isi database). yet, as pointed out by ippolito, it is clear that many of the leading practitioners in rapidly changing interdisciplinary fields not only fall between the cracks of established evaluation systems but also are disseminating their work in new ways on-line that entirely escape assessment by existing metrics. i have been asked to write letters of recommendation taking into account authors’ work in on-line communities such as second life and to comment on the “perceived value” of certain on-line conference venues and archives. some of the most influential collections of texts and work in our fields have never seen the light of print. in the sciences, a number of open-archive systems now co-exist with more traditional scholarly publishing business models. in neither the art-and-technology nor the new-media fields do such “evaluatable” open archive systems exist. yet, it is possible for open archive systems to allow rapid dissemination while texts proceed through peer-reviewing systems. leonardo transactions, under editor-in-chief ernest edmonds, is one experiment in coupling an open archive with peer-review journal flow.
as indicated by ippolito, we are interested in documenting in leonardo various international approaches that develop “alternative evaluation criteria or metrics” to allow assessment of new modes of scholarly text dissemination and publication.
references and notes
unedited references as provided by authors.
. thoughtmesh by jon ippolito and craig dietrich, , accessed july .
. re:paik by john bell, , accessed july .
. the acls recommends “policies for tenure and promotion that recognize and reward digital scholarship and scholarly communication; recognition should be given not only to scholarship that uses the humanities and social science cyberinfrastructure but also to scholarship that contributes to its design, construction, and growth.... we might expect younger colleagues to use new technologies with greater fluency and ease, but with tenure at stake, they will also be more risk-averse.... senior scholars now have both the opportunity and the responsibility to take certain risks, first among which is to condone risk taking in their junior colleagues and their graduate students, making sure that such endeavors are appropriately rewarded.” “our cultural commonwealth,” report by the acls commission on cyberinfrastructure for the humanities and social sciences, july , , accessed january , .
. “departments and institutions should recognize the legitimacy of scholarship produced in new media, whether by individuals or in collaboration, and create procedures for evaluating these forms of scholarship.” december report of the mla task force on evaluating scholarship for tenure and promotion, , accessed january , .
. “the commission encourages each department on campus, as well as the university as a whole, to examine promotion and tenure criteria to recognize and reward innovative uses of technology in teaching, research and service.... the university needs to consider the criteria and standards used in the promotion and tenure process.
the commission encourages each department and the university as a whole to consider whether faculty efforts in this area are recognized, valued, and/or encouraged.” november report of the university of maine commission on information technologies, accessed at on may , .
. mla committee on information technology. “guidelines for evaluating work with digital media in the modern languages.” may . ade bulletin ( ): –- . , mirrored at , accessed january .
. a brief sampling of new media theorist-practitioners and institutions they have been connected with includes simon biggs (edinburgh), matthew fuller (piet zwart institute), mary flanagan (hunter), alexander galloway (nyu), kenneth goldberg (berkeley), eduardo kac (art institute of chicago), natalie jeremijenko (ucsd), raphael lozano-hemmer (karlstad university, sweden), lev manovich (ucsd), randall packer (american university), richard rinehart (berkeley), and jeffrey shaw (zkm).
. national research council, beyond productivity: information technology, innovation, and creativity (washington, dc: the national academies press, ) pp. – .
. national research council [ ] p. .
. these estimates are from roger malina (executive editor of leonardo journal) and the daniel langlois foundation’s alain depocas (director of the centre for documentation + research).
. the interarchive project is a possible model for distributed publication; see .
. “change in favor of a more capacious conception of scholarship, which we strongly endorse, should not mean ever-wider demands on faculty members, most especially those coming up for tenure and promotion.” mla task force on evaluating scholarship for tenure and promotion [ ] p. .
. national research council [ ] pp. – .
. national research council [ ] p. .
. tim brody and stevan harnad, “earlier web usage statistics as predictors of later citation impact”, , accessed march .
. kurtz, michael j.
( ) “restrictive access policies cut readership of electronic research journal articles by a factor of two,” harvard-smithsonian centre for astrophysics, cambridge, ma, , pp. – .
. this recent [http://www.nettime.org/lists-archives/nettime-l /msg .html] rejoinder by morlock elloi on the list exemplifies the expectations of such online forums: if you have any past publications that might help me understand your point of view, i would gladly read them. while i understand that in paid-speaker-world the weight of the argument is computed as (volume of publications) × (number of speeches), on nettime and elsewhere closer to reality arguments stand for themselves.
. electronic and email texts also have a currency acknowledged by leading institutions in the field. as of december , , one of the premiere bibliographic indices in new media, the langlois foundation’s cr+d database, included the following indexation for “jon ippolito”: author of documents; subject of documents; participant to events; and organizer of events. of the documents by the author indexed, is from an email list and are parts of web sites. in the case of artist and critic alexander galloway, the relevance of his online texts is even more striking: although by he was the author of several journal articles and an important book from mit press, the two documents that represented his writing in the cr+d database were both from email lists.
. a statistically significant number of google returns, e.g., > , may be a necessary but insufficient condition for confirming global impact.
. “in evaluating scholarship for tenure and promotion, committees and administrators must take responsibility for becoming fully aware both of the mechanisms of oversight and assessment that already govern the production of a great deal of digital scholarship and of the well-established role of new media in humanities research.
it is of course convenient when electronic scholarly editing and writing are clearly analogous to their print counterparts. but when new media make new forms of scholarship possible, those forms can be assessed with the same rigor used to judge scholarly quality in print media. we must have the flexibility to ensure that, as new sources and instruments for knowing develop, the meaning of scholarship can expand and remain relevant to our changing times.” mla task force on evaluating scholarship for tenure and promotion [ ] p. .
manuscript received august .
jon ippolito’s current projects, including the variable media network, the pool and thoughtmesh, aim to expand the art world beyond its traditional preoccupations.
joline blais’s projects include longgreenhouse, a merging of the wabanaki longhouse, permaculture gardens and networked collaboration; the cross-cultural partnership, a legal framework for sharing connected knowledge responsibly and sustainably; and at the edge of art, a book on strategies that empower new media artists to reshape the practice of art and beyond.
owen f. smith is an historian of alternative art forms, a producer of multiples and a digital and performance artist. his scholarly work has been published in numerous books and catalogs on fluxus, intermedia and related forms of creativity. his work as an artist has been exhibited throughout the u.s.a., europe and japan.
steve evans’s writing and research focus on poetry and poetics, critical theory and the avant-garde. he runs the new writing series at the university of maine, does projects with the national poetry foundation, and tends a web site, thirdfactory.net, devoted to contemporary poetry.
nathan stormer’s principal research area is medical rhetoric about abortion. he also teaches and researches visuality and culture.
ippolito et al., new criteria for new media
the american archivist vol. , no.
fall/winter –
abstract
producing exhibits is an important form of scholarly and creative activity for academic librarians, archivists, and curators. while other forms of scholarship such as publishing a book or a peer-reviewed journal article are unquestionably accepted, exhibits are typically viewed as less intellectually rigorous. through a literature review and a review of appointment, promotion, and tenure policies of selected association of research libraries institutions with faculty status, this study seeks to uphold the creation of exhibits as a critical scholarly endeavor in the academic library and to provide guidance in evaluating exhibits as scholarship for library faculty, especially those working in archives and special collections. an overview of strategies for documentation and evaluation of exhibits as noteworthy scholarly communication is included. the recommendations provided can also assist nonacademic library and archival institutions to create high-quality exhibits of enduring value. exhibits, digital humanities projects, and other forms of scholarship and creativity should be considered for promotion and tenure if presented in a compelling way to review communities.
exhibits as scholarship: strategies for acceptance, documentation, and evaluation in academic libraries
elizabeth a. novara and vincent j. novara
key words
academic archivists, academic libraries, exhibits, faculty status, promotion and tenure
© elizabeth a. novara and vincent j. novara.
exhibits remain an undervalued form of scholarly communication for academic librarians, archivists, and curators.
while other forms of scholarship such as publishing a book or a peer-reviewed journal article are accepted without question, it is typical for evaluators to view exhibits as less intellectually rigorous, even though enormous amounts of time, talent, research, writing, and presentation go into planning and staging academic library exhibits. at each review cycle, academic librarians are asked to justify such scholarly communications to their library colleagues and other faculty who possess little or no knowledge about the intellectual work required to create exhibits. through a literature review and a review of policies of association of research libraries (arl) institutions with faculty status, this study seeks to uphold the creation of exhibits as a critical scholarly endeavor in the academic library and to provide guidance in evaluating exhibits as scholarship for faculty librarians, especially those working in archives and special collections.
academic library exhibits can exist in many forms, but this article will focus specifically on larger-scale gallery exhibits most often found within special collections departments. this does not imply that smaller exhibits such as the single display case or lobby panel displays are not important to the outreach strategies of academic libraries, but that gallery exhibits require effort and dedication on a scale comparable to that of an article published in a scholarly journal. successful gallery exhibits demand significant effort and resources, as well as extensive study and contextualization of a wide array of primary source materials. in addition, a large-scale exhibit must be based on in-depth research and be an accessible counterpart to other forms of scholarship on the research topic.
this article will review the literature previously published on the topic of exhibits in academic libraries, including literature focused on the subject of librarians with faculty status, and will seek to reinforce an expanded definition of “scholarship.” in addition, this study will look toward the literature in the digital humanities, history, and museum studies to provide a broader perspective on recognizing exhibits as scholarship. using the literature review as a foundation, this study will examine the appointment, promotion, and tenure (apt) policies for faculty librarians at selected arl institutions to discover how they perceive exhibits and if/how faculty librarians can submit exhibits as evidence of scholarship. we will also highlight the inconsistencies and limitations in current practices in defining dossiers for academic librarians. finally, an overview of strategies for documentation and evaluation of exhibits as scholarly communication is included. such strategies for documentation and evaluation are at the core of making exhibits an accepted component of a faculty librarian’s promotion and tenure dossier. overall, the hope is to encourage broader recognition of the creation of library exhibits as a worthwhile scholarly endeavor both within academia and for any cultural institution with a focus on public history and public engagement.
literature review
within academia, a long-standing debate continues about whether or not librarians qualify for faculty status and if librarians’ criteria for faculty status should be comparable in rigor to those of instructional faculty. if universities adopt proposed changes to definitions of scholarship, these debates may become superfluous.
however, some academic librarians “apparently believe that research, although central to the university’s mission, is only to be supported by librarians, and not done by them.” this attitude about scholarship among academic library colleagues and in the academy itself needs to change. fortunately for archivists and other academic librarians, the definition of “scholarship” is now evolving within the academy, albeit slowly. academia is experiencing a push for alternative definitions and evaluations of scholarly output, and some institutions are beginning at least to consider changes to their apt policies. the recent rise of “altmetrics” indicates this development. as defined by elizabeth joan kelly, altmetrics are “an alternative to traditional measurement of the impact of published resources,” including a greater reliance on references within various social media platforms. kelly proposed that archivists seek ways to apply altmetrics to measure the impact of finding aids, digital projects, exhibitions, and other scholarly communications. this development suggests academia’s limitations for evaluating scholarship from discipline to discipline (including academic archives) and reveals that the time is right to reconsider a place for exhibitions in the apt dossier.
eugene rice, ernest boyer, and others have, since the late s, “proposed that colleges and universities move beyond the debate of teaching versus research and that the definition of scholarship be expanded to include not only original research but the synthesizing and reintegration of knowledge, professional practice, and the transformation of knowledge through teaching.” according to this definition, scholarship has four distinct, yet interrelated categories: the scholarship of discovery, of integration, of application, and of teaching.
exhibits certainly fall into this definition of scholarship, in particular “the synthesizing and reintegration of knowledge.” other authors have argued for a more integrated view of the traditional three-tiered performance review criteria of librarianship, service, and scholarship. william k. black and joan m. leysen argued that “it is easy to view the cataloging or reference work that librarians do as the primary job to the exclusion of other facets or responsibilities . . . . there should be a real continuity between professional practice, research, and service, and we need to appreciate the benefits inherent in this relationship.” this view then does not limit exhibits to the performance criteria of librarianship, but extends them into the scholarly realm of an academic librarian’s efforts. black and leysen also noted that exhibits warrant consideration as creative activities and complementary research, though not necessarily original research. while exhibits may not always present original research (but often do), the research is nonetheless important and academically rigorous.
although the topic of exhibits within academic libraries has been investigated to some extent, the focus of previous research has been on “how-to” manuals and descriptions of specific exhibits. a noticeable lack exists in the academic library and archival literature relating to “the intellectual and creative process of producing an exhibit” and the relationship of exhibits to scholarly research.
although a recently published monograph on managing academic archives and special collections encouragingly states that archivists, as faculty librarians, need to maintain an active research agenda and strive to stay up-to-date on various trends in research, higher education, and technology, it makes no mention of exhibits or other forms of scholarship beyond peer-reviewed books and journal articles. in fact, only one article speaks directly to the issue of scholarly exhibits in academic libraries. in , laurel g. bowen and peter j. roberts published “exhibits: illegitimate children of academic libraries?” in college & research libraries. to demonstrate that exhibits are a legitimate form of scholarship, bowen and roberts compared the details of the process of writing a scholarly article to those of planning, researching, and creating an exhibit. the authors argued that “a new interpretation of information or presentation of ideas that leads to a new understanding is just as necessary in advancing knowledge as is the discovery of new facts.” in regard to academic librarian dossiers, black and leysen summed things up well: “the full picture of the candidate’s expertise in the area of scholarship should be drawn from the range of contributions presented. each activity that reflects research has a place in the scholarship assessment. activities should be judged individually on their own merits and then brought together to form a cohesive picture of the candidate’s professional competence.” scholarly accomplishments, therefore, encompass a wide range of activities, including exhibits. finally, a recent examination on the topic of special collections exhibits published by the society of american archivists notes that exhibits can demonstrate how faculty librarians are not just experts in technical work, but are also well versed in subject-area expertise and interpretation of materials. the author emphasized, “exhibition curatorship is scholarship.”
the digital humanities (dh) are also grappling with similar issues in making the case for their projects as scholarship in the apt arena. according to j. matthew huculak and lisa goddard, digital humanists, much like academic archivists, also face impeding and outmoded apt models that discourage collaboration, in particular for developing and envisioning access to digital scholarship. furthermore, as they “operate within departmental structures that have traditionally prioritized individual achievement and monograph production” in apt, digital humanists are put on the defensive to qualify their scholarly communication. yet, while in academic humanities programs coauthoring or codeveloping projects is sometimes viewed as a “liability,” academic libraries and archives place greater value on collaboration. information professionals are “rewarded for developing solutions by consultation and collaboration . . . for producing initiatives that have demonstrable reach and impact for the larger library or university community.” in defining impact and reach, archivists can do more to make the case for exhibitions as scholarship. yet, much like exhibits, dh projects, which may feature interpretive content, are not always “intended to last forever.” however, while exhibit catalogs offer one potential solution for exhibit reach and longevity, digital humanists are still formulating applicable solutions. and arguments exist against creating such traditional publications in place of the original dh project (or exhibit for that matter). should not the original project be enough to satisfy the apt process?
odell and pollock noted that “this discourages further work on the digital project, creating a culture in which the project need only be good enough to describe in an article. it also punishes the digital humanist by doubling up on their efforts to meet the bar of p&t” (promotion and tenure).
beyond the academic library literature, museum curators and historians have more widely accepted exhibits as scholarly communications in their own standards and literature. museum curators have established guidelines for exhibits, one example being “the standards for museum exhibitions and indicators of excellence” developed by the standing professional committees council of the american alliance of museums. these standards provide some general guidelines to follow for exhibits, including qualities related to content and intellectual value. in addition, the standards should be viewed as suggestive rather than prescriptive by stating, “we should always allow for purposeful—and often brilliant—deviation from the norm.” this perspective shows strong linkages to accepting exhibits created by both scholarly and creative processes. museum curators also recognize that peer review can play a role in exhibits, including those online. one museum curator turned faculty member argued for peer review of digital exhibits, noting that:
while they help identify and assess important work, scholarly reviews also play a related role within the professional lives of scholars, whether they work in museums, archives, higher education, government, or grassroots community organizations. individually and collectively, we are judged by the work that we produce; thus rigorous and independent assessments of our efforts by knowledgeable peers are a useful service.
peer review of exhibitions can not only assist in the promotion and tenure process, but can demonstrate the exhibition’s value to the general public, granting agencies, donors, and other stakeholders at an archival or cultural institution.
historians, too, have recognized exhibits as examples of successful public history scholarship. the organization of american historians, the american historical association, and the national council on public history together produced a report on evaluating the work of the “publicly engaged academic historian.” the report argues that scholarly work in public history, which would include researching and creating exhibits, “is too often overlooked in a tenure process that emphasizes single-authored monographs and articles at the expense of other types of scholarly production.” the report notes that “public history scholarship, like all good historical scholarship, is peer reviewed, but that review includes a broader and more diverse group of peers, many from outside traditional academic departments, working in museums, historic sites, and other sites of mediation between scholars and the public.” finally, the report emphasizes the need for recognition among scholars that community engagement is a vital part of faculty dossiers. jacques berlinerblau wrote for the chronicle of higher education that humanists, especially, need to engage more with broader audiences, noting, “tomorrow’s humanist will be outward bound . . . less isolate and microspecialist, more conversationalist, generalist, and even . . . a conscientious popularizer.” from the perspective of the humanities, exhibits represent a distinct form of scholarly and community engagement, whose informational content sparks conversations accessible to a broad audience.
methodology
for this study, we chose to collect and perform a textual analysis of the appointment, promotion, and tenure policies for academic librarians at institutions with membership in the association of research libraries and whose librarians also had faculty status with tenure. we chose arl membership as one of the selection criteria because the ideal profile for an arl member institution includes supporting a special collections program, where exhibits typically flourish within the academic library. the arl benefits of membership state that member libraries must have “distinctive research-oriented collections and resources of national or international significance in a variety of media that result in shared or collective collections that support global research and core and specialized services to the scholarly community of faculty, students, and visiting scholars.” in addition, member institutions must be involved in the “preservation and archiving of research resources to ensure their availability for future scholars.” beginning in , arl also designated special collections as “a priority for arl attention.”
there are currently member institutions in arl, so we further narrowed the survey to include only those academic library institutions granting faculty status and tenure to librarians. this qualification remained challenging to determine as academic libraries have varying degrees of professional statuses and tenure. some academic librarians are considered faculty, but do not have tenure; some are considered staff; some libraries use a mixed model with both faculty and staff librarians; and degrees of other combinations and models vary.
to identify those arl academic libraries with faculty status and tenure, we referred to two online sources that have compiled information about professional statuses for academic librarians. after cross-referencing these sources, we had a more concentrated list of institutions to review. we began by searching for library apt policies available online. however, since apt policies are not always available online to the public, we also obtained apt policies through direct contact with peer librarians at several institutions as well as with library human resources personnel. institutions that fit the above criteria whose apt policies were not readily available online or were not obtainable with reasonable requests were excluded from this study.
one other challenge also complicated this study: academic institutions often have policies governing the apt process at the institutional and the departmental levels. whenever possible, we attempted to locate policies at the departmental or library level as these more-detailed policies are more specific to faculty librarians. on occasion, we only found (or the library faculty only used) the overarching institutional policy, and we consulted this policy instead as our basis of study. whenever possible, we referenced the most up-to-date policy, but since policies are perpetually revised, this also proved challenging over the course of the study.
finally, the most limiting challenge to this study was that policy documents do not always represent the nuances of how policies are actually implemented for individuals within an institution. candidates, apt committees, and individual faculty members can interpret policies in a number of ways, and that interpretation can change over time as new insights develop about the policy documents.
future studies should perhaps include faculty interviews, responses, or case studies at particular institutions, keeping in mind that the apt process is often fraught with sensitive information and emotional experiences. after all these considerations, the final sample group totaled institutions (see table ).
we examined scholarship requirements in the apt policies for each institution beginning in academic year – and ending in – . while searching for the terms “exhibit” or “exhibition” within each apt policy document proved essential to the review, we considered how scholarship was defined and the degree of flexibility permitted in this area of the dossier supporting promotion with tenure. we also examined apt policies to determine if curating exhibits was considered a job responsibility and thereby tied to librarianship, or if policies allowed the inclusion of exhibits as scholarly endeavors.
table . academic libraries surveyed
university of albany, suny, libraries
university of arizona libraries
auburn university libraries
university at buffalo, suny, libraries
university of cincinnati libraries
university of colorado boulder libraries
colorado state university libraries
university of florida libraries
university of georgia libraries
university of illinois at chicago library
university of illinois at urbana-champaign library
indiana university libraries bloomington
iowa state university library
louisiana state university libraries
university of louisville libraries
university of maryland libraries
mcgill university library (canada)
university of new mexico libraries
university of nebraska–lincoln libraries
ohio state university libraries
pennsylvania state university libraries
rutgers university libraries
stony brook university, suny, libraries
university of south carolina libraries
university of tennessee, knoxville, libraries
texas a&m university libraries
virginia tech libraries
washington state university libraries
discussion
as is appropriate, the majority of academic libraries continue to value peer-reviewed or refereed work the most highly in the promotion and tenure process. the association for college and research libraries’ (acrl) guidelines for faculty status directly support this approach.
many apt policies directly support this preference with statements such as that found in the university of buffalo libraries, suny policy: “there are two critical elements in evaluating research and creative activity: publication and peer review.” pennsylvania state university libraries also strongly emphasizes peer-reviewed publications in its guidelines: “the university libraries highly value products of scholarship that have undergone an independent evaluation and selection process, such as peer review, rigorous editorial selection, or competitive juried selection.” while academic libraries have broadened their definition of scholarship, often by using the terms “scholarly” and “creative works” to recognize the variety of activities in which librarians engage, a preference clearly remains for traditional scholarship in the form of peer-reviewed publications. ohio state university libraries notes, however, “no single type of publication/creative work is invariably a more significant component of a research program than another. nevertheless, a body of work, which is cumulative in nature and reflects the highest academic standards, is required.” for exhibits to attain recognition as another form of scholarship and creativity in an academic librarian’s dossier, the faculty librarian needs to make the case for them as quality, peer-reviewed work.

as table indicates, of those institutions surveyed, universities mentioned exhibits within the scholarship/research sections of their apt policies. exhibits are at least acknowledged as some sort of scholarship or creative activity at most of the institutions surveyed. these institutions commonly provide examples of scholarly and creative activities in their apt guidelines, usually listed in order of importance. alas, exhibits are generally found near the bottom of the list.
the university of maryland, college park, for example, lists exhibits in conjunction with “performances, demonstrations, and other creative activities,” while the university of illinois urbana-champaign’s university-wide apt policy groups creative works as a subgroup under publications and creative works. the policy notes that creative works on the curriculum vitae include “exhibitions, commissions, competitions, performances, designs, and art or architecture executed.”

table . “exhibits” mentioned in apt policies
number of institutions surveyed (policies found or provided)
number of institutions that specifically mention “exhibit” or “exhibition” in apt policies (either university or library)
number of institutions that use the term “exhibit” or “exhibition” within the scholarly and creative activities section of their apt policies (either university or library)
number of institutions that use the term “exhibit” or “exhibition” within the librarianship or service sections of their apt policies (either university or library)
number of institutions that do not mention “exhibits” or “exhibition” within their apt policies (either university or library)

two of the apt policies mention exhibits, but do not categorize them as scholarship, instead identifying them as librarianship or service. the university of georgia libraries places exhibits under service to the university or the libraries.
its policy states that “examples of university, faculty, or library projects include preparation of exhibits, participation in the planning of staff development workshops or other education programs, editing in-house newsletters, reports, or other publications.” other institutions such as the university of arizona simply do not recognize exhibits as scholarship or service, but class them exclusively as librarianship or part of day-to-day job responsibilities. the university of arizona libraries’ policy states:

written materials (including electronic or paper research guides, finding aids, and similar materials) and/or oral presentations (including lectures, panel discussions, and other invited presentations) and/or exhibitions which were developed as part of assigned library work and that are focused on a campus audience or affiliates, should be listed in the position effectiveness section of the cv.

finally, some institutions categorize exhibits in more than one area of evaluation for promotion. the university of buffalo libraries, suny, for example, lists exhibits, both physical and virtual, under examples of scholarly activities as well as under “contributions to the libraries” or librarianship. in the reference to exhibits under “contributions to the libraries,” it recognizes the research involved in this type of effort, stating that “when a librarian’s work generates library guides, media productions, exhibits, electronic media, or other practice-related matter, such materials are evaluated by colleagues and, whenever possible, by appropriate evaluators from outside the university. these resources can involve research and creative efforts comparable to that required for articles in refereed journals.”

several academic institutions support or are at least open to some of the unique activities that archivists, curators, and special collections librarians can perform through the scholarly work of creating exhibits.
at the university of maryland, faculty librarians evaluate candidates using an apt process separate from that of the teaching faculty. other faculty librarians, external evaluators from the field, the dean of libraries, and the provost evaluate faculty librarians. teaching faculty do not currently participate in the process, and academic librarians have their own set of guidelines that include examples of what types of work constitute scholarship and creativity. the university of maryland libraries apt guidelines for how to organize a curriculum vitae group exhibits with “performances, demonstrations, and other creative activities” a little over halfway down the list of acceptable activities. while exhibits are deemed creative, nothing indicates that they are necessarily considered scholarly endeavors. in addition, exhibits clearly rank lower on the spectrum of scholarship and creativity than do monographs and peer-reviewed articles. however, in the apt policy itself, scholarship and creativity are broadly defined, leaving room for a more open interpretation of where exhibits might fall on the spectrum. under “scholarship and creativity” the policy reads:

the candidate for promotion to higher rank shall demonstrate sustained and effective engagement in scholarship and creativity. these contributions must be of high quality and significance to the field of librarianship or another discipline related or complementary to the candidate’s area of responsibility. a library faculty member’s scholarship and creativity will be judged for its contribution to library effectiveness and expansion of the librarian’s relationship to knowledge.

other academic libraries take a similar stance in supporting exhibits as scholarship and rank peer-reviewed work higher on the spectrum of scholarship.
the apt policy of auburn university notes, “research and creative work ordinarily can be documented by a candidate’s publications or performances/exhibits. publication subjected to critical review by other scholars as a condition of publication should carry more weight than publication that is not refereed.” similarly, colorado state university libraries includes exhibits in its apt policy and appears open to forms of scholarship and creativity beyond the monograph and the journal article. the policy states, “activities encompassed by the term ‘research and creative activity’ include, but are not limited to . . . producing creative work related to the discipline or specialty, such as films, tapes, exhibits, reports, compositions, audiovisual material, computer programs, and/or web pages.” in addition, the policy explains, “because librarianship does not exist in isolation from the community, which it serves, but rather co-exists with and contributes to all disciplines, scholarly endeavors of libraries faculty reflect this symbiosis, and often cross-disciplinary boundaries.” finally, the iowa state university libraries apt policy serves as another example of accepting exhibits as scholarly endeavors for faculty librarians. its policy states:

the nature of scholarly work at a diverse university necessarily varies. in the promotion and tenure review process, however, evidence that a significant portion of a faculty member’s scholarship has been documented (i.e., communicated to and validated by peers beyond the university) is required of all. in the library field, refereed journals and monographs are the traditional media for documenting scholarship; in some areas of librarianship, exhibitions are an additional appropriate form. emerging technologies are creating (and will continue to create) entirely new media which may be used by librarians. finally, scholarship may be validated and communicated through conference presentations and invited lectures.
many apt policies, like that of the iowa state university libraries, are beginning to recognize new forms of scholarship and creativity that benefit the academic community. of the institutions that do not specifically mention exhibits in their apt policies, it is entirely possible that exhibits qualify as research or scholarly activity; a policy’s silence does not necessarily mean that exhibits are excluded. as definitions of scholarship expand, academic institutions must be open to considering exhibits and other forms of scholarship as library faculty apply for tenure. the pennsylvania state university libraries policy, for example, does not mention exhibits, but it does note that “evidence of the impact of the candidate’s research and creative accomplishments, and of the candidate’s reputation in the discipline, are also valued.” if exhibits are not mentioned specifically in an institution’s apt policy, this absence likely signals that faculty applicants will have to make a compelling case for any exhibit or other creative work to earn recognition as scholarship.

conclusion: strategies for acceptance, documentation, and evaluation of exhibits

definitions of scholarship and creativity differ widely among arl institutions and in academic libraries in general, in part because no agreement exists on apt policies for academic librarians. some institutions cannot even decide whether librarians should have faculty status. this variance renders justifying and documenting the value of nontraditional modes of scholarship, such as exhibits, a challenging proposition. w. bede mitchell and bruce morton suggested various reasons why library faculty differ so much from teaching faculty. one possible answer includes “substantive differences” in graduate library education, which may leave some librarians unprepared or uncomfortable with faculty status.
additionally, faculty status is not guaranteed for librarians and archivists at all institutions of higher learning. some information professionals may have had faculty status for quite some time, while others may have received faculty status only in the last ten to twenty years. academic library faculty still have some catching up to do in developing more standard apt policies similar to those of teaching faculty specializations. without more standard metrics and policies, academic librarians changing positions across institutions face a steeper learning curve regarding acceptable apt requirements than do teaching faculty.

part of the challenge in creating an apt dossier is the ability to describe the scholarly value and the impact of one’s work. with peer-reviewed articles and books a framework exists for describing this that includes the review process, acceptance rates, and impact factors. nothing similar currently exists to evaluate the impact of exhibits or other more creative forms of scholarship.

to address this apt policy issue for faculty librarians, we have two recommendations:
• in the near term, library faculty at academic institutions should update apt policies to include scholarly exhibits in the criteria for scholarship.
• arl and acrl should include scholarly exhibits as a recommended form of scholarship in their next publications addressing faculty promotion and tenure.

beyond these policy changes in the profession as a whole, many ways exist for individual faculty members to present exhibits so that they can provide evidence of scholarly communications long after the exhibits are physically taken down. we recommend that exhibits feature the following characteristics to make a solid argument to apt committees and the broader academic community.
these recommendations can also assist nonacademic library and archival institutions in creating high-quality exhibits of enduring value.
• demonstrated in-depth research comparable to a published article. the research process can be demonstrated not only by the quality of the exhibit text, but with proper citations, bibliographies, and primary source transcriptions and interpretations. these materials along with the main text, images, and other exhibit graphics can be submitted as examples of scholarly communication in the promotion and tenure dossier.
• enduring products. long after an exhibit is no longer physically on display, curators must provide evidence of an exhibit’s enduring value. one of the best ways to accomplish this is a professionally published exhibit catalog. however, publication costs can be prohibitively expensive, and not all institutions will support the creation of such a catalog. other ways to create enduring products include hosting a digital version of, or companion to, the exhibit on the institution’s website or, where available, within an open access digital repository for campus scholarship. in addition, photographs, audio, and/or video documenting the exhibit and any special events are vital for the tenure and promotion dossier, as are published event programs, invitations, agendas, syllabi, or other publications produced for special events and instruction sessions.
• outreach and special events. outreach and special events consist of symposia, alumni events, donor recognition events, general public programming, book signings, or any activities that will engage the academic community and the general public. these include collaborating with teaching faculty to use the exhibit in undergraduate or graduate instruction or workshops. in addition to hosting events at the exhibiting institution or in the gallery, faculty librarians can also present their work on an exhibit at outside scholarly conferences or community venues.
• peer review and collaboration. library colleagues, the campus community, and outside experts can provide peer review of exhibits in various ways. peer review can occur during production or after completion of the exhibit or as part of in-depth scholarly collaboration. collaboration can be especially fruitful at larger academic institutions with multiple curators responsible for creating exhibits and where teaching faculty are stakeholders in the exhibit outcome. it is also important to engage historians or other scholars who are knowledgeable about an exhibit’s topic early in the research and planning stages. collaborators or reviewers can provide feedback during the research and writing process and can write evaluative statements to accompany the promotion and tenure dossier. in addition, learning from and collaborating with other professional groups, such as the national association for museum exhibition, may assist in establishing better standards and a peer-review process for library or special collections exhibits.
• assessment, impact, and engagement. the ability to measure impact on the targeted audience is an important part of any exhibit. for online components of exhibits, analytics software is essential. jessica lacher-feldman’s book on exhibits in special collections provides some brief guidance on evaluating exhibits, including using assessment tools such as focus groups, virtual comment boxes, and online surveys. in addition, visiting groups and individuals can be surveyed or asked to provide comments in the tried-and-true physical comment box or registration book. coverage in the media is another important way to measure impact and engagement and also to advertise the exhibit to increase impact, as are social media tools.
impact can be measured in a more traditional way by tracking citations of an exhibit catalog in journal articles and other publications. finally, the museum profession has many resources and tools adaptable for use in assessing and evaluating academic library exhibits. some of these tools may be excessive for academic libraries, but they do provide a good place to start. the results that these assessment tools provide can be submitted with an individual’s promotion dossier to demonstrate impact and engagement.

these strategies for acceptance, documentation, and evaluation of exhibits are at the core of making exhibits an accepted component of a faculty archivist’s promotion and tenure dossier. library faculty are often encouraged to use day-to-day work to inspire scholarly research projects, and this can be an exciting proposition. inside and outside the academic institution, it is important to create “an environment of shared ownership and pride, which can only produce greater success.” however, those who insist on the seemingly more permanent nature of monographs and peer-reviewed articles continue to consider the nature of exhibits ephemeral and difficult to grasp. perhaps the ongoing dialog on the acceptance of digital humanities scholarship in the tenure process can also influence this discussion, especially since many physical exhibits also have online components and spark digital projects. while exhibits may have a different audience and impact than typical peer-reviewed journal articles, they are no less important to the scholarly endeavor. exhibits often have the ability to produce a broad impact on a more public, but no less important, audience than does an article in a peer-reviewed journal written for a small group of specialized scholars.
archivists, curators, and librarians need to take this advice to heart to produce exhibits and other collaborative, innovative scholarly projects that engage with students, teaching faculty, and the general public.

notes

some information in this article was presented at the mid-atlantic regional archives conference (marac) on october , , in richmond, virginia, by elizabeth a. novara as “exhibits as scholarship: strategies for acceptance, documentation, and evaluation in academic libraries,” for a panel session entitled archivists as academics: meeting scholarship and creativity requirements.
for this study we will use the term “librarian” to denote any information professional holding the faculty appointment of librarian. we hope that this study will prove most useful to library faculty who frequently have exhibit creation as a core responsibility, which includes archivists, curators, rare book librarians, and special collections librarians. however, our findings and recommendations should prove helpful to the profession at large.
other forms of scholarship often considered atypical include artwork, theatrical and musical performances, and digital humanities projects, although perspectives on these types of projects as scholarly work are evolving.
an arl survey of special collections repositories identified a “widespread emphasis on exhibits” and further “that the majority of respondents have a physical space within the library designated for this activity.” indeed, of arl survey respondents specified that they mount exhibits in a gallery space, a substantive exhibiting hall, or a partnering museum. see adam berenbak et al., special collections engagement, spec kit (washington, d.c.: association of research libraries, ), – , – .
for the sake of clarity, in this article we will refer to all policies pertaining to appointment, promotion, and tenure, continuing appointment, or permanent status as “apt” or “tenure” policies.
w. bede mitchell and bruce morton, “on becoming faculty librarians: acculturation problems and remedies,” college & research libraries , no. ( ): .
elizabeth joan kelly, “altmetrics and archives,” journal of contemporary archival studies , no. ( ): . see also acrl’s “scholarly communications toolkit: measuring impact,” http://acrl.libguides.com/scholcomm/toolkit/impact.
robert m. diamond, “tenure and promotion: the next iteration,” national academy for academic leadership, , http://www.thenationalacademy.org/readings/tenpromo.html. see also ernest l. boyer, scholarship reconsidered: priorities of the professoriate (princeton, n.j.: carnegie foundation for the advancement of teaching, ); r. eugene rice, “the new american scholar: scholarship and the purposes of the university,” metropolitan universities journal , no. ( ): – .
william k. black and joan m. leysen, “scholarship and the academic librarian,” college & research libraries , no. ( ): .
black and leysen, “scholarship and the academic librarian,” – .
laurel g. bowen and peter j. roberts, “exhibits: illegitimate children of academic libraries?,” college & research libraries , no. ( ): .
aaron d. purcell, academic archives: managing the next generation of college and university archives, records, and special collections (chicago: neal-schuman, ), , – .
bowen and roberts, “exhibits,” .
black and leysen, “scholarship and the academic librarian,” .
jessica lacher-feldman, exhibits in archives and special collections (chicago: society of american archivists, ), – .
j. matthew huculak and lisa goddard, “is promotion and tenure inhibiting dh/library collaboration? a case for care and repair,” dh+lib, july , , http://acrl.ala.org/dh/ / / /a-case-for-care-and-repair/.
huculak and goddard, “is promotion and tenure inhibiting dh/library collaboration?”
huculak and goddard, “is promotion and tenure inhibiting dh/library collaboration?”
jere d. odell and caitlin m. j. pollock, “open peer review for digital humanities projects: a modest proposal” (working paper presented at thatcamp indiana , university of notre dame, notre dame, ind., april , ), http://hdl.handle.net/ / .
bowen and roberts, “exhibits,” .
professional networks council of the american alliance of museums, “standards for museum exhibition and indicators of excellence,” national association for museum exhibition, https://www.name-aam.org/s/ -standards-for-museum-exhibitions-and-indicators-of-excellence.pdf.
jason baird jackson, “on the review of digital exhibitions,” museum anthropology , no. ( ): – .
jackson, “on the review of digital exhibitions,” – .
working group on evaluating public history scholarship, “tenure, promotion, and the publicly engaged academic historian: a report,” american historical association, perspectives on history (september ), http://www.historians.org/publications-and-directories/perspectives-on-history/september- /tenure-promotion-and-the-publicly-engaged-academic-historian-a-report.
working group on evaluating public history scholarship, “tenure, promotion, and the publicly engaged academic historian.”
jacques berlinerblau, “survival strategy for humanists, engage, engage,” the chronicle of higher education , no. ( ): a –a .
william gray potter, colleen cook, and martha kyrillidou, arl profiles: research libraries (washington, d.c.: association of research libraries, april ), , – , , http://www.arl.org/storage/documents/publications/arl-profiles-report- .pdf.
“principles of membership in the association of research libraries,” association of research libraries (september , ), http://www.arl.org/storage/documents/publications/arl-membership-principles.pdf.
“special collections,” association of research libraries, http://www.arl.org/focus-areas/research-collections/special-collections.
“promotion and tenure requirements for peer institutions,” university of washington libraries staff web, http://staffweb.lib.washington.edu/committees/aluw/status/p-t-information/peers; academic-librarian-status wiki, “a guide to the professional status of academic librarians in the united states (and other places)” ( ), https://academic-librarian-status.wikispaces.com.
“a guideline for the appointment, promotion, and tenure of academic librarians,” association of college and research libraries (june ), http://www.ala.org/ala/mgrps/divs/acrl/standards/promotiontenure.cfm.
“university at buffalo libraries criteria for library faculty personnel actions” (june revision), http://library.buffalo.edu/jobs/files/libcriteria oct .pdf.
human resources, “guideline ul-hrg : promotion and tenure criteria guidelines,” pennsylvania state university libraries (last review date july ), https://libraries.psu.edu/policies/ul-hrg .
“appointment, promotion, and tenure criteria and procedures for the university libraries,” ohio state university libraries ( ), https://library.osu.edu/document-registry/docs/ /stream.
appointment, promotion and permanent status committee, “university of maryland guidelines for appointment, promotion, and permanent status of library faculty,” university of maryland libraries staff intranet (revised july ).
office of the provost, “promotion and tenure, office of the provost communication # ,” university of illinois, urbana-champaign (july ; revised may ), https://provost.illinois.edu/policies/provosts-communications/communication- -promotion-and-tenure/.
university of georgia libraries committee on promotion, “criteria for appointment and promotion” (undated, updated since ), http://www.libs.uga.edu/employee-resources/committees/promotion/criteria.
“university of arizona, library faculty assembly bylaws” (approved may , ), received via email from verónica reyes-escudero, special collections associate librarian, august , .
“university at buffalo libraries criteria for library faculty personnel actions.”
appointment, promotion, and permanent status committee, “university of maryland guidelines for appointment, promotion, and permanent status of library faculty.”
appointment, promotion, and permanent status committee, “university of maryland policy on appointment, promotion, and permanent status of library faculty,” university of maryland libraries staff intranet (revised july , ; technical amendments on may , ).
“auburn university faculty handbook,” section . . . “policy and procedure for promotion and tenure” (approved april , ), http://www.auburn.edu/academic/provost/facultyhandbook/.
“colorado state university libraries faculty code,” appendix a: criteria and standards for reappointment, promotion and tenure (approved by the libraries faculty, june , ), http://lib.colostate.edu/images/about/goals/facultycode/csulfacultycodecurrent.pdf.
“library faculty promotion and tenure policies and procedures,” iowa state university library (april ), http://www.lib.iastate.edu/cfora/pdf/ .pdf.
human resources, “guideline ul-hrg .”
mitchell and morton, “on becoming faculty librarians,” .
see the previous arl spec kits on librarian faculty and tenure as examples: susan a. massey and mary ann sheble, faculty organizations in arl libraries: activities and documents, spec kit (washington, d.c.: association of research libraries, office of management services, ); tracy bicknell-holmes and kay logan-peters, external review for promotion and tenure, spec kit (washington, d.c.: association of research libraries, ).
professional networks council of the american alliance of museums, “standards for museum exhibition and indicators of excellence.”
lacher-feldman, exhibits in archives and special collections, – .
many tools exist within the museum profession for evaluating exhibits. two relevant books include beverly serrell, judging exhibitions: a framework for assessing excellence (walnut creek, calif.: left coast press, ); and judy diamond, michael horn, and david h. uttal, practical evaluation guide: tools for museums and other informal educational settings (lanham, md.: rowman and littlefield, ). see also alan teller, “assessing excellence in exhibitions: three approaches,” exhibitionist (fall ): – .
lacher-feldman, exhibits in archives, .
see the journal of digital humanities , no. ( ), http://journalofdigitalhumanities.org/ - /. the entire issue is devoted to evaluating digital humanities scholarship and the tenure process.

about the authors

elizabeth a. novara has been the curator of historical manuscripts at the university of maryland libraries for almost ten years. in this position, she manages archival and print collections related to the state of maryland history, historic preservation, and women’s studies. previously, she held the positions of project archivist and assistant university archivist.
her research interests include women’s and gender history as well as issues related to women’s collections within academic libraries. she has curated exhibits related to women and the american civil war, maryland history, and university of maryland history.

vincent j. novara is curator for special collections in performing arts at the michelle smith performing arts library, university of maryland (umd), where he earned his master’s of music. a certified archivist, he has held archivist positions at umd since and was appointed curator in . his scholarly communications include exhibitions, book chapters, and articles and reviews in music library association’s notes, acrl’s choice, and educational media reviews online, and he has participated as panelist, presenter, and moderator at numerous conferences. novara also instructs the “project management for the archival workplace” workshop for the mid-atlantic regional archives conference.

digital humanities

does size matter? authorship attribution, small samples, big problem

eder, maciej
maciej_eder@poczta.onet.pl
pedagogical university, krakow, poland

the aim of this study is to find a minimal size of text samples for authorship attribution that would provide stable results independent of random noise. a few controlled tests for different sample lengths, languages and genres are discussed and compared. although i focus on delta methodology, the results are valid for many other multidimensional methods relying on word frequencies and "nearest neighbor" classifications.

in the field of stylometry, and especially in authorship attribution, the reliability of the obtained results becomes even more essential than the results themselves: failed attribution is much better than false attribution (cf. love, ). however, while dozens of outstanding papers deal with increasing the effectiveness of current stylometric methods, the problem of their reliability remains somehow underestimated.
especially, the simple yet fundamental question of the shortest acceptable sample length for reliable attribution has not been discussed convincingly. in many attribution studies based on short samples, despite their well-established hypotheses, convincing choice of style-markers, advanced statistics applied and brilliant results presented, one cannot avoid a very simple yet uneasy question: whether those impressive results could be obtained by chance, or at least positively affected by randomness? this question can be also formulated in a different way: if a cross-checking experiment with numerous short samples were available, would the results be just as satisfying? . hypothesis it is commonly known that word frequencies in a corpus are random variables; the same can be said about any written authorial text, like a novel or poem. being a probabilistic phenomenon, word frequency strongly depends on the size of the population (i.e. the size of the text used in the study). now, if the observed frequency of a single word exhibits too much variation for establishing an index of vocabulary richness resistant to sample length (cf. tweedie and baayen, ), a multidimensional approach – based on several probabilistic word frequencies – should be even more questionable. on theoretical grounds, we can intuitively assume that the smallest acceptable sample length would be hundreds rather than dozens of words. next, we can expect that, in a series of controlled authorship experiments with longer and longer samples tested, the probability of attribution success would at first increase very quickly, indicating a strong correlation with the current text size; but then, above a certain value, further increase of input sample size would not affect the effectiveness of the attribution. 
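the dependence of an observed word frequency on sample size can be illustrated with a minimal sketch. it assumes, as a simplification that the text itself does not make, that every token is an independent draw, so that the count of a word follows a binomial distribution; the function name and the example frequency of 1% are hypothetical.

```python
import math


def relative_frequency_se(p: float, n: int) -> float:
    """Standard error of an observed relative frequency.

    Assumes a binomial model: each of the n tokens is an independent
    draw that is the target word with probability p. Real texts violate
    this assumption, so the value is only a rough lower bound on noise.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    p = 0.01  # hypothetical word making up 1% of the tokens
    for n in (100, 1_000, 10_000):
        se = relative_frequency_se(p, n)
        # the relative error (se / p) shrinks with the square root of n
        print(f"n={n:>6}  se={se:.5f}  relative error={se / p:.2f}")
```

under this model the noise falls with the square root of the sample length, which is consistent with the expectation stated above that attribution success rises quickly at first and then levels off.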
in any attempt to find this critical point in terms of statistical investigation, one should be aware, however, that this point might depend – to some extent – on the language, genre, or even the text analyzed.

. experiment i: words

a few corpora of known authorship were prepared for different languages and genres: for english, polish, german, hungarian, and french novels, for english epic poetry, latin poetry (ancient and modern), latin prose (non-fiction), and for ancient greek epic poetry; each contained a similar number of texts to be attributed. the research procedure was as follows. for each text in a given corpus, randomly chosen single words were concatenated into a new sample. these new samples were analyzed using the classical delta method as developed by burrows ( ); the percentage of attributive success was regarded as a measure of effectiveness of the current sample length. the same steps of excerpting new samples from the original texts, followed by the stage of "guessing" the correct authors, were repeated for the length of , , , ..., words per sample.

the results for a corpus of english novels are shown in fig. . the observed scores (black points on the graph; grey points will be discussed below) clearly indicate the existence of a trend (solid line): the curve, climbing up very quickly, tends to stabilize at a certain point, which indicates the minimal sample size for the best attributing rate. it becomes quite obvious that samples shorter than words provide a poor "guessing", because they can be immensely affected by random noise. below the size of words, the obtained results are simply disastrous. other analyzed corpora showed that the critical point of attributive success could be found between and words per sample (and there was no significant difference between inflected and non-inflected languages).
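the procedure of experiment i can be sketched roughly as follows. this is an illustration under stated assumptions, not the code used in the study: words are drawn without replacement, the most frequent words are taken from the pooled reference texts, the population standard deviation is used for the z-scores, and all function names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def random_word_sample(tokens, size, rng):
    """Draw `size` single words at random (assumption: without replacement)."""
    return rng.sample(tokens, size)


def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word in a token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[word] / total for word in vocabulary]


def burrows_delta(test_freqs, candidate_freqs, means, stdevs):
    """Mean absolute difference between the z-scores of two profiles."""
    diffs = [
        abs((t - m) / s - (c - m) / s)
        for t, c, m, s in zip(test_freqs, candidate_freqs, means, stdevs)
        if s > 0  # skip words with no variation across the reference texts
    ]
    return mean(diffs) if diffs else 0.0


def attribute(test_tokens, reference_texts, n_mfw=100):
    """Return the author whose reference text has the smallest delta."""
    pooled = Counter()
    for tokens in reference_texts.values():
        pooled.update(tokens)
    vocabulary = [word for word, _ in pooled.most_common(n_mfw)]

    profiles = {
        author: relative_frequencies(tokens, vocabulary)
        for author, tokens in reference_texts.items()
    }
    columns = list(zip(*profiles.values()))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]

    test_profile = relative_frequencies(test_tokens, vocabulary)
    return min(
        profiles,
        key=lambda a: burrows_delta(test_profile, profiles[a], means, stdevs),
    )
```

repeating `random_word_sample` and `attribute` for every text and for increasing values of `size`, and recording the share of correct answers at each size, would give a curve of the kind the figures describe.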
better scores were obtained for the two poetic corpora: english and latin ( words per sample were enough for good results), and, surprisingly, the corpus of latin prose (its minimal effective sample size was of some words; cf. fig. , black points). . experiment ii: passages the way of preparing samples by extracting a mass of single words from the original texts seems to be an obvious solution for the problem of statistical representativeness. in most attribution studies, however, shorter or longer passages of disputed works are usually analyzed (either randomly chosen from the entire text, or simply truncated to the desired size). the purpose of the current experiment was to test the attribution effectiveness of this typical sampling. the whole procedure was repeated step by step as in the previous test, but now, instead of collecting individual words, sequences of words (then , , ..., ) were excerpted randomly from the original texts. three main observations could be made here: . for each corpus analyzed, the effectiveness of such samples (excerpted passages) was always worse than the scores described in the former experiment, relying on the "bag-of-words" type of sample (cf. fig. and , grey points). . the more inflected the language, the smaller the difference in correct attribution between both types of samples, the "passages" and the "words": the greatest in the english novels (cf. fig. , grey points vs. black), the smallest in the hungarian corpus. . for "passages", the dispersion of the observed scores was always wider than for "words", indicating the possible significance of the influence of random noise. this effect might be due to the obvious differences in word distribution between narrative and dialogue parts in novels (cf. hoover, ); however, the same effect was equally strong for poetry (latin and english) and non-literary prose (latin). . 
experiment iii: chunks

at times we encounter an attribution problem where extant works by a disputed author are doubtless too short for being analyzed in separate samples. the question is, then, if a concatenated collection of short poems, epigrams, sonnets, etc. in one sample (cf. eder and rybicki, ) would reach the effectiveness comparable to that presented above? and, if concatenated samples are suitable for attribution tests, do we need to worry about the size of the original texts constituting the joint sample? the third experiment, then, was designed as follows. in iterations, several word-chunks were randomly selected from each text into -word samples: bi-grams, tetragrams, chunks of words in length, of words, and so on, up to chunks of words. thus, all the samples in question were words long.

the obtained results were very similar for all the languages and genres tested. as shown in fig. (for the corpus of polish novels), the effectiveness of "guessing" depends to some extent on the word-chunk size used. although the attributive scores are slightly worse for long chunks within a sample ( words or so) than for bi-grams, -word chunks etc., every chunk size could be acceptable to constitute a concatenated sample. however, although this seems to be an optimistic result, we should remember that this test would not be feasible on really short poems. epigrams, sonnets etc. are often masterpieces of concise language, with a domination of verbs over adjectives and so on, and with a strong tendency to compression of content. for that reason, further investigation is needed here.

. conclusions

the scores presented in this study, as obtained with classical delta procedure, would be slightly
however, the shape of all the curves, as well as the point where the attributive success rate becomes stable, are quite identical for each of these methods. the same refers to different combinations of style-markers' settings, like "culling", the number of the most frequent words analyzed, deleting/non-deleting pronouns, etc. – although different settings provide different "guessing" (up to % for the most efficient), they never affect the shape of the curves. thus, since the obtained results are method-independent, this leads us to a conclusion about the smallest acceptable sample size for future attribution experiments and other investigations in the field of stylometry. it also means that some of the recent attribution studies should be at least re-considered. until we develop style-markers more precise than word frequencies, we should be aware of some limits in our current approaches. as i tried to show, using -word samples will hardly provide a reliable result, to say nothing of shorter texts.

figure : english novels
figure : latin prose
figure : polish novels

references

burrows, j. f. ( ). 'delta: a measure of stylistic difference and a guide to likely authorship'. literary and linguistic computing. : - .
craig, h. ( ). 'stylistic analysis and authorship studies'. a companion to digital humanities. s. schreibman, r. siemens and j. unsworth (ed.). blackwell publishing, pp. - .
eder, m., rybicki, j. ( ). 'pca, delta, jgaap and polish poetry of the th and the th centuries: who wrote the dirty stuff?'. digital humanities : conference abstracts. university of maryland, college park, pp. - .
hoover, d. l. ( ). 'statistical stylistic and authorship attribution: an empirical investigation'. literary and linguistic computing. : - .
hoover, d. l. ( ). 'multivariate analysis and the study of style variation'. literary and linguistic computing. : - .
juola, p., baayen r. h. ( ).
'a controlled-corpus experiment in authorship identification by cross-entropy'. literary and linguistic computing. suppl. issue : - .
love, h. ( ). attributing authorship: an introduction. cambridge: cambridge university press.
rudman, j. ( ). 'the state of authorship attribution studies: some problems and solutions'. computers and the humanities. : - .
rybicki, j. ( ). 'does size matter? a re-examination of a time-proven method'. digital humanities : book of abstracts. university of oulu, pp. .
tweedie, j. f. and baayen, r. h. ( ). 'how variable may a constant be? measures of lexical richness in perspective'. computers and the humanities. : - .

publications , , ; doi: . /publications www.mdpi.com/journal/publications

article

open science in the humanities, or: open humanities?

marcel knöchelmann
department of information studies, university college london, foster court, gower street, london wc e bt, uk; marcel.knochelmann. @ucl.ac.uk
received: october ; accepted: november ; published: november

abstract: open science refers to both the practices and norms of more open and transparent communication and research in scientific disciplines and the discourse on these practices and norms. there is no such discourse dedicated to the humanities. though the humanities appear to be less coherent as a cluster of scholarship than the sciences are, they do share unique characteristics which lead to distinct scholarly communication and research practices. a discourse on making these practices more open and transparent needs to take account of these characteristics. the prevalent scientific perspective in the discourse on more open practices does not do so, which confirms that the discourse’s name, open science, indeed excludes the humanities so that talking about open science in the humanities is incoherent.
in this paper, i argue that there needs to be a dedicated discourse for more open research and communication practices in the humanities, one that integrates several elements currently fragmented into smaller, unconnected discourses (such as on open access, preprints, or peer review). i discuss three essential elements of open science—preprints, open peer review practices, and liberal open licences—in the realm of the humanities to demonstrate why a dedicated open humanities discourse is required. keywords: open humanities; open science; digital humanities; scholarly communication; peer review . introduction there is a long history of sorting disciplines into clusters, primarily the sciences and humanities [ – ]. these clusters are, at times, extended to a triad with the social sciences in between. contrary to the impression this clustering conjures, though, no exact distinction can be drawn between the sciences and the humanities (or the social sciences in between). not one binary opposition, nor a combination of several ones, can describe the differences that would suffice for a clear-cut separation of disciplines: understanding or explaining, idiographic or nomothetic, qualitative or quantitative, meaning or theory—all fall short of describing but a few temporal or field-specific regularities. any such distinction can at best approximate what unites and separates disciplines so that, in the end, it is a question of purpose or necessity on the basis of which the exercise of unifying or separating disciplines is to be undertaken. one such necessity arises when norms and practices of open and transparent research and communication are to be debated. disciplines share characteristics of research and communication practices that require a discourse on these practices to sort disciplines into clusters so as to sufficiently address the characteristics of these disciplines. 
there is a specific discourse dedicated to open practices for disciplines of the sciences: open science. this discourse regularly takes for granted that it speaks for scholarly communication as a whole. however, already its name, open science, indicates that this discourse is not concerned with the humanities but with a cluster of scientific disciplines. there is no such discourse dedicated to the humanities. this is the case, although research and communication practices in disciplines commonly clustered into the humanities do share characteristics that could (and, i argue, need to) be addressed by a coherent discourse as well. the coherence of this discourse builds upon the unifying characteristics of its disciplines. without such unifying grounding, contributions to a potential discourse would only be concerned with elements of some disciplines (in the humanities, for instance: philology, philosophy, history, theology, among others) instead of the humanities as a whole. the result would be a fragmentation into several smaller discourses such as open philology, open philosophy, etc. characteristics on the basis of which disciplines can be sorted into the humanities are, for instance, an emphasis on perspectivity (as opposed to objectivity in the sciences), verbality (as opposed to reliance on models), or historicity (as opposed to systemic integration) of contributions to discourses in these disciplines [ ]. these characteristics are expressive of the research paradigms and epistemologies employed in what is commonly termed the humanities: the importance of hermeneutics, source criticism, and nuanced, contextual meaning [ – ]. these more abstract characteristics lead to distinct practices such as the reliance on long-form publications (primarily the monograph), qualitative arguments, slower, editorially-heavy publishing processes, recursivity of its discourses, critique, and qualitative embedding of references [ – ].
moreover, the humanities live on a culture of debate, with the analysis and dialectic of interpretative understanding at its core [ ]. scholars in the humanities focus on “interpretation and critical evaluation, primarily in terms of the individual response and with an ineliminable element of subjectivity” [ ] (p. ). the resulting discourses are based on the power of arguments so that the “overall cogency of a substantial piece of work seems more closely bound up with the individual voice of its author” [ ] (p. ). dissonance is essential and there is no need for agreement in a discourse for it to be successful in scholarship. scholars may never reach consensus; their arguments of disagreement are essential bits of knowledge production. all this makes critiquing and reinterpreting existing contributions to a discourse—thus, continually and recursively coming back to previous work—an integral part of the humanities. a prerequisite for this is that scholarly communication practices enable such a culture of debate to flourish. a liberal understanding of scholarly practices enabled by free access to contributions, diversity of argument, intention of the author to contribute to discourses—as opposed to the intention to publish for the sake of authorship and reputation—can be supportive of such flourishing (as is discussed for the characteristics of the sciences in open science). there is, however, no discourse concerned with such elements that are dedicated to the here-offered characteristics of the humanities. in other words: though there is no one field of scholarly communication—but at least one for each cluster of scholarship—there is currently only one dedicated discourse on open research and scholarship, and this is open science. . open science, open humanities, and digital humanities . . 
the meaning of discourse

there is a long and strong thread of (mostly scholarly) discourses on topics of openness in all forms: open source, open access, and open science in particular, and the digital means and discourse conventions enabling openness of scholarly communication in general. moreover, discourse is a term with a difficult genealogy. its meaning varies depending on the disciplinary and temporally-situated context. this necessitates differentiating what i mean by my claim for a dedicated discourse. discourse here refers to a debate that includes various forms of textual or oral communication as contributions to a specific enquiry and body of knowledge. such discourse is both the intellectual construction of the object of enquiry and serves as a reference to the practices it is dedicated to (in discursive form). this notion does not refer to a foucauldian conception of discourse that includes practices, but comes closer to the early habermasian tradition that posits as discourses a rational exchange of communicative action. [footnote: there are variances in these, of course. as mentioned in paragraph one, such unifications are approximations at best and so it needs to be stated that, for philosophy, for instance, the monograph has become less important.] the conception of discourse in this article comes closest to that offered by hyland [ ]. it is the engagement of experts (or novices who aspire to become experts by contributing to discourses) in communication that is governed by conventions fixed temporally only by the very members of that discourse. contributions to a discourse are accepted and negotiated by those who already partake in the discourse. on the one hand, this is based on peer review or editorial decision making, and, on the other hand, on less standardised forms of communication, for instance, in blogs, at conferences, or on social media.
the discourse, therefore, is constituted by the contributions to it which are either accepted formally through selection, or included informally through references to them. crucially, such notion of discourse links the communicative action happening within the discourse to the individuals who contributed to it. they form discourse communities and, as hyland summarises, the ways the members of these communities “understand knowledge, what they take to be true, and how they believe such truths are arrived at, are all instantiated in a community’s discourse conventions” [ ] (p. ). a new discourse is rarely established by an artificial gathering of contributions, but is formed organically by the need of enquiry or the lack of a coherent body of knowledge. hence, proposing a discourse means demanding more enquiry to form a more coherent body of knowledge. contributions to a later-formed discourse may already exist in disparate forms. as i will show in sections . to . , there are few instances where the object in question—open practices in the humanities—is already touched on. these instances are not linked, though, because of the lack of the dedicated discursive realm. similarly, the discourse on open science was not opened as a discursive realm dedicated to open science. its origins are widely spread and contributions to it may stem from a wide array of other discourses that existed before open science was established (see section . for a discussion of open science). it is, thus, not merely the dedication of a new terminology. the term open humanities has been used before, but this does not mean that there is an open humanities discourse. demanding such a discourse does not dismiss existing explorations of open practices in humanities disciplines, but calls for a dedicated communicative realm where such enquiries have space to be taken on in an integrated and focussed manner. 
moreover, the members of a discourse community may have worked on practical implementations, thus, conversions of knowledge into practices or experimentation to induce practical knowledge. that these practices are concerned with the knowledge in action that may be part of a discourse does not constitute that these practices are themselves components of that discourse. only by means of textual reporting do the experiences of implementing practices or applying knowledge feed back into discourses, especially discourses governed by scholarship such as open science, digital humanities, or a potential open humanities. this distinction is essential to the ensuing discussion in this article. though there are or have been instantiations of practices in a form of open humanities—irrespective of these being called open humanities—these instantiations do not as such contribute to an open humanities discourse; they either form minor contributions to the digital humanities discourse (see section . ), or they contribute to other, disparate discourses which are not linked to each other (as in the case of some of the elements such as preprints or liberal licensing discussed in section ). already the existence of such practices that are linked to each other through their integrated nature in the field requires that the enquiries into these practices, their textual representations, and the knowledge of these practices are linked as well. thus, what a discourse is cannot be defined before its existence, as only the content of those elements of communication that contribute to the discourse determine its boundaries. with this in mind, i do not aim to define what open humanities needs to be in general or in detail, but demand that there is a discursive realm where potential elements have the potential to be linked and debated. these elements will then render the realm and define its boundaries. 
open practices in the humanities could be an element of either open science or digital humanities discourses. there is no definition or guard preventing the uptake of this direction within the existing discourses. however, as i will demonstrate with the relevant literature in section , and the discussion in section , existing contributions to these discourses show that they are conceptually unqualified (open science), or lack coherence and dedication (digital humanities).

. . open science

the term open science refers to the historical and contemporary practices and norms of open research and communication in disciplines of the sciences as well as to the discourse on these practices and norms. david [ ] finds historical origins of open science practices in early developments of the then still less formal conduct of natural philosophical enquiry in the late sixteenth century, a time when there was not even a separation of clusters of scholarship into sciences and humanities. vicente-saez and martinez-fuentes [ ] aim to determine an integrated definition for the proclaimed “disruptive phenomenon” that open science is and arrive at: “[o]pen [s]cience is transparent and accessible knowledge that is shared and developed through collaborative networks”. in addition to having a primarily static definition of open science, they remain diffuse on what knowledge is. madsen simply sees open science as a movement that “seeks to promote openness, integrity and reproducibility in [scientific] research” [ ]. fecher and friesike [ ] have a more wide-ranging approach to defining open science, including processes, infrastructures, measurement, and society outside institutionalised academia.
though they state in their introduction that open science is concerned with the “future of knowledge creation and dissemination”, making no distinction here between clusters of disciplines, they only refer to scientists (next to politicians, citizens, or platform providers) when they discuss their open science schools of thought. they either presume that, due to epistemological distinctions, the humanities disciplines are not to be found in the realm of knowledge production, or they locate the humanities disciplines outside of any of their five schools of thought of open science. such scientific perspective is further reinforced by friesike et al. in another study on the emergent field of research on contemporary openness in research [ ]. a different approach to defining open science comes from a review conducted by peters [ ]. after having reviewed the dimension and some historical origins of thought about open science, peters offers conclusive remarks about the nature of thought that underscores open science and the broader philosophy of openness. what he does not do, though, is examine the application of this in specific research, thus, potentially enlightening the eventual differences of the development of an open culture between disciplines of the humanities and the sciences. similar approaches and shortcomings can be found in other articles concerned with open science (see [ ] or [ ]). thus, it can be confirmed that open science is taken to be literal—science-related. open science is a concept for scientific research; the broader terminology encompassing also humanities and social science disciplines may be open scholarship, which, in short, means “opening the process of scholarship”, irrespective of discipline [ ]. thus, by definition, open scholarship includes all scholarship—irrespective of disciplinary specifics. but the above-mentioned characteristics of the humanities necessitate such differentiation. 
and with the humanities being in the process of opening communication and research practices as well, a dedicated space is required for debating these processes and emerging practices, one that complements open science but does not resolve into the overly abstract open scholarship. such a dedicated discourse should not be read as a demand to separate sciences and humanities or to reinforce a dichotomous perspective on scholarship, but as a reference to the unifying aspects of disciplines and their practices that allow for such a clustering. moreover, what this overview of the definitions of open science shows is that it has shortcomings in addressing the humanities. open science is not simply reducible to scientific disciplines and it is not my objective to do so; however, it is, as the literature shows, the case that open science does not address the unique characteristics of the humanities both terminologically and conceptually, making an open humanities discourse necessary.

. . the necessity of a discourse on open humanities

arguments for the necessity to establish such a dedicated discourse can be made in manifold ways: the humanities are lagging behind the sciences in the transformation towards openness; the humanities are but a by-product of open science due to the lack of a discourse of their own; the fragmentation of discourses about open practices in the humanities requires an integration of these smaller discourses into a single discourse (for instance, the connection of preprints and peer review as discussed in section ); there lies strength in a focussed, single voice of a discourse community such as (a potential) open humanities with which to address issues of policy and funding that are more and more concerned with openness.
the most pressing argument, however, comes from within scholarship of the contemporary humanities: the inadequacies of the current practices of scholarly communication require a systemic approach to finding new solutions. dedicating a discursive realm termed open humanities to this quest does not necessarily mean that openness is to be taken as the only solution to these shortcomings. but without a discourse analogous to that of open science, there is no realm within which the potential of openness as a solution can be determined for the humanities. among the inadequacies of communication practices in the contemporary humanities are, for instance, the detrimental ways “peer reviewers criticize one another” [ ]; the “great many unnecessary and inadequate publications” because of wrong incentives and evaluation mechanisms [ ]; the fear of subjectivity that is immanent to judgement of quality alongside the denial of subjectivity in quantitative measurement [ ]; the different funding and financial support structures that are unfit for a quicker uptake of open access in the humanities [ ]; the debate around the “problem of value, transparency, and distributed financing of disciplinary activities” that arises because of the reluctance of learned societies to engage in more open processes [ ]; the poisoned paradigm behind productivity and excellence, because of which scholarly communication is increasingly alienated [ , , , ]. some of these issues are equally applicable to all disciplines, but by means of their discourse on openness, these issues are regularly addressed for the sciences (but not so for the humanities). the latest example of this is the recently published ten hot topics around scholarly publishing by tennant et al. [ ]. tennant et al.’s review article provides a useful guide to current debates in scholarly communication in the sciences; it is framed, however, as a review of scholarly publishing as a whole.
this framing would make it necessary to include perspectives on the humanities which it obviously lacks. of the ten hot topics, only three appear to be not focussed on journal articles and only four are not primarily concerned with scientific literature. especially those topics that take on issues of research quality, judgement, and objectivity do not discern the profound differences that are in place between the communication practices of scientific and humanities disciplines. this makes both choice of and approach to the disputed topics a perpetuation of debates rooted in scientifically minded open scholarship practices—in short: open science. what, then, are hot topics in (open) humanities publishing? a point can be made about the higher pressure of openness in scientific disciplines because of more policy work. indeed, open policy is a key element of open science as outlined in some of the conceptual frameworks such as foster open science. this is less so the case in the humanities. the demand for openness in the humanities seems to be rooted in scholarship itself, whereas it is rooted in both scholarship and policies in the sciences. while this may impact the pace of implementing open practices in the humanities, it does not affect the necessity to have a discourse on these practices. . . open access, open humanities, and digital humanities open access is one of the hot topics found in both the sciences and the humanities. but as opposed to issues such as peer review, preprints, or licences, open access in the humanities is well- as the authors state in the article, the choice of topics arose by means of a somewhat democratic process through a discussion on social media. the demos in this process, however, may have been unrepresentative for the humanities resulting in these science-focussed ten topics. publications , , of established as a discourse, or: within the discourse on open access, distinctions are made between the sciences and the humanities [ – ]. 
though the early uptake of open access took place in non-humanities disciplines, the early declarations on open access already include the humanities, with the berlin declaration explicitly being issued as “berlin declaration on open access to knowledge in the sciences and humanities” [ ] (own emphasis). hence, other than for complementary elements of open science, the humanities developed their own discourse on open access, often focussing on the technicalities of implementing more open forms of publishing. however, whereas open access to scientific literature is embedded in an established discourse on open science, open access to scholarly literature in the humanities often remains in quarantine without a broader discursive framework, such as open humanities, within which it could be embedded. even kleineberg and kaden, who enquire into the need for a concept of open humanities, cling to open access as a key issue already in their heading (though they do refer to open research data and open review as well in their article) [ ]. kleineberg and kaden’s contribution shows that, theoretically, there is a discourse on open humanities in place. in practice, however, it seems to be mostly invisible, especially in comparison with open science and digital humanities as well as in reference to the fundamental changes potential open practices in the humanities would mean for scholarship. the reluctance to take up open humanities may be due to the outsourcing of digitality in the humanities that is unseen in scientific disciplines. whereas digital methods are integral to the disciplines of the life, natural, or applied sciences, the humanities developed digital humanities to devise new methods empowered by digitality.
in other words, where digitality is part of each scientific discipline, open science is the dedicated realm for debating open practices in the sciences; humanities scholars would refer to digital humanities for digitality in humanities disciplines but lack a discourse as a reference to open practices. the digital humanities appear with a twofold mission: applying digitality to support or help answer questions in traditional humanities disciplines, and exploring what it means to be human in an increasingly digital environment [ ]. the former mission is transdisciplinary in nature, bridging humanities with digital specialist realms. because of this nature, though, it advances only what is already enquired in the humanities but may not lead to new epistemological efforts; or, as gibbs and owens argue, “[d]espite significant investment in digital humanities tool development, most tools have remained a fringe element in humanities scholarship” [ ]. digitality remains an add-on to the humanities, a set of tools and approaches that requires more focus and aspiration for genuine integration into the traditional disciplines [ ]. the latter mission seems philosophical, or even sociological, in nature, touching on questions of digital hermeneutics and ontology. it may well be argued that this is a genuine task for philosophers or sociologists which only appears in the guise of a different discourse, termed digital humanities. overall, this layover discipline may have the potential to gather scholars to form a community that drives a discourse on open humanities—only termed digital instead of open humanities. this is questionable, though. much of the discourses within digital humanities focus on digital practices, applied methods, or projects, instead of on open practices comparable to open science. 
there are exceptions to this, authors who directly or indirectly analyse or discuss elements considered to be part of open science (or open humanities respectively); examples include borgmann, who discusses “publication practices, data, research methods, collaboration, incentives, and learning” for the “future of digital scholarship in the humanities” [ ] (p. ), thus describing essential prerequisites for open humanities. bianco emphasises the “tremendous opportunities” that digital humanities have “to open up research modes, methods, practices, objects, narratives, locations of expertise, learning and teaching” [ ], but fails to connect this to any of the larger contexts of open science/humanities or open scholarship. borrelli looks at the distinction of digital practices as opposed to digitalised practices [ ], touching on open access and peer review. pritchard looks at an early, concrete version of open scholarly communication in the humanities from which he deduces more general findings about pre-/post-prints, open access, and a digital, potentially open infrastructure in classical studies [ ]. kuhn and hagenhoff analyse requirements of digital monograph publishing and conclude with a decisively progressive, open potential for an outdated publishing model [ ]. fitzpatrick contributes comprehensive discussions of inadequate scholarly communication practices for which openness is offered as the best way forward [ – ]. cohen discusses digital processes as a possible solution to one of the fundamental problems scholarly communication in the humanities exhibits: the social contract that is actualised by traditional, institutional publishing [ ].

footnote: see, for instance, key journals: digital humanities quarterly, debates in digital humanities, digital scholarship in the humanities, zfdg - zeitschrift für digitale geisteswissenschaften, digital studies / le champ numérique, or international journal of humanities and arts computing.
again, this article makes no reference to open humanities (or open science/open scholarship respectively). these contributions are starting points for debates. but their glaring shortcoming is that they are not integrated into a dedicated discourse, and, instead, remain disintegrated and a minor concern next to the many contributions on methods and projects that appear in the digital humanities discourses. integration here means bringing together the different ideas and enquiries to debate their interconnectedness and implementation in practice. combined, the various open practices pose a fundamental change of scholarship in the humanities, and to get scholars to engage with the shaping of these practices, there needs to be a single discourse as a reference, one that is centred around open humanities and not mixed up with the various debates on methods presented in the digital humanities. in conclusion, the digital humanities have a topical agenda that is only to a lesser degree concerned with what would be an open humanities discourse. in the following section, i will look at some of the elements that such a discourse would be required to take on and reconfigure.

discussion of practices of open science and their applicability in the humanities

the foster open science taxonomy names a variety of elements from open access and open data to open reproducible research and open science policies, tools, or guidelines as first order elements of open science [ ]. most of these elements further spread to second or third order elements among which range some of the often-debated topics such as open metrics and impact or open peer review. this taxonomy comprises many terms explicitly connected to the sciences. as is apparent from the terminology, these elements are largely indifferent to the humanities: they are intended to advance, or improve, scholarship specifically in scientific disciplines by making it more open.
i will discuss three key elements of the open science taxonomy with respect to their meaning and how the distinction of practices requires a different understanding of these terms in the humanities.

preprints in the humanities

a preprint is a manuscript of an article, book, or chapter that is published in a distinguished online repository before it is formally published in a journal or as a book [ , ]. this quite general definition can be discussed in more detail with respect to its elements: the manuscript is usually one authored for the purpose of being published in a peer reviewed journal or book; publication online in a repository means that the manuscript is published rather informally (i.e., not necessarily formatted according to a publisher’s guidelines) on an online server that functions as a repository for such manuscripts, optionally specifically for a discipline. this online publication is freely accessible under a creative commons licence; the manuscript is likely to be formally published later on in a similar, but potentially revised version. the ostensibly outdated term preprint derives from the idea that the manuscript is available for debate before its formal imprimatur in a journal or book, thus, before it is printed, irrespective of this being done with ink or digitally. this also sheds light on the purpose of using preprints: they accelerate scholarship without compromising authorship and enable early debates [ , ]. the formal publishing process usually takes time, especially because of the standard pre-publication peer review [ ]. during the time of this peer review, potential readers can already evaluate and judge the research as it is publicly available as a preprint. at the same time, the author(s) can take advantage of the preprint because they have an early citable and timestamped manuscript for the time of it being in review for a journal or book.
these characteristics have a potentially positive impact on scientific scholarship, as tennant et al. [ ], ginsparg [ ], or taubes [ ] suggest. what these authors do not touch on is that this may be different in humanities disciplines, where preprints have as yet a much smaller presence than in scientific disciplines. there are far fewer preprint servers for humanities disciplines than there are for scientific disciplines [ ], which can be explained by the much smaller output of publications in the humanities, their (sometimes) geographically limited importance, and perhaps the reluctance of humanities scholars to engage in progressive, digital publishing procedures in general. kleineberg and kaden [ ] discuss how there is no established culture for preprints in the humanities, though their publication processes are particularly lengthy. laporte [ ] identifies several challenges that humanities scholars face in establishing a preprint culture. one of these is that, unlike most discourses in the sciences, those of the humanities are still highly regionally rooted and make use of a variety of languages and discursive norms instead of resorting to english as a lingua franca. this goes along with the reach of journals and book publishers who have more individual, language-dependent audiences, which makes it much harder to have a single space for an international discourse. another challenge is posed by the high share of monographs among the overall output as well as popular articles, which further divides the already considerably smaller output of humanities publications. those obstacles may well be overcome by highly specialised or well indexed preprint servers; but apart from the conception in theory, these characteristics are a reason for the difficulty of finding a critical mass for the uptake of active preprinting in the humanities.
geltner, the founder of a preprint server for medieval studies [ ], argues strongly in favour of preprints [ ]. one of his key arguments is to encourage more curation by editors actively reaching out to authors of compelling new works to convince them of publishing with their journals. if this new work would have been uploaded to preprint servers, the whole process of active curation would be strengthened, the argument goes. preprints in the humanities are, for geltner, not about ““accelerating research,” but rather protecting research as a curiosity-driven endeavor [sic]” [ ]. it remains questionable whether such a form of curation is incentive enough for scholars to engage in authorship and preprinting, only then to be targeted by journal acquisition editors with publication advertisements. if such targeting led to publications in more prestigious journals, most authors would be encouraged to take the effort. but it may as well be the rather less prestigious journals that would aim for more targeting—a scenario no discourse or author may wish for (irrespective of the fact that the concept of prestige of journals is highly contested in the first place). moreover, the purpose of preprints is to enable discussion for the time a manuscript is in review, not before it is submitted to a journal for review. another yet undebated point can be made with reference to the nature of discourses particularly in the humanities, and whether scholars in the humanities may be better served with more rapid formal publication, rather than the formal publication remaining the same and instituting preprints as a temporal placeholder: in disciplines of the sciences, especially in those that have a high uptake of preprints such as physics, formal publication is generally conceived to be, literally, the formal last step. 
at that point in time, the content of the manuscript is generally already known within the discourse community and has gone through debate [ ], which is just what preprints aim to facilitate. other scholars had the chance to incorporate and work with the knowledge drawn from the research reported in the preprinted manuscript. the purpose is clear: a sensible procedural step to accelerate scholarship during that time in which the manuscript is in review. the downside is that this step reinforces the impression that both the formality of the publication and its closed review process are required. this impression may be falsely conjured as there may not need to be a closed review process, so that preprints would either not be necessary or the preprint would be the final publication itself (making the practice of preprinting redundant). another characteristic may be even more imperative as an argument against preprinting in the humanities. it is based on the importance of the historicity of publications. this is different in the sciences, where the most recent version of a publication, if approved of by peer review, is essential: the up to date, reliable, and reproducible research reported in the publication counts, not the historicity and versioning of the publication. this is not the case in the humanities, which can be illuminated by an example. consider that, by means of peer review, there is a change in the content from preprinted manuscript to final publication, for instance, in the line of argument or references included. this change will have to be reflected in case a future author aims to discuss this publication. this future author may no longer merely write: author a claims that x; she will instead be required to write: author a claims that x and reviewer r adds that xy.

footnote: for a popular example of a preprint as the only, thus final, publication, see [ – ].
if authors in the humanities adopted preprinting, there would always be at least two versions of an article available (provided the article got accepted after revision). this may sound trivial. but for disciplines that emphasise the importance of hermeneutics and source criticism, where editorial history is a key concern, such details require attention. it is, therefore, necessary to debate whether the broader uptake of preprints in the humanities is desirable. counterarguments may be that both the author’s initial intention and the reviewer’s request for change may be reasonable so that preprint and final publication should reflect such process of change. moreover, one may argue that attribution of authorship of any publication is entangled already today; that is, reviewers can as well be seen as co-authors: they help with the creation of the work, rather less than more substantially, only that this is opaque today. preprints will just make visible the difference between manuscript and publication; they will not change that there is a difference. this, however, connects to our understanding of peer review as it is this process, rather than preprints, that makes such co-creation and change opaque today. this directly leads to the next realm that requires attention for open humanities: open peer review.

open peer review in the humanities

peer review refers to the practice where fellow scholars evaluate each other’s works. the resulting evaluation may be used by editors to provide guidance to authors so that they can improve the manuscript, and to make an informed decision about whether or not to publish. the practice’s institutionalisation originates in “debates over grant funding” in the s “and has since been extended to cover a variety of processes by which academics formally evaluate each other’s work” [ ] (p. ). however, the process of peer refereeing is much older. it can be traced back to learned societies and their journals in the eighteenth century.
it developed as “distinctive editorial practices of learned societies [which] arose from the desire to create forms of collective editorial responsibility for publications which appeared under institutional auspices” [ ] (p. ). since then, peer review has been developed into an institutionalised practice, and it is a systemic gatekeeper today. especially the systemic nature of this practice, captivated by a paradigm of excellence, led to the acceptance of peer review to be a threshold to quality and authenticity, and to the assumption that by merely organising peer review, publishing companies add value to the published material [ ]. some authors argue that peer review is material for scholarship and its quality, as, for instance, babor et al. do: “[t]he most important criterion for quality and integrity is the peer-review process, as overseen by a qualified journal editor and the journal’s editorial board” [ ] (p. ). gatekeeping is seen as a means of merit and scholarly obligation [ ], one that is supposed to be—but often enough does not achieve to be—a democratic process [ ]. finch argues that peer review performs as part of the effectiveness of high-quality channels within the current communication system, in which researchers have “effective and high-quality channels through which they can publish and disseminate their findings, and that they perform to the best standards by subjecting their published findings to rigorous peer review” [ ] (p. ). such statements seem ignorant of the limitations of peer review in terms of quality. when refereeing is applied as an entry threshold to communication, it is a sorting mechanism, indeed a procedure of selection that filters content into diffusely constructed classes of quality; but it is not a criterion for quality as such. 
moreover, the effectiveness of a practice can be questioned if that practice, on the one hand, withholds research for a longer period in which it is inaccessible to other scholars while, on the other hand, only a selection of one to three fellow scholars is deemed worthy for the judgement about the value of a manuscript in a discourse. it takes, on average, weeks for eventually accepted papers to get through peer review; this period is longer than average in the social sciences and humanities with – weeks [ ]. many journals in the humanities have “single figure acceptance rates” [ ] (p. ), meaning that a bulk of research is excluded from that discourse to which an author wished to contribute it. because of the hidden process, the reasoning behind both ex- and inclusion is opaque to fellow scholars. it is for these and other reasons that peer review is a contested practice across disciplines. tennant et al. conclude that “debates surrounding the efficacy and implementation of peer review are becoming increasingly heated, and it is now not uncommon to hear claims that it is “broken” or “dysfunctional”” [ ]. especially for disciplines in the sciences, this practice of gatekeeping and verification judgement is under close scrutiny [ – ]. on the other hand, some authors claim evidence for value in peer review [ ] and double-blind procedures in particular [ ] for disciplines of the sciences. others raise concern and enquire into the options for opening this practice, claiming increased accountability and transparency [ ] or proposing entirely new models such as a preprinting-connected collaborative open peer review [ ]. ross-hellauer [ ] discusses a variety of problems with peer review, most of which affect the realms of quality and credibility fundamentally.
he accounts inconsistent and “weak levels of agreement” among referees, questions the authority of their role as gatekeeper, and issues the ““black-box” nature of traditional peer review” as a “[l]ack of accountability and risk[...] of subversion”. most of all, the social component of peer reviewing is set against the “idealized as impartial, objective assessors” based on gender, nationality, institutional affiliation, or language. backed with (peer reviewed) studies as evidence, ross-hellauer’s review arrives at a devastating conclusion for this practice. early experiments such as peters and ceci’s [ ] only add to the impression that modern peer review has long established itself as a contested gatekeeping practice instead of a process of collaborative improvement of research. this criticism questions how peer review really achieves to democratise scholarship, or conform to an objective enterprise—two of robert merton’s key principles in the sociology of science [ ]—resulting in an inevitable debate about making this practice more open and transparent. a concerted and concentrated debate is established within open science, and it is likewise necessary for the humanities, where there are differences in the practice. being published in the humanities is much more connected to editorship, where peer reviewers provide the editor with a subjective understanding of the work. decisions of acceptance or rejection are much more connected to interpretation and argument instead of objectified principles. the name of the editor is highly connected to the value of the journal and the discourse it serves. editors are “cultural intermediaries who bridge two worlds, insiders-outsiders with a foot in each camp” [ ] (p. ). it is no news, however, that scholars, especially in the humanities, judge and argue in manifold subjective ways [ ], making it much harder to compare reviews so as to arrive at a desirable compromise in the process of gatekeeping. 
being published with a publishing brand, be it a journal or a book series, is, thus, more than a question of abstract, objectified quality. statements of quality are much harder to make in the humanities than they are in the sciences; it is, here, rather a question of consensus and agreement of reviewers or editors on a particular level of intelligibility. but if agreement is not the objective of humanities discourses, why should inclusion in a discourse be based on agreement? making the peer review process more open by, for instance, publishing the reviews would not necessarily disturb the mechanism of gatekeeping but make the terms of inclusion more transparent. these terms of inclusion may not be the decisive elements, though; the terms of exclusion are. and the culture of debate may require scholars to know about these terms as well. moving the position of review from pre- to post-publication would profoundly change the purpose of reviewing from gatekeeping to improving, reconfiguring the emphasis of this practice away from the journal towards the discourse. fitzpatrick argues in a similar manner, demanding more progress in the practice and discourse on open peer review [ , ]. however, her approach is profoundly shaped by notions of the digital and digital humanities. while she indeed writes about the subjectivity and qualitative representation of humanities scholarship, these arguments are not taken up by a larger discourse outside of digital humanities. the discourse on opening peer review should not be left to either the sciences or digital humanities. it needs to be approached from within the humanities, where connections are drawn to other practices such as preprinting and making statements of value and judgement about scholarship.

liberal copyright licences in the humanities

another discussion that stands in line with the elements discussed above concerns the applicability of liberal copyright licences.
licences are fundamental for open scholarship as they are guiding principles for the practice of any form of open publishing as well as the policy work that underpins these practices. licences are not just crucial for potential readers, indicating how the published material can be accessed and used without consulting author or publisher. they are also responsible for a progressive understanding of authorship where the author is not required to sign over copyright to the publisher. cc by is the licence most favoured by open access advocates and organisations (see, for instance, the discussion by frosio [ ] (p. ); or [ , ]). the reasoning here is that reducing the limitations issued by the creative commons licence (by means of nc, nd, sa) to by, means limiting the limitation of reuse of the publication simply to attribution of authorship. in other words: “anything less introduces a barrier to the open progress of science” [ ] (§ ). while it is true that cc by as the “most liberal” licence “imposes no limits on the use and reuse of material so long as the original source is acknowledged” [ ]; it is also true that there is a debate whether such liberalism is in favour of discourses in the humanities. a strong position in this matter has mandler who repeatedly voiced his concern about overtly liberal licences—though supporting open access in general [ ]. in an interview he claims that ““reuse” under cc by authorises practices that we call plagiarism in academic life. i know advocates of cc by dislike the use of this word, but it is a good word to describe the practice of copying and altering words without specifying how they are altered” [ ]. this aligns to what morrison discusses, criticising that a “poor translation could have a negative impact on a scholar’s reputation, whether through the quality of the writing or through other scholars misquoting an inaccurate translation” [ ] (p. ). 
this is essentially a problem of the humanities: not only because the humanities live on a variety of languages, thus making translations a regular necessity; even more so, it is in the humanities where the nuances of argument and expression matter. both these issues are much less present in disciplines of the sciences. therefore, the claim made by morrison and mandler seems legitimate. in the daily process of discourses, however, i think it is not the licence that is fundamental for this form of misrepresentation. it rather is good or bad scholarly convention, and thus practice, that is responsible. translations are made by other scholars as are discussions and reviews of works cited. the basis of such scholarly practices should be the willingness to not misrepresent a fellow scholar and her argument; if this fails, no amount of altering licences will help maintain scholarly integrity. one may wish to publish only with an nd licence to pre-emptively avoid misrepresentation and still be wrongly represented in discussions and reviews of the published work, or even incorrectly connected to a different argument altogether by means of inattentive referencing. if the integrity of the discourse practices in scholarship is not held high, any pre-emptive steering through licences is futile. if the integrity is in place, however, the steering through licences is not necessary in the first place so that the more liberal licence cc by may as well be suitable for the humanities. the suitability of this practice needs to be integrated into a larger discourse that connects it to other elements of open humanities, especially open access and open data (infrastructures), enabling policy workers to draw on it to make informed decisions.

footnotes: creative commons attribution only licence (cc by). creative commons attribution licence with the added restrictions (possible in combinations): nc (noncommercial); nd (noderivatives); sa (sharealike).
without such integrated discourse, policy workers can draw on only minor, fragmented opinions about this matter, or must resort to open science altogether.

conclusion

it seems to be harder for the humanities to speak with a single voice than it is for the sciences. but this does not mean that the humanities need not have a discourse on opening their research and communication practices. the three elements discussed indicate the necessity to have this discourse; the lack of having it in a coherent, focussed manner may harm the progress of scholarship in the humanities in an (arguably inevitable) digital future. unlike the sciences, where the dialectic between the discourse open science and the advancement of more open science practices results in positive progress, issues in question in the humanities are dispersed into often isolated, disintegrated niche discourses. especially the close connection of preprinting and open peer review indicates the need to have an integrated, rather than a fragmented discourse. similar claims for interconnectedness can be made, for instance, for open data and open reproducible research, open evaluation and open metrics and impact, or open access and popular humanities communication. another form of interconnectedness concerns book reviews and review articles which may well be called forms of post-publication peer review. in this sense, practices in the humanities include forms of openness already. however, those review publications are at times excluded from debates of open access as, according to principles of open science, primarily original research is to be published open access. yet, reviewing, and thus debating, scholarship is incremental to discourses in the humanities. to sufficiently debate and find solutions for such an issue, humanities scholars and scholarly communication experts need an integrated discourse that brings these elements of open access and open post-publication peer review together.
in its current, disintegrated form, the issues will always run into errors in implementation and be unrelatable to scholars in the traditional humanities disciplines. open access alone, likely the most prominent element of open science, is well discussed for publishing practices in the humanities. and there are other practices that, out of the requirement of particular scholarship in the humanities, drive individual, smaller discourses that showcase the need for understanding and advancing practice in this field. opening access to data of humanities scholarship and building sustainable infrastructures is one such practice. similar to many other elements in open science that are assumed to be applicable to the humanities, however, it must be understood that “[t]he concept of research data comes from the sciences, and can only be transferred to the traditional scholarly methods of the humanities to a limited degree” [ ]. though questions of publishing data, their infrastructures, and degrees of openness are partly integrated into discourses of the digital humanities [ , – ], they are not connected paradigmatically to a broader open humanities. but facing a digital future, the humanities need to be open to interdisciplinary knowledge transfer in this realm [ ]. this can also be seen in the social sciences, where herb, for instance, consulted on various elements of open science and their development in sociology, eventually concluding that “[t]he open knowledge culture is not widespread in sociology” [ ] (p. ; translated by the author). as stated before, some disciplines and research projects of the social sciences may well be in the realm of the sciences and can, thus, be captured by open science. for those closer to the humanities, this cannot be assumed, especially because their scholarly communication practices are distinct. most of all, it can be stated with certainty that the crisis of scholarly communication is not simply one of the sciences. just because the crisis may have a similar origin, the discipline-specific developments and their potential solutions may not be the same. the origin of this crisis may be seen in the paradigm of false productivity, excellence, and pressure to publish. due to the differences of practices, this led to dissimilar problems which, for the humanities, are concisely summarised by rosa:

i am firmly convinced that, at least in the social sciences and the humanities, there is, at present, hardly a common deliberation about the convincing force for better arguments, but rather a non-controllable, mad run rush for more publications, conferences and research-projects the success of which is based on network-structures rather than on argumentational force [ ] (p. ).

more transparency and openness that rid authorship of its stances of formality and reputation may serve as a solution. but the discursive space in which scholars debate the specifics of this solution must not be fragmented and dispersed into niche contributions. there may well be individual contributions to what openness means in the humanities or why it may be beneficial.

footnote: humanities communication does not exist as a term. this is another curious instance where there is a term called science communication without a comparable counterpart for the humanities. this is the case, either because or although, practices such as publishing in popular media are integral to humanities scholarship (collini, ). it can be argued that there is no need to have such a term because of the integral nature of such publishing practice. the fact that there is no such term and discourse, however, should not lead to the assumption that there is no associated practice. the sciences only seem to be more communicative due to their discourse on and practice of science communication; the humanities conduct this communication integratively.

footnote: original: ‘die kultur des offenen wissens ist in der soziologie nicht verbreitet.’
this is especially true in the digital humanities discourse. but, as discussed above, the digital humanities are focussed on methods much more than on open practices. yet, scholars in the humanities need a voice to shape their digital, open future. this needs to be a transdisciplinary space, just like digital humanities is one; only that it needs to be driven from within the humanities (where digital humanities seems to be driven rather by technology), and dedicated to and focussed on the opening of practices (where digital humanities are primarily concerned with methods and projects). this process may start with a disciplinary discourse in individual humanities disciplines, for instance, at conferences or in special issues of dedicated journals. the importance of an open humanities discourse will be to bring these threads together and to serve as a dedicated reference to what open practices mean to humanities scholars and what best practices and their problems and implementations are. open science achieves this for the sciences. there is nothing comparable for the humanities. notwithstanding, such a space should not be taken as a discourse that takes openness for granted. as the discussion above shows, not everything that can be opened necessarily works in favour of the knowledge production of its disciplines. it may just as well be the case that the ends of practices instead of the practices themselves are ripe for change so that merely reconfiguring processes may not lead to positive progress. controversial as this may seem, there needs to be a dedicated discourse on it that brings together the variety of currently disconnected endeavours and proficiencies. open humanities may serve as a namespace for this.
funding: the author received funding from the arts & humanities research council through the london arts & humanities partnership.
conflicts of interest: the author declares no conflict of interest.
references
. abbott, a.
chaos of disciplines; university of chicago press: chicago, il, usa, . . kagan, j. the three cultures. natural sciences, social sciences, and the humanities in the st century; cambridge university press: cambridge, uk, . . snow, c.p. the two cultures and the scientific revolution; cambridge university press: cambridge, uk, . . beiner, m. humanities. was geisteswissenschaft macht. und was sie ausmacht; berlin university press: berlin, germany, . . bod, r. a new history of the humanities; oxford university press: oxford, uk, . . daston, l. objectivity and impartiality: epistemic virtues in the humanities. in the making of the humanities, volume , from early modern to modern disciplines; bod, r., maat, j., weststeijn, t., eds.; amsterdam university press: amsterdam, netherlands, ; pp. – . . hamann, j. die bildung der geisteswissenschaften. zur genese einer sozialen konstruktion zwischen diskurs und feld; herbert von halem verlag: köln, germany, . . hyland, k. academic publishing. issues and challenges in the construction of knowledge; oxford applied linguistics; oxford university press: oxford, uk, . . steiner, f. dargestellte autorschaft. autorkonzept und autorsubjekt in wissenschaftlichen texten; reihe germanistische linguistik ; niemeyer: tübingen, germany, . . thompson, j.b. books in the digital age. the transformation of academic and higher education publishing in britain and the united states; polity: cambridge, uk, . . hösle, v. kritik der verstehenden vernunft. eine grundlegung der geisteswissenschaften; c.h. beck: munich, germany, . . small, h. the value of the humanities; oxford university press: oxford, uk, . . collini, s. what are universities for?; penguin: london, uk, . . david, p.a. the historical origins of 'open science': an essay on patronage, reputation and common agency contracting in the scientific revolution. capital. soc. , , . . vicente-saez, r.; martinez-fuentes, c.
open science now: a systematic literature review for an integrated definition. j. bus. res. , , – . . madsen, r.r. scientific impact and the quest for visibility. febs j. , doi.org/ . /febs. . . fecher, b.; friesike, s. open science: one term, five schools of thought. in web . for scientists and science . ; springer: vienna, austria, ; pp. – . . friesike, s.; widenmayer, b.; gassmann, o.; schildhauer, t. opening science: towards an agenda of open science in academia and industry. j. technol. transf. , , – . . peters, m.a. openness, web . technology, and open science. policy futures educ. , , pp. – . . lahti, l.; da silva, f.; laine, m.; lähteenoja, v.; tolonen, m. alchemy & algorithms: perspectives on the philosophy and history of open science. rio , , doi: . /rio. .e . . mckiernan, e.c.; bourne, p.e.; brown, c.t.; buck, s.; kenall, a.; lin, j.; mcdougall, d.; nosek, b.a.; ram, k.; soderberg, c.k.; et al. how open science helps researchers succeed. elife , , doi: . /elife. . . katz, d.s.; allen, g.; barba, l.a.; berg, d.r.; bik, h.; boettiger, c.; borgman, c.l.; brown, c.t.; buck, s.; burd, r.; et al. the principles of tomorrow's university. f research , , , doi: . /f research. . . . crane, t. the philosopher’s tone. the times literary supplement. . available online: https://www.the- tls.co.uk/articles/public/philosophy-journals-review/ (accessed on october ). . brink, c. the soul of a university. why excellence is not enough, st; bristol university press: bristol, uk, . . finch, j. accessibility, sustainability, excellence: how to expand access to research publications. , doi: . / . . . . . eve, m.p. learned societies, open access and budgetary cross-subsidy. available online: https://eve.gd/ / / /learned-societies-open-access-and-budgetary-cross-subsidy/ (accessed on september , ). . sperlinger, t.; mclellan, j.; pettigrew, r. who are universities for? re-making higher education, st; bristol university press: bristol, uk, . . 
moore, s.; neylon, c.; eve, m.p.; o’donnell, d.p.; pattinson, d. “excellence r us”: university research and the fetishisation of excellence. palgrave commun. , , . . tennant, j.p.; crane, h.; crick, t.; davila, j.; enkhbayar, a.; havemann, j.; kramer, b.; martin, r.; masuzzo, p.; nobes, a.; et al. hot topics around scholarly publishing. publications , , doi: . /publications . . moore, s. a genealogy of open access: negotiations between openness and access to research. rev. fr. sci. inf. commun. , doi: . /rfsic. . . crossick, g. monographs and open access. insights: uksg j. , , doi: . /uksg. . . eve, m.p. open access and the humanities; cambridge university press: cambridge, uk, . . jubb, m. academic books and their futures: a report to the ahrc and the british library; ahrc/british library: london, uk, . . mandler, p. open access for the humanities: not for funders, scientists or publishers. j. vic. cult. , , – . . mandler, p. open access: a perspective from the humanities. insights: uksg j. , , – , doi: . / - . . . berlin declaration. available online: https://openaccess.mpg.de/berlin-declaration (accessed on october ). . kleineberg, m.; kaden, b. open humanities? expertinnenmeinungen über open access in den geisteswissenschaften. libreas. libr. ideas . available online: https://libreas.eu/ausgabe / kleineberg/ (accessed on october ). . gardiner, e.; musto, r.g. the digital humanities. a primer for students and scholars; cambridge university press: cambridge, uk, . . gibbs, f.; owens, t. building better digital humanities tools: toward broader audiences and user-centered designs. digit. humanit. q. . available online: http://www.digitalhumanities.org/dhq/vol/ / / / .html (accessed on october ). . bod, r. who’s afraid of patterns?: the particular versus the universal and the meaning of humanities . . bmgn - low ctries hist. rev. , , – . . borgman, c.l. the digital future is now: a call to action for the humanities. digit. humanit. q. .
available online: http://digitalhumanities.org/dhq/vol/ / / / .html (accessed on october ). . bianco, j. this digital humanities which is not one. in debates in the digital humanities; gold, m.k., ed.; university of minnesota press: minneapolis, mn, usa, . . borrelli, a. wissenschaftsgeschichte zwischen digitalität und digitalisierung. z. digit. geisteswiss. . doi: . /sb _ . . pritchard, d. working papers, open access, and cyber-infrastructure in classical studies. lit. linguist. comput. , , – . . kuhn, a.; hagenhoff, s. nicht geeignet oder nur unzureichend gestaltet? digitale monographien in den geisteswissenschaften. z. digit. geisteswiss. , doi: . / _ . . fitzpatrick, k. planned obsolescence. publishing, technology, and the future of the academy; new york university press: new york, ny, usa, . . fitzpatrick, k. peer review, judgment, and reading. profession , , – . . fitzpatrick, k. beyond metrics: community authorization and open peer review. in debates in the digital humanities; gold, m.k., ed.; university of minnesota press: minneapolis, mn, usa, . . cohen, d.j. the social contract of scholarly publishing. in debates in the digital humanities; gold, m.k., ed.; university of minnesota press: minneapolis, mn, usa, . . foster open science taxonomy. available online: https://www.fosteropenscience.eu/foster (accessed september ). . neylon, c.; pattinson, d.; bilder, g.; lin, j. on the origin of nonequivalent states: how we can talk about preprints. f research , , doi: . /f research. . . . tennant, j.; bauin, s.; james, s.; kant, j. the evolving preprint landscape: introductory report for the knowledge exchange working group on preprints, metaarxiv . doi: . /osf.io/ tu. . crick, t.; hall, b.; ishtiaq, s. reproducibility in research: systems, infrastructure, culture. j. open res. softw. , , . . vale, r.d.; hyman, a.a. priority of discovery in the life sciences. elife , , doi: . /elife. . . powell, k. does it take too long to publish research? nat. news , , . .
ginsparg, p. preprint déjà vu. embo j. , , – . . taubes, g. electronic preprints point the way to ‘author empowerment’. science , , . . osf. preprint archive search on open science framework. available online: https://osf.io/preprints/discover (accessed on october ). . laporte, s. preprint for the humanities - fiction or a real possibility? socarxiv . doi: . /osf.io/jebhy. . anonymous. bodoarxiv preprints: open repository for medieval studies. available online: https://osf.io/preprints/bodoarxiv/ (accessed may ). . geltner, g. long live the curator! available online: https://www.scienceguide.nl/ / /long-live-the-curator/ (accessed on october ). . geltner, g. why arts & humanities scholars should care about preprints. available online: http://www.guygeltner.net/blog/ why-arts-humanities-scholars-should-care-about-preprints (accessed on october ). . delfanti, a. beams of particles and papers: how digital preprint archives shape authorship and credit. soc. stud. sci. , , – . . perelman, g. the entropy formula for the ricci flow and its geometric applications. arxiv . available online: https://arxiv.org/abs/math/ (accessed on october ). . perelman, g. ricci flow with surgery on three-manifolds. arxiv . available online: https://arxiv.org/abs/math/ (accessed on october ). . perelman, g. finite extinction time for the solutions to the ricci flow on certain three-manifolds. arxiv . available online: https://arxiv.org/abs/math/ (accessed on october ). . fyfe, a.; coate, k.; curry, s.; lawson, s.; moxham, n.; røstvik, c.m. untangling academic publishing: a history of the relationship between commercial interests, academic prestige and the circulation of research, . available online: https://zenodo.org/record/ /files/untanglingacpub.pdf (accessed on october ). . moxham, n.; fyfe, a. the royal society and the prehistory of peer review, – : (accepted manuscript/author version). hist. j. , , doi: . /s × . .
babor, t.f.; stenius, k.; pates, r.; miovský, m.; o’reilly, j.; candon, p. publishing addiction science. a guide for the perplexed; ubiquity press: london, uk, . . caputo, r.k. peer review: a vital gatekeeping function and obligation of professional scholarly practice. fam. soc. , , – , doi: . / . . huisman, j.; smits, j. duration and quality of the peer review process: the author’s perspective. scientometrics , , – , doi: . /s - - - . . tennant, j.p.; dugan, j.m.; graziotin, d.; jacques, d.c.; waldner, f.; mietchen, d.; elkhatib, y.; collister, l.b.; pikas, c.k.; crick, t.; et al. a multi-disciplinary perspective on emergent and future innovations in peer review. f research , , doi: . /f research. . . . crane, h.; ryan, m. in peer review we (don't) trust: how peer review's filtering poses a systemic risk to science. researchers.one . available online: https://www.researchers.one/article/ - - (accessed on october ). . ferguson, c.; marcus, a.; oransky, i. publishing: the peer-review scam. nature , , – . . smith, r. peer review: a flawed process at the heart of science and journals. j. r. soc. med. , , – . . stephan, p.; veugelers, r.; wang, j. reviewers are blinkered by bibliometrics. nat. news , , . . tennant, j.p. the state of the art in peer review. fems microbiol. lett. , , doi: . /femsle/fny . . siler, k.; lee, k.; bero, l. measuring the effectiveness of scientific gatekeeping. proc. natl. acad. sci. u. s. a. , , – . . tomkins, a.; zhang, m.; heavlin, w.d. reviewer bias in single- versus double-blind peer review. proc. natl. acad. sci. u. s. a. , , – . . van rooyen, s.; godlee, f.; evans, s.; black, n.; smith, r. effect of open peer review on quality of reviews and on reviewers' recommendations: a randomised trial. bmj (clinical research ed.) , , – . . perakakis, p.; taylor, m.; mazza, m.; trachana, v. natural selection of academic papers. scientometrics , , – , doi: . /s - - - . . ross-hellauer, t. what is open peer review? a systematic review. 
f research , , doi: . /f research. . . . peters, d.p.; ceci, s.j. peer-review practices of psychological journals: the fate of published articles, submitted again. behav. brain sci. , , – . . merton, r.k. the normative structure of science. in the sociology of science: theoretical and empirical investigations; merton, r.k., ed.; university of chicago press: chicago, il, usa, . . lamont, m. how professors think: inside the curious world of academic judgment; harvard university press: cambridge, ma, usa, . . frosio, g. open access publishing: a literature review; center for copyright and new business models (create): glasgow, uk, . . neylon, c. open access must enable open use. nature , , – . . suber, p. strong and weak oa. available online: http://legacy.earlham.edu/~peters/fos/ / /strong-and-weak-oa.html (accessed on september ). . poynder, r. the oa interviews: peter mandler. available online: https://poynder.blogspot.com/ / /the-oa-interviews-peter-mandler.html (accessed on september ). . morrison, h.g. freedom for scholarship in the internet age. available online: http://summit.sfu.ca/item/ (accessed on september ). . cremer, f.; klaffki, l.; steyer, t. der chimäre auf der spur: forschungsdaten in den geisteswissenschaften. o-bib. das offene bibliotheksjournal . , – . . brehm, e.; neumann, j. anforderungen an open-access-publikation von forschungsdaten–empfehlungen für einen offenen umgang mit forschungsdaten. o-bib. das offene bibliotheksjournal . , – . . lemaire, m. vereinbarkeit von forschungsprozess und datenmanagement in den geisteswissenschaften. o-bib. das offene bibliotheksjournal , , pp. – . . arnold, t.; tilton, l. new data? the role of statistics in dh. in debates in the digital humanities, ; gold, m.k., klein, l.f., eds.; university of minnesota press: minneapolis, mn, usa, . . herb, u. open science in der soziologie.
eine interdisziplinäre bestandsaufnahme zur offenen wissenschaft und eine untersuchung ihrer verbreitung in der soziologie; schriften zur informationswissenschaft ; hülsbusch: glückstadt, germany, . . rosa, h. alienation and acceleration. towards a critical theory of late-modern temporality; nsu press: malmö, sweden, . © by the author. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /). digital humanities the importance of pedagogy: towards a companion to teaching digital humanities hirsch, brett d. brett.hirsch@gmail.com university of western australia timney, meagan mbtimney.etcl@gmail.com university of victoria the need to “encourage digital scholarship” was one of eight key recommendations in our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences (unsworth et al). as the report suggested, “if more than a few are to pioneer new digital pathways, more formal venues and opportunities for training and encouragement are needed” ( ). in other words, human infrastructure is as crucial as cyberinfrastructure for the future of scholarship in the humanities and social sciences. while the commission’s recommendation pertains to the training of faculty and early career researchers, we argue that the need extends to graduate and undergraduate students. despite the importance of pedagogy to the development and long-term sustainability of digital humanities, as yet very little critical literature has been published. both the companion to digital humanities ( ) and the companion to digital literary studies ( ), seminal reference works in their own right, focus primarily on the theories, principles, and research practices associated with digital humanities, and not pedagogical issues. there is much work to be done. 
this poster presentation will begin by contextualizing the need for a critical discussion of pedagogical issues associated with digital humanities. this discussion will be framed by a brief survey of existing undergraduate and graduate programs and courses in digital humanities (or with a digital humanities component), drawing on the “institutional models” outlined by mccarty and kirschenbaum ( ). the growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn about the sorts of “transferable skills” and “applied computing” that digital humanities offers (jessop ), and the desire of practitioners to consolidate and validate their research and methods. we propose a volume, teaching digital humanities: principles, practices, and politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. we plan to structure the volume according to the four critical questions educators should consider, as emphasized recently by mary breunig, namely:
- what knowledge is of most worth?
- by what means shall we determine what we teach?
- in what ways shall we teach it?
- toward what purpose?
in addition to these questions, we are mindful of henry a. giroux’s argument that “to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak” ( ). consequently, we will encourage submissions to the volume that address these wider concerns.
references
breunig, mary ( ). 'radical pedagogy as praxis'. radical pedagogy. http://radicalpedagogy.icaap.org/content/issue _ /breunig.html. giroux, henry a. ( ). 'rethinking the boundaries of educational discourse: modernism, postmodernism, and feminism'. margins in the classroom: teaching literature.
myrsiades, kostas, myrsiades, linda s. (eds.). minneapolis: university of minnesota press, pp. - . schreibman, susan, siemens, ray, unsworth, john (eds.) ( ). a companion to digital humanities. malden: blackwell. jessop, martyn ( ). 'teaching, learning and research in final year humanities computing student projects'. literary and linguistic computing. . ( ): - . mccarty, willard, kirschenbaum, matthew ( ). 'institutional models for humanities computing'. literary and linguistic computing. . ( ): - . unsworth et al. ( ). our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies.

provided by the author(s) and nui galway in accordance with publisher policies. please cite the published version when available. some rights reserved.
title: positioning the academic library within the institution: a literature review
author(s): cox, john
publication date: - -
publication information: cox, john. ( ). positioning the academic library within the institution: a literature review. new review of academic librarianship, ( - ), - . doi: . / . .
publisher: taylor & francis
link to publisher's version: https://doi.org/ . / . .
item record: http://hdl.handle.net/ /
doi: http://dx.doi.org/ . / . .
https://aran.library.nuigalway.ie http://creativecommons.org/licenses/by-nc-nd/ .
/ie/

positioning the academic library within the institution: a literature review
john cox, university librarian, library, national university of ireland galway

abstract
a strong position in the institution is vital for any academic library and affects its recognition, resourcing and prospects. higher education institutions are experiencing radical change, driven by greater accountability, stronger competition and increased internationalisation. they prioritise student success, competitive research and global reputation. this has significant implications for library strategy, space, structures, partnerships and identity. strategic responses include refocusing from collections to users, reorganising teams and roles, developing partnerships, and demonstrating value. emphasis on student success and researcher productivity has generated learning commons buildings, converged service models, research data management services, digital scholarship engagement, and rebranding as partners. repositioning is challenging, with the library no longer perceived as the heart of the campus but institutional leadership often holding traditional perceptions of its role. this review discusses literature on how academic libraries have been adapting, or might adapt, functionally, physically, strategically and organisationally to position themselves effectively within the institution.

introduction
appropriate positioning in the institution is vital for any academic library and is strongly linked to its recognition, resourcing and prospects. close alignment with institutional strategy is key to successful positioning. higher education institutions (heis) have been changing radically in recent years; this impacts the strategies they pursue and creates challenges of adaptation and alignment for libraries.
many operate more like businesses (weaver, ), shaped by multiple drivers such as greater accountability, stronger competition for students and research funding, higher student expectations, internationalisation and challenging economic conditions. some particular areas of institutional focus have emerged. these include student success, internationally recognised research, community engagement, global reputation and metrics-driven demonstration of impact. the influence of government policy globally (adams becker, cummins, davis, freeman, & hall giesinger, a) and nationally, as in the uk (bulpitt, ), has been powerful in foregrounding performance measurement, competition and consumerisation. technology has increased student choice and expectations and continues to transform how learning happens, emphasising greater flexibility, influencing learner behaviours and changing the profile of the student body (bell, dempsey, & fister, ). it also affects profoundly the ways in which research is conducted by promoting more collaborative and computational approaches (mcrostie, ). shifts in institutional operating environment and strategic focus have implications for the library and its positioning in the organisation. traditionally the library was viewed as the heart of campus and there was an almost unquestioning acknowledgement of the centrality of its contribution to the institutional mission. this situation has now changed fundamentally and the onus rests with libraries to prove their worth to stakeholders who are asking different questions and seeking new value as their priorities evolve (oakleaf, ). the environment for libraries has changed in lots of other ways. many players are now involved in information management on campus (dempsey, ). academics and students can look to other providers locally and globally (holmgren & spencer, ). 
there are new expectations of student experience and engagement (gwyer, ), researcher needs are changing (corrall, ), and internationalisation has created new audiences (kenney & li, ). all of this has significant implications for how libraries operate in their institutions in terms of strategy, space, structures, partnerships and identity. the manifestations of change are readily evident. they include learning commons buildings, research data management services, converged service models, new relationship manager posts, and branding of the library as partner. all feature in this literature review, as do the many challenges they present for academic libraries. not surprisingly, there is a mix of experience, reflecting different priorities per institution and a diversity of proactive and reactive library engagement. the perspective of stakeholders certainly influences positioning and is highlighted early. the focus of this article is to review literature that discusses ways in which academic libraries have been adapting, or might adapt, functionally, physically, strategically and organisationally in order to position themselves effectively within their parent institution.

literature review method and structure
due to the range of terminology, much of it quite general, associated with library positioning, the approach to surveying the literature predominantly involved browsing. a long list of possible terms was compiled, with some searches conducted using truncation of those that were more precise, for example positioning, alignment, prioritisation, partnership and restructuring. the focus was on literature in library and information science, recognising that in-depth coverage of library positioning was unlikely to be published elsewhere and would be referenced if significant. reports from organisations covering higher education were, however, located.
publications included were mainly from the past five years and in english to keep the volume of literature reviewed manageable. the focus was on documents articulating or practically demonstrating institutional alignment and library adaptation or partnership. publications describing only internal processes, or services developed without coverage of strategic context and impact in the institution, were excluded. current awareness services proved valuable, given the focus on recent literature and the need to browse. the informed librarian online (https://www.informedlibrarian.com/) indexes journals and newsletters worldwide and includes a section on academic/research libraries. this was the main source but current cites (http://currentcites.org/) was also helpful. searches of scopus and google scholar yielded limited results due to the challenges already identified with the subject terminology involved. instead of searching other databases, the author browsed the websites and publication lists of organisations which regularly commission reports of interest. these included the association of research libraries (arl), the association of college and research libraries (acrl), oclc, ithaka, sconul, the new media consortium (nmc) and educause. many new publications were also identified in bibliographies or lists of documents citing particularly relevant papers. the literature survey undertaken has generated the structure of this article. it begins with an overview of stakeholder perspectives, recognising that the impressions formed by institutional leadership and academic staff will influence the positioning of the library. the next section reviews library strategic responses across a number of sub-themes: alignment; refocusing and rebranding; roles and structures; partnerships and identity; and demonstrating value and impact.
the remaining sections represent a tightening of focus to student success and researcher productivity as two major goals for libraries and their parent institutions. each is treated separately, highlighting new areas of action for library positioning in the institution, for example the adaptation of library space to engage students with learning, and new partnership roles in digital scholarship for innovative research. the article aims to offer evidence of positioning in action or of strategic thinking to guide practitioners as libraries and their parent institutions navigate ongoing change.

theme . stakeholder perspectives
a mixed picture emerges from a study of recent literature regarding the perspectives of institutional stakeholders on academic libraries, raising some concerns around positioning. publications on this topic had been rare for a time since but there has been an upsurge in recent activity. in general, there is a degree of recognition of the contribution of academic libraries, but this is somewhat limited and is often based on traditional views. murray and ireland (in press) surveyed us provosts, generating responses. participants saw the library as fairly involved in research productivity and student success but there was more limited recognition of its role in student retention and enrolment, two key objectives for heis. connaway, harvey, kitzie, and mikitish ( ) interviewed a group of fourteen provosts, some of whom explicitly sought stronger communication of library alignment with institutional goals and more collaboration across campus. both groups tended to see library contributions passively in terms of space and collections rather than activities around information literacy and critical thinking. robertson’s ( ) interviews with nine canadian provosts were more encouraging regarding the range of contributions but were again somewhat focused on collections and space.
the studies already mentioned indicated more credit for library contributions to teaching and learning than to research and this is true also of a literature review and study commissioned by sconul on “the view from above” (baker & allden, a, b). furthermore, some indifference towards the library was evident, as it was generally seen as not a problem or a strategic concern. some of the interviewees expressed a desire for librarians to participate beyond their own area and to work on solutions to university issues, not just those of the library. a subsequent sconul study raised concerns about insularity, incremental approaches to change and insufficient innovation (pinfield, cox, & rutter, ). a lack of interest in the library was mentioned and melling and weaver ( ) found further evidence of this in the poor awareness of senior administrators about library activities relevant to the uk teaching excellence framework. an ithaka survey of us academic library directors, with respondents, highlighted a growing dissonance between them and their supervisors (wolff-eisenberg, ). fewer of them than in a similar survey shared a common vision and a smaller number of directors saw themselves as considered to be a member of the institution’s senior leadership. ithaka had also surveyed us academics in (wolff-eisenberg, rod, & schonfeld, b) and a comparison of data showed that directors and academic staff had different views of the key role of the library. eighty per cent of directors saw student success as their priority while only half of academics in this and a similar uk survey (wolff-eisenberg, rod, & schonfeld, a) recognised that contribution by the library, although this figure was rising. academics continue to prioritise collections and may need to be persuaded about new directions for libraries. 
students, although expecting more as fees rise, may also see the role of libraries in an unchanged light (delaney & bates, ), due perhaps to a lack of awareness of services highlighted by over a third of library staff in a sconul survey (pinfield et al., ). ineffective communications emerged as an issue for libraries. only half of the us library directors surveyed by ithaka believed that their library had clearly communicated its contribution to student success (wolff-eisenberg, ). interviews with provosts highlighted not only the need for clearer communication of alignment but also identified issues around language used, notably “service” terminology (connaway et al., ). a study of library strategy documents showed explicit referencing of the institutional strategy as the exception rather than the norm (saunders, ). summary the perspective of institutional stakeholders matters in terms of the library’s position in the organisation. negative consequences are likely if stakeholders are not convinced of the library vision and strategy or are unaware of the library contribution to institutional priorities. these consequences may include less influence with leadership (pinfield et al., ; wolff-eisenberg, ), a lower position in the hierarchy (baker & allden, b) or a bundling with professional rather than academic services (corrall, ). the library contribution may be overlooked (sconul, ) and this presents a serious challenge to effective positioning. however, the literature offers many examples of strategic responses by academic libraries, as described next. theme . library positioning strategies this section identifies some high-level, overarching strategies, applicable across the whole library mission. some of these strategies recur in more detail in the sections on student success and researcher productivity which follow but they are introduced here as significant for the overall positioning of the library. 
aligning and leading the literature provides descriptions of conscious strategic and organisational alignment by libraries with the priorities of their institutions in ways that are explicit, visible and desired rather than reactive or imposed. examples include the university of connecticut (franklin, ), the university of manchester (jeal, ), and national university of ireland galway (j. cox, ). a similar experience at the university of leicester is described as “repositioning by doing” (wynne, dixon, donohue, & rowlands, , p. ). there is a strong emphasis on engagement with campus communities and stakeholders in these studies. pinfield, cox and rutter ( ) also uncovered a strong view that libraries should not just align reactively but should seek to provide leadership on campus where opportune, going beyond roles of service provider and partner. opportunities for library leadership have been championed or successfully executed and described in the literature. coverage includes library influencing of learning space development at uk universities (appleton, stevenson, & boden, ; matthews & walton, ), identification of the library as a natural leader in improving digital literacy on campus (adams becker, cummins, davis, freeman, & hall giesinger, b), leadership of open access policy creation (fruin & sutton, ) and lead roles in a range of digital scholarship projects (mackenzie & martin, ). refocusing and rebranding shifting institutional priorities, evolving technologies and changes in scholarly communication have called on libraries to refocus their value proposition, activities and brand. the unique selling point of the library has changed now that users either no longer need or can do for themselves what they previously relied on library staff to provide. lewis ( ) urges an agenda based on what libraries can do either uniquely or better than any other unit on campus and to act before those opportunities are lost. 
there is a recognition that library services can be provided by others and that libraries need to prove that they can add specific value (pinfield et al., ). distinctiveness emerges as vital, necessitating a new agenda (bell et al., ). a common element in the literature about repositioning of academic libraries has been the change in emphasis from collections to users. lorcan dempsey ( ) has led the thinking around this shift by developing the concept of the “inside-out library”. formerly local collections are now part of a global networked collection to which libraries should contribute and share. instead of acquiring material published elsewhere the library role should be focused on the curation and publication of learning and research materials uniquely created by their parent institutions. this emphasises the library as publisher, enabling new areas of focus with many implications for library positioning. these include repurposing library space towards learning commons models (blummer & kenton, ), staking out new roles in research data management (pinfield, cox, & smith, ), and promoting the international reach and reputation of the institution (dempsey, ). collections continue to be a key focus but in different ways and with a new attention towards archives and special collections as sources which are unique to their institutions and enable distinctive scholarship (anderson, ). attention is shifting to fitting into user workflows and to emphasising the library in the life of the user (connaway, ). pinfield, cox and rutter ( ) have identified this and the inside-out library in a list of ten new paradigms for libraries which include the library as platform, the computational library and the globalised library. these paradigms are highly relevant to the rebranding of libraries. 
the concept of library as partner has also been strongly promoted (eldridge, fraser, simmonds, & smyth, ; posner, ; wynne et al., ), recognising that terminology such as service or support can be limiting (j. cox, ). asserting partnership does not, however, guarantee success and progress can be compromised by lack of buy-in from academics (wolff-eisenberg, ). reorganising teams and roles a changing agenda needs appropriate staffing and the literature reflects thinking around new structures and emerging roles. the need for more flexible staffing models and a move away from rigid hierarchies towards stronger teamwork is recognised, given that a range of functional experts will often need to collaborate (adams becker et al., b; jaguszewski & williams, ). there is a general shift towards stronger engagement with users and this is reflected in team structures. corrall ( ) identifies a move away from the traditional reader/technical services model to a consistent grouping of teams among uk libraries around five strategic areas: information resources, academic engagement, customer service, heritage collections, and digital technologies. an ithaka study of organisational models at us research libraries found a strong emphasis on building leadership teams around institutional priorities, with attention at all levels reoriented from internal affairs to engagement with the organisation at large (schonfeld, ). another trend, expanded upon later in this review, is greater investment in specialist posts to meet new needs and expectations, often for higher-level engagement than before (wolff-eisenberg, ). increasing specialism impacts the composition of library teams, leading to more staff from other professional backgrounds and fewer staff at support grades (gremmels, ). the role of liaison librarians has received particular attention, with a move away from subject and collection emphasis and a thrust towards maximum outreach to the campus community. 
jaguszewski and williams ( ) advocate an engagement model for liaisons and this is a priority for the us research library directors interviewed by ithaka (schonfeld, ). there is evidence of increased emphasis in this direction in a group of uk libraries too (eldridge et al., ). the university of nottingham has prioritised relationship management roles within a faculty and engagement team created in as part of a restructure, with team members seeking to engage at a strategic level to bridge the library with the academic community (eldridge et al., ). there is an ongoing debate, explored more fully under researcher productivity, about deploying functional instead of subject-based staffing models to satisfy new expectations, with some libraries motivated by a desire to signal radical change to their stakeholders (hoodless & pinfield, in press). challenges in reshaping staffing are also noted in the literature and these include skills and capacity issues (auckland, ; gremmels, ; haddow & mamtora, ), academic resistance (gremmels, ; hoodless & pinfield, in press) and tendencies towards silo working within libraries (jaguszewski & williams, ). collaborating but maintaining identity working more closely with others across the campus is another key strategy for academic libraries. delaney and bates ( ) see this as vital to an embedded and participative approach to institutional contribution by libraries, while corrall ( ) uses the phrase “boundary-spanning activities” (p. ) to describe partnerships essential to advancing research. collaboration is the theme of a book in which library staff are urged to give up territory, recognise interdependencies and embrace ambiguity (roberts & esson, ) and to see themselves as being of the university first and of the library second (melling, ). 
greater institutional focus on student success, research impact and international reputation is bringing the library into closer alignment with a wider range of collaborators on campus, often resulting in new multi-professional partnerships (pinfield et al., ). these new alliances impact the position of libraries in their institutions. the role of library directors has expanded in some instances, especially at the head of converged student-facing services, drawing upon their experience of engagement across multiple constituencies and of managing technology-generated change (bulpitt, ). they may also lead and coordinate digital education, research or digital scholarship (schonfeld, ). reporting relationships have changed, sometimes moving away from academic alignment. corrall notes that the grouping of the library with professional services may mean that the library director reports to a chief operating officer, with advantages and disadvantages (corrall, ). partnership brings challenges too. a balance needs to be struck between cooperation and competition for resources (pinfield et al., ). loss of identity in larger multi-professional alliances is a significant risk. the library may lose its scholarly association through convergence with other services (bulpitt, ). there is an increased blurring of boundaries with other units (pinfield et al., ) and more staff from other professional backgrounds are working in library teams (gwyer, ). academic libraries therefore need to work at retaining their distinctive identity. communicating value it is important for academic libraries not only to identify themselves clearly with the value they provide to the institution but also to communicate this effectively. as noted in an earlier section, the library’s changing contribution is not always well understood by stakeholders. there is a need to create and communicate systematically a compelling vision about the role and value of the library (pinfield et al., ). 
wolff-eisenberg ( ) found that experience was much more positive when a well-developed vision and strategy for the library was in place. the value communicated needs to be closely associated with the needs of the institution at large. oakleaf’s ( ) report and review on behalf of acrl has been highly influential in this regard. this document puts the focus firmly on active linkage of library achievement to outcomes in ten areas of value to the institution, such as student enrolment and retention, research productivity and grants, and institutional reputation. a more recent study of academic library impact, also for acrl, shows a similar emphasis and encourages libraries to ensure the incorporation of their data into institutional data collection systems (connaway et al., ). the next section includes linkage of library value with student success. summary libraries are aligning more closely with institutional strategy, promoting themselves as partners and exercising leadership in some areas. some of the traditional roles of the library have become less relevant, leading to a focus on adding new value and a heightened attention to appropriate branding. the shift in emphasis from collections to users has been a key driver of change, especially in the creation of different structures, teams and roles to deliver new functions. a more outward-facing approach towards the parent institution is evident and includes an increased appetite for partnership with other units on campus. maintaining the unique identity of the library in the institution can be challenging and this puts a premium on promoting awareness of the library’s evolving value proposition. theme . positioning for student success a strong focus on outcomes is reflected in the term student success which is commonly central to the institutional mission and linked to student expectations of employability (adams becker et al., a). 
higher student fees have promoted the view of students not only as customers (weaver, ) but also as partners, with a stronger voice in decision-making (melling, ). there is an increased focus on the student experience (appleton et al., ), and greater scrutiny of the quality and value for money of services (melling & weaver, ). tougher competition for students makes their retention vital and encourages internationalisation. teaching is increasingly virtual, with learning more independent and located outside the classroom. all of these factors influence how libraries position themselves to contribute to student success. influencing the student journey libraries now need to understand the entirety of the student experience from recruitment to graduate employment (weaver, ). progression incorporates retention, the subject of a literature review by oliveira ( ) which includes a section titled “retention is the library’s business too” (p. ). demonstrating a role in retention is important for academic libraries in their institutions, and initiatives described in the literature include hosting of supports such as writing centres (jackson, ), outreach to at-risk groups (pagowsky & hammond, ), and adapting space to promote collaborative learning (oliveira, ) or to establish a family reading room (godfrey et al., ). allen’s literature review ( ) summarises library effort but also partnership on retention. weaver ( ) emphasises the need to work with many other units across campus to support retention and to facilitate the student journey, recognising that no unit has all the answers. there have been many studies aiming to show linkage between strong library engagement by students and their success (oliveira, ). brown and malenfant ( ) report positive findings from more than acrl assessment in action projects, particularly in terms of library instruction. communicating such linkage to the institution is attractive but there are limitations. 
library engagement is likely to be only a contributory factor (allen, ), offering correlation rather than conclusive proof (murray & ireland, ). deeper engagement with, and analysis of, data are seen as important to understand the student profile better (weaver, ). melling and weaver ( ) argue that integration of data from library systems with other learning analytics may help to predict student outcomes. their article on the uk teaching excellence framework highlights both institutional ignorance of the library contribution to student learning and the need to redefine that contribution in areas such as access to open educational resources and linkage of information skills teaching to learning gain. there are also opportunities for closer engagement with employability (tyrer, ives, & corke, ) and entrepreneurship initiatives on campus (armann-keown & bolefski, ). joining forces weaver ( ) has highlighted the importance of partnerships across campus to provide a holistic approach to student success. recognition of the need to join up student support has generated new service configurations and convergences, with implications for the positioning of the library. corrall ( ) notes a move away from the earlier convergence of libraries and it units in a study of the russell group universities in the uk but identifies a trend towards including the library with other, mainly student-facing, services in academic services directorates. the library is not merged with other units in these groupings, but a much closer relationship exists at some institutions under what has been termed superconvergence. superconvergence joins the library with a range of service departments, typically including student administration, student services, counselling, welfare, careers and others (appleton, ). 
bulpitt ( ) identifies support beyond the classroom as the common purpose of the constituent units and outlines a number of drivers, primarily to improve the student experience and to realise efficiencies for the institution. there is a focus on convenience for the student through physical co-location of services and library buildings are most commonly home to these one-stop shops, due in particular to their central location and long opening hours (melling, ). library professionals are often at the head of superconverged services, as shown in four case studies (bulpitt, ). multiple models are possible, and appleton ( ) outlines examples ranging from simple co-location to full organisational convergence. there are benefits and challenges in superconvergence for libraries (appleton, ; bulpitt, ). hosting and leading a larger unit can enable stronger influence in the institution, positive association with student success and a more comprehensive service offering through sharing of expertise. challenges include loss of library identity in both spatial and professional terms, new demands for leaders as well as for staff at service desks, and achievement of consistent standards. changing spaces library space is being used differently to host other services but also in many other ways to enable new approaches to learning. a shift in emphasis from teaching to learning has occurred. participation, interaction with other disciplines and independent learning beyond the lecture have all increased, as students embrace opportunities to take control of their own learning, becoming active co-creators rather than passive recipients. new approaches need new space and it is not surprising that the themes of rethinking library spaces and of patrons as creators feature, with examples of recent practice, in an international report of key trends for libraries (adams becker et al., b). 
library leaders have recognised the need to adapt their buildings or create new ones with an emphasis on enabling learning rather than presenting paper collections. a survey of us academic libraries has indicated an average footprint of % of available space for new facilities such as studios, labs and interactive environments (s. brown, bennett, henson, & valk, ). these spaces are created through strong collaboration with campus partners. perspectives on, and case studies of, library space development, adaptation and sharing are presented in an edited collection by matthews and walton ( ). a book with the same editors includes international contributions about informal learning spaces in and beyond the library, highlighting the library’s strong campus-wide influence (walton & matthews, ). there are two other developments to note. firstly, the learning commons has emerged as a building on some campuses. its key components are ubiquitous technology, facilities for interaction and a diversity of spaces to meet preferred learning styles. these buildings may incorporate libraries or operate separately from them but are commonly under their management. collaboration with a range of service providers is essential to their success, as blummer and kenton ( ) emphasise in a literature review of the evolution of the learning commons. secondly, libraries are responding to the trend towards patrons as creators by establishing makerspaces. these spaces are emerging in significant numbers and respond to a growing maker culture, enabling fabrication of objects through 3-d printers and other technologies (altman, bernhardt, horowitz, lu, & shapiro, ). they are seen as a natural fit for libraries (adams becker et al., b), promoting creativity and entrepreneurship (nichols, melo, & dewland, ). 
leading digital literacy libraries are repositioning themselves in the area of information literacy, already identified as important on the student journey and seen as a priority by library directors (wolff-eisenberg, ). information literacy has involved library staff teaching students the skills needed to find, evaluate and use information. librarians have largely done this on their own and have encountered difficulties in “infiltrating the curriculum” (fister, , p. ). cowan ( ) points out the limitations of exclusive library leadership of information literacy and the need to broaden its scope and ownership. this has been recognised in a reframing of information literacy as a metaliteracy which takes account of the new emphasis on creating, publishing and sharing digital information through participatory channels such as social media (mackey & jacobson, ). as a result, the creation of the acrl framework for information literacy for higher education (association of college and research libraries, ) was informed by a need to expand the definition of information literacy to include multiple literacies. these include digital literacy, which incorporates dimensions such as content creation, communication, collaboration and responsible digital citizenship. all are captured in a strategic briefing from nmc that examines and synthesises multiple frameworks to define digital literacy in global and discipline-specific contexts which emphasise learners as creators (b. alexander, adams becker, cummins, & hall giesinger, ). the increasing importance attached to digital literacy brings new opportunities for libraries on campus. strong linkage to employability engages students and staff while the need to combat “fake news” places a premium on critical thinking and information literacy (b. alexander et al., ). 
this offers the library a key role in developing digital literacy strategies and influencing curriculum design, with examples reported of this in practice (adams becker et al., b; b. alexander et al., ). librarians are also playing important roles in the development of teaching and learning across the institution. examples include participating in a multi-professional team at coventry university to embed and sustain innovative teaching and learning programmes (mackenzie, ), working outside the library on curriculum development and learning design at the university of western australia (howard & fitzgibbons, ), and influencing the development of more participatory courses at purdue university (jaguszewski & williams, ). contributing to internationalisation libraries are engaging with global and more culturally diverse audiences at home and abroad in order to align with campus internationalisation strategies. their engagement encompasses a number of dimensions, including research, staff exchanges and participation in international projects, but the primary focus is on teaching and learning. the emphasis on internationalisation is increasing and witt, kutner and cooper ( ) identify its drivers, while bordonaro ( ) summarises its strategic components, including student recruitment, study abroad programmes and incorporation of international perspectives in the curriculum. kenney and li ( ) map the range of activities involved. there is a definite sense in the literature that libraries have failed to keep up with campus internationalisation and have not done enough to position themselves as key players in the institutional agenda in this area. a study of institutional strategies in the us and canada finds a striking absence of reference to libraries, with internationalisation similarly lacking in prominence in library strategies (bordonaro, ). 
involvement in strategic planning is mixed according to a us survey, with the library focus often more operational than strategic and staffing support fragmented (witt et al., ). the same survey found a significantly lower figure reported by us libraries regarding their level of international engagement than that reported by institutions in a separate national survey. a literature review finds inadequate library provision for study abroad programmes, despite some exceptions (denda, ). more positively, the important role of an international branch library in accreditation, information literacy and creating community is reported (green, ), along with the range of activities undertaken by library staff at new york university across its multiple international campuses (pun, collard, & parrott, ). these include the creation of a post of global services librarian to maximise collaboration with other institutional providers. overall, there is more that libraries can do to improve their positioning for internationalisation. this includes moving from a focus on collections to real engagement, rethinking structures towards better teamwork and a whole-of-library approach, deepening partnerships on campus, showcasing study abroad projects and experiences, promoting multiculturalism through events and exhibitions, and shifting the focus from challenges to benefits (bordonaro, ; denda, ; kenney & li, ). summary the increased diversity and expectations of the student body, along with the trend towards independent learning on and beyond the campus, have changed the positioning of libraries. this is evident in new space deployments to promote interactive, creative and communal learning. convergence with, and hosting of, other student-facing units have become commonplace. libraries have extended their role in information literacy to embrace the inclusion of digital and other literacies as vital skills for student success and employability. 
leadership roles for libraries in the institution have been a feature of these changes. internationalisation is an exception and libraries have some ground to make up in this area. theme . positioning for researcher productivity researchers today face increased change, competition and pressure to deliver impact and distinction for their institutions. governments are strongly interested in research and funders want not only discovery but also dissemination. advances in technology make research more data-intensive, collaborative and shareable. this has created a new scholarly communications environment, with a wider range of outputs prior to final publication (adams becker et al., b) and a strong emphasis on open access to them. for libraries, this leads to stronger engagement with the process as well as the products of scholarship (dempsey, ) and interaction with the whole research cycle (maxwell, ). new positioning in the overlapping areas of digital scholarship, open publishing and research impact is becoming established, enabled by a strong emphasis on partnership and major changes in staffing. participating in digital scholarship digital scholarship is difficult to define and martin ( ) devotes a number of pages to the task. at a simple level it is the application of digital technologies and content to enable new methods of enquiry, often involving large-scale manipulation of data (j. cox, ). it encompasses digital humanities, computational research and open scholarship and is characterised by highly collaborative, interdisciplinary approaches, involving a range of parties (anne et al., ). libraries are identified as key players in digital scholarship and there are good reasons for this. they include roles as connectors (l. 
alexander, case, downing, gomis, & maslowski, ), a collaborative focus (sinclair, ), archives and special collections as vital source material (anderson, ), shared interests with the humanities in particular (vandegrift & varner, ) and a strong match of both librarian skills and technical infrastructures (j. cox, ). engagement with digital scholarship takes libraries well beyond digitisation. mulligan ( ) identifies nineteen strands of activity, ranging from digital preservation and metadata creation to digital mapping and computational text analysis, as well as some areas such as interface design and database development which were previously more likely to be outsourced or done by other departments. his survey of us libraries shows activity in each of these domains across the respondent institutions. alexander et al. ( ) highlight a diversity of activities at the university of michigan and many more examples are evident in literature reviews by cox ( ) and martin ( ). the latter is included in a book edited by mackenzie and martin ( ) which contains many case studies. one of these describes the establishment of a digital scholarship centre at the university of notre dame (bergstrom, ) and the benefits accruing to the library from hosting and engaging with research across many disciplines. other libraries have made space for such centres as an effective way of embedding their staff in digital scholarship (lippincott & goldenberg-hart, ). different levels of library engagement with elements of digital scholarship are possible and mackenzie ( ) identifies these as owner/user, partner, consultant or expert. institutional models vary also, often depending on local factors, as outlined by anne et al. ( ) whose four examples show library prominence in each case. many libraries have positioned themselves as important participants in digital scholarship, emphasising partner roles as discussed later. there are challenges too. 
these include the requirements for new, often advanced, skills (mulligan, ), sustainability as resourcing lags demand (vinopal & mccormick, ), uncertainty around project scope or longevity (posner, ), and the need to communicate the library’s role effectively to multiple stakeholders (j. cox, ).

publishing research outputs

the “inside-out” model was described earlier in terms of a growing role for libraries as publishers of institutionally-authored materials (dempsey, ). libraries have embraced this opportunity increasingly in recent years as open access gains traction and the scholarly record broadens in scope to include an increasing volume, diversity and complexity of content (lavoie & malpas, ). they have taken on new roles in curating and publishing institutional output, building on a keen interest in open access over more than a decade and an established role as managers of institutional repositories. pinfield’s ( ) literature review of developments in open access between and shows that it has become largely mainstreamed in that time, embracing growing numbers of publications and engaging the attention of institutions as a whole, not just libraries. this has been stimulated by an increase in the number of policies from funders and heis, mandating that research outputs should be openly available. libraries have played a leading role in the creation of these policies and in advocating their full implementation on campus (fruin & sutton, ). open access is complex in terms of licensing, embargoes, deposit options and green and gold models, reflecting what pinfield describes as “mandate messiness” (pinfield, , p. ), and the expertise that libraries can offer is highly valued. libraries are therefore involved in more than just publishing outputs, and their role in interpreting requirements and administering compliance has expanded in the uk in particular, due to two recently introduced requirements.
these are the funding of gold open access following the finch report in and the stipulation that publications submitted to the national research excellence framework should be openly accessible via a repository. this has involved libraries in close collaboration with research offices and others in a range of work around policies, cost management, workflows and advocacy (degroff, ). there is also a requirement for better integration of, and interoperability between, library repositories and institutional research management systems (adams becker et al., b). pinfield ( ) observes that libraries engage with other open agendas, most notably open data. funders are extending their requirement for openness to the curation and publication of data generated by the research they support. this and other drivers (bryant, lavoie, & malpas, b; a. m. cox, kennan, lyon, & pinfield, ), including societal benefits and research reproducibility, have generated a new focus on research data as a valuable asset to be managed and shared. librarians have skills relevant to research data management and are working closely with others on campus, notably research offices and it units (bryant, lavoie, & malpas, a). in doing so they face some challenges similar to those encountered with open access to publications, including difficulties in securing institutional commitment and a lack of clarity around the responsibility of different stakeholders (pinfield et al., ; tenopir et al., ). research data management is complex and has technical, policy and legal dimensions, in addition to variations in practice among disciplines. libraries have been working out their level of involvement, influenced by local circumstances and available capacity. one study identifies three levels of engagement: education, expertise and curation (bryant, lavoie, et al., b). 
investigations of research data services in practice reveal variations in provision (hudson-vitale et al., ), but with a clear trend towards advisory rather than technical roles (a. m. cox et al., ; tenopir et al., ). four case studies show a range of institutional models but strong library roles in each case (bryant, lavoie, et al., a). there is, furthermore, a strong recognition by library directors of the importance of research data management, not only for future scholarship but also for the relevance of libraries to research (tenopir et al., ). a further engagement with publishing relates to closer relationships with university presses, ranging from the press reporting to the library to full integration with it (bonn & furlough, ). libraries’ experience with open access publishing has contributed to this, and university presses, although undergoing revival and growth (adema & stone, ), are perceived as having something to gain from library innovation, agility and experimentation (bonn & furlough, ; okerson & holzman, ). this further aligns libraries with institutions’ desire to publish and spread their reputation and may be seen as additional evidence of library repositioning in the research cycle from a less unique role in discovery than previously to a valued one in publishing now (bonn & furlough, ).

promoting reputation and impact

heis pay strong attention to their reputation and impact, recognising the importance of their global ranking in a highly competitive research environment. research outputs are increasingly measured according to their impact, which is assessed through a range of metrics, including the number and quality of citations they attract. this is of interest to funders as well as parent institutions. the publication by libraries of institutional research content is relevant in terms of greater global exposure and opportunities for citation.
libraries engage with reputation and impact in other ways too, partnering to capture and measure research activity and outcomes through research information management systems (bryant, clements, et al., ). these systems track details of researcher expertise, outputs, grants, projects and collaborations, informing decision-making, enabling benchmarking and reporting impact. the partnerships and systems dependencies they require across campus are well mapped by bryant et al. ( ), as are the contribution and unique expertise of the library in areas such as scholarly communications, discoverability, training on how to increase impact, and preservation for long-term access. by engaging with these systems and their stakeholders, libraries are able to promote open access and the repositories they manage (givens, macklin, & mangiafico, ). librarians have played leadership roles in the scoping or implementation of research management and profiling systems at a number of institutions (day, ; givens et al., ). measuring impact through bibliometric analysis and promoting the use of altmetrics has also been a growing area of library activity, stimulated by national research evaluation exercises, and generating challenges in terms of the specialist skills required (haddow & mamtora, ). a survey of arl institutions reports a diversity of library engagement with scholarly output assessment, extending well beyond citation counts and encompassing strategies to promote researcher impact in partnership with research offices and institutional research units (r. lewis, sarli, & suiter, ). libraries can also use their trusted status to act as honest brokers with academics who may distrust the motives of other parties on campus in measuring their impact (givens et al., ).

partnering more than supporting

the importance of working closely with others on campus for researcher productivity is evident and libraries have positioned themselves as willing and effective partners.
key collaborators for open access, research data management, publishing and maximising impact include the research office, it unit, university press and academic staff. dempsey ( ) notes the need for the library to position itself as advocate and partner, while corrall ( ) sees operational convergence among partners as “arguably more prevalent than ever” (p. ). these relationships can be challenging, however. a jurisdictional issue has been identified around research data management, in which libraries may meet scepticism from others in trying to frame a coherent agenda within the institution (pinfield et al., ). the prevailing research environment on campus and the receptiveness or otherwise of the research office and of academics can influence the extent of the library’s role (haddow & mamtora, ). libraries, as noted earlier, are branding themselves as partners with researchers, shifting away from traditional roles of service or support. this is particularly the case for digital scholarship and the concept of libraries as partners is often forcefully argued in light of their active participation in, and strong contributions to, this area (l. alexander et al., ; posner, ; vandegrift & varner, ). examples of strong, deep and successful embedding in digital scholarship communities are reported, with libraries taking very enterprising and valued roles (mackenzie & martin, ; mcrostie, ; sinclair, ). this is not always the case and vandegrift and varner ( ) cite timidity and an academic inferiority complex as issues for libraries. other barriers can be a lack of recognition of the library role (posner, ) or failures by librarians to understand what academics need (groenewegen, ).

staffing for research

evolving interaction with research has taken libraries into different territory and this has generated new staffing roles and structures. corrall ( ) identifies “higher-end” (p.
) engagement, resulting in an expansion of specialist positions for areas such as open access and research data management as well as the establishment of new scholarly communications teams. she notes that a third of senior management posts in her sample have “research” in their title and half of these leadership teams have special collections or archives as a distinct role. cox ( ) sees digital scholarship as generating a further radical shift, evident in the results of a survey of us libraries which indicates a substantial growth in multi-professional teams, non-librarian posts and the recent creation of many new posts, often at senior grades (mulligan, ). research data management has also stimulated staffing changes, with over % of european libraries surveyed either having created new posts for this purpose or planning to do so (tenopir et al., ), although another study found less organisational change than expected (a. m. cox et al., ). the literature reports significant restructuring to strengthen research engagement, often with a distinctive emphasis per institution (j. cox, ; mcrostie, ; wynne et al., ). the issue of whether to organise staffing for research around subject or functional teams has emerged as a particular topic of debate. the approach at many institutions has been to graft new roles in areas such as open access, bibliometrics and research impact onto existing subject or liaison librarian roles, but the feasibility of this has been questioned (jaguszewski & williams, ; kenney, ). others argue that increasingly specialist skills requirements and interdisciplinary approaches to research make the subject librarian model less effective, calling for a more radical structural adaptation which results in the creation of functional teams focused on research. the debate is well summarised by hoodless and pinfield (in press).
they describe the rationale, drivers and models involved, as well as the outcomes for the functional approach which include an improved profile in the institution and better linkage with non-academic units such as the research office, but a risk of less close relationships with academic staff. any move away from the established subject model represents a big step and experience of this is reported for a number of institutions (bains, ; j. cox, ; wynne et al., ). the most common approach is a mix of both models but with a changing balance towards functional organisation (hoodless & pinfield, in press). major recent changes in the liaison librarian role were reported by % of respondents to a survey of us libraries in which institutions participated (miller & pressley, ), and these adaptations are ongoing (church-duran, ).

summary

academic libraries are operating at a higher level of specialism to meet new expectations from researchers, funders and their parent institutions. digital scholarship has opened up new roles and partnerships, leveraging library skills in preservation, description and dissemination. the publishing role of libraries has become more prominent in expanding open access to research outputs, including data as well as publications. this has positioned libraries well in helping their institutions to maximise and measure the impact of their research, enhancing international reputation. partnership with academic staff, research offices and others is key but can be challenging as needs and roles evolve. major shifts in library staffing for research are taking place, resulting in specialist roles which may not be filled by librarians, and organisational structures which replace subject librarians with functional experts.

conclusion

academic library positioning and repositioning within the institution has occurred across many fronts.
it has been made more challenging by the many changes experienced by parent institutions which have therefore become moving targets. the shifting higher education environment has impacted the library as much as its parent organisation. new approaches to learning and research necessitate different roles for libraries if they are to be relevant to the institutional mission. some common threads have emerged to drive new positioning. foremost is the emphasis on partnerships across campus, recognising that more can be achieved together and that isolation risks marginalisation. libraries are more outward looking and keen to share space, infrastructure and expertise, committing themselves to alignment around institutional priorities. the shift in focus from collections to users has stimulated major changes in the way library space is presented as an enabler of interactive learning. that shift has in turn moved the deployment of staffing towards stronger engagement with academic staff, students and campus partners. the library as publisher has promoted digital scholarship and the international reputation of the institution for research. libraries have repositioned themselves, but have perceptions of them changed in their organisations? the answer is not clear-cut. this review has noted divergences from institutional leadership and academics, a loss of position at the heart of the campus, and a tendency for libraries to be taken for granted. communication appears to be an issue and more work needs to be done to capture scarce attention in busy institutions and to pit the new library agenda against traditional perceptions of its contribution. selling that contribution in terms of what the institution values is important if new roles and partnerships in advancing the academic mission are to be recognised and appreciated. 
a balance may need to be struck between being a good partner and maintaining a distinctive identity, claiming credit where it is due so that repositioning results in advancement rather than loss of status on campus. opportunities to lead exist and are being realised in areas such as digital literacy, open access, research data management and digital scholarship. dynamic, engaged alignment with organisational priorities is key, and this literature review has highlighted committed practice by libraries to reposition themselves successfully in the institution.

references

adams becker, s., cummins, m., davis, a., freeman, a., & hall giesinger, c. ( a). nmc horizon report: higher education edition. retrieved from http://cdn.nmc.org/media/ -nmc-horizon-report-he-en.pdf
adams becker, s., cummins, m., davis, a., freeman, a., & hall giesinger, c. ( b). nmc horizon report: library edition. retrieved from http://cdn.nmc.org/media/ -nmc-horizon-report-library-en.pdf
adema, j., & stone, g. ( ). the surge in new university presses and academic-led publishing: an overview of a changing publishing ecology in the uk. liber quarterly, ( ), - . retrieved from http://doi.org/ . /lq.
alexander, b., adams becker, s., cummins, m., & hall giesinger, c. ( ). digital literacy in higher education, part ii: an nmc horizon project strategic brief. retrieved from https://cdn.nmc.org/media/ -nmc-strategic-brief-digital-literacy-in-higher-education-ii.pdf
alexander, l., case, b. d., downing, k. e., gomis, m., & maslowski, e. ( ). librarians and scholars: partners in digital humanities. educause review, ( june ). retrieved from http://er.educause.edu/articles/ / /librarians-and-scholars-partners-in-digital-humanities
allen, s. ( ). towards a conceptual map of academic libraries’ role in student retention: a literature review. the christian librarian, ( ), - . retrieved from http://digitalcommons.georgefox.edu/tcl/vol /iss /
altman, m., bernhardt, m., horowitz, l., lu, w., & shapiro, r. ( ). spec kit : rapid fabrication/makerspace services. retrieved from https://doi.org/ . /spec.
anderson, r. ( ). can't buy us love: the declining importance of library books and the rising importance of special collections. retrieved from https://doi.org/ . /sr.
anne, k., carlisle, t., dombrowski, q., glass, e., gniady, t., jones, j., . . . sipher, j. ( ). building capacity for digital humanities: a framework for institutional planning. retrieved from https://library.educause.edu/resources/ / /building-capacity-for-digital-humanities-a-framework-for-institutional-planning
appleton, l. ( ). assuring quality using “moments of truth” in super-converged services. library management, ( / ), - . retrieved from https://doi.org/ . /
appleton, l. ( ). sharing space in university libraries. in g. matthews & g. walton (eds.), university libraries and space in the digital world (pp. - ). farnham: ashgate.
appleton, l., stevenson, v., & boden, d. ( ). developing learning landscapes: academic libraries driving organisational change. reference services review, ( ), - . retrieved from https://doi.org/ . /
armann-keown, v., & bolefski, a. ( ). spec kit : campus-wide entrepreneurship. retrieved from https://doi.org/ . /spec.
association of college and research libraries. ( ). framework for information literacy for higher education.
retrieved from http://www.ala.org/acrl/standards/ilframework
auckland, m. ( ). re-skilling for research: an investigation into the role and skills of subject and liaison librarians required to effectively support the evolving information needs of researchers. retrieved from http://www.rluk.ac.uk/wp-content/uploads/ / /rluk-re-skilling.pdf
bains, s. ( ). teaching ‘old’ librarians new tricks. sconul focus, , - . retrieved from http://www.sconul.ac.uk/sites/default/files/documents/bains.pdf
baker, d., & allden, a. ( a). leading libraries: leading in uncertain times: a literature review. retrieved from https://www.sconul.ac.uk/publication/leading-in-uncertain-times-a-literature-review
baker, d., & allden, a. ( b). leading libraries: the view from above. retrieved from https://www.sconul.ac.uk/publication/the-view-from-above
bell, s., dempsey, l., & fister, b. ( ). new roles for the road ahead: essays commissioned for acrl's th anniversary. n. allen (ed.) retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/new_roles_ th.pdf
bergstrom, t. c. ( ). digital scholarship centres: converging space and expertise. in a. mackenzie & l. martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ). london: facet.
blummer, b., & kenton, j. m. ( ). learning commons in academic libraries: discussing themes in the literature from to the present. new review of academic librarianship, ( ), - . retrieved from https://doi.org/ . / . .
bonn, m., & furlough, m. (eds.). ( ). getting the word out: academic libraries as scholarly publishers. chicago: association of college and research libraries. retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/booksanddigitalresources/digital/ _getting_oa.pdf
bordonaro, k. ( ). internationalization and the north american university library. lanham, md: scarecrow press.
brown, k., & malenfant, k. j. ( ). academic library impact on student learning and success: findings from assessment in action team projects. retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/value/findings_y .pdf
brown, s., bennett, c., henson, b., & valk, a. ( ). spec kit : next-gen learning spaces. retrieved from https://doi.org/ . /spec.
bryant, r., clements, a., feltes, c., groenewegen, d., huggard, s., mercer, h., . . . wright, j. ( ). research information management: defining rim and the library’s role. retrieved from http://dx.doi.org/ . /c nk
bryant, r., lavoie, b., & malpas, c. ( a). scoping the university rdm service bundle. the realities of research data management, part .
retrieved from http://dx.doi.org/ . /c z
bryant, r., lavoie, b., & malpas, c. ( b). a tour of the research data management (rdm) service space. the realities of research data management, part . retrieved from http://dx.doi.org/ . /c pg j
bulpitt, g. (ed.) ( ). leading the student experience: super-convergence of organisation, structure and business processes. london: leadership foundation for higher education.
church-duran, j. ( ). distinctive roles: engagement, innovation, and the liaison model. portal: libraries and the academy, ( ), - . retrieved from https://doi.org/ . /pla. .
connaway, l. s. ( ). the library in the life of the user: engaging with people where they live and learn. retrieved from http://www.oclc.org/content/dam/research/publications/ /oclcresearch-library-in-life-of-user.pdf
connaway, l. s., harvey, w., kitzie, v., & mikitish, s. ( ). academic library impact: improving practice and essential areas to research. retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/academiclib.pdf
corrall, s. ( ). designing libraries for research collaboration in the network world: an exploratory study. liber quarterly, ( ), - . retrieved from http://doi.org/ . /lq.
cowan, s. m. ( ). information literacy: the battle we won that we lost? portal: libraries and the academy, ( ), - . retrieved from https://doi.org/ . /pla. .
cox, a. m., kennan, m. a., lyon, l., & pinfield, s. ( ). developments in research data management in academic libraries: towards an understanding of research data service maturity. journal of the association for information science and technology, ( ), - . retrieved from http://dx.doi.org/ . /asi.
cox, j. ( ). communicating new library roles to enable digital scholarship: a review article. new review of academic librarianship, ( - ), - . retrieved from http://dx.doi.org/ . / . .
cox, j. ( ).
new directions for academic libraries in research staffing: a case study at national university of ireland galway. new review of academic librarianship, ( - ), - . retrieved from http://dx.doi.org/ . / . .
day, a. ( ). research information management: how the library can contribute to the campus conversation. new review of academic librarianship, ( ), - . retrieved from https://doi.org/ . / . .
degroff, h. ( ). preparing for the research excellence framework: examples of open access good practice across the united kingdom. the serials librarian, ( ), - . retrieved from https://doi.org/ . / x. .
delaney, g., & bates, j. ( ). envisioning the academic library: a reflection on roles, relevancy and relationships. new review of academic librarianship, ( ), - . retrieved from http://dx.doi.org/ . / . .
dempsey, l. ( ). intra-institutional boundaries: new contexts of collaboration on campus. in: new roles for the road ahead: essays commissioned for acrl’s th birthday (pp. - ). chicago: association of college and research libraries. retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/new_roles_ th.pdf
dempsey, l. ( ).
library collections in the life of the user: two directions. liber quarterly, ( ), - . retrieved from http://doi.org/ . /lq.
denda, k. ( ). study abroad programs: a golden opportunity for academic library engagement. the journal of academic librarianship, ( ), - . retrieved from https://doi.org/ . /j.acalib. . .
eldridge, j., fraser, k., simmonds, t., & smyth, n. ( ). strategic engagement: new models of relationship management for academic librarians. new review of academic librarianship, ( - ), - . retrieved from http://dx.doi.org/ . / . .
fister, b. ( ). student learning, lifelong learning and partner in pedagogy. in: new roles for the road ahead: essays commissioned for acrl’s th birthday (pp. - ). chicago: association of college and research libraries. retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/new_roles_ th.pdf
franklin, b. ( ). surviving to thriving: advancing the institutional mission. journal of library administration, ( ), - . retrieved from https://doi.org/ . / . .
fruin, c., & sutton, s. ( ). strategies for success: open access policies at north american educational institutions. college & research libraries, ( ), - . retrieved from http://crl.acrl.org/index.php/crl/article/view/ /
givens, m., macklin, l. a., & mangiafico, p. ( ). faculty profile systems: new services and roles for libraries. portal: libraries and the academy, ( ), - . retrieved from https://doi.org/ . /pla. .
godfrey, i., rutledge, l., mowdood, a., reed, j., bigler, s., & soehner, c. ( ). supporting student retention and success: including family areas in an academic library. portal: libraries and the academy, ( ), - . retrieved from https://doi.org/ . /pla. .
green, h. ( ). libraries across land and sea: academic library services on international branch campuses. college & research libraries, ( ), - . retrieved from https://doi.org/ . /crl-
gremmels, g. s. ( ). staffing trends in college and university libraries.
reference services review, ( ), - . retrieved from https://doi.org/ . /
groenewegen, d. ( ). yesterday and today: reflecting on past practice to help build and strengthen the researcher partnership at monash university. new review of academic librarianship, ( - ), - . retrieved from https://doi.org/ . / . .
gwyer, r. ( ). identifying and exploring future trends impacting on academic libraries: a mixed methodology using journal content analysis, focus groups, and trend reports. new review of academic librarianship, ( ), - . retrieved from https://doi.org/ . / . .
haddow, g., & mamtora, j. ( ). research support in australian academic libraries: services, resources, and relationships. new review of academic librarianship, ( - ), - . retrieved from https://doi.org/ . / . .
holmgren, r., & spencer, g. ( ). the changing landscape of library and information services: what presidents, provosts, and finance officers need to know. retrieved from https://www.clir.org/wp-content/uploads/sites/ /pub .pdf
hoodless, c., & pinfield, s. (in press). subject vs. functional: should subject librarians be replaced by functional specialists in academic libraries? journal of librarianship and information science. retrieved from http://dx.doi.org/ . /
howard, r., & fitzgibbons, m. ( ). librarian as partner: in and out of the library. in a. mackenzie & l. martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ). london: facet.
hudson-vitale, c., imker, h., johnston, l. r., carlson, j., kozlowski, w., olendorf, r., & stewart, c. ( ). spec kit : data curation. retrieved from https://doi.org/ . /spec.
jackson, h. a. ( ). collaborating for student success: an e-mail survey of u.s. libraries and writing centers. the journal of academic librarianship, ( ), - . retrieved from https://doi.org/ . /j.acalib. . .
jaguszewski, j. m., & williams, k. ( ). new roles for new times: transforming liaison roles in research libraries. retrieved from http://www.arl.org/storage/documents/publications/nrnt-liaison-roles-revised.pdf
jeal, y. ( ). strategic alignment at the university of manchester library: ambitions, transitions, and new values. new review of academic librarianship, ( ), - . retrieved from https://doi.org/ . / . .
kenney, a. r. ( ). leveraging the liaison model: from defining st century research libraries to implementing st century research universities. retrieved from https://doi.org/ . /sr.
kenney, a. r., & li, x. ( ). rethinking research libraries in the era of global universities. retrieved from https://doi.org/ . /sr.
lavoie, b., & malpas, c. ( ). stewardship of the evolving scholarly record: from the invisible hand to conscious coordination. retrieved from https://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-evolving-scholarly-record- -a .pdf
lewis, d. w. ( ). reimagining the academic library. lanham, md: rowman & littlefield.
lewis, r., sarli, c. c., & suiter, a. m. ( ). spec kit : scholarly output assessment activities. retrieved from http://www.arl.org/component/content/article/ /
lippincott, j. k., & goldenberg-hart, d. ( ). cni workshop report.
digital scholarship centers: trends and good practice. retrieved from http://cni.org/wp-content/uploads/ / /cni- digitial-schol.-centers-report- .web_.pdf mackenzie, a. ( ). digital scholarship: scanning library services and spaces. in a. mackenzie & l. martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ). london: facet. mackenzie, a., & martin, l. (eds.). ( ). developing digital scholarship: emerging practices in academic libraries. london: facet. mackey, t. p., & jacobson, t. e. ( ). reframing information literacy as a metaliteracy. college & research libraries, ( ), - . retrieved from https://doi.org/ . /crl- r martin, l. ( ). the university library and digital scholarship: a review of the literature. in a. mackenzie & l. martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ). london: facet. matthews, g., & walton, g. ( ). strategic development of university library space: widening the influence. new library world, ( / ), - . retrieved from https://doi.org/ . /nlw- - - matthews, g., & walton, g. (eds.). ( ). university libraries and space in the digital world. farnham: ashgate. maxwell, d. ( ). the research lifecycle as a strategic roadmap. journal of library administration, ( ), - . retrieved from https://doi.org/ . / . . mcrostie, d. ( ). the only constant is change: evolving the library support model for research at the university of melbourne. library management, ( / ), - . retrieved from https://doi.org/ . /lm- - - https://doi.org/ . /spec. https://doi.org/ . /j.acalib. . . http://www.arl.org/storage/documents/publications/nrnt-liaison-roles-revised.pdf https://doi.org/ . / . . https://doi.org/ . /sr. https://doi.org/ . /sr. 
https://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-evolving-scholarly-record- -a .pdf https://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-evolving-scholarly-record- -a .pdf http://www.arl.org/component/content/article/ / http://cni.org/wp-content/uploads/ / /cni-digitial-schol.-centers-report- .web_.pdf http://cni.org/wp-content/uploads/ / /cni-digitial-schol.-centers-report- .web_.pdf https://doi.org/ . /crl- r https://doi.org/ . /nlw- - - https://doi.org/ . / . . https://doi.org/ . /lm- - - melling, m. ( ). collaborative service provision through super-convergence. in m. melling & m. weaver (eds.), collaboration in libraries and learning environments (pp. - ). london: facet. melling, m., & weaver, m. ( ). the teaching excellence framework: what does it mean for academic libraries? insights, ( ), - . retrieved from https://doi.org/ . /uksg. miller, r. k., & pressley, l. ( ). spec kit : evolution of library liaisons. retrieved from https://doi.org/ . /spec. mulligan, r. ( ). spec kit : supporting digital scholarship. retrieved from https://doi.org/ . /spec. murray, a., & ireland, a. ( ). communicating library impact on retention: a framework for developing reciprocal value propositions. journal of library administration, ( ), - . retrieved from https://doi.org/ . / . . murray, a., & ireland, a. (in press). provosts' perceptions of academic library value & preferences for communication: a national study. college & research libraries. retrieved from http://crl.acrl.org/index.php/crl/article/view/ nichols, j., melo, m., & dewland, j. ( ). unifying space and service for makers, entrepreneurs, and digital scholars. portal: libraries and the academy, ( ), - . retrieved from https://doi.org/ . /pla. . oakleaf, m. ( ). the value of academic libraries: a comprehensive research review and report for the association of college and research libraries. 
retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/value/val_report.pdf okerson, a., & holzman, a. ( ). the once and future publishing library. retrieved from https://www.clir.org/wp-content/uploads/sites/ / / /pub- - - - .pdf oliveira, s. m. ( ). the academic library’s role in student retention: a review of the literature. library review, ( / ), - . retrieved from https://doi.org/ . /lr- - - pagowsky, n., & hammond, j. ( ). a programmatic approach: systematically tying the library to student retention efforts on campus. college & research libraries news, ( ), - , . retrieved from http://crln.acrl.org/index.php/crlnews/article/view/ / pinfield, s. ( ). making open access work: the ‘state-of-the-art’ in providing open access to scholarly literature. online information review, ( ), - . retrieved from http://dx.doi.org/ . /oir- - - pinfield, s., cox, a. m., & rutter, s. ( ). mapping the future of academic libraries: a report for sconul. retrieved from https://sconul.ac.uk/publication/mapping-the-future-of-academic- libraries pinfield, s., cox, a. m., & smith, j. ( ). research data management and libraries: relationships, activities, drivers and influences. plos one, ( ), e . retrieved from https://doi.org/ . /journal.pone. posner, m. ( ). no half measures: overcoming common challenges to doing digital humanities in the library. journal of library administration, ( ), - . retrieved from http://dx.doi.org/ . / . . pun, r., collard, s., & parrott, j. ( ). bridging worlds: emerging models and practices of u.s. academic libraries around the globe. chicago: association of college and research libraries. roberts, s., & esson, r. ( ). leadership skills for collaboration: future needs and challenges. in m. melling & m. weaver (eds.), collaboration in libraries and learning environments (pp. - ). london: facet. robertson, m. ( ). perceptions of canadian provosts on the institutional role of academic libraries. college & research libraries, ( ), - . 
retrieved from http://crl.acrl.org/index.php/crl/article/view/ / https://doi.org/ . /uksg. https://doi.org/ . /spec. https://doi.org/ . /spec. https://doi.org/ . / . . http://crl.acrl.org/index.php/crl/article/view/ https://doi.org/ . /pla. . http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/value/val_report.pdf https://www.clir.org/wp-content/uploads/sites/ / / /pub- - - - .pdf https://doi.org/ . /lr- - - http://crln.acrl.org/index.php/crlnews/article/view/ / http://dx.doi.org/ . /oir- - - https://sconul.ac.uk/publication/mapping-the-future-of-academic-libraries https://sconul.ac.uk/publication/mapping-the-future-of-academic-libraries https://doi.org/ . /journal.pone. http://dx.doi.org/ . / . . http://crl.acrl.org/index.php/crl/article/view/ / saunders, l. ( ). room for improvement: priorities in academic libraries’ strategic plans. journal of library administration, ( ), - . retrieved from https://doi.org/ . / . . schonfeld, r. c. ( ). organizing the work of the research library. retrieved from https://doi.org/ . /sr. sconul. ( ). leadership challenges. some views from those in the hot seat. sconul focus, ( ), - . retrieved from https://www.sconul.ac.uk/sites/default/files/documents/ _ .pdf sinclair, b. ( ). the university library as incubator for digital scholarship. educause review, ( june ). retrieved from http://er.educause.edu/articles/ / /the-university-library- as-incubator-for-digital-scholarship tenopir, c., talja, s., horstmann, w., late, e., hughes, d., pollock, d., . . . allard, s. ( ). research data services in european academic research libraries. liber quarterly, ( ), - . retrieved from http://doi.org/ . /lq. tyrer, g., ives, j., & corke, c. ( ). employability skills, the student path, and the role of the academic library and partners. new review of academic librarianship, ( ), - . retrieved from https://doi.org/ . / . . vandegrift, m., & varner, s. ( ). 
evolving in common: creating mutually supportive relationships between libraries and the digital humanities. journal of library administration, ( ), - . retrieved from http://dx.doi.org/ . / . . vinopal, j., & mccormick, m. ( ). supporting digital scholarship in research libraries: scalability and sustainability. journal of library administration, ( ), - . retrieved from http://dx.doi.org/ . / . . walton, g., & matthews, g. (eds.). ( ). exploring informal learning space in the university: a collaborative approach. london: routledge. weaver, m. ( ). student journey work: a review of academic library contributions to student transition and success. new review of academic librarianship, ( ), - . retrieved from http://dx.doi.org/ . / . . witt, s. w., kutner, l., & cooper, l. ( ). mapping academic library contributions to campus internationalization. college & research libraries, ( ), - . retrieved from https://doi.org/ . /crl. . . wolff-eisenberg, c. ( ). ithaka s+r us library survey . retrieved from https://doi.org/ . /sr. wolff-eisenberg, c., rod, a. b., & schonfeld, r. c. ( a). ithaka s+r | jisc | rluk uk survey of academics . retrieved from https://doi.org/ . /sr. wolff-eisenberg, c., rod, a. b., & schonfeld, r. c. ( b). ithaka s+r us faculty survey . retrieved from https://doi.org/ . /sr. wynne, b., dixon, s., donohue, n., & rowlands, i. ( ). changing the library brand: a case study. new review of academic librarianship, ( - ), - . retrieved from https://doi.org/ . / . . https://doi.org/ . / . . https://doi.org/ . /sr. https://www.sconul.ac.uk/sites/default/files/documents/ _ .pdf http://er.educause.edu/articles/ / /the-university-library-as-incubator-for-digital-scholarship http://er.educause.edu/articles/ / /the-university-library-as-incubator-for-digital-scholarship http://doi.org/ . /lq. https://doi.org/ . / . . http://dx.doi.org/ . / . . http://dx.doi.org/ . / . . http://dx.doi.org/ . / . . https://doi.org/ . /crl. . . https://doi.org/ . /sr. 
the rare books catalog and the scholarly database. accepted for publication in cataloging and classification quarterly, . anne welsh

the rare books catalog and the scholarly database
anne welsh
department of information studies, university college london, gower street, london wc e bt

abstract
a researcher’s eye view of the value of the library catalog not only as a database to be searched for surrogates of objects of study, but as a corpus of text that can be analysed in its own right, or incorporated within the researcher’s own research database. barriers are identified in the ways in which catalog data can be output and the technical skills researchers currently need to download, ingest and manipulate data. research tools and datasets created by or in collaboration with the library community are identified.

keywords
library catalogs / opacs, catalog indexing / display / design, information retrieval, bibliographic data - interoperability

introduction
… quando uma coleção se mantém una, a utilidade do catálogo é óbvia; quando a coleção se dispersa, o catálogo serve muitas vezes para confirmar a autenticidade de uma pintura, acrescentar-lhe valor imaginário e atribuir-lhe uma historia.
… while a collection remains entire, the use of the catalogue is obvious; when dispersed, it often serves to authenticate a picture, adds to its imaginary value, and bestows a history on it.
horace walpole.

this quotation, in portuguese and english, on a wall in lisbon’s gulbenkian museum, cuts right to the heart of this article, which builds on a presentation at the chartered institute of library and information professionals cataloguing and indexing group conference and a short communication in catalogue and index.
it draws on original research into the working library of walter de la mare and a literature review of activities exploiting the catalogs of writers’ libraries, which are de facto catalogs of rare materials. in doing so, this article explores the uses to which catalog data is being put by libraries in evaluating their collections and by researchers. writers’ libraries are presented as an example of an area of academic research into materials often held by special collections departments and for which bibliographic research and description is core. this article highlights the potential for library data to be more than a finding aid for researchers: to be also the foundation for scholarly databases.

provenance and the history of reading
in his seminal work provenance research in book history, pearson asserts that “the serious study of private libraries, and of the lessons which can be learned from book ownership, is a growth industry and one which has gained much ground in the recent past.” he uses the example of st cuthbert’s gospel-book to indicate that the roots of interest in provenance stretch back to the middle ages, when the gospel was included as a relic at durham cathedral. however, he makes a distinction between “venerating a book” and using its provenance information as evidence in the modern sense. the former is, of course, a spiritual experience, while the latter is both a tool for assessing collections and an academic research technique.
interest in reading really flourished in the last decade of the th century, and pearson’s first edition and its reprint with corrections spanned the rapid expansion of the field, so that he was able to observe in his “introduction to the reprint” that “there is a steadily growing literature on the ownership and use of books, embracing works on particular private libraries, studies of marginalia, and the new academic vogue for the history of reading.” this is not to suggest that readers were not of interest to scholars in earlier times. within literary studies, reader response critics of the s and s introduced concepts including the intended reader and the unreliable narrator, interpretive communities, the ideal reader, the implied reader and the creation of meaning between the author and the reader, the writer’s imagined audience, and the role of the critic as mediator for the author and reader. however, rose highlights a paucity of source materials for those interested in the physical evidence of the history of reading. as a result, although altick was able to write the english common reader in , he was only able to spend one chapter on acts of reading themselves, and as rose puts it, recent “scholars have, with considerable ingenuity, located and used a wide range of new materials that allow us to fill in the vast blank spaces on altick’s map” to such an extent that “it is fairly astonishing to recall that just twenty years ago, the history of the common reader was widely believed to be unrecoverable.”

today, classes on book history often start with darnton’s communications circuit, with “readers: purchasers, borrowers, clubs, libraries” towards ‘the end’ of the cycle, but with a dotted line from them to the author at the start, representing ( ) the iterative and reiterative influence that readers have on the creators of works and ( ) that the processes associated with the life cycle of the book are not linear but circular in nature. the centre of darnton’s circuit shows the major forces that shape and give rise to books: “economic and social conjuncture”; “political and legal sanctions”; and “intellectual influences and publicity.” as eliot and rose have summarized, “books are made by history: that is, they are shaped by economical, political, social and cultural forces,” and at the same time, books can influence the wider world: “readers can read the same book in a variety of different ways, with important consequences: after all, wars have been fought over differing interpretations of treaties.”

modern researchers heed the warnings of earlier scholars not to be myopic in their studies. the publishing historian john feather has been vociferous in reminding us of this point: “book historians who are not at least aware of bibliographical techniques are ill equipped for their task, and it could be forcefully argued that a knowledge of historical bibliography should be the basis of their training as scholars. we should also, however, venture outside the confined space of the printing house into the world in which its products were used.” as secord has put it, the history of reading encompasses “all the diverse ways that books and other forms of printed words are appropriated and used,” while, as jackson has pointed out, it does not simply have one sole aim to “recapture the mental processes by which readers appropriated texts.” she highlights darnton’s work on banned literature in th century france, raven’s ongoing interest in the impact of mass market publications, st clair’s study of access to reading material in the th century and manguel’s wide-ranging essays, which she describes as “tell[ing] us about the evolution of material accompaniments to reading.”

the tools and sources of information for discovering these histories are diverse, and sometimes prosaic: marginalia, notebooks, letters, autobiographies, library borrowers’ registers, booksellers’ lists and library catalogs – both private (such as the one walpole made of his father’s collections) and institutional. in the last of these, our st century library catalogs could – and we might argue should – play a central role.

writers’ libraries as a field of study
the study of writers’ libraries lies at the nexus of bibliography, book history, library history, cultural and literary studies. darnton suggests that “most of us would agree that a catalog of a private library can serve as a profile of a reader, even though we don’t read all the books we own and we do read many books that we never purchase. to scan the catalog of the library in monticello is to inspect the furnishings of jefferson’s mind.” leah price, in the introduction to her collection of photographs of authors’ bookcases and interviews with their owners asserts that “bookshelves reveal at once our most private selves and our most public personae.
they can serve as a utilitarian tool or a theatrical prop.” herein lies the challenge and the promise of such study – in order to understand a writer’s library we must have a certain level of knowledge of their creative output and yet assessment of their book collection can enhance our appreciation of a writer’s working methods, and ultimately, their work. gribben has asserted that librarians “may feel somewhat uncomfortable” in making judgments like these, whereas english professors may lack advanced technical skills in bibliography. given the number of literature graduates and doctorate holders who go on to train as information professionals, we may in fact feel decidedly uncomfortable with this segregation of people by the role they hold. however, we might fairly agree with his summary that “the study of an author’s library and reading is a borderline area between literary studies and library science” as disciplines.

in his handbook collecting, curating, and researching writers’ libraries, oram provides us with a straightforward definition of writers’ libraries: “a set of books or other printed works owned by the author at a particular moment in time … writers’ libraries in the possession of institutions are often (although not always) a collection of their books at the time of their death, or a subset thereof.” within cultural studies, museum studies, art history and psychology there are many publications concerned with what we may call ‘the collecting habit’ and the motivations for collecting. in the library setting, we can consider attar’s assertion that “books in a library differ fundamentally from books anywhere else in that each one is part of a collection, an aggregation that imposes its own meaning derived from the decisions and accidents that went into its formation.” this view of ‘the collection’ is in line with tanselle’s wide definition that “collecting is the accumulation of tangible things.” it also places the agency of determining the “coherence of a ‘collection’” firmly in the hands of the librarian and sidesteps concerns about whether the previous, private owner styled themselves a collector or not.

private collections may have been deliberately constructed as such by their original owner. such an example is the phyllis t.m. davies collection of books by walter de la mare now at cambridge university library. alternatively, the previous owner may have acquired some of their books as working materials while others may have been specifically collected. in interviews with oram and macdonnell, ted kooser, russell banks and jim crace each make such a distinction in the books they own. edmund white is an example of a living author who has sold manuscripts to the beinecke library and given them “books that helped me in my own work” but now has “the feeling the librarians don’t really want those research books anymore [sic].” instead, he talks about “go[ing] to the strand and buy[ing] every book about henry james’s letters, for instance, that i can find, but i ship them out as soon as i’m finished [writing] the essay.” here, there is a clear view of materials that are suitable for an institutional library, books he wants to retain, and books that, in his view, can be treated as ephemeral and disposable. other writers may have an egalitarian, completist attitude to the books on their shelves. junot díaz, for example, reports “i have never liked the idea of a hidden book. it means no-one will ever randomly pick it up and have a conversation with you about it.” at the opposite end of the scale from the private collection that has been carefully constructed by its original owner, there are, of course, those who would agree with ron powers that “books are to be read. not to be resold, speculated in, sanctified, put on shelves as indicators of intellect or status, or otherwise violated. read. period.” turning to tanselle again, “what one person accumulates haphazardly, another will regard as bearing a design; and even the product of a careful plan may turn out to be of interest to another person for an entirely different pattern that can be put into it.”

writers’ libraries in the scholarly record
an early example of a writer’s library that received scholarly attention is that of edward gibbon, the author of the decline and fall of the roman empire. william beckford, the author of the gothic bestseller vathek, bought gibbon’s library in france wholesale, and although he was a famous bibliomaniac, he did not add it to his collection at home, but instead said that “i bought it to have something to read when i passed through lausanne … it is now dispersed, i believe. i made it a present to my excellent physician.” significantly, “when asked if the books were rare or curious … he replied in the negative. there were excellent editions of the principle historical writers, and an extensive collection of travels. the most valuable work was an edition of “eustathius;” there was also a ms. or two. all the books were in excellent condition; in number considerably above six thousand, near seven, perhaps.” here, we can see a later collector assessing a collection that, according to his bibliographer, its original collector considered “a working library”; gibbon himself declared “i am not conscious of having bought a book from a motive of ostentation.” beckford was clearly interested in the books – he claimed to have shut himself away for six weeks to read them – but having read them once, had no further desire to keep them, so he gave them away. nor did he give any indication of considering the books special because such a famous author had been their original owner.

yet, a century later, geoffrey keynes went to great pains to piece together the bibliography not only of the library in lausanne, but also of the other books gibbon amassed in his lifetime. in , on compiling the first edition of the library of edward gibbon: a catalogue, keynes reported that while bibliographers and librarians had worked on “the catalogues of the great libraries [which] enable the individual to consult the universal mind … the library collected by one man … expresses only his own interests and a catalogue of the books it contains can have no value unless the mind that it reflects is one of very universal distinction. seldom, therefore, has it been thought worth while [sic] to attempt to reconstruct the individual libraries of the writers of the past.” again, reflecting rose’s claim for altick’s experience of a paucity of data, keynes writes, “usually, indeed, no material has existed for such attempts, unless it were an auctioneer’s catalogue hastily compiled after the owner of the books had died.”

scholarly interest in writers’ libraries and the formulation of methods to reconstruct them awaited the rise not only of the reader response critics in literary studies, but a widening of focus from a canon of literature considered worthy of academic study to the full range of texts published from the th century onwards, and the increasingly diverse group of people able to read them.
halsey has identified this work and the rise in the study of the book as object as leading “inexorably to a focus on readers, both contemporary and historical.” at the same time, within cultural studies, museum studies and art history, there have been accounts published of individual private collections, and within psychology some work has focused on collecting, on creativity, and on the relationship between them. as muensterberger asserted in his book on collecting: an unruly passion, “we know from contemporary artists’ collections that they provide animation and inspiration, or may even sway his barely conscious susceptibilities, long before the artist himself is fully aware of the source.”

different writers have different ways of working. as lev grossman has admitted, “i read obsessively when i’m writing. i think there are two kinds of fiction writers, those who read incessantly while they write, and those who can’t read at all, lest their individual voices get overwhelmed, or tainted somehow. i’m the first kind.” gary shteyngart would seem to agree: “you read, then you write, then you read some more, then more writing, and so on in an endless wordy loop.” however, jim crace’s experience is different: “i don’t see the books themselves as sources for my books. but … whenever i am in the countryside and hiking, then i do feel creatively grand. as a landscape writer i feel deep and dirty amongst my sources … i suppose i want to allow for the idea that our libraries reveal much but that other chambers of our lives can disclose a good deal more.”

in terms of scholarship of a particular writer, gribben has described the process that can be observed: “within the ecosystem of a newly fertile subject, the biography is usually the first green growth to appear; then a bibliography grows up within its shade; several boldly broad critical surveys follow; and soon, if the quality of the literary canon is sufficient, a grove of increasingly specialized studies takes root, affording protective habitats for modish critical approaches. competing biographies and new editions of the works eventually flourish among the dense woodland vegetation of the climax stage. somewhere in the evolution of this delicate ecosystem of academic books, a study of the author’s knowledge of others’ writings – an examination of his or her library and reading – manages to thrust itself through the foliage of this timber into the sunlight.”

the current article comes from work undertaken in the course of studying the working library of walter de la mare. we can see that gribben’s pattern is roughly followed: de la mare’s first publications were short stories in the cornhill magazine in - . his first book was the poetry collection songs of childhood, published in under the pseudonym walter ramal, with his first novel, henry brocken, following in . in he began his career as a reviewer for the bookman and started working for the times literary supplement in and the westminster gazette in , and so it makes sense that the first biographical study of him was a chapter in adcock’s gods of modern grub street. the following year, r.l. mégroz’s walter de la mare: a biographical and critical study was published, to be joined in by forrest reid’s walter de la mare: a critical study. in terms of bibliographies, danielson’s was published in and murphy’s in .
the latter was praised by bowers in a footnote of his principles of bibliographical description as a modern bibliography that “excels in the details of the descriptions.” other biographical, bibliographical and critical works have been published, with the current key biography now that of whistler in and the main critical work bentinck’s published in . senate house library completed cataloging the working library and family archive of the printed oeuvre in , and work on my phd focused on it began shortly thereafter. all fairly typical, according to gribben’s ecosystem.

the scholar and the computer catalog
gribben was also fairly prescient, in , in foretelling an increase in the use of the computer within research. although “doubt[ing] whether such permanent records of library borrowings” as kesselring used for her work on hawthorne and his family’s loans from the salem athenaeum “will survive from our contemporary computer-assisted libraries,” he was optimistic about new methods and tools for research: “the technology and determination that enable us to penetrate outer space will most likely also give us better means to explore the intellectual lives of our cherished authors. word-processors, as well as other apparatuses now beyond our ken, will ultimately supplement the researcher’s notecards and fileboxes, but an unquenchable curiosity about the creators and backgrounds of great literary manuscripts will continually bring forth dauntless scholars in each generation.”

early bibliographies of writers’ libraries often relied on the merging of various lists produced by the writers themselves, booksellers’ inventories and library catalogs. this was a complex and time-consuming task, and reliance often had to be placed on the bibliographic skills of library staff.
keynes identified early on that “it was necessary to obtain professional assistance in the compilation of the catalog from the various sources … the identification of the books hidden in the very inaccurate entries of the different lists being an arduous and difficult task which i could not possibly assume myself.” for the first edition of the library of edward gibbon in , r.a. skelton spent “ hours of his spare time during the course of six months” while for the second in , david mckitterick produced a list of “ entries with one correction of an entry in the original catalogue and one addition of a book now in my library.”

bibliographers report the usefulness of library catalogs for checking bibliographic details of books appearing in authors’ correspondence and personal lists. as harding puts it in his study thoreau’s library, “when all these sources were exhausted, i resorted to a search of the library of congress catalogs, the british museum catalogs, sabin, roerbach, and the catalogs of such nineteenth-century libraries as the boston athenaeum.” again, direct involvement of library staff was necessary: “i should add here that miss sarah bartlett has checked my list against the concord library holdings,” since in it was not possible for the researcher to check the paper donations files himself.

by , when reynolds published hemingway’s reading, - he was able to report on computerized searching techniques: “the ohio college on-line catalog contains a data base [sic] of over four million book entries. from our local terminal we needed only to punch in the first words of the title and part of the author’s last name. within seconds, a list of possibilities would appear on our screen. the mindless computer loved nothing better than to search for browns.
if, however, either author or title were the least misspelled, we drew a blank, for computers do not think or guess.” however, not all their searching could be conducted online: “books published only in england, for example, might not be in the data base [sic]. we turned back to the british museum catalogue … the last resort was the catalogue of english books. year by year, we thumbed through until we found the entry.”

the catalog record and the library user

it is important to note that the computer catalog data being searched by scholars from the s onwards were mostly marc records, and to acknowledge the inherent limitations in their structure. as avram pointed out in the marc pilot project final report, the main aim of the original project was “to test the feasibility of a distribution service of centrally produced machine-readable cataloging data.” although information retrieval was a consideration, it was third in a list of four assessment criteria for the format, with printing at the top of that list – the production of catalog cards was a key driver for the project. crucially, the designers of marc state their own awareness of their limited fore-knowledge of how researchers might search for information: “since so little is known about how a bibliographic record will be used in machine-readable form for retrieval, it was only possible to anticipate future applications.” although we have seen marc go through several versions to become marc 21, the underlying structure of the records, and the bibliographic information we record within them, remain much the same even today.
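the exact-match behaviour that reynolds describes can be sketched in a few lines of python. this is a minimal illustration, not a reconstruction of the system he used: the key lengths (four letters of the surname, four of the title), the sample entries and the function names are all assumptions made for the example. the final function shows, for contrast, the kind of tolerant matching that a catalog which “does not think or guess” lacks, using the standard library’s difflib.

```python
import difflib


def derived_key(surname, title, n_author=4, n_title=4):
    """Build an author/title search key from leading letters.

    The 4+4 key lengths are an illustrative assumption.
    """
    a = "".join(ch for ch in surname.lower() if ch.isalpha())[:n_author]
    t = "".join(ch for ch in title.lower() if ch.isalpha())[:n_title]
    return f"{a},{t}"


def build_index(records):
    """Map each derived key to the records that share it."""
    index = {}
    for record in records:
        key = derived_key(record["surname"], record["title"])
        index.setdefault(key, []).append(record)
    return index


def exact_lookup(index, surname, title):
    """Return records only when the typed key matches a stored key exactly."""
    return index.get(derived_key(surname, title), [])


def fuzzy_lookup(index, surname, title, cutoff=0.7):
    """Return records whose stored keys are merely close to the typed key."""
    key = derived_key(surname, title)
    close = difflib.get_close_matches(key, list(index), n=3, cutoff=cutoff)
    return [record for k in close for record in index[k]]


# Invented sample entries, for illustration only.
CATALOG = [
    {"surname": "hemingway", "title": "in our time"},
    {"surname": "hemingway", "title": "the sun also rises"},
    {"surname": "brown", "title": "a hypothetical title"},
]

if __name__ == "__main__":
    index = build_index(CATALOG)
    print(exact_lookup(index, "hemmingway", "in our time"))  # misspelled surname
    print(fuzzy_lookup(index, "hemmingway", "in our time"))
```

with the surname misspelled, the exact lookup finds nothing, while the tolerant lookup still returns the intended entry.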
it was not until the s that keyword searching of the whole bibliographic record became possible, and so the search experiences described by reynolds, in which strict knowledge of author and / or title was needed, relay an authentic first-person report of researcher interactions with the library catalog at the time. indeed, in the early s, although public catalogs were designated opacs – online public access catalogs – “online” did not have the same connotation in pre-internet days as it does now, and the catalog terminals were merely networked within their own local area, usually just the library building itself. so the simplest and most convenient way to obtain information about private collections now dispersed across institutions was to contact them, as edel and tintner encouraged those interested in henry james’s library to do: “the several university collections have complete bibliographic details, including the number of volumes in each title, whether or not the pages have been cut, and what, if any, exact marginal annotations appear. transcripts of their lists are available from the libraries.” within the 21st century catalog, the need, or otherwise, for separate listings of special collections is largely a question of (1) local cataloging decisions concerning the quantity and quality of information recorded in the catalog, (2) catalog search options and (3) options for users to output search lists themselves. as joseph nicholson has highlighted, in the modern catalog and discovery layer, “the vagaries of keyword searching and a lack of uniformity in note fields can make it painfully difficult for users to track down
books belonging to particular private libraries in online catalogs.” nevertheless, “catalog records should be constructed in such a way as to allow patrons to identify books that belonged to a particular writer and, equally important, to retrieve them as part of a group.” although frbrization (the implementation in cataloging of the bibliographic model set out in ifla’s functional requirements for bibliographic records, or an approximation of it within the discovery layer) places “relationships at the heart of the catalogue,” in real terms it is complex to collocate materials that formerly shared the same private ownership in ways that assist in their retrieval. senate house library provides a good example of ways in which researchers can retrieve records from named collections in which they are interested: (1) by searching by author name with the addition of the phrase “former owner”; (2) by locating a known item and then using a hypertext link to search for the other items in the collection; (3) by searching by local classmark. it is also possible to approach the search via a collection description on the special collections web page. each of these options has strengths and weaknesses and might suit researchers arriving at the library catalog with different levels of information retrieval experience and expertise.

collection description

as nicholson has summarized, “the private libraries of writers pose a number of peculiar challenges to catalogers in special collections units due to a hybrid identity that incorporates aspects of both archival collections and books. though the fundamental unit of the private library is the book … the textual bedrock of such collections often serves as a substratum on which layers of
materials commonly considered archival in nature have been deposited.” this “quasi-archival” nature of writers’ libraries makes description at collection level a sine qua non for best practice, and nicholson suggests both the creation of collection-level marc records, following descriptive cataloging of rare materials (books) (dcrm(b)) appendix b, and the creation of a collection description on a library web page. senate house library has not created collection-level marc records, but it has provided a useful summary of its main special collections on the library website. the page on the walter de la mare collection provides an overview of the working library and the de la mare family archive of walter de la mare’s printed oeuvre, a note on de la mare, with a link to his entry in the oxford dictionary of national biography, brief information on acquisition, an indication of related holdings, and a select list of publications about the collection. crucially, the information on access provides a summary of when the materials were cataloged and first became available to researchers, and the instruction to carry out a local classmark search. locating brief collection descriptions on a web page is also useful for the researcher since such websites are indexed by search engines, while there is much within library catalogs that still resides in the deep web. as oram and nicholson point out in the introduction to their directory of writers’ libraries, “basic information on library holdings of writers’ libraries is difficult to obtain.” they indicate institutional cataloging and differing practices as a key issue here, but to this we may also add the difficulty of carrying out even general web searches that look for the books a writer once owned rather than the books she wrote. a collection description on a library web page, as well as
entry into whichever published directories seem appropriate is a great assistance to researchers in this field.

search is only elementary

so far we have devoted a fair amount of time to tracing searches and search strategies, and this reflects a major way in which researchers interact with libraries and their catalogs. in our attempts to fulfill ranganathan’s five laws of library science, we look towards frbr’s four user tasks to help us in our aims “every reader his book” and “every book its reader.” these tasks, familiar to every cataloger, are “to find entities that correspond to the user’s stated search criteria …; to identify an entity … ; to select an entity that is appropriate to the user’s needs … ; to acquire or obtain access to the entity described [underlining in original].” first published in , these tasks have become the objectives of those of us involved in cataloging and information retrieval, and, when we think about how a particular group of researchers may use our facilities and tools, it is natural for us to have care for the ease with which they are accomplished by researchers. indeed, it might be argued that until our library management systems were capable of offering full search capabilities, it is right that we should focus our efforts in this area. certainly, the rise in ethnographic research into catalog use reflects a widening of our interest in how people search. however, it has never been argued that search is the only use to which library users may put the catalog. even within the frbr report, the wording introducing the user tasks is “four generic user tasks have been defined …
the tasks are defined in relation to the elementary uses that are made of the data by the user.” so the tasks are “generic” – the ones that most, if not all, users will undertake; the uses themselves are “elementary” – the beginning or starting point (note that the word “primary” was not used); and what is being used is “data” – not limited to records, nor to results lists – data. perhaps we are so used to the fundamental concept that the catalog record is a surrogate for the material itself and to the argument that what users want is full-text access that it becomes easy for us to overlook the status of the catalog itself as data. smiraglia has pointed out that the catalog is a cultural artifact, while whaite has grounded this in history – “a catalog that is in use is a finding tool, but when a newer version is introduced, the old catalog becomes a relic of its time.” we might extend this into contemporary history: while a catalog’s elementary use is as a finding tool, its data can tell us of its time. as andersen has made explicit, there is a materiality that we can explore in the “bibliographic record as text.” if we accept that our catalog data is not solely paratext for the item it describes, but is also a text in its own right, we can open it up to any of the research techniques for text, including any of the tools that have been devised for the computational linguistics that is often seen to be the genesis of the digital humanities. if text-mining techniques can be used to reveal the structure of pynchon’s novel v, how might they be used to examine the structure of library catalogs? what quantitative analyses might we, or the people who use our catalogs for access, wish to carry out on our data, and how might this be possible?
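one very simple way of treating catalog data “as text” is to count the words that occur across a set of records. the python sketch below is an illustration only, under stated assumptions: the sample records, the field name “title”, the stopword list and the function names are invented for the example and do not describe any particular library’s data or any tool discussed in this article.

```python
import re
from collections import Counter

# Invented sample records; the field names are illustrative, not a real schema.
RECORDS = [
    {"title": "peacock pie: a book of rhymes", "note": "former owner: walter de la mare"},
    {"title": "the listeners and other poems", "note": "former owner: walter de la mare"},
    {"title": "a book of english poems", "note": ""},
]

# A deliberately tiny stopword list, assumed for the example.
STOPWORDS = {"a", "an", "and", "of", "the", "other"}


def tokenize(text):
    """Lowercase the text, split it into words and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def term_frequencies(records, field="title"):
    """Count how often each word occurs in one field across all records."""
    counts = Counter()
    for record in records:
        counts.update(tokenize(record.get(field, "")))
    return counts


def bigrams(tokens):
    """Return adjacent word pairs, the simplest kind of n-gram."""
    return list(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    print(term_frequencies(RECORDS).most_common(5))
    print(bigrams(tokenize(RECORDS[0]["title"])))
```

the same counting could be pointed at a note field instead of the title field, which is where provenance statements such as “former owner” would usually be found.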
quantitative analysis of the catalog

within libraries, we are already beginning to exploit quantitative tools. oclc’s worldshare collection evaluation provides reports on collections, including “comparisons to individual libraries, peer groups and benchmark library groups.” as it is marketed to libraries for collection management purposes, the three uses that oclc have chosen to highlight are acquisition, deselection and accreditation, but some of the other features could be put to interesting work on individual library collections, such as those of writers’ libraries. for example: “view detailed information including title, subject analyses and local circulation data; export comparison data for offline analysis and reuse; [and] visualize comparison data.” copac has recently made their collection management tools available “via single sign on (shibboleth), so existing users no longer need their ccm [copac collection management] tools username.” this development also means that use of the tools is no longer restricted solely to the librarians responsible for collection development – anyone whose institution is part of the shibboleth consortium can use their university login to access the tools. as with oclc’s worldshare collection evaluation, copac collection management tools have, naturally, been envisaged for use by libraries to develop their collections: “by providing a suite of search and visualization options, users can make use of the rich holdings data in copac: to support difficult decision making about what materials to keep, remove, conserve or indeed purchase.” case studies and user stories highlight collection management activities. however, there is clear scope for researchers to use the tools in academic work. to pick one example, copac reports that “at st.
andrews university the tools were used to assess a significant donation to the library, confirming its value.” such value statements are of obvious use to researchers in deciding whether to travel to see a collection and, on occasion, in writing the introduction to an article. knowing how rare within public collections the contents of a formerly private library are is of clear scholarly interest, as is being able to determine quickly and easily the extent to which an author’s books are held within uk research libraries. such approaches could be said to belong to the academic field of digital bibliography – the creation and / or application of computational tools and techniques to the study of the book as object. other projects we might claim for this interdisciplinary area might include early modern print, which has developed and made available tools such as eebo-tcp keywords in context and the eebo-tcp n-grams browser, and the many projects of the consortium of european research libraries (cerl), including its heritage of the printed book (hpb) database and its material evidence in incunabula, built on the incunabula short title catalogue with added provenance and annotation information and links to the cerl thesauri (of provenance and place names). many of these projects are built on marc records: library records in marc are plentiful and it does not take a great training in quality control for researchers to gain an understanding of the sources of the best quality, most detailed records for their needs. oclc, copac and cerl are examples of consortia that provide their member libraries with many different services – not only tools and data hosting, but also training, conferences and opportunities to take part in research and development projects.
as a result, there is a great deal of good-quality data about collections hosted in tools that researchers can use – good-quality data derived from library catalog records. are we beginning to see the flowering of attar’s prediction from of “the developing function of a catalogue record as a research tool in itself, instead of a mere finding aid”?

current limitations in library data and systems

the basic structure of marc pre-dates the internet, and its consequent limitations and lack of flexibility in terms of sharing with systems outside libraries have been well documented, from tennant’s original opinion piece that “marc must die” through to the bibframe primer, in which marc’s structural failings are highlighted as a rationale for the library of congress to explore future possibilities for data exchange formats. the growth of the semantic web, and the corresponding growth in researchers using semantic web technology, standards and schema, such as the resource description framework (rdf) on which bibframe is based, makes this an attractive successor, with the promise that we can move from a situation in which the data in our web catalogs is essentially hidden in the deep web where standard search engines do not penetrate, through what willer and dunsire have termed “a manifesto for a paradigm shift” so that our data is an integral part of the semantic web. in practical terms, as well as being less open than we might like it to be in the 21st century, the library data that we create in marc records requires a high overhead of systems work. as has pūrongo summarized, “library
systems experts now spend time managing marc data – manipulating it, doing quality management activities, and mining it, and all the while keeping the marc data moving through systems in client server architecture.” beyond the catalog itself, in order to make our data available on the internet, we have to publish it, and that too requires technical skills. many national libraries, who have an ethos of openness and, therefore, of wanting to make their data available for use by researchers, have web pages dedicated to download options, so that those outside the library with the technical skills can ingest their data and manipulate it. recently, national libraries including the british library, the library of congress, the deutsche nationalbibliothek and the bibliothèque nationale de france (bnf) have made data available in rdf for researchers, allowing those with linked data skills to incorporate these datasets in their work. an important consideration in this work that has been expressed by researchers from the bnf is that “as the main purpose of the library is to give access to documents for patrons, the html publication ha[s] to be coherent with the rdf publication, the data in rdf being just a different view from the same data that is in the html page.” in this brief statement we can see the major challenges of presenting our catalog data to the world in the 21st century: data that is created in marc has to be published in xml and also in rdf so others can reuse it.

digital bibliography

there have been several projects that have used both programming and data skills to manipulate library data in order to answer bigger humanities questions.
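as a small illustration of the kind of data manipulation such projects involve, and of the point that data created in marc must also be published in xml and in rdf, the python sketch below reads one marc-xml record and re-expresses three of its fields as rdf statements in n-triples syntax. it is a sketch under stated assumptions rather than a publishing workflow: the sample record and its field values are invented, the subject uri is a placeholder, and the mapping of marc fields 100, 245 and 561 to dublin core terms is a deliberate simplification chosen for the example (plain text values are used where a fuller model would use linked entities).

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML records.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record: author (100), title (245), ownership history (561).
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" "><subfield code="a">de la mare, walter</subfield></datafield>
  <datafield tag="245" ind1="1" ind2="0"><subfield code="a">peacock pie</subfield></datafield>
  <datafield tag="561" ind1=" " ind2=" "><subfield code="a">from the working library of walter de la mare</subfield></datafield>
</record>"""

# Illustrative mapping only; not a recommended application profile.
FIELD_TO_PREDICATE = {
    "100": "http://purl.org/dc/terms/creator",
    "245": "http://purl.org/dc/terms/title",
    "561": "http://purl.org/dc/terms/provenance",
}


def first_subfield(record, tag, code="a"):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text.strip() if node is not None and node.text else None


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def record_to_ntriples(xml_text, subject_uri):
    """Turn one MARCXML record into a list of N-Triples lines."""
    record = ET.fromstring(xml_text)
    lines = []
    for tag, predicate in FIELD_TO_PREDICATE.items():
        value = first_subfield(record, tag)
        if value is not None:
            lines.append(f'<{subject_uri}> <{predicate}> "{escape_literal(value)}" .')
    return lines


if __name__ == "__main__":
    # The subject URI is a placeholder, not a real identifier.
    for line in record_to_ntriples(SAMPLE, "http://example.org/record/1"):
        print(line)
```

the provenance note travels with the record here only because field 561 was mapped explicitly; any field left out of the mapping is silently dropped from the output.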
for example, for the ala midwinter hackathon in , mitch fraas created a network visualization of former owners of manuscripts at university of pennsylvania libraries “in the hopes that it [would] be not only useful to scholars but also might generate some conversation over how libraries and archives distribute their valuable descriptive information.” while demonstrating the techniques he used to create the visualization, he makes a remark part-way through his account that is worthy of notice: “i realize now that this task would have been near to impossible at most libraries where the online catalogs and back-end databases don’t easily allow public users to batch download full records. fortunately at penn all of our catalog records are available in marc-xml form.” issues in downloading data from public catalogs recur in the literature. in her phd thesis on the raymond klibansky collection at mcgill university library, tomm writes about having to feed catalog data through reference management software so that she could obtain the data she needed in a format she could manipulate and use, while baker, writing about a small project working with data from the british cartoon archive, reported that he had to run programming scripts to cleanse the metadata before he could run it through the quantitative analysis tool he was using. even in downloading records in basic formats, like csv, which should allow easy importation to spreadsheet programs, researchers working outside the library can encounter issues. if data has not been specifically published for use by researchers, it is not uncommon to encounter issues with the standard download options offered by library management system providers. in , for example, it was not possible to download the entire set of records
for the working library of walter de la mare from senate house library’s public catalog. the csv file, once imported into excel, located all the information in the first cell. massaging the data through reference management software, as tomm did for her project, resulted in records which lacked their provenance notes. the only ways forward were to re-key the data by hand, to download a file and manually tab-delimit it, or to ask the systems team at the library to publish the data. although larger libraries are beginning to look into bespoke data publishing for researchers, there are, of course, many workflow and cost issues surrounding supply and demand for any one-to-one services. as things stand currently, download options from most standard library catalogs are targeted towards reference management. a show of hands at the cataloguing and indexing group conference in revealed that only one of the delegates present had tested that all the download options offered by their catalogs (not just the reference management ones) resulted in the download of complete records. the other catalogers present in the room had, until that point, been content to input high-quality catalog records without double-checking how readers might output them. essentially, we had been creating surrogate records to satisfy frbr’s elementary user tasks, all of which are focused on search.

exploitative power in bibliographical control

in , when computerized cataloging was in its infancy, wilson published his seminal work of cataloging theory, two kinds of power: an essay on bibliographical control. in it, he argued that our power is of two kinds –
“descriptive” and “exploitative.” we could make a case that from the birth of marc and the computer more generally, as a profession we have been focused on the first of these, and that now, with greater opportunities to manage and manipulate data, we can focus on the second, which, in wilson’s own argument, was always the greater. smiraglia’s summary of “exploitative power” is still the pithiest and clearest – “the power of a scholar to make the best possible use of recorded knowledge.” if the drive in the heritage sector to open up its collections data is to result in a large uptake by researchers, it is necessary to be aware that there are barriers to use and to work to overcome them. as well as practical issues such as interoperability and fragmentation of resources and tools, there can be a lack of technical skills among humanities researchers, and even a lack of awareness of the possibilities that exist. an acknowledgement of the need for encouragement to experiment can be seen in the creation of departments such as the british library’s digital scholarship team, with a remit to encourage use of datasets published by the library, including running outreach events and competitions. meanwhile, the presence of academics within libraries through awards such as the kluge fellowship in digital studies at the library of congress further encourages collaboration on digital projects using library data. collaboration has not always been the history of projects involving or recreating writers’ libraries. in keeping with his account of early opac searching, reynolds has given a summary of how, having used the catalogs to identify hemingway’s books and “carded each: author, title, date, genre, source, subject and contents,” the academic team used “the massive tri-university computer to record the data, using a program specially written by george van den bout.” they were able to sort by category as well as by year, and this assisted in their critical thinking about the topic of hemingway’s reading. today there are projects in which teams of academics create the entries for their databases from start to finish. some, such as melville’s marginalia online, take as their starting point a published bibliography – in this case sealts’s “check-list of books owned and borrowed” and the supplements published by sealts and olsen-smith. the most prominent database in the history of reading is arguably the reading experience database, which collects information on online forms from volunteers in australia, canada, the netherlands, new zealand and the uk. however, there are also projects in which libraries and academics collaborate. davies and fichtner’s freud’s library: a comprehensive catalogue published a cd-rom with full catalog in english and accompanying text in german and english describing the project and reporting on its findings on freud’s books and reading habits, including qualitative and quantitative analysis of the subjects in his collection, the provenance of the books, and the number of dedication copies he received. another project which utilizes a library management system at its core is the gladstone reading database (gladcat) – a project between the university of liverpool and st deiniol’s library to document the books owned by prime minister william ewart gladstone, including indicating his annotations. originally funded by the arts and humanities research council (ahrc), the catalog has continued to be maintained and added to by the library after the project finished in .
there have been publications and a funded phd associated with the project, and university of liverpool department of english submitted it as an impact case study in the research excellence framework (ref ) exercise. the submission claimed “the primary impact of the gladstone project has been to preserve an important part of the uk’s cultural heritage and to make it available to audiences outside the academy. specifically, it has stimulated tourism to gladstone’s library, a significant commercial heritage institution in the area, and in wales second only to the national library of wales in aberystwyth, with holdings of over , volumes. in particular, it has enabled a significant re-orientation of the library’s marketing strategy, emblematized by the change of name, from ‘st deiniol’s library’ to ‘gladstone’s library’ in reflecting the opportunity for the library to market itself as a gladstone heritage institution.”

a wish upon a catalog

the claims made for gladcat’s impact are large and far-reaching, and, in a different, more technological way, the aims of the national and other libraries publishing their collections data in rdf for uptake by the linked data community may also seem quite large. twenty-first century catalogers are used to big visions – we are living through the era of rda implementation with bibframe (and a complete data structure change) on our horizon. in some ways, this article makes a smaller case: the case for the researcher working in a field that is not only of interest to academics but also, presumably, to libraries who want to know more about their individual collections.
while rdf data will allow for progress in the research into writers’ libraries and provenance more generally, and while much has been achieved by consortia like oclc, copac and cerl in publishing and utilizing xml data based on marc records, there is still plenty that can be achieved through a researcher’s being able to download complete records into a basic format – something as simple as csv that displays correctly when imported to a spreadsheet would be enough. to come back to the question in the title of the conference at which this work was originally presented, “a common international standard for rare materials?” the answer this article offers is that the input of data by libraries to their catalogs has not been a stumbling block for researchers – not even when computerization and data sharing were relatively new; not even when catalogs were manual. research into writers’ libraries held now by institutional libraries is dependent on several processes – discovery, access, collocation of materials, consistent catalog input and reliable output. the missed opportunity, it is the contention here, is not input so much as output (beyond search and display). in some ways, this is a smaller case than the cases for international standards and linked data. in another way, it is much larger, leading us back round to wilson’s philosophy of the exploitative power of bibliographic control. if we can meet the needs of researchers who want to engage with our data not as a route through to ‘the real’ objects of their research – full-text files, books, the item for which catalog data is a surrogate – but as an integral part of their own research, then, surely, we are assisting not simply in an
“elementary” user task, but something that is fundamental to scholarship: “the best possible use of recorded knowledge.”

acknowledgement

some of the research in this article was undertaken as part of a phd in cultural studies at ucl. the author would like to thank gladstone’s library for the award of a revd. dr. murray macgregor scholarship in , which provided her with the peace and space to write up part of her thesis.

orcid

anne welsh http://orcid.org/ - - -

horace walpole, aedes walpolianae (rev. ed. ), xxxi, quoted in carla anne welsh, “metadata output and the researcher” (paper presented at the cilip cataloging and indexing group conference, canterbury, kent, september - ). anne welsh, “metadata output and the researcher,” catalogue and index ( ), - . david pearson, provenance research in book history: a handbook. reprinted with a new introduction (london: the british library; new castle, delaware: oak knoll, ), . c.f. battiscombe, ed., the relics of st cuthbert: studies by various authors collected and edited with an historical introduction (oxford: oxford university press for the dean and chapter of durham cathedral, ), . jonathan rose, “altick’s map: the new historiography of the common reader” in the history of reading. volume . methods, strategies, tactics, ed. rosalind crone and shafquat towheed (basingstoke: palgrave macmillan, ), - . pearson, provenance research, xi. wayne c. booth, the rhetoric of fiction (chicago; london: university of chicago press, ). stanley fish, surprised by sin: the reader in paradise lost (london: macmillan, ); is there a text in this class?: the authority of interpretive communities (cambridge, massachusetts: harvard university press, ). gerard genette, figures iii (paris: seuil, ).
wolfgang iser, the implied reader: patterns of communication in prose fiction from bunyan to beckett (baltimore; london: johns hopkins university press, ). walter ong, “the writer’s audience is always a fiction,” pmla . ( ), - . jonathan culler, structuralist poetics: structuralism, linguistics and the study of literature (london: kegan, paul, ). r.d. altick, the english common reader: a social history of the mass reading public, - (chicago: university of chicago press, ). rose, “altick’s map,” . robert darnton, “what is the history of books?” daedalus . ( ), - . simon eliot and jonathan rose, introduction to a companion to the history of the book, ed. simon eliot and jonathan rose (oxford: wiley-blackwell, ), . john feather, “the book in history and the history of the book,” the journal of library history . ( ), . james secord, victorian sensation: the extraordinary publication, reception, and secret authorship of ‘vestiges of the natural history of creation’ (chicago: university of chicago press, ), . h.j. jackson, “‘marginal frivolities’: readers’ notes as evidence for the history of reading,” in owners, annotators and the signs of reading, ed. robin myers, michael harris and giles mandelbrote (new castle, delaware: oak knoll; london: british library, ), . robert darnton, the corpus of clandestine literature in france, - (new york; london: norton, ). william st clair, the reading nation in the romantic period (cambridge: cambridge university press, ). alberto manguel, a reading diary (edinburgh: canongate, ). jackson, “marginal frivolities,” . robert darnton, the kiss of lamourette: reflections in cultural history (london: faber, ), . leah price, introduction to unpacking my library: writers and their books, ed. leah price (new haven, connecticut; london: yale university press, ), . alan gribben, “private libraries of american authors: dispersal, custody and description,” the journal of library history . ( ), . richard w.
oram, introduction to collecting, curating, and researching writers’ libraries: a handbook, ed. richard w. oram and joseph nicholson (lanham, maryland: rowman & littlefield, ), - . karen attar, “books in the library,” in the cambridge companion to the history of the book, ed. leslie howsam (cambridge: cambridge university press, ), . g. thomas tanselle, “a rationale of collecting,” studies in bibliography ( ), . tanselle, “rationale of collecting,” . richard w. oram and kevin macdonnell, “writers on their libraries: interviews,” in collecting, curating and researching writers’ libraries: a handbook, ed. richard w. oram with joseph nicholson (lanham, maryland: rowman & littlefield, ), - . edmund white and leah price, “an interview with edmund white,” in unpacking my library: writers and their books, ed. leah price (new haven, connecticut; london: yale university press, ), . edmund white and leah price, “interview,” . junot díaz and leah price, “an interview with junot díaz,” in unpacking my library: writers and their books, ed. leah price (new haven, connecticut; london: yale university press, ), . oram and macdonnell, “writers on their libraries,” . tanselle, “rationale of collecting,” . cyrus redding, “recollections of the author of vathek,” new monthly magazine and humorist . ( ), . edward gibbon, the memoirs of edward gibbon by himself, ed. george birkbeck hill (london: methuen, ), . geoffrey keynes, the library of edward gibbon: a catalogue, nd ed. (winchester: st paul’s bibliographies, ). geoffrey keynes, the library of edward gibbon, . katie halsey, “‘folk stylistics’ and the history of reading: a discussion of method,” language and literature . ( ), . werner muensterberger, collecting: an unruly passion: psychological perspectives (princeton: princeton university press, ), .
lev grossman and leah price, “an interview with lev grossman,” in unpacking my library: writers and their books, ed. leah price (new haven, connecticut; london: yale university press), . gary shteyngart and leah price, “an interview with gary shteyngart,” in unpacking my library: writers and their books, ed. leah price (new haven, connecticut; london: yale university press), . richard w. oram and kevin macdonnell, “writers on their libraries,” . gribben, “private libraries,” . fredson bowers, principles of bibliographical description (winchester: st paul’s bibliographies, ), . theresa whistler, imagination of the heart: the life of walter de la mare (london: duckworth, ). anne bentinck, romantic imagery in the works of walter de la mare (lewiston, ny; lampeter: edwin mellen press, ). “walter de la mare library, senate house library,” last accessed may , , http://www.senatehouselibrary.ac.uk/our-collections/special-collections/printed-special-collections/walter-de-la-mare-library marion l. kesselring, hawthorne’s reading - : a transcription and identification of the titles recorded in the charge-books of the salem athenaeum (new york: new york public library, ). gribben, “private libraries,” . gribben, “private libraries,” . keynes, “preface,” . keynes, “preface,” . keynes, “preface to the second edition” in the library of edward gibbon: a catalogue, nd ed. (winchester: st paul’s bibliographies, ), . walter harding, thoreau’s library (charlottesville: university of virginia press, ), . harding, thoreau’s library, . michael s. reynolds, hemingway’s reading, - : an inventory (princeton, new jersey: princeton university press, ), . reynolds, hemingway’s reading, . welsh, “metadata output and the researcher” ( ). h.d. avram, the marc pilot project: final report on a project sponsored by the council on library resources, inc.
(washington, dc: library of congress, ), . h.d. avram, j.f. knapp and l.j. rather, the marc ii format: a communications format for bibliographic data (washington, dc: library of congress, ), . welsh, “metadata and the researcher” ( ). j.h. bowman, “opacs: the early years and user reactions,” library history ( ), - . j.h. bowman, “opacs.” leon edel and adeline tintner, “the library of henry james, from inventory, catalogs and library lists,” the henry james review . ( ): - . joseph nicholson, “cataloging writers’ private libraries,” in collecting, curating, and researching writers’ libraries: a handbook, ed. richard w. oram with joseph nicholson (lanham, maryland: rowman & littlefield, ). anne welsh and sue batley, practical cataloguing: aacr, rda and marc (london: facet, ), . nicholson, “cataloging,” . nicholson, “cataloging,” . oram and nicholson, “location and bibliographical guide to writers’ libraries,” in collecting, curating, and researching writers’ libraries: a handbook, ed. richard w. oram with joseph nicholson (lanham, maryland: rowman & littlefield, ), . s.r. ranganathan, the five laws of library science (madras: madras library association; london: edward goldstone, ). ifla study group on the functional requirements for bibliographic records, functional requirements for bibliographic records final report (munich: k.g. saur, ), . victoria wilson, “catalog users ‘in the wild’: the potential of an ethnographic approach to studies of library catalogs and their users,” cataloging and classification quarterly . ( ), - . ifla study group on frbr, functional requirements for bibliographic records, . r.p. smiraglia, “rethinking what we catalog: documents as cultural artifacts,” cataloging and classification quarterly ( ), - . katharine claire whaite, “new ways of exploring the catalogue: incorporating text and culture,” information research . , http://www.informationr.net/ir/ - /colis/papers .html#.vwittxi zuq
jack andersen, “materiality of works: the bibliographic record as text,” cataloging and classification quarterly ( ), - . christos iraklis tsatsoulis, “unsupervised text mining methods for literature analysis: a case study for thomas pynchon’s v,” orbit . ( ), https://www.pynchon.net/articles/ . /orbit.v . . / “oclc worldshare collection evaluation overview,” last accessed, may , , http://www.oclc.org/collection-evaluation.en.html “oclc worldshare collection evaluation brochure,” last accessed, may , , http://www.oclc.org/content/dam/oclc/services/brochures/ usf_worldshare_collection_evaluation.pdf “worldshare collection evaluation brochure.” “copac collection management tools,” last accessed, may , , https://ccm.copac.jisc.ac.uk “copac collection management tools.” “shibboleth,” last modified, may , , https://shibboleth.net “about ccm tools,” last accessed, may , , https://ccm.copac.jisc.ac.uk/about/ “ccm tools user stories,” last accessed, may , , http://blog.ccm.copac.ac.uk/user-stories/ “early modern print: text mining early printed english,” last accessed, may , , http://earlyprint.wustl.edu “heritage of the printed book database,” last modified, november , , https://www.cerl.org/resources/hpb/main “material evidence in incunabula,” last modified, november , , https://www.cerl.org/resources/mei/main karen attar, “cataloguing early children’s books: demand, supply and a seminar,” catalogue and index ( ), . roy tennant, “marc must die,” library journal october ( ), http://lj.libraryjournal.com/ / /ljarchives/marc-must-die/#_ eric miller, uche ogbuji, victoria mueller and kathy macdougall, bibliographic framework as a web of data: linked model and supporting services (washington, dc: library of congress, ). mirna willer and gordon dunsire, bibliographic information organization in the semantic web (oxford: chandos, ), xxvii.
ngā pūrongo, “marc to bibframe: outcomes, possibilities and new directions,” new zealand library & information management journal . ( ), . agnes simon, romain wenz, vincent michel and adrien di mascio, “publishing bibliographic records on the web of data: opportunities for the bnf (french national library),” in the semantic web: semantics and big data, ed. philipp cimiano, oscar corcho, valentina presutti, laura hollink and sebastian rudolph (heidelberg: springer, ), . mitch fraas, “charting former owners of penn’s codex manuscripts,” mapping books, january , , http://mappingbooks.blogspot.co.uk/ / /charting-former-owners-of-penns-codex.html jillian tomm, “the imprint of the scholar: an analysis of the printed books of mcgill university’s raymond klibansky collection” (phd diss., mcgill university, ). james baker, “on metadata and cartoons,” british library digital scholarship blog, may , , http://britishlibrary.typepad.co.uk/digital-scholarship/ / /on-metadata-and-cartoons.html welsh, “metadata output” ( ). patrick wilson, two kinds of power: an essay on bibliographical control (berkeley, california: university of california press, ). smiraglia, “rethinking what we catalog,” . melissa terras, james baker, james hetherington, david beavan, anne welsh, helen o’neill, will finley, oliver duke-williams and adam farquhar, “enabling complex analysis of large-scale digital collections: humanities research, high performance computing and transforming access to british library digital collections,” digital humanities (alliance of digital humanities organizations, ). simon mahony and elena pierazzo, “teaching skills or teaching methodology?” digital humanities pedagogy: practices, principles and politics, ed. brett d. hirsch (cambridge: open book, ). reynolds, hemingway’s reading, .
“melville’s marginalia online,” last accessed, may , , http://melvillesmarginalia.org/front.php merton m. sealts, pursuing melville, - : chapters and essays (madison, wisconsin: university of wisconsin press, ). “reading experience database,” last accessed, may , , http://www.open.ac.uk/arts/reading/ j. keith davies and gerhard fichtner, freud’s library: a comprehensive catalog (london: the freud museum; tübingen, ). “the gladstone’s reading database,” last accessed, may , , https://www.liverpool.ac.uk/english/research/gladstone-library/ “gladstone’s library library catalogues,” last accessed, may , , https://www.gladstoneslibrary.org/reading-rooms/library-catalogues “gladstone’s library, gladstone’s reading: impact case study,” last accessed, may , , http://impact.ref.ac.uk/casestudies /refservice.svc/getcasestudypdf/ smiraglia, “rethinking what we catalog,” .

d-lib magazine
march/april  volume  , number  /

discovering the information needs of humanists when planning an institutional repository

david seaman, dartmouth college library
david.seaman@dartmouth.edu

doi: . /march -seaman

abstract

through in-person interviews with humanities faculty members, this study examines what information needs are expressed by humanities scholars that an institutional repository (ir) can address. it also asks what concerns humanists have about irs, and whether there is a repository model other than an institutional one that better suits how they work. humanists make relatively low use of existing irs, but this research indicates that an institutional repository can offer services to humanities faculty that are desired by them, especially the digitization, online storage, curation, and sharing of their research materials and publications.
if presented in terms that make sense to humanities faculty, and designed consciously with their needs and concerns in mind, an ir can be of real benefit to their teaching, scholarship, collaborations, and publishing.

introduction

institutional repositories (irs) are infrastructures through which universities and colleges seek to safeguard and share digital content created by faculty and staff. most irs contain articles, books, datasets, and related scholarly material; a minority also contain teaching and administrative records (primary research group, , p.  ). irs are increasingly common in academia: by september  , the directory of open access repositories (opendoar) listed over  ,  irs worldwide.

irs are predominantly planned and implemented by library and information technology directors and staff (association of research libraries,  , p.  ; baudoin & branschofsky,  ; dill & palmer,  ; markey, , p.  ). however, there have been few attempts to discover the information needs of faculty during this planning, and before ir hardware and software are installed: "many librarians and administrators are convinced that repositories are important - so much so that most are, or will be, implementing repositories before they do a needs assessment" (markey,  , p. ix). this general lack of a needs analysis before launching a library service runs counter to normal library practice. libraries routinely include faculty members as collaborators in other areas of service design, planning, and implementation, such as the development of new digital library services (seaman,  ), the way in which bibliographic utilities are used (rader,  ), or the integration of library holdings with electronic courseware systems (agingu & cooper,  ).

despite the growing number of irs worldwide, university and college faculty have been slow to submit content to them. a   survey of   faculty at colleges and universities in the united states and canada found that "only  . % ...
have ever contributed a publication to their library's digital repository" (primary research group,  , p. ) although  . % of those surveyed were aware that their institution had instituted "a digital repository for faculty publications" (primary research group,  , p. ). schonfeld and housewright ( ) measure only a slightly higher percentage in a survey of  ,  faculty members: "our national sample found only  % of faculty members have deposited into an institutional repository" (p.  ). prior research suggests that one reason for the low incidence of content submission is a failure to evaluate the information needs and concerns of faculty when designing an ir service.

problem statement

humanities scholars are an important part of an academic faculty, but their information needs are rarely if ever considered during the design phase of an institutional repository. this study fills that void by exploring the following questions:

what information needs do humanities scholars express that irs might address?
what issues concerning ownership and re-use of their materials do they express when they consider using an ir?
is there a repository model other than an institutional one that better suits how they create, store, and share their teaching and research materials?

failure to understand the information requirements of humanities content contributors early in the planning of an ir can result in services that are ill matched to their requirements, which in turn leads to under-populated repositories. building a service based on an articulation of desires and in response to concerns, instead of one based on various assumptions of need, should create a more precise match between irs and the needs of their intended contributors.
this research will provide a better understanding of the information needs that humanists have that can be addressed by institutional repository services, and will aid in the future design and promotion of irs.

literature review

recent literature is beginning to pay attention to the importance of uncovering users' information preferences prior to ir implementation. maness, miaskiewicz, & sumner ( ) "decided it was imperative that insight into users' goals and needs of an ir be gained before design of the repository began" (introduction section, para.  ), and conducted   interviews with faculty at the university of colorado, boulder. at the university of southern california, interviews were conducted to determine how faculty members publish and share research, and to gauge their receptiveness to an ir as an infrastructure for scholarly communications (holmes-wong, brown, & tompson,  , slide  ). at yale university, the library investigated faculty needs while planning repository services to support "scholarly publishing, open access, and institutional branding" (green,  , p.  ).

additional research has probed faculty information needs after an ir has been implemented. the university of rochester, faced with an underused repository in  , sought to tailor its ir services to the information needs of its faculty members. the purpose of that study was:

... to explore the apparent misalignment between the benefits and services of an ir with the actual needs and desires of faculty [and] to understand the current work practices of faculty in different disciplines in order to see how an ir might naturally support existing ways of work. (foster & gibbons,  , institutional repositories and the adoption problem section, para.  )

despite these examples to the contrary, there has been a surfeit of assumptions about faculty needs during ir planning, which has led to many irs that are thinly populated with content (a persistent problem since their inception).
in  , a survey of   repositories "found the average number of documents to be only ,  per repository, with a median of  " (ware,  , p.  ); davis & connolly ( ), looking at the cornell university ir, discovered that at that point "[m]any of its collections are empty, and most collections contain few items" (abstract section, para.  ); and a   census of   irs in the united states found that "both pilot-test and operational irs are very small. about  % of the former and  % of the latter contain fewer than  ,  digital documents" (markey,  , p.  ). humanists in particular have been infrequent ir contributors: "no survey interviewee viewed the english or other literature-oriented departments (classics, comparative literature, and theater) as being a heavy contributor to the digital repository ... and  . % considered them modest contributors" (primary research group,  , p.  ). markey ( ) also found that statistically "there is no relationship between ir size and ir age" (p.  ), which indicates that the root cause of the under-population is something other than simply the newness of the service.

researchers have not been slow to offer remedies to this general unwillingness of faculty to deposit content into an ir. foster & gibbons ( ) suggested that confusing terminology contributes to the problem: "[faculty] did not perceive the relevance of almost any of the ir features as stated in the terms used by librarians, archivists, [and] computer programmers" (what faculty members want section, para.  ). they also noted that the emphasis on the needs of the institution over the needs of the contributor may contribute to the problem (what faculty members want section, para.  ), a point which is strengthened by more recent research: "most faculty members ... did not consider their research materials as 'institutional' property" (rieh, st jean, yakel, markey, and kim,  , p.  ).
kim ( ) proposes that one should also uncover faculty members' perceptions of the costs (time to submit; copyright concerns) and benefits (professional recognition; altruism) of irs "in order to better structure incentives and social mechanisms to foster contribution [to them]" (p.  ).

in a frequently cited older study, wilson ( ) advanced "a theory of the motivations for information seeking behavior" (p.  ). this theory stresses the importance of understanding an information seeker's social and professional setting in order to understand what real needs, motivations, and hindrances exist. current efforts to populate irs with faculty-generated data are hampered precisely by a tendency to focus on systems, software, and standards, rather than on what wilson called in   the "understanding of information users in the context of their work or social life" (p.  ).

procedures

research design

this study took place from september  -april   at dartmouth college, a private four-year liberal arts institution in hanover, new hampshire, with a student body of approximately  ,  undergraduate and  , graduate students. graduate programs are predominantly in science, technology, and medicine. [ ] participants for this study were drawn from the   full-time individuals in dartmouth's division of arts and humanities. in order to select interviewees, a current list of faculty names and departments was drawn up from the  -  dartmouth faculty handbook, and this list was sent to the librarians who are liaisons to the   departments in the division of arts and humanities. these librarians helped identify faculty members who have expressed an opinion about the services that an ir can fulfill, such as data storage, long-term archiving, or open-access publishing, and whose current needs make them likely early adopters of a repository or whose concerns may make them early objectors to a college investment in such an infrastructure.
the researcher then contacted those faculty members and conducted individual interviews with any who agreed to be part of the study and who were present on campus during september  -april . twenty-seven faculty members were invited to take part in this study;   agreed to do so. the interviewees came from nine of the   departments; seven were female, six were male; and the pool contained both senior and junior faculty members. those who chose not to take part mentioned a lack of interest or time, insufficient expertise (none was required), recovery from illness, or absence from campus; some did not respond.

other ir evaluation studies that rely on in-person interviews have made similar decisions regarding the size of the pool, when the purpose is to indicate trends rather than to generalize findings to a population. as schneiderman ( ) points out, in-person interviews "can probe deeper to understand frustrations or satisfactions" (p.  ) than is typically possible in a questionnaire, but they are time-consuming to conduct, record, analyze, and describe. at cornell university, for example, davis & connolly ( ) interviewed  faculty members out of the entire faculty population; at the university of rochester, foster & gibbons ( ) sampled   faculty members, but their work was supported by a grant-funded team (studying faculty work practices section, para.  ).

methodology

the researcher sent an email message to the faculty members selected by the library liaisons to explain the project and to ask them to participate, sending a follow-up message if necessary. the researcher then conducted in-person interviews with those arts and humanities faculty members who agreed to participate. interviews began with a brief definition of an ir and its common functions (table  ).
this definition had also been sent to the interviewees beforehand because the term "institutional repository" has not been well understood by faculty: "less than half of faculty at research universities understood the meaning of this term" (primary research group,  , p.  ).

table  : background information for interview participants regarding irs

definition
an institutional repository (ir) is the infrastructure through which colleges and universities seek to achieve "the more coordinated management and disclosure of digital assets [such as] learning objects, data sets, e-prints, theses, dissertations and so on." de rosa, c., dempsey, l., & wilson, a. ( ). oclc environmental scan: pattern recognition. dublin, oh: oclc, p.  .

functions
collecting: the gathering together of scholarly, pedagogical, and administrative output from across campus.
preserving: protecting the intellectual and administrative assets of an institution.
publishing: providing a place from which scholarly and teaching materials can be made globally available (even if they are commercially published elsewhere).
re-purposing: allowing the easy re-use (where appropriate) of teaching and research materials.
promoting: demonstrating the excellence of an institution through the availability of its scholarly and pedagogical outputs, and celebrating its altruism in sharing them widely.

all participants were asked the same questions [see appendix] and were assured anonymity in the final report to the dartmouth digital information (d i) steering committee [ ] and in any subsequent publication of results. they were informed that their opinions could influence how the institution plans "a broad, long-range strategy for managing both the administrative records of the university and the academic output of departments and faculty" (dartmouth college & duke university,  , p.  ).
data quality

a member of the faculty at simmons college graduate school of library and information science commented upon a draft version of the materials to be sent to interviewees and also examined the ir definition and the interview questions for ambiguity or unexplained terminology. the ir definition and the interview questions were also shared with the library liaisons at dartmouth college.

findings

. information management

.  managing and distributing materials

teaching

interviewees expressed a greater satisfaction with their ability to manage and distribute their teaching materials than their research data and publications. like many academic institutions, dartmouth uses a course management system (blackboard™) to organize and deliver curricular materials. interviewees use blackboard™ heavily as both a delivery system and as a long-term repository for pedagogical materials, although the college does not explicitly support the latter role. ten of the   interviewees used blackboard™ for delivering syllabi, grading, and for enabling course discussions; seven also used it as a content repository over time. four had web sites that they also used for class materials, but one of the interviewees observed that there is currently no institutional infrastructure to archive class web sites. one interviewee expressed frustration that blackboard™ content is inaccessible to a wider public than dartmouth students, and another has abandoned blackboard™ because of its perceived poor integration with external media assets.

research

the faculty expressed a considerable need for an institutional infrastructure for faculty research materials. eight interviewees stored research materials on their desktop and laptop machines, and they were clear that this situation was sub-optimal.
three interviewees discussed the challenges of dealing with very large files in this manner: music archives, cad programs, and video. one interviewee with multi-terabyte media files had no institutional backup at all for this research material. two thought that the institution's networked drives were too slow to be practical for research materials; one observed that a departmental server provided storage but, critically, no access for colleagues and students; one noted that there are few institutional resources to help with the archiving of research materials; and one was frustrated at the storage limit placed on a personal college web page account. four interviewees specifically wanted to access data in a networked repository; two of them observed that they are challenged to coordinate research and teaching material scattered across several machines.

one interviewee, who did not distribute scholarship in digital form, thought that digital research and scholarly materials have an odd standing in his/her field, and believed there is still a premium on print with digital content considered as something of an add-on. this interviewee believed that this attitude typifies how the college judges digital scholarship in tenure and promotion decisions.

.  preferences

this section elicited the most discussion and the widest array of responses. the dominant need expressed was for services that create and manage digital materials in order to advance teaching and scholarship.

digitization services

eight interviewees expressed the need for materials to be digitized in order to include them in a repository. humanities faculty members have scholarship and raw research materials in paper, slide, photo, tape, and film formats, and they want this material online with the ability to search across all material in the repository. one interviewee suggested that the scholarship, once digitized, could be used for a "digital dean's shelf" as an online promotion of dartmouth scholarship.
data processing services

this was the next most prevalent need, with seven interviewees offering examples of data processing services that would further their use of an ir. two of them wanted a "format conversion function" to convert old file formats to newer ones (e.g., wordperfect to ms word, cad files into video, or photoshop files to jpeg). one interviewee needed to create pdf or powerpoint documents of classroom lecture materials and add them to blackboard™; another needed repository tools to process large numbers of images; and a third wanted to cull video clips more easily for classroom use.

storage services

six faculty members needed better data storage solutions. four required support for large datasets, including audio, video, and  d models. these files are cumbersome to manage and share without an infrastructure designed to accommodate them. two interviewees asked for storage for departmental records, and one also desired long-term preservation of course materials.

open access/publishing services

four interviewees needed help negotiating rights to use material, both for their own content (for which they had signed away rights to a publisher) and for content created and owned by others. they thought such support would be an important part of a successful repository infrastructure. two interviewees wanted help with open access publishing: one wished that all his own material was openly accessible; another wanted to put a digital copy of a print book into the library upon publication, and believed an open access agreement like harvard's might help with this process. [ ] this interviewee has written a book with a colleague and needed additional material posted online. currently there is no institutional infrastructure through which to assign a permanent identifier to such material.
another wanted training and tools for easier maintenance of websites that promote work to the alumni of the department, who are potential resources for students.

annotation and collaboration services

two interviewees would like a repository to allow users to interact with content, by adding comments and tags as they can on facebook and flickr. one also asked for collaboration tools for shared editing. another wanted support for virtual teaching, working with colleagues and students at multiple sites, and would want any new campus infrastructure to make this much easier. a third saw the opportunity for ir-based tools to drive new interdisciplinary work, and to help students create new fields of study by deepening and easing their discovery of material and partners in other fields.

only one interviewee did not require any ir-based services. this interviewee valued access to online archives and repositories of material, but liked the idea of creating and working with physical objects and observed that there is still a deep relationship to print in his/her discipline.

.  interest in adding material to an ir

eleven interviewees were interested in adding material to an ir. two were unsure: one believed that his/her journal content was already all available electronically through publisher sites; the other had concerns about loss of ownership and appropriate credit (especially for teaching materials). where an interest was qualified or amplified, the interviewees wanted appropriate technical support, for the college to maintain faculty web sites over time, and for the digitization of paper-based material from research. two interviewees specifically mentioned the desire for a faculty profile page on the web to showcase one's own scholarship or teaching.

.  requirements for a campus-wide repository

the interviewees all emphasized the importance of trust in the ir as a secure and enduring institutional service.
they favored material under the control of the owner, especially for material that is pre-publication or part of a collaborative process. ease and speed of ingest were also important, as were version control and the ability to add metadata for different audiences. .  types of materials for submission a wide array of teaching and research materials (table  ) and data formats (table  ) were specified.
table  : materials that interviewees would submit to an institutional repository
material respondents research material scholarship teaching materials nothing
table  : formats that interviewees would submit to an institutional repository
format respondents text images syllabi video audio scholarly archives cad and  d models student projects and theses
four interviewees discussed syllabi, and expressed some hesitance to share them. their reasons were that syllabi take a lot of work and there was concern that one would not get credit for that effort; there was also the concern that a student would use an outdated syllabus by mistake, as these documents change frequently (even during the term). one interviewee thought the ability to see a junior colleague's syllabi online was a benefit for mentoring and evaluation; another noted that dartmouth's  -week term may not make dartmouth syllabi very useful to the many u.s. institutions that are organized around a   to  -week semester. several interviewees mentioned the value of having their digital material available in a central repository from which it can be used in research and teaching. two interviewees saw an ir as an aid to their current published work, allowing for open access versions to be disseminated. . issues concerning ownership and re-use of their materials .  
concerns about the adoption of irs for academic content five interviewees thought that the processes of reward and recognition in the humanities, especially when tenure and promotion decisions are made, did not value digital publishing. one interviewee noted that this is still an era of individual scholars generating printed books, and that this is how careers are judged. one interviewee also suspected that a repository could allow people to plagiarize more easily or to circumvent copyright. there was also interest in the long-term maintenance of public web sites over time so they do not age badly and reflect poorly on the individual or the institution. four interviewees expressed no concerns, as long as the repository was safe and capacious. they saw an ir as part of an evolutionary process away from print-only resources, as a good way to disseminate material, and as a necessary backup. .  access restrictions all interviewees thought that some material in the repository should be open to all users. many had examples of times when this would not be the case for a given item, or for an item in a pre-production state, but the general tenor of opinion was firmly towards global access. one interviewee expanded on this opinion, suggesting that it was important to behave the way we wish other scholars, libraries, and museums to behave in granting the most open access legally possible. another interviewee has been pleasantly surprised by the quality and passion of some of the online commentary on public services such as flickr even when the commenters are not trained academics. one interviewee raised a question about ownership of lecture material: do faculty members own the material they create, especially for teaching, or could they find their material in the repository without their input? .  
usage statistics for ir content all interviewees saw the provision of usage statistics as being of interest, but opinions ranged from usage statistics having personal value merely to their being important to the institution to judge the impact of the service. one noted that impact metrics are becoming more prevalent in the humanities and this would be one such measure. two interviewees cautioned against popularity being a useful indicator of quality for scholarship. ) repository models other than institutional ones .  contributions to or use of a disciplinary repository ten interviewees did not know of a repository that is important in their discipline. on the other hand, one interviewee used a disciplinary repository centered on a scholarly society; a second used a music repository that is discipline­specific; and a third used facebook (for personal use) and flickr (for photo sharing and photo discovery) and saw such services as of value in academia. .  the utility of a disciplinary repository as a way to publish and discover scholarship one interviewee noted that a cross­institutional, discipline­based repository would be more desirable than an ir; two thought that it would not, and three had no opinion. seven interviewees saw some value in disciplinary repositories when combined with institutional ones, with the local architecture designed so that it can support both an institutional expression of content and also deliver content to disciplinary repositories. .  level of importance that the dartmouth repository is cross­searchable with other institutional repositories all interviewees saw value in ir cross­searchability, with eight seeing it as very important, four as fairly important, and one as of low importance. interdisciplinary work and new collaborations were most valued. two respondents also thought that an ir could ease the transfer of their material to a disciplinary repository if one becomes important to them, or to another institution if they move. 
) other needs and concerns in their closing remarks, interviewees reiterated their need for digitization and conversion services, the easy ingest of materials, the importance of collaborative online work environments, faculty control over who accesses their material, and the long-term institutional commitment to an ir. one interviewee was concerned that humanists may have a failure of imagination about these possibilities, and wondered when more collaborative modes of scholarship enabled by network technologies will enter the humanities. another wondered how to get sufficient commitment from faculty to make an ir successful, and suggested that the college should pre-load an ir with a faculty member's scholarship, with his or her permission, as an incentive to participate. finally, two interviewees noted that dartmouth college faculty members are very good at teaching and scholarship but sometimes poor at showcasing these skills, and that the institution could be a leader in the humanities in embracing irs as a way to promote and share materials.   discussion institutional repositories are of increasing importance to colleges and universities, although there is still a mismatch between the high expectations on the part of librarians and technologists for these infrastructures and the faculty's low willingness to contribute content to them. prior research shows this lack of engagement with irs to be especially true for humanities faculty members. the humanists in this study, however, expressed a relatively high level of interest in the opportunities that irs afford them, especially if the services they need are a part of the ir design. most interviewees expressed a strong desire for services that the college should develop as part of the design of an ir infrastructure. 
digitization services were most needed; for humanists, their research archives and scholarly publications are not necessarily available to them in digital form. this is especially true of scholarship from earlier in their careers, but not limited to this material: scholarly books still come out in the humanities without an accompanying ebook version, for example, and there has been to date no college infrastructure designed to take the computer file in which the faculty member creates the book and give it an online existence, including copyright clearance. the interviewees also expressed a wide spectrum of other services that they felt could make participation in an ir more attractive. while there was relatively little interest in digital publishing through a campus infrastructure, there were clear needs for mass data storage, persistent identifiers, interlinked scholarly and pedagogical repositories, collaborative online work, community tagging, and user commentary. it is clear from this research that­if issues of trust, control, and professional recognition can be suitably addressed­an ir could address a number of significant information management and dissemination needs for humanists. all interviewees, for example, believed that some ir content should be accessible to all internet users, but they also saw an ir as a desirable collaborative working space for certain types of documents where the audience would be a more limited circle of colleagues and students. most interviewees in this study manage aspects of their teaching at the network level, with the access and data security benefits that this brings. the blackboard™ course management system functions variously as a long­term repository for teaching material and a set of services (grading, course discussion, and delivery of syllabi and selected content). 
the interviewees have no such robust institutional infrastructure for research materials and published scholarship, however, and are much more likely to manage these on a personal laptop or desktop, or on several such machines in multiple locations. they express a clear need for a central networked repository for research files that would be accessible to other systems and that could store large music, video, and cad files as well as text and image collections. chavez, et al. ( ) have also noted the need for faculty services to be tightly coupled with the design of irs. these authors promote the importance of both "high­level, or infrastructure, repository services that support sharing of data between repositories, ingesting of data into repositories, and harvesting of content from repositories" and "low­level, or content based, services such as natural language processing tools and analysis services that support users in their interaction with materials that are already within repositories" (introduction section). such service­based inducements to contribute to a repository were very important to the respondents in this study; however, services are rarely promoted when institutions introduce an ir to a faculty population: "most ir staff members lacked an understanding of the range of services that might constitute a comprehensive service model for irs" (rieh et al.,  , p.  ). the comments of the humanists in this research suggest that a focus on the high­level and low­level needs of ir contributors is fundamental to the adoption of these services. the wide range of current ir­related needs expressed by these humanists was surprising, given the low use of irs by humanists elsewhere and the infrequency with which these disciplines express other cyberinfrastructure needs. 
the lack of an infrastructure for research materials beyond personal hard drives and web pages is of real concern to most of the interviewees: too much of their existing digital material is vulnerable to loss. an ir that included services to ease the ingest, management, and sharing of digital content could address those needs. almost all interviewees were ready to store, use, and share material from a campus ir, and all thought that some material in an ir should be free to all. this would lead one to assume that these humanities faculty members would agree with the value propositions of the american council of learned societies ( ) when describing networked infrastructures for the humanities and social sciences, which they assert should "be accessible as a public good; be sustainable; provide interoperability; facilitate collaboration; and support experimentation" (p.  ). however, "the public good" is not the first need an individual faculty member expresses in thinking about the personal and institutional investment that an ir involves. librarians should be careful not to couch irs initially in such abstract, if worthy, sentiments when trying to articulate their value to an individual contributor. issues of trust are central to ir adoption. there is the need to trust that the institution will support the repository over a long period of time and that the material placed in there is safe from catastrophic loss. there is also the issue of access control, which needs to be firmly in the hands of the contributors, with the ability to control who can access a given item, from local users only to all internet users. the design, promotion, and ongoing support for any ir designed for humanists need to engender trust to be successful. some faculty saw an ir as a way to showcase their work. 
at several points in the discussions, interviewees asked for a repository to be able to feed content to a faculty profile page, to promote their work, and help others find it. researchers at the university of rochester uncovered the same faculty expectation, which led to the introduction of "a personalized webpage that we will make available to any university of rochester faculty member or staff author who puts work into our ir. the researcher page will serve as the showcase for all of the researcher's work" (foster & gibbons,  , enhancing the ir to meet the needs of faculty users section, para  ). finally, there are a cluster of concerns around reputation and reward. several respondents do not trust others to give credit where it is due, and also think that the institution does not value digital work sufficiently in promotion decisions, which lessens their willingness to engage with irs. this latter opinion finds support in the recent survey by schonfeld and housewright ( ): "in our survey, roughly one­third of faculty members strongly agree that tenure and promotion practices 'unnecessarily constrain' their publishing choices.... this belief is stronger among social scientists and humanists than among scientists" (p.  ). [ ] coupling institutional encouragement for open access publishing with institutional rewards such as promotion and tenure will be necessary for irs to take root deeply in academia. if an institution asserts that it values open­ access publishing, sharing of intellectual assets, and collaborative working practices, then those behaviors need to be recognized within the academic rewards structures. further research this inquiry could be broadened to include humanists from a range of educational institutions, and to examine differences in information­seeking behavior among the humanities, physical and medical sciences, and social sciences. 
the age and rank of the faculty member could be taken into account also, to see if the willingness to engage with irs is affected by these variables. there is recent evidence to suggest that more senior faculty are more willing to contribute their materials to an ir: "only  . % of lecturers or instructors have contributed an article to a digital repository, while  . % of full professors have done the same. this is true despite the lower levels of awareness of repositories at higher rungs in the academic totem pole" (primary research group,  , p. ). further research opportunities also exist in examining the characteristics of early adopters of ir services, which could also uncover any changes in the services required after faculty had been using a repository for a period of time.   conclusion the topics of conversation generated in this set of interviews were broad and varied, covering both teaching and research, and including data creation, curation, and delivery needs over multiple media. the level of interest in ir services was surprisingly high, given the general lack of engagement that humanists have in irs elsewhere. if librarians think of irs in terms of access, scholarly communication, and institutional promotion, then these interviews underscore the fact that faculty members think in terms of storage, services, and the marketing of self. understanding this difference in focus is critical to the success of an ir. if the faculty do not see the ir service described in language that makes sense to them, and if they do not believe that it is an infrastructure that addresses their needs, then their incentive to contribute content is severely diminished. conversely, this research makes it clear that an ir can offer a range of services to humanities faculty that are desired by them, especially the digitization, online storage, and curation of their research materials. 
such reliable networked storage is a real need, especially for those with large collections of digital images, audio, video, and architectural datasets. the existence and relative success of course management systems (which the faculty see as one type of repository) has made faculty even more aware of the lack of institutional infrastructures to support their research data, and an ir should be promoted as a solution to this problem. these interviews also focus attention on the need for faculty rewards systems, including promotion and tenure decisions, to take much more account of the activities surrounding the digital archiving, publishing, and sharing of scholarship and research data. while it was not a majority opinion, there was concern that digital projects and publishing were not valued sufficiently in promotion decisions within the humanities. until scholars feel that contributing materials to irs and publishing from them will enhance their scholarly and pedagogical reputations rather than potentially damage them, the ability to promote an ir service will be seriously hampered. the institutional will to create institutional repositories must be coupled with the institutional will to reward their use. humanists can be willing participants in an academic institution's repository of scholarly and pedagogical assets if the designers of that ir infrastructure take the time to uncover and take into account their service needs and reputational concerns. if presented in terms that make sense to the faculty, and designed consciously with their needs in mind, an ir can be a digital creation, storage, and dissemination infrastructure that would have real benefit to the teaching and research work of humanities faculty.   notes  dartmouth college is classified by the carnegie foundation for the advancement of teaching ( ) as a "ru/vh: research university — very high research activity.".  
the d i committee is charged by the provost to develop an institution­wide digital information strategy for dartmouth college. its work is part of an ongoing joint repository planning initiative undertaken by duke university and dartmouth college, and funded by the andrew w. mellon foundation, focused "not on technological solutions, but rather on developing a clearer definition of the pieces that must be integrated into a plan ... in support of an institution­wide digital asset management/enterprise content management program" (dartmouth college & duke university,  , p.  ).  dartmouth college has an author's amendment to use in such cases, and since the date of this interview, dartmouth has signed an open access compact with four other schools (massachusetts institute of technology, university of california, berkeley, cornell university, and harvard university).  harley, acord, earl­novell, lawrence, and king ( ) show this attitude is also present in both history — "publishing in nascent online­only journals carries much less prestige than the flagship print journals" (p. xv) — and political science: "there appears to be no call within the field for open access journals, and although there are a few online­only journals, they lack the gravitas of their traditional print counterparts" (p. xvii).   references [ ] agingu, b. o., & cooper, c. m. ( ). collaborating with faculty through technology: faculty as users and partners. journal of educational media & library sciences,  ( ),  ­ . http://www.fed.cuhk.edu.hk/en/jemls/ / .htm [ ] american council of learned societies. ( ). our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. 
retrieved september  ,   from http://www.acls.org/cyberinfrastructure/ourculturalcommonwealth.pdf [ ] association of research libraries. ( ). spec kit  : institutional repositories. washington, dc: arl. http://www.arl.org/bm~doc/spec web.pdf [ ] baudoin, p., & branschosky, m. ( ). implementing an institutional repository: the dspace experience at mit. science & technology libraries,  ( / ),  - . http://www.informaworld.com/smpp/content~db=all~content=a ~frm=titlelink [ ] carnegie foundation for the advancement of teaching. ( ). classifications. retrieved september  , , from http://www.carnegiefoundation.org/classifications/ [ ] chavez, r., crane, g., sauer, a., babeu, a., packel, a., & weaver, g. ( ). services make the repository. journal of digital information,  ( ). retrieved september  ,  , from http://journals.tdl.org/jodi/article/view/ / [ ] dartmouth college, & duke university. ( ). digital asset management: elements of an institutional program. unpublished final report to the andrew w. mellon foundation on the duke/dartmouth project. [ ] davis, p. m., & connolly, m. j. l. ( ). institutional repositories. evaluating the reasons for non-use of cornell university's installation of dspace. d-lib magazine,  ( / ). retrieved september  ,  , from doi: . /march -davis. [ ] de rosa, c., dempsey, l., & wilson, a. ( ). oclc environmental scan: pattern recognition. dublin, oh: oclc online computer library center. retrieved september  ,  , from http://www.oclc.org/reports/escan/ [ ] dill, e., & palmer, k. l. ( ). what's the big idea? considerations for implementing an institutional repository. library hi tech news,  ( ),  - . 
http://www.emeraldinsight.com/journals.htm?issn= ­ &volume= &issue= &articleid= &show=abstract [ ] directory of open access repositories ( , september). retrieved september  ,  , from http://www.opendoar.org/index.html [ ] foster, n. f., & gibbons, s. ( ). understanding faculty to improve content recruitment for institutional repositories. d­lib magazine  ( ). retrieved september  ,  , from doi: . /january ­foster [ ] green, a. ( ). review of digital repositories. report to the integrated access council, yale university library, new haven, ct. retrieved september  ,   from http://www.library.yale.edu/iac/documents/dr_review_final_ sept .pdf [ ] harley, d., acord, s. k., earl­novell, s., lawrence, s., & king, c. j. ( ) assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines. berkeley, ca: university of california, berkeley, center for studies in higher education. retrieved september  ,   from http://escholarship.org/uc/cshe_fsc [ ] holmes­wong, d., brown, j., & tompson, s. ( , april). contextualizing the institutional repository within faculty research. paper presented at the digital library federation spring forum, austin, texas. retrieved september  ,  , from http://www.diglib.org/forums/spring /presentations/wong.pdf [ ] kim, j. ( ). motivating and impeding factors affecting faculty contribution to institutional repositories. journal of digital information  ( ). retrieved september  ,  , from http://journals.tdl.org/jodi/article/view/ / [ ] maness, j. m., miaskiewicz, t., & sumner, t. ( , september/october). using personas to understand the needs and goals of institutional repository users. d­lib magazine   ( / ). retrieved september  ,  , from doi: . /september ­maness [ ] markey, k., rieh, s. y., st. jean, b., kim, j., & yakel, e. ( ). 
census of institutional repositories in the united states miracle project research findings. clir publication, no.  . washington, dc: council on library and information resources. http://www.clir.org/pubs/abstract/pub abst.html [ ] primary research group. ( ). the international survey of institutional digital repositories. new york: primary research group. http://www.primaryresearch.com/view_product.php?report_id= [ ] primary research group. ( ). the survey of higher educational faculty: use of digital repositories and views on open access. new york: primary research group. http://www.primaryresearch.com/view_product.php?report_id= [ ] rader, h. b. ( ). a new academic library model: partnerships for learning and teaching. college & research libraries news,  ( ),  - . [ ] rieh, s. y., st jean, b., yakel, e., markey, k. & kim, j. ( ). perceptions and experiences of staff in the planning and implementation of institutional repositories. library trends,  ( ),  - . retrieved september  ,  , from http://muse.jhu.edu/login?uri=/journals/library_trends/v / . .rieh.html [ ] schneiderman, b. ( ). leonardo's laptop: human needs and the new computing technologies. cambridge, ma: mit press. [ ] schonfeld, r. 
c., & housewright, r. ( ). faculty survey  : key strategic insights for libraries, publishers, and societies. new york: ithaka. retrieved september  ,   from http://www.ithaka.org/ithaka-s-r/research/faculty-surveys- - /faculty% study% .pdf [ ] seaman, d. ( ). the user community as responsibility and resource: building a sustainable digital library. d-lib magazine  ( ). retrieved september  ,  , from doi: . /july -seaman [ ] ware, m. ( ). institutional repositories and scholarly publishing. learned publishing,  ( ),  . http://alpsp.publisher.ingentaconnect.com/content/alpsp/lp/ / / /art [ ] wilson, t. d. ( ). on user studies and information need. journal of documentation,  ( ),  - . http://www.emeraldinsight.com/journals.htm?issn= - &volume= &issue= &articleid= &show=abstract   appendix: questions asked of humanities faculty ) information management issues q .  how do you currently manage and distribute your research and teaching materials? q .  what would you like to be able to do differently? q .  are you interested in adding your material to an institutional repository? q .  what requirements would you have for a campus-wide repository before you would entrust your own information to it? q .  what types of materials would you submit? ) issues concerning ownership and re-use of their materials q .  what concerns do you have about the adoption of repositories of scholarship, reports, datasets, and other material? q .  who would you like to have access to material you put in an institutional repository? 
q .  how important to you is regular reporting about the usage of your materials in the repository? ) interest in repository models other than institutional ones q .  do you currently use an online repository of materials that is not an institutional one? q .  would such a repository be more useful to you as a way to publish and discover scholarship than the institutional model? q .  how important will it be to you that the dartmouth repository is networked and cross searchable with other institutional repositories? ) any other needs and concerns? an opportunity to raise issues not covered in the preceding questions.   about the author david seaman is associate librarian for information management at dartmouth college library. prior to moving to new hampshire, he was the executive director of the digital library federation (dlf) from  - . he came to the dlf in july  from the electronic text center at the university of virginia library, where he was the center's founding director ( - ). he has lectured and published extensively in the fields of humanities computing and digital libraries since the early  s.   copyright ©   david seaman int j digit libr ( ) : – doi . /s - - - what lies beneath?: knowledge infrastructures in the subseafloor biosphere and beyond peter t. darch · christine l. borgman · sharon traweek · rebekah l. cummings · jillian c. wallis · ashley e. 
sands received: january / revised: october / accepted: january / published online: february © springer-verlag berlin heidelberg abstract we present preliminary findings from a three- year research project comprised of longitudinal qualitative case studies of data practices in four large, distributed, highly multidisciplinary scientific collaborations. this project fol- lows a × research design: two of the collaborations are big science while two are little science, two have completed data collection activities while two are ramping up data collec- tion. this paper is centered on one of these collaborations, a project bringing together scientists to study subseafloor microbial life. this collaboration is little science, character- ized by small teams, using small amounts of data, to address specific questions. our case study employs participant obser- vation in a laboratory, interviews (n = to date) with scien- tists in the collaboration, and document analysis. we present adataworkflowthatistypicalformanyofthescientistswork- ing in the observed laboratory. in particular, we show that, although this workflow results in datasets apparently similar in form, nevertheless a large degree of heterogeneity exists across scientists in this laboratory in terms of the methods they employ to produce these datasets—even between sci- entists working on adjacent benches. to date, most studies of data in little science focus on heterogeneity in terms of the types of data produced: this paper adds another dimen- sion of heterogeneity to existing knowledge about data in lit- tle science. this additional dimension makes more complex the task of management and curation of data for subsequent reuse. furthermore, the nature of the factors that contribute to heterogeneity of methods suggest that this dimension of p. t. darch (b) · c. l. borgman · s. traweek · r. l. cummings · j. c. wallis · a. e. 
sands knowledge infrastructures project, department of information studies, ucla, gse&is building, room , box , los angeles, ca - , usa e-mail: petertdarch@ucla.edu heterogeneity is a persistent and unavoidable feature of little science. keywords data deluge · big science · little science · multidisciplinary scholarship · knowledge infrastructures introduction long predicted by the science community [ ], both nature and science have now heralded the opportunities and chal- lenges presented by the scientific data deluge [ , ]. uni- versities themselves are assessing their rights, roles, and responsibilities for managing and for exploiting data from their researchers [ ]. in addition to the sheer size of data generated, the het- erogeneity of datasets is also increasing, even within indi- vidual domains. scientific collaboration is becoming a more multidisciplinary, distributed endeavor [ ]. as a result, approaches from multiple epistemological or social perspec- tives may be combined in the production of a dataset, and conversely a single dataset may be used in multiple contexts, crossing epistemological, cultural and social boundaries. contemporary digital scholarship is thus a rapidly chang- ing and expanding undertaking. however, today’s scientific methods and organization of collaborative work often do not scale well to today’s volumes or diversity of data gener- ated; qualitatively different approaches to scientific inquiry are required. as data are combined from multiple sources and are mined for new interpretations, the challenges of data managementandcurationmultiply.modernsensornetworks, satellites, telescopes, and laboratory instruments can collect vastly more data, at far faster rates and far greater variety, than ever before. scientists rely on their instruments, algo- rithms, and collaborators to clean, verify, visualize, and inter- p. t. darch et al. pret their data. 
much can go wrong in the many steps involved in the design and deployment of instruments, collection and cleaning of data, and in the analysis and reporting of results. data and responsibility pass through many hands, often over the course of many years, in the life cycles of collaborative data-driven science.

scientific data management requires deep expertise in scientific theory, method, instrumentation, and interpretation. skill sets are complex and are divided differently in each field and specialty. each step in data handling requires knowledge and judgment of the steps that went before. necessary details of data provenance often go undocumented, leaving researchers in the position of making multiparty inferences with insufficient information [ ]. minute differences in calibration, minuscule artifacts in a data stream, and other perturbations may be spotted by those closest to the research design, but these factors decrease in visibility the farther the interpreter lies from the source of the data.

the pressure from funding agencies such as the national science foundation (nsf) and the national institutes of health (nih) to share research data highlights the complexity of data-driven science. “data” is a contested notion. furthermore, competing views exist of research, innovation, and scholarship; disparate incentives for collecting and releasing data; the economics and intellectual property of research products; public policy; and the requisite technical and human infrastructure. however, relatively few studies document consistent data release. sharing research data is thus a conundrum, “an intricate and difficult problem” [ ].
research in both data-intensive big science, where data products are large in volume but typically homogeneous, such as astronomy projects centered on the building and operation of massive-scale instruments, and in little science, where data products are small in size but large in number, such as sensor network applications in ecology, marine biology, environmental engineering, and seismology, reveals a critical lack of infrastructure to support these new forms of scholarship. the promise of technology-enabled, data-driven digital scholarship in science is predicated upon available systems, services, tools, content, policies, practices, and human resources to discover, mine, and use research products, as well as to create those products in forms that are useful to others. not only is this infrastructure not yet in place, it is not yet clear what should be built or how to build it [ , ].

however, this problem is becoming better recognized, as is the fact that sociotechnical research approaches can produce critical insights that inform the design, policy, and human resource requirements for scientific information infrastructure [ ]. this paper presents preliminary findings from a three-year study of such infrastructures. the transformation of knowledge, culture, and practice in data-driven science: a knowledge infrastructures perspective (henceforth known as the knowledge infrastructures project) involves longitudinal qualitative case studies of four large, distributed, multidisciplinary scientific collaborations. two of these collaborations could be considered as big science, whereas the other two involve multiple research teams performing little science. furthermore, two of these collaborations have been in the process of ramping down data collection and active research, while the two others are ramping up their activities.
beyond the preliminary results in this paper, the knowledge infrastructures project will continue to analyze these four distinct collaborations.

here, we present an agenda for researching knowledge infrastructures, defined as “robust networks of people, artifacts, and institutions that generate, share, and maintain specific knowledge about the human and natural worlds” [ , p. ], through motivating and presenting the research questions that guide our knowledge infrastructures project team. then, we present the methodologies of the knowledge infrastructures project, introduce the four case studies, and explain how our approach to research will make significant contributions in pursuing the research agenda. to demonstrate these contributions, we present preliminary findings from one of the case studies, a multidisciplinary, multi-institutional collaboration that involves studying microbial life beneath the seafloor. in particular, we explore and account for the diversity of practices in producing datasets that we observed across scientists, even in the same laboratory. to date, research on data practices has focused on heterogeneity of data types produced in a research setting; by contrast, here we demonstrate that for a single data type, there can be significant heterogeneity in how such datasets are produced even in the same research setting. this heterogeneity can multiply significantly the challenges involved in data management and curation.

a project for researching sociotechnical knowledge infrastructures

our knowledge infrastructures project responds both to the needs of scientists in developing practices and infrastructures to manage the increasing volume and diversity of data in their work, and to substantial gaps in the existing social scientific literature that addresses scientists’ data practices.
in this section, we motivate the knowledge infrastructures project, introduce its features, and discuss how it will contribute to improved understandings of how scientists produce and manage their data.

motivation for the knowledge infrastructures project

research into knowledge infrastructures is a developing field. current discourse around research data highlights the lack of institutional infrastructure comparable to the roles that libraries and publishers serve in access to scholarly publications. infrastructure for research data is much more than disseminating resources; it must support data collection, analysis, use, and reuse for new scientific methods, and should democratize access in the process. the design of knowledge infrastructures rests on the ability to explicate the sociotechnical structures that are embodied in the data, in data practices, in technical arrangements, and in policies. these interdependencies are known to present significant risks to adoption and implementation of effective infrastructure [ , ].

the role of data as a scholarly research product is a growing concern, both practically and politically [ , ]. our research recognizes the significant technical challenges that arise in managing research data, such as data granularity, data provenance, data structures, definitions of data, dataset identity, identifiers, and functions of data [ , , ]. some researchers have studied how authors of scientific journal articles cite reused datasets originally generated by other researchers, or have developed more systematic ways and standards for citing these datasets [ , ]. however, reuse analysis covers only one part of the multistage data life cycle [ ]. while countless policy reports call for the building of infrastructure and capacity for research data, only a handful of researchers consider how knowledge of data practices might inform design and policy processes [ , , , , , ].
studies of data practices draw upon a larger body of work than can be enumerated here. leigh star and karen ruhleder were the first to assess infrastructure from a sociotechnical perspective [ ], opening up a rich area to be mined by many others [ , ]. included in studies of knowledge infrastructures are research on work practices, collaborations, virtual organizations, computer-supported collaborative work, project life cycles, and project time [ , , , , , ].

research questions for the knowledge infrastructures project

in the knowledge infrastructures project, we address general research questions across the four research sites:

1. what new infrastructures, divisions of labor, knowledge, and expertise are required for data-driven science?
2. how are the infrastructures of multidisciplinary, data-driven scientific collaborations established and how are they dismantled?
3. how do data management, curation, sharing, and reuse practices vary among research areas?
4. what data are most important to curate, from whose perspective, and who decides?

knowledge infrastructures project case studies and methods

table site comparisons by data scope and by life cycle stage

                               big science   little science
ramping up data collection     lsst          c-debi
ramping down data collection   sdss          cens

to address these research questions, we are conducting case studies of four large, distributed, collaborative, multidisciplinary projects. we selected case studies for a 2 × 2 research design (see table ), enabling the comparison of two research projects that produce large volumes of homogeneous data (in this case, the sloan digital sky survey, or sdss, and the large synoptic survey telescope, or lsst) with two projects that produce smaller amounts of heterogeneous data (in this case, the center for embedded network sensing, or cens, and the center for dark energy biosphere investigations, or c-debi).
it also allows us to compare projects that are in the earlier stages of their life cycles (lsst and c-debi) and are ramping up data production, and projects at later stages of their life cycles (sdss and cens) that have ramped down data production.

comparisons of these sites enable us to assess the knowledge infrastructure requirements for a broad spectrum of scientific research and practice. these sites also allow us to understand processes of knowledge transfer, such as that between scientists, between scientists and information professionals, between research projects, and between science projects and the public. we are able to identify infrastructure practices that contribute to better strategies for data management and to make recommendations for policy and practice.

research on the two ramping down projects, cens and sdss, began prior to the knowledge infrastructures project. research on c-debi began in , and on lsst in . the cens project involved the development of sensing technology in collaboration with teams from a variety of sciences, most notably ecology, marine biology, and seismology. cens embodied little science, and the data tended to be heterogeneous and complex. cens focused much less explicitly on transferring its data as part of its legacy. some members of the knowledge infrastructures team were embedded in cens for a decade both as observers and participants in the development of knowledge infrastructure. this infrastructure included: the cens deployment center, to plan data collection campaigns and serve as reference metadata after the fact; the cens publication repository; and the cens data registry as part of an annual reporting system. developing these systems also enabled us to identify good practice associated with multidisciplinary data management.

sdss was a highly visible project in the domain of astronomy, embodying big science.
much is being learned about its practices, policies, successes and failures, and transfer of expertise to other sciences and projects. sdss, on the surface, may appear to have exemplified a solved data curation environment. however, our closer inspection of sdss is revealing social and technical architectures contingent upon changing technologies.

as with cens, c-debi also embodies little science. c-debi studies microbial subseafloor life. it was launched in late , and affords opportunities to observe how the work of negotiating, challenging, building, and maintaining data management practices unfolds in a new collaborative setting. we are learning what partners in information handling they seek, at what stages, and how they compare to the lessons of cens.

lsst, like sdss, is in the field of astronomy and will produce data that are largely homogeneous and very large in scale. although the telescope is not due to launch operations until , the collaborative team involved is being assembled, and is already making critical design decisions about both the technology for data collection and the infrastructure for data management. lsst team members are building on the experiences of those involved in the sdss and other large-scale astronomy projects.

a theoretical framework for analyzing knowledge infrastructures

our study examines how scientists and engineers use both social and technical resources to accomplish their goals, taking into account individuals, groups, collaborations, organizations, ideas, techniques, and technologies. a large body of scholarship investigates relationships between the technical and the social. many studies of technological and social change have accounted for the latter in terms of the former, seeing technological change as the driving force behind reconfigurations of social relations; this argument has been labeled technological determinism [ ].
scholars of technology became dissatisfied with the technological determinism approach because it does not explain why some technologies achieve wide acceptance, while other similar technologies fail to do so [ ]. some scholars have developed a social construction of technology (scot) approach that explored how the development and uses of technologies were shaped by social process, including how actors use and shape technologies in pursuit of specific social interests [ ].

a subsequent development of scot was actor-network theory (ant), which focused on goals, agency, and interaction in knowledge making [ , ]. the underlying idea of ant is that all actors, human and nonhuman, pursue interests or are goal-directed, and thus build networks of social and material resources to pursue these goals. ant has been widely and successfully used in analyses of science, technology, and society.

in our analyses, scientists and engineers in a laboratory draw on their current networks of resources (social and material) to accomplish each step of the workflow they have established. they regard each step as necessary for answering their own research questions and those of the larger collaboration, leading to the accumulation of more resources: recognition and credit for the lab and its members in the form of publications, funding, and promotions. furthermore, their networks of resources enable them to reevaluate and improve these processes [ ].

these research networks include the expertise acquired during the members’ education and experience at previous sites. they also include the multiple techniques and technologies (sample and data collection strategies, research equipment, protocols, handbooks, journal articles, funding) accessible from both within and beyond the laboratory.
it is only when the scientists and engineers become aware of, and competent in, engaging with certain techniques and technologies and their affordances that they become part of the laboratory’s network. similarly, those technologies and techniques might be part of one network, but unknown in another. these networks are dynamic and might even change rapidly, as equipment, techniques, knowledge, personnel, and funding are introduced to, or removed from, the laboratory. the knowledge of how and when to access these network resources is also in flux.

applying this theoretical framework to the research carried out in these laboratories not only helps us to perceive the heterogeneity of data practices observed across these sites in answering similar research questions and producing datasets of similar form, but also to account for why this heterogeneity occurs, even between researchers working on adjacent benches.

the center for dark energy biosphere investigations

the center for dark energy biosphere investigations (c-debi) is an nsf science and technology center (stc) launched in september . c-debi brings together scientists from the biological, chemical, and physical sciences to study subseafloor microbial life, in particular to study interactions between the composition of microbial communities and the physical environment they inhabit [ ].

c-debi serves as an important case study for studying contemporary developments in digital scholarship. the project is massively distributed across institutions in the usa and europe, and very highly multidisciplinary. as such, it is an exemplar of the complexity of data-driven science. it is also a complement to cens, as is explicated in subsect. . .

organization and work of c-debi

scientists involved with c-debi work toward the project’s scientific goals through the collection and analysis of physical samples, such as rocks from the seafloor (known as cores).
fundamental to scientists’ work is the production, analysis and correlation of data about the cores’ microbial communities with the physical properties (such as geochemical or hydrological) of these samples.

the data life cycle may start in a number of contexts. one particularly important context is scientific ocean drilling cruises. during our period of observation, these were often conducted by the integrated ocean drilling program (iodp), which ran from – (it should be noted that the iodp was replaced with a new drilling program in , namely the international ocean discovery program, also known as iodp. for the purposes of this paper, the acronym “iodp” will be used to refer to the integrated ocean drilling program throughout). iodp organized regular research cruises bringing together scientists from a broad range of disciplines and institutions to visit a specific site to collect cores, which are subsequently analyzed both onboard the ship and in scientists’ laboratories at their home research institutions. c-debi also supports scientists who participate in research cruises organized by organizations other than the iodp.

as well as providing some funding and equipment support for cruises, c-debi also distributes funding directly to scientists. this funding is generally characterized by being short term (typically one to three years in length), to individuals and small teams (usually of two or three) and across a very broad range of institutions. the main opportunity for funding is through the small grants program, through which grants are awarded to proposed projects that use existing datasets and samples (e.g., from cruises). other grants are directed toward early career researchers, such as doctoral students and postdoctoral researchers. these grants are awarded on a regular basis following competitive calls for proposals.
to date, these grants have supported approximately scientists in more than laboratories across the usa, europe and east asia [ ].

c-debi infrastructure

c-debi has also implemented other measures to support the community of researchers, fostering connections and exchanges of knowledge. one important component of bringing the community together is the project website, which contains a wide range of c-debi-related information, including key project personnel, descriptions of the main scientific foci of the project, information about the various grants and fellowships, a list of c-debi-contributed scientific publications, and c-debi official documents such as the proposal and annual reports. the project also communicates with community members, and any others who are interested, via a twice-monthly newsletter. finally, the project also provides opportunities for affiliated scientists to come together, such as an annual project meeting.

c-debi and cens

we are comparing data management, curation, and sharing practices across the four case studies in our 2 × 2 research design. the richest source of comparisons and contrasts for c-debi is with cens. there are many similarities and differences between c-debi and cens, which will extend and add to the extensive body of work we have already produced about our studies of cens.

as with c-debi, cens was an nsf stc. it was launched in , and ceased operation in . cens was a distributed, multidisciplinary collaboration involving five research universities across california. its focus was to bring technologists and domain scientists (terrestrial ecology, marine biology, environmental engineering, seismology, plus applications in urban settings and arts) together so that the technologists could develop networked sensing tools that would allow domain scientists to collect data at higher spatial and temporal resolution.
like c-debi, cens was a federation of a number of small teams of technologists and scientists working together on such projects, funded by a mixture of internal and external grants.

in common with c-debi, cens was little science, in the sense that it involved the generation of a large number of heterogeneous, small-scale datasets meant for consumption by those that generated these data [ ]. however, c-debi differs from cens in important ways. one is that cens focused on developing emergent technologies to support scientific work, whereas c-debi foregrounds studying emergent scientific problems. another is that c-debi involves the integration of samples and data produced in a domain (iodp cruises) that shares many features with big science with work of small, multidisciplinary teams in individual laboratories. how the similarities and differences between c-debi and cens will augment our findings from cens is explicated below.

lack of shared interests across a project team

one key finding from our studies of cens is the lack of shared interests that existed within individual project teams. technologists were interested mainly in accomplishing the task of building networks of sensors that were technologically novel. thus, technologists were primarily interested in data about how the technology operated. they were much less interested in the scientific data per se, regarding it as background context. the converse was true in the case of the domain scientists [ ].

one implication of this finding is that the technologists would take data about how the sensors operated with them to their laboratories and manage them according to their own particular practices and standards, while the domain scientists would do the same with scientific data, sometimes discarding data that were no longer in use. the separation of the data sources made it difficult to reproduce results.
although the technologists and the scientists were interdependent, the data practices did not support this interdependence.

similar challenges are appearing in c-debi. c-debi produces both biological and physical data. however, where different types of data are produced by different scientists and managed in different contexts, there can be significant implications for the interoperability of these data. furthermore, subsequent storage and curation can diverge due to cultural practices or formal requirements in different disciplines. these choices can have implications for subsequent reproduction and verification of scientific analyses.

trust in data

trust in data is essential for their use and reuse in the scientific endeavor. our research on cens found that scientists’ ability to assess the integrity of data was essential for reuse. this ability depended on the knowledge the scientist possesses of stages of the data life cycle, from research design to data storage and curation [ ]. the life cycle of cens data involved many steps, each dependent on preceding steps: the effect of decisions made at each step was cumulative throughout the life cycle [ ].

furthermore, a great deal of confusion and disagreement occurred amongst domain scientists and technologists about who was responsible for different types of data, and for different stages of the data life cycle for each type of data. questions of who owned different types of data were frequently unresolved because some types of data or metadata did not implicate the interests of either the scientists or the technologists, and were thus frequently neglected by both [ ].

issues about who is responsible for certain types of data also arise in c-debi. for instance, the interests of very few scientists involved in c-debi projects seem to be implicated in the tools that support c-debi-related work.
information about these tools is important for the subsequent interpretation and reuse of c-debi-generated scientific data but it is difficult to see whose interests are served by collection, storage and curation of such information.

successful data sharing

enabling the widespread sharing of data promises many benefits for science [ ]. the first step in facilitating data sharing is to ensure effective data management practices at every stage of the data life cycle. however, our research on cens also exposes a number of other issues that complicate the sharing of data. cens researchers were generally willing to share data, subject to a number of conditions: they were more willing to share data that they had already published, and were also more likely to be willing to share data that involved less effort to collect [ ]. other conditions included ensuring that the producer of the data received proper attribution, and that the amount of effort to share data was not burdensome. given these conditions, and that few repositories for cens data actually existed, data sharing was very rare across cens [ ].

the c-debi case study provides an ideal opportunity for us to extend these findings because the observation of many different types of interactions enables a better understanding of the particular contexts in which data sharing is more and less likely to take place. we are conducting analyses of the interplay of various technological, infrastructural, social and normative factors that facilitate data sharing.

big science meets little science

another point of comparison between c-debi and cens is that while all the data produced and used by cens researchers were characteristic of little science, c-debi also involves data that are produced in a context, namely the iodp, that shares many features with big science.
the data generated on iodp expeditions about the physical properties of cores are highly structured, professionally curated according to stringent standards, and are archived in publicly accessible databases in the long term.

the case study of c-debi offers the possibility to see the interactions between the iodp standards and the day-to-day data practices of researchers. the addition of iodp to the data life cycle can complicate many of the factors outlined above. for instance, the involvement of an additional organization can introduce additional divergent interests to c-debi scientists. adding more steps to the data life cycle can impact subsequent stages of the life cycle, contributing to the complexity of the tasks facing scientists as they judge the integrity of datasets and attempt to interpret them.

c-debi case study

above, we presented c-debi as a research site, explaining how it is an important exemplar of contemporary developments in scientific digital scholarship and thus provides an excellent case study for understanding data practices in a little science project both in its own right and in comparison with the other case studies that comprise our knowledge infrastructures project. here, we present findings from the first year of our case study.

methods

we are conducting a longitudinal ethnographic case study of c-debi. an ethnographic study involves a range of qualitative research methods to provide a thorough account of the organization under study [ ]. our methods include interviews, participant observation, online ethnography, and document analysis. drawing on data from a range of sources allows for triangulation [ ]. the content of texts (including interview transcripts, reports, and ethnographic notes) is highly contingent, rather than simply reflecting an underlying reality.
triangulation involves the crosschecking of data from different sources produced in different contexts, which helps to ensure that conclusions drawn from the data are not biased by the context in which the data are produced.

participant observation

a key feature of this case study is long-term participant observation of c-debi. our observations include being embedded for eight months in a laboratory headed by a leading figure in c-debi at a large us research university, observing scientists at work and in meetings. we also attend scientific meetings (conferences, workshops, seminars, and colloquia) of both c-debi and the broader scientific communities in which it is embedded. participant observation has been successfully applied to the study of scientists and their practices since the s [ , , , ], and has latterly been used in studies of geographically distributed, multidisciplinary collaborations [ ]. participant observation is particularly suitable for this case study as it affords a detailed understanding of the local, disciplinary, and institutional contexts in which scientists are working as well as relationships and networks amongst scientists. we have been able to observe how ideas, practices, and methods are communicated between collaborators.

interviews

our current interview sample comprises people, including c-debi-affiliated scientists and scientists, curators, and managerial staff involved in related activities such as the scientific ocean research cruises (iodp). our sample is detailed in table , which distinguishes between respondents involved in c-debi and those working for iodp. the c-debi sample is broken down further by geographic location (usa or not), and career stage. the column “involved with iodp” indicates which interviewees are involved in policy- or decision-making in the iodp.
the iodp interviewees are further split into two groups: those in cruise operations, and those with the consortium for ocean leadership, which was responsible for administering us involvement in the iodp.

table the composition of our interview sample
(columns: career stage · interviewees · involved with iodp)

c-debi
  usa-based: undergraduate; graduate student; postgraduate; faculty; non-scientists
  non-usa-based: faculty
  total c-debi
iodp
  cruise operations: curator; staff scientist; technical support
  ocean leadership: policy; data management
  total iodp

c-debi-affiliated scientists are based in research institutions and laboratories across the usa, at a wide range of career stages, from a variety of disciplinary backgrounds, and are working on a range of projects. potential interviewees were identified based on whether they had been observed at work in the laboratory, and their involvement in c-debi-funded projects or iodp operations. recommendations were also sought from interviewees.

interviews ranged in length from min to two hours and min, with the majority being between one and two hours long. the scientists interviewed were questioned not only about their data practices and day-to-day scientific work but also about their academic and professional backgrounds, enabling us to understand how the scientists’ multidisciplinary backgrounds impact observed data practices. the non-scientists interviewed were asked about their work within the c-debi project, including the building, implementation, and maintenance of c-debi infrastructure and policies.

document analysis

we assembled a corpus of documents for analysis. some documents help to explain the work conducted by c-debi-affiliated scientists in their laboratories, for example scientific journal articles, instruction manuals for laboratory equipment, and published protocols for techniques observed in the laboratory. other documents help us to interpret social contexts in which c-debi scientists operate.
these include official c-debi documents such as the initial proposal, annual reports to the nsf, operating documents (e.g., the strategic implementation plan), and calls for funding. finally, we collected other documents to understand better the broader contexts in which the c-debi project is operating, including nsf and iodp.

data analysis

we analyzed data using a grounded theory approach [ ]. we read interview transcripts and other documents closely, and a number of themes emerged. we then coded the documents according to these themes. adopting a grounded theory approach meant that the findings in this paper are data-driven, in the sense that they emerge from the empirical research rather than being imposed upon the data in a top-down fashion.

introducing the jones laboratory

the majority of the participant observation has so far taken place in a single laboratory, the jones laboratory, at a large research university in the usa (n.b. the name “jones” and the names of the individual scientists are pseudonyms). the head of the laboratory is a senior figure in the leadership of c-debi, and the focus of the laboratory’s work is on interactions between microbial life and physical processes in the deep subseafloor. work in the laboratory is funded largely by the nsf, both directly and through iodp and c-debi.

the overall composition of the research group in this laboratory often changes, due to new phd students and postdoctoral researchers joining the group, and others completing their doctorates or postdoctoral research projects and moving on to other laboratories or industry.
during the period of observation, the laboratory personnel have comprised a tenured professor who is the laboratory’s leader, four postdoctoral researchers with between zero and five years’ experience in the laboratory, six phd students ranging from first year to fifth year, one visiting phd student from another laboratory in the usa, one undergraduate student, and two short-term international research visitors.

a typical workflow in the laboratory

here, we present a standard workflow within the laboratory. this workflow is a composite of many observed workflows. in particular, although the form of the resultant datasets appears similar across scientists in this laboratory, a high degree of heterogeneity nevertheless exists across the laboratory regarding the tools and methods used to produce these datasets. we account for this heterogeneity, discussing how different scientists (even those working on adjacent benches) have access to different and constantly changing configurations of social, material, and scientific resources to help them accomplish the different steps of the workflow. we first set the scene by presenting the basic steps of this workflow ( . . ). then, in each of sects. . . , . . and . . , a single step in the workflow is examined in more depth. in particular, for each step, we compare the differing methods employed by two scientists and examine why these methods are used.

a biological workflow in the jones laboratory

the workflow is outlined in fig. . the central goal of a project incorporating this workflow is to understand the mutual shaping of the microbial community that exists under the seafloor at a particular site, and the physical composition of the surrounding seafloor.

[fig.: a typical data workflow observed in the jones laboratory]

the starting point for this data cycle is the collection of cores for analysis during scientific cruises.
some cores may be subject to onboard analyses, producing data about the cores’ physical characteristics. when the cruise ends, some cores from iodp cruises are sent to one of the iodp core repositories, while other cores are distributed to various laboratories for biological and physical science analyses. within the laboratory, scientists typically specialize in one type of analysis or the other. most of the members of the observed laboratory perform biological analyses, and then correlate with physical science data that are generated either onboard a cruise or by other scientists. for the sake of tractability, here we focus on biological analyses only.

the main focus of biological research in the laboratory is characterizing the ecology and function of microbes in cores. biomass (matter from living, or recently living, microbes) is quantified, and microbes are classified into operational taxonomic units (otus), namely groups of microbes with similar dna sequences. these classifications are used to identify community members against previously characterized microbes found elsewhere, to see how different community members are related to each other, and to produce measures of community diversity. scientists may also compare communities across sample sites.

here, we focus on quantification of biomass, and classifying bacteria into otus, as these are common foci of projects in the observed laboratory. the first step in these analyses is the extraction of genetic material (dna or rna) from cores. one particularly common challenge is the relatively low biomass often found in deep-sea environments compared with biomass in other domains, due to the relatively low level of available nutrients. this challenge is critical because without an adequate biomass yield, the scientists cannot proceed with their analysis. following extraction, the scientist has dna or rna.
in the case of rna extraction, the rna sequence must then be transformed into its analog dna sequence. the next step, known as amplification, involves the production of multiple copies of the dna sequences of interest. when trying to characterize microbial communities, the standard portion of dna sequenced is known as s. primers, short sequences of nucleotide bases usually synthesized in the laboratory, are used to facilitate amplification. the making of multiple copies of the target sequence is achieved through polymerase chain reaction (pcr). standard pcr results simply in multiple copies being made of the target sequence, while quantitative pcr (qpcr) also allows for the quantification of the levels of archaea, bacteria, and fungi (and their subtypes) in a sample. following amplification and pcr (or qpcr), the next step is to generate a product that allows for sequencing, namely the process of determining what nucleotides are contained in each dna sequence.

sequencing can happen either within the laboratory or, more usually, at an external sequencing facility. the machine to carry out sequencing within the laboratory was acquired only recently, and so during the period of our fieldwork external sequencing has been the predominant method used. these facilities produce dna sequences that show the nucleotide bases of the s sequences that have been extracted from the cores.

once sequences are acquired, multiple steps are carried out in the laboratory to clean and process these sequences for analysis. scientists receive two sequences corresponding to the same dna sequence, namely a forward and a backward sequence, and a scientist’s first task is to marry these sequences together by matching nucleotides. another important step is identifying and removing the part of the sequence that corresponds to the primers.
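the primer-removal step described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the procedure used in the jones laboratory: the function names (`reverse_complement`, `trim_primer`, `trim_both_ends`) are hypothetical, primers are assumed to match exactly, and ambiguity codes other than `n` are not handled.

```python
# minimal sketch: exact-match primer trimming (illustrative assumption;
# real tools tolerate mismatches and ambiguous bases)

COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c", "n": "n"}


def reverse_complement(seq: str) -> str:
    """return the reverse complement of a dna sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def trim_primer(seq: str, primer: str) -> str:
    """remove `primer` from the start of `seq` if it is present there."""
    seq, primer = seq.lower(), primer.lower()
    return seq[len(primer):] if seq.startswith(primer) else seq


def trim_both_ends(seq: str, forward_primer: str, reverse_primer: str) -> str:
    """trim the forward primer from the start of the read and the reverse
    complement of the reverse primer from its end, when present."""
    trimmed = trim_primer(seq, forward_primer)
    tail = reverse_complement(reverse_primer)
    if tail and trimmed.endswith(tail):
        trimmed = trimmed[: -len(tail)]
    return trimmed
```

a sequence that does not begin with the primer is returned unchanged, so the sketch never silently discards bases; whether that is the right behavior for a given dataset is a judgment the scientist would have to make.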
a third stage of cleaning sequences involves checking that the correct nucleotide has been identified at each point along the sequence (or, nucleotide-checking), using data provided by the sequencing facility that shows the confidence with which each nucleotide has been identified.

subsequently, sequences are aligned to allow for comparison (this is known as sequence alignment). once aligned, sequences are then clustered into otus. otus in the sample are identified and classified by being compared with online databases of sequences of previously characterized bacteria.

finally, the scientist seeks to produce representations of the microbial ecology they have found in their sample(s), and how this ecology compares to the microbial ecology found at other depths below the seafloor at the same site, or at other sites. one form of representation of the ecology in a single sample is pie charts, which show the relative proportions of archaea, bacteria, and fungi, and of their subtypes, in a sample. another form of representation is phylogenetic trees. a phylogenetic tree shows how the otus in the sample may be related to each other. the scientist also typically calculates numerical measures of the sample diversity.

the scientist may also compare the site analyzed with other sites. making comparisons can involve producing cladograms, which are tree diagrams showing the relationships of different sites to each other. they may also produce venn diagrams to illustrate overlaps between sites. once these final steps are completed, the scientist may publish results in a journal or present at a conference. the biological data may then also be correlated with physical science data to understand how the physical environment shapes the microbial community, and vice versa.

although resulting from a single workflow, there is nevertheless a great deal of heterogeneity of methods employed in producing this biological sequence data. it is toward this heterogeneity that we now turn. in particular, we focus on three steps: increasing the yield of nucleic acid; choosing how to sequence dna; and the cleaning of s sequences.

addressing the challenge of increasing nucleic acid yield

a critical challenge for scientists is to find methods that can increase biomass yield from cores. different scientists improvise using different techniques, introducing an important level of heterogeneity across scientists in their methods for producing the final datasets and research outputs characterizing the communities of microbes that are found in the deep subsurface. we have observed at least four methods in this single laboratory. in this subsection, we focus on two in particular.

adrian. adrian is a second-year phd student, whose background was in microbiology domains other than the deep subseafloor. he encountered the problem of low biomass during the early days of his doctorate. in the laboratory at the time was a new postdoctoral researcher, george. prior to joining the jones laboratory, george had completed a phd in which he investigated the microbial ecology of another environment in which there is very low biomass, and for which he learned a particular technique, called multiple displacement amplification (mda), to increase the dna yield. the various chemicals required to perform mda are commercially available in a single kit. adrian, who had developed a strong rapport with george, turned to george for assistance. as a result, adrian has become very conversant with the method of mda. in addition to using george’s expertise as a resource, adrian is also able to secure financial resources to purchase the kit because the jones laboratory is relatively well funded.
however, mda is not a perfect solution, in the sense that it does not amplify all sequences with equal probability, which can foreclose the possibility of some of the subsequent steps of the analysis being performed, in particular quantitative measurements of different types of bacteria, archaea, and fungi. nevertheless, both adrian and george agree that this trade-off is worthwhile because they only have access to limited quantities of physical samples for analysis. given that they have found a technique that works for them, they are reluctant to waste scarce physical samples by attempting to use other methods with which they may be unfamiliar.

jenny. jenny is a postdoctoral researcher who joined the jones laboratory following completion of a phd studying microbial ecology in another low biomass environment. jenny also has an academic background in chemistry and soil science. jenny does not use mda to increase nucleic acid yield. instead, she prefers a method that she developed in conjunction with her doctoral supervisor, as part of her doctoral research. when she encountered the challenge of how to increase nucleic acid yield during her doctoral research, jenny was able to draw on her expertise in soil science to adapt existing techniques from studying soil microbiology to studying seafloor sediments.

jenny was able to develop this method by drawing on a number of resources available to her at that time. her master’s degree in soil science meant that she was conversant with much of the soil science literature, and was thus able to discover the existence of the paper presenting this method. with the assistance of her supervisor’s expertise, jenny was able to grasp the potential application of this method to seafloor sediments, and was given the encouragement to do so. finally, jenny’s educational background in chemistry gave her expertise to draw on when developing and refining this method.
jenny continues to use this method, because she does not like using commercially available kits. this dislike is not simply personal taste: jenny finds that companies usually do not give sufficiently detailed information about the individual components of kits, limiting her ability to modify these kits. instead, she is able to adapt the methods she has developed to different contexts of use. her ability to do so is a direct result of her expertise gained through her academic background.

making decisions about sequencing

once nucleic acid yield has been increased, the scientists then undertake steps of pcr and cloning so that they can subsequently sequence the dna in the sample. a number of different options are available for outsourcing the production of sequence data, including private companies and other research institutions such as university laboratories or hospitals. the choice of which sequencing facility to use is generally up to the individual. an individual’s choice is influenced by the interplay of a number of technical, scientific, economic, and social factors.

two graduate students. diane and mike are both phd students in the jones laboratory, and frequently use the same company to sequence their samples. neither had a background in biological sciences prior to embarking on their phds. upon joining the jones laboratory, diane chose to use the same company that most other laboratory members were using, primarily because the laboratory already had an account set up with them and because, as a new member of the laboratory, she was reluctant to create additional administrative work for the laboratory manager. the laboratory manager orders chemicals and equipment on behalf of the scientists, and diane’s choice of sequencing facility can thus be seen as motivated by helping to ensure the laboratory manager would be willing to assist diane as she moved forward with her doctoral work.
in other words, the way diane uses resources available in her network not only helps her to accomplish the immediate task of sequencing, but also helps her to configure the network of resources available to her in anticipation of accomplishing future tasks.

mike, too, used this same company when he first joined the laboratory, on the advice of richard, a postdoctoral researcher in the laboratory. as his background was in chemical engineering, he was keen to follow the expertise of others in the laboratory. when mike first joined the laboratory, he approached the laboratory leader for assistance with many different technical issues, and the laboratory leader advised mike that richard would be able to help him. mike was able to access the laboratory leader’s social expertise regarding who in the laboratory possessed sufficient expertise to help mike. in turn, richard effectively became part of mike’s network, meaning mike was then able to access richard’s expertise.

jenny. jenny uses a different sequencing facility than the company used by mike and diane. her chosen facility is one that she started to use while a phd student. jenny looked to the expertise of others when making her initial decision to use this facility. however, during our interview with jenny, she also discussed the details of some different types of sequencing, and their scientific implications. jenny uses her personal scientific knowledge and expertise to evaluate her current choice of sequencing facility and the particular services that she requires of the facility. furthermore, jenny’s decisions about sequencing are also influenced by what is considered credible by the broader scientific community, i.e., the length of sequence that meets the standards of evidence required by this community. as with mike and diane (above), when jenny was completing her doctorate, she drew on the advice of other scientists in her network.
however, now that she has acquired more experience and knowledge, she is confident in making her own evaluations of the various types of sequences and sequencing facilities.

of particular note here is that jenny is drawing not only on her own scientific expertise to evaluate different options but also on her social expertise regarding what the broader scientific community regards as credible. it is those in this broader community who are reviewers of the journal articles that jenny writes, authors who may choose to cite jenny’s work in their own papers, possible future collaborators, potential future employers, or gatekeepers to future funding opportunities. in other words, jenny is showing her awareness of the importance of building and sustaining networks in this broader community that may provide access to future resources, in turn influencing her choice of sequencing services.

cleaning sequences

once sequencing has been completed, the scientist receives the sequences in a file from the sequencing facility. however, before they are able to perform analyses on these sequences, the scientist needs to perform a number of steps to clean and prepare the sequences. we observed a number of differing configurations of computational tools that were employed in the laboratory to perform these steps. two of these tools are presented here.

one tool is a piece of software called geneious [ ]. geneious has a graphical user interface that allows the user to inspect and manage sequences. for each sequence, it displays the confidence with which the sequencing facility was able to identify each nucleotide. the user can manually delete or change individual nucleotides, or they can automate geneious to remove all nucleotides or sequences falling below a certain confidence level. acquiring a license for geneious is expensive. the laboratory owns a license, and scientists usually access geneious using the laboratory computer on which it is hosted, or by logging in remotely.
geneious works on the apple interface only.

a second tool is mothur [ ], which is available freely and uses a command-line interface. mothur automates all stages of sequence management, from cleaning through to analysis and production of graphical and pictorial representations of results. mothur can handle very large numbers (tens of thousands or even higher) of sequences.

diane. diane, the graduate student encountered above in sect. . . , does not use geneious or mothur to clean sequences. she has spent a great deal of time living remotely from the laboratory, where she is not able to access the computer in the jones laboratory to use geneious. furthermore, diane is not able to access this computer remotely as she owns a pc with a windows interface, which is not compatible with geneious. the functionality of mothur for cleaning sequences was only added once diane had completed a substantial portion of cleaning her sequences, and diane judges that the acquisition of expertise in using mothur would take more effort than it would save in terms of cleaning sequences.

instead, diane has embarked on some of the tasks involved in cleaning sequences by hand. however, because diane performs these tasks at home, her husband has been able to see how time consuming some of these tasks are. her husband has a background in computer science and suggested he write a program to automate the removal of primers, which he subsequently has done.

the network of resources diane was able to access when she started processing sequences has determined the methods diane employs to accomplish the task of cleaning sequences. neither geneious nor mothur formed part of this network at that time. instead, diane’s only available resource was her ability to complete the tasks by hand, which became her approach by default. initially diane was not aware that her husband was in a position to help.
she only became aware that he would be able to after he suggested to her that he write a program: it was only at this point that her husband’s expertise became part of the network of resources accessible to her.

the network of resources available to diane has been dynamic over time as she becomes aware of her ability to access new resources. further, we can see that it is diane’s perception of her husband’s ability to help that has determined whether and when he is in her network of resources: he has possessed the technical ability all along to write a program, but it was not until he made diane aware of this ability that he became part of her network.

george. george, the postdoctoral researcher who is presented in sect. . . , performs most of the cleaning of sequences using geneious. however, he does not like to use the features of geneious that would allow him to automate all stages of the sequence-cleaning process, preferring instead to perform steps such as nucleotide-checking manually. in particular, george regards his approach as resulting in better quality sequences that may enable him to better identify species in the data, even though it is more time consuming.

george is able to access geneious through the apple computer in the laboratory and has an apple laptop computer, which means he is able to access the laboratory’s copy of geneious when he is working from home. he prefers to perform tasks, such as nucleotide-checking, manually to improve the quality of sequences, and is able to accomplish these tasks manually due to the affordances of geneious. working manually, in turn, better enables him to identify species within these sequences, which promises to secure him greater scientific credibility and recognition in the eyes of the broader scientific community, and thus promises to increase his ability to build the network of resources available to him in the future.
however, george has started to perform a type of sequencing known as tag sequencing, which has much higher throughput and results in datasets comprising tens of thousands (rather than hundreds) of sequences. to check each sequence manually would be intractable. instead, george has begun to use mothur. mothur was recommended to him by lee, a new doctoral student in the laboratory who organizes mothur tutorials at the university and is a source of advice on how to use mothur within the laboratory.

changes in the scale of datasets force george both to reconsider how he uses the resources he is able to access through his network and how to reconfigure his network to access other resources to complete the task of sequence cleaning. in particular, george was able to access lee’s expertise to learn how to use mothur, so mothur is now part of his network of resources.

discussion

in subsect. . , we present examples of heterogeneity observed in data production practices in our case study of c-debi, with scientists in the same laboratory and even working on adjacent benches using a diversity of approaches to accomplish similar tasks and produce datasets similar in form and intent. indeed, the heterogeneity presented above is only a fraction of the total heterogeneity that we have observed. for example, scientists perform analysis of sequences using a disparate range of tools and software including commercially available and open-source software, and sequence databases.

heterogeneity of data practices has long been regarded as a hallmark of little science [ ]. to date, this heterogeneity has been understood in terms of the types of dataset produced [ , ]. the analysis presented in subsect. . introduces heterogeneity along another dimension, namely scientists using a diversity of practices to produce datasets similar in purpose and form.
as discussed above, our cens research demonstrates that decisions made at each stage of the data life cycle have a cumulative effect on data [ ]. in the case of the subseafloor biosphere, decisions regarding the choice of methods can have a significant impact on the results of scientific analyses, with important implications for the reuse of datasets. for example, the quantification of global subseafloor biomass is foundational to the study of the subseafloor biosphere, and attempts to quantify this biomass involve aggregating datasets from a wide range of studies [ ]. however, a recent meta-analysis of studies of subseafloor life found that the method employed to quantify biomass can have a major impact on the results [ ], with significant implications for the quantification of global biomass.

furthermore, the ability of a scientist to assess the integrity and trustworthiness of a dataset tends to be enhanced when the scientist has greater knowledge about the factors, both social and technical, involved in the different stages of the dataset’s production and curation [ ]. our cens findings show that the production and use of multiple types of datasets significantly complicate these issues. different people from different disciplinary backgrounds are involved in the production of different datasets, using different methods for producing these data. the task of tracking, documenting, and maintaining access to all of the datasets in a single workflow becomes extremely complicated. adding another dimension of heterogeneity as described in subsect. . can only complicate this task further.

why does heterogeneity come about?

by focusing on individual scientists as the unit of analysis, we can understand how different scientists accomplish the same tasks in different ways. by viewing the process of accomplishing each task as a case of the scientist drawing on the sociotechnical networks of resources available to them at the time of carrying out the tasks, we are better able to account for this heterogeneity. some of the factors that shape these networks are discussed here.

disciplinary background. we find that differences in disciplinary background promote heterogeneity of data practices along two dimensions in particular. the first is that some scientists may be aware of the existence of a particular tool or method due to their background whereas other scientists may not. for example, both george and jenny employ techniques for increasing nucleic acid yield that they had learned or developed during their doctorates prior to joining the jones laboratory. the second dimension is that different scientists may be aware of the same tool or method, but each evaluates its usefulness differently according to their particular knowledge and experience. for instance, when choosing sequencing facilities, jenny considers some of their scientific advantages and disadvantages. mike, on the other hand, had little prior experience of biological research and so trusts the judgment of others.

career stage. we also find that differences in career stages, in particular issues of social status related to career stage, can drive differences in how scientists make choices about which methods to pursue. for instance, we see that diane’s choice of sequencing facility as a newly arrived phd student in the laboratory was influenced by her reluctance to cause additional burden for the laboratory manager. instead, diane’s priority was ensuring a good working environment in which to pursue her phd.
in the cases of both jenny and george, both more senior scientists pursuing postdoctoral positions, we see that a concern with producing scientific work that is recognized as credible and significant by the broader scientific community is critical in shaping their choices of certain methods. in the case of jenny, this concern impacted her decisions regarding sequencing. george chooses to eschew geneious’s ability to automate certain tasks involved in cleaning sequences to increase his chance of identifying novel species.

social networks within and without the laboratory. another factor contributing to heterogeneity of methods is that different scientists have access to different social networks inside of the laboratory and outside. for instance, adrian’s use of mda for increasing nucleic acid yield was learned from george. mike has been able to learn about which sequencing facility to use first by accessing the expertise of professor jones about who might have the expertise to help (i.e., richard) and then approaching richard. however, we also saw that diane has been able to access the expertise of her husband, outside of the laboratory and of her scientific domain, to write a program to assist her with sequence cleaning.

physical access to tools. although all members of the laboratory were in theory able to access all tools available at the time, circumstances mean that diane is not able to access the computer in the lab, and thereby geneious. furthermore, her possession of a windows laptop rather than an apple macintosh laptop means she is unable to access geneious remotely. as a result, her sequence cleaning, unusually, does not involve geneious.

shifting networks of resources. another feature of the networks of resources to which people have access is that these networks are not static, but dynamic.
over time, scientists may change the way in which they accomplish certain tasks, or different scientists may perform the same task differently if they joined the laboratory at different points in time. new tools or people being introduced to the laboratory can drive this dynamism. for instance, mothur became available to george once the sequence-cleaning functionality was added, and once lee joined the laboratory. however, the network of resources available to a particular scientist also depends on the scientist’s awareness of what resources exist. for instance, as mike was made aware of richard’s expertise on various scientific matters, mike then approached richard for advice on sequencing facilities. similarly, diane’s access to her husband’s expertise only occurred after her husband told her he would be able to help her.

heterogeneity as a permanent feature. the above discussion shows that the heterogeneity observed is not just happenstance but is instead a consequence of the interplay of multiple social, cultural, technical, and scientific factors. the dynamic nature of these factors suggests that the heterogeneity, and the challenges for data management that result, will be a persistent feature of this laboratory. for example, new personnel will continue to enter the laboratory from a variety of disciplinary backgrounds with new expertise or approaches that others in the laboratory may learn from. social networks, both within and without the laboratory, will continue to change, which will impact how knowledge about methods and tools will spread. the laboratory will acquire new tools and technologies that scientists may incorporate into their own workflows.
scientists will also move through career stages, from being a new doctoral student anxious not to cause disruption in their new laboratory through to a more senior doctoral student or postdoctoral scholar taking into account how the work they conduct will impact on their reputation in the broader academic community. in short, challenges for successful data management and curation that result from heterogeneity of practices are likely to remain during the course of c-debi and beyond.

implications of heterogeneity for assessing data integrity

a scientist’s understanding and knowledge of what was involved in producing a dataset can have a major impact on the extent to which they are able to assess the dataset’s integrity and trustworthiness. one example of where such knowledge can be useful is that the choice of technique for increasing nucleic acid yield can bias results (for example, the use of mda), thus impacting on subsequent stages of the data life cycle, for instance by foreclosing subsequent analyses. scientists who reuse such a dataset in their own work need to know not only the methods involved in producing the dataset but also that these methods render the dataset unsuitable for certain tasks. not knowing one or the other of these could have major scientific implications.

working out how to supply this knowledge is a complicated process: what granularity of details needs to be recorded? for instance, mda is a kit and the protocols for its use are available on the company’s website, which makes them easily accessible, whereas jenny’s method is contained within her publications and thus may be more difficult to locate. on the other hand, as discussed by jenny, commercially available kits are often opaque about their precise components whereas she knows the types and quantities of chemicals she used and is willing and able to supply these details when asked.
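one way to make the question of granularity concrete is to write down what a provenance record for a single dataset might contain. the python sketch below is purely illustrative: the classes `MethodStep` and `ProvenanceRecord`, their field names, and the example identifier are assumptions introduced here, not a schema used by c-debi. the example values are drawn from the mda case described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MethodStep:
    """one workflow step and how it was carried out (hypothetical schema)."""
    step: str                 # which part of the workflow this describes
    method: str               # technique or kit used
    protocol_source: str      # where the protocol can be found
    components_known: bool    # are the reagents and quantities documented?
    unsuitable_for: List[str] = field(default_factory=list)


@dataclass
class ProvenanceRecord:
    """provenance for one dataset, as an ordered list of method steps."""
    dataset_id: str
    steps: List[MethodStep] = field(default_factory=list)

    def unsuitable_uses(self) -> List[str]:
        """collect, without duplicates, the analyses that the recorded
        methods rule out for anyone reusing the dataset."""
        uses: List[str] = []
        for step in self.steps:
            for use in step.unsuitable_for:
                if use not in uses:
                    uses.append(use)
        return uses


record = ProvenanceRecord(
    dataset_id="example-dataset",  # made-up identifier for illustration
    steps=[
        MethodStep(
            step="increasing nucleic acid yield",
            method="mda (commercial kit)",
            protocol_source="company website",
            components_known=False,
            unsuitable_for=[
                "quantitative measurements of bacteria, archaea, and fungi"
            ],
        )
    ],
)
print(record.unsuitable_uses())
```

even this small record shows the trade-off discussed above: the kit's protocol is easy to locate but its components are not documented, and a reuser learns from `unsuitable_for` which analyses the method forecloses.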
the heterogeneity of methods significantly complicates the task of effective data management and curation for multiple purposes, such as checking and verifying scientific analyses and potential future reuse of datasets. at the same time, however, the heterogeneity observed in the laboratory underlines, and indeed makes more critical, the importance of curating not just datasets themselves but also information about their provenance (e.g., methods used in producing these, how these methods have been derived and adapted, and implications of the particular methods used for possible future uses of the datasets).

conclusions

scientists across a wide range of scientific disciplines are being confronted by the challenges of managing volumes of data increasing in both scale and diversity. to exploit the potential of digital scholarship to its fullest, it is vital to study existing data practices to inform the development of research infrastructures.

the knowledge infrastructures project presented in this paper has already made significant progress toward understanding data practices, and will continue to fill existing gaps in the literature. to date, studies have focused on heterogeneity in terms of the types of datasets being produced. in this paper, however, we find that even within a single workflow in a single laboratory, the practices, methods, and techniques used in the production of datasets can be highly heterogeneous, with many implications for data storage, curation, integration with other sources of data, and potential data sharing and reuse.

future work in the knowledge infrastructures project

the empirical research presented in this paper provides a starting point for other themes that we are investigating. the heterogeneity of methods discussed above simultaneously makes more critical and more difficult different components of data management practice.
here, we briefly outline findings that will be covered in greater depth in future publications.

recordkeeping in the laboratory

the first stage in effective data management is to ensure that records are kept about the production of data. records kept at the sites of data production by those who produced the data can play an important role in generating effective metadata and in establishing provenance. in the case of the data workflow presented above, it is important to capture the heterogeneity of methods. scientists record details of their methods in laboratory notebooks. we have found that the notebook practices of scientists within the laboratory vary substantially in terms of the granularity of detail and the types of detail recorded. this variety is related to a number of sociotechnical factors, including scientists’ disciplinary backgrounds and training received during undergraduate degrees, career stage, and personal preferences regarding detail. in other words, while the heterogeneity of methods makes effective recordkeeping more important, the very factors that drive this heterogeneity also contribute to the heterogeneity of practices in recordkeeping.

storage and curation of laboratory-generated data

we are also finding that there are many differences across scientists in terms of the fate of laboratory-generated datasets, with many different points of data loss. there are multiple sociotechnical factors involved. for instance, the journals in which c-debi-funded scientists publish mandate that biological datasets supporting the arguments of articles must be deposited to external databases; conversely, there currently is no such requirement for physical science data. thus, the data currently deposited in online repositories represents only a fraction of all data generated in the laboratory.
furthermore, data may be lost when scientists leave the laboratory and take their own computers, hard drives, memory sticks, and other backup media with them. if they move into another domain of microbiology, or even leave the field or academia altogether, it becomes even more difficult for others to track and discover the scientists’ data. the short-term nature of much of c-debi’s funding contributes to this occurrence.

where does data get shared?

apart from the data that are deposited in databases, there appears to be little data shared within the collaboration. however, there are two particular circumstances in which data sharing has been observed within c-debi. the first is where a scientist discovers the existence of another’s dataset through reading this latter scientist’s paper. we are currently charting the processes by which this dataset might be eventually shared, from the initial steps of discovery, through negotiation (for instance, regarding crediting the dataset’s originator), and eventual integration with other datasets. we are identifying many sources of friction (social, technical, and scientific) in these processes.

the other instances of data sharing observed in this paper are the result of serendipitous encounters between scientists from different institutions. these instances are infrequent. in particular, the sharing of data between researchers in different institutions or disciplines is a rare and fragile accomplishment that involves the alignment of multiple factors, including high levels of trust between researchers, alignment of researchers’ interests, and opportunism in exploiting possibilities afforded by infrastructures.

why is sharing data important?

through our studies of c-debi, we are also developing a richer understanding of why data sharing can be important and beneficial to science, which will extend existing rationales for data sharing [ ].
the data generated in the jones laboratory, and the c-debi collaboration more generally, are often expensive and difficult to obtain. furthermore, they are also made scarcer due to the relative novelty of the domain of study. thus, sharing data promises many economic benefits for this domain.

furthermore, the loss of datasets, and of information about workflows, can have significant implications for the ability to reproduce and validate the analyses of others. the ability to validate and reproduce such analyses is valuable in the context of c-debi: as outlined above, scientists bring many different approaches to such an analysis. it could be very useful for others to reproduce analyses that are based on unfamiliar or new methods and tools, in order to test the reliability of novel methods.

big science meets little science: iodp cruises and laboratory practices

case studies of the challenges and efforts in building knowledge infrastructures to support scientific work tend to follow the big/little science dichotomy, generally characterizing data life cycles as unfolding entirely in one context or the other. at first sight, c-debi seems to exemplify little science. however, the c-debi data life cycle unfolds across both big and little science contexts. for example, the life cycle starts with the collection of physical samples on board scientific ocean drilling cruises, typically large-scale collaborations with expensive infrastructure and a large budget. cruise samples and data are collected, managed, and made publicly accessible according to established standards. c-debi is an excellent opportunity to study how big and little science contexts shape each other, via the flow of individuals, physical samples, data, and data practices.

acknowledgments

the work in this paper has been supported by the sloan foundation award # , the transformation of knowledge, culture and practice in data-driven science: a knowledge infrastructures perspective.
we also acknowledge the contributions of milena golshan, irene pasquetto, and laura a. wynholds for commenting on drafts of this paper, and elaine levia for technical and administrative support.

references

altman, m.: digital preservation through archival collaboration: the data preservation alliance for the social sciences. am. arch. ( ), – ( )
anderson, c.: the long tail. wired mag., ( ) ( , october). http://www.wired.com/wired/archive/ . /tail_pr.html
aronova, e., baker, k.s., oreskes, n.: big science and big data in biology: from the international geophysical year through the international biological program to the long term ecological research (lter) network, -present. hist. stud. nat. sci. ( ), – ( ). doi: . /hsns. . . .
association of research libraries: the research library’s role in digital repository services: final report of the arl digital repository issues task force. association of research libraries, washington, dc ( b). www.arl.org/bm~doc/repository-services-report.pdf
bechhofer, s., ainsworth, j., bhagat, j., buchan, i., couch, p., cruickshank, d., sufi, s.: why linked data is not enough for scientists. in: sixth ieee e-science conference. brisbane, australia ( ). http://eprints.ecs.soton.ac.uk/ /
berman, f., lavoie, b., ayris, p., choudhury, g. s., cohen, e., courant, p., van camp, a.: sustaining the digital investment: issues and challenges of economically sustainable digital preservation (interim report of the blue ribbon task force on sustainable digital preservation and access). san diego ( ). http://brtf.sdsc.edu/publications.html
bijker, w.e., hughes, t.p., pinch, t.j.: the social construction of technological systems: new directions in the sociology and history of technology. mit press, cambridge ( )
borgman, c. l.: big data, little data, no data: scholarship in the networked world. mit press, cambridge, ma ( )
borgman, c. l.: the premise and promise of the global information infrastructure. first monday, ( ). http://www.firstmonday.dk/issues/issue _ /borgman/index.html
borgman, c.l.: scholarship in the digital age: information, infrastructure, and the internet. mit press, cambridge ( )
borgman, c.l.: the conundrum of sharing research data. j. am. soc. inf. sci. technol. ( ), – ( ). doi: . /asi.
borgman, c.l., wallis, j.c.: building digital libraries for scientific data: an exploratory study of data practices in habitat ecology. in: gonzalo, j., thanos, c., verdejo, m.f., carrasco, r.c. (eds.) proceedings of the th european conference on research and advanced technology for digital libraries, pp. – . springer, berlin, heidelberg, alicante, spain ( )
borgman, c.l., wallis, j.c., enyedy, n.d.: little science confronts the data deluge: habitat ecology, embedded sensor networks, and digital libraries. int. j. digit. libr. ( – ), – ( ). doi: . /s - - -
borgman, c.l., wallis, j.c., mayernik, m.s.: who’s got the data? interdependencies in science and technology collaborations. comput. support. coop. work ( ), – ( ). doi: . /s - - -z
bozeman, b., fay, d., slade, c.p.: research collaboration in universities and academic entrepreneurship: the-state-of-the-art. j. technol. transf. ( ), – ( ). doi: . /s - - -
callon, m.: the sociology of an actor–network: the case of the electric vehicle. in: mapping the dynamics of science and technology: sociology of science in the real world, pp. – . macmillan, london ( )
center for dark energy biosphere investigations: center for dark energy biosphere investigations stc annual report ( ). http://www.darkenergybiosphere.org/internal/docs/ c-debi-annual-report- .pdf
codata-icsti task group on data citation standards practices: out of cite, out of mind: the current state of practice, policy, and technology for the citation of data. data sci. j., , cidcr – cidcr ( ). doi: . /dsj.osom -
data’s shameful neglect. nature, ( ), ( ). doi: . / a
dealing with data. science, ( ), – ( )
deuten, j. j.: cosmopolitanising technologies: a study of four emerging technological regimes. twente university press, enschede ( ). http://doc.utwente.nl/ / /t .pdf
edwards, k.: center for dark energy biosphere investigations (c-debi): a center for resolving the extent, function, dynamics and implications of the subseafloor biosphere ( ). http://www.darkenergybiosphere.org/internal/docs/ c-debi_fullproposal.pdf
edwards, p.n.: a vast machine: computer models, climate data, and the politics of global warming. mit press, cambridge, ma ( )
edwards, p. n., jackson, s. j., bowker, g. c., knobel, c. p.: understanding infrastructure: dynamics, tensions, and design: report of a workshop on history and theory of infrastructure, lessons for new scientific cyberinfrastructures. national science foundation, washington, dc ( ). http://hdl.handle.net/ . /
edwards, p. n., jackson, s. j., chalmers, m. k., bowker, g. c., borgman, c. l., ribes, d., calvert, s.: knowledge infrastructures: intellectual frameworks and research challenges (p. ). university of michigan, ann arbor, mi ( ). http://deepblue.lib.umich.edu/handle/ . /
faniel, i.m., jacobsen, t.e.: reusing scientific data: how earthquake engineering researchers assess the reusability of colleagues’ data. j. comput. support. coop. work ( – ), – ( ). doi: . /s - - -
glaser, b.g., strauss, a.l.: the discovery of grounded theory: strategies for qualitative research. aldine pub. co, chicago ( )
hammersley, m., atkinson, p.: ethnography: principles in practice. routledge, london ( )
helland, p.: if you have too much data, then “good enough” is good enough. commun. acm , – ( ). doi: . / .
hey, a.j.g., trefethen, a.: the data deluge: an e-science perspective. in: berman, f., fox, g., hey, a.j.g. (eds.) grid computing: making the global infrastructure a reality, pp. – . wiley, west sussex, england ( ). http://www.rcuk.ac.uk/escience/documents/report_datadeluge.pdf
hine, c.: connective ethnography for the exploration of e-science. j. comput. media. commun. ( ), – ( ). doi: . /j. - . . .x
hughes, t.p.: technological momentum. in: smith, m.r., marx, l. (eds.) does technology drive history? the dilemma of technological determinism, pp. – . mit press, cambridge, ma ( )
kallmeyer, j., pockalny, r., adhikari, r.r., smith, d.c., d’hondt, s.: global distribution of microbial abundance and biomass in subseafloor sediment. proc. natl. acad. sci. ( ), – ( )
karasti, h., baker, k.s., millerand, f.: infrastructure time: long-term matters in collaborative development. comput. support. coop. work (cscw) ( – ), – ( ). doi: . /s - - -z
kearse, m., moir, r., wilson, a., stones-havas, s., cheung, m., sturrock, s., et al.: geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. bioinformatics ( ), – ( )
knorr-cetina, k.: the manufacture of knowledge. pergamon press, oxford ( ). http://sites.google.com/site/sciencestudies /reader/knorr-cetina_manknow-chapter .doc
knorr-cetina, k.: epistemic cultures: how the sciences make knowledge. harvard university press, cambridge ( )
latour, b.: science in action: how to follow scientists and engineers through society. harvard university press, cambridge ( )
latour, b., woolgar, s.: laboratory life: the construction of scientific facts, nd edn. princeton university press, princeton ( )
lloyd, k.g., may, m.k., kevorkian, r.t., steen, a.d.: meta-analysis of quantification methods shows that archaea and bacteria have similar abundances in the subseafloor. appl. environ. microbiol. ( ), – ( )
lynch, m.: art and artifact in laboratory science: a study of shop work and shop talk in a research laboratory. routledge & kegan paul, london ( )
meng, x.-l.: multi-party inference and uncongeniality. in: lovric, m. (ed.), international encyclopedia of statistical science, pp. – . springer, berlin heidelberg ( ). http://link.springer.com/referenceworkentry/ . / - - - - _
o’donoghue, t., punch, k.: qualitative educational research in action: doing and reflecting. routledge, london ( )
office of science and technology policy: harnessing the power of digital data for science and society: report of the interagency working group on digital data to the committee on science of the national science and technology council. washington, d.c. ( ). http://www.nitrd.gov/about/harnessing_power.aspx
Østerlund, c., carlile, p.: relations in practice: sorting through practice theories on knowledge sharing in complex organizations. inf. soc. ( ), – ( )
palmer, c.l., cragin, m.h., heidorn, p.b., smith, l.c.: data curation for the long tail of science: the case of environmental studies. in: presented at the rd international digital curation conference, washington, dc ( ). https://apps.lis.uiuc.edu/wiki/download/attachments/ /palmer_dcc .rtf?version=
ribes, d., bowker, g.c.: between meaning and machine: learning to represent the knowledge of communities. inf. org. ( ), – ( ). doi: . /j.infoandorg. . .
schloss, p.d., westcott, s.l., ryabin, t., hall, j.r., hartmann, m., hollister, e.b., et al.: introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. appl. environ. microbiol. ( ), – ( )
star, s.l., ruhleder, k.: steps toward an ecology of infrastructure: design and access for large information spaces. inf. syst. res. ( ), – ( ). doi: . /isre. . .
traweek, s.: beamtimes and lifetimes: the world of high energy physicists ( st harvard university press pbk.). harvard university press, cambridge ( )
uhlir, p. f. (ed.): for attribution-developing data attribution and citation practices and standards: summary of an international workshop. the national academies press, washington, d.c. ( ). http://www.nap.edu/catalog.php?record_id=
wallis, j.c., borgman, c.l.: who is responsible for data? an exploratory study of data authorship, ownership, and responsibility. in: annual meeting of the american society for information science and technology (vol. , pp. – ). new orleans, la. information ( ). doi: . /meet. .
wallis, j.c., borgman, c.l., mayernik, m.s., pepe, a.: moving archival practices upstream: an exploration of the life cycle of ecological sensing data in collaborative field research. int. j. digital curation ( ), – ( ). doi: . /ijdc.v i .
wallis, j.c., borgman, c.l., mayernik, m.s., pepe, a., ramanathan, n., hansen, m. a.: know thy sensor: trust, data quality, and data integrity in scientific digital libraries. in: proceedings of the th european conference on research and advanced technology for digital libraries, vol. lincs , pp. – . springer, budapest, hungary: berlin ( ). doi: . / - - - - _
wallis, j.c., rolando, e., borgman, c.l.: if we share data, will anyone use them? data sharing and reuse in the long tail of science and technology. plos one ( ), e ( ). doi: . /journal.pone.
cronfa - swansea university open access repository

this is an author produced version of a paper published in: digital scholarship in the humanities

cronfa url for this paper: http://cronfa.swan.ac.uk/record/cronfa

paper: cheesman, t., flanagan, k., thiel, s., rybicki, j., laramee, r., hope, j. & roos, a. ( ). multi-retranslation corpora: visibility, variation, value and virtue. digital scholarship in the humanities. http://dx.doi.org/ . /llc/fqw

this article is brought to you by swansea university. any person downloading material is agreeing to abide by the terms of the repository licence. authors are personally responsible for adhering to publisher restrictions or conditions. when uploading content they are required to comply with their publisher agreement and the sherpa romeo database to judge whether or not it is copyright safe to add this version of the paper to this repository. http://www.swansea.ac.uk/iss/researchsupport/cronfa-support/

multi-retranslation corpora: visibility, variation, value, and virtue

tom cheesman, department of languages, swansea university, uk
kevin flanagan, department of languages, swansea university, uk and sdl research, bristol
stephan thiel, bauhaus university weimar, germany and studio nand, berlin
jan rybicki, institute of english studies, jagiellonian university, krakow
robert s. laramee, department of computer science, swansea university, uk
jonathan hope, department of english, strathclyde university, uk
avraham roos, amsterdam school of culture and history, university of amsterdam, the netherlands

correspondence: tom cheesman, department of languages, swansea university, sa pp, uk. e-mail: t.cheesman@swansea.ac.uk

digital scholarship in the humanities © the author . published by oxford university press on behalf of eadh. this is an open access article distributed under the terms of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. doi: . /llc/fqw. advance access published august ,

abstract

variation among human translations is usually invisible, little understood, and under-valued. previous statistical research finds that translations vary most where the source items are most semantically significant or express most ‘attitude’ (affect, evaluation, ideology). understanding how and why translations vary is important for translator training and translation quality assessment, for cultural research, and for machine translation development. our experimental project began with the intuition that quantitative variation in a corpus of historical retranslations might be used to project quasi-qualitative annotations onto the translated text. we present a web-based system which enables users to create parallel, segment-aligned multi-version corpora, and provides visual interfaces for exploring multiple translations, with their variation projected onto a base text. the system can support any corpus of variant versions. we report experiments using our tools (and stylometric analysis) to investigate a corpus of forty german versions of a work by shakespeare. initial findings lead to more questions than answers.

introduction

our project began with a simple observation and an intuition. the observation: in any set of multiple translations in a given language, variation among them varies through the course of the text. some text units or chunks (at any level from word, say, up to chapter or character part in a play) are more variously translated than others. the intuition: this variation can be used to project an annotation onto the translated text, indicating where and how the extent of translation variation varies. this is the essence of our online system. it uses a ‘translation array’ (a parallel multi-translation corpus, aligned to a ‘base text’ of the translated work) to achieve ‘version variation visualization’. here, ‘version’ encompasses any text which can be at least partly aligned with others. but the website strapline is: ‘explore great works with their world-wide translations’.

if multiple translations of a work exist, then the work is enduringly popular and/or prestigious, canonical or classic, in the translating culture: typically ‘great works’ of scripture, literature, philosophy, etc. interest in comparing such works’ multiple translations is surprisingly limited. some large aligned retranslations corpora are publicly accessible online (works of scripture), but user access is limited to two parallel texts, and no analytic tools are provided. no similar resources exist for any secular works at all, yet. this reflects the notorious ‘invisibility’ of translators and translations in general (venuti, ).
a key aim of our project is to make them visible.

retranslations are successive translations of the ‘same’ source work, often somehow dependent on precursor (re)translations. the source works concerned are mostly unstable texts in their original language: what translators translate varies and changes. and so does how they do it. the gamut runs from word-for-word renderings to very free adaptations or rewritings with little obvious relation to the source. relay translation (via a third language) introduces further variation. if translations are reprinted or otherwise re-used, they tend to be changed again. venuti ( ) argues that retranslations (more than most translations) ‘create value’ in the target culture. a first translation of a foreign work creates awareness of it. if retranslations follow, the work becomes assimilated to the target culture. if retranslations multiply, each both reinforces the value and status of the work in the target culture, and extends the range of competing interpretations surrounding it. retranslations therefore throw up questions going well beyond linguistic and cultural transfer, concerning ‘the values and institutions of the translating culture’, and how these are defended, challenged, or changed (venuti, , p. ).

within translation studies, ‘retranslation studies’ is underdeveloped, despite its fundamental importance for translation, linguistics, and communication, as well as comparative, transnational cultural studies. as munday ( ) argues, retranslations are important resources, because no single utterance or text exists in isolation from alternative forms it might have taken. any extant text is surrounded by a ‘penumbra’ of ‘unselected forms’ (munday, , p. , citing grant, , pp. – ); so any translation is surrounded by ‘shadow translations’ (johannson, , p. , citing matthiessen, , p. ).
sets of translations by different translators (or the same translators at different moments) make visible at least some otherwise unselected forms. this offers scope for studying ‘the value orientations that underlie these selections’ (munday, , p. ). our project seeks to go even further: from the how and why of variation among translations, back to the varying capacity of the translated text to provoke variation.

the article is organized as follows: section reviews related work, including statistical studies in translation variation. section presents our software project, covering our aligner, corpus overviews (including stylometric analysis), and our key innovation: an interface deploying ‘eddy and viv’ algorithms to explore translation variation. section presents findings of experiments using the software. section offers concluding comments.

related work

there has been little digital work on larger retranslations corpora, involving works of wide intrinsic interest, and none designed to facilitate access to multiple translations, and the translated work, together with algorithmic analyses. jänicke et al. ( ) take in some ways a similar approach, but their ‘traviz’ interface offers a very different mode of text visualization, is monolingual (shows no translated text), and works best with more limited variation and shorter texts (see section . ). lapshinova-koltunski ( ) describes a parallel multi-translation corpus designed to support computational linguistic analyses of differences between professional translations, student translations, machine translation (mt) outputs, and edited mt outputs. shei and pain ( ) proposed a similar parallel corpus, with an interface designed for translator training.
these projects only offer access to filtered segments of the text corpus, and do not envisage exploring variation among retranslations.

altintas, can, and patton ( ) used two time-separated (c. , c. ) collections of published translations of the same seven english, french, or russian literary classics into turkish, to quantify aspects of language change. this raises the question whether such translations ‘represent’ their language. corpus-based translation studies (baker, ; kruger et al., ) has established that translated language differs from untranslated language. we also know from decades of work in descriptive translation studies (morini, ; toury, ) that retranslations vary for complex genre-, market-, subculture-specific and institutional factors, and individual psychosocial factors, involving the translators and others with a hand in the work (commissioners, editors), and their uses of resources including source versions and prior (re)translations. there is no consensus on defining such factors and their interrelations. the conclusion of a manual analysis of eight english versions of zola’s novel nana is typically vague:

(. . .) specific conditions (. . .) explain the similarities and differences (. . .). the conditions comprise broad social forces: changing ideologies and changing linguistic, literary, and translational norms; as well as more specific situational conditions: the particular context of production and the translator’s preferences, idiosyncrasies, and choices. (brownlie, , p. )

the basic lesson is that translation is a humanities subject. translators are writers. as baker warns:

identifying linguistic habits and stylistic patterns is not an end in itself: it is only worthwhile if it tells us something about the cultural and ideological positioning of the translator, or of translators in general, or about the cognitive processes and mechanisms that contribute to shaping our translational behaviour.
we need then to think of the potential motivation for the stylistic patterns that might emerge from this type of study. (baker, , p. )

her comment is cited by li et al. ( , p. ), in their computationally assisted study of two english translations of xueqin cao’s hongloumeng. they conclude:

corpus-assisted translation research can go beyond proving the obvious or the already known as long as meta- or para-texts are available for the analysis. the extent and depth of such analysis of course depends on the amount of information available in the form of meta- or other texts. (li et al., , p. )

genuine understanding of cultural materials requires knowledge and critical understanding of many other materials, to assess how multi-scale human factors shape texts and the effects they have (had) in their cultural world.

non-digital studies in retranslation underline the importance of such shaping factors. deane-cox ( ) and o’driscoll ( ) both recently investigated large sets of english retranslations of th-century french novels. they detail at length the historical contexts of each retranslation, its production and reception, and analyse short samples linguistically or stylistically. deane-cox’s overall argument disproves the ‘retranslation hypothesis’ put forward by antoine berman ( , p. ). berman argued that over time, successive retranslations should tend to translate the source text more accurately. in fact—as we will see—this may hold for a first few retranslations, but when they multiply, the hypothesis no longer holds. this is partly because retranslators who come late in a series must be more inventive, to distinguish their work from that of precursors and rivals. the desire for distinction is a great motivator (mathijssen, ; hanna, ).
critical translation studies pays close attention to such specific contextual factors, viewing each translation as an act of intervention in a particular moment in a particular place in the geographical and social world, and a trace of a translator’s (and associated agents’) both conscious and unconscious choices (munday, , p. ). as munday argues, translation is essentially an evaluative act. translator’s decisions are based on evaluations of the source text, of the implicit values of its author and intended audience, and of the expectations and values of the intended audience of the translation.

statistical studies

statistical studies of differences between translations confirm this perspective, and also rain on the mt parade. they show that variation is greatest both in the most semantically significant units of a text, and in the units which are most expressive of values and affect.

babych and hartley ( ) measured the stability of alternative translations at word and phrase level in english versions of french news stories by two professional translators. they found a strong statistical correlation between instability and the scores of linguistic items in the source text for salience (tf.idf score) or significance (s-score; see babych et al., ). the more important an item is for a text’s meaning, the less translators tend to agree about translating it (though each one is consistent in using their selected terms). babych and hartley deduce that ‘highly significant units typically do not have ready translation solutions and require some “artistic creativity” on the part of translators’, and that this necessary ‘freedom’ makes translation fundamentally ‘“non-computable” or “non-algorithmic”’ (babych and hartley, , p. , citing penrose, ).
they conclude that there are:

fundamental limits on using data-driven approaches to mt, since the proper translation for the most important units in a text may not be present in the corpus of available translations. discovering the necessary translation equivalent might involve a degree of inventiveness and genuine intelligence. (babych and hartley, , p. )

munday ( , pp. – ) studied seventeen english translations of an extract from a story by jorge luis borges: two published translations and fifteen commissioned from advanced trainee translators. four in five lexical units varied. invariance was associated with ‘simple, basic, experiential or denotational processes, participants and relations’ (p. ). variation mainly occurred in ‘lexical expression of attitude’, i.e. affect/emotion, judgment/ethics, appreciation, or evaluation (p. ). variation was greatest at ‘critical points’, where ‘attitude-rich’ words and phrases ‘carry the attitudinal burden of the text’ and communicate ‘the central axiological values of the protagonists, narrator or writer’ (p. )—again, in effect, the semantically most significant items.

translations vary most at points of greatest semantic and evaluative/attitudinal salience. mt has a long way to go, then. its problems include identifying attitude, affect, or evaluation in a text to be translated. in a chapter on mt and pragmatics, farwell and helmreich ( ) discuss lexical and syntactic differences in spanish newswire articles translated into english by two professional translators: % of units differed, and % of differences could be attributed to the translators’ different ‘assumptions about the world’ (rather than assumption-neutral paraphrasing, or error). one example is this headline:

acumulación de víveres por anuncios sísmicos en chile

translation : hoarding caused by earthquake predictions in chile

translation : stockpiling of provisions because of predicted earthquakes in chile (farwell and helmreich, , p. )

the translations make vastly different ideological, political, evaluative assumptions. ‘hoarding’ suggests a panicky, irrational population, responding to rumours of an unlikely event. ‘stockpiling’ (by the population, or the civil authorities?) is a prudent response to credible (scientific experts’?) warnings. it is impossible—without ‘meta- or para-texts’—to disentangle whether the translators impute different values to the mind of the source text creator, or to its intended readers, or to the anticipated readers of the target text, and/or whether they express their own psychological and ideological values. ‘acumulación’, here, has major evaluative implications which could not be predicted without area-specific political and economic expertise. perhaps a multi-retranslation corpus could be used to discover which items provoke variation, as a proxy for such knowledge? if not, what would it discover?

project description

a multi-retranslation corpus will contain versions of various kinds; complete, fragmentary, edited, adapted versions; versions derived from (a version of) the original-language translated work, or from intermediaries in the translating language, and/or other languages; versions in various media; for various audiences (popular, scholarly, restricted); in mono-, bi-, or plurilingual formats; from various periods and places; produced and received under various economic, political, institutional, and cultural-linguistic conditions. an obvious lay question is: which one is best? but the problem is already clear: by what criteria, or whose, do we judge?
models for assessing professional translations (house, ) are predicated on full and precise rendering of the source, but work less well with creative genres, where such ‘fidelity’ is often subordinated to effect in the target culture. retranslations of poetry, plays, novels, religious, or philosophical works can be very successful (i.e. ‘good’, for many people) without being at all complete or accurate. a related question is: why do most retranslations have brief lives (just one publication, or media or performance use), while others—backed by some institutional authority—become canonical, and have many editions, revisions, and re-uses, over generations? does the answer lie in linguistic, textual qualities of the translation, measured in terms of its relation to the original work? or in some qualities of it, measured in relation to alternative versions or other target culture corpora? or does it lie solely in institutional factors?

our project does not comprehensively address these questions. it grew out of a particular piece of translation criticism, and the intuition that digital tools could be developed to explore patterns in variation among multiple (re)translations, in themselves, in relation to target cultural contexts, and in relation to the translated work. before knowing any of the above-mentioned studies, cheesman wanted to find ways to compare a large collection of german translations and adaptations of shakespeare’s play, the tragedy of othello, the moor of venice (see corpus overviews in section . below). his interest was as a researcher in german and comparative literature and culture. he had worked on a recent, controversial version of othello (cheesman, ), and wondered how it related to others. he manually examined over thirty translations ( – ) of a very small sample: a fourteen-word rhyming couplet, a ‘critical moment’ which is rich in affect, evaluation, and ambiguity (cheesman, ).
his study showed how differences among the translations traced a -year-long conversation about human issues in the work—gender, race, class, political power, interpersonal power, and ethics. could digital tools help to explore such questions and communicate their interest to a wider public? the couplet he had selected was clearly more variously translated than most passages in the play. so he wondered if we could devise an algorithmic analysis which would identify all the most variously translated passages, to steer further research.

a proof-of-concept toolset (‘translation array prototype’) was built, using as test data a corpus of thirty-eight hand-curated digital texts of german translations and adaptations of part of the play: othello, act , scene . this is about , continuous words of the play’s , , in english: lines and speeches (in neill’s edition). the restricted sample size was due to restricted resources for curating transcriptions, and translation copyright limitations. versions were procured from libraries, second-hand book-sellers, and theatre publishers (who distribute texts not available through the book trade). digital transcription stripped out original formatting and paratexts (prefaces, notes, etc). the transcriptions were minimally annotated, marking up speech prefixes, speeches, and stage directions.
the brief for the programmers (flanagan and thiel) was to build visual web interfaces enabling the user to: align a set of versions with a base text and so create a parallel multi-version corpus; obtain overviews of corpus metadata and aligned text data; navigate parallel text displays; apply an algorithmic analysis to explore the differing extent to which base text segments provoke variation among translations; customize this analysis and create various forms of data output to support cultural analyses.

aligner

an electronic shakespeare text was manually collated with a recent edition, to give us a base text inclusive of historic variants. then we needed to align it segmentally with the versions. existing open tools for working with text variants (e.g. juxta collation software) lack necessary functionality; so do existing computer-assisted translation tools; perhaps such software could be adapted; at any rate we built a web-based tool from scratch. the developer, flanagan, explains its two main components:

ebla: stores documents, configuration details, segment and alignment information, calculates variation statistics, and renders documents with segment/variation information.

prism: provides a web-based interface for uploading, segmenting and aligning documents, then visualizing document relationships.

areas of interest in a document are demarcated using segments, which also can be nested or overlapped. each segment can have an arbitrary number of attributes. for a play these might be ‘type’ (with values such as ‘speech’, ‘stage direction’), or ‘speaker’ (with values such as ‘othello’, ‘desdemona’), and so on. (flanagan in: cheesman et al., )

hand- or machine-made attributes such as ‘irony’, ‘variant from source x’, ‘crux’, ‘body part y’, ‘affect z’, ‘syllogism’, ‘trochee’, and ‘enjambement’ are equally possible. but all would require time-consuming tagging. in fact, we have worked only with ‘type: speech’.
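The segment model described above can be sketched in a few lines. This is only an illustration, not the Ebla implementation: the class name `Segment`, the helper `select`, the half-open character-range convention, and the toy document string are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A half-open character range [start, end) in one document, plus free-form attributes."""
    start: int
    end: int
    attributes: dict = field(default_factory=dict)

    def text(self, document: str) -> str:
        # Recover the segment's text from the document it indexes into.
        return document[self.start:self.end]

    def contains(self, other: "Segment") -> bool:
        # True when `other` is nested inside this segment.
        return self.start <= other.start and other.end <= self.end

    def overlaps(self, other: "Segment") -> bool:
        # True when the two ranges share at least one character.
        return self.start < other.end and other.start < self.end


def select(segments, **wanted):
    """Return the segments whose attributes match every key=value pair given."""
    return [s for s in segments
            if all(s.attributes.get(k) == v for k, v in wanted.items())]


# Toy document, invented for this example (hypothetical, not a corpus text).
doc = "IAGO Virtue? A fig! [Exit]"
prefix = Segment(0, 4, {"type": "speech prefix"})
speech = Segment(5, 19, {"type": "speech", "speaker": "iago"})
crux = Segment(15, 18, {"crux": True})  # nested inside the speech
direction = Segment(20, 26, {"type": "stage direction"})
segments = [prefix, speech, crux, direction]

speeches = select(segments, type="speech")
```

Keeping attributes in a plain dictionary mirrors the "arbitrary number of attributes" in the description; a real system would presumably also need to shift the stored ranges when a transcription is corrected.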
segment positions are stored as character offsets within documents, and texts can be edited without losing this information (transcription errors keep being discovered). segmented documents are aligned in an interactive wysiwyg tool, where an ‘auto-align’ function aligns all the next segments of specified attribute. for othello, every speech prefix, speech and ‘other’ string is automatically pre-defined as a segment of that type. any string of typographic characters in a speech can be manually defined as a segment and aligned.

thiel and colleagues at studio nand built visual interfaces on top of prism, including parallel-text views tailor-made for dramatic texts (base text and any translation), and the ‘eddy and viv’ view discussed below (section ). thiel ( b) documents the design process. he also sketched a scalable, zoomable multi-parallel view of base text and all aligned versions, an overview model which remains to be developed as an interface for combined reading and analysis (thiel, a).

corpus overviews

visual overviews of a corpus support distant readings of text and/or metadata features. we devised three. an online, interactive time-map of historical geography shows when and where versions were written and published (performances are a desideratum); it identifies basic genres (published books for readers, books for students, theatre texts), and provides bio-bibliographical information (thiel, ). a stylometric diagram is discussed in section . . (fig. ). ‘alignment maps’ depict the information created by segment alignment (fig. ).

fig. alignment maps of thirty-five german othello. ( – )
alignment maps

alignment maps, developed by thiel, are ‘barcode’-type maps which show how a translation’s constituent textual parts (here: speeches) align with a similar map of the base text. figure shows thirty-five such maps, in chronological sequence. each left-hand block represents the english base text of othello . , the right-hand block represents a german text, and the connecting lines represent alignments in the system. within each block, horizontal bars represent speeches (in sequence top to bottom) and thickness represents their length, measured in words; othello’s longest speech in the scene (and the play) is highlighted. small but significant differences in overall length can be noticed: translations tend to be longer than the translated texts, so it is interesting to spot versions which are complete yet more concise, such as gundolf ( ). we can see which versions, in which passages, make cuts, reduce, expand, transpose, or add material which could not be aligned with the base text. in the centre of the figure, the german translation (felsenstein and stueber, ) of the italian libretto (by boito) of verdi’s opera otello ( ) is a good example of omission, addition, and transposition. omissions and additions are also evident in the recent stage adaptations on the bottom line. zimmer ( ), like boito, assigns othello’s long speech to multiple speakers. in our online system, these maps serve as navigational tools alongside the texts in thiel’s parallel-text views. each bar representing a speech is also tagged with the relevant speech prefix, so any character’s part can be highlighted and examined. aligned segments are rapidly, smoothly synched in these interfaces, assisting exploratory bilingual reading.
stylometric network diagram

figure depicts a stylometric analysis of relative most frequent word frequencies in , -word chunks of forty german versions of othello, carried out by rybicki using the stylo script and the gephi visualization tool. the network diagram shows ( ) relations of general similarity between versions, represented by relative proximity (clustering), and ( ) similarities in particular sets of frequency counts, represented by connecting lines; their thickness or strength represents degree of similarity. these lines (edges) can indicate intertextual relations: dependency of some kind, including potential plagiarism. directionality can be inferred from date labels on nodes. for example, the version by bodenstedt ( ) (near top centre) was revised in the strongly connected version by rüdiger ( ). this confirms data on his title page. other results, as we will see, are more surprising: spurs to further research.

the x/y axes are not meaningful. the analysis involves hundreds of counts using differing parameters: the diagram is a design solution to the problem of representing high-dimensional data in a two-dimensional plane. removing or adding even one version produces a different layout and can re-arrange clusters. moreover, the analysis process is so complex that we cannot specify which text features lie behind the results. broadly, though, the diagram can be read historically, right to left: a highly formal poetic theatre language gives way to increasingly informal, colloquial style.

nine versions are revisions, editions or rewritings of the canonical translation by baudissin (originally , in the famed ‘schlegel-tieck’ shakespeare edition; see: sayer, ). most are quite strongly connected and closely clustered, but the apparent stylometric variety is a surprise.
the long, weak line connecting the cluster to the heavily revised stage adaptation by engel ( ) (upper left) is to be expected, but the length and weakness of the connection with wolff’s ( ) published edition (lower right) is more of a surprise. his title page indicated a modestly revised canonical text, but stylometry suggests something more radical is going on.

above all, this analysis reveals the salience of historical period. distinct clusters are formed by all the early c versions (mid-right), arguably all the late c versions (top), most of the late c versions (lower left), and all the c versions (far left). the c versions are all idiosyncratic adaptations (cf. fig. , bottom line). it is surprising to see how similar they appear, in stylometric terms, relative to the rest of the corpus. and what do the strong links among them indicate? mutual influence, plagiarism, common external influence? what about the lines leading from gundolf ( ) (low centre) across to swaczynna ( ), to laube ( ), to günther ( )? günther is the most celebrated living german shakespeare translator: do these lines trace his debts to less famous precursors? period outliers are also interesting. zeynek (?- ) appears to be writing a c style in the s. the unknown schwarz ( ) is curiously close to the famous fried ( ). rothe ( ) (extreme bottom left) is writing in a late c style in the s. this throws interesting new light on the notorious ‘rothe case’ of the weimar republic and nazi years: he was victimized for his ‘liberal’, ‘modern’ approach to translation (von ledebur, ).

genre is salient, too. a very distinct cluster, bottom right, includes all versions designed for study and written in prose (rather than verse). this includes our two earliest versions ( and ) and two published years later ( , ).
strongly interconnected, weakly connected with any other versions, this cluster demonstrates the flaw in the approach of altintas et al. ( ). differences in the use of german represented by distances across the rest of graph cannot be due to any general historical changes in the language. they reflect changes in the specific ways german is used by translators of shakespeare for the stage, and/or for publications aimed at people who want to read his work for pleasure.

fig. stylometric analysis of forty german othellos. node label key: translator_date. prefix: baud = version of baudissin ( ). suffixes: _pr = prose study edition. no suffix = other book. _t = theatre text (no book trade distribution). _x = theatre text, not performed (only version by a woman).

the ‘eddy and viv’ interface

overviews are invaluable, but the core of our system is a machine for examining differences at small scale. the machine implements an algorithm we called ‘eddy’, to measure variation in a corpus of translations of small text segments. eddy’s findings are then aggregated and projected onto the base text segments by the algorithm ‘viv’ (‘variation in variation’). in an interface built by thiel, on the basis of flanagan’s work, users view the scrollable base text (fig. : left column) and can select any previously defined and aligned segment: this calls up the translations of it, in a scrollable list (fig. : right columns). the list can be displayed in various sequences (transition between sequences is a pleasingly smooth visual effect) by selecting from a menu: order by date; by the translator’s surname; by length; or (as shown in fig. ) by eddy’s algorithmic analysis of relative distinctiveness. eddy metrics are displayed with the translations, and also represented by a yellow horizontal bar which is longer, the higher the relative value.
we defined ‘segment’, by default, as a ‘natural’ chunk of dramatic text: an entire speech, in semi-automated alignment. manual definition of segments (any string within a speech) is possible, but defining and aligning such segments in forty versions is time-consuming. in future work we intend to use the more standard definition: segment = sentence (not that this would simplify alignment, since translation and source sentence divisions frequently do not match). eddy compares the wording of each segment version with a corpus word list: here the corpus is the set of aligned segment versions. no stop words are excluded; no stemming, lemmatization, or parsing is performed. flanagan explains how the default eddy algorithm works:

each word in the corpus word list [the set of unique words for all versions combined] is considered as representing an axis in n-dimensional space, where n is the length of the corpus word list. for each version, a point is plotted within this space whose coordinates are given by the word frequencies in the version word list for that version. (words not used in that version have a frequency of zero.) the position of a notional ‘average’ translation is established by finding the centroid of that set of points. an initial ‘eddy’ variation value for each version is calculated by measuring the euclidean distance between the point for that version and the centroid. flanagan in cheesman et al. ( – )

this default eddy algorithm is based on the vector space model for information retrieval. given a set $S$ of versions $\{a, b, c, \ldots\}$ where each version is a set of tokens $\{t_1, t_2, t_3, \ldots, t_n\}$, we create a set $U$ of unique tokens from all versions in $S$ (i.e. a corpus word list). for each version in $S$ we construct vectors of attributes $A, B, C, \ldots$ where each attribute is the occurrence count within that version of the corresponding token in $U$, that is:

$$A_i = \sum_{j=1}^{|a|} [\, a_j = u_i \,]$$

(the bracket counts 1 when token $a_j$ equals $u_i$, and 0 otherwise).

fig. eddy and viv interface (colour online)

we construct a further vector $Z$ to represent the centroid of $A, B, C, \ldots$ such that

$$Z_i = \frac{A_i + B_i + C_i + \cdots}{|S|}$$

then, for a version $a$, the default eddy value is calculated as:

$$\text{eddy} = \sqrt{\sum_{i=1}^{|U|} |Z_i - A_i|^2}$$

this default eddy formula is used in the experiments reported below, coupled with a formula for viv as the average (arithmetic mean) of eddy values. other versions of the formulae can be selected by users, e.g. an alternative eddy value based on angular distance, calculated as:

$$\text{eddy} = \cos^{-1}\left(\frac{A \cdot Z}{\lVert A \rVert \, \lVert Z \rVert}\right)$$

work remains to be done on testing the different algorithms, including the necessary normalization for variations in segment length.

essentially, eddy assigns lower metrics to wordings which are closer to the notional average, and higher metrics to more distant ones. so, eddy ranks versions on a cline from low to high distinctiveness, or originality, or unpredictability. it sorts common-or-garden translations from interestingly different ones.

viv shows where translators most and least disagree, by aggregating eddy values for versions of the base text segment, and projecting the result onto the base text segment. viv metrics for segments are displayed if the text is brushed, and relative values are shown by a colour annotation (floor and ceiling can be adjusted). as shown in fig. , the base text is annotated with a colour underlay of varying tone. lighter tone indicates relatively low viv (average eddy) for translations of that segment. darker tone indicates higher viv. shakespeare’s text can now be read by the light of translations (cheesman, ).

sometimes it is obvious why translators disagree more or less. in fig. , roderigo’s one-word speech ‘iago -’ has a white underlay: every version is the same.
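The default Eddy and Viv calculations described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's own code: the function names `tokenize`, `eddy_values`, and `viv` are hypothetical, the tokenizer (lowercased word characters) is an assumption since the tokenization rule is not specified here, and no normalization for segment length is attempted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase word tokens; no stop-word removal, stemming or lemmatization.
    return re.findall(r"\w+", text.lower())


def eddy_values(versions, metric="euclidean"):
    """Distance of each version's word-count vector from the centroid of all versions."""
    counts = [Counter(tokenize(v)) for v in versions]
    vocab = sorted(set().union(*counts))               # corpus word list U
    vectors = [[c[w] for w in vocab] for c in counts]  # A, B, C, ...
    centroid = [sum(col) / len(vectors) for col in zip(*vectors)]  # Z

    def norm(vec):
        return math.sqrt(sum(x * x for x in vec))

    def euclidean(a):
        return math.sqrt(sum((z - x) ** 2 for z, x in zip(centroid, a)))

    def angular(a):
        denom = norm(a) * norm(centroid)
        if denom == 0:
            return 0.0  # empty version: treated as zero distance in this sketch
        dot = sum(z * x for z, x in zip(centroid, a))
        # Clamp to guard against floating-point values just outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, dot / denom)))

    distance = euclidean if metric == "euclidean" else angular
    return [distance(a) for a in vectors]


def viv(versions, metric="euclidean"):
    """Arithmetic mean of the Eddy values for one base-text segment."""
    values = eddy_values(versions, metric)
    return sum(values) / len(values) if values else 0.0


# Three of the wordings discussed in the article, used here as a tiny corpus.
versions = ["Tugend? Quatsch!", "Tugend? Possen!", "Charakter? Am Arsch der Charakter!"]
scores = eddy_values(versions)   # roughly [1.20, 1.20, 1.94]
same = viv(["Iago", "Iago"])     # identical versions give 0.0
```

In this toy corpus the third wording shares no words with the others and repeats one of its own, so it lies furthest from the centroid, while a segment whose versions are all identical has a Viv of zero, matching the white underlay described above.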
the duke’s couplet beginning ‘if virtue no delighted beauty lack. . .’ (the subject of cheesman’s initial studies), has the darkest underlay. as we knew, translators (and editors, performers, and critics) interpret this couplet in widely varying ways. in the screenshot, the duke’s couplet has been selected by the user: part of the list of versions can be seen on the right. mts back into english are provided, not that they are always helpful.

unlike the traviz system (jänicke et al., ), ours does not represent differences between versions in terms of edit distances, and translation choices in terms of dehistoricized decision pathways. our system preserves key cultural information (historical sequence). it can better represent very large sets of highly divergent versions. the traviz view of two lines from our othello corpus (jänicke et al., , figure ) is a bewilderingly complex graph. with highly divergent versions of longer translation texts, traviz output is scarcely readable. crucially there is no representation of the translated base text. the eddy and viv interface is (as yet) less adaptable to other tasks, but better suited to curiosity-driven cross-language exploration.

experiments with eddy and viv

eddy and ‘virtue? a fig!’

to illustrate eddy’s working, table shows eddy results, in simplified rank terms (‘high’, ‘low’, or unmarked intermediate), for thirty-two chronologically listed versions of a manually aligned segment with a very high viv value: ‘virtue? a fig!’ (othello . . ). an exclamation is always, in munday’s terms, ‘attitude-rich’, burdened with affect; this one is a ‘critical point’ for several reasons. ‘virtue’ is a very significant term in the play, and crucially ambiguous: in shakespeare’s time it meant not only ‘moral excellence’ but also ‘essential nature’, or ‘life force’, and ‘manliness’. the speaker here is iago, responding to roderigo, who has just declared that he cannot help loving the heroine, desdemona: ‘. . .
it is not in my virtue to amend it’. Roderigo means: not in my nature, my power over myself, my male strength. But Iago’s response implies the moral meaning, too. Then, the phrase ‘a fig!’ is gross sexual innuendo. ‘Fig’ meant vagina. The expression derives from Spanish and refers to an obscene hand gesture: intense affect (see Neill, , p. ). (The expression ‘I don’t give/care a fig!’ was once commonplace, and often used euphemistically for ‘fuck’, a word Shakespeare never uses.)

Table : ‘Virtue? A fig!’ in thirty-two German translations ( – )

‘Eddy’ rank / length rank | Translation | Back-translation | Source | Intertexts
- | Tugend? Pfifferling. | Virtue? [Not worth a] chanterelle. | Wieland (S) | (P)
H | Tugend! – Den Henker auch! | Virtue! – [To hell with] the executioner too! | Eschenburg (ed. Eckert) (S) | -
L L | Tugend? Possen! | Virtue? Buffoonery! | Schiller and Voss | (P) cf. # , #
L | Tugend? Narrenspossen! | Virtue? Fools’ buffoonery! | Benda / Ortlepp | -
L | Tugend! Abgeschmackt! | Virtue! Vulgar! | * Baudissin [Schlegel-Tieck] | (P)
- | Tugend? Zum Henker! | Virtue? To the executioner! [= Be damned!] | Bodenstedt | -
- | Tugend? Leeres Gefasel! | Virtue? Mindless drivel! | Jordan | -
- | Tugend? Wischiwaschi! | Virtue? Drivel! | Gildemeister | -
L L | Tugend! Aeh! | Virtue! Ugh! | Vischer | -
- | Tugend! Pfeif drauf! | Virtue! Whistle on it! [= Don’t give a damn for it] | Gundolf | -
L L | Tugend! Possen! | Virtue! Buffoonery! | Baudissin (ed. Wolff) | -
L | Tugend! Dummheit! | Virtue! Stupidity! | Engel (T) | -
- | Energie? Ein Schmarren! | Energy? Nonsense! [dialectal: S German] | Schwarz (T) | cf. #
L | Tugend! Ach was! | Virtue! Oh come on! | Baudissin (ed. Brunner) (S) | -
H H | Nicht die Kraft! Zum Lachen! | Not the strength! Laughable! | Zeynek ?- (T) | -
L | ‘Tugend’? Quatsch! | ‘Virtue’? Nonsense! | Flatter | -
- | Tugend? Weiße Mäuse! | Virtue? White mice! | Rothe | -
- | Macht? Dummes Zeug! | Power? Stuff and nonsense! | Schaller | -
- | Tugend? Keine Feige wert! | Virtue? Not worth a fig! | Schröder | -
- | Tugend? Fick drauf | Virtue? Fuck it | Swaczynna (T) | -
H H | In deiner Macht? Ach was! | In your power? Oh come on! | * Fried | (P) cf. # , # ...
- | Tugend! Ein Quark! | Virtue! Quark! [soft cheese/nonsense] | Lauterbach (T) | cf. #
- | Macht? Schmarren! | Power? Nonsense! [dialectal: S German] | * Engler (S) | -
H H | Nicht in deiner Macht? Son Quatsch! | Not in your power? What nonsense! | Laube (T) | cf. #
- | Tugend! Ein Dreck! | Virtue! Filth! [crap] | Rüdiger (T) | -
L L | Tugend? Quatsch | Virtue? Nonsense | * Bolte and Hamblock (S) | -
H H | Nicht in deiner Macht? Quark! | Not in your power? Quark! | * Günther | (P) cf. # , #
H | Da kannst du lange beten | You can pray a long time [= not until the cows come home] | Motschach (T) | -
L | Affenkram | Ape-rubbish [= crap!] | Buhss (T) | -
H H | Charakter? Am Arsch der Charakter! | Character? Character my arse! | * Zaimoglu and Senkel | -
H H | Nicht in deiner Macht? Quatsch! | Not in your power? Nonsense! | Leonard (T) | -
- | Nicht in deiner Macht! | Not in your power! | * Steckel | -

H and L indicate highest (H) and lowest (L) seven Eddy value rankings and length rankings. Alternative translations to ‘Tugend’ = ‘virtue’ are underscored. Sources: * = now in print. (S) = study text. (T) = no book trade distribution (theatre text). Intertexts: (P) = prestigious, influential.

The lowest and highest seven Eddy rankings are indicated. Eddy’s lowest-scoring translation is ‘Tugend? Quatsch’ (# , # ). ‘Tugend’ is the modern dictionary translation of (moral) ‘virtue’. ‘Quatsch’ is a harmless expression of disagreement: a bowdlerized translation (bowdlerization is clear in most versions here). The Eddy score is low because most translations (until ) use ‘Tugend’ and several also use ‘Quatsch’. Eddy’s highest score is for ‘Charakter? Am Arsch der Charakter!’ (# ). This is Zaimoglu’s controversial adaptation of , with which Cheesman’s work on Othello began ( ). No other translation uses those words, including the preposition ‘am’ and article ‘der’. ‘Charakter’ accurately translates the main sense of Shakespeare’s ‘virtue’ here, and ‘Arsch’ fairly renders ‘a fig!’ This is among the philologically informed translations of ‘virtue’ (as ‘energy’, ‘strength’, ‘power’), a series which begins with Schwarz ( ) (# ). It is also among the syntactically expansive translations, with colloquial speech rhythms, which begin with Zeynek (?- ) (# ). Both series become predominant following the prestigious Fried ( ) (# ).

Reading versions both historically and with Eddy, in our interface, makes for a powerful tool. Here the historical distribution of Eddy rankings confirms what we already know about changes in Shakespeare translation. The lowest mostly appear up to . The highest mostly appear since (recall Figure : lower left quadrant).
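The reasoning above (a version scores low when it shares its words with most other versions, and high when no other version uses them) can be illustrated with a minimal Python sketch. It assumes that Eddy behaves like the Euclidean distance between a version's word-count vector and the corpus mean vector, which the notes list as one of the available formulae. The function names `tokenize` and `eddy_scores` and the six-version sample (wordings taken from the table) are illustrative choices rather than the project's implementation, and no length normalisation is applied.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (umlauts and ß included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def eddy_scores(versions):
    """Euclidean distance of each version's word counts from the corpus mean.

    Assumption: this stands in for Eddy 'formula A'; the project's exact
    weighting and normalisation are not reproduced here.
    """
    counts = {name: Counter(tokenize(text)) for name, text in versions.items()}
    vocab = sorted(set().union(*counts.values()))
    n = len(counts)
    mean = {w: sum(c[w] for c in counts.values()) / n for w in vocab}
    return {
        name: math.sqrt(sum((c[w] - mean[w]) ** 2 for w in vocab))
        for name, c in counts.items()
    }


# Six of the thirty-two versions from the table (illustrative subset).
versions = {
    "flatter": "Tugend? Quatsch!",
    "bolte_hamblock": "Tugend? Quatsch",
    "schiller_voss": "Tugend? Possen!",
    "vischer": "Tugend! Aeh!",
    "fried": "In deiner Macht? Ach was!",
    "zaimoglu_senkel": "Charakter? Am Arsch der Charakter!",
}

scores = eddy_scores(versions)
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:16s} {score:.2f}")
```

On this small sample the two ‘Tugend? Quatsch’ lines receive the lowest score and the Zaimoglu and Senkel line the highest, which mirrors the pattern described above. Because raw word counts are used, longer versions tend to drift further from the mean, which is one reason a length normalisation is needed in practice.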
Ranking by length in typographical characters is not often useful, but with such a short segment its results are interesting, and similar to Eddy’s. Most of the shortest are up to , and most of the longest since : that shift towards more expansive, colloquial translations, again.

Similar historical Eddy results are found for many segments in our corpus. An ‘Eddy history’ graph, plotting versions’ average Eddy on a timeline, can be generated: it shows Eddy average rising in this corpus since about . This may be a peculiarity of German Shakespeare. It may be an artefact of the method. But it is conceivable that, with further work, the period of an unidentified translation might be predicted by examining its Eddy metrics.

Eddy and Viv results for any selected segments, based on the full corpus or a selected subset of versions, can be retrieved and explored in several forms of chart, table, and data export. The interactive ‘Eddy variation’ chart, for example, facilitates comparisons between one translator’s work and that of any set of others (e.g. her precursors and rivals). It plots Eddy results for selected versions against segment position in the text; any version’s graph can be displayed or not (simplifying focus on the translation of interest); when a node is brushed, the relevant bilingual segment text is displayed.

Eddy’s weaknesses are evident in Table , too. It fails to highlight the only one-word translation (# ), or the one giving ‘fig’ for ‘fig’ (# ), or the one with the German equivalent of ‘fuck’ (# ), expressing the obscenity which remains concealed from most German readers and audiences. We still need to sort ordinary translations from extraordinary and innovative ones in more sophisticated ways. Eddy also fails to throw light directly on genetic and other intertextual relations.
Some are indicated in the ‘Intertexts’ column in Table : the probable influence of some prestigious retranslations is apparent in several cases, as is the possible influence of some obscure ones. Such dependency relations require different methods of analysis and representation. Stylometric analysis (Section . . ) provides pointers. More advanced methods must also encompass negative influence, or significant non-imitation. Table shows (and this result is typical too) that the canonical version (# ), the most often read and performed German Shakespeare text from until today, is ‘not’ copied or even closely varied. That is no doubt because of risk to a retranslator’s reputation. Retranslators must differentiate their work from what the public and the specialists know (Hanna, ).

The tool we built is a prototype. Eddy is admittedly imperfect. But its real virtue lies in the power it gives to Viv, enabling us to investigate to what extent base text features and properties might correlate with differences among translations. Even that is only a start, as Flanagan points out:

Ebla can be used to calculate different kinds of variation statistics for base text segments based on aligned corpus content. These can potentially be aggregated-up for more coarse-grained use. The results can be navigated and explored using the visualization functionality in Prism. However, translation variation is just one of the corpus properties that could be investigated. Once aligned, the data could be analysed in many other ways. (Flanagan in: Cheesman et al., – )

‘Viv’ in Venice

An initial Viv analysis of Othello . , involving all the ninety-two natural ‘speech’ segments, was reported (Cheesman, ).
It found that the ‘highest’ Viv-value segments tended to be (1) near the start of the scene, (2) spoken by the Duke of Venice, who dominates that scene, but appears in no other, and (3) rhyming couplets (rather than blank verse or prose). There are twelve rhyming couplets in the scene; two are speech segments; both were in the top ten of ninety-two Viv results. No association was found between Viv value and perceptible attitudinal intensity, or any linguistic features. We did find some high-Viv segments associated with specific cross-cultural translation challenges. Highest Viv was a speech by Iago with the phrase ‘silly gentleman’, which provokes many different paraphrases. But some lower-Viv segments present similar difficulties, on the face of it. There was no clear correlation. Still, four hypotheses emerged for further research.

Hypothesis 1: Based on rhyming couplets having high Viv-value: retranslators diverge more when they have additional poetic-formal constraints.

Hypothesis 2: Based on finding (1) above: retranslators diverge more at the start of a text or major chunk of text (i.e. at the start of a major task).

Hypothesis 3: Based on finding (2) above: retranslators diverge more in translating a very salient, local text feature in a structural chunk (in this scene: the part of the Duke) and less in translating global text features (e.g. here: Othello, Desdemona, Iago).

Hypothesis 4 relates to ‘low’ Viv findings. It was somehow disappointing to find that speeches by the hero Othello and the heroine Desdemona, including passages which generate much editorial and critical discussion, had moderate, low, or very low Viv scores. Famous passages where Othello tells his life story and how he fell in love with Desdemona, or where Desdemona defies her father and insists on going to war with Othello, surely present key challenges for retranslators.
Perhaps passages which have been much discussed by commentators and editors pose less of a cognitive and interpretive challenge, as the options are clearly established. This hypothesis could be investigated by marking up passages with a metric based on the extent of associated annotation in editions and/or frequency of citation in other corpora. For now, we have speculated that the hero’s and heroine’s speeches in this particular scene do exhibit common attitudinal, not so much linguistic, but dramatic features. In the low-Viv segments, the characters can be seen to be taking care to express themselves particularly clearly; even if very emotional, they are controlling that emotion to control a dramatic situation. Perhaps translators respond to this ‘low affect’ by writing less differently? But it is difficult to quantify such a text feature and so check Viv results against any ‘ground truth’.

There is another possible explanation: in the most ‘canonical’ parts of the text (here: the hero’s and heroine’s parts), retranslators perhaps tread a careful line between differentiating their work and limiting their divergence from prestigious precursors. Such ‘prestige cringe’ would relate to the above-mentioned negative influence, or non-imitation of the most prestigious translations (Section . ). Precursors act, paradoxically, as both negative and positive constraints on retranslators.

Hypothesis 4: In the most canonical constituent parts of a work, Viv is low, as retranslators tend to combine willed distinctiveness with caution, limiting innovation.

In the initial analysis, the groups of speeches assigned highest and lowest Viv values had suspiciously similar lengths. Clearly the normalization of Eddy calculations for segment length leaves something to be desired. The next and latest analysis focused on segments of similar length to investigate our hypotheses.

‘Viv’ in two-liners

Table shows the grammatically complete two-line verse passages in Othello . , plus prose passages of equivalent length, in Viv value rank order. A subcorpus of twenty translations was selected for better comparability. The text assigned to each major character part here is reasonably representative of their overall part in the scene, counted in lines: Brabantio (sample eighteen lines [nine couplets]/ total sixty-one lines) . , Desdemona ( / ) . , Duke ( / ) . , Iago ( / ) . , Othello ( / ) . .

[Table : ‘Viv values’ in two-liners in Othello . , generated by twenty German versions]

Hypothesis 1 seems to be confirmed, though more work needs to be done to prove it conclusively: high Viv value correlates with poetic-formal constraint. In the column ‘Form’ in Table , blank verse is the default. Unsurprisingly, rhyming couplets appear mostly in the top half of the table, including five of the top ten items. Translators enjoy responding to the formal challenge of rhyming couplets in self-differentiating ways; and they must so respond, or else they very obviously plagiarize, because these items are rare in the text and highly noticeable, for audiences or readers.

Hypothesis 2 is not confirmed: scanning the column ‘Running order’, there is no sign that translators differentiate their work more at the start of the scene, as they embark on a new chunk of the task. That could have been interesting for psycholinguistic and cognitive studies of translation (Halverson, ).

Hypothesis 3 seems to be confirmed, but we need much more evidence to be sure we have discovered a general pattern. Scanning the column ‘Speaker’, the Duke’s segments are more variously translated than those of other speakers.
Even if we exclude rhyming couplets, the Duke is over-represented in the upper part of the table. Brabantio and Iago also have some very high-Viv lines, but their segments are distributed evenly up and down the table. Not so with the Duke, who is the salient, local text feature in this scene and no other.

Hypothesis 4 also seems provisionally confirmed. Othello is strikingly low-Viv, mostly. Desdemona tends to be low- to mid-Viv. Translations of their parts differ ‘less’ than other parts, at this scale. Why? We do not know. It could be ‘prestige cringe’ (Section . ). But it could also be specific to this text. Othello in particular refuses ‘affect’ in this scene, as he does throughout the first half of the play: he is in command of everything, including his emotions. He echoes a much discussed line just spoken by Desdemona (‘I saw Othello’s visage in his mind’, . . ) when he says to the Duke and assembled senators that he wants her to go to war with him, but:

I therefore beg it not,
To please the palate of my appetite,
Nor to comply with heat – the young affects
In me defunct – and proper satisfaction.
But to be free and bounteous to her mind: (. . .)
(Othello . . – )

This is one of the play’s cruxes: passages which editors deem corrupt and variously resolve (here, ‘me’ is often changed to ‘my’, ‘defunct’ to ‘distinct’, and the punctuation revised). Translators also resolve this passage variously, depending in part on which edition(s) they work with; but, as measured by Viv, not very variously, compared with other passages. Can it be that textual ‘affect’ is relatively less, because that is the kind of character, the mind, the ‘virtue’ Othello is projecting?
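The column-scanning used to test the hypotheses in this section (which speakers and verse forms cluster in the upper half of a Viv-ranked table) can be sketched in Python. This is a hedged illustration only. The segment records and Eddy values below are invented toy data, not figures from the article's table. Viv is taken here to be the mean of a segment's Eddy values, which is an assumption (the notes also list a standard-deviation variant), and the helper names `viv` and `top_half_share` are hypothetical.

```python
from collections import Counter
from statistics import mean


def viv(eddy_values):
    """Viv for one base-text segment, taken here as the mean Eddy value (an assumption)."""
    return mean(eddy_values)


def top_half_share(segments, key):
    """For each value of `key`, the fraction of its segments ranked in the top half by Viv."""
    ranked = sorted(segments, key=lambda seg: seg["viv"], reverse=True)
    top = ranked[: len(ranked) // 2]
    total = Counter(seg[key] for seg in ranked)
    in_top = Counter(seg[key] for seg in top)
    return {value: in_top[value] / total[value] for value in total}


# Hypothetical toy records: speakers and forms echo the scene,
# but every Eddy value below is invented for illustration.
raw = [
    ("duke", "couplet", [3.0, 3.4, 3.2]),
    ("duke", "blank verse", [2.8, 3.0, 2.6]),
    ("brabantio", "couplet", [2.9, 3.1, 3.0]),
    ("brabantio", "blank verse", [1.8, 2.0, 2.2]),
    ("iago", "prose", [2.5, 2.7, 2.6]),
    ("othello", "blank verse", [1.5, 1.7, 1.6]),
    ("othello", "blank verse", [1.9, 1.7, 1.8]),
    ("desdemona", "blank verse", [2.2, 2.0, 2.1]),
]
segments = [
    {"speaker": speaker, "form": form, "viv": viv(eddies)}
    for speaker, form, eddies in raw
]

by_speaker = top_half_share(segments, "speaker")
by_form = top_half_share(segments, "form")
print(by_speaker)
print(by_form)
```

A share near 1 for a speaker or form means its segments sit almost entirely in the upper half of the ranking; a share near 0 means they sit in the lower half. With so few segments per speaker such shares are only suggestive, which matches the caution expressed above about needing much more evidence.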
Concluding comments

Findings which only confirmed what was already known would be truly disappointing (though we do need some such confirmation, to have any faith in digital tools). Digital literary studies should provoke thought. A classic example is Moretti’s discovery of a rhythm of – years in the emergence and disappearance of C novelistic genres, which he uneasily ascribed to a cycle of biological-sociocultural ‘generations’:

I close on a note of perplexity: faute de mieux, some kind of generational mechanism seems the best way to account for the regularity of the novelistic cycle, but ‘generation’ is itself a very questionable concept. Clearly, we must do better. (Moretti, , p. )

So too with ‘Translation Arrays’ and ‘Version Variation Visualization’: we must do better. We wanted to demonstrate that this sort of approach opens up interesting possibilities for future research. Of course one big difference between Moretti’s work and ours so far is one of scale. His team works with tens or hundreds of thousands of texts and metadata items. We are working with a few dozen versions of one play, in one target language, because that is what we have got, and only a fragment of the play, because we chose to make the texts publicly accessible, which entails copyright restrictions (and some expense). Our approach requires time-consuming text curation (correction of digital surrogates against page images), permission acquisition, and manual segmentation and alignment processes (more sophisticated approaches including machine learning will speed these up).

Moretti experimentally ‘operationalizes’ pre-digital critical concepts such as ‘character-space’ or ‘tragic collision’ (Moretti, ), by measuring quantities in texts: digital proxies or analogues. Eddy and Viv, on the other hand, are measuring relational corpus properties which have no obvious pre-digital analogue. What could they be proxies for?
Eddy makes visible certain kinds of resemblance and difference, certain sequences, patterns of influence and distinctiveness. Critically understanding these still depends on understanding ‘para- and meta-texts’ (Li, Zhang and Liu, ). Viv’s contribution is even less certain: we won’t know whether its results correspond to anything ‘real’ about translated texts’ qualities, or those of translations, or of translators, until we have studied many more cases.

Eddy and Viv analysis, as implemented, is crude. We can imagine training next-generation Eddy on human-evaluated variant translations. We can envisage experiments with lemmatization, stopword exclusion, parsing, morphosyntactical tagging, diverse automated segment definitions, text analytics, and plugging in other corpora for richer analyses. When does a translator’s use of language mimic a pre-existing style, when is it innovative, in what way? We can map texts to wordnets, historical dictionaries and thesauri. We can model topics, analyse sentiments. We can explore consistency and coherence within translations, usage of less common words, word-classes, word-sets, grammatical, rhetorical, poetic, prosodic, metrical, metaphorical features, and so on. We can generate intertextual and phylogenetic trees. We can perhaps adjust Viv for historical sequence, and weight for the complex effects of influence, imitation, and intentional non-imitation. Given multi-lingual parallel corpora, we can project a cross-cultural Viv. The more sophisticated the analysis, the greater its scope, the greater the cost of text preparation and annotation, and the greater the challenge in creating visual interfaces which offer value to non-programmers. For text resources on a scale which might justify such investment, we must next look to scripture.
Then we will need experts in God’s domain, as well.

Funding

This work was supported by Swansea University (Research Incentive Fund and Bridging the Gaps), and the main phase of software development was funded by a -month Research Development Grant in under the Digital Transformations theme of the Arts and Humanities Research Council (UK), reference AH/J / .

References

Algee-Hewitt, M., Allison, S., Gemma, M., Heuser, R., Moretti, F., and Walser, H. ( ). Canon/archive: large-scale dynamics in the literary field. Literary Lab Pamphlet, vol. . http://litlab.stanford.edu/literarylabpamphlet .pdf (accessed January ).
Altintas, K., Can, F., and Patton, J. M. ( ). Language change quantification using time-separated parallel translations. Literary and Linguistic Computing, ( ): – .
Babych, B. and Hartley, A. ( ). Modelling legitimate translation variation for automatic evaluation of MT quality. Proceedings of LREC , pp. – . http://www.lrec-conf.org/proceedings/lrec /pdf/ .pdf (accessed January ).
Babych, B., Hartley, A., and Atwell, E. ( ). Statistical modelling of MT output corpora for information extraction. In Archer, D., Rayson, P., Wilson, A., and McEnery, T. (eds), Proceedings of the Corpus Linguistics Conference, Lancaster University, – March , pp. – . http://ucrel.lancs.ac.uk/publications/cl /papers/babych.pdf (accessed January ).
Baker, M. ( ). Corpus linguistics and translation studies: implications and applications. In Baker, M., Francis, G., and Tognini-Bonelli, E. (eds), Text and Technology: In Honour of John Sinclair. Amsterdam and Philadelphia: John Benjamins, pp. – .
Baker, M. ( ). Towards a methodology for investigating the style of a literary translator. Target, ( ): – .
Baudissin, W. ( ). Shakspeares dramatische Werke. Vol. . Berlin: Reimer.
Berman, A. ( ). La retraduction comme espace de traduction. Palimpsestes, : – .
Bodenstedt, F. ( ). Othello, der Mohr von Venedig. Leipzig: Brockhaus.
Bolte, H. and Hamblock, D. ( ). Othello: Englisch-Deutsch. Stuttgart: Philipp Reclam.
Brownlie, S. ( ). Narrative theory and retranslation theory. Across Languages and Cultures, ( ): – .
Buhss, W. ( ). William Shakespeare Othello, Venedigs Neger. Berlin: Henschel Schauspiel Theaterverlag.
Cheesman, T. ( ). Shakespeare and Othello in filthy hell: Zaimoglu and Senkel’s politico-religious tradaptation. Forum for Modern Language Studies, ( ): – .
Cheesman, T. ( ). Thirty times ‘more fair than black’: Othello re-translation as political re-statement. Angermion, : – .
Cheesman, T. ( ). Reading originals by the light of translations. Shakespeare Survey, : – .
Cheesman, T. ( ). Othello . : ‘Far more fair than black’. In Smith, B. R. (ed.), The Cambridge Guide to the Worlds of Shakespeare, vol. . The World’s Shakespeare, -Present. Cambridge: Cambridge University Press, pp. – .
Cheesman, T. and the VVV Project Team ( ). Translation sorting with Eddy and Viv. http://www.scribd.com/doc/ /eddy-and-viv (accessed January ).
Cheesman, T., Flanagan, K., and Thiel, S. ( – ). Translation Array Prototype : project overview. http://www.delightedbeauty.org/vvvclosed/home/project (accessed January ).
Cheesman, T., Flanagan, K., Thiel, S., and Rybicki, J. ( ). Five maps of translations of Shakespeare. In Wiggin, B. and MacLeod, C. (eds), Un/Translatables: New Maps for Germanic Literatures. Evanston, IL: Northwestern University Press, forthcoming.
Deane-Cox, S. ( ). Retranslation: Literature and Reinterpretation. London: Bloomsbury.
Eder, M., Kestemont, M., and Rybicki, J. ( ). Stylometry with R: a package for computational text analysis. R Journal, ( ), forthcoming.
Engel, E. ( ). William Shakespeare Othello. Berlin: Felix Bloch Erben.
Engler, B. ( ). Othello: englisch-deutsche Studienausgabe. Munich: Franke.
Farwell, D. and Helmreich, S. ( ). Pragmatics-based machine translation. In Chan, S. (ed.), The Routledge Encyclopedia of Translation Technology. Abingdon/New York: Routledge, pp. – .
Felsenstein, W. and Stueber, C. ( ). Giuseppe Verdi: Othello. Milan and Frankfurt am Main: Ricordi.
Flatter, R. ( ). Othello der Mohr von Venedig. Sonderabdruck für Bühnenzwecke. Munich: Theaterverlag Desch.
Fried, E. ( ). Hamlet/Othello. Berlin: Wagenbach.
Geng, Z., Laramee, R. S., Cheesman, T., Ehrmann, E., and Berry, D. M. ( ). Visualizing translation variation: Shakespeare’s Othello. Advances in Visual Computing. Lecture Notes in Computer Science, : – .
Geng, Z., Laramee, R. S., Flanagan, K., Thiel, S., and Cheesman, T. ( ). ShakerVis: visual analysis of segment variation of German translations of Shakespeare’s Othello. Information Visualization, ( ): – .
Grant, C. B. ( ). Uncertainty and Communication: New Theoretical Investigations. Basingstoke: Palgrave Macmillan.
Gundolf, F. ( ). Shakespeare in deutscher Sprache. Vol. . Berlin: Bondi, .
Günther, F. ( ). William Shakespeare. Othello. Zweisprachige Ausgabe. Munich: Deutscher Taschenbuch Verlag.
Hadas, E. ( ). Word Dream: a reader’s companion. http://eranhadas.com/word dream (accessed January ).
Halverson, S. L. ( ). Psycholinguistic and cognitive approaches. In Baker, M. and Saldanha, G. (eds), Routledge Encyclopedia of Translation Studies. London and New York: Routledge, pp. – .
Hanna, S. ( ). Bourdieu in Translation Studies: The Socio-cultural Dynamics of Shakespeare Translation in Egypt. London and New York: Routledge.
Hope, J. and Witmore, M. ( ). The language of Macbeth. In Thompson, A. (ed.), Macbeth: The State of Play. London: Bloomsbury (Arden), pp. – .
House, J. ( ). Translation Quality Assessment: A Model Revisited. Tübingen: Narr.
Hutchings, T. ( ). Studying apps: research approaches to the digital Bible. In Cheruvallil-Contractor, S. and Shakkour, S. (eds), Digital Methodologies in the Sociology of Religion. London: Bloomsbury, chapter .
Jänicke, S., Geßner, A., Franzini, G., Terras, M., Mahony, S. and Scheuermann, G. ( ). TRAViz: a visualization for variant graphs. Digital Scholarship in the Humanities, (suppl ): i – . http://dsh.oxfordjournals.org/content/ /suppl_ /i (accessed January ).
Johannson, S. ( ). Between Scylla and Charybdis: on individual variation in translation. Languages in Contrast, ( ): – .
Kruger, A., Wallmach, K., and Munday, J. ( ). Corpus-Based Translation Studies: Research and Applications. London: Continuum.
Lapshinova-Koltunski, E. ( ). VARTRA: a comparable corpus for analysis of translation variation. Proceedings of the th Workshop on Building and Using Comparable Corpora, Sofia, Bulgaria, August , , pp. – . http://aclweb.org/anthology/w - (accessed January ).
Laube, H. ( ). William Shakespeare Othello der Mohr von Venedig übersetzt und bearbeitet. Frankfurt am Main: Verlag der Autoren.
Lauterbach, E. S. ( ). William Shakespeare, Othello, der Mohr von Venedig. Aus dem Englischen von E. S. Lauterbach unter Mitarbeit von Benita Gleisberg. Berlin: Henschel Schauspiel Theaterverlag.
Li, D., Zhang, C., and Liu, K. ( ). Translation style and ideology: a corpus-assisted analysis of two English translations of Hongloumeng. Literary and Linguistic Computing, ( ): – .
Long, L. ( ). Revealing commentary through comparative textual analysis: the realm of Bible translations. Palimpsestes, : – .
Matthiessen, C. M. I. M. ( ). The environments of translation. In Steiner, E. and Yallop, C. (eds), Beyond Content: Exploring Translation and Multilingual Text Production. Berlin and New York: Mouton de Gruyter, pp. – .
Mathijssen, J. ( ). The Breach and the Observance: Theatre Retranslation as a Strategy of Artistic Differentiation, with Special Reference to Retranslations of Shakespeare’s Hamlet ( - ). PhD dissertation, Utrecht University.
Moretti, F. ( ). Graphs, maps, trees: abstract models for literary history . New Left Review, : – .
Moretti, F. ( ). ‘Operationalizing’: or, the function of measurement in modern literary theory. Literary Lab Pamphlet, . http://litlab.stanford.edu/literarylabpamphlet .pdf (accessed January ).
Morini, M. ( ). The Pragmatic Translator: An Integral Theory of Translation. London: Bloomsbury.
Motschach, H. ( ). William Shakespeare Othello. Munich: Drei Masken.
Mueller, M. ( - ). About metadata and the query potential of the digital surrogate. http://wordhoard.northwestern.edu/userman/index.html (accessed January ).
Munday, J. ( ). Evaluation in Translation. Critical Points of Translation Decision-Making. Abingdon; New York: Routledge.
Neill, M. (ed.) ( ). The Oxford Shakespeare: Othello. Oxford: Oxford University Press.
O’Driscoll, K. ( ). Retranslation through the Centuries: Jules Verne in English. Oxford: Peter Lang.
Paloposki, O. and Koskinen, K. ( ). Reprocessing texts: the fine line between retranslating and revising. Across Languages and Cultures, ( ): – .
Penrose, R. ( ). The Emperor’s New Mind: Concerning Computers, Minds and the Laws of Physics. Oxford: Oxford University Press.
Roos, A. ( ). Making a clean breast of English Passover Haggadah translations: data visualization of bowdlerization in Haggadah translations of Ezekiel : . Unpublished chapter of PhD dissertation, University of Amsterdam.
Rothe, H. ( ). Der elisabethanische Shakespeare. Vol. . Baden-Baden: Holle.
Rüdiger, R. ( ). William Shakespeare Othello, der Mohr von Venedig Tragödie. In Anlehnung an die Übersetzung von Friedrich von Bodenstedt nach dem Original neu übersetzt. Berlin: Felix Bloch Erben.
Rybicki, J. ( ). The great mystery of the (almost) invisible translator: stylometry in translation. In Oakley, M. and Ji, M. (eds), Quantitative Methods in Corpus-Based Translation Studies. Amsterdam: John Benjamins, pp. – .
Rybicki, J. and Heydel, M. ( ). The stylistics and stylometry of collaborative translation: Woolf’s ‘Night and Day’ in Polish. Literary and Linguistic Computing, ( ): – .
Sayer, J. ( ). Wolf Graf Baudissin ( - ): Life and Legacy. Münster: LIT.
Schaller, R. ( ). Shakespeares Werke. Vol. . Berlin: Rütten & Loening.
Schröder, R. A. ( ). Shakespeare/Deutsch. Berlin, Frankfurt am Main: Suhrkamp.
Schwarz, H. ( ). Othello, der Maure von Venedig. Typescript. Shakespeare-Bibliothek München.
Shei, C-C. and Pain, H. ( ). Computer-assisted teaching of translation methods. Literary and Linguistic Computing, ( ): – .
Steckel, F. ( ). Die Tragödie von Othello, dem Mohren von Venedig. Frankfurt am Main: Verlag der Autoren.
Swaczynna, W. ( ). Die Tragödie von Othello, dem Mohren von Venedig. Cologne: Jussenhoven & Fischer.
Thiel, S. ( ). Understanding Shakespeare: towards a visual form for dramatic texts and language. http://www.understanding-shakespeare.com (accessed January ).
Thiel, S. ( ). Othello map. http://othellomap.nand.io/ (accessed January ).
Thiel, S. ( a). Visualizing translation variation. http://transvis.s -website-eu-west- .amazonaws.com/# or www.tinyurl.com/transvis (accessed January ).
Thiel, S. ( b). Visualizing Translation Variation: Designing Tools for Literary Scholars in Translation Studies and Linguistics. Masters dissertation. Bauhaus University Weimar.
Thiel, S. ( ). Macbeth loglikelihoods. http://macbeth.s -website-eu-west- .amazonaws.com/ or www.tinyurl.com/macbthe (accessed January ).
Toury, G. ( ). Descriptive Translation Studies – and Beyond. Revised edition. Amsterdam and Philadelphia: Benjamin.
Venuti, L. ( ). Retranslations: the creation of value. Bucknell Review, ( ): – .
Venuti, L. ( ). The Translator’s Invisibility: A History of Translation. Revised edition. London and New York: Routledge.
von Ledebur, R. ( ). Der Mythos vom deutschen Shakespeare: die Deutsche Shakespeare-Gesellschaft zwischen Politik und Wissenschaft - . Cologne and Weimar: Böhlau.
Wachsmann, M. ( ). William Shakespeare, Die Tragödie von Othello, dem Mohr von Venedig. Berlin: Gustav Kiepenheuer Bühnenvertriebs-GmbH.
Wang, Q. and Li, D. ( ). Looking for translators’ fingerprints: a corpus-based study on Chinese translations of Ulysses. Literary and Linguistic Computing, ( ): – .
Wolff, M. J. ( ). Shakespeares Werke übertragen nach Schlegel-Tieck. Vol. . Berlin: Volksverband der Bücherfreunde, Wegweiser-Verlag.
Zaimoglu, F. and Senkel, G. ( ). William Shakespeare Othello. Bearbeitung. Münster: Monsenstein und Vannerdat.
Zeynek, T. (?- ). Shakespeare: Othello der Mohr von Venedig. Munich: Ahn und Simrock Bühnen und Musikverlag.
Zimmer, H. ( ). Othello steht im Sturm: Jugendstück frei nach Shakespeare. Weinheim: Deutscher Theaterverlag Weinheim.

Notes

‘Version Variation Visualization: Translation Array Prototype ’ at http://www.delightedbeauty.org/vvvclosed. Further project links: www.tinyurl.com/vvvex. Alternative prototype tools were also built: see Geng et al., , . See further: Cheesman, , , and Cheesman et al., .
The existence of multilingual (re)translations can indicate both popularity and prestige, as in publishers’ blurbs for novels ‘translated into X languages’. For the Stanford Literary Lab, translations index popularity (Algee-Hewitt et al., , p. ). But ‘multiple’ retranslations often also mean prestige: some are included in institutional curricula, reviewed in ‘high-brow’ media, etc.
For example, , versions of the Bible in languages at www.bible.com or approx. versions of the Quran in forty-seven languages at http://al-quran.info. See Long ( ) and Hutchings ( ).
Venuti ( ) focuses on retranslations which deliberately challenge pre-existing translations. Our corpus is not so restricted.
see also wang and li ( ): digitally supported ana- lysis of two chinese translations of james joyce’s ulysses. . for details of the forty plus german texts used, see www.delightedbeauty.org (‘german’ page). . ‘if virtue no delighted beauty lack, / your son-in-law is far more fair than black’ (othello . . – ). multilingual translations of this are crowd-sourced by cheesman at: www.delightedbeauty.org. . this remains less easy than we would wish. roos is working with eran hadas on a more user-friendly corpus-creation, segmenting and aligning interface, in the course of a study of english translations of the hebrew haggadah from the c to now, also using tools such as traviz (jänicke et al., ) and word dream (hadas, ). see roos, , and http://www.tinyurl.com/jewishdh. . cheesman collated mit’s ‘moby’ shakespeare (http:// shakespeare.mit.edu) with neill’s edition ( ) for added dialogue and modern spellings. we chose to sample othello . partly because the english text is stable between editions, at the level of speeches and speech prefixes, if not at the level of wording (except at . . – —see neill, , p. ); also for its var- iety of major character parts. . http://www.juxtasoftware.org. juxta helps map phyl- ogeny, with the aim of (re)constructing an original or an authoritative edition. we cannot study retransla- tions with any such aim. there is no right translation. there may be a canonical translation, but users feel free to revise it, because it is ‘just’ a translation. . the potential value of this interface to support ex- plorations of text-analytic features is illustrated by the ‘macbthe’ interface (thiel, ): users explore a zoomable map of ‘macbeth’ with a log likelihood lemma table, following the impetus of hope and witmore ( ). see also thiel’s ( ) earlier work. . see: eder et al. ( ) and stylometric translations analyses by rybicki ( ) and rybicki and heydel ( ). . on the ‘fine line between retranslation and revision’ see: paloposki and koskinen, . 
there is no research on wolff, or indeed on most of the translators here.
cheesman named eddy after (1) a formula he primitively devised as ‘ p d’, adapting tf.idf formulae (see: cheesman and the vvv project team, , p. ), (2) his brother eddy, and (3) the idea that retranslations are metaphorical ‘eddies’ in cultural historical flows.
formulae available: a: euclidean distance; b: cheesman’s original, primitive formula; c: viv as standard deviation of eddy; d: dice’s coefficient; e: angular distance.
‘a normalisation needs to be applied to compensate for the effect of text length, [so] we calculated variation for a large number of base text segments of varying lengths, then plotted average [euclidean] eddy value against segment length. we found a logarithmic relationship between the two, and arrived at a normalisation function that gives an acceptably consistent average eddy value regardless of text length’ (flanagan in cheesman et al., – ). eddy formula e (angular distance) appears to address the length normalization problem to some extent.
stephen ramsay commented on the ‘graceful and illuminating’ interface that ‘prompts various kinds of “noticing” and encourages an essentially playful and exploratory approach to the “data”’ (personal correspondence, may ).
neill glosses ‘virtue’ as ‘moral excellence’, ‘manly strength and courage’, and ‘inherent nature’ at . . ; ‘power, strength of character’ at . . (neill, , p. and ; see there also for ‘fig’).
roos ( ) uses eddy and viv to explore bowdlerization in english haggadah texts.
zeynek died in ; his translations are undated.
stylometry and common sense recommended narrowing the corpus to give less ‘noisy’ results.
i excluded prose study versions, adaptations with extensive omissions, contractions, expansions and additions, c and c versions, including all versions of baudissin ( ), leaving fifteen versions: gundolf ( ), schwarz ( ), zeynek (?- ), flatter ( ), rothe ( ), schaller ( ), schröder ( ), fried ( ), swaczynna ( ), laube ( ), rüdiger ( ), motschach ( ), günther ( ), buhss ( ), wachsmann ( ).
the norm in german shakespeare translation is that formal variation in the original (prose, blank verse, rhymed verse, or another metrical scheme) should be replicated or analogously marked. roos ( ) reports similar findings for the haggadah: rhyming verse sections have higher viv, if translators use rhyme.
we thank a dsh referee for pointing out this possibility.
roos ( ) similarly finds lower viv value in bible quotations (the most canonical segments) in haggadah translations.
based on the two-line verse segments found manually, the length range was set at – characters. iago’s lengthy prose speeches include more examples than were segmented and aligned.
baudissin (five versions, – ) was added to the corpus previously used, to recognize this translation’s enduring relevance.
see: neill, , p. . the mit text (from an s edition) is quoted, but with neill’s line-numbering.
we also envisage training applications. an interface enabling trainee translators and trainers to compare versions would have great practical value, as an adjunct to a computer-assisted translation system and/or an assessment and feedback system.
shakespeare retranslations are found at scattered sites. larger, curated corpora are accessible in czech and russian: c. aligned texts (twenty-two versions of hamlet) at http://www.phil.muni.cz/kapradi; c. texts (twelve versions of hamlet) at http://rus-shake.ru/translations.
the term ‘surrogate’ is taken from mueller ( – ). ideally our system would include page images.
roos is working on this with eran hadas.
difficulties include in-text variants (e.g. in critical editions, or translators’, directors’, and actors’ copies) and orthographic variations (archaic and variously modernized forms; ad hoc forms fitting metrical rules; other non-standard forms). rather than standardize texts to facilitate comparisons, the machine should learn to recognize underlying equivalences.

t. cheesman et al., digital scholarship in the humanities. downloaded from http://dsh.oxfordjournals.org/ by guest on october.

library data science committee framework recommendations
university libraries
the university of north carolina at chapel hill
march

committee members
nandita mani
michelle cawley
lorin bruckner
jason casden
adam dodd
amanda henley
matt jansen
jamie mcgarty
morgan mckeehan
sarah morris
therese triumph
jessica venlet
joe williams

page | march university libraries, university of north carolina at chapel hill

contents
executive summary
recommendations
recommendation : communication & branding
  r .a: communication & branding
recommendation : assessment & reporting
  r .a. assessment
  r .b. reporting
recommendation : human resources & employee engagement
  r .a reskilling
  r .b performance management and professional development
  r .c new hires
recommendation : library data science priorities and partnerships
  r .a develop partnerships
  r .b communication with partners
  r .c joint or co-funded positions
recommendation : creating and expanding services
  r .a services to expand
  r .a new services to create
recommendation : space & infrastructure
  r .a library space and ds @ carolina
  r .b technical infrastructure
appendix a: executive summary of institutional interviews
appendix b: interview questions for library counterparts at exemplar institutions
appendix c: skills matrix survey
appendix d: survey for unc partners and stakeholders
appendix e: stakeholder matrix
appendix f: environmental scan
  executive summary
  methods
  results

executive summary

the university libraries at the university of north carolina at chapel hill has created a framework to expand data science support of students, faculty and researchers. this framework is centered on a set of recommendations put forward by the library data science committee. the committee’s recommendations are based on multiple data sources, including an environmental scan (see appendix f), interviews with library counterparts at exemplar institutions (see appendix a for high level summary and appendix b for interview questions), preliminary results from a library staff skills matrix survey, and committee experience and expertise.
the library data science committee identified six categories of recommendations:
• communication & branding
• assessment & reporting
• human resources & employee engagement
• cultivating partnerships
• expanding and creating services
• space & infrastructure

table below provides a summary of the high-level goals and specific recommendations within each category. additional details on each recommendation category follow.

to seize the opportunity to engage with the data science @ carolina initiative (hereafter ds @ carolina) at the outset, the university libraries should establish an implementation team. any delay in engaging with campus partners and other stakeholders will set us back in being seen as a partner and in engaging in high-level collaborations. to meet the needs of ds @ carolina and to engage as full partners, we envision that the implementation team will:
• develop mechanisms to evaluate needs of ds @ carolina stakeholders on an ongoing basis.
• oversee reskilling of university libraries’ staff to meet ds @ carolina needs.
• create position descriptions and engage in the hiring process for recruitment so university libraries is well-positioned to support ds @ carolina.
• engage in periodic, regular evaluation of university libraries’ skills and services around data science.
• establish and evaluate strategic objectives and priorities for university libraries support of ds @ carolina.
• address organizational and cultural barriers to meeting the anticipated needs around ds @ carolina.
• provide ongoing assessment of tools, technologies, software, and services that are necessary to support ds @ carolina.

if established, the implementation team should apply change management principles to prepare and support current staff for the changes in services and partnerships that supporting ds @ carolina will require of our organization.
specifically, a structured approach should be sought to provide clear and concise messaging around how the university libraries supports ds @ carolina, to minimize staff turnover, assure high morale, and align performance goals with this strategic initiative.

table : summary of recommendations and goals

recommendation : communication & branding
goals:
• showcase our expertise and capacity.
• cultivate data science partnerships.
• provide clarity for external partners and patrons.
• develop relationships with donors.
• recruit new talent to our organization.
includes recommendations around:
• library web site
• social media
• donor & campus engagement
• branding and promotional materials
• library newsletter & library meetings
• communication via data science labs

recommendation : assessment & reporting
goals:
• showcase our expertise and capacity.
• cultivate data science partnerships.
• develop relationships with donors.
includes recommendations around:
• assessment and evaluation plan
• unc data science needs
• data science spaces assessment team
• data science skills assessment
• annual overview

recommendation : human resources & employee engagement
goals:
• bridge the skills gap within university libraries to support data science activities on campus.
• incentivize library staff to gain skills relevant to data science.
• provide ample professional development opportunities.
• develop a tiered approach to providing services around data science.
includes recommendations around:
• reskilling
• performance management and professional development
• new hires

recommendation : library partnerships & data science priorities
goals:
• cultivate data science partnerships around research and curriculum integration.
• transform existing services into an immersive model where library staff are integrated into research projects.
includes recommendations around:
• library priorities and partnerships
• concierge roles
• mechanisms for collaboration
• formal communications channels
• joint-funded positions with campus partners

recommendation : creating and expanding services
goals:
• meet carolina’s needs around data science.
• cultivate data science partnerships.
includes recommendations around:
• services to expand
• services to create

recommendation : space & infrastructure
goals:
• cultivate community, interdisciplinarity, and catalyze new partnerships around data science.
• establish a group to review the libraries’ current technical infrastructure.
• create space use policies.
includes recommendations around:
• data science labs
• formal space use policy
• technical infrastructure review team
• infrastructure partnerships
• technical consultation services program
• use policy for libraries’ technical infrastructure
• business plan to manage infrastructure costs

recommendations

this section lays out recommendations around six areas, including:
• communication & branding
• assessment & reporting
• human resources & employee engagement
• cultivating partnerships
• expanding and creating services
• space & infrastructure

recommendations are categorized as , , or (or a combination of these), which refer to the following timelines:
• within year
• to years
• consider after ds @ carolina is established

recommendation : communication & branding

an intentional strategy around branding and communication for internal and external purposes will allow us to showcase our expertise and impact to stakeholders. this will assist with building relationships with donors and campus partners, recruiting new staff, and providing clarity around our services and spaces to serve ds @ carolina.
more than institutions were reviewed for the environmental scan. based on this review, we found that few libraries provide clear communications and branding around their data science support and partnerships. this is true even among institutions with strong data science programs.

communication and branding should be done in collaboration with library communications and development and should occur through multiple channels, including the library website, social media, signage, internal meetings, and events for campus partners and other stakeholders, including the public (e.g., workshops and trainings). the individual(s) serving in a concierge role will be critical to internal and external communication (see library data science concierge). details on specific recommendations are provided below.

regardless of how the campus initiative moves forward, the libraries should consider, in partnership with library communications, how to address communication and branding around data science and other services.

communication & branding goals
• showcase our expertise and capacity.
• cultivate data science partnerships.
• provide clarity for external partners and patrons.
• develop relationships with donors.
• recruit new talent to our organization.

r .a: communication & branding

the committee proposes the following recommendations around communication and branding:

library web site: establish a one library approach to illustrate our support of data science efforts in terms of services, staff, spaces, events and workshops, and through our partnership with the ds @ carolina initiative. those navigating our website should be able to clearly identify university libraries as a partner in ds @ carolina and identify individuals working in data science related areas.
upon establishment of the campus-wide ds @ carolina initiative, ensure university libraries is clearly indicated as a campus partner in their marketing materials and web pages.

social media: create a strategy to ensure that our work in data science and as a partner to ds @ carolina is highlighted in a one library fashion via facebook, twitter, and other media outlets as needed.

donor & campus engagement: engage our donors and future prospects through windows magazine and the gazette as a way of sharing how university libraries is engaged in data science efforts at carolina, including, but not limited to, ds @ carolina. include stories of impact such as those where university libraries has:
• partnered with researchers and faculty on data science projects.
• assisted students in acquiring data skills.
• engaged with faculty on curriculum integration.
• contributed to the carolina next (or ds @ carolina) strategic framework.

create a brand. create a graphic identity and tagline that reflect our values and promise to our constituents around serving campus data science needs. as part of this effort, create a messaging strategy to keep internal and external audiences informed of new developments at the libraries related to ds @ carolina. develop brochures, powerpoint slide templates, and other promotional documents and/or marketing materials that staff can use to promote data science events and initiatives. create signage for any data related physical spaces that are developed. promote select data science research projects through a website, newsletter, and other media channels.

library newsletter & library meetings. use internal communication opportunities to keep staff aware of data science initiatives and how the library is engaged with ds @ carolina. highlight staff involved with data science projects, presentations, publications, and grants through our newsletter and at department heads and library all staff meetings.
invite data science partners (students, faculty, etc.) to present and/or co-present with libraries staff to share their experiences around partnering with the library.

communication via research hubs and/or data science spaces. use our current spaces and, assuming university libraries dedicates one or more spaces as data science labs on campus, use the labs to support communication and branding efforts as follows:
• host data-related events for unc affiliates and the public such as lectures, workshops, symposia, annual data day, and hackathons.
• promote partner projects and share tools and technologies developed or used in collaboration between university libraries and campus partners.

recommendation : assessment & reporting

a clear assessment and evaluation program will ensure we are actively meeting the needs of our stakeholders. assessment should include a plan for identifying resources needed for information, technology, and space. evaluation plans should be created to monitor our progress with regard to the recommendations outlined. reporting mechanisms should be created (see communication & branding) to highlight the progress and impact the university libraries is making with ds @ carolina. details on specific recommendations are provided below.

r .a. assessment

assessment and evaluation: establish a robust assessment and evaluation plan for each of the following core areas:
• instruction and curriculum integration
• research support
• priority partnerships
• library spaces
• library computing, including infrastructure
• library skills and services (e.g., data curation, data storage and management, classes and workshops)

assessment and evaluation plans should identify specific metrics for each of these categories, to be measured on a regular basis (e.g., annually).
for example, in the area of curriculum integration, we can identify what topics we teach, how many sessions are course-integrated versus standalone workshops, how many sessions are taught, and the number of students we reach. where possible, for sessions in which we teach alongside faculty, we can include an evaluation component for library-based workshops/sessions that gauges acquisition of skills and knowledge. as part of an assessment and evaluation plan we should establish performance indicators such as those detailed below.

indicators of our success with research engagement may include:
• number of grants that include libraries staff for data science-related support.
• number of high-profile projects where the libraries is identified as a formal partner.
• increase in quantity of collaborations around data science with priority partners on campus.
• number of co-authored publications between libraries staff and faculty/researchers where the libraries’ contributions pertain to data science.
• number of libraries staff with high level of expertise embedded in research projects.
• number of projects with established project charters or memoranda of understanding (mous).
• number of co-funded and joint positions.
• new positions or roles that show evidence of engagement with our partners.
• acknowledgment of libraries in data science articles, dissertations, and other publications.
• opportunities for libraries staff to provide substantive contributions in data science @ carolina symposiums, conferences, or workshops.

assessment & reporting goals
• showcase our expertise and capacity.
• cultivate data science partnerships.
• develop relationships with donors.

indicators of our success around curriculum integration and instructional support may include:
• number of consultations (i.e., one-on-one or class/group) around data science.
  o use of libanalytics should be reviewed to ensure statistics are entered consistently by all staff so that results are meaningful.
  o tracking of online consultations should also be done to reflect support for distance students.
• staff with a high level of expertise are embedded in research projects rather than conducting instruction and : consultations.
• our stakeholders access asynchronous and reusable content around data science (e.g., recorded presentations and tutorials).
• dual appointments for libraries staff.

unc data science needs: the implementation team should assess campus needs around data science periodically through focus groups, surveys, or other methods. this should be done in the near term to prepare for supporting ds @ carolina, again soon after the program is established, and regularly thereafter. the library data science committee collated a list of campus stakeholders to include in future conversations around data science needs; this list may be used as a starting point.

periodically assess skills around data science. use the skills matrix survey (see appendix c) or a similar tool to evaluate data science skills across all university libraries staff. a skills survey can be used to:
• compare available expertise with campus needs to identify gaps that may inform new or modified positions.
• identify staff with expertise who can engage in peer-to-peer training or other methods to reskill existing staff.
• identify new staff as potential candidates for joint-funded positions.

data science spaces assessment team: the libraries research hubs provide several existing spaces for digital research that are well managed and used. once new library services and campus partnerships are better understood, a space assessment team can analyze changes needed for current or new spaces.
evaluate research hubs & potential for data science spaces: given that there will be significant overlap in services and staff between data science and what we currently offer via the research hub, and that we want to minimize confusion for patrons and partners, an evaluation of all hub locations and potential for new data science spaces should be considered. the evaluation should consider overall data science needs and associated services regarding space, resource/staffing and whether the research hub needs to be re-envisioned and/or re-branded.

r .b. reporting

annual overview: as part of our annual report, provide a summary of data science efforts at the libraries that is linked to university libraries strategic priorities.

internal reporting mechanisms: use department heads and all staff meetings to highlight the work of various people and/or teams related to ds @ carolina. encourage staff to conduct lightning rounds, share announcements, and invite partners to our meetings to share how they have collaborated with university libraries around data science.

recommendation : human resources & employee engagement

according to results from a skills survey administered to department heads, we have limited capacity in our organization for some of the data science skills needed to support anticipated needs of ds @ carolina. workforce development should focus on the reskilling necessary to address the skills gap by increasing experience and exposure for existing staff (i.e., figure : tiers , , and ). to bridge gaps in expertise, new hires will be required. expertise (i.e., figure : tier ) requires a degree in a related field or significant time on task. in the near term, expertise can only come from new hires. university libraries’ staff with deep expertise around data science have limited ability to embed in research projects due to heavy commitments for instruction and consultations.
incentives to encourage building skills around data science include linking performance goals to the data science strategic initiative and providing time and funding for professional development opportunities. reskilling will provide opportunities for career growth among current staff. one way to incentivize librarians with data science skills to partner with others on campus in teaching data science is to provide release time from their library duties, allowing them the time needed to teach as adjuncts. details on specific recommendations are provided below.

r .a reskilling

reskilling university libraries’ staff will be a critical element in meeting the demands of ds @ carolina and will require ongoing attention. university libraries should explore opportunities to partner and engage with other campus units around reskilling. the libraries’ co-location with the odum institute and strong relationship with the school of information and library science (sils) are major assets. further, our relationship with sils is unique among the more than institutions reviewed as part of the environmental scan (see appendix f).

human resources goals
• bridge skills gap within university libraries to support data science activities on campus through hiring and retraining.
• incentivize library staff to gain skills relevant to data science needs on campus.
• provide ample professional development opportunities.

tiered approach for data science-based professional development. library staff will be provided opportunities to expand skills relevant to supporting ds @ carolina through participation in a semi-structured progression matrix (figure ). skill development activities will be available for each tier, with staff progressing through tiers until they reach their chosen level of proficiency. skills may be focused around particular content areas such as digital humanities as indicated below.
figure : proposed tiered service pyramid for delivering data science-related services and partnering around instruction and research.

tier : staff with high degree of expertise should:
• partner on grants and research projects.
• engage as adjunct professors.
• perform original research.
• provide high-level consultations.
• provide limited support for instruction.
• design curriculum-integrated instruction.

tier : staff with advanced technical skills should:
• provide back-up on consultations as needed.
• engage as adjunct professors.
• partner on research projects.
• discover and develop code to address internal library needs.
• deliver curriculum-integrated instruction.

tier : staff with introductory technical skills should:
• provide introductory data-related consultations.
• partner with tier staff to provide instruction.
• attain introductory data literacy and technical skills (e.g., data manipulation in excel).

tier : all university libraries staff should:
• triage data science-related inquiries.
• understand that data science is a strategic priority and maintain an understanding of library expertise and services around data science.

tier : all staff will be provided opportunities for awareness raising through participation in introductory activities to learn basic vocabulary and how to triage data science-related requests. opportunities may include:
• internal lightning talks pertaining to data science staff projects.
• selected webinars.
• half-day internal training available to all university libraries staff focused on jargon busting.

tier : subject liaisons will be trained to provide first tier data science support for researchers in their areas of expertise, so they can provide data literacy instruction and data project support. they will refer requests for advanced data science support to data science librarians or partners.
opportunities for subject liaisons may include:
• internal projects: time-limited project to build particular skills (e.g., visualization, impact measurement, r, python); report results to library staff via internal lightning talk.
• short courses developed and offered in partnership with campus partners.
• time on task with support from tier and staff.
• peer-to-peer training series.
• selected courses via linkedin learning.
• partner with and support tier staff in providing instruction.
• half-day internal training available to all university libraries staff focused on introductory data skills in excel.
• campus-wide user groups (e.g., unc tableau users group, excel users of unc yammer group).
• libraries’ communities of practice (e.g., data wrangling).

tier : staff with some experience will continue to develop proficiency through participation in internal library and student research projects and by leading instruction. opportunities may include:
• peer-to-peer training series.
• time on task with tier staff support.
• audit or enroll in courses within data science curriculum as part of work time, where appropriate.
• short courses developed and offered in partnership with campus partners.
• weekly block of time devoted to improving skills, learning new software, or working on research projects (e.g., digital humanities).
• staff learning cohorts and research teams ranging from summer learning groups to research teams working on funded projects. for all projects, goals and plans should be developed with supervisor input and included in annual performance plan documents for evaluation.

tier : staff with expertise will oversee and coordinate staff development through skills mentoring and involving tier staff in functional components of research projects. opportunities to gain expertise may include:
• audit or enroll in courses within data science curriculum as part of work time, where appropriate.
• attend conferences or workshops.
• present at conferences and workshops and/or co-present with subject matter experts. • joint appointments. • develop short courses in partnership with campus partners. • partner on grants and research projects. r .b performance management and professional development incentives are necessary to motivate staff to engage around data science. the recommendations below focus on performance management and professional development opportunities to increase employee engagement. align annual performance goals. as staff attain foundational knowledge around data science and move up the tiered service pyramid (figure ), supervisors should work with staff to create goals that prioritize reskilling and ensure balance between current and future work. specific recommendations include: • make real time changes to performance goals as necessary. • instruct supervisors on performance goals and assessment criteria for new duties related to data science. • rebalance workload for staff providing data science services to reflect library priorities. professional development. provide opportunities for staff to attend and present at data science-related workshops and conferences. specific recommendations include: • host additional library carpentry training events and encourage participation/certification among staff. • allocate the time and funding needed to gain experience and knowledge around data science. • create and facilitate a library-wide data interest group (dig) to be held on a bi-monthly basis to share best practices and discuss current and potential opportunities. • partner with other campus groups to establish and participate in a campus-wide data interest group (dig). • provide staff who have new data science responsibilities additional professional development funds in the first - years of responsibilities.
priority should be given to those presenting on data science-related projects at regional or national conferences. annual library data day: host a regular data science & digital scholarship showcase to highlight staff contributions as well as program growth and overall impact. consider including the opportunity for staff to work in teams on small scale projects and use library data day to encourage cross-unit/cross-library participation and engagement. r .c new hires data science positions: over a two-year period, create new library positions across the library system to support data science efforts, considering the need for diverse educational backgrounds and experience (e.g., informatics, computer science, library & information science, data science). positions should focus on digital humanities, humanities & social sciences, biomedical and the sciences. compensation for current positions being re-formulated or redesigned to a more data-intensive role should be reviewed. tables and below lay out the committee’s recommendations around creating and hiring new positions in two phases. in addition to the full-time positions described below, the committee recommends recruiting at minimum ra or ga positions focused on data science to be situated across the university libraries. the explanations provided below may be used to draft final position descriptions, although adding more details around specific qualifications should be considered before posting announcements. table : hiring recommendations for phase phase hiring position explanation number of positions type: senior administration position anticipated rank: full librarian this senior position would guide and coordinate the libraries’ data science initiative, direct the research agenda, foster new and existing partnerships, and communicate with the library leadership team.
specific responsibilities for this position may include: • guide research agenda and direct r&d. • foster new and existing instructional and research partnerships with schools, divisions, institutes, centers, and industry partners. • liaise with library communication and web development teams to address communication and branding related to library ds efforts. • lead assessment and evaluation of libraries ds engagement. • communicate with libraries leadership as needed. • ensure coordination across staff and individual libraries. • coordinate efforts around staff reskilling and creating data literacy competencies. type: data science librarians in areas of social sciences, humanities, natural sciences, and health sciences (cluster hire) anticipated rank: up to associate librarian we recommend that resources are allocated to support developing our data science staff across the library system in social sciences & humanities, natural sciences, and health sciences. most researchers do not learn the computational tools and data management skills they will need to excel in today’s data-driven research environment. data science librarians provide advice and consultation to researchers on how to make the most of their data. overall, the responsibility of a data science librarian involves data services such as assisting patrons with locating, acquiring, preparing, and managing data. specific responsibilities for these positions may include: • discuss data problems with researchers, assess data needs, and design a suite of services based on user needs. • develop and lead instructional sessions and outreach focused on management, interpretation, analysis, and visualization of varying forms of data. • assist patrons in assessing repository options for data archiving and preservation.
• provide consultation and recommendations regarding the ethical issues of data use and reuse. • develop and deliver data analysis services in response to current trends, campus needs, and library priorities. • use advanced skills with data cleaning/wrangling/normalization, regular expressions, web scraping, and apis to support and collaborate with researchers on data-related research. • identify, evaluate, and recommend new and emerging data science research tools and methods for the libraries and unc research community. • continue to develop technical skills and proficiency with new data analysis tools and techniques. • advise and recommend appropriate hardware and software to support data science work. • develop and offer stand-alone workshops (in person and online) open to libraries staff, students, faculty, and researchers. instruction would focus on the use of data science methods and tools (e.g., r, python, tableau, geospatial applications in consultation with gis librarian). • collaborate on research projects using quantitative and qualitative datasets. • partner with researchers to identify and establish open science practices and policies, best practices in data management, and e-notebooks for lab environments. • participate in data research projects across disciplines. type: data science instruction librarian anticipated rank: up to associate librarian this position (and associated position hired in phase ) would coordinate instruction related to data science, including data ethics. these positions would partner with liaisons to develop
specific responsibilities of this position may include: • coordinate instruction related to data science including data ethics. • partner with liaisons to develop and deliver data literacy instruction. • develop data literacy modules to integrate into courses taught by liaisons. • participate in efforts to coordinate reskilling of libraries staff under direction of senior libraries leadership. • advise and consult on pedagogy, instructional design, and curriculum-integrated instruction. • work with liaisons to develop new course- and data-related instructional resources. phase : new data science positions table : hiring recommendations for phase phase hiring position details number of positions type: humanities data science librarian (r&d) anticipated rank: up to associate librarian this position would focus on research and development of data sets that the libraries can create and manage from our own collections, e.g., text mining data sets of oral history transcripts. specific responsibilities for this position may include: • serve as technical lead on data science projects started by the libraries. • participate in digital south program. • meet the needs of library stakeholders by: o preparing library metadata and collections to be machine-readable. o creating collections (as data) from our digital collections and developing tools (e.g., apis) for researchers to use to interface with those collections (e.g., nc newspapers and transcripts from oral histories). • spearhead research efforts focused on using ai to engage more deeply with existing collections. for example: use ocr to identify (handwritten) names of the enslaved from documents in our collections.
• collaborate on the application of computational methods to assess and improve the management and discoverability of our digital collections, including extensive av holdings. type: data curation librarian anticipated rank: up to associate librarian this is an existing gap in our organization and essential for supporting the libraries’ preservation pillar. expanded data services and data archiving will require a dedicated specialist who can help bridge gaps in organizational infrastructure between the digital preservation stewardship committee, library it, and liaison librarians. this position will also contribute to technical infrastructure development and maintenance related to repository services. specific responsibilities for this position may include: • collaborate with institutional repository (ir) librarian to assist unc community in depositing datasets in the carolina digital repository (cdr). provide support for the ir program's maintenance and ongoing development of policies, procedures, and documentation relevant to dataset collections and open access. • contribute to aligning curation activities in collaboration with ir librarian such as: facilitating dataset preparation, ingesting datasets to the repository, creating metadata, assessing rights and rights statements, understanding reuse issues and needs • align collecting and program development and facilitate curation activities with fair data principles. • collaborate on overall digital preservation activities with the digital preservation stewardship committee, repository services, university archives, and software development. • collaborate with liaison and data science librarians around outreach for data archiving in the cdr or other data repositories, use of datasets available in the cdr, and other collections as data projects. • point person for connecting and networking with other data repositories and data curation archivists/librarians (such as odum). 
type: data science instruction librarian anticipated rank: up to assistant librarian this position would support the data science instruction librarian (hired in phase ). specific responsibilities of this position may include: • coordinate instruction related to data science including data ethics. • partner with liaison librarians to connect data literacy to other core literacies. • develop data literacy modules to integrate into courses taught by liaisons. • participate in efforts to reskill libraries staff. • advise and consult on pedagogy, instructional design, and curriculum-integrated instruction. • work with liaisons to develop new course- and data-related instructional resources. phase : new data science positions recommendation : library data science priorities and partnerships university libraries has a variety of campus partners and should develop a strategic approach to identify and engage with priority partners around data science, including partnering with external groups such as industry, jstor, hathitrust, etc. joint or co-funded positions will elevate the libraries and will increase opportunities for partnership and high-level collaboration. individuals in concierge roles will foster relationship building and communication. details on specific recommendations are provided below. r .a develop partnerships library priorities and partnerships: the implementation team should develop a phased approach for engaging campus partners around data science in terms of curriculum integration and research. priorities and partnerships should be reevaluated periodically. the table below lays out some potential opportunities for the libraries to partner with other groups on campus. partnership goals • cultivate data science partnerships around research and curriculum integration.
• transform existing services into an immersive model where library staff are integrated into research projects. data literacy curriculum (curriculum integration): build on our success in digital and information literacy to formalize a program that supports data literacy in the undergraduate curriculum. potential outcomes • integrate the libraries’ course support and programming with the ds @ carolina curriculum as it develops. • create ‘ready-to-customize’ data modules for data literacy in ideas, information, and inquiry (triple i) courses and others. • data science workshops/short courses; working with partners to provide and organize tutors and/or mentors. • provide workshops and curriculum modules drawing on our expertise in ethical use of data. • develop a program that encourages students underrepresented in stem disciplines to become data scientists. establish summer workshops and short courses to support students interested in pursuing data science, particularly those students lacking prerequisites. • collaborative workshops and seminars, possibly modeled on events such as brown’s dscov: data science, computing, and visualization workshops, and academy in context seminars. integration in graduate curriculum (curriculum integration): work with campus partners to expand teaching collaborations. provide ra, ga, or carolina academic library associate (cala) position(s) devoted to data science work in the libraries. potential outcomes • expand current teaching collaborations, specifically with regards to key concepts of library science that overlap with data science (i.e., data description and visualization, workflow and reproducibility, and ethical problem solving). • expand and make more structured our graduate research assistantships, potentially along the lines of the cala program.
if these (or a subset of these) were to focus on data science, we could train students to scale our work of supporting faculty with integrating data into the classroom. this could follow the model at berkeley (i.e., data science education program student teams/peer advising), virginia tech’s databridge, or emory’s center for digital scholarship which employs more than graduate students. during the summer, there may be opportunity to shift the focus of these efforts to supporting students traditionally underrepresented in stem who need to catch up on prerequisites (calc , , , and linear algebra, likely to be required for the data science major). • seek a dual appointment (co-funded position) for a data science librarian (libraries and sils). partnerships with health affairs (curriculum and research integration): develop partnerships to ensure data literacy skills are integrated at the point of need. potential outcomes • review sph & pharmacy curricula maps to identify where data science is being taught and meet with curriculum committees and/or specific faculty to identify ds workshops and/or modules that can be integrated to support students. • meet with research deans to identify data science needs related to research inquiry (e.g., use of tools to support systematic reviews, knowledge management, etc.) and offer opportunities for data science consultations. • work with global women’s health (gwh) team to identify ways to integrate data expertise into the som/sph study which will include development of algorithms, systematic reviews, and metadata application. • provide expert searching and data analysis in support of research projects, systematic reviews, bibliometrics, and patient care. addressing data science with library collections (research integration): provide purchased and curated discipline-specific data sets to support a wide variety of researchers.
potential outcomes • ‘ready-to-use’ data sets available for researchers, with documentation for each data set to orient researchers for their use, and learning modules featuring our collections as data (cad). • data purchasing program for carolina researchers. • partner with others on campus to build data literacy modules for the data literacy component of triple i courses. • establish a formal program for purchasing data sets including a method to solicit requests from researchers using the existing workflow for purchasing data sets. begin the ordering and license negotiation process early in the fiscal year. • create collections as data (cad) for humanists. advertise the data broadly and create learning modules featuring our cad for courses. • work with research computing and the rest of information technology services (its) to make cad "compute proximate" in the cloud with appropriate identity management in place. coordinate digital humanities initiatives (research integration): center the libraries as a unifying force for the many disparate digital humanities initiatives, experiments, and labs that pop up on campus. potential outcomes • partner with digital humanities initiatives on campus (e.g., provide instruction to faculty groups to raise awareness of data science tools, methods, and resources on campus; this idea is based on a suggestion from uc berkeley). • framework to guide and encourage staff research projects. • use the digital south initiative as an incubator. • provide space to potentially relocate one or more digital humanities labs on campus. • create analog to data (collections as data) workflow to facilitate the creation of data for humanists and formalize a service model for this work. this would build and expand on the collections as data work currently being done with the onelibrary team spanning drs, special collections, and l&it.
• expand collaboration between digital research services (drs) and software development to provide a more robust set of skills on digital humanities research teams and grant funded projects. these teams may be composed entirely of libraries staff or be composed of a mix of scholars from the libraries and elsewhere. this collaboration would expand on the expertise currently available from digital research services to include deeper expertise in machine learning, database creation, and developer consulting/development time. • establish an implementation team for the digital humanities recommendation, comprised of an expanded version of the special collections digital scholarship working group, to include additional staff members from software development, repository services, and drs. the digital south initiative should serve as an incubator for this recommendation, allowing librarians the space and time to engage in digital scholarship to support the initiative. collaborate with scholars on research projects (research integration): increase the libraries’ participation on research projects. potential outcomes • expand our use of project charters. • formalize a framework for including libraries’ staff as collaborators on research projects so we can respond to collaboration requests. • increase awareness and understanding between the libraries, odum institute, and renci to increase collaborations and facilitate referrals for consultation and research collaborations. as ds @ carolina emerges, we anticipate a phased approach in engaging stakeholders, where additional stakeholders are considered as new information is available from campus efforts. 
a preliminary list of stakeholders identified by the committee as potential strategic partners for the libraries around data science follows: • center for faculty excellence (cfe) • college of arts & sciences • data carpentries program • digital humanities (various groups across campus) • graduate and professional schools • institute of african american research (iaar) • information technology services (its) • oasis • odum institute • office of research • office of undergraduate research (our) • renaissance computing initiative (renci) • school of information and library science (sils) the committee identified the following potential partnership opportunities around instruction, research, and other ways to collaborate. opportunities around instruction: • host datasets for use by courses to allow students to interact with data. • create data literacy curriculum for triple i courses and supporting digital humanities projects. • increase focus on curriculum integration. • develop a data literacy curriculum. • develop workshops to help faculty integrate data into their courses. • expand teaching collaborations. • expand/formalize the use of graduate research consultants for implementing data modules into courses. • introduce a data literacy component to the information literacy program for the moore undergraduate research apprentice program (murap). • strengthen efforts to advance the use of data science methods in the humanities. • support faculty with integrating data modules into courses and ensure student success. • train humanities students interested in data science. opportunities to support research: • support research reproducibility and open science. • expand support for research reproducibility and open science. • collaborate on clinical research data projects. • develop toolkits (e.g., for grant support).
• expand support for researchers to ensure research data comply with fair data principles (i.e., data are findable, accessible, interoperable, and reusable). • establish a data purchasing program for carolina researchers. • investigate additional research collaboration opportunities. • promote research computing and data storage resources to facilitate researcher use. • advance the use of primary research using advanced text mining, database construction, and government data. • support graduate students throughout the research cycle. • train graduate students in research reproducibility, research data management, and open science. • expand efforts around research analytics, orcid, and data management. other opportunities to form partnerships: • expand graduate research assistantships and cala programs to support data science. • partner on efforts to promote diversity in data science (listed as a priority in their strategic plan). • partner on training opportunities. • reskill university libraries’ staff. • work together to develop a faculty learning community for data literacy. • provide more joint consultations and collaborate on research teams. • increase awareness among campus partners to foster collaboration and facilitate referrals. • establish joint appointments. • partner and collaborate to make well-structured, large library data sets available to researchers. • partner to make collections as data (cad) available for humanists. as ds @ carolina matures, it will be important for the libraries to remain agile and phase in additional partners, understanding that the role we have with these partners may differ in terms of our level of integration. all partners selected should be cultivated with specific goals in mind and a clear idea as to what extent we wish to sustain, grow, or modify our level of partnership.
when engaging these partnerships, the aim should be to raise awareness of our abilities and skills around data science, maintain communication, and find opportunities to increase our efficiency. the committee developed a survey for assessing partner and stakeholder needs around data science (appendix d) that can be used as the libraries progress with the ds framework. further, a stakeholder matrix (appendix e) lays out needs by stakeholder type and level of expertise that could be utilized when determining in what ways the libraries can collaborate. the following is not an exhaustive list, but offers some examples to consider: • carolina population center • school of media and journalism • department of philosophy • law school • center for information technology and public life (citap) • nc translational and clinical sciences institute (nc tracs) • unc medical center • lineberger cancer center r .b communication with partners establish data science concierge roles: individuals should be designated as data science concierges to formalize collaboration and communication channels and ensure the libraries are kept informed regarding data science activities around the libraries and more broadly on campus. similar to how liaisons are embedded or have primary areas of connection, the data science concierges could help facilitate communication between libraries data science efforts and partner needs and opportunities. formalize mechanisms for collaboration: formalize use of project-based mous or other mechanisms to set expectations and timelines with campus partners. if applicable, individuals in data science concierge roles may be responsible for overseeing mous, including maintenance and process. r .c joint or co-funded positions joint-funded positions with campus partners: use existing models of joint positions (e.g., sils-chip/hsl appointment) to form new opportunities for co-funded positions.
expand and improve the existing models of adjunct professors to support departments/schools in their effort to teach specialized data science skills. library staff teach courses regularly (e.g., department of geography, sils) but incentives could be improved to expand these efforts. currently, the stipend to teach is low, time investment is high, and teaching must be done outside of work. data science librarians could be beneficial as sils and other units adapt to increased demand for enrollments in data science courses. data science ras or gas could assist with these efforts. potential partners include departments in the college of arts & sciences, new data science initiative/school, sils, odum institute, or school of medicine. recommendation : creating and expanding services with increased use of data science techniques, tools, and approaches to instruction and research, it is necessary for the libraries to build capacity for select services and consider creation of new services. service creation should be informed by a formal assessment with campus stakeholders. r .a services to expand the university libraries currently offers many data related services that are generally at capacity. further, there is low redundancy in staff skills, making the services vulnerable to staff turnover. reskilling and adding more staff is necessary to provide stability and additional capacity. to meet expected demand around these services, significant investment and cooperation with campus partners will be needed. expanding services: the following services are currently offered by university libraries and we anticipate that ds @ carolina will increase demand in all instances. notably, we expect a significant increase in demand around predictive analytics using ai and machine learning, text analytics, data mining (particularly on library resources), and providing datasets for humanists from digitized primary source materials. 
• data sourcing and acquisition: locate and acquire appropriate data from external sources to help refine or answer research questions. this includes licensing, purchasing, describing, and processing data sets to make them available to researchers. • data creation: creation of data sets from a variety of sources including web scraping, data gathering via apis, and structured data derived from digitized primary source materials. • data cleaning and preparation: conduct all necessary transformations including merging, reshaping, or other reformatting to make analysis possible. new & expanded services goals • meet carolina’s needs around data science. • cultivate data science partnerships. • analysis (ai and machine learning for predictive analytics): use machine learning techniques to predict outcomes/events, classify data, and identify data/literature sources (e.g., for a systematic review). • analysis (gis): use specialized tools and methods to work with geospatially referenced data (e.g., geovisualization, spatial analysis, and spatial statistics). • analysis (impact measurement and visualization): discover scope and pattern of research collaborations; measure and assess research impact by discipline area; communicate research impact to audiences such as funders or promotion and tenure committees. • analysis (inference statistics): use of statistical methods to infer characteristics of a population or process. • analysis (text analytics): use specialized tools and methods to derive meaningful information from unstructured text. • analysis (network analysis): identify, measure, and visually represent relationships between groups of entities (people, terms, objects, etc.). • visualization and other data presentation: discover data insights and communicate findings through techniques that employ our innate ability to distinguish visual patterns in our environment.
• data preservation and archiving: using institutional repositories and other data storage systems to arrange, describe, and protect the provenance of data while preserving its integrity and making it available for reuse. • data management: organize and manage data during research; data documentation (e.g., metadata, file formats, naming conventions, file organization); version code or files (e.g., git). • data science instruction: use of tools and methods (e.g., r, python, visualization, gis/mapping). integration of data modules into existing courses. • open science: improve research by making processes and products open and accessible. • fair data principles: ensure research data are findable, accessible, interoperable, and reusable. r .a new services to create as the libraries grow to support data science, additional services will be necessary to support computing, open science/research reproducibility, and other types of analysis (e.g., image analysis, genome-wide association studies (gwas), meta-analyses, and music). further, the libraries will need to establish a data acquisition strategy and increase instruction around data literacy and data ethics. these services are currently not offered or only offered at a minimal level. to meet expected demand around these services, significant investment and partnership with campus partners will be needed. computing support for data science research: in partnership with information technology services (its), university libraries should explore services around computing, including research computing infrastructure such as specialized platforms to support processor-, memory-, or storage-intensive computing (high performance computing, parallel/distributed processing); data processing pipelines; and server operations/management.
reproducibility: provide support around documenting code with metadata to enable scientific replication or reproduction. this includes tools and methods for sharing code alongside data, with necessary information about the compute environment, codebooks, etc.

instruction around data ethics: instruction efforts should be expanded to support data literacy more broadly to increase capacity for integrating data modules into existing courses. instruction around ethical uses of data should give increased attention to the following topics:
• human subjects
• privacy/confidentiality
  o biometrics as personally identifiable information, hipaa, etc.
• algorithmic bias
• copyright (e.g., legality of web scraping)
• social impact of data

library values including diversity, equity, inclusion, and accessibility should be central to our approach for instruction in this area. potential partners for developing instruction around data ethics include the law school, center for information technology and public life (citap), hussman school of media & journalism, and department of philosophy.

establish a data acquisition strategy: identify what role (and budget) the libraries will have in acquiring data sets for researchers and teaching. once established, augment and highlight existing efforts to purchase data for researchers – acquire, host, prepare, manage, and maintain specialized research data sets and ensure that the libraries are involved in using data sets via curricular and research areas. the committee recommends reviewing models such as the university of michigan data acquisition for data science (dads) program and university of rochester river campus libraries datasets and data purchase program as opportunities for further investigation.
other services to consider adding: the libraries should explore additional services to offer including:
• support for meta-analysis
• image analysis
• data curation
• consolidated / automated referral system for common needs
• biomedical data science
• fast healthcare interoperability resources (fhir) standard for clinical data
• citizen science initiative
• biomedical-specific network analysis / visualization tools (e.g., cytoscape, etc.)
• gene expression analysis tools
• open science framework
• data support services directory

https://arc.umich.edu/dads/
http://libguides.lib.rochester.edu/c.php?g= &p=

recommendation : space & infrastructure

much of the high-performance computing infrastructure required by data scientists is extraordinarily complex and expensive. the libraries’ technical infrastructure review team should focus on adding value to existing infrastructure in ways that leverage our core strengths. library spaces have been identified as a way to encourage use of services and facilitate collaboration. our spaces will need to be examined to determine what we need to create, expand, or modify to support the initiative.

r .a library space and ds @ carolina

create formal space use policy. a policy should be developed for library users who are unaffiliated with the data science program to use the spaces and infrastructure that we develop for this initiative. these policies should be as permissive as possible.

r .b technical infrastructure

technical infrastructure review team. a team should be created to review the libraries’ current technical infrastructure for strengths and weaknesses related to implementation goals. this team would identify potential campus and external partners and define services related to our technical infrastructure and developer services.

seek infrastructure partnerships.
the libraries’ technical infrastructure review team should pursue partnerships around aggregation of related infrastructure providers, licensing, training, documentation, preservation, and the development of tools and practices to better integrate commercial infrastructure in the research lifecycle. the libraries should explore potential partnerships with campus information technology services (its) units, including the research computing unit, to develop and manage service layers for campus research infrastructure.

technical consultation services program. the libraries should develop a business model for providing specialized technical services based on internally-focused staff expertise, such as software development, digital preservation, digitization, and data management.

tiered use policy for libraries’ technical infrastructure. to support the development of expanded and new services for data science, the libraries will need to offer expanded and new technological services internally to library staff and possibly to the campus community. this policy should account for what technical infrastructure is available:
• for all unc affiliates.
• for all unc affiliates for a fee (see cost recovery recommendation).
• for library staff or projects in which library staff are documented partners.

space & infrastructure goals
• cultivate community, interdisciplinarity, and catalyze new partnerships around data sciences.
• establish a group to review the libraries’ current technical infrastructure.
• create space use policies.

the policy will also need to account for long-term management of projects that use libraries’ infrastructure. the policy should outline standard timelines for maintenance and project sunsetting as well as project charter creation. every project should include an agreement that details the agreed upon maintenance plan.

business plan to manage infrastructure costs.
to enable widespread use of our research infrastructure, the libraries must develop methods to control costs for infrastructure that grow in proportion to use.

appendix a: executive summary of institutional interviews

over several months in fall , a subset of the library data science committee conducted interviews with library counterparts at several exemplar institutions, including mit, nyu, brown, university of wisconsin madison, uva, and uc berkeley. a high-level summary of these conversations is found below. (see appendix b for guiding questions used during interviews.)

mous: the individuals we spoke to were not aware of mous or other tools being used to formalize project-based partnerships or joint appointments. only one instance of a joint appointment was noted.

staff dedicated to data science: the number of staff dedicated to data science at each library ranged from - individuals.
• several institutions indicated they are hiring or will be soon including non-mls positions.

partnerships: at the institutions we spoke to, strong partnerships exist between the library and:
• clinical & translational sciences institute (ctsi)
• central it data science group
• research computing
• graduate education
• vice chancellor for research and graduate education
• office of general counsel (i.e., looking at institutional exposure and risk)
• office of sponsored programs

evaluating data science services: most individuals we spoke to indicated they would like to do more in terms of evaluating their data science services including evaluating longer term impacts.
• one institution noted that half of their research consultations have a data science focus and their data science course offerings fill up within hours.
• google analytics is used to track statistics where applicable.
• annual surveys and qualitative assessments are used to evaluate services.
• libanalytics is no longer being used by one institution; they are currently considering new methods to evaluate impact.

types of services offered: services varied by institution. the following focus areas were identified during interviews:
• partnering within the research lifecycle versus services that support the research lifecycle. for example, development of data management plans is not prioritized.
• participation in the carpentries community.
• preservation, data management, ethics of data use, and data management program reviews.
• digital scholarship workshops and ‘toolkits’.
• gis bootcamp.
• web scraping and text analysis.

effect of data science programs on library: generally, libraries reported that data science initiatives at their campus have resulted in increased need for library data science services around:
• big data
• data curation
• access, storage, backup, and management of large datasets
• advanced text analysis
• collaboration around data science instruction
• workshops, tutorials, and consultations (a recent data-related workshop at one institution drew attendees; this high turnout and interest has been sustained.)

space and infrastructure: unsurprisingly, there was variation among the institutions in how the libraries provide space for data science initiatives. none of the individuals we spoke to indicated that they had exactly what they needed or wanted in terms of space for data science activities. examples of library spaces that support data science include:
• digital studio and collaboration areas with video wall; some computers with special software.
• “research commons” in the library that offers data science and other services; reservable, medium-speed computers.
• data science service desk located in the health sciences library that is in a great central location on campus.
other notable information

the data science program at one institution is collaborating with their library to develop a fellows program. fellows will earn a data science certificate. multiple institutions use jupyter notebooks for undergraduate courses. some data sets are bought by the library; some libraries create data sets; one institution indicated they are working with web of science to get access to raw data.

mit has invested in and prioritized their liaisons to acquire data skills through:
• providing institutional membership with open science network.
• establishing time-limited, team-based projects related to data, including data visualization, text and data mining, and social justice issues related to data management.
  o teams collectively identified learning projects through voting. groups of - people devote hours over -month period.
  o goal is to position liaisons to take on a greater role with data services. team-based learning has been effective.
  o program concluded with an opportunity for participants to share their results with all library staff.
  o examples included:
    ▪ digging into federal health statistics to answer specific questions.
    ▪ generating maps.
    ▪ research to identify researchers using machine learning outside of the computer science program (e.g., ml in chemistry department).
• providing space and mandate for staff to experiment with data and new skills.
• focusing learning directly on service areas (versus learning for learning’s sake).
• removing boundaries between data work and traditional liaison roles.

appendix b: interview questions for library counterparts at exemplar institutions

purpose: understand how exemplar institutions have partnered with their data science programs.
audience: library counterparts at nine exemplar institutions (see appendix f – environmental scan)
method: interviews via teleconference

the following questions were posed to one or more library champions at each institution for consideration.

interview questions

what partnerships exist between the library and other units on campus around data science?
follow-up where applicable: how did you go about cultivating the partnerships? do you have any formal structures or frameworks that you use, such as joint appointments or mous?
follow-up if answer is no: what challenges have prevented this? (capacity, etc.)

have you noticed any changes in the demand for library services due to the data science program/initiative at your institution?
follow-up: if the library can’t provide a data science service that is requested, who do you refer people to?

how did you determine the ways in which the library could be integrated in your institution’s data science program (e.g., needs assessment, focus groups, surveys, data gathering, interviews/conversations, etc.)?
follow-up where applicable: would you be willing to share some of your findings?

how many of your library's staff are affiliated with services that support data science work?

did partnering with data science efforts require structural changes to your library org chart? if so, what new positions were created? how were existing job descriptions rewritten?
follow-up where applicable: did departments or specific staff stop doing some types of work or services to make room for data science? if so, how did you communicate and manage that change?

what kinds of skills were necessary for your staff to have to meet the needs of data science efforts at your institution?

what methods does your library use to evaluate your partnership with data science efforts?

how would you describe your institution’s readiness for change in this area when you started the program?

did you have to rebrand existing services?
how did staff and community/users react to the change?

did data science efforts on campus require the library to change (or add onto) its technical infrastructure?
follow-up where applicable: describe the process for this change.

did you establish any physical spaces dedicated specifically to supporting this initiative? did you partner with any other organizations to design or manage these spaces?
follow-up where applicable: in terms of spaces and services, have you found any areas of misalignment between what you planned and actual use?

appendix c: skills matrix survey

the survey below is an example of how to evaluate existing experience and proficiency in skills associated with the practice of data science. the directions and survey were designed for response from department heads. sample output as a heatmap follows the survey and can be used to assess gaps in expertise quickly.

directions

complete the skills survey below for staff in your department. refer to the explanations in this table when completing the survey.

category | explanation
some experience | staff with some experience may have either deep expertise in a very limited application of a certain topic or software, or limited or intermittent general experience and the ability to complete tasks with occasional help.
expertise | staff with expertise have significant experience with a method or software and the ability to quickly learn new tasks or help others troubleshoot.
comments | provide comments or additional information as desired. you may wish to note if staff with experience or expertise are primarily public-facing or work on internal library projects only.
total | indicate the total number of individuals with some experience or expertise in this cell. it is unlikely that the total number of individuals will be the total of this column (i.e., in many cases one individual will have experience in several areas or software tools).
other | may include: d printing; arcgis or qgis; gephi; github; google earth; javascript; jupyter notebooks; linux: shell scripting/bash; nvivo; solr; stata/spss/sas; timeline js; vosviewer; wordpress; other automation tools (specify)

survey

skill or software | description | number of university libraries staff with some experience or exposure in this area | number of university libraries staff with expertise in this area | comments
data sourcing | staff who can find and compile appropriate data from external sources to help refine or answer the research question. | | |
data cleaning and preparation | staff who can conduct all necessary transformations including merging, reshaping, or other reformatting to make analysis possible. | | |
analysis: ai and machine learning for predictive analytics | staff who can use machine learning approaches to predict outcomes/events, classify data, identify data/literature sources (e.g., for a systematic review). | | |
analysis: gis | staff who use specialized tools and methods to work with geospatially referenced data (e.g., coordinate systems, shapefiles and geodatabases, spatial statistics). | | |
analysis: impact measurement and visualization | staff who can examine the scope and pattern of research collaborations; measure and assess research impact by discipline area; communicate research impact to audiences such as funders or promotion and tenure committees. | | |
analysis: inference statistics | staff who can use statistical methods to infer characteristics of a population or process. | | |
analysis: text analytics | staff who use specialized tools and methods to derive meaningful information from unstructured text. | | |
analysis: other cases and data types | staff with experience or expertise in other data types or analysis that requires specialized software or methods (e.g., image analysis, genome-wide association studies (gwas), meta-analyses, music). | | |
analysis: network analysis | staff who can identify, measure, and visually represent relationships between groups of entities (people, terms, objects, etc.). | | |
visualization and other data presentation | staff who discover data insights and communicate findings through techniques that employ our innate ability to distinguish visual patterns in our environment. | | |
data preservation and archiving | staff who can assist with using institutional repositories and other data storage systems to arrange, describe, and protect the provenance of data while preserving its integrity and making it available for reuse. | | |
reproducibility | staff who can document code with metadata to enable scientific replication or reproduction. tools and methods for sharing code alongside data, with necessary information about compute environment, codebooks, etc. | | |
data management | staff who organize and manage data during research; data documentation (e.g., metadata, file format, naming conventions, file organization); version code or files (e.g., git). | | |
computing | staff with experience or expertise around research computing infrastructure including specialized platforms to support processor-, memory-, or storage-intensive computing (high performance computing, parallel/distributed processing); data processing pipelines; server operations/management | | |
integrating data literacy into curricula | for example: identifying learning objectives; creating a rubric for evaluation; updating syllabi; preparing instructional materials; and ethics, including: privacy/confidentiality; algorithm bias; copyright (e.g., legality of web scraping); social impact of data | | |
r | open-source programming software emphasizing statistical analysis | | |
python | open-source programming software | | |
sql | relational database management system | | |
apis | application programming interface | | |
tableau | interactive data visualization software | | |
advanced excel | examples: formulas; pivot tables, power query, etc. | | |
other | includes: d design; github; javascript; library carpentries instructor; linux: shell scripting/bash; ms access; oxygen; sharepoint; spss; vosviewer; web management; wordpress; xslt and xquery; library carpentries | | |
total | indicate the total number of individuals in this column (i.e., total number of individuals will likely be less than the total of the column as one individual may have experience or expertise with multiple skills or software). | | |

sample output

heat map with sample output for survey to assess skills of library staff. cells shaded green indicate areas with significant experience or expertise and cells shaded red indicate areas where gaps exist.
note: what constitutes high capacity will vary by institution; the number of individuals with experience or expertise in each area may also be added to heat map.

skill or software | number of university libraries staff with some experience or exposure in this area | number of university libraries staff with expertise in this area
data sourcing: compile appropriate data from external sources to help refine or answer the research question. | |
data cleaning and preparation: conduct all necessary transformations including merging, reshaping, or other reformatting to make analysis possible. | |
analysis (ai and machine learning for predictive analytics): use machine learning approaches to predict outcomes/events, classify data, identify data/literature sources (e.g., for a systematic review). | |
analysis (network analysis): identify, measure, and visually represent relationships between groups of entities (people, terms, objects, etc.). | |
visualization and other data presentation: discover data insights and communicate findings through techniques that employ our innate ability to distinguish visual patterns in our environment. | |
data preservation and archiving: assist with using institutional repositories and other data storage systems to arrange, describe, and protect the provenance of data while preserving its integrity and making it available for reuse. | |
reproducibility: document code with metadata to enable scientific replication or reproduction. tools and methods for sharing code alongside data, with necessary information about compute environment, codebooks, etc. | |
data management: organize and manage data during research; data documentation (e.g., metadata, file format, naming conventions, file organization); version code or files (e.g., git). | |
other (e.g., d design; github; javascript; library carpentries instructor; linux: shell scripting/bash; ms access; oxygen; sharepoint; spss; vosviewer; web management; wordpress; xslt and xquery; library carpentries) | |

appendix d: survey for unc partners and stakeholders

purpose: understand unc’s research and curricular needs around data science.
audience: campus partners & stakeholders.
method: e-mail invite to qualtrics survey for approximately key unc stakeholders/partners.

question : do you have current or anticipated needs around data science for research? if yes → answer questions - and questions -
question : do you have current or future needs around data science for instruction or curricula? if yes → answer questions -

if yes to question (research focus) →
question : what are your data science needs around research?
question : where do you go on or off campus for these services?
question : to what extent are the following services important for your current and/or anticipated research needs?

needs around research | not important | moderately important | extremely important | na | comment
data sourcing: find and compile appropriate data from external sources to help refine or answer the research question. | | | | |
data cleaning and preparation: conduct all necessary transformations including merging, or other reformatting to make analysis possible. | | | | |
analysis: ai and machine learning for predictive analytics: use machine learning techniques to predict outcomes/events, classify data, identify data/literature sources (e.g., for a systematic review). | | | | |
analysis: gis: use specialized tools and methods to work with geospatially referenced data (e.g., coordinate systems, shapefiles and geodatabases, spatial statistics). | | | | |
analysis: impact measurement and visualization: discover scope and pattern of research collaborations; measure and assess research impact by discipline area; communicate research impact to audiences such as funders or promotion and tenure committees. | | | | |
analysis: inference statistics: use of statistical methods to infer characteristics of a population or process. | | | | |
analysis: text analytics: use specialized tools and methods to derive meaningful information from sources of unstructured text. | | | | |
analysis: other cases and data types: any other data type or analysis that requires specialized software or methods (e.g., image analysis, genome-wide association studies (gwas), meta-analyses, music). | | | | |
analysis: network analysis: identify, measure, and visually represent relationships between groups of entities (people, terms, objects, etc.). | | | | |
visualization and other data presentation: discover data insights and communicate findings through techniques that employ our innate ability to distinguish visual patterns in our environment. | | | | |
data preservation and archiving: using institutional repositories and other data storage systems to arrange, describe, and protect the provenance of data while preserving its integrity and making it available for reuse. | | | | |
reproducibility: document code with metadata to enable scientific replication or reproduction. tools and methods for sharing code alongside data, with necessary information about compute environment, codebooks, etc. | | | | |
data management: organize and manage data during research; data documentation (e.g., metadata, file format, naming conventions, file organization); version code or files (e.g., git). | | | | |
computing: research computing infrastructure including specialized platforms to support processor-, memory-, or storage-intensive computing (high performance computing, parallel/distributed processing); data processing pipelines; server operations/management. | | | | |
data creation: creation of data sets from a variety of sources including web scraping, data gathering via apis, structured data derived from digitized primary source materials. | | | | |
data acquisition: licensing/purchasing, describing, and making accessible data sets from vendors. | | | | |
other (specify) | | | | |
other (specify) | | | | |
other (specify) | | | | |

if yes to question (curricula focus) →
question : to what extent are the following topics important to your current or anticipated needs around curriculum?

needs around instruction | not important | moderately important | extremely important | na | comment
ethics: human subjects | | | | |
ethics: privacy/confidentiality | | | | |
ethics: algorithmic bias | | | | |
ethics: copyright (e.g., legality of web scraping) | | | | |
ethics: social impact of data | | | | |
training on specific tools | | | | |
training around information literacy | | | | |
training on terminology (i.e., jargon busting) | | | | |
how to find data | | | | |
how to share data | | | | |
transparency & reproducibility | | | | |
communicating uncertainty | | | | |
accurately communicating results | | | | |
curriculum integration: identifying learning objectives around data literacy | | | | |
curriculum integration: creating a rubric for evaluation around data literacy | | | | |
curriculum integration: updating syllabi around data literacy | | | | |
curriculum integration: preparing instructional materials for data literacy | | | | |
curriculum integration: developing context (teaching wise use of tools; matching assignments appropriately with level of students; defining scope; defining the research question; identifying questions that are answerable with existing data) | | | | |
consultations with library staff around data literacy | | | | |
other (specify) | | | | |
other (specify) | | | | |
other (specify) | | | | |
other (specify) | | | | |
other (specify) | | | | |

question : do you have additional comments you would like to share?
question : would you like to speak with us further? [add contact here].

appendix e: stakeholder matrix

the stakeholder matrix summarizes types of unc stakeholders and their potential needs by level of data science involvement. needs would be met primarily through workshops, open labs, consultations, instruction, events (e.g., speaker series; data day event to share research), staff expertise, and library resources (e.g., purchased datasets).

patron type: faculty & post-doctorates (researchers)
level of data science involvement: data literate; data user; data-intensive/data science faculty
potential needs:
• awareness of resources & expertise available from the libraries.
• identifying potential partners/collaborators (including colleagues from other disciplines and library partners).
• librarian consultations for projects requiring data expertise.
• physical space.
• support preparing data management plans.
• support creating documentation.
• computing resources/infrastructure.
• purchased datasets.
• assistance overcoming methodological hurdles.
• assistance with publications.
• application of open research methods.
• long-term storage and preservation.
• handling sensitive data and de-identification.

patron type: faculty (instructors)
level of data science involvement: non-data focused course instructor; data science adjacent course instructor; data-intensive/data science course instructor
potential needs:
• data literacy curriculum modules for different disciplines.
• course-specific instruction on data literacy.
• access to introductory resources providing data & visualizations.
• examples of classroom work.
• data, including digitized primary source materials.
• virtual space for course materials/data.
• pedagogical support.
• virtual sandbox for projects.
• specialized software for data manipulation and analysis.
• remedial support for students.
• problem sets.
• collaborative space and tools.

patron type: graduate students
level of data science involvement: students in non-data focused disciplines; non-data science students using data for coursework or thesis; students in data-intensive/data science
potential needs:
• instruction around data literacy, including data ethics.
• awareness and acquisition of data for all disciplines (including primary source materials).
• opportunities to learn about projects.
• communication skills for presenting research (e.g., through posters, presentations, visualizations).
• reputation management (e.g., via orcid).
• small curricular modules for recitations.
• physical space.
• support locating and acquiring datasets.
• research assistance and project ideas.
• information about grant funding opportunities.
• information sharing to identify opportunities and increase visibility (e.g., for projects, employment).
• support preparing data management plans.
• support using open research methods.
• assistance overcoming methodological hurdles.
• connections to potential partners/collaborators (including interdisciplinary and library partners).
• handling sensitive data and de-identification.
• computing resources/infrastructure

patron type: undergraduate students
level of data science involvement: students in non-data focused disciplines; non-data science students using data for coursework or thesis; students in data-intensive/data science
potential needs:
• communication skills for presenting research (e.g., through posters, presentations, visualizations).
• instruction around data literacy, including data ethics.
• opportunities to hear from fellow students.
• access to introductory resources providing data & visualizations.
• programming around real-world applications of data science, data ethics, and other topics.
• support locating datasets.
• virtual sandbox space.
• research support including advice on research plan.
• campus research computing resources.
• access to and support for specialized software for data manipulation and analysis.
• assistance overcoming methodological hurdles.

appendix f: environmental scan

executive summary

over institutions were reviewed for their data science programs and partnerships with their university libraries and this summary focuses on nine universities, including:
• arizona state university (arizona state)
• university of california at berkeley (berkeley)
• brown university (brown)
• university of indiana (indiana)
• massachusetts institute of technology (mit)
• new york university (nyu)
• university of rhode island (rhode island)
• university of virginia (uva)
• university of wisconsin (wisconsin)

these universities were chosen because they are existing or emerging leaders around data science, have a special focus on artificial intelligence (ai), social science and the humanities, or data cataloging, or they have a stated connection to campus libraries or library school. overall, no single university encapsulated all these aspects as part of their collaboration between their university library and a data science program.

methods

the team started with a broad overview of over institutions to get a sense of general trends and establish familiarity with baseline services. from this group, we prioritized for closer review schools identified as notable in the moore-sloan data science environments (msdse) report, including the “academic data science centers in the u.s.” report.
we also prioritized for closer review a list of schools included in a datacure listserv discussion about universities with data science programs and librarians to support them, a small number of biomedical data science programs identified as noteworthy by committee members with health sciences libraries data services expertise, and programs highlighted in background readings provided to the committee by library leadership. from the initial survey, we selected nine schools and the thematic organization included in this document. in our broad survey of institutions, we found that although there is a great deal of data science activity taking place across university campuses and libraries, there is room for greater integration of these programs.

to identify data science program-library partnerships, we:
• checked data science programs’ webpages for words such as “partners” and “collaboration.” in rare cases, the library was listed among the partners on these pages. most of the time, these pages list units on campus such as other centers/institutes, high performance computing centers, and industry partners.
• searched ds programs’ events and news pages for library-related posts. these searches turned up a few examples, primarily blog posts and announcements about collaborative projects and events.
• searched in institutions’ internal data science planning proposals/reports for library connections.

overall, the msdse project documentation provided the best sources, highlighting connections between programs and libraries.
results

existing and emerging leaders
• mit
• berkeley
• uva
• indiana
• university of wisconsin

massachusetts institute of technology (mit) – institute for data, systems, and society (idss)

strengths
• strongly aspires to build connections between computer and social sciences
• leverages a research lab that is historically important and integral to the university
• operates out of its own building on campus

description of data science program
while idss was formed to consolidate statistics-related programs across mit, it also upholds a mission to connect efforts in engineering with the social sciences and to tackle complex challenges across multiple societal domains, including finance, energy, infrastructure and health. mit’s efforts to create a core area for statistics resulted in the statistics and data science center (sdsc), through which two of the idss degree programs are offered. in addition, idss is home to two research centers:
• the laboratory for information and decision systems (lids), which was established in , making it mit’s oldest lab. lids now focuses on three main topics: systems and control, communications and networks, and inference and statistical data processing.
• the sociotechnical systems research center (ssrc), whose mission is to “develop collaborative, holistic, systems-based approaches to complex sociotechnical challenges.” (website)

idss also provides a well-developed industry partnership program with multiple participation levels that allow companies to collaborate in both education and research.
data science degrees offered
• ms in technology and policy (tpp)
• phd in social and engineering systems (ses)
• interdisciplinary phd in statistics (idps) through sdsc
• undergraduate minor in statistics and data science through sdsc

http://ssrc.mit.edu/about

data science program organizational and physical location
idss is housed in its own building, which includes the lids and sdsc. it provides offices for faculty and students as well as spaces for collaboration.

library data services, activities, & connections
the libraries at mit provide a resource guide for idss that is managed by the librarian for electrical engineering & computer science. no other specific connections between the library and the data science program are publicized. the rotch library houses a gis & data lab which also provides services for data management.

university of california, berkeley – division of data science and information

strengths
• constitutes a broad network of connections across multiple governing bodies, academic programs and research initiatives
• leverages and strengthens prior collaborative relationships among various groups throughout the university while continuing to build new ones
• serves both graduates and undergraduates
• works with libraries on campus to facilitate services

description of data science program
it is difficult to describe the entirety of the data science division at berkeley because its reach extends far beyond its academic programs, both physically and in influence.
some notable parts of berkeley’s data science constellation include:
• the data science education program, serving undergraduate students as part of the college of letters & science
• the school of information, serving graduate students
• the berkeley institute for data science (bids), which supports research, informal training and open source software development
• the d-lab, a partner of the data science division that provides consultations, training, working groups and meeting space
• the data science discovery program, which allows undergraduates to participate in data research projects with graduate and post-doctoral students as well as community and entrepreneurial organizations
• the data science commons, a new organizational structure still in development, which will allow faculty from any part of the campus to propose the creation of a data science related program

data science degrees offered
• bachelor of arts with a major in data science from the college of letters & science
• master of information and data science (mids) from the school of information

data science program organizational and physical location
it is unclear whether there is a specific building on the berkeley campus that serves as headquarters for the data science division; however, the integral parts of the program are in buildings near one another and central to campus. the college of letters & science, the school of information, and the doe memorial library (which houses bids) are all located next to each other.

library data services, activities, & connections
the most notable connections between the university libraries and the data science division are bids, which lives at doe memorial library, and the data peer consulting program, which operates out of the moffitt library.
other sources for library data services include gis support at the earth sciences & map library as well as consultants for data management and text analysis.

university of virginia – school of data science (sds)
the information in this section comes from uva’s school of data science website and the accompanying report, ‘school of data science phase ii faculty senate submission’.

strengths
• integration between uva library and the school of data science (sds) figured prominently in the sds phase proposal, which was unanimously approved by the faculty senate
• proposal identified uva’s/uva libraries’ digital humanities (dh) strengths as a key interdisciplinary area for further exploration
• sds aims to build and maintain the new school as an open scholarly ecosystem, citing the opportunity to establish uva’s leadership in open scholarship as an important motivating factor
• sds plans to serve both undergraduates and graduates

description of data science program
in june the virginia board of visitors approved a new school of data science (sds), which will focus on:
• responsible data science
• diversity, openness and transparency
• “open scholarly ecosystem” for the public good: aiming to openly share policies, procedures and educational materials, lab materials, data, analytics, and published literature

sds’s focus on the public good will build on the work of uva’s earlier data science institute (dsi) in these areas, such as:
• global women in data science conference regional ambassador
• data for the social good service learning/community outreach initiative: sds faculty, staff, students and alumni developing tools for matching community non-profits with students and service-learning projects providing data analysis.

the ability to grant tenure for recruiting faculty belonging solely to the sds was a key reason for establishing the new school. the sds will also continue to have joint-appointment faculty, as well as fellows and researchers from within and outside uva.
data science degrees offered
• master of science in data science, residential and online
• dual-degree master’s programs: business (mba/msds), medicine (md/msds), and nursing (phd/msds).
• sds plans to offer undergraduate, ph.d., certificate and executive education programs.

https://news.virginia.edu/content/uva-board-approves-establishment-school-data-science
https://datascience.virginia.edu/about
https://api.dsi.virginia.edu/sites/default/files/attachments/ - /schoolofdatascience- .pdf

data science program organizational and physical location
sds is a new, independent school within uva, but sds’s organizational design is meant to encourage integrations with other schools as much as possible:
• the school will operate satellites and centers embedded in other schools, instead of departments.
• centers will be theme-based, covering areas of data analytics; data visualization and dissemination; democracy; deep learning; education; and ethics, policy and law.

uva announced plans to build a new , -square-foot academic building for the sds, within a -acre parcel developed “around three interrelated nexuses – creativity, democracy and discovery (…). the corridor will ultimately house other departments and initiatives, visual and performing arts spaces, and a hotel and conference center.”

library data services, activities, connections
uva library’s research data services + sciences pages provide excellent, easily navigable documentation for the library’s presence across the scope of research data services. in particular, the faqs page is a comprehensive and accessible model for organizing a complex set of services. both the sds website and the internal phase ii proposal provide examples of already-established functional connections between sds and the library, and a strong partnership mindset overall.
• sds blog post about the open data lab, an sds-library partnership: "a data sharing network that will support research through its entire process – from the data collection stage through analysis and data use."

library connections highlighted within the phase ii proposal’s overall vision for integrating the sds across all uva schools:
• the library is included as a principal collaborator, at the same level as planned sds satellites, in the proposal’s ‘example collaborations (…)’ table (table , p. )
• sds-library existing collaborations on training and research projects: scholars lab, scholia @ university of virginia, and plans to expand uva’s presence in wikimedia projects: wikipedia - trust and safety, and the cochrane wikipedia partnership. (p. )
• digital humanities and social sciences collaborative work with the library are cited as areas to build capacity (p. )
• sds planned organizational structure includes the library within the operational arm: “data and information resources (…) a special functional module to work with the library and it services to cooperatively make available data, analytics, information etc. both needed by the sds team (faculty, staff, students) and offered by the team.” (p. )

the phase ii proposal also highlights sds’s planned ‘open scholarly ecosystem’ as an opportunity to establish uva as a leading us and global academic institution in open scholarship.
dsi will provide “a major economic driver for the commonwealth of virginia and beyond while at the same time providing accessible knowledge to a diverse audience.” (p. )

the proposal points to existing and planned library partnership activities as key to supporting a complete open research lifecycle (all points below from p. ):
• uva’s innovative library and association with hathi trust has already laid a foundation for uva as a leader in open scholarship
• open publication of research will be encouraged, supported by collaboration with uva library, and platforms such as ubiquity or coko
• encouraging/providing open teaching materials will be supported by collaboration with uva library, for open e-book services for textbooks, other materials
• deposit into the libra ir (managed by uva library) will be encouraged, to preserve all completed sds research

in terms of connections from the library’s side, the stat lab’s related resources page includes the dsi, as well as many other university resources/communities.

https://news.virginia.edu/content/uva-faculty-senate-votes-establish-school-data-science
https://news.virginia.edu/content/uva-board-approves-establishment-school-data-science
https://data.library.virginia.edu/
https://data.library.virginia.edu/faq/
https://datascience.virginia.edu/pages/open-data-lab-
https://github.com/uva-dsi/open-data-lab
university of indiana – data science program
the data science program (website) is an interdisciplinary collaboration between four departments within the school of informatics, computing and engineering (sice) and the college of arts & science (statistics).

strengths
• interdisciplinary at all levels
• program covers bs, ms, phd (minor) and certificate
• serves both undergraduate and graduate students
• located within the school of informatics, computing and engineering (sice)
• data research happens at the data to insight center (d i), a collaboration between sice, the university libraries and the pervasive technology institute

description of data science program
the data science program is interdisciplinary and collaborative across several departments in sice, the college of arts & science, the o’neill school of public and environmental affairs, the kelley school of business and the school of public health.

data science degrees offered
• bs in data science from sice
• ms in data science from sice, residential and online; two learning tracks, either applied data science or computational and analytical
• phd (minor) in data science ( credits) from sice, residential or online
• certificate in data science ( credits) from sice, online

data science program organizational and physical location
the data science program is an interdisciplinary collaboration between four departments (computer science, informatics, information & library science and intelligent systems engineering) within sice and the college of arts & science (statistics). the data science program is housed within sice.

https://data.library.virginia.edu/related-resources

library data services, activities description
the library data services unit (website) works with researchers at all stages of a project: collecting, processing, analyzing, publishing and preparing data for long term storage.
this unit also provides gis and statistical data services.
• collaborators include the political science data laboratory and archive, the karl f. schuessler institute for social research, and the stat/math center of university information technology services (uits).

university of wisconsin, madison – the school of computer, data & information sciences (cdis)
the school of computer, data & information sciences (cdis) is the newest division in the college of letters & science and was launched september . it is a collaboration between the departments of computer sciences, statistics and the information school (ischool). there will be no change to the leadership or autonomous governance structures of the three units.

strengths
• interdisciplinary programs across campus; pan-campus including the sciences, social sciences and humanities
• plans include serving both undergraduate and graduate students
• focused on the needs of wisconsin, including business, job creation, educating and serving the public, and outreach
• research areas include ai/machine learning, social and ethical aspects of computing and data science, human-computer interaction (design) and cybersecurity
• cdis will partner with the american family insurance data science institute (website)

description of data science program
currently, the data science program is in development:
• plans for joint degrees, certificates and classes at the undergraduate and graduate levels.
• initial proposals include an undergraduate program in data science and a master’s in information studies

data science degrees offered
• mlis with a data/information management & analytics (dia) concentration at the ischool
• bs/bs, pmp, ms/phd in computer science
• msds master’s in statistics with data science

data science program organizational and physical location
• collaboration between the departments of computer sciences, statistics and the information school (ischool).
there will be no change to the leadership or autonomous governance structures of the three units.
• no physical building at this time.

library data services, activities, & connections
the research data services (rds) is an interdisciplinary organization, which provides research data management practice across campus. it includes the libraries, the division of information technology, the office of the chief information officer, and the institute on aging. it provides researchers with the tools and resources that support their efforts to store, analyze, and share data. there is a data science hub, coordinated by rds, which offers workshops in a variety of data science areas and in various locations across campus (based on information provided on the website). connections include:
• american family insurance data science institute

https://datascience.wisc.edu/institute/
http://researchdata.wisc.edu/our-services/

artificial intelligence (ai)

university of rhode island data science program and artificial intelligence lab

strengths
• the ai lab at uri is a product of library staff working hand-in-hand with faculty in computer science and engineering.
• the lab is engaged with both curriculum at the university and education programs for the public.

description of data science program
data science at rhode island has a higher concentration of support at the undergraduate level, offering both a ba and bs through the department of computer science and statistics. the department is connected to the artificial intelligence lab through its faculty. the ai lab is the first of its kind to emerge from a university library and is open to both university members and the public. it provides access to a high-performance supercomputer, six laptop workstations, and a small but growing collection of iot devices.
training at the ai lab is available through workshops but is also closely integrated with multiple university courses across various disciplines. in addition, the lab facilitates summer camp programs for k- students in the community.

data science degrees offered
• bachelor of science in data science
• bachelor of arts in data science
• undergraduate minor in data science
• related graduate degrees in areas such as computer science, statistics, cyber security and digital forensics

data science program organizational and physical location
the ai lab is housed in the robert l. carothers library which is a short walk from the computer science department in tyler hall. the library is also home to a makerspace with -d printers, allowing users full access to equipment necessary for wearable technology and other device creation.

library data services, activities, & connections
multiple librarians are counted among the founding members of the ai lab and library staff are responsible for its management alongside faculty members in computer science and engineering. outside of the lab and the makerspace, the library offers a digital repository, but no specific data services.

https://hub.datascience.wisc.edu/

social sciences / humanities
• brown
• mit

massachusetts institute of technology (mit) – institute for data, systems, and society (idss)
see above.

brown university - data science initiative (dsi) at brown university
brown was not included in the academic data science centers report; this assessment draws primarily from brown’s dsi and library web pages, and the university newspaper.
strengths
• visibility and outreach at many levels within dsi, such as browsable, accessible presentation of interdisciplinary research projects, outreach events with campus partners
• effort to overcome silos: library inclusion in shared digital teaching and learning resources
• dsi activities demonstrate clear alignment with mission, focus on applying data science to cultural and social spheres
• serves both undergraduates and graduates

description of data science program
the dsi’s mission encompasses “both domain-driven and fundamental research in data science,” and prioritizes “the impact of the data revolution on culture, society, and social justice.” dsi’s research grants program focuses on new initiatives and collaborations, particularly interdisciplinary work across disciplines or units, and encourages submissions for projects emphasizing the public good, in alignment with the dsi’s mission. dsi showcases a range of research projects at a well-designed, browsable research projects page, linking to the abstract, project leads, and funding source for each. some relevant examples:
• social and family history - extraction, representation, and evaluation
  o leverage social, behavioral, and familial data from electronic health records to create rich longitudinal resources, informing understanding about various determinants of health
• computational psychiatry: combining theory-driven and data-driven approaches to understand impulsivity
  o use neuroimaging with mathematical modeling to discover cognitive neuroscience mechanisms underlying impulsivity, as a substantial risk factor for aberrant behaviors
• predictive healthcare analytics
  o build models of complex health phenomena from biological, clinical, and public health data, emphasizing pediatrics, psychiatry, emergency medicine, and critical care

dsi’s events program is notable for collaborations across the university, and for addressing social justice issues and diversity/equity/inclusion.
recent topics in the robust events calendar included:
• a “hands-on open house featuring many of our illustrating mathematics program participants”
• a statistics seminar on “useful models of mental and emotional functions for an increasingly detailed picture of the brain-body connection and its many roles in health and survival.”
• a training installment in the research integrity series: “the role of the scientist in society”.

other examples of outreach-oriented dsi events:
• “algorithmic justice: race, bias and big data” panel, cohosted by the university’s center for the study of race and ethnicity in america and the data science initiative
• dsi participation in the brown graduate school’s academy in context seminar series - dsi director presented on ‘bias and transparency in contemporary data science’

https://www.brown.edu/initiatives/data-science/research/data-science-research-brown
https://www.brown.edu/initiatives/data-science/social-and-family-history-extraction-representation-and-evaluation
https://www.brown.edu/initiatives/data-science/computational-psychiatry-combining-theory-driven-and-data-driven-approaches-understand-impulsivity
https://www.brown.edu/initiatives/data-science/predictive-healthcare-analytics
https://www.brown.edu/initiatives/data-science/events

data science degrees offered
• master’s degree in data science (master of science, scm) - a twelve-month program (september-august), offered by dsi since the fall of
• doctoral certificate for current brown phd students in other fields.
• ‘ th-year master's degree’ - brown undergraduates are eligible to apply to the ma program, allowing two credits to be substituted by undergraduate coursework.
data science program organizational and physical location
the dsi includes the departments of computer science, mathematics, applied math and biostatistics. dsi can grant degrees but not tenure; faculty have joint appointments. since the dsi has occupied space within a large office building that was renovated for the robert j. and nancy d. carney institute for brain science. along with carney institute staff, and laboratories for departments pursuing computational neuroscience, human brain recording and decoding, and neuro-engineering, the center for computational molecular biology also shares space in this building.

library data services, activities, and connections
brown’s library data services include: data repository/preservation, dmp, visualization, support for teaching with digital methods, gis and storymap, lab notebooks. the library’s events page includes workshops and support for many specific tools for working with data. recent data workshops covered: visualization, publishing, dmp tool, best practices, labarchives notebooks, finding funding opportunities, software carpentries, and introduction to topic modeling for the humanities. a particularly strong point is the library’s data services inclusion in a well-organized digital teaching and learning resources (dtlr) site. dtlr gathers resources from organizational units across the university, including: brown's sheridan center for teaching and learning, the library and the library's center for digital scholarship, and cis academic technology.
• shared resources page makes a wide range of services that are usually scattered and siloed easy to find, and seems likely to increase visibility for library data services

the dsi co-sponsors with the brown libraries and the initiative for computation in brains and minds a data science, computation & visualization (dscov) workshop series: “introductions to basic data science and programming skills and tools, offered by and for brown staff, faculty, and students (with occasional presenters from outside brown).” dscov workshop materials are collected in a growing, publicly available github repository. examples of recent topics: basics of data exploration and visualization with vega; tidying, transforming, and visualizing data in r; deep learning in kaggle kernels.

https://www.brown.edu/carney/news/ / / /carney-opens-new-home-innovation-and-impact-brain-science
http://brownlibrary.lwcal.com/#view/all
https://www.brown.edu/academics/digital-teaching-learning/about-digital-teaching-learning-brown
http://www.browndailyherald.com/ / / /data-can-influence-inequalities-panelists-say/
https://www.brown.edu/initiatives/data-science/news/ / /dsi-presents-academy-context
https://www.brown.edu/academics/digital-teaching-learning/tools-spaces
https://www.brown.edu/initiatives/data-science/engagement/dscov-data-science-computing-and-visualization-workshops
data cataloging

new york university – center for data science

strengths
• cds-library partnership across many areas of common interest, especially strong within nyu libraries’ focus on reproducible research and data curation
• cds work in software development, training and events, community outreach
• new ds service: cds working with nyu libraries and other partners to provide skilled labor to data science research projects
• serves undergraduates, graduates, and non-degree professionals

description of data science program
nyu’s center for data science (cds) was established in . it was significantly expanded through participation in the moore-sloan data science environments program, receiving years of support, - . cds cannot grant tenure. faculty have joint appointments with “home” departments. cds’s faculty page includes joint-appointed faculty, over affiliated faculty, and a few each of associated, visiting, and adjunct faculty.

data science degrees offered
• master of science in data science launched in and has grown quickly: “in , there were , applicants for slots.” (adscus p. )
• phd degree program added in
• interdisciplinary undergraduate major and minor programs in data science: “both programs of study are open to all undergraduate students from across the humanities, social sciences, and sciences, and our courses are open to all students.”
• non-degree program for professionals

data science program organizational and physical location
cds moved into a newly renovated space in . it occupies two floors: one for “quiet work”, one for events and collaborative projects. (adscus p. )

library data services, activities, and connections
nyu’s data services instruction guide provides an outstanding model:
• covers a wide range of software, processes, and types of data analysis
• organizes classes into suggested groupings and sequences to support various stages of the research cycle for working with data.
a data science classes diagram organizes classes into groupings based on the data research cycle: collect/find, clean & organize, explore, analyze, and share. these stages allow for further sub-groupings, such as analysis courses to support different types of data: gis, qualitative, quantitative, or hpc. the data services diagram reveals a coherent instruction program through which researchers can navigate according to their specific needs, and build skills sequentially.

in , nyu joined the data curation network, which provides data curation services and education to support building fair (findable, accessible, interoperable and reusable) research data collections in institutions’ repositories. the libraries’ data services and digital scholarship services co-host the annual dh + data day, which includes a mapping and visualization competition voted on by all dh + data day participants.

the academic data science centers report outlines significant areas of partnership between cds and the nyu libraries (page ):
• “in partnership with the nyu libraries, cds has also been actively involved in promoting reproducible science practices on campus by offering training sessions, consultations, and lectures on how to obtain and manage large datasets.”
• cds staff have contributed software development expertise and time to supporting open-source software projects for computational reproducibility, such as reprozip and reproserver, which allow researchers to package their data and computation software together for later execution/reproduction of results.

https://github.com/dscov-tutorials/schedule_and_links
https://guides.nyu.edu/ds_classes

the arl report points to additional areas of collaboration (all points below from p. ):
• cds worked closely with the libraries to design its new permanent space.
• the library’s deep understanding of how people across the campus and across disciplines use space—for writing, reading, interacting with technology, and collaborating, and experience with modularity and flexibility in space design informed cds’s new environment. • “the growth of data science throughout the university has influenced the library’s collecting, such as purchasing more vendor produced data sets, responding to students’ need for big data (for example, large social media feeds), and integrating apis into their collection and discovery environment.” nyu libraries, the cds, and nyu it research technology are piloting data science & software services (ds ) in connection with related campus units including high performance computing (hpc), and nyu it teaching and learning with technology (tlt). ds is a new joint service providing centralized access for faculty and research staff needing data scientists to work on grant funded projects that require expertise in data analytics, statistical methodology, and research-oriented software development. 
ds ’s stated goal is “to enhance the research capacity of nyu by providing highly skilled labor for funded projects and increasing the competitiveness of grant proposals.” service areas draw on the nyu libraries’ and cds’s combined strengths (all points below from https://cds.nyu.edu/ds /):
• research methodologies in data science and statistics
• scoping development activities
• experimental and research design
• software development for research
• data analytics for all data types and sizes
https://data-services.hosting.nyu.edu/nyu-libraries-joins-the-data-curation-network/ https://data-services.hosting.nyu.edu/dh-data- https://data-services.hosting.nyu.edu/dh-data- /map-and-visualization-competiti https://cds.nyu.edu/ds /
stated partnership with campus libraries or library school
• arizona state
• indiana
university of indiana – data science program
see above.
arizona state university
arizona state university (asu) offers data science related degrees in the departments with residential and online programs. finding information about all degrees offered at asu was difficult and we may not have found all of them. asu has a certificate in data science but no dedicated data science program; the library, however, has a data science and analytics unit with many partners across campus.
strengths
• data science related degrees through the business school
• serves both graduates and undergraduates
description of data science program
data science at asu is part of an existing school, so there is not a separate school or department.
data science (related) degrees offered
• bachelor of science in business data analytics at w.p. carey school of business, online
• master of science in business analytics (ms-ba) at w.p.
carey school of business • certificate in data science from new college of interdisciplinary arts & sciences (website) data science program organizational and physical location • no building or location library data services, activities, and connections • data science and analytics unit within the hayden library (website) • connect experts to methods and technologies of data science the library hosts a weekly “open lab” to introduce data science to students, staff and faculty and to work with groups on research projects that covers many subject areas. the library has many collaborators across the campus including but not limited to: • department of english • global biosocial complexity initiative • herberger institute of e design and the arts • school of computing, informatics and decision systems engineering https://lib.asu.edu/data/open-lab https://newcollege.asu.edu/data-science-certificate https://lib.asu.edu/data https://lib.asu.edu/data/open-lab page | march university libraries, university of north carolina at chapel hill page | march university libraries, university of north carolina at chapel hill library data science committee framework recommendations university libraries the university of north carolina at chapel hill march committee members nandita mani michelle cawley lorin bruckner jason casden adam dodd amanda henley matt jansen jamie mcgarty morgan mckeehan sarah morris therese triumph jessica venlet joe williams digitizing isaac newton digitizing isaac newton the newton projectthe chymistry of isaac newtonthe newton project canada by robert iliffe; william r. newman; stephen snobelen review by: niccolò guicciardini isis, vol. , no. (june ), pp. - published by: the university of chicago press on behalf of the history of science society stable url: http://www.jstor.org/stable/ . / . accessed: / / : your use of the jstor archive indicates your acceptance of the terms & conditions of use, available at . 
http://www.jstor.org/page/info/about/policies/terms.jsp .
the university of chicago press and the history of science society are collaborating with jstor to digitize, preserve and extend access to isis.
this content downloaded from . . . on sat, jul : : am all use subject to jstor terms and conditions
essay reviews
digitizing isaac newton
by niccolò guicciardini*
robert iliffe (director). the newton project. http://www.newtonproject.sussex.ac.uk.
william r. newman (director). the chymistry of isaac newton. http://webapp .dlib.indiana.edu/newton/.
stephen snobelen (director). the newton project canada. http://www.isaacnewton.ca/.
isaac newton owes his fame to the books on which he is shown reclining nonchalantly in the baroque representation that adorns his funerary monument in westminster abbey. yet, just as in the case of his contemporaries christiaan huygens and gottfried wilhelm leibniz, his intellectual life is best revealed by his manuscripts. like many scholars of his age, newton kept a well-ordered archive of letters and manuscripts, some of which—especially those from his youth—are bound into notebooks organized as commonplace books, a way of organizing knowledge under headings that dates back to the middle ages.
knowing how to keep and update one’s archive of letters (including letters received and sometimes copies of those sent), original manuscript notes, and excerpts from books and manuscripts was a craft learned by example: in the case of newton, the example was his stepfather, barnabas smith, whose huge and very valuable in-folio “waste book” (cambridge university library, ms add. )—which also contains the reverend’s theological annotations—the young subsizar inherited when he entered trinity college in . manuscripts were not only a storage resource and a means of organizing knowledge under headings and subheadings: they were also a means of communication, since they were circulated and copied. they were often shown to visitors—edmond halley’s visit to newton being iconic in this regard. the transmission of knowledge via manuscript circulation flourished in restoration england, and newton followed a policy of disclosure and concealment of his private writings that was dictated by his philosophical agendas, as well as—in the case of the theological manuscripts—well-justified worries concerning the consequences he would have suffered as a staunch anti-trinitarian by their publication.
* università degli studi di bergamo, via pignolo , bergamo bg, italy. i thank rosanna cretney for providing useful information on digital humanities.
newton seldom kept drafts of letters he sent.
isis, , : – © by the history of science society. all rights reserved.
newton was a master in making use of his manuscript archive.
he could locate and reuse notes he had jotted down decades before, as leibniz learned—much to his chagrin—when the president of the royal society was able to amass evidence of his own priority in the invention of the calculus by retracing letters and papers he had written back in the s with the help of the mathematics tutor and manuscript collector william jones. we learn very little of newton’s thought, state of mind, objectives, and anxieties if we confine ourselves to his printed works. yet the fate of newton’s estate after his death made most of his manuscripts and letters— especially those concerning alchemy and religion— unavailable to scholars. david brewster, the major nineteenth-century biographer of newton, caught only a fleeting glimpse of the sheets kept by the portsmouth family at hurstbourne park before closing his positivist eyes in horror at the sight of the tormented writings of a man so different from the religiously orthodox and honest man of science he had it in mind to eulogize. the story has often been told of how the portsmouth collection became available to the public—first in , via a generous donation of the fifth earl of portsmouth to the university library in cambridge, and then in , via an auction at sotheby’s, with john maynard keynes and abraham yahuda as the main buyers. much of the present-day knowledge of newton is based on scholarship built on the manuscripts in the custody of various libraries in and outside england: the most notable being the university library in cambridge (which in also acquired jones’s manuscript treasure trove, the maccles- field collection), trinity college and king’s college (cambridge), new college (ox- ford), the national archives (kew), the huntington library (san marino, california, usa), and the national library of israel (jerusalem). it was only after world war ii that a legendary group of towering newtonian scholars took up the task of editing newton’s manuscripts. 
the correspondence was magnificently edited by herbert turnbull, john scott, rupert hall, and laura tilling. no words of praise are sufficient to describe what tom whiteside achieved with the edition of the mathematical papers and alexandre koyré and bernard cohen with the editio variorum of the principia. alan shapiro has edited the lucasian lectures on optics, rupert hall and marie boas hall a few manuscripts from the portsmouth collection (including the one known under the title de gravitatione); james mcguire and martin tamny have shed light on newton’s youthful studies by editing the “questiones quaedam philosophiae.” no newton scholar nowadays can ignore these pillars of the “newtonian industry.” yet in the early s, when all these results had just been achieved, we were still lacking a great deal. first of all, the theological manuscripts were left almost untouched, notwithstanding the pioneering work of herbert mclachlan and, most notably, frank manuel. ground-breaking attention to the alchemical manuscripts began in the s with karin figala’s and betty dobbs’s pioneering works: the results were often problematic, both from an interpretative point of view and in terms of philological accuracy. the mint papers were largely consigned to oblivion.
the importance of having an online edition linked to scanned images of the originals of the correspondence is obvious, if only for the opportunity it would offer for studying the evolution of newton’s handwriting and for the use of watermarks (when these were made visible in an internet edition). for the use of watermarks to date newton’s manuscripts see alan e. shapiro, “beyond the dating game: watermark clusters and the composition of newton’s opticks,” in the investigation of difficult things, ed. p. m. harman and shapiro (cambridge: cambridge univ. press, ), pp. – .
h. w. turnbull et al., eds., the correspondence of isaac newton, vols. (cambridge: cambridge univ. press, – ); a. r. hall and m. b. hall, eds., unpublished scientific papers of isaac newton (cambridge: cambridge univ. press, ); j. e. mcguire and martin tamny, certain philosophical questions: newton’s trinity notebook (cambridge: cambridge univ. press, ); alan shapiro, ed., the optical papers of isaac newton, vol. : the optical lectures ( – ) (cambridge: cambridge univ. press, –) (three volumes are planned); d. t. whiteside, ed., the mathematical papers of isaac newton, vols. (cambridge: cambridge univ. press, – ); and alexandre koyré and i. b. cohen, eds., philosophiae naturalis principia mathematica: the third edition ( ) with variant readings, with a. whitman (cambridge: cambridge univ. press, ).
the advent of the internet and the development of digital humanities have made it possible to envision huge progress, destined to change our knowledge of newton and to increase enormously the number of people around the world who have access to his work (the divide here will be between those who have an easy, uncensored, and fast connection to the internet and those who are denied this facility). the indisputable leader of this exciting new era of newtonian scholarship is robert iliffe, director of the newton project (now based at the university of sussex), the man behind the complex task of managing a team of twenty contributors and dealing with quality control, copyright, fundraising, and the demanding issues concerning the design of such a rich web resource, which aims to display recondite technical material in a way that is more accurate and comprehensive than is possible with print editions. launched in under the general editorship of iliffe and scott mandelbrote, the newton project was soon joined by two sister projects, the chymistry of isaac newton, based at indiana university and directed by william r.
newman, and the newton project canada, based at the university of king’s college (halifax, nova scotia) and directed by stephen snobelen. all these editorial projects are in the trustworthy hands of eminent scholars who have provided momentous contributions to our understanding of newton’s religion and alchemy.
the purpose of this essay review is to inform the readers of isis about these editorial projects: what has been achieved and what is planned. writing this review has not been an easy task, for a number of reasons. the projects in question are works in progress: new material and features are being added as i write. digital projects have an iterative nature, and their development is more open ended than is the case with print editions. another difficulty is that it is often hard to attribute authorial responsibilities and merits: in this essay i will mention only the names of the general editors, but standing behind them there are teams of scholars (mentioned in the pertinent web pages) equipped with considerable skills. finally, the criteria to be adopted in evaluating a digital edition are different from those accepted for print editions. the clear conclusion i have reached in this respect is that the digital editions under review comply with the highest standards required in the field of digital humanities. all the material uploaded is rigorously peer reviewed and offered with open access; the transcriptions and the scanned images (when available) are downloadable free of charge; the sites are designed both for scholars and for more general readers (for teaching purposes, for instance); and access worked well with all major browsers and different platforms i tried.
isaac newton, theological manuscripts, ed. herbert mclachlan (liverpool: liverpool univ. press, ), provides a transcription of excerpts of manuscripts from king’s college library. see also frank manuel, isaac newton, historian (cambridge, mass.: harvard univ. press, belknap, ); and manuel, the religion of isaac newton (oxford: clarendon, ).
betty jo teeter dobbs, the foundations of newton’s alchemy; or, “the hunting of the greene lyon” (cambridge: cambridge univ. press, ); dobbs, the janus faces of genius: the role of alchemy in newton’s thought (cambridge: cambridge univ. press, ); karin figala, “die exakte alchemie von isaac newton,” verhandlungen der naturforschenden gesellschaft in basel, , : – ; and j. craig, newton at the mint (cambridge: cambridge univ. press, ).
see rob iliffe, “digitizing isaac: the newton project and the electronic edition of newton’s papers,” in newton and newtonianism: new studies, ed. james e. force and sarah hutton (dordrecht: kluwer, ), pp. – .
among the scholars working at the newton project, one might mention john young (chief transcriber), michael hawkins (technical expert), and daniele cassisa (mathematical transcriptions).
see, e.g., university of nebraska–lincoln, center for digital research in the humanities, “recommendations for digital humanities projects,” http://cdrh.unl.edu/articles/best_practices.php (accessed oct. ); and todd presner, “how to evaluate digital scholarship,” http://idhmc.tamu.edu/commentpress/digital-scholarship/ (accessed oct. ).
the newton project is a nonprofit organization dedicated to providing a complete online edition of all of newton’s writings—both published and unpublished. the edition presents a full list and brief descriptions of the works and existing manuscripts by newton. this feature is already an immensely valuable tool for the newton scholar.
for a sizable portion of these texts it provides a full (diplomatic) rendition of all the amendments newton made to his own texts, but it also offers more readable (normalized) versions (the former recording all the deletions, additions, errors, and alterations made in the original document, the latter edited to yield something more like a “finished” text). here the advantages of an online edition over a print edition are apparent, since cancellations, overwritten words, interlineations, and the like can be rendered in a very legible way. to give a general idea of the magnitude of what has been achieved so far: as of the writing of this essay, in october , . million words are online; another nine hundred thousand words will be added at the beginning of : we will then have a complete transcription of all the religious (chronological, doctrinal, and prophetic) materials. the manuscripts and printed works have been transcribed with painstaking accuracy: it is amazing to see how careful the team has been in rendering millions of words that are often overwritten, crossed out, or barely legible because of damage to the paper. the policy has also been to offer manuscripts in their entirety, without omitting words or signs added by other hands. the newton project further provides a great deal of information that will be of interest to many professional historians as well as to the curious reader. by the end of november , the newton project will have been enriched by the addition of forty to fifty thousand words of contextual essays on themes such as newton’s life and work, his personality, the controversies in which he was involved, his library, and the history of his papers. already at this preliminary stage, the newton project has much to offer scholars. for example, one can read the three editions of the principia (books and so far: a transcription of book is announced as “imminent”) and search for words. 
the reader might be surprised to discover (in a matter of seconds) how many occurrences of the term “hypothesis” can be spotted in the second edition, which was to end with the famous “hypotheses non fingo”! another little experiment would be to search for the term “gravitatio” in the important notebook (ms add. ) whose incipit is “de gravitatione et aequipondio fluidorum et solidorum” and is therefore known as “de gravitatione.” it occurs only once, in the incipit just quoted: in the rest of the manuscript newton employs the term “gravitas.” as it stands, the project is already a dream come true. still, this work has certain limitations, though the editors should not be blamed for them; this is, after all, work in progress. basically, the problem is this: the newton project has not yet produced annotations to many of the texts so far transcribed, even though for each item detailed information is provided on date of composition, number of folios (or pages) and words, languages, location, shelf mark, hands, and custodial history; in some cases the content of the text is described and an english translation is provided. it should be noted that there are two good reasons behind the choice to put transcriptions online without an accompanying apparatus, which in a traditional print edition would take the form of, for example, an apparatus fontium, explanatory annotations, and an index nominum et locorum. first, it is already a great service to the community of historians of science to make these reliable renditions so easily available. for the manuscripts belonging to the portsmouth collection and to the yahuda collection digitized images from originals are being made available: thus the reader can access information unthinkable in a print edition.
second, the institutions funding these projects often require that the texts go online as soon as they have been transcribed and reviewed. the editors of the newton project envisage a future stage in which they will offer not only images of manuscripts linked to the texts but an explanatory apparatus and commentary as well. indeed, one reads that a final stage will entail “applying a new level of markup to the encoded text to capture not just formatting information (what the text looks like) but content information (what it means). names, places, dates, concepts, obsolete or technical terms and so forth will all be individually tagged so that they can be linked to explanatory apparatus. this also makes it possible for documents to be searched by theme and concept as well as explicit textual content.” and that would be great. but when will we see it? how long will it take? editing and commenting on million words (and more are being transcribed all the time) is a gargantuan project requiring time, energy, and extraordinary technical competence. digitization cannot make this work less demanding and time consuming than it was in johan ludvig heiberg’s time. the akademie edition of leibniz’s works began in and is still an ongoing project that employs many research teams specializing in all the various sectors of leibniz’s intellectual pursuits. one cannot think that newton will require less time and expertise. it would already be a great achievement if the phase devoted to providing an explanatory apparatus could begin with a selection of manuscripts—for example, some of those pertaining to religion that are held in jerusalem.
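The content-level markup quoted above (names, places, dates, and terms tagged individually so that they can be linked and searched by concept) can be illustrated with a minimal Python sketch. It assumes a simplified, TEI-like encoding: the element names `name` and `term`, the `key` attribute, and the sample sentences are hypothetical and are not taken from the Newton Project's actual schema or transcriptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample encoding, for illustration only (not a real transcription).
SAMPLE = """<text>
  <p>In a letter to <name type="person" key="halley">Mr Halley</name>,
  the author discusses <term key="gravity">gravitas</term>.</p>
  <p><name type="person" key="halley">Halley</name> visited
  <name type="place" key="cambridge">Cambridge</name>.</p>
</text>"""


def build_index(xml_source):
    """Map each concept key to the (paragraph number, surface text) pairs tagged with it."""
    root = ET.fromstring(xml_source)
    index = defaultdict(list)
    for number, paragraph in enumerate(root.iter("p"), start=1):
        # iter() also yields the paragraph itself, which carries no key and is skipped.
        for element in paragraph.iter():
            key = element.get("key")
            if key is not None:
                surface = "".join(element.itertext()).strip()
                index[key].append((number, surface))
    return dict(index)


if __name__ == "__main__":
    for key, hits in sorted(build_index(SAMPLE).items()):
        print(key, hits)
```

Because the lookup goes through the `key` attribute and not the surface wording, a query for `halley` finds both "Mr Halley" and "Halley"; this is the difference between searching by concept and searching by explicit textual content.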
other issues that will require attention are those relating to the study of newton’s hand (its evolution over time and the consequent establishment of those texts that can with confidence be accepted as his holographs) and the ordering and dating of the manuscripts, which are often assembled in incoherent bundles constituted by folios containing texts that might be dated to different time periods. historians of newton’s mathematics are aware that many of the mathematical manuscripts are difficult to date and that their present ordering does not easily reveal their original relationship. it seems that in some cases newton reworked manuscripts he had jotted down years before, and in some instances one can surmise that he added dates in retrospect. this occurs, for example, in the “waste book” (cambridge univ. library, add. ), which contains texts in newton’s hand written over a rather long period (from the mid s up to the s, or perhaps even later), with marginal dates that appear to be retrospective.
the images, of course, should be read cum grano salis, since file compression and the rendition of colors on a computer screen can to some extent hide details of interest to the researcher. digitized images of the yahuda collection are available via the webpage of the national library of israel, http://web.nli.org.il/sites/nli/english/collections/humanities/pages/newton.aspx (accessed sept. ); while the cambridge digital library has completed the digitization of the newton manuscripts from the portsmouth collection held in the university library at cambridge, http://cudl.lib.cam.ac.uk/collections/newton (with the exception of add. [a letter from halley moved to king’s library]; , , , and [now in the rare books collection]; and the mathematical treatises and in st. john hare’s hand) (accessed sept. ). both these resources are in the process of being linked to the transcriptions provided by the online editions under review.
http://www.newtonproject.sussex.ac.uk/prism.php?id= (accessed sept. ).
as i was writing this review, the spanish scholar pablo toribio pérez was publishing a critical edition—with annotations, a complete study of the sources, and a spanish translation—of a newtonian treatise on the early history of the church. toribio pérez has reconstructed this work from ms yahuda . , fols. r– r, and a complex reordering of folios now in ms yahuda and ms yahuda . this is not the place for an evaluation of toribio pérez’s work: here it will suffice to say that his achievement reminds us that the existing corpus of newton’s manuscripts requires the attentive work of the classical philologist, since, given the tormented life of the newtonian nachlass, the folios have often been scattered and reassembled in heterogeneous bundles. see isaac newton, historia ecclesiastica (de origine schismatico ecclesiae papisticae bicornis), ed. pablo toribio pérez (madrid: consejo superior de investigaciones científicas, ). this is part of a project, edición crítica de textos inéditos de isaac newton en lengua latina, directed by ciriaca morano rodríguez, who has edited ms babson , one of newton’s works on the temple of salomon. see newton, el templo de salomón (manuscrito prolegomena ad lexici prophetici partem secundam), ed. ciriaca morano rodríguez (madrid: consejo superior de investigaciones científicas, ).
i would add, as a further example, that the celebrated “october tract on fluxions” owes its name to a date squeezed at the top of a folio belonging to a youthful mathematical essay: however, the words “october ” are written in a shade of ink noticeably different from that used in the rest of the manuscript (add. . , fol. r).
i do hope that the government authorities and the funding institutions that have supported these wonderful transcriptions of newton’s manuscripts and printed works so far will have the vision to continue to do so when the new phase of the project begins—and at a pace that might set the deadline for completion at half a century or so in the future.
the chymistry of isaac newton provides both a normalized and a diplomatic transcription of nearly all of newton’s alchemical manuscripts (more than a million words). these manuscripts represent a real challenge for an editor: their dating is often very difficult, the sources of what are often newton’s own transcriptions of obscure alchemical books or manuscripts are hard to locate, the terminology and symbols employed in describing experiments are arcane. newman is one of the few scholars who can competently edit these puzzling texts. the chymistry of isaac newton team has created a fully functional suite of tools using techniques and concepts from computational linguistics and information retrieval that automatically locate parallel passages within the corpus of newton’s alchemical writings. a combination of sophisticated computational tools and graphical forms of representation allows the user to construct a visual network showing both the probable chronological ordering of newton’s alchemical manuscripts and their evolving textual dependencies on earlier sources. these electronic tools might in part replace—and in some respects even surpass—what can be gained by means of traditional editorial work.
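The idea of automatically locating parallel passages can be sketched in a few lines of Python. This is a deliberately simplified stand-in: it compares passages by cosine similarity of raw word counts, which is not the latent semantic analysis that the Chymistry project's own tool is named after. The passage identifiers, sample texts, stopword list, and threshold are all invented for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a linguistic resource).
STOPWORDS = {"the", "a", "and", "our"}

# Invented sample passages, for illustration only (not Newton's text).
SAMPLE = {
    "ms-a": "the green lion devours the sun and yields the red tincture",
    "ms-b": "our green lion devours the sun yielding a red tincture",
    "ms-c": "the comet moved along a parabolic orbit near the sun",
}


def tokenize(text):
    """Lowercase a passage, split it into word tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def parallel_passages(passages, threshold=0.5):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    vectors = {pid: Counter(tokenize(text)) for pid, text in passages.items()}
    ids = sorted(vectors)
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            score = cosine(vectors[left], vectors[right])
            if score >= threshold:
                pairs.append((left, right, score))
    # Strongest candidate parallels first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    for left, right, score in parallel_passages(SAMPLE):
        print(f"{left} ~ {right}: {score:.2f}")
```

Exact word matching of this kind misses variants such as "yields" and "yielding"; handling variant spellings and paraphrase is one reason a real tool would use richer techniques from information retrieval.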
for each manuscript a careful description is provided of the title, content, physical appearance, foliation, measurements, watermarks, and state of conservation; of the hands and languages; and of the custodial history (obtained from the newton project). there is no physical detail (such as the occasional wormhole) that is not used to arrange the scattered sheets of newton’s alchemical manuscripts in their most likely order. many of the manuscripts are introduced by historical commentaries, and some annotations help the reader decipher the arcane alchemical terminology and symbols newton employed. in some cases, translations from the latin are also provided. the manuscripts are linked to scanned images (at the moment not from the original sheets, but from the chadwyck-healey microfilm collection) that can be downloaded in pdf format. the task of reproducing the alchemical symbols themselves has required years of font work that has led to the creation of a large set of alchemical symbols for the international unicode standard.
james r. voelkel’s contribution in transcribing and encoding the two notebooks cambridge univ. library, add. and should be mentioned here.
see the “latent semantic analysis” tool at http://webapp .dlib.indiana.edu/newton/lsa/index.php (accessed sept. ).
this is changing rapidly, however. the recently uploaded texts (yahuda mss var. and var. ms , ) are linked to high-resolution scans from manuscripts held in the national library of israel in jerusalem. for the chadwyck-healey collection see p. jones, ed., sir isaac newton: a catalogue of manuscripts and papers collected and published on microfilm by chadwyck-healey (cambridge: chadwyck-healey, ).
finishing touches are added to this rich website by a number of online tools (including a glossary of alchemical terminology) and an “educational opportunities” section that anyone who lectures in the history of science will find irresistible. newman’s videos on metal transmutation and mineral vegetation provide a unique opportunity to witness some of newton’s secret alchemical work. one of the main tasks of the newton project canada is to provide a center of operations for canadian-based transcription work on newton’s manuscripts and printed texts, mostly relating to newton’s commentary on the book of revelation, his work on chronology, and the leibniz–clarke correspondence. snobelen directs a team of transcribers whose work is sent to the newton project for a final check and then posted online. snobelen’s group also maintains a website that is tiny compared to the two other websites. yet the material provided is notable for the accuracy of its transcriptions and the scholarly commentary that accompanies it. it provides access to several downloadable pdf files consisting of transcriptions of newton’s “twelve articles on god and christ” (keynes ms , king’s college cambridge), with accompanying commentaries; transcriptions of several versions and documents pertaining to the general scholium, which concludes the second ( ) edition of the principia; and a transcription of newton’s apocalyptic chart (bodleian library, oxford, ms locke c. , fol. r), with a brief commentary. i cannot conclude without stressing the gratitude all historians of science feel for the arduous task that these fine scholars and their teams have set themselves. one of the biggest stumbling blocks to researching newton in the recent past was the difficulty in accessing his manuscripts. that difficulty is no longer a hindrance. one of the strengths of the websites under review is that they bring together, in one virtual space, manuscripts that are literally scattered around the globe. 
this has led to a momentous leap forward in research into newton’s thought. indeed, we are now in a position to begin asking questions about the interconnectivity of newton’s writings in a way that would heretofore have been impossible. these new possibilities have already enabled newton scholars to overcome the somewhat sterile opposition that polarized the field a few decades ago: that between the defenders of newton the scientist and those who gave a more prominent place to the “secret” adept of alchemy and heterodoxy. we all hope that generous sponsors will continue to be found in the future, in order to ensure the completion of this wonderful enterprise.

in addition, the chymistry of isaac newton project provides a free downloadable alchemical font. see http://webapp .dlib.indiana.edu/newton/reference/font.do (accessed oct. ). interesting information on the newton project and the relationship between print and online editions can be obtained at http://www.newtonproject.sussex.ac.uk/prism.php?id= (accessed april ).

something old, something new, something bold, something cool: a marriage of two repositories
carol ann davis & jason boczar
university of south florida
from the selectedworks of carol ann davis, june
available at: https://works.bepress.com/carol_ann_borchert/
http://www.usf.edu

something old...
a tale of an institutional repository and a digital collection.

first digital collection - burgert brothers http://digital.lib.usf.edu/burgert
first open access publication - numeracy http://scholarcommons.usf.edu/numeracy/
electronic theses and dissertations
open access textbooks http://scholarcommons.usf.edu/childrens_lit_textbook/

something new...

digital heritage & humanities collections http://digital.lib.usf.edu/dhhc

aligning the missions - strategic planning
(image: strategic planning workshop by samuel mann, cc-by)

vision and mission
dss’s vision is to promote worldwide long-term open online access to research and primary source materials on behalf of usf and its partners.
dss’s mission is to support the research and teaching activity of the usf community, promote innovative educational opportunities for students, and enrich the worldwide research landscape. as part of this mission, dss will:
• build digital collections that provide open access to research and primary source materials.
• advocate for author rights and promote scholarly publishing literacy.
• help researchers at usf move their work through the various stages of the research lifecycle.
• partner with faculty to present their research online and preserve their work through our repository systems.
• preserve long-term access to digital materials.
• explore applications of emerging technologies.

new strategic directions
• strategic direction # : establish a state of the art digital scholarship department to serve usf and our community partners.
• strategic direction # : develop and implement a strong marketing and outreach plan.
• strategic direction # : educate students and faculty on scholarly communication and digital literacy.
• strategic direction # : develop and implement a preservation plan for all digital collections to ensure longevity and security of our materials.

cross-training
• staff learning copyright clearance.
• staff assisting with quality control of repository items.
(image: “and we’ll keep on fighting til the end” from frankeleon, cc by)

refocusing efforts
● moving away from only digitizing special collections
● working on partnerships with other university departments
● pushing towards innovation and the cutting edge

something bold...

karam d & d collection http://digital.lib.usf.edu/karam

hidden treasures of rome
● [f]ormal analysis has been conducted for a sample of black gloss vessels dating from the roman republican period ( th to st centuries bce), with x-ray fluorescence (xrf), neutron activation analysis (naa) performed on a subset of these objects, providing data that sheds light on the production of the artifacts.

multimodal data analysis collection: a curated, open-access compilation of methodologies http://digital.lib.usf.edu/multimodal-data

university wide open access policy?
● working on the long process of getting a large university on board for an open access policy.
● utilizing the new cross-training opportunities of the new re-organization in order to operationalize an open access policy.

something cool...

http://ebplus.lib.usf.edu/

oa textbooks
• applied probability and statistics for engineers: a contemporary approach
• by kingsley a. reeves, jr., ph.d.
• https://vimeo.com/

sobek version: http://digital.lib.usf.edu/boucicault http://www.lib.usf.edu/boucicault/ http://digital.lib.usf.edu/test

after the honeymoon…

new skills needed
need graphic design and web skills!
digital graphic designer unicorn from tim green, cc-by

preservation: scholar commons
preservation: digital collections

future plans…
(image: dog chillin’ with red sunglasses by rollan budi, cc-by-sa)

future plans
departmental work plans
preservation plan!!!
oa policy for the university
collection interfaces
marketing and outreach
scholarly communication literacy
new partnerships
mentors/mentees
student competencies/skills
more people!!
upgrade equipment
department cleanup and organization

questions?
carol ann davis, director, digital scholarship services, borchert@usf.edu
jason boczar, digital scholarship and publishing librarian, jboczar@usf.edu
digital scholarship services website: http://www.lib.usf.edu/dss
book reviews

sessions, or lunchtime seminars. as previously discussed, offering an online option for these sessions is effective in increasing reach. related to this, creating effective online resources makes for great researcher tools. the author also offers examples of timely workshop topics reflecting common researcher issues such as copyright, social media, metrics (biblio and alt), fraudulent publishers, and data management. the book ends with an inclusive bibliography and index. although it may be optimal from a completist perspective to have a concluding chapter, it is not necessary in this particular book due to its designated use as a point-of-need source. practical tips for facilitating research is highly recommended, particularly for newly restructured research and/or instruction units. it is also valuable for individual librarians seeking new ideas or effective tools. (brenna helmstutler, georgia state university)

dynamic research support for academic libraries. starr hoffman, ed. chicago: neal-schuman, an imprint of the american library association, . p. $ . (isbn - - - - )

in this volume, editor starr hoffman has collected nine practical examples of innovative projects from librarians at universities in europe, mexico, and the united states.
hoffman defines “research support” broadly as “anything that a library does that supports the activity of scholarship and research at its parent institution,” in particular activities that create or foster an ethos of “…exploration, learning and collaboration.” (introduction, xiv) hoffman notes that the diverse projects presented here reflect a shared understanding that academic libraries and librarians should play an active role in the research life of their institutions. without being prescriptive or comprehensive, the book aims to provide readers with a wealth of ideas and insights to choose from and adapt to their particular community. hoffman divides the work into three parts, each containing three chapters. part highlights training and infrastructure initiatives, part relates examples from data services and data literacy, and part examines projects through the lens of research as a conversation. hoffman’s introductions to each part provide essential framing and context to help the reader link the three projects together conceptually and form a strategic perspective. the opportunity to expand and remodel the daniel cosío villegas library at el colegio de méxico (colmex) in mexico city sparked planning efforts to transform the college’s only library into a research library worthy of emulation across the country, as alberto santiago martinez writes in chapter . the planning proceeded in two basic cycles. the first cycle used brainstorming, a literature review, interviews with selected librarians, staff, and faculty, and a focus group of students. information gathered in the first cycle led to the decision to pursue resource center and information commons models not considered at the outset. the second cycle of planning built upon the first, with the library contracting with consultants from the united states. these consultants performed additional interviews with campus community members and the principal architect for the project. 
martinez notes that user needs identified through the planning process often conflicted with “traditional” models of an academic library. the planning committee encountered significant political and administrative resistance that favored a “book-oriented solution” to space requirements.

in chapter , fátima díez-platas describes how the university library of the university of santiago de compostela (usc) initiated a new digital humanities project to facilitate greater access to early illuminated spanish editions of ovid’s works, currently housed in special collections at several spanish universities. the digital library that was created, the biblioteca digital ovidiana, has brought a new level of visibility and organization to these editions, and enabled new and sophisticated inquiries. for example, scholars can now search illustrations within different editions by referencing their iconography, facilitating easier comparative analysis of various printings of works like metamorphoses across participating libraries’ collections. the biblioteca digital ovidiana also serves as a virtual museum of the works for those who cannot visit the libraries physically.

college & research libraries, november

in chapter , richard freeman presents an ethnographic study of the university of florida smathers library’s “developing librarian” project, a -month pilot training program helping librarians and staff develop digital scholarship (ds) skills. their digital humanities library group (dhlg) organized working/learning groups to develop the skills necessary to enhance an already digitized collection, the brothers grimm digital collection. by not focusing on digitization, the teams worked on ds skills more related to the research process of humanities scholars.
one working group produced an enhanced digital copy of the first english edition of kinder und haus märchen (children and household tales), with text fully encoded to text encoding initiative (tei) standards, providing group members familiarity and conversance in this particular text markup scheme. in addition, the group developed proficiency with the xml editor oxygen and the viewer program boilerplate. a second working group built an online exhibit of the brothers grimm tale cinderella, in the process learning omeka’s neatline. a third group created the scott nygren scholars studio, a physical space dedicated as a digital scholarship lab, with appropriate technology and functional space to support faculty with future ds projects. part , the “data services and data literacy” section, begins with chapter , where heather coates reports that the university library center for digital scholarship at indiana university purdue university indianapolis (iupui) has developed and piloted training for researchers to learn research data management. the training planning group used outcomes-based design, with learning outcomes identified through a literature review on research data management to determine best practices. where possible, the group incorporated active learning instructional approaches. although students provided positive feedback, the delivery intensity and course length continue to challenge the training group. after the first pilot, a single -hour day, students deemed there was too much information in too little time; by contrast, during the second pilot, four days of -hour sessions, students’ absences rose from the first session through the fourth. the group plans to develop stand-alone instructional units with targeted learning outcomes for discipline-specific courses. 
in chapter , ashley jester explains how the digital social science center (dssc) at columbia university libraries identified and organized their research data support services around a four-component research cycle model. this straightforward model (with planning, collecting, analyzing, and sharing components) helps dssc staff identify the patron’s stage of research through the reference interview and select appropriate dssc services to address their needs. dssc can provide researchers with advanced and sophisticated data management services, like providing expertise in producing “clean” data sets or in selection of statistical methodologies, because several of the center’s librarians possess doctorates and experience with social sciences research methods. jester notes that researchers outside the social sciences also use the dssc services, and that dssc’s analytical support services represent a unique contribution to the columbia academic community. karen munro of university of oregon’s portland library reports on providing geographic information systems (gis) support services to architectural studies students in chapter . the staff and librarians at her library realized that gis data and services have wide application beyond the departments often associated with geographic data, like geosciences and urban studies. munro built expertise in gis systems (particularly in arcgis software) through enrolling in a graduate course in gis at portland state university. she has applied this expertise to provide arcgis workshops for architecture students, create online tutorials to supplement student learning of the software, and offer ongoing support within a graduate-level architecture studies course on digital tools and methods. results are promising, and now other programs want to enrich their research with gis.
in the next chapter, the first in part (“research as a conversation”), dominic tate presents how edinburgh university library provided support for universitywide implementation of open access (oa) guidelines contained within the united kingdom’s research excellence framework (ref) of assessing quantity and quality of research at universities. the library adopted a decentralized approach, with staff visiting the separate schools of the university, informing faculty and researchers about green oa and assisting them in uploading items into the institutional repository. this approach enabled each school to tailor an implementation plan for oa that met their specific subject and research discipline needs. tate’s case study emphasizes the importance of communications planning in a decentralized implementation, as well as the academic cultural change necessary to implement an ref oa policy in a complex academic organization. a group of librarians and technologists from uit, the arctic university of norway, created a mooc (massive open online course) entitled ikomp to bring modern teaching approaches to their instruction in information literacy. in chapter , marian løkse, helene n. andreassen, torstein låg, and mark stenersen describe the planning and development process, first choosing the openedx platform for course development. their course design combined linear and nonlinear models of instruction to facilitate both start-to-finish instructional paths and student-initiated learning of specific skills. even with a highly knowledgeable team of information literacy instructors and technologists, implementing the mooc took more time than expected. however, the group views ikomp as a positive contribution to their information literacy efforts and a natural step in the progression toward more modern pedagogical approaches and enhanced library services.
in the volume’s final chapter, hannah tarver and mark phillips of the university of north texas talk about the development of the unt name app, a web-based application created to allow unt digital library to improve metadata consistency for name authorities across two of the digital library’s collections: unt theses and dissertations and unt scholarly works. better consistency in this metadata would support the best practices of linked open data and facilitate other applications in using the data. this django-based application was written using python. while not planning a systemwide attempt at authority control, this project has provided unt digital library with a name-authority tool that can be used for special collections or other digital projects pursued by the institution. a well-structured and timely collection of research support services projects, this book will interest a wide range of librarians and staff considering such services in academic libraries. (scott curtis, university of missouri–kansas city)

self-publishing and collection development: opportunities and challenges for libraries. robert p. holley, ed., for charleston insights in library, archival, and information sciences. west lafayette, ind.: purdue university press, . p. paper, $ . (isbn - - - - )

“many librarians consider self-published or indie titles to be nothing more than the current manifestation of vanity press publications - those titles that authors paid to have printed only to sit in their basements or garages since bookstores wouldn’t carry them and libraries turned them down even as gifts,” writes robert p. holley, the editor
cisp press. scholarly and research communication, volume , issue , article id , pages. journal url: www.src-online.ca. received july , , accepted july , , published november xx,

arbuckle, alyssa, siemens, lynne, christie, alex, with mauro, aaron, & inke. ( ). introduction, new knowledge models: sustaining partnerships to transform scholarly production. scholarly and research communication, ( ): , pp.

© alyssa arbuckle, alex christie, lynne siemens, aaron mauro, & inke. this open access article is distributed under the terms of the creative commons attribution non-commercial license (http://creativecommons.org/licenses/by-nc-nd/ . /ca), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
introduction, new knowledge models: sustaining partnerships to transform scholarly production
alyssa arbuckle & lynne siemens, university of victoria
alex christie, brock university
with aaron mauro, penn state erie, the behrend college
inke research group

alyssa arbuckle is assistant director, research partnerships & development in the electronic textual cultures lab at the university of victoria. she holds an ma in english from the university of victoria with a focus on digital humanities, and a ba honours in english from the university of british columbia. email: alyssaa@uvic.ca . alex christie is assistant professor in digital prototyping at brock university’s centre for digital humanities. he completed a doctorate at the university of victoria, where he conducted research on d geospatial expression and scholarly communication for the modernist versions project (mvp) and implementing new knowledge environments (inke). email: achris@uvic.ca

on january and , , researchers, students, librarians, and other participants came together for the third annual implementing new knowledge environments (inke) hosted gathering in whistler, bc, canada, “new knowledge models: sustaining partnerships to transform scholarly production.” thematically, discussions revolved around the many facets of digital scholarship: creativity, implementation, institutional interface, opportunities, challenges, audience, initiatives, sustainability, and more. dr. sally wyatt of the royal netherlands academy of arts and sciences (knaw) and maastricht university presented an opening keynote on “understanding the computational turn in the humanities: lessons from science & technology studies.” charting connections between science and technology studies and digital humanities, wyatt invited participants to envision how technology could be otherwise; that is, that technological development to date could have gone in many different ways.
in turn, wyatt explored knowledge-based economies and the role of digital technology in drawing attention to the importance of information and knowledge for charting contemporary economic and social patterns. digital technology has permeated contemporary scholarly practice. researchers and students often search for articles online, employ a bibliographic citation manager to organize their research, write on a desktop program, share drafts over email, publish in blogs and online journals, interact with colleagues over social media, and list their academic accomplishments on personal or community websites. networked scholarship has led to greater efficiency and more public engagement, but challenges and frictions also arise.

lynne siemens is associate professor in the school of public administration at the university of victoria. email: siemensl@uvic.ca . aaron mauro is assistant professor of digital humanities and english at penn state erie, the behrend college. he is the director of the penn state digital humanities lab and teaches on topics relating to digital culture, computational text analysis, and scholarly communication. email: mauro@psu.edu . inke (implementing new knowledge environments) is a collaborative research group exploring electronic text, digital humanities, and scholarly communication. the international team involves over researchers, graduate research assistants, staff, postdoctoral fellows, and partners. email: etcl@uvic.ca .
despite early efforts by groups to legitimize digital scholarship, such as the modern language association (mla) in its “statement on electronic publishing” (originally developed in ; revised in ), some corners of academia still consider the digital publication of academic work as inferior to traditional, printed and bound journals and monographs. concerns about status, reputation, and credit, coupled with a powerful corporate publishing system, inhibit the broad uptake and flourishing of open, social scholarship. in whistler, participants looked past these roadblocks to embracing technological change for a more accessible and efficient digital future for scholarly communication. technological change and shifts in the practice of knowledge production are not new concepts. in print, manuscript and the search for order, – , david mckitterick ( ) wrote: “each new technology does not replace the previous one. rather, it augments it, and offers alternatives” (p. ). at “new knowledge models,” participants grappled with this technological evolution and the alternatives it offers for scholarship. they considered what digital scholarship could and should look like and discussed how to create systems and methods that take advantage of technology’s opportunities, while maintaining intellectual rigour and historical and contextual significance. many of the articles in this collection cluster under the theme of open social scholarship. jon saklofske considers ways that theory, making, and praxis might be unified in order to imagine and implement new collaborative environments and publishing models. matthew wizinsky and jennifer brier outline the history moves project as a way to involve community members in the creation and distribution of oral history projects. with perhaps a controversial stance, rowland lorimer explores the worlds of commercial and open access and not-for-profit publishing to determine which might best serve academic interests.
he concludes that not-for-profit publishers might focus on market competitiveness rather than solely on open access as a way to benefit research audiences. finally, lynne siemens continues her research on inke as a collaboration, finding that year six is still a positive experience for researchers, with some of the usual challenges. the intensity of the collaboration is now winding down as the project comes to completion. as scholarly inquiry continues to migrate into digital environments, research findings emerge through changes in the forms and features of scholarly output; such digital forms also correspond with changes in the way such scholarship is produced and shared, constituting form and function as a mixed zone of programmatic scholarly thought. susan brown, linda cameron, anita cutic, mihaela ilovan, olga ivanova, ruth knechtel, andrew macdonald, brent nelson, stan ruecker, stéfan sinclair, and members of the inke research group unfold the manuscript in a state of hybridity, as legacy aspects of print books persist alongside emerging digital features. using the dynamic table of contexts reading interface as their primary case study, brown and co-authors identify and explain the hybridity of form (appearance and features) and the hybridity of roles (production process) as two pillars of the book in transition. john bonnett, mark anderson, wei tan, brian farrimond, and léon robichaud introduce structuremorph, a software environment that allows historians to produce expressive d models of buildings, or complex objects, that express changing data associated with an object (temporal, interpretive, and expressive) through the visual features of the object itself. writing across interpretive theory and computational practice, bonnett et al. share three-dimensional visions of humanist thought at the convergence of digital history and geographic information systems (gis).
alex christie unearths the importance of diversity in designing and refining digital knowledge platforms, sharing advances in the z-axis and pedagogy toolkit projects as case studies of such work. advocating for the necessity of a diverse range of voices to chart gaps in current conceptual frameworks and workflows, christie shares practical steps for cyberinfrastructure building that are guided by diversity as an encompassing mode of thought. colette colligan, with michael joyce, details the wilde trials international news project, a program developed with the simon fraser university library to track the use of text-sharing in disseminating the news of oscar wilde’s sex trials in french and english. detailing the development of the program from inception to production, colligan and joyce examine text-sharing from both an algorithmic and a historical perspective, reporting discoveries about the role of censorship in french and english news circulation revealed by the program. new models for knowledge production blend more traditional forms of scholarly inquiry with digital modes and methods. constance crompton and cole mash use contemporary database, graph, and visualization tools in their work with the devonshire manuscript, a sixteenth-century verse miscellany. using these tools, the authors examine the manuscript’s content, paleographic style, annotations, musicality, and relationship between poems, which allows them to develop evidence-based conclusions about the gender of unnamed contributors. in a similar trend of applying new models to older forms, john barber considers how radio dramas could be a participatory knowledge environment, due to the medium’s ability to communicate complex narratives to an engaged audience. richard lane suggests that digital humanities research should look to more community-based practices of open source software development.
if done with consideration, these practices could balance the demands of global, networked infrastructure development with the needs of learning and research communities. cultural institutions are rethinking how best to serve their constituents and the public at large, in response to emerging trends and challenges associated with digital scholarship. in her piece, lisa goddard explores the use of digital assessment systems to build rich digital collections that facilitate the research process for researchers and libraries alike. kimberly silk considers the role of the canadian research knowledge network (crkn), working with its partners, in the creation of an integrated digital scholarship ecosystem within canada. rebecca dowson continues the discussion of the position of libraries in digital scholarship environments by exploring how the research commons functions within the academic library. she explores the organizational structure of the research commons, which is designed to encourage interdisciplinary collaboration while providing support services to scholars throughout the research lifecycle. for kimon keramidas, research projects should apply a sense of design acuity that focuses purposefully on interactions across different media platforms. in this way, accessible knowledge production would integrate with intellectual rigour, and thus respond to the prevalent defamiliarization of academic work from public discourse.

in the thematic spirit of this volume, we undertook a collaborative peer review process. in a sense, peer review has always been collaborative.
subject area experts dedicate a great deal of their time, often with little acknowledgement, to improve the quality of submissions, assess research for relevance, or determine the veracity of their findings. anonymity has long been used as a means of ensuring the objectivity of reviewers. the gold standard of such evaluation in the humanities, the so-called double blind peer review, is meant to remove any bias gained by knowing the identity of either the reviewer or the author. while the means of peer review have evolved over time, the goal has always been to strive for an ideal of scholarly rigour, while also achieving a sense of social legitimacy for those knowledge stakeholders involved in accessing or producing research. the resulting process for this issue of scholarly and research communication (src) has been a hybrid approach that blends single blind and open forms of peer review alongside online and in-person communication strategies. we employed this method successfully in for a volume of proceedings from inke gatherings in sydney, australia, and whistler, bc (arbuckle, mauro, & siemens, a, b, c), and it seemed fitting to repeat this process with the current proceedings. we organized reviewers by pairing undergraduate students, graduate students, and postdoctoral fellows with senior faculty to distribute expertise and experience evenly across the group. all members of the reviewing team read their assigned papers in advance and met for a single collective review session we called a “peer-a-thon.” our methods sought to challenge the standard form of peer review while allowing the process to mentor new and emerging scholars and emphasize collaborative writing among reviewers, with the goal of increasing reviewer reliability. the focus for this approach is the level of consensus between reviewers developed through in-person dialogue and collaborative writing.
if both members of a reviewer group can determine and describe the value and rigour of an article, it could be reliably determined to be a strong submission. if the submission required one member to interpret and describe the content to the other member, the presentation of the content may be in question and require further development. our method responds to certain weaknesses in current peer review practices, not the least of which are rates of sexism and nepotism (wenneras & wold, ). as nina belojevic ( ) writes, it is important to develop alternatives to standard peer-review practices as these alternatives can inspire “a critical perspective on practices, tendencies, or norms that may otherwise simply be accepted without consideration or question” (p. ). the reliability of blind peer review has often been questioned; for instance, carole j. lee, cassidy r. sugimoto, guo zhang, and blaise cronin ( ) present evidence that agreement between reviewers occurs with rates similar to chance, or to that found in similar interpretations of rorschach inkblot tests. since consistency across peer reviewers is, arguably, one of the weakest indicators of an article’s relative merit, we sought to implement research that suggests improving evaluation through learning and training to help achieve consensus (jayasinghe, marsh, & bond, ). if consensus in each peer-a-thon reviewer group can be achieved, the relative merits of argument, evidence, and veracity must also be met. while each of these articles was distributed and commented upon through the journal, scholarly and research
communication, with the installation of open journal systems, the ability to discuss and interpret the relative merits of the submission in person served as the final review stage in which the comments to authors were drafted. the articles collected in this volume illustrate the creative approaches researchers, students, librarians, and practitioners are taking within the context of new knowledge production models. from project-based examples of digital humanities experiments to broader arguments around the importance of reconfiguring academic research to be more public-facing, the articles included here speak to techniques, models, and challenges for doing scholarship in a networked, digital world. collectively, they reveal zones of activity in which humanities knowledge—whether rooted in excavations of the past or experimentation in our digital present—enables creative solutions to the ever-evolving challenges of knowledge work. as this collection demonstrates, such solutions do not exist apart from each other, but instead are best mobilized through a deep understanding of their interdependence. whether revealing new methods for representing knowledge in digital environments, reflecting upon the practices and pragmatics for building such environments, or taking a theoretical framework to chart clear pathways for future reflection and action, the contributors to this volume are in distinct conversation with each other. in this sense, the scholarship collected here is fundamentally networked, reflecting its implementation at the whistler gathering, “new knowledge models,” where contributors gathered to exchange and mobilize such knowledge in person. in such an environment, no single perspective can be fully grasped without an awareness of its others. this open, social sharing of scholarship constructs, in turn, a community of practice in which a diversity of strengths and interests complement and support each other to maintain a critical mass of thought.
beyond appearing as isolated entities, each article in this collection—each contributor—presents a different face of networked, open, social scholarship’s vitality.

references
arbuckle, alyssa, mauro, aaron, & siemens, lynne. (eds.). ( a). scholarly and research communication [special issue], ( ).
arbuckle, alyssa, mauro, aaron, & siemens, lynne. (eds.). ( b). scholarly and research communication [special issue], ( ).
arbuckle, alyssa, mauro, aaron, & siemens, lynne. (eds.). ( c). scholarly and research communication [special issue], ( ).
belojevic, nina. ( ). developing an open, networked peer review system. scholarly and research communication, ( ).
jayasinghe, upali w., marsh, herbert w., & bond, nigel. ( ). a multilevel cross-classified modelling approach to peer review of grant proposals: the effects of assessor and researcher attributes on assessor ratings. journal of the royal statistical society. series a: statistics in society, ( ), - .
lee, carole j., sugimoto, cassidy r., zhang, guo, & cronin, blaise. ( ). bias in peer review. journal of the american society for information science and technology, ( ), - .
mckitterick, david. ( ). print, manuscript and the search for order, – . cambridge, uk: cambridge university press.
modern language association. ( ). statement on electronic publishing. new york, ny: mla.
moreh, jack. ( ). connectivity idea with lightbulb. url: http://www.stockvault.net/photo/ /connectivity-idea-with-lightbulb [october . ].
wenneras, christine, & wold, agnes. ( ). nepotism and sexism in peer-review. nature, , - .

embodying disruption: queer, feminist and inclusive digital archaeologies
katherine cook
department of anthropology, university of montreal, canada

inclusive approaches to archaeology (including queer, feminist, black, indigenous, etc. perspectives) have increasingly intersected with coding, maker, and hacker cultures to develop a uniquely ‘do-it-yourself’ style of disruption and activism. digital technology provides opportunities to challenge conventional representations of people past and present in creative ways, but at what cost? as a critical appraisal of transhumanism and the era of digital scholarship, this article outlines compelling applications in inclusive digital practice but also the pervasive structures of privilege, inequity, inaccessibility, and abuse that are facilitated by open, web-based heritage projects.
in particular, it evaluates possible means of creating a balance between individual-focused translational storytelling and public profiles, and the personal and professional risks that accompany these approaches, with efforts to foster, support, and protect traditionally marginalized archaeologists and communities.

keywords: digital archaeology, queer, feminist, inclusive scholarship, public archaeology, diversity

digital technologies (especially the web) were sold to us as democratizing tools that would transform the inequities inherent in communications, research, and institutional structures. when the shortcomings started to become visible, risk and danger were marketed to us as part of what everyone goes through to create good research and art, to innovate, to be successful. but that was not true either: some people are forced to take on more risk than others. the lines of privilege and power are far more insidious in our technology-drenched worlds than those who benefit from it care to recognize, let alone address, and there is a very troubling pattern intensifying before our eyes. risk-taking has long been a central part of both art and the sciences. its role in archaeological research is perhaps less clear, particularly when we focus on risk to archaeologists (as opposed to the physical risks of damaging or destroying archaeological sites and materials, or abstract risks of knowledge loss). while early antiquarians chose (from a secure place of privilege) to face dangers of colonial (aka conquest) approaches, more and more archaeologists are forced to put their own well-being, their careers, and their work on the line to push forward a more inclusive past and present. what emerged out of post-processual, feminist, queer, indigenous, black, and post-colonial discourses was the centrality of who does archaeology and whom that archaeology affects.
european journal of archaeology ( ) , – © european association of archaeologists doi: . /eaa. . manuscript received december , accepted april , revised march
unfortunately, we still seem to be coming to terms with the impact of this shift; the push to make room for alternative ways of knowing and inclusive or equitable participation requires individuals to be ‘the first’, and in turn to face all the blowback and take on the long process of re-education when it comes to the recognition of pervasive sexual harassment, abuse, and discrimination in the discipline and beyond. at the same time, public, translational, and engaged scholarship demands that researchers, volunteers, and communities be in the spotlight in a way that we have not seen in the past. digital technology is playing a significant role in this transformation, providing the opportunities to disrupt conventional archaeology in creative ways, but also creating intensively individualized and public profiles embedded in new channels for abuse, particularly in the age of internet trolls and cyberbullies. the history of digital disruptions of archaeology, history, and heritage is critical to understanding the relationship between intersectional identities (gender, sexuality, race, ethnicity, health, etc.) and risk-taking, particularly motivated by social change mediated or facilitated by technological innovation. this includes the increasingly common practice of leveraging scholars’ own identities, experiences, and perspectives to make and take up space for multivocality and fluid positionality, and to counter structures of privilege.
this paper will trace the ways in which queer, feminist, and more broadly inclusive disruptions of traditional forms of communication, values of objectivity, and gate-keeping of knowledge increasingly draw on creative uses of digital and hybrid platforms, taking on many of the goals of transhumanism and posthumanism to unlearn, unmake, unbecome traditional social structures and restrictive identities. however, in so doing, the individuals and communities behind this work risk far more than ‘normal’ levels of failure encompassed by experimentation, research, and innovation (loss of time, resources, materials, etc.); in activating our own identities and past traumas, we risk ourselves more than anything. with growing documentation of harassment and threats, impact on mental health, and the high rate of burnout, are humans part of the collateral damage of this transhumanism? and if so, are the potential outcomes of do-it-yourself digital disruptions truly worth the risk?

cultures of inclusivity

it is no coincidence that researchers committed to inclusivity and equity increasingly connect with the ethos of a creative and open digital scholarship that breaks and confronts academic norms. this translates into a tradition of risk-taking in several ways, including challenging conventional research and dissemination practices, transforming representation of people in the past, and supporting marginalized scholars in the face of the exclusionary structures, abuse, and trauma of research spheres. queer scholars, for instance, who by nature do not easily move through the biased structures of these research spheres, being more likely to be excluded from conventional funding opportunities, publication structures, and even career models, are often correlated with innovation and breaking convention (dowson, ; halberstam, ). in the frank words of halberstam ( : ): ‘failing is something queers do and have always done exceptionally well.’ the sheer impossibility of ‘succeeding’ through normative models can push these ‘unconventional’ scholars to take greater risks because they already occupy uncharted territory and, therefore, by default take unconventional ‘do-it-yourself’ approaches, which in turn blaze a trail for more conventional scholarship to follow (halberstam, : ). these ‘rogue intellectuals’ are also more likely to recognize and react to heteronormative representations of the past and fight for inclusive interpretative paradigms.

[footnote: this article comes from a queer, feminist, cis-female, white, settler perspective, a position that holds a great deal of privilege. while i highlight and honour indigenous, black, trans, and other diverse voices, i neither wish to speak over nor appropriate their words or experiences. at times, this discussion is, therefore, weighted more heavily towards queer, feminist theory, but i emphasize the importance of the cited literature to truly explore and support diverse perspectives in (digital) archaeology.]
[footnote: risk includes, but is not limited to, interconnected professional risks (in education and training, employment and career progression, with economic, personal and social implications) and personal risks (mental/physical health, well-being, safety), through exclusion, discrimination, harassment, abuse, assault, hate crimes, etc. this article uses ‘risk’ to encapsulate all these facets, as they often come as a package, but, where relevant, will specify which facet of risk in particular is at work.]
early texts in queer archaeology highlighted the ways in which homosexual men and women negotiated academic, disciplinary, and structural homophobia (obvious or subtle) by choosing when and how to deny, downplay, or share their sexuality in relation to maintaining authority and place within the discipline (cf. dowson, : – ). these asymmetrical relationships between homosexuality and heterosexuality only really represent the most visible tip of a much wider set of entangled identities and related issues, including bisexuality’s problems of bi-erasure, biphobia, and lack of representation, asexuality’s lack of recognition, or trans identities and the challenges of gender-sex-sexuality conflation and very particular modes of transphobia (weismantel, ). queer archaeology also includes challenging the pervasiveness of heteronormativity in archaeological interpretations, with a substantial role to play in transforming assumptions, expectations, and normative structures for people past and present. nevertheless, recent political, legal, and social threats to these identities have shown the ongoing dangers of being (or being perceived to be) queer or ally archaeologists. naturally, queer archaeology cannot be wholly and completely separated from feminist archaeology. the complexity of internalized/auto homophobia, ongoing conflation of gender and sexuality in contemporary society, and the complex intersections with race, ethnicity, class, and religion (claassen, : , ) blur the lines between homophobia and misogyny and, therefore, queer and feminist reactions or disruptions. influenced by the layering of discrimination, fear, and hate levelled at scholars along the lines of gender, sexuality, race, ethnicity, and indigeneity, it is no stretch to say that feminism in archaeology is multi-dimensional, multi-scalar, and multi-directional.
it includes, but is not limited to, making women visible in the past, exploring gender and sexuality, making the discipline more equitable and less exploitative (wylie, ; conkey, : – ; battle-baptiste, ) as well as using archaeology as broader political action (wylie, : – ). constantly evolving, ebbing, flowing, and re-evaluating theory and practice, the position of feminism in archaeology is also ever in flux, as is its potential to influence broader discourse, methodologies, and theory. the feminist discourse of the visual language and representation of archaeological knowledge (gifford-gonzalez, ; conkey, ; moser, ) heavily influenced early applications of digital media in archaeology. if imagery in print or visual media was not neutral, we certainly cannot expect that digital media will naturally address issues of representation, essentialism, and patriarchal values. the values of colouring outside the lines have given rise to a particular brand of inclusive archaeology, defined by innovative digital visualization and communication practices that challenge our assumptions about what people looked like, what roles they played, and how they moved through and experienced the landscape throughout history (morgan, ). these projects actively employ non-traditional means of storytelling (beyond articles or monographs), often through compelling translational narratives, to further challenge research norms through disruption and activism (cf. ulysse, ).

digital communities and inclusivity

maker and hacker cultures have also connected technological innovation with social disruption, challenging not only dominant tech culture but also broader social structures of inequity and exclusivity (richterich, ; smith, : – ). the maker movement in particular was ‘founded on a philosophy that values the sharing of diverse knowledges. it is an extension of the “do it yourself” (or diy) movement and … the democratisation of knowledge and technology, and experimentation and innovation through the use of shared resources’ (compton et al., : ). these hives of engagement and learning serve as hubs for shared technology, tools, and materials (richterich, ). framed by the values of low-barrier entry (economics, education, skill level), flexible and experimental processes, and an ethos of collaboration, makerspaces are becoming social statements. critical making is being used for activism (or maktivism, morgan, : – ) through shared resources, experiences, memory, heritage, and trauma, with a nod to a longer history of marginalized communities using crafting circles as nodes for activism (rogers, , ; crooks et al., ; riley et al., : – ). despite these grassroots beginnings, makerspaces are becoming heavily institutionalized, finding their place on university campuses and in museums, galleries, and libraries. although this shift has made makerspaces more easily accessible to archaeologists, it has split up communities, setting up new barriers of access and approaches. these spaces also struggle with equity and a tendency to become dominated by heterosexual, white, cis-male individuals (taylor et al., ), and there is a documented history of discrimination and harassment targeting individuals who do not fit the normative tech moulds (martin, ).
today, the maker movement embodies a number of ‘digital divides’: it at once creates and challenges inequities and human limitations, and it operates as a mainstream, technoscientific movement even as its style remains deeply grassroots and even ‘guerrilla’ (wajcman, : – ). the contributions and value of diverse scholarship in all these settings are clear, but there is much more ground to cover in making truly inclusive communities of practice. queer, feminist, and maker communities, for instance, have been critiqued for not doing enough to recognize or stand in solidarity with their trans, indigenous, and/or black members in pre- and post-digital eras. this has perhaps most clearly been articulated by ann ducille ( : ) who, more than two decades ago, drew attention to the crisis of scholarship resulting from ‘the hyper-visibility, super-isolation, emotional quarantine and psychic violence of … precarious positions in academia’ for black female intellectuals. considering recent political developments and the ways in which discrimination is enacted and weaponized in online and digital worlds, this situation has only been exacerbated since this seminal work. inclusive intellectual communities-cum-paradigms, such as feminist, queer, black, indigenous scholarship, embody at once their own but also more collective history of exclusion, resistance, and proliferation.
the resulting complexity of targeted exclusion and discrimination, and their connection to digital scholarship, is critical when examining contemporary interplays between disrupting normativity and creativity (as epistemological and pedagogical tools) that are transforming traditional archaeology.

disruptive digital archaeologies

born out of inclusive archaeologies, digital innovations have been increasingly harnessed as part of an empowered sense of diy and the ability to creatively amplify unconventional voices. strategic applications of technologies and media to defy, to confront, to derail, to remix, to subvert can be characterized as digital archaeologies that:
i. confront the archaeological past we have created
ii. confront the present (particularly of the discipline)
iii. confront authorship and authority
iv. act as platforms to support the above.
while these waves of initiatives and projects may work independently or be interwoven, it is the collective impact of these digital archaeologies and the reactions they stimulate that join them together in a wave of disruption.

confronting the past

today, many digital projects seek to challenge the narratives traditionally presented in archaeology, breaking norms, confronting assumptions, and demonstrating diversity and fluidity of identities in the past. early applications, particularly within the realms of visualization and communications, intended to shift perspectives and the positioning of people in the past, have a distinctively feminist flavour. what has been described as ‘add women and stir’ has transformed into the progressive upending of normative assumptions and recognition of greater diversity.
importantly here, and perhaps defining what separates these projects from more traditional archaeologies challenging identity in the past, digital media transform our methods of ‘writing’, editing, presenting, and collaborating in archaeological narratives (see also tringham, : – and lopiparo and joyce, ). in reconfiguring structures of engagement, intimacy, immersiveness, layering, and temporality, digital archaeology has embraced the creativity of early feminist and queer narratives and run with it. from early works, such as joyce et al.’s ( ) sister stories and tringham’s ( : ) chimera web using hypertext (see also joyce & tringham, ), via more contemporary uses of social media and websites (morgan & pallascio, ) to virtual and augmented realities and gaming (morgan, ; perry et al., ), the flexibility and ‘democratizing’ ideals of digital formats and open access are often noted as points of attraction for archaeologists seeking to construct more diverse narratives of the past. these projects are also part of a much wider, interdisciplinary push to use public digital resources to challenge normative, mainstream, and exclusionary views of the past (see for instance the tumblr resource people of color in european art history, ). tringham’s ‘dead women do tell tales’ ( ) highlights a further emerging trend: the integration of digital databases, visualization, and narration to weave together more complex histories without losing the appeal for broad audiences to engage not only with the past but the ways in which we construct it (tringham, : , see also tringham, ).
building on tringham’s earlier work on life histories and narratives of people in the past, using creative expression and embracing ambiguity (tringham, , ), the project spotlights the all too often opaque process of archaeological interpretation by employing imagined narratives of the life histories of women at Çatalhöyük to demonstrate their connection with primary data. it is not merely adding women to the past but reconfiguring the construction of identity in archaeology to make space for alternative narratives and critical evaluation of traditional interpretations. following the public debates over the bbc’s portrayal of life in roman britain in and other recent exclamations over political correctness, the need to present alternatives with evidence and critical discussions should not be underrated. however, it is not without risk, as we have seen with the overwhelming levels of abuse and threats that many women, queer, indigenous, and black scholars have received over defending alternative narratives (cf. beard, ). creating narratives that challenge contemporary normative values and systems of oppression, or defending them, is a mix of russian roulette and poking a hornet’s nest; while some projects seem to go unnoticed, catching the attention of even one internet troll sets a whole system of hate in motion. it should also be recognized that the risk is not entirely limited to the researchers creating or defending inclusive narratives. increasing engagement with marginalized archaeologies necessitates participation in difficult heritage and intensely political positions (for instance the projects mapping the du bois philadelphia negro project (http://www.efishdesign.com/sites/dubois/overview.php), digital harlem (http://digitalharlem.org/), the trans-atlantic slave trade database (https://www.slavevoyages.org/?xid=ps_smithsonian); see also morgan & pallascio, : – , kamash, ).
given the deep history of discrimination and inequity, this model of digital archaeology comes with potential to harm descendant communities, the public, and archaeologists due to the emotional trauma often connected and resuscitated through these practices. these tensions and traumas, however, can be mobilized to address legacies of discrimination, injustice, and their connections to contemporary inequities, particularly through technologies that layer the past on the present to connect the familiarities of everyday life with their dark heritage (figure ). the careful use of discomfort and connection to emotion through narratives, disruptive imagery, and media, and the juxtaposition of the familiar present with unexpected or unknown histories can be a very powerful use of digital technology. but it takes a great deal of skill and collaboration to mediate potential risk for contemporary communities, who are already dealing with extreme levels of systemic discrimination and trauma.

confronting the present

although archaeologists have, in the past, played almost invisible or at the very least non-personal roles in public dissemination, there is greater emphasis on archaeologists’ identities, and particularly the diversity of who can be an archaeologist, to create a more inclusive field. this work can also be branded as activist archaeology and translational storytelling, but also as aligning with work to address discriminatory and unequal structures and norms. it often emerges most strongly in the face of work action and concerns over equity, inclusivity, and security in the workplace.
however, it also looks to attract more diverse people to the field of archaeology to challenge the persistent underrepresentation in this discipline (cf. agbe-davies, ). a number of projects under the banner of archaeogaming playfully fit this description. for example, the c- dating game (https://www.winterwolves.com/c dating.htm) is a simulation game where players take the role of an undergraduate intern. importantly, the game-play includes finding friends and romance without any gender expectations or structures in place. seemingly a very simple element, it is relatively revolutionary when representation of queer archaeologists remains ambiguous at best in most narratives; the opportunity to choose begins to challenge those expectations and make space for diverse individuals. the frameworks that we create for participation in archaeology, whether through games or other media, and the identities we craft in archaeological ‘characters’ that populate these media, shape user experiences but they also frame public and disciplinary expectations and imaginations (see also dennis, ). the trowelblazers project (https://trowelblazers.com) also challenges representations of archaeologists.
triggered by a conversation on twitter that led to a network of digital and analogue resources on women in archaeology, geology, and palaeontology, this project has stitched together the full range of digital technologies (hassett et al., ). their recent ‘raising horizons’ initiative was a collage including crowdfunding, the contribution of artists working in a range of mediums, social media, digital and print resources, and physical exhibits or events to showcase contemporary and historical women in these disciplines, creatively drawing connections between their experiences and points of view. one part social media, one part crowdsourcing, and one part creative media creation, the project has mobilized feminist voices and activism through largely web-based communications. nevertheless, the emphasis on real-life archaeologists requires real people to take the risk of sharing themselves as part of their work. while the ideals of reflexivity and self-awareness would raise the question ‘why not?’ (after all, we are part of our interpretations), in the age of the internet this level of openness and individuality carries a more sinister risk (see below).
figure . conceptual art for ‘built on bones’, an augmented reality app to draw attention to the dark legacies of colonialism by augmenting contemporary cities with the bones of the past (cook, ).
european journal of archaeology ( )
confronting authorship/authority
digital technologies can also confront the exclusionary view of archaeologists as the only experts in reconstructing the past. the social networking, communication opportunities, and interactivity of web-based platforms lend themselves well to the equalizing ideals of today’s collaborative archaeology, which has been influenced by indigenous, black, and postcolonial archaeologies in particular. one of the earliest websites mobilizing the internet to promote community collaboration is carol mcdavid’s ( ) levi jordan plantation website (http://www.webarchaeology.com/html/default.htm), part of a project examining slavery and african-american culture on a plantation in texas (see also mcdavid, ). using what now seem like very simple web-based feedback forms, alongside non-web-based interviews and participation, the project invited dialogue, participation, and contributions from descendant communities, local communities, adults and children alike, and anyone with an interest. mcdavid ( ) noted: ‘we wanted to learn if computers can be used to create “conversations” about archaeology and history among lots of different people.’ this project’s legacy is echoed in many community archaeology projects today, such as terry brock’s ( ) all of us would walk together website, which provided opportunities for descendant communities and the general public to participate, share stories, and build family trees (see also mcdavid & brock, ).
social media have also significantly contributed to combating the privileging of (eurocentric) archaeological discourse, research, and interpretations. archaeologist joanne hammond (@kamloopsarchaeo) has famously used twitter to post edited images that contrast the problematic commemorative signage in canada, which typically erases indigenous heritage to celebrate european colonization, with newly written narratives that decolonize our perspectives on the past (see for instance https://twitter.com/i/moments/ ).
kisha supernant (@archaeomapper), a métis archaeologist, has also used twitter to challenge ways of knowing the past and to highlight the discriminatory structures, attitudes, and treatment faced by indigenous scholars through courageously frank and honest posts about her own experiences. although framed once again by personal and professional risk, these voices blur the lines of authority, participation, and ways of knowing, which is critical to repairing and reshaping relationships between archaeologists and descendant communities, as well as challenging us to review our approaches to (public) archaeology.
recognizing the problems of authority, authorship, and control of the past also includes acknowledging that not all applications of digital technology are necessarily appropriate, even when motivated by a goal of ‘representing’ or ‘including’ that heritage in wider discourses and adopting increasingly mainstream approaches. the work of beth compton ( ), which examines the complexity of authenticity, ontology, and materiality when it comes to 3d models and 3d prints, is particularly powerful in this context and challenges perceptions of objects (with which we think we are deeply familiar) and technology (with which we often overestimate our familiarity) (see also brown & nicholas, ; cook & compton, ; jones et al., ). challenging authority in contemporary archaeology necessitates a much more critical application of ethics and commitment to collaborative research, recognizing both the lack of understanding of data access and preservation and the tech-influenced emphasis that is placed on doing what is innovative over what is right or responsible.
platforms for support
space-making initiatives, that is, the design of platforms, publication venues, and support for more diversity in the discipline and narratives of the past, have played a critical role in encouraging the types of inclusive and equitable digital archaeologies described above. the goal here is to showcase the voices of diverse scholars and creators to increase their impact and support their progression. when approached as more than tokenization or shallow pr stunts, transformative diversity and inclusion work can create the conditions for social change in the structure of archaeology and beyond, amplifying marginalized voices, challenging our perspectives on the past, and in turn demonstrating the relevance of the discipline in contemporary society.
the heritage jam, the brainchild of sara perry and anthony masinton, has been a pioneering platform for innovative digital archaeology and heritage practice since . with its open call to ‘anyone interested in the way heritage is visualised’, free entry, and flexible formats, timelines and engagement, the heritage jam has been successful in bringing together a range of individuals interested in heritage (both within and beyond professionals and students of archaeology), showcasing diverse histories and perspectives on the past (heritage jam, ).
the inclusiveness policy and code of conduct are two cornerstones of the jam; their aim is to ‘provide a safe, inclusive and welcoming environment … where everyone is free to express themselves regardless of gender, sexual orientation, ability, appearance, ethnicity, citizenship, socioeconomic status, religious beliefs or age’ (heritage jam, ). they also ask participants and audiences to celebrate individuals making an extra effort to be inclusive and welcoming, while prohibiting harassment, abuse, discrimination, and derogatory or demeaning speech. finally, the website serves as an archive for jam entrants; with its large audience, the archive has been particularly successful in promoting the work of diverse people and in providing international reach and networking opportunities. it is perhaps no surprise that many of the entrants and the projects created and submitted to the heritage jam over the years have exemplified inclusive approaches to the past and the present (including cook, and tringham, cited above).
more recently, epoiesen, an online publication initiative based at carleton university in canada and established by editor-in-chief shawn graham, has taken up the challenges of making space for diverse and alternative media formats and knowledge in archaeology and history (cf. pálsson & aldred, ; heckadon et al., ). characterized as ‘a journal for exploring creative engagement with the past, especially through digital means … [primarily through] “paradata” or artist’s statements that accompany playful or unfamiliar forms of singing the past into existence’ (epoiesen, ), the journal provides an opportunity to publish open access without any fees (lowering the cost of entry) and showcases alternative ways of engaging with the past. it regularly publishes the work of students, professionals, and individuals ‘outside’ traditional careers in archaeology or history, in addition to allowing annotations and further engagement between authors/creators and readers. with a diverse editorial board, authorship, and audience, the journal has also been at the forefront of important conversations about inclusive publishing policies.
other relevant endeavours include building digital communities for collaboration and support, such as the women’s digital archaeology network (https://caa-international.org/ / / /womens-digital-archaeology-network/) and the reciprocal research network (https://www.rrncommunity.org/), or initiatives providing inclusive community building and training opportunities, such as michigan state university’s institute on digital archaeology (lynne goldstein and ethan watrall), which includes participants at no cost (from students through to established researchers) and works to build inclusive and equitable environments. the value of creating more platforms like these, and of explicitly outlining inclusive policies, should not be underestimated: they make space for more diverse scholars, encourage equity and allyship among all participants, and put pressure on more traditional publication venues and institutions to transform their own practices.
at the same time, these initiatives take time, effort, and funding. often working above and beyond their typical duties, the individuals creating these support platforms also take on incredible weight, stress, and risk. those responsibilities, the service provided by these pioneering communities, and their value to building inclusivity should not go unrecognized, but rather must be acknowledged and protected in their own right.
the dark side of disruption
while these projects serve as markers of active disruption and points of inspiration, it is the people behind them and their experiences of moving through these worlds of archaeology, technology, academia, and beyond that highlight how far we still have to go. notably, a growing archive of documented harassment and abuse (cf. clancy et al., ; nelson et al., ) is only just beginning to hint at the widespread challenges and emotional toll that targeted members of the archaeological community continue to face.
it is true that structures of discrimination, intimidation, and harassment have an unconscionably long history in archaeology, including in the specialization of digital archaeology. ruth tringham, for instance, in her discussion of ‘dead women do tell tales’ and earlier web-based work, notes that ‘without the support of meg conkey, janet spector, and rosemary joyce, i might have been discouraged from this endeavour in the resistant atmosphere of the early s’ (tringham, : ).
why is it different now? because technology, contrary to the hopes that it would enhance human abilities and overcome human limitations, has in fact opened the door wider for abuse via the web, particularly for individuals and groups that were already at risk. the publicness of archaeology on the web has attracted a great deal of attention and developed a global reach, but this can be a double-edged sword. the emphasis that is placed today on sharing personal histories, developing an individual profile, and being a ‘public face’, coupled with the ease with which personal information, including contact details, can be acquired online, is a dangerous combination. it is particularly accentuated by the degree to which we remain connected to the internet at all times through mobile devices, applications and automated notifications. abusers and harassers can reach individuals at all times. finally, the interplay of public and private online communications (perry, : – ) lends itself to manipulation, allowing individuals to remain publicly friendly or polite and privately abusive, or even to use public communications to incite widespread harassment.
martin ( ), considering the position of women in makerspaces, outlines the ways in which persistence in tech-dominated fields brings out both subtle and overt forms of discrimination and abuse, ranging from ‘suspicion’ that women are not actually the masterminds behind innovative digital products, to more direct forms of public harassment and defamation via social media or blatant exclusion from events. similarly, dennis ( ) has drawn attention to the significant issue of ‘problematic participants’ in gaming culture, which includes ‘elements of misogyny, white supremacy, and anti-intellectualism’, and manifests itself in targeted online abuse that can even escalate to offline harassment.
most recently, geraldine deruiter’s effort to study online abusers through interviews demonstrated not only the complexity of the psyche of online abusers and the resulting volatility of hate, misogyny and harassment online, but also that:
‘while we regard online misogyny and abuse of women as something wholly separate and different from its so-called “real-world” counterparts, these are all components of the same system. we dismiss sexual harassment that happens on the internet in the exact same way that we dismiss sexual harassment that happens face-to-face, even though these experiences are often just as bad, if not worse, for the victim, often due to the mechanics of the anonymity of the internet.’ (deruiter, )
particularly concerning is the system of teaching victims of online and offline abuse to believe that they brought the abuse upon themselves and, therefore, to willingly put up with further damage to themselves. after recording high levels of inappropriate digital engagement, perry ( : – ) draws attention to the lack of recourse or means of protection, with corresponding low rates of reporting and rare institutional support, despite many institutions now mandating public and private digital engagement. this is critical to any discussion promoting digital archaeology for public engagement, networking, and dissemination. whether it is in official commissions of such work or through more subtle promotion of the ethos of community-engaged scholarship (which also has its own problematic history of inequity and responsibility placed on women, indigenous scholars, people of colour, etc.), if no form of support or protection is offered (despite well-documented abuse and danger), then the institution knowingly puts these individuals in danger, expects self-sacrifice and risk on their part only, and in turn profits from it.
this should never have been acceptable, and addressing these risks must be a priority in future for every single individual or institution associated with archaeology and heritage.
conclusion: the collateral damage of transhumanism
‘we have the right to a safe, secure and non-threatening working and living environment. we do not tolerate any form of discriminatory, abusive, aggressive, harassing, threatening, sexually- or physically-intimidating, or related problematic behaviours that compromise the wellbeing, equality, security or dignity of other human beings.’ (perry, )
without risk, there is no reward. the person who risks nothing does nothing. when we risk going too far, we discover how far we can go. in today’s era of motivational speak, risk has been singularly rebranded as a badge of honour. in turn, risk is considered a cornerstone of art, innovation, creativity, and ultimately, change. perhaps ironically, then, it is the #metoo, idle no more, and black lives matter movements, among others, that have shone a light on the dark underbelly of taking chances: the demand for individuals to step forward and share their voice paints targets on the already vulnerable and marginalized for fear- and anger-filled hate and aggression, repeatedly and relentlessly beating down the voices of change.
often forced to choose between the long-term, abstract risk of doing nothing (and, therefore, nothing ever changing for the better) and the immediate and often personal risk of trying to confront the system, the individuals leading the charge of these movements, in the name of equity, security, and inclusivity, face harassment, abuse, suspicion, imprisonment, and violence. this has been part of the growing critique of positive thinking, this ‘mass delusion’ (ehrenreich, : ) centred on personal responsibility, where hard work leads to success and poor choices lead to failure, rather than recognizing the true force and pervasiveness of underlying structural conditions (halberstam, ).
this is perhaps the greatest flaw in transhuman and posthuman philosophies: the unflinching commitment to technology and science to evolve beyond human conflicts and limitations fails to protect humans now, risking the creation of greater fissures rather than making progress. the digital has reformulated the ways in which we engage with the past and produce knowledge in the present, but we have taken many steps backward much faster than the individuals cited above (and many others) have clawed their way forward to envision the past in new ways through creativity, making, and inclusivity. it would be easy to present this as a narrative of ‘no risk, no reward’. however, the number of individuals who contribute so much in the name of diversity, and who are now reaching their breaking points, having battled misogyny, racism, transphobia, homophobia, ableism, and every other brand of hatred possible, for too long and in too high a concentration (thanks to the internet), must be taken as a serious warning about the ways in which we put people in the firing line to try to repair what was already broken and what the digital has, at the very least, augmented.
the challenge for everyone, and it will need everyone, including those who have benefitted for so long from the privileges afforded to them, will be how to invest in better protections and buffers, and how to reformulate our approach to digital scholarship. these efforts need to be bolstered and amplified by more funding, more platforms for dissemination, more institutional support, more regulation, and perhaps most importantly, more respect and acknowledgement of the truth of abuse when reported.
there is nothing new in this statement; it is echoed across the web, in tweets and blogs, and increasingly in policy statements and organizational missions. and while all these elements are indeed needed, are they radical enough to confront decades of technological evolution that has opened a pandora’s box of discrimination, hate, and abuse? the greatest progress appears to lie in the alternative platforms that have emerged, as described above, to make space for and valorise disruptions to mainstream and traditional archaeologies. these do indeed require a great deal of labour, but at least the labour is not profiting commercial interests (i.e. publishers and corporate presses). if we all commit to reading and citing these platforms first, in addition to participating, offering our time, labour, and perspectives as a priority over traditional and very broken systems of dissemination, there is hope for a transformation in the value, security, reward, and allyship needed to confront targeted oppression and systemic ‘othering’.
this should be coupled with means to protect and shelter at-risk voices, such as by mediating, blocking, or even not permitting comments, but also by valuing and recognizing that what is often branded as ‘academic kindness’ is in fact the threads that will weave empathy, respect, collegiality, and indeed humanity back into the trans- or posthuman future of archaeology. the future will be digital, but it will only be diverse and inclusive if, together, we make it so. the stubbornly diy mentality that has come to character- ize digital archaeology powered by and for inclusion and diversity emerged out of structures of inclusion and inequity but addressing the true crisis of scholarship endangering scholars today must be a do- it-collectively priority. references agbe-davies, a.s. . black scholars, black pasts. saa archaeological record, : – . battle-baptiste, w. . black feminist archaeology. new york: routledge. beard, m. . roman britain in black and white. the times literary supplement [online] [accessed , april ]. available at: . brock, t.p. . all of us would walk together [online] [accessed march ]. available at: . brown, d. & nicholas, g. . protecting indigenous cultural property in the age of digital democracy: institutional and communal responses to canadian first nations and maōri heritage concerns. journal of material culture, : – . https://doi.org/ . / claassen, c. . homophobia and women archaeologists. world archaeology, : – . clancy, k.b.h., nelson, r., rutherford, j.n. & hinde, k. . survey of academic field experiences (safe): trainees report harassment and assault. plos one ( ): e . https://doi.org/ . /journal.pone. compton, b. . negotiating authenticity: engaging with d models and d prints of archaeological things. museum of ontario archaeology notes [online] [accessed march ]. blog available at: compton, m.e., martin, k. & hunt, r. . where do we go from here? innovative technologies and heritage engagement with the makerbus. 
digital applications in archaeology and cultural heritage, : – . https://doi.org/ . /j.daach. . . conkey, m. . mobilizing ideologies: paleolithic ‘art’, gender trouble, and thinking about alternatives. in: l. hager, ed. women in human evolution. london: routledge, pp. – . conkey, m. . has feminism changed archaeology? signs, : – . cook, k. . built on bones [online] [accessed march ]. available at: the heritage jam university of york: . cook, k. & compton, b. . canadian digital archaeology: on boundaries and futures. canadian journal of archaeology, : – . crooks, r., contreras, i. & besser, k. . herstory belongs to everybody or the miracle: a queer mobile memory project. interactions: ucla journal of education and information studies, ( ). https://escholarship.org/uc/item/ d f dennis, l.m. . archaeogaming, ethics, and participatory standards. saa archaeological record, . http://onlinedi- geditions.com/publication/?i= & article_id= &view=articlebrowser& ver=html #{% issue_id% : ,% european journal of archaeology ( ) https://www.cambridge.org/core/terms. https://doi.org/ . /eaa. . downloaded from https://www.cambridge.org/core. carnegie mellon university, on apr at : : , subject to the cambridge core terms of use, available at https://www.the-tls.co.uk/roman-britain-black-white/ https://www.the-tls.co.uk/roman-britain-black-white/ https://www.the-tls.co.uk/roman-britain-black-white/ http://hsmcwalktogether.org http://hsmcwalktogether.org http://hsmcwalktogether.org https://doi.org/ . / https://doi.org/ . / https://doi.org/ . / https://doi.org/ . /journal.pone. https://doi.org/ . /journal.pone. https://doi.org/ . /journal.pone. 
http://archaeologymuseum.ca/negotiating-authenticity-engaging- d-models- d-prints-archaeological-things/ http://archaeologymuseum.ca/negotiating-authenticity-engaging- d-models- d-prints-archaeological-things/ http://archaeologymuseum.ca/negotiating-authenticity-engaging- d-models- d-prints-archaeological-things/ http://archaeologymuseum.ca/negotiating-authenticity-engaging- d-models- d-prints-archaeological-things/ https://doi.org/ . /j.daach. . . https://doi.org/ . /j.daach. . . https://doi.org/ . /j.daach. . . http://www.heritagejam.org/new-blog/ / / /built-on-bones-katherine-cook http://www.heritagejam.org/new-blog/ / / /built-on-bones-katherine-cook http://www.heritagejam.org/new-blog/ / / /built-on-bones-katherine-cook http://www.heritagejam.org/new-blog/ / / /built-on-bones-katherine-cook https://escholarship.org/uc/item/ d f https://escholarship.org/uc/item/ d f http://onlinedigeditions.com/publication/?i= &article_id= &view=articlebrowser&ver=html % {% issue_id% : ,% view% :% articlebrowser% ,% article_id% :% % } http://onlinedigeditions.com/publication/?i= &article_id= &view=articlebrowser&ver=html % {% issue_id% : ,% view% :% articlebrowser% ,% article_id% :% % } http://onlinedigeditions.com/publication/?i= &article_id= &view=articlebrowser&ver=html % {% issue_id% : ,% view% :% articlebrowser% ,% article_id% :% % } http://onlinedigeditions.com/publication/?i= &article_id= &view=articlebrowser&ver=html % {% issue_id% : ,% view% :% articlebrowser% ,% article_id% :% % } http://onlinedigeditions.com/publication/?i= &article_id= &view=articlebrowser&ver=html % {% issue_id% : ,% view% :% articlebrowser% ,% article_id% :% % } https://www.cambridge.org/core/terms https://doi.org/ . /eaa. . https://www.cambridge.org/core view% :% articlebrowser% ,% article_ id% :% % } deruiter, g. . what happened when i tried talking to twitter abusers [online] [accessed march ]. available at the everywhereist blog: . dowson, t. . why queer archaeology? an introduction. 
world archaeology, : – . ducille, a. . skin trade. cambridge (ma): harvard university press. ehrenreich, b. . bright-sided: how positive thinking is undermining america. new york: picador. epoiesen, . about epoiesen: a journal for creative engagement in history and archaeology [online journal] [accessed march ]. available at: . https://doi. org/ . /epoiesen gifford-gonzalez, d. . you can hide, but you can’t run: representations of women’s work in illustrations of palaeolithic life. visual anthropology review, : – . https://doi.org/ . / var. . . . halberstam, j. . the queer art of failure. durham (nc): duke university press. hassett, b.r., pilaar-birch, s., herridge, v. & wragg-sykes, b. . trowelblazers: accidentally crowd-sourcing an archive of women in archaeology. in: v. apaydin, ed. shared knowledge, shared power. cham: springer, pp. – . heckadon, a., sparks, k., hartemink, k., van muijlwijk, y., chater, m. & nicole, t. . interactive mapping of archaeological sites in victoria. epoiesen. https://epoiesen.library.carleton.ca/ / / /interactive-mapping-archae-victoria/ heritage jam, . policies and rules [online] [accessed march ]. available at the heritage jam, university of york: . jones, s., jeffrey, s., maxwell, m., hale, a. & jones, c. . d heritage visualization and the negotiation of authenticity: the accord project. international journal of heritage studies, : – . joyce, r.a. & tringham, r.e. . feminist adventures in hypertext. journal of archaeological method and theory, : – . joyce, r.a., guyer, c. & joyce, m. . sister stories. new york: new york university press. kamash, z. . ‘postcard to palmyra’: bringing the public into debates over post-conflict reconstruction in the middle east, world archaeology, : – . https://doi.org/ . / . . lopiparo, j. & joyce, r. . crafting cosmos, telling sister stories, and exploring archaeological kknowledge graphically in hypertext environments. in: j.h. jameson, jr, c. finn & j.e. ehrenhard, eds. 
ancient muses: archaeology and the arts. tuscaloosa: university of alabama press. pp. – . martin, k. . centering gender: a feminist analysis of makerspaces and digital humanities centers. paper pre- sented at institute for digital arts and humanities speaker series [online] [accessed march ]. available at: . mcdavid, c. . descendants, decisions, and power: the public interpretation of the archaeology of the levi jordan plantation. historical archaeology, : – . mcdavid, c. . levi jordan plantation [online] [accessed march ]. available at: mcdavid, c. & brock, t.p. . the differing forms of public archaeology: where we have been, where we are now, and thoughts for the future. in: c. gnecco & d. lippert, eds. ethics and archaeological praxis. new york: springer, pp – . morgan, c. . (re)building Çatalhöyük: changing virtual reality in archaeology. archaeologies: journal of the world archaeological congress, : – . morgan, c. . punk, diy, and anarchy in archaeological thought and practice. online journal in public archaeology, : – . https://doi.org/ . /ap.v i . morgan, c. . the queer and the digital: critical making, praxis, and play in digital archaeology. paper presented at theoretical archaeology group , southampton. available at: . cook – queer, feminist and inclusive digital archaeologies https://www.cambridge.org/core/terms. https://doi.org/ . /eaa. . downloaded from https://www.cambridge.org/core. 
carnegie mellon university, on apr at : : , subject to the cambridge core terms of use, available at http://onlinedigeditions.com/publication/?i= &article_id= &view=articlebrowser&ver=html % {% issue_id% : ,% view% :% articlebrowser% ,% article_id% :% % } http://onlinedigeditions.com/publication/?i= &article_id= &view=articlebrowser&ver=html % {% issue_id% : ,% view% :% articlebrowser% ,% article_id% :% % } http://www.everywhereist.com/what-happened-when-i-tried-talking-to-twitter-abusers/ http://www.everywhereist.com/what-happened-when-i-tried-talking-to-twitter-abusers/ http://www.everywhereist.com/what-happened-when-i-tried-talking-to-twitter-abusers/ http://www.everywhereist.com/what-happened-when-i-tried-talking-to-twitter-abusers/ https://epoiesen.library.carleton.ca/about/ https://epoiesen.library.carleton.ca/about/ https://epoiesen.library.carleton.ca/about/ https://dx.doi.org/ . /epoiesen https://dx.doi.org/ . /epoiesen https://dx.doi.org/ . /epoiesen https://doi.org/ . /var. . . . https://doi.org/ . /var. . . . https://doi.org/ . /var. . . . https://epoiesen.library.carleton.ca/ / / /interactive-mapping-archae-victoria/ https://epoiesen.library.carleton.ca/ / / /interactive-mapping-archae-victoria/ https://epoiesen.library.carleton.ca/ / / /interactive-mapping-archae-victoria/ http://www.heritagejam.org/policies/ http://www.heritagejam.org/policies/ http://www.heritagejam.org/policies/ https://doi.org/ . / . . https://doi.org/ . / . . https://doi.org/ . / . . http://hdl.handle.net/ / http://hdl.handle.net/ / http://www.webarchaeology.com/html/default.htm http://www.webarchaeology.com/html/default.htm http://www.webarchaeology.com/html/default.htm https://doi.org/ . /ap.v i . https://doi.org/ . /ap.v i . https://www.youtube.com/watch?v https://www.youtube.com/watch?v https://www.youtube.com/watch?v http:// = http:// qw_j_hy ws https://www.cambridge.org/core/terms https://doi.org/ . /eaa. . https://www.cambridge.org/core morgan, c. & pallascio, p.m. . 
digital media, participatory culture, and difficult heritage: online remediation and the trans-atlantic slave trade. journal of african diaspora archaeology and heritage, : – . moser, s. . ancestral images: the iconography of human evolution. ithaca (ny): cornell university press. nelson, r.g., rutherford, j.n., hinde, k. & clancy, k.b.h. . signalling safety: characterizing fieldwork experiences and their implications for career trajectories. american anthropologist, : – . pálsson, g. & aldred, o. . en-counter maps. epoiesen. https://doi.org/ . /epoiesen/ . people of color in european art history, . mission statement. medieval poc [online] [accessed may ]. available at: http://medievalpoc.tumblr.com perry, s. . digital media and everyday abuse. anthropology now, : – . perry, s. . six fieldwork expectations: code of conduct for teams on field projects [online] [accessed march ]. available at the archaeological eye blog: https://saraperry.wordpress.com/ / / /fieldwork-code-of-conduct/ perry, s., economou, m., young, h., roussou, m. & pujol, l. . moving beyond the virtual museum: engaging visitors emotionally. in: l. goodman & a. addison, eds. proceedings of the rd international conference on virtual systems and multimedia. dublin: ucd & university of ulster, pp. – . richterich, a. . ‘do not hack’: rules, values, and communal practices in hacker- and makerspaces. paper presented at aoir . selected papers of internet research : the th annual conference of the association of internet researchers. berlin, germany [online] [accessed march ]. available at: https://spir.aoir.org/ojs/index.php/spir/article/view/ riley, d.m., mcnair, l.d. & masters, s. . an ethnography of maker and hacker spaces achieving diverse participation. poster presented at the conference of the american society for engineering education [online] [accessed march ]. available at: http://hdl.handle.net/ / rogers, m. . making queer love: a kit of odds and ends. hyperrhiz, [online] [accessed march ]. available at: http://hyperrhiz.io/hyperrhiz /missives-of-love/queer-love-info.html rogers, m. .
soft circuitry: methods for queer and trans feminist maker cultures (unpublished phd dissertation, department of women’s studies, university of maryland). available at: http://hdl.handle.net/ / smith, a. . social innovation, democracy and makerspaces. swps, - . https://doi.org/ . /ssrn. taylor, n., hurley, u. & connolly, p. . making community: the wider role of makerspaces in public life. conference on human factors in computing systems, san jose, ca, usa, may - . tringham, r. . households with faces: the challenge of gender in prehistoric architectural remains. in: j. gero & m. conkey, eds. engendering archaeology: women and prehistory. oxford: basil blackwell, pp. – . tringham, r. . engendered places in prehistory. gender, place, and culture, : – . https://doi.org/ . / tringham, r. . dead women do tell tales: ghosts [online] [accessed march ]. available at: heritage jam, university of york: http://www.heritagejam.org/exhibition/ / / /dead-women-do-tell-tales-ghosts-ruth-tringham tringham, r. . creating narratives of the past as recombinant histories. in: r.m. van dyke & r. bernbeck, eds. subjects and narratives in archaeology. boulder (co): university press of colorado, pp. – . ulysse, g.a. . reflecting on boundaries, protection, and inspiration. anthrodendum [online] [accessed march ]. available at: https://anthrodendum.org/ / / /reflecting-on-boundaries-protection-and-inspiration/ wajcman, j. . technofeminism. cambridge: polity press. weismantel, m. . towards a transgender archaeology: a queer rampage through prehistory. in: s. stryker & a.z. aizura, eds. the transgender studies reader . new york: routledge, pp. – .
european journal of archaeology ( )
wylie, a. . the engendering of archaeology refiguring feminist science studies. osiris, : – . wylie, a. . doing social science as a feminist: the engendering of archaeology. in: a.n.h. creager, e. lunbeck & l. schiebinger, eds.
feminism in twentieth-century science, technology, and medicine. chicago (il): university of chicago press, pp. – .

biographical notes

katherine cook is assistant professor in the department of anthropology at the university of montreal, specializing in public archaeology and digital applications. her research examines memory, identity, power, and politics in the early colonial history of the atlantic (europe, north america, africa), while exploring the applications of digital media, open data, and technology in increasing access, engagement, and understandings of cultural diversity past and present.

address: katherine cook, département d’anthropologie, université de montréal, pavillon lionel-groulx, cp , succursale centre-ville, rue jean-brillant, montréal qc, h t n , canada. [email: katherine.cook@umontreal.ca].

incarner et bricoler pour bouleverser la donne : allosexualité, féminisme et inclusivité en archéologie numérique (embodying and tinkering to upend the status quo: queerness, feminism, and inclusivity in digital archaeology)

approaches that seek to promote inclusion in archaeology (including queer, feminist, black, or indigenous perspectives) increasingly overlap with those of communities associated with coding, making, and hacking, with the aim of creating a ‘diy’ style of protest and activism. digital technologies offer opportunities to challenge traditional representations of people past and present in creative ways, but at what cost? in this article, a critical assessment of transhumanism and the digital age serves as the starting point for presenting compelling digital examples of inclusive practice, but also the pervasiveness of privilege, inequality, lack of access, and abuse facilitated by open-access heritage projects on the internet. the main aim is to evaluate ways of striking a balance between conveying person-centred narratives and maintaining a public profile, and to take into account the personal and professional risks associated with these approaches, in order to promote, support, and protect marginalized communities and archaeologists. french abstract translated by madeleine hummler

keywords: digital archaeology, queerness, feminism and inclusivity, inclusive research, public archaeology, diversity

störende selbstgemachte verkörperungen: queer, feministische und inklusive digitalarchäologie (disruptive do-it-yourself embodiments: queer, feminist, and inclusive digital archaeology)

inclusive approaches in archaeology (including queer, black, feminist, or indigenous perspectives) have increasingly intersected with the culture of coders, makers, and hackers to develop a unique „tinkered” style of disruption and activism. digital technology offers the opportunity to question conventional representations of people in the past and present in creative ways, but at what price? written as a critical reflection on transhumanism and the age of digital scholarship, this article describes compelling applications of digital practice, but also the pervasive structures of privilege, inequity, inaccessibility, and abuse that have arisen in accessible, web-based projects in the heritage sector. in particular, the study assesses possible ways of striking a balance between narratives centred on individuals and public profiles; it also assesses the personal and professional risks associated with these approaches, which strive to promote, support, and protect traditionally marginalized archaeologists and communities. german abstract translated by madeleine hummler

keywords: digital archaeology, queer, feminism, inclusive scholarship, public archaeology, diversity

mazi means together: an open-source “minimal computing” local community network for cultural event organisation, fieldwork research and digital curatorial practices

mariana ziku (ziku@mail.com), history of art laboratory, university of ioannina,
elli leventaki (elli_ @hotmail.com), history of art laboratory, university of ioannina,
alexis brailas (abrailas@panteion.gr), athenian institute of anthropos & department of psychology, panteion university,
stavroula maglavera (smaglavera@gmail.com), nitlab - network implementation testbed laboratory, department of electrical and computer engineering, university of thessaly,
john mavridis (giamavridis@gmail.com), nitlab - network implementation testbed laboratory, department of electrical and
computer engineering, university of thessaly

keywords: minimal computing, community networks, open source, fieldwork research, digital curation

the emergence of community networks (cns) and dit (do-it-together) minimal computing ecosystems has resulted in technological solutions that enhance community connectivity and digital inclusion. the case is made here for the cultural uses of local network infrastructures that combine wireless technology, low-cost hardware, and free/libre/open source software (floss) applications. based on these features, the toolkit mazi (“together” in greek), a horizon project initiated by nitlab, university of thessaly, greece, has been deployed to create pop-up local wi-fi zones, independent from the internet, that enable digital interactions among communities within a short physical coverage range (davis, ; gurstein, ). mazi provides technology and knowledge that aim to: a) empower those who are in physical proximity to shape their hybrid urban space together, according to the specificities of the respective local environment, b) generate location-based collective awareness as a basis for fostering social cohesion, conviviality, participation in decision-making processes, self-organization, knowledge sharing, and sustainable living, and c) facilitate interdisciplinary interactions around the design of hybrid space and the role of icts in society (cordis, ).

ict-enabled local networking can be critically applied within the scope of the humanities and the glam sector. cns can be directed to foster new participatory curatorial forms, transient, off-the-internet community knowledge sharing, and alternative experiences of locality and commonality (antoniadis, ; dragona, ). additionally, creating infrastructures and components that are “minimal”, by reducing functionality to a basic level of user-friendly, sine qua non components while sustaining performance, might be highlighted as good practice for digital scholarship.
cns follow minimal computing principles by utilising hardware and software that is low-budget or free/libre, with reduced clutter and only essential operations; these choices can be further analysed as critical aesthetic frameworks able to respond efficiently to collective needs (gil, ; sayers, ). in this context, the focus is to explore the cultural-technological intersectionality of local community networks and their affordances as useful infrastructures for enhanced cultural event planning, fieldwork research and digital curatorial practices. the three applied cases presented here are examples of physical-proximity community networking platforms that have adapted and used the open-source applications of the mazi toolkit (nextcloud, etherpad, limesurvey and wordpress) in different cultural settings, mounting the toolkit on small hardware with minimal computing capabilities (raspberry pi):

i. media exchange, audience communication and voting in a cross-cultural balkan event

the open-source toolkit was used to enhance audience engagement during a multifaceted short film festival in the balkan region, which lasted days and recorded more than portal visits. a custom-designed interactive digital platform was developed for sharing images, writing comments, chatting in real time with local community members, and voting to express individual preferences. the platform allowed people to exchange information, experiences and ideas on the basis of mutual respect, while gathering and providing valuable, anonymous feedback to the hosts of the festival. the local wi-fi zone was available exclusively in the space of the event, giving visitors the opportunity to co-exist simultaneously in a physical and a digital environment, thus creating both tangible and intangible local communication channels.
mazi operated as a hands-on tool for building a participatory, site-specific digital infrastructure for cultural event organisation, particularly suitable for outdoor areas or areas out of internet range.

local community network (mazi zone) in cultural event planning, . cc by . balkans beyond borders, biennale of western balkans

ii. collaborative commenting and anonymous participation in community-based fieldwork research

during community-based research, we employ an array of participatory techniques that elicit multimodal qualitative data: collective drawing, collaborative creative writing, reflective blogging, storytelling. ethically recording these data for later analysis is a critical part of any research project. mazi is a local community network with built-in anonymity that allows users, by default, to connect and share without registering their identities. protecting anonymity and ensuring privacy is an ethical requirement, and true anonymity cannot be attained through commercial internet infrastructures. at the same time, physical proximity is required to connect to the mazi network. in this way, a kind of anonymized authentication is achieved: only the community members we are working with are able to access the wi-fi spot and make contributions. the data produced can easily be shared in the here-and-now of a physical meeting through a data projector, often surprising participants when they realize their collective power, and prompting further rounds of contributions. in this way, mazi has the potential to transform a group of people into a convivial, spontaneous and creative research community, producing critical, ready-to-analyze empirical data.

local community network (mazi zone) in fieldwork research, . cc by . athenian institute of anthropos

iii.
digital exhibition hosting and community-based curation with added content

digital exhibitions in html format were built from scratch and hosted on a variety of open-source local networking infrastructures (piratebox, librarybox, mazizone). the exhibitions could be accessed only locally, on-site, as they travelled between venues. the audience could explore each exhibition by connecting to the local network (no internet access) through their personal devices. piratebox and librarybox enabled off-the-grid, anonymous digital file-sharing by supporting installation on commercial routers in hacker mode. the digital exhibitions were stored on a usb stick attached to the portable hacked router. with mazi and its combination of open hardware and software support, the option for the audience to upload their own content and collectively contextualise the exhibition has been explored. in this use case, mazi provided a wireless local network that served as an exploratory and participatory digital curatorial tool.

local community networks (piratebox, librarybox, mazi zone) in digital exhibition curation, - . cc by . moving silence (top and left image), data-stories confestival (bottom right image)

references

“periodic reporting for period - mazi”, , cordis, available at: https://cordis.europa.eu/project/id/ /reporting
antoniadis, p., , “local networks for local interactions: four reasons why, and one way forward”, «reclaiming the internet» with distributed architectures: rights, technologies, practices symposium, available at: http://nethood.org/publications/antoniadis_adam_keynote_abstract.pdf
davis, g., gaved, m., , “seeking togetherness: moving toward a comparative evaluation framework in an interdisciplinary diy networking project”, proceedings of the th international conference on communities and technologies (c&t ’ ), association for computing machinery, pp. – .
dragona, d. ( ). “from community networks to off-the-cloud toolkits”, in i. theona and d. charitos (eds.) hybrid city conference iii: data to the people. athens: uriac.
sayers, g. ( ). “minimal definitions”, minimal computing: a working group of go::dh, available at: https://go-dh.github.io/mincomp/thoughts/ / / /minimal-definitions
gil, a. ( ). “the user, the learner and the machines we make”, minimal computing: a working group of go::dh, available at: http://go-dh.github.io/mincomp/thoughts/ / / /user-vs-learner
gurstein, m. . what is community informatics (and why does it matter)? polimetrica, milan, italy.

three challenges of pubrarianship

by charles watkinson (associate university librarian for publishing, university of michigan, and director, u-m press)

scan the “positions vacant” advertisements from the last year and it is clear that an interesting new type of job is emerging in libraries, one that combines directorship of a university press with senior responsibilities for other scholarly communication activity on campus. such titles include executive director of temple university press and the library officer for scholarly communication, director of purdue university press and head of scholarly publishing services (purdue libraries), director of indiana university press and digital publishing, and director of university of michigan press and associate university librarian for publishing. in an extreme example (not from the jobs list), the university librarian at oregon state university has for a number of years also been director of oregon state university press. what these new positions exemplify is a movement not only toward more university presses reporting to libraries (from aaup members in to in ), but also a trend toward increasing integration of the two entities.
physical collocation of staff with both library and press backgrounds, joint strategic planning exercises, and shared support infrastructure are other characteristics of the most integrated press/library collaborations. even where the heads of university presses exploring these opportunities for integration do not hold the sort of joint titles listed above (as at northwestern, north texas, georgia, and arizona, for example), their roles are changing as they assume greater responsibilities in library administration.

such integration presents great opportunities (as described elsewhere in this issue of against the grain), but it also creates challenges for the leaders of these merged entities, exemplars of the new role of “pubrarian”, so named by john unsworth (now occupying the equally merged role of vice provost, university librarian, and cio at brandeis university). having occupied two of the positions above over the last few years, first at purdue university and now at the university of michigan, i have seen three particular areas of challenge emerge.

challenge 1: articulating the value of publishing to library colleagues

you must know the scene, whether it’s the red carpet on the night of the academy awards or market street in the palmetto city as dusk falls. an apparently mismatched couple walks by, one short one tall, one ugly one beautiful, one nicely dressed one a mess, and we wonder… “what’s (s)he getting out of that relationship?”

as both a university press director and a member of the library leadership team, i often sense that my colleagues in libraries may be asking the same question about press/library collaborations. it is pretty obvious to them what university presses are getting out of the relationship because the benefits are so tangible. the more integrated presses benefit from greater financial security, nicer space, access to better technology, and higher profiles in their parent institutions.
but what benefits does a close collaboration with a university press bring to the library, which is usually the better endowed party in the match, financially at least? in addressing this question, it helps to examine the ways in which press publishers can help academic librarians collaborate in, firstly, the research and, secondly, the teaching activities of disciplinary faculty.

on the research side, having a university press “in house” promises a library enriched opportunities to engage with, and understand, the needs of faculty members as authors as well as users of information. we all know that there are real asymmetries in the ways that the same scholars behave when they are creators rather than consumers. for example, an advocate for the value of reusable open data may become peculiarly cagey when it comes to sharing her own research findings. university presses understand the care and feeding of authors, contributing perspectives and skills that can provide an early advantage to libraries that identify the similarities between embedded subject liaisons and acquisitions editors and are willing to explore them further. publishers also appreciate the systems of reward and prestige that motivate authors and, if given the opportunity, can usefully inform the design of services and systems, such as data repositories or author identification schema, that rely on enthusiastic academic opt-in rather than grudging conformity to really take off.

on the learning and teaching side, university presses offer libraries new opportunities for demonstrating relevance to administrations that are increasingly focused on creating an undergraduate student experience that is both more engaging and more affordable. the most well-publicized are several initiatives to create open or inexpensive textbooks based on library/press collaboration, although the particular conventions of that complex type of publishing make success elusive.
textbook authors still generally expect a level of silver-platter service and gourmet financial incentive that is difficult to deliver economically. emerging opportunities to engage students in the publishing process, as authors and editors, seem more promising. as our parent institutions move to more engaged, experiential styles of teaching and learning, the press in the library offers the opportunity for students not only to research a real-world topic but also to publish about it, whether in an undergraduate research journal or an edited book. that is a rich way to incentivize student engagement: it combines several high-impact learning practices and offers a tangible outcome from the experience for students to use in graduate school and job interviews. by working together to leverage publishing as pedagogy, presses and libraries may also help educate the next generation of scholars in more progressive attitudes to scholarly communication, a worthwhile long-term play in changing reactionary academic cultures that will benefit us all.

challenge 2: shaping the merged publishing program

university presses without the scale of the multinationals are often advised to focus their attentions on a few types of publication in a select number of disciplines rather than trying to be generalists. such targeted strategies allow presses to maximize the use of their limited resources. a press publishing in a few subject areas can send acquisitions editors to almost all the relevant conferences, can reuse mailing lists for almost every book produced, and can adopt efficient, template-driven approaches to design and production, since most products geared for a particular discipline are similar to each other. oriented toward a manageable number of areas of study, the editors will generally have a clear idea of what manuscripts to pursue and what topics to commission in.
the processes of selection that are essential to university press publishing provide an additional filter, while the need to recover revenue from sales imposes the discipline of the market on the whole process.

broaden the mission to require relevance to the parent institution as well as key disciplines, and the question of what to prioritize becomes more complex. a publishing director challenged to provide services to the entire campus community may initially feel flush with opportunities to publish, especially if situated in a large comprehensive university. but facing such choice can feel like drinking from the fire hose, with the risk of ending up flailing in a large pool of freezing water all too real. where does one even start in building a publishing program that is relevant across a large research university as well as trans-institutionally valuable to a few key disciplines?

the reality, of course, is that most potential projects suggested by institutional stakeholders are unrealistic in terms of the types of capacity needed to accomplish them well. the skills and resources needed to launch a major scientific journal, for example, are different from those used to create excellent books. also, while technology has leveled the playing field to a certain extent, the design and marketing of a major introductory textbook requires an infrastructure and web of relationships that takes years to develop. this is why most library publishers (working either with or without a university press partner) currently focus on the production of niche open access journals, conference proceedings, technical reports, and upper-level course companions. in these areas they can meet important faculty and student needs that may have dropped through the cracks, without having to engage in unwinnable competition with established and better funded specialist publishers outside the institution.
and, while the university might boast comprehensive coverage, it is usually fairly clear internally where the areas of institutional pride and attention lie. those are also often the places where there is the most money available to support publication, relieving the library of sole financial responsibility for open access publishing strategies. even if not initially apparent, these sweet spots can be identified through trial and error. as products appear and gain less or more recognition, the broad spray of solutions gradually narrows to a more focused and powerful stream. and opportunities may emerge for working up the value chain, from areas where trust has been achieved in servicing informal needs to the creation of more formal university press products. achieving disciplinary alignment and achieving institutional alignment are not necessarily contradictory goals.

challenge 3: protecting existing brands while embracing new opportunities

most university presses rely on well-established credentialing processes to build their brands as book publishers. first, a promising manuscript is identified by an acquisitions or series editor and developed into a product that can be reviewed. second, paid reviewers provide detailed reports. third, an editorial board discusses and decides whether to pursue the project or not (and after how much further author revision). fourth, a copyeditor works through the manuscript looking for errors of consistency and fact, and the book is designed in a way that maximizes the look of authority. fifth, the book is promoted to external reviewers so that other expert opinions confirm its excellence. these processes are immensely time and resource intensive, and the end product represents an extreme of formality and elegance: top hat, tails, shined shoes, and crisp white shirt.
for someone from a university press tradition, whose publishing focus has usually been on producing top-end books, there is something very freeing in being able to operate in a campus environment where not every project needs such formal treatment. if publishing is visualized as a spectrum from informal to formal, the formal book (or journal) occupies a narrow space at the right-hand end of the continuum. to its left lie the many other types of publishing and dissemination needs that a campus community may have. there may be the proceedings of a symposium, for example, with papers already selected by the organizing committee. these need moderate clean-up and speedy dissemination rather than a formal review process, laborious quality assurance, and embargo until the next publishing season. sometimes taking too much care over sartorial elegance prevents the work needed to fit the purpose from getting done.

pursuing projects characterized by a range of formality presents a risk for a publisher whose responsibilities still include the university press imprint. how does one avoid undermining the hard-earned university press brand through association with lighter-weight publishing products? how might the publication of student scholarship affect the willingness of their professors to be published by the same organization? how can titles that have undergone careful peer review be distinguished from those that have been selected through less formal processes?

reserving the university press isbn prefix and colophon for traditional, formal books and distinguishing the appearance of non-press books both physically and online contribute to preserving the distinction. a faculty governance mechanism separate from the university press’s editorial board helps preserve a degree of oversight for publications that are still going to be associated with the university, while avoiding confusion.
using different production and distribution workflows can relieve staff concerns about pressure of work as well as help to maintain the separation. all these are strategies for ring-fencing the university press brand, and many are already familiar to university presses that publish regional or trade books. they don’t remove all the possibilities for confusion, but they reduce them.

despite the importance of protecting the brand, constant attention must be paid to the risk of keeping it too separate and reducing the opportunities for innovation and efficiency that the mixing of different types of publishing can bring. one thinks particularly of opportunities to more economically publish the revised dissertations which may start a scholar’s academic progression, in a way that is informed by streamlined journal workflows. one thinks, too, of the dangers of fossilizing the “university press” brand so that it remains associated with print books and their electronic facsimiles rather than becoming the home of innovative digital scholarship that an increasing number of scholars are searching for.

managing two identities

the word “pubrarian” may conjure up images of opacs in an english bar rather than a merger of two great information professions. and traveling with two different business cards, one for the university press and another for the library, can make for a fat wallet. however, as libraries move to engage with the inputs as well as outputs of scholarship, and as publishers migrate from processing content to also providing the tools through which it is created, our joint capacity to serve the needs of scholars at all stages of their professional lives grows exponentially. the new pubrarians, whether they arrive in their roles through press/library collaboration or the organic growth of library publishing, may be at the forefront of creating such solutions. and that’s an opportunity worth minting a new word for.
three challenges of pubrarianship from page

team project description for english | http://ullyot.ucalgaryblogs.ca/ / / /team-project-description-for-english- /

michael ullyot :: early modernist + digital humanist :: pi, augmented criticism lab :: associate professor, department of english, university of calgary

team project description for english

there are two phases to your team project in english (hamlet in the humanities lab), each worth different grades, for a total grade of %:

phase 1 (five tools)
– % based on your individual blog posts
– % for your team’s oral presentation if you opt for this in writing

phase 2 (five acts)
– % based on your individual blog posts
– % based on your group’s oral presentation, or % if you opted to take % for your phase 1 oral presentation
participation
– % based on your comments on other people’s posts throughout the term

team arrangements

the class has been randomly divided into five teams (a, b, c, d, e), each of five students. in addition to having a group letter, you have also been assigned a corresponding digit between 1 and 5; your digit will not be needed until phase 2. to view your letter and digit, click here (http://ullyot.ucalgaryblogs.ca/files/ / /teams_phases and .pdf).

for phase 1, five teams of five will be assigned a letter, a‐e. each letter signifies a tool that will be the focus of the team’s project:

a: wordhoard
b: tapor
c: wordseer
d: voyeur
e: monk

teams will learn the advantages and the disadvantages of their tool and how to effectively use it. each team will then apply their specific tool to the same section of hamlet, . . work should be divided equally. after three weeks, teams a‐e will present before the class their insights into the passage found through the use of their tool. the presentation need not be longer than minutes. (see below for details.)

for phase 2, new teams will be formed. instead of teams a‐e, students will now form teams 1‐5, using the original digits that were assigned to them at the start of activity . all students with the same number will form the new teams. the team number (1‐5) corresponds to the numeric act in shakespeare’s hamlet that each team will analyze:

1: act 1
2: act 2
3: act 3 (leave out . )
4: act 4
5: act 5

each student will bring to the team their expertise in the tool s/he specialized in for phase 1. this combined effort, the use of five tools by five different experts, will be applied to the designated act in hamlet.
how the work is divided among the members of the team is at the team’s discretion; possibilities include dividing the passage equally by lines, by theme or by speaker. after three more weeks of work, teams 1‐5 will present before the class their insights into their assigned act of hamlet, illustrating what the tools offered, supported by examples. the presentation need not be longer than minutes. (see below for details.)

blog posts

for detailed guidelines and grading rubrics, both for your five blog posts and your comments on other students’ posts, click here (http://ullyot.ucalgaryblogs.ca/ / / /on-blogging-in-english- /).

oral presentations

grading rubric (http://ullyot.ucalgaryblogs.ca/files/ / /rubric_grading-oralpresentation.pdf) | schedule (http://ullyot.ucalgaryblogs.ca/files/ / /datesfororalpresentationsphase .pdf)

for both phase 1 and phase 2, powerpoint and other visual aids are not necessary, but can be used to supplement the information provided orally by your team.

phase 1

the presentation component of phase 1 is a ‐minute oral presentation on each team’s digital tool. the presentation should explain how your tool helped you to analyze . , and led to new understandings. the aim of the presentation is to ensure that your classmates understand what your specific digital tool does, as well as what analysis of the text it enables. you should not only offer your analysis, but also describe the process your team went through in reaching your conclusions. demonstrate a clear understanding of the tool, and address both its capabilities and limitations. for full marks, the team will deliver a concise and comprehensive presentation. do not simply list your findings. help your peers to grasp what you found and how you found it.
all students must speak during the presentation, but the division of speaking time is up to the team. it is important that each member address his/her contribution to the process and role in the project’s completion. it is also important that all five students put equal work into the preparation of the presentation. this will be enforced and checked through the team contract and peer evaluations.

phase 2

the presentation component of phase 2 is a ‐minute oral presentation on the analysis of your assigned act in hamlet through the use of all five tools. the presentation should explain the assigned act as you now understand it through the use of the five tools. the presentation may be set up any way that your team wishes, but you must address the unique capabilities of all five tools. a conclusion should come at the end of the presentation, comparing the five tools and providing a unified understanding of the act. you should also discuss why these insights could not be gained from a traditional (non‐digital) close reading of the text, and how the digital humanities offers effective methods for analysis.

again, all students must speak during the presentation, but the division of speaking time is up to the team. it is important that each member address his/her contribution to the process and role in the project’s completion. it is also important that all five students put equal work into the preparation of the presentation. this will be enforced and checked through the team contract and peer evaluations.
original article
perspect med educ ( ) : –
https://doi.org/ . /s - - -

good practices in harnessing social media for scholarly discourse, knowledge translation, and education

daniel lu · brandon ruan · mark lee · yusuf yilmaz · teresa m.
chan
published online: august © the author(s)

electronic supplementary material the online version of this article (https://doi.org/ . /s - - - ) contains supplementary material, which is available to authorized users.

d. lu
department of psychiatry, faculty of medicine, university of british columbia, vancouver, bc, canada

b. ruan · m. lee · y. yilmaz · t. m. chan (✉)
mcmaster education research, innovation, and theory (merit) unit, mcmaster university, hamilton, ontario, canada
teresa.chan@medportal.ca

y. yilmaz
department of medical education, faculty of medicine, ege university, izmir, turkey

y. yilmaz · t. m. chan
program for faculty development, office of continuing professional development for the faculty of health sciences, mcmaster university, hamilton, ontario, canada

t. m. chan
department of medicine, division of education & innovation and division of emergency medicine, faculty of health sciences, mcmaster university, hamilton, ontario, canada

abstract

introduction there still remains a gap between those who conduct science and those who engage in educating others about health sciences through various forms of social media. few empirical studies have sought to define useful practices for engaging in social media for academic use in the health professions. given the increasing importance of these platforms, we sought to define good practices and potential pitfalls with help of those respected for their work in this new field.

methods we conducted a qualitative study, guided by constructivist grounded theory principles, of emerging experts in the field of academic social media. we engaged in a snowball sampling technique and conducted a series of semi-structured interviews. the analytic team consisted of a diverse group of researchers with a range of experience in social media.

results understanding the strengths of various platforms was deemed to be of critical importance across all the participants.
key to building online engagement were the following: 1) culture-building strategies; 2) tailoring the message; 3) responsiveness; and 4) heeding rules of online engagement. several points of caution were noted within our participants’ interviews. these were grouped into caveat emptor and the need for critical appraisal, and common pitfalls when broadcasting one’s self.

discussion our participants were able to share a number of key practices that are central to developing and sharing educational content via social media. the findings from the study may guide future practitioners seeking to enter the space. these good practices can help professionals engage effectively and translate knowledge without coming to harm.

keywords social media · knowledge translation · education · scholarly discourse

introduction

social media is now a dominant medium for discourse, debate and education. recently, the covid-19 pandemic has highlighted how crucial social media has become for both our daily and professional lives [ , ]. with the staggering explosion of content online both within health professions education and knowledge translation [ ], we are entering into an attention economy for learners within the increasingly crowded social media space [ ]. within business, marketing professionals manage their brand’s attention in a vast sea of attention-seeking stimuli. in the online world of health professions education, a similar attention economy is emerging; teachers and scientists are deploying many different tactics to engage their intended audiences.
one of the great challenges in engaging scientists and investigators (including those in health professions education) in social media activities is their perceptions of its usefulness. some individuals who have spent their lives devoted to generating science may not feel adequately trained to engage in social media-based techniques for disseminating their scholarship [ , ] or sufficiently rewarded by traditional tenure and promotion processes [ ]. these factors have contributed to social media’s under-utilization by the scientific community despite its having been shown to have direct implications for enhancing visibility of science [ , ].

in a recent scoping review, while there was an abundance of descriptive studies (n= ) and conceptual pieces (n= ), there were very few clarification studies (n= ) about the usage of social media for education and/or knowledge translation [ ]. none of these studies attempted to define good practices used by experienced providers. we sought opinions from those respected for their academic social media work to generate a list of good practices and potential pitfalls.

methods

we conducted a constructivist grounded theory study to determine the good practices and potential pitfalls observed by experts in the areas of social media education and knowledge translation.

sampling we engaged in a snowball sampling technique, which has been used in the field of social media research within health professions education since it is an evolving field with rapidly changing techniques and protocols [ , – ]. we initially randomly drew from a previously published list of social media influencers within emergency medicine [ ], since this field has been shown to be quite active in social media scholarship and publications, according to a recent review [ ].
however, since there has been a marked adoption of social media across all sectors since this original list, we employed a snowball sampling technique as the expertise in this area is not fixed and is evolving. as such, snowball sampling allowed our interviewees to then further nominate individuals whom they admired as experts in one of the following areas: 1) knowledge translation and teaching; 2) acting as an interactive scientist or investigator; 3) engaging as a critical clinician [ ]. individuals were initially contacted by email or social media to engage in our study. we attempted to sample across all three groups.

context the context of the study was the digital community of social media knowledge translation specialists and educationalists. although we began our study of social media experts using a social media influencers list from one specific specialty (emergency medicine), our context was broader due to our snowball sampling.

ethics our team received ethical approval from the hamilton integrated research ethics board (# hireb- ).

data collection methods a series of semi-structured interviews were conducted by our team’s research assistants (br, am). the research assistants were initially trained via simulation and practice with feedback. initial transcripts from the research assistants were also reviewed by the primary investigator (tc) to provide insights for further topic exploration as part of our constant comparative analysis. each interview was conducted using zoom (zoom video communications, inc., san jose, ca, usa) with audio capture on our local computer. the interview guide is found in the electronic supplemental materials (appendix ).

data processing the audio files were sent to a trained and experienced medical transcriptionist, who generated written transcripts from the audio files. participants were assigned a gender-matched alias.
the transcripts were then verified or corrected as needed by the interviewer and investigatory team to ensure the accuracy of the transcript.

data analysis we conducted our analysis using a constant comparative method, iteratively delineating a series of codes aligned with various good practices and potential pitfalls throughout our coding process. our analysis team (br, dl, tc) met multiple times over a number of months, analyzing transcripts for relevant themes after batches of – interviews were completed. at each coding session, the codebook was updated, with relevant codes being organized and reorganized until we reached thematic sufficiency within our dataset about good practices and potential pitfalls to avoid. these analysis sessions allowed us to iteratively refine the prompts or sub-prompts used by our interviewer, guiding us better towards sufficiency.

sensitization in constructivist grounded theory, researchers may be sensitized by concepts that have preceded their present work. these are concepts that inform their analysis and are acknowledged fully for the readership. for our analysis we were sensitized by two concepts: 1) davenport and beck’s attention economy [ ]; and 2) the new types of social media scholars (translational teachers, interactive investigators, and critical clinicians), which was a conceptual framework that had been previously proposed in the literature [ ].

the concept of the attention economy comes from the business world and highlights the increasing limitations of end-user (or customer) attention as a new type of economic driver; specifically, human attention is now a form of currency due to limitations in its supply, thereby forcing those who demand our attention to compete. davenport and beck’s work was mainly used for the analysis portion of our study.
when reading transcripts, this concept helped us to detect practices that were more akin to business structures or marketing strategies that our experts were employing for the purposes of disseminating education. their work describes several key concepts within the attention economy: voluntary attention, attractive attention, aversive attention, front-of-mind attention, and back-of-mind attention. moreover, they describe the concept of attention management, a task that seems to resonate with educators.

the other concept of the new types of social media scholars, which emerged due to the increasing use of social media for knowledge translation, has been highlighted because our principal investigator (tc) was a lead author on this conceptual work, and its influence is undeniable in this present study—we framed our recruitment and interview guide around these types of scholars, specifically seeking out those who fit within this framework.

table best usages for various social media platforms
columns: platform; best practices; examples mentioned

platform: twitter
best practices: users may find it prudent to divide out different accounts for different usages. some suggested divisions:
– person-level professional account
– group/institutional
– research team
examples mentioned: group/institution: @wearecanadiem (www.canadiem.org); residency program account; departmental account. research team: @metriqstudy (www.metriqstudy.org)
best practices: when engaging in social media promotion of research, consider the following practices when generating a tweet:
– include an image in the tweet
– use descriptive language
– tag people involved
– tagging related organizations or granting agencies involved in the work
– tagging the journal that the article was published within
– using hashtags to join the right conversation
advanced concepts include:
– tweet chats
best practices: understand the nuances between accounts. have a clear intent and purpose for each account
examples mentioned:
“you have to be aware of what the purpose of each account is and certainly the purpose of my department’s account is very different than my account.”—piper
“i am really deliberate in my use of hashtags. i also try not to spam. (laughing). so, like three or less hashtags in a tweet. . . also, in my communication, i will take tag certain people that i want to make sure that they are aware.”—grace

platform: facebook
best practices: person-level account
examples mentioned: n/a
best practices: facebook pages—group/institutional
examples mentioned: canadiem facebook page

platform: instagram
best practices: group/institutional
examples mentioned: pem morsels; canadiem

platform: closed social platform
best practices: used for within team communication to enhance the functioning of a team of social media users or producers (e.g. blog community)
examples mentioned: groups using slack: aliem; canadiem

platform: blog
best practices: used for housing general summaries and disseminative works, but also to release new scholarly contributions via a digital platform.
examples mentioned: “. . . we have been producing a case of the week. and disseminating that internationally with our pathology residents and fellows . . . using blogging platform to do that with the question. it is a short snippet of the case— words or less. it has an image or digital image. like a digital scan and pathology slide as well as the question that goes with it. so that is another way that we have used social media for learners and also for our faculty.”—grace

platform: podcasts
best practices: a possible outlet for digital scholarship and academic output. can be used as its own free-standing academic output, since it is seen as digital scholarship
examples mentioned: em basic (for junior trainees); em guidewire (involves residents)

other platforms mentioned without good practice advice: reddit, google plus, linkedin, blogs, read by qx, researchgate

techniques to enhance rigor and trustworthiness to ensure rigor of our analysis, we engaged two members of our research team (ml, yy) to conduct an audit of our analysis trail. they were given full access to primary transcripts and the final codebook.
the plan for resolving conflicts at this stage was to engage in discussions around areas of concern. consensus-building techniques were used to resolve any issues that arose.

reporting this report adheres to the standards for reporting qualitative research reporting guidelines [ ].

table good practices for engagement online
columns: good practice; explanatory quote

good practice: use common sense
“i guess it is fairly straightforward. just don’t be an idiot. . . i don’t know i guess i don’t do heaps and heaps of tweeting myself but if i am responding to somebody it will usually be to make sure that i say something positive or say nothing at all.”—sheila

good practice: clearly identifying yourself, including conflicts of interest
“i clearly identify myself as my twitter handle is not my name, but my name is on my twitter profile. and yeah, i think that is pretty straightforward. . . like, just behave properly.”—sheila
“it’s just really thinking about your profile is a best practice. just thinking about being transparent to the community [about] who you are and . . . what you are going to be communicating about in that social media platform. so, [regarding] your presence in your profile, i think another best practice that i really try and think about and encourage other people to think about as well.”—grace

good practice: aligned with self and institution
“i’ve tried to make all of my intentions honorable and things that i would be proud of representing and that would reflect on my institution and institutions in a positive way. and so, my interactions again are founded on what is going to be best for patient care and kindness and making my intentions honorable. and so those are all things that i think of as core values that the institutions that i am affiliated with . . . support.”—edward

good practice: understand the intention of each account in each platform
“i am a big believer in aligning my technology with my goals that i want to achieve. and also separating personal and professional. so, i chose twitter because at the time it was where i was connecting with people in medical education, finding that it seems like that is where the audience that i wanted to connect with professionally was currently at. i felt like facebook was more personal. um, and that instagram and other, and instagram especially i guess was just starting to emerge when i was working with getting myself established in medical education. um, now have i moved to instagram. i use instagram, um, more to help i guess personal[ly], but i guess some i had done some work connecting with other professionals on it just a little. slack is one that i use . . .”—grace

good practice: maintaining respect
“i think in general you try to um, be polite and professional. like i don’t necessarily think delving into in depth articles on twitter is necessary or appropriate, um, however responding to people who are having questions or being critical of things i think it is a very reasonable way to go. and [i] try to do it in a respectful way. . . and that can be productive [in] conversation”—trevor

good practice: stay positive
“if i am responding to somebody, it will usually be to make sure that i say something positive or say nothing at all.”—sheila
“. . . always assume that if there is two ways to read something then thinking the kinder way is the way that somebody wants you to read it; i think it is a good rule of thumb because you know like i said, it is hard to interpret tone.”—anthony
“so, [an important aspect is] being respectful, you know only saying things that you would say to other people for the most part being particularly i would say from a department account you know being very positive about all of the people that you work with. i think kind of from a formal account, really you probably have to be positive, [an] uplifting voice.”—piper

good practice: avoid arguments
“i am always surprised at how argumentative some people get. and i think that is a little bit of a shame because i don’t think that . . . sort of reflects well and this idea about somewhere in between you know maintaining some appropriate composure versus being a skeptic and questioning things. and there are definitely some people who do a good job of that and some people that don’t.”—sheila

good practice: knowing when to end a conversation
“if there are people that are engaging that seem to have a substantial agenda, then i am more likely to not continue the conversation for long while still being respectful and just [stop] interacting.”—trevor

good practice: anticipate trolls
“i mean you will occasionally get trolled by negative people. . . i thankfully haven’t had too much with that but every once in a while, something that i put it out on # , some naysayer will put something negative or sort of like oh it is just like # to do something like this.”—paula
“you know there [are] always trolls, right? but i think of one, so before i hit publish on anything, i am like super critical of myself first. so, i think if you already are highly concerned about the words that you use and the product that you are publishing then you are going to find that most people are not out there to be obstinate and/or aggressively negative. and if there is a question then usually it is raised with a more honest and um, straightforward inquiry rather than being malicious.”—harold

good practice: engage across silos
“i tried to actively engage others across multiple specialties and disciplines. so not just emergency physicians but other physicians, and not just physicians but nurses and technologists and the public.
so, it is mostly i think the fact that i tried to cross barriers that might otherwise limit the scope of other people who are on social media.”—roger amplify others “if there was someone that i know that is doing something cool or having something awesome to happen then i might favorite that or retweet that.”—trevor “i tried to disseminate most of the work that we publish. i try to, anything that we publish that i think is worth making people aware of, i will put a plug in for it. sometimes i will do a twitter thread if it is a particularly important study. and i will often tag junior investigators or colleagues to increase their follower count.”—nadir

results

demographics of participants

in total, individuals were interviewed. see tab. of the online supplementary material for details about key demographics. social media platforms used by our participants can be found in tab. of the online supplementary material. the interviews were on average . min long, ranging from . to . min. this yielded a total of pages of transcripts. within this group, nine individuals self-identified as translational teachers, five individuals identified as critical clinicians, and three saw themselves as interactive investigators.

key themes

overall, there were several good practices that were felt to be important. specifically, the domains that our good practice tenets fell within included: ) understanding the nuances of specific platforms; ) social media team management; ) online engagement strategies; ) techniques of effective knowledge sharing; ) e-professionalism; ) potential pitfalls. the following sections detail the perspectives of the various participants, who are named by their randomly selected aliases.
) understanding the nuances of specific platforms one aspect of good practices was knowing and har- nessing the specific platforms available within the social media space for effective engagement. un- derstanding the strengths of various platforms was deemed to be of critical importance across all the participants. tab. depicts some key platforms that were mentioned and clarifying examples are provided when possible. one participant (edward) put it well when he described that each of the various media have their own temporal properties: “they each play a different role ... it depends on what the ultimate goal of the interaction is. certainly, twitter has much more frequent interaction. the blog is, you know, a weekly thing as is the podcast and then youtube might be a monthly thing”. ) social media team management although some participants retained single-person access to social media accounts (usually around their personal accounts), the preferred mode of conduct for group or institutional social media accounts was to enable shared access across multiple users. the rationale for this was that shared access connoted shared accountability, which made the work lighter for any one person. this finding was also true for blogs and podcasts. although some of our partic- ipants still engaged in single-person blogging and podcasting, many have involved bigger teams. some even saw these as opportunities to engage in teaching trainees. harold, for instance, involves residents in his process:“ ... [m]y residents . . . come up with topics . . . so, that is much more of a group collaboration fashion and we will . . . edit the script and figure out you know what the teaching points should be ...” ) online engagement strategies we found a number of key engagement strategies that were mentioned by our participants as crucial for building online engagement: culture setting strate- gies, tailoring the message, responding and respon- siveness, and heeding rules of online engagement. 
culture setting strategies some of the key engage- ment strategies mentioned by the group were tied to specific platforms or tactics, but others were more generic. generally speaking, some participants thought that creating an open, welcoming environ- ment was crucial to engagement. one participant, grace, stated: i try and really welcome people. and make sure that when engaging them and there is somebody new in a social media environment, um, syn- chronous or asynchronous discussion, i make sure to welcome that person ... i try and think about netiquette [sic]. and helping people feel successful when they are using social media ... for some, creating a culture also meant monitor- ing the quality of online discussions (especially ones they were engaging within) and getting involved when necessary to halt or modify conversations, or to ac- tively avoid frank arguments. for others, this means setting a positive tone with the hopes of actively cre- ating a productive space for sharing. one participant (piper) noted that “. . . from [an institutional] account really you probably have to be positive, [an] uplifting voice”. finally, one other act of culture building identi- fied by our participants was the need to teach and mentor others in this space. culture building was thought to be collaborative, by encouraging faculty and trainees to engage together via social media for education. some participants used their podcast or blog as a platform for engaging trainees, apprenticing them into this world while creating new content. tailoring the message our participants thought that social media messages should be tailored (language level, style) to the audience, as nadir states: i have a sense of who i want to read it. and so, if i want the general public to read it then i minimize the jargon and i make it sort of you know interest- ing to people who are non-medical. but if i want people in my specialty to read it then i don’t mind getting extremely technical. 
responding and responsiveness

other keys to engagement centered on being responsive to others. from one participant’s point of view (trevor), it was crucial as an investigator on twitter to interact with those who sought you out. he stated: “i don’t necessarily think delving into in-depth articles on twitter is necessary or appropriate ... however, responding to people who are having questions or being critical of things i think is a very reasonable way to go”. meanwhile, on the receiving end of such engagement, others found that this type of interaction was helpful to themselves as scientists.

responsiveness was thought to have its dark side as well. as engagement increased, our participants highlighted the need to know when to stop a conversation. for some, it simply meant halting engagement. others thought it was important to think twice before responding. one participant (jason) described taking the emotion out of a disagreement as his strategy for dealing with disagreements online:

i tend to not respond emotively [sic] to anything. if i like something, i will like it. or i may put a fairly neutral response like: “we have done something similar have a look at this paper”. rather than saying “you know you’re wrong and we are right for these reasons ...”

heeding rules of online engagement

many of the other keys to interaction revolved around rules on using common sense to engage in respectful and positive conversations, while also understanding your goals/intentions for communicating. these rules for engagement are summarized in tab. , alongside explanatory quotes.
) techniques for effective knowledge sharing participants made note that creating a digital home base such as a website was thought to be of great importance, but largely this was thought to be insuf- ficient for effective knowledge sharing. social media sharing of new research was thought to be best if it was multimodal, in order to be most useful in pro- moting that knowledge to the end-users. but simply creating a website and sharing was not thought to be sufficient. there was a perceived need to ‘repackage’ content in a way that was engaging within a specific medium. bearing in mind the audiences in the social media space, participants like anthony noted that the role of a good teacher or translationalist in this space is to “. . . make [core concepts] easily understandable, accessible, you know put my own little spin and little pearls ...”. julie noted that while repackaging was of impor- tance, it was crucial to be knowledgeable about end- users. she said that for her podcast, she and her co- host try their best to bear in mind their listenership. she stated: . . . we try to make sure that we explain sort of the basics instead of just assuming because we know we have a lot of medical student listeners and a lot of early resident listeners, in addition to career emergency physicians. and so, we’ve tried to cover the gamut by covering stuff that is interesting to us as ... docs but then also to go into a little bit of the sort of basics to an extent, to kind of address the [other audience groups]. some participants also highlighted the need to both produce and selectively amplify high-quality, accurate content. this responsibility was best stated by an- thony who said how crucial it is for producers to be “just making sure that whatever content you put out, that its quality is accurate, is important as well”. 
in an effort to ensure the quality of any educational or knowledge translation projects they undertake, some individuals create networks of trusted advisors and re- viewers to review their work prior to publication, an ad hoc version of pre-publication peer review, which may or may not be subjected to further scrutiny by ed- itorial members of a social media outlet. piper stated: “. . . if it is at all anything controversial, i usually send it along to a trusted mentor or a friend to read and give me their thoughts on. and then it goes to the editorial board”. sharing one’s own research good practices for dis- seminating one’s own research included ensuring that various social media promotional activities were all aligned with the area of research or scholarly interests. participants highlighted that it is useful to consider in- tegrating alternative media into the knowledge trans- lation process. specifically, infographics were thought to be of importance in this area, either as a post-pub- lication dissemination technique or directly into re- search papers so as to facilitate social sharing of the paper by others. creating a planned and integrated strategy for disseminating a paper after it is published can include alternative and creative forms of expres- sion including infographics, as piper vividly describes: . . . the other kind of variety of content that i seem to be producing most of these days is like in- fographics for translation of our research fin- dings ... [infographic creation] requires you to do is really distil down what this big study is about ... really, it gets you thinking: ‘what is the impact of what we have done?’ .. . ‘[h]ow do i want to share and frame that for people?’ so, when i do those it ends up being a bit of a self-reflective process on my paper or my research and what it is bringing to the community. 
the challenge of infographics, of course, is that the visual medium requires a different type of thinking, and careful design is important so as not to oversimplify or water down the content. julie stated: “. . . for creating the visual abstracts, i create those for the lay emergency clinician who does not understand large relative risks or odds ratios ... and then i tried to convey the results in an as succinct ... way that i can”.

) e-professionalism

similarly, for those who inhabit the online space more as ‘translationalists’ and teachers, the space is riddled with traps. as stated previously, it is imperative for professionals (e.g. physicians and nurses) engaging in online education to be mindful of their professional obligations, and all individuals should be aware of how easy it is to unintentionally breach confidentiality. as harold points out, training in this area is a must: “we, you know, are very cautious with our residents and faculty and making sure that we are training everybody that you know you are not publishing things that are sensitive material or patient information and all of the common pitfalls that we have seen”. and even then, educators in the online space should be continually vigilant in assisting those new to the field since, as brandon remarked, “. . . there is, you know, potential for ... even unintentional patient privacy violations ... which again can get people into trouble with their home institutions”.

) potential pitfalls

several notes of caution were raised in our participants’ interviews, which we grouped into two broad categories: ) caveat emptor and the need for criticality with resources; ) common pitfalls when broadcasting one’s self.
caveat emptor: the need for increased rigor and crit- icality for social media resources specifically, some participants felt that it was important for users of so- cial media resources to note that their work was not a comprehensive resource, and that they (as the sell- ers of the evidence) could not replace primary litera- ture or textbooks. participants highlighted the value of social media to curate or highlight important re- sources, while also recognizing that sometimes so- cial media will highlight sensationalist (and less rigor- ous) resources at times. trevor noted: “i don’t believe that the use of social media replaces any of my academic reading”. specifically, trevor noted that it was impor- tant to make clear to users that social media-based resources should not supplant the use of the primary scientific literature. this was similarly mirrored by other comments who cautioned those in this space to recognize their responsibility for fact-checking and ensuring accuracy of online content prior to distribut- ing it. common pitfalls when broadcasting one’s self when engaging in disseminating one’s own work on social media, one must be aware that others may not per- ceive this as a simple act of sharing. as one participant (piper) highlighted, there was a fine line that needed to be walked between ‘bragging’ and disseminating your content. this quote highlights her perspective: fig. a summary of our study’s themes around key considerations in the use of academic social media you know i think the biggest balance is... tow- ing this line between self-promotion and sharing your resource or sharing your research and getting stories out about your studies and about what you think is important ... i don’t want to be seen as pushing it at people aggressively ... i also want to be seen as humble and thoughtful. many of our participants felt the weight of respon- sibility upon them when speaking about their role as an openly identifiable online physician. 
examine this one statement by a participant (darcy):

. . . [t]here is a ... reason that the doctor oz show is the doctor oz [show], and it is not just [titled] talking about some shit [sic] with a guy named bennett. i mean, there is gravitas. there is a professionalism to being a physician. ... [there], is a factor of credibility because anything that i put out there i sign my name to.

discussion

in this study we have sought the insights of peer-identified influencers and leaders within the social media learning environment to understand good practices and potential pitfalls for those entering this space. through their aggregate experiences, several key themes are summarized in fig. as key takeaways for our readers. our findings should be interesting both to new scholars, who are seeking to carve their niche within the academy, and to those seeking to foster others’ success by capitalizing on social media as a platform for dissemination.

there is an evolving role for new scholars in today’s academic milieu who can help with the translation of knowledge as teachers (translational teachers), and those who effectively engage their target audiences and key stakeholders via social media as scientific investigators (interactive investigators) [ ]. this paper provides empirical data which highlight these new roles, and how they are increasingly sophisticated. participants who identify with these roles have a growing mandate to organize and add structure to a zone where social media meets academia. this aligns with the structuralist phase of the greater free open access medical education (foam) movement [ ], which includes new roles such as social media editors for journals [ ].
our participants also explained how they are not solely translational teachers, but at times must wade into promoting their own scientific or scholarly work, which shifts their role and requires new considera- tions. as critical clinicians, they saw the need to be actively skeptical of the science and participate in scholarly discourse around science that is published in any format—whether it be in a high-impact journal or a high-traffic blog. as teachers, our participants reflected upon how they seek to produce high-qual- ity content and to educate others to appraise con- tent in the social media space. finally, as either in- teractive investigators or translational teachers, they remarked on their sense of responsibility around the need to be accurate and not fall into the trap of be- coming a ‘celebrity’ or ‘science kardashian’ [ – ]. due to their professional identity, our participants found it imperative to consider content accuracy and saw themselves as accountable for ensuring the valid- ity and veracity of their content. similarly, e-profes- sionalism was found as a thread throughout the in- terviews; the themes found in our present study were similar to work that has been done on e-profession- alism within medical student and trainee populations [ – ]. that said, compared with prior literature [ , ] which largely focused on the hazards of social me- dia towards professionalism, our participants heavily de-emphasized this concept, relegating it to a concept that must be incorporated but not in the front of their minds. many of our findings show the parallel between our participants’ social media use and the techniques used by modern marketing strategists to gain atten- tion; for example, repackaging content in various formats to suit consumers (e.g. infographics, pod- casts, easy-to-read blogposts) and adequate brand alignment between content producers and target au- diences were identified as essential practices. 
many of the concepts discussed by davenport and beck [ ] have a suitable mapping to the online engagement strategies. as davenport and beck write in their book, those that: “... succeed in the future will be those experts not in the time management, but in the attention management” [ ]. phenomena such as infographics and visual abstracts are tightly associated with tac- tics that would help grab learners’ attention and pre- vent the tl;dr (“too long; didn’t read”) label that is dreaded in the social media world [ ]. translating longer articles to capture attention is a phenomenon that aligns very closely to the exis- tence of an attention economy within social media- based knowledge translation and medical educa- tion. tailoring the message to your target audience helps you to capture voluntary attention through the effective structuring and design of your content. be- ing responsive and responding well to others also help our participants to engage the attractive atten- tion—reinforcing and providing positive feedback to those who engage with your material. heeding rules of online engagement to avoid unwarranted negative reactions for violating the cultural norms is closely connected to the concept of avoiding aversive atten- tion. our participants also found it harder to walk the fine line between self-promotion and knowledge translation/dissemination, worrying about how their peers in the profession might view them. many health professions education researchers and scholars may find that this resonates with them as well. mean- while, some types of attention that davenport and beck identify in their work go beyond the depths of what we found currently within the responses of our respondents. for example, our participants did not specifically speak to the value of creating an academic brand to capture the back-of-mind attention that dav- enport and beck describe; however, there is increasing discussion around this in academic medicine [ ]. 
our present study has a number of limitations. first, our lead investigator is fairly immersed in the world of online education and knowledge translation, and this may have affected our interpretation of the participants’ words. to optimize her distance from the actual respondents, we ensured that she did not interview any of the participants. we also involved the research assistants and transcriptionist in redacting the individual transcripts to ensure that she was not privy to the identity of the various participants. in the analysis phase, we used multiple strategies to ensure the rigor of our analysis, acknowledging that our lead investigator brought with her ‘insider’ expertise, which afforded us a unique perspective on these topics.

the future of #meded in social media

going forward, the evolution of thinking within the social media-based education space will likely become increasingly aligned with the thinking of the participants within our study. we foresee issues around ensuring that we grab the attention (as well as the hearts and minds) of our audiences in social media; this will be of growing importance going forward. with the rise of a generation of physicians who essentially grew up with social media, we will gradually see the integration of these platforms into our scientific and educational circles [ , ]. and while we cannot generalize across a whole generation [ ], it is clear that the global increase in usage and popularity of social media as a major communication platform provides concrete evidence of the changes in the way we communicate to learners and colleagues [ , , ]. the depth of considerations and thinking on the various topics around safe social media utilization was coupled with an ease with which some of the participants understood how to best harness the power of these new media.
conclusions

by engaging leaders and early adopters of social media as a tool for scientific discourse, knowledge translation and education, our work identifies good practices to guide health professionals and other stakeholders in a space that is rapidly growing in both pervasiveness and importance. key strategies revolved around content delivery, audience engagement, and e-professionalism as well as being critical of one’s own accuracy and role on social media.

acknowledgements we would like to thank aisha mohamed, emma bridgwater and priya thomas for their assistance in the development and early recruitment phases of this study. we also thank elizabeth clow for her services as a transcriptionist for our project.

funding dr. chan reports receiving funding from the psi foundation for this work via the psi foundation graham farharquason knowledge translation grant recipient. dr. yilmaz is the recipient of the tubitak postdoctoral fellowship grant.

open access this article is licensed under a creative commons attribution . international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article’s creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article’s creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creativecommons.org/licenses/by/ . /.

references

merchant rm, lurie n. social media and emergency preparedness in response to novel coronavirus. jama. ; https://doi.org/ . /jama. . . published online march .
gottlieb m, dyer s. information and disinformation: social media in the covid- crisis. acad emerg med. ; : – .
chan tm, dzara k, dimeo sp, bhalerao a, maggio la. social media in knowledge translation and education for physicians and trainees: a scoping review. perspect med educ. ; ( ): – .
davenport th, beck jc. the attention economy: understanding the new currency of business. : harvard business school press; .
chan t, trueger ns, roland d, thoma b. evidence-based medicine in the era of social media: scholarly engagement through participation and online interaction. can j emerg med. ; ( ): – .
collins k, shiffman d, rock j. how are scientists using social media in the workplace? plos one. ; ( ): – .
cameron cb, nair v, varma m, adams m, jhaveri kd, sparks ma. does academic blogging enhance promotion and tenure? a survey of us and canadian medicine and pediatric department chairs. jmir med educ. ; ( ):e .
chan tm, bhalerao a, thoma b, trueger ns, grock a. thinking critically about appraising foam. aem educ train may ; ( ): – . https://doi.org/ . /aet . .
eysenbach g. can tweets predict citations? metrics of social impact based on twitter and correlation with traditional metrics of scientific impact. j med internet res. ; ( ):e .
cadogan m, thoma b, chan tm, lin m. free open access meducation (foam): the rise of emergency medicine and critical care blogs and podcasts ( – ). emerg med j. ; (e ):e –e .
lagu t, kaufman ej, asch d, armstrong k. content of weblogs written by health professionals. j gen intern med. ; ( ): – .
riddell j, brown a, kovic i, jauregui j. who are the most influential emergency medicine physicians on twitter? west j emerg med. ; https://doi.org/ . /westjem. . . .
o’brien bc, harris ib, beckman tj, reed da, cook da. standards for reporting qualitative research: a synthesis of recommendations. acad med. ; ( ): – .
chan tm, stehman c, gottlieb m, thoma b. a short history of free open access medical education. the past, present, and future. ats scholar.
; https://doi.org/ . /ats-scholar. - ps.
lopez m, chan tm, thoma b, arora vm, trueger ns. the social media editor at medical journals. acad med. ; ( ): – .
hall n. the kardashian index: a measure of discrepant social media profile for scientists. genome biol. ; : .
cameron p, carley s, weingart s, atkinson p. cjem debate series: #socialmedia—social media has created emergency medicine celebrities who now influence practice more than published evidence. cjem. ; ( ): – .
khan ms, shahadat a, khan su, et al. the kardashian index of cardiologists. j am coll cardiol. ; ( ): – . case rep.
cheston cc, flickinger te, chisolm ms. social media use in medical education: a systematic review. acad med j assoc am med coll. ; ( ): – .
decamp m, koenig tw, chisolm ms. social media and physicians’ online identity crisis. jama. ; ( ): – .
mccartney m. how much of a social media profile can doctors have? bmj. ; :e .
roy d, taylor j, cheston cc, flickinger te, chisolm ms. social media: portrait of an emerging tool in medical education. acad psychiatry. ; ( ): – .
heinzman a. what does “tldr” mean, and how do you use it? https://www.howtogeek.com/ /what-does-tldr-mean-and-how-do-you-use-it/. accessed june .
borman-shoap e, li stt, clair stne, rosenbluth g, pitt s, pitt mb. knowing your personal brand: what academics can learn from marketing . acad med. ; ( ): – .
borges nj, manuel rs, elam cl, jones bj. comparing millennial and generation x medical students at one medical school.
what is an analogue for the semantic web and why is having one important?
parts of this paper have appeared in a blog post at http://dig.csail.mit.edu/breadcrumbs/node/ .
mc schraefel
iam group, electronics and computer science, university of southampton, uk
mc+ht at ecs.soton.ac.uk
abstract
this paper postulates that for the semantic web to grow and gain input from fields that will surely benefit it, it needs to develop an analogue that will help people not only understand what it is, but what the potential opportunities are that are enabled by these new protocols. the model proposed in the paper takes the way that web interaction has been framed as a baseline to inform a similar analogue for the semantic web.
while the web has been represented as a page + links, the paper presents the argument that the semantic web can be conceptualized as a notebook + memex. the argument considers how this model also presents new challenges for fundamental human interaction with computing, and that hypertext models have much to contribute to this new understanding for distributed information systems. categories and subject descriptors h. . [information interfaces and presentation]: hypertext and hypermedia; h. . [user interfaces]: human information processing. general terms design, human factors, documentation. keywords memex, notebooks, hypertext argumentation, interaction design, semantic web, jourknow, mspace, tabulator . introduction in order to design either a system or an interface to support a technology, it helps to know what it is - or failing that - to have a model around which we can conceptualize what it is, what it does, and somewhat how it works. it is not unusual for a new technology to be introduced via an analogue of a previous, familiar technology "it's like this thing - but for this new bit." word processors for instance used to be described as “like typewriters except for copy and paste.” the familiar along with the new idea. the web has likewise frequently been explained along these lines: the web = a page + links. the concept of the printed page is one with which we are all familiar. it's clear, easy to grasp. the link offers only one new concept to understand, and it is largely communicable in practice: click on the link; go to a new page, with links. the rapidity with which people started creating and using new pages for the web demonstrates the success of the model: one creates some text (with images if desired); adds links to other similar types of pages, and voila, one has a web page. based on the success of the web, a new suite of web technologies and protocols have been developed, collectively called the semantic web. 
this grouping of technologies promises new and more powerful ways to interact with information on the web and to build new knowledge from those interactions. while this all sounds very good, there has been no analogue proposed for the semantic web that is similar in communicative power to the web as a page plus links. what is the equivalent analogue for the semantic web to help make it tractable? it is not obvious. it may be argued that the lack of such an analogue for communicating the semantic web to communities outside semantic web research is a contributor to the relatively slow or resistant take up of the semantic web within communities whose work could greatly inform its development: human computer interaction, information retrieval, information architecture, and what should be its proper home, hypertext. it is important to note that the motivation for this question of analogue is not a marketing/packaging question to help sell the semantic web, but is simply a matter of fundamental importance in any research space: it is critical to have both a shared and sharable understanding of a (potentially new) paradigm. if we do not have such a shared understanding, we cannot interrogate the paradigm for either its technical or, perhaps especially, its social goals. in the following sections, how technology models based on older familiar models actively assist development of new technologies is considered. then by looking at how this modeling approach has informed the web, we propose a possible way to construe the semantic web via a model steeped in hypertext tradition. the paper closes with a consideration of how this model may open new design paradigms beyond the semantic web and for computing interaction, as well as for the new field of web science. these seem like bold claims. 
they are not meant to be proclamations, but more a contemplation of a possible research agenda to include other ways we might think about computing if we start with a blank page in a fresh notebook, and let hypertext ideas be, literally, re-presented in a call to renew perhaps, rather than just to the new. this is a pre-print of the paper to be available for. ht’ , september – , , manchester, united kingdom. copyright acm - - - - / / ...$ . . what is the analogue for the semantic web and why is having one important? hypertext preprint / ecs eprints http://eprints.ecs.soton.ac.uk/ / . the web, the page, and history there is no argument that the web is a success story. it has changed not only the way we access information, but it has also changed our expectations for information: if it is not on the web, it does not exist. for example, as bibliometric studies have shown, citation rates are significantly higher for material that is accessible on the web, compared with material only available in print [ ]. there have been many things that have contributed to the success of the web, from powerful search engines that make content discoverable, to commercial take up of the web as a core medium for communication. significantly, it has brought people into contact with computers and global network who otherwise would have had no contact with such systems. we might argue that this success of the web is largely because the paradigm of the web is powerfully familiar. that is, despite the newness (to most people) of this complex of networks and protocols known as “the web,” its paradigm is based on prior, well-established, well-used technology from the past millennia at least. the web page is in many ways, a simulacrum of both a technology and form of communication with which we have tremendous familiarity: the read-only text of the printed page. 
we have a long history with read-only text, whether as official public communication, such as obelisks that communicated history and cultural imperatives, to government posters, such as the famous “i want you” [ ]. with the growth of the printing press, unofficial counter-commentary from th century political handbills glued to lamp posts to more contemporary anti- ads like the artist banksy’s political commentary (shown in figure ) it has become easier to make alternative views publicly available. we also have a long experience ( + years) of a particular technology's deployment of words and images in a page – taking us from the relative exclusivity of hand copied illuminated manuscripts to early printed texts with woodcut illustrations (figure ). the web draws on this familiarity: it does not look like some strange new technology that requires strange new devices; it does not remind us of its stateless, network accessing, server dependent vastness. rather, the web looks very familiar. the web as it was introduced to us, and largely how it has evolved draws on this highly familiar mode of the printed page for communicating content. the one new thing added in the web to the notion of the page - the thing that makes it a web page - is the hypertext link. the link is the core new concept introduced to the page, and more times than not, that link's job is to link the current page to another page. the mental model for understanding the web, with its unary links, can be well supported by the page. indeed, the web’s fundamental “ease of use” is often attested to by the uptake of the technology by largely self-taught senior citizens [ ]: if elders can do it, goes the argument, it must be easy; if they are doing it, it must be ubiquitous. this is not to say that there are not a myriad of design and usability challenges for making that page+link approach useful, usable and accessible. 
we have developed whole suites of conventions on how to deliver pages effectively and have gone through now what are referred to as “generations” of web design to ensure that text, image and link work [ ]. yet despite over a decade of technological evolutions informing the web, how it can access content, how browsers can present that content dynamically and programmatically, the paradigm for describing what we create with the web is the same: it's a page. with links. that paradigm informs how we design web content: not as a spreadsheet; not as a network diagram, but as a page. even with web . , with rss feeds, blogs, mashups, we still have pages. the only slight page model variant in web . may be with location based mashups. in these pages, the main content rather than text is now a map. and again, maps are also highly familiar technologies that have been around for millennia, and accessed in posters and printed books. maps are a technology most of us have even had some formal training on how to use at various points in our education.
figure : one of banksy’s anti-ads in london, uk, [ ]
figure . liber chronicarum, . [ ]
in terms of communicating functionality to people – how to use the thing – the model of the web page as page is clear, familiar, highly expressive, and rapidly communicates what the web is largely about: enabling people to communicate ideas, and with that one special tool, the link, to hook their ideas into the myriad of other ideas available on other web pages. the great new concept of the tag to mark and aggregate content like blog entries or photos for rapid representation, for example, still outputs its results in catalogue-like page indices of “tag clouds” where size of tag represents its popularity in a given system.
it is in large part because there is such a clear model of how to access web content and make use of web technology that there has been such rapid adoption of that technology across sectors. the semantic web changes all that. . hypertext in the machine while the web may be over ten years old and can claim world domination, even at five years old it had become a tour de force. the semantic web has effectively just turned five: it has been five years since the original scientific american article on the semantic web was published [ ]. a five years on article has recently been published [ ]. while the community of semantic web researchers can claim increasing traction within some parts of the computing industry, there is still considerable skepticism on two sides of the computing space: back end technologists and front end researchers, designers and lest we forget, users. there is far less understanding, even within the computing space, about what the semantic web is, five years on, compared with the web at five. at meetings with leaders in information retrieval over a year ago, misconceptions about the semantic web abounded: “isn’t that just that old [i.e. failed] ai stuff?” was a common theme. at a human factors conference recently, the response from people who should know better was “i don’t care what the back end is; i’m platform agnostic.” and yet, it is the capabilities enabled by the back end that often inform how we imagine the possible of what can be delivered at the front end. one might suggest that the technology deserves what it gets: if it is not being picked up by researchers or the commercial sector in large measure, then perhaps there is a reason: it is fundamentally flawed, or damaged goods. after all, that kind of argument has been made of hypertext – until the web made (a version of) it “real” to a far greater population than the limited set of hypermedia researchers. 
today, indeed, the annual web conference attendance surpasses numbers at either hypertext or at the international semantic web conference itself. indeed, comparisons between the semantic web and hypertext are not unknown. leading lights in the semantic web community have been quoted as saying “we don’t want what happened to hypertext to happen to the semantic web.” of course such statements are informed by ignorance of the actual hypertext community, but such comments also make clear how critical it is to communicate not only what the technology and research agenda is about, but what the potential benefits of that work are. that is, what problems is this new technology going to solve that makes the cost of adoption worth the supposed benefits? and by the way, what is being adopted? what is the semantic web? how best to answer this question perhaps needs to take into account the people the semantic web community wishes to attract to be involved as practitioners, innovators, creators, and discoverers in this space. if that population is to include the same range of passions and expertise that have brought so much to the web from the arts, humanities and sciences, among others, then how this question is answered becomes critical.
(note: tag (metadata) http://en.wikipedia.org/wiki/tag_(metadata).)
consider for a moment how the semantic web has been described in the new first stop shop for what something is, wikipedia. the wikipedia entry for the semantic web begins:
the semantic web is an evolution of the world wide web in which information is machine processable (rather than being only human oriented), thus permitting browsers or other software agents to find, share and combine information more easily. it is a manifestation of w c director tim berners-lee's vision of the web as a universal medium for data, information, and knowledge exchange.
at its core the semantic web consists of a data model called resource description framework (rdf), a variety of data interchange formats (e.g rdf/xml, n , turtle, n-triples), and notations such as rdf schema (rdfs) and the web ontology language (owl) that facilitate formal description of concepts, terms, and relationships within a given domain. the burgeoning semantic web comprises newly created and/or transformed web data sources endowed with computer-processable meaning (semantics). all that description tells anyone about the semantic web is that it is for machines. as a semantic web researcher, who works with a community of semantic web researchers, one would be hard pressed to find a majority opinion that believes that the end game imagined for the semantic web is to make data easier for machines to process. machine-processable data is truly a gnarly problem, but it is a means to an end, not the end itself. the end, as with the web, is still about people, and people being able to build knowledge by moving through linked information. consider the following statement from the founders of the web science research initiative, who are leaders in hypertext, the web and the semantic web. in the science article “creating a science of the web” they state the following rationale for starting a web science discipline: since its inception, the world wide web has changed the ways scientists communicate, collaborate, and educate. there is, however, a growing realization among many researchers that a clear research agenda aimed at understanding the current, evolving, and potential web is needed. if we want to model the web; if we want to understand the architectural principles that have provided for its growth; and if we want to be sure that it supports the basic social values of trustworthiness, privacy, and respect for social boundaries, then we must chart out a research agenda that targets the web as a primary focus of attention [ ]. 
(note: wikipedia is a fluid source. the quotation reflects the state of the entry as of jan. , . http://en.wikipedia.org/wiki/semantic_web)
(note: http://www.webscience.org/about/people/)
the emphasis here is on human engagement with this web technology. indeed, the article describes the exemplar motivation for the semantic web as how it will aid a scientist in drug discovery: “researchers are exploring the use of new, logically based languages for question answering, hypothesis checking, and data modeling. imagine being able to query the web for a chemical in a specific cell biology pathway that has a certain regulatory status as a drug and is available at a certain price [ ].” we might ask, then, if the semantic web has effectively the same human-oriented goals as the web, why not use the same model for describing it: pages with links? while that was in large part the approach proposed in the foundational article, there is a growing awareness that the page is not necessarily robust enough to support what more we get from the semantic web's linking capacity to connect information across domain axes. the above drug example would seamlessly connect information about a particular cell to a variety of possible relevant domains: regulatory status, dispensers, related research, use in parallel investigations. we can imagine more radically diffuse but still logically associatable shifts from domain to domain that the semantic web can support. consider someone exploring a music space (however that may be represented) who has heard something they like that turns out to be by wagner. in a works domain they can see all his compositions. the semantic web data model promises to make it possible to link in data on say performances to compositions and then project the data through a timeline visualization.
with such a representation, it becomes possible to see that there have been key periods as well as geographical locations where wagner has been performed, contrasted with periods where his work has been seemingly ignored. now connect in information on historical events and locations, and it becomes possible to correlate an influx of performances in germany during wwii and a decrease internationally post wwii; indeed performances of his work in israel more recently have become points of strong social and ethical controversy. the above interaction with data to explore associations across all these domains takes us outside the page. one might say that the whole rationale of information visualization and information seeking is to provide means to support identification of moments of interest in data spaces, hence what is new with the above semantic web scenario? ibm’s new many eyes tool to enable researchers to upload data to the web and share representations of a spreadsheet worth of data with others is a compelling example of where a little bit of web can get one. the semantic web, however, provides the technologies to make explorations across domains dynamically in a kind of degrees of separation approach technically tractable. these resources are also not fixed single data files but cut across dynamic, multiple, heterogeneous sources and data providers. a critical challenge then becomes just how to represent these new affordances to enable and take advantage of this rich interlinking of (meta)data for exploration. some of us in the semantic web & user interaction community have been considering these problems: mspace, exhibit, haystack, topia are exemplars of efforts to take advantage of not only the metadata, but the cross-domain linking that the semantic web might enable.
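as a rough illustration of the cross-domain linking discussed above, the drug discovery query quoted earlier can be sketched as pattern matching over subject-predicate-object statements merged from several independent sources. everything in the sketch is an assumption made for illustration: the three source lists, the identifiers (chem:A, pathway:P1), the predicate names and the prices are invented, and a plain python list stands in for a real rdf store and query language.

```python
# All identifiers, predicates and values below are invented for illustration.
# Each "source" stands in for an independently published dataset.
PATHWAY_SOURCE = [
    ("chem:A", "inPathway", "pathway:P1"),
    ("chem:B", "inPathway", "pathway:P1"),
    ("chem:C", "inPathway", "pathway:P2"),
]
REGULATOR_SOURCE = [
    ("chem:A", "regulatoryStatus", "approved"),
    ("chem:B", "regulatoryStatus", "experimental"),
    ("chem:C", "regulatoryStatus", "approved"),
]
SUPPLIER_SOURCE = [
    ("chem:A", "price", 40),
    ("chem:B", "price", 15),
    ("chem:C", "price", 10),
]


def merge(*sources):
    """Combine triples from several sources into one graph (a plain list)."""
    return [triple for source in sources for triple in source]


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def find_chemicals(graph, pathway, status, max_price):
    """Chemicals in `pathway` with regulatory `status` and a price <= max_price."""
    candidates = {s for s, p, o in graph if p == "inPathway" and o == pathway}
    return sorted(
        s for s in candidates
        if status in objects(graph, s, "regulatoryStatus")
        and any(price <= max_price for price in objects(graph, s, "price"))
    )


graph = merge(PATHWAY_SOURCE, REGULATOR_SOURCE, SUPPLIER_SOURCE)
print(find_chemicals(graph, "pathway:P1", "approved", 50))  # ['chem:A']
```

the point of the sketch is only that no single source can answer the question; the answer appears once statements about the same identifier are joined across sources, which is the step that takes the interaction outside any one page.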
tabulator is a more recent and even wilder approach as it attempts to leap from rdf source to rdf source across unknown schemas and enable these diverse sources to be queried (and thus integrated) dynamically.
(note: http://services.alphaworks.ibm.com/manyeyes/home)
(note: http://semanticweb.org/swui)
this kind of emphasis on rich interlinking of data sources, focusing on representing not only the data but the metadata of an object explodes representation parameters beyond the page into other kinds of exploratory models for discovery and knowledge building. indeed, these models reach back to fundamental hypertext and hypertext systems and forward to new kinds of representations and interaction challenges when applied at web scale. but how do we describe this potential? for a community steeped in rich link models, hypertext is an obvious conceptualization. but beyond this community, hypertext equals “a page with links” – it equals the current web, not the rich possibility of what we might call real hypertext, which was modeled in note cards and microcosm. we may ask then is hypertext as imagined in the late th century a better framing for the web scale possibilities enabled by the semantic web? these early hypertext models were imagined largely as local systems. do we then need to go further back? the original coining of hypertext with nelson’s transpointing and transclusions [ ] was certainly not restricted imaginatively to local-only systems. but it was largely constrained by traditional notions of documents and pages in particular. long pages, but pages in documents nonetheless: components of other people’s work could readily be used either to support argument in a new document (transclusion) or to provide commentary on another document (transpointing). this is the view of the world as ongoing narratives, of interactive prose. of literary machines.
the semantic web promotes thinking of information as, if not more then at least as also other than, and also often prior to, a page or a document. in this respect, the metadata is as valuable as the data as is the provenance of that data. by extension, the meanings, the semantics, the ways of interpreting and hence the ability to link/associate these sources with related sources automatically becomes an alternative way of thinking about the hyperlink as meaning. that is, the way meaning is communicated that is not via the explicit prose page or catalogue page, but is via the exposure of the ways in which data is associated, and can be discovered, by direct semantic association, for the reader/interactor/explorer to make meaning. thus, we see that beyond the wikipedia definition for the semantic web, the semantic web's promise is to enable people to explore, associate, and connect information to build new knowledge. thus if nelson’s model of hypertext does not capture these metadata or subdata strata of information, perhaps we need to go back further, prior to the coining of the hypertext term, and return to an early source, bush’s memex, and see how it may help communicate the possible to be enabled by the semantic web. . memex as partial sw model most people in the hypertext community (and much of the computing community beyond it [ ]) can immediately cite the source article for the system described as the memex, v. bush’s as we may think [ ] (imagined a few years after publication as shown in figure ). one of the key parts of the memex is the making and sharing of associations crafted among diverse sources by the person using the memex. bush imagined professions of "trail blazers" (section of as we may think) enabled by the memex who would go about creating these connexions and publish them in new kinds of encyclopedias. his goal was to enable people to move across information “associatively” – modeled on how, he said, the brain builds knowledge.
these associative connections have been translated into hypertext links, and the ubiquitous unary web link. it may be worth arguing that tagging is evolving into a very rapid lightweight way of making at least new connections, if not the richer notion of bush’s trails. there is nothing either explicitly semantic or automatic about the description of trail-making in the memex. even rediscovery of resources is based on remembering and retyping the name of the label the operator gives to a work they have added to their personal memex store. the semantic web on the other hand promises that associations can be made inferentially and automatically by taking advantage of the use of both explicit semantic structures and the use of logic to reason over those structures. interestingly, the earlier part of bush’s article, prior to describing the memex, explicitly focuses on calculations machines should be able to carry out through the application of logical processes. bush makes the distinction between “repetitive thought” and “creative thought” and that there ought to be “powerful mechanical aids” for the former. he goes on, “whenever logical processes of thought are employed - that is, whenever thought for a time runs along an accepted groove - there is an opportunity for the machine” (section ). we have seen just this kind of automation of patterns throughout computing, but when combined with trail making, bush’s description has in part been realized in semantic web practice. for instance, the mygrid project developed workflows for bioinformaticians to explore gene databases, running variations of the same processes to generate results to interrogate genetic patterns. work that took days or weeks or more could be reduced to hours [ ].
likewise, the haystack project used similar kinds of patterns with a direct manipulation interface to pull together resources in an integrated scheduling scenario for trip planning [ ]. the haystack scenario in particular draws in one’s own data to mix with external information: personal calendar data and travel/flight information, for example. the imagined automatic, logical processing of “repetitive thought tasks,” and the ability to make (or infer where appropriate) links associatively across heterogeneous resources in new and unexpected ways related to either these kinds of tasks, or to the “creative thought” processes, gives us a strong model that captures at least part of the semantic web, and as shown, has already been explored in research from the scientific to the personal. the memex offers us a model of the “what’s new” part of our analogue approach to describing technology. where the web is the page + links (the familiar + the new), the memex is the second part of the sum, the semantic web = blank + memex. we are left still to define critical familiar part of the equation. the description of interactions with the memex points to a potential model. . work in progress & notebooks the end game of the memex is to enable the scientist to “extend the record.” as bush puts it, presumably man's [sic] spirit should be elevated if he can better review his shady past and analyze more completely and objectively his present problems. he has built a civilization so complex that he needs to mechanize his records more fully if he is to push his experiment to its logical conclusion and not merely become bogged down part way there by overtaxing his limited memory. his excursions may be more enjoyable if he can reacquire the privilege of forgetting the manifold things he does not need to have immediately at hand, with some assurance that he can find them again if they prove important. 
the above describes processes of building new thought based on connecting new ideas with previous personal and public data. it foregrounds the need to be able to forget about data management and focus on the present “creative thought” with some assurance that the material forgotten can be retrieved. what bush describes here could in large measure be the mandate for research in personal information management [ ]: to address the challenges of information capture and the problems of later retrieval. but what bush adds to the description that takes it beyond a data management problem, is that the data management is in the service of a particular goal: to support work in progress. bush wants a tool that will support creative thought. we have a mechanism at least as successful and pervasive as the page which has for centuries served the function of personal information management for work in progress: the notebook. in the following discussion, we will consider how a model of notebook + memex can be used as an analogue to express the rich potential of the semantic web not just as a read-only mechanism like the web, but as a mechanism for the ongoing work of our own review of our shady past and analyze more completely and objectively our present problems, which include both local personal and public informing sources. . affordances of the notebook most of us have some experience of the notebook as support tool for our own work in progress, whether to capture short thoughts, experimental observations or ideas gleaned during a meeting. it is a highly flexible tool. it supports a variety of input types (pencil, pen) and data types (sketches, photos, samples, text). it also has attributes to support multiple retrieval processes: the ordered sequence of pages can be used to support temporal progress; physical width can be used for random access to relocate a note (it was around the middle of the book). 
in particular, the notebook also affords easy capture of this rich variety of idiosyncratic notes, what we have been calling “information scraps” [ ]: that information which may have no other formalized home, like an address book or calendar (for a more complete catalogue of a paper notebook’s affordances, see “breaking the book”[ ]).
(note: see mads soegaard, affordances, norman’s use of the term, section: encyclopedia, interaction-design.org, http://www.interaction-design.org/encyclopedia/affordances.html)
figure . drawing of bush's theoretical memex machine (life magazine, november , )
the notebook while using pages as media, breaks the printed-page paradigm prevalent in the web as well structured, well presented, largely read-only information space. in the notebook, we are really looking at a blank surface bound into a single, portable container. as such these books are fundamentally unlike what we usually think of as the web in at least one particular way: the web is public; we use its protocols to publish work. lab/note books are usually personal, idiosyncratic, again emphasizing work in progress (figure ). even the complete capture of an experiment in a formal lab context is not the finished work, but is the raw observations and in-progress annotations to be available for the analysis of that work towards some understanding of an hypothesis [ ]. only under certain circumstances are notebooks called into a more public use as evidence for tracking the genesis of an idea or discovery. more casually if they are shared it is to offer a glimpse of an idea to a colleague, usually with close supervision, and for the purposes of interacting with the data directly, synchronously with the collaborator.
this is not to say that we do not see traces of, if not work in progress, then what we might call the persona in progress on the current web: there is a growing trend of “social stalking” on social networking sites, and “self-stalking” web-services. blog spaces like facebook publish rapid updates of information added by one member as immediate alerts to associated members/“friends” of the person. likewise twitter enables people to post from their phone fast updates of what they’re doing, where. pithy posts such as “getting on the bus” are not infrequent. such collections might be construed as valuable contextual material for work/thoughts in progress, if not as the primary material of notes on work itself: they may act in the same way a phone number or meeting reminder might be scribbled beside the first few bars of a new sonata. one item can act as a way of refinding the other: “i put that by the notes for the sonata; the new sketch is by peter’s phone number.” but in the web context, even in these brief bursts of the personal, we see that they are produced for publication, at web scale levels of access, rather than for direct support of personal reflection, idea generation or work progress. this is not to say that the web is not trying to support these more private branches of endeavor. various web . services like web-based stickies and note keepers do exist, including of course one by the increasingly ubiquitous google with google notebook, a clipping service where links can be annotated and grouped into collections. in related work surveying knowledge workers, none of the people we worked with used these web based tools for note taking or information management [ ].

footnote: for an overview of facebook features, see http://www.facebook.com/sitetour/

footnote: twitter.com home page: “a global community of friends and strangers answering one simple question: what are you doing? answer on your phone, im, or right here on the web!”
applications for collaborative writing, from sub etha edit to google docs have far greater take up. it is not clear the degree to which these online word processors are being used as notebooks rather than task specific tools for completing a specific writing project. . non-affordances of digital capture one would be hard pressed to say that right now using a computer is as easy for data capture in particular as using a paper notebook. research in personal information management [ ] suggests that one of the key problems of taking the kinds of information we readily capture on paper over to the digital is an issue of both data capture and data retrieval. that is (a) there is a high cost to get the data into the computer and (b) it is not always easy to get it back out [ ]. consider the problem of digitally capturing a phone number of someone met just once. if using a paper source, one might use a scrap of paper, note the number and stick the note in a book or on the corner of a desk; indeed the note may be moved to a variety of locations, and reinforce awareness of its location. on the computer, one may feel very clever and have the person beam their contact information, including phone number, from their phone to their laptop, thus avoiding the multi-step process of opening an address book application, creating a new form, and entering data into the form’s fields – a timely process at best. in either case, one month later, how will one find the phone number if all the person remembers is where they were when the data was captured, but not the person’s name? the only option is brute force search through the address book. with the paper notebook, one can say “ah, that number is next to the notes for that meeting that happened just before i left x.” in other words, the notebook provides both excellent rapid input as well as usable multiple context cues for rediscovery of data. our digital tools tend to denature the information we capture from any context. 
in the case of the phone number, while there may be rich data about the person, their job title and their address captured in addition to the phone number within that beamed transfer, the context of capture, that incidental data critical to its recovery, is lost. bush’s goal of a tool that will enable temporary forgetting of data in the confidence that it will be rediscoverable when needed is not met in such a circumstance. bush imagined the memex to have an easy interaction for data capture that did not denature it.

figure . quintessential version of a scientist’s notebook capturing ideas/work in progress: a page from da vinci’s notebook working out a sketch to accompany a translation of vitruvius’s work on architecture. the text is a translated quotation from vitruvius’s work, which the figure illustrates [ ].

one can now picture a future investigator in his [sic] laboratory. his hands are free, and he is not anchored. as he moves about and observes, he photographs and comments. time is automatically recorded to tie the two records together. if he goes into the field, he may be connected by radio to his recorder. as he ponders over his notes in the evening, he again talks his comments into the record. his typed record, as well as his photographs, may both be in miniature, so that he projects them for examination (section ).

this scenario implicitly foregrounds two critical facets: interaction with the system is transparent; some metadata is added to preserve some context to be able to associate related data. with just one automatically added metadata tag, time, two records, notes and images, are linked. how the evening’s spoken notes are associated with the field notes is less clear, but it is obvious that semantics are being used to maintain connections among related types of information.
indeed, in the smarttea project [ ] we used a similar type of lightweight semantics to tie parts of synthetic chemistry experiments together with the goal of enabling groups of them to be interrogated in various ways. in bush’s example, there is not a form in sight; no one is required to put a first name into a first name field and a last name into a last name field and so on. likewise, the data captured is not hived off into discrete applications for each data type. the information is available as captured. bush does not explicitly speculate on the value, however, of being able to get at the structured properties of the data captured, such as kingdom or class of a photographed organism or the fact that - - is a combination number not a date. but again, implicitly, bush’s quest for automation of repetitive thought practices and retrieval of assets when needed both beg the question, well then, why not do so via the metadata of a captured artefact? it is in the structure of the data, identifying one string as type meeting and another as type person or type phone number or type musical inspiration that lets us carry out queries like “what were all the phone numbers i recorded when i was last in the office at x?” such retrieval would potentially improve upon what is possible to do with even the best notebook: it would make it possible to query the captured information from a multiplicity of associative contexts. the challenge for such a system becomes how might we combine the easy interaction of notebooks or even bush’s more advanced voice and image field recorders with the rich capabilities afforded by structured data capture? to capture data structure currently, we must use separate forms in usually separate applications that share data and data structure often grudgingly. the rapid input of the notebook is lost. enter the semantic web as both personal and web scale data mechanism. 
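the context query quoted above (“what were all the phone numbers i recorded when i was last in the office at x?”) can be illustrated with a minimal sketch. everything in it is an assumption made for illustration: the `Note` record, its field names, the sample data, and the reading of “last in the office” as the most recent calendar day on which anything was captured at that location.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Note:
    """One captured scrap plus the context recorded alongside it (hypothetical schema)."""
    kind: str              # e.g. "phone_number", "meeting", "person"
    value: str
    captured_at: datetime  # time is the automatically added metadata tag
    location: str


def phone_numbers_from_last_visit(notes, location):
    """Return phone numbers captured on the most recent day at `location`."""
    days_there = [n.captured_at.date() for n in notes if n.location == location]
    if not days_there:
        return []
    last_day = max(days_there)
    return [
        n.value
        for n in notes
        if n.kind == "phone_number"
        and n.location == location
        and n.captured_at.date() == last_day
    ]


# Invented sample data: two visits to "office x" and one note taken elsewhere.
notes = [
    Note("phone_number", "555-0100", datetime(2008, 3, 3, 10, 15), "office x"),
    Note("meeting", "project review", datetime(2008, 3, 10, 9, 0), "office x"),
    Note("phone_number", "555-0199", datetime(2008, 3, 10, 9, 40), "office x"),
    Note("phone_number", "555-0142", datetime(2008, 3, 11, 14, 5), "cafe"),
]

print(phone_numbers_from_last_visit(notes, "office x"))
```

the point of the sketch is only that once a string is typed and its capture context is kept, the retrieval becomes a filter over that context rather than a brute force search through an address book.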
by using semantic web technologies like rdf for data representation and triple stores as knowledge bases, data can be shared in a single “data soup”, as the apple newton used to refer to it, where the data in the soup is accessible to all applications on the platform. by using lightweight grammars (what natural language experts refer to as “pidgins”) it becomes feasible to capture data structure from idiosyncratic data entry of text strings. a string like “meet w ch. @ re jourknow” can readily be translated into a calendar event to be associated with notes on the project jourknow and referenced to chiang as the person involved in the meeting. we have described this process elsewhere [ ]. the advantage of automatic structure extraction to a shared data source is that data can be explored in its native context, such as the note it was when entered, or from a variety of other contexts, such as activities that took place at the time it was created or locations used or as it relates to a particular activity or project, or as a marker to what other documents were being worked on when that note was created. time and location are easy details to capture from wireless devices; document state is also tractable. using the same protocols for association, external services can be developed to support these local contexts: in an academic context, for example, relevant conferences may be found that relate to areas of work for particular projects, and deadlines scheduled automatically. awareness of others working on similar projects can also be discovered, and their related work captured. these kinds of automatic or semi-automatic associations with external data sources enable the notebook space to retain the easy affordances of the physical model while going beyond the physical limitations into the benefits of a networked computer with access to web scale data.

footnote: “data soup”, apple newton entry, wikipedia. http://en.wikipedia.org/wiki/apple_newton
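as a rough illustration of the pidgin idea, the sketch below turns a string of the form “meet w <who> [@ <when>] re <topic>” into subject/predicate/object triples that could sit in a shared data soup. it is not the jourknow grammar: the pattern, the alias table that expands “ch.” to “chiang”, the event identifier and the predicate names are all hypothetical, and a plain python list stands in for a triple store.

```python
def parse_pidgin(note, aliases, event_id):
    """Turn 'meet w <who> [@ <when>] re <topic>' into (subject, predicate, object) triples.

    Returns None when the note does not fit this single hypothetical pattern.
    """
    tokens = note.lower().split()
    if tokens[:2] != ["meet", "w"] or "re" not in tokens[3:]:
        return None
    re_at = tokens.index("re", 3)
    topic = " ".join(tokens[re_at + 1:])
    if not topic:
        return None
    who = aliases.get(tokens[2], tokens[2])   # expand a personal abbreviation
    middle = tokens[3:re_at]                  # optional "@ <when>" part
    when = " ".join(middle[1:]) if middle[:1] == ["@"] else ""
    triples = [
        (event_id, "type", "calendar_event"),
        (event_id, "attendee", who),
        (event_id, "about_project", topic),
        (event_id, "source_text", note),      # keep the note as it was entered
    ]
    if when:
        triples.append((event_id, "at_time", when))
    return triples


def query(soup, subject=None, predicate=None, obj=None):
    """Return the triples that match every field that is not None."""
    return [
        t for t in soup
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


soup = parse_pidgin("meet w ch. @ re jourknow", {"ch.": "chiang"}, "event:1")
events_about_jourknow = [s for s, _, _ in query(soup, predicate="about_project", obj="jourknow")]
print(events_about_jourknow)
```

because the original text is stored as one more triple, the note can still be read in its native form while the extracted structure supports lookups by person, project or time.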
in this respect, we do not slavishly copy the page model of the notebook, but rather, as dix suggests [ ], endeavor to capture its affordances, its experiential qualities. we then enhance them with these semantic web technologies.

note cards redux: even more hypertext

a compelling affordance of going digital indeed is that we can deploy a variety of representations for the same data, and take advantage of the affordances they offer. while the notebook is a well used, well trusted mechanism for keeping notes together, it does have limitations: page binding enforces linearity; it is difficult to see page next to page . a well-studied model for idea capture that breaks that linearity is the notecard stack. indeed, one of the earliest hypertext systems, notecards [ ], used the notecard stack as a model for idea capture and reordering. this work was to be followed by the commercial and pre-web hypercard and supercard applications. the cards not only contained data, but links and functions. there were also specific data types assigned to card types. hypercard defined these cards very explicitly: the home card, address cards and so on. cards within card stacks could be visited either sequentially or arbitrarily. spatial hypertext systems from viki [ ] to tinderbox [ ] have also capitalized on the affordances of card stacks, but added another affordance from the physical realm of cards: the ability to spread out and reorganize virtual card stacks, where space in their organization communicates a kind of meaning, at least to the author of the structures. tinderbox also adds ai processing and data mining to extract new kinds of information and associations from the local data in the cards. the history of card use and structuring of cards comes from a well-designed practice of card use in pre-digital scholarship and journalism. in this research model, there were three kinds of cards: idea cards, quotation/paraphrase cards, and bibliography cards.
these cards are interlinked: quotations to citations; ideas to either. these cards could be created in any order as material was discovered or ideas occurred: "only one idea to a card; only one quotation per card; only one reference per card" were the only constraints on card use. the idea, of course, was that individual cards could be organized and reorganized spatially to get a picture of the developing paper. not all cards would be used, but gaps could also be detected. the organized cards could then be put into one pile, and the paper written effectively by iterating through the cards one at a time. indeed, an outline for the paper or chapter could be generated from the organization of the cards before proceeding to the paper writing. the relevance of the note card model to the concept of the semantic web as personal work space with associated public data is in the integration of personal ideas with external sources: the idea cards are backed up with/informed by the quotations from external sources. in the case of note cards, these associations are either manually created by the researcher/author, or are presented by (and thus attributed to) another author. the goals are the same: building new knowledge by capturing one’s own ideas, and working with those of others, whether these are ideas that come up in a conversation with others and are hastily jotted down, or are captured from a published source. there is interplay here, a making of meaning. mark bernstein's tinderbox software very much follows the note card paradigm to support just this kind of intermix activity between the card stack, the card layout, and capture of ideas and other sources. it enables links to be copied from the web into cards, and of course enables other kinds of data to be written into the cards.
it blends capture of the external with capture of the personal. digital notebook software, like circus ponies’s notebook, supports live capture of web content into a notebook page, and provides a single, knowable source for keeping track of digital ideas, whether as short bursts or longer thoughts. however, that tool is currently locked to the paper page concept of the notebook page metaphor. based on the benefits of these various types of representations for our information, our tools need to provide multiple representations of the information (from pages, to cards, to timelines, to maps, to facetted browsers, to whatever mode) to best support this work in progress paradigm.

notebook+memex=human focus

setting issues of particular embodiment aside, whether of discrete cards or sequential pages, it is the affordances of the analogue notebook/note card stack for developing and progressing ideas and for interleaving idea content with memex-like associations across newly discovered, richly associated work that can stand as a tractable analogue for the semantic web. the semantic web = notebook+memex. one may argue that the memex is still too unfamiliar a concept to be useful, but this is the “something new” part of the “page+links” “familiar+new” equation for introducing a new technology. there was a time when links were something very new to the general population as well, and the demonstration of how they worked quickly clarified their role. in this case, the memex is the means to help make, discover or recover contexts and connexions among work in progress at any point in the “creative thought process”, from quiet self-reflection and engagement with related work and making associations within and between them, to more broadly sharing material for in progress feedback.
while the “+ memex” reflects this movement between the local and the network/web, the “notebook” component reflects the very active, yet very personal process of what has become known as knowledge working. the notion of the notebook (the blank page as opposed to the published page) is also different from what the web has become while still obviously being on the same continuum of work in progress towards some kind of sharing/publication. this blending of personal use with the semantic web's potential for automatic association of related resources (whether personal or published, local or global) is a significant shift in how most of us have been thinking about the semantic web. let me frame that last statement. there have been projects thinking about the semantic web desktop, using the semantic web as a personal or local server layer for data. the projects foreground that there is value in applying semantic web protocols to the local context. there have also been projects like mytea which have imagined using semantic web technologies to maintain transparent context histories [ ] as a way to generate a dynamic, annotatable bioinformatics experiment record (if not lab book) to track and record bioinformatics experiments as they develop across the variety of local and web tools used. the bioinformatician does not have to make a record of each step they take with their digital data; the system creates the record for them. at any point they can annotate or interlink the record of actions carried out. what is proposed here as a model for the semantic web, not as desktop, not as an over-arching environment, but as notebook + memex, goes in a somewhat different direction than what is written on wikipedia. we have already said that the page cannot reflect the rich associative possibilities of what the semantic web promises, so one may ask: how could the analogue of a researcher's notebook, which is so idiosyncratic, support this concept?
the notebook in this context is meant to force several concurrent concepts. first, there is the focus on lightweight data capture. it is critical that we reinvestigate input methods, which means that we must also reinvestigate data storage. right now the needs of the system to have structure captured manually have forced dreadful form-based user interfaces. we have the knowledge to do better. form filling is exactly the kind of repetitive task that a machine is well suited to carry out, leaving us to the creative process. if we want lightweight data capture and rich data structure, this is a challenge we must address. second, the notebook is an active repository: notes, images, pictures are frequently taped into them, as are references to other documents. the semantic backing of the “+ memex” components of the notebook enables the possible interconnections (the lines between notes, the calculations across points, the paths across domains) to be developed and maintained. likewise, the single data soup of the memex repository means that data can be shared easily among a rich variety of representations. tim berners-lee’s tabulator [ ] attempts to provide just such a flexible set of views on rdf sources that have been brought together and queried: the results can then be represented in whatever view is most appropriate: table, calendar, map, or in time, hybrid views. one of the core attributes of this notion of the semantic web as notebook + memex is that it situates the semantic web conceptually within the realm of human engagement where we are actively “extending the record.” right now, very few semantic web tools, whether mspace, haystack or tabulator, support direct authoring. with a semantic web (or memex) backed notebook, we can imagine the semantic web components regularly seeking out associations to support the researcher's process.
where the mighty tinderbox works to develop these connections among the local tinderbox-specific entries, a semantic web enabled notebook could draw across any local data source (associating active documents with working emails and appointments, for instance) with related (semantic) web sources. this local/personal focus is a compelling kind of inversion of the usual models of the (semantic) web. instead of an emphasis on publishing for the world readable web, we are emphasizing the pre-publishing, ingesting, personal activities of work, of active personal process rather than finished, public end. by this approach, we include the whole continuum of activity, not just the end point of the processes bush clearly imagined in leading up to the public “extension of the record.”

footnote: http://www.semanticdesktop.org/xwiki/bin/view/main/

footnote: http://mytea.org.uk

conclusions and opportunities

in this paper i have suggested that we need a tractable model of the semantic web in order to enable people to imagine not only how it can work for them, but how they will want to design tools to support that vision. the proposal is that we can look to the web’s analogue as a model for framing one for the semantic web. the web has been postulated as a familiar technology with a new technology: the printed page + links. i have argued that a similar formulation for the semantic web is a notebook + the memex. in both the familiar notebook, and the more visionary memex, the emphasis is on engaging with information, developing it, working with it, as work in progress. while the semantic web can be seen to provide the protocols to enable the memex to support dynamic and automatic associations across inter-related domains, the notebook emphasizes both the more writerly and the more personal side of engaging with information.
i have also suggested that this personally informed conceptualization of the semantic web has the potential to lead to a different computing paradigm that may be more effective for human interaction, and may take better account of how we should by now be able to engage with computers, rather than computers forcing us to suit them (yes, this is a call to kill the form, and be liberated from it). another way to imagine the paradigm proposed is partially captured by the interaction with the computer on star trek, next generation. it is conversational: it is an ebb and flow of generating and validating ideas with the computer, and merging these into new answers that are then shared with others (members of the enterprise still go to conferences and present papers). except for the voice interface, this model of computer interaction is very much like what the memex describes with its scientist in the field, and what is proposed here as the notebook+memex: the personal working out and evolving of ideas towards a solution. the difference between star trek and the memex is that the star trek computer is more actively engaged in assisting with data retrieval and calculations. this level of assistance is becoming possible via the logical structures supported in the semantic web’s protocols. another critical observation of these two models, both star trek and memex, is that forms are only implicit. for instance, on star trek, no one says “open calendar: date, march , event: meeting with cmd. riker, start time: , end time .” at most they provide tags, saying, “captain’s log” for instance, to initiate an entry. likewise, who makes the entry is captured from the context of the voice and location of the speaker.
captain’s logs are then able to be pulled together on demand, to support queries such as “what else was going on in sick bay when i made my log?” the one thing missing from these visions of the future computer is the social networks of data sources that are of current and of pressing interest to many considering the shape of the web [ ]. in a way, the memex was sensitive to the social in its consideration of the numbers of people who would contribute trails through data, sharing their associations for reuse and re-interrogation. this social immediacy enabled by the internet is fostering perhaps a new paradigm for both computing itself and what may constitute “publication” at earlier stages, that supports models to which sharing work in progress. we already have a form of this intermediary publishing of results in the e-science space: chemists are publishing crystal structures as they are generated in ebank ; bioinformaticians likewise daily add to databases of genes. each source is used regularly as a key resource by other scientists. little of the data in these repositories has first been published in formal journal papers. the role of direct experimental results being available for comparative consideration is taking on a bold new prominence in science work, above and beyond the formal primary research presentation of a peer-reviewed paper. if we believe that this intermixing of voices and intermixing of idea generation represents an important set of axes and continuums to support, then our vision will need to be for tools to support these kinds of interactions – interactions we carry out regularly in the physical world, but that are less well supported in the digital space. again, therefore, tools to support the in-process generation of ideas, to support the ready inter-relation of concepts, are critical for the next model of interaction with these systems. 
this interest in new models of computing, or of interacting with computers also emphasizes creativity as a necessary component to support in the design of the interaction. as shneiderman has pointed out [ ] we currently have little understanding about how to support creativity directly: what exactly in a tool set improves achieving an “ah ha” moment? how do we evaluate the strength of this feature? and yet creativity, the achievement of an insight that provides a new path to solve a problem, is a fundamental part of the scientific, process, or any research enterprise. one might postulate that the freeform nature of the notebook is an established tool in the support of creativity in the discovery process. if that is so, which attributes? how can they be understood to be directly and effectively supported digitally? a question of moment may be, therefore, do we want to challenge ourselves to take as a fundamental goal designing systems not just to support a particular task, but to support creativity? such a challenge takes most of us out of our comfort zone of known approaches for design, validation and the perceived role of computers for “productivity.” surely, though, these are the kinds of challenges we are now ready to ask of the systems we develop, whether at a high level of formal hypertext models or on the ground of embodying, for instance, semantic web enabled systems. perhaps such challenges will become part of the agenda that web science will embrace. perhaps the notebook + memex = semantic web is one approach to help us get there. . acknowledgments thanks to the kind folks at the school of information, university of texas at austin who first engaged with me on this topic. thank you as well to the semantic web user interaction steering committee who provided feedback on the blog version of this paper, and to the thoughtful reviewers of this ht paper whose comments have been key in sharpening the current presentation. 
i hope this version is closer to their vision of the work. http://www.ukoln.ac.uk/projects/ebank-uk/ what is the analogue for the semantic web and why is having one important? hypertext preprint / ecs eprints http://eprints.ecs.soton.ac.uk/ / this paper is an outcome of work under the aegis of the web science research initiative, supported via both an epsrc overseas travel grant, ep/e , and a royal academy of engineering global research award. . references [ ] bansky, “another crap advert” image held at art of the state web site, http://www.artofthestate.co.uk/banksy/banksy_another_crap _advert.htm. accessed from jan-may . [ ] berners-lee, t. chen, y., chilton, l., connolly, d., dhanaraj, r., hollenbach, j., lerer, a., sheets, d. tabulator: exploring and analyzing linked data on the semantic web. swui , swui.semanticweb.org/swui . [ ] berners-lee, t., hall, w., hendler, j. shadbolt, n., weitzner, d., creating a science of the web. science . (august , ): – . [ ] berners-lee, t., hendler, j., lassila, o. the semantic web: a new form of web content that is meaningful to computers will unleash a revolution of new possibilities. scientific american. may . [ ] bernstein, mark. an apprentice that discovers hypertext links. hypertext: concepts, systems and applications, proc. of echt' . a. rizk, n. streitz and j. andre, cambridge univ. press, : - . [ ] bernstein, michael., van kleek, m., karger, d. and schraefel, m.c. ( ) information scraps: how and why information eludes our personal information management. working paper. http://eprints.ecs.soton.ac.uk/ /. [ ] brody, t., harnad, s. and carr, l. earlier web usage statistics as predictors of later citation impact. journal of the american association for information science and technology (jasist) . ( ) pp. - . [ ] bush, v. as we may think. atlantic monthly july . http://www.theatlantic.com/doc/ /bush. [ ] dix, a. deconstructing experience - pulling crackers apart. funology: from usability to enjoyment. 
kluwer academic publishers, dordrecht, , - . [ ] gray, j. turing award lecture: what next? a dozen remaining it problems. turing award lecture, . http://research.microsoft.com/~gray/talks/gray_turing_fcr c.pdf [ ] halaz. f. reflections on notecards: seven issues for the next generation of hypermedia systems. hypertext and cacm . (july ): - . [ ] houston, r.d. harmon, g. vannevar bush and memex, arist ( ): c . [ ] hughes, g., mills, h., de roure, d., frey, j., moreau, l., schraefel, m. c., smith, g. and zaluska, e. the semantic smart laboratory: a system for supporting the chemical escientist. organic and biomolecular chemistry ( ): - . [ ] jones, w. personal information management. arist ( ): c . [ ] jourknow project site. http://projects.csail.mit.edu/jourknow/ [ ] kalnikaité, v. whittaker, s. capturing life experiences: software or wetware?: discovering when and why people use digital prosthetic memory. chi : - . [ ] liber chronicarum, george khuner collection. the metropolitan museum of art (on line): timelines of art history, http://www.metmuseum.org/toah/hd/prnt/ho_ . . .h tm. accessed jan . [ ] marshall, c., shipman, f., combs, j. viki: spatial hypertext supporting emergent structure. ht : - . [ ] “the most famous poster.” american treasures of the library of congress, memory exhibit, online. http://www.loc.gov/exhibits/treasures/trm .html, accessed jan., . [ ] nelson, t. literary machines, mindful press, sausalito, california, . [ ] quan, d., huynh, d., and karger, d. haystack: a platform for authoring end user semantic web applications. proc iswc . [ ] richer, j.p. the notebooks of leonardo da vinci, vol . new ed (edition) dover publications, . [ ] schraefel, m. c., zhu, y., modjeska, d., wigdor, d. zhao, s. hunter gatherer: interaction support for the creation and management of within-web-page collections. proc www ( ): - . [ ] schraefel, m. c., hughes, g., mills, h., smith, g., payne, t. and frey, j. 
breaking the book: translating the chemistry lab book into a pervasive computing lab environment. in proc. of chi , - . [ ] schraefel, m. c., brostoff, s., cooke, r., stevens, r. and gibson, a. transparent interaction; dynamic generation: context histories for shared science. in proc of echise , munich, germany. [ ] seniors online increases. sec: seniors statistics. seniors journal.com: senior citizens information and news. feb , . http://www.seniorjournal.com/news/seniorstats/ - - snrsonline.htm. accessed jan . [ ] shadbolt, n., hall, w., berners-lee, t. the semantic web revisited. ieee int. systems. may-june ( ): - . [ ] shneiderman, ben. leonardo's laptop: human needs and the new computing technologies. cambridge: mit press, . [ ] siegle, david. creating killer web sites. nd ed. new york: hayden press, . [ ] stevens, r., tipney, h.j., wroe c., oinn, t., senger, m., lord, p. goble, c., brass a, tassabehji, m. exploring williams-beuren syndrome using mygrid. intelligent systems for molecular biology (ismb), . [ ] van kleek, m., bernstein, m., karger, d. and schraefel, m.c. ( ) gui— phooey! : the case for text input. uist, , rhode island, usa (in press). modeling the scholars: detecting intertextuality through enhanced word-level n-gram matching christopher forstall, neil coffee, thomas buck, katherine roache, sarah jacobson note: this is a pre-print draft version. the published version contains several editorial changes. interested readers are advised to consult the forthcoming version of this paper in llc ©: . published by oxford university press. all rights reserved. intertextuality is an important part of linguistic and literary expression, and has consequently been the object of sustained scholarly attention from antiquity onward. the definition of intertextuality has been much debated, but it is commonly understood as the reuse of text where the reuse itself creates new meaning or has expressive effects, distinct from the unmarked reuse of language. 
note: in the area of latin literature, which we focus on here, key works on intertextuality include conte, martindale, wills, hinds, pucci, edmunds, barchiesi, farrell, and hutchinson. more general studies include ben-porat, genette, irwin, ricks, and allen. the term “intertextuality” was coined by kristeva. an annotated bibliography on intertextuality surveying these and other works is provided by coffee.
in recent years, digital humanists have taken various approaches to detecting forms of intertextuality (bamman and crane; büchler, geßner et al.; trillini and quassdorf; büchler, crane et al.; berti). this article reports on an advance in automatic detection of a subset of intertextuality, namely, instances of text reuse determined by scholars of classical latin to bear literary significance. this work was carried out by the tesserae project research group, whose approach is distinctive for combining: 1) efforts to use digital methods to emulate scholarly intertextual reading, 2) corresponding procedures for testing results against scholarship, and 3) an evolving free website for intertextual detection and analysis, http://tesserae.caset.buffalo.edu/. tesserae version matched exact word strings within moveable word windows. version added the capacity for lemma matching by line or sentence. deployment of these versions on the tesserae website provided scholars with a means of automatically finding phrase parallels that were candidates for instances of intertextuality. a previous test comparison of two latin epic poems demonstrated that the word-level n-gram matching employed by both versions could detect the majority of intertexts identified by scholars. the search lacked precision, however, so intertexts lay undifferentiated in long lists of candidate parallels, the vast majority of which were not meaningful.
version now provides a filtering function that ranks parallels by significance, making it substantially easier to find those of greater potential interest. the version search algorithm is now the default method for searching the newly expanded corpus of latin, ancient greek, and english available on the tesserae site. this article describes the performance of version search. the complete code is available at https://github.com/tesserae/tesserae.
note: coffee, koenig et al. a; coffee, koenig et al. b.
methodology
tesserae search proceeds in two stages. in the first stage, the search identifies all instances where a given unit in one selected text shares at least two words with a unit in another selected text. the units can be either lines of poetry or “phrases,” where a phrase is equivalent to a sentence or text demarcated by a semicolon or colon. words can be matched by exact word form (for latin, cano, “i sing” = cano) or dictionary headword (cano, “i sing” = cecini “i sang”). users can choose to exclude common words using a stop list, the size and source of which (one text, both texts, or the corpus) can be adjusted. this first stage of the version search is conceptually identical to that of previous versions, but incorporates some modifications to the code that produce a greatly increased number of phrase matches.
to achieve better precision than provided by the stop list alone, version introduces a second stage scoring system that ranks results by two additional criteria: the relative rarity of the words in the phrases shared by the two texts (“word frequency”), and the proximity of the shared words in each text (“phrase density”). we privileged word frequency because we observed that, with notable exceptions, phrases identified by scholars as intertexts consist of words that are relatively rare in their contexts. we privileged phrase density because we observed that scholars generally found intertexts to consist of compact rather than diffuse collocations.
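the two stages just described can be sketched roughly in python. this is a minimal illustration under stated assumptions, not the project's code (which is at the github address given above). the function names are hypothetical, each token is reduced to a single lemma, and distance is taken as the difference in word positions. the form of the score, the natural log of the summed inverse frequencies divided by the summed distances, is an assumption chosen only to agree with the prose: rarer and closer shared words score higher.

```python
import math
from collections import Counter


def relative_frequencies(units):
    """Map each lemma to (occurrences in the text) / (total words in the text)."""
    counts = Counter(lemma for unit in units for lemma in unit)
    total = sum(counts.values())
    return {lemma: n / total for lemma, n in counts.items()}


def rarest_pair_distance(unit, shared, freq):
    """Distance, in words, between the two lowest-frequency shared lemmas."""
    first_pos = {}
    for i, lemma in enumerate(unit):
        if lemma in shared and lemma not in first_pos:
            first_pos[lemma] = i
    a, b = sorted(first_pos, key=lambda lemma: (freq[lemma], first_pos[lemma]))[:2]
    return abs(first_pos[a] - first_pos[b])


def score(source_unit, target_unit, shared, f_source, f_target):
    """Assumed form: ln((sum 1/f(t) + sum 1/f(s)) / (d_t + d_s))."""
    rarity = (sum(1 / f_target[m] for m in shared)
              + sum(1 / f_source[m] for m in shared))
    distance = (rarest_pair_distance(target_unit, shared, f_target)
                + rarest_pair_distance(source_unit, shared, f_source))
    return math.log(rarity / distance)


def find_parallels(source, target, stop=frozenset()):
    """Stage 1: pair units sharing >= 2 non-stop lemmas. Stage 2: rank by score."""
    f_source = relative_frequencies(source)
    f_target = relative_frequencies(target)
    results = []
    for i, s_unit in enumerate(source):
        for j, t_unit in enumerate(target):
            shared = (set(s_unit) & set(t_unit)) - set(stop)
            if len(shared) >= 2:
                s = score(s_unit, t_unit, shared, f_source, f_target)
                results.append((s, i, j, sorted(shared)))
    return sorted(results, key=lambda r: r[0], reverse=True)


# toy, hypothetical lemmatized units (not the benchmark data)
source = [["arma", "vir", "cano"], ["troia", "qui", "primus", "ab", "ora"]]
target = [["cano", "bellum", "arma"], ["qui", "ab", "urbs"]]
for s, i, j, shared in find_parallels(source, target, stop={"qui"}):
    print(f"source unit {i}, target unit {j}: {shared} score={s:.2f}")
```

in this toy input only the first pair of units qualifies: the second pair shares two lemmata, but one of them is on the stop list. comparing every unit with every other unit is quadratic, which is acceptable for a sketch but would likely need an index for full texts.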
the equation given in figure represents our attempt to express the relationship of these criteria as a measure of intertextual significance. the inputs to this equation are the frequency of each matching word in its respective text, and the distance between the two most infrequent words in each of the two phrases. the output is a prediction of interpretive significance generally falling between and . the effect of the equation is that, for a given parallel, the rarer the shared words are, and the closer together in their respective texts, the higher its score will be.
testing
search stage : phrase matching
to assess the version search, we conducted a test that compared our results to a benchmark set of scholarly parallels between two latin epic poems considered to have a high level of intertextual relation, vergil’s aeneid ( , lines of hexameter verse) and book of lucan’s civil war ( lines of hexameter verse). we performed the search using the tesserae corpus-wide search interface (http://tesserae.caset.buffalo.edu/multi-text.php, fig. ). the interface allowed us to generate a list of parallel passages with common phrases, and also to see where else in the corpus those phrases appeared, as an aid to the hand-ranking process described below.
we selected relatively unrestricted settings for our search to capture the greatest number of meaningful results. we compared texts by phrases rather than lines, since phrases were generally longer and so could find a broader range of intertexts. we searched by lemma rather than exact word, at the cost of some false matches, to allow for the detection of intertexts with identical roots but different forms, a necessary measure for a highly inflected language like latin. we chose a stop list that excluded only the most common lemmata in civil war and the aeneid taken together. the stop list words were: et, qui, quis, in, hic, sum, tu, per, neque, and fero. the resulting search generated a list of , phrase parallels between the aeneid and civil war book , each with an automatically assigned score. comparison of these parallels with the benchmark set showed that the search captured % of the intertexts recorded by scholars.
note: lemmatization is at present unsupervised. in cases where an inflected form is ambiguous (e.g. latin bello could mean “war” or “handsome”), it is allowed to match on any of the possible lemmata.
note: users can replicate the search discussed here by using the following parameters on the corpus-wide search page. source: vergil aeneid; target: lucan bellum civile book ; unit: phrase; feature: lemma; number of stop words: ; stop list basis: target + source; maximum distance: words; distance metric: frequency; drop scores below: ; filter matches with other texts: no filter; texts to search: all. the original distance metric counted both words and non-word tokens such as spaces and punctuation marks. since word and non-word tokens generally alternate, one should cut this number in half to estimate the number of intervening words in the “sparsest” parallels. the current, revised metric counts only words, and produces comparable results when set to a maximum of .
note: our list of scholarly parallels was compiled from the lucan commentaries of heitland and haskins, thompson and bruère, viansino, and roche. these were supplemented by a list of parallels not recorded by scholars that had been generated in previous testing and graded according to the scoring system described below.
note: the % recall reported here excluded matches on the list of stop words, as well as phrases in which matching words were very far apart (see below). without these restrictions, recall would be higher, around %, though at the expense of substantially decreased precision.
we further attempted to determine if the search had revealed new meaningful intertexts.
this required assessing the quality of the parallels returned in the search that had not been noted by scholars. for the assessment, we used a hand-ranking scale we had previously developed for this purpose, given in table . the scale has five ranks, from least to greatest significance for the literary interpreter. for testing purposes, we concentrated principally on whether parallels passed one of two thresholds. to clear the first threshold, a phrase parallel needed to have marked language and therefore be of potential interest for its artistry. this standard excluded both erroneous matches (type ) and instances of unmarked, ordinary language (type ). the determination as to whether a given phrase parallel had marked language was made in part through consideration of how often it appeared elsewhere in the corpus, as indicated by results from the corpus-wide search function. all other things being equal, a phrase parallel between the two texts that was rare in the corpus was considered of greater interest than a parallel common in the corpus.
note: for a full explanation of the scale, see coffee, koenig et al. , - .
note: this criterion is meant to exclude very common collocations. for example, forms of the expression “lift oneself up” (se tollere) occur at civil war . and aeneid . , but also in other texts in our corpus, confirming that it is a common expression and uninteresting in and of itself. at the same time, classicists have recognized instances where an intertext in fact becomes more meaningful by having been repeated, generally with variation, in multiple locations. a distinction is commonly made between a parallel consisting of two (or few) textual loci, called an allusion or intertext, and a set of multiple occurrences with close similarities, called a topos. homer initiates the “many mouths” topos by declaring that he could not name all the greek forces at troy even if he had ten tongues, ten mouths, an unstoppable voice, and a heart of bronze (il. . - ). the roman poets lucretius, vergil, ovid, persius, silius italicus, statius, and valerius flaccus later pick up and rework the conceit into a commonplace (hinds , - ). overall, it would seem that the sense of a continuum from fewest to greatest number of phrase repetitions underlies the qualitative labels allusion / intertext, topos, generic language, and ordinary language, even if there is more to these categories than phrase repetition. it may be possible to incorporate phrase frequency into a future scoring system, in which case this issue would need closer examination. for this test, phrase frequency was considered by human evaluators, which allowed for the possibility of discrimination between these types.
parallels passing this threshold were awarded a minimum score of and deemed, in our terms, “meaningful.” to clear the second threshold, a phrase parallel needed, in addition to marked language, sufficient contextual analogy between its two passages that a reader could interpret significance in their interaction. parallels passing this threshold were awarded a minimum score of and deemed, in our terms, “interpretable.”
note: our criteria for meaningful and interpretable parallels draw upon existing theoretical distinctions. fowler , has written that the two fundamental criteria for an intertext are “markedness and sense.” markedness is the quality that makes a parallel “stand out” and makes it “special.” we take fowler’s criterion of markedness to refer principally, if not exclusively, to the sort of distinctive shared language features required to make a parallel “meaningful” in our terms. fowler further explains that for a parallel to have “sense,” the interpreter must “make it mean.” fowler’s criterion of “sense” corresponds to our requirement that an “interpretable” parallel have a contextual similarity in the parallel passages that generates significance.
evaluating all the parallels in the test set was prohibitive, so we chose instead to rank a random sample consisting of % of the results at each automatic score level, amounting to , parallels, distributed as shown in table .
note: of the parallels thus selected, , had already been hand-ranked in previous testing. the remaining were ranked for the first time in this study. the previously ranked and newly ranked results were then combined to make a sample set where each parallel had both an automatic score and a hand rank. all results were collated into a spreadsheet that is posted on the tesserae blog (http://tesserae.caset.buffalo.edu/blog/benchmark-data/ under “tesserae benchmark”).
the resulting quality distribution of the sample set was as follows, from most to least meaningful: type : ( % of results sampled); type : ( %), type : ( %), type : ( %); type : ( %). figure shows these proportions projected onto the full set of , results returned. based on this projection, between lucan’s first book and the aeneid we should expect to find , instances of phrase parallels that constitute more or less distinctive generic language (type ) and interpretable intertexts (type and type ). although this may appear to be an unduly large number of intertexts to be found in hexameter lines, two considerations make it seem less so. first, we counted every set of parallel loci between the two texts separately. so when a given locus in the civil war had parallels with multiple passages in the aeneid, these each counted as separate parallels. the interpretable intertexts are thus constituted by fewer than separate loci in the civil war.
second, a high level of interaction is not surprising for verse (hexameter) and genre (epic) traditions generally regarded as densely intertextual. figure illustrates the projected recall of meaningful parallels (types - ) from our test in relation to those recorded by commentators, showing that version is projected to increase the number of recognized meaningful intertexts substantially. figures and illustrate the recall of interpretable parallels (types - ) produced by the versions and combined (fig. ) and the projected recall produced by version (fig. ), both again in relation to those recorded by commentators. comparison of figures and illustrates the significant improvement in recall of version over even the combination of the two previous tesserae versions. overall, the projections from our sample suggest that version improves considerably upon previous versions in discovering meaningful and interpretable intertexts, including many that have not previously been recorded.
note: the total number of commentator parallels is lower in the version test because review of the earlier commentator parallels for the current test found some that were judged duplicates.
an example of these results is a parallel found in our tesserae version test sample, but neither noted by commentators nor discovered with previous tesserae versions, which was assigned an automatic score of and a hand-rank of . in civil war , lucan narrates the abandonment of rome at the advent of caesar, comparing the panicked reaction of romans to the fear of hannibal generations earlier:
non secus ingenti bellorum roma tumultu
concutitur, quam si poenus transcenderit alpes
hannibal.
(civil war . - )
rome was rocked by the massive upheaval of war, no less than if the carthaginian should cross the alps.
this passage bears some similarity to an episode in the underworld narrative of aeneid book . in the aeneid episode, set in rome’s mythical prehistory, aeneas’s father anchises looks forward over the centuries to the birth of the great general marcellus who saved rome from the carthaginians in the first punic war and fended off gallic incursions:
hic rem romanam, magno turbante tumultu,
sistet, eques sternet poenos gallumque rebellem,
tertiaque arma patri suspendet capta quirino.
(aeneid . - )
this [marcellus] will keep roman affairs standing when it is threatened by great upheaval, he will lay low the carthaginian horsemen, the rebellious gaul, he will offer a captured general’s arms to father quirinus, for only the third time ever.
there are other sources, beyond this vergilian passage, that lucan may be drawing upon and alluding to, including some with lines that also end with the word tumultu. but several features make for a distinctive recollection of the description of marcellus by anchises: the pairing of rome and upheaval (tumultu) in the same line, the enjambment of the verb for the first line at the beginning of the second, and the placement of a form of the word “carthaginian” (poenus / -os) in the same metrical position before a caesura, in a line with identical metrical rhythm. the similarity of language features in the two passages meets our requirements for a meaningful intertext. there is also sufficient analogy in context to make the parallel interpretable. both passages deal overall with the possibility of the destruction of rome through foreign invasion and the corresponding roman response (or lack thereof). the analogy invites the reader’s interpretation. we can thus observe that the echoing of aeneid in this civil war passage figures romans as not only fleeing from caesar as they might have done from hannibal, but also fleeing as marcellus did not do when faced with an earlier carthaginian threat in the first punic war. the resonance compounds lucan’s criticism of romans for deserting their city.
note: in his comment on the lucan passage, roche , ad . - does not mention this possible vergilian parallel, but observes that “the allusion to hannibal is compounded by the intertextual allusion to lucretius’ description of the effects of the punic war at . f. omnia cum belli trepido concussa tumultu / horrida contremuere sub altis aetheris altis.” horace carmina . . - has a similar combination of thought and language: romana pubes crevit et impio / vastata poenorum tumultu / fana deos habuere rectos, / dixitque tandem perfidus hannibal . . . . the ancestor of all expressions of upheaval in africa with tumultu at line-end would seem to be ennius’s africa terribili tremit horrida terra tumultu (annales skutsch), a line that stuck in cicero’s memory (de oratore . ).
note: among the variable first four feet, both lines have an initial dactyl and then spondees. poenus / -os takes up the end of the third foot and beginning of the fourth foot.
search stage : scoring
having demonstrated that tesserae version can capture intertexts with some success, we then wished to evaluate how these intertexts could be identified among all the phrase parallels returned, the majority of which were not meaningful. this part of the testing involved evaluating how the scoring system developed for version could improve precision. our procedure for calculating precision was to divide the number of meaningful (type - ) or interpretable (type - ) results in our test set by the total number of results of all types ( - ). to provide a baseline, we began by calculating precision for our sample set before engaging the automatic scoring system, with results illustrated in table . the published commentaries that were our model naturally had a very high rate of precision: % of the parallels they record are meaningful, and the remaining % are instances of ordinary (metrically compatible) language (type ).
for interpretable parallels (types - ), version gave the highest precision among tesserae versions, since it matched by exact words, whereas the lemma matching of version and version without the scoring system, though capturing a broader range of parallels, had lower precision.
note: we have chosen to focus on the civil war – aeneid comparison precisely because it is well-studied, and so allows comparison of automatic methods with existing scholarship. as is true in this case, therefore, any new parallels between the two poems revealed by tesserae contribute to, and must be interpreted within, a larger set of recognized connections.
we then tested how effective the automatic scoring system was at identifying the most meaningful parallels. table shows how automatic scores in our sample set correspond to hand-rankings. if we average the automatic scores at each hand-rank level, we find the correlation illustrated in figure . as this figure shows, overall the scoring system succeeds in distinguishing the more meaningful intertexts given higher hand ranks by assigning them higher scores. in other words, the automatic scoring system replicated the trends in assessment of intertexts performed by human readers.
to get a more concrete sense of the performance of version search, we further assessed our results in terms of recall and precision. figures and illustrate how recall and precision of meaningful (types - , figure ) and interpretable (types - , figure ) parallels vary when we discard results below certain score levels. in both cases, discarding results with increasingly higher score levels steadily increases the proportion of interpretable or meaningful intertexts in the remaining set, leading toward consistently higher precision. raising the score threshold also reduces recall, however, by progressively eliminating meaningful and interpretable intertexts.
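the trade-off described above can be made concrete with a short python sketch. it is an illustration only: the function names and the sample pairs are hypothetical, the set of hand ranks treated as relevant is an assumption, and recall is computed against the relevant items in the sample itself rather than against a commentators' list. the balanced harmonic mean is included as one common way to combine the two rates; whether it matches the form used in the study is an assumption.

```python
def precision_recall(results, cutoff, relevant_ranks):
    """results: (automatic_score, hand_rank) pairs; keep those scoring >= cutoff."""
    total_relevant = sum(1 for _, rank in results if rank in relevant_ranks)
    kept = [rank for s, rank in results if s >= cutoff]
    hits = sum(1 for rank in kept if rank in relevant_ranks)
    precision = hits / len(kept) if kept else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Balanced harmonic mean of precision and recall (an assumed form)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# hypothetical (automatic score, hand rank) pairs, not the study's data
sample = [(9, 5), (8, 4), (8, 2), (7, 3), (6, 1), (5, 2), (5, 3)]
meaningful = {3, 4, 5}  # assumed set of hand ranks counted as relevant
for cutoff in (5, 7, 9):
    p, r = precision_recall(sample, cutoff, meaningful)
    print(cutoff, round(p, 2), round(r, 2), round(f_measure(p, r), 2))
```

on this toy sample, raising the cutoff from the lowest to the highest level raises precision and lowers recall, which is the pattern the text reports for the real data.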
at this stage of development, then, the scoring system may best be employed to allow the user to filter results according to his or her needs. for example, by discarding all parallels below an automatic score level of in our test set, the user can eliminate nearly three-quarters ( / ) of the non-meaningful types and and yet retain some three-quarters of type parallels ( / ), % ( / ) of type parallels, and all type parallels. on the other hand, those who wished to get only a high quality sample could choose to consider results only at a higher score level. another way to choose a score cutoff level would be to consider the combined measure of recall and precision known as an f-measure. for our f-measure assessment, we used the following equation: figure illustrates the f-measure scores produced when we progressively discard results below increasingly higher automatic score levels. though the results fall considerably below the perfect f-measure of at any score cutoff level, this measurement does suggest that those interested in a relatively economical investigation into meaningful parallels would be best served by investigating those at a score level of or above, while those interested in a range more likely to be interpretable could investigate those at a score level of or above. conclusions the version algorithm behind the current default tesserae search is designed to identify meaningful intertexts through word-level n-gram lemma matching, word frequency, and phrase density. our tests demonstrate that version search has considerable success in identifying intertexts in a sample comparison from two latin epic poems. it gives higher scores to phrase parallels of greater interest, pointing users to those more likely to constitute an intertext. with relatively unrestricted settings, it can identify a majority of the intertexts recorded by scholars. 
these results, along with our further informal experimentation, suggest version can be similarly employed for other comparisons of latin texts in our corpus, as well as for comparisons of ancient greek and english texts, making tesserae search a substantial aid to intertextual study. our results also suggest that the three criteria of lemma identity, word frequency, and phrase density are important formal components of what constitutes an intertext. when scholars identify two or more passages as intertextual, they may be using the presence or absence of these three features as implicit, if not explicit criteria.
note: rijsbergen .
figures
figure . equation for tesserae version scoring system where f(t) is the frequency of each matching term in the target phrase; f(s) is the frequency of each matching term in the source phrase; dt is the distance in the target; ds is the distance in the source. frequency is the number of times a word occurs in its respective text divided by the total number of words in that text. the frequency of the same word may thus be different in different texts. distance is measured between the two lowest-frequency matching words in a phrase. we assume that, where an allusion involves more than two shared words, the lowest-frequency words are likely the most important.
figure . screenshot of tesserae corpus-wide search interface used in testing
figure : projected distribution by type of all , aeneid – civil war candidate parallels, prior to application of scoring algorithm
figure : numbers of meaningful (types - ) parallels between lucan civil war and vergil aeneid found by tesserae version (projected) and by commentators. projected figures are produced by projecting the quality scores for a test sample over the whole larger test set.
figure : unique interpretable (type - ) parallels between lucan civil war and vergil aeneid found by tesserae versions and , commentators, and both, as reported in coffee, koenig et al. , .
figure : unique interpretable (type - ) parallels between lucan civil war and vergil aeneid found by tesserae version (projected), commentators, and both.
figure : correlation of tesserae automatic scoring system with hand ranking of intertextual significance
figure : the effects of score cutoff on recall and precision rates for meaningful (type – ) parallels.
figure : the effect of score cutoff on recall and precision rates for interpretable (type - ) parallels. note that the stoplist and distance restrictions apply to all points on this and the following two graphs. if these constraints were removed, recall would be slightly higher and precision slightly lower, with little or no change to f-measure.
figure : effects of score cutoff on f-measure for type - parallels and for type - parallels.
tables
table . tesserae scale for ranking significance of intertextual parallels, from coffee, koenig et al. , - .
type | characteristics | significance category
 | high formal similarity in analogous context. | meaningful, interpretable
 | moderate formal similarity in analogous context; or high formal similarity in moderately analogous context. | meaningful, interpretable
 | high / moderate formal similarity with very common phrase or words; or high / moderate formal similarity with no analogous context; or moderate formal similarity with moderate / highly analogous context. | meaningful, not interpretable
 | very common words in very common phrase; or words too distant to form a phrase. | not-meaningful, not interpretable
 | error in discovery algorithm, words should not have matched. | not-meaningful, not interpretable
table . total number of version results and number hand-ranked
automatic tesserae score | total in test set | number sampled (approx. %)
table . rates of precision for various sources in civil war – aeneid test search. given v precision rates are prior to application of the secondary scoring system.
quality (rank) | commentators | v (exact form match) | v (lemma match) | v (lemma match)
meaningful ( - ) | % | % | % | %
interpretable ( - ) | % | % | % | %
table . comparison of automatic scores and hand-ranks for tesserae version sample set of parallels between civil war – aeneid.
automatic score hand rank type total (highest) (lowest) total
works cited
allen, g. . intertextuality. london, routledge.
bamman, d. and g. crane. . “the logic and discovery of textual allusion.” proceedings of the second workshop on language technology for cultural heritage data (latech ) (marrakesh).
barchiesi, a., ed. . speaking volumes: narrative and intertext in ovid and other latin poets. london, duckworth.
ben-porat, z. . “the poetics of literary allusion.” a journal for descriptive poetics and theory of literature : - .
berti, m. . “collecting quotations by topic: degrees of preservation and transtextual relations among genres.” ancient society : - .
büchler, m., g. crane, m. mueller, p. burns and g. heyer. . “one step closer to paraphrase detection on historical texts: about the quality of text re-use techniques and the ability to learn paradigmatic relations.” in journal of the chicago colloquium on digital humanities and computer science. eds. g. k. thiruvathukal and s. e. jones.
büchler, m., a. geßner, t. eckart and g. heyer. . “unsupervised detection and visualization of textual reuse on ancient greek texts.” proceedings of the chicago colloquium on digital humanities and computer science .
coffee, n. . “intertextuality in latin poetry.” in oxford bibliographies in classics. ed. d. clayman. new york, oxford university press.
coffee, n., j.-p. koenig, s. poornima, c. w. forstall, r. ossewaarde and s. l. jacobson. a. “the tesserae project: intertextual analysis of latin poetry.” literary and linguistic computing doi: . /llc/fqs .
coffee, n., j.-p. koenig, s. poornima, r. ossewarde, c. forstall and s. jacobson. b. “intertextuality in the digital age.” tapa : - .
conte, g. b. . the rhetoric of imitation: genre and poetic memory in virgil and other latin poets. ithaca, cornell university press.
edmunds, l. . intertextuality and the reading of roman poetry. baltimore, johns hopkins university press.
farrell, j. . “intention and intertext.” phoenix : - .
genette, g. . palimpsests: literature in the second degree. lincoln, university of nebraska press.
hinds, s. . allusion and intertext: the dynamics of appropriation in roman poetry. new york, cambridge university press.
hutchinson, g. o. . greek to latin: frameworks and contexts for intertextuality. oxford, oxford.
irwin, w. . “what is an allusion?” the journal of aesthetics and art criticism : - .
kristeva, j. . “word, dialogue and novel.” in the kristeva reader. ed. t. moi. new york, columbia university press: - .
martindale, c. . redeeming the text: latin poetry and the hermeneutics of reception. cambridge, cambridge university press.
pucci, j. . the full-knowing reader: allusion and the power of the reader in the western literary tradition. new haven, yale university press.
ricks, c. b. . allusion to the poets. oxford, oxford university press.
trillini, r. h. and s. quassdorf. . “a ‘key to all quotations’?: a corpus-based parameter model of intertextuality.” literary and linguistic computing : - .
wills, j. . repetition in latin poetry: figures of allusion. oxford, oxford university press.
doi: . /s © american political science association, ps • january
the teacher
troubled waters: tracing globalization and waste in the delaware river
craig borowiak, haverford college
vicky funari, haverford college
jesikah maria ross, capital public radio
helen k. white, haverford college
abstract
in spring , an interdisciplinary media project titled “troubled waters: tracing waste in the delaware river” was organized at haverford college. this project brought together more than students from four courses comprising introductory political science, chemistry, and documentary film students, as well as a community media artist and community partners. the aim was to explore the causes, impacts, and meanings of different types of waste that are polluting the delaware river. chemistry students collected samples to determine the presence of chemicals from various waste products, political science students traced the waste to globalized production processes, and documentary students explored diverse ways of representing the theme of waste on screen. this article describes the project and how it might serve as a pedagogical model for multicourse interdisciplinary collaboration and community engagement.
teaching undergraduates about globalization is often a matter of debunking myths, uncovering hidden system logics, and encouraging students to think geographically about connections between local practices and global processes. one proven way to do this is to have them research and map the global production processes that bring everyday items such as t-shirts and water bottles into their closet or refrigerator (barndt; rivoli; sparke). these assignments are useful for teaching students about the geographies of globalization and about trade and transnational production. they also make students aware of how daily decisions about what they eat, wear, and use connect them with workers in otherwise distant parts of the world. in this way, they foster sensitivity to issues of global citizenship and transnational justice that stem from these connections (young).
such mapping projects, however, tend to focus only on production and consumption, leaving unstudied the role of waste in the life cycle of commodities.
waste flows are as much a facet of globalization as the laboring aspects of global commodity chains, and they are equally implicated in local and trans-local justice dynamics (clapp ); however, too often, they drop out of the picture. there are practical reasons for this. whereas the production process draws diffuse components together into commodities, the waste process breaks commodities apart into diffuse material fragments and anonymous chemical flows. consequently, individual waste items can be surprisingly difficult to trace without a digital tracer or label-tracking mechanism (lepawsky ; milmo ). given such complexities, how might a globalization mapping project be adapted to take waste into account? how might the waste stream and its social and political repercussions be studied and represented? could this be done in a way that fosters civic engagement with local communities? these are the types of questions we tackled at haverford college in spring in an interdisciplinary media project titled “troubled waters: tracing waste in the delaware river.” this project brought together a political scientist, a chemist, a documentary filmmaker, a community media artist, and community partners, as well as more than bryn mawr and haverford students from four courses across three divisions. the aim was to explore the causes and impacts of different types of waste present in the delaware river.
craig borowiak is associate professor in political science at haverford college. he can be reached at cborowia@haverford.edu. vicky funari is a documentary filmmaker and an artist-in-residence at haverford college. she can be reached at vfunari@sonic.net. jesikah maria ross is an interdisciplinary artist and the senior community engagement strategist at capital public radio. she can be reached at jmr@praxisprojects.net. helen k. white is associate professor in chemistry at haverford college. she can be reached at hwhite@haverford.edu.
the idea for the project originated when vicky funari, visiting filmmaker at haverford college’s hurford center for the arts and humanities, proposed media artist jesikah maria ross for a mellon creative residency. the two started seeking ways to use ross’s socially engaged art expertise to build connections among disciplines and with local communities. after soliciting initial interest, funari and ross convened a meeting with professors craig borowiak (political science) and helen white (chemistry and environmental studies) to brainstorm possibilities. from the beginning, the intention was not to design a new course but rather to fashion a collaborative project around existing course structures. the idea emerged of doing a joint mapping, research, and documentary project that would begin not with a consumer item but instead with chemical waste compounds found in the delaware river. for ross, whose work explores collaborative place-based storytelling, the river offered a promising way to connect with local environmentalists and to bridge the divide between the classroom and the larger community. for borowiak, the project provided a waste-centered corrective to the mapping-global-production assignment he used in his introductory course. the project also resonated with white’s teaching and research, which centers on chemical contaminants found in aquatic systems.
funari wanted to introduce documentary film students to collaborative, multiplatform production models and to invite them to explore interdisciplinary documentary practice. for documentary students, the themes of waste, river, and research were rich with potential images, sounds, metaphors, and storylines to discover. for everyone, the project offered a way to teach students core disciplinary research skills while also modeling both the value and the challenges of interdisciplinarity for addressing complex, socially meaningful problems. the project involved chemistry students collecting and analyzing samples from the delaware river to identify waste chemicals, their chemical origins, and their potential health implications. political science students traced the chemicals to particular products and companies and to patterns of production, consumption, and regulation, which they mapped using digital software. finally, documentary film students explored diverse ways of representing the theme of waste on screen, using field trips and research of the other classes to learn about their topics and to collect images, sounds, and information. all of this was accomplished in partnership with the delaware riverkeeper network, a civil-society network that champions the rights of communities to a free-flowing, clean, and healthy river and tributaries. although students carried out their primary research within the context of their own course, we created cross-class collaborations through periodic interclass meetings and shared digital platforms. the project culminated in a final public event at which students presented their work via a website that ross created to archive and establish a digital public presence for the project. throughout, ross served as the lead artist and project facilitator. this article describes the project components and highlights the benefits and challenges of our pedagogical experiment.
as we discovered, interdisciplinary research and civic engagement open up new and more holistic perspectives on waste, but such approaches also put pressure on institutional constraints, requiring more time from faculty, students, and residents alike. we offer our experiences as contributions to the burgeoning conversation on effective ways to make interdisciplinary environmental pedagogy succeed. chemistry the chemistry component of this project was conducted by junior natural science students in white’s “superlab” course, a standalone, half-semester laboratory class created to give students experience designing experiments that answer novel research questions. beginning in february, white’s students made several trips into the field to collect water and sediment samples from the delaware river and its tributaries. students and faculty from the other courses were invited to join the field trips. the students brought their samples back to the chemistry lab to analyze for the presence of waste-related chemicals. white had collected similar samples before the course began and was able to guide students in their search. working in groups of two or three, the students analyzed seven categories of waste chemicals: flame retardants, petroleum, pharmaceuticals, hormones, pesticides and herbicides, plasticizers, and human waste. from within these categories, students were responsible for selecting a specific chemical (e.g., glyphosate or estrogen) on which to focus their analysis. they analyzed its chemical composition; tested for its presence and quantities in the river samples; and prepared data visualizations with chemical diagrams and descriptions of the chemical’s properties, its importance, possible sources for its presence and current quantities in the river, and what might be done to reduce levels. political science the political science component of the project was undertaken by entry-level students in borowiak’s “politics of globalization” course.
like the chemistry students, the political science students were divided into small groups and asked to choose a specific chemical from within the seven categories of waste, although they did not always choose the same chemical as the chemistry students. the researched chemicals included codeine, phthalates (a plasticizer), crude oil, estrogen, polybrominated diphenyl ethers (pbde, a type of flame retardant), glyphosate (an herbicide), and caffeine. they were required to trace their selected chemical to particular commodities and to the companies that produce them. they then mapped the supply chains for those commodities, giving particular attention to trans-local patterns in the production processes and to the underlying labor, environmental, and regulatory conditions. based on their research, they were tasked with building interactive digital maps that integrated images and narrative to illuminate their findings. documentary film the documentary component of the course was carried out by students from two of funari’s documentary film courses: an introductory course titled “documentary film and approaches to truth” and a more advanced course titled “topics in video production: the documentary body.” students in these courses created short films on themes related to the delaware river, waste, and research being conducted by students in the other courses.
students exercised considerable latitude in what footage they generated and how they structured their films. many chose to accompany the chemistry students into the field as well as into the laboratory space. working either in groups or alone, students completed seven films in total, each five to ten minutes long. civic engagement we wanted to tie this project to the work of community practitioners working on the river. such partnerships were seen as a way to improve the research and amplify its impact while also fostering civic awareness and bridging the divide separating the college from its social and environmental surroundings. ross brought her experience with community-engaged media projects to bear and was able to establish a partnership with the delaware riverkeeper network. maya van rossum, the delaware riverkeeper, met with our students in the field, through interviews, and in our classrooms to discuss waste and the river. cross-class collaboration as project organizers, we worked well together and made a point of accommodating one another’s pedagogical needs. we were in regular communication before, during, and after the semester. to facilitate collaboration among the classes, we organized a series of three cross-class meet-ups in which students and faculty from the participating courses gathered together and learned from one another. the first meet-up, held near the beginning of the semester, introduced the project and engaged students in team-building activities. the second meet-up, held mid-semester, involved a presentation and conversation with maya van rossum. the third meet-up, held near the end of the project, gave students an opportunity to present and workshop their works-in-progress across disciplines. in addition to these meet-ups, we shared contact information and established a digital repository for students to share documents, images, and videos across classes.
research support for many of the introductory students, this project was their first time conducting original college-level research. most had, at best, limited experience accessing the college library’s wide spectrum of resources and even less experience working with haverford’s research librarians. to help the students with their research, the college’s science, social science, and digital scholarship librarians customized resource-guide websites specifically for this assignment. they also made themselves available throughout the semester. technical support was crucial for this project. the digital-mapping platforms were created using software called neatline, which is a plug-in for omeka, an open-source archive software. neatline is designed for users to tell stories using maps and timelines. library staff helped set up the software and provided in-class training and case-by-case technical support for students. technical support also is an inherent part of video making in the academic context. in addition to the usual support from laboratory staff in learning camera, sound, and editing skills, this assignment required extra staff and equipment support to enable documentary students to share footage. facilitation “troubled waters” involved a complex set of collaborations that required extensive communication among the coordinating faculty and project participants. here, ross’s role as the project orchestrator was crucial. she accompanied students on field trips, visited each of the courses, coordinated with librarians, consulted with students from each group, established and maintained contact with the delaware riverkeeper, and emceed the cross-class meet-ups and final presentation sessions. it would have been difficult to accomplish such an ambitious project without a point person to keep the communication flowing. ross did this in her role as an artist-in-residence, despite being on campus for only three separate two-week visits.
presentations the project culminated in “pollution, politics & place,” a grand four-hour public event offering multiple opportunities for students and audiences to interact around the project research and creative work. the first part of this event resembled a science fair, in which audience members moved around seven pollution exploration stations, choosing which to visit and how long they wanted to engage with the presenters. at each station, chemistry and political science students stood together in front of posters and computers preloaded with their digital maps and data visualizations. (the stations were designed and installed by ross.) rather than giving long presentations, students came prepared to field a wide variety of questions from audience members. each working group of chemistry and political science students was divided in half, with one half presenting their work while the others visited other stations. thus, the audience consisted of students, the wider college community, and the general public. the presentations were followed by a catered dinner at which participants discussed project ideas and experiences. after dinner, there was a screening and discussion of the documentary students’ finished films. final product and findings to interlink the research and creative work, ross created an interactive website and protocols so that students could post their work to a platform accessible to both campus and community stakeholders. after students posted their work, she curated the posts so that the site conveys and unites diverse disciplinary perspectives on waste in the delaware river. the site includes the documentary films, the political science digital maps and narratives, and the chemistry data visualizations, as well as photographs of research in progress and contextual information. the website is http://troubledwaters .tumblr.com. the tumblr site made the project cohesive by creating a single space in which the larger group effort was made visible and accessible. what did the students find? students studying glyphosate (i.e., a key ingredient in monsanto’s “round-up” product line) traced monsanto’s global influence and the social and political controversies surrounding the company. they also examined possible health risks of glyphosate and how runoff from local golf courses and farms (which they geolocated) makes its way into the river. students working on phthalates, which are mainly used to increase the flexibility of plastic, drew connections among phthalates in the river, dow chemical’s production of polyethylene terephthalate (pet; a plastic), and nestle’s production of plastic water bottles. they located many of nestle’s international headquarters and bottling facilities and identified regulations and the chemical’s environmental, animal-, and human-health effects. other students used the prospect of finding pharmaceuticals in the river as a basis to examine the swiss drug company siegfried international, ltd, which has its us headquarters near the delaware river. another group focused on the oil industry and the impacts of a oil spill in the river. they learned how to assess the size and impacts of oil spills, and they drew connections to the emergent shale-oil industry and the larger global petroleum market.
it is interesting that the chemistry students found a greater presence of petroleum-based sealants (i.e., designed to cover parking lots and suburban blacktops) than crude oil. the group working on hormones explored the presence of estrogen in the river and speculated about how it might be related to a birth-control-pill producer such as actavis. they examined actavis’s global market share and different countries’ efforts to regulate endocrine disrupters. they also learned about the difficulty in cleaning waters of hormone waste and tracing this waste to particular commodity producers. the student group working on human waste in the river focused on caffeine and how it might be sourced to a coffee producer such as dunkin’ donuts. they used the connection to coffee to explore the global coffee market and the politics of fair-trade versus conventional coffee. finally, a group working on flame retardants focused on pbde flame retardants and how they had been used extensively in the united states in everything from clothing to mattresses before environmental protection agency (epa) regulations called for them to be phased out by . the students used the presence of pbdes to study the proximity of dumps to the river and to study the use of flame retardants in serta mattresses. they traced the geography of serta-mattress production, the global presence of icl industrial products, inc. (i.e., a pbde producer), and efforts to regulate and phase out pbdes internationally. for documentary students, this project mirrored the real-world process of filmmaking more closely than a traditional production class. the introductory documentary students struggled with the scope and timing of working on such an involved project as part of an introductory course. most were shocked to discover how much work a documentary that engages with ongoing, unfinished research from the social and natural sciences entails.
some expressed frustration at how little solid information they had and at how much raw information they were being asked to comprehend. however, they met the challenge with creativity and elegance. some adopted a conventional documentary approach, with narration and interviews. others made extensive use of humor, producing mockumentaries and spoofs. still others worked in a more abstract register, using images from the chemistry students’ sampling and analysis to create poetic meditations on water and the labor of scientific research. pedagogical assessment and lessons learned in debriefing on the project, faculty and artists noted that we had each accomplished our pedagogical goals. in course evaluations, students from each class commented on how much more they learned about waste, rivers, global production, and the community as a result of the collaboration. they also reported the benefits of having to communicate their disciplinary knowledge to students from other disciplines in a multidisciplinary effort to address real-world problems. for the introductory political science course, the project put entry-level students in the position of a researcher doing original work and provided learning skills that will prove useful later in their education. it also gave them experience with new digital technologies, spatial and geographic reasoning, oral and visual communication of findings in public, and different forms of collaboration across disciplines and with community partners. more substantively, it gave them new multidimensional insights into the global and local dimensions of production, consumption, and waste and their connections to delaware river communities. many realized how understanding the chemical compounds that comprise our modern life matters for policy making and for local, national, and global well-being.
in a typical production class, the documentary students would have learned basic production skills and studied documentary theory and history. this interdisciplinary project gave them added challenges: exploring topics unfamiliar to them, getting off campus and into physical contact with the surrounding region, engaging with students in other disciplines and observing their methodologies, and being aware of and responsible to these other forms of knowledge production. although the resulting films were no more or less technically skilled than in any other semester, they had a thematic and conceptual richness that was a direct result of the project’s structure and ambition. for the chemistry students, the project enabled them to apply their prior knowledge to explore the fate of human-derived chemical compounds that are commonly released into the environment. the chemistry students were able to display their disciplinary knowledge, but they also had to learn how to communicate this information in lay terms to the political science and documentary film students. for the majority of chemistry students, this was their first foray into the importance of science communication, and they learned how to present their findings in a way that connected the general public to the nature of scientific research.
the chemistry students learned more about the social function of a chemist than interdisciplinary research, although several commented that the work of the political science students enabled them to see the broader effects of the chemical compounds that they studied. not everything went smoothly in this project. in light of student feedback and some of the difficulties faced, we offer the following lessons learned for any similar project: (1) let students opt in. we created this dynamic and innovative learning experience for our students, but not all of them chose to be involved. rather, it was part of their required class work. consequently, some students expressed a sense of being forced to engage in this project and others complained about the reduced lecture time and lack of ability to focus on their chosen discipline. if possible, create ways for students to self-select into the project to ensure that participants want to be there and do this type of work. (2) choose the right platforms. technology made this project possible but it also imposed significant constraints. the neatline software, for example, proved to be more cumbersome and difficult to learn than expected. the moodle site designed to share material across courses worked well for ordinary documents but proved cumbersome for sharing images, especially video. (3) timing is important. the sequencing of our students’ research presented challenges for students who depended on the findings of others. ideally, chemistry students would have already identified chemicals in the river in sufficient quantities before the political science students embarked on their research; as it happened, this was not the case. political science students chose their chemicals based largely on white’s prior experience rather than on the actual quantities discovered by chemistry students. the sequencing of research also posed challenges for documentary students.
due to the long timeline for making even a short film, they had to begin their films before the chemistry and the political science students had finished much of their research, so they could not effectively build the whole research into their film projects. some opted to document the chemistry-research process rather than the results, whereas others relied on preexisting information to complete their films. some of these problems might have been avoided had we organized this project with subsequent courses spread over two semesters rather than concurrent courses in one semester. doing so, however, would have lessened the esprit de corps and it would have required more serious confrontation with established departmental practice (e.g., the chemistry superlab course taught by white is coordinated with departmental colleagues and is intended for the spring semester). this raises a fourth lesson. (4) interdisciplinary research across courses is different from other types of interdisciplinary pedagogy. because this collaboration took place across different courses rather than in a single course, the faculty had greater autonomy in the design and teaching of individual content. this avoided the sort of tensions (latent or manifest) that can develop among instructors in a co-teaching environment. however, other tensions emerged. for example, some chemistry students resisted having their research “exploited” by documentary students. faculty labored to explain to all sides that these dynamics are a common part of any collaborative effort, especially when media representations are being created. additionally, balancing the aims of the project with other curricular objectives and departmental expectations was challenging at times. for instance, the chemistry course was geared toward more advanced science majors, whereas the political science students were primarily first- and second-year students.
this generated imbalance in the interactions, an imbalance that was compounded by the political science students’ reliance on the chemistry students’ scientific research. (5) original interdisciplinary research is time-consuming. we asked much of our students and ourselves in this project. we were generally thoroughly impressed by both the quality and sheer quantity of work accomplished. nevertheless, it was taxing on students, several of whom reported that the project took too much time: working in disciplinary teams, going on field trips, meeting up with other classes, and doing research or media production. it is important to be realistic with students at the beginning of the semester about how much work and time it will take. concluding remarks this project was successful due to the dedication and hard work not only of diverse faculty and artists but also of support staff and community partners. as a model for studying waste, however, it need not entail so many actors and such ambitious and complex forms of collaboration. it could be replicated on a smaller scale in the context of single courses. a mapping-waste assignment, for example, might begin not with the research from a concurrent chemistry course but instead from a prior semester’s course, an epa report, a professional study from a riverkeeper network, or any known category of waste. moreover, the mapping and representation of the findings could be accomplished easily using simpler technology, such as google maps and posters. conversely, such projects also could be more successful if scaled up, with more course time allotted, closer coordination of schedules among participating courses, and more tailoring around the five lessons learned delineated previously. notes . supported by the andrew w.
mellon foundation, the mellon creative residencies encourage bryn mawr, haverford, and swarthmore faculty to design arts residencies that combine pedagogy, public presentation, and informal exchange among artists, faculty, students, the wider campus, and area communities. . the current version of the website also includes new material added from further documentary research and interviews conducted in fall . a short film about the “troubled waters” project as a whole is available at www.youtube.com/watch?v=mlcetjzicfs&feature=youtu.be. references barndt, deborah. . tangled routes: women, work, and globalization on the tomato trail, second edition. lanham, md: rowman & littlefield. clapp, jennifer. . “distancing of waste: overconsumption in a global economy.” in confronting consumption, ed. thomas princen, michael maniates, and ken conca, – . cambridge, ma: mit press. lepawsky, josh. . “the changing geography of global trade in electronic discards: time to rethink the e-waste problem.” the geographical journal ( ): – . milmo, cahal. . “how a tagged television set uncovered a deadly trade.” independent, february . available at www.independent.co.uk/news/world/africa/how-a-tagged-television-set-uncovered-a-deadly-trade- .html. rivoli, pietra. . travels of a t-shirt in the global economy, second edition. hoboken, nj: john wiley & sons. sparke, matthew. .
introducing globalization: ties, tensions, and uneven integration. malden, ma: wiley-blackwell. young, iris marion. . global justice. new york: oxford university press.
expanding the librarian's tech toolbox: the "digging deeper, reaching further: librarians empowering users to mine the hathitrust digital library" project d-lib magazine may/june volume , number / harriett green and eleanor dickson university of illinois at urbana-champaign {green , dicksone} [at] illinois.edu https://doi.org/ . /may -green abstract this paper provides an overview of the imls-funded project "digging deeper, reaching further: librarians empowering users to mine the hathitrust digital library," and explains how the project team developed a curriculum and workshop series to train librarians on text mining approaches and tools, in order to address the recognized skills gap between the needs of researchers pursuing digital scholarship and the services that librarians are traditionally trained to provide. keywords: hathitrust digital library project, text mining introduction the roles of librarians are transforming as a growing number of researchers and instructors integrate data into their work and scholarship.
as the association for research libraries' strategic thinking and design initiative report predicts, in , the research library will have shifted from its role as a knowledge service provider within the university to become a collaborative partner within a rich and diverse learning and research ecosystem. [ ] this futurist declaration frames how librarians increasingly are encountering new research questions and scholarly needs oriented around data and digital technologies — needs that push the boundaries of current skillsets, knowledge, and service scope of librarians and archivists today. and recent initiatives such as the library of congress's "collections as data" forum and the imls-funded "always already computational: collections as data" project recognize today's essential role of libraries and archives in providing and curating much of the data being used in this new, emergent research. in light of the "computational turn" [ ] across the disciplines and in libraries themselves, how can libraries prepare for supporting data-driven research? the digging deeper, reaching further: libraries empowering users to mine the hathitrust digital library resources (ddrf) project aims to develop and disseminate a curriculum for librarians to build competence in skills and tools for digital scholarship that they then can incorporate into research services at their home institutions.   background digital scholarship centers and research commons are emerging in more and more libraries as part of revised service models to address the research needs for digital humanities and data-driven scholarship. still, not all academic libraries have (or need) centralized services, and even when they do, librarians from many different departments in the library and areas of expertise are being drawn into digital scholarship support [ ]. 
studies document how these dynamic, data-driven changes in how scholars pursue research often involve deeper collaboration between librarian and disciplinary researchers [ ], and what the research libraries uk's re-skilling for research report called "a more proactive model of engagement with researchers." [ ] services such as research collaborations with faculty [ ], building new models for scholarly communications and publishing in digital humanities [ ], and offering tiered support services for digital scholarship projects encompassing digitization, multi-media publishing, and software development [ ] are becoming increasingly standard in libraries. the recently published volumes digital humanities in the library: challenges and opportunities for subject specialists [ ] and laying the foundation: digital humanities in academic libraries [ ] feature multiple case studies of new services and programs in academic libraries that address contemporary research needs in the area of digital humanities specifically. but these rapidly growing areas of digital scholarship research, and the responding changes in library services and infrastructure, also highlight the key challenges that librarians face in gaining skills that enable them to engage with digital scholarship work [ ]. some centers have responded by offering training programs for librarians at their institutions to become more familiar with digital tools and methods. notable efforts at the university of maryland [ ], indiana university [ ], and columbia university libraries' developing librarian program exemplify programs that re-skill librarians, especially subject librarians, to participate in new service models and the growing demand for digital scholarly support. national and international initiatives to train those across the academy, from students and faculty to librarians, in strategies for incorporating digital methods and tools into research have proliferated in recent years. 
programs such as the humanities intensive learning and teaching (hilt) institute prepare attendees, who include librarians, to engage in digitally-intensive research. other recent professional development opportunities for librarians on topics in digital scholarship have included the digital humanities institute for mid-career librarians at the university of rochester and the data science and visualization institute for librarians at north carolina state university, as well as the association of research libraries' newly launched digital scholarship institute. our ddrf project aims to share and build upon the goals of many of these training initiatives, which are to address the recognized skills gap between the needs of scholarly research with computational tools and the services that librarians are traditionally trained to provide. notably, these training initiatives for librarians employ a "train-the-trainer" model, by which librarians learn a new skillset that they, in turn, can introduce to local scholars. the newly released findings of the imls-funded mapping the landscapes: continuing education and professional development needs for libraries, archives and museums [ ] attest in particular to the need for digital scholarship skills, as they note that of the core competency areas for professional development highlighted in their survey, "intermediate to advanced technology skills, digital collection management and digital preservation competency areas received the highest percentage of respondents indicating a need for significant improvement." ddrf aims to empower librarians, especially those without local training programs, to become active in digital scholarship on their campuses. as such, our project seeks to build this capacity in support of the institute of museum and library services (imls) national digital platform initiative.
funded by a - imls laura bush st century librarian grant award, ddrf is a partnership between five institutions: the university of illinois at urbana-champaign, indiana university bloomington, lafayette college, northwestern university, and the university of north carolina at chapel hill. librarians and specialists from the partner institutions have been collaborating to develop a curriculum and training mechanism focused on preparing library and information professionals to engage in text analysis and core skills in supporting data-driven research. this project leverages the expertise of the hathitrust research center jointly based between the university of illinois at urbana-champaign and indiana university bloomington. many of the hands-on activities and examples presented in the curriculum are drawn from the workshops, tools, and research services provided by hathitrust research center for text analysis research [ ]. the curriculum will be released as an open educational resource at the end of the grant.   project update we have drafted, delivered, and revised the initial version of the ddrf text analysis curriculum using an iterative instructional design process. our process drew upon the inspiration and examples offered by other effective open training initiatives, including software carpentry, data carpentry, and library carpentry [ ], as well as the new england collaborative data management curriculum [ ]. the ddrf curriculum aims to be skills-oriented and centered on specific real-world use cases, as we describe later in the paper. the suite of teaching materials includes slide decks, instructor guides, and participant handouts. we continue to refine the materials after each iterated pilot workshop, with the aim of teaching the final curriculum at regional and national workshops across the u.s. during through . through the pilot workshops, we have learned that the skill needs for librarians around digital scholarship are varied and individually-driven. 
the five project partners represent colleges and universities with diverse constituents and approaches to supporting digital scholarship. as such, each partner institution has encountered unique experiences teaching the same curriculum to their different audiences, which have ranged from cohorts of public services librarians working in undergraduate-central communities, to information science researchers and librarians at large research universities. the richness of this participant diversity has meant that the project partners are able to provide feedback on the efficacy of the training materials for different audiences. our experience teaching the workshops to date has influenced our approach to instructional design and curriculum development, both of which have also been shaped by participant feedback through formal assessment.   . instructional design the multistage instructional design process applied in this project began in fall with definitions of learning goals and objectives for the curriculum. this stage involved identifying the requisite skills and knowledge for librarians from different areas of expertise to support text analysis research, and how to build a training program that would address those requirements. this process established a benchmark for the curriculum that project partners were able to reference as the materials took shape. as a part of iterating on the teaching materials, we have refined the learning goals and objectives based on feedback and teaching experiences. our learning goals and objectives address librarian-specific competencies to engage with digital scholarship, and we developed them with the approach of seeing text analysis tools and methods as a digital scholarship service supported by the library. we do not expect for the learner to become an expert over the course of several hours, nor for the learner to necessarily formulate their own research project. 
instead, we focus on fostering awareness of, and the ability to communicate about, key tools and methods in text analysis. additionally, the learning goals map to five training modules that follow the text analysis workflow, from finding textual data to managing and analyzing it, which also align with key points at which a librarian might be involved in the research process (table ). each module incorporates skills-based competencies that are developed through hands-on activities. a sample reference question that could be addressed using text analysis threads through the modules and guides hands-on activities and discussion. where appropriate, the activities align with hathitrust research center tools and services.

module | primary learning goal | skills developed
introduction | understand what text analysis is and how scholars are using it in their research. | recognize research questions that may lend themselves to text analysis methods.
gathering textual data | differentiate the various ways textual data can be acquired and evaluate textual data providers. | build a textual dataset and run a web scraping script.
working with textual data | distinguish cleaning and/or manipulating data as a part of the text analysis workflow. | clean text data files using a python script and/or openrefine.
analyzing textual data | recognize the advantages and constraints of web-based text analysis tools and programming solutions. | run a web-based text analysis algorithm and extract token frequencies from a dataset.
visualizing textual data | identify data visualization as a component of data-driven analysis. | practice exploratory data analysis using different tools for visualization.

table : learning and skill-building goals for ddrf curriculum

we chose to use a modular format for the curriculum, so that the workshops could be adjusted for different settings. some modules have been further broken down into "beginner" and "advanced" lessons, improving the flexibility of the teaching materials.
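to make the "working with textual data" and "analyzing textual data" skills more concrete, the following is a minimal python sketch of cleaning a text and extracting token frequencies. it is an illustration only and is not taken from the ddrf curriculum or from hathitrust research center tools: the function names clean_text and token_frequencies and the sample sentence are hypothetical, and the cleaning rule (keep only ascii letters, digits, and whitespace) is a simplifying assumption that would discard accented characters and split contractions.

```python
import re
from collections import Counter


def clean_text(raw):
    """Lowercase the text and replace every character that is not an
    ASCII letter, digit, or whitespace with a space (an assumed, simplified rule)."""
    lowered = raw.lower()
    no_punct = re.sub(r"[^a-z0-9\s]", " ", lowered)
    # Collapse runs of whitespace left behind by the substitution.
    return re.sub(r"\s+", " ", no_punct).strip()


def token_frequencies(text, top_n=5):
    """Split the cleaned text on whitespace and return the top_n
    (token, count) pairs, most frequent first."""
    tokens = clean_text(text).split()
    return Counter(tokens).most_common(top_n)


# Hypothetical sample text, used only for illustration.
sample = "The whale, the sea, and the ship: the Whale returns!"
print(token_frequencies(sample, top_n=2))  # [('the', 4), ('whale', 2)]
```

a workshop activity would more likely read text from files or a provided dataset; the sketch keeps everything in memory so that it runs without any setup, in keeping with the goal of limiting technological barriers.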
in the second round of pilot workshops, we found that the partner institutions were interested in rearranging the content to suit their audiences. some chose to teach the modules in order from one to five, while others taught the beginner lessons of multiple modules before moving on to the advanced lessons.   . teaching we have now taught several iterations of the curriculum via pilot workshops at each of the partner institutions. the workshops have been open to librarians, library paraprofessionals, and students in library and information science departments. we have seen strong interest in the workshops from across the library: for all of the fall workshops combined, % of attendees self-reported as reference librarians, % as technical services librarians, % as "other" types of librarians, and % as digital humanities or digital scholarship librarians. between each round of pilot workshops, the project team reviewed and updated the curriculum, based both on the attendees' evaluations and, in part, on the experiences of the partner instructors teaching the materials. the feedback from the project partners has revealed that it can be challenging to learn, digest, and teach materials that others have developed. in such cases, instructors found it helpful to team-teach the workshop so that the instructor team was better able to grasp the materials and answer attendee questions. making it easier for others to pick up and teach the curriculum is one of our goals for the coming year. to this end, we are drafting in-depth instructor guides for each module that define vocabulary terms, outline the key points that should be addressed, and provide a slide-by-slide script from which the presenter can read. an important component of our strategy thus far has been to limit technological barriers to participating in the workshop. the activities deployed in several of the modules involve the participants executing python programs to complete a task.
properly setting up a programming environment can take considerable time, especially in a workshop setting and when using machines in a computer lab. when possible, we have explored web-based tools for programming, such as pythonanywhere, that allow participants to complete activities no matter what their operating system and without configuring their computer. we came to this decision by evaluating our learning goals and determining which aspects of the code-based activity were most important to meeting our objectives. we determined that streamlining the technical activities through web-based programming platforms lowers the cognitive load of learning a new concept, and allows attendees to focus on what happens when they run a script as opposed to the nuances of their programming environment. while we have attempted to simplify the steps to successful completion of each activity, we have also learned to value creative and critical thinking in the hands-on sections. after the first rounds of workshops, project partners reported that they wished there were more opportunities in the curriculum for open-ended inquiry. they also reflected on the importance of play and experimentation for those learning digital scholarship competencies. the first iterations of the activities were straightforward, and we are exploring ways to make them more playful as a means of reinforcing the concepts in the activities [ ]. we have also incorporated discussion questions into the most recent version of the teaching materials. we hope such discussion will provoke critical reflection on the skills and competencies addressed in each module within the context of the learning goals, as well as provide space for attendees to connect the workshop's content to their own teaching and learning.   . assessment following each workshop, participants complete an assessment form.
from the assessment feedback, the project team has been able to glean that librarian learners appreciate learning by doing, and that they prefer depth over breadth of content in a workshop. attendee feedback shows that librarians value experiential learning. responses gathered in the assessment form often related to the hands-on activities. for example, one workshop attendee wrote that they were "intimidated" before coming to the workshop because it would teach programming concepts, but that "the structure of the workshop which allowed us to focus on the conceptual capabilities of using python and scripts to do text mining was very useful and interesting." additionally, others noted that there should be even more time devoted to skill-based learning. one wrote, "when i sign up for a workshop, i expect that most of the time will be actual hands on activities." current work on the curricular materials is focused on further developing the scope of the hands-on sections of each module to allow learners the opportunity to understand the process happening in each activity, in addition to fostering experimentation and discourse as mentioned above. workshop feedback also reflected that early pilot workshops were too short relative to the amount of content we tried to teach. one attendee advised us to, "make it longer, with more time for exploring data. [there is] not nearly enough time to really dig deep." we intend for future workshop sessions to be longer and anticipate they will be less rushed. we are also devising ways to create paths through the content for shorter workshops: by highlighting key points for each module in the aforementioned expanded instructor's guide, we aim for instructors to feel empowered to condense content as needed for use in abbreviated workshops.   
next steps and conclusion we continue to incorporate the feedback and assessment received into our curricular and programmatic development of the project materials, and strive to keep in mind the various user groups and skills levels that librarians and information professionals have today. given the initial response to our workshops, we know that our colleagues are actively seeking training and instruction in these emergent skillsets for digital scholarship and data science. the next year will see a series of regional and national workshops where we will present the curriculum to larger, more diverse audiences from across north america. through these workshops, we will gather additional responses from the librarian community that will allow us to refine the curriculum into a final open educational resource. our project is motivated by the potential to build new and interactive communities of practice in libraries around digital scholarship. library and information professionals today, across areas of expertise, must grapple with questions such as: what technical and social infrastructures do libraries need to build or re-think in order to support digital scholarship? how do we provide librarians with the skillsets and knowledge needed to respond to new research and teaching needs? how can libraries anticipate the data-driven research of the future? the more that libraries proactively equip their staff to engage in more data-intensive research and teaching — in addition to developing new spaces and service models — the richer the future looks for the changing role of libraries and archives in higher education.   references [ ] association for research libraries. ( ). strategic thinking and design initiative: extended and updated report, washington, dc: association for research libraries. [ ] david berry, d.m. ( ). the computational turn: thinking about the digital humanities. culture machine . [ ] mulligan, r. ( ). spec kit : supporting digital scholarship. 
washington, dc: association for research libraries. [ ] green, h. e. ( ). facilitating communities of practice in digital humanities: librarian collaborations for research and training in text encoding. library quarterly ( ) , - . https://doi.org/ . / [ ] auckland, m. ( ). re-skilling for research: an investigation into the role and skills of subject and liaison librarians required to effectively support the evolving information needs of researchers. london: research libraries uk. [ ] alexander, l., case, b., downing, k., gomis, m. & maslowski, e. ( ). librarians and scholars: partners in digital humanities. educause review; nowviskie, b. ( ). skunks in the library: a path to production for scholarly r&d. journal of library administration ( ), - . https://doi.org/ . / . . [ ] coble, z., potvin, s., & shirazi, r. ( ). process as product: scholarly communication experiments in digital humanities. journal of librarianship and scholarly communication ( ), ep . https://doi.org/ . / - . [ ] vinopal, j. & mccormick, m. ( ). supporting digital scholarship in research libraries: scalability and sustainability. journal of library administration ( ), - . https://doi.org/ . / . . [ ] hartsell-gundy, a., braunstein, l. & golomb, l. eds. ( ). digital humanities in the library: challenges and opportunities for subject specialists. chicago: association for college and research libraries. [ ] gilbert, h. & white, j. eds. ( ). laying the foundation: digital humanities in academic libraries. lafayette, in: purdue university press. [ ] posner, m. ( ). no half measures: overcoming challenges to doing digital humanities in the library. journal of library administration ( ), - . https://doi.org/ . / . . [ ] munoz, t. & guiliano, j. ( ). making digital humanities work. digital humanities conference abstracts efpl-unil lausanne, switzerland - july . - . [ ] courtney, a., m. dalmau, & c. minter. ( ). research now: cross training for digital scholarship. poster presented at dlf forum. 
[ ] drummond, c., skinner, k., pelayo, n., & vukasinovic, c. ( ). self identified library, archives, and museum professional development needs edition: compendium of - mapping the landscapes project findings and data. atlanta: educopia institute. [ ] downie, j.s., furlough, m., mcdonald, r.h., namachchivaya, b., plale, b.a., & unsworth, j. ( ). the hathitrust research center: exploring the full-text frontier. educause review, may , . [ ] baker, j. et al., ( ). library carpentry: software skills training for library professionals. liber quarterly. ( ), - . https://doi.org/ . /lq. [ ] lamar soutter library, university of massachusetts medical school. new england collaborative data management curriculum; kafel, d., creamer, a. t. & martin, e. r. ( ). building the new england collaborative data management curriculum. journal of escience librarianship ( ): e . https://doi.org/ . /jeslib. . [ ] for more about the concept of play in digital pedagogy, see sample, m. ( ). play. in digital pedagogy in the humanities: concepts, models, and experiments. new york: modern language association.   about the authors harriett green is the interim head of scholarly communication and publishing, english and digital humanities librarian, and associate professor, university library, at the university of illinois at urbana-champaign. her research and publications focus on usability of digital humanities resources, digital pedagogy, digital publishing, and humanities data curation. she is principal investigator for the imls-funded "digging deeper, reaching further: libraries empowering users to mine the hathitrust digital library" project.   eleanor dickson is the visiting hathitrust research center digital humanities specialist at the university of illinois at urbana-champaign. she supports outreach and training for the hathitrust research center, as well as local digital humanities research at illinois.   
copyright ® harriett green and eleanor dickson

white paper report
report id:
application number: hd
project director: william seefeldt (wseefeldt @unl.edu)
institution: university of nebraska, board of regents
reporting period: / / - / /
report due: / /
date submitted: / /

neh grant # hd- - digital humanities start up program
pi: douglas seefeldt
final performance report "sustaining digital history"
report filed through university of nebraska-lincoln
william g. thomas, chair, department of history, university of nebraska, project co-director

this project aimed to build a scholarly community for the practice of the emerging field of digital history by .) enhancing communication and collaboration among scholars and journal editors, .) creating model forms of scholarship and peer review, and .) establishing a clearinghouse for all peer-reviewed digital history scholarship. digital history has grown up in the last fifteen years through and around the explosion of the world wide web, but historians have only just begun to explore what history looks like in the digital medium. increasingly, university departments seek scholars to translate history into this fast-paced environment and to work in digital history; however, they have found that without well-defined examples of digital scholarship, established best practices, and, especially, clear standards of peer review for tenure, few scholars have fully engaged with the digital medium.

. enhancing communication and collaboration among scholars and journal editors
our project worked to develop interest in digital scholarship among history journal editors through several parallel efforts.
we contacted editors and scholars involved in the history cooperative and hosted a two-day meeting of journal editors in lincoln in october , including: robert schneider, american historical review; christopher grasso, william and mary quarterly; alan lessoff or john mcclymer, journal of the gilded age and progressive era; david lewis, the western historical quarterly; tamara gaskell, the pennsylvania magazine of history and biography; and eliza canty-jones, oregon historical quarterly. we hosted a meeting titled “sustaining digital history” the day prior to the fifth annual nebraska digital workshop and invited potential authors, peer reviewers, and interested scholarly journal editors to participate. we invited anne s. rubin (university of maryland baltimore county) to attend the meeting as a practitioner/digital scholar and we invited abby smith rumsey (scholarly communication institute) to serve as a consulting expert to advise the group. in addition, we invited mike spinella from jstor to comment on the state of online journals and jstor's plans for potentially hosting a digital scholarship journal space. scholars attending the nebraska digital workshop who participated in the sustaining digital history meeting included stefan tanaka (university of california, los angeles), andrew jewell (unl center for digital research in the humanities), amanda gailey (unl, english department), and jeannette e. jones (unl history department). graduate students in attendance included: brent rogers, leslie working, kaci nash, jason heppler, michelle tiedje, and brian sarnaki. the outcomes of this valuable editorial and scholarly meeting were several: a. the editors affirmed their desire for a clearinghouse of reviewers and practicing digital historians as a means to understand the field, identify peer reviewers, and link up with scholars undertaking digital work b.
the editors appreciated a demonstration from anne rubin laying out the creative process in a digital project and the expectations of authors engaged in digital scholarship c. the group considered the questions of hosting, collecting, imprinting, and indexing digital scholarship with the three groups (authors, peer reviewers, and editors), examined models for incorporating digital scholarship, and agreed in principle with the goal of greater indexing and integration of digital scholarship into the journals. d. the group discussed a possible award or prize for digital scholarship submission to one or more of the participating journals. . as an outcome of sustaining digital history we expected to assemble a digital history scholarly journal publishing advisory group that includes key scholars active in the field, such as edward l. ayers, laurel thatcher ulrich, daniel cohen, amy murrell taylor, william turkel, and richard white, and others listed in our directory of digital historians, who might serve as first peer reviewers working with journal editors. the editors did not see this advisory group as a high priority, preferring instead to work within their current editorial board structures. we did not pursue this objective further. the current structure of peer review combined with our digital history index (see below) was a system the editors found sufficient. editors have used the index already to line . identify, peer review and publish a number of digital history projects in a number of scholarly journals. this was an ambitious goal and one we did not have time and resources to meet fully in the course of the grant. journal editors preferred to hold peer review within their current operating structures rather than federate that role. we sought to identify potential peer reviewable works of digital scholarship through networking among digital history scholars. 
several projects came forward from unl: in particular, seefeldt worked with his graduate students in the william f. cody digital project to create a series of digital research analysis modules (www.codystudies.org), some of which are being developed for peer review in scholarly journals. william g. thomas has been working with graduate student leslie working, and his collaborators richard g. healey (university of portsmouth) and ian cottingham (unl, computer science) on his digging into data challenge grant for "railroads and the making of modern america" to produce a peer reviewable "app." he is considering electronic submission to the journal of the gilded age and progressive era, southern spaces, or the journal of the civil war era. other scholars, including jon christensen (ucla), used the sustaining digital history project to initiate communication with the project editors and other editors (such as environmental history) regarding possible electronic submission and digital peer review. an outgrowth of the sustaining digital history project was the wider dissemination of the concerns related to digital scholarship and peer review. thomas was selected for the board of editors of anvil academic, a new digital scholarship peer review publishing venture. seefeldt was selected for the board of editors of the digital arm of the journal of the gilded age and progressive era.
. expand the digital history site by building on our directory of digital historians and experimenting with “digital history reviews” of projects and tools that take full advantage of the medium. establish digital history as the clearinghouse for the best digital history scholarship. we have made substantial progress on this aspect of the project. we worked with unl graduate research assistant kaci nash to overhaul the design and organization of the digital history project web site (http://digitalhistory.unl.edu) to highlight “documenting,” “doing,” and “teaching” digital history to aid in defining and sustaining this emerging mode of scholarly research and communication. we have put together a “directory of digital historians” that we discussed at our meeting last fall and prototyped during the first year of the project. it now has approximately entries seeded from a number of sources (conference programs, digital history reviews, word of mouth, self nomination, etc.). we have also expanded the “project reviews” section from twenty to approximately fifty projects written by unl graduate students as projects in digital history courses taught by professors seefeldt and thomas: . we plan to expand this section and the “tool reviews” section when we move the digital history project to the wordpress platform in the spring of and open up the editorial functions to other digital historians who have expressed interest in collaboration. 
. sponsor and organize sessions to share this work at both the aha and oah annual meetings during the winter & spring of , with one panel of journal editors on the topic of “the future of the journal in the digital era” and another panel of scholars presenting their own digital history scholarship at each meeting. this objective was perhaps our greatest success in the project. the participating editors and scholars proposed several digital history and publishing sessions for the american historical association annual meeting and the organization of american historians annual meeting (the oah did not accept our proposals for two panels at the milwaukee meeting, unfortunately). two panels were accepted for the meeting of the aha in chicago. these sessions addressed a range of questions. one question is whether there are ways of writing history other than the analytic essay. are journals interested? are our colleagues interested? if so what is making this move difficult? what are the financial ramifications of digital production and/or digital journals for journals themselves and for their associations? what becomes a sustainable model financially as well as intellectually going forward? who will be the audience for our online journals? these questions addressed wider themes that aha president-elect william cronon raised in aha perspectives, as well as vital issues around what one participant called "the weakening binaries in the digital realm: professional/non-professional; academic/public; specialist/synthesizing." other members of our panel focused on these questions: what genres will emerge as the most robust in the journal niche of the short-form communication? what have you done so far to take a traditional journal into the digital age and what are the next steps? how is digital scholarship like and unlike the traditional forms of scholarship that your journal deals with? what are the challenges a born-digital "article" (entity?) 
poses to the double-blind review process at a scholarly journal? what challenges does it pose to editing, production, and distribution? to what extent do you think your journal will be-- or should be--transformed by digital technologies by ? the aha panels, presenters, and abstracts were as follows: “digital history: state of the field” chair, william g. thomas, u. of nebraska panelists: jon christensen, stanford u.; jo guldi, u. of chicago; andrew torget, u. of north texas abstract: digital history as a field emerged with the explosion of the world wide web, since , the dominant means of information access, knowledge acquisition, and communication for the public and increasingly for the scholarly community. because the medium is so new and the technology so quickly changing, we have only just begun to explore the new forms that historical scholarship might take. we need well-defined examples of digital scholarship, established best practices, and, especially, clear standards of review for tenure. we know that time has not solved the problem; indeed, recent studies show that scholars in history and other humanities disciplines are as wedded as ever to traditional forms of communication. young humanities scholars, especially in history, are not experimenting in the digital medium in large part because the wider professional culture has been slow to change. a whole range of social and cultural barriers confront scholars who consider digital scholarship. their departmental colleagues know little about digital technologies, practices, or methods, and their promotion and tenure committees, outside reviewers, and upper administrations often consider peer-reviewed monographs the sole basis for advancement. 
the current problem is multifaceted—administration leaders often seek to promote digital technologies in teaching or research, yet department tenure committees often rank digital work below a published monograph; libraries have taken the lead in creating digital research platforms for faculty, yet university presses and scholarly journals remain the gold standard for tenure and promotion; senior faculty often feel liberated to embrace experimentation, yet junior faculty often prudently avoid risks. the growth of digital history, it should be stated, has been given shape and encouragement most directly by the leading professional associations and the national endowment for the humanities (neh). the american historical review offered a pioneering set of peer-reviewed digital articles, the journal of american history has reviewed leading history web sites, and the neh has funded important history projects, from the valley of the shadow to zotero, and created a portal for leading digital sites (edsitement). these steps have provided absolutely critical opportunity for scholars to work in the digital medium. the problem historians face now is institutional, structural, and social, and this panel discussion by a slate of researchers actively pursuing digital forms of scholarship is aimed at discovering and lowering these barriers in the discipline of history. “the future of history journals in the digital age” chair, douglas seefeldt, u. of nebraska panelists: christopher grasso, the college of william & mary; david rich lewis, utah state university; john f. mcclymer, assumption college; abby smith rumsey, scholarly communication institute; stefan tanaka, u. of california at san diego; allen tullos, emory u. 
abstract: as digital technologies advance rapidly, as vast repositories of information come online, and as more and more people participate in the digital revolution around the world, historians face a very important set of decisions about the nature of historical scholarship and its forms. yet, few venues exist for scholars to conceive, produce, and distribute their digital work, or to communicate with one another about the forms and practices of the digital medium. while several funding institutions have committed significant resources to the development of digital collections and tools, most prominently the andrew w. mellon foundation and the national endowment for the humanities, scholars perceive few options for publishing digital work and university presses and leading journals have been slow to embrace "born digital" scholarship. the professional associations (the american historical association and the organization of american historians) have taken crucial steps in promoting digital scholarship and provided essential leadership. our challenge now is to build on their foundation and create a wider scholarly community of authors and journal editors around digital history to identify, peer review, and disseminate article-length digital scholarship by placing these works in some of the leading journals. one of the most important aspects of this roundtable discussion will be to explore ways to reduce the gap between the scholarship in the profession's journals and scholarship on the web. after significant discussions with history cooperative journal editors over the course of the past year and a half and during meetings at the university of nebraska-lincoln last fall, supported by an neh-funded digital humanities start-up grant titled “sustaining digital history,” digital historians douglas seefeldt and william thomas have found wide support for taking some steps to close this gap. journal editors see the burgeoning work on the web and recognize its value. 
they also recognize the challenges of peer reviewing this work. currently, the journals serve as the gatekeeper and record of scholarship in the fields of history, yet most do not yet index, review, refer to, incorporate, imprint, or publish anything from the digital medium. conversely, the independent scholarship historians have produced on the web remains all too often unconcerned with peer review, editorial control, and incorporation into the scholarly record. because digital work is rarely featured or recognized in the profession's leading journals, among other reasons, younger historians have proven reluctant to develop born digital scholarship and departments have had difficulty evaluating this scholarship for promotion and tenure. this roundtable discussion seeks to explore avenues of practice for integrating digital scholarship into the record of professional scholarly activity and to consider how best to help authors, reviewers, and editors negotiate a difficult transition. these panels discussed the significant gap in the social and cyber-infrastructure for supporting digital scholarship in history, pointing out that young humanities scholars, especially in history, are not experimenting in the digital medium in large part because the wider professional culture has been slow to change. in fact, as robert townsend's survey of aha members regarding research and teaching published in perspectives found, half of those polled have considered publishing online, noting the benefits of reaching a wider audience, publishing their work more quickly than via print, and the ability of the digital medium to reach a wider audience of historians, among other factors. it is crucial to note that those who have not yet published any form of electronic scholarship but would consider it overwhelmingly cite the perception that online scholarship lacks the scholarly recognition and prestige of print publication (robert b.
townsend, “how is new media reshaping the work of historians?” perspectives on history november ). essays in the january issue of perspectives written by ahr editor robert schneider and john thornton, a member of the aha's research division, further explore the costs and benefits for historians of the current “digital turn” in the humanities. these panelists hope to make it possible for scholars to create, publish, and review digital scholarship and, in effect, to mainstream this work within the disciplines and through the leading professional journals. as digital technologies advance rapidly, as vast repositories of information come online, and as more and more people participate in the digital revolution around the world, humanities scholars face a very important set of decisions about the nature of scholarship and its forms. yet, few venues exist for scholars to conceive, produce, and distribute their digital work, or to communicate with one another about the forms and practices of the digital medium. we take as something of a mantra jerome mcgann's dictum from his article “the future is digital,” that “the matter won’t become clear, one way or the other, until we undertake to design and implement a working model.” this roundtable discussion explored avenues of practice for integrating digital scholarship into the record of professional scholarly activity and considered how best to help authors, reviewers, and editors negotiate what is a difficult but ultimately profitable transition from a print-based scholarly communication standard to an approach that makes room for a variety of modes. participants on the panels of these two sessions and editors associated with the project came to a "brown bag" session with seefeldt and thomas to discuss next steps and future developments that would benefit the project sustaining digital history.
the group agreed that a series of digital humanities neh institutes at participating institutions would greatly benefit the project, spark models of digital scholarship, and provide participating journals with potentially reviewable works. conclusion: we expect to continue the work in sustaining digital history through several venues. first, we will be moving the web site and its ongoing development home to ball state university where seefeldt is now teaching and leading research in digital history. we will be implementing a more user-friendly digital historians index so that visitors can self-register more easily and we will simultaneously deploy more social media tools to support registration. additionally, we will release in spring the directory of digital history scholarship, as a first effort at categorizing peer reviewed scholarship in digital history. finally, we will begin planning a digital humanities institute proposal at two or three institutions whose faculty have already expressed an interest in partnering on an institute. since this proposal several initiatives have simultaneously developed to open venues for digital scholarship and peer review, including innovative post review models such as the journal of digital humanities. other new ventures include anvil academic and scalar. we are pleased to have contributed to this innovative movement. our project enabled some developments in history journals, notably the upcoming american historical review digital scholarship article contest and the jgape digital arm's further growth. some journals have adopted web site reviews and are attempting to more systematically consider digital work as scholarship worthy of record in their pages. nevertheless, we are struck by the lack of progress as well--most journals continue to work exclusively with traditional form articles and the energy around digital scholarship is continuing apace outside of the world of many history journals.
still, our project and the critical support of the neh helped mainstream these concerns at the american historical association annual meeting and among gatekeepers in significant ways. we are grateful to the neh for its support and its patience as we have worked through difficult and challenging issues for the profession and the future of digital scholarship. appendices a. screen shot of sustaining digital history blog section. the project blog includes the meeting schedule, professional biographies for presenters, journal editors, and grant project directors, a bibliography of selected relevant publications and materials related to the sustaining digital history initiative, and the project purpose statement document. b. screen shot of sustaining digital history project. in addition to the blog mentioned above, this website includes sections on “documenting” digital history (directory of digital historians, neh digital humanities grant, project reviews, tool reviews), “doing” digital history (essays, interviews, lectures), and “teaching” digital history (student projects, syllabi). c. promotional postcards, digital history project. , - / ” x - / ” full color cards and , ” round stickers for distribution at conferences and other professional meetings. d. rachel ensign, “historians are interested in digital scholarship but lack outlets,” chronicle of higher education blog “wired campus” entry on “sustaining digital history” project, october , . e. jennifer howard, “historians reflect on forces reshaping their profession,” chronicle of higher education, january , . the section “going digital” dedicates a paragraph to our aha panel “the future of history journals in the digital age” quoting panelist stefan tanaka. f. question list for “the future of history journals in the digital age” panel discussion at the american historical association conference.
digital history project | an neh digital humanities start-up grant http://digitalhistory.wordpress.com/ aha day : state of the field posted on january , by jason heppler in the second workshop session sponsored by the aha research division, prof. william g. thomas chaired a panel with jon christensen, jo guldi, and andrew j. torget. the purpose of the panel was to examine ways that digital scholarly work was being produced. jon christensen sought to answer two questions: ) what has the research produced? and ) so what? he presented on the research for his book critical habitat: a history of thinking with things in nature. much of the digital output from the book, which can be viewed at the stanford spatial history project, sought to use spatial analysis to examine historical correlations. data, he reminds the audience, is shot through with historical contingency. thus, you need new methods to see through the data. jo guldi suggested that digital materials press scholars to consider sources in larger scales of time and place, indeed, may even demand larger scale and longer periods. methods of digital history help raise new questions. guldi argues that we are secure in our traditional methods of doing micro history, but we don’t know how to release macro history in our work. the annales school attempted this, but required large research teams. mass digitization, however, gives us new tools. she demonstrated her uses of file juicer and the timeline feature of zotero to highlight ways of examining the longue durée of history. andrew torget illustrated his texas slavery project and how spatial analysis helped him raise new questions about the extension of slavery into texas. he spoke also about the challenges of translating digital work into traditional narratives.
his dynamic maps of texas speak as a sort of argument on their own, but moving that into print is a challenge and ultimately falls short. some models of moving digital to print exist, he points out, including william thomas’s the iron way and richard white’s railroaded, but the book remains the standard for tenure and promotion. posted in doing digital history | tagged events | leave a reply aha day : the future of history journals in the digital age posted on january , by jason heppler at session on saturday, prof. douglas seefeldt led a roundtable discussion with christopher grasso, david rich lewis, john f. mcclymer, abby s. rumsey, stefan tanaka, and allen tullos. the purpose of the panel was to explore ways to reduce the gap between scholarship in the profession’s journals and the scholarship of the web. university presses and scholarly journals remain the gold standard for tenure and promotion, and time has not solved the problem of valuing digital work below that of print. those participating faced a series of questions. they spoke on the steps they were taking to move their journals into the digital age. some are making concerted efforts to incorporate new digital supplements to their journals while others, like the peer-reviewed southern spaces, are entirely digital. the issue of peer review was a key focus in the discussion as well. the editors generally agreed that double blind peer review panels could maintain their function, but also begin bridging the gap of print and digital by incorporating experts on the content and experts on the digital to talk together and assess how well content and form interact. stefan tanaka challenged the idea, suggesting that double blind review is only one of several ways to do peer review.
he also pointed out that a peer review process exists online, and these discussions needed to happen online where the scholarship is being produced. an example that tanaka points to is blogs, where people are doing serious, public scholarship and should be recognized as communities of conversations. open access formed another nexus of the conversation. open access digital publishing gives authors an idea of how many people are viewing their work. abby rumsey provocatively suggests that libraries have the money to fix the problem: they have the ability to reshift their budgets and support digital humanities without any problems. exploring the digital space means being more demanding about libraries finding solutions, and they can find solutions by reallocating budgets. “university libraries still have a lot of money,” rumsey suggested. “if faculty demanded they support digital and open access scholarship, they would.” journal editors suggested that they are open to the idea of digital scholarship and are waiting for more submissions of such work that force them to think about ways of incorporating digital work into their apparatus. posted in sustaining digital history | tagged events | leave a reply aha day : presenting historical research using digital media posted on january , by jason heppler at session on saturday, “presenting historical research using digital media,” the presenters introduced several new modes for presenting their scholarly work. the session included a companion website that contained resources for each of their talks. monty dobson, a historian and archeologist, discussed his work in documentaries and showcased his upcoming pbs series, america from the ground up.
originally designed as a half-hour video for his classroom after he became frustrated with the lack of material on the history of the interior u.s., the project has grown into a four-part series. he hopes that his work will focus our attention more squarely on the interior united states, promising the audience that not once will he mention george washington when discussing the arrival of europeans and americans to the region. in confronting a narrative that is east coast centric, he hopes to reshape public history and examine the history of a region more closely aligned with new france rather than the experiences of the coast. phil ethington discussed geo-historical visualizations. digital media, he reminds us, is important because of its substance and what we’re communicating. the media is not the message; rather, the media enables new ways of seeing the past. he has developed hypercities, built for urban research and collaboration, as a method to examine how people came to understand their place and space. ethington also pointed to the potential of nonprofits and community-based organizations to use hypercities as a way to crowd source their local history. katrina gulliver discussed her process of starting up her podcast, cities in history. she came to podcasting as an experiment in learning how to do this technically, but also to think about presenting her work to a general audience. she outlined the various off-the-rack and easy to use tools she uses in her setup, including jellycast and garageband to record and tumblr for her site. jennifer serventi ended the session discussing the variety of digital projects that the national endowment for the humanities funds and things to think about when writing proposals to the neh.
serventi reminded the audience that humanities projects should use the best genre or medium for the project, whether it was a book, podcast, film, or otherwise. she also pointed to the neh’s new database of digital projects as a way to begin learning about the sorts of projects that have been funded and may serve as a starting point for our own proposals. posted in doing digital history | tagged events | leave a reply aha day : pioneers discuss the future of digital humanities posted on january , by jason heppler the panelists at session “the future is here: pioneers discuss the future of digital humanities,” the presidential session chaired by outgoing aha president anthony grafton, included erez lieberman aiden and jean-baptiste michel from harvard university and blaise aguera y arcas from microsoft. both presentations emphasized the necessity of collaboration and opportunities that digital computing offers humanist inquiry while also warning about the pitfalls of relying on digital technology. aiden and michel outlined culturomics in a talk that almost exactly followed their ted talk. they provided examples of word-frequencies and usages over time. using million books digitized by google, they insisted their methods gave insight into a sort of cultural genome. they also confronted five “myths” of those critical of their approach to analyzing historical documents, insisting they were not trying to replace historians with machines but rather build tools that historians may find useful in their work. the most provocative section of the talk was a discussion of new work they’re undertaking into “cultural inertia,” or asking the question of whether we could use cultural data and history to predict the future.
history, they concluded, will remain the domain of close reading, primary sources, and interpretation, but will also include big data, massive collaboration, data interpretation, computation, and science. blaise aguera y arcas, known for his work on photosynth, discussed his effort to understand typefaces in gutenberg’s printing press. he examined how type was configured using clustering software and high resolution images of letters to analyze the components that made up the text. moreover, he asserted that gutenberg’s real contribution was the development of fonts rather than moveable type. at the core of his talk was an emphasis on collaboration. only through collaboration in several areas of expertise could he come to understand different aspects of typesetting. the same holds true for any aspect of the past. collaboration will be essential after the digital turn because we cannot make assumptions about digital data: the rise of proprietary digital environments, the inability to truly own data, the misguided notion that one can own a gadget, the filter bubble, and no guarantee that the lights will remain on. invention does not happen in a vacuum. rather, collaboration is essential for exploring or generating new ideas. posted in documenting digital history | tagged events | leave a reply aha day : digital humanities: a hands-on workshop posted on january , by jason heppler on friday, january th, session , “digital humanities: a hands-on workshop,” sponsored jointly by the aha research division and the roy rosenzweig center for history and new media, introduced attendees to a variety of approaches in digital methods for research and teaching. six stations were arranged around the room that allowed attendees to wander from topic to topic and engage in conversations, questions, and demonstrations.
topics included digital publishing with dan cohen, who discussed a variety of different methods that scholars use to communicate their work. he also talked about digital humanities now and the platform that runs it, pressforward. jeff mcclurken presented on teaching with social media and shared his experiences with using twitter, facebook, and blogs for the classroom. mcclurken collected many of the resources he discussed on a page he created. fred gibbs discussed text mining and offered examples from his experiences in using the method for research. gibbs also has a companion website. rwany sibaja talked about digital storytelling and using multimedia in narrative. he has collected several tools and resources for others to check out. jennifer rosenfeld talked about the resources available at teachinghistory.org and how the website can help students gain a better understanding of the types of evidence used by historians. patrick murray-john discussed content management systems, including zotero, and its usefulness in categorizing, tagging, and collecting data and information. posted in documenting digital history | tagged events | leave a reply digital historical scholarship and the civil war posted on january , by kacinash the civil war lends itself well to the digital medium. in addition to the subject’s scholarly contingent, it also possesses a great public audience of increasingly computer literate members. this question of audience was something addressed in the aha panel wittily titled “hardtack and software: digital approaches to the american civil war,” a digital spin on john d.
billings’ popular reminiscence hardtack and coffee: or the unwritten story of army life. of the four projects presented during the session, two seemed to be readily open to the inclusion of the general public as well as the more general scholarly audience: visualizing emancipation and sherman’s march and america: mapping memory. yet the ability to play with data and explore the history provided by the digital medium promotes public use as well. civil war washington, while being a repository for scholarly information about the nation’s capital, may also be of interest to “amateur” civil war scholars. mining the dispatch is admittedly geared toward academics; however, nelson’s findings will be of interest to any student of the civil war, with or without professional scholastic credentials. each panelist provided an overview of their respective projects, which i shall not repeat here. readers are encouraged to visit the sites and interact with them for themselves. instead, each presenter introduced the scholarly findings or evidence displayed or exhibited in the projects. the tools and technology employed by each project received relatively little attention. during the comments section of the panel, robert nelson asserted that the challenge is to produce scholarship that is going to be of interest to scholars of the subject not the technology. we must focus on historical questions and historical moments, not on techniques. this thought was one that stayed with me more than any other aspect of the session. if we want the discipline of history to be receptive to works created through and with the digital medium, it is essential that we emphasize the scholarship that is being produced, not the way in which it is being produced.
in order for “doing digital history” to become synonymous with “doing history,” we need to convince the field of the validity of digital scholarship. back to the issue of audience, users outside of the academy (civil war “buffs,” teachers, and students) are likely unconcerned with whether or not what they are interacting with is considered scholarship by academics, but rather what they can learn from utilizing such projects. to me, a master’s student with career ambitions in the public history sector, this is the most exciting aspect of combining technology with doing history: its ability to make history more accessible and appealing to the public. whether through providing access to documents and visualizations which allow a thorough analysis of washington, d.c. or using an algorithm to reveal large societal and cultural patterns over thousands of newspaper articles, the digital medium is truly an effective way both to craft history and to communicate it. posted in documenting digital history, doing digital history | tagged events | reply the future is here: digital history at the th annual meeting posted on january , by kacinash “the future is here,” a series at the aha meeting, will feature numerous presentations and discussions on digital history. several graduate students who are attending these panels will post reactions to these panels as well as participation at the thatcamp hosted on january .
posted in documenting digital history, doing digital history, teaching digital history | tagged events | leave a reply digital history project about blog documenting directory of digital historians neh digital humanities grant project reviews tool reviews doing essays interviews lectures teaching student projects syllabi social media added link to guidelines for evaluating digital scholarship compiled by the center for digital research in the humanities. new submission form added to directory of digital historians. documenting digital history as we begin to explore what history looks like in the digital medium, we need to see examples of excellent digital historical scholarship, established best practices, and, especially, clear standards of peer review for tenure and promotion in history departments. this section contains a directory of digital historians, an index of digital scholarship, information on the national endowment for the humanities digital humanities start-up grant, digital history project reviews, and new media tool reviews . . . [more] added the essay, "what is digital history? a look at some exemplar projects". doing digital history digital history is about digitizing the past certainly, but it is much more than that. we aspire to create ways for people to experience and participate in history, as well as to see and follow an argument about a major historical problem. this section contains essays on the process of creating works of digital historical scholarship, interviews with leading practitioners, and lectures by digital historians sharing their work . . . [more] updated list of syllabi. teaching digital history as digital history becomes more prevalent, we will be teaching with digital sources and teaching digital methods.
teaching digital history involves methodological questions, narrative theories, computational programming, technical writing, group projects, and digital media productions. this section contains links to undergraduate and graduate student projects and course syllabi . . . [more] join the conversation digital history project http://digitalhistory.unl.edu/index.php of / / : pm roundtable questions/topics for discussion “the future of history journals in the digital age,” douglas seefeldt, chair american historical association conference [cg] what have you done so far to take a traditional journal into the digital age and what are the next steps? to what extent do you think journals will be--or should be--transformed by digital technologies by ? [how can we get the good digital scholarship into the minds of historians who are accustomed to encountering such scholarship via journals and books?] [asr] who will be the audience for our online journals? how will the digital format weaken the familiar binaries of professional/non-professional; academic/public; specialist/synthesizing? [drl] what are the financial ramifications of digital production and/or digital journals for journals themselves and for their associations? what becomes a sustainable model financially as well as intellectually going forward? [it's a practical question of sorts that also gets at issues of changing membership patterns (as well as readership or scholarship) within the profession as well as by an interested public] [asr] you have expressed the concern that too much discussion about $$$, important as it is, may bog us down because there really are no ready models yet. but you contend that we must start with the premise that the fate of journals—the fate of the scholarship that gets communicated through journals, to be more precise--should be decoupled from the revenue models for societies. or else….. 
[AT] Do we see libraries increasingly stepping up to publish journals in the way that some are now housing university presses?

[CG] How is digital scholarship like and unlike the traditional forms of scholarship that your journal deals with? What are the challenges a born-digital "article" (entity?) poses to the double-blind review process at a scholarly journal? What challenges does it pose to editing, production, and distribution?

[JM] JAH's insistence upon applying print journal criteria (number of pages, for example) to online projects. More specifically, the JAH is determining whether or not to include online scholarship in its index of recent scholarship. Not making the index means that your work does not exist.

[ST] Are there alternative ways of writing history than the analytic essay [spatial, data mining, deep databases, etc.]? Are journals interested? Are our colleagues interested? If so, what is making this move difficult? [How should the discipline rethink itself and how should scholarly journals fit into it?]

[ASR] Genres always emerge as a response to audience. What genres will emerge as the most robust in the journal niche of the short-form communication? [Is the journal a “community” or just the expression of one part of a community?]

[CG] Is there a point where trying to increase readership for sustainability (or other noble goals) comes into conflict with the scholarly mission of the journal?

[ASR] What conventions of the journal are optimized for print (page limit might be one) that we should feel okay about letting go of in the digital? And which new conventions can we imagine as optimized for the digital?

[AT] It is clear that the tendency among digital scholarship is to support OA via Creative Commons licenses or other methods. What are the ramifications of open access principles for the history journal model?

[ALL] [Creativity, in the form of digital historical scholarship, is coming from the authors now.
How do we need to redefine the definitions and roles of authors and editors/publishers in light of this? And where does the “brand” come from? Does the digital format change/challenge that authority?]

Computer stylometry of C. S. Lewis’s The Dark Tower and related texts

Michael P. Oakes
RIILP, University of Wolverhampton, England

Abstract

This article looks at the provenance of the unfinished novel The Dark Tower, generally attributed to C. S. Lewis. The manuscript was purportedly rescued from a bonfire shortly after Lewis’s death by his literary executor Walter Hooper, but the quality of the text is hardly vintage Lewis. Using computer stylometric programs made available by Eder et al.’s ( : Stylometry with R: a package for computational text analysis. R Journal, ( ): – ) ‘stylo’ package and a word length analysis, samples of each chapter of The Dark Tower were compared with works known to be by Lewis, two books by Hooper and a hoax letter concerning the bonfire by Anthony Marchington. Initial experiments found that the first six chapters of The Dark Tower were stylometrically consistent with Lewis’s known works, but the incomplete chapter was not. This may have been due to an abrupt change in genre, from narrative to pseudoscientific style. Using principal components analysis, it was found that the first and subsequent components were able to separate genre and individual style, and thus a plot of the second against the third principal components enabled the effects of genre to be filtered out. This showed that Chapter 7 was also consistent with the other samples of C. S. Lewis’s writing.
Introduction

Clive Staples Lewis ( – ) was a prolific writer, and his best-loved fiction is probably his Deep Space trilogy, The Screwtape Letters, and his ‘Narnia’ series of children’s books. Shortly after Lewis’ death, Walter Hooper, the literary executor for the Lewis estate, claimed to have found an unpublished fragment of fiction, which was published much later ( ) as The Dark Tower. There is some overlap between The Dark Tower and the Deep Space trilogy, as they share a number of characters such as McPhee, Ransom, and even Lewis himself. For many years, C. S. Lewis had lived with his brother Warren at a house called The Kilns, in Oxford. In the first paragraph of the preface to the version of The Dark Tower published by Fount, Hooper claimed that Warren wanted to dispose of his late brother’s old papers, and ordered the gardener to light a bonfire of them which ‘burned steadily for three days’. In Hooper’s own words:

‘Happily, however, the Lewis’s gardener, Fred Paxford, knew that I had the highest regard for anything in the master’s hand, and when he was given a great quantity of CS Lewis’s notebooks and papers to lay on the flames, he urged the Major [Warren Lewis] to delay till I should have a chance to see them. One of the rescued notebooks contained the hand-written manuscript of The Dark Tower’. (Hooper, , p. vii)

Correspondence: Michael P. Oakes, University of Wolverhampton, RIILP, Stafford Street, MC Building, Wolverhampton WV LY, United Kingdom. E-mail: michael.oakes@wlv.ac.uk

Digital Scholarship in the Humanities © The Author . Published by Oxford University Press on behalf of EADH. All rights reserved. For permissions, please email: journals.permissions@oup.com
doi: .
/llc/fqx

The Dark Tower was eventually published with a number of C. S. Lewis short stories, all of which had been published before, except for the very brief The Man Born Blind, which had been found in a notebook given to Walter Hooper by Lewis’s brother. The story of the bonfire was later denied by Fred Paxford, and this denial was published in the journal Christianity and Literature (Lindskoog, ). Shortly afterwards, Christianity and Literature ( ( ): – ) also published a letter from Anthony Marchington, seemingly in support of Paxford’s denial, as it stated that a chemical analysis of the soil in Lewis’ garden had revealed that no major bonfire had been lit there. This letter is thought to be a hoax: its content is clearly pseudoscientific, and Marchington was a close friend of Walter Hooper, at one time sharing lodgings with him.

The Dark Tower itself is unfinished, possibly because the plot hits something of a dead end. Opinions vary as to the quality of the writing, and the story changes tack abruptly in the final chapter, where the protagonist Scudamour is left alone in a library in ‘Othertime’ to learn about the Othertimers’ discoveries about time travel. Hooper ( , p. viii) estimates that Lewis began writing The Dark Tower soon after completing Out of the Silent Planet in . There are similarities with Madeleine L’Engle’s A Wrinkle in Time, although this was not written until . All this has led a number of people, most notably Katherine Lindskoog, to conclude that The Dark Tower may not be entirely written by C. S. Lewis. The most likely candidates for writing at least parts of The Dark Tower, apart from Lewis himself, would be Walter Hooper and Anthony Marchington. Lindskoog ( , pp.
– ) mainly suspects Marchington:

‘No one thinks that Walter Hooper could have tackled all that ficto-science. The most obvious suspect is Anthony Marchington himself. He is a scientist, he is interested in the origin of The Dark Tower, and he has tricked Christianity and Literature with a scientific spoof. Furthermore, he was about eight years old when Madeleine L’Engle published her children’s classic A Wrinkle in Time, and so he quite possibly read it as a child. That could account for unconscious copying of Engle’s automaton scene in The Dark Tower’.

The corresponding ‘automaton scene’ in The Dark Tower occurs in Chapter .

Previous work

In the past, a number of computer stylometric analyses have been performed on The Dark Tower and related texts. The first of these was by Carla Faust Jones ( ), who used a computer program written by Jim Tankard which he had previously used to study the Federalist Papers (Tankard, ). First the program finds the frequencies of character n-grams (sequences of n consecutive characters, where n was 1 or 2) in the text, then normalizes these to frequencies per , characters, rounded to the nearest whole number. Spaces and punctuation were not considered, and upper and lower case characters were considered equivalent. For 1-grams (single characters), the index of difference between two text samples was given by the expression:

\[ \sum_{\mathrm{a}}^{\mathrm{z}} \lvert f_a - f_b \rvert \]

where $f_a$ is the frequency of a character in the first text sample, and $f_b$ is the frequency of that character in the second. The differences in these frequencies are found for every character in the alphabet, and then all added together. For the 2-grams, the expression is analogous: we find the differences in the frequencies of every possible character pair in the two texts, and then add together all 26 × 26 differences. Jones’ ( ) results are shown in Table .
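The index of difference described above can be sketched in R, the language of the stylo package used later in this article. This is only an illustration of the calculation, not Tankard's program. The function names `char_ngram_freqs` and `index_of_difference` are hypothetical, the normalisation base `per = 1000` is an assumption (the base used in the original study is not stated above), and removing spaces and punctuation before forming n-grams is one possible reading of 'not considered'.

```r
# Normalised frequencies of overlapping character n-grams in a text.
# ASSUMPTION: `per` is a hypothetical normalisation base, and frequencies are
# taken relative to the number of n-grams in the sample.
char_ngram_freqs <- function(text, n = 1, per = 1000) {
  # Treat upper and lower case as equivalent; drop spaces and punctuation
  letters_only <- gsub("[^a-z]", "", tolower(text))
  chars <- strsplit(letters_only, "")[[1]]
  if (length(chars) < n) return(setNames(numeric(0), character(0)))
  # Overlapping n-grams start at positions 1 .. (length - n + 1)
  starts <- seq_len(length(chars) - n + 1)
  grams <- vapply(starts,
                  function(i) paste(chars[i:(i + n - 1)], collapse = ""),
                  character(1))
  counts <- table(grams)
  # Frequency per `per` n-grams, rounded to the nearest whole number
  setNames(round(as.numeric(counts) / length(grams) * per), names(counts))
}

# Sum of absolute frequency differences over every n-gram seen in either text.
index_of_difference <- function(text_a, text_b, n = 1, per = 1000) {
  fa <- char_ngram_freqs(text_a, n, per)
  fb <- char_ngram_freqs(text_b, n, per)
  all_grams <- union(names(fa), names(fb))
  a <- fa[all_grams]
  b <- fb[all_grams]
  a[is.na(a)] <- 0  # n-gram absent from the first sample
  b[is.na(b)] <- 0  # n-gram absent from the second sample
  sum(abs(a - b))
}
```

Calling `index_of_difference(sample_1, sample_2, n = 2)` on two character strings gives the 2-gram version. Identical samples score 0, and larger values indicate less similar character distributions.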
Both the 1-gram and 2-gram analyses show that the three complete science fiction novels by Lewis, Out of the Silent Planet, Perelandra, and That Hideous Strength, are more similar to each other than they are to The Dark Tower. Although this is interesting, it does not prove that The Dark Tower was not written by Lewis. There is no comparison with Lewis’s other works nor any comparison with works by other candidate authors for The Dark Tower.

Lindskoog ( , pp. – ) describes a seemingly unpublished report by Andrew Queen Morton. He used a data visualization technique called a cumulative sum control chart (CUSUM) analysis, which has been used to detect changes in the quality of production line outputs in an industrial setting. Morton himself suggested that this technique could be used to detect discontinuities in writing style, such as when one author breaks off and another begins in a multiple-authored text. A linguistic feature such as word length or noun frequency is used to characterize the texts. The resulting graph shows an upward trend for those portions of the text which show an above average (taken over the text as a whole) occurrence of the chosen feature, and a downward trend for those parts which show a below average occurrence. Thus, if two authors who have contributed to a text show different rates of usage of the chosen feature, the point where one writer hands over to another might show an abrupt change in the direction (upwards or downwards) of the graph.
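The behaviour of such a chart can be illustrated with a short R sketch. It follows only the general description above, not Morton's own procedure, whose chosen features are not reported. The function name `cusum` and the sentence lengths are invented for illustration.

```r
# Cumulative sum of deviations from the mean of a per-sentence feature.
# Stretches above the overall average push the chart up; stretches below
# the average push it down.
cusum <- function(x) {
  cumsum(x - mean(x))
}

# HYPOTHETICAL data: words per sentence for ten consecutive sentences,
# five short ones followed by five long ones.
sentence_lengths <- c(8, 9, 7, 8, 8, 20, 21, 19, 20, 20)
chart <- cusum(sentence_lengths)

# The chart falls while the feature is below average and rises afterwards,
# so its minimum marks the sentence after which the trend reverses.
turning_point <- which.min(chart)
```

Drawing the chart with `plot(chart, type = "b")` shows the change of direction at the turning point. With real texts the change is rarely this clean, which is one reason the technique is disputed.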
Morton took the first twenty-three sentences of Chapter 1 of The Dark Tower, the first twenty-four sentences of Chapter 4, and the first twenty-five sentences of Chapter 7, alongside sections from Out of the Silent Planet and That Hideous Strength. Morton concluded that The Dark Tower was a composite work: Lewis did not write Chapters 1 and 4, but he did write Chapter 7, the one with the library scene. The technique is highly controversial in studies of disputed authorship, but my feeling is that the choice of linguistic features may affect the success of the technique itself. For example, Merriam ( ) achieved interesting results for the Shakespeare play Edward III with CUSUM charts using the frequencies of prosodic features, rare words, and function words, combined into a single chart using principal components analysis (PCA). Unfortunately Lindskoog gives no details of which linguistic features Morton used to characterize the texts. Morton’s study also suffers from the brevity of the texts which were analysed.

More recently, Thompson and Rasp ( ) used statistical techniques developed by Thisted and Efron ( ) for comparing smaller samples of unknown authorship (such as a newly discovered text) with a much larger canon with known authorship. If we define $t$ as the size in words of the small sample divided by the size in words of the larger canon, $n_1$ as the number of words occurring exactly once in the canon, $n_2$ the number of words occurring twice, and so on, then in their ‘new words’ test, we can estimate $\hat{v}$, the number of words in the smaller text that do not appear in the larger canon, as follows:

\[ \hat{v} = n_1 t - n_2 t^2 + n_3 t^3 - \dots \]

This formula depends on $t$ being small, to ensure that the series converges. We want to see how close the estimated value of $\hat{v}$ is to $m$, which is the number of ‘new’ words actually found in the small sample but not in the canon. If these values differ greatly, it suggests that the small sample was not written by the author of the canon.
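The ‘new words’ estimate can be written out in a few lines of R. This is a sketch of the formula as stated above, not Thompson and Rasp's implementation. The function names `frequency_spectrum`, `expected_new_words`, and `observed_new_words` are hypothetical, and the inputs are assumed to be character vectors of already tokenised words.

```r
# n[x] = number of distinct words occurring exactly x times in the canon.
frequency_spectrum <- function(canon_tokens) {
  word_counts <- table(canon_tokens)
  tabulate(as.integer(word_counts))
}

# Estimated number of words in the small sample that are absent from the
# canon: n_1 * t - n_2 * t^2 + n_3 * t^3 - ...
# t = (words in small sample) / (words in canon); it should be small.
expected_new_words <- function(n, t) {
  x <- seq_along(n)
  sum((-1)^(x + 1) * n * t^x)
}

# Number of distinct words actually found in the sample but not in the canon.
observed_new_words <- function(sample_tokens, canon_tokens) {
  length(setdiff(unique(sample_tokens), canon_tokens))
}
```

In use, `t` would be `length(sample_tokens) / length(canon_tokens)`, and the estimate from `expected_new_words(frequency_spectrum(canon_tokens), t)` would be compared with `observed_new_words(sample_tokens, canon_tokens)`. How large a gap counts as significant needs the further statistical machinery of the original papers, which this sketch does not attempt.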
They performed three other tests using related formulae: the ‘rare words’ test, where the estimated and true numbers of words occurring below an arbitrary threshold number of times are compared, and the ‘slope’ and ‘uniformity’ tests, which take into account the estimated and real numbers of words of every individual frequency up to a threshold. The tests were validated first by comparing samples of George MacDonald’s writings with those known to be by Lewis. The ‘new words’ test was most successful, being able to discriminate between them % of the time with % confidence; we would expect only % if we were looking at a single author. The ‘new words’ test also showed the best discriminatory power between short samples of The Dark Tower and two of Lewis’ science fiction novels (those thought to have been written closest in time to The Dark Tower), namely, Out of the Silent Planet and Perelandra. The test found that % of The Dark Tower samples were significantly different to the ‘canon’ of two science fiction novels.

Table . Indexes of difference between The Dark Tower and C. S. Lewis’s three complete science fiction novels, found by Jones ( )

Comparison | Texts compared | I.D. (unigrams) | I.D. (bigrams)
a | Silent Planet and Perelandra | |
a | Silent Planet and Hideous Strength | |
a | Perelandra and Hideous Strength | |
b | Silent Planet and Dark Tower | |
b | Perelandra and Dark Tower | |
b | Hideous Strength and Dark Tower | |

I.D. = index of difference
Overall, Thompson and Rasp felt that their results were inconclusive. Even though the ‘new words’ test did discriminate between samples of The Dark Tower and the complete novels, this may not have been due to a difference in authorship, but because a novel in draft form might differ from a complete, polished work.

Stylometry with R: the ‘stylo’ package

Before describing the specific experiments carried out for this article, I will describe some general features of the package that was used, ‘Stylometry with R’ (stylo), which was written in the R statistical programming language by Eder et al. ( ). Stylo enables a choice of measures of document dissimilarity, and I used the classic Burrows’ Delta, first described by Burrows ( ), throughout. Stylo also allows a variety of linguistic features to be used to characterize the texts, these being overlapping word and character n-grams, where n can be any number, including 1 for single words or characters. An n-gram is a sequence of n tokens. For example, if n is 2, and we are interested in overlapping character sequences, a word like ‘Lewis’ would be analysed into the four entities ‘le’, ‘ew’, ‘wi’, and ‘is’. Finally, stylo enables a number of kinds of graphical displays, each of which is a way of showing which documents are most similar to each other, by placing them close together on the page. For example, Fig. shows the output of hierarchical agglomerative clustering. The relationships between the texts are shown on dendrograms, so called because they look like trees on their side. The branches on the extreme right each correspond to individual texts, and texts on nearby branches are similar to each other. The technique for building a dendrogram is to first find the most similar pair of texts and join them together, so that thereafter they can be considered as a joint entity.
In the subsequent series of steps, each time the most similar pair of single texts or joint entities is fused to form a larger group. This process continues until all the texts are joined in a single structure. When using Ward’s ( ) method, the default linkage method offered by stylo, the document similarities between a newly formed joint entity and all the other text groups formed so far are functions of the distances between each of the two constituents before fusion and the rest of the text groupings, and the number of texts in each entity.

Fig. Dendrogram of texts by Lewis, L’Engle and Tolkien, using the most frequent single words

A series of dendrograms obtained for different numbers of linguistic features can be fused into a bootstrap consensus tree, such as that shown in Fig. . Branches between texts are shown whenever such a branch was found in a selected proportion of the dendrograms; I used the default value of . throughout. A third type of representation, called PCA, can be seen for example in Fig. . The technique aims to find groups of texts which are characterized by the common presence or absence of certain groups of linguistic features, which form a component. Texts with many of these features score highly on the component, while other texts with few of them have negative scores on this component.
The component which accounts for the greatest amount of variability between the texts is called the first principal component (PC1), but there are other components which successively account for less variability between the texts. Normally the texts are plotted according to their positions on the first two components (PC1 and PC2), but as we shall see in this article, if PC1 corresponds to genre rather than author, genre effects can sometimes be overcome by plotting the texts according to their scores on lower components (such as PC2 and PC3). PCA is often used to examine variation in language. For example, Holmes et al. ( ) used PCA to examine authorship of the ‘Pickett letters’ from the American Civil War, Binongo and Smith ( ) used PCA to study the authorship of the play Pericles, and Harris ( ) looked at possible genres in the corpus of Rongorongo from Easter Island. Biber ( ) used the closely related technique of factor analysis to study functional linguistic variation arising from genre and register.

Fig. Dendrogram of texts by Lewis, L’Engle and Tolkien, using the most frequent single words

Stylo allows a culling parameter to be set. For example, if this value is , then only features appearing in at least % of the texts will be considered in the analysis. In all the experiments described in this article, the ‘culling’ parameter was set to ; so for example if we are studying the frequencies of the top words, the frequency of every one of these words will be considered. The most frequent words (MFW) are the MFW in the entire corpus,

Fig. PCA of texts by Lewis,
L’Engle and Tolkien, using the most frequent single words

rather than the MFW in an individual sample. It is possible to use text samples of different sizes because the word frequencies are normalized.

Throughout the experiments the following text pre-processing steps were adhered to. By selecting the ‘English’ button on the ‘input and language’ page of the stylo graphical user interface (GUI), contractions such as ‘don’t’ will be treated as the two single words ‘don’ and ‘t’. Hyphenated compound words such as ‘topsy-turvy’ also become two single words, here ‘topsy’ and ‘turvy’ (Eder et al., , p. ). The ‘preserve case’ button was not selected, so all upper case characters were converted to lower case. I did not select the option to delete pronouns, and no stop list was used, but I did select the option to read in text as plain text files. By default, all sequences of non-alphabetic characters were reduced to a single white space for n-grams longer than . Single words were treated as single letters separated by spaces. It is possible to examine the full feature set with the R command stylo.results = stylo(), then running the GUI to select the desired feature set, and then examining the set with stylo.results$features (Eder et al., , p. ).

Text samples

The set of text samples used in these experiments is summarized in Table . The four texts from Lord of the Rings are the prologue, and the first chapter of each of three parts (called individually The Fellowship of the Ring, The Two Towers, and The Return of the King).
The texts from The Hobbit are Chapters – , and the texts from the Narnia series are the first two chapters of The Lion, the Witch and the Wardrobe, the first chapter of The Voyage of the Dawn Treader, the first chapter of The Magician’s Nephew, and the first chapter of The Last Battle. The four samples of That Hideous Strength are the first four chapters, as is the case for Perelandra. However, the four samples of Out of the Silent Planet consist of the first two chapters; the third and fourth chapters; the fifth and sixth chapters; and the seventh and eighth chapters. The two shortest texts, The Man Born Blind and the Marchington letter, were used in their entirety, as was the Lefay fragment. The seven samples of The Dark Tower consist of one chapter each, including the seventh and final (but unfinished) chapter.

Table . Text samples used in the experiments described in this article

Samples | Author | Year | Title | Sample length (words each)
lor , lor , lor , lor | J. R. R. Tolkien | – | Lord of the Rings |
hob , hob , hob , hob | J. R. R. Tolkien | | The Hobbit |
eng , eng , eng , eng | Madeleine L’Engle | | A Wrinkle in Time |
lww, dawn, mn, lb | C. S. Lewis | – | ‘Narnia’ series |
ths , ths , ths , ths | C. S. Lewis | | That Hideous Strength |
per , per , per , per | C. S. Lewis | | Perelandra |
osp , osp , osp , osp | C. S. Lewis | | Out of the Silent Planet |
mbb | C. S. Lewis | Unknown | The Man Born Blind |
lefay | C. S. Lewis | Unknown | The ‘Lefay’ fragment |
dt , dt , dt , dt , dt , dt , dt | C. S. Lewis | Unknown | The Dark Tower |
tjb , tjb , tjb , tjb | Walter Hooper | | Through Joy and Beyond |
pwd _ , pwd _ , pwd , pwd | Walter Hooper | | Past Watchful Dragons |
mlet | Tony Marchington | | Letter to ‘Christianity and Literature’ |
mc , mc | C. S. Lewis | – | Mere Christianity |
pp , pp | C. S. Lewis | | The Problem of Pain |

The four samples of Through Joy and Beyond consist of
one part each of that book, and thus comprise the entirety of that book. The four samples of Past Watchful Dragons consist of: the first two chapters; the third and fourth chapters; the fifth chapter, excluding the Lefay fragment; and the sixth chapter. The two samples of Mere Christianity are Books 1 and 2 (Right and Wrong as a Clue to the Meaning of the Universe and What Christians Believe). Finally, the two samples of The Problem of Pain are Chapters and of that book.

The Lewis texts are compared against the Tolkien texts because the two authors were close friends who regularly discussed their work at meetings of the literary group called ‘The Inklings’, which met at the ‘Eagle and Child’ pub in Oxford. They both wrote about other worlds, such as Middle-earth (Tolkien) and Narnia (Lewis). Like Lewis and Tolkien, Madeleine L’Engle also wrote a children’s fantasy novel with a Christian theme, where children are transported to faraway planets. As stated in the introduction, Lindskoog has noticed similarities between The Dark Tower and A Wrinkle in Time. The Lefay fragment is a long fragment of a draft of the sixth Narnia book, The Magician’s Nephew, also found by Hooper in one of Lewis’s notebooks, and reproduced in Past Watchful Dragons (Hooper, , pp. – ). Out of the Silent Planet, Perelandra, and That Hideous Strength are Lewis’s three science fiction works for adults.
As described above, Walter Hooper claimed to have discovered the short story The Man Born Blind and the unfinished novel The Dark Tower in notebooks, written in Lewis’s handwriting, after Lewis’s death. However, the handwriting in The Dark Tower notebook has never been satisfactorily authenticated (Lindskoog, ). The two selected works by Walter Hooper himself are Past Watchful Dragons, a guide to the Narnia books, and Through Joy and Beyond, a biography of C. S. Lewis. A further sample used is the full text of Tony Marchington’s hoax letter to Christianity and Literature. To the author’s best knowledge, Tony Marchington left no other published works, and thus it was not possible to use a larger sample of Marchington’s writing in these experiments. Mere Christianity and The Problem of Pain are examples of Lewis’s non-fiction writing.

Experiments 1: discrimination between Lewis and two other authors of fiction

The first set of experiments, the baseline, was designed to show whether the multivariate statistical techniques available in the stylo package were able to distinguish between the three authors L’Engle, Tolkien, and Lewis. The results for the hierarchical clustering (Ward’s method) using the MFW as linguistic features are shown in Fig. . The choice of words is made following the recommendations of Burrows ( ) and Juola ( , p. i ), as the MFW are typically function words, giving information about grammar and individual writing style rather than content.

Here we see four main clusters, which from top to bottom correspond to (1) Tolkien, with the exception of Lewis’s Voyage of the Dawn Treader; (2) Lewis’s first two books from the Deep Space trilogy; (3) children’s books written by L’Engle and Lewis, except for the second section of Perelandra; and (4) Lewis’s last book from the Deep Space trilogy. The same pattern is seen more clearly in the bootstrap consensus tree (also for the most frequent single words), as shown in Fig.
, where the two ‘Deep Space’ branches are placed closer together, effectively leaving three main clusters in the diagram. In Fig. , the data for most frequent single words are displayed using PCA. Once again we see three main groupings, with samples by Lewis seen in the top right of the diagram, samples of children’s books in the middle left part, and samples by Tolkien in the bottom right section.

Juola ( ) recommends running a series of independent analyses in stylometric work. While a series of runs using the same feature set with different clustering algorithms (such as shown in Figs – ) are not independent of each other, experiments using distinct feature sets would be. In his experiment on the writing of J. K. Rowling, Juola states that ‘tests were run on four separate feature sets: word lengths, character -grams, word pairs, and the most frequent words’ (Juola, , p. i ). Juola (personal communication) recommends using all character -grams and word pairs, not just the top n. To achieve this as far as possible, I set n to the very high value of , . Although these linguistic features are not completely independent of each other (for example, if a word has high frequency, this will raise the frequencies of its constituent character n-grams), I endeavoured to follow his approach. The groupings produced by the hierarchical clustering when using either the , MFW -grams (see Fig.
) or the , most frequent character -grams (shown in Fig. ) were the same as each other, producing somewhat clearer separation between the authors than was the case for the most frequent single words. In Figs and , we again see a cluster for children’s authors, but this time we see more separation between those in Lewis’s Narnia series and those by Madeleine L’Engle than we saw in Figs – . The middle cluster consists entirely of Tolkien samples, and the bottom cluster contains all the books in Lewis’s Deep Space trilogy. Thus it seems that it is possible to some extent to distinguish between the three authors of fiction, but the situation is partly confused because we are seeing the effects of both authorship and genre. As a result we have two clusters for Lewis, one for his adult fiction, and another for his children’s fiction, which is only marginally distinguished from another author (L’Engle) who also wrote in the children’s fiction genre.

Fig. Dendrogram of texts by Lewis, L’Engle and Tolkien, using the , MFW -grams

Fig. Dendrogram of texts by Lewis, L’Engle and Tolkien, using the , most frequent character -grams

To separate authorship and genre, it is possible to use the technique of PCA. For example, Schöch ( ) used PCA to examine French plays by the brothers Pierre and Thomas Corneille. PC1 separated the plays by author, but the second component separated them by genre: tragedy or comedy. An example of a feature which distinguished the plays by genre was the word ‘mort’ (death), which was much more prevalent in tragedies than comedies. One of the features discriminating between the two authors was the function word ‘ces’ (these).
Using the related technique of correspondence analysis, Linmans ( ) showed that samples taken from the synoptic gospels were separated on the first component according to genre (discourse, aphorisms, narrative, or parable), and on the second component according to author (Matthew, Mark, or Luke).

I ran a PCA on the most frequent , character -gram data, and achieved the plot shown in Fig. . As in all the previous experiments, we see three main groupings in the text samples. This time the Tolkien samples all appear in the top half of the plot, the children's writing appears in the bottom left part, and the Deep Space samples by Lewis appear in the bottom right part. Thus Lewis's texts still appear in two separate clusters: one for his children's writing, and one for his adult science fiction. At the most coarse-grained division of texts, we see writing for children in the left half of the diagram, corresponding to negative scores on PC1, and writing for adults in the right division, corresponding to positive scores on PC1. Although The Lord of the Rings was not specifically written for children, it was written as a sequel to The Hobbit, which was. Thus we see the samples of The Hobbit appearing to the left of those from The Lord of the Rings. In this experiment discrimination by genre was seen to be more pronounced than discrimination by author, since PC1 accounts for more variation in the data than any of the other principal components. We can remove the effect of genre by taking PC1 out of the diagram, and instead of plotting PC1 against PC2, plotting PC2 against PC3, as shown in Fig. . There is no option on the Stylo GUI for plotting PCA components other than the first and second, but this may be done with the following series of R commands:

> a = stylo()                      # run the analysis and keep the result object
> b = a$pca.coordinates            # matrix of PCA scores, one row per text sample
> pc2 = b[, 2]                     # scores on the second principal component
> pc3 = b[, 3]                     # scores on the third principal component
> labels = names(pc2)              # sample names
> plot(pc2, pc3, pch = "")         # set up the axes without drawing point symbols
> for (i in 1:length(labels)) {
+   text(pc2[i], pc3[i], labels[i])  # write each sample name at its coordinates
+ }

Fig. PCA of texts by Lewis, L'Engle and Tolkien, using the , most frequent character -grams

Fig. Plot of texts by Lewis, L'Engle, and Tolkien on the second and third principal components, using the , most frequent character -grams

This has the effect of grouping all the Lewis texts together, irrespective of genre, in the bottom left of the diagram. There is now also a distinct cluster for L'Engle in the top left corner, and the Tolkien samples all appear on the right-hand side.

Experiments : The Dark Tower in relation to texts by Lewis, Hooper, and Marchington

The second set of experiments was designed to show where the individual chapters of The Dark Tower lay in relation to known works by Lewis, Hooper, and the Marchington letter. The results are shown in Fig. for hierarchical clustering by Ward's method with the most frequent single words. The coarsest (leftmost) subdivision separates most of the known works by Lewis from those by Hooper and Marchington. The posthumously discovered texts (MBB, LeFay and DT1–DT6) all cluster very close together, and all are well within the main Lewis cluster. This suggests that all these texts were indeed written by Lewis. The main surprise was that there was a small cluster of Lewis texts at the bottom, attached to the Hooper/Marchington cluster. The final chapter of The Dark Tower (DT7) appeared in this small cluster, and thus seems to have stylistic similarities with works both by and not by Lewis.

The experiment was repeated using the , most frequent character -grams, since this feature gave the most clear-cut results for the fiction texts. These results are shown in Fig. . The results are more clear-cut when using the , most frequent character -grams (Fig. ) than when using the top single words (Fig. ), and give two main clusters.
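The character n-gram feature used in these experiments can be illustrated with a short sketch. This is not how Stylo builds its frequency tables internally; it is a minimal Python illustration of the general idea, in which the function names `char_ngrams` and `ngram_profile`, the lowercasing step, and the toy samples are all assumptions introduced here.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count every overlapping character n-gram in a text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_profile(samples, n, top):
    """Build relative-frequency profiles over the corpus-wide top n-grams.

    samples: dict mapping a sample name to its text.
    Returns (features, table): the `top` most frequent n-grams in the whole
    corpus, and for each sample a list of relative frequencies in that order.
    """
    per_sample = {name: char_ngrams(text.lower(), n)
                  for name, text in samples.items()}
    corpus = Counter()
    for counts in per_sample.values():
        corpus.update(counts)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(corpus.items(), key=lambda kv: (-kv[1], kv[0]))
    features = [gram for gram, _ in ranked[:top]]
    table = {}
    for name, counts in per_sample.items():
        total = sum(counts.values())
        table[name] = [counts[g] / total if total else 0.0 for g in features]
    return features, table


if __name__ == "__main__":
    # Hypothetical two-sample corpus, only to show the shape of the output.
    toy = {"sample_a": "abab", "sample_b": "abcc"}
    print(ngram_profile(toy, n=2, top=2))
```

Each row of the resulting table is the kind of feature vector that a clustering or PCA step would then compare across samples.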
All the samples of Lewis's known fiction appear in the bottom cluster, along with the posthumously published samples MBB, LeFay, and chapters 1–6 of The Dark Tower. The top cluster contains all the samples of Hooper's works, clustered tightly together, the Marchington letter, and a tight grouping containing chapter 7 of The Dark Tower and four samples of Lewis's non-fiction. The main division between the texts thus appears to be non-fictional (top cluster) versus fictional (bottom cluster). Once again we have a situation where genre and authorship confound each other: does the final chapter of The Dark Tower appear in the top cluster because it is written by Hooper or Marchington, or because it is written in the style of non-fiction?

The next step was to perform PCA experiments, first to try to determine whether PC1 did indeed correspond to genre, and if so to omit this component from a future analysis using PC2 and PC3. This would ideally extract the effects of genre, so that the results of authorship alone can be seen. The PCA analysis plotting the text samples according to their scores on the first and second principal components is shown in Fig. . The fictional text samples have all got positive (or only slightly negative) scores on PC1, and the non-fiction samples almost all have negative (or only slightly positive) scores on PC1. Thus PC1 is polarized by genre, and was eliminated at the next step. DT7 is very close to works by Hooper, being almost superimposed on the cluster of text samples by Hooper in the bottom left quadrant.

Fig. Dendrogram comparing The Dark Tower with text samples by Lewis, Hooper, and Marchington, using the most frequent single words
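The idea of computing principal components and then setting the first one aside can be sketched in a few lines. The block below is an illustration under stated assumptions, not the procedure Stylo runs: it finds components by power iteration with deflation on the covariance matrix (a simple technique chosen here so that only the standard library is needed), and the names `pca_scores` and `drop_first_component` and the toy data are hypothetical.

```python
import math


def pca_scores(rows, n_components=3, iterations=500):
    """Return each sample's scores on the leading principal components.

    rows: list of equal-length feature vectors, one per text sample.
    Components are found one at a time by power iteration; after each one
    is found its contribution is subtracted from the covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        # Fixed, non-degenerate starting direction, scaled to unit length.
        start = [1.0 / (k + 1) for k in range(d)]
        length = math.sqrt(sum(x * x for x in start))
        v = [x / length for x in start]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in any direction
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                         for i in range(d))
        components.append(v)
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]


def drop_first_component(scores):
    """Keep only the (PC2, PC3) coordinates of each sample."""
    return [(s[1], s[2]) for s in scores]


if __name__ == "__main__":
    # Hypothetical data: feature 1 varies most, then feature 2, then feature 3.
    toy = [[10, 2, 0.5], [10, -2, -0.5], [-10, 2, -0.5], [-10, -2, 0.5]]
    print(drop_first_component(pca_scores(toy)))
```

In the toy data the first feature dominates the variance, so it is absorbed by PC1; what remains after dropping PC1 is the structure carried by the other two features. The sign of each component is arbitrary, so plots produced this way may be mirrored relative to another implementation.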
Is this because it was actually written by Hooper, or is it simply written in a stylistically similar non-fictional style? The small Marchington sample appears as a complete outlier at highly negative scores on both PC1 and PC2.

In the next experiment, I removed the effect of genre which gave the polarity seen on PC1, where all the non-fiction texts are placed on the left-hand side, and all the fiction texts are placed on the right-hand side. This was done by omitting PC1, and plotting PC2 against PC3. This plot is shown in Fig. . This plot was inconclusive, since the Hooper and Lewis samples appeared very close together (albeit with a tendency for the Hooper samples to appear near the top), and DT7 is almost equidistant between samples by the two authors Hooper and Lewis. Further experimentation showed that Fig. was probably distorted by the outlying Marchington letter (M_Let) sample, which was much smaller than the others and thus probably contained much statistical noise. In addition, while it was pseudoscientific in style, it was also a letter, which would also put it in contrast with the other texts. After removing this sample, the character -gram frequencies in the corpus were recalculated to include only the remaining texts, and I obtained the much clearer plot shown in Fig. . Here the Hooper texts are plotted at positive values of both PC2 and PC3, and thus form a cluster in the top right part of the graph. DT7 now plots much closer to the Lewis texts.

Fig. Dendrogram comparing The Dark Tower with text samples by Lewis, Hooper, and Marchington, using the , most frequent character -grams

Fig. Plot of The Dark Tower chapters and texts by Lewis, Hooper, and Marchington on the first and second principal components, using the , most frequent character -grams

Experiments : Word length experiments

The one linguistic feature suggested by Juola ( , p. i ) not yet examined in this article is mean word length. The mean word lengths (in characters) for each of the text samples used in this article were found, using a program written in Perl by the author, and are shown in Table . Although average word length is often considered a blunt tool for assigning authorship, the results in Table generally accord with the experiments performed on Stylo before the effects of genre were filtered out by PCA. The thirteen texts with greatest average word length are all non-fictional, which is the style in which DT7 is written. The texts by Hooper and Marchington and the final chapter of The Dark Tower are grouped together at the top of the table, as they have greater average word length than the other texts. The texts with lowest average word length are the Narnia series, including the LeFay fragment. The other children's authors also tended to use shorter words: the average word lengths for the Madeleine L'Engle samples were in the range . – . ; for Tolkien's The Hobbit, the range was . – . ; and for Tolkien's The Lord of the Rings, it was from . to . , except for the prologue which was . . Since word length is a single figure which depends on both genre and authorship, it is not possible to separate these out using this technique alone, and thus the final chapter of The Dark Tower appears close to the Marchington and Hooper samples, possibly because they are all written in the style of non-fiction.
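The author's Perl program is not reproduced in the article, so the following is only a sketch of how a mean word length of this kind could be computed. The tokenisation rule (runs of ASCII letters, with internal straight apostrophes allowed but not counted as characters) and the function name `mean_word_length` are assumptions made here; a different rule would give slightly different figures.

```python
import re

# A word is a run of letters, optionally joined by internal apostrophes.
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def mean_word_length(text):
    """Return (word count, letter count, mean letters per word) for a text."""
    words = WORD.findall(text)
    if not words:
        return 0, 0, 0.0
    characters = sum(len(w.replace("'", "")) for w in words)
    return len(words), characters, characters / len(words)


if __name__ == "__main__":
    # Hypothetical one-line sample, only to show the three returned values.
    print(mean_word_length("Lucy's book of spells"))
```

The three returned values correspond to the words, characters, and average word length columns of the table described above.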
To filter out the effect of genre, it might be possible to find the mean word lengths for the genres (children's fiction; adult fiction; adult non-fiction) over a large range of authors, and to find the word lengths of our samples relative to these means. Word length as a feature has been found in several multi-dimensional studies, such as Biber ( ), revealing that word length has functional properties.

Fig. Plot of The Dark Tower chapters and texts by Lewis, Hooper, and Marchington on the second and third principal components, using the , most frequent character -grams

Fig. Plot of The Dark Tower chapters and texts by Lewis and Hooper on the second and third principal components, using the , most frequent character -grams

Conclusion

From these analyses, I feel that it is clear that Lewis wrote the first six chapters of The Dark Tower, as well as The Man Born Blind and the LeFay fragment, all of which were found by Walter Hooper in notebooks after Lewis's death. Initial results did show that the final chapter of The Dark Tower was more stylistically consistent with the samples of Hooper and Marchington's writing. However, this may be more a question of genre than authorship, since The Dark Tower changes abruptly from a narrative account in the first six chapters to a pseudoscientific description, in the seventh chapter, of how the people of 'Othertime' discovered time travel. Marchington's letter is also in pseudoscientific style, as it describes the results of a (fictitious) soil analysis. Although Hooper's texts are not pseudoscientific, they are not narrative fiction either, which may explain why they initially clustered with the Marchington letter and the final chapter of The Dark Tower.
The use of PCA where factors corresponding to genre were not plotted proved to be an effective means of filtering out genre. Once the effects of genre were removed, text sample DT7 did appear to be more typical of the Lewis texts than the Hooper texts. Discovering the contents of a library in another world is in fact a Lewisian motif, seen in The Voyage of the Dawn Treader, the third of the Narnia series, when Lucy reads the contents of a book of magical spells in the library of the fallen star Coriakin. On the other hand, if an unfinished work were to be added to, it would be easier to add a new chapter at the end than at any other place in the text.

[Table . Average word lengths (in characters) for each of the text samples. Columns: text sample, words, characters, average word length. Rows, from greatest to least average word length: M_Let, DT, PWD, PWD_, PWD_, PP, PP, OSP, TJB, TJB, TJB, TJB, PWD, LoR, OSP, THS, THS, Per, Eng, Per, OSP, DT, Hob, LoR, THS, DT, THS, DT, DT, Eng, Per, Eng, OSP, Dawn, DT, Eng, Hob, Hob, Hob, DT, MC, LoR, LoR, MBB, MC, Per, MN, LeFay, LWW, LB.]

References

Biber, D. ( ). Variation across Speech and Writing. Cambridge, UK: Cambridge University Press.
Binongo, J. and Smith, M. W. A. ( ). The application of principal component analysis to stylometry. Literary and Linguistic Computing, ( ): – .
Burrows, J. ( ). Delta: a measure of stylistic difference and a guide to likely authorship. Literary and Linguistic Computing, ( ): – .
Eder, M., Rybicki, J., and Kestemont, M. ( ). 'Stylo': a package for stylometric analyses. file://prs-store .unv.wlv.ac.uk/home $/in /home/profile/downloads/stylo_howto% ( ).pdf (accessed May ).
Eder, M., Rybicki, J., and Kestemont, M. ( ). Stylometry with R: a package for computational text analysis. R Journal, ( ): – .
Harris, M. ( ). Corpus Linguistics for the Decipherment of Rongorongo. M. Res. dissertation, Birkbeck University.
Holmes, D. I., Gordon, L. J., and Wilson, C. ( ). A widow and her soldier: stylometry and the American Civil War. Literary and Linguistic Computing, ( ): – .
Hooper, W. ( ). Past Watchful Dragons. New York, NY: Collier Books.
Hooper, W. ( ). Preface. The Dark Tower and Other Stories. Lewis C.S. London: Fount.
Jones, C. F. ( ). The literary detective computer analysis of stylistic differences between "The Dark Tower" and C. S. Lewis' Deep Space trilogy. Mythlore, : – .
Juola, P. ( ). The Rowling case: a proposed standard analytic protocol for authorship questions. Digital Scholarship in the Humanities, (suppl ): i – .
Lindskoog, K. ( ). Some problems in C.S. Lewis scholarship. Christianity and Literature, ( ), Summer .
Lindskoog, K. ( ). The C.S. Lewis Hoax. Portland, OR: Multnomah Books, pp. – .
Lindskoog, K. ( ). Light in the Shadowlands. Sisters, OR: Multnomah Books.
Lindskoog, K. ( ). Katherine Lindskoog's informal answer to the ninth non-proof. The Lewis Legacy Issue , Summer . The C.S. Lewis Foundation for Truth in Publishing, June , . http://www.discovery.org/a/ (accessed May ).
Linmans, A. J. M. ( ). Correspondence analysis of the synoptic gospels. Literary and Linguistic Computing, ( ): – .
Merriam, T. ( ). Edward III. Literary and Linguistic Computing, ( ): – .
Schöch, C. ( ). Principal component analysis for literary genre stylistics. The Dragonfly's Gaze, September , . http://dragonfly.hypotheses.org/ (accessed June ).
Tankard, J. ( ). The literary detective, Byte, February , pp. – .
Thisted, R. and Efron, B. ( ). Did Shakespeare write a newly discovered poem? Biometrika, ( ): – .
Thompson, J. and Rasp, J. ( ). Did C. S. Lewis write The Dark Tower?: an examination of the small-sample properties of the Thisted-Efron tests of authorship. Austrian Journal of Statistics, ( ): – .
Ward, J. H., Jr. ( ). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, : – .

Journal of the Society for American Music ( ), Volume , Number , pp. – . © The Society for American Music doi: . /s x

Amerigrove II: Perspectives and Assessments

The Grove Dictionary of American Music. nd ed. Edited by Charles Hiroshi Garrett. Oxford and New York: Oxford University Press, .

Introduction

Christina Baade and Emily Gallomazzei

Cue: Lalo Schifrin's "Theme from Mission: Impossible." If the return of Tom Cruise, explosions, and fast-paced chases to the screen this summer is any indication, Americans (and the global audience for Hollywood action film) love an impossible mission. Certainly, U.S. music scholars do. Reading the eight reviews of the second edition of The Grove Dictionary of American Music (hereafter, Amerigrove II), edited by Charles Garrett (with a large and distinguished editorial team and nearly fifteen hundred contributors), as well as dipping frequently into its entries, we were deeply impressed both by the quality and ambition of the eight-volume, . million-word encyclopedia and by the "gargantuan" task (as Leta Miller puts it) with which the reviewers had been charged.
We will focus our introduction upon the reviews themselves, asking (to rephrase from Richard Crawford), "How does one read Amerigrove II?"

John Koegel, Christina Baade's predecessor as book reviews editor for JSAM, envisioned this multipart review of Amerigrove II and recruited eight intrepid scholars to each consider a distinct topic, to which they all brought their own perspectives, strategies, and voices. The reviewers and topics are as follows: Berndt Ostendorf and Wolfgang Rathert, overview, European perspective; John Graziano, overview, U.S. perspective; Glenda Goodman, music before ; Douglas Shadle, nineteenth-century music; Kip Lornell, folk and traditional music; Leta Miller, twentieth-century art music; Sherrie Tucker, popular music and jazz, – ; and Theo Cateforis, popular music and jazz, –present.

Reading their reviews was an exhilarating experience that helped us better appreciate the achievement of Amerigrove II in encompassing the growth and changes in "American" music studies, allowing us to reflect on the current state of the field, particularly in relation to big questions of belonging: Who are the "we" who study American music? Who and what is included? How do we decide? These are challenging questions indeed, given how many of us (readers and reviewers) have contributed to the volume and our temporal and personal closeness to the scholarly conversations that shaped it.

Footnote: Richard Crawford, "Amerigrove's Pedigree: On The New Grove Dictionary of American Music," College Music Symposium ( ): – , quoted by Berndt Ostendorf and Wolfgang Rathert in their review of Amerigrove II in this issue.

Footnote: Many thanks to John Koegel for envisioning this multipart review and engaging in the persuasive tour de force involved in recruiting this outstanding team of reviewers. Thanks also to Karen Ahlquist for arranging to publish all of the reviews in a single issue. Finally, our deep appreciation to all the reviewers for taking on this tremendous task and carrying it out so well!

Few reviewers could resist praising Amerigrove II's scope or remarking upon its size; all of them engaged critically with the dictionary as a document showing the development of the field of U.S. music studies and its central concerns. Glenda Goodman, noting the ideological nature of such projects, linked Amerigrove to Diderot and d'Alembert's Encyclopédie ( – ), a project that "attempted to map the world of knowledge," and Noah Webster's Dictionary of the American Language, a lexicographical and nation-building endeavor. Indeed, music studies is one of the few disciplines to have sustained this Enlightenment commitment to cataloguing its field. With a family of Grove resources available at the stroke of a key (at least for those with access to good academic libraries), exhaustiveness was not the goal for Amerigrove II. It is explicitly a selective and nationally bounded resource: "a repository of historically significant information" related to American music (i.e., "musical life and cultures of the region now covered by the fifty states, the District of Columbia, and US territories," viii, vii). For reviewers, this selectivity was both a strength (Theo Cateforis praised the ways in which scholars' voices and interpretations made their way into the entries) and an invitation to critique and engagement. Amerigrove II, like the first edition (hereafter, Amerigrove I), edited by H. Wiley Hitchcock and Stanley Sadie and published in , is less a map than a self-portrait of our field, inviting us all to greater reflexivity about what it is we study and why.
To arrive at their analyses, each reviewer had to read Amerigrove II quite differently from the ways most "users" usually read the encyclopedia. Most users will likely encounter Amerigrove II online, entering search terms into the Grove Music Online interface to find specific entries to answer a quick factual question or get an overview of research in a given area. By contrast, the reviewers all focused on the standalone version of Amerigrove II, in hard copy or in PDF, although some also referred to the online version. Of course, it was unrealistic to read and comment upon the entire encyclopedia. We were fascinated by the multiple ways in which the reviewers engaged with this "impossible" text, drawing out important themes while also negotiating its sheer size: they sampled, browsed, followed thematic threads, and drew comparisons between the new edition and the old (as well as with other references in the Grove family). They attended not only to what was covered but also to how many words were assigned to a given topic, suggesting that assigned word count was an important indicator of perceived significance. They also read a great deal, closely and critically. Many focused their engagement upon the wealth of essay-length entries surveying theoretical concepts, major genres, social categories, important historical events, and cities, offering insights both compelling and expertly informed.

In the rest of this introduction, we would like to bring one more reading strategy to Amerigrove II: that of text analysis using a digital humanities tool, Voyant, developed by Geoffrey Rockwell and Stéfan Sinclair. If our expert reviewers could not read the entire Amerigrove, the Voyant tool, designed specifically to assist in analyzing large corpuses, could do so with ease. Impossible mission solved? Not so fast. As Stephen Ramsay asserts in Reading Machines, the point of text analysis is less to bring empirical, scientific fact to humanities inquiry and more to open up new interpretive possibilities: "text analysis, because it allows navigation of the unread . . . [is] capable of presenting the bare, trivial truths of textuality in a way that allows connection with other narratives—in particular, those narratives that seek to install the text into a network of critical activity." Thus inspired, we set about putting our quantitative findings with Voyant into dialogue with the critical observations of the reviewers.

Footnote: Stéfan Sinclair, Geoffrey Rockwell, and the Voyant Tools Team, Voyant Tools ( ) (web application), http://docs.voyant-tools.org. Our ability to carry out this analysis was enabled by Anna-Lise Santella of Oxford University Press, who facilitated our access to text files for both editions of Amerigrove. We also received significant support in facilities and expertise from the Lewis & Ruth Sherman Centre for Digital Scholarship at McMaster University and its administrative director, Dale Askey. A special thank you to Dr. Paige Morgan, a postdoctoral fellow at the Centre, whose expertise and helpfulness were invaluable throughout our research process.

Footnote: Stephen Ramsay, Reading Machines: Toward an Algorithmic Criticism (Champaign: University of Illinois Press, ), – , – . Many thanks to Dr. Jennifer Askey for recommending this book.

Figure . Screenshot of Voyant Tools interface with Amerigrove II corpus. (Stéfan Sinclair, Geoffrey Rockwell, and the Voyant Tools Team, Voyant Tools [ ], http://docs.voyant-tools.org).

Before we discuss our findings, a few more words about Voyant, the methods we used, and the strengths and limitations of text analysis are in order. Voyant, like other popular digital humanities tools, produces word frequency tabulations, tracks word trends, allows users to investigate how keywords are used in context, and provides a range of visualization options, among other possibilities (Figure ). The key to sharing results from these projects is visualization. One type of visualization Voyant generates is the word cloud. Word clouds offer a visually striking way into texts, but they are also imprecise and methodologically opaque (Figures and ). For example, the two word clouds pictured here are the same size, although in a more accurate representation, Amerigrove II would be twice as big as Amerigrove I; further, they do not reveal how the text was manipulated to produce the results.

Figure . Word cloud: most frequent terms in Amerigrove I. Generated by Voyant Tools.

Figure . Word cloud: most frequent terms in Amerigrove II. Generated by Voyant Tools.

More useful, we found, were the word frequency lists generated for each edition, which we were able to analyze not only for a word's frequency (i.e., its number of occurrences) relative to that of other words within each edition but also for how a word's frequency differed between the editions. To determine whether the trends (i.e., the word frequency differences between editions) we discovered were significant, we employed a keyword significance test often used by digital humanities scholars and corpus linguists: the log likelihood test (LL).
Although generating the word frequency lists was a useful starting point, their values need to be normalized before they can add to our discussion: in this case, the LL test compares the frequency of a word in two corpuses (Amerigrove I and II) against their total word count to give a relative representation of the word in both.

Figure . Screenshot of log likelihood calculator (from Tony McEnery and Andrew Hardie, "Statistics in Corpus Linguistics," support website for Corpus Linguistics: Method, Theory and Practice [Cambridge: Cambridge University Press, ]. http://corpora.lancs.ac.uk/clmtp/ -stat.php).

To form a basis of comparison, the text from both editions of Amerigrove was analyzed using Voyant. Before entering each corpus into the Voyant system, the text had to be reworked to suit our purposes. For example, the bibliographies accompanying entries were removed. This information would not tell us much about how this new edition talks about U.S. music and could even skew the final results gathered, particularly with regard to publication details like cities, presses, and dates. Once the text was entered, we applied a standard English "stop words" list in order to exclude common words like "the," "and," and "in" from our results. Applying this stop words list ensured that the "top terms" explored in this introduction were representative of musical genres, practices, and populations. The LL calculator (Figure ) then compared the frequency of the top fifty words within the corpus of Amerigrove I vs. II to generate the log likelihood, which shows whether the difference in frequency can be considered significant (Table ).
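The comparison described above can be made concrete with a small sketch. The code below uses the two-term log likelihood formula that is common in corpus linguistics; it is an assumption that this matches the variant implemented by the online calculator the authors used, and the function names and all the counts in the demo are hypothetical rather than taken from Amerigrove.

```python
import math

# Critical value of chi-square with 1 degree of freedom at p < 0.05.
CRITICAL_VALUE_P05 = 3.84


def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Log likelihood for one word observed in two corpora.

    freq_a, freq_b: occurrences of the word in corpus A and corpus B.
    total_a, total_b: total word counts of corpus A and corpus B.
    """
    combined = freq_a + freq_b
    # Expected counts if the word were equally common in both corpora.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    ll = 0.0
    if freq_a > 0:
        ll += freq_a * math.log(freq_a / expected_a)
    if freq_b > 0:
        ll += freq_b * math.log(freq_b / expected_b)
    return 2 * ll


def direction_in_a(freq_a, total_a, freq_b, total_b):
    """Say whether the word is relatively more or less frequent in corpus A."""
    rate_a, rate_b = freq_a / total_a, freq_b / total_b
    if rate_a == rate_b:
        return "equal"
    return "more" if rate_a > rate_b else "less"


if __name__ == "__main__":
    # Hypothetical counts: a word seen 300 times in a 1,000,000-word corpus
    # and 900 times in a 2,000,000-word corpus.
    ll = log_likelihood(300, 1_000_000, 900, 2_000_000)
    print(direction_in_a(300, 1_000_000, 900, 2_000_000),
          ll > CRITICAL_VALUE_P05)
```

Because the expected counts are scaled by each corpus's total size, a word can have a higher raw count in the larger edition and still come out as relatively less frequent there, which is why raw frequencies alone cannot be compared across the two editions.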
we found performing word frequency analysis a useful point of entry into these two large corpuses, especially when considered in relation to the reviewers’ in-depth explorations. given the importance the reviewers ascribed to the size of entries, we found the ability to tabulate the number of words throughout both editions a particularly compelling application of the tool. however, the context of terms— especially terms used in multiple ways—was harder to decipher. although many of our findings correlated with themes identified by the reviewers, we also found many intriguing results we were unable to fully explain given our constraints of time and expertise. one of our most striking findings is also obvious from a glance at the word clouds: that both editions had a similar “most used” vocabulary after the stop words were removed. they included obvious choices like “music” (and “musical”), “american” (and “usa” in amerigrove , “united” and “states” in amerigrove ii), “song” (and “songs”); the intriguingly prominent “university”; and words linked to chronological accounts of genres and lives: “became,” “early,” and “later.” given that we retained works and recordings lists (even though we stripped the bibliographies), the prominence of “works” is unsurprising, while the high ranking of “pf” (and “piano”) leaves no doubt as to the centrality of the piano in a range of musical genres. investigating the ranks of “new” and “york” shows “new york” to be second only to “music” on both word frequency lists, confirming the city’s preeminence in u.s. musical life. finally, the almost geometric increase in the references (by at https://www.cambridge.org/core/terms. https://doi.org/ . /s x downloaded from https://www.cambridge.org/core. carnegie mellon university, on apr at : : , subject to the cambridge core terms of use, available http://corpora.lancs.ac.uk/clmtp/ -stat.php https://www.cambridge.org/core/terms https://doi.org/ . 
/s x https://www.cambridge.org/core baade and gallomazzei table . top fifty words in amerigrove i and ii (after english stop words list applied) with log likelihood. the log likelihood number expresses the difference in word frequency between the two editions. the higher the number, the more statistically significant the difference. generated with voyant tools. log likelihoods calculated with mcenery and hardie’s tool. amerigrove i amerigrove ii more or less frequent log term frequency term frequency in amerigrove i∗ likelihood∗ music , music , less . new , new , more . york , york , more . pf , american , less . american , pf , more . works , musical , less . songs , works , more . musical , songs , more . opera , jazz , more . ∗∗ university , university , more . jazz , early , more . ∗∗ orch , band , less . became , became , more . piano , opera , more . early , orch , more . usa , including , less . orchestra , piano , more . composer , united , less . band , states , less . recordings , composer , more . later , dance , less . song , later , more . style , orchestra , more . years , song , more . boston , work , less . studied , began , less . ∗∗ chamber , popular , more . ∗∗ popular , school , less . ∗∗ dance , years , more . including , group , less . ∗∗ school , fl , less . began , recordings , more . work , time , more . time , style , more . str , boston , more . group , john , less . college , musicians , less . vn , vn , more . composers , str , more . pieces , studied , more . chorus , world , less . concert , performed , more . ∗∗ performed , city , less . instruments , recorded , less . ∗∗ composition , composers , more . john , college , more . chicago , century , less . conductor , rock , less . played , sound , less . , played , more . ∗log likelihoods are given for words in the amerigrove ii column. ∗∗this log likelihood suggests that the difference in frequency between the two editions is not significant. at https://www.cambridge.org/core/terms. 
table . occurrences of century references in amerigrove i and ii. generated with voyant tools. century references include dates and ordinals (e.g., and th). columns: century; aggregated word frequency, amerigrove i; aggregated word frequency, amerigrove ii.

date and title) to each ensuing century is remarkable within both editions, as is the striking growth for all periods between editions. goodman’s and douglas shadle’s observations that their centuries have been “repopulated” seem to be borne out by these numbers, as are their observations that coverage of the eighteenth and nineteenth centuries is dwarfed by coverage of the twentieth. in amerigrove ii, there are , appearances of pre-twentieth century dates and titles, whereas twentieth century dates and titles increased from the first edition by more than twice that number (table ).

as the reviewers observe, this explosion of twentieth-century coverage responds to significant changes in the field, including the growth of research in non-classical genres, especially popular music; the musics of african americans and other racialized people in the americas; and broader questions of cultural hybridity, transnational movement, and power. the word frequency lists register how seriously the amerigrove ii editorial team responded to these developments. a scan of the word frequency charts confirms this, with words like “rock,” “country,” “guitar,” “broadway,” “gospel,” “traditional,” and even “band,” “album,” and “studio” making striking leaps in the rankings and showing large, statistically significant increases in representation.
above all, however, “jazz” stands out as the most frequently occurring genre term in amerigrove ii (with appearances), reflecting both the increased coverage of jazz in the new edition and the generous word counts for many entries, which, as cateforis points out, often align with word counts assigned to canonical classical topics and people. a quick look also shows that “jazz” has surpassed “opera” as the top-ranked genre. however, this “switch” in position requires closer investigation. the occurrences of both words have grown, but although the frequency with which “opera” appears in the second edition is significantly lower than in the first edition, the differences for “jazz” are not statistically significant. as sherrie tucker notes, the coverage of jazz is greater in the new edition, but it was also well represented in the first edition. meanwhile, occurrences of “opera” have not increased at the same rate as have the popular terms listed above.

where goes opera, so goes art music? finding the answer using text analysis is more challenging than in the case of jazz or rock, for which single-word genre designations are widely used. one reason, as leta miller observes, is that it has been difficult to arrive at a satisfactory overarching term to describe “art” or “classical” music (both of these words are quite low in the frequency lists for both editions). a second reason is that art music operates as an unmarked category in musicology: van cliburn is a “pianist,” not a classical pianist, whereas teddy wilson is a “jazz pianist.” our workaround was to turn our attention to specifically classical genres and terminologies, such as “opera,” “chamber,” and “symphony.” two trends are observable when we compare amerigrove i and ii. first, with the exception of “chamber,” the count for each of these terms grew. second, all of these terms have a lower overall rate of representation in amerigrove ii. at least at the quantitative level, amerigrove ii has not stinted art music, but it has increased its representation of a number of other non-classical genres to represent a much wider swath of u.s. musical life.

table . occurrences of references to racial and ethnic groups in amerigrove i and ii (preliminary). generated with voyant tools.

categorization | frequency, amerigrove i | frequency, amerigrove ii | more or less frequent in amerigrove i | log likelihood
african american (a) | | | less |
asian (b) | | | less |
native american (c) | | | more |
latino (d) | | | less |
white (e) | | | more |

(a) represents primarily words incorporating “black,” “african,” and “afro-.”
(b) represents words incorporating “asian” and “oriental.”
(c) represents words incorporating “indian,” “native,” “first nations,” and “aboriginal.” this count is incomplete: it does not include names of nations and peoples.
(d) represents primarily words incorporating “latin,” “latino/a,” “spanish-,” “mexican,” and “puerto rican.” this count is incomplete: it does not include a full list of nationalities.
(e) represents words incorporating “white,” “european,” and “caucasian.”

what about other sorts of diversity, such as ethnicity, sexuality, and gender? the reviewers all praised the quality, scope, and theoretical sophistication of the entries on race, gender, sexuality, and ethnic groups and their musics, as well as the expansion of biographical coverage of women, queer, and racialized people.
our analysis confirmed that the occurrence of words used to describe african americans, asian americans, latina/os, and native americans increased in the second edition; the growth in the rate of occurrence was statistically significant for all of these groups (with the probable exception of native americans; table ). because we had to identify and search for each individual term, using text analysis to investigate representation for racialized groups in publications dating from and demonstrated to us the degree to which preferred language has changed (“the bare, trivial truths of textuality,” indeed). it also presented significant methodological challenges, particularly for latina/os and native americans, groups often discussed in amerigrove with reference to their nations (e.g., “mexican” and “choctaw”). the politics of naming and representation are important, and we acknowledge that our preliminary analysis only scratched the surface. several of the reviews attend to these questions with far more nuance.

note: “chamber” fell from occurrences in amerigrove i to in amerigrove ii, but “symphony” grew from to .

if tracking references to racialized groups was difficult, using text analysis to investigate the representation of white people in amerigrove was still more challenging because whiteness tends to be an unmarked category. critical scholars, especially feminist, critical race, and queer theorists, have long argued that unmarked categories (such as maleness, whiteness, or heterosexuality) reinforce dominance by remaining “relatively invisible as an unstated, privileged norm,” as loren kajikawa writes in his “race and ethnicity” entry.
text analysis is blind to what is not named; thus references to african american and black identities outnumber references to european american and white identities within both editions. however, there was a numerical increase in the rate of occurrence for terms relating to white and european identities in amerigrove ii, reflecting, perhaps, the impact on our field of whiteness studies, which seeks to make the dynamics of whiteness (and white privilege) visible and open to critical investigation.

when considering questions of visibility, the changes in our scholarly field and broader culture are perhaps most striking when comparing the representation of sexual and gender minorities documented between the two editions. “homosexual,” “homosexuals,” “lesbianism,” “transsexual,” and “gay” in reference to gay men (in the bette midler entry) each appeared once in amerigrove i. in amerigrove ii, there are nearly five hundred occurrences of words relating to transgender, lesbian, gay, bisexual, and other queer identities. the musical and scholarly work that transformed these identities from that which was almost literally unspeakable into significant areas of research is charted in a number of amerigrove ii subject entries, including “lesbian, gay, bisexual, transgender and queer music,” nadine hubbs’s updating of philip brett and elizabeth wood’s classic entry; “transgender,” by stephan pennington; and “sex, sexuality” by fred maus.

turning to a final set of marked and unmarked categories, “women” made significant gains, from to , although occurrences of the word still fell behind references to “john” ( ), “george” ( ), “william” ( ), “charles” ( ), “james” ( ), “paul” ( ), “david” ( ), “robert” ( ), and “thomas” ( ). a fairer comparison would perhaps be occurrences of “he” and “she.” in amerigrove i, there were roughly . occurrences of “he” for every “she” ( , to ); in amerigrove ii, there were roughly . ( , to , ).
this change in ratio is due to a decreasing rate of occurrence for “he”; there was no statistically significant change in the rate of occurrence for “she,” even though there was significant numerical growth (the number doubled). of course, these raw numbers lack the nuanced analyses to be found in the reviews. it is also worth remembering that the numbers reflect our field and its histories, contexts, and concerns as much as they reflect amerigrove ii, whose editorial team made attending to difference and power, as well as the inclusion of underrepresented groups, a priority, at which they succeeded in many ways.

if our rather basic text analysis offers insights into the shifting representations of genres, centuries, race, sexuality, and gender, it also complements our reviewers’ sense that amerigrove ii has become more attentive to border crossing, hybridity, non-compositional musicking, and sound. consider the surprising entrance of “sound” to the second edition’s top fifty words, the decreasing shares of “composer” and “composition,” and the increased rate of representation for “dance,” “culture,” “international,” and even “radio.”

ultimately, our digital humanities adventure with voyant gave us greater insight into these two monumental corpuses, the tool’s colorful screen inviting us to explore these texts in new ways, to dig below the relatively crude analysis we have presented here, and to further interpret the lists it has produced. as our reviewers show, amerigrove ii (and i) is a capacious, engaging, impossible text that rewards many sorts of reading, whether by human or machine, online or on paper.
we hope you are engaged and challenged by the ways into amerigrove charted by these reviews, and hope they inspire you in your own impossible missions in american music.

dhq: digital humanities quarterly
volume number
http://www.digitalhumanities.org/dhq/vol/ / / / .html

digital pedagogy unplugged
paul fyfe, florida state university

abstract

does digital pedagogy have to be electronic? this paper grows out of a sense that digital pedagogy is too frequently conceived in terms of instructional technologies. technology, at least in its electrified forms, can be a limiting factor in imagining how humanities instruction can be "digital": something to get your hands on, to deal with in dynamic units, to manipulate creatively. what might an electronically-enabled pedagogy look like if we pulled the plug? this paper surveys several examples to suggest that an unplugged digital humanities pedagogy can be just as productively disorienting as doing humanities digitally, and can potentially help students prepare for and contextualize their learning experiences with instructional technologies or in online environments.

hacking the yacking

what does it mean to do digital humanities? to be a digital humanist? whatever it means, it potentially puts one in a newly advantageous professional position, able to make a claim on the relatively conspicuous resources devoted to digital humanities in a time of scarcity. the buzz factor and professional incentives are drawing new attention to and participation in the dh community.
partly as a result, dh has reached the inevitable if unpleasant stage of authenticity arguments [kirschenbaum a], [waltzer ]. debates about what defines and qualifies digital humanism periodically flare, with opinions spread along a spectrum from "big tent" inclusivity (come one, come all) to code or computational exceptionalism ("more hacking, less yacking"). the root questions themselves are not new; digital humanities sometimes seems to exist only in a state of self-definition, which ironically may be one of its strengths as a historiography of contemporary humanities knowledge work. but what is new are the concerns about the relative status of insider/outsider or theorist/practitioner, running in parallel to recent concerns about alternative academic employment (#alt-ac) and professorial jobs [nowviskie a], digital scholarship and peer review [fitzpatrick ], and the relations of librarians and researchers [ramsay a]. what follows is a small provocation concerning these debates: dh is not the exclusive domain of theorists or practitioners. to be a digital humanist, you don’t even need a computer.

jerome mcgann does not especially like computers. or so he confessed one day to his surprised employees (myself included) at the rossetti archive. they just happened to be the interpretive machines capable of carrying out his current intellectual projects, or opening those projects to questions he had yet to imagine. that archive has a significant place in the evolution of the digital humanities, but can the same be true more generally for the field? is it possible to speak of the digital humanities, as it becomes more broadly construed, without computers? without electronics? humanities computing, for its part, seems hard-wired to computers by definition. and some of the most interesting questions and opportunities of the digital humanities are inconceivable without the media environments which facilitate or reveal them.
in other words, you cannot even imagine the work you can do before you’ve invented or experienced the tools or the social dynamics they enable. on the other hand, it may be worth trying, precisely because of the felicitous disorientation that characterizes the digital humanities in the first place: its "productive unease" [flanders ] and the "unexpected anomalies generative of new meaning" [liu , ].

the digital humanities has made hacking a discipline. "hacking" these days means to adapt, manipulate, and make productive use out of a given technology or technological context or platform. the term’s popular and professional usages are widespread, extending into academic and pedagogical contexts such as the professional productivity blog profhacker (http://chronicle.com/blogs/profhacker/), the energetic experiments with the classroom by cathy davidson and hastac, and the alternative scholarly publication hacking the academy by dan cohen and tom scheinfeldt (and lots of others). "hacking" education overlaps with but is not synonymous with digital humanities. edu-hacking is frequently used in its service, especially to produce unfamiliar and alien perspectives on representation and intellectual protocols. it also suggests an obvious question: can the digital humanities be hacked? [ ]

what follows is an approach to that question through the classroom, the generative domain of hacking education with instructional technology. various commenters have worried that dh overlooks the institutional struggles of "ed tech" within higher education [waltzer ], or has been overly focused on scholarship at the expense of pedagogy [bier ]. this essay proposes an answer to such critique in imagining a low-tech (or even no-tech) approach to dh.
more broadly, it suggests that edu-hacking and dh are and must remain closely related. debates about inclusivity and opportunism in dh have overshadowed the conversation about its life in the classroom. these are the implications of the examples to follow, all of which demonstrate how "digital pedagogy" might be productively hacked.

teaching naked

can there be a digital pedagogy without computers? amid the influx of electronics into classrooms and the rising debates, popular and professional, about how computers and the internet affect reading, cognition, and learning, now seems like a good time to ask. there are vigorous critiques of a headlong rush to technologize education from those who suspect its deleterious effects upon learning [bauerlein ], [carr b]. there are also healthy critiques of instructional technology from people very sympathetic to educational and humanities computing, who point out that technology cannot change the classroom without first changing the pedagogy (see, for example, [krause ]). as a result, discussions about digital pedagogy and effective uses of instructional technologies are flourishing across social media as well as conventional academic settings. former president of the mla gerald graff has an "optimistic sense of the potential of these technologies, if we heed [the] wake-up call to use them in imaginative ways" [graff , ].

that imaginative horizon is wide, but it might be limited by keeping digital pedagogy synonymous with tools to utilize, or with the particular technologies of digital media. mark bauerlein, who is generally pessimistic about the educational benefits of technology, predicts that "over the next years, educators will recognize that certain aspects of intelligence are best developed with a mixture of digital and nondigital tools" [bauerlein ].
this is not much of a concession, as the digital and nondigital are still consigned to being tools, and still separate things to be mixed.

can we redefine these very categories? their difference may be overrepresented; instructional technology, after all, includes books and backpacks and overhead lighting. to heed the "wake-up call" or to recalibrate education in the digital age, we must not only explore unfamiliar technologies but also defamiliarize those we think we already know [mcgann a]. indeed, the two projects are dialectical.[ ] it is because i am a vigorous user of instructional technologies that i am interested in pulling their plugs. how might we reimagine analog teaching in terms of the digital? how can we incorporate the opportunities of digital pedagogy without presuming its discontinuity with nondigital tools and methods, or its own self-limiting status as a toolkit? one generative way of imagining digital pedagogy might be outside the context of electronics, imagining instead a digital pedagogy unplugged.

perhaps the most common shortcoming of digital pedagogy is how frequently it gets conceived in terms of instructional technology.[ ] for many teachers, especially early- or non-adopters, digital pedagogy is often presumed to be just something that uses electronic tools or computers. this is unsatisfying as it often limits the teaching to the extent of its tools. two familiar problems arise. first, if the tool you have is a hammer, it is tempting to treat problems as nails. if presentation software makes it easy to share lecture notes, the lecture hall can turn into a place for showing bullet points instead of teaching. the second problem is treating technology as merely a tool: something that accomplishes a task you were already doing, but with (electro-) mechanical advantage. for the pedagogy, not much has changed. electronics are machines, and they can become fascinating interpretive machines.
but, at the risk of provoking the ire of engineers and programmers, the tools are easy. what is hard is imagining how to use them and, harder still, imagining the social conditions they might enable and, hardest of all, creating the institutional structures in which they will flourish.[ ] in terms of teaching, how do we work towards the learning environments we may have never had before? how do we break the thrall to tools and technologies which may limit the horizon of our pedagogical creativity? how might we even imagine a "digital pedagogy" without the potentially limiting factor of electronics?

what if we just started "teaching naked"? this is the catchphrase of josé bowen, a dean at southern methodist university, whose profile appeared not long ago in the chronicle [young b]. bowen is not anti-technology either. his modest proposal to "teach naked" actually just means removing all the computers and projectors from his classroom, which he did at smu. bowen was reacting to the ineffectiveness of pedagogy when it gets governed by the tools it uses: in short, by those powerpoint lectures that rain down boredom in a hail of bullet points. he does advocate offering students podcasts and online discussion groups and even powerpoint lectures, but outside of class meetings. during class time (and fully clothed) bowen invites all the q&a and in-person discussion that these technologies had senselessly displaced. taking a similar approach at the high-school level, chemistry teachers jonathan bergmann and aaron sams are successfully experimenting with vodcasting to create what they call "the flipped classroom" [bergmann and sams ]. in these models, students bring questions about the digital resources they have hypothetically pored over in preparing for class.
instructional technology is not banished but instead moved to the pedagogical periphery. in making such moves, these teachers support bauerlein's promotion of "the non-digital space as a crucial part of the curriculum" [bauerlein ]. like bauerlein, bowen would segregate digital and analog pedagogy, or electronic and human teaching, for their particular strengths. but can we reconcile them instead? can we imagine "teaching naked" as more than merely doing without, but as something already integrated into the circuit of its electronic counterpart? what if instead we kept the "digital" in the non-electronic senses of that word: something to get your hands on, to deal with in dynamic units, to manipulate creatively?

the technology of cultural studies

according to sean latham, this is what cultural studies already does. in "new age scholarship: the work of criticism in the age of digital reproduction", latham riffs on walter benjamin to suggest how the "loss of aura" or dematerialization of digitized objects, such as in electronic archives, links up nicely to the demystification of cultural hierarchies he endeavors to teach.[ ] instructing students about disparate, heterogeneous, and highly contingent cultural formations (which is the bailiwick of cultural studies, according to latham) is to prepare them for an analogous experience in the digital realm. "cultural studies and digital technology", latham suggests, "each activate the energies of the other, generating the coordinates by which we can begin to map the infinite density of culture in the age of its digital reproduction" [latham , ]. latham thinks of cultural studies as a technology itself, which might give students a critical selectiveness in navigating electronic realms and archives and their floods of flattened, ambiguously differentiated cultural objects. further, it potentially empowers students to establish contingent interpretations of their own in these emerging digital spaces.
latham ambitiously claims that cultural studies only fulfills the promise of its method in the digital realm, where we sort and resort contingent, rare, popular, and heterogeneous materials into dynamic critical narratives [latham , ]. but it is not hard to imagine how we might unplug latham’s "technology of cultural studies" and still keep it running. as latham implies, to teach this way is to already use a kind of proto-digital pedagogy.

for instance, imagine inviting students into a classroom where you have literally scattered dozens of heterogeneous documents around the room. for his part, latham might plant documents related to modernist little magazines: letters, corrected page proofs, sample issues, reviews, newspaper stories, and various images. the subject could be just about anything. imagine another array of materials relating to a cholera outbreak in victorian london and the imagination of individual and collective identity in urban spaces. one might offer maps, statistical surveys, journalistic exposés, impassioned editorials, urban sketches, snippets of fiction. students could gather, assemble, and present to the class the critical narratives they collaboratively determine and argue. discussion could proceed about how to present, exhibit, or visualize those relations. this is old and new fashioned at once, as if curating a special collection and imitating the ivanhoe game developed by mcgann, johanna drucker, bethany nowviskie, and others.

does this even make sense? can an interactive environment like ivanhoe even work when denuded of its socially networked, asynchronous, abstractly visualized, and self-archiving functions? why not find out? those differences as well as shortcomings can become vital aspects of the class discussion: what kind of mediated environments might facilitate this? what interpretive machines might the students imagine, electronic or otherwise?
what might they even potentially build, given the conditions of the media they’re working in? the game’s creators stress how ivanhoe keeps students aware of their interpretation as itself a critical act: it provides "self-conscious insight […] into the processes of interpretation constituted by any and every act of reading" [drucker and rockwell , vii]. by unplugging the game or doing preliminary exercises with analog collections, one might help students to appreciate, by contrast, their active mediation of similar work in the digital field, which too frequently seems transparent, or so flattened that students fail to notice its own critical topologies.

work in the text mines

in this case, the goal is to keep students’ attention on the critical labor that digital resources seem to dissolve. this is not to privilege learning on paper but to imagine its digital futures. daniel pitti, now associate director of the institute for advanced technology in the humanities (iath), teaches classes on xml, tei, and encoded archival description (ead) at the rare book school at the university of virginia. he begins the class by handing out a printed recipe, and then gives students a set amount of time to make a list, on their own, of what they think are that object’s salient formal properties which they would want (ultimately) to render in code. the subsequent discussion quickly reveals the surprising differences in features the students hoped to describe, and further demonstrates how even the most programmatic of documents has proliferating ways of describing its material, textual, and graphical features. students come to perceive how analog or physical documents are already and complexly encoded in n-dimensions. how do you mark that up?
how do you imagine an archive or finding system that can accommodate deceivingly non-programmatic artefacts of all document types and materials?

    for materially different types of artifacts (printed texts, maps, photographs) divisioning and classifying become yet more complicated when the ultimate purpose is to arrange them in a system that permits coherent analysis and study. the problem is greatly amplified when the manipulable physical properties scale to radically different measures, as is the case with a depository that includes paper-based objects and born-digital objects. [mcgann , ]

the scale of the problem seems like a recipe for a headache. scaling up from a simple recipe, pitti elegantly invites students into the provocative complexities of text encoding and information architecture.

another disarming and elegant example comes from brad pasanek. pasanek’s research is heavy into algorithms and text mining; he collaborates and publishes with a computer scientist. he also teaches british literature of the long eighteenth century. these domains came into contact when, one day, a little frustrated with his students’ lackluster insights into the basic themes of austen’s pride and prejudice, he went home and "text mined" the novel with pink and blue highlighters. just the title terms: pink for every instance of "pride", blue for "prejudice". the next class he returned with his marked-up book, flipping through the pages to show the bursts of color.[ ] the class energetically started correlating them to important moments in the plot, to transitions in how the key terms were conceived. the exercise was an ice breaker (valuable in itself, as any teacher knows), but pasanek accomplishes much more with it.
consider one prevalent critique about how we read (or don’t read) electronic texts; latham explains it as follows: "the digital text seemingly makes reading too easy, allowing one to search out specific terms without the labor required to place them in their proper context" [latham , ]. text mining a novel with highlighters restores that missing labor of search. this is not merely to reverse engineer what students can already do with a search engine, or simply to take away the electro-mechanical advantage of such a tool and make busywork. (note that many search-result functions already highlight the keywords or strings a user has searched for.) rather, the exercise makes students reflect on how their reading labor is both constitutive and mediated.

pasanek’s exercise integrates at least two kinds of reading: first, linear or intensive or deep or close reading, which the defenders of paper books insist provides the all-important context for understanding; and second, extensive or distant reading in which context is measured on a very different scale.[ ] distant reading allows us to survey and map texts from a higher critical elevation, as it were. pasanek’s prototype seems exactly the kind of "hybrid" critical work that does both.[ ] it allows for specificities within close contexts as students read, and for connections within and across texts. it is like reading from the middle, except that it alternates close and distant perspectives to generate its critical current.[ ] the exercise defamiliarizes the act of reading, reveals its continuity with digital text mining, and offers insights that may not solely exist in either realm.

flipping quickly through the marked-up book, the first thing one notices is the abundance of "pride" relative to "prejudice".
depending on how it’s counted, pride shows up between six and nine times more than prejudice.[ ] there’s food for thought here: whereas pride describes a character trait, prejudice is more of a relational term: changeable, situational, and more dangerous to accuse someone of. browsing through the text more deliberately, the first big splash of color comes (as one might expect) with mr darcy’s debut, specifically after he declines to dance with elizabeth bennet at the first ball [austen , – ]. the second color splash comes later when, at netherfield, miss bingley invites elizabeth to stroll about the room, trying to get darcy’s attention while scrutinizing his character [austen , ]. the third color burst comes (again, as one might anticipate) when elizabeth meets wickham and he describes (lies about) his history of mistreatment at darcy’s hands [austen , – ]. at this point, we might consider why the color has been splashing at all: the multiple instances of the term "pride" at these moments. rereading them, we can see how "pride" is not just an epithet hurled at darcy, but a concept which certain characters (specifically elizabeth’s friend charlotte lucas, then miss bingley, and then wickham) discuss with elizabeth for its appropriateness, even its potential merits. at a distance, we see the "hot spots"; moving closer, we analyze their contexts; and somewhere in the middle we start to learn about the novel’s reformation of pride through elizabeth’s perspective.
dhq: digital humanities quarterly: digital pedagogy unplugged http://www.digitalhumanities.org/dhq/vol/ / / / .html
pride, as the novel eventually suggests, does have a place. of course you don’t have to text mine the novel to understand this. such interpretations should be obvious to any attentive reader of the novel. that these readers are sometimes not so attentive is a big part of pasanek’s point. now for prejudice.
the first instance occurs when elizabeth, just after having heard wickham’s story, gets awkwardly paired with darcy at the second ball [austen  ,  ]. by contrast, a couple of prejudices appear when elizabeth and jane later learn the truth about wickham but hesitate to publicize his bad character [austen  ,  ]. what about both "pride" and "prejudice" — the instances where the novel’s title terms are co­present? there are three. the first when, after elizabeth hotly rejects darcy’s unfortunate first proposal of marriage, he gives her a long letter explaining everything [austen  ,  ]. the second  when  elizabeth  visits  pemberley  and  hears  glowing  appreciation  of  darcy  from  one  of  his  servants  [austen ,  ].  the  third  when  elizabeth  eventually  accepts  darcy’s  marriage  proposal  and  they  reflect  back  on  his explanatory letter [austen  ,  – ]. in each case, the subjects are darcy’s maturation and his rationale for recent actions; in each context, this information is private and privileged; in each instance, characters reflect on the very relations and reformation of the novel’s title terms. there is a much more one could think about here. and there are many critical questions to raise about the method. but this should testify to the value of such an exercise. particularly for students whose interface with digital texts and resources is driven  by  search  engines,  or  guided  by  keywords  and  text  strings.  unplugging  the  search  engine  can  help  students perceive the limitations as well as the possibilities of what makes these engines run: pattern matching, which by itself is a far cry from reading at any distance. it sharpens students’ attention to forms of analysis that explore the analog and digital domains  along  a  continuum.  it  helps  students  to  interrogate  the  various  kinds  of  readings  they  can  do  therein.  and  it reveals all of those kinds of readings as actively constituting critical interpretations. 
on not reading in the digital humanities there are simpler approaches toward these goals which are no less effective in imagining a hybrid pedagogy. stephen ramsay offers another interesting example. just as ramsay tries to combine the "technical with the philosophical" in his graduate courses in the digital humanities, so he also combines different configurations of the seminar [ramsay  ]. for instance, after working on programming on mondays and wednesdays, his class devotes fridays to a theoretical text on new media or the digital humanities. but no one gets to read it in advance. instead, on "no­reading fridays", the class takes  turns  reading,  paragraph  by  paragraph,  the  text  projected  on  the  classroom’s  screen.  after  two  such  fridays grappling  with  heidegger’s  "the  question  concerning  technology",  the  class  had  covered  only  eight  paragraphs,  but ramsay declares that "i truly think that this is one of most enlightening class discussions i’ve ever been a part of (either as a student or a teacher)." the format allows the seminar to flourish, and "the professor is only a very small part of what’s going on." why is this different from a seminar where everyone works from the same edition of a physical book? in some ways it isn’t. but for a graduate course in digital humanities, where much of the attention is on the digital realm and on theories of new media,  it  is a chance for everyone to be on the same page — literally — where the page  is projected on the wall.[ ] because no one (save the professor) has read it before, the seminar reimagines real­time information processing in a very old fashioned way. 
this is "teaching naked" as it is meant to be understood: using technology effectively, subordinating it to the pedagogical goals of the class.[ ] though not quite unplugged, ramsay’s digital pedagogy offers a similar critical distance and a welcome counterpart to the course’s engagement with computers and media theory. this too is digital humanities, and very productive of deep reading and discussion in the flesh. the digital is hands-on, offering the "haptic engagement" that ramsay argues is crucial to dh's hermeneutic of building [ramsay b].

digital pedagogy unplugged

no one, i think, would dispute the credentials of these teacher-scholars as digital humanists. in their sensitivity to the necessary interplay of electronic and analog forms, they also blur the fallacious divides of theorist-practitioner, insider-outsider, educator-hacker in ways that seem especially salutary in the context of dh's persistent debates. they enrich the notion of the digital humanist as a "hybrid scholar", according to julia flanders, underscoring how digital scholarship proceeds through "hybridizations that challenge our notions of discipline" [flanders ], and that even challenge our notions of the digital, as well as the inchoate discipline emerging from it. their examples more specifically suggest how an unplugged pedagogy, so to speak, can still be productively digital. these case studies are meant neither to prescribe nor to exhaust the possibilities; they are simply a handful of the strategies which i’ve happened to encounter. but they all provide creative answers to a question that every teacher needs to ask: how to imagine pedagogy in a digital age.
david parry is bold to say that "[t]eaching without digital technology is an irresponsible pedagogy", but correct in that the future, and the future of education, is digital [parry  ]. this might be gently rephrased to suggest that it is irresponsible to teach with technology without a digital pedagogy. and though there are all sorts of ways to construct a digital pedagogy, one powerful approach begins with pulling the plug. notes [ ]that possibility is in some ways fueling the recent authenticity debates: the hackers are themselves being hacked. [ ]much as the rossetti archive has suggested for hypermedia and bibliography. [ ]see profhacker’s report from the educause conference for a brief and trenchant critique of the institutionalization of this perspective, "simply the apotheosis of higher education’s thirty­year change from a public good to a commodity" [jones ]. [ ]as several scholars have recently argued, the biggest problems faced by the digital humanities are not technological, but institutional [mcgann  ] or legal and social [kirschenbaum  e]. [ ]there are problems with the assumption that digital archives and objects somehow lack materiality (see mcgann). though subject to critique, latham’s argument does provoke thinking about researching and teaching cultural studies through different media. [ ]to "mark up" the book with highlighters is to encode certain semantic elements in color. the exercise also anticipates the critical activity of coding for the web, where html is reconceived as "highlighted text mark­up language". [ ]carr sketches out a kind of surface or frenetic reading against a deep reading or deep thinking model [carr  b]. franco moretti’s "distant reading" is not necessarily allied with reading online or with information technology; rather, "distance […] is a condition of knowledge" [moretti  ,  ]. but, when moved into an electronic context, distant reading can help articulate a method for online sources and instructional technologies. 
[ ]compare to what latham suggests about using digital archives: "the reading of surfaces […] should not be starkly opposed to a deep reading that is somehow more proper or authentic. instead, the digital archive requires a hybrid type of critical work, one which matches the ability to pursue connections across texts with a studied awareness of the historical specificity of the printed word" [latham  ,  ]. [ ]put a different way, as lydgate insists in middlemarch: "there must be a systole and diastole in all inquiry, […] a man’s mind must be continually expanding and shrinking between the whole human horizon and the horizon of an object­glass" [eliot  ,  ]. [ ]the variation depends on whether or not one includes "proud" and "prided". it is also interesting that neither "pride" nor "prejudice" show up in top   or even top   most frequent words (excluding common stop words) in the novel. this makes them relatively invisible to such online word cloud tools as wordle and tagcrowd. [ ]relatedly, anyone who attended a technology in the arts and sciences camp (aka thatcamp) knows that its most crucial piece of technology is the whiteboard. [ ]in this context, see david parry’s lucid response to the "teaching naked" concept [parry  ]. works cited austen   austen, jane. pride and prejudice. ed. vivien jones. new york: penguin,  . bauerlein   bauerlein, mark. the dumbest generation: how the digital age stupefies young americans and jeopardizes our future (or, don't trust anyone under  ). new york: tarcher/penguin,  . bauerlein   bauerlein, mark. "how non­digital space will save education." britannica online learning & literacy forum.  jan  . web.   feb  . bergmann and sams   bergmann, jason, and aaron sams. vodcasting and the flipped classroom. mathematics and science teaching institute, university of northern colorado,  . web.   july  . bier   brier, steve. comment on "on edtech and the digital humanities." bloviate  oct  . web. carr  b carr, nicholas. 
the shallows: what the internet is doing to our brains. new york: w.w. norton, .
drucker and rockwell  drucker, johanna, and geoffrey rockwell. "introduction: reflections on the ivanhoe game." text technology . ( ): vii–xviii.
eliot  eliot, george. middlemarch. ed. rosemary ashton. new york: penguin, .
fitzpatrick  fitzpatrick, kathleen. planned obsolescence: publishing, technology, and the future of the academy. media commons press . web. oct .
flanders  flanders, julia. "the productive unease of st-century digital scholarship." digital humanities quarterly . (summer ). http://www.digitalhumanities.org/dhq/vol/ / / / .html. accessed feb , .
graff  graff, gerald. "the presidential forum: the way we teach now. introduction." profession ( ): – .
jones  jones, jason, and george williams. "profhacker goes to educause." profhacker, oct . http://chronicle.com/blogs/profhacker/profhacker-goes-to-educause/ . accessed oct .
kirschenbaum a  kirschenbaum, matthew. comment on "doing dh v. theorizing dh." digital humanities questions & answers oct . web. oct .
kirschenbaum e  kirschenbaum, matthew. "the .txtual condition." radcliffe institute for advanced study, harvard university, boston, ma. oct . panel presentation. web.
krause  krause, steven. "‘easy’ isn’t ‘useful’ (and it might be just kind of ‘dumb’)." stevendkrause.com, feb . http://stevendkrause.com/ / / /easy-isnt-useful-and-it-might-be-just-kind-of-dumb/. accessed june .
latham  latham, sean. "new age scholarship: the work of criticism in the age of digital reproduction." new literary history . (summer ): – .
liu  liu, alan. "digital humanities and academic change." eln . (spring ): – .
mcgann  mcgann, jerome. "imagining what we don’t know: the theoretical goals of the rossetti archive." institute for advanced technology in the humanities, .
web. may  . mcgann  a  mcgann, jerome. radiant textuality: literature after the world wide web. new york: palgrave macmillan, . mcgann    mcgann, jerome."literary history and editorial method: poe and antebellum america." new literary history .  (autumn  ):  – . mcgann    mcgann, jerome. "what do scholars want?" centre for the history of the book, university of edinburgh, scotland,   july  . keynote address. moretti   moretti, franco. " conjectures on world literature." new left review  (jan–feb  ):  – . nowviskie  a nowviskie, bethany. "#alt­ac: alternate academic careers for humanities scholars." bethany nowviskie. http://nowviskie.org/ /alt­ac/.   jan  . accessed   oct  . parry   parry, david. "on what it would mean to really teach ‘naked’." academhack   july  . http://academhack.outsidethetext.com/home/ /on­what­it­would­mean­to­really­teach­naked/. accessed   july  . ramsay   ramsay, stephen. "the no­reading seminar." stephen ramsay: literatura mundana. http://lenz.unl.edu/ / / /the­no­reading­seminar.html,   jan  . accessed   feb  . ramsay  a ramsay, stephen. "care of the soul." emory university. atlanta, ga.   oct  . panel on the digital scholarly commons. web.   oct  . ramsay  b ramsay, stephen. " on building." stephen ramsay.   january  . web.   july  . surname    waltzer   waltzer, luke. "on edtech and the digital humanities." bloviate. http://lukewaltzer.com/on­edtech­and­the­ digital­humanities/.   oct  . accessed   oct  . young  b  young, jeffrey r. "when computers leave classrooms, so does boredom." chronicle of higher education july  :  . web.   july  . 
int j digit libr ( ) : –
doi . /s - - -

the new knowledge infrastructure

michael lesk

published online: february
© springer-verlag berlin heidelberg

for many, information science starts with vannevar bush’s july essay “as we may think”. before the war, science had been an individual pursuit (think of sinclair lewis’s arrowsmith). bush wrote that we had learned how to have teams of scientists; he knew of the atomic bomb, although he could only openly mention the development of radar. that was an amazingly fast and effective development by the team at the mit radiation laboratory. now, he asked, how could we use teams in peacetime?

traditional humanities scholarship was also an individual pursuit. the same information technologies, however, that transformed science have also transformed scholarship in other domains. we can locate, read, and study materials. we can explore texts, images, sounds, videos, sculptures, costumes, interactive objects and scientific data. we can cooperate at a distance via electronics; i knew something had changed in the s when i saw someone send an email message to somebody else in the same room. i also remember in the early s when i needed to check a citation for a book and i looked it up online even though i had a copy of the book in the office i was using. distance can be irrelevant at any scale. cooperation can extend around the world: in the sciences, look at the iris project for seismological data, or the international virtual observatory, or the protein data bank.
in the humanities, look at worldcat, or artstor, or the international children’s digital library, or the international dunhuang project.

increasingly, we also study using algorithms. projects like the sloan digital sky survey collect so much data that it cannot be viewed by a single individual. instead, the purpose of the data is data mining. we have computational studies of authorship, stylometrics, analyses of paintings and musical performance, and other topics of scholarship to complement the scientific data mining of galaxies, chemical molecules, or weather events.

m. lesk, rutgers university, new brunswick, usa. e-mail: lesk@acm.org

how do we enable these new kinds of scholarship? we need a new kind of knowledge infrastructure that will offer more than the rudimentary search and retrieval capabilities possible today. conventional library subject indexing for books may be old-fashioned now that we have full text searching. a few years ago the library of congress floated a study even suggesting that the assignment of lcsh categories be phased out (the community objected). but we now need to search images, data, and other resources where text search is not immediately applicable. in addition, we have problems of quantity. the more material to be studied, the more accurate searching must be, so retrieval algorithms of greater resolving power are needed. and it is not just that individual projects are gathering more data, but that data availability is extending across disciplines and around the world.

the knowledge infrastructure we need must emphasize interoperability across areas and institutions. libraries led the way with cooperative cataloging and standards for electronic records. open archives protocol use has now spread as well, but museums are trying an even more ambitious kind of description with “linked open data.” in the spirit of the semantic web, museums are putting their catalog information into rdf (resource description framework).
in this methodology, all information about an object is recorded as a subject–predicate–object triple, with the predicates and objects taken from official ontologies with very precise definitions. the british museum and the rijksmuseum are leaders in this effort, with support from the andrew w. mellon foundation. in principle, linked open data should permit deductive logic software to operate across museum databases. in practice, we are still learning both how to create and how to use this information. as an example of its value, however, seven months ago i wished to create choropleth maps showing where british museum objects had originated (see below). merely using place names confuses such locations as memphis in egypt and memphis in tennessee, or rochester in kent and rochester in new york. the official geographic ontology in the rdf data clears this up. since the descriptions of rdf for museum data are coming from international cooperative projects, we can look forward to increased interoperability across museums in different domains (decorative arts, natural history, and so on) as well as in different countries.

the next step, and a more complicated one, will be scientific data. today the instructions for describing scientific data are very complex, with hundreds of pages of documentation for formats such as fgdc (federal geographic data committee) or seed (standard for the exchange of earthquake data). these complex data formats do not necessarily interoperate effectively. the situation with medical data is similarly complex, with some accepted interchange standards such as dicom (digital imaging and communications in medicine) for radiology, and with other standards that are proprietary, including cpt (current procedural terminology, sold by the american medical association).

beyond data description is data use. here scientific data exploration is ahead, rather than behind, the use in museums.
but advanced algorithms using information resources are spreading to all areas of scholarship. sentiment analysis of text based on twitter messages, for example, shows that hawaii is the happiest state and louisiana the least happy. a remarkable example of image exploitation is noah snavely’s “building rome in a day” project, which took thousands of flickr photographs of the colosseum, discarded the out-of-focus ones, and then registered the images and computed a 3-d model using photogrammetry. other examples of scientific collaboration with humanities scholarship have been the analysis of manuscripts, including brent seales’s work on “unrolling” a manuscript scroll using 3-d scans and image transformation software, or william noel’s reading of unknown books of archimedes from a palimpsest (an overwritten manuscript). all of these projects are collaborative across institutions and knowledge domains. they show us how to build and use a more sophisticated knowledge infrastructure which can support teams of interdisciplinary researchers.

the international journal of digital libraries offers current papers in the area of digital scholarship, especially the broader applications of modern techniques and methods. this issue ranges from text to video, network basics to linked data, and from the humanities to science. it introduces a wide vision of future research.

stylometric analysis of early modern period english plays

santiago segarra, mark eisen, gabriel egan, and alejandro ribeiro
dept. of electrical and systems engineering, university of pennsylvania, philadelphia, usa
school of humanities, de montfort university, leicester, uk

abstract: function word adjacency networks (wans) are used to study the authorship of plays from the early modern english period. in these networks, nodes are function words and directed edges between two nodes represent the likelihood of ordered co-appearance of the two words.
for every analyzed play a wan is constructed and these are aggregated to generate author profile networks. we first study the similarity of writing styles between early english playwrights by comparing the profile wans. the accuracy of using wans for authorship attribution is then demonstrated by attributing known plays among six popular playwrights. this high classification power is then used to investigate the authorship of anonymous plays. moreover, wans are shown to be reliable classifiers even when attributing collaborative plays. for several plays of disputed co-authorship, a deeper analysis is performed by attributing every act and scene separately, in which we both corroborate existing breakdowns and provide evidence of new assignments. finally, the impact of genre on attribution accuracy is examined, revealing that the genre of a play partially conditions the choice of the function words used in it.

index terms: authorship attribution, word adjacency network, markov chain, relative entropy.

i. introduction

stylometry involves the quantitative analysis of a text’s linguistic features in order to gain further insight into its underlying elements, such as authorship or genre. along with common uses in digital forensics [ ], [ ] and plagiarism detection [ ], stylometry has also become the primary method for evaluating authorship disputes in historical texts, such as the federalist papers [ ], [ ] and mormon scripture [ ], in a field called authorship attribution. one of the most notorious collections of texts for which such disputes exist is the collection of dramatic works produced in england during the early modern era, covering the th through mid- th century. due to factors such as inaccurate pressing information and undocumented collaborations, the precise authorship of many of these plays, including works by famous playwrights william shakespeare and john fletcher, remains highly contested.
stylometric analysis of the work from this time period dates as far back as the nineteenth century in f. g. fleay’s analysis of verse features in shakespeare’s plays [ ]. similar analyses based on the manual counting of linguistic features continued throughout the early twentieth century [ ]–[ ]. computer-based techniques for counting the frequency of various stylistic features, such as rare words or phrases, have become very common over the past few decades. the premier work done in evaluating authorship in early modern era drama includes the work of macdonald p. jackson [ ], [ ], brian vickers [ ], and hugh craig and arthur kinney [ ], each of whom studied the works of shakespeare and his contemporaries extensively using computational stylometry techniques.

[footnote] supported by nsf career ccf- and nsf ccf- .

the techniques used in authorship attribution began almost a century ago by examining sentence lengths in texts to determine authorship [ ]. in [ ], stylometric analysis first began to consider function words as important stylistic markers, producing unprecedented results. as such, function words have continued to be common in analysis techniques [ ], [ ] due to their context independence and universal usage in english language texts. these methods rely mainly on the frequency of usage of function words. numerous other stylistic features have since been used in authorship attribution studies, including the study of vocabulary richness [ ], [ ] and the use of part of speech taggers [ ].

our method for attributing texts, developed in [ ], also measures function word usage to distinguish author styles. rather than only considering word frequencies, however, we consider a more complex relational structure between an author’s usage of function words. we construct word adjacency networks (wans) with function words as nodes, and edges containing information regarding the use of two function words within the same sentence or phrase.
we interpret each wan as a markov chain that assigns transition probabilities between the appearance of two function words. we can then measure similarity between wans by using a measure of relative entropy. markov chains have previously been used in [ ] and [ ] for the purposes of authorship attribution, though neither considers the use of function words. results in [ ] show an increase in attribution accuracy compared to the most common frequency-based methods. we then employ this new technique to further analyze and add insight into the authorship disputes of early modern english dramatic works.

we first present an overview of the construction and comparison of wans in section ii. we discuss in section iii the main playwrights used in our analysis and the construction of their profile networks, as well as a measure of similarity between profiles in section iv. in sections v-a through v-e we perform a stylometric analysis of the complete canons of our five primary playwrights, followed by a summary of results in section v-f. we are able to demonstrate high attribution accuracy between six candidate authors. an analysis of a set of plays published anonymously or without a clear author is performed in section vi. we then examine the use of wans in determining authorship of plays known to be written by multiple authors in collaboration. this is first done by analyzing entire plays in section vii and then through extensive interplay analysis of a set of particularly controversial plays in section viii. our results largely corroborate existing theories regarding these plays and, in some cases, propose new authorship breakdowns. we conclude in section ix by providing a brief analysis of the use of wans in distinguishing between the three most common dramatic genres of the era: comedy, tragedy, and history.

ii. word adjacency networks

when doing authorship attribution, we are given a set of candidate authors $A$ and a set of known texts $T$ written by these authors, and the objective is to correctly attribute a collection of texts $U$ of unknown authorship among the authors in $A$. more precisely, the idea is to construct a function $\hat{r}_U : U \to A$ relating every text in $U$ with its rightful author in $A$. in [ ], [ ], we propose an authorship attribution method based on function word adjacency networks. for each text, a word adjacency network (wan) of function words, i.e. words that convey only grammatical relationships, can be constructed. formally, from a given text $t$ we construct the network $W_t = (F, Q_t)$, where $F = \{f_1, f_2, \ldots, f_f\}$ is the set of nodes composed of a collection of function words and $Q_t : F \times F \to \mathbb{R}_+$ is a similarity measure between ordered pairs of function words. the similarity function $Q_t$ measures the directed co-appearance of two function words, i.e., once we encounter a particular function word, $Q_t$ indicates the likelihood of encountering another one in the few words following the first one. more precisely, to compute $Q_t$ we first divide the text $t$ into sentences $s_t^h$, where $h$ ranges from 1 to the total number of sentences. we denote by $s_t^h(e)$ the word in the $e$-th position within sentence $h$ of text $t$. moreover, we consider that two words in the same sentence are related if they are at most $D \in \mathbb{N}$ positions apart, and that the relation between words decays with their position difference according to a discount factor $\alpha \in (0, 1)$. in this way, we define

$$Q_t(f_i, f_j) = \sum_{h,e} \mathbb{I}\{ s_t^h(e) = f_i \} \sum_{d=1}^{D} \alpha^{d-1}\, \mathbb{I}\{ s_t^h(e+d) = f_j \}, \tag{1}$$

for all $f_i, f_j \in F$. we then generate a profile network $W_c = (F, Q_c)$ for every author $a_c \in A$ using the wans from those texts in $T$ known to have been written by the corresponding author $a_c$. formally, if we denote by $T^{(c)}$ the subset of $T$ written by $a_c$, then the similarity function $Q_c$ of the profile is computed as

$$Q_c = \sum_{t \in T^{(c)}} Q_t. \tag{2}$$
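as a rough illustration of the construction just described, the sketch below computes the similarity function of a single tokenized text and sums several such functions into an author profile. it is not the authors' implementation: the names `build_wan` and `build_profile`, the dictionary representation, and the default values of `D` and `alpha` are assumptions made for the example, and sentences are assumed to arrive as lists of lowercase tokens.

```python
from collections import defaultdict


def build_wan(sentences, function_words, D=10, alpha=0.75):
    """Return the similarity function of one text as {(f_i, f_j): weight}.

    D (window size) and alpha (discount factor) are placeholder values.
    """
    fw = set(function_words)
    q = defaultdict(float)
    for sentence in sentences:
        for e, word in enumerate(sentence):
            if word not in fw:
                continue
            # look at the next D positions, staying inside the sentence
            for d in range(1, D + 1):
                if e + d >= len(sentence):
                    break
                following = sentence[e + d]
                if following in fw:
                    # a word d positions later contributes alpha^(d-1)
                    q[(word, following)] += alpha ** (d - 1)
    return dict(q)


def build_profile(wans):
    """Sum the similarity functions of an author's texts into one profile."""
    total = defaultdict(float)
    for q in wans:
        for pair, weight in q.items():
            total[pair] += weight
    return dict(total)
```

with `D=3` and `alpha=0.5`, the single sentence `["the", "cat", "of", "a", "dog"]` and function words `the`, `of`, `a` yield weights 0.5 for (the, of), 0.25 for (the, a) and 1.0 for (of, a), which matches a hand computation of the double sum.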
the similarity function $Q_c$ depends on the amount and length of the texts written by author $a_c$. this is a problem since we aim to compare profiles of different authors. thus, we apply the following normalization to the similarity measures

$$\hat{Q}_c(f_i, f_j) = \frac{Q_c(f_i, f_j)}{\sum_{j} Q_c(f_i, f_j)}, \tag{3}$$

for all $f_i, f_j \in F$. in (3) we assume that the total texts written by author $a_c$ are long enough to guarantee a nonzero denominator for a given number of function words $|F|$. if this is not the case for some function word $f_i$, we fix $\hat{Q}_c(f_i, f_j) = 1/|F|$ for all $f_j$. in this way, we achieve normalized networks $\hat{P}_c = (F, \hat{Q}_c)$ for each author $a_c$. the network $\hat{P}_c$ estimates an ideal network $P_c$ which captures the stylistic fingerprint of author $a_c$. since the similarities out of every node sum up to 1 in the network $\hat{P}_c$, it can be interpreted as a discrete time markov chain (mc). thus, the normalized similarity $\hat{Q}_c(f_i, f_j)$ between words $f_i$ and $f_j$ is a measure of the probability of finding $f_j$ in the words following an encounter of $f_i$ for texts written by author $a_c$. similarly, we can use normalization (3) to build a mc $P_u$ for each unknown text $u \in U$.

in order to perform the attribution, we need a way of comparing the generated mcs. by construction, every mc has the same state space $F$, facilitating the comparison. indeed, we use the relative entropy $H(P_1, P_2)$ as a dissimilarity measure between any two chains $P_1$ and $P_2$. the relative entropy is given by

$$H(P_1, P_2) = \sum_{i,j} \pi(f_i)\, P_1(f_i, f_j) \log \frac{P_1(f_i, f_j)}{P_2(f_i, f_j)}, \tag{4}$$

where $\pi$ is the limiting distribution of $P_1$ and we consider $0 \log 0$ to be equal to 0. from [ ], if we denote as $w$ a realization of the mc $P_1$, then $H(P_1, P_2)$ is proportional to the logarithm of the ratio between the probability that $w$ is a realization of $P_1$ and the probability that $w$ is a realization of $P_2$. in particular, when $H(P_1, P_2)$ is null, the ratio of probabilities is 1, meaning that a given realization of $P_1$ has the same probability of being observed in both mcs.
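the normalization and the relative entropy described above can be sketched as follows. this is a minimal illustration under stated assumptions, not the paper's code: the function names are hypothetical, chains are stored as dictionaries over all ordered pairs of function words, and the limiting distribution is approximated by plain power iteration, which assumes the chain actually converges.

```python
import math


def normalize(q, function_words):
    """Row-normalize a similarity function into transition probabilities."""
    F = list(function_words)
    p = {}
    for fi in F:
        row_sum = sum(q.get((fi, fj), 0.0) for fj in F)
        for fj in F:
            if row_sum > 0:
                p[(fi, fj)] = q.get((fi, fj), 0.0) / row_sum
            else:
                # f_i never precedes another function word: uniform row
                p[(fi, fj)] = 1.0 / len(F)
    return p


def limiting_distribution(p, function_words, iters=1000):
    """Approximate the limiting distribution by power iteration (assumed to converge)."""
    F = list(function_words)
    pi = {f: 1.0 / len(F) for f in F}
    for _ in range(iters):
        pi = {fj: sum(pi[fi] * p[(fi, fj)] for fi in F) for fj in F}
    return pi


def relative_entropy(p1, p2, function_words):
    """Relative entropy of chain p1 with respect to chain p2."""
    F = list(function_words)
    pi = limiting_distribution(p1, F)
    h = 0.0
    for fi in F:
        for fj in F:
            a, b = p1[(fi, fj)], p2[(fi, fj)]
            if a == 0:
                continue          # 0 log 0 is taken to be 0
            if b == 0:
                return math.inf   # transition seen in p1 but absent from p2
            h += pi[fi] * a * math.log(a / b)
    return h
```

comparing a chain with itself gives zero, and a transition that is present in the first chain but missing from the second makes the value infinite.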
Thus, $H$ is a reasonable dissimilarity measure between MCs. Utilizing the relative entropy (4), we construct the attribution function $\hat{r}_U$ by assigning the text $u$ to the author with the MC most similar to $P_u$, i.e.

$$\hat{r}_U(u) = a_p, \quad \text{where} \quad p = \operatorname{argmin}_c H(P_u, \hat{P}_c). \qquad (5)$$

Notice that the relative entropy in (4) takes an infinite value when any word transition that appears in the unknown text does not appear in the profile. In practice we compute the relative entropy in (4) by summing only over the nonzero transitions in the profiles,

$$H(P_1, P_2) = \sum_{i,j \,\mid\, P_2(f_i, f_j) \neq 0} \pi(f_i)\, P_1(f_i, f_j) \log \frac{P_1(f_i, f_j)}{P_2(f_i, f_j)}. \qquad (6)$$

Because the calculation in (6) only adds terms for nonzero transitions, profiles built from fewer total words will on average contain fewer nonzero transitions and will thus sum together fewer terms than larger profiles. When attributing an unknown text among profiles of varying size, we avoid this potential bias toward smaller profiles by summing only over transitions that are nonzero in every profile being considered,

$$H(P_1, P_2) = \sum_{i,j \,\mid\, P_c(f_i, f_j) \neq 0 \ \text{for all}\ a_c \in A} \pi(f_i)\, P_1(f_i, f_j) \log \frac{P_1(f_i, f_j)}{P_2(f_i, f_j)}. \qquad (7)$$

In the following sections, expression (7) is used to compare the Markov chain representations of WANs when performing attributions following rule (5).

III. Author profiles

The stylometric analysis in this paper focuses on the attribution of plays written during the English early modern period stretching from the late th century to the early th century. William Shakespeare is the most prominent playwright active in this period but there are several other authors that were also active during this time. For most of the paper, we focus on the authors listed below, where we also detail the number of plays that are currently attributed to each of them and the period during which they are presumed to have written said plays:

( ) George Chapman ( - ), active circa - . Considered sole author of a total of plays plus collaborations.
( ) John Fletcher ( - ), active circa - . Supposed to have written plays, being sole author in of them, while Francis Beaumont and Philip Massinger were his main collaborators in the rest.
( ) Ben Jonson ( - ), active circa - . Presumed sole author of plays plus collaboration.
( ) Christopher Marlowe ( - ), active circa - . Putative sole author of plays and collaboration.
( ) Thomas Middleton ( - ), active circa - . Believed to have written plays, of them as sole author and in collaboration.
( ) William Shakespeare ( - ), active circa - . A total of plays are attributed to Shakespeare and collaborators.

In the above list, we do not consider as plays minor dramatic compositions such as masques, entertainments and pageants. The information in the list is compiled from the Database of Early English Playbooks (DEEP) [ ] and the database of catalogued plays in Chadwyck-Healey Literature Online (LION) [ ]. Whenever inconsistencies in authorship information arise, we consider [ ] as accurate.

Chapman, Fletcher, Jonson, Marlowe, Middleton, and Shakespeare are included in our analysis since they possess large and well studied canons compared with their contemporaries. The WAN attribution algorithm developed in [ ] and briefly reviewed in Section II uses known texts of a given author to construct a profile against which unknown texts are compared. Since profiling accuracy increases with the length of the texts considered when building the profile, we build profiles from all texts of sole authorship for a given author that have little or no history of authorship dispute. The full list of plays used to build the six profiles is reported in Table I. When building profiles for a given author, we generally subscribe to the information provided in [ ] to determine texts of sole authorship. An exception to this is Middleton, whose profile is built using the texts attributed to him in the Oxford Collected Works of Middleton [ ], which contains a more recent and accepted study of his canon.
two plays included in middleton’s corpus in [ ]–the revenger’s tragedy and the second maiden’s tragedy–were published anonymously and have long history of disputed authorship [ ]–[ ]. to be safe, we do not include these plays in middleton’s profile but provide an analysis of their authorship in section vi. notice that each profile is built from a different number of texts. marlowe, the least prolific writer of the ones here considered, is accepted as the sole author of plays that totalize , words. shakespeare, the most prolific sole author, is the undisputed sole author of plays, totaling , words. due to this difference, we compute the relative entropy between the wan of an unknown text u ∈ u and each profile using ( ) rather than ( ). in order to generate faithful representations of authors’ styles, we remove artifacts introduced by modern transcrip- tions by using the earliest editions available of each text in the lion database [ ], with the exception of shakespeare’s first quarto editions. although shakespeare’s canon was first published in full in , there exist earlier editions for a number of his plays known as first quartos. as there is currently no scholarly consensus on which editions are more authoritative, to be consistent we use editions for all shakespeare texts. when using original transcriptions we have to account for the fact that many words had multiple accepted spellings during the early modern era. e.g., the word ‘of’ is also spelled as ‘off’, ‘offe’, or ‘o’ whereas the word ‘with’ may also appear as ‘wid’, ‘wyth’, ‘wytt’, ‘wi’, ‘wt’, and ‘wth’. many of these alternate spellings are used infrequently and thus do not contribute highly to the wan of a text. nevertheless, we correct the wans so that the occurrence of any of the alternative spellings is treated as the same word. we emphasize that spelling preferences carry little information about the authorship of a play. 
indeed, spellings in printed editions were not necessarily those of authors as they were often selected by printers to accommodate the fixed length of lines in printing presses [ ]. in addition, we remove speech prefixes, or the character name preceding each speech, to avoid cases in which character names are abbreviated to function words (e.g. anne abbreviated to ‘an’). for the wans in this work we use the optimal parame- ters determined in [ ], α = . and d = . because punctuation marks were often added by publishers rather than the authors themselves [ ], we instead delimitate sentences at the end of character speeches. the wans are built with the most common function words in the early modern period, listed in table ii. this number is chosen based on a training period to find the optimal number of function words in which we attribute all texts with undisputed authorship, i.e. those plays listed in table i. a list of the most common early modern period alternative spellings is given in table iii. for the cases where one alternative spelling can be assigned to multiple conventional spellings, e.g. ‘yt’ can be associated with ‘it’ and ‘that’, we assign every appearance of the alternative spelling to the most common usage. 
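The preprocessing described above (dropping speech prefixes, treating each character speech as one sentence, and folding alternative spellings into a conventional form) can be sketched as follows. This is an illustrative sketch, not the authors' code: the `(prefix, text)` input format, the function names, the tokenizer, and the `usage_counts` argument used to resolve an ambiguous spelling such as 'yt' are all assumptions. The spelling table itself is copied from Table III.

```python
import re

# Alternative spellings as listed in Table III of the text.
ALTERNATIVES = {
    "it": ["yt", "t"],
    "of": ["off", "offe", "o"],
    "that": ["thatt", "thate", "yat", "yt"],
    "with": ["wid", "wyth", "wytt", "wi", "wt", "wth"],
}


def build_spelling_map(alternatives, usage_counts):
    """Map each alternative spelling to a single conventional word.

    When one alternative belongs to several conventional words, it is
    assigned to the one with the highest (caller-supplied) usage count.
    """
    mapping = {}
    for conventional, alts in alternatives.items():
        for alt in alts:
            current = mapping.get(alt)
            if current is None or (
                usage_counts.get(conventional, 0) > usage_counts.get(current, 0)
            ):
                mapping[alt] = conventional
    return mapping


def speeches_to_sentences(speeches, spelling_map):
    """Turn (speech_prefix, speech_text) pairs into token lists.

    The speech prefix is discarded and each speech becomes one 'sentence',
    since sentences are delimited at the end of character speeches.
    """
    sentences = []
    for _prefix, text in speeches:
        raw = re.findall(r"[a-z']+", text.lower())
        tokens = [tok.strip("'") for tok in raw]
        sentences.append([spelling_map.get(t, t) for t in tokens if t])
    return sentences
```

The resulting token lists can be fed directly to a WAN builder that expects one list of words per sentence.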
Table I: Plays used to build author profiles

William Shakespeare: Antony and Cleopatra (ant); All’s Well That Ends Well (aww); As You Like It (ayl); The Comedy of Errors (err); Coriolanus (cor); Cymbeline (cym); Hamlet (ham); Henry IV ( h ); Henry IV ( h ); Henry V (h ); Julius Caesar (jc); King John (jn); King Lear (lr); Love’s Labour’s Lost (lll); The Merchant of Venice (mv); The Merry Wives of Windsor (wiv); A Midsummer Night’s Dream (mdb); Much Ado About Nothing (ado); Othello (oth); Richard II (r ); Richard III (r ); Romeo and Juliet (rom); The Taming of the Shrew (shr); The Tempest (tmp); Troilus and Cressida (tro); Twelfth Night (tn); The Two Gentlemen of Verona (tgv); The Winter’s Tale (wt)

Christopher Marlowe: Dr Faustus (drf); Edward II (e ); The Jew of Malta (jew); The Massacre at Paris (mas); Tamburlaine (t ); Tamburlaine (t )

John Fletcher: Bonduca (bon); Chances (cha); The Faithful Shepherdess (tfs); The Humorous Lieutenant (hum); The Island Princess (isl); The Loyal Subject (loy); The Mad Lover (tml); Monsieur Thomas (tho); The Pilgrim (pil); Rule a Wife and Have a Wife (raw); Valentinian (val); Wife for a Month (wfm); The Wild Goose Chase (wgc); The Woman’s Prize (wpr); Women Pleased (wpl)

Ben Jonson: Alchemist (alc); Bartholomew Fair (bar); Catiline’s Conspiracy (cat); Cynthia’s Revels (cyn); The Devil Is an Ass (dia); Epicoene (epi); Every Man in His Humour (mih); Every Man out of His Humour (moh); The Magnetic Lady (mag); The New Inn (new); Poetaster (poe); The Sad Shepherd (sad); Sejanus’s Fall (sej); The Staple of News (son); A Tale of a Tub (tub); Volpone (vol)

George Chapman: All Fools (all); Sir Giles Goosecap (sgg); Bussy Dambois (bda); Caesar and Pompey (cap); The Conspiracy of Charles Duke of Byron (cdb); The Tragedy of Charles Duke of Byron (tdb); The Gentleman Usher (gen); A Humorous Day’s Mirth (hdm); May Day (may); Monsieur d’Olive (mdo); The Blind Beggar of Alexandria (bba); The Revenge of Bussy Dambois (rbd); The Widow’s Tears (wid)

Thomas Middleton: Your Five Gallants (fiv); A Game at Chess (gac); A Mad World My Masters (mad); A Chaste Maid in Cheapside (mac); Hengist King of Kent (hen); Michaelmas Term (mic); More Dissemblers Besides Women (dis); No Wit, No Help Like a Woman’s (now); The Phoenix (pho); The Puritan Widow (pur); A Trick to Catch the Old One (tco); The Widow (wid); The Witch (wth); Women Beware Women (bew)

Table II: List of function words used in WANs

a, about, after, against, all, an, and, another, any, as, at, away, bar, because, before, both, but, by, can, close, could, dare, down, enough, every, for, from, given, hence, if, in, into, it, like, little, many, may, might, more, most, much, must, need, neither, next, no, none, nor, nothing, of, off, on, once, one, or, other, our, out, over, part, past, shall, should, since, so, some, such, than, that, the, them, then, therefore, these, they, this, those, though, through, till, to, until, unto, up, upon, us, what, when, where, which, while, who, whom, whose, will, with, within, without, would, yet

Table III: Common alternative spellings for function words

Conventional: alternative
it: yt, t
of: off, offe, o
that: thatt, thate, yat, yt
with: wid, wyth, wytt, wi, wt, wth

IV. Similarity of profiles

We compute the relative entropy between every pair of author profiles for the six authors introduced in Section III using expression ( ); see Table IV. Every entry in the table represents the relative entropy between the corresponding authors in the rows and columns. In this table, as well as in the remainder of the paper, relative entropies are multiplied by 100 to facilitate their display. We use the term centinats, or cn for short, to denote the resultant unit of measure of relative entropy. The . in the Chapman row and Shakespeare column indicates a relative entropy of . cn between Chapman’s and Shakespeare’s profiles. Note that, as expression ( ) is not symmetric, the values in the table are also not symmetric, although they are similar in most cases. E.g., the relative entropy between Shakespeare’s and Chapman’s profiles is . cn rather than . cn in the opposite direction.

Table IV: Relative entropy between profiles.

Columns: Shakespeare, Fletcher, Jonson, Marlowe, Middleton, Chapman
Shakespeare: . . . . .
Fletcher: . . . . .
Jonson: . . . . .
Marlowe: . . . . .
Middleton: . . . . .
Chapman: . . . . .

In general, dissimilarities between profiles in both directions are highly correlated, as can be observed in Fig. . In this figure, the coordinates of every point correspond to the dissimilarities in both directions for every pair of profiles. The arrangement of the points along the diagonal implies that a high dissimilarity in one direction is associated with a high dissimilarity in the opposite one. Hence, this correlation allows us to speak about the similarity between two authors without specifying a direction.

Fig. : Asymmetry of dissimilarities in Table IV.

The entropy-based dissimilarities in Table IV dispel the Marlovian theory of Shakespeare authorship [ ]. If Marlowe and Shakespeare were the same person, we should observe the dissimilarities between Marlowe’s and Shakespeare’s profiles to be smaller than the distances between each of the other profiles. However, the relative entropies between Marlowe’s and Shakespeare’s profiles average . cn in both directions, which is larger than the dissimilarity between Shakespeare and all of the other authors. Shakespeare’s profile is, on average, closest to Jonson’s profile (average relative entropy of . cn), although still sufficiently different so as to assert that they belong to different authors, as verified by the attribution of plays in Section V. The highest dissimilarity among any pair of profiles occurs between Marlowe and Fletcher, with a mean of . cn. As will be seen in Section V, the relative similarity between two profiles affects our ability to distinguish between them when attributing a text.

V. Attribution of plays

We attribute the plays written by Jonson, Middleton, Chapman, Marlowe, and Shakespeare among the author profiles introduced in Section III.
The attribution of Fletcher’s plays is performed in the discussion of collaborations in Section VII. When attributing any given play, profiles are built using the plays listed in Table I excluding the play being attributed. We do not report raw relative entropy values between the play being attributed and the author profiles, but instead subtract from these values the relative entropy between the play and a profile containing all available texts. Intuitively, the profile containing all of the texts represents the writing style of an average playwright from this period. This is done to make the figures easier to view but does not change the results in any way. Each raw relative entropy value is discounted by the same constant value, thus preserving relative distances. As a result, both negative and positive relative entropy values are possible. A negative relative entropy value indicates that the play’s WAN is more similar to the author profile than to the profile of the average playwright, while a positive relative entropy indicates the opposite.

Table V: Thomas Middleton plays to be attributed in addition to those listed in Table I.

The Bloody Banquet (ban); The Changeling (chg); A Fair Quarrel (afq); The Family of Love (fam); The Patient Man and the Honest Whore (thw); Match at Midnight (mam); The Old Law (tol); The Roaring Girl (trg); Anything for a Quiet Life (agl); The Spanish Gypsy (tsg); Wit at Several Weapons (wea)

A. Ben Jonson

In Fig. we present the attribution of the plays believed to have been written solely by Jonson, plus one collaboration. In the horizontal axis we present the plays to attribute and the vertical axis represents the entropy distance ( ) in cn from these plays to the different profiles identified with distinct markers and discounted by the distance to the average playwright.
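The leave-one-out procedure and the discounting by an average playwright described at the start of this section can be sketched as below. This is a hedged illustration rather than the authors' code: the corpus layout (`{author: {play_name: counts}}`), the function names, and the power-iteration estimate of the limiting distribution are assumptions. It is also assumed here that the average-playwright profile is pooled from the profile plays that remain after the held-out play is removed, and that the same common support is used for every comparison.

```python
import math


def normalize_rows(counts, words):
    """Row-normalize transition counts; rows with no counts become uniform."""
    chain = {}
    for fi in words:
        total = sum(counts.get((fi, fj), 0.0) for fj in words)
        for fj in words:
            if total > 0:
                chain[(fi, fj)] = counts.get((fi, fj), 0.0) / total
            else:
                chain[(fi, fj)] = 1.0 / len(words)
    return chain


def limiting_distribution(chain, words, iters=500):
    """Approximate the limiting distribution by power iteration."""
    pi = {w: 1.0 / len(words) for w in words}
    for _ in range(iters):
        pi = {fj: sum(pi[fi] * chain[(fi, fj)] for fi in words) for fj in words}
    return pi


def entropy_on_support(p_u, p_c, pi, support):
    """Relative entropy summed only over the given transitions (in nats)."""
    total = 0.0
    for fi, fj in support:
        a = p_u[(fi, fj)]
        if a > 0.0:
            total += pi[fi] * a * math.log(a / p_c[(fi, fj)])
    return total


def sum_counts(list_of_counts):
    """Add up several count dictionaries into one."""
    out = {}
    for counts in list_of_counts:
        for key, value in counts.items():
            out[key] = out.get(key, 0.0) + value
    return out


def attribute(unknown_counts, corpus, words, exclude=None):
    """Return (best_author, discounted scores in centinats).

    corpus: {author: {play_name: counts}}; `exclude` names a play to leave
    out of the profiles (the play currently being attributed).
    """
    kept = {a: [c for name, c in plays.items() if name != exclude]
            for a, plays in corpus.items()}
    profiles = {a: normalize_rows(sum_counts(cs), words) for a, cs in kept.items()}
    average = normalize_rows(
        sum_counts([c for cs in kept.values() for c in cs]), words)
    # Keep only transitions that are nonzero in every author profile.
    support = [(fi, fj) for fi in words for fj in words
               if all(p[(fi, fj)] > 0.0 for p in profiles.values())]
    p_u = normalize_rows(unknown_counts, words)
    pi = limiting_distribution(p_u, words)
    baseline = entropy_on_support(p_u, average, pi, support)
    scores = {a: 100.0 * (entropy_on_support(p_u, p, pi, support) - baseline)
              for a, p in profiles.items()}
    return min(scores, key=scores.get), scores
```

Because the same baseline is subtracted from every author's score, the ranking of authors is unchanged by the discount; only the sign and scale of the reported values differ from the raw relative entropies.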
among the mentioned total plays, including the collab- oration, an accuracy of % is achieved, correctly attributing of these plays to jonson, i.e., the entropy distance of every play to the profiles achieves its minimum for jonson’s profile. the play, eastward ho, is accepted as a collaboration between jonson and chapman plus a third author, john marston, whom we have not profiled. here chapman is not well ranked, suggesting that his contributions were minor compared with jonson’s. the relative contributions of both jonson and chap- man in eastward ho are analyzed further in section viii-a. the misattribution in fig. for plays solely written by jonson occurs for the case is altered, which is misattributed by a small margin. mixed authorship has been suggested due to the irregularity in the structure of the last two acts [ ]. the play’s content has also been compared to a comedy of errors, written by shakespeare, who is also here the closest author [ ]. another play, sejanus his fall, is attributed to jonson but only by a small margin. it has been pointed out that this play might contain elements of a second author, with both shakespeare and chapman a possible candidates [ ], [ ]. our analysis indicates that the play is closer to the style of shakespeare than to the style of chapman. sejanus his fall is also one of only two tragedies jonson published–the other being catiline his conspiracy–possibly biasing results against a profile built almost entirely with comedies. the relationship between genre and attribution is explored further in section ix. b. thomas middleton in fig. we present the attribution of plays, of which are generally believed to have been written only by middleton and in collaboration. we also include in our set two plays originally assigned to middleton–the family of love and match at midnight–but not included in his corpus in [ ]. among the plays believed to be solely written by middleton, we attribute to middleton obtaining an accuracy of . %. 
The first misattributed play, A Game at Chess, is attributed to Shakespeare by a very small margin, likely due to random error. This is also true in the case of Hengist King of Kent, noted for being the sole history play Middleton produced. Additionally, although [ ] does not find evidence of Middleton in The Family of Love or Match at Midnight, our results show that he is at least a stronger candidate in these plays than the other five authors. The low relative entropy value of − cn between Middleton’s profile and the WAN of The Family of Love adds evidence to the claim that Middleton contributed to this play [ ].

Fig. : Attribution of Jonson plays. We attribute the plays in Table I plus The Case Is Altered (cia) and Eastward Ho (ho). A single misattribution occurs for The Case Is Altered. [Plot: relative entropy to profile (centinats) for each play, with one marker per profile: Shakespeare, Fletcher, Jonson, Marlowe, Middleton, Chapman.]

Fig. : Attribution of Middleton plays. We attribute the plays in Table I, the additional plays in Table V and collaborations with Shakespeare: Macbeth (mac), Measure for Measure (mea) and Timon of Athens (tim). Only sole authored plays are misattributed. Also, the attribution of Shakespeare’s plays reveals that Middleton’s contribution was minor. [Plot: relative entropy to profile (centinats) for each play.]

Fig. : Attribution of Chapman plays. We attribute the plays in Table I plus The Tragedy of Chabot Admiral of France (adm) and Eastward Ho (ho). Out of plays, are attributed to Chapman. Collaboration with Jonson in Eastward Ho can also be observed. [Plot: relative entropy to profile (centinats) for each play.]

Among the collaborative plays, are attributed to Middleton.
Thomas Dekker and William Rowley were Middleton’s usual collaborators. As neither of these authors is profiled, each of the plays written with these authors is attributed here to Middleton, with the exception of The Bloody Banquet, which is marginally attributed to Shakespeare over Middleton. Another misattributed play, The Spanish Gypsy, is usually considered to be a collaboration between Middleton, Dekker, Rowley, and John Ford [ ], [ ], which may explain why Middleton is ranked second behind another author.

We also attribute Middleton’s three collaborations with Shakespeare. Measure for Measure, Timon of Athens, and Macbeth are correctly attributed to Shakespeare. Moreover, for these three plays, Middleton is ranked very poorly, being the fourth preferred candidate in all of them. This supports the accepted idea that Middleton’s contribution in these three plays is minimal [ ], [ ]. We examine these plays in closer detail in Section VIII-C.

C. George Chapman

Chapman is considered to be the author of plays, as a sole author and in collaboration. In Fig. , we attribute these plays among the profiles. In total, of the plays are attributed to Chapman. In the cases of plays written in collaboration, The Tragedy of Chabot, Admiral of France is attributed to Chapman while Eastward Ho is attributed to Jonson, as discussed in Section V-A. Notice that of the four remaining misattributions, three are assigned to Shakespeare with Chapman as the second preferred candidate. This is consistent with the fact that in Table IV, Chapman’s profile is most similar to Shakespeare’s. Thus, cases of random error will most likely result in an attribution to Shakespeare.

[Plot: relative entropy to profile (centinats) for each play, with one marker per profile: Shakespeare, Fletcher, Jonson, Marlowe, Middleton, Chapman.] Fig. : Attribution of Marlowe plays. We attribute the plays in Table I plus Dido Queen of Carthage (did).
A single misattribution occurs for the collaborative play Dido Queen of Carthage.

Table VI: William Shakespeare plays to be attributed in addition to those listed in Table I.

Henry VI ( h ); Henry VI ( h ); Henry VI ( h ); Henry VIII (h ); Macbeth (mac); Measure for Measure (mea); Pericles (per); Timon of Athens (tim); Titus Andronicus (tit); Two Noble Kinsmen (tnk)

D. Christopher Marlowe

In Fig. , we present the attribution of plays believed to have been written by Marlowe, where Dido, Queen of Carthage is the only collaborative work, with Thomas Nashe as coauthor. We achieve an accuracy of % in attributing Marlowe’s sole works. Dido Queen of Carthage is attributed to Shakespeare by a small margin, with Marlowe as the second best candidate. In the case of sole authorship plays, each is attributed to Marlowe by a substantial margin and with relative entropies between − cn and − cn. These large negative values suggest that the plays are much more similar to Marlowe’s profile than they are to the profile of an average playwright. This difference may be a result of the fact that Marlowe’s plays were written at least a decade before those of most of the other authors considered, thus indicating a shift in writing style during the one or two decades that separate Marlowe from the rest.

E. William Shakespeare

In Fig. we present the attribution of plays believed to have been written by Shakespeare, of which are attributed solely to Shakespeare in [ ]. Note that of the sole authored plays, Henry VI and Henry VI are not included in Shakespeare’s profile in Table I because they have a strong history of disputed authorship [ ]. All of the plays usually considered to have been written only by Shakespeare are correctly attributed. However, exceptional situations arise for the plays Henry VI, Henry VI, and Henry VI, in which Marlowe is ranked uncharacteristically high. The fact that Marlowe is ranked second for these plays is noteworthy, since Marlowe’s profile is very dissimilar from Shakespeare’s in Table IV. Consequently, he ranks poorly in the attribution of most plays. In addition, the relative entropy between Marlowe’s profile and the WANs of most plays is between + cn and + cn, while the relative entropy between Marlowe’s profile and these plays’ WANs is around + cn. Similarly, the relative entropy between Marlowe’s profile and the WANs of Henry V, King John, Richard II, and Richard III is around + cn. These seven plays have in common that they are history plays, a genre in which Marlowe wrote Edward II and Massacre at Paris, comprising a third of his profile. Thus, there is a genre bias of history plays towards Marlowe.

Fig. : Attribution of Shakespeare plays. We attribute the plays in Table I and the additional plays in Table VI. All plays are attributed to Shakespeare. Marlowe’s distance to a play is highly dependent on whether the analyzed play is a history play or not, emphasizing the impact of genre in attribution. [Plot: relative entropy to profile (centinats) for each play, with one marker per profile: Shakespeare, Fletcher, Jonson, Marlowe, Middleton, Chapman.]

Focusing on the Henry VI saga, where the first part is a known collaboration of Shakespeare with Nashe, we see a particularly strong signature of Marlowe in the three plays compared to Shakespeare’s other history plays. Moreover, these plays were written during Marlowe’s most fertile years and Marlowe had collaborated with Nashe in – two years before the Henry VI saga – when writing his play Dido Queen of Carthage.
this supports the hypothesis that there was an unknown collaborator in these plays [ ], [ ] and points at marlowe as a probable candidate. these collaborations are covered in greater detail in section viii-d. among the plays of accepted collaboration with others, besides the mentioned collaboration in henry vi, we can find the three collaborations with middleton already analyzed in section v-b. from the poor ranking of middleton in the attribution pattern, we can conclude that middleton’s revisions and contributions were minor. there are also two collabora- tions with fletcher, namely henry viii and the two noble kinsmen. we attribute both to shakespeare, with fletcher the second preferred author in the latter. in the case of the former, on the other hand, fletcher is not well ranked and his contribution is not evident from the attribution of the entire play. shakespeare’s collaborations with both fletcher and middleton are analyzed further in sections viii-b and viii-c, respectively. f. summary of results in total, we attribute correctly out of the plays we consider that are traditionally attributed to single author and listed in table i, yielding an accuracy of . %. furthermore, if we only consider attributions between authors that are more than cn apart, then we fail only in , yielding an accuracy of . %. we utilize the high classification power for plays of sole authorship to shed light on attribution problems of anonymous plays written during the early modern period in section vi. of the plays we consider that are generally accepted to be collaborations, we attribute to one of the contributing authors, yielding an accuracy of %. collaborative plays are analyzed further in sections vii and viii. table vii: list of texts of unknown authorship. arden of faversham (ard) edward iii (e ) fair em (fem) mucedorus (muc) the nice valor (tnv) the revenger’s tragedy (rev) the second maiden’s tragedy (smt) taming of a shrew (tas) vi. anonymous plays in fig. 
we present the attribution of anonymous plays written during the english renaissance. authorship of some of these plays have been more discussed and studied by scholars than others. e.g., edward iii is commonly attributed in part to shakespeare [ ] and our method supports this theory. indeed, this play was written during the early stages of shakespeare’s career and the shakespeare profile is the closest. another play sometimes attributed in part to shakespeare is arden of faversham [ ]. again, our method supports this theory. these plays are analyzed further in section viii-d. in addition, the plays the revenger’s tragedy, the second maiden’s tragedy, and the nice valor are usually attributed to middleton [ ], with the former two included in the oxford collected works. our method indeed attributes all three works to middleton. furthermore, fletcher is the second attributed author of the nice valor, a play originally included in the beaumont and fletcher folios of and [ ], leading some to believe that the play is a collaboration between fletcher and middleton. for the remaining plays, definite statements cannot be made, but we can support or undermine existing hypothesis. for example, mucedorus may have been written by shakespeare as proposed by a number of scholars [ ] since he is the first ranked author among the six authors we profile. fair em has also been assigned to shakespeare [ ] though there is no scholarly consensus, with robert wilson, whom we do not profile, often cited as a likely candidate [ ]. the taming of a shrew, the play generating controversies about the better known shakespeare play with similar title, is here attributed to the shakespeare profile. note, also, that marlowe is ranked atypically high for this play–second behind shakespeare. both shakespeare and marlowe have been proposed as candidates for taming of a shrew [ ], in the former case as a possibly early draft of taming of the shrew. 
While our analysis points to Shakespeare as a more likely candidate, observe that the attribution of Taming of the Shrew in Fig. ranks Marlowe as the worst candidate, indicating that much more of his style is evident in the early draft.

Fig. : Attribution of anonymous plays listed in Table VII. Our method supports the usual theories of Shakespeare’s hand in Edward III and Arden of Faversham. Also, Middleton’s style in The Revenger’s Tragedy, The Second Maiden’s Tragedy, and The Nice Valor can be observed, in accordance with current authorship consensus. [Plot: relative entropy to profile (centinats) for each play, with one marker per profile: Shakespeare, Fletcher, Jonson, Marlowe, Middleton, Chapman.]

VII. Collaborations

In cases of multiple authors contributing to a single play, we show how our method is still able to detect one or more of the authors present in a full text by identifying the top ranked authors in its attribution.

A. John Fletcher and collaborators

John Fletcher wrote numerous plays both by himself and with collaborators. Consequently, his canon is an appropriate text corpus to analyze the attribution of collaborative plays. In addition to the six profiles in the previous section, we include two profiles built from plays written with Fletcher’s two most frequent coauthors, Francis Beaumont and Philip Massinger; see Table VIII.

Table VIII: Plays used to build profiles for Fletcher & Beaumont and Fletcher & Massinger.

Fletcher & Beaumont: The Coxcomb (cox); Philaster (phi); The Woman Hater (twh); Cupid’s Revenge (cup); A King and No King (knk); Love’s Pilgrimage (pil); The Maid’s Tragedy (tmt); The Scornful Lady (tsl)

Fletcher & Massinger: The Custom of the Country (coc); The Double Marriage (tdm); The Elder Brother (teb); The False One (tfo); John van Olden Barnavelt (jvo); The Little French Lawyer (lfl); The Lover’s Progress (lp); The Prophetess (pro); The Sea Voyage (sea); Spanish Curate (tsc); A Very Woman (tvw)
The attribution of Fletcher’s works is divided into two plots. Fig. shows the attribution of plays believed to have been written solely by Fletcher and Fig. shows the attribution of plays believed to have been written in collaboration with other authors. The set of plays presented before the first red line includes attributions of plays written with Francis Beaumont. The second division shows the attribution of plays written with Philip Massinger and the third division shows the attribution of plays written with a mix of other authors. In both figures we omit the marker corresponding to Marlowe since he is poorly ranked for every play. This is consistent with Fletcher and Marlowe having the most dissimilar writing styles; see Table IV.

Table IX: John Fletcher plays to be attributed in addition to those listed in Table I.

Solo: Beggars’ Bush (bb); The Captain (cap); The Fair Maid of the Inn (fai); The Noble Gentlemen (tng); The Queen of Corinth (qoc); Wit Without Money (wit)

Collaborations: Henry VIII (h ); The Knight of Malta (kom); The Maid in the Mill (mil); The Night Walker (nw); Four Plays in One (fp); Two Noble Kinsmen (tnk); Wit at Several Weapons (wea); Love’s Cure (cur); The Bloody Brother (bro); Thierry and Theodoret (thi); Wandering Lovers (wan)

Fig. : Attribution of solo Fletcher plays. We attribute the plays in Table I and the additional plays in Table IX. Six plays are not attributed to the sole Fletcher profile and, among these, two plays are attributed to collaborative profiles including Fletcher. [Plot: relative entropy to profile (centinats) for each play, with one marker per profile: Shakespeare, Fletcher, F & Beaumont, F & Massinger, Jonson, Middleton, Chapman.]

In Fig. , out of plays are attributed to the solo Fletcher profile. Of the six plays attributed to other profiles, two of them, The Captain and Queen of Corinth, are attributed to one of the profiles for Fletcher and a collaborator.
beggar’s bush is marginally assigned to shakespeare and jonson. the faithful shepherdess, the noble gentleman, and the fair maid of the inn are mistakenly assigned to shakespeare as well, with fletcher and massinger ranked second. for the latter, existing theories attribute the play to a collaboration of four authors, two of whom are fletcher and massinger, with fletcher’s contribution being minor [ ]. this would explain the fact that the fletcher and massinger profile is ranked second but the sole fletcher profile is poorly ranked.

in fig. , of the fletcher and beaumont plays are attributed to the fletcher and beaumont profile, while philaster is assigned to the sole fletcher profile. a single mistake occurs for love’s cure, a play historically attributed to many different authors [ ]. additionally, all of the fletcher and massinger plays are assigned to the fletcher and massinger profile. one of the three fletcher profiles is also listed as the top candidate in out of the plays written by fletcher with other collaborators. of the four mistakes, two are the plays coauthored with shakespeare and discussed previously in section v-e and further in section viii-b. these examples demonstrate that our tool remains effective even in cases of mixed authorship and, in many cases, favors profiles built from multiple contributing authors over profiles built from a single contributing author.

table x: plays used to build profiles for robert greene and george peele.
robert greene: friar bacon and friar bungay; orlando furioso; james iv; alphonsus, king of aragon
george peele: the arraignment of paris; edward i; the battle of alcazar; the love of king david and fair bethsheba; old wives’ tale

viii. collaborations: intraplay analysis

we examine the authorship of collaborative plays through the attribution of their individual acts and scenes. in section vii we analyzed examples of detecting collaboration in full plays by looking at the top candidate authors.
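The ranking step behind "looking at the top candidate authors" can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each WAN has already been normalized into a Markov chain stored as nested dictionaries, that relative entropy is summed only over transitions present in both chains (one possible convention), and that 1 nat equals 100 centinats. The names `relative_entropy_cn` and `rank_candidates` and every numeric value are hypothetical.

```python
import math


def relative_entropy_cn(p_text, p_profile, stationary):
    """Relative entropy, in centinats, from a text chain to a profile chain.

    p_text[i][j] and p_profile[i][j] are transition probabilities between
    function words i and j; stationary[i] is the limiting probability of
    word i under p_text. Transitions missing from either chain are skipped
    (an assumption of this sketch).
    """
    total_nats = 0.0
    for i, row in p_text.items():
        for j, p in row.items():
            q = p_profile.get(i, {}).get(j, 0.0)
            if p > 0.0 and q > 0.0:
                total_nats += stationary.get(i, 0.0) * p * math.log(p / q)
    return 100.0 * total_nats  # 1 nat = 100 centinats


def rank_candidates(entropies_cn, top=2):
    """Return the `top` profile names with the smallest relative entropy."""
    return sorted(entropies_cn, key=entropies_cn.get)[:top]


# Hypothetical relative entropies (cn) of one play to three profiles.
example = {"shakespeare": 3.1, "fletcher": 3.4, "marlowe": 9.0}
top_two = rank_candidates(example)
```

Reporting the two best-ranked profiles rather than only the single best is what lets a full-play attribution hint at a second contributing author.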
this does not, however, suggest any particular breakdown of which sections of the text were contributed by which author. instead, we may attribute pieces of the play separately from one another to gain deeper insight into how the play was written. we also see cases where we can detect collaboration through intraplay analysis where we could not when attributing the full text.

fig. : attribution of fletcher plays written with collaborators (vertical axis: relative entropy to profile, in centinats; profiles: shakespeare, fletcher, f & beaumont, f & massinger, jonson, middleton, chapman). we attribute the plays listed in tables viii and ix. the first division includes plays written with beaumont and out of are correctly assigned to the fletcher & beaumont profile. the second division includes plays written with massinger and all plays are assigned to the fletcher & massinger profile. the third division includes plays written with other collaborators and out of are assigned to a fletcher profile. out of plays, a total of are assigned to a fletcher profile.

table xi: function words used in the attribution of individual acts, determined in the training process. a total of words are used.
a, about, against, all, an, and, any, as, at, away, before, both, but, by, can, could, for, from, if, in, into, it, like, little, many, may, might, more, most, much, must, no, none, nor, nothing, of, off, on, once, one, or, other, our, out, shall, should, since, so, some, such, that, the, them, then, these, they, this, those, though, till, to, unto, up, upon, us, what, when, where, which, who, whose, will, with, without, would, yet

table xii: function words used in the attribution of individual scenes, determined in the training process. a total of words are used.
a, all, an, and, any, as, at, away, but, by, can, for, from, if, in, it, like, may, more, most, much, must, no, nor, of, on, one, or, our, out, shall, should, so, some, such, that, the, them, then, these, they, this, to, up, upon, us, what, when, where, which, who, will, with, would, yet

in the following sections we attribute plays of known or suggested collaboration between the six original candidate authors as well as two new authors: robert greene and george peele. the plays used to construct greene’s and peele’s profiles are listed in table x. additionally, we retrain the wans because smaller wans increase the attribution accuracy of shorter texts. this is because shorter texts are less likely to contain less common function words. as a result, larger networks that contain these less common function words are more prone to over-fit to features of specific texts rather than author style. from the training period, we achieve accuracies of . % and . % for acts and scenes, respectively. note that in the case of scene attribution, this is the accuracy of binary attribution, whereas the act attribution is performed between eight candidate authors. the words used in the resulting networks are listed in tables xi and xii.

the figures display for each act or scene the difference in relative entropy when comparing the two top candidate authors, reflected by both the color of the bars and the titles above and below the plot. a longer bar in a particular direction indicates a larger difference between the entropies of the two candidate authors. for example, in fig. , red bars extending upwards indicate an attribution to shakespeare while blue bars extending downwards indicate an attribution to fletcher. in the attribution of acts, we identify the two top authors as the two highest ranked, whereas in the attribution of scenes we consider the two authors most often cited as candidates. in many cases,
the acts and scenes will be attributed between the same pair of authors. cases in which an act is attributed to a third author are marked in the figure captions.

fig. : attribution of acts and scenes of eastward ho (vertical axis: relative similarity to profile, in centinats; candidates: chapman, jonson). note that act is assigned to shakespeare over both jonson and chapman.

fig. : attribution of acts and scenes of two noble kinsmen (vertical axis: relative similarity to profile, in centinats; candidates: fletcher, shakespeare).

fig. : attribution of acts and scenes of henry viii (vertical axis: relative similarity to profile, in centinats; candidates: fletcher, shakespeare).

a. jonson and chapman

we attribute both the individual acts and scenes of the single known collaboration between jonson and chapman, eastward ho, which also includes contributions from a third author, john marston. fig. displays the results of the act and scene attribution. every act is assigned to jonson, with the exception of act assigned to shakespeare. chapman is ranked either third or fourth in all acts except act in which he is ranked second. these results are similar to the full play attribution from figs. and , in which jonson was the top ranked author and chapman was not well ranked. while these results on their own do not support chapman’s contribution, a look at the scene attribution does reveal some of chapman’s possible contributions. most of the play is still assigned to jonson; however, chapman is seen as a more likely candidate in scene . and . whereas the attribution of scenes . - is too close to make any conclusion. while there is not a scholarly consensus on the scene breakdown, many attribute marston to act , chapman to act and , and jonson to act [ ]. most scholars agree in particular about scene . being written by chapman [ ]. our results support the notion that chapman did not write act and jonson wrote act .
we also provide further evidence that chapman wrote . , as it is, in our analysis, the single scene that is assigned to chapman by a margin larger than cn. we also, however, find more evidence of jonson contributing acts and than chapman.

b. shakespeare and fletcher

in fig. we show the attribution of individual acts and scenes of two noble kinsmen, a known collaboration between shakespeare and fletcher. whereas in fig. the play is assigned to shakespeare with fletcher as the second best candidate, here acts and are assigned to shakespeare while acts and are assigned to fletcher. act is assigned to fletcher with shakespeare and jonson close behind. a closer look into the scene breakdown reveals more specific assignments. shakespeare is assigned to scenes . - , . , . - , . , . , and . - , fletcher is assigned to scenes . , . , . - , . - , and . - , and close ties in scenes . and . . the scene breakdown we propose largely supports the one given by hallett smith in the riverside shakespeare [ ].

fig. : attribution of acts and scenes of macbeth (vertical axis: relative similarity to profile, in centinats; candidates: middleton, shakespeare).

fig. : attribution of acts and scenes of measure for measure (vertical axis: relative similarity to profile, in centinats; candidates: middleton, shakespeare).

the act and scene analysis of shakespeare and fletcher’s other collaboration, henry viii, is displayed in fig. . recall that, when attributing the full play, shakespeare was the top candidate while fletcher was in fact ranked fourth, thus revealing no evidence of collaboration; see fig. or fig. . we see similar results in fig. , in which shakespeare is assigned every act. fletcher, again, is ranked poorly in every act. a scene-by-scene analysis between shakespeare and fletcher, however, does reveal fletcher to be a stronger candidate than shakespeare in several individual scenes.
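The two-candidate comparison behind the act and scene figures can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the tie threshold `tie_cn`, the scene labels, and all entropy values are hypothetical, and the rule "smaller relative entropy wins, small differences are ties" is our reading of how close calls are described in the text.

```python
from typing import NamedTuple


class SceneCall(NamedTuple):
    scene: str        # scene label, e.g. "scene A" (hypothetical)
    winner: str       # an author name, or "tie"
    margin_cn: float  # absolute difference in relative entropy (cn)


def attribute_scene(scene, h_a_cn, h_b_cn, author_a, author_b, tie_cn=1.0):
    """Attribute one scene between two candidate authors.

    h_a_cn and h_b_cn are the relative entropies (in centinats) from the
    scene's WAN to each author's profile; the smaller value is the closer
    profile. Differences below tie_cn (a hypothetical threshold) are ties.
    """
    margin = abs(h_a_cn - h_b_cn)
    if margin < tie_cn:
        return SceneCall(scene, "tie", margin)
    winner = author_a if h_a_cn < h_b_cn else author_b
    return SceneCall(scene, winner, margin)


# Hypothetical scenes with made-up entropies to two profiles.
calls = [
    attribute_scene("scene A", 9.0, 12.5, "shakespeare", "fletcher"),
    attribute_scene("scene B", 14.0, 10.0, "shakespeare", "fletcher"),
    attribute_scene("scene C", 10.25, 10.5, "shakespeare", "fletcher"),
]
```

A whole act can still favor one author when a few of its scenes do so by a very wide margin, which is why the per-scene calls can disagree with the act-level attribution.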
in fact, the scene breakdown we observe (in which shakespeare is assigned scenes . - , . - , . , . , . - , and . - and fletcher is assigned scenes . - , . , and . , and . and . ties between both authors) is aligned with that proposed by cyrus hoy [ ] and currently accepted by many scholars. the primary area of disparity between the breakdown we propose and the one given by hoy is the authorship of act . while hoy assigns act to fletcher, we find that there is greater evidence that shakespeare contributed this section. both scenes are attributed to shakespeare by a significant margin of at least cn. another point of contention is the assignment of . , given to shakespeare by hoy, to fletcher by a small margin.

the attribution of henry viii shows a clear example of using intraplay analysis to detect collaboration at the level of scenes that may be undetectable when looking at entire plays or acts. in this play, there are several individual scenes that are attributed to shakespeare by a margin as wide as cn, such as scenes . , . , . , and . , that bias the attribution of complete acts in favor of shakespeare, while the scene-by-scene analysis provides a clearer perspective.

fig. : attribution of acts and scenes of timon of athens (vertical axis: relative similarity to profile, in centinats; candidates: middleton, shakespeare).

c. shakespeare and middleton

we analyze in figs. - middleton’s contributions to shakespeare’s plays macbeth, measure for measure, and timon of athens. the attribution of the full plays in fig. did not suggest that middleton made any significant contribution to any of these plays. the intraplay analysis of macbeth at the level of acts and scenes, shown in fig. , supports this conclusion. a total of two scenes are assigned to middleton over shakespeare, namely scenes . and . . scene . is attributed to shakespeare by only a small margin of cn while scene . is assigned by a more substantial margin of cn. scholars have often flagged scenes .
, . , and . as scenes revised or contributed by middleton [ ], although we do not find evidence of this in our analysis.

the case of measure for measure favors shakespeare’s sole authorship even more; both the act and scene analysis displayed in fig. find shakespeare to be the sole author of the play. if middleton had indeed revised the original play as proposed by scholars [ ], [ ], we do not find evidence it was substantial.

of the three plays, we find that middleton’s contribution was likely largest in timon of athens. while all five acts are attributed to shakespeare, in act it is by a margin less than cn from middleton; see fig. . this is even more evident in the scene analysis. middleton is a stronger candidate in scenes . , . , and . , with close ties in scenes . , . , and . . this assignment supports much of the claim of authorship provided in [ ], [ ].

d. shakespeare and marlowe

although there are no unanimously agreed upon collaborations between shakespeare and marlowe, there exist a number of plays with controversial authorship that have been the subject of scholarly treatment regarding marlowe’s contributions. of these, we examine the three parts of henry vi as well as the anonymous plays arden of faversham and edward iii.

as suggested by the results in fig. , the three parts of henry vi have been considered as possible collaborations between shakespeare and marlowe [ ], though others such as greene and peele have also been suggested.

fig. : attribution of acts and scenes of henry vi (vertical axis: relative similarity to profile, in centinats; candidates: marlowe, shakespeare).

the attribution of the acts of henry vi, displayed in fig. , suggests that act could have been written by someone other than shakespeare. it is here attributed evenly between shakespeare and jonson with marlowe the next preferred candidate.
although jonson is generally not considered a candidate for this play, it may suggest a similar author we do not profile. the rest of the play is assigned to shakespeare and, in the case of acts and , by a wide margin from second candidate marlowe. the scenes are attributed between shakespeare and marlowe. in line with the act attribution, three scenes in act ( . , . - ) are attributed to marlowe rather than shakespeare. other scenes that are attributed to marlowe include . , . , . , . - . scene . in particular is attributed to marlowe by a large margin of almost cn. these results support parts of the breakdown suggested by hugh craig [ ], namely the attribution of someone other than shakespeare in act as well as shakespeare in scenes . - . although craig contends that marlowe likely wrote the scenes involving joan of arc, we find only half of the joan of arc scenes ( . - , . , . ) to be more like marlowe than shakespeare.

fig. : attribution of acts and scenes of henry vi (vertical axis: relative similarity to profile, in centinats; candidates: marlowe, shakespeare).

fig. : attribution of acts and scenes of henry vi (vertical axis: relative similarity to profile, in centinats; candidates: marlowe, shakespeare). note that the relative entropy for scene . extends out of the view of the figure to + cn.

the act and scene attribution of henry vi is shown in fig. . act is assigned to marlowe and the rest is assigned to shakespeare, with act being a close tie between them. in the former case, shakespeare is the third candidate author behind peele. the scene analysis assigns to marlowe scenes . , . , . , . , . , and close ties in scenes . , . , . - , and . - . scenes . and . , in particular, are attributed to marlowe by a wider margin of cn, increasing the likelihood of his contribution, while the other two scenes in act show less clear indication of authorship.
in comparison to the breakdown offered by craig, our results support the claims that shakespeare wrote all of act and marlowe possibly wrote scenes involving jack cade’s rebellion ( . - ). act , on the other hand, is attributed to shakespeare in the act analysis but most of the individual scenes are attributed to marlowe. the wan of scene . , in particular, has a large relative entropy to marlowe’s profile and indicates a strong likelihood it was written by shakespeare.

the intraplay analysis of henry vi in fig. attributes act to marlowe and the rest to shakespeare. although craig has suggested that the part of the text most likely written by other authors is act , the act analysis alone here suggests otherwise. however, the attribution of individual scenes shows a different pattern. here, marlowe is assigned four of the eight scenes in act , while shakespeare is attributed scene . by a very wide margin of cn (caused by the presence of a rare transition), which likely skewed the entire act in shakespeare’s favor. in addition to scenes , , , and in act , marlowe is selected as the more likely candidate in scenes . , . - , . , . , and . . shakespeare, meanwhile, is assigned scenes . - , . - , , . - , . - , . - , . , . , and . - . scene . is a close tie between authors.

we also perform in fig. the intraplay analysis on the play arden of faversham, attributed to shakespeare in fig. . every act is attributed here to shakespeare. although not shown in the figure, the second preferred candidate in all acts except act is jonson, who is not typically considered a potential author due to the year the play was written. the other commonly considered candidates for authorship are thomas kyd and marlowe [ ], [ ]. the former is not profiled due to a lack of a sufficient number of texts to build a profile and the latter is not well ranked in acts - but is close to the second preferred candidate in act .
for this reason, we attribute the scenes between shakespeare and marlowe rather than shakespeare and jonson. the scene-by-scene analysis shows shakespeare as the most likely candidate for almost the entire play, with many scenes attributed to shakespeare by a margin of at least cn. the exceptions to this are scene . , which is assigned to marlowe, and scene . , a tie between candidates. our results support existing claims by macdonald p. jackson [ ] that shakespeare at the very least wrote the middle of the play (act ); however, we also find him to be a likely candidate in at least acts , , and as well.

an analysis is performed for edward iii, attributed to shakespeare in fig. . as before, the two most commonly cited candidates for co-authorship with shakespeare are kyd and marlowe [ ], [ ]. the act attribution of edward iii in fig. shows act assigned to marlowe. acts , , and are attributed to shakespeare, as well as act by a small margin of less than . cn. a look into the scene-by-scene attribution, however, shows that in addition to . , marlowe is also assigned scene . by a clear margin of cn. marlowe is also assigned scenes . and . - , while the attribution of scene . does not provide a clear candidate. while not shown in fig. , the relative entropy values in the attribution of scene . are large for both profiles (+ cn and + cn for shakespeare’s and marlowe’s profile, respectively), suggesting that neither shakespeare nor marlowe, but possibly a third author, contributed the scene. timothy irish watt has suggested that shakespeare wrote scenes . and . while someone other than shakespeare, marlowe, or peele wrote scenes . - . [ ]. our results point to shakespeare as a likely candidate for scene . , with his profile being almost cn closer to the wan of edward iii than marlowe’s profile. additionally, along with scene . , we find scenes . - and . - , . and .
to be possibly written by a third author due to comparatively large distance between the scenes’ wans and both profiles. not displayed in fig. , the closest profile between shakespeare and peele for each of these scenes has a relative entropy between + . cn and + . cn, whereas all other scenes range from − . cn and − . cn from the closest profile.

fig. : attribution of acts and scenes of arden of faversham (vertical axis: relative similarity to profile, in centinats; candidates: marlowe, shakespeare).

fig. : attribution of acts and scenes of edward iii (vertical axis: relative similarity to profile, in centinats; candidates: marlowe, shakespeare).

fig. : attribution of acts and scenes of titus andronicus (vertical axis: relative similarity to profile, in centinats; candidates: peele, shakespeare). note that here the comparative relative entropies for act and its sole scene, . , differ. the plot of scene . reports the difference in relative entropy between peele and shakespeare while the plot of act reports the difference in relative entropy between peele and the second ranked author, marlowe.

e. shakespeare and peele

shakespeare’s play, titus andronicus, is commonly cited to include additions by peele [ ], and is attributed act by act and scene by scene in fig. . act is assigned to peele while the rest of the play is attributed to shakespeare. in the scene attributions scenes . and . are attributed to shakespeare by a small margin of less than cn, evidencing possible contributions of peele. typical attributions of this play, such as the one performed by brian vickers [ ], assign to peele act as well as scenes . and . .

another scene of interest in titus andronicus in the context of attribution studies is scene . , also known as the “fly” scene. this particular scene is present in the folio but not earlier editions, suggesting it was a later addition to the play and possibly added by another author. the relative entropies for this scene are compared in table xiii. the two top candidates here are shakespeare and marlowe. however, the scene only appeared in editions published long after marlowe’s death so our top candidate for this scene remains shakespeare.

table xiii: relative entropies between scene . of titus andronicus and author profiles.
shakespeare: . ; fletcher: . ; jonson: . ; marlowe: .
middleton: . ; chapman: . ; peele: . ; greene: .

fig. : attribution of plays between genre profiles (vertical axis: relative entropy to profile, in centinats; profiles: comedy, tragedy, history). the plays to the left of the first red line are comedies, the plays to the right of the first red line are tragedies, and the plays to the right of the second red line are histories.

table xiv: plays used to build the genre profiles.
comedy: a shoemaker a gentleman (william rowley), fair maid of the west (thomas heywood), city madam (philip massinger), humor out of breath (john day), heir (thomas may), orlando furioso (robert greene)
tragedy: atheist’s tragedy (cyril tourneur), rape of lucrece (thomas heywood), cleopatra (samuel daniel), fleire (edward sharpham), broken heart (john ford), spanish tragedy (thomas kyd)
history: duchess of suffolk (thomas drue), edward iv (thomas heywood), sir john oldcastle (robert wilson), thomas lord cromwell (s.w.), perkin warbeck (john ford), fuimos troes (jasper fisher)

ix. genre analysis

in addition to using wans to distinguish author styles, we may also use them to distinguish between plays more generally at the level of genre. there has been debate among literary scholars as to whether the classification of fiction into genres is something determined solely by the plot or whether it can also be determined by the writing style itself.
we demonstrate that, to some extent, it is possible to sort a play into its appropriate genre by considering only its writing style as encoded by function wans.

we build three profiles, one for each of the three primary genres (comedy, tragedy, and history), using plays that were not written by the six main playwrights studied in this paper. the complete list of texts used in the genre profiles is displayed in table xiv. the profiles use at most one play from any particular author to avoid biasing the results based on author similarity rather than genre similarity.

fig. shows the results of attributing ten comedy, tragedy, and history plays between the genre profiles. a total of seven of the ten comedy plays, displayed to the left of the first red line, are correctly attributed to the comedy profile. note also that all three misattributions are attributed to the tragedy profile. the attribution of ten tragedy plays, displayed to the right of the first red line, results in only three plays being assigned to the tragedy profile, with hamlet a close three-way tie between all profiles. of the remaining six plays, five are assigned to the comedy profile. however, the attribution of history plays, shown to the right of the second red line, results in % accuracy.

in our results, we find that distinguishing between history and the other genres is easier than distinguishing between comedy and tragedy. this is interesting because it is common to consider history and tragedy more thematically similar than comedy and tragedy. our results, by contrast, suggest that the writing styles of comedy and tragedy are more closely linked than the writing styles of either comedy and history or tragedy and history.

x. conclusion

function word adjacency networks (wans) were used to analyze the authorship of texts written by popular playwrights during the early modern english period.
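As a reminder of the object being summarized in this conclusion, the construction of a function word adjacency network and of an aggregated profile can be sketched as follows. This is a rough sketch under loudly stated assumptions rather than the authors' implementation: the window length `window`, the decay factor `discount`, the treatment of each sentence as an independent word list, and the function names `build_wan`, `aggregate`, and `normalize` are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_wan(sentences, function_words, window=10, discount=0.75):
    """Weighted directed network of function-word co-occurrence.

    For every function word, each function word appearing within `window`
    positions after it (in the same sentence) adds discount**(d - 1) to
    the edge weight, where d is the distance in words (assumed scheme).
    """
    fw = set(function_words)
    weights = defaultdict(lambda: defaultdict(float))
    for words in sentences:
        for pos, source in enumerate(words):
            if source not in fw:
                continue
            for d in range(1, window + 1):
                if pos + d >= len(words):
                    break
                target = words[pos + d]
                if target in fw:
                    weights[source][target] += discount ** (d - 1)
    return {s: dict(row) for s, row in weights.items()}


def aggregate(wans):
    """Sum edge weights over several texts to form a profile network."""
    total = defaultdict(lambda: defaultdict(float))
    for wan in wans:
        for s, row in wan.items():
            for t, w in row.items():
                total[s][t] += w
    return {s: dict(row) for s, row in total.items()}


def normalize(wan):
    """Turn edge weights into transition probabilities (rows sum to 1)."""
    chain = {}
    for s, row in wan.items():
        row_sum = sum(row.values())
        if row_sum > 0:
            chain[s] = {t: w / row_sum for t, w in row.items()}
    return chain


text = [["the", "cat", "of", "the", "hat"]]
wan = build_wan(text, {"the", "of"}, window=2)
profile = normalize(aggregate([wan, wan]))
```

Normalizing rows yields a Markov chain over function words, which is the form needed to compare a text against a profile with a relative entropy measure.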
word adjacency networks were built for a large set of texts in the corpus of the analyzed authors and were compared via the relative entropy measure. the networks of every text known to be written by a particular author were aggregated to form a profile network. the profile networks were then compared to one another to determine the general similarity between author styles. each text in an author’s corpus was compared to every profile and attributed to the author whose profile network produced the smallest relative entropy. an attribution accuracy of . % was achieved when attributing amongst all authors and an accuracy of . % was achieved when attributing amongst authors that are more dissimilar than cn. with this classification power, a selection of anonymous plays was attributed amongst the author profiles. the classification power was then further evaluated with respect to plays written by multiple authors, both through the attribution of an entire play as well as its individual act and scene components. the act and scene components were individually analyzed in a set of plays with highly disputed co-authorship, in which we both corroborate existing breakdowns and provide evidence of new assignments. the impact of genre on attribution accuracy was also briefly examined to gain insight into the similarity of writing styles with respect to a play’s genre. overall, we find function word adjacency networks to be simple yet effective tools in distinguishing between playwrights from the early modern era by considering relational structures between function words not previously considered in authorship attribution studies from this time period.

references

[ ] olivier de vel, alison anderson, malcolm corney, and george mohay, “mining e-mail content for author identification forensics,” acm sigmod record, vol. , no. , pp. – , .
[ ] efstathios stamatatos, “a survey of modern authorship attribution methods,” journal of the american society for information science and technology, vol. , no. , pp. – , .
[ ] norman meuschke and bela gipp, “state-of-the-art in detecting academic plagiarism,” international journal for educational integrity, vol. , no. , .
[ ] frederick mosteller and david wallace, “inference and disputed authorship: the federalist,” .
[ ] david i holmes and richard s forsyth, “the federalist revisited: new directions in authorship attribution,” literary and linguistic computing, vol. , no. , pp. – , .
[ ] david i holmes, “a stylometric analysis of mormon scripture and related texts,” journal of the royal statistical society. series a (statistics in society), pp. – , .
[ ] frederick gard fleay, shakespeare manual, ams press inc, .
[ ] philip wolcott timberlake, the feminine ending in english blank verse, george banta publishing company, .
[ ] ants oras, pause patterns in elizabethan and jacobean drama: an experiment in prosody, university of florida press, .
[ ] marina tarlinskaja, james bailey, and george t wright, shakespeare’s verse: iambic pentameter and the poet’s idiosyncrasies, lang, .
[ ] macdonald pairman jackson, defining shakespeare: pericles as test case, oxford university press, .
[ ] macdonald p jackson, “shakespeare and the quarrel scene in arden of faversham,” shakespeare quarterly, vol. , no. , pp. – , .
[ ] b. vickers, shakespeare, co-author: a historical study of the five collaborative plays, oxford university press, .
[ ] d hugh craig and arthur f kinney, shakespeare, computers, and the mystery of authorship, cambridge university press, .
[ ] g udny yule, “on sentence-length as a statistical characteristic of style in prose: with application to two cases of disputed authorship,” biometrika, pp. – , .
[ ] shlomo argamon and shlomo levitan, “measuring the usefulness of function words for authorship attribution,” in ach/allc, .
[ ] patrick juola, “authorship attribution,” foundations and trends in information retrieval, vol. , no. , pp. – , .
[ ] david i holmes, “vocabulary richness and the prophetic voice,” literary and linguistic computing, vol. , no. , pp. – , .
[ ] david l hoover, “another perspective on vocabulary richness,” computers and the humanities, vol. , no. , pp. – , .
[ ] doug cutting, julian kupiec, jan pedersen, and penelope sibun, “a practical part-of-speech tagger,” in proceedings of the third conference on applied natural language processing. association for computational linguistics, , pp. – .
[ ] santiago segarra, mark eisen, and alejandro ribeiro, “authorship attribution through function word adjacency networks,” corr, vol. abs/ . , .
[ ] dmitri v khmelev and fiona j tweedie, “using markov chains for identification of writer,” literary and linguistic computing, vol. , no. , pp. – , .
[ ] conrad sanderson and simon guenter, “short text authorship attribution via sequence kernels, markov chains and author unmasking: an investigation,” in proceedings of the conference on empirical methods in natural language processing. association for computational linguistics, , pp. – .
[ ] santiago segarra, mark eisen, and alejandro ribeiro, “authorship attribution using function words adjacency networks,” in acoustics, speech and signal processing (icassp), ieee international conference on. ieee, , pp. – .
[ ] george kesidis and j walrand, “relative entropy between markov transition rate matrices,” ieee transactions on information theory, vol. , no. , pp. – , .
[ ] ed. a. b. farmer and z. lesser, “deep: database of early english playbooks,” http://deep.sas.upenn.edu/, .
[ ] chadwyck-healey. proquest information and learning, “literature online,” http://lion.chadwyck.com.
[ ] gary taylor and john lavagnino, thomas middleton: the collected works, oxford university press, .
[ ] richard h barker, “the authorship of the second maiden’s tragedy and the revenger’s tragedy,” the shakespeare association bulletin, vol. , pp. – , .
[ ] cyril tourneur and lawrence j ross, the revenger’s tragedy, ed, .
[ ] anne begor lancashire and thomas middleton, the second maiden’s tragedy, manchester university press, .
[ ] mwa smith, “the authorship of the ’revengers tragedy’ + tourneur, cyril or middleton, thomas,” .
[ ] philip gaskell, a new introduction to bibliography, clarendon press oxford, .
[ ] edwin j howard, “the printer and elizabethan punctuation,” studies in philology, pp. – , .
[ ] archie webster, “was marlowe the man?,” national review, pp. – , .
[ ] t. p. logan and d. s. smith, the new intellectuals, university of nebraska press, .
[ ] james loxley, the complete critical guide to ben jonson, psychology press, .
[ ] a. gurr, the shakespeare company, - , cambridge university press, .
[ ] a. barton, ben jonson: dramatist, cambridge university press, .
[ ] david j lake, the canon of thomas middleton’s plays: internal evidence for the major problems of authorship, cambridge university press, .
[ ] h dugdale sykes, “john ford, the author of ‘the spanish gipsy’,” the modern language review, vol. , no. , pp. – , .
[ ] macdonald pairman jackson, studies in attribution: middleton and shakespeare, institut für anglistik und amerikanistik, universität salzburg, .
[ ] stanley wells, shakespeare and co.: christopher marlowe, thomas dekker, ben jonson, thomas middleton, john fletcher and the other players in his story, random house llc, .
[ ] edmund kerchever chambers, the elizabethan stage, vol. , clarendon press oxford, .
[ ] patrick cheney, the cambridge companion to christopher marlowe, cambridge university press, .
[ ] cyrus hoy, “the shares of fletcher and his collaborators in the beaumont and fletcher canon (v),” studies in bibliography, pp. – , .
[ ] terence p logan and denzell s smith, the predecessors of shakespeare, vol. , university of nebraska press, .
[ ] c. f. tucker brooke, the shakespeare apocrypha: being a collection of fourteen plays which have been ascribed to shakespeare, .
[ ] stephen roy miller, the taming of a shrew: the quarto, cambridge university press, .
[ ] terence p logan and denzell stewart smith, the later jacobean and caroline dramatists, vol. , university of nebraska press, .
[ ] ernest henry clark oliphant, plays of beaumont and fletcher, yale university press, .
[ ] william shakespeare, gwynne blakemore evans, and john joseph michael tobin, the riverside shakespeare, vol. , houghton mifflin boston, .
[ ] gary taylor and john jowett, shakespeare reshaped, - , cambridge univ press, .
[ ] w. w. greg, “shakespeare and arden of feversham,” the review of english studies, vol. , no. , pp. – , .
[ ] t. merriam, “marlowe’s hand in edward iii,” literary and linguistic computing, vol. , no. , pp. – , .

use of altmetrics to analyze scholarworks in natural resource management
kulhavy, d. l., et al. ( ). use of altmetrics to analyze scholarworks in natural resource management. journal of altmetrics, ( ): . doi: https://doi.org/ . /joa.
research
david l. kulhavy, r. p. reynolds, d. r. unger, m. w. mcbroom, i-kuai hung and yanli zhang
arthur temple college of forestry and agriculture, stephen f. austin state university, nacogdoches, texas, us
ralph w. steen library, stephen f.
austin state university, nacogdoches, texas, us
corresponding author: david l. kulhavy (dkulhavy@sfasu.edu)

digital preservation of library materials has increased the need for methods to access the documents and contents maintained in digital archives. the use of altmetrics to quantify the impact of scholarly works, including plumx, is increasing readership by listing articles in reference services. the outreach from the digital repository scholarworks at stephen f. austin state university (sfasu) highlights the impact within the natural resources community from digital commons, forest sciences commons; and from the natural products chemistry and pharmacognosy commons. the use of plumx altmetrics was examined to evaluate usage, impact, and digital audience downloads for the arthur temple college of forestry and agriculture (atcofa) at sfasu.

keywords: natural resources; plumx; digital commons; scholarworks

introduction
digital preservation of library materials is increasing, and proactive institutional organization and response are critical for developing comprehensive procedures and oversight in both securing and cataloging digital library resources (wilson ). this includes acquiring needed rights and permissions to store and disseminate the materials and to continue adding to the collection. kulhavy et al. ( ) reviewed digital preservation of natural resource documents from the arthur temple college of forestry and agriculture (atcofa) for scholarworks at stephen f. austin state university (sfasu) as part of forest sciences commons™, a subset of life sciences commons™ of the digital commons network™ (dcn) for bepress™. documents archived included research articles, ebooks, digital journals, theses and dissertations, monographs, bulletins, and documents produced by research centers on the sfasu campus. digital records give greater access to material downloads, as the scanned documents are archived at the online sfasu scholarworks.
the dcn software is used to build institutional repositories and publish peer-reviewed journals, and increases the visibility of the institution and its research (erway ; enis ). once entered into the scholarworks database, altmetrics are available to assess the access and usability of the documents archived in the system. a team of library archivists is generally responsible for scanning digital records to ensure accuracy and consistency of records (lapinski et al. ). choice of the altmetrics for measurement of digital record use is a function of cost and the intended recordkeeping for the institutional records. the use of altmetrics to quantify scholarly communication promotes the impact of the sfasu academic community (roemer and borchardt , ). the use of scholarworks at sfasu supports the mission of atcofa to produce society-ready natural resources managers who deal effectively with complex ecological, economic, and social issues associated with contemporary natural resources challenges. maintaining excellence in teaching, research, and service to enhance the environment through sustainable management, conservation, and protection of natural resources is atcofa’s strategic guiding principle (bullard et al. ). the development of the scholarworks natural resources repository at sfasu reflects the faculty research and teaching emphasis (kulhavy et al. ). the research is primarily funded by the mcintire-stennis cooperative forestry program, which was established during the kennedy administration by pl – . mcintire-stennis addresses the need to acquire new knowledge, understanding, and technologies and apply these to complex, transdisciplinary social and biological issues and problems (bullard & straka, ; thompson & bullard, ; bullard et al., ; bullard et al., ).
this program was designed to impact society by providing forestry institutions with the capacity funding needed not only to address forestry issues but also to disseminate findings to key audiences, including forestry professionals, the forest industry, and forest landowners, as well as to train new forestry professionals from the undergraduate through doctoral levels. tools like scholarworks are often overlooked as a necessary means of reaching these audiences in the digital age. plumx is an aggregator that offers more metrics, including citation and usage metrics (i.e., views and downloads). it covers more than . million artifacts, making it the largest altmetrics aggregator (ortega a; plum analytics ). altmetrics can be used to measure the impact that scientific articles have on society (holmberg et al. a). plumx is the primary altmetrics source for analysis of downloads and use of the natural resources materials from the sfasu scholarworks. in addition, the downloads and articles available are retrieved from the digital open network from bepress. this information provides a basis for evaluation of the impact in the natural resource digital-retrieval user community including top academic scholars with archives stratified by institution to compare the impact of different academic institutions. plumx altmetrics use downloads over time and referrals to the digital preservation system (collier & deliyannides ; wong & vital ; bar-ilan et al. ). altmetrics data are the aggregated views, mentions, downloads, shares, discussions, and recommendations of research outputs across the scholarly web (fenner ). more emphasis is being placed on the use and availability of research findings (vanclay et al. ; holmberg et al. a, b).
the impact of research can be viewed as all the different ways in which research can benefit individuals, organizations, and nations (esrc ). in , plum analytics was acquired by elsevier (www.elsevier.com), and its altmetrics information was added to the scopus database (elsevier ; michalek ).

methods
natural resource articles were scanned into scholarworks and archived in the ralph w. steen library at sfasu. these articles were evaluated for use and distribution using the forest sciences dcn for bepress from february , to june , . data extracted and posted from bepress included number of institutional records by country compared across natural resource digital archives, impact by authors in the forest sciences, downloads of articles from scholarworks, referrers to search engines for scholarworks, downloads of articles by country, downloads of articles by month, and examples of plumx altmetrics.

results
atcofa had five of the top eight authors in the country representing total number of downloads for the forest sciences dcn in bepress as of june , (table ) and atcofa was fourth out of institutions in the united states, with works in the forest sciences commons (figure ). all-time downloads for atcofa in the sfasu scholarworks were , from countries. there were , full-text articles with , authors and , , downloads in the forest sciences commons. there were , institutions downloading information, with % from education, % commercial, % government, and % organizations. the most accessed institution was sfasu, with , downloads. the highest country total was for the united states ( , ), followed by india, china, the philippines, canada, the united kingdom, indonesia, malaysia, vietnam, and germany (figure ). the major referrers ( , ) for retrieval of information from the sfasu scholarworks site were google, google scholar, and scholarworks at sfasu (figure ). total downloads from february , to june , , were , (figure ).

table : top eight scholar citations for forest sciences commons, may .
steven bullard, arthur temple college of forestry and agriculture, stephen f. austin state university
jerome vanclay, southern cross university
david kulhavy, arthur temple college of forestry and agriculture, stephen f. austin state university
daniel unger, arthur temple college of forestry and agriculture, stephen f. austin state university
i-kuai hung, arthur temple college of forestry and agriculture, stephen f. austin state university
yanli zhang, arthur temple college of forestry and agriculture, stephen f. austin state university
richard schultz, iowa state university
jake delwiche, us forest service

figure : institutions in forest commons by number of articles cataloged.
figure : referrers to search engines for articles in scholarworks, stephen f. austin state university.
figure : downloads from february to june , stephen f. austin state university scholarworks.
figure : downloads of articles from scholarworks, stephen f. austin state university.

the most downloads for a paper were for li et al. ( ), with , usages ( , downloads and , abstract views), captures by readers, and citations (figure ), determined using plumx altmetrics. this article is listed under the natural products chemistry and pharmacognosy commons, with sfasu third out of institutions represented in the united states; there are articles, authors, and , downloads. three of the top four authors are from sfasu: shiyou li, wei yuan, and ping wang.
the research articles are published from the national center for pharmaceutical crops established at atcofa in to improve human health by discovering novel anti-tumor and antiviral agents from native and invasive plant species (li et al. ; kulhavy et al. ). the center publishes the international peer-reviewed journal pharmaceutical crops (http://www.bentham.org/open/topharmcj/index.htm) to investigate cultivated species used for extraction or preparation of therapeutic substances used in pharmaceutical formulations, vaccines, and antibodies and therapeutic proteins (li et al. ). the center is currently investigating endocide-induced abnormal growth forms of giant salvinia (salvinia molesta) (li et al. ). endocides (endogenous biocides) are metabolites that can poison or inhibit the parent via induced biosynthesis or external applications (li et al. ).

discussion
much of the research is funded by the mcintire-stennis cooperative research program, which provides federal funding to increase forestry research in the production, utilization, and protection of forestland. increased readership and downloads of natural resources articles in the sfasu scholarworks therefore enhance the flow of information to users in educational, government, commercial, and organizational entities. the number of downloads is one measure of societal impact, as more downloads across a variety of user groups and countries expand the reach of the articles (holmberg et al. a). the purpose of the mcintire-stennis program is to increase research on forest productivity, utilization, and protection; to train future forestry scientists; and to cooperate with other states in forestry research (thompson & bullard ; bullard et al. ; rickenbach et al. ; allen ). the development of scholarworks for natural resources at sfasu supports the expanding role of libraries in archiving and dissemination of scholarly writing in an open access environment (american association of college and research libraries ).
at sfasu, scholarworks is supported and maintained by the ralph w. steen library and librarians, which provides worldwide visibility of research in a single location with a stable url and open access to all faculty research at sfasu. a stable platform is essential to increase the use and reliability of research material that is produced (gracy & kahn ).

plumx altmetrics
work is uploaded to scholarworks in the center for digital scholarship, including metadata for information retrieval using plumx altmetrics. plumx is an aggregator that offers more metrics for citation and usage (i.e. views and downloads) for over . million artifacts (ortega b; plum analytics ). plumx provides alternative metrics (altmetrics) to view reader impact from an academic institution, individual programs, or individual faculty members (collier & deliyannides ). account administrators create profiles in plumx for individual researchers or groups, including images and biographical and contact information (champieux ). plumx metrics can be used to provide information on interaction with research, including articles, conference proceedings, book chapters, and books, in an online environment (wong & vital, ). the five categories of plumx metrics include usage, captures, mentions, social media, and citations. these five metrics are color coded: usage (green), captures (magenta), mentions (yellow), social media (blue), and citations by others (orange). the size of the circle (e.g., usage, figure ) indicates the importance of this metric for each article in scholarworks. usage includes downloads, views, clicks on the work, and library holdings. the most important usage metric according to a taylor & francis open access survey in (frass et al. ) is citations of articles ( % of respondents), followed by article downloads ( %). next is captures, which includes blog posts, reviews, wikipedia, and news references; social media includes facebook, tweets, and shares (lindsay ).
plumx implementation by libraries raises the academic profile and adds to the altmetrics for faculty by adding information on authors across social media, mentions, and citation counts (wong & vital ). the university of pittsburgh created an extensive digital library and used plumx altmetrics to measure usage with the plumx artifact widget (collister et al. ). as digital resources and library services expand, the use of altmetrics for tracking user information for articles, including blogs, social media, and users, is increasing. librarians serve as the conduit to the altmetrics that acquire and provide access to resources for ease of tracking and to evaluate the scholarly impact of an academic institution. altmetrics like plumx have a cost structure tied to scholarworks, the institutional repository for digital preservation at sfasu.
figure : published paper most downloaded from stephen f. austin state university scholarworks.
plumx altmetrics provide data for dois or other digital identifiers (roemer & borchardt , ; peters et al. ), and use with scholarworks provides user information in connection with the dcn and bepress. altmetrics by themselves may not express societal impact but reflect both individual and institutional use of records. the actual use of the information is determined in further exploration of citation uses in scientific search engines. altmetrics also promote new forms of scholarly communication, with the goal of assessing the societal impact of research to identify and measure how a specific research document has been used and what kind of influence it has had, not just within academia but also beyond (holmberg et al. a). altmetrics are currently being investigated to determine if they could be used to assess societal impact of research.
the rapid development of the use of altmetrics (or alternative metrics) provides guidance on how research is being used, viewed, and moved. the electronic transfer of information can be tracked to provide data on the use and distribution of research findings (penfield et al. ). for plumx, tweets and blog mentions had the earliest views but were not persistent. bookmarking (mendeley readers), usage metrics (downloads and views) and bibliographic indicators (citations) increased over time and were persistent (ortega a). plumx altmetrics identified the highest numbers of mendeley readers of an article compared to crossref event data (ced) and altmetric.com (ortega b, ). wong & vital ( ) highlight the effectiveness of plumx to showcase the reader impact and academic profile of saint mary’s college of california. nuzzolese et al. ( ) used plumx altmetrics to analyze the national scientific qualification for scholars in italy as a method of measuring the impact of their research. the altmetrics led to an assessment at the product level of scholarly products (lapinski et al. ). bar-ilan et al. ( ) and brigham ( ) reviewed the use of altmetrics to expand the use of this tool for researchers to showcase impact metrics. plumx dashboards integrates the dcn as an institutional repository, with selected works as a profile system leading to repository harvesting through web crawling. use includes articles, book chapters, books, conference papers, discussion papers, presentations, posters, reports, and theses and dissertations. with the expanded role of digital measures for libraries, increased use of altmetrics enhances the availability of research materials across a variety of platforms. the primary use of altmetrics at sfasu is currently usage (downloads and abstract views). however, expanding the use of social media, blogs, and mentions would broaden the potential readership and use.
the use of plumx altmetrics at sfasu reflects the major audience with usage, followed by captures and citations, with very little use of mentions or social media. for scholarworks at sfasu, the primary search engine for finding articles is google, followed by google scholar and scholarworks. as scholarworks expands the number of articles listed, the usage will increase, expanding the importance of download statistics as a measure of societal impact.

acknowledgements
this research was supported by the national institute of food and agriculture, u.s. department of agriculture, mcintire-stennis project under texy administered by the arthur temple college of forestry and agriculture, stephen f. austin state university.

competing interests
the authors have no competing interests to declare.

references
allen, j. a. ( ). sustaining healthy and productive forests, mcintire stennis strategic plan: investing in america’s competitive position in the global marketplace. the mcintire-stennis cooperative forestry research program strategic plan. retrieved august , from https://www.fs.fed.us/research/docs/forestry-research-council/articles/mcintire-stennis-strategic-plan-update.pdf.
american association of colleges and research libraries. ( ). information literacy competency standards for higher education. american library association, chicago, illinois, usa.
bar-ilan, j., halevi, g., & milojević, s. ( ). differences between altmetric data sources – a case study. journal of altmetrics, ( ), . doi: https://doi.org/ . /joa.
brigham, t. ( ). an introduction to altmetrics. medical reference services quarterly, , – . doi: https://doi.org/ . / . .
bullard, s. h., & straka, t. j. ( ). continuing education needs of natural resource professionals. resource management and optimization, , – .
bullard, s. h., brown, p. j., blanche, c. a., brinker, r. w., & thompson, d. h. ( ). a “driving force” in developing the nation’s forests: the mcintire-stennis cooperative research program.
journal of forestry, , – .
bullard, s. h., stephens williams, p., coble, t., coble, d., darville, r., & rogers, l. ( ). producing “society-ready” foresters: a research-related process to revise the bachelor of science in forestry curriculum at stephen f. austin state university. journal of forestry , – . doi: https://doi.org/ . /jof. -
champieux, r. ( ). plumx. journal of the medical library association, ( ), – . doi: https://doi.org/ . / - . . .
collier, l. b., & deliyannides, t. s. ( ). altmetrics: documenting the story of research. against the grain, ( ), article . doi: https://doi.org/ . / - x.
collister, l. b., kirschner, j., bradbury, m., deliyannides, t. s., & kear, r. ( ). altmetrics and library publishing. ifla wlic, wroclaw, poland. http://library.ifla.org/id/eprint/ p.
economic and social research council (esrc). ( ). what is impact? retrieved august , from https://esrc.ukri.org/research/impact-toolkit/what-is-impact/.
elsevier. ( ). elsevier acquires leading ‘altmetrics’ provider plum analytics. retrieved may , from https://www.elsevier.com/about/press-releases/corporate/elsevier-acquires-leading-altmetrics-provider-plum-analytics.
enis, m. ( ). uncommonly open: the new digital commons network. the digital shift, library journal. http://www.thedigitalshift.com/ / /discovery/uncommonly-open/.
erway, r. ( ). lasting impact: sustainability of disciplinary repositories. online computer library center, inc., dublin, ohio.
fenner, m. ( ). altmetrics and their novel measures for scientific impact. in s. bartling & s. friesike (eds.), opening science (pp. – ). cham: springer international publishing. doi: https://doi.org/ . / - - - - _
frass, w., cross, j., & gardner, v. ( ). taylor and francis open access survey, june . taylor and francis/routledge. pp. https://www.tandf.co.uk//journals/explore/open-access-survey-june .pdf.
gracy, k. f., & kahn, m. b. ( ). preservation in the digital age. library resources & technical services, ( ), – . doi: https://doi.org/ . /lrts. n .
holmberg, k., bowman, s., bowman, t., didegah, f., & kortelainen, t. ( a). what is societal impact and where do altmetrics fit into the equation? journal of altmetrics, ( ), . doi: https://doi.org/ . /joa.
holmberg, k., bowman, t., didegah, f., & lehtimäki, j. ( b). the relationship between institutional factors, citation and altmetric counts of publications from finnish universities. journal of altmetrics, ( ), . doi: https://doi.org/ . /joa.
kulhavy, d. l., reynolds, r. p., unger, d. r., bullard, s. h., & mcbroom, m. w. ( ). digital preservation and access of natural resource documents. journal of education and practice, , – .
lapinski, s., piwowar, h., & priem, j. ( ). riding the crest of the altmetrics wave: how librarians can help prepare faculty for the next generation of research impact metrics. arxiv: . , p.
li, s., wang, p., su, z., lozano, e., lamaster, o., grogan, j. b., weng, y., decker, t., findeisen, j., & garrity, m. ( ). endocide-induced abnormal growth forms of invasive giant salvinia (salvinia molesta). scientific reports, , article number . doi: https://doi.org/ . /s - - -
li, s., wang, p., yuan, w., su, z., & bullard, s. h. ( ). endocidal regulation of secondary metabolites in the producing organisms. scientific reports, , article number . doi: https://doi.org/ . /srep
li, s., yuan, w., deng, g., wang, p., yang, p. & aggarwal, b. b. ( ).
chemical composition and product quality of turmeric (curcuma longa l.). pharmaceutical crops, , – . doi: https://doi.org/ . /
li, s., yuan, w., yang, p., antoun, m. d., balick, m. j., & cragg, g. m. ( ). pharmaceutical crops: an overview. pharmaceutical crops, , – . doi: https://doi.org/ . /
lindsay, j. m. ( ). plumx from plum analytics: not just altmetrics. journal of electronic resources in medical libraries, ( ), – . doi: https://doi.org/ . / . .
michalek, a. ( ). plum analytics joins elsevier. retrieved from: https://plumanalytics.com/plum-analytics-joins-elsevier/.
nuzzolese, a. g., ciancarini, p., gangemi, a., peroni, s., poggi, f., & presutti, v. ( ). do altmetrics work for assessing research quality? scientometrics, , – . doi: https://doi.org/ . /s - - -z
ortega, j. l. ( a). reliability and accuracy of altmetric providers: a comparison among altmetric.com, plumx and crossref event data. scientometrics, , – . doi: https://doi.org/ . /
ortega, j. l. ( b). the life cycle of altmetric impact: a longitudinal study of six metrics from plumx. journal of informetrics, , – . doi: https://doi.org/ . /j.joi. . .
ortega, j. l. ( ). availability and audit of links in altmetric data providers: link checking of blogs and news in altmetric.com, crossref event data and plumx. journal of altmetrics, ( ), . doi: https://doi.org/ . /joa.
penfield, t., baker, m. j., scoble, r., & wykes, m. c. ( ). assessment, evaluations and definitions of research impact: a review. research evaluation, ( ), – . doi: https://doi.org/ . /reseval/rvt
peters, i., kraker, p., lex, e., gumpenberger, c., & gorraiz, j. ( ). research data explored: an extended analysis of citations and altmetrics. scientometrics, , – . doi: https://doi.org/ . /s - - -
plum analytics. ( ). coverage: expanding the world of altmetrics. retrieved from https://plumanalytics.com/learn/about-metrics/coverage/.
rickenbach, m., mohamed, a., blanche, c., & norland, e. ( ).
final report: review of the mcintire-stennis cooperative forestry research program. usda national institute of food and agriculture institute of climate, bioenergy, and the environment final report to usda forest research advisory council. p.
roemer, r. c., & borchardt, r. ( ). institutional altmetrics and academic libraries. information standards quarterly, ( ), – . doi: https://doi.org/ . /isqv no . .
roemer, r. c., & borchardt, r. ( ). chapter . altmetrics and the role of librarians. library technology reports, ( ), – .
thompson, d.
h., & bullard, s. h. ( ). history and evaluation of the mcintire-stennis cooperative forestry research program. forestry and wildlife research center, research bulletin fo , mississippi state university, starkville, mississippi.
vanclay, f., esteves, a. m., aucamp, i., & franks, d. m. ( ). social impact assessment: guidance for assessing and managing the social impacts of projects. fargo nd: international association for impact assessment (p. ). retrieved june , from http://espace.library.uq.edu.au/view/uq: /uq .pdf.
wilson, t. c. ( ). rethinking digital preservation: definitions, models, and requirements. digital library perspectives, ( ), – . doi: https://doi.org/ . /dlp- - -
wong, e. y., & vital, s. m. ( ). plumx: a tool to showcase academic profile and distinction. digital library perspectives, ( ), – . doi: https://doi.org/ . /dlp- - -

how to cite this article: kulhavy, d. l., reynolds, r. p., unger, d. r., mcbroom, m. w., hung, i-k., & zhang, y. ( ). use of altmetrics to analyze scholarworks in natural resource management. journal of altmetrics, ( ): . doi: https://doi.org/ . /joa.
submitted: august accepted: september published: october
copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.
journal of altmetrics is a peer-reviewed open access journal published by levy library press.

calculation as a cultural practice in modern literature
lászló bengi
neohelicon ( ) : – . https://doi.org/ . /s - - -x
published online: september . © the author(s)

abstract
the opposition between quantitative sciences and the humanities is a well-known problem of cultural debates, along with its reflection in the conflicting approaches to digital humanities. as the emphasis has moved from long-standing scientific methods of quantification to the overall digital turn of everyday life, this process sheds light on the varying sociocultural conditions for calculations in modern societies. consequently, numbers cannot be conceived as inherent properties of things to be discovered through experimentation and explanation: this essentialist conception seems to originate in a misunderstanding of nineteenth-century scientific research and its claim of objectivity. rather, quantification and the cultural matrices of calculation build a raster image serving as an interface between world and mind. in this broad sense, everyday life is deeply pervaded by numbers. moreover, the ability for calculations cannot be treated as a uniform skill any more. instead, it varies in accordance with different cultural forms and functions. number-based practices are also represented widely in modern literature and in non-literary works, such as the letters and diaries of many writers. the essay is thus intended to analyze and compare the forms of calculation in the novels and diaries of some east central european writers, such as kafka, kosztolányi, and musil, who thrived in the first decades of the twentieth century. in so doing, it describes three models through which calculation as a cultural practice enters the field of literature.
keywords: cultural practices · numbers · operational knowledge · iteration in literature

* lászló bengi, lbengi@elte.hu, department of comparative literature and cultural studies, elte eötvös loránd university, múzeum krt. /a, budapest , hungary. http://orcid.org/ - - -

on the cover of a recently published collection of essays on the possibilities of empirical and digital approaches to literature (bode and dixon ), two photographs are placed next to one another. the left one shows the museum-like interior of a library hall, with bright, square and evenly arranged windows on the ceiling. the right one depicts a very similar arrangement of blackish microprocessors, supposedly part of a large digital device. although these photographs might represent a certain kind of technological progress, their similar structure suggests that a desired harmony also accompanies the undeniable contrast between classical and digital scholarship. in the first chapter of her recent book, one of the editors, bode ( ), argues that close and distant reading are not necessarily as far away from one another as is frequently assumed in digital humanities. both methods of reading can be treated as highly equivalent forms of literary analysis in regard to the (sometimes naive) claim of objectivity and their ahistorical character. thus, they reduce the heterogeneity and complexity of literary phenomena, even if in a significantly different manner. bode instead offers an approach that synthesizes the contributions of close and distant reading with the long-standing tradition of textual scholarship, mainly as it was developed in the practices of creating scholarly editions. to analyze the synchronic and diachronic variabilities of the literary field properly, philology provides a wide (and sometimes wild) inventory of critical and historical methods of investigation.
nevertheless, in contrast to bode’s rather optimistic (but in no way naive) point of view, the increasing importance of quantification and the interpretative attitude of the humanities have been in permanent conflict throughout modernity. snow ( ), in his famous essay on “two cultures,” originally published in , deliberately talked about a deep rupture between the human and the natural sciences. in snow’s argumentation, this opposition is considered not merely as a difference that comes from, and results in, a sharp methodological dualism (an idea that dates back to geisteswissenschaft in its diltheyan form) but as a general inability of communication and understanding between literary intellectuals and scientists. although hayles ( ), similarly to bode, argues that, under the current conditions, a certain synergy of close, hyper and machine reading is desirable, she also warns that the reaffirmation of a “rift between print-based and digital scholarship would have significant implications for both sides. print-based scholars would become increasingly marginalized […]. digital humanities would become cut off from the rich resources of print traditions, leaving behind millennia of thought, expression, and practice that no longer seem relevant to its concerns” (pp. – ).

in the polarized situation of contemporary cultural and literary studies, it is all the more important to ask how the relation of quantitative properties and cultural practices is reflected in the history of modern literature. does the reading of modernist literary works confirm a strict dichotomy between numerical operations and aesthetic experience, quantification and poetical features? i will argue that such an assumption would be a misleading oversimplification. literature does not represent calculation in a uniform way but as being interwoven into the matrix of heterogeneous cultural discourses.
thus, the relation of numerical reasoning and literary expression, as well as being full of tension, varies significantly in accordance with the cultural function of calculation. therefore, firstly, the essay intends to show that the effect of quantification cannot be confined to the certainty of “ × = ” or to the uniformity implied by the numerical measurement and ordering of physical properties and social conditions. the narrow concept that literary studies formed about calculation should be reinterpreted so that it conforms a bit more to the colorful field of mathematics. in the remainder of the essay, i will outline three models through which different relations to mathematical operations are articulated in both literary and para-literary forms.

the dual tendency of modernism to both incorporate and reject the use of calculative operations in cultural practices can easily be overshadowed by concentrating exclusively on numerical transformations and statistical representations that have pervaded everyday life since at least the last decades of the nineteenth century. it is not by chance that connor ( ), exploring the significant roles that the world of numbers plays in the arts, gave his book the subtitle “in defense of quantity.” on the one hand, numericalization or quantification threatened to subordinate the expressivity of language to rigid operations; on the other hand, subtle notational systems such as literature disclosed the potential of languages to perform complex operations, including calculations (cf. krämer , p. ). modernism, thus, maintained a rather ambivalent relationship with calculation, since differentiating letters and numbers, expressive and operative discourses, and symbolic and formal languages could generally be as important in the modernist intellectual climate of the early twentieth century as the attempt to keep together what was tempting to separate.
in the first decades of the twentieth century, calculation could be seen either as a bridge or as a gap between the everyday experience of the world and the interpretative effort of reason.

at the dawn of our digital era, it would be hard to decide whether the cultural transformation brought about by computerization passes beyond the european tradition of surface-substance dichotomy to a unifying system of numerical representation, or just evokes a misleading substitution of visualizing interfaces driven by strict and uniform numerical analysis for quantified description. in light of this dilemma, however, the ambivalent view of modernism on the operative discourses of calculation might gain new interpretative relevance for present cultural practices. as the all-embracing process of digital transformation highlights how deeply everyday life is pervaded by numbers, it no longer makes sense to assume that the quantitative elaborations of understanding and the humanities form two distinct universes, either in the sciences or in everyday life. nevertheless, literature has never ceased to maintain a skeptical, or at least reflexive, attitude toward the generalization of numerical relations. this approach puts the emphasis on the complex interaction by which calculative and non-quantitative discourses intermingle all the time, instead of supposing a homogeneous field of relations without the possibility of differentiating between cultural phenomena.

as a set of mathematical operations, or even as a wide range of everyday practices to organize one’s life-world, calculation cannot be restricted to the field of numerical relations. although in the philosophy of mathematics it is possible to argue that counting (the notion of numbers in general) is a necessary condition for calculation, it is also important to emphasize that calculations are not necessarily executed numerically.
in this sense, numeration is a narrower notion than calculation, which embraces a much larger scope of operations. if the focus of cultural analysis shifts from considering the functions of numbers in themselves to the various forms of discursive embeddedness of calculation, then the relation of modern literature to the mathematical or, more appropriately, formal construction and interpretation of the world can be seen somewhat differently.

books on mathematical analysis or calculus regularly skip the definition of their subject in its general meaning. this is probably not independent of the acute tension “between practicality and theoretical musing: between solving problems and understanding the problem” (juola and ramsay , p. ), as a recent handbook on mathematics for humanists describes the ambiguous reception of the disciplinary field of “calculus.” on the one hand, the authors do not hesitate to admit that “it is one of the most eminently practical systems ever created for solving a truly vast set of problems. calculus is, in this sense, the ultimate form of applied mathematics” (p. ). on the other hand, they “can think of very few conceptual problems involving numbers that don’t eventually wend their way around to the philosophical conundrums that gave rise to the invention of the calculus, and which continue to inform its theoretical basis” (p. ). in contrast to the approach of alain badiou ( ) in his thorough and technical analysis of the construction of numbers (see also the critical remarks of nirenberg and nirenberg ), the third model mentioned in this essay to grasp the interplay between calculative and literary practices rests primarily on the theoretical and philosophical interpretation of mathematical analysis. it does not intend to deny the mutual interdependence of number theory and calculus but to emphasize that the symbolic machinery of mathematics can also be revealed without the discussion of numerical execution.
as a consequence, the exploration of the relationship between literature and calculation as intertwining cultural discourses requires the broadening of the field of interest in four strongly related senses. first, on the thematic level, it means that even texts in which numbers or counting are not explicitly present might still be relevant. second, besides the most pervasive model of economics (cf. connor , p. ), other possible patterns of cultural interactions, such as the abstract field of operations, can also be taken into consideration. in this regard, a relevant counterargument comes from the fact that many writers were (and are) unable to comprehend the idea behind the abstract and highly technical elaborations of mathematical problems. this is exactly the reason why the essay will propose models that induce, motivate and govern the interaction between mathematical operations and aesthetic reception, instead of supposing the unmediated influence of higher mathematics in each case. in other words, abstract constructions are, by definition, abstract in the sense that they are not bound to one particular experience but can appear even in the most different spheres of life, such as literature. third, on the level of interpretative strategies, just as “number-magic” and “numerology” (connor , pp. – ) have lost their efficiency in linking literature and calculation, both of which modernized rapidly, the ongoing tradition of number symbolism is no longer sufficient to undertake this role either. even though it still serves several fundamental metaphorical and rhetorical functions in the modern era, number symbolism does not cover the scope of narrative and poetical problems that fall beyond the representation of strictly numerical or quantifiable constructions of literary works.
finally, the broadening of the notion of calculation rejects the reduction of cultural complexity in the sense that the role played by literature in the interaction with mathematical operations cannot be constrained to a merely defensive position.

the first and probably most common model is based on economy and the more or less everyday experience of the market, trade, and the circulation of money. since financial trends cannot be foreseen completely, this model notably relies on the (possible) application of statistics (the social and sociocultural context of statistical inference and probability theory is analyzed in detail by hacking ; porter ; campe ). simple financial calculations as well as statistical inferences are frequently articulated in series of records (regarding defoe, cf. campe , pp. – ), which makes diaries and letters the prototypical, but by no means exclusive, media of the interaction governed by the economic model. in fiction, it might become a significant pattern of composition, together with the model of measurement mentioned below, in works designed to build or explore the rules and components of a particular world as the context of the narrated story. together with certain types of realist novels, the genres of crime fiction and sci-fi can be mentioned as examples.

in many of the journal entries and letters, the recording of several events and the mainly intuitive, statistical-like interpretation of observations serve as a basis for self-discipline. although economic interpretations of one’s life-world commonly mark a strategy for self-regulation, the modality and the discursive position of these passages vary greatly: restrictive and retrospective practice can be differentiated from the speculative, basically expansive and future-oriented way of calculations.
the effect of financial conditions on an individual’s way of life and even cultural habits is evident in a letter written by the young poet, Árpád tóth, to his parents in , when he and his brother studied in budapest:

all of these have, unfortunately, cost money. […] we have already paid the apartment rent for the period until the th, but for breakfasts we will have to pay . – . forints to our current housekeeper by the th of the next month. our financial situation is as follows: today i withdrew koronas, so the balance is still koronas. […] the koronas i withdrew today will be enough for about – days because we do not spend more than forint a day, and we do not have dinner in expensive restaurants that cost + = kreutzers every evening, as instead we eat cottage cheese or something similar for dinner, and thus, in all, together with bread, our dinner costs only kreutzers. […] in addition, theatres: since march th, we have been to the comic theatre twice and the national theatre once, which cost and – kreutzers, respectively. (tóth , p. ; translation is mine.)

the stationary, equilibrium-oriented, and substantially restrictive approach characterizing most of tóth’s letters can be contrasted with successive, progress-driven efforts. such efforts are also expressed in the diaries of zsigmond móricz, one of the greatest novelists of the interwar period in hungary. the following entry is from :

it will cost koronas to dig out the well, so i have to run into debt for it, and what is more, i have to do so hastily. but my colleagues at the journal nyugat help me in many ways […]. the water conduit would cost pengos, while in this way, hardly , i hope. and by this, i gain perpetual value since the garden, if there is water in it, can make some profit. now, i have a good gardener who is an aging, hard-working and trustworthy man, or so it appears, and he has already minimized the expenses for the garden.
[…] without water, he won’t be able to raise anything, but if there will be water, then it will be possible to do so. speculation it is, but i think it is good speculation, and it is the reason why i adhere so much to this idea. (móricz , p. ; translation is mine.)

as a physician and psychiatrist, géza csáth, besides the permanent calculation of his financial status, kept a detailed quantitative record of his own physical condition. in line with contemporaneous psychology, he applied a clear economic model to personality by explaining behavior as an interaction of different complexes of drives. therefore, in the s, he gave long and precise enumerations about his health, especially about the exact weight of drugs (mainly the morphine he took day by day) as well as about the sexual intercourse he had with his wife and with other ladies, some of them his patients (csáth’s diaries are among the cruelest readings in hungarian literature). his records mostly shatter the narrative frame, and include cumulative tables and simple enumerations of data (e.g., csáth , pp. , , – ). the entries form an unsettled sequence of alternating periods in which, on the one hand, csáth tried to sustain an equilibrium of taking drugs, doing work, etc., and through which, on the other hand, he was optimistic about progressing in his episodic efforts to give up drugs.

a similarly ambivalent experience evolves in kafka’s diaries. this ambiguity, however, does not stem from the dialectical personal drives to reach a steady-state equilibrium and to increase prosperity in life but from a self-contradictory attitude toward the economic model in general. as is frequently the case with non-literary works of the early twentieth century, more common forms of counting come to the fore in kafka’s numerous letters and diaries.
several times he stresses the irreconcilable opposition between the office (or the factory), characterized by the permanent demand for statistics, accounts, and reports, and literary writing. kafka complains about his unusual daily routine and, as a good official, relentlessly calculates the possibilities for minimizing the lost time he could have spent writing. in his letters, he urges his sweetheart to write more regularly, and compiles reports of how many letters he received and how many he did not get. later, as a fiancé, he weighs the gains and losses of the anticipated marriage for both participants. in this respect, kafka’s relation to numerical calculation is not ambivalent, but is clearly paradoxical. he keeps counting the conditions for getting rid of numeration.

in relation to calculation as a cultural practice, physical explanation also serves as an influential model of literary discourse. it relies on the design of measurements, instead of statistical data analysis. although thomas kuhn, within the history of physical sciences, differentiated between observation and measurement on the basis that the latter “always produces actual numbers” (kuhn , p. ), in modern literature, this difference seems to be of secondary importance. while the possibility of numericalization is a necessary requirement, literature does not intend measurements to be carried out. for instance, the characters of robert musil and dezső kosztolányi in the novels mentioned below are not really fond of numerical problems but of the conceptual relation of calculation and everyday life. hence, the peculiar medium in which measurements appear as a meaningful cultural model of calculation is the structure of narrative motifs.
moreover, as a metaphoric field of interacting textual elements, this model frequently reflects on the pattern of communication between characters and thus highlights the social conditions of identity formation. in fact, the symbolic interpretation of numbers like zero and one or the notions of manifold and infinity can be developed within the model based on the physical explanation of interacting forces.

the sensitive and smart protagonist of the confusions of young törless, a novel published in by musil, becomes highly excited over getting acquainted with complex numbers. although törless seems to be enthusiastic, his doubts are no less important:

what is actually so odd is that you can really go through quite ordinary operations with imaginary or other impossible quantities, all the same, and come out at the end with a tangible result! […] in a calculation like that you begin with ordinary solid numbers, representing measures of length or weight or something else that’s quite tangible—at any rate, they’re real numbers. and at the end you have real numbers. but these two lots of real numbers are connected by something that simply doesn’t exist. isn’t that like a bridge where the piles are there only at the beginning and at the end, with none in the middle, and yet one crosses it just as surely and safely as if the whole of it were there? that sort of operation makes me feel a bit giddy, as if it led part of the way god knows where. but what i really feel is so uncanny is the force that lies in a problem like that, which keeps such a firm hold on you that in the end you land safely on the other side. (musil , pp. – )

according to törless, imaginary numbers suspend or at least disrupt counting with real (more or less referential) numbers (dipert , pp. – ).
since the unreality of the imaginary unit, the acting of an operator where it should not operate, blurs the established rules of ordinary mathematics, complex numbers cannot be treated as tools for grasping the essence of physical phenomena. in other words, the imaginary unit is supposed to be a gap in reasoning and also a bridge between different parcels of reality, but not the (hidden) depth below this bridge. this could be a source of törless’ doubts. moreover, the difference between gap and depth also echoes the epigraph from “mystic morality,” an essay by maurice maeterlinck, which allegorically opposes the “uttermost depth” and the “surface,” the “darkness” of the “abyss” and “the light of day.”

how strangely do we diminish a thing as soon as we try to express it in words! we believe we have dived down to the most unfathomable depths, and when we reappear on the surface, the drop of water that glistens on our trembling finger-tips no longer resembles the sea from which it came. we believe we have discovered a grotto that is stored with bewildering treasure; we come back to the light of day, and the gems we have brought are false—mere pieces of glass—and yet does the treasure shine on, unceasingly, in the darkness! (maeterlinck , p. )

while the symbolic connotations of the epigraph express linguistic skepticism, young törless finds a possible answer to the surface-substance opposition through the intellectual challenge of performing calculations with imaginary numbers.
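the kind of calculation that unsettles törless can be illustrated with a classical textbook case. the example is an assumption added here for illustration only (it is usually attributed to bombelli and appears neither in musil's novel nor in the sources cited in this essay): the cubic equation $x^3 = 15x + 4$ has real coefficients and the real solution $x = 4$, yet cardano's formula reaches that solution only by passing through square roots of a negative number.

```latex
x = \sqrt[3]{2 + \sqrt{-121}} + \sqrt[3]{2 - \sqrt{-121}}
  = \sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i}
  = (2 + i) + (2 - i)
  = 4,
\qquad \text{since } (2 \pm i)^3 = 2 \pm 11i .
```

the two real ends of the calculation (the coefficients 15 and 4, and the result 4) are joined by the quantities $2 \pm 11i$, which correspond to no measure of length or weight; this is one way to read the bridge whose piles stand only at the beginning and at the end.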
finally, the experience of törless is also important regarding the relation of lesson and life, i.e., the theoretical comprehension of mathematical operations and the pragmatic effort to understand and organize the world: “for some days past he [törless] had been following lessons with special interest, thinking to himself: ‘if this is really supposed to be preparation for life, as they say, it must surely contain some clue to what i am looking for, too.’ it was actually of mathematics that he had been thinking […]” (musil , p. ). in this way, musil’s protagonist develops not only an ambivalent relation to non-elementary calculation, but also a pattern in which the discourses of mathematics, literary symbolism, and everyday experience are interwoven.

a main character (although not the protagonist) of a hungarian novel published about years later, the golden kite by kosztolányi, has a different relation to calculation. vilmos liszner is also a high-school student. preparing for the legendarily tough graduation exams, and unlike young törless, he is much more motivated to reach outstanding results in sports than in sciences:

[liszner] was reluctantly thumbing the mathematics workbook, with slow and dumb attention. he was able to read it for hours. someone has bought five meters of baize… years ago, the father was exactly one hundred times older than his son… a rich man has hired two farm servants… he was wandering about all these situations. the facts, the persons, the objects were entertaining him, and it even did not come into his mind that the exercises could be, or should be, solved. he persisted in warping his lazy dreams. he was wondering what color that baize is? and who is that father and that certain son? does the old boy have a beard and can his son cycle? then where must that rich guy have lived?
but whenever it came to counting, he immediately lost his amusement and brushed aside the whole thing by insisting that he does not need any baize, and the father, the son and the rich man are all boring and witless idiots. (kosztolányi , p. ; translation is mine.)

liszner does not treat abstract mathematical manipulation as an extension of reality that can be usefully integrated into the interpretation of the world; rather, he separates numerical operations from everyday phenomena. thus, his rejection of calculation reaffirms the effects of quantification instead of adequately answering this challenge. this unsuccessful situation is also reflected in the story: liszner is a narrow-minded student who, after failing the graduation exam, even beats his high-school professor of physics and mathematics, antal novák.

professor novák is seemingly the opposite of liszner. while he handles scientific problems self-confidently, he is unable to tolerate emotions becoming a more important motivation for human behavior than pure reason. committing suicide, he becomes a tragic hero, but the fact that he can hardly understand other people, including his own daughter, makes him a comic figure too. ironically, both novák and liszner, just from opposite sides, suffer from the cognitive dissonance that comes from the detachment of actual physical phenomena from abstract ideas. no character in the novel is able to find a harmonious relation of the different measures attached to different discourses. in this situation, the characters are also unable to communicate with each other or express their desires, fears, compassion, and empathy. by lacking solutions, the golden kite shows the problems that arise from the unmediated opposition between the organizing power of calculation and the irrational realms of instinctual motives.

the last model i mention in this essay is based on philosophical reasoning.
when mathematical relations are adapted along this model, strictly numerical operations are transformed into logical inferences or mental operations. since these are rather procedural features, they are in close relation to the poetic functions of literary discourse, i.e., to the textual strategies by which the story as a chain of events and the fictive world unfolds in the act of reading. the representation of operations, including cognitive modeling of the world, necessarily involves self-understanding and thus sheds light on the reflective character of modern subjectivity. in this respect, this is the most abstract model for the interplay between calculation and literature, since numbers are overshadowed here by the machinery of abstract transformations. it is organized around operators that, by projecting fields onto one another, are able to constitute mathematical structures or, so to speak, correspondences between different worlds.

the novels by musil and kosztolányi as well as many of the diary entries by kafka embedded calculation in different discursive contexts, showing the ambivalence evoked by both the integration and rejection of abstract mathematical operations. this is, however, only one layer of kafka’s rich writing. deleuze and guattari ( ) have already called attention to the extensive presence of series in kafka’s works. in , kafka wrote to himself:

mend your ways, escape officialdom, start seeing what you are instead of calculating what you should become. […] as a link in the chain of calculation, they [flaubert, kierkegaard, and grillparzer] undoubtedly serve as useful examples—or rather useless examples, for they are part of the whole useless chain of calculation; all by themselves, however, the comparisons are useless right off. flaubert and kierkegaard knew very clearly how matters stood with them, were men of decision, did not calculate but acted.
but in your case—a perpetual succession of calculation, a monstrous four years’ up and down. (kafka , pp. – )

since there are more than two parts, it is not possible to arrange them in simple opposition, but their chain forms a sequence or a series. a biographical reason exists for this, inasmuch as kafka was a german-speaking person of jewish origin who lived in prague, which belonged to the austro-hungarian monarchy at that time. however, the iterative chain of transformations is also articulated as a textual experience in kafka’s literary works. the operation performed to produce sequences appears in narrative forms, in short stories like “poseidon,” “fellowship,” and “an imperial message.” the latter contains the following passage:

[the messenger] is only making his way through the chambers of the innermost palace; never will he get to the end of them; and if he succeeded in that nothing would be gained; he must next fight his way down the stair; and if he succeeded in that nothing would be gained; the courts would still have to be crossed; and after the courts the second outer palace; and once more stairs and courts; and once more another palace; and so on for thousands of years; and if at last he should burst through the outermost gate—but never, never can that happen—the imperial capital would lie before him, the center of the world, crammed to bursting with its own sediment. (kafka , p. )

in the mathematical sense, these stories raise the question of the limit of an infinite series and, to some extent, allude to zeno’s paradoxes reformulated as parable in “the next village.” according to zeno, achilles cannot reach the tortoise since their motion can be broken down into an infinite number of fragments, and in all of them, achilles’ distance from the tortoise is not zero.
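zeno's decomposition can be written out in a short worked form. the symbols are assumptions introduced here for illustration and are not taken from kafka or from the sources cited: achilles runs at constant speed $v$, the tortoise at constant speed $u < v$, and the tortoise starts with a head start $d_0 > 0$. each time achilles arrives where the tortoise was at the previous stage, a smaller but nonzero gap $d_n$ remains:

```latex
d_n = d_0 \left( \frac{u}{v} \right)^{n} > 0 \quad \text{for every } n \ge 0,
\qquad
\sum_{n=0}^{\infty} d_n = \frac{d_0}{1 - u/v} = \frac{d_0 \, v}{v - u} .
```

no single stage closes the gap, which is the observation the paradox rests on, while the stage distances form a geometric series with a finite sum. that sum equals the distance $v \cdot d_0 / (v - u)$ that achilles covers in the time $d_0 / (v - u)$ needed to draw level with the tortoise.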
nevertheless, it might be a misleading parallelism because, as is shown in mathematical analysis, the constructed series of distances converges to a finite limit, despite the infinite number of elements to be summed. it seems to be more consistent with the operation of successive approximation in kafka’s narratives if it is described as the result of a recursive process (corngold , pp. – ; beebee , p. ). beebee nested this approach, inspired by luhmann’s system theory, into the analysis of law and bureaucracy in the trial. to put it in a very simple way, the application of law requires the legal regulation of the process by which specific legal rules can be employed under different circumstances. in other words, rhetorical pattern, mathematical operation and legal system are joined in a somewhat fearful algorithm. this resonating structure reveals that, by the beginning of the twentieth century, slightly after the transformation of modern mathematics, this kind of recursive operation had become a general cultural practice.

kafka stands in a threefold and deeply ambivalent relationship to calculation. first, he curses time-consuming numerical tasks of administration. meanwhile, he permanently looks for an optimal timetable for writing, seemingly without calculation. finally, he widely applies a sequential form of calculation as a rhetorical pattern in his narrative works. modernist writers did not set aside critical remarks on counting and numerical reporting, yet their writings are substantially pervaded by mathematical operations, suggesting that calculation was never really a mere scientific project of quantification. opposed to the artificial separation of discourses, this tradition of critical integration can still motivate reflection on calculation as something that inseparably belongs to, but is not equal to, us.

in this essay, i have argued in favor of broadening the question of the relationship between calculation and literature beyond the function of numbers in literary work.
calculation is not a uniform practice of simple numerical operations but a diverse field of discourses with different axiological implications and cultural functions. there are several models through which calculation enters literature. out of these numerous possibilities, the models based on economic equilibrium, physical explanation, and philosophical inference were analyzed here. these three models, although not completely separable, differ from each other in several respects. they rely upon different facets of calculation as a multimodal cultural practice: statistics (and probability), measurement (and quantifiable observation), and operations (and cognitive processes). they are most properly expressed by different generic or textual features, i.e., by the enumerative records of events, the integrative structure of motifs, or the narrative iteration of operations. finally, all of these models play a specific role in determining the position and structure of modern subjectivity: (self-)regulation oscillates between equilibrium and progress, (self-)expression evokes the scope of personal interaction and the level of psychological consistency, and (self-)reflection induces increased awareness and processual identification of the subject. at this point, two conclusions have to be drawn. first, since the models are strictly interconnected and their characteristics cannot be defined exclusively, the features mentioned above mark only differences in emphasis, and form a matrix with highly correlated elements. second, as the advocates of the empirical research of digital humanities suggest a complex interaction between traditional and newborn forms of reading, the pervasive experience of calculation in modern literature also constitutes an intriguing synergy of cultural practices, thus also providing a critical view on the techniques of quantification.
acknowledgements

open access funding provided by eötvös loránd university (elte). this paper was supported by the jános bolyai research scholarship of the hungarian academy of sciences.

open access

this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made.

references

badiou, a. ( ). number and numbers (r. mackay, trans.). cambridge: polity press.
beebee, t. o. ( ). citation and precedent: conjunctions and disjunctions of german law and literature. new york and london: continuum.
bode, k. ( ). a world of fiction: digital collections and the future of literary history. ann arbor: university of michigan press.
bode, k., & dixon, r. (eds.). ( ). resourceful reading: the new empiricism, eresearch, and australian literary culture. sydney: sydney university press.
campe, r. ( ). the game of probability: literature and calculation from pascal to kleist (e. h. wiggins, jr., trans.). stanford: stanford university press.
connor, s. ( ). living by numbers: in defence of quantity. london: reaktion books.
corngold, s. ( ). kafka’s later stories and aphorisms. in j. preece (ed.), the cambridge companion to kafka (pp. – ). cambridge: cambridge university press.
csáth, g. ( ). sötét örvénybe süllyedek: naplófeljegyzések és visszaemlékezések – [i sink into a deep maelstrom: diaries and memoires – ]. ed. e. e. molnár, and m. szajbély. budapest: magvető.
deleuze, g., & guattari, f. ( ). kafka: toward a minor literature (d. polan, trans.). minneapolis and london: university of minnesota press.
dipert, r. r. ( ). mathematics in musil. in w. huemer & m.-o. schuster (eds.), writing the austrian traditions: relations between philosophy and literature (pp. – ). edmonton: wirth-institute for austrian and central european studies.
hacking, i. ( ). the taming of chance. cambridge: cambridge university press.
hayles, n. k. ( ). how we think: digital media and contemporary technogenesis. chicago and london: the university of chicago press.
juola, p., & ramsay, s. ( ). six septembers: mathematics for the humanist. lincoln: zea books.
kafka, f. ( ). diaries (m. brod, ed., m. greenberg & h. arendt, trans.). new york: schocken books.
kafka, f. ( ). an imperial message. in the complete stories (w. muir & e. muir, trans.) (pp. – ). new york: schocken books.
kosztolányi, d. ( ). aranysárkány [the golden kite]. pozsony: kalligram.
krämer, s. ( ). writing, notational iconicity, calculus: on writing as a cultural technique. mln, , – .
kuhn, t. s. ( ). the function of measurement in modern physical science. isis, , – .
maeterlinck, m. ( ). mystic morality. in the treasure of the humble (a. sutro, trans.) (pp. – ). london: george allen.
móricz, z. ( ). naplók: – [diaries] (ed., a. cséve & j. z. szilágyi). budapest: noran.
musil, r. ( ). young törless (e. wilkins & e. kaiser, trans.). new york: pantheon.
nirenberg, r. l., & nirenberg, d. ( ). badiou’s number: a critique of mathematics as ontology. critical inquiry, , – .
porter, t. m. ( ). trust in numbers: the pursuit of objectivity in science and public life. princeton: princeton university press.
snow, c. p. ( ). the two cultures. an introduction by stefan collini. cambridge: cambridge university press.
tóth, Á. ( ). levelei [letters] (ed., l. kardos, p. kardos & g. kocztur). budapest: akadémiai.

publisher’s note

springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
on the features of translationese

vered volansky, noam ordan, and shuly wintner∗
department of computer science, university of haifa
mount carmel, haifa, israel

abstract

much research in translation studies indicates that translated texts are ontologically different from original, non-translated ones. translated texts, in any language, can be considered a dialect of that language, known as ‘translationese’. several characteristics of translationese have been proposed as universal in a series of hypotheses. in this work we test these hypotheses using a computational methodology that is based on supervised machine learning. we define several classifiers that implement various linguistically-informed features, and assess the degree to which different sets of features can distinguish between translated and original texts. we demonstrate that some feature sets are indeed good indicators of translationese, thereby corroborating some hypotheses, whereas others perform much worse (sometimes at chance level), indicating that some ‘universal’ assumptions have to be reconsidered.

introduction

this work addresses the differences between translated (t) and original (o), non-translated texts. these differences, to which we refer as ‘features’, have been discussed and studied extensively by translation scholars in the last three decades. in this work we employ computational means to investigate them quantitatively. focusing only on english, our main methodology is based on machine learning, more specifically an application of machine learning to text classification.

the special status of translated texts is a compromise between two forces, fidelity to the source text, on the one hand, and fluency in the target language, on the other hand.
both forces exist simultaneously: some ‘fingerprints’ of the source text are left on the target text, and at the same time the translated text includes shifts from the source so as to be more fluent and produce a better fit to the target language model. the differences between o and t have been studied empirically since the s by translation scholars on computerized corpora (laviosa, ), but only recently, since baroni and bernardini ( ), has it been shown that distinguishing between o and t can be done automatically with a high level of accuracy.

(∗ this is the authors’ pre-print copy, which differs from the final publication.)

toury ( ) paved the way for studying translated texts in comparison to target language texts, ignoring the source text altogether. the idea behind this move was that translations as such, regardless of the source language, have something in common, certain stylistic features governed by so-called translation norms, and therefore, in order to learn about these special marks of translation, the right point of reference is non-translated texts. baker ( ) calls for compiling and digitizing ‘comparable corpora’ and using them to study ‘translation universals’, such as simplification, the tendency to make the source text simpler lexically, syntactically, etc., or explicitation, the tendency to render implicit utterances in the original more explicit in the translation. this call sparked a long-lasting quest for translation universals, and several works test such hypotheses in many target languages, including english, finnish, hungarian, italian and swedish (mauranen and kujamäki, ; mauranen, ).

in this study we refrain from the token ‘universal’ and focus instead on ‘features’. this terminological choice has several reasons. first, the focus is mostly on data and empirical findings, and less on translation theory as such.
whereas the features are motivated by and organized according to theoretical categories, we admit that certain features can belong to more than one theoretical category (see section ). second, we show that cer- tain features (such as mean sentence length) are highly dependent on the source language, and in general many of the features have a skewed distri- bution (again, see section ); we therefore cast doubt on the universality of ‘universals’. third, the term ‘feature’ (or sometimes ‘attribute’) is common in machine learning parlance, which is the main methodology used in this study. this paper uses machine learning algorithms to distinguish between o and t. in particular, we apply text classification, a methodology that has been used for classifying texts according to topic, genre, etc. (sebastiani, ), but also for authorship attribution, classification of texts according to their authors’ gender, age, provenance, and more (koppel et al., ). this methodology has been successfully applied to studying o vs. t in various datasets and in different source and target languages (see section ). in most of these works the focus is on computational challenges, namely classifying with high accuracy, expanding to more scenarios (for example, cross-domain classification), and minimizing the samples on which the computer is trained, so the attribution can be done on smaller portions of texts. our study, in contrast, employs this methodology to examine a list of features of ‘translationese’ suggested by translation scholars, with an eye to the question whether some of these features can be utilized to tell o from t. the main contribution of this work is thus theoretical, rather than prac- tical: we use computational means to investigate several translation studies hypotheses, corroborating some of them but refuting others. 
more gener- ally, we advocate the use of automatic text classification as a methodology for investigating translation studies hypotheses in general, and translation universals in particular. after reviewing related work in the next section, we detail our method- ology in section . we describe several translation studies hypotheses in section , explaining how we model them computationally in terms of fea- tures used for classification. the results of the classifiers are reported in section , and are analyzed and discussed in section . we conclude with suggestions for future research. related work numerous studies suggest that translated texts differ from original ones. gellerstam ( ) compares texts written originally in swedish and texts translated from english into swedish. he notes that the differences between them do not indicate poor translation but rather a statistical phenomenon, which he terms translationese. the features of translationese were theoreti- cally organized under the terms laws of translation or translation universals. toury ( , ) distinguishes between two laws: the law of interfer- ence and the law of growing standardization. the former pertains to the fingerprints of the source text that are left in the translation product. the latter pertains to the effort to standardize the translation product according to existing norms in the target language and culture. the combined effect of these laws creates a hybrid text that partly corresponds to the source text and partly to texts written originally in the target language, but in fact is neither of them (frawley, ). baker ( ) suggests several candidates for translation universals, which are claimed to appear in any translated text, regardless of the source lan- guage: “features which typically occur in translated text rather than orig- inal utterances and which are not the result of interference from specific linguistic systems” (baker, , p. ). 
consequently, there is no need to study translations vis-à-vis their source. the corpus needed for such study is termed comparable corpus, where translations from various source languages are studied in comparison to non-translated texts in the same language, holding genre, domain, time frame, etc. constant. (note: this term should be distinguished from comparable corpus in computational linguistics, where it refers to texts written in different languages that contain similar information.) among the better known universals are simplification and explicitation, defined and discussed thoroughly by blum-kulka and levenston ( , ) and blum-kulka ( ), respectively.

following baker ( ), a quest for the holy grail of translation universals began, culminating in mauranen and kujamäki ( ). chesterman ( ) distinguishes between s-universals and t-universals. s-universals are features that can be traced back to the source text, and include, among others, lengthening, interference, dialect normalization and reduction of repetitions. t-universals, on the other hand, are features that should be studied vis-à-vis non-translated texts in the target language, i.e., by using a comparable corpus. these include such features as simplification, untypical patterning and under-representation of target-language-specific items. this distinction classifies putative translation universals into two categories, each of which calls for a different kind of corpus, parallel for s-universals and comparable for t-universals. we cast all our features in a comparable corpus setting (t-universals). our assumption is that if a feature is reflected in translations from several languages into english, it is very likely present in the source texts from which these translations were generated. future study could very well verify this assumption.

in the last decade, corpora have been used extensively to study translationese.
for example, al-shabab ( ) shows that translated texts ex- hibit lower lexical variety (type-to-token ratio) than originals; laviosa ( ) shows that their mean sentence length is lower, as is their lexical density (ratio of content to non-content words). both these studies provide evidence for the simplification hypothesis. corpus-based translation studies became a very prolific area of research (laviosa, ). text classification methods have only recently been applied to the task of identifying translationese. baroni and bernardini ( ) use a two-million- token italian newspaper corpus, in which % of the texts are translated from various source languages the proportions of which are not reported. they train a support vector machine (svm) classifier using unigrams, bi- grams and trigrams of surface forms, lemmas and part-of-speech (pos) tags. they also experiment with a mixed mode, in which function words are left intact but content words are replaced by their pos tags. the best accu- racy, . %, is obtained using a combination of lemma and mixed unigrams, bigrams and pos trigrams. extracting theoretically interesting features, they show that italian t includes more ‘strong’ pronouns, implying that translating from non-pro-drop languages to a pro-drop one, like italian, is marked on t. in other words, if a certain linguistic feature is mandatory in the source language and optional in the target language, more often than not it will be carried over to the target text. this is a clear case of positive interference, where features that do exist in o have greater likelihood to be selected in t. in contrast, there are cases of negative interference, where features common in o are under-represented in t (toury, ), and more generally, “[t]ranslations tend to under-represent target-language-specific, unique linguistic features and over-represent fea- tures that have straightforward translation equivalents which are frequently used in the source language” (eskola, , p. ). 
note that baroni and bernardini ( ) use lemmas as features; this can artificially inflate the accuracy of the classifier since lemmas reflect topic and domain information rather than structural differences between the two classes of texts.

inspired by baroni and bernardini ( ), kurokawa et al. ( ) use a mixed text representation in which content words are replaced by their corresponding pos tags, while function words are retained. the corpus here is the canadian hansard, which consists of texts in english and canadian french and translations in both directions, drawn from official records of the proceedings of the canadian parliament. classification is performed at both the document and the sentence level. interestingly, they demonstrate that learning the direction is relevant for statistical machine translation: they train systems to translate between french and english (and vice versa) using a french-translated-to-english parallel corpus, and then an english-translated-to-french one. they find that in translating into french it is better to use the latter parallel corpus, and when translating into english it is better to use the former. the contribution of knowledge of the translation direction to machine translation is further corroborated in a series of works (lembersky et al., , a,b).

van halteren ( ) shows that there are significant differences between texts translated from different source languages to the same target language in europarl (koehn, ). the features are – -grams of tokens that appear in at least % of the texts of each class. there are × classes: an original and translations from and into the following: danish, english, french, german, italian and spanish. tokens appearing in less than % of the texts in each class are replaced with 〈x〉. thus, for example, are right 〈x〉 is a marker of translations from german, while conditions of 〈x〉 is a marker of translations from french.
the % threshold does not totally exclude content words, and therefore many markers reflect cultural differences, most notably the form of address ladies and gentlemen which is highly frequent in the translations but rare in original english. ilisei et al. ( ) test the simplification hypothesis using machine learn- ing algorithms. as noted earlier, certain features, such as average sentence length, do not provide a rich model, and cannot, by themselves, discrimi- nate between o and t with high accuracy. therefore, in addition to the ‘simplification features’, the classifier is trained on pos unigrams, and then each simplification feature is included and excluded and the success rate in both scenarios is compared. they then conduct a t-test to check whether the difference is statistically significant. ilisei et al. ( ) define several ‘simplification features’, including aver- age sentence length; sentence depth (as depth of the parse tree); ambigu- ity (the average number of senses per word); word length (the proportion of syllables per word); lexical richness (type/token ratio); and information load (the proportion of content words to tokens). working on spanish, the most informative feature for the task is lexical richness, followed by sentence length and the proportion of function words to content words. both lexical richness and sentence length are among the simplification features and are therefore considered to be indicative of the simplification hypothesis. all in all, they succeed in differentiating between translated and non-translated texts with . % accuracy and conclude that simplification features exist and heavily influence the results. these results pertain to spanish translated from english; ilisei and inkpen ( ) extend the results to romanian, us- ing by and large the same methodology, albeit with somewhat more refined features. 
furthermore, ilisei ( ) experiments also with the explicitation hypothesis (in spanish and romanian), defining mainly features whose val- ues are the proportion of some part of speech categories in the text. our work is similar in methodology, but is much broader in scope. while ilisei et al. ( ); ilisei and inkpen ( ) use their simplification features to boost the accuracy of the classifier, our goal is different, as we are interested not in the actual accuracy of any feature by itself, but in its contribution, if any, to the classification and translation process. we test some of the simpli- fication features on english translated from ten source languages vs. original english. we also add more simplification features to those introduced by ilisei et al. ( ); ilisei and inkpen ( ) to test the simplification hypoth- esis. most importantly, we add many more features that test a large array of other hypotheses. koppel and ordan ( ) aim to identify the source language of texts translated to english from several languages, and reason about the simi- larities or differences of the source languages with respect to the accuracy obtained from this experiment. the data are taken from the europarl corpus, and include original english texts as well as english texts translated from finnish, french, german, italian and spanish. in order to abstract from content, the only features used for classification are frequencies of func- tion words. koppel and ordan ( ) can distinguish between original and translated texts with . % accuracy; they can identify the original language with . % accuracy; and they can train a classifier to distinguish between original english and english translated from language l , and then use the same classifier to differentiate between original english and english trans- lated from l , with accuracies ranging from % to . %. interestingly, the success rate improves when l and l are typologically closer. 
thus, training on one romance language and testing on another yields excellent results between . %- . % (there are such pairs for french, italian and spanish). the poor results ( %) of training on t from finnish and testing on t from italian or spanish, for example, cast doubt on the concept of ‘transla- tion universals’. it shows that translationese is highly dependent on the pair of languages under study. although koppel and ordan ( ) manage to train on all the t components vs. o and achieve a good result distinguishing between o and t ( . %), it is exactly their main finding of pair-specific dependence that may tie this success to their corpus design: three fifths of their corpus belong to the same language family (romance), another fifth of translations from german is also related (germanic), and only the last fifth, finnish, is far removed (finno-ugric). in our experiments we use a wider range of source languages in an effort to neutralize this limitation: romance (italian, portuguese, spanish, and french); germanic (german, danish, and dutch); hellenic (greek); and finno-ugric (finnish). popescu ( ), too, identifies translationese with machine-learning meth- ods. he uses a corpus from the literary domain, mainly books from the nineteenth century. the corpus contains books, half of which ( ) are originally written in british and american english. the other half is of translated english, from french and from german. the book do- mains are varied and translations are ensured to be of at least minimal quality. popescu ( ) uses character sequences of length , ignoring sen- tence boundaries, for classification. he achieves . % to % accuracy using different cross-validation methods. when training on british english and translations from french, and testing on american english and trans- lations from german, the accuracy is . %. he then uses the original french corpus to eliminate proper names, still at the character level, and achieves . % accuracy. 
by mixing american and british texts, . % accuracy is achieved. this work has many advantages from the engineering point of view: extracting characters is a trivial text-processing task; the methodology is language-independent, and with some modifications it can be applied, for example, to chinese script, where segmenting words is a non-trivial task; it does not impose any theoretical notions on the classification; last, the model for o and t is very rich since there are many possible character n-gram values (like the, of, -ion, -ly, etc.) and therefore the model can fit different textual scenarios on which it is tested. similarly to popescu ( ), we use simple n-gram characters, n = , , , among many other features. the higher n is, the more we can learn about translationese, as we show in section . still, it should be noted that character n-grams can capture lexical information, which, like lemmas, may reflect topic and domain information rather than structure. in contrast to some previous works, we use the machine-learning method- ology with great care. first, we compile a corpus with multiple source lan- guages, from diverse language families; we balance the proportion of each language within the corpus, and provide detailed information that can be used for replicating our results. second, we totally abstract away from con- tent so as to be unbiased by the topics of the corpora. a classifier that uses as features the words in the text, for example, is likely to do a good job telling o from t simply because certain words are culturally related to the source language from which the texts are translated (e.g., the word paris in texts translated from french). we provide data on such classifiers, but only as a “sanity check”. 
furthermore, while previous works used this methodology to investigate the simplification hypothesis, we use it to investigate a wide array of translation studies hypotheses, including simplification, explicitation, normalization and interference. finally, and most importantly, we use a plethora of linguistically informed features to learn more about the nature of translationese.

methodology

our main goal in this work is to study the features of translated texts. our methodology is corpus-based, but instead of computing quantitative measures of o and t directly, we opt for a more sophisticated, yet more revealing methodology, namely training classifiers on various features, and investigating the ability of different features to accurately distinguish between o and t. we now detail the methodology and motivate it.

text classification with machine learning

in supervised machine learning, a classifier is trained on labeled examples whose classification is known a priori. the current task is a binary one, namely there are only two classes: o and t. each instance in the two classes has to be represented: a set of numeric features is extracted from the instances, and a generic machine-learning algorithm is then trained to distinguish between feature vectors representative of one class and those representative of the other. for example, one set of features for natural texts could be the words (or tokens) in the text; the values of these features are the number of occurrences of each word in the instance. this set of features is extracted from the text instances in both classes, and then each of the classes is modeled differently such that there is a model for what o should look like and a model for what t should look like. given enough data for training and given that the features are indeed relevant, the trained classifier can then be given an ‘unseen’ text, namely a text that is not included in the training set.
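To make this representation concrete, here is a minimal Python sketch of the token-count features just described. The vocabulary, the whitespace tokenizer and the function name `count_features` are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def count_features(text, vocabulary):
    """Map a text to a vector of token counts over a fixed vocabulary."""
    # Assumption: lowercased whitespace splitting stands in for a real tokenizer.
    counts = Counter(text.lower().split())
    # One value per vocabulary entry; words absent from the text get 0.
    return [counts[word] for word in vocabulary]


# Hypothetical vocabulary and instance, for illustration only.
vocabulary = ["the", "of", "which", "however"]
vector = count_features("the cost of the delay", vocabulary)
print(vector)  # [2, 1, 0, 0]
```

Every instance, whether used for training or for testing, is mapped through the same vocabulary, so all feature vectors have the same length and the same meaning per position.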
such a text is again represented by a feature vector in the same manner, and the classifier can predict whether it belongs to the o class or to the t class. such unseen texts are known as the “test set”.

one important property of such classifiers is that they assign “weights” to the features used for classification, such that significant features are assigned higher weights. due to potential dependencies among features, some features may be assigned weights that diminish their importance on their own, as they do not add any important data to the classifier. this means that low weights are not always very reliable; but if a feature is assigned a high weight, it is certainly a good indication of a significant difference between the two classes (the inverse does not necessarily hold).

(note: a terminological note is in place: throughout this paper, o and t refer to texts written in the same language, specifically in english. the languages from which t was translated are therefore referred to as the source languages. when we say french, for example, we mean texts translated to english from french.)

motivation

applying machine learning algorithms to identify the class of the text (o or t) is thus a sound methodology for assessing the predictive power of a feature set. this is by no means a call to abandon traditional significance tests as a tool to learn about differences between texts, and in fact, we use both in this study. but text classification is more robust, in the sense that it reflects not just average differences between classes, but also the different distributions of features across the classes, in a way that facilitates generalization: prediction of the class of new, unseen examples.

to illustrate this point, consider the case of punctuation marks. we compare the frequencies of several marks in o and t.
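As a hedged illustration of this kind of comparison, the Python sketch below computes the per-token relative frequency of one mark in each class, the O/T ratio, and a log-likelihood (G2) statistic. The two-corpus G2 formulation used here is a common one in corpus linguistics and is an assumption; the exact test variant used in the study may differ. All counts are invented.

```python
import math


def relative_frequency(count, total_tokens):
    """Occurrences of an item per token."""
    return count / total_tokens


def log_likelihood(count_o, total_o, count_t, total_t):
    """Two-corpus log-likelihood (G2) for one item (assumed formulation)."""
    combined = count_o + count_t
    # Expected counts if the item were equally frequent in both corpora.
    expected_o = total_o * combined / (total_o + total_t)
    expected_t = total_t * combined / (total_o + total_t)
    g2 = 0.0
    for observed, expected in ((count_o, expected_o), (count_t, expected_t)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Invented counts for a single punctuation mark, for illustration only.
count_o, total_o = 400, 10_000
count_t, total_t = 500, 10_000

ratio = relative_frequency(count_o, total_o) / relative_frequency(count_t, total_t)
g2 = log_likelihood(count_o, total_o, count_t, total_t)
# With one degree of freedom, G2 above 3.84 corresponds to p < 0.05.
print(f"O/T ratio = {ratio:.2f}, G2 = {g2:.2f}")
```

A significance test of this kind summarizes the difference for one item at a time, whereas a classifier weighs all items jointly, which is the contrast the discussion of punctuation marks draws.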
table summarizes the data: for each punctuation mark, it lists the relative frequency (per token) in o and t; the ratio between o and t (‘ratio’); whether the feature in question typifies o or t according to a log-likelihood (ll) test (p < . ); and the strength of the weight assigned to the feature by a particular classifier (section . ), where t is the most prominent feature of translation (the one with the highest weight, as determined by the classifier), t the second most prominent, and so on; the same notation from o to o is applied to o.

mark    frequency in o    frequency in t    ratio    ll      weight
,       .                 .                 .        t       t
(       .                 .                 .        t       t
’       .                 .                 .        t       t
)       .                 .                 .        t       t
/       .                 .                 .        none    none
[       .                 .                 .        t       none
]       .                 .                 .        t       none
”       .                 .                 .        o       o
!       .                 .                 .        o       o
.       .                 .                 .        o       o
:       .                 .                 .        none    o
;       .                 .                 .        none    o
?       .                 .                 .        o       o
-       .                 .                 .        o       o

table : summary data for punctuation marks across o and t

the most prominent marker of t according to the classifier is the comma, which is indeed about . times more frequent in t than in o. there are punctuation marks for which the ratio is much higher; for example, square brackets are about . times more frequent in t. but their frequency in the corpus is very low, and therefore, this difference is not robust enough to make a prediction. theoretically it may be interesting to note that in translations from swedish into english, for example, there are four times more square brackets than in original english, but this plays no significant role in the classification task.

conversely, there are cases where the critical value of ll is not significant by common standards, but it does play a role in classification. such is the case of the colon. the ratio o/t is . and the critical value is . , namely p < . . this value still accounts for % of the cases and although common statistical wisdom would rule it out as an insignificant feature, it does play a significant role in telling o from t using text classification techniques.

the parentheses appear almost always together.
we notice rare cases of ‘(’ appearing by itself, usually as a result of a tokenization problem, and some cases of the right parenthesis ‘)’ appearing by itself, usually in enumerating items within a paragraph, a common notation in non-english languages and therefore three times more frequent in t than in o ( vs. cases, respectively). although the raw frequency of both ‘(’ and ‘)’ is about the same and although the ratio between their frequency in o and their frequency in t is nearly identical, ‘(’ appears to be a better marker of t according to the classifier. when a classifier is presented with two highly dependent features it may ignore one of them altogether. this does not mean the ignored feature is not important; it only means it does not add much new information.

in summary, we use text classification algorithms to measure the robustness of each feature set. we are interested in differences between o and t, but we are also interested in finding out how revealing these features are, how prominent in the marking of translated text, to the effect that they have a predictive power. we use the information produced by the classifiers to provide a preliminary analysis. then, to make a finer analysis, we check some of the features manually and conduct significance tests. we believe that using text classification techniques provides a good tool to study the makeup of translated texts in a general way, on the one hand, and that using statistical significance tests on occasion enables the researcher to look at less frequent events which are no doubt part of the story of translationese, on the other hand.

experimental setup

the main corpus we use is the proceedings of the european parliament, europarl (koehn, ), with approximately million tokens in english (o) and the same number of tokens translated from source languages (t): danish, dutch, finnish, french, german, greek, italian, portuguese, spanish, and swedish.
although the speeches are delivered orally (often read out from written texts), they can be considered translation rather than interpretation, since the proceedings are produced in the following way:
1. the original speech is transcribed and minimally edited;
2. the text is sent to the speaker, who may edit it further;
3. the resulting text is translated into the other official languages.
(we are grateful to emma wagner, vicki brett, and philip cole (eu, head of the irish and english translation unit) for this information.)

the corpus is first tokenized and then partitioned into chunks of approximately tokens (ending on a sentence boundary). the purpose of this is to make sure that the length of an article does not interfere with the classification. we thus obtain chunks of original english, and chunks of translations from each of the ten source languages. we then generate pos tags for the tokenized texts. we use the uiuc ccg sentence segmentation tool (http://cogcomp.cs.illinois.edu/page/tools_view/ , accessed august ) to detect sentence boundaries; and opennlp (http://incubator.apache.org/opennlp/, accessed august ), with the default maxent tagger and penn treebank tagset, to tokenize the texts and induce pos tags.

we use the weka toolkit (hall et al., ) for classification; in all experiments, we use svm (smo) as the classification algorithm, with the default linear kernel. we employ ten-fold cross-validation and report accuracy (percentage of chunks correctly classified). since the classification task is binary and the training corpus is balanced, the baseline is 50%.

hypotheses

we test several translation studies hypotheses. in this section we list each hypothesis, and describe how we model it in terms of the features used for classification. feature design is a sophisticated process. in determining the feature set, the most important features must:
1. reflect frequent linguistic characteristics we would expect to be present in the two types of text;
2. be content-independent, indicating formal and stylistic differences between the texts that are not derived from differences in contents, domain, genre, etc.; and
3. be easy to interpret, yielding insights regarding the differences between original and translated texts.

(note on ten-fold cross-validation: 90% of the annotated data are used for training, and the remaining 10% are used for testing. this process is repeated ten times, with different splits of the data, and the ten results are averaged. this improves the robustness of the evaluation and reduces the risk of over-fitting to the training data.)

we focus on features that reflect structural properties of the texts, some of which have been used in previous works. we now define the features we explore in this work; for each feature, we provide a precise definition that facilitates replication of our results, as well as a hypothesis on its ability to distinguish between o and t, based on the translation studies literature.

when generating many of the features, we normalize the feature’s value, v, by the number of tokens in the chunk, n: v′ = v× /n. this balances the values over chunks that have slightly more or less than tokens each (recall that chunks respect sentence boundaries). henceforth, when describing a normalized feature, we report v′ rather than v. we also multiply the values of some features by some power of , rounding the result to the nearest integer, in order to have a set of values that is easier to compare. this does not affect the classification results.

simplification

simplification refers to the process of rendering complex linguistic features in the source text into simpler features in the target text.
strictly speaking, this phenomenon can be studied only vis-à-vis the source text, since ‘simpler’ is defined here in reference to the source text, where, for example, the practice of splitting sentences or refraining from complex subordinations can be observed. and indeed, this is how simplification was first defined and studied in translation studies (blum-kulka and levenston, ; vanderauwerea, ). baker ( ) suggests that simplification can be studied by comparing translated texts with non-translated ones, as long as both texts share the same domain, genre, time frame, etc. in a series of corpus-based studies, laviosa ( , ) confirms this hypothesis. ilisei et al. ( ) and ilisei and inkpen ( ) train a classifier enriched by simplification features and bring further evidence for this universal in romanian and spanish.

we model the simplification hypothesis through the following features. (of the features we define in this section, the first five were also implemented by ilisei et al. ( ), who, in addition, added sentence depth (as the depth of the parse tree) and ambiguity (as the average number of senses per word).)

lexical variety: the assumption is that original texts are richer in terms of vocabulary than translated ones, as hypothesized by baker ( ) and studied by laviosa ( ). lexical variety is known to be an unstable phenomenon which is highly dependent on corpus size (tweedie and baayen, ). we therefore use three different type-token ratio (ttr) measures, following grieve ( ), where v is the number of types and n is the number of tokens per chunk. all three versions consider punctuation marks as tokens.
1. v/n, magnified by order of .
2. log(v)/log(n), magnified by order of .
3. ×log(n)/( −v1/v), where v1 is the number of types occurring only once in the chunk.

mean word length (in characters): we assume that translated texts use simpler words, in particular shorter ones. punctuation marks are excluded from the tokens in this feature.
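the three type-token ratio measures listed above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors' implementation: the function name `ttr_measures` and the `scale` parameter are hypothetical (the magnification constants are not recoverable from the text), and the third variant is assumed to have the honoré-style form log(n)/(1 − v1/v).

```python
import math
from collections import Counter


def ttr_measures(tokens, scale=1.0):
    """Return the three type-token ratio variants for one chunk.

    `scale` is a hypothetical stand-in for the magnification constants,
    whose values are not given in the text above.
    """
    n = len(tokens)                    # tokens, punctuation marks included
    if n < 2:
        raise ValueError("need at least two tokens (log(n) must be non-zero)")
    counts = Counter(tokens)
    v = len(counts)                    # number of types
    v1 = sum(1 for c in counts.values() if c == 1)  # types occurring once

    ttr1 = scale * v / n
    ttr2 = scale * math.log(v) / math.log(n)
    # Assumed Honore-style form log(n) / (1 - v1/v); it is undefined when
    # every type occurs exactly once, so report infinity in that case.
    ttr3 = scale * math.log(n) / (1 - v1 / v) if v1 < v else math.inf
    return ttr1, ttr2, ttr3


chunk = ["the", "cat", "saw", "the", "dog", "."]
print(ttr_measures(chunk))
```

the third variant is the only one that looks at hapax legomena, which is why it can behave differently from the first two on chunks of slightly different sizes.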
syllable ratio: we assume that simpler words are used in translated texts, resulting in fewer syllables per word. we approximate this feature by counting the number of vowel sequences that are delimited by consonants or spaces in a word, normalized by the number of tokens in the chunk.

lexical density: this measure, also used by laviosa ( ), is the frequency of tokens that are not nouns, adjectives, adverbs or verbs. it is computed by dividing the number of tokens whose pos tags do not begin with j, n, r or v by the number of tokens in the chunk.

mean sentence length: splitting sentences is a common strategy in translation, which is also considered a form of simplification. baker ( ) considers it one of the universal features of ‘simplification’. long and complicated sentences may be simplified and split into short, simple sentences. hence we assume that translations contain shorter sentences than original texts. we consider punctuation marks as tokens in the computation of this feature.

mean word rank: we assume that less frequent words are used more often in original texts than in translated ones. this is based on the observation of blum-kulka and levenston ( ) that translated texts “make do with less words” and the application of this feature by laviosa ( ). a theoretical explanation is provided by halverson ( ): translators use more prototypical language, i.e., they “regress to the mean” (shlesinger, ). to compute this, we use a list of english most frequent words (http://www.insightin.com/esl/, accessed august ), and consider the rank of words (their position in the frequency-ordered list). the maximum rank is (since some words have equal ranks). we handle words that do not appear in the list in two different ways:
1. words not in this list are given a unique highest rank of .
2. words not in the list are ignored altogether.
values (in both versions) are rounded to the nearest integer.
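the two ways of handling out-of-list words in mean word rank can be sketched as follows. this is a minimal illustration, not the authors' code: the names `mean_word_rank`, `rank_of`, `unknown_rank` and `skip_unknown` are hypothetical, the toy rank table is invented for the example, and a token is treated as punctuation when it contains no alphanumeric character (an assumption).

```python
def mean_word_rank(tokens, rank_of, unknown_rank, skip_unknown=False):
    """Mean frequency rank of the words in a chunk, rounded to an integer.

    rank_of      -- dict mapping a word to its rank (1 = most frequent)
    unknown_rank -- rank given to words missing from the list (variant 1)
    skip_unknown -- if True, missing words are ignored instead (variant 2)
    """
    ranks = []
    for tok in tokens:
        if not any(ch.isalnum() for ch in tok):
            continue                   # punctuation marks are ignored
        word = tok.lower()
        if word in rank_of:
            ranks.append(rank_of[word])
        elif not skip_unknown:
            ranks.append(unknown_rank)
    if not ranks:
        return 0                       # no rankable word in the chunk
    return round(sum(ranks) / len(ranks))


# Toy rank table, invented for illustration only.
toy_ranks = {"the": 1, "of": 2, "parliament": 50}
chunk = ["the", "parliament", "of", "europe", "."]
print(mean_word_rank(chunk, toy_ranks, unknown_rank=100))
print(mean_word_rank(chunk, toy_ranks, unknown_rank=100, skip_unknown=True))
```

the first variant pulls the mean up whenever a chunk contains rare words, while the second measures only how the listed words are distributed, so the two can rank the same chunk quite differently.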
all punc- tuation marks are ignored. most frequent words the normalized frequencies of the n most frequent words in the corpus. we define three features, with three different thresholds: n = , , . punctuation marks are excluded. . explicitation explicitation is the tendency to spell out in the target text utterances that are more implicit in the source. like simplification, this ‘universal’ can be directly observed in t only in reference to o; if there is an implicit causal relation between two phrases in the source text and a cohesive marker such as because is introduced in target text, then it could be said with confidence that explicitation took place. but explicitation can also be studied by con- structing a comparable corpus (baker, ), and it is fair to assume that if there are many more cohesive markers in t than in o (in a well-balanced large corpus like europarl), it could serve as an indirect evidence of explicitation. blum-kulka ( ) develops and exemplifies this phenomenon in transla- tions from hebrew to english, and Øver̊as ( ) compiles a parallel bidirec- tional norwegian-english and english-norwegian corpus to provide further evidence for explicitation. koppel and ordan ( ) find that some of the prominent features in their list of function words are cohesive markers, such as therefore, thus and consequently. the first three classifiers below are inspired by an example provided by baker ( , pp. - ), where the clause the example of truman was always present in my mind is rendered into arabic with a fairly long para- graph, which includes the following: in my mind there was always the ex- ample of the american president harry truman, who succeeded franklin roosevelt.... explicit naming we hypothesize that one form of explicitation in transla- tion is the use of a proper noun as a spelling out of a personal pronoun. we calculate the ratio of personal pronouns to proper nouns, both singular and plural, magnified by an order of . 
see also ‘pronouns’, section . . single naming the frequency of proper nouns consisting of a single token, not having an additional proper noun as a neighbor. this can be seen in an exaggerated form in the example above taken from baker ( , pp. - ). as a contemporary example, it is common to find in german news (as of ) the single proper name westerwelle, but in translating german news into another language, the translator is likely to add the first name of this person (guido) and probably his role, too (minister of foreign affairs). mean multiple naming the average length (in tokens) of proper nouns (consecutive tokens tagged as proper nouns), magnified by an order of . the motivation for this feature is the same as above. cohesive markers translations are known to excessively use certain co- hesive markers (blum-kulka, ; Øver̊as, ; koppel and ordan, ). we use a list of such markers, based on koppel and ordan ( ); see appendix a. . each marker in the list is a feature, whose value is the frequency of its occurrences in the chunk. . normalization translators take great efforts to standardize texts (toury, ), or, in the words of (baker, , p. ), they have “a strong preference for conven- tional ‘grammaticality’”. we include in this the tendency to avoid repeti- tions (ben-ari, ), the tendency to use a more formal style manifested in refraining from the use of contractions (olohan, ), and the tendency to overuse fixed expressions even when the source text refrains, sometime deliberately, from doing so (toury, ; kenny, ). we model normalization through the following features: repetitions we count the number of content words (words tagged as nouns, verbs, adjectives or adverbs) that occur more than once in a chunk, and normalize by the number of tokens in the chunk. in- flections of the verbs be and have are excluded from the count since these verbs are commonly used as auxiliaries. this feature’s values are magnified by an order of . 
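the repetitions feature described above can be sketched as follows. this is a hedged illustration rather than the authors' implementation: the function name `repetitions`, the `scale` parameter and the exact list of be/have inflections are assumptions, content words are identified by penn treebank tags beginning with j, n, r or v, and "occur more than once" is read here as counting distinct word forms (counting tokens instead would be an equally plausible reading).

```python
from collections import Counter

CONTENT_TAG_INITIALS = ("J", "N", "R", "V")   # adjectives, nouns, adverbs, verbs
# Assumed list of inflections of "be" and "have" to exclude.
BE_HAVE = {"be", "am", "is", "are", "was", "were", "been", "being",
           "have", "has", "had", "having"}


def repetitions(tagged_tokens, scale=1.0):
    """Repeated content-word forms in a chunk, normalized by chunk length.

    tagged_tokens -- list of (word, pos_tag) pairs
    scale         -- hypothetical stand-in for the magnification constant
    """
    if not tagged_tokens:
        return 0.0
    counts = Counter(
        word.lower()
        for word, tag in tagged_tokens
        if tag.startswith(CONTENT_TAG_INITIALS) and word.lower() not in BE_HAVE
    )
    repeated = sum(1 for c in counts.values() if c > 1)
    return scale * repeated / len(tagged_tokens)


chunk = [("the", "DT"), ("report", "NN"), ("is", "VBZ"), ("a", "DT"),
         ("good", "JJ"), ("report", "NN"), ("and", "CC"), ("is", "VBZ"),
         ("clear", "JJ"), (".", ".")]
print(repetitions(chunk))
```

in the example chunk only "report" counts as a repetition: "is" also occurs twice, but it is an inflection of be and is therefore excluded.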
contractions: the ratio of contracted forms to their counterpart full form(s). if the full form has zero occurrences, its count is changed to . the list of contracted forms used for this feature is given in appendix a. .

average pmi: we expect original texts to use more collocations, and in any case to use them differently than translated texts. this hypothesis is based on toury ( ) and kenny ( ), who show that translations overuse highly associated words. we therefore use as a feature the average pmi (church and hanks, ) of all bigrams in the chunk. given a bigram $w_1 w_2$, its pmi is:

$\mathrm{pmi}(w_1 w_2) = \log \frac{\mathrm{freq}(w_1 w_2)}{\mathrm{freq}(w_1) \times \mathrm{freq}(w_2)}$

threshold pmi: we compute the pmi of each bigram in a chunk, and count the (normalized) number of bigrams with pmi above .

interference

toury ( ) draws on the concept of interlanguage (selinker, ) to define interference as a universal. selinker ( ) coins the term in order to talk about the hybrid nature of the output of non-native speakers producing utterances in their second language. this output is heavily influenced by the language system of their first language. translation is very similar in this sense: one language comes in close contact with another through transfer. in translation, however, translators habitually produce texts in their native tongue. therefore, toury ( ) advocates a descriptive study of interference not tainted, as in second language acquisition, by the view that the output reveals “ill performances” (production of grammatically incorrect structures). interference operates on different levels, from transcribing source language words, through using loan translations, to exerting structural (for example, morphological and syntactic) influence. this may bring about, as noted by gellerstam ( ), a different distribution of elements in translated texts, which he calls ‘translationese’, keeping it as a purely descriptive term (cf. santos ( )).
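returning to the two pmi features defined under normalization above, the following sketch shows how they could be computed. several choices here are assumptions rather than details given in the text: frequencies are raw counts taken from the chunk itself (a larger reference corpus may have been used), the logarithm is natural, the threshold value is left to the caller, and the function names are hypothetical. the formula is applied literally, with raw frequencies rather than probabilities.

```python
import math
from collections import Counter


def bigram_pmis(tokens):
    """PMI of every bigram occurrence in a chunk, from chunk-internal counts."""
    bigrams = list(zip(tokens, tokens[1:]))
    unigram_freq = Counter(tokens)
    bigram_freq = Counter(bigrams)
    return [
        math.log(bigram_freq[(w1, w2)] / (unigram_freq[w1] * unigram_freq[w2]))
        for w1, w2 in bigrams
    ]


def average_pmi(pmis):
    """Average PMI over all bigrams in the chunk."""
    return sum(pmis) / len(pmis) if pmis else 0.0


def threshold_pmi(pmis, threshold, n_tokens):
    """Number of bigrams whose PMI exceeds `threshold`, per token."""
    return sum(1 for p in pmis if p > threshold) / n_tokens


tokens = ["a", "b", "a", "b"]
pmis = bigram_pmis(tokens)
print(average_pmi(pmis))
print(threshold_pmi(pmis, threshold=math.log(0.3), n_tokens=len(tokens)))
```

with chunk-internal counts every observed bigram has a non-zero frequency, so the logarithm is always defined; counts drawn from an external corpus would need explicit handling of unseen bigrams.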
we model interference as follows: pos n-grams we hypothesize that different grammatical structures used in the different source languages interfere with the translations; and that translations have unique grammatical structure. following baroni and bernardini ( ) and kurokawa et al. ( ), we model this assumption by defining as features unigrams, bigrams and trigrams of pos tags. we add special tokens to indicate the beginning and end of each sentence, with the purpose of capturing specific pos-bigrams and pos-trigrams representing the beginnings and endings of sentences. the value of these features is the actual number of each pos n-gram in the chunk. character n-grams this feature is motivated by popescu ( ). other than yielding very good results, it is also language-type dependent. we hypothesize that grammatical structure manifests itself in this fea- ture, and as in pos n-grams, the different grammatical structures used in the different source languages interfere with the translations. we also hypothesize that this feature captures morphological features of the language. these are actually three different features (each tested separately): unigrams, bigrams and trigrams of characters. they are computed similarly to the way pos n-grams are computed: by the fre- quencies of n-letter occurrences in a chunk, normalized by the chunk’s size. two special tokens are added to indicate the beginning and end of each word, in order to properly handle specific word prefixes and suffixes. we do not capture cross-token character n-grams, and we exclude punctuation marks. prefixes and suffixes character n-grams are an approximation of mor- phological structure. in the case of english, the little morphology expressed by the language is typically manifested as prefixes and suf- fixes. we therefore define a more refined variant of the character n- gram feature, focusing only on prefixes and suffixes. we use a list of such morphemes (see appendix a. 
) as features, simply counting the number of words in a chunk that begin or end with each of the prefixes/suffixes, respectively.

contextual function words: this feature is a variant of pos n-grams, where the n-grams can be anchored by specific (function) words. koppel and ordan ( ) use only function words for classification; we use the same list of words in this feature (see appendix a. ). this feature is defined as the (normalized) frequency of trigrams of function words in the chunk. in addition, we count trigrams consisting of two function words (from the same list) and one other word; in such cases, we replace the other word by its pos. in sum, we compute the frequencies in the chunk of triplets 〈w1, w2, w3〉, where at least two of the elements are function words, and at most one is a pos tag.

positional token frequency: writers have a relatively limited vocabulary from which to choose words to open or close a sentence. we hypothesize that the choices are subject to interference. munday ( ) and gries and wulff ( ) study it on a smaller scale in translations from spanish to english and in translations from english to german, respectively. the value of this feature is the normalized frequency of tokens appearing in the first, second, antepenultimate, penultimate and last positions in a sentence. we exclude sentences shorter than five tokens. punctuation marks are considered as tokens in this feature, and for this reason the three last positions of a sentence are considered, while only the first two of them are interesting for our purposes.

miscellaneous

finally, we define a number of features that cannot be naturally associated with any of the above hypotheses, but nevertheless throw light on the nature of translationese.

function words: we aim to replicate the results of koppel and ordan ( ) with this feature. we use the same list of function words (in fact, some of them are content words, but they are all crucial for organizing the text; see the list in appendix a. ) and implement the same feature. (we thank moshe koppel for providing us with the list of function words used in koppel and ordan ( ).) each function word in the corpus is a feature, whose value is the normalized frequency of its occurrences in the chunk.

pronouns: pronouns are function words, and koppel and ordan ( ) report that this subset is among the top discriminating features between o and t. we therefore check whether pronouns alone can yield a high classification accuracy. each pronoun in the corpus is a feature, whose value is the normalized frequency of its occurrences in the chunk. the list of pronouns is given in appendix a. .

punctuation: punctuation marks organize the information within sentence boundaries and to a great extent reduce ambiguity; according to the explicitation hypothesis, translated texts are less ambiguous (blum-kulka, ) and we assume that this tendency will manifest itself in the (different) way in which translated texts are punctuated. we focus on the following punctuation marks: ? ! : ; - ( ) [ ] ‘ ’ “ ” / , . apostrophes used in contracted forms are retained. following grieve ( ), we define three variants of this feature:
1. the normalized frequency of each punctuation mark in the chunk.
2. a non-normalized notion of frequency: n/tokens, where n is the number of occurrences of a punctuation mark; and tokens is the actual (rather than normalized) number of tokens in the chunk. this value is magnified by an order of .
3. n/p, where p is the total number of punctuation marks in the chunk; and n as above. this value is magnified by an order of .

ratio of passive forms to all verbs: we assume that english original texts tend to use the passive form more frequently than translated texts, due to the fact that the passive voice is more frequent in english than in some other languages (cf. teich ( ) for german-english). if an active voice is used in the source language, translators may prefer not to convert it to the passive.
passives are defined as the verb be fol- lowed by the pos tag vbn (past participle). we calculate the ratio of passive verbs to all verbs, and magnified it by an order of . as a “sanity check”, we use two other features: token unigrams and token bigrams. each unigram and bigram in the corpus constitutes a spe- cific feature, as in baroni and bernardini ( ). the feature’s value is its frequency in the chunk (again, normalized). for bigrams we add spe- cial markers of the edges of the sentences as described for pos-n-grams. we assume that different languages use different content words in varying frequencies in translated and non-translated texts. we expect these two features to yield conclusive results (well above % accuracy), while token bigrams are expected to yield somewhat better results than token unigrams. these features are highly content-dependent, and are therefore of no em- pirical significance; they are only used as an upper bound for our other features, and to emphasize the validity of our methodology: we expect very high accuracy of classification with these features. results we implemented all the features discussed in the previous section as classi- fiers and used them for classifying held-out texts in a ten-fold cross-validation scenario, as described in section . the results of the classifiers are reported in table in terms of the accuracy of classifying the test set. as a sanity check, we also report the accuracy of the content-dependent classifiers. as mentioned above, these are expected to produce highly- accurate classifiers, but teach us very little about the features of transla- tionese. as is evident from table , this is indeed the case. for the sake of completeness, we note that it is possible to achieve very high classification accuracy even with a much narrower feature space. some of the more complex feature sets have hundreds, or even thousands of fea- tures. in such cases, most features contribute very little to the task. 
to emphasize this, we take only the top most frequent features. for example, rather than use all possible pos trigrams for classification, we only use the most frequent sequences as features. table lists the classification results in this case. evidently, the results are almost as high as when using all features.

our main objective, however, is not to produce the best-performing classifiers. rather, it is to understand what the classifiers can reveal about the nature of the differences between o and t. the following section thus analyses the results.

category         feature                               accuracy (%)
simplification   ttr (1)
                 ttr (2)
                 ttr (3)
                 mean word length
                 syllable ratio
                 lexical density
                 mean sentence length
                 mean word rank (1)
                 mean word rank (2)
                 n most frequent words
explicitation    explicit naming
                 single naming
                 mean multiple naming
                 cohesive markers
normalization    repetitions
                 contractions
                 average pmi
                 threshold pmi
interference     pos unigrams
                 pos bigrams
                 pos trigrams
                 character unigrams
                 character bigrams
                 character trigrams
                 prefixes and suffixes
                 contextual function words
                 positional token frequency
miscellaneous    function words
                 pronouns
                 punctuation (1)
                 punctuation (2)
                 punctuation (3)
                 ratio of passive forms to all verbs

table : classification results

category   feature           accuracy (%)
sanity     token unigrams
           token bigrams

table : classification results, “sanity check” classifiers

category       feature                       accuracy
interference   pos bigrams
               pos trigrams
               character bigrams
               character trigrams
               positional token frequency

table : classification results, top- features only

analysis

simplification

laviosa ( , ) studied the simplification hypothesis extensively. some features pertaining to simplification are also mentioned by baker ( ). the four main features and partial findings pertain to mean sentence length, type-token ratio, lexical density and overrepresentation of highly frequent items. lexical density fails altogether to predict the status of a text, being nearly on chance level ( % accuracy). interestingly, while mean sentence length is much above chance level ( %), the results are contrary to common assumptions in translation studies. according to the simplification hypothesis, t sentences are simpler (i.e., shorter), but as figure shows, the contrary is the case. we computed the mean sentence length in eleven , -word texts, one of them original english, and the others translated from ten source languages (this is the same corpus on which we run the classification). only three translations (from swedish, finnish and dutch) have a lower mean sentence length than original english, and on average o sentences are . tokens shorter. whereas this result may pertain only to certain language pairs or certain genres, this alleged “translation universal” is definitely not universal. moreover, it may actually be an instance of the interference hypothesis, where sentence length in the target language reflects its length in the source language. this, however, should be studied under a parallel corpus setting, and is beyond the scope of this work.

figure : mean sentence length according to ‘language’

the first two ttr measures perform relatively well ( % accuracy), and the indirect measures of lexical variety (mean word length and syllable ratio) are above chance level ( % and % accuracy, respectively). following holmes ( ) we experiment with more sophisticated measures of lexical variety. the best performing one is the one that takes into account hapax legomena, words that occur only once in a text. this variant, ttr (3), yields % accuracy. one important trait of hapax legomena is that, as opposed to type-token ratio, they are not so dependent on corpus size (baayen, ). another interesting classifier with relatively good results, in fact, the best performing of all simplification features ( % accuracy), is mean word rank.
this feature is closely related to the feature studied by laviosa ( ) (n top words) with two differences: ( ) our list of frequent items is much larger, and ( ) we generate the frequency list not from the corpora under study but rather from an external much larger reference corpus. in contrast, the design that follows laviosa ( ) more strictly (n most frequent words) has a lower predictive power ( %). . explicitation the three classifiers we design to check this hypothesis (explicit naming, single naming and mean multiple naming) do not exceed % classification accuracy. on the other hand, following blum-kulka ( ) and koppel and ordan ( ), we build a classifier that uses cohesive markers and achieve % accuracy in telling o from t; such cohesive markers are far more frequent in t than in o. for example, moreover, thus and besides are used . , , and . times more frequently (respectively) in t than in o. . normalization none of these features perform very well. repetitions and contractions are rare in europarl and in this sense the corpus may not be suited for study- ing these phenomena. the repetition-based classifier yields % accuracy and the contraction-based classifier performs at chance level ( %). one of the classifiers that checks pmi, designed to pick on highly as- sociated words and therefore attesting to many fixed expressions, performs considerably better, namely % accuracy. this measure counts the number of associated bigrams whose pmi is above . as figure shows, english has far more highly associated bigrams than translations. if we take the word form stand, for example, then at the top of the list we normally get highly associated words, some of which are fixed expressions, such as stand idly, stand firm, stand trial, etc. there are considerably more highly associated pairs like these in o; conversely, this also means that there are more poorly associated pairs in t, such as the bigram stand unamended. 
this finding contradicts the case studies elaborated on in toury ( ); kenny ( ). it should be noted, however, that they discuss cases operating under particu- lar scenarios, whereas we check this phenomenon more globally, completely unbiased towards any scenario whatsoever. the finding is robust but it is oblivious to the particulars of subtle cases. figure : number of bigrams whose pmi is above threshold according to ‘language’ . interference the interference-based classifiers are the best performing ones. most of them perform above %. in this sense we can say that interference is the most robust phenomenon typifying translations. however, we note that some of the features are somewhat coarse and may reflect some corpus-dependent characteristics. for example, in the character n-grams we notice that some of the top features in o include sequences that are ‘illegal’ in english and obviously stem from foreign names, such as the following letter bigrams: haarder and maat or gazpron. to offset this problem we use only the top features in several of the classifiers, with a minor effect on the results. the n-gram findings are consistent with popescu ( ) in that we also find they catch on both affixes and function words: for example, typical trigrams in o are -ion and all whereas typical to t are -ble and the. as opposed to popescu ( ) we reduced the feature space without the need to look at the original texts; popescu ( ) looked for sequences of n-grams in the target language that also appear in the source texts, thereby eliminating mostly proper nouns. however, this method can be applied only to language pairs that use similar alphabet and orthography conventions. using only the most frequent features results in a drop in accuracy of up to %. furthermore, restricting the space of features to only prefixes and suffixes, a much narrower domain than the set of all character bi-grams, for example, still yields % accuracy. 
evidently, original and translated texts differ greatly in the way they use these affixes. different english affixes were imported from different languages, and this is reflected in our findings. the prefix mono-, a marker of translated language, is much more frequent in greek than any other language. the suffix -ible, originating in latin, is much more common in all the romance languages, which are “clustered together” around this feature, compared to english. last, the suffix -ize, originating in latin, is highly frequent in original english, less frequent in the romance languages, and even less in the other languages. further study, backed by a sound historical linguistics perspective, may determine how such parameters affect transnational choices between language, taking into account their distance from each other. part-of-speech trigrams is an extremely cheap and efficient classifier. the feature space is not too big, and the results are robust. a good discrimina- tory feature typifying t is, for example, the part-of-speech trigram of modal + verb base form + verb past particle, as in the highly frequent phrases in the corpus must be taken, should be given and can be used; as can be seen in figure , it typifies more prominently translations from phylogenetically distant languages, such as finnish, but original english is down the list, regardless of t’s source language. figure : the average number of the pos trigram modal + verb base form + participle in o and ten ts moving now to positional token frequency, we report on three variations of this classifier with different degrees of accuracy (reported in brackets): taking into account all the tokens that appear in these positions ( %), us- ing only the most frequent tokens ( %) and finally only the most frequent tokens ( %). the last is the most abstract, picking almost ex- clusively on function words. the second most prominent feature typifying o is sentences opening with the word ‘but’. in fact, there are . 
times more cases of such sentences in o. in english there is a long prescriptive tradition forbidding writers to open a sentence with ‘but’, and although this ‘decree’ is questioned and even mocked at (garner, ), the question whether it is considered a good style is a common question posted on inter- net forums dealing with english language use. translators have been known to be conservative in their lexical choices (kenny, ), and the underuse of ‘but’-opening sentences is yet another evidence for this tendency. as op- posed to other features in positional token frequency, this is not a case of interference but rather a tendency to (over-)abide to norms of translation, i.e., standardization (toury, ). . miscellaneous in this category we include several classifiers whose features do not fall under a clear-cut theoretical category discussed by translation theorists. the function words classifier replicates koppel and ordan ( ) and despite the good performance ( % accuracy) it is not very meaningful theoretically. one of its subsets, a list of pronouns, reveals an interesting phenomenon: subject pronouns, like i, he and she are prominent indicators of o, whereas virtually all reflexive pronouns (such as itself, himself, yourself ) typify t. the first phenomenon is probably due to the fact that pronouns are much more frequent in t (about . more frequent) and a fine-tuned analysis of the distribution of pronouns in each sub-corpus normalized by the number of pronouns is beyond the scope of this study; the high representation of reflexive pronouns is probably due to interference from the source languages. the accuracy of classifying by pronouns alone is %. the accuracy of a classifier based on the ratio of passive verbs is much above chance level, yet not a very good predictor by itself ( %). t has about . 
times more passive verbs, and the ratio is highly dependent on the source language from which t stems: original english is low on the list, right after the romance languages and greek, while the top of the list reads, in descending order: danish, swedish, finnish, dutch and german.

we experiment with three different classifiers based on punctuation marks as a feature set. the mark ‘.’ (actually indicating sentence length) is a strong feature of o and the mark ‘,’ is a strong marker of t. in fact, using only these two features we achieve % accuracy. parentheses are very typical of t, indicating explicitation. a typical example is the following: the vlaams blok (‘flemish block’) opposes the patentability of computer-implemented inventions... last, we find that exclamation marks are on average much more common in original english ( . times more frequent). translations from three source languages, however, have more exclamation marks than original english: german, italian and french. translations from german use many more exclamation marks, . (!!!) times more than original english.

conclusion

machines can easily identify translated texts. identification has been successfully performed for very different data sets and genres, including parliamentary proceedings, literature, news and magazine writing, and it works well across many source and target languages (with the exception of literary polish, see rybicki ( )). but text classification is a double-edged sword. consider how easily the classifier teases apart o from t based on letter bigrams: % accuracy, with a slight drop to % when only the top most frequent letter bigrams are used. this is considerably better than the performance achieved by human professionals (tirkkonen-condit, ; baroni and bernardini, ). we then find that the letter sequence di is among the best discriminating features between o and t, as it is about % more frequent in t than in o; but it does not teach us much about t, and we cannot interpret this finding.
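as a rough illustration of the letter-bigram comparison just described, the sketch below counts letter bigrams in two small text samples and reports how much more frequent a given bigram is in one sample than in the other. the function names (`bigram_frequencies`, `frequency_ratio`) and the two toy sentences are assumptions made purely for illustration; this is not the study's code, corpus or classifier, and the numbers it produces say nothing about europarl.

```python
from collections import Counter


def bigram_frequencies(text):
    """Return relative frequencies of letter bigrams, counted within words."""
    counts = Counter()
    for word in text.lower().split():
        # keep alphabetic characters only, so punctuation never forms a bigram
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - 1):
            counts[letters[i:i + 2]] += 1
    total = sum(counts.values())
    return {bg: n / total for bg, n in counts.items()} if total else {}


def frequency_ratio(bigram, freqs_a, freqs_b):
    """How many times more frequent `bigram` is in A than in B (None if absent from B)."""
    freq_b = freqs_b.get(bigram, 0.0)
    if freq_b == 0.0:
        return None
    return freqs_a.get(bigram, 0.0) / freq_b


# hypothetical toy stand-ins for the translated (t) and original (o) sub-corpora
t_sample = "the directive on direct dialogue was discussed"
o_sample = "but the member did not discuss the bill"

t_freqs = bigram_frequencies(t_sample)
o_freqs = bigram_frequencies(o_sample)
ratio = frequency_ratio("di", t_freqs, o_freqs)
print(f"'di' is {ratio:.2f} times as frequent in the t sample as in the o sample")
```

a ratio of this kind shows that a bigram separates the two samples, but, as noted above, it does not by itself explain why.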
furthermore, text classification is highly dependent on genres and domains, and cross-corpus classification (‘scalability’) is notoriously hard (argamon, ). we addressed the first problem by designing linguistically informed features. for example, enhancing letter n-grams to trigrams already revealed some insights about morphological traits of t. the second problem calls for future research. recall that we were unable to replicate the results reported by olohan ( ), simply because contractions are a rarity in europarl and therefore ‘normalizing’ them is an even rarer event. that translationese is dependent on genre is suggested and studied in various works (steiner, ; reiss, ; teich, ).

this point is closely related to one of our main conclusions: the universal claims for translation should be reconsidered. not only are they dependent on genre and register, they also vary greatly across different pairs of languages. the best performing features in our study are those that attest to the ‘fingerprints’ of the source on the target, what has been called “source language shining through” (teich, ). this is not to say that there are no features which operate “irrespective of source language” (like cohesive markers in europarl), but the best evidence for translationese, the one that has the best predictive power, is related to interference, and interference by its nature is a pair-specific phenomenon. note that mean sentence length, which we included in ‘simplification’, has been purported to be a trait of translationese regardless of source language, but turned out to be very much dependent on the source language; in particular, contrary to previous assumptions, sentences turned out to be shorter in o. this can indeed be shown in a well-balanced comparable corpus, ideally from as many source languages as possible and, when possible, typologically distant ones.

another caveat is related to comparable corpora in general.
olohan and baker ( ) report that there are fewer omissions of optional reporting that in t, as in i know (that) he’d never get here in time. this is, according to the authors, a case of explicitation, i.e., replacing a zero-connective with a that-connective to avoid ambiguity. pym ( ) raises the following question: how do we know that this finding is not due to interference? what if the t component of this corpus consists of translations from source languages in which the that-connective is obligatory and therefore it is just “shining through” to the target text? we cast the same doubt on some of our findings. the under-representation of sentences opening with but in t is probably due to normalization, but without reference to the source texts we will never be sure. with this kind of corpus (a comparable corpus) we can settle the ontological question (t is different from o across many dimensions suggested by translation scholars), but we are left with an epistemological unease: given our tools and methodology we do not know for sure what part of the findings is a mere result of source influence on the target text (interference), and what part is inherent to the work of translators (simplification, normalization, and explicitation). we leave this question for future studies.

references

omar s. al-shabab. interpretation and the language of translation: creativity and conventions in translation. janus, edinburgh, .
shlomo argamon. book review of scalability issues in authorship attribution, by kim luyckx. literary and linguistic computing, ( ): – , .
r. harald baayen. word frequency distributions. text, speech, and language technology. kluwer academic, . isbn .
mona baker. corpus linguistics and translation studies: implications and applications. in mona baker, gill francis, and elena tognini-bonelli, editors, text and technology: in honour of john sinclair, pages – . john benjamins, amsterdam, .
marco baroni and silvia bernardini.
a new approach to the study of translationese: machine-learning the difference between original and translated text. literary and linguistic computing, ( ): – , september . url http://llc.oxfordjournals.org/cgi/content/short/ / / ?rss= .
nitza ben-ari. the ambivalent case of repetitions in literary translation. avoiding repetitions: a “universal” of translation? meta, ( ): – , .
shoshana blum-kulka. shifts of cohesion and coherence in translation. in juliane house and shoshana blum-kulka, editors, interlingual and intercultural communication discourse and cognition in translation and second language acquisition studies, volume , pages – . gunter narr verlag, .
shoshana blum-kulka and eddie a. levenston. universals of lexical simplification. language learning, ( ): – , december .
shoshana blum-kulka and eddie a. levenston. universals of lexical simplification. in claus faerch and gabriele kasper, editors, strategies in interlanguage communication, pages – . longman, .
andrew chesterman. beyond the particular. in a. mauranen and p. kujamäki, editors, translation universals: do they exist?, pages – . john benjamins, .
kenneth ward church and patrick hanks. word association norms, mutual information, and lexicography. computational linguistics, ( ): – , . issn - .
sari eskola. untypical frequencies in translated language. in a. mauranen and p. kujamäki, editors, translation universals: do they exist?, pages – . john benjamins, .
william frawley. prolegomenon to a theory of translation. in william frawley, editor, translation. literary, linguistic and philosophical perspectives, pages – . university of delaware press, newark, .
bryan a. garner. on beginning sentences with but. michigan bar journal, : – , .
martin gellerstam. translationese in swedish novels translated from english.
in lars wollin and hans lindquist, editors, translation studies in scandinavia, pages – . cwk gleerup, lund, .
stefan th. gries and stefanie wulff. regression analysis in translation studies. in michael p. oakes and meng ji, editors, quantitative methods in corpus-based translation studies, studies in corpus linguistics , pages – . john benjamins, philadelphia, .
jack grieve. quantitative authorship attribution: an evaluation of techniques. literary and linguistic computing, ( ): – , .
mark hall, eibe frank, geoffrey holmes, bernhard pfahringer, peter reutemann, and ian h. witten. the weka data mining software: an update. sigkdd explorations, ( ): – , . issn - . doi: . / . . url http://dx.doi.org/ . / . .
sandra halverson. the cognitive basis of translation universals. target, ( ): – , .
david i. holmes. a stylometric analysis of mormon scripture and related texts. journal of the royal statistical society, ( ): – , .
iustina ilisei. a machine learning approach to the identification of translational language: an inquiry into translationese learning models. phd thesis, university of wolverhampton, wolverhampton, uk, february . url http://clg.wlv.ac.uk/papers/ilisei-thesis.pdf.
iustina ilisei and diana inkpen. translationese traits in romanian newspapers: a machine learning approach. international journal of computational linguistics and applications, ( - ), .
iustina ilisei, diana inkpen, gloria corpas pastor, and ruslan mitkov. identification of translationese: a machine learning approach. in alexander f. gelbukh, editor, proceedings of cicling- : th international conference on computational linguistics and intelligent text processing, volume of lecture notes in computer science, pages – . springer, . isbn - - - - . url http://dx.doi.org/ . / - - - - .
dorothy kenny. lexis and creativity in translation: a corpus-based study. st. jerome, . isbn .
philipp koehn.
europarl: a parallel corpus for statistical machine translation. in proceedings of the tenth machine translation summit, pages – . aamt, . url http://mt-archive.info/mts- -koehn.pdf.
moshe koppel and noam ordan. translationese and its dialects. in proceedings of the th annual meeting of the association for computational linguistics: human language technologies, pages – , portland, oregon, usa, june . association for computational linguistics. url http://www.aclweb.org/anthology/p - .
moshe koppel, jonathan schler, and shlomo argamon. computational methods in authorship attribution. journal of the american society for information science and technology, ( ): – , jan . issn - . doi: . /asi.v : . url http://dx.doi.org/ . /asi.v : .
david kurokawa, cyril goutte, and pierre isabelle. automatic detection of translated text and its impact on machine translation. in proceedings of mt-summit xii, pages – , .
sara laviosa. core patterns of lexical use in a comparable corpus of english narrative prose. meta, ( ): – , december .
sara laviosa. corpus-based translation studies: theory, findings, applications. approaches to translation studies. rodopi, . isbn .
gennadi lembersky, noam ordan, and shuly wintner. language models for machine translation: original vs. translated texts. in proceedings of the conference on empirical methods in natural language processing, pages – , edinburgh, scotland, uk, july . association for computational linguistics. url http://www.aclweb.org/anthology/d - .
gennadi lembersky, noam ordan, and shuly wintner. adapting translation models to translationese improves smt.
in proceedings of the th conference of the european chapter of the association for computational linguistics, pages – , avignon, france, april a. association for computational linguistics. url http://www.aclweb.org/anthology/e - .
gennadi lembersky, noam ordan, and shuly wintner. language models for machine translation: original vs. translated texts. computational linguistics, ( ): – , december b. url http://dx.doi.org/ . /coli_a_ .
a. mauranen and p. kujamäki, editors. translation universals: do they exist? john benjamins, .
anna mauranen. universal tendencies in translation. in gunilla anderman and margaret rogers, editors, incorporating corpora: the linguist and the translator, pages – . multilingual matters, clevedon, buffalo and toronto, .
jeremy munday. a computer-assisted approach to the analysis of translation shifts. meta, ( ): – , .
maeve olohan. how frequent are the contractions? a study of contracted forms in the translational english corpus. target, ( ): – , .
maeve olohan and mona baker. reporting that in translated english: evidence for subconscious processes of explicitation? across languages and cultures, ( ): – , .
lin øverås. in search of the third code: an investigation of norms in literary translation. meta, ( ): – , .
marius popescu. studying translationese at the character level. in galia angelova, kalina bontcheva, ruslan mitkov, and nicolas nicolov, editors, proceedings of ranlp- , pages – , .
anthony pym. on toury’s laws of how translators translate. in anthony pym, miriam shlesinger, and daniel simeoni, editors, beyond descriptive translation studies: investigations in homage to gideon toury, benjamins translation library: est subseries, pages – . john benjamins, . isbn .
katherine reiss.
text types, translation types and translation assessment. in andrew chesterman, editor, readings in translation theory, pages – . oy finn lectura ab, helsinki, .
jan rybicki. the great mystery of the (almost) invisible translator: stylometry in translation. in michael p. oakes and meng ji, editors, quantitative methods in corpus-based translation studies, studies in corpus linguistics , pages – . john benjamins, philadelphia, .
diana santos. on grammatical translationese. in kimmo koskenniemi, editor, short papers presented at the tenth scandinavian conference on computational linguistics, pages – , .
fabrizio sebastiani. machine learning in automated text categorization. acm computing surveys, ( ): – , march . issn - . doi: . / . . url http://doi.acm.org/ . / . .
larry selinker. interlanguage. international review of applied linguistics in language teaching, ( – ): – , .
miriam shlesinger. simultaneous interpretation as a factor in effecting shifts in the position of texts on the oral-literate continuum. master’s thesis, tel aviv university, faculty of the humanities, department of poetics and comparative literature, .
erich steiner. a register-based translation evaluation: an advertisement as a case in point. target, ( ): – , .
elke teich. cross-linguistic variation in system and text: a methodology for the investigation of translations and comparable texts. mouton de gruyter, .
sonja tirkkonen-condit. translationese: a myth or an empirical fact? target, ( ): – , .
gideon toury. interlanguage and its manifestations in translation. meta, ( ): – , .
gideon toury. in search of a theory of translation. the porter institute for poetics and semiotics, tel aviv university, tel aviv, .
gideon toury. descriptive translation studies and beyond. john benjamins, amsterdam / philadelphia, .
fiona j. tweedie and r. harald baayen. how variable may a constant be? measures of lexical richness in perspective.
computers and the humanities, ( ): – , .
hans van halteren. source language markers in europarl translations. in donia scott and hans uszkoreit, editors, coling , nd international conference on computational linguistics, proceedings of the conference, - august , manchester, uk, pages – , . isbn - - - - . url http://www.aclweb.org/anthology/c - .
ria vanderauwera. dutch novels translated into english: the transformation of a ‘minority’ literature. rodopi, amsterdam, .

a lists of words

a. cohesive markers

we use the following list of words as cohesive markers: as for, as to, because, besides, but, consequently, despite, even if, even though, except, further, furthermore, hence, however, in addition, in conclusion, in other words, in spite, instead, is to say, maybe, moreover, nevertheless, on account of, on the contrary, on the other hand, otherwise, referring to, since, so, the former, the latter, therefore, this implies, though, thus, with reference to, with regard to, yet, concerning.

a. contracted forms

we use the following list of contracted forms and their expansions: i’m: i am; it’s: it is, it has; there’s: there is, there has; he’s: he is, he has; she’s: she is, she has; what’s: what is, what has; let’s: let us; who’s: who is, who has; where’s: where is, where has; how’s: how is, how has; here’s: here is; i’ll: i will; you’ll: you will; she’ll: she will; he’ll: he will; we’ll: we will; they’ll: they will; i’d: i would, i had; you’d: you would, you had; she’d: she would, she had; he’d: he would, he had; we’d: we would, we had; they’d: they would, they had; i’ve: i have; you’ve: you have; we’ve: we have; they’ve: they have; who’ve: who have; would’ve: would have; should’ve: should have; must’ve: must have; you’re: you are; they’re: they are; we’re: we are; who’re: who are; couldn’t: could not; can’t: cannot; wouldn’t: would not; don’t: do not; doesn’t: does not; didn’t: did not.

a.
prefixes and suffixes

we use the following list of prefixes: a, an, ante, anti, auto, circum, co, com, con, contra, de, dis, en, ex, extra, hetero, homo, hyper, il, im, in, inter, intra, ir, macro, micro, mono, non, omni, post, pre, pro, sub, syn, trans, tri, un, uni and the following list of suffixes: able, acy, al, ance, ate, dom, en, ence, er, esque, ful, fy, ible, ic, ical, ify, ious, ise, ish, ism, ist, ity, ive, ize, less, ment, ness, or, ous, ship, sion, tion, ty, y.

a. function words

we use the following list of function words: a, about, above, according, accordingly, actual, actually, after, afterward, afterwards, again, against, ago, ah, ain’t, all, almost, along, already, also, although, always, am, among, an, and, another, any, anybody, anyone, anything, anywhere, are, aren’t, around, art, as, aside, at, away, ay, back, be, bear, because, been, before, being, below, beneath, beside, besides, better, between, beyond, bid, billion, billionth, both, bring, but, by, came, can, can’t, cannot, canst, certain, certainly, come, comes, consequently, could, couldn’t, couldst, dear, definite, definitely, despite, did, didn’t, do, does, doesn’t, doing, don’t, done, dost, doth, doubtful, doubtfully, down, due, during, e.g., each, earlier, early, eight, eighteen, eighteenth, eighth, eighthly, eightieth, eighty, either, eleven, eleventh, else, enough, enter, ere, erst, even, eventually, ever, every, everybody, everyone, everything, everywhere, example, except, exeunt, exit, fact, fair, far, farewell, few, fewer, fifteen, fifteenth, fifth, fifthly, fiftieth, fifty, finally, first, firstly, five, for, forever, forgo, forth, fortieth, forty, four, fourteen, fourteenth, fourth, fourthly, from, furthermore, generally, get, gets, getting, give, go, good, got, had, has, hasn’t, hast, hath, have, haven’t, having, he, he’d, he’ll, he’s, hence, her, here, hers, herself, him, himself, his, hither, ho, how, how’s, however, hundred, hundredth, i, i’d, i’m, i’ve, if, in,
indeed, instance, instead, into, is, isn’t, it, it’d, it’ll, it’s, its, itself, last, lastly, later, less, let, let’s, like, likely, many, matter, may, maybe, me, might, million, millionth, mine, more, moreover, most, much, must, mustn’t, my, myself, nay, near, nearby, nearly, neither, never, nevertheless, next, nine, nineteen, nineteenth, ninetieth, ninety, ninth, ninthly, no, nobody, none, noone, nor, not, nothing, now, nowhere, o, occasionally, of, off, oft, often, oh, on, once, one, only, or, order, other, others, ought, our, ours, ourselves, out, over, perhaps, possible, possibly, presumable, presumably, previous, previously, prior, probably, quite, rare, rarely, rather, result, resulting, round, said, same, say, second, secondly, seldom, seven, seventeen, seventeenth, seventh, seventhly, seventieth, seventy, shall, shalt, shan’t, she, she’d, she’ll, she’s, should, shouldn’t, shouldst, similarly, since, six, sixteen, sixteenth, sixth, sixthly, sixtieth, sixty, so, soever, some, somebody, someone, something, sometimes, somewhere, soon, still, subsequently, such, sure, tell, ten, tenth, tenthly, than, that, that’s, the, thee, their, theirs, them, themselves, then, thence, there, there’s, therefore, these, they, they’d, they’ll, they’re, they’ve, thine, third, thirdly, thirteen, thirteenth, thirtieth, thirty, this, thither, those, thou, though, thousand, thousandth, three, thrice, through, thus, thy, till, tis, to, today, tomorrow, too, towards, twas, twelfth, twelve, twentieth, twenty, twice, twill, two, under, undergo, underneath, undoubtedly, unless, unlikely, until, unto, unusual, unusually, up, upon, us, very, was, wasn’t, wast, way, we, we’d, we’ll, we’re, we’ve, welcome, well, were, weren’t, what, what’s, whatever, when, whence, where, where’s, whereas, wherefore, whether, which, while, whiles, whither, who, who’s, whoever, whom, whose, why, wil, will, wilst, wilt, with, within, without, won’t, would, wouldn’t, wouldst, ye, yes, yesterday,
yet, you, you’d, you’ll, you’re, you’ve, your, yours, yourself, yourselves.

a. pronouns

we use the following list of pronouns: he, her, hers, herself, him, himself, i, it, itself, me, mine, myself, one, oneself, ours, ourselves, she, theirs, them, themselves, they, us, we, you, yourself.

the national digital stewardship residency: building a community of practice through postgraduate training and education

d-lib magazine may/june volume , number /

rebecca fraimow, wgbh rebecca_fraimow [at] wgbh.org
meridith beck mink meribecks [at] gmail.com
margo padilla, metropolitan new york library council (metro) mpadilla [at] metro.org

https://doi.org/ . /may -fraimow

abstract

the national digital stewardship residency (ndsr) program addresses the need for a dedicated community of professionals with the knowledge and technical skills to ensure the long-term viability of the digital record by matching recent postgraduate degree recipients with cultural heritage institutions to manage digital stewardship projects. since the initial ndsr dc pilot program, there have been five more iterations of the program: ndsr new york, ndsr boston, american archive of public broadcasting ndsr, ndsr art, and biodiversity heritage library ndsr.
although these programs share the same characteristics, each operates independently and no formalized guidelines or standards currently exist to link all the programs together. in the fall of , the council on library and information resources (clir) was awarded an imls grant to evaluate the early ndsr programs. by providing a comprehensive picture of the ndsr programs that were completed by , the study was intended to help the ndsr community build connections across initiatives and learn from the experiences of its first participants. imls also funded the ndsr symposium planned for april , which will serve as an opportunity to bring stakeholders together to incorporate the strongest practices of each iteration and develop a standardized model for future programs. keywords: national digital stewardship residency, ndsr   introduction the range of processes involved in the long-term management of digital objects requires a specific skillset and specialized training. as the amount of digital material being acquired by cultural heritage organizations continues to grow, building and maintaining professional capacity for digital stewardship is becoming increasingly vital. the national digital stewardship residency (ndsr) programs are designed to address this need by cultivating a dedicated community of new professionals with the knowledge and technical skills to ensure the long-term viability of the digital record. the ndsr model matches recent postgraduate degree recipients with cultural heritage institutions to gain practical, hands-on training by managing digital stewardship projects in real-world settings. institutions gain the benefit of a dedicated, full-time employee focused on advancing and strengthening institutional capacities. projects can range from policy development to web archiving to data management. 
host institutions are competitively selected based on their proposed project, ability to mentor and support the resident, and other criteria to ensure effective and high-level training. during the residency, the host institution is responsible for engaging the resident as a professional employee of the institution and providing them with the space, supplies, and institutional access that they need in order to complete the project. the resident is responsible for completing the project on behalf of the institution, as well as spending a certain portion of their time on professional development activities as designated by the program administration. ndsr programs are funded in part through the institute of museum and library services' (imls) laura bush st century librarian program, which helps libraries and archives better meet the changing learning and information needs of their patrons by providing professional development, graduate education, and continuing education opportunities for information students and professionals. the initial ndsr dc pilot program, hosted by the library of congress, ran from - . since then, five more programs (ndsr new york, ndsr boston, american archive of public broadcasting ndsr (aapb), ndsr art, and biodiversity heritage library ndsr (bhl)), as well as a new round of ndsr dc, have recruited new professionals to serve as national digital stewardship residents and undertake the challenges of caring for digital content at institutions around the country. although these programs share many of the same characteristics, each operates independently. most implementations have introduced variations that have tested the potential and flexibility of the basic conceptual model. for example, the ndsr boston and new york programs demonstrated that the program was replicable in metro areas outside of dc while adapting curricula and management structures to suit local needs.
the aapb ndsr and ndsr art programs were experiments in adapting the program for cohorts working at a distance from one another, organized around thematic rather than geographic institutional connections. the bhl ndsr further tested the model by pre-selecting hosts from among a small group of collaborating member institutions; each bhl resident has worked on a different aspect of a larger coordinated effort. although program administrators have remained in communication with each other, and many have informally shared resources and best practices, no formalized guidelines or standards currently exist to link all the programs together. because of its flexibility, each program has been adapted to suit the needs of hosts, the bureaucratic requirements of the administering institutions, and the managerial preferences of project staff. in the fall of , the council on library and information resources (clir) was awarded an imls grant to evaluate the early ndsr programs. by providing a comprehensive picture of the ndsr programs that were completed by , the study was intended to help the ndsr community build connections across initiatives and learn from the experiences of its first participants. imls also funded a national symposium for ndsr constituents planned for april , which will serve as an opportunity to bring stakeholders together to discuss and evaluate program-specific needs and goals, to make recommendations that incorporate the strongest practices of each iteration, and to develop guidelines for future ndsr programs.   clir's assessment of the national digital stewardship residency - commenced in the fall of , the clir assessment was a formative evaluation that gathered qualitative feedback predominantly from interviews and surveys in order to capture the diversity of experiences of the first ndsr participants. 
it considered the four ndsr programs that were completed by summer , which included the initial ndsr pilot program designed by the library of congress ( - ); the second washington, d.c.-based program led by the library of congress ( - ); the boston-based program led by harvard library and mit libraries ( - ); and the new york city-based program led by the metropolitan new york library council, or metro ( - ). the clir team conducted both phone interviews and site visits with the residents and their supervisors who were in the midst of their residencies in boston, new york, and washington, d.c. between january and may of . in addition, clir administered surveys to residents and supervisors from the first ndsr cohorts that ran from - , and spoke with the individuals who administered and managed these programs. clir's research team also interviewed or met with staff associated with ndsr programs led by the american archive of public broadcasting (aapb ndsr), and the philadelphia museum of art and the art libraries society of north america (ndsr art); however, because these programs were only just underway at the time of research, they were not a significant part of the study. nonetheless, clir's study is the first comprehensive evaluation that compared all completed ndsr programs and investigated the differences in program design and coordination, and identified common factors that made for successful and productive residencies. although the assessment was independent from previous evaluations conducted on the first dc, boston, and new york programs by howard besser and michelle gallinger, clir's research team reviewed conclusions made by these previous studies. clir's report — keepers of our digital future: an assessment of the national digital stewardship residencies, - — was published online in december of and is openly accessible on clir's website. 
it provides a comprehensive picture of the ndsr programs to date and identifies several key strengths that have influenced participant satisfaction and perceptions of success. the study shows that there was an overwhelmingly positive response to ndsr's cohort model, and that ndsr projects generally had very constructive outcomes for host organizations. because the ndsr projects are designed for very different kinds of host institutions (including university libraries, radio and tv stations, and governmental agencies), the specific digital preservation skills and tools acquired by residents have varied greatly. on the whole, however, participants reported gaining and honing a wide variety of skills critical to good digital stewardship and making significant strides in their professional development. the programs also had a positive impact on the careers of most of the participants: twenty-six of the thirty-five residents who completed ndsr by the summer of were employed in the digital preservation field at the time of the report's publication. the vast majority (over %) of all residents also indicated they use the skills and tools they learned through ndsr in their current jobs. the assessment also noted some ongoing challenges, including the need to improve the clarity and regularity of communication between program administrators, hosts and residents, and to strengthen connections that unify the growing ndsr community. participants in the assessment repeatedly emphasized the need for dedicated staff with sufficient time to coordinate the residencies. during clir's study, preparations were underway for the ndsr symposium, discussed in detail below. the report also proposes a series of recommendations for ndsr based on insights provided by those who participated in the assessment.
these include a set of general recommendations for all programs, such as setting terms of twelve months for the residencies, adjusting pay according to the cost of living where the residency takes place, and providing health care. clir also made more specific recommendations for the effective management of programs, building curricula, establishing key digital preservation skills related to program goals, encouraging strong cohorts, fostering successful mentorship, and establishing a more formalized means of coordinating ndsr programs at a national level. as separate assessments of the boston and new york programs noted, the ndsr community is not yet effectively or formally organized and connected at a national scale. to achieve this goal, clir made several recommendations for ndsr stakeholders, including: appointing or electing a national committee to set basic standards and best practices, formalizing a means to facilitate cross-cohort communication and interaction, and establishing procedures for collecting data on resident competencies. overall, clir's study concluded that ndsr has been particularly successful in increasing residents' professional experience, cultivating supportive regional cohorts, enriching digital preservation at host organizations, and heightening the awareness and understanding of digital stewardship concepts and practices among participants in its programs. julia kim, who was part of the first group of new york residents, stated that once she completed ndsr she "felt thoroughly trained, tested, and ready for the next phase in my unfolding career as a folklife specialist and digital assets manager at the american folklife center of the library of congress in washington, d.c." many other study participants echoed kim's affirmation that the residencies help prepare new professionals for the growing number of jobs in digital stewardship. 
the cohort-based ndsr programs, therefore, are effectively building a community of professionals dedicated to the preservation of the nation's digital heritage.

the national digital stewardship residency symposium

in the fall of , administrators of the various ndsr programs then active (ndsr dc, hosted by the library of congress; ndsr boston, hosted by harvard and mit libraries; ndsr new york, hosted by metro; and the at-the-time brand-new aapb ndsr, hosted by boston public broadcaster wgbh in its role as steward of the american archive of public broadcasting) identified a need to address challenges in the program's stated mission of "building a community of professionals." while each individual program has forged strong links between the residents and mentors that make up each year's cohort, the impact of the programs has largely remained siloed within the urban areas and institutions that have directly participated in each program, with little cross-collaboration between programs or outreach to geographic areas that have not yet hosted any ndsr residents. assessments of the first dc, boston, and new york programs determined that "work needs to be done to fully realize the 'national' part of the national digital stewardship residency [...] the ndsr programs would benefit from banding together to connect their networks and expand them further." [ ] clir's research subsequently reinforced this conclusion, demonstrating that most members of the ndsr community "felt that connections across ndsr cohorts and initiatives should be formalized and strengthened." [ ] in order to realize this goal, metro applied to imls for funding to host a national event that would convene current participants, alumni, and organizers of all the ndsr programs, along with prospective organizers of future ndsr programs and other digital preservation professionals.
the ndsr symposium was originally designed as an opportunity to allow members of the ndsr community to build connections, present achievements, share knowledge, and discuss the sustainability and long-term goals of the ndsr initiative. once the announcement was made of clir's project to assess the impact of the ndsr initiative, the program organizers decided that the symposium could also serve as an opportunity to discuss the clir report as a community and, in addressing their findings, develop more formalized guidelines for the ndsr initiative as a resource for future programs and take steps towards coordinating ndsr at a national level. in order to ensure that stakeholders at all levels of the ndsr community had a voice in planning the symposium, ndsr program organizers asked current and past residents and hosts from all four extant programs to form a symposium program committee and take the lead in developing the curriculum for the event. after the philadelphia museum of art was awarded an imls grant to host the ndsr art program, a representative was invited to join the program committee. utilizing clir's report and recommendations, the program committee solicited proposals for the event around five suggested topics: models and strategies for making programs like ndsr sustainable; expanding the geographic reach of ndsr; methods of fostering a digital preservation community of practice; raising awareness of the ndsr program; and models and strategies for effective mentorship. project staff and the program committee worked to identify prospective organizers of future ndsr programs and invite them to attend the symposium, in order to solicit feedback on how best to support the expansion of the ndsr community. the project has also grown beyond the original proposal in encouraging members of the ndsr community to reflect on the program in a collaborative fashion. 
former and current residents who are attending the symposium will be meeting before the event to discuss their vision for the program as residents and how they can work to accomplish it. joint program resources have been developed, such as the collaborative website ndsr-program.org and the ndsr beacon email newsletter. the ndsr symposium will be held at the library of congress on april th and th, . with the support of project staff, and feedback from the advisory board of program organizers, the program committee has curated a curriculum for the symposium that includes presentations from residents, organizers and hosts. topics such as "building a communication network for collaborative projects" and "extending training findings beyond ndsr" focus on the work necessary to continue the development of a community of practice in digital preservation. the final half-day of the symposium will be dedicated to small group discussions of topics raised in the clir report, with the goal of gathering recommendations for an ndsr handbook which the program organizers will collaboratively write in the months following the symposium, and which will be made publicly available on the joint program website, ndsr-program.org. the handbook and results of the symposium will also be presented publicly at the next meeting of the national digital stewardship alliance in .

conclusion

the different iterations of ndsr are not only intended to advance the careers of a small number of individuals; they are designed as a long-term investment in developing the human resources and human connections needed to support the project of digital stewardship.
participation in an ndsr program (whether as a host institution committing to mentoring a new professional, or as a resident committed to learning and sharing their work with peers and colleagues) encourages the collaboration and community-building that are key to developing the kinds of impactful, nationally significant tools, projects and partnerships conceptualized in the national digital platform. as one archivist quoted in clir's survey reported, ndsr "helps foster this idea of changing the culture of archives from being a single person alone in a dark room to being a more community-based field." the programs have already seen considerable success in launching participants into the field of digital stewardship and enriching that field and community through their work. however, as with all new initiatives, ndsr must continue to look both inward and outward if it wishes to establish itself as a sustainable long-term model for community growth. the discussions, connections, and standards generated in response to clir's report and by the symposium will further the ndsr program and support the continued development of a robust digital preservation community of practice.

references

[ ] gallinger, michelle, "ndsr new york: assessment of the - program year," june .
[ ] mink, meridith beck, "keepers of our digital future," december .

about the authors

rebecca fraimow joined wgbh educational foundation as a resident in the - ndsr boston cohort and is now a project manager at the wgbh media library and archives, where she leads the american archive of public broadcasting national digital stewardship residency and the pbcore development and training project, as well as overseeing collections processing and preservation workflows. she has also worked as the digital projects coordinator at the dance heritage coalition, and is one of the founders of xfr collective, a video preservation nonprofit organization.
rebecca holds an ma in moving image archiving and preservation from nyu and a ba in english from stanford university.

meridith beck mink is a freelance research consultant and writer who also helps run a sustainable seafood company with her husband in sitka, alaska. she was the lead researcher for clir's assessment of the ndsr programs and was responsible for implementing the project, including designing the interview protocols and survey, and writing the final report. in her former position as a clir postdoctoral fellow in data curation for early modern studies, she worked at indiana university on the chymistry of isaac newton project and consulted on digital scholarship in the herman b. wells library's scholars commons. meridith received her ph.d. in the history of science from the university of wisconsin-madison. she holds an ma in history and ba in archaeology from simon fraser university.

margo padilla is the strategic initiatives manager at the metropolitan new york library council (metro). in addition to managing metro's strategic initiatives, including program development, grant management, and technology services, she was the project director for ndsr-ny and a resident in the inaugural ndsr-dc cohort. margo received her mlis with a concentration in management, digitization, and preservation of cultural heritage and records from san jose state university and her undergraduate degree from the university of california, berkeley.

copyright ® rebecca fraimow, meridith beck mink and margo padilla

disciplined: using educational studies to analyse ‘humanities computing’

melissa terras
school of library, archive and information studies, university college
abstract

humanities computing is an emergent field. the activities described as ‘humanities computing’ continue to expand in number and sophistication, yet no concrete definition of the field exists, and there are few academic departments that specialize in this area. most introspection regarding the role, meaning, and focus of ‘humanities computing’ has come from a practical and pragmatic perspective from scholars and educators within the field itself. this article provides an alternative, externalized, viewpoint of the focus of humanities computing, by analysing the discipline through its community, research, curriculum, teaching programmes, and the message they deliver, either consciously or unconsciously, about the scope of the discipline. it engages with educational theory to provide a means to analyse, measure, and define the field, and focuses specifically on the ach/allc conference to identify and analyse those who are involved with the humanities computing community.

introduction

humanities computing is a relatively new, and small, field of academic activity. although the community is growing, with an expansion of tools, techniques, and activities which identify themselves as ‘humanities computing’ (or its various pseudonyms), no definition of the subject exists, and very few academic institutions have a dedicated humanities computing department. this article looks towards education theory to ascertain what a discipline is, and to see how this can be used to define the status of humanities computing. this article also reports on an analysis of the humanities computing curriculum and community, from an educational, and curriculum, studies perspective.
as a novel and alternative approach to answering the perennial question ‘what is humanities computing?’, this research yields useful insights. as kelly ( , p. ) notes: ‘a study of curriculum, while not offering us spurious answers to questions of values, will . . . draw our attention to important questions that need to be asked about policies and practices and help us achieve the kind of clarity which will enable us to see underlying ideologies more clearly’. is humanities computing a discipline at all? does it exist as an academic field? the article is presented in sections. section introduces the type of activities associated with humanities computing, and describes the problems associated with trying to ascertain its status. the methodology used to analyse humanities computing in this enquiry is then sketched. section asks: what is an academic discipline? a definition of disciplinarity is propagated from educational theory, and humanities computing is assessed from this perspective. section looks at the curriculum and issues of the ‘hidden curriculum’. teaching programmes are contrasted and compared with the research agenda, and aspects about the identity of humanities computing are raised. section attempts to ascertain who constitutes the humanities computing community through analysis of available data. section concludes the research, highlighting issues raised and identifying future work that could be carried out to develop this research further.

correspondence: melissa terras, school of library, archive and information studies, henry morley building, university college, gower street, london wcie bt, uk. e-mail: m.terras@ucl.ac.uk
literary and linguistic computing, vol. , no. , . © the author . published by oxford university press on behalf of allc and ach. all rights reserved. for permissions, please email: journals.permissions@oxfordjournals.org doi: . /llc/fql advance access published on april

what is humanities computing?
academic activity associated with humanities computing typically revolves around specific applications, such as the development and analysis of large textual corpora, the construction of digital editions of works of literature, the creation of digital artefacts through the process of digitization, the use of ‘virtual reality’ for reconstruction of architectural models, etc. new techniques and technologies are continually being developed and applied to humanities data. the history of humanities computing is not discussed here, as it has been covered elsewhere by fraser ( ), schreibman et al. ( ), and vanhoutte (forthcoming ). however, defining humanities computing as an academic field is problematic. there are few established academic departments in the field. a lot of work in humanities computing is project-based, usually resulting in a product for other academics to utilize, and there is concern whether this is an academic endeavour. humanities computing ‘units’ or ‘centres’ often provide technical support facilities for humanities divisions in universities, meaning that humanities computing is often viewed as a support to ‘proper’ academic research. there are also few teaching programmes in existence, perhaps because it is hard to define a skills-set to pass on which would individually define the discipline, rather than just providing technical ‘training’ on specific computer technologies. this can create problems for those in the field. firstly, there is the question of academic kudos: if you are in a discipline which is not worthy of an academic department, is your research that meaningful or useful? there is often a bias from more traditional humanities scholars that work with computing is not ‘proper’ research. secondly, there are funding implications for research. research councils tend to ask the academic to identify which traditional discipline they belong to: humanities computing is not a ‘panel’ within itself.
scholars using humanities computing are often ‘too technical’ to be eligible for funding from the humanities sector, and ‘not technical enough’ to secure funding through engineering and computing science channels. this situation may be changing as computers and internet technologies become more pervasive and embedded in everyday, and academic, life, but an interdisciplinary scholar is often battling different cultures and regimes to succeed in either, or both, disciplines. finally, if the subject cannot define a set of core theories and techniques to be taught, is it really a subject at all? is a research community enough to define a ‘discipline’, or does this merely reflect a community of like-minded scholars who meet occasionally to swap battle scars? these problems have been fairly exhaustively detailed by papers from many of the luminaries in the humanities computing field. however, these papers have generally focused on the content of specific teaching programmes and the development of a curriculum. there was an entire conference devoted to ‘the humanities computing curriculum: the computing curriculum in the arts and humanities’ (siemens, ), at malaspina university college, nanaimo, british columbia, canada. most papers necessarily described the practical aspects of setting up humanities computing programs and courses, and defining an overview of their content. for example, gilfillan and musick ( ) outlined the practicalities involved in promoting the use of computing in humanities-based teaching and research at the university of oregon, and hockey ( ) examined the role of computing in the humanities curriculum at both postgraduate and undergraduate levels.
a seminar series was undertaken to define and generate a syllabus for a graduate course in knowledge representation for humanists at the university of virginia, which resulted in a comprehensive syllabus for a master’s degree in digital humanities (drucker et al., ), although this course was never actually established due to funding cuts (sending a disappointing message to the wider academic community). various papers from this seminar detail the problems in belonging to a discipline-less discipline (burnard, ; hockey, ; mccarty, ; mcgann, ; moulthrop, ; nerbonne, ). more generally, the advanced computing in the humanities (aco*hum) project produced a study on how computing was, is, and could be used in humanities subjects (de smedt et al., ). these studies all serve to illustrate how important defining the curriculum is to humanities computing, and how, as a nascent subject, much is still being done to define the teaching programme, and the field, although their focus is mostly (and necessarily) a practical approach to how teaching programmes can be implemented and integrated into academic departments and scholarly frameworks. additionally, some of the papers were concerned with ascertaining whether humanities computing is an academic endeavour or merely a support subject. various other papers exist that question the role and focus of humanities computing (aarseth, ; de smedt, ; orlandi; warwick, ). most work has been done by willard mccarty, senior lecturer in the centre for computing and the humanities at king’s college london (mccarty, , , b, , , forthcoming a, forthcoming b; mccarty et al., ; mccarty and short, ) and john unsworth, dean and professor of the graduate school of library and information science, university of illinois, urbana-champaign (unsworth, , , , – ).
however, these papers are written by academics within the field, describing their own experiences of teaching, learning, and research with very little thought given to educational theory: only one of these papers, burnard ( ), mentions in passing ‘educational theory from the s’ without providing any reference. the aim of this article is to apply the definitions and measures from education to the humanities computing community, to ascertain whether it exists as an academic subject. there has been much discussion within education as to what actually makes a discipline, or what defines the work of a group of academic individuals as a bona fide ‘subject’. academic culture can define a ‘tribe’ of scholars, whilst the span of disciplinary knowledge can be described as the ‘territory’ of the discipline (becker and trowler, ). ‘fields gradually develop distinctive methodological approaches, conceptual and theoretical frameworks and their own sets of internal schisms’ (ibid., p. ). what are the methodological approaches of humanities computing? is there a culture which binds the scholars together? or, is the humanities computing community merely that: a community of practice, which shares theories of meaning and power, collectivity and subjectivity (wenger, ) but is little more than a support network for academic scholars who use outlier methods in their own individual fields? additionally, the notion of the hidden curriculum is also of relevance. what thoughts are we projecting in our teaching programs and research as to the scope and relevance of humanities computing? this research is an ambitious attempt to provide an overview of an academic field. firstly, a literature review was carried out, both in humanities computing, and in education, to understand notions of disciplinarity and the hidden curriculum.
secondly, a series of interviews with ten scholars in humanities computing was undertaken: six from the united kingdom, two from the usa, one from canada and one from belgium. comments and opinions from scholars are integrated throughout this article. thirdly, four teaching programmes were compared and contrasted to see the focus of their teaching, and which notions of humanities computing were being projected onto students. subject focus was compared and contrasted with available research materials to see whether the teaching covered the same scope as the research: this is quantifiable through textual analysis of available conference abstracts. fourthly, a database was constructed of the humanities computing community, taking as its basis presenters at the main conference in the field: the annual association of computing in the humanities and the association of literary and linguistic computing joint international conference (ach/allc). in , this conference was held at the university of victoria, british columbia, canada, june – . an analysis of who attended provides an overview of who is part of the humanities computing community, how it functions, and what this projects onto the discipline as a whole.

disciplines, disciplinarity, and humanities computing

being part of a discipline gives a scholar a sense of belonging, identity, and kudos. but the idea of what constitutes a discipline is muddy, and often hinges around the bricks and mortar proof of a university department’s existence:

[a discipline] can be enacted and negotiated in various ways: the international; invisible college; individuals exchanging preprints and reprints, conferences, workshops . . . but the most concrete and permanent enactment is the department; this is where a discipline becomes an institutional subject.
the match between discipline and subject is always imperfect; this can cause practical difficulties when, for example, the (discipline-based) categories of research selectively do not fit the way the subject is ordered in a particular department (evans, , pp. – ).

this notion of institutionalizing the subject would seem to give gravitas: if you can point at an academic department, the discipline exists. however, this definition of a ‘discipline’ is problematic, as many have specialisms and subspecialisms, which may or may not be represented in every university department, and every ‘discipline’ is different in character and scope from the next:

most embrace a wide range of subspecialisms, some with one set of features and the other with different sets. there is no single method of enquiry, no standard verification procedure, no definitive set of concepts that uniquely characterises each particular discipline (becker and trowler, , p. ).

additionally, a ‘discipline’ is not an immutable topic of research or body of individuals: ‘for nothing is more certain in the lives of the disciplines, whatever the field, whatever the institutional setting, than that they are forever changing’ (monroe, , p. ). the discipline gains kudos from becoming permanently established in the university subject roll call, but does not having this institutional branding preclude a body of research and teaching from actually being a discipline? most ‘new’ academic subjects have had to gradually be accepted into the university pantheon, with much discussion along the way regarding whether they actually are disciplines in the first place. for example, there is continuing debate in the field of education as to whether it is really a discipline or not (scheffler, ; hughes, ; kymlicka, ; viñao, ). asking ‘is this a discipline?’ would seem akin to asking ‘is this art?’: it is, if the person involved in the activity thinks it is.
that said, although it is difficult to provide a definition of what a discipline may be, there are characteristics which are associated with disciplinary practice. disciplines have identities and cultural attributes. they have measurable communities, which have public outputs, and

can be measured by the number and types of departments in universities, the change and increase in types of he courses, the proliferation of disciplinary associations, the explosion in the number of journals and articles published, and the multiplication of recognised research topics and clusters (becker and trowler, , p. ).

disciplines have identifiable idols in their subject (clark, ), heroes and mythology (taylor, ) and sometimes artefacts peculiar to the subject domain, or ethnographic similarities in workspaces (becker and trowler, ), meaning that the community is defined and reinforced by being formally accepted as a university subject, but also instituting a publication record and means of output, and, more implicitly, by ‘the nurturance of myth, the identification of unifying symbols, the canonisation of exemplars, and the formation of guilds’ (dill, ). it is therefore possible to ascertain if humanities computing is a discipline by taking an overview of the activities of the field utilizing these measures.

is humanities computing a discipline?

opinion was split between the interviewed scholars in humanities computing as to whether it was a discipline. some felt very strongly that it was, others strongly denied it, defining a discipline as a ‘core set of skills’ or ‘lingua franca’ which could not be identified in the case of humanities computing. two maintained that it did not really matter: ‘i don’t know what it is. i don’t know if it is. actually, i doubt whether we need it to be’.
most identified that there was a definable community, but that they were bound together by the fact that they were traditional humanities experts who happened to use new technologies to research in their field. if technology is all there is in common, this does not make a discipline, as an academic commented:

hey, you write with a ballpoint pen, and i write with a ballpoint pen . . . let’s make us the blue pen club! it is what we write with the pen that is important, not the technology.

however, there are a number of activities solely associated with the humanities computing community. it is now over thirty years since the association of literary and linguistic computing (allc) was founded (in ), and almost twenty years since the first issue of the journal ‘literary and linguistic computing’ (published by oxford university press) was issued in . the association of computers and the humanities (ach) was founded in the early s. the humanist electronic discussion list, which describes itself as ‘an international electronic seminar on the application of computers to the humanities’, has been in operation during : more than million words on the subject have been posted during that time. there has been a yearly conference (held by allc) since , becoming an international conference (jointly held between allc and ach) since . other more local conferences emerge: digital resources in the humanities, a predominantly uk-based yearly conference, was first held in . mccarty and kirschenbaum ( , regularly updated) attempt to keep a register of conferences, associations, journals, and teaching programs in humanities computing: they currently list seven printed and eleven electronic journals devoted to humanities computing, thirteen professional societies, six specific online portals, and three dedicated discussion groups. clearly, something is going on that can be classed as ‘humanities computing’.
histories of and companions to the discipline have begun to emerge (fraser, ; schreibman et al., ; vanhoutte, forthcoming ), from research, scholarly, and institutional perspectives (warwick, ). when asked who the academic ‘heroes’ of humanities computing were, most experts came up with the same names: professors roberto busa, susan hockey, roy wisbey and john unsworth. others mentioned (professors mark greengrass, alan bowman, manfred thaler, lisa jardine, and ray siemens) were all active members in the field, and the heads of often ambitious and very successful initiatives in the discipline. as for artefacts and workspace: most of the experts’ workspaces were characterized by having one (or more) powerful computers, contrasted with shelves of books on traditional humanities subjects such as english literature, with the odd technical manual about the internet or extensible markup language (xml) thrown in. there was usually some large artwork on the wall (perhaps stressing how they are rooted in the ‘creativity’ of the humanities, not computer science, which has a bad name for being ‘geeky’ although it is also a creative discipline). there are also identifiable artefacts from humanities computing: the mug from drh , the rucksack from ach/allc . there are undoubtedly cliques of scholars in the community, unofficial discussion groups, friendships, scholarly support networks, mentoring programmes, and many other relationships associated with academic communities and disciplines active within humanities computing. the amount of activity would suggest that there was an identity associated with the humanities computing community, as well as issues of shared practice, and that the amount of academic activity detailed above does classify this as a discipline, rather than just a ‘community of practice’ (wenger, , p. ).
however, where the argument for humanities computing as a discipline falls down is in its institutionalization, or lack of it. mccarty and kirschenbaum ( ) provide ‘a structured list of departments, centres, institutes and other institutional forms that variously instantiate humanities computing’ (although this is slightly out of date). only institutions worldwide are listed that provide academic teaching programmes in the field. these are generally at the postgraduate level, with only a minor in humanities computing being available at the undergraduate level at two institutions: the ‘major’ degree is always a traditional humanities subject. additionally, the majority of these programmes are provided not through ‘departments’ but through ‘centres’ or ‘institutes’, such as the centre for computing in the humanities at king’s college london, or the humanities advanced technology and information institute at the university of glasgow. as computing becomes more pervasive, information technology skills are becoming more important to all scholars, and these centres usually also provide general it skills training to humanities scholars. this makes it hard to differentiate between general training in computing applications and bona fide ‘academic’ study. ‘humanities computing’ has yet to be institutionalized as an academic subject.
there was a feeling amongst some of those interviewed that ‘the lady doth protest too much’ regarding the perennial ‘is-humanities-computing-a-discipline’ question. surely if it was, it would have become established by now? but given the above evidence, it would seem to be established as a discipline. the question is why it is not an established university subject. this may be because there is not a definable skills set or focus that can be passed on to the next generation of scholars.
additionally, the subject is reliant on technologies which continually change, requiring learning of specific applications and the application of knowledge and action rather than the traditional humanities focus on development of the ‘self’ (barnett et al., , p. ). also, there is an inherent understanding that the domain will always exist as applied to traditional humanities scholarship, as it uses computational techniques to undertake humanities research. it does not exist in ‘itself’ away from the humanities, and will always depend on the traditional disciplines to provide questions that need to be answered. experts variously described this as ‘symbiosis’ (giving a positive view of the intertwining of computer technologies with the humanities) or the negative ‘parasitic’: ‘it’s like mistletoe. it cannot exist on its own’. the humanities computing scholar was often described as a ‘magpie’ who had to visit other domains to gather shiny pieces of knowledge for use at home, or a ‘chameleon’ who has to jump from one mode of disciplinary thinking and culture to another. mccarty (forthcoming b) describes humanities computing as an ‘archipelago’ of subjects that we visit. we are like a ‘jack of all trades: master of none’.
finally, to be able to understand how computing technologies can benefit the humanities there needs to be an understanding of how the humanities function. therefore, most scholars need traditional humanities training or qualification before they can use humanities computing: it is essentially a research environment, and that befits teaching at a postgraduate level better than undergraduate level.
humanities computing would seem to display many traits that are associated with being a discipline, apart from being institutionalized as a ‘proper’ academic subject. this raises problems, as detailed in section , regarding kudos and funding.
however, the fact that a small number of teaching programmes are available suggests that there is something to be taught, and this is analysed in section .
m. terras literary and linguistic computing, vol. , no. ,
curriculum, hidden curriculum, and humanities computing
the syllabus and curriculum of humanities computing have never really been decided (as demonstrated by the discussion papers listed in section ). however, some teaching programmes do exist. this section gives a brief overview of some programmes and compares and contrasts their content, comparing this to the research agenda of humanities computing through analysis of conference abstracts in the field. issues of the ‘hidden curriculum’ are then discussed, illuminating what message humanities computing is giving out through its teaching programmes and institutional representation.
four university courses were looked at to compare and contrast their content and implementation. these were:
( ) the ma in applied computing in the humanities in the centre for computing in the humanities, at king’s college london. this is a one year masters degree.
( ) ‘humanities computing: electronic text’, a one-term, one module course at masters level in the english department of the university of antwerp.
( ) ‘digital resources in the humanities’, a one-term, one module course at masters level in the school of library, archive, and information studies, university college london.
( ) ‘digital humanities’, a one-term, one module course at masters level in the graduate school of library and information studies, university of illinois at urbana-champaign.
the syllabus and curriculum
from an educational and curriculum studies perspective, the term ‘curriculum’ applies not only to the content of a particular subject of study, but refers to the total programme of an educational institution: being ‘the overall rationale for any educational programme, including those more subtle features of curriculum change and development and especially those underlying elements [explanation and justification] . . . which are the most crucial element in curriculum studies’ (kelly, , p. ). syllabus here is taken to mean the course content.
the courses listed above have a remarkably similar focus, mostly taking as their syllabus the techniques used to produce, manipulate, and deliver electronic text. some, such as antwerp, focus exclusively on this, whilst others, such as ucl, have this as the focus but introduce some other computational application to the humanities in the course of teaching, such as digitization and outlier methods such as virtual reality. illinois is more discursive than the others, with more written elements and less technical work, and of course the one year course at king’s is more extensive than the others, and can go into more depth about various tools and techniques. there is a significant amount of group work, which is relatively rare in the humanities. courses are relatively small and have much direct contact with the tutors, with practical sessions as well as lecture and tutorial sessions. assessment is by practical project, or take-home exam, in which the students are expected to demonstrate that they can implement the technologies whilst understanding the theory behind them. but the focus of these courses is digital text, and the theory, tools, and technologies which can be used for markup and analysis.
the reading lists are remarkably similar, and the projects which the students have to do involve practical project work where they create an electronic text using the techniques taught in the session (all of the courses teach extensible markup language (xml), and the form of xml espoused by the text encoding initiative (tei): major technical developments by the humanities computing research community). there is good reason for this, as it would seem that it is the thrust of academic research within the discipline. this can be shown by a simple analysis of conference abstracts published for ach/allc, which were obtained in electronic format, and run through a commonly used text analysis program, concordance, to show which are the most commonly used words in these papers (fig. ).
all available conference abstracts from ach/allc were mined, from to , with the exclusion of , which was not available. this resulted in a corpus of , , words, which, when analysed, demonstrated that ‘text’ is indeed the focus of humanities computing research. further analysis (not shown here) demonstrates that this is consistently the case across all years of the conference. humanities computing research is predominantly about text: it follows that the teaching programmes should concentrate on this aspect. this also demonstrates that the teaching and research agendas are similar; this is perhaps debatable in other subjects, and could be the focus of further research. it would seem then, that the rationale for the courses is to pass on the theory and techniques used in the humanities computing research community.
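the word-frequency analysis of the conference abstracts can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the concordance program used in the study: the stop-word set is a small hypothetical stand-in for the glasgow list, the two sample abstracts are invented, and the rate is taken as occurrences per thousand of all tokens (the text does not say which denominator was used).

```python
import re
from collections import Counter

# Hypothetical stand-in for a real stop-word list (the study used the Glasgow list).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def occurrences_per_thousand(abstracts, stop_words=STOP_WORDS):
    """Return (word, rate) pairs, most frequent first.

    rate = 1000 * count / total tokens, where the total includes stop words.
    That denominator is an assumption made for this sketch.
    """
    tokens = [tok for abstract in abstracts for tok in tokenize(abstract)]
    if not tokens:
        return []
    counts = Counter(tok for tok in tokens if tok not in stop_words)
    total = len(tokens)
    return [(word, 1000 * n / total) for word, n in counts.most_common()]


def share_of_abstracts_containing(abstracts, word):
    """Fraction of abstracts in which `word` appears at least once."""
    if not abstracts:
        return 0.0
    hits = sum(1 for abstract in abstracts if word in tokenize(abstract))
    return hits / len(abstracts)


if __name__ == "__main__":
    # Invented sample abstracts, for illustration only.
    sample = [
        "The text of the corpus is encoded in XML.",
        "Analysis of text and markup for the humanities.",
    ]
    for word, rate in occurrences_per_thousand(sample)[:5]:
        print(f"{word}\t{rate:.1f}")
    print(share_of_abstracts_containing(sample, "text"))
```

the second function corresponds to the observation that a word may occur in every abstract: it reports the share of abstracts containing a word, which is a different measure from its overall rate.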
[fig. : bar chart, ‘most popular words in ach/allc conference abstracts – ’. y-axis: occurrence per thousand. x-axis, in plotted order: text, information, university, texts, humanities, language, use, data, new, project, research, analysis, digital, electronic, used, computing, work, paper, words, xml, document, number, markup, corpus, english, encoding, students, tools, word, literary, database, model, web.]
fig. the most commonly used words in abstracts of the association for computers and the humanities and association for literary and linguistic computing joint conference. words are shown in occurrences per thousand, excluding words like ‘the’ and ‘a’ (using the glasgow stop words list). ‘text’ is by far the most commonly used word, statistically occurring in every single abstract. other key words demonstrate that humanities computing is about the computational analysis of data, especially language, words, and documents.
the hidden curriculum
the term ‘hidden curriculum’, coined by philip jackson ( ), refers to the fact that education is a socialization process, and that cultural norms, socially accepted practices and acceptable types and levels of knowledge are passed on to students through the way the teaching process is constructed. investigation into the hidden curriculum can be used to understand more fully how the educational process works at different institutional levels (see snyder, ; tobias, ; and margolis, for further discussion).
academics in humanities computing were asked about the aspects of teaching and research which could pass on implicit messages about the subject to either the student, or to the wider academic community. it was difficult to gather statistics about the courses regarding the usual aspects of hidden curriculum research (gender, social background, ethnicity, etc.), as the courses were new, of different sizes, and in very different organizations. various other issues were raised.
( ) in teaching specific technologies, specifically regarding text processing and manipulation, the field was not seen to be engaging in the full spectrum of technology development, but a narrow focus. because the field was so insular, and did not engage with computer science, it was shielding itself away from further developments.
( ) all of these courses are taught in humanities faculties: only one course exists which teaches people in a computing science department. scholars were seen to need humanities training before they could be ‘trusted’ to undertake computational analysis of humanities data, and this precluded students with a background in technical subjects such as computing or engineering being ‘allowed’ to join the field at masters level, when actually they could have a lot to contribute.
( ) links between traditional humanities departments could not be guaranteed as there was scepticism about the value of some courses. where links were made, these were generally because of a few keen individuals in the institution.
( ) the fact that courses are taught (or research is done) in ‘centres’ or ‘institutes’ for the most part, rather than ‘departments’, suggests to both students and other academics that this is not a proper subject. this has an effect on recruitment for courses. the closure of a research institute by a major university (oxford; see burnard ( )), and the funding of the creation but not implementation of an ma degree (virginia, see drucker et al., ) have also done a lot of damage to the growth of the ‘subject’ because of the way these actions have been perceived in the wider community. humanities computing was seen as a ‘help desk’ rather than as a research field in its own right.
( ) the humanities computing community is small and friendly, and it was seen that graduate students could rapidly become part of this community and have the opportunity to engage with leaders in the field from quite early in their study of the subject. however, it was relatively insular.
( ) the use of small group and practical project work was very different from traditional humanities disciplines and required a different skill set from that of the average humanities graduate student. students have to be technically very adept, and also have access to technology to be able to undertake the courses.
( ) it can be very difficult to obtain funding to undertake graduate research in humanities computing, although this may be changing as computing becomes more pervasive throughout all disciplines.
although there is a similar curriculum and syllabus throughout available courses, which relates very closely to the research agenda of humanities computing, there are various issues that need to be addressed in the way that the discipline projects its values onto students, and to the wider academic world. although the community is warm and welcoming, humanities computing needs to engage more with both computer science and humanities disciplines, rather than being an insular community. issues of curriculum and the hidden curriculum require much more attention and analysis in the future if humanities computing is to expand and become institutionalized as an academic subject.
who is part of the humanities computing community?
it has been suggested in section that academic fields can be ‘measured’ by their number of publications, associations, conferences, etc. in section , it was suggested that the humanities computing community was small and insular, and it has also been suggested that academics active in research in humanities computing generally are employed to research in traditional disciplines. this section aims to test these claims by briefly attempting to measure the humanities computing community.
membership of associations, journals, and discussion groups
the major associations in humanities computing are the association for literary and linguistic computing (which is based in europe) and the association for computers and the humanities (based in the usa). subscribers can be members of both, but as the membership is tied to paid subscription to the journal ‘literary and linguistic computing’ and as it is necessary to choose one or the other or pay extra to be a member of both, most subscribers belong to one organization only. statistics for membership of these, and also the main free discussion lists in the subject (humanist and the text encoding initiative list) were collected (table ). there are between – scholars who are willing to pay for yearly subscription to the field journal (members can subscribe once but can be members of both the organizations), and over interested parties in the field who will sign up for free, almost daily, postings and discussions about the discipline. over engage in almost daily discussions about the application of textual markup in the humanities. although the community is relatively small, it is not inconsequential. but who are these people?
analysis of ach/allc conference abstracts
one way to measure who partakes in the humanities computing community, given that it has such a diverse spread, is to analyse conference proceedings, attendance lists, and abstracts. the biggest conference in humanities computing is ach/allc (see p. ). attendance lists were not available for any of the annual conferences, and only a selection of full papers will ever be published. however, the -word abstracts selected for presentation from those submitted were made available by the program committee for analysis.
a database was constructed, from abstracts and personal webpages, of all the presenters attending ach/allc , with their name, paper title, department and institution affiliation, and job title stored for each presenter. not everyone who undertakes teaching or research in humanities computing presented at (or attended) this conference, but as the single large conference in the field, it should provide an overview of activity, affiliation, and structure of humanities computing.
a total of individuals were presentation authors at ach/allc, which consisted of sessions: eight full sessions, thirteen panel sessions, and individual papers. (indicating that there are, on average, more than two presenters associated with each paper: perhaps a rarity in humanities scholarship?) a few scholars presented more than one paper.
the presenters came from countries (fig. ), logged by the country of the institution they are affiliated to. the domination by the usa and canada is not altogether surprising, considering the location of the conference. when the conference is held in paris in , there will probably be more presenters submitting papers from europe. nevertheless, % of those presenting are from countries where english is the native tongue. only five abstracts were accepted in other languages: french, german, and spanish. this may construct barriers to non-native english speakers adopting the techniques developed by scholars in the discipline. additionally, the scholars are all from western countries (the one presenter from africa being from its richest and most ‘western’ country). humanities computing relies on access to computational technologies, which would exclude scholars from poorer institutions from participating. also noticeable in their absence are china and india, which have both experienced massive technological growth and expansion of internet usage in recent years.
table membership of various initiatives in humanities computing
name | membership
allc | (approx)
ach | (approx)
humanist discussion list | subscribers
text encoding initiative discussion list | subscribers
the presenters at the conference come from different academic institutions, with four scholars coming from industry. the host institution, the university of victoria, fields the most presenters (although it does have a very strong humanities computing centre). this attendance is matched by the university of illinois at urbana-champaign, which has a large and strong library school. most institutions represented are fairly large, well-known and respected universities (fig. ). in a few institutions, humanities computing is more prevalent, but there still remains a large number of lone, or almost lone, scholars in the field: twenty institutions fielded two scholars, fifty-two fielded a lone scholar (although a large number of these work or present with scholars from other institutions).
the scope of the humanities computing community can also be judged by the host department each presenter is affiliated to (table ). the most represented academic discipline is library and information studies. this subject has made extensive use of technology for the organization, storage, and retrieval of data. english follows, as literary and linguistic textual analysis and manipulation are common applications in the field (hence the name of ‘literary and linguistic computing’). reassuringly, a large proportion of scholars are linked to humanities computing centres, showing that their presence is central to the field. a distinction has been made between library schools and university libraries (as a place for training of librarians versus the university facility), but staff from university libraries are also well represented, indicating the take-up of technologies in this area. linguistics is also strongly represented.
interesting, however, is the fact that seventeen scholars are computer scientists, indicating that humanities computing is of interest to not only the humanities scholar, but also those involved in computer science. digital projects are specific projects which use humanities computing techniques to construct digital resources (for example matrix, the centre for humane arts, letters and social services online, which develops online teaching tools). various other arts and social science disciplines are also part of the humanities computing community, indicating that the techniques, theory, and applications discussed at this meeting are of broad interest in the humanities themselves.
[fig. : bar chart, ‘presenters at ach/allc by institution country’. y-axis: presenters (%). x-axis (country), in plotted order: usa, canada, uk, spain, germany, netherlands, japan, italy, australia, finland, france, hungary, ireland, south africa, sweden.]
fig. presenters at ach/allc by country of institution. it can be clearly seen that the conference is dominated by those from the usa and canada.
in addition to tracking the academic affiliation of presenters, their job titles were noted. again, this required some degree of abstraction, and it is also possible that lecturers could be grouped further: a ‘lecturer’ can be seen as being akin to being an assistant professor, for example. additionally, five presenters’ job titles could not be ascertained. the type of job undertaken with the number of presenters is shown subsequently: most represented job first.
the resulting spread of academic positions demonstrates that those involved in humanities computing cover the whole spectrum of academic posts: there is a high number of professors, associate and assistant professors, lecturers, researchers and graduate students, as well as support staff, directors of projects, web developers, post-docs, and independent consultants. although it is hard to make any statistical judgement on this without comparing it to other disciplines, the fact that this wide range of posts is represented would suggest that humanities computing functions as an academic field where promotion and development is possible. it is not only a service to be provided to humanities scholars, but a discipline in its own right. that said, many of the academic professors earned their chair from more traditional disciplines, such as english or linguistics, without the use of computing, and their interest in computing may have come after significant academic success had already been achieved.
[fig. : bar chart, ‘most represented institutions’. y-axis: number of presenters. x-axis (institution), in plotted order: university of victoria, university of illinois at urbana-champaign, university of kentucky, king’s college london, simon fraser university, brown university, acadia university, universidad de las palmas de gran canaria, universitat oberta de catalunya, university college london, new york university, university of maryland, university of toronto, university of duisburg-essen, tokyo institute of technology, university of nijmegen, university of wolverhampton, michigan state university, conseil national de recherche du canada, mcmaster university, université laval, university of alberta, university of saskatchewan, university of western ontario, university of oulu, university of pisa, rice university, texas a+m university, university of north carolina and chapel hill, university of oregon.]
fig. most represented institutions. any institution fielding less than three scholars is not shown.
this analysis has shown that, although there are very few academic departments called ‘humanities computing’, there is a community of scholars who partake in the discipline and who cover a broad range of traditional academic disciplines. these scholars are involved at every level from undergraduate student to professor. humanities computing is international, if international is limited to developed, western countries, that is. nevertheless, the fact that such a community clearly exists would suggest that there is enough academic activity being undertaken to identify this as a separate field in its own right, confirming the ‘disciplinary’ status.
further analysis of the abstracts from ach/allc would include citation and publication analysis, which could provide further information regarding the operation of the field and how it interacts with others. additionally, it would be useful to compare these results to previous ach/allc conferences, to see any potential changes in the community, and to compare and contrast the humanities computing community with those from other disciplines.
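the tallies reported in this section (presenters per country, institution, department, and post) amount to grouping and counting records in the presenter database. a minimal python sketch follows; the record layout, field names, and any sample rows are hypothetical, chosen only to mirror the fields the text says were stored, and the default threshold of three matches the cut-off stated in the institutions figure.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Presenter:
    """One record per presenter; the field names are illustrative assumptions."""
    name: str
    institution: str
    country: str
    department: str
    job_title: str


def tally(presenters, field):
    """Count presenters by the given attribute, e.g. 'country'."""
    return Counter(getattr(p, field) for p in presenters)


def percentages(presenters, field):
    """Share of presenters (in percent) for each value of `field`."""
    counts = tally(presenters, field)
    total = sum(counts.values())
    return {value: 100 * n / total for value, n in counts.most_common()}


def well_represented(presenters, field="institution", minimum=3):
    """Values of `field` that account for at least `minimum` presenters."""
    return {v: n for v, n in tally(presenters, field).items() if n >= minimum}


def lone_scholars(presenters):
    """Institutions represented by exactly one presenter, sorted by name."""
    counts = tally(presenters, "institution")
    return sorted(inst for inst, n in counts.items() if n == 1)
```

keeping one record per individual, rather than one per paper, means that a scholar who presented more than once is still counted once in each tally.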
table official departmental affiliation of presenters
academic department | number of presenters
school of library and information studies |
english |
humanities computing centre |
library |
linguistics |
computing science |
digital project |
education |
literature |
classics |
history |
computational linguistics |
information services |
french italian |
german linguistics |
humanities |
philology |
phonetics |
social sciences |
dutch linguistics and literary studies |
management sciences |
slavic languages and literatures |
archaeology |
art and design |
cognitive science |
communications |
economics and business |
multimedia |
philosophy |
public policy |
retired |
sociology |
spanish |
women and gender studies |
table official posts of presenters
job type | number of presenters
graduate student |
professor |
researcher |
assistant professor |
lecturer |
associate professor |
director |
librarian |
programmer/developer |
post-doctoral fellow |
associate director |
computing officer |
humanities computing specialist |
research development |
senior analyst |
senior lecturer |
undergraduate student |
consultant |
manager |
research assistant |
teacher |
web architect |
archivist |
assistant curator |
assistant dean |
reader |
conclusion
this article has taken an externalized viewpoint, from educational theory, to demonstrate that humanities computing consists of a diverse community from various subjects in the humanities, whose activities constitute those associated with an academic discipline. although a few teaching programmes exist, humanities computing has not yet been accepted as a subject by the majority of institutions, and this can cause problems to scholars undertaking research in this area.
this enquiry has raised points about the acceptance of humanities computing by both academics and students, whilst demonstrating that there is an identifiable community operating in the field of computing and the arts, from various traditional academic subjects, at all academic levels from the student to the professor.
further studies need to be carried out to further analyse and define the humanities computing discipline, field, and community. a citation analysis could be carried out to see which texts are cited by scholars in the field: are they computing science texts, or pure humanities texts? which journals are most popular? which are the seminal texts in humanities computing that emerge from this analysis? who would be the most cited author(s)? continuing this analysis, it would then be useful to return to individual scholars in humanities computing and analyse where they publish their articles: what is the publication scope of humanities computing? how could this be measured, and what could it tell us about the field? do humanities computing scholars publish in ‘traditional’ humanities single-subject journals, or is there a cross-over with computing science? looking at publication records would show the impact factor that humanities computing scholarship has in the wider academic field, and so could illuminate some of the boundaries that the discipline operates within.
further analysis of the curriculum and hidden curriculum is needed across the teaching programmes, comparing humanities computing both with pure computing science and with more traditional humanities academic subjects. students from various courses could be interviewed so that more detailed information could be ascertained about the different issues present between different academic subjects.
there is also room for engaging more with curricular theory, on issues of the hidden curriculum and how we can intrinsically promote and expand humanities computing through our teaching programmes and methods.
by turning to a different discipline, education, to understand its views on issues of disciplinarity, curriculum, and identity, it has been possible to measure and analyse the humanities computing community and academic activity in a novel and illuminating way, highlighting areas of concern, and confirming that we do, indeed, seem to function as an academic discipline. what this research has not done is to provide an over-arching definition of the ‘subject’: it has demonstrated that humanities computing exists as an academic discipline, without having to be accepted into the university subject pantheon. although this creates problems with funding and academic kudos, it can also be seen as an indication of the strength of the discipline: the community exists, and functions, and has found a way to continue disseminating its knowledge and encouraging others into the community without the institutionalization of the subject. this gives the discipline and the scholars within it additional freedom: if they are not defined, or their activities are not prescribed, then they are free to develop their own research and career paths, which may not fit into the normal mode of operation for academic subjects, but could allow the subject to remain fluid and undefined. is it such a bad thing if a definition of the subject does not exist?
who, for example, ever learnt anything of significance of learning or loving by defining these concepts? reflecting on and writing about learning should preserve or create an openness, which is a fundamental part of the practice of learning, rather than the closure of categorization, which has more to do with oppression and control. (rowland , p. ).
there may be a time when every academic institution worldwide may have some element of humanities computing research and teaching present within it. alternatively, given that computing is becoming more pervasive, perhaps the humanities computing scholar will just be accepted back into the individual discipline they are applying the techniques to: the safe haven of a community of humanities scholars who happen to use computational techniques may no longer be needed. the techniques and tools of humanities computing will then become absorbed back into the support function of information systems and services in academic institutions.
the field may only flourish as an academic subject if it becomes less insular and interacts both with computer science and those humanities scholars who are less willing to accept computing as part of their research tools. research and teaching methods peculiar to humanities computing have to be promoted and developed as useful adjuncts to usual training for humanities students. the community must continue to develop, but extend its remit to be more inclusive, international, and interdisciplinary: in the cross-faculty sense, encouraging work between the sciences and the arts. humanities computing is an emergent discipline, which may or may not flourish into an academic subject, depending on whether the community works to extend its focus, scope, and relevance.
acknowledgements
the author would like to thank the following academics involved in humanities computing for their aid in researching this work (in no particular order): edward vanhoutte, lou burnard, dr willard mccarty, professor seamus ross, professor susan hockey, dr claire warwick, dr mike fraser, dr stan ruecker, professor john unsworth and dr wendell piez. the author also thanks alejandro bia for providing abstracts from ach/allc ahead of public release and andrew ostler who provided technical advice on data mining.
references
aarseth, e. ( ).
the field of humanistic informatics and its relation to the humanities, human it, ( ): – . http://www.hf.uib.no/hi/espen/hi.html.
barnett, r., parry, r., and coate, k. ( ). conceptualising curriculum change. teaching in higher education, ( ): – .
becher, t. and trowler, p. r. ( ). academic tribes and territories. buckingham: the society for research into higher education and open university press, nd edn.
burnard, l. ( ). is humanities computing an academic discipline? or, why humanities computing matters. part of is humanities computing an academic discipline?, interdisciplinary seminar, university of virginia. http://www.iath.virginia.edu/hcs/burnard.html.
burnard, l. ( ). humanities computing in oxford: a retrospective. http://users.ox.ac.uk/~lou/wip/hcu-obit.txt
clark, b. ( ). academic culture. working paper number . new haven, ct: yale university higher education research group.
de smedt, k. ( ). some reflections on studies in humanities computing. literary and linguistic computing, : – .
de smedt, k., gardiner, h., ore, e. et al. (eds). ( ). computing in humanities education, a european perspective. university of bergen. http://helmer.aksis.uib.no/acohum/book/
dill, d. d. ( ). academic administration. in clark, b. r., and neave, g. (eds), the encyclopedia of higher education, vol. ii. oxford: pergamon press, pp. – .
drucker, j., unsworth, j., and laue, a. ( ). final report for digital humanities curriculum seminar, media studies program, college of arts and science, university of virginia. http://www.iath.virginia.edu/hcs/dhcs/
evans, c. ( ). choosing people: recruitment and selection as leverage on subjects and disciplines. studies in higher education, ( ): – .
fraser, m. ( ). a hypertextual history of humanities computing. http://users.ox.ac.uk/~ctitext /history/
gilfillan, d. and musick, j. ( ). wiring the humanities at the university of oregon: experiences from year .
paper at “the humanities computing curriculum: the computing curriculum in the arts and humanities”, – november , malaspina university college, nanaimo, british columbia, canada, http://web.mala.bc.ca/siemensr/hccurriculum/
educational studies to analyse ‘humanities computing’ literary and linguistic computing, vol. , no. ,
hockey, s. ( ). is there a computer in this class? part of is humanities computing an academic discipline?, interdisciplinary seminar, university of virginia. http://www.iath.virginia.edu/hcs/hockey.html.
hockey, s. ( ). towards a curriculum for humanities computing: theoretical goals and practical outcomes. paper at “the humanities computing curriculum: the computing curriculum in the arts and humanities”, – november , malaspina university college, nanaimo, british columbia, canada, http://web.mala.bc.ca/siemensr/hccurriculum/
hughes, e. c. ( ). is education a discipline? in the sociological eye: selected papers. chicago: aldine.
jackson, p. w. ( ). life in classrooms. new york: holt, rinehart and winston.
kelly, a. v. ( ). the curriculum: theory and practice, london: paul chapman, th edn.
kymlicka, b. b. ( ). the faculty of education: an interpretation of history and purposes. http://www.oqe.org/doc/kymlicka.pdf (accessed june ); ( ) literary and linguistic computing, .
margolis, e. (ed.) ( ). the hidden curriculum in higher education. new york: routledge.
mccarty, w. ( ). what is humanities computing? toward a definition of the field. http://www.kcl.ac.uk/humanities/cch/wlm/essays/what/
mccarty, w. ( ). humanities computing as interdiscipline. is humanities computing an academic discipline? iath, university of virginia, november . http://www.kcl.ac.uk/humanities/cch/wlm/essays/inter/
mccarty, w. ( b).
‘we would know how we know what we know: responding to the computational transformation of the humanities’, for the conference ‘the transformation of science – research between printed information and the challenges of electronic networks’ held by the max planck gesellschaft, schloss elmau, may to june . http://www.kcl.ac.uk/humanities/cch/wlm/essays/know/
mccarty, w. ( ). humanities computing: essential problems, experimental practice. literary and linguistic computing, : – .
mccarty, w. ( ). humanities computing. preliminary draft entry for the encyclopedia of library and information science, new york: dekker.
mccarty, w. (forthcoming a). humanities computing. london: palgrave.
mccarty, w. (forthcoming b). tree, turf, centre or archipelago? poetics of disciplinarity for humanities computing. literary and linguistic computing, ( ).
mccarty, w. and kirschenbaum, m. ( ). institutional models for humanities computing. http://www.kcl.ac.uk/humanities/cch/allc/imhc/
mccarty, w. and short, h. ( ). a roadmap for humanities computing. http://www.kcl.ac.uk/humanities/cch/allc/reports/map/
mccarty, w., burnard, l., deegan, m., anderson, j., and short, h. ( ). ‘root, trunk, and branch: institutional and infrastructural models for humanities computing in the u.k’. panel session at the joint international conference of the association for computers and the humanities and the association for literary & linguistic computing, queen’s university, kingston, ontario, canada – june, .
mcgann, j. ( ). iath and humanities computing. part of is humanities computing an academic discipline?, interdisciplinary seminar, university of virginia. http://www.iath.virginia.edu/hcs/mcgann.html.
monroe, j. ( ). introduction: the shapes of fields. in writing and revising the disciplines. ithaca: cornell university press, pp. – .
moulthrop, s. ( ).
computing, humanism, and the coming age of print. part of is humanities computing an academic discipline?, interdisciplinary seminar, university of virginia. http://www.iath.virginia.edu/hcs/moulthrop.html.
nerbonne, j. ( ). humanities computing: a federation of disciplines. part of is humanities computing an academic discipline?, interdisciplinary seminar, university of virginia. http://www.iath.virginia.edu/hcs/nerbonne.pdf
orlandi, t. (n.d.). the scholarly environment of humanities computing: a reaction to willard mccarty’s talk on the computational transformation of the humanities. http://rmcisadu.let.uniroma .it/~orlandi/mccarty .html.
rockwell, g. ( ). is humanities computing an academic discipline? part of is humanities computing an academic discipline?, interdisciplinary seminar, university of virginia. http://www.iath.virginia.edu/hcs/rockwell.html.
rowland, s. ( ). the enquiring university teacher. milton keynes: society for research into higher education and open university press.
scheffler, i. ( ). is education a discipline? in walton, j. and kuethe, j. l. (eds), the discipline of education. madison: university of wisconsin press, pp. – .
schreibman, s., siemens, r., and unsworth, j. (eds). ( ). a companion to digital humanities. blackwell companions to literature and culture, blackwell publishing.
siemens, r. ( ). the humanities computing curriculum: the computing curriculum in the arts and humanities, – november .
malaspina university college, nanaimo, british columbia, canada, http://web.mala.bc.ca/siemensr/hccurriculum/
snyder, b. r. ( ). the hidden curriculum. cambridge, ma: mit press.
taylor, p. j. ( ). an interpretation of the quantification debate in british geography. transactions of the institute of british geographers n.s., : – .
tobias, s. ( ). the hidden curriculum: faculty-made tests in science. london: plenum.
unsworth, j. ( ). networked scholarship: the effects of advanced technology on research in the humanities, presented at harvard university, cambridge, ma, november . http://www .isrl.uiuc.edu/~unsworth/gateways.html.
unsworth, j. ( ). digital research in the humanities: community, collaboration, and intellectual technologies, petrou lecture, university of maryland, april . http://www .isrl.uiuc.edu/~unsworth/petrou.umd.html.
unsworth, j. ( ). ‘what is humanities computing and what is not?’ a talk delivered in the distinguished speakers series of the maryland institute for technology in the humanities at the university of maryland, college park md, october . http://www.iath.virginia.edu/% ejmu m/mith. .html.
unsworth, j. ( ). ‘the emergence of digital scholarship: new models for librarians, scholars, and publishers,’ delivered as part of the new scholarship: scholarship and libraries in the st century, dartmouth college, hanover, nh, november . http://www .isrl.uiuc.edu/~unsworth/dartmouth. /
unsworth, j. ( ). tool-time, or ‘haven’t we been here already?’ ten years in humanities computing. delivered as part of ‘transforming disciplines: the humanities and computer science,’ january . washington, dc. http://www .isrl.uiuc.edu/~unsworth/carnegie-ninch. .html.
unsworth, j. ( ). what is humanities computing (and what is not)? texas a&m, as part of the humanities informatics lecture series, college station, tx, september . http://www .isrl.uiuc.edu/~unsworth/texas-hc.html.
vanhoutte, e. (forthcoming ).
electronic scholarly editing: history, theory, applications, and implications. ph.d. dissertation. antwerp: university of antwerp.
viñao, a. ( ). a history of education in the th century: a view from spain, mexican journal of educational research, ( ): – . available from http://www.comie.org.mx/revista/pdfsenglish/ carpeta / investtem engl.pdf
warwick, c. ( ). no such thing as humanities computing? an analytical history of digital resource creation and computing in the humanities. paper given at allc göteborg university, – june . submitted.
wenger, e. ( ). communities of practice: learning, meaning, and identity. cambridge: cambridge university press.

notes

it should be noted that ‘humanities computing’ can also be referred to as digital humanities, digital resources for the humanities, digital resources in the humanities, cultural and heritage informatics, humanities computer science, and literary and linguistic computing. throughout this article, ‘humanities computing’ is used for consistency.
http://web.uvic.ca/hrd/achallc /
http://www.allc.org/ . the association of literary and linguistic computing published their journal twice yearly from to , when this was merged with allc bulletin to become literary and linguistic computing ( ).
http://www.ach.org/
http://www.princeton.edu/~mccarty/humanist/
http://www.kcl.ac.uk/humanities/cch/allc/refdocs/conf.htm
http://www.drh.org.uk/
http://www.w .org/xml/
http://www.kcl.ac.uk/humanities/cch/
http://www.hatii.arts.gla.ac.uk/
http://www.kcl.ac.uk/humanities/cch/ma/ .html
http://www.kantl.be/ctb/vanhoutte/teach/ hc .htm
http://www.ucl.ac.uk/slais/melissa-terras/ drh.htm
http://www .isrl.uiuc.edu/~unsworth/ lis dh-s .html
http://www.w .org/xml/
http://www.tei-c.org/
http://www.concordancesoftware.co.uk/
they were downloaded from the relevant web pages using a utility called pagesucker, http://www.pagesucker.com/, and concatenated using a python script, for analysis with concordance.
the msc in it and the humanities at the university of glasgow recently changed its name to the msc in information management and preservation, which is jointly taught between the humanities advanced technology institute and the department of computer science: http://www.hatii.arts.gla.ac.uk/imp/index.htm
http://www.kcl.ac.uk/humanities/cch/allc/
http://www.ach.org/
http://llc.oxfordjournals.org/
http://www.princeton.edu/~mccarty/humanist/
http://listserv.brown.edu/archives/tei-l.html
thanks must go to alejandro bia (chair of the ach/allc program committee, universidad de alicante, spain) for access to these files prior to official release.
of course, these statistics are of no use without comparison with other conferences. conference size and attendance vary greatly with the subject and the ‘importance’ of the host association. take these two examples as extremes:
• siggraph , the st annual conference and exhibition on computer graphics and interactive techniques, consisted of papers, exhibitors, and seven panel sessions.
, professionals from nearly countries attended the conference at los angeles, – august . see http://www.siggraph.org/ s /
• computers and the history of art (chart) consisted of papers, with presenters from different countries. approximately people attended the two-day conference on – november, university of london. see http://www.chart.ac.uk/chart -abstracts/index.html
ach/allc is a medium-sized conference which represents a specific subject matter. attendance at ach/allc was approximately scholars (including presenters). the percentage of attendees who were presenting is much higher in this medium-sized conference than at a conference like siggraph: perhaps meaning that presenting at ach/allc has less academic kudos than presenting at a larger conference, even though it is the leading conference in its field.
http://www.allc-ach .colloques.paris-sorbonne.fr/
internet world stats, http://www.internetworldstats.com/stats .htm, demonstrate that asia has experienced a % growth in internet usage since , with the rest of the world experiencing % growth. the european union experienced % growth during this period.
the international association of universities holds records of almost academic institutions: http://www.unesco.org/iau/onlinedatabases/list.html. this would indicate that only . % of academic institutions have a scholar attending ach/allc.
http://www.unesco.org/iau/onlinedatabases/list.html.
a degree of abstraction had to be used to pigeonhole the departments because of different naming conventions: for example, centres in humanities computing were variously named centre for computing and the humanities, centre for technology and the arts, computing for humanities, institute for technological research in humanities, arts informatics, humanities computing, humanities computing and media centre, humanities computing centre, institute for technology in the humanities, research computing (faculty of arts), and the institute for technology and liberal education, with only one ‘department of computing in the humanities’ (at the university of groningen, netherlands).
http://matrix.msu.edu/projects.php
http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words

archiving dossier narrative
the french of italy timemap
fordham university, center for medieval studies
project catalogue record: https://fordham.bepress.com/ddp_archivingdossier/ /
persistent identifier doi: . /zenodo.
current project url: https://medievalomeka.ace.fordham.edu/exhibits/show/french-texts-in-italy/french-of-italy-timemap
launch date: / /
archive date: / /

narrative section .
project rationale and scope

scholars interested in the history of french-language writing in thirteenth- and fourteenth-century italy have previously approached the topic in one of two ways: either by examining one specific textual tradition, or by tracing french-language production within one italian region, most often venice. neither “close” approach allowed for an understanding of how french was used at different times and places on the peninsula, or of the possible connections between various locales of production. using a spatial humanities approach, the french of italy timemap attempts to close those gaps and provide new ways of seeing this complex literary phenomenon. project contributors, based at fordham university’s center for medieval studies, have built a digital object using the omeka/neatline platform which incorporates textual, geographic, and temporal data about french-language writing in italy. the french of italy timemap aims to provide geotemporal visualization of this textual phenomenon, and to invite users to interact with the data about medieval literary production across italy in a dynamic digital format.

narrative section . project trajectory

the project had five distinct phases: 1. conceptualization; 2. construction; 3. data collection and verification; 4. launch and maintenance; 5. archiving.

conceptualization: creating a clear vision for the french of italy timemap was challenging. in at fordham's center for medieval studies, there was little community knowledge about how to use digital tools to address complex research questions. although the center had years of experience disseminating information online, particularly via the internet history sourcebook project (https://sourcebooks.fordham.edu/, inaugurated in ), it had not yet sponsored a project that used digital tools to analyze humanities data, as was the goal with the french of italy timemap visualization.
the timemap aimed to explore information first assembled in the center-sponsored website, the french of italy (https://frenchofitaly.ace.fordham.edu/, inaugurated in ), featuring the corpus of french-language writings associated with italy from - . as these sources were collected, connections among the sources emerged, linking geographic regions and textual production in ways not present in previous literature. instead of highlighting the disjunction among texts or textual traditions running the length of the italian peninsula, the timemap was conceived as a tool to see where and when common texts were produced and to encourage further inquiry on the culture of french-language writing in italy over time and space.

exhibit construction: we used the omeka digital platform to store the information that would ultimately be displayed on the map, including a map we made to reflect the uncertainty we had about where texts were created and to accommodate imprecise dating (for example, if a text was identified as created in the fourteenth century in northern italy, it was difficult to map it using modern mapping tools). the dynamic part of the map was created using a plugin to omeka called neatline. we also included images of the manuscripts when available, and displayed information about the texts, including the manuscript in which the text was copied and the date it was copied.

data collection and verification: once we decided to map the occurrences of french-language texts in italy, we chose the textual witness as the unit to map. if one italian manuscript contained several different french-language texts, a point would appear on the map for each of these texts, not simply for the entire miscellany, so that the contents of one codex might appear several times on the map. this allowed users to see the relative popularity of a text on the peninsula, and to question the means of transmission from one place to another.
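The choice of the textual witness as the mapping unit can be illustrated with a minimal sketch. It assumes a simple record layout that is not taken from the project's spreadsheet: the field names (`shelfmark`, `region`, `earliest_year`, `latest_year`, `texts`) and the sample records are invented placeholders. Imprecise dating is represented here as a year range, which is one possible convention rather than the project's documented one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WitnessPoint:
    """One map point: a single text as copied in a single manuscript."""
    text: str
    manuscript: str
    region: str          # may be imprecise, e.g. "northern italy"
    earliest_year: int   # imprecise dating is kept as a range
    latest_year: int


def witness_points(manuscripts):
    """Return one WitnessPoint per text in each manuscript record."""
    points = []
    for ms in manuscripts:
        for text in ms["texts"]:
            points.append(WitnessPoint(text, ms["shelfmark"], ms["region"],
                                       ms["earliest_year"], ms["latest_year"]))
    return points


# Invented sample records; shelfmarks, titles, and dates are placeholders.
sample = [
    {"shelfmark": "ms a", "region": "northern italy",
     "earliest_year": 1300, "latest_year": 1399,
     "texts": ["text one", "text two"]},
    {"shelfmark": "ms b", "region": "venice",
     "earliest_year": 1280, "latest_year": 1290,
     "texts": ["text one"]},
]

points = witness_points(sample)
copies_of_text_one = [p for p in points if p.text == "text one"]
print(len(points), len(copies_of_text_one))  # prints: 3 2
```

In this layout a miscellany with two texts yields two points, and counting the points that share a title gives a rough measure of how often a text was copied.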
to collect data concerning production locale and timing, we relied on information from the french of italy website, but also on other related sites such as rialfri (http://www.rialfri.eu/rialfriwp/) and medieval francophone literary culture outside of france (http://www.medievalfrancophone.ac.uk/). the data collection and verification process resulted in a spreadsheet of french-language texts created in italy or by italian authors. this datasheet was used to create the points that appear on the map and can be freely downloaded from the site.

launch and maintenance: the map was launched in late . from the start, getting the platform to work well with university computer systems required co-operation with both university information technology specialists and with the creators of the omeka and neatline platforms. many of the difficulties we encountered were only resolved by appealing to the omeka forum and to the wider user audience. during the timemap's active phase, the omeka and neatline platforms were updated several times, and fordham's it department would often update the omeka versions it supported, so that the project lost functionality with each successive upgrade. major maintenance was performed in december and august following platform upgrades and the migration of the project from one university server to another. yet another migration will occur in late .

narrative section . project-specific digital objects

the digital objects created for the french of italy timemap, with the accompanying tools used to create them, are listed below. participants did not record the versions of each tool used to create individual project components.
1. the project website (omeka, with the following plugins: exhibit builder, simplepages, csv import, socialbookmarking, simple contact form)
2. the interactive map (omeka, neatline, simile timeline, waypoints)
3. the data spreadsheet (excel)
4. the project-specific map (photoshop)

narrative section .
project outcomes

no project analytics are available for this project. however, the following presentations and publications authored and co-authored by the project's senior scholar, laura morreale, featured elements of the french of italy timemap. co-authors and presenters are listed in the citation.

“medieval digital humanities and the rite of spring: thoughts on performance and preservation,” in the digital medieval: new directions in medieval history and the digital humanities. walter prescott webb memorial lectures , college station: texas a&m university press, forthcoming.
“a tour of fordham medieval digital mapping projects” (with tobias hrynick and stephen powell), terra digita: digital approaches to medieval mapping, ithaca, ny, november , .
“digital approaches to the world of medieval french,” university of colorado symposium on digital humanities, boulder, co, september , .
“telling the story of french after the linguistic, spatial, and digital turns,” university of connecticut medieval studies lecture series, storrs, ct, december , .
“visual exploration of medieval textual histories: the case of the french of italy” (with abigail sargent and david j. wrisley), keystone digital humanities conference, university of pennsylvania libraries, philadelphia, pa, july , .
“testi e manoscritti franco-italiani: verso una definizione del corpus,” conference sponsored by medioevo romanzo, “il franco-italiano: definizione tipologia fenomenologia,” venice, italy, october .

narrative section . documentation statement

the french of italy timemap was an important project for the promotion of digital scholarship at fordham's center for medieval studies.
when it was first conceived in and while it was active from - , it introduced omeka as a content management system to the center's digital scholars, challenged students and faculty to consider the problems inherent in mapping medieval places onto modern geographies, and offered a new approach toward the french of italy corpus, which had already been a topic of interest to fordham medievalists. however, most of the project contributors have moved to other institutions, and the upcoming migration will certainly create functionality issues that will need time and attention that the center can no longer provide. the project will therefore become the first to undergo the archiving process, thereby leading the way for other center projects that will eventually exceed their active life-cycles.

narrative section . project bibliography

avalle, d’arco silvio. “ricerche di letteratura medievale francese in italia.” in convegno letterature straniere neolatine e ricerca scientifica ( - may , accademia della crusca), - . rome: bulzoni, .
bertozzi, gabriele-aldo and marie-jose hoyet. “la cultura francese in italia: contributi editoriali più recenti.” in la cultura italiana in francia, la cultura francese in italia, . rome: ministero per i beni culturali e ambientali, .
bec, christian. les marchands écrivains, affaires et humanisme à florence, - . paris, la haye: mouton & co, .
branca, vittore, ed. mercanti scrittori: ricordi nella firenze tra medioevo e rinascimento: paolo da certaldo, giovanni morelli, bonaccorso pitti e domenico lenzi, donato velluti, goro dati, francesco datini, lapo niccolini, bernardo machiavelli. milan: rusconi, .
busby, keith. “the geography of the codex. italy.” in codex and context, - . new york: rodopi, .
cigni, fabrizio. “manoscritti di prose cortesi compilati in italia (secc.
xiii–xiv): stato della questione e prospettive di ricerca.” in la filologia romanza e i codici (atti del convegno, messina, università degli studi, facoltà di lettere e filosofia, - dicembre, ), vols., edited by saverio guida and fortunata latella, vol. , - . messina: sicania, .
folena, gianfranco. “la cultura volgare e ‘l’umanesimo cavalleresco’ nel veneto.” in culture e lingue nel veneto medievale, - . padua: editoriale programma, .
medieval francophone literary culture outside of france, king's college, http://www.medievalfrancophone.ac.uk/about-the-project/people/project-team/
meyer, paul. “de l’expansion de la langue française en italie pendant le moyen Âge,” in atti del congresso internazionale di scienze storiche (roma, - aprile, ), iv, atti della sezione iii: storia delle letterature, - . rome: tipografia della r. accademia dei lincei, .
petrucci, armando. writers and readers in medieval italy: studies in the history of written culture. edited and translated by charles radding. new haven: yale university press, .
rialfri, department of linguistic and literary study at the university of padua. http://www.rialfri.eu/rialfriwp/
ruggieri, ruggero. l’umanesimo cavalleresco italiano: da dante all’ariosto. nd ed., naples: fratelli conte, .
segre, cesare. “la letteratura franco-veneta.” in storia della letteratura italiana. edited by enrico malato, - . rome: salerno editrice, .
wunderli, peter and günter holtus. “la ‘renaissance’ des études franco-italiennes. rétrospective et prospective.” in testi, cotesti e contesti del franco-italiano, - . edited by holtus, krauss and wunderli. tübingen: niemeyer, .
verifying the authorship of saikaku ihara’s arashi ha mujyō monogatari in early modern japanese literature: a quantitative approach

ayaka uesaka
auesaka@mail.doshisha.ac.jp
organization for research initiatives and development, doshisha university, japan

introduction

this study focuses on arashi ha mujyō monogatari (“the tale of transient popular kabuki actor arashi’s life”; ), a novel from early modern japanese literature, written by saikaku ihara ( – ). it is the first work about a kabuki actor’s life in japan (kabuki is a traditional stage art performed exclusively by male actors with the accompaniment of live music and songs). we examine the “authorship problem” in saikaku’s works using the tools of quantitative analysis.

saikaku was a national author whose novels were published in the th century. saikaku’s works are known for their significance in the development of contemporary japanese novels. one recent hypothesis states that he wrote twenty-four novels; however, it remains unclear which works were really written by saikaku, except kōshoku ichidai otoko (“the life of an amorous man”; ), shōen Ōkagami (“the great mirror of female beauty”; ), kōshoku ichidai onna (“the life of an amorous woman”; ), and kōshoku gonin onna (“love stories about five women”; ). although the study of his works has continued, these fundamental doubts about his authorship remain.

meanwhile, the potential of quantitative analysis of textual data and the related field of the digital humanities have also dramatically advanced. however, quantitative analysis of japanese classical works has lagged behind, owing to complications in the development of morphological analysis software and the delayed digitization of japanese classical works.

previous studies

found by noma in

noma found and introduced arashi ha mujyō monogatari in .
he mentioned that the novel was actually written by saikaku, for the following reasons (noma, and ): (1) the handwriting of the novel belongs to saikaku; and (2) he found a similar writing error in arashi ha mujyō monogatari and saikaku’s work.

arguments for saikaku’s authorship

the handwriting is not decisive in determining whether a novel is saikaku's. according to emoto et al. ( ), among his twenty-four novels, the handwriting of nineteen works does not belong to saikaku. moreover, saikaku made fair copies of other writers’ drafts, such as kindai yasa inja (“the story of a hermit”; ) by kyōsen sairoken (? -?) and shin yoshiwara tsurezure (“the book of commentary on the licensed quarters of a certain area”; ) by sutewaka isogai (? -?). mori ( ) has argued that saikaku’s novels, except kōshoku ichidai otoko, are apocryphal works mainly written by dansui hōjō ( - ).

as he gained a national audience, saikaku was pressured to write on demand and in great volume. at first he wrote only one or two novels a year; however, in the two years from to he published twelve books, with a total of sixty-two volumes. saikaku’s style and approach also changed at this point (shirane, ). it is possible that saikaku had some assistance (nakamura, ). arashi ha mujyō monogatari was published in this period. moreover, arashi ha mujyō monogatari has no preface, epilogue, or signature; that is, it is not specified that it was written by saikaku. although the authorship problem of arashi ha mujyō monogatari remains unanswered, little work has been done on it. for that reason, this study re-examines the authorship of arashi ha mujyō monogatari using a quantitative approach.

databases

database of saikaku’s works

first, we digitized all the text of saikaku's works ( novels, poem books, etc.) based on the first edition of each work (see figure ).
second, since japanese sentences are not separated into words by spaces, we built a word-segmentation rule together with researchers of early modern japanese who were editors of shinpen saikaku zenshu (“the new complete works of saikaku”). finally, based on this rule, we added spaces between the words in all of the sentences. in addition, grammatical category information was added. according to our database, there are , words contained in his works.

figure : saikaku’s publication

database of dansui’s works

we also built a database of dansui’s novels shikidō Ōtuzumi (“the great drum of love”; ), chuya yōjin ki (“the night and day of precaution”; ) and budō hariai Ōkagami (“the great mirror of martial arts”; ), using the same methods and rules as for saikaku’s database. according to our database, there are , words contained in these works. the next section considers the doubts about the authorship of arashi ha mujyō monogatari.

analysis and results

in our previous studies, we analyzed saikaku and dansui’s novels, and clarified the following two points by extracting their writing style using principal component analysis (pca) and cluster analysis (hierarchical clustering): (1) a comparison of saikaku and dansui’s novels showed ten prominent features: the grammatical categories, words, nouns, particles, verbs, adjectives, adverbs, adnominal adjectives, grammatical categories bigrams and particle bigrams (uesaka, , ); and (2) using these features, we analyzed saikaku's four posthumous novels (many researchers have raised questions about their authorship, because these novels were edited and published by dansui after saikaku’s death). we found that these four posthumous works showed the same features as saikaku’s novels; we therefore concluded that these four posthumous novels belonged to saikaku (uesaka & murakami, ab, uesaka, ).
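As an illustration of how a feature such as grammatical-category bigrams might be turned into a comparable profile, here is a minimal sketch. It is not the study's implementation: the tag sequences are invented toy data, and the additive smoothing (`eps`) and the symmetrised form of the Kullback-Leibler divergence are assumptions made so that the comparison is defined when a bigram is absent from one text.

```python
from collections import Counter
from math import log


def bigram_profile(tags):
    """Relative frequencies of adjacent grammatical-category pairs."""
    pairs = Counter(zip(tags, tags[1:]))
    total = sum(pairs.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in pairs.items()}


def symmetric_kl(p, q, eps=1e-6):
    """Symmetrised KL divergence between two profiles.

    eps is an assumed additive smoothing constant; both profiles are
    renormalised over the union of their bigrams before comparison.
    """
    keys = set(p) | set(q)
    ps = {k: p.get(k, 0.0) + eps for k in keys}
    qs = {k: q.get(k, 0.0) + eps for k in keys}
    zp, zq = sum(ps.values()), sum(qs.values())
    distance = 0.0
    for k in keys:
        a, b = ps[k] / zp, qs[k] / zq
        distance += a * log(a / b) + b * log(b / a)
    return distance


# Invented toy tag sequences, not data from the study.
novel_a = ["noun", "particle", "verb", "noun", "particle", "adjective"]
novel_b = ["noun", "particle", "verb", "noun", "particle", "verb"]
novel_c = ["adverb", "verb", "adverb", "verb", "noun", "noun"]

profiles = {name: bigram_profile(tags)
            for name, tags in [("a", novel_a), ("b", novel_b), ("c", novel_c)]}

d_ab = symmetric_kl(profiles["a"], profiles["b"])
d_ac = symmetric_kl(profiles["a"], profiles["c"])
print(d_ab < d_ac)  # texts a and b share most bigrams, so they are closer
```

A matrix of such pairwise distances is the kind of input a hierarchical clustering routine takes; the choice of linkage method is a separate decision not shown here.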
in this study, we compared arashi ha mujyō monogatari with the authenticated novels of saikaku and dansui (see table ) on the ten prominent features, using pca and cluster analysis to see the differences among the novels. the analysis revealed differences in writing style between arashi ha mujyō monogatari, saikaku and dansui.

table : the authenticated novels of saikaku and dansui

we conducted pca with a correlation matrix, and the novels fall into three groups: saikaku, dansui and arashi ha mujyō monogatari (see figure ). furthermore, we conducted a cluster analysis. there also appears to be a considerable difference among arashi ha mujyō monogatari, saikaku’s novels and dansui’s novels. when calculating the distances between novels, we normalized the frequency of each word, used the kullback–leibler divergence as the distance, and used the ward method as the clustering algorithm. furthermore, we obtained similar results for the other nine features: the grammatical categories, words, nouns, particles, verbs, adjectives, adverbs, adnominal adjectives and particle bigrams.

figure : pca results for grammatical categories bigrams

[dendrogram: kullback-leibler distance, hclust with "ward.d "; vertical axis: height; leaves: arashihamujyoumonogatari, d:chuyayoujinki, d:budouhariaiokagami, d:shikidouotsuzumi, s:kousyokugoninonna, s:kousyokuichidaionna, s:kousyokuichidaiotoko, s:shoenokagami]

figure : cluster analysis results for grammatical categories bigrams

discussion and conclusion

when comparing the ten prominent features using pca and cluster analysis, we found that arashi ha mujyō monogatari was significantly different from saikaku’s and dansui’s works. a number of features indicate that arashi ha mujyō monogatari was written by neither saikaku nor dansui. in order to clarify arashi ha mujyō monogatari’s author, we need to conduct more detailed analysis.
it is necessary to add data for other writers who may be the author of arashi ha mujyō monogatari, for example kiseki ejima ( - ) and ichirōemon nishimura (?- ). we have been building a database of these authors’ novels, and we will make these comparisons in a future study.

acknowledgements

we would like to thank professor masakatsu murakami, professor hidekazu banno and professor takayuki mizutani for their help with our research. this work was mainly supported by the japan society for the promotion of science (jsps) grant-in-aid for research activity start-up ( h ).

bibliography

emoto, y. and taniwaki, m. ( ). saikaku jiten (“a saikaku dictionary”). tokyo: ouhu publishing.
mori, s. ( ). saikaku to saikaku bon (“saikaku and saikaku’s novel”). tokyo: motomotosha publishing.
nakamura, y. ( ). saikaku nyumon (“the introduction of saikaku’s research”). in: kokubungaku kaishaku to kansho (“japanese literature-explanation and appreciation”) ( ). pp. - . tokyo: shibundo publishing.
noma, k. ( ). arashi ha mujyō monogatari (“the tale of transient popular kabuki actor arashi’s life-explanation and understanding”). in: saikaku shin shinkō (“saikaku new article”; ). pp. - . tokyo: iwanami publishing.
noma, k. ( ). sairon arashi ha mujyō monogatari (“re-explanation of the tale of transient popular kabuki actor arashi’s life”). in: saikaku shin shinkō (“saikaku new article”; ). pp. - . tokyo: iwanami publishing.
shirane, h. ( ). early modern japanese literature: an anthology – . new york: columbia university press.
uesaka, a. ( ). a quantitative comparative analysis for saikaku and dansui’s works, proceedings of japan-china symposium on theory and application of data science, pp. ~ , kyoto: doshisha university faculty of culture and information science.
uesaka, a. & murakami, m. ( a). verifying the authorship of saikaku ihara’s work in early modern japanese literature: a quantitative approach. digital scholarship in the humanities. ( ). pp. ~ .
oxford: oxford university press.
uesaka, a. & murakami, m. ( b). a quantitative analysis for the authorship of saikaku’s posthumous works compared with dansui’s works. proceedings of the digital humanities .
uesaka, a. ( ). saikaku ikōshu no chosha no kentō (“verifying the authorship of saikaku’s posthumous works”). pp. - . in: the computational authorship attribution. tokyo: bensei publishing.

[pca scatter plot: axes pc ( . %) and pc ( . %); points: s:kousyokuichidaiotoko, s:shoenokagami, s:kousyokugoninonna, s:kousyokuichidaionna, d:chuyayoujinki, d:budouhariaiokagami, d:shikidouotsuzumi, arashihamujyoumonogatari]

care, code, and digital libraries: embracing critical practice in digital library communities – in the library with the lead pipe

feb kate dohe

in brief

in this article, the author explores the necessity of articulating an ethics of care in the design, governance, and future evolution of digital library software applications. long held as the primary technological platforms to advance the most radical values of librarianship, the digital library landscape has become a re-enactment of local power dynamics that privilege wealth, whiteness, and masculinity at the expense of meaningful inclusive practice and care work.
this, in turn, has the net result of self-perpetuating open access digital repositories as tools which only a handful of research institutions can fully engage with, and artificially narrows the digital cultural heritage landscape. by linking local narratives to organizational norms and underlining the importance of considering who does the work, and where they can do it, the author explores manifestations of care in practice and intentional design, and proposes a reframing of digital library management and governance to encourage greater participation and inclusion, along with “user-first” principles of governance. by kate dohe introduction digital programs in research libraries, such as institutional repositories and digital collections of unique special collections materials, are deep in their second or even third decade. the broad swath of products, technologies, projects, and professional practices that undergird individual efforts are mature, even as individual libraries are subject to economic stratification that impedes full engagement with those technologies and practices (dohe ). a variety of specializations within the digital library practitioner community continue to emerge each year–digital scholarship, digital curation, digital publishing, or digital strategies to name a few–and it is rare to find an academic library in the association of research libraries (arl) that does not include digital initiatives in its strategic plan or mission. clearly, the profession places a great deal of value on such efforts, and the most idealistic and ambitious mission statements emphasize the power of digital libraries to bridge cultural and geopolitical divides (“icdl mission” n.d.), share “the record of human knowledge” (“hathitrust mission and goals” n.d.), or facilitate global access to scholarly product (“mission and vision, texas digital library” n.d.).  
digital projects have long been hailed as the ethical or even radical solution to our crises of the hour, whether those crises are journal pricing, original publishing, scientific reproducibility, research data management, or textbook affordability. yet here we are, twenty years later, and none of those crises have been solved. we built our digital repositories, invested our time and infrastructure, and struggle to reach users (salo ). the contemporary digital library product landscape is currently reduced to commercial options owned by the same content owners and vendors (schonfeld b) that exuberantly pillage our collections budgets every year (mit libraries n.d.), and a handful of open source options with similar governance structures and substantial community dominance by a smattering of wealthy, historically white (hathcock ) arl member institutions. digital library initiatives across the u.s. are reckoning with very real questions of financial or legal sustainability, while the doors to participation remain firmly closed to broad swaths of the higher education landscape. even as a significant amount of the profession emphasizes the importance of digital projects work, the cloistered technical community that contributed to this state of affairs is poorly understood by many librarians outside “digital etc.” specialists. the end result is elite institutions making products for other elite institutions, and every year the technical and economic barriers to entry grow higher. how did we get so far from the truly radical roots of digital libraries, when the budapest open access initiative urged libraries, governments, and scholars to “unit[e] humanity in a common intellectual conversation and quest for knowledge” (chan et al. )? why are our technical products failing our users? how is so much talent and investment (arlitsch and grant ) producing such mediocre results? 
more importantly, how do we re-invigorate our own open source projects and fulfill the ultimate missions of digital libraries? how can we create truly participatory digital library project communities? familiar wolves are at the door, slyly promising vertical product integration and improved discovery as they buy commercial digital projects platforms left and right (schonfeld a). this isn’t ground we can afford to cede back to the same commercial interests that have put libraries on the ropes financially for decades. after an illuminating discussion over breakfast with a male colleague in my library’s technology division, i began to interrogate the ways this problem is a byproduct of social reproduction at our local institutions. he and i shared responsibility for digital library initiatives as peer department heads in our arl library’s it division. my librarians managed our digital collections, stakeholders, and users, while his developers were responsible for technical implementation of our portfolio of digital library applications, including code contributions to international projects. we had developed a mutually trusting work partnership and collegial friendship by the time we sat down at a diner on the second day of the code lib national conference.  our seemingly innocuous conversation over coffee prompted me to reflect upon the full weight of gendered assumptions regarding the divide between our positions and the value of our respective labor, and underscored the ways these assumptions between individuals ripple through communities. while this conversation occurred between two colleagues at one research library, we also occupy roles and operate within social systems reproduced throughout the profession that dictate and shape the nature of our work relationship. 
open source digital library communities are largely driven by the priorities of technical staff like us at elite research libraries like ours, who frequently exist in a siloed, overwhelmingly white, predominantly cis-male micro-culture within their home libraries (askey and askey ), creating a masculinized environment that outsiders often negotiate through participation, emulation, or willful ignorance (brandon, ladenson, and sattler ). the inherently gendered tensions between predominantly male it groups and a feminized library workforce inevitably permeate the communities and applications imbued with our professional values. radical change to community projects requires a codified framework for equitable, just, and caring interpersonal communication that begins at the local level. whose community projects? any given library’s digital collections, institutional repository, and digital scholarship projects are typically powered by a variety of software applications, components, and services, rather than a single monolithic “digital library” application capable of serving up all types of content and data effectively. some of these services come packaged from commercial vendors, like worldcat’s contentdm or elsevier’s recently acquired digital commons, and the relationship between the software provider and customer library is similar to that of any other software or content package. many more digital library technologies (including some of those implemented in commercial products) are community-supported open-source projects. some of the most prominent examples include digital repository applications fedora and dspace, digital collections interface tools samvera and islandora, content viewer frameworks like the international image interoperability framework (iiif), content creation tools like omeka and open journal systems, and discovery services based on blacklight. 
this is far from a complete picture of the digital library project landscape, but serves to highlight the complex nature of implementing and maintaining an open source digital library program. it is fitting that collections and content intended to reach the global citizenry should be available with open source software applications. moreover, many of these applications are created, customized, and maintained by staff at the research and cultural heritage institutions that also steward the content. these are among the few products that we make, that are most directly for us, our content, and our users. this should represent a shift in power dynamics from vended solutions that is nearly as significant as the shift to open access to information. to borrow an analogy from safiya noble’s dissertation “searching for black girls: old traditions in new media” (noble ), open source digital library technologies are comparable to solar panels that “facilitate independent, democratic participation by citizens, and [show] that design impacts social relations at economic and political levels” in opposition to controlled and closed systems peddled as a “galaxy of knowledge” (appleton ) even as they proclaim their openness and transparency. community–and consequently, community membership–is critical to understanding these open source digital library projects. as open-source applications, anyone may download, install, and run digital library applications, though the technical skills to effectively customize and maintain these applications are non-trivial and often out of reach for anyone but professional software developers. this technical overhead can be an exceptionally high barrier to clear for participation in the community. 
as an example, the samvera community and toolkit requires adopters to make a staggering number of critical and frequently binding technical decisions before even getting started; production-level adoption of the latest version of fedora is constrained to fewer than two dozen institutions worldwide at the time of this writing (“fedora deployments – fedora repository” ); dspace has a robust adoption base but, until the still-forthcoming release of dspace , has two mutually exclusive user interfaces. upgrading applications over time–a necessity for a professional digital library program that promises permanence and preservation as a core service–also proves to be a fraught, labor-intensive effort, as seen in the slow adoption of fedora (“designing a migration path – fedora repository – duraspace wiki” n.d.), or widely reported problems upgrading to ojs . governance structures of many of these projects tend to overlap in structure, reward systems, and membership. institutions often have two avenues for participation in the governance and decision-making of these products: pay membership fees to secure a seat in leadership, and/or employ software developers who are talented enough to contribute code back to the application’s core source code. skilled developers with a high degree of institutional support may become official “committers,” which is often a meritorious individual achievement on par with elected professional national service, and the committers themselves have a strong say in product development and roadmaps. because this labor is extremely technical, administrative representatives in steering committees or product leadership are often themselves technical department heads, division managers, or ads/auls.
institutions with the resources to participate in these application communities at this level are often further privileged with grant funding opportunities to develop new tools or applications within this digital content ecosystem, and thus reify their status as community leaders. avenues for participation outside the programmer/management dyad within these open source product communities can be largely limited to programmer support roles like documentation, request management and release testing (as is the case with the dspace community advisory team), or specialist interest groups with no codified governance power, as is the case with the proliferation of groups in the samvera (“samvera ig/wg framework – samvera.,” n.d.) community. largely absent in these communities are liaisons, curators, or actual end users, and consequently there is a fundamental disconnect between developers of these applications and the front line users who must navigate, curate, and use the contents of such systems. many of the design discussions i have been privy to in local and organizational settings privilege the discussion of objects and data over people—the pursuit of a more perfect object model without centering and clearly articulating the user’s needs. hand-waving at “more discoverable” is often unexamined without clearly arriving at discoverable by who, for what purposes, and how we know that. the net result of this insular community development is that programmers and the people who supervise them at wealthy and historically white american institutions are making considerable product and implementation decisions about the most potent tools in our arsenal to resist neoliberalism. excluding those who possess insight into the social, political, and experiential impacts of technology from the messy discursive process of making it undermines the value of a collaborative professional tradition, and protects institutional white supremacy and all its trappings of valorized productivity. 
“the master’s tools will never dismantle the master’s house” (lorde ). the economic and racial stratification resulting from this community insularity is counter to the self-proclaimed social justice spirit of early digital initiatives that emphasized their commitment to the public good of global open access to information, and open source technology to serve it. access barriers for users cannot be lowered when the technological barriers for a diverse member community are simultaneously raised. no hbcus are listed as fedora adopters of any version outside consortial support.  only four community colleges have deployed and registered their own dspace repositories (and two of the listed, registered repositories are defunct at the time of this writing) even as the community college consortium for open educational resources (“community college consortium for open educational resources.” n.d.) and similar initiatives emphasize the vital importance of access to locally-developed open educational resources for the (frequently non-white, non-traditional, poorer) student populations at community colleges. accessibility, particularly compliance with wcag . aa standards required at a growing number of institutions in response to lawsuits (carlson n.d.), continues to be elided with responses that individual members of “the community” need to identify and resolve such fundamental issues on their own (“dspace and wcag website accessibility.” n.d.). while repository registries are voluntary and therefore inherently incomplete, they do paint a picture of self-identified organizations more likely to engage actively in governance and community initiatives for those products. the explanations for this delta between applications that serve clear needs and a potential user base are often presented as self-evident—these institutions “lack the resources,” which is typically a euphemism for “can’t afford a full time developer and systems team inside a library” (hamill ). 
often in the same breath, applications like dspace or the aging eprints application are presented as “turnkey” solutions in literature (maynard et al. ) and documentation (“what is dspace? – dspace knowledgebase” ). professional hosting for open source repository software is a comparably new phenomenon when one considers the lengthy history of such projects, many of which are designed as “single tenant” applications that can be difficult to scale for multiple institutions, or even multiple projects. furthermore, while the software is open source, there are obvious risks regarding public service sustainability any time a vendor comes into the picture. extensible repository systems like fedora are of abstract utility outside a very limited community, and the talent to configure and manage those applications comes dear. positions go unfilled, and issues can only be solved by a handful of developers at a few institutions. organizations without resources to participate in the open source communities may select vendor solutions for digital projects, which often prove to be less costly than the required fte and skill set for supporting a major open source digital library initiative. it is no coincidence that major companies like wiley and elsevier are buying products like atypon and digital commons respectively (schonfeld b), to integrate with the scholarly enterprise software suites each company is building and marketing to provosts, directors of research, and university presidents. in this landscape, open access and cultural heritage content ceases to be an ethical imperative and instead becomes a lucrative revenue stream for organizations that have nakedly demonstrated their opposition to the free and open exchange of information over decades of doing business with libraries. the disadvantages of this state of affairs in the open source digital projects community are several-fold. 
open source tools aren’t designed to be adopted by the communities they could theoretically best serve, as users and content creators. this in turn artificially narrows the cultural heritage landscape as digital content is never shared or has no long-term stewardship. the communities that do adopt these applications are frequently so small that only a few people are equipped to share expertise with each other (which mitigates the advantages of having a community of practitioners). ultimately, the products become worse over time and present market opportunities for the same commercial interests that are hollowing out the mission of academia (seale , bourg , mountz et al. ), and put the entire open access and digital scholarship enterprise at risk.

local cultures

examining “…the people building these systems and the environments in which the software is produced, as part of the software’s ecology” (sadler and bourg ) is essential to understanding how digital library applications evolved in the manner they have. open source digital library architecture is not built by silicon valley techbros gleefully commodifying the labor of women and people of color (hoffmann and bloom ). it is built by our colleagues and friends; people we interact with every day on listservs, on calls, in slack channels, and in the halls. many developers and technical staff chose comparably lower-paying positions in higher education, and libraries in particular, because they value the library’s mission and workplace, and care about work-life balance–a far cry from “brogrammer” culture (crum et al. ). we are on the same side, and value the same things. yet library it culture is still a place apart within libraries, often very literally (askey and askey )–a place with its own language, norms, rhythms, and priorities.
libraries have long been understood as feminized workplaces, with (largely white) female librarians and non-technical support staff, and a higher proportion of (largely white) male managers (schlesselman-tarango ). library it, particularly in academic libraries, is often the opposite. women occupy a minority of positions and are less likely to take supervisory positions, and are less likely to be compensated comparably with male supervisors with equivalent experience and expertise (lamont ). the work environments are rarely as openly hostile or sexist in the vein of silicon valley, but entire books (brandon, ladenson, and sattler ) are dedicated to women who must navigate alienation, imposter syndrome, overt sexism, and unconscious bias throughout their careers in library it. gender is a vital dimension to understanding technological influence within libraries; as roma harris’s influential article on the topic states: “given the strong cultural and ideological associations between masculinity and technology in western society, it is impossible to consider the social shaping of technology in librarianship without taking into account the gendered nature of library work, particularly since studies of technological change in other sectors of the labor force reveal that the work of women and men is generally segregated, in part along lines structured by their association with or their use of particular technologies” (harris ). institutions with the resources to hire “digital etc. librarians” often rely on these positions to “bridge the gap” between librarians and library it, or “collaborate” through internal marketing and external proselytizing about the merits of a system designed largely by technical staff. these librarians often end up in service provision roles to ameliorate systemic usability flaws (mediated institutional repository submission workflows are a prime example of this). 
this in turn limits the opportunities for these librarians to collect and advance user needs or participate in the creation of better systems and projects. the work of a digital etc. librarian bears all the signifiers of carework typified by the broader profession of librarianship and explored at length in “on capacity and care” (nowviskie ), “library technologies and the ethics of care” (henry ), and others. it is also frequently composed of bullshit task completion (schmidt ) generated by questionable user interface decisions in software applications. furthermore, occupants of this role often feel most immediately the tensions between the patriarchal and technocratic “future of the library” and feminized care work explored in mirza & seale’s “dudes code, ladies coordinate” presentation at dlf (mirza and seale b) as well as their “who killed the world?” chapter (mirza and seale a) in gina schlesselman-tarango’s topographies of whiteness. in both works, the authors examine the ways in which the technocratic libraries of “the future” (and present) elevate technological production at the expense of care work required to support the end users of those products. this valorization of final product over emotional process positions digital etc. librarians as handmaidens to a vision of libraries that poorly emulates the commercial it industry. moreover, as digital initiatives, maker spaces, and technology initiatives for libraries occupy a progressively more prominent place in the strategic objectives of a given library, this isolated microculture is increasingly pushed forward as “the future of libraries” (mirza and seale a) at the expense of feminized labor and values of librarianship. this explicit valuing of technological solutionism by local institutions is then echoed in the committees and organizations responsible for maintaining and governing open access digital library projects. 
just as technology is a reflection of the human values of its creators (noble , winner ), the governance structures of digital library projects are a product of the values of the most influential adopters of these technologies, with explicit and nearly exclusionary value placed on functional code and technological work as an “in kind” contribution to those projects, as seen with fedora (“fedora leadership group in-kind guidelines” ) and islandora (“islandora and fedora ” ) as notable examples. these are the only products that “count” (mountz et al. ) in this corner of the academic community. this, in turn, is underpinned by interpersonal dynamics within organizations, and the net result is that some of our worst biases manifest in the products we make. “just like all politics is local, all culture is local,” dr. chris bourg stated in her code lib keynote speech in washington, dc (bourg ). aimed squarely and unapologetically at the ways white men can use their de facto positions of power and group belonging within library it departments to create—or hinder—inclusive environments, the keynote combined evidence and sociological theory with blunt instructions for white cis men in library it to be better. vouch for colleagues. make space. reduce stereotypical and exclusionary cultural markers. be cognizant of the bleed between social and professional.  definitely don’t get beers with the fellas and talk about the womenfolk. my own professional background echoes many of the findings and narratives of workplace studies and examinations of library it culture, including those described in dr. bourg’s keynote. i am a white female digital etc. librarian by trade, accustomed to being described by others in terms of my interpersonal skills and characteristics, with my technical chops left as a vague afterthought. 
i currently supervise only white and male faculty librarians in my library’s it division, and i worked with nearly exclusively white and male developers throughout my decade or so in the profession at arl institutions and private companies–places with money. my current library is similar to the physically isolated it spaces askey and askey describe, with our generally male technology division housed in a maze-like basement behind swipe card access points, and a highly collegial environment that relies heavily on technical knowledge and project-driven work that often seems disconnected from “the upstairs.”  i’ve always been “the woman in the room,” and even sometimes “one of the fellas” (always with an asterisk by my name), ready with an invitation to game night or a deep dive on george r. r. martin’s a song of ice and fire. alienation in both the male spaces of library it and female-dominated librarian communities has shadowed me for much of my career. dr. bourg’s code lib keynote rang true to my lived experience as a woman marginalized within a system i recognized i was complicit in perpetuating. i was deeply surprised, then, when my colleague and breakfast companion asked for my impressions of the keynote the next day, and then confided  to me that he became emotional and somewhat defensive during dr. bourg’s speech. he continued that maybe some men needed to be spoken to as she had, but he felt put off by what he perceived as stereotypes and assumptions about men in her presentation. i had long perceived this colleague as both cool-as-they-came and a reliably empathetic ally, so this admission unsettled me for its resemblance to white fragility (diangelo ) from someone i had never found susceptible to it, and who had long ago earned my respect for his introspection. if this genuinely well-intentioned male colleague’s perceptions were so far from my own, i thought, then how on earth can our library technology departments become more approachable and accessible? 
whither the ethics of care?

ethics of care has emerged in recent decades as a powerful, intentionally feminist ethical framework centering relationships and emotion in moral development; carol gilligan is typically credited with originating the theory (webteam n.d.). with time and effort, the understanding and application of this ethic have evolved to encompass a broader array of intersectional (eugene , graham ), professional (noddings ), and political (tronto , hankivsky ) implications for care work. joan tronto delineates four components of ethics of care (tronto ):

responsibility: assuming a willingness to respond to a need within a relationship
attentiveness: observing a need within a relationship
competence: addressing a need effectively
responsiveness: empathy for the perspective of others

the explicit values of tronto’s framework–equality, freedom from oppression, democracy–align handily with the mission of librarianship, and the relational, emotional, empathetic work of academic librarianship across disciplines is easily understood as care work. tronto’s exploration of care work provides actionable criteria for characteristics of care. these criteria in turn make it possible to meaningfully assess the effectiveness of caring actions. in short, it helps us articulate the difference between simply telling colleagues and users “my door is always open” or “you can email me with any questions” and proactively working to reduce the psychological and interpersonal barriers that prevent people from taking those actions. high performing librarians exercise strong empathetic skills to identify and respond to those barriers, assume responsibility, and effectively seek solutions for reducing them. in particular, digital etc.
librarians are often asked to do constant translation and code switching between programmers, curators, students, and faculty under the auspices of “bridging communication gaps,” yet frequently earn less than their programming counterparts, and have diminished influence in the direction and governance of digital library products. these librarians are doing the heavy lifting of emotional labor on behalf of the technical colleagues who are empowered to enact actual change in their repository communities, while they themselves call into yet another interest group to debate whether “title” should be required or only recommended in a system workflow. my breakfast companion and i were both aware (perhaps to varying degrees) that other staff in our library frequently approached me explicitly as this colleague’s “translator.” i often found myself assuming a disproportionate amount of emotional labor to explain technical concepts or how decisions were made by our software developers, and occasionally to provide encouragement and support for team members who felt apprehensive talking to my colleague. i had long understood this dynamic as problematic, particularly as a female peer manager in a technology division, but minimized my own feelings of frustration over it as “part of my job.” furthermore, much of my career as a digital etc. librarian had involved the same work, codified as position responsibilities–who was i to be annoyed when someone would privately take me aside and ask “ok, can you tell me what he meant by this? i couldn’t follow the technical explanation and i’m embarrassed to ask him about it”?

care in our home institutions

when care work is denigrated by our own research libraries, through both our employment practices and our local interpersonal behaviors, we create patterns of behavior and exclusion that manifest both locally and in our products.
if we continue to privilege coding over care as if the two are fully disconnected, and hand the reins of what should be our most intentional and accessible applications to a homogenous cohort of well-intentioned but isolated decision makers who are removed from direct and constant care work for end users or colleagues, then we are complicit in the neoliberal hollowing of the academic library mission to use our resources for the public good. we produce software that serves the needs of technologists employed at rich white universities first, and everyone else as an afterthought. this is solvable and avoidable. locally, we can embrace and elevate the care work done by the librarians whose fates and careers are increasingly bound up in the viability of digital library software. 
stepping back to that fundamental question from my breakfast with my colleague—if reflective and helpful white men who want to be allies are struggling to respond competently to calls for more inclusive, caring spaces, what can be done? like too many women in this #metoo moment (though one with the privileges of whiteness, financial security, sound health, and more), i am tired. i am tired of patiently explaining, or pulling back the curtain on my own experiences. i am tired of answering men who ask “why didn’t you tell me?” when i believe a better question they could ask themselves is “what could i have done differently to help others be comfortable confiding in me?” moreover, this telling and retelling of what men can do to be better allies and why they need to take action may help with attentiveness to the problem and even assuming responsibility, but does little for developing competence or responsiveness on its own. “doing the thing is doing the thing,” as amy poehler put it in her memoir (poehler ), and our profession needs more opportunities for those who would support marginalized communities to practice the thing. for the previous three years, i had worked with another former colleague on an improv workshop specifically for librarians and technologists, which took into account the shifting landscape of librarianship in higher ed and gave our players an informal space to practice the essential skills of collaboration without the pressure of real expectations (pappas and dohe ). many of our workshop’s objectives echoed dr. bourg’s recommendations, with a performative twist—make your partner look good. be present. practice listening with undivided attention. commit to affirmation as a means to develop the best ideas. avoid assumptions about common knowledge. decenter yourself and focus on the needs of the ensemble. 
while the intent of the workshop was to foster collaboration across domains of distributed expertise, the same skillset applied to both allyship and effective care work, and represents a low-stakes learning environment to develop communication competence. these are concrete abilities that one must practice like coding, not fuzzy personality-driven soft skills that are difficult to assess or articulate. the pursuit of professional development opportunities and training on these skills should be taken as seriously as any request to attend a coding workshop, and just as we would expect a programmer to share a new tool or language with the team, we can and must expect the same from participants in a communication workshop. furthermore, the care work performed by librarian-technologists and digital etc. librarians can be emphasized and recognized within library it departments and divisions in a number of ways. de-emphasize and decouple quantity of submissions (especially faculty submissions) in repositories as a metric of performance. elevate and make visible the user research that informs local product decisions as an essential part of application research and documentation. emphasize demonstrable methods of emotional work, not “collaborate with stakeholders” as a panacea in position descriptions. stop treating diversity exclusively as a pipeline problem and reward efforts to connect with and meet the needs of underrepresented communities. for the love of capybaras, get in front of users before decision points have whistled by. coar, care, and the evolution of digital library communities at the time of this writing, digital library applications are at a pivotal juncture in their development and future evolution. 
high-profile crises in major projects, notably the closure of the digital preservation network and layoffs at the digital public library of america, are focusing community attention on the governance of digital library projects and sustainability of membership-driven initiatives. questions of in-kind labor contributions are likely to rise as local library budgets continue to shrink, but so long as these contributions are limited only to coding and development activities, the pool of prospective participants and supporters will continue to be artificially limited. the confederation of open access repositories (coar) has requested comments on their “next generation repositories” proposal (“coar next generation repositories | draft for public comment” ), and the proposal does specify at a number of points that inclusivity and user engagement are guiding principles for the document. however, the user stories provided highlight a number of self-perpetuating assumptions about the nature of a human user as a high-level researcher that one would typically find at a high-level research institution in a western nation. students, public users motivated by personal interest, disabled users, and exclusively mobile users are nowhere to be found in the design of the “next generation repository,” leaving one to wonder if the next generation user is expected to evolve as well.

[figure: search results for “accessibility” on coar draft for public comment]

shifting practice within a community requires reconceptualizing the values of that community, and in this regard black feminists, womanists, and care scholars are instructive. in “to be of use,” toinette m. eugene emphasized connections, caring, and personal accountability, rather than the “arbitrary and fragile” market model of community (eugene ). this humanist and explicitly afrocentric centering of community broadens its scope beyond coders and managers, and instead encompasses the communal ways of knowing and doing work in this space.
the organizations that sustain and steward digital repository products have a number of opportunities to engage with and support an ethics of care in the design and governance of their applications. one easy win is to establish parity between the influence of committers and non-programmers. what if quality end-user documentation, or design work, or user survey design, or accessibility assessments were credited and elevated by the projects in the same explicit way code is? what if those contributions shaped the strategic direction of those applications and communities? what if community outreach were baked into the charges of working groups, to seek new opportunities for growth and inclusive design? put in the parlance advocated by the collective authors of “for slow scholarship: a feminist politics of resistance through collective action in the neoliberal university,” what if we counted differently (mountz et al. )? the mukurtu project (christen, merrill, and wynne ) and its community are emerging as leaders in inclusive digital cultural heritage practices. while the project’s primary application does not fulfill many of the essential tasks required of a repository, the content management system does accommodate behavioral metadata, cultural signifiers, and the expression of permissions aligned carefully to its community of indigenous peoples. moreover, these features were not identified and prioritized in a vacuum, nor was development work undertaken with the expectation that a community on the receiving end of centuries of violence and oppression would be eager to accept an existing repository platform. instead, the project originated as a grassroots program driven by community needs, evolved in response to the shared requirements of historically marginalized communities, and centered collaboration and consultation as the guiding principles of development.
ultimately, mukurtu demonstrates the potential of an application and community with an inclusive ethics of care embodied in the mission of the platform and its evolution. conclusion four years after bess sadler and chris bourg’s code lib journal article calling for explicitly feminist discovery products, and twenty years after roma harris shone light on the gendered power differentials in library technology change management, little has meaningfully changed with regard to the participants and governance structures of our digital repository ecosystem. in fact, newly emerged technologies such as iiif continue to mimic governance structures of other technical products, which in turn replicate the same imbalances in decision-making explored above. what is now emerging is an unabashedly feminist and inclusive call to action as a critical mass of librarians interrogate the ecosystems of digital library participation and reproduction. the practitioners of emotional labor and care work continue to be de-emphasized in conversations about products with very real impacts on their users, their careers, and the health of a hugely important strategic initiative within libraries. repositories and linked data platforms have the potential to be our most potent leveler of access and privilege, if we choose to embrace our responsibility and respond with intention. as chris bourg stated in her keynote at code lib, this isn’t a pipeline problem, one that can be solved by just getting more “diverse humans” into the mix, as though it can be fixed with some magic combination of attributes. it’s an environmental problem that originates in our home institutions and the elevation of coding over collaboration, of objects over humans, and in-jokes over inclusion, and ultimately serves to starve our own digital repository applications. evolution of these communities without a rethinking of product governance may be slow. 
on a night during a conference when my “one of the fellas” asterisk was available to me, i spoke with a number of repository developers who proceeded to complain about the changes at the dlf forum over the last few years, scoffing that “no one even puts code on the screen anymore.” as an individual who had co-taught improv at the dlf forum as a means of strengthening collaboration between those who can teach, and those who can code, i found this to be a terribly myopic attitude. it came across to me as a distillation of the belief  that collaboration and soft skills and learning from users should be someone else’s skillset, or that there’s nothing to be gleaned from presentations that center the experiences of students, people of color, people with disabilities, public communities, and the complex, messy universe of invisible “end users” of our digital products if those presentations don’t also include an illegible (and inaccessible) screenshot of a json file. dlf is where i saw “dudes code, ladies coordinate.” i attended that year’s forum with my code lib breakfast companion, and i remember at the time wishing that he had attended that particular session. i especially wished this a day later, when that same colleague forgot our prior plans to meet for lunch, and instead went out with repository developers from another institution to talk about emerging technical issues with strategic implications. i was not invited to that discussion, and instead i spent a few hours reflecting on how little i might be professionally or personally respected by the same people i needed to work with most closely. i understood his invitation to breakfast at code lib, and the emotionally challenging conversation we shared, as a tacit effort to repair a fairly serious personal rent between us. i recognized it as one reciprocal act of care, in the bounds of one working relationship, at one arl institution. one site of cultural change. acknowledgements huge thank yous to my reviewers dr. 
melissa villa-nicholas and ian beilin, and my publishing editor kellee warren at in the library with the lead pipe, for your labor and thoughtfulness in helping to shape this piece. i’d like to thank a number of people for reading and engaging with the earliest versions of this article, especially erin pappas for encouraging me to seek publication, joseph koivisto and vin novara for extensive feedback, and bria parker, joanne archer, rebecca wack, rachel gammons, and kelsey corlett-rivera for their suggestions and support throughout. and finally, i have to extend my gratitude to ben wallberg, whose generosity as a colleague, collaborator, and friend made much of this article possible. references appleton, gaby. “guest post: supporting a connected galaxy of knowledge.” the scholarly kitchen, january , . https://scholarlykitchen.sspnet.org/ / / /guest-post-supporting-a-connected-galaxy-of-knowledge/. arlitsch, kenning, and carl grant. “why so many repositories? examining the limitations and possibilities of the institutional repositories landscape.” journal of library administration , no. (march ): – . https://doi.org/ . / . . . askey, dale, and jennifer askey. . “one library, two cultures.” in feminists among us: resistance and advocacy in library leadership, edited by shirley lew and baharak yousefi. library juice press. https://macsphere.mcmaster.ca/handle/ / . bourg, chris. . “the neoliberal library: resistance is not futile.” feral librarian (blog), january , . https://chrisbourg.wordpress.com/ / / /the-neoliberal-library-resistance-is-not-futile/. ———. . “for the love of baby unicorns: my code lib keynote.” march , . https://chrisbourg.wordpress.com/ / / /for-the-love-of-baby-unicorns-my-code lib- -keynote/. brandon, jenny, sharon ladenson, and kelly sattler. . we can do it: women in library information technology. carlson, laura l. n.d. 
“higher ed accessibility lawsuits, complaints, and settlements.” information technology systems and services, university of minnesota duluth. accessed january , . http://www.d.umn.edu/~lcarlson/atteam/lawsuits.html. chan, leslie, darius cuplinskas, michael eisen, fred friend, yana genova, jean-claude guédon, melissa hagermann, et al. . “budapest open access initiative.” budapest open access initiative. february , . https://www.budapestopenaccessinitiative.org/read. christen, kimberly, alex merrill, and michael wynne. . “a community of relations: mukurtu hubs and spokes.” d-lib magazine ( / ). https://doi.org/doi: . /may -christen. “coar next generation repositories | draft for public comment.” . internet archive. march , . https://web.archive.org/web/ /http://comment.coar-repositories.org/. “code lib community statement in support of chris bourg.” . c l -keynote-statement. march , . https://code lib.github.io/c l -keynote-statement/. “community college consortium for open educational resources.” n.d. accessed july , . https://www.cccoer.org/. crum, janet, aaron dobbs, william helman, and kelly sattler. . “more than money: recruiting and retaining library it staff.” presented at the lita forum, november . http://hdl.handle.net/ / . “designing a migration path – fedora repository – duraspace wiki.” n.d. accessed january , . https://wiki.duraspace.org/display/ff/designing+a+migration+path. dohe, kate. . “linked data, unlinked communities.” lady science (blog), “libraries and tech” series. https://www.ladyscience.com/blog/linked-data-unlinked-communities. “dspace and wcag website accessibility.” n.d. dspace community – dspace and wcag website accessibility. accessed july , . http://dspace. .n .nabble.com/dspace- -and-wcag-website-accessibility-td .html. eugene, toinette m. . “to be of use.” journal of feminist studies in religion ( ): - . https://www.jstor.org/stable/ . “fedora deployments – fedora repository.” accessed june , . wiki. duraspace wiki. june , .
https://wiki.duraspace.org/display/ff/fedora deployments. “fedora leadership group in-kind guidelines.” . wiki. fedora repository – duraspace wiki. january , . https://wiki.duraspace.org/display/ff/fedora+leadership+group+in-kind+guidelines. graham, mekada. “the ethics of care, black women and the social professions: implications of a new analysis.” ethics & social welfare , no. (may ): – . https://doi.org/ . / . hamill, lois. . “so you want an institutional repository but don’t have….” presented at the midwest archives conference, lexington, ky, may . http://www.midwestarchives.org/ccboard/ _ bb ecf d b ac e cf.pdf. hankivsky, olena. . “rethinking care ethics: on the promise and potential of an intersectional analysis.” american political science review , no. (may ): – . https://doi.org/ . /s . hathcock, april. “white librarianship in blackface: diversity initiatives in lis.” in the library with the lead pipe, october , . http://www.inthelibrarywiththeleadpipe.org/ /lis-diversity/. henry, ray laura. . “library technologies and the ethics of care.” the journal of academic librarianship ( ): – . https://doi.org/ . /j.acalib. . . . hoffmann, anna lauren, and raina bloom. . “digitizing books, obscuring women’s work: google books, librarians, and ideologies of access.” ada new media (blog). may , . https://adanewmedia.org/ / /issue -hoffmann-and-bloom/. “icdl mission.” international children’s digital library (icdl). accessed february , . http://en.childrenslibrary.org/about/mission.shtml. “islandora and fedora .” . islandora website. december , . https://islandora.ca/content/islandora-and-fedora- . lamont, melissa. . “gender, technology, and libraries.” information technology and libraries ( ): . https://doi.org/ . /ital.v i . . lorde, audre. . “the master’s tools will never dismantle the master’s house.” sister outsider: essays and speeches. ed. berkeley, ca: crossing press. - . . 
maynard, aubrey, laura gentry, adam mosseri, courtney whitmore, margaret diaz, camille chidsey, and kelly kietur. . “fedora commons or dspace: a comparison for institutional digital content repositories.” presented at the ndsa annual meeting. http://www.digitalpreservation.gov/meetings/documents/ndiipp /wayne-wsu_ndsa_ _confprestemplate.pdf. mirza, rafia, and maura seale. a. “who killed the world? white masculinity and the technocratic library of the future.” in topographies of whiteness: mapping whiteness in library and information science, edited by gina schlesselman-tarango. library juice press. http://mauraseale.org/wp-content/uploads/ / /mirza-seale-technocratic-library.pdf. ———. b. “dudes code, ladies coordinate: gendered labor in digital scholarship.” presented at the dlf forum, pittsburgh, pa, october . https://osf.io/ jhzx/. “mission and goals, hathitrust digital library.” hathitrust digital library. accessed february , . https://www.hathitrust.org/mission_goals. “mission and vision, texas digital library.” texas digital library (blog). accessed february , . https://www.tdl.org/strategic-plan/vision/. mit libraries. n.d. “elsevier fact sheet.” scholarly publishing – mit libraries (blog). accessed january , . https://libraries.mit.edu/scholarly/publishing/elsevier-fact-sheet/. mountz, alison, anne bonds, becky mansfield, jenna loyd, jennifer hyndman, margaret walton-roberts, ranu basu, et al. “for slow scholarship: a feminist politics of resistance through collective action in the neoliberal university.” acme: an international journal for critical geographies , no. ( ): – . noble, safiya. . “searching for black girls: old traditions in new media.” http://hdl.handle.net/ / . noddings, nel. “feminist critiques in the professions.” review of research in education ( ): – . https://doi.org/ . / . nowviskie, bethany. . “on capacity and care.” bethany nowviskie (blog). october , . http://nowviskie.org/ /on-capacity-and-care/. pappas, erin, and kate dohe. .
“lessons from the field: what improv teaches us about collaboration.” library leadership & management ( ). https://journals.tdl.org/llm/index.php/llm/article/view/ . poehler, amy. . yes please. dey street books. sadler, bess and chris bourg. . “feminism and the future of library discovery.” code lib journal . https://journal.code lib.org/articles/ . salo, dorothea. . “innkeeper at the roach motel.” library trends : . https://minds.wisconsin.edu/handle/ / . “samvera ig/wg framework – samvera.” n.d. wiki. duraspace wiki. https://wiki.duraspace.org/display/samvera/. schlesselman-tarango, gina. “the legacy of lady bountiful: white women in the library.” library trends , no. ( ): – . https://doi.org/ . /lib. . . schmidt, jane. . “innovate this ! bullshit in academic libraries and what we can do about it.” institutional repository. rula digital repository. may , . https://digital.library.ryerson.ca/islandora/object/rula% a . schonfeld, robert c. a. “cobbling together the pieces to build a workflow business.” the scholarly kitchen, february , . https://scholarlykitchen.sspnet.org/ / / /cobbling-together-workflow-businesses/. ———. b. “reflections on ‘elsevier acquires bepress.’” ithaka s+r (blog). august , . https://sr.ithaka.org/blog/reflections-on-elsevier-acquires-bepress/. seale, maura. “the neoliberal library.” in information literacy and social justice: radical professional praxis, edited by lua gregory and shana higgins, – . library juice press, . http://eprints.rclis.org/ /. tronto, joan. . moral boundaries: a political argument for an ethic of care. new york: routledge. ———. . “partiality based on relational responsibilities: another approach to global ethics.” ethics and social welfare ( ): – . https://doi.org/ . / . . . winner, langdon. . the whale and the reactor: a search for limits in an age of high technology. university of chicago press. https://www.press.uchicago.edu/ucp/books/book/chicago/w/bo .html. webteam. n.d. “carol gilligan.” ethics of care. 
accessed july , . https://ethicsofcare.org/carol-gilligan/. “what is dspace? – dspace knowledgebase.” . wiki. duraspace wiki. december , . https://wiki.duraspace.org/pages/viewpage.action?pageid= . there is much more to explore in the demographic composition of “technical librarians” in systems, digital curation, data management, and other positions that require stronger it skills as a function of their position. further, people in these positions who may be perceived as “outsiders” to the majority cohort anecdotally take on masculine qualities in an effort to either fit in or establish dominance, which is surfaced in several narratives included in we can do i.t. edited by jenny brandon, sharon ladenson, and kelly sattler [↩] unsurprisingly, this keynote earned dr. bourg the vitriol of internet trolls who reduced these exhortations to “she’s saying girls can’t like star trek!” and decried the leftist takeover of libraries. the code lib conference organizers and community issued a statement of support: https://code lib.github.io/c l -keynote-statement/ [↩] this colleague has read earlier versions of this article, and has told me i may share this conversation as a part of the piece. we had multiple conversations about this article in which i asked him to affirm his consent and reflect on this conversation, and his feedback and changes have been helpful. however, in the development of the article, i frequently became anxious about what this would mean for him in particular and other colleagues more generally, which caused me to consider and reflect seriously on the ways in which i still elevate and prioritize white male feelings. all i can do is the work. [↩] while this article focuses on gendered dynamics within a specific community, it is also vitally important to consider the intersectional nature of racial, ableist, and economic systems that come to bear on care ethics within academic settings and the ways in which many people are excluded from the digital etc. 
practitioner community. [↩] this work is licensed under a cc attribution . license.

understanding researcher needs and raising the profile of library research support

researchers at north carolina state university expect little to no difficulty in discerning how their library can support their work. at the same time, librarians repeatedly find that researchers are unaware of what our library has to offer. within this context, we embarked on a two-year study to help inform the development of outreach strategies to enable new research engagement opportunities that will scale and, at the same time, help us transform our model of research support strategies and engagement. we interviewed both librarians and researchers to gain an understanding of researcher needs from both perspectives. the results of the interviews provided a solid grounding for building our awareness of researchers’ behaviors, expectations and workflows as well as presenting a unique picture of both unmet and unarticulated needs. in this article we summarize our results with a specific focus on findings from the researcher interviews.
we share our recommendations for evolving library research support and enhancing outreach strategies to provide an easier starting point for different types of researchers to discover relevant research assets provided by libraries such as ours.

insights – , understanding researcher needs – library research support | colin nickels and hilary davis
colin nickels, experiential learning services librarian, nc state university libraries
hilary davis, head of collections & research strategy, nc state university libraries

keywords: researchers; research support; libraries; outreach; interviews

introduction

at north carolina state university (‘nc state’), the libraries support , students, , faculty, and , administrative and support staff. the university offers more than undergraduate programs, masters programs, doctoral programs and a doctor of veterinary medicine program. research is a major strategic priority for nc state, with over $ million in research awards (fy ) and over , intellectual property disclosures as of . supporting researchers is one of the nc state university libraries’ strategic goals. in particular, we focus on having a ‘strategic alignment of resources to advance the capacity of our researchers and partners’. this study is an attempt to rediscover and reinforce our support for researchers and partners by interviewing librarians and researchers across campus. we interviewed librarians and researchers from october through may . the internal interviews focused on three goals: capturing a complete picture of our research support services, documenting library assumptions about researchers and creating a sense of ownership in our library participants for the results of the study.
the external interviews focused on determining the behaviors, expectations and workflows of researchers at nc state to uncover new and unmet needs, inform ways to enhance our outreach methods and test library assumptions about researcher needs. in this article, we intentionally focus more on the findings from the interviews with researchers. our recommendations for evolving library research support and ways for researchers to discover relevant research assets provided by our library can be applied in other academic library contexts. methodology we used semi-structured qualitative interviews as our primary research method. semi-structured qualitative interviews are optimal for situations where you have only one chance to interview a respondent: they are efficient and maximize responsiveness while also keeping a similar structure between interviews so that information can be more easily compared across multiple interviews. the structure of these interviews is based on open-ended questions with the flexibility of being able to probe further and jump between questions and reorder them based on the flow of the conversation. a major benefit of semi-structured qualitative interviews is that they enable investigators to discover the values of the respondents. the interview instruments we used for this study can be found on our open science framework (osf) project site. because this study was not intended to be statistically representative, we were aiming for data saturation, the point at which no new information is observed. according to studies, data saturation can be achieved with to interviews. for this study, we interviewed researchers during individual one-hour-long interview sessions. we also conducted one-hour-long interviews with groups of librarians composed of approximately two to three librarians per group. library participants were recruited based on availability and reflected distributed representation across library units that interface with researchers. 
researcher participants were recruited based on a sample of researchers across all disciplines provided by the library research and planning unit and augmented with additional researchers’ names that were recommended by the librarians who were interviewed. even though we attempted to get an equal distribution across disciplines and researcher types, respondents who agreed to be interviewed were not equally distributed across disciplines and researcher types.

demographics

of the researchers we invited to be interviewed, completed interviews. we interviewed researchers across these four main career stages: student (undergraduate and phd student), early career ( – years post-phd), mid career ( – years post-phd), and late career ( + years post-phd) (table ). the researchers came from diverse academic units: humanities, social sciences, arts, textiles chemistry, biology, education, educational technology, engineering, agriculture, natural resources, public and international affairs and statistics. despite repeated attempts, we were unable to secure interviews with researchers from agricultural extension units.

career stage | number of participants
student researcher (undergraduate and graduate) |
early-career researcher ( – years post phd) |
mid-career researcher ( – years post phd) |
late-career researcher ( + years post phd) |

table . career stages of researchers interviewed in the study

overarching challenges

before we turn to the detailed findings below, we note that some of the challenges our researchers faced spanned multiple areas or were endemic to all kinds of research and career stages. these overarching challenges included lack of time, inconsistent experiences with promotion and credit for scholarly contribution across disciplines, and a general set of challenges as researchers progress through career stages.
time

lack of time and being too busy was the most frequently mentioned challenge. one researcher said, ‘there’s often talk about collaboration, but it’s hard to find time because everyone is super busy. people want to do stuff with you, but many opportunities are missed because of lack of time’. others described maintaining a rotating strategy of neglect between family, research and teaching as their way of balancing these competing demands.

promotion and credit

another overarching challenge was related to promotion and tenure. often researchers have to balance different expectations between what modern scholarship entails and traditional promotion and tenure practices. one researcher reported that he experiences ‘problems incentivizing interdisciplinary work. i collaborate with a graduate student in electrical engineering doing stuff that is totally amazing and different and we write an article. i get credit for half of one article’. institutional promotion and tenure frameworks operate differently across the academic units, resulting in a disproportionate distribution of credit. likewise, some academic units reward individual contributions via promotion and tenure practices, contradicting recent trends in research funding which favor collaborative, team-based research.

different challenges at different career stages

each career stage articulated a unique challenge. student researchers felt insecure in their status as a researcher. one wished they could ‘make [them]self more confident and have a way to have older scholars have more faith’, reflecting both a lack of confidence in themselves and a perceived lack of trust in their abilities by their mentors.

early-career researchers had a different challenge. they reported difficulty getting used to ‘that side of the desk’ and one commented that ‘having library spaces has been a lifesaver – being able to come in and hide from students’.
their role has changed from student or postdoc to early-career faculty and they are still figuring out what exactly that means.

researchers in the middle of their career emphasized a lack of time as their greatest challenge. they seem to feel comfortable in their role as a faculty member, but one listed ‘time, eat, sleep, doing things with my family’ as their most critical challenges. they consistently reported having lots of responsibilities and feeling overwhelmed trying to accomplish them all.

late-career researchers’ specific challenge was related to a change in their working relationships. often in late career, they are further away from the actual research and are performing more administrative and managerial work. they have to accommodate and manage their working situation and partnerships. one stated they wished for ‘ways to engineer projects to fit the reality of people i work with: i really have to know the limitations’.

comparing internal vs. external interviews

interviewing librarians and researchers as two separate groups gives us the opportunity to compare their responses. overall, it appears our assumptions and thoughts on researcher support were close to what researchers reported themselves, with a few exceptions.

researcher support and services

in order to capture a complete picture of all the different services we offer to researchers, we asked librarians, ‘what type of help do researchers come to you for?’ and, ‘what are the primary services that the library offers researchers?’. the top answers from librarians were collections, consultations, search strategies, scholarly communication support, data management planning, data visualization support and technology lending. when we asked researchers, ‘what kinds of help or resources do you come to the library for?’, researchers mentioned all of the above.
in this category our assumptions about researcher support were accurate.

targets for more outreach

we asked librarians, ‘what are some things we are doing that researchers might not know about yet?’ to get a list of services that librarians thought could use more outreach. librarians’ most commonly listed answers were intellectual property assistance, technology lending, high-tech spaces (like our teaching and visualization lab with degree projection), digital media creation, and data and visualization services. researchers did mention three of these as being support they had received from the library: technology lending, digital media creation, and data and visualization support. they also talked about attending events in our high-tech spaces, but did not report using them for research yet. finally, none of the researchers we interviewed mentioned intellectual property assistance, so this may be a target for more outreach.

outreach strategies

the final category of overlap between librarian and researcher interviews was about outreach. we asked librarians, ‘of all the outreach modes you’ve tried, what has been the most effective?’ and we asked researchers, ‘how do you discover events or new resources to support your research?’. both groups reported e-mail as being the best and primary form of outreach. because so much of their work is already based in e-mail, e-mail is the most likely place for a researcher to discover new support. both groups also mentioned drawbacks to e-mail, primarily that not all e-mails get read, and not all e-mails that have been read are acted on. both groups also highlighted departmental meetings and external events, personal relationships, library workshops and programs, and course-based instruction as being other effective methods of outreach.

communication preferences

researchers encountered many challenges related to outreach and communication.
they specifically mentioned information overload, information decentralization and timing as being perpetual challenges. researchers indicated that they receive too much e-mail and the e-mail they receive is often poorly formatted or overburdened with text. they cited irrelevant extra information as the primary problem with e-mail as a vehicle for outreach directed to them. researchers do not want to see ‘walls of text’ and they want more images and white space in communications they receive, along with more informative subject lines.

respondents overwhelmingly want a centralized place to discover events and helpful resources and to be able to use filters to see only the events and resources relevant to their needs and interests. they said that there were too many different sources of information and that information is often hard to find because it is siloed.

timing of outreach is a perpetual challenge recognized by the researchers. they cited right before or at the beginning of fall semester (which equates approximately to autumn or michaelmas term) or during exams as the best times to push outreach to them.

recommendations

some recommendations based on our findings from these interviews include:
• infuse rich media in all forms of outreach/marketing communication
• attend department meetings to give relevant, library-related updates
• time outreach right before or at the beginning of fall semester or during exams
• support the aggregation of campus-wide events and helpful resources with filtering to enable personalized search and discovery.

information-seeking behavior

information sources

we asked researchers, ‘what kinds of information do you rely on to do your research?’, aiming to establish the core set of information sources used by these scholars.
researchers reported primarily using journals, books, government data, conference proceedings and their personal network of colleagues (including faculty referring to student colleagues) as information sources. this resonated with findings of the librarian interviews regarding the most popular collections requests from researchers. in addition to the above set of library resources and personal networks, the researchers reported a diversity of sources they rely on including:
• data sources such as twitter, industrial data, images
• web sources such as the wayback machine, blogs, google scholar and listservs
• technical sources such as code written by other researchers, software (and manuals) and github (a standard for some fields like climate research)
• dissertations, abstracting and indexing databases, reference works
• special collections such as archival newspaper collections and image collections
• unique sources such as citizen groups in specific regions, courses, large-text corpora, news, and grants.

locating information

we asked researchers, ‘how do you locate the information sources that you rely on (referred to in the previous section)?’. every researcher reported using google scholar, with some researchers preferring it over any other search strategy while others use it as a last resort. student researchers reported using web-based search strategies such as summon and google scholar to find the information sources they rely on. early- and mid-career researchers reported also using physical visits to libraries, personal networks, twitter and serendipitous discovery (primarily through physical browsing). late-career researchers reported using journal alerts and journal tables of contents, select library resources (including librarians), twitter and their personal networks to help them find the information sources they rely on.

recommendations

some recommendations based on our findings from these interviews are listed below.
• we need to consider how the library supports researchers’ use of the non-library information sources and find ways to incorporate support for those resources.
• the preference for google scholar is not surprising, but helps us think about our goal to make researchers aware of library services and resources at their point of need. we have emphasized service centralization to some extent (e.g. summon search) but services still exist in silos and we do not offer levels of customization akin to the way researchers use google scholar. we need to consider possibilities for adding ways for researchers to customize their experience with library services and collections.

locating help

findings

we asked researchers, ‘how do you look for help from others on campus?’ and, ‘what kinds of help/support/resources do you come to the library for?’. all researchers we interviewed reported using their peer networks as their starting point for looking for help. student researchers reported seeking help from their faculty mentors and supervisors as well as attending library workshops. early-, mid-, and late-career researchers primarily sought help from their disciplinary communities (e.g. listservs). specific units on campus that respondents also named as sources of help included the office of faculty development, office of information technology, distance education and learning technology, proposal development unit, office of research commercialization and the benefits office.

library help

researchers specifically mentioned the kinds of help they seek from the library. these are documented in table .
help category: specific ways researchers reported seeking help from the library

access to collections and resources
• students reported using databases, journal articles, lynda.com and books in the collection
• all other researchers reported using journals, government documents, standards, special collections, audio recorders, microfilm readers and the dmptool (for developing data management plans)

service points
• across all researchers, the most common service points used were tripsaver (inter-library loan), the library website, the catalog, chat and e-mail

consultations
• students reported requesting help with literature searching and using statistical analysis software
• early-career researchers reported requesting help with research data management
• mid-career researchers reported seeking help with data management and storage, data visualization, web development, special collections, help finding collaborators, literature searching, bibliometrics for promotion and tenure, grant-seeking and systematic reviews
• late-career researchers reported seeking help with early-stage prototyping, literature searching, creating search alerts and bibliometric analyses

course-integrated pedagogy and instruction
• students did not report requesting help for instruction or pedagogical needs
• faculty reported using virtual reality and augmented reality services, digital scholarship, data visualization and literature search strategies as part of their instructional needs

workshops
• faculty reported recommending students attend workshops hosted by the library

events
• faculty interviewees were the primary group that reported attending library events and requesting help from the library to host and produce events

spaces
• students reported using study carrels
• all other researchers reported using faculty research commons spaces (limited to faculty only) for group project work and individual work, media labs and sound booths, the vr studio and various library showcase spaces
table . ways in which researchers sought help from the library

challenges

some of the major challenges facing researchers as they look for help from others on campus center on a major gap in communication about resources and services available to researchers and how to gain access to those resources. one stated there is ‘some kind of an information and communication gap … some information is not written anywhere and some information is spread across many different places’. researchers consistently reported that it is often difficult to access or discover sources for help and that these resources are not organized in a way that is easy to find. researchers also reported that for projects or initiatives that require diverse skill sets and expertise, it is difficult to find experts and, when found, difficult to get them together to forge a path forward.

we asked researchers about their experiences attending networking events on campus designed to bring researchers together to spark potential collaborations. the researchers reported that they understand that these events are co-ordinated with good intentions, but that since these networking events are not co-ordinated by the researchers themselves, the incentives to participate are low and that these events are sometimes perceived as irrelevant and potentially a waste of time.

recommendations

some recommendations based on our findings from these interviews are listed below.
• establish and grow relationships with campus partners to develop a shared understanding of complementary services to create a stronger network of support for researchers at different stages of their careers.
(partners include but are not limited to: office of faculty development, office of information technology, distance education and learning technology, proposal development unit, office of research commercialization and the benefits office.)
• endorse the aggregation of campus-wide services and resources for researchers with filtering to enable personalized search and discovery and enable easy editing to keep the information updated.
• when developing events intended for researchers to network and start new collaborations, involve researchers in the planning process so that they are more likely to be invested in the experience and feel more confident that they will derive benefit from participating in the event.

data practices

findings

we asked researchers, ‘what kinds of data does your research typically produce (or use)?’ and, ‘have you encountered any challenges in the process of working with the data your research produces?’. researchers reported using small to very large (‘big data’) data sets, and formats of data varied widely from paper files to digital files. images and spreadsheets were the most common data types, followed by interview transcripts, survey data and physical data. researchers reported that they mostly generate the data themselves, but also reported using externally produced data, both only available for a fee and openly available (e.g. government agency data, public health data, supplementary data from journal articles and data from software code generated by other researchers).

challenges

storage was by far the biggest challenge faced by the researchers we interviewed. some researchers said that they were aware of storage options on campus, while others indicated that they did not know what storage resources were available to them and often funded storage solutions themselves. some researchers reported using commercial storage (e.g.
dropbox), noting that they did not have confidence in the long-term viability of those commercial storage-hosting solutions. finding the optimal storage solution that all collaborators can use was noted as a challenge by multiple researchers, who often opted for tools such as github, dropbox or google drive.

data analysis was cited as a challenge, especially for researchers who were incorporating new methods that are outside of their skill sets into their research, such as statistical methods, text mining, or using tools unfamiliar to them. data quality was also noted as a particularly vexing challenge for researchers working with data sets generated by other entities, due to missing data, lack of metadata and errors.

recommendations

some recommendations based on our findings from these interviews are listed below.
• develop outreach materials to help researchers understand how they can use various data storage options (with or without grant funding, for sensitive data and with outside collaborators).
• promote data analysis skills workshops to researchers who are incorporating new methods or tools into their work that are outside their skill sets.
• promote our data consultancy service to departments that are more likely to lack existing data science skill sets (e.g. humanities, social sciences, education).

skills for success

findings

we asked researchers, ‘what are the most important skills that you and/or your research team need in order to be successful?’.
this was followed with questions about how those skills are typically acquired, as well as a set of questions about what skills the researcher is looking to develop and those skills the researcher expects their collaborators (or in many cases, members of their research group) to develop.

specific skills

researchers from every career group mentioned specific skills they sought to learn next. these were skills they needed either to conduct a new kind of research or to complement their existing skills. skills reported by the researchers included python, r, gis (geographic information system), ai (artificial intelligence), the internet of things, crimson hexagon and other specific tools (see table ). one trend of note was that earlier-career researchers often listed more specific skills, while later-career researchers often emphasized soft skills. researchers mentioned personnel management, interpersonal skills, communication, project management, reproducibility and leadership as being very important.

research skills
• collecting and analyzing qualitative data
• reproducibility vs. patentability
• entrepreneurial research
• identifying topics for research
• irb process
• reading an article

library skills
• building a search strategy
• finding data from articles
• application development
• market research
• citation management
• tracking news

programming and coding
• commenting on code
• database development
• web development
• intro to programming (r, julia, python, go, sas, stata)
• ai and machine learning
• contributing to open source
• internet of things
• machine-text analysis

communication and data visualization
• how to write an abstract
• personal branding
• how to deliver an effective presentation
• how to justify your research
• animation
• community engagement
• infographics
• data storytelling
• graphic design
• how to write a business plan
• grant writing
• how to talk to the press

specific tools
• atlas.ti
• dedoose
• crimson hexagon
• excel and pivot tables
• voyant
• unity
• twine
• gis
• github
• microsoft word tips
• omeka

statistics and methods
• basic statistics
• agent-based modeling
• clinical interviewing
• dealing with large data
• semi-structured interviews
• survey design
• graph analysis

teams and interpersonal relationships
• conflict management
• power dynamic ethics
• project management
• saying ‘no’
• sharing articles with a group
• agile methods in research

personal development
• memory skills
• imposter syndrome
• networking confidently

table : skills and related training topics described by researchers as important to their work

changing technology

one challenge that surfaced in our discussion of skills was the rate at which technology changes. our researchers reported feeling that technology changed too quickly, or that they could not count on a tool or data type lasting more than a few years before going obsolete. some researchers responded to this shift by planning to learn one or two new things every year. others said that instead of maintaining a list of skills to learn, they let their project dictate what skills they need next. this way they can avoid investing too much effort into learning a skill that will no longer be relevant in a few years.

student skills

students we interviewed mentioned several basic skills, including general knowledge of their field, how to read an article, editing, communication and core research skills. because we interviewed both faculty and student researchers, we were able to compare the skills students mentioned with those that faculty thought were necessary for success.
every skill students said they needed was also mentioned by faculty, but faculty mentioned many more. one possible explanation is that faculty, after performing research in their field, are thinking in terms of their whole career based on what skills they have already needed. students, on the other hand, are often thinking on a shorter timeline, e.g. how to finish this project or their degree.

recommendations

some recommendations based on our findings from these interviews are listed below.
• continue to develop and deliver new workshops related to skills specifically mentioned by researchers.
• develop and deploy new outreach strategies to help our workshops reach larger audiences.
• partner with other campus entities such as the office for institutional research and planning, human resources, and the office for institutional equity and diversity to offer workshops on management, human subject research protocols and frameworks for the ethical application of power dynamics in educational settings.

collaboration

findings

we asked researchers, ‘do you regularly work with, consult or collaborate with others as part of your research process?’. this was followed up with questions about challenges experienced in the process of working with others as well as what makes for a successful collaboration. every one of our participants reported collaborating with others and most people collaborated both on and off campus. these collaborations ranged from being solely within their department or their cluster to collaborating with ngos (non-governmental organizations), government agencies, or industry partners. our respondents reported that seeking complementary skill sets was a key driver for collaboration and that securing funding for multidisciplinary work was easier if they were engaged in a collaboration with others.
finally, researchers reported that dedicated space was necessary for successful collaborations. many reported physical collocation as being conducive to collaboration. groups that included long-distance or international collaborators also stressed the importance of virtual space for collaboration, relying on platforms like github and trello to keep everyone up to date and involved.

challenges

some challenges related to collaboration were finding collaborators, logistics and setting expectations. faculty noted that pre-arranged networking events rarely spark collaborations because they are put together by a third party, lack a focused agenda and can create awkward situations. one stated, ‘i’m not going to go to a pizza party, but if there was an event to learn sas [a statistical software suite], i would go to that’, highlighting that networking, by itself, was not enough to get them to attend an event. student researchers reported feeling differently, stating that they may benefit from networking events because finding other student researchers is much more difficult.

logistics was an additional barrier to collaboration. participants found scheduling difficult, especially in groups that included international collaborators, noting that there might not be a single hour that falls inside every participant’s work schedule. researchers dealt with issues such as dropped calls, missing audio, or delays. they also expressed difficulty with online conferencing platforms like skype, google hangouts and zoom.

additional challenges researchers experienced related to collaboration were a lack of clear expectations and a lack of administrative support. many researchers said that when they engage in interdisciplinary work, they have to play the role of project manager: they need to ‘hunt people down, schedule them, add in buffer time, and facilitate communication so that everyone knows what is going on’.
at the same time, many researchers commented that they did not feel adequately trained in project management best practices. where possible, researchers preferred to hire a project manager to provide logistical co-ordination and project oversight so that they (the researchers) could focus on conducting the research.

recommendations

some recommendations based on our findings from these interviews are to:
• involve faculty researchers in the planning of networking events meant to spark new collaborations
• investigate and codify best practices surrounding online collaboration platforms
• offer training on project management tools and strategies.

sharing and publishing

findings

we asked respondents, ‘how do you typically share or publish your research?’, and, ‘are you doing any non-traditional publishing?’. examples of non-traditional publishing include publishing data sets, video abstracts, blogs and digital scholarship projects. most researchers preferred publishing in traditional journals, books and conference proceedings because that is how they get credit for their work, but they also reported seeking out non-traditional publishing opportunities based on the same research. those researchers who had engaged in non-traditional modes of publishing reported posting research updates on their lab or personal website; publishing data visualizations, white papers, and posting reports on researchgate or mendeley; authoring newspaper articles or blogs; posting data, code, and articles on github; producing webinars and podcasts; and leading public town hall meetings.

most of the researchers reported doing some sort of scaffolded publishing.
we define scaffolded publishing as a process whereby researchers present at a conference, then submit a journal manuscript or short-form book manuscript for publication, as well as create additional non-traditional scholarship such as a digital project or blog post allowing them to explore other ways to express their scholarship. this scaffolding of publications approach enables researchers to meet institutional expectations while also leveraging more creative avenues to share and grow potential for next steps in their research.

challenges

an overarching challenge was related to promotion and tenure. often researchers have to balance different expectations between what modern scholarship entails and traditional promotion and tenure practices. referring to the incentives for interdisciplinary work, one researcher commented that ‘the incentive structure is built around publishing articles, not chapters, blogs or social media engagement’. still, non-traditional publications are seen as critical because, as one researcher put it, non-traditional publications do ‘influence your reputation as a scholar even though it doesn’t weigh on p&t [promotion and tenure]’. some researchers mitigate this by writing an article based on a digital humanities project, getting the credit they need for their career while producing compelling research for their disciplinary community of scholars.

some researchers reported that publishing work via open access (oa) and open data channels is seen as a practical next step, but that they were limited by a lack of incentive, funding, concern about journal quality and, in some cases, a feeling that their work is not ‘up to the level’ to be considered reproducible.
recommendations

some recommendations based on our findings from these interviews include:
• provide examples and support for scaffolded publishing
• provide guidance and infrastructure for non-traditional publications by helping to adopt and apply citation guidelines for non-traditional forms of scholarship (such as data sets)
• aid researchers in leveraging alternative metrics such as web visits and social media mentions
• enhance outreach and support structures to help researchers find pathways toward oa
• offer incremental support for building researchers’ confidence for open research, specifically to enable reproducibility.

next steps and conclusion

this study has given us deep qualitative data and insights to act on. while creating a comprehensive set of findings and recommendations constituted a valuable outcome, we also have a set of next steps to take this research further. one of our next steps is to generate a set of interview questions that the librarians who regularly support researchers can use to stay abreast of emerging researcher needs. while we found some high-level, recurring issues throughout the researcher interviews, we also found that every interview revealed a new perspective and a new need or insight. because each interview provided new information, we want to create a way to make this process more continual.

throughout the interview process we received substantial feedback and heard a variety of challenges that are not directly related to the library or our services. one of the next steps of this effort is to find a way to share this feedback with external stakeholders by playing the role of advocate for researchers on campus. another next step is packaging these challenges and tasks with which researchers need help into a set of ‘research tracks’.
a research track is a multi-step process that spans multiple departments and uses resources that the library offers. an example would be the case of a researcher needing to find help to create a video to showcase their research. our library offers workshops on video editing, we have a digital media studio with software and computers to do the editing, we have librarians and staff who offer consultations, we have collections that describe the process as well as include examples of video abstracts, and we also have online training for users who are unable to come to the library for help. we want to gather these intersectional tasks together in one view, or ‘track’, as an easier starting point for different types of researchers to discover relevant research assets provided by the library. finally, we will also work with our acquisitions and metadata and user experience units to develop new ways to offer these research tracks at the user’s point of need, with a goal of creating an automated solution for pulling these pages together.

data accessibility statement

this study is documented fully with the interview instruments as well as the coded, anonymized data set available at https://osf.io/ akd v/ and occasional posts and updates on the process can be found at medium.com/raising-the-profile.

abbreviations and acronyms

a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘full list of industry a&as’ link: http://www.uksg.org/publications#aa

competing interests

the authors have declared no competing interests.

article copyright: © colin nickels and hilary davis. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.

corresponding author: hilary davis
head of collections & research strategy
north carolina state university libraries
north carolina state university, us
e-mail: hmdavis @ncsu.edu
orcid id: https://orcid.org/ - - -

co-author: colin nickels
orcid id: https://orcid.org/ - - -

to cite this article: nickels c and davis h, “understanding researcher needs and raising the profile of library research support,” insights, , : , – ; doi: https://doi.org/ . /uksg.

submitted on september             accepted on november             published on january

published by uksg in association with ubiquity press.
jlis.it , (may ) issn: - online open access article licensed under cc-by doi: . /jlis.it-

© , the author(s). this is an open access article, free of all copyright, that anyone can freely read, download, copy, distribute, print, search, or link to the full texts or use them for any other lawful purpose. this article is made available under a creative commons attribution . international license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. jlis.it is a journal of the sagas department, university of florence, published by eum, edizioni università di macerata (italy).
archival education in the age of social media in algeria: opportunities and future horizons

behdja boumarafi(a), khaled mettai(b)
a) institute of library & documentation, university of constantine , algeria
b) institute of library & documentation, university of constantine , algeria

contact: behdja boumarafi (phd), associate professor, bboumarafi@gmail.com; khaled mettai, doctoral researcher and part time faculty, khaledmettai@gmail.com
received: july ; accepted: january ; first published: may

abstract

digital technology is changing the way we learn, interact, work and entertain ourselves because of its potential to penetrate all spheres of life. the digital revolution is transforming the education industry worldwide. in recent years, extensive debate and research have explored digital technology, focusing on developing a clear understanding of its capabilities as a platform for making social sciences and humanities applicable to the cyber environment of the twenty first century. the widespread use of social media, supported by the rapid growth of digital culture, is making learning ubiquitous through the creation, capture and sharing of knowledge. this is enhancing students’ engagement and learning efficiency. it is also improving learner-instructor interaction by engaging students in more meaningful participation in their own education and academic achievement. to accomplish this, both students and instructors need the skills and expertise to capture the positive effect of digital technology and to engage in the new learning environment. this paper reflects on the new learning environment supported by a curriculum reflecting the digital technology era. it discusses how students at the department of archives at the university of constantine build on the digital capabilities of social media to engage in independent learning by creating and sharing knowledge.
keywords

digital technology; digital tools; archival education; social media; students’ engagement; algeria.

citation

boumarafi, b., mettai, k. “archival education in the age of social media in algeria: opportunities and future horizons”. jlis.it , (may ): - . doi: . /jlis.it-

introduction

as societies move towards the digital era, there is a greater need to harness the technological advancements of the web and promote their use in instructional and learning activities. in the last few years, extensive discussion and heated debate have explored such use, much of it focused on developing a clear understanding of the capabilities of information and communication technology as a platform for enhancing instruction and learning. digital technology and its accompanying tools are putting academic institutions under pressure to make organizational changes that render their education systems applicable to the cyber-environment. however, social sciences programs face innumerable challenges in nurturing and managing the impact of such developments. with the advent of internet technology and its gradual penetration in developing countries, education re-engineering is needed to optimize the positive effect of web technology and its growing applications in instruction and in knowledge creation, sharing and transfer. social media have emerged as one of the platforms for such endeavors because of their potential to make vast amounts of data available to learners. for instance, facebook has reached two billion users all over the world. they create and share around billion pieces of content every month. therefore, the use of digital technology in the form of social networks is growing at a scale that is both threatening and promising.
on the one hand, they have the potential for global involvement of institutions; on the other hand, the question of how to deal with the big data generated by myriads of users will soon become unavoidable and will need to be resolved. this requires new skills and expertise that may not be applicable to social sciences and humanities programs such as archival education. nonetheless, the field is in a state of transition as a result of several factors, specifically economic growth and the gradual application of ict in activities such as digital preservation, electronic record management and several other areas, to optimize the positive effect of web technologies. the heavy use of digital technology is changing every sphere of life, including education, where rigid learning models do not work in a digital era that promotes flexible systems supported by technological tools. in retrospect, new communications technologies and research data infrastructure are now appreciated by humanities researchers and learners, enriching the connections within the academy and powering the linkages of content and data (owen, s. et al. a). students in algerian universities are creating and sharing content through social media, thus engaging in their own learning in a more meaningful way.

the institute of library and documentation was founded in it developed gradually from a small institute to the status of national institute with a students’ population of about reading for a ba, master and doctorate. there are full time faculty members and a number of part timers serving in the institute’s two departments, namely library science and archival science. a total students are enrolled in the department of archives and the number is growing every year. this paper explores the use of social media by students in the department of archives at the university of constantine (algeria) to bridge the gap between digital technology and archival education.
it investigates the types of social media they use and the types of activities they are likely to use social media for. it looks at how students are investing their technological skills and learning time for better academic achievement, and examines the factors affecting their use.

archival education in algerian universities

archival science came into the mainstream of higher education in algeria in . but, it started as a single course long before that, offering archives coursework as part of the library science bachelor program. in it grew into a standalone program offering bachelor, master and doctorate degrees to a growing number of students. courses on archives are designed to educate and train a new generation of archivists qualified to take on different roles and responsibilities for archival activities in public and private organizations. providing adequate coverage of curriculum content, encompassing the academic courses and technical skills that are vital for educating future archivists, is important for building a stronger archival profession in algeria and for fostering professionalization among the younger generation of graduates, who are required to work in the digital era. therefore, the curriculum is constructed to serve the archival collections in the country, give a new perspective to the profession and open doors of opportunity for archivists to acquire the expertise required to accommodate the technological requirements of the digital environment, and so contribute to essential understandings for the development of future archival systems and technologies that operate at a global level (mckemmish, gilliland-swetland, & ketelaar, ). this depends on the extent to which the program content and characteristics respond to the needs of digital technology and provide the necessary tools for that.
at this point, it is necessary to develop clear strategies for integrating into the digital arena by:
- developing and providing access to new and innovative learning and instruction tools;
- tapping into the wide range of ict applications to provide access to new formats of education packages;
- developing a common vision of the demands of the digital technology environment;
- providing the necessary institutional and technological infrastructure commensurate with global developments in archival science.

the rapid development of internet technology, specifically web 2.0 technology and tools, has emerged as the driving force reshaping the global environment within which social sciences and humanities, including archives, are operating. the factors at work include it-savvy user demographics (the internet generation), complex user needs, changing collection formats (from paper to digital), increased use of social networking, and global pressure for sharing knowledge, among others. accordingly, the curriculum has been revised a number of times to include courses that have a direct impact on the development of the field of archival science in relation to the digital technology environment. courses featured in the program, such as electronic archiving, application of new technology to archives, archive digitization, electronic record management, archival institutions through the web, metadata & archives, and digital technology & archives, are direct applications of digital technology and tools to archives.

literature review

the literature cites four dimensions of learning styles using the social web; these are:
- active versus reflective: trying first or thinking first;
- sensing versus intuitive: learning the facts or discovering them;
- visual versus verbal: using visual material or verbal explanations;
- sequential versus global: starting by understanding the linear steps or getting the big picture first.
(mavropalias, n.d.)

modern education systems are shifting the emphasis from instructors towards learners who take part in their own learning. students in the department of archives are using social media, specifically facebook, to collaborate with one another on group projects and term papers; they exchange lecture notes, previous exam questions and other learning material related to their courses. one possibility afforded by social media is the ability for students in the humanities to use and create linked data through open systems and through socially constructed linkages which are driven by the perspectives and understandings of individuals (owen, s. et al. b). the ubiquitous presence of social media is generating big data outputs and has attracted researchers to study both the positive aspects and the concerns of using such tools in various settings, offering new and varied ways of using computers and/or mobile devices (paliktzoglou and suhonen, ). as education institutions embrace social media, there is a need to optimize the positive effect of such technologies and bring them into pedagogy, to make instruction and learning active and applicable to the cyber environment of the new millennium (boumarafi, a). this generation is immersed in a world of evolving technologies in which internet applications, specifically web 2.0 tools, are having a considerable impact on creating a technology-driven culture in every society. kirschenbaum ( ) states that digital technology and tools are a social enterprise: a network of people who research together, argue, compete and collaborate.
students all over the globe are using social media capabilities to create and share content, exchange ideas and establish networked communities. as a result, a huge amount of born-digital data is being created. this, in a way, promotes the digital technology and tools movement through innovation in humanities scholarship, which still needs more investment, while the challenges of infrastructure and staff (spiro, ) in compiling, organizing and preserving social media content for future use must also be recognized. new digital tools need to be created for data warehousing and text mining. the current approach to digital technology in social sciences, especially in developing countries, is still limited in terms of scope and research projects. however, as research in this area increases, demand for digital scholarship will become inevitable (green, ) in a new intellectual space aiming for global impact and internationalization (zorich, ). this has challenged the status quo of social sciences and humanities, as it requires new skills and competencies to engage properly in the digital technology environment.

pahl ( ) studied the evolution and change in web-based teaching and learning environments with a focus on four perspectives: content, format, infrastructure and pedagogy. the author concluded that the lack of standardized technology, its limited life expectancy and its cost are among the major problems facing teaching and learning environments that are struggling to keep up with the constant changes in technology and theoretical advances in education.

in , pannapacker posited that digital technology is “the next big thing in a long time” because its implications affect every field, including arts and social sciences. as a result, dozens of grants have been awarded to projects in digital humanities, focusing on the application of computing technology to humanistic inquiries and on humanistic reflections on the significance of that technology (sula, ), in order to develop a clearer understanding of the capabilities of such technology as a new platform for leveraging arts and social sciences efficiently and addressing large scale participation in the creation of digital content and tools. ellison, steinfield & lampe ( ) observed that facebook supports resource sharing by establishing the social foundation between students and their peers. in essence, the advent of social networking technology is also the advent of new learning systems and of rapid growth in educational technology. although social networks were not initially created for educational purposes, paliktzoglou, stylianou, and suhonen ( ) found evidence that google apps can support pedagogical activities by increasing students’ engagement and team work. therefore, it is important for archival students to learn how to integrate evolving technology into learning strategies, not just for technology’s sake, but for the added value that these tools, already familiar to learners, provide (brotherton, ). boumarafi ( b) investigated the use of social media by algerian students and the extent to which they use it for academic purposes, and found that facebook is the most popular and is used in some academic activities.

methodology

the published literature was examined to design a questionnaire for this study. a pilot survey was conducted with a small group of archives and library sciences students to assess the strengths and any weaknesses of the questionnaire. based on their suggestions the instrument was revised and then finalized.
faculty members were approached for permission to distribute the questionnaire during their class sessions. this allowed the greatest accessibility to the target population, which consisted of master students who were in class on the day the questionnaire was distributed. completed questionnaires were collected, out of which ( . %) were used in the study; the rest were considered unusable. the majority of these respondents were female (n= ; . %), while males represented (n= ; . %); in fact, female students outnumber male students in the institute in general. all respondents have a laptop, a smart phone and an internet connection at home. this means that they are able to meet the material requirements of digital technology and tools.

findings

preferred social media

respondents were asked which social network they prefer to use. their preferences are summarized in table . as expected, all respondents ( %) gave the top rank to facebook at a mean of . . previous studies also identified facebook as the most frequently used sns. youtube is ranked second in preference with a mean of . . respondents put twitter in third position in terms of importance with a mean of . , followed by linkedin which scored . . skype, which respondents use to keep in touch with family and friends at home and abroad, was put in fifth position with a mean of . . myspace is ranked last with a mean of . .

type of social media    mean    std. dev.    rank
facebook    . .
youtube    . .
twitter    . .
linkedin    . .
skype    . .
google+    . .

table . preferred social network sites (n= ).

activities carried out using social media

respondents were asked to specify the academic activities they carry out using social media and to indicate the importance of each. their responses are presented in the form of means and standard deviations in table .

activities    mean    std. dev.    rank
discuss group projects    . .
share assignments and course work    . .
share files and lecture notes    . .
create content    . .
exchange ideas    . .
join academic discussion forums    . .
make a presence in the cyber-space    . .
improve foreign language skills    . .
self-regulated learning    . .
improve communication skills with students abroad    . .
share ideas and promote creativity    . .
make contact with faculty members easier    . .
enhance academic achievement    . .
get assignments from faculty members    . .
send assignments to faculty members    . .
get grades for assignments completed    . .

table . activities carried out using social media (n= ).

from the results exhibited in table above, the respondents consider seven activities as important at various levels. “discussion of group projects” is considered the most important and gets the top rank with a mean of . . respondents also placed high importance on sharing assignments and course work (mean= . ), followed by “share files and lecture notes” (mean= . ), “create content” (mean= . ), “exchange ideas” (mean= . ), “join academic discussion forums” (mean= . ), and “make a presence in the cyber-space” (mean= . ). all of these activities were rated close to “very important”. learning a foreign language, especially english, has become a very important skill for 21st century learners. it therefore seems logical that respondents considered “improve foreign language skills” the next most important activity they carry out (mean= . ). “self-regulated learning” was also perceived as important (mean= . ). in the same manner, the perceived importance of “improve communication skills with students abroad” was evaluated with a mean of . . the rating of “share ideas and promote creativity” (mean= . ) supports the claim that students use social media to create content. respondents agreed that social media make contact with faculty members easier (mean= . ).
“enhance academic achievement” (mean= . ) indicates that students benefit from the use of social media technologies, which have a positive effect on their academic performance and growth. the last three activities, namely “get assignments from faculty members”, “send assignments to faculty members”, and “get grades for assignments completed”, were given less importance, with means less than ; . , . , . respectively. these results reveal the acceptance and adoption of digital technology and tools by archival science students, expressed in a good use of social media in a number of academic activities.

discussion and conclusion

interest in social networking is growing because of the belief that digital technology and tools are becoming essential for socialization, work and study. it seems certain that learning and research will be affected by their evolution into digital forms. therefore, there is a need to understand what motivates archival students at the institute of library and documentation science to use social media as digital tools to participate in their own learning through interaction with peers. the study found evidence that facebook is the most used tool and received the top score. this is consistent with the results of previous studies. in relation to the activities students carry out using social media, the study revealed that the top activities are academic activities between peers. the literature review supports this result. this helps in the deployment of technological networking in archival studies. such deployment affects the way archivists study and represent knowledge, with direct improvement in instruction and learning activities. surprisingly, interaction with faculty members is low.
no doubt, using social networks increases understanding of digital techniques in humanities and social sciences and yields better collaboration between learners and instructors, who share knowledge and ideas and learn from each other’s opinions, on site and remotely, whether creating, collecting, interpreting or sharing knowledge through social media and digital devices. in fact, that is what social media is about.

the use of social media by students in the department of archival science enabled them to create content and share it with peers. social networking proves suitable for self-learning and for establishing connections with peers locally and abroad. overall, our study revealed that, although there is evidence of the use of digital technology in archival education for independent learning, such use remains limited to informal learning.

references

ellison, n. b., steinfield, c., & lampe, c. ( ). “the benefits of facebook ‘friends’: social capital and college students’ use of online social network sites.” journal of computer-mediated communication, ( ), – .
boumarafi, b. ( ). “strategies for the delivery of e-information services to support the e-learning environment at the university of sharjah.” the electronic library, ( ), – .
boumarafi, b. (summer ). “social media use in algerian universities: university of constantine case study.” the iafor journal of education: technologies & education special edition, – .
brotherton, p. ( ). “social network enhance employee learning.” abi/inform complete, ( ), .
mavropalias, k. (n.d.). “social bits: personality and learning style profiling via the social web.” retrieved / / from: http://www.iconof.com/blog.
green, h. ( ). “digital technology and tools: a new model of scholarship in a new intellectual space.” in supporting digital technology and tools for knowledge acquisition in modern libraries.
retrieved / / , from: http://books/google/books? isbn: . .
kirschenbaum, m.g. ( ). “what is digital technology and tools and what’s it doing in english department.” retrieved / / from: http://mkirschenbaum.files.wordpress.com/ / /ad-final.pdf.
mckemmish, s., gilliland-swetland, a. & ketelaar, e. ( ). “communities of memory: pluralising archival research and education agendas.” archives and manuscripts : – .
owen, s.; verhoeven, d.; horn, a.; robertson, s. ( ). “collaboration success in the dataverse: libraries as digital technology and tools research partners.” in proceedings of iatul conferences. paper . retrieved / / , from: http://docs.lib.purdue.edu/iatul/ /openaccess/ .
pahl, c. ( ). “managing evolution and change in web-based teaching and learning environments.” computers and education. : – .
paliktzoglou, v., stylianou, t., & suhonen, j. ( ). “google educational apps as a collaborative learning tool among computer science learners.” in assessing the role of mobile technologies and distance learning in higher education, edited by p. ordóñez de pablos, r.d. tennyson, m.d. lytras. – . igi global, isbn: .
pannapacker, w. ( , december ). “the mla and the digital technology and tools.” chronicle of higher education. retrieved / / from: http://chronicle.com/blogpost/the-mlathe-digital/ /.
spiro, l. ( ). “this is why we fight: defining values of digital technology and tools”. retrieved / / from: http://www.nyu.edu/projects/senger/cdh/spiro.pdf.
sula, c.a. ( ). “digital technology and tools and libraries: a conceptual model.” journal of library administration, : – . http://dx.doi: . / . . .
zorich, d.m. ( ).
“a survey of digital technology and tools centers in the united states.” retrieved / / from: http://www.clir.org/pubs/reports/pub .pdf. http://www.nyu.edu/projects/senger/cdh/spiro.pdf http://dx.doi: . / . . http://www.clir.org/pubs/reports/pub .pdf cultural diversity and the digital humanities original paper cultural diversity and the digital humanities simon mahony received: december / accepted: february / published online: march © the author(s) . this article is an open access publication abstract digital humanities has grown and changed over the years; we have moved away from expecting technology to be a tool to make humanities research easier and faster into one where we are now equal partners. our collaborative projects drive forward the research agendas of both humanists and technologists. there have been other changes too. the focus of our scholarly interest has moved away from its historical origins in text-based scholarship, although that now has many more possibilities, and we are seeing an interest in exploring culture and heritage more widely. where the progress is slower is in our moves towards openness and inclusivity, and this is to some extent hampered by a lack of linguistic diversity. this is being addressed with specialist groups within the major dh organizations on a national and a global level. dh has grown rapidly in china, and the anglophone world could do more to engage with practitioners and potential colleagues in this new vibrant and emerging area. there are certainly western centres that specialize, particularly in chinese texts and historical documents, but this needs to be extended further if we are not to impose limits on the conversations, synergies and collaborations that can result. 
keywords: digital humanities · cultural diversity · multi-lingualism · community · globalism

simon mahony, s.mahony@ucl.ac.uk, http://orcid.org/ - - -
department of information studies, ucl centre for digital humanities, university college london, gower street, london wc e bt, uk

fudan j. hum. soc. sci. ( ) : – https://doi.org/ . /s - - -

introduction

digital humanities (dh) has grown out of what was previously known as humanities computing, and perhaps earlier as applied computing in the humanities, working at the intersection of technology and the humanities. a comprehensive introduction to the growth of dh can be found in the introductory chapter in nyhan and flinn ( ). with this change in nomenclature have come other changes in this versatile and fast-moving interdisciplinary field; the focus has moved away from technology as the servant of the humanities to one where our projects and other activities are of interest to and advance the research agendas of both disciplines. humanities itself is difficult to define but can, in my view, best be described as the study of the human condition, and of human achievement, and alternatively, as the study of that which makes life worth living. dh commonly works and builds partnerships at the intersection of cultural heritage, human achievement and the computational sciences, exploring new areas that were not possible previously.

this is not the only change that has been occurring. historically, dh has developed in a very anglophone environment as english became the language of the internet (with icann) and the lingua franca of the web (with the w3c consortium), along with the domination of the ascii code. icann is extending things now with the new generic top-level domains to include non-latin characters, although only those that are included in anglo/us-centric unicode.
there have been recent studies on the metrics of publication and how these, along with citation counts, have a clear anglo-bias, resulting in incentives for advancement, promotion and funding that favour publication in the english language for the arts and humanities. as domenico fiormonte argues:

the over-representation of us and uk humanities titles [as counted in major indices such as scopus and web of science] will always support arguments in favor of using english as the lingua franca, and the misrepresentation of knowledge production and geopolitical imbalance will continue to thrive (fiormonte ).

as he notes, this is supported by metrics from scopus itself (see meester ). this article looks at the growth of dh beyond the anglophone sphere and some of the challenges that cross-cultural initiatives present.

notes: icann https://newgtlds.icann.org. for this and a cultural critical approach to dh, see fiormonte ( ). for more on this, see crane ( a) and the response by fiormonte ( ) and the comments appended to the latter. see also fiormonte ( ). see, for example, rockwell ( ).

the beginnings of dh are generally ascribed to roberto busa and his collaborations with ibm to create an index verborum of the works of thomas aquinas, a corpus of latin texts, although alternatives to the general narrative have been put forward by some scholars. nevertheless, medievalists such as busa along with classicists were very much at the forefront of humanities scholars using computational methodologies for their data-intensive research projects (bodard and mahony ). referencing my original discipline of classics (as referring to greco-roman studies), examples would be the thesaurus linguae graecae (tlg) and the lexicon of greek personal names (lgpn), both founded in and more recent publications such as the chicago homer, suda online, inscriptions of aphrodisias and roman tripolitania, to name but a few.
these are primarily text-based sources whether that text is found on papyrus, parchment, paper or stone. this also is in a context where classical greek and latin are the two heritage languages of european and western culture, and greco-roman culture forms the foundation of european (and by extension north american) cultural heritage, literature and philosophy. this foundation and reverence is clearly manifest when looking at the canon of literature and the architectural design of many public buildings such as museums and galleries with their columns and porticos mimicking those of athens and rome.

cultural and linguistic diversity

looking back to our earliest writings in europe and the earliest surviving complete work in greek literature from the so-called ‘father of history’, herodotus of halicarnassus (approx. – bc), we read in the very first paragraph of the first page of his histories the justification for this work: that human achievement not be forgotten and that the deeds of the greeks and barbarians should have their glory and particularly that we should know ‘why’ they fought each other (adapted from de selincourt (trans) ). he uses the term ‘barbarian’ in the same way that the chinese understand ‘foreigner’: to herodotus anyone that did not speak greek was a barbarian as their speech sounded like that of sheep. he moves on to account for the origins of the quarrel that began the enmity between east and west (asia and europe), and he arrives at conflicting accounts. the scene is set, however, for the long-running conflict between east and west; he is, of course, talking about greece (europe) and persia (or more particularly the medes as asia, built on the relatively understudied empire of cyrus the great). there is a gulf in understanding between the two. they speak different languages and have different cultures and customs while each, he tells us, ‘without exception believes his own native customs […] to be the best.
[…] pindar was right when he called [custom] king of all’ (de selincourt (trans) , : ). the histories culminates with the invasion of greece firstly by darius and then by his son xerxes and their defeat at the hands of an alliance of greek city states. whether herodotus actually travelled beyond the wider greek world or collected together stories gathered from sailors in the ports of halicarnassus or piraeus is not a discussion to have here but from these accounts, whatever the source, the further east, we read in histories, the more mysterious are the peoples and their customs. whatever the truth, it is clear that differences in language and culture lead to differences in understanding.

notes: tlg http://stephanus.tlg.uci.edu. lgpn http://www.lgpn.ox.ac.uk. chicago homer http://homer.library.northwestern.edu. suda online http://www.stoa.org/sol. inscriptions of aphrodisias http://insaph.kcl.ac.uk. inscriptions of roman tripolitania http://inslib.kcl.ac.uk/irt . see the british museum in london and the library of congress in washington dc as two striking examples.

a critique of the anglo-centrism of dh is nothing new and was made well by fiormonte in his article ‘towards a cultural critique of the digital humanities’. he starts with the perceived tension between methodological differences before moving on to the geopolitics which he claims is endemic in our field, as evidenced by the dominance of such pervasive systems as ascii code (american standard code for information interchange) and the domain name system (administered by icann). the same is of course true for html and the ubiquitous xml, the latter particularly having a pronounced linguistic bias (difficulties with accented characters and right-to-left scripts) as well as the english-based tei guidelines.
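the limitation of ascii mentioned here can be made concrete with a short sketch. this is a minimal python illustration and not part of the original article; the sample strings and the helper names `ascii_encodable` and `utf8_length` are assumptions chosen for this example. it checks whether a string fits within 7-bit ascii and how many bytes the same string needs in utf-8.

```python
# illustrative sketch only: sample strings and helper names are hypothetical,
# introduced for this example and not taken from the article.

def ascii_encodable(text: str) -> bool:
    """Return True if every character of `text` fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # Raised as soon as a character falls outside the ASCII range.
        return False


def utf8_length(text: str) -> int:
    """Return the number of bytes needed to store `text` as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    samples = ["digital humanities", "humanit\u00e9s num\u00e9riques", "数字人文"]
    for sample in samples:
        # Print the string, whether ASCII can hold it, its length in
        # characters, and its length in UTF-8 bytes.
        print(sample, ascii_encodable(sample), len(sample), utf8_length(sample))
```

in this sketch only the english phrase can be encoded as ascii; the french and chinese phrases need a unicode encoding such as utf-8, where accented and chinese characters occupy more than one byte each.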
as i explain to students when giving visiting lectures and talks overseas, english is the language of the web and of digital publishing. it is, therefore, an additional incentive to learn english for anyone wanting to work in the myriad internet industries regardless of whether or not they study or train in an english-speaking environment. the counter side to this is, of course, that working in these industries does not incentivize english speakers to develop other language skills. this dominance of the english language is, of course, not only an issue for the digital humanities: a trip to the fudan library will doubtless reveal the wealth and spread of journals and magazines it holds, covering many fields and disciplines for which the major and most prestigious publications are often in english.

for a european example, in italy informatica umanistica has long been established at the university of pisa and elsewhere. in my former institution, i was very pleased to have as colleagues a former lecturer and a former student from that programme (elena pierazzo now at grenoble and raffaele viglianti now at mith), but we hear little about the activities of the italian informatica umanistica in mainstream dh publications, with the exception, perhaps, of fiormonte quoted above and geoffrey rockwell, also included in this publication, who draw our attention to the importance of tito orlandi and others. this is despite the continuing work of the very research centre founded by roberto busa being carried out on the index thomisticus treebank at circse at the università cattolica del sacro cuore, milan.

notes: informatica umanistica, university of pisa https://www.unipi.it/index.php/lauree/corso/ .

colleagues at the ucl centre for digital humanities (ucldh), julianne nyhan and andrew flinn, have worked on uncovering some of the lesser known histories of the development of the digital humanities, including interviewing
orlandi, for their ( ) volume computation and the humanities. it is pleasing to see the new journal of the italian association of digital humanities, umanistica digitale, online although this seems to be mostly in english.

notes: centro interdisciplinare di ricerche per la computerizzazione dei segni dell’espressione (circse) http://centridiricerca.unicatt.it/circse-home?rdelocaleattr=en.

objects of study

this bias is not restricted to language but also concerns the materials of our study. from the early days of computational scholarship in the humanities, text has always been the fundamental material for study as evidenced by the index thomisticus, through the tlg and many other publications. looking, however, at the research projects at my own dh centre (ucldh), we now see a much greater variation in source material and data. the nyhan and flinn volume mentioned earlier gathers together oral history interviews to push ‘forward the current boundaries of scholarship on the history of dh’ and questions the previous narratives (nyhan and flinn , p. ). there are projects using non-destructive imaging technology to uncover texts otherwise not visible, such as on an egyptian coffin lid and the papyri used as filling; an online camera capturing viewers’ reaction to seeing the auto-icon of jeremy bentham; computer algorithms to assist reassembling fragments of wall paintings; handwriting recognition; open educational resources; user log analysis; and many more. the objects of our research within dh are changing.

even within the area of textual scholarship, digital humanities methodologies have opened up new opportunities to study texts in different ways. examples from two of my phd students illustrate this well.
greta franzini, on the editorial board of umanistica digitale, began her doctoral research to create a digital edition of a medieval latin manuscript, the oldest surviving copy of st. augustine’s de civitate dei, held in the scriptorium at her hometown of verona. this project has morphed into much more and now includes best practice in the field of electronic editing, user studies and requirements, as well as recommendations for the production of digital editions of texts. part of this research has been to create a detailed catalogue of extant digital editions of texts, and this included mapping the institutions involved; the results clearly demonstrate the western-european and us focus on such production (see fig. ).

notes: umanistica digitale https://umanisticadigitale.unibo.it. ucl advanced imaging consultants http://blogs.ucl.ac.uk/dh/ / / /ucl-advanced-imaging-consultants-uclaic-undertake-imaging-projects-on-a-range-of-fascinating-heritage-materials. ucldh research projects http://www.ucl.ac.uk/dh/projects. greta franzini http://www.gretafranzini.com.

[…] the reader will notice a shortage of, for example, asian and arabic editions as we work through those in the catalogue. nevertheless, digital editions appear to be a western phenomenon, led by the united states and the united kingdom, two of the wealthiest and most influential countries in the world, both economically and politically (franzini, mahony and terras , p. ).

in this study, % of the projects are anglo-american; that is out of the total of editions recorded.
moreover, it is necessary to remember that the major associations and portals in the digital humanities field are based in the usa and the uk; as well as this, historically, the major journals published in dh are primarily english language publications. as a result, chinese digital editions and any others in a non-latin script were not included in the catalogue because of the language barrier, which further serves to accentuate any bias (franzini et al. ) (fig. ).

fig. screenshot of the map visualization of editions present in the catalogue of digital editions. note: this is at the time of writing: . (franzini et al. , p. )

another of my phd students, jin gao, is researching the intellectual and social structures of dh with research methodology primarily (so far) based on citation and social network analysis. at the adho dh conference, gao presented her preliminary results showing the clustering of topics based on her co-citation analysis of the major dh journals: computers and the humanities, digital scholarship in the humanities (formerly known as literary and linguistic computing) and digital humanities quarterly; these are all in the english language.

notes: among others, the association for computers and the humanities (ach); alliance of digital humanities organizations (adho); the european association for digital humanities (formerly allc); arts-humanities.net; dhcommons; digital humanities now; humanities, arts, science, and technology advanced collaboratory (hastac) and the humanities and technology camp (thatcamp). jin gao http://www.ucl.ac.uk/dis/people/gao.
at a recent seminar in the department of information studies at ucl, gao presented her latest research which focused on visualizing the dh community through the connections made in twitter; effectively, this represents an analysis of who is connected to whom by their patterns of re-tweeting content on the microblogging platform that is used extensively by members of the dh community. interestingly, this also revealed edge clusters in languages other than english, with the largest being french, followed by german (also including dutch contributors using german rather than their native dutch) and then spanish. the lack of italian is curious given the stature of the late father busa and the longevity of informatica umanistica but perhaps they are tweeting in english or one of the other european languages. it is important to note that these edge clusters are determined by the language of the tweets rather than the home nation of those posting them, although it was also found that these individuals do sometimes post tweets in english as well, presumably when entering into discussion and engaging with english language twitter posts. the main point here is that although the original source material is text, in some form or other (citations in journals or twitter postings), it is being interrogated using visualization methodologies to further understand the field’s intellectual structures rather than through any close reading of the texts themselves. moreover, we can identify other languages being used within a primarily english-speaking medium.

fig. languages of the primary sources presented in the catalogue of digital editions. note that this is at the time of writing: . (franzini et al. , p. )

notes: dh abstracts https://dh .adho.org/program/abstracts.

again, however, we are looking at an anglo-focus here and there are, i am sure, many other dh microblogging discussion groups out there in many different languages and on many different platforms. china, for example, has the ubiquitous
wechat, and the dh groups that i am aware of and a member of are digital humanities group (数字人文 群|dh group ) and digital humanities group (数字人文 群|dh group ), with and members (approximately unique individuals), respectively, at the time of writing, and dh global (for non-chinese speakers with members at this time); doubtless there are many others elsewhere. it is notable also that the names of dh group and dh group were changed to include the latin characters very soon after allowing me to join. nevertheless, despite the clear language issues, from this, we can also see that digital humanities methodologies and techniques are allowing us to ask new and interesting questions of text that were not possible previously without the intervention of computational analysis.

the scope and variety of digital humanities research is, of course, much broader; for example, at ucldh, as noted above, we have projects that make use of digitization, the visualization of materials, text mining, crowdsourcing and many varied methodologies. although the source data are still often in the form of text (the letters of jeremy bentham, the british library corpus of digitised newspapers, census records from the national archives, etc.), we are doing new and innovative research using these materials. in addition, the sources also now include oral histories and cultural heritage artifacts.

moving beyond

just as the objects of dh study are moving beyond a focus on the reading of written text, so too they are now moving beyond their linguistically imposed geographical boundaries. the alliance of digital humanities organizations (adho) conference no longer simply alternates between europe and north america, with dh in sydney and dh to be hosted at the national autonomous university of mexico (unam), mexico city.
adho itself has become more global with membership extending to dh associations in australasia and japan:

● the european association for digital humanities (eadh)
● association for computers and the humanities (ach)
● canadian society for digital humanities/société canadienne des humanités numériques (csdh/schn)
● centernet
● australasian association for digital humanities (aadh)
● japanese association for digital humanities (jadh)
● humanistica, l’association francophone des humanités numériques/digitales (humanistica)

notes: ucldh research projects http://www.ucl.ac.uk/dh/projects. alliance of digital humanities organizations (adho) http://adho.org.

centernet, which describes itself as ‘an international network of digital humanities centers formed for cooperative and collaborative action to benefit digital humanities and allied fields […]’, pulls together dh centres internationally. their map, just like the one above (fig. , franzini, mahony and terras , p. ), also shows a preponderance of western europe and north america with a few outliers (fig. ). this online map from centernet is also very similar to the one recorded in the infographic (fig. ) published by melissa terras, then director of ucldh, to quantify the extent of dh activities globally in (terras ). according to these graphics, the spread of dh centres in the east has changed little in the intervening years; one has gone missing from south korea and one has been added at hong kong (to identify the cause or to see whether this results from a possible error in the data would take further research beyond the scope of this article). in east asia, the centernet map (fig. ) indicates dh centres in tokyo and kyoto in japan, taipei in taiwan, and one in hong kong. what is missing is a connection here with mainland china.
notwithstanding this, saw the first chinese dh forum at peking university (pku), crossing boundaries and engaging communities: digital humanities under global view, with the second in , interaction and coexistence: digital humanities and historical research. at the time of writing, we have just seen the call for papers for the third dh forum at pku circulated, incubation and application: how digital humanities projects cater to academic needs. the first chinese dh centre was established at wuhan in (although apparently not registered with centernet), with another set up this year at nanjing (i have already met with the director of the wuhan centre and hope to visit both centres in as part of my #chinesedh networking activities); dh was one of the topics for the fudan conference, cross-cultural, cross-group and comparative modernity conference, from which this publication derives, and the international symposium on library and digital humanities (isldh) was held at shenzhen in december .

fig. centernet, centres map http://dhcenternet.org/centers (november )

notes: centernet http://dhcenternet.org. centernet: map of the registered centres http://dhcenternet.org/centers.

as these maps demonstrate, there are a number of dh centres listed in centernet clustering around the east asian pacific rim: hong kong, taiwan and japan. perhaps as a result of cross-pacific migration and/or trading links, there seems to be an ever-growing interest in collaborative dh projects in the usa that focus on chinese literature and culture: the china biographical database project and the chinese text project, both based at harvard along with the east asia dh lab.
the harvard/china connection can, of course, be traced back to john king fairbank, the first director of the center for chinese studies based there and subsequently named after him, and his pioneering work on chinese history and culture. a quick search of the web brings up others including the dh asia summit being held at the stanford humanities center.

fig. global dh centres (detail from terras )

notes: the first dh forum at pku http://pkunews.pku.edu.cn/xwzh/ - / /content_ .htm; the second dh forum held at pku http://english.pku.edu.cn/news_events/news/focus/ .htm. china biographical database project (cbdb) https://projects.iq.harvard.edu/cbdb. chinese text project http://ctext.org. east asia dh lab http://guides.library.harvard.edu/c.php?g= &p= .

in london, saw the official opening by president xi jinping, president of the people’s republic of china, of the ucl institute of education, confucius institute, ‘supporting the teaching and learning of mandarin chinese and the study of china across other areas of the curriculum’. at king’s college london, there is the lau china institute for the study of contemporary china. while writing this article, i received a pdf via the ‘digital humanities group ’ wechat group of a new publication in the journal digital scholarship in the humanities, examining text reuse in early chinese literature (sturgeon b). there is clearly much active western research activity in this rich field.

closer to my own interests, i co-organize the digital classicist summer seminar series supported by the institute of classical studies, school of advanced study, at senate house, london.
in our series, we invited donald sturgeon from harvard to present a paper entitled ‘crowdsourcing a digital-library of pre-modern chinese’ with a focus on the chinese text project mentioned above. this was well attended, particularly by colleagues from the british library working on the chinese manuscript collections held there. of more note, the british library is a partner institution for the international dunhuang project: the silk road online (idp). this project pulls together disparate collections of artifacts and manuscripts originally held at dunhuang and now dispersed internationally, as well as those of other heritage sites along the eastern silk roads. this multinational project has stated aims to engage in the conservation of the original documents and artifacts, cataloguing and research, the systematic digitisation of the material to allow access that would not otherwise be possible, as well as the all-important education and outreach to bring this collection to a wider audience. these types of projects stimulate interest in, and so scholarship on, this important cultural area. they in turn acknowledge the huge debt that europe and the west owe to chinese culture, technology, innovation and scholarship; see, for example, the recently screened bbc four documentary, the silk road, and also scholarly print publications such as frankopan ( ). the idp online project publishes in a range of languages (english, chinese, russian, japanese, german, french and korean), at least when the various hosting sites are available. these languages make the content available beyond the immediate confines of either the anglophone world or the mandarin one.

notes: fairbank center for chinese studies http://fairbank.fas.harvard.edu/. dh asia summit http://shc.stanford.edu/events/digital-humanities-asia- -summit.

this is an exemplar for the appropriate and effective dissemination of scholarship: making the research outputs and other material available in a range of languages
makes them accessible to a far wider audience and does not restrict the engagement or outreach to a single language-based audience.

notes: ioe confucius institute http://www.ucl.ac.uk/ioe/departments-centres/centres/ioe-confucius-institute-for-schools. lau china institute https://www.kcl.ac.uk/sspp/departments/lci. sturgeon http://www.digitalclassicist.org/wip/wip - ds.html. idp http://idp.bl.uk. idp activities http://idp.bl.uk/pages/about_activities.a d. bbc the silk road http://www.bbc.co.uk/programmes/p qb .

at ucldh, we have had students working at the british library on the idp and writing dissertations relating to that project, on the importance of the silk roads and the impact that publishing otherwise unobtainable original source material online has and how it benefits research in that area. we also have our own project, bridge to china, which aims to further the understanding of all aspects of the chinese speaking world. ucl, more widely and as a linguistic centre, has been partnering attempts to address the language issues and limitations of internet domain names, with their limited character set, as a partner organization in icann’s development of the new generic top-level domains, particularly with regard to chinese (han), japanese and korean characters.
language initiatives are particularly welcome in the uk as we are notoriously bad at learning languages, despite language learning being compulsory at primary school level, and as english has become such an international language, there is generally not the incentive to do so. lack of language acquisition is not the only potential issue when it comes to lack of linguistic diversity. greg crane points to the problematic nature of the loss of the rich linguistic diversity previously found in the usa (particularly german although with an acknowledged rise in spanish) (crane a).

notes: bridge to china https://wiki.ucl.ac.uk/display/chinese. ‘ucl and soas (the nearby school of oriental and african studies) together form the world’s leading centre of linguistic expertise, teaching and researching more than languages’ http://www.ucl.ac.uk/ah/domain-names/leading-centre. ucl domain names http://www.ucl.ac.uk/ah/domain-names.

widening the possibilities

east asia is not the only linguistically under-represented area in dh. the closing keynote of the adho dh conference held at the university of nebraska-lincoln usa was given by isabel galina russell, an honorary research fellow at ucldh where she completed her phd, now working at the institute for bibliographic studies at unam, mexico. in her keynote, ‘is there anybody out there? building a global digital humanities community’, galina raises many of the questions touched on above (galina ). galina asks, ‘who are we?’ in digital humanities and we both share a similar view: dh is a community (more on that below). however, there are problems here which she articulates well:

one of the things that characterizes dh i think is that the community has worked very hard towards building the dh community. and most of this work has come from enthusiastic and generous scholars who have given much of their time to developing it. […] this community has traditionally viewed itself, as with the conference, as welcoming and open. collaboration and
uk national curriculum https://www.gov.uk/government/publications/national-curriculum-in- england-languages-progammes-of-study. s. mahony https://wiki.ucl.ac.uk/display/chinese http://www.ucl.ac.uk/ah/domain-names/leading-centre http://www.ucl.ac.uk/ah/domain-names/leading-centre http://www.ucl.ac.uk/ah/domain-names https://www.gov.uk/government/publications/national-curriculum-in-england-languages-progammes-of-study https://www.gov.uk/government/publications/national-curriculum-in-england-languages-progammes-of-study cooperation are seen as specific traits of dh that differentiate it from the more “lone-scholar” traditional humanist. it seems to be that openness and a desire to work with others is fundamental to the way we think of ourselves. and yet, over the past few years this community has become aware that this isn’t so open, universal as it thought it was (galina ). the field of digital humanities has arguably been built on openness and a sense of community but has historically excluded much of the world by its anglophone preponderance and focus on text-based scholarship. this, as galina says, has been pointed out over time but has only more recently ‘become more of a mainstream discussion’ (galina ). it is argued in this article that perhaps a self-conscious anxiety over the value and importance of the field (if it can indeed be called that) of digital humanities has got in the way of our reflection on what it is that we do and why we do it. as an educator, this is a task that we often give our students in their assignments and certainly something that we (in the uk) write into module and programme proposals; reflective practice is, or should be, part of the training of our students (the next generation of practitioners) and yet, under all the pressures of academia, we perhaps find little time for that ourselves. experience show that this is not limited to dh but often the case across the wide range of disciplines found in academia. 
mexican dh has an established organization, red de humanidades digitales, with its material published in spanish. spanish is also a european language, and although there are variations specific to mexico, it is still a european romance (indo-european) language that has grown from vulgar latin, the common and spoken language (as opposed to the classical latin of literature and poetry) of much of europe during and after the time of the roman empire and throughout the mediterranean geographic area. it has a common root and heritage with many other european languages but because of the predominance of the english language, particularly in international journals, publication in english is needed to ensure dissemination of and engagement with knowledge production. this is also true of spanish universities and their doctoral degrees. my own experience of presenting at events in mexico has involved them supplying live translators, for my benefit rather than anyone else’s, as i have often been the only non-mexican presenter. my attempts at pronunciation prompted polite amusement from my hosts as i attempted to apply my best anglo-andalusian accent to mexican terms, names and titles. for guest talks and workshops delivered in china, i have a translator, and the main text on any content slides is translated into mandarin by my chinese students. where possible, i just use images in slides as these need no translation and the language is international. at my home institution, in accordance with our regulations, my teaching slides are made available for the students prior to classes; this was originally to help students with reading difficulties (e.g. dyslexia) but is now more for the non-native speakers in my classes, who make up the majority, to allow them to investigate unfamiliar words and technical terms beforehand. as a lecturer, i quickly learned to remove all traces of sarcasm and irony from teaching material as feedback indicated that some students had thought that examples being held up for ridicule were there as exemplars of good or best practice. i had previously trained as a language teacher to adults and so routinely avoided jargon and colloquial expressions or culturally specific examples to illustrate points.
red de humanidades digitales http://www.humanidadesdigitales.net. for example, at the university of guadalajara (mahony ).
the dominance of the english language is, as above, a barrier to inclusivity for all who do not read or speak english. in dh, we now have go::dh (global outlook::digital humanities), which is a special interest group (sig) of adho with a purpose, to help break down barriers that hinder communication and collaboration among researchers and students of the digital arts, humanities, and cultural heritage sectors in high, mid, and low income economies […] its core activities are discovery, community-building, research, and advocacy. adho itself also has the multi-lingualism and multi-culturalism committee (mlmc), which is there to consult with its steering committee over questions of multiculturalism. the wide range of nationalities within this group can be seen from the list of representatives on their web page along with their protocols and policy documents for linguistic and cultural inclusion. in europe, the eadh has an executive composed of members from a great number of countries, although i understand that english is generally spoken as the most common language. they have associate and partner organizations with dhd (german speaking), aiucd (italian), dhn (nordic countries) and cadh (czech). there is indeed linguistic diversity and multicultural inclusion, although in a european and predominantly english-speaking sphere. these issues are not confined to dh but there seems to be justifiable concern in a field where there is a drive for inclusiveness rather than exclusion.
this is arguably one of the reasons why we struggle ‘not’ to define the field of dh, which has been a distinct policy at ucldh. once you give definitions, you are erecting barriers and fences; if you define what it is that explicitly constitutes dh, you are also saying what is not dh, and so you are excluding people. we prefer to think of ourselves as self-defining. we are part of the dh community because we think and say that we are; in other words, we self-identify with the field. we also see ourselves as a ‘community’ and self-consciously describe our field as such (mahony ). language is indeed a barrier to inclusion and inclusiveness, and we must endeavour to address this and other issues such as an apparent lack of reflection on what it is that we do and why we do it. galina suggests some simple ways to make things more accessible to non-english speakers, such as having more translations of published research outputs and project reports, with the resulting costs being built into funding proposals (galina ). having more international conferences, such as the one in fudan, and the movement of the major conferences outside of the anglophone zone would go some way to addressing this. without doing these things, we are restricting our participants and our audience, and so limiting the global reach of our dh community. there are, of course, also issues of connectivity and computing infrastructure which hamper our wider connections.
go::dh http://www.globaloutlookdh.org. mlmc https://adho.org/administration/multi-lingualism-multi-culturalism. dhd http://dig-hum.de. aiucd http://www.aiucd.it. dhn http://dig-hum-nord.eu. cadh http://czdhi.ff.cuni.cz/en/about.
these are not discussed in this article, other than to note that there are great regional differences within the uk itself without having to cross any political borders. the so-called digital divide has not yet been resolved here in a technologically developed nation. in china, where there is a much faster growth in internet penetration, the gap appears to be closing with the spread of phone networks (the most popular way to connect to the internet in china) (statista ), although there is still a clear gap between urban and rural areas which the government is planning to address. in the uk, the drive is still for high-speed fibre optic cables, although full coverage in even such a small country is still limited with a similar gap between urban and rural areas, with the greatest coverage in the more populated and affluent areas.
conclusion
returning to the main topic of cultural diversity and the digital humanities, as shown by the examples above, we are now seeing within dh an ever-growing interest in exploring culture and heritage more widely. geographic inclusion, however, does not necessarily equate to scholarly inclusion, particularly if language is a barrier to that inclusiveness. this is an issue for the principal dh journals, other dh publications, and calls for conference papers within the global dh sphere. for the adho international dh conference in , we have the call for papers available and circulated in a variety of languages (german, english, spanish, italian and portuguese) as well as the conference itself being bilingual.
the conference will be officially bilingual in spanish and english, so we invite proposals for presentations particularly in these languages, as well as in the other languages for which we have a sufficient pool of peer reviewers (german, italian, french and portuguese, the latter an important language community of our host region).
this is pleasing to see and not too unexpected as it will be held in mexico city at unam with red de humanidades digitales as one of the hosting organizations and galina russell as one of the local conference organizers. another important issue that cannot be overlooked is that of institutional support; a digital humanities centre requires this support along with a coherent management policy. it can also find itself being constantly asked to justify its existence and whether or not it represents ‘good value’ for the resources (staff time and funding) that it receives. within rigid university structures, it is easy to say that interdisciplinary practice is encouraged but, as we all find out to our cost, it is often difficult to achieve. beyond the institutional barriers imposed by schools and faculties, there are also the disciplinary ones, which bring with them their own logistical, practical and inter-personal difficulties. these often revolve around the management of projects, funding opportunities, recognition for the work done and publishing venues (terras ). in the introduction to debates in the digital humanities, matthew gold also expresses this but in a different way, describing dh as ‘a field in the midst of growing pains as its adherents expand from a small circle of like-minded scholars to a more heterogeneous set of practitioners who sometimes ask more disruptive questions’ (gold ). this ‘disruption’ is another potential cause of tension when entering into collaborations with more established disciplinary areas.
china daily http://usa.chinadaily.com.cn/china/ - / /content_ .htm. gov.uk https://www.gov.uk/guidance/broadband-delivery-uk. dh call for papers (cfp) https://dh .adho.org/en/cfp.
institutional support is needed to facilitate interdisciplinary working, to provide support for dh centres and their activities, and to allow practitioners to engage in international projects and events; budgets for travel, conference attendance, publication, and now also for translation must be part of long-term institutional strategic vision. as an academic field, dh has come a long way in a relatively short time but we still have much to do to achieve the openness, sense of community and inclusiveness that we aspire to. we need to have more conversations with dh groups beyond the anglophone sphere; the conference in fudan, which prompted this article, and that in shenzhen (mentioned above) have provided the opportunity for several such conversations. chinese dh centres and research groups in libraries such as pku and shanghai offer a welcome and hospitality to western visitors. the challenge is for us in the western anglophone sphere to be equally welcoming and willing to engage with researchers and practitioners outside of our echo-chamber and to reach out more widely. otherwise, we are destined to meet, greet and discuss our topics of interest and research only with those that we already know.
coda
just as the artefacts we produce are the results of cultural influences, so too are our writings, our cognitive processes, and how we view and understand the world around us. this article has also drawn on my own very limited research into cross-cultural teaching, examining some of the issues that become apparent when working across disciplinary and ethnic boundaries; it also considers some of the growing number of collaborative dh projects that focus on chinese literature and cultural heritage. restricting our cultural perspective is restricting our field; we all learn from each other and inclusion benefits us all. without this, it is those english speakers who have no other language that stand to lose the most.
greg crane expresses this well:
now, english has emerged as a de facto lingua franca – with of those of us who grew up speaking english losing the most, insofar as the widespread use of english makes it easy for us to ignore the importance of language and to avoid the challenge of mastering languages other than our own. no one would benefit more from a commitment to linguistic diversity than speakers of english (crane b).
open access
this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made.
references
bodard and mahony eds. . ‘“though much is taken, much abides”: recovering antiquity through innovative digital methodologies’ digital classicist special issue, digital medievalist .
crane, gregory. a. ‘resisting a monocultural (digital) humanities’ https://docs.google.com/document/d/ novwutubpa yn uibjvz_tscavzs exrsjrzisn -vq. accessed jan .
crane, gregory. b. ‘the big humanities, national identity and the digital humanities in germany’. http://www.dh.uni-leipzig.de/wo/the-big-humanities-national-identity-and-the-digital-humanities-in-germany. accessed jan .
de selincourt, aubrey. . (trans) herodotus the histories. penguin.
fiormonte, domenico. . ‘towards a cultural critique of the digital humanities’, historical social research, vol. no. , – . reprinted in: debates in the digital humanities , eds. matthew gold and lauren klein, university of minnesota press.
fiormonte, domenico. . ‘towards monoculture (digital) humanities?’ infolet: cultura e critica dei media digitali : . https://infolet.it/ / / /monocultural-humanities. accessed jan .
frankopan, peter. . the silk roads: a new history of the world.
london: bloomsbury publishing.
franzini, greta, simon mahony, and melissa terras. . ‘a catalogue of digital editions.’ in digital scholarly editing: theories and practices, eds. elena pierazzo and matthew driscoll, – . open book publishers.
galina russell, isabel. . ‘is there anybody out there? building a global digital humanities community’, humanidades digitales http://humanidadesdigitales.net/blog/ / / /is-there-anybody-out-there-building-a-global-digital-humanities-community. accessed jan .
gold, matthew. . ‘introduction: the digital humanities moment’. in debates in the digital humanities, ed. gold, matthew, ix–xvi. university of minnesota press.
mahony, simon. . university of guadalajara ‘reflections on knowledge production within the framework of uk academic institutions’. http://www.udg.mx/es/noticia/redes-sociales-generan-nuevas-dimensiones-de-ensenanza. accessed jan .
mahony, simon. . ‘the digital classicist: building a digital humanities community’, digital humanities quarterly : . http://www.digitalhumanities.org/dhq/vol/ / / / .html. accessed jan .
meester, wim. . ‘towards a comprehensive citation index for the arts & humanities’ research trends . https://www.researchtrends.com/issue- -march- /towards-a-comprehensive-citation-index-for-the-arts-humanities. accessed jan .
nyhan, julianne and flinn, andrew. . computation and the humanities: towards an oral history of digital humanities, springeropen.
rockwell, geoffrey. . ‘an alternate beginning to humanities computing?’ theoretica.ca.
statista ‘mobile phone internet user penetration in china from to ’ https://www.statista.com/statistics/ /china-mobile-phone-internet-user-penetration. accessed jan .
sturgeon, donald. a. ‘crowdsourcing a digital library of pre-modern chinese’ http://www.digitalclassicist.org/wip/wip - ds.html. accessed jan .
sturgeon, donald. b. unsupervised identification of text reuse in early chinese literature. digital scholarship in the humanities. https://doi.org/ . /llc/fqx .
terras, melissa. . ‘the digital classicist: disciplinary focus and interdisciplinary vision’. in digital research in the study of classical antiquity, eds. gabriel bodard and simon mahony, – . ashgate.
terras, melissa. . ‘quantifying digital humanities’. infographic. ucl centre for digital humanities. http://www.flickr.com/photos/ucldh/ . accessed jan .
simon mahony is director of the ucl centre for digital humanities and principal teaching fellow in digital humanities at the department of information studies, university college london (ucl). his research interests are in the application of new technologies to the study of the ancient world, using new web-based mechanisms and digital resources to build and sustain learning communities, collaborative and innovative working. he is a member of the ucl student recruitment interest group and recipient of support from ucl’s global engagement funding; chair of the new ucl open education special interest group and on the project management team and a member of the project board for the ucl open educational resources (oer) repository. he is also active in the field of distance learning and is a member of the university of london’s centre for distance education with an interest in the development of educational practice and the use of new tools to facilitate this. in addition, he is an associate fellow at the institute of classical studies (school of advanced study, university of london) and one of the founding editors of the digital classicist.
ade bulletin ◆ adfl bulletin . © vialla hartfield-méndez and karen stolley crossref dois . /ade. . and . /adfl. . .
vialla hartfield-méndez is professor of pedagogy in the department of spanish and portuguese at emory university. karen stolley is professor of spanish in the department of spanish and portuguese at emory university.
preface
on september , just after we completed what we thought were the final revisions of this article, the faculty of emory’s college of arts and sciences gathered for the first meeting of the – academic year, at which the dean of the college, robin forman, announced that numerous cuts and other measures were to be announced in the coming days. two days later forman released a letter outlining some of the changes: the elimination of several entire programs or departments and the suspension of two other doctoral programs. these decisions had implications for both the undergraduate and graduate schools and occasioned significant faculty and student consternation and resistance. as a result, faculty governance structures are under intense scrutiny, and there is still considerable uncertainty and tension in the college, in the laney graduate school, and extending into the rest of the university. many important questions have arisen: going forward, how will the liberal arts be defined in institutions of higher education like emory university? what role will the humanities play in that definition? what are the rights and responsibilities of faculty members in major decision-making processes? how viable are current structures of faculty governance? central to the focus of this article is whether the gains made over the last twenty years for lecture-track faculty members at emory and, more importantly, their synergistic relationship with tenure-track faculty members are undermined by the decisions announced in fall . forman has said both privately and publicly that he is committed to “the idea and the reality” of lecture-track faculty at emory. nevertheless, several lecture-track faculty members will lose their jobs at the end of their current contracts, and in general the lecture-track faculty members are feeling very unsettled, even betrayed. the assimilation of this new reality has consumed an inordinate amount of faculty energy across the ranks.
in this ongoing process, the governance, representation, appointment, and promotion structures for lecture-track faculty members in emory’s college of arts and sciences that have been put in place over the last fifteen years are now part of the scaffolding needed to address these questions. this article, originally written in the spring semester to describe the emergence of these structures and then revised during the summer before these events began to unfold, was initially framed as a somewhat exemplary tale; now it must be read, at least in part, as a somewhat cautionary one. events are still unfolding, and so time will tell which characterization ultimately best fits the narrative.
“regular faculty” and citizen participation: reframing the narrative of higher education
vialla hartfield-méndez and karen stolley
introduction
we find ourselves at a “crucible moment” for higher education in the united states. the imperative to reimagine our mission and our praxis has become increasingly urgent in the context of shrinking resources, seismic shifts in student populations, a changing faculty profile, and debates about how to organize and evaluate teaching, learning and scholarship for the twenty-first century. these debates often center on the role that education plays in building civic culture, and a recent report urges us to “embrace civic learning and democratic engagement as an undisputed educational priority for all of higher education” (crucible moment ). but if higher education is to embrace that mission, we faculty members must first consider our own understanding of civic engagement and democratic process within the colleges and universities in which we teach and research. in other words, it is impossible to “do” democracy in the curriculum if it is not practiced in the professoriat.
the challenges facing higher education were in evidence in january when the new faculty majority, an advocacy group for adjunct and contingent faculty members, who make up “over million of the . million people teaching in american colleges and universities” and, according to gary rhoades, almost percent of all faculty appointments (bérubé), hosted a national summit in washington, dc. the summit provided an opportunity for participants to share information and discuss strategies for addressing systemic inequities in how faculty labor is viewed and compensated (schmidt, “summit”). current discussions about the professoriat, including those at the new faculty majority summit, often take as their starting point a stark binary between two unequal and nonoverlapping faculty subsets: on one side, tenured or tenure-track faculty members who enjoy job stability, full participation in governance, decent salaries, and a range of perks related to their working conditions; on the other side, non-tenure-track colleagues in adjunct or contingent positions that are precarious and poorly compensated and who, for the most part, are without access to the basic workings of faculty governance. growing attention to this binary has led to important and long-overdue efforts to address the inequities in working conditions for faculty members off the tenure track, efforts that are described by contributors to this special issue and that must continue. ongoing debates about working off the tenure track, the crisis in higher education (and, especially, in the humanities and in foreign language departments), and the role of language and literature studies in higher education converge to create either a perfect storm or the proverbial teachable moment. as members of the mla who teach and study language, rhetoric, discourse, and textual analysis, we can probably agree that language is powerful, grammar provides scaffolding for thought, and terminology can influence attitudes and action.
perhaps the most egregious example of language that impedes our ability to work together is the frequency with which one still hears tenured colleagues speak of “real faculty” as a way of differentiating tenured or tenure-track faculty members from those off the tenure track. the terms adjunct and contingent carry implications of structural and intellectual subordination that resonate even in the most enlightened efforts to address issues of economic and contractual parity. the mla’s professional employment practices for non-tenure-track faculty members: recommendations and evaluative questions, for example, is implicitly predicated on the limitations and legitimacy of a two-tiered system for faculty, as are expressions of concern that increasing numbers of non-tenure-track faculty members weaken the larger faculty body, compromise faculty governance, and pose a threat to our academic mission. yet we address here another dimension that is often overlooked: the assumptions embedded in our thinking about who does what work in our colleges and universities and about how that work is valued. without minimizing the very real inequities of pay and working conditions that affect a significant number of our colleagues, we want to suggest that approaching the issues of non-tenure-track faculty members solely as a matter of redressing these inequities is too narrow a response. a recent study shows it may even be less effective. researchers at the university of southern california found that “adjuncts had made the most progress at colleges where they tried to transform the campus climate to be more inclusive of them, rather than simply fighting to change one employer practice at a time” (schmidt, “when adjuncts”).
how might we reimagine the different components of the professoriat as something other than the result of a mere accident of hiring circumstances or, worse, as the justifiable enshrinement of long-cherished hierarchies that are increasingly called into question? what if instead we were to imagine a larger body of “regular faculty” composed in an intentional way of tenure-track and non-tenure-track faculty members (think of a venn diagram with two overlapping and complementary subsets)? in this new paradigm, we are not arguing for the abolition of tenure; we are arguing for a partnership, a different way of thinking that will permit us to reimagine a faculty as a community of college or university citizens with collective rights and responsibilities who must function as a whole to meet the challenges of undergraduate and graduate education in the twenty-first century. we use emory university as a case study, focusing on emory’s college of arts and sciences and the department of spanish and portuguese. we understand our experience as “radical incrementalism,” an ongoing process in which gains, as they are made, reveal the need for continued change and adaptation. the process is not so much reformist as evolutionary and emerges from a particular institutional context, one which (as noted in the preface) has changed over the past year at emory. it may not, therefore, prove a compelling model for every case or persuasive to those seeking more immediate change. but it has made it possible for us at emory, at least within the college of arts and sciences, to develop alliances across faculty lines to work together as “regular faculty” to reframe our narrative. although this process takes place in a particular institution, we suggest that reframing the narrative of faculty roles in higher education in terms of citizenship participation is both possible and necessary in the profession at large.
lecture-track faculty in emory college: a narrative in progress
emory university in the early s was similar to many american universities and colleges, exhibiting significant growth in the number of faculty members off a tenure path without the institution’s demonstrating clear goals or strategic planning. and as in many other research universities, emory had and still has disparate appointment patterns and policies across its units. according to the latest diversity report, the total number of full-time faculty members (including tenure-track and non-tenure-track) increased by approximately percent from to , and the total number of non-tenure-track faculty members increased by almost percent (diversity profile – ). that is, in approximately percent of the total faculty at emory was not on the tenure track, and by that percentage had increased to almost percent. here we focus on the development of the lecture track in emory’s college of arts and sciences, which is related to but quite distinct from other units. terminology emerged as critical in reform efforts at emory. recent discussion in the academy about the role of faculty members described as “adjunct” or “contingent” does not capture the circumstances of many with multiyear contracts, established positions within departments, and increasingly consequential positions beyond their departments. for the last ten years at emory’s college of arts and sciences, non-tenure-track faculty members have not necessarily been employed in contingent positions. early in the process of regularizing this track, we clearly delineated the lecture track from adjunct and visiting appointments and began working on defining rights and responsibilities within that track, while working to avoid adjunct or contingent situations whenever possible.
changes in the college policy have been gradual and are driven by collaboration across tenure-track/non-tenure-track lines and with administrators, reflecting a broad commitment to university citizenship. of paramount importance is the role that non-tenure-track faculty members played in pushing for changes. beginning in the late s and early s, several pioneering faculty members in non-tenure-track appointments from across the disciplines began meeting and laying the foundations for governance structures that eventually created new avenues for participation in decision making. at the time, many non-tenure-track faculty members were, according to their one-year contracts, “visiting faculty.” from this tenuous position several leaders began organizing meetings of the non-tenure-track faculty, which in time became the lecture-track faculty group. as this group coalesced, several things became clear: a significant number of faculty members had previously been, for the most part, invisible, both to one another and to the faculty members in the tenure track; there were striking inconsistencies in the conditions of appointment and employment across the college; coming together to raise these issues (an act of university citizenship) made it possible to address them; and there were sympathetic interlocutors among members of the tenured and tenure-track faculty and the administration.

the lack of any clear and consistent policy regarding non-tenure-track appointments and employment was one of the first issues addressed by this group. in , the college created a policy on the appointment and review of lecturers and senior lecturers, ultimately approved by the provost and the board of trustees.
an important step, this policy regularized the appointments of all those “visiting” faculty members as lecturers or senior lecturers, created multiyear contracts and a promotion mechanism, and established lecturers as regular members of the faculty with access to many of the benefits accorded those on the tenure track (e.g., office space, retirement programs, professional development funds). this new policy stabilized the more than ninety faculty members on the lecture track and affirmed their full investiture in the faculty body. with stability, it became easier for lecturers to play key roles within their departments and in the college and university.

concurrent with this policy-revision process was a series of publications and campus conversations regarding important issues facing the university. beginning with a report from the then provost, billy frye, and others titled choices and responsibility: shaping emory’s future, the university community was challenged to seriously consider five issues facing higher education and emory in particular: the balance between teaching and research; the need for a stronger sense of campus community; the reordering of university resources and processes to encourage interdisciplinarity; the balance of infrastructure needs with resources; and the university’s civic responsibility, beyond the campus. following this “call to action,” frye and william chace, the president at the time, established the commission on teaching, which issued its report in with an introduction that strongly emphasized the importance of teaching:

we want to get beyond the notion that excellence in research must preclude excellence in teaching and that universities cannot support, evaluate, and reward teaching and research in equivalent ways. . . .
an equivalent commitment to research and teaching does not mean a quantifiable measure from every program nor an equal portion of each for each faculty member at all points in his or her career. it means that we want the culture and structures necessary to ensure an institution in which both teaching and research flourish. (teaching)

the report did not address the question of tenure-track versus non-tenure-track faculty members (in fact, its unacknowledged assumption is that the faculty members in question are on the tenure track). nevertheless, this important document opened up the space to talk about the value of teaching. since lecture-track faculty members were most closely associated with teaching, it became possible to present the value of their work as equivalent to the value of the work of a faculty member whose main focus is research. it also opened up the possibility of conceiving of lecture-track faculty members as pedagogical leaders on campus. and since the deliberations of the commission on teaching were informed by ernest l. boyer’s scholarship reconsidered: priorities of the professoriate, among other documents, there was increasingly space in campus conversations to understand scholarship more broadly and to value the scholarship of teaching.

the distribution of the new policy to department and program chairs in the college in signaled important progress, but it also pointed to work ahead. an article about lecture-track faculty published later in the campus faculty magazine noted, “departments sometimes do not understand the potential contribution of these faculty members, and the resulting lack of communication, professional development, and collaboration represents a lost opportunity” (hartfield-méndez, marsteller, and patterson ). by the spring of , under the leadership of a new president, james wagner, and a new college dean, robert paul, emory was well into a multiyear strategic planning process.
there was growing interest among lecture-track faculty members in their role in this new environment, especially since the strategic plan would profoundly influence the allocation of resources. in a meeting in march with the lecture-track faculty group, paul drew a direct line from the ideas in teaching at emory to the strategic plan, emphasizing that the role of teaching would be central to the plan and asserting that emory should strive to be one of the best undergraduate teaching institutions among top-tier research universities. he projected that lecture-track faculty members would continue to constitute about percent of the faculty in the college and recognized the need to find ways to acknowledge and better reward their work. several changes that ultimately were approved were discussed formally for the first time in this meeting. in the discussion of how to bridge the gap between teaching and research, even within the lecture track, bobbi patterson eloquently advanced the idea that the “bridge between scholarship and teaching has to be constructed conceptually and might include the idea of pedagogical scholarship and design, and that this could become part of the criteria for evaluation of [lecture-track faculty]” (qtd. in kelley). in addition, the participants in the meeting contemplated the creation of a third tier in the lecture track and a college standing committee for promotion and evaluation of lecture-track faculty members.

in fall , paul appointed a task force wisely made up of a mix of tenured faculty members, administrators, and senior lecturers, charged with making recommendations regarding the lecture track and empowered to think boldly.
emory’s affirmation in its mission statement that it is an “inquiry-driven, ethically engaged” institution also helped frame the task force’s discussions on how to forge productive and ethical faculty relationships. four working principles informed the group’s deliberations:

1. emory college [of arts and sciences] has a strong group of regular faculty of which there are two subsets, namely tenure-track faculty (ttf) and lecture-track faculty (ltf), and these are distinct from faculty on temporary appointments. these subsets are full partners in forwarding the vision of emory as an institution that combines the opportunities of a tier-one research university with a small liberal arts college experience, which makes possible the inquiry-driven, ethically responsible practice of engaged citizenship to which we aspire for ourselves and our students. the synergy of including faculty of both subsets permits attainment of the vision of the college and the university.

2. emory college can and should lead its peer institutions on the issue of how best to integrate regular faculty who are, by both individual and institutional choice, in positions that offer no possibility of tenure. although ltf experience less pressure to conduct research and publish findings in top venues, they are clearly in positions that indicate a long-term relationship with the university, strongly supporting the teaching aspect of the university’s mission.

3. emory college places value on the complementary relationship between teaching and scholarly activity. lecture-track faculty can and do play an important role in defining that relationship, putting into practice the broadened concepts of scholarship advanced by ernest l. boyer . . . and others. these concepts are referenced in the report on teaching at emory, which expressed the clear aspiration to “an emory in which there is a balance between teaching and research” but without demanding that every faculty member maintain that balance all the time. the task force acknowledges the important role of lecturers in teaching, and also the integration of scholarly activities that many bring to that role.

4. any system of evaluation of ltf should be predicated on the ltf having value and voice. in emory college, ltf are and should be highly valued, with full rights and responsibilities in faculty governance. (task force )

to fulfill its charge, the task force required an inventory of the lecture-track faculty, which was completed using various sources of information including a survey of the lecture-track faculty members and college records. an important survey question regarded the roles occupied by lecture-track faculty members at the departmental, college, and university levels. at that point (and even more so today), they taught at all levels, from introductory undergraduate courses to graduate seminars, and were actively engaged with students as advisers to student organizations and in directing honors theses, coordinating and training graduate teaching assistants, and serving on dissertation committees. the task force noted the significant number of teaching awards among lecture-track faculty members. they served then, as they still do, in key administrative roles at the department and program levels and were consistently representatives on the governance committee and other important college committees. a few had become leaders at the university level, especially in the areas of community engagement and sustainability and in issues of race and difference.
several had played large roles in emory’s strategic-planning process.

the results of this inventory were revealing, even surprising. first, it was surprising to several members of the task force (and later to many members of the faculty at large, when the academic exchange article about the task force’s recommendations was published [hartfield-méndez, marsteller, and patterson]) that lecture-track faculty members’ level of commitment and activity was so high, given that regularization of the lecture track had occurred only a decade earlier. second, it was revealing that this high level of integration for some in the life of departments and the college did not extend to all. many faculty members in lecture-track appointments did not see clear paths for themselves to go beyond teaching introductory courses, were excluded from departmental governance, and were discouraged from seeking expanded roles for themselves outside their departments.

in addition to its internal inventory, the task force gathered information from other institutions in search of best practices and comparison points. after considering all this information, the task force came to consensus on its recommendations, which were grounded in the key statement from the report teaching, which called for a revised view of the value of teaching in relation to research. finding that the work of lecture-track faculty members, while not excluding research, is organized around the teaching mission of the university, the task force explicitly connected the value that the institution places on teaching and the value of the work of those on the lecture track. this connection was seen as one of several avenues open to the university to act on the earlier mandate to value its teaching mission. the recommendations included the creation of the following:

1. a third tier in the lecture track, promotion to which would “link teaching and scholarship through new pedagogies in and across disciplines and between the university and the community, all hallmarks of excellent teaching and research” (hartfield-méndez, marsteller, and patterson ), thus providing a clear path for advancement and aspiration while also acknowledging excellent work
2. clearer and more consistent policies for hiring and contracts at each level in the lecture track that were more uniform across the college, making three-year contracts available to lecturers with the possibility of promotion to senior lecturer after two terms and longer-term contracts at the senior lecturer and third-tier levels
3. detailed procedures for evaluation and promotion
4. a national search for all hires, with the possibility of hiring at any level
5. more equitable compensation policies
6. regular professional leaves

institutional wheels driven by faculty governance tend to move slowly, but in the grand scheme of things, the college and university acted fairly quickly to implement many of these recommendations. a new policy was written, the college bylaws were changed after debate and a faculty vote, and a standing committee for evaluation and promotion of lecture-track faculty members was created. perhaps the most important shift was that the majority of the faculty members approved the notion of regular faculty with two subsets, from which the rest of the changes flowed logically. just as important was the alliance between members of the tenure-track and lecture-track faculties. a critical group of tenure-track faculty members worked tirelessly alongside their lecture-track colleagues, leveraging crucial and complementary institutional knowledge and experience.
the associate dean of faculty guided the discussions. of particular importance was the participation of a cochair of the commission on teaching from a decade earlier. several faculty members in the lecture track had long-established relationships of mutual respect with faculty members on the tenure track. the task force’s recommendations and the resulting changes would not have occurred without this alliance. at the same time, it was essential that members of the lecture-track faculty step up to leadership roles, exhibit a breadth and diversity of involvement and leadership, and begin to see themselves as active citizens of the university, assuming commensurate rights and responsibilities.

lecture-track faculty members can now present themselves for promotion to professor of pedagogy, performance, or practice, depending on the emphasis of their work. the creation of this third tier has been an important mechanism for recognizing and rewarding lecture-track faculty members whose scholarly accomplishments are visible throughout and beyond the university. the significance of such a promotion goes beyond simply acknowledging excellent teaching; in fact, the new promotion policy explicitly requires evidence of excellent teaching, service, and contributions to their respective fields, especially as related to teaching, beyond emory. scholarship, defined broadly, is an essential piece of the portfolio that must be submitted for promotion. evidence of leadership on campus but also beyond the campus in the area of teaching innovation and in the scholarship of teaching has been important in the consideration of candidates for promotion. traditional scholarship in their respective fields is also considered valuable, particularly when the candidate demonstrates linkages or complementarity between such scholarship and his or her teaching. consequently, several members of the lecture-track faculty are now acknowledged campus leaders in questions of pedagogy, enhancing the enterprise of teaching and claiming leadership in an arena that the university has set forth as critical to its mission.

lecture-track faculty members serve on the standing promotion committee, and now that three cohorts of professors in the lecture track have been promoted, they also serve on committees for promotion to the third tier. contracts for lecturers are for three years, with the possibility (but not requirement) of promotion to senior lecturer after two terms. lecturers who do not present themselves for promotion are not penalized with the loss of their jobs; they may continue to serve as lecturers, depending on departmental needs. the term for senior lecturers is five years; senior lecturers may present themselves for promotion to professor after one term but are not required to do so. promotion to professor is viewed as a special distinction, not necessarily appropriate in every case. clear procedures are in place for evaluation and promotion, and national searches are required for new hires (“appointment”).

among the lecture-track faculty, there has been significant advancement in terms of rank (with a growing cohort of professors of pedagogy, practice, or performance) and aspirations. the promotion process makes visible the accomplishments of the lecture-track faculty and the multiple paths to professional development. the presence of lecture-track faculty members in administrative leadership positions at the highest levels makes their voices more audible and their advancement more evident.
it has sometimes been the case that visibility at those levels opened a path for reimagining the role of lecture-track faculty members within departments and across the university. accelerated possibilities for professional development have also resulted, particularly in emerging areas of institutional investment such as sustainability, engaged scholarship, and digital humanities. institutional structures have morphed to accommodate the new faculty reality. the emory college language center has become a space for empowerment, collaboration, and coalition building among all lecture-track faculty, not just among those teaching in language and literature departments. for example, in november the center hosted a panel discussion for all lecture-track faculty members on the lecture-track faculty promotion process.

still left unaddressed are the issues of equitable compensation and regular professional leaves. since , lecture-track faculty members can apply for a semester leave through the competitive winship award for senior lecturers. this was a major institutional advancement in the support of lecture-track faculty members, but only two awards are offered annually, making it unlikely that most eligible members will be granted a leave within the foreseeable future. lecture-track salaries are significantly lower than those for faculty members with comparable tenure-track status and seniority. and although there is a merit increase in salary at the point of promotion to senior lecturer and to professor of pedagogy, practice, or performance, these increases do not match those given to tenure-track faculty members at points of promotion. thus the current salary structure amounts to an institutionalization of inequity and undervaluing of the teaching mission of the institution.
furthermore, as recent events have signaled, the four core principles that undergirded the work of the task force, as well as the gains that were made as a result, have been undercut. since the cuts, suspensions, and reorganization were announced, questions about the status and role of lecture-track faculty members and their relationship with the tenured and tenure-track faculty members have resurfaced; in other words, the notion of “regular faculty” is frayed. and in the communications about the decision-making process regarding which programs would be eliminated or suspended and how, there was a reversion to previous conceptualizations of the faculty on the basis of contract definitions rather than contributions to the shared missions of the college and graduate school. fortunately, the structures and practices now in place (a strong executive committee of the lecture-track faculty group with open lines of communication to the dean and senior associate dean for faculty; representation by lecture-track faculty members on the college’s governance committee, in the faculty senate and in other committees; habits of shared responsibility, mutual respect, and sharing of information among lecture-track and tenure-track faculty members) are proving to be useful avenues for addressing these questions.

the department of spanish and portuguese: a case study in implementation

policies are only guideposts; change in institutional culture is where real transformation occurs. once new policies regarding appointment and review of lecture-track faculty members and the college’s guidelines for promotion to professor of pedagogy, practice, or performance were in place, an important educational process began for faculty members.
lecture-track faculty members, tenure-track faculty members, and administrators all had to understand the details of the new landscape and then work together to create appropriate pathways for constructing a new reality. for the new structure to become real, changes in the college bylaws were necessary, which required majority votes by the entire faculty. bylaws changes occurred on several occasions throughout the restructuring process, providing faculty members with opportunities for debate and ultimately for demonstrations of broad support. it is fair to say that the college of arts and sciences, the university as a whole, and especially individual departments are still digesting these changes. the regularization and thoughtful reconfiguration of the lecture track has opened up further questions but has also created a pathway for all members of the faculty to grapple with them.

arguably the last frontier for implementation is at the departmental and program level. best practices of faculty governance often break down at this level, for many reasons. effective departmental governance relies on strong, engaged leadership that explicitly recognizes the linkages between college governance and departmental processes, and happily the department of spanish and portuguese benefitted from this kind of leadership during the years immediately following the restructuring of the college’s lecture track.

spanish has for a number of years been the largest language program at emory. after a period of expansion in the late nineties (gold ), enrollment leveled off. in the – academic year, according to reports from the registrar, , students enrolled in spanish, and in portuguese. lecture-track faculty members currently constitute more than percent of the departmental faculty and teach a majority of the undergraduate curriculum, especially in the lower division but also in upper-level courses. graduate pedagogical instruction and supervision is under the purview of a senior lecturer, and multisection undergraduate courses are usually supervised by lecture-track faculty members. thus all courses taught by graduate students are informed by the perspective of a colleague whose work is closely identified with the university’s teaching mission.

as a result of the regularization of the lecture-track faculty in the s, by there was already a sense of limited enfranchisement for lecture-track faculty members in the department, who had office spaces, mailboxes, access to departmental communications, multiyear contracts, annual evaluations, and access to opportunities for professional development through pedagogy seminars and workshops. that is, the positioning of lecture-track faculty in the department very nearly mirrored the recommendations of the mla’s “statement on non-tenure-track faculty members.” missing was the full integration of the lecture-track faculty members in departmental decision making, which made for an unbalanced departmental life and created barriers for their seeing themselves and being seen as fully responsible and capable. changing those perceptions required a change in the terms of engagement in the department. there had long been an articulation point in the curriculum where members of the lecture-track and tenure-track faculties (and graduate students) collaborated productively (in the “gateway” course to the major), but points of articulation in our departmental governance were less clearly defined. the department did not have bylaws, and the lack of a well-defined process for departmental governance was becoming increasingly evident, especially in the light of changes in the college bylaws.
there was clearly a need to formalize the participation of all members of the department in shared governance. thus writing departmental bylaws became a laboratory for acting on the spirit of inclusion of lecture-track faculty members that had guided the college’s bylaws changes and policy revisions. it was helpful that the chair of the department and a member of the lecture-track faculty (the authors of this article) had served on the college task force. the bylaws conversation was intentionally constructed as a space in which lecture-track faculty members and tenure-track faculty members would have equal voice and was privileged as an important process that required the presence of all faculty members. the process was guided by the principles set forth by the task force, especially the ideas that evaluation of lecture-track faculty members should be predicated on their having value and voice and that they should have full rights and responsibilities in faculty governance. several meetings were devoted to the discussion and writing of the bylaws, culminating in their approval by vote of all regular faculty in the department. the process of writing the rules was exemplary of the rules ultimately put in place; in these conversations, lecture-track faculty members were called to step into a new role, and tenure-track faculty members were bound by the newly revised college bylaws to respect that new role. we found that these changes in our departmental governance, initially performative, became increasingly consequential.

this new governance structure informs administrative and teaching decisions in the department, such as how to distribute teaching assignments and administrative tasks equitably and creatively. one challenge is how to move beyond the traditional tagging of certain jobs as appropriate for lecture-track faculty (e.g., director of undergraduate studies, study-abroad adviser, casa hispana adviser) and others for tenure-track faculty (e.g., chair, director of graduate studies). we strive to think about our teaching assignments departmentally as well as individually not only in terms of coverage but also as opportunities for faculty conversation, collaboration, and development. put bluntly, we are challenging the traditional thinking of lecture-track faculty members as warm bodies to be plugged into a curricular sequence and tenure-track faculty members as free agents who teach whatever and whenever they want, toward a situation in which all faculty members will work as equal partners to realize the department’s mission and leave behind the hierarchies and value assumptions of the past. what has become clear over the last year, however, is that there is a tension between this process, to which we remain fully committed, and the broad context of the college and graduate school.

looking forward

our experiences at emory reveal that while there has been much progress, much work remains to be done. even as we wrote this article, administrative decisions whose ramifications are still emerging appeared to pose unforeseen challenges for the collaborative vision we have just described. the challenges ahead (nothing as simple as revising a single set of documents) must be seen as opportunities for all faculty members to engage together in what is still very much a work in progress of rethinking the academic labor force. an underlying concern is job security for both lecture-track and tenure-track faculty members in the current climate of contraction and reorganization. other issues are conversion of tenure lines, inter- and intrauniversity portability of appointments, graduate education, and research.
the mla’s professional employment practices for non-tenure-track faculty members seems to imply that non-tenure-track lines should be converted to the tenure track whenever possible and to assume that conversion to the tenure track will be the desideratum of all those working off it. but this may not necessarily be the case. to what degree are these recommendations driven by the fundamental material and symbolic inequities that define our profession? if the working conditions and remuneration of lecture-track faculty members corresponded to their role in the institution, would there still be a pressing need for conversion? the importance of the conversion of tenure lines might become moot if the goal of symbolic and material parity were achieved. focusing on conversion of lines as a goal, in addition to reinforcing existing hierarchies, may lead us to overlook other, more transformative possibilities for thinking about the larger body of the faculty.

the issue of portability arises when lecture-track faculty members contemplate moving from one institution or unit to another. the portability of a lecture-track appointment becomes problematic because there is no interinstitutional or cross-institutional context for a common understanding of what a lecture-track appointment means. there is often little parity in terms of conditions of employment and wide variance in titles. there is a need for greater standardization, particularly across similar kinds of institutions (e.g., community colleges, four-year liberal arts colleges, research universities).

the role that lecture-track faculty members play in graduate education is another area that will require concerted and innovative effort.
at emory, as at many research universities, lecture-track faculty members serve most visibly but not exclusively at the undergraduate level. in the department of spanish and portuguese, they are deeply involved in the pedagogical training and mentorship of graduate teaching assistants, including the teaching of the required graduate seminar on pedagogical methods. yet they are not members of the graduate faculty (as some are in other departments at emory). as a result, graduate students in spanish and portuguese are trained by lecture-track faculty members in ways that are invisible and not legitimized; correspondingly, lecture-track faculty members’ contributions to our graduate program are also invisible. these issues play out across the spectrum of american universities. new models are needed for training graduate students in a changing academy, as russell berman, in his former role as mla president, noted in a call to rethink doctoral education. graduate students need, for example, expertise in digital scholarship, technology-based instruction, and engaged scholarship. to be successful on the job market, they need the skills and expertise that many lecture-track faculty members have developed. given that a shrinking percentage of jobs are tenure-track, we need to make available to our graduate students stellar models in both tracks—because it is likely that they will be working with (if not as) lecture-track faculty members in the future. thus mentors from the lecture track need to be not only invested in graduate education but also fully enfranchised. as they play an increasingly important role in the training of graduate students, boundaries and assumptions that once seemed self-evident become less so. who counts as graduate faculty, serves on the graduate studies committee, teaches graduate seminars, serves on dissertation committees, or directs dissertations? answers will vary depending on the department or program.
a positive recent development in the department of spanish and portuguese at emory is the appointment of a lecture-track faculty member to the graduate faculty, mirroring a similar previous appointment in the department of religion. the role of lecture-track faculty members in graduate education relates to another important question—research. the initially clear divide between lecture-track faculty members, whose focus was on teaching rather than research, and tenure-track faculty members, whose contributions to the university involved research as well as teaching (with research often valued over teaching), no longer obtains. this is not surprising if one takes seriously the ways in which teaching and scholarship are said to nurture and support each other or if one considers the ways in which traditional definitions of scholarship have expanded in recent years. on a practical level, non-tenure-track faculty members must have access to professional support and development. but as our definitions of scholarship and teaching evolve to meet the needs of the twenty-first century and the sharply drawn lines between the different faculty bodies within the body of “regular faculty” become increasingly blurred, we might productively reimagine their relation. currently the driving force behind the expansion of the nontenured faculty is economic. were that imperative to go away (or be mitigated by other economic forces) and in the face of radical and far-reaching changes in higher education, might we envision a tipping point at which the distinction between lecture-track and tenure-track faculty is no longer intellectually or professionally viable? where, then, are the spaces for continuing these conversations about the professoriat?
these pages are one important space for defining and animating the discussion, as are mla publications, panels, and workshops. mla leadership is key, and former president michael bérubé has used his position as a bully pulpit to play an important advocacy role: [n]on-tenure-track faculty members will have to learn . . . to assert themselves as faculty members, to comport themselves as if they have every right to be treated with the respect accorded the tenure-track faculty—which they most certainly do. and tenure-track faculty members, for their part, will have to learn not to be such jerks—and, more ambitiously, to learn to challenge cultures of jerkdom where they exist.
conclusion
we believe that thinking of the professoriat—the “regular faculty”—in terms of university citizenship will free us to move beyond a division of tenure-track and non-tenure-track faculty that thwarts our progress toward shared pedagogical, intellectual, professional, and institutional goals. in his generous and provocative discussion of global citizenship, cosmopolitanism: ethics in a world of strangers, kwame anthony appiah identifies two intertwined notions that in his view must guide our engagement in the world: “universal concern and respect for legitimate difference” (xv). by this he means that we share mutual obligations that go beyond narrow notions of cohort or community and that we must take seriously the activities and beliefs of others. if we apply these notions to the world of the twenty-first-century university, it becomes clear that we must work together to make it possible for all of us to claim and exercise the full set of rights and responsibilities that pertain to all citizens in that world. recognizing this reality, we believe, is not a luxury but rather a necessary first step toward reaching our shared goals of civic learning and democratic engagement—toward responding to our crucible moment.
notes
we gratefully acknowledge the encouragement and careful reading of drafts of this manuscript offered by associate dean michael elliot and emory college language center director and professor of german studies hiram maxim. . the report continues, “two-year and four-year colleges and universities offer an intellectual and public commons where it is possible not only to theorize about what education for democratic citizenship might require in a diverse society, but also to rehearse that citizenship daily in the fertile, roiling context of pedagogic inquiry and hands-on experiences” (crucible moment ). . the mla’s report foreign languages and higher education: new structures for a changed world concludes that “[t]he two-tiered configuration has outlived its usefulness and needs to evolve.” . in this section we draw on hartfield-méndez, marsteller, and patterson. . this distinction is one of the particularities of a leading research university and may not be the case in other institutions. it is within colleges (whether or not they are part of larger universities), however, that most non-tenure-track faculty members in english and other languages work. . every effort is made to limit adjunct and temporary hires, although there are still occasional instances of them, in response to either academic or economic considerations. . in discussing the emory experience, we use the language that has evolved at emory: lecture-track faculty and tenure-track faculty. . the standard teaching load for tenured and tenure-track faculty members in the department of spanish and portuguese is - ; lecturers teach - ; and senior lecturers and professors of pedagogy teach - .
works cited
appiah, kwame anthony. cosmopolitanism: ethics in a world of strangers. new york: norton, . print.
“appointment and review of lecture-track faculty in emory college.” emory university. emory u, july . web. aug. . berman, russell. “reforming doctoral programs: the sooner, the better.” modern language association. mla, . web. apr. . bérubé, michael. “among the majority.” modern language association. mla, jan. . web. mar. . boyer, ernest l. scholarship reconsidered: priorities of the professoriate. princeton: carnegie foundation for the advancement of teaching, . print. a crucible moment: college learning and democracy’s future. association of american colleges and universities. aac&u, . web. mar. . diversity profile: emory university composition statistics. emory university. emory u, n.d. web. mar. . foreign languages and higher education: new structures for a changed world. modern language association. mla, may . web. mar. . frye, billy e., et al. choices and responsibility: shaping emory’s future. emory university. emory u, . web. aug. . gold, hazel. “emory university: department of spanish and portuguese.” adfl bulletin . - ( ): – . print. hartfield-méndez, vialla, pat marsteller, and bobbi patterson. “the ‘lecture track’ reconsidered: professional identity and aspiration among non-tenure-path faculty.” academic exchange . ( ): – . print. kelley, anne. minutes of lecture-track faculty group meeting. mar. . ts. professional employment practices for non-tenure-track faculty members: recommendations and evaluative questions. modern language association. mla, june . web. mar. . schmidt, peter. “summit on adjuncts yields tentative framework for campaign to improve their conditions.” the chronicle of higher education. chronicle of higher educ., jan. . web. mar. . ———. “when adjuncts push for better status, better pay follows, study suggests.” the chronicle of higher education. chronicle of higher educ., nov. . web. mar. . “statement on non-tenure-track faculty members.” modern language association. mla, . web. apr. .
task force on lecture-track faculty. “report to prof. robert paul, dean of emory college.” spring . ts. teaching at emory. emory university. emory u, . web. mar. .
open access and scholarly communication: the current landscape, future direction, and the influence on global scholarship
daniel gelaw alemneh digital libraries services, university of north texas daniel.alemneh@unt.edu samantha hastings school of lis, university of south carolina hastings@sc.edu suliman hawamdeh department of lis, coi university of north texas suliman.hawamdeh@unt.edu austin mclean, (proquest) , and abebe rorissa (suny) austin mclean, director, scholarly communication and dissertation publishing, proquest, ann arbor, michigan email: austin.mclean@proquest.com abebe rorissa, associate professor, department of information studies, university at albany, state university of new york (suny). email: arorissa@albany.edu
abstract the synergies of numerous emerging trends such as the development of open source software, global and explosive growth of social networking, interinstitutional data sharing, cross discipline collaborations, etc. provide new directions for scholarship. the rapid pace of development poses new threats and challenges to scholarly communication as well. open access is increasingly viewed as a popular alternative to traditional distribution methods. despite the overwhelming agreement regarding the concept of open access, there are, however, significant differences and debates about a number of issues. this panel brings together diverse stakeholders, explores the current landscape and future direction of scholarly communication, and reflects on the overall implications for global scholarship.
keywords open access, scholarly communication, digital libraries, global scholarship.
introduction the concept of open access essentially focuses on the need for the free exchange of knowledge and ensures the widest possible availability of scholarly literature and research documentation. in light of the prospects and challenges that this new environment brings, the participants on this panel discuss a number of issues from various perspectives:
knowledge management perspective open access is increasingly viewed as a key to scientific knowledge sharing and scholarly communication, especially with the rising cost of scientific journals and access controls as mandated by licensing arrangements. dr. suliman hawamdeh is a professor and department chair in the department of library and information sciences at the university of north texas. he is an expert and a pioneer in the field of knowledge management. he will discuss the importance of open access in promoting knowledge sharing and best practices.
researchers’ perspective dr. abebe rorissa is currently assistant professor in the department of information studies at the university at albany, state university of new york (suny). in light of the fact that scholars write to advance knowledge in their fields, dr. abebe will analyze how open access enhances the research impact and professional growth of researchers. he will also discuss the economic and scholarly impact of open access in developing countries as well as the resistance to open access by the academic community.
digital curators’ perspective academic libraries provide services to support the creation, organization, management, use and reuse of digital scholarship. dr. daniel gelaw alemneh is a digital curator and project manager in the digital library division of the university of north texas libraries. he will examine how the removal of barriers (pricing, technical, and legal) facilitates the numerous activities required to maintain the digital assets of cultural heritage institutions.
asist , october - , , new orleans, la, usa.
publishers’ perspective open access and the democratization of publishing has revolutionized publishing. austin mclean is the director of scholarly communication and dissertation publishing for proquest, ann arbor, michigan. he implemented the world’s first open access publishing model for monographs in . he will share his insights into the current state of publishers’ positions and opinions.
university administrators’ and editors’ perspectives finally, dr. sam hastings, (director and professor: school of library and information science: university of south carolina; and asis&t monographs editor) will uncover the pros and cons of the open access for lis research and scholarly communication in general.
the opportunity to contribute: disability and the digital entrepreneur uc irvine previously published works permalink https://escholarship.org/uc/item/ t author boellstorff, tom publication date - - doi . / x. .
peer reviewed information, communication & society issn: - x (print) - (online) journal homepage: http://www.tandfonline.com/loi/rics to cite this article: tom boellstorff ( ): the opportunity to contribute: disability and the digital entrepreneur, information, communication & society, doi: . / x. . to link to this article: https://doi.org/ . / x. . published online: jun .
the opportunity to contribute: disability and the digital entrepreneur tom boellstorff department of anthropology, university of california—irvine, irvine, ca, usa
abstract a range of scholarly work in communications, informatics, and media studies has identified ‘entrepreneurs’ as central to an emerging paradigm of digital labor. drawing on data from a multi-year research project in the virtual world second life, i explore disability experiences of entrepreneurism, focusing on intersections of creativity, risk, and inclusion.
since its founding in , second life has witnessed significant disability participation. many such residents engage in forms of entrepreneurship that destabilize dominant understandings of digital labor. most make little or no profit; some labor at a loss. something is being articulated through languages and practices of entrepreneurship, something that challenges the ableist paradigms that still deeply structure both digital socialities and conceptions of labor. disability is typically assumed to be incompatible with work, an assumption often reinforced by policies that withdraw benefits from disabled persons whose income exceeds a meagre threshold. responses to such exclusion appear when disabled persons in second life frame ‘entrepreneur’ as a selfhood characterized by creativity and contribution, not just initiative and risk. in navigating structural barriers with regard to income and access, including affordances of the virtual world itself, they implicitly contest reconfigurations of personhood under neoliberalism, where the laboring self becomes framed not as a worker earning an hourly wage, but as a business with the ‘ability’ to sell services. this reveals how digital technology reworks the interplay of selfhood, work, and value – but in ways that remain culturally specific and embedded in forms of inequality. article history received march accepted april keywords collaboration; disability; entrepreneurship; labor; selfhood; virtual worlds ellie’s best-selling bed summits of green-topped mountains peek over the walls of ellie’s store. it cannot rain, so ceilings are unneeded. ellie leads me room to room, showing me her merchandise: case studies in clever beauty, attentive originality. furniture from beds to desks and even swings for the yard, all with custom animations built right in. a shirt that can manifest in three or even five sizes for lindens. 
© informa uk limited, trading as taylor & francis group contact tom boellstorff tboellst@uci.edu department of anthropology, university of california irvine, social science plaza, irvine, ca , usa
of course, the reason it cannot rain – that a piece of furniture can have ‘animations’ inside it – that a shirt can change its size at will – is because ellie’s store is in the virtual world second life. here, commerce takes place in linden dollars (or ‘lindens’); the exchange rate is usually around for one us dollar. ellie’s -cent virtual shirt is typically priced. but ellie has been crafting things long before discovering this virtual world: ‘let me explain the way i was raised. we didn’t have a dining room: we had a cutting board and three sewing machines … i have always crafted; if i was watching tv i was crocheting something’. this earlier crafting had sometimes represented a source of income – for instance, making miniature items for dollhouses: you could take a box of colored paper clips, two round-nosed pliers, straighten out the paper clip and rebend it into a coat hanger, sell ten of them for $ . there were [paper clips] in a package, so you’d make $ . a worsening disability made work and crafting nearly impossible for ellie. one day: i had tried to start my crocheting, and i could not hold onto my crochet hook, i kept on dropping it. i mean, i could not crochet at all … and to me, that was just, okay, shoot me now, my life is over. a friend of mine came in and found me just bawling, i mean he thought if he couldn’t calm me down he was going to have to take me to the hospital. and he goes, ‘okay, i’m going to take you somewhere, if you can hold onto the computer mouse’. i said ‘yeah, but that gets boring, you know, because it’s not crafty’.
he said ‘well, i’m going to take you somewhere where you can build, and you can make things’. this was ellie’s introduction to second life, where she not only reclaimed crafting but found opportunities to sell. this included the virtual furniture mentioned earlier – in particular, beds, one of ellie’s specialties. ellie would purchase a basic bed shape someone else had designed in a third-party program like maya or blender. these basic shapes, known as ‘kits’, could be imported into second life as three-dimensional objects. ellie added textures, making the bed appear to be made of worn wood or fine-patterned fabric. she would add animations so an avatar could sleep, read a book, or sit with legs dangling. as ellie showed me around her store, she described the financial calculus involved: this bed we’re sitting on. it costs me , lindens to get the kit. i’m selling it for . so i’d have to sell of them to break even. well, no, to break even, because you’ve gotta consider i put money into the textures and money into the animations, right? … it costs about , lindens to make this. so i’m going to have to be able to sell a good of them to break even. this is my most popular selling bed. i’ve sold . as ellie herself noted, ‘people say “why is your stuff so cheap?” … i’m not trying to sell it for a profit’. indeed, because ellie also paid a monthly fee for the virtual land on which her store stood, she had a negative cash flow of $ – a month: this is my therapy. my shrink actually said that i should submit the bill to medicaid … you see, if i did not have this outlet for my creative side, they would have to have me on drugs to keep me from going totally wacko … for me this is a four-part therapy, okay? i get my creativity release, which will build up and truly drive me insane if i don’t. i get a place where i can talk to other people about my disability.
i have a place where i can … satisfy my need to be an instructor … the fourth is, my friends are here … which is a big thing many handicapped people do not get … i gave up driving a long time ago. i can’t drive. i can barely get out of the house, with help, right now. how can someone who identifies as an entrepreneur have a ‘best-selling’ bed only five people have purchased? how can such a person not seek to sell for profit – indeed, lose money? is this false consciousness, someone duped by neoliberal capitalism? or might there be a more complicated interplay of selfhood, labor, and ability in a digital context? this is the point of departure for my analysis. disabled persons in second life like ellie are articulating something through languages and practices of entrepreneurship, something that challenges the ableist paradigms structuring digital socialities and regimes of labor.
digital technology, labor, disability
this article is based on years’ research in second life, years of which have focused on disability (e.g., boellstorff, ; davis & boellstorff, ). this virtual world is owned by linden lab; during my research it had about , residents. there is no cost to obtaining an account so long as your computer and internet connection suffice, but one must pay to own virtual land. i gathered data using methods including inworld participant observation, physical-world and inworld individual interviews, and inworld group interviews. i got to know disability communities through my original fieldwork and built on those connections for this research. these disability communities are as diverse as in the physical world, including visual and auditory impairments, limb loss, autism, epilepsy, post-traumatic stress disorder, multiple sclerosis, and the effects of strokes, cancer, parkinson’s disease, and other illnesses.
this diversity thus includes congenital disabilities, disabilities acquired later in life due to disease or accidents, and conditions whose status as ‘disability’ is contested (for instance, deafness and autism). most of my interlocutors were between and , but some were in the – range, and a few were in their seventies, eighties, and even nineties. from its origins in the early s, second life was designed as a virtual world where most objects and experiences would be created by residents (ondrejka, ). this model, known by terms like ‘user-generated content’ or ‘prosuming’, is fundamental to platform capitalism in that platforms are underdetermined: facebook does not produce most of its posts; youtube does not create most of its videos. in second life, user-generated content can be given away freely or sold for linden dollars; as noted above, these can be exchanged for us dollars. most commodities sell for the equivalent of cents to two dollars, but there are many items in the $ – range and a few for more than $ , $ , or even $ (see au, ). the open-ended design of second life means there are ample possibilities for content creation and sales, but some characteristics of the virtual world work against these possibilities, particularly for disabled persons. while second life accounts are free, the relatively high cost of renting land is a barrier. a full region (‘sim’) costs $ to set up, with a monthly fee of $ . regions can be shared, and it is possible to own smaller parcels (or rent parcels from larger virtual landowners) so that one has a monthly fee of $ or less, but even this is prohibitive for some disabled persons. without land, creating and selling objects is harder though not impossible (see the ‘second life marketplace’ discussed below). despite these barriers, throughout my fieldwork i have been struck by how often disabled persons in second life participate in content creation and sales.
the exact number of such persons is not key: ethnographic analysis is not about establishing what is prevalent but exploring what is possible. demographic data are difficult to obtain because accounts can be obtained anonymously, and not everyone reveals their disability inworld. morgan, a disabled entrepreneur, noted that: in our community, this is huge because we can choose how much anonymity we want. and for some of our members, that anonymity is key to their comfort zone of participation. and then, of course there are a lot of people who just, they don’t tell people, period … they’re just choosing to explore this world without the d-word attached to it. they’re not trying to be able-bodied. they’re just trying to kind of see what it’s like to not have the big d front and center. with these limitations in mind, it was clear that most of my interlocutors lived in north america or europe. most had limited resources – for instance, an annual income under $ , in the united states – though some identified as middle class. (one indicator: the research project was to have a virtual reality component involving the purchase of vr headsets for at least participants, but it was only possible to do this for three participants because the rest did not own sufficiently powerful computers.) in line with surveys estimating that around % of virtual-world residents identify as female in the physical world (pearce, symborski, & blackburn, , p. ), the majority of my interlocutors were women. i emphasize female narratives in this analysis and address how gender intersects with disability in the domain of entrepreneurship. while not commonly emphasized by my interlocutors, it bears recalling that through their creative labor they were contributing to the profits of linden lab (analogous to the way that content creators are pivotal to the profits of facebook, youtube, twitter, and so on; see ekbia & nardi, ).
my ethnographic material understandably speaks to a range of topics; everyday experience (online or offline) always involves multiple cultural domains. i seek to contribute to literatures on digital technology and labor; literatures on disability and labor; and the emerging body of work addressing all three of these domains (e.g., friedner, ). i turn particularly to entrepreneurship. as a pivotal theme addressed by current research on technology and labor, entrepreneurship opens the analysis to questions of intersubjectivity and belonging – to how ‘contribution’ as affect and social fact shapes intersections of disability and labor. this is important because many disabled persons do not work for wages: indeed, state and national laws often forbid income as a condition for benefits. appreciating the contributions of disability experience to the question of digital technology and labor requires moving beyond ‘employment’ narrowly construed.

scholars writing on entrepreneurism have noted its connection to aspects of selfhood in addition to gender:

where work is coded as entrepreneurship, [workers] learn to imagine themselves as risk takers rather than laborers. their cultural characteristics – such as gender, race, ethnicity, nationality, citizenship status, and religion – make it possible for them to succeed in mobilizing themselves or others like them as labor. (tsing, , p. )

this is a ‘gendered, racialized, and classed distribution of opportunities and vulnerabilities’ (van doorn, , p. ), a context in which ‘age, gender, ethnicity, region and family income re-emerge … and add their own weight to the life chances of those who are attempting to make a living’ (mcrobbie, , p. ). while these authors do not list ability, i am certain they would consider it relevant, given that ‘the concept of disability emerged alongside the rise of industrial capitalism … disability came to be understood as a limit to one’s ability to earn a living’ (ross & taylor, , p. ). it is ‘because of the industrial revolution … [that] disability emerged as both an analytical concept and lived way of experiencing the world’ (friedner, , p. ); at the same time, disabled persons have long been reworking non-medical technologies in unexpected ways (williamson, ). one vital analytical task is to trace how such disability lifeworlds are transforming in the contemporary digital era.

while my argument is informed by recent developments in online socialities, it is important to place these developments in historical context. the connection between technology and labor has been a concern since the ancient greeks and was central to marx’s critique of capitalism. for instance, in chapter of capital, vol. marx discussed how alongside lengthening the working day and compelling workers to labor harder, technology allows capitalists to produce surplus value and thereby profit at the worker’s expense. here as elsewhere, marx emphasized labor’s embeddedness in society: ‘technology … lays bare [man’s] mode of formation of his social relations, and of the mental conceptions that flow from them’ (marx, , p. ).

for over a century, anthropologists have taken up these questions of technology and labor. malinowski’s classic argonauts of the western pacific ( ) was ‘totally devoted to the analysis of economic relations’ (godelier, , p. ). other work showed how ostensibly ‘primitive’ peoples without money actually interweave economics and culture in a complex fashion: ‘they make their economic relationships do social work … [and] in all this primitive economic systems differ only in degree and not in kind from our own’ (firth, , p. ; see bloch, ). by showing the cultural embeddedness of labor, anthropological scholarship in dialogue with feminist marxism challenged the image of a universal proletariat (e.g., harris & young, ; meillassoux, ; nash, ; ong, ; taussig, ).
more recent work has explored how ‘like workers, capitalists are always constituted as particular kinds of persons through historically specific cultural processes’ (yanagisako, , p. ; see dunn, ). anthropology has thus contributed an analysis that ‘treats capitalist action as culturally produced and, therefore, always infused with cultural meaning and value’ (yanagisako, , p. ). how might it be that entrepreneurs are being constituted as particular kinds of persons through historically specific conjunctions of disability and digital technology? anthropological approaches can explore how these conjunctions might act as forms of ‘dislocation’ in which ‘both places and persons are reconfigured by the movements of capital’ (harvey & krohn-hansen, , p. ; see also bear, ho, tsing, & yanagisako, ).

labor as contribution

although second life is designed around the user-generated content model, making content for profit is neither obligatory nor a universal goal. most residents do not produce items for sale at all: they purchase what others make or obtain items for free. those who create often do so for the pleasure of creating, perhaps giving copies of favorite items to friends. for some, however, the work of creating leads to sales. this is usually done either through an inworld store, on the ‘second life marketplace’ website, or both. (an inworld store incurs the cost of paying for the virtual land on which it sits, unless one advertises one’s wares inside someone else’s store, in which case a fee is often paid. if listing on the second life marketplace, linden lab charges a % commission.) a few residents make thousands of us dollars selling avatar clothing or managing virtual real estate, though job-hunting in second life is not necessarily easy (au, ). most residents, however, earn lesser amounts of money, and this pattern holds for disabled entrepreneurs as well.
a few have earned what they consider significant income – for instance, from managing a series of rental estates covering almost second life regions, with paid employees. often, however, the income is more modest. for instance, one disabled fashion designer usually priced clothing items at around lindens, and sold approximately items a month, giving a monthly income of around $ . and often there is no significant income at all: recall that ellie had sold five copies of her best-selling bed, earning about five dollars.

how do disabled persons understand these dynamics of virtual labor in the context of entrepreneurial selfhood? how might disability intersect with and transform expectations regarding such forms of selfhood – given that in the united states and elsewhere, ‘entrepreneurship’ is promoted by state and other entities as a way to conceptualize disability ‘self-employment’? how is entrepreneurship being framed as a modality by which one’s inner self is revealed to oneself and the social world?

morgan, whose thoughts on anonymity and the ‘big d’ i cited above, had a good number of disabled acquaintances. so many of them were successful entrepreneurs – or sought to become entrepreneurs – that she founded an organization for disabled persons already in second life interested in entrepreneurship. sitting in my second life home one day, she explained that her goal was to help ensure that for ‘people who don’t feel like they have any contribution to make, we get them to a place where they can see they have a contribution’:

no longer do we have to sit there and go ‘i have to make a certain amount of money a year’. for most of us, the society we’re in doesn’t support that for us. right? it looks at us, and it doesn’t even give us the opportunity to contribute in that way.
you know, when they see a wheelchair coming through the door, or somebody with a stick to guide them, or they hear that they need an animal on site, ‘no, we can’t accommodate that’, right? and so our opportunities become more limited, but it doesn’t mean that our potential is gone. it’s definitely critical for me to feel that i have something to contribute.

as with my disabled interlocutors more generally, experiences with employment and unemployment in the physical world led morgan to reflect on the implications of disability for virtual-world entrepreneurship. particularly relevant for my analysis is her linking of selfhood with a sense of contribution: ‘contributing something back to society takes us off the focus of our condition and its challenges, to this focus on this other thing that we’re contributing … that’s what gives us that initiative’. furthermore, morgan (like others) directly connected this initiative to entrepreneurship: ‘the definition of an entrepreneur is a person who organizes and manages any enterprise, usually with considerable initiative and risk … “i’m putting myself out there; this is what i do”’.

morgan’s definition of entrepreneurship recalls scholarly definitions discussed below. i have given morgan the first word to underscore her point that entrepreneurship can be collaborative. entrepreneurs are of course always part of collectivities that can include funders, peers, and workers, but for morgan and my disabled interlocutors more generally, the idea of nurturing members of a community was not external to the definition of entrepreneurship. for instance, morgan was aware of ellie and her best-selling bed: ‘you know, i listen to ellie say “hey, i spend more than i make”. but actually i’m guessing, with a few skill sets, ellie could make more than she spends, because she’s super-talented’.
these skill sets could include things like learning programs outside second life helpful in content creation, or better marketing. but what already stands out in this data is that disability languages and practices of entrepreneurship are shaping cultural logics beyond the economic.

‘entrepreneur’ as subject position

there has been sustained interest in the entrepreneur as a culturally and historically specific subject position – a socially extant category of selfhood that can be occupied in various ways (i.e., as individualized ‘subjectivities’; see boellstorff, ). one classic theorization of the entrepreneur subject position comes from schumpeter’s the theory of economic development. schumpeter was concerned with the role of ‘new combinations of means of production’ in economic development: ‘the carrying out of new combinations we call “enterprise”, the individuals whose function it is to carry them out we call “entrepreneurs”’ (schumpeter, , p. ). with regard to digital capitalism, schumpeter’s idea that entrepreneurs are pivotal to economic recombination and thus social change has gained mythic status – as indicated by the mere mention of (nota bene: male) names like jobs, gates, zuckerberg, and bezos.

however, a rich body of scholarship has explored how conceptions of entrepreneurship have expanded beyond this figure of the corporate titan. the metaphor for employee–employer relations has shifted from that of property, where workers own themselves as if ‘they were property that could be rented to an employer for a certain period of time’ (gershon, , p. ) to a metaphor where ‘people now think they own themselves as though they are businesses – bundles of skills, assets, qualities, experiences, and relationships, bundles that must be consciously managed and constantly enhanced’ (gershon, ). this newly dominant metaphor represents ‘new imaginaries of labor in which making a living appears as entrepreneurship’ (stensrud, , p. ). in this framework ‘contemporary culture’s benchmark of success is the figure of the entrepreneur’ (duffy, , p. ): it is assumed that ‘you are no longer a worker, with worker’s rights. instead, you’re an entrepreneur, and entrepreneurs take risks (and suffer them too)’ (dewhurst, , p. ).

social scientists have explored links between economic formations and selfhood since at least weber’s the protestant ethic and the spirit of capitalism ( ). at issue are the ableist forms these links take in digital contexts. i coined the term ‘creationist capitalism’ in my analysis of user-generated regimes emerging online since the s (boellstorff, , chapter ). with this neologism i sought to highlight how creativity was becoming construed as a form of labor, particularly in the context of digital socialities where the cost of producing, say, virtual chairs was not times the cost of producing one chair (as opposed to the cost of producing wooden chairs compared to one wooden chair). i also sought to highlight how the christian metaphysics weber identified as central to dominant capitalist formations of the nineteenth century remain, albeit transformed, in the twenty-first century. i identified the pivotal transformation as one in which ‘workers are not just sellers of labor-power, but creators of their own worlds’ (boellstorff, , p. ). rather than worldly success indicating divine favor, in creationist capitalism it is creation that reveals one’s inner self. increasingly, this inner self is an entrepreneurial self (rather than, say, the self of kinship or wage labor).

we now have a constellation of terms alongside ‘creationist capitalism’ that track these shifts in digital labor, including communicative capitalism (dean, ), aspirational labor (duffy, ), platform capitalism (srnicek, ), platform labor (van doorn, ), and venture labor, ‘the explicit expression of entrepreneurial values by nonentrepreneurs’ (neff, , p. ).
the actions, experiences, and subjectivities of my disabled interlocutors in second life further develop neff’s insights: in addition to nonentrepreneurs expressing entrepreneurial values, the horizon of what counts as entrepreneurship is expanding across the terrain of the human. the binarism of ‘entrepreneur’ and ‘nonentrepreneur’ is becoming destabilized in favor of multiple inhabitations of the entrepreneur subject position (just as, for instance, one can inhabit the ‘teenager’ subject position as a diligent ‘geek’, athletic ‘jock’, and so on).

my analysis here thus explores how a concept related to self-identity can be transformed in ways never expected at the time the concept was originally formulated. disability experience in virtual worlds provides new perspectives on how reconfigurations of ‘entrepreneur’ are emerging – notions of entrepreneurial selfhood that do not stand outside the dominant discourse but cannot be reduced to it either. in other words, a working hypothesis i derive from my ethnographic data is that a prototypical silicon valley ‘entrepreneur’ and the disabled persons i discuss in this article differentially inhabit a shared subject position. at issue is not conflating different forms of selfhood but recognizing how differing forms of selfhood can be informed by a shared cultural logic. this illuminates emerging contours of an ‘entrepreneurial subjectivity’ that involves reconfigurations of self-presentation and self-understanding (bröckling, ; marwick, ). such reconfigurations include new forms of ‘entrepreneurial citizenship’ in which ‘entrepreneurialism is not only a project of the self, but a project that posits relations between selves and those they govern, guide, and employ’ (irani, in press). these are, in short, forms of ‘entrepreneurial living’ (lindtner, in press) in which selfhood and citizenship are construed as an intertwined entrepreneurial project.
the scholars cited above in this section are among those who explore the benefits and dangers in these new forms of selfhood. at stake in understanding these benefits and dangers is nothing less than what human agency and equality will mean in the digital age. we need analytical tools for comprehending this expansion of the entrepreneur subject position, such that people ‘increasingly define themselves as self-branding entrepreneurs rather than employees’ (robinson, , p. ). recalling weber, it is remarkable that this can be at least partially delinked from the desire for wealth (see weeks, ). neff notes that ‘when people think of their jobs as an investment or as having a future payoff other than regular wages, they embody venture labor’ (neff, , p. ). this is a culture of capitalism that ‘shifts content creators’ focus from the present to the future, dangling the prospect of a career where labor and leisure coexist’ (duffy, , p. ).

my interlocutors, most of whom did not enter second life with entrepreneurship in mind, reframe these conceptions of laboring selfhood. lila, for instance, got to know second life after a friend asked her to spend time there: she had been inworld for four years before being disabled by a significant chronic illness. she then became a creator of roleplaying clothing, avatar body attachments, and furniture. however, she emphasized ‘i actually didn’t want to deal with building when i wasn’t sick … i was crazy bored at home and i wanted to do more, something to make me feel productive even if i didn’t sell many things’. for lila, as for ellie and many other disabled entrepreneurs, a sense of productivity was linked to creating, collaborating, and sharing, not sales. for instance, customers had purchased about copies of one of lila’s signature pieces of furniture. some months she would sell enough to pay the rent for her inworld store (about $ ), but not consistently. however, lila’s real motivation was ‘i like the fact that someone else enjoys things i make. i get some sense of satisfaction for work done’.

as i noted earlier, this kind of ethnographic analysis confronts the complex interplay of multiple cultural domains. lila’s experience and that of many of my interlocutors draws on notions of craftsmanship (sennett, ), but is also gendered, reflecting how historically the work of women has often not been seen as real ‘labor’: reassigned an emotional value and conflated with a ‘domestic’ sphere. entrepreneurial selfhood is thus not external to a gendered logic in which ‘online technology allows workers to carve out strategies to cope with conditions that are highly intensified because they are taken to be individual rather than structural in nature’ (gregg, , p. ; see hochschild, ). gender and ability are both shaped by this dynamic, which means that ‘people increasingly … have to do the work of the structures [like the welfare state] by themselves … which in turn requires intensive practices of self-monitoring or “reflexivity”’ (mcrobbie, , p. ). it is in this context of intensification through individuation – making work more overwhelming by making it more personal – that my interlocutors’ naming of collaboration as intrinsic to their conception of entrepreneurship is particularly revealing.

collaboration and capability

in this section i focus on the question of collaboration. while certainly informed by gender, as noted above the ideal of collaborative labor is mobilized by other cultural characteristics, including disability. for my interlocutors the link between disability, digital entrepreneurship, and collaboration was often shaped by upsetting and economically devastating experiences of physical-world employment discrimination.
consider how one morning a group of disabled persons discussed labor in both second life (‘sl’) and the physical world (often colloquially termed ‘rl’ or ‘real life’, but with an understanding that second life was real as well):

rhonda: wonder if anyone else is afraid to try to get a job in rl … i fear that if i am unable to do it, keep up with my work, or if i cannot understand or am too slow … then i’ll get fired and i will have lost my benefits.
jason: i share that.
ruby: ughhhh
sylvia: i will start my teacher training in march, and just like any social work i am afraid i will burn out twice as hard.
rhonda: sometimes i’m sick or just unable to do things for a month or so … i don’t think they take that into account when they think we should try to work, but could lose our benefits. so i’ve got lots of fear of that happening.
sylvia: ♥
david: the last job i had in rl, i lost two days before my trial period was over. it was in a hotel, shift work. and they scheduled me to do the late shift, and then i’d have to do the early shift the next day after, which meant that when i got home and took my meds, it took me at least a couple hours to go to sleep. so i didn’t get enough sleep, and it kept burning me out.
sylvia: gotta love the retail type of jobs … .
david: when i asked my boss if they could accommodate me, because basically they have to by law here in france, he asked why, so i was open with him and said it’s because i have bipolar disorder. and his face just turned, and he talked about how people with manic depression are unreliable and dangerous to have around.
lila: sighs
michelle: dang
sylvia: grrrrrr
david: so they let me go. and that was the last time i worked in rl. i’m on disability now, stable, and i find that i can make a little pocket money here in second life by making custom mesh [objects] for people, some cars and some little buildings, and i’m working on a big house.
so thanks to michelle and others for teaching me how to do it! but that’s how i use second life, a little pocket money here and there.

another interlocutor, joseph, noted how:

i was told i would lose medical benefits by working. if anything, i could work and have $ deducted for every $ earned, i cannot have more than $ , in an account, and it can work out to earn an extra $ a week … employment means a whole lot more than money. it means having a place to go every single day where i am (hopefully) wanted and needed.

in conversations like these and in everyday practices of digital entrepreneurship, we find (as in david’s statements above) a valuing of creativity, a de-emphasis on sales despite income precarity, and a stressing of collaboration and learning. these responses to conflations of labor and self-worth extend beyond disability:

work is crucial not only to those whose lives are centered around it, but also, in a society that expects people to work for wages, to those who are expelled or excluded from work and marginalized in relation to it. (weeks, , p. )

morgan noted that:

it is such a conflicting situation, of constantly facing barriers to what you are capable of doing. and constantly having these outside forces suggest you’re not being honest about your capabilities, and that you could do more … [disabled persons] are actually forced into the position of entrepreneurship … you’re going to have to have the initiative to prove that you can make that contribution.

morgan indicates that the ‘opportunity’ to contribute can be a compulsion as well. the intersection of disability and the digital reveals how the entrepreneur subject position is centered on a normatively ableist self. this is a self who ostensibly faces no barriers to work, particularly when ‘vocational rehabilitation’ programs frame entrepreneurship as a paradigm of disability self-employment.
digital technologies are now commonly linked to that paradigm, as if they ensure labor transparently reveals one’s value. this is one way that such technologies have often furthered, not mitigated, exclusions of disabled persons from the workforce (ross & taylor, ). to recall one of the most enduring insights of technology studies, no technology has an inevitable social valence. technology does not inherently ‘make things better’.

the ableist self on which the entrepreneur subject position is centered is presumed to be constituted through risk and individual productivity. it is thereby part of a cultural framework that narrowcasts dependency, mutuality, and collaboration in terms of start-up or open-source ‘disruptions’ of corporate capitalism (lindtner, in press). however, my analysis builds on the growing body of work showing how the dynamics in play involve inclusion as well:

[d]isabled people are being produced as idealized ‘workers with disabilities’ and included in neoliberal workplaces … they provide added value through helping corporations rack up csr [corporate social responsibility] ‘brownie points’. they are also remaking the workplace as a more affective space for [able-bodied] coworkers who experience novel feelings of responsibility, inspiration, attachment, and love. (friedner, , p. )

disabled persons in second life respond to these shifting dynamics of exclusion and inclusion when framing ‘entrepreneur’ as a selfhood characterized by collaboration and contribution as well as initiative and risk. this construes ability as interpersonal, and entrepreneurship as a capability that cannot be slotted into a classic teleology of wealth accumulation or even full employment.
it is an aspirational labor where one key ‘aspiration’ is the opportunity to contribute itself – recalling capabilities approaches to human rights that focus on ‘what people are actually able to do and to be, in a way informed by an intuitive idea of a life that is worthy of the dignity of the human being’ (nussbaum, , p. ; see also burchardt, ; sen, ).

for my interlocutors, second life enabled collaborative entrepreneurship not just because of mobility limitations, but because the affordances of virtual worlds included community and tools for creation. when describing her unemployment, michelle once noted that ‘job situations don’t accommodate mental unwellness very well. what i find in second life though is an opportunity to get some of the very positive rewards of “working”, of being productive, of making a contribution to the wider world’. that this ‘wider world’ includes a virtual world underscores how the internet is not a monolithic cultural entity. affordances of various online socialities vary, with often-unforeseen consequences. morgan once noted that:

when you compare to facebook, facebook is a social media … there’s nothing solid in it, right? there’s no open mikes: any creative expression i post on facebook can be potentially limited to those that i would allow to see it, and those who see it, they’re not going to pay me a dime for it.

morgan here emphasizes facebook’s form as a network. in contrast, second life is ‘solid’ – meaning not that it is physical, but that it is a place. it does not mediate between two locations of culture, but is a site of culture itself:

if i try to go out and be an entrepreneur in the real world, i got bankers telling me why they’re not going to fund me, i got office buildings telling me why they’re not going to rent to me, i’ve got all kinds of people telling me what they can’t do. and i find in the virtual world there’s very little of that. you have a whole lot of the opposite. which is, ‘yeah, you should do that. yeah.
i know someone who knows how to do that. you should talk to this person’ … i didn’t think i’d be able to build. and the people who build were like magicians to me, and i would watch people – ellie was one of the first people i watched build, and i was pretty sure she was a magician, because she can build anything in a few seconds … and i’m just like ‘that will never be me; i’m not capable or competent’, but i have come to realize i am capable of things i never imagined.

morgan summarized her experiences and those of her fellow disabled entrepreneurs:

our lives aren’t over, and here is a virtual world where we can express that, and how we choose to define success. that’s why we don’t define it by somebody who can support themselves off their linden dollars annually. that’s not a valid measurement of success.

conclusion: toward an anthropology of absences

one possible interpretation of the materials discussed in this article is that disability entrepreneurs in second life are duped by neoliberal capitalism. however, more careful ethnographic attention reveals persons who in a sense take rhetorics of entrepreneurialism at their word, yet forge visions of a better self and community. recentering entrepreneurial selfhood on collaboration and simultaneously reframing what ‘collaboration’ entails, they sideline rhetorics of productivity and challenge dominant logics of ableism. as michelle noted, ‘second life has given me a way to feel once again like i am a contributing member of society. it has helped me reconstruct my sense of identity, in the wake of becoming disabled’.

at a methodological level, my analysis illustrates how ‘ethnographic thick description can surely offer a way forward for rethinking the economy outside of a capitalocentric frame’ (gibson-graham, , p. s ).
beliefs and practices around disability entrepreneurship in second life do nothing less than rework the notion of value – but in ways that cannot be reduced to either complicity or opposition. the relation to dominant beliefs is not so unilinear. recalling insights gained from earlier research in indonesia, i might say that these second life residents are not ‘translating’ dominant notions of ability and labor. rather, they ‘dub’ them like a movie is dubbed into another language, resulting in an ongoing juxtaposition where moving lips never quite match the new, dubbed voice, but meaning-making nonetheless occurs (boellstorff, ).

while some anthropologists are understandably ‘uncomfortable with scholarly insistence that people with disabilities teach us something’ (kulick & rydström, , p. ), ethnographic analysis contributes more than knowledge regarding the specific community studied. for instance, attention to disability entrepreneurs in virtual worlds speaks to emerging dynamics of digital labor and the implications of platform socialities for personhood. their forms of mutual support challenge individualistic tropes of the self-made genius. their experiences of value creation challenge the binarism of ‘ability’ versus ‘disability’, suggesting that rubrics attentive to human capability might prove more effective.

such insights also broaden intersections of disability studies and digital studies. to date, disability scholarship addressing virtual worlds has highlighted opportunities for ‘information, socialization, and community membership’ (stewart, hansen, & carey, , p. ). these are all valuable topics, but foregrounding labor allows us to pose different questions regarding current contexts and future possibilities for disability inclusion. the point, then, is not that disabled persons be compelled to ‘teach us something’, but that they have a place at the table of recognized ways of living a fully human life.
in this sense, i might term my analysis an ‘anthropology of absences’. this builds on boaventura de sousa santos’s notion of a ‘sociology of absences … an inquiry that aims to explain that what does not exist is, in fact, actively produced as non-existent’ ( , p. ). he emphasized that one way such ‘non-existence’ is produced is ‘non-productiveness’, which applied to labor takes the form of assumptions regarding ‘discardable populations’ ( , p. ), and which can be countered by ‘recuperating and valorising alternative systems of production … hidden or discredited by the capitalist orthodoxy of productivity’ (santos, , p. ; see mitchell & snyder, ). in recuperating and valorising the work of digital disability entrepreneurs, i respond to how disability can be made to appear absent in regimes of labor, and how some disabled persons in second life presence their ability through languages and practices of entrepreneurship. this is why income can be partially delinked from entrepreneurship: ‘entrepreneurship’ is being used to make present ability and contribution.

i also respond to the reality that some contemporary digital scholarship actively produces virtual worlds as non-existent, particularly those virtual worlds not oriented toward children (like minecraft) or predominantly structured as games (like world of warcraft). i remain amazed by how often colleagues ask me some version of the question ‘is second life even around any more’? yet

[f]or ethnographers today, no task is more important than to make small facts speak to large concerns, to make the ethical acts ethnography describes into a performative ontology of economy and the threads of hope that emerge into stories of everyday revolution. (gibson-graham, , p.
s )

this is true despite the danger that the disability entrepreneurs i discuss in this article could be taken as ‘poster children’ for virtual worlds (and capitalist markets to boot). the tendency for disability experience to be reduced either to catastrophe or ‘inspiration’ (rousso, ) does not disappear in the digital domain. the response to this tendency should be neither to marginalize disability experience nor treat it as an instance of ‘technosolutionism’ (lindtner, bardzell, & bardzell, ), but engage with that experience as deeply contributing to interdisciplinary conversations regarding the human condition.

making the lifeworlds of disability entrepreneurs in second life present in our conceptual debates can contribute powerfully toward better understanding the emerging digital economies that already transform societies. it reframes disability as a form of social action irreducible to limitation or lack. in a contemporary moment when so much discussion of online socialities foregrounds surveillance, deception, and precarity, the lifeworlds of disability entrepreneurs in second life point to the no less real possibilities for connection, possibility, and creativity. and it is in approaches founded neither in utopia nor dystopia, however promising or fearful the future might seem, that we find the best hope of comprehending our unfolding present.

notes

. in this article, i employ ‘disabled persons’ rather than ‘people with disabilities’. both are contested and imperfect, but i find person-first language less effective (see sinclair, ; titchkosky, ; broderick & ne’eman, ). i received institutional review board (irb) approval for this research. no hipaa (health insurance portability and accountability act) related details of health status were obtained, and details of self-identified disabilities (along with other personally identifying details) have been altered.
physical world and screen names have been changed; quoted text chat has been altered to make it harder to find using a search engine.

by extension the money can then be converted to any currency, but linden dollars are directly exchangeable only into us dollars.

see, for instance, https://www.dol.gov/odep/topics/selfemploymententrepreneurship.htm (accessed march , ); https://www.colorado.gov/pacific/dvr/self-employment (accessed march , ).

acknowledgements

i thank my second life interlocutors for their generosity, patience, and truly extraordinary insights. i thank my co-investigator, donna z. davis, for her camaraderie and intellectual support. i thank gerard goggin and haiqing yu for their encouragement and support. a draft of this paper was discussed by the labortech group: for the invitation to participate i thank winifred poster, the group’s organizer; for their comments during the discussion i thank opeyemi akanbi, sareeta amrute, julie yujie chen, laura forlano, seda guerses, and lilly irani. additional comments were provided by ilana gershon, alice krueger, silvia lindtner, alice marwick, and winifred poster. a documentary about this project, our digital selves (bernhard drax, director, mn, ), is freely available at https://youtu.be/gqw -me w .

disclosure statement

no potential conflict of interest was reported by the author.

funding

this research was funded in part by the national science foundation, cultural anthropology and science, technology, and society programs [grants and ].

notes on contributor

tom boellstorff is professor in the department of anthropology at the university of california, irvine. a fellow of the american association for the advancement of science, he is the author of many articles and the books the gay archipelago, a coincidence of desires, and coming of age in second life.
he is also coauthor of ethnography and virtual worlds: a handbook of method and coeditor of data, now bigger and better! a former editor-in-chief of american anthropologist, the flagship journal of the american anthropological association, he currently coedits the princeton university press book series ‘princeton studies in culture and technology’ [email: tboellst@uci.edu].

references

au, w. j. ( , september ). open forum: what’s the most you ever paid for a virtual item in second life (besides land)? new world notes. retrieved from http://nwn.blogs.com/nwn/ / /sl-virtual-item-sale.html
au, w. j. ( , april ). virtual job hunting in second life about as daunting as job hunting irl. new world notes. retrieved from http://nwn.blogs.com/nwn/ / /virtual-job-second-life.html
bear, l., ho, k., tsing, a., & yanagisako, s. ( , march ). gens: a feminist manifesto for the study of capitalism. theorizing the contemporary. cultural anthropology website. retrieved from https://culanth.org/fieldsights/ -gens-a-feminist-manifesto-for-the-study-of-capitalism
bloch, m. ( ). marxism and anthropology: the history of a relationship. oxford: clarendon press.
boellstorff, t. ( ). dubbing culture: indonesian gay and lesbi subjectivities and ethnography in an already globalized world. american ethnologist, ( ), – . doi: . /ae. . . .
boellstorff, t. ( ). the gay archipelago: sexuality and nation in indonesia. princeton, nj: princeton university press.
boellstorff, t.
( ). coming of age in second life: an anthropologist explores the virtually human ( nd ed.). with a new preface. princeton: princeton university press.
bröckling, u. ( ). the entrepreneurial self: fabricating a new type of subject. los angeles, ca: sage.
broderick, a., & ne’eman, a. ( ). autism as metaphor: narrative and counter-narrative. international journal of inclusive education, ( / ), – .
burchardt, t. ( ). capabilities and disability: the capabilities framework and the social model of disability. disability & society, ( ), – .
davis, d., & boellstorff, t. ( ). compulsive creativity: virtual worlds, disability, and digital capital. international journal of communication, , – . retrieved from http://ijoc.org/index.php/ijoc/article/view/ /
dean, j. ( ). blog theory: feedback and capture in the circuits of drive. cambridge: polity.
dewhurst, m. ( ). we are not entrepreneurs. in m. graham & j. shaw (eds.), towards a fairer gig economy (pp. – ). oxford: meatspace press.
duffy, b. e. ( ). (not) getting paid to do what you love: gender, social media, and aspirational work. new haven: yale university press.
dunn, c. d. ( ). personal narratives and self-transformation in postindustrial societies. annual review of anthropology, , – .
ekbia, h., & nardi, b. ( ). heteromation, and other stories of computing and capitalism. cambridge, ma: mit press.
firth, r. ( ). orientations in economic life. in e. e. evans-pritchard (ed.), the institutions of primitive society (pp. – ). oxford: basil blackwell.
friedner, m. i. ( ). valuing deaf worlds in urban india. new brunswick: rutgers university press.
gershon, i. ( ). down and out in the new economy: how people find (or don’t find) work today. chicago: university of chicago press.
gibson-graham, j. k. ( ). rethinking the economy with thick description and weak theory. current anthropology, (s ), s –s .
godelier, m. ( ). perspectives in marxist anthropology. translated by robert brain. cambridge: cambridge university press.
gregg, m. ( ). work’s intimacy. cambridge: polity.
harris, o., & young, k. ( ). engendered structures: some problems in the analysis of reproduction. in j. kahn & j. llobera (eds.), the anthropology of pre-capitalist societies (pp. – ). london: macmillan.
harvey, p., & krohn-hansen, c. ( ). dislocating labour: anthropological reconfigurations. journal of the royal anthropological institute, (s ), – . doi: . / - .
hochschild, a. r. ( ). the time bind: when work becomes home and home becomes work. new york: h. holt.
irani, l. (in press). entrepreneurial citizenship: innovators and their others in indian development.
kulick, d., & rydström, j. ( ). loneliness and its opposite: sex, disability, and the ethics of engagement. durham: duke university press.
lindtner, s. (in press). age of experimentation: making as entrepreneurial living in china’s new normal.
lindtner, s., bardzell, s., & bardzell, j. ( ). reconstituting the utopian vision of making: hci after technosolutionism. proceedings of the chi conference on human factors in computing systems (chi ‘ ) (pp. – ). new york: acm. doi: . / .
malinowski, b. ( ). argonauts of the western pacific. new york: e. p. dutton & co.
marwick, a. e. ( ). entrepreneurial subjects: venturing from alley to valley. international journal of communication, , – .
marx, k. ( [ ]). capital: a critique of political economy. volume i: the process of capitalist production. translated by ben fowkes. london: penguin books.
mcrobbie, a. ( ). clubs to companies: notes on the decline of political culture in speeded up creative worlds. cultural studies, ( ), – .
meillassoux, c. ( ). from reproduction to production: a marxist approach to economic anthropology. economy and society, ( ), – .
mitchell, d. t., & snyder, s. l. ( ).
disability as multitude: re-working non-productive labor power. journal of literary & cultural disability studies, ( ), – . doi: . /jlcds. .
nash, j. ( ). we eat the mines and the mines eat us: dependency and exploitation in bolivian tin mines. new york: columbia university press.
neff, g. ( ). venture labor: work and the burden of risk in innovative industries. cambridge, ma: mit press.
neff, g. ( ). conclusion: agendas for studying communicative capitalism. international journal of communication, , – .
nussbaum, m. ( ). frontiers of justice: disability, nationality and species membership. cambridge: harvard university press.
ondrejka, c. ( ). escaping the gilded cage: user created content and building the metaverse. new york law school law review, ( ), – .
ong, a. ( ). spirits of resistance and capitalist discipline: factory women in malaysia. albany: state university of new york press.
pearce, c., symborski, c., & blackburn, b. r. ( ). virtual worlds survey report: a trans-world study of non-game virtual worlds–demographics, attitudes, and preferences. corporate report. retrieved from http://cpandfriends.com/wp-content/uploads/ / /vwsurveyreport_final_publicationedition .pdf
robinson, l. ( ). entrepreneuring the good life? international journal of communication, , – .
ross, a., & taylor, s. ( ). disabled workers and the unattainable promise of information technology. new labor forum, ( ), – . doi: . /
rousso, h. ( ). don’t call me inspirational: a disabled feminist talks back. philadelphia: temple university press.
santos, b. ( ). the wsf: toward a counter-hegemonic globalization, part i. in j. sen, a. anand, a. escobar, & p. waterman (eds.), the world social forum: challenging empires (pp. – ). new delhi: the viveka foundation. retrieved from http://www.choike.org/ /eng/informes/ .html
schumpeter, j. a. ( ). the theory of economic development. cambridge, ma: harvard university press.
sen, a. ( ). human rights and capabilities.
journal of human development, ( ), – . doi: . /
sennett, r. ( ). the craftsman. new haven, ct: yale university press.
sinclair, j. ( [ ]). why i dislike ‘person first’ language. autonomy, the critical journal of interdisciplinary autism studies, ( ), – .
srnicek, n. ( ). platform capitalism. cambridge: polity.
stensrud, a. b. ( ). precarious entrepreneurship: mobile phones, work and kinship in neoliberal peru. social anthropology, ( ), – . doi: . / - .
stewart, s., hansen, t. s., & carey, t. a. ( ). opportunities for people with disabilities in the virtual world of second life. rehabilitation nursing, ( ), – .
taussig, m. ( ). the devil and commodity fetishism in south america. chapel hill: university of north carolina press.
titchkosky, t. ( ). disability: a rose by any other name? ‘person-first’ language in canadian society. canadian review of sociology/revue canadienne de sociologie, ( ), – .
tsing, a. ( ). supply chains and the human condition. rethinking marxism, ( ), – . doi: . /
van doorn, n. ( ). platform labor: on the gendered and racialized exploitation of low-income service work in the ‘on-demand’ economy. information, communication & society, ( ), – . doi: . / x. .
weber, m. ( [ ]). the protestant ethic and the spirit of capitalism. translated by talcott parsons. london: allen & unwin.
weeks, k. ( ). the problem with work: feminism, marxism, antiwork politics, and postwork imaginaries. durham: duke university press.
williamson, b. ( ).
electric moms and quad drivers: people with disabilities buying, making, and using technology in postwar america. american studies, ( ), – .
yanagisako, s. ( ). producing culture and capital: family firms in italy. princeton: princeton university press.

interdisciplinary collaboration: librarian involvement in grant projects

marci d. brandenburg, sigrid anderson cordell, justin joque, mark p. maceachern, and jean song*

librarians are excellent research collaborators, but their participation is not usually considered, which makes access to research funds difficult. the university of michigan library became involved in the university’s novel funding program, mcubed, which supported innovative interdisciplinary research on campus, primarily by funding student assistants to work on research projects. this article discusses three different mcubed projects that all benefited from librarian involvement. these projects spanned many areas, from translational research to systematic reviews to digital humanities. librarian roles ranged from mentoring and project management to literature searching.

introduction

traditionally, librarians have adopted supportive roles in their research collaborations with faculty. while such roles still exist within academic librarianship, there is an increasing emphasis on librarians as partners within research collaborations. these partnerships include grants, systematic review publications (a specific type of comprehensive literature review), and other projects that benefit from librarians’ specialized skillsets.
the ability to contribute funds to a research collaboration creates a more balanced partnership, allowing librarians to more fully contribute to projects with other faculty researchers. the university of michigan (um) university library values collaboration and participation in research, which is evident through the library’s participation in the mcubed program, a recent pilot program designed to fund innovative interdisciplinary research on campus. the university library participated in the program, providing an opportunity for the authors of this paper to propose projects, find interdisciplinary collaborators, and contribute funding to conduct research. most important, because librarians were equal contributors of funding, they engaged in these projects as full collaborators, paving the way for stronger relationships with faculty and future research opportunities.

* marci d. brandenburg is bioinformationist in the taubman health sciences library, department of computational medicine & bioinformatics, and bioinformatics core at the university of michigan; e-mail: mbradenb@umich.edu. sigrid anderson cordell is librarian for english language and literature and justin joque is visualization librarian in the hatcher graduate library at the university of michigan; e-mail: scordell@umich.edu, joque@umich.edu. mark p. maceachern is informationist and jean song is assistant director, research and informatics in the taubman health sciences library at the university of michigan; e-mail: markmac@umich.edu, jeansong@umich.edu. © marci d. brandenburg, sigrid anderson cordell, justin joque, mark p. maceachern, and jean song, attribution-noncommercial (http://creativecommons.org/licenses/by-nc/ . /) cc by-nc. doi: . /crl. . .
the inability to obtain funding is a common barrier to librarian involvement in research initiatives. in , gore et al., after discovering that only a quarter of research articles published in top health sciences library journals identified funding sources, noted that “funding for health sciences library research remains either limited or nonexistent.” yet, at the same time, funding is perceived by association of research libraries (arl) library directors to be one of the most effective mechanisms for promoting research among librarians. further, there is evidence to suggest that funded research is associated with “substantially higher impact” than nonfunded research.

not only did participation in the mcubed program provide librarians with funding opportunities, it also set the stage for meaningful collaborations with nonlibrary faculty across campus, which are generally underreported in the literature.

this paper outlines three interdisciplinary research projects that originated from a unique funding situation that came about at the university of michigan. the projects are diverse, involving librarians from the humanities and health sciences, covering digital literary texts, bioinformatics tools, and evidence from the literature to inform medical decisions. furthermore, the extent to which the librarians were involved in the projects, and the range of responsibilities they took on, suggest to others possibilities for involvement and collaboration in research projects. these projects are examples of contributions to research that redefine librarian roles and help rewrite librarian stereotypes. the projects are examples of successful interdisciplinary collaborations that help fill a gap in the library literature and emphasize the impact librarians have on research when they are made equal partners through funding.
background

in , the university of michigan piloted the mcubed program, which supported innovative, faculty-proposed interdisciplinary research on campus, primarily by funding student assistants to work on research projects. in this way, the program had a strong undergraduate and graduate education component. project proposals had to enlist the support of three faculty members to form a “cube,” with at least two different unit affiliations represented. one goal of mcubed was to provide quick funding for projects; as a result, there was no peer-review process, but it was believed that the requirement to have three faculty from different units provided a level of review in itself. proposals that met the criteria of having the support of three faculty from at least two different departments were then funded at $ , by a random selection process handled by the um’s office of research. this selection process was necessary since more proposals were submitted than could be funded. a majority of the awarded funds had to be allocated specifically to student salaries. the funding period was two years, and unused funds were returned to the funding groups. the funding for this program came from a combination of the um’s provost’s office, um’s rackham graduate school, and all participating um schools/colleges and their faculty. in total, “cubes” were funded.

when the mcubed program was started, librarians were not originally included as faculty contributors in the program and were, in fact, overlooked as potential research partners. this oversight was consistent with the view of librarians as part of a support system rather than as collaborators with equal standing among other faculty. it was only after the mcubed program had been announced and marketed to departments, and a faculty member in english sought a library collaborator, that the idea of including librarians came under consideration.
from the library’s point of view, however, collaborating with faculty researchers was a natural outgrowth of its mission, and the dean of libraries agreed to fully fund four librarians at $ , each, for a total contribution of $ , to the grant funding process.

college & research libraries march

the cubing process

each independent investigator in the program, as defined by his or her unit, received one “token.” each token represented $ , to contribute to a cube. the university library funded four tokens. three tokens from two different units had to be redeemed on a project to form a “cube” (see figure ). a total of percent of the funds had to be used to support undergraduate students, graduate students, or postdocs. successful cubes were funded for a total of $ , by a random selection process. while any investigator who belonged to a department that participated in the mcubed program could submit cube proposals, the departments allocated a specific budget amount and therefore could only fund a specified number of tokens, limiting the number of cubes that could be funded. all cubes that successfully met the criteria of the program (three faculty from at least two departments) were entered into a pool from which cubes were randomly selected to be funded. any cube had a similar chance of being selected from this pool.

the projects

given the maximum number of tokens that each participating um unit chose to fund, there were , fundable tokens available for projects, meaning that cubes could conceivably have been funded, as each project required three tokens. successful funding, however, required not only the formation of a cube with three tokens, but also that the cube be chosen during the random selection process. if a cube was not chosen, the cube dissolved and no funding was received.
all faculty within the um university library had the opportunity to contribute a token to, or create, a project, and all four of the tokens the library agreed to fund were successfully used in cubes. once four library tokens were funded, no more cubes containing a library token could receive funding. as a result, not all submitted cubes that contained a library token received funding. one funded cube included two librarians, therefore using two library tokens; as a result, three cubes that included a total of four library tokens received funding through this program. the three funded cubes were: 1) scientific needs assessment and analysis of bioinformatics tools to support clinical and translational research; 2) core outcomes measures for rotator cuff disorders; and 3) using the digital to read literary texts in context. the following sections describe these fully funded projects and discuss the multiple ways in which librarians contributed to the research enterprise (see table ).

figure. explanation of cube formation: each faculty member received a token, and tokens combined to create a cube.

scientific needs assessment and analysis of bioinformatics tools to support clinical and translational research

this cube consisted of one library token, from the taubman health sciences library’s (thl) assistant director for research and informatics, and two medical school tokens, representing internal medicine and the department of computational medicine & bioinformatics (dcm&b). additional faculty and staff from the library and dcm&b were nontoken collaborators on this project, meaning that they worked on this project but did not contribute funds.

cube discovery process

this project was one of three related cubes that were created with help from the director of informatics infrastructure in dcm&b and the bioinformationist, a bioinformatics librarian specialist in thl.
the bioinformationist originally proposed the idea of creating a cube based on transmart and helped find faculty to participate in the process.

table. comparison of interdisciplinary projects, including the role and significance of librarian involvement and the work accomplished for each project

scientific needs assessment and analysis of bioinformatics tools to support clinical & translational research
  librarian role: project management and mentorship
  work accomplished: bioinformatics tools assessment and creation of instruction resources
  librarian significance: positive student experience, creation of project deliverables

core outcomes measures for rotator cuff disorders
  librarian role: literature searching
  work accomplished: analysis of rotator cuff studies in literature
  librarian significance: highly skilled literature search

using the digital to read literary texts in context
  librarian role: project management and mentorship
  work accomplished: data collection of specific periodicals
  librarian significance: positive student experience, development of transferrable data collection methodologies

transmart is an open source data sharing and analysis platform for furthering translational research developed through a public-private partnership that includes academic institutions, commercial entities, and nonprofit organizations. the bioinformationist, along with dcm&b’s director of informatics infrastructure, identified areas within the transmart project that lacked personnel for development and could be filled with student effort as specified by the cube requirements. these areas included computer programming for data loading purposes, heuristic analyses of workflows and bioinformatics tools, and training material development. after discussions with interested faculty from thl, dcm&b, internal medicine, and other units, three distinct projects were proposed, meaning nine faculty were involved, including two from the library. unfortunately, only one of the two projects with librarian involvement was funded.
the other cube received no funds, and the project did not move forward.

librarian role in cube

despite not being the named librarian collaborator on this cube, the thl bioinformationist was extensively involved in this cube project with support from the thl assistant director for research and informatics, who was the official librarian collaborator. the latter worked with finance to get regular updates on the project budget, while the bioinformationist, whose earlier work as project developer had led to the three original funding proposals, took on the project management role: she facilitated communication between all stakeholders and ensured that the project moved forward. seven students were hired to work on this cube, most of whom were attending um’s school of information. the bioinformationist led the student recruitment effort by writing many of the job descriptions, conducting student interviews, and selecting the successful candidates. she was also the primary supervisor and mentor for four of the students. as such, she worked to ensure that the students had positive educational experiences, while accomplishing the tasks requested of them. they remained busy and were challenged, yet also received appropriate mentorship to help them achieve the desired end results. the bioinformationist and the students often met weekly to discuss their projects and the plan for the upcoming week. confirming task completion and designing new short-term goals were necessary to keep the project moving forward, and these roles fell under the purview of the bioinformationist. in addition, she was the connector between the students and other faculty and staff invested in the work, including programmers working on the transmart code.

work accomplished from funding

the mcubed students accomplished several different projects for the transmart work.
under the project management and mentorship of the bioinformationist, two students conducted an assessment of locally developed bioinformatics tools. this included a literature review for similar resources and a citation analysis for the locally developed tools. the students conducted heuristic evaluations of the tools and developed prototypes of these tools integrated into the transmart platform. also under the mentorship of the bioinformationist, two additional students created instructional resources that included numerous video tutorials for using transmart. the bioinformationist provided feedback on drafts of student-created tutorials and ensured the video topics met the stakeholders’ needs. the video tutorials, which included closed captioning, were made freely available on the transmart foundation’s youtube channel. since no instructional materials existed for helping users load data into transmart, a written manual stepping users through the data loading process was created, in addition to a video tutorial, to fill this need. in addition, a hands-on training session, “introduction to transmart,” was offered at the university of michigan. this training session was cotaught by one of the mcubed-funded students and the bioinformationist. with help from the bioinformationist and the assistant director of research and informatics, students displayed their work as a poster at each of the mcubed annual symposiums, giving them real-world presentation experience.

librarian participation significance

mcubed provided an opportunity for librarians to demonstrate the value of the partnership between librarians and medical school faculty. the librarians provided grant funding, project management leadership, and student mentorship.
the transmart instruction materials were highly valued by faculty in dcm&b and by members of the transmart foundation, as they were important resources that furthered the adoption of the transmart platform. without librarian-initiated grant proposals and project management, this work would not have been accomplished. this project also provided important educational opportunities and real-world work experience for students. one student commented, “in the particular project i worked on, i was able to view the bioinformatics field from a unique angle that allowed me to not only learn about the field itself, but also about creating informative material that can benefit others.” this cube project reinforced the value of librarians as strong grant partners, project leaders, and student mentors in the funded research environment.

core outcomes measures for rotator cuff disorders

cube discovery process

this project was initially conceived of and put forward by a faculty member in the um school of public health. the project consisted of a systematic review, which is a comprehensive literature review that aims to objectively identify, synthesize, and summarize all relevant evidence on a research topic. because well-constructed systematic reviews adhere to a stringent set of methodological standards and processes, the resulting publications tend to have significant influence on health policies and clinical decisions. the thl librarian, known as an “informationist” in the health sciences schools, who became involved in this mcubed project identified the opportunity by proactively seeking systematic review proposals on the mcubed website. upon discovering the proposal, he reached out to the project lead, with whom he had worked previously on projects, and offered to contribute his search expertise and his token to cube the project.
librarian role in cube

the informationist conducted a comprehensive literature search in pubmed, one of the largest biomedical literature databases, and other resources to identify studies on patient-reported and physician-assessed outcome measures for rotator cuff conditions. systematic reviews (sr) differ from other types of reviews because of their rigorous methodology that is in place to reduce subjectivity and bias from all aspects of the review and its analyses. the standards that govern sr methods extend to the search process, and it behooves the searcher(s) and benefits the project to adhere to the search-related standards closely. failure to adhere to the standards or failure to report on essential aspects of the search in the sr manuscript as outlined in the standards can result in poor results and rejected manuscripts. in practical terms, the sr search process includes complex search strategies designed to capture all published and unpublished literature on the research question, which means that specialized knowledge of the resources is essential and that constructing and documenting the search process takes significant time. the searches in this project resulted in most of the data that formed the backbone of the analysis. the informationist documented the searches, kept track of search results and duplicate records, and worked with research assistants on citation management.

as a side project, the informationist assessed the validity of a published pubmed search filter that was created to capture studies pertaining to patient-reported outcome measures. to do so, the informationist created a pool of the approximately citations involving outcome measures the team identified through full-text review, and sought to determine how well the published filter captured the pool of citations known to be relevant.
the informationist then applied the filter to a pubmed search that he created to capture all studies pertaining to rotator cuff conditions. the idea was that if the team created a search that was sensitive enough to capture most rotator cuff papers, then the patient-reported outcome filter could be applied to that search to quickly isolate those rotator cuff studies of interest to the project. in addition to developing the literature searches that supported the analysis, the informationist contributed to discussions about the project plan when appropriate. the other two members of the team, including the project lead, were epidemiologists with extensive experience designing and conducting systematic reviews; they handled most methodological and clinical considerations.

work accomplished from funding

using cube funds, students were hired as research assistants to perform data extraction and analysis of all the rotator cuff studies identified through the literature searches. these efforts led to a description and categorization of instruments and other measures used to assess rotator cuff disorders, which were then used as a basis for developing the core outcome measures. the project was presented at two mcubed symposia, once as an oral presentation and once as a poster.

librarian participation significance

by partnering on the project, the informationist was in a position to demonstrate the importance of librarian involvement in systematic review projects to a nonlibrarian research audience. as literature searches form the basis of systematic reviews, they are an ideal output for demonstrating the value of librarian contributions to research. in fact, research shows that librarian involvement in such projects improves search strategy reporting, an essential component of systematic review publications that adhere to accepted reporting standards.
furthermore, the work generally performed by librarians in systematic reviews is often unreported, despite being significant and often worthy of authorship or published acknowledgement. in this project, the informationist was accepted as a fully integrated member of the team, perhaps in part because of the funding tied to his efforts. regardless, being present at strategic meetings, articulating the importance of search processes and accepted standards, and contributing to methodological discussions about the project help redefine librarian roles in research as those of a partner rather than of support.

interdisciplinary collaboration: librarian involvement in grant projects

using the digital to read literary texts in context

cube discovery process

this project grew out of a series of conversations among the english language and literature librarian, the data visualization librarian, and a faculty member in english who wanted to explore the possibilities of using digital approaches to studying regional literature in its periodical context. both the english language and literature librarian and the faculty member in english work in the field of american periodicals studies, and their scholarly interests overlapped in this project. while many digital humanities projects draw on large corpora of texts to perform what franco moretti has termed “distant reading,” this project explored the kinds of data that could be derived from close readings of texts in an entire run of a periodical.

librarian role in cube

while this project reflected the faculty member’s scholarly interests in american periodicals, the specific object of study emerged from a book manuscript project that the english language and literature librarian was working on related to analyzing regional literary fiction in the context of early twentieth-century california magazines. for this reason, the english literature librarian was able to actively shape the research goals of the project.
likewise, the methodology for data collection was one that the data visualization librarian was working to develop for other researchers to adapt. there has been a growing interest, especially in the humanities, in methods and technology to aid in rigorous, controlled, and collaborative data creation, and this project was an opportunity to explore potential approaches. thus, both librarians played key roles in determining and contributing to the project’s research agenda. in addition to shaping the project’s research agenda, the librarians played a key role in determining the workflow for accomplishing this project, both as a research project at scale and through collaboration with undergraduates, graduate students, and faculty. whereas collaboration is relatively rare in the humanities, librarians bring considerable skills in collaboration and project management.

work accomplished from funding

as determined by mcubed, the majority of the grant funds were spent on undergraduate and graduate researchers who worked as a team on the project. the collaborators, along with a team of undergraduate and graduate student researchers, identified data points to be collected based on common themes and elements in the magazine. a graduate student in information science built a web-based tool for data entry, validation, and storage. in addition to bibliographic information, the data points included thematic, economic, cultural, and geographic information. for example, because the magazine focused on the ethnography of the southwest, the research team recorded identity groups mentioned, activities, and geographic locations. although the data collection phase took the bulk of the funded project time and in fact has only recently been completed, the research team has already begun sorting, sifting, and visualizing the data to look for patterns and networks. the work was presented at two mcubed symposia.
preliminary analyses of the data have revealed unexpected patterns in authorship in the magazine, such as the unusually high number of contributions by female authors, and the project managers are currently outlining additional grant proposals to fund research that builds on the initial dataset.

librarian participation significance

one of the most important lessons learned in this project was the crucial mentoring role that librarians can and do play in graduate education, especially in the field of digital scholarship. although the project leaders had initially intended to hire only a few graduate students to oversee a team of undergraduates, the interview process revealed an eagerness among graduate students to become involved in digital projects. as a result of hiring an interdisciplinary team of advanced undergraduates and graduate students, the project benefited greatly from becoming a collaborative team effort. this insight into the key role of librarians in mentoring graduate students in digital projects was the focus of a recent coauthored chapter published by the librarian for english language and literature and the data visualization librarian.

another significant lesson learned in this project was the enormous investment of resources required by institutions, including libraries, interested in supporting digital scholarship. not only are digital projects necessarily highly collaborative, but they also require financial resources and technological expertise far beyond what is available to the individual researcher who is simply curious about the digital humanities. the project allowed the librarians involved to develop methodologies for data collection in the humanities that will be beneficial to other researchers on campus.
while not all researchers have grant money to invest in staffed projects, librarians can nevertheless advance digital scholarship by contributing subject expertise, as well as project and data management advice, and helping researchers make connections with potential collaborators.

discussion

librarians are often considered support personnel rather than primary collaborators; but the projects discussed in this article demonstrate that librarians can and should be primary collaborators, as they can play valuable roles in research projects. being a funding partner is one method for solidifying librarians as research partners, although obtaining funding is not an easy task for librarians. the authors of this article were able to provide a limited amount of funding and their expertise to three unique projects, showing the range and value of library engagement.

a clearly identifiable theme across all these projects was that each benefited from librarian involvement. mentorship, expert searching, and project management provided by librarians were key to the success of the projects discussed in this article. librarians in academic settings provide students with real-world experiences and opportunities to grow. they train students to use good communication skills, ideal research techniques, suggested data management practices, and more; as a result, librarians are natural mentors. as skilled searchers, librarians are often partners for projects that involve searching, such as systematic reviews. having a librarian on the team ensures an accurate, efficient, and comprehensive search. many librarians seek and deserve authorship for their expert searching role because of the necessary time commitment and the significant intellectual and methodological contributions they can make to a project. regardless of the project, librarians are often equipped to take on more than searching responsibilities, instead providing valuable input in project planning and discussion.
these projects demonstrate how librarians’ project management skills ensure successful outcomes.

the interdisciplinary nature of these projects provided a significant benefit for librarians. the authors established collaborations that otherwise might not have been formed and strengthened existing connections. the library became an equal funding partner, which was meaningful given the importance of grant funding for the authors’ research partners. in addition, the projects and librarian roles discussed in this article show the variety of ways in which librarians can get involved. collaborators learned that librarian engagement leads to positive outcomes, encouraging them to tell their colleagues and collaborate with librarians in the future. in addition, mcubed provided an invaluable experience for librarians to be involved in long-term projects, which allowed the authors to develop methods and ways of working that were transferable at different scales to shorter-term projects.

in academic libraries, a lot of value is placed on sharing work both within and outside the library profession. the projects and the librarian roles discussed in this article have led to a variety of opportunities for disseminating and sharing work. um held an annual mcubed symposium in which each of the cubes presented a poster, sharing their work with the rest of the um community. these symposia were open for anyone to attend and highlighted the work accomplished. librarians worked with the students to create these posters, were either authors themselves or mentioned as primary collaborators, and helped present the posters at the symposia. a book chapter about librarian involvement in transmart discusses the project management and mentorship role played in the transmart mcubed project. this was significant since it stressed the value of librarians embracing such a role.
the authors of this article were also members of a panel on campus showcasing librarian involvement in these projects. the panel provided an opportunity to express the value of librarians as collaborators, in hopes of encouraging more librarians to pursue such opportunities.

although the library’s involvement in mcubed was a success, these projects were not without their challenges. the interdisciplinary nature of the projects was in many ways a strength, but it also meant that each project had a variety of stakeholders with differing interests. maintaining good communication among all stakeholders was not easy and at times created frustrating bottlenecks for information gathering and decision making. another challenge was the timeline, as all librarians found that it took time to get their projects started. this meant that, although the funds for hiring students were available, it was a while before the projects were at the hiring point. as a result, a longer funding period, or a gap between the funding notification and the date the funding became available, would have been extremely useful.

the next cycle of mcubed will be from through , and, due to the library’s success in the first cycle, the library will be participating again. given the interest in including librarians in cubes, for the next cycle the library is funding more tokens, but most tokens will be worth $ , instead of $ , . as a result, a larger number of projects that include librarians can be funded. the more projects that involve librarians, the more integral librarians become in the research process as partners and collaborators.

conclusion

mcubed provided the authors a unique opportunity to be part of collaborative research teams at their institution. as evidenced by this paper, the range of research projects that benefited from librarian collaborators is large and encompassed all disciplines.
in addition, librarians played a variety of roles, ranging from project development and conducting essential literature searches to providing mentorship and project management. librarians should seek out opportunities to be full collaborators, whether via large grants or smaller funding opportunities, such as um’s mcubed initiative. as more researchers recognize and understand the value of library participation, it will become the rule rather than the exception for librarians to be viewed as primary collaborators instead of support personnel.

acknowledgments

the authors would like to thank all the students, faculty, and staff who put effort toward these projects to make them successful. we would like to thank paul trombley for his help designing the figure presented in this article. we would also like to thank the university of michigan’s mcubed initiative for funding support.

notes

jake carlson and ruth kneale, “embedded librarianship in the research context,” college & research libraries news , no. ( ): – ; amalia monroe-gulick, megan s. o’brien, and glen white, “librarians as partners: moving from research supporters to research partners,” in imagine, innovate, inspire: the proceedings of the acrl conference, edited by dawn m. mueller (chicago: association of college and research libraries, ): – ; megan oakleaf, the value of academic libraries: a comprehensive research review and report (chicago: association of college & research libraries, ): – .
sally a. gore, judith m. nordberg, lisa a. palmer, and mary e. plorun, “trends in health sciences library and information science research: an analysis of research publications in the bulletin of the medical library association and journal of the medical library association from to ,” journal of medical library association ( ): – .
elizabeth m. smigielski, melissa a. laning, and caroline m.
daniels, “funding, time, and mentoring: a study of research and publication support practices of arl member libraries,” journal of library administration ( ): – .
dangzhi zhao, “characteristics and impact of grant-funded research: a case study of the library and information science field,” scientometrics ( ): – .
amalia monroe-gulick, m.s. o’brien, and g. white, “librarians as partners,” .
regents of the university of michigan, “about mcubed,” ( ), available online at http://mcubed.umich.edu/about [accessed august ].
mark burns et al., “mcubed: michigan’s revolutionary seed funding program,” mcubed information session presentation ( ), available online at http://mcubed.umich.edu/sites/default/files/files/webversionmcubedinfosessionsmay .pdf [accessed august ].
brian athey, michael braxenthaler, magali haas, and yike guo, “transmart: an open source and community-driven informatics and data sharing platform for clinical and translational research,” amia joint summits on translational science proceedings amia summit on translational science ( ), – ; elisabeth scheufele et al., “transmart: an open source knowledge management and high content data analytics platform,” amia joint summits on translational science proceedings amia summit on translational science ( ), – .
to view the transmart tutorials, see the youtube channel, available online at www.youtube.com/playlist?list=plv ymfogdidhbc_noohkcwcgy_fu hk d [accessed august ].
at thl, librarians have the title “informationist.”
melissa l. rethlefsen, ann m. farrell, leah c. osterhaus trzasko, and tara j. brigham, “librarian co-authors correlated with higher quality reported search strategies in general internal medicine systematic reviews,” journal of clinical epidemiology , no. ( ): – .
jonathan b. koffel, “use of recommended search strategies in systematic reviews and the impact of librarian involvement: a cross-sectional survey of recent authors,” plos one , no. ( ): – .
franco moretti, distant reading (london: verso, ).
sigrid a. cordell, alexa pearce, melissa gomis, and justin joque, “filling the gap: digital scholarship, graduate students, and the role of the subject specialist,” in supporting digital humanities for knowledge acquisition in modern libraries, eds. kathleen sacco, scott richmond, sara parme, and kerrie fergen wilkes (hershey, penn.: igi global, ), – .
marci d. brandenburg, “librarian involvement in transmart, a translational biomedical research platform,” in translating expertise: librarian roles in translational research, ed. marisa conte (lanham, md.: rowman and littlefield, ) – .

word-level language identification in “the chymistry of isaac newton”

levi king, sandra kübler, wallace hooper
indiana university
{leviking,skuebler,whooper}@indiana.edu

introduction

language identification is the task of determining the language of short text snippets, much shorter than those used for, e.g., text classification. in computational linguistics (cl), language identification is generally considered a solved problem, but these methods assume that a text is monolingual, and at least characters long. furthermore, such methods cannot be used for multilingual texts in which the author switches between languages within a sentence, as in the “chymistry of isaac newton” (walsh and hooper ), a collection of alchemical manuscripts written by newton over a - year period beginning in the mid- s.
the team behind the chymistry of isaac newton project at indiana university has transcribed these manuscripts and is publishing a digital scholarly edition at www.chymistry.org. attempts to automatically analyze this corpus, even at basic levels such as pos markup and lemmatization, are difficult because newton frequently switches between english, latin, and french within a paragraph or sentence, as shown in the following sentence: “the short lived & despicable plant [[lat paronychia folio rutaceo [[eng infused in beer, doth wonders in curing the kings evill.” for this reason, we developed a new method for automatically identifying the language for single words rather than for complete texts. this method requires more information because the classification is finer-grained than in standard methods, which have access to more text. there is an additional complication because seventeenth-century english and french allowed many spelling variations, unlike latin, which was fairly standardized.

we first train and test the method on the corpus itself. however, since this corpus is rather small for methods developed in cl, we also investigate whether the method can use either current texts or texts written by newton’s contemporaries. while this approach increases the amount of training data, it is unclear whether the additional data is useful given that all these additional newton-era and modern texts are monolingual, and that the modern english texts will fail to exhibit the large variations in spelling that we see in newton’s manuscripts. our experiments show that using newton’s own texts reaches the highest accuracy of close to %, but using modern text results only in a moderate decrease of % points.

language identification on the document level

all previous work in language identification assumes that each text to be identified is written in a single language. for this task, naïve approaches are often utilized with high success.
the simplest methods use the presence of language-specific characters in a text to identify the language. another method uses lists of the most common words of a language (johnson ). then, the text is classified based on which set of common words occurs most frequently. cavnar and trenkle ( ) use the same method with relative frequencies of n-grams rather than words and reach an accuracy of . % given texts with at least n-grams.

our work is based on work by beesley ( ) and mandl et al. ( ), who also extract n-grams. beesley determines language identity for a whole text of any size by comparing probabilities of bigrams and characters of the individual words for each candidate language and labeling the text as the language most probable for the most words. mandl et al. use n-grams to determine switch points between languages. recent approaches use more sophisticated methods, such as vector-space models (prager ) or multiple linear regression (murthy and kumar ). however, those approaches are difficult to use on the word level.

the data source

the newton alchemical corpus. the newton alchemical corpus comprises approximately , words, drawn from a three-language lexicon of , unique wordforms. newton frequently alternates between english, latin, and french. the collection contains documents written exclusively in either english or latin. these documents were used as training data for our approach. for both english and latin, texts of approximately , words were used as training data. additionally, a list of words was extracted from each monolingual training set and used as a lexicon for that language. french only occurs in the multilingual documents and much more rarely than english or latin in these documents. since there are no documents written exclusively in french, no french training data was available from this source. non-alphabetic elements (e.g., punctuation and numbers) are automatically labeled as non-words.
additionally, the texts include recipes, calculations and figures, and thus contain a large number of alphabetical variables and labels. these items are not relevant for language identification and present potential obstacles for automatic approaches. thus, they were excluded from training/testing. any string of letters not containing a vowel was determined to be a non-word. for testing, we selected six texts ( , words) that contained a high degree of switches between languages and annotated them manually for the languages used. three texts ( , words) were used for optimizing parameters, and three more texts ( , words) for testing. note that the test set does not contain any french words.

other texts: for english texts from newton's era, we used excerpts of francis bacon's the new atlantis ( ) and essayes or counsels, civill and morall ( ) and robert boyle's the sceptical chemist ( ) and experiments and considerations touching colours ( ). newton-era latin texts were excerpted from rene descartes' meditationes de prima philosophia ( ), benedict de spinoza's ethica ( ), and carl von linne's species plantarum ( ). the modern day training set for english was extracted from the los angeles times and the washington post stories from .

word-based language identification: the newton corpus

our approach assumes that a particular document to be identified contains one or more of the languages used in the corpus: english, latin, and french. we automatically segment the texts into words, extract all n-grams per word, and calculate the relative frequencies of the n-grams in each language (normalized for capitalization). figure illustrates this process for the latin word “ignis”.

figure : extraction of bigrams (left) and comparison of relative frequencies. $ and # mark word boundaries.

we determine a language score by averaging over all n-gram probabilities of a word.
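the extraction and scoring steps just described can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy word lists, the choice of `$` as the start marker and `#` as the end marker, and the use of zero for unseen n-grams are all assumptions made here for the example.

```python
from collections import Counter


def char_ngrams(word, n=2, start="$", end="#"):
    """Character n-grams of a word, lowercased and padded with boundary markers.

    Which of $ and # marks the start is an assumption of this sketch.
    """
    padded = start + word.lower() + end
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train_profile(words, n=2):
    """Relative frequency of each n-gram over a monolingual word list."""
    counts = Counter(g for w in words for g in char_ngrams(w, n))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def language_score(word, profile, n=2):
    """Average relative frequency of the word's n-grams; unseen n-grams count as 0."""
    grams = char_ngrams(word, n)
    if not grams:  # word too short to yield any n-gram of this size
        return 0.0
    return sum(profile.get(g, 0.0) for g in grams) / len(grams)


# Toy training data (hypothetical; real profiles need far larger monolingual texts).
latin = train_profile(["ignis", "aqua", "terra", "signum"])
english = train_profile(["fire", "water", "earth", "sign"])

print(char_ngrams("ignis"))
print(language_score("ignis", latin) > language_score("ignis", english))
```

with profiles this small the scores are only illustrative; the comparison between languages is the point, not the absolute values.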
since there is no training data for french, we use only english and latin for training, with a threshold: first, the scores for english and latin are determined. if neither the english nor the latin probability exceeds a pre-determined threshold, the word is determined to be french. this corresponds to the intuition that if the n-grams of a word are rare in both english and latin, then that word is unlikely to be from those languages but from a different language.

the final decision also takes the language label of the previous word into account. if the current word is in the lexicon of the language of the previous word, the current word is tagged as that language. if the word is not in the lexicon, we consider the language identity probabilities of the previous word by adding a proportion of that probability to the probability that the current word is english, and do the same for latin. this decision captures the tendency of words to belong to the same language as the words in the immediate context, while allowing for the possibility of switches. at the beginning of a sentence, the threshold is higher than between words.

performance on the current language identification task is defined as accuracy: the percentage of words in the test texts (excluding non-words) with correct language labels. ultimately, we found -grams to be the best performing setting.

training set      accuracy
newton eng/lat    . %

table : results for the newton corpus.

the results in table show that we reach an accuracy of . %. this is lower than the results reported for language identification on full documents, but the task is more difficult. the word misclassified most often is a genuinely ambiguous word, “in”. in general, the words most frequently misclassified are short ( - characters).
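the decision procedure described earlier in this section (non-word filtering, lexicon lookup against the previous word's language, context carry-over, and the french fallback) can be sketched as follows. this is a hedged illustration only: the order of the checks, the parameter names `threshold` and `carry`, their default values, and the treatment of "y" as a vowel are assumptions of this sketch, since the text does not report them.

```python
VOWELS = set("aeiouy")  # counting "y" as a vowel is an assumption of this sketch


def is_non_word(token):
    """Non-alphabetic tokens and letter strings without a vowel are non-words."""
    return not token.isalpha() or not (set(token.lower()) & VOWELS)


def tag_word(word, scores, prev_label, prev_scores, lexicons,
             threshold=0.01, carry=0.3):
    """Label one word given its per-language n-gram scores and the previous word.

    scores / prev_scores: dicts such as {"english": 0.02, "latin": 0.03}.
    Returns (label, scores_to_carry_forward).
    """
    word = word.lower()
    if is_non_word(word):
        return "non-word", scores
    # 1. A lexicon hit in the previous word's language decides immediately.
    if prev_label in lexicons and word in lexicons[prev_label]:
        return prev_label, scores
    # 2. Otherwise add a proportion of the previous word's scores to each language.
    adjusted = {lang: s + carry * prev_scores.get(lang, 0.0)
                for lang, s in scores.items()}
    best = max(adjusted, key=adjusted.get)
    # 3. If no trained language clears the threshold, fall back to French.
    if adjusted[best] <= threshold:
        return "french", adjusted
    return best, adjusted


# Hypothetical lexicons and scores, for illustration only.
lexicons = {"english": {"in", "beer"}, "latin": {"folio"}}
prev = {"english": 0.01, "latin": 0.05}

print(tag_word("folio", {"english": 0.02, "latin": 0.03}, "latin", prev, lexicons)[0])
print(tag_word("rutaceo", {"english": 0.01, "latin": 0.03}, "latin", prev, lexicons)[0])
# Sentence-initial word: no previous context, and a higher threshold is passed in.
print(tag_word("oiseaux", {"english": 0.001, "latin": 0.002}, None, {}, lexicons,
               threshold=0.02)[0])
```

passing the threshold as a parameter lets the caller supply the higher value at sentence starts that the text mentions, without hard-coding a second constant.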
word-based language identification: using other corpora for training

since the training set from the newton corpus is rather small, we also investigated using either training texts from newton’s era, or modern corpora. as no modern latin is available, we used the newton-era latin texts.

training set                 accuracy
newton                       . %
newton + newton-era texts    . %
newton-era texts             . %
modern texts                 . %

table : results when other corpora were used for training.

the results in table show that using the small set of texts by newton gives the highest accuracy. adding newton-era texts does not result in the expected increase in accuracy. instead, accuracy decreases minimally from . % to . %. using only newton-era texts decreases accuracy by approximately %. using modern texts also results in a small decrease in accuracy. however, our method does not suffer much from using modern texts, which suggests that the information about character differences between languages does not heavily depend on the changes in spelling.

conclusion

we presented a novel method for identifying language on individual words in multilingual texts. we have shown that the method reaches an accuracy of . % when trained on monolingual texts from the same author. however, if no such texts are available, other texts from the same era, or even current texts can be used with only a minor degradation in performance.

references

beesley, kenneth r. "language identifier: a computer program for automatic natural-language identification of on-line text." languages at crossroads: proceedings of the th annual conference of the american translators association ( ): - .
cavnar, william b. and john m. trenkle. "n-gram-based text categorization." proceedings of third annual symposium on document analysis and information retrieval ( ): - .
johnson, stephen. "solving the problem of language identification." technical report. ( ). school of computer studies, university of leeds.
mandl, thomas, margaryta shramko, olga tartakovski, christa womser-hacker ( ): “language identification in multi-lingual text documents”. proceedings of the th international conference on applications of natural language to information systems (nldb ), klagenfurt, austria. springer lecture notes in computer science : - .
murthy, kavi narayana and g. bharadwaja kumar. "language identification from small text samples." journal of quantitative linguistics ( ): - .
walsh, john a. and wallace edd hooper. "the liberty of invention: alchemical discourse and information technology standardization." literary and linguistic computing ( ): - .

interactivity, distributed workflows, and thick provenance: a review of challenges confronting digital humanities research objects

katrina fenlon (kfenlon@umd.edu; https://orcid.org/ - - - )

introduction

while research objects (ros) are primarily oriented toward scientific research workflows, the ro model and parallel approaches have gained some uptake in the humanities, enough to suggest their potential to undergird sustainable, networked humanities research infrastructures. digital scholarship in the humanities takes a great variety of forms that range widely beyond traditional publications, and which incorporate narratives, media, datasets and interactive components, any of which may be physically dispersed as well as dynamic and evolving over time. despite the rapid growth of digital scholarship in the humanities, most existing research infrastructures lack support for the creation, management, sharing, maintenance, and preservation of complex, networked digital objects. ros, and the community and tools that are growing around ros, offer a potential, partial solution.
while the concept of the ro has seen significantly more uptake in the humanities than has the formal data model (bechhofer, ; belhajjame et al., ), several compelling applications of the concept suggest that the time is ripe for considering broader integration of the model into distributed infrastructures. these applications include platforms for data sharing and collaborative scholarship, platforms for digital and semantic publishing, and digital repositories in several domains. this paper reviews existing applications of the ro model to identify challenges confronting the application of ros to humanities digital scholarship.

this paper builds on fenlon ( ), which investigated the application of the ro model to digital humanities collections, and which identified three promising strengths of the model for the realm of digital humanities: ( ) ros readily perform the most essential function of a collection: to aggregate related resources in order to support scholarly objectives; ( ) ros have the capacity for explicit, semantic descriptions of interrelationships among components that are often hidden in digital humanities collections (and therefore vulnerable to dissolution); and ( ) the ro model accommodates aggregations of linked data, offering researchers the opportunity to create and annotate virtual, fully referential collections. having identified some strengths and limitations of the ro model for digital humanities collections through one experimental application of the model, this paper builds on that analysis by reviewing the literature on ros in the humanities and examining a range of applications of the ro and similar models within humanities and cultural heritage domains.
this paper frames the review around three main challenges and their implications for future implementations of ros to support digital research in the humanities: first, digital humanities scholarship requires specialized interactive use, so realizing the advantages of ros for the humanities will depend on implementations that create platforms for experimentation and development by communities. second, the idiosyncratic workflows employed in the construction of networked humanities scholarship mean that workflow-oriented ros will not gain significant uptake in the humanities unless they can capture distributed, sociotechnical workflows in meaningful ways. third, humanities ros will require capturing provenance in ways and at a level of detail that may be unfamiliar given the ro model's scientific origins; humanities scholarship requires “thick,” multilayered, context-rich provenance descriptions that can accommodate conflicting assertions and formalize uncertainty.

challenge . essential interactivity for specialized use

much of humanities digital scholarship is essentially interactive. new modes of production and publication in the humanities are intended for user interaction or participation, and for dynamic and responsive representation based on research context. digital collections and archives, digital editions, maps, models, and simulations, and other modes of digital scholarship all rely on interactive components to express their interpretive contributions, or to enact their scholarly purposes. the interactive and dynamic components of digital scholarship include things like customized browsing and searching facilities that take advantage of extensive, rich scholarly encodings and annotations; platforms for collaborative annotation; dynamic maps and visualizations; etc.
such components are intended to do multiple things at once: to make arguments, to manifest interpretive stances, to enable knowledge transfer, and simultaneously to serve as active platforms for ongoing interpretation and research (palmer, ; fenlon, ; and others). prior empirical work on applying the ro model to digital humanities collections found that the model's main limitation for such collections is that functional components, designed for ongoing end-user interaction, are not usefully captured in a basic ro model and instead fall to the implementations built on top of research-object management systems (fenlon, ). ros can, of course, accommodate objects that are intended to be interactive as flat code; and ros have been employed for this purpose to support data migration and archiving (e.g., the ro bagit profile). but the purpose of digital humanities scholarship is to be alive and functional, and making ros useful in this domain will require implementations that support platforms for flexible, participatory development. in a conceptual sense, the ro model has demonstrated value for this kind of platform approach in the humanities. the perseids project offers a platform for sharing and peer-review of the transcriptions, annotations, and analyses that constitute research data in the classics. the perseids architecture is built around the concept of data publications, which are modeled as collections of related data objects. the perseids team explicitly relates the data publication model to the ro model (almas, ).
like ros, perseids data publications weave in several domain standards (including the tei epidoc schema, w c web annotation, and others) to undergird an infrastructure that supports scholarly requirements specific to the classics domain: transcription, fine-grained annotation, collaborative editing (with versioning), a research environment that facilitates data-type-specific extensions, and tailored workflows for peer review (almas, ). similarly, the ceres (community enhanced repository for engaged scholarship) toolkit, created by the northeastern university libraries digital scholarship group, explicitly draws on the concept of the ro in its system for supporting networked humanities scholarship and publishing. ceres allows digital humanities creators to build custom publications that pull objects from different repositories using apis (including the northeastern university libraries’ digital repository service and the digital public library of america) (sweeney, flanders & levesque, ). it is unclear how the ro model may fit into the broader, more diversified landscape of linked data and the semantic web in cultural institutions and in the humanities, but the conceptual fit within digital scholarship is established. ros and similar models have substantial potential to underpin systems that support a variety of implementations. realizing the advantages of ros for the humanities will depend on implementations that create platforms for experimentation and collaborative development by distributed communities (fenlon, ). such platforms must accommodate dynamic interface-building, to allow scholarly communities with distinctive interests and needs to mobilize ros in different ways. they must also accommodate participation and co- creation through contributions of linked-data annotations and enrichments, including linking among ros and the concepts and entities within ros. challenge . 
distributed and idiosyncratic workflows of networked humanities scholarship humanities digital scholarship is increasingly networked: heavily interconnected with and dependent on external resources for functionality and meaning. many digital humanities publications in various forms—monographs, multimedia productions, exhibits, collections—draw on, reference, embed, and patch together distributed resources called from other collections, often via api. for example, a collection may center on a set of high-resolution images of primary sources, which are called from another digital library’s iiif image server. some of the longest- running, large-scale cultural heritage digital libraries (including europeana and the digital public library of america) are aggregations of descriptive surrogates, which link to original content hosted externally. externally maintained schemas, authorities, and utilities undergird digital editions. visualization and mapping projects generate content using external services. and with the growth of linked data in cultural collections, projects increasingly leverage external data sources as primary content, to which scholars then add layers of interpretive narrative, annotations, context, and interconnection. humanities workflows rarely happen in self-contained or end-to-end research infrastructures, thwarting the possibility of sufficiently rich, automatic workflow capture. indeed, efforts to build a workflow-oriented, unified cyberinfrastructure for supporting humanities scholarship tend to founder (e.g., dombrowski, ). however, niche, task- or domain-specific infrastructures can capture constrained workflows. for example, in the domain of musicology, page et al. 
( ) observe how digital editions and annotations of encoded works are “manifestations of workflows deployed in musicological scholarship,” and offer a compelling framework for representing musical ros, which include images, text, audio, and encoded music (page et al., ; de roure et al., ). computational workflows are readily captured within humanities research environments, and ros have come into play for this purpose. for example, the hathitrust research center data capsule environment is moving toward systematic provenance-capture for computational text analysis workflows. these workflows take as inputs worksets (jett et al., ), which are conceptually and technically akin to ros: aggregate digital objects that implement addressability for and relational expressivity among components using domain ontologies. unlike ros, worksets are envisioned as the inputs of workflows in the current model of the hathitrust data capsule environment, rather than encompassing whole research workflows (murdock et al., ). but workflow-oriented ros will not gain significant uptake in humanities contexts unless they can also capture and make useful more complex, distributed, sociotechnical workflows in meaningful ways. with their capacity for linked data using domain vocabularies, ros readily accommodate many of the artifacts of networked digital scholarship in the humanities, along with their interrelationships (fenlon, ). but can ros accommodate humanities workflows in useful ways? 
in their effort to undergird dariah (pan-european infrastructure for digital arts and humanities research) through the systematic production of humanities ros, blanke and hedges ( ) observed that humanities scholars employ sequential workflows, but “except in relatively specialised cases we rarely encountered workflows that could be automated, shared with and used by others, such as occur in many scientific disciplines.” while auto-generated and computer-useable workflows may not apply to most humanities research processes, formally characterized, (semi-)manually captured workflows would be highly useful for review, validation, archiving, reproducibility, reuse, and other purposes. while the ro model has the capacity and flexibility for complex workflow representation, more research is needed to characterize humanities workflows; to identify how such characterizations can be made useful; and to identify model extensions and unique implementation strategies workflows might require in different domains.

challenge . thick provenance

drilling down on the problem of workflow capture, digital humanities scholarship places special demands on data provenance: not only on the provenance of digital resources (such as files, compound objects, datasets) or components thereof (such as passages of music, paragraphs of a text, or lines of a poem), but also on the provenance of attached, contextual information. archival artifacts, the evidence of the humanities, often possess simultaneous, multiple and parallel provenances (gilliland, ; hurley, ). documenting the provenance of the evidence itself can be complicated, but beyond that, the provenance of the provenance must also be documented. any assertion made about any artifact (in the form of metadata or annotation), or any contextual and secondary information attached to artifacts in the context of digital scholarship, requires provenance. annotations and metadata are often, in the humanities, products of scholarly, interpretive work.
therefore, each annotation or metadata proposition itself is subject to claims of authorship, competing perspectives, expression of uncertainty, and further annotation—all requiring provenance information. because provenance is a multilayered thing in humanities scholarship, different humanities disciplines and subdisciplines may require domain-specific provenance schemas and standards, which specialize existing standards for the expression of the provenance of different kinds of resources, ranging from digital media files to annotations. humanities ros will require thick, multilayered, context-rich provenance descriptions, which can accommodate conflicting assertions and formalize uncertainty. it is unclear whether existing implementations of the ro model can accommodate this level of description, though the model itself has the capacity. the researchspace environment (oldman and tanase, ) offers exemplary support for documentation of thick, multifaceted provenance of humanities ros. researchspace is an open- source platform created by the british museum to facilitate scholarly data sharing, formal argumentation, and semantic publishing within communities of researchers. researchspace does not directly employ the ro model, though its architecture does rely on aggregates of linked data, taking advantage of related standards including w c web annotation and linked data platform containers. in this environment, provenance and argumentation are expressed using the cidoc-crm specialization crminf (the argumentation model). scholars can use this vocabulary to build narratives and thick descriptions around digital ros through annotation and data-linking. these narratives of provenance allow and formalize the expression of uncertainty and competing perspectives, and the environment also serves to document the scholarly work that goes into building these narratives (researchspace team, ). 
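the layering described above (assertions about artifacts, each with its own author, stated uncertainty, and further provenance) can be pictured as a small recursive data structure. the following is only a minimal sketch under stated assumptions: the class name `Assertion`, its fields, the 0-to-1 certainty scale, and all identifiers and dates in the example are hypothetical illustrations, and the code does not use or reproduce prov, web annotation, crminf, or the researchspace data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Assertion:
    """One interpretive claim about a target (hypothetical structure)."""
    target: str                        # identifier of the thing the claim is about
    claim: str                         # the proposition being asserted
    asserted_by: str                   # who makes the claim
    certainty: Optional[float] = None  # assumed 0..1 scale; None means not stated
    basis: List["Assertion"] = field(default_factory=list)  # provenance of this claim


def conflicting(assertions: List[Assertion], target: str) -> List[Assertion]:
    """Return all assertions about `target` if they disagree, else an empty list."""
    about = [a for a in assertions if a.target == target]
    distinct_claims = {a.claim for a in about}
    return about if len(distinct_claims) > 1 else []


def depth(a: Assertion) -> int:
    """Number of provenance layers recorded beneath an assertion (0 = none)."""
    return 0 if not a.basis else 1 + max(depth(b) for b in a.basis)


# Invented example: two scholars date the same letter differently.
letter = "artifact:letter-17"
catalogue = Assertion("source:catalogue-entry", "entry dates the letter to 1642", "archivist-a")
a1 = Assertion(letter, "written in 1642", "scholar-a", certainty=0.6, basis=[catalogue])
a2 = Assertion(letter, "written in 1645", "scholar-b", certainty=0.8)

disputed = conflicting([a1, a2], letter)
```

both competing claims are kept rather than one overwriting the other, and `depth(a1)` reports that the first claim rests on a further documented source, which is the "provenance of the provenance" in miniature.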
the reasons for highlighting the researchspace approach to provenance in this review of humanities ros are ( ) to exemplify the unique demands of formalizing humanities provenance, and ( ) to exemplify the highly distinctive, domain-specific implementation requirements that confront the ro and other domain-independent data models. describing humanities provenance will require vocabularies to express argument and belief, as oldman et al. ( ) observe. beyond the ro model’s use of prov and web annotation, humanities provenance will demand domain-specific argumentation extensions such as crminf. it is clear that ros can theoretically accommodate thick provenance description, just as they can theoretically accommodate the representation of highly complex workflows, but can they usefully undergird implementations that are centered in humanities research needs? the researchspace interface is tailored toward knowledge work, toward the collaborative construction of multifaceted provenance descriptions, without requiring users to code or gain expert-level knowledge of domain ontologies. tools for the authorship of humanities ros, or tools that implement ros behind the scenes, may benefit from taking the same approach.

conclusion

ros make a great deal of sense for modeling cultural information; skeletons of a similar shape (the simple and powerful combination of aggregation and annotation to represent compound digital objects) already structure large-scale cultural data aggregations, e.g., through the europeana data model and the digital public library of america metadata application profile, which are both founded on ore:aggregations plus oa:annotations. but the challenges confronting widespread application of the ro model to humanities digital scholarship are significant. this review of existing applications has identified three central challenges: .
digital humanities scholarship requires specialized interactive use, so realizing the advantages of ros for the humanities will depend on implementations that create platforms for experimentation and development by communities. . the idiosyncratic workflows of networked humanities scholarship mean that workflow-oriented ros will not gain significant uptake in the humanities unless they can capture distributed, sociotechnical workflows in meaningful ways. . humanities ros will require thick, multilayered, context-rich provenance descriptions that can accommodate conflicting assertions and formalize uncertainty, along with implementations that support the documentation of such provenance. in particular, characterizing and formally expressing diverse humanities workflows, along with the provenance of data and contextual information within those workflows, presents the most urgent challenge and exciting opportunity for the future of humanities cyberinfrastructure. to many stakeholders in humanities cyberinfrastructure, “workflows are the new content” (dempsey, ; baynes et al., ; schonfeld and waters, ). while research on workflows is underway on multiple fronts (including liu et al., ), it is clear already that there will be significant semantic differences between conceptual and technical elements in scientific workflows (and provenance) and those in the humanities; and these differences will affect the implementation of ros for humanities research. historically, attempts to implement scientific research infrastructures (including data models like the ro model) to support humanities scholarship have hit an obstacle in the form of semantic gulfs. for example, in the linking and querying ancient texts (laquat) project, an effort to transfer escience infrastructure in support of a humanities virtual research environment, anderson and blanke observed a fundamental challenge in integrating humanities data from different databases.
they located the solution to that problem in humanities research communities: “integrating humanities research material...will require researchers to make the connections themselves, including decisions on how they are expressed and how to understand and explore the data more effectively” (anderson and blanke, ). oldman et al. ( ), reviewing the state of linked data in the humanities, observed that basic linked data publication for many kinds of humanities sources can be counterproductive, “unless adapted to reflect specific methods and practices, and integrated into the epistemological processes they genuinely belong to.” this caution resonates with the challenges identified for the adoption of the ro model—or indeed for the importation of any data model, even domain- independent data models—into the humanities. the main challenges to implementing ros for humanities research also present exciting opportunities for a more sustainable cross-disciplinary infrastructure (fenlon, ), but implementation strategies must be centered in scholarly communities, and grow out from the practices, needs, and epistemologies of specific areas of study in the humanities and cultural institutions. references almas, b. ( ). perseids: experimenting with infrastructure for creating and sharing research data in the digital humanities. data science journal, ( ). https://doi.org/ . /dsj- - anderson, s., & blanke, t. ( ). taking the long view: from e-science humanities to humanities digital ecosystems. historical social research / historische sozialforschung, ( ( )), – . baynes, m. a., sommer, d., melley, d., & lickiss, t. ( , april). workflow is the new content: expanding the scope of interaction between publishers and researchers. panel presentation presented at the society for scholarly publishing. 
retrieved from https://www.sspnet.org/events/past-events/workflow-is-the-new-content-expanding-the- scope-of-interaction-between-publishers-and-researchers/ bechhofer, s., buchan, i., de roure, d., missier, p., ainsworth, j., bhagat, j., … goble, c. ( ). why linked data is not enough for scientists. future generation computer systems, ( ), – . https://doi.org/ . /j.future. . . belhajjame, k., zhao, j., garijo, d., gamble, m., hettne, k., palma, r., … goble, c. ( ). using a suite of ontologies for preserving workflow-centric research objects. journal of web semantics, , – . https://doi.org/ . /j.websem. . . blanke, t., & hedges, m. ( ). scholarly primitives: building institutional infrastructure for humanities e-science. future generation computer systems, ( ), – . https://doi.org/ . /j.future. . . de roure, d., klyne, g., page, k., pybus, j., weigl, d. m., & willcox, p. ( , july). digital music objects: research objects for music. presented at the research object workshop (ro ) at ieee escience conference . retrieved from https://zenodo.org/record/ #.xb chc khhe dempsey, l. ( , october). the library in the life of the user: two collection directions. education. retrieved from https://www.slideshare.net/lisld/the-library-in-the-life-of-the- user-two-collection-directions dombrowski, q. ( ). what ever happened to project bamboo? literary and linguistic computing, ( ), – . https://doi.org/ . /llc/fqu fenlon, k. ( ). thematic research collections: libraries and the evolution of alternative scholarly publishing in the humanities (doctoral dissertation, university of illinois at urbana-champaign). retrieved from http://hdl.handle.net/ / fenlon, katrina. ( ). modeling digital humanities collections as research objects. presented at the acm/ieee joint conference on digital libraries . retrieved from https://hcommons.org/deposits/item/hc: / gilliland, a. j. ( ). conceptualizing st-century archives. ala editions. hurley, c. ( ). 
parallel provenance [series of parts]: part : what, if anything, is archival description?. [an earlier version of this article was presented at the archives and collective memory: challenges and issues in a pluralised archival role seminar ( : melbourne).]. archives and manuscripts, ( ), . jett, j., cole, t. w., & downie, j. s. ( ). exploiting graph-based data to realize new functionalities for scholar-built worksets. proceedings of the association for information science and technology, ( ), – . https://doi.org/ . /pra . . liu, a., kleinman, s., douglass, j., thomas, l., champagne, a., & russell, j. ( ). open, shareable, reproducible workflows for the digital humanities: the case of the humanities.org “whatevery says” project. presented at the digital humanities (dh ). retrieved from https://dh .adho.org/abstracts/ / .pdf murdock, j., jett, j., cole, t., ma, y., downie, j. s., & plale, b. ( ). towards publishing secure capsule-based analysis. proceedings of the th acm/ieee joint conference on digital libraries, – . retrieved from http://dl.acm.org/citation.cfm?id= . oldman, d., doerr, m., & gradmann, s. ( ). zen and the art of linked data. in a new companion to digital humanities (pp. – ). https://doi.org/ . / .ch oldman, d., & tanase, d. ( ). reshaping the knowledge graph by connecting researchers, data and practices in researchspace. in d. vrandečić, k. bontcheva, mari carmen suárez-figueroa, v. presutti, i. celino, m. sabou, … e. simperl (eds.), the semantic web – iswc (pp. – ). retrieved from https://link.springer.com/chapter/ . % f - - - - _ page, k., lewis, d., & weigl, d. ( ). contextual interpretation of digital music notation. presented at the digital humanities (dh ), montréal, canada. palmer, c. l., teffeau, l. c., & pirmann, c. m. ( ). scholarly information practices in the online environment: themes from the literature and implications for library service development. 
retrieved from oclc research and programs website: http://www.oclc.org/content/dam/research/publications/library/ / - .pdf researchspace team, british museum. ( , december). moving from documentation to knowledge building: researchspace principles and practices. presented at the stiftung preußischer kulturbesitz (prussian cultural heritage foundation) berlin. retrieved from https://www.researchspace.org/docs/berlin.pdf schonfeld, r. c., & waters, d. ( , april). the turn to research workflow and the strategic implications for the academy. presented at the coalition for networked information (cni) spring membership meeting, san diego, ca. retrieved from https://vimeo.com/ sweeney, s. j., flanders, j., & levesque, a. ( ). community-enhanced repository for engaged scholarship: a case study on supporting digital humanities research. college & undergraduate libraries, ( – ), – . https://doi.org/ . / . . frankfurt the wild west: promoting digital scholarship at the university of colorado boulder thea lindquist | cu boulder | thea.lindquist@colorado.edu | @lutefisk digital humanities task force center for research data & digital scholarship first outcomes references in , a group of librarians and technologists on the cu boulder campus sought to understand researcher interest and needs in digital humanities. we discovered that digital scholars were scattered across disciplines, often with little interaction with each other or knowledge of available expertise and resources on campus. the task force recommended the creation of a digital scholarship center to act as a hub to bring a dispersed network of researchers, expertise, and resources together. at the same time, campus research computing was thinking about creating a center for data analytics and visualization. we merged our visions into a campus research center, crdds, which launched in june : http://www.colorado.edu/crdds/ center for research data and digital scholarship at the university of colorado boulder: . /bul . 
. dh+cu – future directions for digital humanities at cu boulder: http://scholar.colorado.edu/libr_facpapers/ / image credit: paul conrad, los angeles times, .

here are some of the ways in which crdds is meeting or plans to meet identified needs:
- fall seminar series
- digital humanities graduate certificate
- student visualization contest
- space for workshops and collaboration in main library
- integrating research data infrastructure
- ongoing information-gathering on user needs

due to the interest exhibited across the disciplinary spectrum, we realized that using the term digital scholarship was more appropriate in our context. in interviews, some researchers described a hardscrabble existence where they felt compelled to “pull themselves up by their bootstraps.” building community was high priority. the needs faculty and student researchers identified were diverse. on a campus as decentralized as cu boulder, no one group could hope to support them.

digitally-mediated practices of geospatial archaeological data: transformation, integration, & interpretation

. introduction

digital archaeology has burgeoned over the past decade, with archaeologists tending to focus on data acquisition tools such as terrestrial laser scanning (remondino et al. ), airborne lidar (chase et al. ; prufer, thompson & kennett ; von schwerin et al. ), photogrammetry (saperstein ), or on visualization using virtual and augmented reality. recently, scholars are calling for greater introspection of digital practices, mirroring late s pushes for geographic information systems (gis) to go “beyond the map” (aldenderfer and maschner ; forte ; lock ; maschner ).
these calls ask archaeologists to shift focus from digital data acquisi- tion to the unique affordances of the digital for archaeo- logical research questions (gunnarsson ; huggett ; roosevelt et al. ). while new conversations are evolving that address impacts in both the digital humani- ties and digital archaeology (e.g. benardou et al. ; dal- las ; holdaway, emmitt, phillipps & masoud-ansari ; macfarland and vokes ; wright and richards ), effects of the digital on archaeological practice and scholarship remain understudied. for example, initial conversations on preservation of cultural heritage mate- rials have shifted to include access and reuse—focusing not simply on making data available for future inspection, but also preparing them for contemporary reuse (clarke ; esteva et al. ; lukas, engel & mazzucato ; macfarland and vokes ; ullah ; witcher ; wylie ). in this vein, we focus on transforming analog legacy data to digital geospatial data (i.e. data that have real-world spatial reference) for the purpose of “min[ing] old data sets for new insights that redirect inquiry” (wylie : ). specifically, we ask: what can we learn as we convert analog data to geospatial data? and, how does digital data transformation, integration, and interpreta- tion impact archaeological practice and scholarship? archaeological fieldwork and lab work involve digital data acquisition, for example, capturing global positioning system (gps)/global navigation satellite system (gnss) points, digital photos, and d point clouds; however, digitization also involves capturing data by scanning analog data, particularly legacy data such as field notes, photographs, or drawings to a digital format richards-rissetto, h and landau, k. ( ). digitally-mediated practices of geospatial archaeological data: transformation, integration, & interpretation. journal of computer applications in archaeology, ( ), pp.  – . doi: https://doi.org/ . /jcaa. 
* university of nebraska-lincoln, us † alma college, us corresponding author: heather richards-rissetto (richards-rissetto@unl.edu) position paper digitally-mediated practices of geospatial archaeological data: transformation, integration, & interpretation heather richards-rissetto* and kristin landau† digitally-mediated practices of archaeological data require reflexive thinking about where archaeology stands as a discipline in regard to the ‘digital,’ and where we want to go. to move toward this goal, we advocate a historical approach that emphasizes contextual source-side criticism and data intimacy—scrutiniz- ing maps and d data as we do artifacts by analyzing position, form, material and context of analog and digital sources. applying this approach, we reflect on what we have learned from processes of digitally-mediated data. we ask: what can we learn as we convert analog data to digital data? and, how does digital data transformation impact the chain of archaeological practice? primary, or raw data, are produced using various technologies ranging from global navigation satellite system (gnss)/global posi- tioning system (gps), lidar, digital photography, and ground penetrating radar, to digitization, typically using a flat-bed scanner to transform analog data such as old field notes, photographs, or drawings into digital data. however, archaeologists not only collect primary data, we also make substantial time invest- ments to create derived data such as maps, d models, or statistics via post-processing and analysis. while analog data is typically static, digital data is more dynamic, creating fundamental differences in digitally-mediated archaeological practice. to address some issues embedded in this process, we describe the lessons we have learned from translating analog to digital geospatial data—discussing what is lost and what is gained in translation, and then applying what we have learned to provide concrete insights to archaeological practice. 
keywords: digitally-mediated archaeology; geospatial; archaeological practice; historical approach; mesoamerica; data intimacy; paradata journal of computer applications in archaeology richards-rissetto and landau: digitally-mediated practices of geospatial archaeological data (allison ; gaffney, stančič & watson ; smith ; wylie ). as archaeologists, we collect primary, or ‘raw’ data of extant archaeological features and artifacts that we often use to create ‘derived’ data such as maps, d models, and statistics based on post-processing, analysis, and interpretation (beale and reilly ; costa et al. ; faniel et al. ; huggett ; kansa and kansa ; kintigh et al. ). these raw data often need to be digitized (changed from analog to digital) to be useful for digital technologies and methods. archaeologists employ numerous software tools, including databases, geographic information systems (gis), photogrammetric tools, and d environments, to process, integrate, and analyze archaeological data; in other words, we transform analog and digital ‘items’ to create new (derived) data for archaeological research. additionally, we create ‘natively digital,’ ‘digital-first,’ ‘digital-exclusive,’ or ‘intrinsic born’ digital data (austin ). in contrast to digitization of analog data, such natively digital data come from post-processing primary data or generating data that do not or did not have a physical counterpart. this process creates new challenges and opportunities in archaeological scholarship (digital preservation coalition ; forte ). for example, we use flatbed scanners to digitize a site map recorded in a field notebook to convert the analog page into a digital format such as a tagged image file format (tiff). while a tiff is machine-readable, the data require post-processing to be useful for geospatial analysis.
for example, the scanned map must be georeferenced to provide real-world coordinates and scale; it must also be vectorized to provide data for analysis in a gis or other platform. in other words, post-processing necessitates multiple steps of human decision-making, produces numerous file types, and results in new data. the creation of new data through digitization, most often but not always through post-processing, is called datafication. some scholars define datafication as “transforming objects, processes, etc. in a quantified format so they can be tabulated and analysed” (gattiglia : , emphasis ours; mayer-schönberger and cukier ). in contrast, we contend that datafication outputs are not limited to quantifiable data, and we recommend shifting the definition of datafication to emphasize process, i.e. the transformation and translation of objects and processes, rather than outputs (richards-rissetto b). basically, in contrast to digitization, which ‘replicates’ original data, datafication creates derived, or new data, which requires human translation (interpretation) and encourages unique considerations for archaeological scholarship. datafication of both born-digital and analog formats offers archaeology more than either can do alone. datafication involves what digital scholars call metadata and paradata. while the term metadata describes information about the data themselves (clarke ; esteva et al. ; hodder ; roosevelt et al. ; ullah ; witcher ), paradata more specifically concern interpretive decisions. recording paradata, the “information choices or the process of interpretation so that the aims, contexts and reliability” of methods can be evaluated (bentkowska-kafel and denard : ) is a major challenge. with digital data, in particular geospatial and d modeling and visualization, archaeologists can easily modify raw and derived data to generate new derived data.
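the georeferencing step and the accompanying paradata can be sketched in a few lines. this is a minimal illustration under stated assumptions, not the authors' workflow: it fits a six-parameter affine transform from exactly three ground control points (pixel position to map coordinates) and logs each human decision alongside the result. the function names, the control-point values, and the paradata fields are all hypothetical; gis software commonly accepts more control points and fits them by least squares, which this sketch does not attempt.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve the 3x3 linear system m * x = v using Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear; choose different points")
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]      # copy, then replace column k with v
        for r in range(3):
            mk[r][k] = v[r]
        solution.append(det3(mk) / d)
    return solution


def fit_affine(gcps):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f from three control points.

    gcps is a list of ((col, row), (x, y)) pairs.
    """
    if len(gcps) != 3:
        raise ValueError("this sketch expects exactly three control points")
    m = [[float(col), float(row), 1.0] for (col, row), _ in gcps]
    xs = [float(x) for _, (x, _) in gcps]
    ys = [float(y) for _, (_, y) in gcps]
    return solve3(m, xs) + solve3(m, ys)


def pixel_to_world(params, col, row):
    """Apply the fitted affine transform to one pixel position."""
    a, b, c, d, e, f = params
    return (a * col + b * row + c, d * col + e * row + f)


def record(paradata, step, decision, reason):
    """Append one interpretive decision to a paradata log (hypothetical fields)."""
    paradata.append({"step": step, "decision": decision, "reason": reason})


# Invented control points: pixel (col, row) on the scan -> map (x, y) in metres.
gcps = [((0, 0), (1000.0, 2000.0)),
        ((100, 0), (1050.0, 2000.0)),
        ((0, 100), (1000.0, 1950.0))]

paradata = []
record(paradata, "georeference", "three control points at structure corners",
       "corners were the only features legible on both the scan and the reference map")
params = fit_affine(gcps)
x, y = pixel_to_world(params, 50, 50)
```

with these invented control points each pixel spans half a metre and the row axis runs opposite to the map's y axis, so the pixel at (50, 50) lands at x = 1025, y = 1975. the `paradata` list travels with the transform parameters, so a later reader can see why those control points were chosen as well as what transform resulted.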
however, retracing our steps is not straightforward, even though transparency is necessary for others to assess data quality as well as analytical results (huggett ; kansa et al. ). datafication mandates not only metadata but also paradata, thus requiring unique practices for digital scholarship in archaeology (see below). however, datafication also brings a great opportunity for data intimacy: a deep familiarity with the data that affects perception and affords new insights (cavillo and garnett ; fahmie and hanley ; hong ). intimacy is increasingly essential for a digitally-mediated archaeology in which data transformation, integration, and creation are anything but straightforward. archaeological data are heterogeneous, making not only the data messy, but perhaps more importantly making the scientific process itself messy; research does not proceed in an orderly series of steps (boyer ). yet this messiness affords new opportunities for data integration that require deep interdisciplinary thinking and often lead to innovative methods and analyses (demján and dreslerová ; harrison ; kansa ; kintigh ; von schwerin, lyons et al. ). even in cases where digitization/datafication standards or best practices exist (e.g. open geospatial consortium), researchers still must make numerous decisions as we generate digital data. this decision-making process is not a new aspect of digital archaeological practice (hodder ; hodder ). for example, in hand-drawing profiles we decide on important points (x, y, and z) to map based on previous knowledge, experience, objectives, etc. however, in generating born-digital and derived digital data we often make black box decisions (caraher ) based on convention or ‘mysterious’ software algorithms. given the emerging nature of digital technologies and tools, we often make decisions based on trial and error, searching the internet for solutions, or contacting colleagues.
also, because digital data are dynamic (alberts, went & jansma ), our initial choices more easily and readily change, leading to new challenges and advantages in digitally-mediated archaeology. because of the dynamic nature of digital data and technologies, we contend that digitally-mediated data transformation, integration, and interpretation require reflexive, iterative thinking: we must be more aware of our decision-making processes (engel and grossner ; esteva et al. ; hodder ; hodder ; hodder ; lukas, engel & mazzucato ; roosevelt et al. ; tringham ). why and how do we make specific choices? and how can we document our choices, i.e. record metadata and paradata, to allow for digitally-mediated scholarship to become better integrated and accepted into archaeological practice? these questions are part of larger challenges and opportunities of digital scholarship in archaeology.

in the first part of this paper we introduce a historical approach for digitally-mediated archaeology. derived from interdisciplinary collaboration between historians and historical archaeologists, the approach encourages increased reflexivity and critical analysis of data sources. just as archaeologists study the position, form, material, and context of an artifact, historians consider the same in scrutinizing documentary resources. here, we explore how to apply source-side criticism of (sometimes actually historical) analog and digital data. in the second part of this paper, we discuss several examples of translating analog data to geospatial digital data including: (1) converting maps originally generated with alidade and plane table to geographic information systems (gis) data, (2) converting hand-written field notes into gis data, (3) integrating multi-source data (i.e.
vectorized maps, gnss, total station, and airborne lidar), (4) processing data to generate georeferenced 3d models, and (5) analyzing digital data in different software for scholarly research and interpretation. we summarize the lessons we have learned from our experiences in transforming analog data to geospatial digital data, discussing what is lost and what is gained in translation, and then applying what we have learned to provide concrete insights to archaeological practice. we contend that as we transform, integrate, and analyze these data, we are not simply digitizing data but rather we are performing datafication. in other words, we are acquiring new knowledge about data collection, documentation, processing, and interpretation, which can lead to new archaeological questions and methodologies and enhance the nature of archaeological scholarship (huggett ; kansa and kansa ; kintigh et al. ; richards-rissetto and landau ). we advocate an iterative process of ‘translating’ analog and digital data that goes beyond ‘end-products’ and instead considers datasets as part of a non-linear process of archaeological investigation that offers new insights to guide transformations of archaeological practice into rich digital scholarship.

archaeological scholarship, whether digital or not, stems from specific research goals. we ask questions that guide our research design from data collection to analysis to dissemination. to situate our discussion, we use a case study from the ancient maya site of copán, honduras, that has specific research goals and questions related to landscape archaeology using a combination of analog and digital data.

a historical approach for digitally-mediated archaeology

in anthropology, the so-called postmodern turn of the s encouraged greater awareness of power differentials between observed and observer. the concept of culture itself was scrutinized as a reification and tool for “othering” (abu-lughod ; clifford and marcus ).
anthropologists began to question themselves, the ethnographies they produced, and the epistemological basis for the scientific hypothetico-deductive-nomological approach. customs, traditions, and ways of being could only be understood within their appropriate contexts. given that anthropologists can never actually get inside an informant’s head, postmodernists argued that their books are merely one-sided accounts or fictions, and so they should be treated just as any other literary text (e.g., salzman ). in this vein, post-processual archaeologists attempted to “read the past” or “read material culture” as a way to construct meaning (hodder ; tilley ; tilley ). because historical archaeology involves both artifacts and texts (material objects as well as writing), postmodern critiques had much to offer. until then, while written histories held a more privileged position than archaeological data, they were not subjected to the same kinds of rigorous contextual analyses as artifactual studies (lightfoot ; morrison and lycett ; stahl ). historical archaeologists began treating texts as artifacts by more reflexively considering their contexts, how they obtained them (source-side criticism) and how they applied them (subject-side criticism). of particular importance was source-side criticism, and archaeologists followed the lead of historians in more carefully assessing the authenticity and validity of documentary accounts. w. raymond wood ( ) argued that archaeological records, photographs, maps, and the landscape itself be considered ‘documents,’ and thus open to the same kind of source-side criticism as historical texts.
he summarizes the historical method in four steps: (1) formulating the problem or research question for which documents are needed, (2) determining which documents are authentic (‘external criticism’), (3) determining which details within a document are credible (‘internal criticism’), and (4) organizing all reliable information into a narrative to resolve the research problem. wood’s ( ) first step of formulating a research question is encapsulated by archaeology’s turn to ‘problem-based research’ during s processualism, and later on in gis. for example, lock and stančič ( : xiv) stressed that it is not the specific mathematical procedures themselves that will be the future of innovative research with gis but rather “the underlying archaeological approaches and questions determining their use.” thus, as in dirt archaeology, a preconfigured research question is also a necessary starting point for digital archaeology. wood’s ( ) second step of external criticism involves assessing a document itself, while his third step, internal criticism, addresses the document’s specific contents and meaning. to perform external criticism, one must focus on the author and date and obtain the original version rather than a copy. next, the researcher must separate the content of the document into eyewitness accounts at the moment versus descriptions written by another individual or later. most important in evaluating credibility is temporal proximity to the event; next is consideration of potential distortion due to the intended purpose and audience of the document; last involves addressing the competency and expertise of the writer and whether there is independent corroboration (wood , ). also important for our purposes is wood’s admonition to carefully check all translations and their meaning particular to that time and place (e.g.
a ‘lot’ number at copán in meant something different than in ). in relation to maps, he recommends some knowledge on the history of cartography and mapmaking, awareness of the ‘silent updating’ of existing maps, how particular maps were made (e.g. using a compass or astrolabe), what the particular surveyor thought was worthwhile to depict, and the geography of the area in question. in the sections below, we take wood’s historical approach and apply source-side criticism to text, maps, and 3d models.

geospatial data: what’s the big deal?

archaeology is all about location. provenience is essential across scales. the more we know about location, the greater the potential for more informed and granular interpretations. archaeologists began to employ gis originally for data management and not long after for spatial analysis (connolly and lake ; wheatley and gillings ). gis revolutionized the way archaeologists deal with spatial data, and the question as to the magnitude of its impact on archaeological theory is still debated (howey and brouwer burg ; richards-rissetto b). nevertheless, gis brought greater awareness to the potential of geospatial data for archaeological studies. no longer would our maps be aligned to site-scale coordinate systems based solely on an arbitrary (0, 0) origin. rather, our site data could be tied to real-world coordinates allowing us to overlay multiple layers of data such as geology, geomorphology, and land cover, and importantly for landscape archaeology, tied to a much larger area with greater precision allowing new types of analyses. today, many archaeologists have gnss to acquire data points for not only site location but millimeter-level geospatial data of intra-site features through integrating a variety of digital tools (e.g. total station, laser scanning, and photogrammetry). others have legacy data from earlier surveys, excavations, and analysis (allison ; clarke ; faniel et al.
; kansa and kansa ; ullah ; witcher ), which provide data that are ‘lost’ due to the destructive nature of excavation, urbanization, agriculture, looting, taphonomic processes, natural disasters, and more (e.g. gruen, remondino & zhang ). these analog data provide a rich source of information that can be converted to and subsequently integrated with digital data, not only to generate new data but also to lead to new forms of archaeological practice and scholarship (faniel et al. ; gunnarsson ; tringham ; wells et al. ; wylie ). the use of digital geospatial data has revolutionized the practice of archaeology, but archaeologists must still be vigilant about its origins and context (ullah ).

methods, lessons, & reflections: translating analog data to geospatial digital data

we apply the above insights on analog/static versus digital/dynamic data and source-side criticism of the historical method to geospatial data in archaeology. we examine five types of data transformation that have been particularly relevant to our own research, and provide a few words on the experience as well as lessons for future practitioners. we illustrate the five transformation types with different categories of data from the ancient city of copán. before outlining the five types of data transformation, we provide a brief history on the kinds of analog and digital data that currently exist for copán. the ancient maya site of copán has a long occupation dating back to at least bce. today, it is a unesco world heritage site in honduras, but from the fifth to ninth centuries it was the seat of a dynastic kingdom that at its peak governed over square kilometers (bell et al. ; fash ). excavation dates back to when guatemala’s governor, juan galindo, mapped part of the site’s core and excavated a tomb in the main civic-ceremonial complex (fash and agurcia fasquelle ).
unfortunately these primary data are lost, but in stephens and catherwood, two early explorers of central america, described, mapped, and created drawings (using a camera lucida) of copán’s jungle-covered main civic-ceremonial core (stephens and catherwood ). in the early to mid-nineteenth century, archaeologists began scientific studies of the site that included excavation, architectural drawings, and maps (maudslay – ; morley ). later, in the late s and early s archaeologists carried out a % pedestrian and mapping survey of square kilometers surrounding copán’s main civic-ceremonial complex (fash and long ). in the early s two austrian architects used photogrammetric methods to generate large-scale ( : ) maps of the main civic-ceremonial complex (hohmann and vogrin ). additionally, maps from individual excavations are available via unpublished field notes, type-written summaries of field notes with penciled-in additions, type-written finalized reports, dissertations, monographs, and other publications available online and in copán’s onsite archives. these maps along with archival and published data provide a wealth of analog resources to investigate ancient copán. in following the first step of wood’s ( ) historical method, we want to be explicit in defining the nature of the problem for which we seek documentary sources. generally speaking, our case study has two broad research questions: (1) what is the nature of social interaction at copán in the late eighth to early ninth centuries, just prior to the city’s decline? and (2) how did daily life within copán’s urban neighborhoods change over time in relation to major political and/or economic events? to examine these questions, we focus on accessibility and visibility within the city of copán. we ask: who lived in view of royal architecture? who was visually isolated? were certain social groups channeled toward specific locations? if so, for what purposes?
additionally, can measures of accessibility and visibility provide data useful for identifying neighborhood or other boundaries (landau ; llobera, fábrega-Álvarez & parcero-oubiña ; llobera , llobera a, llobera b; richards-rissetto )? in order to address these questions, we need not only geospatial data, but multiple scales of data from excavation units to regional surveys, and many of these data are only available in analog formats. thus, we developed the above mentioned five-step process that includes: (1) converting maps, originally generated with alidade and plane table, to gis data, (2) translating hand-written field notes into gis data, (3) integrating multi-source geospatial data (e.g. digitized analog data with gnss data, total station, and airborne lidar data), (4) processing gis and other data to generate georeferenced 3d models, and (5) analyzing geospatial digital data in different software for scholarly research and interpretation.

step 1: digitizing, georeferencing, & attributing paper maps

lessons: while labor-intensive and time-consuming, the process of digitizing, georeferencing, and attributing maps created with differing methods, at multiple scales, and in different languages (english, spanish, german, and french), and then painstakingly vectorizing them provided new insights and sparked new archaeological questions about the mahler, or prismatic, method of mapping maya sites (hutson ) and copán’s site typology (richards-rissetto , ; richards-rissetto and landau ; willey and leventhal ). our experience was similar to ullah’s ( ) exploration of the minute details and large errors of legacy survey data in jordan. here wood’s ( ) second and third steps apply: the various paper maps must be subjected to external criticism (are they authentic?) as well as internal criticism (are the details within accurate?).
making such judgments inherently involves becoming well acquainted with the context of creation for each map: what we term data intimacy. what were standard cartographic practices in the us, honduras, germany, and france in the s? what were the defined problems for which these maps were produced? which details did the mapmakers include and exclude, and why? do field notes admit to mistakes, illnesses, land-access issues, etc. that were or were not published in the final map? did individual field workers have years of experience, or did they learn on the job? such contextual questions must be considered in the digitization process. reflexively as well, the digitizer should explicitly record which maps (external criticism) and which details (internal criticism) were actually digitized or left out, and why (clarke ; esteva et al. ; hodder ; roosevelt et al. ; ullah ; witcher ). while digital implies speed (archaeologists quickly acquire millions of 3d points using a laser scanner), we learned that the best practice is slow practice (caraher ): to take a step back and critically consider the longer term implications of digitization before jumping in. it is critical to examine all mapped analog (and digital) data before georeferencing and vectorizing. careful examination may help to identify an appropriate grid system and lay out a methodology suited to heterogeneous data (demján and dreslerová ). in the case study, maps ranged from twenty-four km square plane table and alidade maps at a scale of : (figure ) (fash and long ) to : scale photogrammetric maps of copán’s civic-ceremonial core (hohmann and hohmann-vogrin ), to excavation maps of individual sites (maca ; webster ). after researching how each of the existing maps was created, we decided to georeference them to the copán archaeological project (pac ) site grid (fash and long ) for several reasons. first, it offered the best tie points for copán’s heterogeneous mapped data.
second, it provided a way to double-check and link attribution because structure and group names are based on grid quadrants with additional data (e.g. site type, number of plazas) available in a separate volume (fash and long ). third, the copán site archives contain a massive collection of original field notes, type-written versions of the field notes, original hand-drawn maps, reports to funding agencies, and final publication drafts. such sources were helpful in determining how much to rely on particular internal details. for example, when an archaeologist’s field notes indicated that an area was heavily forested, we noted that archaeological structures and contour lines may be less accurate here than in other areas.

figure : example of : scale plane table and alidade map, from fash and long ( ) (left) and photogrammetric map at scale : , from hohmann and vogrin ( ) (right).

due to fine lines, no color differentiation, and multiple data layers in a single map (e.g. hydrology, structures, contours, modern roads, text), we manually digitized the maps to create georeferenced vector data (i.e. shapefiles) to ensure accurate data capture (richards-rissetto ). three data layers were vectorized (contour lines, archaeological structures, and hydrology) and attributed with group name, structure name, site type, and elevation using data from maps, architectural drawings, and text. circling back to wood ( ), it is essential in digital archaeology to capture not only metadata, but paradata (bentkowska-kafel, denard & baker ; denard ); that is, recording the data sources, methods, etc. that inform the choices we make as we digitize, and importantly providing information on any modified data, for example, filling in missing gaps on a map using excavation data or architectural drawings.
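one way to make such paradata concrete is to store a small structured record alongside each digitized feature. the sketch below is illustrative only: the class name, field names, and example values are hypothetical and do not describe the project's actual workflow. it assumes paradata are kept as json next to the vector data.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ParadataRecord:
    """Paradata for one digitized feature (all field names are hypothetical)."""
    feature_id: str          # identifier of the vectorized feature
    layer: str               # e.g. "contours", "structures", "hydrology"
    source: str              # which analog map or document was traced
    method: str              # how the feature was captured
    modified: bool = False   # True if the digitizer altered or filled in data
    fill_sources: list = field(default_factory=list)  # sources used for gaps
    rationale: str = ""      # why the choice was made


def to_json(records):
    """Serialize records so the decisions can travel with the dataset."""
    return json.dumps([asdict(r) for r in records], indent=2)


# Illustrative values only; they do not describe a real feature.
example = ParadataRecord(
    feature_id="structure-001",
    layer="structures",
    source="paper survey map, sheet A",
    method="manual on-screen tracing",
    modified=True,
    fill_sources=["excavation plan"],
    rationale="map edge was missing; outline completed from excavation plan",
)
paradata_json = to_json([example])
print(paradata_json)
```

a record of this kind keeps the modification flag and its justification together, so a later user can tell which outlines were traced directly and which were completed from another source.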
capturing metadata and paradata is essential in digital archaeology to allow other researchers not simply to reproduce an end-product, but to actually retrace our processes to verify as well as build on such scholarship. in the end, such practice will also facilitate data preservation and access and help formulate best practices and standards because it allows data to be readily re-used (richards-rissetto and von schwerin ). another advantage of manual digitization is data intimacy. on-screen tracing of archaeological features by hand simulates traditional hand-drawn mapping practices. such a process provides familiarity with second-hand data that is lost with automatic vectorization. for example, in the s, cartographers created a five-level typology for classifying aboveground architectural remains (fash and long ; willey and leventhal ) that archaeologists adopted to represent socioeconomic status. through manually digitizing and attributing over structures, we came to question the validity of correlating the typology to social status (richards-rissetto ). our suspicions were later supported because there was no spatially statistically significant difference in accessibility between some elite (type ) and non-elite (type ) residential groups (richards-rissetto ; richards-rissetto and landau ). therefore, the slow and tedious practice of on-screen tracing led to the development of a research question about accessibility between people of different socioeconomic status. results from this study led to a correction in the copán site typology, changing our understanding of the nature of status differences and inequality at the ancient city. applying wood’s historical method encouraged new research questions that ultimately helped us better answer major anthropological questions.

basic lesson: although today’s digital archaeology allows rapid and efficient digitization and datafication, we should step back and slow down.
our experiences have shown that developing data intimacy, though it sometimes requires hours of painstaking manual digitization, affords greater exploration and reflection on the data. gaining introspective clarity during the process of digitization and datafication may lead to significant new research findings, previously unconsidered.

step 2: translating archival documents into spatial data & informing the geospatial process

lessons: archival field reports and hand-written notes are an often untapped resource; however, such data are inconsistent: some investigators write more than others, notes are missing or often unstandardized (between individuals, between projects, and across time), and provenience data are hit or miss. moreover, documents are composed in multiple languages, and at times the writing is illegible (e.g. clarke , ullah , witcher ). nonetheless, after applying wood’s ( ) criteria to determine credibility, these archival data are worth the effort: they fill in missing pieces and enrich research. for example, they provide attributes for mapped features, rationale for terminology and methods, and ‘lost’ provenience. in the case study, we scanned archival data from the library at the center for regional archaeological investigations (cria) at copán ruinas in honduras. these data include hand-drawn maps and profiles, artifact counts, provenience information, catalog numbers, etc. we scanned the originals as pdf files (for documents) and tiff files (for images and maps) to address three interrelated goals: for long-term archival purposes, to assist the cria in digitization efforts, and to gather more information on precisely how archaeological structures were interpreted and mapped. while we were reasonably sure that all field notes and reports were authentic due to their curation at the site archives, we combed through these documents for spatial information that we could transform into usable geospatial data.
we read each source completely to gain a sense of internal validity: does the author contradict themselves? are peculiar margin comments corroborated by other authors within the archives? once we established validity and accuracy, we georeferenced and vectorized maps into shapefiles, and populated spreadsheets with attributes linked to the shapefiles. a key challenge was to assign height to the archaeological structures. in the maya region, surveyors record the length, width, and height of architectural mounds (i.e. collapsed structures), but do not estimate original structure heights. to estimate structure heights we began by gathering spatial and other relevant data from archived excavation notes, published monographs, and ethnographic data. in particular, annotations and their placement within archival documents provided insights (often lost in the typewritten field reports) via rough sketches and from architectural materials and construction techniques (figure ) (tringham ). beyond providing x, y, and z spatial data, these data were integral to developing a gis method to estimate height based on site type, construction materials, and excavation data (richards-rissetto , ). ultimately we estimated height using a trigonometric function, but developing mathematical formulas and an appropriate methodology required a close reading of various analog sources.

basic lesson: texts should be treated as artifacts themselves (lightfoot ; morrison and lycett ; stahl ); we cannot simply take them as fact and incorporate them, but rather we need to try to understand the context and purpose of their writing in relation to the writer and historical circumstances. each document has its own historical trajectory and materiality.

step 3: integrating multi-source geospatial data (e.g.
shapefiles, gps, gnss, total station, lidar)

lessons: combining different geospatial datasets typically fills gaps in archaeological maps, giving a more complete picture despite differences in original acquisition or granularity. however, sometimes different datasets overlap. how do we decide which dataset is best, how to combine datasets, or how to give more weight to the ‘better’ dataset? the second and third steps of wood’s ( ) historical method (external and internal criticism) again become important in integrating various datasets. first, how was each dataset initially produced? which research questions were the data commissioned to answer? second, which aspects of each map were more ‘accurate’ in instances of overlap? we conclude that deciding which representation is more ‘correct’ or ‘accurate’ should be an iterative process, ideally accomplished while in the field, where ground-checking is possible. no one type of data (capture) is necessarily ‘better’ than others, but rather each data type comprises parts, some of which are more useful or accurate than others. in our case study, total station-based mapping in san lucas revealed what appeared to be a ‘new’ archaeological group, unmapped in previously published reports. we also ‘lost’ a group that had been previously mapped, which we could not relocate on the ground (similar to ullah’s [ ] experience). consulting lidar data showed that the originally mapped group had been erroneously placed. while the internal architecture was mapped correctly, the group was placed about m away from its actual location. therefore, these two groups were one and the same. in another example, while total station data captured low mounds (landau, richards-rissetto & wolf ), it was difficult to differentiate low archaeological mounds (< cm) from natural topography using airborne lidar (von schwerin et al. ).
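a displacement like the one described above can be quantified once both versions of a group share a projected coordinate system. the following is a minimal sketch, not the authors' procedure: the function names and coordinates are hypothetical, the vertex mean is used as a simple stand-in for a true polygon centroid, and planar coordinates in meters are assumed.

```python
import math


def vertex_mean(points):
    """Mean of (x, y) vertices; a simple stand-in for a polygon centroid."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def offset(legacy_points, reference_points):
    """Distance (m) and bearing (degrees clockwise from grid north)
    from the legacy position to the reference position."""
    lx, ly = vertex_mean(legacy_points)
    rx, ry = vertex_mean(reference_points)
    dx, dy = rx - lx, ry - ly
    distance = math.hypot(dx, dy)
    # atan2(dx, dy) measures the angle from the +y (north) axis, clockwise.
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return distance, bearing


# Hypothetical coordinates in meters (same projected system for both).
legacy_group = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
lidar_group = [(30.0, 40.0), (40.0, 40.0), (40.0, 50.0), (30.0, 50.0)]

distance_m, bearing_deg = offset(legacy_group, lidar_group)
print(f"offset: {distance_m:.1f} m at bearing {bearing_deg:.1f} degrees")
```

if the internal geometry of the two outlines still agrees after this offset is removed, that is consistent with reading them as a single group recorded in two places, though ground-checking remains the stronger test.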
in the process of integrating multi-source datasets, we learned that datasets can ‘self-correct,’ but only if we iterate back and forth between them to reveal which bits are more or less accurate. in the end, by applying wood’s method, we create a critical combination of all maps that results in improved accuracy and precision all around. another example from our case study involves the integration of various datasets with each other and existing geospatial data. figure is an example from the neighborhood of san lucas at copán (landau ). it illustrates overlaid data gathered from three different sources: pink (fash and long ), yellow with black lines (landau ), and a lidar-derived landscape (von schwerin et al. ). the fash and long ( ) data were collected using alidade and plane table at a time when the copán valley was much more sparsely occupied, and this area was likely a cow pasture with low to medium overgrowth. wolf and landau re-mapped this architectural group in – with several different gnss units and a total station with prism, and in , the mayaarch3d project commissioned lidar data (von schwerin, richards-rissetto et al. ). in general, wolf and landau consulted the fash and long ( ) maps while in the field using gnss receivers and total station with a prism. first we searched for the structures as indicated on the maps. keeping these structures in mind with the contemporary landscape topography, wolf drew the architectural group as he interpreted it by hand in a notebook; afterward we took a series of three to six points for each structure.

figure : unpublished scanned sketch maps from copán, honduras (left), and original field notes from san lucas, copán, both illustrating importance of legacy data. (courtesy: honduran institute of anthropology and history and k. landau and m. wolf).
wolf later reconciled these points with his hand-drawn maps (see figure ). afterward, when plotting the maps together with wolf’s maps on top of the lidar hillshade surface, landau made further corrections to the wolf drawing. for example, she modified the edge of the flattened area in the northwest corner of figure , to give a more accurate sense of its extent. another lesson involves careful, critical use of automatic digitization tools. while the vector to raster tool in gis is push-button (not quite black box, but easily non-critically applied), dealing with architecture rather than topography requires different decisions, methods, and tools. for example, what spatial resolution is sufficient? capturing details such as platforms and stairs requires high-resolution data; however, generating a cm raster surface for square kilometers or more requires high levels of processing power, often not available to individual researchers or archaeologists in developing countries. additionally, in cases of landscape analysis, these rasterized architectural data also need to be integrated with the terrain (topographic surface). while lidar data are available in some areas, typically they are still unavailable to archaeologists due to high costs and lack of flights, particularly in remote regions. thus, our options for free or low-cost raster terrain data are limited to lower-resolution datasets such as shuttle radar topography mission (srtm) or advanced spaceborne thermal emission and reflection radiometer (aster), which unfortunately are not sufficient for visibility analyses within urban landscapes such as ancient maya cities where topography is integral to site layout (aveni and hartung ; gagnon et al. ; inomata ; juarez, salgado-flores & hernández ; landau ; richards-rissetto and landau ). another option is analog data acquired via instrument mapping and published as paper maps with contour lines. these paper maps typically provide a larger-scale (i.e.
higher resolution) terrain than free dem data (particularly outside of the u.s. and europe), but following a historical approach, we must step back to critically evaluate the quality of source data.

figure : group m- - at the san lucas neighborhood, showing overlap between fash and long ( ) (pink), landau ( ) (yellow), and the lidar surface (von schwerin, richards-rissetto et al. ) (gray).

basic lesson: all sources provide complementary and unique information that together create a more holistic and empirical picture. digitally-mediated practice both requires and allows us to interrogate data accuracy and interpretation more deeply, particularly through an iterative process. importantly for archaeological practices, geospatial data integration raises questions such as: can we identify spatial patterns by examining similarities and differences among analog maps, lidar, and excavation data? can such comparisons help us interpolate older analog maps? how accurate are lidar data in particular cultural and environmental contexts? can we devise algorithms to more accurately detect low mounds by ground-checking a stratified sample and comparing topography and vegetation to algorithm-detection accuracy? these questions affect archaeological practice and digital scholarship.

step : processing data to generate georeferenced 3d models

lessons: in the past decade, particularly since the advent of out-of-the-box photogrammetry (i.e. structure from motion), 3d data have become commonplace in archaeology. however, most 3d data are not born-digital; rather, they are acquired in the field, lab, or museum by capturing physical objects and landscapes. these primary data can be instantly georeferenced, or not, depending on available technology and the location of data capture. however, converted analog data such as structure maps introduce new challenges as we move from 2d (vector) to 2.5
d (raster) to 3d models (mesh/faces). while we can transform analog maps to gis vector data and subsequent raster data, our 3d results are extruded schematic models lacking (slanted) roofs, architectural sculpture, and often platforms and stairs, depending on the original map. transforming 2.5d data into true 3d models usually necessitates manual modeling, though procedural modeling is now offering innovative opportunities (saldana ). while directly generating 3d architectural models from gis (2.5d) data is not ideal, it offers the benefits of conveying uncertainty and offering a close reading of data. 3d models, particularly those that are photo-realistic, can lead viewers to false certainty about reconstructions (kantner ). however, abstract models (perhaps augmented by transparency or color-coding) portray important ambiguities (brunke ; kensek, dodd & cipolla ; lengyel and toulouse ). considering wood’s ( ) process in reverse, how can we use 3d modeling to indicate instances of uncertainty regarding source authenticity and accuracy? creating data that include measures of uncertainty would allow future researchers to more easily apply source-side criticism and, ultimately, correction. moreover, manual 3d modeling leads to data intimacy, which provides new insights. for example, ambiguities in mapping, typically not identified in procedural modeling, can be identified and then used to write scripts that generate empirically-informed procedural models. yet we still end up with static, fixed models that represent a single interpretation (i.e. reconstruction). in this scenario, we fail to take advantage of certain digital affordances; that is, we do not use digital technologies to generate multiple hypothetical 3d models (or simulations) for structures or landscapes. thus, in the case study we turned to procedural modeling, i.e.
rule-based rapid generation of buildings from gis data (richards-rissetto and plessing ), to generate multiple simulations. we generated 3d models from a spatial database, with metadata and the decisions we made (i.e. paradata) stored both as a text document and as a schematic hierarchy, offering innovative possibilities for digital data storage, accessibility, and reuse (bentkowska-kafel, denard & baker ; denard ; esteva et al. ; faniel et al. ; lukas, engel & mazzucato ; richards-rissetto and von schwerin ). additionally, these procedural models provide information to digitally define basic elements and components of ancient maya architecture, which scholars have sought to define for over one hundred years (andrews ; kubler ). using architectural definitions (loten and pendergast ), we created rules for elements and components that allow for dynamic rather than static modeling of architecture. this digitally-mediated process facilitates hypothesis generation with empirical underpinnings that are documented in procedural modeling scripts (richards-rissetto and plessing ).

basic lesson: in part because of the time input for manual modeling, singular 3d architectural models can mislead viewers to false impressions of the past. procedural modeling of geospatial data into 3d introduces new possibilities because we can create multiple simulations based on different data sources. given that each model displays a different set of conclusions based on the data and, importantly, includes the source data on which that particular conclusion was based, procedural modeling provides more dynamism to archaeological data. this allows researchers to evaluate multiple different scenarios, and could potentially reveal to the public the complexities of digital 3d archaeological reconstruction.
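the idea of rule-based generation with recorded paradata can be illustrated with a minimal sketch. it is not the procedural modeling workflow used in the case study: every class, field, and rule name below (`Footprint`, `Rule`, `wall_height_factor`, and so on) is a hypothetical stand-in, and the geometry is reduced to an extruded box. the point is only that one source footprint can yield several models, each carrying the rule and source that produced it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Footprint:
    """One structure as it might sit in a GIS attribute table (hypothetical fields)."""
    structure_id: str
    width_m: float
    length_m: float
    mound_height_m: float  # height recorded on the map, not the original building height
    source: str            # which map or survey the geometry came from


@dataclass(frozen=True)
class Rule:
    """A named modeling rule: an interpretation, not an observation."""
    name: str
    wall_height_factor: float  # assumed ratio of wall height to mound height
    roof: str                  # e.g. "flat" or "pitched"
    note: str                  # paradata: why this rule was chosen


@dataclass
class Model:
    """A schematic extruded box plus the paradata that produced it."""
    structure_id: str
    wall_height_m: float
    roof: str
    volume_m3: float
    paradata: dict = field(default_factory=dict)


def generate(fp: Footprint, rule: Rule) -> Model:
    """Apply one rule to one footprint and record the decision alongside the result."""
    wall_height = fp.mound_height_m * rule.wall_height_factor
    volume = fp.width_m * fp.length_m * wall_height  # box only; the roof is just a label here
    paradata = {"rule": rule.name, "source": fp.source, "note": rule.note}
    return Model(fp.structure_id, wall_height, rule.roof, volume, paradata)


def simulate(fp: Footprint, rules: list[Rule]) -> list[Model]:
    """Generate one hypothetical model per rule for the same source footprint."""
    return [generate(fp, rule) for rule in rules]


if __name__ == "__main__":
    fp = Footprint("demo-structure", 4.0, 6.0, 1.5, "legacy paper map")
    rules = [
        Rule("tall-flat", 2.0, "flat", "illustrative assumption"),
        Rule("low-pitched", 1.0, "pitched", "illustrative assumption"),
    ]
    for model in simulate(fp, rules):
        print(model.paradata["rule"], model.roof, model.wall_height_m, model.volume_m3)
```

keeping the rule name, source, and rationale inside each model is one possible way to make a simulation traceable to the interpretation behind it; a real pipeline would store far richer geometry and documentation.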
step : analyzing digital data in different software for scholarly research & interpretation

lessons: while ‘analysis’ occurs in data translation, gis, 3d modeling software, and vr afford opportunities for knowledge generation via integration, computation, and visualization (forte and pescarin ; jones and levy ). each software package offers unique tools and methods that facilitate, enhance, and ultimately change archaeological practice and scholarship. yet by reflectively iterating between these packages, we open up additional possibilities. while gis provides tools to convert analog data to geospatial digital formats, its power for scholarship resides in its analytical capabilities. using gis we can identify spatio-temporal patterns and trends in big and complex data to investigate old questions and hypotheses in alternative ways and to propose new lines of inquiry. in the case study, using gis we developed computational visibility and accessibility approaches across multiple scales to investigate social connectivity among copán’s different socio-economic groups (landau ; richards-rissetto ; richards-rissetto ; richards-rissetto and landau ). while the gis offers quantitative measures of potential social connectivity, the technology, based on 2.5d data, does not allow architectural details such as sculptured facades and arched doorways to be incorporated in the analysis. thus, aesthetic details such as color and lighting are missing, as are features that affect visibility, particularly in the communication of messages (paliou ; paliou ; richards-rissetto a; sullivan ). additionally, gis falls short for phenomenological and other perception-based approaches because it gives a bird’s-eye perspective and lacks a sense of mass and scale (gillings and goodrick ; kwan ; llobera ; rapoport ; richards-rissetto b; tilley ).
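the 2.5d limitation described above can be made concrete with a small line-of-sight sketch. this is not the visibility method used in the case study; the function name, the grid layout, and the 1.6 m eye height are assumptions chosen for illustration. because the grid holds exactly one height per cell, a wall is solid from the ground to its top, so an opening such as a doorway cannot be represented.

```python
def line_of_sight(dem, observer, target, eye_height=1.6):
    """Return True if the target cell is visible from the observer cell.

    dem      -- list of rows, one height value per cell (a 2.5D surface)
    observer -- (row, col) of the viewer
    target   -- (row, col) of the cell being looked at
    """
    (r0, c0), (r1, c1) = observer, target
    z0 = dem[r0][c0] + eye_height  # viewer's eye above the surface
    z1 = dem[r1][c1]               # top of the target cell
    steps = max(abs(r1 - r0), abs(c1 - c0))
    # Walk the cells between observer and target (endpoints excluded).
    for i in range(1, steps):
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        sight_z = z0 + t * (z1 - z0)  # height of the sight line above this cell
        if dem[r][c] > sight_z:
            return False              # the surface rises above the sight line
    return True


if __name__ == "__main__":
    flat = [[0.0, 0.0, 0.0, 0.0, 0.0]]
    walled = [[0.0, 0.0, 5.0, 0.0, 0.0]]  # a 5 m wall in the middle cell
    print(line_of_sight(flat, (0, 0), (0, 4)))
    print(line_of_sight(walled, (0, 0), (0, 4)))
```

in the `walled` grid the far cell is blocked regardless of whether the real wall had a doorway, which is the kind of detail that only a 3d model can carry.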
to get a closer reading, we need to employ 3d technologies such as 3d modeling software and virtual reality (vr). in the case study, we created 3d models of over structures in copán’s urban core using sketchup based on gis data (scanned and georeferenced analog maps), and simulated the landscape between structures to create entire 3d areas to investigate the san lucas neighborhood at copán (landau, richards-rissetto & wolf ) (figure ). the process of creating 3d models was not linear; rather, we iteratively worked back and forth among gis, lidar, and excavation data, which necessitated a deep exploration of the data as parts but also as a whole. this data intimacy led to new questions about the mahler method of mapping ancient maya sites, which records mound heights and not actual structure heights, and thus proves problematic for direct gis to 3d model conversion. additionally, copán’s site typology classifies sites into types - ; however, site types refer to the ‘highest’ socio-economic status of the entire group and do not provide information on lower-status occupants or on structure functionality, both of which affect 3d modeling and subsequent archaeological interpretations. gis and 3d models (reality-based and reconstructions) provide source data to create 3d virtual environments of ancient copán using, for example, vr and procedural modeling. however, other, originally analog, data also provide essential information to create 3d simulations of past landscapes that serve as more than pretty illustrations. they enable us to create multiple simulations to interchange data, investigate old hypotheses, and create new interpretations. in these simulations, analog data are just as essential as digital data because they provide information on features that are now lost to degradation, urbanization, excavation, or other processes. architectural hand-drawings, archival photos, and field notes fill in data gaps.
as we go from analog to digital and subsequently integrate datasets, we do not simply convert data; we translate them. we see anomalies and errors in the data, find ‘missing data,’ and think about typologies or classification schemes (e.g. as we standardize attribution). in other words, we acquire data intimacy. with these 3d simulations, we have the ability to convey data ambiguity (brunke ; kantner ; kensek, dodd & cipolla ), explore our data in unique, dynamic, and experiential or embodied ways (forte and pescarin ; forte and pietroni ; richards-rissetto et al. ; richards-rissetto et al. ), and perform landscape-scale analyses that are impossible without going digital.

basic lesson: in the process of translating analog data to digital form, various technologies including gis, 3d modeling, and vr offer new pathways for data integration, computation, and visualization. although gis provides a suite of analytical tools, its 2.5d format prevents crucial architectural and landscape features from playing their part in visibility studies, for example. therefore 3d modeling and vr take over where gis leaves off: the procedural modeling process is predicated on back-and-forth agreement and decision-making among all datasets, static analog and dynamic digital. the product is more than just a pretty picture because it can show multiple possibilities, re-open preliminary conclusions, and close lasting questions.

discussion: lessons learned in transforming analog to geospatial digital data

our particular experience working with geospatial data at an archaeological site with over years of excavation history necessitated translating various analog data into digital data. because we are dealing with raw and derived geospatial data, the steps of the digitization and datafication process are complex; therefore, we aimed to provide some perspective to help guide others in an area for which standards and best practices are still emerging.
the historical approach we advocate (following wood ) provides a methodology for assessing the origins and accuracy of static analog and born-digital data. in converting different sets of paper maps, made by different people in different languages and countries, best practice is to slow down and get a sense for the bigger, and longer-term, picture. in using archival documents (e.g. field notes, personal journals, excavation reports), our experience has shown that best practice is to treat such resources within their context, just as critically as one would treat artifacts. integrating multiple and multi-sourced geospatial datasets also requires attention to context, and a back-and-forth iterative process between all data toward a more holistic, more accurate representation.

figure : gis map of group m- from san lucas neighborhood, copán (left); 3d sketchup reconstruction of group m- (right).

in the creation of 3d virtual environments from 2d or 2.5d data, using procedural modeling and vr allows archaeologists to interrogate and integrate various datasets. through the process of transforming disparate datasets such as architectural drawings, excavation notes, and archival photos into useful digital data that form part of the 3d simulations, we develop data intimacy, identifying key pieces of information that would be lost in automatic methods that simply convert data rather than translate them as a close reading requires. therefore, as we translate analog to digital data, we develop a deeper understanding of, and appreciation for, how the data that are digitized came to be. the digital affords archaeologists greater intimacy with legacy datasets (analog and digital) as well as with derived data and the intermediate files created through datafication. several of our examples above demonstrate the intellectual advantages of data intimacy and slow science (sensu caraher ).
we learn about the archaeology we are studying through the act of ‘translating’ these data. importantly, we invite similar reflection on already digital data because often we have already lost some of the history of these data, especially data created before metadata or paradata were emphasized for inclusion. digital data can give a false sense of accuracy because they are often clean and ready to use; likewise, while digital data allow landscape-scale analyses that are impossible with analog formats, they impart a distant, disembodied, and masculine god’s-eye view. in both cases, the downside is that we often forget the palimpsest from which the data originally derived, as well as the time, material conditions, labor, and small decisions that went into collecting them. in a sense, we experience another black box, stemming not only from unknown or poorly understood algorithms but also from a lack of deep understanding of the data themselves.

postmodern scholars and their skepticism toward ethnographies have led anthropologists to adopt historical approaches to texts (morrison and lycett ; stahl ; wood ). rather than privileging the written word, archaeologists should treat legacy analog data as any other artifact. we should try to understand the context and purpose of their writing, the author, and the time period. each document should be interpreted within its own frame. applying this same perspective to digital archaeology, we conceptualize digitization of analog data not simply as conversion, but rather as a continuous process of translation and re-translation (wylie ).
to overcome potential losses or confusion in data translation, we advocate a historical approach to digitally-mediated archaeological practice and scholarship that (1) brings awareness to the initial problem or research question for which the data are needed, (2) determines which datasets are authentic (‘external criticism’), (3) identifies which details within a dataset are credible (‘internal criticism’), and (4) organizes all reliable information into a narrative to address the research problem.

beyond advocating a historical approach, we contend that digitally-mediated archaeological practice should not be conceptualized as a chain. we do not acquire knowledge linearly; that is, we do not always begin with research (discovery), move to synthesis (integration), and end with practice (application) (boyer ). rather, each step, phase, or component builds on, complements, and at times overlaps another. it is time we acknowledge the ‘messiness’ of archaeological data and research by devising new conceptual schemes instead of forcing the process into a preconfigured ‘chain.’ in sum, we advocate an iterative process of ‘translating’ analog and geodigital data that treats data transformation not simply as making ‘end-products,’ but rather as a process that generates intermediate datasets, i.e. datafication, within the dynamics of archaeological practice. in this way, as we transform and integrate analog and digital data, we acquire new knowledge about data collection, documentation, processing, and interpretation that can lead to new archaeological questions and methodologies and enhance the nature of our scholarship.

data accessibility statements

the geospatial data are available to researchers upon request via the mayaarch3d and mayacitybuilder projects, and with approval from the instituto hondureño de antropología e historia (ihah).
acknowledgements

we would like to thank arkwork cost action wg on ‘archaeological scholarship’ for funding the inspirational workshop in cologne, germany. we are also grateful to special issue editors eleftheria paliou, jeremy huggett, costas papadopoulos, and isto huvila, as well as workshop participants, the honduran institute for anthropology and history (ihah), the mayaarch3d project, and the anonymous reviewers. this article is based upon work from cost action arkwork, supported by cost (european cooperation in science and technology). www.cost.eu. funded by the horizon framework programme of the european union.

competing interests

the authors have no competing interests to declare.

references

abu-lughod, l. . writing against culture. in: fox, rg (ed.), recapturing anthropology: working in the present, – . santa fe: school of american research press. alberts, g, went, m and jansma, r. . archaeology of the amsterdam digital city; why digital data are dynamic and should be treated accordingly. digital technology, culture and society, ( – ): – . doi: https://doi.org/ . / . . aldenderfer, ms and maschner, hdg. (eds.) . anthropology, space, and geographic information systems. new york: oxford university press. allison, p. . dealing with legacy data: an introduction. internet archaeology, . doi: https://doi.org/ . /ia. . andrews, g. . pyramids and palaces, monsters and masks: the golden age of maya architecture. lancaster, california: labyrinthos. austin, a. . mobilizing archaeologists: increasing the quantity and quality of data collected in the field with mobile technology. advances in archaeological practice, ( ): – . doi: https://doi.org/ . / - . . . aveni, af and hartung, h. . maya city planning and the calendar. transactions of the american philosophical society, ( ): – . doi: https://doi.org/ . / beale, g and reilly, p. .
digital practice as meaning making in archaeology. internet archaeology, . doi: https://doi.org/ . /ia. . bell, e, canuto, m and sharer, r. (eds.) . understanding early classic copan. philadelphia: university of pennsylvania museum of archaeology and anthropology. benardou, a, champion, e, dallas, c and hughes, lm. . introduction: a critique of digital practices and research infrastructures. in: benardou, a, champion, e, dallas, c and hughes, lm (eds.), cultural heritage infrastructures in digital humanities, – . new york: routledge. doi: https://doi.org/ . / - bentkowska-kafel, a and denard, h. . introduction. in: bentkowska-kafel, a, denard, h and baker, d (eds.), paradata and transparency in virtual heritage, – . series: digital research in the arts and humanities. london: routledge. bentkowska-kafel, a, denard, h and baker, d. (eds.) . paradata and transparency in virtual heritage. series: digital research in the arts and humanities. london: routledge. boyer, e. . scholarship reconsidered: priorities of the professoriate. carnegie foundation for the advancement of teaching. brunke, l. . uncertainty in archaeological 3d reconstruction: a case study of monument at the via appia near rome. unpublished thesis (msc), leiden university. caraher, w. . slow archaeology: technology, efficiency, and archaeological work. in: averett, e, et al. (eds.), mobilizing the past for a digital future: the potential of digital archaeology, – . grand forks: the digital press at the university of north dakota. cavillo, n and garnet, e. . data intimacies: building infrastructures for intensified embodied encounters with air pollution. the sociological review, ( ): – . doi: https://doi.org/ . / chase, af, chase, dz, weishampel, jf, drake, jb, shrestha, rl, slatton, kc, awe, jj and carter, we. . airborne lidar, archaeology, and the ancient maya landscape at caracol, belize. journal of archaeological science, ( ): – . doi: https://doi.org/ . /j.jas. . . clarke, m. .
the digital dilemma: preservation and the digital archaeological record. advances in archaeological practice, ( ): – . doi: https://doi.org/ . / - . . . clifford, j and marcus, ge. (eds.) . writing culture: the poetics and politics of ethnography. berkeley: university of california press. connolly, j and lake, m. . geographic information systems in archaeology. new york: cambridge university press. costa, s, beck, a, bevan, a and ogden, j. . defining and advocating open data in archaeology. in: earl, g, sly, t, chrysanthi, a, murrieta-flores, p, papadopoulos, c, romanowska, i and wheatley, d (eds.), archaeology in the digital era: papers from the th annual conference of computer applications and quantitative methods in archaeology (caa), – . amsterdam: amsterdam university press. doi: https://doi.org/ . / - dallas, c. . an agency-oriented approach to digital curation theory and practice. in: trant, j and bearman, b (eds.), ichim ‘ , international cultural heritage informatics meeting, – . toronto: archives and museum informatics. demján, p and dreslerova, d. . modelling distribution of archaeological settlement evidence based on heterogeneous spatial and temporal data. journal of archaeological science, : – . doi: https://doi.org/ . /j.jas. . . denard, h. . a new introduction to the london charter. in: bentkowska-kafel, a, denard, h and baker, d (eds.), paradata and transparency in virtual heritage, – . farnham: ashgate publishing. digital preservation coalition. . digital preservation handbook. nd edition. available at https://dpconline.org/handbook [last accessed july ]. engel, c and grossner, k. . representing the archaeological process at Çatalhöyük in a living archive. in: hodder, i and marciniak, a (eds.), assembling Çatalhöyük, – . leeds: maney publishing. esteva, m, trelogan, j, rabinowitz, a, walling, d and pipkin, s. . from site to long-term preservation: a reflexive system to manage and archive digital archaeological data.
in: archiving – preservation strategies and imaging technologies for cultural heritage institutions and memory organizations final program and proceedings (den haag, netherlands - june ), – . springfield, va: society for imaging science & technology. fahmie, ta and hanley, gp. . progressing toward data intimacy: a review of within-session data analysis. journal of applied behavior analysis, ( ): – . doi: https://doi.org/ . /jaba. . - faniel, i, kansa, e, whitcher kansa, s, barrera-gomez, j and yakel, e. . the challenges of digging data: a study of context in archaeological data reuse. in: jcdl proceedings of the th acm/ieee-cs joint conference on digital libraries, – . new york: acm. doi: https://doi.org/ . / . fash, wl and agurcia fasquelle, r. . visión del pasado maya. san pedro sula: asociación copán. fash, wl and long, k. . mapa arqueológico del valle de copán. in: baudez, cf (ed.), introducción a la arqueología de copán. tegucigalpa, honduras: instituto hondureño de antropología e historia. forte, m. . 3d archaeology: new perspectives and challenges – the example of Çatalhöyük. journal of eastern mediterranean archaeology & heritage studies, ( ): – . doi: https://doi.org/ . /jeasmedarcherstu. . . forte, m and pescarin, s. . behaviours, interactions and affordance in virtual archaeology. in: bentkowska-kafel, a, denard, h and baker, d (eds.), paradata and transparency in virtual heritage, – . farnham: ashgate publishing. forte, m and pietroni, e. . 3d collaborative environments in archaeology: experiencing the reconstruction of the past. international journal of architectural computing, : – . doi: https://doi.org/ . / gaffney, v, stančič, z and watson, h. . the impact of gis on archaeology: a personal perspective. in: lock, g and stančič, z (eds.), archaeology and geographical information systems: a european perspective, – . london: taylor & francis.
gagnon, sa, brunyé, tt, robin, c, mahoney, cr and taylor, ha. . high and mighty: implicit associations between space and social status. frontiers in psychology, : – . doi: https://doi.org/ . /fpsyg. . gattiglia, g. . think big about data: archaeology and the big data challenge. archäologische informationen: open access and open data, : – . doi: https://doi.org/ . /ai. . . gillings, m and goodrick, gt. . sensuous and reflexive gis: exploring visualisation and vrml. internet archaeology, . doi: https://doi.org/ . /ia. . gruen, a, remondino, r and zhang, l. . photogrammetric reconstruction of the great buddha of bamiyan, afghanistan. the photogrammetric record, ( ): – . doi: https://doi.org/ . /j. - x. . .x gunnarsson, f. . archaeological challenges, digital possibilities: digital knowledge development and communication in contract archaeology. unpublished thesis (licentiate), linnaeus university. harrison, tp. . computational research on the ancient near east (crane): large-scale data integration and analysis in near eastern archaeology. levant: the journal of the council for british research in the levant. doi: https://doi.org/ . / . . hodder, i. . reading the past: current approaches to interpretation in archaeology. new york: cambridge university press. hodder, i. . ‘always momentary, fluid and flexible’: towards a reflexive excavation methodology. antiquity, : – . doi: https://doi.org/ . /s x hodder, i. . where is Çatalhöyük? developing a reflexive method in archaeology. in: hodder, i (ed.), towards reflexive method in archaeology: the example at Çatalhöyük, – . cambridge: mcdonald institute for archaeological research. hodder, i. . archaeological reflexivity and the “local” voice. anthropological quarterly, ( ): – . doi: https://doi.org/ . /anq. . hohmann, h and vogrin, a. . die architektur von copán (honduras): vermessung, plandarstellung, untersuchung der baulichen elemente und des räumlichen konzepts.
verlagsanstalt, graz/austria: akademische druck- und verlagsanstalt. holdaway, sj, emmitt, j, phillipps, r. and masoud-ansari, s. . a minimalist approach to archaeological data management design. journal of archaeological method and theory, : – . doi: https://doi.org/ . /s - - - hong, s. . data’s intimacy: machinic sensibility and the quantified self. communication + , ( ): article . doi: https://doi.org/ . /r cf n howey, mcl and brouwer burg, m. . assessing the state of archaeological gis research: unbinding analyses of past landscapes. journal of archaeological science, : – . doi: https://doi.org/ . /j.jas. . . huggett, j. . promise and paradox: accessing open data in archaeology. in: mills, c, pidd, m and ward, e (eds.), proceedings of the digital humanities congress , – . sheffield: the digital humanities institute. huggett, j. . digital haystacks: open data and the transformation of archaeological knowledge. in: wilson, a and edwards, b (eds.), open source archaeology: ethics and practice, – . warsaw/berlin: de gruyter open. doi: https://doi.org/ . / - huggett, j. . the apparatus of digital archaeology. internet archaeology, . doi: https://doi.org/ . /ia. . hutson, sr. . “unavoidable imperfections”: historical contexts for representing ruined maya buildings. in pillsbury, j (ed.), past presented: archaeological illustration and the ancient americas, – . washington, dc: dumbarton oaks research library and collection. inomata, t. (ed.) . warfare and the fall of a fortified center: archaeological investigations at aguateca. vanderbilt institute of mesoamerican archaeology, . nashville: vanderbilt university press. jones, i and levy, t. (eds.) . cyber-archaeology and grand narratives: where do we currently stand? in cyber-archaeology and grand narratives. in: levy, t and jones, iwn (eds.), one world archaeology. cham, switzerland: springer. doi: https://doi.
org/ . / - - - - _ juarez, s, salgado-flores, s and hernández, c. . the site of noh k’uh, chiapas, mexico: a late preclassic settlement in the mensabak basin. latin american antiquity, ( ): – . doi: https://doi.org/ . /laq. . kansa, ec. . open context in context: cyber infrastructure and distributed approaches to publish and preserve archaeological data. the saa archaeological record, ( ): – . kansa, ec, kansa, sw, burton, mm and stankowski, c. . googling the grey: open data, web services, and semantics. archaeologies, ( ): – . doi: https://doi.org/ . /s - - - kansa, sw and kansa, ec. . data beyond the archive in digital archaeology: an introduction to the special section. journal of archaeological practice, ( ): – . doi: https://doi.org/ . /aap. . kantner, j. . realism vs. reality: creating virtual reconstructions of prehistoric architecture. in: barceló, j, forte, m and sanders, d (eds.), virtual reality in archaeology, – . oxford: bar international series. kensek, k, dodd, l and cipolla, n. . fantastic reconstructions or reconstructions of the fantastic? tracking and presenting ambiguity, alternatives, and documentation in virtual worlds. automation in construction, : – . doi: https://doi.org/ . /j.autcon. . . kintigh, k. . the promise and challenge of archaeological data integration. american antiquity, ( ): – . doi: https://doi.org/ . /s kintigh, k, spielmann, k, brin, a, candan, ks, clark, tc and peeples, m. . data integration in the service of synthetic research. advances in archaeological practice, ( ): - . doi: https://doi.org/ . /aap. . kubler, g. . the art and architecture of ancient america: the mexican, maya and andean peoples. new haven: yale university. kwan, m. . feminist visualization: re-envisioning gis as a method in feminist geographic research. annals of the association of american geographers, ( ): – . doi: https://doi.org/ . / - . landau, k. . spatial logic and maya city planning: the case for cosmology.
cambridge archaeological journal, ( ): – . doi: https://doi.org/ . /s x landau, k. . maintaining the state: centralized power and ancient neighborhoods in copán, honduras. unpublished thesis (phd), northwestern university. landau, k, richards-rissetto, h and wolf, m. . tacking back and forth: using 3d tools to guide archaeological excavation, analysis, and interpretation. in: th annual meeting of the society for american archaeology. austin, texas. lengyel, d and toulouse, c. . the consecution of uncertain knowledge, hypotheses and the design of abstraction. in: börner, w and uhlirz, s (eds.), proceedings of the th international conference on cultural heritage and new technologies, – . stadtarchäologie, vienna: museen der stadt wien. lightfoot, kg. . culture contact studies: redefining the relationship between prehistoric and historical archaeology. american antiquity, ( ): – . doi: https://doi.org/ . / llobera, m. . building past landscape perception with gis: understanding topographic prominence. journal of archaeological science, ( ): – . doi: https://doi.org/ . / llobera, m. a. modeling visibility through vegetation. international journal of geographical information science, ( ): – . doi: https://doi.org/ . / llobera, m. b. reconstructing visual landscapes. world archaeology, ( ): – . doi: https://doi.org/ . / llobera, m. . life on a pixel: challenges in the development of digital methods within an “interpretive” landscape archaeology framework. journal of archaeological method and theory, ( ): – . doi: https://doi.org/ . /s - - - llobera, m, fábrega-Álvarez, p and parcero-oubiña, c. . order in movement: a gis approach to accessibility. journal of archaeological science, ( ): – . doi: https://doi.org/ . /j.jas. . . lock, g. (ed.) . beyond the map: archaeology and spatial technologies. nato science series. washington, dc: ios press. lock, g and stančič, z. (eds.) . archaeology and geographical information systems: a european perspective.
london: taylor & francis ltd. loten, sh and pendergast, dm. . a lexicon for maya architecture. toronto: royal ontario museum. lukas, d, engel, c and mazzucato, c. . towards a living archive: making multi layered research data and knowledge generation transparent. journal of field archaeology, (sup ): s –s . doi: https:// doi.org/ . / . . maca, al. . spatio-temporal boundaries in classic maya settlement systems: copán’s urban foothills and the excavations at group j– . unpublished thesis (phd), harvard university. macfarland, k and vokes, a. . dusting off the data curating and rehabilitating archaeologi- cal legacy and orphaned collections. advances in archaeological practice, ( ): – . doi: https:// doi.org/ . / - . . . richards-rissetto and landau: digitally-mediated practices of geospatial archaeological data maschner, hdg. . new methods, old problems: geographic information systems in modern archae- ological research. southern illinois university, car- bondale: center for archaeological investigations occasional paper. maudslay, ap. – . biologia centrali-americana: archaeology. london: dulau and co. mayer-schönberger, v and cukier, k. . big data: a revolution that will transform how we live, work, and think. united kingdom: john murray publishers. morley, sg. . the inscriptions at copán. carnegie insti- tution of washington publication, . washington, dc: carnegie institution of washington. morrison, kd and lycett, mt. . inscriptions as artifacts: precolonial south india and the analysis of texts. journal of anthropological method and theory, ( / ): – . doi: https://doi.org/ . / bf paliou, e. . visibility analysis in d built spaces: a new dimension to the understanding of social space. in: paliou, e, lieberwirth, u and polla, s (eds.), spatial analysis and social spaces: interdiscipli- nary approaches to the interpretation of prehistoric and historic built environments, – . 
berlin: topoi – berlin studies of the ancient world/topoi – berliner studien der alten welt . doi: https:// doi.org/ . / paliou, e. . visual perception in past built environments: theoretical and procedural issues in the archaeological application of three-dimensional visibility analysis. in: siart, c and forbriger, m (eds.), digital geoarchaeology: new techniques for interdisciplinary human environment research, – . new york: springer. doi: https:// doi.org/ . / - - - - _ prufer, km, thompson, ae and kennett, dj. . evalu- ating airborne lidar for detecting settlements and modified landscapes in disturbed tropical environ- ments at uxbenká, belize. journal of archaeological science, : – . doi: https://doi.org/ . /j. jas. . . rapoport, a. . levels of meaning in the built envi- ronment. in: poyatos, f (ed.), cross-cultural per- spectives in non-verbal communication, – . toronto: c.j. hogrefe. remondino, f, gruen, a, von schwerin, j, eisenbeiss, h, rizzi, a, girardi, g, sauerbier, m and richards-rissetto, h. . multi-sensor d doc- umentation of the maya site of copan. proceedings of nd cipa symposium, kyoto, japan, commis- sion, v, wgv/ . richards-rissetto, h and plessing, r. . procedural modeling for ancient maya cityscapes: initial meth- odological challenges and solutions. digital heritage international congress, : – . ieee con- ference publications. doi: https://doi.org/ . / digitalheritage. . richards-rissetto, h. . studying social interaction at the ancient maya site of copán, honduras: a least cost approach to configurational analysis. in: white, da and surface-evans, s (eds.), least cost analysis of social landscapes: archaeological case studies, – . salt lake city: university of utah press. richards-rissetto, hm. . exploring social interac- tion at the ancient maya city of copán, honduras: a multi-scalar geographic information systems (gis) analysis of access and visibilty. unpublished thesis (phd), the university of new mexico. richards-rissetto, hm. a. 
an iterative dgis analysis of the role of visibility in ancient maya landscapes: a case study from copán, honduras. digital scholarship in the humanities, ( ): ii –ii . doi: https://doi.org/ . /llc/ fqx richards-rissetto, hm. b. what can gis + d mean for landscape archaeology? journal of archaeological science, : – . doi: https://doi. org/ . /j.jas. . . richards-rissetto, hm and landau, k. . movement as a means of social (re)production: using gis to measure social integration across urban landscapes. journal of archaeological science, : – . doi: https://doi.org/ . /j.jas. . . richards-rissetto, hm, robertsson, j, remondino, f, agugiaro, g, von schwerin, j and giradi, g. . kinect and d gis in archaeology. in: th interna- tional conference on virtual systems and multime- dia, – . ieee conference publications. doi: https://doi.org/ . /vsmm. . richards-rissetto, hm, robertsson, j, remondino, f, agugiaro, g, von schwerin, j and girardi, g. . geospatial virtual heritage: an interac- tive, gesture-based d gis to engage the public with ancient maya archaeology. in: earl, g, sly, t, chrysanthi, a, murrieta-flores, p, papadopoulos, c, romanowska, i and wheatley, d (eds.), archaeology in the digital era: papers from the th annual con- ference of computer applications and quantitative methods in archaeology (caa), – . amster- dam: amsterdam university press. doi: https://doi. org/ . / - richards-rissetto, hm and von schwerin, j. . a catch of d data sustainability: lessons in d archaeological data management & accessibil- ity. journal of digital applications in archaeology and cultural heritage, : – . doi: https://doi. org/ . /j.daach. . . roosevelt, ch, cobb, p, moss, e, olson, br and Ünlüsoy, s. . excavation is destruction digitization: advances in archaeological practice. journal of field archaeology, ( ): – . doi: https://doi.org/ . / y. saldana, m. . an integrated approach to the pro- cedural modeling of ancient cities and buildings. 
digital scholarship in the humanities, (suppl_ , december ): i –i . doi: https://doi. org/ . /llc/fqv salzman, pc. . on reflexivity. american anthro- pologist, ( ): – . doi: https://doi. org/ . /aa. . . . richards-rissetto and landau: digitally-mediated practices of geospatial archaeological data saperstein, p. . accurate measurement with pho- togrammetry at large sites. journal of archaeo- logical science, : – . doi: https://doi. org/ . /j.jas. . . smith, n. . towards a study of ancient greek landscapes: the perseus gis. in: lock, g and stančič, z (eds.), archaeology and geographical information systems: a european perspective, – . london: taylor & francis. stahl, ab. . concepts of time and approaches to analogical reasoning in historical perspective. american antiquity, ( ): – . doi: https:// doi.org/ . / stephens, jl and catherwood, f. . incidents of travel in central america, chiapas and yucatan. new york: harper & bros. doi: https://doi.org/ . / bhl.title. sullivan, ea. . seeking a better view: using d to investigate visibility in historic landscapes. journal of archaeological method and theory, ( ): – . doi: https://doi.org/ . /s - - - tilley, cy. (ed.) . reading material culture: structur- alism, hermeneutics and post-structuralism. oxford: basil blackwell. tilley, cy. . interpretative archaeology. explorations in anthropology. oxford: berg publishers. tilley, cy. . a phenomenology of landscape: places, paths and monuments. oxford: berg publishers. tringham, r. . forgetting and remembering: the digital experience and digital data. in: borić, d (ed.), archaeology and memory, – . oxford: oxbow books. ullah, it. . integrating older survey data into mod- ern research paradigms: identifying and correc- tion spatial error in “legacy” datasets. advances in archaeological practice, ( ): – . doi: https:// doi.org/ . / - . . . von schwerin, j, lyons, m, loos, l, billen, n, auer, m and zipf, a. . 
show me the data!: structuring archaeological data to deliver interactive, transpar- ent d reconstructions in a d webgis. in: mün- ster, s, pfarr-harfst, m, kuroczyński, p and ioannides, m (eds.), d research challenges in cultural heritage ii: how to manage data and knowledge related to interpretive digital d reconstructions of cultural heritage, – . cham, switzerland: springer. doi: https://doi.org/ . / - - - - _ von schwerin, j, richards-rissetto, h, remondino, f, spera, mg, auer, m, billen, n, loos, l, stelson, l and reindel, m. . airborne lidar acquisi- tion, post-processing and accuracy-checking for a d webgis of copán, honduras. journal of archaeo- logical science: reports, : – . doi: https://doi. org/ . /j.jasrep. . . webster, dl. (ed.) . the house of the bacabs, copán, honduras. washington, dc: dumbarton oaks. wells, j, kansa, ec, kansa, sw, yerka, sj, anderson, dg, bissett, tg, myers, kn and demuth, rc. . web-based discovery and integration of archaeo- logical historic properties inventory data: the digi- tal index of north american archaeology (dinaa). literary and linguistic computing, ( ): – . doi: https://doi.org/ . /llc/fqu wheatley, d and gillings, m. . spatial technology and archaeology: the archaeological applications of gis. baton raton, florida: crc press. doi: https:// doi.org/ . / willey, gr and leventhal, rm. . prehistoric set- tlement at copán. in: hammond, n and willey, gr (eds.) maya archaeology and ethnohistory, – . austin: university of texas press. witcher, re. . (re)surveying mediterranean rural landscapes: gis and legacy survey data. internet archaeology, . doi: https://doi.org/ . / ia. . wood, wr. . ethnohistory and historical method. archaeological method and theory, : – . https://www.jstor.org/stable/ . wright, h and richards, j. . reflections on collabo- rative archaeology and large-scale online research infrastructures. journal of field archaeology, (sup ): s – . doi: https://doi.org/ . / . . wylie, a. . 
how to cite this article: richards-rissetto, h and landau, k. ( ). digitally-mediated practices of geospatial archaeological data: transformation, integration, & interpretation. journal of computer applications in archaeology, ( ), pp. – . doi: https://doi.org/ . /jcaa.

submitted: january accepted: july published: august

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

journal of computer applications in archaeology is a peer-reviewed open access journal published by ubiquity press.

www.ssoar.info

the role of humanities computing: experiences and challenges
short, harold
published version, journal article
provided in cooperation with: gesis - leibniz-institut für sozialwissenschaften

suggested citation: short, h. ( ). the role of humanities computing: experiences and challenges. historical social research, ( ), - . https://doi.org/ . /hsr. . . . -

terms of use: this document is made available under a cc by licence (attribution). for more information see: https://creativecommons.org/licenses/by/ .
this version is citable under: https://nbn-resolving.org/urn:nbn:de: -ssoar-

humanities computing

the role of humanities computing: experiences and challenges

harold short ∗

abstract: this article was written on the occasion of the thirtieth anniversary of the department for literary and documentary data processing in tuebingen. it gives an overview of humanities computing developments since the formation of this research department. the paper is divided into three parts. first, experiences in humanities computing are reviewed: the author points out various aspects of the development and exploitation of scholarly materials using computers, and considers some of the current work to create new tools for research. this is followed by a discussion of some of the key challenges of this century facing humanities computing and the scholarship of which it is a part. finally, the author summarises what in his opinion would be the key roles of humanities computing in the future.

introduction

we are meeting today to acknowledge and celebrate the more than years of humanities computing here in tuebingen, and the thirtieth anniversary of the department for literary and documentary data processing (abteilung literarische und dokumentarische datenverarbeitung): http://www.uni-tuebingen.de/zdv/tustep/.

∗ from the proceedings of the . colloquium on the application of electronic data processing in the humanities at the university of tübingen, . november (revised version: may ). address all communications to harold short, king’s college london, centre for computing in the humanities, strand, london wc r ls, united kingdom. e-mail: harold.short@kcl.ac.uk.
it is a great privilege to be here to join your celebration. i had planned to be here in any case, but to be sitting out there with you rather than standing here. so although i was honoured by professor ott's invitation to take professor zampolli's place, i also found it extremely daunting, all the more so when i realised that the person who had spoken at the th anniversary was that great pioneer of our field, roberto busa sj.

it is much to be regretted that our esteemed colleague professor zampolli could not be present, and i echo professor ott's good wishes to him. he would have been a most appropriate speaker, for he attended that famous colloquium here in with father busa, and has been closely involved in the whole range of humanities computing developments over the intervening years. he was a student of busa and a collaborator with ott and wisbey and others in the founding of the association for literary and linguistic computing (allc: http://www.allc.org) in . i cannot begin to have either the depth or the breadth of his understanding across the field of humanities computing.

this is, i believe, an exciting time for anyone involved in the application of computing in the humanities disciplines. information and communications technologies are evolving at a pace almost impossible to keep up with and difficult even to comprehend. there are new areas of research and activity that impinge on us more and more in fields near and far: information science, computer science, engineering, cognitive science, not to mention the wider cultural heritage sector, with which humanities scholarship has had more traditional ties. somehow we have to make ourselves aware of and make sense of all of this.

experiences old and new

our experiences in humanities computing now go back more than years, each decade more packed with change and innovation than the one before.
in what must necessarily be a brief and highly selective overview, i will concentrate on the scholarly creation and use of 'digital resources' (to use the current term), and will refer to a number of projects and institutions involved in this work. this should certainly not be taken to indicate a lower regard for other kinds of scholarly use of computers; far from it.

the focus of my discussion is what i am calling 'hybrid resources for scholarship'. the creation of such 'resources' - the product of scholarship and created for the use of scholars - has been a key activity throughout our fifty years. roberto busa set out on his computing path with the purpose of producing a concordance and index to the works of thomas aquinas. it was a work of substantial research, and a product of research that itself has become an invaluable tool of research.

[footnote: kolloquium on november . busa's paper was entitled 'half a century of literary computing: towards a "new" philology'. see literary and linguistic computing ( ) , - .]
[footnote: this kolloquium, entitled "internationales kolloquium über maschinelle methoden der literarischen analyse und der lexikographie", was held in the university of tuebingen on november .]

the record of scholarly publication here in the zdv/alddv in tuebingen is remarkable, and it could be characterised in a similar way. the record of achievement and of far-sighted technical design is one of which he and his colleagues in lddv, and the university of tuebingen more generally, should be proud. it is an achievement that is known and admired far beyond this institution and this city. professor ott has told me that more than volumes have been prepared with tustep.
the editions page on the tustep web site (http://www.uni-tuebingen.de/zdv/tustep/tustep_eng.html) needs an index for its several hundreds of entries, and that is before you get on to the lists of bibliographies, indexes, concordances, dictionaries and other reference works in whose production it has been used.

the changes over the years in the character of the publications, in the tustep software itself, and no less in the papers given in this kolloquium, reflect developing technologies and changing expectations, and the increasing intermixture of paper and electronic forms. at first a key role of the technology was to make possible or enhance the paper publication. as the technologies have developed, and the scholarly horizons have changed to take account of them, so the publication forms have developed. in many cases now it is the digital form that is the key resource for scholars, and if there are paper publications at all, they are supplementary to the electronic one.

hence my use, above, of the term 'hybrid' - encompassing research resources in both digital and non-digital forms, not just the purely digital. in using the term i seek to include not only a mixed form of publication, but also the paper or other material form of the sources on which the digital forms may be based. thus i do not see 'hybrid' as merely an intermediate stage between 'paper' and 'digital'; rather as something long term, perhaps even in some way 'ideal', in which a paper or material original is preserved and treasured for what it is, and the electronic is exploited for what it makes possible.

one important characteristic of the work of busa, ott and many others has been that it falls firmly within the existing traditions of textual scholarship. what has been developed is built on solid foundations and is highly practical, although this does not mean it has failed to be innovative. it has made possible many things that were not feasible before.
busa's index thomisticus could be completed in a matter of decades, rather than of centuries or not at all; concordances have become matters of routine rather than a life's work, a point made by michael sperberg-mcqueen in his paper in this kolloquium five years ago. tustep has brought new possibilities of rigour, consistency and comprehensiveness to the preparation and production of scholarly editions, and has made possible new kinds of editions.

[footnote: commonly known as the index thomisticus: the busa edition ( - ): s. thomae aquinatis opera omnia; ut sunt in indice thomistico, additis scriptis ex aliis medii aevi auctoribus. frommann-holzboog, stuttgart-bad cannstatt, ; available on cd-rom as a supplement to the -volume printed edition.]

textual studies: thinking with mark-up

of course not everyone uses tustep. since all processing of texts by computer involves some form - perhaps several forms - of mark-up (or encoding), it was natural that as the volume of work of this kind increased, so the question of standards should arise. this in turn gave birth to the text encoding initiative (tei), one of the most remarkable projects of its kind to have been undertaken. my authority for this is not a computing humanist nor anyone directly involved, but jon bosak, chair of the xml work group of the world wide web consortium. the intellectual basis of tei as a document grammar system was a key part of the sperberg-mcqueen paper here years ago, so i will touch instead on some key practical consequences of its development and use.

the tei is remarkable for at least three reasons.
- first, it does indeed provide a basis for ensuring that texts can be transferred between different hardware and software platforms without loss of data, not only providing a measure of 'future-proofing' against hardware and software changes, but also enabling scholars to exchange and share their encoded texts.
- second, it engaged some of the finest scholarly minds in a common endeavour over more than a decade, a process which yielded particular insights into the ways in which mark-up may be of much greater value than 'mere' longevity in a variety of scholarly endeavours.
- third, the work of the project led directly to its north american editor, michael sperberg-mcqueen, taking a leading role in the w3c work group that defined the xml (extensible mark-up language) standard.

the tei is continuing, as a consortium co-hosted at virginia, brown, bergen and oxford (http://www.tei-c.org/), and sperberg-mcqueen now works for w3c. as the wider xml 'revolution' gathers pace, there are signs that some of the long-term significance of the tei will be related to xml, and the opportunities it is starting to bring to textual scholarship, not only in burgeoning quantities of encoded texts (of which more later), but also in the development of new tools to exploit them.

[footnote: kolloquium : november . see literary and linguistic computing ( ) , - .]
[footnote: bosak characterised the tei in this way during his closing keynote talk at the 'tei ' tenth anniversary conference, held at brown university, - november . http://www.stg.brown.edu/conferences/tei]

as you are no doubt aware, in order to produce a basic tei-conformant text it is necessary to carry out only what is sometimes termed 'shallow' or 'light' mark-up - enough essentially to capture the structure of the text - the chapters and paragraphs of a novel, the acts, scenes and lines of a play, the stanzas and lines of a poem. tei encoding can be used, of course, to do much more, but that is the minimum. let us look next at an example of very 'deep' mark-up, which, although it does not actually use tei (it began before the tei standards were defined), illustrates some of the intellectual challenges and potential benefits of deep encoding in textual scholarship.
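before turning to that example, the 'light' mark-up just described can be pictured with a small sketch. the fragment below is a simplified, tei-like illustration and not a validated tei document: the element names div and l follow common tei usage, but the sample text, the attribute values, the omission of the tei header and namespace, and the helper name outline are all assumptions made for this example. it lists each scene of a play together with the number of verse lines it contains, which is the kind of structural question that shallow encoding already supports.

```python
import xml.etree.ElementTree as ET

# Invented, simplified TEI-like sample: acts and scenes as nested <div>
# elements, verse lines as <l>. No TEI header or namespace is included.
SAMPLE = """
<text>
  <body>
    <div type="act" n="1">
      <div type="scene" n="1">
        <l n="1">first line of the scene</l>
        <l n="2">second line of the scene</l>
      </div>
      <div type="scene" n="2">
        <l n="1">a line in the next scene</l>
      </div>
    </div>
  </body>
</text>
"""


def outline(xml_text):
    """Return (act number, scene number, line count) for every scene."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for act in root.iter("div"):
        if act.get("type") != "act":
            continue  # skip the nested scene divs at this level
        for scene in act.findall("div"):
            if scene.get("type") == "scene":
                rows.append((act.get("n"), scene.get("n"), len(scene.findall("l"))))
    return rows


if __name__ == "__main__":
    for act, scene, lines in outline(SAMPLE):
        print(f"act {act}, scene {scene}: {lines} line(s)")
```

because only structure is encoded, the same function would work unchanged on any text marked up to this minimal level; anything interpretive would need the deeper tagging discussed next.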
for the past eleven years my friend and colleague willard mccarty has been preparing an analytical onomasticon to the metamorphoses of ovid: http://ilex.cc.kcl.ac.uk/analyticalonomasticon/index.htm

there is not time here to attempt any detailed description of the work, but to quote mccarty, it is "in essence a systematic, disciplined means of discovering and following paths of association through the metamorphoses - specifically those formed by shared names and what i call 'onomastic profiles'" (unpublished; quoted with permission). it is a 'book of names' of an entirely new kind, in which a systematic attempt is made to capture all references to 'persons', whether directly by proper name or by other appellative devices, direct or indirect. given the subject matter of the poem, the poetic process of personification is of particular interest.

classical scholars will judge how effective the onomasticon is as a research tool when it is published. however, i want to focus on its methodology. what mccarty set out to do required the construction of a metalanguage - a complex set of tags - that would enable him to mark up in the text all the words and phrases that he judges to have an appellative function. the marked up text constituted a 'model' of the poem and its appellative devices; as the work progressed, there was a process of change, both in the tag set itself and in the tagging, thereby changing the 'model'. thus when someone uses the resource in its published electronic form, what they will see will reflect mccarty's interpretation of ovid's poetics. reflections of this kind are not themselves new, of course - all publications represent the interpretations of their authors or editors. however, what will be new is that the basis of his interpretation, his model, will be entirely and comprehensively explicit. his critics will be able to take issue with anything from the encoding of a single word to his entire metalanguage scheme.
what is more, it will be possible for other scholars not only to 'replicate' his results, but also to change the mark-up and thence to produce new results - they will change the model, and so will be able to generate new interpretations.

there are two other aspects of the work that are worthy of comment, both raised by mccarty himself at various times.
- the first is to do with his experience of the encoding process. the machine has no intelligence, and is therefore merciless in showing up inconsistencies and contradictions in the mark-up scheme. thus the process itself becomes a significant tool for thinking about the poem and its poetic devices. it also is a very good illustration of the modelling-failure cycle, which was referred to or even emphasized, i noticed, in the papers given in this kolloquium by busa ( kolloquium : november ), sperberg-mcqueen ( kolloquium : november ) and susan hockey ( kolloquium : december ). each 'failure' gave mccarty new insights both into his scheme and into ovid's work, and provided a basis for refining his model.
- the second is to do with the commitment of time and energy required by such an undertaking. many - perhaps most - words in the poem have more than one tag attached to them; in such a long poem, this represents substantial labour. deep encoding is not for everyone! however it illustrates in a very practical way one of the fundamental potentialities of humanities computing - and i am not aware of many undertakings of its specific kind.

i may of course be wrong, but i believe that in time it will come to be seen as a piece of seminal work, perhaps to rank even with busa's index thomisticus - but perhaps at this point i should remind you also that mccarty is a close friend and colleague!

these remarks should not be understood to imply that scholars use the computer as a tool for thinking only through deep encoding.
from the beginning the best and most intelligent use of computing has enabled new ways of thinking about the materials of scholarship, and the iterative process of modelling and failure characterises much if not all scholarly work with computers.

[footnote: see also unsworth, j.: "the importance of failure," in the journal of electronic publishing, . (december, ).]

databases and structured data

another area of activity in which a similar phenomenon may be observed is historical studies. i will not attempt to encompass all of such a broad field. i noticed that one of your speakers here was professor manfred thaller, who spoke to you about the clio system developed by him and his colleagues ( kolloquium : november ). he will have given you a much wider overview of historical computing than i could possibly do. instead i will focus on one particular kind of research, that of prosopography, in which the centre for computing in the humanities (http://www.kcl.ac.uk/cch) at king's college has some direct involvement. we are collaborators in three major projects of this kind, funded by the uk's arts and humanities research board:
- the prosopography of the byzantine empire (http://www.kcl.ac.uk/cch/pbe)
- the prosopography of anglo-saxon england (http://www.kcl.ac.uk/cch/pase)
- the clergy of the church of england database (http://www.kcl.ac.uk/cch/cce)

a fourth project, the prosopography of roman egypt, is on temporary hold. in each case information is recorded within a formalised structure - in these projects the data is managed by relational database software - which reflects the scholars' view of what is important in the subject matter; or, to be more precise, what is both important and also amenable to being forced into a structure of this kind, which is an important qualification.
where this approach is appropriate, the database tools make possible the rapid retrieval of data in many combinations, the asking of more or less complex questions, and the selection of sets of data that may reveal patterns or discontinuities, and that demonstrate or suggest links between different data elements. as with all the best scholarship, it may be at its most useful when it raises new questions, more so perhaps than when it helps to provide 'answers'.

projects such as these pose technical as well as scholarly challenges. the technical approach adopted in each involves the development of a 'data collection database' to make the gathering of information as efficient as possible with respect to the current scope of an individual researcher, or to the characteristics of a particular source type; the development of a 'master database' into which the collected data is uploaded and integrated; and procedures for producing richly varied means of accessing and manipulating the data.

there is another database project at king's college which has a very different scholarly purpose - a literary one. dr david yeandle is creating a line-by-line bibliography for wolfram von eschenbach's parzival. database entries are created for all published work on the poem, recording the line numbers and thematic aspects addressed. this makes it possible to access the bibliographic information by line number, theme, author and year. although this project uses database technology rather than mark-up, there are parallels with mccarty's onomasticon: in the comprehensiveness of its aims, and the level of detail at which the work is done.

[footnote: yeandle, david n.: stellenbibliographie zum "parzival" wolframs von eschenbach für die jahrgänge - niemeyer verlag, tübingen, to be published on cd-rom, . details of the project may be found at: http://www.kcl.ac.uk/kis/schools/hums/german/parzive.html]
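the line-by-line bibliography just described can be pictured as a small relational schema. the sketch below is not the project's actual design: the table names, column names, the use of plain integer line numbers, and every row of sample data are invented placeholders, chosen only to show how recording line ranges and themes per publication makes retrieval by line number possible. it uses sqlite through python's standard library.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with hypothetical tables and rows."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE publication (
            id INTEGER PRIMARY KEY, author TEXT, year INTEGER, title TEXT);
        CREATE TABLE passage_ref (
            publication_id INTEGER REFERENCES publication(id),
            line_start INTEGER, line_end INTEGER, theme TEXT);
        """
    )
    # Placeholder publications, not real bibliography entries.
    con.executemany(
        "INSERT INTO publication VALUES (?, ?, ?, ?)",
        [
            (1, "author a", 1990, "placeholder study one"),
            (2, "author b", 1995, "placeholder study two"),
        ],
    )
    # Each row says: this publication discusses these lines under this theme.
    con.executemany(
        "INSERT INTO passage_ref VALUES (?, ?, ?, ?)",
        [
            (1, 1, 10, "theme x"),
            (2, 5, 5, "theme y"),
            (2, 100, 120, "theme x"),
        ],
    )
    return con


def works_on_line(con, line):
    """Return (author, year, theme) for every work that discusses a line."""
    cur = con.execute(
        """
        SELECT p.author, p.year, r.theme
        FROM passage_ref AS r
        JOIN publication AS p ON p.id = r.publication_id
        WHERE ? BETWEEN r.line_start AND r.line_end
        ORDER BY p.year, p.author
        """,
        (line,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    demo = build_demo()
    print(works_on_line(demo, 5))
```

the same two tables would also answer queries by theme, author or year with a different where clause, which is the combination of access routes the text mentions.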
in both cases, also, 'publication' makes sense only in electronic form - although certain kinds of paper output might be useful as 'spin-off' publications. in this they are examples of 'second generation digital resources', to which i will return later.

computational and corpus linguistics

philology has traditionally encompassed both literary and linguistic scholarship, and it is not surprising that the origins of the association for literary and linguistic computing in the early s lie in a time of awareness of parallels and overlaps in the application of computing in linguistic and literary studies. some of the developments in computational linguistics during the s and s were based in and driven by computer science departments, with an emphasis on theoretical linguistics, and these seemed remote from the interests of literary scholars, so 'literary computing' and 'linguistic computing' appeared to diverge to some extent. more recently, however, the emergence of 'corpus linguistics' and of corpus-based research more generally, as well as continuing work in applied linguistics, have fostered a welcome process of re-convergence. in part this is based on a recognition that theoretical and practical issues in corpus building and corpus use are common to linguistic and literary studies, and in part on an understanding of how the tools developed for linguistic applications may have a significant role in literary research. professor zampolli's institute in pisa, the istituto di linguistica computazionale, is just one example, among many, of where tools of this kind have been developed, and where the 'linguistic' and the 'literary' have not been divorced (http://www.ilc.pi.cnr.it/).

there is also a wider social context for this process of re-convergence.
corpus linguistics provides the foundation of a great deal of work in the field of 'human language technologies', which today has major cultural, political and commercial significance, notably in the european context, but also far beyond, on a truly international scale. the european language resource association (elra: http://www.icp.inpg.fr/elra/), in whose establishment antonio zampolli has played a major role, demonstrates the vibrancy and potential of current work in this area, for example in the range of resources and tools created under its aegis. professor zampolli could tell you much more about these developments than i can, but they remain as central to the experiences of 'humanities computing' as they were at the start of our journey, in the work of father busa.

mixed media resources

mixed media resources have become increasingly important to humanities scholars as the technologies that enable them have developed, including increasingly sophisticated techniques to manipulate images, the decreasing cost of storage that makes it possible for very large files to be manipulated even on personal computers, and the increasing bandwidth of the networks that make possible the rapid transfer of high data volumes.

as my first example i would draw your attention to the blake archive (http://www.iath.virginia.edu/blake/), based in the institute for advanced technology in the humanities (iath: http://www.iath.virginia.edu) at the university of virginia. it is a representation of the work of someone who was both a poet and an artist, and whose work can be understood best when the poetry and the art are seen and studied in an integrated way. thus the intellectual requirement finds a very appropriate match in the technical possibilities.
one cannot help being struck by the degree and quality of intellectual input required in the creation of a resource of this kind if it is to be of value to a scholarly audience, even if on a superficial level its ease of use makes it accessible to a much wider non-specialist audience. for example, substantial 'metadata' has been created to describe the image content, which makes possible content-based searches of the images and also thematic cross-referencing between images and poems. other aspects of relevance to our present discussion include the 'modelling' of the data that underlies the resource, the range of sources and contributors it draws on, and the collaborative basis of its development. i shall return to some of these matters later.

as my next example, i have chosen a project at the courtauld institute of art, the corpus of romanesque sculpture in britain and ireland (crsbi: http://www.crsbi.ac.uk). the resource consists of photographs of romanesque sculpture together with expert descriptions and commentaries. the researchers are volunteer art historians, and the work of the project is overseen by a committee of eminent scholars. the researchers take the photographs and prepare the written material; the photographs and texts are then sent in to the project office, where they are edited and prepared for on-line publication by the editor and the research officer. as the material is received and processed, the details are recorded in a central database. the images are scanned - at a high resolution for preservation purposes, with lower resolution (jpeg) versions produced for web display purposes (thumbnail and full-screen). the texts are edited, with xml tags inserted to reflect the carefully designed structure of information the scholars are asked to produce, as well as to identify specialised terms so they can be linked later to a glossary.
the database, image and xml data are then used to display integrated html materials on the web site, by means of a set of (perl) scripts. i sketch the process because i believe it is typical of many multi-media digital resource projects currently in progress.

another such project at the courtauld, the corpus vitrearum medii aevi (cvma: http://www.cvma.ac.uk), is digitising photographs of medieval stained glass and adding a great deal of metadata in order to create another searchable on-line database that will be valuable to scholars as well as having wider appeal. the cvma is in fact an international project, with well-established paper publication series of scholarly commentaries, and the pilot project is enabling the cvma to explore how such commentaries could in the future be integrated with the image archive.

among the things i find interesting about these three projects - and many others like them - is the similarity in the intellectual and technical challenges posed by taking different kinds of source material and shaping them in an intellectually rigorous way into a whole that aims to provide a basis for new scholarship. but they have other important characteristics. i selected the blake archive because it provides a practical interdisciplinary framework for literary scholars and art historians; crsbi because it enables important art historical evidence to be preserved that would otherwise be lost; cvma because it has the potential to bring together into a single unified resource images of all the uk's medieval stained glass (europe's too if the international project adopts the same approach across its member countries).
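
the crsbi production step described above, in which edited xml texts and image files are combined by scripts into html pages with glossary links, can be illustrated with a small sketch. it is written in python rather than the perl the project used, and every name in it is an assumption made for illustration: the element names (entry, site, description, term), the file names and the glossary url pattern are hypothetical and are not the crsbi schema.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical entry: the element names are illustrative assumptions,
# not the markup actually used by the CRSBI project.
ENTRY_XML = """<entry id="demo-001">
  <site>Example Church</site>
  <description>The south doorway has a <term>tympanum</term> with <term>chevron</term> ornament.</description>
</entry>"""


def render_description(elem):
    """Return HTML for elem, turning each <term> child into a glossary link."""
    parts = [html.escape(elem.text or "")]
    for child in elem:
        if child.tag == "term":
            word = (child.text or "").strip()
            # Assumed URL pattern: one glossary page with an anchor per term.
            parts.append('<a href="glossary.html#{0}">{1}</a>'.format(
                html.escape(word.lower()), html.escape(word)))
        else:
            # Any other inline element is flattened to its plain text.
            parts.append(html.escape("".join(child.itertext())))
        # Text that follows the child element inside the description.
        parts.append(html.escape(child.tail or ""))
    return "".join(parts)


def render_entry(xml_text, image_files):
    """Combine one XML entry and its image file names into an HTML fragment."""
    root = ET.fromstring(xml_text)
    site = html.escape(root.findtext("site", default=""))
    body = render_description(root.find("description"))
    images = "".join(
        '<img src="{0}" alt="{1}">'.format(html.escape(name), site)
        for name in image_files)
    return "<h1>{0}</h1><p>{1}</p>{2}".format(site, body, images)


print(render_entry(ENTRY_XML, ["demo-001_thumb.jpg"]))
```

in a real project the list of image files would presumably come from the central database rather than being passed in by hand; the sketch only shows how structural mark-up in the text lets a script generate the glossary links automatically.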
these characteristics - of bringing source materials together in 'virtual' collections, of bridging the gaps between disciplines, and of providing access to remote or fragile resources - are in one or more respects true of a great many of the scholarly multi-media projects now under way, and in the latter respect at least of the hundreds of digitisation projects being undertaken by museums, galleries and libraries all round the world. all such resources are likely to be used by scholars for research and teaching, as part of the global digital library. the management of this 'library' to ensure access and long-term availability is far from straightforward, and this is the first of the challenges to which we'll turn our attention in a moment.

bringing to a close this rapid and selective survey of the 'experiences' of humanities computing, i should emphasise the fact that many major areas of activity have not been touched on, such as the study of literary style and authorship attribution. let me repeat that their omission is only because i chose to take a 'digital resources' perspective for this talk, not because i regard them as less important for humanities computing or for scholarship. [note: cf. john burrows and his seminal work on jane austen, computation into criticism: a study of jane austen's novels and an experiment in method, oxford, clarendon press, , and the substantial body of work by burrows, mckenna, craig and others at the centre for literary and linguistic computing at the university of newcastle, new south wales. for a recent overview of work in this area, see holmes, d.i.: the evolution of stylometry in humanities scholarship in: literary and linguistic computing vol no ( ), pp. - .]

challenges

many of the challenges that face us in the new century arise from the rapidity of technological change and the loss of stability in institutions and practices, and in society at large, that follow from this.
among the most complex of these is the management and preservation of the ever-growing range of digital resources. from a scholarly point of view what is particularly important is the integration of the digital and the non-digital in what i described earlier as the 'hybrid' library.

the hybrid library

all of the resources we have so far looked at might be termed 'second generation' digital resources, according to criteria proposed by john unsworth in the opening keynote address at the digital resources for the humanities conference (drh) in sheffield. in his paper unsworth suggested that the distinguishing features of 'second-generation resources' are likely to include the following: collaborative; multi-disciplinary; multi-media; multi-technology; large-scale; complex; long-term (both in terms of their creation and their intended usefulness). this formulation is useful, although clearly not exclusive, and in his paper unsworth identified a number of important issues that arise in the conception, planning, project management, development, dissemination, management, and preservation of these resources. i should add that it would be against the inclusive spirit of unsworth's paper to see his proposal as a yardstick for deciding whether some piece of work is 'old-fashioned' or 'modern'; rather its importance lies in its systematic analysis of the issues raised by new technical possibilities, and in his proposed agenda of research that follows from the analysis. his focus is on:
- the scholarly use of primary resources in digital form
- the adoption by libraries of what he calls 'born-digital' resources
- the new interactions needed by scholars, publishers and libraries in the co-creation and dissemination of scholarly digital resources

in part, unsworth's proposals arise from a recognition that the scale of the operation is changing, that we are approaching a threshold that necessitates new thinking. the andrew w.
mellon foundation is funding unsworth's group to carry out research in this area: supporting digital scholarship (http://www.iath.virginia.edu/sds/). [note: digital resources for the humanities (drh) is an annual conference that specifically addresses the scholarly, information and technical issues related to the creation, management, use and preservation of digital resources in the humanities: http://www.drh.org.uk. drh was held at the university of sheffield in september .] [note: this is my own summary and interpretation of what he said, rather than a quotation from a published version of the paper, which has not appeared as yet.]

it may also be worth mentioning: the uk's arts and humanities data service, which supports the creation of scholarly digital resources, in part by recommending technical standards and specifying metadata structures to ensure wide access and long-term preservation; and the fedora architecture for a 'digital objects repository', developed by computer scientists at cornell. over time, no doubt, the structure and practices of the global digital library will emerge from these and many, many other initiatives addressing various aspects of the hybrid library challenge.

modes of publication

the big challenges in publication are - rightly - well publicised, and we will return to them, but i'd like to begin by considering the question from the point of view of an individual project, where one of the primary objectives is likely to be to make its information available to as wide a scholarly audience as possible. this is a question that raises technical design issues at a very practical level, and in all the examples we have considered thus far there has been a conscious decision to 'protect' the user from the technology as far as possible.
this approach characterises many projects, and in the projects in which cch is involved we have articulated the principles as a 'general model' comprising three levels of access, which for convenience we have labelled 'browse', 'query' and 'specialised' access. with 'browse access' materials are presented in standard world wide web format as a set of indices, from which 'point and click' is all that is needed to lead the user to the underlying data displays. the key characteristics of this level are no or minimal technical barriers to use, and minimal prior understanding of the data. with 'query access', one or more search forms are provided, also web-based, allowing users to construct enquiries based on simple or complex criteria. this requires a greater degree of understanding of the data, but still raises no technical barrier. the third level, 'specialised access', is for users who are willing to learn - or already know - the computer language which will enable every ounce of the complexity in the underlying system to be exploited. [note: the arts and humanities data service (ahds) is funded by the uk's joint information systems committee (jisc) and arts and humanities research board (ahrb). it has a co-ordinating executive, and five 'service providers', covering texts, history, architecture, visual arts, and performing arts (http://www.ahds.ac.uk).] [note: fedora is an acronym for 'flexible extensible digital object repository architecture', developed by carl lagoze, sandy payette and colleagues at cornell. since the time of the talk, the andrew w. mellon foundation has funded a collaborative project based at cornell and virginia, involving lagoze and payette at cornell and thornton staples and colleagues in the digital library research and development group at virginia, to develop a practical implementation of the fedora model (http://fedora.comm.nsdlib.org/).] [note: see also: deegan, marilyn and tanner, simon: digital futures: strategies for the information age, london, .]
in the case of relational database projects, for example, this is likely to be structured query language (sql).

part of the reason for this three-tiered approach is to maximise the use and the usefulness of these expensively created resources. it is anticipated that the browse level mechanism will suffice for a substantial number even of scholarly users, as well as opening the resources up to a wide range of non-specialist use, e.g. in schools and public libraries. with the addition of the form-based tier, it is expected that almost all scholarly needs will be met. and for a few hardy souls, there will still be the final level!

returning to the more major issues, i don't propose to say a great deal since they are well rehearsed. the roles and relationships of scholars, libraries and publishers are having to change, and there are no clear models for how to proceed. the economics of publishing are changing, and this has given significant impetus to thoughts of institution- or consortium-based approaches, and the open archives initiative (oai: http://www.openarchives.org/) and the sparc initiative, with its slogan "returning science to scientists" (http://www.arl.org/sparc/), may be important in this regard. the new initiative at the university of virginia to create a department in the university press with specific responsibility for 'born-digital' resources is also very interesting (http://www.iath.virginia.edu/imprint/).

new methods and new tools

the application of computing is forcing a re-evaluation of research methodology in the humanities, partly because of the scale of evidence these methods make available for systematic analysis, as previously alluded to, and partly because of the rigour demanded by computational processes.
earlier this year we organised a colloquium at king's entitled 'humanities computing: formal methods, experimental practice', held on may (http://www.kcl.ac.uk/humanities/cch/seminar/ - /seminar_hc.html), which drew on such fields as computer science, sociology, and philosophy and history of science, as well as the humanities and humanities computing. it was an attempt to address issues of inter-disciplinarity, of which more later, and to explore the extent to which it may be useful to think of the application of computing in the humanities as an experimental science.

scholarly primitives

one paper in the colloquium tackled the question of whether humanities computing should identify, and if necessary create, a set or sets of scholarly primitives, whose combination and re-combination would enable researchers to shape and re-shape their data in the ceaseless quest to find patterns and discontinuities. tustep is a fine model here, with its modularity linked to what are described as 'the small steps' of progressing through the work. more thinking of this kind is needed. the ideal is a set of protocols enabling interchange of data and interoperability between systems.

modelling and structured design

inherent in any mark-up of text, the creation of any database, the creation of any digital resource, is an iterative process of analysis and modelling, with the finished work being an instantiation of the final model in that process. at its best the process forces researchers to view their data in a new way, to confront new aspects of inconsistency - and at times to consider whether the data is really susceptible to the rigours of a formal analysis and modelling process. productive - and necessary - though these methods are at their best, they are not widely known and understood, and their wider dissemination is one of the important challenges we face.

statistical methods

the use of statistical methods, e.g.
in stylistic analysis and authorship attribution studies, is well established. although not seen as 'glamorous' in comparison with the latest multi-media fashions, the work has been continuing steadily, due to the efforts of burrows and others, as previously mentioned, and has been making gains as new statistical methods are proposed and tried. in some areas of historical research their use has been much more widespread. it seems clear that for many humanities researchers, the methods continue to seem forbidding, partly because of a reasonable suspicion of attempts to 'measure' the creative imagination, and partly through lack of awareness about the possibilities of statistical methods and lack of training in the associated techniques and procedures. professor anthony kenny, past president of the british academy, said in the british library research lecture:

"the third lesson is that it will not be possible for humanists to take full stock of what the computer has to offer their disciplines until the study of statistics becomes a normal and inescapable part of the training of those who plan an academic career in the humanities. this needs to be recognised not only at university level (as in france, where now a course in statistics is an essential part of the training of an academic historian) but also in any high school which has an interest in sending on students to do university work in the humanities." [note: kenny, a.: computers and the humanities. ninth british library research lecture. london: british library, , p. .]

this may well be another of our challenges.

institutional structures

institutional models for humanities computing

one of the things our two institutions - tuebingen and king's - share is a particular conception of how best to provide the framework for scholarly work in the humanities using computers. this conception is based firmly on a notion of creative interaction between scholarship and technology.
let us begin with the lddv department here in tuebingen, whose anniversary we are celebrating. it consists of academic staff and programmers, and it is worth quoting from (the english version of) professor ott's description of the lddv:

"... this implies that service and research are closely related: part of our service ... consists in the research in data processing methods and in the development of tools for computer-aided philological research. e.g. we do not prepare critical editions or carry out historical research projects, but we develop software ... and collaborate with the scholar responsible for a project in designing methods for new applications."

on the web site of my own department, the centre for computing in the humanities (cch) at king's college london, you will find similar language:

"the primary objective of the cch is to foster awareness, understanding and skill in the scholarly applications of computing."

all of us in cch have both humanities and computing backgrounds, and it would not be possible to do what we do without this.

a third institutional example to which i would direct your attention is john unsworth's institute for advanced technology in the humanities (iath) at the university of virginia. its work also is based on the principle of scholarly and technical collaboration. one particularly interesting feature is their programme of research fellowships, which involve support from iath, a semester of teaching relief and support for a graduate research assistant. you have only to look through the projects undertaken at virginia under this programme, including the blake archive, to be impressed both at the range of scholarly interests and the variety of technical methods they encompass.

my starting point was infrastructure, and i selected three models that share certain features i believe are important.
it would be wrong, however, to leave the impression that these are the only institutions that provide infrastructures along these lines, or that these are the only good models. my colleague willard mccarty has begun to develop a typology of institutional organisational models for humanities computing, and with the support of the allc i hope this work can be taken forward. the new opportunities for humanities computing make it more important than ever that institutions develop appropriate frameworks to support the scholarly work, and also mechanisms to recognise and reward innovative work in this field. these are among our most important challenges.

national and international frameworks

i have concentrated on institutional matters, but it would be wrong to ignore the significance of national and international activities. the scholarly and professional associations have an important role to play, with the association for literary and linguistic computing and the association for computers and the humanities being the major ones in our field. there is also an increasing number of national and international agencies and projects that are relevant, some mentioned earlier in this paper, e.g. the tei and the uk's arts and humanities data service.

new partnerships

working with scientists and engineers

there are many examples of scientists and engineers becoming engaged with and helping to address the problems of research in the humanities, and i'd like to mention just three by way of illustration.
- the first is kevin kiernan's work with scientists at his institution, the university of kentucky - as well as with the manuscript curators at the british library! - on the electronic beowulf project: http://www.uky.edu/% ekiernan/ebeowulf/main.htm. the key scientific work was image processing that allowed much finer analysis of fire- and preservation-damaged manuscripts of the medieval english poem beowulf.
- the second is the work with engineers that is being undertaken by oxford's centre for the study of ancient documents, in which a number of engineering techniques are brought to bear on a wide variety of texts inscribed on a number of different types of material: http://www.csad.ox.ac.uk/csad/images.html.
- my third reference is to the omras project, which involves musicologists and electronic engineers at king's college and the university of indiana. it is carrying out research on the automated recognition of 'musical objects', research with far-reaching potential within the academy and in the wider commercial and entertainment world: http://www.kcl.ac.uk/kis/schools/hums/music/ttc/ir_projects/omras/.

[note: mccarty, w. and kirschenbaum, m.: humanities computing: institutional models and resources: http://www.allc.org/hcim]

the point about these examples, and many others, is that partnerships of this kind are based on the use of computers, but very specialised use, developed originally for purposes far removed from the humanities. the challenge is to seek out and exploit these new opportunities, and to create environments in which such partnerships arise 'naturally' and can flourish.

the cultural heritage sector

partnerships with museums, galleries and national libraries are much more familiar to humanities scholars. yet the rapid advance of multi-media and virtual reality techniques offers many new opportunities, and requires new ways of thinking. the creation of 'virtual collections' of physically separated objects, or of objects of different kinds - material artefacts and manuscripts, for example - is just one example.

new modes of collaboration

earlier i talked about the structures and the approach of your department, my centre, and iath at virginia.
one of the key features in common between these three institutions is a founding principle of collaboration, of a joint activity to which each participant brings specialised skills and experience, and differing considerations of theory and of practice. it is this concept of collaboration, and the sometimes unexpected consequences as the different sets of theory and practice meet, that characterises the work we do, and makes it a constant source of challenge - and, we hope, reward.

i raise the matter again because apart from the research under way at virginia mentioned earlier, there has been insufficient effort to understand and document systematically the many facets of these new modes of collaboration. one of the key challenges we face is to understand them better and to train scholars in the issues of principle and practice which characterise them.

continuity and a culture of change

continuity of citation is fundamental to our traditions of scholarship, yet there are many uncertainties in the digital age. some of the most complex issues are those being addressed by the digital library research i have referred to - the development of metadata structures to ensure adequate citation information, of 'persistent' addressing mechanisms, of sophisticated means to manage digital objects, of long-term access and preservation. this remains one of our major challenges. [note: i note, in passing, that the blake archive numbered among its collaborators galleries and a private collector: http://www.blakearchive.org/public/institutions.html.] [note: for an earlier discussion of new modes of collaboration, see unsworth, j.: networked scholarship: the effects of advanced technology on research in the humanities, in: gateways to knowledge, ed. larry dowler. mit press, .]

digital culture, digital scholarship

it is not only in dealing with the digital representation of 'traditional' source materials that the humanities scholar is having to become 'digitally literate'.
we live in a 'wired world' - a strange term, perhaps, to describe what is in reality an increasingly wireless world - in which creative activity of many kinds is carried out by digital means, from literary or musical composition to performance art. our contemporary culture is increasingly 'digital'; we find both the old and new subjects of our study appearing in digital form, and the boundaries between the digital and the non-digital become increasingly blurred, never more so than in the world of 'virtual reality'. perhaps the overarching challenge we face is 'digital scholarship'.

marilyn deegan has for many years been working at the forefront of humanities computing and digital library developments. in her keynote address at the allc-ach conference in glasgow, entitled 'digital scholarship in a wired world', she said:

"my contention is that digital scholarship in a wired world is different in some profound way from the scholarship that has gone before it." [note: keynote address at the joint international conference of the association for literary and linguistic computing and the association for computers and the humanities, university of glasgow, july . unpublished; quoted with permission. the manuscript of her paper was invaluable in the preparation of mine, and i acknowledge this with thanks. deegan is digital resources director of the refugee studies centre, university of oxford, editor of literary and linguistic computing and co-director, with the author, of the office for humanities communication.]

this is a challenging proposition, and deegan, mccarty and i are working to arrange a colloquium - or perhaps a series of colloquia - to pursue it in greater depth. [note: see, for example: law, derek g.: the mickey mouse world of humanities scholarship, in: drh : a selection of papers from "digital resources in the humanities ", office for humanities communication, .]
the role of humanities computing

bridge, glue and intermediary

'bridge' encompasses the occasions when humanities computing is the means by which scholars and projects know about and use the techniques and technologies most appropriate to their purpose. and a good bridge must have solid foundations on both banks of the stream (or chasm) it crosses! humanities computing is an inter-disciplinary endeavour above all.

'glue' describes in particular those projects or other scholarly ventures in which it contributes the technical methods that bind together the multiple disciplines and media that make up the whole.

'intermediary' describes the pro-active role humanities computing has to play in seeking out potential partners and in identifying scholarly activities that can take advantage of the technologies. it should also be monitoring and assessing changes in technology for their potential benefits to humanities scholarship. in the best cases computing humanists will be in a position to influence the technological developments, as with the tei and xml.

the role of humanities computing must reflect its soul, and at its core it is interdisciplinary and methodological.

pushing the boundaries

recent experience has shown how significant is the effect of technological development, especially in computing and communications technologies, in removing or reshaping boundaries. if we are alert and careful, humanities computing is in a unique position to push boundaries in directions that will benefit rather than distort or trivialise humanities scholarship, and to promote the innovative thinking on which its future in a highly competitive and fragmented cultural environment will ultimately depend.

new ways of thinking

now i want to return to the question of analysis and design, which i mentioned, for example, in relation to the prosopography projects at king's.
observing the historians in the design phase of these projects, it is clear that having to think in such a highly structured mode forced and enabled them to confront their scholarly materials in a new way and this exercise in new thinking gave them new insights.

transforming the disciplines

staying with prosopography to introduce my next point, it has also been fascinating to observe at close quarters a discipline in the process of transformation. before computers, prosopography involved the reading of sources by 'the prosopographer' and the preparation of scholarly synthesis and summary based on the sources. computational methods make it possible to record what all the sources say, so that the synthesis and interpretation can be done by any user of the resource. it would be possible, of course, to imagine that this resource would contain not only the structured database but also the full texts of all the sources. one could further imagine these resources being interlinked. in such a circumstance, any scholar would be able to compare the sources and carry out their own synthesis. what kind of new definition do we then need of 'prosopographer' and 'prosopography'?

similar issues are being raised in relation to textual editions - a subject with a growing literature. it is perhaps inevitable that the new methods are calling into question traditional roles and long-established ways of doing things. this points to one of the key roles of humanities computing, as a mediator in the changing academy.

a new research agenda

at various times and places in the past year or two, willard mccarty has sought to provoke discussion and thinking on whether 'humanities computing' should be conceived as a discipline. as part of his research and writing on the subject, he has raised the question of whether humanities computing has a research agenda that it could claim as its own.
in at least two key areas it seems clear that there is readily identifiable and significant research to be done: on the methodologies that are common across a number of humanities disciplines; and on what occurs at the interface between scholarship and technology, and the effects of the interaction on both. there are likely to be other areas: perhaps in new definitions of scholarly and technical primitives and the development of new tools that might grow from this; perhaps in relation to cognitive science and the question of how we know what we know.

it is in developing this distinctive research agenda that humanities computing will perhaps best prepare itself for the significant roles it will continue to have within the broader context of humanities scholarship.

[note: inter alia, see: sutherland, kathryn: electronic text: investigations in method and theory, oxford, .]

introduction to the special issue

peter k. bol*
harvard university
*corresponding author. e-mail: peter_bol@harvard.edu

this issue of the journal of chinese history aims to take stock of the effects of “the digital” on the study of chinese history. we are doing this through a combination of research articles whose authors have made extensive use of digital resources and technologies and a set of introductions to non-commercial, open-access utilities and tools that scholars have created to support research in a digital environment. these hardly exhaust the universe of research or tools.

when the materials with which humans found their way through time and space and communicated with others were in physical media, they could be collected, curated, and preserved for future research. when the writing and imagery became digital as well, they became ephemeral but also more accessible, and more people than ever before could become producers. the first website was created in ; today there are almost billion of them and they are being accessed by . billion users.
there are blogs and wikis, and over billion text messages are being sent daily. by one reckoning the entire digital universe is expected to reach zettabytes ( bytes) in , which would be forty times more bytes than there are stars in the observable universe. we can be sure that the digital will not become less important. it is also playing an ever-larger role in the study of the pre-digital past as well.

digital scholarship uses sources that are in digital form. several things follow from this: sources in digital form are easily altered and manipulated, they can be treated with computational procedures (algorithms) that allow massive amounts of data to be processed very efficiently and they can be shared widely at minimal marginal cost. this is an historical event that can be tracked over time and one that has been affecting the research cycle during the academic careers of the editors of this journal.

research begins with questions. as the digital has advanced it has affected the ways in which we know what is going on in a discipline, but has it also had an effect on the questions we ask? my reading of the research articles in this issue suggests three answers: it makes it easier to address old questions by taking more information into account; it makes it possible to ask questions from multiple perspectives, for example to see information in terms of both social networks and spatial distributions; and it opens up questions we had not previously attempted to answer.

https://blog.microfocus.com/how-much-data-is-created-on-the-internet-each-day/ and https://www.websitehostingrating.com/internet-statistics-facts.
https://www.visualcapitalist.com/how-much-data-is-generated-each-day/.
see the discussion in “interchange: the promise of digital history,” featuring daniel j. cohen, michael frisch, patrick gallagher, steven mintz, kirsten sword, amy murrell taylor, william g. thomas iii, william j. turkel, journal of american history . ( ), –

© cambridge university press
journal of chinese history ( ), , – doi: . /jch. .
https://doi.org/ . /jch. . downloaded from https://www.cambridge.org/core. carnegie mellon university, on apr at : : , subject to the cambridge core terms of use, available at https://www.cambridge.org/core/terms.

the search for sources—both the search for trends in a discipline and for primary sources—once took place in the trays of the card catalog. today the online public access catalog is the norm, but it only began in the late s; the catalog of today that allows faceted search is much more recent. harvard’s hollis catalog, the largest academic catalog in the world, did not display cjk scripts until , fourteen years after the system launched. today that catalog can link users to sources that are online and programs such as endnote and zotero can automate the extraction of bibliographic data from online catalogs and websites.

the advent of searchable text is the single most important development in digital scholarship. scripta sinica, from the institute of history and philology at academia sinica, began in and continues to set the standard for accuracy.
today there has been a proliferation of searchable text corpora from commercial vendors, much to the disadvantage of scholars without access to the handful of major research libraries, but also great open-access repositories such as the chinese text project (ctext.org), discussed in this issue.

ways of organizing and storing information have changed as well. excel, the most popular commercial spreadsheet, did not appear until , although the first electronic one goes back to . the functions built into today’s spreadsheet programs have more capabilities than most historians are aware of. the relational database, which allows many tables to be joined together and facilitates complex queries, was first conceived in but the first commercial program came out in . since then new kinds of databases have appeared: object databases, which represent information as objects in contrast to the tables of relational databases; and graph databases, which represent data through networks of nodes and edges.

i have used the word “information,” but computational methods work with “data.” what is the difference? the sentence “zhu xi was born in the year ” is information. it is composed of three data (a name, an event, a date) which can be arranged as a row in a table, with two columns or variables (name or person, year of birth), or as a relationship between two nodes (a name and a year), with “birth” as the edge or link between the nodes. information is translated into data to fit a data structure so that the data, when queried can become information again, for when taken alone the datum “ ” is merely an integer. understanding the relationship between information and data is important when storing information in spreadsheets or databases, because the data structure defines the ways the data can be queried: the structure is what makes it possible to call up all the names of people in the dataset who were born in the year .
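the distinction between information and data drawn above can be made concrete with a minimal python sketch. it stores the same statement once as a table row and once as a graph edge, then asks both structures the same question. the year 1130 is supplied here for illustration, and the rows labelled "person b" and "person c", like the function names, are hypothetical.

```python
# One piece of information held in two data structures.
# Assumption: the year 1130 and the "person b" / "person c" rows are
# illustrative values, not taken from any dataset discussed in the text.

# Tabular form: one row per person, two columns (person, birth_year).
table = [
    {"person": "zhu xi", "birth_year": 1130},
    {"person": "person b", "birth_year": 1130},
    {"person": "person c", "birth_year": 1139},
]


def born_in_table(rows, year):
    """Query the table: keep rows whose birth_year column equals `year`."""
    return [row["person"] for row in rows if row["birth_year"] == year]


# Graph form: (source node, edge label, target node) triples,
# with "birth" as the edge between a name node and a year node.
edges = [(row["person"], "birth", row["birth_year"]) for row in table]


def born_in_graph(triples, year):
    """Query the graph: follow "birth" edges that end at the `year` node."""
    return [source for source, label, target in triples
            if label == "birth" and target == year]


print(born_in_table(table, 1130))
print(born_in_graph(edges, 1130))
```

taken alone, the integer 1130 says nothing; it becomes information again only because the column name or the edge label records what it is the year of.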
the key to transforming information into data is the ability to identify, to “tag,” elements in a text. this can work at a structural level (sentences, paragraphs, parts of speech) and at an information level (places, dates, persons). the text encoding initiative (tei) released its first guidelines for marking up text in . it was soon applied to the tripitaka by the chinese buddhist electronic text association; a guide to the use of tei with chinese texts appeared in . the idea of searching text for patterns of expression, such as dates, goes back to the s, but in practice its use in the humanities is more recent. the markus utility, described in this issue, allows the user to upload text, tag elements in it, and then download the tagged data.

https://en.wikipedia.org/wiki/online_public_access_catalog (accessed september , ).
http://applyonline.ihp.sinica.edu.tw/english/source/source .htm. for information about all collections see http://hanchi.ihp.sinica.edu.tw.
https://en.wikipedia.org/wiki/spreadsheet (accessed september , ).
https://en.wikipedia.org/wiki/relational_database (accessed september , ).
https://en.wikipedia.org/wiki/object_database (accessed september , ) and https://en.wikipedia.org/wiki/graph_database (accessed september , ).

the translation of information into data, together with access to databases through the internet, makes it possible for databases to share data with each other through “application programming interfaces” (apis), which only appeared in . almost all the utilities built for chinese studies discussed in this issue make use of apis, thus greatly increasing their utility by allowing knowledge distributed across different systems, collected for different purposes and managed by different people in different countries to be joined together.

the tools for analyzing digital information and data have proliferated. there are software packages for many kinds of cluster analysis. topic modeling, the use of “machine learning” to discover the set of topics inherent in a text corpus, first appeared in . some text databases combine repository functions with analytic functions, enabling frequency analysis, comparison between editions, discovering nearest neighbors, and so on.

there is an argument that those who want to mine and analyze texts extensively need to consider learning to program, for which there are now online courses and lessons for a wide variety of analytic methods. for many the free software packages for statistical analysis, spatial analysis, and network analysis are adequate. the first desktop geographic information system appeared in the s. systematic social network analysis dates from the s but it seems that software packages first appeared in the s.
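as a small illustration of the pattern-based tagging described above, the following python sketch finds four-digit years in a sentence, wraps each in a tei-style `date` element, and returns the extracted values as data. it is not how markus or any other utility in this issue works internally; the pattern (years from 1000 to 1999 written in arabic numerals), the sample sentence, and the function name are assumptions made for the example.

```python
import re

# Assumption: years are written as four-digit Arabic numerals between
# 1000 and 1999. Real sources (reign-period dates, for instance) need
# far richer patterns than this.
YEAR = re.compile(r"\b(1[0-9]{3})\b")


def tag_years(text):
    """Return the text with each year wrapped in a TEI-style <date>
    element, together with the list of years found, as integers."""
    tagged = YEAR.sub(r'<date when="\1">\1</date>', text)
    years = [int(match.group(1)) for match in YEAR.finditer(text)]
    return tagged, years


sample = "zhu xi was born in 1130 and died in 1200."
tagged, years = tag_years(sample)
print(tagged)
print(years)
```

the tagged string remains readable as text, while the list of integers can be loaded into a table or a graph and queried.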
finally, digital modes of dissemination, the last part of the research cycle, have become ubiquitous. there are differences between print and digital editions: full color illustrations, zoomable maps, and sound files, for example, do not entail extra costs when digital. this introduction has been able to cite wikipedia or online sources for all but one of to this point. blogs, wikis and open-access online journals are commonplace. individuals can publish through personal websites and the work is findable with web searches. publishers are now experimenting with digital projects in addition to books.

https://en.wikipedia.org/wiki/cluster_analysis.
https://en.wikipedia.org/wiki/topic_model.
https://programminghistorian.org/en/.
https://en.wikipedia.org/wiki/geographic_information_system.
https://en.wikipedia.org/wiki/social_network_analysis and https://en.wikipedia.org/wiki/social_network_analysis_software.
see, for example, stanford university press’s new series: https://www.sup.org/digital/.

the first article in this issue is a theoretical essay from a scholar of song literature, michael fuller: “digital humanities and the discontents of meaning.” fuller is also an experienced programmer and the chief data architect of the china biographical database. he argues that the digital humanities offer a way out of what he calls isolated subjectivity, when the meaning of things is whatever i as interpreter make it to be, and the hermeneutics of suspicion, when the meaning of things is revealed through deconstruction to be in service of ideology and power. the pragmatic study of language and cognition instead treats the structuring of human experience as an object of scientific inquiry. his discussion of distant and close reading as the basic strategies of the humanities begins with the mathematics of topic modeling and ends with biographical data by way of the hermeneutics of wittgenstein, dilthey, and schleiermacher. the argument is in many ways straightforward: the meaning of a text is relative to the textual context in which it takes place and relative to the life experience of the writer. distant reading—the term comes from franco moretti—is a way of finding in the universe of texts the patterns, structures, and clusters that the particular text that is being read closely invokes.
fuller cites schleiermacher’s observation that “as every utterance has a dual relationship to the totality of the language and the whole thought of its originator, then all understanding also consists of the two moments: of understanding the utterance as derived from language, and as a fact in the thinker.” his discussion of topic modeling reveals the logic of distant reading; there are many methods of analyzing very large textual corpora.

the hermeneutics of distant reading that explores texts within the context of textual corpora, fuller argues, is complemented by an analogous approach in the study of people. in designing the china biographical database he sees that the way in which a relational database models life patterns does for historical figures what distant reading does for texts. that is, it makes it possible to understand “the larger life patterns within which an individual (or an era) lives.” i would add that a database created for seeing the patterns of many lives should not be treated as a biographical dictionary.

the “distant reading” of biography is at the heart of nicolas tackett’s “the evolution of the tang political elite and its marriage network.” building on his earlier work on the demise of the tang great clans, tackett compiled kinship data from excavated epitaphs and the new tang history to construct a dataset of , father–child kinship ties to explore the marriage network of political elites, the backgrounds of chief ministers, the composition of the capital elite in early, middle, and late tang, and the composition of the provincial elite. once the data was disambiguated—that is, once it was determined whether two or more entities with the same name referred to one or more people—he was able to construct patrilines over multiple generations as an alternative to relying on claimed choronyms (junwang 郡望).
the one large network of political elites to which he pays particular attention contained eighty-seven patrilines, with a total of , individual members and marriages. locating individuals in these networks offered a way of knowing more about persons for whom other data was missing, such as their home base or the type of family they would marry. the distant reading informs the close reading.

tackett combines network analysis to show kinship ties, spatial analysis to show where the political elite was based, a typology of elite epitaphs (e.g. no office, civil, military, etc.) and temporal analysis to show change over time. his results both confirm and challenge received views. tang factions were indeed regional and the luoyang elite began with fewer ties to the state, but empress wu was not in fact bringing in newly risen men but tapping well-established patrilines. after the an lushan rebellion capital patrilines were ever more dominant at court, but the ties between them and the provinces dissipated, and provincial office-holders became provincial elites who made their careers in territorial governments.

moretti, franco, distant reading (london: verso, ) and graphs, maps, trees: abstract models for a literary history (london: verso, ).

in the past an apt quotation or anecdote might suffice to make a point, and readers could check references in the footnote. however, tackett’s findings and those of all the other research articles depend on datasets that are derived from the primary sources.
to say that the work has been checked has two levels of meaning: that the analysis of the dataset is replicable and that the data itself is correctly derived from its sources. both levels require that the dataset be made available, either to replicate the analysis or to check the data sources. due to copyright issues the original sources usually cannot be made available. currently there is no standard protocol for making datasets available and authors have taken different approaches. tackett makes his data available through a personal website, chen and chang through dataverse, de weerdt through dans (digital archiving and networked services), and schäfer explores ways to publish her data in connection with the original sources that are kept in commercial databases.

tackett worked with all the epitaphs he could find. song chen in “writing for local government schools: authors and themes in song-dynasty school inscriptions” uses all the inscriptions for song dynasty state schools in the digital edition of the volumes of the complete song prose, which gives him an author, a title, and usually the administrative unit. chen aims to quantify influence. two aspects of his approach are particularly interesting. first, his network analysis is between inscription authors and regions rather than solely between authors. authors are from somewhere and write for a place—is it in their native region or not? writing for schools in more regions is a sign of greater influence. using gis to show where the schools with inscriptions are and the various measures of network properties built into social network software, he points out the relative isolation of the upper yangzi (sichuan) but also the importance of those who, by writing for schools in different regions, bridge them.

second, he uses distant reading approaches to analyze the content of the inscriptions.
one technique is to search for key terms across the dataset to show change over time, but key word search (leaving aside the problem of word segmentation in literary chinese) brings with it the danger of only validating or nullifying the researcher’s specific hypothesis without providing an opportunity for discovering something new and unexpected. of course we can know something about the authors, and this is also a way of differentiating likely content. chao buzhi, the author of numerous epitaphs and inscriptions, was a follower of su shi, in contrast to the even more prolific neo-confucian zhu xi. but authorship does not necessarily equate to content. chen uses “document clustering” to show that there are three distinct thematic clusters of inscriptions and that an author could write in more than one mode. chen’s explanation of his procedure illustrates in practice the mathematical introduction given by fuller.

in “is there a faction in this list?” hilde de weerdt and her collaborators take up three song dynasty lists from , , and that were purported to represent factions and ask if they really were factions and, if so, how they were organized. the debate on song factions they review suggests that we may not want to assume that because some said others formed a faction (and thus should be pushed out of court) that in fact they did. to get at this they adopt a novel strategy. they could have taken up those on the list and used their writings to see how involved they were with each other, but this skews analysis towards those who wrote the most. so instead they ask which names co-occur in a given piece of writing (a letter, an epitaph, etc.) in all the extant writings by persons from the time of the lists as found in the complete song prose. this is a study of shared perceptions, as it does not matter if the writer had anything to do with a given list, only whether they thought there were relationships between people on the lists.
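the co-occurrence strategy just described can be sketched in a few lines of python. this is a simplified illustration and not the authors' actual procedure: it treats each piece of writing as a plain string, finds listed names by exact substring match, and counts every pair of names that appears in the same piece. the documents, the names, and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrences(documents, listed_names):
    """Count, for every pair of listed names, the number of documents
    in which both names appear (exact substring matching)."""
    counts = Counter()
    for text in documents:
        # Sorting gives each pair one canonical order, so (a, b) and
        # (b, a) are counted together.
        present = sorted(name for name in listed_names if name in text)
        counts.update(combinations(present, 2))
    return counts


# Hypothetical pieces of writing and a hypothetical list of names.
documents = [
    "letter mentioning person_a and person_b",
    "epitaph for person_a, with person_b and person_c",
    "memorial naming person_c only",
]
names = {"person_a", "person_b", "person_c"}

pair_counts = cooccurrences(documents, names)
for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

each document adds at most one to a pair's count however long it is, which is the property that keeps prolific writers from dominating the result. the pair counts can then serve as weighted edges in a network.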
the amount of data involved in this procedure, from over , to over , co-occurrences, makes computational analysis essential. the results do not surprise, given changes in politics from northern to southern song—those on the first list comprised a dense court network, the second list was not a group at all, and those on the third list formed conglomerations of central and provincial figures. what is interesting and useful is the method they developed to compare historical networks and sample them to test their hypotheses. by working statistically and using different ways to compare, analyze, and test the data they arrived at nuanced conclusions that recognize both different kinds of connectivity and their absence.

the kinds of social network analysis required to show this are sophisticated. digital scholarship often involves collaboration between scholars with different kinds of expertise. in addition, the databases and platforms used (in this case cbdb and markus) are collaborative efforts. i think this is worth mentioning because many humanists assume that we must be able to do everything ourselves, and there are indeed rare figures who can, but mastering the tools so as to use them fully and wisely is sometimes a career in itself. collaborate!
the collaboration between the historian of science dagmar schäfer, the digital humanist shih-pei chen, and the historical geographer qun che combines an historical inquiry into the reporting of natural disasters related to silk production during the yuan dynasty and the use of the logart platform for automating data collection from local gazetteers (the platform is introduced in the section on utilities). this is a good example of combining different scales and different modes of analysis. the local gazetteer, well on its way to becoming the ubiquitous form of local history during the yuan, has generic patterns; it lends itself to becoming a database. although there are considerable differences in how information is presented in gazetteers, it is possible in logart to search across the entire corpus—another use of “distant reading.” the authors wish to explore how analytical procedures used to interpret one source reflect upon the reading of the entire corpus, and how quantitative interpretations can direct us in our reading and analysis of the particular case. combining statistical and visual analyses of data to discover general patterns and local anomalies with historical analysis that closely examines the contexts in which data were produced, this article reveals that the reporting of disasters was the interactive result of the historical protocols for reporting information, individual editorial decisions, and the specific conditions of a place, all of which must be taken into account. to see only at the corpus scale is no better than treating an anecdote as representative of the whole.

in their contribution chen, campbell, ren, and lee introduce the china government employee database—qing (cged-q). the sources for this data are the quarterly jinshen lu, recording government offices and their incumbents from the mid-eighteenth century to the end of the qing dynasty.
the current database, with over three and a half million observations from to , is the most important dataset for the study of the institutional and social history of qing officialdom. this article introduces only some of the conclusions that the authors are drawing from this database, showing how the qualifications that gave entry to office changed over time and how persons with different qualifications and different provincial origins were distributed across the bureaucracy. the availability of cged-q invites more quantitative research on qing government.

the article also provides a very useful introduction to the challenges of creating a large, reliable database. some challenges are predictable, such as collecting and understanding the sources and converting the information on the page into structured data. the larger challenge, common to all large-scale collections of biographical data, is disambiguation. if every time we encountered the name of a person it was accompanied by a note telling us where he or she lived, birth and death dates, given and courtesy names, titles and offices, we would know immediately whether two persons with the same name in a historical source were one person or two. the china biographical database has sixty-two people named wang zuo 王佐 (king’s assistant), forty-nine for the ming dynasty alone, but in many instances ambiguity remains (better to duplicate than treat two persons as one).
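the conservative rule stated in the parenthesis above, better to duplicate than to merge, can be expressed as a small python sketch. this is an illustrative assumption about how such a rule might be coded, not the procedure used by the china biographical database or cged-q. two records are treated as one person only when the names match, no attribute known on both sides conflicts, and at least one such attribute agrees. the records, field names, and values are hypothetical.

```python
def same_person(a, b, keys=("place", "birth_year", "courtesy_name")):
    """Conservative matching: merge two records only when there is
    positive evidence that they describe the same person."""
    if a["name"] != b["name"]:
        return False
    # Attributes that are recorded (not missing, not None) in both records.
    shared = [k for k in keys
              if a.get(k) is not None and b.get(k) is not None]
    # Any conflict on a shared attribute means two different persons.
    if any(a[k] != b[k] for k in shared):
        return False
    # A matching name alone is not enough: require one corroborating
    # attribute, otherwise keep the records separate (duplicate).
    return len(shared) > 0


# Hypothetical records sharing one name.
r1 = {"name": "wang zuo", "place": "place_x", "birth_year": None}
r2 = {"name": "wang zuo", "place": "place_x", "courtesy_name": "name_1"}
r3 = {"name": "wang zuo", "place": "place_y"}
r4 = {"name": "wang zuo"}

print(same_person(r1, r2))  # same place, nothing conflicts
print(same_person(r1, r3))  # places differ
print(same_person(r1, r4))  # no evidence either way, so not merged
```

the third case shows the cost of the rule: a person recorded twice with too little detail stays duplicated, which is the price paid for never fusing two distinct lives into one.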
fortunately, the jinshen lu includes the name of the place the official is from and how (and sometimes when) he qualified for office, so the disambiguation procedure was highly successful. it was far less successful in the case of bannermen, for whom the banner is given rather than the place of residence; success was reduced further by the frequent absence of manchu clan names.

the final article, charles chang’s account of the communities within urban kunming, illustrates how new data sources can be used to study the recent past. kunming was a small, provincial backwater in the s; today it is a city of . million. how can we understand this growth and the make-up of the city today, given a dearth of official data or official data that does not correspond to ground truth? the solution lies in combining very large amounts of data from different sources. in addition to the record of remote sensing data freely available from the us government, there is data on points of interest from the chinese commercial mapping companies (which follow government rules in distorting geographic locations), those social media posts from residents that are geo-tagged, electronic commerce data, and street view images from surveying companies. making large amounts of data from different sources, generated in such different ways, and offered at very different scales compatible is a challenge. using his expertise in geographic information systems chang builds several layers of data at different scales in such a way that each layer becomes an elaboration on the preceding one. most of the data he uses to build this data-driven approach are themselves a byproduct of our digital era. the use of computational techniques to scrape websites, topic model hundreds of thousands of blog posts, and map geo-tagged data is becoming part of the toolkit of historians of today’s world.
the articles in this issue of the journal were written when librarians kept the libraries open, scholars were presenting papers at conferences, and students attended classes in person. the final editing took place in the spring of , when we were practicing social distancing and quarantining ourselves at home. by the time this issue reaches readers the situation will have changed, but we might reflect on how we would have managed to keep on with academic life under such circumstances without digital assets and communication. the utilities introduced in this issue have provided databases, tools, and platforms that make scholarship “at a distance” possible. although they were not written with this purpose in mind, the articles demonstrate some of the ways cutting-edge research can take place in a fully digital environment.

cite this article: bol pk ( ). introduction to the special issue. journal of chinese history , – . https://doi.org/ . /jch. .

the codata-rda data steward school

daniel bangert, göttingen state and university library, university of göttingen
joy davidson, digital curation centre, university of glasgow
steve diggs, scripps institute, ucsd
marjan grootveld, frans huigens, dans
sanjin muftić, university of cape town
cristiana pisoni, university of bergamo
hugh shanahan, royal holloway, university of london
lina sitz, ictp
ana slavec, innorenew coe
sothearath seang, université côte d'azur
shanmugasundaram venkataraman, digital curation centre, university of edinburgh

abstract

given the expected increase in demand for data stewards and data stewardship skills it is clear that there is a need to develop training, education and cpd (continuous professional development) in this area. in this paper a brief introduction will be provided to the origins of the definitions of data stewardship. it also notes the present tendency towards equivalence between data stewardship skills and fair principles. it then focuses on one specific training event – the pilot data stewardship strand of the codata-rda research data science schools that was held in trieste in august . the paper will discuss the overall curriculum for the pilot school, how it matches with the fair s framework, and findings from the students and instructors on how to improve the school.

introduction

data stewardship as a role has come into prominence over the last decade. early references to data stewards occur in the literature in the first decade of the century with respect to health data (diamond, mostashari, & shirky, ; rosenbaum, ) and had a strong emphasis on maintaining the privacy of the data sets.
since then, the role has developed into one that carries out a variety of roles for data (peng, privette, kearns, ritchey, & ansari, ; peng et al., ; peng, ge et al., ; salome scholtens et al., ; sapp nelson, ; sapp nelson, megan r., ; verheul et al., ). the first formal publication of the fair data guiding principles (wilkinson et al., ) makes an explicit connection between those principles and data stewardship. this paper will make explicit use of this connection in terms of defining an initial curriculum framework for data stewardship. regardless of how deep that connection is, the adoption of fair principles as a policy goal (european commission, ) and the identification that these practices at least attempt to address critical issues such as reproducibility (hartter, ryan, mackenzie, parker, & strasser, ) indicate that there will be a substantial increase in the required number of data stewards. in this respect there will need to be an extensive increase in the amount of activity in this area from educational, training and continuous professional development (cpd) perspectives.

this paper will discuss one particular initiative, the data steward strand of the codata-rda schools in research data science which is being piloted for the first time in august in cooperation with the fairsfair project. the medium-term goal of the school is not to give students an introduction to data stewarding but instead to embed them with early career researchers (ecrs), with the aim of demonstrating the importance of partnership between these roles over the research lifecycle. the immediate goal for the pilot school is to deliver a draft curriculum that will be refined based on feedback from pilot participants and offered through subsequent schools delivered by codata/rda and the fairsfair project (https://www.fairsfair.eu/fair-competence-centre).

curriculum framework – fair s

a more detailed description of the landscape of training resources and curriculum frameworks is given elsewhere (shanahan, h. et al., ). the most recent and relevant curriculum framework, which incorporates previous work, is fair s (https://www.eoscpilot.eu/sites/default/files/fair s_eoscpilot_skills_framework.pdf) (whyte, a, ; whyte, angus et al., ). this provides a high level description of the skills necessary to make data fair and keep data fair. it also provides a description of the level of understanding of the topics required for a variety of roles, including data steward. the highest-level topics are: plan and design (planning and design of data, research software and other outputs, including documentation); capture and process (capturing and processing of data or related materials to enable research evidence to be prepared for analysis); integrate and analyse (developing and applying appropriate methods to enable lines of enquiry for research); appraise and preserve (developing and applying appropriate methods to appraise research outputs); publish and release (describing research products and their inter-relationships and providing access to them); expose and discover (ensuring processes and mechanisms for providing access to research products); govern and assess (developing and maintaining legally compliant strategies, policies, and processes on outputs); scope and resource (identifying the scope of research data services and stewardship activities and securing the resources to sustain these) and advise and enable (managing services that enable
data stewardship and open research)​. the school the codata-rda research data science schools are a series of schools that have run since . the long-term goal of the schools is to create communities of ecrs that are enabled to make the most of the data revolution in research. this is enabled by delivering an expanding set of schools delivered regionally which provide a foundation in data science – skills that are independent of the domain that the ecr is based in. there is a strong emphasis on teaching practical skills with team learning and ample opportunities for reflection and discussion. students come from a wide variety of domains including bioinformatics and other life sciences, earth and atmospheric sciences, high energy physics and others. the priority is to deliver these schools to individuals from low and middle income countries though the curriculum is applicable for students from high income countries as well. through the expansion of these schools the concept of providing such a curriculum (or something similar) will become embedded in higher education and hence such data science skills for ecrs will become accepted in much the same way that an understanding of basic biostatistics is essential in the life sciences ​(metz, )​ or linear algebra in engineering ​(barry & steele, )​. the school emphasises responsible research and hence distinguishes itself from the standard, machine learning focussed, data science bootcamps that tend to focus more on purely technical content. 
over two weeks it delivers modules on
● open and responsible research (bezuidenhout, louise, quick, rob, & shanahan, hugh, n.d.);
● the carpentries introductory material (teal et al., ; wilson, ) on the unix command line, git and r;
● research data management, which provides a broad introduction to the topic for ecrs;
● author carpentry (caltech library, ) describing the skills necessary for authorship in the st century;
● visualisation, specifically the visualisation of data;
● machine learning;
● information security;
● computational infrastructures, providing an introduction to computing beyond a laptop or desktop computer.

a diagrammatic representation of how these modules are related to each other is provided in figure . in december the school in san josé, costa rica used python as the main language rather than r. by february , nine schools had run on four continents for students from over countries.

figure  diagrammatic representation of modules run in the ecr strand of the codata-rda schools.

extension to data stewards

there is already a substantial overlap between this curriculum and the fair s framework. hence the codata-rda school’s curriculum, with some adjustment, would provide an excellent introduction to the area. the strong emphasis on building communities of researchers with a grounding in responsible research practices also presents the opportunity to embed early career data stewards and researchers with each other. this would encourage both roles to work more closely with each other.
fairsfair (https://fairsfair.eu) is a project that addresses the development and concrete realisation of an overall ​knowledge infrastructure on academic quality data management, procedures, standards, metrics and related matters, based on the fair principles.​ ​ one of its goals is to develop and provide a series of schools in this area and hence fairsfair is partnering with the codata-rda schools to deliver training along the lines described above. data steward pilot school in august a pilot version of the data steward school ran in parallel with the codata-rda school at the international centre for theoretical physics. a cohort of six students with a data stewarding background was taught in the pilot school. in table the modules that were run in common with the ecrs are listed. week of the school was common for both ecrs and data stewards. the data steward specific modules were run in week and are listed in table . a detailed description of the data steward specific teaching is provided in appendix a. in terms of omissions from the ecr school in week , visualisation was dropped entirely and machine learning scaled back to one seminar on the penultimate day of the school. the computational infrastructures module was scaled back from a day and a half to one day. these and the modules delivered in week one were taught with the ecrs. table . modules covered in pilot school that are common to ecrs and data stewards. the matching terms from fair s are abbreviated as follows. plan and design = pd; capture and process = cp; integrate and analyse = ia; appraise and preserve = ap; publish and release = pr; expose and discover = ed; govern and assess = ga; scope and resource = sr; advise and enable = ae. 
topic                                  matching fair s     week run
open and responsible research          ae, ga
introduction to the command line       ia, cp, pd
r                                      ia, cp, pd, ed
git                                    pr, ed
author carpentry                       cp, ap, pr
research data management               pd, ap, ed, ga
introduction to information security   ga
computational infrastructures          ia, sr
machine learning                       ia

table . modules covered in pilot school that are unique to the data stewards. all of these modules are run in the second week of the school. the matching terms from fair s are abbreviated as follows. plan and design = pd; capture and process = cp; integrate and analyse = ia; appraise and preserve = ap; publish and release = pr; expose and discover = ed; govern and assess = ga; scope and resource = sr; advise and enable = ae.

topic                         matching fair s
data steward practice         sr, ae
fair data                     pd, ap, pr, ed, ga
data management planning      pd, ap, pr, ed, ga, sr
metadata                      cp, ed
accessibility                 pr, ed, ga
preservation and publishing   pr, ap
data policies                 ga, ae
storing data                  ap, sr
linked data                   pd, pr, ed, ia

a diagrammatic representation of how the modules for the data steward strand of the school are related to each other is provided in figure , which relates the common modules for ecr and data steward strands with the data steward specific modules, and figure which focuses on the relationships between the data steward specific modules.

figure  diagrammatic representation of modules run in the data steward strand of the codata-rda schools.

figure  diagrammatic representation of data steward specific modules run in the codata-rda schools.

data steward specific modules

the materials taught were delivered in nine modules. the modules were largely taught in an overlapping fashion during the second week. it is important to note that the module structure described here is a post hoc description of the set of materials delivered. in detail the modules are:
● data steward practice - providing an introduction into what data stewardship is and how to train individuals in the topic.
● fair data - specifically an explanation of the fair principles and the assessment of the fairness of data sets.
● data management planning - providing the necessary skills to support researchers to develop dmps.
● metadata - in particular determining minimum metadata requirements.
● accessibility - licensing of data and conditions associated with access to data.
● preservation and publishing - providing an introduction to archiving and issues such as long term sustainability, trustworthy repositories and publishing data.
● data policies - determining key elements involved in the development of an institutional data policy.
● storing data - practical issues associated with uploading data and the skills to get researchers to deposit or archive their data.
● linked data - providing an introduction to the principles and implementation of linked data.

assessment

the codata-rda schools work on a principle of iterative improvement of their curriculum, an approach that has been shown to be a powerful technique (wolf, ). this pilot in particular represents a first step in this area and will require further revision. given the small size of the cohort, more qualitative approaches were taken to understand the impact of the pilot school. specifically, students were requested to write an action plan on their future plans and a workshop on the pilot was held.

action plans

the students were asked to complete an action plan for what they would do after the school. specifically they were asked to address two questions.
1. please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months?
2. are there any specific outcomes that you anticipate from these actions? if so, can you please describe them?
the responses of the students are in appendix b. a number of recurring themes emerged from these responses.
training: specifically to train researchers or potential data stewards at their home institution. the topics proposed to be taught ranged from practical issues such as the carpentry material taught at the school to more general open science issues. four students addressed and planned this.

dissemination (blogs/talks): this is an opportunity for the students to report back to their institution on their findings from the school through seminars or reports, or to discuss the school with a wider audience through web sites or blogs. four students addressed and planned this.

research usage: in particular this is to research how data is managed within their own institution, e.g. whether researchers use repositories, and hence to understand what the landscape looks like. four students addressed and planned this.

specific references to fair: in out of the action plans there is a specific reference to fair principles. however, in of those cases this is very much in the larger context of rdm practices in general. the fourth is very focussed on fair but the relevant student has a specific fair-related project.

develop or use tools/procedures: this was a commitment either to develop a procedure, derived from the school’s materials, for use by researchers at their institution in rdm, or to use a tool such as rise, or to be part of a team developing a relevant tool, such as terms fairskills (https://github.com/terms fairskills/). three students addressed and planned this.

engage with external data steward community: this was a commitment to continue engaging with the data steward community beyond their own institution. three students addressed and planned this.

distribution of actions: the activities described above were not evenly distributed between the students. a network diagram (drawn with the geomnet r package using the mds layout algorithm) shows the connections between students and themes.
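the student-theme network described above is a bipartite graph: one set of nodes for students, one for themes, and an edge wherever an action plan refers to a theme. the paper's diagram was drawn with geomnet in r; the python sketch below is not that code, and it uses invented placeholder edges (student_a, student_b and student_c are hypothetical labels, not the school's participants). it only illustrates how such an edge list can be stored and how the number of students per theme, and the themes two students share, can be read off it.

```python
from collections import defaultdict

# Hypothetical placeholder data: student labels and their themes are invented
# for illustration and are NOT the action-plan data behind the paper's figure.
edges = [
    ("student_a", "training"),
    ("student_a", "dissemination"),
    ("student_b", "training"),
    ("student_b", "research usage"),
    ("student_c", "research usage"),
    ("student_c", "tools/procedures"),
]


def theme_degrees(edge_list):
    """Count how many distinct students refer to each theme."""
    students_by_theme = defaultdict(set)
    for student, theme in edge_list:
        students_by_theme[theme].add(student)
    return {theme: len(students) for theme, students in students_by_theme.items()}


def shared_themes(edge_list, first, second):
    """Return the set of themes that two students have in common."""
    themes_by_student = defaultdict(set)
    for student, theme in edge_list:
        themes_by_student[student].add(theme)
    return themes_by_student[first] & themes_by_student[second]


# Print each theme with the number of students connected to it.
degrees = theme_degrees(edges)
for theme in sorted(degrees):
    print(theme, degrees[theme])
```

with a real edge list in place of the placeholders, the per-theme counts would be expected to correspond to the "students addressed and planned this" tallies given for each theme.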
as can be seen in figure , the themes divide into those related to dissemination and teaching and those related to research and implementation. the interest in the former set of themes is perhaps something of a surprise: students are as interested in teaching and communicating what they’ve learnt to their colleagues and researchers as in implementing data stewardship.

figure : network of themes and students. an edge indicates that a student referred to the theme.

workshop

in october a workshop on the school was held at the research data alliance th plenary in helsinki. during the workshop the experiences of the students and instructors were discussed with a group of data professionals, both to get their feedback on the schools and to let the students and instructors reflect further on the school and how it could be improved. at this meeting, two themes emerged. first, the data professionals were for the most part happy with the overall design of the curriculum. secondly, it was noted that while the ecrs and data stewards were aware of each other during the school, there wasn’t an opportunity for them a) to get a better understanding of what the other group was doing in its strand, or b) to work together in their respective roles. a proposal to deal with this would be, for future schools, to create teams of ecrs at the outset of the school and to assign a data steward student to each team. during the school, joint exercises could then be run so that the team of ecrs grasps the purpose of the data steward and the data stewards see their work in the context of improving research.

conclusions

this paper describes a pilot school to provide training for data stewards over a two week period. this pilot builds on a school specifically for ecrs that combines technical content with responsible research. a small cohort of data steward students was selected to test the materials for this pilot.
the students take part in a curriculum that is very similar to the ecr school but with crucial differences in the second week of the school, with tailored material delivered to only them. given the paucity of training in this area, particularly with respect to fair-related activities, there are a number of issues that have not been addressed here but would merit further research. in the first instance the term “data steward” has been used here rather than other related terms such as “data manager”. rather than attempting to tease out such roles which are likely to be evolving, a starting point was to use the reference document fair s and ensure that there is a mapping between the broad categories described there and the topics covered in the school. in other words the focus is very much on the teaching of a set of skills rather than training for a specific role. in reviewing the school there were a number of findings. in the first instance, students were as interested in disseminating and providing their own training at their institute as carrying out the roles of being a data steward. there was also interest in disseminating their findings through blog posts etc. it is clear that future schools should take this into consideration and consider carefully how one can provide examples of good training practice. secondly, the data steward students were interested in understanding how data is being managed at their institution and hence make suggestions on how to improve it. students understand the context of the fair principles. there is clear interest in using tools and engaging with the larger data stewardship community. feedback from data professionals from the rda plenary workshop indicates that the curriculum is for the most part apposite. what can be improved is ensuring that there is greater cross-talk between the ecr’s and data stewards. 
this would represent a unique offering in terms of training: an opportunity to simulate what the partnership between researchers and data stewards should look like and to give both groups the chance to see the purpose, challenges and opportunities that come from working with the other group. this could be achieved by creating a series of complementary exercises in which ecrs and data stewards work together during the second week of the school; designing these represents the next challenge. finally, the strong emphasis of the codata-rda schools on building a sense of community amongst its students will be important in ensuring that data stewards from this school also feel that they are part of a community that they can reach out to. the pilot school that ran in august represents a first step, focusing on ensuring that the draft curriculum is apposite for the data stewards. adjustments will be made to the school’s content and it will be run again in in trieste. furthermore, negotiations are in place to run other instances of the school in specific research domains. the codata-rda schools themselves have expanded considerably over the last five years and the authors see no reason why that could not occur for the data steward strand as well.

acknowledgements

this work has been made possible by support from the fairsfair project. fairsfair “fostering fair data practices in europe” has received funding from the european union’s horizon project call h -infraeosc- - grant agreement .

references

barry, m. d. j., & steele, n. c. ( ). a core curriculum in mathematics for the european engineer: an overview. international journal of mathematical education in science and technology, ( ), – . https://doi.org/ . / bezuidenhout, louise, quick, rob, & shanahan, hugh. (n.d.). “ethics when you least expect it”: a modular approach to data ethics instruction. submitted for publication. caltech library. ( ). authorcarpentry homepage. https://doi.org/ . /z h ffz diamond, c.
c., mostashari, f., & shirky, c. ( ). collecting and sharing data for population health: a new paradigm. health affairs, ( ), – . https://doi.org/ . /hlthaff. . . european commission. ( , september ). european commission—press releases - press release—g leaders’ communique hangzhou summit. retrieved july , from http://europa.eu/rapid/press-release_statement- - _en.htm hartter, j., ryan, s. j., mackenzie, c. a., parker, j. n., & strasser, c. a. ( ). spatially explicit data: stewardship and ethical challenges in science. plos biology, ( ), e . https://doi.org/ . /journal.pbio. metz, a. m. ( ). teaching statistics in biology: using inquiry-based learning to strengthen understanding of statistical analysis in biology laboratory courses. cbe—life sciences education, ( ), – . https://doi.org/ . /cbe. - - peng, g., privette, j. l., kearns, e. j., ritchey, n. a., & ansari, s. ( ).
a unified framework for measuring stewardship practices applied to digital environmental datasets. ​data science journal​, ​ ​, – . https://doi.org/ . /dsj. - peng, g., privette, j. l., tilmes, c., bristol, s., maycock, t., bates, j. j., … kearns, e. j. ( ). a conceptual enterprise framework for managing scientific data stewardship. data science journal​, ​ ​( ), . https://doi.org/ . /dsj- - peng, ge, ritchey, nancy a., casey, kenneth s., kearns, edward j., privette, jeffrey l., saunders, drew, … ansari, steve. ( ). scientific stewardship in the open data and big data era—roles and responsibilities of stewards and other major product stakeholders. ​d-lib magazine​, ​ ​( / ). https://doi.org/ . /may -peng rosenbaum, s. ( ). data governance and stewardship: designing data stewardship entities and advancing data access. ​health services research​, ​ ​( p ), – . https://doi.org/ . /j. - . . .x salome scholtens, petronella anbeek, jasmin böhmer, mirjam brullemans-spansier, marije van der geest, mijke jetten, … celia w g van gelder. ( ). ​life sciences data steward function matrix​. https://doi.org/ . /zenodo. sapp nelson, m. ( ). pilot data information literacy competencies matrix scaffolded across undergraduate, graduate and data steward levels. ​libraries faculty and staff scholarship and research​. 
retrieved from https://docs.lib.purdue.edu/lib_fsdocs/
sapp nelson, megan r., m. ( ). a pilot competency matrix for data management skills: a step toward the development of systematic data information literacy programs. journal of escience librarianship, ( ). https://doi.org/ . /jeslib. . shanahan, h., hoebelheinrich, n., whyte, a., davis, r., jones, s., & hodson, s. ( ). teaching fair. submitted for publication. teal, t. k., cranston, k. a., lapp, h., white, e., wilson, g., ram, k., & pawlik, a. ( ). data carpentry: workshops to increase data literacy for researchers. international journal of digital curation, ( ), – . https://doi.org/ . /ijdc.v i . verheul, i., imming, m., ringerma, j., mordant, a., ploeg, j.-l. van der, & pronk, m. ( ). data stewardship on the map: a study of tasks and roles in dutch research institutes. https://doi.org/ . /zenodo. whyte, a. ( , march ). fair s, a skills and capability framework for the european open science cloud. presented at the drexel-codata fair and responsible research data management (fair-rrdm) workshop, philadelphia. whyte, angus, leenarts, ellen, de vries, j., huigen, frans, sipos, gergely, dijk, e., … ashley, kevin. ( ). eoscpilot d . strategy for sustainable development of skills and capabilities. retrieved from https://eoscpilot.eu/themes/skills wilkinson, m. d., dumontier, m., aalbersberg, ij. j., appleton, g., axton, m., baak, a., … mons, b. ( ). the fair guiding principles for scientific data management and stewardship. scientific data, , . https://doi.org/ . /sdata. . wilson, g. ( ). software carpentry: getting scientists to write better code by making them more productive. computing in science & engineering, ( ), – . https://doi.org/ . /mcse. .
appendix a

detailed topics covered that are specific for data stewards.
day week module description welcome and introductions an ice breaking opportunity for the students to get to know each other and the lecturers. the session also helps the students to understand each others’ motivations for participation. background an introduction to the course themes and some background on drivers for research data management, open science and the emergence of the fair principles. what does a data steward do? data steward practice overview of - potential - tasks and roles, plus relevant literature fair data fair data a skeptic’s view: everyone loves fair, but what is it really? including group discussion fair assessment tools - foster course and exercise fair data students take the online tutorial of fair assessment tools then complete a practical exercise to assess the fairness of data using one of the tools presented. data management planning data management planning data management plans (dmps) are increasingly part of good rdm practice, in many cases being mandated by institutions and funders and therefore it was deemed necessary to train data stewards in how to create a good dmp. this session involved a descriptive outline of dmps and a practical exercise. day week module description data management plan activity data management planning a creative exercise to help students to consider the motivations for developing data management plans from different stakeholder perspectives. the activity is based on the train the trainer card game . created by gwen franck. metadata and description activity metadata a practical exercise to get students to consider minimum metadata requirements from the perspective of both creators and data reusers. this exercise is based on an activity developed by sally rumsey, university of oxford. licences accessibility message: licences tell potential reusers what they can do. openaire recommendations for research data licences. creative commons. software licences. 
plus discussion repositories preservation and publishing what does an archivist do? what is the oais model? long-term sustainability. preferred file formats. certification: fair data in trustworthy repositories preservation and publishing message: trustworthy repositories make and keep data fair. coretrustseal repository certification. plus exercise to understand how coretrustseal requirements also concern the data producer, i.e. the researcher. rdm service development and optimisation and rise activity data management planning data stewards are well placed to contribute and aid in service building. the rise tool was therefore introduced along with a practical session. different institutions will have different priorities relating to what they will want to focus on but by taking stock of their current status they can then see where effort should be directed most. day week module description developing rdm policies and activity data policies a session to introduce students to key elements that may be included in an institutional data policy and other environmental factors to consider when developing policies. promote and archive data preservation and publishing message: publish data in a data journal. plus exercise to choose one data journal and give a one-minute summary of the manual/guidelines upload your data and practice data access storing data introduction in the b share repository. hands-on exercise in depositing data in the ​training version of b share​ with its annotation service b note. exercise on data access accessibility exercise: explore these three data repositories and their conditions on data access. are the conditions clear and feasible? discussion of data access challenges accessibility an open q&a session where students can ask questions about providing data access. designing rdm training activity data steward practice an exercise for the students to design their rdm training activity. (day week is given over to computational lnfrastructures). 
day week module description linked data theory and basics linked data an introduction to the aims and principles of linked data and the semantic web. exercise: explore the wikidata knowledge graph. ontologies linked data an introduction to ontologies, including how ontologies and their structures are expressed. exercise: explore and compare selected ontologies. producing rdf linked data exercise: create rdf triples and upload to blazegraph triplestore. sparql linked data exercise: query uploaded data and explore other query services. https://trng-b share.eudat.eu/ https://trng-b share.eudat.eu/ summary and feedback discussion data steward practice summary of linked data in theory and practice. revisiting of the fair principles. final reflections and feedback on the data steward school. appendix b action plans of students. data stewardship action plans . please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months? . are there any specific outcomes that you anticipate from these actions? if so, can you please describe them? ------------------------------------------------------------------------------------------------------------ ana slavec’s data stewardship action plan . please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months? 
at my institution (innorenew coe) i plan to carry out the following activities:
1) write a blog post about the data stewardship school
2) give a presentation about research data management at a weekly meeting (we have a staff meeting at the same time every week, and each time a different person presents)
3) update our data management plan and prepare a template for researchers to use in their individual projects
4) collect information on what data has been collected in the lifetime of the institute ( . years) and help researchers to select the most appropriate repository for their data (zenodo is used where there is no disciplinary repository)
5) establish a procedure to assist researchers in depositing their data and metadata in the repository
in addition to my work at the institution, what i learned in the data stewardship course is going to help me in my activities in other organisations:
i. as part of my rda ambassador programme i will visit several research institutes in europe that collect data on renewable materials and products. in the first part of the presentation i will give an introduction to data management in general, and in the second part i will present rda and the possibilities for getting involved.
ii. development of the programme for open science , an event that i am organizing in november with young academy slovenia (with the support of the fit rri project) at the university of maribor
iii. skype meeting with members of the eurodoc open science wg and eurodoc’s open science ambassadors (to share the experience of the school)
2. are there any specific outcomes that you anticipate from these actions? if so, can you please describe them?
at my institution:
1) the blog post will be published on the institute’s website and shared on social media. i hope it will motivate other institutions to establish the position of a data steward.
2) the presentation will make our researchers more aware of the importance of managing their data and will motivate them to seek help in preparing their data for deposition
3) the template will make it easier for researchers to prepare a data management plan
4) based on the overview we will be able to prepare a plan for data deposition (point )
5) i hope the outcome will be that all our data will be deposited in either a disciplinary repository or on zenodo
in other organisations:
i. getting more researchers from the renewable materials domain to become rda members and establishment of an rda interest group (and a wg in the long term)
ii. raise awareness of fair data and other aspects of open science among slovenian researchers
iii. motivate early career researchers involved with eurodoc to learn more about the fair data aspect of open science and to share this in their communities

------------

cristiana pisoni’s data stewardship action plan
1. please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months?
1. a meeting with the chief of the library service and the research provost of our university to underline the benefits that our institution could gain from making the datasets produced by its researchers at least traceable and attributable to the institution, drawing their attention to some actions that could be taken at no cost, as described below
2. within a very small group of researchers ( - max, preferably from the engineering campus, where i work):
❖ i would like to interview them separately about their interest in a mirror repository on our publication repository (https://aisberg.unibg.it/) with a specific section (i.e.
section : research data) to collect bibliographic data about their research datasets, with specific metadata for linking to datasets uploaded in other repositories/archives
❖ analyze their answers in order to understand:
➢ whether they understand the importance of data management
➢ whether there is any real interest in the pilot project: if yes, build a section : research data in our repository (zero cost)
➢ whether, in future, it will be possible to insert in the global policy a section about research data, so that linking to datasets uploaded elsewhere would be mandatory
3. in collaboration with the research office:
❖ analyze which researchers have been involved in h /fp (open access pilot?) in the last three years
❖ contact them by e-mail to understand whether they are collecting/have collected data for their research projects, asking them whether they can cooperate in a survey
❖ send a short survey of - questions to those who have expressed interest:
➢ where do you usually store research data?
➢ would you like to have at your disposal a help service for cleaning, data processing and data storage?
➢ do you know what data fairness is?
➢ are you interested in making your data findable, accessible, interoperable and reusable (fair)?
➢ how could we help with the research data management?
4. in collaboration with the phd students school, propose some courses with a very practical slant, using the author carpentry and library carpentry materials
❖ persistent identifiers: importance and use. how to manage orcid and doi
❖ publishing: licenses and their importance. examples from our repository
❖ publishing: how to choose the best journal vs. predatory publishing
❖ research data management: rcr, open data and fair data
❖ research data management: tools and strategies (see venkat’s and louise’s presentations)
2. are there any specific outcomes that you anticipate from these actions? if so, can you please describe them?
first, i hope to be able to use some of the tools and materials used during the summer school right away:
● rmarkdown to review the repository policy and to make a web page following the example of the university of bath research data policy
● carpentries’ materials to prepare courses for phd students
i think that the dialogue with other internal structures (research office and phd school) could also lead to a greater awareness of the importance of managing all aspects of research in a coordinated way, starting from attribution and credits. speaking with researchers would let me know whether there is awareness of the growing importance of research data management. whether that awareness turns out to be high or low, i would look for every way i can to spread knowledge of the concepts explored during the summer school, asking for the possibility to insert some ad hoc news on the library service homepage as well as on the repository homepage. i hope that the phd courses will also spread a greater awareness of the importance of data and of the open data and fair data concepts. i am also considering the possibility of participating in online courses to certify my path towards acquiring the skills of a data steward, but i would like to ask you for some guidance on choosing the best way to do this. actually, i am not even sure whether the benefits are worth the costs, since it is not clear to me whether a certification like the one offered by https://ecm.elearningcurve.com or https://dama.org/content/cdmp-status can really be useful.

sanjin muftić data stewardship action plan
1. please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months?
1. re-design the website
a.
we have begun work on redoing our department website (digital library services, dls) and we have had lots of discussions around the content. this debate has centered on how much info is necessary, how open we should be, and how to arrange all the services that we offer so that users can find what they need. throughout the school, the way in which the content was presented was just as important as the content itself (see below). also, because the school was a wonderful wholesome overview of the field of data stewardship and data science, it made me aware of all the little pockets of information and tools that are available and that need to be shared. it also made me aware of the context in which people will use these tools and how best to present them. it was like the view you find on top of a mountain. throughout the two weeks, but especially during the data steward stream, i was marking down next to my notes the little phrases, links and tools that would be good to include on our website to help others. i think the overall approach on one of the days of teaching rdm was great; it was expressed something like: “this is what rdm is, in all detail, but you will not tell researchers all of this.” the implication is that there need to be approaches to communicate this knowledge in a way that is easy to digest. and for me, even though there was no specific focus on this, it became important to think about how it translates to written guides and digital material. there is a wealth of open science info out there, but part of what the data steward should do is to distill it or adjust it to suit the audience that they are working with. in my context, and i am sure in many others, that would include researchers, librarians and students (as evidenced by our role-playing exercises).
so with the website i hope to contribute to my team’s writing, to find the right tone and approach with our audience, and also to make sure that the most appropriate content is immediately accessible. i also want to use the rmarkdown skills we obtained to work on our quickguides for some of our services and tools, allowing us to make quicker changes to that material.
2. develop suitable teaching/training formats on data stewardship
a. catering for a number of audiences in terms of spreading the practices of data stewardship is quite a challenge. the material needs to be presented in different ways, in diverse forums and to various audiences. what was really wonderful at the data school was the number of lecturers and having a chance to engage with everybody’s teaching approach in terms of connecting with the audience and how they structured their content. it was possible to take a step back and observe which approaches suited which people and content better. is it better to allow students to ask questions, or do you work with green and pink stickers? how much content do you put on your slides, and how much do you deliver in words only? do you start at the beginning or at the “end” to hook people in? do you give a lot of theory or make people do things immediately to get the experience? of course there is no one right approach, but the opportunity to see different ones in action was eye-opening, and cross-matching them with the content highlighted how flexible a data steward, in terms of advocacy, should be in their approach. i think more than anything a data steward should be able to switch seamlessly between the theory aspect and an interactive workshop environment to keep the flow of advocacy. it might sound obvious, but simply showing a tool and what it does, without providing the larger motivations for using the tool, might fail to connect with the audience.
the other important thing is choosing how much information to share before it becomes an overload, and so knowing where the line is beyond which people stop engaging. it made me rethink my approach and make sure that i plan different strategies when doing outreach and advocacy programmes.
3. increase participation within data steward community
a. i think the big thing about the data steward community is that it needs to keep growing to the point where everybody is a data steward because they are managing their data and engaging in solid digital scholarship practices. we do quite a bit of outreach already in terms of hosting data steward meet-ups and going to present at various faculties and departments. so obviously the aim is to increase participation, if only by making people aware of what they are already doing. in a strange way i found the session on linked data useful for this, because developing a directory of people with related interests is very useful for this task, and being able to identify points of intersection between people through linked data is a digital scholarship project in its own right. as requested by the manager of our team, we need to develop a graphic that “maps” out our community of data stewards. wanting to be part of this map could encourage others to join the community of data stewards. what we could do with this map is also connected to the common repositories, metadata standards and queries that each department gets (something we at dls should keep track of and catalogue), so that new users can simply navigate the map to get some of their initial questions answered - either by looking at connections or by contacting one of the people along the connection list.
2. are there any specific outcomes that you anticipate from these actions? if so, can you please describe them?
1. a more interactive website that improves efficiency of the researchers and puts us more in contact with the research community.
if we can make clear progress on how we communicate the information and the kind of information we put out, we might find that researchers are using our resources without needing direction from us or our having to intervene on their behalf. while not eliminating the need for a personal connection and hands-on, face-to-face outreach, having a single resource that can assist you in multiple ways without getting you lost would improve our status within the research community as a trusted resource. i suppose if we see more traffic on our website that would be a good start.
2. on top of this, i see a bigger outcome in having more digital scholarship projects arise and take place within the university. this is more of a very long term wish, but it is related to the experience gained at the school. for me, digital scholarship within my position encompasses all of the data steward work plus a digital humanities angle that seeks to encourage and support research projects which use digital tools and technologies. i think the potential of engaging with r markdown and the semantic web has made me realize how awareness of these tools could save a lot of future work for researchers, but also encourage them to do some interesting projects by being aware of these tools in the first place.
3. overall the aim is to grow the data stewards community in three ways - in the library among staff, in the faculty among lecturers and researchers, and in the student body. having a number of different data stewards in different fields and disciplines would ensure that it is not necessary for one data steward to serve the needs of an entire multifaceted research body. instead the data steward might be a first port of call who can direct researchers to those who are better equipped to deal with the query and give the most appropriate workflow, tool or procedure. building a data culture through the website, advocacy and a visual registry would support all of this.
i don’t know if the aim is to have people within dls do less advocacy, but rather to give them the time to drive further investigations and discover the next steps in data stewardship, while the initial steps are handled by the data stewards who are students, researchers and librarians. if we empower them, we can seek further vantage points.

sothearath seang’s data stewardship action plan:
1. please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months?
- report back about the summer school and discuss research data policy with my university at the beginning of the - academic year:
  - provide information about the programme, what it covered and how the materials and knowledge can be used in the context of the university
  - develop a first version of a research data policy
  - call for consultation from all university members
  - make the policy publicly available on the website
- get researchers and other university or laboratory personnel, such as librarians and it people, engaged in the process together:
  - organise events such as workshops, seminars and conferences with local experts on the subject
  - list the benefits of collaboration between the different actors in the research sphere of the university and of having a concrete research data management plan
  - find and summarise regional and national projects and initiatives for reference and as a state of the art to encourage the stakeholders to take action
- create a working group specifically oriented towards research data:
  - within the newly formed open science working group, a sub-working group should be created
  - co-develop training courses with the regional unit of training in addition to the existing ones in the general open science curriculum
  - actively approach researchers and other stakeholders for surveys, discussions and help concerning any issues
they are facing with research data
  - engage phd candidates in the topic through our local association of junior researchers
2. are there any specific outcomes that you anticipate from these actions? if so, can you please describe them?
- as a first objective, it is essential that the research community becomes more aware of the opportunities and challenges and of what is at stake for research data
- because this university has never focused on this topic, the outcomes are a bit difficult to predict, but i believe that making things clearer for all the people involved in the first place can help the community to grow and develop in the right direction
- by putting this action plan in place, i expect that more positions will be created to fill the current void of research data experts in my university
- this should also foster more collaboration between actors locally and regionally, as well as more open science and open data projects

frans huigen's data stewardship action plan
1. please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months?
from the summer school, the most striking notion for me was that of responsible science citizenship and the responsibilities inherent to it. at dans, i will continue to invest in being such a citizen. i will do so through the following three activities (chronologically ordered):
1. being responsible is being open and transparent. i have thus written two reports on the summer school: one for our dans website (which i will translate later on), and one for the e-data&research journal. its topic is data stewardship in the ever changing world of digital research data.
2. together with experts in the horizon project fairsfair, i will work on what's called the fair terminology.
this terminology is derived from the fair s framework (written in the context of the eoscpilot project). the concrete action is to further define, in a structured way, the skills necessary for st century data stewards.
3. i became part of the dans training group, which specializes in giving workshops, training and lectures about open science and related topics. giving a good workshop is not a walk in the park. thus, i will take up training to give training. on december th, and on january th, i will receive, with colleagues, training in how to give a solid workshop.
2. are there any specific outcomes that you anticipate from these actions? if so, can you please describe them?
for the first two: yes, concrete outcomes, namely:
1. reporting of the event in trieste, referring further to useful websites and publications on the subject. this will create digital traffic and ensure attention is given to this important topic.
2. an iterative process of understanding and growing a structured list of fair skills necessary for data stewards. through it, the fair principles will be concretized and operationalized, ready to be implemented in the necessary curricula.
for the third, the outcome will be less concrete but nevertheless crucial. by following workshops and investing in my professional growth, i will become an expert in the field and improve my soft skills.

lina sitz action plan:
1. please tell us what you plan to put into practice when you get back to your home institution as a result of taking part in this training. specifically, what are three things you would like to achieve within the next six months?
after taking part in this training i realised that in my institution, even though there are a lot of actions devoted to promoting open science and data sharing, we don’t have a common way of dealing with our research data. so i would like to carry out the following activities:
1) try to help the scientists and it staff of our pilot project to apply “good practices” to the use case.
so this use case can become an “example” to help other scientists at the ictp in the process of sharing their research data.
2) use the rise tool, in collaboration with scientists and directors, in order to identify the “real” gaps in the rdm offering of my institution. i believe that in the case of my institute the chart will be close to a dot :), but i’m planning to use the options of each section in order to understand what the real objectives and needs of my institute are.
3) make a list of the actions needed in order to develop an rdm roadmap.
2. are there any specific outcomes that you anticipate from these actions? if so, can you please describe them?
1) a fair shared dataset and a “real example” (with real issues and the possible solutions, in order to take “fair decisions”) to help other scientists from and visiting our centre in the process of sharing their research data.
2) a frame to define the actions to take in order to provide an rdm service in our institute.

zfbb ( ) – libraries as e-infrastructure

libraries have served as education and research infrastructures for centuries. in this paper, we will describe major opportunities and future challenges in the context of digital research and the »e-infrastructures« that are required for e-science. we will provide examples of current involvements and focus on the importance of cooperation at local, international, specifically european, and global scale.

libraries have functioned for centuries as education and research infrastructures. this paper discusses the opportunities and challenges of digital research environments and of the e-infrastructures required for so-called e-science. current examples are described; the paper also shows how important cooperation at local, international and specifically european level is in this context.
libraries as infrastructures
for centuries, libraries were a major, if not the main research infrastructure of academic institutions. they started off by holding the manuscripts and prints of researchers working at the institution, in times when reproduction of scholarly work was the exception and scholars had to travel around the world to gain insights into the works of other scholars. when the reproduction of scholarly works became easier, libraries were able to collect a large segment of the world’s knowledge and make it accessible to researchers and students. libraries’ estates were usually established at the heart of the campus to perform their organizational function for the circulation of knowledge and serve as a sanctuary for study, where researchers and students could be among themselves and could receive advice by librarians – in many cases scholars themselves.
in parallel, experimental research became a major paradigm and laboratories containing a scientific apparatus became a major part of the institutional infrastructure. and laboratories started to produce knowledge resources that were usually not kept in libraries, namely research data and artifacts that underpin research findings. libraries and laboratories – text and data – coexisted in a highly entangled form as research infrastructure partners for more than years.
today, three rapid and radical developments bring libraries as infrastructures to a whole new level. first, digital knowledge resources are largely location independent. second, and relatedly, research has become collaborative and distributed. third, and most significantly for our question about the role of libraries as research infrastructures, data and the software used to process it – forming compound objects representing virtualized experimental artifacts – became primary research outputs themselves.
as text, data and software become more and more integrated, the resulting challenge for research infrastructure is how to sustain these new research objects.
text expressing the researcher’s narrative of ideas and methods was for a long time the sole authoritative record of research. libraries have been keeping and providing access to this record for society independently of changing publishing mechanisms. since there are now new forms of compound knowledge objects that need to be kept as authoritative records of research, the question is how libraries, laboratories and computing centers can work together to maintain a record of research that can be reliably accessed now and by future generations.

libraries and e-science
the rules of the scientific and educational system have changed tremendously with the use of information and communication technologies. huge amounts of data are produced and can be immediately made available, interpreted, processed, enriched, stored and preserved. the old paradigm that access to research output must be slow, difficult and expensive in order to be high quality is no longer valid. traditional mechanisms for guaranteeing quality, such as peer review, have been shown not to be % reliable and seriously slow down the review process. expensive licenses have made access to research output hardly affordable for most research institutes. furthermore, copyright systems lack the flexibility to allow for text and data mining.
as an answer to the need for quick, easy, affordable and permanent access to research output, libraries have built digital repositories. a repository brings together all scientific output of an institution or a project. libraries are widely recognized as a superior source of quality content, but they need to make more effort to increase the visibility of the content stored in these repositories.
according to several studies, large amounts of papers ( – %, depending on the field) published in academic journals remain uncited. libraries can contribute to a more efficient and transparent scientific ecosystem in the e-science age. interoperability standards, metadata enrichment, linked data, and convergence of metadata schemes will give high quality scientific output more visibility. libraries also need to aim at a full integration of formal publications (books, papers) with other content types such as grey literature, research data, software, audio, video, learning objects, etc. finally, repositories give governments, funding agencies, and research institutes insight into the impact of the research that they support.

wolfram horstmann, wouter schallier, jarkko siren, carlos morais-pires

since preservation of research output is no longer limited to institutional and format related boundaries, preservation becomes more complex. on the other hand, it is also an opportunity for libraries to organize preservation as a collaborative, global effort. the care for educational and scientific information as a public good also presents challenges for governments and policymakers. the emerging compound knowledge objects produced in collaborative research activities require a diverse set of services beyond the basic remit of storage; they should include easy to use services for deposit, registration, quality control, discovery, and access.
these are supplemented with information-age infrastructure elements, such as semantic standards, specialist query and visualization tools, preservation services and elements which sustain critical characteristics of the repository materials: their integrity, authenticity, usability, and their ability to be understood and discovered.

libraries in e-infrastructures
to derive greatest benefit from research data and any other form of research output, it is fundamental that library services for e-science are connected to state-of-the-art information and communication infrastructures, also termed e-infrastructures. these infrastructures include high-performance computing resources, fast networks, as well as information storage, access and management structures. thanks to a long history of co-operation, libraries are well suited to develop digital information infrastructures as a collaborative effort. recent examples of such innovative efforts involving big consortia include openaire (europe), share (usa) and coar (global). examples of services include coordinated advocacy and support (e. g. openaire national open access desks or noads), information aggregation services building on institutional repositories, reporting services for research funders and institutions, and integration of all research outputs in »enhanced publications«, »executable papers« and finally through researcher workflows. the concept of virtual research environment has been proposed as a working environment – for all sorts of scientific disciplines – that integrates all these elements and connects them with the underlying e-infrastructure.
research libraries and data centers are both immersed in the transition imposed on them by the adoption of e-science practices by the communities they serve. the complementary roles they are adopting as providers of e-infrastructure services were described by the ode project.
the services provided by libraries and data centers must necessarily be aligned to provide the integrated data and text products as well as comprehensive workflows that can best support e-science research practices. the study on authentication and authorisation infrastructures (aai) in research conducted jointly by liber (association of european research libraries) and terena (trans-european research and education networking association) is an example of how libraries and data centers can collaborate in developing common services to support e-science. it includes case studies that show how inter-institutional collaboration can be improved through the libraries’ involvement in e-infrastructures. more generally, libraries have a significant potential to provide information services for collaborative science. the dataone project and infrastructure also illustrates how libraries collaborate and provide services in linking data with publications as well as support for research data management. the force community initiative involves many librarians in their efforts to improve scholarly communication, including enhanced publications as well as citation of research data. in this context, libraries execute the institutional implementation of global approaches for providing unambiguous research information, e. g. orcid or fundref for authors and academic institutions. and, of course, libraries are getting heavily involved in research data management.
it is crucial for libraries to be involved in the development of infrastructures that ensure new ways of using scientific information, a task that may require new partnerships. in e-science this includes the creation of machine-readable scientific records and text and data mining tools.
libraries need to participate in the current debate on legal reforms relating to these technologies (a summary of the discussions as well as proposals for reform are described in the report of the text and data mining expert group).

libraries as sustainable hosts

in today’s quickly evolving research world, libraries provide a sustainable framework for specialized services. arxiv, probably the world’s most renowned open access repository, is operated by cornell university library. pubmed, the world’s most authoritative bibliography in the medical and life sciences, is operated by the national library of medicine. and datacite, the world’s most important service for providing persistent addresses for research data on the internet, is managed by the technical information library of germany in collaboration with many libraries such as the british library and the california digital library as well as many research institutions around the world.

new skills for librarians

libraries have transformed their skillset over the last decades. in order to adapt to researchers’ new requirements, libraries had to hire business analysts and staff with academic backgrounds for digital scholarship support. digital library systems require highly trained administrators and developers, and repositories require metadata as well as copyright specialists.

since research data and software have become primary research assets that often require guarantees for permanent access, libraries can provide a safe harbor for digital research objects in a dynamic environment of mobile researchers, volatile repository content, transient products and short-lived standards. libraries now need to tackle the challenge of making data and software reliably accessible and re-usable.
this requires a transformational approach to library services and the development of new skills. tasks such as the curation and stewardship of new research objects (data and software) will imply a profound revision of library and information science curricula, certificates and trainings, direct involvement in research projects, as well as learning on the job. librarians will not become experts in data analytics, which is evolving as its own discipline. but they can become stewards who provide a sustainable basis for data scientists to work on.

local cooperation

the new roles of libraries in e-infrastructures have significant implications for cooperation across the campus or the research institution. new forms of cooperation with researchers are emerging: one-to-one support and copyright advice for depositing publications, but also data consultancy in information-intensive research projects. interfaces to the financial and administrative systems of the research institutes need to be built in order to reliably link publications and data to research projects. and in all instances libraries need to closely align their activities with the computing services of the research institute to enable a seamless operation of services. thus, virtual teams across libraries, computing services and research offices are being set up to tackle new challenges such as open access publishing and research data management.

global cooperation

the grand challenges of the 21st century transcend borders, and science will be increasingly global. data-driven science will require extensive global collaborations, and researchers on each continent are striving for a leading role in the world’s production of knowledge. research data itself is global and the key issues to consider are:
1. how data can be networked
2. how to envision and set up data governance on a global scale
3. how the eu can play a leading role in helping start and steer this global trend.
an international group of research funders has been supporting the set-up of the research data alliance (rda) to enable data exchange on a global scale. the initial phase of rda has been supported by the european commission, the us national science foundation and national institute of standards and technology, and the australian ministry of research, with research funders from other countries becoming actively involved. rda is being set up to bring a diversity of stakeholders together and improve interactions between users and technology and service providers. rda is a bottom-up, community-led initiative to foster global interoperability across geographic and disciplinary boundaries. rda is open: those who want to participate in rda and shape the way the global data infrastructure operates are invited to join and take the lead on concrete initiatives. it is focused on the real needs of the research communities and will seek links with industry. it aims to be the place where practitioners stop discussing the ideal solution and/or the complete set of standards and start implementing practical solutions for data sharing and related issues. libraries are already active, some even in leading roles, in several rda working groups.

global initiatives such as coar (confederation of open access repositories) bring together several major regional repository networks from australia, canada, china, europe, latin america and the united states. coar’s ambitions are to develop sustainable repository networks all over the globe, align these networks and make them fully interoperable, increase the impact of repository content, and provide training and support. organisations like eifl, world bank, unesco, eclac and others actively promote open access to knowledge as a motor for socio-economic development.
future horizons

the new european research funding framework reflects the challenges for the next years. similar examples can be found in other research funding programs in different parts of the world. horizon, the eu framework programme for research and innovation, was adopted in december . a quote from the regulation starts: »horizon should support the achievement and functioning of the european research area in which researchers, scientific knowledge and technology circulate freely, by strengthening cooperation both between the union and the member states, and among the member states, […]«. horizon is open also to the participation of non-european countries. the funding scheme includes support for international partnerships, e.g. in the domain of scientific information, data and computing-intensive science (areas relevant for coar, rda, etc.).

horizon covers the period of – with a budget of approximately billion euros. its macro structure is based on three interrelated pillars: »horizon pursues three priorities, namely generating excellent science (›excellent science‹), creating industrial leadership (›industrial leadership‹) and tackling societal challenges (›societal challenges‹). those priorities should be implemented by a specific programme consisting of three parts on indirect actions and one part on the direct actions of the joint research centre (jrc).«

research infrastructures (ri) priority is part of the »excellent science« pillar and includes e-infrastructures, a.k.a. information and communication technologies infrastructures offering services for high-speed connectivity, high-performance computing and research data management.
it aims at developing a strong european research capacity in terms of instruments, installations and equipment to cope with the most demanding requirements for pushing forward the frontiers of scientific knowledge. the actions on e-infrastructures, as recently published in the horizon work programme – , cover data-intensive science and engineering, high-performance computational infrastructure, research and education networks, virtual research environments, and e-science software environments. these actions provide opportunities for partnerships of scholarly communication and data management experts from libraries and scientific communities with e-infrastructure service providers capable of exploring the technologies and know-how for data management supported by high-bandwidth communication, high-performance computing, open scientific software, and virtual research environments.

conclusions

libraries’ support for research is evolving. the key competency of information provision stays, albeit in an increasingly digital form. but the more tacit role of the library as a service organization that can provide sustainable support for knowledge resources is pushed to the foreground. text-based resources are complemented by research data. involvement in digital research methods and operation of software resources becomes a must. libraries build virtual teams with research offices and computing centers on both a local and a global level and become an integral part of a global e-infrastructure for research.

acknowledgements

the authors would like to thank their colleagues for rich and fruitful discussions.

the authors

dr. wolfram horstmann, associate director, bodleian libraries, university of oxford, broad street, oxford ox bg, uk (now university librarian, göttingen university, e-mail: horstmann@sub.uni-goettingen.de)

wouter schallier, chief, hernán santa cruz library, united nations, economic commission for latin america and the caribbean (eclac), av. dag hammarskjöld, vitacura, santiago, chile, e-mail: wouter.schallier@eclac.org
[the author has co-written this article on his personal behalf. views expressed in this article are not necessarily shared by his organisation]

jarkko siren, technical and scientific project officer, european commission dg connect, avenue de beaulieu, brussels, belgium, e-mail: jarkko.siren@ec.europa.eu
[the views of the author are his and do not commit the european commission]

dr. carlos morais pires, programme officer, coordinator for scientific data e-infrastructures, european commission, avenue de beaulieu, brussels, belgium, e-mail: carlos.morais-pires@ec.europa.eu
[the views of the author are his and do not commit the european commission]

communicating science: reform model of the gates open research platform

spaska tarandova, phd student, sofia university st.
kliment ohridski, faculty of journalism and mass communication, sofia, bulgaria
e-mail: tarandova@uni-sofia.bg

milena tsvetkova, dr., associate professor, sofia university st. kliment ohridski, faculty of journalism and mass communication, sofia, bulgaria
e-mail: milenaic@uni-sofia.bg

published reference (suggested bibliographic citation): tarandova, spaska; tsvetkova, milena. communicating science: reform model of the gates open research platform. in: communication management: theory and practice in the 21st century. th central and eastern european communication and media conference ceecom . sofia: faculty of journalism and mass communication at the sofia university “st. kliment ohridski”, , pp. - . print-isbn: - - - - ; e-isbn: - - - - . quoting from or reproduction of this paper is permitted when accompanied by the foregoing citation.

abstract: the eu's scientific potential is increasingly flowing into the world of new scientific knowledge. the object of this paper is the communication interpretation of the open science policy, covering not only access to, storage and preservation of scientific information, but also its communication aspects. purpose of the study: to establish modern trends in the scientific ecosystem oriented towards facilitating the publication and communication of scientific results. tasks of the study: to compare new solutions in science communication models on the most popular platforms, and to explore the alternatives to traditional scientific journals.
methodology/approach: a qualitative systematic review (qualitative evidence synthesis), scientific criticism of sociological surveys, methods of analytic and synthetic processing of primary and secondary resources, secondary data analysis and an overview of scientific publications available in libraries worldwide were used to obtain data about the impact of new eu solutions: the european road map for development of the european research area (era), the european strategy forum for research infrastructures (esfri), the organization for economic cooperation and development, etc. a comparative analysis of innovation in publishing platforms was conducted with special attention to the bill and melinda gates foundation's gates open research platform. results: the creators of the gates open research platform defend the view of the rapid and socially beneficial effect of new and publicly accepted scientific knowledge. the cutting-edge solutions are: transferring power from the hands of editors to the hands of authors; minimizing barriers or gatekeepers on the path of a new scientific outcome to society; assessing research not by the venue of publication but on the basis of the intrinsic value of the completed study; minimizing the funds invested in publishing and dissemination. implications: the conclusions can be important in identifying technological and ideological regularities for optimizing the model of scientific publications and increasing the speed and visibility of any scientific news.
keywords: science communication, barriers to scientific communication, scientific ecosystem, open science, open refereeing process, open access publishing model, transparent publishing, author-led publication, research-centred platform, f research, plan s

introduction

the past five years have witnessed more and more discussions in the eu about free access to scientific knowledge, in particular to the results of publicly funded projects. if europe wants to compete with the rest of the world, the regulations bearing on access to scientific knowledge need to be liberalized, and the time required to provide free access to the latest publications shortened. the feeling becomes ever more tangible that we are living in a time of “a war” for free access to scientific achievements. representatives of various stakeholders ask questions, not only amongst themselves. more and more voices are heard in public, speaking about the price of scientific knowledge, its dissemination and re-use for new scientific results.

over the past decade, it’s been getting easier and easier to circumvent paywalls and find free research online. one major reason: the active effort of the so-called science pirates working online for the cause of free access to, and use of, science. the most popular among them is the kazakh neurotechnology researcher and software developer alexandra elbakyan, also known as “science's pirate queen”. her (illegal) website sci-hub sees more than , visitors daily (according to data from april ) and hosts more than million academic reports. at the start of we also received two unequivocal signals from global economic players: on february elon musk opened access to tesla's patents, to be used for preserving the earth (“to help save the earth”). two months after that, on april , toyota offered free access to , of its patents.

sci-hub. “twitter@sci_hub”. accessed june. https://twitter.com/sci_hub/status/
the moods among scientists from all over the world, verging on frustration and disappointment, allow one to formulate the prediction that we are entering an era of scientific communism when knowledge will become free. in , the vox portal surveyed scientists from different countries to determine what problems they believe are hindering modern science from developing dynamically. based on the survey findings, seven main obstacles were formulated, among which the inaccessibility of scientific information was ranked fifth:
1) academia has a huge money problem;
2) too many studies are poorly designed;
3) replicating results is crucial, and rare;
4) peer-review is broken;
5) too much science is locked behind paywalls;
6) science is poorly communicated;
7) life as a young academic is incredibly stressful.

against this background, three groups of open access defenders stand out:
1) librarians and science funders are playing hardball to negotiate lower subscription fees to scientific journals. jeffrey k. mackie-mason, university librarian and chief digital scholarship officer at the university of california, berkeley, told vox media on june : “[the publishers] know it’s going to happen. they just want to protect their profits and their business model as long as they can.”
2) scientists, increasingly, are realizing they don’t need paywalled academic journals to act as gatekeepers any more. they are finding clever workarounds, making the services that journals provide free.
3) open access crusaders, including science pirates, have created alternatives that free up journal articles and pressure publishers to expand free access.

simranpal singh, “tesla patents made public to save the world, reveals elon musk.” gizmo china. accessed june. https://www.gizmochina.com/ / / /elon-musk-tesla-patents
paul ridden, “toyota offers free access to over years of electric vehicle patents.” new atlas. accessed june. https://newatlas.com/toyota-royalty-free-patents-electric-vehicle-technology/
julia belluz, brad plumer, and brian resnick, “the biggest problems facing science, according to scientists.” vox. accessed june. https://www.vox.com/ / / / /science-challeges-research-funding-peer-review-process
brian resnick and julia belluz, “the war of free science: how librarians, pirates, and funders are liberating the world’s academic research from paywalls.” vox. accessed june. https://www.vox.com/the-highlight/ / / / /open-access-elsevier-california-sci-hub-academic-paywalls

background

the political and economic context of the digital age, connected with the creation, dissemination and use of scientific knowledge, is changing. the organization for economic co-operation and development (oecd) was the first to announce a policy of open science in . oecd's digital economy papers, published at the start of , predict the appearance of a multitude of platforms and ecosystems offering goods, services, information, knowledge and new forms of intermediation for accessing and using them. the transformation in the economy calls into question the traditional thinking about how to organize and implement economic and social activities most effectively. the digital ecosystems offer users comfort with a familiar interface that creates a sense of ease of use.
the development of digital platforms raises questions related to equal access and market concentration. the oecd urges governing bodies to develop public platforms, either individually or in partnership with commercial platforms, to provide administrative and social services in the implementation of public policies.

in september , the european commission and the european research council (erc), along with eleven national research funding organizations, announced the launch of plan s to make full and immediate open access to research publications a reality. in the coalition was joined by funding organizations: european research funding organizations and three charities (including the bill & melinda gates foundation). the funders state they control around € . billion of funds annually. this represents less than % of the nearly $ trillion global spend on r&d. however, it is the academic papers arising from plan s funders’ r&d activities that determine the effects on the scholarly publishing market. in this context, plan s funders have a more significant influence (table ).

oecd. digital economy papers. paris: oecd publishing, no. . accessed june. doi: . / ade bba-en.
marc schiltz, “why plan s.” coalition s. accessed june. https://www.coalition-s.org/why-plan-s
dan pollock and ann michael, “potential impact of plan s.” delta think. accessed june. https://deltathink.com/news-views-potential-impact-of-plan-s

table . plan s: share of scholarly articles in context

the consortium around plan s, called coalition s, works with digital science and combines the latter's data with data from delta think's open access data & analytics tool, which makes it possible to estimate the funders' share of research output. plan s funders account for roughly . % of articles published globally.
these include all articles where a plan s funder is involved, even as part of a jointly funded or multi-author project. although many of the plan s funders are national, they account for just over one fifth of their respective countries’ publication output. also, as plan s funders are oa advocates, they account for a higher-than-average share of oa output. plan s principles are also consistent with those of other oa-advocacy countries (germany), several institutions (university of california), funders (bill & melinda gates foundation), and with the broader eu principles of a move to oa by . it is reasonable to posit that plan s will gain additional support from a variety of oa stakeholders. one such example is germany. its absence from the coalition may well be a matter of timing due to its ongoing publisher negotiations, rather than differences in long-term position.

dan pollock and ann michael, “potential impact of plan s.” delta think. accessed june. https://deltathink.com/news-views-potential-impact-of-plan-s
coalition s. brussels, belgium: science europe. accessed june. https://www.coalition-s.org
digital science. london: digital science & research ltd. accessed june. https://www.digital-science.com
delta think open access data & analytics tool (oa dat). delta think. accessed june. https://deltathink.com/open-access/oa-data-analytics-tool

figure . change in market value of plan s uptake scenarios compared with current projections

plan s includes a number of revolutionary principles that impact the market. its preamble and principles mention banning publication in hybrid journals, requiring cc-by (creative commons attribution) licenses to be held by the author, instituting caps on apc (article processing charge) funding, and the coalition s position on using the journal impact factor for quality assessment. there is broad advocacy for a widespread ban of the hybrid model by eu funders, covering the oa output of all eu countries, among them high oa-uptake countries such as the uk, austria, the netherlands, and sweden.

reactions to plan s have ranged from delight among oa advocates, to suggestions that this is simply a part of the ongoing discussion about oa, to responses from the mainstream scholarly publishing community urging more detailed consideration of the complexities of the scholarly publishing market, to concerns from some researchers that it will deprive them of quality journal venues and of international collaborative opportunities.

dan pollock and ann michael, “potential impact of plan s.” delta think. accessed june. https://deltathink.com/news-views-potential-impact-of-plan-s

the planned launch of plan s, with the primary goal of opening access to publicly funded research in the european union as of january , was postponed to . following consultations with academic libraries, publishers and researchers, coalition s announced that until , eased requirements will apply:
1) coalition s will not place a cap on the cost of publishing a paper in an open-access journal, but journals must be transparent about publishing costs.
2) coalition s changed the rules concerning hybrid titles and offered “transformative agreements”, which give these partly paywalled journals a route to becoming open access.
3) coalition s will ignore the prestige of journals when making funding decisions.
4) in some cases, researchers will be able to publish work under more restrictive open licences, when approved by coalition s.
the reasons for the postponement can be found in two directions - in the resistance of the publishing community whose actions are increasingly in the direction of protecting their own profit, rather than protecting the quality of research and the interests of the authors, or related to the protection of the interests of researchers, their copyright and the quality of research output. it can only be noted that the use of hybrid journals is a temporary measure to full open access. plan s is intended to accelerate the changes in this direction. its small core of funders can have a significant impact in the future when access to research publications will increasingly be through open science on-line platforms. methodology hypothesis: revolutionary changes in the organization and functioning of academic journals are looming, and the model of scholarly publishing will be changed for good. object: open-access resources for research communication subject: the positive changes for academic authors and their publications in the contest of the digital transformation coalition s, plan s: principles and implementation. brussels, belgium: science europe, . accessed june , , https://www.coalition-s.org/principles-and-implementation holly else, “radical open-access plan could spell end to journal subscriptions.” nature , ( ): - . accessed june , , doi: . /d - - - . https://www.coalition-s.org/principles-and-implementation/ the study focused on the development of the view of a rapid-effect socially beneficial science based on an open-access policy, and addressed four overarching research questions: . what changes are expected at the eu level in respect of access to publicly- funded scientific output produced by the research effort of international teams? . are the questions about the purpose of scientific achievements primarily of moral and philosophical essence, or are they predominantly related to economic and business interests? . 
are the editorial teams of scientific journals threatened by the two ongoing debates - about the effectiveness of open peer reviews and about ignoring the significance of the impact factor (if and ir) of their publications?

the qualitative systematic review (qualitative evidence synthesis), the methods of analytic and synthetic processing of primary and secondary resources, secondary data analysis and an overview of scientific publications available in libraries worldwide were used to obtain data about the impact of new eu solutions: the european road map for development of european research area (era), the european strategy forum for research infrastructures (esfri), the organization for economic cooperation and development, etc. the analysis of innovation in publishing platforms was conducted with special attention to the bill and melinda gates foundation's platform gates open research.

object of the research: gates open research platform

the bill & melinda gates foundation was among the first open-science funders in the world. as far back as in gates supported the then newly founded berlin-based researchgate, the most popular and free networking website for scientists, with funding to the amount of usd million. in november , the bill and melinda gates foundation changed the rules for research funding by putting in place an open-data policy. researchers could publish in subscription journals but had to guarantee that after months their papers would be made freely available. [bill & melinda gates foundation. seattle, wa, . accessed june , . https://www.gatesfoundation.org/what-we-do] as of january , after a so-called “grace period”, the foundation's rules were changed and publishing with closed access is no longer allowed.
“personally, i applaud the gates foundation for taking this stance,” says simon hay, a gates-funded researcher who is director of geospatial science at the institute for health metrics and evaluation in seattle, washington. “the overwhelming majority of my colleagues in global health and fellow gates grantees with whom i have chatted are highly supportive of these developments,” he says. the foundation requires the publication of articles under the free creative commons attribution license which enables dissemination and processing of material subject to designation of authorship. scientists who do research funded by the bill & melinda gates foundation are not allowed to publish papers about that work in journals that include nature, science, the new england journal of medicine (nejm) and the proceedings of the national academy of sciences (pnas). this is due to the fact that the charity requires from grant recipients to publish open-access research, whereas the journals in question do not offer this kind of open-access publishing. a spokesperson for nature’s publisher, springer nature, said that most springer nature journals do comply with the gates foundation policies, but a “small number”, including nature and some nature-branded research titles, do not. “at the moment we believe the subscription model is still the best way to provide sustainable and widespread access to journals with low acceptance rates such as nature and the nature-branded research and reviews titles,” the spokesperson added. richard van noorden, “gates foundation announces world’s strongest policy on open access research.” nature, . . . accessed june , , http://blogs.nature.com/news/ / /gates-foundation- announces-worlds-strongest-policy-on-open-access-research.html richard van noorden, “science journals end open-access trial with gates foundation.” nature, . . . 
accessed june , , http://www.nature.com/articles/d - - - richard van noorden, “gates foundation research can’t be published in top journals.” nature , ( . . ): . accessed june , , http://www.nature.com/news/gates-foundation-research-can-t-be-published-in-top-journals- .

results

gates open research is the newest publication medium that researchers supported by the gates foundation can use in order to disseminate their data in a way which is fully compliant with their open access policy. the website was launched on january as a platform for rapid publication by researchers, with transparent peer review. publications with closed/paid access are not admitted as of this date. gates open research is based on f research’s format. f is an abbreviation for the faculty of - a cadre of experts who provide peer review and recommendations as needed. f research is an open science post-publication peer review platform. the bill and melinda gates foundation is the second funding body to partner with f to generate an open-access academic publishing platform (the first was the wellcome trust). the gates open research platform advocates the view of the rapid and beneficial societal impact of the new and publicly accepted research. essentially, the entire work in the platform is carried out by the team of f research, and the foundation covers the publishing costs. the publication of an article up to , words costs $ , from , to , words - $ , and more than , words, $ , . the wellcome trust charitable foundation works on the same principle.
in november the charity signed an agreement with f research and has since published about research articles for wellcome open research. on average, an article costs the charity $ , the manuscript reaches the website within seven days and is refereed in the course of one month.

gates open research gives authors significantly more control than a traditional publication model normally gives them. authors can decide what and when to publish, including replication studies and negative results. authors will also be able to suggest reviewers for their paper or choose from a list of suggested reviewers. this is the essence of the author-led open peer-review model. the refereeing process takes days at the most (figure ). once submitted, the article has to pass basic editorial checks by the f faculty prior to publication. this final process usually takes seven days (figure ). an important fact is that the grantees of the bill & melinda gates foundation each year publish , - , papers in the area of healthcare and education.

[gates open research. london: bill & melinda gates foundation, f research ltd., . accessed june , , https://gatesopenresearch.org; f research. london: science navigation group, . accessed june , , https://f research.com]

figure . the length of the refereeing process: days

figure . the length of the publishing process for articles: days

to recap, below we offer a summary of the most important characteristics of the new model of intermediation in science legitimized by the gates open research:

benefits for researchers: enables authors, not editors, to decide when to make their research available. authors suggest peer reviewers and control the process. all types of research can be published rapidly: articles, data sets, null results, protocols, case reports, incremental findings, etc.

benefits for research: shifts the way research and researchers are evaluated. moves away from journal-based ranking towards direct assessment of individual outputs. supports research assessment based on the intrinsic value of the research, not the venue of publication.

benefits for society: reduces the barrier to collaborative research through data sharing, transparency and attribution. reduces research waste and helps to remove the bias in our understanding of research. enables others to build upon new ideas right away, wherever and whoever they are.

[gates open research, guidelines for article reviewers. london: bill & melinda gates foundation, f research ltd., . accessed june , , https://gatesopenresearch.org/for-referees/guidelines; gates open research, how it works. london: bill & melinda gates foundation, f research ltd., . accessed june , , https://gatesopenresearch.org/about]

the gates foundation is dedicated to the belief that all lives have equal value and everyone deserves the opportunity to lead a healthy and productive life. to solve the challenges of the st century, we must accelerate open access to high-quality research on health, education, and economic development. gates open research is designed to ensure that the research we fund can be of immediate benefit to society.

if we are to summarize the contribution of the platform, it is found in the following:
• a shift from author-centred to research-centred platform;
• transfer power from the hands of the editors to the hands of the authors;
• minimize barriers or gatekeepers on the path of the new scientific outcome for society;
• transparent peer-review of research;
• assessment of the research not in view of the venue of publication but on the basis of the intrinsic value of the completed study;
• minimizing the funds invested in publishing and dissemination.

conclusion

after , the landscape of scholarly publishing is much different, thanks in large part to non-governmental funds that already mandate open access.
large foundations such as ford, gates and hewlett have adopted strong open-access policies that require research to be not only publicly available, but also licensed to allow re-publishing and re-use by anyone. the world's second-largest charitable foundation, the wellcome trust, also offers free access to the scientific output of everyone who receives financial support from it. but if the publisher does not allow them to publish for free access, the wellcome trust allows such articles to be embargoed for up to six months.

the circumstances that were examined indicate that revolutionary changes in the organization and functioning of academic journals are looming, and the model of scholarly publishing will be changed for good:
• the barriers and gatekeepers on the path of new scientific outcomes to society will be reduced, influenced by the tendency towards disintermediation in the financial sector.
• the funds invested in scientific communication will be streamlined.
• the benefits of open refereeing will be advanced.
• the future models of communicating science will centre on new knowledge and new scientific outcomes, and not on the author or the venue of publication (the name of the journal).
• the platforms for scientific knowledge creation and sharing will shift from being researcher-centric platforms to being research-centric platforms, hand in hand with the shift of the media environment from an “economy of attention” towards an “ecology of attention”.

the eu's research potential is increasingly entering a research ecosystem of decommodification and decapitalization. it may well be that the driving forces behind a more radical and urgent change are entrepreneurs and philanthropists such as bill gates and elon musk. universality is a fundamental principle of science. only results that can be discussed, challenged, and reproduced by others qualify as scientific. the moral solution is open access.
what is needed is to find the proper legal framework for a fair distribution of the benefits between science and society. “knowledge is not simply another commodity. on the contrary. knowledge is never used up. it increases by diffusion and grows by dispersion”, says daniel boorstin, u.s. library of congress director ( - ).

[yves citton, the ecology of attention (cambridge: john wiley & sons, ), .]

bibliography

belluz, julia, brad plumer, and brian resnick. “the biggest problems facing science, according to scientists.” vox, . . . accessed june , . https://www.vox.com/ / / / /science-challeges-research-funding-peer-review-process

bill & melinda gates foundation. “what we do.” seattle, . accessed june , . https://www.gatesfoundation.org/what-we-do

citton, yves. the ecology of attention. cambridge: john wiley & sons, .

coalition s. plan s: principles and implementation. brussels, belgium: science europe, . accessed june , . https://www.coalition-s.org/principles-and-implementation

coalition s. “about.” brussels, belgium: science europe, . accessed june , . https://www.coalition-s.org

delta think open access data & analytics tool (oa dat). “about.” delta think, philadelphia, . accessed june , . https://deltathink.com/open-access/oa-data-analytics-tool

digital science. “about.” london: digital science & research ltd, . accessed june , . https://www.digital-science.com

else, holly. “radical open-access plan could spell end to journal subscriptions.” nature , ( ): - . accessed june , . doi: . /d - - - .

f research. “about.” london: science navigation group, . accessed june , . https://f research.com

gates open research. “about.” london: bill & melinda gates foundation, f research ltd., . accessed june , . https://gatesopenresearch.org

gates open research. guidelines for article reviewers. london: bill & melinda gates foundation, f research ltd., . accessed june , . https://gatesopenresearch.org/for-referees/guidelines

gates open research. how it works. london: bill & melinda gates foundation, f research ltd., . accessed june , . https://gatesopenresearch.org/about

oecd. digital economy papers. paris: oecd publishing, no. , . accessed june , . doi: . / ade bba-en.

pollock, dan and ann michael. “potential impact of plan s.” delta think, . . . accessed june , . https://deltathink.com/news-views-potential-impact-of-plan-s

resnick, brian and julia belluz. “the war of free science: how librarians, pirates, and funders are liberating the world’s academic research from paywalls.” vox, . . . accessed june , . https://www.vox.com/the-highlight/ / / / /open-access-elsevier-california-sci-hub-academic-paywalls

ridden, paul. “toyota offers free access to over years of electric vehicle patents.” new atlas, . . . accessed june , . https://newatlas.com/toyota-royalty-free-patents-electric-vehicle-technology/ /

schiltz, marc. why plan s. coalition s, . . . accessed june , . https://www.coalition-s.org/why-plan-s

sci-hub. “twitter@sci_hub.” . . . accessed june , . https://twitter.com/sci_hub/status/

singh, simranpal. “tesla patents made public to save the world, reveals elon musk.” gizmo china, . . . accessed june , . https://www.gizmochina.com/ / / /elon-musk-tesla-patents

van noorden, richard. “gates foundation announces world’s strongest policy on open access research.” nature, . . . accessed june , . http://blogs.nature.com/news/ / /gates-foundation-announces-worlds-strongest-policy-on-open-access-research.html

van noorden, richard. “gates foundation research can’t be published in top journals.” nature , ( . . ): . accessed june , . http://www.nature.com/news/gates-foundation-research-can-t-be-published-in-top-journals- .

van noorden, richard. “science journals end open-access trial with gates foundation.” nature, . . . accessed june , . http://www.nature.com/articles/d - - -

september access feature

breathing life into digital collections at the british library
by mia ridge

introduction

how are research libraries preparing to meet the needs of st century researchers? for the past decade, the british library’s digital scholarship team has worked to ensure that the library’s collections, systems, policies and processes meet the emerging needs of anyone who wants to conduct innovative research with the library’s digital collections and data. this article firstly provides some context for the british library’s investment in this area, then discusses how the team seeks to understand and encourage the use of collections in digital scholarship and, finally, addresses some of the challenges this entails.

biography

dr mia ridge is the british library’s digital curator for western heritage collections. as part of the library’s digital scholarship team, she enables innovative research based on digital collections, providing guidance and training on computational methods for historical collections. current projects include crowdsourcing work with historical playbills, and experimenting with machine learning. her phd was titled ‘making digital history: the impact of digitality on public participation and scholarly practices in historical research’.
formerly lead web developer at the science museum group, her career began in australia with roles at melbourne museum and vicnet at the state library of victoria. working at scale — the collections of the british library the british library is the national library of the united kingdom. its purpose is to make the intellectual heritage represented by its collections accessible to everyone, for ‘research, inspiration and enjoyment’. this work is supported by six core purposes, some of which — including working internationally to advance knowledge and mutual understanding; supporting and stimulating research of all kinds; inspiring learners of all ages and engaging everyone with memorable cultural experiences — are directly linked to the library’s support for digital access to collections for research and learning. one of the largest libraries in the world, the british library holds an estimated – million items, including over million books; million stamps; , manuscript volumes; million maps; million patents; , journal titles; sound files; pamphlets, magazines, sheet music and newspapers; television and radio recordings; and archived websites. over million new items are added every year, and as digital publishing increases in volume, within a few years the library expects to ingest terabytes of data a day. digital scholarship at the british library the digital scholarship team was set up in to enable innovative research with the library’s digital collections and data. four digital curators (including the author) are embedded within specific collections and curation departments, and provide advice, support and training in the creation and use of relevant digital collections for their staff. 
for example, my current focus is exploring the applications of data science-based research methods to digitised historical collections, by investigating algorithm-based metadata generation to make collections more discoverable, and seeking to understand how disciplines such as computer vision [http://blogs.bl.uk/digital-scholarship/ / /seeing-british-library-collections-through-a-digital-lens.html] or computational linguistics approach library collections. other digital curators work on specific digitisation projects, building capacity for digital scholarship within potential user groups through workshops, pilots and documentation. we want to help researchers think beyond reading a page at a time or manually compiling a database of records they’re interested in, to thinking about ‘reading’ thousands of pages from hundreds of sources or using text and data-mining techniques to scale up their research.

we collaborate closely with the mellon-funded british library labs [https://www.bl.uk/projects/british-library-labs] team, the endangered archives programme [http://eap.bl.uk/], it projects (such as the new, standards-based item viewers [http://blogs.bl.uk/digital-scholarship/ / /new-viewer-digitised-collections-british-library.html]), and other researcher-focused teams. collectively we aim to share knowledge, expertise and experience; to connect scholars with the resources they need; and to experiment with digital methods to address barriers to collections access for users. below, i outline key activities that help us meet those goals.

building internal capacity - training library staff

a key activity for the team is devising and running training in digital scholarship for other library staff. begun in , our digital scholarship training programme [https://www.bl.uk/projects/digital-scholarship-training-programme] is the result of an extensive consultation exercise and survey of the digital scholarship landscape to understand the foundational concepts, methods and tools with which staff would need to be familiar (mcgregor et al. ). courses provide a mixture of hands-on, practical exercises, and time to explore and discuss innovative digital projects and case studies. providing training in subjects such as crowdsourcing and data visualisation for cultural heritage collections, copyright and using online data sources helps library staff understand how other scholars might apply new technologies and methods to digital collections, enabling better research collaborations.

in the past year we have responded to the need for a more flexible training programme by breaking day-long workshops into modules delivered over ‘seasons’. over a season, staff can learn about topics such as ‘text and data mining for cultural heritage collections’ through a mixture of talks, practical workshops, tutorials, and guest lectures from visiting specialists. this format has several advantages: staff find it easier to attend hour-long modules, staff can try out methods on their own collections between sessions, the ability to pick and choose sessions means that attendees for each module are more engaged, and new topics can be introduced on a ‘just in time’ basis as the technology changes. the modular format also means we can invite international experts and collaborators to give talks on their specialisms with relatively low organisational overhead.
the team needs to keep pace with changes in the field, so we run a monthly reading group [http://blogs.bl.uk/digital-scholarship/ / /what-do-deep-learning-community-archives-livy-and-the-politics-of-artefacts-have-in-common.html] and hands-on ‘hack and yak’ sessions. both are open to anyone in the library interested in a topic, activity or tool featured in that session.

collaborating with external researchers

many of our external research collaborations are based on phd studentships, devised with the library’s research collaboration team [https://www.bl.uk/research-collaboration], and funded through research councils, with academic partners recruited through an open call. they provide access to library collections and expertise for phd students, while we learn from their in-depth explorations of specific research questions or methods. students can attend our training programme, and are invited to give staff talks or run workshops based on their research, further strengthening the training programme. we also take part in the library’s programme for three-month phd placements, which provide valuable experience for students while helping deliver useful outcomes. we have also supervised undergraduate and master’s dissertation projects, working with printed heritage, manuscripts and archives colleagues to shape their research projects around specific collections.

challenges for internal and external collaboration

building digital collections and scholarship into traditional structures can be challenging. for example, if a staff member is inspired to try text mining after attending a training session, they must first navigate the various permissions needed to access digitised sources, install software on their work computer and find the time to experiment. for the library, the scale of the collections means that tools that work at a local scale may not be suitable for larger or more complex collections.
turning ad hoc pilots or experiments into larger, integrated projects is a challenge. on a more positive note, this provides some insight into the challenges that external researchers face when incorporating digital scholarship methods into their work. assuming they can find the right skills or collaborators to get started, academics may face challenges finding suitable outlets for publishing work based on digital scholarship. if they publish in traditional disciplinary journals, they may have to minimise computational aspects of their research, while journals in digital fields may only be looking for ‘new’ or ‘innovative’ work.

opening access to data

publishing well-documented digital and digitised collections online, under licences that encourage scholarly, creative and commercial reuse, is vital. the library has published items on a range of platforms. over million digitised images from th century books [http://britishlibrary.typepad.co.uk/digital-scholarship/ / /a-million-first-steps.html] are freely available from flickr commons [https://www.flickr.com/photos/britishlibrary/], while the text is available via jisc’s historical texts site [https://historicaltexts.jisc.ac.uk/home]. library content, from maps to images of book bindings, also appears on wikimedia commons [https://commons.wikimedia.org/wiki/category:images_from_the_british_library]. the library’s metadata team has published a range of catalogues as datasets [http://www.bl.uk/bibliographic/datafree.html] in formats including linked open data (sparql, basic rdf/xml), ‘researcher format’ (csv), marc via z . and pdf.
some data from the uk web archive [https://www.webarchive.org.uk] is available for reuse. we also published linked open data descriptions of learning resources in collaboration with the bbc’s research and education space project [http://blogs.bl.uk/digital-scholarship/ / /how-can-a-turtle-and-the-bbc-connect-learners-with-literature.html]. building on the work of bl labs and digitisation colleagues in collecting files from legacy digitisation projects, the library launched an open data portal [https://data.bl.uk] in . we have found that publishing academic datasets built on british library collections can give them a new lease of life, encouraging their use by early career scholars, and by established researchers looking for ‘challenge datasets’ they can test their tools with.

challenges for publishing usable collections data

however, there are several reasons why collections and metadata published by the library may be challenging for would-be digital scholars. overall, the biggest challenge is the pace of cataloguing and digitisation in relation to the scale and variety of the collections. our best estimates are that – % of collections are digitised or born digital. while ideally all digitised items should have detailed catalogue records and specialist metadata, and automatically transcribed text to enable full-text search and reuse, this is not always the case. cataloguing and digitisation practices have varied over time and between projects, and the resulting variability in metadata quality increases the challenge in finding and using relevant collections in digital (or indeed, any) scholarship. entities recorded in metadata about historical collections, such as dates, names and places, may be uncertain, ambiguous, imprecise and generally ‘messy’ compared to modern data. this can cause problems for systems that expect modern, precise records about conventional books.
researchers may initially have high expectations for digitised collections. the availability of accurately transcribed text is key for many digital methods, including text and data-mining techniques to extract the topics, people, places and other entities mentioned in the text, drawing network graphs of relationships between entities, or algorithmically compiling quantitative records for analysis. when published online, a single dataset of digitised texts can be used by multiple researchers. for example, the library’s th century newspaper collections have been studied to answer questions on the depictions of london in british newspapers [https://ihrdighist.blogs.sas.ac.uk/ / / /tuesday- -january- -tessa-hauswedell-european-or-imperial-metropolis-depictions-of-london-in-british-newspapers- - /], the locations of political meetings [https://ihrdighist.blogs.sas.ac.uk/ / / /tuesday- -february-katrina-navickas-political-meetings-mapper-with-british-library-labs-mapping-the-origins-of-british-democratic-movements-with-text-mining-nlp-geo-parsing-and-crowd-sourcing/], attitudes to immigrants and refugees [http://www.lancaster.ac.uk/people-profiles/ruth-byrne], and the temporal and spatial relationship to disease [http://blogs.bl.uk/digital-scholarship/ / /a-temporal-spatial-investigation-of-disease-in- th-century-british-newspapers.html].

however, resources are rarely available for manually transcribing and marking-up records, and the quality of text transcribed with optical character recognition (ocr) tools can be poor, particularly for early digitisation projects. (re-ocring material can help, where resources allow.) the library is exploring methods for handwritten text recognition [http://transkribus.eu/], which have the potential to transform access to manuscript and archive collections.
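as a rough illustration of the entity extraction and network-graph methods mentioned above, the sketch below counts how often pairs of named entities appear in the same transcribed page. it is a minimal example under stated assumptions, not a description of any tool used at the library: the gazetteer, the sample pages and the function names are all hypothetical, and real projects would normally use a trained named-entity recogniser rather than a fixed word list.

```python
from collections import Counter
from itertools import combinations

# Hypothetical gazetteer: lower-cased names to look for, mapped to an entity type.
GAZETTEER = {
    "london": "place",
    "manchester": "place",
    "peterloo": "event",
}

def entities_in(text):
    """Return the sorted list of gazetteer entries that occur as whole words in text."""
    words = {w.strip(".,;:!?'\"()").lower() for w in text.split()}
    return sorted(words.intersection(GAZETTEER))

def cooccurrence(documents):
    """Count how often each pair of entities is mentioned in the same document."""
    edges = Counter()
    for doc in documents:
        for pair in combinations(entities_in(doc), 2):
            edges[pair] += 1
    return edges

# Invented sample text standing in for OCR output from digitised pages.
SAMPLE_PAGES = [
    "A meeting was held in Manchester, and news reached London.",
    "London papers reported on Peterloo; Manchester was quiet.",
    "No places named here.",
]

if __name__ == "__main__":
    # Each (entity, entity) pair is an edge in the network; the count is its weight.
    for pair, count in cooccurrence(SAMPLE_PAGES).most_common():
        print(pair, count)
```

the weighted pairs can be loaded into any graph tool to draw the network. because matching is done on exact words, ocr errors in the transcription silently drop entities, which is one reason transcription quality matters so much for these methods.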
figure : in an ideal world, this digitised page would be tagged with a linked data identifier to clarify whether ‘melbourne’ refers to victoria or florida. source: https://archive.org/details/ mysteriesofmelbournelife

applying content mining techniques to process items at scale has massive potential for digital scholarship and the discoverability of collection items. this, in turn, brings new challenges, including integrating tools for post-digitisation semantic enhancement into existing workflows, negotiating the role of curators and cataloguers in relation to these new tools, and finding software for processing non-western materials and non-textual digital collections such as the uk web and sound archives. newer forms of digitisation, such as d modelling, which create complex digital objects, put further pressure on internal data systems but offer new possibilities for accessing objects, as explored by digital curator dr adi keinan-schoonbaert [http://britishlibrary.typepad.co.uk/asian-and-african/ / /cant-judge-a-book-by-its-cover-perhaps-you-can.html].

publishing collections as datasets creates practical issues, too. when individual collection items are combined into datasets, their sheer size can create challenges for researchers. for example, one dataset available for download from the library’s data portal [https://data.bl.uk/] is over gb in size. smaller datasets may still be over gb in size, making them difficult to download, uncompress, store, and computationally process for all but the most well-resourced researchers.
copyright and data protection laws can further limit immediate access to collections. we also face more subtle issues. the library’s catalogues are traditionally based around the ‘deliverable unit’, the physical codex, bound volume or archive box that can be ordered to the reading room. however, emergent practices such as crowdsourced tagging and transcription, machine learning-led classification and content mining target single pages, or even regions of a page, and this has changed expectations about what a catalogue record represents. the mismatch in granularity between catalogues that describe the deliverable unit and technologies that describe the images and text on specific regions of manuscript, sheet or page must be resolved for us to take full advantage of newer technologies.
the role of outreach
publishing data online and hoping that people will find it is not enough: an active programme of outreach activities is key for encouraging the use of digital collections. the bl labs [https://www.bl.uk/projects/british-library-labs] team has taken digitised collections out to universities on ‘roadshows’. these workshops are an opportunity to highlight innovative uses of digital collections and encourage academics to think creatively about including resources in their research and teaching [http://britishlibrary.typepad.co.uk/digital-scholarship/ / /success-story-the-bl_labs-roadshow- .html]. these events are also popular with university library staff curious to learn how we’ve faced some of the challenges, as well as academics who are considering digital scholarship projects. running competitions (or preparing material for use in other competitions) is an effective way to motivate the use of collections.
from to , the bl labs team invited researchers, developers and artists to submit a research question or creative idea drawing on the library’s digital content and data to its annual competition, and supported the winners in developing their ideas. digital curator stella wisdom has also run the off the map competitions [https://www.bl.uk/projects/off-the-map], a videogame design competition for uk students. students use digitised british library ‘assets’ including maps, views, texts, book illustrations and recorded sounds as creative inspiration. efforts by colleagues including nora mcgregor to include historical arabic manuscripts in technical competitions will help improve automatic text transcription for non-english items. by defining time-limited projects with clear expectations about what to submit and which rewards are possible, the competition format has encouraged creative uses of digital collections. the annual british library labs awards recognise outstanding work using the library’s digital collections and data in four categories: research, artistic, commercial and teaching/learning. the award format encourages people to nominate work with collections that would otherwise be difficult to track, and provides material for case studies. crowdsourcing tasks related to collections metadata is another form of outreach, engaging new audiences while making our collections more discoverable (ridge ). our most recent project, in the spotlight [http://playbills.libcrowds.com/], was designed for both engagement and productivity. we added elements to the task interface to encourage participants to download images, view the full item on the main website, add their own tags to describe playbill sheets, comment on a sheet or discuss their findings on a forum. this approach appears to be working, as participants have shared interesting finds with us, and we recently celebrated our , th contribution.
in addition to the activities outlined above, members of the team present at conferences and summer schools. we publish articles, case studies [http://bl.uk/digital] and blog posts [http://britishlibrary.typepad.co.uk/digital-scholarship/] on digital scholarship with the library’s collections. case studies published on the digital scholarship website [http://bl.uk/digital] help scholars understand how their work could benefit from new and emerging methods for working with digitised collections. we also deliver versions of our training programme courses for phd students and academic departments, and run evening events on topics related to digital scholarship for the public.
conclusion
describing this work at the british library is to write from a position of privilege. the library’s investment in digitisation and digital scholarship is unusual, as are the hundreds of years spent building collections at this scale. however, many of the activities described above can be scaled up or down for use in different contexts, or adapted in collaboration with other departments. technology underlies many of the methods referenced but the real difference is in our investment in outreach, and in the library’s commitment to make collections accessible to everyone, for ‘research, inspiration and enjoyment’.
figure : screenshot of the in the spotlight interface, with interface elements designed to encourage exploration highlighted in red and orange.
references
mcgregor, n, ridge, m, wisdom s & alencar-brayner a , ‘the digital scholarship training programme at british library: concluding report & future developments’, text. available at: http://dh .adho.org/abstracts/static/data/ .html.
ridge, m, , ‘from tagging to theorizing: deepening engagement with cultural heritage through crowdsourcing’, curator: the museum journal vol. , no. .
towards a new approach in the study of ancient greek music: virtual reconstruction of an ancient musical instrument from greek sicily
angela bellia
institute for archaeological and monumental heritage, national research council of italy
abstract
in the summer of , the institute of fine arts, new york university selinunte mission began to explore the interior of the cella of temple r. this excavation showed that the classical and archaic layers had been sealed by a deep fill of the hellenistic period and left untouched by earlier archaeological research at the site. among the discoveries were a series of votive depositions positioned against the walls, dating to the sixth century bce. one of the most striking finds among those votive depositions was two parts of a bone aulos, which can be dated to bce. the virtual reconstruction of the aulos found in temple r at selinunte aims to increase and improve its scientific investigation, overcoming the limitations caused by the fragility of the instrument. digital technology has allowed us to produce a three-dimensional (3d) model of the aulos. this digital model has been translated into a 3d artificial copy, using polymer as a material. our goal is to reconstruct the aulos, after analysing its organological characteristics. we also hope that this new study of the aulos will increase our knowledge of ancient greek music.
correspondence: angela bellia, institute for archaeological and monumental heritage, national research council of italy, c/o palazzo ingrassia - via biblioteca, - catania. e-mail: angbellia@gmail.com or angela.bellia@unibo.it
digital scholarship in the humanities, vol. , no. , . © the author(s) . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqy advance access published on september
introduction
the discipline of archaeomusicology is based on analyses of ancient depictions of music and finds of musical instruments in their archaeological context, rather than in isolation. this is the research methodology and approach of the telestes project, which has been funded by the european commission’s marie curie actions programme. this project is dedicated to the musical culture of selinus (modern selinunte), one of the most important western greek cities. in , the institute of fine arts at new york university, in collaboration with the superintendency of trapani and the archaeological park of selinunte, initiated a project of topographical, architectural, and archaeological investigation of the main urban sanctuary on the acropolis in selinunte, under the direction of clemente marconi (marconi, ; marconi and scahill, ). a very important part of the telestes project is the study of an actual aulos that was discovered in two pieces under temple r. in the summer of , the institute of fine arts, new york university selinunte mission began to explore the interior of the cella of temple r (fig. ). this excavation showed that the classical and archaic layers had been sealed by a deep fill of the hellenistic period fig. selinunte, state plan of temple r with indication of trench o © institute of fine arts, nyu. fig.
selinunte, temple r: aulos fragments. museum ‘baglio florio’ of selinunte. cons. . photograph by raffaele franco. © institute of fine arts, nyu. fig. selinunte, temple r: aulos fragments. drawing by filippo pisciotta © institute of fine arts, nyu. fig. aulos player. detail from red-figure amphora in palermo early fifth c. bce (n.i. )
and left untouched by earlier archaeological research at the site. among the discoveries was a series of votive depositions against the walls, dating back to the sixth century bce. one of the most striking finds among the votive depositions was two parts of a bone aulos, held today at the museum ‘baglio florio’ of selinunte, which can be dated to bce (figs. – ). the aulos was a widely used wind instrument with finger holes and a reed mouthpiece (bellia ) (fig. ). in the archaic and classical periods, the aulos was characterized by the absence of any mechanism acting on the holes to produce sound. these are so-called ‘early type’ auloi, conventionally named to distinguish them from later versions, which have keys to change their sound. this type of aulos remained in use into the hellenistic age. they were played exclusively by covering the holes in the upper part of the two reeds with fingers (either in alternation or by covering all holes at the same time) and, when present, by covering the thumbhole placed at the back of the tubes. early-type instruments were usually of different lengths: the earliest versions in bronze and wood were made up of two sections, and the later bone versions were made up of four (west, ).
fig. ( ) samos (vii c. bce); ( ) sparta (vii c. bce); ( ) chios ( - bce); ( ) ephesus ( - a.c.); ( ) perachora ( - bce); ( ) lindos (vi c. bce); ( ) poseidonia (vi c. bce); ( ) selinunte (vi c. bce); ( ) ialyssos (vi-v c.
bce); ( ) aegina (vi c. bce); ( ) brauron (ante bce); ( ) corinto (v c. bce); ( ) argos (v c. bce); ( ) athens (ante bce); ( ) delphi (v c. bce); ( ) lemno (v c. bce); ( ) locri (v c. bce?) fig. computed axial tomography. photograph by angela bellia fig. computed axial tomography of the aulos
in the present state of research, the sacred contexts of the ‘early type’ auloi (psaroudakēs ), or their sections, refer to hera at samos (moustaka, ), chios (boardman, ), perachora (dunbabin, ; psaroudakēs, ), and poseidonia (greco, ; greco, ); to artemis at sparta (dawkins, ), ephesus (hogarth, ; psaroudakēs, ), brauron (landels, ; landels, ), and aegina (furtwängler, ); to athena at lindos (blinkenberg, ; psaroudakēs, ) and ialyssos (psaroudakēs, ; psaroudakēs, ); and to persephone at locri (bellia, ). now we can add to these instruments the fragment of the aulos discovered in selinunte in the sanctuary of ‘malophoros’ (bellia, ), and the two sections found under temple r, probably dedicated to demeter ‘thesmophoros’ (marconi, ; bellia, ) (fig. ). this discovery at selinunte is very significant, particularly with regard to the performance of music and ritual dancing associated with the cult activity of temple r. as clemente marconi has argued, the performance of choral dancing in this part of the main urban sanctuary of selinus is also suggested by the discovery, in the area of temple r, of a series of fragments of corinthian vases that feature chains of dancing women conforming to the so-called ‘frauenfest’ iconography.
these discoveries show the importance of music and dance at selinus as early as the early archaic period (marconi, ; marconi, ; marconi and scahill, ). the on-going study of the musical instrument from selinunte is relevant both for its organology and for the information offered by the analysis of the type of bone used in its production. the three-dimensional (3d) virtual reconstruction of the aulos from selinunte is aimed not only at acoustic and morphological study, but also at increasing and improving the instrument’s scientific investigation.
fig. the two sections of the aulos fig. the two sections of the aulos fit together fig. second finger hole for the thumb on the underside of the pipe
computed axial tomography
the lost parts of the instrument included the other pipe. to produce a virtual reconstruction of the aulos from selinunte, we are pursuing the project by several means. the first is a computed tomography (ct) scan of the bone, which has permitted the study of its measurements and morphology, overcoming the limitations imposed by the fragility of the instrument (fig. ). furthermore, ct is an accurate method for the visualization and analysis of surfaces, volumes, internal structure, and the material density of the ancient musical instrument. the aim is to use 3d scanning to generate 3d models of ancient musical instruments (avanzini et al., , ). we also aim to develop specific tools suitable for processing the resulting 3d models.
the tools we are developing are divided into those involving the use of computational methods for processing the 3d models and those involving the development of interactive tools aimed at engaging museum visitors in the exploration of musical instruments (micheloni et al., ) (figs. – ). among a variety of applications, a rotating 3d reconstruction of the aulos can be generated, showing possible damage and the necessary repairs and modifications. within this framework, information from the undamaged parts of the object was utilized in combination with literary and iconographic sources, in an attempt to re-create the appearance of the complete object and group various fragments together.
measurements
the exterior surface retains the natural shape of the bone from which the instrument was made, including the grooves, present on both pieces and cutting right across the finger holes and the thumbhole on the longer piece. even assuming that the two sections fit together, the instrument is still incomplete. the lost portions would include the mouthpiece and the lower section with additional finger holes. bore and pipe workmanship is compatible with the use of an arched or rope drill used in antiquity by craftsmen to pierce the surfaces of objects. it is necessary to take into account that the second finger hole for the thumb (counting from the mouthpiece end) was on the underside of the pipe to the left, away from the hand, because the tube was the left member of its pair (fig. ).
fig. measurements of the section a
to refine the measurements, a ct scan was used. the scan was then read with the open-source software horos, a medical image viewer that also provides tools to extract reliable measurements from ct scans.
. measurements of section a (fig. )
overall length of the section: . mm
operating length: . mm
length of the downstream spigot: . mm
. measurements of section b (fig. )
overall length of the section: . mm
operating length: . mm
length of the downstream spigot: . mm
. distances of holes from upstream end and their diameters (fig. )
. assemblage of sections (fig. )
overall length of the sections a + b: . mm
the printed copy of the instrument
digital technology allowed us to produce a 3d model of the aulos. this digital model has been translated into a 3d artificial copy, using polymer as a material (zoran, ) (fig. ). the digital model of the aulos has been translated into two 3d artificial copies at the school of science and engineering at the state university of new york at new paltz, and at the officina 3d lab at reggio emilia (fig. ). the officina 3d lab also produced two video clips of the reconstruction of the aulos from selinunte (fig. ). in addition to a polymer copy, on the basis of the measurements provided by the computed tomography of the instrument, pitano perra, an italian wind instrument maker, reconstructed two reed and bone copies of the aulos (fig. ). the goal is to reconstruct the instrument, and to analyse its organological characteristics. however, because the upper end and the lower part of the aulos are missing, the possible pitches can only be deduced indirectly, by finding the instrument length with which the finger holes would yield a plausible scale. although this principle seems promising, it will not be able to provide convincing results regarding the aulos from selinunte, as we have just a single pipe.
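the indirect deduction described above can be illustrated with a rough numerical sketch. the python fragment below treats the pipe as an idealised cylindrical tube closed at the reed end, so that the fundamental is f = c / (4 L). this idealisation, the speed of sound, the hole distances and the candidate lengths for the lost upstream part are all assumptions made for illustration, not measurements of the selinunte aulos; end corrections and hole diameters are ignored.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def stopped_pipe_frequency(effective_length_mm):
    """Fundamental (Hz) of an idealised cylindrical pipe closed at the reed end."""
    if effective_length_mm <= 0:
        raise ValueError("effective length must be positive")
    return SPEED_OF_SOUND_M_S / (4.0 * effective_length_mm / 1000.0)


def cents(f_low, f_high):
    """Interval between two frequencies in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f_high / f_low)


def candidate_scale(hole_distances_mm, missing_upstream_mm):
    """Pitches for each open hole, given a guessed length of the lost upstream part.

    hole_distances_mm: distance of each hole from the preserved upstream end.
    missing_upstream_mm: hypothetical length of the lost reed and mouthpiece.
    Returns (frequency_hz, cents_above_lowest_note) pairs, lowest note first.
    """
    lengths = sorted((d + missing_upstream_mm for d in hole_distances_mm), reverse=True)
    freqs = [stopped_pipe_frequency(length) for length in lengths]
    return [(f, cents(freqs[0], f)) for f in freqs]


# Hypothetical hole positions (mm); real values would come from the CT measurements.
example_holes_mm = [60.0, 85.0, 110.0]
for extension_mm in (80.0, 100.0, 120.0):
    steps = [round(c) for _, c in candidate_scale(example_holes_mm, extension_mm)]
    print(extension_mm, steps)
```

in such a search, a candidate length would be retained only if the resulting intervals resemble a plausible scale. with a single incomplete pipe the problem stays underdetermined, which is why comparison with better-preserved instruments is needed.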
for this reason, the preserved aulos from selinunte must be subjected to closer examination based on comparisons with other similar instruments and ‘early type’ auloi. the aulos from selinunte conforms to the ‘early type’ version of the auloi found in the sanctuaries of artemis orthia at sparta (dated to the end of the seventh century bce) and artemis at brauron (dated to the sixth–fifth centuries bce), the examples found at aegina in the sanctuary of aphaia, and those from the sanctuary of persephone at locri epizephyrii. it is especially similar in form and date to an aulos found in a tomb at poseidonia in southern italy, which dates from the end of the sixth to the beginning of the fifth century bce (psaroudakēs, ). it is a well-preserved ‘early type’ greek aulos made of deer bone, housed today at the national archaeological museum of paestum (fig. ).
fig. measurements of the section b fig. distances between holes from upstream end of the section and their diameters fig. assembled sections of the aulos
on the basis of the measurements of
this well-preserved and near-complete aulos from poseidonia, the next step will be to reconstruct the incomplete tube of the aulos from selinunte.
conclusion
this study contributed towards overcoming limitations posed by traditional methods of measuring ancient musical instruments through pictures and drawings, opening up new perspectives for the study of the materials, origins, diffusion, and production process of musical instruments in antiquity. moreover, according to the anthropologist roberto miccichè, the tubes were crafted from the metatarsal of a deer, a conclusion that was corroborated by the ct scan. the osteoarchaeological results open up new perspectives for the study of the materials, origins, diffusion, and production process of musical instruments in antiquity. as pollux (onomasticon, iv, d) recalls for theban wind instruments, the leg bones of the chamois or roe deer were used for the production of the auloi. as it is a material that involved a complex manufacturing process, it can be assumed that the aulos could have been considered a precious object, more so if it was imported to selinunte from the motherland megara.
fig. the aulos from selinunte and its 3d artificial copy. photograph by angela bellia. fig. polymer copy of the aulos from selinunte fig. video of the aulos fig. american archaeological mission at selinunte, in sicily. pitano perra, an italian wind instrument maker, is playing the reconstructed ancient instrument in the same place of its discovery. the instrument has been reconstructed on the basis of the digital model translated into a 3d copy using polymer as a material fig. the aulos of poseidonia. museum of paestum. inv. photograph by stelios psaroudakēs.
the hypothesis of a provenance of the aulos from the motherland can be accounted for by the presence of auloi manufacturers in megara, whose activity must have had a long and well-established tradition if they were able to introduce innovations to the instruments. in this regard, it is interesting that telefane of megara in the iv c. bce (pseudo-plutarch, de musica, xiv, f- a) prevented the artisans from inserting additional holes into their instruments to modify their sound. it was for this reason that he was excluded from the pythian games, as he would have had to perform with the modified auloi. in conclusion, this research will develop a new theoretical basis, which will contribute to the establishment of a methodology at the crossroads of archaeomusicology and digital technologies.
funding
this research was funded by the european commission’s marie curie actions. project id: . funded under: fp -people.
acknowledgement
i would like to thank professor clemente marconi, james r. mccredie professor in the history of greek art and archaeology at the institute of fine arts at new york university, for granting me permission to study the aulos. i would also like to thank the anthropologist dr roberto miccichè for his suggestions and dr marco orlandi for helping in the aulos measurements.
references
avanzini, f., canazza, s., de poli, g., fantozzi, c., pretto, n., rodà, a., angelini, i., bettineschi, c., deotto, d., faresin, e., menegazzi, a., molin, g., salemi, g., and zanovello, p. ( ).
archaeology and virtual acoustics: a pan flute from ancient egypt. in proceedings of the th international conference on sound and music computing conference (smc- ), maynooth, ireland, july , & august , . http://www.dei.unipd.it/~avanzini/downloads/paper/avanzini_smc .pdf
avanzini, f., canazza, s., de poli, g., fantozzi, c., micheloni, e., pretto, n., rodà, a., gasparotto, s., and salemi, g. ( ). virtual reconstruction of an ancient greek pan flute. in proceedings of the th international conference on sound and music computing conference (smc- ), hamburg, germany, august to september , . http://smcnetwork.org/system/files/smc _submission_ .pdf
bellia, a. ( ). gli strumenti musicali nei reperti del museo archeologico regionale “antonino salinas” di palermo. roma: aracne.
bellia, a. ( ). il canto delle vergini locresi. pisa-roma: fabrizio serra editore.
bellia, a. ( ). relation of music to cultural identity in the colonies of west greece: the case of selinus. the journal of musicology, ( ): – .
bellia, a. ( ). su uno strumento musicale ri-trovato nel museo archeologico regionale “antonino salinas” di palermo. il frammento di aulos dal santuario della malophoros. sicilia antiqua, : – .
blinkenberg, c. s. ( ). lindos, fouilles de l’acropole ( - ). les petits objets. berlin: de gruyter.
boardman, j. ( ). excavations in chios, – . greek emporio. athens: the british school of archaeology at athens.
dawkins, r. m. ( ). the sanctuary of artemis orthia at sparta. london: council of the society for the promotion of hellenic studies.
dunbabin, t. j. ( ). perachora. the sanctuaries of hera akraia and limenia. oxford: clarendon press.
furtwängler, a. ( ). aegina. das heiligtum der aphaia. münchen: akademie der wissenschaften.
greco, g. ( ). da hera argiva a hera pestana. in i culti della campania antica. roma: g. bretschneider, pp. – .
greco, g. ( ). santuari extraurbani tra periferia cittadina e periferia indigena.
in la colonisation grecque en méditerranée occidentale. roma: école française de rome, pp. – .
hogarth, d. g. ( ). excavations at ephesus. the archaic artemisia. london: british museum.
landels, j. g. ( ). the brauron aulos. annual of the british school at athens, : – .
landels, j. g. ( ). music in ancient greece and rome. london; new york, ny: routledge.
marconi, c. ( ). nuovi dati sui culti del settore meridionale del grande santuario urbano a selinus. sicilia antiqua, : – .
marconi, c. ( ). a new bone flute from selinus: music and spectacle in the main urban sanctuary of a greek colony in the west. in musica, culti e riti dei greci d’occidente. roma-pisa: istituti editoriali poligrafici internazionali, pp. – .
marconi, c. and scahill, d. ( ). the “south building” in the main urban sanctuary of selinus: a theatral structure? in the architecture of the ancient greek theatre. aarhus: aarhus university press, pp. – .
micheloni, e., pretto, n., avanzini, f., canazza, s., and rodà, a. ( ).
installazioni interattive per la valorizzazione di strumenti musicali antichi: il flauto di pan del museo di scienze archeologiche e d’arte dell’università degli studi di padova. in proceedings of the xxi colloquium of musical informatics, cagliari, september to october . http://www.dei.unipd.it/~prettoni/paper/micheloniprettoetal_flauto_ _cim_xxi_atti.pdf
moustaka, a. ( ). aulos und auletik im archaischen ionien. zu einem aulos aus dem heraion von samos. in festschrift für jörg schäfer zum . geburtstag am . april . würzburg: ergon, pp. – .
psaroudakēs, s. ( ). the aulos of argithea. orient-archäologie, : – .
psaroudakēs, s. ( ). the auloi of pydna. orient-archäologie, : – .
psaroudakēs, s. ( ). the dafnē aulos. greek and roman musical studies, : – .
psaroudakēs, s. ( ). the aulos of poseidōnia. in musica, culti e riti nell’occidente greco, ed. by angela bellia, pisa-roma: istituti editoriali e poligrafici internazionali.
west, m. l. ( ). ancient greek music. oxford: clarendon press.
zoran, a. ( ). the 3d printed flute: digital fabrication and design of musical instruments. journal of new music research, ( ): – .
notes
http://cordis.europa.eu/result/rcn/ _en.html
uc irvine
western journal of emergency medicine: integrating emergency care with population health
title: sepsis alerts in emergency departments: a systematic review of accuracy and quality measure impact
permalink: https://escholarship.org/uc/item/ wv c fk
journal: western journal of emergency medicine: integrating emergency care with population health, ( )
issn: - x
authors: hwang, matthew i.; bond, william f.; powell, emilie s.
publication date:
doi: . /westjem. . .
supplemental material: https://escholarship.org/uc/item/ wv c fk#supplemental
license: https://creativecommons.org/licenses/by/ . / .
peer reviewed
western journal of emergency medicine, volume , no. : september
review article
sepsis alerts in emergency departments: a systematic review of accuracy and quality measure impact
matthew i. hwang, bs*
william f. bond md, ms†
emilie s. powell md, mba, ms‡
section editor: gabriel wardi, md
submission history: submitted november , ; revision received may , ; accepted may ,
electronically published august ,
full text available through open access at http://escholarship.org/uc/uciem_westjem
doi: . /westjem. . .
* university of illinois college of medicine at peoria, peoria, illinois
† university of illinois college of medicine at peoria, osf healthcare, jump simulation and department of emergency medicine, peoria, illinois
‡ northwestern university feinberg school of medicine, northwestern memorial hospital, department of emergency medicine, chicago, illinois
introduction: for early detection of sepsis, automated systems within the electronic health record have evolved to alert emergency department (ed) personnel to the possibility of sepsis, and in some cases link them to suggested care pathways. we conducted a systematic review of automated sepsis-alert detection systems in the ed.
methods: we searched multiple health literature databases from the earliest available dates to august . articles were screened based on abstract, again via manuscript, and further narrowed with set inclusion criteria: 1) adult patients in the ed diagnosed with sepsis, severe sepsis, or septic shock; 2) an electronic system that alerts a healthcare provider of sepsis in real or near-real time; and 3) measures of diagnostic accuracy or quality of sepsis alerts. the final, detailed review was guided by quadas- and grade criteria. we tracked all articles using an online tool (covidence), and the review was registered with prospero registry of reviews. a two-author consensus was reached at the article choice stage and final review stage. due to the variation in alert criteria and methods of sepsis diagnosis confirmation, the data were not combined for meta-analysis.
results: we screened articles by title and abstract and by full text; we then selected for the study. the articles were published between – . two studies had algorithm-based alert systems, while eight had rule-based alert systems. all systems used different criteria based on systemic inflammatory response syndrome (sirs) to define sepsis. sensitivities ranged from - %, specificities from - %, and positive predictive value from . - %.
negative predictive value was consistently high at - %. studies showed some evidence for improved process-of-care markers, including improved time to antibiotics. length of stay improved in two studies. one low-quality study showed improved mortality.

conclusion: the limited evidence available suggests that sepsis alerts in the ed setting can be set to high sensitivity. no high-quality studies showed a difference in mortality, but evidence exists for improvements in process of care. significant further work is needed to understand the consequences of alert fatigue and sensitivity set points. [west j emerg med. ; ( ) - .]

population health research capsule
what do we already know about this issue? the use of automated clinical alerts is increasing, and complex algorithmic models are now being implemented.
what was the research question? how do sepsis alert systems in the emergency department perform based on accuracy and quality measures?
what was the major finding of the study? process measures moderately improved. one low-quality study showed mortality benefit, while no high-quality studies did.
how does this improve population health? further research of alert system elements is needed. our goal is to guide the development of sepsis alerts to improve outcome measures.

introduction
sepsis is defined as life-threatening organ dysfunction due to a dysregulated inflammatory response to infection. it is implicated in an estimated . million hospitalizations each year and is among the most costly conditions for hospitals. delays in diagnosis of sepsis can lead to delay in treatment, which can lead to increased morbidity and mortality. quality measures now track time to these treatments as process markers of successful care. while studies have questioned some of the interventions, such as protocol-driven fluid resuscitation, there is general agreement that early antibiotic administration reduces mortality from sepsis.

risk for delays in diagnosis led to the development of automatic electronic sepsis alerts built into electronic health record (ehr) systems. some of these systems were created for use in the inpatient ward, intensive care unit (icu), and emergency department (ed), and some stretch across settings within a healthcare system. one study demonstrated that over % of sepsis hospitalizations presented in the ed, warranting a focused study of this population.

the challenge of demonstrating the marginal impact of these systems is that they act alongside existing sepsis care processes in a very ill population whose incremental change in mortality may be difficult to detect. in addition, thanks to education campaigns for staff, the drive toward improvement in quality measures, and increasing board certification of emergency providers, ed personnel have become better trained and are likely better at detecting sepsis. thus, in the highly visually and electronically monitored ed setting, the benefit of these systems over clinician gestalt may diminish over time. the possibility still exists that automated sepsis alerts may be an important method to detect more subtle cases or earlier presentations and may have greater value in less monitored settings.

the value of these alert systems is measured based on their detection accuracy, with a goal of high sensitivity and, more importantly, their impact on process or outcome measures. however, alert systems carry a risk of alarm fatigue and distraction. sepsis alerts add to the already increasing number of alarms within the ehr, including those for physiology monitors, pharmacy checking, and infectious disease isolation.
the positive impact of these automated sepsis alerts and their alarm methods on sepsis care, specific to the ed, remains an open question, and drove the desire for this systematic review.

alert systems vary in their criteria. early systems were often rule-based using the centers for medicare and medicaid services (cms) sepsis- definition of sepsis: two of four systemic inflammatory response syndrome (sirs) criteria with a suspected or identified infection source. sirs is defined as at least two of the following four findings: temperature > ° celsius (c) ( . ° fahrenheit [f]) or < °c ( . °f); heart rate > beats/minute; respiratory rate > breaths/minute; or white blood count > , per microliter (µl) or < /µl or % band forms. cms with sepsis- set elevated temperature at > . °c ( . °f). more advanced systems are using algorithms, which expand on the limited criteria of rule-based systems. such criteria may include past medical history and lab values or vitals with near-real time updating.

evaluation of the success of these systems is complicated by difficulty establishing consensus and evolving definitions for the sepsis spectrum, including the update to sepsis- . thus, the diagnostic criteria are both evolving and in most cases based on discharge diagnosis, rather than information available in the ed. the ability to accurately diagnose and treat a specific disease may be measured by studying discharge diagnosis, but it may not account for clinician decisions made with limited information, as is often encountered in ed settings. discharge diagnosis as a standard does not account for a clinician’s ability to risk stratify and exclude life-threatening conditions, which is valuable for stabilizing patients and completing the diagnostic workup. although using chief complaint for quality evaluation or diagnostic criteria has been proposed, it has yet to be standardized.
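the rule-based definition described above (at least two of four sirs criteria plus a suspected or identified infection source) can be sketched in a few lines of python. this is an illustrative sketch only, not the logic of any system in this review. it assumes the commonly cited sirs cutoffs (temperature > 38 °c or < 36 °c; heart rate > 90 beats/minute; respiratory rate > 20 breaths/minute; white blood count > 12,000/µl or < 4,000/µl or > 10% band forms). the `Vitals` record, its field names, and the functions `sirs_count` and `sirs_alert` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vitals:
    """Hypothetical snapshot of ED measurements; None means not yet measured."""
    temp_c: Optional[float] = None
    heart_rate: Optional[float] = None
    resp_rate: Optional[float] = None
    wbc_per_ul: Optional[float] = None
    band_pct: Optional[float] = None


def sirs_count(v: Vitals) -> int:
    """Count how many of the four SIRS criteria are met.

    Missing values are treated as 'criterion not met' (an assumption).
    """
    temp = v.temp_c is not None and (v.temp_c > 38.0 or v.temp_c < 36.0)
    hr = v.heart_rate is not None and v.heart_rate > 90
    rr = v.resp_rate is not None and v.resp_rate > 20
    wbc = (
        v.wbc_per_ul is not None
        and (v.wbc_per_ul > 12_000 or v.wbc_per_ul < 4_000)
    ) or (v.band_pct is not None and v.band_pct > 10)
    return sum([temp, hr, rr, wbc])


def sirs_alert(v: Vitals, suspected_infection: bool, threshold: int = 2) -> bool:
    """Fire when enough SIRS criteria are met AND infection is suspected."""
    return suspected_infection and sirs_count(v) >= threshold


if __name__ == "__main__":
    patient = Vitals(temp_c=38.6, heart_rate=104, resp_rate=18, wbc_per_ul=9_000)
    print(sirs_count(patient), sirs_alert(patient, suspected_infection=True))
```

lowering `threshold` makes the rule fire more often, which is the sensitivity set-point trade-off this review returns to in its discussion.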
due to evolving systems and definitions, we systematically reviewed studies assessing the effectiveness of these alerts. our objectives were to determine whether automated electronic sepsis alerts in the ed are accurate and whether they have an impact on quality measures and/or mortality.

methods
this review followed guidelines presented by the preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies (prisma-dta) and prisma-p. this review was registered with prospero (prospective register of systematic reviews).

search strategy
databases for the search included pubmed medline, embase, the cochrane library, and the cumulative index of nursing and allied health literature (cinahl), from the earliest available dates to august , . we defined the search according to four fields: emergency department; sepsis; electronic health record; and alerts/alarms. details of the search strategy are described in appendix a.

eligibility criteria
randomized trials, performance improvement trials (including before and after studies), and cohort studies were included in the screening. eligible studies included published articles with the following: 1) adult patients in the ed, diagnosed with sepsis, severe sepsis, or septic shock (hereafter referred to as sepsis); 2) an electronic system that alerts a healthcare provider of sepsis in real or near-real time; and 3) measures of diagnostic accuracy or impact on quality of care measures.
exclusion criteria included the following: 1) primary data based on non-ed settings, such as prehospital, icu, or the general wards; 2) articles studying medical conditions that can present with sepsis, such as specific infections (eg, influenza), pregnancy-related issues, and bacteremia, without assessing sepsis independently; 3) alert systems that screen only at triage, as opposed to reaching an alert trigger threshold at any point in the ed visit; and 4) non-english language articles lacking translation. we ensured chosen articles came from peer-reviewed sources based on the presence of a peer-review process description on the journal homepage.

study records
we collected citations in the reference manager zotero (corporation of digital scholarship, george mason university, fairfax, va). article screening was completed through the online covidence systematic review software (veritas health innovation, melbourne, australia). two independent reviewers (authors wb and mh) selected articles based on the inclusion and exclusion criteria in the title and abstract screenings. at the next stage, two independent reviewers (authors wb and ep) selected articles in the full-text screening. conflicts were resolved through regular meetings or conference calls. data were collected by wb and mh, and then extracted with covidence to be stored as a secure microsoft excel file (microsoft corporation, redmond, wa).

data items
qualitative data items for extraction included clinical setting, study design, age group, type of alert system, definition/threshold for the alert, method of alert notification, treatment recommendation, and reference standard. the implemented alert system was considered the index test. we classified the alert systems as rule based or algorithm based. among the eligible studies, the rule-based alerts used sirs criteria. the algorithmic alerts had unique measures such as vitals, glasgow coma scale, and creatinine.
variations for either system are described in table . quantitative data items included sample size, population size, accuracy, and outcome measures.

outcomes and prioritization and diagnostic accuracy measures
we extracted data from articles on sepsis alerts for both diagnostic accuracy and impact on quality measures. diagnostic accuracy assesses the ability of the alert to accurately detect sepsis. measurements included positive and negative predictive values, sensitivity, and specificity. quality measures of interest were process and outcome measures. examples of process markers included compliance or time to antibiotic administration, fluid resuscitation, and lactate measurement. outcome measures included mortality and length of icu stay, although various additional markers were captured by different authors. when reported by the authors, we used confidence intervals for the given estimates.

data synthesis
a qualitative analysis of each study was performed. the variation in sepsis definitions for the alerts, set points, methods of alerting, response processes, etc, prevented an aggregated quantitative analysis.

bias and applicability
covidence included a bias rating system based on the cochrane standard of quality assessment. we added criteria from the quality assessment of diagnostic accuracy studies (quadas- ) to effectively assess diagnostic accuracy of the articles, per the recommendation of prisma-dta, leeflang, and cochrane. we rated quality measure articles following the guidance of grade (grades of recommendation assessment, development and evaluation). each article was rated for bias regarding blinding of participants and personnel to the alert, blinding of outcome assessors, incomplete outcome data, selective outcome reporting, the index test, gold standard, and flow and timing. once each component was finalized, a consensus overall quality rating was decided based on the risk of biases.
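the four diagnostic accuracy measures named in the methods above all derive from a two-by-two table of alert result versus reference standard. the sketch below shows one way to compute them. it is offered as an illustration, not as the procedure used by the review or by any included study. the `Counts` class, the `accuracy` function, and the example counts are made up. the true-negative count is optional because it cannot be known when a study does not report the denominator of all ed patients; in that case only sensitivity and positive predictive value can be computed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Counts:
    """Hypothetical 2x2 counts of alert result vs. reference standard."""
    tp: int  # alert fired, sepsis confirmed
    fn: int  # no alert, sepsis confirmed
    fp: int  # alert fired, sepsis not confirmed
    tn: Optional[int] = None  # no alert, no sepsis; None if not reported


def ratio(num: int, den: int) -> Optional[float]:
    """Return num/den, or None when the denominator is zero."""
    return num / den if den else None


def accuracy(c: Counts) -> Dict[str, Optional[float]]:
    """Compute the measures that the available counts support."""
    out: Dict[str, Optional[float]] = {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "specificity": None,
        "npv": None,
    }
    if c.tn is not None:
        # Specificity and NPV both need the true negatives.
        out["specificity"] = ratio(c.tn, c.tn + c.fp)
        out["npv"] = ratio(c.tn, c.tn + c.fn)
    return out


if __name__ == "__main__":
    # Made-up counts, not taken from any study in this review.
    print(accuracy(Counts(tp=90, fn=10, fp=30, tn=870)))
    print(accuracy(Counts(tp=90, fn=10, fp=30)))  # denominator unknown
```

because true negatives dominate when sepsis prevalence is low, negative predictive value tends to be high even for an alert with a modest positive predictive value.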
the overall quality was scaled relative to the cohort study design. no articles had strong experimental designs (ie, randomized controlled trials); therefore, quality was ranked based on comparison within this cohort of articles. details are recorded in appendix b.

results
study selection and characteristics
we imported articles into covidence. after duplicate removal, were screened by title and abstract. twenty articles underwent full-text assessment, and were selected for the study (figure). eight of these studies assessed diagnostic accuracy and six assessed quality measures. all studies were prospective or retrospective cohorts and were conducted in urban, tertiary and/or academic medical centers (table ). publishing years ranged from – . two studies had algorithm-based alert systems, while eight had rule sets. all systems used different criteria based on sirs to define sepsis. there was significant variability in the criteria used for activation of the sepsis alert, the threshold definitions that activated the alert, the presence or absence of triggering links to care order sets, and the degree and type of interventions triggered by the alert. likewise, there were variations in the diagnostic criteria standards against which the alerts were weighed, with most studies using chart review confirmation, while some used clinician confirmation. only nguyen et al had a control group of randomly selected patients during a study period when the alert did not fire. all of the other articles were either prospective or retrospective cohort designs without control groups.

figure. prisma flow diagram.

table . characteristics of included studies.
source (the article) | study design | demographic | type of alert system (the index test) | definition/threshold for the alert | method of alert notification | treatment recommended | reference standard (compared to the index test)
alsolamy | prospective cohort | > years old | rule based | ≥ sirs criteria and organ dysfunction, or organ dysfunction criteria^ | notification to nurse who pages the physician | no | clinical evaluation by an em or icu physician following surviving sepsis campaign guidelines
austrian | retrospective cohort | ≥ years old | rule based | 1st alert is sirs based. 2nd and 3rd alerts are sepsis based, which is the sirs alert plus systolic blood pressure < mmhg or lactate ≥ mg/dl | electronic notifications to the following, alert 1: nurse; alert 2: nurse; alert 3: provider | yes (to all alerts) | icd- coding for severe sepsis or septic shock on admission only
bansal | prospective cohort | adult patients (though not clearly specified) | rule based | 1st alert is sirs based. 2nd alert is a sepsis alert, which is the sirs alert plus wbc ≥ k or ≤ k blood cultures ordered or lactate > mg/dl alone | team leader paged | yes, a sepsis response team in the post alert group | physician reviewers using standardized sepsis criteria, approved by mayo clinic enterprise subspecialty councils for em and critical care
berger | prospective cohort | ≥ years old | rule based | > sirs criteria plus infection source | electronic notification to clinician | yes, lactate recommended | ≥ sirs criteria and clinical suspicion, retrospectively
brown | prospective cohort | ≥ years old | algorithm based | parameters including demographics, encounter details, lab tests, sirs criteria, and other clinical measurements | page and email to charge nurse | not specified | admitted from ed to icu and either 1) icd- discharge diagnosis relating to sepsis or infection or 2) identification by a qi coordinator in the icu
martin rico | prospective cohort | ≥ years old | algorithm based | series of parameters including lab tests, sirs criteria, vitals, and glasgow coma scale score | electronic notification to clinician | yes, with an e-order set | chart review with “clinical experts” with icd- cm discharge diagnosis of sepsis
meurer | prospective cohort | ≥ years old | rule based | ≥ sirs criteria | page to study coordinator who confirms a source of infection from the physician | no | chart reviewers ( ) confirmed or excluded the diagnosis
narayanan | retrospective cohort | ≥ years old | rule based | 1st alert is sirs based. 2nd alert is a sepsis alert, which is the sirs alert plus end organ dysfunction or fluid nonresponsive hypotension | electronic notification to clinician | no | chart review with icd- code diagnosis of severe sepsis and septic shock
nelson | prospective cohort | ≥ years old | rule based | ≥ sirs criteria and systolic blood pressure measurements less than mmhg | all caregivers notified with a page | yes | chart review with the same sirs and hypotension criteria
nguyen | retrospective cohort | all ed patients* | rule based | ≥ sirs criteria, and systolic blood pressure ≤ mmhg or lactic acid ≥ . mg/dl | not specified | not specified | patients for which the alert did not fire were randomly selected
^systolic blood pressure < to mmhg with intravenous fluids or < mm hg regardless of fluids, blood oxygen saturation < % to % with supplemental oxygen or < % without oxygen, or lactate > mmol/l.
*“while children have different ranges for sirs criteria, < % of emergency department (ed) patients were < years old…”
sirs, systemic inflammatory response syndrome; icu, intensive care unit; icd- , international classification of diseases, th ed; mmhg, millimeters of mercury; mg/dl, milligram per deciliter; mmol/l, millimole per liter; wbc, white blood count; k, thousand; em, emergency medicine; ed, emergency department; qi, quality improvement.

diagnostic accuracy
diagnostic accuracy was recorded in table below. specificity ranged from - %, and positive predictive value (ppv) from . % to %. negative predictive value (npv) was consistently high at - %. excluding meurer et al, sensitivity ranged from - %. meurer et al had a sensitivity of . % for the electronic alert alone, and . % for the electronic alert and attending confirmation. with attending confirmation, specificity increased from . % to . %. the study had a low activation threshold of ≥ sirs criteria, the smallest sample size of , and an age range of years or older.
patients were only included if they presented between am and pm on weekdays. this study also only included patients admitted from the ed, instead of all ed patients, risking selection bias. the notification system sent a page to the study coordinator before confirming with a physician. in contrast, other studies directly notified a member of the clinical team, excluding nguyen et al, which did not describe the notification method.

five rule-based studies were of high quality. two studies had systems with high accuracy. alsolamy et al had a sensitivity of . %, specificity . %, and ppv . %. bansal et al had a sensitivity of %, specificity . %, and ppv . %. highest ppv was achieved by nelson et al at . % and nguyen et al at . %. austrian et al shared the number of total alerts fired, for any of three criteria sets including sirs, nurse alert, and physician alert that included progressively more ill criteria. they report sensitivities of . %, . %, and . %, respectively, and ppv of . %, . %, and . % as expected for the more progressively stringent criteria. they did not share the denominator of all ed presenting patients for the retrospective period under study but report the total number of hospitalized sepsis patients based on discharge diagnosis. septic patients may have been sent home, but if we assume they captured all true positives and false negatives through final diagnosis of sepsis, this allows calculation of sensitivity and ppv but not of specificity or npv. with patients with a final diagnosis of severe sepsis or septic shock, and , alerts (any of the three levels included), they had the largest retrospective sample size.

two studies assessing algorithm-based alerts were deemed low quality. brown et al measured a sensitivity of . %, specificity of . %, and a low ppv of . %. martin rico et al measured a sensitivity of %, specificity of %, and a ppv of %. prevalence of sepsis compared to total ed patients was . - % in five studies. meurer et al had a prevalence of . %, but this was among patients ≥ years old and it was the sole study with only sirs criteria (a low threshold) for its sepsis definition.

table . diagnostic accuracy.
columns: source; sample size (n)*; population size (n)^; sensitivity ( % ci); specificity ( % ci); positive predictive value ( % ci); negative predictive value ( % ci); overall quality
alsolamy , . ( . - . ) . ( . - . ) . ( . - . ) . ( . - . ) high
austrian not specified . high
bansal ( . - ) . ( . - . ) . high
brown , . . . . low
martin rico , ( . - . ) ( . - . ) low
meurer alert alone: alert and attending confirmation: alert alone: . ( . - . ) alert and attending confirmation: . ( . - . ) alert alone: . ( . - . ) alert and attending confirmation: . ( . - . ) low
nelson sens. and spec.: ppv and npv: high
nguyen not specified . ( . - . ) high
*alerts for sepsis meeting the diagnostic criterion standard of the individual article.
^patients presenting to the emergency department (ed).
ci, confidence interval; ppv, positive predictive value; npv, negative predictive value; sens. and spec., sensitivity and specificity.

quality measures
quality measures are described in table . two studies evaluating quality measures were high quality: austrian et al and nelson et al. in austrian et al, process markers of time to first lactate and vasopressor use significantly improved. statistically insignificant findings included blood cultures drawn before antibiotic administration and whether a lactate was drawn. antibiotic timing was not reported. for nelson et al, process markers of blood culture collection and chest radiograph before admission improved. statistically insignificant findings included lactate collection and antibiotics given in the ed.
outcome measures of icu transfer, icu length of stay (los), and total los significantly improved for austrian et al. mortality did not improve significantly for either study.

four studies (bansal, berger, martin rico, narayanan) were judged to be of low quality. berger et al had significant improvement in lactate testing. narayanan et al improved antibiotics within minutes, time to antibiotics, and los. bansal et al had nearly % sensitivity and specificity with a modest ppv of . %, and had no significant improvements in survival rate. of note, we rated this article as high quality with regard to diagnostic accuracy but low quality with regard to outcome measures. in contrast, austrian et al had improvements in los and martin rico et al had improvements in mortality, while both exhibited moderate accuracy.

none of the rule-based studies showed statistically significant improvements in mortality. the only outcome reported by an algorithm-based study (martin rico et al) was mortality, which showed significant improvement, although the study was judged to be of low quality. the alert system narayanan et al studied did not recommend treatment as other systems did. for this rule-based system, “antibiotics in minutes,” mean time to antibiotics, and los significantly improved.

table . quality measures.
columns: source; sample size (n)*; population size (n)^; significant results (process/outcome marker: prior vs. after); insignificant results; overall quality
austrian before sepsis alert: after sepsis alert: # icu transfer: . % vs. . %, p< . icu length of stay (days): . ( . ) vs. . ( . ), p< . length of stay (days): . (sd . ) vs. . (sd . ), p< . time to first lactate (days): . ( . ) vs. . ( . ), p< . vasopressor used: . % vs. . %, p< . blood cultures drawn prior to antibiotics: . % vs . %, p= . in-hospital mortality: . % vs. %, p= . lactate drawn (excluding ≥ hrs after ed arrival): . % vs. . %, p= . high
bansal whole cohort: n= severe sepsis and septic shock: n= in-hospital survival rate with ssrt activation in full cohort: . ( % ci, . to . ) in-hospital survival rate with ssrt activation among severe sepsis/septic shock patients: . ( % ci, . to . ) low
berger before sepsis alert: lactate- , hyperlactatemia- , mortality- . after sepsis alert: lactate- , hyperlactatemia- , mortality- before alert: after alert: & lactate testing: . % vs. . % ( % ci, . to . %) absolute increase p< . change in frequency of hyperlactatemia if lactate was tested: . % vs. . % ( % ci, - . to . ) mortality: . % vs. . % ( % ci, - . to . %, p= . ) low
martin rico , mortality: . % vs. . % adjusted risk reduction for mortality: % ( . - . ) incidence rate ratio: . , p= . low
narayanan prior to sepsis alert: n= after sepsis alert: n= not specified antibiotics in minutes: . % vs. . %, p<. length of stay odds ratio: [ . ( . - . )] mean time to antibiotics (minutes): . ( - ) vs. ( - ), p<. mortality odds ratio: . ( . - . ) low
nelson blood culture collected odds ratio: [ . ( . - . )] chest radiograph before admission odds ratio: [ . ( . - . )] antibiotic given in ed odds ratio: [ . ( . - . )] lactate collected odds ratio: [ . ( . - . )] high
*alerts for sepsis meeting the diagnostic criterion standard of the individual article.
^patients presenting to the emergency department.
#all hospitalizations with severe sepsis or septic shock.
&all patients with sepsis. total ed presentations not specified.
icu, intensive care unit; vs, versus; sd, standard deviation; ed, emergency department; ci, confidence interval; ssrt, sepsis and shock response team.

discussion
overall, most of the study designs used to assess the impact of sepsis alerts were weak, and the review authors had difficulty isolating the impact of the automated sepsis alert itself from broader interventions such as response teams or order set bundles. thus, our review conclusions must be couched within the strength of the overall low-quality evidence. the limited evidence available suggests that sepsis alerts in the ed setting can be tuned to a high sensitivity for the detection of sepsis. evidence from both low- and high-quality studies showed some improved process-of-care markers, including time to antibiotics, with the use of automated sepsis alerts. lactate testing was studied by four groups with two producing significant results. other than lactate measurement, no single measure consistently improved across studies. a lack of consistency of measured items and measurement methods creates a challenge in forming a conclusion. for example, one study examined whether blood cultures were collected, as opposed to blood cultures collected before antibiotic administration. no high-quality studies showed a difference in mortality, and only one high-quality study showed impacts on icu los and vasopressor use.

our findings are in keeping with a review by makam in that covered alerts both inside and outside of the ed environment. our review added recently published articles, including those that now use algorithmic as opposed to simple rule-based approaches, and was focused on patients presenting to the ed. the strongest study designs we reviewed for inclusion were prospective cohort studies, but we would call attention to a well-executed performance improvement study conducted by gatewood et al. they included a computerized alert with a multipronged intervention and showed a substantial improvement in sepsis bundle of care compliance. however, they did not show differences in mortality in part due to the inclusion of lower risk patients on the sepsis spectrum.
sepsis alerts represent a difficult area to study with traditional randomized methods. one challenge is that in the course of operational improvement, sepsis alert criteria and/or alert thresholds may be subtly changed in the background. this may be done by information technology, analytics, or ehr personnel to address ppv or safety concerns, usually with a clinician’s input, but often without alerting all ed staff to the change. moving to a more rigorous study design requires holding the alert constant and ethical approval for a non-alert or clinician gestalt arm. thus, success will likely be found in future studies that use time series, or perhaps cluster randomized rollout methods across healthcare systems. likewise, future areas for study could include comparisons of the method of alert, and the presence or absence of treatment recommendations.

none of the studies addressed potential harms. harm may include the alarm issues impacting staff, missing alternative diagnoses due to early anchoring on sepsis, and the follow-on effects of early, aggressive fluid intervention, which has been questioned more broadly in the sepsis literature. significant further work is needed on the alarm consequences of the sensitivity set points, and if possible, such work should incorporate influences from other non-sepsis alarms in alarm fatigue.

although low quality, one algorithmic system showed significant mortality improvement, potentially validating its further development. systems such as this are being developed to improve accuracy and ppv, and may include risk factors such as comorbid conditions and past medical history. these systems can effectively insert multiple variables into a regression equation whose coefficients are derived from current and past patient data, running the calculation repeatedly over the course of a patient stay as more predictor variables become available.
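as a rough illustration of the regression-style approach described above, the following sketch recomputes a logistic risk score each time a new measurement arrives during a patient stay. everything in it is hypothetical: the predictor names, coefficient values, intercept, and alert threshold are invented for illustration and are not taken from any system in this review. predictors that have not yet been measured are simply left out of the sum, which is one crude convention among several and tends to understate early risk.

```python
import math
from typing import Dict, Iterable, List, Tuple

# Hypothetical coefficients (log-odds per unit); NOT from any reviewed system.
INTERCEPT = -6.0
COEFFS: Dict[str, float] = {
    "age_years": 0.01,
    "heart_rate": 0.03,
    "resp_rate": 0.05,
    "lactate_mmol_l": 0.6,
}
ALERT_THRESHOLD = 0.5  # hypothetical probability cutoff


def risk(observed: Dict[str, float]) -> float:
    """Logistic risk from the predictors observed so far.

    Unmeasured predictors contribute nothing to the linear term (assumption).
    """
    z = INTERCEPT + sum(
        COEFFS[name] * value for name, value in observed.items() if name in COEFFS
    )
    return 1.0 / (1.0 + math.exp(-z))


def stream(updates: Iterable[Tuple[str, float]]) -> List[Tuple[str, float, bool]]:
    """Re-run the score after each new measurement; flag threshold crossings."""
    observed: Dict[str, float] = {}
    history = []
    for name, value in updates:
        observed[name] = value
        p = risk(observed)
        history.append((name, p, p >= ALERT_THRESHOLD))
    return history


if __name__ == "__main__":
    stay = [("age_years", 70), ("heart_rate", 118),
            ("resp_rate", 26), ("lactate_mmol_l", 4.2)]
    for name, p, fired in stream(stay):
        print(f"after {name}: risk={p:.3f} alert={fired}")
```

in this made-up stay the score crosses the threshold only once the lactate result arrives, which illustrates why repeated recalculation matters as more predictor variables become available.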
the data used to derive the coefficients of such a regression-based equation would influence the predictor’s value. for example, a sepsis predictor tool based on the elderly would likely not be predictive for children.

the newest models of sepsis alerts include machine learning. complex algorithmic models may use well over variables, and a machine-learning program may be integrated into them. machine learning uses computer programming to identify patterns and significant predictors beyond the reasonable capabilities of humans. with continual analysis, it can fine-tune coefficients and thresholds of the algorithm. initial studies show promise, and additional research is required to assess its impact on clinical outcomes.

limitations
our limitations include a risk of publication bias because we did not search the gray literature or clinical trials for studies in progress. there are likely many hospital systems that have implemented sepsis alerts, collected data, and did not report it. our consensus group was small in number, but we followed a rigorous process using review rubrics guided by well-accepted grading criteria.

conclusion
automated sepsis alerts in the ed may be set to a high sensitivity. process measures show moderate benefit; however, no single measure has consistently improved, and high-quality studies have yet to demonstrate a mortality benefit. specific components of these systems, alarm fatigue, and sensitivity set points should be examined further. sepsis alerts demonstrate utility and future research is indicated to build a more ideal alert system.

address for correspondence: matthew hwang, bs, university of illinois college of medicine at peoria, jump simulation, n berkeley ave., peoria, il . email: mihwang @uic.edu.

conflicts of interest: by the westjem article submission agreement, all authors are required to disclose all affiliations, funding sources and financial or management relationships that could be perceived as potential sources of bias.
no author has professional or financial relationships with any companies that are relevant to this study. there are no conflicts of interest or sources of funding to declare.

copyright: © hwang et al. this is an open access article distributed in accordance with the terms of the creative commons attribution (cc by . ) license. see: http://creativecommons.org/licenses/by/ . /

references
1. singer m, deutschman cs, seymour cw, et al. the third international consensus definitions for sepsis and septic shock (sepsis- ). jama. ; ( ): - .
2. torio cm, andrews rm. national inpatient hospital costs: the most expensive conditions by payer, . . available at: https://www.hcup-us.ahrq.gov/reports/statbriefs/sb .pdf. accessed february , .
3. rhee c, dantes r, epstein l, et al. incidence and trends of sepsis in us hospitals using clinical vs claims data, - . jama. ; ( ): - .
4. candel fj, sá mb, belda s, et al. current aspects in sepsis approach. turning things around. rev esp quimioter. ; ( ): - .
5. kassyap ck, abraham sv, krishnan sv, et al. factors affecting early treatment goals of sepsis patients presenting to the emergency department. indian j crit care med. ; ( ): - .
6. andersson m, Östholm-balkhed Å, fredrikson m, et al. delay of appropriate antibiotic treatment is associated with high mortality in patients with community-onset sepsis in a swedish setting. eur j clin microbiol infect dis. ; ( ): - .
7. centers for medicare and medicaid services. severe sepsis and septic shock: management bundle. centers for medicare and medicaid services (cms). . available at: https://cmit.cms.gov/cmit_public/viewmeasure?measureid= . accessed april , .
8. the arise investigators, anzics clinical trials group, peake sl, et al. goal-directed resuscitation for patients with early septic shock. n engl j med. ; ( ): - .
9. ballester l, martínez r, méndez j, et al. differences in hypotensive vs. non-hypotensive sepsis management in the emergency department: door-to-antibiotic time impact on sepsis survival. med sci. ; ( ): .
10. rhodes a, evans le, alhazzani w, et al. surviving sepsis campaign: international guidelines for management of sepsis and septic shock: . intensive care med. ; ( ): - .
11. johnston anb, park j, doi sa, et al. effect of immediate administration of antibiotics in patients with sepsis in tertiary care: a systematic review and meta-analysis. clin ther. ; ( ): - .e .
12. tafelski s, nachtigall i, deja m, et al. computer-assisted decision support for changing practice in severe sepsis and septic shock. j int med res. ; ( ): - .
13. herasevich v, pieper ms, pulido j, et al. enrollment into a time sensitive clinical study in the critical care setting: results from computerized septic shock sniffer implementation. j am med inform assoc. ; ( ): - .
14. mcree l, thanavaro jl, moore k, et al. the impact of an electronic medical record surveillance program on outcomes for patients with sepsis. heart lung. ; ( ): - .
15. sawyer am, deal en, labelle aj, et al. implementation of a real-time computerized sepsis alert in non-intensive care unit patients. crit care med. ; ( ): - .
16. hooper mh, weavind l, wheeler ap, et al. randomized trial of automated, electronic monitoring to facilitate early detection of sepsis in the intensive care unit. crit care med. ; ( ): - .
17. brandt bn, gartner ab, moncure m, et al. identifying severe sepsis via electronic surveillance. am j med qual. ; ( ): - .
18. alsolamy s, al salamah m, al thagafi m, et al. diagnostic accuracy of a screening electronic alert tool for severe sepsis and septic shock in the emergency department. bmc med inform decis mak. ; : .
19. balamuth f, alpern er, abaddessa mk, et al. improving recognition of pediatric severe sepsis in the emergency department: contributions of a vital sign based electronic alert and bedside clinician identification. ann emerg med. ; ( ): - .e .
20. guirgis fw, jones l, esma r, et al. managing sepsis: electronic recognition, rapid response teams, and standardized care save lives. j crit care. ; : - .
21. westphal ga, pereira ab, fachin sm, et al. an electronic warning system helps reduce the time to diagnosis of sepsis. rev bras ter intensiva. ; ( ): - .
22. angel c, leisman de, schneider sm, et al. sepsis presenting in emergency department versus inpatient settings: divergences in prevalence, patient characteristics, initial resuscitation, outcomes, and costs. ann emerg med. ; ( , supplement):s .
23. ramsdell th, smith an, kerkhove e. compliance with updated sepsis bundles to meet new sepsis core measure in a tertiary care hospital. hosp pharm. ; ( ): - .
24. reiter m, wen ls, allen bw. the emergency medicine workforce: profile and projections. j emerg med. ; ( ): - .
25. guardia-labar lm, scruth ea, edworthy j, et al. alarm fatigue: the human-system interface. clin nurse spec. ; ( ): - .
26. curry jp, jungquist cr. a critical assessment of monitoring practices, patient deterioration, and alarm fatigue on inpatient wards: a review. patient saf surg. ; ( ): .
27. sartelli m, kluger y, ansaloni l, et al. raising concerns about the sepsis- definitions. world j emerg surg. ; : .
28. kalantari a, mallemat h, weingart sd. sepsis definitions: the search for gold and what cms got wrong. west j emerg med. ; ( ): - .
29. griffey rt, pines jm, farley hl, et al. chief complaint-based performance measures: a new focus for acute care quality measurement. ann emerg med. ; ( ): - .
30. national quality forum. advancing chief complaint-based quality measurement. . available at: http://www.qualityforum.org/publications/ / /advancing_chief_complaint-based_quality_measurement_final_report.aspx. accessed april , .
31.
mcinnes mdf, moher d, thombs bd, et al. preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: the prisma-dta statement. jama. ; ( ): - . . moher d, shamseer l, clarke m, et al. preferred reporting items for systematic review and meta-analysis protocols (prisma-p) statement. syst rev. ; ( ): . . leeflang mmg, deeks jj, gatsonis c, et al. systematic reviews of diagnostic test accuracy. ann intern med. ; ( ): - . deeks jj, wisniewski s, davenport c. guide to the contents of a cochrane diagnostic test accuracy protocol. . available at: https://methods.cochrane.org/sites/methods.cochrane.org.sdt/files/public/uploads/ch _sep .pdf. accessed march , . . dijkers m. introducing grade: a systematic approach to rating evidence in systematic reviews and to guideline development. kt update. ; ( ): . . austrian js, jamin ct, doty gr, et al. impact of an emergency department electronic sepsis surveillance system on patient mortality and length of stay. j am med inform assoc. ; ( ): - . . bansal v, festić e, mangi ma, et al. early machine-human interface around sepsis severity identification: from diagnosis to improved management? acta med acad. ; ( ): - . . berger t, birnbaum a, bijur p, et al. a computerized alert screening for severe sepsis in emergency department patients increases lactate testing but does not improve inpatient mortality. appl clin inform. ; ( ): - . . brown sm, jones j, kuttler kg, et al. prospective evaluation of an automated method to identify patients with severe sepsis or septic shock in the emergency department. bmc emerg med. ; ( ): . . martin rico p, valdivia perez a, and lacalle martinez jm. electronic alerting and decision support for early sepsis detection and management: impact on clinical outcomes. eur j clin pharmacol. ; ( ): - . .
meurer wj, smith bl, losman ed, et al. real-time identification of serious infection in geriatric patients using clinical information system surveillance. j am geriatr soc. ; ( ): - . . narayanan n, gross ak, pintens m, et al. effect of an electronic medical record alert for severe sepsis among ed patients. am j emerg med. ; ( ): - . . nelson jl, smith bl, jared jd, et al. prospective trial of real-time electronic surveillance to expedite early care of severe sepsis. ann emerg med. ; ( ): - . . nguyen sq, mwakalindile e, booth js, et al. automated electronic medical record sepsis detection in the emergency department. peerj. ; :e . . makam an, nguyen ok, auerbach ad. diagnostic accuracy and effectiveness of automated electronic sepsis alert systems: a systematic review. j hosp med. ; ( ): - . . gatewood mo, wemple m, greco s, et al. a quality improvement project to improve early sepsis care in the emergency department. bmj qual saf. ; ( ): - . . delahanty rj, alvarez j, flynn lm, et al. development and evaluation of a machine learning model for the early identification of patients at risk for sepsis. ann emerg med. ; ( ): - . . shimabukuro dw, barton cw, feldman md, et al. effect of a machine learning-based severe sepsis prediction algorithm on patient survival and hospital length of stay: a randomised clinical trial. bmj open respir res. ; ( ):e . . nemati s, holder a, razmi f, et al. an interpretable machine learning model for accurate prediction of sepsis in the icu. crit care med. ; ( ): - . all along the watchtower: intersectional diversity as a core intellectual value in the digital humanities daniel paul o’donnell department of english university of lethbridge this problem is significant because it indicates the failure of the traditional model for scholarship adequately to describe serious intellectual work in humanities computing, whose scope cannot be delimited in the same way and to the same extent as the traditional kind…. 
a new definition of scholarship, demanding new abilities, would seem to follow.

willard mccarty, humanities computing (basingstoke [england]; new york: palgrave macmillan, ), .

preprint from intersectionality in digital humanities, ed roopika risam and barbara bordalejo (forthcoming ). doi (all versions): . /zenodo. ; (this version): . /zenodo.

the bonfire of the (digital) humanities

the digital humanities (dh) came close to imploding as an organised discipline in the - academic year. the origins of the dispute lay in the deliberations of the programme committee for digital humanities, the annual, usually very competitive, international conference organised by the alliance of digital humanities organisations (adho) and held in in krakow, poland. what criteria, this committee asked itself, should we use for accepting or rejecting submissions? should we privilege “quality”—presumably as this is measured by success in the conference’s traditionally highly structured and quite thorough peer review process? or should we privilege “diversity”—defined largely in terms of ensuring that speakers from as wide a possible range of demographics are given slots at a conference (and in a discipline) that has been accused of skewing heavily towards the white, male, northern, and anglophone? or, as one member of the committee put it with forceful clarity in an email:

there's a solid consensus that the conference is there in order to hear from diverse groups, but whenever one opts for diversity, it usually means opting for less quality (otherwise there would be no issue), so the danger is that one loses sight of this, very central goal of the conference.

email is an informal medium, and it would be unfair to take the position expressed here and later circulated by others on social media as having been considered in the same way as this chapter or the other formal presentations that have referred to this email since the controversy first arose.
as stephen ramsay has noted of his own apparently unintentionally provocative comments on the belief that coding is the core activity within the digital humanities, moreover, “all quotes are by nature taken out of context.” in this particular case, it is important to remember, the passage in question comes from the middle of an internal debate (most of which has not been published or released on social media) in which members of a conference organising committee struggled to determine the best method of fairly distributing access to a major conference with a high rejection rate. at the same time, however, the “diversity debate” exemplified (and in part provoked) by this email was real and involved the numerous regional, national, linguistic, and other organisations that make up adho and run the field’s major journals, conferences, and societies. the debate led to the resignation of one of adho’s officers and it resulted in inter-society debates about cultural norms surrounding issues of “diversity” and “quality” that are still ongoing. this resignation and these debates led to a brief threat from one of the societies to break away from the larger consortium, taking its journal and participation in the international conference with it. the debate provoked in part by this email, in other words, was serious enough to threaten some of the most prestigious and central organs and activities that characterise global digital humanities and undo what can be considered one of the most characteristic features of international digital humanities as it is currently constituted: its strong and highly centralised international organisational collaboration and cooperation.

[adho conference coordinating committee email listserv], “re: dh and diversity,” september , .

stephen ramsay, “on building,” accessed june , , http://stephenramsay.us/text/ / / /on-building/.
moreover, while people seem wary of putting it in writing, the sentiment that there is an opposition between “quality” on the one hand and “diversity” on the other remains relatively common within some parts of institutional digital humanities (as well as other industries). it also aligns to a certain extent with longer-standing positions and regional trends in how the field as a whole is understood: between “those who build digital tools and media and those who study traditional humanities questions using digital tools and media,” as mark sample puts it: “do vs. think, practice vs. theory, or hack vs. yack.”

see cleve r. wootson jr, “a google engineer wrote that women may be unsuited for tech jobs. women wrote back,” the washington post, august , , https://www.washingtonpost.com/news/the-switch/wp/ / / /a-google-engineer-wrote-that-women-may-be-genetically-unsuited-for-tech-jobs-women-wrote-back/.

mark sample “the digital humanities is not about building, it’s about sharing,” samplereality, may , , http://www.samplereality.com/ / / /the-digital-humanities-is-not-about-building-its-about-sharing/.

i am a member of a national digital humanities society executive and a former chair of the special interest group (sig) global outlook::digital humanities (go::dh), an organisation that played a pivotal role in the recent “global turn” within dh. i am also a middle-aged, white anglophone man who enjoys the security of a tenured north american professorship. and i have been, at various times, a member of the adho executive and of adho conference organising committees, and president of one of the national societies that collectively govern the organisation. in these contexts, i have heard both dismissive complaints about “diversity” as a way of promoting the less qualified and honest struggles with the question of how a desire to promote as wide participation as possible within dh might conflict with definitions of various forms of “quality” within the field.
as is true of many significant disciplinary debates within dh, however, much of this discussion has taken place out of public view—on closed email lists used by the adho executive or in closed meetings of its various committees: as shelaigh brantford pointed out in an unpublished paper, a person unfamiliar with the details of the internal debate provoked by this email and resignation would not be able to build an accurate sense of the issues at stake (or just how serious the crisis had become) from the organisation’s own public pronouncements.

see daniel paul o’donnell and shelaigh brantford, “the tip of the iceberg: transparency and diversity in contemporary dh,” csdh-schn (congress ), calgary, june , . for a summary, see geoffrey rockwell, “csdh-cgsa ,” philosophi.ca, august , , http://philosophi.ca/pmwiki.php/main/csdh-cgsa .

examples of public statements showing this oblique approach include alliance of digital humanities organisations, “adho announces new steering committee chair,” adho, november , , http://adho.org/announcements/ /adho-announces-new-steering-committee-chair; karina van dalen-oskam, “report of the steering committee chair (november – july ),” adho, july , , http://adho.org/announcements/ /report-steering-committee-chair-november- -%e % % -july- . it is important to remember that the purpose of such statements is administrative and political rather than academic and that an approach that makes things difficult for the researcher may represent good management practice.

in this paper i would like to tackle the question of “diversity” and “quality” within the digital humanities head on. that is to say, i would like to consider the question raised in the email thread from the digital humanities organising committee directly and seriously. is there an inherent conflict between these two concepts within the digital humanities? is it the case that “whenever one opts for diversity, it usually means opting for less quality”?
and is the promotion of “quality,” to the extent that it can be kept distinct from “diversity,” actually a “very central goal of the [digital humanities] conference,” or any other venue for disseminating our research? to anticipate my argument, i am going to suggest that the answer to each of these questions is “no.” that is to say, first, that there is no inherent conflict between “diversity” and “quality” in the digital humanities; second, that emphasising “diversity” does not threaten the “quality” of our conferences and journals; and, finally, that “quality”—when taken by itself, without attention to questions of “diversity”—is in fact not the central goal of the dh conference, or any other digital humanities dissemination channel. indeed, to the extent they can be distinguished at all (and to a great degree, in fact, i argue they are the same thing), “diversity”—in the sense of access to as wide a possible range of experiences, contexts, and purposes in the computational context of the study of problems in the humanities or application of computation to such problems, particularly as this is represented by the lived experiences of different demographic groups—is in fact more important than “quality,” especially if “quality” is determined using methods that encourage the reinscription of already dominant forms of research and experience. full of sound and fury…? as intense as it was, the “quality vs. diversity” debate revolved around what can only be described as a very odd premise for a discipline that is commonly described as a “methodological commons” or “border land.” at the most literal level, the debate suggests that the two qualities in question (i.e. “diversity” and “quality”) have a zero-sum relationship to each other: the more “diversity” there is of participation on a panel or at a conference, the fewer examples (presumably) of “quality” work you are likely to find. 
that this is inherently problematic can be tested simply by reversing the terms: if diversity of participation is thought to lead to lower “quality,” then, presumably, greater “quality” comes from increasing the homogeneity of participation. in certain circumstances and to certain degrees, of course, this can be true: a conference that is focussed on a single discipline or subject, for example, is likely to be of higher “quality” (in the sense of creating opportunities to advance that discipline or topic) than a conference that sets no limits on the subject matter of the papers or qualifications of the participants. faculty and students at the university of lethbridge participate in several conferences each year where the principle of organisation is geographic (“academics living in alberta”) or educational status (“graduate students”) rather than discipline or topic. in such cases, the principal goal of the conference is less the advancement of research in a particular discipline (i.e. promoting the kind of “quality” that seemed to be at issue in the adho debate) than the advancement of researchers as a community. these conferences can attract a wide variety of approaches, subjects, and methods and, frankly, “quality” of contributions (in the sense of “likely to be of broad interest or impact to the field or discipline in question”). the benefit they offer lies in the practice they afford early-career academics and students in preparing papers or the cross-disciplinary networking opportunities they provide for scholars working in a particular geographic area.

mccarty, humanities computing, .

julie thompson klein, interdisciplining digital humanities: boundary work in an emerging field (ann arbor: university of michigan press, ).
but while it would be wrong to measure the success of such conferences by the impact they have on their field (since there is no single field), it is also undeniable that such conferences generally have lower “quality” when measured from a disciplinary perspective.

at the same time, however, absolute homogeneity is also obviously problematic. research, like many collaborative tasks, is an inherently dialectic process. it involves argument and counter-argument; debate over methods and results; agreement, disagreement, and partial agreement over significance and context. in many cases, this dialectic takes place within a broader context of theoretical agreement (the so-called “normal science”); in others, it can involve sweeping changes to the framing theories or concepts (the infamous “paradigm shift”). advancement in research, in other words, requires there to be at least some difference among researchers in approach, goals, method, or context. for great advancement to occur—the kind that changes the field or opens up new avenues of exploration—it is necessary for at least some of the participating researchers to understand the problems the discipline is facing from very different perspectives from that of the rest of the field.

see thomas s. kuhn, the structure of scientific revolutions (chicago: university of chicago press. ).

see kuhn, the structure of scientific revolutions. while kuhn is discussing science, the same pattern can be found, mutatis mutandis, in the social sciences and humanities.

the relationship between lack of homogeneity and advancement of research is particularly strong in the case of the digital humanities. this is because the “field” is really a paradiscipline—that is to say “a set of approaches, skills, interests, and beliefs that gain meaning from their association with other kinds of work.” in contrast to many traditional humanities disciplines, the digital humanities traditionally has been much more about methodology than content: that is, it is less about something than it is about how one studies or researches something else. advancing the field in such cases requires developments either in the range of “something elses” to which these “hows” can be applied (i.e. the range of subjects studied); or in the “hows” themselves (i.e. the methods that can then be used across disciplines and problems). novelty in the digital humanities (and research is always about new ideas or concepts), in other words, requires either the application of existing techniques, models, or understandings to an ever widening range of humanities problems (testing the boundaries of our existing tools and approaches); or experiments in the development and application of new techniques, tools, theories, and approaches to new or old types of problems (expanding the range of the digital humanities methodological project).

daniel paul o’donnell, “‘there’s no next about it’: stanley fish, william pannapacker, and the digital humanities as paradiscipline,” dpod blog, june , , http://dpod.kakelbont.ca/ / / /theres-no-next-about-it-stanley-fish-william-pannapacker-and-the-digital-humanities-as-paradiscipline/.

in both cases, diversity of experience and situation are crucial pre-conditions for advancement. we improve our understanding of computers and the humanities by discovering new problems for old solutions and re-solving existing problems in new cultural, economic, social and computational contexts. without such diversity of experience and condition, the digital humanities ceases to be a paradiscipline and becomes instead simply a computationally heavy sub-discipline within some larger traditional field of research.

medieval studies: a counter case

this fundamental importance of diversity to the digital humanities can be seen when they are compared to a more traditionally content-focussed field such as medieval studies.
as a cross-disciplinary area study, medieval studies covers a wide range of topics, approaches, and subjects—from archaeology to philosophy to literature to geography—and involves a number of technical and methodological skills (e.g. paleography, linguistics, numismatics, etc.). the field is commonly organised along cultural and temporal lines, with often parallel (but largely unconnected) research going on into otherwise similar topics within different political, cultural, or linguistic contexts. a scholar of anglo-saxon kingship may have little to do with somebody studying the same topic with regard to continental european or middle eastern cultures during the same time frame—or even with those studying the same topic in earlier or later periods in the same geographic area. medieval vernacular literary studies, similarly, tend to focus on relatively narrowly delimited languages, movements, or periods. apart from some common broad theoretical concerns, a student of early italian vernacular literature might have very little to do with research on early french, spanish, or english literature of the same or different periods. even within a single time or culture, the multidisciplinary nature of the field means that it is quite common for research by one medievalist to be of only marginal immediate relevance or interest to another medievalist trained in a different discipline or tradition: art historians debate amongst themselves without necessarily seeking input from (or affecting the work of) philologists or archaeologists working on the same geographical or cultural area and time period. but while the range of medieval studies is huge, its definition is still primarily about content rather than methodology. that is to say, the goal of medieval studies ultimately is to know or understand more about the middle ages, not, primarily, to develop new research techniques through their application to the middle ages.
while differences between the different sub-disciplines within medieval studies are such that advanced research in one area can be difficult or impossible to follow by researchers trained in some other area, it remains the case that the overall goal of research across domains and approaches is to develop a comprehensive picture of the time or location under discussion: the history, archaeology, politics, language, literature, culture, and philosophical understandings of a particular place or time in the (european) middle ages. if a piece of research focuses on europe or the middle east (as a rule, research involving a similar time period in africa, asia, or the americas is not considered part of medieval studies) and if it involves or analyses content or events occurring from (roughly speaking) the fall of the roman empire through to the beginning of the renaissance, then that research is likely to be considered “medieval studies” and its practitioner a “medievalist”; if, on the other hand, a piece of research falls outside of these temporal and geographical boundaries, then it is not considered “medieval studies,” even if the techniques it uses are identical to those used within medieval studies or could be applied productively to material from the medieval period. content vs. method in historical disciplines one implication of this is that in medieval studies, comprehensiveness or completeness can be as important a scholarly goal as novelty of method, and the discovery and explication of additional examples of a concept or type of cultural object are as or more valuable than more generalisable methods or studies. if having a scholarly edition of one anglo-saxon poem is thought to be useful for the study of the period, for example, then having editions of two anglo- saxon poems—or, better still, all anglo-saxon poems—will be thought to be even more useful. a digital library of frankish coins, similarly, is the better the more it is complete. 
for a discussion of this with regard to medieval and classical studies see gabriel bodard and daniel paul o’donnell, “we are all together: on publishing a digital classicist issue of the digital medievalist journal,” digital medievalist ( ), https://doi.org/ . /dm. . just how important this focus on the accumulation of examples and detail is can be seen simply by examining medievalist conference programmes or publishers’ booklists. medievalist conferences, for example, place a premium on the specific. while broad generalised papers synthesising across domains are not unheard of (they are in fact characteristic of keynote addresses), by far the majority of contributions focus on quite specific topics: “the music of the beneventan rite i (a roundtable)” or, in a session on “flyting” (i.e. the exchange of insults in germanic poetry), papers on three or four specific texts: “the old high german st. galler spottverse,” “flyting in the hárbarðsljóð,” “selections from medieval flyting poetry,” and “hrothgar, wealhtheow, and the future of heorot [i.e. in the poem beowulf],” to take some examples from the international congress on medieval studies at western michigan university. indeed, it is significant in this regard that the dominant form of submission to a conference like the international congress on medieval studies is by externally organised panel (i.e. a collection of papers assembled and proposed by an external organiser) rather than through the submission of individual papers by individual scholars: given the level of detail involved in the majority of the papers (and the lack of generalising emphasis), this is the only way of ensuring a critical mass of background knowledge in speakers and audience. andrew j. m. 
irving, “the music of the beneventan rite i (a roundtable) [conference session],” international congress on medieval studies, kalamazoo, mi, may , , https://wmich.edu/sites/default/files/attachments/u / /medieval-congress-program- -for-web.pdf; doaa omran, “dead poet flyting karaoke [conference session],” international congress on medieval studies, kalamazoo, mi, may , , https://wmich.edu/sites/default/files/attachments/u / /medieval-congress- program- -for-web.pdf. this focus on specificity is the norm across the traditional humanities: the annual conference of the modern language association, for example, the largest in the humanities, fills its programme entirely by means of externally proposed sessions (nicky agate, personal communication). book series on topics in medieval studies, similarly, tend to justify their claims to the scholars’ attention through their comprehensiveness. thus the early english text society advertises for new subscriptions by pointing to its collection of   most of the works attributed to king alfred or aelfric, along with some of those by bishop wulfstan and much anonymous prose and verse from the pre-conquest period… all the surviving medieval drama, most of the middle english romances, much religious and secular prose and verse including the english works of john gower, thomas hoccleve, and most of caxton’s prints… a similar emphasis on comprehensiveness is found in the advertisement for early english books online: from the first book published in english through the age of spenser and shakespeare, this incomparable collection now contains more than , titles... libraries possessing this collection find they are able to fulfill the most exhaustive research requirements of graduate scholars - from their desktop - in many subject areas: including english literature, history, philosophy, linguistics, theology, music, fine arts, education, mathematics, and science. 
anne hudson, “the early english text society, present past and future,” the early english text society, accessed august , , http://users.ox.ac.uk/~eets/.

early english books online, “about eebo,” eebo, accessed august , , https://eebo.chadwyck.com/marketing/about.htm.

significantly, this interest in completeness is such that it can even trump methodological diversity: the goal of comprehensive collections of texts or artifacts, after all, is to provide researchers with a body of comparable research objects—that is to say, research objects established using (more-or-less) common techniques and expectations. this is both why it makes sense for scholars to regularly re-edit core texts in the field (the better to make them compatible with current scholarly trends and interests) and why it can make sense to explicitly require researchers to follow specific methodological approaches and techniques. thus the modern language association’s committee on scholarly editions codifies its views on best practice in textual editing in the form of a checklist against which new editions can be compared. this checklist and the associated guidelines include advice on the specific analytic chapters or sections that ought to be included in a “certified edition” as well as minimum standards of accuracy and preferred workflows. the early english text society, likewise, warns potential editors of its strong preference for editions that follow the models set by previous editions in the series, recommending against experimentation without prior consultation:

we rely considerably on the precedents set by authoritative earlier editions in our series as a means of ensuring some uniformity of practice among our volumes. clearly discretion must be used: departures from practice in earlier editions are likely to have been made for good, but particular, reasons, which do not necessarily suit others.
moreover, if they wish to make an argument from precedent, editors should follow eets mla committee on scholarly editions, “guidelines for editors of scholarly editions,” modern language association, june , , https://www.mla.org/resources/research/surveys-reports-and-other-documents/ publishing-and-scholarship/reports-from-the-mla-committee-on-scholarly-editions/guidelines-for-editors-of- scholarly-editions. editions, in preference to those of other publishers. once again, please consult the editorial secretary in cases of doubt. this emphasis on continuity, consistency, and clearly identified standards is not (necessarily) evidence of unthinking conservatism. textual criticism and editing as a method has gone through some remarkable developments in the last three decades, and while not all presses or series are prepared to accept some newer methods for representing texts and objects editorially, others, such as the modern language association, have worked diligently to ensure their guidelines work with different prevailing methodologies and approaches. what it does suggest, however, is a belief in the necessity of minimum common standards, in a minimal degree of common understanding about expectations and purpose, and that the purpose of method is to develop reliable content rather than, as both the mla and the early english text society emphasise, experiment for the sake of experiment—a sense of minimum “quality,” in other words, that is more important than “diversity” if “diversity” produces something methodologically or conceptually unexpected. 
given the choice between reliable content produced using a conservative, well-tested methodology, and content of unknown quality produced using novel, but less well-tested methodologies, in other words, these examples suggest that mainstream medievalists will tend to prefer the reliable success over the interesting “failure.” this bias against (methodological) diversity need not, in principle, lead to a bias against participation by “diverse” communities (in the sense of gender, belonging to a racialised community, economic class, or educational background), although medieval studies as a field has recently begun to recognise both its lack of diversity in this respect and the degree to which this homogeneity may leave it particularly vulnerable to co-option by explicitly racist political movements. but it does in current practice discourage it, in part because it interacts poorly with the lived experience of intersectionally diverse participants: it allows for participation by “anybody,” but is methodologically suspicious of those whose experience, training, interests, or economic situation results in work that does not easily continue the larger common project using clearly recognised methods and meeting previously recognised standards. as a new generation of medievalists tackle this problem using an explicitly intersectional theoretical approach, the field may gradually become more hospitable to a broader and more welcoming definition of diversity.

early english text society, “guidelines for editors,” early english text society, , accessed august , , http://users.ox.ac.uk/~eets/guidelines% for% editors% .pdf.

the early english text society, for example, promises to issue separate guidelines for “electronic editions… as and when the society decides to pursue this manner of publication in the future”; see early english text society, .

mla committee on scholarly editions, “mla statement on the scholarly edition in the digital age,” modern language association, may , https://www.mla.org/content/download/ / /rptcse .pdf.

a famous example in medieval english studies is the reception of the athlone press editions of piers plowman, i.e. george kane, piers plowman: the a version. will’s visions of piers plowman and do-well (london: university of london, ); george kane and e. talbot donaldson, piers plowman the b version (london; berkeley: athlone press, ); william langland, george russell, and george kane, piers plowman: the c version; will’s visions of piers plowman, do-well, do-better and do-best (london; berkeley: athlone; university of california press, ). these were generally criticised on the basis that their innovative editorial method, while interesting and perhaps theoretically sound, left the texts “unreliable” and incomparable to other editions of the poem. see among many others derek pearsall, “piers plowman: the b version, (volume ii of piers plowman: the three versions), by george kane, e. talbot donaldson,” medium aevum, ; john a. alford, “piers plowman: the b version. will's vision of piers plowman, do-well, do-better and do-best. george kane, e. talbot donaldson,” speculum, ; traugott lawler, “reviewed work: piers plowman: the b version. will's visions of piers plowman, do-well, do-better, and do-best. an edition in the form of trinity college cambridge ms. b. . , corrected and restored from the known evidence, with variant readings by george kane, e. talbot donaldson,” modern philology . lawler’s review is an interesting example as it praises the edition while mentioning these same caveats. robert adams, “the kane-donaldson edition of piers plowman: eclecticism’s ultima thule,” text ( ): – , contains a discussion of the reception.

see among others candace barrington, “beyond the anglophone inner circle of chaucer studies (candace barrington),” in the middle, september , , accessed january , , http://www.inthemedievalmiddle.com/ / /beyond-anglophone-inner-circle-of.html; wan-chuan kao, “#palefacesmatter? (wan-chuan kao),” in the middle, july , , accessed january , , http://www.inthemedievalmiddle.com/ / /palefacesmatter-wan-chuan-kao.html; dorothy kim, “a scholar describes being conditionally accepted in medieval studies (opinion) | inside higher ed,” inside higher ed, august , , accessed july , , https://www.insidehighered.com/views/ / / /scholar-describes-being-conditionally-accepted-medieval-studies-opinion; dorothy kim, “the unbearable whiteness of medieval studies,” in the middle, november , , accessed january , , http://www.inthemedievalmiddle.com/ / /the-unbearable-whiteness-of-medieval.html; and medieval institute, “featured lesson resource page: race, racism and the middle ages,” teams: teaching association for medieval studies, july , . accessed january , , https://teams-medieval.org/?page_id= .
the digital humanities as methodological science

the focus on content, comprehensiveness, and, in the more technical areas, methodological conservatism that i argue characterises the practice of a traditionally historically-focussed field like medieval studies contrasts very strongly with what we can easily see to be the case within the digital humanities. if medieval studies can be described as a discipline that marshals specific types of method and theory in order to apply them to the study of a specific temporally and geographically bound subject, the digital humanities can be described as a field that marshals studies of a variety of (often) temporally, geographically, and similarly bound subjects in order to develop different types of method and theory. as in medieval studies, the range of topics, approaches, and subjects covered by the digital humanities is extremely wide—indeed, in as much as the digital humanities does not focus on a specific temporal period or geographic location, it is far wider. and as in medieval studies, different streams of research in different areas of the digital humanities—while engaged, broadly speaking, in the same large project—commonly advance with a fair degree of independence: advances in 3d imaging, for example, may or may not be related to or have an impact on developments in text encoding, media theory, gaming, or human-computer interaction, to name only a few areas commonly considered to be part of the digital humanities. the difference, however, is that the project of the digital humanities, in contrast to that of an area study like medieval studies, is primarily about the methods and theories used rather than the content developed.
that is to say, the goal of the digital humanities as a discipline is not primarily to know more about any specific period, text, idea, object, culture, or any other form of content (though it does no harm if it helps further this knowledge), but, rather, to develop theories, contextual understandings, and methods that can be used in the context of computation to study such periods, texts, ideas, objects, cultures, etc. this is not to deny that research in the digital humanities can have an impact on our knowledge of such periods, texts, ideas, objects, and cultures. in fact much good digital humanities work does have that impact. rather it is to claim that this impact is not the primary interest of such research to other digital humanities researchers: a digital edition of an anglo- saxon poem can be at the same time a work of medieval studies (if it adds to our knowledge of the anglo-saxon period) and digital humanities (if it adds to our knowledge of how one can make digital editions or some other aspect of digital method or theory). in order to make such an edition a contribution to the digital humanities, however, it must do something new computationally, regardless of its value to anglo-saxon studies: the kind of methodological conservatism we have seen as being acceptable in medieval studies is simply fatal in a field like the digital humanities. where editing yet another anglo-saxon text improves our knowledge of anglo-saxon england, the simple application of well-known computational techniques to yet another cultural object of the same kind dealt with previously by others does nothing to advance the digital humanities as a paradiscipline. advancement in the digital humanities requires there to be something new, innovative, or generalisable about the work from a digital/methodological perspective. as is the case with medieval studies, this difference in emphasis is reflected in how digital humanities dissemination channels define themselves and operate. 
digital humanities book series, in contrast to the examples we have seen from medieval studies, tend to celebrate the methodological and disciplinary breadth of their catalogue, rather than the comprehensiveness of their collections. both “digital culture books,” a digital humanities imprint of the university of michigan press, and “topics in the digital humanities,” an imprint of the university of illinois press, for example, advertise their series in terms of the breadth of topics covered in their volumes, the methodological diversity and innovation they entail, and the diverse experiences of their authors: the goal of the digital humanities series will be to provide a forum for ground-breaking and benchmark work in digital humanities. this rapidly growing field lies at the intersections of computers and the disciplines of arts and humanities, library and information science, media and communications studies, and cultural studies. the purpose of the series is to feature rigorous research that advances understanding of the nature and implications of the changing relationship between humanities and digital technologies. books, monographs, and experimental formats that define current practices, emergent trends, and future directions are accepted. together, they will illuminate the varied disciplinary and professional forms, broad multidisciplinary scope, interdisciplinary dynamics, and transdisciplinary potential of the field. humanities computing is undergoing a redefinition of basic principles by a continuous influx of new, vibrant, and diverse communities or practitioners within and well beyond the halls of academe. these practitioners recognize the value computers add to their work, that the computer itself remains an instrument subject to continual university of michigan press, “digital humanities series,” digital culture books, accessed september , , http://www.digitalculture.org/books/book-series/digital-humanities-series/. 
innovation, and that competition within many disciplines requires scholars to become and remain current with what computers can do. topics in the digital humanities invites manuscripts that will advance and deepen knowledge and activity in this new and innovative field.

university of illinois press, “topics in the digital humanities,” university of illinois press, accessed september , , http://www.press.uillinois.edu/books/find_books.php?type=series&search=tdh.

conference sessions, too, tend to be far less specialised and homogeneous in terms of subject. where in the case of area or historical studies, conference papers tend to focus on very specific research questions and outcomes and submissions tend to be primarily through the externally organised panel, in the case of digital humanities conferences, papers tend both to be on a wider variety of topics in any single session (because the content is less important than the methodology) and to be organised by single-paper submission rather than externally organised panels. i have been on conference panels in both the digital humanities and medieval studies: where in the case of medieval studies conferences, committees commonly look favourably on papers that emphasise new detailed findings, digital humanities committees commonly ask the authors of papers that concentrate too much on the details of their “case” and not enough on its generalisability to reorganise their paper or consider presenting their findings as a short paper or poster.

the role of diversity

this brings us, finally, to the role of intersectional diversity in the advancement of the digital humanities. thus far in this paper, i have been emphasising the way in which the digital humanities acts as what mccarty and short have described as a methodological commons: an
intellectual space in which researchers active in different disciplines, in essence, compare notes and develop new approaches and ideas about the role, context, and use of the digital in relation to humanities questions. the great change in the last five years within the digital humanities, however, has been the recognition that this “commons” also involves lived experience within the digital realm. that is to say, that diversity of personal, gendered, regional, linguistic, racialised, and economic experience and context is as important to developing our understanding of method and theory in the digital humanities as is diversity of subject or focus. what this means is that it is as important to promote diversity of experience in the digital humanities as it is diversity of methodology or topic. the experiences of researchers working with relatively poor infrastructure in mid and especially low income communities, for example, are as important to the progress of the digital humanities as a discipline as those working with cutting edge infrastructure in the most advanced technological contexts. the problem of doing good humanities work with “minimal” computing infrastructure is at least as challenging (and interesting) for the digital humanities as the problem of adapting the latest tools from silicon valley in a high bandwidth environment—and it remains so, even if the research in high bandwidth infrastructures produces “better” content for the domain specialist (e.g. colour or hd imagery vs black and white, for example, or larger collections taking advantage of the latest interfaces and technologies). 
the experiences of those working in rigid or very traditional research environments that discourage novel work with computation in traditional humanities fields, likewise, bring interesting cultural and methodological challenges which enrich the understanding of researchers working in environments in which the digital humanities is “the next big thing.” because it also involves the application of computation to the humanities or the understanding of the humanities in an age of (mostly) ubiquitous networked computing, the research of underfunded researchers, those at non-research-intensive institutions, those without permanent faculty positions, and those just beginning their careers as students, likewise, is at least as important to our understanding of the digital humanities as that of tenured researchers working with the best funding in the most elite institutions. the digital humanities, in other words, is about the intersection of the humanities and the world of networked computation; it is not (solely) about the intersection of the humanities and the world of the fastest, most expensive, and best supported examples of networked computation. because it is part of the contemporary humanities, the experiences of the marginalised in their use of computation or their understanding of and access to different computation contexts are at least as important to a full understanding of the digital humanities as are the experiences of those at the centre of our best-funded and most technologically advanced research and cultural institutions. diversity and quality there is in theory of course no reason why encouraging the contributions of the marginalised alongside those of the non-marginalised (i.e. encouraging “diversity”) should result in lower “quality,” as measured by things like “impact,” citation rates, or peer review scores. 
researchers working with poor infrastructure can do as “careful” work as those working with excellent infrastructure and, as dombrowski and ramsay have pointed out, excellent infrastructure and funding do not preclude large scale failure.

william pannapacker, “no dh, no interview,” the chronicle of higher education, july , , http://chronicle.com/article/no-dh-no-interview/ /; william pannapacker, “the mla and the digital humanities,” brainstorm, accessed june , , http://chronicle.com/blogpost/the-mlathe-digital/ /.

the problem, however, is that measures of “quality” in the academy are, as a rule, self-inscribing. that is to say, the mechanisms by which “quality” is determined strongly favour the already favoured: as my colleagues and i have demonstrated of “excellence” (a synonym for “quality” in this context):

a concentration on the performance of “excellence” can promote homophily among... [researchers] themselves. given the strong evidence that there is systemic bias within the institutions of research against women, under-represented ethnic groups, non-traditional centres of scholarship, and other disadvantaged groups, it follows that an emphasis on the performance of “excellence”—or, in other words, being able to convince colleagues that one is even more deserving of reward than others in the same field—will create even stronger pressure to conform to unexamined biases and norms within the disciplinary culture: challenging expectations as to what it means to be a scientist is a very difficult way of demonstrating that you are the “best” at science; it is much easier if your appearance, work patterns, and research goals conform to those of which your adjudicators have previous experience. in a culture of “excellence” the quality of work from those who do not work in the expected “normative” fashion run a serious risk of being under-estimated and unrecognised.
this is particularly true when measures of relative “quality” (or “excellence”) are used to distribute scarce resources among researchers. peer review is an inherently conservative process: the core question it asks is whether work under review conforms to or exceeds existing disciplinary norms. in zero-sum or close to zero-sum competitions—such as the distribution of prizes or space in a conference—it has a well-established record of both rewarding the already successful and under-recognising the work of those who do not conform to pre-existing understandings in the discipline. in other words, as we have argued elsewhere,

...the works that—and the people who—are considered “excellent” will always be evaluated, like the canon that shapes the culture that transmits it, on a conservative basis: past performance by preferred groups helps establish the norms by which future performances of “excellence” are evaluated. whether it is viewed as a question of power and justice or simply as an issue of lost opportunities for diversity in the cultural coproduction of knowledge, an emphasis on the performance of “excellence” as the criterion for the distribution of resources and opportunity will always be backwards looking, the product of an evaluative process by institutions and individuals that is

samuel moore et al., “‘excellence r us’: university research and the fetishisation of excellence,” palgrave communications (january , ): , https://doi.org/ . /palcomms. . . internal bibliographic citations within this quotation have been silently elided.

this is known as the “matthew effect”; see robert k. merton, “the matthew effect in science,” science , no. ( ): – , http://www.jstor.org.ezproxy.alu.talonline.ca/stable/ ; dorothy bishop, “the matthew effect and ref ,” bishopblog, october , , http://deevybee.blogspot.ca/ / /the-matthew-effect-and- ref .html discusses the effect in relation to the research excellence framework. as jian wang, reinhilde veugelers, and paula e.
stephan, “bias against novelty in science: a cautionary tale for users of bibliometric indicators,” social science research network, january , , have shown, novelty in science is consistently underestimated by most traditional measures of “impact” in the short and medium term. there is a minor industry researching the failure of peer review to recognise papers that later turned out to be extremely successful by other measures such as citation success or the receipt of major prizes. see joshua s. gans and george b. shepherd, “how are the mighty fallen: rejected classic articles by leading economists,” the journal of economic perspectives: a journal of the american economic association , no. (winter ): ; juan miguel campanario, “rejecting and resisting nobel class discoveries: accounts by nobel laureates,” scientometrics , no. (april , ): – ; pierre azoulay, joshua s. graff zivin, and gustavo manso, “incentives and creativity: evidence from the academic life sciences,” the rand journal of economics , no. ( ): – ; juan miguel campanario, “consolation for the scientist: sometimes it is hard to publish papers that are later highly cited,” social studies of science ( ): – ; juan miguel campanario, “have referees rejected some of the most-cited articles of all times?,” journal of the american society for information science , no. (april ): – ; juan miguel campanario, “commentary on influential books and journal articles initially rejected because of negative referees’ evaluations,” science communication , no. (march , ): – ,; juan miguel campanario and erika acedo, “rejecting highly cited papers: the views of scientists who encounter resistance to their discoveries from other scientists,” journal of the american society for information science and technology , no. (march , ): – ; kyle siler, kirby lee, and lisa bero, “measuring the effectiveness of scientific gatekeeping,” proceedings of the national academy of sciences , no. (january , ): – . 
established by those who came before and resists disruptive innovation in terms of people as much as ideas or process.

moore et al., .

diversity instead of quality

taken as a whole, this bias among traditional measures of quality means that they are highly likely to underestimate the value of potentially “excellent” work by digital humanities researchers from non-traditionally dominant demographic groups—especially if this work challenges existing conventions or norms in the field. but what about poor quality work from “diverse” researchers? that is to say, what about work from researchers outside traditionally dominant demographic groups within the digital humanities that can be shown on relatively concrete grounds to be below the accepted standards in the field? work, for example, that does not use or recognise existing technological standards? that ignores (or appears to be ignorant of) basic disciplinary conventions? a student project, say, that encodes text for display rather than structure? or a project from a researcher working outside mainstream digital humanities that uses proprietary software or formats or strict commercial licences? it is easy to see, in theory, how a conference programming committee that had to choose between a good project by a research team from a dominant demographic group and a flawed project by a team working outside such traditionally dominant communities might struggle with the question of “diversity vs. quality” when it came to assign speaking slots. the answer is that it is a mistake to see “poor quality” as a diversity issue. while such problems can arise with researchers from demographics that are not traditionally dominant within the digital humanities, they also arise among researchers from traditionally dominant demographics.
indeed, the willingness to celebrate (or at the very least destigmatise) “failure” is one of the features of the digital humanities that distinguishes it from traditional area fields like medieval studies. mccarty has described the digital humanities as “the quest for meaningful failure” and many authors in the field have devoted considerable attention to the “error” part of “trial and error” (i am aware of no such bibliography or tradition within medieval studies). we have a proud tradition of accepting student papers at digital humanities conferences—indeed, there are often both special prizes and special adjudication tracks for such papers. as long as the researchers in question conform to dominant group expectations in other ways, it seems, referees and review panels are prepared to accept work that implicitly or explicitly violates disciplinary norms on an exceptional basis because it helps define the field. in the case of student papers, they also take positive steps to identify and support a demographic that, by definition, is still presumably acquiring the skills that otherwise make for “quality” work. what this suggests, in turn, is that even “poor quality” is not a reason to avoid privileging diversity within the digital humanities. the digital humanities has a tradition of encouraging accounts of failure and accounts of structurally often less accomplished researchers such as students for the same reason it has a tradition of encouraging reports from researchers working in a wide variety of disciplinary contexts—because these accounts contribute collectively to the breadth of our understanding of the application of computation to humanities problems, expanding particularly our knowledge of method (i.e. the “hows,” or, in this case perhaps, “how not tos”). adding to this the occasional failed or less accomplished work of a researcher from a traditionally non-dominant demographic will neither disturb this tradition of celebrating failure nor result in the crowding out of successful projects by members of traditionally dominant or non-dominant demographics.

willard mccarty, “humanities computing,” encyclopedia of library and information science (marcel dekker, ), https://doi.org/ . /e-elis.

see, among many others, isaac knapp, “creation and productive failure in the arts and digital humanities,” inspire-lab, january , , https://inspire-lab.net/ / / /creation-and-productive-failure-in-the-arts-and-digital-humanities/; katherine d. harris, “risking failure, a cuny dhi talk,” triproftri, march , , https://triproftri.wordpress.com/ / / /risking-failure-a-cuny-dhi-talk/; brian croxall and quinn warnick, “failure,” digital pedagogy in the humanities, mla commons, accessed august , , https://digitalpedagogy.mla.hcommons.org/keywords/failure/; jenna mlynaryk, “working failures in traditional and digital humanities,” hastac, february , , https://www.hastac.org/blogs/jennamly/ / / /working-failures-traditional-and-digital-humanities; stephen ramsay, “bambazooka,” accessed august , , http://web.archive.org/web/ /http://stephenramsay.us/ / / /bambazooka/; quinn dombrowski, “what ever happened to project bamboo?,” literary and linguistic computing , no. (september , ): – .

conclusion

the history of the digital humanities is often traced through landmark projects and movements, from the initial work by roberto busa on his concordance, through the stylometrics and statistical work of the s and s, to the “electronic editions” of the s and s, to big data and ubiquitous computing today. this history, however, is also a history of diversity.
at each stage, progress in the field has required the introduction of new problems, new methods, and new solutions: a broadening of, rather than simple repetition or perfection of, the types of problems to which computation can be applied or which exist in an interesting computational context. the digital humanities is what it is today because we did not privilege “quality”—of concordance-making or edition-making or other early forms of humanities computing—over other novel forms of computational work. rather, it has thrived because we have embraced new and (often initially) imperfect experiments in the application of computation to other problems or new approaches to understanding the significance of computation in the context of humanistic research. this is, indeed, as mccarty has pointed out, perhaps the most ironic thing about the decision of the editors of computers and the humanities to narrow the focus of their journal to language resources and evaluation in , just as the digital humanities entered its most expansive and diverse phase. just as progress in humanities computing would have stalled if it had been unable to expand beyond roberto busa’s early interest in concordances, or the burst of activity in text encoding and presentation that characterised the “electronic editions” of the s and early years of this decade, so too the digital humanities will fail to progress if it cannot expand its range of experiences beyond those whose work and experience have largely defined it for most of its history: the white, northern, male, university researcher with access to reasonably secure funding and computational infrastructure. as digital culture (and hence the scope of humanities research) expands globally, the range of methodological and theoretical questions we are faced with has itself become much broader: why are some groups able to control attention and others not? how do (groups of) people differ in their relationship to technology?
how do you do digital humanities differently in high- vs. low-bandwidth? how does digital scholarship differ when it is done by the colonised and the coloniser? how is what we discuss and research influenced by factors such as class, gender, race, age, social capital in an intersectional way? this expansion requires the field, if it is to advance, to ensure that researchers with experience in these questions from different perspectives are given a place to present their findings in our conferences and journals. in some cases—and there is no reason to believe that the frequency of such cases will be more than we find whenever new approaches and ideas enter the field—this work will belong to the well-established tradition of

see humanist discussion group (by way of willard mccarty

information studies (third level)
media and communications -> bibliometrics (adjustment, doesn’t exist as third level)
media and communications -> journalism (adjustment, doesn’t exist on third level)
other humanities -> cultural studies
languages and literature -> english
medical and health sciences (first level in scb)

data
which data are present about social media use by researchers?
more than one type may be entered:
interview data (includes focus groups)
twitter data
survey data
citation data
log files
wiki data
blog data
online profiles (also sns)
personal use
field notes
other (e.g., literature)

jing hu, leiden university
hou ieong (brent) ho, berlin state library
https://dh.chinese-empires.eu/markus
§ only use chrome browser
§ automated tagging and identification of named entities in classical chinese
§ tagging of user-supplied keyword lists in all languages
with credit to huei-lan xiong, leiden university
hou ieong (brent) ho

platin, palladio, docusky

personal names: korean historical biographical information system; encyclopedia of korean culture
place names: encyclopedia of korean culture; the map of dongyeo
bureaucratic titles: veritable records of the joseon dynasty; encyclopedia of veritable records of the joseon dynasty
books: encyclopedia of korean culture; comprehensive edition of korean literary collections; encyclopedia of veritable records of the joseon dynasty

§ user interface
  § markus was designed from the outset to support multi-language interfaces (english, traditional chinese and simplified chinese)
  § it is convenient to add the korean interface
§ automated tagging
  § the tagging mechanism is dictionary based, so it is language independent
  § new korean dictionaries are needed
§ web reference
  § web reference is designed and implemented by external web services (api or iframe)
  § it is easy to hook up a new korean web reference by registering korean markup to web reference

index.html: language? english -> index_english.html; korean -> index_korean.html

those tiny ui fragments are implemented using an i18n json object. for example: { korean: '태깅 요약', english: 'tag summary' }

● research background
● korean interpreters of the chosŏn dynasty ( – ) constantly travelled to beijing as delegation members to implement chosŏn’s tributary missions.
● due to their lower social status, textual sources recording interpreters’ visits are scarce
● yŏnhaengnok (chosŏn travelogues to beijing) mention some interpreters by name, but the individuals usually cannot be identified from the texts alone
● questions:
  ● who visited beijing as diplomatic interpreters?
  ● what is the relationship among the interpreters in one diplomatic delegation?
  ● where did diplomatic interpreters encounter chinese people in china?
  ● ……

textual source: full-texts of yŏnhaengnok (chosŏn travelogues to beijing)
using automated markup to tag chinese place names and korean personal names
using online biographical reference to discover the hidden interpreters in the text
retrieving and comparing with genealogical network data; findings:
1. genealogical ties existed among the interpreters in the delegations that visited beijing
2. diplomatic interpreters had (lineage or marriage) ties to prominent families
using keywords helper to retrieve accommodation spots
using online reference tools to get coordinate data
export tagged data to docusky platform (docugis)
the accommodation spots in pak chiwŏn's rehe diary (熱河日記)
§ hilde de weerdt (leiden university)
§ kim hyoen, kim baro, the academy of korean studies

digital humanities

the importance of pedagogy: towards a companion to teaching digital humanities

hirsch, brett d.
brett.hirsch@gmail.com
university of western australia

timney, meagan
mbtimney.etcl@gmail.com
university of victoria

the need to “encourage digital scholarship” was one of eight key recommendations in our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences (unsworth et al). as the report suggested, “if more than a few are to pioneer new digital pathways, more formal venues and opportunities for training and encouragement are needed” ( ).
in other words, human infrastructure is as crucial as cyberinfrastructure for the future of scholarship in the humanities and social sciences. while the commission’s recommendation pertains to the training of faculty and early career researchers, we argue that the need extends to graduate and undergraduate students.

despite the importance of pedagogy to the development and long-term sustainability of digital humanities, as yet very little critical literature has been published. both the companion to digital humanities ( ) and the companion to digital literary studies ( ), seminal reference works in their own right, focus primarily on the theories, principles, and research practices associated with digital humanities, and not pedagogical issues. there is much work to be done.

this poster presentation will begin by contextualizing the need for a critical discussion of pedagogical issues associated with digital humanities. this discussion will be framed by a brief survey of existing undergraduate and graduate programs and courses in digital humanities (or with a digital humanities component), drawing on the “institutional models” outlined by mccarty and kirschenbaum ( ). the growth in the number of undergraduate and graduate programs and courses offered reflects both an increasing desire on the part of students to learn about the sorts of “transferable skills” and “applied computing” that digital humanities offers (jessop ), and the desire of practitioners to consolidate and validate their research and methods.

we propose a volume, teaching digital humanities: principles, practices, and politics, to capitalize on the growing prominence of digital humanities within university curricula and infrastructure, as well as in the broader professional community. we plan to structure the volume according to the four critical questions educators should consider as emphasized recently by mary breunig, namely:
- what knowledge is of most worth?
- by what means shall we determine what we teach?
- in what ways shall we teach it?
- toward what purpose?

in addition to these questions, we are mindful of henry a. giroux’s argument that “to invoke the importance of pedagogy is to raise questions not simply about how students learn but also about how educators (in the broad sense of the term) construct the ideological and political positions from which they speak” ( ). consequently, we will encourage submissions to the volume that address these wider concerns.

references

breunig, mary ( ). 'radical pedagogy as praxis'. radical pedagogy. http://radicalpedagogy.icaap.org/content/issue _ /breunig.html.
giroux, henry a. ( ). 'rethinking the boundaries of educational discourse: modernism, postmodernism, and feminism'. margins in the classroom: teaching literature. myrsiades, kostas, myrsiades, linda s. (eds.). minneapolis: university of minnesota press, pp. - .
schreibman, susan, siemens, ray, unsworth, john (eds.) ( ). a companion to digital humanities. malden: blackwell.
jessop, martyn ( ). 'teaching, learning and research in final year humanities computing student projects'. literary and linguistic computing. . ( ): - .
mccarty, willard, kirschenbaum, matthew ( ). 'institutional models for humanities computing'. literary and linguistic computing. . ( ): - .
unsworth et al. ( ). our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences. new york: american council of learned societies.
profile: kathleen fitzpatrick
insights – ( ), november

kathleen fitzpatrick is currently the director of scholarly communication at the modern language association (mla), which is the largest scholarly society in the humanities, bringing together scholars, teachers, students and other members interested in the study of language, literature and culture. kathleen received her ba and mfa from louisiana state university and her phd from new york university. she was a professor of english and media studies at pomona college from to and also holds appointments as visiting research professor of english at new york university and visiting professor of media at coventry university.

the mla hosts an annual convention and publishes a wide range of journals and books in the field, including the mla handbook, an authoritative resource for writers of research papers in the humanities. it also publishes the mla international bibliography, the only comprehensive bibliography in language and literature.

when asked about her role as director of scholarly communication at the mla, kathleen explained that she has responsibility for overseeing the development and editing of all the association’s book and periodical publications, as well as for ‘our other scholarly communication initiatives, such as mla commons. i work with our member committees to ensure that their goals for our publications are met and to imagine new means of supporting the ways that scholars want to communicate and collaborate with one another.’

your editor noted that the arts and humanities have historically been relatively poorly served in terms of digital content, so was keen to find out from kathleen how she feels that this has changed over recent years, or, indeed, whether there are still things that she would like to see develop further.
‘my sense is that publications in the arts and humanities may have moved a bit more slowly than those in the sciences in embracing digital communication’, she began, but added, philosophically, ‘that deliberative nature has served our fields well: while there is a much greater percentage of scientific content available digitally, there has been little experimentation in terms of its form, which remains fairly locked into the model of the pdf, replicating the print journal article from which it derived. by contrast, in the arts and humanities, we’ve seen much more in the way of innovation in the possibilities that the web presents for scholarship: multimodal, networked texts; massive data-driven visualizations; complex mapping projects; open scholarly communities; and so on.’ kathleen acknowledges that ‘these newer forms aren’t as ubiquitous as the pdf, of course, but they have created an atmosphere in which scholars are able to continue experimenting with new forms for their work. it’s my hope that the platforms and projects that we develop at the mla can inspire such experimentation.’ it is clear that kathleen has a great sense of pride in her role and what she and the mla are helping to achieve for the community, so your editor was keen to understand the career path that had brought her to her current role. she began by saying lightheartedly that ‘a series of happy accidents and unexpected opportunities led me to where i am now.’ ‘when i was a faculty member, i frequently brought alumni back to campus to talk to our students about their career paths, and, inevitably, the people with the most interesting jobs got to those jobs through some indirect and often unpredictable route. 
i’m no exception; first, during a depressed year on the job market, landing the one faculty position whose ad most perfectly described the odd combination of things i did; then, when i ran into difficulty getting my first book published, using the blog i’d recently established as a means of thinking through the changes that were beginning to take root in scholarly communication; then, when one such blog post came to the attention of the institute for the future of the book, having a chance to experiment with developing an online scholarly community. and finally, finding myself with the opportunity to put what i’d learned both in the development of that community and in the research i’d done on scholarly communication at the service of the modern language association.’ sometimes fate has a fortunate knack of putting a person in the right place at the right time to find the perfect role for them, and it seems that kathleen is one of those lucky people. and it is a role that she has clearly taken to heart. she recalled vividly ‘the afternoon i received my first author’s copies of planned obsolescence. i was completely taken by its physicality: the velvety cover stock, the evocative cover image, the beautiful page layout. there was something deeply affective about my response to it – and yet that beautiful object could not have come into being without the equally rich digital networks through which i’d discussed, drafted, received feedback, and improved the manuscript.’ that experience shaped her views going forward, and she added, ‘that moment made clear to me the degree to which the book and the internet can be mutually supporting and can work together to enliven scholarly communication.’ the innovation that the internet allows has made possible the rise of open access (oa) publication and dissemination of scholarly work. your editor was interested to know how the move to oa has impacted on kathleen’s role at the mla. 
‘one of the early projects i undertook at the mla was leading a process of revising our author agreements to ensure that they all became green oa friendly, such that the authors who publish with us would be free to deposit their work in the repository of their choice.’ she went on to add, ‘more recently, we have developed and launched core, a repository connected to mla commons, in which any member can deposit his or her work, make it available both to colleagues in the field and to the broader public, and have that work represented as part of their scholarly profiles.’

the mla is clearly making the most of the technologies available in the 21st century, but kathleen is realistic enough to accept that there is more to be done. ‘lots of exciting new scholarly work is being produced in innovative formats and on innovative platforms, but too often that work is not considered to ‘count’ in conventional review and assessment practices . . .’ she noted, before continuing, ‘new kinds of work may demand new modes of reading and evaluation, but our processes need to adapt to the work, rather than vice versa. the mla has created a set of guidelines for evaluating digital work, as have a number of other scholarly societies; scholars and administrators should seek out these guidelines and use them to ground the assessment of new modes of scholarly work on their own merits.’

photo: kathleen speaking at a summer meeting of the institute for liberal arts and digital scholarship (iliads).

in drawing the interview to a close, your editor asked kathleen if she manages to find time in her busy schedule to switch off from the day job and relax. ‘i love to read genre fiction: detective novels, science fiction, young adult novels, and so on.
there’s something about the worlds created by these genres that can permit a temporary escape from this one!’ and on that note, your editor thanked kathleen for taking the time to talk to insights and allowed her to escape (albeit temporarily) to one of her beloved alternative worlds.

kathleen was interviewed for ‘insights’ by steve sharp

acknowledgments
photographs are courtesy of the modern language association (first page) and angel david nieves (second photo).

article copyright: © uksg. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original is credited.

steve sharp
co-editor, insights
uksg
e-mail: s.l.sharp@leeds.ac.uk
orcid id: http://orcid.org/ - - - x

to cite this article: sharp, s, profile: kathleen fitzpatrick, insights, , ( ), – ; doi: http://dx.doi.org/ . /uksg.

published by uksg in association with ubiquity press on november

[library quarterly, vol. , no. , pp. – ]
© by the university of chicago. all rights reserved.

convergent flows: humanities scholars and their interactions with electronic texts

suzana sukovic

the author wishes to thank joyce kirk, theresa dirndorfer anderson, and reviewers for their comments about this article.
program coordinator, digital innovation unit for the humanities, arts and social sciences, the university of sydney, sydney, p.o. box , broadway nsw , australia; e-mail suzana.sukovic@gmail.com.

this article reports research findings related to converging formats, media, practices, and ideas in the process of academics’ interaction with electronic texts during a research project. the findings are part of the results of a study that explored interactions of scholars in literary and historical studies with electronic texts as primary materials. electronic texts were perceived by the study participants as fluid entities because the electronic environment promotes seamless interactions with a variety of media and formats. working with electronic texts combines some traditional information and research practices into new patterns of information behavior. the practice called “netchaining” combines aspects of networking with information-seeking practices to establish and shape online information chains, which link sources and people. different forms of exploration of participants’ research questions were enabled by interactions with electronic texts.

introduction

electronic information technologies have triggered significant changes in many areas of everyday life and scholarly research, but it had appeared for some time that the humanities were largely unaffected by new technologies. an impulse for change had been apparently contained in peripheries of the humanities field. the main reasons were often identified as the nature of humanities research, limitations of information and communication technology (ict), and scholars’ resistance to technology in general.

humanities research deals with the “creative, imaginative, the subjective world” [ , p. ]. the main contribution to the discipline is usually individual interpretation based on interaction with a wide variety of primary materials. the way in which electronic technology enables the use of primary materials and the way in which scholars engage with primary sources in electronic form are the central issues in the employment of electronic technologies in the humanities.
stephen wiberley and william jones noted, “ultimately, the most important development will be the extent to which humanists use electronic information technology to access the primary sources—the content that is the basis of their work” [ , p. ]. jerome mcgann wrote about the “material revolution” in which we reconceive the entity of our cultural archive of materials [ ]. widely accepted electronic texts (e-texts) are expected to revolutionize research to such an extent that some authors see them as a catalyst for a cultural change equivalent to major developments in the history of print [ ].

although digital collections are growing rapidly, evidence for the scholarly change has not been conclusive yet. deborah lines andersen found that the “humanities have been the most resistant to digital endeavors” [ , p. ], confirming virginia massey-burzio’s view that scholars in the humanities have been skeptical about technology [ ]. however, there is some evidence that the situation has been changing. in our comparatively fast-paced world, it is easy to forget that a decade is a split second in the life of the humanities, probably the oldest scholarly field. only fifteen years ago, matthew gilmore and donald case [ ] wrote that electronically published materials were a step back in some ways. they observed that the cost of electronic publications was a factor that severely reduced access. a decade later, democratized access to materials in electronic form was flagged as one of the most important benefits brought by electronic technologies, particularly the internet. wiberley and jones [ ] found that, over a period of ten years, most academics, even senior scholars without much interest or inclination, increased their use of electronic technology. three reports from australia [ ], the united kingdom [ ], and the united states [ ] documented an increased engagement with ict in the humanities.
the perceived value of the engagement with new technologies may depend on the particular subdiscipline or the department, but the literature and anecdotal evidence both suggest that it is not very high in the mainstream humanities. martha brogan and daphnée rentfrow [ ] investigated recent journals in american literature and found very few articles dealing with digital scholarship. studies of citation patterns, such as suzanne graham’s investigation of citations in historians’ professional publications, show that electronic resources do not rate highly in published works [ , ]. andersen wrote that tenure and promotion committees tended to discount digital works [ ].

noticing and documenting the change is difficult for two main reasons. first, the influence of information technology in the research process is often missing from published works. second, the nature of electronic media makes it difficult to monitor their contribution to the research process. the american council of learned societies (acls) [ ] noted that infrastructure is deeply embedded in the way we work but that, when it is efficient, it is invisible. carole palmer wrote that the internet and local digital library collections tend to be perceived “as one big digital blur of information, quite separate from personal or physical library collections” [ , p. ]. although “digital blur” probably makes it more difficult to notice the impact of different aspects of online interactions on research, the effects of search engines capable of gathering information and large amounts of materials very quickly have been observed. the acls [ ] found that recent trends in scholarship have broadened our understanding of what type of material belongs to any academic discipline, which has led to the inclusion of a much wider range of materials.
research in the humanities is a process that includes numerous nonlinear paths in the search for relevant information and connections between scattered sources. one practice that allows the establishing of connections is chaining, described by david ellis as “following chains of citations or other forms of referential connection between material” [ , p. ]. another prominent practice that promotes discovery is browsing [ – ]. serendipity is an integral part of the research process, enabled by practices such as browsing, but the literature often mentions the lack of possibility for serendipitous discovery as one of the significant shortcomings of electronic environments. allen foster and nigel ford [ ] overviewed the literature that discussed serendipity in electronic environments. the use of personal names for searching and organizing information is also a common practice in the humanities [ , ].

there is disagreement on whether scholars can employ their significant practices in electronic environments and whether electronic sources contribute to intellectual aspects of their work. helen tibbo [ ] found that historians seemed to be using the internet in the same way in which they traditionally used printed repository guides or the telephone, but she also noted that people “tend to base their information-seeking behaviors on what they expect to find” [ , p. ]. it can be expected that practices will change as more materials and services become available. participants in brogan and rentfrow’s study [ ] appreciated search capabilities because they facilitated access to large amounts of material and the establishment of connections in ways that were impossible or impractical before.

scholars in the humanities tend to work alone, but networking and participation in invisible colleges are important ways of keeping in touch with colleagues [ , , ].
sanna talja, reijo savolainen, and hanni maula [ ] showed that mailing lists have been important to researchers in scarcely populated fields because of their social and informational value. the internet fosters short-term exchanges with strangers rather than colleagues, which are potentially valuable in accessing new information [ ]. paul genoni, hellen merrick, and michele willson’s [ ] survey showed that over percent of academics in their study used the internet to activate latent ties. the concept of latent ties, based on caroline haythornthwaite’s description, includes “individuals working within the same occupational or professional group, who may be aware of each by name or reputation, but who have not had previous personal contact” [ , in the section “strong, weak, and latent ties”]. according to susan leigh star, geoffrey bowker, and laura neumann [ , p. ], “information artifacts undergird communities of practice, and communities of practice generate and depend on these same information resources.” the authors described convergence as “this process of mutual constitution.”

the literature offers a range of studies on the information behavior of the humanities scholar; some are focused on the use of digital resources in the humanities, but none deals primarily with the use of e-texts. several studies have been conducted that deal briefly with the use of e-texts as part of a broader investigation [ , , , ] or that investigate the use of a particular electronic resource [ – ]. although primary textual materials are central to research in the humanities, there is a gap in our knowledge about the role of primary materials in electronic forms in the research process. there is a need to explain how researchers interact with e-texts and how these interactions contribute to the research process.
methodology

study design

qualitative methodology was used to investigate research projects in which e-texts have been used and the nature of academics’ interactions with e-texts. this study focused on exploring the roles of e-texts as primary sources in the projects aiming to produce traditional outputs. it dealt with the use of e-texts as a resource and tool, as opposed to projects that aimed to produce electronic textual editions or enhance e-texts in any way. investigated research projects were in the areas of literary and historical studies because both fields are known for extensive and sophisticated use of textual resources.

hermeneutics provided a philosophical and methodological framework for the study. hermeneutics as a philosophical tradition based on studying text and hans-georg gadamer’s reflections on phenomena of understanding and interpretation [ ] are particularly relevant, considering that this study investigated interactions with text and that interpretation and text define scholarship in the humanities in many ways.

table . study participants (number of participants)
  total
  gender: female; male
  field: historical studies; literary studies
  career stage: early career researcher; midcareer; senior

study participants. researchers who worked in universities in a major australian city were initially invited to discuss their current and recent research projects in which they had used e-texts at least once. in later stages of data gathering, theoretical sampling was required to identify scholars who worked with electronically born literature, in order to explore further some issues raised in the first round of interviews. the final group of study participants included scholars from six universities in two australian cities and one participant from a university in the united states (altogether, sixteen participants). there were nine female and seven male participants in different stages of their careers (see table ).
participants researched a wide range of time periods and topics from a variety of disciplinary orientations. literary scholars discussed projects that explored subjects in a broad time range, from old icelandic literary studies to biographies of contemporary authors, and in a variety of fields, from creative writing to criticism of electronic poetry. historical projects included, for example, studies in religion, cultural studies, and eighteenth-century english history.

in order to protect participants’ anonymity, numerical codes were used instead of names to label all data gathered during the study. the labels are in the format / , the first number indicating the number of the participant and the second standing for stage or of the study (e.g., / means participant in stage ). all references that could reveal participants’ research topics were replaced with more general terms to protect participants’ anonymity and their ownership of ideas. i would like to acknowledge that numerical codes may have the effect of constructing participants as peripheral to the study [ ]. the choice of numerical codes in this study was based on two considerations. first, it ensured an effective way of labeling data gathered in two research stages without identifying participants who continued their participation. second, pseudonyms imply gender and ethnicity, which were not relevant in this study. an occasional use of a personal pronoun when presenting views of a participant makes gender less prominent than would be the case if a pseudonym was repeatedly used to identify the participant. numerical codes are not meant to construct a participant as a faceless entity.

data gathering. the study had two stages. the first stage included in-depth semistructured interviews and examination of participants’ manuscripts and published works, as well as examination of some e-texts mentioned during interviews.
the second stage involved detailed data gathering from a group of four academics drawn from the participants in the first stage. data were gathered throughout and . the study investigated thirty research projects.

in the first interview, participants talked about some general issues relating to their engagement with e-texts, and they discussed one finished and one current research project. the interviews in the first stage lasted sixty-five minutes on average.

the participants in the second stage recorded data about their interactions with e-texts during the current research project identified in the first interview. the participants were asked to record data on forms and audiotapes. data-gathering forms were designed as a memory aid for the participants, who were asked to record brief details about e-texts they used. the study participants were also asked to record their comments on audiotapes, reflecting on any interesting aspect of their interactions with e-texts during the current project. researchers who completed forms and/or recorded comments discussed recorded information in the final interview. it appeared that forms and tapes served as a memory aid during the final interviews. in the second interview, the participants talked about the details of the research project and their view of e-texts in the research process. the second-stage interviews lasted one hour and fifteen minutes on average. participants’ publications arising from their projects were examined throughout the second stage of data gathering (see table ).

data analysis. the interviews and tapes with comments were fully transcribed. data from interviews were used in audio and written form during much of the analytical process, to aid interpretation. understanding developed in considering part and whole in the hermeneutic circle, which thomas schwandt [ , p.
] described as a method that “involves playing the strange and unfamiliar parts of an action, text, or utterance off against the integrity of the action, narrative, or utterance as a whole until the meaning of the strange passages and the meaning of the whole are worked out or accounted for.”

humanities scholars and e-texts

table
data-gathering summary

stage of the study and method              number of participants
first stage
    interview
    examination of participants’ works
second stage
    interview
    audiotape with comments
    data-gathering forms
    examination of participants’ works

note.—in both stages of the study, e-texts were examined when they were accessible and when sufficient details were available to retrieve them.

grounded theory techniques described by anselm strauss [ ], strauss and juliet corbin [ , ], and barney glaser [ ] were used to code transcripts. after developing an initial coding scheme, the software nvivo was used for the coding. the coding scheme was refined and developed at different levels of abstraction as additional data were gathered.

credibility.—credibility was ensured by triangulation of data sources and methods [ ], as well as by prolonged and persistent engagement with the field of inquiry, avoiding the danger of “going native” and building trust with participants [ ]. documentation of the research process contributed to the credibility of the study. referential adequacy [ ] and descriptive validity [ ] were ensured by taping and transcribing interviews and participants’ comments. participants’ published works also provided a reference point. procedural and reflective memos were written during the study to provide data about the research instrument. joseph maxwell’s approach to validity “refers primarily to accounts, not to data or methods” [ , p. ]. triangulation of methods in this study was used in a way that strengthened interpretive validity. accounts in participants’ own words provide some evidence for interpretation.
limitations of the study.—the exploratory study aimed to investigate a range of issues concerning scholars’ engagement with e-texts, based on in-depth data gathering. the study results do not have any statistical significance and cannot be generalized beyond the study data. the selected group of participants was not unique in any way, so a comparison with similar groups of scholars is possible. the literature suggests significant similarities between researchers in different countries, but there may be some national differences.

convergent, divergent, and parallel flows

fig. .—divergent, parallel, and convergent flows (based on [ ])

this article presents some study findings related to the converging formats, media, practices, and ideas that result from working with e-texts. some experiences of working with e-texts and aspects of academic writing and presentation of research results are also part of the convergences identified in the study, but they are omitted from this article because their coverage would exceed the limits of a single journal article. however, it is important to acknowledge the existence of converging tendencies in academic styles and genres, which contribute to changes in the research process. i would also like to acknowledge that converging tendencies do not present the whole picture of scholars’ work with e-texts. divergent and parallel developments are present as well. the diagram in figure is based on research into the evolution of glacial ice [ ], but it provides a framework for thinking about flows that can be observed in the electronic environment. divergent tendencies appear in the development of different research projects arising from the same idea or in the shift of some aspects of the project into new online activities. divergence is also noticeable as branching out from one’s scholarly field to explore materials and ideas in other disciplines and in alternative, nontraditional sources.
parallel flows provide a way of thinking about multipurpose environments as well as parallel investigations of the same problem. parallel flows can also describe scholars’ work on different projects or tasks, which could have had the same source and could converge at some point. convergent flows describe the movement from various directions to one point, such as the contribution of ideas from disparate sources or disciplines to form an understanding of the topic. while acknowledging the existence of divergent and parallel flows, i will focus on convergent flows.

understandings of e-texts: converging media and formats

definition of “e-text”.—the notion of text and textuality has been changing since the beginning of the twentieth century, so that it now encompasses linguistic and nonlinguistic forms. when text, with its numerous meanings and connotations, appeared in the virtual space, defined by mutable objects and fluid boundaries, further reshaping of textuality was inevitable. an understanding of text in this study was derived from françois rastier’s [ , p. ] and paul ricoeur’s [ , p. ] definitions, which see text as a linguistic phenomenon. for the purposes of this study, “text” is defined as an autonomous linguistic chain (oral or written) that constitutes an empirical unit, fixed by writing or recording. the term “e-text” in this study means any textual material in electronic form used as a primary source in literary and historical studies. primary materials are usually poetry, stories, novels, plays, and a variety of historical documents (government, public, or private). digitized archival copies of magazines and newspapers as well as web sites and blogs could be e-texts as defined here when they are used as primary sources. e-texts could be written or spoken (e.g., oral histories), digitized or created electronically, and stand-alone documents or part of electronic databases and editions.
textual fluidity.—the definition in the previous paragraph was communicated in different ways to the study participants. although i was aware of the mutable nature of electronic textuality, i expected that occasional clarifications would be all that was needed, considering that the participants had very clear understandings of the nature of primary materials. as it turned out, the nature of the online environment and electronic textuality determined understandings of e-texts to such an extent that any definition provided only loose boundaries once people started to talk about their experiences with e-texts. the original definition was retained throughout the study, but participants’ ways of viewing e-texts were taken into account when it was necessary to understand their perceptions of e-texts and their interactions with these sources. e-texts are not solid objects. the convergence of formats, media, and information flows makes fluidity one of the most essential characteristics of electronic textuality. the participants talked about the ubiquitous nature of e-texts and referred to them as if they had a gaseous or liquid state of aggregation: “it becomes like the air you breathe. it’s very difficult to talk about because it’s everywhere” (participant / ). a number of participants compared e-texts to a rich and unpredictable ocean. it is a “vast ocean of information out there and i can draw on that when i feel like it” (participant / ). or, exploration of a textual database is like “going in fishing, pot luck to see what turns up” (participant / ). that ocean can also be threatening and overwhelming, making it difficult to define the boundaries of one’s project. it is certainly true that a “vast ocean of information” exists in analog forms as well.
what distinguishes work with e-texts is that their unique complexity and interactivity are enabled by computerized search and speed rates, in which diverse sources are brought together. the internet provides loosely ordered environments, which gather sources that traditionally do not exist in the same space. the speed in following hunches and patterns of information, combined with a lack of traditional reference points, underpinned participants’ perceptions that they were dealing with a vast and rich, albeit unpredictable, ocean.

converging media and formats.—the fluidity of e-texts embraces different formats and media, such as bibliographic records, transcribed textual materials, page images, and visual materials, sources that do not belong together in traditional divisions of material types. bibliographic information found in catalogs becomes part of searching and exploration, leading to e-texts or other sources of information. however, constant and relatively quick iteration of search, retrieval, and interaction with sources, all happening in the same physical space, makes the process and its elements hard to distinguish from each other. google and anything produced by google and other search engines tend to be seen as e-text. it is often difficult or irrelevant for the researcher to differentiate between page images and transcribed texts. at the same time, images without linguistic content tend to merge with textual sources. this happens either because images are part of digitized pages and/or because the search process does not require that they be perceived as different formats. one participant worked extensively with images of archival parish maps that had important handwritten inscriptions on them. in her use of all sorts of images, she found catalog records to be an important and integral part of her research: the records “textualized” images and “put searchable language onto images” (participant / ).
this participant combined various searching techniques to bring together materials in different formats (e.g., images and transcriptions of old newspaper articles, photographs, annotated maps) to help her make connections and see patterns. in this case, annotations on the maps as well as bibliographic records, which “textualized images,” merged with other materials, all of which could be searched and retrieved at the same time for the same purpose. participant / talked about e-texts used in a project and referred to visual and textual sources, which revealed patterns of information in the search process. when asked what e-texts were used in the project, participant / answered, “basically, i’m thinking here of digital images of the engravings that are on library and other archival web sites.” even when e-texts are clearly textual sources with linguistic content, other media are intrinsically present. the best example is electronic poetry, in which words become kinetic and visual elements are an integral part of the poetic text. although graphic elements have been significant in some print-based genres, electronic media enable movement and inclusion of other media in novel ways, which can change literary genres. the linguistic nature of textuality changes, and one of the participants commented that “you’re writing a kind of picture” (participant / ). the relationship with music is easily established in the environment in which written words are often associated with sounds. sometimes it is even hard to distinguish between analog and electronic texts. if an archive digitized an old manuscript page to avoid photocopying and gave a printout of a digital image to the user, it can be argued that the printout is a hard copy of both analog and electronic formats. one participant thought it was ironic that a rare book library offered printouts of scanned pages but did not offer the digital images of pages.
in the electronic environment, physical boundaries and physical space do not exist, which promotes a sense of fluid movement through the electronic domain. while the users of a traditional library do not need to know about the fine details of the library system, they cannot help but be aware that they have stopped using the catalog and have physically moved to the stacks or that they have put a book back on the shelf and need to play a videotape. physical movements and interactions with different physical objects do not exist online, which has its positive and negative effects. limitations of electronic formats and environments as well as intrinsic qualities of different media are reasons why most users need materials in analog and digital forms, but materials in electronic form have their unique advantages. not only is it convenient to access materials quickly from one’s own home or office and to gather materials scattered in many physical collections, public and private, but it is usually easier to handle electronic files than large bound volumes or crumbling microfiche. participant / , for example, talked about the convenience of accessing digitized archival materials on the web sites of the new south wales (nsw) department of lands and the state library of nsw. not only do multimedia environments support converging formats in a technical sense, but the smooth move from a catalog record to an electronic monograph and film promotes the convergence of media and formats in the user’s perception, and this, in turn, promotes the convergence of ideas. participant / explained the significance of being able to access and search large bodies of materials online:

interviewer: when you explore the ideas related to your research, does it help you in any way to follow certain threads more easily? does it help you to explore more than you otherwise would? or it doesn’t change that?
participant / : no, no, it does.
it changes that dramatically with images . . .
interviewer: how does it change that?
participant / : because it allows me to build networks of connections.

netchaining: converging practices

the discussions about scholars’ interactions with e-texts uncovered a number of converging and transformed practices of networking and information searching. i will describe a combination of information behaviors occurring on the internet, which i call “netchaining” because it combines aspects of networking, chaining, browsing, and web surfing in a new pattern. netchaining is about establishing and shaping online information chains that link sources and people. netchaining is the internet behavior that combines all of the above practices in traditional and new ways. chaining is a traditional form of following references, but on the internet, another source may be only a click away if there is a link, or it may require a brief additional search to retrieve the referenced source. participant / , for example, talked about finding references to primary sources in academic journal articles available from project muse or the academic search elite database and then moving from the journal article to a primary source during the same searching session. online chaining can widen to include communication with the author, whose contact details appear as part of the reference or the linked e-text. the practice of browsing was transferred to the electronic environment and may include browsing of digital collections as well as web surfing as a way of looking for relevant information by searching and following hyperlinks. chaining is often combined with browsing and web surfing.
participant / , for example, talked about doing a keyword search online, retrieving e-books that were used as primary materials, examining web sites of some people recommended by colleagues or identified online, examining blogs, and participating in online discussions in order to find more information about a research concept. participant / had favorite web sites created by communities of research interests. this researcher would go to the web sites to retrieve relevant e-texts through searching and browsing and to visit the bulletin boards and discussion sections. the difference between traditional chaining and browsing often disappears online. if a reference to another source is provided as a hyperlink, it is difficult and probably irrelevant for practical purposes to establish whether following the link is part of the general browsing of a collection, surfing the web, or a straightforward online version of traditional chaining. traditional browsing and chaining are ways of discovering materials that cannot be found through catalogs and indexes. online variations of these practices have the same aim. however, the ease and speed at which the scholar can employ searching, browsing, chaining, and surfing practices change the nature of the interaction, and, more importantly, the ease of interaction with other people brings new aspects to the process. communication with people can be part of an online search for information (e.g., asking how to obtain documents), interaction with materials (e.g., clarifying details from a document), or networking (i.e., connecting with people). the common practice of searching by personal name has been widened to include checking various details about a person and, possibly, communicating by e-mail. the ease of contacting people strengthens informal communication and opens access to alternative sources of information.
traditional networking by participation in invisible colleges is practiced through participation in online academic forums. however, netchaining includes communication with a variety of people in formal and informal ways. some aspects of online communication are unique. for example, one participant ( / ) found that authors of electronic poetry were usually more responsive than authors of print-based poetry and were willing to reveal requested information about technical aspects of their work because they were interested in discussing their technical solutions. another participant ( / ) used web sites and online discussions to keep in touch with communities that were a subject of her research, which was not possible in other ways. for this researcher, electronic materials discovered or created in the process of communication provided unique primary data. in many cases, networking happened as part of finding and checking information. when asked what she liked about e-texts, participant / responded:

i guess the immediacy. so when you found someone in particular who had written something you really like and then written it in the last two years and their e-mail address was at the bottom, you actually also realize you could go and talk to them. and then if you like them, you could connect them in with your [name of the topic] network. so that whole people-publication actually humanly connecting with them thing works.
interviewer: so you contact them and what happens then?
participant / : well, then you’d say, look, i really like this article, can i just check, when you said , did you mean that, or was that the ’ one? just little details like that. and, you know, you’re going to come to australia, we’re thinking of having a conference and, you know, that sort of thing.

in traditional chaining, a reference in a footnote would be used as a lead to another document.
in the situation described above, a note at the bottom of the screen provided an e-mail address, which led to interaction. in this example, the researcher wanted to contact the author whose work was interesting, and their communication combined information seeking and networking. four main reasons motivated netchaining activities that involved other people in some way: to find information, to aid access to a physical collection, to confirm information, and for purposes of current awareness.

to find information.—an example of contacting people to find additional information was mentioned above in relation to the participant who contacted authors of electronic poetry to check technical details. participant / referred to a situation in which online communication with a friend helped him to clarify details and find a needed e-text.

to aid access to a physical collection.—participant / , for example, talked about contacting an archivist after consulting online samples of materials, to aid access to a physical collection and organize a visit.

to confirm information.—the excerpt from the interview with participant / quoted above provided an example of contacting the author to check information: “can i just check, when you said , did you mean that, or was that the ’ one?” participant / mentioned situations when information looked suspicious but an alternative source did not exist or was not readily available. communication with people on a discussion list and electronic correspondence with the author can be used to confirm information: “i would ask colleagues or post it on a discussion list or check it with, like i said before, check the contact person or the copyright” (participant / ).

for current awareness.—a number of participants mentioned certain web sites they checked regularly for current awareness purposes, particularly web sites that were likely to publish the latest works by modern authors.
one of the participants ( / ) regularly checked a web site of a poet’s fan because the poet tended to publish his new works there. in other cases, a secondary document may lead to primary materials readily available on the same database (e.g., jstor) as well as provide a link to information about the author, which is used for current awareness rather than for the satisfaction of immediate information needs. communication with other people is not necessarily part of netchaining, as illustrated by examples where netchaining included a combination of other techniques, but it is an important part of the online behavior. table summarizes why participants initiated netchaining activities that involved approaching other people at some point. reasons and netchaining activities can be combined so that one reason can include several netchaining activities from the same category. in the first category, to find information, contacting another person is a necessary first step.

table
reasons for initiating netchaining activities involving other people

to find information
    reason for netchaining: if interested in a document; to confirm detail(s); if information is crucial; if author’s authority could not be discounted; if curious; if interested in technical details of electronic literature
    netchaining activities: contacted a person who may know; looked up author’s web site; made a note for future use; contacted the responsible person and asked question(s); connected that person into own network, invited to a conference

to aid access to a physical collection
    reason for netchaining: to confirm details about a collection; to arrange a visit to an archive
    netchaining activities: contacted archivist listed on the web site

to confirm information
    reason for netchaining: when worried about trustworthiness of a document
    netchaining activities: posted a question to a discussion list; contacted the responsible person

for current awareness
    reason for netchaining: when coming across new work and wondering what other people do
    netchaining activities: contacted the author; initiated online discussion about the type of work people are doing; contacted people outside the discussion list

netchaining often reinforces and widens connections based on authority. the participants contacted people perceived as authoritative in certain respects (e.g., the author, the archivist) to check information. these connections were also widened by contacting people who “sounded interesting” (participant / ) and maybe involving them in some professional activities or discussing issues outside formal forums. the immediacy is an important part of the ability to make these connections. the significance of netchaining for participants ranged from an occasionally useful practice to the perception that this was a critical element of their professional identity. one of the participants talked about a network intelligence formed by linked data and artistic and intellectual agents. for this researcher, netchaining as a form of participation in the network intelligence meant “mainly just who i am” (participant / ). this participant talked about new ways of doing scholarly research, which started from traditional fields, in this case from english departments, to grow in new directions. the clarification that the new direction had become a new discipline came from feedback from people who form networks of online connections. netchaining is an important way of gathering information by following broad and unpredictable information paths. it has the potential to contribute to interdisciplinary exchanges. fast retrieval of a wide range of materials inevitably brings to the scholar’s attention a variety of materials. study participants frequently searched the internet to find and confirm information in a broad area of interest. interdisciplinary investigations are divergent in the sense that they spread out to other disciplinary fields, but they enable the convergence of information and ideas.
as in many other aspects of interactions with e-texts, netchaining does not include completely new information behaviors that could not be compared with anything that was done in traditional ways. however, the ways in which various information-seeking and networking practices come together for various purposes reveal a new pattern of electronic interactions.

exploration: converging ideas

the multiplicity of sources, formats, and textual information that could be quickly brought together forms a basis of exploration that allows scholars to see different meanings and aspects of the topic. in the process of exploring traditional and nontraditional sources and experimenting with different approaches online, connections and patterns emerge. the participants in the study, particularly if they were working on a relatively new topic, started with an internet search to learn about the background of a topic, discover main bodies of materials, and build a bibliography. after the initial orientation, scholars started exploring their research questions by investigating events, particular works, issues, and ideas in depth. the convergence of ideas happened during online exploration of large amounts of materials as well as by interrogating a limited range of texts. exploration of patterns and connections is a prominent online practice described by researchers in historical studies. the researchers searched for a variety of materials from different sources to build a profile of the topic. critically important aspects of the search are search engines, which retrieve information in a systematic way and provide access to a wide variety of genres. participant / commented that “oral history often gets captured in the blogging culture.” searching for information on a historical personality retrieves blog entries as well as “local historical society online postings about him.
there you’ve got a whole range of different registers of different genres of text all turning up in the electronic version” (participant / ). newly established connections point to other possible directions, and the scholar keeps moving “backwards and forwards between a whole range of sources” (participant / ), including digital and analog materials. electronic access to large amounts of materials from different sources allows a scholar to make comparisons and see connections, which was not possible before: “and we wouldn’t actually have imagined making those sorts of links because it wouldn’t be simple to do, so we wouldn’t have even bothered” (participant / ). although some scholars thought that digital environments were not conducive to browsing and serendipitous discovery, others found that they allowed new forms of serendipity to emerge. one participant is a researcher in both literary and historical studies, who sometimes combined these disciplines in the same project. he described a search during which he had two browsers open while working on a historical project, when he noticed what he thought to be surprising information related to another research interest in literary studies. to divide the two threads suddenly emerging, he opened other windows to follow the second thread. finally, he downloaded a large number of poems related to the literary project for future investigation. these poems were not available in hard copy. in the process, he realized that there was a strong connection between the two different research interests. the researcher found “that link logic, that hypertextual kind of logic helps one understand relationships . . . quite well” (participant / ). the participant talked about the value of this discovery: “and i find this again and again, that investigating the gardens . . .
leads quite quickly to investigating other aspects of the zen aesthetics, for example, so i could start with the gardens and within three or four minutes end up looking at a black american’s poetry writing. but it is very strongly related, and i love that sense of how, if i would have just said to you, ‘i’ve got a couple of topics i’m interested in, zen gardens and the late writing of [author],’ we wouldn’t have seen a connection at all.” serendipitous discovery also happens by opening several windows on the computer desktop. there are similarities between serendipitous discovery that happens in browsing books or other media and serendipity occurring when someone works with several computer windows. library books are organized by subject, so shelves collocate books that may be relevant to someone working on that subject. windows contain the content related to the scholar’s research interests, and they collocate materials around the researcher’s sense of what is relevant. these examples illustrate the convergence of the parallel flows. the new ways of working, which rely on quick retrieval of large amounts of materials and searching of patterns, encourage a new type of investigation. some serendipitous discoveries emerge in moving quickly through large amounts of diverse materials when juxtaposition of ideas and information triggers novel combinations. interrogation of textual databases or a limited range of texts to explore research questions was described by scholars in both literary and historical studies. although it is possible to work in this manner online, all participants who referred to the interrogation of textual databases worked offline. similar to explorations of large amounts of diverse materials, in-depth exploration of a small number of texts allows researchers to investigate connections.
participant / , for example, used a database of primary materials to explore links between certain concepts and to prove his hypothesis that a widely held view in his field was not quite correct. his ideas came from the field of cognitive psychology, and the textual database helped him to apply these ideas to the field of studies in religion. it was very difficult, if not impossible, to prove his understanding by using more traditional research methods. in this case, the parallel flow of ideas in different disciplines came together assisted by the use of e-texts. interrogation of textual databases was also used to establish connections between different texts. participant / downloaded a particular version of the bible and the book of common prayer to produce concordances and explore connections between these texts. participant / prepared a database of literary motifs and plot summaries to support her investigation of connections between literary texts. as participant / said, interrogation of e-texts made it easier to “understand relationships amongst bodies of knowledge.”

perceptions of the library role

the university library was seen as a major factor in promoting e-texts. the researchers observed organizational encouragement in the way libraries worked: libraries preferred electronic forms, there seemed to be money for electronic resources, and library staff sent notifications about sources and organized seminars for academics. considering participants’ responses, there may have been significant differences in the way different libraries or even subject librarians communicated with academics.
although no one criticized librarians, some participants emphasized the importance of the services they received from their subject librarians (e.g., circulated information about new sources and suggestions), while others discussed difficulties they had in learning how to work with e-texts in the absence of any training or workshop organized through their institution. the range of available sources differed between participants’ institutions. the participants described as very helpful the initiatives that made some expensive sources available to several universities through academic networks.

in the complex process of evaluating the trustworthiness of e-texts, researchers valued access to e-texts that had been digitized or selected by trusted libraries. they also preferred institutional support when they used new software and tools because it was easier to work with new technologies if training and technical support were available at the workplace.

discussion and conclusions

the internet gathers a diverse range of material types that do not coexist in any single physical collection. online interactions further promote the convergence by blurring the boundaries between a variety of media and formats. different understandings about what constitutes scholarly evidence influence scholars’ decisions on how to approach the diversity of online materials, but, at the same time, available materials shape the understanding of the topic and selection of the evidence.

the study has provided evidence of the contribution of scholars’ interactions with e-texts to intellectual aspects of the research process. scholars work with e-texts in ways that employ traditional behaviors, and some academics engage in new information and research practices. chaining is a behavior that has been transferred and sometimes transformed in electronic environments.
serendipity is still an important part of information encounter, but it may take new forms enabled, for example, by netchaining and the way in which computer windows present information. online practices promote convergences based on information discovery and informal communication, which may include members of the community of practice as well as anyone who may be related to a research interest or an information trajectory. research libraries have a significant role to play in supporting these changes by providing sources, expert advice, and technical support.

the study confirmed indications from the literature that ict is embedded in working practices, which often makes it invisible [ ]. this has implications both for methodologies of studies that aim to investigate the use of ict in general and e-texts in particular and for the design of electronic environments.

the convergence of formats, media, and practices points toward the development of new settings, which would allow further convergence of existing online communication and information environments. these new environments need to provide for divergent and parallel movements as well. investigations of how to map and support existing tendencies and how to apply them in new ways need to consider settings for a range of disparate activities including online communication, search of large multidisciplinary repositories, and textual analysis, as well as explorations involving work with software tools and multimedia productions. these new settings will enable further change and transformations of academic research and will encourage new research directions. they will provide conditions for predictable fishing and for navigating on the open seas.

references

. immroth, john phillip.
“information needs for the humanities.” in information science: search for identity; proceedings of the nato advanced study institute in information science held at seven springs, champion, pennsylvania, august – , edited by anthony debons, pp. – . new york: marcel dekker, . . wiberley, stephen e., and jones, william g. “time and technology: a decade-long look at humanists’ use of electronic information technology.” college & research libraries , no. ( ): – . . mcgann, jerome. radiant textuality: literature after the world wide web. new york: palgrave, . . brockman, william s.; neumann, laura; palmer, carole l.; and tidline, tonyia j. “schol- arly work in the humanities and the evolving information environment.” digital li- brary foundation, council on library and information resources, washington, dc, . http://www.clir.org/pubs/abstract/pub abst.html. . andersen, deborah lines, ed. digital scholarship in the tenure, promotion, and review process. armonk, ny: m. e. sharpe, . . massey-burzio, virginia. “the rush to technology: a view from the humanities.” library trends , no. ( ): – . . gilmore, matthew b., and case, donald o. “historians, books, computers, and the li- brary.” library trends , no. ( ): – . . houghton, john w.; steele, colin; and henty, margaret. “changing research practices in the digital information and communication environment.” department of educa- tion, science and training [canberra?], . http://www.dest.gov.au/sectors/research _sector/publications_resources/profiles/changing_research_practices.htm. . the british academy. “e-resources for research in the humanities and social sciences: a british academy policy review.” the british academy, london, . http://www .britac.ac.uk/reports/eresources/report/eresources-pdf.pdf. . american council of learned societies’ commission on cyberinfrastructure for human- ities and social sciences. 
“our cultural commonwealth: the report of the american council of learned societies’ commission on cyberinfrastructure for humanities and social sciences.” acls, new york, . http://www.acls.org/cyberinfrastructure/acls.ci .report.pdf. . brogan, martha l., and rentfrow, daphnée. “a kaleidoscope of digital american lit- erature.” council on library and information resources, digital library federation, wash- ington, dc, . http://www.clir.org/pubs/abstract/pub abst.html. . graham, suzanne r. “historians and electronic resources: a citation analysis.” journal of the association for history and computing , no. ( ). http://mcel.pacificu.edu/jahc/ /issue /works/graham/. . graham, suzanne r. “historians and electronic resources: a second citation analysis.” journal of the association for history and computing , no. ( ). http://mcel.pacificu.edu/ jahc/ /issue /articles/graham/. . palmer, carole l. “scholarly work and the shaping of digital access.” journal of the american society for information science and technology , no. ( ): – . . ellis, david. “modeling the information-seeking patterns of academic researchers: a grounded theory approach.” library quarterly , no. ( ): – . . stone, sue. “humanities scholars: information needs and uses.” journal of documentation , no. ( ): – . . delgadillo, roberto, and lynch, beverly p. “future historians: their quest for infor- mation.” college & research libraries , no. ( ): – . humanities scholars and e-texts . duff, wendy m., and johnson, catherine a. “accidentally found on purpose: information- seeking behavior of historians in archives.” library quarterly , no. ( ): – . . foster, allen, and ford, nigel. “serendipity and information seeking: an empirical study.” journal of documentation , no. ( ): – . . siegfried, susan; bates, marcia j.; and wilde, deborah n. “a profile of end-user searching behavior by humanities scholars: the getty online searching project report no. 
.” journal of the american society for information science , no. ( ): – . . cole, charles. “inducing expertise in history doctoral students via information retrieval design.” library quarterly , no. ( ): – . . tibbo, helen r. “primarily history: historians and the search for primary source ma- terials.” paper presented at the association for computing machinery/institute of elec- trical and electronics engineers joint conference on digital libraries, portland, or, . . tibbo, helen r. “primarily history in america: how u.s. historians search for primary materials at the dawn of the digital age.” american archivist (spring/summer ): – . . becher, tony. academic tribes and territories. milton keynes: the society for research into higher education, . . weedman, judith. “on the ‘isolation’ of humanists: a report of an invisible college.” communication research , no. ( ): – . . talja, sanna; savolainen, reijo; and maula, hanni. “field differences in the use and perceived usefulness of scholarly mailing lists.” information research , no. ( ). http://informationr.net/ir/ - /paper .html. . genoni, paul; merrick, helen; and willson, michele. “the use of the internet to acti- vate latent ties in scholarly communities.” first monday , no. ( ). http:// firstmonday.org/issues/issue _ /genoni/index.html. . star, susan leigh; bowker, geoffrey c.; and neumann, laura j. “transparency beyond the individual level of scale: convergence between information artifacts and commu- nities of practice.” in digital library use: social practice in design and evaluation, edited by ann p. bishop, nancy a. van house, and barbara pfeil buttenfield, – . cambridge, ma: mit press, . . porter, sarah. “reports from the front: six perspectives on scholar’s information re- quirements in the digital age.” new review of academic librarianship ( ): – . . ruhleder, karen. “reconstructing artifacts, reconstructing work: from textual edition to on-line databank.” science, technology and human values , no. 
( ): – . . flanders, julia. “scholarly research and electronic resources.” wwp newsletter , no. ( ). http://www.wwp.brown.edu/project/newsletter/vol num /scholarly .html. . duff, wendy m., and cherry, joan m. “use of historical documents in a digital world: comparisons with original materials and microfiche.” information research , no. ( ). http://informationr.net/ir/ - /paper .html. . cherry, joan m., and duff, wendy m. “studying digital library users over time: a follow-up survey of early canadiana online.” information research , no. ( ). http:// informationr.net/ir/ - /paper .html. . noguchi, sachié. “assessing users and uses of electronic text: in case of the japanese text initiative, japanese classics electronic text on the world wide web.” phd diss., university of pittsburgh, . . gadamer, hans-georg. truth and method. nd rev. ed. new york: continuum, . . mckechnie, lynne; julien, heidi; pecoskie, jennifer l.; and dixon, christopher m. “the presentation of the information user in reports of information behaviour research.” information research , no. ( ). http://informationr.net/ir/ - /paper .html. the library quarterly . schwandt, thomas a. dictionary of qualitative inquiry. nd ed. thousand oaks, ca: sage, . . strauss, anselm l. qualitative analysis for social scientists. cambridge: cambridge university press, . . strauss, anselm, and corbin, juliet. basics of qualitative research: techniques and procedures for developing grounded theory. nd ed. thousand oaks, ca: sage, . . strauss, anselm, and corbin, juliet. “grounded theory methodology: an overview.” in strategies of qualitative inquiry, edited by norman k. denzin and yvonna s. lincoln, pp. – . thousand oaks, ca: sage, . . glaser, barney g. doing grounded theory: issues and discussions. mill valley, ca: sociology press, . . denzin, norman k. the research act: a theoretical introduction to sociological methods. en- glewood cliffs, nj: prentice-hall, . . lincoln, yvonna s., and guba, egon g. 
naturalistic inquiry. beverly hills, ca: sage, . . maxwell, joseph a. “understanding and validity in qualitative research.” in the qualitative researcher’s companion, edited by a. michael huberman and matthew b. miles. thousand oaks, ca: sage, . . wilson, christopher john, and zhang, yu. “comparison between experiment and computer modeling of plane-strain simple-shear ice deformation.” journal of glaciology , no. ( ): – . http://web.earthsci.unimelb.edu.au/wilson/ice /evolution.html. . rastier, françois. meaning and textuality. translated by frank collins and paul perron. toronto: university of toronto press, . . ricoeur, paul. a ricoeur reader: reflection and imagination. new york: harvester wheatsheaf, .

sustainability strategies for digital humanities systems

claes neuefeind, philip schildkamp, brigitte mathiak, unmil karadkar, johannes stigler, elisabeth steiner, gunter vasold, fabio tosques, arianna ciula, brian maher, greg newton, stewart arneil, martin holmes

cologne center for ehumanities, university of cologne, germany
data center for the humanities, university of cologne, germany
centre for information modelling, university of graz, austria
king’s digital lab, king’s college london, united kingdom
humanities computing and media centre, university of victoria, canada

now that the digital humanities (dh) are becoming a well-established research field, producing seminal publications in print as well as digital formats, the time for consolidation has come. it is noteworthy that digital tools and methods from the pioneering days of the dh are degrading and some have already vanished. therefore, it is urgent to take action and to prevent further losses.
while high-quality research data management (rdm) is encouraged or even required by funding agencies, and awareness of long-term archiving (lta) of primary research data is increasing, the structural deficit of the dh in maintaining and preserving research software is at the very least underestimated.

in this panel, we will focus on infrastructure and institutional support. beginning with an overview of existing strategies from the dh and beyond, we highlight selected strategies to compare how they are implemented at different institutions in terms of infrastructure, expert knowledge and also funding. we also want to evaluate the extent of institutional support that is needed to successfully sustain and archive dh projects and the software they use. we will discuss currently implemented solutions to maintain and preserve research projects and software, each of which approaches the outlined problem from a different angle.

sustainability strategies in dh and beyond (brigitte mathiak, data center for the humanities, university of cologne)

sustainability of research software is an important issue for the dh. in our investigation of the “digital scholarly editions” online catalogue, we compared the time stamps of the last seen version on the internet archive with the first seen version (schildkamp & mathiak, ). we discovered that of digital editions, had disappeared (cf. fig. ). the average life time is . years, while the half life time is about years. we expect that other dh projects exhibit similar trends. the reasons for the disappearance of these valuable research resources are manifold: diminishing funding, lack of institutional support and, over time, lack of personnel support as researchers switch career paths or research directions. the “digital dark age” (whitt, ) affects not only our digital cultural heritage, but also the born-digital outcomes of scholarly labor.
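the comparison of first-seen and last-seen time stamps described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the record layout, the function names (`lifetime_years`, `summarize`), and the sample values are hypothetical and are not taken from the schildkamp & mathiak data set, and the panel text does not say how its "half life time" was calculated, so the median below is just one possible summary statistic.

```python
from datetime import date, timedelta
from statistics import mean, median

DAYS_PER_YEAR = 365.25  # average year length, leap days included


def lifetime_years(first_seen: date, last_seen: date) -> float:
    """Years between the first and the last archived snapshot."""
    return (last_seen - first_seen).days / DAYS_PER_YEAR


def summarize(editions):
    """editions: iterable of (name, first_seen, last_seen, still_online)."""
    editions = list(editions)
    # Only editions that are no longer online have a completed lifetime.
    vanished = [e for e in editions if not e[3]]
    lifetimes = [lifetime_years(first, last) for _, first, last, _ in vanished]
    return {
        "total": len(editions),
        "vanished": len(vanished),
        "mean_lifetime_years": mean(lifetimes) if lifetimes else None,
        "median_lifetime_years": median(lifetimes) if lifetimes else None,
    }


# Hypothetical sample records, invented for illustration only.
start = date(2000, 1, 1)
SAMPLE = [
    ("edition-a", start, start + timedelta(days=1461), False),  # 4 years
    ("edition-b", start, start + timedelta(days=2922), False),  # 8 years
    ("edition-c", start, start + timedelta(days=4383), True),   # still online
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

editions that are still online are left out of the lifetime figures here because their lifetimes are not yet complete; a fuller analysis would have to treat them as censored observations.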
figure : life time of digital scholarly editions

the problem of sustainability is neither unknown nor without solutions. several different models have been explored within the dh community. these include the development of centers such as chnm, consortia such as europeana, hathitrust, and dpla, as well as community partnerships such as samvera (previously hydra) and islandora. individual institutions such as those represented on this panel have taken up responsibility for the resources that were placed in their care. yet there is a dazzling variety of strategies, technologies, and policies that have been adopted in pursuit of this elusive sustainability, e.g. code archiving, open source dissemination, duplication, sandboxing, refactoring, unified tech stacks, virtual research environments, virtualization, and use of the internet archive.

while it is clearly easier to prepare a project for sustainability in the planning stage, advice for enhancing sustainability is divergent, ranging from using simple technology to adopting someone’s preferred infrastructure or particular documentation practices. many completed projects do not have a sustainability strategy, either because they were too old or because they were too optimistic. what happens to these projects is often determined by funding and institutional support. the luxury version is a complete redesign with all the newest bells and whistles, but there are also cheaper strategies, such as putting the system in a sandbox, or relying on the internet archive.

however, the problem of sustainability is not unique to the dh. basic sciences (biology and physics), atmospheric and space sciences, as well as geosciences are some of the disciplines that are developing sustainability-enhancing mechanisms.
in conjunction with funding agencies such as the national science foundation, researchers in these disciplines have attempted approaches such as community engagement in software and schema development (specify), ongoing external funding for maintenance (rather than only for new research), long-term funding arcs (nsf centers), funding agency mandates (contribution of digitized data to existing repositories), efforts to desilo or integrate resources (idigbio, iplant), and institutional support for pre-publication drafts (arxiv). we will explore the breadth of these approaches as well as the expected and actual impact of these strategies on the sustainability of products that are critical for scholarship in these disciplines, and connect the dots by drawing parallels to the dh.

tosca-based application management (claes neuefeind and philip schildkamp, cologne center for ehumanities/data center for the humanities, university of cologne)

the university of cologne’s data center for the humanities (dch) is obliged to concern itself with the sustainability of all digital artifacts produced during (digital) humanities projects, e.g. those run by the cologne center for ehumanities (cceh). as such, it is committed to the long-term preservation not only of data but also of so-called “living systems” (sahle & kronenwett, ). to meet this need, the dch is currently engaged in the dfg-funded project “sustainlife - sustaining living digital systems in the humanities” (neuefeind et al., ), conducted in cooperation with the institute of architecture of application systems (iaas) of the university of stuttgart. the project aims to adapt the “topology and orchestration specification for cloud applications” (tosca) standard (oasis, and ) to the field of digital humanities.
as an industry standard focussed on the deployment and maintenance of complex software services, tosca allows applications to be modelled as abstract topologies consisting of reusable components, while avoiding any kind of vendor or technology lock-in. through this meta-modelling of software components, the deployment context can be adjusted easily (e.g. deployments geared towards openstack can easily be adjusted towards docker, vmware vsphere, etc.). the reusability of these components also creates synergies that lower the overall administrative costs of long-term archiving and deployment of research applications.

in our contribution to the panel, we will present the methodological concept of our approach based on the opentosca ecosystem (breitenbücher et al., ), an open-source implementation of the tosca standard, as well as a distinct set of use case implementations conducted within the sustainlife project. the use cases to be presented will cover some of the typical technology stacks in the dh. foremost, the ( ) earlycinema use case stands for one of the most common technology stacks: lamp (linux, apache, mysql, php). further, the ( ) autopost and ( ) tiwolij use cases employ the popular java framework spring(boot) with a mysql database as persistence layer. the ( ) vedaweb use case, also implemented using spring(boot) but persisting data in mongodb, employing elasticsearch as indexing service and shipping a reactjs frontend, represents one of the more specialized stacks. lastly, the ( ) musical competitions database is the most specialized use case, as it depends on older versions of couchdb for data persistence and elasticsearch for indexing persisted data (neuefeind et al., ).
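the idea of describing an application as an abstract topology of reusable components, and of retargeting it to another deployment context, can be illustrated with a small python sketch. this is not tosca syntax and not the opentosca api: the `Component` class, the functions `deployment_order` and `retarget`, and all component names are hypothetical, chosen only to mirror a lamp-style stack like the one mentioned above.

```python
from dataclasses import dataclass, replace
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Optional, Tuple


@dataclass(frozen=True)
class Component:
    """One reusable node of the topology (hypothetical model, not TOSCA)."""
    name: str
    kind: str                        # free-form label, e.g. "database"
    hosted_on: Optional[str] = None  # name of the component this one runs on
    connects_to: Tuple[str, ...] = ()


def deployment_order(topology):
    """Return component names so that every dependency comes first."""
    graph = {}
    for component in topology:
        dependencies = set(component.connects_to)
        if component.hosted_on:
            dependencies.add(component.hosted_on)
        graph[component.name] = dependencies
    return list(TopologicalSorter(graph).static_order())


def retarget(topology, old_kind, new_kind):
    """Swap the infrastructure kind; all other components stay unchanged."""
    return [replace(c, kind=new_kind) if c.kind == old_kind else c
            for c in topology]


# Hypothetical LAMP-style topology, invented for illustration.
LAMP = [
    Component("host", "openstack-vm"),
    Component("mysql", "database", hosted_on="host"),
    Component("apache", "web-server", hosted_on="host"),
    Component("php-app", "application", hosted_on="apache",
              connects_to=("mysql",)),
]

if __name__ == "__main__":
    print(deployment_order(LAMP))
    print(retarget(LAMP, "openstack-vm", "docker-container")[0])
```

in this sketch only the bottom-most host component changes when the deployment target changes, which is the property the paragraph above attributes to modelling applications as topologies of reusable components.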
gams: geisteswissenschaftliches asset management system (unmil karadkar, johannes stigler, elisabeth steiner, gunter vasold, and fabio tosques, centre for information modelling, university of graz)

recognizing the problems inherent in conducting digital humanities research based on stand-alone, custom software, the centre for information modelling has developed, maintained, and enriched gams, a modular, standards-based, community-used software, since the early ’s, gaining over years of experience in sustaining a digital scholarship infrastructure. the gams infrastructure is supported by ongoing relationships with researchers, personnel, processes, and certifications that inform a holistic, long-term sustainability philosophy. thus, gams embodies a strategy for digital preservation that has been hardened through software upgrades, continuous use, and external testing. gams hosts over , compound digital objects and supports over digital humanities projects.

● infrastructure: the gams software is developed using open software and platform-independent standards. these include fedora (a flexible open repository infrastructure), blazegraph (a standards-based, high-performance graph database), handle (a persistent identifier service), postgresql, apache cocoon, apache lucene, apache solr, and the loris iiif image server. gams was initially developed using fedora . and, over the years, has been migrated to fedora . . the gams team has developed oais-compliant workflows in order to support long-term preservation. data stored in gams is subject to fair data principles. currently, the gams team is updating the backend to fedora . . this upgrade presents unique challenges, as fedora has outsourced the notion of content models since version . and model compliance must now be handled in the application layer.
the modularity of the gams architecture facilitates such an upgrade, as the java-based cirilo client supports the management of a legacy layer while migrating to a rest-api-based interface. cirilo is developed as open source and is available for download via github. the user interface layer of the gams web interface is based on web technology standards, such as xml and xslt, that separate structure from content and enable multiple, context-specific renditions of web-based information.

● relationships: in order to ensure continued relevance, the gams team partners with humanities researchers. gams receives and stores data in recognized archive-compliant standards such as jpeg , tiff, tei, and lido. in addition to providing interfaces for tasks such as the upload, management, description, presentation, and dissemination of digital objects, the team consults with research partners about issues such as document digitization, ingest, description, and management, developing custom workflows, data models, deposit agreements, data management plans, and publication pipelines as necessary. developed tools and techniques are available for other projects, thus enriching gams as well as the digital research environment for humanists.

● personnel: continuity of people is often correlated with the availability of infrastructure and data. in order to ensure long-term availability of the data as well as services, the centre invests in project staff for tasks such as software development, infrastructure management, processes, workflows, and content model design, as well as for document and metadata enrichment.
● certifications: to demonstrate our commitment to long-term preservation and to assure (potential) partners of this commitment, gams has undergone rigorous evaluation and has been certified as a trusted digital repository (since ), carries the coretrustseal (since ), and is registered with the registry of research data repositories (roar). the team is currently working to certify the gams repository as a clarin center.

managing dh legacy projects and building new ones: a pragmatic and holistic approach to archiving and sustainability (arianna ciula and brian maher, king’s digital lab, king’s college london)

the contribution of king’s digital lab (kdl, king’s college london) to archiving and sustainability practices in digital humanities (dh) will be presented along the following dimensions:

● (human) sustainability of expertise: as generational change occurs and in line with reorientations across the dh community (see boyles et al., ), it has become increasingly clear that the surest way to sustainability is to ensure continuity of technical expertise, domain knowledge, and tacit understanding. kdl conceived and adopted a relatively flexible model with a defined career development document and research software engineering (rse) role definitions (smithies, ).

● (technical) sustainability of systems and technical stack: the second dimension needed to sustain the dh tradition and fulfil kdl’s mandate to increase digital capability across the arts & humanities is caring for the cluster of technical systems comprising hardware and software, web servers, network infrastructure, application frameworks, programming languages, tools (for project, data and code management), and equipment.
in practice, sustainable management of lab projects required the adoption of limited server and development environment stacks, in a move away from the more flexible but difficult to manage environment used in earlier eras (for more details of the tools used to support the stack see https://stackshare.io/kings-digital-lab).

● (operational) post-project integrated in the lab software development lifecycle: the techniques used to manage kdl’s rich and heterogeneous estate of legacy projects matured into an ongoing process of archiving and sustainability tailored to the lab’s historical, technical and business context. it is applied to new as well as legacy projects, in a manner that ensures systems as well as data are maintained throughout defined life-cycles (king’s digital lab, ). to control this, open-ended service level agreement (sla) contracts are offered to principal investigators (pis) of collaborative research projects to secure maintenance of legacy projects in their live state; however, other options for archiving are also possible and assessed (see also smithies et al. and https://dev.clariah.nl/files/dh /boa/ .html). to make the overall approach sustainable, it had to be integrated into the lab’s software development lifecycle (sdlc; see https://kingsdigitallab.github.io/sdlc-for-rse/), and in so doing align with kdl infrastructure architecture and core technical stack, while at the same time informing practices of forward planning for new projects.

kdl’s contribution to the panel will reflect on how alignment across these three layers raises challenges but also lays the foundations for the sustainability of the lab’s ecosystem, hopefully offering a reference for others to reflect upon, adapt and improve.
keeping it simple and straightforward (greg newton, stewart arneil, martin holmes, humanities computing and media centre, university of victoria)

the university of victoria long ago demonstrated its commitment to dh research by providing base-budget funding for the five-person humanities computing and media centre, a department in the faculty of humanities. as can be seen from the name, hcmc actually pre-dates the term digital humanities. as a base-budget funded department, hcmc has the capacity to take on projects regardless of their level of funding (we regularly take on projects with no funding at all) and the commitment to support the project's outputs in perpetuity. this is only possible because a critical mass of professors and executives has seen the value of this work over time.

for over twenty years hcmc has been consulting on and developing web applications in support of teaching and research. on behalf of our academic collaborators we work closely with library and systems colleagues who take primary responsibility for archiving and technical infrastructure, respectively. this is a strategic division of labour entailing ongoing communications with the benefits of specialization and scale.

over the years we have come to recognize the inherent dangers of creating teetering stacks of complicated, fashionable technology that cannot stand the test of time. experiments with several content management systems, javascript libraries, and so forth have invariably led us to the conclusion that the long-term cost of coping with breakage and security problems outweighs the short-term value these applications and libraries offer. while we provide institutional support, we are not keen on a never-ending cycle of upgrades and code maintenance. to mitigate this we have become staunch supporters of kiss; in our case it might stand for "keep it simple and straightforward".
we take on very few projects that we did not develop, and when we do they are usually converted to a static site and archived or completely re-written. our project endings survey and interviews have made us doubly aware of the potential for catastrophe when using technology that is not proven to be simple and durable. from our perspective every project will benefit from adopting a kiss strategy, but perhaps especially those projects without institutional support.

references

arneil, stewart, martin holmes and greg newton. . “project endings: early impressions from our recent survey on project longevity in dh.” digital humanities conference, utrecht, netherlands. july . https://dev.clariah.nl/files/dh /boa/ .html

boyles, christina, carrie johnston, jim mcgrath, paige morgan, miriam posner, and chelcie rowell. . ‘precarious labor in the digital humanities – dh ’. in digital humanities : book of abstracts / libro de resúmenes, edited by Élika ortega, glen worthey, isabel galina, and ernesto priani, – . mexico city: red de humanidades digitales a.c. https://dh .adho.org/precarious-labor-in-the-digital-humanities/.

breitenbücher, uwe, endres, c., képes, k., kopp, o., leymann, f., wagner, s., wettinger, j., zimmermann, m. . ‘the opentosca ecosystem. concept & tools’. in: european space project on smart systems, big data, future internet - towards serving the grand societal challenges - volume : eps rome . scitepress, pp. - .

king’s digital lab. . ‘archiving and sustainability | king’s digital lab’. king’s digital lab. . https://www.kdl.kcl.ac.uk/our-work/archiving-sustainability/.

neuefeind, claes, lukas harzenetter, philip schildkamp, uwe breitenbücher, brigitte mathiak, johanna barzen, frank leymann. . ‘the sustainlife project – living systems in digital humanities’.
in: proceedings of the th advanced summer school on service-oriented computing (summersoc ) (ibm research report rc ), pp. - .

neuefeind, c., schildkamp, p., mathiak, b., marčić, a., hentschel, f., harzenetter, l., breitenbücher, u., barzen, j., and leymann, f. ( ). sustaining the musical competitions database. a tosca-based approach to application preservation in the digital humanities. in: book of abstracts of the th digital humanities conference (dh ), https://dev.clariah.nl/files/dh /boa/ .html

oasis. . topology and orchestration specification for cloud applications version . , http://docs.oasis-open.org/tosca/tosca/v . /tosca-v . .html

oasis. . tosca simple profile in yaml version . , http://docs.oasis-open.org/tosca/tosca-simple-profile-yaml/v . /tosca-simple-profile-yaml-v . .html.

sahle, patrick and simone kronenwett. . ‘jenseits der daten. Überlegungen zu datenzentren für die geisteswissenschaften am beispiel des kölner data center for the humanities’. in: libreas. library ideas # , pp. - .

schildkamp, philip and brigitte mathiak. . ‘overview of life and death of digital scholarly editions’ [data set]. zenodo. http://doi.org/ . /zenodo.

smithies, james. . ‘the continuum approach to career development: research software careers in king’s digital lab’. king’s digital lab - thoughts and reflections from the lab (blog). february .
https://www.kdl.kcl.ac.uk/blog/rse-career-development/​. smithies, james, anna maria sichani, carina westling, pam mellen, and arianna ciula. . ‘managing digital humanities projects: digital scholarship & archiving in king’s digital lab’. digital humanities quarterly. http://www.digitalhumanities.org/dhq/vol/ / / / .html​. whitt, richard. . ‘through a glass, darkly’ technical, policy, and financial actions to avert the coming digital dark ages, santa clara high tech. l.j. . available at: http://digitalcommons.law.scu.edu/chtlj/vol /iss / https://www.kdl.kcl.ac.uk/blog/rse-career-development/ http://www.digitalhumanities.org/dhq/vol/ / / / .html http://digitalcommons.law.scu.edu/chtlj/vol /iss / uc berkeley uc berkeley previously published works title building a research data management service at uc berkeley permalink https://escholarship.org/uc/item/ xn d authors wittenberg, jamie elings, mary publication date - - peer reviewed escholarship.org powered by the california digital library university of california https://escholarship.org/uc/item/ xn d https://escholarship.org http://www.cdlib.org/ building a research data management service at uc berkeley jamie wittenberg and mary elings abstract: uc berkeley’s library and the central research information technologies unit have collaborated to develop a research data management program that leverages each organization's expertise and resources to create a unified service. the service offers a range of workshops, consultation, and an online resource. because of this collaboration, service areas that are often fully embedded in it, like backup and secure storage, as well as services in the library domain, like resource discovery and instruction, are integrated into a single research data management program. this case study discusses the establishment of the program, the obstacles in implementing it, and outcomes of the collaborative model. 
keywords: data services, lis as a profession, academic libraries

context: establishing the partnership

the university of california at berkeley is one of the top research universities in the country, receiving over $ million in research funding last year and supporting over research centers (best college reviews, ). in addition, uc berkeley supports academic departments and programs that are home to over , graduate students, , undergraduates, and , full time faculty. this community is dispersed across over , acres in hundreds of buildings working in countless organized research units, centers, institutes, laboratories, facilities, and groups (uc berkeley, n.d.). with such an active and highly distributed research environment, the university has a significant task in providing research support to its campus community. an area of particular focus in the last year has been the adoption of open access policies that aim to make uc research outputs widely accessible. despite the adoption of the uc open access policy by the academic senate in and issuance of an expanded oa policy in , these policies did not cover data specifically. (footnote: the university of california, "open access policy for the academic senate of the university of california," http://osc.universityofcalifornia.edu/open-access-policy, (july , ). the scope of the policy only applies to scholarly articles.) as research changes and evolves, the services and needs of the community are evolving, especially with current data-driven research activities that rely on access to diverse data resources, data-intensive methods, and distributed computing tools and platforms in addition to meeting unfolding new federal requirements for data re-use and data security (ferguson et al., ). in response to this evolving environment and its needs, the research community is seeking support in managing, storing, sharing, and preserving the data they produce in order to maintain the viability, reproducibility, and re-use of research data.
tenopir et al. ( ) demonstrate in "research data management services in academic research libraries and perceptions of librarians" that technical (hands-on) research data services are less common than informational (consulting) services. this lack of technical data services in libraries may be addressed through a library and it partnership, and the uc berkeley program attempted to address both technical and informational research data needs through such a partnership. in , the uc berkeley library and research information technologies (research it) joined forces to develop a research data management program to support this need for its large and active research community. the library and research it partnership brought together two key organizations participating in the research process. research it is a unit situated in uc berkeley’s office of the cio. research it provides research computing technologies, consulting, and community for the berkeley campus. research it works in close partnership with the office of the vice chancellor for research and other campus technology services units, including the library. the uc berkeley library connects students and scholars to information and services in support of research across campus. the library seeks to select and create, organize and protect, provide and teach access to resources that are relevant to our campus programs. together, these two organizations support the depth and breadth of campus research needs, which are increasingly digital in nature. the goal of this collaborative partnership is to develop a program that will bring together the campus-wide systems and technical knowledge of research it with the research support and preservation expertise of the library. this collaboration is a change for these two organizations and represents a new way of working together where each group is contributing to the process and sharing the costs. 
it is part of a push from several campus leaders, including leadership in research it and the library, to build meaningful service collaborations between groups charged with providing campus wide services. it serves as a useful model of two large and diverse organizations taking joint ownership of a campus need, and working together to meet that need. the collaboration of the library and research it around the topic of research data management grew out of earlier work on the research & academic engagement (rae) benchmarking project ( ), which was an effort by uc berkeley’s research it group and educational technology services, with involvement from the library. (footnote: research and academic engagement (rae) benchmarking project: https://www.ets.berkeley.edu/projects/rae-services-peer-benchmarking) the benchmarking project looked at existing and planned technology services and compared them with a set of peer institutions to help berkeley develop a strategy for improving research, teaching, and learning technology support. one of the areas rae looked at was research data management, and the library and research it recognized a shared interest in this area, as well as shared expertise, that could be brought together to advance the topic and provide support services. a natural partnership flowed from the success of that project. by bringing the combined expertise of the library and research it to bear on the emerging needs around research data management, we could advance use of services supported by research it and expand adoption of research data management as part of the public facing mission of the library. with offerings like high performance computing (hpc), virtual computing environments, and infrastructure services available through research it, and the library’s focus on research support and data management, the collaborative partnership covered many of the bases in a research data management portfolio.
the more consultative role of the library and the service-oriented role of research it completed the picture in terms of a research data management program, and thus the partnership was formed.

rdm program and goals: improving campus support for research

the stated objective of this effort was to establish a program for research data management (rdm) services at the uc berkeley campus level, through a joint partnership between research it and the library. the goals in year one (january - december ) were to design and deliver workshops, develop an rdm service guide, and develop an rdm consulting service. the programmatic goal of the rdm initiative is to improve campus support for research output across all domains and subject areas, offering services around research data to help researchers steward, protect, and disseminate their data. research data includes tabular and numeric data, text, images, audiovisual content, code, or any other actionable information generated during the research process. this typically excludes administrative data like financial and student records, as well as technical data, like the operations information generated by servers and laboratory equipment. rdm supports research data across domains and organizations, particularly in the areas of planning, organization, active data management, and sharing. research is highly distributed at berkeley, and so are the services that support research. the efforts around creating a centralized rdm program can also be viewed as an attempt to knit together and coordinate a range of specialized and somewhat siloed services funded by departments, organized research units, and external “soft” money. the rdm program aims to establish workflows and policies related to activities surrounding research data at berkeley in addition to developing consulting, active data management, and training offerings.
digital humanities has been managing the bulk of data management and curation requests focused on humanities data.

contributions: library and it roles

the rdm initiative at berkeley is led by a core group consisting of leadership from both organizations, each committing one administrator to the team. the effort is managed by a team made up of a project manager, the research data management analyst, and an it project manager. the core group working under this direction includes librarians and technical staff in the library, research it staff, a staff member from the california digital library, and a staff member from the uc berkeley campus shared services - information technology group. the core group meets bi-weekly, and activities and deliverables are kept on a master calendar managed by the it project manager. meetings are led by the program manager, who also prepares the meeting agendas and keeps meeting notes. by providing for equal staffing and equal participation, the program is expected to promote equal engagement in this effort on both sides. while there are no plans to establish a separate rdm unit within either organization, the work will continue to be coordinated among library and research it staff going forward. research data management will become part of what these groups provide, and that work will be shared among the participants. as part of this effort, the library and research it agreed to share support for a full time research data management analyst who would split time between the two organizations. this position reports to both entities and has a physical space in both offices. the library has provided a space for bi-weekly meetings and workshops, which to date have largely been focused on creating a cohort among librarians. the role of the library group in the rdm program has been to bring expertise in supporting the research process.
the inclusion of librarians in the sciences, social sciences, and humanities brought a broad perspective to the core group. these participants are also part of a larger consulting network of departmental liaisons and subject specialists who are involved in research support on a day to day basis. these librarians offer support for and provide access to several data services including dash, ezid, and the dmptool, all hosted by the california digital library, another key partner in the collaboration. the role of the research it group in the rdm program is to provide direction in the areas of active data management and data security, bringing expertise in data transfer, storage, and security. research it encompasses two groups that work very closely with rdm: berkeley research computing (brc) and digital humanities at berkeley. both of these groups are actively involved in projects that support the goals of rdm. brc offers consultation and builds services related to high performance computing support and infrastructure. they are involved in experimental work on virtual workstations that are piloting solutions for rdm use cases - for example, developing an analytics environment for textual humanities data. the partnership between the library and it is critical to the success of the rdm program, as is partnership with other organizations on campus like educational technology services, the berkeley institute for data science, and the d-lab. support and participation by cdl is also central in this effort and will be increasingly important as the program moves forward.

professional culture: navigating library and it culture

the cultural differences between the library and research it organizations posed some challenges during the development of a joint program.
it is important to note that the research it group has been involved in long term work in museum informatics through the development of a collection management platform (collectionspace) and, consequently, research it has been deeply engaged in the libraries, archives, and museums space on multiple community source projects. this is highly unusual for a research computing group, and has been important in forging relationships between organizations. despite this, fundamental cultural differences between the organizations emerged. as detailed by verbaan and cox ( ) in their discussion of occupational sub-cultures in research data management collaborations, librarians and it staff have different and occasionally competing perspectives on rdm, wherein “broadly speaking, it services focused on short term data storage; research office on compliance and research quality; librarians on preservation and advocacy”. this description of focus and scope aligns with the experience of the rdm program at uc berkeley working with central it more broadly. for example, library positions, being academic, are more flexible than it staff positions, and it is not the norm for librarians to have a percentage of their position assigned to projects. in it, it is typical to have a % appointment or % appointment to a project where time spent on the project is tracked and assessed. a senior librarian provided feedback that the project had more it-focused elements than library-focused elements. perhaps one reason for this is that the time commitment of librarians is not as explicitly defined as the time commitment of it staff; there were occasional misunderstandings related to workload, role, and commitment. as a result, some work related to the rdm program skewed more toward it interests (active data management, storage) than librarian interests (scholarly communication, preservation, research).
one significant example of an area where cultural difference between the library and research it emerged was in approaches to researcher privacy. as established by the american library association ( ), “protecting user privacy and confidentiality has long been an integral part of the mission of libraries. the ala has affirmed a right to privacy since ...in keeping with this principle, the collection of personally identifiable information should only be a matter of routine or policy when necessary for the fulfillment of the mission of the library.” leadership in research it preferred that identifying information like researcher names and departments be collected and shared among other consulting groups in order to provide a higher level of coordinated service. however, the library has a more conservative stance towards information-sharing and does not systematically collect this kind of patron data. the resolution has been an endeavor to jointly draft a privacy policy.

(footnote: these organizations represent the program’s primary partners in the uc system. research it: http://research-it.berkeley.edu/. berkeley research computing: http://research-it.berkeley.edu/programs/berkeley-research-computing. digital humanities at berkeley: http://digitalhumanities.berkeley.edu/. california digital library: http://www.cdlib.org/. berkeley institute for data science: https://bids.berkeley.edu/. educational technology services: https://www.ets.berkeley.edu/. d-lab: http://dlab.berkeley.edu/)

because rdm is an emerging field, it helps to have people working on the project who have a professional development mindset.
outreach and partnership with other organizations working in research data management is crucial to providing services that are relevant. some examples of this are attending method and tool-based workshops related to scholarly communication, digital scholarship, and transparent research at uc berkeley. we found that planning in these activities was an important part of the project.

implementation: consulting, resources, and training

developing the rdm guide was the first step in preparing to launch the program. the guide is designed to serve as a resource for both service providers (consultants and librarians) and researchers. content for the guide was written collaboratively by members of the team, based on area of expertise. it was developed in drupal and is hosted by pantheon, a web hosting platform. the public-facing guide contains content organized loosely by stages in the research lifecycle. content consists of best practices, service offerings at uc berkeley, useful tools, and case studies. there is also a back-end to the guide, called the knowledge base, which is accessible to core team members only. the knowledge base serves as a tracking and record-keeping system that consultants use to document details of their consultations. this system is used primarily for program assessment. building the rdm guide was an important part of the program because it offered the first opportunity for research it and the library to collaborate on an enduring and publicly available rdm resource. librarians and it staff researched and wrote content together, defining the scope of the project and sharing knowledge. as the rdm guide took shape, development began on the consulting service. the rdm consulting service is supported by three ‘triage’ staff members who respond to requests and reach out to the broader consulting network to refer questions they are unable to answer.
this network includes domain specialists, data scientists, qualitative data experts, librarians, and it staff. there are many existing consulting services on uc berkeley’s campus, including those in digital humanities, berkeley research computing, the data lab, and the berkeley institute for data science. it was important that the research data management consulting service integrated well with these existing services, and this allowed the team to borrow protocols and practices from partner organizations. (footnote: researchdata.berkeley.edu)

building the consulting network was, in large part, an outreach and engagement objective. there were several individuals and groups that were already stakeholders in the rdm program who could serve as consultants, but one of the drivers for the development of the rdm program was bringing together distributed pockets of data management expertise. the consulting network was an opportunity to leverage knowledge in a range of domains, like cloud storage or metadata standards, for a research application. the first goal of the rdm program was to train the staff that would make up the consulting network. staff training for the rdm initiative has focused on three major groups: partner organizations, it support, and librarians. partner organizations did not receive formal training, but were engaged through a series of meetings and presentations. following the september soft launch, rdm developed a training model targeting it support staff and librarians. this model proposed to create a cohort of early adopters who would participate in rdm training and each serve as a point person for their unit or division. cohort models have proved successful in training librarians, as demonstrated by nardine and moyo ( ) and witteveen ( ). this group of early adopters made up ‘cohort .’ central it (css-it) support staff responsibilities are location-based, and a staff member is assigned to a campus zone.
that staff member will then respond to service requests within that zone. these staff are on the front lines in terms of responding to it problems, some of which are related to research data. because it staff operate independently in this way, each zone representative was recruited for participation in cohort . a total of members of the central it group participated, including two supervisors. uc berkeley librarians typically operate within a division structure that partitions librarians and library staff based on domain. library divisions include: arts & humanities, engineering & physical sciences, instructional services, social sciences, and life & health sciences. because the university library system at berkeley comprises constituent and affiliated libraries, these divisions can contain multiple libraries. thus, the rdm team made the decision to recruit two representatives from each division that could serve as members of cohort . representatives were selected by the rdm team and division heads, based on expressed interest in rdm activities and the data-intensive nature of the librarian's role. a total of librarians participated, including four division heads. cohort members committed to a semester-long program that consisted of an orientation, a workshop, and an evaluation. the goal of the orientations was to introduce members to the need for and principles of rdm, to demonstrate the use of our online documentation, and to provide them with contacts for referral in the event that they or a colleague are asked an rdm question. four orientations were offered during fall : two for librarians, and two for it. it was important to provide training for these groups separately in order to target existing workflows and tap in to referral processes within these organizations. 
following these orientations, the rdm group presented rdm developments at a library-wide meeting and encouraged librarians and staff to seek out cohort members for more information, or with questions. the fall workshop brought all of cohort together to introduce cohort members to different aspects of rdm and to some tools that they might find useful when engaging with researchers. the workshop began with a keynote by john chodacki, director of the california curation center (uc ) at the cdl. several of the data management tools that berkeley uses are developed and supported by the cdl, so this also provided an opportunity for relationship-building between these organizations. following the keynote, three rdm team members gave lightning talks highlighting research data management use cases. one focused on data security, one focused on writing codebooks, and one focused on data confidentiality. participants then split into small groups made up of both librarians and it staff. these groups completed an exercise that involved responding to various scenarios with research data management components. one sample scenario asked participants: “i am a researcher in agricultural economics and i have been publishing my data on my department’s password-protected server, but my department is no longer going to maintain a server. what should i do to make sure that people can still find my data?” participants then collaborated to answer the following questions:

• who in the data management consulting network could help you answer this?
• what services exist at berkeley that might provide support?
• are there data privacy or security considerations?
• are there policy, copyright, or intellectual property considerations?
• where in the rdm guide would you look for an answer?

this group exercise offered an opportunity for participants to practice working through some of the issues researchers face when interacting with data, as well as to work with their fellow cohort members.
the final element of the workshop was delivering two demonstrations of tools, both developed and supported by the cdl. the first, the dmptool, is widely used at research institutions across the united states. it offers step-by-step guidance to researchers who are completing a data management plan to fulfill the requirements of a funding organization - usually as part of a grant application. data management plan review can serve as an effective basis for librarian training in rdm (davis and cross, ). the second demonstration was of dash, an interface for data deposit into the merritt data repository. because uc berkeley does not have an institutional data repository, dash serves that function. currently, the service is free to researchers and subsidized by the university library, which makes it an attractive option for researchers who are interested in depositing their data and an important tool for cohort to be familiar with. cohort members were given access to test sites for each tool and encouraged to experiment with them. in response to feedback from cohort members, the rdm library training group, made up of librarians and it staff, shifted direction in . librarians requested training that was more nuanced, more concrete, and more directly relevant to their everyday activities. several analyses have identified liaison librarians as critical to the success of an rdm program, and liaison librarian training was thus prioritized (cox and pinfield, ; soehner et al., ). the training team developed a month, domain-based proposal for a training program for librarians. the program divided the year into two-month training cycles. each two-month training cycle targets a single domain, based on the existing library division structure. library divisions will partner with the rdm team to create specialized content relevant to their domain.
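the two-month cycle structure described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function names (`add_months`, `build_training_cycles`), the start date, and the order of divisions after social sciences are hypothetical and do not come from the program's proposal. the division names are the ones listed earlier in this case study.

```python
from datetime import date

# Library divisions as named in the text. Only the first slot (social
# sciences) is stated by the source; the remaining order is an assumption.
DIVISIONS = [
    "social sciences",
    "arts & humanities",
    "engineering & physical sciences",
    "instructional services",
    "life & health sciences",
]

CYCLE_MONTHS = 2  # each training cycle lasts two months


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that is `months` after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def build_training_cycles(start: date, divisions=DIVISIONS):
    """Assign each division one consecutive two-month cycle."""
    cycles = []
    for i, division in enumerate(divisions):
        begins = add_months(start, i * CYCLE_MONTHS)
        ends_before = add_months(begins, CYCLE_MONTHS)  # exclusive end date
        cycles.append(
            {"division": division, "begins": begins, "ends_before": ends_before}
        )
    return cycles


if __name__ == "__main__":
    # Hypothetical start date, chosen only to make the example concrete.
    for cycle in build_training_cycles(date(2000, 1, 1)):
        print(cycle["division"], cycle["begins"], cycle["ends_before"])
```

keeping the divisions in a plain list makes the schedule easy to reorder if a division needs a different slot, which seems consistent with the partnership-based planning the program describes.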
during a division’s training cycle, the rdm team and division representative(s) collaboratively build workshop curricula and deliver two workshops. a monthly “topics in research data services” series, tailored to the domains of the training cycle, will support librarians and library staff as they develop a broad understanding of the challenges researchers face and gain confidence discussing various aspects of data management and stewardship. the first training cycle focuses on the social sciences division. the curriculum was approved by the head of the social sciences division and developed in partnership with the anthropology and qualitative data librarian.

outcomes: resolving consultations, raising awareness, and training

the rdm program has been successful in several areas: raising awareness of the program among uc berkeley researchers, resolving rdm requests, training service providers in it and the library, and meeting project milestones on schedule. in the weeks between the service launch and the end of the semester, the program hosted or participated in events, ranging from invited talks, to town hall presentations, to workshops, to demonstrations of the guide. these events targeted both service providers and researchers. the program received consulting requests from departments and organized research units, of which were resolved by the end of the quarter. the majority of researchers requesting consultations were faculty and staff, closely followed by graduate students. requests from undergraduates were rare. the guide received unique visitors who viewed approximately , pages. the most frequently visited pages, after the home page, were: data management best practices, consulting, and data management planning. consultants were able to resolve many rdm questions, but several areas emerged as areas of greater need, with less support. in particular, active data management and securing research data need greater attention.
two working groups have convened to address these areas and develop recommendations. domain-based training has proved to be very successful, with high levels of participation and engagement from librarians. this training is more successful than the generalized rdm training that attempted to target service providers from all domains and organizations. (footnote: educational resources associated with the librarian training program may be found at: http://n t.net/ark:/b /d v t) project management has been a very helpful part of the program development. as we have ramped up the work, having a project manager who kept the group on target and focused on achieving goals across these two groups was very successful. sticking to a firm calendar has helped the project manager to keep the deliverables on track. with only a handful of staff with time committed in real hours (fte), other staff and librarians have had to make an effort to remain involved and committed given other priorities around their regular work. the program now serves as an organized mechanism to help us better understand future researcher and support staff needs. it will help us determine where to focus our time and resources in an essential support area that is evolving fairly rapidly. in addition, we are building the foundation for future work, which will include a broader campus launch of rdm services and the development of additional services. the rdm consulting network helps to share important information with other campus service areas, such as computation (brc) and learning analytics (ets). all of this work taken together is helping us build a broad, meaningful service collaboration between groups charged with providing campus wide services.

reflections on collaboration

a collaboration of this type is not a simple undertaking for two large, complex, and disparate organizations like the library and research it, but the shared interest in research data management support provided a common goal.
a collaboration of this type can vary widely in terms of extent and outcomes, falling along a continuum ranging from a simple interaction over a common goal to highly interdependent activities that involve shared risks and benefits. a useful model for discussing the trajectory of such partnerships is the collaboration continuum (zorich et al., ). in that model, partnerships move from basic contact through increasingly deepening relationships between the parties involved to a point of actual convergence. when a partnership reaches convergence, the collaboration is so ingrained that the parties no longer see it as a collaboration, but rather as a shared infrastructure that both parties have come to rely upon. because the library/research it partnership is a complex one, it might be instructive to look at how it has moved across this continuum.

figure : the collaboration continuum. reprinted from "beyond the silos of the lams: collaboration among libraries, archives, and museums," by zorich d, waibel g and erway r ( ) oclc research, p. . copyright by the oclc online computer library center, inc. reprinted with permission.

in the case of the rdm service model, the process began with contact between two administrative leaders of the organizations. this started with an initial meeting to explore the idea of launching an rdm service. research it has a stake in the research process from the research cyberinfrastructure (rci) side -- tools, services, and community -- and the library has a similar position in supporting the research process through instruction, research design, access to resources, and publishing expertise. when they decided to work together on developing the program, the two parties moved from the contact stage of the partnership to cooperation, which made no commitment of time, money, or space, and had nothing in writing, but was simply an informal agreement to move the partnership forward.
as the partnership progressed, the parties agreed to coordinate on writing a job description for the shared rdm analyst and putting together a working group, which required a time commitment on both sides. next they coordinated efforts to establish a calendar of work and deliverables which was managed by the assigned project manager and the leadership group. because this stage required a written agreement of how the analyst position would be shared and paid for, a commitment of some fte of a project manager to the effort, as well as a commitment of time on the part of the two parties to meet regularly, this moved the project further down the continuum toward coordination. as we see the partnership now, where we have the financial commitment of a shared position, a contribution of space where the analyst can work in each office, and written commitments of fte to the project, we have reached the higher level of collaboration. at this point there is more investment from each party and a higher level of risk than in previous stages of the partnership. should one of the partners back out of the collaboration, there would be a financial loss in staffing and time to untangle resources and dissolve written agreements. having shared communication and program management responsibilities in the project has been a key method by which uc berkeley has mitigated this risk. on the plus side, each party has gained through sharing the work towards a common goal. the library has formed relationships and gained knowledge from interactions and sharing information with the research it staff. the research it group has gained a greater understanding of the research support process and the work done by librarians in this space. this has broadened the network of consultants across the campus that both groups can reach out to for support of research needs, so the campus community will also benefit. we have learned from each other and are better at what we do as a result. 
where the work to this point was largely additive, as we have moved toward true collaboration, the work is becoming more transformative as we begin to share work and reduce duplication of effort. this stage suggests a level of trust between the partners, where they share risks and responsibilities, as well as the rewards. the rdm program at berkeley has not yet reached a point of convergence where each partner has become completely dependent on the other. this was not part of the program’s stated goals, even though it is the next logical step in a collaboration. for this to happen, the research it group and the library would need to, for example, commit resources to permanently support the shared position, or dedicate a shared space for this work, supported by shared funding. we would need to serve each other’s missions in a way that dedicates resources across the partnership, or establish a formal partnership that forms a new organization to support this work. as the collaboration moves forward, these goals may become desirable but, for now, the close partnership will continue to work toward the shared rdm goals and continue to build an extended network of partners across campus as we move down the continuum.

next steps: formalizing and scaling the program

the research it and library partnership has come a long way in terms of its collaboration in a relatively short time. the collaboration has evolved into a successful venture to date and will continue to evolve as the rdm program establishes additional trainings and workshops, continues to develop its guide to services, and continues to share expertise across the two partners, as well as the extended network of partners. as the program explores possibilities for additional services related to secure and active research data management, collaboration with other campus organizations is becoming increasingly important.
as wilson and jefferies ( : ) discuss in "towards a unified university infrastructure: the data management roll-out at the university of oxford," researchers prefer data management guidance that is specific to their discipline and methodology. this drive towards the provision of rdm services on a domain-specific basis necessitates domain expertise. this expertise does exist at uc berkeley, but it is distributed among departments, research units, administration and support teams. partnering with these organizations is necessary to provide the support that researchers are looking for.

because of the success of programs like this and a driving need for holistic solutions to research computing problems, research it is becoming increasingly involved in forging new collaborations with organizations at berkeley and with other uc campuses. this includes a pilot project for managing ocr data between research it, the library, the d-lab, and digital humanities. this project uses new analytics environments developed by berkeley research computing to make licensed ocr software available to the entire campus community. furthermore, research it is spearheading a consulting project that centers around a bi-annual consulting summit. this summit brings together consulting groups from it, educational technology, the library, the berkeley institute for data science, and the geospatial innovation facility. in addition, the library is partnering with uc san diego to deliver a data carpentry workshop for librarians. these efforts aim to promote collaboration across these groups and to improve the impact and quality of research support services.

the library role has changed in terms of being better prepared to address data management and preservation needs as part of the broader research process. we are seeing this reflected in new library positions that include digital methods and data support as part of their portfolio.
this situates these skills within the library and indicates that an rdm community is beginning to be built within the library, one which could extend to include other uc libraries. in addition, the library has gained an understanding of other services offered across campus and identified experts that can serve as partners, consultants, or referrals. in , the library and research it hope to be able to substantially support the rdm needs of campus. the rdm collaboration will continue to build relationships between it and library groups, pilot new services in active data storage, strengthen partnerships with the california digital library and researchers, and broker access to secure computing environments. by , the program will be focused on formalizing rdm efforts within the institutional structure.

funding

the author(s) received no financial support for the research, authorship, and/or publication of this article.

works cited

university of california, berkeley (n.d.) by the numbers. available at: http://www.berkeley.edu/about/bythenumbers (accessed may ).
cox a and pinfield s ( ) research data management and libraries: current activities and future priorities. journal of librarianship and information science ( ): – . doi: . /
davis hm and cross wm ( ) using a data management plan review service as a training ground for librarians. journal of librarianship & scholarly communication ( ): – . available at: http://jlsc-pub.org/articles/abstract/ . / - . /
ferguson ar, nielson jl, cragin mh, bandrowski ae and martone me ( ) big data from small data: data-sharing in the 'long tail' of neuroscience. nature neuroscience ( ): . doi: . /nn.
nardine j and moyo l ( ) learning community as a model for cultivating teaching proficiencies among library instructors – a case study. available at: http://library.ifla.org/ / / -nardine-en.pdf (accessed september ).
an interpretation of the library bill of rights ( ). american library association.
available at: http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy (accessed may ).
soehner c, steeves c, ward j, et al. ( ) e-science and data support services: a study of arl member institutions. association of research libraries. available at: http://www.arl.org/storage/documents/publications/escience-report- .pdf (accessed september ).
tenopir c, sandusky rj, allard s and birch b ( ) research data management services in academic research libraries and perceptions of librarians. library & information science research ( ): – . doi: http://doi.org/ . /j.lisr. . .
best college reviews (n.d.) the top research universities. available at: http://www.bestcollegereviews.org/top-research-universities/ (accessed may ).
verbaan e and cox am ( ) occupational sub-cultures, jurisdictional struggle and third space: theorising professional service responses to research data management. the journal of academic librarianship ( – ): – . doi: http://dx.doi.org/ . /j.acalib. . .
wilson jaj and jefferies p ( ) towards a unified university infrastructure: the data management roll-out at the university of oxford. international journal of digital curation ( ): – . doi: . /ijdc.v i .
witteveen a ( ) better together: the cohort model of professional development. library journal december, . available from: http://lj.libraryjournal.com/ / /careers/better-together-the-cohort-model-of-professional-development/ (accessed april ).
zorich d, waibel g and erway r ( ) beyond the silos of the lams: collaboration among libraries, archives, and museums. oclc research.
available from: http://www.oclc.org/content/dam/research/publications/library/ / - .pdf (accessed september ).

publications

article

preprints in scholarly communication: re-imagining metrics and infrastructures

b. preedip balaji ,* and m. dhanamjaya

indian institute for human settlements library and school of library and information science, reva university, bengaluru , india
school of library and information science, reva university, yelahanka, bengaluru , india; registrar@reva.edu.in
* correspondence: pbalaji@iihs.ac.in; tel.: + - - -

received: september ; accepted: january ; published: january

abstract: digital scholarship and electronic publishing within scholarly communities change when metrics and open infrastructures take center stage for measuring research impact. in scholarly communication, the growth of preprint repositories as a new model of scholarly publishing over the last three decades has been one of the major developments. as it unfolds, the landscape of scholarly communication is transitioning (with much being privatized as it is made open) and turning towards alternative metrics, such as social media attention, author-level, and article-level metrics. moreover, the granularity of evaluating research impact through new metrics and social media changes the objective standards of evaluating research performance. using preprint repositories as a case study, this article situates them in a scholarly web, examining their salient features, benefits, and futures.
moves towards scholarly web development and publishing on the semantic and social web with open infrastructures, citations, and alternative metrics, and how preprints advance building the web as data, are discussed. we determine that this will viably demonstrate new metrics and, by enhancing research publishing tools in the scholarly commons, facilitate various communities of practice. however, for preprint repositories to be sustainable, scholarly communities and funding agencies should support continued investment in open knowledge, alternative metrics development, and open infrastructures in scholarly publishing.

keywords: preprint repositories; scholarly publishing; scholarly communication; scholarly metrics; open infrastructures; scholarly web

introduction

electronic publishing has provided many benefits for sharing research materials online. besides the mainstream publishing in books, peer reviewed journals, and conference papers, research outputs have increased in many other forms (preprints, datasets, multimedia, and software), not only for dissemination, but also for reproducibility and replication. although publication outputs have increased in a variety of forms, preprints stand out for their “accessibility” as early disseminated versions and their “subject to review” status. they are publicly accessible and typically in line with definitions of open access, before being formally published. preprints are scientific publications that are published online and publicly accessible before peer review in a journal publication. the number of preprints [ ] and of repositories hosting them is on the rise, covering different disciplines. specifically, they are moving beyond natural sciences to social sciences and humanities, although there is widespread skepticism [ ] among scholarly communities about their acceptance and recognition for scientific validation.
publications , , ; doi: . /publications; www.mdpi.com/journal/publications

along with the growing trend for open access publishing [ ], preprint repositories have grown, “while still used for small portion of papers, provided much earlier access to scientific findings” among the scholarly communities [ , ]. as the need for access to research is felt widely, especially at an early stage to accelerate the access to new findings, many institutions and organizations have tried to establish preprint repositories alongside the scholarly publishing platforms since the late twentieth century. in one of the earliest experiments, the national institutes of health (nih) initiated a biological preprints circulation program called ‘information exchange groups’ in . since journal publishers were not accepting preprints, this was shut down in [ ]. again, in , establishing a preprint section was proposed at the nih’s public archive platform pubmedcentral. however, it was severely criticized by scientific publishers on the ground that “publishing preprints electronically sidesteps peer-review and increases the risk that the data and interpretations of a study will be biased or even wrong. . . the best way to protect the public interest is through the existing system of carefully monitored peer-review, revision and editorial commentary in journals [ ]”. this has remained one of the main debates ever since: why preprints cannot be accepted without peer review or some other feedback mechanism. notwithstanding, the growth and diversity of preprint repositories in the last two decades reveal many other reasons why they play a vital role in the scholarly publishing ecosystem, given their benefits, metrics, and risks.
arxiv was launched in and it set the trend of preprint-driven open scholarship (or e-prints server) in physics, computer science, and mathematics. in social sciences and economics, the social science research network (ssrn) was launched in and research papers in economics (repec) in . in , the social academic networking sites academia.edu and researchgate.net were launched, which had more social features and options to accept research documents at any stage. the biology preprint servers biorxiv and peerj preprints were launched in . in , chemrxiv for chemistry and socarxiv for social sciences were launched. an earth sciences preprint server, essoar, was launched by the american geophysical union in [ , ]. in addition to their principal benefit of making scholarly content available for open access, preprint repositories break through traditional barriers, paving the way for new metrics, benefits, and research impact. for these reasons, preprint repositories emerged as a key player in scholarly publishing and they will continue to be a boon as an open infrastructure for researchers.

background

in pursuit of open knowledge since the eighteenth century, scientific and scholarly communities exchanged communication without any formally integrated and holistic use of peer review [ ]. nevertheless, when the body and expanse of scholarly literature grew exponentially in the mid- to late-twentieth century with information explosion, the dissemination of current knowledge, the archiving of the canonical knowledge base, quality control of published information, assignment of priority, and credit for their work to authors, became a norm for the peer reviewing process [ ]. along the way, in scientific writing, various roles of authorship, levels of contribution, and the rules for publishing research data in the public domain, especially before the paper's release, were defined by journals.
it was the then editor of the new england journal of medicine, franz j. ingelfinger, whose ideas on “sole distribution” in his editorial in september for scientific communities became popular [ ]. subsequently, journals became the primary mode of communication well before the widespread adoption of peer review. with clear guidelines for authorship, academics and researchers primarily began communicating scientific research through peer reviewed journals for publishing their scholarship. post- s, when the internet became much more widespread, it did not disrupt scholarly publishing perhaps as much as expected, as we still have the same large players that we did in the pre-digital age dominating the landscape [ ]. it was thought that the web would kill off scholarly journals, because the cost of dissemination would plummet to near zero. however, the large publishers simply shifted the offline system online, which is why we still have things like journals, issues, articles, copyright, and metrics that are designed for a pre-web era. what this did, importantly, was to emphasize that it was publishers who were in charge, because they manage the metrics (for evaluation/reputation) and the copyrights. preprints challenge both of these things. this is perhaps a deeper significance that needs to be explored. many commercial publishers and open access mega-journals consolidated their positions as large players, even as the ties between open access (oa) and incentives and power relationships between politics, publishers, and academies increased [ ]. in addition, geographical heterogeneity and geopolitics play a larger part in it, as countries in both the global north and the global south are attempting to address issues in open access policies and the integration of nonprofit workflows into scholarly publishing. although preprints emerge as an equalizer to leverage their potential, the landscape is complex.
in the past three decades, different regions strived for distinct things to promote open access across the national, state, institutional, and sectoral levels, as is advocated in africa, china, and south asia. the efforts of scielo in latin america, of plan s, a radical open access program announced in western europe by research funders in , and of the united states of america (usa) are calling for global action towards more inclusive, open, and multilingual scholarship [ , ]. nonetheless, open source technologies and open access movements necessitated the retooling of existing scholarly processes towards openness. although open access publishing started to grow, some leading publishers, including commercial publishers, learned societies, and university presses, had actively opposed the growth of oa [ ] and were cautious about taking up the oa model of publishing until they found a way to transform it into a new business model. consequently, the period of – saw high growth of oa mega-journals, which focused on scientific trustworthiness and soundness, eschewing judgment of novelty or importance. the open access journals plos one and scientific reports now dominate this space [ , ]. breaking the conventional boundaries, digital scholarly publishers have flourished innovatively with a wide variety of repository solutions and open journal publishing platforms, testing a range of open access publishing models. this is to achieve gold open access (oa at the publishing source), green open access (self-archiving), and diamond open access (gold, but explicitly with no article processing charges), as indexed by the directory of open access journals, which also lists the oa journals that charge article processing charges. as open access publishing models, licensing options, and infrastructures are getting larger, the data and resources enrich the web towards building open data.
with the complexities of proprietary software, commercial publishers, and paywalled content already widespread, preprints entered to disrupt the scholarly communication system, making the vast amount of unpublished data and scholarly content available, regardless of the peer review process. preprints, as a leveler, enrich the scholarly web on top of the existing scholarly resources for discoverability. they do so by allowing access to not yet printed versions, timestamping ideas and findings, and adding meaning by interconnecting people, concepts, and applications [ ]. therefore, preprints play a larger role in scholarly publishing, strengthening the infrastructures of the web through linked data, scholarly-rich content, and applications.

rethinking research impact metrics

as countries, institutions, and research communities compete on the global stage to measure and evaluate their national, institutional, and research outputs, various outcome and metrics-based research frameworks assess different research activities and performance. some of those key areas are science and technology indicators, patents, bibliometrics, citations, rankings, research and development factors, measurements for innovation, and metrics for assessing the quality of scientific outputs. however, there is an increasing need to support research artefacts to be as inclusive as possible, going beyond research papers to preprints, software, code, posters, media, and datasets. scholarly activities, such as teaching and public outreach, should also be included. again, it is widely argued that the benefits of research impact should reach beyond academia to the economy, society, public policy, human development, and the environment. this refers to the strategy, resources, and the infrastructure supporting the research, as adapted in the united kingdom (uk) research excellence framework, which currently assesses the excellence of research in higher education institutions in the uk [ ].
this is more important for understanding what constitutes scholarly impact, when literature obsolescence and non-citation are rife, even with journals that maintain an impact factor of five [ , ]. journal impact factor, citescore, scimago journal rank, and source normalized impact per paper metrics for journals, and h-index, i -index, and s-index for authors, have determined and built a reputation of scientific productivity and the research impact of digital scholarship [ ]. however, there is a growing demand for other kinds of metrics, such as those at the article level and the author level, which have their own merits beyond the journal impact factor, which is an aggregate of citation count for a journal in which the work is published [ ]. though academics and scientometricians have developed many metrics to measure scientific output, whether these metrics work, are fair, or are overused needs evaluation, as citation counts have less than one percent of usage for an article [ ]. many of the metrics that exist for measuring journal quality necessitate a paradigm shift to measure author-level metrics, which essentially capture the citation-related data and the connectivity-related metrics of authors [ ]. many metrics are still focused on published, peer reviewed articles as a primary output. however, preprints and a wider diversity of processes and outputs demand that new metrics be developed beyond those for traditional outputs, and that they be applied in a responsible manner. the web also opens up a whole field of additional context to explore, and hence more ‘contextualized metrics’ are required for measuring it. moreover, defining impact in various contexts becomes extremely challenging at the academic, economic, and societal levels, given that the traditional metrics used for evaluation are deeply flawed [ ].
for example, they are being misused beyond their original intention (as with the journal impact factor), are deeply unscientific, are mostly operated by commercial entities, and are often incredibly biased in different dimensions. citation rates, journal ranks, and impact factors are inherently hierarchical and hence their institutionalization as a scientific impact assessment tool has unintended negative consequences [ , ]. it is further found that the methodological quality and reliability of published research works in several fields may be decreasing with increasing journal rank [ , ]. supporting new methods in data and scholarly publishing, the open research community must encourage publishing null results or failed experiments, against the backdrop of a growing body of evidence questioning the conventional forms of impact assessment, which insist on quantifying research outputs and cannot capture diverse, wide-ranging, and inclusive research impact [ ]. moreover, the relevancy of citations and impact factor is widely questioned for their role in problem-solving and societal impact [ ].

for a long time, open access has seen a great push through academics, policy making, science communication, and so on, and preprints add to this environment as an additional layer that will further enrich the scholarly ecosystem. reimagining open infrastructures and metrics, this article aims to situate preprints in the emerging research ecosystem, establishing that disciplinary-centric and public preprint repositories have been on the rise in the last two decades or so. as preprints become mainstream, research publications ranging from highly to moderately novel, including incremental, supportive, or confirmatory results, and their supplementary data will benefit from more visibility [ ].
research communication, academic outputs, and scholarly artifacts have diversified in many ways and they are available for various communities of practice, transcending disciplinary boundaries of research. scholarly communication is evolving and diversifying. we need to rethink our metrics and evaluation systems based on this in the rapidly changing landscape. research outputs are more than journal articles, and so measuring their impact should go beyond them, including prepublication outputs. their credibility, impact, and value should be measured through heterogeneous metrics, which calls into question the whole idea of trying to measure scholarship. are metrics appropriate? or is qualitative assessment needed? is such assessment even operationally better than randomness?

growth of preprint repositories: from arxiv to essoar

as exhibited in table , the rapid growth of preprint repositories prompted the scholarly communities to define what constitutes a preprint, when there is no clear consensus on what they are. an examination of definitions by some of the preprint repositories reveals that they are “draft, unpublished, incomplete, or unedited final versions of papers, maybe work in progress and not typeset”. in one of the early attempts, gunther [ ] distinguished the preprints from an electronic publishing and e-print server perspective, referring to them as “‘pre-peer-review’ or ‘pre-submission’ documents” in a guest editorial in . peerj preprints, a preprint repository [ ], describes a preprint as “a draft of an article, abstract, or poster that has not yet been peer reviewed for formal publication”. many scholars have attempted to define exactly what a preprint is, distinguishing the preprint as a scholarly item subject to evaluation (as in pre- and postprints) from the preprint server as infrastructure.
neylon [ ] proposed a model that distinguishes the preprints by “characteristics of the object, its ‘state’ from the subjective ‘standing’ granted to it by different communities”. rittman explains preprints as “a piece of research made publicly available before it has been validated by the research community. that is to say, some output that follows the scientific process but has not yet been peer-reviewed for journal publication [ ]”. however, tennant et al. [ ] proposed a definition of a preprint based on its peer review status, which is in line with the sherpa/romeo description:

• preprint: version of a research paper, typically prior to peer review and publication in a journal.
• postprint: version of a research paper, subsequent to peer review (and acceptance), but before any type-setting or copy-editing by the publisher. also, sometimes called a ‘peer reviewed accepted manuscript’.
• version of record (vor): the final published version of a scholarly research paper after undergoing formatting (and any other additions) by the publisher.
• e-print: version of a research paper posted on a public server, independently of its status regarding peer-review, publication in print, etc. preprints, postprints, and vors are forms of e-prints.

publishers are accepting preprints for peer review in journals, even if they are submitted in parallel to preprint repositories. they are submitted to the preprint repositories without peer review and free of cost, by authors seeking feedback from peers, and are often submitted to a journal later for peer review and subsequent publication. arxiv is a preprint repository that was established for high energy physics in . however, other disciplines took more time to realize the potential of using preprint repositories and the best practices of early dissemination of research works online to maximize the research impact [ ].

table . growth of preprint repositories, – .

s. no. | name of preprints | subject/disciplines | year established | no. of records as on july | website
 | arxiv | natural sciences, engineering, economics, finance and computing | | | https://arxiv.org
 | repec | economics | | | https://ideas.repec.org
 | ssrn | social sciences | | | https://www.ssrn.com/en
 | e-lis | library and information science | | | http://eprints.rclis.org
 | biorxiv | life sciences | | | http://www.biorxiv.org
 | peerj preprints | biological, medical, environmental and computing sciences | | | https://peerj.com/preprints
 | osf preprints | natural sciences, technology, engineering, social sciences, arts and humanities | | | https://osf.io/preprints
 | mdpi preprints | natural, engineering, social sciences and arts and humanities | | | https://www.preprints.org
 | chemrxiv | chemical sciences | | | http://www.chemrxiv.org
 | essoar | earth sciences | | | https://www.essoar.org

figure shows the growth of preprints in life sciences from to , which are reporting a high number of submissions. life sciences established more preprint repositories, such as arxiv q-bio, a quantitative biology archive that has been part of arxiv, publishing preprints since september . cold spring harbor laboratory, a nonprofit, launched biorxiv, a biology preprint repository, in november . in april , peerj inc. launched its peerj preprints, covering biological, medical, and environmental sciences.
life sciences established more preprints, such as arxiv q-bio, which is a quantitative biology archive and it has been part of arxiv, publishing preprints since september . a nonprofit, cold spring harbor laboratory launched biorxiv, a biology preprint repository in november . in april , peerj inc. launched its peerj preprints covering biological, medical, and environmental sciences. figure . monthly preprints added in november is . source: prepubmed.org. journal impact factor indicates the quality of journals through citation metrics, though measuring scholarly and societal impact is more important [ ]. hence, a new paradigm shift is that publications should not be subjective of impact, novelty, and interest, but is based on scientific and methodological soundness or objective. in other words, many journals have emerged to report on what “scientific literature might gradually become less biased against negative or null results and it will be less dominated by the trends and ‘hot topics’ of the day [ ]”, for which preprints provides the access to check the pre-publication of results prior to peer review. journal impact factor of journals and their peer reviewing process are found to be excruciatingly slow (typically – days or longer) and invariably slows decision making—affecting careers of early researchers with no recognition until the research is published, though foster international collaboration and global reach [ ]. among the other criticisms widely conceded among researchers is that science and knowledge are measured by numerical ranking systems, which makes researchers pursue the rankings first over the research [ ]. muller argues against these counterproductive conventions on the performance evaluation and metrics calling it as ‘metric fixation’ [ ]. 
preprints break these conventions, giving advantage to research publications, for their merits and research impact, as they become openly available as early pre-publication outputs, but they also are independent of any specific journal venue at the point of sharing. preprint repositories play a vital role in the dissemination of research artefacts for impact and making them visible to connect audience, which is the material cultures of academic reading and writing, maybe transient in social media communication, but calls for reuse, credit, and replication in an open research ecosystem with data, code, citations, and software. also, scholar identity has figure . monthly preprints added in november is . source: prepubmed.org. journal impact factor indicates the quality of journals through citation metrics, though measuring scholarly and societal impact is more important [ ]. hence, a new paradigm shift is that publications should not be subjective of impact, novelty, and interest, but that is based on scientific and methodological soundness or objective. in other words, many journals have emerged to report on what “scientific literature might gradually become less biased against negative or null results and it will be less dominated by the trends and ‘hot topics’ of the day [ ]”, for which preprints provides the access to check the prepublication of results prior to peer review. the journal impact factor of journals record statistics are collected from their respective websites, except biorxiv and for this, data is collected from osf preprints. 
https://osf.io/preprints https://www.preprints.org http://www.chemrxiv.org https://www.essoar.org prepubmed.org publications , , of and their peer reviewing process are found to be excruciatingly slow (typically – days or longer) and the decision-making process is invariably slow—affecting the careers of early researchers with no recognition until the research is published, though they foster international collaboration and global reach [ ]. among the other criticisms that are widely conceded among researchers is that science and knowledge are measured by numerical ranking systems, which first makes researchers pursue the rankings over research [ ]. muller argues against these counterproductive conventions on the performance evaluation and metrics, calling it ‘metric fixation’ [ ]. preprints break these conventions, giving advantage to research publications for their merits and research impact, as they become openly available as early prepublication outputs, but they also are independent of any specific journal venue at the point of sharing. preprint repositories play a vital role in the dissemination of research artifacts for impact and making them visible to connect with their audience. this concerns the material culture of academic reading and writing, which may be transient in social media communication, but calls for reuse, credit, and replication in an open research ecosystem with data, code, citations, and software. additionally, scholar identity has grown alongside the technological innovations for technology-influenced scholarship through participatory technologies in the public sphere. increasingly, academics, practitioners, and researchers [ ] tend to communicate their research using social media as a utility in the research landscape and lifecycle—as a digital opportunity to learn tools and techniques and then apply them for research communication in the changing research landscape. 
there is a growing trend of publishing in unrefereed preprint repositories; writing blog posts or field notes online; creating infographics and data visualizations; publishing research data in data journals; making podcasts, videos, images, and photo-essays; and running overlay journals, all of which diversify scholarly communication [ ]. furthermore, the scholarly communication activities and processes on informal channels boost interaction, collaboration, seeking, citing, publishing, and disseminating in orthodox, moderate, and heterodox use scenarios [ ]. a few examples are the conversation global [ ] and policyforum.net [ ], online independent news platforms that are run by research communities. using these platforms, journalists, scientists, academicians, and researchers primarily aim to communicate scholarly information to a lay audience. in this, preprints help journalism by promoting transparency and science communication for the public.

methods

for the purpose of this study, a sample of ten preprint repositories was chosen based on their history, popularity, and disciplinary diversity. the sample combined preprints (that go on to be published or not), postprints, final published articles, datasets, and working papers, all of which were examined for their salient features, disciplinary focus, and the number of records available between march and september (see table ). as a case study of preprints, the research was conducted in two stages. the first stage highlighted their principal features, such as system architecture, persistent identifiers and registries, disciplinary focus, research data management, peer reviewing models, infrastructures, and metrics. the second stage analyzed the following indicators in depth: software and open source technologies used, standards and protocols adopted, knowledge organization systems applied, interoperability and open licensing options, indexing and aggregating agencies involved, metrics and peer reviewing processes, community standards, and web . applications that are available. subjects and disciplines of preprints, such as life sciences, technology, engineering, and social sciences, were included in this study. additionally, management aspects were investigated: funding agencies and whether the preprints were supported by for-profit corporations or nonprofits, their advisory committees, codes of conduct, management of digital object identifiers, submission guidelines, copyright policies, and publishing workflows. subsequently, a comparative analysis at the site and record levels was performed in order to synthesize the results and inform the discussion.

results

comparative features of preprint repositories

the results that are presented below are in eight sections. key findings are categorized based on themes, such as: system architecture, persistent identifiers and registries, disciplinary focus and management, interoperability and open licensing, indexing and aggregators, knowledge organization systems, authority control and subject categories, metrics and open reviews, and community standards.

system architecture

system architecture refers to the database structures, hardware, and software that are used to set up a preprint repository. as shown in table , there was a limited number of software solutions available when preprint repositories were started, so legacy preprint repositories, such as arxiv and repec, are migrating to digital repository software (invenio and eprints, respectively) to integrate new applications, such as dois, orcids, and altmetric.
e-lis is a public preprint repository in library and information science that runs on dspace. though dspace and eprints dominate repository development globally, osf preprints and figshare are new entrants among repository solutions. a few preprint repositories are building application programming interfaces (apis) to provide robust features and accommodate services from other programs. out of ten, four preprint repositories have open apis: repec, mdpi preprints, osf preprints, and figshare. osf preprints aggregates content from almost all of the other servers. it also links to other services, such as figshare or github, is virtually unlimited in the scope of what can be ‘attached’ to preprints, and offers local storage. it uses share, which is a community open-source suite of technologies. out of the ten preprints that were evaluated, four preprint repositories are using custom proprietary systems, which could not be identified, as listed in column of table under infrastructure. managing research data has become an integral part of system architecture: multiple file types are supported for preservation, from word processor documents to datasets in a variety of formats, such as latex and zip, and essentially all of the preprints support this [ ].

persistent identifiers and registries

persistent identifiers provide perpetual ids for digital objects so that they can be identified and retrieved. most preprint repositories have identifiers, such as article ids, uris, and the handle system for publications, which make the records unique, identifiable, persistent, and retrievable (see table ). many of them are cross-linked and directed to the doi of the article, where the latest version of the article is available as a permalink. an example of an arxiv id is arxiv:hep-th/ , where hep-th stands for high energy physics - theory and , , is the unique record number.
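the two-part layout of the arxiv identifier described above (an archive name followed by a record number) can be sketched as a small parser. this is a minimal illustration, not arxiv's own code: the function name `parse_old_style_arxiv_id` and the `ArxivId` type are hypothetical, the sketch assumes the old-style form with a seven-digit record number, and the identifier in the usage note uses placeholder digits rather than a real record.

```python
import re
from typing import NamedTuple, Optional


class ArxivId(NamedTuple):
    archive: str  # e.g. "hep-th" for high energy physics - theory
    number: str   # seven-digit record number, kept as text to preserve leading zeros


# optional "arxiv:" scheme, archive name with optional two-letter subject class,
# a slash, then exactly seven digits (assumed old-style layout)
_OLD_STYLE = re.compile(r"^(?:arxiv:)?([a-z-]+(?:\.[a-z]{2})?)/(\d{7})$", re.IGNORECASE)


def parse_old_style_arxiv_id(text: str) -> Optional[ArxivId]:
    """split an old-style arxiv identifier into archive and record number.

    returns None when the text does not follow the assumed layout.
    """
    match = _OLD_STYLE.match(text.strip())
    if match is None:
        return None
    return ArxivId(archive=match.group(1).lower(), number=match.group(2))
```

for example, `parse_old_style_arxiv_id("arXiv:hep-th/0000000")` yields the archive `hep-th` and the placeholder number `0000000`, while a string that does not match the layout yields `None`.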
another example is the repec identifier handle: https://ideas.repec.org/p/hhs/cesisp/ .html, where hhs:cesisp denotes the centre of excellence for science and innovation studies, royal institute of technology, stockholm, sweden, followed by the unique record number: . among the ten preprint repositories analyzed, seven are found to be using crossref’s doi services for preprint records, so crossref dominates doi registration among preprints. at osf preprints, each project is assigned a globally unique identifier, or guid, though dois are used as well. doi versioning for version control was found only at chemrxiv, mdpi preprints, and peerj preprints. further, dois assigned to supplementary data, files, code, and datasets make them citable as well. moreover, one of the important features found is registries, which record various projects to make them publicly available as crucial content providers and help to avoid the duplication of studies. osf registries has , registrations of research studies of systematic reviews and meta-analyses in clinical psychology and medicine that are cross-searchable with research registries and clinicaltrials.gov registries.

disciplinary focus and management

the examination of the history and growth of preprints reveals that disciplinary focus has been one of the major factors in establishing them. since the need for sharing scholarly research arose in different settings (laboratory, academic, research, and practice), preprints were created and supported by diverse disciplinary areas (see table ). this also ties into the social differences and norms between different research communities, wherein replication, reproducibility, and methodological approaches vary greatly among different domains, especially when preprints have a ‘state’ from the subjective ‘standing’ granted to them by different communities of practice [ ] (p. ).
arxiv was started with physics, but it soon expanded to cover quantitative biology, astronomy, computer science, and mathematics. biology has been quite conventional, but in recent years it has been reporting a high number of submissions in biorxiv and peerj preprints (see figure ). in the last year, more than disciplinary-based preprint platforms have emerged, including disciplinary preprints that are backed by the centre for open science. its country-specific examples are ina-rxiv, arabixiv, and africarxiv, which are committed to indonesia, the arab states, and africa, respectively, to promote open science.

moreover, the management of preprint repositories does not rest solely with public institutions or government, but also with different agencies that are funded by nonprofits and for-profit companies [ ]. arxiv is hosted by cornell university library, repec by munich university library and consortia, e-lis is supported by aims, fao, and university of naples federico ii, naples–centralino, biorxiv is hosted by cold spring harbor laboratory, and osf preprints by the centre for open science, all of which are nonprofits. chemrxiv is collaboratively managed by the american chemical society, the german chemical society (gdch), and the royal society of chemistry, uk, and essoar by the american geophysical union, all of which are learned societies. these agencies are backing the growth and development of preprints in key disciplinary areas. however, ssrn, which was acquired by relx group in may , and both peerj preprints and mdpi’s preprints.org are services run by commercial publishers, meaning that preprint servers are seen as a key part of business models (see table ). this is part of a dangerous move by some publishers towards controlling the entire research workflow, and it is symptomatic of a highly dysfunctional scholarly publishing market [ ].

interoperability and open licensing

out of ten repositories, arxiv, repec, and osf preprints are found to be interoperable, and they support a whole range of integrated search features, such as cross-searching of content, including abstracts and the full text of articles, across multiple repositories owned by different content providers. chemrxiv, run by figshare, is a proprietary platform, but it has a unique model where all of the available content is shown on a single portal and is owned and provided by various institutions worldwide at https://figshare.com.

creative commons licenses are found to be predominantly used by the preprint repositories to allow the reuse of content and data. however, the degree of freedom varies across preprint repositories. arxiv uses the following license types, which go from the most accommodative to the most restrictive: attribution . generic (cc by . ), attribution . international (cc by . ), sharealike (sa), noncommercial (nc), and some even have cc by-nc and cc by-nc-sa types [ , ]. e-lis, peerj preprints, and mdpi preprints use attribution . international (cc by . ), which allows for the sharing and adapting of works. this implies “unrestricted use, distribution and reproduction in any medium and the original work is properly cited” [ ]. chemrxiv allows the attribution-noncommercial-noderivs (cc by-nc-nd) license, under which users may “download works and share them with others as long as they are credited, but they can’t change them in any way or use them commercially”; it also applies embargoes, keeps confidential files, generates private links, reserves dois, and accepts any file format up to gb. essoar follows the attribution . generic (cc by . ) license.
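the way the creative commons license elements named above (by, nc, nd, sa) combine into different degrees of freedom can be made concrete with a short sketch. it relies only on the standard meaning of the elements: nc rules out commercial use, nd rules out derivatives, and sa requires adaptations to carry the same license. the names `Permissions` and `permissions_for` are hypothetical, and cc0 and non-attribution licenses are deliberately out of scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permissions:
    commercial_use: bool  # False when the NC element is present
    derivatives: bool     # False when the ND element is present
    share_alike: bool     # True when the SA element is present


def permissions_for(license_code: str) -> Permissions:
    """derive reuse permissions from a code such as "CC BY-NC-ND" or "cc-by"."""
    tokens = license_code.lower().replace("-", " ").split()
    # drop the leading "cc" marker and any version number such as "4.0"
    tokens = [t for t in tokens if t != "cc" and not t[0].isdigit()]
    if not tokens or tokens[0] != "by":
        raise ValueError(f"not an attribution (BY) license: {license_code!r}")
    elements = set(tokens[1:])
    unknown = elements - {"nc", "nd", "sa"}
    if unknown:
        raise ValueError(f"unknown license elements: {sorted(unknown)}")
    if {"nd", "sa"} <= elements:
        # share-alike only applies to adaptations, which ND forbids
        raise ValueError("ND and SA cannot be combined")
    return Permissions(
        commercial_use="nc" not in elements,
        derivatives="nd" not in elements,
        share_alike="sa" in elements,
    )
```

under this sketch, `permissions_for("CC BY-NC-ND")` reports neither commercial use nor derivatives, which matches the quoted chemrxiv terms, whereas `permissions_for("cc-by")` permits both.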
repec does not state any of the above licensing options, while ssrn permits papers to be posted by the copyright owner, or with the copyright owner’s permission, under a publishing agreement, the publisher’s copyright policies, an institution’s license agreement, or a creative commons license. the portability of preprints to peer reviewed journals is also worth mentioning here: biorxiv makes this easy for authors, and the service is currently available for biology journals.

indexing and aggregators

it was found that all of the preprints are indexed and then aggregated by commercial and institutional services, data repositories, and databases, which are bibliographic, aggregating, and depositive in nature. repec has its own indexing platform, called ideas, which is a comprehensive and free bibliographic database in economics that indexes over , , items of research and is itself indexed in econlit, econstor, google scholar, inomics, oaister, openaire, and ebsco. e-lis provides a seek option for references that can be retrieved in google scholar. biorxiv preprints are indexed by the following services: google scholar, crossref, meta, and microsoft academic search. mdpi preprints are indexed by europe pmc, google scholar, scilit, academic karma, share, and prepubmed. peerj preprints are indexed in google scholar. crossref provides dois for preprints, while datacite primarily provides persistent identifiers for all kinds of research data and is integrated with chemrxiv. though many of the abstracting and indexing databases (openaire, researchgate, academia.edu, oaister) index preprints, there are no established standards available for preprints, hence no usage statistics are reported, unlike peer reviewed journals, which report counter-compliant usage statistics.

knowledge organization systems, authority control and subject categories

for the authority control of authors, arxiv, biorxiv, mdpi preprints, and essoar use the endorsement of authors through orcid. all of the preprint repositories display author-supplied keywords and tags, and the browsing of preprints by subjects/disciplines is prevalent. in addition, many of the preprint repositories display the subject/discipline category, on which the displayed preprint categories are based. for example, at biorxiv, submitted articles are categorized as new results, confirmatory results, or contradictory results, in contrast to the conventional paper types (research, opinions, reviews, technical, concepts, or case studies) published in the social science preprint repositories ssrn and osf preprints. peerj preprints, arxiv, biorxiv, and osf have advanced features, such as article versioning, adding links, and comments. it also has faceted browsing of its collections by manuscript type, and filtering of articles by entity (references, questions, answers, figures), by published date, and by subject. really simple syndication (rss) is popular among the preprints for syndicated updates on new articles and subject areas, besides social media. repec and ssrn use jel classification codes, whereas e-lis uses the jita classification of library and information science to classify the scholarly literature. no standardised metadata schema is adopted by preprints, except the dublin core metadata schema followed in dspace at e-lis; the rest of the preprint repositories use more simplified metadata input formats.

metrics and open reviews

since citation data are distributed across various databases according to their journal coverage, they need to be aggregated from multiple platforms, such as crossref, scopus, and web of science, for use.
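the aggregation step described above can be illustrated with a small sketch that merges lists of citing works reported by several platforms and removes duplicates by normalized doi. this is an assumption-laden illustration rather than any platform's actual api: the function names are hypothetical, and the input is taken to be plain lists of doi strings that have already been exported from each source.

```python
from typing import Dict, List, Set

# prefixes that commonly wrap a doi when it is written as a link or label
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """lowercase a doi and strip a resolver or "doi:" prefix (dois are case-insensitive)."""
    doi = doi.strip().lower()
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def merge_citations(sources: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """map each distinct citing doi to the set of sources that reported it."""
    merged: Dict[str, Set[str]] = {}
    for source, dois in sources.items():
        for doi in dois:
            merged.setdefault(normalize_doi(doi), set()).add(source)
    return merged


# made-up dois for illustration only
reported = {
    "crossref": ["10.5555/aaa", "10.5555/bbb"],
    "scopus": ["https://doi.org/10.5555/AAA", "10.5555/ccc"],
}
merged = merge_citations(reported)
total_distinct_citations = len(merged)
```

the length of the merged mapping gives a de-duplicated citation count, and the set stored for each doi shows which platforms overlap in coverage.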
google scholar’s citation data is essentially found to be a superset of the scopus and web of science databases [ ]. all of the preprints provide citation tools to export references in multiple file formats that are supported by various reference management software platforms. among all of the preprints, arxiv uniquely reports subject-wise submission, access, and download details on a daily, monthly, and institution-wise basis. repec reports the number of citations, downloads, and abstract views, as well as top-level metrics for institutions, regions, authors, and document types. it also reports statistics by research items, series and journals, authors, and institutions [ ]. ssrn reports institutional-level data for downloads, abstract views, and the rank of papers, authors, and organizations, besides integrated plumx metrics, which is an alternative metrics platform of elsevier; see an example at [ ]. peerj preprints reports unique article-level metrics, which are grouped as social referrals by social media and top referrals, which are essentially search engines, bookmarks, urls, and email alerts; see the example in figure [ ]. the altmetric platform, which aggregates social media metrics, is integrated with biorxiv, mdpi preprints, chemrxiv, peerj preprints, and essoar preprints. peerj preprints reports its visitors, downloads, and views; osf preprints shows the download count; mdpi preprints exhibits views, downloads, and public and private commenting options, and also provides rating options; e-lis shows monthly and yearly downloads in a graph at the article level and repository level, along with other statistics, such as the most downloaded items and top authors. chemrxiv shows views, downloads, and citations; essoar reports the download counts.
mdpi preprints allows the viewing of reviewer comments through publons, which is a peer-review profile platform, and it is the only one among the preprints to do so, while peerj preprints provides open feedback, q&a, and linking options to engage with readers and reviewers.

figure . an example of article level metrics at peerj preprints.

community standards

community standards help to develop, integrate, and steer the standards, protocols, and codes of conduct that take the initiatives (systems, software, and programs) to the wider community of committers, developers, and funders for the strengthening of open access initiatives, open source technologies, and scholarly publishing. this refers to the standards, which are free and open source software, projects, and communities for interoperability. one important interoperability protocol for metadata harvesting is the open archives initiative protocol for metadata harvesting, v. . (oai-pmh v . ). arxiv, repec, and e-lis are compliant with this protocol to support the harvesting of records from other digital repositories and to set the trends for community standards in building open archives [ ]. for the standards of software and operating system, arxiv uses gnu and the mit license. repec has gnu and the guildford protocol. e-lis adopted the open data commons open database license. as much as preprints operate on open community standards, managing them needs advisory boards, funding strategies, and steering committees to take the initiatives forward, which are further discussed in table . none of the preprints explicitly displays a code of conduct.

table . comparative features of preprint repositories.

| preprint | infrastructure: software name | infrastructure: open source/proprietary software | infrastructure: identifier/managing agency | host/funding agency | open technologies/protocols used | license name | knowledge organization systems | web . applications | metrics | nonprofit/for profit body |
|---|---|---|---|---|---|---|---|---|---|---|
| arxiv | gnu/invenio | open source | arxiv: . /arxiv | cornell university library, simons foundation and by the member institutions | mit license. oai_pmh v . (oai ) | non exclusive-distrib/ . /. (cc by . ), (cc by-sa . ), (cc by-nc-sa . ), (cc . ) | keywords, subjects and authority records. | rss, twitter, bookmarks, email alerts, annotation, blog, citation tools | subject wise submission, access and downloads details: daily, monthly, institutional-wise | nonprofit |
| repec | gnu/eprints | open source | repec:hhs:cesisp: . repec short id for authors: pzi /repec | munich university library and members from countries. research division of the federal reserve bank of st. louis | guildford protocol. oai-pmh. | - | jel classification | rss, twitter, facebook, g+, reddit, stumbleupon, delicious, email alerts, blog | citations, downloads, and abstract views. top-level metrics for institutions, regions, authors and document types | nonprofit |
| ssrn | custom | proprietary | . /ssrn. /crossref | relx group | - | - | jel classification | facebook, twitter, citeulike, permalink, blog | downloads, abstract views, plumx metrics. ranks for paper, author and organizations | for-profit |
| e-lis | dspace | open source | http://hdl.handle.net/ / /handle | aims, fao and university of naples federico ii, naples, centralino, italy | open data commons open database license. the open archives initiative and oai . | - | jita classification | - | downloads | nonprofit |
| biorxiv | highwire | proprietary | / . / /crossref | cold spring harbor laboratory, cold spring harbor, ny | - | cc-by . international license | subjects | rss, twitter, facebook, g+, alerts, digg, reddit, citeulike, google bookmarks, comment system, citation tools | altmetric | nonprofit |
| peerj preprints | custom | proprietary | . /peerj.preprints. v /crossref | peerj, inc. | - | cc by . | keywords and discipline wise browsing | twitter, facebook, g+, alerts, citation tools, versions of record | visitors, downloads, views and altmetric | for-profit |
| osf preprints | osf/share | open source | . /osf.io/zuwnr/crossref | center for open science | - | cc-by attribution . international; cc . universal | disciplines and tags | twitter, facebook, linkedin, alerts, citation tools, annotation, highlights | downloads | nonprofit |
| mdpi preprints | custom | proprietary | . /preprints . .v /crossref | mdpi | - | cc by license | disciplines | facebook, twitter, linkedin and email alerts. bookmarks in citeulike. bibsonomy, mendeley, reddit, delicious, citation tools and publons | views, downloads, comments and altmetric | for-profit |
| chemrxiv | figshare | proprietary | . /chemrxiv. .v /crossref | american chemical society, german chemical society (gdch) and the royal society of chemistry | openapi initiative. mit, gpl, gpl . +, gpl . +. | cc by-nc-nd . , cc by . , cc | subject categories and keywords | facebook, twitter, linkedin, g+, email alerts | views, downloads, citations and altmetric | nonprofit |
| essoar | atypon | proprietary | . /essoar. . /crossref | american geophysical union | - | cc-by-nc-nd, cc-by-nc, or cc-by | keywords | facebook, twitter, linkedin, google+, reddit, email alerts | altmetric and downloads | nonprofit |

technical features of open infrastructures and metrics used at preprint repositories are examined at the article and site level on the respective website.

table . management of preprint repositories.

| preprint name | managed by individuals/organizations | steering committee/advisory board | submission guidelines | subscription/membership | forum/q&a | companion website/social media |
|---|---|---|---|---|---|---|
| arxiv | cornell university library with arxiv scientific advisory board and the arxiv sustainability advisory group | member advisory board | yes | no subscription required, but runs on voluntary contributions with active institutions | yes | yes |
| repec | munich university library and members from countries. research division of the federal reserve bank of st. louis | repec coordinators and volunteers for editing, hosting and support | yes | no | yes | yes |
| ssrn | relx group | network directors | yes | free to use, however, subscription is available | yes | yes |
| e-lis | aims, fao and university of naples federico ii, naples–centralino | e-lis admin board and country editors | yes | no | yes | yes |
| biorxiv | cold spring harbor laboratory | advisory board | yes | no | yes | yes |
| peerj preprints | peerj, inc. | academic boards, advisors, editors | yes | no | yes | yes |
| osf preprints | center for open science | advisory group | yes | no | yes | yes |
| mdpi preprints | mdpi | advisory board | yes | no | yes | yes |
| chemrxiv | american chemical society, german chemical society (gdch) and the royal society of chemistry | no | yes | no | yes | yes |
| essoar | american geophysical union | advisory board/editorial board | yes | no | yes | yes |

discussion

preprints for building scholarly infrastructures and metrics

preprint repositories are becoming pivotal at the intersection of the scholarly web and open infrastructures, developing their pathways towards a dynamic research ecosystem with the advent of open technologies, such as persistent identifiers, open data harvesting and protocols, integrated data aggregators, and various discovery layers. since preprints make the content available, building infrastructures around them is central to building, scaling, and measuring such projects. interoperability and crosswalking between them is critical for the discoverability and citability of scholarly data. though some funders, for example nih, have guidelines, there are no general standards or established principles for preprint publishing. such standards are important for researchers, publishers, infrastructures, and service providers to have coherent workflows and to integrate multiple data sources and open infrastructures into unifying platforms that collect evidence regarding research impact, which will improve the demonstrated reliability [ ].
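the oai-pmh harvesting protocol named in the community standards section can be illustrated by the shape of a harvesting request. the sketch below only builds the request url and performs no network access. `ListRecords`, `metadataPrefix`, `set`, and `resumptionToken` are arguments defined by the protocol, and `oai_dc` is the dublin core format that every compliant repository must support; the function name and the base url in the usage note are hypothetical placeholders rather than the endpoint of any repository discussed here.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """build an oai-pmh ListRecords request url for a repository endpoint."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: when continuing a partial
        # list, it is sent together with the verb and nothing else
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"
```

a harvester would call `list_records_url("https://repository.example.org/oai")` for the first page and then repeat the call with the `resumption_token` returned in each response until the repository stops issuing one.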
building novel metrics upon preprint infrastructures helps with the quality assurance of scientific outputs, but it has its limitations. for example, alternative metrics say little about the quality of a paper and the kinds of impact, but more about its popularity [ ]. hence, the alternative metrics for alternative scholarly infrastructures need to be designed wisely to prevent adverse effects, such as the misuse seen with some conventional metrics like the journal impact factor. embracing the findable, accessible, interoperable, and reusable (fair) principles for scientific data management and stewardship focuses on the reuse of scholarly data, specifically enhancing the ability of machines to automate the reusability of data. the potential impact and good practices of using fair principles amongst the uk academic research community have been found to exist and to be continually improving, despite disciplinary differences. however, it is found that there is a lack of understanding of fair data and principles; a need for investment in the development of data tools, services, and processes to support open research; and a need to adopt fair principles across coordinating activities and policy development at cross-disciplinary, national, and international levels [ , ]. datacite has been steering work on persistent identifiers for research data citation, discovery, and accessibility, while also emphasizing the measurement of grants and the impact that is made by funding agencies [ , ]. hypothesis has been experimenting with open annotation use cases on preprints and has discussed the burden of moderating (editorial and site), identity, and versioning among the preprint repositories [ ]. osf preprints has been experimenting with open annotations.
at the nexus of building open scholarly infrastructures and metrics in the broader scholarly communication system, preprints push for developing and integrating evidence of impact for evaluating research and researchers with the emergent systems below:

1. data infrastructures and metrics: curate resources, metadata, and datasets that make the data of scientific publications discoverable, reusable, and citable, through the seamless integration of data and researchers across the research lifecycle, connecting human and technical infrastructure for open research. some examples include dryad digital repository, datacite, and institutional repositories.
2. persistent identifiers (pids): connect not only digital objects, but also people, events, organizations, and vocabulary terms to achieve the persistence of digital resources. persistent identifier infrastructure facilitates scientific reproducibility and the discovery of open data, providing long-term access to research artifacts (software, preprints, and datasets) and interoperability. for pids to grow, building and strengthening legacy pids, provenance, preservation, the linking of scholarly works, and an ecosystem of co-existence are critical. a few examples of pids are digital object identifiers, archival resource keys, rrids, igsns, and isbns.
3. authority files: build and control the names of authors and organizations to share and validate published data for vocabulary control. international standard name identifiers, orcids, researcherid, the virtual international authority file, and international registry of authors-links to identify scientists are some examples.
4. oa applications: include a set of open applications that facilitate free, accessible, and reusable scholarly research by building layers of new functionality, such as programs, extractions, extensions, and link resolvers, to find open access versions and the full text of scholarly resources. examples including unpaywall, open access button, kopernio, and lazy scholar help to find the full text of publications. there are also platforms for showing the research impact of articles, authors, and software; some examples are impactstory and depsy.
5. open citations databases: create and expand open repositories of scholarly citation data for reuse, which mainly include citation links, citation metrics, and cited resources under open licenses. some examples are opencitations, dimensions.ai, and lens.org.
6. open peer review systems: display the pre- and post-publication track of reviews and comments made for peer-reviewed publications that are openly accessible. peerage of science, pubpeer, scienceopen, and publons are a few examples where the reviews and comments from peer review are open for recommendation and social sharing.

the table below shows the common features that are found across all of the platforms. arxiv is cross-linked with the sao/nasa astrophysics data system and the inspire high energy physics database. repec has many mirror sites that are hosted in countries. ssrn offers recommendations for related e-journals and papers while browsing. although all of the preprint repositories embed references in pdfs, repec collects citation data for all of its holdings through citec, a citation database, and e-lis displays the references on site. being at a nascent stage, preprint repositories are developing and integrating with some of the common infrastructures. altmetric platform integration, crossref dois as identifiers, and open references are the most widely implemented features in open infrastructures and metrics.

table . common features of open infrastructures and metrics.
preprint name | google scholar integration | publons/open reviews | altmetric/plumx metrics | crossref dois | open references | recommendations (browsing related research) | additional site integration/final publication display
arxiv | yes | no | no | no | yes | no | yes
repec | no | no | no | no | yes | no | yes
ssrn | no | no | yes | yes | yes | yes | yes
e-lis | yes | no | no | no | yes | no | no
biorxiv | no | no | yes | yes | yes | no | yes
peerj preprints | yes | yes | no | yes | yes | no | yes
osf preprints | no | no | no | yes | yes | no | yes
mdpi preprints | no | yes | yes | yes | yes | no | no
chemrxiv | no | no | yes | yes | yes | no | no
essoar | no | no | yes | yes | yes | no | no

as the open references column of the table shows, all of the preprint repositories expose the references of their preprints openly; however, open citation data built on those references is not freely available. since not all of the citations to preprints are openly accessible, building open citations remains the biggest challenge, as peer-reviewed publications and their citations are distributed across the google scholar, crossref, web of science, and scopus databases. dimensions.ai and lens.org are among the few open citations databases that facilitate the measurement of citation data, while that data remains widely distributed and is still to be discovered on the scholarly web.

towards building sustainable open infrastructures with preprints

preprints, which report preliminary results as part of the scholarly output, drive demand for new scholarly metrics and infrastructures. preprint repositories that are designed with open source software, technologies, and infrastructures become essentially sustainable [ ].
commenting upon the needs of open development as a socio-technological innovation towards open access, chan [ ] noted that “the term is a broad proposition that open models and peer-based production, enabled by pervasive network technologies, non-market based incentive structures and alternative licensing regimes, can result in greater participation, access and collaboration across different sectors. . . a key understanding of ‘open development’ is that while technologies are not the sole driver of social change, they are deeply embedded in our social, economic and political fabric. we therefore need to understand ‘openness’ within the context of a complex socio-technical framework”. collective action for scholarly communication requires that funding go to infrastructure services that are interoperable, scalable, open, and community-based, as potential funders and organizations look for demonstrable community-based services, like preprints, that support open research. scoss and the coko foundation are notable as promising initiatives in this space. hence, developing conceptual frameworks to support investors in infrastructures for open scholarship, and developing community capacity through the oa sustainability index, becomes important. this makes it possible to take on initiatives, like preprint development, in hitherto under-represented disciplines and to extend the frontiers of open knowledge [ ]. the sustainability of a research ecosystem with research, education, and knowledge production components is crucial, as the implementation of preprint policies relies on the development of a fully functioning oa infrastructure [ ]. in order to build resilient open infrastructures that are inclusive and sustainable, creating, sharing, and disseminating knowledge is important in scholarly publishing for workflow integrations, metadata reuse, and publisher integration with the research lifecycle.
in support of open and collaborative science, chan [ ] further argues that “open approaches to knowledge production have the potential to radically increase the visibility, reproducibility, efficiency, transparency, and relevance of scientific research, while expanding the opportunities for a broad range of actors to participate in the knowledge production process. . . openness is not simply about gaining access to knowledge, but about the right to participate in the knowledge production process, driven by issues that are of local relevance, rather than research agendas set elsewhere or from the top down”. this is where preprint repositories are proving to be a disruptive development towards building public science. scientific publishers, research enterprises, and funding agencies are at an inflection point where research systems should be built, designed, and disseminated inherently openly, and developing preprint services provides just that opportunity for scientific communities [ ]. we need to strengthen and expand the community and institutional role in managing preprints and their development. for that, we should redefine frameworks to overcome barriers and challenges in establishing open infrastructures for scholarly communication networks, so that open research principles are built into our research ecosystem, production processes, and scientific publishing. the open science by design report released by the united states (us) national academies of sciences, engineering, and medicine is a step in that direction [ ]. research is global, and scholarly communities need interoperable hubs, interlinked data, and infrastructures that support information exchange across repositories through standards, metadata schemas, and semantic interoperability, as there is a lack of standards for aggregating data that are used across platforms [ ].
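the crosswalking between repository metadata schemas that this discussion calls for can be sketched in python as a simple field mapping. everything named here is hypothetical: `repo_a` and `repo_b` stand for two repositories with different field names, and the target fields (title, creator, date, identifier) only loosely follow dublin core element names. real crosswalks also have to reconcile value formats, languages, and versions, which this sketch does not attempt.

```python
from typing import Any, Dict, List, Tuple

# hypothetical per-repository mappings from local field names to a shared vocabulary.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "repo_a": {
        "title": "title",
        "authors": "creator",
        "posted": "date",
        "doi": "identifier",
    },
    "repo_b": {
        "paper_title": "title",
        "author_list": "creator",
        "submitted_on": "date",
        "pid": "identifier",
    },
}


def crosswalk(record: Dict[str, Any], source: str) -> Tuple[Dict[str, Any], List[str]]:
    """Map a repository record onto the shared vocabulary.

    Returns the mapped record plus the list of source fields that had no mapping,
    so that gaps in the crosswalk stay visible instead of being silently dropped.
    """
    mapping = CROSSWALKS[source]
    common: Dict[str, Any] = {}
    unmapped: List[str] = []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped.append(field)
        else:
            common[target] = value
    return common, unmapped


if __name__ == "__main__":
    # made-up records describing the same preprint in two local schemas.
    record_a = {"title": "a preprint", "authors": ["x"], "posted": "2019-01-01", "doi": "10.1234/x"}
    record_b = {"paper_title": "a preprint", "author_list": ["x"], "submitted_on": "2019-01-01",
                "pid": "10.1234/x", "keywords": ["open science"]}
    print(crosswalk(record_a, "repo_a"))
    print(crosswalk(record_b, "repo_b"))
```

returning the unmapped fields alongside the result is a design choice for this sketch: an aggregator can then report which parts of a source schema still lack an agreed equivalent.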
preprints are disrupting the scholarly communication system, and many leading publishers are slowly participating in the process by supporting, accepting, and archiving in preprint repositories. however, some important challenges remain: inconsistent metadata schemas in data harvesting, support for multilingual systems, a lack of standards for integration, and a lack of protocols for aggregating data and implementing them across platforms in version control, deduplication, and digital preservation. in strengthening open infrastructures and metrics, preprints add to the ever-growing repository types and artifacts that are indexed and mined by indexers, aggregators, and search engines, and that are built into registries, authority files for authors and organizations, and the vocabulary control of subject terminologies. in this, all of the stakeholders (publishers, governments, funders, organizations, authors, and institutions) will shape the growth of preprint repositories as they are accepted, developed, and made available. according to johnson and fosci [ ], the key priority areas for immediate action for open infrastructures are listed below, and they also resonate for preprint repositories:
• interoperable, community-led preprints with strong open access initiatives and programmes should adopt sound governance structures with a greater representation from funders and policy makers, promoting the wider use of crucial identifiers and standards for preprints with maximum community participation, like open access repositories.
• ensure the financial sustainability of critical services, particularly the doaj and sherpa, strengthening coalitions and funders, like scoss for preprint services, and balancing different disciplines and their representation fairly.
• take into account the rapid growth of preprints and create an integrated infrastructure for them, based on roadmaps and strategies for mainstreaming them across other modes of scholarly communication.
• invest strategically in preprint repositories and services in order to create a coherent oa infrastructure that is efficient, integrated, and representative of all stakeholders.

preprints for open science and public

with their ability to promote open, ethical, and transparent research workflows and processes, preprints encourage the building of open infrastructures and symbiotic services as a web of data with reproducibility at its core, mutually supporting and growing along with other research artifacts [ ]. as more and more preprint repositories grow, the research ecosystem will consolidate towards a resilient, transparent, and open research environment for the public, promoting scientific temper and awareness as a public good. preprint repositories as public good initiatives offer enormous opportunities for researchers to manage the life cycle of research production, data management, access and collaboration control, project analytics, version control, and centralized access in a distributed environment [ ]. they allow researchers to disseminate preliminary work or draft papers to a wider global community of researchers to obtain feedback or comments before formally submitting to peer-reviewed journals. they also help in speeding up the communication of research results and fostering collaborations. currently, many journals accept preprint submissions. nature and science have been accepting preprints for a long time, since they publish physics papers. at the american chemical society, of the journals accept preprints unconditionally [ ]. fostering scholarly commons, such as preprints, will open up opportunities for scientists and the public to solve some of our pressing problems, from climate change to drug discovery, and this is possible through open science. without limits or embargoes, preprints pose no greater threat to the public than they would if they remained inaccessible and restricted [ ].
peer review in preprints: revisiting for present times

the peer review process exists to enable nominally disinterested experts to assure the quality of academic publications, but preprint servers usually host articles that have not yet been subject to peer review. the question at this juncture, for open science, is whether scientific communities will accept preprints without peer review when the process itself is entangled with a lack of incentives, credit, and recognition for peer reviewers [ , , ]. since preprints are not necessarily peer reviewed, and are explicit about that, this remains to be discussed. it also offers enormous potential for establishing processes like open review mechanisms and new models of peer review. open science through preprints promotes transparency and secures the provenance, timing, and integrity of scientific data in an open and distributed infrastructure, documenting every step of the research process and its data for the public. as bibliometric measures are not the indicator of achievement, there is a need to evaluate what needs to change in our culture, who is involved, what the most effective ways are, and how change can be measured [ , ]. the challenges of maintaining unbiased review systems without gender bias in authorship and peer review, preserving the diversity of gender, racial, and ethnic communities, and upholding high standards of ethics and transparency call for attention and cultural change in scholarly communication. asapbio’s initiative is worth mentioning for accelerating scholarly communication in the life sciences through preprints. there is also an equal emphasis on standards, research integrity and ethics, quality, and credibility in navigating the peer review process, with scope for new initiatives, each with potential issues and advantages, to disrupt scholarly communication both as a system and as a process, with incentives in place for fostering open research environments and open access publishing [ , ].
hence, reforming the scholarly communication system to overcome barriers in the legal framework, information technology infrastructures, business models, indexing services and standards, the academic reward system, marketing, and critical mass, so as to integrate subject-specific, institutional, and data repositories into the main channels of scientific publication, is critical, and preprint development is a key component of that reform [ ]. long established as a standardized practice with no other viable option for scientific communities, the peer review process is invaluable and unquestionable, and for preprints this process calls for openness. moreover, it should broaden its approaches to accommodate open rewards, incentives, and other non-monetary benefits, as they advance scientific communication [ ] to solve social problems, make sense for policy makers, and push scholarship forward for the advancement of humanity.

conclusions

preprint repositories are gaining momentum in becoming active partners of the scholarly research ecosystem and, as discussed in this article, they contribute to open scholarship as a new model of scholarly publishing. nevertheless, the dangers of the commercialization of preprints do not augur well for open science. this raises questions regarding the sustainability of preprint repositories and the degree to which commercial business models interfere with open science. without embargoes, preprints pose no risks to the public understanding of science, and hence imposing limits is against the public interest [ ]. preprints apparently add to the existing complexities in scholarly publishing; however, their plethora of models, scales, and forms gives rise to opportunities to embrace them on the one hand, while on the other hand they may take time to become mainstream in scholarly publishing [ – ].
nonetheless, what constitutes them and whether they will stand out in the constructs of scholarly communication remains to be seen in the wake of diverse open data, open access publishing models, open infrastructures, and web . technologies [ ]. these factors are central for scholarly communication to enrich and strengthen the scholarly web with search engines, indexing systems, semantic technologies, and social software analytics, to maximize research impact and build reputation systems through open infrastructures and metrics for authors and institutions. going forward, on the landscape of preprints and metrics, perhaps overlay systems could be implemented, based on repositories using new metrics, as overlay journals emerge. preprint repositories have emerged as a movement, and they are implemented in different ways and approached in heterogeneous forms. whether they will sit alongside conventional journals or change the scholarly communication landscape fundamentally as hubs of early research output remains open, and both possibilities carry important caveats for open science [ , ]. however, the trade-offs, such as the questions of conflicts of interest, risks, and research ethics with which preprints are published, need to be addressed for the public, in the public domain, and in understanding science [ , ].

author contributions: conceptualization, b.p.b.; writing: original draft, b.p.b. and m.d.; writing, review and editing, b.p.b. and m.d.

funding: this research received no external funding.

acknowledgments: the authors are extremely grateful to the two reviewers and the editor for their critical insights and useful comments, which helped to revise this paper into its current form.

conflicts of interest: the authors declare no conflict of interest.

references

prepubmed. monthly statistics for october . available online: http://www.prepubmed.org/monthly_stats/ (accessed on december ).
teixeira da silva, j.a. the preprint wars. ame med. j. , , .
piwowar, h.; priem, j.; larivière, v.; alperin, j.p.; matthias, l.; norlander, b.; farley, a.; west, j.; haustein, s. the state of oa: a large-scale analysis of the prevalence and impact of open access articles. peerj , , e .
peiperl, l. preprints in medical research: progress and principles. plos med. , , e .
severin, a.; egger, m.; eve, m.p.; hürlimann, d. discipline-specific open access publishing practices and barriers to change: an evidence-based review. f research , , .
cobb, m. the prehistory of biology preprints: a forgotten experiment from the s. plos biol. , , e .
eysenbach, g. the impact of preprint servers and electronic publishing on biomedical research. curr. opin. immunol. , , – .
tennant, j.; bauin, s.; james, s.; kant, j. the evolving preprint landscape: introductory report for the knowledge exchange working group on preprints. available online: https://osf.io/preprints/bitss/ tu/ (accessed on july ).
wikipedia. preprint. available online: https://en.wikipedia.org/wiki/preprint (accessed on november ).
bornmann, l. scientific peer review: an analysis of the peer review process from the perspective of sociology of science theories. hum. arch. j. sociol. self-knowl. , , – .
rowland, f. the peer-review process. learn. publ. , , – .
ingelfinger, f.j. definition of sole contribution. n. engl. j. med. , , – .
larivière, v.; haustein, s.; mongeon, p. the oligopoly of academic publishers in the digital era. plos one , , e .
van noorden, r. open access: the true cost of science publishing. nature , , – .
shen, c. open access scholarly journal publishing in chinese. publications , , .
else, h. radical open-access plan could spell end to journal subscriptions. nature , , – .
smith, a. alternative open access publishing models: exploring new territories in scholarly communication. in report on the workshop held on october at the european commission directorate-general for communications networks, content and technology; european commission: brussels, belgium, .
björk, b.-c. evolution of the scholarly mega-journal, – . peerj , , e .
spezi, v.; wakeling, s.; pinfield, s.; creaser, c.; fry, j.; willett, p. open-access mega-journals. j. doc. , , – .
berners-lee, t.; o’hara, k. the read-write linked data web. philos. trans. r. soc. a math. phys. eng. sci. , , .
research excellence framework. ref : key facts. available online: http://www.ref.ac.uk/ /media/ref/content/pub/refbriefguide .pdf (accessed on october ).
garg, k.c.; kumar, s. uncitedness of indian scientific output. curr. sci. , , – .
hu, z.; wu, y. a probe into causes of non-citation based on survey data. soc. sci. inf. , , – .
flatt, j.; blasimme, a.; vayena, e. improving the measurement of scientific success by reporting a self-citation index. publications , , .
martín-martín, a.; orduna-malea, e.; lópez-cózar, e. scholar mirrors: integrating evidence of impact from multiple sources into one platform to expedite researcher evaluation. in proceedings of the sti conference: science, technology and innovation indicators. “open indicators: innovation, participation and actor-based sti indicators”, paris, france, – september .
buschman, m.; michalek, a. are alternative metrics still alternative? bull. am. soc. inf. sci. technol. , , – .
martín-martín, a.; orduna-malea, e.; delgado lópez-cózar, e. author-level metrics in the new academic profile platforms: the online behaviour of the bibliometrics community. j. informetr. , , – .
tennant, j.p.; waldner, f.; jacques, d.c.; masuzzo, p.; collister, l.b.; hartgerink, c.h.j. the academic, economic and societal impacts of open access: an evidence-based review. f research , , .
seglen, p.o. citation rates and journal impact factors are not suitable for evaluation of research. acta orthop. scand. , , – .
brembs, b.; button, k.; munafò, m. deep impact: unintended consequences of journal rank. front. hum. neurosci. , , .
brembs, b. prestigious science journals struggle to reach even average reliability. front. hum. neurosci. , , .
rau, h.; goggins, g.; fahy, f. from invisibility to impact: recognising the scientific and societal relevance of interdisciplinary sustainability research. res. policy , , – .
weale, a.r.; bailey, m.; lear, p.a. the level of non-citation of articles within a journal as a measure of quality: a comparison to the impact factor. bmc med. res. methodol. , , .
chaddah, p. evaluation of research output. curr. sci. , , – .
peerj prints. what is a preprint? available online: https://peerj.com/about/preprints/what-is-a-preprint/ (accessed on april ).
neylon, c.; pattinson, d.; bilder, g.; lin, j. on the origin of nonequivalent states: how we can talk about preprints. f research , , .
rittman, m. preprints as a hub for early-stage research outputs. preprints , – .
fabry, g.; fischer, m.r. beyond the impact factor—what do alternative metrics have to offer? gms j. med. educ. , .
meadows, a. journals peer review: past, present, future. available online: https://scholarlykitchen.sspnet.org/ / / /journals-peer-review-past-present-future/ (accessed on april ).
gölitz, p. preprints, impact factors, and unethical behavior, but also lots of good news. angew. chem. int. ed. , , – .
nicholas, d. editorial: thematic series on scholarly communications in the digital age. fems microbiol. lett. , .
muller, j.z. the tyranny of metrics; princeton university press: princeton, nj, usa, .
gu, f.; widén-wulff, g. scholarly communication and possible changes in the context of social media. electron. libr. , , – .
mahesh, g. the changing face of scholarly journals. curr. sci. , , – .
shehata, a.; ellis, d.; foster, a.e. changing styles of informal academic communication in the age of the web. j. doc. , , – .
the conversation global. the conversation. available online: https://theconversation.com/global (accessed on may ).
asia and the pacific policy society. policyforum.net. available online: https://www.policyforum.net/ (accessed on may ).
brochu, l.; burns, j. librarians and research data management- a literature review: commentary from a senior professional and a new professional librarian. new rev. acad. librariansh. .
tennant, j.; brembs, b. relx referral to eu competition authority. zenodo .
commons, c. licensing types. available online: https://creativecommons.org/share-your-work/licensing-types-examples/ (accessed on october ).
commons, c. what our licenses do. available online: https://creativecommons.org/licenses/ (accessed on october ).
commons, c. attribution . international (cc by . ). available online: https://creativecommons.org/licenses/by/ . / (accessed on october ).
martín-martín, a.; orduna-malea, e.; thelwall, m.; delgado lópez-cózar, e. google scholar, web of science, and scopus: a systematic comparison of citations in subject categories. j. informetr. , , – .
repec. repec/ideas rankings. available online: https://ideas.repec.org/top (accessed on october ).
evans-cowley, j.s. there’s an app for that: mobile applications for urban planning. ssrn electron. j. .
broman, k.w.; woo, k.h. data organization in spreadsheets. am. stat. , , – .
lagoze, c.; van de sompel, h.; nelson, m.; warner, s. the open archives initiative protocol for metadata harvesting. available online: https://www.openarchives.org/oai/openarchivesprotocol.html (accessed on december ).
costas, r.; zahedi, z.; wouters, p. do “altmetrics” correlate with citations? extensive comparison of altmetric indicators with citations from a multidisciplinary perspective. j. assoc. inf. sci. technol. , , – .
allen, r.; hartland, d. fair in practice—jisc report on the findable accessible interoperable and reuseable data principles; jisc: bristol, uk, .
bruce, r.; cordewener, b. open science is all very well but how do you make it fair in practice? available online: https://www.jisc.ac.uk/blog/open-science-is-all-very-well-but-how-do-you-make-it-fair-in-practice- -jul- (accessed on july ).
wilkinson, m.d.; dumontier, m.; aalbersberg, i.j.; appleton, g.; axton, m.; baak, a.; blomberg, n.; boiten, j.-w.; da silva santos, l.b.; bourne, p.e.; et al. the fair guiding principles for scientific data management and stewardship. sci. data , , .
robinson-garcia, n.; mongeon, p.; jeng, w.; costas, r. datacite as a novel bibliometric source: coverage, strengths and limitations. j. informetr. , , – .
staines, h. preprint services gather to explore an annotated future. available online: https://web.hypothes.is/blog/preprint-services-gather-to-explore-an-annotated-future/ (accessed on may ).
shewale, n.a.; balaji, b.p.; shewale, m. open content: an inference for developing an open information field. in open source technology: concepts, methodologies, tools, and applications; igi global: hershey, pa, usa, ; pp. – .
chan, l. what role for open and collaborative science in development? available online: http://www.universityworldnews.com/article.php?story= (accessed on june ).
jisc. oa sustainability index; jisc: bristol, uk, .
johnson, r.; fosci, m. putting down roots: securing the future of open-access policies; jisc: bristol, uk, .
ali-khan, s.e.; jean, a.; macdonald, e.; gold, e.r. defining success in open science. mni open res. .
national academies of sciences, engineering, and medicine. open science by design: realizing a vision for st century research; national academies press: washington, dc, usa, .
hudson-vitale, c.r.; johnson, r.p.; ruttenberg, j.; spies, j.r. share: community-focused infrastructure and a public goods, scholarly database to advance access to research. d-lib mag. , .
capadisli, s.; guy, a.; lange, c.; auer, s.; greco, n. linked research: an approach for scholarly communication. available online: http://csarven.ca/linked-research-scholarly-communication (accessed on may ).
foster, e.d.; deardorff, a. open science framework (osf). j. med. libr. assoc. , , .
american chemical society. acs launches chemistry preprint server. available online: https://cen.acs.org/articles/ /web/ / /acs-launches-chemistry-preprint-server.html (accessed on april ).
sarabipour, s.; wissink, e.m.; burgess, s.j.; hensel, z.; debat, h.; emmott, e.a.; akay, a.; akdemir, k.; schwessinger, b. maintaining confidence in the reporting of scientific outputs. peerj prepr. .
tennant, j.p. the state of the art in peer review. fems microbiol. lett. , .
tennant, j.p.; dugan, j.m.; graziotin, d.; jacques, d.c.; waldner, f.; mietchen, d.; elkhatib, y.; collister, b.l.; pikas, c.k.; crick, t.; et al. a multi-disciplinary perspective on emergent and future innovations in peer review. f research , , .
ma, l.; ladisch, m. scholarly communication and practices in the world of metrics: an exploratory study. in proceedings of the th asis&t annual meeting: creating knowledge, enhancing lives through information & technology, copenhagen, denmark, – october ; volume , p. .
meadows, a. changing the culture in scholarly communications. available online: https://scholarlykitchen.sspnet.org/ / / /changing-culture-scholarly-communications/ (accessed on april ).
allahar, h. is open access publishing a case of disruptive innovation? int. j. bus. environ. , , – .
björk, b.-c. open access to scientific publications—an analysis of the barriers to change? inf. res. , , .
calne, r. preprint servers: vet reproducibility of biology preprints. nature , , .
da silva, j.a.t. preprints should not be cited. curr. sci. , , – .
inlexio. the rising tide of preprint servers. available online: https://www.inlexio.com/rising-tide-preprint-servers/ (accessed on may ).
hoyt, j.; binfield, p. who killed the preprint, and could it make a return? available online: https://blogs.scientificamerican.com/guest-blog/who-killed-the-preprint-and-could-it-make-a-return/ (accessed on may ).
luther, j. the stars are aligning for preprints. available online: https://scholarlykitchen.sspnet.org/ / / /stars-aligning-preprints/ (accessed on april ).
balaji, b.p.; vinay, m.s.; shalini, b.g.; raju, m.j.s. an integrative review of web . in academic libraries. libr. hi tech. news , , – .
da silva, j.a.t. intellectual phishing, hidden conflicts of interest and hidden data: new risks of preprints. j. advocacy res. educ. , , – .
sheldon, t. preprints could promote confusion and distortion. nature , , .

© by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /).
Annotation Guideline No. : Annotation Guidelines for Narrative Levels

Adam Hammond

Article DOI: . / c.
Journal ISSN: -
Cite: Adam Hammond, “Annotation Guideline No. : Annotation Guidelines for Narrative Levels,” Journal of Cultural Analytics. January , . doi: . / c.

Rationale

I first became aware of the SANTA project at the Digital Humanities conference in Montreal in the summer of . I had just been assigned a -student second-year undergraduate digital humanities English literature class, set to begin in January , and I was looking for a group annotation project for my students. In previous iterations of the course, I had carried out several annotation projects focused on the narrative phenomenon of free indirect discourse (FID) in texts by Virginia Woolf and James Joyce. What made these projects successful, from my perspective, was that FID is a complex phenomenon (by definition, a passage in which it is difficult or impossible to say for certain whether a character or narrator is speaking certain words) which is, however, relatively easy to represent in machine language (for instance, with the TEI element and a few value-attribute pairs).
The challenge in the assignment, in other words, was literary rather than technical: while it was easy to learn the TEI tagging, it was hard to say for certain whether a passage from To the Lighthouse was in direct discourse or FID, or to identify who exactly was speaking. To my mind, this made the assignment a meaningful one for my students, teaching them a technical skill while also bringing them into closer contact with the sometimes-irresolvable complexities of literary language.

[Note: The syllabus for this class, ENG , “The Digital Text,” is available at http://www.adamhammond.com/eng s /]

[Note: See Adam Hammond, Julian Brooke, Graeme Hirst, “Modeling Modernist Dialogism: Close Reading with Big Data,” Reading Modernism with Machines: Digital Humanities and Modernist Literature, eds. Shawna Ross and James O’Sullivan (Palgrave Macmillan, ): - and Julian Brooke, Adam Hammond, Graeme Hirst, “Using Models of Lexical Style to Quantify Free Indirect Discourse in Modernist Fiction,” Digital Scholarship in the Humanities . (June ): - . http://www.adamhammond.com/wp-content/uploads/ / /llc.fqv .full_.pdf]

Listening to the SANTA presentation at DH , it struck me that the phenomenon of narrative levels would make for a similarly meaningful annotation project. On the one hand, narrative levels could be represented fairly easily with a single XML element and through XML’s nesting structure (its “ordered hierarchy of content objects”). On the other hand, definitions of what a narrative is, and what a “narrative level” might be, were sufficiently complex that the annotation even of a relatively simple text would present an interpretive challenge to my students.
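As a rough illustration of the “ordered hierarchy” idea, the sketch below parses a small nested XML fragment and reports how deeply each narrative is embedded. The element name `nframe` and the `level` and `narr` attributes are the ones these guidelines define further on; the sample sentences are adapted from the wolf story in the sample annotations. The exact tag placement, the narrator numbers (1 and 2), and the helper name `frame_depths` are assumptions made for this example, not the markup actually distributed to students.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: tag placement and narr numbers are assumptions.
SAMPLE = (
    '<nframe level="a" narr="1">It was a dark and stormy night. '
    'The wolf opened his mouth and spoke. '
    '<nframe level="b" narr="2">"Once upon a time, a wizard appeared '
    'before me and predicted my fate."</nframe> '
    'He bit me and I died.</nframe>'
)


def frame_depths(xml_text):
    """Return (depth, level, narr) for every nframe, in document order."""
    root = ET.fromstring(xml_text)
    found = []

    def walk(element, depth):
        # Each nframe element opens one more degree of embedding.
        if element.tag == "nframe":
            depth += 1
            found.append((depth, element.get("level"), element.get("narr")))
        for child in element:
            walk(child, depth)

    walk(root, 0)
    return found


for depth, level, narr in frame_depths(SAMPLE):
    print(f"depth {depth}: level={level} narrator={narr}")
```

In this sketch the nesting depth recovered from the XML tree lines up with the declared level ("a" at depth 1, "b" at depth 2), which is the property the scheme relies on when it treats narrative levels as nested elements.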
By the time that I had begun planning my course, the SANTA group had published a more detailed set of instructions on their website, including suggestions for theoretical readings on narrative levels. They organized these readings in three levels: introductory, basics, and advanced. Reading through these texts, I was struck by three things. First, that the concept of narrative levels, relatively intuitive at first glance, becomes more complex the more one looks at it. Second, that there was significant disagreement among narratologists concerning even basic categories (such as the distinction between a “narrative level” and a “narrative frame”). Third, that many of my second-year undergraduate students would be deeply confused even by the recommended texts at the simplest, “introductory,” level.

In light of this, I decided to keep my definitions as simple as possible: as close as possible to the level of the “intuitive,” and free from explicit discussion of the theoretical disagreements that preoccupy narratologists who study the phenomenon. Since my motivation in preparing annotation projects is to find tasks that are simple technically but make my students reflect deeply about literary phenomena, I would keep my tagging scheme as simple as possible and restrict my definitions to the points on which all narratologists basically agree. This led to the very short guidelines that you see here, the shortest, by some margin, in this group. Although it could be argued that their brevity might lead to unnecessary disagreement among annotators (that by offloading so much of the literary work to my students, I was deliberately reducing the likelihood that the guidelines would produce annotations with useable levels of inter-annotator agreement), my suspicion from the beginning was that any greater detail would in fact simply confuse my student annotators and reduce inter-annotator agreement.
(Analysis of the first round of SANTA annotation schemes confirms this suspicion to some extent.)

[Note: See https://sharedtasksinthedh.github.io/levels/]

My guidelines depend on annotators’ mastering three relatively simple concepts. The first is the concept of a narrative, which I define, drawing on Porter Abbott’s Cambridge Introduction to Narrative, as “a representation of a story (an event or series of events) by a narrator.” The next is the notion that a given text can contain more than one narrative, and that narratives can be embedded within one another. I provide a rule of my own devising for helping students to decide whether they have reached a moment at which one narrative is embedded within another: if they could plausibly insert the phrase “let me tell you a story” (a phrase which captures both sides of my simple definition of a narrative, the narrator [“me”] and the story itself) at the beginning of the proposed embedded narrative, then they should mark the beginning of a new narrative. The third concept is that of degrees of embeddedness, borrowing terminology from Shlomith Rimmon-Kenan via Manfred Jahn. In my original guidelines, the annotation is described in terms of XML tags, which makes the discussion of embedding somewhat simpler, in that I can simply import XML’s model of embedding and make the assumption that narrative levels also form an “ordered hierarchy of content objects.” A further benefit of this simple annotation scheme is that it serves to focus the eventual computational task.
Though annotations produced according to my guidelines could not be used to train machine learning models in all narrative phenomena related to narrative levels, they could help to keep attention focused on three crucial and related tasks: identifying moments where one narrative yields to another; identifying the speaker of each; and placing these narratives in hierarchical relation to one another.

[Note: H. Porter Abbott, The Cambridge Introduction to Narrative, nd edn. (Cambridge: Cambridge UP, ). Abbott defines a narrative as “the representation of a story (an event or series of events)” ( ). He excludes the necessity of a narrator from his definition of a narrative on the basis that this would exclude most drama and film. Since I was working exclusively with prose fiction, this exclusion was not necessary for my own schema.]

[Note: Manfred Jahn, “N . . Narrative Levels,” Narratology: A Guide to the Theory of Narrative (English Department, University of Cologne, ), http://www.uni-koeln.de/~ame /pppn.htm#n .]

[Note: The actual guidelines distributed to students, which describe the annotation project in terms of XML, are available at http://www.adamhammond.com/wp-content/uploads/ / /narrative-frames-annotation-guidelines.pdf. The main difference between these guidelines and the tool-agnostic version included here is the necessary addition of the “open” attribute, which is a less elegant method than the method described in note below.]

I carried out my annotation project as follows. First, I assigned Henry James’s The Turn of the Screw, and presented a two-hour lecture focused in large part on how James’s complicated framing structure serves to complicate (rather than resolve) the text’s many “narrative gaps.” The next week, in another two-hour
lecture, I introduced the students to XML and to the project itself. In this lecture, I provided slightly more detail than I provide in the guidelines themselves. For example, I showed students Genette’s speech bubble doodle and discussed its implications. I introduced box diagrams for representing narratives within narratives, for instance in The Thousand and One Nights, as follows:

[Figure: box diagram of narratives within narratives in The Thousand and One Nights]

I also provided corresponding diagrams emphasizing the stratified levels of narrative (the “degrees” of narrative) in such box diagrams:

[Figure: box diagram showing the degrees of narrative]

[Note: The slides for this lecture are available at http://www.adamhammond.com/wp-content/uploads/ / /eng _narrative_levels_lecture.pdf]

I provided additional diagrams for Hamlet, The Taming of the Shrew, and The Turn of the Screw. For the latter text, I emphasized that there were multiple valid ways of interpreting the text’s structure: for instance, the governess’s tale could be envisioned as a third-degree narrative embedded within Douglas’s and the outer narrator’s, or could be seen as embedded only within that of the outer narrator; further, certain stories that the governess tells Mrs. Grose could be marked as separate narratives, though one could argue that they are simply part of the governess’s narrative, not independent of it. I also used Turn of the Screw to introduce the notion of “open frames.” I next explained “mise-en-abyme” or recursive narratives. I concluded the lecture by explaining the process students would use to annotate their assigned stories.
I next posted an instructional video explaining the annotation procedure, which students accomplished with the Sublime Text editor.

In practice, the project seems to have been a success in the context of undergraduate pedagogy. In the annotations received in the project, there were only three coding errors, evidence that, as desired, the technical challenge was minimal. Although we have yet to perform a detailed investigation of inter-annotator agreement among my students, informal evaluations performed in the context of grading students’ work revealed that disagreement occurred primarily in instances where literary interpretations might reasonably differ, evidence that the literary questions asked of students were meaningful ones. Going forward and revising these guidelines for use beyond my classroom, I would add more explicit and theoretically grounded definitions and include diagrams like those depicted above.

Overview

A set of narrative texts is to be annotated for narrative levels. Any span of text containing a narrative is to be marked with the nframe category marker. For the purpose of our task, a narrative is defined as a representation of a story (an event or series of events) by a narrator. The texts in our annotation set may contain a single narrative (and thus a single nframe category) or may contain multiple narratives embedded within one another (nframe categories within nframe categories). If you come to a point in a text where you are uncertain whether to indicate a shift in narrative levels, imagine inserting the phrase “let me tell you a story” right after the proposed division point. If the phrase fits, you should
if the phrase fits, you should in my original guidelines, these open frames indicated by a deliberate xml error — withholding an end-tag — which is not practical but which i believe perfectly captures a reader’s feeling at the end of a story like the turn of the screw, where it is as if the author had made a coding error, omitting crucial information that allows us to properly process the conclusion of the story. the instructional video is available at https://www.youtube.com/watch?v=dseeulcyfsu the texts i assigned to students were mostly those proposed by the project, though i made several sub- stitutions based on various factors, including stereotyped representations of racialized characters in certain supplied texts. for instance, i replaced rudyard kipling’s “beyond the pale” and “how the leopard got its spots” with wallace thurman’s “cordelia the crude” and an abridged version of zora neale hurtson’s their eyes were watching god. adam hammond cultural analytics likely mark a new narrative level. the nframe category has two necessary and one optional attribute. • level attribute the level attribute is used to express the degree of embedding of a narrative. if the narrative is not embedded within any others, it is a top-level or first-degree narra- tive and should be given the attribute value of “a”. a narrative embedded within an “a”-level narrative — a narrative within a narrative, or second-degree narrative — is given the attribute value of “b”. a narrative embedded within a “b”-level narrative — a narrative within a narrative within a narrative, or third-degree nar- rative — is given the attribute value “c”, and so on. note that a text may contain multiple narratives at each level. for instance, the thousand and one nights con- tains hundreds (in some tellings, exactly , ) of “b”-level narratives — some of which contain “c”-level narratives of their own. 
• narr attribute
The narr attribute keeps track of the narrator who conveys the narrative. We will represent these with numbers. The first narrator you encounter should be numbered “ ”, the second “ ”, the third “ ,” and so on. If the narrator of a “b”-level narrative is the same as the narrator of the “a” level, both are numbered “ ”. If the narrator of a “b”-level narrative is different from the narrator at the “a” level, the first is numbered “ ” and the second “ ,” and so on.

• open attribute
Some writers choose deliberately to leave frames “open.” For example, in Henry James’s The Turn of the Screw, the governess’s “c”-level tale is framed within a Christmas fireside storytelling session by two narrators, the “a”-level “I” and the “b”-level Douglas. Yet after the governess finishes her tale, James does not return to the “a” or “b” levels to explicitly close them. Instead, they are left hanging. Indicate “open” by setting the “open” attribute to “true” (if not indicated, it will be assumed that the frame is “closed”).

Sample annotations

A simple text containing only one narrative might be annotated as follows, using XML markup as an example:

It was a dark and stormy night. The wind blew and the wolf howled. The wind blew open my window and the wolf entered. The wolf bit me and I died.

A text containing a single “b”-level narrative might be annotated as follows. (Since the narrator of the “b”-level narrative is different from that of the “a”-level narrative, it is given the narrator attribute of “ ”.)

It was a dark and stormy night. The wind blew and the wolf howled. The wind blew open my window and the wolf entered. The wolf opened his mouth and spoke. "Once upon a time, when I was but a young pup, a wizard appeared before me and predicted my fate. He told me that one day, I would leap through a window and eat a man whole. After enduring many hardships, I have come to enact my fate."
He bit me and I died.

A text containing two “b”-level narratives and a single “c”-level narrative might be tagged as follows. (Since the narrator of the second “b”-level narrative is the same as that of the “a”-level narrative, they share the narrator attribute of “ ”.)

It was a dark and stormy night. The wind blew and the wolf howled. The wind blew open my window and the wolf entered. The wolf opened his mouth and spoke. "Once upon a time, when I was but a young pup, a wizard appeared before me and predicted my fate. The wizard told me, 'I was born in the east. My father was a plumber and my mother an auto mechanic. From a young age, it was clear that I had little talent for either profession, so I set off for the wizard academy. My expert wizardry has brought me here to you. You, dear wolf, will some day leap through a window and eat a man whole.' And so here I am. After enduring many hardships, I have come to eat you." Before he had a chance to eat me, I tried to distract him with a story. "Once upon a time and a very good time it was there was a moocow coming down along the road and this moocow that was coming down along the road met a nicens little boy named baby tuckoo...". But he found the story boring and so he bit me and I died.

• Special case: “open frames”
In the following example, the “a”-level narrative is not explicitly closed by the narrator (presumably because he has been eaten and is unable to write) and thus the “open” attribute has been set to “true”:

It was a dark and stormy night. The wind blew and the wolf howled. The wind blew open my window and the wolf entered. The wolf opened his mouth and spoke. "Once upon a time, when I was but a young pup, a wizard appeared before me and predicted my fate. He told me that one day, I would leap through a window and eat a man whole. After enduring many hardships, I have come to enact my fate."

• Special case: “mise-en-abyme” narratives
Some narratives, especially popular with postmodern writers, paradoxically embed a story within itself. This paradoxical situation can be represented by showing a series of “a”-level narratives embedded within one another:

It was a dark and stormy night. The band of robbers huddled together around the fire. When he had finished eating, the first bandit said, "Let me tell you a story. It was a dark and stormy night and a band of robbers huddled together around the fire. When he had finished eating, the first bandit said: 'Let me tell you a story. It was a dark and stormy night and...' "

Other notes

If a shift in narrative level occurs around a chapter break and you’re unsure whether to put your nframe category marker before or after the chapter header, put it after.

Unless otherwise specified, all work in this journal is licensed under a Creative Commons Attribution . International License. http://creativecommons.org/licenses/by/ . /

Libraries and Research: Five Key Themes for Sustainable Innovation in Strategy and Services

Wendy White

New Review of Academic Librarianship, ISSN: - (Print) - (Online). Journal homepage: http://www.tandfonline.com/loi/racl

To cite this article: Wendy White ( ) Libraries and Research: Five Key Themes for Sustainable Innovation in Strategy and Services, New Review of Academic Librarianship, : - , - , DOI: . / . .

To link to this article: http://dx.doi.org/ . / . .

© The Author(s). Published with license by Taylor & Francis Group, LLC © Wendy White. Published online: Aug .
Guest editorial

Wendy White, Hartley Library, University of Southampton, Southampton, United Kingdom

This themed issue, which draws together articles that explore Supporting Researchers: Sustainable Innovation in Strategy and Services, seemed to sharpen in relevance even as the manuscripts were being written and developed. The global environment and the impact on higher education continue to change rapidly and demonstrate the need to be prepared for the unexpected but also the need to create a compelling vision for the future. Five themes emerge strongly across these articles and case studies as libraries seek ways of innovating to develop flexible, yet sustainable and purposeful approaches.

The overarching narrative foregrounds building and evolving relationships. This is not in itself new. Librarians have often shown initiative and skill in forming alliances that maximize access to information, shape collections and foster the attributes required to operate effectively in an increasingly complex multi-format and multiple rights holder environment. Nor is involvement with research activity new. Librarians have themselves been noted scholars over the years.
What is clear from the articles in this volume is that libraries globally are engaging more systematically, both with the actual craft of research activity and with the development and monitoring of research strategy.

These collaborations are forming and deepening at all levels from international to local. A Canadian case study of the network linking the national statistics agency to institutional users shows the strength of librarians as knowledgeable ambassadors, also demonstrated by the expanding portfolio of advice available as a result of new training approaches. Several of the case studies identify new modes of staff development for researchers and professional experts, including collaborative learning. Innovative ways of learning together, transferring knowledge and understanding across currently partially siloed communities, could be key to effective support for large scale interdisciplinary activity.

[Contact: Wendy White, w.h.white@soton.ac.uk, Hartley Library, University of Southampton, Highfield, Southampton, SO BJ, UK. © Wendy White. Published with license by Taylor & Francis Group, LLC. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. New Review of Academic Librarianship , vol. , nos. – , – . https://doi.org/ . / . .]

Within universities new collaborations are forming that go beyond the formality of staff structures to embrace deeper social alliances. These facilitate the more
flexible and responsive ways of working that may well be required to stay relevant in a higher education environment that is challenging but full of opportunity. There are examples of the benefits of evolving communities of practice that encourage a reflective approach and of the development of centers of digital scholarship where co-creation and contribution to research practice flourish with academic partnership. Can greater involvement with research activity help improve both the reflective and critical approach of practitioners and the evidence-base available? Libraries are well placed to provide strategic and service alignment across teaching and learning activities, and a deeper involvement in research can add value to the profession in the round.

Another interesting question posed by this collection of articles and case studies is what is the distinctive contribution of libraries in research collaboration? We have evidence here of librarians working in research offices, similar services being led by different parts of universities, and shared service teams developing where integration blurs the boundaries between providers. We may well be evolving into a more fluid world where career pathways can develop across professions. This can be seen as an opportunity for richer development, not a cause for protectionism. However, it is still useful to ask what a profession or particular role might contribute that is distinctive, even though it may not be unique. Three areas emerge as possible distinctors.

• The role of libraries in the management and development of what we might call socio-informatic space, both physical and virtual. Research is, after all, about the making of new knowledge and librarians have a distinctive contribution to the creation of the research “maker-spaces” of the future where this knowledge will be made and shared.
• Libraries have a particular view of the long term.
Librarians and archivists have a strong track record of caring about access to information for future generations and certainly past the end of a year business plan. Case studies here show that librarians are aware of this responsibility and the scale of the challenge, but perhaps flag this up as a key area where there is more work to do.
• Librarians have a distinctive interest in the full range of scholarly communication activity, as represented in this volume by the engagement with writing centers, editorial activities, social media, evidencing research contribution and impact, and perhaps most importantly embracing the ethical challenges of the changing information landscape and demonstrating commitment to new modes of publishing and dissemination.

There is currently much debate in the sector regarding the next steps for opening up access to scholarly work. New business models are slowly emerging to mixed reception and much scrutiny. This volume highlights through case studies some library-led initiatives to develop new and sustainable approaches through open access university presses, showcasing original works through repositories and adding value through contextualizing and showcasing research impact narratives.

In order to provide sustainable services to support the growing areas of research partnership, libraries are taking a fresh look at best fit staffing configurations and approaches to professional development. Libraries are balancing the need for cross-cutting flexibility in planning and delivery with clarity of purpose, all against the backdrop of economic uncertainty, a challenging policy landscape and a requirement to evidence impact and operational efficiency. In such an environment the ability of libraries to continually hone expertise and demonstrate their contribution to collaborations is paramount.
Libraries are shown in this special issue to be strongly contributing to evidencing activity for both the individual researcher and for the institution as a whole. Australian research particularly shows the growing involvement in bibliometric and altmetric analysis and the potential of the contribution to the overall research environment in institutions. Case studies also demonstrate the expansion of the research repositories to support annual research planning, working to inform institutional strategy. These show that libraries are not just providing research services. They are developing their contributions to the shaping of research visions for universities and in monitoring activity and identifying success.

Many of the articles draw out the challenge of turning pilots and projects for new services, or innovations to existing services, into sustainable core activity. A U.S.-based initiative demonstrates the utility of a specific research innovation framework which draws on design thinking to track projects, maximize lateral communication, and assess funding options. Another model draws on urban theory to inform the development of typologies. Embracing methodological innovation brings research practice to research service development. A collaborative approach to enquiry and investigation is a platform for sustainability.

One of the most striking areas of commonality in this collection of articles is a focus on nontextual elements of research practice. Not only are libraries embracing research data management and undertaking numerical analytics, there are case studies here that are all about visual research objects and outputs and that capture the creation of more visual learning materials. We are potentially at a tipping point where the primacy of the written word as the core means of communicating research may be under challenge. The exponential growth of research data and its heterogeneity is well documented.
in the year in which we have lost john berger, it has never been more apposite to ask anew how we look at the world and reflect on “ways of seeing”. increasingly the information environment is underpinned by algorithms, software coding and forms of data manipulation, and is surfaced in an image-rich context. these visualizations often embrace interaction, from virtual reality worlds to methodological investigation of data mash-ups. extending engagement with these non-textual areas will be essential if libraries are to continue to lead and collaborate on all aspects of research strategy and services.
this range of research papers and case studies provides plenty of reasons to be optimistic regarding libraries of the future. librarians individually, in partnerships and in communities are well placed to be innovative leaders and provide enduring services. public trust in research authenticity and the expertise of researchers is currently the subject of debate. librarians have a role to play in securing and nurturing this trust by providing complementary expertise and skills to support research replication, shared ethical practice, reflective communities, and a responsible approach to use of metrics. they can foster a culture of appropriate openness and necessary privacy; not just for access to research outputs and engagement with all aspects of research activity, but for intelligent citizenship in a global digital world.
edinburgh research explorer
review: citizen jane: battle for the city
citation for published version: clericuzio, p , 'review: citizen jane: battle for the city', journal of the society of architectural historians, vol. , no. , pp. - . https://doi.org/ . /jsah. . . .
digital object identifier (doi): . /jsah. . . .
document version: publisher's pdf, also known as version of record
published in: journal of the society of architectural historians
publisher rights statement: published as clericuzio, p , 'review: citizen jane: battle for the city' journal of the society of architectural historians, vol. , no. , pp. - . doi: . /jsah. . . . © by the society of architectural historians. all rights reserved. authorization to copy this content beyond fair use (as specified in sections and of the u.s. copyright law) for internal or personal use, or the internal or personal use of specific clients, is granted by the regents of the university of california for libraries and other users, provided that they are registered with and pay the specified fee via rightslink® or directly with the copyright clearance center.
multimedia
matt tyrnauer, director, citizen jane: battle for the city, altimeter films, los angeles, , min., http://www.altimeterfilms.com/citizen-jane-battle-for-the-city
the current political climate in the united states, wherein government officials with an authoritarian bent attempt to implement “reforms” in the face of popular resistance and disapproval, bears a strong resemblance to the story of robert moses’s efforts to reconfigure housing and transit systems in new york city in the s– s and the grassroots opposition to them led by jane jacobs. the saltiness of the language with which some might denigrate donald trump’s presidential administration echoes, no doubt, the same sort of diction one could use to describe moses, the de facto director of all municipal construction and associated programs of urban renewal in new york city for much of the two decades after world war ii. meanwhile, the now-familiar mantra “nevertheless, she persisted” could be applied retroactively to jacobs, whose unflagging determination united her neighbors in greenwich village. in the end, jacobs and her cohort scuttled moses’s plans for an elevated lower manhattan expressway—projected to destroy much of the vibrant district and its trademark collection of historic cast-iron buildings—along with his political career. indeed, while these comparisons are warranted, matt tyrnauer’s documentary citizen jane: battle for the city spares us any discussion of the political events of the recent past in favor of a focus on those of nearly sixty years ago.
for seasoned students of the history of urban planning in postwar america, citizen jane offers few new insights.
it is, for all intents and purposes, a recounting of jacobs’s well-known struggle against moses’s destruction of the “slums” he identified as “cancerous areas” to be excised from the urban fabric of new york city. documentary, in this case, is not scholarship, nor is it necessarily meant to be. but in its exceptionally rich visual presentation, which includes resurrected film footage of once-bustling manhattan neighborhoods juxtaposed with shocking stills of the massive empty blocks to which moses—with bomber-like precision—reduced them, citizen jane emerges as a gripping reminder of what went wrong between and in large american cities and the often troubled legacy of the international style and modernist architecture. in this respect, one of the most useful aspects of the film is the way in which it contextualizes jacobs’s work, placing it within both the long-term development of progressivism in the united states and the shorter-term activism of the s, as well as ultimately showing the work’s importance in the face of contemporary global population growth.
tyrnauer structures citizen jane around clips of interviews with a wide variety of experts in the fields of architectural and urban studies (some of whom worked closely with jacobs) as well as with politicians and others. interviewees include mary rowe, mindy fullilove, alexander garvin, alex alexiou, former new york mayor ed koch, robert a. m. stern, and the ubiquitous paul goldberger. these conversations, largely conducted for the film, complement excerpts of recorded interviews with jacobs, moses, and activists such as james baldwin. particularly effective is the way these clips are interspersed with archival footage of the precise subject matter being discussed on the sound track.
for instance, as an interviewee recounts how jacobs’s activism was subject to misogyny within the architectural and urban planning communities, on-screen we see period publicity photographs of the planners of detroit’s lafayette park and new york’s lincoln center. the latter consists of an ominous, high-contrast view of nine men in suits interspersed among the structures in a model of the performing arts complex, gazing impassively and unsympathetically out at us in a manner that only underscores the ruthlessness of their motives and the clarity of their maquettes. in retrospect, the arcade of the metropolitan opera house bears an eerie similarity to giovanni guerrini, ernesto bruno la padula, and mario romano’s palazzo della civiltà italiana, built for benito mussolini’s regime in rome in – . if this does not convince us of the ingrained social bias against jacobs because of her gender, we are reminded later that new yorker critic lewis mumford—whose own book the city in history, published the same year as jacobs’s the death and life of great american cities ( ), also sharply criticized many of the developments of modern urban planning, most notably suburban sprawl—dismissed death and life soon after its publication in a lengthy critique titled “mother jacobs’ home remedies,” indicating with little subtlety that jacobs was poking her nose into a field where women clearly were not welcome.
journal of the society of architectural historians , no. (march ), – , issn - , electronic issn - . © by the society of architectural historians. all rights reserved. please direct all requests for permission to photocopy or reproduce article content through the university of california press’s reprints and permissions web page, http://www.ucpress.edu/journals.php?p=reprints, or via email: jpermissions@ucpress.edu. doi: https://doi.org/ . /jsah. . . . .
no less effective in the film is the music that swells with every mention of moses or his grand urban schemes. one might argue that such embellishment is gratuitous for an audience already sensitive to the problems of moses and his legacy, but it is worth underscoring how deceptive and seductive the ideas of modernist urban planning are—particularly for a generation of university students who have experienced firsthand only post-moses urban environments. indeed, the footage of futurama from the – new york world’s fair, the drawings of le corbusier’s ville contemporaine of , and even the perspective schemes for the lower manhattan expressway that are shown in citizen jane still look inviting, peaceful, and imaginative when unyoked from the film’s sound track (figure ). as economist sanford ikeda explains on camera, it is precisely this order that makes “dead cities” attractive in a purely platonic visual sense. conversely, we are reminded, it is the systematic chaos and messiness of everyday urban life that allows cities to function effectively. jacobs figured all this out when she conducted her extensive anthropological fieldwork for death and life. in contrast, moses, who practiced “armchair history” from the backs of limousines and the protected zones of municipal offices, could not understand why ordinary people’s daily activities would not simply conform to his architectural order.
the film is as much about moses as it is about jacobs. teachers of jacobs’s theories of how urban environments function and flourish will welcome the background that citizen jane provides for the emergence of moses and the forces against which jacobs fought and on which she honed her theory of the city. much of the first part of the film is devoted to the origins of moses’s ideas during the progressive era, particularly the period – .
it explains how he drank deeply from the well of progressive thought, convincing himself that the sinecure of public office could afford him the authority to forcibly reconfigure entire urban districts, which, once physically altered, would witness the eradication of poverty, disease, and overcrowding. in other words, the film recounts a story of how the road to hell is paved with good intentions. moses launched his career in public service at the dawn of the aviation age and the seductive bird’s-eye view of the city it afforded. this aerial perspective on the city, exploited by le corbusier, buckminster fuller, hugh ferriss, and norman bel geddes, among others, encouraged the notion of the urban “master builder” as an omniscient—and often anonymous—godlike figure, able to reconfigure vast urban spaces at will and removed from having to live personally with the consequences of his decisions should they go awry. it is difficult to overemphasize the extent to which moses in reality approximated this imagined figure, since virtually all of the offices that he held were appointments, subject to little oversight from elected officials. as a result, he controlled vast amounts of funding from the federal government and toll revenues, among other sources—powers that make jacobs’s victory all the more impressive. to the filmmaker’s credit, citizen jane resists any hyperbole of direct comparisons between moses and the leaders of the totalitarian regimes of the s and s, whose embrace of control and the persona afforded by the aerial view resembled moses’s viewpoint taken to its logical extreme.
the film neatly sets up the ideological distinction between moses and jacobs in its opening sequences, most strikingly with a split-screen image of the two of them, a duality that tyrnauer sustains throughout (figure ). each interview with jacobs reveals her calm, unpretentious, and earnest personality. when discussing the planning of the lower manhattan expressway, her former colleagues praise the way she brilliantly strategized how to upset the planners’ applecart, at one point using her own daughter in a ribbon-tying ceremony in new york’s washington square to protest a southern extension of fifth avenue through the park. by contrast, nearly every clip of moses portrays him as a brooding, misanthropic, misunderstood bureaucrat, brashly aware that his urban plans are going to meet with opposition for destroying ordinary citizens’ lives, yet thoroughly devoid of compassion.
figure detail of a perspective drawing for the lower manhattan expressway, ca. , as seen in citizen jane: battle for the city (library of congress).
more broadly, citizen jane contextualizes jacobs’s struggle as part of the radical changes of the s, which included challenges to the conventions of social structure. it points out, for example, that the death and life of great american cities, jacobs’s defining work, appeared within two years of the publication of two other pathbreaking books by women, betty friedan’s the feminine mystique ( ) and rachel carson’s silent spring ( ). in a section devoted to st. louis’s pruitt-igoe housing project, the film deftly highlights the ugly racial aspect of urban renewal during the turbulent days of the civil rights movement. it reminds us that much of such new housing was designed expressly to remove black americans from areas ripe for redevelopment by whites, often confining them to prison-like high-rise structures.
to be sure, citizen jane is not free of faults. it is uneven in its dating of events, for instance. we are not told precisely when jacobs began writing death and life or when she began her campaign against the lower manhattan expressway, only that she undertook the campaign after she finished writing death and life. likewise, the film does not make clear that the idea of the expressway was not finally quashed until .
(instead, we are led to believe that it died after jacobs disrupted a meeting of the city commissioners in december .) later, robert a. m. stern informs us in an interview that le corbusier’s visions for housing never included “towers in a park” configurations such as those moses implemented, but we know that several of le corbusier’s housing schemes from the mid- s onward consisted of just that. perhaps fittingly, jacobs’s organized resistance to moses mirrored the grassroots opposition that had earlier doomed le corbusier’s unité-based plans to redevelop war-ravaged areas of france in the mid- s, the adoptions of which le corbusier attempted to force through using similar top-down authoritarian methods. finally, the film mentions only in passing jacobs’s work in toronto opposing the spadina expressway, a cause she took up after she moved to canada in , the success of which is testament to the power of both her ideas and the depths of her personal commitment to them.
the most important part of citizen jane is saved for last. in its closing sequence, the film directs our attention to china, where entire districts of newly built, uniform apartment towers are creating what promise to be the slums of the future. the film thus reinforces the old cliché that history, when unlearned—or learned too well—too easily repeats itself. here, there is no resolution to the mess created by moses in new york and by his followers elsewhere in the united states; we are simply reminded that, once destroyed, the neighborhoods that gave our cities their distinctive architectural and social character can never be re-created or duplicated. the most potent aspects of citizen jane are thus the questions and challenges the film poses to architects, planners, developers, real estate owners, and designers in training: for whom and by whom is the world of the future going to be built? and will we all want to live there?
figure jane jacobs and robert moses, as seen in citizen jane: battle for the city (photos by fred mcdarrah, cbs archive, getty images).
peter clericuzio
university of pittsburgh
notes
lewis mumford, the city in history: its origins, its transformations, and its prospects (new york: harcourt, brace and world, ); jane jacobs, the death and life of great american cities (new york: random house, ); lewis mumford, “mother jacobs’ home remedies,” the sky line, new yorker, dec. , – .
on the development of the personification of the master builder during the first half of the twentieth century, particularly in the exploits of fuller, ferriss, and bel geddes, see adnan morshed, impossible heights: skyscrapers, flight, and the master builder (minneapolis: university of minnesota press, ). incidentally, moses secluded himself in a high-rise co-op apartment overlooking the east river on manhattan’s upper east side; thus his back was strategically turned toward the borough whose “slums” he had removed with surgical precision. see tanay warerkar, “robert moses’s former upper east side co-op, on top of fdr drive, hits the market,” curbed, new york, may , https://ny.curbed.com/ / / / /robert-moses-upper-east-side-apartment-for-sale (accessed july ).
the redevelopment of the hill district of pittsburgh provides a good example. see allen dieterich-ward, beyond rust: metropolitan pittsburgh and the fate of industrial america (philadelphia: university of pennsylvania press, ), esp. – ; laura grantmyre, “visual representations of redevelopment in pittsburgh’s hill district, – ” (phd diss., university of pittsburgh, ).
this in spite of le corbusier’s allies at higher levels of the french national government who also tried to force their adoption. see le corbusier’s oeuvre complète, vol.
, – (zurich: Éditions girsberger, ); peter clericuzio, “le corbusier and the reconstruction of saint-dié: the debate over modernism in france, – ,” chicago art journal ( ), – .
scalar and omeka
https://scalar.me/anvc/
http://omeka.org
the migration of academic scholarship to digital platforms over the past two decades has seen its share of ups and downs. advocates of digitization and online publication have pointed out the lower costs of publishing, the democratizing potential of the internet, and the increased opportunities for collaboration and iterative work afforded by digital and online tools. at the same time, critics of online publication have pointed to questions of rigor and peer review, along with dilemmas such as how to preserve and cite works that are inherently unstable, how to credit individuals involved in collaborative efforts, and how to assess work that by its nature involves ongoing change and multiple versions.
to address some of the challenges of digital scholarship, scholarly societies have adopted standards for assessment, including the “guidelines for the evaluation of digital scholarship in art and architectural history” developed by the society of architectural historians in partnership with the college art association and released in . at the same time, a number of university presses are investing in the development of platforms that reimagine the process of publishing peer-reviewed born-digital work. take, for instance, the recently released platform manifold scholarship, which is being developed by the university of minnesota press in collaboration with gc digital scholarship lab at the city university of new york. like many such platforms, manifold is open source, meaning the original source code for the software is freely and publicly available for anyone to adopt or modify.
in the same spirit, this software departs from the print and even e-book models for textual academic scholarship in that the document remains dynamic, collaborative, and iterative, allowing for commentary from a community of readers as well as revision and expansion of the project by the author beyond its first release.
even as manifold and other university press developments promise a rigorous and redefined future for digital scholarship, implementing these platforms for an individual research project outside the institutional support of a team of computer programmers and information technology specialists would likely be rather difficult, if not formidable, for the general scholar. however, platforms are increasingly available that allow nonexperts to engage in electronic self-publishing, whether for the dissemination of their own research or for use in the classroom. with an eye to the financial crisis faced by many university presses, and in particular the high cost of publishing media-rich scholarship, one of the earliest examples of this new kind of digital publication platform came out of the alliance for networking visual culture, which was formed in at the university of southern california. supported by funding from the andrew w. mellon foundation, the anvc proposed scalar, a platform that breaks from the linear mode of reading inherent in the printed text and allows scholars to reconsider the relationship between the archive (in many cases now digitized), analysis, and publication. central to the anvc strategy are partnerships with archival repositories, humanities centers, libraries, and university presses, including the university of michigan, mit, the university of california, open humanities press, new york university, and duke university.
scalar’s flexible interface invites users with a range of technical expertise to create media-rich long-form writing and exhibition-like content, as well as more complex digital humanities projects. an online platform currently hosted by the university of southern california, scalar is ideal for architectural and urban history projects, as it is designed to incorporate different types of media, including images, video, and audio. these audiovisual media are treated as autonomous, first-order content that can be annotated with unlimited numbers of unique captions for different parts of an argument and used repeatedly throughout the text. with scalar, the writer may organize content using one or more “paths” that guide a reader through a series of pages and may connect with other paths of content. despite their intended effects of indeterminacy and multiplicity, paths in scalar ultimately encourage a linear reading of the text. however, the author can associate audiovisual media and textual pages with tags that encourage a nonlinear mode of reading. to help, scalar includes built-in visualizations that the author may use to organize the networks formed by paths and tags in complex text as radial graphs, grids, and tag clouds. these visualizations can add a layer of analysis for the author or provide a reader with additional means of discovering and exploring the
using ancillary text to index web-based multimedia objects
lyne da sylva, lyne.da.sylva@umontreal.ca
james m turner, james.turner@umontreal.ca
École de bibliothéconomie et des sciences de l’information, université de montréal
introduction
périculture is the name of a research project at the université de montréal which is part of a larger parent project based at the université de sherbrooke. the parent project aimed to form a research network for managing canadian digital cultural content. the project was financed by canadian heritage and was conducted during the fiscal year - . périculture takes its name from péritexte and culture, péritexte being one of a number of terms used (in french, our working language) to mean ancillary text associated with images and sound. it is a sister project to digiculture, another part of the same larger research project which studied user behaviours in interactions with canadian digital cultural content.
the general research objective of périculture was to study indexing methods for web-based non-textual cultural content, specifically still images, video, and sound. specific objectives included:
• identifying properties of ancillary text useful for indexing;
• comparing various combinations of these properties in terms of performance in retrieval;
• contributing to the development of bilingual and multilingual searching environments;
• developing retrieval strategies using ancillary text and synonyms of useful terms found therein.
our work in the context of this project focusses on text associated with web-based still images, and builds on previous work in this area of information science (e.g. goodrum and spink ; jörgensen , ; jörgensen et al. ; turner and hudon ). we identified a number of web sites that met our criteria, i.e., that contained multimedia objects, that had text associated with these objects that was broader than file names and captions, that were bilingual (english and french), and that housed canadian digital cultural content. we identified keywords that were useful in indexing and studied their proximity to the object described.
we looked at indexing information contained in the tag and the “alt” attribute of the <img> tag, and whether other tags contained useful indexing terms. we studied whether standards such as the dublin core were used. we identified web-based resources for gathering synonyms for the keywords.
background and context
in computer science, research into indexing images and sound focusses on the low-level approach, performing statistical manipulations on primitives in order to identify semantic content (e.g. alvarez et al. ). this approach is also referred to as the content-based approach (e.g. gupta and jain , lew ). in information science, research into indexing images and sound focusses on associating textual information with the non-textual elements, and this often involves manipulating ancillary text. this approach is referred to as the high-level or concept-based approach (e.g. rasmussen , o'connor, o'connor, and abbas ).
although human indexing remains necessary in a variety of situations, a number of factors militate in favour of automating the high-level approach as much as possible. these include the very large volume of web-based materials available and the high cost of human indexing, which in any case is relatively inconsistent. the disparity among cataloguing and indexing methods from one collection of images to another was not a problem until recently, because collections were self-contained, and users learned the indexing languages necessary to reach the information available in the collections they use. now, however, there is a strong desire to connect repositories worldwide and to provide interfaces which allow users to search multiple collections using a single search strategy. the usefulness of permitting multilingual searches of these same online collections is clear; however, much work remains to be done in setting up the organisational infrastructure of collections to permit this.
lyne da sylva’s work in this connection has dealt mainly with automatic indexing of text documents. her prototype system for automatic indexing (da sylva , ) constructs a back-of-the-book index for digital documents. it relies extensively on insights from human methodology for indexing books as well as on properties of index entries and semantic relationships between headings in the index. several techniques for spotting and handling linguistic cues in text documents are directly applicable to the indexing of non-textual elements, given appropriate ancillary text.
james turner has worked for a number of years on the general problem of using ancillary text to index still and moving images, with emphasis on shot-by-shot indexing of moving images. this work has shown that approaches to indexing images are rather different from those used for indexing text documents. people asked to describe non-art images name the objects seen in the images, as well as persons and events. of all the names an object might have, only one or a few names are given very often. furthermore, no significant differences in how the objects are described by various groups such as students and workers or visually-oriented and non-visually-oriented persons were found (turner ). the transfer from visually perceiving the image and naming the objects found in the image seems rather direct (although this is not necessarily true in the case of art images), whereas text indexing requires more strenuous cognitive and intellectual work in interpreting the meaning of the text into indexing terms useful in searching. in studies looking at cross-language differences between english and french, no cultural differences were found in how canadian english-speakers and french-speakers describe images (turner and roulier ).
in a study of automated translation between english and french of indexing terms for moving images, eight web-based translators performed very well, with success rates between % and %, in taking the indexing terms in either language and putting them correctly into the other (turner and hudon ).
the research results reported here build on this work by studying other aspects of text associated with images in a networked environment to try to gain some understanding of how the ancillary text associated with images on web pages can be exploited to index the corresponding images. we studied this question in the context of web sites that were canadian, bilingual (english and french), and with patrimonial multimedia content.
methodology
constituting the corpus
in order to conduct our study, we first constituted a corpus of web sites that met these criteria. to build this corpus, we did an initial web search on keywords such as “museum”, “canadian”, and “cultural”, in order to make a list of canadian museum and government cultural sites. interesting links found on the pages returned as results were followed, and in the end we reviewed forty sites in order to constitute our corpus. we then made a preliminary evaluation of the sites in the corpus, assigning to each site a score calculated using a number of criteria. some of the criteria increased the score, and others decreased it. criteria such as the presence of both official languages, cultural content, images, video or sound increased the score, as did the use of dublin core metadata, html, and use of the tag in html. use of frames, flash elements, tables, or asp code decreased the score. other elements considered were the approximate number of words of text on the page, and the ratio between the number of multimedia elements present and the number of words of accompanying text. in this way, we were able to calculate a score for each site.
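the scoring step described above can be illustrated with a minimal sketch. the paper does not report the weights it used, so the unit bonus and penalty below are assumptions, and the names `SiteFeatures`, `score_site`, and `media_to_text_ratio` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SiteFeatures:
    """Observed properties of one candidate site (hypothetical structure)."""
    bilingual: bool            # both official languages present
    cultural_content: bool
    has_images: bool
    has_video_or_sound: bool
    uses_dublin_core: bool
    uses_frames: bool
    uses_flash: bool
    uses_tables: bool
    uses_asp: bool


def score_site(f: SiteFeatures, bonus: int = 1, penalty: int = 1) -> int:
    """Add `bonus` for each criterion said to raise the score and subtract
    `penalty` for each one said to lower it. The weights are assumed."""
    positives = [f.bilingual, f.cultural_content, f.has_images,
                 f.has_video_or_sound, f.uses_dublin_core]
    negatives = [f.uses_frames, f.uses_flash, f.uses_tables, f.uses_asp]
    return bonus * sum(positives) - penalty * sum(negatives)


def media_to_text_ratio(n_media: int, n_words: int) -> float:
    """Ratio of multimedia elements to words of accompanying text."""
    return n_media / n_words if n_words else 0.0


example = SiteFeatures(bilingual=True, cultural_content=True, has_images=True,
                       has_video_or_sound=False, uses_dublin_core=True,
                       uses_frames=False, uses_flash=True,
                       uses_tables=False, uses_asp=False)
print(score_site(example), media_to_text_ratio(4, 200))
```

an additive score of this kind makes the selection reproducible: sites can be ranked and the highest-scoring ones retained, as long as the chosen weights are recorded alongside the results.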
as a result of this exercise, twenty-five sites were retained for further study, each containing between one and thirty-five pages. the corpus is thus made up of selected web pages from the chosen sites. the pages do not necessarily represent all the multimedia objects available at these sites. for example, in the case of the musée virtuel canadien/virtual museum canada site, the web pages we examined do not include all those containing paintings. since a number of criteria were used to score each page, not all pages with useful elements to study made it into our study sample; only those with the highest scores were selected.

types of ancillary text considered

web sites are often configured to generate pages dynamically from information in a database in response to a query. in cultural institutions, these collections are often indexed manually (i.e., by humans), to varying degrees of exhaustivity but to the degree that is deemed appropriate by the collection managers or that is possible as a function of the resources available. in the context of the present study, this indexing is not of primary interest to us since it is purposeful, specific to the multimedia objects indexed, and takes into consideration the needs of users of the collection. here we are interested rather in ancillary text, that which is found in association with multimedia objects but that was created for purposes other than that of indexing the associated objects. following are the types of ancillary text we studied:

• the filename of the multimedia object, i.e. the value of the “name” attribute of the <img> tag. file extensions include gif (for image files), mp , wmf, wav or aif (for audio files) and mpg, mpeg, wmv, qtv (for video files);
• the value contained in the “alt” attribute of the <img> tag, which offers a textual description of the image when the visual display is hindered in some way;
• text from a hyperlink which points to the multimedia object, and more specifically, the text between the anchor tags (<a>this text</a>);
• a legend or other label for the multimedia object;
• the value of metadata elements (<meta> elements) associated with the html file;
• the title of the page as indicated within the <title> tags in html;
• headers in the page, as expressed with the tags <h1>, <h2>, and so on;
• text contained in the paragraphs immediately preceding or following a multimedia object.

for each html page of the sites we selected, we made a detailed analysis which consisted of identifying all types of ancillary text present and determining whether each text element (e.g. the full title, or the text contained in the “alt” attribute) was useful for describing the multimedia objects on the page. for example, given a photograph of a tepee, it was determined whether expressions such as “typical native lodgings” were useful for indexing. for paragraphs of full text surrounding the object, we counted the number of words in the paragraph, as well as the number of useful indexing terms the paragraph contained. from this, we calculated a ratio of useful keywords to the number of words in a paragraph. in each case, the research assistants made a judgement call as to whether the term could be considered useful for indexing, based on whether the term seemed to be descriptive of the object.
locating and making an inventory of multimedia objects and their associated indexing terms locating multimedia objects and identifying their relative position to potentially useful ancillary text were complex tasks, since tables or frames (or other features of html mark-up) often blurred what seemed obvious in the resulting page. these tasks were necessary, however, especially as regards paragraphs immediately preceding or following multimedia objects, since these were a primary object of study. to try to maintain some uniformity in the data, the positions of multimedia objects are given by line number (in the original html file) and not in terms of the number of intervening words. in cases where the multimedia objects and the ancillary text were separated by javascript code rather than html code, the number of intervening lines turns out to be smaller than the number of words, which reduces unwarranted discrepancies, although some do remain. to facilitate processing, line numbers were calculated approximately using the line-numbering function in word. we identified images by searching for <img> elements in the html code. where they were found, the cor- responding url was visited to determine whether the page also contained words that were useful for index- ing. other types of multimedia objects were found by manual inspection of the html code, since they are generally referred to by a hyperlink. distinction was made between the following types of multimedia ob- jects: image, sound, video, and link. links are not multimedia objects, of course, but they represent them in the html page. the user may be presented with a quicktime icon, for example, to represent the multimedia object. in our study, both true multimedia objects and links to them were considered as targets to be in- dexed. we limited the multimedia objects in our study to two still images. the rationale for this is discussed in the results section. 
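In the study, locating multimedia objects and their ancillary text was done by inspecting the html by hand. As a rough sketch of how part of that inventory might be automated, the following uses Python's standard `html.parser` module to record, for each `<img>` element, its line number in the source, its file name, and its “alt” text, along with the page title and the anchor text of links to media files. The class name `AncillaryTextParser`, the short extension list, the choice to read the file name from the `src` attribute, and the sample page are all assumptions made for illustration.

```python
from html.parser import HTMLParser
import posixpath

# Illustrative subset of media file extensions (an assumption).
MEDIA_EXT = (".gif", ".jpg", ".wav", ".mpg")


class AncillaryTextParser(HTMLParser):
    """Collect the page title, <img> file names and alt text, and the
    anchor text of links that point to media files."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []       # one dict per <img>: line, file_name, alt
        self.media_links = []  # one dict per media link: line, href, text
        self._in_title = False
        self._link = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        line, _ = self.getpos()  # line number in the html source
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            src = attrs.get("src") or ""
            self.images.append({
                "line": line,
                "file_name": posixpath.basename(src),
                "alt": attrs.get("alt") or "",
            })
        elif tag == "a":
            href = attrs.get("href") or ""
            if href.lower().endswith(MEDIA_EXT):
                self._link = {"line": line, "href": href, "text": ""}

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._link is not None:
            self._link["text"] = self._link["text"].strip()
            self.media_links.append(self._link)
            self._link = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._link is not None:
            self._link["text"] += data


# A made-up page, not taken from the corpus.
page = """<html><head><title>Canadian Pacific Railway</title></head>
<body>
<img src="images/ts01.jpg" alt="Waiting room in a rural station">
<a href="clips/train.mpg">Watch the train arrive</a>
</body></html>"""

parser = AncillaryTextParser()
parser.feed(page)
print(parser.title, parser.images, parser.media_links)
```

A sketch like this only gathers candidate text; deciding whether a given term is useful for indexing remained a human judgement in the study.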
for each html page of the selected sites, we located all types of ancillary text associated with the images (e.g. the text between the <title> tags, the text contained as a value of the “alt” attribute, and so on). next, we determined whether the text was useful for describing the multimedia objects. often, the “alt” attribute consisted of a descriptive legend such as “the honourable liza frulla”. sometimes the title of the page was a very general term which could also be used to describe the multimedia objects, such as “constructing the canadian pacific railroad”. other times, the text was not useful for our purposes, either because it had no meaning (e.g. file names consisting of alpha-numeric sequences such as “ts .jpg”) or because the ancillary text was ambiguous or non-descriptive. some multimedia objects could not be considered cultural content but rather were graphic elements used for navigation or identification, such as arrows or labels. company logos were considered a special case. although they may be considered multimedia objects representing canadian entities, they were discarded from the analysis because they often appear on every page of the web site, regardless of the theme of the page. we felt that including them would introduce too much noise in the analysis of multimedia objects.

as we mentioned, not all multimedia objects we considered are visible on the html page as it is displayed. some are only accessible via a click, so we identified each multimedia object as being visible or not. this was especially relevant in the case of images that were not visible but that had an intermediary visible counterpart in the form of a thumbnail of the image. although the thumbnail is not the image itself as such, we considered it to be the multimedia object, since it does appear on the page and since ancillary text can describe both the target image that requires a click to be displayed and the corresponding thumbnail.
these methodological acrobatics allowed us to obtain a useful measure of the distance between the multimedia object and the ancillary text which describes it. links pointing to video clips were treated the same way. for each multimedia object, we extracted from the corresponding html page a list of words that describe it, by reading the entire page. the “find” function in word was then used to group occurrences of the words from this list. finally, we produced a list of pairs, each pair consisting of the candidate indexing term and the type of ancillary text in which the term occurred.

results

each page of each site was examined to locate the multimedia objects and to determine whether ancillary text was present and if so, whether it could be considered useful for indexing. ancillary text is not always present. a number of conditions were identified: the title of the page is sometimes left unspecified, a given <img> tag may contain no value for the “alt” attribute, and so on. the table below summarises the results for the pages we examined. the data reveal a number of interesting phenomena which we present in this section.
____________________________________________________________________________________________
table . types of ancillary text for web pages in the corpus with indications of whether types present were considered useful

types                   present/total   percentage   useful/total present   percentage
file name               /               .            /                      .
“alt” attribute         /               .            /                      .
hypertext               /               ,            /                      .
legend/label/caption    /               .            /                      .
<meta> elements         /               .            /                      .
title of page           /               .            /                      .
paragraphs              /               .            /                      .
____________________________________________________________________________________________
file names for multimedia objects are almost always present ( . % of the time). however, since they are considered useful (i.e. descriptive of the corresponding object) only . % of the time, their role as indexing terms is rather limited. in approximately two-thirds of the cases ( .
%) there was a value assigned to the “alt” attribute of the <img> tag. however, it was considered useful less than half the time ( . %). names found in hypertext links are present only one-third of the time ( . %). however, when they are present they are considered useful almost all the time ( . %). text in legends, labels, or captions is also relatively rare in the data we examined, occurring only . % of the time. however, when such text is present, it is a very reliable indicator of the content ( % of the cases were deemed useful). this is not surprising, since the purpose of such text is to identify information in the picture. in our study this category of data was identified by human observation, but a way of identifying it automatically needs to be found; otherwise the data would have to be distributed among the other categories. the explicit nature of this kind of text (e.g. a legend created specifically to give descriptive information about the corresponding image) militates in favour of finding some way to help an algorithm identify it as such, even if this means developing a special tag for it.

similarly, data is included in the <meta> tag only about one-quarter of the time ( . %), but when it is present it is almost always useful ( . % of the time in the data we analysed). again, the usefulness of such text is not surprising, since the tag exists to facilitate adding searchable terms for describing the content. the low rate of use of the <meta> tag is not surprising either, since widespread abuse of the tag has caused almost all popular search engines to disregard it (sullivan ).

titles of html pages as expressed in the <title> tag are virtually always present ( . % of the time). in addition, the text contained in these titles is very reliable for indexing purposes. in our data, the text of titles was deemed useful % of the time.
however, in almost all cases the words were classed as “general” so that their usefulness in indexing the specific content of images is limited. for example, the title “canadian pacific railway” gives some helpful general information but does not describe an image of the inside of a waiting room in a rural train station that might be found on the page. another example, in which text is only vaguely useful or even outright misleading is a page entitled “our roots” and containing a photograph entitled “along the fifth” (i.e. fifth avenue). while the connection as general indexing can be made without much trouble, it is clear that users searching for images of roots will be disappointed if they land on this page. paragraphs immediately preceding or following multimedia objects appear to be the most promising sources of useful indexing text in the data we studied. text adjacent to multimedia objects was both very frequent (it occurred . % of the time) and very useful ( . % of the time there were indexing terms consid- ered useful). however, this figure for usefulness requires some explanation. the length of paragraphs (identi- fied by the <p> tag for our purposes) varied greatly. in addition, we allowed untagged paragraphs to be counted in our analysis. thus in some cases a very large paragraph containing only one or a few useful index- ing terms is considered useful. in table we try to account more precisely for these variations, since we believe they warrant closer analysis. here we seek to determine the distance from the multimedia object at which useful text can still be found, and whether the text preceding the multimedia object is more useful than the text following it. to determine this, we examined closely the size of paragraphs preceding and following images in terms of the number of words they contained and how many of these can be considered keywords useful for indexing. 
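The per-paragraph measurements described above (word counts, useful-keyword counts, and the share of words that are useful) can be summarised with a few lines of standard-library Python. This is a sketch under stated assumptions: the sample figures are invented, the function name `summarise` is hypothetical, the sample standard deviation is used (the study does not say whether it used the sample or population form), and the percentage is pooled over all paragraphs rather than averaged per paragraph.

```python
from statistics import mean, stdev


def summarise(paragraphs):
    """paragraphs: list of (word_count, useful_keyword_count) pairs,
    one pair per paragraph adjacent to an image."""
    words = [w for w, _ in paragraphs]
    keywords = [k for _, k in paragraphs]
    total_words = sum(words)
    return {
        "mean_words": mean(words),
        "sd_words": stdev(words),          # sample standard deviation (assumed)
        "mean_keywords": mean(keywords),
        "sd_keywords": stdev(keywords),
        # Pooled share of useful keywords among all words (assumed definition).
        "pct_useful": 100 * sum(keywords) / total_words if total_words else 0.0,
    }


# Invented figures for three paragraphs, for illustration only.
sample = [(40, 4), (60, 6), (80, 2)]
summary = summarise(sample)
print(summary)
```

Running the same function separately on the paragraphs before and after each image would give the four columns of figures that the comparison calls for.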
____________________________________________________________________________________________
table . mean and standard deviation for number of words and number of useful keywords

                                paragraph       paragraph      paragraph       paragraph
                                before image    after image    before image    after image
no. web pages in sample
mean no. words in paragraph     .               .              .               .
standard deviation              .               .              .               .
mean no. keywords in parag.     .               .              .               .
standard deviation              .               .              .               .
percentage useful keywords      ,               .              .               .
ratio useful keywords           :               :              :               :
____________________________________________________________________________________________
in order to make the task manageable, we limited the analysis to the first two images in each web page. this strategy also allows us to compare the behaviour of initial images with that of images in the main body of a web page. the number of images in each web page varied, of course, and since not all pages contained more than two, this approach allowed us to work with a more uniform pool of data. our sample of pages represents those that included at least one image (i.e., two pages out of the had no images), and twenty-two of these pages included at least one more image. we did not remove stop words in this analysis because of limited resources available for processing. however, it would be helpful to compare the results against another analysis in which stop words are removed, as the figures we obtained would probably change considerably.

three observations are in order here. first, the mean number of words in the paragraphs preceding image ( . words) is smaller than that of paragraphs following it ( . words). this is true despite the fact that if the text following image was the same text as that preceding image , it was only counted as the latter, and thus the text following image was set to zero (the same was done for image , in cases where there existed a third image).
figures for image show the inverse pattern, although the difference in the fig- ures is not as great: on average, . words precede the image and . words follow it. we speculate that this may reflect properties of the first image, often situated near the beginning of a page, with most of the page’s text following it, while the properties of other images (image , in our analysis) are less predict- able. an alternate explanation is that the pages we analysed were relatively short, so that if a second image were present, there remained little text after it. interestingly, in our sample the length of text before each image was comparable (between and words). this corresponds to roughly a paragraph of lines on letter-sized paper. however, this comparable length does not correlate with comparable usefulness, as we shall see presently. the second observation has to do with the percentage of words useful for indexing. this varies greatly within adjacent paragraphs. for the first image, . % of words preceding it are useful for indexing, com- pared to . % in the paragraphs following it. it is reasonable to expect initial paragraphs to be more informa- tive; however, it is counterintuitive when one considers the large size ( . words on average) of the fol- lowing paragraph, which then seems wordy without being descriptive of the image preceding. for the second image, however, the percentages are reversed. only . % of words preceding image were deemed relevant for indexing (despite the comparable size of preceding paragraphs for each image), while . % of words in the following paragraph were considered useful. this may have to do with some type of conclusive matter in the paragraphs following image , which in most cases is the last of the page. the third observation we wish to make is that the standard deviation for the number of words in the para- graphs is quite high, ranging from three quarters of the mean ( . for a mean of . ) to almost double ( . for a mean of . ). 
this indicates great variation in the number of words the paragraphs contain. for the number of useful keywords, there is strong variation in the paragraph following the first image ( . standard deviation for a mean of . ), but it is smaller in the paragraph preceding it ( . standard deviation for a mean of . ), and in either paragraph adjacent to image ( . standard deviation for a mean of . and . standard deviation for a mean of . respectively). this needs to be taken into account when building algorithms for seeking out keywords.

discussion

as we have seen, some of the elements of ancillary text studied are very useful as sources of indexing terms, and others not very useful at all. all types make some contribution to the pool of indexing terms that can be derived from ancillary text. not surprisingly, the most useful elements are those that are designed to hold descriptive content, such as the <meta> tag and legends or their equivalent. the least useful is the name of the file, found to be useful only . % of the time. some elements are not always present, but when they are, they are very useful.

the “alt” attribute of the image tag allows creators of web pages to include some description of an image for the benefit of users who cannot see the image for any of a number of reasons. as we noted, in our data there was a value assigned to this attribute . % of the time, and when there was a value it was found to be useful only . % of the time. these data suggest that the “alt” attribute is underexploited, an observation that has been made in other contexts, and which further suggests that those responsible for creating web page content should be made aware of the potential benefits of using this attribute.
perhaps the richest source of potential keywords for indexing is the text of the paragraphs surrounding the images on web pages, because these contain the greatest number of words, although only a percentage of these words are useful for indexing. as we noted, we did not remove the stop-words in this study; once this is done, the percentages and ratios of useful keywords should improve considerably. we made a number of casual observations about the properties of the text of paragraphs, and these observations might now be formulated more rigorously as research questions to help get a better understanding of the nature of this text. we notice that the first image of the web pages we studied occurred rather near the beginning of the page, so that the text surrounding the first image may well have different properties from those of the text surrounding additional images on the page. we also observe that the paragraphs following the second image seem to be the most informative, followed by those preceding the first image, although variations were considerable in our data. we also observe that the paragraphs preceding the second image are less useful for indexing than those preceding the first image, for a comparable number of words. finally, paragraphs following the first image are typically quite large but not very useful for indexing.

conclusions

our study found that a large number of useful indexing terms are available in the ancillary text of web sites with cultural content. we evaluated various types of ancillary text as to their usefulness in retrieval. our results suggest that these terms can be manipulated in a number of ways in automated retrieval systems to improve search results. cross-language comparison of the results reinforces our previous research results, which suggest that indexing in other languages can be generated automatically from a single language using web-based tools.
rich information that can be used for retrieval is available in many places on web sites with cultural con- tent, from the file name to explicit information in captions to descriptive information in surrounding text to the contents of various html tags. algorithms need to be developed to exploit this information in order to improve retrieval. some of our previous work on how people assign indexing terms to images suggests that noun phrases are probably the most useful indexing constructs of all. including in the search algorithm some kind of parser that could identify noun phrases would undoubtedly be helpful. building further on previous and present results, indexing terms that have been identified as such could then be filtered through a bilingual dictionary in order to provide indexing in the other language. this princi- ple can probably be extended to create additional indexing in other languages; however, the universal feasi- bility of this principle needs to be demonstrated empirically. a further step that can be undertaken to improve performance is to filter the indexing terms through an online thesaurus, in order to pick up synonyms and hierarchically-related words. for example, it would be helpful to be able to manipulate specific indexing terms such as “sparrow” so that users searching for images of any birds could also find these. for ancillary text which is sometimes but not always useful, such as the “alt” attribute of the <img> tag, one possible direction for research would be to analyse words to estimate their usefulness in a given page: for example, does the word appear in a thesaurus of the domain? does it confirm or is it compatible with informa- tion already present in other ancillary text? such analysis would require external terminological resources. it is clear that much more can be done to improve the performance of search algorithms for finding multi- media objects in a networked environment. 
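The two filtering steps proposed above, expansion through a thesaurus and translation through a bilingual dictionary, might be sketched as follows. The tiny thesaurus and dictionary here are invented stand-ins for the external terminological resources the text calls for, and the names `expand` and `index_entry` are hypothetical.

```python
# Toy resources, invented for illustration. A real system would draw on an
# online thesaurus and a full bilingual dictionary.
BROADER = {"sparrow": ["bird"], "bird": ["animal"]}
EN_FR = {"sparrow": "moineau", "bird": "oiseau", "animal": "animal"}


def expand(term, broader=BROADER):
    """Return the term followed by its hierarchically broader terms."""
    seen, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(broader.get(current, []))
    return seen


def index_entry(term):
    """Build English indexing terms and their French equivalents."""
    terms = expand(term)
    return {"en": terms, "fr": [EN_FR[t] for t in terms if t in EN_FR]}


print(index_entry("sparrow"))
```

With an index built this way, a search for images of birds would also retrieve an image indexed only under “sparrow”.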
we hope the results from this study make some contribution, even a modest one, to solving this problem. enough knowledge has been gained to assure us that investing more effort in the area of exploiting ancillary text for indexing web-based multimedia objects is an invest- ment that will surely pay off. acknowledgements we thank canadian heritage for funding this project with a grant received via corimedia, a research consor- tium based at the université de sherbrooke which focusses on access to multimedia information. we also thank our research assistants nawel nassr and stéphane boivin for their contribution to this work. references and bibliography alvarez, c., oumohmed, a. i., mignotte, m., and nie, j.-y. ( ). toward cross-language and cross-media image retrieval. in working notes for the clef workshop, - september, bath, uk. da sylva, l., and doll, f. ( ). a document browsing tool: using lexical classes to convey information. in lapalme, g. and kégl, b. (eds), advances in artificial intelligence: th conference of the canadian soci- ety for computational studies of intelligence, canadian ai (proceedings). new york: springer-verlag, pp. - . da sylva, l. ( ). relations sémantiques pour l’indexation automatique: définition d'objectifs pour la détec- tion automatique, document numérique, numéro spécial « fouille de textes et organisation de docu- ments », ( ): - . goodrum, a. and spink a. ( ). image searching on the excite web search engine, information processing and management, ( ): - . gupta, a. and jain, r. c. ( ). visual information retrieval, communications of the acm, ( ): - . jörgensen, c. ( ). image attributes: an investigation. phd thesis, syracuse university. jörgensen, c. ( ). image attributes in describing tasks: an investigation, information processing and man- agement, ( / ): - . jörgensen, c., jaimes, a., benitez, a. b., and chang, s.-f. ( ). 
a conceptual framework and empirical re- search for classifying visual descriptors, journal of the american society for information science and technology (jasist), ( ): - . lew, m. s. ( ). principles of visual information retrieval. new york: springer. marsh, e. e., and white, m. d. ( ). a taxonomy of relationships between images and text, journal of documentation, ( ): - . o’connor, b. c., o’connor, m. k., and abbas, j. m. ( ). user reactions as access mechanism: an explora- tion based upon captions for images, journal of the american society for information science, ( ): - . rasmussen, e. m. ( ). indexing images. in williams, m. e. (ed.), annual review of information science and technology, . medford, nj: learned information, pp. - . sullivan, d. ( ). death of a meta tag, searchenginewatch, http://searchenginewatch.com/sereport/article.php/ (first published october , ; last accessed january ). turner, j. m. and hudon, m. ( ). multilingual metadata for moving image databases: preliminary results. in howarth, l.c., cronin, c., slawek, a. t. (eds), l'avancement du savoir : élargir les horizons des sciences de l'information, travaux du e congrès annuel de l'association canadienne des sciences de l'information, toronto: faculty of information studies, pp. - . turner, j. m. and roulier, j.-f. ( ). la description d’images fixes et en mouvement par deux groupes lin- guistiques, anglophone et francophone, au québec, documentation et bibliothèques, ( ): – . turner, j. ( ). determining the subject content of still and moving image documents for storage and re- trieval: an experimental investigation. phd thesis, university of toronto. appendix following is a list of the sites we studied. these were selected on the basis of the following criteria: the presence of multimedia objects, text in both english and french, sufficient text, html code, dublin core metadata and absence (or relative non-importance) of flash objects or other dynamic code. 
sites bank of canada currency museum / musée de la monnaie de la banque du canada http://www.museedelamonnaie.ca/fre/index.php bonjour québec (québec government official tourist site / site touristique officiel du gouvernement du qué- bec) http://www.bonjourquebec.com/francais/attraits/ http://www.bonjourquebec.com/francais/restauration/index.html http://www.bonjourquebec.com/francais/regions/index.html canadian conservation institute / institut canadien de conservation http://www.cci-icc.gc.ca/html/ canadian museum of civilization / société du musée canadien des civilisations http://www.civilisations.ca : selected pages including : « histoire des autochtones du canada » (http://www.civilisations.ca/archeo/hnpc/npint f.html); « salles des trésors » (http://www.civilisations.ca/tresors/tresorsf.asp); « lois etherington betteridge - orfèvre » (http://www.civilisations.ca/arts/bronfman/better f.html) canadian museum of nature / musée canadien de la nature http://nature.ca/ selected pages : « nos trésors préférés - une histoire merveilleuse » ( http://nature.ca/discover/treasures/trsite_f/trmineral/tr /tr .html); « nos trésors préférés - de minuscules terreurs » (http://www.nature.ca/discover/treasures/trsite_f/tranimal/tr /tr .html); « nos trésors préférés - une histoire merveilleuse » (http://nature.ca/discover/treasures/trsite_f/trmineral/tr /tr .html) http://www.mcq.org/roc/fr/plan.html canada science and technology museum / musée des sciences et de la technologie du canada http://www.science-tech.nmstc.ca exposition virtuelle - maîtres de l'art populaire http://pages.infinit.net/sqe rl / government of canada - the national battlefields commission / gouvernement du canada - commission des champs de bataille nationaux http://www.ccbn-nbc.gc.ca/ http://www.nlc-bnc.ca/jardin/h - -f.html. 
maison saint-gabriel http://www.maisonsaint-gabriel.qc.ca/maison/ _e.html maritime museum of british columbia http://mmbc.bc.ca musée acadien of the université de moncton / musée acadien de l'université de moncton http://www.umoncton.ca/maum/index.html musée de la nature et des sciences http://www.mnes.qc.ca/index.html musée virtuel du c.f.o.f. http://www.cfof.on.ca/francais/navbar/museetest.htm museum of new france - canadian museum of civilization corporation / musée de la nouvelle-france - société du musée canadien des civilisations http://www.civilization.ca/vmnf/vmnff.asp national archives of canada / archives nationales du canada http://www.archives.ca/ selected pages : « expo - pavillons » (http://www.archives.ca/ / / _f.html); « expo - activités » (http://www.archives.ca/ / / _f.html); « expo - projet d'une exposition universelle à montréal et mise en candidature » (http://www.collectionscanada.ca/ / / _f.html) http://www.archives.ca/ / _f.html : selected pages including : « dictionnaire montagnais-français, v. , par le père antoine silvy, missionnaire jésuite » (http://www.collectionscanada.ca/ / / / _f.html) old montreal / vieux montréal http://www .ville.montreal.qc.ca/vieux/histoire/ old port of montréal / vieux-port de montréal http://www.vieuxportdemontreal.com/histoire_patrimoine/ our roots - canada's local histories online / nos racines - les histoires locales du canada en ligne http://www.ourroots.ca/f/intro .asp. société des musées québécois http://www.smq.qc.ca/ virtual museum of canada / musée virtuel du canada http://www.museevirtuel.ca

note: this is a pre-print draft version. the published version contains several editorial changes. interested readers are advised to consult the forthcoming version of this paper in llc ©: . published by oxford university press. all rights reserved.

the sense of a connection: automatic tracing of intertextuality by meaning walter j.
scheirer harvard university chris forstall university of geneva neil coffee state university of new york at buffalo     the sense of a connection: automatic tracing of intertextuality by meaning introduction the recognition that poetic texts are often significantly linked to their predecessors through shared or similar language has been an important part of the reading and study of literature since antiquity. more recently, however, scholarly interest has broadened beyond the verbatim reuse of specific phrases to take in the great scope and subtlety of intertextual connections . the term intertext, coined by julia kristeva ( , p. ), has come to be used widely in literary studies to indicate linguistic similarity that, in presenting to the reader a marked connection between two texts, generates new meaning or novel stylistic effects. emerging digital methods now make it possible to trace this sort of intertextuality with some success. most typically, computational approaches search for the type of lexical correspondences that can be loosely described as paraphrase . the process closely resembles the manual identification of intertextuality still commonly practiced by literary scholars, including the writers of commentaries (coffee et al., ). this same work has demonstrated, however, that meaningful connections between texts occur not only via lexical similarity but also through a broader similarity of meaning in the absence of words that have the same form or stem . the classicist stephen hinds describes this phenomenon as a “poetic of corresponding inexactitude, which draws on but also distances itself from the rigidities of philological and intertextualist fundamentalisms alike” (hinds, , p. ). one study indicated that, among a certain set of meaningful parallels between two texts, some % were made up by similarity of meaning in the absence of more than one shared word (coffee et al., , p. ).     
quite remarkably, human readers are rather adept at identifying text reuse when faced with such “inexactitude,” where a predefined formula for lexical matching drawn from textual criticism would simply fail. for instance, consider the following lines from the roman poet lucan, which, in an epic simile, characterize the once-great general pompey on the eve of the roman civil war as a tottering, but still venerated, oak:

qualis frugifero quercus sublimis in agro,
exuvias veteres populi sacrataque gestans
dona ducum, nec iam validis radicibus haerens,
pondere fixa suo est; nudosque per aera ramos
effundens, trunco, non frondibus, efficit umbram;
et quamvis primo nutet casura sub euro,
tot circum silvae firmo se robore tollant,
sola tamen colitur.
(lucan, civil war . – )

just as a lofty oak in a productive field, bearing the ancient spoils and consecrated gifts of leaders, but no longer clinging with healthy roots, is fixed in place by its own weight; and spreading out bare branches through the air, it casts a shadow from its trunk rather than its leaves; and, although it sways, ready to fall at the first easterly wind, while so many of the surrounding trees bear themselves up on sturdy hardwood, it alone is honored.

commentators on this poem have noted intertextual connections to several passages in vergil’s aeneid. among them is another simile, this one comparing the doomed city of troy, finally penetrated by the besieging greeks, to a moribund ash-tree toppled by industrious peasants:

ac veluti summis antiquam in montibus ornum
cum ferro accisam crebrisque bipennibus instant
eruere agricolae certatim, illa usque minatur
et tremefacta comam concusso vertice nutat,
volneribus donec paulatim evicta, supremum
congemuit, traxitque iugis avolsa ruinam.
(vergil, aeneid . – )

just as when farmers vie to uproot an ancient ash-tree high in the mountains, hacked at with a rain of blows from their iron axes: it keeps threatening to fall, and, with its foliage trembling, its crown shaken, it sways, until, overcome little by little with its wounds, uprooted from the ridge, it at last gives a groan and heaves forward its own collapse.

as readers, how do we recognize that these two texts are related, when they share just one distinctive word, “sway” (nuto)? we see a resemblance of theme: both texts describe a tottering old tree. both passages also share a narrative function. in each case the tottering tree foreshadows the downfall of a hitherto stalwart bastion: the trojan citadel in the aeneid, the republican general pompey in the civil war. indeed, the two events appear intertextually connected: the capture and destruction of mythological troy anticipates the historical defeat and death of the roman leader.

theme, narrative structure, historical and mythical events: the ability of poetic language to forge connections simultaneously among such different sign-systems is precisely what kristeva’s original broad notion of intertextuality as “an intersection of textual surfaces” (kristeva, , p. ) was meant to encompass. this view of intertextuality leads us to think of words, even different but related ones, as part of a continuum of reuse and repurposing, and so to see in our examples of epic collapse, and countless others, the potential for thematic material from one context to be redeployed in another to new effect.

given the complexity of literary meaning that arises when readers encounter such instances of intertextuality, how can we capture it adequately with a computer model? what we need, in the words of hinds, is a “fuzzy logic” that is flexible enough to identify highly inexact matches often based in thematic similarity (hinds, , p. ).
the technique we employ for identifying such semantic intertextuality is the popular natural language processing strategy of semantic analysis. algorithms for semantic analysis are typically designed around the notion of word co-occurrence. that is, they start from the assumption, possibly counterintuitive but well-demonstrated, that words that occur in the same contexts have related meanings. this assumption, coupled with the cognitive matching process described above, motivated the design of latent semantic indexing (lsi), an early and still powerful approach (deerwester et al., ).

the use of algorithms for semantic analysis, including topic modeling (blei, ), has spread from the practical applications of natural language processing to become a popular tool for literary studies among digital humanists. recent work has used semantic analysis to distinguish between genres, produce an algorithmic historiography of classical scholarship, and characterize sentiment in political writing. these types of tasks fall into what jockers terms macroanalysis (jockers, ), which applies the tools of machine learning to collect quantifiable evidence of literary phenomena over large corpora, which might consist of the collected works of an author, whole genres, and entire literatures. when instead used for close reading, semantic analysis has the potential to reveal the characteristics and behavior of the language elements that participate in intertextual connections. in this work, we are concerned with texts from antiquity where intertextuality takes the form of similar small phrases or passages, as opposed to corpora of large documents where semantic analysis is more commonly applied.

let us begin with an example of how semantic analysis can be applied to this sort of small collection. consider the following lines of latin from lucan’s civil war as a simple corpus:

1. bella per emathios plus quam civilia campos iusque datum sceleri canimus
2. post cilicasne vagos, et lassi pontica regis proelia, barbarico vix consummata veneno, ultima pompeio dabitur provincia caesar
3. sed non in caesare tantum nomen erat, nec fama ducis: sed nescia virtus stare loco: solusque pudor, non vincere bello
4. turba minor ritu sequitur succincta gabino, vestalemque chorum ducit vittata sacerdos, troianam soli cui fas vidisse minervam
5. certe populi, quos despicit arctos, felices errore suo, quos ille timorum maximus haud urget, leti metus
6. quodque (nefas) nullis inpune apparuit extis, ecce, videt capiti fibrarum increscere molem alterius capitis
7. iam gelidas caesar cursu superaverat alpes, ingentisque animo motus, bellumque futurum ceperat. ut ventum est parvi rubiconis ad undas
8. rupta quies populi, stratisque excita iuventus deripuit sacris adfixa penatibus arma, quae pax longa dabat
9. non, si tumido me gurgite ganges summoveat, stabit iam flumine caesar in ullo, post rubiconis aquas

now suppose that we would like to find which of the above lines, if any, have some thematic similarity to the phrase rubiconis aquas (“the waters of the rubicon”). given that caesar is famously associated with crossing the rubicon, if a semantic analysis approach were effective, we would expect a search for thematic material related to the rubicon to turn up phrases in which caesar appears. to test this hypothesis, we applied lsi to search for content similar to rubiconis aquas. an in-depth look at the lsi algorithm, including a description of the relevant parameters, follows in the next section. for now, let us simply consider the top three results returned when we perform this test with an lsi approach, using two topics and cosine distance:

post cilicasne vagos, et lassi pontica regis proelia, barbarico vix consummata veneno, ultima pompeio dabitur provincia caesar . . . ? (civil war . - )
after defeating roving cilician pirates and after battles on the black sea with the fading king, scarcely ended by barbaric poison, will caesar now be handed over to pompey as his last charge?

iam gelidas caesar cursu superaverat alpes, ingentisque animo motus bellumque futurum ceperat. ut ventum est parvi rubiconis ad undas. (civil war . - )
already caesar had overcome the frozen alps with speed, and in his heart he had anticipated the great upheavals and war to come, when he arrived at the waters of the slender rubicon.

sed non in caesare tantum nomen erat, nec fama ducis: sed nescia virtus stare loco: solusque pudor, non vincere bello. (civil war . - )
but caesar had not only a name and renown as a general, but also a courage incapable of standing still, and shame only at conquering without war.

as the results indicate, the test was successful: the search for thematic content similar to “waters of the rubicon” turned up passages referring to caesar as the top three results. in one of these phrases, the search for meanings similar to those of rubiconis aquas detected mention of the rubicon itself, along with caesar, but the two others did not. the results also show substantial precision. the algorithm did not recall everything related to caesar, but only hits rich in the martial language that also co-occurs with the word rubicon, an emblem of the civil war. this simple test suggests that material likely to be thematically associated in the mind of the reader (caesar and rubicon) can also be identified through semantic analysis.

the remainder of this article will address in greater detail a more complicated task. whereas we have just demonstrated a search that finds passages matching a phrase, we turn now to detecting semantic similarity between two whole passages.
the goal of this sort of search is that the reader interested in finding instances of textual similarity absent verbal repetition will ultimately not need to input a search term, as we have just done, but will be able to simply search all passages of one given work against all of those in another.

with this basic understanding of our goals and approach in place, we can summarize the contributions we describe in the remainder of this article:

1. a methodology for applying semantic analysis to the problem of detecting instances of intertextuality without strict lexical correspondence (sec. ).
2. an extensive experimental analysis that compares the results of semantic analysis to human analysis, i.e. scholarly commentaries that compare two texts (sec. ).
3. a publicly accessible web tool that allows non-experts to apply our semantic analysis methodology to a large corpus of latin writers (sec. ).
4. the discovery, using our tool, of thematic matches between lucan’s civil war and vergil’s aeneid not previously recorded by commentators (sec. ).

methods

lsi approach

to find the passages that best match a particular query phrase by context, we need to not only generate a semantic model, but also assess similarity within that model space. for this purpose, we chose to use the lsi module of the gensim framework (rehurek and sojka, ) in a custom python program. the underlying algorithm performs a transformation on a set of document vectors to draw out latent structure in the corpus, and to reduce dimensionality for computational efficiency. this is accomplished via singular value decomposition (svd), a matrix factorization technique in linear algebra. a similarity search is then performed in the resulting low rank transformation space.
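As a rough illustration of the idea just described (rank reduction via svd, followed by a cosine comparison in the reduced space), here is a minimal pure-Python sketch. It is not the gensim implementation used in this work: the toy corpus, the helper names, and the use of power iteration to approximate the leading singular vectors are all assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0

def to_vector(tokens, index):
    """Bag-of-words count vector over a fixed vocabulary index."""
    vec = [0.0] * len(index)
    for tok in tokens:
        if tok in index:
            vec[index[tok]] += 1.0
    return vec

def top_left_singular_vectors(doc_vectors, k, iters=200):
    """Approximate the k leading left singular vectors of the term-document
    matrix by power iteration on A A^T, re-orthogonalizing against the
    vectors already found (a stand-in for a real svd routine)."""
    n_terms = len(doc_vectors[0])
    basis = []
    for _ in range(k):
        v = [1.0] * n_terms
        for _ in range(iters):
            w = [0.0] * n_terms
            for d in doc_vectors:          # w = A A^T v = sum_d (d . v) d
                s = dot(d, v)
                for i, di in enumerate(d):
                    w[i] += s * di
            for u in basis:                # remove directions already found
                s = dot(u, w)
                w = [wi - s * ui for wi, ui in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            v = [wi / norm for wi in w]
        basis.append(v)
    return basis

def project(vec, basis):
    """Coordinates of a term-space vector in the k-dimensional topic space."""
    return [dot(u, vec) for u in basis]

# Hypothetical toy corpus: three "caesar" documents and two nautical ones.
docs = [
    ["caesar", "rubicon"],
    ["caesar", "bellum"],
    ["caesar", "rubicon", "bellum"],
    ["navis", "mare"],
    ["navis", "velum"],
]
vocab = sorted({t for d in docs for t in d})
index = {t: i for i, t in enumerate(vocab)}
doc_vectors = [to_vector(d, index) for d in docs]
basis = top_left_singular_vectors(doc_vectors, k=2)

query = to_vector(["rubicon"], index)
bow_scores = [cosine(query, d) for d in doc_vectors]
lsi_scores = [cosine(project(query, basis), project(d, basis)) for d in doc_vectors]
```

In this toy setting the second document shares no word with the query, so its plain bag-of-words score is zero, yet it scores highly in the two-topic space because "caesar" co-occurs with "rubicon" elsewhere in the corpus. The nautical documents stay near zero under both measures.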
in order to provide enough contextual information for the models and still keep the input highly localized to specific phrases, a window of approximately characters around and including a target line of text was always selected to form a passage considered a “query.” similarly, a window of approximately , characters around and including a line from the text we wanted to match against formed a passage considered a “document.” note that each line from the text was used as a basis to create a document, resulting in a large measure of overlap between documents as the window moved across the text. a collection of all such documents from a text represents a training corpus.

during pre-processing, the most common words from the tesserae corpus were removed from consideration. this list contains function words, as well as the most common nouns, verbs, adjectives, and adverbs. each passage was then processed into a bag-of-words representation, with the inflected form of each word replaced with the set of all possible stems. this was done in lieu of typical lemmatization to increase the amount of text available for training (see the discussion of small sample sizes below).

each lsi model for a corpus was trained using a user-specified number of topics (i.e. the dimensions retained after svd is applied by the algorithm). similarity queries proceeded by projecting a query passage and a corpus into the transformed model space, and assessing cosine similarity between the query passage and each document in the corpus to produce a set of match scores (in the range - to , where a higher score indicates a better match). these scores were then sorted to provide a ranked list of potential matches. the source code for this algorithm is available publicly on github as part of the tesserae web tool.

a mathematically inclined reader might ask why we opted for lsi instead of a more flexible topic modeling approach such as latent dirichlet allocation (lda) (blei, ).
during the course of this work, we evaluated several lda implementations including the online learning technique provided by gensim, and the efficient sampling-based implementation provided by mallet (mccallum, ). for text samples as small as our passages, these algorithms were not numerically stable, i.e. they produced radically different match scores for the exact same input across multiple trials. this is a significant problem for the scholar attempting to search for instances of textual reuse with some degree of confidence. the cause is an artifact of random bootstrapping (i.e. initializing the algorithm with different random data each time it is run) with limited sampling. the minimum sampling of text required for the statistical estimators to converge is something greater than what we are providing; lda is most typically applied to long-form documents and any implementation must make certain assumptions on its input. this is a key open issue in machine learning for the digital humanities: textual analysis for forms such as poetry, song, or epigraphy will nearly always involve small samples. our testing revealed drift in only the least significant digit of the scores produced by gensim’s lsi implementation, giving us enough stability to reliably replicate our results over any number of trials.

the sizes of the query and document passages described above were determined experimentally with numerical stability in mind. characters for the query and characters for the document represent the smallest passage sizes that form a highly localized window around their respective target lines (ensuring that matches are not too broad), while providing enough numerical stability for the lsi algorithm.

for comparison, we also considered a simpler semantic analysis approach without the rank lowering of the lsi algorithm on the same texts.
again using the gensim module, we computed the cosine distance between just the bag-of-words representations for a query passage and each passage in the corpus to produce a second set of match scores. the goal of this comparison was to see what lsi adds beyond the basic language model. according to deerwester et al. ( ), rank lowering helps us find all words that are related to each document. this is typically a much larger set than the plain bag-of-words representation because it accounts for synonymy across the corpus. if lsi is indeed exploiting the “semantic structure” of our corpus via low-rank approximation, we should observe better match scores for relevant parallels compared to this simple approach.

experiment design

our baseline for experimentation is the n-gram matching capability that forms the core of the tesserae search engine, which is freely available on the web. briefly summarized, the matching algorithm operates in two distinct stages. in the first stage, all instances where a given unit (e.g. verse line or phrase) in one text shares at least two words with another unit in a different text are identified. the words may be exact forms or lemmata. in the second stage, the candidate matches are ranked by the relative rarity and proximity of their shared words. the final result is a score that reflects the overall strength of the match, if some word reuse is present.

we validated our approach with reference to two latin epic poems, lucan’s civil war, book , and vergil’s aeneid. civil war consists of hexameter lines, while all of the aeneid consists of , hexameter lines. these epics are generally considered to have a deep and remarkable intertextual relationship. this relationship is attested in the work of scholarly commentators, who, as expert readers, document a variety of forms of intertextual relationship, among them instances of shared meaning.
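The two-stage lexical baseline summarized above (first collect pairs of units sharing at least two words, then rank them by the rarity and proximity of the shared words) can be sketched as follows. The scoring formula here is a simplified stand-in chosen for illustration, not necessarily the exact formula used by the tesserae search engine, and the tokenized units and function names are hypothetical.

```python
import math
from collections import Counter

def relative_frequencies(units):
    """Relative frequency of each word within one text (a list of token lists)."""
    counts = Counter(word for unit in units for word in unit)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def span(unit, shared):
    """Number of token positions covered by the shared words in a unit."""
    positions = [i for i, word in enumerate(unit) if word in shared]
    return positions[-1] - positions[0] + 1

def rank_matches(target_units, source_units, min_shared=2):
    freq_t = relative_frequencies(target_units)
    freq_s = relative_frequencies(source_units)
    matches = []
    for ti, t in enumerate(target_units):
        for si, s in enumerate(source_units):
            # Stage 1: keep only pairs sharing at least `min_shared` words.
            shared = set(t) & set(s)
            if len(shared) < min_shared:
                continue
            # Stage 2 (assumed scoring): rarer shared words raise the score,
            # widely separated shared words lower it.
            rarity = sum(1 / freq_t[w] + 1 / freq_s[w] for w in shared)
            distance = span(t, shared) + span(s, shared)
            matches.append((math.log(rarity / distance), ti, si))
    return sorted(matches, reverse=True)

# Hypothetical, already tokenized units (exact forms; lemmata would work the same way).
target = [["arma", "virum", "cano"], ["bella", "per", "campos"]]
source = [
    ["arma", "virum", "que", "cano"],
    ["per", "altos", "campos", "et", "bella"],
    ["cano", "alia"],
]
ranked = rank_matches(target, source)
ranked_pairs = [(ti, si) for _, ti, si in ranked]
```

Pairs sharing fewer than two words never become candidates, which is why passages related only by meaning are invisible to this kind of baseline.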
we therefore tested our results against a benchmark data set assembled by the tesserae project comprising all intertexts between the two texts recorded by four major commentaries. the ability of the algorithm to replicate commentator decisions is used as the measure of performance. from previous experiments with the tesserae search engine, we know that it is possible to identify the majority of known intertexts by searching for sentences that share two or more lemmata. in a test on a set of given samples, the word-based algorithm missed of the commentator parallels, however, which accounted for / of the benchmark set. analysis of such missed samples suggests that they consist wholly or partially of instances of similar meaning, without shared words (coffee et al., b, ). this subset of the overall benchmark represents a union of parallels described in the four commentaries. of these, individual commentators identified distinct parallels, while two commentators independently identified each parallel in the remaining five. with due allowance for the subjectivity of the commentators, the objective of this work was to see how many of the missed intertexts could be recovered by automatic matching by semantic context rather than words.

lucan-vergil benchmark results

our first test case was the following excerpt from book of the civil war, where lucan uses metaphorical language to describe the abandonment of rome by its military age men (left panel below).

left panel:

qualis, cum turbidus auster
reppulit a libycis inmensum syrtibus aequor
fractaque veliferi sonuerunt pondera mali,
desilit in fluctus deserta puppe magister
nauitaque et nondum sparsa conpage carinae
naufragium sibi quisque facit, sic urbe relicta
in bellum fugitur. nullum iam languidus aevo
evaluit revocare parens coniunxve maritum
fletibus, aut patrii, dubiae dum vota salutis
conciperent, tenuere lares; nec limine quisquam
haesit et extremo tunc forsitan urbis amatae
plenus abit visu: ruit inrevocabile volgus.
o faciles dare summa deos eademque tueri
difficiles!
(civil war . – )

just as when the swirling south wind drives the vast sea back from the libyan syrtes, and the shattered mass of the mast, with its sail, groans, the helmsman abandons the stern and leaps into the waves; and though the fittings of the hull are not yet strewn apart, each sailor fashions his own personal shipwreck; so too they desert the city and flee into war. parents, frail with age, cannot call back their sons, nor wives, by their tears, their husbands; nor the ancestral homes, so long as they place their hopes on an unlikely salvation. no one hesitated on his threshold, to depart, perhaps, with a final look, filled with the love of his city. the crowd rushed on, heedless. how easily the gods give everything, how little they care to preserve it. (civil war . – )

right panel:

postquam res asiae priamique evertere gentem
immeritam visum superis, ceciditque superbum
ilium et omnis humo fumat neptunia troia,
diversa exsilia et desertas quaerere terras
auguriis agimur divum, classemque sub ipsa
antandro et phrygiae molimur montibus idae,
incerti quo fata ferant, ubi sistere detur,
contrahimusque viros. vix prima inceperat aestas
et pater anchises dare fatis vela iubebat,
litora cum patriae lacrimans portusque relinquo
et campos ubi troia fuit. feror exsul in altum
cum sociis natoque penatibus et magnis dis.
(aeneid . - )

after the gods saw fit to overturn the affairs of asia and visit undeserved punishment on the race of priam, after proud ilium had fallen and all of troy, built by neptune, was a smoking ruin, we were driven by signs from the gods to seek exile far away and find vacant lands. near antander and the mountains of phrygian ida we constructed a fleet, though we were unsure where the fates were taking us, where we were to settle, and we gathered our men together. summer had only just begun and my father anchises ordered us to give sail for our destiny. i wept as i left the shores and harbors of my fatherland, and the plains where once was troy. i was cast, an exile, onto the high seas, together with my companions, my son, the spirits of my household and the great gods above. (aeneid . - )

the major theme of these lines is abandonment, in this case of the city of rome (sic urbe relicta in bellum fugitur), articulated in part through a simile of shipwreck (desilit in fluctus, puppe magister, naufragium). these lines are thought to be richly intertextual with the aeneid. the commentator paul roche, author of the most recent and extensive commentary on this part of lucan’s epic, notes numerous parallels in lines - alone (italicized in the left panel above), particularly with book of the aeneid. but roche also remarks on a similarity with part of aeneid that has no shared words, making it a good test for detecting resemblance of meaning alone. the relevant passage from aeneid comes at the opening of the book, where aeneas begins the story of his wanderings. it is marked in italics in the right panel above.

this entire passage was in fact included in a top match returned by our algorithm for the comparison above between aeneid and the lucan passage. roche observes the contrast between aeneas’s concern for his family in flight and the disregard for their families shown by romans fleeing their city in lucan’s epic (roche , pp. - ). our lsi method responds to related themes over a longer stretch of text. as in lucan’s description of citizens’ flight from rome, in the opening of aeneid we find pronounced themes of abandonment (diversa exsilia et desertas quaerere terras, litora cum patriae lacrimans portusque relinquo) intermingled with naval imagery (classem, vela, portus). this thematic similarity creates a connection between the texts despite the absence of any significant lexical overlap of the kind targeted by tesserae lexical search and other text reuse search engines.
the infrequent words common to both texts are underlined above, illustrating that the passages share none of the compact, word-level n-grams typically picked out in scholarly commentaries. the passages could, in theory, be identified as similar based upon this sparse collection of shared words, but only by a search so minimally restrictive as to produce a flood of results. matching via semantic analysis thus brings us much more directly to the thematic resemblance identified by roche.

taking this approach further, we experimented with the lsi modeling to see how many of the missing commentator parallels between civil war book and the aeneid we could return in the top results, on the assumption that this was a highly manageable number for scholars to check. passages (queries) from book of civil war were matched against all passages (documents) found in individual books of the aeneid, and the results were ranked in descending order by lsi score. this search involved setting one arbitrary parameter, the number of topics (or dimensions) into which the passages would be categorized by content. for our experiment, we evaluated each query at , and topics, and reported the parameter at which a valid parallel was found.

to provide the reader with a more thorough analysis of the proposed approach, we also computed precision, the fraction of retrieved instances relevant to a valid parallel, for each result. this was done by counting the number of matches that contained text from a valid parallel and dividing by the total matches we always considered to be candidates. recall from sec. . that our approach generates a large sampling of overlapping windows, meaning that it is possible to have multiple valid matches per search instance. this is a useful feature for a scholar, in that we have good coverage of the context surrounding a target line of interest from a set of windows that overlap, but not completely.
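The precision measure described above, together with the overlapping windows it relies on, can be sketched roughly as follows. The window-growing rule, the character budget, and the function names are assumptions for illustration; the actual passage sizes and candidate counts are those reported in the text, not the toy values used here.

```python
def window_around(lines, center, budget):
    """Grow a window of whole lines around `center` until it holds roughly
    `budget` characters. Returns inclusive (start, end) line indices."""
    start = end = center
    size = len(lines[center])
    while size < budget and (start > 0 or end < len(lines) - 1):
        if start > 0:
            start -= 1
            size += len(lines[start])
        if size < budget and end < len(lines) - 1:
            end += 1
            size += len(lines[end])
    return start, end

def precision(ranked_centers, lines, budget, parallel_line):
    """Fraction of the retrieved candidate windows that contain the line
    of a known parallel; the denominator is the fixed candidate count."""
    windows = [window_around(lines, c, budget) for c in ranked_centers]
    hits = sum(1 for start, end in windows if start <= parallel_line <= end)
    return hits / len(windows)

def best_rank(ranked_centers, lines, budget, parallel_line):
    """1-based rank of the first retrieved window containing the parallel,
    or None if no retrieved window contains it."""
    for rank, c in enumerate(ranked_centers, start=1):
        start, end = window_around(lines, c, budget)
        if start <= parallel_line <= end:
            return rank
    return None

# Toy text: ten four-character "lines", with a hypothetical 12-character budget.
lines = ["aaaa"] * 10
ranked_centers = [5, 0, 6, 9]   # target lines of the retrieved windows, best first
score = precision(ranked_centers, lines, budget=12, parallel_line=5)
rank = best_rank(ranked_centers, lines, budget=12, parallel_line=5)
```

Because neighboring windows overlap, a single commentator parallel can be covered by more than one retrieved window, which is why precision above the reciprocal of the candidate count is possible for one parallel.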
we exploit this behavior in our user interface (described below in sec. ) to highlight relevant passages of text.

of the missing parallels, the lsi approach returned , listed in table . several of these results were ranked in the top five returned by the algorithm for a given number of topics, indicating very strong thematic links. one additional parallel also found by the n-gram matching algorithm of the tesserae search engine was returned as a rank- result. comparing the methods of analysis, we found that lower ranks tend to be correlated with higher precision.

these results also provide a basis for comparing our lsi method with the alternative approach of cosine distance between bag-of-words representations. when testing the latter, we observed much higher ranks (indicating worse performance) and lower precision values for most of the parallels in table . in many cases, the ranks fall outside of the top results, and are not considered valid matches by the matching criteria of this paper. scores produced by this simple model were also significantly lower than those generated by the lsi approach. in every instance lsi outperformed the simpler bag-of-words approach. thus, for this corpus, we can conclude that by making use of low-rank approximation to capture the broader synonymy of the corpus, the lsi approach yields stronger matches that appear higher up in the rank order.

this is not to say, however, that the simpler model has been rendered useless. table lists an additional set of missing civil war – aeneid commentator parallels found in the top- results returned by the cosine distance between bag-of-words representations. these parallels are not found by the lsi approach. similar to the results in table , we again observe higher ranks and lower precision values for each parallel; not a single one of these matches falls within the top- results.
this indicates that even as a weak approach, the simple bag-of-words model could be useful in combination with other, more powerful approaches via fusion (using a reasonably intelligent score analysis algorithm) to improve the rank position of a match. we are investigating this possibility in our ongoing work.

a new tool for the study of intertextuality

based on the satisfactory benchmark results, we designed an accessible front-end to the proposed algorithm for more traditional scholars of the classics. those interested in trying out the algorithm have free access to an easy-to-use web-based tool via the tesserae project website. figs. and show the interface, which provides simple drop-down menus for all parameters (author, work, book and number of topics), and a point-and-click mechanism to allow the user to explore the texts while reading. scholars without significant training in machine learning will find this tool to be a convenient starting point for conducting studies related to intertextuality and semantics at a large scale. at the time of this writing, different latin poets and prose writers are available for comparison.

an important question is whether this tool (and the underlying lsi algorithm) can be useful in revealing new instances of text reuse. ideally, the approach should produce results beyond those in our benchmark set that were noted by commentators but missed by lemma matching. to this end, we used our web interface to visualize other strongly matching passages between civil war and the aeneid, using the lines from civil war in tables & as “targets” (i.e. queries) against the passages from all of the “source” books of the aeneid. this search turned up significant thematic correspondences not recorded by commentators, listed in table . these included another passage in the aeneid that shares the themes of abandonment and the sea with the lines around civil war . quoted above.
here sailors flee from the shores of polyphemus, and the related words are concentrated in the first two lines:

sed fugite, o miseri, fugite atque ab litore funem
rumpite.
nam qualis quantusque cavo polyphemus in antro
lanigeras claudit pecudes atque ubera pressat,
centum alii curva haec habitant ad litora vulgo
infandi cyclopes et altis montibus errant.
(aeneid . - )

but flee, you wretches, flee and slash the cables from the shore. for as great and tall as polyphemus is who lives in his hollow cave, keeps wooly flocks, and milks their udders, a hundred such other monstrous cyclopes live together along the curved shore, and wander the steep mountains.

other passages were related by different common themes. the lsi algorithm identified the following two passages as highly related, and both in fact describe the god bacchus, though in almost entirely different terms (they share just one word, vertice):

nam, qualis vertice pindi
edonis ogygio decurrit plena lyaeo . . .
(civil war . - )

nec qui pampineis victor iuga flectit habenis
liber, agens celso nysae de vertice tigris
(aeneid . - )

for just as a thracian bacchant, filled with theban bacchus, rushes down from the summit of mount pindus . . .

nor did bacchus, who in victory guides his chariot with reins of vine, leading his tigers from the summit of lofty nysa, [traverse as much land as augustus will rule].

we also found a similar correspondence between text surrounding civil war . and aeneid . - . this instance contained both identical words (qualis, per urbem) and (near) synonyms (attonitam ~ excita, urguentem ~ stimulant).

in sum, then, our employment of lsi proved successful for the needs of users, in that it can bring them swiftly to significant instances of semantic similarity not previously recorded. and as a computational method, in every case the lsi algorithm again outperformed the simpler bag-of-words approach.
discussion

our experiment demonstrates that lsi can be used to detect intertextual relationships of meaning where few or no words are shared by the two texts. the same approach can in principle be extended to discover common themes and generic material, though computational constraints currently make it impossible to conduct a rapid search for such material over very large-scale corpora. the distinction between intertext and non-intertext has always been fundamentally a heuristic one that can shift and change. if this sort of searching can be brought to larger scales, it will likely begin to dissolve the border between the instances of intertextuality most frequently noted by scholars (tight verbal correspondences) and the traditional understanding of similarities of mood and theme.

acknowledgements

this work was supported by the national endowment for the humanities [start-up grant award #hd- - ]. we thank prof. neil bernstein of ohio university, who provided valuable feedback on an early draft of this work.

table . list of missing civil war (bc) – aeneid (aen) commentator parallels found by the lsi approach in the top results returned by the algorithm for each query. both rank and precision are reported. an asterisk denotes a parallel outside the missing parallel set also found by the lexical matching algorithm of the tesserae search engine. for comparison, the corresponding ranks and precision values are also provided for a cosine distance between bag-of-words representations for each parallel. in nearly every instance, lsi outperformed the simpler bag-of-words approach. comparing rank to precision, it can be seen that lower ranks tend to be correlated with higher precision.

bc line   aen line   shared context   topics   lsi rank   lsi prec.   bow rank   bow prec.
. . destiny of caesar; peace . .
. . the blowing wind; tree . .
. . the blowing wind; tree . .
. . an apparition . .
. . an apparition . .
. . horses . .
. . flight . .
. . abandonment * . .
. . abandonment; navy . .
. . omens; terror . .
. . dido as bacchant . .
. . prophecy . .
. . frenzied discussion . .

table . list of missing civil war (bc) – aeneid (aen) commentator parallels found in the top results returned by the cosine distance between bag-of-words representations. these results are not found by the lsi approach. compared to the lsi approach in a general sense, we find that the ranks tend to be much higher and precision much lower for this baseline, with no result in this table placing in the top of those returned. low precision scores are also observed for this experiment.

bc line   aen line   shared context   bow rank   bow prec.
. . war .
. . hostility .
. . broken treaty .
. . fortune .
. . dido as bacchant .
. . questioning destination .
. . decapitation; shore .

table . additional thematic matches found between civil war (bc) – aeneid (aen) by the lsi approach. these include highly specific parallels (a bacchant in the passage around civil war . and bacchus around aeneid . and aeneid . ), weaker parallels with some lexical correspondence (civil war . and aeneid . , civil war . and aeneid . ), and interesting contextual parallels (a metaphorical description of nautical abandonment around civil war . and sailors fleeing the shores of polyphemus around aeneid . ). an asterisk denotes a parallel also found by the lexical matching algorithm of the tesserae search engine. ranks are also provided for a cosine distance between bag-of-words representations. as with the experiment shown in table , our lsi method consistently outperformed the simpler bag-of-words approach.

bc line   aen line   shared context   topics   lsi rank   bow rank
. . war *
. . the blowing wind
. . conquest
. . city; nation *
. . abandonment; nautical imagery
. . bacchus
. . bacchus

fig. . the public web interface to the algorithm described in this article. parameters are presented to the user as a series of drop-down menus.
the user can click on any line in the “target” frame, which will initiate the lsi matching process between the passage centered on the target line and all passages in the “source” frame. the tesserae project’s entire latin corpus is available for search. the simple interface allows scholars with minimal training in machine learning to conduct sophisticated studies of semantic intertextuality at a large scale.

fig. . an example of a match between civil war . and aeneid . . the entire passage highlighted on the left represents the query centered on civil war . . to reduce visual clutter, on the right we highlight only the lines on which the matching passages are centered. the matches provide the scholar with an indication of the general neighborhood where semantically similar text can be found. color intensity in the right-hand frame indicates the strength of the match (brighter colors mean a stronger match).

notes
. in classical scholarship sometimes called loci similes, or “similar passages,” and typically consisting of (near) verbatim reuse of a two-word phrase. . two useful surveys of practices within classical philology can be found in pucci ( , ch. ) and schmitz ( , ch. ); see also coffee ( ). . examples include: global linear models for assessing verse similarity in the new testament gospels (lee, ); unsupervised detection of greek quotation (büchler et al., ), and hash coding to detect reuse and citations in lautréamont and balzac (ganascia et al., ). more flexible sequence alignment approaches (horton et al., ; roe, ; wolff, ; smith et al., ), inspired by related analysis techniques in genetics, are prevalent as well. most closely aligned with the goals of this present work are the etraces (bamman and crane, ; büchler et al., ; büchler et al., ; geßner et al., ) and tesserae (forstall et al., ; forstall and scheirer, ; coffee et al., a; coffee et al., b; coffee et al., ) projects. . wills ( , ch.
) gives an extensive set of textual features that can serve as the basis for intertextual connections, with examples of each. . latin texts cited here are from the perseus digital library (see also note below); translations are our own.     . for example, paul roche ( , ad loc.). . excluding extremely common function words such as et (“and”) and in (“in/on”). . the process of recognition does not necessarily proceed in such an orderly fashion, however, from the concrete to the abstract; rather, the alert reader is often sensitized to the possibility of intertext in advance. this potential for an intertextual relationship to prime the reader to see further connections is described in detail by wills ( , pp. – ). in general terms, “a poetic sign signals first to the other signs within the poetic system . . . before signaling its specific sense in a precise context” (conte , p. ). for example, vergil himself seems already to have foreshadowed pompey’s death in his description of the death of priam, patriarch of the trojans (hinds , p. ). lucan’s readers might well have recognized the intertext first on this basis and only subsequently (or never) noticed the reuse of nuto. . allison et al. ( ), mimno ( ), and nelson ( ), respectively. . http://radimrehurek.com/gensim/intro.html . the tesserae latin corpus currently comprises just under texts, evenly divided between prose and verse, principally from the first century bce to the third century ce. most of these texts are sourced from the perseus digital library     (http://www.perseus.tufts.edu; g. crane, editor). for further information, see http://tesserae.caset.buffalo.edu/sources.php. . https://github.com/tesserae . on difficulties associated with small samples in literary applications of text analysis, see, e.g., eder ( ). . the gensim module implements scalable truncated singular value decomposition in python to calculate the low-rank approximation of a matrix. 
while there is no specific property of lsi that makes it more suited to small corpora, this particular svd solver is stable for small sample sizes, making it useful for the kinds of searches demonstrated here. with a lack of good alternatives, we recommend that other researchers consider this implementation for semantic analysis problems that are constrained to small sample sizes. . http://tesserae.caset.buffalo.edu . additional detail can be found in the “methodology” section of coffee et al. ( ). . the lucan commentaries we consulted for this information were heitland and haskins ( ), thompson and bruére ( ), viansino ( ), and roche ( ).     . (roche, ), - records parallels between civil war . - and aeneid . f., - , - , - , - , - ; . f.; . f; . f. most are contrastive, evoking the difference between aeneas’s concern for keeping his loved ones together while fleeing troy, and the disregard for family ties shown by those fleeing rome in civil war. . not underlined are the function words cum, et, and in, which are extremely common and typically excluded by even the shortest stop lists. . http://tesserae.caset.buffalo.edu/cgi-bin/lsa.pl . so, for example, novelist and semiotician umberto eco, in reflecting on the various intertextual relationships between his own fiction and that of jorge luis borges, notes links of several distinct types, including “cases where i was not aware of it, but subsequently readers . . . forced me to recognize that borges had influenced me unconsciously,” as well as others in which a reminiscence of borges in eco’s writing is due rather to a mutual debt to “preceding sources and the universe of intertextuality.” (eco , p. ).     references allison, s., heuser, r., jockers, m. l., moretti, f., and witmore, m. ( ). quantitative formalism: an experiment. n + , : - . bamman, d. and crane, g. ( ). the logic and discovery of textual allusion. 
in proceedings of the second workshop on language technology for cultural heritage data (latech ), marrakesh, morocco. blei, d. ( ). probabilistic topic models. communications of the acm, ( ): - . blei, d., ng, a., and jordan, m. ( ). latent dirichlet allocation. journal of machine learning research, : - . büchler, m., geßner, a., berti, m., and eckart, t. ( ). measuring the influence of a work by text re-use. bulletin of the institute of classical studies supplement, : - . büchler, m., crane, g., mueller, m., burns, p., and heyer, g. ( ). one step closer to paraphrase detection on historical texts. journal of the chicago colloquium on digital humanities and computer science, ( ). büchler, m., geßner, a., eckart, t., and heyer, g. ( ). unsupervised detection and visualization of textual reuse on ancient greek texts. journal of the chicago colloquium on digital humanities and computer science, ( ). coffee, n., koenig, j.-p., poornima, s., forstall, c. w., ossewaarde, r., and jacobson, s. ( a). the tesserae project: intertextual analysis of latin poetry. literary and linguistic computing, ( ): : . coffee, n., koenig, j.-p., poornima, s., ossewaarde, r., forstall, c., and jacobson, s. ( b). intertextuality in the digital age. transactions of the american philological association, ( ): - . coffee, n. ( ). “intertextuality in latin poetry.” in oxford bibliographies in classics. ed. d. clayman. new york, oxford university press. coffee, n., forstall, c., buck, t., roache, k., and jacobson, s. ( ). modeling the scholars: detecting intertextuality through enhanced word-level n-gram matching. to appear in literary and linguistic computing, pre-print available at: http://tesserae.caset.buffalo.edu/blog/wp-content/uploads/ / /modeling-the- scholars- - - llc-preprint .pdf. conte, g. b. ( ). the rhetoric of imitation: genre and poetic memory in virgil and other latin poets. translated by charles segal. cornell university press, ithaca, new york.
deerwester, s., dumais, s., furnas, g., landauer, t., and harshman, r. ( ). indexing by latent semantic analysis. journal of the american society for information science ( ): - . eco, u. ( ). borges and my anxiety of influence. in on literature, pp. - . translated by martin mclaughlin. harcourt, inc., orlando fl. eder, m. ( ). does size matter? authorship attribution, small samples, big problem. literary and linguistic computing, forthcoming. published online november , at http://llc.oxfordjournals.org/content/early/ / / /llc.fqt .full. forstall, c. w. and scheirer, w. j. ( ). revealing hidden patterns in the meter of homer’s iliad. in proceedings of the chicago colloquium on digital humanities and computer science, chicago, illinois. forstall, c. w., jacobson, s., and scheirer, w. j. ( ). evidence of intertextuality: investigating paul the deacon’s angustae vitae. literary and linguistic computing ( ): - . ganascia, j.-g., glaudes, p., and delungo, a. ( ). automatic detection of reuses and citations in literary texts. in proceedings of digital humanities, lincoln, nebraska. geßner, a., kötteritzsch, c., and lauer, g. ( ). biblical intertextuality in the digital world: the tool gertrude. in proceedings of the st international workshop on collaborative annotations in shared environment: metadata, vocabularies and techniques in the digital humanities, bologna, italy. heitland, w. e. and haskins, c. e. ( ). m. annaei lucani pharsalia. london: g. bell. hinds, s. ( ). allusion and intertext: the dynamics of appropriation in roman poetry. new york: cambridge university press. horton, r., olsen, m., and roe, g. ( ). something borrowed: sequence alignment and the identification of similar passages in large text collections. digital studies / le champ numérique, ( ). jockers, m. ( ). macroanalysis: digital methods and literary history. champaign: university of illinois press. kristeva, j. ( ). word, dialogue and novel. in moi, t.
(ed.), the kristeva reader, new york: columbia university press, pp. - . lee, j. ( ). a computational model of text reuse in ancient literary texts. in proceedings of the th annual meeting of the association of computational linguistics, prague, czech republic, pp. - . mccallum, a. k. ( ). mallet: a machine learning for language toolkit. http://mallet.cs.umass.edu (accessed january ). mimno, d. ( ). computational historiography: data mining in a century of classics journals. journal on computing and cultural heritage ( ). nelson, r. k. ( ). of monsters, men – and topic modeling. the new york times. http://opinionator.blogs.nytimes.com/ / / /of-monsters-men-and-topic- modeling (accessed january ). rehurek, r. and sojka, p. ( ). software framework for topic modeling with large corpora. in proceedings of the lrec workshop on new challenges for nlp frameworks, valletta, malta, pp. - . pucci, joseph ( ). the full-knowing reader: allusion and the power of the reader in the western literary tradition. yale university press, new haven, ct. roe, g. h. ( ). intertextuality and influence in the age of enlightenment: sequence alignment applications for humanities research. in proceedings of digital humanities, hamburg, germany. roche, p., ed. ( ). lucan: de bello civili. book . oxford: oxford university press. schmitz, thomas a. ( ). modern literary theory and ancient texts: an introduction. blackwell publishing, malden ma. smith, d. a., cordell, r., and dillon, e. m. ( ). infectious texts: modeling text reuse in nineteenth-century newspapers. in proceedings of the ieee workshop on big data and the humanities, santa clara, california. thompson, l. and bruére, r. t. ( ). lucan’s use of vergilian reminiscence. classical philology, : - . viansino, g., ed. ( ). marco annaeo lucano: la guerra civile (farsaglia) libri i-v. milan: arnoldo mondadori. wills, jeffrey ( ). repetition in latin poetry: figures of allusion. clarendon press, oxford. wolff, m. ( ).
surveying a corpus with alignment visualization and topic modeling. in proceedings of the chicago colloquium on digital humanities and computer science, chicago, illinois.

developing digital skills through engaged scholarship
http://digitalhumanities.org: /dhq/vol/ / / ...

abstract
this paper offers a case study of two contrasting digital scholarship internships at the pennsylvania state university. we explore the benefits and drawbacks of the internship model as an approach to developing digital scholarship among undergraduates through detailing the challenges and particularities of these experiences and analyzing mentor reflection and student feedback. we conclude with a number of recommendations on best practices for teaching digital scholarship through an internship model and aim to provide a useful roadmap for institutions looking to follow a similar model for undergraduate education in this field.

digital scholarship has never been more important than it is for the current generation of undergraduate students. the need to develop one’s technical expertise is not just a concern for those few students aspiring to a career in academia; competency in the use of computer-assisted methods has relevance for the entire student population. data analytics, knowledge representation, and dissemination techniques are just a few of the many areas with broad professional application to have undergone technology-driven transformation in recent decades. society’s reliance on technology is such that the digital has permeated our professional lives, transforming the skillsets expected of students upon graduation.
the integration of digital scholarship in the undergraduate curriculum can further the students’ learning experiences and engagement with their core subject matter, but there are numerous obstacles to embedding the skills required into course learning objectives and outcomes. increasing digital fluencies among undergraduate students in the arts and humanities, in particular, presents a number of key challenges: "the technical proficiency of undergraduates and instructors, the timeframe of a single semester or quarter, and the availability of hardware and software" [bjork , ]. as a consequence, institutions of higher education are responsible for exploring a variety of pedagogical approaches to digital scholarship, both within and beyond the confines of the classroom. by "digital scholarship," we refer to the practice of leveraging digital methods and computer-assisted approaches to research in the broader arts and humanities, and indeed, in related disciplines across the social sciences, for the purposes of producing new meaning across a multitude of forms.[ ] in turn, there is a need to ensure that students are not simply being trained in the use of intuitive tools to produce artifacts of tactical convenience, but rather, that they are developing a deep understanding of the potential for new and supplementary meaning offered by computational methods, as well as an awareness of the digital’s many constraints and the profound repercussions that interdisciplinarity can have for established practices. as tanya clement argues:

until we consider digital humanities undergraduate pedagogy in terms other than training, and rather as a pursuit that enables all students to ask valuable and productive questions that make for “a life worth living,” digital humanities will remain unrelated to and ill defined against the goals of higher education.
[clement , ]

in this paper, we present a case study of an engaged scholarship model–which seeks to complement classroom-based learning with out-of-classroom experiences–as a means to explore an alternative pedagogical approach to digital scholarship. specifically, we consider two evaluative questions: how effective are internships at developing knowledge and skills in digital methods? what are the optimal student learning conditions with respect to structure, guidance, and supervision to nurture the development of such knowledge and skills? to address these questions, we compare two undergraduate internships which proceeded as part of a collaboration in between the pennsylvania state university libraries and the college of the liberal arts, which saw two independent groups working with undergraduate students on research projects with a significant digital component. at penn state, there has been little distinction between the digital humanities work that is housed within the university libraries, and that which is primarily led at the academic college-level. the structure of the institution’s digital humanities effort, and indeed the interdisciplinary nature of the field, is such that it has a range of interdepartmental stakeholders.[ ] each project hired two paid interns, who were employed in a full-time capacity for a duration of weeks.[ ] one pair of interns worked on a project which availed of computational approaches to text analysis for the purposes of exploring the language utilized by online roleplayers. these students worked primarily under the guidance of the digital humanities research designer, with support from the university libraries’ publishing and curation services, as well as the office of digital pedagogy and scholarship in the college of the liberal arts.
in this instance, the students worked collaboratively under minimal supervision on the same project, on a research question of their own choosing. effectively, this internship was an experiment to test the feasibility of undergraduate digital scholarly research when supported by resources: time, money, and faculty/staff expertise. for the purposes of clarity, these students will be referred to as the "text analysis interns." the other pairing, which will be addressed as the "geospatial interns," worked with faculty and staff of the donald w. hamer maps library, and were tasked with helping the library accomplish the goal of increasing digital access to the sanborn fire insurance map collection. structurally, this internship was comparable to a traditional professional internship, but with the goal of developing digital skills (digitization, mapping) that transfer to academic research, along with the freedom within the scope of the internship to pursue independent projects as interns developed skills. while the implementation and scope of the projects varied, both sought to adopt a digital-project-as-pedagogy approach, so that interns developed advanced expertise through direct engagement with applied research. in doing so, the intention was that interns would gain a sense of how to conduct research that is of publishable quality, while seeing what is required in bringing a digital project from conception to fruition. by design, the experience allowed students to develop a number of their broader professional skills, such as time and project management, as well as practice habits of collaboration, all of which would occur in professional environments. as we begin to evaluate the internship model as a way to develop digital scholarship skills in students, it is helpful to situate the approach within a pedagogical framework.
in this case, both internships were envisioned as project-based learning opportunities to enhance the digital skills and professionalization of undergraduate students. this "learn by doing" approach is common in the classroom, and can be equally effective in the field, given the right conditions. in "perspectives on learning in internships," david thornton moore challenges the notion that academic learning happens exclusively in the classroom, and that only testing or application of that knowledge happens in the field [moore , ]. he argues: "thinking in the real world may indeed supplement and reinforce school-based learning; but it can also do far more to develop valid and important learning in its own right" [moore , ]. moore puts forth a matrix to evaluate internship experiences through focusing on two dimensions: the ways one uses knowledge, and the ways one relates to others in a particular learning environment. with respect to the mental work of internships, he suggests that we consider how interns are expected to use knowledge: is it fixed and immutable, or are students able to reorganize and transform knowledge? [moore , ]. similarly, we can evaluate the social relationships in particular contexts to see the degree to which interns are relied on and able to participate in the definition and creation of knowledge. this framework provides two spectrums useful for evaluation: is the mental work of the internship more rote and algorithmic, or creative and transformative? and are the social relationships in the environment hierarchical and controlled, or collegial and participatory? [moore , ]. what follows is a comparison of the two internship experiences, using moore’s matrix where appropriate.
selection, planning, and orientation

in choosing the final candidates, each of the mentors made selections from the pool of applicants based in part on students possessing complementary skillsets, the ability to work collaboratively with others, and a natural curiosity and willingness to learn. however, as noted, there were a number of differences in the implementation of the two internships. the geospatial interns were supervised by a project team of four–penn state’s geospatial services librarian, gis specialist, maps library manager, and research data management specialist–that worked for three months prior to their start to plan and scope the project. preparations for the arrival of the geospatial interns included producing a document outlining general professional expectations, organizing introductory reading material on the history of fire insurance mapping–of which the sanborn collection is a part–producing a step-by-step technical protocol, and developing and writing about goals for project and intern learning outcomes. interns were oriented to the project by the team at the beginning of the summer and integrated into all project activities from that point forward. the text analysis interns were supervised by the digital humanities research designer, though they were largely under the tutelage of the social sciences data curation fellow throughout the initial phase of their project during which data management was one of the primary concerns. in preparation for their start, a general scope of work, with learning goals identified and a rough project timeline outlined, was developed. the project timeline included general phases of research project development, with a lot of flexibility built in to accommodate the students’ project.
upon their start, the students were introduced to the goals and expectations of the internship, but then given latitude to define the nature of their research project.

roles and responsibilities

the geospatial interns occupied much of their time georeferencing and extracting data from the sanborn fire insurance maps, and then creating arcgis web-based applications to view and search the data. after the initial orientation, interns took over updating the step-by-step protocol as issues arose. interns also took the lead on researching and developing land-use codes for buildings. significant guidance was given to the interns on how to create good quality metadata. the varied experiences and skills of the geospatial interns were key to the project. the geospatial intern with a landscape architecture background applied the use of graphics software from architectural coursework. the geospatial intern with a geography background made connections across geographic elements from geography coursework. interns kept a log of their daily activities and questions on a network drive that was monitored by all team members and served as a diary of sorts at summer’s end. intern updates on activities, difficulties, and progress were also delivered to the rest of the project team during weekly project meetings. the text analysis interns’ day-to-day tasks were self-directed and aligned to their research goals and project design. the text analysis interns conducted a significant literature review, acquired chat logs to serve as their source data, and worked with mentors to store, clean, and analyze the data. their study seeks to determine the extent to which online roleplayers make use of language in the construction of narrative, using computer-assisted methods to identify the particularities of the language of online roleplay [o'sullivan et al. ].
the text analysis interns had regular contact with project mentors, but the dynamic was more similar to an apprentice working with a mentor than an employee working with a supervisor. the level of direction and scaffolding provided to the interns was the most significant difference in the two internship experiences. as outlined above, the geospatial interns received considerable direction at the outset, and regular ongoing feedback throughout the duration. with respect to the mental work based on moore’s framework, though, both sets of interns performed complex tasks and advanced ways of thinking. the text analysis interns did have more autonomy over their day-to-day tasks, but they still relied on significant guidance from the mentors to develop the digital research skills to complete their research project. because the geospatial interns were not assigned mundane and routine tasks, but rather, charged with responsibilities that demanded they apply the methods of digital scholarship correctly and adapt to changes in project needs or tasks, both internships provided meaningful opportunities for students to acquire and use "scholarly" knowledge [moore , ]. also, the social relationships in both internships were collegial and participatory. that is, while the geospatial interns worked within a more traditional supervisory structure, they were also included on the research team as active participants, with a shared responsibility as a vital part of a major project’s progression.
the text analysis interns developed their sense of responsibility through ownership, in that they were made aware from the outset that the success or failure of the project would be a direct consequence of their own efforts. that said, students were also made aware that "failure" did not mean incompletion, and that exploration, discovery, and learning were to be privileged over the delivery of expected outcomes. moore says, "in classrooms, students rarely have the opportunity to be truly responsible - not just punctual or obedient, but to have others actually count on them for something meaningful" [moore , ]. the digital-project-as-pedagogy approach in both internships gave students the chance to engage in meaningful projects that required them to think critically, adapt to changing demands, collaborate with colleagues, and identify when they needed guidance and feedback. in this section, mentor and student intern perspectives give some insight on the approaches and outcomes of the internship experiences. the mentors provide a self-reflective account of the internships, while the student perspectives are based on a thematic analysis of a qualitative survey which was conducted upon completion of their employment.

mentor perspectives[ ]

a number of common challenges emerged across both internships, along with a variety of project-specific issues. the first challenge was the selection of the candidates, since mentors had not simply to choose the students with the most direct experience, but rather to pair individuals with complementary skillsets who we felt would function well within a collaborative setting. the text analysis project also required students who could work independently, conducting the unsupervised research necessary to further both the theoretical and technical aspects of their project.
in this instance, particularly in relation to the text analysis project, there was also a need to judge the motivations of applicants–while curiosity among undergraduates is to be encouraged, there was a sense that many of the students were more interested in securing an internship–any internship–than they were in the digital humanities as a subject matter. in essence, we found the selection of the candidates was not just about experience, but a balance between aptitude, attitude, and interpersonal skills. in both instances, the aim was to select interns who would be able to leverage their skillsets in different and mutually beneficial ways. there is some tension in this approach, as it opposes most other pedagogical contexts: where the process of choosing candidates for an internship is highly selective, in the classroom you are typically not in a position to engineer your learner-dynamic. the aforementioned tension emerges from the realisation that the impact of such initiatives, which are not necessarily replicable in a broader range of contexts, is limited. as outlined, one of the key differences was that the geospatial interns were assigned to an existing project, whereas the text analysis students worked on a research topic of their own choosing. when creating internships involved with a pre-existing project, it is important to consider the level of interest and student engagement. in the case of the geospatial interns, student engagement was fostered by continual positive social interaction and role-modeling professionalism and engagement of all team members throughout the project. this generated the essential sense of ownership inherent in the alternative student-driven project, while also giving students a sense of collaborative responsibility.
with the text analysis interns, the mentors felt it was vital that the project’s focus was student-driven, as this would ensure their commitment to the undertaking when faced with the inevitable technical barriers throughout the processes of gathering and analysing the data. in the text analysis project, any potential failure would be the students’ own, whereas in the geospatial internship, students were aware that their component was an essential part of a larger whole, and thus, benefited from the experience that comes from working within a broader team. the geospatial interns benefited from having clearer milestones and indicators of success given the larger project context they were working within. the text analysis interns had to navigate through the uncertainty of conducting research employing digital humanities methods, absent the structure of a more typical professional internship. there are tradeoffs to consider in both internship models, with one privileging technical and skill development, and the other prioritizing more holistic research skill development. as noted, the two projects adopted different approaches to supervision. the model used to supervise the geospatial interns was one of co-supervision shared by three individuals, with the supervisor in closest proximity to the workstations of the interns serving as a daily point of contact. the mentors felt that it was important to have daily contact with the interns in order to foster collaboration, integrate them into the project, and give them real-world experience working in a professional environment. the entire team also met on a weekly basis to discuss aspects of the project, alternatives to adopted approaches, and assess progress towards the end goals.
this process enabled the students to build communication and negotiation skills, as well as learn to compromise on those elements of the project where a unified vision was needed. in contrast, the text analysis interns worked very much in isolation, liaising with their mentors as their research requirements dictated. over the course of the project, direct meetings were predominantly reserved for those instances where the students required instruction in a specific methodology. there were some clear benefits to this approach, in that the students seemed to cope well with the demands of a project’s initial research requirements: they produced a very thorough literature review, and were proactive in the gathering of a suitable dataset. however, the chief supervisor also noticed considerable scope creep at various junctures, and that between meetings, students had wandered from the guidelines offered during previous interactions. on multiple occasions, the mentor found it necessary to remind students of their central research question, and how best to re-focus their efforts on answering that question. upon reflection, this approach gave the students a real sense of the demands of independent or small-scale collaborative research - which is still the major component of research-based positions, even in the digital humanities - but some further direction would certainly have helped the students achieve their intended deliverables. the geospatial interns were exposed to other units and departments within the library so that they could situate their projects within a wider professional context. it was important for them to learn how the project related to other units in terms of deadlines, roles, contributions, and limitations. as noted, the text analysis interns worked independently, and so they did not further their understanding of how various departments contribute to the institution’s overarching strategies. it was hoped that they would spend some time working with the digitization and preservation department, but the dataset that their research necessitated did not require digitization, and so this element was removed. the relative autonomy allowed the students to see how scholarly research, and particularly digital scholarly research, is conducted – a significant amount of independent work with points of collaboration with specialists when the project dictates that level of support. mentors observed that the students’ enthusiasm waned in the final weeks of the project. this may have been due to the length of the undertaking, as most undergraduates are not used to projects of this scope, but it may also have been due to a lack of stimulation in what was an isolated setting. we hoped that their interest in the research project would be sufficient to overcome this issue, but there is certainly some merit to suggesting that students should engage with a variety of units and departments if only as an exercise in breaking the monotony of independent research and providing them with some additional context and routine, as well as introducing and fostering a sense of community. professionalization was an important part of both internships, the intention being that students would emerge from the experience having developed more confidence in their ability to negotiate workplace dynamics. this was accomplished, in that interns appeared to increase their involvement as the projects progressed, making vital contributions towards the future directions of the projects.
our implementations suggest that a major risk of the internship model is that, in the event that students do not engage, the investment of mentors’ time carries no guarantee of concrete rewards, either in terms of project output or student learning. one of the failings in the text analysis internship was the student engagement with the more technical aspects of the project. the nature of the dataset was such that students spent a considerable time gathering and cleaning chatlogs from online games, leaving little time for the analysis phase. while working through a series of computer-assisted methodologies with the interns, their supervisor felt that the students struggled with the volume of information, and had at that point suffered from a loss of motivation. to that end, while they drove the research objective, and gained a holistic understanding of a digital project’s lifecycle, the extent to which they expanded upon their technical expertise is less certain. from the perspective of its product, the project was a success, in that the students produced a research report of some significance. from a pedagogical perspective, the students now understand what constitutes rigorous digital scholarship, and the steps required to accomplish such. however, it would have been better if more structure had been provided so as to ensure that they also emerged with more methodological expertise, as this was one of the expected learning outcomes. supervisors of the geospatial interns reviewed their outputs during multiple stages of development, including the overall aesthetics of the output, accuracy, consistency, and thoroughness, an approach which the text analysis project could also have adopted.
in terms of determining the interns’ transformation of information, based on observations of their knowledge and experiences at the beginning of the internship compared to their experiences at the end of the internship, it is evident that the work conducted led to a transformative experience. this transformation can be characterized as the development of knowledge about processes and topics that enable the learners to take their skills and experiences from the internships and transfer them to new situations and activities. furthermore, these experiences informed the mentors on the importance of focusing on the specific needs of the intern, and how such is often challenged by the surrounding organizational, administrative, and project needs. internships in this field should privilege the development of a student’s digital skills, rather than seek to accomplish any specific research output, though accomplishing such should be encouraged, and indeed act as part of the means by which success is measured. thus, an intern-centric undergraduate learning experience should be adopted, wherein topics on the periphery of core curricula can be integrated into the project.
student perspectives
as this study is focused on the experiential aspects of the internships as pedagogical models, a qualitative approach was adopted for the analysis of the student perspectives. a common survey was issued to the interns, in which they were asked to respond to three questions:[ ]
1. what skills did you develop during this internship, and which do you feel you will use again?
2. what aspects of this internship did you find most beneficial?
3. were there aspects of this internship that you found disappointing or did not meet your expectations?
the questions were deliberately open, so as to not lead student responses. using thematic analysis, we approached the data with two concerns: what insights could be gained into the development of the students’ research, technical, and professional skills, and what other, unanticipated themes emerged across each of the groups? it is worth noting that students were seen as collaborators throughout this process, their contributions to this study a key part of its scholarly value. furthermore, their participation was very much a success in the sense that they produced outstanding work of considerable substance.
moore’s uses of knowledge
a number of themes emerged from the respondents, the most prominent of which was "problem solving." the interns agreed that this was both the primary skill they developed and the most beneficial aspect of their internship. this is a positive finding, in that it reflects the pedagogical ethos of the humanities and social sciences, seeking to foster critical thinking among undergraduates. it also shows how students, with appropriate training, can learn how unfamiliar technologies and techniques can be applied to the creation of new knowledge and meaning. interns also drew much attention to the value of those transferable generic skills which they felt would be of use in their future careers. several of the interns also referenced specific technical skills - it was particularly encouraging to see that the text analysis interns recognized both the technical expertise and broader professional competencies which they developed, as their chief supervisor placed little emphasis on the latter.
they clearly realized the broader professional value of the internship, as well as the potential for applying digital methods to a broad range of activities beyond academia: "i developed skills in technical writing and reading, experimental design, data analysis, data management, and team-based research. i feel that [these] skills will be very useful in my future, no matter what i choose to do."
moore’s social relationships
the experiences of the interns were clearly impacted by the interpersonal dynamics of their projects: "working with another intern to help solve problems and make compromises and decisions was one of the most important aspects..."; "i feel like our project really benefited from two of us working together … bouncing ideas off of one another, dividing up the tasks either of us were best suited to, but each contributing even when the other took the lead on one step." the geospatial interns also benefited from working with professionals across different departments and units, while the text analysis students articulated that they achieved the initiative’s primary objective of giving them an understanding of how to bring a digital project through its complete lifecycle, from concept to fruition. a related theme–engaged scholarship as having the potential to offer more than what is permitted in a classroom setting–also emerged: "in school, all of our work is done in a single semester and often alone, so working on a project that required we not only think about immediate outcomes, but also future uses and applications and collaborating with others was very helpful." the majority of students cited a lack of structure as being one of the drawbacks. this criticism was far more evident among the text analysis students, where they were largely left to work independently.
extensive planning was conducted for the geospatial internship, so a reference to a lack of structure in this instance is possibly due to adjustments being required as the project progressed. this is a natural consequence of any large-scale collaborative project, and so ideally, the need for some level of uncertainty would have been appreciated by the students. a related frustration–the pursuit of blind alleys–was cited by the text analysis interns, which again, is inherent in any such undertaking, and something which we hope they now realize. the extent to which the text analysis interns were allowed to work independently could perhaps be revised in any future iterations of this initiative, in that the students clearly wanted more supervision. the post was advertised and described during the interview process as being an independent study, wherein the internship would largely be driven by the students’ own interests and ability to develop new expertise under limited supervision. the extent to which undergraduates can comprehend the significance of such an absence of structure was arguably overestimated. a better approach might have been to facilitate some preliminary, even extensive discussion at the start of the internship about what "independent study" means, especially if this type of academic experience is new to the students. this could be followed by periodic reviews of the format of the undertaking and its ramifications for the work. it is also worth noting that the students made no references to any difficulties in mastering particular materials or tasks, which would suggest that this model could benefit from students receiving feedback on where they excelled, and where further improvements could be sought, so as to increase their own awareness of their strengths and weaknesses.
a similar approach to that utilized in the geospatial internship was pioneered at bucknell university, where undergraduates worked as research assistants on the stories of the susquehanna valley project.[ ] reporting on the results of their program, katherine m. faull and diane jakacki conclude:
extending the classroom outside (both spatially and temporally) allows for the development of rich, deep knowledge in both digital tools and research subject matter. indeed, extending the faculty-student collaboration to include students from outside traditional humanities departments also reifies the value of interdisciplinary research at an early level and reflects the professional dh research model employed by larger-scale projects. [faull and jakacki ]
the experience at penn state, both in the geospatial and text analysis internships, supports these claims. restating moore’s spectrums of mental work, knowledge engagement and social relations, undergraduate student internships can be interpreted as serving two distinct but overlapping roles in student preparation. first, they can be seen as a direct extension of classroom work where information literacy and critical thinking–or research–are the focus; application outside of the classroom is the learning environment, and learning outcomes are evaluated in relation to student preparation for further engagement with academics. alternatively, internships also acknowledge the gaps left by classroom preparation and are a response to the reality that most students will not become professional academics, but rather will work in a variety of professional settings.
in a student-centric internship model, we encourage advisors and supervisors to be somewhat flexible in adjusting the structure and expectations of their interns according to their stated professional goals rather than preconceived learning objectives or project goals. although not by design, both internship case studies were successful in this regard, perhaps as a result of careful review and selection of intern candidates. as has already been emphasized, the selection step of an internship plays an important role in setting the stage for internship success and failure from both the perspective of the students and advisors. any similar undertakings would be best advised to engineer a dynamic which represents the best possible opportunity for success, pairing complementary skillsets and personalities, though these can be difficult to assess, particularly the latter, through a limited application process. other institutions looking to implement similar models need to be very clear on their purpose: is the aim to have students develop their ability to negotiate the workplace, or learn how to do advanced research? it is possible to accomplish both, but depending on restrictions on time and resources, it may not be possible to achieve an equilibrium. nor is it desirable to give students a false sense of a particular dynamic: the reality is that, major interdisciplinary projects excluded, most scholarship is still conducted in a largely isolated manner. this is not to say that we support the status quo, but merely acknowledge that it would be irresponsible to have students believe that a career in scholarship, particularly in the humanities, will be predominantly occupied by collaborative endeavours–this might change, but most hiring and promotion committees still privilege "traditional" forms of scholarship.
regardless of an internship’s stated purpose, students should develop competencies beyond those cultivated in the classroom, and the supervisory team should stress the importance of transferable skills, so that students can see the utility of any newfound expertise. internships provide an opportunity to liberate students from the constraints of the classroom, affording them the chance to make and learn from mistakes, to explore potentially fruitless strands of inquiry, and delve deeper into topics of personal interest. when developing a program model in such a context, moore’s framework is helpful to review, considering the extent to which the work will be rote rather than creative or transformative: will there be enough of an intellectual engagement for the internship to be worth it for the student? how controlled is the situation? will interns have the ability to make a real difference in the project as equals or is the information very controlled and top-down? defining suitable barometers for the measurement of success also presents a challenge. the primary purpose of the text analysis internship was to give students a sense of what is required in the completion of a digital project, an aim which, judging from their final research outputs, was achieved. however, it cannot be said with confidence that they could replicate some of the methodologies that were utilized without assistance, and a future iteration of the initiative would do well to ensure that the project’s dataset can be gathered quickly, allowing more time for students to gain familiarity with some of the more technical aspects. this experience was shared by the geospatial interns, who spent a considerable amount of time having to digitize physical materials.
when defining those objectives which will determine the success of a project, it is essential to allow for the time necessary to investigate challenges and problems, both technical and intellectual, that arise throughout the research process. supervisors must also account for seemingly obvious yet often overlooked social realities, like students wanting to take holidays during the summer semester, which was when these internships took place. the danger in such programs is that mentors will impose their own value systems on their participants, expecting them to adhere to scholarly principles and standards that undergraduates cannot reasonably be expected to have attained. even faculty with extensive teaching experience might find that they are demanding too much of participants who are embarking on what is likely their first major research undertaking, and expectations must be continuously revisited before the project begins so as to ensure intended outcomes are realistic. while it is important to have a clear set of expectations before the project begins, flexibility is crucial, as it affords interns the scope necessary to pursue unforeseen developments, thus emphasizing learning outcomes over tangible deliverables. students must not be afraid to "fail," for any such fear will restrict critical and creative exploration. equally, considering the value of the opportunity, the demands placed on interns should be as challenging as they are reasonable, and mentors need to be comfortable reprimanding any participant whose performance might not be meeting expectations–this is difficult, in that you may well be dealing with inexperienced individuals with whom you have started to build a rapport atypical of what is usually established between faculty and undergraduates.
striking a balance between authority and understanding is a leadership quality that is not easily attained. institutional limitations must also be considered–the reality of engaged scholarship is such that it is not always feasible. penn state, like other institutions such as bucknell where similar programs are already in place, has a network of faculty and staff whose remit is to support digital scholarship and undertakings of this nature. at other institutions, where faculty may be working without appropriate support from suitably-qualified peers, implementing such an approach might prove to be far more challenging. having undergone significant investment in personnel whose mandate was to develop the institution’s capacity for digital scholarship, penn state is in a position to pursue such initiatives. the culture of the departments involved is such that the inherent power differentials in staff and faculty-like employee classifications did not influence the internships–the projects operated as a nearly flat hierarchy, with all faculty, staff, and students contributing as important stakeholders. in a different context, wherein such an initiative might be contained within the hierarchical structures embedded within an institution, one could envision problematic scenarios where student time is prioritised in terms of labour, and dichotomies within employee classifications are reinforced in the minds of emerging scholars. the danger in these models, which involve both intellectual and practical components, is that interns might develop false scholar versus technician personas, based on the perceived roles of contributors.
it is imperative to the future of digital scholarship, which has considerable issues around the division and acknowledgement of labour, that artificial power-structures are not reinforced in the minds of the next generation. this extends to both the students [di pressi et al. n.d.] and their perceptions of the roles played by all those staff enabling and contributing to the internship. at institutions where faculty-staff classifications exist, measures must be taken to ensure that each partner is encouraged to make a direct contribution to the scholarship’s intellectual vision, and that any technical effort is recognized for its inherent value. administrative support must also be in place, particularly if course credit is going to be one of the motivating factors for students. the geospatial interns had the option of obtaining credit through their respective departments, an option which one of them pursued. while a commitment to alternative modes of learning is present in particular schools and departments at penn state, one cannot assume that cultural differences do not exist across disciplines, and that similar administrative support for engaged scholarship would be present across the entire institution. cost is also a major consideration, particularly here, where the bednar program allowed us to pay students for their participation in the internships.[ ] the likelihood is that most institutions would not be in a position to pay interns, nor would they have the capacity to provide the technology necessary for students to pursue digital modes of scholarship. unpaid internships might only serve to further the field’s current issues with diversity, in that they would be exclusive to those students in the privileged position of not having to consider remuneration when pursuing extracurricular placements.
engaged approaches to teaching digital scholarship provide a mechanism to explore areas of interest in a collaborative and multidisciplinary manner. a digital project requires contact with primary and secondary sources through the lens of digital technologies. by creating opportunities to take an inquiry and explore it in a project environment, interns are learning the importance of understanding how these sources can be negotiated and manipulated through the digital, and what the deep and significant repercussions of this act might be. given the resources, appropriate planning, and clear objectives and success metrics, we conclude that the internship model can be a highly effective learning experience for students, developing both their digital literacies and their professional skills. with the right guidance and meaningful work, the digital-project-as-pedagogy can be a powerful teaching approach in the digital humanities.
the conveners of this initiative would like to thank nicki hendrix, joe fennewald, barbara dewey, and christopher p. long for their support. funding for this project was generously provided by library endowments made possible through the charitable contributions of donald w. hamer and marie bednar.
notes
[ ] we use the term "digital scholarship" as opposed to "digital humanities," as, while we recognize that the latter term is useful in identifying the specific field or community, digital scholarship at penn state, as at many other institutions, involves collaborators who identify as being from beyond the humanities, with many hailing from the social sciences. but for the purposes of this paper, this is a minute detail, and, functionally, the terms could just as easily be used interchangeably.
[ ] for example, many of penn state’s dh initiatives, and in some cases, faculty appointments, are jointly funded by the university libraries and the college of the liberal arts.
[ ] the pennsylvania state university libraries has for the last years administered a paid undergraduate internship program that is supported by the generous endowment of donor, donald w. hamer, and former employee, marie bednar. the endowment is to support and enhance the university libraries by providing monies for an internship program to enable undergraduates to participate in an active and collaborative learning experience and to gain career experiences in the student’s field of study. both of the projects outlined in this paper were supported by bednar funding.
[ ] using the survey detailed in the student perspectives section, we give the students an opportunity to reflect on their experience. however, it would have been ideal if, given a broader scope, they had been given the opportunity to survey their mentors more comprehensively.
[ ] students were aware of the purposes of the survey, and gave permission for their responses to be directly quoted in this paper.
[ ] cost also relates to scale and departmental or school commitment. success for a digital scholarship internship program might be defined by increased scale, which itself is dependent on the growth in the number of faculty engaged in these approaches and on commitment by college/departmental administrations.
works cited
bjork bjork, o. ( ) "digital humanities and the first-year writing course." in: brett d. hirsch (ed.). digital humanities pedagogy: practices, principles and politics. cambridge: open book publishers. pp. – .
clement clement, t. ( ) "multiliteracies in the undergraduate digital humanities curriculum: skills, principles, and habits of mind." in: brett d. hirsch (ed.).
digital humanities pedagogy: practices, principles and politics. cambridge: open book publishers. pp. - .
faull and jakacki faull, k.m. & jakacki, d.k. ( ) "digital learning in an undergraduate context: promoting long-term student–faculty place-based collaboration." digital scholarship in the humanities. [online] (suppl_ ), i –i . available from: https://doi.org/ . /llc/fqv .
moore moore, d.t. ( ) "perspectives on learning in internships." journal of experiential education ( ). pp. - .
o'sullivan et al. o'sullivan, j., shade, m., rowles, b. ( ). "player-driven content: analysing textual communications in online roleplay." in digital humanities : conference abstracts. jagiellonian university & pedagogical university, kraków, pp. - .

int j digit libr ( ) : – doi . /s - - -
introduction to the special issue on digital scholarship
stephen griffin
published online: february © springer-verlag berlin heidelberg
digital scholarship, or “cyberscholarship” (that based on data and computation), is radically reshaping knowledge discovery, creation, analysis, presentation and dissemination in many scholarly domains. distinguishing features of digital scholarship are multi-stage workflows that often involve cross-disciplinary collaborations, use of a large variety of information objects from multiple sources, new research methodologies, innovative data analytics and multiple forms of presentation of research outcomes. the enabling environment for digital scholarship is a rapidly expanding global digital ecology composed of richly annotated datasets, open source tools and a growing appreciation of open access digital publication of text and data as a measure and driver of scholarly productivity.
this special issue contains papers that report on research related to the broad set of activities that enable digital scholarship. for digital scholarship to flourish, consideration must be given to the entire data lifecycle. the digital libraries community has laid the foundation for digital scholarship by developing information environments and resources and by exploring new interdisciplinary problem domains. as large volumes of “born digital” data are created and legacy collections are converted to digital form, new possibilities for scholarly work appear. to the degree that repositories can be interlinked at the data element level, interoperability and functionality are greatly increased. the result is knowledge infrastructures capable of supporting a broad spectrum of scholarly activities. semantic web and linked open data activities are developing standards, protocols and best practices for achieving this.

s. griffin (b) school of information sciences, pittsburgh, usa e-mail: sgriffin@pitt.edu

michael lesk’s introduction to the issue offers a rich historic perspective on digital libraries research and digital scholarship. he reminds us that the current information environment is a result of the efforts of many people from many kinds of organizations. often they were focused on different goals. what is seen as essential is the communication of ideas, and this we must continue to value and improve through new models of scholarly communication.

the special issue contains six papers that describe outstanding examples of contemporary digital scholarship. they are:

digital field scholarship and the liberal arts: results from a to sandbox, james proctor (corresponding author)
in this paper, the results of a sandbox (or collective on-line experiment) involving digital field scholarship (dfs) are discussed. dfs is defined as scholarship for which field-based research, concepts and methods are significant.
the ability to easily incorporate spatially and temporally indexed data into dfs opens new opportunities for disciplinary and multidisciplinary work. it also greatly expands the problem space in a broad set of topical domains in the sciences and digital humanities.

exploring publication metadata graphs with the lodmilla browser and editor, andras micsik (corresponding author)
linked open data (lod) and semantic web are revolutionizing repository interoperability. by structuring and creating relationships at the data level, altogether new functionalities result, transforming browsing, search and retrieval across web sites. in this paper, generic functions for browsing lod repositories are discussed as well as new search strategies and data requirements necessary to meet the needs of digital scholars.

towards robust tags for scientific publications from natural language processing tools and wikipedia, michał lopuszyński (corresponding author)
text mining and natural language processing have long been central research areas in digital libraries research. given the vast amount of internet accessible textual content, tools that enhance the means to extract value from stored textual information improve the ability for the content to be searched, organized, presented and used. this paper describes research in methods of tagging scientific publications with labels gleaned from wikipedia and the arxiv preprint collection. the resulting analysis yields insight into the effectiveness of the different approaches and suggests what utility the methods may have for digital scholarship.

visinfo: a digital library system for time series research data based on exploratory search, jurgen bernard (corresponding author)
increasingly, non-textual knowledge and data are the basis for much scholarly work. images, audio, multimedia, computational models and a wide variety of other data assemblages have proven capabilities for advancing digital scholarship.
this paper introduces “visinfo”, a web-based exploratory search system for time series research data. visinfo is the product of collaborative efforts by data providers, digital librarians and computer scientists. the result is a powerful suite of tools for scholars dealing with large data sets covering extensive time periods.

what lies beneath?: knowledge infrastructures in the subseafloor biosphere and beyond, peter t. darch (corresponding author)
this paper reports the findings of qualitative case studies of data practices in four multidisciplinary scientific collaborations: two large collaborative projects and two small projects. it examines an increasingly important aspect of digital scholarship: data workflow and the artifacts that naturally result from working with very large datasets. the context of the paper is the change in research practices resulting from the “data deluge” over the past decade and what that portends for research and scholarship as viewed from multiple perspectives. contemporary digital scholarship is in a constant state of flux as data becomes inextricably entwined in the social, cultural and practical dimensions of research practice. this paper touches on these issues and others of significant and immediate import.

image restoration; photonegative digitization; historical documents, george v. landon
central to digital scholarship in the humanities and social sciences are collections of printed manuscripts, photographic images, audio recordings, handwritten documents and more. conversion of these often requires preliminary restoration work to bring them to a condition where accurate digitization can be done. photographs on acetate-based film cause special challenges because of rapid deterioration over time. yet, collections of photographs are among the most important sources of historical content for digital scholarship in many disciplines.
this paper describes the complex technical dimensions of automated image restoration and shows promising outcomes for use in memory institutions that are holders of important historic collections.

the papers give a glimpse of the continuing movement toward data-centered approaches to inquiry that have now become a staple of research and scholarship in almost every disciplinary domain. digital scholarship will continue to proliferate as network-centric models of scientific communication become the norm and the future reporting of scholarly work accommodates dynamic and recombinant document models capable of fully reporting scholarly research across all workflow stages. much of the seminal work in developing the information environments and resources that support digital scholarship can be linked directly to digital libraries research, past and present. digital libraries research will continue to inform and provide the essential information resources for scholars in the years to come.

affective labor, resistance, and the academic librarian
lisa sloniowski
library trends, vol. , no. , (“reconfiguring race, gender, and sexuality,” edited by emily drabinski and patrick keilty), pp. – . © the board of trustees, university of illinois

abstract
the affective turn in the humanities and social sciences seeks to theorize the social through examining spheres of experience, particularly bodily experience and the emotions, not typically explored in dominant theoretical paradigms of the twentieth century. affective and immaterial labor is work that is intended to produce or alter emotional experiences in people. although it has a long history, affective labor has been of increasing importance to modern economies since the nineteenth century.
this paper will explore the gendered dimensions of affective labor, and offer a feminist reading of the production of academic subjectivities through affective labor, by specifically examining the pink-collar immaterial labor of academic reference and liaison librarians. it will end by exploring how the work of the academic librarian may also productively subvert the neoliberal goals of the corporate university.

introduction
feminist theorists have long been concerned with questions of labor. what constitutes “women’s work” and how did labor come to be gendered? what are the social divisions of labor, and what are the economic and political implications of such divisions? this paper will focus on feminist theoretical reflections regarding the gendered dimensions of immaterial and affective labor in response to marxism and autonomist formulations of those terms, and also attempt to apply these reflections to the work of academic librarians in the twenty-first century. first, i will examine the marxist conception of immaterial labor and then turn to socialist feminist objections to the ways in which domestic labor is cast as unproductive and almost invisible in this formulation of work. second, the autonomist conception of immaterial and affective labor will be explored through an examination of socialist feminist critique, as well as those offered by poststructuralist feminists and sociologists of emotion. finally, the workplace of the university itself provides a case study for the gendered immaterialization of labor.

my interest in this topic emerged one day at my own library workplace where, instead of spending the day clearing away my to-do list of concrete tasks, i spent most of it dealing with other people’s emotions. first, in the morning, i worked for quite a long time with a student at the reference desk who was in tears over her assignment that was due the next day.
then later in the afternoon, i had lengthy encounters with two different colleagues, one dealing with a workplace issue and the other with an unrelated personal problem. all three individuals needed support and encouragement in the midst of their respective small crises. when i arrived at home that evening i felt exhausted, and i also felt guilty because i had not “done anything” that day. by contrast, my colleague over in the digital services side of the library proudly announced that he had batch-ingested over a thousand objects into our university library’s new digital repository that very same day. what did i have to show for my time? around that same time period, another colleague posed a seemingly unrelated question to one of our internal listservs, asking why academic librarians with active research agendas were (in her opinion) so embarrassed to be “just librarians,” why they felt that research was more distinguished work. was it because librarianship was traditionally a women’s profession, and research more the domain of men and hence considered more prestigious? what was wrong with focusing on providing traditional library services like reference help, collection management, and teaching?

as a librarian with a reasonably active research agenda who is currently pursuing a doctorate with a focus on theorizing feminist archives and collections, the latter question really struck me. was i internalizing sexism in diverging from a more traditional librarianship role? what is the proper and historical labor of librarians anyway? what constitutes “doing something”? a quick review of library literature reveals that research agendas are not new to us: librarians have been writing and publishing for a long time in a variety of formats and publication venues. arguably, librarianship changed scope toward the practical when professionalized by dewey in the late nineteenth century. as garrison ( , p.
) discovered, dewey argued that librarianship had to become firmly established as a woman’s profession, for who else would work for such low pay and do such routinized work? her important article on libraries as domestic spheres also emphasizes the turn to feminize librarianship as a sort of housewifely role in the late nineteenth century, where women librarians were hired to create welcoming spaces and offer patrons gentle guidance toward edifying and educative literature. it would seem that the so-called traditional librarian roles of teaching, research help, and collections management were in fact tied to a gendered circumscription of the role as libraries emerged as publicly available entities during the late victorian period in the western world. the purpose of these libraries was largely about helping the working class attain social mobility, and as garrison describes it, “tender technicians” were required to help them in their journey (p. ).

modern formulations of the academic library followed suit, with a significant gender distinction in wages and prestige created between faculty and librarians. librarians were considered support staff, subservient to the scholarly and pedagogical output of the faculty, despite the fact that much of the work of faculty and students, particularly in the social sciences and the humanities fields, relied upon the collection building and research help of librarians (coker, vanduinkerken, & bales, ). as we will see later in this paper, one can make a parallel here with the domestic labor debates between doctrinaire marxists and socialist feminists over the value of different kinds of labor. it was a long struggle for librarians to become accepted as members of faculty associations, and in some places they still bargain independently of faculty.
thus an argument for a more robust understanding of the librarian’s role and potential research contributions in academe challenges the dewey-imposed ceiling on our work rather than reinscribing sexist disdain for the service work of librarians. but still this question remains: is the traditional affective work of the librarian of the last hundred years not important? does it not serve a unique purpose on a university campus, and is that purpose still unrecognized and undervalued? why? what about all those people in tears in my office? what about all the careful tending that goes into developing and maintaining research collections, which scholars then use to generate more scholarship? how do such activities fit into an analysis of our labor? the final section of this paper offers an exploration of the waged affective labor of academic reference librarians in the university.

in placing the existential crises the profession faces within the context of wider feminist debates and theorizing about labor, i hope to underscore the value of librarianship to both the digital age and the university campus. in other words, this exploration serves a broader purpose than simply reassuring librarians that their work is important, although such reassurance is by no means frivolous. more importantly, however, in identifying the often unrecognized or unproblematized affective work of academic librarians in knowledge production and education, we can also analyze the overarching production culture of knowledge work from new angles. a feminist reading of academic labor that points to the production of academic subjectivities and human capital in neoliberal institutions of higher learning, and that acknowledges dependence on the often invisible pink-collar labor of academic librarians in these production processes, is a curious gap in the literature that this paper will begin to address.
marxist feminism and socialist feminism
the first feminist interventions in the theoretical conversations about immaterial labor came in response to the work of marx. feminists responded to marxist distinctions between productive labor that creates surplus value and unproductive or immaterial labor, such as housework or childcare. challenges arose over the implication that the labor needed to create market commodities and wealth is the only form of “productive” work in capitalism. the domestic labor debates of the s and s emerged from this tension. marxist feminists and feminist socialists sought to correct marxist views by pointing to the importance of unpaid reproductive and domestic labor in capitalist societies (mann, , p. ). this body of feminist work asks us to rethink what constitutes labor and the notion of the role of the household in social reproduction (p. ). a core question concerned whether domestic labor operated inside or outside capitalist production (weeks, , p. ). marxist and socialist feminisms viewed the distinction between “production for use” versus “production for value” as key to forming the structural basis of inequality for women in society because production for use (for example, housework) “is unpaid labor and provides no basis for women’s independence” (mann, , p. ). such labor is also isolating, allows for no specialization of skills because housewives are expected to be equally good at a range of diverse tasks, and women are expected to be always available for work that has neither beginning nor end (p. ). marxist feminists argued that women’s unpaid labor neatly served the profit-making desires of capitalism, and that women were dominated and exploited by this economic system; socialist feminists, on the other hand, argued that gender and class were interlocking systems of oppression, and that women were doubly oppressed by both capitalism and patriarchy (p. ).
both groups focused on women’s role in the social reproduction of labor power through childbearing, childrearing, and housework, and usefully argued that women were producing and reproducing commodity labor power that is “the most valuable commodity under capitalism because it produces surplus value” (p. ).

if domestic labor is a critical form of production, then it becomes important to define the constituent elements of immaterial labor in the domestic realm, such as “education, communication, information, knowledge, organization, amusement/entertainment, and specifically, the supply of love, affection and sex” (fortunati, , p. ). rethinking the supply of love as a form of labor is in itself a particularly subversive political and theoretical move. it should also be noted, for those readers interested in the possibility of capitalist resistance, that affective labor was long understood in certain feminist traditions as fundamental both to contemporary models of exploitation and to the possibility of their subversion (weeks, , p. ).

various critiques of the domestic labor debates and marxist and socialist feminist approaches to labor have been offered. weeks suggests, for instance, in her article “life within and against work: affective labor, feminist critique, and post-fordist politics,” that these formulations privilege housework over other forms of affective labor and are locked inside a logic of dual systems (private/public spheres) (p. ). she also suggests that the differences in laboring practices among occupations, and the subjectivities that might be developed as a result of such practices, were grafted by later socialist feminist-standpoint theorists onto a problematic logic of separate spheres that tries to locate an epistemology or ontology of women’s work. this grafting ultimately replicated a binary, two-gender system that serves to essentialize gender identity (p.
). weeks also suggests that socialist feminist political-economy analyses hit a theoretical wall because they did not adequately register the transition from fordist to post-fordist modes of production wherein immaterial labor becomes increasingly prevalent and arguably valorized in social and economic structures (p. ). this last claim is problematic because weeks seems to ignore more recent work from feminist thinkers posing important questions about the gendered division of labor in post-fordist societies. nonetheless, she is correct that post-fordist modes of production demand that we examine the gender divisions of material and immaterial labor more carefully and in new ways.

in relation to contemporary academic librarianship, particularly the work of reference and liaison librarians, we can see some parallels. while academic departments and the faculty within them are understood to be revenue generating by producing surplus value in the form of attracting students to the university, libraries are often understood as expensive cost centers. the work of librarians in supporting faculty research, teaching information-literacy skills to students, and building and maintaining collections is undervalued, as evidenced by the low profile that librarians have on most campuses and the ways in which the operational budgets of libraries are continually under siege (kniffel, ). the socialist feminist tendency to glorify women’s work and essentialize gender is also occasionally prevalent inside librarianship, as demonstrated above by the anecdote of my colleague’s defense of traditional library roles, as well as outside the profession in the frustratingly enduring stereotypes of librarians.
the question of post-fordist modes of production is interesting as well because most growth in libraries has come in the areas of digital services: digitization units, digital collections management roles, digital assets management positions, and digital repository managers, to name just a few. the immaterial labor of digital librarians is thus increasingly prevalent and arguably valorized as the future of librarianship in many library strategic-planning documents, hiring practices, library conferences, and librarian networks. it would be interesting to examine in more detail the demographic makeup of digital librarians, but it would seem that this is where the bulk of male librarians hang out (tennant, ). without replicating the binary system that limits our discussions to essentialist defenses of “women’s work,” we must nonetheless acknowledge that certain forms of affective and immaterial labor are privileged and prioritized more than others in libraries, and that gender certainly plays at least a partial role in this process. as we shall see, this trend is typical of the post-fordist workplace.

autonomism and immaterial labor
the concepts of immaterial and affective labor in post-fordist economies are most often traced to the work of a group of thinkers called the autonomists, whose work focuses on the biopolitical power of global networks and alliances in the “multitude.” while a number of autonomists have written on immaterial labor, for the purposes of this paper i will focus on hardt and negri ( ), who argue in empire that immaterial labor is increasingly dominant in post-fordist modes of production and based on the continual exchange of information and knowledges. such exchange also includes an emphasis on the manipulation of affect and human proximity in order to produce powerful social networks.
in hardt’s essay “affective labor,” it is specifically suggested that such labor is “better understood by beginning from what feminist analyses of ‘women’s work’ have called ‘labor in the bodily mode.’ caring labor is certainly entirely immersed in the corporeal, the somatic, but the affects it produces are nonetheless immaterial. what affective labor produces are social networks, forms of community, biopower” (p. ).

hardt further argues that the existence of immaterial labor is not new, but the extent to which it has been generalized throughout the entire economy and achieved a dominant position in the contemporary, post-fordist informational society is unique. nonetheless, he recognizes gender and race divisions within forms of affective labor, and the fact that lower value affective work is outsourced to developing countries (p. ). hardt suggests that clarifying unequal and oppressive divisions of labor within immaterial labor is critical. he identifies three types of immaterial labor that drive the service sector at the top of the informational economy: the informationalization of industrial processes; analytical and symbolic tasks versus routine symbolic tasks; and the production and manipulation of affects that requires virtual or actual human contact or proximity (pp. – ). he recognizes affective labor as critical to the production of collective subjectivities, sociality, and subjectivity itself, and sees the product of affective labor as a wittgensteinian “form of life” itself, hinting at affect’s reproductive capacities, or as he calls it, biopolitical production, or biopower from below (p. ). in short, hardt regards affective labor as the place where the boundary between productive and reproductive labor breaks down. he cautions, however, against celebrating maternal work in ways that reinforce the gendered division of labor and traditionally oppressive familial structures (p. ).
notably, hardt does not draw on the important body of feminist literature on subjectivity and intersubjectivity and collectivity, consensus, and coalition-building in his writing, and while he acknowledges feminist lineages, many of the feminist thinkers discussed below criticize him for not delving deeply enough into that work.

an autonomist reading of contemporary librarianship will be offered below, but suffice it to say that there is something deeply relevant here to the ways in which academic librarians work. our immaterial labor of collecting, organizing, preserving, and digitizing scholarly and creative works, which are then used to generate further scholarly and creative works, is both productive and reproductive. we “batch-ingest”; we circulate also. in addition, the affective labor of our student-support work, which is used largely to help students develop the academic subjectivities needed to earn their degrees and in some cases perhaps to go on to become scholars themselves, is reproductive in its own right, if not generally recognized or valued in university retention-assessment schemes and mechanisms.

the following two sections of this paper will explore feminist responses to the autonomists’ thoughts on immaterial labor that have been especially critical of the ways in which gender has been “added on” to the theory without adequate explorations of gendered forms of domination and oppression. some have attacked what they see as a problematic dualism: how mind work is privileged over emotional work, as if care work had no intellectual components. others have pointed to the materiality of care work and the ways in which embodiment is actually ignored in hardt’s ( ) and hardt and negri’s ( ) work, while other feminists are concerned that the autonomists reduce “women’s” work only to the body.
the implications of the outsourcing of domestic labor to migrant communities of women need further exploration, as do the ways in which information technologies impact domestic labor, because several of the theorists examined below are sceptical of hardt’s idealism in relation to these new social networks and forms of production. many have pointed out the ways in which the feminist lineages of the concept have been ignored in current debates. these criticisms will be examined below in more depth, and existing feminist critiques will be separated into two broad theoretical categories: socialist feminist perspectives, and poststructuralist feminist positionings. also examined are some relevant formulations that emerge from the sociology of emotions.

socialist feminist critiques of autonomism
first, we will look at the work of schultz ( ), who in her essay “dissolved boundaries and ‘affective labor’: on the disappearance of reproductive labor and feminist critique in empire” has offered a powerful feminist critique of hardt and negri. she takes exception to what she sees as their gendered definition of emotional or affective labor that maintains the dualism of separate spheres, and instead argues that affective labor involves strategizing and problem-solving, pointing out that emotional work is also mental work (p. ). schultz also suggests that hardt and negri present affective labor as nonobjectified and noninstrumental and disputes their idealism, countering that one does not find an egalitarian paradise inside social networks, virtual or physical (p. ). such a claim overly idealizes women’s work as spheres free from domination and exploitation in her view. she also disputes the way in which hardt in particular looks at the disappearing boundary between productive and reproductive labor, noting that women still do the bulk of care work in the home and take longer maternity and parental leaves.
according to schultz, “in this sense, the convergence model of production and reproduction reflects less the reality of labor relations than an increasingly hegemonic image of female subjectivity, where reproductive labor disappears into the holes and gaps of the patchwork that is the neoliberal working day” (p. ). she also notes that while the delegation of reproductive labor to underprivileged women, particularly migrant women, is an example of the displacement of the boundaries between productive and reproductive labor, “the thesis of a shrinking divide between production and reproduction appears absurd when one thinks of the neoliberal cutbacks to public services such as kindergartens and health care, which (re)privatize reproductive labor and force unpaid women to pick up the slack in the system” (p. ). schultz thus concludes that because empire offers no basis for a critique of the political economy of gender regimes, its subversive claims for the potential of biopolitical resistance fail.

like schultz, fortunati ( ) also notes the way in which the autonomists and other political economists valorize certain kinds of immaterial labor over others. in “immaterial labour and its machinization,” she argues that “the overall consequence of their discourse is that women again risk being reduced to the body” (p. ). in her reflections on the concept of immaterial labor, fortunati notes its overall growth and examines the increase in immaterial labor in the domestic sphere specifically as a consequence of the increase in old and new media (p. ). fortunati’s view is that the domestic labor arguments have been put to rest because most recognize that domestic work is productive labor able to create surplus value. therefore, the larger issue for feminists and others is to examine how the immaterial labor of the domestic sphere has changed.
she suggests that the time saved through the diffusion and adoption of domestic appliances has been filled up by

an increasing labor of housework organization and planning, micro-coordination of the various family members and their personal schedules and commitments, planning of children’s transportation, the logistics of the flows of goods and people within the house, knowledge and information activity aimed at the development of “informed” housewives/workers, and the adoption and use of information and communication technologies (icts) in order to remove the human body from education, communication, information, entertainment and other immaterial aspects of domestic labor. (p. )

such new elements of domestic labor are integral to the post-fordist desire for the production of human capital with skill sets suited to the manipulation of information and technology, as well as being important mechanisms of social control that encourage the consumption of new commodities like cell phones, tablets, and handheld gaming devices in the domestic sphere. one can see parallels in librarianship, with its increasing emphasis on uncritically oriented maker-spaces, as well as technology-training approaches to information literacy.

simultaneously, in the production of material goods, precarious labor and the intensification of the working day has increased along with immaterial labor. “immaterial labor has become productive for capital in a way that signals a wider phenomenon which is the exporting of the logic and structure of the domestic sphere to the world of goods, which always ends up resembling and being assimilated to the reproductive world” (fortunati, , pp. – ).
in pointing to the relationships between the public and the domestic spheres, fortunati also draws attention to the integrated and inseparable nature of the dual spheres and systems of analysis from a perspective more attentive to domination and oppression than that of the autonomists. for librarians, then, the traditional reproductive affective work of teaching, research help, and collections management is increasingly eroded by the need for them to develop deeper skills in manipulating and preserving digital objects. however, effective manipulations also require a sense of the affective behaviors and needs of academic users when approaching online collections. while it might seem easy at first to distinguish between different librarian organizational silos, the reality is that our work deeply impacts and shapes one another, as evidenced in our many committee and listserv battles over how best to approach our work. our tenure structures also require us to evaluate one another. nonetheless, certain forms of digital immaterial labor are valorized as mind work over the emotion work of liaison librarians, and such valorizations have their roots in gendered divisions of labor. ironically, however, outside of librarianship in the digital humanities and related fields, the contributions of digital librarians are often misconstrued and devalued as service work (shirazi, ). in their essay “gender at work: canadian feminist political economy since ,” luxton and maroney also point to this complex integration of spheres and inform us that the best contemporary feminist work now recognizes a complex interplay of capital accumulation, labor markets, state policies (especially regarding public-service funding and employment legislation), reproduction of labor power in daily and generational cycles, family household demographics, forms of organization and divisions of labor, and workplace, trade-union, and political organizing by workers. (p.
) they note that it is women who “mediate the contradictions between the two production processes and locations. gendered relationships and subjectivities are produced in the labor force as well as through ‘socialization’ in families or educational institutions” (p. ). analyses that challenge the boundaries between the reproductive and productive spheres align with hardt and negri’s ( ) belief that the boundaries are dissolving; however, feminist analyses demonstrate that such dissolution does not necessarily offer emancipatory opportunities for women inside capitalist and patriarchal systems. as this paper demonstrates, socialist feminist critiques of autonomist conceptions of immaterial labor focus on the ways in which previous feminist work has been largely ignored, how emotional labor and care work is not valorized as highly as intellectual immaterial labor, and how the dissolving boundaries between the productive and reproductive sphere reinscribe all manner of exploitation and oppression of women. although i did not focus specifically on this issue in this section of the paper, it is important to note that all three of the studies examined above also discuss the social implications of the ways in which affective labor and care work are increasingly delegated to racialized migrant communities, offering an important intersectional analysis of post-fordist economies. these theoretical and political interventions in the conversation surrounding immaterial labor are important to any conversation about twenty-first-century labor.
however, lessons gleaned from poststructuralist formulations suggest that perhaps the theoretical perspectives of socialist feminists could also be improved by an examination of the regulatory regimes of heteronormativity in their explorations of the domestic sphere, and should pay deeper attention to cultural, symbolic, and discursive forms as mechanisms of oppression that are as powerful as the political economic structures that operate to devalorize work among certain social groups.

poststructuralist feminist critiques and the sociology of emotion

poststructuralist feminist thinkers share certain concerns with socialist feminists, particularly in regard to their suspicions about the ways in which the autonomists undertheorize the gendered dimensions of affective labor, assume that social networks are egalitarian, and do not adequately consider the ways in which the affective labor of the private sphere has been outsourced to migrant, racialized women. their work deviates from socialist feminism, however, in a deeper attention to embodiment, subjectivity and intersubjectivity, heteronormativity, issues of cultural representation, and queer political economies. for example, poststructuralist scholars barker ( ) and lanoix ( ) respectively pay attention in their work to rethinking representations and discourses of childcare and healthcare workers and the sacredness of the heteronormative family, the essentializing of women and gender identity, as well as to the materiality of the body and the relations that such materiality produces. these additions are useful interventions in the conversation around affective labor. their attempts to think of new forms of experience and relationality that do not trap people inside a limiting domestic sphere and offer potential spaces of resistance are also shared by poststructuralist, post-marxist scholar weeks in her exploration of hochschild’s work in the sociology of emotion. because this paper attempts to specifically focus on feminist debates with the autonomists, it does not directly engage the large body of literature on the sociology of emotion that is relevant to affective labor as a concept. feminist conversations in that arena could form the bulk of an entirely separate paper; space does not allow for a full exploration here. however, the early work of hochschild is particularly germane to this paper, and so we will approach her text the managed heart: commercialization of human feeling ( ) via the useful critique of her work by weeks ( ) in “life within and against work.” as mentioned earlier in this paper, weeks offers an explanation of the feminist lineages of the concept of affective labor, which has long been understood in certain feminist traditions as fundamental both to contemporary models of exploitation and to the possibility of their subversion (p. ). she critiques the autonomists for largely ignoring these lineages. additionally, weeks argues that socialist feminist analyses hit a conceptual wall because they do not adequately register the transition from fordist to post-fordist economies (p. ). she points instead to hochschild’s analyses of postindustrial labor as a more useful feminist text for an examination of waged affective labor and the social consequences of the rise of immaterial labor. hochschild’s work suggests that the postindustrial era requires a new and sometimes harmful commodification of the laboring subject through the transmutation of private emotional work to public emotional labor. she argues that active emotional labor is both a skillful activity and practice that helps form one’s subjectivity. it is also the case for hochschild ( , p. ) that “in processing people the product is a state of mind,” which indicates the ways in which affective labor is co-opted by capitalism as a manipulative force.
the affective labor needed to sustain social relations of cooperation and civility and to strategically manage emotions for social effect is also an everyday practice that, since it is traditionally privatized and feminized, is not recognized or valued as labor (p. ). such labor becomes a kind of shadow labor in the post-fordist economy. there is also a human cost, as hochschild suggests: “[we] risk losing the signal function of feeling or the signal function of display when emotional work is transmitted to labor . . . the commercial distortion of a managed heart has a human cost” (p. ). she notes that service jobs are largely performed by women and argues that gender is both produced and productive when personality is harnessed for the workplace. hochschild acknowledges that all societies require the use and management of feeling, but argues that the use of emotional labor in capitalist economies correlates to an estranging, sexist colonization of life by work. while weeks ( ) sees hochschild’s analyses as offering important new questions, she does challenge the ways in which hochschild relies upon “both a site of unalienated labor and a model of the self prior to its alienation” to animate her critiques (p. ). weeks warns us that there is no way of identifying some “kind of spatial or ontological position of exteriority” (p. ). in other words, there is no outside; there is no heart that was never managed. she notes that hochschild herself questions whether there is such a thing as an “unmanaged heart.” weeks indicates that such claims to exteriority undermine hochschild’s argument by relying upon nostalgia and an essentialist view of the self that ends up reproducing the logic of dual spheres. nonetheless, she argues that the critique of work as a mode of subjectification must be a feminist project, and suggests that looking at the genealogies of feminist theories on affective labor offers us clarity in our current situation.
weeks notes that “once the model of separate spheres is rendered finally unsustainable, the problem is how to develop a politics in the absence of an outside in which to stand” (p. ). therefore she asks what if, instead of discussing home and work as separate spheres, we talked about life and work to critique the post-fordist organization of labor? if these two categories are indistinguishable, then life could offer an immanent critique of work. however, another political project remains: namely, to register and challenge the gendered organization of labor. the gendered hierarchies and divisions of labor within both work and life must be contested and made visible (p. ). an emphasis on subjectivity as it is produced and gendered in the workplace, with a focus on the potentially liberatory project of collectively inventing new subjects, allows for the expression of feminist political desire that does not reinscribe gender identity. the preceding review of feminist responses to marxist and autonomist formulations of affective labor identifies a series of key themes and preoccupations echoing through the literature. core concerns include the ways in which the gendered and racialized divisions of immaterial and affective labor in both marxist and autonomist texts have been treated superficially; how emotional labor is productive of subjectivity and can also damage and exhaust the laborer; and how heteronormativity is reinscribed in conceptions of care work and the family. another issue is the problem of a hierarchy of labor: how intellectual immaterial labor is valorized over emotional labor in autonomist theory. this valorization replicates the gender binary and suggests that emotional labor requires no intellectual capacities or that “mind work” does not require the management of feeling.
finally, in some feminist analyses there is a concern about the ways in which the body, and the work of the body, are dematerialized or dismissed in discussions of immaterial labor. also, the literature review demonstrates that given the increase and dominance of immaterial labor in the post-fordist economy, the examination of the specificities of the social divisions of such labor is a critical political project. scholars should examine the ways in which affective labor is both subordinated and undervalued while simultaneously offering possibilities of affective resistance. our own university environments are an ideal place to engage such a project; in fact, it is important for academics to study the university for a number of reasons, for, as gregg ( , p. ) reminds us in “working with affect in the corporate university,” our laboring practices are not exceptional. we need to better understand how our workplace mirrors others; what shared concerns and points of recognition we have with the people we study; how our fortunes are tied to the socioeconomic conditions of all workers; and how in academe we reproduce some of the same oppressions that we study in other workplaces. treating the university as an exceptional workplace allows for any number of inequities to flourish, even in environments where the study of social inequity is of primary concern. noting that there are wide ranges of employees engaged in any number of tasks and activities at large institutions like universities, in the final section of this paper i will largely focus on the work of professors and librarians. this focus emanates from logistical concerns, given the space and time restrictions of this paper, but also as a response to luxton and maroney ( ), who noted, “political economy has not paid much attention to women as members of the capitalist or business class and little to women professionals” (p. ).
i will attempt to address that omission by offering some thoughts on labor inside academe.

affective labor and the edu-factory

to begin this analysis we must first demonstrate that the labor of the university is largely immaterial in both the marxist and post-fordist definitions. the products of higher education, if it can be said to have any, are the production of new knowledge in the form of original research and the formation of human capital for the market. many have written about the neoliberal impact on the university, and the ways in which education has become marketized. the neoliberal influence has also increased social divisions of labor within the academy. feminist critics have noted that “the highly individualized capitalist-inspired entrepreneurialism that is at the heart of the new academy . . . has allowed old masculinities to remake themselves and maintain hegemonic male advantage” (grummell, devine, & lynch, , p. ). these authors point to the glorification of concrete outputs in performance measurements over emotional labor as an example of the ways in which care work is devalorized in relation to other tasks. like the forms of immaterial labor particularly prized by the autonomists, the psychological space of the university is heavily influenced by cartesian views of the development of the rational autonomous subject (p. ). the university is supposed to be an emotionally neutral space, a place for objective inquiry and the production of new knowledge; of course, the pressure to be emotionally neutral creates its own affect. gregg ( , p. ) challenges us to ask how the production cultures of knowledge work impact the work we do and the knowledge we create.
we should ask how the pressure to suppress both the emotions and the body impacts the research that scholars produce and the learning experiences they provide, as well as the ways in which labor is institutionally and interpersonally divided as a result of these affective pressures. hochschild ( ) has explained how corporations rely upon the emotional lives of employees for company benefit. the university, which must shepherd students of varying ages and backgrounds through the educational process, also relies upon the emotional lives of its workers to produce correctly calibrated human capital for the labor market, despite the invisibility of such work in yearly performance reviews and merit bonuses. of course, academic immaterial labor is stratified; while management is rewarded handsomely for managing relationships and developing partnerships, particularly with private industry, employees are only minimally rewarded for good teaching scores and high-impact publications and grants. some affective labor is more valuable than others, and it is not difficult to see a gendered and capitalist dimension to such valorizations. gregg ( ) has perhaps best characterized affective labor for faculty in the corporate university as including fear, anxiety, controlling oneself and one’s emotions, modulating subjectivity to fit workplace demands, the psychological preparation to be ready for work’s potential (for example, through constantly checking email), and the anticipatory effects of staying constantly connected and on top of new information in one’s field. gregg also points to a pervasive sense of precariousness, feelings of instability and being overloaded, and a need to learn and respond to processes of change management, all coupled with ongoing fears of being left behind. in academia one must always be psychically and somatically prepared for work that has no beginning and no end (pp. – ).
as others have noted, this environment mirrors many other post-fordist workplaces and work practices, and as such, the university offers a fertile case study of immaterial labor. it behooves us to examine the social divisions of labor inside it, and the consequences of those divisions.

laboring librarians

one such social division of labor is the stratification between faculty and librarians. as mentioned previously, librarianship as a profession has been heavily female-dominated since the early nineteenth century. working in libraries was considered fitting labor for women because they were willing to do it for low recompense, it was not physically taxing, some of it was mundane and detail-oriented, and it allowed them to expand their roles as guardians of “culture” and leverage their skills learned in the home to make the library a gracious and welcoming home-like space for patrons. these skills and roles conversely began to impact the goals and status of libraries (garrison, , p. ). notwithstanding the many important contributions of individual librarians to both their communities and cultural memory, libraries can be understood as an extension of the domestic sphere, and librarianship a form of waged domestic labor. the emotional and affective labors of librarians are well-documented in the library and information science (lis) literature, as is the common phenomenon of burnout attached to such work. however, outside of this literature, studies on the work of librarians and archivists are very scarce. academic librarians specifically are a curious employee category inside the university, straddling both academic and nonacademic work in their job descriptions. like faculty, librarians are engaged in helping educate students by offering research help, as well as instruction in information literacy and research competencies.
although the term information literacy is not well-known outside of librarianship and librarians’ work in this area is underrecognized, the skills required to find, organize, synthesize, and manipulate information are prized in the neoliberal knowledge economy, as information is the preeminent commodity form of contemporary capitalism (eisenhower & smith, , p. ). academic librarians also maintain collections and organize information. we participate in gift economies through facilitating the borrowing of books and other items, as well as through our professional engagements with the open-access and open-data movements in scholarly publishing. we curate and maintain common spaces within which faculty and students may read and study, and, finally, along with our archivist colleagues, we engage in all manner of cultural stewardship and preservation activities in our collections, both physical and digital. in short, we operate as shadow labor whose role serves to reproduce the academy (shirazi, ). the emphasis in our work, however, and how it is perceived by the public, is largely on the service side of our role rather than on the intellectual work involved in negotiating, evaluating, and manipulating scholarly information and its affects (as discussed at length in harris and chan [ ]). as in all service jobs, librarians are asked to make a pretence of emotional neutrality around the information and people they engage with and to offer service with a smile (shuler & morgan, , p. ). professional guidelines exist that describe how one should govern oneself in order to appear receptive, visible, cordial, and interested in our students’ research questions (reference and user services association, ). a growing body of literature asks libraries to consider the issue of library-user anxiety and how best to address it.
even in our pedagogical efforts to foster information literacy, as guests in professors’ classrooms, our work often involves negotiating for opportunities to engage in pedagogies that facilitate critical thinking and the evaluation of information, rather than just offering tours of library catalogs and research databases (eisenhower & smith, , pp. – ). these negotiations often involve having to educate faculty members as to the intellectual contributions that librarians can make to their course or curriculum, and to resist reacting emotionally to the dismissiveness with which our services are sometimes received. as in all service positions, librarians are required, therefore, to disguise fatigue and irritation with library patrons, and our primary affective contributions involve a willingness to help, patience, and active listening: supplements to the flow of pedagogical power (p. ). hochschild ( ) defined this sort of emotional labor as that which “requires one to induce or suppress feeling in order to sustain the outward countenance that produces the proper state in others” (p. ). this management of feeling is true for librarianship as well and comes with a personal cost to the worker. in terms of social and material conditions, while academic librarians are generally remunerated on a level comparable to, if slightly lower than, that of faculty, we suffer from lack of prestige and recognition in the workplace (harris & chan, ). cuts to library operating budgets mean that most libraries operate without enough librarians, while increases in enrollment and increased demand to offer new digital services in both collections and teaching abound. it is not an exaggeration to say that academic librarians have considered themselves to be in significant existential and institutional crisis for most of the last two decades, along with most publicly funded services.
our low status and decreasing ranks also lead to a diminishment of opportunity, where librarians are not always considered viable candidates for upper-level university service positions or principal investigator roles on grant applications. there is a ceiling for care workers in the university because we are viewed not as professionals or scholars, but as support and administrative workers. this perception remains despite our faculty status in most universities. however, the work we do is central to the production of knowledge. in archive fever, derrida ( ) argued that there is no political power without control of the archive, and that the technologies of archivization (which are largely created by librarians and archivists) produce, as well as store, the historical record (pp. , ). foucault ( ) suggests, in the archaeology of knowledge, that enunciability itself depends on the archive: what can and cannot be said is predicated on what we preserve and how we make it available (p. ). and yet librarians struggle to find the time to write and theorize intensively about the social and political dimensions of libraries, archives, and the technologies of archivization. by not publishing and presenting on these issues to other scholars (issues that we have intimate and practical engagement with), we contribute not only to our ongoing invisibilization, but also to a diminishment of academic culture and the debates pertinent to scholarly communication and knowledge production in general. we struggle to find time to research and write because our service work is considered more useful to the corporate goals of the university, and university administrators are often unsupportive of our research goals when they take our limited time and bodies away from serving library users and their various anxieties.
simultaneously, the rise of digital humanities has opened doors for librarians and programmers to be more involved in academic projects, but nonetheless such projects are generally managed and funded within traditional academic-labor hierarchies, with professors directing the work of librarians and other alt-academics whose intellectual contributions are devalued as merely service work or project management. from a poststructuralist perspective, libraries may also be considered as an extension of the domestic sphere in the sense that they are procreative spaces. liljeström and paasonen ( , pp. – ) remind us that interpretation is a question of contagious affects and dynamic encounters between readers and texts. if so, then the labor of librarians needed to structure and mediate those encounters is generative and reproductive. of course, the library as a physical building holding a body of knowledge on its shelves also requires librarians and library staff to care for its material needs. to think that the act of finding materials in the virtual or physical library is the result only of serendipity speaks again to the invisibilization of librarians’ labor. consequently, i would argue from both a poststructuralist and socialist feminist position that librarians and archivists provide a form of largely ignored reproductive and affective labor in the knowledge production of academe, and are an unrecognized production culture within the knowledge work of the university. our invisibilization relates to the very heart of feminist critiques of gendered affective and immaterial labor and the ways in which reproductive and care labor is devalorized in the post-fordist economy. more attention needs to be paid to the care work of librarians in studies that examine knowledge production because the absence of these laborers provides a curious and gendered gap in existing scholarship, a gap that will only increase as our libraries become increasingly virtual and immaterial and will have long-term consequences for the historical record and the production of knowledge inside academe. further studies also need to be undertaken that examine the ways in which immaterial labor in librarianship is stratified along gender lines, as i have suggested earlier, in relation to digital librarianship’s preferential status over reference and liaison librarianship. nonetheless, at the same time, we need to be careful that we do not reproduce a dual-spheres binary analysis that essentializes gender and limits librarians to a false and ahistorical notion of “traditional librarianship” in our attempts to recognize what is socially, academically, and politically useful about the more affective dimensions of our work.

conclusion

this paper has attempted to trace feminist engagements with affective labor, variously defined as reproductive labor, care work, and part of immaterial labor, from the domestic labor debates of the s to present-day interventions in conversations regarding the post-fordist political economy. primary challenges to dominant texts include the ways in which the gendered divisions of labor in both marxist and autonomist texts have been dismissed, how heteronormativity is reinscribed in current debates about the family and the public sphere, and how global migrant labor is ignored in the lower valued affective labor of care work.
immaterial labor, which is considered problem-solving, strategic mind work, is more highly valued than emotional labor, replicating a traditional gender binary and suggesting that emotional labor requires no intellectual capacities or that “mind work” does not carry its own freight in emotional labor. the materiality of care work is foregrounded in many feminist analyses, and disputes about the importance of the body and embodiment abound. the human cost of the instrumentalization of affect and the production of subjectivity has also been explored in much of this work. in terms of the immaterial labor of universities, some material has been written about the affective labor of faculty in mainstream scholarly texts, but the care work of less privileged members of the university has been ignored outside the literature of lis studies, essentially replicating the invisibilization of care workers outlined in feminist theory and offering us a good example of how even debates aware of the denigration of affective labor can replicate the very divisions they wish to disrupt. any such gap is also problematic because as hardt ( , p. ) reminds us, “the production of affects, subjectivities, and forms of life present an enormous potential for autonomous circuits of valorization, and perhaps for liberation.” feminist theorists ask us to consider new potentialities for the resistance of life to power. further work on the feminist concept of affective labor needs to be done that takes up the larger body of work on affect in feminist thought. given the ongoing and increasing machinization of immaterial modes of production, attention to the affective nature and labor of technology in life and work and to the ways in which this machinization also impacts human subjectivity and gender seems a fruitful new line of inquiry for feminist thinkers concerned with labor issues.
if we take up a call to arms to think about life and work and the subjects we wish to become, how might new technologies enhance, augment, or limit our feminist political desires for subjectivities free from domination? within the context of the academic library, how does the disruption of the digital library allow us to rethink and revalorize the subjectivity of the librarian? indeed, the feminized figure of the academic librarian might be an ideal object of study for such work, given the ways in which technology has completely transformed the workplace and culture of the library. regardless, feminist theoretical reflections on immaterial and affective labor offer an important correction to the techno-determinism and optimism of the autonomists, reminding us that liberation must be equally available to all members of the multitude. as for the potential biopolitical resistance or subversion afforded by affective labor as suggested by the autonomists, i would like to end with some final, cautious thoughts on what academic librarians might do to disrupt oppressive divisions of labor. while i am aware that there are vast limitations on what we can accomplish, particularly while working on our own and not as part of broader social coalitions with other related professions, i regard our most effective forms of resistance as having two prongs: we must have our affective labor recognized, and we must recognize our affective labor. in other words, we need to write and talk our way into legibility through publishing more often outside of our own journals, and we need to develop some sort of internal professional metrics that reward, or at least acknowledge, affective labor. we recognize movers, shakers, pushers, shovers, leaders, and change agents, but how do we acknowledge emotional labor and care work?
we need to speak at more interdisciplinary tables and to write precisely about our labor issues, as well as about the politics of knowledge organization and how our work impacts the production culture of the academy. our writing must place our work within a broader theoretical and sociopolitical context. we need to be visible, we need to speak the language of social and political theory, and we need to be heard. the recent interest in critical librarianship is very encouraging on this front, as evidenced by recent editorial shifts in some library journals, new conferences, and new presses, but we need to be less insular as a group. and we need to find a way to recognize the caring and affective dimensions of our work at precisely the same time as we become immaterialized in a digital world. we also need to recognize the other kinds of affective laborers on our campuses (library technicians, the secretarial staff, the people who cook and serve food, and the cleaners) and fully think through the ways in which those positions are underprivileged, underwaged, and disproportionately staffed by women of color. relatedly, academic reference librarians must engage the concepts of critical information literacy and social justice in our teaching as key mechanisms for resisting market logic in education. we must continue to build broad and subversive collections and resist censorship and fight for intellectual freedom and freedom of expression. i would contend that by fostering spaces for dissent, civic engagement, nonneutrality, and even nonefficiency in our libraries and classrooms, we offer disruptions in the affective flow of the corporate university. similar contributions can be made in the areas of scholarly communication and digital scholarship, calling attention to the ways in which authority is constructed and valued, and exposing the gears of knowledge production.
in offering a feminist interpretation of the human cost of the undervalued immaterial labor of librarianship and developing an awareness of the many hurdles in our path, i am nonetheless comforted by a new awareness of the ways in which our labor undergirds and is generative of academic subjectivity. we must consider such production more carefully and in more detail, and consider the kinds of new subjects we both wish to produce and become.
acknowledgments
i would like to thank my colleague karen nicholson, as well as the guest editors, editor, and copyeditors of library trends for their feedback on this paper. i would also like to thank professor meg luxton for her feedback and encouragement, without which i would neither have written nor published this work. as always, my deepest gratitude goes to philip kiff for carrying far more than his usual load of our shared affective and domestic labor during the writing of this piece.
notes
1. there has been some critique of garrison ( ) from those who regard women in the early public library world as in fact pioneering professionally (hildenbrand, ; passet, ). this was the case in terms of women being the first to carry out significant survey work (mcdowell, ) and their engagement with early-twentieth-century child psychology (van slyck, , pp. – ). these revisions are important corrections and clearly provide evidence that some women librarians were able to step outside of gender restrictions in libraries and demonstrate expertise, professionalism, and even radicalism. however, such exemplary work was and still is largely unrecognized in dominant narratives and stereotypes about librarians.
2. my own place of work, york university in toronto, broke ground in canada when our union won a historic employment-equity concession from the employer in .
the employer acknowledged that librarians were eligible for an equity settlement, given that the majority of them were women and performed academic activities and were required to have graduate degrees to do their work, and yet were significantly less compensated than academic employees in male-dominated faculties with comparable degree qualifications and expectations. it is worth noting that male librarians benefited from this settlement as well, underscoring the point that feminist organizing can be good for all.
3. see, for instance, the work of luxton and maroney ( ) and mcdowell and dyson ( ).
4. librarian stereotypes are well-documented in library literature; see, for instance, the work of pagowsky and defrain ( ).
5. other scholars have similarly argued that the literature on the decline of civic engagement ignores the care work of women both inside and out of the domestic by dismissing it as selfishly motivated rather than altruistic citizenship behavior (herd & meyer, , p. ).
6. the bibliography on the “living in interesting times” website provides an excellent collection of work on the neoliberalization of higher education; see “corporatization,” in “living in interesting times” ( ).
7. see, for instance, the important work by nicholson ( ) on the mcdonaldization of academic libraries.
8. see accardi ( ); caputo ( ); eisenhower and smith ( ); guy, newman, and mastracci ( ); julien and genuis ( ); matteson and miller ( ); mills and lodge ( ); sheesley ( ); sheih ( ); and shuler and morgan ( ).
9. see, for instance, the work of nicol ( ) and mellon ( ).
references
accardi, m. t. ( , october ). the souls of our students, the souls of ourselves: resisting burnout through radical self-care. paper presented at the pala annual conference, state college, pennsylvania. retrieved from http://www.scribd.com/doc/ /accardi-pala-crd -keynote-final-
barker, d. k. ( ). querying the paradox of caring labor.
rethinking marxism, ( ), – .
brown, l. j. ( ). trending now-reference librarians: how reference librarians work to prevent library anxiety. journal of library administration, ( ), – .
caputo, j. s. ( ). stress and burnout in library service. phoenix: oryx.
coker, c., van duinkerken, w., & bales, s. ( ). seeking full citizenship: a defense of tenure faculty status for librarians. college & research libraries, ( ), – .
derrida, j. ( ). archive fever (e. prenowitz, trans.). chicago: university of chicago press.
eisenhower, c., & smith, d. ( ). the library as “stuck place”: critical pedagogy in the corporate university. in m. t. accardi, e. drabinski, & a. kumbier (eds.), critical library instruction: theories and methods (pp. – ). duluth: library juice press.
eklof, a. ( ). understanding information anxiety and how academic librarians can minimize its effects. public services quarterly, ( ), – .
fortunati, l. ( ). immaterial labor and its machinization. ephemera: theory & politics in organization, ( ), – .
foucault, m. ( ). the archaeology of knowledge (r. swyer, trans.). new york: pantheon.
garrison, d. ( ). the tender technicians: the feminization of public librarianship, – . journal of social history, ( ), – .
gregg, m. ( ). working with affect in the corporate university. in m. liljeström & s. paasonen (eds.), working with affect in feminist readings: disturbing differences (pp. – ). abingdon, uk: routledge.
grummell, b., devine, d., & lynch, k. ( ). the care-less manager: gender, care and new managerialism in higher education. gender and education, ( ), – .
guy, m. e., newman, m. a., & mastracci, s. h. ( ). emotional labor: putting the service in public service. armonk, ny: m. e. sharpe.
hardt, m. ( ). affective labor. boundary , ( ), – .
hardt, m., & negri, a. ( ). empire. cambridge, ma: harvard university press.
harris, r. m., & chan, c. s. ( ).
cataloging and reference, circulation and shelving: public library users and university students’ perceptions of librarianship. library and information science research, ( ), – .
herd, p., & meyer, m. h. ( ). care work: invisible civic engagement. gender & society, ( ), – .
hildenbrand, s. ( ). some theoretical considerations on women in library history. journal of library history, ( ), – .
hochschild, a. r. ( ). the managed heart: commercialization of human feeling. berkeley: university of california press.
julien, h., & genuis, s. k. ( ). emotional labour in librarians’ instructional work. journal of documentation, ( ), – .
kniffel, l. ( , may ). cuts, freezes widespread in academic libraries. american libraries. retrieved from http://americanlibrariesmagazine.org/ / / /cuts-freezes-widespread-in-academic-libraries
lanoix, m. ( ). labor as embodied practice: the lessons of care work. hypatia: a journal of feminist philosophy, ( ), – .
liljeström, m., & paasonen, s. ( ). introduction: feeling differences—affect and feminist reading. in m. liljeström & s. paasonen (eds.), working with affect in feminist readings: disturbing differences (pp. – ). abingdon, uk: routledge.
living in interesting times. ( ). corporatization. retrieved from http://livingininterestingtimes.wordpress.com/resources/corporatization
luxton, m., & maroney, h. ( ). gender at work: canadian feminist political economy since . in w. clement (ed.), understanding canada: building on the new canadian political economy (pp. – ). montréal, qc: mcgill-queen’s university press.
mann, s. a. ( ). doing feminist theory: from modernity to postmodernity. oxford: oxford university press.
matteson, m. l., & miller, s. s. ( ). a study of emotional labor in librarianship. library & information science research, ( ), – .
mcdowell, k. ( ). surveying the field: the research model of women in librarianship, – . library quarterly, ( ), – .
mcdowell, l., & dyson, j. ( ).
the other side of the knowledge economy: “reproductive” employment and affective labours in oxford. environment and planning a, ( ), – .
mellon, c. a. ( ). library anxiety: a grounded theory and its development. college & research libraries, ( ), – .
mills, j., & lodge, d. ( ). affect, emotional intelligence and librarian–user interaction. library review, ( ), – .
nicholson, k. p. ( ). the mcdonaldization of academic libraries and the values of transformational change. college & research libraries, ( ), – .
nicol, e. c. ( ). alleviating library anxiety. public services quarterly, ( ), – .
pagowsky, n., & defrain, e. ( , june ). ice ice baby: are librarian stereotypes freezing us out of instruction? in the library with the lead pipe. retrieved from http://www.inthelibrarywiththeleadpipe.org/ /ice-ice-baby-
passet, j. e. ( ). cultural crusaders: women librarians in the american west, – . albuquerque: university of new mexico press.
reference and user services association (rusa). ( ). guidelines for behavioral performance of reference and information service providers. retrieved from http://www.ala.org/rusa/resources/guidelines/guidelinesbehavioral
schultz, s. ( ). dissolved boundaries and “affective labor”: on the disappearance of reproductive labor and feminist critique in empire. capitalism, nature, socialism, ( ), – .
sheesley, d. f. ( ). burnout and the academic teaching librarian: an examination of the problem and suggested solutions. journal of academic librarianship, ( ), – .
sheih, c. s. m. ( ). a survey of circulation librarians’ emotional labor and emotional exhaustion: the case of difficult patron service in university libraries. journal of educational media & library sciences, ( ), – .
shirazi, r. ( , july ). reproducing the academy: librarians and the question of service in the digital humanities.
retrieved from http://roxanneshirazi.com/ / / /reproducing-the-academy-librarians-and-the-question-of-service-in-the-digital-humanities
shuler, s., & morgan, n. ( ). emotional labor in the academic library: when being friendly feels like work. reference librarian, ( ), – .
tennant, r. ( , august ). digital libraries: the gender gap. library journal. retrieved from http://lj.libraryjournal.com/ / /ljarchives/digital-libraries-the-gender-gap
van slyck, a. ( ). free to all: carnegie libraries and american culture, – . chicago: university of chicago press.
weeks, k. ( ). life within and against work: affective labor, feminist critique, and post-fordist politics. ephemera: theory & politics in organization, ( ), – .
lisa sloniowski is the english literature librarian at york university in toronto. she is currently pursuing a phd in social and political thought, with a focus on the politics of memory and archivization as exposed by unruly feminist collections.
the importance of place and openness in spatial humanities research
porter, c. ( ). the importance of place and openness in spatial humanities research. international journal of humanities and arts computing, ( ), - . https://doi.org/ . /ijhac. .
published in: international journal of humanities and arts computing
document version: peer reviewed version
publisher rights: copyright edinburgh university press. this work is made available online in accordance with the publisher’s policies.
introduction: the importance of place and openness in spatial humanities research
catherine porter
abstract
digital humanities (dh) is a dynamic and developing field. in recent years, its evolution has been witnessed foremost in the growth of funded dh projects and through the willingness of scholars from diverse backgrounds to not only work in dh research, but also as ‘digital humanists’. one crucial component of dh research is that of spatial enquiry, the expansion of which has rapidly evolved from a small component often found buried in research objectives, to the research aim of a growing number of projects. spatial humanities, while still a relatively new interdisciplinary field, is exhibiting continued advancement and focus from the academic community; however, working with digital data is rarely a straightforward pursuit, even for the most accomplished scholar. primarily, access to appropriate and reliable (spatial) datasets, the keystone of spatial humanities research, the sharing and openness of spatial methods, tools and data (smtd), and education in the former, all remain a challenge. witnessing the continued rise of spatial humanities research, this special issue brings together a selection of articles delivered at spatial humanities , a conference held at lancaster university (uk).
the aim of this multi-disciplinary conference was to explore and demonstrate the contribution to knowledge that spatial technologies in humanities research may enable within and beyond the digital humanities. here, this introductory text and associated articles present key research that embodies the growing relevance of the spatial humanities across a plethora of fields, and demonstrates several of the prevailing and enduring struggles when working in digital and spatial research. these articles emphasise that, despite common obstacles, spatial humanists make up an imaginative and thriving community keen to share innovation and knowledge and provide stimulating new insights through research.
keywords: digital humanities, spatial humanities, smtd, methods, tools, data, gis
1. introduction
this special issue is composed of articles delivered at the conference spatial humanities . the conference attracted close to one hundred delegates, each keen to share and discuss their research, and each advertising how fundamental the ‘spatial’ is to their work. delegates, varying from early career researchers through to senior members of the academy, were brought together in a friendly and relaxed atmosphere which was highly conducive to discussion and debate on all aspects of spatial research. [footnote: the success of the spatial humanities has led to the organisation of a second conference, spatial humanities .] session foci varied from ‘critical perspectives’, ‘gis and text’, ‘literary gis’ and ‘digital landscapes of the past’, to the ‘urban’, the ‘countryside’ and ‘ d applications’, but all with a common thread – to share and exhibit spatial humanities research to the wider community. the spatial humanities has made extensive progress in recent years, rapidly evolving from a small component often found buried in research objectives, to the research aim of a growing number of projects.
nonetheless, we must not lose sight of the struggles and associated limitations when working in a digital environment, especially with historic sources, spatial datasets and tools, and associated education. the ambition of this introductory text is to highlight our progress and also to provide a brief reminder of the key difficulties that we, as spatial humanists, frequently encounter, and to make us pause to consider how we might engage with these obstacles and evolve solutions.
2. our shifting place as spatial humanists
to better understand our interest in, and growing contribution to, spatial research, we must first refer to the role that space and place have played in our history. the past brims with examples of natural inquisitiveness in our surroundings and landscape, our sense, awareness and attachment to place, and our urgency to define territory by “marking and making place” (david and wilson, ). historically, we have had a compulsion to mark out our place in the world, and so, to understand the spatial was as inherent to our ancestors’ lives as it is to ours. evidence of the earliest forms of human spatial awareness is demonstrated through early chorography, the visualisation of which we have borrowed and developed over time through progressive spatially motivated application, writing and research. crucially, today, the analysis of space and place plays a significant role in developing and broadening our understanding of history. the advent of computer technology in the twentieth century saw the introduction of a new kind of spatiality - the digital. scientists could now ‘digitise’, analyse and visualise datasets using fresh approaches, but only those niche humanists with statistical or computer science backgrounds (and largely quantitative interests) were coaxed to get involved.
gregory and geddes ( : ) describe the take-up of geographic information systems (gis) by geographers in the s as “controversial”, with historical gis (hgis) not becoming a focus of research until the turn of the century. by , qualitative components of history and geography were readily being explored through digital technologies with “studies that developed the historiography by answering applied research questions” (gregory and geddes, : ).
[footnote: in pre-history, examples include the pavlov map in the czech republic ( bc), the ancient polynesian maps of the pacific ocean and cave maps of the stars such as those in lascaux, france ( , bc – , bc). in the ancient and medieval world, we saw maps with religion at their core (in the form of t and o maps and mappa mundi) and ptolemy’s geographia. the early modern saw the knowledge of place coveted and used in conquest, strategy and power, and in the nineteenth century a growing consideration of the importance of place and space was actively demonstrated through demographics as governments collected, collated, recorded and mapped key information on population.]
the last ten years have seen a marked expansion in the use of digital and spatial technologies in humanities research, largely due to the growing availability of, and accessibility to, giscience (and data visualisation software), and the development of geospatial datasets, each making the analysis of the spatial more achievable. thus, from a not too distant past, when advocates of the digital had the challenging task of persuading more ‘traditional’ methods-based academics to recognise the merits of spatial technologies, we now welcome to the fold (often self-trained) historians, geographers, literary scholars, linguists, anthropologists and archaeologists who use progressive digital and spatial mechanisms to focus and enhance their research.
in a time arguably described as a “crisis” for humanities research (waltzer, ; jay, ; kirsch, ; berube and ruth, ), spatial humanities is a burgeoning field that still “promises to revitalise and redefine scholarship by (re-) introducing geographic space to the humanities” (bodenhamer et al., : ). the field has yet to be formally defined, a difficult prospect considering that, as with all things digital, definitions shift and modify with each technological advancement. considerable change can also be noted in the growing number of scholars now working under the often-interchangeable titles of the digital and spatial humanities. these, often self-labelled, ‘digital humanists’, have diverse backgrounds and interests: it is no longer unusual to find a social historian with a postgraduate degree in computer science, an anthropologist who is an expert in spatial analysis, or an interdisciplinary project where these spatial humanists are working together towards a common goal. there is an undoubted allure to the spatial humanist tribe – it provides a well-formed, yet ever-evolving, diverse and open community in which to belong. it is a community that puts ‘space’ first, expediting choice and opportunity to work freely across disciplines with common goals, and hence, providing the spatially minded with an enhanced sense of place in the academy.
3. opportunities, challenges and solutions
however, key challenges remain. aside from the difficulties in securing funding and knowledgeable staff, designing and executing digital research can have numerous complications, none more so than the creation and application of spatial methods, tools and data (henceforth referred to as smtd).
in recent years there has been a surge in new and innovative smtd, as more data and computer scientists pursue spatial analysis, but often information on these is presented on websites such as github - extremely useful to those of us with programming and software development backgrounds, but not so favourable for those without - or is hidden behind publications with little explanation of how one might access or use them. the sourcing of reliable and trustworthy data is particularly important as the lack of available datasets for certain times and spaces complicates this fundamental ingredient of digital research.
[footnote: according to lee and kang ( ), geospatial ‘big data’ is growing at a rate of % per annum. here, we also must consider how to tackle this increasing volume of data as illustrated by songnian et al. ( ).]
[footnote: it should be noted that not all of these more ‘traditional’ scholars are open to digital methodologies and so there is still some way to go in terms of persuasion. see dunn ( ) for further discussion and examples of this.]
[footnote: bodenhamer et al. (various) has made the most comprehensive definition.]
[footnote: the authors of the articles in this special issue portray this diversity well.]
[footnote: the times higher education (uk) published an overall figure of per cent success rate in grant proposals for / financial year. see: https://www.timeshighereducation.com/news/uk-research-grant-success-rates-rise-first-time-five-years]
[footnote: github can be used to access open source coding and tools for many forms of analysis. https://github.com/ there are also several studies referring to the quality of coding available on github – see dabbish et al. ( ) and kalliamvakou et al. ( ).]
thankfully, and responsively, librarians and curators are playing a crucial role in combating this deficiency. many large libraries, museums and repositories are now digitising collections and making them available as a crucial resource for researchers. however, this supply of robust data, in turn, introduces the danger of data availability shaping research. those of us keen to push the boundaries of our research areas and deliver unique insight into key historical and geographical thought must therefore create datasets, an often time-consuming and arduous process involving multiple stages. sometimes this involves using long-established methods in which many of us were originally trained: trawling through original source material extant in archives and libraries (much of which was not ‘born digital’), actively collecting data in the field, or for a fortunate few, the acquisition of existing (trusted) datasets. once data are sourced, digitisation is often required. this might comprise transcription and scanning, but is often a manual, or at best, a semi-automated process. apart from the obvious costs involved in time and money, digitisation processes can also lead to levels of inaccuracy in the final datasets (for instance, optical character recognition (ocr) for textual data) and therefore two-tier checking systems (think punch card dual-checking for mainframes!) and post-correction are necessary to carefully assess the newly created datasets before they might be considered a trusted, ‘gold standard’ output (clematide et al., ). for any research with a spatial focus, following the digitisation of data, a spatial component must be created. this involves firstly, finding and applying a method for extracting ‘place’ information from the collected data, and secondly, linking these data to the most crucial form of data in spatial research (and a common thread highlighted in the articles of this issue), gazetteers.
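the two steps just described (finding place references in a text, then linking them to a gazetteer) can be sketched in a few lines of python. this is only a minimal illustration under stated assumptions, not the method of any project in this issue: the gazetteer entries, identifiers, name variants and approximate coordinates are invented for the example, and matching is a plain single-word lookup that ignores multi-word names, ambiguity and change over time.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GazetteerEntry:
    place_id: str              # hypothetical identifier, not a real gazetteer key
    preferred_name: str
    variants: tuple[str, ...]  # historical or alternative spellings
    lat: float                 # approximate, for illustration only
    lon: float


# Illustrative entries only; a real project would load a curated gazetteer.
GAZETTEER = [
    GazetteerEntry("p1", "Lancaster", ("lancaster", "loncastre"), 54.05, -2.80),
    GazetteerEntry("p2", "Ghent", ("ghent", "gent", "gand"), 51.05, 3.72),
]


def build_index(entries):
    """Map every lower-cased name variant to its gazetteer entry."""
    index = {}
    for entry in entries:
        for variant in entry.variants:
            index[variant.lower()] = entry
    return index


def link_places(text, index):
    """Return (matched word, character offset, entry) for each word found in the index."""
    hits = []
    for match in re.finditer(r"[^\W\d_]+", text.lower()):
        entry = index.get(match.group())
        if entry is not None:
            hits.append((match.group(), match.start(), entry))
    return hits


if __name__ == "__main__":
    index = build_index(GAZETTEER)
    for word, offset, entry in link_places("the road from Gand to Loncastre", index):
        print(word, offset, entry.preferred_name, entry.lat, entry.lon)
```

even this toy version shows why variant spellings matter: a source that writes ‘gand’ or ‘loncastre’ can only be mapped if the gazetteer records those forms.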
however, access to reliable and appropriate gazetteers remains problematic. not only are spatial humanists often working with varied languages and temporal and spatial extents, they must also contend with change over time in country and administrative boundaries, placename revision and standardisation of the same. some notable projects tackling this are the world-historical gazetteer project (whg), a vision of britain through time (and the related great britain historical gis (gbhgis)), the map of early modern london, and pelagios commons. each project, in its own way, promises to deliver opensource gazetteer-based resources to all. despite the on-going success of such projects and the openness that they encourage, many projects necessarily require more bespoke datasets. unfortunately, this process, too, is littered with hurdles, and without reliable and robust gazetteers employed we may not be able to commence our analysis or trust in our results.
[footnote: for example, the british library has undergone extensive digitisation programmes such as that of the newspaper collections and now employs expert digital librarians, archivists and curators. see king ( ) for an early discussion of this. they also have a focus on digital scholarship; see: https://www.bl.uk/subjects/digital-scholarship]
[footnote: see schiuma & carlucci, ]
[footnote: clematide et al. use crowdsourcing to post-correct ocr for a german and french heritage corpus.]
[footnote: the world-historical gazetteer project is based at the university of pittsburgh http://whgazetteer.org/ ; a vision of britain through time is at the university of portsmouth http://www.visionofbritain.org.uk/ ; the map of early modern london https://mapoflondon.uvic.ca/gazetteer_about.htm ; pelagios commons is funded by the andrew w. mellon foundation http://commons.pelagios.org/]
the creation of a new smtd for digital and spatial research is futile if we do not have the skills to analyse and interpret these digital data. a common thread that developed during spatial humanities was an admission by several attendees to their lack of formal education in the digital and their apprehension in presenting their work to those they perceived as qualified. there are key concerns for those with no explicit education in the ‘digital’ (or perhaps little contact with the digital and/or spatial humanities communities). firstly, the danger of doing research the ‘long way’ when an arduous process (of say, transcription or digitisation) may be at least semi-automated using an existing tool, or indeed, duplicating already existing and established smtd. secondly, there is a real concern in the visualisation process and (mis)interpretation of results: as any spatial humanist will tell you, mapping data can produce misleading outputs depending on how data are, intentionally or otherwise, collated and/or classified. lastly, and perhaps worst of all, some will not have the confidence to try at all! as spatial humanists, we have the responsibility to encourage the osmosis of our smtd across established scholarly boundaries and provide a bridge for others. equally importantly, we must as a community strive towards inclusion by continuing to work towards a platform for those at the beginning of their spatial research journey whether they be early career researchers or more established scholars. conferences such as spatial humanities and the adho digital humanities series of events are a beginning of this process. however, this form of sharing is insufficient in itself. there is also a considerable need to offer more accessible training for individuals who are keen to explore the spatial in their research. summer schools based at universities and colleges are key to this digital education.
[footnote: such as the digital humanities summer institute at the university of victoria; the summer school in gis for the digital humanities at lancaster university; the university of oxford digital humanities at oxford summer school; sussex humanities lab; digital humanities@manchester’s dh week; and the summer a.w. mellon workshop in digital humanities at carnegie mellon university, to name a few.]
these courses and workshops (which are usually free to attend except for travel and accommodation although student bursaries are sometimes offered) are run by leaders in the field and provide an opportunity for first-hand engagement, practice and discussion. chiefly, they introduce researchers, of all levels, to the basics of spatial research and provide attendees with a starting point, a first opportunity to use and discuss their own data. however, while there are clear barriers to this form of education, for example geographical location and finance (which can be particularly difficult for early career researchers), a greater issue is that of personal anxiety and uncertainty about whether we can do ‘how to’ digital courses. one option that solves geographical and financial constraints is to offer more online training in the ever-popular mooc (massive open online courses) style courses that some universities, colleges and companies such as esri now successfully employ, and to provide more access to training websites such as the programming historian, where opensource ‘how to’ tutorials can be followed. this can only be achieved by leaders in the field working together to offer this education. self-doubt and related anxieties might then be treated by self-paced learning and forums that offer direct assistance from course leaders and peers. regardless of educative delivery formats, the crux of the issues mentioned here is that of openness (graham, ).
openness of smtd and associated education is key to the accessibility and transparency within the spatial humanities community and essential for future growth. although many funded projects now publish (or are required to publish) the smtd and associated code alongside the research, there is a genuine need for increased clarity so that we might each learn the processes involved and have the confidence and ability to apply these to our own research. refreshingly, some of the authors in this issue explicitly mention sharing as a core aspect of their research, serving to project the welcoming and receptive attitude of the spatial community and cement the importance of conferences such as the spatial humanities series in moving this positive attitude to openness forward. to do so lies at the heart of what it means to be a spatial humanist – sharing knowledge, educating, being open to new ideas, and carrying out innovative and successful interdisciplinary research toward common goals.
4. outline of articles in this special issue
leading on from the previous short discussion, the chosen articles that make up this special issue touch on each of the previous points and epitomise the breadth and diversity of contemporary spatial humanities research. they embody the variation in research interests, with a temporal scale ranging from the early modern through to the twenty-first century and projects focusing on europe, the americas and beyond. some are building ‘deep maps’ (bodenhamer et al., ), historical gis (hgis) and other spatial tools, others are applying existing techniques and methods but pushing known boundaries. overall, the articles have three common goals: (i) to carry out robust historical research using what is frequently unstructured and undigitised historical source material; (ii) to use the spatial as a core component of enquiry; and (iii) to develop and/or use digital tools, methods and datasets to answer key research questions.
within this, the articles fall into two central categories. the first are those that are actively developing and building historical geographic information systems (hgis) as a core component of their research. the second are those that are using methods of spatial enquiry to investigate textual data, be it historic, modern and/or literary in nature. notably, each article introduces new and impressive smtds and showcases the ability to place all kinds of resources into a workable framework for analysis, core to the character of spatial humanities research (terpstra & rose, ). beginning with the first category, hgis is still a growing methodology in historical and geographical research, although now frequently referred to under the dh umbrella. devos et al. present the innovative stream-project (ghent university and vrije universiteit brussel), which has developed a much-awaited spatio-temporal infrastructure for early modern data, specifically an early modern brabant and flanders historical geographic information system (hgis). the creation of the system and its vast dataset serves to highlight the earlier discussion on access to historic datasets for research, something which this project tackles with strength, not least because of the varied datasets it includes: the stream model contains local level data on early modern society including territory, transport, demography, agriculture, industry and trade, data which the team has valiantly sourced and in some cases digitised as part of the on-going project. the project therefore promises not only a new hgis of early modern data but methodological innovation that will assist with the longevity of stream and will extend interest beyond the project’s spatial and temporal context.

note: tutorials on the programming historian (https://programminghistorian.org/) are created by a mix of archivists, librarians and academics. they vary in the level of difficulty.
schindling and harris (west virginia) continue the issue’s focus on methodology and hgis but move towards the integration of textual sources that link source materials related to people, places and events. in common with other papers here, there is a very real focus on the use of historical source material, but the ingenuity and uniqueness lie in the method and the developing technology, which includes a mobile application for collecting and adding data in the field and the ability to query and edit the dataset concurrently and in real time, a model that will no doubt be profoundly useful to many archival researchers. stangl (university of graz, austria) transports us to latin america at the end of the colonial period (eighteenth and early nineteenth century) with a historical and web gis. here, the author uses historical data to reconstruct colonial rule as seen through populated places and various entities such as political and ecclesiastical divisions, territories and administrative boundaries. the need for such a system again reflects common complications mentioned by other papers in this issue and, more generally, the availability of comprehensive historical datasets for research. wisely, the author uses a grading system for the source material, prioritising primary information gathered from the period under research rather than source material reflecting on the time under study. taylor et al. (lancaster university) herald the second theme in this special issue, textual analysis. they tackle the methods of ‘reading’ text through the analysis of a textual corpus containing eighty key pieces of literature related to the english lake district (dating from the seventeenth through to early twentieth century).
uniquely, this textual exploration focuses on historical soundscape and in doing so provides a detailed explanation of how the more traditional and computationally driven methods of textual analysis might be combined in the form of geographical text analysis (gta). the authors clearly describe the complex weave of processes and techniques necessary to digitise literary text effectively and how we can analyse these data through corpus linguistics. the article highlights the difficulties in geoparsing datasets and points to the lack of complete historical gazetteers. notably, the spatial analysis component of the article combines textual data with various forms of statistical analysis and quantitative datasets (such as digital terrain models), and the final visualisation of the data in cartographic form shows how the techniques devised by this team provide a new voice for not only soundscape studies but also the digital and spatial humanities. the authors clearly highlight the importance of incorporating close reading with what are largely automated, digital and distant reading procedures: to ‘read’ a text effectively the researcher must do so at varying distances and scales. staying with the trend of mapping place in literature, lopez-sandez (university of santiago de compostela) writes of another interdisciplinary project made up of literary scholars, geographers and cartographers who have created databases from two sets of written corpora, one historic and one modern, to compare historic perception of place in compostela with the twenty-first century city experience. the author justifies the use of manual reading, transcription and extraction of placenames in the text, citing the nuance that they believe is more difficult to achieve through automated processes. this differs from the combined methods in the paper by taylor et al. but also highlights that one method is not always applicable to every project.
the author does, however, highlight that to go further (say, analysing emotion in the corpora) they must employ automated processes such as collocation using existing software packages. lopez-sandez points to differences between the social use of space as recorded in the historic corpora and the modern student texts, for example the lack of interest current students of the university have in pilgrimage, religion and the camino, and the differing global imaginary in the historic and modern corpora. the final article shows that through spatial technologies we can now investigate historic texts on a scale not previously possible. porter et al. (queen’s university belfast, the university of liverpool and lancaster university) apply geographical text analysis (gta) to investigate space and time across millions of words extracted from one british nineteenth-century newspaper. in this method-based paper, the authors push the known boundaries of gta to test the techniques and shed light on how a newspaper discussed health in the population of britain. the methods are key to this paper in that they show strong promise for porting to other topics and literary genres as well as illustrating the sheer size of texts that can now be analysed using semi-automated processes. the articles show the intrinsic breadth and interdisciplinarity of the spatial humanities. here, we have geographers, historians, linguists, literary historians, archaeologists and computer scientists working together towards a common goal and in doing so exposing the determination of the contributors to conceive of and apply solutions to key obstacles inherent in digital research.

conclusions

this corpus of spatially focused articles reaffirms that spatial humanities research is a flourishing field with interdisciplinary and collaborative research at its core.
each article showcases our continued fascination with place and space and through innovative smtd heralds a promising future for spatial research. equally, we are witness to the ongoing complexity of spatial research. this issue introduces several new and complex smtds, each built with a specific purpose in mind, but with the prospect of making substantial contributions not only to the authors’ research fields but to the broader spatial community. the articles also highlight that working in the realm of the spatial humanities is not a solely digital pursuit. scholars must be multi-talented and interdisciplinary in their approach to research. they should not only be willing (and able) to search archives, understand census returns, grasp translation and grapple with transcription, but also have the necessary digital skills to record and manipulate this information (or the inclination to learn), so it might make a meaningful contribution to knowledge. as we move forward, in reading the articles we should consider three key elements of spatial humanities research: (i) how the encouragement of openness, sharing and accessibility of smtds would benefit the spatial humanities community; (ii) the need for more accessible and tailored education to enable new and existing scholars to further their spatial research and actively contribute to knowledge; and (iii) how you, as an individual or as part of a larger team, can help facilitate this. ultimately, these articles are symbols of the continued success of digital and spatial research in asking and answering key questions that add to narratives of time and space. the variation in research interest and background, the tenacity and the innovation of spatial humanists today are clear and continue to progress.

references

• berube, m. & ruth, j. ( ) the humanities, higher education, and academic freedom: three necessary arguments. houndmills, basingstoke, hampshire: palgrave macmillan.
• bodenhamer, d.j., corrigan, j. & harris, t.m. ( ) deep maps and spatial narratives. bloomington & indianapolis: indiana university press.
• bodenhamer, d.j., corrigan, j. & harris, t.m. ( ) the spatial humanities: gis and the future of humanities scholarship. bloomington & indianapolis: indiana university press.
• british library digital scholarship https://www.bl.uk/subjects/digital-scholarship [accessed / / ].
• clematide, s., furrer, l. & volk, m. ( ) crowdsourcing an ocr gold standard for a german and french heritage corpus. in proceedings of the tenth international conference on language resources and evaluation (lrec ), portoroz, slovenia, may – may : - .
• dabbish, l., stuart, c., tsay, j. & herbsleb, j. ( ) social coding in github: transparency and collaboration in an open software repository. in proceedings of the acm conference on computer supported cooperative work: - .
• david, b. & wilson, m. ( ) inscribed landscapes: marking and making place. honolulu: university of hawaii press.
• dunn, s. ( ) praxes of “the human” and “the digital”: spatial humanities and the digitization of place, geohumanities, vol. ( ): - .
• graham, l. ( ) applied media studies and digital humanities: technology, textuality, methodology. in ostherr, k. (ed.) applied media studies: theory and practice. new york: routledge.
• gregory, i.n. & geddes, a. ( ) toward spatial humanities: historical gis & spatial history. bloomington & indianapolis: indiana university press.
• jay, p. ( ) the humanities "crisis" and the future of literary studies. houndmills, basingstoke, hampshire: palgrave macmillan.
• kalliamvakou, e., gousios, g., blincoe, k., singer, l., german, d.m. & damian, d. ( ) the promises and perils of mining github. in proceedings of the th working conference on mining software repositories: - .
• king, e. ( ) digitisation of newspapers at the british library, the serials librarian, vol. ( - ): - .
• kirsch, a. ( ) technology is taking over the english departments. available online: https://newrepublic.com/article/ /limits-digital-humanities-adam-kirsch [accessed / / ].
• lee, j.g. & kang, m. ( ) geospatial big data: challenges and opportunities, big data research, vol. ( ): - .
• schiuma, g. & carlucci, d. ( ) big data in the arts and humanities: theory and practice. florida: crc press.
• songnian, li, dragicevic, s., castro, f.a., sester, m., winter, s., coltekin, a., pettit, c., jiang, b., haworth, j., stein, a. & cheng, t. ( ) geospatial big data handling theory and methods: a review and research challenges, isprs journal of photogrammetry and remote sensing, vol. : - .
• terpstra, n. & rose, c. (eds) ( ) mapping space, sense, and movement in florence: historical gis and the early modern city. london: routledge.
• terras, m., nyhan, j. & vanhoutte, e. (eds) ( ) defining digital humanities: a reader. farnham: ashgate.
• waltzer, l. ( ) ‘digital humanities and the “ugly stepchildren” of american higher education’. in gold, m. (ed.) debates in the digital humanities. minneapolis: university of minnesota press.

asis&t annual meeting papers

restructuring and formalizing: scholarly communication as a sustainable growth opportunity in information agencies?

a. j. million, university of michigan, usa. millioaj@umich.edu
cynthia hudson-vitale, penn state university, usa. cynhudson@gmail.com
heather moulaison sandy, university of missouri, usa. moulaisonhe@missouri.edu

abstract

emerging technologies are revolutionizing the field of scholarly communication.
because of this, scholars increasingly need specialized support during all stages of the research process. with the academic library as the unit of analysis, two concepts from rogers’ diffusion of innovation theory and organizational innovation literature are drawn upon to assess the sustainability of scholarly communication work in libraries. these concepts are organizational restructuring and formalization. data on association of research libraries (arl) employees with relevant job titles and three digital curation competencies documents are analysed. study findings suggest that arl information agencies have restructured to provide added research support and that skills associated with scholarly communication positions are becoming more uniform. we conclude that scholarly communication information professionals are part of a sustainable area of practice within arl information agencies that has matured over the past decade, and that this trend is likely to continue in at least the short term.

keywords

information professionals; scholarly communication; digital curation; information agencies; sustainability (management); diffusion of innovations

introduction

the emergence of new technologies to support publishing and communication (including social networks and online reference management and sharing platforms) has fundamentally changed the way that scholars work (regazzi, ) and has altered the research process in which they engage. scholarly communication has become more complex as a result of new and emerging technologies. scholarly communication, an area supported by information professionals, is defined as:

the study of how scholars in any field (e.g., physical, biological, social and behavioural sciences, humanities and technology) use and disseminate information through formal and informal channels.
the study of scholarly communication includes the growth of scholarly information, the relationships among research areas and disciplines, the information needs and uses of individual user groups, and the relationships among formal and informal methods of communication (borgman, , pp. - ).

scholarly communication enables the work of researchers at universities and organizations that are dedicated to creating new knowledge. in the united states (u.s.), over half of all federal research dollars are granted to university-affiliated researchers (regazzi, ), making universities a hub for the creation of new knowledge.

supporting scholars

supporting scholarly communication on university campuses in an age of emerging technologies has naturally fallen to information professionals based in information agencies, such as academic libraries. traditionally, academic libraries have divided the expertise they provide into technical services and public services, but given the complexity of technologies combined with the increasingly sophisticated needs of users, a new breed of information professional must emerge (e.g., kowalski, ). in the case of scholarly communication questions, not only must the information professional curate and supply access to scholarship, but he/she must also work with scholars to organize, provide access to, save, and share their work. providing specialized support to scholars (e.g., ketchum, ) during the research process has quickly become the purview of the academic library. as the needs of researchers and scholars have changed, the work of information professionals has evolved in parallel. working to support the field of “scholarly communication today reflects a [need to master an ever-growing] constellation of tools, practices, and competencies” (cross, oleen, & perry, , p. ).
the support scholarly communication information professionals provide to researchers varies according to the priorities of the institutions in which they work, the areas of expertise of the scholars whom they support, and their own skills and training. for the purposes of this paper, we consider work that supports scholarly communication as: assistance with digital curation, research data management, and open access and publishing. although the competencies and skills to support this work have been codified in several documents (some of which are analyzed in this paper), little is known about the sustainability of scholarly communication work in information agencies. this paper assesses the role of the scholarly communication information professional in an attempt to understand the sustainability of this non-traditional work. to assess sustainability, we review literature that pertains to the evolving roles and missions of academic libraries and the role that information professionals can play in supporting scholarly communication at their institutions. next, we present the concepts of restructuring and formalization from rogers’ ( ) diffusion of innovations (doi) theory and organizational innovation literature, under the assumption that scholarly communication positions have diffused among information agencies in recent years. informed by concepts from doi, we discuss findings from an analysis of association of research libraries (arl) job statistics and a qualitative evaluation of three scholarly communication competency documents specific to digital curators.
based on our findings, we conclude that arl libraries have restructured to add scholarly communication positions due to the value these information professionals bring, and the alignment of competency documents suggests that the work of information professionals is becoming more uniform. doi predicts that restructuring and formalization in organizations are associated with slower rates of change. thus, evidence suggests it is likely that scholarly communication information professionals will remain a sustainable addition to arl information agencies in the short term. we conclude by discussing limitations to our analysis and offer directions for future research.

sustainability

for the purposes of this paper, we define sustainability as the stability of defined roles and responsibilities within information agencies. the united nations brundtland report ( ) provides another definition, which says that sustainability is “development that meets the needs of the present without compromising the ability of future generations to meet their own needs” (p. ). applied to work in information agencies, this definition implies that sustainability occurs when individuals carry out work that does not compromise the ability of their successors to do the same. we do not discount the value of this definition; however, it is inappropriate for use in this paper. the reason we adopt the first definition is that we are ultimately concerned with the perpetuation of scholarly communication as a set of positions and practices in arl information agencies. thus, we are not concerned with the question of whether short-term practices by scholarly communication professionals are likely to be harmful or counterproductive in the long run.
research questions and rationale

based on survey data from , the council of prairie and pacific university libraries (coppul) scholarly communications working group found the field of scholarly communication was not yet coalescing to form a community of practice in libraries (swanepoel, ). however, after ten years of the association of college and research libraries (acrl) scholarly communication roadshow, cross, oleen, and perry ( ) found the field has significantly matured. given the work these information professionals carry out, and given the support that researchers and scholars require thanks to changing technologies, this paper addresses two research questions:

rq1. going forward, how sustainable is the work of scholarly communication information professionals in arl information agencies likely to be?
rq2. to what extent has a sub-area of scholarly communication practice (i.e., digital curation) emerged as a cohesive area of practice?

the first research question (rq1) addresses the primary issue examined in this study. research question two (rq2) pertains to a sub-area of scholarly communications practice that must be examined to answer rq1.

literature review

scholarly communication is a field that has needed to adapt to modern technology, research approaches, and communication methods. information professionals supporting scholarly communication have emerged to assist researchers in navigating this new landscape. as brantley, bruns, and duffin ( ) explain, “the activities of scholarly communication-support librarians have grown and changed in recent years due to the increasingly complex nature of modern digital scholarship” (p. ). for example, regazzi ( ) identifies big data and big science as two changes that have profoundly affected researchers and the work they do.
hey, tansley, and tolle’s ( ) canonical text also correctly identifies a marked shift towards computational and data intensive research in the academic environment. furthermore, the u.s. office of science and technology policy memorandum, “expanding public access to the results of federally funded research”, hastened the development of library services that support the open access and sharing of research outputs. scholarly communication information professionals have learned to navigate the increasingly complex field of scholarly communication and its related technologies, and work to support scholars in their use of these technologies through services they offer. borgman ( ) clarifies that, in terms of the intersection of digital libraries and scholarly communication, information professionals are more interested in the services digital libraries can provide (borgman, , p. ); in this way, they might be said to resemble both the technical services librarians and the public services librarians of the past.

note: the assumption that scholarly communication positions have diffused among information agencies is based on cross, oleen, and perry’s ( ) history of the acrl scholarly communications roadshow and prior efforts to educate librarians about the need to engage in scholarly communication work.

the evolving academic library

in response to the changing needs of their users and in support of the new services they must now provide, academic libraries have changed dramatically, moving beyond a front-of-the-house/back-of-the-house binary model to convergence between the two (bossaller & moulaison sandy, ; kowalski, ). library reorganization has been a key way in which academic libraries have positioned themselves for the future of access.
in examining the literature, novak and day ( ) identify five steps libraries typically undertake during the reorganization process: libraries (1) identify drivers for change, (2) carry out analysis and diagnosis, (3) communicate the change plan, (4) implement the change, and (5) carry out continuous assessment afterward. assigning new tasks to staff based on the analysis and diagnosis step takes place during the implementation step. it is at this point that staff are trained in any new skills they might need and are ideally placed in new positions based on their existing skills, talents, and proclivities. staff may, therefore, have been hired under one set of requirements or professional expectations, and find themselves taking on new work in other areas.

information professionals supporting scholarly communication

new job titles and requirements have emerged in support of the changing academic library and changing academic research practices. positions may no longer require information professionals to have ala-accredited degrees in library and information science (lis) (t. dawes, personal communication, ). research by fearon et al. ( ) found that librarians supporting research data management services bring subject expertise across varied research disciplines, in addition to, or in place of, the traditional lis degree. these information professionals, therefore, have the potential to come from a variety of backgrounds and may have traditional lis degrees, other areas of specialization, or both. anecdotally, there has been a marked increase in scholarly communication information professionals in academic libraries, but little is known about them other than the gender of scholcomm listserv members (hayes & kelly, ). according to the literature, in terms of their work, information professionals in this domain are responsible for a number of sub-activities.
they may be expected to assist faculty and other users with the scholarly communication process (klain-gabbay & shohamb, ). promoting the open access movement among library users is also seen as a scholarly communication problem (potvin, ; rodriguez, ). one primary sub-area of scholarly communication is digital curation, and by extension, research data management. to support scholars throughout the research process, scholarly communication information professionals must also assist with the curation and use of data and digital assets in light of the technological advances mentioned earlier (fearon et al., ). indeed, open access and open research services are inherent in the development of support for research data management and curation. fearon and his co-authors find that % of institutions that archive data do so to support open access (p. ). further, one of the core library services for research data management involves consultations and education around public access to all research outputs: not just data, but also code, analyses, protocols, workflows, publications, and more. information professionals working in this area also attempt to understand their users’ information needs and digital information preferences (maron & smith, ). lewis, sarli, and suiter ( ) found that many institutions are developing services to support researchers in managing their scholarly identities and tracking research outputs. to support this service, information professionals developed skills in data analysis, digital humanities, data management, and an understanding of various metric types. in short, as the field has evolved, scholarly communication information professionals have moved from advocating for open scholarship to actively providing services that relate to the communication and publishing of research, research data, and additional aspects of scientific inquiry.
diffusion of innovations

it is challenging to predict the future, but rogers’ ( ) diffusion of innovations (doi) theory and organizational innovation literature provide concepts to examine the sustainability of scholarly communication work in arl information agencies. no theory can predict the future; however, doi is one of the most cited works in social scientific literature (green, ). below, we introduce doi and describe how the theory helps predict change in information agencies like libraries, with a focus on the concepts of organizational restructuring and formalization. in social scientific literature, doi is used to explain how and why innovations spread. an innovation is defined as “any idea, practice, or project that is perceived as new by an individual or other unit of adoption” (rogers, ; pp. - ). in this paper, we define an innovation as scholarly communication positions within arl libraries. building on the work of rural sociologists (ryan & gross, ), rogers ( ) is credited with popularizing doi by synthesizing studies that explained why individuals and organizations adopt innovations (valente & rogers, ). in doing so, rogers ( ) identified four causal factors: an innovation, communication, time, and a social system (p. ). we focus on innovations (i.e., scholarly communication positions) and social systems (i.e., arl libraries) below.

innovation in organizations

organizations are “organized bod[ies] of people with a particular purpose,” and they tend to be complex (oxford, n.d., para. ). originally, doi emerged as a theory that focused on the decisions of individuals. later developments extended it to describe innovation in organizations, such as academic libraries, through the use of metaphors. for example, one metaphor is that of an organizational decision-making process. another metaphor is that of organizational structure.
regarding the metaphor of organizational decision-making, one way that rogers extended doi was through the creation of a six-part model. elaborating, rogers ( ) argued that organizations act as if they possess agency and try to reduce uncertainty about the costs and benefits of adopting and/or rejecting innovations. the six stages of organizational innovation-decisions are: agenda-setting, matching, redefining, restructuring, clarifying, and routinizing (p. ). agenda-setting refers to the stage where goals and agendas are set by leaders. matching describes staff searching for and finding solutions to meet agendas. redefining and restructuring mean that after the matching stage ends, agendas are refined and innovations adopted. finally, clarifying describes the process of generating buy-in from staff (p. ) and routinizing is when staff no longer perceive an innovation as new (p. ). we do not focus on decisions to create scholarly communication positions in this paper, but rogers’ model offers a way to infer changes in information agencies that have already taken place through a process of diffusion. as demonstrated by the creation of scholarly communication positions, we argue this change pertains to library restructuring. another metaphor central to this paper is the concept of organizational structure. pugh ( ) says that organizational structures are “regularities in activities such as task allocation, coordination, and supervision” (p. ), and rogers ( ) argues there are six traits that determine if organizations will change. rooted in organizational innovation literature, these structural traits are: centralization, formalization, complexity, interconnectedness, organizational slack, and size. formalization describes the extent to which “an organization emphasizes its members’ following rules and procedures” (p. ) and we interpret it as information agencies creating positions that align with professionally uniform practices.
literature examining innovation in organizations from a structural view supports rogers’ arguments, especially when it comes to the matter of formalization. formalized job roles are associated with slower rates of innovation. for example, in a study of social welfare organizations hage and aiken ( ) found that a low degree of job codification is associated with a high rate of change. in , howard looked at four university libraries and found that the rate of innovation is lower for centralized, formal, and stratified organizations. more recently, chen and chang ( ) looked at organizations via survey and determined that a high degree of organizational formalization slows decision speeds and organizational innovation. formalization is not the only predictor of change in organizations, but it does align with the restructuring phase of rogers’ ( ) innovation decision-making process.

methodology

assessing the prevalence of scholarly communication positions in arl libraries

to understand the growth of the number of scholarly communication information professionals employed in large academic libraries in north america, we analyzed data for positions coded in the arl salary survey. although arl institutions are not representative of all information agencies supporting research, they include some of the largest and most prestigious academic libraries in the u.s. and canada. once per year, arl libraries submit employee data for all positions. survey coordinators choose one job code to apply per staff member, and later this data is cleaned, normalized, and aggregated. for the current study, we examined data for positions coded as scholarly communication librarian (scholar) and digital curation librarian (digicur). the scholar code is defined in the salary survey as a library staff member who is involved in scholarly communication. these individuals work with or promote open publication access, provide advice regarding copyright issues, and more.
the digicur code is defined as a library staff member who creates and curates digital collections in the sciences, social sciences, or humanities, or who works with data-management issues across multiple disciplines. as noted earlier, digital curation is a sub-area of scholarly communication. both of these positions were analyzed given their support for the communication and publishing of scholarship. the year marked the first time arl collected data regarding these two categories.

asis&t annual meeting papers

skills and competencies assessment

researchers also analyzed the recommended skills and competencies required for digital curation and data management, an area of expertise that blends back-of-the-house technology expertise with front-of-the-house service provision in support of the scholarly communication process. the project team applied a multi-step approach: (1) identification of relevant guidelines; (2) compilation of competencies/skills as a way to enable uniform mark-up during coding; (3) the inductive development of broad categories; and (4) the coding of each competency to a category. to find as many relevant digital curation skills and recommendation reports or research studies as possible, the following approach was applied:
1. databases searched: library literature & information science; library, information science & technology abstracts; google scholar; google
2. keywords used: “digital curation skills”; “digital curation curriculum”; “digital curation training”
this search returned articles. given that we sought to understand competencies and skills for digital curation and research data management (a specific subset of work carried out by scholarly communication information professionals), documents focusing on practice rather than theory were retained.
articles analyzing position announcements were deemed unreliable since there was no way to know how successful searches would be or how closely the incumbent would meet requirements and would carry out the work as described. with this in mind, we developed and applied the following inclusion criteria to narrow search results:
1. published -present;
2. written in english;
3. specifically indicated necessary or recommended competencies/skills for properly curating or stewarding data; and
4. competencies/skills were not developed by analysing job advertisements or reviewing the literature.
the following three competency documents were analyzed: librarians' competencies profile for research data management ( ), matrix of digital curation knowledge and competencies ( ), and preparing the workforce for digital curation ( ). below, we present each in chronological order based on their publication dates.

the matrix of digital curation knowledge and competencies published in is one of the earliest documents that articulated the requisite skills for digital curation work. these skills were developed as part of the digccurr project led by helen tibbo and cal lee at the university of north carolina as part of an institute of museum and library services (imls) project. this imls-funded project developed library school curricula and held a number of training events to support the diffusion of digital curation activities. a project matrix comprised of six dimensions was developed as a method to identify and organize material to be covered in digital curation curricula. these dimensions are: (1) mandates, values, and principles; (2) functions and skills; (3) professional, disciplinary, institutional, organizational, or cultural context; (4) type of resource; (5) prerequisite knowledge; and (6) transition point in information continuum.
the dimension “function and skills” was the source for this research, because it is defined as “know how,” as opposed to conceptual, attitudinal, or declarative knowledge.

in , the national research council of the u.s. national academies of science released the report preparing the workforce for digital curation. this report was authored by a committee on “future career opportunities and educational requirements for digital curation” and was intended to examine workforce-related issues in information agencies with an eye toward future economic development. the committee was composed of experts and industry representatives in library and information science, labor economics, and domain-specific scientific fields.

more recently, the librarians’ competencies profile for research data management was published in by the joint task force for librarians’ competencies in support of e-research and scholarly communication convened by the arl. the aim of this task force was to outline the competencies needed for e-research, repository management, and scholarly communication. the task force consisted of representatives from arl institutions, the canadian association of research libraries, the confederation of open access repositories, and the association of european research libraries.

compilation of competencies

the three competency guidelines varied in length and detail; two were documents, and one was a matrix of skills. formats also varied, with one being an html document and two being in pdf format. to normalize these documents, a pdf was made of the html document. next, using adobe acrobat pro, we documented codes by highlighting relevant text and inserting comments for each competency. the level of detail for each competency document varied.
the matrix of digital curation knowledge and competencies included function categories, such as “access” and “administration” and sub-level functions, such as the “generation of dissemination information package,” which went into great detail about each overarching category. to ensure comparison across the competency documents was reliable, we used high-level descriptions and text in the definition and explanation section of each document to conduct the qualitative coding process.

inductive coding

content analysis is an appropriate methodology for understanding and reducing information from existing data sources such as competency documents. because no taxonomy of qualitative competency guidelines for digital curation exists, we employed an inductive structural coding approach, which involved identifying concepts and skills that may apply to large segments of text and enabled the comparison of frequency counts across cases (saldaña, ). inductive coding means that coded skills reflect the content found in source documents rather than the skills categories being pre-determined by the researchers or based on existing taxonomies. each document was read by a team member and a list of skills was developed. to allow for comparison and the next coding step across documents, broad categories or buckets of skills were created. for example, the category of research workflows includes skills such as the ability of a curator to understand research practices, workflows, and/or the ability of the curator to understand disciplinary norms and standards. after categories were developed, the competency documents were then re-coded for each category. as competencies or skills were mentioned, the researchers recorded instances for each category.

suitability of competencies as data points

speaking about management, jordan and lloyd ( ) argue that human resource planning in libraries calls for the careful selection and recruitment of staff.
one way this is accomplished is to formalize positions by defining job responsibilities to make work predictable and regular. in the case of scholarly communication, the existence of a disciplinary consensus about the skills individuals ought to possess would suggest the presence of formalized roles brought about by restructuring. organizations do not constitute professions; however, they do employ the individuals who comprise them. as noted earlier, organizational formalization is associated with a decreased rate of change, so if arl libraries are creating scholarly communication positions, it is more likely that scholarly communication work will be sustainable in the future.

findings

number of scholarly communication librarians in arl libraries

the number of scholarly communication librarians has been steadily on the rise since when data was first collected, as is demonstrated in figure . between and , scholarly communication positions with the job titles scholarly communication (scholar; i.e., focusing on open access, etc.) and digital curation (digicur; i.e., focusing on research data management, etc.) increased annually in arl libraries. as indicated previously, marked the first year that arl collected data for these positions; in scholar data was combined with other codes given the low number of positions and is, therefore, not presented in figure . interestingly, as shown below, growth in digital curation positions totaled %. scholarly communication growth was far more rapid with an increase of %. a conclusion to draw is that the number of scholarly communication librarians in arl libraries has grown since data was first collected and that the total of the two combined (as of ) was roughly individuals.

figure . scholarly communication (scholar) and digital curation (digicur) positions at arl libraries, - .
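
the growth percentages reported above follow ordinary percent-change arithmetic. the sketch below assumes growth is measured relative to the first year in which a job code was reported; the function name and the position counts are hypothetical placeholders, not the arl figures.

```python
def percent_growth(first_count, last_count):
    """Return the change from first_count to last_count as a percentage of first_count."""
    if first_count <= 0:
        raise ValueError("first_count must be positive")
    return (last_count - first_count) / first_count * 100.0


# Hypothetical position counts (first year, last year) per job code.
# These are illustrative values only, NOT the ARL salary survey data.
hypothetical_counts = {
    "SCHOLAR": (40, 100),
    "DIGICUR": (50, 80),
}

for code, (first, last) in hypothetical_counts.items():
    print(f"{code}: {percent_growth(first, last):.1f}% growth")
```

because the baseline is the first reported count, a job code that starts from a small base can show a much larger percentage increase than one with a similar absolute gain from a larger base.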
identification of competencies and agreement within documents

coding of the competencies examined in this paper identified categories for information professionals to perform digital curation and/or research data management work (see table ). of these categories, twelve were found across all three competency publications. these skills are: advocacy and outreach, instruction, data preservation, data management, data selection, data repository platforms, data linking, data audits, data type best practices, licensing data, data organization, and information discovery. an additional seven skills were found across two competency documents. these skills were: some type of discipline knowledge, data analysis skills, an understanding of funder requirements, knowledge of research workflows, data citation, data sharing, and data visualization skills. finally, four skills were limited to only one competency document. these skills were: data security, programming and scripting, a background in science, technology, engineering, or medicine, and an understanding of publisher requirements.

table : skills required for digital curation per the selected guidelines documents.
columns: competency or skill; matrix of digital curation knowledge ( ); preparing the workforce for digital curation ( ); librarians' competencies profile for research data management ( ).
rows: advocacy & outreach; data audit; data linking; data management; data organization; data preservation; data repository platform; data selection; data types; information discovery; instruction; licensing data; discipline knowledge; data analysis; funder requirements; research workflows; data citation; data sharing; data visualization; data security; programming & scripting; education in stem; publisher requirements.

as noted above, competencies documents possessed a high degree of agreement expressed in skill categories.
such agreement is notable, because all three documents were written over the past decade by separate groups with notable disciplinary qualifications. regarding agreement among documents, the most similar were the librarians' competencies profile for research data management and preparing the workforce for digital curation with over % agreement ( of skills). the matrix of digital curation knowledge and competencies and preparing the workforce for digital curation had the next closest agreement with % or out of points of overlap.

discussion

to address the question of sustainability, we consider our findings overall. based on gathered arl data, scholarly communication information professionals, especially generalists with the title scholarly communication librarian, emerged in academic libraries in force through the s. the upward trend in the growth of this position leads us to believe that this aspect of the information professions is in demand and that the work being done is both necessary and appreciated. consensus in competencies suggests that the information professions are in agreement about what ought to be done in support of researchers as they seek assistance with digital curation and research data management. this implies that the role of the scholarly communication information professional, at least in terms of their work, is becoming formalized. we take the formalization of digital curation work to be an indicator of the maturation of the field as it coalesces around established competencies.

based on reviewed literature and study findings, the time between and appears to have been pivotal in the formation and coalescence of scholarly communication work by information professionals. in , the acrl scholarly communication roadshow began educating librarians and other information professionals about scholarly communication writ large.
by , arl had begun to collect data on positions with the title scholarly communication librarian, but did not find they had enough data to warrant reporting it separately from other positions. a divergence in data collection classifications was allowed to form between scholarly communication and digital curation, despite overlap in the work carried out. in , respondents to the coppul survey supplied data indicating scholarly communication as an area of specialization had not coalesced (swanepoel, ). yet, by , arl was collecting data about scholarly communication librarians with over positions in the arl libraries. in and , two separate documents were published that indicated % agreement in terms of the required skills and competencies for the sub-area of digital curation (librarians' competencies profile, ; preparing the workforce, ). by , the acrl roadshow organizers found the knowledge of audience members had improved to the point where it was necessary to reconfigure their curriculum (cross, oleen, & perry, ), implying that a level of formalization had occurred in the information professions and information agencies. based on these findings, the answer to research question two (rq2) is that digital curation has emerged as a cohesive sub-area of practice.

further evidence for the emerging cohesiveness of practice in the sub-area of digital curation (and research data management) comes from findings associated with the three competencies documents that we examined. these documents were published at extreme ends of what the literature points to as a formative period for the field of scholarly communication within information agencies; they were also published before and after the establishment of the area as a formal title by arl. the matrix of digital curation knowledge ( ) is the most different of the three competencies documents studied, implying that since its publication the sub-area of digital curation has coalesced.
the competencies identified in (preparing the workforce for digital curation) and (librarians' competencies profile for research data management) demonstrate a mature understanding of this area of scholarly communication information work. all but one of the competencies in the document are included in later documents. further, of the competencies shared by the and documents are unique to newer publications, having been identified and added since . we interpret this as evidence of the maturation of this sub-area of practice within arl information agencies and as a de facto formalization of associated work.

doi and the future of scholarly communication information professionals

based on these findings and the concepts of restructuring and formalization from doi and organizational innovation literature, we predict that increased formalization within information agencies will contribute to a slowing of change for scholarly communication information professionals. therefore, to answer research question one (rq1), we predict scholarly communications work will remain sustainable in the future. staff are being brought in to do work in arl libraries, and that work increasingly, at least in digital curation/research data management, is consistently codified in terms of the competencies and skills that practitioners require. work in areas of scholarly communication has been made increasingly predictable and regular, in support of human resource planning (jordan & lloyd, ) as demonstrated in the consensus about professional skills and competencies. presumably, these positions have diffused, and are diffusing, through the information professions, as implied by the success of the acrl roadshow. the coherence of competencies documents and the number of positions within arl information agencies also suggest that most arl information agencies are in, or have moved beyond, the restructuring stage of rogers’ ( ) innovation-decision process.
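
the pairwise agreement figures reported in the findings can be thought of as a simple set computation over the coded skill categories. the sketch below assumes that agreement between two documents is the share of all skill categories that appear in both; the paper does not spell out its exact formula, so this definition, the function name, the document labels, and the coding results are all hypothetical.

```python
from itertools import combinations


def percent_agreement(skills_a, skills_b, all_skills):
    """Share of all skill categories that appear in both documents, as a percentage."""
    if not all_skills:
        raise ValueError("all_skills must not be empty")
    return len(skills_a & skills_b) / len(all_skills) * 100.0


# Hypothetical coding results: which skill categories each document mentions.
# Illustrative only; these are NOT the coded results from the study.
coded = {
    "doc_a": {"instruction", "data preservation", "data security"},
    "doc_b": {"instruction", "data preservation", "data citation"},
    "doc_c": {"instruction", "data citation"},
}

# Every category mentioned by at least one document.
all_skills = set().union(*coded.values())

# Agreement for every unordered pair of documents.
agreement = {
    (a, b): percent_agreement(coded[a], coded[b], all_skills)
    for a, b in combinations(sorted(coded), 2)
}

for (a, b), pct in agreement.items():
    print(f"{a} vs {b}: {pct:.0f}% agreement")
```

dividing by the full category list rather than by the union of the two documents keeps the denominators identical across pairs, which is what makes the percentages directly comparable.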
study limitations and future work

the future of arl information agencies providing scholarly communication support to scholars appears to be sustainable, but there are limitations to this study. the data we collected only allowed us to examine two concepts from doi and organizational innovation literature. indicators such as arl data and the cohesiveness of professional competencies documents suggest that scholarly communication work has entered the late stages of rogers’ innovation-decision process, but we did not look at individual adoption-decisions inside agencies. speaking about the process by which organizations “choose” to adopt or reject an innovation, rogers notes how the process is not linear; at any point, steps in his model may be skipped, left unfinished, or started again later (p. ).

without data to test rogers’ model, we do not know who made decisions to hire scholarly communication professionals in arl information agencies, if positions resembled those described in competencies documents, and if staff communicate with stakeholders to demonstrate the value that they provide. the creation of new positions hints that scholarly communication work is sustainable, in that it has been formalized, but limitations associated with our data mean the long-term is uncertain. to address uncertainty, future research should examine how scholarly communications positions are created and evaluate their value, which may shape the sustainability of the work that scholarly communication information professionals carry out. one way to gather data would be to interview decision-makers who are responsible for creating positions in arl libraries as well as the scholars who use scholarly communication services.

another limitation associated with this paper carries over to high-level traits that are associated with organizational innovation.
in this paper, much has been made of formalization as a predictive indicator, but we did not examine the work that is actually carried out within arl libraries. therefore, it is possible that positions were less formal than competencies documents suggest.

additionally, doi and organizational innovation literature provide other variables that may shed light on whether scholarly communication information services are likely to remain sustainable in the future. briefly discussed earlier, the variables we did not examine in this study include: centralization, complexity, interconnectedness, organizational slack, and size. as other arl information agencies attempt to create positions, centralized decision-making by university administrators may intervene. a lack of available resources may also prevent positions from being filled. the financial state of many public universities is troubled, and today u.s. states spend less per-student than a decade ago (mitchell, leachman & masterson, ). organizational size also correlates with innovation, and as scholarly communication positions diffuse through the information professions, it is likely that small or medium-sized universities will not be able to provide the same level of support as their larger counterparts. wilder ( , p. ) reports that in the typical arl library employed professional staff, which is much larger than libraries at liberal arts colleges and teaching universities. to account for influences such as these, future work should seek to collect data that provides a more holistic view of scholarly communication position stability. additionally, because organizational size is a factor to consider, future work should look at information agencies other than arl libraries, because these libraries tend to be very large.

conclusion

our findings revealed that the work of scholarly communication information professionals has formalized since .
using two concepts from doi and organizational innovation literature, despite the limitations of this study, we infer that there is value in the work these professionals do, and conclude that the field is entering into a period of clarifying itself (using rogers’ ( ) terminology) that is indicative of its sustainability within arl libraries. given the coalescence of practice in the sub-area of digital curation, scholarly communication will likely continue to remain a sustainable area of practice in the information professions. however, more work remains to be done. the data collected in this study was insufficient to test other concepts from doi, which leads us to conclude that, while the present is stable, the long-term is much less predictable. future research should seek to move beyond this effort and examine the emergence of scholarly communications work within specific organizations, the value that this work delivers to researchers, and its potential to create a reliable, long-term growth opportunity for all types of information agencies.

acknowledgments

the authors would like to thank arl for providing access to its data.

references

brantley, s., bruns, t. a., & duffin, k. i. ( ). librarians in transition: scholarly communication support as a developing core competency. journal of electronic resources librarianship, ( ), - . doi: . / x. .
brundtland, g. h. ( ). report of the world commission on environment and development: our common future. united nations.
borgman, c. l. ( ). scholarly communication and bibliometrics. newbury park: sage publications.
borgman, c. l. ( ). digital libraries and the continuum of scholarly communication. journal of documentation, ( ), - .
bossaller, j. s., & moulaison sandy, h. ( ). documenting the conversation: a systematic review of library discovery layers. college & research libraries, ( ), - . doi: . /crl. . .
calarco, p., shearer, k., schmidt, b., & tate, d. ( ).
librarians’ competencies profile for scholarly communication and open access. retrieved from confederation of open access repositories website: https://www.coar-repositories.org/files/competencies-for-scholcomm-and-oa_june- .pdf
chen, s. t., & chang, b. g. ( ). the effects of absorptive capacity and decision speed on organizational innovation: a study of organizational structure as an antecedent variable. contemporary management research, ( ), - .
cross, w., oleen, j., & perry, a. ( ). jump start your scholarly communication initiatives: lessons learned from redesigning the scholarly communications roadshow for a new generation of librarians. in: at the helm: leading transformation, the proceedings of the acrl conference. - . chicago: association of college and research libraries, a division of the american library association, . retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/ /jumpstartyourscholarlycommunicationinitiatives.pdf
fearon, d., gunia, b., lake, s., pralle, b., & sallans, a. ( ). research in data management services (spec kit ). association of research libraries: washington, d.c. retrieved from: https://doi.org/ . /spec.
green, e. ( , may ). what are the most-cited publications in the social sciences (according to google scholar)? [blog post]. retrieved from http://blogs.lse.ac.uk/impactofsocialsciences/ / / /what-are-the-most-cited-publications-in-the-social-sciences-according-to-google-scholar/
hage, j., & aiken, m. ( ). program change and organizational properties: a comparative analysis. american journal of sociology, ( ), - .
hayes, c., & kelly, h. e. ( ). who’s talking about scholarly communication? an examination of gender and behavior on the scholcomm listserv. journal of librarianship and scholarly communication, ( ). doi: . / - .
hey, t., tansley, s., & tolle, k. ( ). the fourth paradigm: data-intensive scientific discovery. redmond, washington: microsoft research.
howard, h. a. ( ). organizational structure and innovation in academic libraries. college & research libraries, ( ): - .
ketchum, a. m. ( ). the research life cycle and the health sciences librarian: responding to change in scholarly communication. journal of the medical library association, ( ), - . doi: . /jmla. .
klain-gabbay, l., & shoham, s. ( ). scholarly communication and academic librarians. library & information science research, ( ), - .
kowalski, m. ( ). breaking down silo walls: successful collaboration across library departments. library leadership & management, ( ), - .
kyrillidou, m., & morris, s., eds. ( ). arl annual salary survey – . washington, dc: association of research libraries.
retrieved from https://doi.org/ . /salary. -
kyrillidou, m., & morris, s., eds. ( ). arl annual salary survey – . washington, dc: association of research libraries. retrieved from https://doi.org/ . /salary. - .
kyrillidou, m., & morris, s., eds. ( ). arl annual salary survey – . washington, dc: association of research libraries. retrieved from https://doi.org/ . /salary. -
lee, c. ( ). matrix of digital curation knowledge and competencies. retrieved from https://ils.unc.edu/digccurr/digccurr-matrix.html
lewis, r., sarli, c., & suiter, a. ( ). scholarly output assessment activities (spec kit ). association of research libraries, washington, d.c. retrieved from https://doi.org/ . /spec.
maron, n. l., & smith, k. k. ( ). current models of digital scholarly communication: results of an investigation conducted by ithaka for the association of research libraries. the journal of electronic publishing, ( ). doi: . / . .
mitchell, m., leachman, m., & masterson, k. ( ). funding down, tuition up: state cuts to higher education threaten quality and affordability at public colleges. washington, d.c.: center on budget and policy priorities. retrieved from https://www.cbpp.org/sites/default/files/atoms/files/ - - sfp.pdf
montana, p. j., & charnov, b. h. ( ). management. hauppauge, n.y: barron's.
morris, s. ( ). arl annual salary survey – . washington, dc: association of research libraries. retrieved from https://doi.org/ . /salary. -
national research council. ( ). preparing the workforce for digital curation. washington, dc: the national academies press. https://doi.org/ . /
novak, j., & day, a. ( ). the libraries they are a-changin': how libraries reorganize. college & undergraduate libraries, ( - ), - . doi: . / . .
oxford. (n.d.). organization. retrieved from https://en.oxforddictionaries.com/definition/us/organization
potvin, s. ( ). the principle and the pragmatist: on conflict and coalescence for librarian engagement with open access initiatives.
the journal of academic librarianship, ( ), - .
pugh, d. s. ( ). organization theory: selected readings. harmondsworth: penguin.
regazzi, j. j. ( ). scholarly communications: a history from content as king to content as kingmaker. new york: rowman & littlefield.
rodriguez, j. e. ( ). scholarly communications competencies: open access training for librarians. new library world, ( / ), - . doi: . /nlw- - -
rogers, e. m. ( ). diffusion of innovations. new york: free press of glencoe.
rogers, e. m. ( ). diffusion of innovations ( th ed.). new york: free press.
ryan, b., & gross, n. c. ( ). the diffusion of hybrid seed corn in two iowa communities. rural sociology, ( ), – .
swanepoel, m., kehoe, i., hohner, m., shepstone, c., vanderjagt, l., wakaruk, a., [...] winter, c. ( ). developing a community of practice: report on a survey to determine the scholarly communication landscape in western canada. unpublished manuscript. http://hdl.handle.net/ / b
valente, t. w., & rogers, e. m. ( ). the origins and development of the diffusion of innovations paradigm as an example of scientific growth. science communication, ( ), – . http://doi.org/ . /
wilder, s. ( ). hiring and staffing trends in arl libraries. retrieved from http://www.arl.org/storage/documents/publications/rli- -stanley-wilder-article .pdf

st annual meeting of the association for information science & technology | vancouver, canada | – november
author(s) retain copyright
new media & society ( ) – © the author(s) reprints and permission: sagepub.co.uk/journalspermissions.nav doi: . / nms.sagepub.com

the big one: the epistemic system break in scholarly monograph publishing

phil pochoda
university of michigan, usa

abstract

a system of scholarly monograph publishing, primarily under the auspices of university presses, coalesced only years ago as part of the final stage of the professionalization of us institutions of higher education. the resulting analogue publishing system supplied the authorized print monographs that academic institutions newly required for faculty tenure and promotion. that publishing system – as each of its components – was bounded, stable, identifiable, well ordered, and well policed. as successive financial shocks battered both the country in general and scholarly publishing in particular, just a decade after its final formation, the analogue system went into extended decline. finally, it is now giving way to a digital scholarly publishing system whose configuration and components are still obscure and in flux, but whose epistemological bases differ from the analogue system in almost all important respects: it will be relatively unbounded and stochastic, composed of units that are inherently amorphous and shape shifting, and marked by contested authorization of diverse content. this digitally driven, epistemic system shift in scholarly publishing may well be an extended work in progress, since the doomed analogue system is still fiscally dominant with respect to monographs, and the nascent digital system has not yet coalesced around a multitude of emerging digital affordances.
keywords
analogue publishing system, digital publishing system, accreditation, authorization, epistemic break, digital soup, digital shrew, stochastic

corresponding author:
phil pochoda, institute of humanities, university of michigan, ann arbor, mi , usa.
email: ppochoda@gmail.com

review article

darnton’s publishing circuit

in a much-cited article ‘what is the history of books’ (darnton, / : – ), robert darnton utilized the actual developments and participants in the publication of voltaire’s nine-volume questions sur l’encyclopedie to generate a schematic model of the full publication circuit, the flows of materials and ideas from author to reader (with intermediate stops at publisher, printer, shippers, and booksellers). darnton claims, with much justification, that this model of the ‘entire communication process’ should apply ‘with minor adjustments to all periods in the history of the printed book’ (darnton, / : ). darnton illustrates the communications circuit with the diagram shown in figure (darnton, / : ).

darnton’s ‘publishing model’ or ‘publishing circuit’ can also be considered a publishing ‘system,’ since it delineates a set of components that interact with each other in regular and ongoing ways, and whose overall behavior or output could not be predicted or explained by any or all of the components separated from the others. any system is an arbitrary slice through, or selection from, a much broader potential array of interacting elements. the value of the system model consists of its ability to reveal and/or to explain regularities in important or recurrent empirical phenomena. in the case of darnton’s model, the claim that it incorporates and helps conceptually organize and interpret the primary activities connected with print publishing over the course of two centuries underwrites its claim to significance.
since each of the elements of the publishing system is itself a system, and can be decomposed into its own interacting components, the publishing system can be viewed as constituted of linked subsystems. for example, the publisher is represented in darnton’s general publishing model (although not in his nuanced narrative account of the actual publication of voltaire’s work) as a monolithic entity – a ‘black box’ that receives inputs from the author and sends outputs to the printer. however, for other analytic or practical purposes, publishers must themselves be treated as a system, composed of, or deconstructed into, interacting components (or subsystems), such as editorial, production, marketing, sales, and business departments.

conversely, the publishing system can be considered itself as a subsystem of larger economic, political, or cultural systems: that is, a more generalized, more encompassing, societal system model could be constructed in which the publishing system is treated as a single component. darnton takes account of the substantial impact of the tumultuous economic, political, and cultural environments of pre-revolutionary europe upon the publication of voltaire’s work by treating them as boundary conditions that significantly affect and are, in turn, affected by the publishing circuit. in technical terms, the publishing system is an open system, exchanging inputs and outputs with elements or systems outside itself (patten and auble, : ).

open systems may encounter certain boundary conditions – the environmental states that feed into the system – that facilitate system stability over a wide range of internal activity, while other boundary conditions make system stability difficult or even impossible. in a system subject to mathematical formulation, the latter outcome means that there are no system solutions for certain values or a range of boundary conditions. in the example of the publishing system, particular economic conditions (e.g. extreme general financial instability or crises, or specific economic developments in the publishing sector) or political conditions (e.g. rigid censorship by french royal agents) can and did interrupt the publishing system (at least as regards voltaire’s book) for shorter or longer periods of time.

figure . robert darnton’s depiction of the book publication circuit in france, .

coalescence of the scholarly print publishing system: professionalization, authorization, and accreditation

until the s, the circuit of monograph scholarly publishing – which at that time included approximately university presses (givler, ) – could be depicted in a diagram that much resembled darnton’s schematic of the th century publishing circuit. authors, generally faculty and graduate students at the same institution as the press, would hand over their manuscripts to the press; the press would ready them for production and deliver the edited manuscripts to a printer; the printed books would continue on through a distributor, bookstore or library, to readers. what were missing were strong and consistent controls on the quality of the output, the books produced and circulated. that is, it lacked feedback from the core output – the quality not of the physical product, the container, but of the argument, the content. to qualify as a cybernetic scholarly system, content quality needed to be sampled, assessed, and then fed back into the publishing process as a control mechanism that effects changes at the book level: the system must display strong and persistent feedback loops that produce a reliable threshold of content quality (wiener, and : – ). in the s the scholarly publishing circuit installed such feedback, in the form of professional disciplinary standards, within the publishing system at multiple locations.
university presses, long the center of scholarly publishing, now organized and implemented such feedback by submitting every monograph to an external review process that was adapted, with some variations, from similar systems employed for scholarly journals. while the scholarly disciplines had previously weighed in formally but erratically post-publication on the merits of monographs through reviews in prestigious professional journals, and informally in many other ways, by building in the review hurdle or authorization within the publishing process itself, presses attempted to ensure that every published monograph and all published content attained at least a minimal professional level.

not coincidentally, the professionalization of press content was influenced by and, in turn, played a significant role in, the final stage of the professionalization of institutions of higher education in general. it was at just this time that such institutions – at first primarily elite universities and colleges, but soon schools of all types – began requiring faculty in the humanities and the social sciences to produce well-received monographs (or, in some cases, a number of journal articles) to qualify for tenure and promotion (accreditation). superior teaching, institutional service, and professional reputation were no longer sufficient for moving into and up the faculty ranks: quality and quantity of publication became an additional and, soon, the primary, qualification for faculty appointment and hierarchical elevation. if administrations were to require published books as a major part of a decision on faculty tenure or promotion, then it was necessary for the books to arrive at that decision point already professionally authorized. university press editors were rarely adequately qualified to perform this function, so monographs needed the endorsement of credentialed external reviewers to serve in this role.
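the review gate described above can be pictured as a simple negative feedback loop. the following is only an illustrative sketch, not a model taken from the article: the `Manuscript` class, the 0–100 quality score, the threshold, and the fixed gain per revision round are all hypothetical assumptions introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class Manuscript:
    title: str
    quality: int  # hypothetical 0-100 score assigned by external reviewers


def review_loop(manuscripts, threshold=60, revision_gain=10, max_rounds=3):
    """Apply a review gate and return (published, rejected).

    All parameters are illustrative assumptions: a manuscript assessed
    below `threshold` is sent back for revision (the negative feedback),
    gaining `revision_gain` points per round, for at most `max_rounds`.
    """
    published, rejected = [], []
    for ms in manuscripts:
        quality = ms.quality
        rounds = 0
        # Feedback step: a below-threshold assessment returns the
        # manuscript to the author instead of passing it to the printer.
        while quality < threshold and rounds < max_rounds:
            quality = min(100, quality + revision_gain)
            rounds += 1
        result = Manuscript(ms.title, quality)
        if quality >= threshold:
            published.append(result)
        else:
            rejected.append(result)
    return published, rejected
```

whatever the inputs, every title in the published list sits at or above the threshold, which is the ‘reliable threshold of content quality’ that the feedback is meant to secure; titles that cannot reach it within the allowed rounds are not published.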
the mutually reinforcing, symbiotic professionalization of the university and professionalization of the presses acted as a further control mechanism on the system of scholarly communication. the idea of applying scholarly standards to monographs was not an innovation, of course: many press editors and many post-publication reviewers had long invoked them on a somewhat haphazard basis. what changed were the uniformity, rigor, and consistency of the application of such standards – and the immediate and significant negative consequences of failure to conform (the negative feedback loop). post-publication administrative accreditation, when layered upon pre-publication peer-review authorization, provided a strong likelihood that monographs published within the university press system could be relied upon to meet at least a minimum scholarly standard. manuscripts that failed to achieve such authorization would (in principle, at least) not be published; faculty who produced an insufficient number of vetted publications would (again, in principle) not be retained or promoted (accredited).

several anomalies in the resulting system are worth mentioning. by the end of the s, there were approximately university presses (many quite small) (givler, ), although many more institutions of higher learning – certainly several hundred – were then requiring published monographs in many fields. currently there are approximately university presses in the us and, while not all of the institutions of higher learning in the country require published books on the part of their humanities and some social science faculties, a significant majority certainly does, implying that presses and their home institutions are in fact subsidizing at least other universities and colleges who are free riders on a system that they rely on but do not support (greenstein, ).
the preceding discussion also makes clear that while university presses are oriented to their home administration for material and other support, their published faculty authors are situated in a far wider swath of institutions of higher learning, and these authors are far more concerned with their own administrations (and their power to reward or punish). this creates a disconnect between the limited number of administrations that support a university press and the much larger pool that knows little about university presses and publishing, yet requires their products for status decisions regarding its own faculty. the result of this disconnect is not merely financial inequity but the reduced likelihood of rational information and observation about, much less remediation of, the scholarly publishing system as a whole.

university presses have always made independent decisions about which scholarly areas to publish (much less which books to publish), decisions that have been based on press history (sometimes just plain inertia), press resources, university strengths, product marketability, editorial and marketing preferences and idiosyncrasies, and other factors, both rational and irrational. presses have never collaborated to efficiently allocate publishing areas in order to ensure complete or optimal distributed coverage of fields. in the diagrammed publishing system presented in figure , it is only after publication, when books get to the library distributors who sell books by categories to academic libraries, that books from all academic publishers are even sorted and aggregated by field.

the system of scholarly publishing in the above figure embeds the feedback mechanisms (authorization and accreditation) inside the system, because these are not externalities or options, but are fundamental to system identity and maintenance.
(a third control mechanism or mode of authorization – post-publication reviews in professional journals – is implicitly contained in the ‘reader’ venue of the diagram.)

the scholarly print publishing system itself proved stable or at least sustainable (relatively if not rigidly homeostatic) for a half-century. the sites depicted increased or decreased in their quantity (for example, the number of university presses increased by only from to ) (givler, ), but remained unchanged in their nature and function. the vector of the circuitous flow from manuscript to printed books to readers was invariant (although, to the dismay of presses, books sold per title declined continuously). in short, the analogue scholarly publishing system – engaged in the production of books that are bounded (literally, bound), identifiable (clearly and immutably authored and titled), and stable (the container and the content of each book remained fixed) – is itself stable, bounded, continuous, well ordered, and well policed.

figure . the ‘analogue’ system of print monograph scholarly publishing.

as in the darnton model, the system as depicted in figure remains open to environmental influences, in particular, the states of the local and national economy and external revenue flows. in addition to revenue from book sales and university subventions that are internal to the system as depicted, a major factor supporting the coalescence of the system of scholarly publishing in the s was the vast amount of federal funds that flowed into universities as a whole, and into the scholarly publishing system, directly and indirectly, through libraries and presses. post-sputnik, the government was determined to accelerate development of us research capabilities (infrastructure) as well as to fund research activities directly, impulses that were strengthened by the outbreak of the vietnam war.
the national science foundation saw its budget increased from its founding level of us$ million to a post-sputnik amount of us$ million in and nearly us$ million in , and the foundation lavishly supported a wide range of research projects and their subsequent publication throughout the s (mazuzan, : ch. ). the defense department, deeply enmeshed in the vietnam war, funneled much funding into university research as well as into publications (mazuzan, : ch. ). significant publication subsidies also came, both overtly and covertly, from the cia in this period of high-intensity cold war competition (saunders, : – ). libraries, whose budgets expanded throughout the decade, were able to commit to substantial standing orders for almost all university press books. presses inevitably assumed that the higher standards they had imposed on their monographs, and the increased professionalism of their operation, were responsible for their strong performance. no one had reason to imagine that this period of prosperity, and the stability it meant for libraries and presses, would prove extremely brief, or that the golden age for the analogue publishing system – and for higher education as a whole – would span only a decade.

from abundance to scarcity

as rapidly as the financial spigots were turned on in the s, so were they precipitously reversed in the early s (mazuzan, : ch. ). the escalating costs of the vietnam war (and the escalating public opposition to the war), as well as the public excitement and relief engendered by the us leapfrogging past the soviet union to the moon in , reduced the federal government’s felt urgency or capacity to fund universities as it had for the preceding decade. in particular, libraries were hit hard, and their reduced budgets resulted in a protracted decline of library orders to presses. universities, now struggling with a suddenly underfunded research infrastructure, began reducing press subventions as part of much wider cuts (givler, ).
as the universities in a short time went from boom to bust, so inevitably went the presses.

the analogue system bent but did not break under a series of major financial blows in the s. in addition to the decline in federal funding, the decade witnessed the following: a major stock market crash and global oil crisis in ; inflationary growth through the whole period; a major recession from to ; another oil shock in ; and record stagflation throughout the end of the decade (samuelson, : – ). separately and together, these economic shocks took a significant toll on publishing revenues.

at this time as well, large commercial publishing conglomerates (e.g. springer, bertelsmann, and wiley) gained control over many or most of the most prestigious academic science, technology, and medicine (stm) journals (carrigan, ; thatcher, ), and compelled university libraries to pay relentlessly escalating and exorbitant subscription prices for articles based on research that was primarily funded by the us government, foundations, or by the universities themselves. in turn, academic libraries, already faced with static or declining budgets, were forced to assign much higher proportions of those beleaguered acquisitions budgets to these journal subscriptions, and reduce, by at least the same amount, their purchase of academic monographs. university presses, reeling under this simultaneous decline of multiple revenue streams, reacted by raising the price of their books – with the inevitable result that libraries reduced orders even further, and so the vicious spiral (downwards of sales, and upwards of prices) that would continue for the next years commenced.

these internal system changes and the changes in the system boundary conditions beginning in damaged the analogue publishing system severely, rendering homeostasis (wiener, and : – ) – at either the system level or at the level of individual publishers – increasingly problematic.
the print-based analogue system was so weakened as a result, and had so little margin to maneuver, that when subjected to the full force of the digital assault years later, it could provide only token resistance.

the digital trojan horse: sustaining innovation

under increasing stress for a decade, in the s scholarly publishers, as did all other businesses, rapidly and enthusiastically incorporated in their daily operations the software programs that accompanied the widespread introduction of desktop computers (freiberger and swaine, ). word processing programs – e.g. wordstar, wordperfect, xywrite, and the eventual champion from microsoft, word – were introduced in the late s and early s, and publishers immediately appreciated and appropriated their value for manuscript submission, editing, copyediting, and composition (eisenberg, ; thompson, : – ). powerful desktop spreadsheets and database programs became available in that period as well. visicalc, lotus - - , and then, of course, microsoft’s excel permitted rapid calculations to be performed on massive amounts of financial or other quantitative data (power, ); database management programs, such as db , foxbase, oracle, sql server, and others, facilitated the storage, manipulation, and transmittal of large amounts of information within and between organizations.

these general digital tools significantly increased the productivity and efficiency of any kind of business and, in particular, served to protect the narrowing and endangered financial margins of publishers. in the s a new class of software emerged that was publisher specific. these were the powerful page design tools that relieved the production departments of the necessity to lay out every page of a manuscript by hand (both in the initial layout, and then yet again whenever changes were made to the text or to the number and size of illustrations). beginning with aldus pagemaker .
in , and continuing through adobe pagemaker ( ), quarkxpress ( ), and adobe indesign (adams, ), this software permitted text to flow freely into pages of defined dimensions, and to flow automatically around illustrations and around insertions and deletions in the text. the productivity savings for publishers were substantial and invaluable. in terms of clayton christensen’s conceptual framework, digital technology acted at first as a sustaining innovation (christensen, : xviii), propping up the existing publishing system, without threatening existing markets or recruiting new ones.

digital becomes disruptive

however, these digital accessories were only short-term palliatives: the analogue publishing system was already on a fatal slide, while the growing potency of digital tools, and the ever-increasing publishing options made available through digital means and channels, began suggesting the possibility, indeed the inevitability, of a complete role reversal, resulting in a scholarly publishing system in which the scholarly content is overwhelmingly born-digital, then digitally organized, digitally processed, digitally produced, and digitally disseminated (and in which print versions would play, at best, only a supplementary or niche role).

digital technology changed, in the course of only two decades, from a sustaining innovation within the scholarly publishing circuit to a disruptive innovation (christensen, : xvii; christensen and raynor, : – ); from increasing productivity while supporting the traditional values and markets within the legacy print publishing system to an innovation that first suggested, then insisted on, a radically transformed system of scholarly publication, one premised on digitally inspired and digitally mediated resources and perspectives introduced at every juncture of the system, as well as throughout all system flows and outputs.
an unordered and incomplete list of (and comments about) some of the most relevant digital affordances that contributed to this system change includes at least the following.

• the powerful search and discovery tools that digital publication makes possible have done much to accelerate the migration of scholarly publishing from print-based dominance to digital primacy.
• probably too much has been made of the digital capacity to enhance books with audio and visual components. even should this not prove the norm for most digital books, still these options will prove beneficial for many projects, and will create whole classes of publication in which print-only content, if it exists at all, will be considered a diminished or even impoverished version of the book. vectors: journal of culture and technology in a dynamic vernacular and kairos: a journal of rhetoric, technology, and pedagogy are each born-digital, peer-reviewed, multi-media journals that publish major scholarly articles that could be presented in no other format.
• permitting the content, not the traditional print containers, to dictate publication length and format. in the print system, formats can only be chosen from the familiar binary choices: on the one hand, articles (generally of lengths less than pages), aggregated into issues of scholarly journals and sold via a subscription model; on the other, monographs, almost always of lengths greater than and less than or so pages. the exclusivity of those two physical containers was and is entirely determined by the economics, the exchange-value, required by publishers, and not in order to optimize the use-value, for either authors or readers, of an intellectual or scholarly argument or project (esposito and pochoda, ). in the procrustean print system, authors are compelled to fit their argument into the short-form article or the long-form text (itself falling within a limited spectrum of potential lengths).
by contrast, the digital regime, in principle, permits publication in any length and in a wide and expanding variety of digital (as well as print) containers.
• the opportunity, which in many fields of research is becoming the necessity, for generating, curating, archiving, sharing, and disseminating very large and mutating data sets (borgman, : – , ). digital publishing is obviously the only format of choice for such projects (and raises many thorny issues of authorization and accreditation).
• although the first digital copy may not be inexpensive, the marginal cost of additional copies, whether it is one or one million, is essentially zero. so wide digital dissemination is not limited, in principle, by cost, but only by discoverability, internet access, and reader interest. negligible marginal cost also encourages open access (oa) perspectives and implementation, since the cost is bounded whatever the extent of distribution.
• web . makes possible direct and immediate linkages to full-text versions of many of the citations, footnoted sources, and bibliography mentioned in the text, providing considerable assistance to readers.
• web . features a vast range of web-mediated pre-publication and post-publication interactions among the authors, readers, commentators, and editors (as well as interactions between books and other books) (esposito, ; o’reilly, ). scholarly books in such a rich digital publishing ecosystem not only benefit from but may be premised upon community engagement throughout the publishing system, and the books themselves become the catalyst for the coalescence of such communities on either a transitory or more permanent basis (nash, ).
• web . , the semantic web, permits fine-grained algorithmic tracking and data mining of many of the endless uses and interactions, connections and disconnections obtaining among humans and a myriad of digital products.
it maps, for the first time, the complex real-time patterns, rhythms and intensity of actual usage, assesses demographic differences, and minutely tracks the digital interactions between books, readers, authors, and publishers (antoniou and van harmelen, ). thereby extensively informed about reading practices and interactions within the publishing system, we can become better publishers.
• many flavors of do-it-yourself web-based publishing in the scholarly sector (as elsewhere) have already emerged, including academic wikis, blogs, file-sharing applications, listservs, facebook and other social media sites, twitter streams, and much else. currently, such distributed publishing is neither authorized nor accredited in the scholarly system, but challenges to the orthodoxy are beginning to appear from within the scholarly disciplines themselves and from scattered but influential individual faculty members protesting the inflexible application of rigid inherited standards and norms to determine what is a legitimate scholarly contribution in the digital era (cohen, , ; fitzpatrick, ). at first, such publications appeal to a limited niche of academic writers and readers: because they have not cleared the accreditation hurdle, they qualify as christensen’s low-end disruptions (christensen, : , , ; christensen and raynor, : , ), unable to provide some of the primary rewards attached to academic publication. however, as standards of accreditation become more flexible and diverse (which should not be interpreted as the end or even the diminution of scholarly or intellectual standards), these low-end disruptors and disruptions will likely migrate upwards in professional and administrative esteem, taking a place alongside the traditional scholarly productions of university and scholarly presses.
whether or when administrations will accredit any versions or aspects of such publishing remains problematic, but as non-traditional and flexible digital publishing forms and formats become increasingly common, the pressure for concomitant flexibility and pluralism in the associated areas of authorization and accreditation will likely prove irresistible as well. alternative or supplementary forms of authorization, for example public crowd-sourced post-publication assessments, will undoubtedly acquire some degree of academic legitimacy if accompanied by sufficient controls and annotation, and if they are appropriately positioned along a gradient in scholarly import (acord and harley, ).

gearing up digitally

as rapidly as their resources permit, university presses large and small are now digitizing all phases of their operations and installing digital workflow tools, beginning with the ingestion of manuscripts (themselves digitally formatted by authors so that they engage almost seamlessly with the subsequent publishing stages) on through editing, production, and dissemination in a variety of digital formats and digital channels (thompson, : – ). extensible markup language (xml) processing of manuscripts has been pushed ever further up toward the commencement of the publishing process, thereby readying an accepted manuscript for prompt, simultaneous print and digital production. those production processes themselves have been heavily digitized, and are often organized through use of powerful digital production platforms that are efficient and versatile. publishers make use of an increasing array of digital tools to create and disseminate rich metadata; to archive, search, and retrieve content of many kinds; to handle contracts, permissions, rights, and royalties; and to track multiple and complicated workflows throughout all press departments and stages (o’leary, b).
currently, a confusing and inefficient array of digital formats emerges from this processing: pdfs for web use; mobipocket (for amazon); apk (for android); daisy (for vision impaired); and epub and epub for almost everyone else. reflecting an immature industry, channels for digital dissemination are equally diverse and dizzying: through direct to consumer commercial sites (e.g. amazon, apple, barnes & noble); through commercial ebook distributors selling to an array of ‘e-tailers’ (e.g. ingram’s coresource, perseus’s millennium project); through commercial aggregators selling largely to university and public libraries (e.g. ebrary, netlibrary, overdrive, myilibrary); through university press-affiliated aggregators (project muse/upcc, university press scholarship online, books at jstor), as well as directly through individual press web sites to consumers.

similar chaos prevails throughout scholarly publishing. for example, the largest scholarly and scientific presses in the us and europe have leveraged their considerable scale to create impressive aggregated digital packages consisting either exclusively of their own titles or with supplementation from smaller academic presses (kelley, ). unfortunately for users, each comes tethered to its own incompatible or digital rights management (drm)-protected proprietary platform.

the digital soup

however, for all of this churning, experimenting, and expanding digital activity, nothing resembling a viable long-form digital scholarly publishing system exploiting the full potential of digital scholarship and digital dissemination has yet coalesced or even dimly revealed its initial configuration.
by analogy with the biological ‘soup’ that is posited as the incubator of organic life on earth, we have now what resembles a primordial digital scholarly publishing soup – very much on the boil as digital ‘atoms,’ digital affordances, are produced and added at dizzying rates, and digital venues, digital resources, digital workflow tools, digital archives, digital platforms, and digital receptors are connected, combined, and aggregated on the fly into digital ‘molecules,’ while previous attempts at molecular combination and control are already disintegrating.

too little productive or persuasive thought has been given to what the fully digital scholarly system will or should look like; how it will be assembled, coordinated, and networked; what venues inside and outside the university will participate and in what roles; and what will be the underlying foundation, the digital support system or cyberinfrastructure (an integrated suite of digital hardware and software platforms; data storage, retrieval, and sharing sites and channels; and digital social practices and services that will best facilitate collaborative intra-university and inter-university scholarly digital publishing) (atkins et al., ). a higher digital life form, a digital system, cannot emerge from the bubbling digital soup until we have far more developed versions of both a robust digital substructure (in part, the cyberinfrastructure just discussed) and an equally vital digital superstructure (consisting, in part, of born-digital or born-again digital scholars and scholarship, each deeply embedded in digital processes and digital resources).

the epistemic break

even now, however, before more than the most preliminary digital forays are visible, it should be clear that a digital publishing system – either in form or in content; in the containers or the contained; in forms of scholarship or formats of scholarship – will not be simply the print system in digital dress.
what is underway is not just a change in formats and publication processes, but a much more fundamental, ontological, change in what it means to be a participant in a digital as opposed to an analogue system, or, in particular, in a digital scholarly publishing system as opposed to the legacy print system. one elementary clue as to the profundity of this imminent epistemological break is the common observation that roles in digital publishing are already becoming fluid, mutable, and multiple, and that this shape shifting, this newfound disdain for fixed publishing roles, identities, or boundaries, is inherent in the digital system. this is already visible in the trade publishing world where variations of publishing alchemy are displayed on almost a daily basis. before our eyes, publishing roles that have been stable and separate for centuries are suddenly becoming volatile and interchangeable: overnight, authors become publishers and/or distributors; distributors become publishers and literary agents; literary agents become publishers; readers become critics and authors. no traditional publishing role, much less traditional publishing entity, seems stable or settled in the fully digital publishing universe: the digital system, by its nature, empowers its components to shed rigid identities and labels and be not a this or a that but both, and more, simultaneously and sequentially. whereas the analogue system is, in principle at least, deterministic (since the state of each system component can be specified at a given time), the illustration above suggests that the digital system as a whole, as well as in its parts, is stochastic, amenable to specification only within a range of probability at any time.
despite the danger in pressing analogies or metaphors too far (especially into content areas that are far afield), it is tempting to characterize the analogue print scholarly publishing system as newtonian in nature, composed of stable, identifiable, inertial units – whether they be planets, apples, and atoms, or authors, publishers, and bookstores – that interact in regular fashion according to known or discoverable but immutable laws or regularities. in contrast, the digital system is best likened to a quantum system, composed of units, or packets of units, that can be alternate things simultaneously, as an electron or light itself is, in the quantum reckoning, both a particle and a wave. more importantly, the antipathy of the digital system to imposed or inherited limits, its corrosive effect upon artificial or historical boundaries to textual production and consumption, may not be coincidental or contingent but a consequence of its essential nature. in contrast to the analogue publishing system, in which the relationships among the components tend to be linear and limited, the pattern, the solution set, manifested by any ‘digital’ system is not necessarily obvious or easily characterized. the analogue system is relatively static in its components and in its connections, while the digital circuit is in flux in multiple dimensions. it is intrinsically a network of shifting connections, a set of shifting sets, a community of shifting communities.
publishing shrews and dinosaurs
the real dinosaurs suffered evolutionary defeat by shrew-like mammals, in part because their massiveness, their inertia in the face of major environmental trauma, proved to be a fatal liability (dawkins, : – ).
however, the slow biological evolutionary analogy breaks down here, since first-generation publishing shrews can mutate astonishingly rapidly into publishing giants: amazon, google, and apple, currently publishing’s big three dinosaurs, all began as publishing shrews just a few years ago. further, the still flourishing commercial stm dinosaurs have demonstrated how agile they can be not only in creating powerful new digital resources and platforms, but also in almost effortlessly mimicking the shrews by appropriating, for their own benefit, digital developments such as oa that were, in part, designed and promoted to undercut dinosaur domination of the academic marketplace. university presses and/or the publishing-inclined university libraries will, of course, try to protect and expand their home terrain by claiming the disruptive digital technologies and the emerging scholarly digital publishing system for themselves, but whether they have the imagination and the resources to effect and enforce such shrew-like aspirations remains to be seen (pochoda, b). major challenges to the publishing dinosaurs – university dinosaurs as well as trade dinosaurs – will likely arise not from the center of their respective ecosystems but from the periphery. digital shrews, bonding almost genetically to all the free or inexpensive digital publishing-ware that is fast becoming available, and having no emotional or financial bondage to the analogue print system, may find a way to insert themselves into the digital scholarly system by their born-digital ability to truly ‘float like a butterfly, sting like a bee’ in the mutating digital atmosphere and ambience.
some recommendations moving forward
because so much hinges on it, and so much pressure is behind it, that there will be an organized digital scholarly publishing system for monographs as well as for journal articles, different in kind from their print predecessors, is a certainty – although when and even where, much less how, they will coalesce is entirely and predictably obscure as yet. nor, if my arguments or inferences about the nature of the long-form digital system are plausible, will coalescence occur in the manner or in the venues to which we have become accustomed. universities may or may not decide to claim the transformed scholarly publishing system for their own – although i believe that it will be seriously detrimental to scholarship if they do not – but if they do it will be because, at least in part, they appreciate that the publication of knowledge in the digital era will be seamlessly intertwined from the outset with the production of knowledge. that will require major reconsideration of the vexed decisions regarding what level and what kinds of support are necessary to ensure that fundamental university principles of maximizing scholarly production and dissemination are fully realized. while it may be too early to delineate the specifics of the scholarly digital publishing landscape going forward, it seems possible to infer some of its broadest outlines. if anything seems a key to digital scholarly publishing it will be the pervasive active and interactive involvement of diverse scholarly communities, supplementing if not entirely displacing the traditional inflexible, hierarchical systems of authority, authorization, and accreditation. there is both a specific and a general justification for this assertion.
the specific argument hinges on the wide distribution of skills and interests relevant to digital publication throughout the university community, and on the imaginative and disruptive role of smaller, sometimes ephemeral, intra-university communities and individuals in vigorously adopting (or even generating) the many digital publishing affordances that are neither part of legacy publishing practices nor have yet been legitimated by the traditional mechanisms of assessing and rewarding faculty achievement. the general argument – as has been strikingly developed by writers such as david weinberger (weinberger, ), clay shirky (shirky, ), and jonathan zittrain (zittrain, ) – is that in the digital era the crowd is smarter and much more efficient than any of its individual members, or, as weinberger memorably puts it, the smartest person in the room is the room. war, we have been told, is too important to be left to the generals, and as publishing decisions become ever more explicitly engaged with foundational university values, as well as with the digitally oriented scholarship emerging from a wide array of formal and informal university venues, academic publishing in the digital era will be deemed too important to be left to publishers and to uncontrolled market determinations of publishing value. for example, although i believe that there will be a continued need, at least selectively, for dedicated, professional editors operating within university press venues, committed to discovery, recruitment, selection, assessment, remediation, and enhancement of long-form scholarly projects, presses will have to share their inherited academic publishing monopoly with many other university venues and with shifting communities of scholars interacting and publishing through the web.
the expanding range of authorized publications – from formally reviewed projects of widely varying lengths, including vast collectively created and shared datasets and databases, to an increasingly dizzying array of less formal scholarly exchanges – will make the task of locating, sorting, curating, archiving, storing, preserving, and displaying such heterogeneous scholarly productions a task that only academic libraries are experienced and dedicated enough to assume. (however, there is no persuasive theoretical or practical rationale for placing imperialistic academic libraries in charge of these emerging campus-wide digital publishing collaborations.) university it divisions, working closely with the traditional library and press venues, and collaborating, as well, with schools of engineering and computer science, will be crucial in helping create and install the evolving cyberinfrastructure at a campus and inter-campus level (pochoda, ). digitally invested and digitally savvy faculty from an extraordinarily wide range of digital humanities, digital design, and digital performance centers, and a broad spectrum of academic departments, will also be a central part of this collaborative digital publishing effort. the coordination of these elements on a single campus will be immensely complicated, given the range and diversity of the groups and projects involved: how that collaboration is effected, marrying the centralization required for efficiency with the decentralization necessary to incorporate both formal and informal campus venues, will likely be settled differently by each institution.
the university itself, through its administration, will have to be involved, directly or indirectly, to coordinate and fund these distributed campus publishing entities; to provide appropriate accreditation and support for the many emerging forms of digital publication; and, in general, to ensure that the basic university commitment to optimizing the production and dissemination of scholarly materials receives priority over the narrow self-interest of any of the individual publishing venues. what should emerge on each campus is a distinct, heterogeneous publishing ecosystem (itself a participant in a complex national and international publishing network) whose very diversity and permeability ensure a powerful, evolving institutional publishing identity. scale has always mattered in publishing, and scale in digital publishing is no exception. leveraging publishing assets across a campus will produce significant and much-needed economies along with superior performance. this is true whether or not the business model adopted is, as at present, in large part market driven or, more likely, and much more desirably, tipping towards an oa hybrid model, in which the preponderance of publishing expenses are supplied by the home institution or foundation grants, but with ancillary revenue derived through sales of specialized digital formats, print on demand, and other associated publishing activities. the core university commitments to the freest and widest distribution of research, combined with zero marginal cost for digital distribution, make a compelling case for an oa university scholarly publishing model – for monographs as well as journals.
however, such a normative perspective not only begs the question of how the substantial operating costs for digital monograph publication will be paid, but also, as donald waters demonstrates (waters, ), it evades the necessary concern for important issues such as the sustainability of the digital business model and of support for continued innovation in scholarly publishing. reinventing each of the digital affordances, the circuitry, and the cyberinfrastructure on each campus obviously would be inefficient. powerful digital platforms (at least capable of performing labor-intensive production and copyediting tasks, but also able to provide aggregated marketing and sales support), should be shared by many, perhaps all, colleges and universities. such coordination will, at the least, require organization that accommodates the diversity and complexity of the array of participating university publishing communities, while helping effect a system-wide coherence and efficiency. further, only by involving not just hundreds but thousands of institutions of higher education in the planning and the maintenance of publishing activities will costs be equitably allocated, distributed asset use optimized, and an intricate array of publishing needs met (pochoda, a). this explicit, organized, inter-university collaboration represents yet another required level of the impending scholarly publishing ecosystem, encompassing all the others, and itself confronted with the pervasive issue of maintaining system-wide coherence while supporting the rambunctiousness of campus-based publishing units nationwide. the magnitude and the complexity of organizing the digital scholarly publishing system should be neither surprising nor disturbing but refreshing.
the digital system, which will be part of the single biggest disruption in scholarly, much less human, communication since at least gutenberg, years ago, will bring with it fundamental (and many unforeseen and unforeseeable) transformations not only in how and where scholars communicate what they know, but also in what they know, in how they know, and in the ways in which they know. these profound changes awaiting scholarly publishing in the digital era will be an integral part of the much larger digitally driven epistemic transformations of scholarship, of scholarly discourse, and of the academy in general. in the language of seismologists, this is truly ‘the big one’.
acknowledgments
i would like to acknowledge the unusual assistance of nicholas jankowski who solicited, accommodated, edited, and much improved this essay. he was an invaluable editor, particularly for this piece that, with its personal focus and speculative nature, deviates from the style and methodology of most new media & society articles.
funding
this research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
notes
the observations on publishing throughout this essay are based on the cited sources and on many more uncited sources, but rely especially on my more than years of hands-on publishing experience, both in the trade book world (as editorial director of anchor books and dial press at doubleday and publisher of prentice-hall press at simon & schuster) as well as in the university press world (as associate director and editor-in-chief at the university press of new england, and for a decade, until my recent retirement, as director of the university of michigan press).
ump received much attention for being in the forefront of the university press transition from print to digital publishing, for its selective commitment to open access monograph publishing, and for close collaboration with the university library in its publishing activities.
douglas carlston has some revealing stories about the creation of wordstar, visicalc, and lotus - - (carlston, : – , – , – ).
http://vectors.usc.edu/journal/index.php?page=introduction.
http://www.technorhetoric.net/about.html.
brian o’leary (o’leary, a) provides a superb discussion of the ways in which digital publication permits context and content, and not the ‘container’ in which content is embedded, to determine published form and format.
the academic scholarly world is, in principle, much more prepared for some of its leading actors to assume alternate publishing roles, both simultaneously and sequentially, than is the trade book publishing sector. in the scholarly print publishing system, the same faculty member is frequently a monograph author, an external reviewer for one or more presses, a reader of many academic monographs, and a reviewer for scholarly journals. such versatility is relatively widespread among academics, and should certainly abet the scholarly digital transition that facilitates role switching in principle.
the state of an electron or system of electrons in quantum mechanics can only be presented in terms of probabilities, that is, stochastically, and so bears at least a remote correspondence to the digital model as presented above.
john seely brown and paul duguid go even further, and argue that in this stage of the information society, it will be the large organizations, able to command large networks, rather than the fabled garage start-ups, that will be the source of major innovation (brown and duguid, : ).
an interesting recent edition of the scholarly kitchen blog is devoted to a discussion among its principal contributors of just this issue. ‘ask the chefs: “who will win the future – the small, the mid-sized, or the big organization?”’ (scholarly kitchen, ).
in a very recently published article (that i read only as this essay was about to be published), richard lorimer endorses much the same scenario as presented here for the relationship between university presses and academic libraries in the digital publishing era (lorimer, ). i agree with almost all of the points made in this insightful article.
in a very important and much discussed report, “university publishing in the digital age,” done for the ithaka group, the authors, laura brown, rebecca griffiths and matthew rascoff, strongly urge university administrations to play a very active role in scholarly publishing in the digital age (brown et al., ).
eleven provosts of mid-western research universities, all members of the committee on institutional cooperation (cic), recently issued an enlightened statement along these lines. the provosts were expressing their support for the proposed federal research public access act (frpaa) that would mandate oa status after a six-month embargo period for all published articles that had received federal funding support.
however, in addition, the provosts strongly endorsed some broad publication principles to guide their university presses: in particular, ‘ensuring that our own university presses and scholarly societies are creating models of scholarly publishing that unequivocally serve the research and educational goals of our communities, and/or the social goals of our communities.’ in addition, the provosts’ statement supported oa institutional repositories on each campus, and further addressed the developing faculty accreditation issue, in favor of ‘ensuring that promotion and tenure review are flexible enough to recognize and reward new modes of communicating research outcomes.’ ( research university provosts, ). it is noteworthy that the professional association of university presses, the association of american university presses (aaup), issued a statement opposing the frpaa, providing another clear demonstration of how university values and university press values – at least as communicated by the aaup – have significantly diverged in recent years.
clifford lynch, the director of the coalition for networked information, has proposed a ‘future system of many distributed university presses mainly focused on the editorial production of scholarly monographs, supported by a very small number of digital platforms for managing and delivering these monographs as a database rather than transactionally to academic and research libraries’ (lynch, ).
several academic presses – including national academies press, university of michigan press, penn state university press – have, in all or in part of their publishing program, distributed the digital version of monographs free over the web while continuing to charge for the print (or print-on-demand) version and, in some cases, for non-pdf digital versions of the title.
although it is not entirely clear whether or not the free digital distribution cannibalizes print sales, given the relentless and accelerating decline of print sales overall, this approach can at best be of short-term assistance to financially stressed presses.
john willinsky’s important book (willinsky, ), though dealing exclusively with journal publishing, makes a powerful case for open access scholarly publishing in general.
david weinberger provides a compelling perspective on this epistemic shift (weinberger, ).
references
research university provosts ( ) values and scholarship. inside higher ed, february. available at: http://www.insidehighered.com/views/ / / /essay-open-access-scholarship (accessed march ).
acord sk and harley d ( ) credit, time, and personality. new media & society.
adams p ( ) pagemaker past, present and future. available at: http://www.makingpages.org/pagemaker/history/ (accessed december ).
antoniou g and van harmelen f ( ) a semantic web primer. nd ed. cambridge, ma: the mit press.
atkins d, droegenmeier kk, feldman si, et al. ( ) revolutionizing science and engineering through cyberinfrastructure. report of the national science foundation blue-ribbon advisory panel on cyberinfrastructure. available at: http://www.nsf.gov/od/oci/reports/atkins.pdf (accessed november ).
borgman cl ( ) scholarship in the digital age: information, infrastructure, and the internet. cambridge, ma: the mit press.
borgman cl ( ) the conundrum of sharing research data. journal of the american society for information science and technology. available at: http://papers.ssrn.com/sol /papers. cfm?abstract_id= (accessed november ).
brown js and duguid p ( ) the social life of information. boston, ma: harvard business school press.
brown l, griffiths r and rascoff m ( ) university publishing in a digital age. journal of electronic publishing ( ). available at: http://quod.lib.umich.edu/j/jep/ . . ? rgn=main;view=fulltext (accessed november ).
carlston dg ( ) software people. new york: simon & schuster, inc.
carrigan d ( ) commercial journal publishers and university libraries: retrospect and prospect. journal of scholarly publishing ( ): – .
christensen c ( ) the innovator’s dilemma: when new technologies cause great firms to fail. cambridge, ma: the harvard business school press.
christensen c and raynor me ( ) the innovator’s solution: creating and sustaining successful growth. cambridge, ma: the harvard business school press.
cohen d ( ) making digital scholarship count. in: dan cohen’s blog: edwired.org. available at: http://edwired.org/ / / /making-digital-scholarship-count/ (accessed november ).
cohen d ( ) open access publishing and scholarly values. in: dan cohen’s blog. available at: www.dancohen.org (accessed may ); http://www.dancohen.org/ / / /open-access-publishing-and-scholarly-values/ (accessed november ).
darnton r ( / ) what is the history of books? in: darnton r (ed.) the case for books. new york: public affairs, pp. – .
dawkins r ( ) the ancestor’s tale: a pilgrimage to the dawn of evolution. boston, ma: houghton mifflin company.
eisenberg d ( ) word processing (history of). in: encyclopedia of library and information science, vol. . new york: dekker. available at: http://tinyurl.com/ p f e (accessed december ).
esposito jj ( ) the processed book. first monday ( – ). available at: http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / (accessed november ).
esposito jj and pochoda p ( ) publishing through the wormhole: a new format for the born-digital publisher. the scholarly kitchen, april. available at: http://scholarlykitchen.sspnet.org/ / / /publishing-through-the-wormhole-a-new-format-for-the-born-digital-publisher/ (accessed november ).
fitzpatrick k ( ) planned obsolescence: publishing, technology, and the future of the academy. new york: new york university press.
freiberger p and swaine m ( ) fire in the valley: the making of the personal computer. new york: mcgraw-hill.
givler p ( ) university press publishing in the united states. aaup net. available at: http://www.aaupnet.org/about-aaup/about-university-presses/history-of-university-presses (accessed november ).
greenstein d ( ) next-generation university publishing: a perspective from california. journal of electronic publishing ( ). available at: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext;q =pochoda (accessed november ).
kelley m ( ) a guide to publishers in the library ebook market. the digital shift. available at: http://www.thedigitalshift.com/ / /ebooks/a-guide-to-publishers-in-the-library-ebook-market/ (accessed february ).
lorimer r ( ) libraries, scholars, and publishers in digital journal and monograph publishing. scholarly and research communication ( ): . available at: http://www.src-online.ca/index.php/src/article/view/ / (accessed october ).
lynch c ( ) imagining a university press system to support scholarship in the digital age. journal of electronic publishing ( ). available at: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext;q =lynch (accessed november ).
mazuzan gt ( ) the national science foundation: a brief history. available at: http://www.nsf.gov/about/history/nsf /nsf .jsp (accessed december ).
nash re ( ) publishing . . in: speech at the bng technology forum. available at: http://rnash.com/article/the-speech-chris-anderson-of-iwired-i-says-is-the-best-hes-ever-seen-on-boo/ (accessed november ).
o’leary b ( a) context, not container. in: mcguire h and o’leary b (eds) book: a futurist manifesto. sebastopol, ca: o’reilly media. available at: http://book.pressbooks.com/chapter/context-not-container (accessed november ).
o’leary b ( b) tools of the digital workflow. in: mcguire h and o’leary b (eds) book: a futurist manifesto. sebastopol, ca: o’reilly media. available at: http://book.pressbooks.
com/chapter/tools-of-the-digital-workflow-brian-oleary (accessed november ).
o’reilly t ( ) what is web . . o’reilly network. available at: http://oreilly.com/web / archive/what-is-web- .html (accessed: november ).
patten bc and auble gt ( ) system theory of the ecological niche. american naturalist ( ): – . available at: http://www.jstor.org/stable/pdfplus/ (accessed february ).
pochoda p ( ) scholarly publication at the digital tipping point. journal of electronic publishing ( ). available at: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext;q =pochoda (accessed november ).
pochoda p ( a) editor’s note for reimagining the university press. journal of electronic publishing ( ). available at: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext;q =pochoda (accessed november ).
pochoda p ( b) up . : some theses on the future of academic publishing. journal of electronic publishing ( ). available at: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext;q =pochoda (accessed november ).
power dj ( ) a brief history of spreadsheets. in: dssresources.com version . . available at: http://dssresources.com/history/sshistory.html (accessed december ).
samuelson r ( ) the great inflation and its aftermath: the past and future of american affluence. new york: random house.
saunders fs ( ) the cultural cold war: the cia and the world of arts and letters. new york: the new press.
scholarly kitchen ( ) what’s hot and cooking in scholarly publishing. scholarly kitchen, november. available at: http://scholarlykitchen.sspnet.org/ / / /ask-the-chefs-who-will-win-the-future-the-small-the-mid-sized-or-the-big-organization/ (accessed november ).
shirky c ( ) here comes everybody: the power of organizing without organizations. new york: penguin books.
thatcher s ( ) fair use: the double-edged sword. journal of scholarly publishing ( ): – .
thompson jb ( ) books in the digital age: the transformation of academic and higher education publishing in britain and the united states. cambridge: polity press.
waters d ( ) open access publishing and the emerging infrastructure for st-century scholarship. journal of electronic publishing ( ). available at: http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext;q =waters (accessed december ).
weinberger d ( ) too big to know: rethinking knowledge now that the facts aren’t the facts, experts are everywhere, and the smartest person in the room is the room. new york: basic books.
wiener n ( and ) cybernetics: or control and communication in the animal and the machine. nd ed. cambridge, ma: the mit press.
willinsky j ( ) the access principle: the case for open access to research and scholarship. cambridge, ma: mit press.
zittrain j ( ) the future of the internet and how to stop it. new haven, ct: yale university press.
phil pochoda recently retired as director of the university of michigan press. previously, he was associate director and editorial director of the university press of new england; editorial director of anchor books and dial press at doubleday; and vice-president at simon & schuster while publisher and editor-in-chief of prentice-hall press.
outline for aag forum on the cyberinfrastructure for gis-enabled historiography
citation: bol, peter k. . on the cyberinfrastructure for gis-enabled historiography. annals of the association of american geographers , no. : – . published version doi: . / . .
on the cyberinfrastructure for gis-enabled historiography
peter k. bol
harvard university, center for geographic analysis
abstract: from an historian’s perspective the use of giscience and technology in the study of history holds the promise of an integration of historical and geographic modes of analysis. the national geographic information systems that provide extensive coverage of changes in administrative structures over time provide important support for gis-enabled historiography. other parts of the cyberinfrastructure necessary to support collaborative research in a digital environment are now beginning to emerge; however, a world-historical gazetteer, an essential tool for linking historical data to mapped places, has yet to be developed.
history as a record of the past tracks change; history as a discipline turns to new ways of understanding the past. there was once a quantitative turn, a cultural turn, a linguistic turn, and today there is a “spatial” turn.
“spatial turn” (the recognition that knowing where things took place is necessary to understanding what took place) has not yet become as widely used as “cultural turn” but recently it has been outpacing “spatial history” (according to a google ngram search). in fact the spatial turn was announced over a decade ago with anne knowles’ special issue of social science history on historical geographic information systems (gis) (knowles ), followed by successive collections illustrating the application of gis to historical subjects (knowles ; knowles and hillier ). space, place, and landscape have regularly figured in modern thinking about the past (guldi ), although not, it seems to me, with the systematic analysis of the spatial dimensions of change over time that is integral to spatial history (white ). the kind of spatial history that makes use of gis is one particular way of making the spatial turn, a turn that some argue has been taking place across the disciplines (guldi ; warf and arias ). to be clear, this is but one possible marriage between history and geography. the spatial turn also appears in studies focused on “place,” as sites where social processes and events take place and places that serve as the basis for identity, which is constructed through social processes (withers ). a reliance on gis differentiates this kind of historical study from humanistic geography or the geographically engaged research of the “geo-humanities” (dear et al. ). what is clear is that during the last decade we have begun to see major historical studies where the research findings depended on geospatial analysis (ayers and rubin ; ayers ; gordon ; white ). an awareness that change over time happens in many places across space, and that spatial features are not constant through time, is certainly ancient, but geography and history for all their overlaps have remained distinct.
the polymaths strabo and sima qian, to give ancient examples from opposite ends of eurasian landmass, were near contemporaries ( st century bc). strabo is known to us as a geographer (but he was interested in philosophy and astronomy as well and also wrote a work known as the “historical sketches”). sima qian is treated as an historian (but he was also interested in philosophy, astronomy, and geography). they both recognized that life unfolds over time and in many places at once; from a geohistorical perspective it is temporal and spatial. how did they model this (i use model because that is what sima and strabo were in fact doing)? both turned away from the possibility of a quantitative, “scientific” methodology, which was proposed at the time in both places by some scholars who looked to mathematics and astronomy for methods to model space and time. ultimately they went in different directions. strabo devoted years to his geography, in which he described the world known to him through its diverse parts (he included south but not east asia) and sima qian, who gave years to his records of the historian, documented the diversity of the past through court-centric chronologies and extensive biographies of individuals from across the land. the real and difficult challenges of combining time and space, of the map and the chronology, are themselves the subject of research (peuquet ; peuquet ; andrienko et al. ). yet historians and geographers face a similar challenge: they are always engaged in choices of scale. in theory time is infinite and geospace is finite (or historical time is finite and space is infinite), but in practice human historical time and geographical space are finite yet subject to resolution at ever-finer scale. historians and geographers are respectively engaged in representing temporal change and spatial variation, and in simplification, in order to establish analytic clarity. 
a chronology, integral to history, and a map, integral to geography, are useless if perfectly to scale. we choose to highlight what is important: to widen the road so that it is visible, to define certain moments as consequential in order to clarify change and difference. history and geography have been, and remain, separate disciplines, evident in the subtitle of alan baker’s geography and history: bridging the divide (baker ). we can ask why this should be so – which is to ask why methodologies for analyzing change over time and variation through space have diverged so greatly – but i prefer to ask, in the spirit of the sixty sessions on “space-time integration in geography and giscience” at the aag meeting, what the technologies of today allow us to do about it. the great modern advancement of knowledge has been credited to three things: academic specialization, paradigm shifts, and the emergence of new tools. for the moment i am going to stand with the “tool” camp, and suppose that tools that enable us to deal with vast quantities of information, and that allow us to see many times (for historians) and many places (for geographers) at once, affect both specialization and paradigm shifts. gis as a tool, like the telescope and the microscope, allows us to see what we could not see before. historical gis may not bring about the integration of history and geography but it does make it possible for historians to take advantage of some of the accomplishments of giscience to combine variation through space with change over time. learning to apply gis software to historical questions may not be too demanding, thanks to gis textbooks and ian gregory’s guide to historical gis specifically (gregory and ell ), but building comprehensive historical geographic information systems is very challenging and time-consuming, as the review of national historical gis projects in the journal historical geography makes clear (knowles ).
gis gives us the ability to sort out and analyze the ever-increasing amount of digital information with spatial attributes, but without the comprehensive historical geographic information systems that have been and are being constructed it would be impossible for the individual researcher to handle national datasets, visualize and measure the spatial patterns in that data, see those patterns change over time and correlate information from different domains. the barriers for historians in applying gis technology to particular questions are not so much technological as infrastructural. the american council of learned societies (acls) report on cyberinfrastructure for the humanities and social sciences asserted that it was “primarily concerned not with the technological innovations that now suffuse academia, but rather with institutional innovations that will allow digital scholarship to be cumulative, collaborative, and synergistic” (welshons et al. ). how does this apply to using gis in service of history? we can begin with one of the first steps in the research cycle: finding information. gis is about geographic space, but in the historical record “place” not “space” was the focus. populations are clustered in places, people come from places, and postal stations are places in themselves. places are nodes in networks, but the further back we go, the less certain is our knowledge of the precise routes between nodes compared with our knowledge of where the nodes/places were. reliable sources for boundaries only begin to emerge with the spread of modern cartography in the nineteenth century. finding information has more to do with place than with space, very much in the sense of curry’s distinction (curry ).
historical information, particularly for periods lacking cartographic records, is typically associated with places (a person of a certain place, a religious site located at a certain distance from a known place, the tax assessment of a place), although a spatial picture can often be inferred. the interface humans have created between themselves and the physical world, which allows them to position themselves relative to that geography and make it intelligible, to organize knowledge of it, and preserve the memory of their acts within it, is created through the process of naming. naming – of a mountain or a river, town or a building – maintains an intelligible interface between the geophysical world and human culture; the name makes it a place. naming pertains to all aspects of human life. but, like everything in human culture, names are not stable. they are changed, abandoned, forgotten, fabricated. names provide an interface between the historical and the geographic, allowing us to use them to bring the historical and geographic together. if we can capture names in written/drawn sources and locate those on the landscape, then we can locate historical data (tax records, population, religious activities, battles) in space. in this ideal version, in which all temporal data can be linked to places and places to spatial objects, the data from the past become attributes of space and place. the historical record is, ultimately, finite, so it is possible to imagine collecting all names in one giant historical gazetteer that tells us when a name is valid, what system of naming it belongs in, and (we hope) where it is. the gazetteer is fundamental to the geographic ordering of our human past and making it accessible. in practice, however, a gazetteer should be able to accommodate place names (even imaginary and fabulous landscapes) that have sources but lack spatial locations. the united states government is the most important source of geographic names in the world today.
the geographic names information system (gnis) provides over million names for named natural and constructed places (except roads) in the united states (yost and carswell ). the national geospatial-intelligence agency’s geonet names server (gns), the official repository for foreign place names, has over million foreign geographic features, including alternate names and the local vernacular. (national geospatial-intelligence agency ). these invaluable resources, freely available for public use, are under the aegis of the board of geographic names which has ultimate authority over the names included. the greatest non-governmental gazetteer, the geonames geographical database, founded by mark wick, contains over million geographical names and consists of . million unique features whereof . million are populated places and . million are alternate names (wick ). a vital difference with the government gazetteers, whose data it incorporates, is that geonames allows volunteered data and is open to the wisdom of crowds. however, none of these gazetteer databases includes time as an attribute of place. at first glance this may appear to be a minor loss—the proportion of the million daily web services requests to geonames that need dates is probably very small. but the problem of excluding temporal data for names—including contemporary names—has consequences. the lack of a record about when a name is changed or a jurisdictional line redrawn eventually will result in the loss of knowledge about when the attributes of that place name (population, area, etc.) are valid. in the past territorial administrations managed their records through print, archiving past records and thus creating a paper trail; to the degree that an information management system keeps itself up-to-date by overwriting earlier data that information system is sacrificing a longitudinal record to clerical efficiency. 
thus a first-order cyber infrastructural need in integrating history and geography, time and space, is a temporally-enabled gazetteer—in short we need a world historical gazetteer. the challenges here are considerable. names can be linked to geographic locations but sometimes only to areas (within the jurisdiction of, mentioned as being near to). place names change asynchronously. the recoverable begin and end dates are often approximate. there is an enormous amount of information to be had from online resources such as wikipedia in many languages. how do we organize linked data into geohistorical factoids? a world-historical gazetteer is fundamental to historically-conscious spatial research. as humphrey southall has written: “understanding the larger socio-economic challenges facing our society requires a long-term global perspective, but in practice such perspectives are almost impossible to achieve because the necessary datasets are fragmentary or non-existent. all too often, historical research is based on a single country or a small group of advanced economies; or on just the last thirty or forty years. we need to assemble not just historical statistics but closely integrated metadata, including locations and reporting unit boundaries, so that researchers can explore alternative approaches to achieving consistency over space and time without requiring an army of assistants for each new project…existing social science data repositories are insufficiently integrated…an open collaborative approach is essential…geographical information science technologies are necessary…and concepts from other areas of information science are also needed, notably including ontologies and linked data.” (southall et al. ) but what a world-historical gazetteer should contain and how it should be organized is not settled. 
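The requirements named above (a name, the naming system it belongs to, begin and end dates that are often approximate, and a location that may be missing) can be sketched as a small data structure. The following Python sketch is illustrative only: the class `PlaceName`, its fields, the helper `names_valid_in`, and the sample entries are all hypothetical and do not describe the schema of any existing gazetteer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PlaceName:
    """One attested name for a place (hypothetical schema, for illustration)."""
    name: str
    naming_system: str                  # e.g. an administrative hierarchy or a language
    begin_year: Optional[int] = None    # None = begin date not recoverable
    end_year: Optional[int] = None      # None = end date not recoverable, or still in use
    date_tolerance: int = 0             # +/- years, because dates are often approximate
    location: Optional[Tuple[float, float]] = None  # (lat, lon); None if unlocated
    source: str = ""                    # the text or map that attests the name

    def valid_in(self, year: int) -> bool:
        """True if the name may have been in use in `year` (negative = BCE)."""
        after_begin = (self.begin_year is None
                       or year >= self.begin_year - self.date_tolerance)
        before_end = (self.end_year is None
                      or year <= self.end_year + self.date_tolerance)
        return after_begin and before_end


def names_valid_in(gazetteer: List[PlaceName], year: int) -> List[str]:
    """Names that may have been valid in `year`, in gazetteer order."""
    return [entry.name for entry in gazetteer if entry.valid_in(year)]


# Invented sample entries: not real places or dates.
GAZETTEER = [
    PlaceName("Oldtown", "county", 1200, 1500, date_tolerance=10,
              location=(10.0, 20.0), source="tax register"),
    PlaceName("Newtown", "county", 1500, None,
              location=(10.0, 20.0), source="survey map"),
    # Attested in a source but undated and without a spatial location.
    PlaceName("Fabled Isle", "literary", source="travel account"),
]
```

In this sketch an entry with unknown dates is treated as possibly valid in any year, so that names with sources but no dates or locations are still returned; a real system might instead report such entries separately.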
the panels on gazetteers for the space-time symposium at the aag addressed ontologies of place, temporal frameworks for gazetteer elements, the construction of historical and cultural gazetteers, interoperable gazetteers and the spatially enabled web, building world historical gazetteers from historical gis, data models and content standards, and building a temporally enabled global gazetteer. the goals of these sessions, as the principal organizers merrick lex berman and humphrey southall explained, were to evaluate current gazetteers, to consider methods for building temporal/historical gazetteers, and to persuade the agencies responsible for authoritative gazetteer systems to include time as an essential element. the overall aim was to plan for system interoperability between online gazetteers, and to sketch out the right course of development leading to the funding of a true world historical gazetteer system. a number of national historical gis databases provide the kinds of information that a true temporally enabled world gazetteer would need to offer. the great britain historical gis, which covers the last two centuries of administrative units and the relationships between them, is also a sophisticated historical gazetteer ( -; southall ), accessible through the a vision of britain through time website. the great britain historical gis (gbhgis) was first created to enable the longitudinal spatial analysis of demographic data, but precisely because it is also a gazetteer it can link to other data sources: historical maps, election results, and travel writing. similarly the national historical gis was created for the spatial analysis of united states census data - , but its polygons have temporal attributes and could provide us data for a historical gazetteer together with the invaluable print historical gazetteer of the united states (hellmann ).
somewhat similar are the historical gis of belgium, netherlands and germany (dans and universiteit ; kunz, zipf, and böhler -; vakgroep nieuwste geschiedenis/department of modern history at ghent university). in contrast, the china historical gis (chgis), covering bc- ad was created as a time series of administrative entities and major towns and the changing relationships between them from bce to ce. chgis from the start has served as a gazetteer in that the purpose was to provide the points and polygons for places to which scholars could join historical data of their choice (bol ; bol and ge ; bol et al. - ). the aag’s historical gis clearing house and forum provides a listing of historical gis projects and gazetteers (association of american geographers). the cyberinfrastructural challenge is to create either a unified or a federated temporally-enabled multilingual gazetteer system informed by multiple ontologies in different languages that can be sustained over time. this leads directly to a second challenge: populating a world historical gazetteer systematically on a large scale. at first glance the problem is so large that it is hard to say where to begin. there are, i think, two somewhat different starting points: digital texts and scanned, georeferenced maps. the identification of place names appearing in dated texts provides a source authority for a “before” date for a place name. the proprietary metacarta geographic search and referencing platform from qbase appears to be the most sophisticated geo-referencing software, which presumably could be used for the geo-tagging and geo-referencing of historical texts (although with greater degrees of uncertainty as distance from the present increases). nevertheless, identifying all the place names in past writings provides a large amount of raw data; the hope is that their locations can gradually be refined through iterative procedures.
since the use of theodolites in s britain, mathematically accurate maps have accumulated and now cover the entire globe. these maps provide information on routes, boundaries, physical features, and locations that texts cannot provide. for a limited historical period, but one which saw global modern growth at a pace unparalleled in human history, geo-referenced maps allow us to link place names, locations, and time and thus provide a foundation for geo-referencing place names that appear in earlier texts. manual data extraction will always be limited to specific projects; a systematic approach requires the extension of optical character recognition technology to maps. this has largely eluded software engineers but real progress is being made (chiang and knoblock). given software to extract vector and text data from map scans, a third infrastructural challenge follows: creating a system for discovering and accessing geo-referenced map scans. the premier online collection of scanned maps, with over , out of a total collection of over , maps, is the rumsey historical map collection (rumsey - ). of the scanned maps some , have rough geo-referencing of which have been georectified using an average of - control points per map. some universities have larger map collections (harvard has over , items) but none can rival rumsey for digitized maps and geo-referenced maps. university map collections do not necessarily register their entire holdings in electronic catalogs, making a union catalog impossible. given the costs of scanning and geo-referencing the maps in public and private collections, there is a need for a federated system for registering maps that have been scanned or geo-referenced. both raster and vector data belong in a federated geospatial catalog. here there is good news to report.
harvard, mit, and tufts have joined in opengeoportal.org, to create a portal for searching and previewing collections that can be installed on local servers (it has already been adopted by fifteen universities and government organizations). system interoperability between the portals of different collections will enable searching across catalogs. a concomitant to a federated spatial catalog is a system for archiving and searching historical datasets, exactly what spatial historians could join to gis boundary and point files. the center for historical information and analysis, directed by patrick manning at the university of pittsburgh, has launched the world-historical dataverse with the aim of creating such a system and founded the electronic journal of world-historical information (world-historical dataverse). the final piece of cyberinfrastructure is an online platform for sharing spatialized data. here too there has been significant progress. google earth created a foundation of public understanding and an inspiration for further developments aimed at research and teaching. social explorer, led by andrew beveridge, is a proprietary platform with free and subscription editions for the visualization of spatialized data. it includes a wide variety of historical and modern data from the u.s. census, the american community survey, and data on religion. it allows users to create reports, download data in convenient formats, and create a time series of map visualizations (social explorer). esri’s proprietary freeware, arcgis online and arcgis explorer online, are cloud-based geospatial content management systems for storing and managing maps, data, and other geospatial information (esri). they allow users to create and share maps and datasets, manage geospatial content, and control access to volunteered content. another similar system is geocommons developed by the geoiq company, which has now been bought by esri (geoiq).
the center for geographic analysis at harvard is developing the open-source and open- access worldmap platform to lower barriers for scholars who wish to explore, visualize, edit, collaborate with, and publish geographically referenced information. worldmap has an expanding list of functionalities it wishes to add, but at this writing it already allows researchers to upload large datasets and overlay them with their own layers or those shared by others, create and edit maps and link map features to rich media content, grant edit permission to small or large groups, export data to standard formats, georeference paper maps scans online, and publish data to a few collaborators or the world (center for geographic analysis). harvard's instance of worldmap runs on amazon web services, although it can be installed on local servers. it is simple to replicate for organizations that would like to set up their own instances cost effectively. using the harvard instance, users can upload their data for sharing and archiving through harvard, and link to external web services. the great promise of worldmap is that it is cumulative, and this is already being borne out: from its beta release in july of to february , it attracted users from than countries, and its over registered users had contributed data layers and created map collections. the spatial turn in history points toward bringing history and geography together in ways that are changing the ways historians work. much of it takes place in a digital environment; it involves historians with geographers; it requires collaboration between academics, technologists and librarians; and it must be a cumulative enterprise where we all advance by sharing our data. the elements of cyberinfrastructure discussed here will help make large-scale historical gis possible. andrienko, g., n. andrienko, u. demsar, d. dransch, j. dykes, s.i. fabrikant, m. jern, m.j. kraak, et al. . space, time and visual analytics. 
international journal of geographical information science : - . association of american geographers -. historical gis clearing house and forum. http://www.aag.org/cs/projects_and_programs/historical_gis_clearinghouse (last accessed february ). ayers, edward l. . in the presence of mine enemies: war in the heart of america, - . the valley of the shadow project. new york: w.w. norton. the valley of the shadow: two communities in the american civil war. macintosh/windows version. ( computer optical disc). w.w. norton & co., charlottesville, va. baker, alan r. h. . geography and history: bridging the divide. cambridge studies in historical geography . cambridge, u.k.; new york: cambridge university press. bol, peter k. . "creating a gis for the history of china." in placing history: how maps, spatial data, and gis are changing historical scholarship, eds. anne kelly knowles and amy hillier, - . redlands, ca: esri press. bol, peter k., jianxiong ge, merrick lex berman, and zhimin man. - . china historical geographic information system . - . , http://www.fas.harvard.edu/~chgis/ (last accessed february ). harvard university and fudan university. bol, peter k., and jianxong ge. . china historical gis. historical geography : - . center for geographic analysis. worldmap and mapwarper. harvard university, cambridge ma. http://worldmap.harvard.edu/ and http://warp.worldmap.harvard.edu (last accessed february ). chiang, yao-yi, and craig a. knoblock . "recognition of multi-oriented, multi- sized, and curved text." in proceedings of the tenth international conference on document analysis and recognition. http://www.icdar .org/ (last accessed february ). curry, michael r. . toward a geography of a world without maps: lessons from ptolemy and postal codes. annals of the association of american geographers ( ): - . dans, and afdeling geschiedenis van de radboud universiteit . nlgis. http://nlgis.dans.knaw.nl/hgin/home.ctrl (last accessed february ). 
dear, michael, jim ketchum, sarah luria, and doug richardson. . geohumanities: art, history, text at the edge of place. london and new york: routledge. esri. arcgis online and arcgis explorer online. redlands, calif. http://www.arcgis.com/home/ (last accessed february ). geoiq. geocommons. http://geocommons.com/ (last accessed february ). gordon, colin. . mapping decline: st. louis and the fate of the american city, politics and culture in modern america. philadelphia: university of pennsylvania press. great britain historical geographical information system. a vision of britain through time http://www.visionofbritain.org.uk/ (last accessed february ). great britain historical geographical information system. -. great britain historical geographical information system. gregory, ian, and paul s. ell. . historical gis: technologies, methodologies and scholarship. cambridge studies in historical geography . cambridge, uk; new york: cambridge university press. guldi, jo. the spatial turn in history. institute for enabling geospatial scholarship at the scholars' lab at the university of virginia library, apr . available from http://spatial.scholarslab.org/spatial-turn/disciplinary-perspectives/the-spatial-turn-in-history/ (last accessed february ). hellmann, paul t. . historical gazetteer of the united states. new york: routledge. knowles, anne kelly. . past time, past place: gis for history. redlands, calif.: esri press. knowles, anne kelly. . emerging trends in historical gis. historical geography : - . knowles, anne kelly . special issue: historical gis: the spatial turn in social science history. social science history . . knowles, anne kelly, and amy hillier. . placing history: how maps, spatial data, and gis are changing historical scholarship. redlands, calif.: esri press.
kunz, andreas, alexander zipf, and wolfgang böhler. -. historical gis of germany http://www.hgis-germany.de/ (last accessed february ). institut für europäische geschichte mainz. national geospatial-intelligence agency. . geonet names server http://geonames.nga.mil/ggmagaz/ (last accessed february ). peuquet, donna j. . representations of space and time. new york: guilford press. peuquet, donna j. . it's about time: a conceptual framework for the representation of temporal dynamics in geographic information systems. annals of the association of american geographers ( ): - . rumsey, david. -. david rumsey map collection http://www.davidrumsey.com/ (last accessed february ). cartography associates. social explorer. led by andrew beveridge. http://www.socialexplorer.com (last accessed february ). southall, humphrey. . great britain historical gazetteer/gis. university of portsmouth: great britain historical gazetteer/gis. southall, humphrey, p. manning, m. berman, j. gerring, and p. bol. . understanding global change: how best to organize information? university of portsmouth’s research repository. vakgroep nieuwste geschiedenis/department of modern history at ghent university. belgian historical gis. http://www.lokstat.ugent.be/ (last accessed february ). warf, barney, and santa arias. . the spatial turn: interdisciplinary perspectives. routledge studies in human geography. london; new york: routledge. welshons, marlo, et al. . our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences.
new york: american council of learned societies. white, richard. . what is spatial history? stanford university: spatial history lab. white, richard. . railroaded: the transcontinentals and the making of modern america. new york: norton. wick, mark. . geonames. http://www.geonames.org (last accessed february ). withers, charles w. j. . place and the "spatial turn" in geography and history. journal of the history of ideas ( ): - . world-historical dataverse http://www.dataverse.pitt.edu/ (last accessed february ). -. center for historical information and analysis, university of pittsburgh. yost, a.y., and w.j. jr. carswell. . geographic names: u.s. geological survey fact sheet - . http://pubs.usgs.gov/fs/ / / (last accessed february ). correspondence: center for geographic analysis, harvard university, cambridge st., cambridge ma , email: pkbol@fas.harvard.edu.

why the quantitative analysis of diachronic corpora that does not consider the temporal aspect of time-series can lead to wrong conclusions
alexander koplenig
institute for the german language (ids), mannheim, germany

abstract
recently, a claim was made, on the basis of the german google books -gram corpus (michel et al., quantitative analysis of culture using millions of digitized books. science ; : – ), that there was a linear relationship between six non-technical non-nazi words and three ‘explicitly nazi words’ in times of world war ii (caruana-galizia. . politics and the german language: testing orwell’s hypothesis using the google n-gram corpus. digital scholarship in the humanities [online].
http://dsh.oxfordjournals.org/cgi/doi/ . /llc/fqv (accessed april )). here, i try to show that apparent relationships like this are the result of misspecified models that do not take into account the temporal aspect of time-series data. the main point of this article is to demonstrate why such analyses run the risk of incorrect statistical inference, where potential effects are both meaningless and can potentially lead to wrong conclusions.

correspondence: alexander koplenig, institute for the german language (ids) postfach , mannheim, germany. e-mail: koplenig@ids-mannheim.de
digital scholarship in the humanities © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqv

introduction
‘it is fairly familiar knowledge that we sometimes obtain between quantities varying with the time (time-variables) quite high correlations to which we cannot attach any physical significance whatever, although under the ordinary test the correlation would be held to be certainly “significant.” as the occurrence of such “nonsense-correlations” makes one mistrust the serious arguments that are sometimes put forward on the basis of correlations between time-series [. . .] it is important to clear up the problem how they arise and in what special cases.’ (yule, , p. )
‘so-called univariate time-series analysis actually is the analysis of the bivariate relationship between the variable of interest and time.’ (becketti, , p. ).
the idea to quantitatively study ‘the relationship between political regimes and language’ (caruana-galizia, , p. ) is certainly a highly interesting research topic, which became possible with the recent availability of large machine-readable diachronic corpora such as the coha (davies, ) or the google n-gram corpora (michel et al., ). the latter, in particular, received widespread attention, as it reportedly contains roughly % in the version (michel et al., ) and even % in the version of all books ever published (lin et al., ). for example, petersen et al. ( , p. ) reason that observable frequency effects in the google books n-gram corpora ‘during wwii represents a “globalization” effect, whereby societies are brought together by a common event and a unified media’, while bochkarev et al. ( , p. ) argue for a ‘[m]ajor societal transformation’. in a similar vein, michel et al. ( ) try to demonstrate that censorship in those corpora can be detected by measuring changes in the number of times the name of a person is mentioned. this assumption is tempting, but can be contested, because the google books data sets are not accompanied by any metadata regarding the books the corpora consist of, as i try to show in (koplenig, b [to appear]). in a recent paper, caruana-galizia ( ) uses the german google books n-gram corpus to show that there was a linear relationship between six non-technical non-nazi words and three ‘explicitly nazi words’ in times of world war ii. this relationship is used as evidence for a hypothesis made by george orwell ‘that everyday language deteriorates under dictatorships’ (caruana-galizia, , p. ). in this article, i first replicate this result (section ). i then try to demonstrate why such analyses that do not take into account the special nature of time-series data, run the risk of incorrect statistical inference, where potential effects are both meaningless and can potentially lead to wrong conclusions (section ). when one accounts for this problem, the claimed relationship almost disappears entirely (section ).
this article ends with some concluding remarks (section ). to ensure maximal replicability, the appendix contains a stata script (‘do-file’) that automatically downloads the data and reproduces all results presented in this article.

replication of caruana-galizia ( )

to analyze the relationship between six non-technical non-nazi words (demokratie [democracy], freiheit [freedom], frieden [peace], herrlichkeit [glory], gerechtigkeit [justice], and heldentum [heroism]) and three ‘explicitly nazi words’ (rassenschande [racial defilement], halbjude [half jew], and arier [aryan]) in times of world war ii, caruana-galizia ( ) extracts the time-series for each keyword (time span: – ) from the version of the german google books -gram corpus (michel et al., ). figure presents a replication of the (pearson) correlation analysis that caruana-galizia ( ) presents in table . the correlations are comparable for rassenschande and halbjude, but are not identical. to make sure that the analyses presented here are correct, i manually extracted each time-series from the google books -gram corpus and the total -gram frequency for the time span – and recalculated the correlations, with identical differences between my results and the results presented by caruana-galizia ( ). the reason for the difference might be that caruana-galizia ( ) calculates the overall token -gram frequency on the basis of all words that appear in the corpus for each year. for legal reasons, however, n-grams that occur fewer than times in the corpus as a whole are excluded from the google books n-gram corpora (michel et al. b) but are available in the total counts file. nevertheless, this potential difference cannot explain the large discrepancy between the caruana-galizia ( ) results and the results presented here for the keyword arier. for example, caruana-galizia ( ) finds a correlation between arier and herrlichkeit of r = . . in my analysis, this correlation is virtually nonexistent (r = . ).
while it is hard to speculate about potential reasons for this difference, the main problem of an analysis such as the one caruana-galizia ( ) conducts is that it does not take the special nature of temporally ordered data into account. in the next section, i will outline the problem and explain why it also matters in the analysis i replicated here.

to demonstrate why i believe that the validity of such an analysis can be questioned, i use three additional time-series. the first two time-series are the frequency profiles of two word types related to switzerland. in koplenig ( b [to appear]), i adapted a method for the measurement of synchronic corpus (dis-)similarity put forward by kilgarriff ( ) to reconstruct the composition of the german corpus in times of world war ii. in the absence of information about the texts that the german google books corpus comprises, this analysis supports the argument that the corpus was strongly biased toward volumes published in switzerland during world war ii. the two word types that contribute most to the calculated difference are zürich [zurich] and schweiz [switzerland]. the frequency profiles of those two words were extracted in the same way as the other keywords. the third time-series is a simulation of a random walk (henceforth randomwalk) with drift (hill, ; becketti, , p. / ; koplenig, c), where the value $x_t$ at time point $t$ is given as:

$x_t = c + x_{t-1} + e_t$

where $c$ denotes the positive drift constant and $e_t$ is normally distributed in the interval [ , ]. this means that the resulting time-series x has an average upward trend, but otherwise behaves in a completely random manner.

the problem: pearson correlation and non-stationarity

the statistical analysis of time-series, that is, data with a natural temporal ordering, is special. in fact, it is so special that most of the classic statistical tools of data analysis cannot be used directly.
in many situations, this has to do with the sequential dependence of observations and with the fact that the variable which is measured at successive moments in time exhibits an upward or downward trend. the resulting series is said to have a unit root or to be non-stationary (becketti, , pp. – ). regressing one non-stationary time-series on another non-stationary time-series leads to a spurious model, where the variables look highly correlated but are not related in any substantial sense (granger and newbold, ; koplenig, c). there are formal ways to test for unit roots; the classic one is the augmented dickey-fuller test (becketti, , ch. . ). i chose this one, since caruana-galizia ( , fn. ) also used it in later analyses. table lists the (mackinnon approximate) p-values for each case.

fig. . replications of the correlation analysis of caruana-galizia ( , p. table ). the figure shows pearson correlations between the nazi words and the selected keywords in the time span – on the basis of the google books german -gram corpus (version ).

the null hypothesis states that the respective time-series has a unit root, that is, that it is non-stationary. for arier the null hypothesis of a unit root can be rejected at p < . . however, there is also virtually no correlation between this word and any of the keywords (cf. fig. ). for all other keywords, including the three additional words, the null hypothesis of a unit root cannot be rejected because the p-value is greater than . . this result suggests that the time-series for those words are non-stationary.

why is this problematic? basically, the pearson product–moment correlation coefficient is the covariance of two variables x and y scaled to the interval [−1, 1].
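written out for two series observed at time points t = 1, ..., n, this definition takes the following form. it is the standard sample formula; the symbols $\bar{x}$, $\bar{y}$, $s_x$, $s_y$, and $n$ are notation assumed here for illustration, not taken from the article.

```latex
r_{xy} = \frac{\operatorname{cov}(x, y)}{s_x \, s_y}
       = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y})}
              {\sqrt{\sum_{t=1}^{n} (x_t - \bar{x})^2}\,
               \sqrt{\sum_{t=1}^{n} (y_t - \bar{y})^2}}
```

each term of the numerator is positive whenever $x_t$ and $y_t$ lie on the same side of their respective means, and negative otherwise.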
now, if we have two series x and y that both have an upward trend, then by definition, the following statement is true for both series: values that are later in time will tend to lie above the mean of the series, while values that are earlier in time will tend to lie below it. since the covariance measures whether values of x that are above/below average tend to co-occur with values of y that are above/below average, then by mathematical necessity, the correlation coefficient will be high even when the two series are not related in any substantial sense (granger and newbold, ). thus, for two trending time-series, the pearson correlation only measures the fact that the two series are trending.

figure presents four plots that all document an apparent linear relationship. to visualize why i believe that the problem described above is also present in the analysis of caruana-galizia ( ), the observed values are colored by decade, with earlier decades colored in lighter shades of gray and later decades colored in darker shades of gray (as indicated by the color bar at the bottom of the figure). plot a replicates the findings of caruana-galizia ( ) for the nazi word halbjude and the keyword frieden. at first glance, there seems to be a positive correlation between the time-series of both words, as argued by caruana-galizia ( ). however, the color pattern reveals that this could be the result of a spurious model: values for later decades (dark shades of gray) strongly influence the apparent relationship. this can be best understood if we have a look at plot b, which shows the relationship between randomwalk and the keyword demokratie. again, the apparent correlation (r = . ) is the result of a misspecified model, with values for later decades strongly influencing the result. it is worth pointing out again that the randomwalk series has an average upward trend, but behaves completely randomly apart from that.
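the argument can be checked with a small simulation. the following is a minimal python sketch, not the stata do-file from the article's appendix: it generates two random walks with drift independently of each other and correlates their levels. the drift of 0.5, the unit error standard deviation, the series length of 200, and the seeds are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def random_walk_with_drift(n, drift=0.5, sigma=1.0, seed=None):
    """Return n values of x_t = drift + x_{t-1} + e_t with e_t ~ N(0, sigma^2).

    The default drift and sigma are illustrative assumptions.
    """
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n - 1):
        series.append(drift + series[-1] + rng.gauss(0.0, sigma))
    return series


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Two series generated independently: they share an upward drift, nothing else.
walk_a = random_walk_with_drift(200, seed=1)
walk_b = random_walk_with_drift(200, seed=2)

r_levels = pearson(walk_a, walk_b)
print(f"correlation of the levels: {r_levels:.2f}")
```

because the two simulated series have nothing in common except their upward drift, a high coefficient here reflects the shared trend alone.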
so, what other explanation for the observed correlation could there be apart from a spurious model? plot c shows the relationship between the nazi word rassenschande and zürich. again, it is hard to come up with an explanation for this result other than a misspecified model. in plot d the relationship between schweiz and zürich is depicted. while, as argued below, the time-series of both words are indeed related, the very strong linear relationship (r = . ) is the result of the fact that both series are trending, as indicated by the color pattern.

in the next section, i will outline a procedure to account for this problem and show that this procedure strongly affects the results of an analysis like the one conducted by caruana-galizia ( ).

table . augmented dickey-fuller tests for unit roots. for each word, the test was run for a lag length of .

keyword          p-value
arier            .
halbjude         .
rassenschande    .
demokratie       .
freiheit         .
frieden          .
herrlichkeit     .
gerechtigkeit    .
heldentum        .
zürich           .
schweiz          .
randomwalk       .

fig. . linear relationship in the time span – between halbjude and frieden (a), randomwalk and demokratie (b), rassenschande and zürich (c), and schweiz and zürich (d). in each case, a positive linear correlation is found, as indicated by the dashed line (pearson correlation coefficients are shown in the bottom right corner of each plot). additionally, the observed values are colored by decade, with earlier decades colored in lighter shades of gray and later decades colored in darker shades of gray (as indicated by the color bar at the bottom of the figure). the fact that there is an obvious color pattern in all four plots (with later decades having most influence on the apparent relationship) supports the claim of a spurious result in each case. note: word frequencies are relative per million words.

accounting for autocorrelation questions apparent effects

instead of comparing the actual time-series, one can take the first differences of the variables involved to induce (weak) stationarity. put differently, instead of comparing actual values of the series, period-to-period changes are correlated. the rationale of this procedure is simple: if we compare the differences of two time-series x and y, a strong positive correlation implies that period-to-period changes that are above/below the average for x correspond mainly to changes that are above/below the average for y. it is noteworthy that this procedure seems to be better suited to answer a research question like the one caruana-galizia ( ) tries to answer: if the relative use of a nazi word increases from last year to this year, then, on average, the relative use of one of the keywords should also increase from last year to this year if both words are related. table demonstrates that this procedure results in weakly stationary series for all keywords, except for demokratie. for this series, it might be appropriate (or necessary) to difference the difference (second-order difference). however, since the correlation analysis presented below shows that even under the assumption of non-stationarity, demokratie does not correlate with any of the three nazi words beyond random fluctuations, this option is not pursued any further.

in fig. , year-to-year changes are correlated instead of actual levels for the selected words. this procedure strongly counters the analysis of caruana-galizia ( ): only rassenschande and heldentum are positively correlated, while most of the correlations are now negative and/or virtually nonexistent. figure modifies the analysis of fig. by correlating year-to-year changes. the fact that, compared to fig. , the color pattern is less obvious supports the claim that the procedure of taking first differences helps to solve the problem of non-stationarity. correspondingly, there is no linear relationship between year-to-year changes of the randomwalk series and year-to-year changes of the keyword demokratie (plot b). the only noteworthy linear relationship remains between schweiz and zürich (plot d).

fig. . replications of the correlation analysis of caruana-galizia ( , p. table ). the figure shows pearson correlations between the first differences of the time-series for each word.

table . augmented dickey-fuller tests for unit roots. for this analysis, the first differences for each time-series were used. for each word, the test was run for a lag length of .

keyword          p-value
arier            .
halbjude         .
rassenschande    .
demokratie       .
freiheit         .
frieden          .
herrlichkeit     .
gerechtigkeit    .
heldentum        .
zürich           .
schweiz          .
randomwalk       .

on a more general level, i believe that it is important not to forget that ‘[v]isual inspection plays a key role in time-series analysis’ (hamilton, , p. ; cf. also becketti, , ch. ). to this end, fig. plots the time-series for each combination of words that were presented in figs and . in accordance with fig. , this visual inspection clearly demonstrates that only the time-series for schweiz and zürich seem to behave in a similar way (plot d), while all other plots do not indicate a relationship in any substantial sense.

fig. . linear relationship in the time span – between the first differences for halbjude and frieden (a), randomwalk and demokratie (b), rassenschande and zürich (c), and schweiz and zürich (d). the depicted information is described in fig. . the fact that, compared to fig. , the color pattern is less obvious supports the claim that the procedure of taking first differences helps to solve the problem of non-stationarity.
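the effect of differencing can be illustrated with a toy example. the sketch below is a minimal python illustration under stated assumptions, not the article's analysis: the two short series are invented for this purpose, and the helper names are hypothetical. both series rise over time, but whenever one takes a large step the other takes a small one.

```python
import math


def first_differences(series):
    """Period-to-period changes: d_t = x_t - x_{t-1}."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented toy data: a shared upward trend, but opposite step patterns.
x = [1, 3, 4, 6, 7, 9]
y = [2, 3, 5, 6, 8, 9]

r_levels = pearson(x, y)
r_changes = pearson(first_differences(x), first_differences(y))

print(f"correlation of the levels:            {r_levels:.2f}")
print(f"correlation of the first differences: {r_changes:.2f}")
```

correlating levels rewards the shared trend, whereas correlating first differences asks whether the period-to-period changes move together, which in this constructed case they do not.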
however, the only noteworthy linear relationship remains between schweiz and zürich (plot d).

concluding remarks

the main point of this article was to demonstrate why an analysis of diachronic data that does not take the temporal aspect of time-series data into account runs the risk of incorrect statistical inference, where apparent effects are meaningless and can therefore lead to wrong conclusions. to this end, i replicated the result of caruana-galizia ( , p. ), who argues that six non-technical non-nazi words are highly correlated with explicitly nazi words in order to test a hypothesis by george orwell, who argues that ‘ordinary language deteriorates under dictatorship’ (caruana-galizia, , p. ). i hope that the reanalysis presented in this article shows that this result can (or has to) be questioned. in a similar vein, frimer et al. ( ) claim that there is a linear relationship between the level of prosocial language and the level of public disapproval of us congress. again, a reanalysis casts doubt on this apparent relationship by demonstrating that it is the result of a misspecified model that does not account for first-order autocorrelated disturbances resulting from non-stationarity (koplenig, a).

conversely, i believe that the use of more appropriate tools for the analysis of time-series data can help the digital humanities to uncover the ‘true’ and sometimes potentially even more interesting mechanism of how particular systems or institutions work, as i have argued elsewhere (koplenig, b [to appear]).

fig. . time-series plots for the examples presented in figs and . for each plot, the time-series of the first word is placed on the left y-axis and colored in black, while the time-series of the second word is placed on the right y-axis and colored in gray. each time-series was smoothed using a simple weighted moving average with a -year window centered on the current frequency.

acknowledgments

i thank carolin müller-spitzer, sascha wolfer, and martin hilpert for valuable comments on earlier drafts of this article, sarah signer for proofreading, and one anonymous reviewer for her/his valuable comments. all remaining errors are mine.

references

becketti, s. ( ). introduction to time series using stata. st ed. college station, tx: stata press.

bochkarev, v., solovyev, v. and wichmann, s. ( ). universals versus historical contingencies in lexical evolution. [online]. http://wwwstaff.eva.mpg.de/% ewichmann/lexevoluploaded.pdf (accessed june ).

carmody, s. ( ). ngramr: retrieve and plot google n-gram data. [online]. http://cran.r-project.org/web/packages/ngramr/index.html (accessed april ).

caruana-galizia, p. ( ). politics and the german language: testing orwell’s hypothesis using the google n-gram corpus. in: digital scholarship in the humanities [online]. http://dsh.oxfordjournals.org/cgi/doi/ . /llc/fqv (accessed april ).

davies, m. ( ). the corpus of historical american english: million words, – . [online]. http://corpus.byu.edu/coha/ (accessed october ).

frimer, j. a., aquino, k., gebauer, j. e., and zhu, l. (lei), et al. ( ). a decline in prosocial language helps explain public disapproval of the us congress. proceedings of the national academy of sciences : – .

granger, c. w. j. and newbold, p. ( ). spurious regressions in econometrics. journal of econometrics : – .

hamilton, l. c. ( ). statistics with stata: updated for version . th ed. boston, ma: brooks/cole, cengage learning.

hill, r. c. ( ). principles of econometrics [online]. http://www.principlesofeconometrics.com/poe /poe do_files/figure - .do (accessed june ).

kilgarriff, a. ( ). comparing corpora.
international journal of corpus linguistics : – .

koplenig, a. ( a). autocorrelated errors explain the apparent relationship between disapproval of the us congress and prosocial language. [online]. http://hdl.handle.net/ / - e-f b -e - a - (accessed june ).

koplenig, a. ( b). the impact of lacking metadata for the measurement of cultural and linguistic change using the google ngram datasets – reconstructing the composition of the german corpus in times of wwii. digital scholarship in the humanities, oxford: oxford university press, .

koplenig, a. ( c). using the parameters of the zipf–mandelbrot law to measure diachronic lexical, syntactical and stylistic changes – a large-scale corpus analysis. corpus linguistics and linguistic theory [online] . http://www.degruyter.com/view/j/cllt.ahead-of-print/cllt- - /cllt- - .xml (accessed april ).

lin, y., michel, j.-b., aiden, l. e., orwant, j., brockman, w., and petrov, s. ( ). syntactic annotations for the google books ngram corpus. proceedings of the th annual meeting of the association for computational linguistics, jeju, republic of korea, pp. – .

michel, j.-b., shen, y. k., aiden, a. p., veres, a., gray, m. k., the google books team, pickett, j. p., hoiberg, d., clancy, d., norvig, p., orwant, j., pinker, s., nowak, m. a., and aiden, e. l. ( ). quantitative analysis of culture using millions of digitized books. science : – . [online pre-print: – ].

petersen, a. m., tenenbaum, j. n., havlin, s. and stanley, h. e. ( ). statistical laws governing fluctuations in word use from word birth to word death. scientific reports [online] . http://www.nature.com/doifinder/ . /srep (accessed march ).

yule, g. u. ( ). why do we sometimes get nonsense correlations between time series? a study in sampling and the nature of time series. journal of the royal statistical society : – .

notes

as an aside: the terminology of caruana-galizia ( ) is somewhat unclear. in fig. ( , p. ; see also p.
) he says that the plot shows the ‘[p]roportion of german books containing keywords, – ’. this would mean that he uses the relative number of books that contain one of the keywords per year. on page , however, he states that ‘these correlations show us that when the relative use of an explicitly nazi word increases, so did the keywords’ (my emphasis, see also p. ). this in turn would mean that he uses the relative token frequency of a keyword per year. both types of information are available in the google books corpora; the data sets are freely available here: http://storage.googleapis.com/books/ngrams/books/datasetsv .html (last accessed april ). to find out which information caruana-galizia ( ) actually uses, i compared the results he presents in table with the original data. this shows that he seems to have used ‘the relative token frequency’. on this basis, i replicated the analysis. the relative token frequency per year is calculated by dividing the absolute token frequency by the total number of -grams. this information is available here: http://storage.googleapis.com/books/ngrams/books/googlebooks-ger-all-totalcounts- .txt (last accessed / / ). in addition, a replication of the results with r using the ngramr package (carmody, ) yields identical results. i would like to thank my colleague sascha wolfer for running this analysis.

this comes as a bit of a surprise since he deals with this problem in further analyses he presents in his article (cf. footnote ).

of course, it does not follow from this that other results presented in caruana-galizia ( ) have to be challenged, too.
however, the autoregressive integrated moving average (arima or armax) models he uses in order to predict the relative frequency of a keyword on the basis of the polity score (a measure of the level of democracy; the data are available here: http://www.systemicpeace.org/inscrdata.html, last accessed on april ) are quite sophisticated, and fitting such a model requires several conceptual decisions regarding the appropriate arima structure that depend on each respective time-series (becketti, , ch. ). caruana-galizia ( , p. ) uses only one arima model specification for the time-series of all six keywords. to see why this is rather problematic, i ran separate arimas of heldentum and zürich on the polity score (the code that replicates this analysis can also be found in the appendix). a look at the autocorrelations and partial autocorrelations of a regression of the first difference of the heldentum series on the first difference of the polity score shows that the residuals have one autoregressive lag and two lags of moving averages. an arima model with robust standard errors yields an insignificant negative effect (p = . ) of the first difference of the relative frequency of heldentum on the first difference of the polity score. however, fitting the same model with lags of autocorrelations and lag of moving averages yields a significant negative effect (p = . ), but the fact that it requires many iterations to converge indicates that the model is misspecified. in a similar vein, we can check the autocorrelations and partial autocorrelations and then fit an arima of the first difference of zürich on the first difference of the polity score and include two lags of autocorrelations and one lag of moving averages. this yields an insignificant negative effect (p = . ) of the first difference of the polity score on the relative frequency of zürich.
if we fit the model again and include nine lags of autocorrelations and two lags of moving averages, then we obtain a significant negative effect (p = . ), again with many iterations to converge. these differences demonstrate why it is very difficult to choose the ‘best’ model specification in time-series analysis. that is why becketti ( , p. , my emphasis) issues a warning: ‘[t]ime-series analysis provides powerful tools for revealing patterns and relationships in data, but the best statistical techniques can only bound, but not eliminate, the irreducible uncertainty we face when analyzing data. [. . .] there is no substitute for a thoughtful approach to time-series analysis informed by deep subject-matter knowledge and willingness to apply rigorous tests to every estimate’. i believe that the analyses of caruana-galizia ( ) would certainly benefit from the identification of an appropriate arima structure for ‘every’ keyword.

assessing scholarly multimedia: a rhetorical genre studies approach

cheryl e. ball, illinois state university

technical communication quarterly. publisher: routledge. available online: dec .
to cite this article: cheryl e. ball ( ): assessing scholarly multimedia: a rhetorical genre studies approach, technical communication quarterly, : , -
to link to this article: http://dx.doi.org/ . / . .

this article describes what scholarly multimedia (i.e., webtexts) are and how one teacher-editor has students compose these texts as part of an assignment sequence in her writing classes. the article shows how one set of assessment criteria for scholarly multimedia, based on the institute for multimedia literacy’s parameters (see kuhn, johnson, & lopez, ) for assessing honor students’ multimedia projects, is used to give formative feedback on students’ projects.

keywords: assessment, dynamic criteria mapping, genre, kairos, scholarly multimedia, values, webtexts

scholars in digital writing studies have been publishing webtexts since at least , when kairos: a journal of rhetoric, technology, and pedagogy was first published. kairos’s mission is to offer scholars a place to transfer their knowledge of linear, print-based, academic writing into multimedia-based scholarship that enacts the author’s argument.
in other words, authors compose the equivalent of a peer-reviewed article for kairos, but instead of relying only on words (and maybe a few figures), they use whatever media and modes of production they need, such that the media and modes complement, if not create, the point the author wants to make. as editor of kairos, i see on an everyday basis how form and content are inseparable in authors’ scholarly multimedia, an important concept for students to learn and practice in an age when multimedia is ubiquitous. based on my editorial experience with kairos, i teach students at illinois state university to read, analyze, and assess authors’ scholarly multimedia projects as well as to propose, compose, revise, and peer review their own webtexts, which they can submit to peer-reviewed venues such as kairos, c&c online, x=changes, and the jump (journal for undergraduate multimedia projects).

technical communication quarterly, : – , copyright © association of teachers of technical writing. issn: - print/ - online. doi: . / . .

what is scholarly multimedia and how does it work? a primer

the webtexts that journals like kairos publish have been called new media scholarship, born-digital scholarship, scholarly multimedia, digital media scholarship, digital scholarship, and many other names. although i may have coined the term new media scholarship in the past (ball, ), i also see that term’s limitations for current and future use in that the term does not explicitly point to the multimodal nature of the texts under discussion here and may wrongly imply (as shipka, , stated) that multimodal has to be digital, which is not true, except in the case of webtexts. thus, although i use terms depending on my audience (e.g., funding agencies like digital scholarship and scholarly multimedia), in most cases here i use the term scholarly multimedia when i need to emphasize the multimodal nature of this scholarship.
at other times in this article, i prefer to use the term that kairos itself uses: webtexts. this term also has the benefit of being much simpler and easier to say. i vacillate between these terms here.

when teaching multimodal composition to students who have never heard any of these phrases before, i begin by defining the kinds of texts we are focusing on. until one answers the what, one cannot answer the how (do we assess?). scholarly multimedia are article- or book-length, digital pieces of scholarship designed using multimodal elements to enact authors’ arguments. they incorporate interactivity, digital media, and different argumentation strategies, such as visual juxtaposition and associational logic (see purdy & walker, in press), and are typically published in online, peer-reviewed journals (e.g., kairos, c&c online, vectors) and presses (e.g., computers and composition digital press). scholarly multimedia cannot be printed and still retain the author’s argument because such texts are composed of web pages with links, animations, images, audio, video, scripting languages, databases, and other multimedia and interactive elements, including but not limited to written text.

to show what scholarly multimedia looks like and how it functions, see figure , which includes semirepresentative screenshots (i.e., they do not show the interaction, animation, or audio, if there were any in this piece) from a recently published webtext in kairos. obviously, these screenshots do not look like typical scholarly articles. the one on the left includes the webtext title, author’s name, and a splash page that graphically represents the navigation system that the author, susan delagrange ( a), used throughout. in this webtext, the navigation system draws on visual and experiential metaphors of wunderkammern, or curiosity cabinets, on which the author’s argument is based. delagrange explained the four major sections of this flash-based piece in a preview node.
i quote it here at length to explain how the design of the piece enacts the author’s argument:

in “wunderkammer,” i argue that these th-century cabinets of wonder are models of visual provocation in which objects were manipulated and arranged in order to discover new meanings in their relationships. “visual analogy” expands the concept of arrangement as heuristic, because analogy is a trope that lends itself particularly well to the discovery of unexpected affinities in the juxtaposition of seemingly disparate objects (and ideas). “joseph cornell” explores the mobile assemblages of th-century artist and bricoleur joseph cornell, whose refined use of repetition and small variation predicts the epistemic possibilities of st-century interactive digital media.

figure . susan delagrange’s ( a) webtext, “wunderkammer, cornell, and the visual canon of arrangement” juxtaposes visuals, sometimes using animation to superimpose visuals on written text, to show the power of invention across multiple modes. (this figure is available in color online.)

the last section of the piece is called “praxis,” and the author explains her motivation for this webtext by describing how her praxis is connected to the theoretically supporting sections that come before it:

much of my current digital media work with undergraduates at ohio state involves using the techné of visual arrangement described here as a heuristic to shape nuanced proposals for the use of urban space. the “praxis” section of this article describes that work in more detail. the intervening sections develop a rationale for this pedagogy.

thus, delagrange’s self-described purpose in this webtext is to develop a rationale for teaching visual arrangement as a heuristic. however, one need not read the (literal) writing on the wall of this wunderkammern to understand delagrange’s scholarly aim.
the design of the webtext argues just as much as any linguistic text does: a reader must engage with the wunderkammern on the opening page to read the piece, and the reader can click on any of the thumbnail images (see screenshot in left column of figure ) to proceed to a node (or page) that displays an animation and a chunk of written text (see screenshot in right column of figure ), both of which work together to make delagrange’s argument. as kress ( ) has said, “design is the servant of rhetoric—or, to put it differently: the political and social interests of the rhetor are the generative origin and shaping influence for the semiotic arrangements of the designer” (p. ), which, in delagrange’s case, means she has purposefully arranged the webtext’s multimodal, semiotic elements to serve the political and social interests of her argument. further, she accomplished this task with the aid of peer reviewers and editors, and the piece has been published in a venue respected for scholarly multimedia, so we as readers should assume that each design element belongs, is purposeful, and works to make an argument. we just need to figure out what that argument is. other texts (ball, , ; ball & arola, ) have described what such a reading strategy might look like, so i will proceed with the point of this article: how to ask students to compose scholarly multimedia and how to assess their work.

readers may be expecting me to provide a transferable rubric for reading, analyzing, assessing, grading, or evaluating scholarly multimedia—particularly a rubric that could be useful for tenure and promotion purposes.
i hope readers keep in mind that each of these interpretive and evaluative verbs (reading, grading, assessing, evaluating) indicates a different audience—randomly and overlapping: pleasure readers, students, scholars, hiring committees, tenure committees, teachers, and authors—each of which has different needs from, and comes to the reading experience with different value expectations of, such a piece of scholarship. i would like to say that the criteria i discuss in this article would serve all those readers’ needs, but they likely will not, and i offer this practice with the caveat that i have used it only in a handful of classroom settings for one specific kind of assignment sequence, which i discuss below.

a webtextual assignment sequence

the major project that i assign students in multimodal composition courses is to compose a webtext, which can include many possible genres, technologies, media, and so forth, but will always be scholarly-creative and aimed at an academic audience. i basically ask students to compose webtexts for possible submission to a journal like kairos, and we spend the entire semester discussing the rhetorical, technological, ideological, institutional, professional, social, and other issues that arise when one chooses to undertake such a task. i use a similar assignment for both undergraduates and graduate students, tailoring the details of each issue (above) to the audience. both groups share one quality with a majority of kairos authors: they are composing scholarly multimedia for the first time. these three groups (undergraduates, graduate students, and first-time kairos authors) are all developmental writers in the sense that they are not yet confident or do not yet have expert technological, multimodal, or rhetorical abilities.
the assignment sequence for their webtext projects includes a genre set that starts the semester with their own reactions to others’ webtexts and ends with their telling me what they learned about multimodal composition and how they can transfer that rhetorical, technological, and multimodal knowledge to other writing situations. the cumulative assignments for the webtext project can include the following:

1. reading responses to published webtexts
2. values-based analysis of digital media texts and webtexts
3. audience and venue analyses
4. genre analyses of webtexts
5. review presentations of technologies available for composing webtexts
6. project pitches
7. proposals to flesh out the project idea
8. storyboards and scripts
9. workable or rough drafts
10. peer review of classmates’ rough drafts
11. annotated versions of peer-review letters
12. completed webtexts.

i will not detail all of these assignments in this manuscript, partly because some of these assignments will be familiar to readers who teach any kind of writing and partly because i want to focus on how the values-based analysis guides most of the assessment practices throughout the semester. this values-based analysis might be better known to writing studies scholars as dynamic criteria mapping (dcm; broad, ), the outcome of which has been dubbed by students in my classes as “kuhn + 2.”

building webtext assessment criteria

despite my editorial familiarity with assessing webtexts, i realized through teaching this assignment that kairos has no standard set of criteria that the editorial board uses to evaluate webtext submissions. in some ways (that i do not discuss here), that lack of criteria is purposeful. however, when teaching webtext production, i needed to push my assessment methods beyond my initial i-know-it-when-i-see-it brand of evaluating scholarly multimedia.
i found what i needed in several locations, including a methodology for combining several assessment methods in broad’s ( ) what we really value: beyond rubrics in teaching and assessing writing. in this book, broad explained the use of dcm as a method of articulating values that assessors (sometimes unknowingly) use in evaluating writing portfolios. from that data, readers can create a set of assessment criteria that can be used heuristically. this method is transferable to any number of compositional situations, including scholarly multimedia. (for other examples, see broad, adler-kassner, alford, et al., .)

a few semesters ago, i began to build this set of criteria by asking students to articulate the criteria that they valued (or did not value) in digital media texts. to their criteria, i added three rubrics for assessing scholarly multimedia as a particular subset of digital media texts that students would need to become familiar with in the class. the three rubrics included

1. warner’s ( ) assessment tool for evaluating webtexts
2. kairos’s peer-review criteria written for the manifesto issue (dewitt & ball, )
3. kuhn’s ( ) “the components of scholarly multimedia,” elaborated on in kuhn, johnson, and lopez’s ( ) follow-up piece, “speaking with students: profiles in digital pedagogy.”

i will briefly address each of these rubrics, focusing in particular on kuhn’s criteria, as that formed the basis for the short list of criteria that students in my fall class decided upon as their assessment criteria for the major projects.

kuhn’s criteria

in “speaking with students: profiles in digital pedagogy,” kuhn, johnson, and lopez ( ) described the goals in creating assessment criteria for the university of southern california’s institute for multimedia literacy (iml) honors program. all students in that program have to complete a scholarly multimedia thesis in their respective major.
the parameters were introduced to digital writing studies as an assessment method in kuhn’s webtext, “the components of scholarly multimedia,” in which kuhn provided a reading of a collaborative student video to “discuss [scholarly multimedia] in terms that are understood . . . by the larger academic community.” the four parameters she used were conceptual core, research component, form/content, and creative realization. (these will be explained in detail below.) “the key [with these parameters],” kuhn ( ) wrote, “is to strike a balance between convention and innovation, even as the line between image and text, between orality and literacy, between art and critique and, indeed, between scholarship and pedagogy grows ever more fuzzy.” it was kuhn’s application of these parameters to her students’ video in the webtext that first drew my attention to this assessment framework, and her work with johnson and lopez, in which they interviewed graduates from iml’s honors program who reflected on their multimedia projects using these parameters, that showed how this assessment framework can be used in classroom assessment practices. in fact, kuhn, johnson, and lopez directly addressed assessment as one purpose for documenting the students’ reflections of their multimedia projects:

although it is unpopular to talk about grading, at least at the faculty level, since that is the terrain of the “bean counters,” we ignore our institutional constraints at our peril. not only is it a disservice to students to fail to inform them of the criteria by which they will be judged . . . given its relative newness, digital work is subject to the charge of lack of academic rigor. without the sustained analysis that comes from assessment criteria, digital work can be dismissed as bells and whistles. these criteria give us a lexicon with which to discuss digital work among ourselves and our students, even as explaining digital work in language that is familiar to traditional academics helps them appreciate its nuances and sophistication. and although institutional constraints can prove frustrating, this is something that academic institutions do well: they force a type of rigor that pushes us towards excellence. at the iml, we feel our project parameters help to highlight aspects that may not be immediately apparent in the piece itself—they approach each project on its own terms. as such, there is far more freedom to be innovative with emerging platforms while maintaining high quality work.

their goals for including assessment speak to a typical set of “institutional constraints” for needing to “count” digital media work and for showing the rigor of digital media against the supposed bells-and-whistles–only view under which digital media is often seen—all issues that faculty members also face. although i disagree that rigor should be the touchstone for assessing the value of scholarly work, it is invaluable having a set of criteria that allows for open-ended expansion into and discussion of multimedia in terms that are recognizable by teachers who do not yet know how to read and assess such work. this particular set of criteria has proven invaluable to my students, who have taken it up with unabashed enthusiasm after reading kuhn’s ( ) webtext, which is one of the first webtexts that i usually ask students to analyze using the four parameters embedded within it. students used these parameters to analyze existing, successful (already published) webtexts from the venues they are interested in submitting to, as well as non-peer-reviewed venues that publish digital media texts they liked, such as music videos on youtube.
in addition to reading kuhn’s parameters (see table ), students read and assess two additional sets of criteria specifically created for scholarly multimedia: warner’s ( ) assessment tool for evaluating webtexts and the peer-review criteria written for the manifesto issue of kairos (dewitt & ball, ).

table . institute for multimedia literacy honors thesis project parameters [a]

conceptual core
1. the project’s controlling idea must be apparent.
2. the project must be productively aligned with one or more multimedia genres.
3. the project must effectively engage with the primary issues of the subject area into which it is intervening.

research component
1. the project must display evidence of substantive research and thoughtful engagement with its subject matter.
2. the project must use a variety of credible sources and cite them appropriately.
3. the project ought to deploy more than one approach to an issue.

form and content
1. the project’s structural or formal elements must serve the conceptual core.
2. the project’s design decisions must be deliberate, controlled, and defensible.
3. the project’s efficacy must be unencumbered by technical problems.

creative realization
1. the project must approach the subject in a creative or innovative manner.
2. the project must use media and design principles effectively.
3. the project must achieve significant goals that could not be realized on paper.

[a] from kuhn, v., johnson, d. j., & lopez, d. ( ). speaking with students: profiles in digital pedagogy. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /interviews/kuhn/index.html.

warner’s assessment tool

warner’s ( ) tool comes from her dissertation study, in which she examined how webtexts that won the kairos best webtext award made their arguments.
she compared the webtexts to current standards in print scholarship such as content, documentation, and tone. then she studied all the nonprint features comprising the webtexts’ webbed affordances such as navigation, links, design, and media features, and created an assessment tool that tenure and promotion committees could use to evaluate the scholarly worth of webtexts. this rich and detailed rubric contains criteria. the number of criteria, which include technical terms that outsiders to scholarly multimedia may not understand (e.g., “nodes,” “lexia”), tended to overwhelm students so that they avoided listing any of warner’s criteria on their values-based short list. thus, due to space restrictions and lack of relevance for this study, i will not detail this heuristic any further here.

manifesto heuristic

the third set of criteria that i provided to students was written by dewitt and ball ( ) in the kairos special issue on manifestos that dewitt and i coedited. because manifestos serve a different scholarly function than most webtexts that kairos publishes, dewitt provided the editorial board with his rubric that outlined how the manifesto webtexts should be evaluated. it was similar in its flexible approach to the assessment goals later laid out by kuhn, johnson, and lopez ( ), and we described our goals for these criteria in the issue’s introduction:

our goal was [to] create review criteria that reflected the call for manifestos while also allowing approaches that we really couldn’t have imagined until we received submissions. the questions were intended to help reviewers generate a response that would consider the manifesto form while also allowing for flexibility and openness, since not all of the questions would be relevant to all submissions. the criteria were crafted around four major considerations: readership, form, media, and response.
students did not choose any of the criteria (as such) from the manifesto issue, so i will skip ahead to the criteria that they did end up choosing.

creating kuhn + 2

after assessing the value of these three frameworks by using each of them to analyze published and unpublished webtexts, we used the dcm methodology to choose a delimited set of criteria that the class could agree were the most useful for scholarly multimedia. in addition to a slight modification of the four parameters outlined in kuhn’s ( ) work, the class wanted to add two components to our assessment criteria: audience and timeliness. these two parameters evolved from three contexts: (a) the students’ knowledge of rhetorical principles from their rhetoric and writing classes; (b) the readership parameter in the manifesto peer-review guidelines, which they thought was too vague a term to use by itself; and (c) their simplification of the word kairos. we ended up with a set of six criteria:

1. creativity
2. conceptual core
3. research/credibility
4. form/content
5. audience
6. timeliness.

because we had been distinguishing between the three rubrics by shortening their names or purposes (i.e., warner, kuhn, or manifesto), and because the students’ choice of terms so heavily relied on kuhn’s four criteria, they labeled this portmanteau of parameters “kuhn + 2.” we used kuhn + 2 as a heuristic for building the student projects over the next few weeks, and i constantly reminded them that these criteria would be similar to what the editorial board would use to assess their webtexts when (or if) they submitted them at the end of the semester.

the important thing for teachers to remember here is not that kuhn + 2 is the rubric you should use to assess scholarly multimedia or other kinds of digital media, but that the rubric needs to be created fresh, with students, for each kind of project you assign.
for the courses i taught in fall and spring , this meant adding audience and timeliness to the iml’s base criteria. in fall , it meant not requiring the three heuristics (while still providing them for some early analytical assignments) and asking students to create their own values-based criteria for assessing their and others’ projects. the students then had to justify why they used the criteria that they used during peer review. as my understanding improves regarding how webtexts move through authors’ and editors’ and publishers’ processes and as i expand my theoretical understanding of multimodal composition (i.e., writing) teaching, my pedagogy changes and so must my assessment criteria. this is why my values system for assessing webtexts may not, cannot, will not necessarily be yours. (and this is most certainly not what the kairos editorial board uses when they evaluate submissions. in fact, there are no set criteria for kairos submissions, as each piece must be evaluated on its own terms in relation to that moment and to technology and media and genre, in time. this is also why i was so opposed to writing this version of this article: because i am worried that kuhn + 2 will be adopted without exploration or understanding the need to consider an assignment within its historical, technological, cultural, and social framework. see prior et al., .)

using webtextual assessment criteria

but i also understand the need to start from somewhere, which is why i am hedging my bets and providing a few examples of how i use some of the above criteria to provide formative assessments to first-time authors (in this case, students) of scholarly multimedia. i do that by focusing on two primary areas of difficulty that student-authors have when applying the criteria to their actual composition process.
the two areas i want to focus on in particular come from the iml (kuhn’s) criteria for scholarly multimedia as outlined above: form and content, and creative realization. these are often the most difficult for student-authors (and teachers) to deal with because these criteria present mandatory new ways of composing wherein linguistic, discursive forms are not the primary means of communication. as readers will see from the examples below, these criteria are not easily divorced from one another. just as form and content are inseparable in multimedia texts (ball & moeller, ; wysocki, ), they are not separable from a text’s conceptual core or its creative realization.

i am purposely skipping the conceptual core and research component outlined in the iml criteria because they are too similar to what writing teachers already handle in regard to brainstorming topics, writing thesis statements, forming purposes in an essay, and so on. the research and the main concept still have to be strong in a multimodal composition, and writing teachers do not need me to cover those topics in detail here. (we know those topics, even if we do not always know how to cover the details in multimedia terms, but just in case, see again ball & moeller [ ] for a useful discussion of how to assess the research component of scholarly multimedia.)

issues in creating form-and-content relationships

i am focusing on two criteria that fall within the iml’s form-and-content category, difficulties that can arise at multiple stages of the composing process. the trick of the form-and-content category is that it cannot be assessed separately from the purpose, or conceptual core, of a piece.
the conceptual core (e.g., the controlling idea) is usually what readers would call the content or purpose or even the thesis of a piece, so in the case of kuhn’s criteria, the form-and-content category relates explicitly to the piece’s conceptual core. if the concept is not clear, the form/content relationship will not usually be clear either. formative feedback on the form/content of a piece can be given at any stage in the compositional process, but is best—as always—caught early, such as in the proposal or storyboard stage of the project. still, sometimes, the conceptual core of a piece sounds great in the proposal and the form/content description in the proposal sounds like it could work, but until an author presents a storyboard or rough draft well into the compositional process, the problems the author had in carrying out the form and content relationship do not become evident. (also, no one wants to revise a multimodal project, which usually involves reenvisioning the project and starting from scratch—not something any teacher wishes on a student with only weeks left in a semester.)

“the project’s structural or formal elements must serve the conceptual core”

recently, one student group produced a webtext called “facebook activism,” in which they wanted to critique web users’ unreflective ability to “like” or “attend” a facebook event intended as a nationwide activist event or movement when, for the most part, that kind of activism stops at the level of the click. in particular, the students were interested in the wear purple event that was promoted during the fall of to stop bullying lesbian, gay, bisexual, and transgendered (lgbt) students at schools across the u.s. in my students’ proposal for the project, they planned to interview lgbt activists from several u.s.
universities who had created local wear purple facebook events, accompany those interviews with statistics of online activism from several sources, and embed all of this into a facebook page that would contain some of the written support for their arguments as status updates and notes (both features of facebook). the interviews and facebook page never materialized, so for the peer-review workshop, the video consisted of a -minute voice-over of their research (pulled mainly from one book, digiactive), which was accompanied by only a handful of still images that were used as visual examples of facebook events. the students acknowledged that they had not been able to complete their intended work, but it was time for the peer reviewers to critique the piece.

in discussing the form-content relationship in the piece, one student wrote at length in her peer-review letter:

one concern that i do hold with this piece is whether or not the media form used is fitting to the project that was given to me to observe. the piece is in video format with audio voiceovers supplying the information. the visuals in the video, however, were not entirely effective to me. throughout most of the video, there was simply the facebook logo, and even when some facebook screen shots were given, they just did not seem to be enough to cover up for the lack of visuals throughout the rest of the text. one idea that i have to provide is that if there are not any other strong images to add—although i’m sure that there are—the group could add blurbs of text or quotes on the screen at the time that they are being said. not only will this be more visually appealing and more interesting to view, but also if you highlight important information so that it is visual and aural, then the audience has a better chance of not just hearing the information but also understanding it.
there is a youtube video that i think could be helpful as far as making actual text effective visually—it’s under stephen fry kinetic typography language. this piece is actually a prezi, which after viewing the “facebook activism” project would be extremely effective (but i know that time is sparse and re-doing the project to that extent is not the best option). and, if adding these extra visuals does not seem appealing or necessary, then i think it would be best to make the project audio only rather than a video piece.

although the kinetic typographic piece that the student mentioned is not actually a prezi (a zooming presentation tool), it does look like one to students who had just learned what prezi was—an interesting moment of technological and designerly uptake. students had studied prezi, among other technological choices for making webtexts that semester, in part to learn how authors need to choose their technology depending on what arguments they want to make. that choice is intimately related to the form/content criteria for composing and assessing webtexts. in this case, the student-reviewer offered a relevant suggestion that would have aligned the form and content more closely with the conceptual core. this is the same suggestion i (or perhaps a kairos reviewer) would have made had the piece been submitted for consideration at the rough draft or query stage of the publication process.

“the project’s design decisions must be deliberative, controlled, and defensible”

one of the biggest obstacles to teaching multimodal anything to first-time multimodal authors is that the instructor may forget to get them to detail how their form and content work together from the very beginning of the process. without a design concept as part of the proposal stage of a webtext, the conceptual core will never be realized in anything besides a paper-based, traditional format that we have been trained—hardwired, it seems—to write.
(this is also an issue of creative realization, which i address below.) to that end, students should be articulating their design choices (form/content relationship) as rhetorical, aesthetic, technological, and other choices that make sense for the conceptual core of a piece given the medium they have chosen as best to present their concept. again, the issue is not usually with the conceptual core, that is, students have a good idea what they want to say; they just do not know how best to say it in multimedia. sometimes the author’s message does not need to be said in multimedia or—more often—the message can be said differently in multimedia than how the author can envision a scholarly article or seminar paper, in which the author knows how to assess the deliberative, controlled, and defendable nature of every word, paragraph, and transition. how, however, does one evaluate (and inevitably defend) how one’s design works? it must all be tied to the genre (or subgenre) of text the author is composing.

in this example, a student created a presentation for a funding organization on her campus. this organization had previously donated money for her to start a studio-based writing center at her -year college. her presentation needed to show the funders what the studio had done with the money previously donated, how the studio does its work (which differs significantly from an archaic conception of a writing center as a skill-and-drill center), and for what the studio would need future funding. to best show what the studio does, the student decided (and i agreed, based on her proposal for this project) to videotape several brief writing-center interactions in this wildly kinesthetically designed studio. she had to show, not just tell, what happened in the studio to get the point across.
at the storyboard-and-script stage of the project, she presented me with several scripts for the seven videos she wanted to shoot for her presentation, which was supposed to be an -minute presentation. the length was a problem because each script was to minutes long. so we started to edit (see figure ) to accomplish two things: (a) ensure that the objects she had described in the script were shown in the video instead of spoken by tutor-actors (e.g., “see the beanbag chairs!”), which would take advantage of the medium of video so that at least two modes (visual, voice-over) could multiply the amount of rhetorical work that each video accomplishes, thereby (b) shorten the overall length of the videos so the presentation could be accomplished in the allotted time. this was a genre consideration that we needed to address during the storyboard-and-script stage, so the student could succeed in the actual presentation. editing the uncontrolled, indefensibly discursive scripts (per this criterion’s requirements) prior to her filming helped her form/content relationship to deliberately and quickly express the conceptual core of the project. (after this presentation, the student garnered another small grant from the foundation to buy ipads for the studio.)

figure . the script has been significantly edited to meet the genre expectations of this multimodal presentation. (this figure is available in color online.)

issues in creative realization of webtexts

one of my favorite dictums for first-time authors of scholarly multimedia is, “if you start with word, you’ll end with word.” of course, microsoft word is a stand-in for any kind of linear, scholarly thinking or any word-processing program.
authors are so accustomed to drafting scholarly and academic projects by writing them out (see, e.g., any powerpoint presentation that relies on words to convey its bulleted points or the mandatory word counts inherent in academic articles or badly theorized first-year writing curricula). so, when i work with authors, i want to shout “awaywithwords!” (wysocki, ). at stake is the creative realization of a project that “could not be achieved on paper,” as the criterion i want to focus on here indicates.

“the project must achieve significant goals that could not be achieved on paper”

how an author comes up with a project that cannot be achieved on paper is far beyond the scope of this article, although other texts have addressed the topic (arola, sheppard, & ball, in press), including in classroom practice and digital authoring workshops such as those at computers and writing and the digital media and composition summer institutes. generally speaking, however, i offer two suggestions to webtext authors: (a) your design should enact your argument, and (b) to come up with that design, think of a visual metaphor for your argument. delagrange’s ( a) piece, detailed at the beginning of this article, did just that: it used the visual metaphor of a wunderkammer as the main navigational design of the webtext, which helped her enact her argument that juxtaposing items in proximity (as wunderkammern require) will aid in the invention process. perfect. and not replicable on paper.

nothing excites me more as an editor or teacher than witnessing the moment when an author realizes that a webtext needs to be presented on-screen. as kalmbach ( ) has so rightly said, the majority of webtexts published in kairos look like hyperlinked seminar papers.
some pieces fit the mission (rhetoric, technology, and/or pedagogy) and use minimal-but-necessary webtextual features (e.g., links, some reader interplay, maybe an embedded student video) that make the pieces unsuitable for submission to a print-based journal. that kind of webtext is kairos’s bread and butter, but it is not the stuff that makes me giggle with editorial delight. i discourage students from producing next-button hypertexts even though that is what they usually veer toward, usually because they lack practice with anything else and because with next-button hypertexts, they feel safe and thus confident. being able to teach scholarly multimedia requires lots of reassurance for students, lots of reminders that it is not about the finished product but about the trying. the formative feedback they get from me at each major stage of their projects helps guide students through their efforts. and during those feedback sessions (usually in one-on-one conferences during which i first view their projects, just as i would for an author submitting a query about a webtext), i hope for the “wow” moments when the student realizes that a piece should be on-screen.

recently, in one of my classes, most of the students had chosen prezi in which to build their webtexts. prezi seemed easy, and it would allow them to practice building multimodal scholarship without too much hands-on support from me. (in this graduate-level theory class, they were responsible for learning much of the technology on their own, with my help for troubleshooting as needed.) i conferenced with one student who expressed concern about the appropriateness of her design and how it would work (or, was not working, she thought) in prezi. her project was a
her project was a creative nonfiction piece about her fibromyalgia diagnosis and her struggle to fit the accepted notions of that diagnosis in relation (or opposition) to its portrayal in medical media, such as pfizer ads. in the storyboard for her project (a poster board), she had drawn the front of a woman’s body (the predominant gender diagnosed with fibromyalgia, i learned) and placed short selections of her creative writing about her lengthy process of diagnosis and the pain she experienced that did not match the trigger points that most people felt with the “disease” or “disability.” (this is in scare quotes because there is some contention over its categorization as either, which is also part of her project.) peer feedback on her storyboard suggested that she needed to show the back of the body as well as the front because the trigger points also appear at the back. she found an image that represented the front and back trigger points on the outline of a woman’s body and used that as the basis of her design in prezi (see figure ), which we then discussed.
[figure: the trigger points for fibromyalgia are shown as large black dots in the drawing. (this figure is available in color online.)]
her concern with using prezi and with the background image arose when we were discussing what path she should implement. prezi, like powerpoint, wants to follow a sequential path that the author determines. prezi, unlike powerpoint, does not require a linear slide-show-like path but can navigate in and out of any straight or curving or jutting path on-screen; the author has only to create stopping points along the path for each frame she wants to show. for this student’s project, it seemed obvious at first that the stopping points should correspond with the pain trigger points on the woman’s body.
but the student explained that she wanted to make sure readers could access material she was placing on parts of the body that corresponded not with the known trigger points but with her pain. her experience with fibromyalgia was different from the norm, which was part of her argument in this creative nonfiction piece. if she added a path in prezi that followed the trigger points, readers would miss some of the most important frames in the piece. we then realized how this piece needed to be on-screen, neither on paper written as a series of narrative snippets with a few print advertisements and screenshots of commercials thrown in as illustrations nor as a next-button hypertext with the videos embedded. just as the doctors whom she saw over several years struggled to diagnose her condition, readers would need to struggle to find order and potential closure in this piece. the prezi needed no path. the author needed full freedom from any limiting, directive series of stopping points for readers so she could reinforce her argument that nothing with fibromyalgia represents a norm. i got goose bumps from this idea. she would use prezi’s affordances against itself, hacking the system in one of the most powerful ways to adapt a technology to make an argument. readers would be confused, but that was part of the point. this piece needed to be difficult to navigate, so that readers had to figure it out, thereby recreating the diagnosis process.
hilary selznick’s ( ) piece, “fibromyalgia: the (in)visible (dis)ability,” was later published in the premiere issue of technoculture: an online journal of technology in society. selznick told me that keith dorwick, technoculture’s editor, provided her with excellent feedback that took the piece from a classroom project to a published webtext, showing exactly how webtexts have viable public trajectories.
r&r is the new a: revaluing grading scales
although this article shows some strategies that i have used for providing formative assessment feedback for students’ scholarly multimedia projects, it is more important to me that students can assess each other’s work through the peer-review letters they write to each other after their rough draft workshops (such as the example from the first form and content criterion). i do not grade students’ completed webtext submissions because too many smaller assignments are part of the larger project’s requirements. i assign one grade to each student’s entire body of work for the whole semester. each student’s grade is based on one thing: participation. however, my definition of “participation” includes several key aspects: did the student
1. do all of the work i assigned?
2. turn it in on time?
3. do it with excellence?
(for a full discussion of my grading scale, see ball [n.d.].)
excellence, of course, is easier said than graded, so i will remind readers that this system of grading is built on a class’s producing a particular set of genres in a particular moment in time. having taught students to compose webtexts five semesters in the past years, i feel that this assignment sequence (much modified since the first time i taught it) may finally be getting close to what first-time kairos authors actually use when submitting to the journal. i also hope to be getting closer to creating a fluid method of assessing students’ webtexts that equally values
1. their in-class peer reviews
2. the constantly shifting genre conventions of scholarly multimedia work
3. my expertise (and time) as teacher-editor
4. students’ everyday interests in digital media
5. the audiences and venues (e.g., editorial boards and scholars) that students’ work might actually reach.
yet, i am not dumb enough to expect that students could design excellent, publishable work in one semester.
i can only expect them to complete their projects in similar fashion and scope to what most first-time kairos authors complete: a webtext that
1. is suitable to the subject matter of and audience for the journal
2. is submittable via a url or zip file
3. functions without breaking
4. is far enough along in its thinking that the first round of reviews would suggest the author revise and resubmit, as nearly all first-time (and most second-time) submissions to kairos receive.
once i articulated these expectations to myself, i realized that i had to change the standard by which i assessed student work. it was not feasible to judge students based on any finished product (or the process they used to complete the work) given that many first-time scholarly multimedia authors need a reasonable amount of feedback on their webtexts before those pieces are considered ready to resubmit. if my expectation was a semifinished product, why not have revise and resubmit be the standard by which i assess the students’ projects? as an editor, i cannot expect kairos authors to produce perfect (i.e., accepted for publication) work the first time around, nor as a teacher should i expect students to produce at that level the first time they compose in multimedia. that is not to say that as editors or teachers we must lower our standards. when students who take a multimodal class for the first time can produce work that is on par with much of what first-time kairos authors produce, that is a bar-raising event, yet my grading of their work must shift to accommodate what that work means in relation to the academic world of peer-reviewed scholarly multimedia.
notes
the latter term, “scholarly multimedia,” was brought to prominence by the usc school, that is, the collective of the institute for multimedia literacy and vectors journal, both of which are located at the university of southern california.
although i usually assign this entire sequence only to undergraduates, i have discovered that graduate students need this disciplinary-knowledge breakdown just as much as undergraduates so they can better understand the disciplinary conventions of publishing in their field. during a recent semester teaching graduate students in a multimodal theory seminar, i discovered during the proposal stage (having skipped everything prior to that as, mistakenly, being too pedestrian for the graduate students) that they had no idea how to articulate the scope, audience, or values of a particular journal. i should not have been surprised, given that students at this point have usually recently entered the field or may not even be in rhetoric/composition or technical communication but might come from linguistics, creative writing, literature, and so forth. next time i teach graduate students, i will be much more explicit in my directions, following those i use for the undergraduates, which include all the assignments above.
for details on these assignments, please see http://www.ceball.com/classes/ /fall /major-assignments/ for an undergraduate example or http://www.ceball.com/classes/ /assignments/ for a graduate example.
if we rely on rigor as our scholarly touchstone, we miss the value that supposedly nonrigorous (e.g., nondiscursive, affective, imagistic) meaning-making strategies can have in our scholarship. (i think kuhn would agree.) o’gorman ( ), murray ( ), and kress ( ) all discuss the problems of assigning rigor too much value in a dichotomous comparison to affect (or just interest, which is the term kress uses). both o’gorman and murray particularly engage with the necessity of including image in our discussions of the value of digital media work.
when i first conducted this assessment strategy, the four parameters that kuhn ( ) outlined were not yet being attributed in scholarship as the assessment method used at iml. not until kuhn, johnson, and lopez’s ( ) work was published (a year after my students started referring to our set of criteria as kuhn ! ) did the connection to the iml’s honor’s program, which kuhn directs, become clear. in addition, the four parameters were created with the previous honor’s director, steve anderson. for more information on the history of these criteria, please see kuhn, johnson, and lopez’s webtext.
the name stuck, for good or ill, in that it is nondescriptive of that for which the heuristic exists, although it also appreciatively recognizes the author who wrote about the heuristic convincingly enough for students to see its absolute-use value.
students are not required to submit texts, but they must go through all the genres of submitting, including writing a query or proposal e-mail to the editors, which they can send to me as the teacher if they do not actually submit their work.
undergraduate students work in groups of three or four in my multimodal composition classes. graduate students work independently, unless they prefer to work in small groups.
public service announcement: if you have never authored scholarly multimedia and you try to assign that writing to students, you will struggle to guide students through the rhetorically and technologically intensive troubleshooting process that this assignment requires and struggle more when you assess their work. try to accomplish the assignment yourself first. start small. these workshops and institutes give you quality time with experts and can help you quickly learn the standards of multimodal composition.
delagrange took years, and no less than three designs, to get her piece ready for publication. (see her discussion of the revision process in delagrange, b).
obviously, in a classroom setting where that process may last anywhere from to weeks, a publication-ready piece will not be possible. (i discuss how this process impacts assessment and grading in the conclusion of this article.)
for the language that i use to introduce students to the concept of “trying,” see the scope section on my assignment page (ball, ).
references
arola, k. l., sheppard, j., & ball, c. e. (in press). making multimodal projects. boston: bedford/st. martin’s.
ball, c. e. (n.d.). english : multimodal composition: expectations. retrieved from http://www.ceball.com/classes/ /fall /syllabus
ball, c. e. ( ). show, not tell: the value of new media scholarship. computers & composition, , – .
ball, c. e. ( ). a new media reading strategy (unpublished doctoral dissertation). houghton, mi: michigan technological university.
ball, c. e. ( ). multimodal theory and pedagogy: project (final project). retrieved from http://www.ceball.com/ classes/ /assignments/project
ball, c. e., & arola, k. l. ( ). ix: visualizing exercises ( st ed.). boston: bedford/st. martin’s.
ball, c. e., & moeller, r. m. ( ). converging the ass[umptions] between u and me; or how new media can bridge a scholarly/creative split in english studies. computers & composition online [special edition]. retrieved from http://www.bgsu.edu/cconline/convergence
broad, b. ( ). what we really value: beyond rubrics in teaching and assessing writing. logan, ut: utah state university.
broad, b., adler-kassner, l., alford, b., et al. ( ). organic writing assessment: dynamic criteria mapping in action. logan: utah state university press.
delagrange, s. h. ( a). wunderkammer, cornell, and the visual canon of arrangement. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /topoi/delagrange/index.html
delagrange, s. h. ( b).
when revision is redesign: key questions for digital scholarship. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /inventio/delagrange/ index.html
dewitt, s. l., & ball, c. e. ( ). logging on: manifestos as scholarship. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /loggingon/lo-schol.html
kalmbach, j. ( ). reading the archives: ten years on nonlinear (kairos) history. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /binder.html?topoi/kalmbach/ index.html
kress, g. ( ). multimodality: a social semiotic approach to contemporary communication. new york: routledge.
kuhn, v. ( ). the components of scholarly multimedia. in kuhn & victor vitanza (eds.), from gallery to webtext. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /topoi/gallery/index.html
kuhn, v., johnson, d. j., & lopez, d. ( ). speaking with students: profiles in digital pedagogy. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /interviews/kuhn/ index.html
murray, j. ( ). non-discursive rhetoric: image and affect in multimodal composition. albany: state university of new york press.
o’gorman, m. ( ). e-crit: digital media critical theory and the humanities. toronto: university of toronto press.
prior, p., solberg, j., berry, p., bellowar, h., chewning, b., lunsford, k., rohan, l., et al. ( ). core text. resituating and re-mediating the canons: a cultural-historical remapping of rhetorical activity (a collaborative webtext). in kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /binder.html?topoi/prior-et-al/core/index.html
purdy, j. p., & walker, j. r. (in press).
scholarship on the move: a rhetorical analysis of scholarly activity in digital spaces. in d. journet, c. e. ball, & r. trauman (eds.), the new work of composing [digital book]. computers and composition digital press/utah state university.
schultz, d. ( ). a digiactive introduction to facebook activism. retrieved from http://digiactive.org
selznick, h. ( ). fibromyalgia: the (in)visible (dis)ability. technoculture: an online journal of technology in society, ( ). retrieved from http://tcjournal.org/drupal/vol /selznick
shipka, j. ( ). negotiating rhetorical, material, methodological, and technological difference: evaluating multimodal designs. college composition and communication, ( ), w –w .
warner, a. b. ( ). constructing a tool for assessing scholarly webtexts. kairos: a journal of rhetoric, technology, and pedagogy, ( ). retrieved from http://kairos.technorhetoric.net/ . /binder.html?topoi/warner/index.html
wysocki, a. f. ( ). impossibly distinct: on form/content and word/image in two pieces of computer-based interactive multimedia. computers and composition, , – .
wysocki, a. f. ( ). awaywithwords: on the possibilities in unavailable designs. computers and composition, , – .
dr. cheryl e. ball is an associate professor of new media at illinois state university. she studies multimodal composition, digital media scholarship, and digital publishing. her portfolio can be found at http://ceball.com
commentaries
capital, neoliberalism and educational technology
chris jones
published online: april
© springer nature switzerland ag
i have been asked by experienced colleagues what neoliberalism means and why it is relevant in educational technology. to me, the link seems obvious but let me set out some of the links between neoliberal political economy and educational technology.
educational technology has a relatively long history (cuban ) and this commentary is restricted to computing and digital technologies in education. currently, digital technologies are established features throughout education. indeed, the core digital presence of many educational institutions is the learning management system (lms) (virtual learning environment, or vle, in the uk). these systems retain many of the features found in early applications of computer conferencing and even their look and feel has not moved on significantly since the introduction of a graphical user interface (gui). the really significant advances in technologies have been in devices, networks and memory, with a movement from desktop personal computers and mainframe computers to mobile phones and tablets connected to high-speed networks and the cloud. institutional islands and dial-up facilities have changed to high-speed broadband networks accessed via mobile and wi-fi connections linked to extensive storage.
technological changes are sometimes seen as the dominant driver of social change, a kind of technological determinism. however, in the same time period that technology changed, there was a political and economic transformation of the world economy from a competition between capitalism and socialism, with a developing ‘third world’, to a single global system differentiated into geographical regions and nation states. capital is triumphant, the economy is global in scope and the state has been reduced in the face of resurgent markets. perhaps surprisingly, there has been little focus on the relationships between the global resurgence of capital, in market-led neoliberal forms, and the characteristics of those educational technologies and institutions that emerged at the same time (for an exception see selwyn and facer ).
this commentary addresses that perceived gap by outlining possible connections between capital, neoliberal politics and educational technology and suggesting a revised educational response.
postdigital science and education ( ) : – https://doi.org/ . /s - - -
* chris jones, c.r.jones @ljmu.ac.uk, liverpool john moores university, liverpool, uk
neoliberalism builds on the early liberal idea of free markets and a laissez-faire economic order. neoliberalism argues that non-market mechanisms are inefficient and that they lead to a lack of political, social and economic freedom (hayek ). these ideas became mainstream thinking under the presidency of ronald reagan in the usa and the prime minister of the uk, margaret thatcher. their reforms in two advanced capitalist economies coincided with the market reforms of deng xiaoping in china, and in the s, they were amplified by the collapse of the soviet union. a distinction between classic liberalism and its neoliberal form was the active role of the state. liberalism reacted against the feudal state and struggled to free markets from state interference. neoliberalism actively used the state to reset regulatory and political frameworks, to roll back the state’s previous activities and to enforce market relations. the neoliberal state was not laissez-faire because it regulated to enforce market and quasi-market relations rather than stepping aside to allow transactions between private parties.
in education, this meant the drive towards viewing students as consumers and in some cases the introduction of fees and competition between education ‘providers’. the neoliberal nature of these changes was signalled by their introduction alongside market regulators through which the state actively encouraged certain outcomes. the catchphrases of the new state-endorsed and state-enforced education systems were choice and competition.
new systems of measurement and ranking were introduced to replicate the role of price in genuine markets. for choice to be effectively transmitted between purchasers and providers, systems of funding were devised allowing choices by students and parents to be translated into financial allocations between competitors. these cumbersome processes added significant transaction costs to the educational system and drove ‘providers’ to find new ways to engage with students and parents.
technological changes in the late s and s moved computing away from mainframe large computers in specialised facilities to personal computers in the home and distributed throughout the economy. associated with these technological transformations, a range of arguments have placed technology at the centre of social change. these arguments stress the effects of technology on society. arguments for social shaping, in contrast, stressed the process of design and the purposes technologies were designed for. these conflicting outlooks can be reconciled by a socio-material understanding of technology in which technologies have a history in which social forces are embedded, and they are material artefacts with definite form, which are taken up in society in socially and politically informed ways. technologies have persistent effects because they allow for or afford some outcomes rather than others and even when they are non-determining they have an influence inclining users to one outcome rather than another.
a significant link between the political and economic form of neoliberalism and the technology it gives rise to can be found in the need to control labour. technology provided a way to replace labour and to disrupt labour organisation (harvey ). education is directly affected by neoliberal capitalism because of the changing occupational and social structure.
new occupations demand new skill sets and the proportion of the population requiring higher-level education and skills has increased. while there are arguments about the usefulness of degree-level qualifications, there is a general assumption that an advanced economy based on new technologies requires a highly educated workforce. the new workforce is said to require ‘team’ working and new forms of cooperation and collaboration at work. the flat organisational forms and reduced hierarchy characteristic of modern organisations require workers to be more self-motivated and self-organising. the contradictory process of de-skilling and re-skilling the workforce, and the development of complex global production networks, gives rise to independent and semi-independent producers, organised in modular networks, alongside independent professionals and consultants. these changes have led to different kinds of demands being placed on education, in particular, to develop students with competences fitted to the new work conditions with what is often described as digital literacy.
the information age and the network society are decentralised and the history of computing mirrors the neoliberal arc in both historical time and dialectical process. the decentralised structures of mature network societies rely upon a history of centralised public funding for their foundations. it was state funding, often military in character, that developed the technological building blocks that became the basis of an apparently distributed, entrepreneurial and disruptive ‘new’ technology (mazzucato ). it remains state action that protects and regulates the market in which the technological giants flourish. fundamentally, public education provided the intellectual ‘capital’ on which technological advances were built and which helped to determine the geographical distribution of centres for new technology such as silicon valley.
the distributed global network society displays a geographic clustering of innovation in distinct areas co-located with high-quality universities. to feed the new economy, schools and universities have gone through an unprecedented expansion and the education sector has become international and economically important in itself, with international flows being included as an ‘export’ in economic terms. education is increasingly spoken of as a key enabler, if not the enabler, of development, innovation and economic growth.
educational technology does not have the scale and scope to warrant a distinct role in technological developments. while many technologies arise in educational contexts, they are often developed and commercialised elsewhere and sold back to educational institutions as products. education is a consumer of technologies developed for other purposes (e.g. microsoft ). at times, educational technology coalesces around issues with a wider social purpose, for example, early collaborative systems of groupware and computer conferencing (e.g. lotus notes) and currently ‘learning’ analytics and the application of data analytics to education. in the neoliberal system of choice by consumers, technological systems are deployed to facilitate competition between institutions by marketing, reducing student drop out and ensuring the reliability and comparability of results. this has led to the deployment of institutional technological infrastructures including learning management systems (lms–vle), enterprise resource planning (erp) and plagiarism software (e.g. turnitin). education also relies upon a range of universal infrastructures which serve the wider public, such as google, and these systems often have educational sub-sectors, e.g. google scholar and google for education.
an increasing feature of educational engagement with technologies is the ways in which the institution itself becomes ‘unbundled’, with different aspects being absorbed into discrete technological systems. these systems are provided by global corporations who extract data and expertise from the institutions and aggregate the data across the world. a recent example of ‘disruptive’ technology that was thought to challenge educational institutions was the development of massive open online courses (moocs). in their early incarnations, they were thought to be a potential replacement for traditional organisations in education, a technological disruptor. only a few years on from their height, the idea that free courses offered via online platforms could supplant universities and schools seems ridiculous, but this was a widely held view until recently.
the technologies and platforms developed by venture capital and large corporations have affected the kinds of work educational employees are engaged in and the kinds of students enrolled in education. academic work has been affected by the general development of digital and networked technologies through the emergence of digital scholarship and the digital scholar. the student intake is said to be affected by their immersion in new technologies and there are repeated claims that new ‘generations’ of students are bringing practices developed by their exposure to new technology into student contexts. in previous times, it was the digital and the networked that were identified as motors for this change; more recently, it has been social media and the smartphone. digital technologies are also said to be changing fundamental aspects of educational practice including what it means to learn, the routines of academic reading and the way study is conducted.
openness has become one of the key battlegrounds in the integration of new technologies into education because open educational resources (oer) can be seen as a public benefit, a way to reduce costs and increase access. however, oer can also be in alignment with the contemporary needs of capital in which the learner has to constantly seek new and relevant knowledge, to make themselves a more desirable, educated and flexible labour commodity. openness has had two faces, one aligned with neoliberal markets and individualism, the other with a more anarchic anti-state libertarianism that prizes civic initiatives and self-help. the results of such progressive initiatives have in large part been disappointing. open courseware, learning objects, open educational resources, all worthy initiatives, reliant on voluntary labour, have not been able to develop to a scale that challenges venture capital and neoliberal practices. open access to research has been a recent example in which governments have begun to intervene to ensure that publicly funded research is not held behind subscription walls by large academic publishers. it seems that to ensure even a modicum of success, bottom-up initiatives for openness require top-down support from the state.
the conclusion i draw from these interactions between a neoliberal political economy and educational technology is that educational reform will not come from technological change and it will not come from civic action and voluntary initiatives alone without state support. the open university in the uk was set up by a progressive labour government in and it integrated the newest communication technologies. this hugely successful institution was proudly open to people, places, methods and ideas and it inspired many international comparators. it has struggled in recent years, its openness challenged by repeated regulatory restrictions and the introduction of fees in uk higher education.
the history of the open university demonstrates how state action can galvanise progressive grassroots activity and build successful, innovative and advanced practical action. it also demonstrates decline through neglect and state regulation in a hostile economic environment. critical digital education, and educational technology more broadly, needs to embrace aspects of classic political action. to develop pedagogies that mobilise new technologies successfully, educators need to work with traditional social and political actors to win them to a progressive educational programme of action. this will need engagement with political parties, trade unions and cooperative movements. it means adopting a top-down alongside a bottom-up perspective. critical pedagogy needs to embed itself in political programmes that can be enacted at state and regional levels. such an approach paradoxically mirrors the successful use of the state to enforce the reintroduction of markets and quasi-markets by neoliberal politicians.
references
cuban, l. ( ). teachers and machines: the classroom use of technology since . new york: teachers college press.
harvey, d. ( ). a brief history of neoliberalism. oxford: oxford university press.
hayek, f. ( ). the road to serfdom. chicago: university of chicago press.
mazzucato, m. ( ). the entrepreneurial state: debunking public vs private sector myths. london: anthem press.
selwyn, n., & facer, k. (eds.). ( ). the politics of education and technology. new york: palgrave macmillan.
postdigital science and education ( ) : – capital, neoliberalism and educational technology

open access in context: connecting authors, publications and workflows using orcid identifiers
publications, essay
josh brown *, tom demeranville and alice meadows
orcid eu, paddock gardens, lymington so es, uk
orcid eu, catherine way, bath ba pa, uk; t.demeranville@orcid-eu.org
orcid inc., brookline, ma , usa; a.meadows@orcid.org
* correspondence: j.brown@orcid-eu.org; tel.: + - - -
academic editor: mary anne baynes
received: april ; accepted: september ; published: september
abstract: as scholarly communications became digital, open access and, more broadly, open research, emerged among the most exciting possibilities of the academic web. however, these possibilities have been constrained by phenomena carried over from the print age. information resources dwell in discrete silos. it is difficult to connect authors and others unambiguously to specific outputs, despite advances in algorithmic matching. connecting funding information, datasets, and other essential research information to individuals and their work is still done manually at great expense in time and effort. given that one of the greatest benefits of the modern web is the rich array of links between digital objects and related resources that it enables, this is a significant failure. the ability to connect, discover, and access resources is the underpinning premise of open research, so tools to enable this, themselves open, are vital. the increasing adoption of resolvable, persistent identifiers for people, digital objects, and research information offers a means of providing these missing connections.
this article describes some of the ways that identifiers can help to unlock the potential of open research, focusing on the open researcher and contributor identifier (orcid), a person identifier that also serves to link other identifiers.
keywords: identifiers; orcid; research information; open access; web technologies; open research
introduction
the evolution of the academic web has provided researchers and the interested public with a radical set of possibilities. the cost of distribution and copying of resources has decreased at the same time as the volume of accessible information has soared. this was one of the inspirations for the open access movement [ ]. the fundamental web technology of linking has enabled a degree of richness and interconnectedness in resources that was previously unimaginable. many initiatives and products take advantage of these possibilities. the idea that a researcher can access an article, anywhere in the world, click on a resolvable link to articles cited (or to the data being reported on, or to digitized source material), and access them seamlessly is very attractive, and deceptively simple. resources are all too often hidden behind barriers. links decay. ‘reference rot’ has become a significant issue [ ]. however, a more fundamental challenge to opening up the processes and fruits of global research is that all too often links are not there, or are ambiguous.
two fundamental units in the scholarly endeavor are the researcher and the article. even connecting these has proved challenging. the variations in naming conventions across journals, countries, cultures and alphabets are huge. one author could be referred to in her career as jane doe, jane ellen doe, j. doe, j.e. doe, j.e. doe-pilkington, dr. jane doe, prof. jane doe and many other variations besides.
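the naming problem described above can be made concrete with a small sketch. it is an illustration only, not a method used by orcid or any publisher: the `name_key` helper, the list of honorifics, and the extra name "john doe" are assumptions introduced here. the sketch reduces each display name to a (surname, first initial) key, a common naive matching strategy, and shows that the key both splits one person into two identities and merges two different people into one.

```python
import re

# Honorifics to ignore; this short list is an illustrative assumption.
TITLES = {"dr", "prof"}


def name_key(name: str) -> tuple[str, str]:
    """Reduce a display name to a naive (surname, first initial) key."""
    tokens = [t for t in re.split(r"[\s.]+", name.lower()) if t and t not in TITLES]
    return tokens[-1], tokens[0][0]


# The variants listed in the text for a single (fictional) author.
jane_variants = [
    "jane doe", "jane ellen doe", "j. doe", "j.e. doe",
    "j.e. doe-pilkington", "dr. jane doe", "prof. jane doe",
]

# One person ends up under two different keys (the hyphenated surname splits off).
jane_keys = {name_key(v) for v in jane_variants}

# A different (hypothetical) person collides with Jane's main key.
collision = name_key("john doe") in jane_keys

print(sorted(jane_keys), collision)
```

string matching therefore fails in both directions, which is the gap a person identifier is meant to close.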
factor in all the other researchers with the same, or similar, names, and a simple search for works by a colleague can quickly become a hugely complicated discovery exercise. correctly identifying a publication can be almost as challenging, as it may appear in multiple formats, on multiple systems and platforms (the publisher’s own platform, aggregator and library systems, and more), and even in multiple versions, including the author accepted manuscript (aam) and version of record (vor). if we cannot unambiguously connect two such well-established units, how can we tackle new or emerging outputs in the research process, such as software or data visualization, or recognize an expanded range of roles and contributions, such as peer review and data curation? correct attribution not only makes research more efficient, it is essential for openness.
publications , , ; doi: . /publications www.mdpi.com/journal/publications
open access (oa) is a special case in point. if a researcher publishes an article in an oa journal, there is often an audit and reporting trail implied. a funder may have a policy that requires outputs from funded projects to be oa [ ]. an article processing charge (apc) may have been paid. the institution employing the author will want to track the publishing outputs and preferences of its staff. there may be an oa requirement for eligibility for research evaluation [ ]. there might be a data management policy that requires deposit into a data center for underpinning datasets or open data. this could come from a funder or from a publisher [ , ]. the staff time involved in collecting, managing, and analyzing all of this information, and preparing it for formal reporting can be extremely significant, especially at global research institutions with substantial volumes of output. in light of this, the challenge is clear.
to fully deliver the potential benefits of open, digital scholarship, automatic, reliable, resolvable connections must be made systematically between researchers, their employment, their publications and other research outputs, their research activities, and the funding that supports it all. truly open research is also transparent, which requires a mesh of information to surround each output or action. attribution lies at the core of this transparency, as it forms a crucial link in the chain of relationships that underpin research. in this context, orcid (open researcher and contributor identifier) plays a vital role in establishing that chain [ ]. orcid maintains a central registry of unique persistent identifiers (orcid ids) for researchers and others engaged in research, scholarship, and innovation. these identifiers are openly available under a creative commons zero license [ ]. researchers can register for an identifier at no cost, via a simple online registration process. this identifier can then be used when they apply for funding, when they submit a manuscript, or in internal research information systems. it provides an unambiguous, resolvable link to the individual to whom a research activity should be attributed. the registry also enables connections to other persistent identifiers (pids), whether for organizations or for research resources. these connections are curated and controlled by the researcher, and can be updated by the creators of identifiers and metadata, such as publishers, with the researcher’s permission [ ]. by connecting orcid ids with other pids for authoritative sources of data about the resource or activity, orcid acts as a bridge, connecting researchers to information about their works. these connections provide the beginnings of the rich, detailed context that openness both implies and requires. 
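one practical property of orcid ids is that they can be checked for transcription errors before any lookup is attempted. the essay does not describe the identifier format; the sketch below relies on orcid's published identifier structure (sixteen characters, the last being a check character computed with the iso 7064 mod 11-2 algorithm). the function names are ours, and the two identifiers used are sample ids from orcid's own documentation.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid_id: str) -> bool:
    """Check the shape and check character of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid_id.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # sample iD from ORCID documentation
print(is_valid_orcid("0000-0002-1825-0098"))  # last character altered
```

a passing check only shows that the string is well formed; whether the id is registered, and to whom, still has to be resolved against the registry.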
orcid works with researchers and hundreds of organizations around the world to develop these connections, to help realize the benefits of st century communication technologies for the global research community [ ]. this article sets out some of the ways that these partnerships have already improved the flow of research information, and describes some of the next steps that could be taken to further bolster openness and to improve the understanding of our progress towards open access.
improving attribution in existing workflows
orcid is not alone in providing identifiers for researchers, and all published authors have multiple identifiers [ ]. this is an inevitable consequence of publishing, as many are automatically generated and assigned. identifiers can be specific to a single vendor, country, discipline, institution, funder or publisher, or can emerge from wider community initiatives. some focus on the needs of the library community or institutions, others may serve the researcher. national approaches also exist, or have existed in the past, such as dai [ ] in the netherlands and jisc names [ ] in the uk. the type and scope of these identities vary depending not just on the use-case they were designed to meet, but also on the use-cases they have evolved to address. three of these are described below.
isni (international standard name identifier) is the iso certified global standard number for identifying contributors to creative works and those active in their distribution, including researchers, inventors, writers, artists, visual creators, performers, producers, publishers, aggregators, and more [ ]. isni identifiers are semi-automatically derived from library catalogues and other trusted sources using algorithms and human intervention for quality control. they are not editable by the entities they reference, although users can suggest changes to existing profiles and report duplicates.
institutions that are members can submit data for matching and isni creation. isni requires at least two sources of matching information before an id is created.
researcherid is a unique identifier assigned by thomson reuters that enables researchers to manage their publication lists, track their times-cited counts and h-index, identify potential collaborators, and avoid author misidentification [ ]. researcherid identifiers are closely integrated with other thomson reuters products, such as web of science. like orcid, researcherids are created and managed by users, and the two systems interoperate to enable the creation of bidirectional links.
scopus author identifier assigns a unique number to groups of documents written by the same author via an algorithm that matches authorship based on certain criteria [ ]. it is a service provided by elsevier and integrated with their other products such as mendeley. if a document cannot be confidently matched with an author identifier, it is grouped separately. in this case, you may see more than one entry for the same author. scopus profiles cannot be amended by users but there is a process for author feedback.
these and many other identifiers are already embedded in many of the workflows that make up the research process. in this section, we describe some of the ways that interactions between these identifiers are improving the accuracy, currency, and reliability of research information.
connecting researchers and their organizational affiliations
for researchers, reporting their employment or educational history is a central part of establishing their professional credentials. for the organizations employing or training researchers, the accuracy of these claimed affiliations is vital for the organization’s understanding of its research portfolio, the balance of its activities and their impact, and also for its reputation.
the claim that researcher x is affiliated with organization y is a fundamental component of research information, and the importance of its veracity cannot be overstated. organizational names suffer from similar ambiguities to those described for researcher names in the introduction to this article. researchers typing organization names into free text fields on forms, or as part of a block of text in an article manuscript may express these names in varying ways (for example, m.i.t., mit, massachusetts institute of technology, etc.). this means that web or other searches for affiliations require using many variant terms, and may still not be exhaustive. while current identifiers for organizations tend to cover a defined subset of the vast range of organizations existing at any one time, there are a cluster of identifiers that are especially pertinent to research organizations; these are used in the orcid registry to connect organizations to people. at the simplest, data entry level, researchers can enter their own employment affiliation into their orcid record. a dropdown list of names is provided based on characters entered, from which researchers can select the correct organization. once this is done, the name and associated organization identifiers are linked to the metadata associated with that person’s orcid id [ ]. this means that the researcher is now connected to a canonically expressed, disambiguated organization name, underpinned by widely used identifiers. in this scenario, the affiliation is asserted by the researcher. while this is no problem for the overwhelming majority of cases, in which trustworthy individuals act in good faith, an additional step—in which the claimed affiliation is validated—may be required for those seeking to re-use or publish this information. 
however, this additional step is unnecessary when employers or other organizations make their own, validated claim (“assertion”) of affiliation with an individual, which can be done by using the orcid api to add information to orcid records through a simple process. first, the researcher logs in to their institutional system, where their organization requests permission to connect to the researcher’s orcid record and add affiliation information. if the researcher grants permission, the final step is for the organization to update the orcid record with details of the affiliation. this may involve role information, or time bounds for the relationship (such as “october to the present”), but will always involve the unambiguous expression of the organization name and associated identifiers. this information can then be displayed in the web interface and in the metadata behind it, with the source shown as the asserting organization.
this employer assertion of affiliation provides an additional degree of trustworthiness in the connection between an individual and an organization. this is essential if the data are to be re-used, for example, by a publisher in adding affiliation information to an article or by a funder in assessing a grant application. it also offers a more robust underpinning for other potential uses of this information, which will be discussed later.
the value of these affiliations for oa is clear. individuals are often the most visible connection between grant funding, an organization, a project, and a publication. by ensuring that these affiliations are openly and consistently expressed in a trusted way, the data can be embedded in publications and other outputs. policy compliance, progress towards greater openness, and changes in citation level or other impacts can all be reliably tracked at the project or organizational level.
for researchers, it can reduce the time spent re-keying information into multiple forms, reducing bureaucratic overhead and minimizing the risk of errors.
automatic updates for newly published outputs
at the time of writing, the orcid registry contains records linking individual researchers to more than , , unique digital object identifiers (dois) [ ]. the doi has long been established as a de facto standard for electronic academic publications, originally for journal articles, and now also used for book chapters, datasets, and other outputs. connecting orcid ids for the authors of the outputs to those outputs’ dois is an obvious use case for improving the accuracy of publications records, citation counts, and attribution. given that the workflows already exist for authors to register for an orcid id, for publishers to attach orcid ids to articles [ ], and for doi registration agencies to provide dois for those articles, the logical next step is to add one more piece to the workflow and connect the doi and the orcid id.
crossref is the major doi registration agency for academic journal publishing [ ]. it has established a workflow, in partnership with orcid and academic publishers, enabling automatic updating (auto-update) of researchers’ orcid records when a new article is published [ ]. the process is the culmination of significant collaboration in the sector, and requires minimal effort from researchers. automatic updating has also been set up by datacite for research datasets [ ]. since the workflow is very similar, we will discuss the process as it currently operates for articles; for datasets, the first step involves a data center, rather than a scholarly publisher.
auto-update works simply by adding an additional step to the existing manuscript submission workflow that is already in place for many scholarly publishers.
during manuscript submission, the publisher requests the author’s orcid id, which they provide by signing in to their orcid account and granting permission for the publisher to read their orcid record. when the article is accepted, the publisher then embeds the author’s orcid id in the article and associated metadata, which is sent to crossref to mint a new doi for the published article. crossref detects the orcid id in the metadata, and updates the orcid record with the metadata it has received from the publisher. if the author has not already granted permission, crossref sends a message to their orcid account notifying them that the article is published and the metadata is available.
once this process has taken place, any system linked to the orcid registry can be updated with the publication information, including institutional repositories, research information management systems, and more. since the update contains the doi, the system can then pull a complete metadata record from the authoritative source for the data, populating the researcher’s profile with the new metadata. this can be achieved, usually within h of the doi being minted, without the researcher having to lift a finger.
the ramifications for open research of these enhanced workflows are significant. organizations tracking their research outputs can receive notifications of new outputs within a day of them being published. this accelerates the flow of new information and reduces the time currently spent on manual searches and data entry. a simple check by resolving the doi can verify any rights metadata accompanying the doi, enabling the organization to ensure that articles are indeed open and that they can be recorded and reported as compliant with institutional or funder policies.
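the decision at the heart of auto-update, as described above, can be sketched as a small in-memory simulation. this is not the crossref or orcid api: the `OrcidRecord` class, the `process_deposit` function, the permission flag, and the doi `10.1234/example.1` are all hypothetical names chosen for illustration. the sketch only models the branch the text describes, in which a record is updated directly when permission already exists and the author is otherwise notified.

```python
from dataclasses import dataclass, field


@dataclass
class OrcidRecord:
    """Toy stand-in for a registry record (hypothetical structure)."""
    orcid_id: str
    auto_update_granted: bool = False
    works: list[str] = field(default_factory=list)   # DOIs attached to the record
    inbox: list[str] = field(default_factory=list)   # notifications awaiting the author


def process_deposit(doi: str, author_ids: list[str],
                    registry: dict[str, OrcidRecord]) -> list[str]:
    """Handle one newly registered DOI; return the iDs whose records were updated."""
    updated = []
    for orcid_id in author_ids:
        record = registry.get(orcid_id)
        if record is None:
            continue  # iD not known to this toy registry
        if record.auto_update_granted:
            if doi not in record.works:
                record.works.append(doi)
            updated.append(orcid_id)
        else:
            # No standing permission: notify instead of writing to the record.
            record.inbox.append(f"new work available: {doi}")
    return updated


registry = {
    "0000-0002-1825-0097": OrcidRecord("0000-0002-1825-0097", auto_update_granted=True),
    "0000-0002-1694-233X": OrcidRecord("0000-0002-1694-233X"),
}
updated_ids = process_deposit("10.1234/example.1", list(registry), registry)
print(updated_ids)
```

downstream systems that watch the registry would then see the new doi on the first record and could fetch full metadata by resolving it.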
for datasets, the doi points to a stable, accessible, curated home for the resource for the future, and indicates that action has been taken at a disciplinary or national level to preserve and share the data.
supporting transparency and recognition for reviews
peer review, for publications, datasets, or funding applications, is fundamental to research. it is one of the ways in which researchers contribute to the general health of research, and is a form of disciplinary “good citizenship”. as such, it should be recognized and rewarded as a first-class research activity. to recognize peer review in this way, it must be made more transparent. this transparency is a clear step forward for open research, irrespective of whether or not the review itself is open.
in , a community partnership, led by orcid, f research and the standards body consortia advancing standards in research administration information (casrai) established a working group to define a set of metadata to describe peer review activities, with the aim of exposing it in their information systems [ – ]. this metadata, which includes identifiers for the reviewer, sponsoring organization, and the review itself, can be sent to the orcid registry by a publisher or platform, enabling peer review activities to be included in an orcid record, alongside the researcher’s other outputs and professional contributions. from there, it can be sent to research management systems or funder reporting systems. this functionality has now been rolled out by several organizations, including f and the american geophysical union, and will be available to an increasing number of journals via implementations in systems such as aries editorial manager, ejournalpress, and publons [ – ]. a full citation for an open review can be shared or, for double-blind peer review, the journal can choose simply to acknowledge that the individual has performed reviews.
by using identifiers to expose peer review in this way, as a community we encourage a new level of openness in the system of peer review. by publishing review activities, we can get a fuller sense of the breadth of an individual’s contributions to research. reviewers unambiguously identified can be checked against author lists or known associations, improving the trustworthiness of both the review and the article. by adding yet more transparency to the research system, we can better understand the processes that shape articles, and support more innovation in scholarly communications, such as post-publication peer review.
further possibilities for identifiers and open access
the developments described above set out some of the ways that identifiers are already serving as a tool to enhance the openness of research, and to make specific workflows around oa and related activities more efficient. in the future, they may also serve as a foundation for further steps in the evolution of oa workflows. as these new processes become embedded in scholarly communications practice, they will be the enablers of further gains in efficiency, transparency, and trust: the cornerstones of the open movement.
managing article processing charges more efficiently
one example would be to use identifiers to improve the management of article processing charges (apcs) levied by many oa journals to cover publication costs. there are numerous challenges at present in managing the payment of apcs at an organizational level, but tools exist to overcome or ameliorate these. by linking researchers to their employing organizations (as described in section . . above) and to the organizations that provide their research funding [ ], using funder identifier systems such as crossref funding data or the global research identifier database (grid), eligibility for apc subsidies can be readily established [ , ].
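an eligibility check of this kind can be sketched as a lookup over time-bounded affiliations. the data model, the organization identifiers `org:alpha` and `org:beta`, and the rule itself (the author must hold an employer-asserted affiliation with the paying organization on the submission date) are assumptions made for illustration; real apc policies differ and the essay does not prescribe one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Affiliation:
    """One affiliation entry on a researcher's record (hypothetical model)."""
    org_id: str              # organization identifier (placeholder values below)
    start: date
    end: Optional[date]      # None means "to the present"
    asserted_by_org: bool    # True when the employer, not the researcher, made the claim


def eligible_for_apc(affiliations: list[Affiliation], org_id: str, submitted: date) -> bool:
    """True if an employer-asserted affiliation with org_id covers the submission date."""
    return any(
        a.org_id == org_id
        and a.asserted_by_org
        and a.start <= submitted
        and (a.end is None or submitted <= a.end)
        for a in affiliations
    )


# A researcher who moved from org:alpha to org:beta at the end of 2015.
history = [
    Affiliation("org:alpha", date(2012, 10, 1), date(2015, 12, 31), asserted_by_org=True),
    Affiliation("org:beta", date(2016, 1, 1), None, asserted_by_org=True),
]

# The article was submitted while at org:alpha but appears after the move.
print(eligible_for_apc(history, "org:alpha", date(2015, 11, 20)))
print(eligible_for_apc(history, "org:beta", date(2015, 11, 20)))
```

testing against the submission date rather than the publication date is one way to handle work that is still in press when its author changes employer.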
the ability to delineate affiliations by time is crucial in this regard, as researchers may have moved on to pastures new while the products of their work at a previous employer were still working their way through the publication process. tracking these progressions is a vital component in both career recording and the audit trail for apc expenditure. orcid already has the ability to connect individuals, employers, and funding data, and will continue to enhance its support for other identifiers.
the auto-update functionality described in section . . above can already be used to streamline reporting processes, as noted, but could also serve as a valuable tool for auditing apc expenditure. given that oa publishing represents a major shift for publisher business models, from charging for access to content to providing a publication service, the ability to generate close to real-time reports of publication activity will be a valuable indicator both for publishers and organizations.
finally, the fact that these flows are being built from the ground up to be open and transparent, and rely extensively on apis and other machine interfaces, creates exciting possibilities for other new services to be built on them. for example, the data they provide, and the flow of information they create, could be readily augmented from other sources to develop new kinds of analysis or new understandings of research activity.
understanding oa as both producer and consumer
another potential benefit for oa from the increased use and integration of identifiers is the opportunity for enhanced understanding of publishing output. it would be hypothetically possible, given sufficiently robust uptake of orcid ids by researchers and publishers, to use the orcid id to connect articles to people and funding, assess the balance of subscription versus oa articles for any publisher or subscription portfolio, and tie this to the apcs paid at the organizational level.
this would enable each organization to assess their contribution to oa progress. this intelligence would also provide a valuable resource for governments, such as that of the netherlands, which have set ambitious targets for oa publications [ ]. for library consortia, it would provide a way to track the balance of subscription content versus oa year on year, assessing the impact of the apcs they pay, and comparing it to the remaining subscriptions they purchase. such a tool would also be invaluable for publishers providing content to these consortia, as the objective data would enable both sides to lay the ghost of ‘double-dipping’ to rest [ ].
discussion
it is clear that the pursuit of open access relies on improvements and changes to many of the processes that underpin the research endeavor. it creates new uses for existing research information, and generates demand for new kinds of information. at the same time, existing research information and scholarly communications systems are showing signs of strain. analysis and reporting are labor-intensive and slow. the administrative burden on researchers, and on research managers and administrators, is too high. the transition to a digital research world is by no means complete or effective. improving our use of machine-to-machine communication and the automatic transmission of information across systems is vital if we are to improve on this situation. identifiers play a central role in this improvement, and the power they have to streamline workflows and generate better quality information has been amply demonstrated by the examples of identifier integration provided here. that said, identifiers are just some of the tools that exist to both improve and expand the services that research and researchers depend upon.
increasing openness itself generates new possibilities. open research is not just open to the people of the present, seeking to use, re-use, and better understand our research achievements.
it is open to the future, creating possibilities we haven’t even thought of yet. so, as we build our infrastructures for open research and work to continue to create new possibilities, we must be mindful not to foreclose the future. conflicts of interest: the authors are employed by orcid inc. (alice meadows) and orcid eu (josh brown and tom demeranville). orcid inc. provides some of the services and tools described in this article. orcid eu exists to promote the adoption and integration of orcid identifiers in european research infrastructures and systems. references . budapest open access initiative. available online: http://www.budapestopenaccessinitiative.org/ (accessed on march ). . burnhill, p.; mewissen, m.; wincewicz, r. reference rot in scholarly statement: threat and remedy. insights , , – . [crossref] . registry of open access repository mandates and policies (roarmap). available online: http://roarmap. eprints.org/ (accessed on march ). . higher education funding council for england open access research policy. available online: http://www.hefce.ac.uk/rsrch/oa/policy/ (accessed on march ). . engineering and physical sciences research council policy framework on research data access expectations. available online: https://www.epsrc.ac.uk/about/standards/researchdata/expectations/ (accessed on march ). . public library of science data availability policy. available online: http://journals.plos.org/plosone/s/ data-availability (accessed on march ). . open researcher and contributor identifier. available online: http://orcid.org/ (accessed on march ). . creative commons cc license. available online: https://wiki.creativecommons.org/wiki/cc (accessed on march ). . orcid privacy policy. available online: https://orcid.org/privacy-policy (accessed on march ). . orcid integration chart. available online: https://orcid.org/organizations/integrators/integration-chart (accessed on march ). . 
aryani, a.; barton, a.j.; brase, j.; brown, j.; demeranville, t.; herterich, p.; mcavoy, l.; paglione, l.; ruiz, s.; thorisson, g.; et al. workflow for interoperability. available online: https://figshare.com/articles/d _ _workflow_for_interoperability/ / (accessed on march ).
digital author identifier. available online: https://wiki.surfnet.nl/display/standards/dai (accessed on june ).
names: developing a name authority service to provide unique identifiers for authors. available online: https://www.jisc.ac.uk/rd/projects/names (accessed on june ).
international standard name identifier (iso ). available online: http://isni.org (accessed on june ).
researcherid. available online: http://www.researcherid.com/ (accessed on june ).
scopus preview. available online: https://www.scopus.com/search/form/authorfreelookup.uri (accessed on june ).
organization identifiers in orcid (ringgold identifiers). available online: http://members.orcid.org/api/organizations-orcid-ringgold-identifiers (accessed on march ).
orcid statistics. available online: https://orcid.org/statistics (accessed on march ).
requiring orcid in publication workflows: open letter. available online: http://orcid.org/content/requiring-orcid-publication-workflows-open-letter (accessed on march ).
crossref. available online: http://www.crossref.org/ (accessed on march ).
auto-update has arrived. available online: http://orcid.org/blog/ / / /auto-update-has-arrived-orcid-records-move-next-level (accessed on march ).
datacite. available online: https://www.datacite.org/ (accessed on march ).
faculty of . available online: http://f .com/ (accessed on march ).
consortia advancing standards in research administration information (casrai). available online: http://casrai.org/main_page (accessed on march ).
orcid peer review early adopter program. available online: http://orcid.org/content/peer-review-early-adopter-program (accessed on march ).
hanson, b.; lawrence, r.; meadows, a.; paglione, l. early adopters of orcid functionality enabling recognition of peer review: two brief case studies. learn. publ. , , – .
aries editorial manager. available online: http://www.ariessys.com/software/editorial-manager/ (accessed on march ).
ejournalpress. available online: http://www.ejournalpress.com/ (accessed on march ).
publons partners with orcid to give more credit for peer review. available online: https://orcid.org/blog/ / / /publons-partners-orcid-give-more-credit-peer-review (accessed on march ).
updating awardee orcid records. available online: http://members.orcid.org/funder-workflow#award (accessed on march ).
funding data. available online: http://www.crossref.org/fundingdata/ (accessed on march ).
global research identifier database. available online: https://www.grid.ac/ (accessed on march ).
dutch government policy on open access to academic publications. available online: https://www.government.nl/documents/parliamentary-documents/ / / /open-access-to-publications (accessed on march ).
australasian open access strategy group. blog post: “addressing the double dipping charge”. available online: http://aoasg.org.au/news-updates/blog-summary/addressing-the-double-dipping-charge/ (accessed on march ).

© by the authors; licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc-by) license (http://creativecommons.org/licenses/by/ . /).
bibliodata and scholarly primitives
marcin roszkowski
bibliographical data workflows: discover, analyse, and improve them. . .

bibliodata and scholarly primitives
• focus:
  • process and context oriented perspectives on bibliodata
  • what do you do with bibliodata?
  • where do you place your activities in the bibliodata life cycle?
• scholarly primitives
  • common vocabulary that reflects basic types of activities related to bibliodata
  • shared horizon of understanding
• goal
  • explore the landscape of bibliodata in digital humanities
  • identify stakeholders
  • develop workflows for bibliodata beyond the library catalog

john unsworth* and scholarly primitives
• context
  • digital scholarship in humanities ( )
• concept
  • basic functions common to scholarly activity across disciplines
  • basic type of information behaviour related to the research process
• approach
  • pragmatic and technology-oriented
  • to identify research activities for which we can develop functional requirements for technology that support them
* unsworth, j. ( , may). scholarly primitives: what methods do humanities researchers have in common, and how might our tools reflect this. in symposium on humanities computing: formal methods, experimental practice. king’s college, london. http://people.virginia.edu/~jmu m/kings. - /primitives.html

example: annotation as a technology-supported scholarly primitive
• importance
  • the idea of annotations is strongly related to the concept of interpretation which is one of the basic tasks that scholars in humanities perform
• conceptualization
  • subject of annotation: digital assets, networked assets …
  • mode of annotation: manual, semi-automatic, automatic
  • type of annotation: highlighting, commenting, identifying, linking …
  • access to annotation: private, collective, public
  • context of annotation: website, digital collection, virtual research environment, …
  • …
https://burckhardtsource.org/ + https://thepund.it/
web annotation data model https://www.w .org/tr/annotation-model/

bibliodata related activities and lis
• ifla library reference model*
  • user activities which are the basis for functional requirements of bibliographic records in the library catalog
find: to bring together information about one or more resources of interest by searching on any relevant criteria
identify: to clearly understand the nature of the resources found and to distinguish between similar resources
select: to determine the suitability of the resources found, and to be enabled to either accept or reject specific resources
obtain: to access the content of the resource
explore: to discover resources using the relationships between them and thus place the resources in a context
* riva, p., le boeuf, p., & Žumer, m. ( ). ifla library reference model. a conceptual model for bibliographic information. ifla international federation of library associations and institutions. https://www.ifla.org/files/assets/cataloguing/frbr-lrm/ifla_lrm_ - .pdf

bibliodata activities beyond the library catalog
• lrm activities reflect the nature of the search process and exclude all the things we do with bibliodata beyond the library catalog
• we need a more comprehensive perspective on bibliodata relevant activities rooted in a digital scholarship
• two perspectives
  • how does bibliodata support different scholarly primitives?
  • how to interpret bibliodata through scholarly primitives?

taxonomy of digital research activities in the humanities (tadirah)
activity: description
creation: activity of producing born-digital digital objects (e.g. cataloging)
capture: activity of creating digital surrogates of existing artefacts, or expressing existing artifacts in a digital representation (e.g. conversion, gathering, discovering)
enrichment: enrichment refers to the activity of adding information to an object of enquiry, by making its origin, nature, structure, meaning, or elements explicit (e.g. mappings, pids assignment)
analysis: refers to the activity of extracting any kind of information from open or closed, structured or unstructured collections of data, of discovering recurring phenomena, units, elements, patterns, groupings, and the like (e.g. bibliometric analysis, data quality analysis)
interpretation: interpretation is the activity of ascribing meaning to phenomena observed in analysis (e.g. contextualizing, citations)
storage: making digital copies of objects of inquiry, results of research, or software and services and of keeping them accessible, without necessarily making them available to the public (e.g. archiving, preservation, organizing references)
dissemination: making objects of inquiry, results of research, or software and services available to fellow researchers or the wider public in a variety of more or less formal ways (e.g. publishing, sharing)
meta-activities: activities which, unlike regular research activities, do not apply directly to a research object, but rather to a combination of a research activity with a research object (e.g. teaching, bibliodata policy, bibliodata management)

bibliodata and scholarly primitives
results from polling session

white paper report
report id:
application number: hd
project director: steven hackel (steven.hackel@ucr.edu)
institution: university of california, riverside
reporting period: / / - / /
report due: / /
date submitted: / /

the early california cultural atlas: spatial and digital history and the visualization of colonial california
project director: steven w. hackel, uc riverside
jeanette zerneke, electronic cultural atlas initiative, uc berkeley
natale zappia, whittier college

table of contents
i. brief project summary
ii. project history
iii. methodology and design process
iv. summary of project findings
v. recommendations
  a. interdisciplinary research
  b. outreach
  c. long-term impact
vi. appendix a
vii. figures
viii. selected references

i. brief project summary
the early california cultural atlas (ecca) explores and visualizes the spatial and temporal aspects of the enormous economic, demographic, and ecological changes that shaped colonial california between - .
during this time, the peoples and lands of california were remade by political and biological forces that we are only now beginning to understand in their totality. the establishment of alta california’s mission san diego in initiated many of these dramatic changes, including the spread of deadly diseases, the introduction of foreign plants and animals, large waves of indian migration, and the introduction of new agricultural regimes that undermined indigenous subsistence practices. the ecca includes a website visualizing these historical trends in the region of monterey, california.

in , the ecca received an neh digital humanities level i start-up grant to construct a basic website of historical change in the region of monterey, california. in the process we resolved many technical issues and encountered new historical questions, including how to address geographical and spatial-temporal ambiguity. ultimately, the ecca will place this data and its associated visualizations alongside data from the los angeles basin, a region with an arid climate, an expansive network of juaneño and gabrielino/tongva villages, spanish missions, mexican ranchos, and the pueblo of los angeles. by visually and digitally comparing these distinct colonial centers while simultaneously measuring ambiguity through a data characterization and management matrix (see appendix a), the ecca will reveal a more comprehensive picture that promotes new ways of understanding california before .

ii. project history
the ecca emerged out of the huntington library’s early california population project (ecpp), a project that was completed in with neh funding. the ecpp database contains all the information in the california mission baptism, marriage, and burial records; thus, it holds an extraordinary wealth of unique information on more than , indians, soldiers, and settlers in california. most important for the ecca, the ecpp database lends itself to spatial and temporal analysis.
for, each of the more than fields in every record describes a person or event that can be situated in time and place. beginning in the spring of hackel and jeanette zerneke began to discuss the advantages of displaying ecpp data spatially and temporally through visualizations. they created an interactive map of indian villages at the time the spaniards arrived in the monterey region and linked the ecpp data for indians baptized to a map of the villages from which they came. with this exploratory work complete, in the spring of hackel applied for an neh digital humanities initiative level i start-up grant to fund a series of discussions about how to incorporate new layers of information into the website and how this information could best be made available to teachers, students, and scholars. with level i funding, the ecca team created a phase i website (see attached figures) with google earth visualizations. the team focused on two missions, san carlos and san juan bautista. initial mapping for san carlos relied upon single locations for villages and did not specify exactly how the location of the villages had been determined. in adding san juan bautista to the study, hackel and zerneke plotted multiple village locations for villages and devised a register that explains the source of information for this mapping as well as the relative certainty of these locations. furthermore, they created in this new prototype a means by which simply clicking the cursor on an indian village allows access to basic information on indians who lived in these villages and moved to san carlos or san juan bautista. an innovative and newly designed time bar function shows the movement of those indians to the missions in every year until the village has been depopulated. ecca staff added to the website geo-registrations of historic maps and the boundaries of ranchos created in california during the spanish and mexican periods. 
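the register of variant village locations described above can be pictured as a small data structure. the following python sketch is illustrative only and is not the ecca team's actual implementation: the class names, the field names, and the 0-to-1 certainty scale are all assumptions, and the coordinates and source labels in the usage lines are placeholders rather than project data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CandidateLocation:
    """One proposed site for a village, together with its provenance."""
    latitude: float
    longitude: float
    source: str       # who or what proposed this placement (hypothetical label)
    certainty: float  # relative certainty in [0, 1]; this scale is an assumption


@dataclass
class VillageRegister:
    """All proposed sites recorded for a single village name."""
    village: str
    candidates: List[CandidateLocation] = field(default_factory=list)

    def add(self, latitude: float, longitude: float,
            source: str, certainty: float) -> None:
        """Record one more candidate site without discarding earlier ones."""
        if not 0.0 <= certainty <= 1.0:
            raise ValueError("certainty must lie between 0 and 1")
        self.candidates.append(
            CandidateLocation(latitude, longitude, source, certainty)
        )

    def most_likely(self) -> Optional[CandidateLocation]:
        """Return the candidate with the highest certainty, if any exist."""
        if not self.candidates:
            return None
        return max(self.candidates, key=lambda c: c.certainty)

    def sources(self) -> List[str]:
        """List the sources behind every candidate, in insertion order."""
        return [c.source for c in self.candidates]


# Placeholder values for illustration only; not taken from the project data.
register = VillageRegister("example village")
register.add(36.50, -121.90, "source a", 0.4)
register.add(36.60, -121.80, "source b", 0.7)
print(register.most_likely())
print(register.sources())
```

keeping every candidate, instead of storing only the most likely one, mirrors the stated aim of showing multiple village locations along with the source and relative certainty of each.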
thus, with level i funding the ecca devised new ways to visualize and understand historical transformations that heretofore had only been represented in ways that did not adequately consider spatial and temporal relationships.

the project director made a significant personnel change at the halfway point in the grant period. we had initially planned to hire a graduate student assistant to work on the project, but we could not find one with adequate background knowledge for the project. therefore, to coordinate the compilation of historical data for the project we hired natale zappia, a recent ph.d., an expert on the history of early california, and an assistant professor of history at whittier college. zappia’s contributions quickly emerged as crucial for this project’s completion.

iii. methodology and design process
the ecca is an integrated investigation of the relationship between historical content and new processes for digital scholarship. there is currently momentum in the development and use of spatially aware technologies for a broad range of applications, and there is an amazing increase in users’ ability and willingness to use spatial technologies. however, in the humanities and especially in the discipline of history, technology and methodologies for dealing with spatial and attribute data with varying degrees of certainty and ambiguity have not been fully developed and adopted. this has created a critical barrier to the widespread adoption of spatial-temporal technologies. the ecca is developing methodologies to handle these issues and to create spatial-temporal visualizations that support more complex narratives.

each of the sources of data used to map this study area has its own characteristics of ambiguity. zerneke developed a matrix of methods to analyze the ambiguity by addressing both its source and typology. this information is used to inform appropriate scale, cartography, and associated attributes for the dataset.
several technologies were tested to determine appropriate visualization methods to support users’ understanding of the content. these scholarly processes, which have been developed to integrate content and technology, can be adopted with newer technologies as they become available.

the phase i website was constructed using statistical data from the ecpp, historical maps from online archives, shape files for historical ranchos, and a reference dataset of the north american spanish missions. the demographic data from the ecpp was linked via the individuals’ record of origin to the locations of indian villages. considerable work went into the process of determining appropriate locations for the village names mentioned in the historical records. multiple scholarly sources were consulted and an ontology of village types was developed. the indian village maps and historic maps were overlaid on current satellite imagery.

then several methods of representing baptism data were tested. the current method shows the probable locations of the villages both before and after active baptisms as static locations. during the period when residents of the village were being baptized at the mission, the number of baptisms is represented by the height of a polygon. this allows a quick visualization of the quantity of people who were moving from the village to the mission each year. further work on the initial demonstration now allows users to interrogate the map to see the exact number of baptisms, and it also includes the baptisms occurring at the mission as a comparison. the expansion of spanish and mexican rancho lands is also overlaid to contribute to the understanding of the change in land use. the most recent version of this dynamic map demo can be viewed on the internet at: ecai.org/ecca/googleearthdemos.html (see appendix).
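one way to picture the polygon-height encoding is as a short script that writes kml, the format google earth reads. the sketch below is a hedged illustration rather than the project's code: the function names, the square footprint, the scale of 100 metres per baptism, and the sample row (village name, coordinates, year, count) are all assumptions made for the example. the kml elements it emits (`Placemark`, `TimeSpan`, `Polygon`, `extrude`, `altitudeMode`, `LinearRing`) are standard kml 2.2 elements.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def square_ring(lon, lat, half_width_deg, height_m):
    """Return a closed, counter-clockwise square ring as 'lon,lat,alt' tuples."""
    corners = [
        (lon - half_width_deg, lat - half_width_deg),
        (lon + half_width_deg, lat - half_width_deg),
        (lon + half_width_deg, lat + half_width_deg),
        (lon - half_width_deg, lat + half_width_deg),
    ]
    corners.append(corners[0])  # a KML ring repeats its first vertex at the end
    return " ".join(f"{x:.5f},{y:.5f},{height_m:.1f}" for x, y in corners)


def baptism_placemark(village, lon, lat, year, baptisms,
                      metres_per_baptism=100.0, half_width_deg=0.01):
    """Build one Placemark whose extruded height scales with the baptism count."""
    height = baptisms * metres_per_baptism  # assumed linear scale
    placemark = ET.Element("Placemark")
    ET.SubElement(placemark, "name").text = f"{village} ({year}): {baptisms} baptisms"

    # TimeSpan lets the Google Earth time slider show this polygon for one year.
    span = ET.SubElement(placemark, "TimeSpan")
    ET.SubElement(span, "begin").text = f"{year}-01-01"
    ET.SubElement(span, "end").text = f"{year}-12-31"

    polygon = ET.SubElement(placemark, "Polygon")
    ET.SubElement(polygon, "extrude").text = "1"  # draw walls down to the ground
    ET.SubElement(polygon, "altitudeMode").text = "relativeToGround"
    boundary = ET.SubElement(polygon, "outerBoundaryIs")
    ring = ET.SubElement(boundary, "LinearRing")
    ET.SubElement(ring, "coordinates").text = square_ring(
        lon, lat, half_width_deg, height
    )
    return placemark


def build_kml(rows):
    """Wrap one Placemark per row in a KML Document and return it as a string."""
    kml = ET.Element("kml", xmlns=KML_NS)
    document = ET.SubElement(kml, "Document")
    for row in rows:
        document.append(baptism_placemark(**row))
    return ET.tostring(kml, encoding="unicode")


# Placeholder row for illustration only; not taken from the ECPP database.
sample = [{"village": "example village", "lon": -121.9, "lat": 36.5,
           "year": 1800, "baptisms": 12}]
print(build_kml(sample))
```

emitting one placemark per village per year, each with its own `TimeSpan`, is one simple way to obtain the year-by-year change described above; a single polygon per village with no time span would show only a static total.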
google earth allows complex visualizations with a data-driven timeline, flexible data layering and cartography, links to related resources, and the capability of creating three-dimensional representations of data. exporting a customized set of data layers with dynamic change over time into an embedded google earth, video, or flash demo allows it to be incorporated, with a narrative explanation, into teaching and research materials.

iv. summary of project findings
in our neh level i application, we argued that our project would break new ground by embracing ambiguity, an issue that has bedeviled humanists’ attempts to use new mapping technologies, especially in studies that involve complicated notions of time and space. at the same time, there has been considerable research in defining geographic uncertainty, developing frameworks for representing geographic uncertainty, and devising methods of visualizing uncertainty. most of this research focuses on contemporary gis data for decision-making, visualization of single dimensions of geospatial uncertainty or complex visualization of non-spatial uncertainty. work in spatial information theory provides examples of modeling complex spatial understanding. in this project, we demonstrated that adding the spatial component to temporal analysis leads not only to a deepening of humanistic inquiry, but to a reformulation of the inquiry itself.

this work addresses the issues of ambiguity and uncertainty holistically as a case study in spatial history. each of the sources of data and information available to map this study area has its own characteristics of ambiguity. for instance, the representation of the geography of california in western maps and atlases changed dramatically. the skills of cartography and mapping were improving and there were many voyages of exploration to this region.
it also now seems clear that ecca will force us to fundamentally rethink what we have understood and written about the movements of indians to missions in california. most important, the historical questions we are now asking emerged directly out of visualizations we prepared with level i funding. in our mapping of the movement of indians to two missions in the monterey region (san carlos and san juan bautista) we became aware not only that mission recruitment proceeded steadily outward from each mission, but that in the s and s indians came to the missions from the interior of california, an area previously thought to be far less affected than the coastal region by the growth of mission agriculture and livestock and the creation of spanish and mexican ranchos. thus, we are now asking new questions: if mission encroachment on native subsistence drove indian movement to the coastal missions before , what led indians from the interior of california to the missions after ? furthermore, now that we can see the spatial and temporal patterns of mission recruitment for the monterey region, how might these patterns differ from those of the los angeles basin, a region of greater aridity, greater spanish settlement, and greater cultural diversity? our work in digital history suggests that scholars need to figure out more complicated stories to tell about indians’ movements to the missions and environmental change in early california.

while our project focuses on california, one of our main objectives has been to resolve important issues that have often made it difficult for humanities scholars to adopt emergent technologies. thus, the ecca has at its core a willingness to acknowledge and map ambiguity. typically, computer programs that historians use to map events spatially and temporally do not have a means to handle ambiguity. notably, ambiguity is present in at least three of the important layers of information that we are attempting to map.
first, indian myths, legends, and creation stories describe the california landscape and various places that were part of what we might call a sacred geography. in our level i grant we focused our studies on the monterey region, but we found that the region did not lend itself easily to a mapping of indigenous understandings of land and its uses. we are addressing this problem in our level ii grant application as we endeavor to extend our project to southern california, a region with a richer ethnographic base.

second, there is lively scholarly debate over the exact location of many of the villages that indians inhabited before moving to the missions, and much of the information on the location of these villages comes from the records of franciscan missionaries who had their own understandings of the land and its features. scholars generally solve this problem by simply mapping a single village site that they consider the most likely. in our project, we will continue to map multiple village locations and document the sources of these variant locations. in this sense, we are not attempting to create a simplified version of the past but one that is full of complexity and open to reinterpretation.

third, we confronted ambiguity in our mapping of the diseños, the hand-drawn maps of california ranchos created by grantees and spanish and mexican officials. these maps were never drawn to scale and indicate boundaries by trees, streams, or buildings that often no longer exist. thus, in working with the diseños the ecca is trying to devise new methodologies and best practices for the geo-registration of representational maps. and since issues that surround ambiguity have presented serious impediments to humanists’ attempts to map information from different times and places, our efforts have important implications for work on other regions and time periods.

v. recommendations: the ecca as a tool for interdisciplinary research and outreach
a.
interdisciplinary research the broader technical questions and challenges the early california cultural atlas is addressing are common to the sub-discipline of historical gis or spatial history, yet to an unusual degree the ecca is addressing them in an integrated fashion. the project has brought together multiple datasets with different types of ambiguity, subjectivity, and diversity of perspective. then, using newly available technical tools in experimental ways, it has considered the design of appropriate methods for visualization, navigation, and documentation through complex dynamic maps, which address the differing levels of certainty and scale. the ecca project has built on the research and technical experience of the electronic cultural atlas initiative (ecai), and it has applied lessons learned in that work in new ways. the ecca has taken advantage of the technical developments of many ecai projects including a large-scale collaborative project, the religious atlas of china and the himalaya. development of the ecca did not require creation of new large-scale custom software or database systems. instead, the work leveraged in new ways the early california population project database, historical maps from the david rumsey and library of congress collections, and the tremendous resources of the online archive of california, among others. the technical approach chosen for the ecca also built upon existing technology to create novel and flexible methodologies and tools. google earth has been chosen as the primary spatial-temporal interface for multiple reasons. it is relatively easy to use, allows easy creation of complex and multiple visualizations, and enables publication of static and interactive visualizations on the web. google earth has a very large global client base, with many programmers contributing open source code for specialized functions. 
the data can be made available in kml format for others to use as they wish and is human readable for sustainability. a number of digital history projects address digitizing, cataloging and accessing historical maps, including the cartography of american colonization database (cacd) and digital mappaemundi. however, unlike the mappaemundi project, the ecca uses technologies such as google earth that are based on the standardized framework of latitude and longitude and iso standard dates (year:month:day) and include satellite imagery for context. we can take advantage of this temporal spatial consensus because of the increased adoption of these technologies in widespread applications and the growing community of users familiar with these interfaces. unlike other projects the ecca is developing methodologies that use these tools while still representing the appropriate ambiguity for a variety of data types. ecai’s timemap and the virginia center for digital history’s historybrowser projects address displaying attributes and multimedia resources within a spatial-temporal interface. the historybrowser, now renamed as visualeyes, is an adobe flash and actionscript based integrated map and resource interface. it is a powerful but very specialized tool for creating visualizations using multiple types of attribute links and display. this custom tool creates an easy to use map-based interface. both of these tools may be useful in creating specific visualizations for the ecca project, such as for curriculum materials. however, the goals of the ecca are more complex as not all the resources that are important for the project will be easily importable into a gis style interface. converting information into a format usable in a spatial-temporal interface is a complex activity and is even harder in the humanities. specifically addressing these challenges and determining appropriate methodologies for digital humanities is central to the ecca project. 
the results created allow novel complex spatial-temporal visualizations and from these, new simple or advanced user interfaces can be created. this effort is pioneering a rich new approach to analysis for projects in the humanities that was not possible heretofore.

b. outreach
the ecca has responded to the needs of many constituencies. presenting visualizations of the transformation of early california has enhanced the study of early california by students, scholars, and native peoples. integrating into a single website the materials and resources related to the study of early california and its peoples meets the educational and cultural needs of students, researchers, and native peoples.

at the conclusion of the initial grant period hackel, zerneke, and zappia presented demos of the website and visualizations at academic conferences, to academic colleagues, and to various groups of students and teachers in southern california. they received enthusiastic support for this work from university faculty and elementary school teachers. it is this broad-based response and overwhelming desire for access to this kind of dynamic visualization that has led to the current proposal for level ii funding.

we will continue these efforts with the website design and curriculum development proposed in our level ii application. in our next grant we will respond directly to the california state board of education standards for history and social science. these standards mandate that fourth graders demonstrate an understanding of the physical and human geographic features that define places and regions in california and the social, political, cultural, and economic life and interactions among people of california during the spanish mission and mexican rancho periods. notably, in an average year, there are , fourth graders in california, and nearly all study the california missions.
these students and their , teachers may be the largest and most important audience for our work, and they are a great way to engage a larger public.

c. long-term impact
once completed the project will have an immediate impact on how colonial california is taught. we are developing a curriculum for the project and we expect that the website will be used by many of the , californian th graders each year. our project will also be of use to historians interested in digital history and those interested in using maps to explore ambiguity and diversity in perspectives of time and space. to date, ucr and the huntington, the institutions that have sponsored and hosted this project, have received accolades for their involvement in this work.

vi. appendix a
data characterization and management matrix

data recording/collection

source of ambiguity: unclear documentation
examples: village name ambiguity: spelling variations, unclear writing
proposed methodology: choose the most likely village name or drop it from the data set. document the choice.

source of ambiguity: incomplete attribute documentation
examples: documentation exists but not for all the data of a particular type: age of indians at baptism often estimated and sometimes missing
proposed methodology: use best estimates and document metadata: what are the characteristics of the data set, its source and history. what percent of data is known, what is not known.

source of ambiguity: temporal documentation incomplete or vague
examples: don't know exactly when, or don't have the same degree of accuracy for all items of a data set, e.g., exact founding dates not available for all spanish ranchos
proposed methodology: use dates when available. round to nearest year. choose a default date by which all ranchos were founded. use the default date if actual date is unknown. document this choice in map legend.

source of ambiguity: spatial information incomplete or uncertain
examples: village name is registered in mission records. however, the location of some villages is not recorded
proposed methodology: we can give them a generalized location or drop them from the visualization. proposed methodology: drop from spatial visualization.
examples: historical maps don’t have standardized representation of location and knowledge of north america was incomplete
proposed methodology: geo-register the historical maps approximately where they would fit in comparison to modern satellite imagery to allow comparison of views over time

interpretation of data

source of ambiguity: difference of opinion among scholars about the specific location of a village.
examples: often location is based on interpretation of textual descriptions of the site and/or incomplete documentation. multiple attempts to decide where the villages are have been published
proposed methodology: choose a site based on ranking of reference data sources. include lineage data in data presentation

data classification: what was meant by the origin of an individual in the baptism records?

source of ambiguity: ambiguity in definition of home or village
examples: name may refer to an inhabited region, a specific site or multiple sites inhabited by a group of people
proposed methodology: develop an ontology of village types

source of ambiguity: a person’s documentation is complex
examples: a person may come from multiple villages/locations, perhaps because of intermarriage
proposed methodology: assign first listed village only or count them multiple times? document the choice

vii. figures
figure : ecca website
figure : mission san carlos borromeo de carmelo was established in . here we see in june of the estimated location of native california villages at the time of initial contact. baptisms of indians coming from the village of tucutnut have been recorded.
figures - illustrate changes in demographics and settlements between - .
figure : san carlos borromeo de carmelo, .
figure : san carlos borromeo de carmelo, .
figure : monterey region, . in this area shows a network of secularized mission churches and a large number of land grants owned by mexican landholders.
figure : diseño maps from monterey region.
figure : ecca commodity figures. these tables have been embedded into ecca google earth imaging to visualize land use between - .
figure : ecca website showing village clusters and ambiguous village locations.

karen markey et al. portal: libraries and the academy, vol. , no. ( ), pp. – . copyright © by the johns hopkins university press, baltimore, md .

institutional repositories: the experience of master’s and baccalaureate institutions

karen markey, beth st. jean, soo young rieh, elizabeth yakel, and jihyun kim

abstract: in , miracle project investigators censused library directors at all u.s. academic institutions about their activities planning, pilot testing, and implementing the institutional repositories on their campuses. out of respondents, ( . percent) were from master’s and baccalaureate institutions (m&bis) where few operational institutional repositories (irs) were in place but where interest in learning more about the m&bi experience pertaining to irs was high. comments by these library directors in the miracle study demonstrated their desire to learn more about ir planning and implementation at institutions like their own. we address their comments in this paper, which compares ir activities at m&bis to research universities (rus).

background and objectives

the proliferation of digital forms of the scholarly record raises serious and pressing issues about how to organize, access, and preserve the record in perpetuity. furthermore, teaching materials, institutional records, and special collections are increasingly delivered in digital form. the response of academic institutions has been to build and deploy institutional repositories (irs) to manage the digital scholarship that their learning communities produce and utilize in research and teaching.
to discover the experiences that academic institutions have and the challenges they face during ir planning and implementation, researchers have surveyed research universities, the academic institutions most likely to have an operational ir or an ir implementation project underway. expanding their survey to include liberal arts colleges, clifford lynch and joan lippincott report that only percent of liberal arts colleges have an operational ir. they conclude that "deployment of institutional repositories beyond the doctoral research institutions in the united states is extremely limited."

in search of ir models, best practices, and success factors, miracle (making institutional repositories a collaborative learning environment) project investigators adopted a different strategy. we conducted a census of library directors at all u.s. academic institutions to learn about their involvement with irs, deliberately casting a wide net, knowing we would recruit institutions that had not yet jumped on the ir bandwagon. academic library directors and senior library administrators at master's and baccalaureate institutions (m&bis) who participated in the miracle study revealed a strong desire to learn more about the ir planning and implementation experience from institutions like their own.

the purpose of this paper is to describe the ir planning, pilot testing, and implementation experience of m&bis and to make comparisons to research universities (rus), where most ir efforts have been undertaken to date.
literature review

several surveys of librarians at north american higher education institutions have been conducted in order to gain a better picture of the overall state of ir development; however, the vast majority of these surveys have focused on large research universities (see, for example, kathleen shearer, ; shearer, ; and charles bailey et al., ). whereas shearer focuses on members of the canadian association of research libraries (carl), bailey focuses on members of the association of research libraries (arl). many members of both of these organizations are libraries at large comprehensive research universities and other large research organizations.

a few authors have conducted similar surveys that were more inclusive. for example, lynch and lippincott surveyed cni (coalition for networked information) member institutions, taking care to target liberal arts colleges that were consortial members of cni. however, their resulting respondent pool was quite top-heavy with a large percentage of doctoral universities ( percent). similarly, mark ware and mark ware consulting looked at irs indexed by oaister, irs built on the eprints platform, and a hand-picked selection of irs that were mostly at large research universities (for example, florida state university, georgia tech, mit, ohio state, university of virginia, and virginia tech).

a recent, more broadly based survey from ithaka confirms one of the central findings from the miracle study: the large research universities are advanced in the development of their irs but are not representative of the majority of u.s. colleges and universities. ithaka states that, although "digital repositories are far more common at the research universities than they are elsewhere, …there is nearly uniform interest in these repositories across the spectrum of libraries surveyed."
ithaka goes on to point out specific differences between research universities and small colleges in terms of both the types of content they are storing in their irs and the objectives these types of institutions have for their irs.

just as there is a dearth of ir-related survey data for the small colleges, there is also a dearth of case studies covering irs at these institutions. to generate questions for survey instruments, miracle project investigators compiled an extensive bibliography of ir-related literature (see http://miracle.si.umich.edu/bibliography.html) and identified case studies of u.s. institutional repositories, of which cover irs at research universities where high levels of research were being done. the three remaining case studies examine irs at master's colleges and universities (m&bis). only marianne buehler and adwoa boateng discuss an m&bi going it alone; the other two articles concern irs run by consortia. details about these three cases conclude this literature review.

christopher nolan and jane costanza of trinity university's coates library write about their ir consortium, the liberal arts scholarly repository. their article details the original impetus for their ir, as well as the reasoning behind their selection of both the digital commons and the contentdm platforms. it includes a list of important practice and policy decisions that they suggest library staff consider early on in the ir development process. the authors also delineate the technical features of the digital commons software and discuss the advantages of participating in a consortium and the steps that they have taken to market the ir to both faculty and students. in conclusion, they mention the types of usage statistics that digital commons calculates, speculate as to the potential impacts that the ir may have, and mention their future plans for adding a broader range of content to the ir.
john-bauer graham, bethany skaggs, and kimberly stevens introduce their ir consortium, the cornerstone project. this is a statewide digital repository project run by the network of alabama academic libraries (naal). graham, skaggs, and stevens trace the development of the cornerstone project as well as the involvement of jacksonville state university's (jsu) houston cole library in this project. they discuss jsu's selection criteria for digitization of content, jsu's marketing efforts on behalf of the repository, and the accessibility and permanence of jsu's repository content. additionally, they describe how the library's involvement in the cornerstone project resulted in a new and improved relationship with the archaeology department of the university. in conclusion, the authors offer some advice to other libraries seeking to get involved in similar digital repository projects.

buehler and boateng describe the impact of establishing and operating an ir on the roles and careers of reference librarians. after providing some general background about the impetuses for libraries to create irs, the shifting roles of librarians, and the necessity of marketing the ir, the authors detail the experiences of wallace library at the rochester institute of technology (rit). they briefly describe the composition, goals, and activities of the ir task force that was convened at rit. some of the activities described include an early needs assessment, design and development of the ir interface, and demonstration and marketing of the ir. in conclusion, the authors point out that reference librarians have to take on a new role as change agents in order to get faculty to use and contribute to the ir.
although these three case studies help to illuminate the ir-related experiences of small teaching and learning colleges and universities, many more case studies of this type are needed to get a less idiosyncratic and more well-rounded picture of ir activity at these institutions. in the sections that follow, we present methods and results focusing on the experiences of these institutions in particular.

methods

miracle project investigators conducted their nationwide study of irs from april , through june , . we purchased mailing lists of library director names and addresses from information today's american library directory online and thomson-peterson's service. we sent e-mail messages to , library directors at four-year colleges and universities in the united states, asking them for their participation by first characterizing the extent of their involvement with irs as follows: (1) implementation of an ir, (2) planning and pilot testing an ir software package, (3) ir planning only, or (4) no ir planning to date. in response to their answers to this question, we sent them a link to one of four survey instruments. many of the same questions were listed across two or more of the survey instruments so that comparisons could be made based on the extent of the institutions' involvement with irs.

we used surveymonkey to collect these data online. some directors completed the questionnaires themselves, and others delegated the task to someone else at their institution who was more knowledgeable about the institution's plans for irs. when data collection closed in late june, we cleaned up the data, for example, by deleting empty questionnaires. when data cleaning was done, our study's response rate was . percent; a total of institutions completed questionnaires.
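the screening step described above, in which one answer about the extent of ir involvement decides which of four questionnaires a respondent receives, can be sketched in a few lines of python. this is only an illustration under stated assumptions: the stage abbreviations (imp, ppt, po, np) are our reading of the column labels in the study's tables, and the function names and questionnaire identifiers are hypothetical and do not come from the miracle project.

```python
from enum import Enum


class Stage(Enum):
    """Self-reported extent of IR involvement (the four screening answers)."""
    IMP = "implementation of an ir"
    PPT = "planning and pilot testing an ir software package"
    PO = "ir planning only"
    NP = "no ir planning to date"


# Hypothetical instrument identifiers; the study only says there were four.
INSTRUMENT_FOR_STAGE = {
    Stage.IMP: "questionnaire_imp",
    Stage.PPT: "questionnaire_ppt",
    Stage.PO: "questionnaire_po",
    Stage.NP: "questionnaire_np",
}


def route_respondent(stage: Stage) -> str:
    """Return the instrument a respondent should receive for their stage."""
    return INSTRUMENT_FOR_STAGE[stage]


def tally_by_stage(answers):
    """Count screening answers per stage, e.g. to compare respondent types."""
    counts = {stage: 0 for stage in Stage}
    for stage in answers:
        counts[stage] += 1
    return counts
```

keeping the routing in a single lookup table makes it easy to check that every stage maps to exactly one instrument, which is what allows answers to shared questions to be compared across respondent types.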
types of institutional participants in the miracle study

miracle project investigators asked respondents to characterize their involvement with irs so that they answered questions that were appropriate to their stage in the overall ir effort. the majority ( percent) of this study's respondents have done no ir planning to date, percent are planning for irs only, percent are planning and pilot testing one or more irs, and percent are implementing or have implemented an ir.

to determine whether certain types of institutions were more or less likely to be involved with irs, miracle project investigators turned to the carnegie classification of institutions of higher education, which is "the leading framework for describing institutional diversity in u.s. higher education [and] …has been widely used in the study of higher education, both as a way to represent and control for institutional differences, and also in the design of research studies to ensure adequate representation of sampled institutions, students, or faculty."

investigators used the following six carnegie classifications to characterize study respondents: (1) master's colleges and universities that award at least master's degrees per year, (2) baccalaureate colleges where baccalaureate degrees represent at least percent of all undergraduate degrees and award fewer than master's degrees or fewer than doctoral degrees per year, (3) research universities with very high or high research activity that award at least doctoral degrees per year, (4) doctoral research universities that award at least doctoral degrees per year, (5) special focus institutions where a high concentration of degrees is in a single field or set of related fields, and (6) tribal schools that are members of the american indian higher education consortium.

the population of u.s.
academic institutions that are classified in the carnegie foundation classes that are the object of this paper's study are: (1) . percent master's colleges and universities, (2) . percent baccalaureate colleges, and (3) . percent research universities. miracle study respondents in these same classes are: (1) , . percent master's colleges and universities, (2) , . percent baccalaureate colleges, and (3) , . percent research universities.

table shows the carnegie classifications of miracle study respondents, based on the extent of their involvement with irs. thirty ( . percent) of the respondents whose institutions have implemented irs are from research universities (rus). all but four of the remaining respondents whose institutions have implemented irs come from master’s colleges and universities ( . percent) and baccalaureate colleges ( . percent). institutions involved in ir planning only are more likely to be master’s colleges and universities ( . percent) and baccalaureate colleges ( . percent) and less likely to be rus ( . percent). dominating the no-planning respondent type are master’s colleges and universities ( . percent) and baccalaureate colleges ( . percent). low percentages of ru respondents participating in the miracle study are likely to be planning only ( . percent) or not planning at all ( . percent) for irs.

asked about ir planning, . percent of no-planning respondents generally foresee ir planning beginning within the next months. asked why they have not begun, a large percentage of respondents from m&bis choose these reasons: (1) other priorities, issues, activities, and so on are more pressing than an ir ( . percent), (2) no available resources to support planning ( . percent), and (3) the desire to assess irs at institutions like our own before taking the plunge ( . percent). asked how miracle project activities can help them, several respondents want to learn more about ir implementation from institutions like their own.
in their own words, these requests are:

• "would love to see models in a small, liberal arts college environment, particularly for consortial opportunities." (baccalaureate college from a southeastern state)
• "i believe that a full-fledged ir is beyond our capabilities at this point, but would be interested in continuing to hear about developments in this area, especially in small universities." (baccalaureate college from a central plains state)
• "testimonials that cut to the heart of what each size institution can gain. …from the smallest size institution, this is more than just adding a service. it could relate to a huge percentage of extremely tight resources." (baccalaureate college from a midwestern great lakes state)
• "information on the various options and operating irs at comparable colleges." (baccalaureate college from a northeastern state)

institutional repositories at master’s and baccalaureate institutions (m&bis)

this section addresses the requests from staff at m&bis for information about ongoing ir projects at institutions like their own, especially findings that are unique, distinctive, and different from findings about irs at rus, where the majority of ir projects are underway.

table . carnegie classes and the extent of ir involvement by miracle study respondents. columns: carnegie classes; np, po, ppt, imp, and total, each reported as no. and %. rows: research univs.; doctoral univs.; master’s; baccalaureate; special focus; tribal; unclassified*; total.
* institutions are cche-unclassified because they responded with two or more institutions in partnership-like arrangements.
involvement with irs

questionnaires asked staff how long they have been involved with irs. figure shows m&bi responses in -month ranges. most long-running irs are maintained by rus; however, percent of the operating irs at m&bis have been operational for over three years, as opposed to nearly percent of operating irs at rus. next, . percent of irs at m&bis and percent of ru irs have been operational for less than months. most m&bis report that their planning and pilot-testing activities have been going on for less than months.

ir investigative activities

m&bi and ru respondents from institutions where irs are being pilot tested or implemented are generally in agreement about their ratings for a dozen ir investigative activities. for example, high-rated investigative activities are: (1) learning about successful ir implementations at comparable institutions, (2) learning from reports of other institutions’ ir planning, pilot testing of ir software, and implementation activities to date, and (3) learning about successful implementations at a wide range of academic institutions.

one important deviation comes from m&bi respondents at institutions where ir planning and pilot testing are going on. they rate learning about available expertise and assistance from a library consortium, network, group of libraries, and so on at the top, whereas all other respondents rate this activity exactly in the middle. such a rating may indicate that m&bi respondents who are in the planning and pilot-testing stage intend to partner with other institutions for ir implementation.

questionnaires asked respondents whether they conducted a needs assessment prior to implementing an ir. m&bis with operational irs ( percent) were less likely than rus with operational irs ( percent) to conduct a needs assessment; however, these percentages were about the same for m&bis in the planning and pilot-testing stage ( percent) and rus in the same stage ( percent).
perhaps these lower percentages in investigative activities are indicative of a general acceptance of irs in educational institutions. having monitored what first-generation ir implementers have accomplished, second-generation implementers might not feel that a needs assessment is necessary.

the people involved in the ir effort

questionnaires asked respondents to rate how active people in various organizational roles were in their institution’s ir effort. respondents at m&bis and rus agree that library staff, the library director, and the assistant library director are the most active. when asked who is leading the ir implementation at their institution, respondents do not agree. table shows their responses. at m&bis where ir implementation is underway, the library director leads the ir implementation; and where ir planning and pilot testing is underway, the library director, archivist, or a faculty member may take the lead.

figure . involvement with irs in months

table . people leading the ir effort. columns: individual; implementers (% m&bis, % rus); planning & pilot testers (% m&bis, % rus). rows: library director; a library staff member; assistant library director; your institution’s cio; other; your institution’s archivist; a faculty member; no chair yet appointed; total.
at rus, the library director, the assistant library director, a library staff member, or others such as associate directors of various library functions (such as the associate director for technology in the libraries or associate director of collection development), scholarly communication coordinators, committee chairs, and consortium staff play the leading roles. at smaller institutions, library directors may be more likely to lead the ir effort because they cannot afford to keep a high number of staff members with specialized roles, and they need to maintain staff who cross-train and can work in various functions throughout the library.

the number of people at m&bis and rus who are involved with ir implementation averages . and . , respectively. the average number of people decreases at m&bis to . and increases at rus to . during the ir planning and pilot testing phases. such confusing averages may be due to the small number ( ) of respondents who answered this question from institutions where irs are operational.

operational irs at m&bis

asked how many irs are available to their institution’s learning community, respondents give the numbers provided in table . over percent of m&bis that have implemented irs report only one ir at their institution. in contrast, one-third of the ru respondents identify two or more irs. given the breadth and depth of rus, one or more academic or research units may be offering ir-like systems and services, possibly subject- or discipline-oriented irs, to serve a worldwide network of researchers. additionally, m&bis may have fewer resources than rus; and, thus, they may not be able to implement and maintain more than one ir.

although m&bis are also choosing dspace, they are demonstrating more variety than rus in terms of system selection. table tells how respondents at m&bis and rus characterize their operational irs’ host.
m&bis are opting for various alternatives that do not require them to go it alone, such as obtaining ir services from a consortium, entering into a partnership with a comparable institution, or negotiating with a for-profit vendor. rus are much more likely than m&bis to operate irs on their own because they can afford to have several technicians to manage ir implementation and maintenance.

with regard to ir features, m&bis rate their irs high on technical support. such a high rating is to the credit of the for-profit vendors and consortia from which the majority of m&bis are obtaining ir services. regardless of institutional type, all respondents rate their irs’ features for controlled vocabulary searching and authority control at the bottom of a list of features. respondents from rus also rate their irs’ interface toward the bottom. ir systems generally could benefit from improvements to all three of the system features that people use to retrieve digital content: user interface, controlled vocabulary searching, and authority control.

when asked how likely they are to modify their irs’ software, respondents from the m&bis and rus do not agree on their answers. table shows the results. m&bis are much less likely than rus to modify their irs’ software. since the majority are partnering with other institutions, contracting with vendors of hosted systems, or receiving ir services from a consortium, they are less likely to have staff assigned to the ir who are able to modify the system. at rus, . percent of ir staff report that they are likely to modify their irs, which can involve complex programming knowledge. since most rus have implemented dspace, they have systems staff handling ir maintenance and updates; thus, these staff can accomplish the job of modifying ir software.

respondents from the m&bis and rus are in agreement about sources of funding for the ir.
costs are absorbed in the library’s routine operating costs, by a special initiative supported by the library, a regular line item in the library’s budget, or a grant awarded by an external source.

staff involved in ir planning and pilot testing at m&bis and rus estimate that they will make the decision whether or not to implement an ir over the next . and . months, respectively. ir staff at m&bis and at rus where irs are operational think their institution will retain its current ir system for the next . and . years, respectively.

table . number of irs. columns: no. of irs; implementers % m&bis; implementers % rus. rows: ; ; or more; total.

table . ir hosts. columns: host type; implementers % m&bis; implementers % rus. rows: your institution only; for-profit vendor; partnership with one or more comparable institutions; regional or state-based consortium; total.

contributors to the ir

questionnaires asked respondents who were the authorized contributors to their institution’s ir. table gives the results; a “t” indicates a tied rating. faculty and librarians top the list for both m&bi and ru respondents. librarians and archivists are especially likely to be active contributors due to work assignments connected with digitizing and depositing special collections in the ir. librarians and archivists may also act as proxies for faculty and research scientists who want to deposit content in the ir but have no time to do it.

at m&bis, undergraduate students are much more likely to be active contributors than at rus. when asked who is the major contributor to their institution’s ir, a respectably large percentage ( . percent) of respondents at m&bis singled out undergraduates (table ). in fact, at m&bis, undergraduates are as likely as faculty to be the major contributor to the ir. this is a major way in which the nature of m&bis as teaching-focused institutions is reflected in the irs’ collection.
questionnaires asked respondents what they thought would be the most important reasons why people would contribute to the ir and provided a list of options from which to choose. respondents from both institution types consistently give very high ratings to three reasons: (1) to expose the particular scholar’s intellectual output to researchers in north america and around the world who would not otherwise have access to it, (2) to boost the particular scholar’s prestige, and (3) to increase the accessibility to knowledge assets such as numeric, video, audio, and multimedia databases.

there is less agreement between m&bis and rus on other reasons for contribution. respondents at m&bis also highly rate “to place the burden of preservation on the ir instead of on individual faculty members,” whereas ru respondents view this factor as less of an impetus for people to contribute to the ir. ru respondents generally rate the statement “to expose your institution’s intellectual output to researchers in north america and around the world who would not otherwise have access to it” in the middle of the pack. m&bi respondents put this reason third on their lists. perhaps m&bi respondents feel that irs have the potential to level the playing field, giving small institutions the same mechanism as larger research institutions for making their institution’s scholarship accessible to people around the world.

table . likelihood of modifying ir software. columns: responses; implementers % m&bis; implementers % rus. rows: very likely; somewhat likely; somewhat unlikely; very unlikely; don’t know; total.
on the other hand, respondents at m&bis rate the statement “to increase citation counts to the particular scholar’s oeuvre” toward the bottom third of the list, whereas respondents at rus rate it second. in the absence of open-ended remarks accompanying questionnaire responses, it is difficult to explain why the m&bi respondents rate this reason so much lower than the ru respondents do. perhaps citations are much more important to ru faculty because they increase their chances for promotion, tenure, and merit increases, whereas m&bi faculty are also rewarded for excellence in teaching and service.

table . authorized contributors to operational irs. columns: contributor; implementers m&bis (rank, %); implementers rus (rank, %). rows: faculty; librarians; graduate students; undergraduate students; archivists; research scientists.

table . the major contributor to operational irs. columns: contributor; implementers m&bis (rank, %); implementers rus (rank, %). rows: faculty; undergraduates; graduate students; librarians; archivists; academic support staff; other.

respondents at m&bis agreed entirely with respondents at rus with regard to the most successful content recruitment methods: (1) staff working one-on-one with early adopters, (2) personal visits by ir staff to faculty and administrators, and (3) presentations by ir staff at departmental and faculty meetings.

respondents’ answers to a question about what was likely to inhibit their ability to deploy a successful ir reveal differences between respondent types, not between institution types. all respondents rated contributors’ lack of knowledge about how they can benefit from irs at or close to the top. m&bi respondents who are planning and pilot testing irs rate convincing faculty that the ir will not adversely affect the current publishing model in the middle of a list of reasons, whereas ru respondents where irs have been implemented rate it second.
this finding is noteworthy because it emphasizes how important the current publishing model is to faculty, research scientists, fellows, and other research personnel at rus. this model drives their behavior because their institution’s reward structure is intimately tied to it. efforts like irs that have the potential to change the model may be viewed with skepticism and suspicion, so ir staff may have to emphasize ir benefits to faculty to alter their preconceived notions about its relationship to the publishing model. the lower levels of concern among staff at m&bis may be due to the more balanced emphasis on research, teaching, and service that exists at these institutions. absence of campus-wide policies mandating contributions of certain material types to irs was an inhibiting factor that concerned staff at institutions where irs have been implemented more than it concerned staff at institutions involved in ir planning and pilot testing. such mandates may be more important to the former staff than the latter because the former have an operational ir at hand and feel an urgency to populate this tool with substantive content. ir staff were in agreement about the ratings for other inhibiting factors except for one: competing for resources with other priorities, projects, and initiatives. this was a factor that concerned staff where ir planning and pilot testing was going on, regardless of institution type. since these respondents are still in the planning phase of an ir effort, they may be more sensitive about competing for resources because their institution’s decision-makers have not yet made the commitment to implement an ir, which makes derailment of the ir effort a real possibility.

discussion

in a nationwide study of institutional repositories in u.s.
academic libraries, most of the institutions that have not begun planning for institutional repository (ir) services are master’s colleges and universities and baccalaureate colleges (m&bis). a number anticipate that they will begin ir planning in the next three years. although securing resources for the ir effort may be a problem, they want to learn what similar institutions are doing about irs. this paper responds to their pleas in this regard. at m&bi institutions where no ir effort or only ir planning is underway, library directors are taking the lead because they probably have fewer obstacles fielding inquiries from provosts and financial officers about funding and capital expenditures, from the chief information officer about needed technical expertise and equipment, and from the college archivist about conflicting roles and responsibilities. library staff at m&bis still have much to contribute. knowledgeable about faculty and students, they can assess the faculty’s interest in making submissions to the ir, learn from faculty what students are likely to submit, and collect faculty and student submissions during the pilot-testing phase of the ir effort. library staff at m&bis are interested in learning more about irs, especially concerning information pertaining to successful implementations at institutions like their own. specifically they want to know about best practices, case studies, policy development, the benefits of irs, ir system reviews, and research findings, such as findings from this miracle project study.
m&bi respondents at institutions where ir planning and pilot testing are underway rate learning about available expertise and assistance from a library consortium, network, group of libraries, and so on at the top; consequently, they may be more likely to seek ir services from various affiliated groups. respondents expressed their interests regarding consortia through their answers to open-ended questions. examples are:

• "we are in the process of investigating ir systems and are in talks with other colleges about our digital needs. a consortial agreement for an ir system would be ideal." (planning-only respondent at a small private liberal arts college in a great lakes state)
• "provide information about collaboratives, either within a consortium, a system, or amongst institutions with similar needs." (planning-only respondent at a mid-sized master's university in a northern great lakes state)
• "offer guidelines for partnering with other institutions." (no-planning respondent at a small public baccalaureate university in the mountain west)
• "best practice, identification of institutions like ours who have succeeded, formation or information about collaborative groups who have (or will have) a shared ir that we can join. we see a shared system as one of the more viable options." (no-planning respondent at a small private liberal arts college in the central atlantic states)
• "would love to see models in a small, liberal arts college environment, particularly for consortial opportunities." (no-planning respondent at a small master's university in the southeast)

deploying a successful ir depends on contributors. at both m&bis and rus, faculty, librarians, and graduate students are likely to be authorized contributors to irs. m&bis are more inclined than rus to accept undergraduate students as contributors. undergraduates may be the major contributor to irs at m&bis, but not at rus.
the emphasis on student contributions may be an important distinction between m&bis and rus. ir staff at both m&bis and rus are concerned that faculty will not contribute to the ir because faculty think it might upset the current publishing model, which is familiar and on which their institution’s reward structure is built. m&bis and rus also share concerns about contributor inactivity due to contributors’ lack of knowledge about how the ir can benefit them and are conflicted about campus-wide mandates requiring contributions of certain material types to irs. an examination of this article’s limitations is appropriate. since few m&bis have operational irs, the number of m&bi institutions with operational irs who participated in the miracle project study is small. questionnaires enumerated several questions about ir policies, document types in pilot test and operational irs, and the benefits of irs that had many response categories. there were too few m&bi respondents for miracle project investigators to detect differences or trends between institution types based on the response categories they chose. follow-up surveys and censuses should focus exclusively on m&bis. more case studies of ir implementations at m&bis are also needed. m&bi staff are ready for irs, and they want to learn about successful ir efforts at comparable institutions.

conclusion

both master’s colleges and universities and baccalaureate colleges (m&bis) and research universities (rus) call this new discovery tool an institutional repository but, in time, we may see irs at m&bis that look qualitatively different from the irs at rus. for example, the digital contents of irs at the former may be more oriented toward teaching objects than the products and by-products of research.
although we have no empirical evidence of this due to the small number of m&bi respondents with operational irs who participated in the miracle project study, future researchers who survey the state-of-the-art in ir implementation should compare the contents of irs at primarily teaching and primarily research institutions to determine differences in the contributors, content, and end users of these tools. it may be that irs in m&bi institutions require new definitions, names, qualifiers, users, and uses. many m&bis have not begun planning for institutional repository (ir) services. the miracle project study demonstrates that large numbers of these institutions will begin ir planning in the next three years. this paper responds to some of their requests for information about the ir deployment efforts of comparable institutions. at m&bis where ir planning has not begun, there is a sleeping beast of demand regarding irs. they want to know how much irs cost to plan, implement, and maintain, and what comparable institutions are doing with regard to irs. their interest in irs is a wake-up call to their colleagues at other-than-research universities to share with an audience that is eager to learn what their peers have to say about their success stories as well as cautionary tales about irs.

karen markey is professor, school of information, university of michigan, ann arbor, mi; she may be contacted via e-mail at: ylime@umich.edu. beth st. jean is a doctoral student, school of information, university of michigan, ann arbor, mi, and a graduate student research assistant for the imls-funded miracle project; she may be contacted via e-mail at: bstjean@umich.edu.
soo young rieh is assistant professor, school of information, university of michigan, ann arbor, mi; she may be contacted via e-mail at: rieh@umich.edu. elizabeth yakel is associate professor, school of information, university of michigan, ann arbor, mi; she may be contacted via e-mail at: yakel@umich.edu. jihyun kim is a doctoral candidate, school of information, university of michigan, ann arbor, mi; she may be contacted via e-mail at: jhkz@umich.edu.

computer-supported collation of modern manuscripts: collatex and the beckett digital manuscript project

ronald haentjens dekker, department of it r&d, huygens institute for the history of the netherlands, royal netherlands academy of arts and sciences, the netherlands
dirk van hulle, department of literary studies, university of antwerp, antwerp, belgium
gregor middell, institut für deutsche philologie, universität würzburg, würzburg, germany
vincent neyt, department of literary studies, university of antwerp, antwerp, belgium
joris van zundert, methodology research program, huygens institute for the history of the netherlands, royal netherlands academy of arts and sciences, the netherlands

abstract

interoperability is the key term within the framework of the european-funded research project interedition, whose aim is ‘to encourage the creators of tools for textual scholarship to make their functionality available to others, and to promote communication between scholars so that we can raise awareness of innovative working methods’.
the tools developed by interedition’s ‘prototyping’ working group were tested by other research teams, which formulate strategic recommendations. to this purpose, the centre for manuscript genetics (university of antwerp), the huygens institute for the history of the netherlands (the hague), and the university of würzburg have been working together within the framework of interedition. one of the concrete results of collaboration is the development and fine-tuning of the text collation tool collatex. in this article, we would like to investigate how the architecture of a digital archive containing modern manuscripts can be designed in such a way that users can autonomously collate textual units of their choice with the help of the collation tool collatex and thus decide for themselves how efficiently this digital architecture functions: as an archive, as a genetic dossier, or as an edition. the first part introduces collatex and its internal concepts and heuristics as a tool for digitally supported collation. how this tool can be integrated in the infrastructure of an electronic edition is discussed in part two. the third and final part examines the possibility of deploying collatex for the collation of modern manuscripts by means of a test case: the beckett digital manuscript project (www.beckettarchive.org).

correspondence: joris van zundert, huygens institute for the history of the netherlands, royal netherlands academy of arts and sciences, po box , lt, the hague, the netherlands. email: joris.van.zundert@huygens.knaw.nl

digital scholarship in the humanities. © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com. doi: . /llc/fqu. advance access published december ,
computer-supported collation with collatex

following john unsworth’s textual scholarship workflow typology of scholarly primitives (unsworth ), it is clear that text comparison is pivotal to any kind of textual scholarship. the role of text comparison becomes paramount in any scholarly editing project that involves critical enquiries about the edited text witnessed in multiple versions. conducting such a collation is tedious and error-prone work, especially because the required attention to detail is highly exacting compared with the repetitive and mechanical nature of the task. from that perspective, this type of work seems an ideal candidate for automation, not only because computers can support users in tedious error-prone duties rather efficiently, but specifically because the number of versions is often so large that it is simply not feasible any more to compare each witness against another manually. the application of computers or other apparatuses to support the collation of texts already has a long-standing tradition in and of itself (smith ), reaching back at least to the usage of optomechanical devices like those pioneered by charlton hinman. since then, the semiautomatic collation of texts has been a well-established area for the application of software, offering support in managing large text traditions, in comparing predetermined passages of different versions as well as in storing and rendering the results. however, it continued to be the user’s duty to orchestrate the whole process and guide the computer in comparing relevant passages by manually calibrating the complex input to make it fit rather basic comparison algorithms.
recent advancements in the field of computational biology, a field closely related when viewed from the computational perspective, resulted in renewed attempts to further the degree of automation achieved thus far in the comparison of natural language texts. protein sequences, not unlike texts in natural language, can be modeled as sequences of symbols, whose differences can be understood as a set of well-defined editing operations (levenshtein ), which transform one sequence into another and can be computed. the analogy goes even further, as the consecutive evaluation of assumed editing operations between protein sequences on the one hand and texts on the other hand bears striking similarities, as they often provide the basis for further stemmatic analysis and genetic reasoning (spencer and howe, ). the only subtle but crucial problem with this analogy is that, while biologists can afford to leave aside methodological questions about the intentionality of assumed ‘editing operations’ on protein sequences, philologists cannot always base their reasoning on computed differences between texts. the prime example for this kind of difference is a transposition of two passages, which has been explicitly marked by the author in the manuscript (for instance, via arrows or a numbering scheme) and interpreted as such by the editor, but the intentionality of which cannot be computed deterministically by collation software, even if the transposition itself can be detected. in addition to this dilemma, there are numerous workflow-related challenges to surmount when proper integration of collation software with a digital editing environment becomes a concern. it makes computer-supported collation not only a computationally complex but also an architecturally challenging problem for software developers.
as in the traditional trilogy of ‘recensio’, ‘examinatio’ (including ‘collatio’), and ‘emendatio’ (see for instance grafton et al., , p. ), the collation of digital texts is again central to the editorial workflow, with consecutive architectural dependencies on many adjacent building blocks of the editing environment. the workflow may apply specific modeling and encoding of text versions as well as possible automatic linguistic annotation like part of speech tagging to be able to compare texts in a more expressive manner than on plain character string level. this might be the case for instance where lemmatization is applied. also, specific workflows might require the ability of a human to interact with the collation result or process. certain idiosyncrasies of witnesses and their texts, for instance, might have to be modeled and/or encoded by human intervention where optical character recognition does not serve well, e.g. in the case of visual poetry; or researchers may need to intervene in the process of automatic linguistic annotation of textual versions to make them comparable in a more sensible way, for instance, in supervised learning methods or in cases of languages that are poorly supported by the current state-of-the-art in natural language processing, such as medieval dutch or neo-latin; or it may be necessary to manually annotate the output of densely marked-up and interconnected text versions as a result of their comparison yielding differences previously unnoticed. many newer approaches to the problem of collation offer interesting solutions to the computational challenge, but most of them do not fully address the architectural challenges, nor do they approach the problem as one which can only be solved semiautomatically given the methodological framework of its domain.
consequently, existing solutions either remain within the realm of decision-support systems, which mainly help scholars keep the overview while producing essentially handcrafted collation results and transforming them into a commented critical apparatus, or they automate the collation in a way that is tailored to a specific use case and/or runtime environment. the general applicability of the latter approach can then only be approximated via quantitative properties of its specific input and the accuracy of the achieved results. in contrast, we would like to offer a third, rather pragmatic approach, in which we first dissect the problem of collation into smaller more manageable subproblems and then show by an example how each of these subproblems can be addressed in a way that is more fitting to its application domain and with a higher chance of applicability to the variety of requirements stipulated by the many different scholarly environments in which the collation of texts and its adjacent scholarly tasks have to be performed.

comparing existing solutions

our example for this approach is collatex, a prototypical collation tool, developed in the context of interedition. shortly after the project started, it became clear that a proper requirements analysis for a versatile collation tool would need input from a range of stakeholders as wide as possible, including users and interested developers as well as implementers of existing solutions. a collation summit and a collation workshop were held in gothenburg and brussels in , co-organized by the european cost action ‘open scholarly communities on the web’, which invited implementers of three collation tools (literary scholars, digital humanists, and developers of xml database software) to discuss conceptual commonalities between their fields of expertise as they relate to the collation of texts.
the immediate result was the agreement on a modularization of the digital collation process into a couple of well-defined steps, which, if applied in order and/or iteratively, allows the collation of texts to be supported more flexibly by implementations adhering to this separation of concerns. four basic steps were defined. the first is the tokenization of digital texts to be compared, in effect the segmentation of the texts into the sequence of tokens that will be compared. the second step is the alignment of tokens from different texts, which essentially identifies which segments of tokens match between the texts, effectively also identifying where the compared texts differ and thus implying or assuming edit operations in those places. the third step is the analysis of the computed alignment, which introduces an interpretative aspect into the process as edit operations are now qualified (e.g. as deletion, addition, or transposition). the fourth and final step is the output/visualization of collation results. the workflow of these four steps has since become informally known as the ‘gothenburg model’. we will explain the various steps in more detail below.

although any collation software can compare texts on a character-by-character basis, in the more common use case, before collation each text (or comparand) is normally split up into segments or tokens and compared on the level of the token rather than on the character-level. this familiar step in text (pre)processing, called ‘tokenization’, is performed by a tokenizer and can happen on any level of granularity, for instance, on the level of syllables, words, lines, phrases, verses, paragraphs, text nodes in a normalized xml dom instance, or any other unit suitable to the texts at hand. another service provided by tokenizers as defined in our model relates to marked-up texts.
as most collators compare texts primarily based on their textual content, embedded markup would usually get in the way and therefore needs to be filtered out (but must be kept as stand-off annotations during tokenization) so the collator can henceforward operate on tokens of textual content. annotations must be kept because it might be valuable to have the markup context of every token available, for example, if one wants to make use of it in the comparison of tokens during the alignment step (see below). the schematic diagram on the left in fig. depicts this process: the upper line represents a comparand, each character a, b, c, and d, an arbitrary token, and the xml tags e and e are examples of embedded markup. a tokenizer transforms this marked-up text into a sequence of individual tokens, each referring to its respective markup/tagging context. from now on, a collator can compare tokenized comparands to others based on their tokenized content and does not have to deal with their specific notational conventions anymore, which are often rather specific to a particular markup language, dialect, or project. during the tokenization step, it is also possible to normalize each token, so the subsequent comparison can abstract from certain specifics, such as case-sensitivity or even morphological variants. in most use cases, we have found that abstracting away from such specifics yields useful collation results. however, it should be noted that there is no principal methodological or technical reason to enforce such abstraction. in cases where specifics would turn out to be useful as information for alignment of comparands, the model allows us to take into account such specifics.

when the comparands have been tokenized, a collator will align the tokens of all comparands involved.
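The tokenization step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the CollateX tokenizer: it assumes word-level granularity, lowercasing as the only normalization, and simple XML-style inline tags. The names `Token` and `tokenize` and the sample tags `e1` and `e2` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    content: str      # the token as it appears in the comparand
    normalized: str   # form used for comparison (assumption: lowercased)
    context: tuple    # names of the markup elements enclosing the token


# Matches a tag (opening, closing, or self-closing) or a run of word characters.
_PIECE = re.compile(r"<(/?)(\w+)[^>]*>|(\w+)")


def tokenize(text):
    """Split marked-up text into word tokens, keeping markup as stand-off context."""
    tokens, open_tags = [], []
    for match in _PIECE.finditer(text):
        closing, tag, word = match.groups()
        if word is not None:
            tokens.append(Token(word, word.lower(), tuple(open_tags)))
        elif match.group(0).endswith("/>"):
            continue  # self-closing tags enclose no tokens
        elif closing:
            # Close the innermost open element with this name, if there is one.
            if tag in open_tags:
                while open_tags.pop() != tag:
                    pass
        else:
            open_tags.append(tag)
    return tokens


if __name__ == "__main__":
    for token in tokenize("<e1>A b</e1> <e2>C</e2> d"):
        print(token.normalized, token.context)
```

Keeping the enclosing element names on each token is one simple way to retain the markup context for later steps; a real tokenizer would also need to handle attributes, namespaces, and punctuation according to the needs of the edition.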
aligning comparands implies the matching of equal tokens and the insertion of ‘empty’ tokens (so-called ‘gap tokens’) in such a way that the token sequences of all comparands line up properly. as mentioned before, this specific task of a collator is computationally similar to the problem of sequence alignment, as it is also encountered, for example, in computational biology. looking again at an example (fig. , center diagram), we assume that three texts (each depicted in its own column) are being compared: the first consists of the token sequence ‘abcd’, the second reads ‘acdb’, and the third ‘bcd’. a collator might align these three comparands as depicted in a tabular manner. each comparand occupies a column, matching tokens are aligned in a row, and necessary gaps as inserted during the alignment process are marked by means of a hyphen. depending on the perspective from which one translates this alignment into a set of editing operations, one can conclude, for example, that the token ‘b’ in the second row was omitted in the second comparand or that it was added in the first and the third. a similar statement can be made about ‘b’ in the last row by just inverting the relationship of being added/omitted.

fig. schematic representation of the tokenization (left), alignment (middle), and analysis (right) phases of a collation workflow

in addition to atomic editing operations computed in the alignment step, a further analysis of the alignment, conducted by the user and supported by the machine, can introduce additional interpretative preconditions into the process. repeating the previous example in fig. (right diagram), one might interpret the token ‘b’ in columns and as being transposed instead of as being simply added and omitted.
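The tabular alignment with gap tokens can be illustrated for the first two comparands of the example. The sketch below swaps in a plain longest-common-subsequence alignment as a stand-in; it is not the CollateX aligner, it handles only two comparands, and it does not detect transpositions. The names `align` and `GAP` are hypothetical.

```python
GAP = "-"


def align(a, b):
    """Align two token sequences on a longest common subsequence, padding with gaps."""
    n, m = len(a), len(b)
    # lcs[i][j] holds the length of the longest common subsequence of a[i:] and b[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = 1 + lcs[i + 1][j + 1]
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    rows, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            rows.append((a[i], b[j]))   # matching tokens share a row
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            rows.append((a[i], GAP))    # token only in the first comparand
            i += 1
        else:
            rows.append((GAP, b[j]))    # token only in the second comparand
            j += 1
    rows.extend((token, GAP) for token in a[i:])
    rows.extend((GAP, token) for token in b[j:])
    return rows


if __name__ == "__main__":
    for row in align(list("abcd"), list("acdb")):
        print(*row)
```

For 'abcd' and 'acdb' this yields one row per aligned position, with 'b' standing against a gap once in each comparand. Whether those two rows are read as an omission plus an addition or as one transposition is left to the analysis step.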
whether these two edit operations actually can be interpreted as a transposition ultimately depends on the judgment of the editor and can at best be suggested, though not conclusively determined, via unambiguous heuristics. that is why an additional analytical step, in which the alignment results are augmented (and optionally fed back as preconditions into the collator), appears as essential to us in order to bridge the methodological ‘impedance’ between a plain computational approach to the theoretical problem and the established hermeneutical approach taken in practice. in some cases, even human interpretation may of course not determine decisively whether an actual transposition took place. we may have to conclude that some cases of potential transposition cannot be determined with absolute certainty.

the obvious remaining step is the output of the collation results, which is again a complex task. the requirements here range from the encoding of the results according to various conventions, markup dialects, and formats required by other tools to the visualization of results in multiple facets, be it in a synoptic form, either as a rendering focusing on one particular text and its variants, or as a graph-oriented networked view, offering an overview of the collation result as a whole.

after establishing this separation of concerns, implementers of collation-related software can henceforth focus on specific problems. for instance, the collation tool juxta has had a feature-rich tokenizer for xml-encoded texts since version . , which has been extended constantly in consecutive versions. juxta also has support for larger comparands as well as stand-off annotations and is available as a self-contained software library for reuse in other tools. comparable work is ongoing to generalize juxta’s visualization components.

comparing alignment algorithms

the main emphasis of collatex’s development is on improving the alignment step.
as mentioned in the introduction, aligning sequences of symbols is a well-known problem in computer science having many applications, notably in the field of computa- tional biology. it has also been noted that the adop- tion of existing sequence alignment algorithms for use in the context of philology poses several prob- lems, some of a conceptual methodological nature and some of a practical technical nature. rather than providing a complete account of the pros and cons of particular algorithms, a task better undertaken elsewhere, we would like to draw atten- tion to three recurring criteria, on which the quality of recent alignment algorithms is evaluated: . transposition detection detecting arbitrarily transposed passages in versions of a text is a much harder problem when done in the context of sequence align- ment than computing insertions, deletions, and substitutions. schmidt concludes his ana- lysis of this problem (schmidt ) with a pragmatic solution by stating that, given an np-complete computational problem and no guaranteed correspondence between an opti- mal computational result and the outcome desired by the user, a heuristic algorithm might be the best solution. accordingly, algo- rithms that try to detect transpositions do so heuristically and refer to benchmarks measur- ing computationally detected transpositions against manually predetermined ones. . support for flexible token matching the well-known distinction between substan- tial versus accidental variants as well as other factors, like orthographic variation, require alignment algorithms to match tokens more flexibly than just via exact character matching. some algorithms use edit distance–based thresholds for this purpose (e.g. spencer/ howe’s or juxta’s), whereas others rely on lookup tables predefined by the user, which computer-supported collation of modern manuscripts digital scholarship in the humanities, of judgement , , , , vs. 
list possible mappings of tokens to match them despite their differing character content.

3. base-text-/order-independence
alignment algorithms like juxta’s compare versions one-on-one, so that as soon as more than two versions are to be compared, the task has to be reduced to pairwise comparison of two versions at a time and consecutive merging of the pairwise results. spencer and howe have shown the potential functional dependence of such a unified result on the order in which pairwise comparisons are merged. this poses a problem for genetic research based on such results, since a suitable order in which the pairwise comparisons should be merged depends on a hypothesis about which texts are more closely related to each other and whose comparison results should consequently be merged first (spencer and howe, ).

collatex’s aligner tries to tackle all of these problems by following the modularization outlined in the section above and by finding ways to align tokens that do not inherit the trade-offs of existing sequence alignment algorithms. as such it has to be characterized as experimental, but at the same time it already yields promising results.

. comparing texts with collatex

this section gives an overview of the major concepts by which collatex aligns tokens of comparands. we begin by explaining the basic challenge of aligning two comparands including the detection of transpositions and extend the challenge stepwise up to the alignment of multiple versions.

most alignment algorithms work on the basis of the following editing operations: insertion, deletion, and substitution. these operations are well defined, e.g. via levenshtein’s concept of the edit distance. a frequently recurring problem when comparing two versions of a text is the phenomenon where a passage of a text has been moved between them (i.e. transposed).
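To make the three editing operations concrete, the following is a minimal sketch of a token-level Levenshtein distance. It is a textbook dynamic-programming formulation, not CollateX's aligner; the function name `edit_distance` and the toy token sequences are assumptions chosen for illustration. It shows that an algorithm restricted to insertion, deletion, and substitution has no way to report a moved passage as a single operation.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences.

    Only insertion, deletion, and substitution are allowed (cost 1 each).
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Two swapped tokens are counted as two substitutions.
swapped = edit_distance("a b c".split(), "c b a".split())
# A token moved from the end to the start is counted as one deletion
# plus one insertion.
moved = edit_distance("a b c d z".split(), "z a b c d".split())
print(swapped, moved)
```

In both toy cases a reader would describe the change as one transposition, while the distance counts two independent edits.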
moreover, transposed passages of a text usually are not transposed literally, but contain small changes on their own, which makes the chal- lenge to detect these even harder. alignment algo- rithms that are constrained to the editing operations just mentioned will regard a transposition either as a deletion and an insertion or, in case two passages have been swapped, as two substitutions. collatex releases this constraint by handling transpositions as an additional kind of editing operation and trying to detect those operations. to start with a trivial case, detection of transpos- itions is easy when all tokens in the compared ver- sions are unique (fig. ). when we look at the different tokens from the two versions in each position, then it is easy to see that ‘a’ is transposed with ‘c’ and ‘c’ with ‘a’. apart from all tokens being unique, the previous example also assumes that moved passages of text are exactly one token long. in the next example (cf. fig. ), we drop this constraint as well. the desired result would be that the sequence ‘a b c d’ is transposed with ‘z’. the trivial approach described above for the detection of transpositions would not work in this case. real-world cases of transposition involve arbitrary length sequences of tokens moving over seemingly arbitrary distances in text, in the process, more often than not, also chan- ging the internal order of the sequence to various extents. to solve this problem, we need a more elab- orate form of token matching. to this purpose, we use a match table, which is a document-to-docu- ment matrix, allowing us to compare two variant witnesses, each word to each word. let us first con- sider a case where there is no variation (cf. fig. ). we put the tokens of witness as the column head- ers of the matrix and the tokens of the identical fig. less trivial case of transposition fig. trivial case of transposition r. haentjens dekker et al. 
witness as the row headers of the matrix. we then simply mark cells that have identical row and column headers. this simple case reveals an essential aspect of document-to-document matrices for variant detection: in general, the ‘path’ from the upper left corner to the bottom right corner that deviates the least from the exact diagonal corresponds to the similarity a reader would assume between two texts. a reader would not assume, for example, that the first ‘the’ in the horizontally depicted witness is actually to be identified with the second ‘the’ in the vertically represented witness, or vice versa. thus, we assume the ‘conclusion’ depicted for the generalized case in fig. a to be invalid, and the solution in fig. b is preferred. this case also gives us a hard constraint for any algorithm design: we cannot select more than one token for each row and/or column, i.e. we cannot have two tokens simultaneously in one position.

now consider a case of transposition. text a is a six-sentence (or ninety-seven–word) quote from samuel beckett’s stirrings still. when we compare two identical copies of this text in a matrix as explained above, we arrive at the result as depicted in fig. a. now we copy text a, but we deliberately move the last sentence to the start of the text. in this way, we effectively create an artificial transposition. we now compare the original text to the copy containing the artificial transposition. the result is depicted in fig. b. the displaced sentence is clearly indicated by a diagonal in the top right corner, attesting that the last sentence of the original (horizontal direction) coincides with the start of the altered copy. if we created multiple transpositions, we would get a result as depicted in fig. c.
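The match table described above can be sketched in a few lines. This is an illustrative reconstruction under simple assumptions (exact token equality, single letters standing in for words); the names `match_table` and `render` are hypothetical and not part of CollateX. The original witness is laid out horizontally and the altered copy vertically, as in the text.

```python
def match_table(horizontal, vertical):
    """Return table[r][c] == True where vertical[r] equals horizontal[c]."""
    return [[v == h for h in horizontal] for v in vertical]


def render(table):
    """Draw the table with '*' for a match and '.' for an empty cell."""
    return "\n".join("".join("*" if cell else "." for cell in row)
                     for row in table)


original = ["a", "b", "c", "d", "e"]
# Artificial transposition: move the last token to the start.
altered = [original[-1]] + original[:-1]

picture = render(match_table(original, altered))
print(picture)
```

The single mark in the top right corner is the displaced token; the remaining marks form a diagonal shifted down by one row.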
instead of a clear diagonal, all the way through we find a rather broken-up path of smaller diagonals: one night as he sat at his table head on hands he saw himself rise and go. one night or day. for when his own light went out he was not left in the dark. light of a kind came from the one high window. under it still the stool on which till he could or would no more he used to mount to see the sky. why he did not crane out to see what lay beneath was perhaps be- cause the window was not made to open or because he could or would not open it. fig. document-to-document matrix applied as a match table, cells representing tokens coinciding between two witnesses are marked (‘scored’) in this case with dots fig. a(left), and b (right): unrealistic alignment conclu- sion (a) versus natural, elegant, or reader’s common sense solution (b) fig. a(left), b (center), and c (right) computer-supported collation of modern manuscripts digital scholarship in the humanities, of - text a: excerpt from samuel beckett’s stirrings still. it will be clear from this example that in fact any changes (or ‘edits’ as they are called in computer science, be they intentional or not) in an initially identical copy of a text will result in a deviation from a perfect diagonal in the document- to-document matrix—even the substitution of a single character. many such edits cause the visible diagonal of a perfect alignment to be broken up in a large collection of longer and shorter diagonals, similar to what is shown in fig. c (but many times larger for real-world texts). we call these dis- persed diagonals match phrases. like the much sim- pler case represented in fig. , in a real-world case it remains collatex’s task to determine what sequence of match phrases corresponds to the ‘natural’ align- ment of two texts or witnesses. 
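As a rough illustration of what a match phrase is, the sketch below collects every maximal diagonal run of matching cells from two token lists. It only enumerates candidate phrases; how CollateX then chooses among them is not reproduced here. The function name and the tuple layout are assumptions made for this example.

```python
def match_phrases(horizontal, vertical):
    """Collect maximal diagonal runs of matching tokens.

    Each phrase is (start_in_horizontal, start_in_vertical, length),
    and the result is ordered from longest to shortest.
    """
    phrases = []
    for r in range(len(vertical)):
        for c in range(len(horizontal)):
            if vertical[r] != horizontal[c]:
                continue
            # Skip cells that merely continue a run begun further up-left.
            if r > 0 and c > 0 and vertical[r - 1] == horizontal[c - 1]:
                continue
            length = 0
            while (r + length < len(vertical)
                   and c + length < len(horizontal)
                   and vertical[r + length] == horizontal[c + length]):
                length += 1
            phrases.append((c, r, length))
    return sorted(phrases, key=lambda phrase: -phrase[2])


phrases = match_phrases("a b c d z".split(), "z a b c d".split())
print(phrases)
```

For the toy input, the four-token run and the single moved token come out as two separate phrases, which is the raw material a selection step would work with.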
of course, being a computer program, collatex has no conception of the kind of alignment a human reader would infer, and to further complicate matters, human readers can actually have different opinions on what the ‘best’ alignment is. therefore, collatex has to rely on a mathematical approximation of the inferences human readers might make. congruent to the argu- ment of bourdaillet and ganascia ( ), the approach of collatex for this is to determine the set of match phrases that corresponds to the smal- lest number of edits between two witnesses. in other words, the algorithm determines the smallest set of longest match phrases that accounts for all variants between two texts, as conceptualized in fig. . theoretically, this process can be applied to an n-dimensional matrix. this would facilitate com- paring an arbitrary amount of witnesses, or in other words, support for multiple witness align- ment. however, the time needed to compute the n-dimensional case is not linear, but probably in the order of an n-exponential function of the text length, making it computationally unattainable in a reasonable amount of time. therefore, multiple wit- ness alignment must be supported in another way, as explained in the remainder of this section. to register the alignment and variation between witnesses traced by the algorithm, an efficient way to store the algorithm’s results is needed. to this end, collatex adopts schmidt and colomb’s concept of a variant graph (schmidt and colomb, ). we will demonstrate this process using the case of a variant token. in fig. a, we see the algorithm determining the first alignment. the algorithm traverses all cells that represent aligned tokens and adds a vertex for each, the edges of which are indexed for both witnesses. however, in the case of the token ‘i’, we traverse an empty column. this means that we hit a token that is rep- resented in one witness but that does not have a corresponding token at that location in the other witness. 
instead, the other witness has ‘j’. in this case, we add two vertices with indexed edges, one for each witness (cf. fig. b). this process ultimately results in the situation depicted in fig. c. collatex’s variant graph is a directed acyclic graph in which each vertex represents a token in one or more versions. each edge in a variant graph is annotated with one or more identifiers for each version. additionally, a variant graph has a start and an end vertex (the #-vertices in fig. ) that do not represent any tokens. by proceeding in this way, variation at the start or the end of a ver- sion can be recorded. when one reads the variant graph from left to right, following only the edges annotated with the identifier of a certain version, the complete token sequence of this version can be reconstructed. when transpositions are detected, a fig. vectors describing alignment of two witnesses in a document-by-document matrix r. haentjens dekker et al. of digital scholarship in the humanities, t : : duplicate of the transposed vertex is created and the two copies are linked together (cf. fig. ). by applying a variant graph, we can also extend the applicability of the described algorithm from pairwise comparisons to the alignment of more than two versions. to this end, we apply the same approach of matrix-wise comparison, but instead of a d matrix a d matrix is used. this allows us to compare a new witness with the entire variant graph constructed so far. this process is conceptualized in fig. where a new witness ‘t’ (with a reading iden- tical to witness ‘u’) is added to the comparison. darker ‘cubes’ visualize alignment between the existing graph and the new witness. (lighter ‘cubes’ are for the readers’ orientation within the d matrix only.) eventually, in this case, the process leads to the addition of an index ‘t’ to all edges in the graph that had an index ‘u’. 
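The variant graph just described can be modelled very compactly. The class below is a simplified sketch, not CollateX's internal data structure: vertex identifiers, method names, and the toy witnesses are assumptions, and transposition links are left out. It illustrates edges annotated with witness identifiers, the token-free start and end vertices, and the reconstruction of one version by following its edges.

```python
from collections import defaultdict


class VariantGraph:
    START, END = 0, 1  # vertices that carry no token

    def __init__(self):
        self.tokens = {self.START: "#", self.END: "#"}  # vertex id -> token
        self.edges = defaultdict(set)  # (src, dst) -> witness identifiers
        self._next_id = 2

    def add_vertex(self, token):
        vertex_id = self._next_id
        self._next_id += 1
        self.tokens[vertex_id] = token
        return vertex_id

    def add_path(self, siglum, vertex_ids):
        """Annotate the edges along start -> vertices -> end with a siglum."""
        path = [self.START, *vertex_ids, self.END]
        for src, dst in zip(path, path[1:]):
            self.edges[(src, dst)].add(siglum)

    def reading(self, siglum):
        """Rebuild one version by following only its own edges."""
        out, current = [], self.START
        while current != self.END:
            current = next(dst for (src, dst), sigla in self.edges.items()
                           if src == current and siglum in sigla)
            if current != self.END:
                out.append(self.tokens[current])
        return out


graph = VariantGraph()
a, i, j, c = (graph.add_vertex(t) for t in ["a", "i", "j", "c"])
graph.add_path("u", [a, i, c])
graph.add_path("v", [a, j, c])
# A third witness with the same reading as 'u' only adds its identifier
# to edges that already exist.
graph.add_path("t", [a, i, c])
print(graph.reading("u"), graph.reading("v"))
```

Adding witness 't' creates no new vertices or edges here, which mirrors the case discussed above where an identical reading only extends the edge annotations.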
of course, when a new reading is found that has not been recorded in the graph, a new vertex would be inserted. in this way, the graph represents a serialization of the variation between the documents of a steadily growing set. the serialization contains all the vari- ation that is recorded during the alignment of pre- vious comparisons and is derived from the variant graph by arranging the tokens of all vertices in topo- logical order. note that a variant graph ultimately describes variation between witnesses; it does not— nor does collatex—interpret or infer the type or cause of variation. thus, in a graph such as depicted in fig. , there is no given interpretation whether the ‘j’ results from an addition in one witness or has been deleted in another. of course, whenever there is additional knowledge on the provenance and the date of the witnesses, this inference becomes a trivial task in most cases. as we noted above, collatex’s development as a whole is still in an experimental stage, but the cur- rent version (as of the time of writing version . is available on http://www.collatex.net) yields fig. a (left), b (center), and c (right): storing alignment results using a variant graph fig. capturing a transposition in a variant graph computer-supported collation of modern manuscripts digital scholarship in the humanities, of two-dimensional three-dimensional to d - http://www.collatex.net convincing results. inspection of a % sample of a real-world test collating the first chapter of darwin’s origin of species (bordalejo ) yielded a % correct detection rate for textual variation, and . % correct rate of transposition identification (i.e. one false positive and seven correct identifica- tions of transpositions, judged by human control) within the sample. collatex’s performance as to speed may leave something to wish for. currently, sentence- to paragraph-sized collations are executed in less than a second, virtually independent of the amount of witnesses. 
collation of chapter-sized texts is feasible (seconds), but at larger sizes (‘book length’), the speed degrades rapidly to unfeasible. to this end, chunking or breaking larger bodies of texts into smaller parts is an effective solution. a number of identified problems remain to be ad- dressed in future work. most importantly, the inde- pendence of the alignment results from the order in which the versions are aligned needs more testing. although no dependence on the order could be wit- nessed in test cases found in other publications ad- dressing the issue (spencer and howe, ), it is possible that, for example, a combination of repeated tokens in versions and a change in the order of their comparison might cause different results. another issue is testing and benchmarking. collatex’s algo- rithm is tested against an ever growing set of real- world use cases varying from simple and constructed cases such as ‘the black cat and the white dog’ versus ‘the white cat and the black dog’ to elaborate frag- ments of armenian, medieval dutch, and hebrew prose and verse. this yields good use-case–based evi- dence that collatex is indeed capable of tracing complicated examples of real-world textual vari- ation. the development methodology used (‘agile’ ) implies an ever growing set of such real- world cases, as new users request test runs of previ- ously unseen material. however, there is also a need for a mathematically/computationally constructed larger test corpus of variant texts of which the vari- ation is exactly known and modeled on real-world textual variation, so that future releases of collatex can be benchmarked for accuracy and performance to a certain standard. creating such a test corpus is an important step in future research and development. 
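One small building block for such a constructed test corpus could be a generator that applies a known edit and records it as ground truth. The sketch below handles only a single moved sentence, echoing the artificial transposition used earlier; the function name and the shape of the ground-truth record are assumptions, and a realistic corpus would need many more kinds of modelled variation.

```python
def make_transposed_copy(sentences, src, dst):
    """Move sentences[src] to position dst in a copy of the list.

    Returns the altered copy together with a record of the edit, so that
    a collator's output can later be checked against what was done.
    """
    copy = list(sentences)
    moved = copy.pop(src)
    copy.insert(dst, moved)
    return copy, {"moved": moved, "from": src, "to": dst}


original = ["s1", "s2", "s3"]
altered, truth = make_transposed_copy(original, src=2, dst=0)
print(altered, truth)
```

Because the edit is recorded at construction time, a detected transposition can be scored automatically instead of by human inspection.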
integrating collatex within the infrastructure of a digital edition now that the conceptual framework and algorithm have been mapped out, we would like to address the issue of possible integrations of collatex into elec- tronic editions. the beckett digital manuscript project proved to be a suitable test case since its soft- ware infrastructure is rather typical for the way in which many digital editions are currently composed. fig. visualization of multiple witness comparison using a d matching matrix fig. variant graph recording a variant that is either a deletion in witness ‘u’ or an addition in witness ‘v’ r. haentjens dekker et al. of digital scholarship in the humanities, , , vs. - . it uses apache cocoon, a publishing framework, which is based entirely on xml-oriented technolo- gies. as xml and its adjacent standards are almost ubiquitous in today’s digital humanities landscape, ranging in their application from the encoding of source material to the publication in multiple xml-based formats like xhtml or pdf via xsl- fo, apache cocoon is widely deployed among the projects in this field. the framework is built around the idea of configurable transformation scenarios. such scenarios are mapped flexibly to the uri name- space of a project’s web site and are triggered as soon as a web client sends a request to any of the mapped uris. cocoon then ( ) takes input parameters from the request; ( ) pulls additional relevant data from a variety of data sources (e.g. xml databases, relational databases, web services, or a server’s file system); ( ) converts all data into an xml document; ( ) pushes the data through a chosen transform- ation scenario, configurable by the site’s developer as well as the user, and ( ) returns the transformation result in the response to the web client (fig. ). 
the beckett digital manuscript project follows this pattern in as much as it ( ) receives requests for specific textual resources in the edition, selected via an appropriate uri; ( ) pulls those resources, encoded in tei-p compliant xml, from the server’s file system; ( ) applies an xslt-based transformation to the xml-encoded text resource that is fitted to the desired output format (often (x)html) and the current site context in which the user requests the resource, and ( ) delivers the transformation result along with any static resources (images, stylesheets, and client-side script code) to the requesting client (often a web browser). to seamlessly integrate collatex’s functionality in this site architecture, it was deemed to be the most elegant approach to think of the collator module as another transformation scenario, which transforms data from a selected set of text versions into an intermediary xml-based format encoding the collation result. as described above in the sec- tion on collatex’s design, the modularity of collatex allows us to uncouple the preprocessing of input data and the postprocessing of colla- tion results from the core of its functionality, the alignment. because of this flexibility, it was com- paratively easy to embed collatex into apache cocoon as another transformer module. all that was needed was ( ) a preprocessing step, which transformed an xml-encoded set of versions into tokenized input to the alignment module of collatex, and ( ) a postprocessing step, which renders the re- sults of the alignment step in an xml-based format, so it can be further processed by apache cocoon’s components and ultimately delivered to the user in the form desired (cf. fig. ). more specifically, the transformer module looks for so-called ‘data islands’ in provided xml input documents, which resemble the following snippet: <cx:collation xmlns:cx¼‘‘http://interedition.eu/ collatex/ns/ . ’’ cx:outputtype¼‘‘tei’’> <cx:witness> . . . 
</cx:witness>
<cx:witness> . . . </cx:witness>
<cx:witness> . . . </cx:witness>
. . .
</cx:collation>

fig. usual blueprint for apache cocoon architecture-based web services

whenever the transformer encounters such an island in an input document, it substitutes it with the resulting alignment of the given versions/witnesses in the output while just copying the surrounding data. the encoding of the output can be controlled via the attribute ‘cx:outputtype’. currently a proprietary, tabular data format and tei p parallel segmentation markup are supported. by assembling these data islands, dynamically based on the user’s request (parameterized for instance via the input of an html form; see e.g. fig. ), and by adding consecutive transformation modules after the collation has been performed, the beckett digital manuscript project can provide for the described personalization of its critical apparatus.

because of the modular design of collatex, a seamless integration of its functionality with the site infrastructure of the beckett digital manuscript project has been achieved. the amount of work was comparatively limited because the collatex team was able to reuse most of the already developed components and the beckett project’s team was able to build the integration entirely on their existing code base and platform.

however, the integration did not come without trade-offs. particularly, the synchronous online execution of the collation considerably limits the amount of textual data that can be collated.
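The data-island substitution can be sketched outside Cocoon with Python's standard `xml.etree.ElementTree`. Everything specific here is an assumption: the namespace URI is a placeholder (the real one is only partly legible above), `collate` is a stand-in that merely lists the witnesses instead of aligning them, and the `alignment` and `reading` element names are invented for the example. What the sketch does show is the pattern of replacing each island while copying the surrounding document unchanged.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace; the actual URI is not reproduced here.
CX_NS = "http://example.org/collatex/ns"


def collate(witnesses):
    """Stand-in for the alignment step: just lists the witness texts."""
    result = ET.Element("alignment")
    for n, text in enumerate(witnesses, start=1):
        ET.SubElement(result, "reading", {"n": str(n)}).text = text
    return result


def replace_islands(root):
    """Replace every cx:collation child element; copy everything else."""
    island_tag = f"{{{CX_NS}}}collation"
    witness_tag = f"{{{CX_NS}}}witness"
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == island_tag:
                texts = [w.text or "" for w in child.findall(witness_tag)]
                result = collate(texts)
                result.tail = child.tail  # keep text that followed the island
                parent[index] = result
    return root


page = ET.fromstring(
    f'<page xmlns:cx="{CX_NS}"><h1>title</h1>'
    "<cx:collation><cx:witness>a b</cx:witness>"
    "<cx:witness>a c</cx:witness></cx:collation>"
    "<p>after</p></page>"
)
replace_islands(page)
print(ET.tostring(page, encoding="unicode"))
```

The sketch does not handle a document whose root element is itself an island, and it runs synchronously, so it shares the size limitation noted above.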
this constraint and the requirement of more complex request/response choreographies than the ones cocoon provides out of the box as soon as larger texts are collated, makes this a solution which cannot simply be adopted without prior adjust- ments by any edition with arbitrarily large text trad- itions. but projects such as nines, with its collation software juxta, and the interedition project itself develop web-service–based solutions, which aim to overcome scalability issues related to the synchron- ous on-the-fly collation in use today. therefore, the adoption of these solutions and again their integra- tion with existing infrastructures like apache cocoon offers a promising perspective. collatex, modern manuscripts, and the digital scholarly editorial process among all the interoperable tools developed within the interedition framework collatex takes a special place, as it realizes interoperability not just in a tech- nical sense but also in a scholarly sense. following textual scholars such as bryant ( ), buzzetti ( ), and mcgann ( ) and genetic critics such as lebrave ( ), grésillon ( ), hay ( ), de biasi ( ), ferrer ( ), and , we have come to appreciate text in its essential fluidity and its forms as a process rather than a static object. fig. blueprint of the integrated apache cocoon and collatex architecture-based web service for the beckett digital manuscript project r. haentjens dekker et al. of digital scholarship in the humanities, '' t on- - . , the dynamic aspect of text has caused especially mcgann to argue that any electronic edition—or for that matter any attempt in that direction— should be based on dynamic processes, ideally im- plementing ‘flexible and multi-dimensional views of the materials’ (mcgann ). originally, mcgann envisaged such editions in the form of hypermedia editions backed by relational databases, on top of which adaptable transformational logic would cause archived digital texts to be represented according to fig. 
the beckett digital manuscript project computer-supported collation of modern manuscripts digital scholarship in the humanities, of specific editorial practices and views of different edi- torial or literary communities. of course, collatex does not fully answer to such an ambitious perspec- tive. however, it is interesting to note that collatex represents at least one aspect of such a transform- ational logic as mcgann pointed to. collatex effect- ively dynamically ‘reverse engineers’ the variation present in textual tradition or genesis. the process is dynamic because the basic process is independent of tedious scholarly manual labor, causing collatex’s transformations to be repeatable. the process is also dynamic, as it is adaptable. by adding witnesses present in the database-backed electronic archive to the analysis (or by removing some of them from it), the effect and perspective on the collation may change. in this sense, the applica- tion of collatex in editorial processes takes us one step further in our abilities to express the dynamics of text production, editing, and reception. at this point, the software can be serviceable in the prepar- ation of a scholarly edition, since it can also output tei parallel segmentation xml, which an editor can then transform and visualize the way he or she wants within the edition. projects such as ‘manuscript archives’ that do not envisage the de- velopment of a full ‘historical-critical’ edition, could still offer their users an alternative to a traditional ‘critical apparatus’. comparable with aspects of what is currently often labeled social editing (siemens ), embedding collatex in the beckett digital manuscript project enables the user to make his or her own selection of textual versions that need to be collated and leave out the ones he or she is not immediately interested in. 
from the vantage point of editorial theory, this development has interesting consequences regarding the scholarly editor’s role, whose focus may shift from the collation to a more interpretive function. in this way, the integration of a collation tool may be consequential in terms of bridging the gap be- tween scholarly editing and genetic criticism. from the perspective of editorial practice, the application of collatex is still at an experimental stage, but it already shows that the modular and service-oriented approach used by interedition has the potential to be useful, both to the specialized field of digital scholarly editing and to a more general audience. . pushing the collation envelope: modeling genetic stages apart from facsimiles, topographic, and linear tran- scriptions (encoded in xml), the beckett digital manuscript project provides the option to compare the different preparatory versions of the text—from the earliest draft stages to the page proofs. to avoid getting lost, the user is able to compare a particular segment in one version with the same segment in another version, or in all the other versions. the size of such a segment is determined by the user, the smallest unit being the sentence. the user is offered a synoptic survey of all the extant versions of the segment of his or her choice, showing each version in its entirety with the variants highlighted (cf. fig. ). the syntactical context of each segment remains intact, but in order for the variants to be highlighted, they had to be encoded first. in view of the large amount of manuscript ma- terials still to be transcribed, the project would not have been able to include the option of encoding an apparatus in all of the transcriptions. as an alterna- tive to that manual encoding task, we tested the pos- sibilities of digitally supported collation by means of the collatex algorithm. 
one of the complicating elements of this test case is the rather large number of versions in combination with the presence of de- letions and additions in almost all of them. to find solutions for the latter complicating element, there are several ways of looking at the challenge of collating modern manuscripts. one way would be to regard it as a form of collation that does not only collate versions of a text, but also stages within versions. for one manuscript, ver- sion can often be subdivided into several writing stages. a writing stage is defined, according to the suggestions of the tei special interest group (sig) on ‘genetic editions’, as ‘the a reconstructable stage in the evolution of a text, represented by a docu- ment or by a revision campaign within one or more documents, possibly assigned to a specific point in time’. ideally, this would require that the editor can identify not only different stages in the writing process, but also the writing sequence within each writing stage. if these sequences and stages can be discerned unequivocally, it would be theoretically r. haentjens dekker et al. of digital scholarship in the humanities, , , to even ' possible to treat each stage as a version (or ‘witness’) to be collated. the tei sig on genetic editions suggested working with ‘stagenotes’ to describe the compos- ition stages that have been identified in the genesis of a text. these ‘stages’ relate to the relatively large unit of the textual version as a whole (‘textfassung’). within a stage (say, an author writ- ing a block of text in black ink, deleting, and adding words in the same writing tool) it is often difficult, if not impossible, to further discern different ‘sub- stages’. still, a genetic critic might be interested in a collation tool that brings to the fore precisely this kind of moment in the writing process, when the writer did not immediately find the right words. 
in the case of a simple example, ‘the ^black^ cat ^dog^ is alive ^dead^’, assuming all deletions () and additions (^) are made in the same handwriting and writing tool, all of the following combinations are theoretically possible:

w a: the cat is alive
w b: the black cat is alive
w c: the cat is dead
w d: the black cat is dead
w e: the dog is alive
w f: the black dog is alive
w g: the dog is dead
w h: the black dog is dead

fig. synoptic survey of various versions of one textual segment in the beckett digital manuscript project

the tei sig ‘genetic editions’ developed the ‘stagenote’ element for the documentary level. according to the tei sig’s suggestions, ‘a genetic editor needs to be able to assign a set of alterations (deletions, additions, substitutions, transpositions, etc.) and/or an act of writing to a particular stage’. however, some authors always use the same writing tool, not only for the ‘first stage’ of their draft but for all subsequent revision campaigns. moreover, in the tei suggestions for genetic editions, the ‘stagenote’ element was developed for the ‘documentary’ level (related to what hans zeller called ‘befund’, the record, as opposed to ‘deutung’, its interpretation), not for the ‘textual’ level. however, collation is a text-related operation. to collate modern manuscripts, it may therefore be beneficial, for the purpose of designing digitally supported collation tools for modern manuscripts, to conceive of the manuscript as ‘a protocol for making a text’, according to daniel ferrer’s definition (ferrer ).

a relatively straightforward application of this protocol model is to work with the ‘uncancelled text’ of each manuscript (i.e. a ‘clean’ transcription or reading text of a draft, without the deleted passages, i.e. by ignoring the passages marked by <del> . . . </del> tags).
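Deriving the ‘uncancelled text’ from a transcription amounts to walking the XML and skipping the content of every `<del>` element. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes un-namespaced `del` and `add` elements inside a hypothetical `<s>` wrapper; a real TEI transcription would carry the TEI namespace and far richer markup. The sample encoding of the cat/dog example is my own guess at how it might be tagged.

```python
import xml.etree.ElementTree as ET


def uncancelled_text(elem):
    """Concatenate all text except the content of <del> elements."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != "del":
            parts.append(uncancelled_text(child))
        # Text following a child belongs to the parent, so it is kept
        # even when the child itself is a deletion.
        parts.append(child.tail or "")
    return "".join(parts)


def reading_text(xml_source):
    """Return the uncancelled text with whitespace normalized."""
    return " ".join(uncancelled_text(ET.fromstring(xml_source)).split())


draft = ("<s>the <add>black</add> <del>cat</del> <add>dog</add> "
         "is <del>alive</del> <add>dead</add></s>")
clean = reading_text(draft)
print(clean)
```

Only the last of the theoretically possible combinations survives this reduction, which is exactly the loss of information discussed in the surrounding text.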
this ‘uncancelled text’ is usually an author’s last ‘protocol’ or instruction to himself when he is on the verge of fair-copying or typing out the text on another document. we tried to apply this ‘uncancelled text’ system to test the first research results of collatex. all versions of a segment are computed and their data are handed over to collatex for comparison. the segmentation of the textual material (see above) can now serve an extra purpose: apart from reducing the danger for the user to get lost in the jungle of manuscripts, it also determines the speed of the collation. since the most frequently chosen textual unit in the project is the smallest segment (usually the unit of a sentence), the number of versions can be relatively high (in the test case: about twenty versions) without slowing down the instant collation. as an intermediary step, the ‘uncancelled text’ system is useful, but it does reduce the complexity of the manuscript to a textual format. in a way, this pragmatic solution ‘de-manuscripts’ the manuscript. in order to try and refine the computer-assisted col- lation of modern manuscripts, it would be helpful if the collation software were xml-aware in order for the input to be derived directly from the xml- encoded transcription, and to record changes not only between the stages but also between substages within one stage. 
the test case provided us with the following example: a passage in one of beckett’s manuscripts (uor ms , v- r, written in beckett’s hand in black ink), with two consecutive substitutions within the same writing stage:

and then again faint ^hoarse from long silence ^faint^^ from far within

in xml, this could be transcribed as follows:

and then again
<subst xml:id="subst "><del xml:id="del ">faint</del>
<add xml:id="add "><subst xml:id="subst ">
<del xml:id="del ">hoarse from long silence</del>
<add xml:id="add ">faint</add></subst>
</add></subst>
from far within

the subst, del, and add tags suffice to cover all stage information, which could be expressed in the ‘augmented’ variant graph of fig. . each path in the graph represents a witness. for the purposes of the collation of modern manuscripts, a new type of node has been introduced (in the example, s and s , corresponding to subst and subst , respectively). this writing stage (a) then needs to be compared with other stages or other versions, i.e. with multiple witnesses, say, (b) and (c):

fig. conceptual collatex variant graph capturing genetic stages of authoring

(a) and then again faint ^hoarse from long silence ^faint^^ from far within
(b) and then again nothing from far within
(c) and then again faint from far within

in collatex’s internal model, this would be presented as depicted in fig. . an advantage of this model is that it can support several collation options. by default, collatex would use only the ‘uncancelled text’ of each witness, but whereas the ‘uncancelled text’ model (described above) only took the final protocol into account, this model takes in all the extra information about the cancelled words, saves it, and enables us to ‘port out’ these data again at the visualization stage.
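To illustrate how nested substitutions could yield selectable substages, the sketch below flattens a `<subst>` hierarchy into its successive readings, oldest first, and builds the witness text for a chosen one. It is an assumption-laden toy rather than CollateX's model: identifiers are omitted, elements are un-namespaced, the `<seg>` wrapper is hypothetical, and only one top-level substitution per segment is handled.

```python
import xml.etree.ElementTree as ET


def substage_readings(elem):
    """List the readings encoded by nested <subst> elements, oldest first."""
    if elem.tag == "subst":
        readings = []
        for part in elem:  # <del> precedes <add> in the transcription
            readings.extend(substage_readings(part))
        return readings
    nested = elem.findall("subst")
    if nested:
        # An addition that was itself revised: use the inner readings.
        return [r for inner in nested for r in substage_readings(inner)]
    return [(elem.text or "").strip()]


def witness_at(xml_source, stage=-1):
    """Return the segment text using the reading at the given substage.

    The default, -1, selects the last reading, i.e. the uncancelled text.
    """
    seg = ET.fromstring(xml_source)
    subst = seg.find("subst")
    chosen = substage_readings(subst)[stage]
    return " ".join(f"{seg.text or ''}{chosen}{subst.tail or ''}".split())


SOURCE = ("<seg>and then again <subst><del>faint</del><add><subst>"
          "<del>hoarse from long silence</del><add>faint</add></subst>"
          "</add></subst> from far within</seg>")

stages = substage_readings(ET.fromstring(SOURCE).find("subst"))
final = witness_at(SOURCE)
intermediate = witness_at(SOURCE, stage=1)
print(stages, final, intermediate, sep="\n")
```

Keeping the full list of readings, rather than only the last one, is what would let a collation run against an intermediate substage and still hand the cancelled words on to a visualization step.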
for instance, if for whatever reason, one would prefer to compare (b) and (c) with substage of (a), rather than to its ‘uncancelled text’, the algorithm can optionally be instructed to collate [(b) nothing] and [(c) faint] against [(a’) hoarse from long silence], rather than against [(a’’) faint]. all the information stored in the xml transcription passes through the collation process untouched, so that it can be retrieved for visualization purposes. in terms of visualization, an option ‘hide cancellations’ (as one of the ‘tools’ in the menu) could simplify the alignment table, reducing it to a visualization of the different versions’ ‘uncancelled text’ only:

w   and then again   faint     from far within
w   and then again   nothing   from far within
w   and then again   faint     from far within

but we can imagine that genetic critics and other researchers interested in modern manuscripts might want to have an overview of all the cancellations and substitutions in the manuscripts. undoing the same ‘hide cancellations’ option in the menu could offer these users a more complete picture:

                     faint hoarse from long silence
w   and then again   faint     from far within
w   and then again   nothing   from far within
w   and then again   faint     from far within

the advantage of having introduced the new type of node (s and s in the variant graph above) is that the ‘hoarse from long silence’ variant can be treated as one unit during the computer-supported collation and also be presented as such at the visualization stage.

in closing

the collation of modern manuscripts involves the treatment of cancelled text. a classical problem in this area of study is the division of a modern manuscript into ‘stages’ or even ‘substages’, for especially if an author uses the same writing tool for all the text on the document (including cancellations and additions), it is often almost impossible to discern separate stages.
it is possible to work with the ‘uncancelled text’ of the documents in order to compare the different versions, but researchers working in modern manuscripts are usually especially interested in the cancellations and substitutions. therefore, we tried to find a solution for computer-supported collation of modern manuscripts, including cancellations. we have explored how this complex research problem in the application of computers in the humanities could be approached by breaking it down into a community-supported and well-defined set of subproblems, each of which can be solved on its own in a more flexible and efficient way. looking beyond the specific problem of computer-supported collation, such an approach appears suitable to us not only because it is a well-established practice in the construction of complex software systems in general, but also because it allows for effective collaboration among researchers and developers from many different backgrounds and projects. from this perspective, it is not by accident that the development of a modularized collation solution took shape within the context of the research project ‘interedition’, whose aim it is to foster such collaboration and to address the organizational and architectural issues associated with such an approach as well, issues which point beyond the development of singular software tools for singular use cases.

fig. conceptual collatex variant graph capturing genetic stages of authoring as well as witness variation

references

bordalejo, b. ( ). introduction to the online variorum of darwin’s origin of species. http://darwin-online.org.uk/variorum/introduction.html (accessed february ). in van wyhe, j. (ed.), ( ). the complete work of charles darwin online. http://darwin-online.org.uk/ (accessed february ).
bourdaillet, j. and ganascia, j.-g. ( ).
practical block sequence alignment with moves. lata – international conference on language and automata theory and applications, / . http://www-poleia.lip .fr/~ganascia/medite_project?action=attachfile&do=view&target=lata+ (accessed may ).
bryant, j. ( ). the fluid text: a theory of revision and editing for book and screen. university of michigan press. http://books.google.nl/books?id= w wpodpbu c (accessed february ).
buzzetti, d. ( ). digital representation and the text model. new literary history, : – .
de biasi, p.-m. ( ). toward a science of literature: manuscript analysis and the genesis of the work. in jed deppman, j., ferrer, d., and groden, m. (eds), genetic criticism: texts and avant-textes. philadelphia: university of pennsylvania press, pp. – .
ferrer, d. ( ). the open space of the draft page: james joyce and modern manuscripts. in bornstein, g. and tinkle, t. (eds), the iconic page in manuscripts, print, and digital culture. ann arbor: university of michigan press, pp. – .
ferrer, d. ( ). logiques du brouillon: modèles pour une critique génétique. paris: éditions du seuil.
grafton, a., most, g.w., and settis, s. ( ). the classical tradition. harvard university press, p. . http://books.google.nl/books?id=lbqf z bq sc (accessed november ).
grésillon, a. ( ). éléments de critique génétique: lire les manuscrits modernes. paris: presses universitaires de france.
hay, l. ( ). la littérature des écrivains. paris: josé corti.
lebrave, j. l. ( ). l’édition génétique. in cadiot, a. and haffner, c. (eds), les manuscrits des écrivains. paris: cnrs/hachette, pp. – .
levenshtein, v. ( ). binary codes capable of correcting insertions and reversals. soviet physics: ‘doklady’, : – .
mcgann, j. ( ). radiant textuality. literature since the world wide web. new york: palgrave/st martins.
oakman, r. l. ( ). computer methods for literary research. athens: university of georgia press, pp. – .
schmidt, d. and colomb, r. ( ).
a data structure for representing multi-version texts online. international journal of human-computer studies, . : – .
schmidt, d. ( ). merging multi-version texts: a generic solution to the overlap problem. proceedings of balisage: the markup conference . montreal: balisage series on markup technologies, vol. .
shillingsburg, p. l. ( ). from gutenberg to google: electronic representations of literary texts. cambridge: cambridge university press, p. .
siemens, r. ( ). toward modeling the social edition: an approach to understanding the electronic scholarly edition in the context of new and emerging social media. literary and linguistic computing, : – . http://llc.oxfordjournals.org/content/ / / .full (accessed november ).
smith, s. e. ( ). the eternal verities verified. charlton hinman and the roots of mechanical collation. studies in bibliography, : – .
spencer, m. and howe, c. j. ( ). collating texts using progressive multiple alignment. computers and the humanities, : – .
unsworth, j. ( ). scholarly primitives: what methods do humanities researchers have in common, and how might our tools reflect this? symposium on ‘humanities r. haentjens dekker et al.
of digital scholarship in the humanities,

the research leading to these results has received funding from the european research council under the european union's seventh framework programme (fp / - ) / erc grant agreement n° .

durham research online
deposited in dro: june
version of attached file: accepted version
peer-review status of attached file: peer-reviewed
citation for published item: tehrani, j. j. and nguyen, q. and roos, t. ( ) 'oral fairy tale or literary fake? investigating the origins of little red riding hood using phylogenetic network analysis.', digital scholarship in the humanities., ( ). pp. - .
further information on publisher's website: http://dx.doi.org/ .
/llc/fqv
publisher's copyright statement: this is a pre-copyedited, author-produced pdf of an article accepted for publication in digital scholarship in the humanities following peer review. the version of record tehrani, j. j., nguyen, q. and roos, t. ( ) 'oral fairy tale or literary fake? investigating the origins of little red riding hood using phylogenetic network analysis.', digital scholarship in the humanities. ( ): - is available online at: http://dx.doi.org/ . /llc/fqv .
additional information:
use policy
the full-text may be used and/or reproduced, and given to third parties in any format or medium, without prior permission or charge, for personal research or study, educational, or not-for-profit purposes provided that:
• a full bibliographic reference is made to the original source
• a link is made to the metadata record in dro
• the full-text is not changed in any way
the full-text must not be sold in any format or medium without the formal permission of the copyright holders. please consult the full dro policy for further details.
durham university library, stockton road, durham dh ly, united kingdom tel : + ( ) | fax : + ( ) https://dro.dur.ac.uk

submitted to linguistic and literary computing

oral fairy tale or literary fake? investigating the origins of little red riding hood using phylogenetic network analysis

jamshid tehrani¹ † , quan nguyen², teemu roos² †
¹ department of anthropology, durham university, south road, durham, dh le
² department of computer science and helsinki institute for information technology, fi- university of helsinki, po box , helsinki.
† authors for correspondence. j.
tehrani: jamie.tehrani@dur.ac.uk teemu roos: teemu.roos@cs.helsinki.fi

please do not cite this draft without the authors’ permission

abstract

the evolution of fairy tales often involves complex interactions between oral and literary traditions, which can be difficult to tease apart when investigating their origins. here, we show how computer-assisted stemmatology can be productively applied to this problem, focusing on a long-standing controversy in fairy tale scholarship: did little red riding hood originate as an oral tale that was adapted by perrault and the brothers grimm, or is the oral tradition in fact derived from literary texts? we address this question by analysing a sample of literary and oral versions of the fairy tale little red riding hood using several methods of phylogenetic analysis, including maximum parsimony and two network-based approaches (neighbornet and t-rex). while the results of these analyses are more compatible with the oral origins hypothesis than the alternative literary origins hypothesis, their interpretation is problematised by the fact that none of them explicitly model lineal (i.e. ancestor-descendent) relationships among taxa. we therefore present a new likelihood-based method, phylodag, which was specifically developed to model lineal as well as collateral and reticulate relationships. a comparison of different structures derived from phylodag provided a much clearer result than the maximum parsimony, neighbornet or t-rex analyses, and strongly favoured the hypothesis that literary versions of little red riding hood were originally based on oral folktales, rather than vice versa. .
introduction recent years have witnessed a boom in computational approaches to the reconstruction of literary traditions, fuelled by the adoption of phylogenetic techniques from evolutionary biology and the development of custom-made software for textual analysis (howe et al., ; roos & heikkilä, ). so far, research in this field has focused on the transmission histories of hand-copied manuscripts, where the accumulation of errors and occasional innovations can be modelled as a branching process analogous to the diversification of biological lineages by descent with modification. recently, it has been argued that a similar approach can shed light on the evolution of oral traditions, such as folktales (tehrani, ), legends (stubbersfield & tehrani, ) and myths (d'huy, ). although these stories are not literally copied in the way that manuscripts or dna sequences are, their basic plot elements, motifs, characters and symbols exhibit clear evidence of both fidelity of transmission as well as cumulative change through time. recent case studies (tehrani, ) demonstrate that careful analyses of these features make it possible to reconstruct deep and robust stemmata, which can in turn yield potentially crucial insights into the origin and development of oral tales. one of the key issues in this area concerns the complex interactions between oral and literary traditions, which are often difficult to disentangle. for example, it is well known that, historically, many so-called fairy tales (i.e. traditional short stories containing fantastical or magical elements) have been adapted by writers inspired by oral story-tellers and vice versa. in such cases, it can be extremely problematic to establish in which medium a given tale originated. while most folklorists have tended to assume that fairy tales are rooted in oral tradition, some scholars have argued that they may in fact be derived from written texts. 
most notably, ruth bottigheimer (bottigheimer, , ) proposed that fairy tales are a primarily literary genre that was invented by the sixteenth century writer giovanni francesco straparola and subsequently popularised by other authors such as basile, perrault and the brothers grimm. while these authors presented their stories as though they were borrowed from the tales told by common folk, bottigheimer suggests this was simply a stylistic ruse, and that the direction of transmission was much more likely to be the other way around. in support of this point, she highlights that the earliest literary versions of fairy tales were written centuries earlier than the supposedly more authentic oral versions collected by folklorists. bottigheimer’s controversial thesis has been rejected by most experts (ben-amos, ziolkowski, silva, & bottigheimer, ), who point out that absence of evidence hardly constitutes evidence for absence, especially given that oral traditions, by definition, lack a written record. however, by the same token, nor can it be proved that oral fairy tales predate the earliest written versions. in this paper, we show how techniques developed in computer-assisted stemmatology can help break this impasse, and shed new light on the missing links between oral and literary traditions in fairy tales. our case study focuses on a tale whose origin has long been the subject of intense controversy: little red riding hood. the tale, which is classified as atu in the aarne-thompson-uther (atu) index of international tale types, famously tells the story of a young girl who is attacked by a wolf disguised as her grandmother. there are numerous theories about the source of the tale, from pre-christian sun myths (saintyves, ) or medieval coming-of-age rites (verdier, ) to chinese folk tradition (haar, ).
while these ideas remain difficult to substantiate, the modern tradition of little red riding hood/atu can be traced back to , when the first classic version of the story, le petit chaperon rouge, was published by the french author charles perrault in his collection of purportedly traditional stories, histoires ou contes du temps passé (tales of past times) ( ). a second classic version of little red riding hood (rotkäppchen) was published in in the first volume of jacob and wilhelm grimm’s kinder und hausmärchen (children’s and household tales) ( ). in this version, unlike perrault’s, little red and her grandmother are rescued by a passing huntsman, who slices open the villain’s stomach and sews it up again with stones. although, like the other tales in that volume, rotkäppchen was ostensibly collected from ordinary german peasant folk, grimm scholars have established that the brothers’ source for the tale was actually an educated woman of french-huguenot descent named marie hassenpflug, who was almost certainly familiar with perrault’s enormously popular contes (zipes, ). while the perrault and grimm tales provided the model from which all subsequent literary little red riding hoods are derived, the origins of the oral tradition of atu , and its relationship to these two “classic” versions, are much less well understood. most folklorists believe that perrault based his tale on a traditional french werewolf tale, probably from his mother’s native region of touraine, which was the site of a series of werewolf trials in the sixteenth and seventeenth centuries (zipes, , p. ). it is claimed that variants of the tale survived into the nineteenth and twentieth centuries in the oral literatures of south- east france, the alps and northern italy (delarue, ; rumpf, ). 
these tales, commonly referred to as simply 'the story of grandmother' (following delarue ) are typically more gory than perrault's censored version – for example, the girl is tricked into eating some of her grandmother's remains. more importantly, rather than being a helpless victim, the girl typically outwits the wolf/werewolf by tricking him into letting her go outside to urinate. although the provenance and antiquity of the tradition remains unknown, it has been suggested that it may go back to medieval times. this is supported by an eleventh century latin poem by egbert of liège, which relates a local walloon folktale in which a young girl encounters a wolf in the woods, and is saved by the supernatural protection afforded by her red tunic, a baptism gift from her godfather (ziolkowski, ). although it is debatable whether this tale represents a direct ancestor to little red riding hood (berlioz, ), the echo of common motifs like the young girl in the woods, the villainous wolf, the red outfit given to her by a relative, etc., certainly points to some kind of historical connection between them. nevertheless, other researchers are extremely sceptical that the oral variants held up by folklorists can be regarded as "independent" descendants of the pre-perraudian oral tradition. instead, they suggest that, like the brothers grimm version, these tales are more likely to be vernacular interpretations of published texts. for example, in an essay that strongly resonates with bottigheimer's ideas, hüsing ( ) writes that little red riding hood “represents one of the loveliest french literary tales, perhaps being the most successful fake that we have in the entire genre”, which nonetheless lacks the characteristic stylistic features of authentic oral fairy tales (such as incompleteness). similarly, berlioz ( ) and, indeed, bottigheimer herself ( , p.
), argue that there is no evidence to suggest that little red riding hood existed in oral tradition prior to the publication of perrault's contes at the end of the seventeenth century. in this paper, we aim to shed more light on these issues by taking a quantitative stemmatological approach to investigate the relationships between oral and literary traditions of little red riding hood. our study builds on tehrani’s ( ) recent phylogenetic analyses of the atu type tales, which investigated the relationships between oral european variants (plus perrault and grimm) to similar stories from other parts of the world, especially africa and east asia. tehrani's study did not, however, address the question of whether little red riding hood originated in an oral or literary medium, nor did it examine interactions between the two traditions of atu . below, we outline how these issues were tackled in this study. . materials a total of texts of little red riding hood were selected for analysis (see ‘sources’ in appendix a). to be clear, the aim of the analyses was not to produce a comprehensive stemma of the little red riding hood tradition – which would involve hundreds, if not thousands of texts – but to investigate a specific problem concerning the relationship of oral versions of the tale to literary versions. specifically, we sought to test whether perrault based his tale on a pre-existing oral tradition, or if both the oral and literary traditions derive from the classic versions of perrault and the grimms published in the seventeenth and nineteenth centuries respectively. our dataset included franco-italian oral tales collected in the nineteenth and twentieth centuries that cover most of the major variations in the plot and character found in the folk traditions of these regions. for example, in some cases little red riding hood lacks her characteristic red hood and is simply described as a young girl. 
in many variants the protagonist outwits the villain to escape, but in others she is eaten. the character of the villain, meanwhile, can take several forms, such as a wolf, witch or werewolf. in one group of italian tales (three of which are included here) known as ‘catterinetta’ – formerly categorized as a distinct subtype of atu (aarne & thompson, ) – the villain is actually the relative that the girl went to visit (usually an aunt or uncle). she/he takes revenge on the girl for eating the food that was in her basket and replacing it with cakes made from donkey dung. the dataset also included egbert’s th century poem, the classic versions of little red riding hood published by perrault and the brothers grimm in the seventeenth and nineteenth centuries respectively, five examples of literary versions of little red riding hood from the late nineteenth and early twentieth centuries sampled from the degrummond’s children’s literature research collection curated by the university of southern mississippi (http://www.usm.edu/media/english/fairytales/lrrh/lrrhhome.htm), and three oral variants from beyond the hypothesised atu cradle (two from portugal and one from lusatia in modern-day poland) that are thought to be based on literary texts, and which provide another useful point of comparison with the franco-italian oral versions.

next, we constructed a matrix that coded the presence or absence of traits (or, in phylogenetic parlance, “characters”) identified in the texts. the traits included features such as the red hood worn by the girl, the character of the wolf, the girl being eaten and so on (the full list of characters and the matrix are provided in appendix a). the matrix only included traits that occurred in at least two tales, which might give clues about common ancestry. traits that occurred in just a single text were excluded, since these would not be informative about relationships.
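The coding step just described can be sketched in a few lines of Python. The tale names and trait labels below are toy values chosen for illustration only; they are not the coding given in appendix a, and `build_matrix` is a hypothetical helper rather than software used in the study. The sketch counts how many tales show each trait, drops traits found in fewer than two tales, and emits one presence/absence row per tale.

```python
from collections import Counter

# Toy input: each tale mapped to the set of traits it exhibits.
# These labels are illustrative assumptions, not the published character list.
tales = {
    "perrault": {"red_hood", "wolf", "girl_eaten"},
    "grimm": {"red_hood", "wolf", "girl_eaten", "huntsman_rescue"},
    "grandmother": {"wolf", "girl_escapes", "eats_remains"},
    "catterinetta": {"girl_escapes", "dung_cakes"},
}


def build_matrix(tales, min_tales=2):
    """Return (kept_traits, rows) with traits seen in fewer than min_tales dropped."""
    counts = Counter(trait for traits in tales.values() for trait in traits)
    kept = sorted(trait for trait, n in counts.items() if n >= min_tales)
    rows = {
        name: [int(trait in traits) for trait in kept]
        for name, traits in tales.items()
    }
    return kept, rows


kept, rows = build_matrix(tales)
print(kept)  # ['girl_eaten', 'girl_escapes', 'red_hood', 'wolf']
for name, row in rows.items():
    print(name, row)
```

Traits such as the toy `huntsman_rescue`, which occur in a single tale here, are removed because they cannot group any two tales together.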
the matrix was analysed using several methods of phylogenetic/stemmatic reconstruction, each of which are described in the sections below. we predicted that, if the oral origins hypothesis is correct, then the literary tradition instigated by perrault and also comprising the grimms’ rotkäppchen, later published versions and oral copies from portugal and lusatia, should constitute a distinct lineage nested within a larger family of franco-italian folktales. conversely, if the latter are derived from textual sources, they would be expected to comprise a lineage (or lineages) that split off from the literary tradition instigated by perrault and continued by the brothers grimm. in the last analysis we introduce a method, phylodag, that directly tests for ancestor-descendent relationships, while also allowing us to incorporate contamination between texts and/or oral traditions. . phylogenetic tree analysis our first analysis employed the most-widely used method for reconstructing relationships among texts in stemmatology, maximum-parsimony (howe et al. ). maximum parsimony involves finding the tree(s) that minimises the number of evolutionary changes required to explain shared traits among a group of taxa (in this case, versions of little red riding hood) under a branching model of descent with modification. we carried out the maximum parsimony analysis in the software program paup . * (swofford, ). the results are shown in figure . fig. "parsimony tree" about here. the tree is rooted using the oldest text, egbert’s th century poem (“latin”), as an outgroup. under the oral origins hypothesis, egbert’s text represents the earliest known witness of the oral tradition of atu prior to perrault, so it can be assumed that all the other texts (both oral and literary) are descended from a common ancestor of more recent origin. 
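The parsimony criterion described above can be illustrated with Fitch's small-parsimony algorithm, which counts the minimum number of changes one character needs on a fixed rooted binary tree. The sketch below is an illustration only: it scores given trees rather than searching for the best one, it is not the procedure implemented in paup, and the four taxa, the two characters and both tree shapes are made-up assumptions.

```python
def fitch(tree, states):
    """Return (possible_states, changes) for one character.

    tree is either a leaf name (str) or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    # Disjoint state sets mean one extra change is needed at this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, matrix):
    """Sum the Fitch change counts over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Toy presence/absence matrix (hypothetical taxa and characters).
matrix = {"A": [1, 0], "B": [1, 0], "C": [0, 1], "D": [0, 1]}
tree_a = (("A", "B"), ("C", "D"))
tree_b = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_a, matrix))  # 2
print(parsimony_score(tree_b, matrix))  # 4
```

Under this criterion the first toy tree would be preferred, since it explains the shared traits with fewer changes.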
under the literary origins hypothesis, egbert’s text would be excluded from the little red riding hood tradition, which is assumed to have originated six centuries later. thus, both hypotheses would position egbert’s text as an outgroup with respect to the other texts. the tree indicates that the literary versions of little red riding hood form a clade, or branch, that also includes the three oral “copies” from portugal and lusatia, as well as an italian tale called three girls. although the latter is technically a folktale, it is much closer to literary versions of atu than traditional versions of ‘the story of grandmother’ (for example, the girl is eaten and then subsequently cut out of the wolf’s stomach), and is probably derived from published texts. the literary clade forms part of a larger grouping that comprises variants of the franco-italian tale ‘the story of grandmother’, but excludes variants of the italian ‘catterinetta’ tale (represented by catterinetta, serravalle and unclewolf), which form a separate lineage splitting off at the root of the tree. thus, as predicted by the oral origins hypothesis, the results of the maximum parsimony analysis suggest that the literary texts share a last common ancestor (lca) of more recent origin than the lca of the oral variants. it is worth noting, however, that there are some inconsistencies between the tree and existing knowledge and theories about the little red riding hood tradition. for example, one of the literary variants (goldenhood) and a portuguese oral “copy” (consigliere) form a clade that appears to be descended from a common ancestor of more ancient origin than perrault. since the literary tradition is known to have originated with perrault, this anomaly can probably be attributed to an error of the maximum parsimony estimation, possibly as a consequence of contamination (or “reticulation” in phylogenetic jargon) between the literary and oral traditions. 
contamination is likely to be common in fairy tale traditions as multiple oral and literary versions of a tale may circulate at the same time within and between geographical areas, and sometimes get mixed together (e.g. tehrani ). since the underlying model used in maximum parsimony analysis does not explicitly allow for horizontal transmission across lineages, it can sometimes erroneously interpret similarities that result from this process as primitive traits (i.e. the traits exhibited by the hybrid taxon are assumed to be inherited from an ancestral taxon that existed before the lineages leading to the two donor taxa split), thereby “dragging” highly contaminated variants deeper into the structure of the tree. this effect might similarly explain the position of one of the oral variants, joisten, which is claimed to have borrowed traits from literary texts (zipes, , pp. - ), but appears in this tree to have split off from the lca of the oral and literary tradition prior to the emergence of the latter. another issue with maximum parsimony analysis is that it focuses solely on reconstructing collateral phylogenetic relationships (i.e. relationships based on common descent), rather than ancestor-descendent relationships. consequently, it is not clear from the tree whether the position of perrault should be interpreted as ancestral or collateral with respect to the other literary variants, while the position of the grimm text is similarly ambiguous. these examples highlight the need to be cautious in drawing strong conclusions from the topology of the parsimony tree, or indeed other methods that assume a pure branching model of evolution. . network analysis phylogenetic networks provide an alternative approach to reconstructing cultural and biological evolution where relationships are not strictly tree-like. a number of methods for detecting different kinds of reticulation events have been proposed (morrison, ). 
many of the methods are specific to certain mechanisms, for instance, recombination, and are therefore not necessarily appropriate for modeling fairy tale traditions where the blending process is rather poorly understood and probably varies significantly from case to case. below, we present results from two popular network methods, neighbornet and t-rex. in addition, we present a new method, phylodag, which is based on maximum likelihood analysis and allows generic directed networks or directed acyclic graphs (dags). we also apply a parametric bootstrap test to compare a number of network hypotheses obtained by the phylodag method.

. neighbornet analysis

a popular method for studying data that may involve reticulation is neighbornet (bryant & moulton, ; huson & bryant, ). in the terminology of morrison ( ), neighbornet is a data-display method. in other words, it does not attempt to construct a genealogical hypothesis that accurately represents the actual evolutionary history. rather it attempts to represent the possibly conflicting phylogenetic signals in the data, so that non-tree-like structures may result either by actual reticulation or by other mechanisms such as evolutionary reversal or convergent evolution. nor does neighbornet attempt to suppress statistically insignificant signals in the data, which tends to result in very complex networks with a large number of non-tree-like structures. figure shows the neighbornet obtained for the data in our study by using the splitstree software. the network shows similar clusters to the maximum parsimony analysis, distinguishing the literary variants (including the portuguese and lusatian oral copies) from franco-italian oral versions of ‘the story of grandmother’ and versions of the italian ‘catterinetta’ tale, which form a separate group. the "boxiness" of the network suggests probable lines of contamination within and between these sub-groups.
however, the network has the typical problem associated with this method, which is that the middle part of the network is a very complex dense mesh of interconnected points that correspond to various weak conflicting signals in the data. furthermore, most of the extant versions (the labelled points) are at the end of a long edge, suggesting that none of them (except perhaps one root node) are ancestors of the others. this makes it very hard to interpret the result in a way that would be informative for the questions we are presently considering. in particular, we can tell almost nothing from the network about the influence of perrault and the brothers grimm on the oral tradition, or vice versa.

fig. "neighbornet" about here.

. t-rex analysis

another technique from phylogenetics that can be used to model reticulation is t-rex (boc, diallo, & makarenkov, ). it starts from a tree structure and by comparing the pairwise distances computed from the data to the distances expected based on the tree, it identifies parts of the tree that fail to accurately match the distances in the data. where certain groups of taxa are more similar to each other than the tree would lead us to expect, a reticulation edge may be introduced. the underlying tree structure is obtained by neighbor-joining (saitou & nei, ). the number of reticulation edges can be chosen by the user. we chose to include five of them in an attempt to discover the most significant contamination events. the result of the t-rex analysis is shown in figure . the backbone phylogeny is largely similar to the parsimony tree, and indicates that the literary versions of little red riding hood form a branch that split from the lineage leading to modern oral variants of the traditional franco-italian tale ‘the story of grandmother’. versions of the italian tale ‘catterinetta’ form a sister group to these tales. one notable difference between the t-rex phylogeny and the parsimony tree is the position of threegirls.
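The distance comparison at the heart of this kind of analysis can be sketched as follows. This is a deliberately simplified illustration and not the t-rex algorithm itself: it takes the tree path distances as given instead of deriving them from a neighbor-joining tree, it uses plain Hamming distance on a presence/absence matrix, and the taxa and all numbers are invented. Pairs that are closer in the data than the tree predicts are ranked as candidates for a reticulation edge.

```python
from itertools import combinations


def hamming(a, b):
    """Number of characters at which two presence/absence rows differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_reticulations(matrix, tree_dist, top=1):
    """Rank taxon pairs by how much closer they are in the data than on the tree.

    tree_dist maps frozenset({a, b}) to the path length between a and b
    on an already fitted tree (assumed to be supplied by the caller).
    """
    gaps = []
    for a, b in combinations(sorted(matrix), 2):
        observed = hamming(matrix[a], matrix[b])
        gaps.append((tree_dist[frozenset((a, b))] - observed, a, b))
    gaps.sort(reverse=True)
    return [(a, b, gap) for gap, a, b in gaps[:top] if gap > 0]


# Toy data (hypothetical): D shares traits with B that the tree ((A,B),(C,D))
# does not account for.
matrix = {
    "A": [1, 1, 0, 0],
    "B": [1, 1, 0, 1],
    "C": [0, 0, 1, 1],
    "D": [1, 0, 1, 1],
}
tree_dist = {
    frozenset(("A", "B")): 1, frozenset(("C", "D")): 1,
    frozenset(("A", "C")): 4, frozenset(("A", "D")): 4,
    frozenset(("B", "C")): 4, frozenset(("B", "D")): 4,
}
print(candidate_reticulations(matrix, tree_dist))  # [('B', 'D', 2)]
```

The `top` parameter plays a role loosely analogous to the user-chosen number of reticulation edges mentioned above.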
as mentioned above, threegirls is an italian oral tale that shares notable features with the grimms’ rotkäppchen. whereas the parsimony analysis indicated that threegirls was likely to be derived from literary texts (as per the portuguese and lusatian oral versions of atu ), t-rex suggests that threegirls is descended from an oral ancestor that preceded the literary tradition, but has been contaminated by the latter (n.b. although the reticulation edges in t-rex are undirected, the well-documented influence of literary fairy tales – particularly the grimms’ kinder und hausmärchen – on european oral traditions (zipes, ) supports this interpretation). this is consistent with the neighbournet graph, which grouped threegirls with oral variants, but indicated substantial conflict in the data surrounding its relationships to other tales. the t-rex analysis proposed several other reticulation edges that suggest substantial mixing within regions between literary and oral traditions of atu , notably between perrault’s classic text and french oral tales, and between the italian variants of ‘the story of grandmother’ and ‘catterinetta’. more puzzlingly, the structure also suggests contamination between egbert’s medieval poem and a modern literary version of little red riding hood (cupplesleon). since a careful reading of both texts revealed no obvious link between them (e.g. characteristic features of the medieval version that occur in cupplesleon but not in the perrault or grimm tales from which it is certainly derived), we assume this to be an estimation error (identifying its precise cause would require a more detailed deconstruction of the search algorithm, which is beyond the scope of the current paper). a more general problem with the interpretation of the results of the t-rex analysis is that, as in the parsimony and neighbournet structures, all the variants are represented as leaf nodes.
consequently, it is not easy to evaluate direct lines of descent between historical and modern variants, most particularly the relationships of perrault and the brothers grimm to literary and oral tales that were published/recorded more recently.

fig. "t-rex" about here.

phylodag

we will now propose an alternative approach to network analysis. our approach is likelihood-based and, as we will show below, it solves many of the issues in existing network and tree-based methods. likelihood-based phylogenetic inference involves a probabilistic sequence evolution model characterizing the evolutionary process. a popular example of such a model is the jukes-cantor model (jukes & cantor, ), which gives the probability of each of the four dna symbols, a, t, g, and c, changing into another symbol or remaining unchanged over a certain period of time, depending on the mutation rate. given such a model, the likelihood of a phylogenetic tree is the probability that the observed data sequences are produced when the tree structure is fixed, branching occurs according to the tree structure, and the lineages evolve independently according to the sequence evolution model. the maximum likelihood method for phylogenetic inference attempts to find the tree structure, including the edge lengths that determine the expected amount of change along each edge, for which the likelihood is the highest possible. strimmer and moulton ( ) describe a simple extension of the likelihood defined for phylogenetic trees that is also applicable to networks, hence allowing reticulation edges to be added into a tree. we improve and extend the method by strimmer and moulton in two ways. first, we introduce a more efficient technique for approximating the likelihood of a phylogenetic network. second, we propose a simple search procedure that considers additional reticulation edges in a given tree structure and also estimates the edge lengths by a simple sampling technique.
as a result, our method, which we call phylodag, operates in a similar fashion to t-rex: it takes as input a matrix of character data, such as dna sequences or a set of features, and an initial tree structure, and produces a network where a given number of reticulation edges have been added to the tree, together with its likelihood value. in contrast to t-rex, however, phylodag can be used to evaluate tree and network structures where some of the extant taxa are placed at internal nodes so that they represent ancestors of some of the other taxa. for a more detailed description of the phylodag method, see appendix b. different network or tree structures can be compared using a statistical test known as the parametric bootstrap, which we outline below (see also appendix c). we start the phylodag method with a parsimony tree (fig. ), obtained from the data matrix in table ii. we then use phylodag to evaluate its likelihood (setting the number of reticulation edges to zero). the parsimony tree yields the log-likelihood value – . . next, we manipulated the topology of the tree to explore different scenarios concerning the origins of the literary and oral traditions of atu . this involved moving the perrault and grimm texts into different internal positions in the tree where they would be either ancestral to both the oral and literary variants, or ancestral to the literary variants and collateral to the oral variants (i.e. descended from a common oral ancestor). we did not attempt manipulations that are incompatible with existing knowledge about the tales, such as the chronology of the literary variants (for example, we did not experiment with making the grimms’ tale ancestral to perrault’s version). it is important to note that these manipulations alone will not, as a rule, yield a higher likelihood score than a normal tree.
this is because any such manipulated tree is equivalent to a special case of a tree where the taxon in the internal position is in fact a leaf node but the edge pointing to it has length zero. hence, the likelihood value of the tree where the taxon is a leaf node will never be lower than the likelihood of the tree where it is an internal node when the edge lengths in both models are optimized so as to maximize the likelihood. the interesting question is whether a hypothesis involving observed ancestral taxa is better when we allow possible contamination, i.e., reticulation edges in addition to the tree. the phylodag method provides a tool for answering this question. we used phylodag to search for reticulation edges that improve the likelihood score. as a starting point for the search, we use different variations of the parsimony tree (fig. ) where either perrault or grimms is moved into an ancestral position, considering a number of different nearby positions just above or next to the position of the said taxa in the parsimony tree. the search produced alternative structures, which we label a, b, c, d, e, f, g, h, i, j, and k. figures and show networks c and d, respectively, which are of particular importance for our discussion below. the other networks are given for completeness in appendix d. as an indication of how well the models "fit" the data, we report the log-likelihood value of each of the models. for example, the log-likelihood of network c is – . , and the log-likelihood of network d is – . . networks b, c and g achieve a higher log-likelihood value than the parsimony tree (– . ). however, the likelihood values should not be taken as the final evaluation of the models, for two reasons. first, the likelihood evaluation is approximate due to the random sampling procedure included in the method (see appendix b).
second, perhaps more importantly, the log-likelihood score tends to favor complex models because they have more adjustable parameters that make it easier to achieve high log-likelihood values for most data sets. to provide a statistically sound goodness-of-fit measure, we propose below to use a parametric bootstrap technique.

parametric bootstrap

it is important to note that a network hypothesis is typically more complex than a tree hypothesis (it has more parameters), which may lead to so-called over-fitting: choosing a hypothesis that is too complex given its statistical support. to avoid over-fitting, we applied a parametric bootstrap test to compare the tree hypotheses and the different network hypotheses; for more details, see appendix c. table i summarizes the results of the bootstrap test. the results are not unanimous but there is a relatively strong (considering the small sample size) signal indicating that models b, c, and g have the best statistical support. among them, model c (fourth row in table i, and fig. ) fares especially well, and is only rejected with low statistical confidence when compared to models b and g, while the latter two are both rejected in more comparisons. all three models place perrault in an internal position that makes it ancestral to all the literary variants. however, there is some disagreement regarding the position of the grimms’ tale: model b (see appendix d) has grimm as a terminal node, whereas both c and g place grimm as an ancestral source for subsequent literary versions. although the bootstrap test was unable to discriminate between these possibilities, previous research into the history of little red riding hood strongly supports the latter scenario (zipes, ).

table i. statistical hypothesis test results (parametric bootstrap). rows: null hypothesis. columns: alternative hypothesis. 'tree': parsimony tree. '': not rejected. '+': rejected at significance level . . '*': rejected at significance level . .
null hypothesis (rows) / alternative hypothesis (columns): tree a b c d e f g h i j k
tree * * * * + * * * . * .
a  * * * * * * * . * *
b   + + + + + + . * +
c   +    + . . . .
d  + * * +  * . . * +
e + * * + * * * + . * *
f + * * *  * * + . + .
g +  +  * * * . . + .
h * * * * * * * * * * *
i * * * * * * * * * * *
j * * * * * * * * . . *
k * * * * * * * * . . *

fig. "phylodag network c" about here.

more significantly, all three models b, c, and g are consistent with the oral origins hypothesis. the literary tradition instigated by perrault (placed as an internal node in all three models) is represented as an offshoot of a lineage that also gave rise to the french and italian tale 'the story of grandmother'. the models further suggest that the variants of the italian tale of catterinetta comprise a separate group that split from the other oral and literary variants prior to perrault. however, the models show that these various subgroups of atu did not develop in isolation from one another. all three indicate contamination both within and between the literary and oral traditions of the tale. for example, like the t-rex structure, models b, c, and g all suggest that reticulation played an important role in the tale threegirls. however, whereas the t-rex analysis suggested that threegirls was descended from an oral ancestor that preceded the first written versions of little red riding hood, the phylodag models are more consistent with the parsimony results, which situated the tale within the literary group. specifically, models b, c, and g indicate that threegirls is descended from the grimms’ text, which was mixed with elements from oral tradition (notably the italian catterinetta tale, as shown in models c and g, with which it shares distinctive motifs like angering the villain by replacing the contents of the basket).
contamination also appears to be evident in the portuguese tale consigliere and french literary tale goldenhood, which might explain their anomalous positions in the parsimony tree, which made them a sister clade to the perrauldian literary tradition. as explained earlier, reticulation can be a major source of error in inferring phylogenetic trees, for example by dragging affected taxa deeper into the structure of the tree. by incorporating reticulation edges in phylodag, we found that models in which perrault was ancestral to consigliere and goldenhood fitted the data much better than models in which these tales formed a sister clade, i.e. a and e, which were rejected in all the bootstrap comparisons with every other model except one (i, discussed below). we analysed six structures that supported the alternative literary origins hypothesis. among them, the one that is best supported by the data – albeit not as well as the oral origins models, b, c, and g – is model d, see fig. . the other network structures are given in appendix d. models f, i and k represent perrault as the ancestor of all modern versions of atu , including the literary variants and the oral tales 'the story of grandmother' and 'catterinetta'. model f represents the grimm tale as a leaf node, while in i and k the grimm tale is shifted into different internal positions within the phylodag. in the bootstrap comparisons, all three models are rejected against the tree and the oral origin scenarios represented in b, c and g. models d, h and j represent perrault as the ancestor of the literary variants of little red riding hood and the oral tale 'the story of grandmother', but not of versions of 'catterinetta', which consistently come out as a sister group to the other tales in the analyses. the grimm tale is positioned as a leaf node in model d and as an internal node in h and j. 
model d is supported against the parsimony tree, but rejected with high statistical support against the oral origins models b, c, and g. models h and j are rejected in all the comparisons.

fig. "phylodag network d" about here.

in sum, the inclusion of lineal and reticulate relationships using phylodag produced a number of structures that fit the data better than the parsimony tree. structures consistent with the oral origins hypothesis were less frequently rejected in the bootstrap comparisons than those that are consistent with the literary origins hypothesis, with all three of the top performing models (b, c and g) falling into the former category. however, it should be noted that the evidence from the bootstrap test comparisons is not all in one direction, since models b and g (oral) are rejected against d and f (literary). on the other hand, model c (oral) is supported with high statistical confidence against both literary origins models. thus, overall, the results of the phylodag analyses indicate that the literary tradition of little red riding hood has its roots in oral folktales, rather than the other way around.

conclusions

our aim in this paper has been to shed light on a complex question in the historiography of fairy tales: is it possible to identify whether particular stories originated as traditional folktales or authored texts? we have proposed that a useful strategy for addressing this question is to adopt the kind of quantitative, computational approach that has been so successfully used to reconstruct manuscript stemmata. our case study focused on testing two long-standing competing hypotheses about the origins of little red riding hood. the first suggests that the tale originally evolved in french and italian oral tradition, was adapted by charles perrault in the late seventeenth century, and was subsequently copied by the brothers grimm to establish the classic form of the tale found in present-day popular culture.
the second hypothesis proposes that the tale was a literary invention in the first place, and that “traditional” variants collected by folklorists are actually adaptations of perrault’s and grimm’s texts. we initially tested these hypotheses by analysing oral and literary variants of little red riding hood/atu using one of the most popular methods in computer-assisted stemmatology – maximum parsimony analysis. while the general structure of the tree returned by this analysis seemed to be more compatible with the oral origins hypothesis than the literary origins hypothesis, this conclusion is tempered by two problems with interpreting the results: firstly, maximum parsimony does not incorporate reticulation (contamination), which can lead to errors in estimating phylogenetic relationships; secondly, the method does not model lineal (ancestor-descendant) relationships among observed taxa, making it difficult to draw firm conclusions about the influence of classic historic texts (i.e. perrault and grimm) on contemporary literary and oral variants. alternative methods for modelling reticulate evolution, such as neighbournet and t-rex, provide a means for addressing the first of these problems but not the second. as such, their usefulness for addressing the question at hand turned out to be limited. we therefore introduced a new approach – phylodag – which handles both lineal and reticulate relationships in a statistically sound way. this enabled us to compare different models for the evolution of little red riding hood and directly test the oral hypothesis against the literary hypothesis. our results pointed strongly toward the former, with the best models indicating that perrault adapted his tale from oral folktales, rather than vice versa. of course, we cannot extrapolate any general conclusions about the origins of fairy tales from a single case study.
it is entirely possible – likely, even – that other tales originated in a literary medium before passing into oral tradition, as suggested by bottigheimer. what we have shown here is that the problem of establishing these facts is far from intractable, and can be solved using principled and powerful computational methods. we anticipate that the application of these methods will generate new insights into the origins and development of different types of fairy tale, as well as other kinds of cultural traditions (lipo, o’brien, collard, & shennan, ; mace, holden, & shennan, ).

endnotes

the splitstree software is available at www.splitstree.org.
we follow the convention of giving likelihood values on a logarithmic scale, so that probabilities, which are always less than one, become negative numbers.
we chose to include all networks in order to give an indication of the range of possible network hypotheses we considered and to quantify the statistical uncertainty by means of the bootstrap test.

references

aarne, a., & thompson, s. ( ). the types of the folktale. a classification and bibliography (vol. ). helsinki: ff communications. ben-amos, d., ziolkowski, j. m., silva, f. vaz da., & bottigheimer, r. ( ). special issue: the european fairy-tale tradition between orality and literacy. journal of american folklore, ( ). berlioz, jacques. ( ). un petit chaperon rouge médiéval? ‘la petite fille épargnée par les loups’ dans la fecunda ratis d’egbert de liège (début du xie siècle). marvels and tales, ( ), – . boc, alix, diallo, alpha boubacar, & makarenkov, vladimir. ( ). t-rex: a web server for inferring, validating and visualizing phylogenetic trees and networks. nucleic acids research, (w ), w -w . doi: . /nar/gks bottigheimer, r.b. ( ). fairy godfather: straparola, venice, and the fairy tale tradition: university of pennsylvania press, incorporated. bottigheimer, r.b. ( ). fairy tales: a new history: state university of new york press. d'huy, j. ( ).
a phylogenetic approach to mythology and its archaeological consequences. rock art research ( ), - . delarue, p. ( ). les contes marveilleux de perrault et la tradition populaire: i. le petit chaperon rouge. bulletin folklorique d'ile-de-france, - , - , - . grimm, j, & grimm, w. ( ). children's and household tales. gottingen. haar, b.j. ( ). telling stories: witchcraft and scapegoating in chinese history: brill academic pub. howe, c. j., barbrook, a. c., spencer, m., robinson, p., bordalejo, b., & mooney, l. r. ( ). manuscript evolution. trends genet, ( ), - . husing, g. ( ). is little red riding hood a myth? in a. dundes (ed.), little red riding hood: a casebook (pp. - ). madison: university of wisconisn press. huson, daniel h., & bryant, david. ( ). application of phylogenetic networks in evolutionary studies. mol biol evol, ( ), - . doi: . /molbev/msj lipo, c., o’brien, m., collard, m., & shennan, s. j. (eds.). ( ). mapping our ancestors: phylogenetic approaches in anthropology and prehistory. new brunswick: aldine transaction. mace, r., holden, c., & shennan, s. (eds.). ( ). the evolution of cultural diversity – a phylogenetic approach. london: ucl press. morrison, david. ( ). introduction to phylogenetic networks. http://www.rjr- productions.org/networks/index.html: rjr productions. perrault, c. ( ). histoires ou contes du temps passé. roos, teemu, & heikkilä, tuomas. ( ). evaluating methods for computer-assisted stemmatology using artificial benchmark data sets. literary and linguistic computing, ( ), - . doi: . /llc/fqp rumpf, m. ( ). little red riding hood, a comparative study (vol. ). bern: artes populares. saintyves, paul. ( ). little red riding hood or the little may queen. in a. dundes (ed.), little red riding hood: a casebook (pp. - ). madison: wisconsin university press. stubbersfield, joseph, & tehrani, jamshid. ( ). expect the unexpected? 
testing for minimally counterintuitive (mci) bias in the transmission of contemporary legends: a computational phylogenetic approach. social science computer review, ( ), - . doi: . / swofford, d.l. ( ). paup* . phylogenetic analysis using parsimony (*and other methods). version . sunderland: sinauer. tehrani, jamshid j. ( ). the phylogeny of little red riding hood. plos one, ( ), e . doi: . /journal.pone. verdier, yvonne. ( ). le petit chaperon rouge dans las tradition orale. cahiers de litterature orale, , - . ziolkowski, j. m. ( ). a fairy tale from before fairy tales: egbert of liege's "de puella a lupellis seruata" and the medieval background of "little red riding hood". speculum, ( ), - . zipes, j. ( ). the trials and tribulations of little red riding hood. new york: routledge. zipes, j. ( ). the golden age of folk and fairy tales: from the brothers grimm to andrew lang: hackett publishing. http://www.rjr-productions.org/networks/index.html: http://www.rjr-productions.org/networks/index.html: figures fig. parsimony tree. log-likelihood – . . fig. neighbornet. the network is obtained by splitstree (huson and bryant, ) with default settings. fig. t-rex. the underlying neighbor-joining tree is shown with solid black lines and five additional reticulation edges are shown with dotted red lines. fig. phylodag network c. log-likelihood – . . fig. phylodag network d. log-likelihood – . . appendix a. data sources taxon name reference perrault perrault, c. ( ). "le petit chaperon rouge" histoire ou contes du temps passe. grimm grimm j. & grimm w. ( ). "rotkäppchen". kinder- und hausmärchen. gottingen, no. lusatia a. h. wratislaw ( ) “little red hood”. sixty folk-tales from exclusively slavonic sources london: elliot stock, pp. - neill neill, j. ( ). little red riding hood. chicago: reilly & lee co. downloaded from the university of southern mississippi little red riding hood project: http://www.usm.edu/media/english/fairytales/lrrh/lrrhhome.htm randre andre, r. ( ). 
red riding hood. new york: mcloughlin bros. downloaded from the university of southern mississippi little red riding hood project: http://www.usm.edu/media/english/fairytales/lrrh/lrrhhome.htm cupplesleon gruelle j. b. ( ). all about little red riding hood. new york: cupples & leon. downloaded from the university of southern mississippi little red riding hood project: http://www.usm.edu/media/english/fairytales/lrrh/lrrhhome.htm dewolf dewolfe ( ). red riding hood and cinderella. dewolfe, fiske, and co. downloaded from the university of southern mississippi little red riding hood project: http://www.usm.edu/media/english/fairytales/lrrh/lrrhhome.htm goldenhood marelles, c. . "the true story of little goldenhood". andrew lang, the red fairy book, th edition. london and new york: longmans, green, & co. pp. - consigliere vaz da silva, f. ( ). capuchinho vermelho em portugal. estudos de literatura oral , p. - moncorvo vasconcellos, l. (n.d.) “o chapelinho encarnado”. translated by sara silva. courtesy of isabel cardigos and the centro de estudos ataíde oliveira threegirls calvino, i. ( , trans. by g. martin) "the wolf and the three girls". italian folktales. harmondsworth: penguin, pp. - millena millen, a. ( ). 'little red riding hood: version '. zipes, j. . the golden age of the folk and fairy tales. indianapolis: hackett. p - millenb millen, a. ( ). 'little red riding hood: version ' zipes, j. . the golden age of the folk and fairy tales. indianapolis: hackett. p millenc millen, a. ( ). 'the little girl and the wolf' zipes, j. . the golden age of the folk and fairy tales. indianapolis: hackett. p grandmother delarue, p. ( ). "the story of grandmother". the borzoi book of french folktales. new york: alfred knopf, pp. - . fintanonna calvino, i. ( , trans. by g. martin) "the false grandmother". italian folktales. harmondsworth: penguin, pp. - redcap schneller, c. ( , trans. by d. ashliman). "cappelin rosso". 
märchen und sagen aus wälschtirol: ein beitrag zur deutschen sagenkunde.innsbruck: verlag der wagner'schen universitäts- buchhandlung, pp. - blade blade, jean-francois. ( ). 'the wolf and the child' zipes, j. . the golden age of the folk and fairy tales. indianapolis: hackett. p legot legot m. ( ). 'little red riding hood: the version of tourangelle'. zipes, j. . the golden age of the folk and fairy tales. indianapolis: hackett. p joisten joisten, c. untitled. recounted in zipes, j. ( ) the trials and tribulations of little red riding hood. new york: routledge, pp. - . serravalle rumpf, m. ( ) “caterinella: ein italienisches warnmärchen,” serravalle variant. fabula : - unclewolf calvino, i. ( , trans. by g. martin) "uncle wolf". italian folktales. harmondsworth: penguin, pp. - . catterinetta schneller, c. ( , trans. by d. ashliman). "cattarinetta". märchen und sagen aus wälschtirol: ein beitrag zur deutschen sagenkunde.innsbruck: verlag der wagner'schen universitäts- buchhandlung, pp. - . latin ziolkowski, j. 
( ) a fairy tale from before fairy tales: egbert of liege's "de puella a lupellis seruata" and the medieval background of "little red riding hood" list of characters protagonist [ ] girl [ ] boy girl wears red hood: [ ] absent [ ] present who made red hood: [ ] absent [ ] mother [ ] grandmother [ ] godfather girl goes to visit relative: [ ] absent [ ] granny [ ] aunt [ ] mother relative is a witch: [ ] absent [ ] present [ ] fairy] granny sick [ ] absent [ ] present girl told to fetch pan from relative: [ ] absent [ ] present girl told not to stay from path: [ ] absent [ ] present carries basket: [ ] absent [ ] present cargo: bread: [ ] absent [ ] present cargo: soup: [ ] absent [ ] present cargo: custard: [ ] absent [ ] present cargo: butter: [ ] absent [ ] present cargo: cakes: [ ] absent [ ] present cargo: eggs: [ ] absent [ ] present cargo: wine: [ ] absent [ ] present girl plays in forest: [ ] absent [ ] present girl eats the cargo: [ ] absent [ ] present villain is [ ] ogre [ ] wolf [ ] werewolf [ ] devil reconnaissance - villain finds out where the girl is going: [ ] absent [ ] present villain and girl take separate paths: [ ] absent [ ] pins vs needles [ ] short vs long woodcutters are in the forest: [ ] absent [ ] present wolf impersonates girl: [ ] absent [ ] present grandmother gives instructions on opening door: [ ] absent [ ] present girl replaces cargo [ ] absent [ ] dung [ ] nails monster eats granny: [ ] absent [ ] present monster dresses up in grannys clothes: [ ] absent [ ] present monster disguises voice: [ ] absent [ ] present girl eats remains of granny: [ ] absent [ ] present girl eats body parts: [ ] absent [ ] present [ ] refuses girl eats granny teeth: [ ] absent [ ] present girl drinks blood: [ ] absent [ ] present [ ] refuses the girl is warned about the danger: [ ] absent [ ] by monster [ ] by animals girl flees home boards up house: [ ] absent [ ] present monster stalks girl "i'm coming!": [ ] absent [ ] present wolf tells girl to take 
off clothes: [ ] absent [ ] present throws clothes into fire: [ ] absent [ ] present wolf tells girl to get into bed: [ ] absent [ ] present dialogue: [ ] absent [ ] present my what! head [ ] absent [ ] present my what! arms [ ] absent [ ] present my what feet [ ] absent [ ] present my what! legs [ ] absent [ ] present my what! ears [ ] absent [ ] present my what! teeth [ ] absent [ ] present my what! eyes [ ] absent [ ] present my what! nose [ ] absent [ ] present my what! hands [ ] absent [ ] present my what! mouth [ ] absent [ ] present my what! hairy [ ] absent [ ] present girl eaten: [ ] absent [ ] present girl cut out of stomach: [ ] absent [ ] present girl saved [ ] absent] by [ ] hunstman [ ] woodcutters [ ] father [ ] mother [ ] townsfolk [ ] granny girl saved by magic cloak: [ ] absent [ ] present [ ] magic wand girl tricks wolf: [ ] absent [ ] present wolf chases girl [ ] no [ ] to her house wolf killed: [ ] absent [ ] present wolf's stomach sewn up with stones inside matrix [character no. ] latin perrault randre dewolfe neill cupplesleon grimms lusatia goldenhood fintanonna grandmother joisten redcap catterinetta unclewolf serravalle threegirls legot blade milliena millienb millienc consigliere moncorvo ? n.b. the value represents a “gap” state for characters that were redundant or not relevant for a particular tale. for example, if the girl did not carry a basket (character ) then characters relating to the contents of the basket ( - ) – which logically could not be present – were coded as gap characters appendix b. description of the phylodag method strimmer and moulton ( ) proposed a likelihood-based method for comparing different phylogenetic hypotheses that correspond to directed acyclic graphs (dags). each node in the graph corresponds to a taxon, either extant or hypothetical (unobserved). 
the edges in the dag correspond to direct inheritance where the origin of the edge, the "parent", is the immediate ancestor and the end of the edge, the "child", is the offspring. cases where a taxon has only one parent are modelled by using familiar sequence evolution models such as the jukes-cantor model. however, when a taxon has more than one parent, a different evolutionary model is assumed: each of the parent taxa is given a relative weight, and each character is inherited from a parent that is randomly chosen based on these weights. inheritance from a parent follows the same model as in the case where there is only one edge pointing to the node in question. computing the likelihood of a dag model, i.e., the probability that a given set of sequences is obtained as the outcome of the given dag, is hard. moulton and strimmer proposed a random sampling technique to approximate the likelihood. their technique eventually converges to the exact likelihood value but in practice it may take a large number of samples, and hence, a long time, before obtaining accuracy that is sufficient for comparing different dags. we have developed an alternative approximation which is not based on random sampling but instead uses a technique called loopy belief propagation, see (murphy, weiss, & jordan, ). it is not guaranteed to converge to the exact value but on the other hand, it is often significantly faster than random sampling. in our experiments (not shown here, see (nguyen & roos, in preparation)), it produces better accuracy than a number of different random sampling techniques with less computation time. we also extend the earlier method by strimmer and moulton by including a parameter learning step where the edge lengths that characterize the amount of evolutionary change along each edge in the network are learned from the data so that they need not be given as input to the phylodag method. 
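the inheritance model described above can be illustrated with a small python sketch. it is a minimal illustration under simplifying assumptions, not the phylodag implementation: every parent state is treated as observed (so there is no summation over hypothetical nodes and no loopy belief propagation), edge lengths are taken to be the expected number of changes per character, and the function names `jc_transition_prob` and `child_likelihood` are hypothetical. the jukes-cantor probabilities are written for an arbitrary number of character states, the usual generalisation of the four-state dna case.

```python
import math


def jc_transition_prob(parent_state, child_state, edge_length, n_states):
    """Jukes-Cantor probability of child_state given parent_state.

    Assumption: edge_length is the expected number of changes per character,
    and all n_states states are equally likely targets of a change.
    """
    k = n_states
    decay = math.exp(-k * edge_length / (k - 1))
    if parent_state == child_state:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k


def child_likelihood(child, parents, edge_lengths, n_states, weights=None):
    """Probability of a child's characters given fully observed parents.

    Each character is inherited from one parent chosen at random according
    to the weights, then evolves along that parent's edge. Marginalising
    over the choice of parent gives a weighted mixture per character.
    """
    if weights is None:
        # Uniform weights: every parent has the same influence.
        weights = [1.0 / len(parents)] * len(parents)
    likelihood = 1.0
    for site, child_state in enumerate(child):
        site_prob = sum(
            w * jc_transition_prob(parent[site], child_state, t, n_states)
            for w, parent, t in zip(weights, parents, edge_lengths)
        )
        likelihood *= site_prob
    return likelihood


# Illustrative data only: one child with two parents, three-state characters.
example = child_likelihood(
    child=[0, 2, 1],
    parents=[[0, 2, 2], [1, 2, 1]],
    edge_lengths=[0.1, 0.3],
    n_states=3,
)
```

with a single parent the mixture reduces to the ordinary jukes-cantor likelihood of a tree edge. with several parents and uniform weights, each parent contributes equally to every character, which corresponds to the uniform-weight setting mentioned in this appendix.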
in practice, the phylodag method takes as input a set of sequences and a tree structure. it then considers, in turn, every possible additional edge between any two nodes in the tree (including edges between two extant nodes, edges between an extant and a hypothetical node, and edges between two hypothetical nodes) and evaluates the likelihood of the network in which the edge in question is included in addition to the edges of the initial tree structure. the edge or edges that improve the likelihood score the most are included in the output network. it is often useful to also set an upper bound on the number of added edges, so as to obtain a more easily interpreted network in which only the most significant reticulation events are included. in the present work, we limited the number of additional edges to four to facilitate the interpretation of the models. we used the jukes-cantor model, which can be directly extended to handle any number of character states other than four, for modeling the evolution of individual features and, following moulton and strimmer, set the weights on the parents to be uniform so that each parent taxon has the same influence on the dependent taxon.

appendix c. parametric bootstrap

parametric bootstrapping for testing phylogenetic topologies, i.e., tree structures, was first suggested by (huelsenbeck & crandall, ). our implementation is primarily based on the later description by (posada, ). the testing procedure of topology m (null hypothesis) against topology m (alternative hypothesis) can be briefly described as follows.

1. estimate the parameters (edge lengths) in models m and m by maximum likelihood. denote the maximum likelihood estimates (mles) by and , respectively.
2. calculate the log-likelihood ratio (llr) , where and are the log-likelihoods of the data given structures m and m with mle parameters, respectively.
3. from structure m with estimated parameters , draw k= simulated data sets which all have the same size and missing data as the original data set.
4. for each simulated data set , estimate parameters and for both structures, and calculate the llr . use these to obtain an approximate distribution of the llr between m and m under the null hypothesis m .
5. let f be the number of times that the llr on the simulated data sets is larger than the llr on the original data from step 2. if the quotient f/k (in this case k= ) is smaller than a predefined threshold ( . or . ), the null hypothesis is rejected.

the intuition is that if the null hypothesis is true, then the simulated data sets in step 3 are drawn from the same distribution as the observed data. this implies that the llr based on the observed data, computed in step 2, follows the same distribution as the llr values for the simulated data in step 4. suppose now that the llr for the observed data, which measures how much better model m fits the observed data than m , is higher than almost all of the simulated llr values. by the above reasoning, this must be unlikely, since the observed llr value is supposed to be drawn from the same distribution as the simulated ones, and we are led to reject the null hypothesis. such a test is valid in the sense that if the null hypothesis is true, it is unlikely to be rejected.

appendix d. additional results

networks c (fig. ) and d (fig. ) are representative examples among the two main hypotheses: the oral origins hypothesis (network c) and the literary origins hypothesis (network d). figures – show the rest of the networks for completeness.

fig. phylodag network a. log-likelihood – . .
fig. phylodag network b. log-likelihood – . .
fig. phylodag network e. log-likelihood – . .
fig. phylodag network f. log-likelihood – . .
fig. phylodag network g. log-likelihood – . .
fig. phylodag network h. log-likelihood – . .
fig. phylodag network i. log-likelihood – . .
fig.
phylodag network j. log-likelihood - . .
fig. phylodag network k. log-likelihood - . .

comparative evaluation of term selection functions for authorship attribution

jacques savoy
computer science department, university of neuchatel, neuchâtel, switzerland

abstract

different computational models have been proposed to automatically determine the most probable author of a disputed text (authorship attribution). these models can be viewed as special approaches in the text categorization domain. from this perspective, a first step is to determine the most effective features (words, punctuation symbols, part-of-speech, word bigrams, etc.) to discriminate between different authors. to achieve this, we can consider different independent feature-scoring selection functions (information gain, gain ratio, pointwise mutual information, odds ratio, chi-square, bi-normal separation, gss, darmstadt indexing approach (dia), and correlation coefficient). other term selection strategies have also been suggested in specific authorship attribution studies. to compare these two families of selection procedures, we have extracted articles from two newspapers belonging to two categories (sports and politics). to enlarge the basis of our evaluations, we have chosen one newspaper written in the english language (‘glasgow herald’) and a second one in italian (‘la stampa’). the resulting collections contain from to , articles written by four to ten columnists.
using the kullback–leibler divergence, the chi-square measure, and the delta rule as attribution schemes, this study found that some simple selection strategies (based on occurrence frequency or document frequency) may produce similar, and sometimes better, results compared with more complex ones.

correspondence: jacques savoy, computer science department, university of neuchatel, neuchâtel, switzerland. email: jacques.savoy@unine.ch
literary and linguistic computing © the author . published by oxford university press on behalf of allc. all rights reserved. for permissions, please email: journals.permissions@oup.com. doi: . /llc/fqt. digital scholarship in the humanities advance access published december ,

introduction

in automatic authorship attribution, computer systems can be designed and implemented to determine the most probable author behind a disputed document or a text excerpt (mosteller and wallace, ; juola, ; stamatatos, ). to achieve this, a set of texts written by each of the possible writers must be made available to the classifier. in this study, we focus on the closed-class attribution problem in which the real author is one of the several possible candidates. other pertinent concerns related to this issue include the mining of demographic or psychological information on an author (profiling) (pennebaker, ), or simply determining whether or not a given author did write a given internet message or document (verification) (koppel et al., ). instead of being limited to text, we can also consider other media (e.g. music, song, picture, drawing). to solve this categorization task, we need to extract and select features that are useful in identifying the differences between the authors’ writing styles.
in this study, we will consider words and punctuation symbols as possible features or terms. in a second step, we determine the discrimination power of each term and then apply a selection procedure to derive a reduced set of terms that can effectively discriminate between the different possible authors. finally, by applying classification rules or schemes, the system can determine the most probable author of a text excerpt.

the rest of this article is organized as follows. section presents related work, and section outlines the main characteristics of the corpora used in our experiments. section briefly describes the term selection functions applied in our experiments. section presents the selected attribution methods, and section evaluates them according to various term selection strategies. section discusses some practical considerations, and section draws the main conclusions of this study.

related work

as far as automatic authorship attribution approaches are concerned, the early solutions were based on a unitary stylometric value that must be constant for a given author but should vary from one writer to another (holmes, ). as measures, previous studies have suggested using vocabulary richness measures, average word length, mean sentence length, and yule’s k measure or other statistics related to type-token ratios (baayen, ). none of these measures has proven satisfactory in all cases (hoover, ), due in part to word distributions (including word bigrams or trigrams) ruled by a large number of low-probability elements (large number of rare events) (baayen, ). moreover, these measures are based on a single measurement, and therefore no term selection procedure was needed.
to account for the vocabulary used, mosteller and wallace ( ) propose a semiautomatic selection procedure to determine the most useful terms, and particularly the most frequent ones composed mainly of various function words (determiners, prepositions, conjunctions, pronouns, and some adverbs and verbal forms). in this case, we assume that the occurrence frequency of some word types is not fully controlled by the author and varies from one person to the other. for example, mosteller and wallace ( ) notice that the term ‘while’ was used times by hamilton but never by madison, the second possible writer. in their last study, these researchers worked with a list reduced to word types.

following this perspective, burrows ( ) suggests automatically selecting words that can discriminate between authors. the selection criterion is simply the occurrence frequency, and burrows ( ) proposes to consider the first – most frequent word types. in such a sample, we usually find a large proportion of function words. this threshold was first increased to , and then to , (hoover, ). as a variant, jockers and witten ( ) derive , terms (single words and bigrams of words) appearing at least once in texts written by all three possible authors of the ‘federalist papers’. from this list, the researchers extract a reduced set composed of terms, after imposing the condition that for each item the relative frequency must be > . %.

various studies have followed this vein, suggesting the use of frequent word types containing many function words (damerau, ; holmes and forsyth, ; baayen and halteren, ; miranda garcía and calle martín, ). in a similar way, grieve ( ) considers selecting all word types in a k-limit profile, where k indicates that each selected term must occur in at least k articles written by every author (e.g. a value k = 5 imposes the presence of the target word in at least five articles written by every possible author).
thus, this scheme requires that all selected terms be used by all authors and not only by a fraction of them. instead of selecting the features based on the available corpus, zhao and zobel ( ) propose to define a priori the most useful word types. their suggested list contains english word types, composed mainly of function words but with some lexical terms (independent of the theme of the underlying texts).

finally, other studies suggest applying approaches used in automatic text categorization (or text classification), defined as the task of automatically assigning one (or more) predefined label(s) to each input text (manning and schütze, ; sebastiani, ; stamatatos, ). to build such a system, we need to apply a feature selection procedure to reduce the number of terms needed to discriminate between the different categories. with fewer terms, the underlying computation is simplified, and the lexical space to be explored is also reduced (liu and motoda, ). in a comparative study, yang and pedersen ( ) evaluated six selection measures for topical text classification, using two corpora and two classifiers (k-nearest neighbors and linear least squares fit). their experiments indicate that the information gain (also called expected mutual information) or the chi-square statistic tends to achieve the best results. according to sebastiani ( ), the odds ratio (or) and the chi-square are usually the selection functions displaying the best performance.

topical text categorization and authorship attribution do not, however, strictly follow the same implementation. in the former, we usually remove the most frequent words (stopword list) having no precise and useful meaning. in authorship attribution, these terms are viewed as important style markers because they are used in a less conscious way than other words (pennebaker, ).
thus, their use and occurrence frequency may differ from one author to the other. moreover, topical text categorization must deal with sparse data because many terms appear only in a few documents. therefore, numerous synonyms must be taken into account to achieve high effectiveness. in authorship attribution, we tend to ground the classification decision on frequent terms, thus reducing the problems related to synonymy and data sparseness.

the main objective of this article is to determine whether, from the set of all possible terms and punctuation symbols, we can automatically select a smaller pertinent set of terms that can discriminate among authors. this objective can be achieved by automatically ignoring noisy terms. those terms are irrelevant in discriminating between the possible authors because their occurrence distributions are similar among them. such noisy terms must be ignored, and their removal might improve the classification accuracy. moreover, working with a reduced set of terms will speed up the underlying computation and decrease the risk of overfitting the classifier to the available data (hastie et al., ).

evaluation corpora

to obtain a replicable test collection with authors sharing a common culture and having similar language registers, we opt for a stable and publicly available corpus by pulling out a subset of the clef- test suite. the first two collections are written in the english language and correspond to articles appearing in in the ‘glasgow herald’ newspaper. from this news source, we have chosen articles covering two distinct topics, namely, ‘sports’ and ‘politics’. for each set, we have selected five journalists having written , articles on sports and on politics. the distribution over authors is depicted in table . to complement these first two collections, we extracted two additional corpora based on articles appearing in in the ‘la stampa’ newspaper (written in the italian language).
as with the first newspaper, we also selected articles covering the categories ‘sports’ and ‘politics’. for the sports subset, we chose four journalists having written , articles, whereas for the politics corpus, we had ten columnists having written , articles. table shows the distribution over the authors.

these corpora are pertinent for authorship attribution because each collection is formed by texts having the same general topic and genre. moreover, they originate from the same period. we know that the style may differ from one person to another, but the period (juola, ; hughes et al., ), the topic, the genre (labbé, ), and the text intent also have an obvious impact on the style. finally, their spelling was controlled and normalized (e.g. to denote the capital of the people’s republic of china, we can name either beijing or peking).

to speed up the computation and derive an effective feature set, we ignored all terms appearing fewer than ten times in a given corpus and terms appearing only in a single article. moreover, we also removed terms used only by a single author. such words may be good indicators of the real author, but they are also easy for another person to use when aiming to play a masquerade. for example, when analyzing the ‘federalist papers’, this filter will remove the term ‘while’ used frequently by hamilton but never by madison, as well as the term ‘whilst’ used only by madison.

with the ‘glasgow herald’, after applying these constraints, the remaining vocabulary contains , word types for the sports corpus and , terms for the political domain. in the ‘la stampa’ corpora, the sports subset comprises , word types, and the politics subset is composed of , terms.
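the three vocabulary constraints described above (at least ten occurrences in the corpus, presence in more than one article, use by more than one author) can be sketched in python as follows. this is only an illustrative sketch: the function name, the (author, tokens) input layout, and the parameter names are assumptions introduced here, not part of the original study.

```python
from collections import Counter, defaultdict


def filter_vocabulary(docs, min_count=10, min_docs=2, min_authors=2):
    """Return the set of terms that pass the three frequency constraints.

    docs: iterable of (author, tokens) pairs, one per article
          (hypothetical input layout).
    """
    total = Counter()             # occurrences of each term in the corpus
    doc_freq = Counter()          # number of articles containing the term
    authors = defaultdict(set)    # authors who used the term at least once

    for author, tokens in docs:
        total.update(tokens)
        for term in set(tokens):
            doc_freq[term] += 1
            authors[term].add(author)

    return {
        term
        for term in total
        if total[term] >= min_count
        and doc_freq[term] >= min_docs
        and len(authors[term]) >= min_authors
    }
```

under this filter, a term such as ‘while’ in the ‘federalist papers’ example would be dropped because only one author uses it, however frequent it is.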
the question that then arises is the following: can we extract from these sets of terms subsets having better discrimination power to enhance the classification performance?

selection functions

in the machine learning domain, we can find different independent feature-scoring functions to rank the features according to their discriminative power. to measure this capability for a term tk according to a given category (or author) cj, with j = 1, 2, . . . , |c|, we usually use a contingency table for each pair (tk, cj) as depicted in table . in this table, the value a indicates the number of texts belonging to the category cj in which the term tk occurs. when considering all other classes (denoted by –cj), the term tk appears in b other texts. thus, in the whole corpus, this term occurs in a+b texts, while we can count a+c texts labeled with the category cj.

to measure the association between a term tk and a category (or author) cj, we can compute the ‘pointwise mutual information’ (pmi) given in equation ( ) (church and hanks, ).

$$\mathrm{PMI}(t_k, c_j) = \log\left[\frac{\mathrm{Prob}[t_k, c_j]}{\mathrm{Prob}[t_k]\cdot\mathrm{Prob}[c_j]}\right] = \log\left[\frac{a/n}{\frac{a+b}{n}\cdot\frac{a+c}{n}}\right]$$

this function compares two models to estimate the probability of selecting the term tk within the category cj. the first model is based on a direct estimation of the joint probability (and denoted prob[tk, cj] = a / n). this estimation is the numerator of equation ( ). the second model (denominator of equation ( )) estimates this probability by considering independently the probability of the occurrence of the term tk (prob[tk] = (a+b) / n), and the probability of selecting a text belonging to the category cj (prob[cj] = (a+c) / n). this second model assumes that there is no relationship between the occurrence of the term tk and the category cj.
when this assumption is true (no real relationship between the category cj and the term tk), prob[tk, cj] can be estimated by prob[tk] · prob[cj]. in such cases, the two probability estimates will be close, and the ratio in equation ( ) will return a value close to 1. computing the logarithm of such a value, we will find a value close to 0, indicating independence between the term occurrence and the corresponding category.

on the other hand, when a strong association does exist between the term tk and the category cj, the value of a will be large. the direct estimation for prob[tk, cj] will be larger than the product prob[tk] · prob[cj]. the ratio will then be larger than 1 and the logarithm function will return a positive value. with a negative association between the term tk and the category cj, the numerator will be smaller than the denominator, returning a value smaller than 1. taking the logarithm, a negative value is returned, indicating that the term tk is less frequently used in category cj than in the rest of the corpus.

table distribution of , articles about sports and articles about politics by author name in the ‘glasgow herald’

name | topics | number
douglas derek | sports |
gallacher ken | sports |
gillon doug | sports |
paul ian | sports |
traynor james | sports |
total | | ,
johnstone anne | politics |
shields tom | politics |
smith graeme | politics |
trotter stuart | politics |
wishart ruth | politics |
total | |

table distribution of , articles about sports and , articles about politics by author in ‘la stampa’

name | topics | number
ansaldo marco | sports |
beccantini roberto | sports |
del buono oreste | sports |
ormezzano gian paolo | sports |
total | | ,
battista pierluigi | politics |
benedetto enrico | politics |
galvano fabio | politics |
gramellini massimo | politics |
meli mari teresa | politics |
nirenstein fiama | politics |
novazio emanuele | politics |
pantarelli franco | politics |
passarini paolo | politics |
spinelli barbara | politics |
total | | ,
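the pmi computation from the contingency counts a, b, c, and d can be sketched in a few lines of python. the function name, the choice of base 2 for the logarithm, and the handling of a = 0 are assumptions made for this illustration; the source does not state them.

```python
import math


def pmi(a, b, c, d, base=2.0):
    """Pointwise mutual information of a term and a category.

    a: texts of the category containing the term
    b: texts of other categories containing the term
    c: texts of the category without the term
    d: texts of other categories without the term
    The logarithm base is an assumption of this sketch.
    """
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # the term never occurs in the category
    joint = a / n                               # direct estimate of prob[tk, cj]
    independent = ((a + b) / n) * ((a + c) / n)  # prob[tk] * prob[cj]
    return math.log(joint / independent, base)
```

a value near zero signals independence, a positive value a positive association, and a negative value a term used less often in the category than in the rest of the corpus.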
to illustrate this idea, we have taken a hypothetical numerical example given in table . in this case, the term ω appears in twenty-one texts in the corpus, of which ten were written by author aj. the other authors have used the word ω in eleven other texts. the last row in table indicates that the author aj has written texts, while we can count , texts written by all other authors. in other words, aj has written % of all the texts belonging to the corpus.

a quick analysis reveals that the author aj represents approximately % of the occurrences of the term ω, but only % of all texts. when computing the pmi value (formulation given below), we can see that the joint estimation is larger than the denominator. the resulting ratio between the two models is larger than 1 ( . in our example), giving a final value of . . the term ω is more closely associated with aj than would be expected by pure chance. because the presence of this term in a text is an indication that this document might have been written by aj, we can select this term to help discriminate between aj and all other possible writers.

$$\mathrm{PMI}(\omega, a_j) = \log\left[\frac{a/n}{\frac{a+b}{n}\cdot\frac{a+c}{n}}\right], \qquad a = 10,\; b = 11$$

on the other hand, if the value for a in table had been 2 instead of 10, keeping the same values for the last row and the last column, we would find no relationship between the term ω and the author aj. observing two texts written by aj with the term ω corresponds closely to the independent model (pure chance). because aj wrote % of the whole corpus, it is not surprising to see close to % of them with the term ω. in this case, the pmi function will return the value . , a value close to 0.

using such a term selection function seems a pertinent choice. such a procedure may reveal the terms useful in discriminating between the possible authors. moreover, this approach may rank all terms according to their discriminative power, and we can limit the selection to the top m most discriminative terms.
as a second function, we can estimate the probability prob[cj | tk], a measure denoted darmstadt indexing approach (dia) (fuhr et al., ). based on table , this probability is estimated by a / (a+b) (to simplify the presentation, all formulae are re-grouped in the appendix). the dia function is based on different arguments than those justifying the pmi function.

as a third function, we can use the odds ratio (or) (manning and schütze, ), which always returns a positive value. a positive association between the term tk and the category cj is indicated by a value larger than 1, whereas a value close to 0 signifies an opposition. a value close to 1 denotes independence between the term and the underlying class.

as a fourth selection function, we can use the information gain (ig) or the expected mutual information. the value returned by this function is large if a positive association exists. a small positive value signifies the absence of a discriminative power for the term tk and the category cj. following the same interpretation, we can compute the chi-square, χ²(tk, cj), statistic (manning and schütze, ). as a sixth function, we can apply the gain ratio (gr) returning a positive value to signal either a positive or negative association between the term tk and the category cj. independence is indicated by a value close to 0.

table example of a contingency table for a term tk and a category cj

 | category cj | category –cj |
term tk | a | b | a+b
other –tk | c | d | c+d
 | a+c | b+d | n = a+b+c+d

table contingency table for the term ω and the author aj

 | aj | other authors –aj |
term ω | 10 | 11 | 21
other –ω | | , | ,
 | | , | ,
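the exact formulae used in the study are grouped in its appendix, which is not reproduced here. as a hedged illustration only, the sketch below implements dia as stated above (a / (a+b)) together with the usual textbook 2x2 forms of the odds ratio and the chi-square statistic; the function names and the treatment of zero denominators are assumptions of this example and may differ from the author's definitions.

```python
def dia(a, b):
    """Darmstadt indexing approach: estimate of prob[cj | tk] as a / (a + b)."""
    return a / (a + b)


def odds_ratio(a, b, c, d):
    """Textbook odds ratio (a*d) / (b*c); 1 means independence.
    Returning infinity when b*c == 0 is a convention chosen for this sketch."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def chi_square(a, b, c, d):
    """Textbook chi-square statistic for a 2x2 contingency table."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denominator
```

for the hypothetical example of the text (a = 10, b = 11), dia returns 10/21, i.e. slightly less than half of the texts containing the term were written by aj.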
as a seventh term selection function, derived from the χ², we can provide the correlation coefficient, cc(tk, cj) (ng et al., ), which indicates a positive association by a positive value (whereas a negative value signifies an opposition). the independence between the term and the category is denoted by a value close to 0. following the same interpretation, we can compute the gss coefficient (gavalotti et al., ). another interesting selection function is the bi-normal separation (bns) suggested by forman ( ). in the presence of independence, this function returns a small positive value. a larger value indicates either a positive or a negative association between the term tk and the underlying category cj.

in addition to these nine selection functions, we can simply consider the document frequency (df) indicating the number of texts indexed by the term tk. the larger this value, the better the corresponding term. moreover, we can also assume that the style of a given author may be revealed by the frequent use of certain forms. from this perspective, we can follow burrows ( ) and use the absolute term frequency (tf). as for the df value, the higher the absolute term frequency (or tf), the more useful the term.

when applying one of the aforementioned selection functions, we can compute a local utility value denoted f(tk, cj) for each tk and each category cj. when faced with a binary classification problem (two authors), such a function is enough to define the overall selective value for each term. in authorship attribution in general, the number of authors (categories) is larger than two. in such cases, we need to aggregate the local utility values over the |c| categories. to define such a global utility measure for a term tk (denoted uop(tk)), we can take the maximum over the |c| categories or compute the sum or a weighted mean as shown in equation ( ).
$$U_{\max}(t_k) = \max_{j} f(t_k, c_j), \qquad U_{\mathrm{sum}}(t_k) = \sum_{j=1}^{|C|} f(t_k, c_j), \qquad U_{\mathrm{wmean}}(t_k) = \sum_{j=1}^{|C|} \mathrm{Prob}[c_j]\cdot f(t_k, c_j)$$

finally, to select the m most adequate terms, we extract the m terms having the highest utility values uop(tk) according to one of the aggregate operators given above (max, sum, or weighted mean).

authorship attribution methods

to evaluate the different term selection functions, we have selected three authorship attribution approaches to ground our findings on a relatively broad basis. as a first method, we can apply the approach suggested by zhao and zobel ( ), who propose to compute the distance between the author profile aj (concatenation of all his/her writings) and the query text q by using the kullback–leibler divergence (kld) (also called relative entropy (manning and schütze, )). this measure is given in equation ( ), where probq[ti] (or probaj[ti]) indicates the occurrence probability of a term ti in the query q (or in the author profile aj), for i = 1, 2, . . . , m.

$$\mathrm{KLD}(q\,\|\,a_j) = \sum_{i=1}^{m} \mathrm{Prob}_q[t_i]\cdot\log\left[\frac{\mathrm{Prob}_q[t_i]}{\mathrm{Prob}_{a_j}[t_i]}\right]$$

when two distributions are identical, the kld measure is 0. otherwise, the formula returns a positive value. this value grows as the distance (disagreement) increases between the two underlying distributions. to estimate the needed probabilities, we can apply the maximum likelihood principle and estimate prob[ti] = tfi/n, with tfi indicating the occurrence frequency of a term ti, and n the size of the document. however, it is usually better to smooth such estimates to avoid null probabilities (manning and schütze, ). in our evaluations, we have applied lidstone’s rule, where the probabilities are then estimated as (tfi+λ) / (n+λ·|v|), with |v| indicating the vocabulary size and λ a parameter fixed to . (showing the best performance).
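a minimal python sketch of the kld attribution scheme with lidstone smoothing is given below. several points are assumptions of this illustration rather than details confirmed by the source: the function names, the use of the natural logarithm, and the choice of n as the number of tokens that belong to the selected vocabulary (so that the smoothed probabilities sum to one). the value of λ is deliberately left as a required argument.

```python
import math
from collections import Counter


def lidstone_probs(tokens, vocabulary, lam):
    """Smoothed estimates (tf + lam) / (n + lam * |V|) over the selected terms."""
    counts = Counter(t for t in tokens if t in vocabulary)
    n = sum(counts.values())  # assumption: n counts only the selected terms
    denominator = n + lam * len(vocabulary)
    return {t: (counts[t] + lam) / denominator for t in vocabulary}


def kld(query_tokens, profile_tokens, vocabulary, lam):
    """Kullback-Leibler divergence between a query text and an author profile."""
    p = lidstone_probs(query_tokens, vocabulary, lam)
    q = lidstone_probs(profile_tokens, vocabulary, lam)
    return sum(p[t] * math.log(p[t] / q[t]) for t in vocabulary)


def attribute(query_tokens, profiles, vocabulary, lam):
    """Return the author whose profile has the smallest divergence.

    profiles: dict mapping author name -> list of tokens (hypothetical layout).
    """
    return min(
        profiles,
        key=lambda author: kld(query_tokens, profiles[author], vocabulary, lam),
    )
```

because the smoothing gives every selected term a non-zero probability in each profile, the logarithm is always defined.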
as a second authorship attribution method, we can compare the representation of a given text q with an author profile aj using the chi-square statistic (grieve, ) defined by equation ( ) (the same general method can be used both as a term selection function and as an attribution scheme). in this formulation, rtfq(ti) represents the relative frequency of the ith term in the query text, and rtfaj(ti) the same information in the jth author profile. when comparing a query text q with different author profiles aj, we simply select the lowest chi-square value to determine the most probable author of a disputed text.

$$\chi^2(q, a_j) = \sum_{i=1}^{m} \frac{\left(\mathrm{rtf}_q(t_i) - \mathrm{rtf}_{a_j}(t_i)\right)^2}{\mathrm{rtf}_{a_j}(t_i)}$$

as a third authorship attribution method, we used the delta model (burrows, ) measuring the distance between two texts according to the standardized frequency (z score) of their terms. this value is obtained from the relative occurrence frequency (denoted rtfij for term ti in the document dj) by subtracting the mean (meani) and dividing by the standard deviation (sdi), the mean and standard deviation being estimated from the underlying corpus (see equation ( )).

$$z\,\mathrm{score}(t_{ij}) = \frac{\mathrm{rtf}_{ij} - \mathrm{mean}_i}{\mathrm{sd}_i}$$

once these dimensionless quantities are obtained for each selected word, we can then compute the distance to those obtained from author profiles. given a query text q, an author profile aj, and a set of terms ti, for i = 1, 2, . . . , m, we compute the delta value (or the intertextual distance) by applying equation ( ).

$$\Delta(q, a_j) = \frac{1}{m}\sum_{i=1}^{m}\left|\,z\,\mathrm{score}(t_{iq}) - z\,\mathrm{score}(t_{ij})\,\right|$$

in this formulation, proposed by burrows ( ), we assign the same weight to each term ti. a large difference between q and aj will appear when, for a given term, both z scores are large but with opposite signs. on the other hand, when a term appears with similar relative occurrence frequencies in both texts, the difference in z scores will be small.
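the chi-square distance and the delta distance described above can be sketched as follows. the function names and the dictionary-based input layout are hypothetical; the use of the sample standard deviation and the decision to skip terms with zero expected frequency or zero standard deviation are assumptions of this sketch, not choices documented in the source.

```python
import statistics


def chi_square_distance(query_rtf, profile_rtf, terms):
    """Sum over terms of (rtf_q - rtf_a)^2 / rtf_a; lower means closer."""
    total = 0.0
    for term in terms:
        expected = profile_rtf.get(term, 0.0)
        if expected > 0:  # assumption: skip terms absent from the profile
            total += (query_rtf.get(term, 0.0) - expected) ** 2 / expected
    return total


def delta(query_rtf, profile_rtf, corpus_rtfs, terms):
    """Mean absolute difference of z scores between a query and a profile.

    corpus_rtfs: list of dicts (term -> relative frequency), one per text,
    used to estimate the mean and standard deviation of each term;
    it must contain at least two texts.
    """
    differences = []
    for term in terms:
        values = [doc.get(term, 0.0) for doc in corpus_rtfs]
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation (assumption)
        if sd == 0:
            continue  # a constant term carries no information
        z_query = (query_rtf.get(term, 0.0) - mean) / sd
        z_profile = (profile_rtf.get(term, 0.0) - mean) / sd
        differences.append(abs(z_query - z_profile))
    return sum(differences) / len(differences) if differences else 0.0
```

in both cases the most probable author is the one whose profile yields the smallest distance to the query text.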
finally, when for all m terms the differences in z scores are small, the resulting Δ distance will be small, indicating that the same person probably wrote both texts.

evaluation

to achieve unbiased performance estimations, we cannot use the same instances for both training the classifier and testing it. the set of available examples must, therefore, be divided into a training set and a distinct test set. in the current study, we opted for the leave-one-out approach (hastie et al., ). when applying this methodology with the sports corpus extracted from the ‘glasgow herald’, each of the , articles, in turn, will form the query text, whereas the remaining , texts will generate the training set used to determine the most useful terms. the accuracy rate reported in this study corresponds to the micro-average value, the mean over all documents.

some authors suggest not using the same training set both to let the classifier learn (i.e. build the author profiles) and to select the features. thus, it is recommended to use a disjoint set of instances for feature selection and for learning. although a bias exists from a theoretical point of view, from a practical viewpoint the impact of this bias is rather limited (singhi and liu, ).

as a first authorship attribution model, we have evaluated the kld model (zhao and zobel, ). with the ‘glasgow herald’ corpus, the feature selection is based on predefined word types. this list mainly contains function words (the, in, but, not, am, of, can . . .) and frequent items (became, nothing . . .). some entries are less frequent (howbeit, whereafter, whereupon), whereas others indicate the expected behavior of the tokenizer (doesn, weren) or correspond to an arbitrary decision (indicate, missing, seemed).
with the italian corpora, we first need to define a list of frequent terms usually appearing in all documents. to achieve this, we have chosen a stopword list used by search technology with this language (savoy, ). this list includes terms containing mainly function words (il, la, del, in, con, nostro, essi, fare . . .) and frequent items (anno, casa . . .).

using the kld method with the predefined set of english words, we achieved an accuracy rate of . % for the sports corpus and . % for politics (these values are reported in the line ‘a priori’ in table ). to obtain a more complete picture, we also considered all available terms (words and punctuation symbols, no selection). in this case, the kld scheme produces an accuracy rate of . % (sports, based on , terms) and . % (politics, based on , terms) (line denoted ‘all’ in table ).

with the italian language and the predefined set of words, we achieved an accuracy rate of . % for the sports subset and . % for politics (these performances appear in the line ‘a priori’ in table ). when considering all terms without any selection, an accuracy rate of % was obtained with the sports subset (based on , terms) and . % with the politics subset (based on , terms) (line denoted ‘all’ in table ).

we then compare these baselines (a priori selection) with eleven other term selection methods and three aggregation operators (max, sum, or weighted mean (denoted wmean)). for the number of selected terms (parameter m in the previous formulae depicted in section ), we have tested the following values: { ; ; ; ; , ; , ; , ; , ; , ; and , }. the last value ( , ) corresponds to approximately . % of all , terms available for the sports subset in the ‘glasgow herald’ (or . % of all , available terms for the politics subset). similar percentages can be obtained when analyzing the two italian text collections. when considering larger numbers, the term space is not really reduced; therefore, we did not attempt this variation.
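the combination of a local selection function, an aggregation operator (max, sum, or wmean), and a cut-off m can be sketched as follows. the function names and the nested-dictionary layout of the local scores are assumptions made for this example; the local values f(tk, cj) may come from any of the selection functions discussed earlier.

```python
def global_utility(local_scores, priors, op="max"):
    """Aggregate the local utilities f(tk, cj) of one term over all categories.

    local_scores: dict category -> f(tk, cj)
    priors:       dict category -> prob[cj] (used only by "wmean")
    """
    if op == "max":
        return max(local_scores.values())
    if op == "sum":
        return sum(local_scores.values())
    if op == "wmean":
        return sum(priors[c] * f for c, f in local_scores.items())
    raise ValueError(f"unknown aggregation operator: {op}")


def select_top_m(scores_by_term, priors, m, op="max"):
    """Return the m terms with the highest aggregated utility.

    scores_by_term: dict term -> dict category -> f(tk, cj) (hypothetical layout).
    """
    ranked = sorted(
        scores_by_term,
        key=lambda term: global_utility(scores_by_term[term], priors, op),
        reverse=True,
    )
    return ranked[:m]
```

running select_top_m for each tested value of m and each operator reproduces the kind of parameter grid described above, from which only the best setting per selection function is reported.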
instead of reporting all possible combinations of the number of features with the three aggregation functions, we have only reported the best parameter setting for each selection function (number of terms / aggregation operator). the question that then arises is ‘can we obtain a better performance using fewer terms?’ if yes, can the set of terms defined by zhao and zobel ( ) produce a better performance than that produced from sets of terms defined by the various feature selection functions?

table feature selection methods with kld and ‘glasgow herald’, with the sports corpus and politics
function | sports: parameter | sports: accuracy (%) | politics: parameter | politics: accuracy (%)
a priori | terms | . | terms | .
all | , terms | . y | , terms | . y
tf(tk, cj) | , / max | . y | / max | . y
df(tk, cj) | , / max | . y | , / max | . y
ig(tk, cj) | , / wmean | . y | / sum | . y
gr(tk, cj) | , / max | . y | / max | . y
gss(tk, cj) | , / max | . y | , / max | . y
χ²(tk, cj) | , / sum | . y | / sum | . y
cc(tk, cj) | , / max | . y | / max | . y
bns(tk, cj) | , / wmean | . y | , / sum | . y
pmi(tk, cj) | , / wmean | . y | , / wmean | . y
or(tk, cj) | , / wmean | . y | / wmean | . y
dia(tk, cj) | , / max | . y | , / wmean | . y

table feature selection methods with kld and ‘la stampa’, with the sports collection and politics
function | sports: parameter | sports: accuracy (%) | politics: parameter | politics: accuracy (%)
a priori | terms | . | terms | .
all | , terms | . y | , terms | .
tf(tk, cj) | , / max | . y | / sum | . y
df(tk, cj) | , / max | . y | / max | . y
ig(tk, cj) | , / max | . y | , / wmean | . y
gr(tk, cj) | , / sum | . y | , / wmean | . y
gss(tk, cj) | , / max | . y | / max | . y
χ²(tk, cj) | , / max | . y | , / wmean | . y
cc(tk, cj) | , / max | . y | , / max | . y
bns(tk, cj) | , / max | . y | , / wmean | . y
pmi(tk, cj) | , / max | . | , / max | .
or(tk, cj) | , / max | . | , / wmean | . y
dia(tk, cj) | , / max | . | / max | .

j. savoy. digital scholarship in the humanities.
the accuracy rates depicted in tables and indicate that different selection functions may produce better performance levels than either the manual selection or the absence of any selection (lines labeled ‘all’). the manual selection (lines labeled ‘a priori’) produces relatively low accuracy rates for the english language compared with the others (see table ). for the politics subset of the ‘glasgow herald’, considering all possible terms clearly improves the performance (from . to . %). within the same category, but with the italian language (table ), the manual selection and the use of all terms tend to produce similar performance levels ( . versus . %). however, those accuracy rates are lower than those achieved by other selection functions.

overall, tables and tend to show that we can achieve high-performance results when considering relatively simple selection methods such as df, or the absolute tf. in these tables, the best performance is shown in bold. the performance differences with ig, gr, gss, chi-square (χ²), or cc functions are usually small and not significant. however, the pmi, or, and dia functions seem to offer lower performance levels than those produced by other selection schemes. when inspecting the different aggregation operators (max, sum, or weighted mean), we can see that the maximum function occurs frequently in tables and , indicating that this aggregation function tends to achieve the best results.

to verify whether a performance difference between two term selection procedures is statistically significant, we opted for the sign test (conover, ; yang and liu, ) (bilateral test, significance level α = %). in this case, the null hypothesis h0 assumes that both selection methods perform at a similar level. in tables – , we use the first line as a baseline, and any statistically significant performance difference is indicated by the symbol ‘y’.
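the sign test mentioned above can be sketched as follows. this is a minimal illustration and not the code used in the study: the function name `sign_test_p_value` and the per-document boolean inputs are assumptions made for the example. for each document, we record whether each of the two selection procedures led to a correct attribution; ties are discarded, and under the null hypothesis the remaining wins follow a binomial distribution with probability 0.5.

```python
from math import comb


def sign_test_p_value(correct_a, correct_b):
    """Two-sided (bilateral) sign test on paired per-document outcomes.

    correct_a and correct_b are equally long sequences of booleans
    (hypothetical layout): True when the attribution was correct.
    """
    wins_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    wins_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    n = wins_a + wins_b  # documents where both agree (ties) are discarded
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    # probability of observing k or fewer wins under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

a p-value below the chosen significance level leads to rejecting the hypothesis that both selection procedures perform at a similar level.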
as we can see in tables and , the performance differences are usually significant compared with the first row, the selection strategy based on a predefined set of words.

when applying the chi-square metric (grieve, ), we have tried different k-limits and found that for the ‘glasgow herald’ the -limit produces the best performance ( . % as depicted in table in the row ‘k-limit’) by selecting , terms for the sports corpus (or with k = for the politics subset, selecting terms and producing an accuracy of . %). when specifying k-limit = 10, all selected terms must appear in at least ten articles written by each journalist. thus, such a selection strategy requires that every possible author has used every selected term. with the two italian corpora, the best accuracy rates were achieved when considering the -limit (selecting a small set of thirty-one terms, accuracy rate = . %) for the sports subset or with k = for the politics subset (selecting terms and producing an accuracy of . %).

when ignoring the selection procedure, we take into account all possible terms. in this case, we can achieve an accuracy rate of . % (sports, , available terms) and . % (politics, based on , terms) with the english corpora (see table , line with the label ‘all’). in table , for the italian collections, the accuracy rate is . % (sports, , available terms) or . % (politics, based on , terms) when we ignore the selection procedure.

instead of strictly following the selection scheme proposed by grieve ( ), we can apply different feature-scoring selection functions. the best parameter settings and accuracy rates are reported in table for the ‘glasgow herald’ newspaper and in table for the italian corpora. as for the kld method, the results depicted in tables and indicate that an appropriate feature-scoring function (with its parameter values) might produce higher performance levels than when considering all terms or when selecting terms according to the best k-limit principle.
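the k-limit principle mentioned above can be illustrated with a short sketch. the function name `k_limit_terms` and the input layout (a list of (author, tokens) pairs) are assumptions made for this example, not details taken from the study; the sketch keeps a term only when every author has used it in at least k articles.

```python
from collections import defaultdict


def k_limit_terms(docs, k):
    """Return the set of terms used by every author in at least k documents.

    docs is a list of (author, tokens) pairs (hypothetical layout).
    """
    # term -> author -> number of that author's documents containing the term
    doc_counts = defaultdict(lambda: defaultdict(int))
    authors = set()
    for author, tokens in docs:
        authors.add(author)
        for term in set(tokens):  # count each term once per document
            doc_counts[term][author] += 1
    return {
        term
        for term, per_author in doc_counts.items()
        if all(per_author.get(author, 0) >= k for author in authors)
    }
```

raising k shrinks the selected set, because a single author who rarely uses a term is enough to exclude it.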
overall, when comparing the different selection strategies, the df or tf selection schemes tend to produce high performance levels. with this chi-square-based attribution scheme only, the bns selection function also offers a high effectiveness. after applying the sign test, we can see that the performance differences with the best k-limit approaches are usually statistically significant (indicated with the symbol ‘y’). finally, and for both languages, the pmi, the or, and the dia functions usually tend to return less pertinent term sets, achieving lower performance levels.

to obtain a broader view, we have also depicted the best performances achieved with the delta rule (burrows, ). as depicted in table for the ‘glasgow herald’, the best performance using the delta rule is achieved when considering the most frequent terms for the sports subset (accuracy rate . %, under the label ‘most freq.’) or with the politics corpus ( . %). with the italian text collections, the most effective number of terms is for the sports subset ( . %) or word types with the politics corpus ( . %) (see table ). this selection procedure proposed with the delta rule is equivalent to the tf scoring function with the sum aggregation.

table feature selection methods with chi-square method and ‘la stampa’, with the sports collection and politics
function | sports: parameter | sports: accuracy (%) | politics: parameter | politics: accuracy (%)
k-limit | terms | . | terms | .
all | , terms | . | , terms | .
tf(tk, cj) | / sum | . y | / wmean | . y
df(tk, cj) | / wmean | . y | / wmean | . y
ig(tk, cj) | , / max | . | / wmean | . y
gr(tk, cj) | , / max | . y | / wmean | .
gss(tk, cj) | , / max | . y | / max | . y
χ²(tk, cj) | , / sum | . | , / wmean | .
cc(tk, cj) | , / max | . | , / max | . y
bns(tk, cj) | , / max | . y | , / max | . y
pmi(tk, cj) | , / max | . | , / max | .
or(tk, cj) | , / max | . y | , / max | . y
dia(tk, cj) | , / max | . y | , / max | . y

table feature selection methods with chi-square method and the ‘glasgow herald’, with the sports corpus and politics
function | sports: parameter | sports: accuracy (%) | politics: parameter | politics: accuracy (%)
k-limit | , terms | . | terms | .
all | , terms | . y | , terms | .
tf(tk, cj) | / max | . y | / sum | . y
df(tk, cj) | / sum | . | / sum | . y
ig(tk, cj) | , / sum | . y | / wmean | . y
gr(tk, cj) | , / sum | . y | / sum | . y
gss(tk, cj) | , / max | . y | / max | . y
χ²(tk, cj) | , / sum | . y | / wmean | . y
cc(tk, cj) | , / max | . y | , / max | .
bns(tk, cj) | , / sum | . y | , / max | . y
pmi(tk, cj) | , / max | . y | , / sum | . y
or(tk, cj) | , / max | . y | , / wmean | . y
dia(tk, cj) | , / max | . y | / max | . y

table feature selection methods with delta method and ‘glasgow herald’, with the sports corpus and politics
function | sports: parameter | sports: accuracy (%) | politics: parameter | politics: accuracy (%)
most freq. | terms | . | terms | .
all | , terms | . y | , terms | . y
tf(tk, cj) | / max | . y | / sum | .
df(tk, cj) | / max | . y | / wmean | .
ig(tk, cj) | / sum | . y | / sum | . y
gr(tk, cj) | / wmean | . y | / sum | . y
gss(tk, cj) | / max | . y | , / max | . y
χ²(tk, cj) | , / sum | . | , / wmean | . y
cc(tk, cj) | / sum | . y | , / max | . y
bns(tk, cj) | , / max | . y | , / sum | . y
pmi(tk, cj) | , / max | . y | , / max | . y
or(tk, cj) | / max | . y | , / wmean | . y
dia(tk, cj) | / max | . y | , / max | . y

table feature selection methods with delta method and ‘la stampa’, with the sports collection and politics
function | sports: parameter | sports: accuracy (%) | politics: parameter | politics: accuracy (%)
most freq. | terms | . | terms | .
all | , terms | . y | , terms | . y
tf(tk, cj) | / max | . y | / max | .
df(tk, cj) | / max | . y | / max | .
ig(tk, cj) | , / max | . y | / max | . y
gr(tk, cj) | , / max | . y | / sum | . y
gss(tk, cj) | / max | . y | / max | .
χ²(tk, cj) | , / max | . y | / sum | . y
cc(tk, cj) | / max | . y | / max | . y
bns(tk, cj) | , / wmean | . y | / wmean | . y
pmi(tk, cj) | , / max | . y | / sum | . y
or(tk, cj) | , / sum | . y | / sum | . y
dia(tk, cj) | , / max | . y | / max | . y

without any feature selection, the delta rule produces, with the english corpus, an accuracy of . % (sports, , available terms) and . % (politics, , terms), as depicted in the line ‘all’ in table . with the ‘la stampa’ newspaper’s corpora and considering all terms, the delta rule achieves an accuracy of . % (sports, , terms) and . % (politics, with , terms). the performance differences with the first row are rather large, indicating that the delta rule must be applied with a reduced number of word types. as with the other authorship attribution methods, we have also reported the best success rate according to the eleven selection approaches together with the best parameter setting.

overall, the evaluations depicted for the delta rule (see tables and ) confirm that high performance levels are achieved when using the df, or the absolute tf as feature selection functions. the performance differences are usually statistically significant over the selection based on the most frequent words, but only for the two sports corpora. as a second choice, we can use the gss, ig, and chi-square functions. however, the or, the pmi, and the dia functions tend to produce less pertinent feature sets, at least in an authorship attribution context and especially when using the delta rule (see tables and ).

a few cases are worth a comment. with the english language (table ), the high result of the gr function in the sports subset is not confirmed by the politics corpus. we also notice that the correlation coefficient (cc) function with the english sports corpus (see table ) achieves a rather low accuracy level, as does the bns selection function with the italian politics subset.
finally, to give a view of the selection effect of the different functions, we have counted the percentage of selected terms in common between two functions with the english corpora. using only the sum as the aggregation operator, and varying the number of selected terms between and , , we can see that the functions df and tf return, on average, similar sets of terms (overlap degree between and %). a similar effect can be detected with the function cc and the chi-square metric (this can be explained by the fact that the function cc is derived from the chi-square) or between the ig and gr. we can also detect a relationship between the sets of features defined by the functions gss and ig. in these cases, the overlap is, on average, %. finally, it is difficult to find a clear relationship between the bns, or, or pmi functions and all the others. these three selection functions tend to propose different sets of features.

practical considerations

when faced with a new authorship attribution problem, which term selection function must we apply and how many terms must we select? based on four test collections and three attribution schemes, the experimental results do not show a strong systematic pattern. however, some trends can be detected. first, the absolute occurrence frequency (tf) and the df tend to produce pertinent and well-performing term sets for the three attribution schemes and the four collections. this is an indication that using frequent words as features to discriminate between different authors is an effective strategy. such selection approaches also have the advantage of being easy to implement and of having a clear interpretation for the end user. moreover, we cannot detect significant differences between the evaluations done with the english language and those performed over the italian collections.

it is worth mentioning that the term selection is not based on the whole vocabulary.
as specified in section , we can take into account the domain knowledge. thus, it is a good practice to ignore word types having a low occurrence frequency or appearing in a single (or a few) document(s). moreover, we have also removed words used by only a single author. it is known that such terms can be effective to distinguish between authors, and most of the selection functions will detect them as effective features for authorship discrimination. however, these terms are vulnerable because they can be easily used to spoof a given identity.

the second important question is to define the number of terms to be selected. determining a priori such an optimum value is difficult. the various experiments depicted in the previous section indicate that we need a small number of terms (between and ) to obtain one of the best performance levels with the delta rule (tables and ). the chi-square authorship attribution scheme also requires a relatively small number of terms to produce one of the best accuracy rates ( – (df function) with the english corpora (table ), and terms (df function) with the italian corpora (table )). with the kld method, a general conclusion is harder to draw. for the english corpora, we need ≈ for the politics subset, and , – , terms for the sports subset (table ). for the italian language, ≈ , terms for the sports subset and for the politics part (table ) are required to achieve the highest performance levels.
in addition to these findings, we must recall that the morphology of the italian language is more complex than that of english. thus, we can expect to have more functional word types in this language than in english. for example, the translation of the determiner ‘the’ (definite article) could be ‘il, lo, l, i, gli, la, or le’ because the variations in gender and number must be specified in the italian language. this difference in size can be reduced when we consider representing text using the lemmas (headword or dictionary entry) instead of the word types. in this case, we can conflate all inflected forms under the same entry (e.g. ‘was, were, is,’ etc. are regrouped under the lemma ‘be’, whereas the pronouns ‘i, me’ under ‘i’). this processing can be done manually, but it is a costly operation. alternatively, we can apply an automatic part-of-speech tagger. however, such an approach is not error-free, and some recent studies have compared the relative merits of these two text representation schemes for authorship attribution (savoy, ; miranda garcía and calle martín, ). finally, we must mention that some natural languages may have other linguistic constructions than those used in the english language. for example, the definite article appears as a suffix in the bulgarian or swedish language and not as a distinct and separate lemma. moreover, the indefinite article (‘an/a’) does not exist in the bulgarian language.

as a possible default parameter setting, we can suggest selecting the first most frequent word types according to the document frequency (or dfsum) for the english collections, and the top most frequent terms (dfsum) with the italian corpora. using the occurrence frequency (tf) will produce similar results, and according to our experiments, there is no real reason to prefer one function to the other. we have a slight preference for the document frequency information because this function ignores the variations in document lengths.
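as an illustration of this kind of default setting, the following sketch ranks terms by their per-author document frequency, combined with one of the three aggregation operators, and keeps the top m. the function name `select_terms`, the input layout, and the use of each author's share of the documents as the weight in wmean are assumptions made for the example, not details taken from the study.

```python
from collections import Counter, defaultdict


def select_terms(docs, m, aggregate="sum"):
    """Return the m terms with the highest aggregated document frequency.

    docs is a list of (author, tokens) pairs (hypothetical layout).
    """
    df = defaultdict(Counter)  # author -> term -> documents containing the term
    n_docs = Counter()         # author -> number of documents
    for author, tokens in docs:
        n_docs[author] += 1
        df[author].update(set(tokens))  # repeated tokens count once per document

    total = sum(n_docs.values())
    vocabulary = set().union(*df.values())
    scores = {}
    for term in vocabulary:
        values = {author: df[author][term] for author in n_docs}
        if aggregate == "sum":
            scores[term] = sum(values.values())
        elif aggregate == "max":
            scores[term] = max(values.values())
        elif aggregate == "wmean":
            # assumption: each author is weighted by his or her share of documents
            scores[term] = sum(n_docs[a] / total * v for a, v in values.items())
        else:
            raise ValueError(f"unknown aggregation operator: {aggregate}")
    # sort by decreasing score, breaking ties alphabetically for reproducibility
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:m]
```

because each document contributes at most one count per term, a long article does not weigh more than a short one, which is the property that motivates the preference for the document frequency.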
finally, to determine the number of terms, it is more efficient to work with a small term set. according to our experiments, a size of terms seems reasonable for the english language. considering that the morphology of the italian language is more complex, we suggest adding more terms when working with languages having a more complex morphology (e.g. gender, grammatical cases).

table depicts the accuracy rates obtained when adopting these suggested default parameter settings. for the ‘glasgow herald’ corpora, we have used the first most frequent terms according to the document frequency (or dfsum) and the top most frequent terms (dfsum) for ‘la stampa’. these performance levels are then compared with the optimal parameter setting (considering all selection functions and numbers of terms).

as shown in table , the performance differences between the proposed default parameter settings and the optimal ones are rather small, particularly with the kld attribution scheme. on the other hand, the delta rule is more sensitive to deviation from an optimal number of terms.

these data also show that the df selection strategy tends to produce similar term sets when compared with the tf function. this relationship can be shown by considering the performances obtained with the politics corpus of the ‘glasgow herald’ in table . using the chi-square attribution scheme and the most frequently occurring terms (tf), we achieve an accuracy rate of . %. using the df function, the selection of the most frequent terms provides a mean performance of . %, a relative decrease of . %.

conclusion

to design an effective authorship attribution scheme, we need to select the most appropriate features (word types and punctuation symbols in the current study) that can discriminate between the different categories or authors.
to evaluate the different selection strategies, we compared nine feature-scoring functions and two well-known selection approaches used in authorship attribution studies (based on the absolute term frequency (tf), or the df). to combine the scores computed for various terms, we have evaluated three aggregation operators. as classifiers, we used the kld measure (zhao and zobel, ), the chi-square metric (grieve, ), and the delta rule (burrows, ).

using four corpora extracted from the newspapers ‘glasgow herald’ and ‘la stampa’ about sports ( , and , articles, respectively) and politics ( and , articles, respectively), our evaluations show that using the df or the absolute term frequency (tf) tends to provide good overall performances. in a second class of performance level, we can place the gr, the chi-square, the gss function, and the ig. the use of the pmi, the or, or the dia function does not provide comparable results, at least in the authorship attribution context. finally, the bns function presents an erratic behavior, working well in some cases, and moderately in others.

the overall good results achieved by the absolute term frequency (tf) are clearly an indication that the selection procedure proposed by burrows ( ) for the delta rule is an effective one. similarly, the df selection function tends to propose discriminative term sets, and this study confirms, in part, the k-limit selection strategy suggested by grieve ( ). however, this latter procedure requires that the selected terms be used by all possible authors, a constraint not imposed by the df selection function.

unlike yang and pedersen’s study ( ) based on topical text classification, the ig or the chi-square is not always the best performing method. according to sebastiani ( ), good selections can be achieved by applying the orsum or the gssmax.
the current study, which is based on authorship attribution, indicates that this choice, adequate for topical text classification, is not the best when dealing with authorship attribution.

finally, as an aggregation function, this study tends to indicate that applying the maximum operator seems to be a good default choice. on the other hand, our evaluations based on four distinct corpora are unable to clearly define a specific number of terms to be used for a new collection. as a default range of values, the evaluations reported in this study indicate that considering between and terms seems a good starting point before further investigations, when possible.

table evaluation of the proposed parameter setting versus the optimal one, with the english corpora and the italian corpora
attribution method | ‘glasgow herald’ corpus: parameter | accuracy (%) | ‘la stampa’ corpus: parameter | accuracy (%)
kld | sports: , df max | . | sports: , df max | .
 | terms | . (− %) | terms | . (− %)
chi-square | , terms, no select. | . | , bns max | .
 | terms | . (− %) | terms | . (− %)
delta | gr wmean | . | tf max | .
 | terms | . (− %) | terms | . (− %)
kld | politics: gr max / max | . | politics: df max | .
 | terms | . (− %) | terms | . ( %)
chi-square | tf sum | . | df wmean | .
 | terms | . (− %) | terms | . (− %)
delta | tf sum | . | gr sum | .
 | terms | . (− %) | terms | . (− %) y

references

baayen, h. r. ( ). word frequency distributions. dordrecht: kluwer academic press.
baayen, h. r. ( ). analyzing linguistic data: a practical introduction to statistics using r. cambridge: cambridge university press.
baayen, h. r. and halteren, h. v. ( ). an experiment in authorship attribution.
in proceedings of the th journées d’analyse des données textuelles . st-malo, pp. – .
burrows, j. f. ( ). delta: a measure of stylistic difference and a guide to likely authorship. literary & linguistic computing, : – .
church, k. w. and hanks, p. ( ). word association norms, mutual information and lexicography. in proceedings association for computational linguistics (acl). pp. – .
conover, w. j. ( ). practical nonparametric statistics, nd edn. new york: john wiley & sons.
damerau, f. j. ( ). the use of function word frequencies as indicators of style. computers and the humanities, : – .
forman, g. ( ). an extensive empirical study of feature selection metrics for text classification. journal of machine learning research, : – .
fuhr, n., hartmann, s., knorz, g., lustig, g., schwantner, m., and tzeras, k. ( ). air/x: a rule-based multi-stage indexing system for large subject fields. in proceedings recherche d’information assistée par ordinateur (riao). pp. – .
galavotti, l., sebastiani, f., and simi, m. ( ). experiments on the use of feature selection and negative evidence in automated text categorization. in proceedings european conference in digital libraries (ecdl). berlin: springer, pp. – .
grieve, j. ( ). quantitative authorship attribution: an evaluation of techniques. literary & linguistic computing, : – .
hastie, t., tibshirani, r., and friedman, j. ( ). the elements of statistical learning. data mining, inference, and prediction. new york: springer-verlag.
holmes, d. i. ( ). the evolution of stylometry in humanities scholarship. literary and linguistic computing, : – .
holmes, d. i. and forsyth, r. s. ( ). the federalist revisited: new directions in authorship attribution. literary and linguistic computing, : – .
hoover, d. l. ( ). another perspective on vocabulary richness. computers and the humanities, : – .
hoover, d. l. ( ). corpus stylistics, stylometry, and the styles of henry james. style, : – .
hughes, j. m., foti, n. j., krakauer, d. c., and rockmore, d. n. ( ). quantitative patterns of stylistic influence in the evolution of literature. proceedings of the national academy of sciences united states of america, ( ): – .
jockers, m. l. and witten, d. m. ( ). a comparative study of machine learning methods for authorship attribution. literary and linguistic computing, : – .
juola, p. ( ). the time course of language change. computers and the humanities, : – .
juola, p. ( ). authorship attribution. foundations and trends in information retrieval, : – .
koppel, m., schler, j., and bonchek-dokow, e. ( ). measuring differentiability. journal of machine learning research, : – .
labbé, d. ( ). experiments on authorship attribution by intertextual distance in english. journal of quantitative linguistics, : – .
liu, h. and motoda, h. ( ). computational methods of feature selection. boca raton: chapman & hall / crc.
manning, c. d. and schütze, h. ( ). foundations of statistical natural language processing. cambridge: the mit press.
miranda garcía, m. and calle martín, j. ( ). function words in authorship attribution studies. literary & linguistic computing, : – .
miranda garcía, m. and calle martín, j. ( ). the authorship of the disputed federalist papers with an annotated corpus. english studies, : – .
mosteller, f. and wallace, d. l. ( ). applied bayesian and classical inference: the case of the federalist papers. reading: addison-wesley.
ng, h. t., goh, w. b., and low, k. l. ( ). feature selection, perceptron learning, and a usability case study for text categorization. in proceedings acm-special interest group in information retrieval (sigir). new york: the acm press, pp. – .
pennebaker, j. w. ( ). the secret life of pronouns. what our words say about us. new york: bloomsbury press.
savoy, j. ( ). report on clef- experiments. in peters, c., braschler, m., gonzalo, j., and kluck, m.
(eds), cross-language information retrieval and evaluation, lncs # . berlin: springer, pp. – .
savoy, j. ( ). authorship attribution: a comparative study of three text corpora and three languages. journal of quantitative linguistics, : – .
sebastiani, f. ( ). machine learning in automatic text categorization. acm computing surveys, : – .
singhi, s. and liu, h. ( ). feature subset selection bias for classification learning. in proceedings international conference on machine learning. new york: the acm press, pp. – .
stamatatos, e. ( ). a survey of modern authorship attribution methods. journal of the american society for information science and technology, : – .
yang, y. and liu, x. ( ). a re-examination of text categorization methods. in proceedings acm-sigir. new york: the acm press, pp. – .
yang, y. and pedersen, j. o. ( ). a comparative study of feature selection in text categorization. in proceedings international conference on machine learning. pp. – .
zhao, y. and zobel, j. ( ). entropy-based authorship search in large document collection. in proceedings european conference on information retrieval (ecir). berlin: springer, pp. – .

appendix

table a list of the functions used for feature selection
dia(tk, cj) = $\mathrm{prob}[c_j \mid t_k]$
pmi(tk, cj) = $\log \dfrac{\mathrm{prob}[t_k, c_j]}{\mathrm{prob}[t_k] \cdot \mathrm{prob}[c_j]} = \log \mathrm{prob}[t_k \mid c_j] - \log \mathrm{prob}[t_k]$
or(tk, cj) = $\dfrac{\mathrm{prob}[t_k \mid c_j] \cdot (1 - \mathrm{prob}[t_k \mid \bar{c}_j])}{(1 - \mathrm{prob}[t_k \mid c_j]) \cdot \mathrm{prob}[t_k \mid \bar{c}_j]}$
ig(tk, cj) = $\sum_{c \in \{c_j, \bar{c}_j\}} \sum_{t \in \{t_k, \bar{t}_k\}} \mathrm{prob}[t, c] \cdot \log \dfrac{\mathrm{prob}[t, c]}{\mathrm{prob}[t] \cdot \mathrm{prob}[c]}$
gr(tk, cj) = $\mathrm{prob}[t, c] \cdot \log \dfrac{\mathrm{prob}[t, c]}{\mathrm{prob}[t] \cdot \mathrm{prob}[c]} + \mathrm{prob}[\bar{t}, c] \cdot \log \dfrac{\mathrm{prob}[\bar{t}, c]}{\mathrm{prob}[\bar{t}] \cdot \mathrm{prob}[c]}$
χ²(tk, cj) = $\dfrac{n \cdot (\mathrm{prob}[t_k, c_j] \cdot \mathrm{prob}[\bar{t}_k, \bar{c}_j] - \mathrm{prob}[t_k, \bar{c}_j] \cdot \mathrm{prob}[\bar{t}_k, c_j])^2}{\mathrm{prob}[t_k] \cdot \mathrm{prob}[\bar{t}_k] \cdot \mathrm{prob}[c_j] \cdot \mathrm{prob}[\bar{c}_j]}$
cc(tk, cj) = $\dfrac{\sqrt{n} \cdot (\mathrm{prob}[t_k, c_j] \cdot \mathrm{prob}[\bar{t}_k, \bar{c}_j] - \mathrm{prob}[t_k, \bar{c}_j] \cdot \mathrm{prob}[\bar{t}_k, c_j])}{\sqrt{\mathrm{prob}[t_k] \cdot \mathrm{prob}[\bar{t}_k] \cdot \mathrm{prob}[c_j] \cdot \mathrm{prob}[\bar{c}_j]}}$
gss(tk, cj) = $\mathrm{prob}[t_k, c_j] \cdot \mathrm{prob}[\bar{t}_k, \bar{c}_j] - \mathrm{prob}[t_k, \bar{c}_j] \cdot \mathrm{prob}[\bar{t}_k, c_j]$
bns(tk, cj) = $\left| F^{-1}(\mathrm{prob}[t_k \mid c_j]) - F^{-1}(\mathrm{prob}[t_k \mid \bar{c}_j]) \right|$ (a)
(a) $F^{-1}$ represents the inverse of the normal cumulative distribution function.

table a estimation for the selection functions and possible values for a positive association or independence
function | estimation | positive | independence
dia(tk, cj) | $a / (a+b)$ | |
pmi(tk, cj) | $\log [a \cdot n / ((a+b) \cdot (a+c))]$ | |
or(tk, cj) | $(a \cdot d) / (c \cdot b)$ | |
ig(tk, cj) | $a/n \cdot \log [a \cdot n / ((a+b)(a+c))] + b/n \cdot \log [b \cdot n / ((a+b)(b+d))] + c/n \cdot \log [c \cdot n / ((a+c)(c+d))] + d/n \cdot \log [d \cdot n / ((b+d)(c+d))]$ | |
gr(tk, cj) | $a/n \cdot \log [a \cdot n / ((a+b)(a+c))] + c/n \cdot \log [c \cdot n / ((a+c)(c+d))]$ | |
χ²(tk, cj) | $n \cdot (a \cdot d - c \cdot b)^2 / [(a+c) \cdot (b+d) \cdot (a+b) \cdot (c+d)]$ | |
cc(tk, cj) | $\sqrt{n} \cdot (a \cdot d - c \cdot b) / \sqrt{(a+c) \cdot (b+d) \cdot (a+b) \cdot (c+d)}$ | |
gss(tk, cj) | $[(a \cdot d) - (c \cdot b)] / n^2$ | |
bns(tk, cj) | $\left| F^{-1}(a/(a+c)) - F^{-1}(b/(b+d)) \right|$ (a) | |
(a) $F^{-1}$ represents the inverse of the normal cumulative distribution function.

http://hdl.handle.net/ /
© the haworth press. <http://www.haworthpress.com>.
final preprint version of article published in slavic and east european information resources, vol. , no. . article copies available from the haworth document delivery service: - -haworth. email address: docdelivery@haworthpress.com

digital access to cultural heritage and scholarship in the czech republic

brian rosenblum∗

abstract

this article looks at selected digital library projects in the czech republic, with a focus on two main domains of activity: digital preservation of cultural heritage, and providing digital access to scholarship through institutional repositories. with regard to digitization of national cultural heritage, the czech republic, largely through the leadership of the national library, has established itself as one of the most active countries in the region. with regard to providing access to czech research and scholarship, although there is quite a bit of interest among research institutions and universities, institutional repositories are currently in the exploratory stage and have not yet been widely implemented. copyright laws are repeatedly cited by czech librarians as one obstacle to improving access to digital resources in both spheres of activity.

∗ brian rosenblum (ms in information, university of michigan) is currently scholarly digital initiatives librarian at the university of kansas. address correspondence to: brian rosenblum, anschutz library, hoch auditoria drive, lawrence, kansas, . (email: brianlee@ku.edu). the author would like to thank the university of kansas library research fund, which provided support for travel and research for this article.
keywords: czech republic, digital libraries, digitization, cultural heritage, institutional repositories, scholarly communication, open access, digital preservation

introduction

the digital revolution of the last two decades has presented new and profound challenges to libraries, museums, archives and other institutions charged with preserving and providing access to our cultural and intellectual heritage. for central and east european institutions the challenges have been even more complicated, for this digital revolution began in the aftermath of the fall of communism, in the midst of other major transformations in the social, political and economic spheres, when the very role of these institutions within society was changing.

this article takes a look at the state of digital library activity in one central european country, the czech republic, by surveying selected projects in two domains of activity: digital preservation of cultural heritage, and providing digital access to current scholarship through institutional repositories. despite the challenges of the larger social and economic context, the czech republic has invested very heavily in the preservation of cultural heritage and, largely through the leadership of the national library, has established itself as one of the most active countries in europe. with regard to providing access to czech research and scholarship, although there is quite a bit of interest among research institutions and universities, institutional repositories are currently in the exploratory stage and have not yet been widely implemented. yet there are signs that this is changing and repositories are beginning to be developed.
cultural heritage initiatives

the national library of the czech republic

a study of national libraries across western and eastern europe found a wide range of investment and expertise in the application of digital technologies to the preservation of cultural heritage, with no correlation between the amount of investment or expertise and whether an institution was in an eu member state, a new eu member state, or a non-eu member state. for example, national libraries in new eu-member states or non-eu states may have faster internet connections than those in old eu-member states, and non-eu or new-eu states may have greater expertise or outperform eu-members in certain areas of activity. the study, conducted by the european library (an initiative to coordinate policy and activity across national libraries and provide a portal to their collections), concluded that, with regard to national libraries, "there are no gaps between the eu- members, eea states and the new member states. in certain areas new member states and non-eu countries outperform the group of eu- , eea countries and switzerland and may be considered the european centres of competence in certain areas of expertise."

the national library of the czech republic (nkp), located in prague, scored particularly high in this study, ranking consistently among the top libraries in a variety of measurements, and having the highest amount of r&d expenditures of all the national libraries surveyed. the nkp was one of the top five most active national libraries in international projects, and had one of the fastest internet connections in europe. it was in the top five in the number and variety of technologies used, and was also fifth (behind spain, france, the uk, and austria) in number of objects digitized, with almost three million.
in terms of digitization of manuscripts and early printed books, the czech republic, through the manuscriptorium project (described below), has established itself as a european center of expertise.

the nkp began exploring digitization and digital preservation very early on. in , three years after the collapse of the communist government and in the midst of the split of czechoslovakia into two countries, even as it was struggling with other pressing issues, such as poor storage conditions, incomplete cataloging of old volumes, and limited collections budgets, the nkp became one of the original participants in unesco's memory of the world program, an early effort to explore digitization as a means of cultural heritage preservation. through the memory of the world program, the nkp began to digitize manuscripts and old prints from its collection. (in the czech national library won the first unesco memory of the world jikji prize for its contribution to the preservation of the world’s cultural heritage.) since that time, the nkp’s digital activities have continued to grow, and now center around three large national projects with the aim of preserving and providing access to different aspects of the cultural heritage of the czech lands. these three projects are described briefly below.

manuscriptorium <http://www.manuscriptorium.com>. the longest running of the three projects, manuscriptorium grew out of nkp’s participation in the original unesco memory of the world project. now financed by the nkp and managed by a commercial firm, aip beroun, manuscriptorium is both a union catalog of rare books and manuscripts, and a repository for digitized manuscripts, old printed books and other rare documents.
manuscriptorium is not limited to items from the czech national library's collections: it has evolved into a collaborative program which enables other historical and cultural institutions in the czech republic and abroad to digitize and make available items from their collections. the manuscriptorium database currently contains material from over czech cultural institutions (including libraries, archives, museums, and castles), as well as from several international partner institutions in poland, hungary, slovakia, slovenia and lithuania. in december , the nkp announced that the national library of turkey has joined the project and will add over , records relating to turkish manuscripts, making it the largest foreign contributor to the project.

full-text and image content of manuscriptorium currently includes more than , manuscripts and old printed books, or about , full-text pages, with a growth of about , to , digitized pages per year. in addition, manuscriptorium provides a set of specially created tools for viewing the images and for creating bibliographic records of manuscripts for inclusion in the project.

metadata for the entire manuscriptorium collection is available to the general public without restriction. public access to the full data (including images and full text) is provided through individual and institutional licenses (the current institutional license is usd/year). partner institutions receive free access to the entire content of the database. (digital scriptorium <http://www.scriptorium.columbia.edu/>, a similar project administered at columbia university and focusing on holdings in american libraries, has less content than manuscriptorium, but makes images freely available online.)

kramerius <http://kramerius.nkp.cz>.
launched in , kramerius (named after vaclav matej kramerius, an th-century publisher active in the czech national revival) follows a similar model to manuscriptorium in allowing the nkp to partner with other institutions to digitize materials. but rather than old books and manuscripts, this project focuses on preserving and safeguarding more recent acid-paper materials, particularly newspapers, journals and brittle monographs from the th century to the present. as of december , kramerius contains about . million pages of digitized material with an annual growth of about , pages. as with manuscriptorium, the metadata is available online to any user worldwide, but most images of the digitized documents are available only via computers on the premises of the national library itself. the project website states that this is because most of the digital documents are “protected by copyright law,” but it does not provide any further details.

webarchiv <http://www.webarchiv.cz/>. webarchiv aims to provide a comprehensive, ongoing archive of the czech world wide web (the .cz domain). a kind of "wayback machine" for the czech republic, webarchiv was launched in and comprises several areas of activity, including developing the tools and methods for the ongoing collection and long-term preservation of web resources, integrating those records into the czech national bibliography, and providing access to those resources. the project uses both large-scale automated harvesting of the entire national web and a more labor-intensive, selective archiving of thematic and "event-based" collections, including czech-related content residing outside the .cz domain. according to the project website, there are currently over million documents harvested. webarchiv is a collaborative initiative with two other institutions: the moravian regional library and the institute for computer science at masaryk university.
both are located in brno and are providing support for it issues. the webarchiv partners have also begun working with institutions in other countries, including slovakia, to help set up similar web archiving efforts.

as with manuscriptorium and kramerius, online access to webarchiv content is currently limited, due to apparent legal restrictions in the czech republic. according to individuals involved with the project, the czech legal deposit act does not cover web pages, so web publishers are not required to provide the nkp with content they publish electronically, and current czech copyright law prevents the partner institutions from making the archived data available to the public online without permission from the original publisher. the archived content is currently available using computers on the premises of the national library and moravian library, but for those who can't visit the libraries in person, access is limited to content which is covered by agreements with original publishers. given the vast and growing scope of the web, working out individual agreements with publishers of web pages is an inefficient, incomplete, labor-intensive process which will not be sustainable in the long run.

the law does, however, allow libraries to harvest and store online documents to preserve them and prevent them from disappearing forever. so to that extent, webarchiv functions in part as a “dark archive,” ensuring content is preserved, even though it may not always be accessible at the moment. while the focus now is on developing the harvesting and archiving technology to ensure the collection and preservation of web pages and electronic publications, the partner institutions hope that czech intellectual property law will in the future evolve to allow unfettered access to this material.
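
the two collection strategies described for webarchiv (a broad automated harvest of the .cz domain, plus hand-selected material hosted elsewhere) can be pictured as a simple scoping rule. what follows is only a sketch under stated assumptions: the function name `in_scope`, the `SELECTED_HOSTS` set, and the example hosts are hypothetical illustrations and are not taken from the project's actual harvesting tools.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-curated hosts outside .cz that a curator has selected
# for a thematic or event-based collection (illustrative values only).
SELECTED_HOSTS = {"czech-studies.example.org"}


def in_scope(url, selected_hosts=SELECTED_HOSTS):
    """Return True if the URL is in the national domain or on a selected host."""
    host = (urlsplit(url).hostname or "").lower()
    if host.endswith(".cz"):
        return True  # broad, automated harvest of the national domain
    return host in selected_hosts  # selective archiving outside .cz


if __name__ == "__main__":
    for candidate in (
        "http://www.webarchiv.cz/",
        "https://czech-studies.example.org/news",
        "https://www.example.com/",
    ):
        print(candidate, in_scope(candidate))
```

a rule of this kind only decides what may be collected; whether the collected material can be shown to the public is a separate, legal question, as discussed above.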
the czech digital library

in , the nkp began to integrate these three initiatives, manuscriptorium, kramerius, and webarchiv, into one administrative and technical infrastructure, to be known as the czech digital library. the program is funded by the ministry of culture, and any czech library, museum, archive, or other institution can apply to have documents owned or produced by them included in the czech digital library. if certain selection and technical criteria are met, they will be eligible to receive assistance from the ministry of culture for the digitization of those documents. the first phase of implementation of the czech digital library is scheduled for - and will focus on the integration of the three repositories, addressing a number of technical questions regarding standards, formats, storage technologies, and search and access technologies. by bringing these three initiatives together, the czech digital library will form a comprehensive program for capturing and preserving in digital formats a large range of the cultural heritage of the region, from manuscripts and early printed books, to modern newspapers, journals and other printed materials, to born-digital objects and web pages.

the czech digital mathematics library

the national library is not the only institution in the czech republic engaged in large digitization projects. the czech academy of sciences (cas), which oversees some research institutes, is also very active in this arena. the main library of the cas, located in prague, became interested in digital preservation after the prague floods in , which, in addition to devastating many buildings and streets in prague, damaged large amounts of materials in libraries throughout the country. in the aftermath of the floods, the cas built a digitization center and began the digitization of materials from on, focusing in particular on old journals from their collections.
using czech software built specifically for this purpose, they have the capacity to digitize as many as , pages per month.

some of this capacity is being used to build the czech digital mathematics library (cdml). scheduled for completion in , the cdml will consist of "the relevant mathematical literature which has been published throughout history in the czech lands." the cdml will contain an estimated , pages when completed, and will include professional journals of international stature published by czech institutions, as well as monographs, textbooks, dissertations, research reports, and conference proceedings published by czech universities and research institutes. (when completed, the content will also be incorporated into the world digital mathematics library, an initiative of the international mathematics union to coordinate the digitization of math literature worldwide.)

like webarchiv, the cdml is a collaborative project with contributions from several institutional partners. the mathematical institute of the academy of sciences and the faculty of mathematics and physics at charles university in prague are providing subject expertise and project coordination, while the library of the academy of sciences, the faculty of computer science at masaryk university in brno, and the institute of computer science at masaryk university (also involved in the webarchiv project) are providing the digitization, search functionality, and various other technical components.

as with the nkp initiatives described above, project managers indicated that copyright issues may limit the extent of access to the digitized content, and said they are currently in negotiations with czech publishers to work out the terms of online access.

scholarly communication initiatives

institutional repositories

the projects discussed above focus on content that can be considered to be part of the cultural heritage of the region.
digitization can also play a role in providing access to and preservation of current scholarship, which is an increasing need for universities and research institutions. a report on scholarly communication needs in the czech republic noted several challenges in the environment at that time, years after the collapse of communism. some of the challenges identified in the report include:

• a shortage in access to contemporary academic books and journals
• need for improved access to grey literature produced abroad
• need for improved access to czech grey literature by foreign scholars interested in the czech republic
• access to conference proceedings

such challenges are not unique to the czech republic. they are, in fact, representative of the so-called “scholarly communication crisis” that is impacting academic and research institutions worldwide, and that is the result of, among other things, skyrocketing prices of scholarly journal subscriptions, the explosion in the amount of new scholarly publications, and a restrictive intellectual property environment. such concerns have led many libraries to explore new initiatives to influence and create change in the scholarly communication system. one strategy in particular that is gaining widespread momentum, especially in the u.s., western europe, and australia, is the implementation of institutional repositories.

institutional repositories (irs) are a set of services to manage and disseminate the scholarly knowledge created by faculty, researchers, or students at an institution. although irs are not a preservation strategy in and of themselves, they put digital scholarship into a managed process which can make the complex task of preservation easier. in addition, irs play an important role in the open access (oa) movement, which seeks to make scholarly material freely available to as large a worldwide audience as possible by making the material openly available on the internet.
by providing unobstructed access to scholarship, institutions can maximize the visibility and potential impact of the work of their faculty.

institutional repository deployment in central and eastern europe

although still in the early stages of development, irs have become widely deployed at research institutions in the u.s. and western europe. a recent survey of ir deployment in the u.s. and western europe concluded: “it is clear, at least among the nations surveyed, that institutional repositories are becoming well established as campus infrastructure components. they are broadly deployed in many of the countries surveyed, and essentially universally available in a few already.”

in central and eastern europe, repository deployment has proceeded more slowly. although the budapest open access initiative (boai) (http://www.soros.org/openaccess/), which strongly encourages open access to research and the use of institutional repositories, has a high profile in the region, this has apparently not yet led to the widespread implementation of irs. a few brief statistics will illustrate this point.

the university of michigan's oaister search engine (http://oaister.umdl.umich.edu) harvests metadata from open access repositories, journals, and digital library collections worldwide. launched in , oaister now contains over million records of (mostly) freely available material residing in over repositories in countries. table provides a breakdown by country of the number of items indexed by oaister, as of october , . approximately half of the million records in oaister reside in u.s. repositories. only % of the total of repositories harvested by oaister are in eastern europe or slavic-language countries, and together they contain only . % of the total content in oaister. remove poland and the figures decrease to % of the repositories and . % of the total content.
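
metadata harvesting of the kind oaister performs is commonly done over the open archives initiative protocol for metadata harvesting (oai-pmh). the article itself does not name the protocol, so its use here is an assumption. the sketch below shows how a harvester might form a `ListRecords` request and pull dublin core titles out of a response; the base url, the embedded sample response, and the function names are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of unqualified Dublin Core elements, as used in oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return every dc:title found in an OAI-PMH response document."""
    root = ET.fromstring(xml_text)
    return [element.text or "" for element in root.iter(f"{DC_NS}title")]


# Hand-written sample response (illustrative only, not real repository data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>example thesis title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

because a harvester only sees repositories that expose such an endpoint and register it, counts derived this way reflect registered, harvestable sources rather than all repository activity in a country.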
[insert table .doc]

figures from the registry of open access repositories (roar) (http://roar.eprints.org/) are similar (see table ). out of the repositories from countries listed in roar as of october , repositories are in russia, in hungary, and each in croatia, poland, slovenia and ukraine. together this represents . % of the total number of repositories listed. no other east european or slavic-speaking nation is represented in the registry. by contrast, the top ten countries (led by the united states with repositories, the u.k. with , and germany with ), have % of the listed repositories.

[insert table .doc]

it is important to note that these figures have some limitations and may not fully represent the use of repositories in the region. nevertheless, they do give us a general indication of where the main development of repositories has taken place to date, and they give a sense of the low eastern european presence in two of the more well-known and established ir tools. this suggests that institutions in east european countries have not yet fully developed the repository infrastructure that can help preserve their own research output, make it more accessible worldwide, and increase the visibility and impact of their own scholars. this is not necessarily a disadvantage yet, although in the long run it could be. irs are still a relatively new technology and even the early adopters of this technology are still exploring how best to deploy irs and fill them with content. eastern european institutions can thus pay attention to and take advantage of the lessons currently being learned by early ir adopters elsewhere.

institutional repositories in the czech republic

there are no czech repositories currently listed in oaister or roar, and no one i spoke with during a visit to the czech republic in the summer was aware of any fully-implemented, open access institutional repositories.
there was, however, quite a lot of interest in such repositories and there are many institutions starting to explore how to implement them.

for example, the institute for computer science (ics) at masaryk university (muni) in brno has been very active in researching and testing ir technology. they are well aware of current trends and developments with regard to institutional repositories in western europe and the u.s., and, anticipating the possibility of a need arising at muni, they have been laying the groundwork for the eventual implementation of an ir. for ics (which is a partner in both the webarchiv and czech digital mathematics library projects) it is not the technical challenges which are delaying the development of an ir, but rather an array of other more intangible factors in the wider academic landscape, such as support from high-level administrators, pressure from faculty, legal mandates for the deposit of research in repositories, and the existence of friendlier copyright laws, that do not yet seem to be in place, and which are necessary to move repository development forward. ics doesn’t see it as their role to be the main advocate for the development of irs, but rather to be prepared to provide that service when the need arises.

more investigation is needed to truly understand the scholarly communication environment in the czech republic, but there are signs that the environment is starting to change and put more pressure on institutions to implement repositories to provide access to their research output. two brief examples will illustrate this. in the first example, recent legislation has forced universities to find ways to provide access to the theses and dissertations written by their students. in the second example, a research institute’s involvement with an international consortium is driving the implementation of an ir.
theses and dissertations and open access

in the czech republic, attention to open access (oa) to scholarship has so far largely focused on theses and dissertations. in , the czech legislature passed a law requiring public universities to provide free, electronic, public access to theses and dissertations of students who have been awarded degrees at those institutions.

there were several motivations behind the enactment of this new law. until this time, czech universities have had no responsibilities or obligations concerning making theses and dissertations available. requiring public access to theses and dissertations will provide more quality control and accountability on the part of universities receiving public funds. in addition, making such research available to the public will increase public debate and knowledge, and therefore have a positive social impact that can help justify the public funds these institutions receive.

the law applies only to theses and dissertations awarded january , or later. because of potential conflict with the rights of authors of theses and dissertations to decide how their work would be made available, czech copyright law was also modified. now, as a condition of submitting their thesis or dissertation, students agree to have it made publicly available in electronic form.

some confusion and loopholes remain in the law, however, with the result that there is some inconsistency in how different universities are interpreting and implementing access to their theses and dissertations. while universities are required to provide public, electronic access to this material, it is left up to each individual institution to determine the mechanisms and process by which to provide this access. most universities are using some kind of local system already in place at their institution.
in some cases, as at masaryk university in brno, or the university of ostrava, this data is made available through a public system which allows robust searching by keywords, authors, english and czech language abstracts, faculty advisor, year of defense, and other terms. masaryk university is also working on providing full-text searching of theses and dissertations. at other universities, access is made through the existing library catalog, which may limit the search terms to author and title and subject, and which requires users to search across the entire holdings of the library at the same time.

another issue has been the definition of “public access.” at some institutions, an electronic database of dissertations is available, but only on the premises of the institution itself, not online via the internet. this seems to run counter to the spirit of the law, although in some cases it may stem from current technical limitations at those universities.

there is inconsistency in the range of content available as well. the legislation applies only to theses and dissertations submitted january , or later, and some universities limit access to content which falls into that time frame. other institutions, however, are making older material available.

despite such inconsistency and certain loopholes, this legislation is recognized by many as a positive first step toward making one sector of czech grey literature more available.

cerge-ei and economists online

in , the center for economic research and graduate education (a unit of charles university in prague) and the economics institute (a part of the czech academy of sciences) established a joint program, known as cerge-ei. cerge-ei provides an american-style ph.d. program in economics and conducts economic research, with a particular emphasis on the transition to free markets and european integration. all courses are taught in english, and the ph.d.
degree from cerge-ei is fully recognized in the united states. the cerge-ei library was established in and is one of the best libraries in the region for access to economics resources. the library also has a strong history of providing public access to its resources, having been open to the public since .

in cerge-ei joined the nereus consortium (a network of european economics research institutes) and became a participant in the economists online project, which provides a portal to the research output of participating institutes available online. as part of its participation in economists online, cerge-ei will place about % of the published work produced by cerge-ei faculty into a repository to be made freely accessible online. at the time this article was written, cerge-ei had not yet developed its ir and was looking into various options, including maintaining it themselves or finding a commercial vendor to host it. neither of cerge-ei's sponsoring institutions, charles university or the academy of sciences, currently has an open-access institutional repository infrastructure that would meet cerge-ei's needs.

the czech academy of sciences would seem to be a natural institution for implementing an ir: the or so institutes of the cas produce a wide range of research and scholarly publications. however, the research institutes that comprise the cas span a wide range of scholarly disciplines, vary widely in size and scope, and, as in the case of cerge-ei, may have unique needs that require them to form their own external partnerships. implementing a repository that meets the needs of faculty at a single institution is a challenge in itself. trying to coordinate semi-autonomous research units is even more complex.
conclusions

christine borgman has noted that libraries in central and eastern europe have historically been “oriented more towards preservation of cultural heritage than access to information.” the activities of czech institutions in the digital realm seem to bear this out. the czech republic has invested heavily in cultural heritage preservation initiatives and has established itself as a european leader in digitization and web archiving activities. the national digitization projects have been successful in part because of the strong leadership that the national library has shown, the high-level support they have received from the ministry of culture, and the collaborative nature of the projects.

czech universities and research institutions have been somewhat slower to implement institutional repositories and focus on providing access to czech research, scholarship and grey literature. levels of commitment and collaboration similar to those seen in the national cultural heritage projects may help lay the groundwork for a broad repository to help preserve and distribute current czech research and scholarship. as demonstrated by the inconsistent levels of access currently provided by universities to their theses and dissertations, there may be a benefit to establishing a repository at a multi-institutional, or even national, level, rather than having each individual institution duplicating effort and establishing and maintaining its own system.

on the other hand, czech universities may not have as strong a history of cooperation as czech libraries and cultural heritage institutions, and may tend to view the implementation of repositories in a more competitive light.
nevertheless, the fact that institutions are laying the groundwork for repository implementation (looking at technical requirements, researching different systems options) suggests that pressures to move in this direction are starting to build, and that we may be on the cusp of seeing a broad implementation of irs at universities and research institutions in the czech republic.

indeed, the extent of repository activity in the czech republic could develop very fast. internationally, repository and open access activities are gaining momentum: funding agencies are beginning to mandate that the results of research they fund be deposited into open access repositories, and concerns about capturing, organizing, and preserving born-digital items and data in a variety of file formats are driving widespread repository development. as those influences make their way into the czech scholarly environment, research institutions and universities are likely to find the demand for irs growing.

notes

for more information on the history of czech libraries and their changing role over time, see rebecca rhodes, "libraries, librarianship and library education in the czech republic" (master's paper for the m.s. in l.s. degree, university of north carolina, ).

zinaida manžuch and adolf knoll, “research activities of the european national libraries in the domain of cultural heritage and ict,” (tel-me-mor, ), . http://www.telmemor.net/docs/d . _research_activities_report.pdf.

manžuch and knoll, “research activities of the european national libraries,” - .

adolf knoll, “universal availability of publications problems in czechoslovakia,” fontes artis musicae ( ): - .

“czech national library to receive unesco/jikji memory of the world prize," unesco webworld, september , , communication and information sector news, http://portal.unesco.org/ci/en/ev.php- url_id= &url_do=do_topic&url_section= .html (accessed march , ).
tel-me-mor, “digital library-related initiatives in europe,” the european library, http://www.telmemor.net/diglib.php (accessed march , ).

tel-me-mor, “digital library-related initiatives in europe.”

during interviews and conversations i had with czech librarians working on this and other projects, they repeatedly cited czech copyright laws as a major obstacle to making digital content available to a wider audience. this applied to the cultural heritage projects discussed here, as well as to efforts to place scholarship into institutional repositories, discussed later in this article. i found little documentation available in english about czech copyright law, but the following websites (in english and czech) represent places to get further information:

• "autorské právo," ministry of culture of the czech republic <http://www.mkcr.cz/autorske-pravo/default.htm>
• "autorské právo a knihovny (author's rights and libraries)," national library of the czech republic, <http://www.nkp.cz/o_knihovnach/autzak/dop.htm>
• "czech library and information science portal," national library of the czech republic (see links under the section labeled "legislation") <http://knihovnam.nkp.cz/english/>
• "odměna autorům za půjčování jejich děl v knihovnách [the reward to authors for the lending of their work in libraries]" ikaros: elektronický časopis o informační společnosti, roč. , č. ( ), <http://www.ikaros.cz/node/ >

also see the article by vera jurmanová volemanová cited in footnote below. in addition, the american association for the advancement of slavic studies maintains, through its bibliography & documentation subcommittee on copyright issues, a list of resources and links related to copyright in the wider region. see: <http://intranet.library.arizona.edu/users/brewerm/copyright/awareness.html.>

“webarchiv – archive of the czech web,” http://en.webarchiv.cz/ (accessed march , ).
“webarchiv – archive of the czech web,” http://en.webarchiv.cz/ (accessed march , ) and personal conversations with project personnel.
“webarchiv – archive of the czech web,” http://en.webarchiv.cz/ (accessed march , ).
bohdana stoklasova, "czech digital library" (paper presented at the st colloquium of library information employees of the v + countries, bystrica, slovakia, may - , ), http://www.svkbb.sk/colloquium/zbornik/obsah.htm.
for more on the czech library response to the floods, see emily ray, "the prague library floods of : crisis and experimentation," libraries & the cultural record , no. ( ): - .
jirí rákosník, "dml-cz: czech digital mathematics library," ercim news ( ), http://www.ercim.org/publication/ercim_news/enw /rakosnik.html.
rákosník, “dml-cz: czech digital mathematics library.”
interviews and conversations with project personnel.
zdenka mansfeldová, "report on scholarly communication needs in the czech republic," unpublished report for the social sciences research council, working group on dissemination ( ).
for one overview (among many others) of irs and the issues related to their implementation, see mark ware, "pathfinder research on web-based repositories: final report," bristol, uk, publisher and library/learning systems (pals), http://www.palsgroup.org.uk/palsweb/palsweb.nsf/ b d e a cb ae a e / c c e a c cd e e a/$file/pals% report% on% institutional% repositories.pdf.
gerard van westrienen and clifford a. lynch, "academic institutional repositories: deployment status in nations as of mid )," d-lib magazine , no. (september ), http://www.dlib.org//dlib/september /westrienen/ westrienen.html.
there are a number of caveats to note about these figures. to begin with, they represent a mostly self-selected group of repositories or publications that have chosen to register with oaister or roar.
in oaister, many of the "sources" represent different types of collections, so it is not always clear what types of items one is comparing. for instance, the sole source from bulgaria is not an institutional repository but an oa journal, bulgarian rusistika, published by the society of philologists of bulgaria. assessing repository development by number of items is problematic because, as clifford lynch has pointed out, "no two institutions are counting the same things....the diversity in both the definition of what constitutes an "object" and in the nature of the objects being stored (massive videos or groups of datasets as opposed to individual articles or images) makes repository size very hard to interpret." in addition, these figures do not take into account the relative size of the countries in terms of number of research institutions and faculty, nor do they take into account the fact that east european scholars in some fields already make use of subject-based repositories located in the united states or western europe.
the information in this and the following paragraphs, including the discussion of the new legislation and the responses of czech universities, was obtained from vera jurmanová volemanová, "fulltextové databáze vŠkp (vysokoškolských kvalifikacních prací) volný prístup k cennému zdroji odborných informací [fulltext databases of theses and dissertations free access to valuable source of specialized information]," (paper presented at inforum , may - , , prague, czech republic), http://www.inforum.cz/inforum /pdf/jurmanova_volemanova_vera.pdf.
christine borgman, from gutenberg to the global information infrastructure: access to information in the networked world (cambridge, ma: mit press, ), .

i. introduction

over the last decade, libraries, archives and museums have made a major contribution to the creation of historical digital collections, which provide immediate access to primary source materials that might otherwise be unavailable to researchers, scholars, and the general public. a typical workflow for a digital collection project includes processes such as digitization, conversion and loading of data (often in batch), exporting and parsing metadata, and designing web sites. two main benefits of digital collections are: (1) access – whereby institutions can provide multiple and simultaneous users with remote access to a variety of digital objects, including photographs, manuscripts, books, etc.; and (2) preservation – whereby a digital copy can help preserve the original objects. because of the constant changes in technology, librarians working on digital collection projects need to constantly evaluate their access and preservation practices. this article aims to analyze the current technical skills being sought for digital librarian positions by examining the required and preferred qualifications listed in position announcements posted in , and to explore how well topics – offered by seven major library and information science (lis) programs in – have matched these qualifications.

ii. literature review

in an ever-changing technology landscape, the capacity to learn constantly and quickly is more relevant than ever, and is the primary motivation of this study. studies similar to this one exist in the literature, both as planning tools for hiring administrators on what skills are likely to be needed as libraries change and evolve, and as advice for new librarians seeking positions in these evolving roles. however, not many have explored the special case of positions focused on digital library development.
in , marion ( ) conducted an exploratory study that analyzed online academic librarian employment ads posted during to determine current requirements for technologically oriented jobs. marion organized the results into different categories, and two of those categories seem relevant to this study: programming languages and web site creation. in perhaps the best attempt at addressing digital library competencies, choi and rasmussen ( ) conducted a survey for their article “what is needed to educate future digital librarians” to identify the skills of digital librarians and to detect possible gaps in their training. the authors concluded “lis education needs to pay attention to [. . .] integration of practical skills and experience with digital collection management and digital technologies into curricula.” in , tammaro ( ) analyzed the trends for digital library education in europe. although the focus of this work was on a “curriculum for digital librarians,” the first set of competencies (information architecture, information retrieval, web-publishing, database theory, networking, human computer interaction, evaluation of information systems, and technical troubleshooting skills) is directly related to what this study plans to examine in the list of technical courses currently offered in library programs. most recently, in the spring of , mathews and pardue ( ) conducted a five-month study to analyze the it skills that employers were looking for when posting job ads in ala’s online joblist.

iii. methodology

based in part on this literature review, a twofold data collection methodology was developed that compared the required and desired technical skills as expressed in position announcements and the skills currently being taught in major lis programs. a common set of categories was developed to account for variations in wording and specific implementations of a technology.
data collection

analysis of required and preferred qualifications: to ensure the most current data, a -month period was selected for analysis, from january to december . data were collected from five sources: ala joblist, job opportunities from educause, lisjobs.com, and three library schools’ career sites. position announcements were limited to those for digital collection/digital library-related positions.

analysis of technical courses: this included an analysis of technical courses offered in at selected library programs listed among the top schools as identified by the us news and world report ranking. an initial list included the top five schools specializing in archives and preservation, information systems, and digital librarianship. duplicates were removed from the combined list, resulting in a final list of seven target schools (us news and world report, ).

library hi tech news number , pp. - , © emerald group publishing limited, - , doi . /
technical skills for new digital librarians
elías tzoc and john millard

data pre-evaluation

one challenge in gathering historical data for position announcements is that most job banks only keep the data for a given period of time – usually from to days. educause’s job opportunities web site was very helpful as it provides access to past data. the availability of dedicated career services at some library schools was also helpful. position announcements posted in ala joblist expire after days. however, ala was able to provide archived data for this study. another challenge in the pre-evaluation process was the discovery of new position titles with requirements similar to those found for positions specifically for digital collection projects. extra care was taken to ensure that the job description and responsibilities for these new positions were within the scope of the study.
the resulting pool of position announcements to be analyzed included distinct positions working exclusively on digital library or digital collection activities. table i summarizes the source of the position announcements analyzed. an interesting finding from the pre-evaluation process was that percent of positions preferred a bachelor’s degree in computer science, information management or a related field. this may be a confirmation of the ongoing need to recruit new librarians with those types of degrees. the data were normalized to a common format to account for differences in descriptive style and length of entry. the statistical reports were generated using features in ms excel such as pivottable report. table ii represents unique position titles. the differences in position titles may represent the diversity of understanding of where these kinds of positions fall within the organizational framework or workflow of the hiring institution.

data on technical courses offered by the selected lis programs were collected from descriptions of course offerings during the spring and fall semesters of . data collected included course title and description. courses were assigned a general category based on topic to allow for more meaningful comparison. courses without enough contextual information in the description field were not included in the sample of courses. courses with a direct link to a syllabus were extremely helpful, especially where faculty listed specific class projects. table iii provides a summary of the technical courses offered in in the sample of seven schools with a specialization in archives and preservation, information systems, or digital librarianship.

categorization

position announcements were analyzed for required and desired skills. the resulting list of skills and qualifications was grouped into common broad categories and a frequency distribution was created.
table i. summary of job ads grouped by source
columns: source, total, percentages
ala joblist
educause jobs
ischool-career-services
lisjobs
total

table ii. position titles represented in the sample
columns: position title, total, percentage
data librarian
digital archivist
digital collection/metadata librarian
digital collections librarian
digital collections metadata librarian
digital collections specialist
digital and scholarly communications librarian
digital initiatives librarian
digital integration librarian
digital librarian
digital scholarship and services librarian
digital services librarian
information technology librarian
librarian digital services support
library digital services manager
metadata and digital initiatives developer
metadata and digital resources developer
systems and digital collections librarian
systems development librarian
web initiatives librarian
web services librarian
total

the same analysis was then performed on course descriptions using the same common categories. in all, broad categories were identified:
(1) database design and management – knowledge of and experience with relational database design, deployment and management including proficiency with sql on both commercial and open source database servers.
(2) digital collection management – understanding and experience managing all facets of digital collections including overall technical proficiency in the tools and technologies related to digital repositories and digital collection development.
(3) digital content management systems – experience using or managing common content management systems including digital asset management systems. specific examples include dspace, contentdm, drupal, luna insight, dlxs, wordpress, and omeka.
(4) digital conversion – knowledge of the conversion of analog materials to digital formats in a library context.
it includes knowledge of digitizing equipment and best practices, archival file formats, and reformatting audio and video materials.
(5) digital preservation – knowledge and experience with preservation of both analog and digital materials and ability to manage the ongoing preservation of digital collection content.
(6) metadata and cataloging standards – familiarity with emerging or established metadata and cataloging standards including dublin core, ead, tei, frbr and rda as well as controlled vocabularies.
(7) programming: java, c, c++ – familiarity with more formal development languages like java, c, c++, etc.
(8) programming: scripting languages – familiarity or experience with less formal procedural scripting languages like php, perl, javascript, ruby and python.
(9) systems and network administration and desktop support – experience administering unix/linux/windows servers, designing and managing networks, and providing desktop configuration and support for both mac and windows pcs.
(10) web application development – experience using common scripting languages and relational databases in the creation of dynamic data-driven web sites and applications.
(11) web design and web standards – understanding of web design standards including html, xhtml, css, interface design and common software platforms in web development like dreamweaver.
(12) xml and related standards – primarily, experience with xml and related technologies like xslt and xml databases.

iv results

each position description was tagged with all applicable categories. once tagged, a simple distribution was developed showing the percentage of position descriptions requiring or desiring qualities in each category.
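the tagging and distribution step described above can be sketched in a few lines of python. this is only an illustrative sketch, not the authors' procedure (the study reports using ms excel pivottable reports); the function name `category_percentages` and the four sample ads are hypothetical, and the percentage is assumed to be the number of ads carrying a category divided by the total number of ads.

```python
from collections import Counter


def category_percentages(tagged_ads):
    """Return, for each category, the percentage of ads tagged with it."""
    total = len(tagged_ads)
    if total == 0:
        return {}
    counts = Counter()
    for tags in tagged_ads:
        # set() ensures a category is counted at most once per ad
        counts.update(set(tags))
    return {category: 100.0 * n / total for category, n in counts.items()}


# hypothetical sample: each ad is the set of categories it was tagged with
ads = [
    {"web design and web standards", "xml and related standards"},
    {"web design and web standards", "digital preservation"},
    {"programming: scripting languages"},
    {"web design and web standards", "programming: scripting languages"},
]

percentages = category_percentages(ads)
# list categories from most to least frequently requested
for category, pct in sorted(percentages.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{category}: {pct:.0f}%")
```

because an ad can carry several tags, the percentages are computed per category against the total number of ads and do not need to sum to 100.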
as shown in figure , the top six categories were web design and web standards ( percent of the sample of job ads included this category), digital collection management and programming with scripting languages ( percent), digital content management systems, metadata/cataloging standards, and xml/xslt ( percent). the least expressed category in this study was digital preservation ( percent). the same methodology was applied to technical courses in lis programs. figure shows the following: the top four categories represented were web design and web standards ( percent of the sample of technical courses offered this topic), database design and management ( percent), web application development ( percent), and digital collection management ( percent). the three most popular topics in web design and web standards courses were web navigation, information visualization, and user-centered design. for database design and management, four schools offered two courses: introductory and advanced database courses, mainly focused on sql and web database applications. for comparison purposes, the two sets of data (job ads and courses) were re-calculated on a scale of - . figure shows the correlation of technical skill categories between position descriptions and lis courses. generally, skills taught in lis courses closely matched those being sought in position announcements, with some notable exceptions. as shown in figure , digital preservation was not a highly sought-after qualification, with only percent of the positions requiring it. conversely, percent of courses included this topic. lis curriculum may be ahead of common practice in the field here, or employers may be assuming that digital preservation is a standard skill set common to all lis program graduates. further study is needed to determine the cause of this disparity.
more significant disparities existed in the distribution of programming – scripting languages, digital content management systems, metadata and cataloguing standards, and xml/xslt skills. it is clear that these skills are perceived as core competencies for these types of positions. lis curriculum is clearly leading in three areas: database design and management, web application development, and web design and web standards. a certain amount of disparity in results can be expected due to the differing missions of hiring institutions and lis programs. for example, it is possible that lis curricula look further ahead to anticipate future developments in the field. indeed, our review of the literature includes attempts to determine the best curriculum to train future students. in addition, the analysis of topics taught revealed several that were not represented in the position announcements at all. notable among these were courses covering open source software evaluation, assessment, and information visualization.

table iii. list of technical courses represented in the study
columns: course, total, percentage
advanced database management
advanced web technology/presentation
content management system
data interoperability
development of web applications
digital library technology/software
encoded archival description
information architecture for the web
information visualization
introduction to database management
introduction to web design
introduction to web programming
mobile application development
usability
web .
web archiving digital preservation and access
total

v conclusions

one conclusion of this analysis is that current students as well as practicing librarians need to seek out additional non-curricular opportunities to build competency in the technical areas represented in this study if they are or expect to be marketable.
fortunately, the areas where the greatest disparity exists are also areas where significant opportunities for independent learning are available. for instance, one could begin to gain experience individually by setting up a local, general-purpose web server (sandbox) with a basic lamp/wamp configuration. from here, one can install common open source content management systems such as drupal, wordpress, dspace, or omeka and begin creating small collections for testing and experimentation. for more specific or advanced tutorials, free online training sites such as w schools.com or subscription-based ones such as lynda.com can be valuable resources as well. additionally, new digital librarians may also join technical groups such as code lib, where members are constantly exchanging ideas on top technical trends and development in the library community. a good example is a may discussion in the code lib listserv about what library technologists “would like to learn”, which provides a useful list of technologies currently in demand in the library world (http://tinyurl.com/ code lib-time-to-learn).

[figure . percentages of technical skills sought by employers. note: % = n/ ]
[figure . percentages of technical courses in lis programs. note: % = n/ ]
[figure . categories comparison: position descriptions vs lis courses. series: job ads %, courses %]
categories shown in the figures: database design/management; digital collection management; digital content management systems; digital conversion; digital preservation; metadata/cataloging standards; programming - java, c, c++; programming - scripting languages; system administration/client configuration + network; web application development; web design/web standards; xml/xslt.

another conclusion is that students in library programs should be encouraged to apply or volunteer for internship opportunities or work on technical capstone projects, as these opportunities will help them acquire real-life experience. in fact, two to three years of experience was indicated as a preferred qualification in more than percent of job ads analyzed in this study. having the opportunity to work on a real project will help new librarians acquire the skills for which employers are looking. and for those new librarians already working on digital collection/library programs, a personal interest in continuously learning and improving their technical skills will be essential for their professional development. finally, during the data collection of this study, the authors found multiple positions for other technical/digital areas of academic librarianship, but with similar technical requirements. one of the new titles found repeatedly was emerging technologies librarian, where the main responsibilities include the exploration and creation of mobile applications as well as implementation of apis and mashup tools such as google maps for geotagging. a further study of the overlap of responsibilities and expectations for digital librarians in other areas of academic libraries would be useful.

acknowledgements

the authors would like to recognize the contribution of david m.
connolly, classified ads coordinator from the american library association, in providing archived data for this publication.

references

choi, y. and rasmussen, e. ( ), “what is needed to educate future digital librarians”, d-lib magazine, vol. , available at: www.dlib.org/dlib/september /choi/ choi.html (accessed january ).
marion, l. ( ), “digital librarian, cybrarian, or librarian with specialized skills: who will staff digital libraries?”, association of college & research libraries publication, available at: www.ala.org/ala/mgrps/divs/acrl/events/ pdf/marion.pdf (accessed march ).
mathews, j. and pardue, h. ( ), “the presence of it skill sets in librarian position announcements”, college & research libraries, vol. no. , pp. - .
tammaro, a. ( ), “a curriculum for digital librarians: a reflection on the european debate”, new library world, vol. nos / , pp. - .
us news and world report ( ), “library and information studies ranked report”, us news and world report, available at: http://grad-schools.usnews.rankingsandreviews.com/best-graduate-schools/top-library-information-science-programs (accessed august ).

elías tzoc (tzoce@muohio.edu) is the digital initiatives librarian at miami university libraries, oxford, ohio, usa.
john millard is head of digital initiatives at miami university libraries, oxford, ohio, usa.
archives alive!: librarian-faculty collaboration and an alternative to the five-page paper
in the library with the lead pipe
aug
tom keegan and kelly mcelroy

image courtesy of florian klauer, (cc . )

in brief: the short research paper is ubiquitous in undergraduate liberal arts education. but is this assignment type an effective way to assess student learning or writing skills? we argue that it rarely is, and instead serves as an artifact maintained out of instructor familiarity with and unnecessary allegiance to timeworn conceptions of “academia.” as an alternative, we detail the archives alive! assignment developed by librarians and faculty at the university of iowa and designed to bring rhetoric students into contact with archival collections and digital skills. we also discuss how librarians can collaborate with instructors on new assignment models that build meaningful skills for students, highlight library collections, and foster connections on campus and with the broader community.

by tom keegan and kelly mcelroy

introduction

anyone who has spent much time working at an academic library reference desk has encountered students scrambling to find sources for research papers they have already written.
these students just need to add a few quotes (preferably from – different sources) in support of their preformed arguments. and quickly, because the paper is due tomorrow…or perhaps in a couple of hours. anyone who has taught undergraduate rhetoric, composition, or english has likely slogged through grading those same papers: formulaic phrasing or overwrought syntax; patchwriting if not outright plagiarism; vagaries grounded in hyperbole such as, “since the dawn of time…” and of course, anyone who has ever written a five-page paper knows the score. on sites like yahoo answers, you can find helpful instructions for how long it takes to write such a paper, how many sources to include, and how to tweak the margins/fonts to require as little actual writing as possible. whether the page limit is five or fifteen, as an assignment format, the short research paper is pervasive, largely unloved by its participants, and deeply flawed. we — an instruction librarian and an english ph.d./rhetorician now a library department head — believe that another world is possible, where assignments can be deeply engaging for both students and instructors. we also believe that librarians can help make this change happen. we recently created the archives alive! assignment now used by many sections of a core required course at the university of iowa (ui). the story of the assignment’s development is a story of risk and collaboration. in particular, it highlights the benefits of librarians approaching instructors to create assignments where students produce work for public audiences, and where student work can contribute to projects beyond their classrooms. the prevalence and weakness of short research paper assignments the five-page (or n-page) paper lurks everywhere in academia. its form privileges quantity, outmuscling quality and utterly preoccupying students with concerns about numbers. 
although professors may intend these assignments to facilitate exploratory learning, students often focus on meeting the expectations of their professors. because assignments may be vague on the more qualitative aspects, students often fixate on the concrete word count or required number of references.  it’s the mass of text that preoccupies these students. their arguments and the audiences for them are secondary considerations. this quantification of thought sends the absolute wrong message to students. good arguments are not necessarily quantifiable. we’re not suggesting that there’s categorically no difference between a one-page paper and a five-pager. tom has had enterprising or lazy students ask if it would be possible to write a successful one-page paper. it would, but that would be a hell of a paper. and that’s the thing: the impressive, succinct, artful one-page paper is not what we tend to teach students. we tell them that in order to tease out an argument, in order to excavate the multiple facets of the topic being addressed, we need a particular number of paragraphs and pages. in prescribing these numbers, we remain silent on what other possible forms might better serve their arguments. and in doing so, are we adequately modeling our own enthusiasm for the subject being taught? if we love rhetoric or composition, what is it that we love about it? are we, through our assignments, conveying that love? in a world where the relevant searches for “five-page paper” are expeditious rather than enlightening, we doubt it. in the run up to the annual meeting of the modern language association, tom and our colleague, matt gilchrist — a lecturer in the ui rhetoric department and director of iowa digital engagement and learning (ideal) — ran a call for papers for a panel titled “beyond the essay.” they were interested in what other kinds of assignments instructors were asking their students to undertake in service to their learning. 
we received a bunch of marvelous proposals that, to use the parlance of educational theory, described hybrid learning models. the panel drew interest from graduate students, non-tenure track lecturers, and the whole gamut of tenure track faculty. people read gardens as texts, mapped local narratives, created marketing campaigns for local non-profits. tom also received an email from an incredulous think tank member who asked, “do you really want to give this generation of college students relief from writing college essays?” years later, looking back on our work, we think: “yes, yes we do.” and “relief” is a fitting word. those essays are needlessly stressful, arid work, for both writer and reader. there’s no texture, no hook; nothing animates them. they serve as stock exercises in a form that, as one colleague of ours has noted, is not replicated anywhere outside of academia. and that is where we are sending many (if not most) of these students: beyond the bounds of academia. so what are we preparing them for? and how are we preparing them for it? short essay assignments can still play a role, but over-reliance on the form does not serve students or faculty best.

why is this a librarian problem?

although most librarians may not assign short research papers, we are often brought in to provide instruction or reference to assist students. as advocates for information literacy, we have a stake in whether these types of assignments help students build the skills we wish them to have. that is: does a short research paper help students learn to do research? in short, not necessarily. among the findings of the citation project, sophomore students often cite the first page of a work, and rarely cite any source more than once in a paper, suggesting cursory engagement with texts. the work of project information literacy around employer satisfaction with recent college graduates further suggests that students don’t always get the skills required in the workplace.
as one of their employer-participants explained, “they do well as long as the what, when, why, and how is clear in advance.”  although we are wary of focusing on the interests of employers, this statement also raises questions about how students will handle other research and critical thinking outside the classroom. whether making personal medical decisions, researching local ballot initiatives for an election, or flirting with a potential partner on an online dating site, an inquiry isn’t over simply because you found your three references or reached a word limit. unfortunately, librarians often find themselves simply reacting to assignments, rather than advocating for projects that will purposefully build student skills. the chapter titles of the popular one-shot library instruction survival guide allude to common problems of faculty non-collaboration: “they never told me this in library school,” “the teaching faculty won’t/don’t…” “but how will i cover everything?” despite lamentations of the one-shot, many librarians cling to it as the only scrap of contact they can get with students in the classroom. even embedded librarians well-integrated into a course may have very little role in assignment design. there may be a gap in perception between faculty and librarians. in an ethnographic study of faculty who were heavy users of library instruction, manuel et al. found that library advocates sometimes had opposing motivations to librarians, for example showing little interest in lifelong learning or critical thinking as goals for library instruction.  as one of their informants explicitly stated, they bring students to the library because the research paper is “the basic goal of the course.”  faculty may also assume that students will learn research skills simply by doing research, and leave out clear information literacy or research outcomes from assignments. the most successful librarian-faculty relationships occur when there are shared goals.  
however, as in the case below, the common ground may not be immediately visible. nalani meulemans and carr describe a program targeting new faculty members, with clear aims to shape their expectations for library instruction. creative thinking on the part of the librarian can help unearth potential for greater collaboration, but it also requires willingness to be flexible and make active suggestions. combining clearly articulated learning needs with new and interesting library services can lead to fruitful adventures. what is the point? developing successful assignment types in our view, the most successful assignments meet two criteria. first, the assignment type fits the learning outcomes and skills being developed. although this may seem obvious, we believe that archaic assignment types are often selected by rote. a new graduate instructor or junior faculty member is handed a stock syllabus for an introductory level course and is encouraged to maintain the status quo with respect to assignments because the clock is ticking on time to degree or tenure. the pressure to reach professional objectives outside of teaching becomes a reason to cling to the “tried and true” assignments of the th century. research paper assignments fit some learning outcomes and skills — for example, writing in an academic voice or learning a particular citation style — but certainly not all. second, successful assignments are placed within a context broader than the course. students, like anyone else, shape their work to fit their audience. although not every quickwrite or draft must be shared broadly, when students understand that their work contributes to a larger project or could be seen by a wider audience, they tend to take it more seriously. at worst, it’s a vanity concern in which students don’t want to look bad; at best, it imbues them with a sense of relevance that extends beyond the bounds of the classroom.
of course, there are research paper assignments that meet these two criteria. a research paper can be a part of a broader scholarly conversation on a topic, or at least a stepping stone to a student’s contribution to such scholarship. but few of our students will go on to become academics, and so the question arises: what do they get out of this “academic” practice? we found that the answer was little that cannot be replicated in assignment formats more relevant to students’ future professional lives. case study: archives alive! so that’s where we were in : tired, bored with our assignments, suspicious that we were not fully delivering the course objectives, and worried that we were merely reinscribing old methods onto our students who were poised to be citizens of the st century and needed, badly, to be able to move nimbly amidst its various forms of communication. tom: i was a non-tenure track lecturer in the rhetoric department, and co-directing a provost-funded student success initiative with my friend and colleague matt gilchrist. called iowa digital engagement and learning (ideal), the program was designed, in part, to help instructors rethink existing assignments and make them more digitally and publicly-inclined. our thinking was that students needed to be honing digital composition and public engagement skills. part of the departmental mission was to train students in writing, public speaking, and research. kelly: the library’s crowdsourcing transcription project, diy history, had huge success with the broader public, thanks in part to a viral post on reddit. people from all over the world were transcribing digitized archival collections, but the materials weren’t necessarily getting used on campus, let alone in the classroom. my colleague jen wolfe approached me to see if i had ideas about departments that might be open to developing something new using the pioneer letters in diy history. 
rhetoric seemed like a natural fit because their assignments often involved analysis of a text, and because i had strong relationships with several of their lead instructors, including both tom and matt. it helped that the two of them were known to be open to quirky suggestions, so we asked if they wanted to pilot…something. tom: and of course, matt and i said yes. kelly: the first few planning meetings had an open-endedness that was both refreshing and intimidating. for the usual one-shot, it is very rare to have any say in what an assignment looks like, since the syllabus is generally set long before the librarian is asked to come in. i think library instruction is often brought in as the clean-up crew when an early assignment goes wrong — yikes, my students don’t know how to research, please help! tom: in my teaching at the time, research remained the last of the skills i introduced to my students. we grappled with reading, writing, public speaking, analysis….and research. and this after-the-fact approach irked me. research became a sort of window dressing for students rather than the foundation of their work. they were seeking sources to hang on their arguments, rather than building those arguments on the sources they had read and analyzed. for a long time, i had been thinking about ways to better thread these skills together. the letters in diy history presented a different way to engage students in research. the letters themselves were not necessarily making pre-formed arguments, and i chose not to introduce them with much more context than: let’s look at these intimate writings from other people in history. the approach relieved (or robbed) students of the impulse to tie their arguments to ready-made contexts. much of the curriculum at the time encouraged instructors to use controversies as a means of getting students to understand the complexity of making an argument, to recognize the myth of argumentative dichotomies, the need to evaluate sources, etc. 
i was prepared to set that approach aside in favor of simply letting students dig into primary sources that they might find engaging. i asked my students to do the following things: transcribe the letter. rhetorically analyze its content (why did the letter writers choose the words they did?) in a -word blog post. historically contextualize the letter (what historical content is present in the letter, or barring that: what was going on globally at the time) in the same blog post for another words. the intent here was to locate this letter in a real moment and possibly juxtapose the local with the global. create a two minute or so “ken burns” style video that walked the viewer through any aspect of the letter that the student found interesting. live present their findings to the class using any visual medium they found appealing (powerpoint, prezi, etc.) but not simply show their videos. the intent was to get them conversant in rhetorical analysis, writing, research, public speaking, and digital composition – all in the same assignment (while also helping create a searchable index of these texts for scholars). kelly: during the very first pilot, my own assumptions about one-shots limited what we did. students had already looked at a few of the letters, and seemed really excited about the project. i had prepared a research guide for the assignment, and we used that to navigate to the finding aid for the archival collections the letters came from. it was a total buzzkill! students were confused by the format, and suddenly felt intimidated by the formality. we moved on to explore historical newspaper collections, and asked students to try to find an article from the date of the letter they were looking at, and their joy started to come back. tom: we should point out here that one of the reasons for the return of their joy was reading old newspaper advertisements. 
students were intrigued by the fact that people a hundred years earlier advertised and purchased things like hats. hats became a simple hook to the past. kelly: but, it was a good challenge to my assumption: did they really need to understand how to use the finding aid to complete the assignment? no, as much as my archival studies profs would hate to hear it, they really didn’t. the purpose of the assignment was to do a rhetorical  analysis of the letter, with very minor historical context. some students would come back and use the contextual information later, but it was secondary. overwhelming students with the arcane form of the finding aid did not serve them well. these weren’t history students, and our goal wasn’t to make them into historians — or even to make them feel like historians. tom: right. we wanted to use the primary source material to foreground the work of rhetorical analysis against the backdrop of historical research. after all, i was expected to be teaching them rhetoric — the art of persuasion. in many cases, analyzing the rhetoric of the letter also required researching contemporary idioms and terminology. i should also point out that the letters fostered remarkable collaboration. cursive, for example, brought out the cooperative spirit in them. we worked on transcription in class and when students had difficulty reading a word, we would put it to the class to essentially crowdsource an answer about what was written there. was this scribble an “s” or an “f”? i was impressed by the problem-solving groupthink that possessed them. kelly: by the second term we ran the assignment, we had expanded to three sections. that term, students in all sections looked at letters from a single scrapbook collection. this approach had a serendipitous peer-evaluation factor where several students in each section read letters from the same group of half a dozen american men serving in wwii. 
the students’ curiosity about filling in the gaps in these narratives or between references and words they understood and those they didn’t, led them to connect their work with that of their peers. tom: i will admit to being deeply suspicious about using such a small set of letter writers. i thought i’d be hearing the same names and the same stories and views over and over again. i was sure we were running the risk of replicating an assignment along the lines of asking students to weigh in with their views on the drinking age or the legalization of recreational marijuana use. i couldn’t have been more wrong. in class and during their final presentations students questioned one another about their shared letter writers. they asked things like: “when was that letter written?” or “had he already said this to evelyn?” as they pieced together a larger narrative. clarence clark letter, may , , page . http://diyhistory.lib.uiowa.edu/transcribe/ / kelly: evelyn birkby, the woman who had donated the scrapbooks, ended up agreeing to do a phone call with one section, which i sat in on. it was truly an experience in rhetoric as these students carefully tried to ask this -year-old woman about the nature of her relationships with all these men -plus years ago. she later expressed to the curator of the iowa women’s archive that she was delighted to know that her materials were being used, not just sitting in storage somewhere. tom: these letters also introduce some content that is more immediately graspable for our students. the soldiers mention films, music, and plays. the students can relate to those things — but they often don’t know the works being referenced. so, boom: there’s a research question. and they love it. pop culture references, military lingo, idioms all become portals for analysis and with it: research. 
tellingly their blog posts (a form that i think produces a more compelling and earnest voice than formal papers which often encourage stilted language and overwrought syntax) improve. they care about what they are writing and about the people writing the letters. as one student commented, “this project taught me that when something interests you, it never really feels like research as much as it feels like learning more about an old friend or uncovering hidden, exciting secrets.” another student talked about wanting to read the letters of their deceased grandfather as “good bonding experience for us.” and while our students have chafed against the videos, they do admit to enjoying the sense of accomplishment upon seeing their arguments in documentary style. their presentations are also a delight to watch. they interrupt one another, they go over time with questions, they carry on conversations after class about the letters and evelyn’s connection to these men. they consider themselves (mild) experts on their letters. and they feel they have contributed to the scholarly enterprise. at the very least, they have transcribed letters for other scholars, making those handwritten texts searchable. i’ll note here that one question i often get when discussing this assignment is: “isn’t this just student labor?” to which i often reply that nothing is more laborious for students and instructors than the rote five-pager. and why adhere to an assignment model that pretends to include students in the experience (that dewey objective) of scholarly work, when we can use one that actually does? kelly: it definitely requires ongoing maintenance. once a collection is fully transcribed, it can’t be reused for the assignment. it has taken conversations each term with the library staff who really know the collections to identify good fits for the assignment, and then the assignment gets tweaked to fit as well. 
other examples lately, we in the ui libraries have been working on calling attention to little used or little seen collections. we’ve commenced a collections to courses initiative that tries to bring the holdings of the libraries into broader circulation in the classroom. for instance, like our colleagues at notre dame university and the university of pennsylvania, we are identifying and promoting public domain holdings that can be openly remixed by students. and, in turn, we are encouraging students to archive their remixes with the library for future remixing. we’re interested in creating intellectual feedback loops where students create knowledge that will be stored by the libraries and those works can in turn be used by other scholars (students and faculty alike). we’ve also begun archiving student works produced by the iowa narratives project in our institutional repository, iowa research online. that project asks students to work in groups and create eight-minute podcasts out of an interview with a local citizen. students must make audio recordings, edit them in the style of, say, storycorps or this american life or radiolab, take photographs, and write a brief paragraph of context for the interview. in our experience, students often compose essays in one take. it’s four in the morning, they’ve just tumbled a bunch of text onto the page and…damn, it’s perfect (in their exhausted eyes). by contrast, no students edit like the students asked to make an audio recording of themselves. we find that students do not readily edit their own writing in the same way they do their multimedia. students making audio recordings of their own voices, for example, will do multiple takes without any prompting — they know what sounds good. so what if we used assignments that highlight editing of multimedia as a gateway for helping them understand why and how to edit writing? 
the recipe (we think) for librarians to propose this kind of change for librarians interested in pursuing this kind of pedagogical change with instructors, we have some suggestions for successful collaborations. strategize. consider your target. are there faculty/instructors who are known to be willing to experiment? folks who are big advocates for the library? a course whose instructors are particularly grateful for help from instruction librarians? or perhaps there are courses whose regular assignments produce groans every term. at the university of iowa, all students are required to take a course titled “rhetoric,” which is meant to introduce students to the art of persuasion. in that course, many instructors, students, and librarians alike lamented the long-standing paper-about-a-controversy. those lamentations were an invitation for new ideas. by targeting shared frustrations and overlapping objectives, instructors and librarians were able to jointly remake the assignment in a way that better achieved their goals. if you can think of projects that both advance the library’s goals and instructor and student need, you’re likely to have a better chance of lasting success. archives alive! helped promote our digital collections in the classroom while hitting multiple course objectives tied to rhetoric. advocate. consider the possible motivations of the people you approach. will they see this as the solution to a perennial problem? an innovative feather in their teaching cap? a hassle this late in the term? an opportunity to give back to the library? however you package your suggestion, be clear about your intended role in the project. meulemans and carr recommend practicing answers to hard questions from faculty, so you are prepared to stand up for yourself in the moment. if you’re afraid of a tough interaction, roleplay with colleagues who might have helpful feedback. work backwards from your objective. 
if you’re going to rethink an assignment, think first about what it is supposed to do. not along the lines of “it’s supposed to generate a paper” — but rather along the lines of “what do you want your students to be able to do?” if you want your students to become better researchers, think about what that means to you. what is a better researcher able to do? once you have a sense of what it is you’d like as an end product (in terms of skills), work backwards towards the assignment prompt. ask yourself what steps the student will need to take to wind up at the desired end point. in the case of archives alive!, we wanted to arrive at a live presentation on a topic of interest to the student that had been reasonably researched. and of course, “of interest” and “reasonably researched” don’t make that endpoint particularly easy to attain. working backwards also better allows you to anticipate the time needed to work through each step in the assignment process. unlike simply assigning a paper with draft and final due dates, our assignment included due dates for component parts of the assignment. this approach helped students lay the foundation for their eventual live presentation by completing one part of the assignment at a time. be honest. when you ask students to undertake new assignment models, be honest with them. tell them this hasn’t been done before. acknowledge that there will be bumps in the road. and tell them that they will be your troubleshooters. as they walk through the assignment, the problems they encounter will help the next semester’s students. this goes for interacting with faculty, too. there are costs associated with implementing new assignment forms; they take up time both in and out of class. so remain flexible when navigating a faculty member’s approach to the project, and find ways to be generous with your own time and resources. promote. once your students have crafted these engaging, enlightening, and entertaining works, share them.
get them out of the classroom to present in a more public setting. at iowa, we have had tremendous success getting classes to share their work in our learning commons within the main library, an open space that gets a lot of foot traffic. and to the extent that the works are digital, circulate them on the internet. call attention to your hard work and that of your students, by inviting faculty and administrators to come listen to your students’ presentations. celebrate their effort by trusting that it is something the public will find interesting. take risks. let go of your assumptions of what library instruction means. for archives alive!, we went into it without really knowing what the assignment would look like. it took a lot of conversation to clarify the goals of the instructors and of the librarians, and to brainstorm about how to get all those goals met. for students to interact fully with the documents, we had to let them focus on deciphering the cursive, and let the finding aid wait for another time. reflect and repeat. examine how things went, make adjustments, and try it again. whether or not you can reuse the assignment as developed, it has certainly taught you something, and hopefully broadened your network of connections on campus. both of us have developed a reputation for willingness to experiment, which draws otherwise unexpected opportunities. conclusion ironically, the archives alive! assignment helped us bury the myth that the rhetoric course was where university of iowa students learn all their research skills. by intentionally designing an assignment where students engaged with primary source materials, we uncovered necessary scaffolding that was otherwise being left out. we also got students to better understand research as an engaging and ongoing endeavor rather than a set number of citations. this experience has given kelly more confidence to set limitations with faculty who expect a whirlwind one-shot to solve all research woes. 
it has also opened up collaborations within the library, as the folks who work with digital collections, special collections and archives have to communicate and brainstorm. and this partnership isn’t dependent on personal relationships: a host of collaborations have continued although both kelly and jen wolfe, the other librarian involved at the start of the project, have left the university of iowa. this work also led tom into the library, where he now heads the digital scholarship & publishing studio. each of us has also had misfires in suggesting new projects: assignment designs that bombed, instructors who balked at making changes. however, the process of proposing and brainstorming remains a necessary one. at its root, education is about curiosity and the experience of seeking out answers to our questions. for us, asking questions of our assignments and looking for new, innovative ways to shape the curriculum has been incredibly rewarding — and brought with it some much needed relief from the five-page paper and its host of dated, restrictive, and staid trappings. we encourage you to usher in a similar sense of curiosity and relief as you and your students explore what new forms the st century has to offer. many thanks to in the library with the lead pipe for inviting us to publish with them and for their wonderful guidance and support. we would like to particularly thank our publishing editor ellie collier, our internal reviewer annie pho, and our external reviewer kate rubick. your feedback and suggestions were indispensable.  works cited bowers, cecilia v. mcinnis, byron chew, michael r. bowers, charlotte e. ford, caroline smith, and christopher herrington. “interdisciplinary synergy: a partnership between business and library faculty and its effects on students’ information literacy.” journal of business & finance librarianship , no. (june ): – . buchanan, heidi e., and beth a. mcdonough. the one-shot library instruction survival guide. ( ). 
head, alison j., michele van hoeck, jordan eschler, and sean fullerton. “what information competencies matter in today’s workplace?” library and information research , no. ( ): – . head, alison j, and michael b eisenberg. “assigning inquiry: how handouts for research assignments guide today’s college students.” available at ssrn , . jamieson, sandra. “reading and engaging sources: what students’ use of sources reveals about advanced reading skills.” across the disciplines , no. ( ). manuel, kate, susan e beck, and molly molloy. “an ethnographic study of attitudes influencing faculty collaboration in library instruction.” the reference librarian , no. – ( ): – . mcguinness, claire. “what faculty think–exploring the barriers to information literacy development in undergraduate education.” the journal of academic librarianship , no. (november ): – . doi: . /j.acalib. . . . nalani meulemans, yvonne, and allison carr. “not at your service: building genuine faculty-librarian partnerships.” reference services review , no. ( ): – . wiggins, grant p. & mctighe, j. ( ). chapter : what is backward design? understanding by design. alexandria, virginia: association for supervision and curriculum development. retrieved from https://fitnyc.edu/files/pdfs/backward_design.pdf valentine, barbara. “the legitimate effort in research papers: student commitment versus faculty expectations.” the journal of academic librarianship , no. ( ): – . valentine, “the legitimate effort in research papers: student commitment versus faculty expectations.” [↩] project information literacy found, for example, that ⅔ of the assignment handouts in their sample required some particular type of structure, and over half had a required number of citations. head and eisenberg, “assigning inquiry: how handouts for research assignments guide today’s college students,” . 
[↩] a project information literacy study on research assignment handouts found that % of the undergraduate research assignments in their study pool were plain old research papers. [↩] jamieson, “reading and engaging sources: what students’ use of sources reveals about advanced reading skills.” [↩] head et al., “what information competencies matter in today’s workplace?” . [↩] buchanan and mcdonough [↩] see for example gaspar and wetzel, “a case study in collaboration: assessing academic librarian/faculty partnerships,” . [↩] manuel, beck, and molloy, “an ethnographic study of attitudes influencing faculty collaboration in library instruction,” . [↩] ibid, . [↩] mcguinness, “what faculty think — exploring the barriers to information literacy development in undergraduate education.” [↩] bowers et al., “interdisciplinary synergy: a partnership between business and library faculty and its effects on students’ information literacy,” . [↩] nalani meulemans and carr, “not at your service: building genuine faculty-librarian partnerships.” [↩] ibid, . [↩] wiggins, grant p. & mctighe, j. ( ). chapter : what is backward design? understanding by design. alexandria, virginia: association for supervision and curriculum development. retrieved from https://fitnyc.edu/files/pdfs/backward_design.pdf. [↩] responses anthony – – at : pm one of the best assignments i had during my ils course was when we were given a spreadsheet of library visitation statistics.
we were then given the task to analyze the statistics, and write a report recommending when and how the library could open extra hours on the weekend, by reducing hours during the week. i believe the assignment itself had been passed onto the current lecturer from the previous lecturer. the assignment felt fully realised and time honed and tested. it was fun, difficult and i definitely learnt real world skills that i use to this day; and not a word count in sight! karenmca – – at : pm you people are brilliant! i too want collaboration between library and faculty. the more i get, the more i want it! in my case, i want to get students looking at historic materials in our conservatoire library, and indeed in other libraries, so that they interrogate the materials and ask themselves the questions that will make these sources come alive to them, and allow them to understand the context in which these scores (or playscripts, or whatever) were composed. i’ve saved your posting so i can read it slowly and thoughtfully at the first possible opportunity. good on you! this work is licensed under a cc attribution . license. itinerario interview are we all global historians now? an interview with david armitage citation armitage, david, jacobs, jaap, and van ittersum, martine. . are we all global historians now? an interview with david armitage. itinerario ( ): - .
citable link http://nrs.harvard.edu/urn- :hul.instrepos: terms of use this article was downloaded from harvard university’s dash repository, and is made available under the terms and conditions applicable to open access policy articles, as set forth at http://nrs.harvard.edu/urn- :hul.instrepos:dash.current.terms-of-use#oap are we all global historians now? an interview with david armitage by martine van ittersum and jaap jacobs the interview took place on a splendid summer day in cambridge, massachusetts. the location was slightly exotic: the interviewers had lunch with david armitage at upstairs at the square, an eatery which sports pink and mint green walls, zebra decorations, and even a stuffed crocodile. what more could one want? armitage was recently elected fellow of the australian academy of the humanities. at the time of the interview, he was just about to take over as chair of the history department at harvard university. his long-awaited foundations of modern international thought ( ) was being copy-edited for publication. granted a sneak preview, the interviewers can recommend it to every itinerario reader. in short, it was high time for itinerario to sit down with one of the movers and shakers of the burgeoning field of global and international history for a long and wide-ranging conversation. could you tell us something about your life story? where were you born, where did you grow up, what is your family background?
i had a very land-locked childhood, which, at least superficially, gave no indication that i would be interested in international and global history later in life. i say superficially because, as i think back, over the years about aspects of my family history, i can see that the seeds were already there, although i did not realize it at the time. my father, a marine engineer, did his british national service in the merchant navy and then stayed on for some years afterwards, spending most of his time on the so-called ‘manz-run’, which took in montreal, australia and new zealand via year-long tours of the pacific. it was perhaps an indication that i was going to have a global future as well, and was possibly also the genetic origin of my recent interest in pacific history. when i was a child, my father spoke very little about his activities at sea. but i occasionally picked up hints, when i saw occasional photographs of his travels or he mentioned a visit to australia here, having been in new york there. without putting too much of a burden on the accidents of family history, i think that it was significant both that my father had had a distinctly global career in his twenties and that he spoke so little about it. that seems characteristic of the way britain itself in my childhood was a power with international and global connections, but maintained a great amnesia about speaking of these connections or acknowledging how much the world beyond britain had shaped british history itself. david armitage, foundations of modern international thought (cambridge: cambridge university press, ). and i also remember as another thread of family history that my great-grandfather in the years before the first world war had wanted to escape his family. he was something of a wastrel, not to be relied upon.
he went on a long and rather mysterious tour of north america in / . there was clearly a gene for wanderlust in the family, even though it was rarely spoken about, and was thought about in somewhat dangerous terms. i have the postcards that my grandmother’s family received from him, all of which were dismayingly, suspiciously brief about what he was up to, where he was going, etc. there was clearly a strain in my family (my father’s global career through the merchant navy, my great-grandfather with his wanderlust, taking him through north america) that i must have picked up in a career that’s taken me to the us for the past twenty years, when i’ve lectured on six continents. (antarctica has eluded me so far.) by contrast, most of my relatives have stayed very close to the unremarkable town where i was born, stockport, just south of manchester. it was a spinning and hat-making town, one of the cradles of the industrial revolution, although all of that industry was already receding when i was a child. there is a third coincidence which, looking back, perhaps made me into an imperial and ultimately a global historian. my mother went into labor on the day winston churchill was buried, the last living symbol of the british empire. i now like to think, without being completely self-aggrandizing, that the weekend the british empire was buried in the figure of winston churchill was a rather appropriate time for a self-consciously post-imperial historian to be born. as the empire was passing away in the early s, i was born as part of a generation of historians that would regard the british empire as history, as something to look back on, but no longer as a living force. i was part of a generation that came of age in the late s and early s, when the last wisps of the british empire were given up. it conditioned the way my generation thought about british history in relation to its larger international and imperial contexts.
it was hardly a coincidence that i should come out of an amnesiac britain trying to forget its international, imperial and global connections, or that i grew up as part of a generation which was determined to recover those broader contexts, i.e. the impact of the wider world on britain and the impact of britain on the wider world as well. one realizes in retrospect that every part of british society was deeply enmeshed with the empire and commonwealth, even people coming from landlocked places. however, this was little talked about, something i call imperial amnesia. my parents contemplated emigrating to nova scotia in the late s, for example. this was entirely typical of upper- and middle-class whites moving to the empire in the twentieth century. david cannadine’s reflections on the hidden but multiple imperial connections of his family in birmingham suggest a very similar profile to that of my own family.
david cannadine, “an imperial childhood”, yale review, , (april ), pp. - .
white settlers born in the empire were educated at cambridge and oxford, on the understanding that you would return to the place where you came from. the great historian of political thought, j. g. a. pocock, was born in new zealand and still retains a very strong identity as a new zealander. it inflects almost everything he writes, increasingly self-consciously in recent years as well. his skepticism about europe as a project has always come from his new zealand identity. in his view, britain faced a choice between europe and the white settler commonwealth at the beginning of s, and made a choice for europe and against the settler empire. it was exactly at that moment that pocock wrote his first essays on the ‘new british history’.
he envisioned a metropolitan britain as part of a congeries, a global nexus of commonwealth settler societies across the globe, a set of islands scattered around the globe, including new zealand. the ‘new british history’ came from that charged political moment and that set of choices. you were educated at the university of cambridge. what difference did it make to your intellectual interests and academic career? i benefited from excellent teaching at stockport grammar school, most notably an inspirational history master by the name of nicholas henshall. a very fine publishing scholar in his own right, he had done a special subject with geoffrey elton at cambridge, and later started a ph.d. at the university of manchester, which he did not complete. i am convinced that i got from henshall as good a history education as i would have had, had i done history at cambridge. henshall gave me a real feel for what it was like to work at the highest pitch of scholarship, absent immediate access to the archives. i am immensely grateful for that. nick is a really inspirational figure: we all have someone like that, especially in our school careers, who showed you the excitement of intellectual life, intellectual work, of whatever kind it might be. i went to the university of cambridge in to read english, in probably the only act of rebellion in my whole life. i was supposed to enter cambridge as a history student, but at school i revolted and opted to do english instead. although i lacked the terminology, i knew that i wanted to be what we might now call a cultural and intellectual historian. if there is one single book that made me want to be a historian, it is frances yates’s the art of memory ( ). i still have the copy that i first read at the age of . to me, it was so thrillingly unusual in what it revealed about the past. these were aspects of the past that i had never encountered before. and yates did it in such an elegant and revelatory way. 
i decided that this was the kind of history that i wanted to do.
j. g. a. pocock, ‘british history: a plea for a new subject’, new zealand journal of history, ( ), pp. - , reprinted with minor modifications in journal of modern history ( ), pp. – , and again in pocock, the discovery of islands: essays in british history (cambridge: cambridge university press, ), pp. - ; see also richard bourke, ‘pocock and the presuppositions of the new british history’, the historical journal , ( ), pp. - .
n. g. henshall, the myth of absolutism: change and continuity in early modern european monarchy (london: longman, ); henshall, the zenith of european monarchy and its elites: the politics of culture, - (basingstoke: palgrave macmillan, ).
frances a. yates, the art of memory (london: routledge and kegan paul, ; rpt. harmondsworth: peregrine, ).
at cambridge in the early s, the undergraduate history tripos was still very much focused on political and institutional history. i had read everything by geoffrey elton and his followers by about the age of : very impressive institutional and political history, but not what i wanted to do myself. in order to become a cultural and intellectual historian, i knew i needed training in the reading of texts, i.e. interpretation and hermeneutics, with historical sensitivity. the cambridge english tripos (formally entitled “english literature, life and thought”) seemed the more sensible option. i had no intention of pursuing an academic career at that point. i wanted to be a barrister, which meant doing two years of english and one year of law. my plan was totally derailed by surprisingly good exam results in english in my first and second years. everyone said: ‘you should carry on with it, because you are apparently very good at it’.
since i had by then lost my rebellious streak, this is what i did. following a b.a. in english, i immediately started with a ph.d. in english. i spent two years on shakespeare’s classical sources, particularly shakespeare’s use of ovid. halfway through that process, i discovered that almost all the ovidian poets in sixteenth-century england had also written poems about english colonial enterprises in virginia or guiana. i thought: this is a much more interesting topic, this is where the juices start flowing, this is something novel that has not been talked about before. i faced a fork in the road. to cut a long story relatively short: rereading john milton’s paradise lost was the point on which my work pivoted back from literary scholarship to intellectual history. there are two great narratives in paradise lost: a) the narrative of the fall of mankind, and b) the narrative of satan’s discovery of the new world. there are references all the way through to satan as a voyager, a traveller, going to a new world, where he encounters the native peoples. the poem is saturated with the language of empire. i thought: ‘why was this the case?’ why did milton reflect in the late s and early s, when writing paradise lost, on contacts with the new world beyond europe? why did he mention empire, discovery and colonization, oftentimes with a negative valence? i related that to milton’s republicanism, his commitment to classical republicanism, to neo-roman thought. this was the topic i had been looking for: the relationship between republicanism and empire. i took this project with me when i left cambridge in , right in the middle of my ph.d., and went to princeton for two years on a commonwealth fund harkness fellowship, in order to retool as a historian. i was particularly encouraged to do so by j. h. elliott, who was then at the institute of advanced study at princeton.
david armitage, ‘john milton: poet against empire,’ in armitage, armand himy and quentin skinner, eds., milton and republicanism (cambridge: cambridge university press, ), pp. - ; see also armitage, ‘literature and empire,’ in nicholas canny, ed., the oxford history of the british empire, i: the origins of empire (oxford: oxford university press, ), pp. - .
elliott’s the old world and the new, - ( ), based on his wiles lectures at queen’s university, belfast, had been a great inspiration to me. (i was therefore more than usually honored and delighted to follow elliott as the wiles lecturer at queen’s in .) john himself was extraordinarily kind to me and had a decisive influence on my career. although based in princeton, he retained a house in cambridge, to which he returned every summer. i met him one summer when he was in cambridge, explained that i had a fellowship to go to the us, and that i wanted to work on the relationship between english literature and english imperial ventures in the americas. he immediately suggested i come to princeton. since the institute of advanced study does not take graduate students, he could not officially supervise me, but offered to help in any way he could. he gave me the names of princeton faculty members i might work with, including lawrence stone, natalie zemon davis and john murrin in history and david quint and victoria kahn in literature. he was extraordinarily generous, really decisive at that point in helping me to move my intellectual framework towards the us and american academia, princeton in particular. he was amazingly generous to someone he had never met before, to whom he had no prior connection. as a side-note, i should say that i have been struck at various points by how generous extremely senior scholars can be to junior scholars and how decisive this can be in one’s academic life.
throughout my own career, i have tried to follow elliott’s example as much as i can, in my own, faltering way. i realize that i owe so much to so many generous people, who helped me at critical moments in my career, when they really had no reason to. elliott was the first one to do that for me. he did not just introduce me to princeton scholars, but also—here is where the irony comes in—to many of the great cambridge historians who would play a decisive influence in my career for the next ten years. it was at princeton that i first met anthony pagden, richard tuck, chris bayly, linda colley and david cannadine. it was at princeton as well that i first really came across the work of quentin skinner, whose name had never even been mentioned to me in the five years i spent at cambridge studying english. so princeton, not cambridge, was the decisive influence in your career? you might say that; i couldn’t possibly comment. however, the pivotal figure in my career is quentin skinner, a founding member of the so-called “cambridge school” of the history of political thought, who, like pocock, also had a family background in the empire. quentin saved my academic life. in my second year at princeton, i had reached a crisis point. i realized that i could not in good conscience continue with a ph.d. in english. i wanted to be in a history department and to work as a historian. it became necessary to throw myself at the mercy of at least one historian in order to make the transition. by that time, i had read a great deal of skinner’s work. quentin was then publishing his major work on republicanism, which fit very closely with my interests at                                                                                                                 j. h. elliott, the old world and the new, - (cambridge, ). itinerario interviewed sir john elliott in : ‘an englishman abroad: sir john elliott and the hispanic world’, itinerario , (july ) pp. - . 
quentin skinner’s father was a former naval officer who joined the colonial service and served in west africa for most of skinner’s childhood: see interview with alan macfarlane, january : http://www.sms.cam.ac.uk/media/ . quentin skinner, machiavelli (oxford: oxford university press, ); machiavelli,
that stage. i very much wanted to work with him. it was the last chance to rescue myself as a historian. through a friend, i got in touch with him. on a brief visit from princeton, we had lunch and i explained my project. with what i soon discovered is his characteristic generosity and grace, he agreed to take me on as a student and to help me make the transition to the history ph.d. program at cambridge. he did exactly that. he did all the necessary administrative legwork to transfer me from english to history. the rest is history, or, rather, history. it was unforced, unanticipated generosity on quentin’s part, and an enormous vote of confidence. when i joined the history ph.d. program at cambridge, it was the absolute zenith of intellectual history and history of political thought at cambridge in terms of the breadth and depth of the group i was part of. among my contemporaries were annabel brett, joan-pau rubiés, and andrew fitzmaurice. it was a really extraordinary moment in terms of early-modern intellectual history, the move towards connecting intellectual history with extra-european history and the history of colonization. richard tuck was independently beginning his work that led to the rights of war and peace ( ). there was in different areas a move towards the international, colonial, imperial, global setting of early-modern intellectual history in particular. sometimes we worked entirely independently of each other, but then we discovered that we arrived at the same set of topics via different routes.
there were common seminars: the famous monday night seminar in the history of political thought, run for many years by skinner and john dunn–yet another founder of the cambridge school who came from “a sort of imperial family” in british india –and which still continues today. together with joan-pau rubiés, i organized a seminar on ‘cultural encounters in the early-modern world’, a colonial history and european expansion seminar with a cultural/intellectual history focus. in doing so, we had the full support of peter burke, a fellow-fellow of mine at emmanuel college at the time.                                                                                                                                                                                                                                                                                                                                           the prince, trans. russell price, ed. quentin skinner (cambridge: cambridge university press, ); gisela bock, quentin skinner and maurizio viroli, eds., machiavelli and republicanism (cambridge: cambridge university press, ). annabel brett, liberty, right, and nature: individual rights in later scholastic thought (cambridge: cambridge university press, ); brett, changes of state: nature and the limits of the city in early modern natural law (princeton, n.j.: princeton university press, ); andrew fitzmaurice, humanism and america: an intellectual history of english colonisation, - (cambridge: cambridge university press, ); joan-pau rubiés, travel and ethnology in the renaissance: south india through european eyes, - (cambridge: cambridge university press, ); rubiés, travellers and cosmographers: studies in the history of early modern travel and ethnology (aldershot: ashgate, ). richard tuck, the rights of war and peace: political thought and the international order from grotius to kant (oxford: clarendon press, ). 
john dunn, interview with alan macfarlane, march : http://www.sms.cam.ac.uk/media/ .

you taught at columbia university from until . how different was american academia from what you were used to back in britain? do you feel that it was a crucial step in your academic and professional career?

by extraordinary good fortune, a junior position in british history opened up at columbia university just as i finished my junior research fellowship at emmanuel college, cambridge, a year later, in . david cannadine held the senior position in british history at columbia. his extraordinary generosity, inspiration, and camaraderie began when i went for my interview at columbia university and continued throughout the time that i was there. he was an immensely supportive, energizing figure, who was taking his own imperial turn in those years as well. david and i very much saw eye to eye in terms of where british history was going. there was a palpable sense among british historians, particularly in the us, that the field was dwindling into insignificance, that the tweedy anglophilia that had sustained it for decades was no longer viable, as america was becoming a more outward-looking and global society. if british history was to survive as a teaching and research subject, a subject in which major universities would continue to hire, we had to reconsider its position in the wider academic ecology in the us. in the late s, the north american conference on british studies undertook a self-study of the field. its report concluded that a turn towards empire, towards britain’s international connections, towards the global setting of british history was going to be essential to save the field, just as it was intellectually unignorable as a major aspect of the field; it had been largely overlooked, except under the rubric of imperial history. i arrived at the right moment in the us, just when that move was taking place.
there is one other big difference between the us and british academia: the breadth demanded in teaching, the fact that one has to teach british history to a non-british audience, in the context of a very diverse student body. when i started teaching at columbia, i had to think from the bottom up about the larger stories, the larger narratives into which i would put british history, which would be narratives intersecting with colonial american history, with atlantic history, with imperial history, with global history. just in the course of writing my first series of lectures on british history for students at columbia university, i was being pushed to think in my capacity as a teacher—even before i took this turn in my research--to think outwards, to think imperially, to think globally. that was decisive for my career, as was a succession of brilliant graduate students at columbia and now harvard who have taught me more than i could ever have taught them about expanding the boundaries of established histories.                                                                                                                 david cannadine, ornamentalism: how the british saw their empire (london: allen lane, ). nacbs report on the state and future of british studies in north america ( ): http://www.nacbs.org/documents/reportonfield .html. among them, james delbourgo, a most amazing scene of wonders: electricity and enlightenment in early america (cambridge, mass.: harvard university press, ); ted mccormick, william petty and the ambitions of political arithmetic (oxford:     i taught the contemporary civilization course at columbia university, a two thousand year text-based, seminar/discussion-based survey of (mainly) euro-american intellectual history, plato to rawls and beyond. that was the most exciting teaching i have ever done. its chronological breadth over the very longue durée was salutary. 
it was in some sense my education as an intellectual historian, inculcating an interest in questions over the longue durée that have become increasingly urgent in my current work. half of my teaching at columbia university was in the core curriculum, in fact, which was one of the great attractions for me. i was very committed to it, and chaired the contemporary civilization course at columbia in - . what about your research and publications at this time? in the s, i did research in the new york public library, columbia university library, and the folger shakespeare library in washington dc, among other places. i went back to britain in the summer time in order to do research in london and edinburgh. i did quite a lot of scottish history as part of the larger british project i was working on, to make sure that it was truly british, representing both the english and scottish experiences. i published an edition of bolingbroke’s political writings in . bolingbroke played an important role in my ideological origins of the british empire ( ). he was among the first to theorize britain as a blue-water empire in the s. in the course of                                                                                                                                                                                                                                                                                                                                           oxford university press, ); lisa ford, settler sovereignty: jurisdiction and indigenous people in america and australia, - (cambridge, mass.: harvard university press, ); travis glasson, mastering christianity: missionary anglicanism and slavery in the atlantic world (new york: oxford university press, ); philip stern, the company-state: corporate sovereignty and the early modern foundations of the british empire in india (new york: oxford university press, ); ryan t. 
jones, empire of extinction: a natural history of russian expansion in the eighteenth-century north pacific (forthcoming, oxford university press); tristan m. stein, “the mediterranean and the english empire of trade, - ” (unpublished phd thesis, harvard university, ). david armitage, ‘the scottish vision of empire: intellectual origins of the darien venture’, in john robertson, ed., a union for empire: political thought and the british union of (cambridge: cambridge university press, ), pp. - ; armitage, ‘making the empire british: scotland in the atlantic world - ’, past and present, no. (may ): - ; armitage, ‘the scottish diaspora’, in jenny wormald, ed., scotland: a history (oxford: oxford university press, ), pp. - . bolingbroke: political writings ed. david armitage (cambridge: cambridge university press, ); see also david armitage, armand himy and quentin skinner, eds., milton and republicanism (cambridge: cambridge university press, ). david armitage, the ideological origins of the british empire (cambridge: cambridge university press, ).     doing research on him, i discovered that there was no modern edition of his writings, and there deserved to be. so a left-hand project in the context of my other work was to bring bolingbroke back to some prominence in the context of the blue book series, edited by skinner, who had written the most important classic essay on bolingbroke many years before. skinner was very receptive to the idea of publishing bolingbroke’s writings in the cambridge texts in the history of political thought series. theories of empire, published in , was a collection of previously published essays. the hardest part of putting together that volume was to find anything on the dutch empire. i wrote to various dutch historians, including prof. piet emmer at the university of leiden, to ask whether there was one classic essay on dutch ideas about empire. emmer replied: ‘sorry, the dutch had no ideas; they just counted. 
there is no secondary literature on the intellectual history of the dutch empire’. consequently, i included in the volume the classic essay ‘freitas versus grotius’ by c. h. alexandrowicz ( - ). this led to an abiding interest in alexandrowicz’s work as perhaps the first post-colonial historian of international law, who anticipated by two decades the ‘third world approaches to international law’ school, which has more recently transformed the field of international legal studies. together with jennifer pitts of the university of chicago, i will shortly publish a collection of alexandrowicz’s scattered but germinal essays. my doctoral dissertation was mostly a collection of case studies, which were published separately as articles or led to other projects, such as the edition of bolingbroke’s political writings. the ideological origins of the british empire, published by cambridge university press in , contains just a chapter and a half or maybe two chapters of my doctoral dissertation. the rest was freshly researched. much to the anxiety of my colleagues at columbia university, the monograph appeared just weeks before the tenure-file went forward. it was a risky strategy for any junior scholar in the american tenure system. do not try this at home! i was very, very lucky to have the extra time (i.e. a junior research fellowship at emmanuel college and research support from columbia university) to write the book the subject deserved. should we characterize the period - as the ‘atlanticist’ decade of your career? yes and no: yes, in the sense that most of the work which i published during that period was either explicitly or implicitly atlantic in focus, and no, in the sense that the transition to international/global history was already taking place in / . there was an obvious overlap between my atlanticist and international/global interests. 
the focus on global history was firmly in place when i became fellow at harvard’s charles warren center for american studies in / , starting my project on the ‘foundations of                                                                                                                 david armitage, ed., theories of empire, - (aldershot: ashgate, ). c. h. alexandrowicz, the law of nations in global history, ed. david armitage and jennifer pitts (farnham: ashgate, forthcoming). the warren center theme in / was “global america: connections between developments in america and in other parts of the globe”: http://news.harvard.edu/gazette/ / . /warren.html.     modern international thought’. out of that project grew—some might say metastasized—a single chapter, which turned into a book, entitled the declaration of independence: a global history ( ). my year at the charles warren center was filled with a series of very intense, very fertile conversations with international and global historians, led by the late ernest may ( - ), akira iriye, and james kloppenberg. i really began to discover that i was an international historian or had been one all along, like some sort of scholarly monsieur jourdain. i was thus becoming an international and increasingly global historian on top of being an atlantic historian. that is when the conversion really began to take hold, during that year. the british atlantic world, - ( ) was workshopped at a meeting of the international seminar on the history of the atlantic world at harvard university in september . on that occasion, you presented your now classic essay “three concepts of atlantic history”, which has been rather extravagantly compared to marx’s eighteenth brumaire of louis bonaparte for its quotable opening line, “we are all atlanticists now”. but what was the connection with international/global history? this interviewer attended the workshop, but never felt the connection with her own work. 
as bernard bailyn once put it, ‘martine insists on doing the east indies’. the boundaries have broken down more since. we can now recognize each other as being part of the same enterprise: l’histoire des deux indes, if you will. that was not always the case. national boundaries seem to have been reintroduced in atlantic history, which defies the purpose. absolutely, partly because of volumes like the british atlantic world, - , insisting that there is something british about it. we had thought, perhaps naively, that it might generate a series of volumes on the french atlantic world, the portuguese atlantic                                                                                                                 david armitage, the declaration of independence: a global history (cambridge, mass.: harvard university press, ). in molière’s play, le bourgeois gentilhomme ( ), the eponymous bourgeois m. jourdain aspires to be an aristocrat, only to make a complete fool of himself; in the process, he discovers that he has been speaking “prose” all his life without knowing it. david armitage and michael j. braddick, eds., the british atlantic world, - (basingstoke: palgrave macmillan, ; expanded edition, ); http://www.atlantichistory.com/atlantic_history/atlantic_history_home.html. martine van ittersum, profit and principle: hugo grotius, natural rights theories and the rise of dutch power in the east indies, - , brill intellectual history series (leiden: brill academic publishers, ); though see van ittersum, “mare liberum in the west indies? hugo grotius and the case of the swimming lion, a dutch pirate in the caribbean at the turn of the seventeenth century”, itinerario ( ) pp. - . see, for example, h. v. bowen, elizabeth mancke and john g. reid, eds., britain’s oceanic empire: atlantic and indian ocean worlds, c. – (cambridge: cambridge university press, ).     world, the dutch atlantic world, etc. 
happily, none of those happened, otherwise it could have been even more entrenched than it is. in some ways the cynics may be partly correct in saying that atlantic history was a way of rescuing different national historiographies by putting them in broader contexts. early american history became atlantic history, parts of early modern british history became atlantic history, and the same happened with the early modern histories of other european countries that had overseas connections or empires.

to do proper atlantic history requires the knowledge of so many languages that it is very difficult for anyone to do that.

yes, perhaps it can only be done as a collaborative enterprise.

were you influenced by prof. bernard bailyn’s conceptualization of atlantic history? bailyn first outlined his ideas for the international seminar on the history of the atlantic world in the itinerario interview of march .

absolutely, yes. i presented a paper at the international seminar on the history of the atlantic world at harvard university in august , i.e. the second year that the august seminar was running. i continued to attend the annual seminars until . although i was already making the turn towards international history, i consider the atlantic seminars among the most fertile forcing-houses for historiographical innovation that i have ever been part of. bailyn’s vision, ever expanding, ever deepening, was extraordinary to see unfold in the early years of the seminar. i was very privileged to have had a ringside seat for that.

what is your position with regard to feegi discussions about hemispheric history versus world history and european expansion versus world history?

my answer is twofold: ) this is perhaps a trivial point, but i have not been directly engaged in discussions, face-to-face, with groups like feegi, to thrash them out. ) to make a more substantial point, i am a great believer in letting at least a thousand flowers bloom.
one should not be exclusive about these things. it all depends on the question you want to answer. turning that around, the framework that you choose to bring to bear on your materials will generate new kinds of questions as well. there is a reciprocity, a back and forth, between the problems and the methodologies available to solve them. prescriptivism is death in these matters. one should not legislate for one approach or another. all approaches should be in play in order to generate the questions to open up the archives and to create the discussions that are necessary to solve particular problems. that is the only reasonable answer to that question. it is also sitting on the fence a bit.                                                                                                                 bernard bailyn, ‘the idea of atlantic history’, itinerario (march ), pp - .     yes, i always remember what david lloyd george said of an opponent in the british house of commons: ‘he has been sitting on the fence so long that the iron has entered his soul’. i feel very much that way myself: uncomfortable yet implacable. you say that international and global history has been at the forefront of your mind since your year at the charles warren center. at harvard, you find yourself in good company: niall ferguson, charles maier and emma rothschild, to name a few, are extremely distinguished historians of empire. has the harvard history department gone global? are we all global historians now? those are two separate questions, but connected. yes, what i have found most hospitable about the harvard history department is precisely its long-running commitment to international and global approaches. the two great innovators of international history were ernest may and akira iriye, with more than seventy years of teaching at harvard between them. 
they had laid the groundwork for this approach with their own students for the broader tenor of the department long before any of the recent generation of imperial historians was appointed. but both may and iriye did/do modern history. yes, both published on modern history, but both were also deeply learned in earlier periods. most of their ph.d. students did topics in th and th century history, and therefore the field could be identified with that era to some extent. but there was never any hostility to earlier periods. part of the raising of awareness about international and global history has been a breaching of chronological boundaries. for example, if we conceive of international history in terms of the interaction of both national and non- national histories, then before the great age of nation-states, before the cementing of a regime of nation states, all history was ipso facto transnational or international history. i would insist upon that. my colleagues in medieval history do as well. pre-modern history (i.e. history before the late th century) is ipso facto, by definition, by its construction, a trans-national historiography, although it has only very recently been conceived of in those terms. possibilities for dialogue with more self-conscious international/transnational/global historians are opening up, across chronological as well as geographical barriers. that is something the history department at harvard is very hospitable to—as are larger swathes of the historical profession, at least in the us. and a good thing too! that is something i am quite evangelical about. i am not on the fence about that at all. to answer the larger question, are we all global historians now? no, not in the sense that we are all doing global history. 
we certainly are in the sense that all historians now have                                                                                                                 niall ferguson, empire: how britain made the modern world (london: allen lane, ); charles s. maier, among empires: american ascendancy and its predecessors (cambridge, mass.: harvard university press, ); emma rothschild, the inner life of empires: an eighteenth-century history (princeton, nj: princeton university press, ).     a global audience, thanks to the internet. but in one strong sense we could say that we all have to be global historians now. by that i mean, if you are not doing…. this formulation will get me into trouble, but let me nevertheless put it in these strong terms: if you are not doing an explicitly transnational, international or global project, you now have to explain why you are not. there is now sufficient evidence from a sufficiently wide range of historiographies that these trans-national connections have been determinative, influential and shaping throughout recorded human history, for about as long as we know about it. the hegemony of national historiography is over. it used to be the case until very recently, let’s say ten years ago, that if you did not do national historiography, you had to tell other people why you were not doing national historiography. i would like to say the boot is now on the other foot. we now have to ask the national historians: why are you doing us history without the history of the hemisphere, the american empire, america’s relations with the wider world, the history of american emigration, the transnational circulation of ideas, whatever it may be? i think it is time for us to put the national historians on the defensive, to justify their choice of particular local, regional or national frameworks. 
i am putting that a little aggressively, but i also hope that it might be productive for those who work on smaller units, to justify to themselves why it is they choose them, apart from the inertia of the historical profession, that it has always been the case that one would take a town, a region, or a nation-state as a focus of historical study. we need to be more reflexive about exactly why we choose those things, rather than the path-dependency of historiographical activity. the irony is that many historians born and/or living in newly independent countries in the so-called third world are doing national history. yes, it is essential for them to do so; it is essential for the public purposes of their historiography, because of the former suppression. it is absolutely essential for them to go through that stage. is it essential for us or british historians to continue doing national history for the same reasons? there is no equivalence there. if historians find themselves in a post-imperial, not a post-colonial, situation and if they continue to write national history, then we have to ask why. they need to justify why they are doing what they are doing, when there is so much evidence that the nation-state is a container at once too small and too large to encompass everything that we want to learn about the past. does it not also depend on the audience historians are writing for? historians have a duty towards society, their own societies, hence the predominance of the nation-state in historical narratives. it feeds into national identity, any identity. that is what people are interested in. as an ideal, we should do global history. but we are all rooted in our local communities. i agree. history has a public, indeed a civic function in that sense. but to take the example of the us, we now know from the latest census analyses that white descendants of europeans are already in the minority.
that necessarily changes the public and civic                                                                                                                 http:// .census.gov/ census/data/.     focus of us historians. they should not continue to tell the story of the nation-state as the advance of european immigrants and the embedding of their institutions, but tell the full story, the diversity of the us in its connections with the wider world --oceanic, hemispheric, and global. so, yes, there will be conservatives who say ‘the national story should continue to be the story that has been told by the new england historians since the beginning of the th century’. however, that story will increasingly lose an audience because that audience is dying off, and being replaced by a much more diverse audience, with a much greater consciousness of transnational connections, not least through their own family lives. the general public is mainly interested in genealogy and local history. that is what you see all around you, especially in the boston area–lots of historic sites associated with the american revolution. in order to keep in contact with the larger public, university- trained historians should have a feel for that, while showing at the same time the larger implications. one of the impetuses behind my book on the us declaration of independence was precisely to show that this most american of american documents was fundamentally international, even global: if one could globalize the declaration, then there was no reason not to globalize the rest of american history–by which i mean, as most of its practitioners mean, united-states-ian history. even in its physical make-up the declaration was an international object, printed by an irishman, using a printing press and type imported from england. moreover, he printed it on dutch paper. there were no paper manufacturers in british north america in the s. 
the us would not become self-sufficient in paper production until the early th century. even the paper the declaration was written on had to be imported. the inkstand used to sign the manuscript was made of silver not from the mines of virginia, but probably from the mines of peru. so it does not take very much to show the international connections, even in the declaration’s physical fabric. the declaration of independence: a global history ( ) is now available in paperback. more importantly, it has been translated into italian, french, portuguese, spanish, and japanese. a chinese translation is underway. does this make you a public historian? was it your intention to speak to a wider audience? that book grew out of your forthcoming foundations of modern international thought. did you ask harvard university press to publish one chapter as a separate book? oddly enough, it was actually my editor at harvard university press, kathleen mcdermott, who suggested the idea to me of doing a separate book, as something that could reach a wider audience. i was very happy to do that. i was quite bullish about taking a broader, international and global approach to early american history in general. harvard university press very generously, very wisely saw the potential for a book on that subject, a relatively short book that would bring that perspective to a wider audience. one of my great satisfactions is the way in which the book has been read by non- academic readers, including high school students. i have done a lot of talks to high school teachers, in particular about how to teach the american revolution in wider contexts.     that seems to be an important shift in the teaching of american history in american high schools. teachers have realized the necessity of taking a broader, cosmopolitan perspective to educate their students about the wider world that they are part of. for civic purposes, the national narrative is no longer sufficient for them. 
i am very proud indeed to have made a small contribution to that. i have the satisfaction of seeing my research go very quickly into classrooms across the us. were there any negative reactions to your interpretation of the declaration of independence? yes, it was written as a polemical work. i deliberately downplayed the importance of the declaration’s second paragraph (i.e.‘self-evident truths’ and ‘inalienable rights’) because historically it has been much less important to the global context than the opening and closing paragraphs regarding the rights of peoples and ‘free and independent states’. but i did get some pushback from american historians and americanists, who claimed the book was unbalanced in not giving due attention to the importance of the second paragraph for american history itself. but that work –placing the second paragraph into its larger, historical context—has been done as well as it is likely to be done by pauline maier in american scripture ( ). her book was a great inspiration to me. so i said to myself: ‘my job now is to place the whole document into its international context in and beyond, and see what the evidence turns up’. and the evidence was very decisively against the importance of the second paragraph. that did get me into some kind of trouble. the way that i tend to teach that, especially when i work with high school teachers, is to say: ‘it is important to remind your students that the promises of the second paragraph, the promises of individual rights, the broader promises of human rights, are always contestable and reversible, not something you can absolutely rely upon’. one of our jobs as teachers is to encourage our students to make arguments in favor of those rights, not to assume that these rights will always be available to them or to anybody else. 
come up with good arguments why this conception of rights, natural rights, rights perhaps derived from a divine source, rights derived from major foundational documents like the declaration of independence or the bill of rights are substantive and can be actionable, can protect you. how do you gain protection from that? only by protecting the rights themselves, by being able to argue for them. my skeptical view of the second paragraph is very much intended to push in that kind of civic direction: to say, well, justify these arguments. there are plenty of philosophers who say that the assumptions underpinning the second paragraph of the declaration of                                                                                                                 gary reichard and ted dickson, eds., america on the world stage: a global approach to us history (chicago: university of illinois press, ). laurent dubois, robert ferguson, daniel hulsebosch and lynn hunt, “critical forum,” william and mary quarterly, rd ser., , (april ), pp. - ; tiziano bonazzi, david hendrickson, peter onuf and arnaldo testi, “round-table on armitage, the declaration of independence: a global history,” rsa: rivista di studi nord- americani, ( ), pp. - . pauline maier, american scripture: making the declaration of independence (new york: vintage, ).     independence are, to put it mildly, not very robust. we may need to come up with better arguments in their favor. so what might those better arguments be, instead of the shorthand assumptions that jefferson built into the document? do harvard historians have a duty to speak to the general public? many of your colleagues are writing in the big american newspapers, weekly magazines etc. is it valued by the harvard administration? maybe not a responsibility, but certainly an opportunity. the harvard name does open doors. 
the inspiration provided by colleagues who have a public presence encourages one to rethink how to couch one’s scholarship to reach a wider audience. on the part of the administration, there is an expectation that one should speak to the widest possible audience. the edx initiative may become important in this regard as well. instead of having at most one thousand students in a physical classroom at harvard, it will now be possible to have tens of thousands of listeners and learners all around the world. what will be history’s contribution to the joint mit-harvard edx initiative (http://www.edxonline.org/)? i was at a meeting a few days ago to discuss harvard’s entry into the world of “massive open online courses” (moocs). the very first on-line humanities course to be offered through edx will be a course on chinese history taught by my colleagues peter bol and bill kirby. they are working on it right now. there is a potential audience of over a billion in china alone. but there would seem to be a problem if the history contribution to edx would be nothing more than a harvard history professor pontificating in front of a camera, expecting the world to watch in breathless admiration. that is true. that turns out to be very unappealing to an on-line audience. that is where the really interesting questions begin. we had a two-and-a-half-hour discussion about this. how do you do what we do as interpretative, evaluative, qualitative scholars in that kind of scaled-up, massive on-line environment? it is fine for introductory courses in mathematics or computer science: almost all on-line courses so far have been of that kind. they are introductory; they can easily be assessed by non-human assessors, through multiple-choice questions and machine-marking, for example. it really is a matter of advancing stage by stage from simple to more complex information. it does not involve evaluation or analysis of the materials.
so the really interesting questions are: ‘how do we do what we do in that kind of environment? is it even possible for us to do what we do in that kind of environment?’. that is one reason why harvard and mit are investing a large amount of money in the edx initiative. it is a very good program in the sense that harvard, in particular, has said: ‘this will not just be for the sciences and engineering, this will also be for the humanities and social sciences’. harvard has now turned it over to all of us, asking ‘well, how will it be?’ ‘what kind of resources can harvard put at your disposal to create on-line the kind of analytical experiences that we value in our classrooms?’. there are various possibilities, of course. it could mean digitizing texts and physical objects, in order for students to zoom in and view and rotate them in three-dimensional space. it could mean allowing various kinds of on-line discussion, perhaps with off-site, but on-line teaching assistants. or it could be done through various kinds of peer advising and peer teaching, i.e. more-experienced students help less-experienced students in on-line discussion groups. students who take an on-line course for some kind of credit become teachers for that course in due course. it is creating a wholly different kind of teaching environment, and at an international and global scale. a professor who teaches a course on leadership at the kennedy school of government at harvard told us at the meeting on wednesday that he had to rethink the course in light of the cross-cultural, international conceptions of what leadership means, which he got back from the ten thousand students who were taking the course on-line. they were feeding back very different conceptions of leadership. he brought along a student from serbia who was a graduate of the on-line course, who had come to harvard to take a master’s degree at the kennedy school and is now teaching on the course here.
contact with actual, living subjects changed the way this professor taught the course at harvard. another participant said ‘we now have a huge survey group for testing pedagogical innovation’. you can try a new technique or module, and get immediate feedback from ten thousand students, whether it works or does not. that can take years in a normal classroom. a third participant mentioned the possibilities for crowdsourcing in research, such as the transcribe bentham project in london, which uses non- academics to crowdsource scholarship itself. that could come through courses as well, i.e. to have certain core texts or materials that people can use for research. it is possible to begin to imagine ways in which we can build in research and analytical experiences in on-line courses that are unimaginable in a classroom of between and students, but become conceivable when you are scaling up to ten thousand students. it could create very different, novel, previously unimaginable ways of teaching and doing collective research, which are not possible in a small, classroom setting. is edx going to be one of your priorities as chair of the history department at harvard? it cannot be formally a priority, because for the moment edx is something the faculty do in their spare time. it is a non-profit organization, independent from both universities. the members of the board are senior administrators, including the presidents, provost and deans of both harvard and mit, etc. right now, harvard faculty members are asked to contribute pro bono and pro fama—they can become famous and reach a larger audience. however, there is no salary recognition for it. it is like writing a textbook, which you also do in your own time. crucially, there is no business model for it yet. nobody has figured out how to generate a revenue stream out of this kind of higher education. 
on this innovative project in editorial crowdsourcing, see tim causer, justin tonra and valerie wallace, “transcription maximized; expense minimized? crowdsourcing and editing the collected works of jeremy bentham”, literary and linguistic computing , (march ), pp. - .
until somebody works out how to do that, edx may continue to be something that you do out of a passion to reach a larger audience and that the universities like harvard, mit and stanford will undertake to expand their brand. part of the down-side of these on-line courses is the drop-out rate: at best %, at worst %, of the people who signed up do not see out the course until the very end. this is no reason not to press ahead: even if such a huge proportion of students do not make it to the end of the course (or, in most cases, even get past the beginning), thousands still may. as the best recent analysis of mit’s first on-line course concludes, ‘the message for moocs has to be: disregard the dropouts and celebrate giving huge numbers of people access to free, high-quality, education’. to retain students, smaller modules are being developed for edx, i.e. - week modules, rather than the - weeks of the harvard teaching semester. so that is the question, how do you keep people’s attention, when they do not have regular class assignments, when they are not doing it for credit? in some cases, you can get a certificate of completion, but that does not have any credibility for employers, as an academic qualification, unless you can find a way to make it more robust, and, essentially, to sell those kinds of accreditation. it is not clear how you monetize this kind of higher education. there are all kinds of questions, very interesting, fundamental questions. what are the university’s responsibilities towards a wider audience beyond its gates?
how can faculty members reach out, under what circumstances, with what kind of encouragements? it is all fascinatingly up in the air. but this is just a tiny corner of the much bigger digital revolution that is taking place now. i am absolutely certain that we are in the midst of the single most transformative moment in academic life since the modern research university was created at the end of the th and the beginning of the th century. in five years’ time, the landscape is going to be unrecognizable. it is already becoming unrecognizable in fundamental ways. you mean new formats of publishing, secondary literature with direct links to primary sources and other secondary literature, possibilities to annotate e-books on-line, etc? of course: six different layers of annotation, books themselves becoming wikified, through their interactions with past scholarship and later readers. this is already happening, it is already here. that is where i feel very strongly that we have an agenda to follow. that is an agenda that i am already putting into place for the history department at harvard. i have set up a digital working group for the department. we have more than ten faculty members who are actively engaged in rethinking pedagogy and research, using digital tools and materials. i take that to be our major issue now: to publicize what is already going on in the department (there is a huge amount of innovation in this area which is not as well known as it should be) and to equip all of our students and as many faculty as want to be equipped with these digital capacities, because they are rapidly becoming essential for everything that we do. some familiarity with how they operate is going to be as basic as philology was to a classical historian, for instance. we need to realize that this has already happened, but we are lucky to have some of the field’s great innovators here at harvard to give advice and inspiration.
sue gee, “mitx: the fall-out rate” ( june ): http://www.i-programmer.info/news/ -training-a-education/ -mitx-the-fallout-rate.html.
jonathan shaw, “the humanities, digitized: reconceiving the study of culture,” harvard magazine, may-jun : http://harvardmagazine.com/ / /the-humanities-digitized, featuring the work of, among others, peter bol, jo guldi, and jeffrey schnapp.
what we are doing now is playing catch-up. as one of the participants of the wednesday meeting [about edx] put it, ‘putting together these large on-line courses now is rather like driving a very rapidly moving train, when you have to construct both the engine and carriages behind you and lay the track in front of you, at the same time as the train is moving at a hundred miles an hour’. that is the way it is going in all areas now. i am absolutely convinced of that. but that is a western phenomenon, is it not? in his essay ‘codex in crisis’, anthony grafton recalls that he was sitting in a “tin-roofed, incandescently hot west african internet café” in , trying to answer e-mail questions from his graduate students in the us. he could find “little high-end material on the screen, and neither by the look of things, could [his] beninese fellow users.” there are digital divides within the us as well. as with any valuable resource, very rapidly inequalities kick in. we have to be aware of that. there is a lot of discussion within, for instance, the digital community in the us about these inequalities of access and how digital access can overcome them to create more connected forms of public history and community history, to have history literally from the bottom up.
for example, local groups crowdsourcing materials from their own communities, feeding them into on-line archives, where these can be supplemented by historians, but in a non-hierarchical relationship between professional historians and non-professional people interested in history and with access to historical materials. that is great, but you are absolutely right, the conversation has to be expanded outside the wealthy heartlands of the digital world. in terms of academic institutions, we have immense computing power; large amounts of money are being put behind it here. but that is not true everywhere, even within the relatively well-funded higher education system in the us. most students and scholars do not have access to the full range of databases that exist behind high and costly pay-walls. so, yes, what about benin, what about india, what about many other parts of the world, even latin america, for instance, how will they get access to these tools and techniques? that is a question that goes beyond the capacity of academics, but that is one that we have to consider in so far as the promise of the digital revolution is universal access to things that had so far been allowed only to the privileged and accredited few. open-access journals, creative-commons licenses and the various efforts to digitize vast numbers of books through the internet archive, google books, the digital public library of america, the europeana project, as well as national projects in countries such as france and germany, will in time all help to create that universal access to the world’s knowledge. do you consider your work to have moral implications? the reason why i am asking this is your contribution to a recent symposium in the journal political theory ( ) on the work of the canadian political theorist, james tully.
in your contribution, you appear to criticize tully, a major defender of the rights of indigenous peoples, for ignoring ‘the tens of millions of people [in the global south] who still lack some of the                                                                                                                 anthony grafton, worlds made by words: scholarship and community in the modern west (cambridge, mass.: harvard university press, ), p. .     most basic forms of human security’. should historians leave it to philosophers to consider these moral issues? it was certainly meant as a friendly provocation to tully, very much in his own critical spirit, to say we should not settle with the boundaries of moral and political philosophy as we have inherited them, we should always be seeking to expand them, if we believe that there is any transformative potential whatsoever in our use of historical knowledge to enlighten contemporary society and open up new questions. i was pushing the boundaries of what he had done. his work has been absolutely fundamental, not just in canada or north america, but more broadly in bringing indigenous rights to the center of discussion in political theory. that is a huge achievement in itself. i was just pushing the logic of that further, by saying ‘so what about those people who cannot make claims within the context of settled and constitutionalized societies like canada, the us or australia, those for whom the struggle might not be about recognition, but for simple, bare human survival?’. how can this be made relevant to them? how can we think about other kinds of inequalities on a global scale which are parallel to and to some extent intersect with the kinds of inequalities which tully himself was mapping in the context of a very large and very important set of communities, but only one congeries of communities on that global scale? should historians consider these questions as well? how can we not? 
it depends on your choice of topic, of course. but the topic i am working on at the moment, competing conceptions of civil war, is something that affects hundreds of thousands of people around the world, not just in asia and africa, but now in the middle east as well. to ask about the boundaries of humanitarian law and civil law, to ask how external powers should react to conflicts called civil war, this can literally be a case of life and death for tens of thousands of people, perhaps even millions of people. if one encounters a topic like that, i think there is a moral responsibility to consider the wider ramifications of one’s academic work. anything that one writes may be taken up in these contexts. one therefore has a duty to get it right, to consider the potential implications, what uses it might be deployed for. finally, how might you define the future of intellectual history?
david armitage, rainer forst, bonnie honig, duncan ivison, anthony simon laden, and james tully, ‘feature symposium: reading james tully, public philosophy in a new key (vols. i & ii)’, political theory , no. ( ), pp. – , on james tully, public philosophy in a new key, vols. (cambridge: cambridge university press, ).
david armitage, civil war: a history in ideas (new york: knopf, in progress).
my answer is three-fold: 1) international/global, 2) longue durée, and 3) digital, which facilitates 1) and 2). i have been writing recently on all three of these futures. in regard to all of them, my preference is for short books on big topics. they are more readable; they have more of an impact. one can move more rapidly. at some point, somebody has to digest the findings of the big books, to put them into a bigger picture.
and to do that within the compass of, let’s say, between , words and , words for a wider audience is absolutely essential, if we are going to have any kind of impact. and also to do that in other fora. we are still talking in terms of the physical dimensions of the codex. again, the digital revolution means that we are now writing in different genres and reading in different genres. now, much of the most exciting stuff that i read is in blog-posts; it is not in journals, and to some extent it is not in monographs. very rapidly moving, suggestive, deeply researched scholarship is coming out in very different formats now. i joined twitter recently: the amount of information, fabulous information, one can get from that is absolutely mind-boggling. i have learned an incredible amount from the links that people have put up; there’s very serious material to be found there if you follow the right people.
the problem is that in britain they have not caught up with this.
of course, they have. many of the people i follow are in britain. they put up links to the folger shakespeare library, to the institute of historical research, to digital projects at the university of london, etc., etc.: an incredible amount of stuff. and much of the most important digital work is being undertaken by scholars in britain: king’s college, london, has a department of digital humanities, oxford has an increasingly prominent and integrated programme in the field, and the world’s largest digital archive of subaltern sources, the old bailey online, comes out of three british universities. but none of this counts for the research excellence framework (the united kingdom’s regular process of academic assessment)! this is the problem, a really interesting and critical problem. how do we evaluate digital scholarship in non-traditional formats? the american historical association, following the modern language association, has just set up a committee to create protocols for evaluating digital scholarship.
that is at least a start. that is one of the things that i have asked our digital working group in the department to do, to create standards for the evaluation of digital work for junior faculty, graduate students and undergraduates; we are likely to get increasing numbers of undergraduate theses that involve digital work. and we have no standards for evaluating that at the moment. in a year’s time, we have got to have them. that is a real imperative.
armitage, foundations of modern international thought; armitage “what’s the big idea? intellectual history and the longue durée,” history of european ideas , ( ); armitage and jo guldi, “the return of the longue durée” (in progress). http://www.ref.ac.uk/. http://www.mla.org/guidelines_evaluation_digital. the aha “approved the establishment of a task force on digital scholarship” at its june meeting: http://blog.historians.org/news/ /decisions-of-the-aha-council-june- .
according to neil jefferies, the bodleian library in oxford will soon make its entire catalogue open-source, thus allowing scholars to make changes in the catalogue. but there is only an incentive for people to do this if they are going to receive some sort of recognition for it.
not necessarily. go to any rare books library in the us or britain and you will often find a slip of paper in the front of a book (or people have made annotations) about where extracts have been published, about other manuscripts, attributions, and so on. we have always had an informal version of that sharing of scholarly knowledge in and alongside the physical objects. but the planned changes in the on-line catalogue of the bodleian library will massively increase that possibility. harvard libraries have made available the meta-data on million ( million!) objects in the harvard collections: manuscripts, books, physical materials, etc.
if you can wait a couple of hours, you can download the whole zip file of, basically, two-thirds of the library collection. and then the kinds of searches you can run on that, the way that you can manipulate that material… the sky’s the limit. it is up to you. that is all open-source now, that is all there. that is like being able to see inside the whole card catalogue all at once, but on a ten-fold scale. is there anything you would like to add to conclude our interview? i am sorry that i do not have the standard stories of how i spent six months on a banana boat, chatting to the indonesian crew. i have read a few of these itinerario interviews: i am sorry i do not have more glamorous or romantic stories for you! i would however like to mention the cambridge university press series, ideas in context which i co-edit and about which i feel very strongly. speaking of the future of intellectual history, we are pushing the series very much in the direction of doing more on imperial ideologies and global intellectual history. we just published chris bayly’s recent book on indian liberalism: the th volume in the series, symbolically to show a new direction for the series and for the field of intellectual history as a whole. chris                                                                                                                 neil jefferies, research and development project manager at the bodleian library, oxford, announced the planned changes in the on-line catalogue of the bodleian library at the ‘representing the republic of letters’ meeting held at huygens ing in the hague, june- july . http://openmetadata.lib.harvard.edu/bibdata. http://www.cambridge.org/gb/knowledge/series/series_display/item /ideas-in- context/?site_locale=en_gb. c.a. 
bayly, recovering liberties: indian thought in the age of liberalism and empire (cambridge: cambridge university press, ); compare shruti kapila, ed., an intellectual history for india (cambridge: cambridge university press, ); kapila and faisal devji, eds., ‘forum: the bhagavad gita and modern thought,’ modern     talked about the first glimmerings of this project in an itinerario interview a few years ago. i am happy to link up to that. i am also convinced the next frontier for oceanic history is pacific history. we are very glad to convene the conference at harvard in november . i think this will be the first conference ever to take a truly pan-pacific perspective. it will include scholars who work on the indigenous pacific, the histories of australia and new zealand, the history of asia, including china and japan, and also the north-pacific, russia as well as the americas. the participants will see the pacific whole for the first time, on the models of atlantic history. we will need to figure out whether the models forged for atlantic history have any relevance to an arena that is so much bigger, i.e. one-third of the earth’s surface, one-sixth of humanity within its borders. the pacific is a sea of islands—in the way that the atlantic by and large is not—as well as a sea of rims and borders and connections. it is very exciting to see how that comes together. a volume should emerge from that by , designed after the british atlantic world volume. the conference is the workshop for the volume. it is important in terms of my global trajectory to say that i feel in some ways that i am repaying a debt to the pacific world, and even carrying on my father’s legacy. i hold an honorary professorship at the university of sydney in australia, where i like to visit as often as i can. there one sees the world from a very different perspective, a pacific perspective. 
i have also been lucky enough to have two extended visits to japan in recent years as well, where one gains another pacific perspective. putting together those perspectives and the conversations about pacific history that i have had in both japan and australia over the years, it seemed to me that this was a topic whose future had very much come. you asked earlier about the future of atlantic history. i think one of the futures of atlantic history is precisely joining it to other oceanic and trans-regional histories. that is part of the logic of what we discovered about the limits of atlantic history: it can be too broad to encompass things but also too narrow to deal with trade flows, migration flows, and flows of goods and ideas. we need to think about the interrelations between these oceanic arenas and how in some sense they add up to a global or proto-global history. and there i end!                                                                                                                                                                                                                                                                                                                                           intellectual history ( ): - . interview with c. a. bayly, “i am not going to call myself a global historian,” itinerario ( ) pp. - . http://projects.iq.harvard.edu/pacific_histories/. david armitage and alison bashford, eds., pacific history (basingstoke: palgrave macmillan, ), modelled on armitage and braddick, eds., the british atlantic world, - . african americans in evolutionary science: where we have been, and what’s next graves jr. evo edu outreach ( ) : https://doi.org/ . /s - - - co m m e n ta ry african americans in evolutionary science: where we have been, and what’s next joseph l. graves jr.* abstract in national science foundation data revealed that in the united states the professional biological workforce was composed of ~ . % “whites”, . 
% “asians”, and only % “african american or blacks” (national science founda- tion, , https ://ncses data.nsf.gov/docto ratew ork/ /html/sdr _dst_ .html). there are problems with the categories themselves but without too deep an investigation of these, these percentages are representative of the demography of biology as a whole over the latter portion of the twentieth and beginning of the twenty-first century. however, evolutionary biologists would argue (and correctly so) that the representation of persons of african descent in our field is probably an order of magnitude lower ( . %). this commentary focuses on the factors that are associ- ated with underrepresentation of african americans in evolutionary science careers. keywords: african americans, evolutionary science, institutional racism, aversive racism, diversity, inclusion © the author(s) . this article is distributed under the terms of the creative commons attribution . international license (http://creat iveco mmons .org/licen ses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the creative commons public domain dedication waiver (http://creativecommons.org/ publicdomain/zero/ . /) applies to the data made available in this article, unless otherwise stated. background as the first african american to have earned a phd in evolutionary biology i have been concerned with this dis- parity for my entire career (graves ). it has been my experience that most non-african descended people in this field are woefully unaware of the dynamics that drive this historical disparity. 
in this sense, evolutionary biologists are not different from the majority of non-african descended persons in this country who have little to no training in or familiarity with the scholarly literature associated with the african american experience. thus in this commentary i intend to provide the reader with a brief description of the cultural experiences of persons of african ancestry in the united states and how these have played a role in maintaining their underrepresentation in evolutionary biology careers. this will be accomplished by also discussing the confluence between the history of evolutionary biology as a discipline and the social changes that allowed persons of african descent to pursue careers in higher education. the commentary continues by giving the reader a sense of the current state of underrepresentation within the field and will provide some perspectives concerning ongoing issues that are maintaining this situation. finally it will make recommendations concerning how evolutionary biologists might learn from anti-racist struggles that are going on in other sectors of our society to move towards a more diverse and inclusive discipline. the central premise of this commentary is that racism in america as it is manifested in higher education (specifically evolutionary biology) creates a culturally non-inclusive environment that systematically disadvantages persons of non-european descent. the form of this disadvantage differs by the sociocultural positioning of individuals. thus to change the patterns of underrepresentation within the discipline requires that the dominant social group (persons of european descent socially-defined as “white”) address and act on how their position of privilege is subordinating “others.” i will focus in this commentary on african americans, as this is the group whose history i know best.
in addi- tion, there is much overlap between the african ameri- can experience and that of afro-caribbean, afro-english, afro-canadians, and newly immigrated africans with regards to their experiences of racial subordination and/ or colonialism. some of these themes are present in the struggles of other non-europeans (latinax, american indian) in higher education. open access evolution: education and outreach *correspondence: gravesjl@ncat.edu joint school of nanoscience and nanoengineering, north carolina a&t state university, unc greensboro, greensboro, nc , usa http://orcid.org/ - - - https://ncsesdata.nsf.gov/doctoratework/ /html/sdr _dst_ .html http://creativecommons.org/licenses/by/ . / http://crossmark.crossref.org/dialog/?doi= . /s - - - &domain=pdf page of graves jr. evo edu outreach ( ) : it is important to understand the differences as well as the similarities of the experiences of persons of african descent. unfortunately, most non-african descended people tend to lump persons of african descent in the socially-defined category of “black.” for example, barack obama was widely hailed as the first “black” or african american president of the united states. this despite the fact that his father was of kenyan descent, and virtually no kenyans were ever transported to the western world via the trans-atlantic slave trade (rawley and behrendt ). barack obama also had a mother who was of european descent, thus it is only america’s social cus- tom of the “rule of hypodescent” or “one drop rule” that classifies him as “black” in the european american mind. the data from the trans-atlantic slave database shows that far more enslaved africans were transported to the caribbean than to north america. 
for example, according to the trans-atlantic slave trade database about times more enslaved africans were disembarked in the caribbean compared to north america during the slave trade (only about , enslaved africans compared to , , british caribbean and , , french caribbean were disembarked between and ce; emory center for digital scholarship ). far fewer enslaved africans were sent to europe during the slave trade (only ). thus, the afro-english and afro-french populations are mainly derived from later (post wwi) migrations to england and france, while afro-canadian populations are derived both from enslaved africans who escaped chattel slavery in the united states or fought with the british during the american revolutionary war and were granted freedom (unlike the u.s., the british generally honored their promise to grant freedom to those who fought for them), and from later migrations from the caribbean and africa to canada. these groups clearly have different histories, as well as cultural influences. for example, many african americans (such as myself) were raised in national baptist convention (nbc) or southern christian leadership (sclc) style church communities, while many afro-english and west africans (former british colonies) would have been raised in anglican union style churches, and the afro-french in primarily catholic churches. thus, in the same way that the cultural experiences of europeans from different countries would not be thought of as exactly the same, neither should persons of african descent be considered exactly the same. however, all african descended persons have some experience with the cultural construction of “blackness” and its many disadvantages in nations of majority european descended individuals, just as these same european descended individuals experience white privilege in primarily “white” societies (roediger ).
white privilege is associated with the fact that the united states was founded as a colonial/settler nation by western europeans. the roots of its english-speaking population began with the jamestown colony that imported its first enslaved africans in . thus, of the years since anthony and isabela (tucker) were disembarked in jamestown, years of those allowed persons of african descent to be owned as chattel, the following years were dominated by the jim crow system of nd class citizenship complete with organized state and private racial terror, and the years past jim crow (to this date) are the years in which the mass incarceration of persons of african and latino descent is considered normal in the united states (alexander ). i was born in jim crow and less than two generations have passed since its end. my birth certificate says “colored” under the category of race. my childhood memories include “whites only” signs on water fountains and bathrooms. i remember being denied service in restaurants. mine was the first generation of african americans to enter public school after the momentous brown v. board of education decision of .
main text
evolution as a discipline
during the same period in which african americans were fighting for a legal end to jim crow, evolutionary biology became a coherent discipline. this occurred between and (mayr ), with the founding of the society for the study of evolution (sse) occurring in (smocovitis ). this was right after the end of wwii, in which racial theories had been utilized to justify the slaughter of millions of people in both the european and pacific theaters of the war. what is not as well realized is that these theories had their origin in the west and that prominent evolutionary biologists and geneticists contributed to their rise (graves a).
worse still was that after the war nazi race scientists such as fritz lenz, hans gunther, and eugen fischer were “rehabilitated” by their american and english colleagues and continued to support the “scientific” principles of eugenics (graves a). however, evolutionary biologists also played an important role in debunking biological racism, beginning with people like th. dobzhansky who, along with leslie dunn, wrote the popular book heredity, race, and society, published in . richard lewontin’s classic study of genetic variation within and between the purported races of humans was an important contribution to anti-racism (lewontin ). stephen jay gould’s the mismeasure of man, first published in , is considered a major contribution to this cause. my own anti-racist work as an evolutionary biologist was deeply influenced by interactions with lewontin and gould. however, when the sse was founded, white supremacy was still a relatively unchallenged ideology in the united states. smocovitis ( ) provides a list of the founding members of the sse. many of the names one would expect were signatories of the founding documents (ernst mayr, th. dobzhansky, sewall wright, hampton carson, george gaylord simpson). however, none of the founding individuals were african americans or held faculty appointments at a historically black college or university (hbcu). at this time there were no african americans who held research positions at any of the nation’s major universities. the first african american to receive a phd in biology was alfred o. coffin. his degree was awarded by illinois wesleyan university in zoology in . his research interests seemed to be in anthropology and he spent his professional career teaching mathematics, romance languages, and anthropology at alcorn a&m (a historically black university). most historically black colleges and universities began after the civil war ended in .
cheyney university (pa) was the first hbcu and was founded in . two years before this, oberlin college (my alma mater) was the first historically white institution (hwi) to admit african americans. most of the hbcus were associated with christian denominations, such as the various baptist conventions, african methodist episcopal church (ame), united methodists, united church of christ, and some were supported by the catholic church (fleming ). of course, this is similar to the founding of the historically white colleges and universities (hwi). many of the first hwis were founded with money that came directly from the slave trade or the appropriation of land from the american indians (wilder ; harris et al. ). indeed, the development of medicine as an academic discipline in america was fueled by the unfettered access to the deceased bodies of african americans, irish, and american indians. medical experiments on living enslaved people were also more easily performed as enslaved people had no rights to their own bodies. the case of dr. james marion sims (an alabama slave holder and a founder of american gynecology) and his experiments on enslaved women is well documented (owens ). probably the most prominent african american biologist of the synthesis period, ernest everett just, died in . just was an embryologist trained at dartmouth college and is best remembered for his contributions in embryology as outlined in his book the biology of the cell surface, published in . however, despite just’s reputation as an outstanding scientist he was never allowed to hold an appointment at a premier research university in the united states. there is some indication that just was thinking about evolutionary problems, as before his death he was working on a paper entitled “ethics and the struggle for existence”, but he died before completing this manuscript (manning ).
a brief history of african american higher education
the growth of the modern american research university was associated with the passage of the morrill land grant act of . this was designed primarily as an engine to improve agricultural education as well as to “open college doors to farmer’s sons and others who lacked the means to attend the colleges then existing” (duemer ). however, the first morrill land grant primarily benefited persons of european descent, as after the civil war and reconstruction, rigid segregation of higher education was reestablished in the former confederate states. therefore in a second morrill land grant act was passed to provide for more equitable access to higher education in states that maintained segregated higher education (neyland and fahm ). the morrill act helped to bring into existence colleges such as tuskegee institute, florida a&m, and north carolina a&t. however, it is important to realize that the southern states never provided equitable support for the hbcus and that their original mission was not designed to fully educate african americans. in september of , booker t. washington gave his famous “atlanta compromise” speech before the cotton states and international exposition in atlanta, georgia. this was written to palliate a primarily european american audience. in this speech, washington offered the following guarantees to the southern power structure: african americans would not agitate for their constitutional right to vote, would not retaliate against racism, and would tolerate segregation and not resist discrimination. in return, the southern states would provide free vocational education to african americans. an addendum to the industrial educational model was that the hbcus would not provide liberal arts education to their students. thus schools like north carolina a&t really began as trade schools, not universities.
it is not hard to see how the washington (or tuskegee) model retarded the growth of african american intellectuals. however, by the turn of the twentieth century, other african american leaders such as w.e.b. du bois sharply criticized the tuskegee model: “unless the american negro today, led by trained university men of broad vision, sits down to work out by economics and mathematics, by physics and chemistry, by history and sociology, exactly how and where he is to earn a living and how he is to establish a reasonable life in the united states or elsewhere, unless this is done the university has missed its field and function and the american negro is doomed to be a suppressed and inferior caste in the united states for incalculable time.” w.e.b. du bois, the field and function of the negro college, . thus, for african americans to begin producing scholars in the sciences, two things had to happen. first, the dominance of the tuskegee model in the hbcu environment had to be eroded, and second, desegregation of hwis had to progress to the point where african americans could survive their institutionalized racism to achieve higher degrees. data suggest that african american scientists began to trickle into faculty appointments at major research universities beginning in the early s. albert wheeler was the first african american in the school of public health at the university of michigan (appointed ); james jay, microbiology, wayne state university, ; percival skinner, anthropology, columbia university in ; and george jones, molecular biology, university of michigan are examples. both jim jay (deceased ) and george jones had important influences on me as i struggled through graduate school at michigan and then wayne state. so far as i have been able to determine, i am the first african american to receive a phd in evolutionary biology (broadly defined). my degree was awarded in .
these facts concerning the pioneering years of african americans in the life sciences are not generally known by this generation of african americans entering evolutionary science careers. considering american history, these events should not be surprising. in , only % of “white” americans polled believed that “black” americans were on average as intelligent as whites. this number increased to a high of % in but has declined ever since (shuman et al. ). virtually every african american pioneer in science can tell horror stories associated with the “out of place” principle. as even the best trained human minds still reflexively stereotype, the “out of place” principle follows from stereotypes concerning what people believe about other people. as a graduate student at the university of michigan, i had doors slammed in my face while attempting to enter science buildings. the reasoning of the people slamming the doors was that i had no business in the museum of zoology on a weekend (as everyone knows, there are no blacks in evolutionary biology). or, during my assistant professor/associate professor years, students at the research- campuses at which i held my appointments assumed that i was a football or basketball coach. or, my favorite, the day that european american undergraduates approached the university provost asking that i be removed from teaching genetics due to my lack of qualifications. they considered me “unqualified” to teach genetics because i didn’t start the course with the material in chapter one of their textbook. this was the same day that the campus newspaper ran an article about my election as a fellow of the american association for the advancement of science (aaas) for my pioneering research into the genetics and physiology of aging!
a tipping point?
it is possible that was an inflection point for persons of african descent in evolutionary biology. shortly after my degree was awarded others followed (see table .)
yet by we have no evidence that the numbers of african americans have significantly increased in the field or are approaching equity (~ % of the us population identifies as african american, thus equitable numbers would be % african americans among professional evolutionary scientists.) however, given that only % of professional scientists are african american, for evolutionary science even achieving the % parity with other fields could be considered progress. however, the overall lack of progress in evolutionary science begs explanation. the first explanation proffered for the lack of progress generally goes: “african americans are not interested in evolution…” often this is associated with claims concerning either greater religiosity or “they are interested in going to medical school.” the greater religiosity of african americans has been well studied (chatters et al. ). in a pew center research survey, % of whites stated that they absolutely believed in god, while % stated they were fairly certain in the existence of god. these figures were % and % for blacks in this same survey. alternatively, % of whites stated that they did not believe in god, versus % of blacks (pew research center ).

table african american pioneers in evolutionary biology
name                    institution                                   year
joseph l. graves jr.    wayne state university
scott edwards           university of california (berkeley)
tyrone hayes            harvard university
collette st mary        university of california (santa barbara)
paul turner             michigan state university
charles richardson      indiana university
this may not be a comprehensive list, as the number of persons of african descent receiving phd’s in evolutionary biology or identifying themselves as evolutionary biologists began to increase in the s.

the figures for these questions are quite different for scientists.
over the last century, figures have held constant with ~ % of scientists surveyed believing in god, and ~ % not (larsen and witham ). i suspect that for evolutionary scientists the figures for the non-belief in god are higher than for general science professions. darwin’s agnosticism on the existence of god is a well-known feature of his life (desmond and moore ). jerry coyne’s position on the incompatibility of evolution and religion is one that i shared earlier in my career (coyne ). however, i have since recanted. such views certainly stand as an impediment to the successful recruitment of greater numbers of african american students to careers in evolutionary biology. for example, we found that the level of evolution acceptance was lower for african american students at north carolina a&t state university (ncatsu is an hbcu) than for national figures (bailey et al. ). however, more surprisingly, in this study we found that evolution knowledge was negatively correlated with evolution acceptance. studies of european american and combined race/ethnicity samples generally find that evolution acceptance is positively correlated with evolution knowledge (the more you understand evolution, the more you are likely to accept it as valid science). as high religiosity was negatively correlated with evolution acceptance in our study, we concluded that our students’ rejection of evolution was premised on their belief that evolution challenged their religious values. however, this need not stand as an impediment to the recruitment and retention of african americans (or other highly religious individuals) into science. i have found that most of my highly religious christian students have never really discussed the foundation of their theological views. as a confirmed episcopalian, i have learned how to conduct these conversations in ways that do not automatically shut down critical reasoning.
indeed, there is variation among christian denominations with regard to their willingness to accept evolution as compatible with their faith. in general, doctrinally conservative christians reject evolution (berkman and plutzer ). for example, the southern baptist convention (formed as the pro-segregation baptist church in the s) and the national baptist convention (predominately african american membership) both reject evolution as compatible with their faith; on the other hand, the catholic church accepts evolution as compatible with its faith (martin ). notably, there is also variation among the individuals who subscribe to major denominations concerning their acceptance of evolution. for example, among doctrinally conservative protestants surveyed from to and asked whether humans developed from earlier species of animals, % felt that this statement was definitely false or probably false, while % felt it was probably true or true. similar values were recorded for black protestants, % and % respectively; for mainline protestant denominations, the values were % and %; while for roman catholics, the values were % and % (berkman and plutzer ). thus while a given church’s official position is to accept or reject evolutionary science, individuals within denominations tend to make up their own minds concerning evolution. i have found that exposing my highly religious students to the fact that there is variation within christian thought concerning evolution helps them engage it critically without feeling that they are abandoning their faith.

the claim: “african american students are not interested in evolution because they want to go to medical school” is one of the most unfounded explanations for underrepresentation that i have ever heard. the actual data on applicants to us medical schools shows a very different picture (see fig. ).
the only group that seems to be more interested in applying to medical school compared to their percentage of the us population is asian americans. in our own survey (small) of highly motivated students who attended the annual biomedical conference for minority students (abrcms) and society for the advancement of chicanos and native americans (sacnas) in , we found that more african americans and latinos were interested in attending graduate school in biology than medical school (grad school biology: . %, % compared to medical school: %, % respectively). of those interested in graduate school, only %, % respectively were interested in evolution as a career (mead et al. ). this paper also demonstrated that, concerning graduate school interest, the presence of role models in the particular discipline was thought highly important for african americans and mexican americans, but not so much for puerto ricans.

fig. applicants to us medical schools, – by race/ethnicity. this figure shows the percent of each ethnic/racial group that applied to us medical schools compared to their percent of the us total population. asians were ~ four times more likely to apply to medical school compared to their percentage in the population; whites, blacks, and hispanics were less likely to apply compared to their percentage in the population. data from american association of medical colleges; these represent individuals who self-identified their ancestry in only one racial/ethnic category (https://www.aamc.org/data/facts/applicantmatriculant/). axes: percent vs. year; series: asian only, black only, white only, hispanic only.

role models again?
there has been considerable study of the significance of role models for underrepresented minority (urm) students in science (chemers et al. ). yet there is virtually no way, other than by chance alone, for a urm student to know that there are urm scientists in evolution. for example, very few universities have african american faculty members in departments of ecology/evolutionary biology. there are very few african american evolutionary biologists, other than me, whose appointments are at historically black universities (hbcus). indeed, when i first arrived at ncatsu in , the upper division evolution course was rarely taught. from conversations with faculty at other hbcu campuses i found that this was quite common.

as far as i know, there are few documentary films specifically addressing evolutionary biology that feature african american scientists. for example, i appeared in a segment of kcet (public television)’s series: life and times. my ten minutes of the episode were specifically focused on my evolution of aging work. later, in the documentary race: the power of an illusion, by california newsreel, i was interviewed along with two other prominent evolutionary biologists (richard lewontin, stephen jay gould), and in the film i was labeled as an “evolutionary biologist.” however, this film rarely gets shown in biology classrooms. in the documentary decoding watson, i am also identified as an evolutionary biologist. yet these films are exceptions.

evolutionary biology textbooks do not generally identify the race/ethnicity of those whose work is featured within. in some cases, race/ethnicity can be inferred from the person’s name, but this is generally not possible for african americans. searching the indexes of three popular evolution textbooks for african americans whose work could be featured in such texts, i found only one mention of scott edwards (no picture associated; bergstrom and dugatkin ; herron and freeman ; futuyma ).
some of my early life history work is displayed in figure . of stearns and medzhitov’s evolutionary medicine, published in . however, this is cited via a review paper, not by my publications (stearns and medzhitov ). there may be many other examples like this, in which the work of african american evolutionary biologists appears in textbooks, but the take-home message is that there is no way that a student could know that the contribution came from a urm scientist. so while we know that role models are important in urm student choices of careers, there is no evidence that significant numbers of african american students have any way of knowing that there are african americans who have made important contributions in evolutionary science. thus a useful tool that might help make progress in this regard is the production of materials (articles, books, profiles in textbooks, podcasts, social media, films, etc.) that highlight the contributions of urm scientists in evolution.

locally, the most important tool for providing your students role models is the hiring of african american (and other urm) scientists into faculty positions. while the numbers are still small, they have grown sufficiently so that with some intention departments can locate potential candidates. the key, however, is “intention.” intention usually is accompanied by a university commitment (with accompanying financial resources) dedicated to a diverse and inclusive faculty. thus, diversifying the faculty will not occur through “business as usual” techniques that are genuinely biased towards replicating the existing demography of the professoriate. examples of intentional hiring towards diversity require that you do some work to determine who is in the pipeline.
this can be achieved by attending professional meetings that are likely to attract urm graduate students, post-doctoral researchers, and faculty members, such as the annual biomedical research conference for minority students (abrcms) and the society for the advancement of chicanos and native americans in science (sacnas). it also means working to develop real relationships with historically black universities (hbcu’s), hispanic-serving institutions (hsi’s), american tribal colleges, and minority serving institutions (msi’s). knowing who is in the pipeline better allows you to write job descriptions in areas that are likely to draw the attention of “diverse” candidates.

becoming the anti-racist discipline

the title of this subsection is shamelessly borrowed from joseph barndt’s book “becoming the anti-racist church” (barndt ). i have found that discussing institutional racism with persons of european descent in america is sort of like sitting down in the dentist’s chair without anesthetic. in barndt’s case, he at least had the advantage of christianity’s core belief systems being aligned with anti-racist ideas in theory, if not in practice. however, this is not the case for the enterprise of science and its institutions (e.g. professional societies, university academic units, etc.). there is nothing in science that requires that it take a moral stand on any issue, although i will argue that we would be better people and scientists if we did take such stands. at the onset of this discussion i am going to make the claim that institutional racism is alive and well in the united states (and most of the western world). institutional racism can be found in all facets of american life. the american university has been in the main a tool of white supremacy, from its slave holding origins to the modern research university of the twenty-first century.
in the early days of the american university, the relationship between its scholarship and white supremacy was “owned” and unchallenged. over the course of the nation’s growth, this association is less “owned” and most faculty members within the academy would decry such a relationship. for example, in the course of my lifetime the character of america’s racism has changed. at the time of my birth, biological racism was the predominant mode of thinking within european american communities. biological racism posits both the existence of biological races and inherent inborn differences between them (graves a, b). biological racism in the united states was backed by law until the civil rights act of . some american scientists such as carleton coon played an active role in supporting biological racism, while others, such as dobzhansky, lewontin, and gould fought against it (graves a; jackson ).

however, in the latter portion of my life, biological racism has been supplanted by aversive/symbolic racism. aversive racism (color-blind) is an ideology that allows people of the dominant socially defined race to claim that racism is no longer the central factor determining the life chances of those of the subordinated race (in the united states, this is primarily dark-skinned individuals of african descent). this position argues that instead of the ongoing institutional and individual racism of american society, nonracial factors such as market dynamics, naturally occurring phenomena, and the cultural attitudes of racial/ethnic minorities themselves are the main causal factors of their social subordination (pearson et al. ). barndt found in his book that the european american audience he was writing to displayed more racism of the aversive than biological type.
although i know of no studies that explicitly examine the prevalence of aversive racism in scientists, let alone evolutionary scientists, there is no reason to believe that scientists differ in this trait from the rest of their university colleagues or from the non-african american community (scheurich and young ). if this is so, it can influence the way faculty members interact with urm students in ways that they do not recognize. for example, goff et al. showed that aversive racism (or the fear of engaging in aversive racism) reduced the willingness of persons of european descent to engage in conversation with persons who were not of european descent. another example of how this can negatively influence behavior is the recent study suggesting implicit bias against african americans in nih ro grant reviews (ginther et al. ). a study has recently been published demonstrating that stem faculty who believe that student ability is fixed show greater racial achievement disparity in their courses (canning et al. ).

in addition to this problem, evolutionary biologists have not done enough to address the teaching of the relationship between the concepts of race, racism, and human variation in the k- and university curriculum. in , lieberman et al. found that % of biology professors surveyed accepted that biological races existed in the human species. in , morning reviewed biology texts from between and and found that they routinely accepted the existence of biological races within our species, without explaining by what criteria these races were defined. donovan found that there was little evidence that high school biology texts challenged stereotypical racial beliefs. in contrast, herron and freeman’s th edition of evolutionary analysis ( ) does a very good (if not complete) job of addressing human evolution and its relationship to modern human diversity.
the problem here is that most students are exposed to the sort of instruction described by donovan ( ), and not enough are exposed to herron and freeman ( ). this is an opportunity that evolutionary biologists could exploit for reducing stereotypical beliefs among university students.

aversive racism is a comfortable belief in that it excuses an individual’s own subconscious racism by supplying an easy palliative (society at large or the victims themselves are responsible for their conditions). it also excuses those who benefit from aversive racism from any responsibility for taking any action to alleviate social subordination. aversive racists may decry the crude biological racism that they observe in their neighbors but never see racism within themselves. for example, a study of aversive racism demonstrated that individuals of european descent who endorsed barack obama for president were more likely to describe certain job types as more suitable for “whites” compared to “blacks” (effron et al. ). in general, aversive racism increased during the obama presidency, which may have accounted for the election of donald trump (crandall et al. ).

barndt in his book described the stages that persons of european descent must go through to get over their racism. he likened it to the way patients who are suffering from traumatic grief move towards healing.
1. denial
2. anger
3. bargaining
4. depression
5. acceptance

denial is just as it sounds: “racism is no longer a factor in determining life chances in american society”, or more relevant to science: “while racism might exist outside the academy, it does not play a role in how we evaluate candidates for admission to our graduate programs, or post-doctoral/faculty appointments”.
anger, the next stage of the process: “how dare you call me a racist!” or from the point of view of the university: “how dare you say that our policies maintain institutional racism!” my guess is that many of you reading this commentary are currently experiencing stage or . bargaining: “well, isn’t it true that white people also had to struggle to make it in america?” or in the academy: “our asian students come from just as deprived backgrounds as african american students, why are they doing so well?” depression: “okay, i admit that i have racist tendencies, i can’t help being a bad person.” or in the academy: “i understand that institutional racism is an issue here, but it’s just so entrenched and so big i can’t do anything about it.” finally, acceptance: “okay, i get it now, there are some things i can do to reduce racism in my community.” or in the academy: “i get it, confederate statues are harmful to my african american and other students. i am going to do everything i can to get them removed from this campus!”

conclusions

in , the eminent african american scholar, w.e.b. dubois wrote that the problem of the twentieth century was the color line (dubois ). well into the twenty-first century the color line is still a prominent problem in american social life. the way forward requires that persons of european descent recognize their unearned white privilege. sociologists have demonstrated that white privilege exists in buying and selling a house, neighborhood locations, getting a job, advancement within a job, securing a first class education, and seeking and receiving the best medical care. ironically, this may be extremely difficult for scientists to believe. in my experience, most non-african american scientists have little to no understanding of the role that institutional racism has played in structuring social opportunity in the united states (denial).
this is despite the fact that there is a voluminous scholarly literature on this subject (desmond and emirbayer ). this situation is made even more complex by the growing numbers of scientists holding academic appointments whose cultural origins are from outside of the united states (e.g. east asia, middle east, india), who also have no formal training in american history and also bring racist/caste prejudices associated with skin color with them to the united states (dikotter ). thus, for us to make real progress within the academy, it is primarily academicians of european descent who must recognize how white privilege operates in their institution and then commit themselves to acting to eliminate it (acceptance).

to their credit, the sse, american society of naturalists (asn), and society of systematic biologists (ssb) have begun to recognize this as an issue. for example, the three societies recently adopted an anti-harassment policy (sse safe meeting website ). included in the harassment policy is racial discrimination. this of course is limited in that it only applies to the behavior of individuals at scientific meetings. in addition, a new listserv has been initiated to track individuals who wish to self-identify as a member of a “diverse” category (diversify eeb website ). however, i would argue that the diversity/inclusion efforts of the nsf science technology center, biocomputational evolution in action (beacon) stand as one of the best models of how we may make real progress towards meaningful racial/ethnic demographic change within evolutionary science as a discipline (beacon website ). this was made possible, in part, by the fact that senior african american and latino scientists were included in the leadership of beacon from the start. as a science technology center, beacon provided funding to teams of investigators to develop preliminary data to pursue larger research grants.
each budget request was evaluated on eight criteria, including how the research activity contributed to the diversity goals of beacon. as a member of beacon’s executive committee i can state that projects that did not address the diversity criterion were not scored as highly as those that did. furthermore, beacon supported its diversity mission in visible ways such as a paid staff position that was charged with the oversight of its diversity efforts. this is the kind of commitment that makes it clear to all involved in the work that your organization is committed to making progress in its diversity inclusion mission. beacon’s stated diversity goal was to exceed national norms for diversity at all levels of the center. in its report to the national science foundation (https://www .beacon-center.org/wp-content/uploads/ / /beacon- -annual-report_for-web.pdf) it showed % of its participants as “black”, and % hispanic/latino. the total of all individuals reporting as urms was %, exceeding the national norm by . % (brown clarke and pierre ). the nsf national norms are derived from all biological science subdisciplines, not just evolution. at present we do not have reliable data concerning the numbers of self-identified african americans participating in evolutionary biology (phd, graduate and undergraduate research students). gathering this data by subdiscipline would be helpful for developing better strategies for intervention.
one of the most interesting things that barndt found in his study of diversity/inclusion in the church was that congregations that had achieved the most in this regard had visible and active participation from historically underserved minorities in their leadership. this mirrors my own experience with diversity/inclusion programs in science. however, the accomplishments we made in beacon would have been impossible if the rest of the senior leadership had not bought into the necessity of including diversity/inclusion as part of our core mission.

in the case of beacon, the capacity of the senior leadership to buy in may simply have resulted from the character of those individuals. in other words, leadership buy-in is not something that one can guarantee will happen. i have often found that it is necessary to cultivate that buy-in. often, education and training are required. for example, as co-pis in a diversity/inclusion training grant i am involved in, we all committed to attending an intensive racial equity and inclusion training workshop (racial equity institute website ). thus investigators who are really committed to changing the demography of this field need to invest time in training themselves to help accomplish this. university administrators who are invested in seeing the demography of this field change must be willing to reward the efforts of faculty members who put their efforts into this work. this is crucial, in that oftentimes urm and women faculty take on a disproportionate amount of this duty, and they should be rewarded for this work in promotion and tenure decisions.

if this profession is really serious about increasing the participation of african americans in this field, it must first examine its own cultural and implicit biases. in this regard, other subdisciplines within the biological sciences have done a better job.
to get a sense of the direction in which we should be moving, i suggest that we model the accomplishments of the biomedically focused national research mentor network (https://nrmnet.net/#undergradpopup). i specifically point out the efforts of the enhancing the diversity of the nih-funded workforce group (https://www.nigms.nih.gov/training/dpc).

finally, i have often explained to my european american colleagues across my career that i love this work, but i do not love that you want me to become “you” to do it. by definition, african- and european americans have different social and cultural experiences. real progress will be made towards diversity and inclusion in evolutionary science when kowtowing to eurocentrism is no longer the criterion for participation in it.

abbreviations
abrcms: annual biomedical conference for minority students; beacon: biocomputational evolution in action; hbcu: historically black colleges and universities; hsi: hispanic serving institutions; hwi: historically white institutions; msi: minority serving institutions; sacnas: society for the advancement of chicanos and native americans in science.

acknowledgements
i would like to thank the beacon executive board for helpful discussions concerning these issues over the last years. i would also like to thank the anonymous reviewers for helpful comments concerning the manuscript.

authors’ contributions
manuscript completely written by jlg. the author read and approved the final manuscript.

funding
this work was funded in part by beacon: an nsf center for the study of evolution in action (national science foundation cooperative agreement no. dbi- ).

availability of data and materials
not applicable.

competing interests
the author declares no competing interests.

received: june accepted: october

references
alexander m. the new jim crow: mass incarceration in the age of colorblindness. new york: the new press; .
bailey gl, han j, wright dc, graves jl. religiously expressed fatalism and the perceived need for science and scientific process to empower agency. sci soc. ; ( ): – .
barndt j. becoming the anti-racist church: journeying towards wholeness. minneapolis: fortress press; .
beacon diversity. . https://www .beacon-center.org/diversity-across-beacon/. accessed june .
bergstrom ct, dugatkin la. evolution. nd ed. new york: norton & co.; .
berkman m, plutzer e. evolution, creationism, and the battle to control america’s classrooms. new york: cambridge university press; .
brown clarke j, pierre p. beacon: using diversity as an evolutionary tool for a high-performing science and technology center. in: banzhaf et al., editor. evolution in action: past, present, and future. cham: springer international publishing, ag. . (in press).
canning ea, muenks k, green dj, murphy mc. stem faculty who believe ability is fixed have larger racial achievement gaps and inspire less student motivation in their classes. sci adv. ; ( ):eaau .
chatters lm, taylor rj, bullard km, jackson js. race and ethnic differences in religious involvement: african americans, caribbean blacks and non-hispanic whites. ethn racial stud. ; ( ): – . https://doi.org/ . / .
chemers mm, zurbriggen el, syed m, goza b, bearman s. the role of efficacy and identity in science career commitment among underrepresented minority students. j soc issues. ; : – .
coyne ja. science, evolution, and society: the problem of evolution in america. evolution. ; ( ): . https://doi.org/ . /j. - . . .x.
crandall cs, miller jm, white mh. changing norms following the u.s. presidential election: the trump effect on prejudice. soc psych personality sci. . https://doi.org/ . / .
desmond a, moore j. darwin: the life of a tormented evolutionist. in: never an atheist. new york: norton co.; . p. – .
desmond m, emirbayer m. racial domination, racial progress: the sociology of race in america. new york: mcgraw-hill; .
dikotter f. the discourse of race in modern china. nd ed. new york: oxford university press; .
diversify eeb. https://diversifyeeb.com/. accessed june .
donovan bm. reclaiming race as a topic of the u.s. biology textbook curriculum. sci educ. ; ( ): – .
dubois web. the souls of black folk: essays and sketches. chicago: a.c. mcclurg; .
duemer ls. the agricultural origins of the morrill land grant act of . am educ hist j. ; ( & ): – .
effron da, cameron js, monin b. endorsing obama licenses favoring whites. j exp soc psychol. ; : – .
emory center for digital scholarship, slave voyages: the trans-atlantic slave trade database. https://www.slavevoyages.org/. accessed june .
fleming jt. three forces that shaped african american history. divers issues high educ. ; ( ): .
futuyma dj. evolutionary biology. sunderland: sinauer and associates; .
ginther dk, haak ll, schaffer wt, kington r. are race, ethnicity, and medical school affiliation associated with nih r type award probability for physician investigators? acad med. ; ( ): – . https://doi.org/ . /acm. b e d b.
goff pa, steele cm, davies pg. the space between us: stereotype threat and distance in interracial contexts. j personal soc psychol. ; ( ): – . https://doi.org/ . / - . . . .
graves jl. the emperor’s new clothes: biological theories of race at the millennium. new brunswick, nj: rutgers university press; a. p. – .
graves jl. the race myth: why we pretend race exists in america. new york: dutton books; b.
graves jl. science in the belly of the beast: a look back at my career in the academy. in: farmer vl, shepherd-wynn e, editors. voices of historical and contemporary black american pioneers. westport: praeger publishers; .
harris lm, campbell jt, brophy al, editors. slavery and the university: histories and legacies. athens: u. georgia press; .
herron jc, freeman s. evolutionary analysis. new york: freeman; .
jackson jp. “in ways unacademical”: the reception of carleton s. coon’s the origin of races. j hist biol. ; : – .
larsen ej, witham l. scientists and religion in america. sci am. ; ( ): – .
lewontin rc. the apportionment of human diversity. evol biol. ; : – .
manning k. black apollo of science: the life of ernest everett just. new york: oxford university press; .
martin jw. compatibility of major u.s. christian denominations with evolution. evol educ outreach. ; : . https://doi.org/ . /s - - - .
mayr e. the growth of biological thought: diversity, evolution, and inheritance. cambridge, ma: harvard university press; .
mead ls, forcino fl, brown clarke j, graves jl. factors influencing the career pursuit of underrepresented minorities with an interest in biology. evol educ outreach. ; : . https://doi.org/ . /s - - - .
national science foundation. survey of doctorate recipients, survey year . . https://ncsesdata.nsf.gov/doctoratework/ /html/sdr _dst_ .html. accessed june .
neyland lw, fahm eg. historically black land-grant institutions and the development of agriculture and home economics, – . tallahassee, fl: florida a&m university foundation; .
owens dc. medical bondage: race, gender, and the origins of american gynecology. athens: u. georgia press; .
pearson ar, dovidio jf, gaertner sl. the nature of contemporary prejudice: insights from aversive racism. soc personal psychol compass. ; ( ): – .
pew research center. religion in public life. . https://www.pewforum.org/religious-landscape-study/compare/belief-in-god/by/racial-and-ethnic-composition/. accessed june .
racial equity institute. . https://www.racialequityinstitute.com/. accessed june .
rawley ja, behrendt sd. the transatlantic slave trade. lincoln, ne: university of nebraska press; .
roediger d. working toward whiteness: how america’s immigrants became white: the strange journey from ellis island to the suburbs. new york: basic books; .
scheurich jj, young md. white racism among white faculty: from critical understanding to anti-racist activism. in: white wa, altbach pg, lomotey k, editors. the racial crisis in american higher education (revised edition): continuing challenges for the twenty first century. albany: the state university of new york press; .
schuman h, steeh c, bobo l. racial attitudes in america: trends and interpretations. cambridge: harvard university press; .
smocovitis vb. organizing evolution: founding the society for the study of evolution ( – ). j hist biol. ; ( ): – .
society for the study of evolution, safe meeting website. . https://www.evolutionmeetings.org/safe-evolution.html. accessed june .
stearns sc, medzhitov r. evolutionary medicine. sunderland, ma: sinauer associates; .
wilder cs. ebony and ivy: race, slavery, and the troubled history of america’s universities. new york: bloomsbury press; .

publisher’s note
springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
science magazine december • vol issue sciencemag.org
insights | policy forum
illustration: davide bonazzi/@salzmanart

by victoria stodden, marcia mcnutt, david h. bailey, ewa deelman, yolanda gil, brooks hanson, michael a. heroux, john p.a. ioannidis, michela taufer

over the past two decades, computational methods have radically changed the ability of researchers from all areas of scholarship to process and analyze data and to simulate complex systems. but with these advances come challenges that are contributing to broader concerns over irreproducibility in the scholarly literature, among them the lack of transparency in disclosure of computational methods. current reporting methods are often uneven, incomplete, and still evolving. we present a novel set of reproducibility enhancement principles (rep) targeting disclosure challenges involving computation. these recommendations, which build upon more general proposals from the transparency and openness promotion (top) guidelines ( ) and recommendations for field data ( ), emerged from workshop discussions among funding agencies, publishers and journal editors, industry participants, and researchers representing a broad range of domains. although some of these actions may be aspirational, we believe it is important to recognize and move toward ameliorating irreproducibility in computational research.

access to the computational steps taken to process data and generate findings is as important as access to data themselves. computational steps can include information that details the treatment of outliers and missing values or gives the full set of model parameters used. unfortunately, reporting of and access to such information is not routine in the scholarly literature ( ). although independent reimplementation of an experiment can provide important scientific evidence regarding a discovery and is a practice we wish to encourage, access to the underlying software and data is key to understanding how computational results were derived and to reconciling any differences that might arise between independent replications ( ). we thus focus on the ability to rerun the same computational steps on the same data the original authors used as a minimum dissemination standard ( , ), which includes workflow information that explains what raw data and intermediate results are input to which computations ( ). access to the data and code that underlie discoveries can also enable downstream scientific contributions, such as meta-analyses, reuse, and other efforts that include results from multiple studies.

recommendations

share data, software, workflows, and details of the computational environment that generate published findings in open trusted repositories.
the minimal components that enable independent regeneration of computational results are the data, the computational steps that produced the findings, and the workflow describing how to generate the results using the data and code, including parameter settings, random number seeds, make files, or function invocation sequences ( , ).

often the only clean path to the results is presented in a publication, even though many paths may have been explored. to minimize potential bias in reporting, we recommend that negative results and the relevant spectrum of explored paths be reported. this places results in better context, provides a sense of potential multiple comparisons in the analyses, and saves time and effort for other researchers who might otherwise explore already traversed, unfruitful paths.

persistent links should appear in the published article and include a permanent identifier for data, code, and digital artifacts upon which the results depend. data and code underlying discoveries must be discoverable from the related publication, accessible, and reusable. a unique identifier should be assigned for each artifact by the article publisher or repository. we recommend digital object identifiers (dois) so that it is possible to discover related data sets and code through the doi structure itself, for example, using a hierarchical schema. we advocate sharing digital scholarly objects in open trusted repositories that are crawled by search engines. sufficient metadata should be provided for someone in the field to use the shared digital scholarly objects without resorting to contacting the original authors (i.e., http://bit.ly/ fvwjph). software metadata should include, at a minimum, the title, authors, version, language, license, uniform resource identifier/doi, software description (including purpose, inputs, outputs, dependencies), and execution requirements.
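as a rough illustration of the minimum software metadata listed above, the sketch below stores those fields in a small python record and reports which ones are still empty. the class name, the field names, and the helper missing_fields are assumptions made for this example, not part of any repository or journal standard.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SoftwareMetadata:
    """Hypothetical record holding the minimum metadata named in the text."""
    title: str = ""
    authors: list[str] = field(default_factory=list)
    version: str = ""
    language: str = ""
    license: str = ""
    identifier: str = ""              # uniform resource identifier or doi
    purpose: str = ""                 # software description: purpose
    inputs: str = ""                  # software description: inputs
    outputs: str = ""                 # software description: outputs
    dependencies: str = ""            # software description: dependencies
    execution_requirements: str = ""


def missing_fields(record: SoftwareMetadata) -> list[str]:
    """Return the names of all fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


if __name__ == "__main__":
    # Placeholder values only; they describe no real software.
    record = SoftwareMetadata(
        title="example-analysis",
        authors=["a. author"],
        version="0.1.0",
        language="python",
        license="mit",
    )
    print(missing_fields(record))
```

a check like this could run before deposit, so that a record lacking, say, its identifier or execution requirements is caught while the authors can still supply them.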
to enable credit for shared digital scholarly objects, citation should be standard practice. all data, code, and workflows, including software written by the authors, should be cited in the references section ( ). we suggest that software citation include software version information and its unique identifier in addition to other common aspects of citation.

to facilitate reuse, adequately document digital scholarly artifacts. software and data should include adequate levels of documentation to enable independent reuse by someone skilled in the field. best practice suggests that software include a test suite that exercises the functionality of the software ( ).

use open licensing when publishing digital scholarly objects. intellectual property laws typically require permission from the authors for artifact reuse or reproduction. as author-generated code and workflows fall under copyright, and data may as well, we recommend using the reproducible research standard (rrs) to maximize utility to the community and to enable verification of findings ( ). the rrs recommends attribution-only licensing, e.g., the mit license or the modified berkeley software distribution (bsd) license for software and workflows; the creative commons attribution (cc-by) license for media; and public domain dedication for data. the rrs and principles of open licensing should be clearly explained to authors by journals, to ensure long-term open access to digital scholarly artifacts.

reproducibility
enhancing reproducibility for computational methods
data, code, and workflows should be available and cited

university of illinois at urbana-champaign, champaign, il , usa. national academy of sciences, washington, dc , usa. university of california, davis, ca , usa. university of southern california, los angeles, ca , usa. american geophysical union, washington, dc , usa. sandia national laboratories, avon, mn , usa. stanford university, stanford, ca , usa. university of delaware, newark, de , usa. email: vcs@stodden.net

journals should conduct a reproducibility check as part of the publication process and should enact the top standards at level or . such a check asks whether the data, code, and computational steps upon which findings depend are available in an open trusted repository in a discoverable and persistent way, with links provided in the publication. and have all digital artifacts been openly licensed? is documentation and workflow information available for a reader to follow the discovery process? are all digital scholarly objects used in the discovery process cited in the manuscript’s reference section? could the published computational findings be reproduced on an independent system by using the data and code provided?

the last item is arguably the most time-consuming for reviewers and difficult to carry out, and many journals may choose not to adopt it or may perform partial reproduction for only some of the computational findings. the journal article should specify which of these items have been checked and, if so, whether they are fully or partially fulfilled. journals should strive to enact level or of the top standards on “data transparency” and “analytic methods (code) transparency.” level recommends an independent reproduction of findings. some journals are already taking steps in this direction ( , ).

to better enable reproducibility across the scientific enterprise, funding agencies should instigate new research programs and pilot studies. resolving some barriers to reproducibility may be straightforward; however, others may take time and community effort to overcome.
we recommend enacting research programs to advance our understanding of reproducibility in computationally enabled research. topics might include methods for verifying queries on confidential data; extending validation, verification, and uncertainty quantification to encompass reproducibility; numerical reproducibility and sensitivity to small variations in computation ( ); testing standards for code, including closed or proprietary codes; cyberinfrastructure that supports reproducibility, as well as innovative computational work; pilot efforts to create “instruction manuals” for manuscript submission (e.g., http://libguides.caltech.edu/authorcarpentry); policy research on intellectual property law and software patenting; costs and benefits to reproducibility in different settings, for example, in industry collaboration; provenance and workflow repositories; and exploring how to make investments regarding the preservation of various digital artifacts. funding bodies could support efforts to reproduce results in different computational settings to better understand sources of error in computational findings.

barriers, exceptions, ongoing efforts

we recognize that there are challenges to the implementation of these recommendations. there will necessarily be exceptions in the near term and possibly indefinitely, for example, analysis and data involving human subjects or proprietary codes. however, we believe that creative ways to manage exceptions could be developed in such cases and that exceptions should be explained in the article. for example, if data or code cannot be made publicly accessible, the research team or journals could have infrastructure, policies, and procedures in place for rapidly giving reviewers access to information necessary to perform a review ( , ).

it may not be possible to fully disclose, or even license, all proprietary software used in the discovery pipeline.
however, scripts designed to be executed by proprietary software such as matlab may be openly licensed by the script authors under the rrs. we also feel there are broad benefits to code release, for example, allowing for inspection, even if the code cannot be executed ( ).

beyond the reproducibility check described above, journals can improve review of computational findings by rewarding reviewers who take extra effort to verify computational findings. authors that facilitate such a review could be rewarded with badging of their published article (e.g., http://bit.ly/badging gp). best practices for reviewers of reproducible publications need to be formulated. funding agencies may encourage, request, and reward reproducible research practices in the scientific investigations that they review and fund.

appropriate methodology to facilitate reproducibility should be taught to students who will use computational techniques in research. best practices of digital scholarship should be required and incorporated into curricula and should include discussions of ethics, use of repositories, and version control, for example. key societies or communities should consider short courses, best practices publications, and awards to promote these skills. groups or research areas with limited experience in reproducible research practices could focus initially on a few seminal articles to demonstrate and promote reproducibility.

we believe that as these efforts become commonplace, practices and tools will continue to emerge that reduce the amount of time and resource investment necessary to facilitate reproducibility and support increasingly ambitious computational research.

references and notes
b. a. nosek et al., science , ( ).
m. mcnutt et al., science , ( ).
a. a. alsheikh-ali et al., plos one , e ( ).
d. donoho et al., ieee comput. sci. eng. , ( ).
v. stodden, ims bull. online, november ( ); http://bit.ly/bullimstat .
d. h. bailey, j. m. borwein, v. stodden, notices amer. math. soc. ( ), ( ).
d. garijo et al., plos one , e ( ).
d. donoho, v. stodden, in the princeton companion to applied mathematics, n. j. higham, ed. (princeton univ. press, princeton, nj, ), pp. – .
r. gentleman, d. temple lang, j. comput. graph. stat. , ( ).
v. stodden, s. miguez, j. open res. softw. , e ( ).
v. stodden, comput. sci. eng. , ( ).
v. stodden, p. guo, z. ma, plos one , e ( ).
m. heroux, acm trans. math. softw. ( ), art ( ).
d. h. bailey, j. m. borwein, v. stodden, in reproducibility: principles, problems, practices, h. atmanspacher and s. maasen, eds. (wiley, new york, ), pp. – .
m. fuentes, amstat news, july ; http://bit.ly/ jasa gb.
r. j. leveque, siam news , april .

acknowledgments
these recommendations emerged from a workshop held at the american association for the advancement of science (aaas), washington, dc, and february , funded by the laura and john arnold foundation (http://bit.ly/aaas arnold). workshop participants are identified in the supplementary materials.

supplementary materials
www.sciencemag.org/content/ / / /suppl/dc
doi: . /science.aah

this is a repository copy of ‘proper’ pro-nun-ſha-ſhun in eighteenth-century english: ecep as a new tool for the study of historical phonology and dialectology.
white rose research online url for this paper: http://eprints.whiterose.ac.uk/ /
version: accepted version
article: yanez-bouza, n., beal, j., sen, r. orcid.org/ - - - et al. ( more author) ( ) ‘proper’ pro-nun-ſha-ſhun in eighteenth-century english: ecep as a new tool for the study of historical phonology and dialectology. digital scholarship in the humanities. issn -
https://doi.org/ . /llc/fqx
this is a pre-copyedited, author-produced version of an article accepted for publication in digital scholarship in the humanities following peer review. the version of record yáñez-bouza, n.
et al.; ‘proper’ pro-nun-ſha-ſhun in eighteenth-century english: ecep as a new tool for the study of historical phonology and dialectology, digital scholarship in the humanities, fqx is available online at: https://doi.org/ . /llc/fqx

‘proper’ pro-nun-ſha-ſhun in eighteenth-century english: ecep as a new tool for the study of historical phonology and dialectology

introduction

in recent years english historical linguists have voiced complaints about the scholarly neglect of the late modern english period ( – ). while grammar and the prescriptive grammatical tradition have received increasing attention over the last couple of decades (beal et al., ; tieken-boon van ostade, ), there is still relatively little research on the phonology of late modern english, and of the eighteenth century in particular; as beal ( : ) points out, ‘[w]here interest is shown in the eighteenth century, phonology is neglected, and where interest is shown in the history of english phonology, the eighteenth century is neglected’. there remains an urgent need for new studies of historical phonology in general and of eighteenth-century phonology in particular.
one reason for this lack of research could be the idiosyncratic notation systems used by eighteenth-century authors, which make it difficult to search and interpret phonological evidence. yet the value of pronouncing dictionaries as rich and reliable evidence of lexical diffusion as well as of sound variation and change in eighteenth-century pronunciation has been observed in studies such as beal ( ) and jones ( ). with this in mind, we have constructed a new electronic, searchable database of eighteenth-century english phonology (ecep). the purpose of this paper is to present ecep as a tool to facilitate research on the social, regional and lexical distribution of phonological variants in eighteenth-century english, thereby meeting the demands of the growing research community in late modern english generally (mugglestone, ; hickey, ) and in historical phonology in particular (honeybone and salmons, ).

taking wells’ (1982: – ) lexical sets for comparing the vowel systems of present-day varieties of english as its reference, the database provides unicode ipa transcriptions for the relevant segment of each word given as an example of its lexical set or subset in wells’ account of standard lexical sets, as documented in eighteenth-century pronouncing dictionaries (e.g. thomas sheridan’s a general dictionary of the english language, ). wells’ subsets provide important points of comparison where varieties of english differ as to the distribution of variants within the lexical set. for example, whilst the lexical set bath is defined as ‘comprising those words whose citation form contains the stressed vowel /æ/ in genam, but /ɑː/ in rp’ (wells : ), subset ( b) consists of words that are sometimes pronounced with /æ/ ‘in accents which otherwise have broad bath (australian, west indian)’ ( : – ), whilst those in subset ( c) ‘typically have the palm vowel in the otherwise flat-bath accents of the north of england’ ( : ).
we have retained the subsets in the structure of ecep in order to determine whether these major distinctions between accent systems in present-day english have precursors in the variation present across our eighteenth-century data sources. we have included all of the examples provided by wells which occur in these sources in order to reveal any further patterns of lexical distribution. since wells’ lexical sets are designed only to investigate vowel systems, and some consonantal changes are also of interest in the study of late modern english phonology, we have constructed consonantal sets of phonological interest and extracted relevant data in the same way as for wells’ vowel sets.

[note on the title: sheridan’s (1780) transcription (diacritics omitted) for the word ‘pronunciation’, showing palatalized /ʃ/ instead of /si/ at the start of the third and fourth syllables.]

this paper describes the structure and contents of the ecep database, and reports on the three-step method of compilation: (i) the selection of primary sources (section ); (ii) the process of data input and annotation (section ); and (iii) the design of the web-based interface (section ). the context for the development of this new tool is provided in section , which gives an overview of the phonology of eighteenth-century english and of the value of pronouncing dictionaries as evidence for eighteenth-century pronunciation. section describes two pilot case studies which demonstrate the value of ecep for the study of english historical phonology.

background

the phonology of eighteenth-century english

since charles jones described the eighteenth and nineteenth centuries as the ‘cinderellas of english historical linguistic study’ (1989: 27 ), there has been considerable progress in late modern english studies.
much of this progress has been made possible by the availability of corpora such as archer (a representative corpus of historical english registers), which have enabled searches across large datasets for the complex patterns of variation and change which characterize this period. however, despite the monographs by beal ( ) and jones ( ), research on the phonology of this period has been less prolific than that in other areas such as morphosyntax, pragmatics and language ideology. one reason for this relative neglect of eighteenth-century phonology is the lack of accessible primary source material: as argued by beal ( b), the corpus revolution which has energised other areas of late modern english studies has so far had little effect on phonology. the ecep project aims to redress this.

some scholars have actually suggested that the phonology of eighteenth- and nineteenth-century english is not worthy of their attention. bloomfield and newmark ( : ), for example, state that changes in the language between the eighteenth century and the present day are ‘due to matters of style and rhetoric [...] rather than to differences in phonology, grammar or vocabulary’. bloomfield and newmark go on to state that historical linguists are less interested in style and rhetoric, a statement which no longer rings true given the recent development of historical pragmatics and historical sociolinguistics.
strang notes that ‘some short histories of english give the impression that change in pronunciation stopped dead in the eighteenth c[entury], a development which would be quite inexplicable for a language in everyday use’ (1970: 78). macmahon ( ), after summarising the views of earlier scholars who claimed that there had been little change in this period, states that ‘there is other evidence to show that the pronunciation of english more than years ago was noticeably different, for reasons mainly of phonotactics (structure and lexical incidence) from what it is today’ ( : , original italics). whilst both strang and macmahon assert that changes in pronunciation have taken place since , both make the point that these more recent changes are less systemic than those occurring in earlier periods. beal argues that this opposition is to some extent ‘an illusion created by the different types of evidence available for the earlier and later periods’ and goes on to invoke the saying ‘can’t see the wood for the trees’: in her opinion, ‘[i]t is a matter of perspective: at a distance, a forest appears as a monolithic block, but, the closer you get to the forest, the more you notice the variation between individual trees’ (beal, : ). not only are we closer in time to the eighteenth century than to the middle or early modern english periods, but the amount of detailed information on the pronunciation of more recent english makes us more aware of the range of variation. moreover, some of the changes occurring in this period are still ongoing and/or are reflected in variation between varieties of english today.

[note: late modern english is generally agreed to cover what historians would term the ‘long’ eighteenth and nineteenth centuries. see beal ( ; a), tieken-boon van ostade ( ) for more detailed definitions.]
examples of these changes are the distribution of ‘long’ /ɑː/, /ɔː/ and ‘short’ /a/, /ɒ/ variants in the bath and cloth lexical sets respectively (see beal and condorelli, 2014 for an account of the latter); the ‘north-south divide’ in the absence or presence of the phoneme /ʌ/ (beal, c); and the ongoing palatalization of alveolar consonants preceding earlier /juː/ (see section below). since many of the phonological changes taking place in the eighteenth century involve shifts in lexical incidence, sources of evidence used to investigate these changes need to be lexically rich. as we shall see in the next section, the sources chosen for inclusion in ecep are ideal for these purposes, as they provide evidence for the entire lexicon.

phonology sources in ecep

written evidence for the historical pronunciation of english can be divided into direct and indirect types. evidence that is indirect involves sources whose authors were not overtly commenting on or describing pronunciation, but which give clues about it. typical sources of indirect evidence are rhymes, puns and non-standard spellings. direct evidence, on the other hand, comes from authors who deliberately set out to describe (or prescribe) the pronunciation of their time. in reconstructing the pronunciation of earlier periods of english, we have to rely mainly on indirect evidence, but from the sixteenth century onwards, direct evidence becomes increasingly available as interest in spelling reform and in phonetics increases. texts such as christopher cooper’s ( ) the english teacher provide detailed and sophisticated descriptions of the sounds of english, lists of homophones and near-homophones and even metalinguistic comments on the social and/or geographic distribution of variants, but exemplify their descriptions with a very restricted number of lexical tokens.
however, from the middle of the eighteenth century dictionaries are published in which the pronunciation of every word is described, and these provide the source material for ecep.

to illustrate the quantitative difference between orthoepistic works such as cooper’s ( ) and eighteenth-century pronouncing dictionaries, let us consider the distribution of long and short variants of me ă in the bath and start sets. cooper provides important early evidence for the lengthening of the vowel in these sets, which allows us to identify some of the phonetic environments in which the change first occurs. first he tells us of ‘the vowel a lingual’ that ‘in these can, pass by, a is short; in cast, past, for passed, it is long’ (1687: 4). then he goes on to discuss the contexts in which the vowel ‘is pronounced long in its own sound’ (that is, /aː/ rather than /eː/), these being ‘before nch and s when another consonant follows, and before r unless sh follows’ (1687: 34). cooper provides a list of words in which this lengthened a occurs: barge, blast, carking, carp, cast, dart, flasket, gasp, grant, lance, mask, path, tart. these words have been chosen to provide the same pre-vocalic environments as words which exemplify ‘a short’ (/a/) and ‘a slender’ (/eː/): thus bar with short /a/ is contrasted with barge pronounced with /aː/ and bare with /eː/. from cooper’s evidence we can piece together an account of the environments in which early lengthening occurs, but we have no way of knowing whether the examples chosen represent all the words in which orthographic <a> occurs in the given environments. for instance, cooper provides path as an example of a word with /aː/, but does not specify whether the same vowel would be used in other words with this post-vocalic environment, such as bath, lath, etc. by contrast, ecep contains words from the bath set and words from the start set.
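the environments quoted from cooper can be read as a rough spelling-based rule. the sketch below is one such reading in python, written as a regular expression over orthographic <a>. the pattern, the function name predicted_long, and the treatment of ‘another consonant’ as any consonant letter other than s are assumptions made for illustration, not cooper’s own formulation.

```python
import re

# Literal, spelling-based reading of the environments quoted from Cooper:
#   "before nch", "before s when another consonant follows",
#   "before r unless sh follows".
# Assumption: "another consonant" means a consonant letter other than s.
LONG_A = re.compile(r"a(?:nch|s[b-df-hj-np-rtv-z]|r(?!sh))")


def predicted_long(word: str) -> bool:
    """True if the spelling has <a> in one of the quoted environments."""
    return LONG_A.search(word.lower()) is not None


# The word list given in the text for lengthened a.
COOPER_LIST = ["barge", "blast", "carking", "carp", "cast", "dart", "flasket",
               "gasp", "grant", "lance", "mask", "path", "tart"]

# Split the list into words the quoted environments cover and words they do not.
covered = [w for w in COOPER_LIST if predicted_long(w)]
not_covered = [w for w in COOPER_LIST if not predicted_long(w)]

if __name__ == "__main__":
    print("covered:", covered)
    print("not covered:", not_covered)
```

under this literal reading, can and pass come out short and cast and past long, as in the passage quoted above, while grant, lance and path from cooper’s list fall outside the stated environments. the same reading would also flag bar, which the text gives with short /a/, so a fuller rule would need further conditions.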
this will allow users to trace variation between /a/ and /aː/ across a much larger subset of the lexicon and to identify differences in the transcriptions of authors from different places writing at different times within the eighteenth century (see beal, : – for further discussion of this sound change).

the qualitative value of evidence from eighteenth-century pronouncing dictionaries has been disputed in the past. dobson, albeit writing at a time when many of the sources used in the eighteenth century collections online were unknown or inaccessible, states that ‘the eighteenth century produced no writers to compare with the spelling reformers who are our main source up to (hodges) or with the phoneticians who, beginning with robinson (1617) carry us on from 1653 (wallis) to 1687 (cooper’s english teacher)’ (dobson, : ). others have taken issue with the prescriptivism of eighteenth-century authors. john walker, the most successful and influential of these, is often singled out for criticism on this account. sheldon ( : ) writes that ‘walker satisfies the temper of his time […] and its demand for linguistic regulation and reform’, whilst holmberg ( : ) accuses walker of being ‘influenced by the spelling’. it is true that all the pronouncing dictionaries used for ecep were written with the aim of providing their readers with a guide to what the authors considered ‘correct’ pronunciation, but the same could be said of the many twentieth- and twenty-first century dictionaries which transcribe the pronunciation of english words in rp and/or general american.
recent scholars such as agha ( ), beal ( ), ranson ( ) and trapateau ( ) have rehabilitated walker’s reputation as a phonetician by taking his work on its own terms as an important and highly informative source of information on the prestigious metropolitan pronunciation which was the precursor of rp. walker’s ( ) critical pronouncing dictionary is the major source of metalinguistic comments in ecep, many of which provide valuable sociolinguistic information (see section . below for further discussion of metalinguistic comments). other sources used in the compilation of ecep provide accounts of what was considered ‘correct’ pronunciation in the provinces.

ecep includes the earliest available editions of all the accessible pronouncing dictionaries of english printed in eighteenth-century britain. at the time of writing, these are as follows:

• buchanan ( ) linguae britannicae vera pronuntiatio.
• johnston ( ) a pronouncing and spelling dictionary.
• kenrick ( ) a new dictionary of the english language.
• perry ( ) the royal standard english dictionary.
• spence ( ) the grand repository of the english language.
• sheridan ( ) a general dictionary of the english language.
• burn ( ) a pronouncing dictionary of the english language.
• scott ( ) a new spelling, pronouncing and explanatory dictionary of the english language.
• walker ( ) a critical pronouncing dictionary and expositor of the english language.
• jones ( , ) sheridan improved. a general pronouncing and explanatory dictionary of the english language. nd and rd editions.

buchanan ( ) is the first true pronouncing dictionary of english, in the sense that every word is transcribed. it was decided to include two editions of jones’s dictionary because the third edition demonstrates significant changes in which jones distances himself from sheridan, most noticeably in recognising a distinction between long and short vowels in the bath and start sets.
in future, we intend to augment ecep with data from later editions and from other sources, but those listed above provide evidence across the second half of the eighteenth century from authors of varying geographical provenance: one irishman (sheridan), four scotsmen (buchanan, perry, burn, scott), one northern author from newcastle (spence), three authors from the london area (kenrick, jones, walker), and one author of uncertain origin who lived and worked in the south-east county of kent (johnston). it is important to state that ecep is not intended to be a database of dialectal pronunciation, but it does reflect the variation between the 'received' speech of london and of the equivalent in provincial centres such as edinburgh and newcastle, as well as providing evidence for change over the course of the eighteenth century.

notes.
http://gale.cengage.co.uk/product-highlights/history/eighteenth-century-collections-online.aspx
we intend to include early american pronouncing dictionaries in later versions of ecep.

data annotation

once the pronouncing dictionaries had been selected, the next step in the compilation of ecep was the process of data input and annotation. this section reports on the design and contents of the database, including the methodological principles adopted.

database design

ecep has been built in ms access format as a relational database constructed with a variety of integrated tables. the data have been systematically annotated and thematically grouped in three major categories: phonology data, source metadata and author metadata. details for each category are set out in table .
table design of the ecep database

(meta)data    fields
phonology     lexical set, lexical subset, keyword, ipa, ipa variants, example word frequency, metalinguistic comments, metalinguistic attitude, metalinguistic label, compilers' notes
source        type of work, title, year of publication (of the edition consulted), edition, place of publication, imprint (printers, booksellers), price, physical description, paratext, audience (age, gender, social class, instruction, specific purpose), references consulted, compilers' notes
author        name, life dates, gender, social class, place of birth, places of residence, occupation, other biographical details, works by this author in ecep

the metadata for the dictionaries have been drawn from the original sources, such as the title-pages and prefaces to works, and also from the literature (e.g. alston, ; beal, ). the metadata for the authors come principally from the oxford dictionary of national biography.

regarding the phonology data, the starting point for drawing up the list of words for ecep was john wells' (1982) accents of english, in particular his list of standard lexical sets for the vowel system in varieties of present-day english ( : – , – ). our aim was for ecep to incorporate data from the selected pronouncing dictionaries in the form of ipa transcriptions so that the historical data documented in the database could be easily compared to present-day studies; this was necessary because, as mentioned above, the notation systems used by eighteenth-century authors were often idiosyncratic and difficult to interpret (see section . and appendix iii). the use of wells' lexical sets and their associated example words is standard practice in studies of variation and change in present-day english. including the full range of example words allows for differences in lexical distribution between the primary sources, and also between these and the contemporary accents described by wells.
for instance, a scholar interested in the distribution of words related to the strut-foot split would be able to find how each of the words provided as examples for wells' sets is transcribed in each of the eighteenth-century sources documented in the database, and how phonological variants were perceived at the time in the context of the standardization of english (e.g. correct, vulgar, improper, etc.). wells ( : –20) explains that '[t]he use of one vowel or another in particular words (lexical items) can be illustrated by tabulating their occurrence' in the set of keywords presented in table in small caps, so that each of them 'stands for a large number of words which behave the same way in respect of the incidence of vowels in different accents'; the latter are referred to in this paper as example words. overall his list contains twenty-four lexical sets for stressed vowels and three sets for unstressed vowels; this makes , example words in total, distributed in sixty-one subsets. the sets kit, dress, trap, lot, strut, foot, cloth concern short vowels; the sets bath, nurse, fleece, palm, thought, goose, start, north, force refer to long vowels; the sets face, goat, price, choice, mouth, near, square, cure include diphthongs; and the sets happy, letter, comma represent unstressed vowels.
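the lookup described above, in which a scholar retrieves how every example word of a lexical set is transcribed in each source together with any evaluative label, can be sketched with a small relational layout. this is only an illustration in python with sqlite, not the ms access design itself: the table and column names are hypothetical, loosely modelled on the field names of the design table, and the single demo row is the walker entry for plant (ipa /æ/, variant /aː/, labelled vulgar) reported later in the paper.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not the actual ECEP tables.
SCHEMA = """
CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE source (
    source_id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES author(author_id),
    title     TEXT NOT NULL
);
CREATE TABLE phonology (
    entry_id     INTEGER PRIMARY KEY,
    source_id    INTEGER NOT NULL REFERENCES source(source_id),
    lexical_set  TEXT NOT NULL,
    subset       TEXT,
    example_word TEXT NOT NULL,
    ipa          TEXT,
    ipa_variant  TEXT,
    label        TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding one illustrative entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO author VALUES (1, 'walker')")
    conn.execute(
        "INSERT INTO source VALUES (1, 1, 'a critical pronouncing dictionary')"
    )
    conn.execute(
        "INSERT INTO phonology VALUES "
        "(1, 1, 'bath', 'bath_a', 'plant', 'æ', 'aː', 'vulgar')"
    )
    conn.commit()
    return conn


def transcriptions_for_set(conn, lexical_set):
    """Return (author, example word, ipa, variant, label) rows for one set."""
    cursor = conn.execute(
        """
        SELECT a.name, p.example_word, p.ipa, p.ipa_variant, p.label
        FROM phonology AS p
        JOIN source AS s ON s.source_id = p.source_id
        JOIN author AS a ON a.author_id = s.author_id
        WHERE p.lexical_set = ?
        ORDER BY a.name, p.example_word
        """,
        (lexical_set,),
    )
    return cursor.fetchall()
```

keeping authors, sources and phonology entries in separate tables mirrors the three-way grouping of the data, so that a query by lexical set can be combined with author or source metadata through joins.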
table wells' (1982: – ) lexical sets in ecep (sorted as in wells)

set       subset      example words
short vowels
kit       --          bit, drink
dress     --          bed, deaf
trap      --          back, thank
lot       --          box, sock
strut     --          blood, cut
foot      --          bush, full
long vowels and diphthongs
bath      bath_a      ask, castle
          bath_b      branch, enhance
          bath_c      banana, calf
          bath_f      blasphemy, plastic
cloth     cloth_a     broth, cough
          cloth_b     coffee_ , moth
          cloth_c     coroner, florin
nurse     --          birth, nerve
fleece    fleece_a    agree, cheese
          fleece_b    bead, deceive
          fleece_c    machine, police
face      face_a      age, safe
          face_b      day, faith
          face_c      break, great
palm      palm_a      calm, father
          palm_b      bravado, inamorato
          palm_f      almond, sultana
thought   thought_a   fall, sought
          thought_b   false, fault

note. the sets are categorized as long- or short-vowel sets according to their pronunciation in rp.

practical notes. (a) the codes _a, _b etc. in wells' lexical subsets are preserved as in his original list, except for _f, which wells codes with an apostrophe and often refers to as an 'appendix' to the original set. (b) the codes _ and _ in some of wells' example words are used when the same word appears in more than one subset; the number indicates the syllable that is relevant in each particular case, as in coffee_ for cloth_b and coffee_ for happy_b.
set       subset      example words
goat      goat_a      boat, holy
          goat_b      grow, know
goose     goose_a     choose, shoot
          goose_b     blue, few
price     price_a     arrive, try
          price_b     fight, high
choice    choice_a    boy, noise
          choice_b    join, spoil
          choice_c    groin, hoist
mouth     --          down, mountain
near      near_a      beer, near
          near_b      beard, fierce
          near_c      hero, period
          near_f      idea, real
square    square_a    air, pear
          square_b    scarce
          square_c    dairy, rarity
start     start_a     far, start
          start_b     bark, party
          start_c     tiara
north     north_a     for, war
          north_b     assort, mortal
          north_c     aura, taurus
force     force_a     adore, door
          force_bi    deport, forth
          force_bii   coarse, fourth
          force_c     aurora, glorious
cure      cure_ai     amour, tour
          cure_aii    endure_vw, pure
          cure_b      gourd, tournament
          cure_ci     boorish
          cure_cii    bureau, curious
weak vowels
happy     happy_a     baby, city
          happy_b     coffee_ , vanity
letter    --          better, razor
comma     --          opera, saliva

to these sets for the study of the vowel system in general we have added five supplementary sets for the study of the consonant system in eighteenth-century english, including ten subsets and a total of example words. the sets deuce, feature and sure address the process of palatalization, dealing with stress patterns (subsets _a for stressed syllable, _b for post-stress syllable, _c for pre-stress syllable), and the pre-/j/ phoneme (/t, d, s, z/ in each set). the set heir relates to the presence or absence of initial /h/, and the set whale to the pronunciation of 'wh'. see table and appendix i for details. more consonant sets may be added in due course.

practical notes. (a) when the same example word appears in a vowel set and in a consonant set, the former is coded with _vw and the latter with _cn, for instance heir_vw for square_a and heir_cn for heir.
(b) if the same example word appears in more than one of the consonant sets, _deu stands for deuce, _ture for feature, and _sure for sure, for instance fissure_ture and fissure_sure.

table consonant lexical sets in ecep

set       subset      /t/          /d/          /s/           /z/
deuce     deuce_a     tuesday      due          suit          resume
          deuce_b     altitude     module       issue         visual
          deuce_c     tumultuous   adulation    superior      --
feature   --          creature     procedure    pressure_cn   pleasure
sure      sure_a      mature       during_cn    surety        c(a)esura_cn
          sure_b      century      verdure      censure       closure
          sure_c      maturation   duration     mensuration   --

set       subset      example words
heir      --          honour, humble
whale     whale_a     when, whine
          whale_b     elsewhere, somewhat

note on the feature set. this set consists of words which have schwa in the post-stress syllable in present-day english according to the oxford english dictionary (as opposed to a full vowel in sure_b), following a palatalized consonant in at least one pronunciation variant. these forms presumably arose from pronunciations with /j鎖/, which appear to have become more widespread in the eighteenth century. these in turn originated in forms with variation between [yː] and [iu] in the final syllable in middle english. when the final syllable became unstressed, there was variation between 'full' forms with /iu/ and reduced forms with /畷/. the variants with /iu/ could then develop to /ju/ with the subsequent possibility of palatalizing the preceding consonant, whereas those with /畷/ did not lead to palatalization. subsequent restoration of /j/ in the schwa-forms (with possible palatalization) combined with the reduction of /u/ to schwa in the full forms results in the remarkable variation we see in ecep between e.g. /tjuː/, /ʧuː/, /ʧjuː/, /tj鎖/, /ʧ鎖/, /ʧj鎖/, and /t畷/ in this set.

each of the eighteenth-century pronouncing dictionaries in ecep was examined in order to find wells' example words for vowels and consonants, and the data were entered according to the following principles:

a) wells' lexical sets are designed for the analysis of present-day english. naturally, the sets include example words that were introduced into the english language in recent times. given that the scope of ecep is limited to the phonology of the eighteenth century, we have excluded from the database those lexical items created or borrowed after (source: oxford english dictionary, january ).
b) wells' example words that are not documented in any of the pronouncing dictionaries examined have been excluded.
c) proper names and cliticised spellings of the type don't, can't have been excluded on the grounds that they are unlikely to be considered headwords in dictionaries. country names appear occasionally in lists, as in johnston ( ), but some did not exist at the time.
d) example words that are documented in at least one pronouncing dictionary are included in the database, and the dictionaries in which an example word does not appear are coded 'nid' (i.e. 'not in this dictionary'). for instance, macaroni (set happy_a) is missing in all but perry ( ) and scott ( ), and whorl (set nurse) appears only in johnston ( ).
e) if an example word is listed in the dictionary but no pronunciation is provided, it is coded as nop (i.e. 'no pronunciation'), such as cup (set foot) in kenrick ( ).
f) at times pronouncing dictionaries do not list the precise example word, but they do list or make reference to a related word. in such cases we take note of the latter and add an explanatory note for users. for instance, for awn (set thought_a) we have taken awning as the reference in five of the six dictionaries in which it is documented; and for honourable and honesty (set heir) we have taken honour and honest as reference example words in kenrick's (1773) dictionary.
g) example words for which the notation system in the original source is unclear or ambiguous have been coded as unclear.

following the above method, ecep currently lists , example words for each pronouncing dictionary: , example words in the vowel sets in subsets, and example words in consonant sets across subsets. this leads to a total of , items annotated for the study of eighteenth-century english phonology. a summary of the contents of ecep is set out in table . appendix ii lists wells' example words that have been excluded from ecep according to principles a)-c).

table ecep contents

columns: lexical sets; subsets; example words
vowels (wells ( )): kit, dress, trap, lot, strut, foot; bath, cloth, nurse, fleece, palm, thought, goose, start, north, force; face, goat, price, choice, mouth, near, square, cure; happy, letter, comma
consonants (supplementary list): deuce, feature, sure; heir; whale
total in each pronouncing dictionary
total in all pronouncing dictionaries

notes.
the exception to country names is england.
note that alexander, charles, george and morris are included in ecep because the dictionary entries refer to derivations which are no longer proper names as such; for instance, alexander refers to the name of the herb.
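the coding principles d), e) and g) amount to a small decision procedure applied to every example word in every dictionary. the sketch below illustrates that logic in python; it is not the compilers' actual workflow, and the function name, the way a dictionary is represented (a mapping from headword to transcription) and the `ambiguous` parameter are assumptions made for the example. the demo data follow the text: kenrick lists cup without a pronunciation, does not list macaroni, and transcribes the 'wh' of whale as /w/.

```python
NID = "NID"          # 'not in this dictionary' (principle d)
NOP = "NoP"          # listed, but 'no pronunciation' given (principle e)
UNCLEAR = "unclear"  # notation unclear or ambiguous (principle g)


def code_entry(word, entries, ambiguous=frozenset()):
    """Return the value recorded for `word` in one pronouncing dictionary.

    `entries` maps each headword to its IPA transcription, or to None when
    the word is listed without a pronunciation. `ambiguous` holds headwords
    whose original notation could not be interpreted. Both are hypothetical
    representations chosen for this sketch.
    """
    if word not in entries:
        return NID
    if entries[word] is None:
        return NOP
    if word in ambiguous:
        return UNCLEAR
    return entries[word]


# Demo data taken from the examples in the text (kenrick's dictionary).
KENRICK_DEMO = {"cup": None, "whale": "w"}

if __name__ == "__main__":
    for word in ("macaroni", "cup", "whale"):
        print(word, code_entry(word, KENRICK_DEMO))
```

recording an explicit code for every dictionary, rather than leaving gaps, keeps the number of rows per dictionary constant and makes absences searchable.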
database annotation

the database is designed to address research questions concerning the chronological, social, geographical and phonological distribution of variants such as /hw/~/w/~/h/ in the whale set, bath broadening or the strut-foot split, all of which are of interest to sociolinguists, dialectologists and historical phonologists. to this end ecep has been compiled to reflect the inventory of categorically distinct sounds in the way that the eighteenth-century pronouncing dictionaries document them; we avoid second-guessing issues of phonology here. as beal ( ) has rightly argued with respect to notations for orthographic <a>:

the systems of notation provided in these pronouncing dictionaries tell us about the phonemic inventory of the recommended accent, that is, how many phonemes there are (we can, for instance, easily tell that sheridan has three sounds, whilst spence and walker have four) whilst we can find out about the incidence of those phonemes from the dictionary entries themselves. what we cannot tell from a dictionary such as the grand repository is the phonetic nature of those phonemes: how do we know that the sound in father was [ɑː] rather than [æː] or even [坐ː]? (beal, : )

bringing together the information from all the pronouncing dictionaries, as we aim to do in ecep, will help us address beal's question. our method has thus been to translate the idiosyncratic notation systems of the dictionaries into unicode ipa transcriptions, based on the descriptions provided by the authors in the preface or introduction to their works. according to bert emsley's categories of pronouncing dictionaries, eighteenth-century sources are 'typically' diacritic, so that diacritic marks indicate quality as well as quantity of sounds (cited in beal, : ).
they all tended to use different types of diacritic marks, though, and spence's grand repository in fact 'stands apart from all the others both in its purpose and in the means of executing that purpose' (beal, : ) in that it uses a truly phonemic system of notation in which any one symbol always represents the same phoneme and vice versa. for instance, in a new dictionary of the english language kenrick ( ) used a notation system based on numbers placed over each syllable, a method which he acknowledges was inspired by 'the celebrated mr sheridan' (beal, : ). in the introduction to the work he gives readers 'directions for consulting the following dictionary' ( : – ) and then elaborates on the description of the sounds in the 'rhetorical grammar' prefixed to it ( : – ). he first provides a table of english sounds for vowels and another for consonants, taking note of spelling variation for the same sound, as shown in table .

table kenrick (1773: v) on 'the long and short modes of uttering our five vowels'
a. barr'd. bard.
e. met. mate.
i. short in hit. long in heat.
o. not. naught.
u. pull. pool.

he goes on to explain the notation system with the word fascination as an illustrative example:

( ) the word is next printed, as it is divided into syllables according to a right pronunciation, with figures placed over each syllable, to determine its exact sound, as the figures correspond with those of the above table of sounds: thus fa s-ci -na -ti on.] now, by referring to the table, we find that the several syllables are to be pronounced like the words placed over against the numbers , , , ; by which the quality of the sound, or the power of all the vowels, is exactly determined. by shewing farther that the consonant c in the second syllable is printed in italicks, it is known, by the table of consonants, that it is here pronounced soft like an s.
again, the letters ti in the last syllable being printed also in italics, it is plain from the same table that they have the usual power of sh; so that the word must be pronounced as if it had been printed fa s-si -na -sho n. (kenrick, : vii)

kenrick's system is itself a reference for perry's (1775) dictionary, which also takes inspiration from johnston's (1764) method, and in turn is found in sheridan ( ) in combination with buchanan's (1757) respelling notations (beal, : , ). the system in walker ( ) is 'virtually identical to that devised by sheridan' (beal, : – ). walker argues that sheridan's 'method of conveying the sound of words, by spelling them as they are pronounced, is highly rational and useful', and therefore it 'seemed to complete the idea' of walker's own dictionary (walker, : iii). fig. shows a summary of sheridan's notation system, where vowels are categorized 'by the titles of first, second, and third sounds, according to the order in which they lie, and as they are marked by those figures' ( : ), and where consonants are preceded by a vowel (first row) or by 'sounding' the characters so that 'their nature and powers will be expressed in their names' (1780: 5). as an illustrative example from the dictionary entries (see ( )), the example word whisker is documented by sheridan with the consonant cluster hw in the first syllable (set whale_a) and with the vowel u in the second syllable (set letter), that is ipa /朔r/.

figure sheridan's (1780: 5) notation system for vowels and consonants

note. here, the symbols <w> and <y> refer to the semivowels /w/ and /j/.

( ) whisker, hwi s'-ku r. s. the hair growing on the cheek unshaven, the mustachio. (sheridan, : s.v. whisker)

once the correspondence between the dictionaries' systems and the ipa conventions was established (see appendix iii for a sample of two dictionaries), the relevant segment of each example word was transcribed using ipa symbols in an individual entry in the database. the following methodological principles were applied in the interpretation of all pronouncing dictionaries.

first, the symbol /ɑː/, which would be used for the vowel produced by rp speakers in the bath, palm, and start sets, has not been used in our ipa transcriptions; rather, we have consistently used /aː/ in line with the general view by historical phonologists that the backing to /ɑː/ was a later process (e.g. lass, : ). this concerns the sets bath, palm, start, and variants in face, lot, square, thought, trap.

second, all the eighteenth-century dictionaries examined describe and/or prescribe a rhotic pronunciation. since it is therefore a given that orthographic r is pronounced in all contexts, we have included post-consonantal /r/ in our transcriptions only when rhoticity is relevant to the pronunciation of the vowel in the example word, namely in the sets cure, force, letter, near, north, nurse, square, start. in these sets, historical changes in the pronunciation of the vowels are connected to the presence or loss of rhoticity. the exceptions are the subsets cure_ci, cure_cii, force_c, near_c, near_f, north_c, square_c, start_c because the example words in these subsets all have the vowel before /r/ followed by another vowel, as in boorish, curious, and therefore rhoticity is not an issue. in the sets sure, feature, heir post-consonantal /r/ has not been included in the transcription either, because the relevant segment in these sets is the prevocalic consonant, not the vowel.

third, where current transcription conventions vary, we have chosen the one that most closely corresponds with the descriptions provided by our eighteenth-century sources.
for example, in transcribing the vowel of the lexical set fleece we have chosen /iː/ rather than /i/ because the majority of our sources describe this as a 'long' vowel. authors typically provide a single pronunciation; if they comment on variation in the pronunciation of a particular word we document that in a separate column, as shown in table .

table illustrative examples of example words with ipa variants

lexical set   subset      example word   ipa    ipa variant   dictionary
bath          bath_a      plant          æ      aː            walker
cure          cure_ai     your           joːr   j朔r           jones
face          face_a      great          eː     iː            sheridan
foot          foot        bosom          u      朔             scott
sure          sure_a      sure_cn        sjuː   ʃjuː          johnston
square        square_a    bear           eːr    iːr           buchanan
whale         whale_a     whist          hw     w             kenrick

in addition, if authors elaborate further on a context in which there is variation, the passage is recorded in the field metalinguistic comments. an example of this is the need to explain that a difference in pronunciation implies a difference in meaning, as noted by buchanan ( ) for the lexical item bear (set square_a): as a noun meaning 'a wild beast' it is pronounced bĤar (ipa /iːr/), while as a verb meaning 'to carry' the pronunciation is beĈr (ipa /eːr/). if the remarks convey prescriptive attitudes towards either variant, this is further annotated in the fields for attitudes (i.e. positive, negative, neutral) and labels (e.g. vulgar, improper). criticism is usually related to pronunciations considered 'vulgar', whether in the sense 'coarse, unrefined' (oed s.v. vulgar ii. .d) and 'mean; low' (johnson, : s.v. vulgar, sense ), or in the sense 'commonly or customarily used by the people of a country; ordinary, vernacular' (oed s.v. vulgar i. .a), often in phrases such as 'the vulgar say' or 'among the vulgar' (see also sundby et al., : – , – ). walker's entry for plant (set bath_b) in passage ( ) provides an illustrative example of this. for his part, the irish author sheridan often comments on variation between english and irish pronunciation, as in the section on 'rules to be observed by the natives of ireland in order to attain a just pronunciation of english' (1780: 59– ). see, for instance, his passage in ( ) about lexical items such as great (set face_a), where he warns 'the gentlemen of ireland' to avoid the mistaken pronunciation /iː/ for the 'just' pronunciation /eː/ in english. sheridan emphasizes that '[a] strict observation of these few rules [...] will enable the well-educated natives of ireland to pronounce their words exactly in the same way as the more polished part of the inhabitants of england do' (1780: 60).

notes.
there is a peculiar case in which post-consonantal /r/ stands in variation with /l/, namely in colonel (set nurse) with ipa variants /朔l/ and /朔r/. johnston (1764) simply lists the two variants without further comment: 'c境lonel, c┎rnel', while kenrick (1773) makes the following remark: 'it is now generally sounded with only two distinct syllables, col'nel, and vulgarly as if written cur-nel', that is ipa /左l/ and /朔r/ respectively. here we have preserved the /r/ in the transcription.
beal ( : – ) discusses whether authors of pronouncing dictionaries were 'good' phoneticians or not, and how 'descriptive' or 'prescriptive' their remarks were.

( ) plant, pla nt. [ipa /æ/] there is a coarse pronunciation of this word, chiefly among the vulgar, which rhymes it with aunt [i.e. a nt, ipa /aː/].
this pronunciation seems a remnant of that broad sound which was probably given to the a before two consonants in all words, but which has been gradually wearing away, and which is now, except in a few words, become a mark of vulgarity. (walker, : s.v. plant; s.v. aunt)

( ) the second vowel, e, is for the most part sounded ee by the english [ipa /iː/], when the accent is upon it; whilst the irish in most words give it the sound of second a , as in hate [ipa /eː/]. this sound of e [ee] is marked by different combinations of vowels, such as, ea, ei, e final mute, ee, and ie. [...] the english constantly give this sound [i.e. /iː/] to ea, whenever the accent is on the vowel e, except in the following words, gre at, a pe ar, a be ar, to be ar, to forbe ar, to swe ar, to te ar, to we ar. in all which the e has its second sound [e , ipa /eː/]. for want of knowing these exceptions, the gentlemen of ireland, after some time of residence in london, are apt to fall into the general rule, and pronounce these words as if spelt, greet, beer, sweer, &c. (sheridan, : )

finally, since word frequency may be an influential factor in the choice of variants or in the development of sound changes such as those arising through lexical diffusion, we have compiled a frequency list with an estimated frequency rate of the lexical item in eighteenth-century british english, based on the data available in the multi-genre historical corpus archer – , version . ( , words).

web-based interface

the ecep database will be made available to users via a web-based application hosted on the website of the humanities research institute, university of sheffield. access to ecep will be free for any user registering at the website. the reference line for citation is as follows:

ecep = eighteenth-century english phonology database, . compiled by joan c. beal, nuria yáñez-bouza, ranjan sen and christine wallis. the university of sheffield and universidade de vigo. published by: university of sheffield.
http://hridigital.shef.ac.uk/eighteenth-century-english-phonology

the online interface has been developed using client-side html and javascript and server-side php and mysql. it displays two layouts, browse and search, and offers a download function in csv file format. the design aims to replicate the ms access format, and therefore it offers three main blocks of data: the lexical sets plus metadata for works and for authors (see table ). fig. shows the homepage, from which each of these sections can be accessed (see top row), and from which users can go directly to the pronouncing dictionary they are interested in (see buchanan and burn in the image). fig. and fig. are screenshots of the browse layouts for works and authors, respectively. the search tool allows users to search in one field or in a combination of fields. fields which contain a predefined list of values (e.g. lexical sets, example words, author's name) offer an automatic drop-down list menu to facilitate selection, as in the field ipa in fig. . lexical sets and example words can be searched in the entire database or within a particular work; for instance, fig. displays a sample of the set bath, where users can compare the occurrence of the variants /æ/, /aː/ and /査ː/.

figure ecep online interface: homepage
figure ecep online interface: works in browse layout
figure ecep online interface: authors in browse layout
figure ecep online interface: lexical sets in search layout
figure ecep online interface: lexical sets in browse layout

case studies

in this section we report on two case studies that demonstrate the value of evidence that can be systematically extracted from this database for the analysis of segmental and suprasegmental phonology, in their regional and chronological settings. the results constitute a valuable distillation of the conditioning factors to look out for in a wider range of eighteenth-century evidence, hence a point of departure for further investigation.
in this light, these results must be interpreted as indications of patterns rather than definitive analyses; that is, if a sub-set of dictionary writers display a pattern in their choices, it is worth exploring that pattern using all the available evidence to establish whether a conditioning factor in the sound change indeed underlies it.

the first of these studies examined variation in the pronunciation of 'wh' (/hw/~/w/~/h/) in example words of the consonantal set whale (beal and sen, a; b). in present-day rp, words such as whale, what, where begin with the sound /w/, whilst who, whole have initial /h/. eighteenth-century pronouncing dictionaries present evidence, through their orthographic systems, of variation between /hw/ and /w/ for the first set, hence a preserved versus unpreserved contrast in where/wear. the fifty example words in this consonantal set were selected on the basis of their occurrence in as many of the sources as possible, and to represent three phonological contexts: (1) thirty-nine example words beginning with the spelling 'wh' which are pronounced with /w/ in present-day rp, (2) six example words with initial 'wh' which are now pronounced with initial /h/, and (3) five example words with 'wh' word internally, which are now all pronounced with internal /w/ (e.g. somewhere). the transcriptions of the 'wh' segment in each example word found in nine of the eleven pronouncing dictionaries compiled in ecep are displayed in table .
table 7 transcriptions of the 'wh' segment in the whale lexical set

whale set     bu     joh    ke     pe     sp     sh     bur    wa     jo
whale         hw     hw     w      w      hw     hw     w      hw     hw
wharf         hw     w      w      w      hw     hw     w      hw     hw
what          hw     hw     w      w      hw     hw     w      hw     hw
wheat         hw     hw     w      w      hw     hw     w      hw     hw
wheedle       hw     hw     w      w      hw     hw     w      hw     hw
wheel         hw     hw     w      w      hw     hw     w      hw     hw
wheeze        hw     hw     w      w      hw     hw     w      hw     hw
whelm         hw     hw     hw     hw     hw     hw     w      hw     hw
whelp         hw     hw     w      w      hw     hw     w      hw     hw
when          nid    hw     w      w      hw     hw     w      hw     hw
whence        nid    hw     w      w      hw     hw     w      hw     hw
where_cn      nid    hw     w      w      hw     hw     w      hw/w   hw
wherry        hw     hw     hw     w      hw     hw     w      hw     hw
whet          hw     hw     w      w      hw     hw     w      hw/w   hw
whether       hw     hw     hw     w      hw     hw     w      hw     hw
whey_cn       hw     hw     w      w      hw     hw     w      hw     hw
which         nid    hw     w      w      hw     hw     w      hw     hw
whiff         hw     hw     w      w      hw     hw     w      hw     hw
whiffle       hw     hw     w      w      hw     hw     w      hw     hw
whig          hw     hw     w      w      hw     hw     w      hw     hw
while         nid    hw     w      w      hw     hw     w      hw/w   hw
whim          hw     hw     w      w      hw     hw     w      hw     hw
whimper       hw     hw     w      w      hw     hw     w      hw     hw
whin          hw     nid    w      w      hw     hw     w      hw     hw
whine         hw     hw     w      w      hw     hw     w      hw     hw
whip          hw     hw     w      w      hw     hw     w      hw     hw
whirl         hw     hw     w      w      hw     hw     w      hw     hw
whisk         hw     hw     hw     hw     hw     hw     w      hw     hw
whisker_cn    hw     hw     hw     hw     hw     hw     w      hw     hw
whisper       hw     hw     hw     hw     hw     hw     w      hw     hw
whist         hw     hw     hw/w   w/hw   hw     hw     w      hw     hw
whistle       hw     hw     hw     w      hw     hw     w      hw     hw
whit          hw     hw     w      w      hw     hw     w      hw     hw
white         hw     hw     w      w      hw     hw     w      hw     hw
whither       hw     hw     w      w      hw     hw     w      hw     hw
whitlow       hw     hw     hw     hw     hw     hw     hw     hw     hw
whitsuntide   hw     hw     hw     w      hw     hw     hw     hw     hw
whiz          hw     hw     hw     hw     hw     hw     hw     hw     hw
who_cn        nid    h      h      h      hw     h      h      h      h
whole         h      h      h      h      hw     h      h      h      h
whom          nid    nid    h      h      hw     h      h      h      h
whoop         hw     h      h      h      hw     h      h      h      h
whore_cn      h      h      h      h      h      h      h      h      h
whose_cn      nid    h      h      h      hw     h      h      h      h
why           nid    hw     w      w      hw     hw     hw     hw     hw
elsewhere     nid    nid    nid    hw     hw     hw     w      hw     hw
nowhere       nid    nid    nid    w      hw     hw     w      hw     hw
overwhelm     hw     hw     nid    hw     hw     hw     hw     hw     hw
somewhat      nid    nid    nid    w      hw     hw     w      hw     hw
somewhere     nid    hw     nid    w      hw     hw     w      hw     hw

this systematic data collection even on such a small scale enabled us to identify patterns in the evidence, along dimensions commonly under investigation in sociolinguistic, historical and phonological research, namely geography, chronology, phonology, lexical factors, and social class. furthermore, the nature of the data also enabled us to glean 'direct' evidence in the form of contemporary commentary on the choices made by the authors. a notable example is that walker ( ) presents the loss of the /hw ~ w/ contrast as a special case of 'h-dropping' in lower-class london english, which was just beginning to attract social stigma in the middle of the eighteenth century (beal, : – ).

three main patterns emerged from the data based on geographical and chronological distribution. firstly, the london authors prefer /hw/ to /w/ to avoid the proscribed 'h-dropping' as discussed by walker, with the exception of kenrick (1773), one of the earliest of the group, presumably because the stigmatization of /w/ had not yet fully taken effect by this time. secondly, two out of the three scottish authors prefer /w/ (perry, ; burn, ), whereas the earliest, buchanan ( ), prefers /hw/. perry and burn appear to be advising a more london-like pronunciation to avoid the scottish /hw/, stigmatized due to its regional connotations (douglas, [ ]: ). the /w/ pronunciation could therefore be analysed as a hypercorrect anglicism, one which is particularly remarkable in the light of the contemporaneous opposite trend in london where /hw/ was prescribed due to 'h-dropping'. arguably, this trend was only taking hold in london at the time and had not yet reached the consciousness of the scottish authors. thirdly, spence ( ) from newcastle in north-east england has near-consistent /hw/, even in words containing a following back, rounded vowel, where other authors have delabialized /h/, e.g. who. along with the fact that spence is the only author to use a special symbol for the 'wh' sound, this could be interpreted as evidence in spence's dialect for monosegmental /ʍ/, and not a cluster /hw/, the voiceless counterpart of voiced /w/ which also retained its labial element before back, rounded vowels, e.g. wound, womb, wool, wood.

two lexically based patterns emerged.
The first, homophone avoidance, as shown by Buchanan’s (1757) /hw/ for whoop ‘a cry’, but /w/ for whoop ‘a bird’ and Burn’s (1786) and Perry’s (1775) whitsuntide with /hw/ and whit with /w/, is evidence that sensitivity to the contrast remained to a sufficient degree to construct minimal pairs in some regions, notably Scotland where the contrast survives to the present day. The second, onomatopoeia, as illustrated by /hw/ in whisk, whisper in Kenrick ( ) and Perry ( ), could also be interpreted as evidence for an increased chance of /hw/-preservation (perhaps enhanced by considerations of sound symbolism) before a front vowel in precisely these two authors, e.g. whelm, and often with a following /s/, e.g. whisk, whiskers, whisper.

Aside from this partial pattern, two main explanations based on phonological context emerged. The first is the unambiguous delabialization to /h/ before any vowel that is higher and more round than /ɔ/ (there is no /h/ in wharf in any of the dictionaries), e.g. who. Secondly, the realization of word-internal ‘wh’ in Perry (1775) appears to be conditioned by stress, as marked by the author himself, thus stressed-syllable onset /hw/ in overwhélm, elsewhére, but unstressed-syllable onset /w/ in sómewhere, sómewhat, nówhere.

We therefore repeatedly found that by systematically collating the different types of direct evidence afforded by the eighteenth-century pronouncing dictionaries (sounds and stress, contemporary commentary, geographical and chronological spread), and analysing them in the light of acknowledged influences in sound change, we were able to posit accounts for many of the patterns in a way that only such an orderly approach to the data permitted.

The second case study explored palatalization in eighteenth-century English, i.e. where a postalveolar fricative /ʃ ʒ/ or affricate /tʃ dʒ/ arose from the sequence alveolar /t d s z/ + /j/ + /uː/, as in the word tune (Beal and Sen, ).
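The definition of palatalization just given can be sketched as a simple symbol mapping. The Python fragment below is only an illustration: the function name `palatalize` and the plain-string transcription format are assumptions made for this example and are not part of ECEP or its interface.

```python
import re

# Alveolar + /j/ sequences and the postalveolar outcomes named in the text.
# The dictionary and function names are hypothetical, for illustration only.
PALATALIZATION = {"tj": "tʃ", "dj": "dʒ", "sj": "ʃ", "zj": "ʒ"}


def palatalize(transcription: str) -> str:
    """Rewrite the leftmost /tj dj sj zj/ sequence as its palatalized counterpart."""
    return re.sub(
        r"[tdsz]j",
        lambda match: PALATALIZATION[match.group(0)],
        transcription,
        count=1,
    )


if __name__ == "__main__":
    for form in ("tjuːn", "djuːk", "sjuːr"):
        print(form, "->", palatalize(form))
```

The mapping treats palatalization as a categorical rewrite; the dictionaries themselves record intermediate forms as well, which a sketch of this kind does not capture.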
The palatalization of alveolar consonants before late Middle English /uː/ is still variable and is diffusing in present-day English. The OED gives several pronunciations for mature (e.g. /məˈtʃʊə ~ mətjʊə/), but provides only unpalatalized (/dj tj/) transcriptions for endure, tune, and duke, despite the common occurrence of palatalized (and yod-dropped) variants in many varieties of British English. Extensive variability is not recent in origin, and we can already detect relevant patterns in the eighteenth century from the evidence of a range of pronouncing dictionaries; for instance, Beal ( ; ) notes a tendency for northern English and Scottish authors to be more conservative. She concludes that we require ‘a comprehensive survey of the many pronouncing dictionaries and other works on pronunciation’ (1996: 379) to gain more insight into the historical variation patterns underlying present-day English. This study presented results from such a ‘comprehensive survey’ based on the data compiled in ECEP.

The data were divided into two main consonantal lexical sets: deuce, where there was no /r/ following the vowel, and sure, where an /r/ followed. A third set was feature, where the vowel following the palatalized sequence is schwa in present-day English, and /r/ originally followed the vowel. This division was made after preliminary examination demonstrated a clear difference in the behaviour of the consonantal sequences in these contexts. We were then able to further clarify the nature of the divergence after constructing the database with information from ten dictionaries, and with word-frequency information for the period – from ARCHER . . By way of illustration, Table displays transcriptions of example words in some of the deuce subsets, and Table transcriptions of some of the sure and feature subsets.
Table: Transcriptions of the deuce lexical set in ECEP: subsets deuce_a /t/, deuce_b /t/, deuce_c /t/, deuce_b /s/

deuce_a /t/    Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
opportunity    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
tuesday        tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
tumour         tju:    tju:    to:     tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
tube           tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
tutor          tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
tune_cn        tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
obtuse_cn      tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
tulip          tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
tumult         tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
tubular        nid     nid     tju:    tju:    nid     tʃu:    tju:    tju:    tju:    tju:
contusion      tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tʌ      unclear
tumid          tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
tuberous       tju:    tju:    tju:    tju:    nid     tʃu:    tju:    tju:    tju:    tju:
tunic          tju:    nid     tju:    tju:    nid     tʃu:    tju:    tju:    tju:    tju:
opportune_a    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
attune         nid     nid     tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:

deuce_b /t/    Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
latitude       tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
amplitude      tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
longitude      tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
altitude       tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
magnitude      tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    tju:    tju:
fortitude      tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
punctual       tju:    tju:    tju:    nop     tju:    tʃʊ     tju:    tʃju:   tʃju:   tju:
solitude       tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tʃju:   tju:
attitude       tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    unclear unclear
aptitude       tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    unclear tju:
sanctuary      tju:    tju:    tju:    tju:    tju:    tʃʊ     tju:    tʃju:   tʃju:   tju:
mortuary_deu   tju:    tju:    tju:    tju:    tju:    tju:    tju:    tʃju:   tju:    tju:
actuary_deu    tju:    nid     nid     tju:    nid     tju:    nid     tʃju:   tju:    tju:
opportune_b    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
bitumen        tju:    tju:    tju:    tju:    tju:    ˈtju:   tju:    ˈtju:   tju:    tju:

deuce_c /t/    Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
tumultuous     tju:    tju:    tju:    tju:    nid     tʃu:    tju:    tju:    tju:    tju:
tutorial       nid     nid     nid     nid     nid     nid     nid     tʃju:   nid     nid

deuce_b /s/    Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
issue          sju:    sju:    sju:    ʃju:    sju:    ʃʊ      sju:    ʃju:    ʃu:     ʃju:
consular       nid     sju:    sju:    nop     sju:    ʃʊ      sju:    ʃju:    ʃʊ      unclear
consummate     sʌ      sʌ      sʌ      sʌ      sʊ      nid     sʌ      sʌ      sʌ      unclear
tissue         sju:    sju:    nid     ʃju:    sju:    ʃʊ      sju:    ʃju:    ʃju:    ʃju:
insulate       nid     nid     sju:    nid     nid     sju:    nid     ʃju:    nid     nid

Table: Transcriptions of the sure and feature lexical sets in ECEP: subsets sure_a /t/, sure_c /t/, sure_a /s/, sure_b /z/, feature /z/

sure_a /t/     Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
futurity_cn    tju:    tju:    tju:    tju:    tju:    tʃu:    tju:    tju:    unclear tju:
centurion_cn   tju:    tju:    tɔ      tju:    tju:    tju:    tju:    nid     tju:    tju:
mature_cn      tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:
maturity_cn    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:    tju:

sure_c /t/     Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
maturation     tju:    nid     tju:    tju:    nid     tju:    tju:    tʃju:   tju:    tʃju:

sure_a /s/     Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
sure_cn        sju:    sju:    ʃju:    ʃju:    ʃu:     ʃu:     sju:    ʃju:    ʃu:     ʃju:
assure_cn      sju:    sju:    ʃju:    sju:    ʃu:     ʃu:     sju:    ʃju:    ʃu:     ʃju:
assurance_cn   sju:    sju:    ʃju:    sju:    ʃu:     ʃu:     sju:    ʃju:    ʃu:     sju:
insure_cn      nid     sju:    nid     nid     nid     nid     sju:    nid     nid     nid
ensure_cn      nid     nid     sju:    ʃju:    nid     nid     nid     ʃju:    sju:    ʃju:
surety         sju:    sju:    ʃju:    ʃju:    ʃu:     ʃu:     sju:    ʃju:    ʃu:     ʃju:
insurance_cn   sju:    sju:    nid     nid     nid     sju:    sju:    nid     nid     nid
unsure         nid     sju:    ʃju:    ʃju:    nid     ʃu:     sju:    ʃju:    nid     ʃju:

sure_b /z/     Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
composure      zju:    zʌ      zju:    zʌ      ʒʊ      ʒʌ      sjʌ     ʒju:    ʒʌ      ʒju:
seizure        zju:    zʌ      zʌ      nid     zju:    ʒʌ      sjʌ     ʒju:    ʒʌ      ʒʌ
azure_sure     zju:    zʌ      zɔ      zʌ      ʒʊ      ʒʌ      zjʌ     ʒju:    ʒʌ      ʒju:
closure        nid     nid     zʌ      zʌ      ʒʌ      ʒʌ      nid     ʒju:    ʒʌ      ʒʌ

feature /z/    Bu      Joh     Ke      Pe      Sp      Sh      Sc      Wa      Jo      Jo
pleasure       zju:    zʌ      ʒʌ      ʒʌ      ʒʊ      ʒʌ      zjʌ     ʒju:    ʒʌ      ʒʌ
measure_cn     zju:    zʌ      ʒʌ      ʒʌ      ʒʊ      ʒʌ      zjʌ     ʒju:    ʒʌ      ʒʌ
treasure       zju:    zʌ      ʒʌ      zʌ      ʒʊ      ʒʌ      zjʌ     ʒju:    ʒʌ      ʒju:
leisure        zju:    nid     ʒʌ      zʌ      ʒʊ      ʒʌ      zjʌ     ʒju:    ʒʌ      ʒʌ
azure_feat     zju:    zʌ      zɔ      zʌ      ʒʊ      ʒʌ      zjʌ     ʒju:    ʒʌ      ʒju:
rasure         sju:    sʌ      sʌ      ʃʌ      zju:    ʃʌ      zjʌ     zju:    ʃʌ      ʒju:

All the pronouncing dictionaries are consistently rhotic, i.e. they report syllable-final /r/ in forms such as sure. It was found that there is significantly more palatalization when /r/ followed (sure/feature) than when it did not (deuce), particularly in a post-stress syllable, thus even the resistant Spence ( ) has /ʒ/ in closure, pleasure. The more frequent of these palatalized forms (e.g. nature) seem to be the words which have become lexicalized in present-day English.

The nature of the palatalizing phoneme was also relevant. Palatalization occurred in /sj/ in particular, thus it is near-regular in post-stress deuce in Perry ( ), Sheridan ( ), Walker ( ), and Jones ( , ), e.g. /ʃ/ in issue. This is arguably because the high tongue position of palatal /j/ shapes frication noise, producing post-alveolar percepts. Furthermore, /sj/ is the only context which palatalizes in a stressed syllable with any regularity, particularly when in a rhotic context, thus sure, surety with /ʃ/ even in Kenrick ( ), Perry ( ), and Spence ( ).

Stress therefore also appears to have been a conditioning factor, with palatalization generally resisted in the onset of a stressed syllable, as noted explicitly by Walker ( ), and more common in post-stress syllables. Pre-stress syllables also show some palatalization, yielding interesting alternations such as /tj/útor but /tʃ/utórial and ma/tj/úre but ma/tʃ/urátion in Walker ( ). Two other contexts proved to be more conducive to palatalization: word-initial position, thus Sheridan (1780) /tʃ/ in tune, but /tj/ in attune, and before vowel hiatus, thus /tʃ/ in punctual, sanctuary in Sheridan ( ), Walker ( ), and Jones ( ; ), but mainly /tj/ elsewhere.

As with the ‘wh’ study, chronology, geography, and stigmatization also proved to be relevant factors in accounting for the variation.
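The contrast between the deuce and sure sets reported above can be illustrated with a small tally. The sketch below hard-codes two rows as read from the deuce and sure tables (tune_cn and sure_cn, ten columns each); the helper name `palatalized_share` and the decision to treat "nid", "nop" and "unclear" as missing data are assumptions made for this example, not ECEP conventions.

```python
# Two rows as read from the deuce and sure tables, in column order
# Bu, Joh, Ke, Pe, Sp, Sh, Sc, Wa, Jo, Jo.
ROWS = {
    "tune_cn (deuce_a /t/)": ["tju:"] * 5 + ["tʃu:"] + ["tju:"] * 4,
    "sure_cn (sure_a /s/)": [
        "sju:", "sju:", "ʃju:", "ʃju:", "ʃu:",
        "ʃu:", "sju:", "ʃju:", "ʃu:", "ʃju:",
    ],
}

# Labels treated here as "no usable transcription" (an assumption).
NO_DATA = {"nid", "nop", "unclear"}


def palatalized_share(forms):
    """Return (palatalized, usable) counts for one row of transcriptions."""
    usable = [form for form in forms if form not in NO_DATA]
    # A form counts as palatalized if it contains a postalveolar symbol;
    # the affricates tʃ and dʒ contain ʃ and ʒ, so they are covered too.
    hits = sum(("ʃ" in form) or ("ʒ" in form) for form in usable)
    return hits, len(usable)


if __name__ == "__main__":
    for label, forms in ROWS.items():
        hits, total = palatalized_share(forms)
        print(f"{label}: {hits} of {total} sources palatalize")
```

On these two rows the tally gives 1 of 10 for tune_cn and 7 of 10 for sure_cn, which matches the direction of the pattern described in the text; two words are of course not a basis for any statistical claim.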
Palatalization appears to have become increasingly common over the course of the eighteenth century: there is little in Kenrick ( ), but Sheridan ( ; late in career) is the arch-palatalizer. However, the latter’s dictionary was repeatedly singled out for criticism later in the century, as such pronunciations came to be stigmatized (e.g. Jones, : iv). Palatalization consequently became much less common at the end of the century; it is less widespread, but stress-based in Walker ( ; see his Principles , , –64), and progressively even less common from Jones’ second edition (1797) to his third (1798).

In terms of geography, Sheridan’s palatalizing tendencies were attributed at the time to his Irish origin; this contemporary explanation requires further scrutiny as there is little evidence that palatalization was common in the Irish English of the time. There is little palatalization in the Scottish sources, with Buchanan ( ; early source) and Scott ( ) notably having no palatalized forms whatsoever. Spence ( ) from Newcastle also has little palatalization.

Palatalization in the late-middle part of the eighteenth century may have increased due to the earlier restitution of post-consonantal yod in earlier yod-dropped forms, as in the London-based ‘metropolitan pronunciation’ criticized by Kenrick ( ). For example, the earlier sources almost all have yod-dropped /t/ in creature (Johnston, ; Kenrick, ; Perry, 1775), but the later ones have /tʃ/ (Sheridan, ; Walker, ; Jones, ; ). Furthermore, this observation forms part of a further pattern revealed by the database: there were two chronologically and phonologically distinct yod-droppings. The first, mentioned above, notably occurred in the earlier sources after all phonemes /t d s z/ in unstressed syllables before /r/. The second yod-dropping occurred in the later sources in a different context: after any phoneme in a stressed syllable.
Sheridan ( ) is the earliest to do this in the single example dual; Scott ( ) is the most frequent omitter of stressed yod, mostly in fricative contexts and only in the most frequent words, e.g. duty.

Conclusion

In this paper we have presented a new digital resource for the study of English historical phonology: the Eighteenth-Century English Phonology Database (ECEP). The database provides IPA transcriptions for the relevant segment of each example word in Wells’ (1982) lexical sets for the vowel system of present-day English, and some complementary consonant sets, as documented in a selection of eighteenth-century pronouncing dictionaries. We have described the structure and content of ECEP, while reporting on the methodology of compilation: source selection, data input and annotation, and the web-based interface for users. ECEP is already available, but work will continue with a view to enlarging the database gradually.

Originally designed as a sister to the Eighteenth-Century English Grammars database (ECEG, ), on the practical side ECEP will help to promote the use of databases as research resources in historical linguistics, beyond or alongside largely available text corpora. In terms of content, ECEP will contribute to English historical phonology, dialectology and sociolinguistics, with a focus on the eighteenth century, but will also be of use for comparative studies with nineteenth-century English or present-day English. The two case studies outlined in Section demonstrate the potential of ECEP as a resource for investigating the historical phonology of Late Modern English. The database has also been used in studies of the cloth lexical set (Beal and Condorelli, ) and of the use of labels in the enregisterment of non-standard pronunciation (Beal and Trapateau, in prep.). The availability of this resource will ensure that in the future historical phonology will no longer be the ‘poor relation’ of Late Modern English studies (Beal, b: ).
Funding

This work was supported by the British Academy / Leverhulme Trust [SG– ] and the Santander Research Mobility Scheme (calls – and – ). We are also grateful to the Humanities Research Institute at the University of Sheffield for technical support, in particular Michael Pidd (Digital Director) and Ryan Bloor (Developer). Nuria Yáñez-Bouza would like to thank the Spanish Ministry of Economy, Ramón y Cajal scheme [RYC- - ] and the research group Language Variation and Textual Categorisation at the University of Vigo, the European Regional Development Fund [FFI - -P], and the Autonomous Government of Galicia [GPC / ].

References

Primary sources

Buchanan, J. ( ). Linguae Britannicae vera pronunciatio: or, a new English dictionary. London.
Burn, J. ( ). A pronouncing dictionary of the English language. Glasgow.
Johnston, W. ( ). A pronouncing and spelling dictionary. London.
Jones, S. ( ). Genuine edition. Sheridan improved: a general pronouncing and explanatory dictionary of the English language, nd edition. London.
Jones, S. ( ). Sheridan improved: a general pronouncing and explanatory dictionary of the English language, rd edition. London.
Kenrick, W. ( ). A new dictionary of the English language. London.
Perry, W. ( ). The royal standard English dictionary. Edinburgh.
Scott, W. ( ). A new spelling, pronouncing, and explanatory dictionary of the English language. Edinburgh.
Sheridan, T. ( ). A general dictionary of the English language. London.
Spence, T. ( ). The grand repository of the English language. Newcastle.
Walker, J. ( ). A critical pronouncing dictionary and expositor of the English language. London.

Secondary sources

Agha, A. ( ). The social life of cultural value. Language and Communication, : – .
Alston, R. C. ( ). A bibliography of the English language from the invention of printing to the year . Vol. V, The English dictionary. Leeds: Arnold and Son.
ARCHER . . A Representative Corpus of Historical English Registers version . . – / / / / .
Originally compiled under the supervision of Douglas Biber and Edward Finegan at Northern Arizona University and University of Southern California; modified and expanded by subsequent members of a consortium of universities. Current member universities are Northern Arizona, Southern California, Freiburg, Heidelberg, Helsinki, Uppsala, Michigan, Manchester, Lancaster, Bamberg, Zurich, Trier, Santiago de Compostela and Leicester. www.manchester.ac.uk/archer (accessed January ).
Beal, J. C. ( ). The Jocks and the Geordies: modified standards in eighteenth-century pronouncing dictionaries. In Britton, D. (ed.), English historical linguistics . Papers from the th International Conference on English Historical Linguistics, Edinburgh, – September . Amsterdam: John Benjamins, pp. – .
Beal, J. C. ( ). English pronunciation in the eighteenth century: Thomas Spence’s Grand Repository of the English Language ( ). Oxford: Clarendon Press.
Beal, J. C. ( ). John Walker: prescriptivist or linguistic innovator? In Dossena, M. and Jones, C. (eds.), Insights into Late Modern English. Bern: Peter Lang, pp. – .
Beal, J. C. ( ). English in modern times – . London: Hodder Arnold.
Beal, J. C. ( a). Late Modern English. In Brinton, L. and Bergs, A. (eds.), Historical linguistics of English: an international handbook. Berlin: Mouton de Gruyter, pp. – .
Beal, J. C. ( b). ‘Can’t see the wood for the trees?’ Corpora and the study of Late Modern English. In Markus, M., Iyeiri, Y., Heuberger, R. and Chamson, E. (eds.), Middle and Modern English corpus linguistics: a multi-dimensional approach. Amsterdam: Benjamins, pp. – .
Beal, J. C. ( c). ‘By those provincials mispronounced’: the strut vowel in eighteenth-century pronouncing dictionaries. Language and History, ( ): – .
Beal, J. C. and Condorelli, M. ( ). Cut from the same cloth? Variation and change in the cloth lexical set. Token: A Journal of English Linguistics, : – .
Beal, J. C., Nocera, C. and Sturiale, M., eds. ( ).
Perspectives on prescriptivism. Bern: Peter Lang.
Beal, J. C. and Sen, R. ( a). Towards a corpus of eighteenth-century English phonology. In Davidse, K., Gentens, C., Kimps, D. and Vandelanotte, L. (eds.), Recent advances in corpus linguistics: developing and exploiting corpora. Amsterdam: Rodopi, pp. – .
Beal, J. C. and Sen, R. ( b). W(h)o, w(h)en, w(h)ere, and w(h)at? The eighteenth-century pronunciation of ‘wh’. Paper presented at the th International Conference on English Historical Linguistics (ICEHL ), Leuven, – July .
Beal, J. C. and Sen, R. ( ). En[dj]uring [tʃ]unes or ma[tj]ure [dʒ]ukes? Palatalization in eighteenth-century English: evidence from the Eighteenth-Century English Phonology Database. Paper presented at the th Studies in the History of the English Language Conference (SHEL- ), Vancouver, – June .
Beal, J. C. and Trapateau, N. (in preparation). ‘Named and shamed’: labels for non-standard pronunciation in th-century pronouncing dictionaries. In Brownlees, N., Iamartino, G. and Sturiale, M. (eds.), Labelling English, English labelled: from the th century to Late Modern times.
Bloomfield, M. W. and Newmark, L. ( ). A linguistic introduction to the history of English. New York: Knopf.
Cooper, C. ( ). The English teacher, or, the discovery of the art of teaching and learning the English tongue. London.
Dobson, E. J. ( ). English pronunciation – . Oxford: Clarendon Press.
Douglas, S. ( [ ]). A treatise on the provincial dialect of Scotland, edited by C. Jones. Edinburgh: Edinburgh University Press.
Eighteenth-Century English Grammars database (ECEG). . Compiled by Nuria Yáñez-Bouza (University of Manchester) and María E. Rodríguez-Gil (University of Las Palmas de Gran Canaria). www.manchester.ac.uk/eceg (accessed January ).
Hickey, R., ed. ( ). Eighteenth-century English. Ideology and change. Cambridge: Cambridge University Press.
Holmberg, B. ( ). On the concept of standard English and the history of modern English pronunciation. Lund: Gleerup.
Honeybone, P. and Salmons, J., eds. ( ). The Oxford handbook of historical phonology. Oxford: Oxford University Press.
Johnson, S. ( ). A dictionary of the English language. London.
Jones, C. ( ). A history of English phonology. London: Longman.
Jones, C. ( ). English pronunciation in the eighteenth and nineteenth centuries. Basingstoke: Palgrave Macmillan.
Lass, R. ( ). Phonology and morphology. In Lass, R. (ed.), The Cambridge history of the English language, vol. III: – . Cambridge: Cambridge University Press, pp. – .
MacMahon, M. K. C. ( ). Phonology. In Romaine, S. (ed.), The Cambridge history of the English language, vol. IV: – . Cambridge: Cambridge University Press, pp. – .
Mugglestone, L. ( ). ‘Talking proper’: the rise of accent as social symbol, nd edition. Oxford: Clarendon Press.
Oxford Dictionary of National Biography (ODNB). www.odnb.com (accessed January ).
Oxford English Dictionary (OED). www.oed.com (accessed January ).
Ranson, R. ( ). Some aspects of Walker’s methodology: a decisive step towards standardization? Language and History, ( ): – .
Sheldon, E. K. ( ). Walker’s influence on the pronunciation of English. Proceedings of the Modern Language Association of America, : – .
Strang, B. M. H. ( ). A history of English. London: Methuen.
Sundby, B., Bjørge, A. K. and Haugland, K. E. ( ). A dictionary of English normative grammar – . Amsterdam: John Benjamins.
Tieken-Boon van Ostade, I., ed. ( ). Grammars, grammarians and grammar-writing in eighteenth-century England. Berlin: Mouton de Gruyter.
Tieken-Boon van Ostade, I. ( ). An introduction to Late Modern English. Edinburgh: Edinburgh University Press.
Trapateau, N. (2016). ‘Pedantick’, ‘polite’ or ‘vulgar’? A systematic analysis of eighteenth-century normative discourse on pronunciation in John Walker’s dictionary ( ). Language and History, ( ): – .
Wells, J. C. ( ). Accents of English. Cambridge: Cambridge University Press.

Appendices

Appendix I.
Consonant sets and example words

Set      Subset     Example word
deuce    deuce_a    assume, attune, consume, contusion, deuce_cn, dual, dubious, due, duel, duke_cn, duly, dupe_cn, duplicate, duty_cn, exuberant, exude, fiducial, fiduciary, indubitable, obtuse_cn, opportune_a, opportunity, presume, resume, sudatory, sudorous, suicide, suit, suitable, suitor, supine, suture_deu, tube, tuberous, tubular, tuesday, tulip, tumid, tumour, tumult, tune_cn, tunic, tutor, zeugma
         deuce_b    actuary_deu, altitude, amplitude, aptitude, arduous, attitude, bitumen, casual, casualty, consular, consummate, fortitude, fraudulent, glandulous, gradual, incredulous, insulate, issue, latitude, longitude, magnitude, modulate, module, mortuary_deu, opportune_b, punctual, sanctuary, solitude, tissue, visual
         deuce_c    adulation, duplicity, insulation, modulation, sudation, sudorific, superb, superior, superlative, supremacy, supreme, tumultuous, tutorial
feature  feature    azure_ture, creature, feature_cn, fissure_ture, future, leisure, measure_cn, nature_cn, ordure_ture, pleasure, pressure_cn, procedure, rasure, suture_ture, torture_cn, treasure
heir     heir       heir_cn, heiress, herb, herbage, honest_cn, honesty, honour, honourable, hospital, hostler, hour, humble, humorous, humour_cn, humoursome
sure     sure_a     assurance_cn, assure_cn, centurion_cn, cesura_caesura_cn, durable, dure, during_cn, endure_cn, ensure_cn, futurity_cn, insurance_cn, insure_cn, mature_cn, maturity_cn, perdure/perdurable, sure_cn, surety, unsure
         sure_b     actuary_sure, azure_sure, censure, century, closure, composure, fissure_sure, mortuary_sure, ordure_sure, seizure, suture_sure, tonsure, verdure
         sure_c     duration, duress, induration, maturation, mensuration
whale    whale_a    whale, wharf, what, wheat, wheedle, wheel, wheeze, whelm, whelp, when, whence, where_cn, wherry, whet, whether, whey_cn, which, whiff, whiffle, whig, while, whim, whimper, whin, whine, whip, whirl, whisk, whisker_cn, whisper, whist, whistle, whit, white, whither, whitlow,
                    whitsuntide, whiz, who_cn, whole, whom, whoop, whore_cn, whose_cn, why
         whale_b    elsewhere, nowhere, overwhelm, somewhat, somewhere

Appendix II. Example words excluded from Wells’s lexical sets (alphabetic order by set)

Set      Subset     Example word
bath     bath_a     giraffe, shaftesbury
         bath_b     commando, flanders, france, frances, francis, ranch, sandra
         bath_c     can’t, corral, iran, iraq, morale, shan’t, slav, sudan
         bath_f     basque, cleopatra, contralto, glasgow, graph, intransigent, masturbate, plaque, stance, transept
choice   choice_a   --
         choice_b   --
         choice_c   --
cloth    cloth_a    austen, austin, australia, austria, doss, floss
         cloth_b    boston, gloucester, gong, joss, ross
         cloth_c    florida, horrify, laurence_lawrence, moribund, norwich, oregon, tomorrow, warwick
comma    comma      amoeba_ameba, arena, balsa, bertha, catalpa, cinderella, dementia, neuralgia, panda_ , phobia, saga, visa_ , vodka
cure     cure_ai    dour, spoor
         cure_aii   mcclure
         cure_b     bourbon, bourse, gourmand, gourmet
         cure_ci    houri, tourism, tourist
         cure_cii   angostura, anthurium, bravura, huron, muriel, neural, neuron_neurone, sulfuric_sulphuric, tellurium, thurible, truro, ural, uriel
dress    dress      fez, leicester, rev, thames
face     face_a     bouquet, fête
         face_b     aitch, beige, raid
         face_c     --
fleece   fleece_a   grebe, keith, peter, sheila
         fleece_b   aesop, anemic_anaemic, caesar
         fleece_c   casino, chic, elite, prestige, ski, trio, unique, visa_
foot     foot       shouldn’t
force    force_a    chore, crore, galore
         force_bi   borneo
         force_bii  --
         force_c    angora, boron, dora, euphoria, fedora, gregorian, moratorium, moron, nora_norah, thorium, torus, victoria_victorian
goat     goat_a     don’t, gauche, mauve
         goat_b     owen
goose    goose_a    ghoul, moog, schooner, smooch, tarboosh, vancouver
         goose_b    flu, sewage, sleuth
happy    happy_a    birdie, boogie, breathy, budgie, calorie, chilli, corgi, edgy, fluffy, fussy, hibachi, khaki_ , lassie, movie, nazi_ , prairie_ , salami, sari_ , scampi, sortie, spaghetti, strategy, stymie, talkie, taxi
         happy_b    chelsea, hockey, swansea
kit      kit        syria
letter
         letter     indicator, liner, ogre, pallor, scorer, tudor
lot      lot        bother, tom, waffle
mouth    mouth      macleod
near     near_a     --
         near_b     deirdre
         near_c     diphtheria, eerie, madeira
         near_f     colosseum, crimean, galatea, jacobean, korea, maria, sophia, tedeum
north    north_a    thor
         north_b    cavort, corm, dorking, morgan, mormon, morph, morpheme, morphia, morphine, orchid, porn, quartz, thorpe, torque, torso, warsaw, york
         north_c    aural, laura, taurus
nurse    nurse      berth, byrne, earp, erg, liqueur, masseur, twerp, worthing
palm     palm_a     blah, bra, ma, pa
         palm_b     afrikaans, armagh, bach, bahai, baht, botswana, brahmin, brahms, candelabra, couvade, dada, dali, façade, guano, guatemala, guava, ha-ha, iguana, incommunicado, java, kahn, karachi, kava, kraal, laager, lager, legato, llama, lusaka, mafia, mahal, maharajah_rajah, maharani_rani, mahdi, malawi, mali, marijuana, mikado, pizzicato, pooh-bah, raj, roulade, salaam, schwa, shah, somalia, staccato, sumatra, swami, swazi, taj_ , taj_ , transvaal, yokohama, zhivago
         palm_f     aubade, bah, bali, chorale, colorado, enchilada, finale, ghana, khaki_ , khan, koran, lava, locale, nazi_ , nevada, nirvana, pakistan, palaver, panorama, pasha, piranha, plaza, pyjama_pajama, shan, soprano
price    price_a    bicycle, chi, christ, cyprus, eider, glynde, hi-fi, hybrid, kaleidoscope, tried
         price_b    --
square   square_a   ayr, eyre
         square_b   --
         square_c   aquarium, dun laoghaire, eire, mary, pharaoh, prairie_
start    start_a    bazaar, saar
         start_b    aardvark
         start_c    aria, bari, cascara, curare, mata hari, safari, sahara, sari_ , scenario
strut    strut      guthrie, mustn’t
thought  thought_a  auk, maugham, paul, raleigh, taut, vaughan, waugh
         thought_b  --
trap     trap       jazz, math_maths, panda_

Appendix III.
IPA transcriptions for Buchanan’s (1757) and Walker’s (1791) notation systems

[Table with columns Buchanan, IPA, Walker, IPA; the notation symbols and their IPA equivalents are not recoverable from the extracted text.]

‘oi and oy have a mixed sound which is never varied, and sounds like long (i)’ (Buchanan, : ).
D-Lib Magazine
May/June Volume , Number /

A Community of Relations: Mukurtu Hubs and Spokes

Kimberly Christen, Alex Merrill and Michael Wynne
Washington State University
{kachristen, merrilla, michael.wynne} [at] wsu.edu

https://doi.org/ . /may -christen

Abstract

This paper describes the history of Mukurtu CMS and our current project "Mukurtu Hubs and Spokes: A Sustainable National Platform for Community Archiving", funded by the IMLS as part of their National Digital Platform initiative in . This project is an extension of the social, cultural, and technical work of developing the Mukurtu CMS software to the current . . release. Mukurtu CMS is community-driven software that addresses the ethical curation of, and access to, cultural heritage. The Mukurtu Hubs and Spokes grant will create regional centers of support and training and update the software to a . release.
each mukurtu hub will contribute to the software updates and provide local training and support for community users.

keywords: mukurtu, collaborative curation, open-source software, indigenous, digital archives, access

mukurtu alpha: mukurtu wumpurrarni-kari archive

mukurtu started as a grassroots project to manage, circulate, and narrate the warumungu aboriginal community's digital materials using their own cultural protocols. in , after years of collaboration with the warumungu aboriginal community in central australia, dr. kimberly christen accompanied a group of community members to the national archives in darwin. during this visit, the warumungu expressed both tension and relief when viewing the images and documents held in the national archives. the tension centered on the violation of cultural protocols observed by warumungu people in the distribution, circulation, and reproduction of their cultural materials. for example, images of the deceased were displayed online with no warnings; pictures of sacred sites lacked any connection to the ancestors who care for those places; ritual objects were disconnected from their original context. in addition to this archival material, the warumungu community received thousands of photos from former missionaries, schoolteachers, and researchers. these digitally returned materials posed a challenge because they could be reproduced endlessly, accessed more easily, and distributed without consent or consultation. community members who viewed the photos on christen's laptop knew which individuals and families should view which items, how the items should be shared, who held the knowledge intimately connected to the images of places, kin, and ancestors, and whether the materials could be reproduced. the cultural protocols did not need to be spoken; everyone knew what they were and how they should be followed "properly."
what the warumungu community wanted was a platform whose functionality respected their dynamic social and cultural systems, relationships, and cultural protocols for sharing, circulating, and creating knowledge. after evaluating the available commercial content management systems, we discovered a set of unmet needs, including: cultural protocol driven metadata fields, differential user access based on cultural and social relationships, and functionality to include layered narratives at the item level. we were not the only ones to come to this conclusion. based on feedback from native communities in the u.s., the national museum of the american indian (nmai) at the smithsonian institution conducted a survey of commercial off-the-shelf content management systems and came to the same conclusion (hunter, koopman, & sledge, ). between and , christen collaborated with warumungu community members at the nyinkka nyunyu art and culture centre, the papulu apparr-kari language centre and julalikari arts center in tennant creek, nt, australia. the three years of work to define and develop the initial "community digital archive" laid the foundation for what would be central to the growth and sustainability of mukurtu: community collaboration, consultation, and creative production. during community listening sessions, individual and family meetings and work with warumungu people at each of the aboriginal organizations we defined the protocols for use, the interface requirements and the overall goals of the platform. in while viewing one of the demo versions of the archive, michael jampin jones, a warumungu elder and archivist at nyinkka nyunyu, elaborated on the archive's imperative and ultimately its name. mukurtu — he told christen — meant "dilly bag" in the warumungu language. 
the dilly bags held sacred materials and elders kept and protected them as part of their obligations to care for their communities, relatives, places and ancestors, but the elders could not be "stingy" and had to open them up when younger generations asked respectfully. therefore, jampin declared that, like the dilly bag (mukurtu), the digital archive we built should be "a safe keeping place." in , after two and a half years of community consultation, development, and design work we launched the mukurtu wumpurrarni-kari archive as a stand-alone, browser-based, community protocol driven archive, updatable to community needs over time (figure ).

figure : mukurtu wumpurrarni-kari archive

explaining the features and functions in this original version highlights the fundamental needs that run through successive iterations. in addition to basic metadata including a unique id number for each piece of content, dates, names, and places, all material is tagged with a set of "protocols" relating to family relations, gender, and country affiliations. when content is uploaded a specific set of criteria must be considered. which families can see the image (a pull down menu allows families to be added)? is the material restricted to men only or women only? is the image restricted only to those related to specific countries (a pull down menu allows countries to be checked)? is the image sacred and thus restricted to elders only? is anyone in the photo or video deceased? finally, is this material "open" to everyone? in order to filter search queries, all of the material held in the database was linked via the metadata to individual user profiles. these detailed user profiles match community members to the content by linking their social and cultural identities to the content. community members create a user profile the first time they log in. each person enters their given name, nicknames, "skin name" (kin group / subsection), and gender before they choose a password.
following this, each individual connects to their larger kin networks by entering their mother's and father's family, and their parents' country. finally, each individual is assigned one of three status levels: community member, traditional owner, and/or elder. each status has associated levels of access to sacred materials, the ability to add content, and edit materials. for example, only elders can view and edit sacred material, but anyone can add tags to their own collections. similarly, because men and women may not view the same ritual materials, a person logged in as a man would not be able to view women's materials. in a sense, each user views their own "mini-community archive," a relational slice of the archive generated by the communities' own cultural terms and an individual's place in the community, their relationships, obligations and commitments (christen, ). the mukurtu wumpurrarni-kari archive (the safe keeping place belonging to the warumungu people) was our first attempt to encode cultural protocols and social networks into the logic of the platform.

mukurtu beta: the plateau peoples' web portal

as the mukurtu wumpurrarni-kari archive was in use, christen, technologists, and librarians at the washington state university libraries formed a partnership with six tribes in the plateau region. building upon the university's memorandum of understanding (native american programs, ) with the tribal nations, the partnership sought to extend the alpha version of mukurtu to incorporate existing library digital collections (with dublin core metadata), in a web based platform including multiple tribes across several states who share common histories, but also unique tribal values, languages, and collections. to test the concept of a multi-tribal version of mukurtu, the wsu team worked with tribal partner representatives to develop a prototype of the plateau peoples' web portal (the portal).
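the protocol tagging and profile matching described above for the mukurtu wumpurrarni-kari archive can be sketched in a few lines of python. this is a minimal illustration only, not the archive's actual code: the class names, the field names, and the rule that an item with no restrictions counts as "open" are all assumptions made for the example. the handling of images of the deceased is left out because the text does not say how the archive treats them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


# Hypothetical data model: every name below is illustrative,
# not Mukurtu's real schema.
@dataclass
class Profile:
    name: str
    gender: str                                   # "man" or "woman"
    families: Set[str] = field(default_factory=set)
    countries: Set[str] = field(default_factory=set)
    is_elder: bool = False


@dataclass
class Item:
    item_id: str
    families: Set[str] = field(default_factory=set)    # empty = no family restriction
    countries: Set[str] = field(default_factory=set)   # empty = no country restriction
    gender_only: Optional[str] = None                  # None = not gender restricted
    sacred: bool = False                               # True = elders only


def can_view(user: Profile, item: Item) -> bool:
    """Return True if every protocol set on the item is satisfied by the user."""
    if item.sacred and not user.is_elder:
        return False
    if item.gender_only is not None and item.gender_only != user.gender:
        return False
    if item.families and not (item.families & user.families):
        return False
    if item.countries and not (item.countries & user.countries):
        return False
    return True  # assumption: an item with no restrictions is "open"


def visible_items(user: Profile, items: List[Item]) -> List[Item]:
    """The user's own relational slice of the archive."""
    return [item for item in items if can_view(user, item)]


if __name__ == "__main__":
    archive = [
        Item("open-photo"),
        Item("womens-ritual", gender_only="woman"),
        Item("sacred-site", sacred=True),
    ]
    visitor = Profile("example user", "man", {"family-a"}, {"country-a"})
    print([item.item_id for item in visible_items(visitor, archive)])
```

filtering the whole collection through `can_view` is what gives each user their own "mini-community archive": the same database yields a different view for each profile.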
the portal allows members of the plateau tribes to curate their cultural materials held in wsu's collections and at partner institutions at the national anthropological archives (naa), the national museum of the american indian (nmai) at the smithsonian institution, and the northwest museum of art and culture (mac) in spokane, washington. the portal expanded the functionality of the alpha mukurtu platform creating an online, multi-tribal digital archive with more administrative features, extended access management parameters, and differential metadata requirements across fields and between native communities and collecting institutions. figure : plateau peoples' web portal the portal extended the narrative features of the mukurtu wumpurrarni-kari archive to allow for multiple voices through the inclusion of community records, layered context, and diverse forms of metadata at the item and collection level. as we worked through the design and architecture of the database in the first two phases of development, we were not satisfied to simply have a native "comments" section. instead, we wanted an integrated metadata schema that placed native knowledge side-by-side with institutional metadata. rather than follow the "crowd sourcing" approach to social tagging that presumes all knowledge to be equal, the portal highlights the unique knowledge sets of native peoples of the columbia plateau alongside scholars who have contributed to these collections. in the display of every item in the portal, dublin core metadata from the institution sharing the materials is included under the institutions' record and the tribal community, family or individual records are displayed alongside through multiple tabs. tribal administrators have access to the uploaded content, but cannot alter the institutional record and metadata. 
instead, tribal administrators manage the "tribal catalog record" and/or the "tribal knowledge" tabs to add their own metadata, narratives, and audio or video comments. similarly, institutional administrators (at wsu, the naa, or the nmai) can upload and enter metadata, but cannot alter the tribal catalog records or tribal knowledge fields. this system maintains the integrity of both institutional metadata and tribal community metadata while simultaneously showing the sharing of knowledge in multiple directions (christen, ). the administrative system allows institutional metadata and added tribal metadata to be visible simultaneously. in this way, the portal provides the framework for a new mode of collaborative curation, display, and classification that recognizes the significance of integrating multiple sets of standards and information systems. utilizing mukurtu's features, the portal highlights the rich history for each piece of content, linking histories of collection and colonization with those of survival and adaptation. for example, the wsu libraries' edwin chalcraft collection of glass lantern slides sparked a multi-tribal response. the collection, which none of the tribal members had seen before, consists of dozens of slides from the chemawa indian school (salem, oregon). the collection provided us with our first truly inter-tribal case study. as the tribal affiliates pointed out, indian children from across the country, not just the northwest region, attended the chemawa boarding school. in the portal, multiple tribal representatives contributed to the descriptions of the images. for example, one particular slide of the chemawa school bakery includes multiple linked narratives (figure ). vivian adams, yakama nation librarian, contributed a commentary on the image in which she wrote that the "purpose of these schools was to teach indian children how to become 'civilized' by taking them away from their familial ties and cultural influences.
some of the vocations like blacksmithing, the one my grandfather learned as a youth at one government school, quickly became obsolete during the industrial era." added to the same image is a brief sound clip of percy brigham, a umatilla elder, who described eating mush for breakfast that contained worms, but if he didn't eat that 'mush,' he'd go hungry. these layered narratives are all part of a "digital heritage item" within the framework of mukurtu. replacing the notion of the stand-alone item or record with succinct metadata, the digital heritage item shows the overlapping, shared, sometimes competing and always growing conversations around culture, place, and history. figure : community records   mukurtu beta: the plateau peoples' web portal as we developed the plateau peoples' web portal and created a beta version of mukurtu with support from a national endowment for the humanities digital start up grant, christen presented the system's capabilities to many groups: indigenous communities, archivists, librarians, and museum curators. it became clear that there was a similar set of archival and content management needs that linked under-represented and marginalized communities. for example, the squamish nation in canada wanted an archive whose protocols could accommodate their intricate clan and family system; in new zealand the maori needed a system that could deal with extensive iwi (kin-based social networks), and in kenya the maasai sought a system that would allow them to differentiate materials meant for commercial purposes from those meant only for internal circulation; and lgbt archives wanted to protect the privacy of their donors while also providing access to sensitive materials to smaller groups. in every case, these communities requested flexible cultural protocols to manage the distribution, circulation, and reproduction of their cultural heritage. 
they also wanted customizable templates, adaptable user-access levels, and clear intellectual property management tools to make informed decisions about the circulation of their own materials. with an imls national leadership advancing digital resources grant in , the wsu mukurtu team, in a partnership with the center for digital archaeology, civic actions, and kanopi studios, updated the beta version of the platform to produce a stable and more easily upgradeable tool that would be available to more communities. in the process, mukurtu changed from a custom project to a free and open source content management system tool. emphasizing both an agile development model and community engagement, the philosophy behind mukurtu's creation, to build a "safe keeping place," drives all development, functionality and added features. leveraging drupal as mukurtu's base allowed the development team to focus on specific features, functions and areas of emphasis that set mukurtu apart from other cms options. mukurtu is the only cms that provides: customizable, granular cultural protocol-driven access to digital content based on local knowledge systems; pathways for sharing content and metadata between multiple community groups within the platform; layered narratives and curation for materials that go beyond the "item" level to connect content, metadata, traditional knowledge and cultural narratives in one view; flexible and clear licensing and labeling of content; and selective metadata transfer between collecting institutions and indigenous communities using mukurtu's "roundtrip" feature. mukurtu's selective sharing and vetting features, along with the protocol level management and access functionality, balance the cultural needs of communities with the desire of non-indigenous institutions to share content more publicly. in , we launched mukurtu cms version .
in sydney, australia and over the course of the next twelve months grew mukurtu's community, held ten community workshops for training and continued user feedback and testing. in using a community focused software development model, mukurtu cms reached a . release and underwent a theming facelift, a complete code update, and the addition of new features including customization of the front page, community pages, and the addition of customizable traditional knowledge (tk) labels. built directly from community needs and input, the tk labels are a prime example of a feature designed around specific cultural and historical needs. because indigenous communities do not legally own much of their patrimony, traditional or creative commons' licenses do not apply. over two iterations of mukurtu development, we created tk labels to provide context to public domain and third-party owned works circulating to the general public. items or collections within mukurtu cms can have a combination of customizable tk labels including: seasonal (for materials that should only be accessible during certain seasons), sacred (for materials that are culturally sensitive) or attribution (so source communities can be named in addition to other creators of works) to signal to viewers the appropriate forms of circulation and access regardless of the copyright status. figure : traditional knowledge labels figure : traditional knowledge labels   mukurtu hubs and spokes: a community of relations the current mukurtu hubs and spokes: a sustainable national platform for community archiving project grew out of simultaneous conversations from local communities using mukurtu cms and national repositories discussing ways to work with indigenous communities to facilitate content sharing. we also considered national conversations around digital tool building, sustainability, and the technological needs of diverse users, particularly the cultural values and social needs of underrepresented communities. 
an identified goal of the imls national digital program is to champion diversity and inclusion (institute of museum and library services, ). a national digital platform as envisioned by the imls is a decentralized set of tools, services, support and training that aid local communities in doing on-the-ground work while connected to platforms, tools, and workflows. mukurtu cms embodies this idea by prioritizing the diversity of knowledge systems, cultural protocols, and social needs across communities while highlighting the biases coded into seemingly neutral standards and curatorial practices. the continued development of mukurtu cms through the hubs and spokes model will expand the mukurtu cms support structure, enhance the feature set, and grow the user base of mukurtu cms to ensure long term sustainability. the two main goals of the project are to train mukurtu hubs to provide support for their regions and continue mukurtu cms's community development. working in tandem, these goals build sustainability at the community and platform level; indeed, the two are inseparable. over the three years of the project ( - ) mukurtu hubs will become regional support and training centers for the tribal archives, libraries, and museums (the spokes) in their areas. the hubs and spokes will work together with the wsu mukurtu team's community software development model used in previous phases of development to ensure that mukurtu cms remains a grassroots effort, with design, functions and features driven by local needs, and that it reaches a . release by the end of the project. what we found through the process of open source development is that while the standard open source model provides an avenue for growing a development community, mukurtu users do not have the resources, infrastructure, and programming skills to contribute to mukurtu's development in the same way as other open source platforms.
recognizing that these differences are based on limited resources, and an underrepresented set of users, the proposed hubs and spokes model addresses this by maintaining the technical foundation of mukurtu at wsu while collaborating with the hubs to provide additional support and outreach, and the spokes to provide community needs for updates and refinements to the platform. the hubs will work directly with the spokes using wsu's assessment workflow to document feature and functionality needs through user testing and on-going training modules. the hubs — the university of hawaii's libraries and department of linguistics, the alaska native language archives at the university of alaska at fairbanks, the university of oregon libraries, the university of wisconsin's slis program and the wisconsin library services, and yale university's beinecke rare book & manuscript library and the yale indian papers project — all have existing relationships with regional talms that provide a foundation for collaboration and commitments to a sustainable national platform that addresses the ethical curation of, and access to, cultural content. in the first six months of the grant, we focused on start-up activities. these included training the trainers at the mukurtu hubs, articulating community engagement strategies, and building our shared team communication network. the mukurtu hubs have selected their hub managers. the hub manager is primarily responsible for maintaining communication with the mukurtu team at wsu and leading community engagement with the spokes. in march of , the first group meeting was held at washington state university in pullman, washington. this meeting, affectionately dubbed mukurtu cms "boot camp," emphasized all aspects of using and supporting mukurtu cms. we discussed procedures for creating user stories for use in the agile development process, effective bug reporting, and community outreach techniques and successes. 
we focused on both the overall goal of sustainability (in terms of the software and the communities who use mukurtu) as well as the practical aspects of implementing mukurtu while collaborating with tribal partners. the boot camp training included the basics of setting up and using mukurtu, community consultation practices and processes, and induction into mukurtu's community software development model. the software development model process is built from a combination of existing "agile" models that emphasize iterative methods and user engagement, and community building methodologies that rely on mutual respect and knowledge sharing. together these methods form mukurtu's community software development model that builds from on-the-ground, grassroots community needs around sharing, curating, and managing digital heritage materials to build, extend and expand features and functionality. the remainder of will be devoted to further virtual training, community engagement with spokes, developing user stories based on direct community feedback, and initial development sprints.

mukurtu cms: a shared future

over the short life of mukurtu cms, with a small staff and budget, we have developed a truly community-driven platform. we recognize the profound need for technologically and culturally responsive platforms for digital heritage stewardship. over the last five years, during mukurtu workshops and online trainings, through surveys supplemented by informal conversations with stakeholders, we have identified the need for new models for the shared curation of cultural heritage. these models include the vetting of content for cultural sensitivities, support for native languages, and the inclusion of historical context into diverse collections.
the main needs we encountered during these collaborations are described as follows: non-indigenous collecting institutions want to share their collections with indigenous communities and the public, and they know they have some content that is culturally sensitive (sacred objects, images of the deceased, maps which disclose archaeological sites, etc.), but they do not have a way to share the content online while respecting these cultural differences, nor do they know the best channels for contacting community representatives for vetting of the materials. indigenous collecting institutions wish to make their own collections available to their communities online, but need a way to define access based on local, internal protocols. indigenous institutions know that non-indigenous collecting institutions hold materials pertaining to their tribe and they would like to add their own metadata to those collections online through their own cms. without a system that can facilitate easily moving not just content but also metadata between databases, tribes have limited access to these institutional collections for their own use, reuse and community access. mukurtu hubs and spokes will build capacity for individual mukurtu cms installations around the country, and by doing so will increase support and training opportunities for communities doing this significant cultural work. while this is happening there is a concomitant need to bring national repositories into this community-driven work of collaborative curation and ethical sharing of indigenous cultural materials through a digital return process.
the wsu mukurtu team just completed a planning grant funded by the andrew mellon foundation: "mukurtu shared: a portal for connecting collections and native communities through a collaborative curation model" to scope the technical, functional, training and community-related needs to build a platform for aggregating, curating, and sharing native american library and archive collections online by connecting tribal and national repositories. the result is a roadmap for expanding the sharing capabilities within mukurtu cms to facilitate ethical and engaged content and metadata exchange between institutions and existing aggregators. instead of content being pulled into aggregators without a cultural vetting process, the "mukurtu shared" workflow model proposed is designed to promote trust at all levels of content management and sharing by extending the notion of security past technical checks, to include cultural and ethical checks into the aggregation process. the roadmap includes: preservation micro-services, exposing a mukurtu api for easier feature development, and enhanced support for complex metadata import and export (mets, mods, ead). in addition to the technical roadmap, the planning meetings showed the need for expanded training (imagined through a cohort of mukurtu fellows), an extended and documented collaborative curation workflow for institutions and shared resources for long-term partnerships between native communities and collecting institutions. the future of mukurtu is embodied in these interrelated projects that move between local, national, regional and global contexts to imagine a different type of curation process. like the dilly bag from which this project takes its name, we aim to open up spaces for sharing within a network of relationships that obligates us to act respectfully, to listen responsively, and to learn reciprocally.   references [ ] christen, k. ( ). opening archives: respectful repatriation. american archivist, ( ), - . 
[ ] christen, k. ( ). does information really want to be free?: indigenous knowledge systems and the question of openness. the international journal of communication, , - . [ ] hunter, j., koopman, b., & sledge, j. ( ). software tools for indigenous knowledge management. museums and the web . [ ] institute of museum and library services. ( ). imls focus : the national digital platform summary report. [ ] native american programs. ( ). washington state university memorandum of understanding.   about the authors kimberly christen is an associate professor and the director of the digital technology and culture program, the director of digital projects, native programs, and the co-director of the center for digital scholarship and curation at washington state university. she is the founder of mukurtu cms, the co-director of the sustainable heritage network and the local contexts initiative. more of her work can be found at her website.   alex merrill is the head of systems and technical operations for the washington state university libraries and the director of technology for the center for digital scholarship and curation at washington state university (cdsc). alex studied history and computer science at ball state university and received his mlis from the university of arizona in . alex began working with mukurtu cms (and the plateau peoples' web portal) in .   michael wynne is the digital applications librarian in the center for digital scholarship and curation at washington state university. michael received a bsc in linguistics from the university of victoria and an mlis from the ischool at the university of british columbia in . michael is the primary support specialist for mukurtu cms, and has been working with mukurtu, the plateau peoples' web portal, and other projects at wsu since .   
copyright ® kimberly christen, alex merrill and michael wynne

white paper report report id: application number: hd project director: andrew torget (andrew.torget@unt.edu) institution: university of richmond reporting period: / / - / / report due: / / date submitted: / /

visualizing the past: tools and techniques for understanding historical processes a white paper for the national endowment for the humanities

andrew j. torget university of north texas (formerly at the university of richmond) andrewtorget@gmail.com james w. wilson james madison university wilsonjw@jmu.edu

in september , the university of richmond (in partnership with james madison university) was awarded a national endowment for the humanities level digital humanities start-up grant (award #hd- - ) to convene a two-day workshop of leading scholars and experts working on creating visualizations of historical events, processes and datasets. more than twenty experts took part in the workshop held at the university of richmond on february - , . participants in the workshop presented cutting-edge work in historical visualizations and took part in wide-ranging discussions about the state-of-the-field and the challenges in expanding the reach and capacity of such research. following the workshop, the university of richmond and james madison university conducted follow-up work to extend further our understanding of the current state-of-the-field. this white paper documents the opportunities and challenges of historical visualization research, the workshop, subsequent work performed at the university of richmond and james madison university, and the outcomes of the workshop in pushing forward historical visualization research. workshop website: http://dsl.richmond.edu/workshop

table of contents
• project overview
• call for papers
• the workshop: february - ,
• common issues identified
• lack of data-sharing
• subsequent work at the university of richmond and james madison university
• outcomes
• recommendations for future work
• appendix : call for papers
• appendix : list of participants
• appendix : workshop schedule of events
• appendix : sample survey of historical visualizations online

project overview

this project began with a simple question: how can we advance the work of people seeking to use digital tools to visualize complex historical processes? as a historian and geographer, we were both well aware of the growing potential for historical visualization techniques to transform our disciplines. we have both devoted much of our own scholarship toward such work, and we knew from personal experience that a growing number of our colleagues were considering doing the same. some of this movement within the humanities and social sciences toward digital visualization tools reflects the fact that nearly all historical data has some spatial component to it. every historical newspaper, census record, manuscript, battlefield report, audio recording, and photograph came from a particular place, and often documented multiple other locations. access to tools that can plot the spatial dimensions embedded in these various datasets, particularly as those patterns move across the landscape of the past, can help scholars to better analyze and understand what their sources can tell them. others find themselves drawn to visualization research techniques as a result of the digital revolution. with digitization efforts, both in the public and private sectors, making increasing amounts of historical data available, scholars find themselves in need of new ways to sort that information.
the problem for many scholars working with historical data has begun to shift from “what do you do with too little?” to “what do you do with too much?” in the age of google, when scholars can access millions of historical sources through digital media, the challenge has become to find ways to discover meaningful patterns in such massive quantities of information. visualization techniques, many have recognized, offer a method of sorting overwhelming amounts of data.

within our own work, we had experienced both of these phenomena. as such, we had also experienced some of the common challenges besetting anyone attempting to incorporate visualization tools and techniques into their work with historical sources. some examples from our own experiences include:
• historical data is nearly always incomplete, leaving gaps within the available information that are difficult to represent properly in digital visualizations.
• arcgis, the standard in digital cartography, is a terribly complex and difficult-to-learn software package.
• gis tools tend to be poorly equipped to deal with change over time.
• if not generated locally, historical datasets nearly always have to be downloaded from a source to a local machine to be incorporated into a tool like gis. as such, that data usually has to be reformatted to a new standard in order to be compatible with the project.

we had, moreover, concluded that many of these challenges shared two underlying problems:

(1) because the dominant gis tool, arcgis, has been designed for other audiences, it still lacks sufficient tools to adequately address the needs of humanities scholars. nearly all spatial research is performed within arcgis, largely because it is the most robust and adaptable set of tools for analyzing issues of space. as such, most scholars employ arcgis for historical research that incorporates spatial analysis. the challenge in using arcgis in this context comes from two directions.
first, because the programs involved are highly powerful, they are also highly complex to learn. the learning curve alone, thus, prevents the widespread adoption of such techniques in humanities research. second, arcgis was not developed originally to deal with humanities-related questions. thus the software tends to emphasize exploring datasets within a specific timeframe, rather than across time. that has recently begun to change with the development of new arc tools aimed toward this very problem.

(2) there is little cyber-infrastructure aimed toward supporting the rapid sharing and dissemination of these various historical datasets in a manner that makes them easy to exchange (and thus promotes enhanced research possibilities). although increasing numbers of scholars are both generating datasets and using digital tools to analyze them, there is very little in the way of tools or techniques that enable these researchers to share their materials. in other words, most of the datasets being generated cannot talk directly to any other datasets. the result is that nearly every scholar using these materials must go through a laborious process of collecting various datasets, reformatting them to match a single standard for his or her own research, and then using cumbersome tools like arcgis to analyze them. there is little in the way of infrastructure, support, or adopted techniques that would allow scholars to grab datasets distributed across various systems and perform their work on-the-fly, speeding up their efforts. the inability of scholars to share data, it seemed to us, was the most widespread and significant challenge facing scholars hoping to incorporate visualization work in their research.
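the reformatting burden described above can be made concrete with a small sketch. it is only an illustration under stated assumptions: the two "projects", their field names, and the values are invented placeholders (not real census figures and not the format of any project discussed at the workshop). it shows how each source needs its own adapter before records can be compared in a shared (place, year, population) layout.

```python
# Hypothetical illustration: two projects store the same county-level census
# count in different layouts. All field names and values are invented
# placeholders, not real census data or real project schemas.

def from_project_a(row):
    """Adapter for hypothetical project A, which uses flat, named columns."""
    return {
        "place": row["county_name"].strip().lower(),
        "year": int(row["census_yr"]),
        "population": int(row["total_pop"]),
    }


def from_project_b(row):
    """Adapter for hypothetical project B, which packs place and year into one key."""
    place, year = row["key"].split("|")
    return {
        "place": place.strip().lower(),
        "year": int(year),
        # Project B formats counts with thousands separators.
        "population": int(row["pop"].replace(",", "")),
    }


def merge(records):
    """Group normalized records by (place, year) so the sources can be compared."""
    merged = {}
    for rec in records:
        merged.setdefault((rec["place"], rec["year"]), []).append(rec["population"])
    return merged


a_rows = [{"county_name": "Example County", "census_yr": "1850", "total_pop": "1200"}]
b_rows = [{"key": "example county|1850", "pop": "1,200"}]

normalized = [from_project_a(r) for r in a_rows] + [from_project_b(r) for r in b_rows]
combined = merge(normalized)
print(combined)
```

every additional source needs another hand-written adapter of this kind, which is the cost that shared formats and interoperability standards are meant to remove.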
having identified both the potential for expanding the possibilities of visualization of historical resources and the challenges preventing many scholars from undertaking such work, we decided to convene a workshop where we could foster concentrated dialogue about these issues among leading scholars working in historical visualization. as such, the purpose of this project was rather simple: we wanted to identify the current state of the field, to assess current practices and needs, and to begin discussions about how to help these scholars move forward with sharing data and connecting their various efforts.

call for papers

we began by issuing a call for proposals for participants for a workshop to be held on february - , , at the university of richmond. we asked for abstracts of - minute presentations of ongoing projects that would address two central questions:
• how can we harness emerging cyber-infrastructure tools and interoperability standards to visualize, analyze, and better understand historical events and processes as they spread out across both time and space?
• how can user-friendly tools or web sites be created to allow scholars and researchers to animate spatial and temporal data housed on different systems across the internet?

the response was impressive, drawing over proposals from all over the world. the large number of proposals reinforced impressions we had both developed about the widespread desire within the humanities and social science communities to expand visualization research in order to promote enhanced research and dissemination in the digital age.

we also formed a board of advisors made up of experts in the field, to assist us with evaluating the most promising proposals. the board consisted of david arctur (open geospatial consortium interoperability institute), edward l. ayers (university of richmond), peter k.
bol (harvard university), david schell (open geospatial consortium), terry slocum (university of kansas), and richard white (stanford university). (detailed biographies of the board of advisors can be found in “appendix : list of participants.”)

with the advice of our board, we invited more than presenters (representing the united states, canada, england, and the netherlands) to come to richmond, virginia, for the two-day conference. scholars selected to participate in the workshop represented a wide array of backgrounds and disciplines, including historians, geographers, computer scientists, statisticians, anthropologists, animators, programmers, and journalists. in addition to scholars affiliated with traditional universities, we also brought in various representatives of the private sector. microsoft and google both accepted invitations to present their latest work in visualization technology, as did the private firm bbn technologies. we also hosted representatives from the non-profit sector (including the open geospatial consortium and open geospatial consortium interoperability institute, non-profit consortiums directed toward developing interoperability standards in supporting gis work) and government agencies (including the ordnance survey, great britain's national mapping agency). (detailed descriptions of each presenter can be found in “appendix : list of participants.”)

the workshop: february - ,

the workshop itself was organized around successive sessions of two to three presentations, followed by a round of discussion. the relatively small size of the group allowed for wide-ranging discussions in a structured, but open-ended, setting. small discussions followed each presentation, with a larger and more comprehensive dialogue following the end of each session. the first day’s sessions covered a wide array of current work in the field:
• j. b.
owens discussed the efforts of idaho state university to establish a lab for the use of gis in historical geography, and the challenges of developing research techniques for addressing the inherently ambiguous nature of historical datasets.
• chris weaver and may yuan discussed new software packages for analyzing historical datasets, as well as multiple possibilities for visualizing both historical landscapes and textual landscapes (that is, relationships embedded in language patterns).
• jeanette zernecke discussed the efforts of the electronic cultural atlas initiative to develop and promote platforms for sharing and preserving historical artifacts and information by using time and space as organizational tools.
• rafael alvarado discussed the use of rdf as a means of organizing historical data in a manner that could promote its sharing and exchange for visualization research.
• kurt rohloff discussed recent data-mining efforts at bbn technologies to use historical sources in order to create algorithms for predicting the geographic locations of future cultural events.
• david arctur and phillip dibner discussed interoperability standards for gis developed by the open geospatial consortium (ogc) and the open geospatial consortium interoperability institute (ogcii) and their applicability toward promoting historical gis efforts.
• david bodenhamer discussed efforts to translate gis’s capabilities into web-accessible interfaces, and the challenges and limitations that exist for scholars who seek to distribute such work online.
• peter l. pulsifer discussed developing a cyber-infrastructure to support “cybercartography” as a means of analyzing historical data through visualization, and the use of ogc standards to promote such work.
• charles van den heuvel discussed two projects aimed toward developing new techniques in the annotation and contextualization of manuscript maps, focusing on the need to develop technology that enables scholars to better analyze these maps while also preventing the casual distortion of those maps through digitization.
• max edelson and alan craig discussed efforts to geo-locate map manuscripts effectively without using proprietary gis software, while at the same time making such maps available through kml as a means of delivering them online.
• josh wall presented and discussed the “surface” environment developed by microsoft to promote collaborative work in a visualization-rich environment.

the second day of the workshop continued with the following presentations:
• peter k. bol discussed various projects underway at harvard’s center for geographic analysis, emphasizing the deep-set need for the development of better tools (such as historical gazetteers) for programmatic identification of place names in historical sources (particularly in languages other than english).
• jon christensen discussed the development of the spatial history project at stanford university, and their efforts to combine gis, cartography, and spatial analysis through a variety of software platforms and tools, all aimed toward moving beyond the confines of arcgis.
• hadley wickham discussed using the statistical analysis tools available as “r” toward developing visualizations of information sets, focusing in particular on the problem of dealing with incomplete datasets (a near constant when dealing with historical data).
• sorin matei discussed the possibilities for developing immersive visualization environments for exploring historical datasets, as well as incorporating those datasets into rich kml layers for distribution through google’s mapping and visualization platforms.
• carsten roensdorf discussed the development of citygml by the united kingdom as a means of developing more semantically rich data structures for the preservation and dissemination of historical data about buildings and spatial relationships within cities.
• mano marks discussed google’s wide-ranging platforms of tools available for geospatial plotting and sorting of information sets, concentrating on the capabilities of keyhole markup language (kml).

common issues identified

the projects presented at the workshop represented wide-ranging approaches to visualizing historical datasets, usually toward very different ends and goals. yet there were common threads that ran throughout the presentations and the discussions they elicited. foremost, of course, was the common approach to using visualizations to explore both large and complex datasets as they move across both time and space. there was a shared sentiment among those assembled, both in the presentations and the discussions that followed, that visualizations offer scholars the ability to make sense of various sorts of data (and the data analyzed in the various projects ranged quite widely, from chinese to english, from newspapers to videos, from manuscripts to audio recordings) in ways that enable researchers to detect embedded patterns more easily, organize their information, and generally make sense of complex problems.

in terms of the data itself, one of the most common challenges identified in creating visualizations of historical information is the problem of ambiguity. throughout numerous presentations and discussions, the problem of incomplete datasets due to incomplete historical information was identified by workshop participants as a looming and unresolved issue within historical visualization work. nearly all historical datasets are problematic because of their incomplete nature (this is, in fact, a problem with nearly all large datasets, as one participant, hadley wickham, pointed out).
what remains unclear is the best method for addressing (and being transparent about) those gaps in the data when creating visualizations. different projects handled that question in very different ways: some found explicit means for identifying missing data (thus drawing attention to that issue for the reader), while others simply ignored those gaps in their representations (thus allowing the reader to imagine a completeness to the data that might not in fact be the case). within the field, there remains no clear consensus on how to address such issues, and no accepted (and thus clearly recognizable) method for alerting readers to where gaps in the data occur in a given visualization of historical information.

indeed, an area that was not explored in depth at the workshop but was seen as highly important for future work related to visualizing the past relates to spatial and temporal reasoning and cognition. specific areas brought up at the workshop include the concept of precision and the need to develop ways to evaluate the vagueness of qualitative data and portray margins of error inherent in incomplete historical data. workshop participants also expressed interest in developing better ways of using natural language and meaning representation in conveying information, and in particular the use of a narrative model of presentation. there was also an interest in going beyond spatially referenced visualizations to use visual vocabularies.

in terms of the tools available for creating visualizations, esri’s suite of tools for gis is clearly the most widely adopted for historical visualizations and analysis in use today. in fact, there is nothing else that comes close to the dominance of esri’s toolsets in this work, and the vast majority of the projects undertaken by the workshop participants incorporated some form of formal gis software (usually arcgis).
permutations of gis found their way into the presentations both in direct form (such as bodenhamer’s discussions of the challenges in translating gis capabilities to a web-ready environment) and indirectly as a means of organizing information that was then later presented through another tool (such as in ferster’s discussions of developing a flash-based tool for historical visualizations, which can incorporate gis shapefiles). much of this, it seems, reflects the fact that the majority of these projects seek to pin information down to specific geographic locations, for which arcgis remains the standard. yet even projects that do not have a specific landscape upon which to layer data (for example, a project being undertaken by yuan to begin to use gis to map the language patterns embedded in large collections of digitized texts) are using gis software as a means of organizing datasets in spatial dimensions.

the dominance of arcgis as the toolset for performing geospatial analysis means that the limitations of those same tools prevent some scholars from taking advantage of historical visualization research. although esri has several tools and platforms for making gis files accessible on the web, many of the participants at the workshop noted the challenges of translating the capabilities of arcgis (such as layering multiple datasets over a particular landscape) into a web-ready environment. other participants discussed the challenges of representing change over time, and of animating data patterns as they evolved over a time-series, through software that was designed primarily to address a single time and place at any given instance. indeed, many of the projects presented sought to address these difficulties by creating new software and/or platforms to fill in these gaps in capabilities.
simply put, there is nothing that comes close to competing with the capabilities of esri’s suite of gis software, and thus the vast majority of scholars use the software for geospatial and visualization research. yet the fact that these tools were not designed specifically for the humanities means that they are sometimes unable to meet the research needs of such scholars.

the second most common tools seen during the workshop were the various platforms offered by google (primarily through the use of kml in google’s maps and earth projects), in large measure because they appear to offer highly accessible solutions to perceived limitations of arcgis. kml, for example, can easily be translated into a web-environment, has a tremendous amount of flexibility due to its stripped-down simplicity, and can be easily distributed and disseminated across a wide-ranging variety of platforms. the openness of kml easily lends itself to sharing of data (an indispensable consideration in the support of scholarship) and the adaptation of a digital project into various web-based interfaces. moreover, there are a large (and ever-increasing) number of software programs that can translate gis shapefiles into kml files, making the transition from a gis environment to a kml environment an increasingly easy endeavor.

several projects presented at the workshop, in fact, combined gis and google’s kml platforms. the combination of these two tool-sets appears to be becoming a standard for layering historical information across a landscape (done with arcgis) and then disseminating that information in another format that is highly accessible to a wide audience (through kml and google’s platforms). kml also enables scholars to take advantage of google’s tools for time-lapse animation, something that is now also available in arcgis software.
that enables researchers to take gis layers developed for different times and pull them together into a geo-referenced time-series, allowing the user to analyze information as it changed over both time and space simultaneously and therefore extending their ability to reconstruct patterns embedded in their historical datasets. many of the scholars at our workshop had identified these as key advantages of combining their work in gis with kml as a delivery system, and the two tool-sets figured prominently in a large number of the projects presented.

the advantages of kml as a delivery (and thus dissemination) tool for historical visualizations are further extended by the fact that google provides multiple platforms upon which to distribute and disseminate these datasets. rather than create and maintain the platforms upon which kml would operate, these scholars relied instead on the platforms provided by google through the google maps api and earth interface. in both cases, this enabled scholars to move more quickly in developing their projects and saved them time and money in the process. the clear downside, and something that was voiced during the workshop discussions, was the fact that google has no long-term vested interest in the support of any of these projects, and these particular tool-sets could disappear if the fortunes of google change significantly. although kml is open, google’s apis are not, and the advantages in cost and speed of development in the use of these google tools are to be weighed against the lack of control over the sustainability of these tools.

adobe’s flash was the third most common tool that scholars at the workshop used in their work, usually for the translation of gis work into a web-ready format specific to particular research needs.
much like kml filtered through google’s platforms, applications built on top of flash are being created by various scholarly projects in order to construct platforms upon which historical visualization work can be more effectively shared and disseminated online. these efforts are invariably more labor- and cost-intensive than using google’s prefabricated platforms, but they are also more customizable than what google makes available and therefore can be more carefully calibrated to the questions and needs of a particular researcher. bill ferster’s presentation, for example, delineated how a flash-based application might incorporate gis shapefiles into a series of historical visualizations that can be shared widely online.

in all cases, the most commonly used tools were proprietary (arcgis, flash, google’s apis) or associated with a for-profit entity (kml with google). as such there are currently a number of significant obstacles inherent in sharing historical data stored in such formats and shared on such platforms. there are few platforms available to scholars that have been developed specifically for the sorts of questions that humanities and social science researchers ask. the most widely-used exceptions to this would be the tool-sets developed by mit’s simile project (specifically timeline and timeplot), although both of those tool-sets showed up only occasionally in the projects discussed at the workshop.

technical areas for future research identified at the workshop covered a range of topics from data development needs, to storage methods, to software enhancements. in the area of data development there was widespread interest in encouraging the development of “framework” type layers through time (including political boundaries, place names (gazetteers), vegetation, transportation, hydrography, etc.) to enable more detailed analysis of various topics and to facilitate comparative studies.
there is also a need for advances in, and standardization of, database schemas and ontologies that relate to and allow the linking of textual, numeric and visual data. advances in software development that were identified include the development of more dynamic gis that can handle processes better. the research promise of a distributed network of systems was discussed, with several participants calling for better ways to discover and utilize distributed data, tools, analytical processes, and annotations that are contributed by multiple users of a system. such systems should allow for the capturing of queries and data exploration as people are working with systems to build analytical stories. a better understanding of, and methods for, linking nested scales of analysis were also thought to be important. some specific utility applications that were identified were automated georeferencing of scanned maps, automated feature extraction, and temporal editing tools for kml and other data formats.

the importance of metadata was also discussed at the workshop in detail. the consensus was that this should go beyond merely describing the source and method of producing a data set to tracing how the builders or users of a system have queried the system and made choices, in order to follow the logic of the narrative that is told, as well as to let people change the choices that were made and come up with their own narrative. there is also a need to address the issue of metadata’s role in “archiving” online resources for future exploration. in addition to making data available as formats and media change, people should also have access to the logic of systems in order to fully understand the data that went into and the output from such systems.

lack of data-sharing

one of the primary challenges holding back advances in historical visualization research, as identified during the workshop, was the lack of data-sharing among historical projects.
that is, nearly every project maintained the historical data it was using for its visualizations in a particular format based on the unique needs of that particular project. a given project, for example, might store census and voting data in a particular way that differs significantly from how another project might store or format the same data (probably because different projects ask different questions of the data, and therefore have different needs for its storage and formatting). thus if multiple projects use the same set of data (such as the u.s. census returns), they are nearly always going to create multiple versions of that same set of data, each customized to a particular project and therefore unable to be shared with other projects. the result is that historical data tends to be digitized into unique datasets that nearly always remain solitary silos, rather than datasets that can be widely shared and disseminated among other interested scholars.

as discussed in detail at the workshop, this is a common and far-reaching problem among those working in historical visualization research. a given researcher working in the field must invariably download necessary datasets from multiple sources and then translate them into new formats that meet the requirements of their own particular research. this is a time-intensive, and therefore expensive, endeavor. the process also usually creates a new set of data (one that combines multiple datasets into a new one that has value for having combined disparate information) that is also in a format that is unique enough to be difficult to share. the result is that scholars are not able to reach common datasets (again, census and voting returns being some of the most obvious of those widely used in numerous visualization projects) held on various servers and access them on-the-fly through something like an api.
the ability to do so (to grab datasets available on various servers and then immediately translate them into the forms needed for your own project, without having to download and manipulate them on a local machine) would greatly accelerate the ability of scholars to develop new historical visualization projects.

there are, in fact, standards that exist for such interoperability. as discussed in detail at the workshop, the open geospatial consortium (ogc) and the open geospatial consortium interoperability institute (ogcii) have been developing such standards for sharing gis data for some time, and they continue to develop and promote such standards as a means of promoting spatial research among scholars. one of the biggest challenges, however, seems to be a pervasive lack of awareness of such standards among scholars working with and creating historical visualization datasets.

while workshop participants were asked specifically to address the use of distributed systems, it became apparent that little has actually been accomplished to date to integrate and harness the exciting visualization work that is going on in ways that make it easy for people to integrate and visualize data on different systems on the internet. part of the reason for this is that most of the work presented at the workshop was project focused and the projects did not require the use of distributed data stores. where data was acquired from multiple sources it was typically processed locally before being analyzed and visualized.
another aspect of the lack of work in this area is the relatively recent development of technologies that can facilitate visualization from distributed data stores, and the fact that this work is going on in niche areas typically removed from humanities scholars.

the problem of interoperability, and promoting the awareness of existing standards and the development of new ones, emerged as one of the primary challenges facing scholars interested in the visualization of historical datasets.

subsequent work at the university of richmond and james madison university

at the conclusion of the workshop, teams at the university of richmond and james madison university worked to develop and further explore themes and ideas identified during the workshop itself.

university of richmond

at the university of richmond, this work was undertaken at the digital scholarship lab (dsl), a humanities-focused research lab aimed toward advancing disciplines such as history through digital technology. the dsl attempted to extend the lessons learned at the workshop through two specific endeavors:

(1) survey of projects available online that incorporate visualizations of historical datasets. we wanted to test whether the sampling of projects presented at the workshop was, indeed, representative of the work currently available online. so we conducted an informal survey of projects currently presented online, cataloging a sample of around one hundred online projects that incorporate visualizations of historical datasets in one capacity or another. we conducted this survey during the late spring and summer of , cataloging the sites surveyed into “appendix : sample survey of historical visualizations online.” the results of this survey confirmed the patterns we had identified during the workshop. although it was sometimes difficult to decipher what technology had been used to create the visualizations in particular projects, when we could learn about their methodology the same patterns emerged.
arcgis served as the standard in assigning data to a particular geography, with kml, google’s platforms, and flash serving as the primary means of disseminating the work of these projects and their historical visualizations online. similar to the workshop itself, there were various exceptions (such as incorporations of mit’s simile project tools). yet the overwhelming proportion of those projects adopted what appear to have become the standard tools in the field: arcgis, kml/google, and flash.

the survey also further confirmed the widespread interest in the use of visualizations to make sense of large sets of historical data. some of the most widely imitated sets of historical visualizations noted in the survey come from the graphics team at the new york times online edition, where they have used flash and gis to deliver compelling visualizations of various historical phenomena as they changed over both time and space. other projects seek to use visualizations to comprehend patterns embedded in large datasets that are becoming ever-more readily available, such as census and voting returns as they changed over time.

(2) further development of historical visualizations that attempted to incorporate the three toolsets most commonly identified in the workshop: arcgis, google’s kml and mapping platforms, and flash. much of this mapping and visualization work took place within the context of work performed on the dsl’s on-going project, history engine: tools for collaborative education and research <http://historyengine.richmond.edu>. this project collates historical narratives written by undergraduates in classrooms across the country. the historical narratives (which we call “episodes”) each have timeframe and location metadata assigned to them, so we had been experimenting with developing methods of mapping and visualizing that historical data across both time and space.
building on the discussions at the workshop, we sought to create new visualizations of these sets of information. (this work also built off previous work performed at the digital scholarship lab at the university of richmond, specifically with the development of voting america: united states politics, - <americanpast.richmond.edu/voting/>.)
• flash map for browsing historical data: <historyengine.richmond.edu/map/>. this mapping interface takes gis shapefiles and translates them into a format that can be displayed online through a flash-based interface. this work was both expensive and laborious, largely because we had to replicate in flash many of the capabilities that exist in arcgis in order to make those capabilities available online. the results, however, were rather impressive: a malleable interface that enables the user to quickly, efficiently, and succinctly manipulate large amounts of historical data that is being pulled from a constantly updated database of student historical narratives. flash also enabled our project team to demonstrate change over time effectively, and to customize the interface directly toward the needs of our particular project.
• google kml interface for browsing historical data: <http://historyengine.richmond.edu/location>. we also created a kml version of our browsing interface that relied on google’s mapping platform. in this we translated gis files into kml, then adapted the google tools from the mapping api to create a new interface that also pulled from the constantly updated database of student historical narratives. rather than attempt to replicate many of the capabilities of arcgis, this effort built upon google’s existing platforms and tools. this approach was, therefore, much more cost-effective than the earlier efforts at the flash map, and was accomplished on a much faster timetable.
there was a great deal less that we could customize in this approach (since in flash we could create anything we wanted, whereas with google’s tools we had to choose from existing capabilities), but the results remained quite satisfying. another example of a dsl project working to incorporate gis and kml in analyzing historical datasets online is the redlining richmond project <http://americanpast.richmond.edu/holc/>. this project sought to take data about the geography of the city of richmond during the s and the efforts of the home owners' loan corporation (holc) during the new deal to determine which areas were eligible for federal loans. the work of assigning historical data to locations was performed in arcgis, as it proved to be the most effective and accurate method for doing so. these gis shapefiles, however, were translated into kml and incorporated into an interactive map using google’s mapping api. the result is a highly effective interactive map of historical data that was generated quickly and easily disseminated online. project experiments reinforced central ideas voiced among participants at the workshop: the precision of arcgis in locating geospatial information remains unsurpassed among available toolsets for such work. yet kml and flash offer some of the most effective means for disseminating such work online. flash offers more flexibility in the creation of an online interface, while kml and google’s toolsets offer greater speed and efficiency in creating and completing such projects. james madison university work at james madison university included working on the white paper with the university of richmond, and on researching, documenting, and analyzing tools and standards identified at the workshop and through subsequent exploration.
given the interest in kml and its ability to encode temporal information, faculty and students at jmu utilized google earth (http://earth.google.com/) and arcgis explorer (http://www.esri.com/software/arcgis/explorer/index.html) to view multiple time-enabled kml datasets simultaneously. both viewers were able to display the files and provide a timeline tool to visualize the data through time; however, some of the files did not display properly in arcgis explorer. we were not able to determine whether the files that were not viewable in arcgis explorer were improperly formatted or whether there were bugs in the software. two additional issues emerged from this work: ( ) the lack of an easy and systematic method of discovering time-enabled kml files and what time period they represent, and ( ) the limited functionality of the temporal tools available in both packages. the remainder of the effort focused on other data and interoperability standards related to building standards-based and internet-accessible servers in support of spatial and temporal visualization. an international effort has been underway for some time to establish standards and best practices in the development of spatial data infrastructures (sdi), and is being championed at the international level by the global spatial data infrastructure association (http://www.gsdi.org/). sdis are designed to allow for easy and standard methods of discovering and disseminating geospatial data across multiple organizations. the gsdi organization includes many international partners and has developed a “cookbook” (http://www.gsdi.org/gsdicookbookindex) for organizations wishing to develop sdis that integrate with others across the globe. the cookbook is based on internationally accepted standards primarily developed through ogc and iso processes.
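the first issue noted above, finding out what time period a time-enabled kml file represents, can be illustrated with a short sketch. it is not the method used at jmu; it is a minimal python example that assumes the file follows the kml 2.2 namespace and encodes time with `TimeSpan` (`begin`/`end`) or `TimeStamp` (`when`) elements. the sample document, the placemark names, and the function name `time_extent` are hypothetical, and the comparison assumes plain iso-style dates with four-digit years and no time-zone offsets.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical sample: one placemark with a TimeSpan, one with a TimeStamp.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>episode a</name>
      <TimeSpan><begin>1861-04-12</begin><end>1865-04-09</end></TimeSpan>
      <Point><coordinates>-77.43,37.54,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>episode b</name>
      <TimeStamp><when>1840</when></TimeStamp>
      <Point><coordinates>-78.87,38.44,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def time_extent(kml_text):
    """Return (earliest, latest) time strings found in a KML document.

    Returns None when the document carries no temporal elements.
    Assumes ISO-style values with four-digit years and no time-zone
    offsets, so that plain string comparison orders them correctly.
    """
    root = ET.fromstring(kml_text)
    paths = (
        ".//kml:TimeSpan/kml:begin",
        ".//kml:TimeSpan/kml:end",
        ".//kml:TimeStamp/kml:when",
    )
    values = []
    for path in paths:
        for element in root.iterfind(path, KML_NS):
            text = (element.text or "").strip()
            if text:
                values.append(text)
    if not values:
        return None
    return min(values), max(values)


if __name__ == "__main__":
    print(time_extent(SAMPLE_KML))  # ('1840', '1865-04-09')
```

run over a folder of kml files, a function like this could produce a simple index of which files are time-enabled and what period each covers. a fuller catalog would need to handle time zones, open-ended spans, and dates before the common era.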
affiliated organizations are also working on a minimum “best practices” designation that identifies the core elements needed in an sdi implementation (see http://www.giknet.org/bestpractice.php). outcomes: some of the most valuable outcomes were the cross-discipline discussions that emerged during the workshop, which will likely bear fruit in the field over the coming years in the form of new collaborations and endeavors. there were, however, a number of specific outcomes that came directly out of the two-day workshop and the conversations that began during its presentations and discussions: ( ) the open geospatial consortium (ogc) is seeking to develop a temporal gazetteer in order to facilitate the sharing and rapid development of historical data visualizations. this was a key need identified at the workshop, as there are few resources available for the standardization of place-names and locations across time. peter bol emphasized this matter in particular and found himself in agreement with a large number of people at the workshop. the members of the ogc present, david arctur and phillip dibner, proposed during one of the discussion periods that the ogc could perhaps organize an effort to develop interoperability standards for a temporal gazetteer that could then facilitate such research by historians seeking to harness the power of gis. since the workshop, the ogc has worked to push forward in the development of a temporal gazetteer. initial efforts to develop the project under an ogc test-bed activity called ows- did not survive initial discussions due to budget limitations. currently, however, a standards working group of the ogc considering changes to wfs-g (web feature service, gazetteer profile) is taking initial steps toward the development of a temporal gazetteer.
an on-going discussion, moreover, continues among an email group established at the workshop itself and moderated by the ogc, with the goal of solving this particular problem identified during the workshop discussions. ( ) the open geospatial consortium interoperability institute (ogcii) is also actively moving forward on the development of a temporal gazetteer standard. according to phillip dibner, the executive director of the ogcii and a participant in the workshop, the ogcii has been pushing forward on possibly hosting “a workshop that addresses this directly, and incorporates a broad array of interested communities not currently involved with the ogc, including the academic history community. there are other groups who need to be engaged in such developments, and we envision this as an opportunity to get them all together.” ( ) the survey of historical visualizations online. as discussed above, this survey reinforced patterns observed during the workshop itself (in terms of the dominance of arcgis, kml, google, and flash as the toolsets used in historical visualization research). the sites surveyed can be seen in “appendix : sample survey of historical visualizations online.” ( ) experiments conducted at the university of richmond and james madison university in developing historical visualizations using the dominant software identified during the workshop, as discussed above. ( ) workshop co-organizer james wilson, workshop participant david bodenhamer, and ian gregory (lancaster university, uk) included questions on “visualization” in a recent survey they put together as part of a spatial literacy in teaching (splint, see http://www.le.ac.uk/gg/splint/) fellowship on “gis in the humanities: towards an educational strategy in britain and america” (see http://www.hgis.org.uk/splint/). the workshop and survey results are currently being analyzed.
recommendations for future work in sum, we can make several overarching recommendations for future work in historical visualizations that will, we hope, push the field forward and enable a much broader range of scholars to engage in such research. these conclusions emerged from workshop presentations and discussions and were reinforced by subsequent explorations by the project team. ( ) in order to increase the number of scholars engaging in historical visualization work, we need the development of much more accessible toolsets for gis work beyond the esri suite of tools, toolsets that can more easily be learned by scholars without a technical background in geospatial analysis and that are more adaptable to some of the research needs unique to the humanities disciplines. ( ) in order to promote the analysis of change over time, we need the development of more sophisticated toolsets for animating historical data over a time-series, beyond what arcgis currently makes available. this appears to be an underdeveloped area of current software (for which scholars usually adapt flash-based applications), although both kml and arcgis are increasing their capabilities in these areas. ( ) in order to ensure the transparency of the scholarship that emerges from historical visualization research, we need the development of more robust standards for dealing with ambiguity and gaps in historical datasets, so that people who create and read these visualizations will be fully aware of the state of the data that produced a given visualization. ( ) in order to promote the sharing of historical datasets, we need the further development (and increased awareness among scholars) of interoperability standards for sharing and disseminating the gis and other information sets being created by various scholars.
in other words, we need the further development of standards (such as those the ogc continues to develop) that will enable the sharing and repurposing of these datasets, and we need to make scholars fully aware of them. ( ) based on the examination of existing international efforts to develop sdis based upon internationally developed and accepted standards, we feel that researchers interested in developing internet-based systems for temporal geospatial data and tools for the humanities should look to build upon existing, and engage in future, developments of sdi and geospatial interoperability efforts to enhance their applicability for temporal and humanities-related data and analysis. (details on the most relevant geospatial and temporal standards and tools that support some of the standards will be provided on the project website: http://dsl.richmond.edu/workshop). work in these areas has already begun, and we believe that the collaborations that are emerging from this project’s workshop will help fulfill these goals and thereby move the field of historical visualization forward in powerful new ways. appendix : call for papers the digital revolution has made massive amounts of historical and social science data available to scholars in electronic formats, and this phenomenon is opening new possibilities for exploring the human past. the ability to plot historical processes embedded in these datasets using mapping and visualization tools holds remarkable promise for providing scholars new insights into old questions. yet significant obstacles currently prevent scholars from sharing their geospatial data with one another, and thus from fully taking advantage of the potential of visualization techniques. to address this, scholars and practitioners from multiple disciplines (geography, history, geographic information science, computer science, graphic arts, etc.)
are invited to submit proposals for presentations at a two-day workshop (funded by the national endowment for the humanities) that will focus on two main issues: how can we harness emerging cyberinfrastructure tools and interoperability standards to visualize, analyze, and better understand historical events and processes as they spread out across both time and space? how can user-friendly tools or web sites be created to allow scholars and researchers to animate spatial and temporal data housed on different systems across the internet? we seek - page proposals for - minute presentations that describe ongoing projects, address these questions, and outline a view for future research and experimentation. we invite proposals from all backgrounds, and relevant topics might include: historical gis applications, cartographic animation, analyzing and visualizing temporal data, service oriented architecture, web-mapping and interoperability standards, data and metadata standards, open-source and commercial applications. the workshop will center on in-depth discussions among - participants in roundtable format. the first day will be devoted to individual presentations; the second day to discussions about the workshop's main questions, and describing what should be the future of this work. travel scholarships will be available to invited participants. for more information or to submit your proposals, contact the conference organizers at: andrew j. torget, university of richmond, atorget@richmond.edu , - - james w. wilson, james madison university, wilsonjw@jmu.edu , - - proposals are due december , via email to the conference organizers, and invitations for participation will be sent out by december , . the workshop will be held at the university of richmond, february - , . appendix : list of participants project directors andrew j. torget was the director of the digital scholarship lab at the university of richmond at the time of the workshop. 
today he is an assistant professor of history at the university of north texas, where he leads development of a number of digital projects focusing on visualizations of historical processes. torget is director of voting america: united states politics, - , the texas slavery project, and the history engine: tools for collaborative education and research, as well as the co-editor of two books on the american civil war. james w. wilson is an assistant professor of geographic science at james madison university, specializing in historical geography, internet gis, and cartography. wilson is on the advisory board of the virginia geographic information network (the state gis coordinating body), and the secretary of the historical geography specialty group of the association of american geographers. his article "historical and computational analysis of long-term environmental change: forests in the shenandoah valley of virginia" appeared in a special issue of historical geography devoted to historical gis (vol. , ). project advisory board david arctur is president and chief technology officer of the open geospatial consortium interoperability institute (ogcii), a non-profit scientific and educational organization dedicated to continued improvements in worldwide application of interoperable geoprocessing technologies and spatial data. arctur has previously been a data architect, product engineer, and interoperability engineer at esri; was the chief scientist at laser-scan, inc.; and a senior research associate at the university of florida. he is the co-author of designing geodatabases: case studies in gis data modeling (esri press, ). edward l. ayers is the president of the university of richmond and a historian of the american south. ayers has been involved in numerous digital humanities projects, most notably as the director of the award-winning digital archive the valley of the shadow: two communities in the american civil war. 
he also co-authored a born-digital article, "the differences slavery made: a close analysis of two american communities," which used gis mapping to examine the role slavery played in the outbreak of the american civil war, and appeared in the december issue of the american historical review. peter k. bol is the director of harvard university's center for geographic analysis and the charles h. carswell professor of east asian languages and civilizations. bol led harvard's university-wide effort to establish support for geospatial analysis in teaching and research, and directs the china historical geographic information systems project, a collaboration between harvard and fudan university in shanghai to create a gis for years of chinese history. david schell serves as chairman of the board of the open geospatial consortium inc., which he founded in with both public and private sector support to evolve "opengis" into a global standard for interoperable geoprocessing. schell is primarily responsible for directing ogc's board of directors operations and since he has served as chairman and ceo of the ogc. terry a. slocum is chair of the department of geography at the university of kansas. slocum is lead author of thematic cartography and geovisualization (now in its third edition) and has published on geography and visualizations in numerous journals, including cartography and geographic information science, cartographica, journal of geography, annals of the association of american geographers, the professional geographer, and the british cartographic journal. from to , he served as editor of cartography and geographic information science. richard white is the margaret byrne professor of american history at stanford university, where he directs stanford's spatial history lab and its effort to create new tools for visualizing historical development. 
his current digital humanities project, how the west was shaped, is developing a large database and computer graphics tools to study and represent visually how people's experience of space and time was dramatically shaped by railroads in the north american west during the nineteenth century. workshop participants: rafael alvarado is the principal information architect for the house divided project, a comprehensive archive of primary and secondary sources relating to the years leading up to the american civil war. he has been active in the digital humanities since the early s, when he created the mayan epigraphic database project at the institute for advanced technology in the humanities (iath). in he traveled to princeton university to become coordinator of humanities and social sciences computing, where he established the consortium for the development of digital collections, the educational technologies center, and the humanities computing research support group. currently at dickinson college, rafael has also developed software for numerous digital humanities projects. nathaniel ayers is the digital scholarship lab's programmer analyst at the university of richmond, serving as the head of the lab's historical visualization work on projects such as voting america. a graduate of virginia commonwealth university, nate has done programming and visualization work for the university of virginia. david j. bodenhamer is professor of history and founding executive director of the polis center at indiana university purdue university, indianapolis. during his tenure, the center has developed over projects, with grant and contract funding of over $ million. in addition, polis has expanded its programmatic focus from indianapolis and central indiana to state, regional, national, and international partnerships and projects. an active researcher, bodenhamer is author or editor of eight books, with two books on the spatial humanities forthcoming in and .
he has made over presentations to audiences on four continents on topics ranging from legal and constitutional history to the use of gis and advanced information technologies in academic and community-based research. he also has served as strategic and organizational consultant to universities, government agencies, and not-for-profit and faith-based organizations across the u.s. and in europe. jon christensen is a ph.d. candidate in the department of history and an associate director of the spatial history project of the bill lane center for the american west at stanford university. he is a distinguished departmental scholar for academic year - , supported by a mellon foundation dissertation fellowship, and was honored with a prize for excellence in first-time teaching in - . department of history, stanford university. he is coordinating "tooling up for digital histories," a collaboration between the spatial history lab, the computer graphics lab, and stanford humanities center, supported by grants from the presidential fund for innovation in the humanities at stanford. visualizing the past through digital historical sources and spatial analysis has been the key to his own dissertation, "critical habitat," a history of ideas, narratives, science, land use, and practices of conservation and extinction of a species in time and space. web sites: http://stanford.edu/~jonallan/ and http://spatialhistory.stanford.edu. alan craig has focused his career on the interface between humans and machines. he has been involved in many different capacities related to scientific visualization, virtual reality, data mining, multi-modal representation of information, and collaborative systems during his career at the national center for supercomputing applications, where he has worked for the past twenty years. craig is co-author of the book understanding virtual reality, published by morgan kaufmann publishing, and author of the forthcoming book, using virtual reality. phillip c.
dibner is the director of research programs for the ogc interoperability institute (ogcii), the research and educational affiliate of the open geospatial consortium (ogc). dibner has been involved with the ogc since its inception, where he has managed technical integration and coordinated demonstrations for testbeds, interoperability experiments, and pilot implementations, in remote collaboration with participants throughout europe, north america, and australia. he also established and continues to chair the ogc earth systems science domain working group (ess dwg). trained as an ecologist at the yale school of forestry and environmental studies, dibner has had field experience throughout the continental united states. prior to his involvement with the ogc, he joined the silicon valley technology boom of the s and ' s, where he worked on operating systems and network protocols, while pursuing his interest in environmental and ecological data acquisition and analysis. max edelson is associate professor of history at the university of illinois at urbana-champaign. his first book, plantation enterprise in colonial south carolina, examines agriculture, economy, and environment in the making of the carolina lowcountry's early plantation landscape. his current research investigates cartography and empire in eighteenth-century british america. in collaboration with the institute for computing in humanities, arts, and social science (i-chass) at the national center for supercomputing applications at illinois, he received an neh level i digital humanities start-up grant to create the cartography of american colonization database (cacd). bill ferster is senior scientist at the university of virginia with a joint appointment with the center for technology and teacher education at the curry school and the virginia center for digital history in the college of arts and sciences.
he has founded numerous companies, including west end film, developer of the first pc-based d animation system; emc, developer of the first digital nonlinear editing system, which received an emmy award in ; and stagetools, the leading developer of image animation tools. charles van den heuvel is senior researcher for the virtual knowledge studio for the humanities and social sciences of the royal netherlands academy of arts and sciences, where he leads the research project "paper and virtual cities. new methodologies for the use of historical sources in virtual urban cartography," with the department of information science at groningen university. van den heuvel ( ) finished his studies in art history and archeology at groningen university, with a specialization in the history of architecture, town planning and planning sciences, in . he received his ph.d. from the university of groningen in , writing his dissertation on the dissemination of knowledge of italian engineers in the netherlands and their role in the introduction of the "renaissance" culture in the netherlands. since then he has worked as a senior researcher for the universities of groningen, utrecht, and maastricht and for research institutes such as the maastricht mcluhan institute. matthew koeppe is director of giscience programs at the association of american geographers, where he coordinates many of the aag's gis-related activities in the areas of education, outreach, international programs, and public policy. matthew received his phd in geography from the university of kansas in . his research background and interests include environment and development in the brazilian amazon, tropical frontier expansion, the social and political aspects of land cover classification, and the geography of food production and consumption. sorin adam matei is known for applying, from a cross-analytical perspective, traditional statistical, gis, and spatial methodologies to the study of information technology and social integration.
he has conducted a number of studies on the social and cognitive impact of location aware systems deployed in real or virtual environments. his current research is particularly focused on the role of spatial indexing on learning in location aware situations and on the role of physical affordances in structuring location aware communication experiences. the experimental work he conducted at purdue university's envision lab indicates that there are some benefits for information acquisition in location aware situations. in addition, he has conducted large-scale multidisciplinary surveys of communication technology use in local communities both in the united states and in europe. his research was funded by motorola, kettering foundation, university of kentucky, and purdue university and was recognized by various professional organizations with paper and research awards. his teaching makes use of a number of software platforms he has codeveloped, such as mindmeld. mano marks is a developer advocate with google, helping people place their content in google earth and maps. he has a masters in history from columbia and a masters in information management and systems from uc berkeley. he is very interested in the intersection between data, visualization, and communication. worthy n. martin received his ph.d. in computer science from the university of texas-austin in . he then joined the university of virginia in as a professor of computer science. he is the author or co-author of papers. his primary research interest is dynamic scene analysis, i.e., computer vision in the context of time-varying imagery, as well as the fundamental concepts involved in machine perception systems composed of independent processes operating in distributed computing environments and cooperating to form interpretations of image sequences. robert k.
nelson is the digital scholarship lab's associate director, overseeing historical visualization work on the history engine and the text-mapping projects. a graduate of william and mary's american studies ph.d. program, rob is a historian of nineteenth-century america. scott nesbit is a doctoral fellow at the institute for advanced studies in culture and a phd candidate in the history department at the university of virginia. his dissertation examines the politicization of the idea of forgiveness in the american civil war era. he is co-creator and an associate director of the history engine and has managed several other online projects at the virginia center for digital history. j. b. owens is professor of history and director of the geographically-integrated history laboratory at idaho state university. he currently serves as co-project leader of a multidisciplinary, multinational research project he created for the european science foundation's eurocores (european collaborative research) scheme's program "the evolution of cooperation and trading" (tect). the title of his project is "dynamic complexity of self-organizing cooperation-based commercial networks in the first global age" (acronym: dyncoopnet), and the work involves researchers from sixteen countries on five continents (including co-authors of his position paper for the "visualizing the past" workshop). before creating the dyncoopnet project, owens held consecutive fellowships from the national endowment for the humanities and the john simon guggenheim memorial foundation. owens' research has focused on the cultural, economic, and social contexts shaping the exercise of political authority in the kingdom of castile during the period - . peter pulsifer is a research associate at the geomatics and cartographic research centre, department of geography, carleton university in ottawa, canada.
pulsifer's research is focused on creating new knowledge, methods and tools in support of integrating geographic data, information and knowledge for education and decision support. his research incorporates several major, current themes in geomatics research including web-based mapping for education and decision support, modeling and integration of geographic information, and the ontological foundations of visualization and representation of geographic phenomena. peter has been very active in research related to information management and the development of on-line atlases for the polar regions. he applies a collaborative, interdisciplinary approach to research and has worked closely with human and physical geographers, cartographers, psychologists, cognitive scientists, computer scientists, anthropologists and cultural theorists. carsten roensdorf is an expert in geographic data management and currently holds the position of corporate data manager at ordnance survey, great britain's national mapping agency. in this role he is responsible for the integrity of the national geographic database, the repository for consistent, highly detailed geographic base data in great britain. carsten is a trained geodesist and has created, managed and utilised geographic uses in central and local government, utilities, mobile telecoms as well as land management. he is an active participant in the development of geographic information standards in the open geospatial consortium and led the standardisation of citygml, a standard to represent cities in multiple dimensions, in . kurt rohloff is a scientist in the information and knowledge technologies department at bbn technologies, where his areas of technical expertise include computational modeling, control and decision systems, distributed resource management, and software reliability.
kurt's recent research focus has been developing automated methods to identify quantifiable patterns in highly multi-dimensional data, with a particular focus on patterns that precede nation-state instability, as part of the externally-funded icews program. kurt's other research at bbn has focused on applying notions of control theory to increase the performance and reliability of interacting, distributed computational modeling systems. kurt was previously affiliated with the coordinated science laboratory (csl) at the university of illinois, urbana-champaign, the center for mathematics and computation (cwi) in amsterdam, the netherlands, and mit's lincoln laboratory in lexington, ma. erik steiner is a visiting scholar at stanford university, where he is the director of the spatial history lab. a recognized leader in the design of dynamic mapping applications, he has most notably led the development of the atlas of oregon cd-rom and the interactive nolli and vasi websites of rome. erik has a permanent appointment in the infographics lab in the geography department at the university of oregon. josh wall is a managing consultant for information strategies (www.infostrat.com), a washington, dc-based microsoft gold partner. information strategies was chosen by microsoft to be one of a select group of partners to build solutions for microsoft surface, their innovative new multi-touch device. josh and his team have worked closely with the microsoft virtual earth team to build the next generation of gis solutions that leverage the multi-touch technology in microsoft surface. chris weaver is associate director of the center for spatial analysis and assistant professor in the school of computer science at the university of oklahoma. weaver holds a b.s. in chemistry and mathematics from michigan state and an m.s. and ph.d. in computer science from wisconsin.
Chris' grand tour of academic research so far includes analytical chemistry, cognitive psychology, operating systems, databases, human-computer interaction, and geographic information systems. He was recently a research associate with the GeoVISTA Center in the Department of Geography at Penn State, where he was also a founding member and core investigator with the North-East Visualization and Analytics Center.

Hadley Wickham is an assistant professor of statistics at Rice University. He is interested in the use of graphics to reveal interesting and unexpected features of data, as well as practical tools to make dealing with real-life data easier. He won the John Chambers Award for Statistical Computing for his work on the ggplot and reshape R packages.

May Yuan is Brandt Professor, Edith Kinney Gaylord Presidential Professor, Associate Dean of Atmospheric and Geographic Sciences, and Director of the Center for Spatial Analysis at the University of Oklahoma. May's research interests are in temporal GIS, geographic representation, spatiotemporal information modeling, and applications of geographic information technologies to dynamic systems. Her research projects center on representation models, algorithms for spatiotemporal analysis, and understanding of dynamics in geographic phenomena, such as wildfires, rainstorms, air-pollution plumes, and behavior and activities in complex social systems. She explores multiple perspectives of dynamics, analyzes the drivers and outcomes of geographic dynamics, extracts spatiotemporal patterns and behavioral structures of dynamic systems, and draws insights into system development and evolution to derive an integrated understanding, interpretation, and prediction of activities, events, and processes in dynamic geographic systems.

Jeanette Zerneke is the Technical Director for the Electronic Cultural Atlas Initiative (ECAI).
In that role Jeanette works with a diverse group of technology experts to develop tools and methodologies that support ECAI's mission. ECAI is a global collaboration among humanities scholars, librarians, cultural heritage managers, and information technology researchers. ECAI's mission is to enhance scholarship by promoting greater attention to time and place. Jeanette's work involves developing infrastructure, programs, methodologies, working groups, and training workshops to support ECAI affiliates in project development and integration. Jeanette works directly with project teams to develop web sites and ePublications highlighting the growing use of new technologies to present cultural information in innovative ways.

Appendix: Workshop Schedule of Events

Day One (February)
am: Shuttles from hotel to Jepson Alumni Center
am: Breakfast
am: Andrew Torget and James Wilson: Welcome, introductions, and opening remarks
am: Session
• J. B. Owens, "Visualizing Historical Narratives: Geographically Integrated History and Dynamics GIS"
• Jeanette Zerneke, "From Historical GIS to Seeing History"
• May Yuan and Chris Weaver, "Visual Analytics and Applications to Historical Data"
• Discussion
pm: Lunch
• Josh Wall, "Microsoft Surface and Virtual Earth"
pm: Session
• Rafael Alvarado, "The Semantic Web as a Tool for Visualization and Collaboration: The House Divided Project and the Underground Railroad"
• Kurt Rohloff, "CWEST: Disruptive Integration of Computation Technology for Data Analysis and Visualization"
• David Arctur and Phillip Dibner, "Interoperability, Knowledge Integration, and the Study of Historical Processes"
• Discussion
pm: Afternoon break
pm: Session
• David Bodenhamer, "Visualizing Complex Data in an Online Historical GIS: Twentieth Century Religious Adherence Data as a Testbed"
• Peter Pulsifer, "The Role of Cybercartography in Exploring, Visualizing and Preserving the Past"
• S. Max Edelson and Alan Craig, "Rendering Digital Maps: Using and Displaying Images in the Cartography of American Colonization Database"
• Discussion

Day Two (February)
am: Shuttles from hotel to Jepson Alumni Center
am: Breakfast
am: Session
• Peter Bol, "People and Places: Computing China's Past"
• Jon Christensen, "Tooling Up for Spatial History Projects"
• Charles van den Heuvel, "Visualizing Historical Evidence and Experience: Two Projects around Early Modern Manuscript Maps and an Experimental e-Humanities Lab"
• Sorin Matei, "Visible Past: Where Information Searches for You"
• Discussion
am: Lunch
• Mano Marks, "Using Google Geo Technologies to Visualize Spatially Located Data"
pm: Session
• Hadley Wickham, "Visualizing Data with R"
• Carsten Roensdorf, "Integration of Historic Data Fragments on the Basis of CityGML"
• Bill Ferster, "The Emancipation of Data: A Call to Action"
• Discussion
pm: Afternoon break
pm: Andrew Torget and James Wilson: Wrap-up discussion

used to track the ‘hours’ of your shift; ) a ‘gridlock counter’, which tracks how many ED backups or adverse patient outcomes occur (‘gridlocks’). The goal of the game is to work cooperatively with your teammates to complete patient tasks and move patients through the ED to an ultimate disposition (e.g. admission, discharge). The game is won if you finish your shift before reaching the maximum number of ‘gridlocks’ allowed.
Conclusion: Initial responses to GridlockED have been very positive, supporting it as both an engaging board game and a potential teaching tool. We are excited to see it validated through research trials and possibly incorporated into emergency medicine training at both student and postgraduate training levels.
Keywords: emergency department flow, simulation, board game

LO
The CanadiEM Digital Scholars Program: an innovative international digital collaboration curriculum
F. Zaver, MD, A. Thomas, MD, S. Shahbaz, MD, A. Helman, MD, E.S.
Kwok, MD, B. Thoma, MD, MA, T.M. Chan, MD, University of Calgary, Calgary, AB
Introduction / Innovation Concept: Digital media are a new frontier in medical education scholarship. Asynchronous education resources facilitate a multi-modal approach to teaching and allow residents to personalize their learning to achieve mastery in their own time. The CanadiEM Digital Scholars Program is a nationwide initiative that provides residents with practical experience in creating digital educational materials under the supervision of experts in the field. The program allows for collaboration and access to mentorship from top digital educators from across North America.
Methods: Interested residents accepted into the program spent a period of their PGY year completing modules on the theory and science behind digital education. Four modules, developed in an iterative process, have been built on the topics of podcasting, blogging, digital identity, and patient communication. Each fellow was supervised by members of the CanadiEM team, a faculty member from the resident’s home institution, and digital experts from across North America.
Curriculum, Tool, or Material: The first fellow completed all aspects of the designed curriculum. Beyond this, he also engaged in blog content creation, initiated research on digital scholarship, and managed the editorial section of CanadiEM. The second fellow is currently halfway through his year (and is expected to complete the program within the year) and has co-authored blog posts and podcasts in months.
Conclusion: The CanadiEM Digital Scholars Program takes a novel approach to fostering the development of digital educators, drawing on experts across North America. We have demonstrated feasibility and sustainability with our initial pilot years. The program is being scaled next year to include two scholars per year, which will facilitate cross-collaboration between the scholars.
Keywords: innovations in emergency medicine education, social media, free open access meducation (FOAM)

LO
Not a hobby anymore: establishment of the Global Health Emergency Medicine organization at the University of Toronto to facilitate academic careers in global health for faculty and residents
C. Hunchak, MD, MPH, L. Puchalski Ritchie, MD, PhD, M. Salmon, MD, MPH, J. Maskalyk, MD, M. Landes, MD, MSc, Mount Sinai Hospital, Toronto, ON
Introduction / Innovation Concept: Demand for training in global health emergency medicine (EM) practice and education across Canada is high and increasing. For faculty with advanced global health EM training, EM departments have not traditionally recognized global health as an academic niche warranting support. To address these unmet needs, expert faculty at the University of Toronto (UT) established the Global Health Emergency Medicine (GHEM) organization to provide both quality training opportunities for residents and an academic home for faculty in the field of global health EM.
Methods: Six faculty with training and experience in global health EM founded GHEM at a UT teaching hospital, supported by the leadership of the ED chief and head of the divisions of EM. This initial critical mass of faculty formed a governing body; seed funding was granted from the affiliated hospital practice plan; and a five-year strategic academic plan was developed.
Curriculum, Tool, or Material: GHEM has flourished at UT with growing membership and increasing academic outputs. Five governing members and general faculty members currently run projects engaging over faculty and residents. Formal partnerships have been developed with institutions in Ethiopia, Congo and Malawi, supported by five granting agencies. Fifteen publications have been authored to date, with multiple additional manuscripts currently in review. Nineteen FRCP and CCFP-EM residents have been mentored in global health clinical practice, research and education.
Finally, GHEM’s activities have become a leading recruitment tool for both EM postgraduate training programs and the EM department.
Conclusion: GHEM is the first academic EM organization in Canada to meet the ever-growing demand for quality global health EM training and to harness and support existing expertise among faculty. The productivity of this collaborative framework has established global health EM at UT as a relevant and sustainable academic career. GHEM serves as a model for other faculty and institutions looking to move global health EM practice from the realm of ‘hobby’ to recognized academic endeavor, with proven academic benefits accruing to faculty, trainees and the institution.
Keywords: global health education, global health training, global health research

LO
Safety and efficiency of emergency physician supplementation in a provincially nurse-staffed telephone service for urgent caller advice
E. Grafstein, MD, R.B. Abu-Laban, MD, MHSc, B. Wong, MHA, R. Stenstrom, MD, PhD, F.X. Scheuermeyer, MD, M. Root, MA, Q. Doan, MDCM, MHSc, PhD, St. Paul’s Hospital, Vancouver, BC
Introduction: British Columbia created a nurse (RN)-staffed telephone triage service (TTS) to provide timely advice to non- callers ( ). A perception exists that some callers are inappropriately directed to emergency departments (EDs), thereby worsening crowding. We sought to determine whether supplementary emergency physician (EP) triage would decrease ED visits while preserving caller safety and satisfaction.
Methods: TTS RNs use computer algorithms and judgment to triage callers. Potentially sick callers are directed to “seek care now” (red calls). Often this is to an ED, depending on acuity and time of day. In the Vancouver health region, from April to September between : - : hours, a co-located EP also spoke with “red” callers to provide further guidance. Callers were followed up with week and satisfaction was evaluated on a -point Likert scale.
The TTS data were linked to the regional ED database to assess ED attendance within days, and to the provincial vital statistics database for -day mortality. Our primary outcome was the proportion of unique “red” callers who did not attend the ED, compared with a historical cohort one year earlier without EP triage in place. Secondary outcomes were the proportion of “red” callers advised not to attend the ED who (a) attended, (b) were admitted, or (c) died.
Results: In the study period there were “red” calls of

Résumés scientifiques. CJEM • JCMU; suppl. https://doi.org/ . /cem. .

Music Encoding Conference Proceedings

MEI and Verovio for MIR: A Minimal Computing Approach

Mark Saccomano, Columbia University, m.saccomano@columbia.edu
Natalia Ermolaev, Princeton University, nataliae@princeton.edu

Abstract

While the increase in digital editions, online corpora, and browsable databases of encoded music presents an extraordinary resource for contemporary music scholarship, using these databases for computational research remains a complex endeavor. Although norms and standards have begun to emerge, and interoperability among different formats is often possible, researchers must devote considerable time to discovering, learning, and maintaining the skill sets necessary to make use of these resources. This talk will discuss our work with the Serge Prokofiev Archive and the creation of a prototype to browse, display, and play notated music from Prokofiev’s notebooks via a web browser. The project is an example of how using the principles of minimal computing can reduce the burden of technological expertise required to both disseminate and access encoded music.
The Archive

The Serge Prokofiev Archive, housed at Columbia University, contains more than , diverse items: music manuscripts, letters, scores, financial documents, notebooks, photographs, and recordings. Originally a personal collection amassed by Prokofiev’s widow Lina, the materials were first established as an archive at Goldsmith’s College in London. As the archive grew, a complex, intricate, item-level descriptive apparatus evolved alongside it. By the time the collection came to Columbia, the archival items were accompanied by hundreds of metadata files in a wide variety of formats, including Word documents, spreadsheets, text files, PDFs, EndNote databases, Access databases, MARC records, and various XML encodings.

Typically, archival collections are accessed through an online finding aid, which users often find difficult to use and whose underlying structure and interface can obscure the richness of a collection. The blocks of narrative and long lists of items found in a finding aid, especially for a collection of our scope, are a barrier to true discovery. We sought to improve the experience of navigating a large archival collection by affording users the opportunity to make new, spontaneous discoveries.

Our Serge Prokofiev Archive as Data project was guided by two important conceptual shifts in the library and archives profession. The first is the “collections as data” movement, which encourages reframing the digital object itself as data [ ]. The second is Kate Theimer’s notion of “archives as platform,” a move away from locating value exclusively in the objects of a collection and toward the impact collections have on people and communities [ ]. In Theimer’s view, the notion of an archive includes the tools and technologies that help users interact with it in creative ways that add value to their lives and experiences.
Accessible Technology and Minimal Computing

Because we were looking for solutions that could be adapted for researchers with varying skill sets and different computing needs, we tested a variety of freely available software to store, structure, clean, analyze and display our data. We also had no budget: necessity dictated that we seek out non-proprietary tools. Thus, we placed ourselves in the position of many researchers (independent and graduate student researchers in particular) looking for ways to disseminate their work to a wider audience. Following this path, we were soon introduced to the principles of minimal computing and discovered their applicability to our own project’s goals.

Notes: https://findingaids.library.columbia.edu/ead/nnc-rb/ldpd_ ; https://mss .github.io/spademo/ ; see also https://collectionsasdata.github.io/statement/

Minimal computing is a design philosophy that seeks to maximize access to digital materials by reducing reliance on specific hardware and software requirements [ ]. Organized around the question “what do we need?”, minimal computing is described by Alex Gil as a conscious effort to “harness the new media in smart, ethical and sustainable ways.” In addition to reducing reliance on multiple, opaque processes, minimal computing also implies “learning how to produce, disseminate and preserve digital scholarship ourselves, without the help we can’t get [ ].” This DIY approach helps minimize dependence on institutional resources and funding, as well as on proprietary tools (which, in addition to their cost, often require a high level of expertise).

One of the first steps we took was to avail ourselves of systems and workflows with ample documentation.
We were also aware that scholars are well advised to publish digital materials in a versatile format that requires little or no maintenance and can easily be ported to other systems. This way, digital materials remain accessible even as technology develops in ways that are impossible to foresee today. We soon created a repository for the Serge Prokofiev Archive as Data project on GitHub and built a static website for display on GitHub Pages using a Jekyll template. Because a static site does not require knowledge of server operations or database design, it makes it simpler for individual researchers to disseminate their work.

For the musical component of our project, we wanted to create not only an attractive front end and simple user interface, but a simple back end as well. The idea was to provide a repository of encoded music that could not only be seen but heard, a difference that could make such a repertory valuable beyond the specialized scholar in computing or musicology. Aficionados and researchers in other fields who may not be able to read code or read music could nonetheless hear the music in Prokofiev’s manuscripts, and could hear for themselves the jagged rhythms and unexpected chromatic alterations that are hallmarks of his style. We also developed a simple workflow for creating the encoded files (one very similar to the process now detailed in the tutorial “Introduction to the Music Encoding Initiative” by Anna Kijas and Raffaele Viglianti). To publish the encoded materials, we used our GitHub website and Jekyll template.

The Notebooks

One of the highlights of the collection is Prokofiev’s notebooks. Here, in an interview transcript from the archive, Prokofiev’s widow Lina describes how he used the notebooks in his creative process.

SP never stopped creating….
At the most unexpected moment, at the most unusual circumstances, during a conversation or while walking, he would make a note of a new theme in a special notebook he kept in his pocket or on any scrap of paper or on his cuff, on paper napkins in a restaurant. Then on returning home he would copy the themes into a more permanent notebook.

The sketches we display on the site are from these “more permanent” notebooks Lina mentions. We began by simply browsing through the notebooks and taking some pictures. Displayed in a web exhibit, these images would be interesting on their own. But we also knew that by adding the sounded music represented by these scores, we would greatly increase the usefulness of these notebooks to scholars, as well as to the general public. Not all musicologists and music theorists have sufficient musicianship skills to fluently imagine the sound of notated music. For archival materials such as unlabeled sketches, hearing the music can aid in the identification of fragments, suggesting how and where they might have been used in published scores.

Notes: https://go-dh.github.io/mincomp/ ; https://dlfteach.pubpub.org/pub/intro-mei/release/

MEI was chosen as the encoding format, not only because of its adaptability and increasingly common use in digital musicological projects, but also because of the availability of Verovio, an engraving library that can be used to display and play MEI files in a web browser. As these were short, handwritten passages of only a few measures each, they were entered manually into the music notation program Sibelius. (Because they were written by hand, an OCR program would likely not have been the most efficient method of encoding.) Next, the files were exported to MusicXML using the export function of Sibelius. To convert the MusicXML files to MEI, we used the automated converter available on the Verovio website.
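A converted file of this kind can be given a quick structural check before any hand editing. The following is only a minimal sketch in Python using the standard library, not part of the project's workflow: the function name `summarize_mei` and the two-measure `SAMPLE_MEI` fragment are hypothetical stand-ins for a real converted sketch, and the only thing assumed about the converter's output is that it is MEI in the standard MEI namespace.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by MEI documents.
MEI_NS = "http://www.music-encoding.org/ns/mei"

# Hypothetical two-measure fragment standing in for a converted sketch;
# it is not taken from the archive.
SAMPLE_MEI = """<mei xmlns="http://www.music-encoding.org/ns/mei">
  <music><body><mdiv><score><section>
    <measure n="1"><staff n="1"><layer n="1">
      <note pname="c" oct="4" dur="4"/>
      <note pname="e" oct="4" dur="4"/>
    </layer></staff></measure>
    <measure n="2"><staff n="1"><layer n="1">
      <note pname="g" oct="4" dur="2"/>
    </layer></staff></measure>
  </section></score></mdiv></body></music>
</mei>"""


def summarize_mei(mei_text):
    """Return measure and note counts for an MEI document given as a string."""
    root = ET.fromstring(mei_text)
    # Reject files whose root is not <mei> in the MEI namespace.
    if root.tag != f"{{{MEI_NS}}}mei":
        raise ValueError("root element is not <mei> in the MEI namespace")
    measures = root.findall(f".//{{{MEI_NS}}}measure")
    notes = root.findall(f".//{{{MEI_NS}}}note")
    return {"measures": len(measures), "notes": len(notes)}


summary = summarize_mei(SAMPLE_MEI)
print(summary)  # {'measures': 2, 'notes': 3}
```

For passages of only a few measures, comparing these counts against the manuscript image is a cheap way to notice a dropped measure or note before the file is published.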
The conversion worked extremely well and yielded excellent results. The light editing that remained to be done was mostly for aesthetic purposes of display. The editing was done in Atom, using the mei-tools-atom package, which renders MEI in a separate pane within the application. Once the MEI files were checked and polished, they were uploaded to the GitHub repository. The challenge at this stage was to create page templates that would incorporate Verovio. Although it took many attempts to pull everything together, the results were encouraging, and a prototype was developed that could display an engraved version of the score derived from a digitally encoded version of the manuscript, as well as play the score in a browser using a simple interface: https://mss .github.io/spademo/sketches/

Implementation

Development challenges proved to be formidable. While finding appropriate tools for coding, display and playback of manuscripts was reasonably easy, getting them to work together was exceedingly complex. Documentation, though rich, can be dense; the largest impediment to timely progress is limited access to a consultant who can assist in troubleshooting. Without this, the plethora of manuals and tutorials becomes an obstacle to learning, creation, and design. (Think of the myriad articles, tips, and guidelines many of us received back in March of this year on how to migrate our courses online; such a wealth of material can be overwhelming.) Even with access to university assistance, this site took nearly a year to assemble. However, the skills to use GitHub and Jekyll are within reach via ground-up tutorials available from such sites as the Programming Historian.

Difficulties still remain, specifically those arising from technical solutions that push the limits of common browser capabilities.
For example, problems with audio playback, such as web MIDI players clipping notes (possibly due to buffering or threading issues), have driven developers on some projects to insert an extra musical object into their encoded scores. We also encountered this clipping problem, and were only able to come up with a temporary workaround through a laborious trial-and-error process. To ensure the MEI would play properly in the browser, a <space> element or “dummy” <note> event with @visible=“false” had to be inserted before the first and final notes in order for them to be heard. Such inelegant solutions are highly undesirable for an archival representation of a manuscript. Presumably, improvements in how browsers and system players handle MIDI will soon make such workarounds unnecessary. In the meantime, these ad hoc solutions need to be specially commented in the MEI files.

Extensibility and Future Directions

Sample sites using MEI, Verovio and the Ed template:
El Corrido Mexicano: https://mss .github.io/corridosed/
Serbian Hymns: https://mss .github.io/zagreb/

In order to test the extensibility of this project, we tried it out with texted music in a special Jekyll template for minimal literary editions, “Ed,” developed by Alex Gil and associates. The resulting sites showed the flexibility of the Ed theme to handle some of the more complex requirements of Verovio and web MIDI, while still remaining a project that could be managed by a single researcher. They also demonstrate the utility of our chosen suite of open source tools for musicologists, music theorists, and music archivists. In the future, we hope to incorporate search tools for specific series of notes and an analytical component that could be used to identify the stylistic traits of a corpus.

Notes: https://www.verovio.org/musicxml.html ; https://atom.io/packages/mei-tools-atom ; https://programminghistorian.org/ (we are particularly indebted to Amanda Visconti for her Jekyll tutorial, https://programminghistorian.org/en/lessons/building-static-sites-with-jekyll-github-pages) ; https://github.com/cuthbertlab/music /issues/ ; https://elotroalex.github.io/ed/

A note about program evaluation: one aspect of design that is often overlooked in digital musicology projects is user testing. As noted by David Weigl in his study of the academic use of digitized online resources [ ]: “the needs and behaviours of musicologists in particular remain relatively underexplored”. This is not just an issue in musicology. The statement made by Warwick et al. [ ] (cited by Murray and Wiercinski [ ]) still rings true today: “user testing, like disseminating information, is a skill that most humanities scholars have not acquired”. However, as Murray and Wiercinski point out, a strictly user-centered development might restrict a project’s ability to make full use of nascent technology. For them, the ideal interface would “provide the more familiar and comfortable features that facilitate the types of activities that scholars know,” while affording new opportunities for discovery and experimentation “of which they are currently unaware” [ ]. Until more research like Weigl’s is conducted on users in music studies, we can only note that all development is an iterative process: an attempt to anticipate needs, get feedback, address shortcomings, and get more feedback.
In the meantime, having robust models that can be easily adapted for use by others is a positive step toward increasing access to archival materials.

Conclusions

While the raw data of much notated music may be ready to be downloaded for analysis, the high-level computing skills required to retrieve and analyze that data mean that it remains out of reach for many. In order to make collections such as these more accessible, both the resources and the training for encoding, retrieval, analysis, and display of encoded music need to be made available to researchers. We would like our prototype to be a resource to scholars in music studies, an example of open data and code that will lessen the demand for technical expertise for both the researcher and the user, while demonstrating the functionality that can be added to a single site accessed through an ordinary web browser. As music OCR technology continues to become more successful at first-pass recognition, we will want to be prepared to make repositories available to more than just the technologically savvy few.

With encoded music, the difference between a mode of access that involves scrolling through a list of text files and one that features an interactive display of scores and sound is analogous to the difference between retrieving library materials through an institution with closed stacks and one with open stacks. Refining interests and homing in on relevant and interesting material are often the result of seeing a book on a shelf, opening it up and thumbing through it: reading a few sentences, checking the table of contents, skipping to the index, looking at the color plates in the middle. We don’t always need or want to engage with materials in this manner, but having the option to do so is invaluable.

Works Cited

[ ] Padilla, Thomas. “On a Collections as Data Imperative”. Conference report. Collections as Data: Stewardship and Use Models to Enhance Access, Library of Congress, Washington, DC, September , . http://digitalpreservation.gov/meetings/dcs /tpadilla_onacollectionsasdataimperative_final.pdf
[ ] Theimer, Kate. “The Future of Archives Is Participatory: Archives as Platform; or, A New Mission for Archives”. Presented at the Offene Archive . conference, Stuttgart, Germany, April - , .
[ ] Sayers, Jentery. “Minimal Definitions”. . https://go-dh.github.io/mincomp/thoughts/ / / /minimal-definitions/
[ ] Gil, Alex. “The User, the Learner, and the Machines We Make”. . https://go-dh.github.io/mincomp/thoughts/ / / /user-vs-learner/
[ ] Weigl, David, et al. “On Providing Semantic Alignment and Unified Access to Music Library Metadata”. International Journal on Digital Libraries , no. ( ), - .
[ ] Warwick, Claire, et al. “The Master Builders: LAIRAH Research on Good Practice in the Construction of Digital Humanities Projects”. Literary & Linguistic Computing , no. ( ), - .
[ ] Murray, Annie, and Jared Wiercinski. “A Design Methodology for Web-Based Sound Archives”. Digital Humanities Quarterly , no. ( ). https://www.digitalhumanities.org/dhqdev/vol/ / / / .html

Constructing a Campus-Wide Infrastructure for Virtual Reality: College & Undergraduate Libraries: Vol , No
College & Undergraduate Libraries
Research Article

Constructing a campus-wide infrastructure for virtual reality

Elisandro Cabada, Grainger Engineering Library Information Center, University of Illinois at Urbana-Champaign, Urbana, IL, USA (correspondence: cabada@illinois.edu)
Eric Kurt, Undergraduate Library, University of Illinois at Urbana-Champaign, Urbana, IL, USA
David Ward, Undergraduate Library, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Published online: Feb

Abstract

The tools, techniques, and physical infrastructure to conduct VR explorations can be expensive and require specialized facilities and training, creating obstacles for faculty and students to explore potential applications. A gap exists for a trusted environment to develop and support best practices. This study documents lessons learned from the transformation of existing library spaces into VR content creation and exploration spaces, as part of an integrated campus-wide initiative. Results argue for the central role libraries can play in the VR lifecycle. The study presents recommendations for adapting to the rapidly developing VR marketplace, suggests methodologies for standardizing hardware deployment, training, and application support, and recommends a support structure for academic VR support.
keywords: virtual reality, mixed reality, extended reality, immersive environments
acknowledgements the authors would like to thank the technology services at illinois, the center for innovation in teaching and learning (citl), and the staff in the media commons in the undergraduate library, and the grainger engineering library idea lab for their generous support and invaluable work in supporting the vr efforts mentioned in this article.
white paper report report id: application number: hd project director: michael spalti (mspalti@willamette.edu) institution: willamette university reporting period: / / - / / report due: / / date submitted: / /
bridging the gap: connecting authors to museum and archival collections
level ii digital humanities startup grant
willamette university
michael spalti, project director, mark o. hatfield library
jonathan bucci, collections curator, hallie ford museum of art
roger hull, senior faculty curator, hallie ford museum of art
mary mckay, willamette university archivist, mark o. hatfield library
elizabeth garrison, curator of education, hallie ford museum of art
table of contents
i. brief project summary
ii. project background and objectives
iii. project outcomes
iv. future directions
v. evaluation
vi. impact on the digital humanities
vii. links
i.
brief project summary this project bridges the gap between authoring software used to create digital scholarship and repository software used for preservation, search and retrieval of digital assets. it is a collaborative effort by software developers, museum curators, archivists and scholars to build one particular kind of bridge between multimedia authorship and digital collections. project outcomes include:
• a search and asset retrieval plug-in for a digital repository system widely used in higher education (contentdm);
• integration of this search and retrieval plug-in with an authoring tool (pachyderm . ) and other software applications; and
• a multimedia presentation on carl hall, a major artist in the pacific northwest, who first attracted national attention as a magic realist in the ’s.
the project originally called for implementing a contentdm plug-in with pachyderm . , but anticipating a new, updated version of the authoring software soon became a priority. while pachyderm . development was underway, we also enjoyed a productive and unexpected collaboration with cynthia walters at the university of virginia. cynthia was working on an experiment in integrating pachyderm . with the sakai learning management system and had joined the pachyderm development team as the contributor of updated repository plug-in support. although pachyderm . was eventually released in october of , repository plug-in support is not included in the public version. we nevertheless achieved project goals and played a modest role in pachyderm . development efforts by implementing the university of virginia pachyderm . plug-in support at willamette university and sharing what we learned with california state university center for distributed learning (cdl), which now leads the pachyderm development effort. with assistance from the willamette university archives and mark o.
hatfield library, jonathan bucci, collection curator at the hallie ford museum of art, and roger hull, professor of art history and senior faculty curator, also completed work on the multimedia project, carl hall: oregon master. this online exhibit combines images of artworks drawn from the hallie ford museum of art northwest art collection and documents from the university archives pacific northwest artists archive. the online exhibit is published on the new museum website and will serve as the model for future multimedia projects that utilize pachyderm . authoring and our museum and archival online collections. happily, we have also seen meaningful adoption of pachyderm by students documenting past museum exhibits. beyond the classroom, two students are working on willamette university-funded research projects that will culminate in pachyderm multimedia presentations. ii. project background and objectives beginning in the fall of , staff from the hallie ford museum of art, willamette university archives and mark o. hatfield library met regularly over several months to discuss ways to deepen collaboration on digital asset management. initially, these conversations focused on strategies to grow digital collections and improve procedures for joint management. however, the team quickly identified the need for an authoring tool that would help art historians, museum staff and archivists craft a variety of online exhibits and other interactive programs. we soon discovered that, as a member of the new media consortium (nmc), willamette university had supported the creation of pachyderm . , an open source multimedia authoring tool developed under the direction of the nmc and the san francisco museum of modern art and funded in part by an institute of museum and library services grant. we learned that although pachyderm . maintained its own internal database of digital assets, it also supported open knowledge initiative open service interface definition (osid) plug-ins.
these repository plug-ins make external content accessible from within the pachyderm authoring environment, which for us was a significant advantage. repository plug-ins already existed for dspace and fedora repositories. no repository osid plug-in existed, however, for contentdm, the primary software used by willamette university and many other institutions. these observations led to the following project goals: . develop a plug-in for contentdm using the open service interface definition and the contentdm php programming interface; . implement this plug-in in the open source pachyderm . authoring environment and a local instance of contentdm; . explore the effectiveness of this integration for furthering museum and archives outreach programs and for sharing scholarship on pacific northwest regional artists; . investigate the potential use of this integrated platform in the humanities curriculum of willamette university; and . explore use of the osid repository interface with software applications in addition to pachyderm . , as well as the potential for collaboration with other colleges, universities, and consortia. iii. project outcomes the highlights are summarized below. . the contentdm osid: by january, , the osid repository plug-in was completed and tested using several osid-enabled applications. one of our consultants, jeffrey kahn of verbena consulting, contributed to this effort and facilitated testing with applications other than pachyderm . . the new repository service requires both a java plug-in for osid-enabled applications like pachyderm and support code developed using contentdm’s php api. the php support code includes configuration options identified as useful during testing. . pachyderm . / . integration: by june, , the contentdm osid plug-in was installed and tested in a pachyderm . instance at willamette university. by march, , the plug-in had been installed and tested in pachyderm .
, providing access to both user-uploaded media and assets retrieved directly from contentdm. although plug-in support is not currently available in the public pachyderm . binary release, willamette university revisions to enable plug-in support are under review by pachyderm developers at the cdl. . the carl hall exhibit: jonathan bucci, collection curator at the hallie ford museum of art, and roger hull, professor of art history and senior faculty curator, created an online flash exhibit on the pacific northwest artist, carl hall. entitled carl hall: oregon master, this online exhibit combines images of artwork from the hallie ford museum of art northwest art collection and documents found in the pacific northwest artists archive, managed by the willamette university archives. carl hall was a prolific northwest artist whose work was featured in a issue of life magazine and shown nationally and regionally for decades. he taught painting, printmaking and composition at willamette university from until . the carl hall: oregon master online exhibit is featured on a newly re-designed museum website. the collaboration between the hallie ford museum of art and the willamette university archives focuses on the historical record of regional artists and benefits from shared infrastructure for managing collections and producing digital content. both museum and archives staff use contentdm as their digital repository and will develop future outreach and educational materials using pachyderm and content drawn from online collections. it is worth noting that final revisions of the carl hall multimedia project used our new ability to access contentdm within the pachyderm . authoring tool. . use in the curriculum: in , groups of students under the direction of professor rebecca dobkins were asked to plan and storyboard a pachyderm-based multimedia presentation for a museum exhibition, toi maori: the eternal thread. 
this exhibition of contemporary maori weaving was important in the history of the museum and in the creation of significant cultural relationships between the maori and western oregon tribes, especially siletz and grand ronde. the student project documented both the toi maori exhibition and the extensive programming that occurred at the museum during this event. exhibition images will be organized and available in contentdm and automatically scaled to dimensions that pachyderm can handle gracefully. automatic image scaling will aid future student-authors of a published toi maori pachyderm presentation. . other applications: the contentdm osid plug-in should work with any osid-aware application. to date, it has been incorporated into the resource libraries of the visual understanding environment (vue) and softchalk authoring software for teachers. iv. future directions contentdm repository plug-in as mentioned earlier, the current repository plug-in requires two quite separate components. the first is the java plug-in itself, which resides with pachyderm. the second is php code that runs on the contentdm server and exposes digital assets to osid-aware applications. the latest release of contentdm includes changes that will fundamentally alter this approach: a new restful architecture should allow us to retool the java plug-in to interact directly with a native contentdm web service. once that work is completed, local contentdm sites will no longer need to install the php code developed in this project. retooling the java plug-in will effectively allow any osid-aware application (e.g. vue, softchalk, pachyderm, etc.) to access virtually any contentdm repository. this is the kind of many-to-many possibility that we hoped to achieve at the outset. we are currently exploring the possibility with oclc. pachyderm the recent pachyderm . release is a significant accomplishment and those involved in the effort are to be congratulated. pachyderm .
benefits organizations and classroom settings by providing a simple, template-driven, flash authoring tool. for those with flash experience, pachyderm can be customized to meet local needs. the development of pachyderm . continues, albeit slowly, including experiments with html versions of pachyderm templates for non-flash devices like the ipad. clearly, from the standpoint of this project, inclusion of osid support in a future binary distribution of pachyderm would be most welcome. when and if osid support becomes a standard feature of pachyderm . , an intriguing enhancement would be an editing screen for optionally selecting an image detail with consistent, pre-defined height and width dimensions. this cropping feature is currently not part of pachyderm but might be implemented using the contentdm osid plug-in and the contentdm image libraries (or other image services like djatoka). museum and archives collaboration and outreach the integration of pachyderm authoring with willamette’s growing online museum and archival image collections will streamline efforts to create online exhibits like carl hall: oregon master. this initial, positive experience makes it easier to continue down the path of creating multimedia presentations for permanent museum collections and past exhibits like the art of ceremony: regalia of native oregon, which was selected by the oregon arts commission as the state’s american masterpieces project and is the subject of a current student pachyderm project (discussed below). automatic image scaling is a significant feature for would-be multimedia authors. not only does the contentdm osid plug-in bring images and metadata to the author’s fingertips, it also converts images from jpeg format to jpeg and scales these images to the optimal size for pachyderm presentations. this eliminates a sometimes significant barrier to student and staff use. we believe this will be particularly useful in the case of student projects.
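the scaling step described above can be illustrated with a minimal sketch in python. this is not the plug-in's code (the plug-in itself is java with php support code); the function name and the target box of 800 by 600 pixels are hypothetical values chosen only for illustration, since the report does not state which dimensions pachyderm treats as optimal.

```python
def fit_within(width, height, max_width, max_height):
    """Return (w, h) scaled to fit inside a bounding box.

    The aspect ratio is preserved and images that already fit are
    never enlarged. The box size is supplied by the caller; the
    report does not say which dimensions Pachyderm expects.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Use the tighter of the two constraints; cap at 1.0 to avoid upscaling.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Hypothetical target box of 800x600 pixels.
landscape = fit_within(4000, 3000, 800, 600)
portrait = fit_within(3000, 4000, 800, 600)
small = fit_within(400, 300, 800, 600)
```

the point of doing this on the repository side is that authors never have to resize images by hand before placing them in a presentation.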
moreover, the new release of pachyderm should make it easier to customize and share templates with other museums. for example, the potential for easier collaboration may revive an earlier conversation that the willamette team had with the seattle art museum regarding their customization of pachyderm templates. finally, on a related front, the php code developed in this project has allowed the university web development team to feature images from art collections on the new hallie ford museum of art website. museum staff select images for display simply by editing item records in contentdm. this capacity may be offered to other departments as the willamette university website continues to evolve. faculty and undergraduate use we are excited that this project has directly benefited students. as noted, in a group of students developed pachyderm storyboards for a past museum exhibit, toi maori: the eternal thread. the student project leader is currently in new zealand (funded by an internal university research grant) conducting video interviews with some of the artists who were part of the exhibition. on return, she will include that video, along with photos from the exhibition, in a finished pachyderm project that will be completed during a museum internship in fall semester . in addition, the willamette university center for ancient studies has provided funding to a student to create a pachyderm project for the art of ceremony: regalia of native oregon exhibition. previously, in a museum studies class, this student organized photos of the exhibit and created a small, on-campus, physical exhibition. other spin-offs the php support code developed for this grant was extended to support advanced search and other contentdm features. this allowed us to use google web toolkit to develop an ajax application for contentdm. initial conversations with oclc suggest that we will be able to retool the ajax application for the new restful api. v.
evaluation thus far, the integration of pachyderm and contentdm has been useful at willamette university and we believe it will continue to yield benefits. there may also be benefits from integration with applications like softchalk and vue, although we have not yet seen this in our setting. using vue as the sample application, we earlier shared our project code with two other colleges and tested against their contentdm data. we also presented early results at a code lib meeting, again using vue. although there was interest in the novelty of seeing vue and a local contentdm repository working together, we are not aware of anyone using the contentdm plug-in, which has been available for download for over a year now. it should be noted that we have not advertised pachyderm integration, which may be the most compelling use case that we can offer to libraries and museums using contentdm. one reason for limited adoption of the plug-in is certainly the need to install additional php code on the contentdm server. if this barrier can be removed, it will be possible for local sites to simply use and evaluate an osid-aware application against their local data without the need, or the knowledge required, to download and install local php support. this will also make it easier to market the solution to the contentdm user community. a second reason for limited adoption of the plug-in at other campuses may be a lack of compelling and well-known osid-aware applications. one must consider the possibility that the osid repository plug-in strategy has failed to gain sufficient traction in the open source developer community, with the result that too few applications use the osid model as an integration strategy. integrating data and functionality is a formidable challenge, and osids do offer a useful, and reusable, way for developers to meet the challenge.
it’s not clear, then, whether the problem, if indeed one exists, is the osid technology itself, competition from simpler, ad hoc approaches to integrating content and applications, or a general lack of resources committed to the problem of interoperability. even so, we believe that the integration we achieved between pachyderm . and contentdm has significant benefits. it should be noted that willamette university is among the liberal arts colleges nationally to have an active and successful art museum. this may in part explain why the pachyderm . /contentdm combination appears to work here. sharing the results of our project – particularly if a new release of pachyderm . includes osid plug-in support – should tell us more about the value of this effort to others. vi. impact on the digital humanities the aim of this project is to facilitate new forms of scholarship and learning by seamlessly integrating humanistic tools and data. this is a central problem in advancing the digital humanities. to cite one example, project bamboo identified it as a key challenge and located it at the heart of the current mellon-funded bamboo technology project. bamboo’s proposed research environments “can be used by humanities researchers to store and organize sets of digital content, e.g. text, images, video and audio; to create, maintain, and search rich metadata about this content; to annotate and analyze content; and to accomplish these through collaboration with other scholars.” in contrast to the “big humanities” approach pursued in project bamboo, the goals of this project are strikingly modest -- and also illustrate the difficulty of this kind of effort. content interoperability requires multifaceted solutions that draw on a common set of standards and techniques implemented across multiple platforms. this is not easy to do. many worthy software projects struggle to maintain the basic functionality needed by users. 
for these projects, support for content interoperability is a daunting hurdle to clear, despite obvious benefits. even so, there is a reasonable chance that the tool and content integration realized at willamette university will be extended to a wider audience. the soon-to-be-released contentdm restful api should be an automatic enabler of wider interoperability, since institutions that have local contentdm instances will also have their data exposed through a web service without additional code or configuration on their part (however simple this may be, it remains a barrier). retooling the contentdm repository osid for interaction with the new restful api would make a wealth of data available to osid-aware applications. we have also demonstrated that with few modifications the osid support code developed at the university of virginia does work as intended with pachyderm . . the team responsible for pachyderm . development will need to make the final decision on when and how this capacity is enabled in a new pachyderm release, but the fact that they can use the willamette pachyderm site and see repository integration enabled -- and read detailed information on how we made this happen -- should speed their efforts. the work involved is not prohibitively difficult, but it will require a decision to allocate the necessary staff resources. we will gladly provide any assistance we can. the technical challenges that we encountered were only recently overcome and we have not had the opportunity to share the project results widely. hopefully, the story will soon include a “you can do this, too” pitch as work progresses on pachyderm . and as the contentdm java plug-in is retooled for the new restful api. we are excited that stories of undergraduates and pedagogical outcomes are also a part of what we will share. vii.
links
carl hall multimedia presentation: http://libmedia.willamette.edu/museum/carlhalloregonmaster /
hallie ford museum of art: http://www.willamette.edu/arts/hfma/
contentdm osid source code: http://projects.oscelot.org/gf/project/cdmosid/scmsvn/?action=accessinfo
project download and documentation page: http://libmedia.willamette.edu/acom/neh/
pachyderm . : http://www.pachyforge.org/
white paper report report id: application number: hd project director: daniel cohen (dcohen@gmu.edu) institution: george mason university reporting period: / / - / / report due: / / date submitted: / /
neh digital humanities start-up grant white paper grant #hd- - : scholarpress
with receipt of its neh digital humanities start-up grant in fall , the scholarpress project (http://scholarpress.net) set out to address some common problems faced by the thousands of scholars using the wordpress web content management system to organize their professional web presence and to meet several needs ill-served by existing software. our modest goal was to tackle a few critical aspects of scholarly activity, including assembling interactive course syllabi, publishing bibliographies with broad publics, and building and disseminating a cv. rather than a one-size-fits-all approach to the demands for a scholarly web presence, scholarpress aimed to provide a few tightly-focused tools designed to satisfy these narrowly-defined scholarly needs. building on the extremely popular and flexible wordpress platform, these tools were intended to integrate closely with a scholar's existing digital identity rather than requiring the scholar to learn, install, and customize a new piece of software.
this approach was designed to save the scholar as much time and work as possible. with the support of a digital humanities start-up grant, we proposed to build, test, and disseminate prototypes of two of these tools. we are pleased to report that at grant’s end, we have accomplished, and even exceeded, these goals. despite some significant organizational hurdles and staff changes over the course of the grant, we have produced not two, but three production-ready wordpress plugins for scholarly use under the scholarpress banner. in addition to the courseware and vitaware plugins described in our original proposal, we produced a third plugin called researcher, which provides functionality similar to the “scholarpress cite” plugin anticipated in the future directions section of the grant. courseware scholarpress courseware enables the administration of a college-level class through a wordpress blog. courseware provides the ability to add and edit a schedule, create a bibliography and assignments, and manage general course information. while it was designed primarily for use in higher education courses, it can easily be adapted for other scholarly uses such as workshops, study groups, etc. as of version . . , scholarpress courseware allows a user to create entries in a schedule in the schedule page, add items of various media to a bibliography in the bibliography page, and assign those bibliography items to read (or create other types of assignments) in the assignments page, and edit course information in the course info page. there are several different types of assignments: reading, writing, presentation, group work, research, discussion, and creative. multiple assignments may be added to a particular class meeting date, e.g. two articles could be assigned to read (reading), as well as a one page synopsis of the reading (writing), and a question for students to prepare to discuss (discussion). 
each of these assignments shows up separately on the schedule page for the chosen date. all of these tasks are easily created and edited in separate panels in the administrative interface. the user interface was carefully designed and developed to provide a simple and intuitive process through which the information could be entered. additionally, an administrative dashboard provides an overview of the course for review prior to publication. to date, courseware has been downloaded nearly , times, indicating a significant user base. with some further improvements, this number should continue to grow. ideally, courseware will eventually be extended to interact with the other scholarpress plugins described below, vitaware and researcher, giving users an option to harness their existing zotero libraries in assembling course reading lists and assignments. according to the user feedback we have already received, automating the creation of bibliographies using zotero appears to be one of the most highly desired features for future iterations of courseware. vitaware the scholarpress vitaware plugin makes use of zotero’s native cv builder to allow scholars to publish customized curriculum vitae on their wordpress website. users of zotero.org can already build a cv from items in their zotero library and publish it to their profile on the zotero.org website. however, many scholars desire an easy way to embed this dynamically generated cv on their own personal sites or blogs, which often employ wordpress as the underlying software. vitaware allows users to easily create a cv page on zotero.org and publish that cv on their personal wordpress website. to accomplish this, vitaware leverages the shortcode functionality wordpress introduced in version . . shortcodes are an easy way for users to control where and how plugin information appears on a page, without requiring any coding knowledge. by adding a simple text string to any page, users can pull in their cv from zotero. 
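to make the shortcode idea concrete, here is a minimal sketch of how a placeholder in page text can be expanded into generated content. wordpress shortcodes are implemented in php; this python version only mirrors the mechanism. the shortcode name `example_cv`, the `user` attribute, and the `fake_cv` handler are all hypothetical and are not vitaware's actual syntax, and the handler returns a stub string instead of calling zotero.

```python
import re

# Matches [name attr="value" ...]; a deliberately simplified grammar.
SHORTCODE = re.compile(r'\[(?P<name>\w+)(?P<attrs>[^\]]*)\]')
ATTR = re.compile(r'(\w+)="([^"]*)"')


def expand_shortcodes(text, handlers):
    """Replace each known [shortcode ...] in text with its handler's output."""
    def replace(match):
        handler = handlers.get(match.group("name"))
        if handler is None:
            # Leave unknown shortcodes untouched.
            return match.group(0)
        attrs = dict(ATTR.findall(match.group("attrs")))
        return handler(attrs)
    return SHORTCODE.sub(replace, text)


def fake_cv(attrs):
    """Hypothetical handler: a real plugin would fetch the CV from Zotero."""
    return '<div class="cv">CV for {}</div>'.format(attrs.get("user", "unknown"))


page = 'About me. [example_cv user="jdoe"] Thanks for reading.'
result = expand_shortcodes(page, {"example_cv": fake_cv})
untouched = expand_shortcodes("[other]", {"example_cv": fake_cv})
```

because the page stores only the placeholder, the rendered content can change whenever the underlying data changes, without the author editing the page again.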
in addition to ease of use, the great advantage of using the zotero api in this instance is that the user’s online cv is dynamically generated. enter a new publication in your zotero library, and your online cv is automatically updated with the correctly formatted citation on all instances generated by vitaware. using multiple shortcodes for different faculty members, an entire department could use vitaware to manage the faculty profiles section of its website. researcher scholarpress researcher is another plugin that uses the zotero api, in this case to display portions of a user’s zotero library on a wordpress page. users can pull their entire library or portions of it into wordpress for display on any page or post. the plugin allows for several customizations, such as citation style and item order, making it useful for different use cases and for scholars of different disciplines. researcher offers two methods for users to display zotero content. the first employs the aforementioned shortcodes functionality of wordpress. a shortcode containing a library or collection id can be placed anywhere on any page and will bring back the items contained in that library or collection. a second option utilizes the simplicity of custom fields in wordpress. a user simply enters a library or collection id into a custom field, and places a small function in his or her wordpress theme. this function will recognize the presence of the custom field data, and display the content for that library or collection. this latter method was tested on, and is currently being used by the connecticut humanities council (chc) on the new connecticuthistory.org website. for this project, the chc populated a group library in zotero with a large number (nearly , ) of items. these items were grouped into collections for discrete topics, people, and places, as well as sub-collections for different media types such as audio, video, books, or documents. 
simply by entering the collection id for a specific topic, researcher automatically pulls in all of the items in the collection organized by media type onto a chosen page. in this way, the bibliographies for several hundred pages are managed centrally in zotero and update dynamically on the live website. additionally, because their bibliographical content is managed outside of the website itself, chc will be able to repurpose this valuable information for other uses and audiences. researcher was developed with the idea that users might have different skill sets, and different uses for the plugin, hence the two different approaches (shortcodes and custom fields). future development plans include allowing options such as citation type and ordering to be set in dropdown menus, making the interface more user-friendly, as well as a simple interface to enter single citations in a footnote style. conclusion scholarpress represents a relatively modest intervention in the educational and scholarly technology landscape. but it reflects a model that we believe can be extended and built upon. the main objective of scholarpress was to show that by extending an off-the-shelf, open source, modular technology suite like wordpress, scholars could be afforded greater options with which to manage their digital presence. specifically, scholarpress offers scholars the option of using a system with which they are already comfortable to manage key aspects of their scholarly, pedagogical, and public humanities work. scholarpress is already used by hundreds of scholars. in the coming years, we will aim to increase these numbers and, equally importantly, to strengthen the small, but growing developer community that has started to coalesce around the new software. 
librarian, heal thyself: a scholarly communication analysis of lis journals – in the library with the lead pipe apr micah vandegrift and chealsye bowley in brief this article presents an analysis of library and information science journals based on measurements of “openness” including copyright policies, open access self-archiving policies and open access publishing options. we propose a new metric to rank journals, the j.o.i. factor (journal openness index), based on measures of openness rather than perceived rank or citation impact. finally, the article calls for librarians and researchers in lis to examine our scholarly literature and hold it to the principles and standards that we are asking of other disciplines. introduction january saw the launch of sponsoring consortium for open access publishing in particle physics (scoap ), which was the first major disciplinary or field-specific shift toward open access. considerable numbers of journals and publishers are moving to embrace open access, exploring a variety of business models, but scoap represents a significant and new partnership between libraries, publishers and researchers.
simply, journals under the scoap  program were converted to open access overnight and are being supported financially by libraries paying article processing charges through a consortium rather than purchasing subscriptions. the physics field has been at the forefront of open access for more than years, beginning with the foundation of arxiv.org and followed by their premier society, american physical society (aps), actively evolving their publications to provide efficient open access options for authors. there has yet to be any such movement in the professional literature of library and information sciences (lis), despite the fact that the library world is inextricably linked to “open access” both in principle and in practice. the authors note this disciplinary discrepancy, and through an analysis of lis journals and professional literature hope to inspire those researching and publishing in the lis field to take control of our professional research practices. we conducted this analysis by grading select lis journals using a metric we propose to call the “j.o.i factor” (journal openness index), judging “how open is it?” based on a simplified version of the open access spectrum proposed by public library of science (plos), the scholarly publishing and academic resources coalition (sparc), and the open access scholarly publishers association (oaspa). it is our hope that doing so will lead to the shifts in the scholarly communication system that libraries are necessarily pursuing. background scholarly publishing is evolving in many ways, as anyone connected to academia knows. discussions about publishing often center on the potential that digital technology offers to disseminate the results of scholarly research, a role traditionally filled by scholarly associations, societies, university presses, and commercial publishers. 
scholars and researchers at institutions ranging from ivy league universities to state colleges are raising questions about how non-traditional “digital” scholarship will be evaluated, what criteria and credence should be given to new, openly accessible online journals, and what role open access repositories have in disseminating and preserving the scholarly record. reaching even into public policy, the office of science and technology policy (ostp) convened a scholarly publishing roundtable in . that group’s final report offered the recommendation that each federal research agency (national science foundation, national endowment for the humanities, etc.) should expeditiously develop and implement public access policies, offering free access to results of federally funded research. saw ostp revisit that recommendation and, in response to an overwhelming petition, issue a directive to all federal funding agencies with more than $ million in r&d funding to develop and implement open access policies, similar to the national institutes of health’s public access policy, in effect since may , . popular media are also taking up the question of how scholarly publishing will evolve. the guardian regularly features pieces in its higher education network calling for redefinition of the publishing cycle that earns large publishing companies significant financial gains from the gift economy of intellectual content and peer-reviewing in which faculty participate. one opinion piece went so far as to say, “academic publishers make murdoch look like a socialist… down with the knowledge monopoly racketeers.” the economist coined the term “academic spring” in a february piece, referring to faculty’s rising discontent with the current system. the piece cites the example of timothy gowers, an award-winning cambridge mathematician who called for a boycott of elsevier, a large stem publisher, for its unsatisfactory business practices.
as of april nd, that boycott, thecostofknowledge.com, had , signatories. finally, us news and world report published a piece in july that opened with harvard university’s faculty advisory council stating “many large journal publishers have made the scholarly communication environment financially unsustainable and academically restrictive.” responding to these “tectonic shifts in publishing,” university libraries and academic librarians are undergirding a system that is shaky at best. budgets remain flat, while subscription costs continue to rise; all the while many libraries are investing in staff and infrastructure in the area of scholarly communication, supporting open access initiatives, or moving directly into publishing themselves. while the primary push for adapting this system has been working through disciplinary faculty to change research culture, academic librarians are slowly engaging the idea that publishing practices within our own journals and professional writing could be an effective way to mold the future of academic publishing. the scope of this article is to engage our own community, librarians who publish in professional or academic literature, and target pressure points in our subset of academic publishing that could be capitalized upon to push the whole system forward. we are approaching this topic with the goal of plainly sketching out what lis publishing looks like currently, in terms of scholarly communication practices like copyright assignment, journal policies for open access self-archiving and open access publishing. literature review studies of this magnitude have been conducted in the recent past, although they have primarily focused on the attitudes of individual librarian authors toward publishing practices more than on analyzing the publishing practices and policies of the journals themselves.
elaine peterson, in , produced an exploration of “librarian publishing preferences and open-access electronic journals”, in which she conducts a brief survey. the results show that academic librarians often consider open access journals as a means of sharing their research but hold the same reservations about them as many other disciplines, i.e. concerns about peer review and valuation by administration in terms of promotion and tenure. this line of thought is continued in snyder, imre and carter’s study, which focused more specifically on intellectual property concerns of academic librarian authors and allowable self-archiving practices. they quote peter suber, author of open access and director of harvard’s open access project, writing, “‘there is a serious problem [serials pricing and permission crisis], known best to librarians, and a beautiful solution [open access] within the reach of scholars.’ one can draw the conclusion from suber’s statement that librarians as authors should be the most prominent supporters of open access and that, as scholars, they would practice self-archiving.” this study in particular lays an unsettling foundation: % of respondents cared mostly about publication without considering the copyright policies of the journals in which they published, and only % had exercised the right to self-archive in an institutional repository. these and other similar studies highlight the simple fact that concerns about changing publishing habits are the same within librarianship as they are in many other disciplines. college and research libraries (c&rl), a well-regarded journal for academic librarianship, published four articles between and that studied the publishing practices of academic librarians through surveys. each has contributed valuable insights while reaching very similar conclusions across the board.
palmer, dill and christie conclude that in attitude, “librarians are in favor of seeing their profession take some actions toward open access […] yet this survey found that agreement with various open access–related concepts does not constitute actual action.”  mercer, focused on the publishing and archiving behaviors rather than attitudes of academic librarians, highlights the substantial differences between the dual role many academic librarians inhabit; library professional first and academic researcher second. she writes, “…librarians may be risk takers in their professional roles, where they are actively encouraging changes in the system of scholarly communication and adoption of new technologies but are risk-averse as faculty in their roles as researchers and authors.”  taken together, the research could lead one to think that academic librarians are invested in changes to the scholarly publishing system about as little as disciplinary faculty and are just as cautious about evolving their own publishing habits. many academic authors write and publish out of passion for their research and to contribute to the progression of knowledge in society. unfortunately, because of the system of measurement in which academia is mired, credentials, merit and perception can also play a substantial role in the publishing decisions of faculty. without delving too deep into the discussion of tenure for librarians, the expectations for publishing in certain journals, or at all, are slightly different for librarians than other university faculty. both the h-index and journal impact factor are measurements of supposed “impact,” based on the citations an article receives, which have in turn been equated with quality.  the h-index is an impact measure for an individual, whereas impact factor applies to the journal level. 
two recent studies follow mercer’s line of argument and look at the journals in the lis field, rather than the authors, using these two traditional measures of “impact.” jingfeng xia conducted a fascinating study proposing that the h-index of authors published in a journal, as opposed to that journal’s impact factor, could provide an efficient method of ranking lis journals, especially those that are open access and not listed in journal citation reports. xia’s article also underscores some of the complications that arise when lumping together all journals in the library and information science field: library and information science research (lisr), a researcher-focused journal published by elsevier (h-index = , impact factor = . , not open access) is judged alongside d-lib magazine published by the corporation for national research initiatives (h-index = , impact factor = . , open access), a journal aimed at the practice of digital librarianship. lisr’s impact factor ( . ) is high for lis journals (median . ), but when compared to the h-index of d-lib’s authors lisr seems to have less “impact.” xia’s employment of the h-index as a measurement, illustrated in this example, shows the breadth and depth that alternate metrics may introduce, the complications of judging journal quality based on citations, and the potential inversion of perceived impact depending on how one looks at it. expanding on the idea that acknowledging the perceived quality of journals is a valuable practice within librarianship, judith nixon’s “core journals in library and information science: developing a methodology for ranking lis journals” was published in by c&rl.
she proposes, based on successful practices at purdue university libraries, that “top lis journals can be identified and ranked into tiers by compiling journals that are peer-reviewed and highly rated by the experts, have low acceptance rates and high circulation rates, are journals that local faculty publish in, and have strong citation ratings as indicated by an isi impact factor and a high h-index using google scholar data.”  the production of a ranked list like this aligns perfectly with the type of study we performed, and our conclusions will highlight some similarities and differences between nixon’s list and our findings, pitting the journal openness index (j.o.i) factor against the top tier journals she presents. whereas some of these studies in lis publishing focused on the “people” angle, studying librarians and their attitudes and practices around publishing, we chose to follow more recent research and widen the lens to look at the journals in which librarians might publish. a challenge presents itself when broadening to this scale: there is the ever-present blurred line between the publishing habits of working librarians and those of teaching/research faculty in library schools and academic departments — library and information science research vs. d-lib magazine for example. there are obvious differences between these groups, so pairing analysis on the specific journals where professional librarians typically publish with the more specific studies on that same group’s publishing habits will present the most accurate portrait of the scholarly communication landscape as it has been studied to date. we leave the extension of this research for future study. methodology our live dataset is viewable on this google spreadsheet. downloadable and citable data are accessible on figshare. journal selection the journals that we began with came from an internal list compiled as part of a professional development initiative at florida state university libraries. 
a student worker in the assessment department compiled the original list of journals, and then the co-authors of this piece expanded that list to after consulting the lis publications wiki. the journals were ingested into a spreadsheet with columns for impact factor, scope, instructions for authors, indexing information and other common details. our first task was to add columns for copyright policy, open access archiving policy, and open access publishing options. our journal list includes an extraordinarily broad range of journals including research focused journals and those in subfields of librarianship like archives and technical services. this decision was made so as to gather data from the broadest possible representation of lis scholarship. data collection after compiling and organizing the journal list, we collected each journal’s standard policies on copyright assignment, open access self-archiving (“green open access”), and open access publishing (“gold open access”). we began gathering these data by searching the sherpa/romeo database for commercial journals and the directory of open access journals (doaj) for open access journals. after searching these databases, we double checked policies and open access options on the journal and/or publisher’s website using the following workflow: locate the policies section of the website, which is commonly labeled “policies,” “policies and guidelines,” “author’s rights,” or “author’s guidelines”; identify the copyright policy of the journal; identify the open access self-archiving policy or “green open access” options that the journal permits; identify the open access publishing or “gold open access options” of the journal, which may be listed in the policies section or a specific “open access options” section; and finally view the copyright transfer agreement or other author agreement, if available. all details were inputted to the spreadsheet and coded for consistency. 
j.o.i factor (journal openness index) grading journals based on how “open” they are, as opposed to citation impact or h-index, is a novel approach, and one that had not been applied to lis literature to our knowledge. in fact, it is not clear that this measurement has been used extensively in any field or practice aside from the production of the spectrum and some supporting documentation by plos, sparc, and oaspa. potentially then, as further research is done using the j.o.i factor, the grades we apply to journals herein may be different based on how many measures of openness are used and how they are counted. our proposed enumeration of the j.o.i factor is indicated on the image below, superimposed over the “how open is it?” scale produced by sparc/plos. the application of j.o.i factors to specific journals is contained to our conclusion section for purposes of clarity and emphasis. our proposed journal openness index, adding numerical values to plos/sparc’s how open is it spectrum. the original spectrum breaks openness down to six categories, three of which overlap neatly with the criteria we used in our analysis: ) copyrights, ) reuse rights, and ) author posting rights. the remaining categories, reader rights, automatic posting, and machine readability were mostly ancillary to our focus, and so the j.o.i factor numbers that we apply only account for the three criteria we researched. the “reader rights” category does include some details about embargoes but typically refers to embargoes on the final published pdf released after that term expires by the publisher. our use of the embargo data point was in terms of author posting rights, so we chose not to include reader rights as a category in our j.o.i factor calculations. 
also, the spectrum lumps open access publishing options, another of our data points, in with reader rights as “immediate access to some, but not all, articles (including the ‘hybrid’ model),” where “hybrid” means the business model in which articles can be made open access on a one-by-one basis for a fee. we decided to add a “-” for journals that offer open access publishing for a fee, illustrating the negative connotation that might have for authors. journals that are fully open access without any publishing fees will have a j.o.i number and a “+” illustrating positive connotations. information technology and libraries, for example, published by library and information technology association/ala, would have a j.o.i factor of +: four points for author retention of copyrights, four points for broad reuse rights (cc-by), four points for the author being allowed to post any version of the article in a repository and “+” for the journal being fully open access without imposing any publication fees. we hope that the application of the j.o.i factor in this article serves merely as a proof of concept, and we invite colleagues to use our data, apply j.o.i factors to all the journals we listed there, and extend this work to account for the full range of possible factors of openness. data analysis the most common major publishers from our sample were taylor & francis ( journals), emerald ( journals), and elsevier ( journals). society and association publishers followed closely with journals, and universities and university presses had . the remainder were either unknown, other types of organizations, smaller publishing houses or “self-affiliated.” the three clearly self-affiliated journals, first monday, code lib and in the library with the lead pipe, are all fully open access but differ in their copyright policies, illustrating the variety of publishing options within the lis field.
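the j.o.i factor arithmetic described above (points for copyright, reuse rights, and author posting rights, plus a “+” or “-” marker) can be sketched in a few lines of python. this is only a sketch under stated assumptions: the text confirms a maximum of four points per criterion, but the lower bound of zero, the class and function names, and the way a journal's policies are encoded are hypothetical and are not taken from our spreadsheet.

```python
from dataclasses import dataclass

MAX_POINTS = 4  # the worked example awards four points per criterion


@dataclass
class JournalPolicy:
    """Hypothetical container for the three scored criteria and the OA model."""
    copyright_points: int
    reuse_points: int
    author_posting_points: int
    fully_open_no_fee: bool = False  # fully open access, no publication fee
    paid_open_option: bool = False   # open access offered for a fee


def joi_factor(policy):
    """Return the J.O.I. factor as a string such as '12+' or '5-'."""
    points = (policy.copyright_points,
              policy.reuse_points,
              policy.author_posting_points)
    for value in points:
        # Assumed range: 0 (least open) to MAX_POINTS (most open).
        if not 0 <= value <= MAX_POINTS:
            raise ValueError(f"points must be between 0 and {MAX_POINTS}")

    total = sum(points)
    if policy.fully_open_no_fee:
        marker = "+"  # positive connotation: open access with no fee
    elif policy.paid_open_option:
        marker = "-"  # negative connotation: open access only for a fee
    else:
        marker = ""
    return f"{total}{marker}"


# Author keeps copyright, CC-BY reuse, any version may be posted,
# fully open access without fees: 4 + 4 + 4 with a "+" marker.
print(joi_factor(JournalPolicy(4, 4, 4, fully_open_no_fee=True)))
```

keeping the marker separate from the numeric total mirrors the way the “+” and “-” are described: they flag the publishing model rather than add or subtract points.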
each journal was assigned a corresponding code for its copyright, open access archiving, and open access publication policies. these codes were used primarily for organizing the information in our spreadsheet, and are not conflated with our proposed j.o.i factors which are applied after all data were collected, organized, and analyzed. the codes represent the range of possible options under each category, based on the variety of options we identified in the journals we reviewed. for example, the copyright field could range from ( ) required full transfer of copyright to ( ) copyright jointly shared between author and publisher. self-archiving policies ranged from not permitted ( ) to allowing the final published pdf ( ), with a range of embargo periods for each category in between. (see table for all codes) table : journal policy codes as applied to our data. copyright policies despite librarianship’s ongoing waltz with copyright complications, of the lis journals we reviewed still require the author to transfer all copyrights to the publisher, “during the full term of copyright and any extensions or renewals… including but not limited to the right to publish, republish, transmit, sell, distribute and otherwise use the [article] in whole or in part… in derivative works throughout the world, in all languages and in all media of expression now known or later developed” (emphasis our own). however, leaning toward a more expansive rights agreement, journals allow the author to retain copyright, of which require a license to publish be granted to the publisher.   of the that require a license granted to the publisher are taylor & francis journals, which fall under their new author rights for lis journals. taylor and francis shows leadership in adapting their rights agreements for lis journals, although one co-author of this article sought to push them further, with success. 
the remaining journals that allow the author to retain copyright also offer the article to be published under a creative commons (cc) license, ranging from attribution-non-commercial-no derivatives (collaborative librarianship) to public domain (first monday). the boldest and most progressive copyright policy goes to first monday, which offers total author choice, from copyright transfer (©), through every possible creative commons license, to releasing the work in the public domain (cc ). snapshot: % of the journals require full transfer of copyright; % of the journals allow authors to retain copyright; . % had a choice between full copyright transfer and retaining some rights; . % were unknown and . % had joint copyright between the author and publisher. open access self-archiving policies (green open access) this category provided the broadest range of possibilities, mostly due to the fact that different publishers assign different terms of embargo for self-archiving. assuming that well-informed lis authors who submit to these journals desire the simplest and broadest open access options, journals allow the pre-print (submitted version), post-print (accepted version) and final published pdf to be archived in an open access institutional repository, with no stated embargoes. of these are fully open access journals, and they are all published by societies, associations, universities or self-affiliated groups. common thought in academic publishing tends to say that society/association publishers lose the most when going open access; it is heartening to see this is absolutely untrue in lis literature. the strictest embargo on self-archiving in an institutional repository is months for of the taylor and francis journals. 
university of texas press and university of chicago press both allow archiving after months, while ironically, given the topic of the journal, the journal of scholarly publishing published by toronto university press only allows archiving of the pre-print with no policies for post-prints. an important point to consider when discussing self-archiving policies is the farce that they truly are. kevin smith, duke university’s scholarly communication officer stated it most plainly in his february  blog post titled it’s the content, not the version! he writes, …this notion of versions is, at least in part, an artificial construction that publishers use to assert control while also giving the appearance of generosity in their licensing back to authors of very limited rights to use earlier versions.  the versions are artificially based on steps in the publication permission process (before submission, peer-review, submission, publication), not on anything intrinsic to the work itself that would justify a change in copyright status. the practice of self-archiving is totally dependent on copyright transfer agreements, and based on the representative sample of lis journals we reviewed, all but % had direct or implied policies regarding what the author is allowed to do with specific versions of the same work. the author’s false sense of control over their work and the publisher’s exploitation of that sense deserves a study unto its own. suffice it to say that if the field of library and information studies considers a green open access policy a good deal, there is much work to be done. snapshot: % allow pre-print and post-print archiving; % allow pre-print and post-print archiving after - months; % allow pre-print, post-print, and publisher’s pdf to be archived; . % have unclear policies and . % are unknown. open access publishing policies (gold open access) a common misconception about achieving open access is that it always requires a fee on the part of the author. 
while this is mostly true for traditional commercial publishers attempting to retain their income stream while “acquiescing” to the desires of their authors, it is false more broadly, as our analysis shows. journals offer open access on an article-by-article basis and require an article processing charge (apc) ranging from $ to $ , . of these are published by commercial publishers (elsevier, sage, springer, wiley, taylor and francis, and emerald). in stark contrast, journals on our list are fully open access and all articles are published without a fee. a significant number, journals, either do not offer a “gold open access” publication option or do not publicize it. a number of the journals that do not offer or publicize a paid open access business model are university presses ( ), and association/society journals ( ). snapshot: % of journals are already open access and publish without a fee, % offer open access publication options through an article processing charge, and % of journals offer no gold option or do not publicize it. open access lis journals as noted above, within these lis journals there is considerable diversity in policies. we wanted to further explore that depth of difference by looking specifically at the fully open access journals in our sample. this section reiterates some of the analyses from previous sections, but we thought it still important to enumerate the complexities of publishing within this subset of a subset. of the journals that we looked at are open access, and only two (the international journal of library science and ifla journal) have a publication charge, $ and $ respectively. while two of the open access journals require a full copyright transfer (international journal of library science and student research journal), a little more than half of them ( ) allow the author to keep copyright and attach a creative commons license to the work.
of these fully open access journals allow the author to deposit the final published pdf in a repository, meaning that fully open access journals either place some restrictions on the reuse of open access content or have poorly defined reuse policies. even though these are open access journals, the data suggests that what qualifies as “open access” even within our own field is still loosely defined, a point we attempt to illustrate by applying the j.o.i. factor at the close of this article. some might make the argument that any restriction of authors’ rights (copyright) and readers’ rights (reuse via licenses) toes the line of not achieving pure open access. emily drabinski, a reviewer of this article, made the salient point that the policies we discuss as needing to change are under the purview of journal editorial boards who are often in the complicated position of being between authors (colleagues) and publishers. to that end, we encourage journal editors as well as authors to lead by taking action. regardless, as the measures of openness are more effectively discussed within our communities of practice, the lis field is making slow progress toward public access (readability) and open access (re-usability), a trend we expect to broaden and deepen. conclusions this article illustrates something with which every researcher in the field of library and information studies must contend. a significant percentage of our professional literature is still owned and controlled by commercial publishers whose role in scholarly communication is to maintain “the scholarly record,” yes, but also to generate profits at the expense of library budgets by selling our intellectual property back to us. 
conversely, there is much to be proud of, including the many association, society and university-sponsored journals that are well-respected and proving important points about the viability of open access as a business model, a dissemination mechanism, and a principle to which librarians hold — our “free to all” heritage. it is our hope that this article inspires the activism that the earlier articles from our review of the literature pointed out as a disturbing discrepancy in our professional practice. simply, this is our call for librarians to practice what we preach, regardless of, or even in the face of, tenure and promotion “requirements,” long-held professional norms, and the unnecessary fear, uncertainty and doubt that control academic publishing. we already have models for activism on the collections side of our work; we call our colleagues to echo those impulses on the production side of scholarship, as editors, authors, bloggers, library publishers, and consumers of research. there are three practical means of seeding this change; ) exercise the right to self-archive every piece of scholarship published in lis journals, or better yet never give those rights away in the first place; ) move the “prestige” to open access, meaning offering the best work to journals that are invested in a more benevolent scholarly communication system; and ) as editors, work diligently to adapt the policies and procedures for the journals we control to align with our professional principles of access, expansive understanding of copyrights, fair use, and broad reusability. returning to “nixon’s list,” which proposed a possible ranking system for lis journals, it is interesting to grade her list in terms of the “openness” criteria we’ve employed in this article, and in light of the practical actions we propose. nixon’s findings present journals that were determined to be the “tier one” journals, based on the criteria she and her colleagues developed.   
of those were also identified as top lis journals from her literature review. table shows those “prestige” journals, as graded by our applied j.o.i factor. table : tier one lis journals graded by j.o.i factor the results are striking. college and research libraries, widely regarded as a top journal for practicing librarians, received a j.o.i factor of +, whereas information technology and libraries (ital) measures at +, all because of ital’s generous reuse rights policy (cc-by). jasist is tied for last place (j.o.i factor -) with elsevier and emerald journals because of copyright transfer requirements, no reuse rights and middling author posting allowances. library trends and library quarterly (university press journals) sit solidly in the middle, entirely due to their author posting policies which allow posting the publisher’s pdf. based on this, in closing, we submit these final questions to the lis research community: are these the journals we want on a top tier list, and what measure of openness will we define as acceptable for our prestigious journals? further, how long will we tolerate measurements like impact factor and h-index guiding our criteria for advancement, while accounting for very little that matters to how we principle ourselves and our work? finally, has the time come and gone for lis to lead the shifts in scholarly communication? it is our hope that this article prompts furious and fair debate, but mostly that it produces real, substantive evolution within our profession, how we research, how we assign value to scholarship, and how we share the products of our intellectual work. our thanks and gratitude go to emily drabinski for her thoughtful, helpful and engaging comments as the external reviewer of this article. thanks also to lead pipe colleagues and editors, ellie, erin, and hugh, for challenging our ideas, correcting our bad grammar and making this lump of coal into a diamond. 
most of all, thanks to brett for proposing the term “journal openness index” to replace our not creative and weird-sounding original concept. bibliography peterson, e. ( ) librarian publishing preferences and open-access electronic journals. electronic journal of academic and special librarianship, ( ). accessible at http://southernlibrarianship.icaap.org/content/v n /peterson_e .htm carter, h., carolyn snyder, and andrea imre. ( ) “library faculty publishing and self-archiving: a survey of attitudes and awareness.” portal: libraries and the academy, ( ). open access version at http://opensiuc.lib.siu.edu/morris_articles/ / palmer, k., emily dill, and charlene christie. ( ) “where there’s a will there’s a way?: survey of academic librarian attitudes about open access.” college and research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html mercer, h. ( ) almost halfway there: an analysis of the open access behaviors of academic librarians. college and research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html nixon, j. ( ) core journals in library and information science: developing a methodology for ranking lis journals. college and research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html smith, k. ( ) its the content, not the version! scholarly communications @ duke [blog], posted on february . accessible at http://blogs.library.duke.edu/scholcomm/ / / /its-the-content-not-the-version/ data vandegrift, micah; bowley, chealsye ( ): lis journals measured for “openness.” http://dx.doi.org/ . /m .figshare.   other readings malenfant, k. j. ( ) leading change in the system of scholarly communication: a case study of engaging liaison librarians for outreach to faculty. college & research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html sugimoto, c. r., tsou, a., naslund, s., hauser, a., brandon, m., winter, d., … finlay, s. c. 
( ) beyond gatekeepers of knowledge: scholarly communication practices of academic librarians and archivists at arl institutions. college & research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html xia, j. ( ) positioning open access journals in a lis journal ranking. college & research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html henry, d. and tina m. neville. ( )  research, publication, and service patterns of florida academic librarians. the journal of academic librarianship, . open access version at http://hdl.handle.net/ / . published version at http://dx.doi.org/ . /j.acalib. . . joswick, k. ( ) article publication patterns of academic librarians: an illinois case study. college & research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html hart, r. ( ) scholarly publication by university librarians: a study at penn state. college & research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html wiberley, jr., s. julie m. hurd, and ann c. weller ( ) publication patterns of u.s. academic librarians from to . college & research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html harley, d.; acord, sophia krzys; earl-novell, sarah; lawrence, shannon; & king, c. judson. ( ). assessing the future landscape of scholarly communication: an exploration of faculty values and needs in seven disciplines. uc berkeley: center for studies in higher education. accessible at  http://www.escholarship.org/uc/item/ x g frass, w. jo cross, and victoria gardener ( ) taylor and francis open access survey – supplement - data breakdown by subject area. accessible at http://www.tandfonline.com/page/openaccess/opensurvey priego, e. ( ) fieldwork: mentions of library science journals online. accessible at http://www.altmetric.com/blog/fieldwork-mentions-library-journals-online/ the price of information (feb. 
) http://www.economist.com/node/ a (free) roundup of content on the academic spring (april ) http://www.guardian.co.uk/higher-education-network/blog/ /apr/ /blogs-on-the-academic-spring academic publishers make murdoch look like a socialist (aug. ) http://www.guardian.co.uk/commentisfree/ /aug/ /academic-publishers-murdoch-socialist is the academic publishing industry on the verge of disruption? (july ) http://www.usnews.com/news/articles/ / / /is-the-academic-publishing-industry-on-the-verge-of-disruption   see open access directory “journals that converted from toll access to open access.” accessible at http://oad.simmons.edu/oadwiki/journals_that_converted_from_ta_to_oa [↩] bolick, j. ( ). “we need a scale to measure the #scholcomm friendliness of a journal: based on @sparc_na and @plos #howopenisit: hoii factor?” tweet available at https://twitter.com/joshbolick/status/ [↩] information pulled from library journal’s annual periodicals price survey, accessible at http://lj.libraryjournal.com/ / /publishing/the-winds-of-change-periodicals-price-survey- / [↩] peterson, e. ( ) librarian publishing preferences and open-access electronic journals. electronic journal of academic and special librarianship, ( ). accessible at http://southernlibrarianship.icaap.org/content/v n /peterson_e .htm [↩] carter, h., carolyn snyder, and andrea imre. ( ) “library faculty publishing and self-archiving: a survey of attitudes and awareness.” portal: libraries and the academy, ( ). open access version at http://opensiuc.lib.siu.edu/morris_articles/ / [↩] additionally, its partner newsletter, college and research libraries news, has run a column dedicated to scholarly communication since . [↩] palmer, k., emily dill, and charlene christie. ( ) “where there’s a will there’s a way?: survey of academic librarian attitudes about open access.” college and research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html [↩] mercer, h. 
( ) almost halfway there: an analysis of the open access behaviors of academic librarians. college and research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html [↩] for more on the issues associated with impact factor, see this editorial from nature – http://blogs.nature.com/news/ / /scientists-join-journal-editors-to-fight-impact-factor-abuse.html a response to these issues is the recent push toward altmetrics (alternative metrics), measuring many other forms of impact beyond simply citations – http://altmetrics.org/manifesto/. [↩] nixon, j. ( ) core journals in library and information science: developing a methodology for ranking lis journals. college and research libraries, . accessible at http://crl.acrl.org/content/ / / .full.pdf+html [↩] our use of “fully open access” throughout this article means published online, freely accessible to anyone with an internet connection, with broad copyright and reuse options for authors and readers respectively. [↩] language pulled from a standard wiley agreement, although only one lis journal we reviewed is published by wiley. elsevier would be a better example agreement, but curiously elsevier’s copyright transfer agreements are nearly impossible to find on the web anymore. [↩] further study should be dedicated to determining if these “licenses to publish” are exclusive, non-exclusive or have other clauses that render them less effective than they seem. [↩] smith, k. ( ) its the content, not the version! scholarly communications @ duke [blog], posted on february . accessible at http://blogs.library.duke.edu/scholcomm/ / / /its-the-content-not-the-version/ [↩] cc-by is the most common license for these open access journals. 
[↩] “top lis journals can be identified and ranked into tiers by compiling journals that are peer-reviewed and highly rated by the experts, have low acceptance rates and high circulation rates, are journals that local faculty publish in, and have strong citation ratings as indicated by an isi impact factor and a high h-index using google scholar data.” nixon, j. ( ), http://crl.acrl.org/content/early/ / / /crl - .full.pdf [↩]

responses

scott walter – – at : pm thanks for mentioning the work that authors in college & research libraries have published in recent years regarding open access and its adoption by scholars in lis and other fields. c&rl has been invested in open access issues, not just as a trend in the research in our field, but in its own practice, as we have made a complete transition in the journal from a traditional subscription model to a fully oa model (and one that includes equally free and open access to our entire, -year backfile). essays accepted for publication in c&rl are openly available to all, for free, throughout their lifecycle from pre-print to fully-formatted and published article. i did want to point out one discrepancy in your analysis of c&rl’s openness, however, and that has to do with “re-use rights”: all articles published in c&rl include a cc-by-nc license, which you can see noted on both the abstract page and the full-text versions of any of our current articles, e.g., the aptly-titled “beyond gatekeepers of knowledge: scholarly communication practices of academic librarians and archivists at arl institutions” (march ), available at http://crl.acrl.org/content/ / / .full.pdf+html. with this data point updated, i hope our “j.o.i. factor” may even increase!

micahvandegrift – – at : am hi scott – thanks for the clarification.
we did a lot of the data analysis over the past year, and were sure there would be things we overlooked, or that had changed. i am a big fan of the work over at c&rl for all the reasons you mention, so i hope you’ll forgive the oversight. one suggestion, if i may be so bold – the cc-by-nc license is buried a pdf. is there any reason it couldn’t be clearly indicated on the “instructions for authors” page, or plastered on the front webpage of the journal? aside from our argument for greater openness, we also want to push lis literature to be transparent and clear. using the easily recognizable creative commons logo would go a long way too. and, since i have your ear, why the decision to impose a cc-by-nc on c&rl authors? why not follow the lead of first monday and allow full author choice? what does the nc actually practically accomplish? why is c&rl still enforcing standards that are hangovers from print publication (ultra prescriptive reference and citation rules, assignment of page numbers)? and, my biggest beef with c&rl, why the pre-print bottleneck? if the article is accepted, and your platform is digital, why not release it as “published” and do away with the constrictions of volume and issue, which don’t matter anyways except for the legacy of a print publication? i am a supporter of everything you all are doing over there, just want to make sure you know i think there is plenty of room to grow. if c&rl truly embraced open access and digital publication, i believe it should look more like plos with first monday’s rights/reuse structure. scott walter – – at : am micah – you are correct that the “instruction for authors” page needs to be updated, not just for the item you note, but also to highlight our support for links to open data, etc., that are now possible. maybe even to acknowledge the possibility of an open peer review option (depending on how that experiment goes with one of your lead pipe colleagues). such an update is on my “to do” list for the year. 
not on my list, though, are the changes you suggest regarding formatting and the continued release of specific issues of the journal. the reason these changes are not on my list, valid options though they are, is because they are not currently supported by the broader community of c&rl readers and authors. in reader surveys and focus groups conducted as we planned for the transition, the retention of these legacy markers of quality of content were specifically noted as important to continue. by increasing the number of articles included in each issue, we are on target to shrink the time to publication significantly over the next year, but to do so in a manner that retains important characteristics of our print heritage that readers and authors told us were important for the continued vitality of the journal. we’ll continue to engage our community in the coming years on issues like this, and i am sure the journal will continue to grow in the right direction for a top-tier, peer-reviewed, oa journal in lis. micahvandegrift – – at : pm glad to hear, and i understand the many layers of complexity in moving something like this along. i’d be really interested in conducting that survey of c&rl readers and authors now, as scholcomm seems to have developed so drastically in the years i’ve been a librarian. i’d wager that the results might be different. if it ever comes to a vote, mine will be surely on the side of eschewing those “legacy markers of quality” since i believe nostalgia for legacy impairs innovation. all that said to say, i’m on your team! do great things! innovate! lets lead scholarly publishing rather than react! 
silvia cho – – at : pm thank you so much for this very valuable work. i’m wondering how you feel about electronic journal license terms on allowing or prohibiting interlibrary loan as another measure of journal openness. often, electronic journal licensing terms are restrictive or unclear, expressly prohibiting or simply deterring libraries from sharing electronic journal articles with other libraries. with print materials, we have fair use best practices to rely upon; with electronic journals, we additionally have to contend with restrictive licensing terms in addition to copyright matters. this is a frequent barrier we have to contend with in resource sharing as we seek to connect users to resources they need.

micahvandegrift – – at : pm hi silvia – i think you’re absolutely right on the money that ill-ability is another incredibly important measure of openness. in fact, after publishing this i came across this liblicense project that hopes to deal with exactly that issue. my ultimate hope would be, that if a journal is fully open access there would be little to no restrictions on re-usability, including interlibrary loans. an issue that we will have to contend with, that this article might hint at, is that many of these complications come from the fact that academic publishing is currently controlled by publishing companies. they have every interest in restricting the access and usability of the content because it is their business model.
we are hoping and seeing that initiatives like library publishing are altering this system a bit, and it would make sense then that publishing would be more in the interest of information access than restriction. i’m hopeful that the system will evolve so that concerns like yours simply won’t exist anymore. thanks for reading, and for the comment!

this work is licensed under a cc attribution . license. issn -

umanistica digitale - issn: - - n. , s. van herck – visualizing gender balance in conferences doi: https://doi.org/ . /issn. - /

visualizing gender balance in conferences

sytze van herck university of luxembourg sytze.vanherck@uni.lu

abstract. data visualization is a powerful tool for digital scholarship yet not without its pitfalls. based on the dissertation “visualizing gender balance” comparing ten computer science conferences, several visualization techniques and tools undergo a critical review. the dataset underlying the visualizations contains data researchers encounter daily: bibliographic information. analyzing larger sets of authors writing and publishing for conferences in computer science changes our perception of the gender (im)balance in this academic research area.
but only a careful curation and visualization can truly reveal what goes on behind the scenes. still the more complicated, detailed and nuanced the visualization, the harder it becomes for an untrained eye to interpret the patterns.

abstract (italian version, in translation). data visualization is a powerful tool for computing studies, but it is not free of pitfalls. based on the presentation “visualizing gender balance”, which compares ten different computer science conferences, several visualization techniques and tools are critically reviewed here. the dataset underlying the visualizations consists of data that researchers deal with daily: bibliographic information. analyzing large groups of authors who write and publish in computer science changes our perception of the gender (im)balance in this area of academic research. however, only careful curation can truly reveal what happens behind the scenes. indeed, the more complicated, detailed and nuanced the visualization, the harder it becomes for an untrained eye to interpret the patterns.

introduction

as virginia valian discusses in her article beyond gender schemas, people generally fail to recognize gender equity problems. when people do recognize gender disparities, they often rely on four possible explanations. first, the pipeline problem refers to the decline in the percentage of women from undergraduate to graduate to professional status. second, the child-care responsibility indicates that when child care is seen as women’s work they are more likely to become part-time workers. the third problem of different values based on gender originates in survey data suggesting that men are more willing than women to forgo a balanced personal life. finally, the lack of acculturation refers to a presumed lack of understanding by women on how to be successful.
furthermore, research confronting structural sexism and racism is often marginalized and trivialized. in computational history the importance of gender dynamics is devalued, according to historians of technology such as janet abbate and marie hicks. similarly, racial inclusion in stem (science, technology, engineering and math) fields has only recently received attention, in both the book and the film hidden figures. the concept of intersectionality introduced by kimberlé crenshaw directs attention to the interaction of multiple power structures such as race, sexuality, class and ability. the intersectionality framework provides both a methodological approach and lived experience of academics in the fields of digital humanities and computer science. in the field of digital humanities, nickoal eichmann-kalwara, jeana jorgensen and scott b. weingart studied geographic, disciplinary and demographic diversity between and at the alliance of digital humanities organizations (adhos) conference. their analysis reveals “a growing awareness of diversity-related issues, with moderate improvements in regional diversity”, but a stagnation in gender diversity. although women occupy prominent positions in the community’s core, they occupy less space in the much larger periphery of authors. yet the peer review process shows visible bias against authors with non-english names. furthermore, biases around subject matter reflect gender disparities in specific disciplines, a phenomenon explained by historian lynn hunt. the feminization of a field is usually paired with a decline in status and resources.
“there is a clear correlation between relative pay and the proportion of women in a field: those academic fields that have attracted a relatively high proportion of women pay less on average than those that have not attracted women in the same numbers.” so even though women are just as likely to get accepted as men if they submit a presentation on the same topic, topics gendered towards women are less likely to get accepted. gender studies for instance has an acceptance rate of percent, whereas a male-skewed topic such as text analysis accepts percent of submissions. barbara bordalejo’s minority report further demonstrates an anglophone bias in digital humanities, and includes personal attacks openly denigrating her work and feminist research by extension. in the field of computer science, research participation and collaboration of female authors increased by less than . % per year between and , and male researchers represent % of actively publishing members in computer science conferences. the european she figures report mentions that “[women] made up % of those pursuing phds in computing in ”, therefore remaining underrepresented within the subfield of computing. furthermore, “no progress has been made [between and ] to promote women to grade a positions, (…) women remain relatively more present at lower level of the academic career path”. in order to expand the analysis of gender inequality within the computer sciences and compare the results to the geographic, disciplinary and demographic diversity in the digital humanities, several data visualizations are included. as the graphic display of abstract information, data visualization serves two purposes, namely sense-making (data analysis) and communication. as illustrated in the report of malu a.c.
gatto on making research useful: current challenges and good practices in data visualisation, “academics have often struggled to share their data with other actors and to disseminate their research findings to broader audiences.” however, data visualization can truly advance research, not only as a communication tool towards a broader audience, but especially as a tool that allows pre-attentive processing of vast amounts of information. in short, “data visualization reduces knowledge gaps.” since conference data sets are often too large to process without the help of visualizations, i would like to introduce several techniques. data visualization could provide one possible answer to michael jensen’s questions regarding digital scholarship, asking: “how can we most appropriately support the creation and presentation of intellectually interesting material, maximize its communicative and pedagogical effectiveness, ensure its stability and continual engagement with the growing information universe, and enhance the reputations and careers of its creators and sustainers?” in order to engage with the audience or reader, the design of the project should not be overlooked, even though “necessity often dictates that we adopt and adapt tools and technologies that were originally developed for other needs and audiences.” i will illustrate the development of and uses for visualizations in digital humanities research based on my own visualizations of the gender balance in ten computer science conferences from until .

which data?

dblp digital library

both datasets used in this article indirectly trace back to the “data bases and logic programming” (dblp) digital library created by michael ley for his phd research.
the browsable dblp collection started as formatted hypertext markup language (html) based on tables of contents, where each author’s name linked to a list of their publications and to a list of co-authors and their personal pages. in “the evolving coverage of computer science sub-fields in the dblp digital library”, florian reitz and oliver hoffmann discovered thematic biases in the coverage of computer science, with a narrow focus on databases, information retrieval, programming languages, and digital libraries and data mining. by the dblp collection covered % of computer science conferences mentioned in the list created by reitz and hoffmann.

women in computer science research

the dataset used in this research was created by swati agarwal et al. in the context of an article on women in computer science research: what is the bibliography data telling us? and can be accessed online via mendeley data. the data was retrieved on september , from the dblp bibliography database and includes the last years ( – ) of publication records from computer science conferences. for each article the dataset contains the year, conference abbreviation, publisher and unique doi, as well as the domains defined by the authors for each conference. the author and editor tables in the dataset provide the position of the author in the paper or the position of the editor in the conference proceedings, a unique name, their gender and the probability of a name being male, female or undetermined. finally, the dataset includes information about the affiliation of each author and for each paper such as the name, type (industry or academic institution) and country of each affiliation, as well as the latitude and longitude.

citation network

for the citation network i needed a second dataset created by swati agarwal et al. with a focus on seven acm sigweb series of conferences which is also available via mendeley.
the data was again derived from the snapshot of the dblp collection on september , covering years of not seven, but eight sigweb conferences. the most important addition to the dataset is the cited_by table where an identifier for paper a links to the identifier of the article b that cited paper a. although the data structure was more complicated, the information about the papers, authors and their affiliation was still included. in order to test the citation network visualizations, i only selected the first papers and their related authors.

why bibliographic data?

originally bibliographies improved the process of browsing through collections of books, articles, journals and proceedings either per genre, per country or per language and mostly for academic researchers and librarians. when these collections became accessible online, they provided insights into academia through visualizations. bibliographic databases store and provide rich information on both co-authorship and the citation networks of academics. several tools use this data to uncover research area evolutions and communities that show current trends in scientific research, as well as academic social networks.

what’s (not) behind the data?

human or algorithmic selection

a dataset is inherently curated and therefore leaves out other information. whether data is selected by a human or an algorithm, the selection or parameters could be biased. for example, as i mentioned earlier, the dblp digital library focusses only on certain areas of computer science research. furthermore, the creators of the first database on women in computer science research manually assigned each conference to a certain sub-field such as software engineering (se), data engineering (de), and theory (th), as well as computer science (cs) in general.
in “mind the gap: gender and computer science conferences” antonio fiscarelli and i applied topic modelling to prevent such a subjective judgement on research areas, but we still had to decide on a name for each category. for the visualizations my co-supervisor and i have made another subjective selection of ten different conferences ranging from interdisciplinary (e.g. computer-human interaction or chi) to disciplinary (e.g. user-interface software and technology or uist). besides human interference with the data selection, algorithms have enhanced the datasets with geographical information. the first dataset for women in computer science research combined several application programming interfaces (apis) such as openstreetmap, alchemy language, google geocoding and bing geocoding “to determine the type of affiliation (industry or an academic institution)” and add the coordinates of certain institutions. furthermore, the genderize api determined the gender of the authors based on their first name.

assigning gender

the genderize api uses “big datasets of information, from user profiles across major social networks” to determine the gender of a first name and “includes a certainty factor as well” (‘genderize.io | determine the gender of a first name’). in women in computer science research the authors decided to include the gender of an author only when their first name was known and the confidence score was over %. overall, , % of authors did not have a gender in the dataset, whereas , % of authors were identified as male and only , % as female. the binary approach of such an algorithm ignores the psychological and sociological use of the term gender, which originated in the united states and signifies “the state of being male or female as expressed by social or cultural distinctions and differences, rather than biological ones; the collective attributes of traits associated with a particular sex; or determined as a result of one’s sex.
also: a (male or female) group characterized in this way” (oxford university press, ‘gender’). not only does an algorithm assign gender without taking into account an individual’s agency to determine their gender on a spectrum rather than in binary form, it also ignores change over time. rather than ignoring gender in research altogether, eichmann-kalwara et al. believe that “showing whether reviewers are less likely to accept papers from authors who appear to be women can reveal entrenched biases, whether or not the author actually identifies as a woman”.

what data visualization shows

data processing

in order to test the data visualizations, several queries limited and structured the data further to create smaller subsets per year and per conference for network visualizations, which were combined to visualize authorship demographics and the evolution of the gender balance across conferences and over the years. the data processing fell into three main steps:
• querying. in order to perform the queries easily, i connected to the databases in python and stored the results in variables. the queries differed for each single visualization, since they all required different information.
• formatting. the visualizations built with javascript libraries such as google charts, google maps api, protovis, or d .js only accepted data in javascript object notation (json). tableau on the other hand accepted excel files.
• importing. the json files for the first visualizations were imported using jquery’s asynchronous javascript and xml (ajax) method. tableau used a drag and drop interface.

figure : the data processing process.
arrows represent scripts, while post-its represent files.

co-authorship network visualization

the first network visualization was based on an arc visualization in protovis (http://mbostock.github.io/protovis/ex/arc.html) and showed every single author represented as a dot on a horizontal line. the size of the dot or node represented the number of co-authors. the color represented their gender and the lines connected authors working on the same paper. grey represents an unknown gender, pink represents female and blue stands for male authors. because the dataset contains authors in total for the selection of ten conferences, the co-authorship network focuses on a specific year for a single conference. the protovis figure shows the situation at the relatively small acm user interface software and technology (uist) conference in with a particularly low representation of female authors. furthermore, the single-authored papers were all written by men, whereas both unknown and female authors all co-authored with male authors.

in the second network visualization authors were again represented individually as a dot, but this time arranged in the form of a circle and grouped by affiliation according to a hierarchical edge-bundling example from d .js (http://bl.ocks.org/mbostock/). the lines show the relation between co-authors and when hovering over an author, the incoming and outgoing links to co-authors are highlighted. in the chord diagram figure, the gender of authors is not included in the visualization of collaboration at the computer-human interaction (chi) conference of . the chord diagram does illustrate that authors in this interdisciplinary research area will mostly collaborate within the same institution, or with one or more authors from a single external organization.
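The arc diagram described above needs two things from the query results: a list of nodes (one per author, with a gender and a co-author count that drives the dot size) and a list of links (one per pair of authors who share a paper). The following is a minimal sketch of that formatting step. It assumes hypothetical `(paper_id, author, gender)` rows and hypothetical field names (`nodes`, `links`, `coauthors`); it is not the script used for the figures in this article.

```python
import json
from collections import defaultdict
from itertools import combinations

# Hypothetical rows (paper_id, author, gender); not taken from the real dataset.
ROWS = [
    ("p1", "author a", "male"),
    ("p1", "author b", "female"),
    ("p1", "author c", "unknown"),
    ("p2", "author a", "male"),
    ("p2", "author d", "male"),
    ("p3", "author e", "male"),  # single-authored paper
]


def build_coauthor_graph(rows):
    """Return a dict with 'nodes' and 'links' describing a co-authorship network."""
    authors_by_paper = defaultdict(set)
    gender_of = {}
    for paper_id, author, gender in rows:
        authors_by_paper[paper_id].add(author)
        gender_of[author] = gender

    # Every author starts with an empty set of co-authors, so single-authored
    # papers still produce a node (with a co-author count of zero).
    coauthors = {author: set() for author in gender_of}
    links = set()
    for authors in authors_by_paper.values():
        # Each unordered pair of authors on the same paper becomes one link.
        for a, b in combinations(sorted(authors), 2):
            links.add((a, b))
            coauthors[a].add(b)
            coauthors[b].add(a)

    nodes = [
        {"name": name, "gender": gender_of[name], "coauthors": len(coauthors[name])}
        for name in sorted(gender_of)
    ]
    return {
        "nodes": nodes,
        "links": [{"source": a, "target": b} for a, b in sorted(links)],
    }


graph = build_coauthor_graph(ROWS)
graph_json = json.dumps(graph, indent=2)  # text that a JavaScript chart could load
```

Counting distinct co-authors rather than papers keeps the node size in line with the description above, where the size of a dot stands for the number of co-authors.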
Despite the limited selection of papers from the dataset, the chord diagram is very difficult to interpret due to the sheer number of authors. Unfortunately, network visualizations always run the risk of cluttered screens, which is especially true for larger networks. However, added interactivity and filtering of the data allow users to explore the results in a structured way. Another solution would be to cluster authors either by institution or by research area to reduce the number of nodes. Co-authorship networks are less clear in displaying the gender balance, but color-coding nodes according to gender does draw attention to the role of gender in collaboration. Furthermore, the choice of colors is heavily based on culture, with pink for women and blue for men. In modern Western culture the color choice immediately conveys gender information, but it adheres to existing stereotypes and fails to take into account other cultural color conventions.

Figure : Chord diagram of the CHI conference co-authorship network
Figure : Protovis visualisation of the UIST conference

Conference demographics

First I experimented with simple bar and line charts provided by the Google Charts API to demonstrate the evolution of gender balance in computer science conferences over time, and with a map visualization of all the affiliations included in the dataset using the Google Maps API. These visualizations did not, however, allow for interactive exploration, so eventually authorship demographics were visualized using the Tableau software platform. Besides the evolution of the gender balance per conference, other demographics include the geographical location of authors based on their affiliation, as well as gender in relation to co-authorship grouped by conference.
The first stacked bar chart shows the evolution of gender balance in ten computer science conferences over a period of sixteen years. The horizontal axis was grouped per conference and the stacked bars were color-coded according to gender, with blue representing male, orange representing female and grey representing unknown. The interactive Tableau software provides details on demand when hovering over the visualization, as well as filtering per conference, year and gender. Overall, research participation and collaboration of female authors increased by less than . % per year between and as previously mentioned. The percentage of male authors decreased in several interdisciplinary conferences such as CHI, and in the field of knowledge engineering, including UIST and Knowledge Discovery and Data Mining (KDD), as well as in software engineering with the International Conference on Software Engineering (ICSE). Because of the relatively large proportion of authors whose gender is unknown, the decrease in male authors does not necessarily indicate an increase in female authors. Overall only the CHI conference has a relatively large percentage of female authors, representing between % and % of all authors. The lowest percentages of female authors can be found at the Very Large Databases (VLDB) and UIST conferences.

Figure : % stacked bar chart of the gender balance in ten CS conferences from to

The map visualization uses the same color scheme as the stacked bar chart, but gradually changes from dark blue, indicating a lack of female authors, to bright orange, representing % female authorship on average. Each country is also labeled with the exact number of authors affiliated with institutions located in that country. Tableau uses the Mercator map projection, which means that although the shape of countries is respected, area is not well represented.
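The area distortion can be quantified with a short sketch. On a spherical Mercator projection, lengths at latitude φ are stretched by a factor of sec φ in both directions, so areas are inflated by sec² φ relative to the equator. The function name and the sample latitudes below are illustrative assumptions; in particular, 72° N is only taken as a rough latitude for central Greenland, and the mapping software may apply a variant of the projection.

```python
import math


def mercator_area_inflation(latitude_deg):
    """Local area exaggeration of spherical Mercator relative to the equator.

    The linear scale factor at latitude phi is sec(phi), so the area
    scale factor is sec(phi) squared.
    """
    phi = math.radians(latitude_deg)
    return 1.0 / math.cos(phi) ** 2


# Illustrative latitudes (assumptions, not values from the article).
for label, lat in [("equator", 0.0), ("60 degrees north", 60.0), ("72 degrees north", 72.0)]:
    print(f"{label}: area inflated about {mercator_area_inflation(lat):.1f} times")
```

At 60° the inflation factor is 4, and at 72° it is a little over 10, which is why high-latitude territories dominate the map visually regardless of how many authors they contribute.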
A common critique of the Mercator projection is that Greenland appears roughly the same size as the entire continent of Africa, even though Africa is in reality about fourteen times larger. Furthermore, the map is Europe-centered and thus presents a Eurocentric view of the world. For example, the size of Russia immediately draws attention, but only authors are affiliated with Russian institutions compared to authors from the United States. Despite the relatively low number of authors, . % of Russian-based researchers submitting papers to CS conferences are identified as female, compared to only , % of U.S.-based researchers. The United States ( , %) and China ( %) account for , % of all authors, of which on average only % are female. Besides disciplinary differences in the gender balance, the disparities are far bigger in the author affiliations from the United States and China than in Russia. The different course in the history of computing of the former Soviet Union had the opposite effect on gender balance in CS, creating a previously female-dominated field.

Figure : Map of average percentage of female authors at ten CS conferences between and .

In a tree map visualization showing the percentage of women per paper, grouped by conference, the same color-coding and filters were again adopted to allow additional data exploration. The ten larger blocks each represent a single conference, with size referencing the number of authors. Within each conference block, a single block represents a paper: a small block refers to a single-authored paper, whereas larger rectangles reference co-authored papers. The filters can limit the results to a smaller range of years or exclude some of the conferences to get a clearer view of the data. Furthermore, hovering over a block provides the exact number of authors for a single paper, as well as the percentage of female authors for that paper.

Figure : Treemap of the gender balance for each paper in ten CS conferences from to
Although the first network visualization showed that women at the UIST conference of generally co-authored papers, the same does not apply to other conferences, given the high concentration of female authors in the bottom-right corner of single-authored papers. Overall, the multi-authored papers on the left and top of each conference rectangle have few if any female authors. Collaboration thus occurs mostly between men, except at the CHI conference (top right).

What data visualization doesn’t show

Accessibility and compatibility

Free access to the software or existing code needed to visualize the data is rarely guaranteed, since this software is often commercial. Furthermore, if support for software or existing code ends, the visualization will likely disappear entirely. Even at the time of creation, some of the online visualizations were not supported by every browser, or could appear different and even distorted depending on the screen size and browser. Regardless of challenges related to the visualization software, the main value of data visualization lies both in facilitating pre-attentive processing and in communicating results.

User analysis

While the expert might find certain types of visualizations very useful, some graphical representations such as the tree map do not communicate anything to a larger audience because they require previous knowledge of the structure and parameters of tree maps. Therefore, I evaluated the communicative value of the visualizations with twelve digital humanities students at KU Leuven. The test users performed nine tasks based on the visualizations in a think-aloud study and afterwards rated the visualizations on a System Usability Scale. Furthermore, an open-ended questionnaire allowed participants to express their opinion, concluding the evaluation.
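The text does not state how the System Usability Scale ratings were scored. Assuming the standard ten-item questionnaire with responses from 1 to 5, the conventional scoring can be sketched as follows: odd-numbered (positively worded) items contribute the response minus 1, even-numbered (negatively worded) items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score between 0 and 100. The function name and the sample responses are hypothetical.

```python
def sus_score(responses):
    """Conventional System Usability Scale score (0 to 100).

    responses: ten integers from 1 (strongly disagree) to 5 (strongly
    agree), in questionnaire order.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly ten responses")
    total = 0
    for item, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each response must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


# Hypothetical participant who answers neutrally on every item.
neutral_score = sus_score([3] * 10)
```

Scores computed this way are comparable across visualizations only if every participant rated each visualization with the same ten items.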
Based on the user test I found that the tree map required explanation, but most participants correctly identified the number of co-authors and especially the topic of the conference as an influence on the percentage of female authors in conference papers. The two co-authorship network visualizations frustrated the users, although nearly all of them recognized more collaboration between authors from different affiliations in the CHI conference of compared to the UIST conference in . Overall, the “interactive components and filters” greatly improved how comprehensible a visualization was.

Conclusion

Rather than merely analyzing issues regarding the gender imbalance in computer science, this paper studies how to identify and visualize such issues through co-authorship networks and conference demographics. However, a researcher first needs to critically examine and understand the data at the base of the study. Without understanding the data, understanding the visualization becomes difficult if not impossible. In the particular case of the DBLP bibliographic database, topics are biased towards databases, information retrieval, etc. Furthermore, algorithmic bias creeps in through automatically assigning gender to authors based on their first name. Finally, human selection of ten specific conferences further narrows down the dataset.

Based on two co-authorship networks and three conference demographic visualizations, we can better understand the pitfalls of data visualization and at the same time study intersectionality in computer science conferences. For instance, due to the unknown gender of some researchers, a decrease in male authors does not necessarily indicate an increase in female authors in the % stacked bar chart. However, taking a closer look at collaboration through co-authorship networks can only be done for a single conference and specific year, since both the arc and hierarchical edge-bundling visualizations otherwise become too cluttered to read.
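The point about unknown gender can be made concrete with a small worked example. The counts below are invented for illustration and are not taken from the dataset; the function name is likewise hypothetical. In a stacked bar chart normalized to 100%, each category is shown as its share of all authors, so a growing `unknown` share can absorb the whole drop in the male share.

```python
def gender_shares(counts):
    """Return each category's percentage of the total author count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no authors to compute shares from")
    return {label: 100.0 * n / total for label, n in counts.items()}


# Invented author counts for two years of one conference.
earlier = gender_shares({"male": 80, "female": 10, "unknown": 10})
later = gender_shares({"male": 70, "female": 10, "unknown": 20})

male_change = later["male"] - earlier["male"]          # -10 percentage points
female_change = later["female"] - earlier["female"]    # 0 percentage points
unknown_change = later["unknown"] - earlier["unknown"] # +10 percentage points
```

Here the male share falls by ten percentage points while the female share does not move at all, which is the reading error the chart invites when the grey segment is ignored.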
The map visualization then illustrates the difference in gender balance between Russian and American institutions, yet distorts the size of countries in favor of a Western-centric vision of the world. Furthermore, the tree map visualization might combine a lot of information, such as the number of authors for each paper, the number of papers at each conference and the gender balance in co-authored papers. However, a tree map visualization is not intuitive or easy to interpret for anyone unfamiliar with both the data and the visualization method.

The iterative process of creating and evaluating visualizations is best explained through Norman’s action cycle. In order to form questions and find answers, the action cycle falls into two gulfs. First, a user or researcher needs to set an intention and create an action plan during the gulf of execution. If the action, or in this case the visualization, shows an interesting pattern, then a gulf of evaluation follows. The perception of the visualization might lead to an interpretation, which then needs to be evaluated again. Interactivity thus allows further exploration of the data by other researchers or the audience, while storytelling structures the relations between different visualizations and guides the audience through the research in a few clicks. The core value of visualizations for the digital humanities therefore lies in accelerating data processing and raising possibilities for further research.

References

Abbate, Janet. Recoding Gender. Cambridge: The MIT Press.
Agarwal, Swati, Nitish Mittal, Rohan Katyal, Ashish Sureka, and Denzil Correa. “Women in Computer Science Research: What Is the Bibliography Data Telling Us?” SIGCAS Computers and Society ( ): – . https://doi.org/ . / . .
Agarwal, Swati, Nitish Mittal, and Ashish Sureka. “A Glance at Seven ACM SIGWEB Series of Conferences.” SIGWEB Newsletter, no. Summer: - . https://doi.org/ . / . .
Agarwal, Swati, Ashish Sureka, and Nitish Mittal. “DBLP Publications Records and ACM Metadata for SIGWEB Conferences.” Mendeley Data . https://doi.org/ . /dn d fbkb . .
Agarwal, Swati, Ashish Sureka, Nitish Mittal, Rohan Katyal, and Denzil Correa. “DBLP Records and Entries for Key Computer Science Conferences.” Mendeley Data . https://doi.org/ . / p w t mr. .
Boykis, Vicki. “Being a Woman in Programming in the Soviet Union.” GitHub (blog), February . http://veekaybee.github.io/ / / /being-a-woman-in-programming-in-the-soviet-union/.
Bordalejo, Barbara. “Minority Report. The Myth of Equality in Digital Humanities.” In Bodies of Information: Intersectional Feminism and the Digital Humanities, edited by E. Losh and J. Wernimont. Minneapolis: University of Minnesota Press. https://muse.jhu.edu/book/ .
Crenshaw, Kimberle. “Mapping the Margins: Intersectionality, Identity Politics, and Violence against Women of Color.” Stanford Law Review ( ): – . https://doi.org/ . / .
Eichmann-Kalwara, Nickoal, Jeana Jorgensen, and Scott B. Weingart. “Representation at Digital Humanities Conferences ( - ).” Preprint, submitted / / . https://doi.org/ . /m .figshare. .v .
Eichmann-Kalwara, Jorgensen, Weingart. “Representation at Digital Humanities Conferences ( - ).” In Bodies of Information: Intersectional Feminism and the Digital Humanities, edited by E. Losh and J. Wernimont. Minneapolis: University of Minnesota Press. https://muse.jhu.edu/book/ .
European Commission, Directorate-General for Research and Innovation. “She Figures . Gender in Research and Innovation.” Website, September . https://publications.europa.eu/en/publication-detail/-/publication/f dfed- a - e -af - aa ed a .
Few, Stephen. “Data Visualization for Human Perception.” In The Encyclopedia of Human-Computer Interaction, edited by M. Soegaard and R. Friis Dam, nd edition. The Interaction Design Foundation. https://www.interaction-design.org/literature/book/the-encyclopedia-of-human-computer-interaction- nd-ed/data-visualization-for-human-perception.
Gatto, Malu. Making Research Useful: Current Challenges and Good Practices in Data Visualisation. Oxford: Reuters Institute for the Study of Journalism.
‘Genderize.io | Determine the Gender of a First Name’. Website. https://genderize.io/.
Hicks, Marie. Programmed Inequality: How Britain Discarded Women Technologists and Lost Its Edge in Computing. Cambridge: MIT Press.
Hunt, Lynn. “Has the Battle Been Won? The Feminization of History.” Perspectives on History. https://www.historians.org/publications-and-directories/perspectives-on-history/may- /has-the-battle-been-won-the-feminization-of-history.
Jensen, Michael. “Intermediation and Its Malcontents: Validating Professionalism in the Age of Raw Dissemination.” In Companion to Digital Humanities, edited by S. Schreibman, R. Siemens, and J. Unsworth. Oxford: Blackwell Publishing Professional. http://www.digitalhumanities.org/companion/.
Kirschenbaum, Matthew G. ““So the Colors Cover the Wires”: Interface, Aesthetics, and Usability.” In Companion to Digital Humanities, edited by S. Schreibman, R. Siemens, and J. Unsworth. Oxford: Blackwell Publishing Professional. http://www.digitalhumanities.org/companion/.
Lee Shetterly, Margot. Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race. New York: William Morrow.
Ley, Michael. “The DBLP Computer Science Bibliography: Evolution, Research Issues, Perspectives.” In Proceedings of the th International Symposium on String Processing and Information Retrieval, – . Berlin: Springer-Verlag. http://dl.acm.org/citation.cfm?id= . .
Losh, Elizabeth, and Jacqueline Wernimont. Bodies of Information: Intersectional Feminism and the Digital Humanities. Minneapolis: University of Minnesota Press. https://muse.jhu.edu/book/ .
Norman, Donald A. The Design of Everyday Things, Revised and Expanded Edition. Cambridge, MA: MIT Press. https://mitpress.mit.edu/books/design-everyday-things-revised-and-expanded-edition.
Oxford English Dictionary, s.v. “gender”.
Reitz, Florian, and Oliver Hoffmann. “An Analysis of the Evolving Coverage of Computer Science Sub-Fields in the DBLP Digital Library.” In Research and Advanced Technology for Digital Libraries, edited by M. Lalmas, J. Jose, A. Rauber, F. Sebastiani, and I. Frommholz, – . Berlin: Springer.
Valian, Virginia. ‘Beyond Gender Schemas: Improving the Advancement of Women in Academia’, Hypatia ( ): - .
Van Herck, Sytze. “Visualising Gender Balance. Ten Computer Science Conferences and the Digital Humanities Conference Compared.” KU Leuven, Faculteit Wetenschappen, Master of Digital Humanities.
Van Herck, Sytze, and Antonio Maria Fiscarelli. “Mind the Gap Gender and Computer Science Conferences.” In This Changes Everything – ICT and Climate Change: What Can We Do?, edited by D. Kreps, C. Ess, L. Leenen, and K. Kimppa, – . Berlin: Springer International Publishing.
Wu, Meng Qi Yelena, Robert Faris, and Kwan-Liu Ma. “Visual Exploration of Academic Career Paths.” In IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM ), – . https://doi.org/ . / . .

Last URL access: September

The Prescription Opioid Epidemic: Social Media Responses to the Residents' Perspective Article

UCSF, UC San Francisco Previously Published Works
Permalink: https://escholarship.org/uc/item/ gk c
Journal: Annals of Emergency Medicine, ( )
ISSN: -
Authors: Choo, Esther K; Mazer-Amirshahi, Maryann; Juurlink, David; et al.
Publication date:
DOI: . /j.annemergmed. . .
Peer reviewed

Pain Management and Sedation/Special Contribution

The Prescription Opioid Epidemic: Social Media Responses to the Residents’ Perspective Article

Esther K. Choo, MD, MPH*; Maryann Mazer-Amirshahi, PharmD, MD, MPH; David Juurlink, MD, PhD; Scott Kobner, BS; Kevin Scott, MD; Michelle Lin, MD
*Corresponding author. E-mail: choo@aya.yale.edu, Twitter: @choo_ek

In June , Annals of Emergency Medicine collaborated with the Academic Life in Emergency Medicine (ALiEM) blog-based Web site to host an online discussion session featuring the Annals Residents’ Perspective article “The Opioid Prescription Epidemic and the Role of Emergency Medicine” by Poon and Greenwood-Ericksen. This dialogue included a live videocast with the authors and other experts, a detailed discussion on the ALiEM Web site’s comment section, and real-time conversations on Twitter. Engagement was tracked through various Web analytic tools, and themes were identified by content curation.
The dialogue resulted in , unique page views from cities in countries on the ALiEM Web site, , Twitter impressions, and views of the video interview with the authors. Four major themes about prescription opioids were identified: physician knowledge, inconsistent medical education, balance between overprescribing and effective pain management, and approaches to solutions. Free social media technologies provide a unique opportunity to engage with a diverse community of emergency medicine and non–emergency medicine clinicians, nurses, learners, and even patients. Such technologies may allow more rapid hypothesis generation for future research and more accelerated knowledge translation. [Ann Emerg Med. ;-: - .]

Copyright © by the American College of Emergency Physicians.
http://dx.doi.org/ . /j.annemergmed. . .

Introduction

Annals of Emergency Medicine and Academic Life in Emergency Medicine (ALiEM) have conducted a collaborative journal club as a shared initiative to promote awareness of key emergency medicine literature, to facilitate knowledge translation, and to provide an educational resource to teach critical appraisal to emergency physicians while drawing engagement from a broad audience through social media platforms. Because of its increasing popularity, this collaboration now extends to the Annals Residents’ Perspective series. In this installment, we feature the article by Poon and Greenwood-Ericksen, “The Opioid Prescription Epidemic and the Role of Emergency Medicine.”

Opioid misuse and addiction are increasing and serious problems in the United States, with associated fatalities increasing -fold between and and approximately daily deaths from prescription opioids. The emergency department (ED) has experienced significant increases in visits related to nonmedical use of prescription opioids and, in parallel, significant increases in the number, quantity, and potency of opioid prescriptions dispensed from the ED.
The original Residents’ Perspective article discussed the challenges of practicing in the context of the opioid epidemic and the daily struggle to alleviate pain while trying to avoid initiating or perpetuating opioid misuse. The article presented several means of supporting emergency physicians in these goals, including adoption of ED prescribing guidelines; use of prescription drug monitoring programs, or statewide electronic records of prescribed substances for each individual; and a standardized resident education curriculum on opioid prescribing.

With the Annals article as a launching point, ALiEM further explored this topic with free social media platforms, including a Twitter conversation, Web site discussion, and live videocast with the authors and key experts. This article aims to organize and summarize the responses from the global social media community and to propose potential solutions and recommendations. Objective Web analytics will also be reported for the multiple digital platforms used.

Materials and methods

The Annals editors selected the Residents’ Perspectives article, and ALiEM chose facilitators for their expertise in medical education and active presence on social media. One is an experienced blogger on ALiEM (M.L.), and all have active Twitter accounts with greater than followers (S.K., @skobner), greater than followers (K.S., @k_scottmd), greater than , followers (E.C., @choo_ek), and greater than , followers (M.L., @m_lin) at the time of the discussion.

Figure . Featured ALiEM blog questions.
Figure . Questions posed to videocast panelists.

The discussion was hosted by ALiEM (http://aliem.com), which is a public, WordPress-based, educational blog Web site created in .
ALiEM has greater than million page views annually, greater than , Facebook fans, greater than Google+ followers, and greater than e-mail subscribers. The Web site hosts a broad range of topics relevant to academic and community emergency physicians, including clinical pearls, reviews of journal articles, faculty development discussions, and medical education topics.

The facilitators’ goal during the discussion was to encourage sharing and reflection on preselected discussion questions (Figure ) in regard to current perspectives about opioid prescribing. The open-ended questions were selected by the authorship team to maximize discussion involving the core teaching points from the highlighted article. From August to August , , the prescription opioid discussion was hosted on the ALiEM Web site, with comments moderated both on the blog Web site and Twitter, similar to the format of previous ALiEM-Annals Residents’ Perspectives discussions. Promotion for the discussion included notices on the ALiEM Web site, ALiEM Facebook page, ALiEM Google+ page, and Annals of Emergency Medicine facilitators’ individual Twitter accounts. Ongoing promotion during and after the discussion occurred with tweets including the #ALiEMRP hashtag from the Annals’ and facilitators’ Twitter accounts.

On August , , a live panel discussion was hosted on Google Hangout on Air, featuring both authors of the highlighted article, Sabrina Poon, MD, and Margaret Greenwood-Ericksen, MD, MPH, emergency medicine residents at the Brigham and Women’s Hospital/Massachusetts General Hospital Harvard Affiliated Emergency Medicine Residency program.
Esther Choo, MD, MPH (Brown University), a public health researcher with expertise in substance use disorders, who has published on medical education and use of social media in academia, acted as the session host and moderator; other panelists included published experts in the field of opioid prescription misuse, David Juurlink, MD, PhD (University of Toronto) and Maryann Mazer-Amirshahi, PharmD, MD, MPH (MedStar Washington Hospital Center), and medical student Scott Kobner, BS (New York University; ALiEM-EMRA Social Media and Digital Scholarship Fellow), who was asked at the end of the session to provide the perspective of a junior trainee. Figure lists the questions posed to the panelists. Michelle Lin, MD (University of California, San Francisco), and Kevin Scott, MD (University of Pennsylvania), participated off camera by live-tweeting the event. The videocast was automatically uploaded in real time for public viewing to ALiEM’s YouTube account (ALiEM Interactive Videos) at http://youtu.be/ b a ckvwb .

Table. Aggregate analytic data from discussions for the first days of the event.
Social media analytic aggregator | Metric | Metric definition | Count
Google Analytics | Page views | Number of times the Web page containing the post was viewed | ,
Google Analytics | Users | Number of times individuals from different IP addresses viewed the site (previously termed “unique visitors” by Google) | ,
Google Analytics | Number of cities | Number of unique jurisdictions by city as registered by Google Analytics |
Google Analytics | Number of countries | Number of unique jurisdictions by country as registered by Google Analytics |
Google Analytics | Average time on page | Average amount of time spent by a viewer on the page | min s
ALiEM blog | Number of tweets from page | Number of unique -character notifications sent directly from the blog post by Twitter to raise awareness of the post |
ALiEM blog | Number of Facebook likes | Number of times viewers “liked” the post through Facebook |
ALiEM blog | Number of Google+ shares | Number of times viewers shared the post through Google+ |
ALiEM blog | Number of site comments | Comments made directly on the Web site in the blog comments section |
ALiEM blog | Average word count per blog comment (excluding citations) | |
Symplur analytics for Twitter hashtag #ALiEMRP | Number of tweets | Number of tweets containing the hashtag #ALiEMRP |
Symplur analytics for Twitter hashtag #ALiEMRP | Number of Twitter participants | Number of unique Twitter participants using the hashtag #ALiEMRP |
Symplur analytics for Twitter hashtag #ALiEMRP | Twitter impressions | How many impressions or potential views of #ALiEMRP tweets appear in users’ Twitter streams, as calculated by taking the number of tweets per participant and multiplying it by the number of followers that participant has | ,
YouTube Analytics | Length of videocast | Total duration of recorded Google Hangout videoconference session | min s
YouTube Analytics | Number of views | Number of times the YouTube video was viewed |
YouTube Analytics | Average duration of viewing | Average length of time the YouTube video was played in a single viewing | min s

Written transcripts from Twitter, the blog Web site, and the videocast discussions were analyzed for broad themes and subthemes by one author (E.C.). The remaining authors reviewed these themes and subthemes to corroborate inclusion of key discussion points, organization, and comprehensiveness.

Web analytics were recorded for the -day discussion period (August to August , ). A -day discussion period was set according to previous online journal club events hosted by ALiEM and Annals because intermittent conversation often continues for some time after the featured discussion period. Google Analytics, the ALiEM social media widget, YouTube Analytics, and Symplur were used to track metrics for viewership and engagement on the Web site, various social media platforms, YouTube, and Twitter, respectively. These metrics are freely available digital resources that allow users to track and filter Web traffic data, such as by dates and geography.
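The Twitter impressions metric defined in the Table (tweets per participant multiplied by that participant's followers) can be written out as a minimal Python sketch. This follows only the wording of the definition; how the analytics service computes the figure internally is not described in the source, and the function name, field names and sample values below are hypothetical.

```python
def twitter_impressions(participants):
    """Sum of (tweets sent) x (follower count) over all participants.

    participants: iterable of dicts with "tweets" and "followers" keys
    (an assumed input shape, not an export format of any real tool).
    """
    return sum(p["tweets"] * p["followers"] for p in participants)


# Invented participants, not data from the discussion.
sample = [
    {"handle": "participant_a", "tweets": 3, "followers": 100},
    {"handle": "participant_b", "tweets": 1, "followers": 2500},
]
total_impressions = twitter_impressions(sample)
```

Because the measure counts potential rather than actual views, a single tweet from an account with many followers can outweigh many tweets from smaller accounts.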
The number of comments and words per comment in the Web site discussion were also calculated, excluding the initial comments by the facilitators and all references.

Results

The -day analytics data for the multiplatform discussion about the opioid prescription epidemic demonstrated a geographically diverse readership on the ALiEM Web site page ( , unique readers from countries) and dissemination on Twitter ( tweets). Additional analytics are summarized in the Table. The global geographic distribution of participants is outlined in Figure .

Figure . Geographic distribution of readers who viewed the ALiEM-Annals blog post on the opioid prescription epidemic (http://www.aliem.com/opioid-prescription-epidemic-annals-em-resident-perspectives-article/) during the first days.

Summary of the online discussion

A discussion transpired on the blog, Twitter, and Google Hangout video that not only covered the blog questions but also generated additional related debate. The major domains discussed were as follows:

Theme : Physicians lack knowledge about prescription opioids. Our discussants thought that emergency physicians had knowledge deficits that made addressing the opioid problem challenging (Figure ). These include a poor understanding of the scope of the opioid problem. In the videocast, Juurlink observed that “[s]mart and well meaning people suggest that this is not an epidemic at all, or that the epidemic of pain is what we should be focused on.” Participants in both the panel and the blog made the observation that physicians lack knowledge about the harms of prescription opioid medications and the prevalence of opioid abuse.

“Pain is such a common complaint, and as physicians we are conditioned to make people feel better and believe that opioids are our best means of doing so...
despite our perception of the role of these drugs, they just don’t work that well.”—juurlink

“i was taught to be very liberal with prescription of opiates [sic]... i think we believed we were doing what was best for the patient. as a result, lots of patients with ankle sprains walked out the door with an rx for vicodins. this is crazy. there’s little, if any, evidence that opiates are necessarily better than other medications for pain.”—anand swaminathan, md (new york university)

figure . best blog quote.

theme : there is a lack of consistent medical education on use of opioids. participants noted an absence of formal teaching in regard to pain control and use of opioid medications from the preclinical years of medical school through residency training, especially compared to education on other classes of medications. in the videocast, kobner described a “huge lack of medical school education on this topic,” suggesting that opioid-related content should be introduced through didactic instruction and again in context at the bedside. poon, greenwood-ericksen, and other residents participating in the discussion online described the educational influence of the heterogeneity in attending physician practice and lack of specific guidelines. resident trainees are often left uncertain and confused about the appropriate place for opioids in their clinical practice and how to use and judiciously prescribe them without contributing to the problem of opioid use disorders:

“i saw a young woman who was demanding vicodin for her chronic back pain... the attending told me...‘you can kind of do what you want.’... i remember thinking, ‘gosh, how can there be so much grey area?’... i really didn’t know how to have a conversation with her...about her risks...
i sat down to write the prescription and i realized i didn’t even know what opioid was in vicodin, i wasn’t quite sure what the normal dose was, and i didn’t know how many pills were considered normal.”—greenwood-ericksen, md, mph

“as a new intern, i feel like i have a general framework of what meds to order for an asthma exacerbation, or stemi, or a handful of other protocol-driven scenarios. analgesia is much murkier.”—matthew klein, md, mph (northwestern university)

theme : there needs to be a balance between curtailing excessive upward “drift” in physician opioid-prescribing practices and effective pain management. twitter participants described observing not only an increased use of opioids over time but also the prescribing of more potent opioids (figure ). mazer-amirshahi described similar findings from her recent research on opioid-prescribing trends in adult and pediatric eds. her studies noted an increase in both the number of opioid prescriptions and the potency of the opioids used in the ed. in the adult ed, there was a % overall increase in prescription opioid use and % increased use of hydromorphone between and . mazer-amirshahi hypothesized that this reflected a lower prescribing threshold on the part of clinicians. panelists thought that the ed population also seems much more used to receiving prescribed opioids from their primary providers and more opioid tolerant than they were years ago. our panel noted how inured physicians have become to the potency of these medications.

figure . twitter commentary.
“we have lost the respect for these drugs that we had twenty years ago,” said juurlink, pointing out that physicians are now used to having a substantial number of their patients receive long-term opioid therapy. in the videocast, mazer-amirshahi also described a substantial increase in opioid use among low-acuity patients (triage levels and ), as well as a shift in indications for use, including use when not clearly indicated. for example, another of her studies demonstrated more use for migraine headaches, even though opioids are not recommended as first-line therapy for this condition. this kind of use, incongruent with clinical practice guidelines, raised questions on twitter about the quality of emergency medicine care in regard to pain control and its potential conflict with patient satisfaction measures. as stated by anton helman, md (university of toronto), this “suggests poor [patient] centered care” (figure ).

however, discussion participants also expressed an ongoing focus on treating pain effectively and concern that pain control was more inexact than uniformly overdone. for example, ari friedman (university of pennsylvania) emphasized the importance of continuing to meet patients’ needs even as physicians strive to reduce inappropriate opioid use (figure ). these concerns were echoed by marnie rackmill, who participated in the blog discussion as a patient. rackmill shared a story of being mistaken as opioid seeking when presenting with pain. ultimately, she was prescribed opioids. “did i need an opioid? i don’t know. toradol might have been just as useful, but nobody gave it any consideration.” rackmill’s providers, to her knowledge, did not look up her medication history or access the state’s available prescription drug monitoring program, which would have shown a limited history of prescribed opioids.

figure . twitter commentary.
“the initial labeling of patients—either by the triage nurse or whomever—without looking into their history, diagnosis, or needs is quite concerning. in this case it created very poor pain management... overall, it seems to me that before deciding on a treatment plan, anyone (including a nurse) treating or diagnosing a patient should look into the patient’s history, especially if it is on file, be willing to listen to what the patient says, and open to the idea that something may actually be wrong.”

theme : there are no simple answers; solutions will need to be multifactorial. panelists lauded the emerging solutions for the opioid crisis discussed in the residents’ perspective (including prescription drug monitoring programs, state- or citywide opioid use guidelines, and structured, standardized education around opioids), as well as policy changes, such as reclassifying hydrocodone as a schedule ii drug. however, participants also mentioned barriers to some of these measures. for example, routine use of prescription drug monitoring programs is impeded by poor awareness of their existence, lack of availability, technical difficulties (eg, forgetting log-on passwords), or limited time to access them in the busy ed. participants generally felt the need to advocate greater accessibility of prescription drug monitoring programs. voluntary opioid-prescribing guidelines are in limited use, but more information is needed about the extent of their adoption and their effect on prescribing practices. although traditional curricula on pain management and use of opioids may take years for formal adoption and dissemination, jeanmarie perrone, md (university of pennsylvania), suggested that online education platforms and open-access resources might provide a more immediate educational solution.

figure . twitter commentary.

figure . twitter commentary.
“we could tackle the knowledge gap by producing a few podcasts highlighting several case based challenging patient scenarios and hosting them on a foamed web site or existing pain curriculum site.” one open-access web site she referenced was painfree ed (http://www.painfree-ed.com/), created by sergey motov, md, which serves as a repository of slides, pdf handouts, local protocols, and other resources in the area of pain management education in the ed.

given the limitations of system-based resources, panelists discussed the ongoing need for physicians to alter their practices on an individual level. mazer-amirshahi emphasized that all misuse starts somewhere, so practitioners should consider the potential influence of even a single unnecessary prescription: “is that one prescription i’m giving someone going to contribute to the problem of abuse and addiction?” she also emphasized the importance of caution in using opioids during the ed stay, warning that acute administration of opioids may also lead to subsequent prescription opioid use or reinforce existing patterns of misuse. poon advised considering the following when treating pain in the ed: “do i think this patient would benefit from opioid medication more than it would harm them?... what should i prescribe them? and not only what, but how much, for how long, what kind, and also what else might help them?” greenwood-ericksen emphasized the need for individual practitioners to be willing to have the “tough conversations” with patients about the dangers of opioids, “and if we do think they are at risk for abuse and misuse, address this specifically, rather than shying away from it.” addressing and tempering expectations about our ability to reduce pain may also be a part of the conversation:

“what we really need are drugs that work better...and are free of toxicity.
that’s likely to be a long way off...[but] until then, we have to lower our expectations, and we have to have patients lower their expectations as well.”—juurlink

these thoughts were echoed on twitter by taylor zhou, md, an anesthesia resident from canada (figure ). on the aliem blog, swaminathan also discussed the importance of referral to outpatient treatment services for patients with opioid use disorders: “everyone in whom you are concerned about chronic opiate dependence or abuse deserves a conversation from you, their physician, and referral for treatment. we wouldn’t discharge a chronic hypertensive or a diabetic without follow up.”

a full transcript of the blog web site discussion is archived at http://www.aliem.com/opioid-prescription-epidemic-annals-em-resident-perspectives-article/, and all tweets with the #aliemrp are archived on symplur.com at aliem.link/ nznqzk.

limitations

our results were generated by posing a series of questions about the prescription opioid epidemic to stakeholders through social media platforms. in this curated review of the multiplatform discussions, our findings are at risk for selection bias in that individuals who engage in social media discussions may differ from the broader stakeholder populations. it is thus unclear whether all stakeholders are represented in this discussion because it was voluntary and required use of social media platforms for communication. also, the views of a vocal minority may have been overrepresented because of the challenges of drawing out more reserved participants to build consensus in a public, online discussion. our discussion did not distinguish between acute and chronic pain or address the different challenges of practicing in a variety of ed settings; thus, comments specific to a clinical scenario or practice setting may not be generalizable. a single author conducted the initial analysis of the themes.
this may have led to the omission or misinterpretation of comments. having the themes undergo member checking by the other facilitators reduced such threats to internal validity. finally, we did not design the discussion to reach saturation, and there may be relevant themes that did not emerge with this format. in regard to web analytics, twitter analytic data depend on participants adding the hashtag #aliemrp to their tweet. those who omitted the hashtag were not included in the symplur analytics, and thus the number of twitter participants may be underrepresented in our results. despite this likely underestimation, there were still tweets by individuals with a broad reach, as defined by a twitter impression of , .

discussion

this article presents the results of an aliem-annals collaboration using multimodal social media discussions to explore a timely, relevant question inspired by a residents’ perspective article: how do we combat the opioid epidemic within the ed? in examining the emergent themes, we identified concerns about practicing physicians’ knowledge in regard to opioids, inconsistent prescribing practices hampering training of medical students and residents, an ongoing tension between physicians’ obligation to reduce suffering by addressing pain and the desire not to contribute to the prescription opioid epidemic, the need to manage pain with a wider range of modalities, and the need for further development of system-level supports for safe prescribing, such as prescription drug monitoring programs.
identified solutions to this problem remain in their infancy, with participants expressing frustration with the inaccessibility or limited accessibility of prescription drug monitoring program databases, guidelines, or formal and bedside instruction about opioid prescription practices (ie, how to determine when to prescribe opioids and how much to prescribe). until such resources are in place, physicians should continue to monitor their own prescribing practices, operating from the principle that opioids are not necessarily the most effective pain reliever and are often not the first line for controlling some types of pain. it is critical to engage patients in conversations about potential harms and alternative means of treating pain in the long term.

many of the individual experiences described in social media (including frustrations with the lack of effectiveness of our existing pain control efforts, the heterogeneity of physician practice patterns, and the escalation of ed opioid prescribing over time) were explained and corroborated by our expert panel. clinicians and learners benefit from this multimodal presentation by having their shared experiences contextualized within the larger problem to address both immediate and long-term potential solutions. furthermore, our blog included the comments of a patient who believed she was mislabeled as a “seeker,” a vital reminder that patients are at the center of care. as guidelines and other policies are implemented, it will continue to be important to capture the diverse circumstances, perspectives, experiences, and goals of our patient population and incorporate this information into our approaches to controlling the prescription opioid problem. social media, in this case, provided a unique opportunity to include a patient perspective in a scholarly dialogue.
social media: a new frontier in scholarly discussions

in this third installment of the social media curation series of annals residents’ perspective articles, web analytic data demonstrate the feasibility of a social media–based, multimodal approach to coconstructive learning and teaching in the growing online community. the blog post received , page views from , unique users in cities ( countries). these large readership numbers, however, resulted in only a small subset providing active comments on the blog, as demonstrated by only comments in this opioid epidemic prescription discussion. in contrast, individuals using twitter seemed more likely to engage and post a retweet or reply in the discussion, with #aliemrp-tagged tweets found. this is likely multifactorial and may include the fact that tweets are brief (ie, characters), the platform encourages a more conversational environment with the ability to tag and reply to particular individuals, and twitter is a more regularly checked tool than most other web sites. in the age of digital transparency and online learning, we aim to promote more active engagement through blog comments and tweets as online communities and discussions become more mainstream in medical education.

analytic data on twitter activity using the hashtag #aliemrp demonstrated a broad reach ( , impressions) among a small but engaged community who contributed to tweets. “impressions” is defined as the number of #aliemrp tweets per participant multiplied by the number of followers that participant has. these data are within the range of other popular twitter-based journal clubs in the fields of nephrology (#nephjc) and urology (#urojc). symplur analytics report , and , impressions, respectively, and and tweets, respectively, in their january journal clubs.
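the definition of “impressions” given here reduces to a short calculation. the python sketch below is illustrative only: the function name, the handles, and the follower counts are hypothetical, and it assumes that the per-participant products are summed across all participants, which the definition implies but does not state outright. symplur’s actual computation may differ.

```python
from collections import Counter


def twitter_impressions(tweet_authors, follower_counts):
    """Estimate impressions as the sum, over participants, of
    (tweets by that participant) x (that participant's followers).

    tweet_authors: one handle per hashtagged tweet (hypothetical input format).
    follower_counts: mapping of handle -> follower count.
    """
    tweets_per_participant = Counter(tweet_authors)
    # Participants with no known follower count contribute zero (an assumption).
    return sum(
        n_tweets * follower_counts.get(handle, 0)
        for handle, n_tweets in tweets_per_participant.items()
    )


# Made-up data for illustration; not taken from the #aliemrp analytics.
example_authors = ["@reader_a", "@reader_b", "@reader_a"]
example_followers = {"@reader_a": 100, "@reader_b": 50}
example_total = twitter_impressions(example_authors, example_followers)
print(example_total)
```

because each participant’s tweet count is multiplied by follower count, a few accounts with many followers can dominate the total, which is consistent with describing impressions as potential rather than actual views.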
it is still a challenge to determine the significance of the #aliemrp data, especially because these other journal clubs are primarily discussions held on twitter (tweet chats), whereas our discussion was based more on the blog web site, with twitter supplementing the conversation.

the live google hangout on air videocast published to youtube illustrated a proof-of-concept model whereby it is not only possible to virtually gather a geographically diverse group of experts in a medical grand rounds–like panel with minimal inconvenience and without the travel costs but also to host this on a free platform (google hangout on air), with live tweets reporting the conversation and the ability for live viewers to tweet in comments. although the video garnered only viewers in the first days of publication, it remains easily found on the aliem youtube channel through standard internet search engines as an archived educational resource.

overall, this curation series demonstrates that it is possible to engage a global and digitally interconnected community to learn and rapidly share knowledge on timely issues relevant to emergency medicine practice. such a discussion contrasts with the typical silo- and classroom-based approach to medical education.

conclusion

the medical community continues to struggle with the best way to combat the opioid epidemic. our multimedia discussion underscored several key challenges for our specialty, including ongoing knowledge deficits, little formal education and training on pain control and opioid use for medical students and residents, and the upward “drift” in use of opioids in the ed. although there are no easy solutions to the problem, our discussants and other online discussion participants reflected thoughtfully on efforts needed at both the system and individual levels.
the authors acknowledge the following: the aliem blog discussion participants, including teresa chan, md, esther choo, md, mph, scott cooper, rn, margaret greenwood-ericksen, md, mph, gareth debiegun, md, matt klein, md, mph, scott kobner, bs, jeanmarie perrone, md, anand swaminathan, md, and marnie rackmill; the #aliemrp twitter participants (and the number of their followers), including @adagiudicetomps ( ), @alkhalifaa ( ), @alsugairmd ( ), @annalsofem ( , ), @aribfriedman ( ), @aylc ( ), @barbholly ( ), @cathimon ( ), @cheeler ( ), @choo_ek ( , ), @chopfellow ( ), @chsu ( ), @corbetron ( , ), @davidjuurlink ( , ), @ditchdocrn ( ), @drlfarrell ( , ), @edtakedown ( ), @elbertchu ( ), @emcases ( , ), @emergnsea ( ), @emlitofnote ( , ), @emurgentologist ( ), @ermentor ( ), @gpdots ( ), @henrikbugge ( ), @jmperronemd ( ), @joebabaian ( , ), @jschuurmd ( ), @k_scottmd ( ), @ketaminh ( , ), @lsaldanamd ( , ), @lwestafer ( , ), @m_lin ( , ), @margezilla ( ), @mdaware ( , ), @medicjosh ( ), @medquestioning ( ), @megsahokie ( ), @michelleklaiman ( ), @mkchan_rcpsc ( ), @mkleinmd ( ), @ml_barnett ( ), @njoshi ( , ), @northwesternem ( ), @nxtstop ( , ), @paularobeson ( ), @paulcoelho ( ), @perfectednurse ( ), @peterrchai ( ), @poisonreview ( , ), @poped ( ), @poppaspearls ( ), @restoreofaz ( ), @scottweinermd ( ), @sjpoon ( ), @skobner ( ), @smotovmd ( , ), @stampforge ( ), @tama dora ( ), @tchanmd ( , ), @thesgem ( , ), @theskeeterhawk ( ), @thestrengthdoc ( ), @toxtalk ( , ), @travels little ( ), @ucmorningreport ( ), @ukingsbrook ( ), @ultrasoundrel ( ), @umasstox ( ), @upennem ( , ), @utswemsa ( ), @vjsapps ( ), @whole_patients ( , ), @withspin ( ), and @wvuemergencymed ( ); and the google hangout videocast participants, including esther choo, margaret greenwood-ericksen, david juurlink, scott kobner, maryann mazer-amirshahi, and sabrina poon.

supervising editor: michael l.
callaham, md

author affiliations: from the department of emergency medicine, emergency digital health innovation program, warren alpert medical school of brown university, providence, ri (choo); the department of emergency medicine, medstar washington hospital center, washington, dc (mazer-amirshahi); the department of medicine, sunnybrook health sciences centre, toronto, ontario, canada (juurlink); the new york university school of medicine, new york, ny (kobner); the department of emergency medicine, perelman school of medicine, university of pennsylvania, philadelphia, pa (scott); and the department of emergency medicine, university of california, san francisco, and the mededlife research collaborative, san francisco, ca (lin).

funding and support: by annals policy, all authors are required to disclose any and all commercial, financial, and other relationships in any way related to the subject of this article as per icmje conflict of interest guidelines (see www.icmje.org). the authors have stated that no such relationships exist.

references
radecki rp, rezaie sr, lin m. annals of emergency medicine journal club. global emergency medicine journal club: social media responses to the november annals of emergency medicine journal club. ann emerg med. ; : - .
chan tm, rosenberg h, lin m. global emergency medicine journal club: social media responses to the january online emergency medicine journal club on subarachnoid hemorrhage. ann emerg med. ; : - .
thoma b, rolston d, lin m. global emergency medicine journal club: social media responses to the march annals of emergency medicine journal club on targeted temperature management. ann emerg med. ; : - .
joshi nk, yarris lm, doty ci, et al. social media responses to the annals of emergency medicine residents’ perspective article on multiple mini-interviews. ann emerg med. ; : - .
poon sj, greenwood-ericksen mb. the opioid prescription epidemic and the role of emergency medicine. ann emerg med. ; : - .
centers for disease control and prevention. vital signs: overdoses of prescription opioid pain relievers—united states, - . mmwr. . available at: http://www.cdc.gov/mmwr/preview/mmwrhtml/mm a .htm. accessed november , .
centers for disease control and prevention. emergency department visits involving nonmedical use of selected prescription drugs — united states, - . mmwr. .
mazer-amirshahi m, mullins pm, rasooly i, et al. rising opioid prescribing in adult us emergency department visits: - . acad emerg med. ; : - .
mazer-amirshahi m, mullins pm, rasooly ir, et al. trends in prescription opioid use in pediatric emergency department patients. pediatr emerg care. ; : - .
dhalla ia, mamdani mm, sivilotti mla, et al. prescribing of opioid analgesics and related mortality before and after the introduction of long-acting oxycodone. cmaj. ; : - .
mazer-amirshahi m, dewey k, mullins pm, et al. trends in opioid analgesic use for headaches in us emergency departments. am j emerg med. ; : - .
edlow ja, panagos pd, godwin sa, et al. clinical policy: critical issues in the evaluation and management of adult patients presenting to the emergency department with acute headache. ann emerg med. ; : - .
silberstein sd. practice parameter: evidence-based guidelines for migraine headache (an evidence-based review): report of the quality standards subcommittee of the american academy of neurology. neurology. ; : - .
griggs ca, weiner sg, feldman ja. prescription drug monitoring programs: examining limitations and future approaches. west j emerg med. ; : - .
symplur analytics. symplur analytics. healthcare hashtag project. available at: http://www.symplur.com/healthcare-hashtags/aliemrp/. accessed november , .
duggan m, smith a. social media update : pew research. . available at: http://pewinternet.org/reports/ /social-media-update.aspx. accessed november , .
is there a social life in open data? the case of open data practices in educational technology research

publications, article

juliana e. raffaghelli * and stefania manca

faculty of education and psychology, universitat oberta de catalunya, barcelona, spain; jraffaghelli@uoc.edu
institute of educational technology, national research council of italy, genova, italy; stefania.manca@itd.cnr.it
* correspondence: jraffaghelli@uoc.edu; tel.: + -

received: december ; accepted: january ; published: january

abstract: in the landscape of open science, open data (od) plays a crucial role as data are one of the most basic components of research, despite their diverse formats across scientific disciplines.
opening up data is a recent concern for policy makers and researchers, as the basis for good open science practices. the common factor underlying these new practices—the relevance of promoting open data circulation and reuse—is mostly a social form of knowledge sharing and construction. however, while data sharing is being strongly promoted by policy making and is becoming a frequent practice in some disciplinary fields, open data sharing is much less developed in social sciences and in educational research. in this study, practices of od publication and sharing in the field of educational technology are explored. the aim is to investigate open data sharing in a selection of open data repositories, as well as in the academic social network site researchgate. the open datasets selected across five od platforms were analysed in terms of (a) the metrics offered by the platforms and the affordances for social activity; (b) the type of od published; (c) the fair (findability, accessibility, interoperability, and reusability) data principles compliance; and (d) the extent of presence and related social activity on researchgate. the results show a very low social activity in the platforms and very few correspondences in researchgate that highlight a limited social life surrounding open datasets. future research perspectives as well as limitations of the study are interpreted in the discussion. keywords: open data; open science; open data repositories; social media; researchgate; educational technology research . introduction open science is the movement that advocates for more public and accessible science [ , ], and has progressively encompassed new researchers’ practices and identities that go beyond the idea of digital science towards open and social activities [ – ]. in the educational technology sector, open science is also regarded as a shorthand for the transformative intersection of digital content, networked distribution, and open practices [ ]. 
authors have conceptualized theoretical frameworks and epistemological approaches to analyse the relationship between scholarly practice and technology, and have explored new forms of scholarship fostered by social media and social network sites [ – ].

publications , , ; doi: . /publications www.mdpi.com/journal/publications

in this scenario, open data (od), defined as “data that anyone can access, use and share”, plays a crucial role, as data are one of the most basic components of research, despite format differences in scientific disciplines [ ]. opening up data is a recent concern for policy makers and researchers, as the basis for good open science practices [ ]. in fact, data-driven research encompasses a massive production of digitalized data, which in turn allows for appropriate communication and sharing, thus implying new discoveries and more balanced efforts from the community of researchers. a common factor underlying these new practices concerns the relevance of promoting open data circulation and reuse, which is often considered solely a social form of knowledge sharing and construction. however, whilst data sharing is strongly encouraged by policy making in disciplines such as physics and genomics, this concept is far less developed in the social sciences [ ].

in this article, the authors explore practices of od publication and sharing in the field of educational technology, a branch of educational research under the social sciences. educational technology is a relatively young academic discipline, with academic references starting to appear in the s [ , ]. the idea for this study arose from the extensive research experience of both authors in this field.
the overarching aim of the study is to demonstrate how open data practices are emerging, and to what extent these align with the principles of open science. the focus on the social life of open data goes beyond the specific open data portals and their social features (that is, who has been reading, citing, recommending, sharing, or downloading for reuse), and considers academic social networks, such as researchgate, for sharing open datasets with a wider public. the choice to investigate the social activity concerned with open datasets in researchgate derives from its increasing popularity among scholarly communities, as reported in recent studies [ ], and from its role as a competitor to institutional repositories in some scholars’ habits [ ]. furthermore, as academic social network sites primarily focus on social interest and activity around science, and on fostering networked practice in scholarly professional learning [ ], another assumption is that identifying the social activity concerned with open data might indicate dynamics of professional learning and scholars’ engagement with open science on these platforms. finally, this study also discusses the characteristics and the quality of open datasets, as well as their “social life”, in terms of the visible activity traced back by the platform where the od is stored and the convergent storage and activity on researchgate.

background

from open science to open data: an emergent agenda

while the european union initially adopted the term science . [ ] in an attempt to follow the participatory nature of the web . , the concept of open science has gained ground and covered a number of goals, namely, public accessibility and transparency of scientific communication, public availability and reusability of scientific data, transparency in experimental methodology, observation and collection of data, and use of web-based tools/infrastructure to facilitate collaboration [ ].
in this scenario, open data acquire a crucial importance in open science as an object of socialization and exchange, thus shaping the ideals of openness in science [ ]. open data allow all researchers to replicate scientific experiments, thus aligning with the goals of transparency, and can be reused in further processing to generate new results as secondary data. as such, not only can open data be adopted by other researchers, but they can also be mined by industry in faster cycles of research and development (r&d) [ ]. moreover, open data align with the goals of the open access movement, also embedded into the concept of open science. although this movement has a prior trajectory, it also addresses the idea that all public research should be made freely available and accessible. the concept of open data expands on this idea by making the units of research (data) open [ ].

european data portal - https://www.europeandataportal.eu/elearning/en/module /#/id/co- .

although the centrality of open data became a reality for the european commission through the mallorca declaration of [ ], today, several international organizations are increasingly dealing with data sharing through a number of funded projects. the open aire portal, which ensures the visibility of open data produced within the european research framework horizon [ ]; the wellcome trust [ ]; the netherlands organization for scientific research (nwo) [ ]; the policies of the european organization for nuclear research (cern) [ ]; and the bill and melinda gates foundation [ ] are among the most relevant initiatives. nevertheless, the practices around open data are unevenly distributed across scientific fields, and in most areas, the concepts, tools, and techniques to share data are little known [ ]. for example, mckiernan et al.
[ ] have shown several benefits of data sharing in applied sciences, life sciences, maths, physical science, and social sciences, areas where the advantages are often reflected in the visibility of research in terms of citation rates. however, they have also pointed out the need for deconstructing several “myths” in adopting open science practices and data sharing. in this regard, the effort made by the research community to generate common principles for the quality of open data should be considered. the fair (findability, accessibility, interoperability, and reusability) data principles [ ] are an expression of this endeavour, aiming at introducing clear parameters for open data associated not only with humans, but also with machine tasks throughout algorithms and workflows. in the field of educational science, the issue is relatively new, and thus requires specific attention in order to overcome the initial state of aversion, as well as the fragmentation of incipient practices [ ]. moreover, an important dimension of new scholarly practices has been the bottom-up movement of networked scholarship, which went in the direction of sharing scientific knowledge throughout novel channels [ – ]. although academic social network sites have played an important role in this sense, their usage has been questioned because of their non-institutional nature, or because they are challenging scholars’ habits in how they deposit their publications [ ]. however, as discussed in the next section, the social features of academic practice reveal pathways of engagement and professional learning that could cover significant scholars’ knowledge and skills gap. to our knowledge, there is no research that has investigated the relevance of social activity on open datasets as primary objects of scientific knowledge. . . 
social media and networked scholarship: how scholars share and build professionalism in the digital era recent research has suggested that scholars are increasingly using social media to enhance scholarly communication by strengthening mutual relationships, facilitating peer collaboration, publishing and sharing research products, and discussing research topics in open and public formats [ – ]. studies have stressed how scholars today are familiar with blogs, wikis, general and academic social network sites, and multimedia sharing—at all stages of the research lifecycle—from identifying research opportunities to disseminating final findings [ – ]. among the reasons for using social media in academic practice, keeping up to date, maintaining and strengthening networks, and increasing visibility with positive implications for career progression have been identified as major factors for engaging in social media [ ]. other factors that influence scholars’ use of social media are concerned with making connections and developing networks, openness and sharing, self-promotion, and peer support [ ]. along with the investigation of different social media for scholarly communication, authors have also studied digital scholarship as an emergent scholarly system that intersects mainstream academia with its proper techno-cultural system [ , ]. the four dimensions of scholarship—discovery, integration, application, and teaching—as redefined by boyer in his seminal work [ ], have been increasingly affected by the values and the ideology of digital and open science, with the broad aim of promoting scholarly networking and the public sharing of scientific knowledge among a wider public. the framework programme for research in europe, https://ec.europa.eu/programmes/horizon /en/. 
in this light, digital scholarship is increasingly being conceived as a more inclusive approach to the construction and sharing of knowledge and as a means of scholarly public engagement [ ]. authors have also pointed to the fragmentation of studies relating to digital scholarship, which reflect diversified disciplinary perspectives [ ]. at least two theoretical approaches have recently emerged, with the aim of conceptualizing the relationship between scholarly practice and technology, as well as new forms of scholarship fostered by social media. the first, networked participatory scholarship, has been conceived as the emergent practice of scholarly use of participatory technologies and social network sites, with the purpose of sharing, improving, and validating scholarship [ ]. in this approach, platforms such as facebook, twitter, academia.edu, and mendeley are found to provide support for acquiring, testing, validating, and sharing scholarly knowledge in university subcultures of the “invisible college”. in the second approach, social scholarship, increasing social media use in scholarly practice is examined as a means through which to encompass new ways for academia to accomplish scholarship, through values such as the promotion of users and decentralized accessible knowledge [ ]. however, other authors have pointed out that scholars’ experiences in social media are fragmented and not well understood, thus demanding more focused research on the day-to-day realities of social media for scholarship [ ]. others have highlighted how the three dimensions of scholarly practice (scholarship, openness, and digitality) seem to resemble an impossible triangle that creates tensions in practice between the traditional values of disciplinary scholarship (e.g., record of publication integrity) and open teaching/public engagement (e.g., communication of research results with the general public) [ ].
other challenges were reported concerning institutional policies that tend to discourage scholars from unconventional publishing practices [ , ]. the result is that network engagement today is progressively involving individuals rather than roles or institutions, and is creating an emergent scholarly system of its own. in this respect, scientists are progressively shaping an increasingly complex academic system, with its own values and demands of new responses to both internal and external stimuli [ ]. when analysing the social media that are influential for academic practice, most studies are focused on twitter [ , , ], or on academic social network sites such as researchgate and academia.edu [ ]. indeed, researchgate and academia.edu have become the most popular social networking services developed specifically to support academic and research practices [ ]. however, while most of the research on these platforms has been conducted in the library and information sciences as deployments for reputation building and alternative ranking systems [ – ], very few studies have investigated the use of researchgate and academia.edu in the light of the diverse theoretical frameworks, and have aimed at analysing the social digital scholarship practice [ ]. the reasons for the limited adoption of these platforms might be related to criticism raised in the scholarly community, which has questioned the reliability and impact of researchgate metrics, which makes it hard to compare with other popular standard scores [ – ]. in one of these studies, it was found that scientists are apparently willing to share copies of their publications on academic social networking sites more than in institutional repositories [ ]. in the top spanish universities that were investigated, the majority of the articles that were not available in the institutional repositories were made available in full text on researchgate. 
however, to our knowledge, no specific investigation on open dataset sharing in researchgate has been conducted. this study aims to fill this gap by analysing the extent of the presence and related social activity of a number of open datasets in the educational technology field. they were identified in open data repositories and sought in researchgate, so as to provide preliminary evidence of scholarly practice concerned with data sharing as open scholarly resources.

. method

. . rationale of the study and research questions

the aim of the study is to show how open data related practices are emerging, and to what extent these align with the principles of open science. in order to achieve this aim, the research design was based on an exploratory study that analysed open datasets in the field of educational technology, and explored the academics’ social practices relating to these objects. the operationalization of the construct open data refers here to open research data. according to the recent european union guidelines, “in a research context, examples of data include statistics, results of experiments, measurements, observations resulting from fieldwork, survey results, interview recordings and images. the focus is on research data that is available in digital form. users can normally access, mine, exploit, reproduce and disseminate openly accessible research data free of charge” [ ]. we will also adopt the key term “open datasets”, which will refer to the packages and presentation of open data. the term comes from the research area of computer science, but has been adopted more recently for all types of data, particularly because of the digital form in which most data are presented nowadays (https://en.wikipedia.org/wiki/data_set). we further defined the quality of a dataset adopting the fair data principles.
the fair data principles [ ] are a set of guiding principles intended to make data findable, accessible, interoperable, and reusable [ ]. these principles provide guidance for scientific data management and stewardship, and are relevant to all stakeholders in the digital ecosystem. the detection of these principles was made on the basis of the following conceptualizations:

( ) the principle of findability encompasses global and unique digital identifiers, rich metadata, and indexation as searchable resources;
( ) the principle of accessibility requires metadata that can be retrieved through a standardized communication protocol, which is open, free, and universally implementable;
( ) the principle of interoperability encompasses formal, accessible, shared, and broadly applicable language for knowledge representation, with metadata that uses vocabularies compliant with the fair principles and includes qualified references; and
( ) the principle of re-usability implies that there is a plurality of accurate and relevant attributes of metadata in place. it requires that licenses of usage are clear, associated with metadata provenance, and in accordance with disciplinary field standards.

it is worth mentioning that all of the fair principles support semantic interoperability, but the “i” stands more for the syntax, and hence the formal interoperability of programs adopted to process and present data. finally, the operationalization of the construct “academics’ social practices” in this study is based on the metrics made available by the digital platforms where the od are placed. these metrics generally consist of information on the number of downloads, citations, comments, and sharing. in light of these aims, the research questions addressed by this study are as follows:

. do researchers in the field of educational technology publish open datasets (ods)?
. to what extent are ods compliant with the fair data principles?
.
what is the social life relating to the ods in terms of the metrics provided by the od portals? as a subsidiary question, ( a.) to what extent do open data portals allow researchers to cultivate social practices around od?

in order to investigate od presence in researchgate, the following research aim guided the second part of the study:

. analysis of the presence of the selected od in researchgate and of the type of social activity exhibited by od according to researchgate metrics.

. . sampling

the systematic search of ods in the area of educational technology was deployed as follows (figure ):

. five od repositories were employed, namely: openaire (https://www.openaire.eu/), figshare (https://figshare.com/), zenodo (https://about.zenodo.org/), mendeley data (https://data.mendeley.com/), and learnsphere (http://learnsphere.org/). they were selected taking into consideration their geographical and political importance, as in the case of openaire; the number of objects archived and the number of years operating with od, as in the case of figshare, zenodo, and mendeley data (these are od repositories that have operated for more than five years, with more than one million objects and several million visits); and the relevance of the thematic sector, as in the case of learnsphere, which aggregates the data from seven other open data repositories on educational research data.

. a general search was conducted on the od repositories’ search engines, which included key terms such as “learning”, or “education” and “technolog*”. this search yielded objects for openaire, for figshare, for zenodo, for mendeley data, and for learnsphere. overall, an initial number of open datasets were found. data extraction was conducted on may .

. a progressive number was assigned to each of the objects.
hence, a sample of % of open datasets was randomly selected by generating a random sequence from to , and extracting a sample of objects according to the numbers randomly assigned. the random list was created with the tool “sequence generator” from random.org (https://www.random.org/). random extraction was adopted because the type of analysis required for each of the objects extracted (ods) could not be performed manually on all of the objects within the time assigned to the research project. this limitation was addressed through simple random sampling of the minimum number of objects that respected the % confidence level, at the higher confidence interval of %. while these are not optimal measures, they are acceptable for exploratory study purposes [ ].

. the files and metadata of the objects were analysed and some exclusion criteria were applied, as follows: the alignment of the object with the concept of dataset (a file or number of files containing raw data that can be analysed by other researchers as it is), and the pertinence with the topic of educational technology (e.g., technology-enhanced learning, online learning, and the adoption of digital tools in education). however, we still defined od as a broad concept, because of the initial diversification of the objects observed on the od repositories. in this regard, we selected all of the ods that were at least human readable with no limitations of access (paywalls, registration to see the full files, and requests to authors). after this step, objects were eliminated, because they could not be considered a dataset (these files consisted of pdf files with presentations or the full article); eight objects were eliminated for being “borderline” (studies on learning machine code applied to education), and objects were considered completely out of topic (most of them relating to machine learning studies). overall, objects were excluded.

. following this, for each of the remaining objects, the metadata were verified.
the characteristics were annotated in a database where the objects were classified according to the analytic dimensions generated by the authors (see the section “instruments and data collection”).

. moreover, the objects were also sought on the commercial academic social network site researchgate in order to analyse social activity in this platform.

figure . workflow: selection of open datasets in five open data (od) repositories.

appendix a shows main information on the open datasets.
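the random extraction described in the sampling steps can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' procedure: the study used the random.org sequence generator, whereas the sketch swaps in python's pseudo-random generator; the study does not state which sample-size formula was applied, so cochran's formula with a finite population correction is assumed here; and the population size, confidence level, and margin of error in the usage example are hypothetical placeholders, since the original figures are not recoverable from the text.

```python
import math
import random
from statistics import NormalDist


def minimum_sample_size(population, confidence, margin, proportion=0.5):
    """Minimum sample size for a finite population.

    Assumption: Cochran's formula with finite population correction.
    The source does not say which formula the authors used.
    """
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Correct for the finite number of retrieved objects.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def draw_sample(population, size, seed=None):
    """Draw `size` distinct progressive numbers from 1..population.

    Stand-in for the random.org sequence generator: simple random
    sampling without replacement, returned in ascending order.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population + 1), size))


if __name__ == "__main__":
    total_objects = 1000  # hypothetical number of retrieved objects
    n = minimum_sample_size(total_objects, confidence=0.95, margin=0.05)
    selected = draw_sample(total_objects, n, seed=42)
    print(n, selected[:10])
```

fixing the seed makes the extraction reproducible, which a one-off list from an external generator is not unless the list itself is archived alongside the data.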
instruments and data collection

the analysis was divided into two steps, as follows: ( ) exploration of open datasets in od repositories and ( ) complementary analysis on researchgate. in regards to the first step, the objects were analysed according to a number of categories, in order to explore and characterize each object. the categories were elaborated and discussed between the two authors, on the basis of the research questions. table shows the complete set of categories and codes. the categories are defined in the second column, and refer to a diverse conceptual basis. the research topics were established through an inductive process based on the analysis of the keywords and abstracts of the open datasets, and agreement between the authors was almost fully achieved (= ; %). the data type and number of downloads and views were extracted directly from the datasets. as for the fair principles, each open dataset was analysed on the basis of the existing principles, as reported in table . after the analysis, the descriptive statistics were calculated for the frequencies of cases under a specific category. for social activity, the metrics in the od repositories, consisting of the number of downloads, were calculated. in terms of step two, the open datasets were sought in researchgate using the same title adopted in the od repository. in this case, the metrics of social activity were retrieved for both the open dataset and, if existing, for the publication to which the open dataset could be associated. following this, the metrics of researchgate (citations, recommendations, reads, followers, and comments) were calculated. all of the data collected and analysed in this study have been published at the od repository zenodo as open data [ ].

table . codebook with the labels and values used in the codification of datasets.

category: research topics
definition: thematic focus on the research project from which the open dataset was yielded.
codes assigned: analysis and models in learning processes; innovative teaching with edt; intelligent tutoring system; massively open online courses (moocs); open science; prediction in learning processes; teachers and trainers professional development.

category: data type
definition: type of data expressed in terms of file extension.
codes assigned: xls/xlsx, csv, txt, pdf, sav, and others. when the data was not accessible due to its restricted access, the value “unknown” was used.

category: fair (findability, accessibility, interoperability, and reusability) data principles
findable:
f . (meta)data are assigned a globally unique and eternally persistent identifier.
f . data are described with rich metadata.
f . (meta)data are registered or indexed in a searchable resource.
f . metadata specify the data identifier.
accessible:
a . (meta)data are retrievable by their identifier using a standardized communications protocol.
a . . the protocol is open, free, and universally implementable.
a . . the protocol allows for an authentication and authorization procedure, where necessary.
a . metadata are accessible, even when the data are no longer available.
interoperable:
i . (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
i . (meta)data use vocabularies that follow fair principles.
i . (meta)data include qualified references to other (meta)data.
re-usable:
r . meta(data) have a plurality of accurate and relevant attributes.
r . . (meta)data are released with a clear and accessible data usage license.
r . . (meta)data are associated with their provenance.
r . . (meta)data meet domain-relevant community standards.

category: downloads and views
definition: platform metrics on social activity around a dataset.
codes assigned: as reported in the data portal/repository metrics.

. results

the following section presents the research results in response to the four research aims.

q . do researchers in the field of educational technology publish open data sets?
prior to conducting specific searches and selecting the datasets, all of the datasets within two relevant data repositories, mendeley data (international) and zenodo (european), were explored. table shows the numbers for social sciences and educational technology as research areas.

table . datasets concerned with social sciences and educational technology areas.
data portal datasets_ss datasets_edt zenodo mendeley data

initially, we observed that the datasets in social sciences across the data repositories remained stable, while this was not the case for the educational technology sector. while the number of objects supports the assumption that there are practices of open data in the field of educational technology, the difference between the two portals is puzzling. one possible explanation is that the search tools of zenodo encompass a more accurate classification of the datasets. this is also confirmed by the data shown in figure , which reports the distribution of the selected datasets that were included and excluded across the five portals. however, while some data repositories specifically curate the topic of educational technology (learnsphere), other generalist repositories (e.g., openaire) collect objects whose identification through the category of “educational” and “technolog*” is more imprecise. an exception to this was zenodo, which mostly collects projects related to the horizon program, and provides more elaborated instruments for the storage and retrieval of research data.

figure . datasets included and excluded in the five data repositories.

as for the research topics covered in the datasets, a relevant part (n = ) relates to the data mining and analytics of students’ logs tracked in online platforms, with three concerned with massively open online courses (moocs), four with intelligent tutoring systems, and four with prediction in learning processes.
another group of topics relates to the innovations in teaching and learning, analysis, and models in learning processes (n = ); the topic of teachers and trainers’ professional development is represented in three datasets. only one is related to a literature review that analyses the theoretical issues in educational technology, and two refer to the topic of open education science.

q . to which extent are ods compliant with the fair data principles?

as for the analysis of the datasets according to the four fair categories, it was found that the findability principle was fully accomplished in all datasets. all of the datasets are placed in data repositories that have digital and permanent identifiers, where data are indexed and searchable through database queries, and that require a minimum of metadata in order to complete the process of data uploading. conversely, interoperability and accessibility presented some problems. in a number of cases, the files required proprietary software to be opened, and were not interoperable. one of the weakest points was the lack of references to other qualified metadata systems. there was no evidence of cases in which the metadata followed a standardized protocol or other well-known ontologies. finally, relating to reusability, the % (n = ) of cases that could not be classified as reusable did not possess a clear usage license. the categorization of file typology containing raw data revealed that datasets contained a main file in .csv or .txt format, which is an interoperable format (data use a formal, accessible, shared, and broadly applicable language for knowledge representation). while other types of formats are also interoperable (pdf, sav, xlsx, and zip), they might not be machine readable and would require procedures of data extraction and transformation for successive analysis, in spite of being “human readable” and open under these last criteria.

q .
what is the social life relating to the ods in terms of the metrics provided by the od portals?

q a. to what extent do open data portals allow researchers to cultivate social practices around od?

https://en.wikipedia.org/wiki/machine-readable_data

while the three open data repositories (figshare, zenodo, and mendeley data) and the two open data portals (openaire and learnsphere) present diversified features, one common characteristic is that their main goals are content searchability and retrieval. in the case of the two data portals, one issue in common is the aggregation of several data repositories. moreover, the two portals endow researchers with contextual information on the provided affordances. in the case of openaire, where the main target is european researchers and research institutions, information on the policy context is provided.
on the contrary, learnsphere provides more educational and pedagogical information for consultation, as well as tools for the specific target of international educational researchers (most of the contributors are affiliated to the united states). both portals offer an engine to make general and advanced searches; however, while openaire concentrates all sorts of objects concerned with european union research, learnsphere is specific for data. for example, even when selecting the data option, many retrievable objects are not proper datasets. in both portals, there are tools for community collaboration and dialogue external to the objects’ search. once a dataset is retrieved, there is no possibility to see if the object is downloaded, consulted, or used by others. however, in some repositories, it is possible to find some social indicators. for instance, in the case of openaire, a relevant part of the datasets is actually located on the data repository zenodo, hence the metrics of this can be considered. in the case of learnsphere, some datasets relate to data repositories that show some social metrics (e.g., downloads—in the case of harvard dataverse), while others do not consider social metrics at all (e.g., datashop). as for the three data repositories, the targets were supposedly international and undifferentiated. all three allow for researchers to upload all types of objects (from final publications to reports, working documents, pre-registration documents, etc.), whilst mendeley can embed data repositories from other data repositories, and as such, can be considered as a “meta” repository. the payoff in this case, as reported in response to q , is less accuracy at the time of retrieval (both mendeley and openaire showed the lowest number of the datasets selected). however, it is worth noting there was evidence of strong imprecision in the type of objects one can retrieve, despite having specified the search for datasets. 
all of the repositories offer categories to restrict search, as well as an internal engine based on boolean operators. whilst mendeley data does not provide any metrics of social activity in terms of downloads, citations, recommendations, or comments, figshare and zenodo include minimal indicators. moreover, social activity is also shown in terms of an “altmetric attention score”. altmetric is a commercial tool embedded in other platforms; it calculates the amount of attention through an automated algorithm, as a weighted count of the number of shares on generic social networks (such as twitter or facebook) and on specific professional and academic social networks (such as mendeley or researchgate). in the case of zenodo, only views and downloads are computed. in the case of portals, because of their connection with repositories, the number of downloads can be retrieved, but the information concerned with the views is not always available. table shows the social activity in terms of downloads, views, and altmetrics for the datasets.
see for example: https://data.mendeley.com/datasets/ yj w hh/ . see for example: https://figshare.com/collections/ _ _million_kids_and_counting_mobile_science_laboratories_drive_student_interest_in_stem/ . for the former case in figshare, see the altmetrics: https://figshare.altmetric.com/details/ .
table . metrics of social activity.
case repository/portal downloads views altmetrics
openaire/figshare
figshare
figshare
zenodo n.a.
zenodo
zenodo n.a.
mendeley data/zenodo
mendeley data/zenodo n.a. n.a.
mendeley data/harvard dataverse n.a. n.a.
zenodo n.a.
zenodo n.a.
learnsphere/datashop n.a. n.a. n.a.
learnsphere/datashop n.a. n.a. n.a.
learnsphere/datashop n.a. n.a. n.a.
learnsphere/datashop n.a. n.a. n.a.
learnsphere/datashop n.a. n.a. n.a.
learnsphere/datastage n.a. n.a. n.a.
learnsphere n.a. n.a. n.a.
learnsphere n.a. n.a. n.a.
learnsphere/databrary n.a. n.a. n.a.
learnsphere/harvard dataverse n.a. n.a.
learnsphere/harvard dataverse n.a. n.a.
learnsphere/harvard dataverse , n.a. n.a.
nonetheless, in all three repositories, there is some evidence of partial coherence between the number of views (initial contact), downloads (potential interest in use), and shares on social media. however, this trend cannot be considered for the whole sample, as cases do not display this information. another significant trend is the irregular distribution of downloads. while a “champion” dataset has , downloads (topic: moocs and learnsphere), and three others have more than downloads, the remaining datasets have obtained limited attention. it must be acknowledged that out of datasets did not have any tool to observe users’ activity, as the platforms where the data were stored adopted older technologies than most data repositories currently in operation. this was the case for datashop and other datasets accessible via learnsphere, belonging to pioneering projects relating to od in education. q . analysis of the presence of the selected od in researchgate and of the type of social activity exhibited by od according to researchgate metrics. to investigate the social activity around open datasets in academic social network sites, the datasets were sought in researchgate. researchgate has specific affordances that enable researchers to become “more social”. researchers can view, download, or cite articles, as well as comment on and recommend datasets. however, the associated search engine does not allow for the search of specific datasets.
these can be found by browsing the type of publication on researchers’ pages; retrieval of datasets is tightly connected to a good knowledge of a researcher’s trajectory and work, which is indeed a social activity. as for researchgate, only two datasets were found. table shows the social metrics retrieved in researchgate and in the data portals. only the data concerned with the datasets that have associated metrics are reported.
table . researchgate metrics on social activity. columns: progr. number; rg correspondence; for the associated publication: resource consultation [rg_reads], resource sharing [recommendations], resource tracking [followers], resource-based social activity [comments], resource using [citations]; for the dataset: [rg_reads], [recommendations], [followers], [comments], [citations].
. discussion and conclusions
this study investigated the open data practices in the field of educational technology through the analysis of a number of data portals as well as researchgate, with the aim of studying the social life connected with open datasets. the results show that open data publishing is becoming a trend in the field of educational technology, which is demonstrated clearly when compared to the overall open data publications in social sciences. however, only some subfields of educational technology are represented in the sample if we consider the landscape of topics recently identified in the sector [ ]. the significant presence of datasets concerned with educational data mining and learning analytics may be explained by the increasing popularity of this research topic in recent years [ , , , ]. as for compliance with the fair principles, the results show that these are only partially implemented.
the great variance of features provided by the diverse portals and repositories allows for only partial compliance with the recommended principles, which might also be explained by the diversity of research objects made available in the portals. despite an overall low social activity concerned with the datasets, which is also strictly connected with the poor range of indicators provided in the data portals and repositories, some initial activity of views and downloads was reported. the downloads were mostly found for the datasets related to the topics that are considered popular in the research field, such as the moocs produced by mit. based on the only portal that provides information on altmetrics, in a few cases, researchers made use of social network sites to share their open datasets. as for researchgate, although the platform provides a better range of metrics, the limited number of datasets that were retrieved restricts the generalization of results. with reference to the low results obtained in the platform, the questionable reliability of academic social network sites metrics might have prevented the same research groups from making the same datasets also available in researchgate. however, the absence of open data on general social media as well as on researchgate is informative with regard to the level of progress of social activity connected to open data. a number of factors may be identified to explain the results of the study. the first relates to issues concerned with the culture of career advancement in the studied field of research. in the eu, where the majority of the observed practices occur, the visibility of open data sharing policies is rapidly increasing, especially with the introduction of specific related policies; however, the recognition of activities relating to open data is still in its infancy.
in fact, research institutions and national research assessment systems still tend to favor the publication of accomplished research products (i.e., articles), and to support their publication in prestigious and indexed journals that mostly have restricted access policies [ , ]. in the disciplinary field we studied, this tendency, hence, seems to conflict with both open science and with open data in particular [ ]. furthermore, open data engagement in educational technology is progressively involving small groups or individual researchers rather than roles or institutions, thus shaping an increasingly complex academic system with its own values [ , , ]. secondly, it appears there is a general lack of professional competency in the area of open data. some authors have highlighted the barriers preventing a full uptake of open science practices, which in turn go hand in hand with the need to acquire appropriate skills in order to navigate the digital abundance (of data) continuously produced in the digital and open world [ ]. some have compared the problem of the appropriation of open data to the phenomenon of the digital divide [ ]. indeed, there are complex skills that are required for the new approaches to data, such as appropriate metadata that explain the complexity of structures of data, the use of appropriate licenses for access, and of proper software connected to the machine readability of data to support interoperability. moreover, the way data is packaged and presented influences its re-use and sharing in expanded networks [ – ]. the potential embedded in open data in science cannot be directly transformed into effective practices towards open science, unless stakeholders’ skills and purposive professional development programs are guaranteed [ , ]. while this study has provided a preliminary analysis of the open data practices in the field of educational technology, a number of limitations also need to be highlighted. 
firstly, the sample used for this study was selected as a small percentage of the global results retrieved in the five portals/repositories, and cannot be considered as representative of the total number of datasets. although it was selected through a robust method (randomization), we cannot exclude that significant results were omitted. while this is a preliminary study on this topic, further research should enlarge the sample size in order to draw more robust conclusions. a second limitation concerns the use of certain keywords through boolean operators. despite the fact that they all use metadata for indexing and retrieving datasets, diverse repositories/portals mediate access to the datasets differently. another limitation concerns the method used to build the codebook, as its qualitative categories were derived inductively or applied according to the researchers’ judgment when pre-defined (the fair principles). however, the first limitation may be slightly counterbalanced with the lack of shared ontologies to classify subareas in the sector of educational technology. the interrater agreement served the purpose of controlling the second limitation. another important limitation relates to the type of data collected and the method connected to the construct of “academics’ social practices”. we extracted the data manually from the ods and from a single academic social network site (researchgate), which as a quantitative method is limited to anonymous users’ parameters. instead, the user experience could be further explored for insights on the activities, habitudes, and motivations, throughout qualitative and ethnographic approaches. future studies might profit from these limitations, and consider larger samples along with diverse methods of investigation. 
in fact, while this study was conducted in the sector of educational technology, further work on the scholarly appropriation of open data practices should investigate scholars’ habits and attitudes in different disciplinary areas as well as on diverse social media platforms. moreover, along with observational approaches, mixed methods and the triangulation of methodological approaches could provide an extensive overview of the changes currently affecting the landscape of open data sharing in social media as a practice that reinforces the principles of open science.
author contributions: conceptualization: j.e.r. and s.m.; investigation: j.e.r. and s.m.; methodology: j.e.r.; writing—original draft: j.e.r.; writing—review & editing: s.m.
funding: ministerio de ciencia e innovación: ryc- - .
conflicts of interest: the authors declare no conflict of interests.
appendix a (datasets listed in order of progressive number)
title | author | url
community health workers and mobile technology: a systematic review of the literature | braun, rebecca; catalani, caricia; wimbush, julian; and israelski, dennis | https://explore.openaire.eu/search/dataset?datasetid=r c ::f bb e f e a d
. million kids and counting: mobile science laboratories drive student interest in stem | amanda l. jones and mary k. stapleton | https://figshare.com/collections/ _ _million_kids_and_counting_mobile_science_laboratories_drive_student_interest_in_stem/
technology, attributions, and emotions in post-secondary education: an application of weiner’s attribution theory to academic computing problems | rebecca maymon, nathan c. hall, thomas goetz, andrew chiarella, and sonia rahim | https://figshare.com/collections/technology_attributions_and_emotions_in_post-secondary_education_an_application_of_weiner_s_attribution_theory_to_academic_computing_problems/
pyramidapp configurations and participants behavior data set | kalpani manathunga and davinia hernández-leo | https://zenodo.org/record/ #.w yh wgzy w
classification of word levels with usage frequency, expert opinions, and machine learning | guzey, onur; sohsah, gihad; and unal, muhammed | https://zenodo.org/record/ #.w yibwgzy w
human-centered design methods to empower “teachers as designers” | garreta domingo, muriel; sloep, peter; and hernández-leo, davinia | https://zenodo.org/record/ #.w yifwgzy w
supporting awareness in communities of learning design practice | konstantinos michos and davinia hernández-leo | https://zenodo.org/record/ #.w yij gzy w
massively open online course for educators (mooc-ed) network data set | kellogg, shaun and edelmann, achim | http://dx.doi.org/ . /dvn/zzh ub
on technological determinism: a typology, scope conditions and a mechanism | dafoe, allan | http://dx.doi.org/ . /dvn/
towards vocational translation in german studies in nigeria and beyond: lessons from translation teaching and practice in germany | oyetoyan, oludamilola iyadunni | https://zenodo.org/record/
results of a research software programming and development survey at the university of reading | darby, robert | https://zenodo.org/record/
mathan - fostering the intelligent novice: learning from errors with metacognitive tutoring | ken koedinger | https://pslcdatashop.web.cmu.edu/datasetinfo?datasetid=
geometry angles - north hills spring | john stamper and steve ritter | https://pslcdatashop.web.cmu.edu/datasetinfo?datasetid=
dataset: assistments math – | neil heffernan | https://pslcdatashop.web.cmu.edu/datasetinfo?datasetid=
middle school gaming the system (two schools and four lessons) – v | ryan baker | https://pslcdatashop.web.cmu.edu/datasetinfo?datasetid=
instructional factors analysis | min chi | https://pslcdatashop.web.cmu.edu/project?id=
the stanford moocposts data set | akshay agrawal and andreas paepcke | https://datastage.stanford.edu/stanfordmoocposts/
- assistment data | neil heffernan | https://sites.google.com/site/assistmentsdata/home/assistment- - -data
assistments skill builder data | neil heffernan | https://sites.google.com/site/assistmentsdata/home/ -assistments-skill-builder-data
head-mounted eye tracking: a new method to describe infant looking | franchak, j. m., kretch, k. s., soska, k. c., and adolph, k. e. | https://nyu.databrary.org/volume/
socioeconomic status indicators of harvardx and mitx participants – | hansen, john and reich, justin | https://dataverse.harvard.edu/dataset.xhtml?persistentid=doi: . /dvn/
cameo dataset: detection and prevention of “multiple account” cheating in massively open online courses | northcutt, curtis; ho, andrew; and chuang, isaac | https://dataverse.harvard.edu/dataset.xhtml?persistentid=doi: . /dvn/ ukvor
harvardx-mitx person-course academic year de-identified dataset, version . | mitx and harvardx | https://dataverse.harvard.edu/dataset.xhtml?persistentid=doi: . /dvn/
references
. dg connect european commission. digital science in horizon ; dg connect european commission: brussels, belgium, . . fecher, b.; friesike, s.
open science: one term, five schools of thought. in opening science; bartling, s., friesike, s., eds.; springer: cham, switzerland, ; pp. – . [crossref] . nielsen, m.a. reinventing discovery: the new era of networked science; princeton university press: princeton, nj, usa, ; isbn . . veletsianos, g.; kimmons, r. scholars in an increasingly open and digital world: how do education professors and students use twitter? internet high. educ. , , – . [crossref] . weller, m. the digital scholar: how technology is transforming scholarly practice; bloomsbury academic: london, uk, ; isbn . . veletsianos, g.; shepherdson, p. who studies moocs? interdisciplinarity in mooc research and its changes over time. int. rev. res. open distrib. learn. , , – . [crossref] . manca, s.; ranieri, m. exploring digital scholarship. a study on use of social media for scholarly communication among italian academics. in research . and the impact of digital technologies on scholarly inquiry; esposito, a., ed.; igi global: hershey, pa, usa, ; pp. – . isbn . . li, j.; greenhow, c. scholars and social media: tweeting in the conference backchannel for professional learning. emi. educ. media int. , , – . [crossref] . borgman, c.l. big data, little data, no data: scholarship in the networked world; mit press: cambridge, ma, usa, ; isbn - - - - . . molloy, j.c. the open knowledge foundation: open data means better science. plos biol. , , e . [crossref] [pubmed] . zawacki-richter, o.; latchem, c. exploring four decades of research in computers & education. comput. educ. , , – . [crossref] . bond, m.; zawacki-richter, o.; nichols, m. revisiting five decades of educational technology research: a content and authorship analysis of the british journal of educational technology. br. j. educ. technol. , , – . [crossref] . manca, s. researchgate and academia.edu as networked socio-technical systems for scholarly communication: a literature review. res. learn. technol. , , – . [crossref] . borrego, Á. 
institutional repositories versus researchgate: the depositing habits of spanish researchers. learn. publ. , , – . [crossref] . stewart, b.e. in abundance: networked participatory practices as scholarship. int. rev. res. open distrib. learn. , , – . [crossref] . burgelman, j.-c.; osimo, d.; bogdanowicz, m. science . (change will happen . . . .). first monday , . [crossref] . baack, s. datafication and empowerment: how the open data movement re-articulates notions of democracy, participation, and journalism. big data soc. , . [crossref] . european commission - rise - research innovation and science policy experts.
mallorca declaration on open science: achieving open science; european commission: mallorca, spain, . . h programme guidelines on fair data management (v . ); european commission: brussels, belgium, . . wellcome trust. wellcome signs open data concordat. wellcome trust blog, july . . now. open science. available online: https://www.nwo.nl/en/policies/open+science (accessed on november ). . cern. cms data preservation, re-use and open access policy; cern open data portal; cern: geneve, switzerland, . . bill & melinda gates foundation. gates open research. . available online: https://gatesopenresearch. org/about/policies#dataavail (accessed on november ). . mckiernan, e.c.; bourne, p.e.; brown, c.t.; buck, s.; kenall, a.; lin, j.; mcdougall, d.; nosek, b.a.; ram, k.; soderberg, c.k.; et al. how open science helps researchers succeed. elife , , – . [crossref] [pubmed] . bournea, p.e.; clarkb, t.; dalec, r.; de waardd, a.; hermane, i.; hovyf, e.; shottong, d. improving future research communication and e-scholarship: a summary of findings. informatik-spektrum , , – . [crossref] . van der zee, t.; reich, j. open education science. aera open , , – . [crossref] . veletsianos, g.; kimmons, r. networked participatory scholarship: emergent techno-cultural pressures toward open and digital scholarship in online networks. comput. educ. , , – . [crossref] . scanlon, e. digital futures: changes in scholarship, open educational resources and the inevitability of interdisciplinarity. arts humanit. high. educ. , , – . [crossref] . pearce, n.; weller, m.; scanlon, e.; kinsley, s. digital scholarship considered: how new technologies could transform academic work. in education , , – . . greenhow, c.; gleason, b. social scholarship: reconsidering scholarly practices in the age of social media. br. j. educ. technol. , , – . [crossref] . veletsianos, g. social media in academia: networked scholars; routledge: abingdon, uk, ; isbn . . manca, s.; ranieri, m. 
“yes for sharing, no for teaching!”: social media in academic practices. internet high. educ. , , – . [crossref] . donelan, h. social media for professional development and networking opportunities in academia. j. furth. high. educ. , , – . [crossref] . gu, f.; widén-wulff, g. scholarly communication and possible changes in the context of social media. electron. libr. , , – . [crossref] . rowlands, i.; nicholas, d.; russell, b.; canty, n.; watkinson, a. social media use in the research workflow. learn. publ. , , – . [crossref] . lupton, d. “feeling better connected”: academics’ use of social media; news and media research centre (uc): canberra, australia, . . boyer, e.l. scholarship reconsidered: priorities of the professoriate; carnegie foundation for the advancement of teaching: san francisco, ca, usa, ; volume , isbn . . raffaghelli, j.e.; cucchiara, s.; manganello, f.; persico, d. different views on digital scholarship: separate worlds or cohesive research field? res. learn. technol. , , – . [crossref] . goodfellow, r. scholarly, digital, open: an impossible triangle? res. learn. technol. , , – . [crossref] . scanlon, e. scholarship in the digital age: open educational resources, publication and public engagement. br. j. educ. technol. , , – . [crossref] . nicholas, d.; herman, e.; jamali, h.r. emerging reputation mechanisms for scholars; european commission: seville, spain, . . hoffmann, c.p.; lutz, c.; meckel, m. a relational altmetric? network centrality on researchgate as an indicator of scientific impact. j. assoc. inf. sci. technol. , , – . [crossref] . kuo, t.; tsai, g.y.; jim wu, y.-c.; alhalabi, w. from sociability to creditability for academics. comput. hum. behav. , , – . [crossref] . niyazov, y.; vogel, c.; price, r.; lund, b.; judd, d.; akil, a.; mortonson, m.; schwartzman, j.; shron, m. open access meets discoverability: citations to articles posted to academia.edu. plos one , , e . [crossref] . thelwall, m.; kousha, k. researchgate: disseminating, communicating, and measuring scholarship? j. assoc. inf. sci. technol. , , – . [crossref] . viberg, o.; hatakka, m.; bälter, o.; mavroudi, a. the current landscape of learning analytics in higher education. comput. human behav. , , – . [crossref] . kraker, p.; lex, e. a critical look at the researchgate score as a measure of scientific reputation. in proceedings of the quantifying and analysing scholarly communication on the web workshop (ascw’ ), oxford, uk, june– july . . nicholas, d.; clark, d.; herman, e. researchgate: reputation uncovered. learn. publ. , , – . [crossref] . orduna-malea, e.; martín-martín, a.; thelwall, m.; delgado lópez-cózar, e. do researchgate scores create ghost academic reputations? scientometrics , , – . [crossref] . ortega, j.l. relationship between altmetric and bibliometric indicators across academic social sites: the case of csic’s members. j. informetr. , , – . [crossref] . wilkinson, m.d.; dumontier, m.; aalbersberg, i.j.; appleton, g.; axton, m.; baak, a.; blomberg, n.; boiten, j.-w.; da silva santos, l.b.; bourne, p.e.; et al. the fair guiding principles for scientific data management and stewardship. sci. data , , . [crossref] [pubmed] . raffaghelli, j.e.; manca, s. is there a social life in open data? open datasets exploring practices in educational technology research.
zenodo . [crossref] . fitzgerald, e.; jones, a.; kucirkova, n.; scanlon, e. a literature synthesis of personalised technology-enhanced learning: what works and why. res. learn. technol. , , – . [crossref] . bodily, r.; leary, h.; west, r.e. research trends in instructional design and technology journals. br. j. educ. technol. , , – . [crossref] . salmi, j. study on open science: impact, implications and policy options; european commission: brussels, belgium, ; isbn . . verhaar, p.; schoots, f.; sesink, l.; frederiks, f. fostering effective data management practices at leiden university. lib. q. , , – . [crossref] . veletsianos, g. a case study of scholars’ open and sharing practices. open prax. , , – . [crossref] . gurstein, m.b. open data: empowering the empowered or effective data use for everyone? first monday , , – . [crossref] . zuiderwijk, a.; janssen, m.; choenni, s.; meijer, r.; alibaks, r.s. socio-technical impediments of open data. electron. j. e-gov. , , – . [crossref] . janssen, m.; charalabidis, y.; zuiderwijk, a. benefits, adoption barriers and myths of open data and open government. inf. syst. manag. , , – . [crossref] . hey, a.j.g. the fourth paradigm: data-intensive scientific discovery; microsoft research: redmond, wa, usa, ; isbn . . sieber, r.e.; johnson, p.a. civic open data at a crossroads: dominant models and current challenges. gov. inf. q. , , – . [crossref] © by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /). http://dx.doi.org/ . /journal.pone. http://dx.doi.org/ . /asi. http://dx.doi.org/ . /j.chb. . . http://dx.doi.org/ . /leap. http://dx.doi.org/ . /s - - - http://dx.doi.org/ . /j.joi. . . http://dx.doi.org/ . /sdata. . http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /zenodo. http://dx.doi.org/ . /rlt.v . http://dx.doi.org/ . /bjet. 
the arabic papyrology database

johannes thomann
university of zurich, switzerland

abstract

there exist about , premodern arabic documents on papyrus and paper, of which about , have been edited. another , unpublished documents are described or mentioned in papyrological publications. it is the aim of the arabic papyrology database (apd) to give access to published texts and descriptions. for the apd, an entirely new approach of organizing arabic text was developed. the apd presents texts in five levels that account for the peculiarities of the arabic script system and the scribal practices. the first level provides a faithful diplomatic edition of the text as found in the document. the following levels document four steps of editorial interventions. on the second level, lines of characters are broken into single words. on the third level, lacking diacritical dots are supplied. on the fourth level, arabic vowel signs are added, providing a full phonological representation. on the fifth level, a scientific latin transliteration is given.
each element of the fifth level is connected to a lexicon and a list of grammatical forms. all levels contain variant readings and editorial remarks. at present, the apd contains , full-text documents and is freely accessible (http://www.naher-osten.lmu.de/apd).

peculiarities of the arabic writing system

the arabic writing system in general and the writing conventions in papyri in particular require special display formats. as in most afro-asiatic writing systems, short vowels are not represented by letters, but there exist vocalization marks (ḥarakāt), written above or below the letters (gacek, : ). further, one-letter words and the article are written together with the following word, and the pronominal suffixes are attached to the preceding word. in transliteration, these words are separated by a hyphen. finally, fifteen letters (al-ḥurūf al-mu‘jama) are distinguished by diacritical pointing from letters with the same basic form (gacek, : ). however, these dots are only occasionally found in early arabic documents (kaplony ). most modern editions of early and classical arabic texts do not account for these peculiarities of the arabic writing system and normalize the texts according to modern orthographical rules without using vowel signs. while this established practice may be appropriate for literary texts, a more elaborate procedure is needed for documentary texts.

the data

there are two main groups of premodern arabic documents: inscriptions and papyri in the broader sense. the second group consists of about , documents written on different materials such as papyrus, parchment, or paper during the period from the th to the th centuries, of which about , have been edited (sijpesteijn ). another , unpublished documents are described or mentioned in papyrological publications.
all these documents provide information on almost every aspect of islamic history. despite their importance, they still do not receive a high level of attention in historical research (sijpesteijn : ).

correspondence: johannes thomann, institute of asian and oriental research, wiesenstrasse , ch- zurich, switzerland. e-mail: johannes.thomann@aoi.uzh.ch

digital scholarship in the humanities, vol. , supplement , . © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com. doi: . /llc/fqv. advance access published on june

methodology

two leading ideas underlie the arabic papyrology database (apd, accessed october : http://www.naher-osten.lmu.de/apd): first, to provide a research tool that makes the metadata and full texts of all edited documents easily accessible; second, to overcome the limitations of printed editions by means of a digital edition. as already mentioned, modern editions present the texts in a normalized form according to modern arabic orthography. some high-quality editions describe the diacritical pointing and vocalization marks in the critical apparatus, and a limited number of editions provide word indices in transliteration.

for the apd, an entirely new approach of organizing arabic text was developed. instead of the single text-level approach used in print and in other database projects, the apd presents texts in five levels. on the first level, only diacritical dots found in the document are written and all observations of gaps, deleted texts, and redundant words are indicated by sigla and brackets. on the second level, these marks are removed and the text is broken into single words. on the third level, missing diacritical dots are added, as is common in text editions.
on the fourth level, vowel marks are added, providing a full phonological representation. on the fifth level, the text is given in scientific latin transliteration; words written together in arabic are now separated. further, each element of the fifth level is connected to a lexicon and a list of grammatical forms. the levels of text are hierarchically organized, and variant readings and remarks are attached to their appropriate level. search is not only possible along each level, but searches across levels for the corresponding or neighbouring word in another level can be carried out as well. beyond these layers of text, some steps have been made to include a treebank of syntactical structures in the apd, based on the identified word categories and grammatical forms. hopefully, its realization will become part of a future project. at present, pairs of adjacent word categories with particular grammatical forms can be found in the texts.

state of the project

the apd was initiated by andreas kaplony and johannes thomann at zurich university in . today it is a joint project of the universities of zurich, munich (lmu), and vienna. the apd was online and freely accessible from its beginning. a phd project by one of its collaborators was based on the apd (grob, ), and another research project would have been impossible without the advanced search capabilities across text levels (kaplony, ). at present, , full text documents are available in the apd, and by the remaining edited documents will be entered into the apd. there are mutual references in the apd and the trismegistos database, and soon, texts of the apd will be imported by the papyrus navigator, made possible by the xml export engine of the apd in epidoc format.

references

gacek, a. ( ). arabic manuscripts: a vademecum for readers. leiden: brill.
grob, e.m. ( ). documentary arabic private and business letters on papyrus: form and function, content and context. berlin: de gruyter.
kaplony, a. ( ). what are those few dots for? thoughts on the orthography of the qurra papyri ( - ), the khurasan parchments ( - ) and the inscription of the jerusalem dome of the rock ( ). arabica ( ): – .
sijpesteijn, p.m. ( ). checklist of arabic papyri, bulletin of the american society of papyrologists, : – . updated version (accessed october ): http://www.naher-osten.uni-muenchen.de/isap/isap_checklist/index.html.
sijpesteijn, p.m. ( ). arabic papyri and islamic egypt. in bagnall, r.s. (ed.), the oxford handbook of papyrology. new york: oxford university press, pp. – .

city, university of london institutional repository

citation: baker, j., moore, c., priego, e., alegre, r., cope, j., price, l., stephens, o., van strien, d. and wilson, g. ( ). library carpentry: software skills training for library professionals. liber quarterly: the journal of european research libraries, ( ), pp. - . doi: . /lq.

this is the published version of the paper. this version of the publication may differ from the final published version.

permanent repository link: http://openaccess.city.ac.uk/id/eprint/ /
link to published version: . /lq.

copyright and reuse: city research online aims to make research outputs of city, university of london available to a wider audience. copyright and moral rights remain with the author(s) and/or copyright holders. urls from city research online may be freely distributed and linked to.

city research online: http://openaccess.city.ac.uk/ publications@city.ac.uk

vol. , no. ( ) – | e-issn: - x
this work is licensed under a creative commons attribution .
international license
uopen journals | http://liberquarterly.eu/ | doi: . /lq.
liber quarterly volume issue

library carpentry: software skills training for library professionals

james baker, university of sussex, uk, james.baker@sussex.ac.uk
caitlin moore, sotheby’s institute of art, uk, moorecaitlina@gmail.com
ernesto priego, city university london, uk, efpriego@gmail.com
raquel alegre, university college london, uk, r.alegre@ucl.ac.uk
jez cope, university of sheffield, uk, j.cope@erambler.co.uk
ludi price, city university london, uk, ludi.price@gmail.com
owen stephens, owen stephens consulting, uk, owen@ostephens.com
daniel van strien, homerton university hospital foundation trust, davanstrien@gmail.com
greg wilson, software carpentry, canada, gvwilson@software-carpentry.org

abstract

librarians play a crucial role in cultivating world-class research, and in most disciplinary areas today world-class research relies on the use of software. this paper describes library carpentry, an introductory software skills training programme with a focus on the needs and requirements of library and information professionals. using library carpentry as a case study of the development and delivery of software skills focused professional development, this paper describes the institutional and intellectual contexts in which library carpentry was conceived, the syllabus used for the initial exploratory programme, the administrative apparatus through which the programme was delivered, and the analysis of data collection exercises conducted during the programme.
as many university librarians already have substantial expertise working with data, it argues that adding software skills (that is, coding and data manipulation that goes beyond the use of familiar office suites) to their armoury is an effective and important use of professional development resources.

key words: capacity building; software skills; data; library carpentry

overview

librarians play a crucial role in cultivating world-class research, and in most disciplinary areas today world-class research relies on the use of software (hettrick, ). established non-profit volunteer organisations such as software carpentry (wilson, ) and data carpentry (teal et al., ) offer introductory research software skills training with a focus on the needs and requirements of research scientists. this paper describes library carpentry, a comparable introductory software skills training programme with a focus on the needs and requirements of library and information professionals.

in its initial exploratory run, library carpentry took the form of four three-hour sessions held at the city university london centre for information science across four successive monday evenings in november . these sessions attracted participants from institutions in london and its environs. subsequently, there has been interest in library carpentry both from outside the united kingdom and from individuals working in comparable professional contexts outside librarianship (such as the archives sector).

this paper presents library carpentry as a case study on the development and delivery of software skills focused professional development. it begins by describing the institutional and intellectual contexts in which library carpentry was conceived.
it goes on to describe the syllabus used for the initial exploratory programme, the administrative apparatus through which the programme was delivered, and the analysis of data collection exercises conducted during the programme. it concludes with a discussion of what went well, what might need adapting, and future plans. as many university librarians already have substantial expertise working with data, it argues that adding software skills (that is, coding and data manipulation that goes beyond the use of familiar office suites) to their armoury is an effective and important use of professional development resources.

the paper has three aims:

1) to describe how and why the approaches of and resources created by non-library specific software skills training (software carpentry, data carpentry, and programming historian) and existing library specific programmes (british library digital scholarship training programme, data scientist training for librarians) were adapted to create the library carpentry exploratory run and lesson materials.
2) to present data on software skills in university libraries that was collected during this exploratory run.
3) to stand as one of a number of activities that builds the foundations of a distributed community model for embracing and sustaining software skills in the library and information profession.

context

intellectual contexts

jonathan rochkind ( ) argued in his editorial introduction to the inaugural issue of the code4lib journal that:

“this is a decisive time for libraries. in the changing social and technological environment, libraries must adapt to fulfill their missions and satisfy their users. library technology is acutely involved in this adaptation. digital services, content and tools have become a part of nearly every aspect of library operations.
the “digital library” is here–if you work in a library, you probably work in a digital library” (n.p.).

modern librarians have a wide range of roles (rluk, ). within many of these roles there is potential for applying programming and it skills, which allow for automation of tasks and the manipulation of data, whether aggregating and harmonising usage statistics from diverse sources or bulk editing of metadata in library catalogues. the research data management and open access services now offered by academic libraries provide additional motivation for librarians to develop such skills (goben & raszewski, ). conceptual questions raised when thinking about software design and programming skills often overlap with issues ‘traditionally’ considered by librarians, potentially leading to new approaches in both. nevertheless, while the “digital library” has since fundamentally altered what university libraries do and are for, the integration of software skills into the work of library and information professionals has remained uneven (dalziel, ; kim, ). as andromeda yelton ( ) notes in her american library association library technology report ‘coding for librarians: learning by example’, significant social and political barriers remain:

“many library coders spend a significant amount of time trying to cultivate buy-in, educate their colleagues about technology, or work against siloed organizational structures as they produce inherently cross-departmental work” [p. ].

that “buy-in” can be inhibited by a conflation of the desire to automate processes and reduce human involvement with cuts in staffing levels and the resulting increased workloads on individuals (wilkinson, ). in turn, the emergence of software skills among library and information professionals can contribute to existential doubt, to concerns that the traditional and valued skills are no longer enough to justify employment (wilkinson, ).
one factor that has provided fresh impetus for librarians to develop software skills which cut across their existing data management, user support, and research collaboration roles is the recent emergence of the digital humanities, a field in which library and information scientists feature prominently (baker et al., a; baker, williams, russell, & rosenblum, b; keener, ). a project funded by the uk public body jisc made a series of recommendations to higher education institutions or higher education clusters looking to build capacity for enabling complex analysis of large-scale digital collections by their non-computationally trained humanities researchers. in particular, it suggested investment ‘in training library staff to run these initial queries in collaboration with humanities faculty, to support work with subsets of data that are produced, and to document and manage resulting code and derived data’ (terras et al., ).

practical contexts

in response to these contexts, library carpentry was conceived in autumn as an attempt to facilitate the cultivation of software skills in the library and information science community. it built on three activities undertaken by the library carpentry pi (baker) whilst a digital curator at the british library between march and august .

the first activity was his attendance at a software carpentry workshop (greenwich, london, october ). subsequent discussions with greg wilson (then director of software carpentry) revealed the prominence of library and information science professionals among the cohort of non-scientists who attend software carpentry workshops. this indicated a potential appetite for in-person software skills training tailored to librarians. the second activity was the experience of writing lessons for the open-access, peer-reviewed journal the programming historian (crymble et al., ).
the observation of the crossover between historians and librarians in the use of and contribution to its lessons indicated an interest among librarians in structured and maintained online learning materials on software skills. the third activity was his role as a session lead and programme coordinator for the british library’s digital scholarship training programme, a hands-on practical training programme designed for british library staff and delivered as one-day on-site workshops. the programme launched in november and covers topics from communicating collections and cleaning up data to command line programming and geo-referencing, some of which are available for self-directed learning (baker, ). when presenting this activity to librarians outside of the british library, it was clear that demand existed in the sector for comparable in-person training programmes.

in light of the contexts and observations described, library carpentry had three aims upon inception:

1) to blend non-library specific software skills training (software carpentry, data carpentry, and programming historian) with existing library specific programmes (british library digital scholarship training programme, data scientist training for librarians; erdmann, von alstine, eslao, durocher, & wicks ) into a public offering aimed at library and information professionals seeking an introduction to software skills.
2) to collect data on software skills in university libraries, organisations that play a crucial role in cultivating world-class research that is increasingly reliant on software.
3) to build the foundations of a distributed community model for embracing and sustaining software skills in the library and information profession.

programme administration

library carpentry was conceived in autumn as a planned activity for baker’s then proposed software sustainability institute fellowship. use of the ‘carpentry’ name was intended to indicate an alignment with core aspects of software carpentry, namely: training delivered over four segments; an emphasis on open software and data; encouragement of attendance as groups rather than as individuals to support peer learning; the use of multiple trainers to troubleshoot issues; and training materials published online. the use of the ‘carpentry’ moniker was approved by software carpentry. once baker had been selected for a fellowship, an outline for library carpentry was pitched to the software sustainability institute and a £ , budget set aside to cover travel costs for trainers, room hire, and refreshments for attendees. at this stage it was anticipated that – attendees would be accommodated on a single day. in early , a public call for input into shaping library carpentry was issued (baker, a) and discussions with individuals and groups already involved in relevant pedagogical initiatives (see ‘context’) took place.

as a result of this work, a decision was made for the library carpentry exploratory programme to take the form of a series of short events delivered in late- based around a syllabus comparable to software carpentry and data carpentry, but with examples, data, and exercises that replicated library practice rather than scientific research. the syllabus would include the unix shell (a command line user interface), git (a version control tool) and openrefine (an interactive data cleanup tool). city university london centre for information science were approached with a view to joining the project and the discussions that followed led to the decision to host library carpentry at city university london in alignment with their master’s degree in library and information science. this had two significant consequences. first, a programme of four sessions to take place between : and : on consecutive monday evenings was agreed upon.
doing so meant diverging from the two-day format preferred by both software carpentry and data carpentry, effectively distributing each half-day session over a four-week period. second, as room hire costs were waived as part of this arrangement, the budget was in turn adjusted to accommodate attendees.

having settled on a venue, draft syllabus, and format, in april a call for participants and volunteers was issued (baker, b). the response to this established both the demand for library carpentry and a group of individuals willing to offer their time to develop and/or deliver the syllabus. in order to maximise the pedagogical benefits of peer learning between sessions (wilson, ), respondents from the first group were asked to bring a cohort from their institution to library carpentry.

in spring/summer the programme website and syllabus were developed and hosted on public github repositories. this platform centralised our work and made it transparent: for example, github issue trackers were used to manage lesson development. although using github for this purpose is not frictionless – its steep learning curve introduces potential inclusivity issues (crymble, ) – it nevertheless provided a useful platform on which to promote, iterate, and deploy library carpentry.

in july lesson plans for each session were in place and proposed attendees were invited to confirm their registration. due to high demand, institutions were limited to four attendees, with a waiting list for those the programme was unable to support due to capacity restrictions.

during programme delivery in november , attendees worked on their own personal computers. for sessions one, two, and three printed handouts were provided to guide learning.
refreshments and snacks were provided at every session and, as the sessions were held in the evening, attendees were encouraged to bring more substantial meals. at the beginning of each session attendees were asked to make a name badge for themselves, to self-assess their confidence level on the topic at hand, and to clearly display this on their badge (see ‘attendance and feedback’). each session was coordinated by a session lead and supported by between two and four helpers who undertook administrative tasks and/or were available to troubleshoot problems arising in the room. at the close of each session, attendees were directed to any actions they needed to take before the next session, typically software installation or downloading data. between sessions, matters arising, clarifications, and installation problems were handled via both github issues and email.

syllabus

overview

the library carpentry syllabus delivered in during the exploratory programme was aimed at beginners, required no prerequisite knowledge, and was tailored to align with the needs and requirements of library and information science professionals through the use of relevant examples, data, and exercises. the programme was split into four parts, each of which was delivered in a single three-hour session that was sub-divided into three sections each roughly minutes in length (timetables used for each session are noted below, though they do not correspond exactly to what happened at each session). dependencies between each session were minimised. this reduced the need for refreshers and meant that attendees who were unable to attend a session were not disadvantaged.

two sets of learning outcomes informed the development and delivery of the syllabus.
first, that attendees understood the value of command line interfaces, regular expressions, plain text file formats and consistent naming conventions, and would be equipped by library carpentry to apply these in their own professional context. second, that attendees understood the importance of openly licensed software with strong and diverse user communities and how these characteristics could support both self-directed learning and professional development. to support the delivery of these learning outcomes, the syllabus was also built around use cases with clear relevance to library practice and functionality found in multiple software tools. the latter was particularly important in syllabus construction. the choice to offer a session that focused on the interactive data cleanup tool openrefine, for example, was made not only because openrefine is a powerful software tool for manipulating tabulated data that is well liked by librarians and is underpinned by a strong user community, but also because openrefine queries are built on both regular expressions and programming languages (jython, clojure, and the bespoke general refine expression language) that introduce learners to clear, well-documented, and well-constructed programming syntax.

sessions

session one began with an introduction to basic programming concepts. it drew upon a wide literature, with elements adapted from british library digital scholarship training programme lesson materials. thereafter attendees were asked to reflect on words and phrases associated with programming, code, and software that they believed they would benefit from
discussions in these small groups were collated by the session lead and an open discussion held that aimed to underscore the learning outcomes of library carpentry, to flag areas that the sessions would not cover and where help on those could be sought, and to amelio- rate concerns over prerequisite knowledge. the session concluded with an introduction to regular expressions. this included an exercise designed to both help attendees understand what regular expressions do and to encour- age attendees to build their own regular expressions. these were conducted on paper with personal computers used to check answers, resolve queries, and experiment. week one timetable arrival and introduction ( minutes) jargon busting exercise ( minutes) foundations presentation ( minutes) regular expressions practical ( minutes) session two covered the unix shell. it was adapted from software carpentry and the programming historian lesson materials. in this hands on-session, attendees were introduced to the basic commands required to navigate the filesystem, to count (wc) and mine (grep) metadata for journal articles in tab- ular form, and to clean and manipulate a text for the purposes of counting words. a demonstration of how to run named entity recognition software from the unix shell sought to underscore the wider applications of the inter- face and commands introduced. the session concluded with some advice on and recommendations of resources that can support learning and use of the unix shell and programming languages. week two timetable arrival and introduction ( minutes) unix shell basics ( minutes) counting and mining in the unix shell ( minutes) cleaning and transforming in the unix shell ( minutes) session three covered version control with git. it was adapted from software carpentry lesson materials. the session began by introducing attendees to james baker et al. 
liber quarterly volume issue the concept of version control, to the terms used to control versions in git, and how to collaborate and publish git repositories using github. a pen and paper group exercise was used to reinforce the latter. thereafter, attendees acquired hands-on experience of using git in the unix shell to version con- trol a file and explored the use of github within a collaborative versioning workflow. finally, attendees were encouraged to consider use cases for git/ github workflows including using github pages to create simple websites to host a blog or details of an event. week three timetable arrival and introduction ( minutes) introducing git and github ( minutes) exercise and using git ( minutes) git and github use cases ( minutes) session four covered cleaning and transforming data in openrefine. it was adapted from british library digital scholarship training programme les- son materials. after a brief introduction to the history of openrefine, its purpose, and the community that supports and maintains it, attendees were led through a series of exercises that cleaned and normalised a real-world dataset from an institutional repository. the session concluded with some examples of more advanced uses of openrefine (such as interactions with web based services) and a discussion of the capabilities of openrefine, its appropriateness for certain use cases, and how to integrate it into existing workflows. week four timetable arrival and introduction ( minutes) introduction to openrefine ( minutes) basic openrefine functions ( minutes) advanced openrefine functions ( minutes) each of the four sessions began by situating what was to follow within the wider context of the programme, and sessions two, three, and four began with an articulation of which of the basic programming concepts encoun- tered in session one would be covered. 
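the counting and mining steps of session two, together with the regular expressions introduced in session one, can be sketched in python as follows. this is a minimal illustration and not the workshop material: python stands in here for the unix shell commands wc and grep that the session used, and the sample records and the function names (read_rows, count_rows, mine_rows, count_words) are assumptions invented for this sketch.

```python
import csv
import io
import re

# Hypothetical tab-separated journal article metadata, invented for illustration.
SAMPLE_TSV = (
    "title\tjournal\tyear\n"
    "Metadata quality in catalogues\tLibrary Review\t2014\n"
    "Open data in libraries\tJournal of Documentation\t2015\n"
    "Teaching the Unix shell\tLibrary Review\t2015\n"
)


def read_rows(tsv_text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def count_rows(rows):
    """Count records, similar in spirit to `wc -l` without the header line."""
    return len(rows)


def mine_rows(rows, pattern, field):
    """Keep rows whose field matches a regular expression, similar to `grep`."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [row for row in rows if regex.search(row[field])]


def count_words(text):
    """Lowercase the text, drop punctuation, and count each remaining word."""
    counts = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    rows = read_rows(SAMPLE_TSV)
    print(count_rows(rows))
    for row in mine_rows(rows, r"^library", "journal"):
        print(row["title"])
    print(count_words("Open data, open libraries."))
```

the regular expression `^library` keeps only records whose journal name begins with "library", which is the kind of pattern the paper exercises in session one were meant to prepare attendees to write for themselves.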
every session provided opportunities for attendees to ask questions of the session lead and helpers both during and at the conclusion of the session.

library carpentry: software skills training for library professionals liber quarterly volume issue

training materials for each session were available to attendees on github and have been subsequently archived: the materials available include lesson plans, slides, data used in exercises, handouts, and answer sheets (baker, ).

. attendance and feedback

one of the aims of the library carpentry exploratory programme was to collect data on software skills in university libraries. this took the form of gathering attendance data and feedback from attendees. starting with attendance, across the four sessions, a total of individuals from the library and information science community attended library carpentry, nine of whom were trainers and/or supported the programme in an administrative capacity (often, in addition to attending as learners). organisations were represented (see table ).

. . feedback: self-assessment

during each library carpentry session feedback was gathered from attendees. it was anticipated that this data would support understanding of both whether the syllabus had been pitched correctly so as to deliver the stated learning outcomes and the potential barriers to building a distributed community model for embracing and sustaining software skills in the library and information profession. three mechanisms were used to gather feedback from attendees.

table : composition of library carpentry attendance by location and affiliated organisation.
location | n | affiliated organisation
london | | birkbeck university of london, british film institute, british library, city university london, imperial college london, ucl, university of london computing centre, the wellcome trust
south east of england | | cambridge university, university of reading, university of sussex
rest of england | | de montfort university, university of leicester, university of sheffield
international | | software carpentry

first, at the beginning of each session, attendees were asked to self-report their skill level based on the topic at hand. in order to do this, attendees were asked to complete the sentence ‘in relation to the topic this week i know...’ with one of four options: ‘nothing!’ ( ), ‘a little!’ ( ), ‘lots!’ ( ), and ‘lots and lots!’ ( ). the rationale for this exercise was threefold: first, it enabled the session leads to gain a quick sense of the confidence in the room each week; second, as attendees were asked to clearly display this on their badge, other attendees were able to identify people nearby in the room who might be able to support them if they got stuck; and third, it provided data on what software skills attendees thought they had before attending library carpentry.

the data displayed in table shows that at the beginning of sessions two, three, and four more than three-quarters of attendees reported knowing nothing about the topic at hand. attendees were more knowledgeable on concepts (session one), suggesting the existence of a basic set of knowledges in the library and information science community upon which software skills could build. openrefine (session four) provided the lowest self-assessment, although this was expected given that it is specialist, bespoke software. session three (which covered git) had the lowest attendance. this data suggests that the integration of software skills into the work of library and information professionals remains uneven.

. .
feedback: anticipated use of library carpentry

the second mechanism by which feedback was collected took place at the beginning of the third session, the programme mid-point (for this data see week two materials; baker, ). on this occasion attendees were asked to articulate ways in which what they had learnt could be used in their daily practice. attendees were asked to write on a sticky note a scenario in which they could imagine that they or a member of their team might be able to use what they had learnt at library carpentry in the workplace. attendees responded. these responses were grouped by theme to identify clusters.

table : attendee self-reporting on skill level at the start of each week.
in relation to the topic this week i know…: nothing! | a little! | lots! | lots and lots!!
week one | week two | week three | week four | mean (n)

analysis of these responses shows a strong cluster (n= ) that anticipated using software skills to improve their search capabilities:
• ‘ability to combine large documents and search across them’
• ‘find all versions of a word (misspelt due to dyslexia) within a database’
• ‘possibly pulling lines out of a horrid csv file our open access system produces’.

concurrent with these were clusters of responses that anticipated using skills learnt at library carpentry to manipulate large scale data (n= ) or to review library data (n= ):
• ‘use regular expressions to help create review files of metadata’
• ‘comparing e-book data (ie. cat records) with physical records. overlaps? gaps? subject heavy in one or other?’
• ‘cleaning up data after exporting it from a) lms b) institutional repository’
• ‘reformatting shelfmarks (title, author, etc) in an exported database of item/bib records. quite possibly in transforming data to different systems e.g. alt lms etc’.
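Two of the anticipated uses above (finding every version of a misspelt word, and pulling lines out of a csv export) can be illustrated with a single regular expression. This is a hedged sketch in Python rather than anything used in the sessions: the sample lines, the chosen misspelling, and the helper name `matching_lines` are assumptions made for the example.

```python
import re

# Hypothetical export lines; the records and the misspelling are invented.
LINES = [
    "101,Seperate catalogues for maps",
    "102,A separate index of serials",
    "103,Records separated by tabs",
    "104,Maps held in SEPARATE storage",
]

# Matches "separate" and the misspelling "seperate" as a whole word,
# ignoring case. The character class [ae] covers both spellings, and the
# \b word boundaries exclude longer words such as "separated".
VARIANTS = re.compile(r"\bsep[ae]rate\b", re.IGNORECASE)


def matching_lines(lines, pattern):
    """Return the lines in which the compiled pattern occurs at least once."""
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    for line in matching_lines(LINES, VARIANTS):
        print(line)
```

A character class works when the variants differ by a letter or two; a wider range of spellings would need alternation or a fuzzy-matching approach instead.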
attendees also reported anticipating using the skills learnt at library carpentry to better understand software related possibilities in their workplaces (n= ):
• ‘understanding arguments/possibilities for software’
• ‘but mostly – talking to the tech people in clearer, more useful ways’
• ‘given me confidence, reminded me what i know/can learn so i can make online ‘sandbox’ environment to test tools & share courses’.

smaller overlapping clusters of potential uses of the skills learnt included research support (n= ), local capacity building (n= ), and task automation (n= ). this data suggests demand in the library and information profession for acquiring software skills.

. . feedback: barriers to learning software skills

the third feedback mechanism took place at the close of the final session (for this data see week four materials; baker, ). on this occasion attendees were asked to articulate what they might need to pass on to colleagues the skills they had learnt at library carpentry. again they were asked to write their responses on a sticky note. attendees responded. these responses were grouped by theme to identify clusters.

analysis of these responses shows time (n= ) and practice (n= ) as important requirements to passing on skills learnt at library carpentry. a smaller cluster (n= ) reported the need for more worked examples:
• ‘practice with data i am familiar with – applying each week’s lessons practically’
• ‘i’d be really keen but think i’d need time and actually apply this to something practically myself (i am hoping to do this) to feel more confident in it (+ seeing other people’s practical applications too)’
• ‘any websites that act as a simple reminder for me + to give others’.

organisational barriers to passing on skills learnt were reported. these were described in two forms. one group (n= ) reported having insufficient it permissions to apply the skills developed at library carpentry in their workplace. a second more nebulous group (n= ) reported that their organisational culture was either unsupportive or lacked the communities they needed to embed software skills in their practice:
• ‘an appropriate channel: knowing the people/group to disseminate to’
• ‘some of my colleagues are not very technologically adept, so not sure how much can pass on’, ‘might be able to sell openrefine to colleagues but the organisation i work for hates open source software :(’
• ‘i certainly need more time to practice and a community where i am not afraid of asking stupid questions’.

this data suggests that in spite of demand for software skills, the integration of core software skills into the work of library and information professionals is inhibited by significant cultural, social and organisational barriers.

. . review

taken together, these three sets of data (drawn, it must be noted, from a self-selecting audience) point to three provisional findings and recommendations:

) library and information science professionals both value the acquisition of software skills as part of their professional development and report a low competency in such skills (findings corroborated by two attendee blogs that reflected on attending library carpentry: playforth, ; sykes, ). more work is needed to corroborate these findings in an international context.

) library and information science professionals report undertaking activities that could be improved by their acquisition of software skills. more work is needed to map these activities to the roles performed by library and information professionals.

) library and information science professionals face various challenges to acquiring software skills and to embedding those skills in their workplace.
more work is needed to tease out commonalities between these challenges.

an organisational challenge not captured by these feedback mechanisms was one of timing. taking into account the target audience and the host institution, it was important to strike a balance between fulfilling everyday responsibilities and the need for ongoing professional development. complementing the positive feedback regarding content and delivery, some participants provided informal negative feedback that the sessions were too long, especially as they took place after a day of work. the pragmatic challenge for organisers and participants was that this time of the day was the most feasible for information professionals with core working hour responsibilities and stretched professional development capacity. work that builds on the initial exploratory run of library carpentry must remain attentive to this important professional dynamic.

. next steps

starting with what went well, the decision to spread library carpentry across four weekly evening sessions (as opposed to software carpentry’s two-day format) delivered significant benefits: the intensity of learning was distributed, the time between sessions was used to reinforce skills and field queries, nascent peer support communities were built (especially through the logging of issues on github and the use of the twitter hashtag #librarycarpentry during and after sessions; priego, ), the syllabus was revised mid-programme based on the progress of previous sessions, and a wider pool of expert trainers could be drawn upon (that is, the individual trainers and helpers were unlikely to all have been available simultaneously). the decision to encourage institutions to bring a cohort to library carpentry across a four-week period was a notable success: attendees reported discussing sessions during journeys home and subsequently in the office, thus deepening knowledge acquisition and facilitating self- and peer-learning. attendance remained consistent throughout, indicating that attendees were sufficiently satisfied with the learning experience to return the next week.

as the lessons were new – if adapted in most cases from existing resources – areas were encountered where revisions focused on clarity and concision were required. the necessity to troubleshoot problems encountered by a large group meant that all four lessons fell behind schedule, and most did not cover the entirety of the planned lesson content. a higher ratio of trainers and helpers to attendees might alleviate this problem, though as many of the problems encountered were not lesson related but caused by attendees working on their own personal computers (such as failed wifi logins, software installation faults, and operating systems interoperability issues) an it training suite might also ensure more lesson content is covered. that said, we observed clear benefits of having learners able to leave a training session with a configured environment on their own personal computers: that is, they could continue developing their skills immediately (wilson, ). better support for learners who wish to reinforce learning between sessions could be offered by providing exercises for completion between lessons.
finally, attendees struggled most with git workflows and terminology, in spite of using github issues to discuss or ask questions from week one (see for example the especially large number of openrefine use case queries raised on github issues after prompting from the session four lead https://github.com/librarycarpentry/week-four-library-carpentry/issues). introducing git through a gui such as github desktop might be preferred, although this would reduce the opportunity to reinforce learning of shell commands. note, however, that this is also a finding of comparable training programmes and is a reason for data carpentry not teaching git and github.

. conclusion

the future plans for library carpentry, however, go beyond iterating lesson content. library carpentry set out to build the foundations of a community model for embracing and sustaining software skills in the library and information profession, with the intention that this model would emerge from the experience of delivering the initial exploratory programme. spreading the programme over four weeks had the unexpected benefit of alerting a wider range of non-attendees to library carpentry, a number of whom reached out during and since library carpentry to find out more and to enquire into running iterations of library carpentry in their local area. since the library carpentry exploratory run, further workshops have been organised in countries across continents. each workshop has adapted the lesson materials and timings to suit the needs and requirements of their local audience. alongside and as a result of these workshops, the library carpentry lesson materials have been substantially revised and improved.
during the mozilla science lab global sprint ( – june ), a team from the usa, australia, canada, uk, the netherlands, and south africa developed lesson materials, added a new lesson on sql (a relational database management language), assigned administrative roles required to support a distributed management and maintenance structure, and republished the materials using the data carpentry lesson template (weaver, ).

building on the momentum created by these medium-term successes, we hope that in the long-term library carpentry will develop a sustainable model comparable to software carpentry and data carpentry – that is, towards an evolving set of centralized lessons delivered globally by individuals who have undergone instructor training, the latter in particular having proved crucial to building communities around both software carpentry and data carpentry. to achieve this, we must first note that software carpentry and library carpentry have different audiences: the former being scholars (and primarily scientists) who self-identify with the benefits of developing their own software skills in relation to their research; the latter being professionals who self-identify with the benefits of developing their own software skills in relation to their organisational needs. with this in mind, and in lieu of substantial funds, the next step is to establish library carpentry along a distributed model, to create a body of materials that libraries as organisations can draw on, adapt, and reuse as appropriate in their local contexts, thereby achieving medium-term sustainability through an organisation to organisation community. to achieve this, investment priorities for future work are:
• evaluate the short- to medium-term benefits of attending library carpentry workshops to both attendees and their libraries through semi-structured interviews with attendees.
• develop a set of resources to enable library carpentry attendees to pass on software skills in their libraries. it is anticipated that these resources would be predicated on the idea that the best way to reinforce your own software skills is to teach others.
• disseminate the results of that evaluation and the resources developed in appropriate venues.

this case study has described how the library carpentry exploratory programme was conceived and delivered, the syllabus of software skills training materials that were used and subsequently developed, the analysis of data collection exercises that were conducted, and developments that have occurred in subsequent months. the library carpentry exploratory programme and the activities that have built on it confirm that there is demand, appetite and a will among library and information professionals to acquire software skills. as librarians and software skills are both vital components of world-class research, library carpentry is a timely intervention into the role of librarians in the research lifecycle.

acknowledgements

we thank the software sustainability institute for their generous sponsorship of library carpentry. the software sustainability institute cultivates world-class research with software. the institute is based at the universities of edinburgh, manchester, southampton and oxford.

references

baker, j. ( ). british library digital scholarship training programme: a round-up of resources you can use. british library digital scholarship.
retrieved october , , from http://britishlibrary.typepad.co.uk/digital-scholarship/ / /british-library-digital-scholarship-training-programme-round-up-of-resources-you-can-use.html.
baker, j. ( a). what would library carpentry look like? british library digital scholarship blog. retrieved october , , from http://britishlibrary.typepad.co.uk/digital-scholarship/ / /what-would-library-carpentry-look-like.html.
baker, j. ( b). library carpentry: call for volunteers, call for participants. british library digital scholarship blog. retrieved october , , from http://britishlibrary.typepad.co.uk/digital-scholarship/ / /library-carpentry-call-for-volunteers-call-for-participants.html.
baker, j. ( ). library carpentry. weeks one-four. november . figshare. retrieved october , , from https://dx.doi.org/ . /m .figshare. .v .
baker, j., bourg, c., gil, a., hettel, j., lindblad, p., miller, l., & stack, p. ( a). methods for empowering library staff through digital humanities skills. workshop given at digital humanities , lausanne, switzerland, july – , . retrieved october , , from https://dh .files.wordpress.com/ / /dh- -workshop- .pdf.
baker, j., williams, h., russell, j., & rosenblum, b. ( b). teaching digital humanities in the library. talk given at data driven: digital humanities in the library, charleston, sc, june – , .
crymble, a. ( ). how can we make the ph more friendly for women to contribute? the programming historian, , n.p. retrieved october , , from https://github.com/programminghistorian/jekyll/issues/ .
crymble, a., gibbs, f., hegel, a., mcdaniel, c., milligan, i., taparata, e., & wieringa, j. (eds.) ( ). the programming historian. ( nd ed.). retrieved from http://programminghistorian.org/.
dalziel, k. ( ). why every library science student should learn programming. nirak.net. retrieved october , , from http://nirak.net/ / /why-every-library-science-student-should-learn-programming/.
devenyi, g.a., koch, c., & srinath, a. (eds) (june ) software carpentry: the unix shell. version . . retrieved october , , from https://github.com/swcarpentry/shell-novice, http://dx.doi.org/ . /zenodo. .
erdmann, c., von alstine, c., eslao, c., durocher, m., & wicks, s. ( ). data scientist training for librarians. international data curation conference , san francisco, california, february – , . retrieved from http://www.dcc.ac.uk/sites/default/files/documents/idcc / dst l-idcc_ .pdf.
goben, a., & raszewski, r. ( ). research data management self-education for librarians: a webliography. issues in science and technology librarianship, issue , n.p. retrieved october , , from http://dx.doi.org/ . /f hck.
hettrick, s. ( ). it’s impossible to conduct research without software, say out of uk researchers. software sustainability institute. retrieved october , , from http://www.software.ac.uk/blog/ - - -its-impossible-conduct-research-without-software-say- -out- -uk-researchers.
keener, a. ( ). the arrival fallacy: collaborative research relationships in the digital humanities. digital humanities quarterly, ( ), n.p. retrieved october , , from http://www.digitalhumanities.org/dhq/vol/ / / / .html.
kim, b. ( ). why not grow coders from the inside of libraries? library hat. retrieved october , , from http://www.bohyunkim.net/blog/archives/ .
playforth, c. ( ). why the information profession needs library carpentry. software sustainability institute. retrieved october , , from http://software.ac.uk/blog/ - - -why-information-profession-needs-library-carpentry- .
priego, e. ( ). an archive of #librarycarpentry [ / / : : to / / : : gmt]. figshare. retrieved october , , from https://dx.doi.org/ . /m .figshare. .v .
rluk. ( ). powering scholarship: rluk research libraries uk strategy – . retrieved october , , from http://www.rluk.ac.uk/wp-content/uploads/ / /rluk-strategy- -online.pdf.
rochkind, j. ( ). editorial introduction – issue , n.p. the code lib journal, , . london: research libraries uk. retrieved october , , from http://journal.code lib.org/articles/ .
sykes, t. ( ). c: print “library carpentry”. making connections with code. code and the librarian. retrieved october , , from https://codeandthelibrarian.wordpress.com/ / / /library-carpentry-poster/.
teal, t., cranston, k., lapp, h., white, e., wilson, g., ram, k., & pawlik, a. ( ). data carpentry: workshops to increase data literacy for researchers. international journal of digital curation, ( ), – . retrieved october , , from http://dx.doi.org/ . /ijdc.v i . .
terras, m., baker, j., hetherington, j., beavan, d., welsh, a., o’neill, h., …, farquhar, a. ( ). enabling complex analysis of large-scale digital collections: humanities research, high performance computing, and transforming access to british library digital collections. digital humanities , kraków, poland, july – , .
weaver, b. ( ). updating library carpentry. software carpentry blog. retrieved october , , from http://software-carpentry.org/blog/ / /library-carpentry-sprint.html.
wilkinson, l. ( ). is coding an essential library skill? sense and reference. retrieved october , , from https://senseandreference.wordpress.com/ / / /is-coding-an-essential-library-skill/.
wilson, g. ( ). software carpentry. teaching basic lab skills for research computing. blog. retrieved october , , from http://software-carpentry.org/.
wilson, g. ( ). software carpentry: lessons learned. retrieved october , , from https://arxiv.org/pdf/ . v .pdf.
yelton, a. ( ). coding for librarians: learning by example. library technology reports, ( ), – . retrieved october , , from https://journals.ala.org/ltr/article/view/ / .

note

as the software carpentry unix shell lesson describes: ‘the most popular unix shell is bash, the bourne again shell (so-called because it’s derived from a shell written by stephen bourne). bash is the default shell on most modern implementations of unix and in most packages that provide unix-like tools for windows’ (devenyi, koch, & srinath, ). popular examples of ‘modern implementations of unix’ are mac os x and linux operating systems such as ubuntu.
achieving human and machine accessibility of cited data in scholarly publications

joan starr, eleni castro, mercè crosas, michel dumontier, robert r. downs, ruth duerr, laurel l. haak, melissa haendel, ivan herman, simon hodson, joe hourclé, john ernest kratz, jennifer lin, lars holm nielsen, amy nurnberger, stefan proell, andreas rauber, simone sacchi, arthur smith, mike taylor and tim clark

california digital library, oakland, ca, united states of america
institute of quantitative social sciences, harvard university, cambridge, ma, united states of america
stanford university school of medicine, stanford, ca, united states of america
center for international earth science information network (ciesin), columbia university, palisades, ny, united states of america
national snow and ice data center, boulder, co, united states of america
orcid, inc., bethesda, md, united states of america
oregon health and science university, portland, or, united states of america
world wide web consortium (w c)/centrum wiskunde en informatica (cwi), amsterdam, netherlands
icsu committee on data for science and technology (codata), paris, france
solar data analysis center, nasa goddard space flight center, greenbelt, md, united states of america
public library of science, san francisco, ca, united states of america
european organization for nuclear research (cern), geneva, switzerland
columbia university libraries/information services, new york, ny, united states of america
sba research, vienna, austria
institute of software technology and interactive systems, vienna university of technology/tu wien, austria
american physical society, ridge, ny, united states of america
elsevier, oxford, united kingdom
harvard medical school, boston, ma, united states of america

submitted december
accepted february
published may
corresponding author: tim clark, tim_clark@harvard.edu
academic editor: harry hochheiser
doi . /peerj-cs.
distributed under creative commons public domain dedication
open access

abstract

reproducibility and reusability of research results is an important concern in scientific communication and science policy. a foundational element of reproducibility and reusability is the open and persistently available presentation of research data.
however, many common approaches for primary data publication in use today do not achieve sufficient long-term robustness, openness, accessibility or uniformity. nor do they permit comprehensive exploitation by modern web technologies. this has led to several authoritative studies recommending uniform direct citation of data archived in persistent repositories. data are to be considered as first-class schol- arly objects, and treated similarly in many ways to cited and archived scientific and scholarly literature. here we briefly review the most current and widely agreed set of principle-based recommendations for scholarly data citation, the joint declaration of data citation principles (jddcp). we then present a framework for operationalizing the jddcp; and a set of initial recommendations on identifier schemes, identifier how to cite this article starr et al. ( ), achieving human and machine accessibility of cited data in scholarly publications. peerj comput. sci. :e ; doi . /peerj-cs. mailto:tim_clark@harvard.edu https://peerj.com/academic-boards/editors/ https://peerj.com/academic-boards/editors/ http://dx.doi.org/ . /peerj-cs. http://dx.doi.org/ . /peerj-cs. http://creativecommons.org/publicdomain/zero/ . / http://creativecommons.org/publicdomain/zero/ . / http://creativecommons.org/publicdomain/zero/ . / https://peerj.com/computer-science/ http://dx.doi.org/ . /peerj-cs. resolution behavior, required metadata elements, and best practices for realizing programmatic machine actionability of cited data. the main target audience for the common implementation guidelines in this article consists of publishers, scholarly organizations, and persistent data repositories, including technical staff members in these organizations. but ordinary researchers can also benefit from these recommen- dations. 
the guidance provided here is intended to help achieve widespread, uniform human and machine accessibility of deposited data, in support of significantly improved verification, validation, reproducibility and re-use of scholarly/scientific data.

subjects: human–computer interaction, data science, digital libraries, world wide web and web science

keywords: data citation, machine accessibility, data archiving, data accessibility

introduction

background

an underlying requirement for verification, reproducibility, and reusability of scholarship is the accurate, open, robust, and uniform presentation of research data. this should be an integral part of the scholarly publication process. (note: robust citation of archived methods and materials, particularly highly variable materials such as cell lines, engineered animal models, etc., and of software, are important questions not dealt with here. see vasilevsky et al. ( ) for an excellent discussion of this topic for biological reagents.) however, alsheikh-ali et al. ( ) found that a large proportion of research articles in high-impact journals either weren’t subject to or didn’t adhere to any data availability policies at all. we note as well that such policies are not currently standardized across journals, nor are they typically optimized for data reuse. this finding reinforces significant concerns recently expressed in the scientific literature about reproducibility and whether many false positives are being reported as fact (colquhoun, ; rekdal, ; begley & ellis, ; prinz, schlange & asadullah, ; greenberg, ; ioannidis, ).

data transparency and open presentation, while central notions of the scientific method along with their complement, reproducibility, have met increasing challenges as dataset sizes grow far beyond the capacity of printed tables in articles. an extreme example is the case of dna sequencing data.
this was one of the first classes of data, along with crystallographic data, for which academic publishers began to require database accession numbers as a condition of publishing, as early as the ’s. at that time sequence data could actually still be published as text in journal articles. the atlas of protein sequence and structure, published from to , was the original form in which protein sequence data was compiled: a book, which could be cited (strasser, ). today the data volumes involved are absurdly large (salzberg & pop, ; shendure & ji, ; stein, ). similar transitions from printed tabular data to digitized data on the web have taken place across disciplines.

reports from leading scholarly organizations have now recommended a uniform approach to treating research data as first-class research objects, similarly to the way textual publications are archived, indexed, and cited (codata-icsti task group , ; altman & king, ; uhlir, ; ball & duke, ). uniform citation of robustly archived, described, and identified data in persistent digital repositories is proposed as an important step towards significantly improving the discoverability, documentation, validation, reproducibility, and reuse of scholarly data (codata-icsti task group , ; altman & king, ; uhlir, ; ball & duke, ; goodman et al., ; borgman, ; parsons, duerr & minster, ).

the joint declaration of data citation principles (jddcp) (data citation synthesis group, ) is a set of top-level guidelines developed by several stakeholder organizations as a formal synthesis of current best-practice recommendations for common approaches to data citation. it is based on significant study by participating groups and independent scholars.
the work of this group was hosted by the force (http://force .org) community, an open forum for discussion and action on important issues related to the future of research communication and e-scholarship. (note: individuals representing the following organizations participated in the jddcp development effort: biomed central; california digital library; codata-icsti task group on data citation standards and practices; columbia university; creative commons; datacite; digital science; elsevier; european molecular biology laboratories/european bioinformatics institute; european organization for nuclear research (cern); federation of earth science information partners (esip); force .org; harvard institute for quantitative social sciences; icsu world data system; international association of stm publishers; library of congress (us); massachusetts general hospital; mit libraries; nasa solar data analysis center; the national academies (us); openaire; rensselaer polytechnic institute; research data alliance; science exchange; national snow and ice data center (us); natural environment research council (uk); national academy of sciences (us); sba research (at); national information standards organization (us); university of california, san diego; university of leuven/ku leuven (nl); university of oxford; vu university amsterdam; world wide web consortium (digital publishing activity). see https://www.force .org/datacitation/workinggroup for details.) the jddcp is the latest development in a collective process, reaching back to at least , to raise the importance of data as an independent scholarly product and to make data transparently available for verification and reproducibility (altman & crosas, ).

the purpose of this document is to outline a set of common guidelines to operationalize jddcp-compliant data citation, archiving, and programmatic machine accessibility in a way that is as uniform as possible across conforming repositories and associated data citations.
the recommendations outlined here were developed as part of a community process by participants representing a wide variety of scholarly organizations, hosted by the force data citation implementation group (dcig) (https://www.force .org/datacitationimplementation). this work was conducted over a period of approximately one year beginning in early as a follow-on activity to the completed jddcp.

why cite data?

data citation is intended to help guard the integrity of scholarly conclusions and provides a basis for integrating exponentially growing datasets into new forms of scholarly publishing. both of these goals require the systematic availability of primary data in both machine- and human-tractable forms for re-use. a systematic review of current approaches is provided in codata-icsti task group ( ).

three common practices in academic publishing today block the systematic reuse of data. the first is the citation of primary research data in footnotes, typically either of the form, “data is available from the authors upon request”, or “data is to be found on the authors’ laboratory website, http://example.com”. the second is publication of datasets as “supplementary file” or “supplementary data” pdfs where data is given in widely varying formats, often as graphical tables, and which in the best case must be laboriously screen-scraped for re-use. the third is simply failure in one way or another to make the data available at all.

integrity of conclusions (and assertions generally) can be guarded by tying individual assertions in text to the data supporting them. this is done already, after a fashion, for image data in molecular biology publications where assertions based on primary data contained in images typically directly cite a supporting figure within the text
containing the image. several publishers (e.g., plos, nature publications, and faculty of ) already partner with data archives such as figshare (http://figshare.com), dryad (http://datadryad.org/), dataverse (http://dataverse.org/), and others to archive images and other research data.
citing data also helps to establish the value of the data’s contribution to research. moving to a cross-discipline standard for acknowledging the data allows researchers to justify continued funding for their data collection efforts (uhlir, ; codata-icsti task group , ). well-defined standards allow bibliometric tools to find unanticipated uses of the data. current analysis of data use is a laborious process and is rarely performed for disciplines outside those considered the data’s core audience (accomazzi et al., ).

the eight core principles of data citation

the eight principles below have been endorsed by scholarly societies, publishers and other institutions. (note: these organizations include the american physical society, association of research libraries, biomed central, codata, crossref, datacite, dataone, data registration agency for social and economic data, elixir, elsevier, european molecular biology laboratories/european bioinformatics institute, leibniz institute for the social sciences, inter-university consortium for political and social research, international association of stm publishers, international union of biochemistry and molecular biology, international union of crystallography, international union of geodesy and geophysics, national information standards organization (us), nature publishing group, openaire, plos (public library of science), research data alliance, royal society of chemistry, swiss institute of bioinformatics, cambridge crystallographic data centre, thomson reuters, and the university of california curation center (california digital library).) such a wide endorsement by influential groups reflects, in our view,
the meticulous work involved in preparing the key supporting studies (by codata, the national academies, and others (codata-icsti task group , ; uhlir, ; ball & duke, ; altman & king, ) and in harmonizing the principles; and supports the validity of these principles as foundational requirements for improving the scholarly publication ecosystem. • principle —importance: “data should be considered legitimate, citable products of research. data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.” • principle —credit and attribution: “data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.” • principle —evidence: “in scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.” • principle —unique identification: “a data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.” • principle —access: “data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.” • principle —persistence: “unique identifiers, and metadata describing the data, and its disposition, should persist—even beyond the lifespan of the data they describe.” • principle —specificity and verifiability: “data citations should facilitate identifica- tion of, access to, and verification of the specific data that support a claim. 
citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific time slice, version and/or granular portion of data retrieved subsequently is the same as was originally cited.”
• principle (interoperability and flexibility): “citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities.”

these principles are meant to be adopted at an institutional or discipline-wide scale. the main target audience for the common implementation guidelines in this article consists of publishers, scholarly organizations, and persistent data repositories. individual researchers are not meant to set up their own data archives. in fact this is contrary to one goal of data citation as we see it, which is to get away from inherently unstable citations via researcher footnotes indicating data availability at some intermittently supported laboratory website.

however, individual researchers can contribute to and benefit from adoption of these principles by ensuring that primary research data is prepared for archival deposition at or before publication. we also note that a researcher will often want to go back to earlier primary data from their own lab; robust archiving ensures it will remain available for their own future use, whatever the vicissitudes of local storage and lab personnel turnover.

implementation questions arising from the jddcp

the jddcp were presented by their authors as principles. implementation questions were left unaddressed. this was meant to keep the focus on harmonizing top-level, goal-oriented recommendations without incurring implementation-level distractions. therefore we organized a follow-on activity to produce a set of implementation guidelines intended to promote rapid, successful, and uniform jddcp adoption. we began by seeking to understand just what questions would arise naturally to an organization that wished to implement the jddcp. we then grouped the questions into four topic areas, to be addressed by individuals with special expertise in each area.
1. document data model: how should publishers adapt their document data models to support direct citation of data?
2. publishing workflows: how should publishers change their editorial workflows to support data citation? what do publisher data deposition and citation workflows look like where data is being cited today, such as in nature scientific data or gigascience?
3. common repository application program interfaces (apis): are there any approaches that can provide standard programmatic access to data repositories for data deposition, search and retrieval?
4. identifiers, metadata, and machine accessibility: what identifier schemes, identifier resolution patterns, standard metadata, and machine programmatic accessibility patterns are recommended for directly cited data?

the document data model group noted that publishers use a variety of xml schemas (bray et al., ; gao, sperberg-mcqueen & thompson, ; peterson et al., ) to model scholarly articles. however, there is a relevant national information standards organization (niso) specification, niso z . - , which is increasingly used by publishers, and is the archival form for biomedical publications in pubmed central. this niso z . - is derived from the former “nlm-dtd” model originally developed by the us national library of medicine. this group therefore developed a proposal for revision of the niso journal article tag suite to support direct data citation. niso-jats version . d (national center for biotechnology information, ), a revision based on this proposal, was released on december , , by the jats standing committee, and is considered a stable release, although it is not yet an official revision of the niso z . - standard.
the publishing workflows group met jointly with the research data alliance’s publishing data workflows working group to collect and document exemplar publishing workflows. an article on this topic is in preparation, reviewing basic requirements and exemplar workflows from nature scientific data, gigascience (biomed central), f research, and geoscience data journal (wiley).

the common repository apis group is currently planning a pilot activity for a common api model for data repositories. recommendations will be published at the conclusion of the pilot. this work is being undertaken jointly with the elixir (http://www.elixir-europe.org/) fairport working group.

the identifiers, metadata, and machine accessibility group’s recommendations are presented in the remainder of this article. these recommendations cover:
• definition of machine accessibility;
• identifiers and identifier schemes;
• landing pages;
• minimum acceptable information on landing pages;
• best practices for dataset description; and
• recommended data access methods.

recommendations for achieving machine accessibility

what is machine accessibility?

machine accessibility of cited data, in the context of this document and the jddcp, means access by well-documented web services (booth et al., ), preferably restful web services (fielding, ; fielding & taylor, ; richardson & ruby, ), to data and metadata stored in a robust repository, independently of integrated browser access by humans.

web services are methods of program-to-program communication using web protocols. the world wide web consortium (w c, http://www.w .org) defines them as “software system[s] designed to support interoperable machine-to-machine interaction over a network” (haas & brown, ). web services are always “on” and function essentially as utilities, providing services such as computation and data lookup, at web service endpoints.
these are well-known web addresses, or uniform resource identifiers (uris) (berners-lee, fielding & masinter, ; jacobs & walsh, ). uris are very similar in concept to the more widely understood uniform resource locators (url, or “web address”), but uris do not specify the location of an object or service; they only identify it. uris specify abstract resources on the web. the associated server is responsible for resolving a uri to a specific physical resource, if the resource is resolvable. (uris may also be used to identify physical things such as books in a library, which are not directly resolvable resources on the web.)
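to make the idea of program-to-program access at an endpoint uri concrete, the following is a minimal sketch in python. it is not taken from the jddcp or from any particular repository: the endpoint uri, the choice of json as the response format, and the function names are all assumptions made for illustration. the request is only constructed, not sent, so the sketch runs without network access.

```python
import json
from urllib.request import Request


def build_metadata_request(endpoint_uri: str) -> Request:
    """Build (but do not send) a GET request asking an endpoint for JSON."""
    return Request(
        endpoint_uri,
        headers={"Accept": "application/json"},  # assumed response format
        method="GET",
    )


def parse_metadata(body: bytes) -> dict:
    """Decode a JSON response body into a Python dictionary."""
    return json.loads(body.decode("utf-8"))


# hypothetical endpoint uri, used only for illustration
request = build_metadata_request("https://repository.example.org/api/datasets/abc")
print(request.get_method(), request.full_url)

# hypothetical response body standing in for what a service might return
print(parse_metadata(b'{"title": "example dataset"}'))
```

a client would pass the request object to `urllib.request.urlopen` to perform the call; the point of the sketch is only that the program, not a browser, identifies the resource by uri and states the representation it wants.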
restful web services follow the rest (representational state transfer) architecture developed by fielding and others (fielding, ). they support a standard set of operations such as “get” (retrieve), “post” (create), and “put” (create or update) and are highly useful in building hypermedia applications by combining services from many programs distributed on various web servers.

machine accessibility, and particularly restful web service accessibility, is highly desirable because it enables construction of “lego block” style programs built up from various service calls distributed across the web, which need not be replicated locally.

restful web services are recommended over the other major web service approach, soap interfaces (gudgin et al., ), due to our focus on the documents being served and their content. rest also allows multiple data formats such as json (javascript object notation) (ecma, ), and provides better support for mobile applications (e.g., caching, reduced bandwidth, etc.).

clearly, “machine accessibility” is also an underlying prerequisite to human accessibility, as browser (client) access to remote data is always mediated by machine-to-machine communication. but for flexibility in construction of new programs and services, it needs to be independently available apart from access to data generated from the direct browser calls.

unique identification

unique identification in a manner that is machine-resolvable on the web and demonstrates a long-term commitment to persistence is fundamental to providing access to cited data and its associated metadata. there are several identifier schemes on the web that meet these two criteria. the best identifiers for data citation in a particular community of practice will be those that meet these criteria and are widely used in that community.
our general recommendation, based on the jddcp, is to use any currently available identifier scheme that is machine actionable, globally unique, and widely (and currently) used by a community, and that has demonstrated a long-term commitment to persistence. best practice, given the preceding, is to choose a scheme that is also cross-discipline. machine actionable in this context means resolvable on the web by web services.

there are basically two kinds of identifier schemes available: (a) the native http and https schemes where uris are the identifiers and address resolution occurs natively; and (b) schemes requiring a resolving authority, like digital object identifiers (dois).

resolving authorities reside at well-known web addresses. they issue and keep track of identifiers in their scheme and resolve them by translating them to uris which are then natively resolved by the web. for example, the doi . /rsos. when appended to the doi resolver at http://doi.org, resolves to the uri http://rsos.royalsocietypublishing.org/content/ / / . similarly, the biosample identifier sameg , when appended as (“biosample/sameg ”) to the identifiers.org resolver at http://identifiers.org, resolves to the landing page www.ebi.ac.uk/biosamples/group/sameg .

however resolved, a cited identifier should continue to resolve to an intermediary landing page (see below) even if the underlying data has been de-accessioned or is otherwise unavailable.
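the two kinds of scheme described above can be sketched as a small lookup: native http/https identifiers are returned unchanged, while identifiers that need a resolving authority are appended to that authority's well-known address. this is an illustrative sketch only, not an implementation used by any resolver. the resolver addresses are the two named in the text; the function name, the table name, and the example identifiers are hypothetical, since the identifiers in the text are incomplete.

```python
from urllib.parse import quote

# resolving authorities named in the text, keyed by a label chosen for this sketch
RESOLVERS = {
    "doi": "http://doi.org/",
    "identifiers.org": "http://identifiers.org/",
}


def resolution_uri(scheme: str, identifier: str) -> str:
    """Return the URI a client would dereference for a cited identifier."""
    scheme = scheme.lower()
    if scheme in ("http", "https"):
        # native scheme: the identifier already is the URI
        return identifier
    if scheme not in RESOLVERS:
        raise ValueError(f"no resolver registered for scheme {scheme!r}")
    # append the (percent-encoded) identifier to the resolver address
    return RESOLVERS[scheme] + quote(identifier, safe="/:.")


# hypothetical identifiers, used only for illustration
print(resolution_uri("doi", "10.1234/example-dataset"))
print(resolution_uri("identifiers.org", "biosample/EXAMPLE0001"))
print(resolution_uri("https", "https://data.example.org/ds/1"))
```

the resolver then redirects the client to the current landing page, which is what lets the cited identifier stay stable while the underlying location changes.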
table examples of identifier schemes meeting jddcp criteria.

identifier scheme | full name | authority | resolution uri
datacite doi (as uri) | datacite-assigned digital object identifier | datacite | http://dx.doi.org
crossref doi (as uri) | crossref-assigned digital object identifier | crossref | http://dx.doi.org
identifiers.org uri | identifiers.org-assigned uniform resource identifier | identifiers.org | http://identifiers.org
https uri | http or https uniform resource identifier | domain name owner | n/a
purl | persistent uniform resource locator | online computer library center (oclc) | http://purl.org
handle (hdl) | handle system hdl | corporation for national research initiatives (cnri) | http://handle.net
ark | archival resource key | name assigning or mapping authorities (various) [a] | http://n t.net; name mapping authorities
nbn | national bibliographic number | various | various

notes. [a] registries maintained at california digital library, bibliothèque nationale de france and national library of medicine.

by a commitment to persistence, we mean that (a) if a resolving authority is required, that authority has demonstrated a reasonable chance to be present and functional in the future; (b) the owner of the domain or the resolving authority has made a credible commitment to ensure that its identifiers will always resolve. a useful survey of persistent identifier schemes appears in hilse & kothe ( ).

examples of identifier schemes meeting jddcp criteria for robustly accessible data citation are shown in table and described below. this is not a comprehensive list and the criteria above should govern. table summarizes the approaches to achieving and enforcing persistence, and actions on object (data) removal from the archive, of each of the schemes. the subsections below briefly describe the exemplar identifier schemes shown in tables and .

digital object identifiers (dois)
digital object identifiers are an identification system originally developed by trade associations in the publishing industry for digital content over the internet.
they were developed in partnership with the corporation for national research initiatives (cnri), and built upon cnri’s handle system as an underlying network component. however, dois may identify digital objects of any type, certainly including data (international doi foundation, ). doi syntax is defined as a us national information standards organization standard, ansi/niso z . - . dois may be expressed as uris by prefixing the doi with a resolution address: http://dx.doi.org/<doi>. doi registration agencies provide services for registering dois along with descriptive metadata on the object being identified. the doi system proxy server allows programmatic access to doi name resolution using http (international doi foundation, ). datacite and crossref are the two doi registration agencies of special relevance to data citation. they provide services for registering and resolving identifiers for cited data.
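the prefixing rule for expressing a doi as a uri can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function name `doi_to_uri`, the handling of a leading "doi:" label, and the sample doi are hypothetical and not taken from the recommendations; the resolution address is the one given in the text.

```python
from urllib.parse import quote

# Resolution address as given in the text.
DOI_RESOLVER = "http://dx.doi.org/"


def doi_to_uri(doi: str, resolver: str = DOI_RESOLVER) -> str:
    """Express a DOI as a URI by prefixing it with a resolution address.

    Hypothetical helper for illustration only.
    """
    doi = doi.strip()
    # Tolerate a leading "doi:" label, which is not part of the DOI name.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # DOI names start with the "10." directory indicator and contain a
    # "/" separating prefix and suffix; reject anything else early.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping
    # the prefix/suffix separator intact.
    return resolver.rstrip("/") + "/" + quote(doi, safe="/")


if __name__ == "__main__":
    # "10.1234/example.5678" is a made-up DOI used only as sample input.
    print(doi_to_uri("doi:10.1234/example.5678"))
```

the resulting uri can then be dereferenced with any http client; what is returned depends on the registration agency and the registrant.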
table identifier scheme persistence and object removal behavior.

identifier scheme | achieving persistence | enforcing persistence | action on object removal
datacite doi | registration with contract [a] | link checking | datacite contacts owners; metadata should persist
crossref doi | registration with contract [b] | link checking | crossref contacts owners per policy [c]; metadata should persist
identifiers.org uri | registration | link checking | metadata should persist
https uri | domain owner responsibility | none | domain owner responsibility
purl uri | registration | none | domain owner responsibility
handle (hdl) | registration | none | identifier should persist
ark | user-defined policies | hosting server | host-dependent; metadata should persist [d]
nbn | ietf rfc | domain resolver | metadata should persist

notes.
[a] the datacite persistence contract language reads: “objects assigned dois are stored and managed such that persistent access to them can be provided as appropriate and maintain all urls associated with the doi.”
[b] the crossref persistence contract language reads in part: “member must maintain each digital identifier assigned to it or for which it is otherwise responsible such that said digital identifier continuously resolves to a response page... containing no less than complete bibliographic information about the corresponding original work (including without limitation the digital identifier), visible on the initial page, with reasonably sufficient information detailing how the original work can be acquired and/or a hyperlink leading to the original works itself...”
[c] crossref identifier policy reads: “the... member shall use the digital identifier as the permanent url link to the response page. the... member shall register the url for the response page with crossref, shall keep it up-to-date and active, and shall promptly correct any errors or variances noted by crossref.”
[d] for example, the french national library has rigorous internal checks for the million arks that it manages via its own resolver.

both require persistence commitments of their registrants and take active steps to monitor compliance. datacite is specifically designed, as its name would indicate, to support data citation. a recent collaboration between the software archive github, the zenodo repository system at cern, figshare, and mozilla science lab now makes it possible to cite software, giving dois to github-committed code (github guides, ).

handle system (hdls)
handles are identifiers in a general-purpose global name service designed for securely resolving names over the internet, compatible with but not requiring the domain name service. handles are location independent and persistent.
the system was developed by bob kahn at the corporation for national research initiatives, and currently supports, on average, million resolution requests per month, the largest single user being the digital object identifier (doi) system. handles can be expressed as uris (cnri, ; dyson, ).

identifiers.org uniform resource identifiers (uris)
many common identifiers used in the life sciences, such as pubmed or protein data bank ids, are not natively web-resolvable. identifiers.org associates such database-dependent identifiers with persistent uris and resolvable physical urls. identifiers.org was developed and is maintained at the european bioinformatics institute, and was built on top of the miriam registry (juty, le novère & laibe, ). identifiers.org uris are constructed using the syntax http://identifiers.org/<data resource name>/<native identifier>, where <data resource name> designates a particular database, and <native identifier> is the id used within that database to retrieve the record. the identifiers.org resolver supports multiple alternative locations (which may or may not be mirrors) for data it identifies. it supports programmatic access to data.

purls
purls are “persistent uniform resource locators”, a system originally developed by the online computer library center (oclc). they act as intermediaries between potentially changing locations of digital resources, to which the purl name resolves. purls are registered and resolved at http://purl.org, http://purl.access.gpo.gov, http://purl.bioontology.org and various other resolvers. purls are implemented as an http redirection service and depend on the survival of their host domain name (oclc, ; library of congress, ). purls fail to resolve upon object removal.
handling this behavior through a metadata landing page (see below) is the responsibility of the owner of the cited object.

http uris
uris (uniform resource identifiers) are strings of characters used to identify resources. they are the identifier system for the web. uris begin with a scheme name, such as http or ftp or mailto, followed by a colon, and then a scheme-specific part. http uris will be quite familiar as they are typed every day into browser address bars, and begin with http:. their scheme-specific part is next, beginning with “//”, followed by an identifier, which often but not always is resolvable to a specific resource on the web. uris by themselves have no mechanism for storing metadata about any objects to which they are supposed to resolve, nor do they have any particular associated persistence policy. however, other identifier schemes with such properties, such as dois, are often represented as uris for convenience (berners-lee, fielding & masinter, ; jacobs & walsh, ). like purls, native http uris fail to resolve upon object removal. handling this behavior through a metadata landing page (see below) is the responsibility of the owner of the cited object.

archival resource key (arks)
archival resource keys are unique identifiers designed to support long-term persistence of information objects. an ark is essentially a url (uniform resource locator) with some additional rules. for example, hostnames are excluded when comparing arks in order to prevent current hosting arrangements from affecting identity. the maintenance agency is the california digital library, which offers a hosted service for arks and dois (kunze & starr, ; kunze, ; kunze & rodgers, ; janée, kunze & starr, ). arks provide access to three things: an information object; related metadata; and the provider’s persistence commitment.
arks propose inflections (changing the end of an identifier) as a way to retrieve machine-readable metadata without requiring (or prohibiting) content negotiation for linked data applications. unlike, for example, dois, there are no fees to assign arks, which can be hosted on an organization’s own web server if desired. they are globally resolvable via the identifier-scheme-agnostic n t (name-to-thing, http://n t.net) resolver. the ark registry is replicated at the california digital library, the bibliothèque nationale de france, and the us national library of medicine (kunze & starr, ; peyrard, kunze & tramoni, ; kunze, ).

national bibliography number (nbns)
national bibliography numbers are a set of related publication identifier systems with country-specific formats and resolvers, utilized by national library systems in some countries. they are used by, for example, germany, sweden, finland and italy, for publications in national archives without publisher-assigned identifiers such as isbns. there is a urn namespace for nbns that includes the country code; expressed as a urn, nbns become globally unique (hakala, ; moats, ).

landing pages
the identifier included in a citation should point to a landing page or set of pages rather than to the data itself (hourclé et al., ; rans et al., ; clark, evans & strollo, ), and the landing page should persist even if the data is no longer accessible. by “landing page(s)” we mean a set of information about the data via both structured metadata and unstructured text and other information.

there are three main reasons to resolve identifiers to landing pages rather than directly to data. first, as proposed in the jddcp, the metadata and the data may have different lifespans, the metadata potentially surviving the data. this is true because data storage imposes costs on the hosting organization.
just as printed volumes in a library may be de-accessioned from time to time, based on considerations of their value and timeliness, so will datasets. the jddcp proposes that metadata, essentially cataloging information on the data, should still remain a citable part of the scholarly record even when the dataset may no longer be available.

second, the cited data may not be legally available to all, even when initially accessioned, for reasons of licensing or confidentiality (e.g. protected health information). the landing page provides a method to host metadata even if the data is no longer present. it also provides a convenient place where access credentials can be validated.

third, resolution to a landing page allows for an access point that is independent from any multiple encodings of the data that may be available.

landing pages should contain the following information. items marked “conditional” are recommended if the conditions described are present, e.g., access controls are required to be implemented if required by licensing or phi considerations; multiple versions are required to be described if they are available; etc.

• (recommended) dataset descriptions: the landing page must provide descriptions of the datasets available, and information on how to programmatically retrieve data where a user or device is so authorized. (see dataset description for formats.)
• (conditional) versions: what versions of the data are available, if there is more than one version that may be accessed.
• (optional) explanatory or contextual information: provide explanations, contextual guidance, caveats, and/or documentation for data use, as appropriate.
• (conditional) access controls: access controls based on content licensing, protected health information (phi) status, institutional review board (irb) authorization, embargo, or other restrictions, should be implemented here if they are required.
• (recommended) persistence statement: reference to a statement describing the data and metadata persistence policies of the repository should be provided at the landing page. data persistence policies will vary by repository but should be clearly described. (see persistence guarantee for recommended language.)
• (recommended) licensing information: information regarding licensing should be provided, with links to the relevant licensing or waiver documents as required (e.g., creative commons cc waiver description (https://creativecommons.org/publicdomain/zero/ . /), or other relevant material).
• (conditional) data availability and disposition: the landing page should provide information on the availability of the data if it is restricted, or has been de-accessioned (i.e., removed from the archive). as stated in the jddcp, metadata should persist beyond de-accessioning.
• (optional) tools/software: what tools and software may be associated or useful with the datasets, and how to obtain them (certain datasets are not readily usable without specific software).

content encoding on landing pages
landing pages should provide both human-readable and machine-readable content.

• html; that is, the native browser-interpretable format used to generate a graphical and/or language-based display in a browser window, for human reading and understanding.
• at least one non-proprietary machine-readable format; that is, a content format with a fully specified syntax capable of being parsed by software without ambiguity, at a data element level. options: xml, json/json-ld, rdf (turtle, rdf-xml, n-triples, n-quads), microformats, microdata, rdfa.
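as a rough illustration of the machine-readable option, the landing page items listed above could be held in a simple record and serialized as json. this is a minimal sketch, not a prescribed schema: the key names, the `render_landing_page_json` helper, and all example.org values are assumptions made for the example.

```python
import json

# Hypothetical landing-page record. Key names and values are illustrative
# assumptions; the recommendations do not define this schema.
EXAMPLE_RECORD = {
    "dataset_descriptions": [
        {"title": "example dataset", "access_url": "https://example.org/data/1"}
    ],
    "versions": ["1.0"],                       # conditional item
    "persistence_statement": "https://example.org/persistence-policy",
    "license": "https://example.org/license",
    "availability": "available",               # or "restricted", "de-accessioned"
}

# Items marked "recommended" in the list above.
RECOMMENDED_KEYS = ("dataset_descriptions", "persistence_statement", "license")


def render_landing_page_json(record: dict) -> str:
    """Serialize a landing-page record as JSON, checking recommended items."""
    missing = [key for key in RECOMMENDED_KEYS if key not in record]
    if missing:
        raise ValueError(f"missing recommended items: {missing}")
    # sort_keys gives a stable output that is easy to diff and to test.
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_landing_page_json(EXAMPLE_RECORD))
```

json is used here only because it is the shortest of the listed options to show; the same record could equally be expressed in xml or an rdf serialization.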
best practices for dataset description
minimally the following metadata elements should be present in dataset descriptions:

• dataset identifier: a machine-actionable identifier resolvable on the web to the dataset.
• title: the title of the dataset.
• description: a description of the dataset, with more information than the title.
• creator: the person(s) and/or organizations who generated the dataset and are responsible for its integrity.
• publisher/contact: the organization and/or contact who published the dataset and is responsible for its persistence.
• publicationdate/year/releasedate: iso standard dates are preferred (klyne & newman, ).
• version: the dataset version identifier (if applicable).

additional recommended metadata elements in dataset descriptions are:

• creator identifier(s): orcid or other unique identifier of the individual creator(s). orcid ids are numbers identifying individual researchers issued by a consortium of prominent academic publishers and others (editors, ; maunsell, ).
• license: the license or waiver under which access to the content is provided (preferably a link to standard license/waiver text, e.g. https://creativecommons.org/publicdomain/zero/ . /).
when multiple datasets are available on one landing page, licensing information may be grouped for all relevant datasets.

a world wide web consortium (http://www.w .org) standard for machine-accessible dataset description on the web is the w c data catalog vocabulary (dcat; mali, erickson & archer, ). it was developed at the digital enterprise research institute and later standardized by the w c egovernment working group, with broad participation, and underlies some other data interoperability models such as those of the dcat application profile working group ( ) and gray et al. ( ).

the w c health care and life sciences dataset description specification (gray et al., ), currently in editor’s draft status, provides capability to add additional useful metadata beyond the dcat vocabulary. this is an evolving standard that we suggest for provisional use.

data in the described datasets might also be described using other formats depending on the application area. other possible approaches for dataset description include datacite metadata (datacite metadata working group, ), dublin core (dublin core metadata initiative, ), the data documentation initiative (ddi) (data documentation initiative, ) for social sciences, or iso (iso/tc , ) for geographic information. where any of these formats are used they should support at least the minimal set of recommended metadata elements described above.

serving the landing pages
the uris used as identifiers for citation should resolve to html landing pages with the appropriate metadata in a human readable form. to enable automated agents to extract the metadata, these landing pages should include an html <link> element specifying a machine readable form of the page as an alternative. for those that are capable of doing so, we recommend also using web linking (nottingham, ) to provide this information from all of the alternative formats.
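the two mechanisms just described, an html <link> element in the landing page and a web linking `Link` header on the http response, can be sketched as follows. this is an illustrative sketch only: the helper names, the example.org urls, and the choice of media types are assumptions, and `rel="alternate"` is used as the standard link relation for an alternative representation of the same page.

```python
from html import escape


def html_link_element(href: str, media_type: str) -> str:
    """Build an HTML <link> element advertising an alternate, machine-readable
    form of the landing page. Hypothetical helper for illustration."""
    return '<link rel="alternate" type="{}" href="{}">'.format(
        escape(media_type, quote=True), escape(href, quote=True)
    )


def http_link_header(alternates) -> str:
    """Build the value of an HTTP Link header (Web Linking) listing every
    alternate form as (href, media_type) pairs."""
    parts = [
        '<{}>; rel="alternate"; type="{}"'.format(href, media_type)
        for href, media_type in alternates
    ]
    return ", ".join(parts)


if __name__ == "__main__":
    # Made-up locations of two machine-readable forms of one landing page.
    alternates = [
        ("https://example.org/ds/1.json", "application/json"),
        ("https://example.org/ds/1.ttl", "text/turtle"),
    ]
    for href, media_type in alternates:
        print(html_link_element(href, media_type))
    print("Link: " + http_link_header(alternates))
```

the <link> elements go in the head of the html landing page, while the header value is sent with the response for any of the formats, so an agent that retrieved one form can discover the others.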
should content management systems be developed specifically for maintaining and serving landing pages, we recommend both of these solutions plus the use of content negotiation (holtzman & mutz, ). a more detailed discussion of these techniques and our justification for using multiple solutions is included in the appendix. note that in all of these cases, the alternates are other forms of the landing page. access to the data itself should be indicated through the dcat fields accessurl or downloadurl as appropriate for the data. data that is spread across multiple files can be indicated by linking to an ore resource map (lagoze & van de sompel, ).

persistence guarantees
the topic of persistence guarantees is important from the standpoint of what repository owners and managers should provide to support jddcp-compliant citable persistent data.
it is closely related to the question of persistent identifiers, that is, the identifiers must always resolve somewhere, and as noted above, this should be to a landing page. but in the widest sense, persistence is a matter of service guarantees. organizations providing trusted repositories for citable data need to detail their persistence policies transparently to users. we recommend that all organizations endorsing the jddcp adopt a persistence guarantee for data and metadata based on the following template: “[organization/institution name] is committed to maintaining persistent identifiers in [repository name] so that they will continue to resolve to a landing page providing meta- data describing the data, including elements of stewardship, provenance, and availability. [organization/institution name] has made the following plan for organizational persis- tence and succession: [plan].” as noted in the landing pages section, when data is de-accessioned, the landing page should remain online, continuing to provide persistent metadata and other information including a notation on data de-accessioning. authors and scholarly article publishers will decide on which repositories meet their persistence and stewardship requirements based on the guarantees provided and their overall experience in using various repositories. guarantees need to be supported by operational practice. implementation: stakeholder responsibilities research communications are made possible by an ecosystem of stakeholders who prepare, edit, publish, archive, fund, and consume them. each stakeholder group endorsing the jddcp has, we believe, certain responsibilities regarding implementation of these recommendations. they will not all be implemented at once, or homogeneously. but careful adherence to these guidelines and responsibilities will provide a basis for achieving the goals of uniform scholarly data citation. . 
1. archives and repositories: (a) identifiers, (b) resolution behavior, (c) landing page metadata elements, (d) dataset description and (e) data access methods should all conform to the technical recommendations in this article.
2. registries: registries of data repositories such as databib (http://databib.org) and r data (http://www.re data.org) should document repository conformance to these recommendations as part of their registration process, and should make this information readily available to researchers and the public. this also applies to lists of “recommended” repositories maintained by publishers, such as those maintained by nature scientific data (http://www.nature.com/sdata/data-policies/repositories) and f research (http://f research.com/for-authors/data-guidelines).
3. researchers: researchers should treat their original data as first-class research objects. they should ensure it is deposited in an archive that adheres to the practices described here. we also encourage authors to publish preferentially with journals which implement these practices.
4. funding agencies: agencies and philanthropies funding research should require that recipients of funding follow the guidelines applicable to them.
5. scholarly societies: scholarly societies should strongly encourage adoption of these practices by their members and by publications that they oversee.
6. academic institutions: academic institutions should strongly encourage adoption of these practices by researchers appointed to them and should ensure that any institutional repositories they support also apply the practices relevant to them.

starr et al. ( ), peerj comput. sci., doi . /peerj-cs.

conclusion

these guidelines, together with the niso jats . d xml schema for article publishing (national center for biotechnology information, ), provide a working technical basis for implementing the joint data citation principles. they were developed by a cross-disciplinary group, the data citation implementation group (dcig, https://www.force .org/datacitationimplementation), hosted by the force .org digital scholarship community, during , as a follow-on project to the successfully concluded joint data citation principles effort.

force .org (http://force .org) is a community of scholars, librarians, archivists, publishers and research funders that has arisen organically to help facilitate the change toward improved knowledge creation and sharing. it is incorporated as a us (c) not-for-profit organization in california.

registries of data repositories such as r data (http://r data.org) and publishers’ lists of “recommended” repositories for cited data, such as those maintained by nature publications (http://www.nature.com/sdata/data-policies/repositories), should take ongoing note of repository compliance with these guidelines, and provide compliance checklists.

we are aware that some journals are already citing data in persistent public repositories, and yet not all of these repositories currently meet the guidelines we present here. compliance will be an incremental improvement task.

other deliverables from the dcig are planned for release in early , including a review of selected data-citation workflows from early-adopter publishers (nature, biomed central, wiley and faculty of ). the niso-jats version . d revision is now considered a stable release by the jats standing committee, and is under final review by the national information standards organization (niso) for approval as the updated ansi/niso z . - standard. we believe it is safe for publishers to use the . d revision for data citation now. a forthcoming article in this series will describe the jats revisions in detail.

we hope that publishing this document and others in the series will accelerate the adoption of data citation on a wide scale in the scholarly literature, to support open validation and reuse of results.

integrity of scholarly data is not a private matter, but is fundamental to the validity of published research. if data are not robustly preserved and accessible, the foundations of published research claims based upon them are not verifiable. as these practices and guidelines are increasingly adopted, it will no longer be acceptable to credibly assert any claims whatsoever that are not based upon robustly archived, identified, searchable and accessible data.

we welcome comments and questions, which should be addressed to the forcnet@googlegroups.com open discussion forum.

acknowledgements

we are particularly grateful to peerj academic editor harry hochheiser (university of pittsburgh), reviewer tim vines (university of british columbia), and two anonymous reviewers, for their careful, very helpful, and exceptionally timely comments on the first version of this article. many thanks as well to virginia clark (université paul sabatier), john kunze (california digital library) and maryann martone (university of california at san diego) for their thoughtful suggestions on content and presentation.

appendix

serving landing pages: implementation details

ideally, all versions of the landing page would be resolvable from a single uri through content negotiation (holtzman & mutz, ), serving an html representation for humans and the appropriate form for automated agents. in its simplest form, content negotiation uses the http accept and/or accept-language headers to vary the content returned based on media type (a.k.a. mime type) and language. ark-style inflections offer an alternate way to retrieve machine-readable metadata without requiring content negotiation. some web servers can serve alternate documents by using file names that vary only by extension; when the document is requested without an extension, the web server returns the file rated highest by the request’s accept header. enabling this feature typically requires the intervention of the web server administrator and thus may not be available to all publishers.
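the simplest form of content negotiation described above, varying the landing page representation by the accept header, can be sketched as follows. this is a simplified illustration rather than a complete implementation of the http rules: it ranks media ranges by their q weight only and ignores specificity ordering. the variant file names and the function names are hypothetical and do not come from the recommendations themselves.

```python
# Hypothetical mapping from media type to a stored landing page variant.
# Dict order doubles as the server's preference when the client has none.
LANDING_PAGE_VARIANTS = {
    "text/html": "landing.html",
    "application/xml": "landing.datacite.xml",
    "application/json": "landing.json",
}


def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def matches(media_range, media_type):
    """True if a range such as 'application/*' covers a concrete media type."""
    if media_range == "*/*":
        return True
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = media_type.partition("/")
    return r_type == m_type and r_sub in ("*", m_sub)


def select_variant(accept_header):
    """Pick the (media_type, file_name) pair to serve for an Accept header."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means the client refuses this range
        for media_type, file_name in LANDING_PAGE_VARIANTS.items():
            if matches(media_range, media_type):
                return media_type, file_name
    # Assumption: fall back to the human-readable page when nothing matches.
    default_type = "text/html"
    return default_type, LANDING_PAGE_VARIANTS[default_type]
```

a browser-style header such as `text/html,application/xml;q=0.9` selects the html page, while an agent sending `application/xml` receives the xml metadata variant from the same uri.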
the content negotiation standard also allows servers to assign arbitrary tags to documents, and user agents to request documents that match a given tag using the accept-features header. this could allow for selection between documents that use the same media type but different metadata standards.

although we believe that content negotiation is the best long-term solution for serving automated agents, it may require building systems to manage landing page content or adapting existing content management systems (cms). for a near-term solution, we recommend web linking (nottingham, ). web linking requires assigning a separate resolvable uri for each variant representation of the landing page. as each alternative has a uri, the documents can be cached reliably without requiring additional requests to the server hosting the landing pages. web linking also allows additional relationships to be defined, so that it can be used to direct automated agents to landing pages for related data as well as alternatives. web linking also allows for a title to be assigned to each link, should they be presented to a human:
link: <uri-to-an-alternate>; rel="alternate"; media="application/xml"; title="title"

we recommend including in the title the common names of the metadata schema(s) used, such as datacite or dcat, to allow automated agents to select the appropriate alternative. as an additional fallback, we also recommend using html <link> elements to duplicate the linking information in the html version of the landing page:

<link href="uri-to-an-alternate" rel="alternate" media="application/xml" title="title">

embedding the information in the html has the added benefit of keeping the alternate information attached if the landing page is downloaded from a standard web browser. this is not the case for web linking through http headers, nor for content negotiation. in addition, content negotiation may not send back the full list of alternatives without the user agent sending a negotiate: vlist header (shepherd et al., ).

because each of the three techniques has advantages over the others, we recommend a combination of the three approaches for maximum benefit, but acknowledge that some may take more effort to implement.
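the web linking header and the html <link> fallback described above carry the same information, so both can be generated from one list of alternates. the sketch below is illustrative only: the repository uris and function names are hypothetical, and titles are assumed to contain no double quotes. it also uses the `type` parameter, which the web linking specification defines as the hint for the target's media type, where the example above writes `media`; treat that substitution as an assumption of this sketch.

```python
from html import escape

# Hypothetical alternates for one landing page; the titles name the metadata schema.
ALTERNATES = [
    {
        "uri": "https://repository.example.org/dataset/123.datacite.xml",
        "type": "application/xml",
        "title": "datacite",
    },
    {
        "uri": "https://repository.example.org/dataset/123.dcat.rdf",
        "type": "application/rdf+xml",
        "title": "dcat",
    },
]


def link_header(alternates):
    """Build the value of an HTTP Link header listing every alternate."""
    values = [
        '<{uri}>; rel="alternate"; type="{type}"; title="{title}"'.format(**alt)
        for alt in alternates
    ]
    # Multiple links may share one header, separated by commas.
    return ", ".join(values)


def html_link_elements(alternates):
    """Build equivalent <link> elements for the head of the HTML landing page."""
    return "\n".join(
        '<link rel="alternate" href="{}" type="{}" title="{}">'.format(
            escape(alt["uri"], quote=True),
            escape(alt["type"], quote=True),
            escape(alt["title"], quote=True),
        )
        for alt in alternates
    )
```

a server would send `link_header(ALTERNATES)` as the link header of the html response and place the output of `html_link_elements(ALTERNATES)` in the page head, so the alternates survive when the page is saved from a browser.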
serving landing pages: linking to the data

note that the content being negotiated is the metadata description of the research data. the data being described should not be served via this description uri. instead, the landing page data descriptions should reference the data. if the data is available from a single file, directly available on the internet, use the dcat downloadurl to indicate the location of the data. if the data is available as a relatively small number of files, either as parts of the whole collection, mirrored at multiple locations, or as multiple packaged forms, link to an ore resource map (lagoze et al., ) to describe the relationships between the files. if the data requires authentication to access, use the dcat accessurl to indicate a page with instructions on how to request access to the data. this technique can also be used to describe the procedures on accessing physical samples or other non-digital data. if the data is available online but is excessive in volume, use the dcat accessurl to link to the appropriate search system to access the data. for data systems that are available either as bulk downloads or through sub-setting services, include both accessurl and downloadurl on the landing page.

additional information and declarations

funding

this work was funded in part by generous grants from the us national institutes of health and national aeronautics and space administration, the alfred p. sloan foundation, and the european union (fp ). support from the national institutes of health (nih) was provided via grant # nih u ai - in the big data to knowledge program, supporting the center for expanded data annotation and retrieval (cedar).
support from the national aeronautics and space administration (nasa) was provided under contract nng hq c for the continued operation of the socioeconomic data and applications center (sedac). support from the alfred p. sloan foundation was provided under two grants: a. grant # - - to the harvard institute for quantitative social sciences, “helping journals to upgrade data publication for reusable research”; and b. a grant to the california digital library, “clir/dlf postdoctoral fellowship in data curation for the sciences and social sciences”. the european union partially supported this work under the fp contracts # supporting the alliance for permanent access and # supporting digital preservation for timeless business processes and services. the funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

grant disclosures

the following grant information was disclosed by the authors:
national institutes of health (nih): # nih u ai - .
alfred p. sloan foundation: # - - .
european union (fp ): # , # .
national aeronautics and space administration (nasa): nng hq c.

competing interests

the authors declare there are no competing interests.

author contributions

• joan starr and tim clark conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.
• eleni castro, mercè crosas, michel dumontier, robert r. downs, ruth duerr, laurel l. haak, melissa haendel, ivan herman, simon hodson, joe hourclé, john ernest kratz, jennifer lin, lars holm nielsen, amy nurnberger, stefan proell, andreas rauber, simone sacchi, arthur smith and mike taylor performed the experiments, analyzed the data, performed the computation work, reviewed drafts of the paper.

references

accomazzi a, henneken e, erdmann c, rots a. . telescope bibliographies: an essential component of archival data management and operations. in: society of photo-optical instrumentation engineers (spie) conference series. vol. . article id k, pp doi . / . .
alsheikh-ali aa, qureshi w, al-mallah mh, ioannidis jpa. . public availability of published research data in high-impact journals. plos one ( ):e doi . /journal.pone. .
altman m, crosas m. . the evolution of data citation: from principles to implementation. iassist quarterly (spring): – . available at http://www.iassistdata.org/iq/evolution-data-citation-principles-implementation.
altman m, king g. . a proposed standard for the scholarly citation of quantitative data. dlib magazine ( / ). available at http://www.dlib.org/dlib/march /altman/altman.html.
ball a, duke m. . how to cite datasets and link to publications. technical report. datacite. available at http://www.dcc.ac.uk/resources/how-guides.
begley cg, ellis lm. . drug development: raise standards for preclinical cancer research. nature ( ): – doi . / a.
berners-lee t, fielding r, masinter l. . rfc : uniform resource identifiers (uri): generic syntax. available at https://www.ietf.org/rfc/rfc .txt.
booth d, haas h, mccabe f, newcomer e, champion m, ferris c, orchard d. . web services architecture: w c working group note february . technical report. world wide web consortium. available at http://www.w .org/tr/ws-arch/.
borgman c. . why are the attribution and citation of scientific data important? in: uhlir p, ed. for attribution -- developing data attribution and citation practices and standards. summary of an international workshop. washington d.c.: national academies press.
bray t, paoli j, sperberg-mcqueen cm, maler e, yergeau f. . extensible markup language (xml) . (fifth edition): w c recommendation november . available at http://www.w .org/tr/rec-xml/.
clark a, evans p, strollo a. . fdsn recommendations for seismic network dois and related fdsn services, version . . technical report. international federation of digital seismograph networks. available at http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf.
cnri. . handle system: unique and persistent identifiers for internet resources. available at http://www.w .org/tr/webarch/#identification.
codata-icsti task group on data citation standards and practices. . out of cite, out of mind: the current state of practice, policy and technology for data citation. data science journal (september): – doi . /dsj.osom - .
colquhoun d. . an investigation of the false discovery rate and the misinterpretation of p-values. royal society open science ( ): doi . /rsos. .
data citation synthesis group. . joint declaration of data citation principles. available at http://force .org/datacitation.
data documentation initiative. . data documentation initiative specification. available at http://www.ddialliance.org/specification/.
datacite metadata working group. . datacite metadata schema for the publication and citation of research data, version . october . available at http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel v . .pdf.
dcat application profile working group. . dcat application profile for data portals in europe. available at https://joinup.ec.europa.eu/asset/dcat application profile/asset release/dcat-application-profile-data-portals-europe-final.
dublin core metadata initiative. . dublin core metadata element set, version . . available at http://dublincore.org/documents/dces/.
dyson e. . online registries: the dns and beyond. available at http://doi.contentdirections.com/reprints/dyson excerpt.pdf.
http://www.iassistdata.org/iq/evolution-data-citation-principles-implementation http://www.iassistdata.org/iq/evolution-data-citation-principles-implementation http://www.iassistdata.org/iq/evolution-data-citation-principles-implementation http://www.iassistdata.org/iq/evolution-data-citation-principles-implementation http://www.iassistdata.org/iq/evolution-data-citation-principles-implementation http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ 
altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dlib.org/dlib/march /altman/ altman.html http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides 
http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://www.dcc.ac.uk/resources/how-guides http://dx.doi.org/ . / a https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt 
https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/ws-arch/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.w .org/tr/rec-xml/ http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . 
- jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . 
- jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.fdsn.org/wgiii/v . - jul -doifdsn.pdf http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification 
http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://dx.doi.org/ . /dsj.osom - http://dx.doi.org/ . /rsos. http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://force .org/datacitation http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ 
http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://www.ddialliance.org/specification/ http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . 
.pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . 
.pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . 
.pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . .pdf http://schema.datacite.org/meta/kernel- . /doc/datacite-metadatakernel_v . 
.pdf https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final 
https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final 
https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final 
ecma. . ecma- : the json data interchange format. available at http://www.ecma-international.org/publications/files/ecma-st/ecma- .pdf.
editors. . credit where credit is due. nature ( ): doi . / a.
fielding rt. . architectural styles and the design of network-based software architectures. doctoral dissertation, university of california at irvine. available at https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm.
fielding rt, taylor rn. . principled design of the modern web architecture. acm transactions on internet technology ( ): – doi . / . .
gao s, sperberg-mcqueen cm, thompson hs. . w c xml schema definition language (xsd) . part : structures: w c recommendation april . available at http://www.w .org/tr/xmlschema - /.
github guides. . making your code citable. available at https://guides.github.com/activities/citable-code/.
goodman a, pepe a, blocker aw, borgman cl, cranmer k, crosas m, di stefano r, gil y, groth p, hedstrom m, hogg dw, kashyap v, mahabal a, siemiginowska a, slavkovic a. . ten simple rules for the care and feeding of scientific data. plos computational biology ( ):e doi . /journal.pcbi. .
gray a, dumontier m, marshall m, baram j, ansell p, bader g, bando a, callahan a, cruz-toledo j, gombocz e, gonzalez-beltran a, groth p, haendel m, ito m, jupp s, katayama t, krishnaswami k, lin s, mungall c, le novere n, laibe c, juty n, malone j, rietveld l. . data catalog vocabulary (dcat): w c recommendation january . available at http://www.w .org/ /sw/hcls/notes/hcls-dataset/.
greenberg sa. . how citation distortions create unfounded authority: analysis of a citation network. bmj :b doi . /bmj.b .
gudgin m, hadley m, mendelsohn n, moreau j-j, nielsen hf, karmarkar a, lafon y. . soap version . part : messaging framework (second edition): w c recommendation april . available at http://www.w .org/tr/soap -part /.
haas h, brown a. . web services glossary: w c working group note february . available at http://www.w .org/tr/ /note-ws-gloss- /#webservice.
hakala j. . rfc : using national bibliography numbers as uniform resource names. available at https://tools.ietf.org/html/rfc .
hilse h-w, kothe j. . implementing persistent identifiers. available at http://xml.coverpages.org/ecpa-persistentidentifiers.pdf.
holtzman k, mutz a. . rfc : transparent content negotiation in http. available at https://www.ietf.org/rfc/rfc .txt.
hourclé j, chang w, linares f, palanisamy g, wilson b. . linking articles to data. in: rd asis&t summit on research data access & preservation (rdap) new orleans, la, usa. available at http://vso .nascom.nasa.gov/rdap/rdap landingpages handout.pdf.
international doi foundation. . doi handbook. available at http://www.doi.org/hb.html.
ioannidis jpa. . why most published research findings are false. plos medicine ( ):e doi . /journal.pmed. .
iso/tc . . iso - : : geographic information metadata, part : fundamentals. available at http://www.iso.org/iso/home/store/catalogue tc/catalogue detail.htm?csnumber= .
jacobs i, walsh n. . architecture of the world wide web, volume one w c recommendation december . available at http://www.w .org/tr/webarch/#identification.
http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf http://xml.coverpages.org/ecpa-persistentidentifiers.pdf https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt 
https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf 
http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso 
.nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://vso .nascom.nasa.gov/rdap/rdap _landingpages_handout.pdf http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://www.doi.org/hb.html http://dx.doi.org/ . /journal.pmed. 
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= 
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= 
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= 
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber= http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w .org/tr/webarch/#identification http://www.w 
janée g, kunze j, starr j. . identifiers made easy. available at http://ezid.cdlib.org/.
juty n, le novère n, laibe c. . identifiers.org and miriam registry: community resources to provide persistent identification. nucleic acids research (d ):d –d doi . /nar/gkr .
klyne g, newman c. . rfc : date and time on the internet: timestamps. available at http://www.ietf.org/rfc/rfc .txt.
kunze j. . towards electronic persistence using ark identifiers. in: proceedings of the rd ecdl workshop on web archives. trondheim, norway. available at https://confluence.ucop.edu/download/attachments/ /arkcdl.pdf.
kunze j. . the ark identifier scheme at ten years old. in: workshop on metadata and persistent identifiers for social and economic data, berlin. available at http://www.slideshare.net/jakkbl/the-ark-identifier-scheme-at-ten-years-old.
kunze j, rodgers r. . the ark identifier scheme. technical report. internet engineering task force. available at https://tools.ietf.org/html/draft-kunze-ark- .
kunze j, starr j. . ark (archival resource key) identifiers. available at http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf.
lagoze c, van de sompel h. . compound information objects: the oai-ore perspective. open archives initiative – object reuse and exchange. available at http://www.openarchives.org/ore/documents/compoundobjects- .html.
lagoze c, van de sompel h, johnston p, nelson m, sanderson r, warner s. . ore user guide - resource map discovery. available at http://www.openarchives.org/ore/ . /discovery.
library of congress. . the relationship between urns, handles, and purls. available at http://memory.loc.gov/ammem/award/docs/purl-handle.html.
mali f, erickson j, archer p. . data catalog vocabulary (dcat): w c recommendation january . available at http://www.w .org/tr/vocab-dcat/.
maunsell jh. . unique identifiers for authors. the journal of neuroscience ( ): doi . /jneurosci. - . .
moats r. . rfc : uniform resource name syntax. available at https://tools.ietf.org/html/rfc .
national center for biotechnology information. . available at http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html.
nottingham m. . rfc : web linking. available at https://www.ietf.org/rfc/rfc .txt.
oclc. . purl help. available at https://purl.org/docs/help.html (accessed january ).
parsons ma, duerr r, minster j-b. . data citation and peer review. available at http://dx.doi.org/ . / eo .
peterson d, gao s, malhotra a, sperberg-mcqueen cm, thompson hs. . w c xml schema definition language (xsd) . part : datatypes: w c recommendation april . available at http://www.w .org/tr/xmlschema - /.
peyrard s, kunze j, tramoni j-p. . the ark identifier scheme: lessons learnt at the bnf. in: proceedings of the international conference on dublin core and metadata applications . available at http://dcpapers.dublincore.org/pubs/article/view/ / .
prinz f, schlange t, asadullah k. . believe it or not: how much can we rely on published data on potential drug targets? nature reviews drug discovery ( ): – doi . /nrd -c .

starr et al. ( ), peerj comput. sci., doi . /peerj-cs.
http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf 
http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html 
http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html 
http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/documents/compoundobjects- .html http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . 
/discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://www.openarchives.org/ore/ . /discovery http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html 
http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html http://memory.loc.gov/ammem/award/docs/purl-handle.html 
http://memory.loc.gov/ammem/award/docs/purl-handle.html http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://www.w .org/tr/vocab-dcat/ http://dx.doi.org/ . /jneurosci. - . 
https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc https://tools.ietf.org/html/rfc http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . 
d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . 
d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . d /index.html http://jats.nlm.nih.gov/publishing/tag-library/ . 
d /index.html https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://www.ietf.org/rfc/rfc .txt https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html 
https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html https://purl.org/docs/help.html http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . / eo http://dx.doi.org/ . 
/ eo http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://www.w .org/tr/xmlschema - / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / 
http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / 
http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dcpapers.dublincore.org/pubs/article/view/ / http://dx.doi.org/ . /nrd -c http://dx.doi.org/ . /peerj-cs. rans j, day m, duke m, ball a. . enabling the citation of datasets generated through public health research. available at http://www.wellcome.ac.uk/stellent/groups/corporatesite/@policy communications/documents/web document/wtp .pdf. rekdal ob. . academic urban legends. social studies of science ( ): – doi . / . richardson l, ruby s. . restful web services. sebastopol ca: o’reilly. salzberg sl, pop m. . bioinformatics challenges of new sequencing technology. trends in genetics : – doi . /j.tig. . . . shendure j, ji h. . next-generation dna sequencing. nature biotechnology : – doi . /nbt . shepherd, fiumara, walters, stanton, swisher, lu, teoli, kantor, smith. . content negotiation. mozilla developer network. available at https://developer.mozilla.org/docs/web/ http/content negotiation. stein l. . the case for cloud computing in genome informatics. genome biology ( ): – doi . /gb- - - - . strasser b. . collecting, comparing, and computing sequences: the making of margaret o. dayhoff ’s atlas of protein sequence and structure, – . journal of the history of biology ( ): – doi . /s - - - . uhlir p. . for attribution—developing data attribution and citation practices and standards: summary of an international workshop ( ). technical report. the national academies press. available at http://www.nap.edu/openbook.php?record id= . vasilevsky na, brush mh, paddock h, ponting l, tripathy sj, larocca gm, haendel ma. . on the reproducibility of science: unique identification of research resources in the biomedical literature. peerj :e doi . /peerj. . starr et al. ( ), peerj comput. sci., doi . /peerj-cs. 
achieving human and machine accessibility of cited data in scholarly publications
introduction
background
why cite data?
the eight core principles of data citation
implementation questions arising from the jddcp
recommendations for achieving machine accessibility
what is machine accessibility?
unique identification
landing pages
content encoding on landing pages
best practices for dataset description
serving the landing pages
persistence guarantees
implementation: stakeholder responsibilities
conclusion
acknowledgements
appendix
serving landing pages: implementation details
serving landing pages: linking to the data
references

speaking with students: profiles in digital pedagogy
by virginia kuhn, with dj johnson and david lopez
university of southern california
institute for multimedia literacy
published in kairos: a journal of rhetoric, technology, and pedagogy
issue . spring,
http://kairos.technorhetoric.net/ . /interviews/kuhn/index.html

introduction

the honors in multimedia scholarship program

founded in , the institute for multimedia literacy (iml) is an organized research unit dedicated to developing educational programs and conducting research on the changing nature of literacy in a networked culture. although its institutional home is the school of cinematic arts, the iml supports faculty research and curricula that seek to transform the nature of scholarship within the disciplines.
the honors in multimedia scholarship program is a university-wide undergraduate program located at the iml; it received official sanction and began enrolling its first cohort in fall , basing the curriculum on the previous six years of experience deploying multimedia scholarship in courses across the usc campus. the honors program was the first of several academic programs launched by the iml.

in , the iml, in collaboration with usc's college of letters, arts and sciences, created the multimedia in the core program, which unites general education courses with multimedia labs, offering all usc students the opportunity to explore new forms of scholarly expression. the following year, the multimedia across the college program was created; here, upper division courses are paired with multimedia instruction, allowing students to investigate media-based forms of scholarly research and production. this year ( ), the iml's minor in digital studies was approved, which expanded the course offerings further. iml courses now include everything from photo-essays to web-based documentaries, from interactive videos to sophisticated web sites, and from kinetic typography to -d visualizations. all iml courses include a hands-on lab component, in addition to a theoretical foundation born of critical studies, semiotics, cinema studies, composition and rhetorical theory.

like all iml academic programs, the honors program is both reactive and proactive in relation to digital technologies for expression and communication. that is to say, while the idea is to identify and engage new media and the emerging practices they engender, the program is explicitly designed to be transformative in that it hopes to teach a new generation of scholars to enhance traditional academic practices through multimedia. the honors program stands apart from other iml programs, however, in that its goal is advanced digital literacy. as such, the program culminates in the creation of a media-rich, digital thesis project. honors cohorts are small ( - students per year) and they are well supported both technologically and conceptually. students take iml and iml during their senior year, where they plan and execute these projects, which are grounded in their disciplinary major. each student has two faculty advisors, one from the iml and one from their major; this ensures the type of student-faculty interaction that aids their scholarship, and allows us to be pedagogically responsive.

the multimedia thesis

the first honors cohort completed their thesis projects in and the second in ; the planning and execution of these projects is the topic of these student profiles. the students featured here are mainly from the inaugural class, and graduated with the honors designation in (they are filmed against a green background). there are also two students from the cohort (they are pictured against blue-gray draping). one of the greatest challenges of creating these projects is that there are few models for scholarly multimedia.
born-digital work requires us to consider the ability to explore issues with the sort of depth that comes from deploying the registers of text, image and interactivity, while it also has the potential to involve the reader/viewer in unprecedented ways. as scholars (both teachers and students), we must ask ourselves what we can do with digital media that we could not do otherwise, but we must also avoid uncritically adopting the conventions of commercial or entertainment media.

since the goal of the honors program is to be both academic and innovative, we did not want to impose generic conventions on the projects students might create, feeling that this might limit them. at the same time, we needed to be sure we retained the type of rigor appropriate to academic endeavors. thus, the thesis parameters, conceived by the planning team and updated by its program directors (steve anderson, from to , and virginia kuhn from to the present), provide a way to ensure standards, while encouraging transformation and enhancement of scholarship in light of emergent technologies. these parameters are presented and discussed throughout the process of planning and executing the projects and, in this way, students gain the ability to articulate and defend the choices made in their work.

speaking with students: the webtext

this webtext features students discussing their work. this reflective aspect is valuable on many levels, and documenting and sharing such reflection in this webtext is equally vital. here's why:

media variety

a digital archive able to house projects that cross numerous platforms does not exist. these projects run the gamut from d environments built in the virtual world of second life, to the weighty files of a korsakow filmic database, to animated flash-based webtexts, to sophisticated sophie projects. storing numerous file types in an online archive requires conversion into some uniform format, which will limit functionality. perhaps more profoundly, though, the rise of social networking stimulates a sense of collaborative dynamism: we want reader feedback, user input, and viewer-generated content that extends and reinforces our efforts. and while this impulse may merely highlight the fact that academic work is always part of a larger conversation, the responsibility for maintaining the dynamic portion of digital work is problematic. standards are difficult to establish since applications are perpetually evolving. further, many digital objects will have several iterations depending on how a viewer might access them, particularly with new mobile content, which requires a different sort of optimization than, say, a standard webtext.

application obsolescence

with no standards for maintenance, old applications will not run in just a few short years, making archiving whole projects increasingly untenable (even as algorithms that revert to earlier operating systems are gaining some ground). these videos offer insight into the process as much as the product. ucla's howard besser suggests that archivists must shift their mindset from saving completed works to asset management. given the demand for ancillary materials (outtakes, scripts, storyboards), besser suggests archivists should focus on "saving a side body of materials that contextualize a work" ( ). for our purposes, capturing a snapshot of student work while the students contextualize it makes complete sense: the video format is fairly stable and self-contained. moreover, institutionalized curricula cannot hope to keep up with the rapidly evolving applications that arise in the web . world, and so we must teach students how to learn rather than what to learn. these pieces lend critical insight into students' processes while they give the iml a uniform repository that provides a model for students and faculty alike. for even as digital scholarship is on the rise, there remains a dearth of models on which to base such efforts. in cases where the student has opted to maintain their work online, urls are given.

assessment

although it is unpopular to discuss grading, at least at the faculty level, since that is the terrain of the "bean counters," we ignore our institutional constraints at our peril. not only is it a disservice to students to fail to inform them of the criteria by which they will be judged (their financial aid, scholarships, or membership in certain student groups often depends upon maintaining a certain gpa), but, given its relative newness, digital work is also subject to the charge of lacking academic rigor. without the sustained analysis that comes from assessment criteria, digital work can be dismissed as bells and whistles. these criteria give us a lexicon with which to discuss digital work among ourselves and our students, even as explaining digital work in language that is familiar to traditional academics helps them appreciate its nuances and sophistication.
and although institutional constraints can prove frustrating, this is something that academic institutions do well: they force a type of rigor that pushes us toward excellence. at the iml we feel our project parameters help to highlight aspects that may not be immediately apparent in the piece itself, because they approach each project on its own terms. as such, there is far more freedom to be innovative with emerging platforms, while maintaining high quality work.

in creating the student profiles, we decided that a running time of roughly five minutes would be optimal. much longer video profiles could have easily been created given the scholarly depth of the projects and their thickness in terms of the multitude of layers of visual, aural and textual elements contained in each. in addition, the student interviews covered a range of topics related to the production of their thesis projects, from initial inspiration, to design and implementation, to the students' subjective response to their completed work. we also asked them to discuss how their work in scholarly multimedia has impacted their undergraduate education and how it has shaped their future educational and professional goals. we had a wealth of materials from which to build these profiles, which heightened the challenge before us: how do we maintain the integrity of the students' projects and their unique voices within a five-minute timeframe? we had to address key issues concerning the representation of students and their work in creating these profiles. in doing so, we are moved to consider best practices for documenting multimedia pedagogy, student experience and scholarly digital work. the notes on process section accompanying the student profiles illuminates key issues faced in creating these profiles and the strategies used to address them. whereas many of these strategies are grounded in formal techniques of documentary production, they are deployed in deliberate and specific ways to highlight the scholarly and aesthetic nuances particular to each project.

in order to visually represent the depth of the issues involved in this flash-based webtext, we created a type of layering effect by allowing traces of one page or screen to remain behind another. while reading one screen, a viewer might see the ghost of a video from the previous screen still playing. the color gradation was very deliberately adjusted in order to keep the text legible, even in the presence of these traces. we believe this feature of the webtext serves as a reminder of the type of depth that is emerging in digital technologies both in and out of the confines of the computer.

we feel that these students are pioneers in the area of digital scholarship and deserve to be documented in ways that are typically reserved for faculty. however, we do understand that no interview, no film, whether edited inside or outside of the camera, is ideologically neutral. we have framed students in a particular way and have created these five minutes, from the hour or so of interview footage each student gave, in order to tell a particular story.
we  hope  the  story  is  one  the  student   sees  as  valid  —  and,  indeed,  all  students  have  been   quite  pleased  with  their  piece,  often  using  them  on  job   and  graduate  school  applications  —  but  we  also  under-­ stand  the  extent  to  which  students  tell  us  what  we  want   to  hear.  our  only  way  to  reconcile  these  issues  is  full   disclosure:  we  have  a  vested  interest  in  this  program,   these  students  and  their  work.  to  mitigate  our  bias   however,  we  have  adopted  norman  denizen's  approach   to  the  construct  of  the  "interview"  as  a  form.  throughout   the  process  of  filming,  editing  and  writing  about  these   interviews  we  have  sought  to  make  them  "reflexive,   dialogic  [and]  performative"  ( )  such  that  by  creating   them,  we  are  "learning  to  use  language  in  a  way  that   brings  people  together"  ( )  rather  than  commodifying   these  students  and  their  work  for  our  own  purposes.  we   hope  you  find  these  pieces  as  stimulating  and  productive   as  we  do.   virginia  kuhn  is  the  associate  director  in  charge  of  the   honors  in  multimedia  scholarship  program  at  the  iml.   her  work  centers  on  the  ways  in  which  the  affordances   of  digital  technologies  impact  thought,  discourse  and   expression  in  a  highly  mediated  world.   dj  johnson  has  been  the  video  documentarian  for  the   iml  since   .  an  award-­winning  filmmaker,  johnson   has  extensive  experience  producing  and  directing   documentaries  and  promotional  videos  for  educational   institutions  and  social  service  organizations. david  lopez  is  an  interactivity  designer  for  the  iml.  for   over  five  years,  he  has  consistently  worked  to  facilitate   multimedia  results  from  raw  scholarly  enquiry.   works  cited besser,  howard.  digital  preservation  of  the  moving          image  material?  
the moving image, fall, . http://www.gseis.ucla.edu/~howard/papers/amia-longevity.html

denzin, norman. "the reflexive interview and a performative social science," in qualitative research, vol , no. , - ( ).

kairos issue . spring,
kuhn, with johnson and lopez
usc institute for multimedia literacy

project parameters

these are the parameters by which the thesis project is gauged. students are given these criteria early on, and can therefore plan accordingly. these parameters are flexible enough to allow student innovation, but rigorous enough to ensure academic excellence. each of the four areas is subdivided into three nuanced categories, and within the webtext you will find clips that demonstrate the ways students have met them.

conceptual core
the project’s controlling idea must be apparent.
the project must be productively aligned with one or more multimedia genres.
the project must effectively engage with the primary issue/s of the subject area into which it is intervening.

research component
the project must display evidence of substantive research and thoughtful engagement with its subject matter.
the project must use a variety of credible sources and cite them appropriately.
the project ought to deploy more than one approach to an issue.

form & content
the project’s structural or formal elements must serve the conceptual core.
the project’s design decisions must be deliberate, controlled, and defensible.
the project’s efficacy must be unencumbered by technical problems.

creative realization
the project must approach the subject in a creative or innovative manner.
the project must use media and design principles effectively.
the project must achieve significant goals that could not be realized on paper.

http://iml.usc.edu

open stacks: making dh labor visible ← dh+lib

by laura braunstein jun | features

laura braunstein is the digital humanities librarian at dartmouth college and co-edited digital humanities in the library: challenges and opportunities for subject specialists (acrl ).

last june, a group of librarians, technologists, and scholars met at middlebury college in vermont to think about how to move forward on a proposed network, the digital liberal arts exchange, that would support digital humanities scholarship and teaching across institutional boundaries.
there was much discussion, as we looked out over the green mountains on a perfect early summer day, of the particular stresses on library infrastructure when it came to supporting, leading, and engaging with digital projects, in contrast to how
libraries support traditional humanities scholarship. at one point, someone noted that the conversation was drifting back toward the tired dichotomy of “hack” and “yack”–that is, dh as coding and making things versus dh as critique of digital culture. i suggested that we might think about a third term–“stack”: the often invisible technological, social, and physical structures within which scholarship is produced and disseminated. since that meeting, i’ve been considering different concepts of “stack” in relationship to dh as models for these structures of labor. i’ve also found myself having more and more conversations–at work, at conferences, on social media–about how exposing dh infrastructure (in terms of how it supports both making/“hack” and thinking/“yack”) can reveal the conditions that make all kinds of scholarship possible.

in this post, i would like to “browse” the dh stack through three different frames: first, the technology stack of globalized computing; second, the social stack that manifests as institutional infrastructure; and finally, the physical library stacks that are a synecdoche for the information architecture that arranges scholarship. i’m curious to explore what these three frames–technological, social, and physical–could offer in terms of different ways to understand and reveal dh labor in the academy. my thoughts here build upon both shannon mattern’s idea of library as infrastructure and david weinberger’s idea of library as platform. rather than thinking of the library itself as an infrastructure, platform, or stack, i would like to consider what–and who–these concepts hide. as i’ve observed elsewhere, the people who “hack” and “yack” can’t work without the people in the “stack” (or without the people in the library stacks).
at a time of political crisis, when the core values of libraries and access to knowledge are being challenged, we need to take responsibility for showing what we do. dh librarians, whose highly collaborative work is dedicated to social justice and public engagement, may be one particularly vital community of practice for exposing the changing conditions that create knowledge. how do we make labor in the “stack” visible?

a sectional view of the new york public library, . from the new york public library.

the technology stack

benjamin bratton, in the stack: on software and sovereignty (mit, ), suggests that we think of the vertically integrated organization of global computing as a more pervasive version of the software stack, whereby a web application, like the wordpress platform for this blog, runs on top of a database that runs on top of an operating system (which runs on top of the hardware). bratton’s global stack is totalizing: it rises from raw materials mining at the bottom to hardware manufacture as the next layer, and thence upward from network infrastructure to web programming to user interface design to tech support.
it has emerged as “an accidental megastructure, one that we are building both deliberately and unwittingly and is in turn building us in its own image.” this is the megastructure of globalization, whereby computing networks transcend and surpass national boundaries and identities. the stack is not just a new technology, but operates as “a scale of technology that comes to absorb functions of the state and the work of governance” (kind of like the matrix). what bratton calls the stack as megastructure also shapes (and perhaps defines) the globalized university, in that basic internal services, such as communication, record keeping, and financial management have been outsourced to cloud-based enterprise systems. in turn, research libraries within this megastructure operate in an economy that deemphasizes (and sometimes discards) ownership of local collections in favor of access to licensed resources that are facilitated by institutional relationships with multinational corporations. yet–like the fish who asks “what is water?”–most scholars are unaware of the extent to which their work, professional interactions, and finances are imbricated with the global technology stack. how does dh fit within this megastructure? according to some critics, dh is part of the problem of the neoliberal university because it privileges networked, collaborative scholarship over individual production.
if creating a tool (hacking) or using computational methods has the same scholarly significance as writing a monograph, then individualized knowledge pursued for its own sake, the struggle at the heart of humanistic inquiry, is devalued. yet writing a book always depended on invisible (gendered) labor in the academy. word processing, library automation, and widespread digitization are just three examples of the support labor for traditional scholarly work that bratton’s globalized technology stack has absorbed. (and we know that the fruits of that labor are in no way distributed equitably.) what has changed in the neoliberal university is that the humanities scholar becomes one more node in a knowledge-producing system. does it matter, then, whether dh work produces ideas or things, critics say, if all are absorbed into a totalizing system that elides the individual scholar’s privileged position? this is of course a vision of scholarship that is traditionally specific to the humanities; lab science and the performing arts, for example, have always been deeply collaborative (but with their own systems of privilege and credit).

the social stack

if bratton’s stack is characterized by a globalizing rationality, the social infrastructure of the university is highly irrational, as alan liu discusses.
what liu calls the field of “critical infrastructure studies” could “’see through’ the supposed rationality of organizations and their supporting infrastructures to the fact that they are indeed social institutions with all the irrationality that implies.” institutional arrangements of dh are often social and contingent (in the very concrete sense that many who work in dh–graduate students, postdocs, people in grant-funded term positions– are classified as contingent labor). many dh programs, initiatives, and teams have arisen organically out of social connections rather than centralized planning. understanding contingency can transcend the “but it’s not like that where we are” arguments that often get in the way of sharing information and practices in order to improve the working (and thinking) lives of actual people. as an example: the discourse of “center envy”–by which the speaker positions herself in comparison to a colleague at another institution who has more resources and can ostensibly accomplish more. if only we had more resources, a physical center, dedicated programmers, graduate students, postdocs, grants–the myth of scarcity shapes so much of how we think of dh inhabiting our institutions. perhaps there’s no idealized arrangement for dh that would transcend local cultures; certain institutional configurations, like the small liberal arts college, may indeed be richer and more equitable environments for producing dh work. we may find ourselves comparing irrational institutional arrangements because all academic institutions are absorbed by the supposed rationality of the technology stack, but this is not the most productive way to understand the social conditions of academic labor. parallel to liu’s discussion, martin paul eve has recently argued in the context of open access publishing that the challenges we face in both supporting and crediting scholarship are not technological but social. 
when infrastructure is understood as an irrational social formation, emotional labor tends to compensate for a perceived lack of resources. scholars who are used to the invisibility of traditional library services, for instance, find that digital projects expose hierarchies and bureaucracies that they don’t want to negotiate or even think about, and the dh librarian or one of her colleagues steps in to run interference. why can’t the dean of libraries just tell that department to create the metadata for my project? after all, they already create metadata for the library’s systems. why can’t web programming be a service you provide to me like interlibrary loan? i thought the library was here to support my scholarship. why can’t you maintain my website after i retire–exactly the way it looks and feels today, plus update it as technology changes? in some conversations, these questions may be rhetorical; it may take emotional labor to answer them, but doing so exposes the workings of the library’s infrastructure–its social stack.
the physical stack

scholars often presume that because libraries acquire, shelve, and preserve the print books that they write, the same libraries will acquire, shelve (or host), and preserve digital projects. in her volume bookshelf (bloomsbury, ) for the object lessons series, lydia pyne describes how the cast-iron bookstacks manufactured by snead & co. around the turn of the twentieth century transformed library architecture and services. standardized shelves enabled libraries to house more on-site collections, which in turn allowed open-stack browsing. cast-iron stacks were the literal infrastructure that held up buildings, as the new york public library infamously discovered when it proposed to remove book stacks from its flagship fifth avenue location. if the university is what shelby foote apocryphally called “a group of buildings gathered around a library,” then the library might be a building gathered, or built around–on, out of–book stacks. book stacks literally undergird (in the sense that a bookshelf is a small girder) the modern university. where are the stacks for a digital project? what does the architecture of the physical stacks–the core collections–suggest about how we might arrange (or derange) our digital scholarship? while libraries might be the organizations on campus best suited to arrange, acquire, and preserve digital scholarship, not all scholars think so, if repository participation rates are any evidence (and they might not be). scholars may be extrapolating from their experiences with commercial ebook and journal publishers, and we can’t blame them for some apprehension. if publishers can revoke access to digital material at any time, libraries must resist to ensure the free exchange of information, and advocate for alternative scholarly economies.
as pyne discusses, digital rights management is the new “chain” that secures books to shelves; unlike the chains that bound medieval codices, “digital chains are just more difficult to see.” as with bratton’s technology stack that absorbs local decision making and curation of collections, digital chains obscure the agency of librarians and scholars.

# # #

speaking out about the very real conditions under which digital scholarship is produced and accessed can also reveal long histories of labor inequities in the academy. thinking about stacks–technological, social, physical–as frames is not a foolproof approach to making this labor visible, nor do my conceptualizations lack inconsistencies. my thoughts are intended to open a conversation, here on dh+lib, on social media, and at conferences and in further publications. and while dh may be particularly generative, as a community of practice, to facilitate these conversations, we are by no means the only ones doing so (or who should be doing so). making the digital stacks transparent and visible–as visible to library users and as fundamental to our libraries’ infrastructure as cast-iron book stacks are–should be our responsibility as librarians. not because we need yet another responsibility, but because we are uniquely positioned to interpret the rapidly changing landscape of digital scholarship for all those with whom we collaborate.

this work is licensed under a creative commons attribution . international license.
laura braunstein is the digital humanities librarian at dartmouth college and a co-editor of digital humanities in the library: challenges and opportunities for subject specialists (acrl, ). she has a phd in english from northwestern university and an mslis from the pratt institute.

dh+lib is a project of the acrl digital humanities interest group | issn - (online)

towards human linguistic machine translation evaluation

marta r. costa-jussà
institute for infocomm research. fusionopolis way, - connexis (south tower) singapore
e-mail: vismrc@i r.a-star.edu.sg

mireia farrús
pompeu fabra university. c/ tanger/roc boronat, barcelona.
e-mail: mfarrus@upf.edu

abstract

when evaluating machine translation outputs, linguistics is usually taken into account implicitly.
annotators have to decide whether a sentence is better than another or not, using, for example, adequacy and fluency criteria or, as recently proposed, editing the translation output so that it has the same meaning as a reference translation and is understandable. therefore, the important fields of linguistics of meaning (semantics) and grammar (syntax) are indirectly considered. in this study, we propose to go one step further towards a linguistic human evaluation. the idea is to introduce linguistics implicitly by formulating precise guidelines. these guidelines strictly mark the difference between the sub-fields of linguistics such as morphology, syntax, semantics, and orthography. we show our guidelines have a high inter-annotation agreement and wide error coverage. additionally, we examine how the linguistic human evaluation data correlate across different types of machine translation systems (rule-based and statistical) and with adequacy and fluency.

keywords: machine translation, linguistic evaluation

1. introduction

evaluation in machine translation is a challenging task. as a consequence of the increased interest in enhancing machine translation systems, there is a corresponding interest in improving machine translation evaluation. evaluation of a translation output is not an easy task even for human beings because translation involves different types of knowledge, such as linguistic and cultural. different translators may have different criteria. however, human judgments of performance have been the gold standard of mt evaluation metrics.

preprint submitted to elsevier december ,

there have been several proposals for human evaluation which have been widely used by the scientific community. measuring in adequacy and fluency was proposed by [ ] and it is still a standard evaluation criterion.
adequacy is a rating of how much information is transferred between the source and the target language, and fluency is a rating of how good the target language is. the most recent human evaluation approach that was chosen as the official machine translation evaluation metric for darpa’s global autonomous language exploitation (gale) program [ ] was hter (human-targeted translation edit rate). hter involves a procedure for creating targeted translations. annotators compare the translation output against a reference translation, and they modify the output so that it has the same meaning as the reference and is understandable. each inserted/deleted/modified word or punctuation mark counts as one edit, while shifting a string of any number of words, by any distance, counts as one edit [ ].

other works such as [ ] propose a -category schema that does not use linguistic criteria. the errors are classified in five big classes: incorrect words, missing words, word order, unknown words and punctuation. the flanagan classification [ ] lists a series of errors that are language pair dependent. the author classifies the errors in different categories for english-to-french translation, plus three more categories to be added for english-to-german translation. evaluations of different mt systems for a range of linguistic checkpoints have been carried out for english-chinese [ ], italian-english, german-english and dutch-english [ ].

as far as we know, the above evaluations (except for adequacy and fluency) do not report an inter-annotation agreement study. in any case, there has not been a formal proposal of linguistic evaluation guidelines for machine translation. the main advantages of a linguistic evaluation would be:

• propose precise linguistic guidelines that allow for a high inter-annotation agreement.
• provide a linguistic classification of the translation output errors.
• provide new information to enhance the machine translation systems.
• evaluation is done without a reference.

the main drawbacks of such an evaluation would be that it requires bilingual annotators and it would be time consuming. however, nowadays we can take advantage of crowd-sourcing platforms (such as amazon’s mechanical turk, https://www.mturk.com, or crowdflower, http://crowdflower.com/) to reduce these types of drawbacks. crowd-sourcing enables requesters to tap into a global pool of non-experts to obtain rapid and affordable answers to simple human intelligence tasks (hits), which can be subsequently used to train data-driven applications. a number of recent papers on this subject point out that non-expert annotations, if produced in a sufficient quantity, can rival and even surpass the quality of expert annotations, often at a much lower cost [ ], [ ]. however, this possible increase in quality depends on the task at hand and on an adequate hit design [ ], which motivates the creation of detailed guidelines.

the rest of the paper is organized as follows. the next section briefly describes the linguistic guidelines. section reports the experimental results with these linguistic guidelines. in particular, we exploit the linguistic guidelines to show correlation results at the segment level between linguistic evaluations and different types of systems. additionally, we test the linguistic guidelines by computing the correlation with adequacy and fluency results. finally, section discusses the most relevant conclusions.

2. linguistic guidelines

we consider that linguistic guidelines for a machine translation system should be specific for the target language. however, they may be generalizable for different source languages. in this case, we are using guidelines specific for the catalan language.
the guidelines consider four relevant linguistic evaluations: orthographic (language writing standardization); morphological (internal structures of words and how they can be modified); semantic (meaning of individual words and combinations, and how these form the meanings of sentences); and syntactic (word combination to form grammatical sentences). the guidelines should classify any error committed by a translation system into one of these categories.

the linguistic guidelines have been designed for the catalan target language using the translation output of the universitat politècnica de catalunya (upc) statistical machine translation system [ ] over a spanish-to-catalan test set. this test set consisted of sentences (around k words) extracted from the el país and la vanguardia newspapers [ ]. the guidelines were designed by a catalan linguist. next, the annotation guidelines are summarized.

• orthographic errors include punctuation marks, erroneous accents, letter capitalization, joined words, spare blanks coming from a wrong detokenisation, apostrophes, conjunctions and errors in foreign words.

1. punctuation marks. include a wrong use, missing punctuation and extra punctuation (exclamation and interrogation marks, full stops, commas, colons, semicolons, dots, etc.). e.g.
    source: es factible, pero hay que tener en cuenta tres obviedades:
    target: és factible, ara cal tenir en compte tres obvietats.

2. accents. include accented vowels when not necessary, missing accents and erroneous accents. e.g.
    source: la llegada de obama y la situación interna del régimen islamista deparan una oportunidad.
    target: *l’arribada d’obama i la situacio* interna del règim islamista ofereixen una oportunitat.
    correct target: l’arribada d’obama i la situació interna del règim islamista ofereixen una oportunitat.

3. capital and lower case letters.
this refers to wrong capital letters within a sentence, lower case letters at the beginning of a sentence, and lower case letters in acronyms or proper nouns. e.g.
    source: el enorme peligro de este camino sería privar a un régimen aislado y teóricamente revolucionario del enemigo supremo.
    target: l’enorme perill d’aquest camí seria privar a un règim aïllat i teòricament revolucionari de l’enemic suprem.

4. joined words. this is a less common error in which two consecutive words are erroneously joined. e.g.
    source: pero, aun siendo funcional para resolver y a la vista de los resultados (...)
    target: *però, fins i tot sent funcional per resoldre ia la vista dels resultats.
    correct target: però, fins i tot sent funcional per resoldre i a la vista dels resultats.

5. extra spaces. this error is usually committed due to non-detokenising when required or detokenising into the wrong direction. e.g.
    source: "hola"
    target: " hola "

6. apostrophe. the apostrophe is commonly used in catalan to elide a sound. in some cases, some of the words that should be apostrophised are not apostrophised (missing apostrophe) and vice versa (extra apostrophe). e.g.
    source: sólo hace años que sabemos la historia que se oculta tras esa imagen turbadora
    target: *només fa anys que sabem la història que se amaga darrere aquesta imatge torbadora.
    correct target: només fa anys que sabem la història que s’amaga darrere aquesta imatge torbadora.

• morphological errors include lack of gender and number concordance, apocopes, errors in verbal morphology (inflection) and lexical morphology (derivation and compounding), and morphosyntactic changes due to changes in syntactic structures.

1. lack of gender concordance. some words are given a different gender in different languages. for instance, the word smile is feminine in spanish (la sonrisa) and masculine in catalan (el somriure).
it is then common to find a lack of gender concordance in articles and adjectives with a noun that changes its gender from one language to the other, especially in statistical systems, where there are no rules to solve it. e.g.
    source: el balón llegó tarde
    target: *el pilota va arribar tard.
    correct target: la pilota va arribar tard.

2. lack of number concordance. although it is less common, some words are given a different number in different languages. for instance, the word money is singular in spanish (el dinero) and plural in catalan (els diners). as in the previous case, this causes a lack of number concordance in articles and adjectives with the consecutive noun, again especially in statistical systems, where there are no rules to solve it. e.g.
    source: el gobierno se ha gastado todo el dinero de los ciudadanos.
    target: *el govern s’ha gastat tot el diners dels ciutadans.
    correct target: el govern s’ha gastat tots els diners dels ciutadans.

3. verbal morphology. this error refers to a verb that is not correctly inflected, a common error in a highly inflected language such as catalan. the most common cases are the translation of an inflected verb into the infinitive form, or the lack of person concordance. e.g.
    source: el mismo que usted puede ahora constatar en la exposición bacon.
    target: *el mateix que vostè pugues ara constatar en l’exposició bacon.
    correct target: el mateix que vostè pot ara constatar en l’exposició bacon.

4. lexical morphology. this basically concerns word formation (derivation and compounding), such as using a derivative in a wrong way (e.g. lliguer instead of de la lliga) or a wrong compound (e.g. històric-social instead of historicosocial).

• semantic errors include no correspondence between source and target words, non-translated but necessary source words, missing target words, and non-translated proper nouns or translated when not necessary.
additionally, they include polysemy, homonymy, and expressions used in a different way in the source and target languages.

polysemy. a polyseme is a word with multiple meanings that share the same origin. a polysemic error occurs when the incorrect meaning is chosen in the target language. e.g. the catalan conjunction perquè has two different meanings: porque (because) and para que (in order to), which often causes translation errors.

homonymy. homonymy is found when two or more words share the same spelling and the same pronunciation but have different meanings, usually as a result of having different origins. like polysemic errors, homonymy errors occur when the incorrect meaning is chosen in the target language. e.g. the spanish adverb solo, which can also be an adjective, is translated by the catalan adjective sol instead of the corresponding adverb només.

incorrect word. this error is detected when there is no correspondence at all between the source word and the translated target word. it is normally found in statistical systems, where the word is translated incorrectly mainly due to alignment errors. e.g.
source: no llegaron hasta las cuatro de la tarde.
target: *no van arribar fins a les quatre.
correct target: s’arreglen més sabates que mai,

unknown word. this refers to a non-translated source word, which is left intact on the target side. e.g.
source: el caso dutroux, que convulsionó a bélgica a principios de los noventa,
target: *el cas dutroux, que convulsionó bèlgica a principis dels noranta
correct target: el cas dutroux, que va convulsionar bèlgica a principis dels noranta

missing target word. this refers to a non-translated source word, which is missing on the target side. e.g.
source: fue una decisión impopular, pero seguramente justa.
target: va ser una decisió impopular, segurament justa.

proper nouns. this error concerns non-translated proper nouns (i.e.
unknown proper noun) or proper nouns translated when not necessary (for instance, not being detected as a proper noun but as a common noun to be translated). e.g.
source: zapatero se negó.
target: *el sabater s’hi va negar.
correct target: zapatero s’hi va negar.

• syntactic errors include errors in prepositions, errors in relative clauses, verbal periphrasis, clitics, missing or extra articles in front of proper nouns, and syntactic element reordering.

prepositions. this error refers to prepositions not elided in the target language (extra prepositions), prepositions not inserted in the target language (missing prepositions), or source prepositions maintained in the target language instead of a new correct target preposition (incorrect prepositions). e.g.
source: debería ser recusado en favor de otro juez.
target: *hauria de ser recusat en favor d’un altre jutge.
correct target: hauria de ser recusat a favor d’un altre jutge.

relative pronouns. due to their syntactic complexity, relative clauses involving relative pronouns that refer to previous elements usually lead to erroneous translations. e.g.
source: murieron tres personas al colisionar un ford scort con un renault scenic cuyo conductor sufrió heridas leves.
target: *hi van morir tres persones al topar un ford scort amb un renault scénic amb un conductor va patir ferides lleus.
correct target: hi van morir tres persones en topar un ford scort amb un renault scénic el conductor del qual va patir ferides lleus.

verbal periphrasis. the use of verbal periphrases, especially when they involve prepositions that differ between the languages, usually leads to translation errors as well (e.g. the spanish verbal periphrasis tener que (have to) is usually translated literally into catalan as tenir que instead of the correct periphrasis haver de).

clitics. these errors include an incorrect syntactic function of the pronoun or a wrong clitic-verb combination. e.g.
source: el niño se cayó por las escaleras de su casa.
target: *el nen es va caure per les escales de casa seva.
correct target: el nen va caure per les escales de casa seva.

articles. this error refers to missing or extra articles in front of proper nouns. e.g.
source: rosa entró en el despacho del dueño
target: *rosa va entrar al despatx del propietari
correct target: la rosa va entrar al despatx del propietari

reordering. it refers to a syntactic reordering of the elements of the sentence.

a list of the linguistic errors can be found in table .

orthographic | morphologic | semantic | syntactic
punctuation marks | gender concordance | polysemy | prepositions
accents | number concordance | homonymy | relative pronouns
capital and lower case letters | verbal morphology | incorrect words | verbal periphrasis
joined words | lexical morphology | unknown words | clitics
extra spaces | | missing target word | articles
apostrophe | | proper nouns | reordering
table : guidelines summary.

experiments
this section describes the experiments that were designed to evaluate the performance of the linguistic guidelines briefly reported in the previous section. first, we wanted to evaluate the inter-annotation agreement. second, we wanted to test the coverage of the linguistic errors and the generalization to a different source language. finally, we computed the correlation of the linguistic evaluations: among different translation systems, and with standard human evaluation methods such as adequacy and fluency.

data set
the test corpus falls within the medicine domain. this medical corpus was kindly provided by the universaldoctor project, which focuses on facilitating communication between health-care providers and patients from various origins . table summarizes the number of sentences, words and vocabulary of the medical corpus.

english sentences words vocabulary
table : corpus statistics of the english medical test set.

machine translation systems
as translation systems we used freely available systems on the web.
they include two rule-based mt (rbmt) systems, apertium and translendium, and two statistical mt (smt) systems, google translate and upc. all systems are used with their respective versions dated st of february .

• apertium platform is an open-source rbmt system originally based on existing translation systems that have been designed by the transducens group at the universitat d’alacant (ua). the system uses a shallow-transfer machine translation technology.
• translendium is developed by translendium s.l., a catalan company located in barcelona and subsidiary of the european group lucy software, made up of linguists and computer scientists with more than fifteen years of experience in the machine translation field. the translation engine consists of a modular structure of computational grammars and lexicons that makes it possible to carry out a morphosyntactic analysis of the source text and then transfer it into the target language.
• google translate is an smt system developed by google’s research group for more than languages. the system uses billions of words of text, both monolingual text in the target language. google is constantly working to support more languages and introduce them as soon as the automatic translation meets their standards.
• upc system is developed at the universitat politècnica de catalunya. based on an n-gram translation model integrated in an optimized log-linear combination of additional features, it is mainly a statistical system, although it also includes additional linguistic rules to solve some errors caused by the statistical translation [ ].

http://www.universaldoctor.com http://www.apertium.org/ http://www.translendium.com http://translate.google.com/

inter-annotation agreement in adequacy and fluency human evaluation
the evaluation in adequacy and fluency was performed by three annotators, native catalan speakers fluent in english. the rank of adequacy and fluency was from (good) to (bad).
all annotators evaluated ( * ) sentences both in adequacy and fluency. the inter-annotation agreement was evaluated with the weighted kappa [ ] using a quadratic distance between errors. the weighted kappa was . which is qualified as ’good’ according to [ ].

inter-annotation agreement in the linguistic human evaluation
the linguistic evaluation was performed by three annotators, native catalan speakers fluent in english. the errors are reported according to the following linguistic levels: orthographic, morphological, semantic and syntactic, as described in section . annotators were not able to find one single error that was not reported in the guidelines. this was one of the main objectives of the guidelines and it is a great achievement because the guidelines were designed on a different set from the test set with a different source language. this means that these guidelines, designed for a particular target language, may be used for different language pairs with a common target.

we evaluated the inter-annotation agreement with the weighted kappa ($\kappa$) [ ] using a linear unitary distance between errors:

$$\kappa = 1 - \frac{\sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij}\, x_{ij}}{\sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij}\, m_{ij}}$$

where $k$ is the number of codes (in our case four categories) and $w_{ij}$, $x_{ij}$, $m_{ij}$ are elements in the weight, observed and expected matrices, respectively. the weighted kappa was . which is good according to [ ]. this kappa is quite high compared to other inter-annotation kappas in mt evaluation [ ], and it is due to the accurate design of the linguistic guidelines.

to sum up, we are boosting kappa by giving strict guidelines, which is different from relying on the holistic evaluation that the adequacy and fluency criteria provide. depending on the application, we would prefer one evaluation or the other.

http://www.n-ii.org/

adequacy and fluency results
table shows the results of the translation evaluation from the different system outputs.
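for reference, the weighted kappa used in the two inter-annotation agreement subsections can be sketched in python. this is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `weighted_kappa`, the 0-based category codes and the two-annotator (pairwise) setting are all hypothetical choices made for illustration; the paper reports agreement for three annotators and does not say how pairwise values were combined. the `weight` argument switches between the linear distance (linguistic evaluation) and the quadratic distance (adequacy and fluency).

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, k, weight="linear"):
    """cohen's weighted kappa for two raters using category codes 0..k-1."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    if weight not in ("linear", "quadratic"):
        raise ValueError("weight must be 'linear' or 'quadratic'")
    n = len(rater_a)

    # disagreement weights w_ij: zero on the diagonal, growing with distance
    power = 1 if weight == "linear" else 2
    w = [[abs(i - j) ** power for j in range(k)] for i in range(k)]

    # observed matrix x_ij: proportion of items coded i by a and j by b
    counts = Counter(zip(rater_a, rater_b))
    x = [[counts[(i, j)] / n for j in range(k)] for i in range(k)]

    # expected matrix m_ij: product of the two raters' marginal proportions
    row = [sum(x[i]) for i in range(k)]
    col = [sum(x[i][j] for i in range(k)) for j in range(k)]
    m = [[row[i] * col[j] for j in range(k)] for i in range(k)]

    observed = sum(w[i][j] * x[i][j] for i in range(k) for j in range(k))
    expected = sum(w[i][j] * m[i][j] for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected
```

with four categories (k = 4), identical annotations give a kappa of 1, and annotations that agree only as often as chance would predict give a kappa of 0.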
notice that google is ranked the best system in adequacy and translendium is ranked the best system in fluency.

english-to-catalan adequacy fluency
apertium . .
google . .
translendium . .
upc . .
table : adequacy and fluency results for english-to-catalan translation output.

linguistic human evaluation results
table shows the results of the translation evaluation from the different system outputs. notice that the semantic errors are the most common, and the orthographic errors are the least common. if we rank systems by orthography, apertium is the best system. if we rank systems by morphology or syntax, translendium is the best one. and if we rank systems by semantics, google is the best one. therefore, this evaluation may be useful for deciding which system is better for a specific application. for example, if one requires tourist information, one may be interested only in the meaning of the translation; in that case one may choose google, which has the lowest number of semantic errors.

english-to-catalan sent. w/errors total errors ort. mor. sem. syn.
apertium
google
translendium
upc
table : linguistic evaluation results for english-to-catalan translation outputs: number and type of linguistic errors.

previous experiments with these guidelines can be found in [ ], [ ] and [ ].

correlation between linguistic evaluations and adequacy and fluency
we computed segment-level correlations between the linguistic judgments and the adequacy and fluency criteria, using kendall’s $\tau_b$ rank correlation among the different linguistic evaluations and systems. let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be a set of joint observations from two random variables $X$ and $Y$ respectively (e.g. orthography and semantics). any pair of observations $(x_i, y_i)$ and $(x_j, y_j)$ is said to be concordant if the ranks for both elements agree: that is, if both $x_i > x_j$ and $y_i > y_j$ or if both $x_i < x_j$ and $y_i < y_j$. the pair is said to be discordant if $x_i > x_j$ and $y_i < y_j$ or if $x_i < x_j$ and $y_i > y_j$. if $x_i = x_j$ or $y_i = y_j$, the pair is neither concordant nor discordant.

since we presumably have many ties, we use the kendall $\tau_b$ coefficient, which makes adjustments for ties and is defined as:

$$\tau_b = \frac{(\text{number of concordant pairs}) - (\text{number of discordant pairs})}{\sqrt{(n_0 - n_1)(n_0 - n_2)}}$$

where a concordant pair is a pair of two translations of the same segment in which the ranks given by the number of errors calculated from the corresponding linguistic levels agree; in a discordant pair, they disagree. ties are adjusted as shown in the denominator:

$$n_0 = \frac{n(n-1)}{2}; \qquad n_1 = \sum_i \frac{t_i(t_i-1)}{2}; \qquad n_2 = \sum_j \frac{u_j(u_j-1)}{2}$$

where $n_0$ is the total number of pairs, $t_i$ is the number of tied values in the $i$-th group of ties for the first quantity and $u_j$ is the number of tied values in the $j$-th group of ties for the second quantity. the possible values of $\tau_b$ range between $1$ (where all pairs are concordant) and $-1$ (where all pairs are discordant). thus the higher the value for $\tau_b$, the more similar the linguistic evaluations. when $\tau_b$ is zero, it means linguistic evaluations are independent. in all cases, correlations followed a statistically significant trend [ ].

here, a concordant pair is a pair of two translations of the same segment in which the ranks calculated from the human ranking task (adequacy or fluency) and from the number of linguistic errors of the corresponding level agree; in a discordant pair, they disagree. the higher the value for $\tau_b$, the more similar the linguistic level ranking is to the human ranking in either adequacy or fluency. in all cases, correlations followed a statistically significant trend [ ].

table shows the results for ( sentences * systems) segments. on the one hand, adequacy is clearly correlated with semantics, a little with syntax and not at all with orthography and morphology, because these two levels do not interfere with the understanding of the translation.
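the kendall $\tau_b$ definition given above can be checked with a short python sketch. this is an illustrative implementation under stated assumptions, not the authors' code: the function name `kendall_tau_b` and the direct comparison of every pair (quadratic in the number of segments) are choices made here for readability. the inputs are assumed to be two equal-length lists of numeric scores for the same segments, for example error counts at two linguistic levels.

```python
from collections import Counter
from math import sqrt


def kendall_tau_b(xs, ys):
    """kendall's tau-b between two equal-length sequences of numeric scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two items")
    n = len(xs)

    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
            # a tie in either variable is neither concordant nor discordant

    n0 = n * (n - 1) / 2                                      # total number of pairs
    n1 = sum(t * (t - 1) / 2 for t in Counter(xs).values())  # tied pairs in xs
    n2 = sum(u * (u - 1) / 2 for u in Counter(ys).values())  # tied pairs in ys

    denominator = sqrt((n0 - n1) * (n0 - n2))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator
```

for example, `kendall_tau_b([1, 1, 2, 3], [1, 2, 2, 3])` has four concordant pairs, no discordant pairs, and one tied pair in each variable, so the coefficient is 4 / sqrt(5 * 5) = 0.8.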
on the other hand, fluency is correlated with all levels, in this order from major to minor importance: semantics, syntax, orthography and morphology. in both cases, adequacy and fluency are clearly related to the quantity of total errors produced by the system.

adequacy fluency
orthographic .
morphological .
semantic . .
syntactic . .
total errors . .
table : correlation at the level of segment between linguistic evaluation and adequacy and fluency.

conclusions
we proposed an alternative way of human evaluation in machine translation. to the best of our knowledge, our proposal is the first linguistic evaluation which has been tested in detail providing good inter-annotation agreement, excellent error coverage and informative segment correlation with the standard human evaluation methodology of adequacy and fluency. in this sense, linguistic guidelines have been shown to be useful for machine translation evaluation.

this methodology has been shown to achieve a really high inter-annotation agreement (a kappa of . ), which should be one of the main goals in machine translation evaluation. the level of agreement achieved is quite surprising, especially if we take into account that the evaluation does not use a reference translation.

moreover, the linguistic guidelines, designed for spanish-to-catalan and specific to the target language (catalan), have been shown to generalize to a different source language (english). annotators could not find one single error that was not specified in the guidelines. finally, the linguistic classification of errors provides new information which has proved useful for relating linguistic errors from different types of systems.

additionally, we have shown that annotators, when evaluating adequacy, take into account semantic and syntactic errors and, when evaluating fluency, take to some extent all types of errors into account. our intention with this correlation analysis was not to reach especially high correlations, but to show how linguistic evaluations are related when studying translation outputs.

in further work, we would like to investigate how these linguistic guidelines work over a crowd-sourcing platform and how this new linguistic information can be used to improve machine translation systems.

acknowledgments
the authors would like to thank the institute for infocomm research for their support and permission to publish this research. this work has been partially funded by the seventh framework program of the european commission through the international outgoing fellowship marie curie action (imtrap- - ).

[ ] callison-burch, c., koehn, p., monz, c., peterson, k., przybocki, m., zaidan, o.. findings of the joint workshop on statistical machine translation and metrics for machine translation. in: proceedings of the joint fifth workshop on statistical machine translation and metricsmatr. uppsala, sweden: association for computational linguistics; . p. – .
[ ] coehn, j.. weighted kappa: nominal scale agreement and the maximum value of kappa. educational and psychological measurement ;( ): – .
[ ] costa-jussà, m., farrús, m., mariño, j., fonollosa, j.. study and comparison of rule-based and statistical catalan-spanish machine translation systems. accepted in computing and informatics journal ;.
[ ] farrús, m., costa-jussà, m., poch, m., hernández, a., mariño, j.. improving a catalan-spanish statistical translation system using morphosyntactic knowledge. in: th annual meeting of the eamt: european association for machine translation. barcelona; . p. – .
[ ] farrús, m., costa-jussà, m.r., popovic, m.. study and correlation analysis of linguistic, perceptual, and automatic machine translation evaluations. jasist ; ( ): – .
[ ] flanagan, m.a.. error classification for mt evaluation. in: proc. of the amta. columbia; . p. – .
[ ] kittur, a., chi, e.h., suh, b..
crowdsourcing user studies with mechanical turk. in: proceedings of the sigchi conference on human factors in computing systems. new york, ny, usa: acm; chi ’ ; . p. – .
[ ] landis, j.r., koch, g.g.. the measurement of observer agreement for categorical data. biometrics ; : – .
[ ] mariño, j., banchs, r., crego, j., de gispert, a., lambert, p., fonollosa, j., costa-jussà, m.. n-gram based machine translation. computational linguistics ; ( ): – .
[ ] mcbride, g.. anomalies and remedies in non-parametric seasonal trend tests and estimates; . national institute of water and atmospheric research, hamilton.
[ ] naskar, s.k., toral, a., gaspari, f., way, a.. a framework for diagnostic evaluation of mt based on linguistic checkpoints. in: proceedings of the th machine translation summit. xiamen, china; . p. – .
[ ] olive, j.. global autonomous language exploitation. darpa/ipto proposer information pamphlet ;.
[ ] snover, m., madnani, n., dorr, b.j., schwartz, r.. ter-plus: paraphrase, semantic, and alignment enhancements to translation edit rate. machine translation ; ( - ): – .
[ ] snow, r., o’connor, b., jurafsky, d., ng, a.y.. cheap and fast - but is it good?: evaluating non-expert annotations for natural language tasks. in: proceedings of the conference on empirical methods in natural language processing. . p. – .
[ ] su, q., pavlov, d., chow, j.h., baker, w.c.. internet-scale collection of human-reviewed data. in: proceedings of the th international conference on world wide web. . p. – .
[ ] vilar, d., xu, j., fernando-d’haro, l., ney, h.. error analysis of statistical machine translation output. in: proc. of the lrec. genoa, italy; . .
[ ] white, j., o’connell, t., o’mara, f.. the arpa mt evaluation methodologies: evolution, lessons, and future approaches. in: proc. of the st conference of the association for machine translation in the americas. columbia; . p. – .
[ ] zhou, m., wang, b., liu, s., li, m., zhang, d., zhao, t..
diagnostic evaluation of machine translation systems using automatically constructed linguistic check-points. in: proceedings of the nd international conference on computational linguistics coling . stroudsburg, pa, usa; volume ; . p. – .

preparing for the research excellence framework: examples of open access good practice across the united kingdom

hannah degroff, open access support coordinator, jisc, bristol, uk

(this is an author's accepted manuscript of an article published by taylor & francis in the serials librarian on july which is available online at http://dx.doi.org/ . / x. . . the formatting has been altered from the original and taylor & francis takes no responsibility for any errors thereby introduced.)

abstract
this article concerns how higher education institutions across the united kingdom are implementing systems and workflows in order to meet open access requirements for the next research excellence framework. the way that institutions are preparing is not uniform, although there are key areas which require attention: cost management, advocacy, systems and metadata, structural workflows, and internal policy. examples of preparative work in these areas are taken from institutions who have participated in the open access good practice initiative supported by jisc.

introduction
in their article ‘open access for ref ’ simon kerridge and phil ward estimate that the number of journal articles and conference proceedings which will be submitted to the united kingdom’s (uk) higher education funding councils’ next research excellence framework (ref) will number somewhere in the order of , .
these publications will make up about three-quarters of the total number of outputs which will be submitted to the exercise, which endeavours to assess the quality of research at uk universities and subsequently informs research grant allocation. on behalf of all the uk funding councils, the higher education funding council for england (hefce) announced in march that any article or published conference proceeding which a higher education institution (hei) wants to submit to the next ref will need to be made openly available via a repository as soon as is feasibly possible. taking into account exemptions, kerridge and ward estimate that this open access (oa) requirement will affect around , submissions to the forthcoming national research assessment exercise. effectively satisfying this extensive stipulation will, according to the jisc guide complying with open access policies, require ‘an institution-wide approach’, an approach that involves preparing researchers for changes to their research dissemination processes and establishing new workflows and systems. the challenge for institutions has been, and will continue to be, therefore the dual necessity to adapt internal workflows whilst changing researchers’ behaviours.

kerridge and ward, ibid, ibid, . the policy only applies to articles and conference proceedings accepted for publication after st april . the jisc guide is available here: https://www.jisc.ac.uk/guides/complying-with-research-funders-open-access-policies [accessed / / ].
it is widely held throughout the scholarly communications community that the benefits which come from free access to research for academic institutions, their researchers, and society in general outweigh the challenges of implementation. yet adoption is sometimes impeded by what stephen pinfield describes as ‘cautious researcher attitudes’. to this end, advocacy coupled with the realignment of internal processes has become the cornerstone of most, if not all, institutional oa strategies. this article offers examples of how heis across the uk are successfully preparing to meet the oa requirements of the next ref. examples of such work have been drawn from the nine pathfinder projects which were set up as part of the jisc-funded open access good practice (oagp) initiative. the projects have focused their attention on a variety of oa-related areas falling under five key themes: baselining and policy, structural workflows, cost management, systems and metadata, and advocacy. the examples covered below will thus be organised in line with these themes. via the development of support material, the organisation of workshops, and dissemination of their findings, these projects have formed the basis of an oagp community in which over uk heis have actively participated. over individuals subscribe to the oagp mailing list, an indication of the initiative’s reach across the uk’s higher education sector. before exploring how institutions involved in these projects have responded to the new oa requirements, the article will first look at hefce’s oa policy in more detail. emery & stone, . 
feedback from workshops ‘indicates that advocacy, funder mandates, staffing, discovery, and standards are the key barriers, with costs and workflows closely linked.’ for summaries of the benefits of oa see: http://sparceurope.org/open-access/benefits-of-open-access/ and http://www.pasteur oa.eu/sites/pasteur oa/files/resource/brief_oa% and% knowledge% transfer% to% the% private% sector.pdf [accessed / / ]. pinfield, . the pathfinder projects are part of a community-based support initiative involving heis, the primary focus being the development of good practice in oa implementation. for more information, including links to the different projects, please see: https://www.jisc.ac.uk/rd/projects/open-access-good-practice the oagp blog is available here: http://openaccess.jiscinvolve.org/wp/ [both accessed on / / ].

policy background
the higher education sector was first alerted to potential changes regarding the ref during consultation by hefce with the community in . it was, however, oa-based mandates conceived by the research councils uk (rcuk) that same year which initially encouraged heis to reconsider their workflows and processes.
jo aucock, head of cataloguing and repository services at the university of st andrews, notes that the ‘flurry of activity by funders in influenced [st andrews] to issue a statement on publications and this also acknowledged the role of the library in administering oa compliance.’ as in this case, university libraries, along with institutional research offices, soon became the focal point for oa work across uk heis. in , the funding councils announced their oa ref policy, which endorses both green and gold routes to oa. in comparison, rcuk’s policy is more closely aligned with the finch report ( ) and its recommendation that policies should support a gold oa approach. ensuring that the research which they fund is made openly available has also been an objective for the wellcome trust, members of the charities open access fund (coaf), and the european research and innovation programme, horizon . details of these funder and research organisation mandates can be found in roarmap (the registry of open access repository mandates and policies), a database which supports heis and academics in understanding the oa policy landscape.

for information about the consultation process please see: http://www.hefce.ac.uk/pubs/year/ / / [accessed on / / ]. aucock, . ibid. an updated version of the policy was released in july . rcuk issued block grants to uk heis, for example, to pay for article processing charges.

the previous ref exercise (which finished in ) did not include any requirements that research outputs be made openly available. due to its emphasis on oa, the present assessment, which is likely to conclude in - , has been described by aucock as a ‘game changer’.
ben johnson (hefce’s research policy advisor) asserts that the funding councils understand a move to oa is a cultural change for academics and that it will take time to implement new systems, train support staff, and establish efficient workflows. indeed, hefce has not yet released a figure stipulating a minimum threshold of compliance that heis must meet.

the funding councils are, however, considering setting a firm and final date by which all relevant outputs must have met the deposit requirements in order to be eligible for submission… (likely to be three months after the end of the ref publication period). if introduced, this would give an opportunity for institutions to make any inadvertently non-compliant outputs available as oa within the spirit of the policy.

in response to concerns from heis about successfully complying, johnson has given assurances that evidence of ‘best endeavours towards achieving full compliance’ will be satisfactory.

simply put, hefce’s oa policy requires that articles submitted to the post- ref have their metadata and peer-reviewed full text deposited in a repository (either institutional or subject) upon acceptance for publication. in addition, ‘deposited material should be discoverable, and free to read and download, for anyone with an internet connection.’ as one may expect, aspects of the policy have elicited discussion from the hei community.

aucock, . for details of ref please see: http://www.ref.ac.uk/ [accessed on / / ]. hefce, open access in the next research excellence framework: policy adjustments and qualifications, . ibid.

acceptance or publication date?
the policy stipulates that the full text and metadata of an article or published conference paper be deposited in a repository as soon as it has been accepted for publication.
hefce’s reason for choosing the acceptance date stems from the expectation that it will encourage academics, at the point of the publication process when they are most involved, to consider how their scholarship will be distributed. publication of research outputs can sometimes happen months or even years after the acceptance date, at which stage academics are removed from the workflow. requiring their input at the acceptance date will, hefce hopes, engender a change whereby researchers take increasing ownership of the way that their research is disseminated.

some members of the community have, nonetheless, called for hefce to use the publication date as the criterion for ref eligibility. although the notion of a fixed publication date can be ambiguous (for example, do you choose the print, online, or pre-release date?), publication data is currently easily sourced from existing systems such as scopus and web of science. further, as this information is publicly available, institutions do not need to rely on academics to provide it; this is in contrast to the acceptance date, which, until now, has only ever been known (though not necessarily recorded) by the author and the publisher. recent changes to crossref’s publisher guidelines intend to improve this situation so that publishers can record key dates as part of the publication record information which they submit to the reference linking system.

some institutions are also calling on hefce to change from deposit at acceptance to deposit at publication because full bibliographic metadata is not normally available at the point of acceptance. thus, ensuring that the correct metadata is recorded in their systems will involve two workflows for heis: one at the point of acceptance and a second once the output has been published. the development and installation of jisc’s publication router, however, will alleviate this workload by automating the deposit of research outputs and corresponding metadata directly into heis’ research management systems.

hefce, policy for open access in the post- research excellence framework, . see, for example, torsten reimer’s post on imperial college london’s oa and digital scholarship blog (dated march ): https://wwwf.imperial.ac.uk/blog/openaccess/ / / /how-compliant-are-we-with-hefces-ref-open-access-policy-why-open-access-reporting-is-difficult-part- / [accessed / / ]. hefce, policy for open access in the post- research excellence framework, .

exceptions
according to the policy, exceptions can be claimed due to deposit, access, or technical issues. there is also a fourth category for other exceptions which fall outside these defined areas. outputs which have been made oa through the gold route – for example, where an article processing charge (apc) has been paid for rcuk-funded research – fall under the deposit exception category. as the output has already been made freely available it falls outside the scope of the policy. as part of their pathfinder project, pathways to oa, university college london (ucl) has examined the proportion of outputs from their ref submission which would have been treated as exceptions.
this analysis provides other institutions, as well as the funding councils, with insight into how many exceptions should be expected. of the publications that ucl submitted, . % would have been compliant, a figure which is close to hefce’s nation-wide estimation of %. following the release of this data, alan bracey from ucl summarised the situation: ‘[a]lthough the potentially small number of exceptions makes them manageable, institutions still need a process in place, rather than dealing with each exception as it comes up.’ in recognition that exceptions tend to be out of the control of institutions, hefce have stated that the number claimed as part of the ref exercise will not affect results.

for crossref’s guidelines for publishers see: https://github.com/crossref/rest-api-doc/blob/master/funder_kpi_metadata_best_practice.md [accessed on / / ].
hefce, policy for open access in the post- research excellence framework, - . point - in the policy.
ibid, . point /f in the policy.

embargoes

honouring embargoes that publishers have put in place is an important step that heis have to consider when using their repository, as is ensuring that the output is freely available once embargoes have expired. the ref policy stipulates that access embargoes cannot exceed twelve months for science, technology, engineering, and medicine (stem) subjects, and twenty-four months for the arts, humanities, and social sciences (ahss). an exception to these time limits can be claimed if an author and their institution consider a journal that has a non-compliant embargo policy to be the most appropriate route for dissemination.
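the embargo limits described above reduce to a simple lookup, which the following python sketch illustrates. it is a minimal example under stated assumptions rather than an official compliance check: the names `Panel`, `MAX_EMBARGO_MONTHS`, `embargo_within_limit`, and `needs_embargo_exception` are hypothetical, and only the twelve- and twenty-four-month limits are taken from the policy as summarised in the text.

```python
from enum import Enum


class Panel(Enum):
    """broad subject groupings named in the policy summary (hypothetical type)."""
    STEM = "stem"
    AHSS = "ahss"


# maximum access embargo in months, as stated in the text above
MAX_EMBARGO_MONTHS = {Panel.STEM: 12, Panel.AHSS: 24}


def embargo_within_limit(embargo_months: int, panel: Panel) -> bool:
    """return True if the publisher embargo does not exceed the panel's limit."""
    if embargo_months < 0:
        raise ValueError("embargo length cannot be negative")
    return embargo_months <= MAX_EMBARGO_MONTHS[panel]


def needs_embargo_exception(embargo_months: int, panel: Panel) -> bool:
    """return True if the embargo exceeds the limit, so an exception would be needed."""
    return not embargo_within_limit(embargo_months, panel)


if __name__ == "__main__":
    for months, panel in [(12, Panel.STEM), (18, Panel.STEM), (18, Panel.AHSS)]:
        status = "exception needed" if needs_embargo_exception(months, panel) else "within limit"
        print(f"{months} months, {panel.value}: {status}")
```

for instance, an eighteen-month embargo on a humanities article sits within the limit, whereas the same embargo on an engineering article exceeds it and would fall under the exception route described above.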
to help academics decide which journal to publish in, the sherpa/romeo service offers the community a way to check publishers’ self-archiving and embargo policies. in addition, the newly released sherpa ref enables researchers and their institutions to confirm specifically whether the journal which they have published in (or intend to publish in) meets ref oa requirements.

details of the findings can be seen on the pathways to oa blog here: http://blogs.ucl.ac.uk/open-access/ / / /ref-exceptions/ [accessed / / ].
the summary follows on from a workshop held by ucl in early , details of which can be found on the ucl blog post (dated february ) here: http://blogs.ucl.ac.uk/open-access/ [accessed / / ].
hefce, policy for open access in the post- research excellence framework, .

copyright and licenses

there are different degrees of openness: ‘free to read and reuse’ is, for example, more permissive than being just ‘free to read’. the ref policy – which requires that research be free to read, search, and download – advises ‘that outputs licensed under cc by-nc-nd satisfy this minimum, as would outputs licensed under cc by, and other more permissive open licenses.’ as with checking embargo criteria, researchers and institutions can use sherpa/romeo to identify the self-archiving permissions for specific journals and the sherpa ref service to see if a journal is ref compliant. to help institutions further with compliance checking, it is hoped that publishers will improve the way they share licensing information through publication metadata feeds.

repositories – which ones?

researchers are free to deposit their articles or conference proceedings in either institutional or subject repositories.
as well as integrating and fine-tuning their own institutional repositories (ir), libraries and research offices have had to become familiar with the workings of external subject repositories, a process that can be supported by using opendoar, a jisc-managed directory of academic oa repositories. depositing research in a subject repository is standard practice in certain research communities (particularly amongst stem researchers). although the use of a subject repository can conform to hefce guidance, it poses potential challenges, particularly around compliance checking. the connecting repositories (or core) service, which is managed by the open university in partnership with jisc, aggregates oa research outputs from registered repositories and can help institutions monitor what academics are putting into subject repositories. whilst the hei community has raised questions regarding whether certain repositories do or do not meet the ref’s compliance criteria, the funding councils, given the range of subject repositories in operation, are not intending to stipulate which should be used.

hefce, policy for open access in the post- research excellence framework, .
ibid, .

the pathways to oa pathfinder project has investigated the use of subject repositories in light of hefce’s policy. at a january workshop, delegates heard how pubmed central and its european counterpart are developing ways to help institutions check whether items deposited in their systems fulfil ref compliance. there was also discussion on whether funding could be found to help arxiv follow a similar path. if such practices were to become more widespread, then there would be less pressure on those researchers who make good use of subject repositories to also deposit in irs. it would also reduce the administrative burden on library and research office staff.
multi-authors/multi-institutions

finally, discussion has centred on the question of who is responsible for deposit when a paper has multiple authors. hefce does ‘not have a strong view on which author should deposit the output, as long as the paper is deposited by one of the authors.’ indeed, the funding councils ‘see no substantial drawbacks to more than one author depositing the output.’ in theory, then, the output need only be deposited once, something which would avoid duplication of effort.

for a summary of the workshop see ucl blog post dated february : http://blogs.ucl.ac.uk/open-access/ [accessed / / ].
taken from the faqs section of hefce’s oa webpages: http://www.hefce.ac.uk/rsrch/oa/faq/#deposit [accessed on / / ].

an already confused area is further complicated in those instances where co-authors are based at different institutions with different oa practices. as torsten reimer from imperial college london has written, the process is helped by authors and repositories adopting orcid identifiers which distinguish researchers and their research outputs. reimer also calls on publishers to provide better metadata about the outputs that they are publishing. if, for example, the doi (digital object identifier) linked to each output was always passed on to the institution, the cross-checking process would be greatly simplified. in other areas – regarding embargoes and licensing for instance – publishers could support compliance by ensuring that their publications satisfied the funding councils’ requirements. recognising that a significant number of exceptions result from publisher non-compliance, ucl has compiled a freely available list of non-compliant publishers and journals as part of their pathfinder work.
whilst ‘a ref blacklist may be useful for pressuring publishers to change policies’, many heis are hesitant to dictate where their research staff should publish. the sherpa services, and in particular sherpa ref, help in this regard as they allow academics to check for themselves whether the journals they choose to publish in meet hefce’s mandates.

despite the perceived complexities of meeting the ref oa policy, it is important to recognise that no other european country is as advanced as the uk in ensuring compliance with funders’ oa policies. as we shall see, the work undertaken by institutions involved with the oagp initiative has been an essential part of this achievement.

see reimer’s blog post from march here: http://wwwf.imperial.ac.uk/blog/openaccess/ [accessed / / ].
see reimer’s post on the imperial college london blog dated march : https://wwwf.imperial.ac.uk/blog/openaccess/ / / /how-compliant-are-we-with-hefces-ref-open-access-policy-why-open-access-reporting-is-difficult-part- / [accessed / / ].
details can be found on the ucl blog post dated december : http://blogs.ucl.ac.uk/open-access/ / / /ref-exceptions/ [accessed / / ].
taken from the ucl blog post dated february which summarises the ref workshop held on january : http://blogs.ucl.ac.uk/open-access/ [accessed / / ].
for an overview of the uk’s progress see mafalda picarra, ‘uk open access case study’ for pasteur oa project (november , ).
the infrastructure and services which have been developed in a relatively short period of time to help uk heis monitor compliance are world-leading. the majority of uk heis now have repositories, and the ref oa policy will promote their greater use. in addition, many services such as core, orcid, the publications router, and the sherpa services have been developed jointly with the community to ensure that needs are met and mechanisms are widely promoted and adopted.

how institutions are preparing to meet the ref requirements

whilst discussion over certain points within the ref policy continues, chris banks notes that institutions are actively ‘trying to develop frictionless services to support academic endeavour’. locally, different heis are tackling this challenge in a variety of ways. there is no ‘one-size fits all approach’ to implementing oa; however, libraries and research offices are making common preparations for the ref. these arrangements include: baselining their current position in order to identify areas for improvement, streamlining workflows through process mapping, establishing best practices for oa-related cost management, implementing and fine-tuning systems, and initiating advocacy work packages across colleges, schools, departments, and faculties.
drawing from the oagp initiative and pathfinder projects, below are examples of work in these areas.

banks,

benchmarking and policy

benchmarking existing oa-related procedures against those of other, comparable institutions is often the first step for heis that want to assess their own level of preparedness. to help, a number of pathfinder projects have produced case studies detailing approaches to implementation from different institutions. the project led by northumbria university has compiled four studies of varied institutions (durham university, university of lincoln, university of hull, and teesside university), drawing out examples of best practice. similar outputs have also been created by the openworks, pathways to open access, and loch (lessons in open access compliance for higher education) projects. these case studies provide other universities with a useful foundation from which to measure their own progress and help with decision-making. benchmarked data specifying average costs, required skill-sets, and likely responsibilities for staff working on particular oa tasks and activities can, for example, help those institutions who are considering whether they need to increase resources.

to assist institutions in baselining their own position, the hhuloa project (comprising the universities of hull, huddersfield, and lincoln) has created a tool to record current oa activity across uk institutions. the data is updated every six months by contributors to track progress. as chris awre, from the university of hull, describes, ‘[s]eeing what is happening elsewhere need not be a matter of guilt, but could be a chance to make a business case based on progress at the competition!’

the case studies can be found in northumbria and sunderland’s pathfinder blog post dated december : https://oapathfinder.wordpress.com/ / / /oa-good-practice-what-weve-learned-so-far/ [accessed / / ].
details of these outputs can be found on the pathfinder project blogs. openworks: https://blog.openworks.library.manchester.ac.uk/ pathways to oa: http://blogs.ucl.ac.uk/open-access/ loch: http://libraryblogs.is.ed.ac.uk/loch/ [all accessed / / ].

the oxford brookes-led pathfinder project, making sense of oa, developed a successful benchmarking tool known as ciao (collaborative institutional assessment of open access) which has been widely adopted by institutions. the tool requires users to assess their readiness for oa by considering different stages of implementation: envisioning and initiating, discovering, designing and piloting, rolling out, and embedding. one user of ciao was ‘delighted’ when she came across the tool. it was,

…just what i needed to establish where we needed to go with oa. …i was worried there were areas of oa i may be blissfully unaware of, so ciao re-assured me which areas needed to be covered. i sent a copy to the head of our research office, and we then met to go through it. it was a really good document to have in the meeting, as it meant we did not need to define for ourselves what we needed to do, we just needed to work through ciao to see what progress we had already made, and what we still needed to do. it made the meeting really straightforward.

taken from the hhuloa blog post dated february : https://library .hud.ac.uk/blogs/hhuloa/ / / /open-access-baseline-activity-tool/ [accessed / / ].
ciao is available here: https://radar.brookes.ac.uk/radar/items/ b -c f - aae- - fa b / / [accessed / / ].
taken from the making sense of oa blog post dated march : http://sensemakingopenaccess.blogspot.co.uk/ / /making-sense-project-update-march- .html [accessed / / ].

the development and promotion of an institutional oa policy is another aspect of oa implementation where universities benefit from comparing their situation with other heis. in his article ‘making open access work’, pinfield notes that all the evidence suggests institutional mandates are beneficial in the uptake of oa at heis. introducing and successfully communicating an institution-wide policy means that both researchers and support staff have a clear steer when it comes to administering research outputs. to help with this process, jisc, the sherpa services, and roarmap have, in collaboration, developed a schema for oa policies which can be used by institutions when developing their policy. it ‘aims to encourage policy makers worldwide to express their policies in a consistent way.’ once the policy has been created, institutions are encouraged to register it on roarmap, making it visible to other members of the community.

structural workflows

the oa landscape is difficult to navigate at all levels and from all angles. members of the hhuloa project have endeavoured to capture the interrelationships between different services, steps of the publication process, and required actions by creating three uk oa life cycle diagrams, one each for research managers, researchers, and publishers (figures - ).
the coloured circles indicate where responsibilities lie, whether that be at institutional or ‘above campus’ level, or with publishers. the outer circle highlights different aspects of the open access workflows for academic libraries (oawal) initiative, which aims to become a ‘base from which librarians can build their local practices and processes.’ hhuloa has also produced a highly successful oa underground map (figure ) which interlinks the different workflows for key stakeholders. the simple, interactive design allows it to serve as a valuable advocacy tool for researchers and support staff alike.

pinfield,
for more information about the schema see the jisc scholarly communications blog post dated november : https://scholarlycommunications.jiscinvolve.org/wp/ / / /a-schema-for-open-access-policies/ [accessed / / ].
the diagrams are available via the hhuloa blog here: https://library .hud.ac.uk/blogs/hhuloa/ / / /new-oa-life-cycles-for-comment/ [accessed / / ].

as aucock recognises, it is necessary to fully understand the different elements of a workflow in order to make sure it is embedded successfully. at the university of st andrews, staff have carried out a lean review of workflows as part of their oa implementation plan. ‘a year in the life of open access support: continuous improvement at university of st andrews’ details the specific outcomes from this review.
a useful feature of the lean process (and one which may prove helpful to other institutions when they are considering their own workflows) is the identification of ‘wasteful’ activities and features which may hinder the development of efficient processes. interrogating processes using the lean methodology has allowed staff at st andrews to make improvements in key areas including the handling of apcs, communications, and compliance checking and reporting.

emery and stone,
aucock,
the report is available here: http://hdl.handle.net/ / [accessed / / ].

the way in which universities and their researchers handle the deposit process is another area where good practice is being established. case studies outlining different institutional approaches to deposit have been created by the openworks pathfinder project. deposit activities across eight institutions were identified by interviewing library and repository staff. as well as outlining the step-by-step deposit process followed by the individual institutions, the case studies determined the relative costs involved for each method. it was determined, for example, that the annual cost for the workflow used by queen mary university of london (qmul) is £ , whereas for liverpool john moores university (ljmu) it is fractionally higher at £ , . whilst qmul currently receives almost twice as many deposits from researchers as ljmu ( , compared to , ), its overall annual spend is slightly lower due to the employment of a quicker workflow and the use of lower graded staff to manage the process.

assessing which staff members are responsible for what tasks is a fundamental part of implementing oa at the institutional level, whether this relates to developing an oa strategy, paying apc invoices, supporting academics, or managing repository systems.
the loch pathfinder project (led by the university of edinburgh) has produced a responsibility matrix template which lists individual responsibilities and tasks on one axis and job titles on another. identifying who is responsible, accountable, or needs to be consulted, supported, or informed will help streamline workflows, and prevent duplication of effort or communication breakdowns. the loch project also released a sample job description which can be adapted by institutions who are considering hiring new staff. the project also released a survey to the community in january to collect data on a broader scale.

the case studies are being added to the openworks blog here: https://blog.openworks.library.manchester.ac.uk/ [accessed / / ].
the responsibility matrix template can be downloaded from here: http://find.jorum.ac.uk/resources/ / [accessed / / ].
the draft job description is available here: http://hdl.handle.net/ / [accessed / / ].

in many cases, it may not only be necessary to introduce support staff to new oa-specific tasks but also to the theory and ideas that lie behind the oa movement. recognising this training need, the openworks project has developed a toolkit for support staff which includes a guide that offers background information, definitions, and overviews of oa policies, repositories, gold and green oa, and reporting. the toolkit also includes a presentation template which could be employed when training staff, as well as an ‘ask an oa colleague’ feature.
in addition, ‘advocating open access: a toolkit for librarians and research support staff’ is a useful guide produced by the pathways to oa project that offers practical advice for hei staff commencing an advocacy programme for researchers.

the guide is available here: http://www.openworks.online/guide/ [accessed / / ].
the interactive help page is available here: http://www.openworks.online/ask-an-oa-colleague/ [accessed / / ].
the toolkit can be accessed here: http://blogs.ucl.ac.uk/open-access/files/ / /advocacy-toolkit.pdf [accessed / / ].

cost management

although many see the ref policy as steering researchers towards the green oa route, handling apcs will be an inevitable part of the process for heis. the very fact that the mandate refers to ‘publication date’ indicates that hefce expects that most (if not all) of the outputs submitted to the ref will be published, either via the traditional route of a subscription journal, through a hybrid or oa journal where apcs are paid, or via a free-to-publish oa journal. consequently, the improvements in how heis manage oa-related costs, which were first made following the introduction of rcuk’s policy, remain pertinent and continue to be developed. two related services being developed by jisc, monitor local and monitor uk, will help with the management and analysis of oa payments. monitor local enables heis to record and monitor their oa activity, including institutional apc payments, and monitor uk aggregates this data, offering an overview of national expenditure on oa.

for further information about how uk institutions manage their oa funds see the report ‘institutional policies on the use of open access funds’ produced by the pathways to oa pathfinder project. it is available here: http://blogs.ucl.ac.uk/open-access/files/ / /pathways-to-oa-apc-funds-survey-report-dec- .pdf [accessed / / ].

amongst other outputs, gw , the pathfinder project led by the university of bath, has produced the quick guide ‘using purchase cards for apc payments’, a report analysing the administrative costs of processing apc payments, a survey of the published literature on the potential wider market effects of pre-payments on the development of the apc market, and a review of the off-setting deals now being offered by a number of publishers. to help institutions make efficiencies, jisc collections are continuing their negotiations with publishers regarding these hybrid journal agreements. jisc collections are also running a project which seeks to establish new models to support the total cost of ownership of journals. in conjunction with the development of the monitor services, this project intends ‘to raise awareness of the amount of money being paid for apcs, especially to those publishers that also receive large sums in subscription costs.’

details of these outputs can be found on the gw pathfinder project blog: https://gw openaccess.wordpress.com/ [accessed / / ].
more information can be found here: https://www.jisc-collections.ac.uk/global/news% files% and% docs/principles-for-offset-agreements.pdf [accessed / / ].
more information about this project can be found here: https://www.jisc-collections.ac.uk/jisc-monitor/apc-data-collection/ [accessed / / ].

for those institutions considering what resources they will need for apc administration, the gw project has also brought together a collection of sample apc payment workflows. these workflows cover prepaid agreements, credit card payments (including the reconciliation and payment of the credit card bill), and general invoice payments (figures - ). these good practice workflows provide institutions with a starting point from which staff can reflect on their own processes and, as with many of the pathfinder outputs, they can be amended to reflect local needs.

for librarians and research officers who need to make an internal business case to senior managers to receive funds to pay for apcs, the northumbria pathfinder project has developed an apc cost-modelling tool. different cost projections can be modelled depending on a variety of variables (including staff numbers, outputs produced, ref submission targets, and overheads relating to green and gold routes). the results generated from the tool enable institutions to estimate, given their local circumstances, how many apcs could be paid when working to different proposed budgets. the tool was developed from work undertaken by northumbria university that led to the approval of an annual £ , fund for gold oa costs.

systems and metadata

without the appropriate technical systems in place, meeting the ref oa policy would be time-consuming and laborious; without a common metadata profile, interoperability between these systems would be problematic, if not impossible.
the development of the rioxx metadata application profile has been essential to the progress of managing oa outputs in the uk. plug-ins have been developed for eprints and dspace repositories which mean that they can support the application profile. compliance checker plug-ins for both these systems allow institutions to assess whether deposited outputs meet ref oa requirements.

the workflows can be downloaded here: http://find.jorum.ac.uk/resources/ / [accessed / / ].
details of the tool and access to it are available via the northumbria and sunderland pathfinder project blog post dated july : https://oapathfinder.wordpress.com/ / / /cost-modelling-tool-now-available/ [accessed / / ].

members of the end-to-end pathfinder project (led by the university of glasgow) have focused much of their investigation on identifying technical solutions for oa management. staff members at the university of glasgow have, for example, developed an oa metadata specification for eprints which has been widely adopted by users of this repository system. aware that there is often cross-over with other systems development work and consequently a duplication of effort, project members encourage ‘co-operative working’. workshops organised by the end-to-end project have been particularly successful, with one attendee noting that it was ‘a confidence boost to see both the variety of tools available (or soon to be available) to help with implementation and advocacy of oa requirements for ref ’. other institutions involved with the pathfinder projects are investigating the incorporation of oa metadata in their systems; the university of hull is working with hydra, for example, and the university of lancaster with fedora.
some heis participating in the pathfinder initiative use a current research information system (cris) to manage their researchers’ outputs. like the end-to-end project, the loch project has also been involved in developing an oa metadata specification, in this case for pure, the elsevier-run cris. to help instil best practice, staff at the university of edinburgh have also drawn up training material for pure validation checking; for example, as the ref requires that all outputs should be searchable, there is a simple but important reminder that file formats used to save attachments must allow for this.

mccutcheon and eadie, -
taken from a northumbria and sunderland pathfinder project blog post (dated th september ): https://oapathfinder.wordpress.com/ / / /open-access-and-the-research-excellence-framework-workshop/ [accessed on / / ].
the training documentation is available here: http://hdl.handle.net/ / [accessed / / ].

though not a direct objective of the pathfinder projects, a number of institutions have also volunteered to test systems which are being developed by jisc and other system developers. the universities of hull, glasgow, cardiff, huddersfield, lincoln, liverpool, manchester, and ucl are, for instance, all trialling the monitor local and monitor uk apc management software. consequently, an important part of the work for those involved in the oagp initiative has been to share with other heis their experience of new systems and technologies which are being specifically developed to meet oa requirements, and to encourage their uptake.
advocacy

librarians and research officers are having to think creatively about how to encourage researchers to understand and meet the new oa requirements. ‘the most important players of all’, aucock notes, ‘are authors and researchers and for an oa policy to be effective they need to feel engaged with the process.’ to understand more clearly what motivates authors, the making sense of oa project has undertaken in-depth advocacy work with academic staff. interviews with researchers enabled support staff to achieve a better understanding of academics’ attitudes towards and behaviours around oa. this ethnographic methodology led to the creation of miao (my individual assessment of open access) which provides academics with a framework to evaluate their own preparedness for moving to oa. the coventry-led open to open access project took a similar, ethnographic approach and, following interviews with researchers, drew up a needs assessment report which identifies the drivers and barriers to oa that academics encounter.

aucock,
miao is available here: https://radar.brookes.ac.uk/radar/items/eff b -c be- e a- a - fa ea / / [accessed / / ].

the pathfinder projects have developed a variety of advocacy tools to help institutional staff engage with their academics. some, like the ref eligibility poster (figure ) created by the making sense of oa project, offer a soft-touch approach. free bookmarks and postcards with oa-related information, email signatures pointing academics to oa webpages, and institutional oa twitter accounts are other enterprising methods being adopted. for more focused engagement and training, the open to open access project has created a useful oa lifecycle flowchart (figure ) which can be employed during workshops.
other material which can be adapted by institutions for oa training sessions includes pre- and post-workshop questionnaires by the university of portsmouth (as part of the making sense of oa project). this support package also includes a powerpoint presentation which can be modified to suit local needs. finding the balance between providing enough information so that academics are informed and feel part of the process, but not overwhelming them with unnecessary details so that they become disengaged and frustrated, is a shared problem for heis across the uk. an online decision-making tool for researchers is a valuable feature of the university of northumbria’s oa webpages which meets this challenge. academics are required to answer questions about their research output and the tool identifies what publication and/or deposit options are available. it also directs them to further information and support if necessary. the process is simple and the outcome clear for users.
note: this work led to the development of an intervention mapping guide for understanding researcher behaviour. the report is available here: http://blogs.coventry.ac.uk/researchblog/wp-content/uploads/sites/ / / /o oa-needs-assessment-report-final-v .pdf the guide is here: http://blogs.coventry.ac.uk/researchblog/wp-content/uploads/sites/ / / /o oa-intervention-mapping-output-oct- .pdf [both accessed / / ].
note: the flowchart can be downloaded here: http://find.jorum.ac.uk/resources/ [accessed / / ].
note: this material is available here: https://radar.brookes.ac.uk/radar/items/dc f a- e e- ce - c - c b a / / [accessed / / ].
note: the tool homepage is here: http://northumbriauniversity.libsurveys.com/loader.php?id=ee dd e a febc b [accessed / / ].
conclusion
pinfield suggests that the uk higher education community has moved from the position of debating whether oa should be a part of the scholarly communications landscape to asking how we can successfully adopt it on a wide scale. the next phase will be to consolidate this work and move to the state of ‘business as usual’. writing in the summer of , alma swan and caroline sutton acknowledged that ‘creating and managing a sustainable oa infrastructure is a challenging task and much more joint, collaborative effort is needed to move successful projects and experiments into the mainstream.’ they cite the oagp initiative and pathfinder projects as one example of such effective collaboration.
clair waller from the university of kent and the end-to-end pathfinder project has summarised the benefit of the scheme thus: ‘the way in which institutions are working together to develop solutions is really amazing and the pathfinder projects seem to be a great way of achieving this. it’s good to know we are not alone’.
note: pinfield, .
note: swan and sutton,
note: taken from an end-to-end oa pathfinder project blog post (dated th august ): http://e eoa.org/ / / /repository-fringe-part- / [accessed on / / ].
although there are challenges to implementation, the benefits from making research openly available are extensive. the oagp initiative has been successful at developing and providing the tools to help heis realise these benefits. throughout this article we have seen how institutions are successfully exploring and adopting good practice in preparation for the ref. institutional policies are being established; cost management processes and internal workflows are being streamlined and embedded; systems are being developed and integrated; and advocacy programs are underway. support for all these endeavours is available to institutions through the oagp community. participation and use of the resources developed as part of the initiative, as well as those services being released by jisc and its partners, will help heis achieve successful implementation of oa as well as meet hefce’s ref requirements.
figures
figure . uk open access life cycle for research managers. hhuloa pathfinder project: stone, g., awre, c., stainthorp, p., and emery, j. ( ) cc by
figure . uk open access life cycle for researchers. hhuloa pathfinder project: stone, g., awre, c., stainthorp, p., and emery, j. ( ) cc by
figure . uk open access life cycle for publishers. hhuloa pathfinder project: stone, g., awre, c., stainthorp, p., and emery, j. ( ) cc by
figure . oa underground map. hhuloa pathfinder project: stone, g., awre, c., stainthorp, p., and emery, j. ( ) cc by
figure . pre-paid agreement sample workflows. gw pathfinder project: jones, f., jones, s., smith, k., jones, k., and holliday, l. cc by
figure . credit card payment sample workflow. gw pathfinder project: jones, f., jones, s., smith, k., jones, k., and holliday, l. cc by
figure . invoice payment sample workflow. gw pathfinder project: jones, f., jones, s., smith, k., jones, k., and holliday, l. cc by
figure . ref eligibility poster. making sense of oa pathfinder project: bennett, e. (modified from a poster produced by hefce.) cc by
figure . open access and the research lifecycle: a guide for researchers. open to open access pathfinder project: dimmock, n., jones, k., and pickton, m. cc by
bibliography
aucock, j., ( ).
‘managing open access (oa) workflows at the university of st andrews: challenges and pathfinder solutions.’ insights. ( ): – . doi: http://doi.org/ . / - .
banks, c., ( ). ‘focusing upstream: supporting scholarly communication by academics.’ insights. ( ): – . doi: http://doi.org/ . /uksg.
earney, l., ed. ( ). ‘open access infrastructure’ issue, information standards quarterly. ( )
emery, j. & stone, g., ( ). ‘the sound of the crowd: using social media to develop best practices for open access workflows for academic librarians (oawal).’ collaborative librarianship. ( ): - .
hefce, (march , updated july ). policy for open access in the post- research excellence framework, hefce /
hefce, (july ). open access in the next research excellence framework: policy adjustments and qualifications, circular letter / .
kerridge, s. & ward, p., ( ). ‘open access for ref .’ insights. ( ): – . doi: http://doi.org/ . / - .
mccutcheon, v. & eadie, m., ( ). ‘managing open access with eprints software: a case study.’ insights. ( ): – . doi: http://doi.org/ . /uksg.
pinfield, s., ( ). ‘making open access work.’ online information review. ( ): - . doi: http://dx.doi.org/ . /oir- - -
picarra, m., ‘uk open access case study’ for pasteur oa project (november , ).
pontika, n. & rozenberga, d., ( ). ‘developing strategies to ensure compliance with funders’ open access policies.’ insights. ( ): – . doi: http://doi.org/ . /uksg.
swan, a. & sutton, c., ( ). ‘sustainability of an oa infrastructure’ in earney, l., ed. ‘open access infrastructure’ issue, information standards quarterly. ( ): http://doi.org/ . /uksg.
js/dh: an introduction to jewish studies/digital humanities resources
judaica librarianship volume - - -
michelle chesner, columbia university, mc @columbia.edu
recommended citation: chesner, michelle. . "js/dh: an introduction to jewish studies/digital humanities resources." judaica librarianship : - . doi: . / - . .
the world in which we live is probably best described as a hybrid between the physical and the digital. as librarians working closely with the humanities, we encounter this dichotomy on a daily basis, purchasing both print and e-books via our computers, teaching the use of citation software and academic writing, and using an online catalog to access print monographs. increasingly, the use of the digital, and the multi-faceted term “digital humanities,” is becoming part of our daily conversation. at its simplest definition, writing this article could be considered digital humanities, since it is written on a computer, in digital form, and describes a field of the humanities. however, as scholars and librarians (and librarian scholars) are producing more and more complex digital works of scholarship and bibliography or reference, there is a need to evaluate them, just as we evaluate print scholarship and reference works. following the lead of other scholarly resources, judaica librarianship will be premiering this new column to evaluate digital projects.
librarians, especially those who provide reference assistance on a consistent basis, have to be familiar with all the resources (or those resources that identify the resources) in the field that they serve. many universities have some form of “digital center” in their libraries, but librarians not working directly with the centers are too often not familiar with the kinds of services they provide. even if one has never heard words like html, python, django, ruby, github or the many others that make up the jargon of tech-speak, digital humanities projects should still be essential to librarians who work with researchers. as with any other resources, we need to ensure that we are familiar with digital humanities resources: those grassroots projects that are generated by scholars rather than vendors, and are often very useful, even in their “beta” versions. it was for this reason that i began aggregating sites that i called “dh jewish,” i.e. those sites that use digital techniques to advance scholarship in the field of jewish studies. this new column will provide reviews and information about these resources in the digital field.
note: a new column reviewing digital humanities projects in the field of jewish studies. to submit a project for review, or to request to review a project, please contact michelle chesner at mc @columbia.edu.
note: see, for instance, a recent report of the coalition for networked information (cni) on a workshop arranged by cni and the association of research libraries, and attended by a hundred librarians from academic libraries around the country (goldenberg-hart ). another important collection of articles on digital humanities in the library is christian-lamb et al. ( ). digital humanities in the library is beyond the scope of this column, but for those with interest in this area, the dh+lib website, hosted by the association of college and research libraries (http://acrl.ala.org/dh), and associated publications such as an email roundup of research scholarship are quite informative.
note: the data can be found at http://bit.ly/jewishdhprojects (accessed december , ). it is a google spreadsheet populated by a form (hosted at http://www.thedigin.org/jewish-studies-dh-projects/) that still receives submissions. i created the spreadsheet for myself because i could not find an aggregated collection of dh projects in jewish studies. as far as i know, this is the only “database” of jewish studies dh projects in existence. please contact me if you know of a more formal repository for these projects, and i would be happy to submit the information that i have collected. note that the aggregated collection of digital projects includes standard digitization projects, which are out of the scope for this column.
m. chesner / judaica librarianship ( ) –
for the purposes of this column, digital humanities means projects that use digital technology to advance research in a way that could not have been done before the digital age. we will not be reviewing sites that solely feature digitized manuscripts or other digital facsimiles from one collection, unless there is a unique factor to the digitization or display that adds value to research beyond a digital reproduction of the item. such exceptions include sites like the british library’s polonsky foundation catalog of digitized hebrew manuscripts. this comprehensive site includes all of the metadata for the manuscripts (both electronic and physical), which can be used as a resource for many kinds of visualizations (mapping, timelines, genre breakdowns, etc.).
another, older, example is the bezalel narkiss index of jewish art, which has aggregated a quarter of a million images of visual materials from about seven hundred collections around the world to create a resource for teaching and studying jewish art. the friedberg genizah project, now one of many projects on jewishmanuscripts.org, went through many iterations before reaching its present state, and was one of the first digitization projects to utilize digital technology to allow new kinds of research in jewish studies. jews in america is also an aggregation project, allowing people to search over , items from nearly four hundred collections, all of which relate to american jewish history.
the european holocaust research infrastructure (ehri) is an aggregation project as well, gathering and providing access to resources on the holocaust, but it also sponsors workshops, symposia, and other events to build a community of scholars around the study of the holocaust. another community of scholars can be found in the digital yiddish theatre project. the dytp is essentially an encyclopedia-in-process, with short and long articles about performers, theater groups, genres, and other topics. like ehri, it sponsors academic events surrounding yiddish theater, bringing people together in their shared topic of study.
other kinds of digital humanities projects include text-based initiatives, like poetrans, the index of poetry translations into hebrew. this is a kind of digital reference book, something that would have been published as an index in the past, but is far more accessible in digital form. sefaria allows users to jump from text to commentary or other references, and then back again with just a few clicks. the digital mishna project, on the other hand, is working on creating a digital critical edition of the mishna based on many witnesses. this is an in-process project, and so we will be watching to see what it produces.
yerusha is another great example of an in-process project that will become a remarkable resource for scholars upon its completion. its goal is not the digital publication of primary sources per se, but rather to provide information about collections scattered across europe so scholars know what the primary sources are, where they can be found, and, most importantly, how to access them. footprints is similar in that way, collecting scattered data about the movement of jewish books into a format that allows researchers to locate materials that otherwise may not have been identified as relevant resources. the relatively new digital dh at the penn libraries has been working on some incredible projects very recently, such as the “geniza scribes” collaboration with zooniverse to identify paleographical scripts in genizah fragments.
note: all of the projects listed in this article come from the “dh jewish” list and are described in further detail there.
project urls: https://www.bl.uk/hebrew-manuscripts http://cja.huji.ac.il/browser.php http://www.jewishmanuscripts.org/ http://www.jewsinamerica.org https://ehri-project.eu https://yiddishstage.org/ http://www.poetrans.org/poetrans http://sefaria.org http://www.digitalmishnah.org/ http://yerusha.eu http://footprints.ccnmtl.columbia.edu
but the purpose of this column is not simply to describe the various projects in existence. initially inspired by the american historical review’s recent commitment to review digital projects as academic publications (lichtenstein ), this column should be viewed as a partner to the reviews section in this journal. it is a place for long-form essays describing the merits and drawbacks of a digital project. since digital projects are often iterative and published in “beta” versions, it is also a place to provide productive feedback to a project creator.
where relevant and possible, we will publish a response from the project creator to provide context or future plans for a project.
to allow the broadest access for both viewers and readers, only freely available digital humanities resources will be reviewed in this column. this is not a venue to advertise or critique databases for purchase or subscription, as that can be done in the reviews section of this and other journals.
in an age of digital information, new resources seem to pop up daily. it is the job of librarians to evaluate these sources to decide what is best for research and their users. when one is not familiar with the environment or the medium of the resource, however, evaluation of sources, and sometimes the source itself, is left out of the scholarly conversation, causing researchers to overlook important sources for their work. it is my hope that this column will go a long way in assisting its constituents with this daunting task.
note: this is a project i co-direct, with adam shear, josh teplitsky, and marjorie lehman.
https://judaicadh.github.io/about
sources
christian-lamb, caitlin, zach coble, thomas padilla, et al. . “digital humanities in the library / of the library: a dh+lib special issue.” accessed january , . dh+lib: where the digital humanities and librarianship meet. http://acrl.ala.org/dh/ -special-issue/.
goldenberg-hart, diane. . “report of a cni-arl workshop: planning a digital scholarship center .” accessed january , , https://www.cni.org/wp-content/uploads/ / /report-dscw .pdf.
lichtenstein, alex. . “introduction.” the american historical review ( ): – . doi: . /ahr/ . . .
an overview of the th international conference on theory and practice of digital libraries (tpdl )
d-lib magazine, november/december , volume , number /
vittore casarosa, institute for information science and technologies (isti), italian national research council (cnr), pisa, italy, casarosa@isti.cnr.it
ana pervan, intern at the european organization for nuclear research (cern), meyrin, switzerland, ana.pervan@cern.ch
doi: . /november -casarosa
abstract
the th international conference on theory and practice of digital libraries (tpdl) took place in valletta, malta, during september - , . a diverse community of participants and their different research approaches gave an international and interdisciplinary feel to this year's conference. the general conference theme was "sharing meaningful information". approximately delegates from more than countries presented and discussed challenges and opportunities of digital library architecture, interoperability and information integration, digital library interfaces, user behavior, data re-use and open access, linked data, data visualization, long-term preservation, semantic web in digital libraries and digital curation.
introduction
the th international conference on theory and practice of digital libraries (tpdl) took place in valletta, malta, during september - , . the general chairs were milena dobreva from university of malta (malta) and giannis tsakonas from university of patras (greece).
they, along with the program chairs trond aalberg from norwegian university of technology and science (norway) and christos papatheodorou from the ionian university (greece), organized a very successful conference. sponsors and co-organizers of the conference were the university of malta, cost (european cooperation in science and technology) and the unesco national commission in malta, which for a few days transformed valletta from a tourist capital into a digital library-oriented capital. malta's government was also one of the active supporters of the conference. the minister for education and employment of malta, evarist bartolo, opened the conference by highlighting the importance of technology in a changing library world and confirming the attendees' interest in the development of the digital library disciplines. by receiving the "green light" from a representative of malta's government, the tpdl conference was officially opened.
conference highlights
a diverse community of participants and their different research approaches gave an international and interdisciplinary touch to this year's conference. academics, practitioners, developers, students and users gathered in order to share new ideas and discuss the current "hot topics" in the field of library and information science. the general conference theme "sharing meaningful information" was divided into four broad areas, namely: digital library infrastructure, foundation, content and services. about delegates from over countries presented and discussed challenges and opportunities in digital library architecture, interoperability and information integration, digital library interfaces, user behavior, data re-use and open access, linked data, data visualization, long-term preservation, semantic web in digital libraries and digital curation.
two keynote speakers captured the main tpdl objective, which was developing an interdisciplinary approach to digital libraries. christine l.
borgman opened the conference by giving a speech on "digital scholarship and digital libraries: past, present and future". in her talk, borgman stressed the importance of writing and publishing scientific research papers while also keeping in mind data re-use. the second keynote speaker, sören auer, closed this year's tpdl conference by delivering a speech on "what can linked data do for digital libraries?". in his talk, digital libraries were presented as knowledge hubs whose main purpose is to create knowledge through the sharing of content by the means of linked data. the presentation started with a very concise introduction to linked data, showing how linked data can provide a semantic web on top of the existing "hyperlink web", allowing a more meaningful navigation and discovery of interesting information. two panel sessions covered some of the issues about the present state and the future of digital libraries. the first one, entitled "cost actions and digital libraries: between sustaining best practices and unleashing further potential" was focused on showing how multi-national cooperation could bring benefits to on-going digital library research. cost is a program of the european union aimed at strengthening europe's scientific and technical research capacity by supporting cooperation and interaction between european researchers. the second panel "e-infrastructures for digital libraries...the future" was focused on the ways in which new research methods, based on intensive computing and "big data", enable new means and forms for scientific collaboration. research and collaboration will be supported by e-infrastructures, allowing researchers to access remote facilities and manage and exchange large amounts of digital data. 
in addition to the technical sessions, with presentations and discussions of scientific papers, there was also what is becoming the "usual" minute madness session, in which participants were allocated one minute each to present the posters and the demos that could be seen during the conference. posters and demos covered a wide range of topics, from data curation and preservation to advanced search and retrieval, from recommender systems to semantic web and linked data.
the two main topics in the closing session were the announcement of the winners of the "tpdl best paper award" and the venue and dates of next year's conference. there were three categories for the best paper award. the winners were:
best paper award: "an unsupervised machine learning approach to body text and table of contents extraction from digital scientific articles" by stefan klamp and roman kern
best student paper award: "who and what links to the internet archive" by yasmin alnoamany, ahmed alsum, michele c. weigle, and michael l. nelson
best poster/demo: "country-level effects on query reformulation patterns at the uk national archives" by steph jesper, paul clough and mark hall.
for the next tpdl conference, the important news is that in , the jcdl conference normally held in the us and the tpdl conference normally held in europe will join forces and organize one single event, which will be held on - september in london. so save the date and plan to attend what is expected to be an extraordinary event, where innovative ideas, interdisciplinary approaches and novel results will be presented and discussed.
satellite events
as in previous tpdl gatherings, a number of events related to the themes of the conference were held immediately before or after the conference. a day-long doctoral consortium event gathered doctoral candidates and gave them a chance to present their research, discuss, share ideas and get guidelines for improving their current work.
six contributions were presented and discussed, showing the diversity and breadth of the library and information science field: development of a methodology for automatically positioning electronic publications into the "universal decimal classification" system; integration of life science resources; modeling of archival user needs; digital libraries exploration using automatic text summarization; folksonomy-based resource recommendations for databases and digital libraries; and a knowledge organization approach to scientific trends exploration.
the day before the conference was dedicated to tutorials, and this year there were six half-day tutorials, some of them offering hands-on experience. the topics offered were: linked data for digital libraries; from preserving data to preserving research: curation of process and context; state-of-the-art tools for text digitisation; mapping cross-domain metadata to the europeana data model (edm); resourcesync: the niso/oai resource synchronization framework; the role of xslt in digital libraries, editions, and cultural exhibits. as can be seen, long-term preservation and the digitisation and management of data in the humanities continue to be two very relevant topics.
continuing a tradition, a number of workshops were held at the end of the conference, bringing together academics and practitioners to discuss challenges, issues and opportunities on topics of current interest. the workshops were: practical experiences with cidoc crm and its extensions; moving beyond technology: ischools and education in data curation. is data curator a new role?; the nd international workshop on supporting users exploration of digital libraries; the rd international workshop on semantic digital archives; networked knowledge organisation systems and services: th european networked knowledge organisation systems workshop; linking and contextualizing publications and datasets: paving the way towards modern scholarly communication.
the novelty of this year's tpdl conference was the cooperation with the ischool community through a global workshop of ischools (g-wis). the main aim of g-wis was to support the collaboration between different ischools and foster the development of this movement on a global scale. students of dill (digital library learning), an international master's programme started in the framework of the erasmus mundus program of the european union and presently carried on by three partner universities, were invited to actively participate and present their work during the ischool session. the topics covered were: tools for preserving digitized special collections, digital curator competences, data visualization and fostering learning in digital environments. the cooperation between tpdl and ischools was perceived as a good opportunity for students and young researchers to network, gain new ideas and get a broader view of the academics' and practitioners' research approaches.
finally, the social events included an opening reception, which was held along with the poster and demo session. thanks to the very good weather the reception was held in the open air, on the massive ramparts surrounding the venue of the conference. of course, there was also a social dinner, held in a characteristic restaurant in the old town of mdina, malta's old capital. a number of tours to valletta (malta's capital) and to gozo, the nearby island, were also offered to the attendees' guests.
proceedings of this year's tpdl conference were published by springer, in the lncs series, entitled "research and advanced technology for digital libraries".
about the authors
vittore casarosa graduated in electrical engineering at the university of pisa.
after a few years spent as a researcher at cnr (the italian national research council), he spent many years in the r&d laboratories of ibm in italy, france and in the u.s., conducting and managing research mostly in image processing and networking. since , he has been a senior research associate of the italian national research council at isti, where he is associated with the activities of the multimedia laboratory in the field of digital libraries; from to he was deputy director of delos, the network of excellence on digital libraries. from to he collaborated with hatii at the university of glasgow for training activities on long term preservation of digital objects. since he has taught courses on digital libraries at the open university of bolzano, at the university of parma and at the university of pisa.
ana pervan is enrolled in the international master's degree program digital library learning, which is offered in cooperation between oslo and akershus university college of applied sciences (norway), tallinn university (estonia), and the university of parma (italy). she holds a master's degree in the knowledge management field of information science from the faculty of humanities in osijek, croatia. her two main research interests are data curation and the creation, representation and re-use of scientific data. currently she works as an intern in the gs-sis department at cern.
copyright © vittore casarosa and ana pervan
in memoriam
this listing contains names received by the membership office since the march issue. a cumulative list for the academic year – appears at the mla web site (www.mla.org/in_memoriam).
margaret mather byard, salisbury, connecticut, may
a. dwight culler, yale university, january
philip leslie gerber, state university college of new york, brockport, january
john m. gill, palo alto, california, may
john greist hanna, university of southern maine, gorham, february
kenneth alan hovey, university of texas, san antonio, may
ronald george keightley, monash university, australia, april
gwin j. kolb, university of chicago, april
richard d. lockwood, rutgers university, new brunswick, march
nancy adams malone, naugatuck valley community college, ct, may
cynthia marshall, rhodes college, august
scott mcmillin, cornell university, march
paul ricoeur, university of paris, france, and university of chicago, may
michael riffaterre, columbia university, may
augusto roa bastos, toulouse, france, april
nigel eric smith, tours, france, april
robert wesley swords, elmhurst college, march
stephen vasari, california state university, fullerton, march
eugene l. williamson, jr., university of alabama, tuscaloosa, march

pmla [ © by the modern language association of america ]

lost bodies inhabiting the borders of life and death laura e. tanner “lost bodies offers an engaging and imaginative exploration of death, dying, and grief through original readings of a rich array of contemporary texts: poetry, fiction, photography, and even textiles. laura tanner makes the issue of loss in our contemporary culture vivid and compelling.” —peter balakian, colgate university $ . cloth, $ . paper

treason by words literature, law, and rebellion in shakespeare’s england rebecca lemon “in some of the book’s most exciting sections, lemon shows the dangerous legal and political consequences of treason’s drift from action to language.” —john watkins, university of minnesota $ .
cloth

angels on the edge of the world geography, literature, and english community, – kathy lavezzo “lavezzo explains how england in the middle ages managed to justify its position on the geographical margins of christendom by producing some of the finest verbal and visual mappae-mundi of the period.” —jerry brotton, queen mary, university of london $ . cloth, $ . paper

the growth of the medieval icelandic sagas ( – ) theodore m. andersson “this strikingly original book by theodore m. andersson, who knows more about the craft of saga-writing in medieval iceland than anyone else, crowns four decades of his writings on these extraordinary texts.” —roberta frank, yale university $ . cloth

trailing clouds immigrant fiction in contemporary america david cowart “david cowart provides original and nuanced readings and enriches our understanding of immigrant fiction . . . . while not condoning a lot that is wrong with america, cowart listens to his authors and to all that they are grateful for in their new lives.” —kathryn hume, pennsylvania state university $ . cloth, $ . paper

collaborations with the past reshaping shakespeare across time and media diana e. henderson “diana e. henderson’s close readings attend in often breathtaking detail not only to literary and cinematographic subtleties of the specific works under discussion but also to the various historical contexts within which these uses of shakespeare function.” —douglas m. lanier, university of new hampshire $ . cloth

infamous commerce prostitution in eighteenth-century british literature and culture laura j. rosenthal “infamous commerce offers a rich and interesting discussion of how the meaning and function of prostitution altered during the restoration and eighteenth century.” —kathryn temple, georgetown university $ . cloth

cornell university press www.cornellpress.cornell.edu - - -

penguin group (usa) www.penguin.com/academic academic marketing department, hudson street, new york, ny

iron heel jack london edited with an introduction by jonathan auerbach and notes by jordan schugar. “a truer prophecy of the future than either brave new world or the shape of things to come.”—george orwell. london’s grim depiction of warfare between the classes is part science fiction, part dystopian fantasy, part radical socialist tract. penguin classics pp. - - - $ .

libra don delillo with a new introduction by the author. “a thriller of the most profound sort.”—chicago tribune. in a new introduction delillo reexamines the evidence surrounding oswald’s role in the assassination of jfk, as well as his place in popular culture. penguin pp. - - - $ .

from a crooked rib nuruddin farah a somalian girl tries to escape the practical servitude of female existence in the first novel from “one of the most sophisticated voices in modern fiction” (the new york review of books). penguin pp. - - - $ .

collected stories wallace stegner introduction by lynn stegner. “exemplary stories... the reader of stegner’s writing is immediately reminded of an essential america...a distinct place, a unique people, a common history, and a shared heritage.”—los angeles times. penguin classics pp. - - - $ . also new in penguin classics: american places - - - $ .

the outsiders s. e. hinton introduction by jodi picoult. first published in , hinton’s novel still resonates with its powerful portrait of the bonds and boundaries of friendship. “taut with tension, filled with drama.”—chicago tribune. penguin classics pp. - - - $ .

the log of a cowboy andy adams edited with an introduction and notes by richard w. etulain. “the most significant fictional treatment of the cattle drive alongside larry mcmurtry’s lonesome dove.”—richard w. etulain, in his introduction. penguin classics pp. - - - $ .

the custom of the country edith wharton edited with an introduction and notes by linda wagner-martin. “as long as men and women seek to use each other—and to use each other badly—edith wharton can be counted upon to provide the ideal commentary.”—anita brookner. penguin classics pp. - - - $ .

the annotated archy and mehitabel don marquis edited with an introduction and notes by michael sims. “our closest spiritual descendent of mark twain.”—christopher morley. reprinted for the first time since they appeared in his newspaper columns, marquis’ poems revolve around a streetwise alley cat, the reincarnated cleopatra, and a poet reincarnated as a cockroach. penguin classics pp. - - -x $ .

new from penguin group (usa)

the school of criticism & theory at cornell university, an international program of study with leading figures in critical theory, invites you to apply for its thirtieth summer session june -july , in new york state’s finger lakes region

the program: in an intense six-week course of study, participants from around the world, in the disciplines of literature, history, and related social sciences, explore recent developments in literary and humanistic studies.

tuition: the fee for the session is $ . applicants are eligible to compete for partial tuition scholarships and are urged to seek funding from their home institutions.

acceptance: applications from faculty members and advanced graduate students at universities worldwide will be judged beginning march , . admissions are made on a rolling basis, and decisions are announced as soon as possible.

for further information or to apply, write: the school of criticism and theory cornell university, a. d. white house, east avenue, ithaca, n.y.
telephone: - - email: humctr-mailbox@cornell.edu fax: - -

faculty: -week seminars
amanda anderson, caroline donovan professor of english literature, johns hopkins university, “literary theory/political theory”
brent hayes edwards, associate professor of english, rutgers university, “black intellectuals”
eric santner, philip and ida romberg professor of modern germanic studies, university of chicago, “on creaturely life”
ella shohat, professor, new york university, and robert stam, university professor, new york university, “travelling debates in translation: eurocentrism, multiculturalism, and postcoloniality”

mini-seminars
alain badiou, ecole normale supérieure, “towards a new concept of the relation between philosophy and non-philosophy”
judith butler, maxine elliot professor of rhetoric and comparative literature, university of california at berkeley, “violence and critique”
geoffrey hartman, sterling professor emeritus and senior scholar, english and comparative literature, yale university, “poetry and divinity in contest”
stephen g. nichols, james m. beall professor of french and humanities, johns hopkins university, “revolution and counter-revolution”
haiping yan, professor of critical studies, school of theatre, film, and television, ucla; zijiang professor of the arts and humanistic studies, east china university, shanghai, china, “on theatricality”

“the sct program took interdisciplinarity to a whole new level.” david marshall, johns hopkins university

“although i have studied in a number of different countries, sct still surprised me with its exceptionally wide international range.
i spent many nights with a map of the world and with an encyclopedia, just to make sure i knew the context in which to place the long conversations with my fellow participants.” eneken laanes, university of tartu

“i have had a fantastic and extremely rewarding experience throughout the six weeks, and i feel that sct has made an invaluable contribution to my development as a scholar.” susan antebi, harvard university

“sct changed not only many of my conceptions but my entire way of seeing problems.” silvana seabra de oliveira, catholic university of minas gerais

dominick lacapra, director, bowmar professor of humanistic studies, cornell university

university of toronto press available in better bookstores - call - - - - www.utppublishing.com

thomas hardy reappraised essays in honour of michael millgate edited by keith wilson keith wilson pays tribute to millgate’s many contributions to hardy studies by bringing together new work by fifteen of the world’s most eminent hardy scholars. together, these contributors offer graphic testimony to hardy’s enduring popularity and importance. cloth $ . june

disraeli’s disciple the scandalous life of george smythe mary s. millar one of the most intriguing relationships in victorian history is that between george smythe and benjamin disraeli. while smythe’s friendship was central to disraeli’s rise to political power, little has been written about his life. disraeli’s disciple is the first comprehensive biography of a fascinating figure and will change the way we view victorian england. cloth $ . june

unsettling partition literature, gender, memory jill didur unsettling partition reinterprets the silences found in women’s accounts of sectarian violence that accompanied india’s partition. didur argues that these silences in women’s stories should not be resolved, accounted for, translated, or recovered but understood as a critique of the project of patriarchal modernity. cloth $ .
desiring women the partnership of virginia woolf and vita sackville-west karyn z. sproles sexy and provocative, desiring women re-imagines woolf and sackville-west as daring, funny, beautiful, and bent on resisting the repression of women’s desires. sproles explores the dynamics of their relationship through literature, biography and psychoanalysis. cloth $ . / paper $ .

nyu in madrid–department of spanish and portuguese faculty of arts and science, new york university university place, th floor, new york, ny - telephone: - - ; fax: - - ; e-mail: nyu-in-madrid@nyu.edu

drawing on the resources of nyu, the city of madrid, and professors from both spanish universities and the nyu department of spanish and portuguese in new york, we offer a newly redesigned m.a. program that is both intellectually stimulating and academically rigorous. m.a. candidates study at el viso, a residential area of madrid very close to the center of the city, as well as at the historic instituto internacional. the new site boasts state-of-the-art classrooms and computer facilities. upon approval, students may also choose to take courses at the universidad autónoma de madrid.

course offerings include: the yearlong course, a cultural history of spain and latin america, is taught by faculty from leading spanish universities and from the nyu department of spanish and portuguese in new york.
courses in the literatures and cultures concentration range from jews in medieval spain, cervantes, pictorial traditions in spain and its latin american colonies— th- th centuries to electives on th-century spanish and latin american literatures. offerings for the language and translation concentration include the theory and practice of translation, problems in spanish syntax for bilingual communication, and the teaching of spanish as a foreign language. all courses are taught in spanish. nyu in madrid also offers an undergraduate program for the academic year, fall, spring, or summer. courses are taught in spanish and english.

new york university in madrid: a one-year m.a. program in spanish and latin american languages and culture with a concentration in either spanish and latin american literatures and cultures or spanish language and translation

nyu in paris–department of french faculty of arts and science, new york university university place, th floor, new york, ny - telephone: - - ; fax: - - ; e-mail: nyuparis@nyu.edu

drawing on the resources of nyu and the city of paris, our m.a. programs are small, personalized, and of a very high degree of quality. m.a. candidates study at the nyu in paris center, located in a charming town house in a quiet garden setting in the th arrondissement. courses at the university of paris, weekly workshops, and guest lecturers, plus our own computer facilities and research library complement the programs.

course offerings include history of french colonialism; french classical tragedy; autobiography and autofiction; the age of enlightenment; civilization of contemporary france; textual analysis; parole, nation, ecriture: the novel in francophone caribbean and africa; women writers in french literature; contemporary french theatre; french cultural history since . all graduate courses are conducted in french. nyu in paris also offers an undergraduate program for the academic year, a semester, or summer.
courses are taught in french and english.

new york university in paris: m.a. programs in french literature (completed in one academic year) and in french language and civilization (completed in one academic year or three to four consecutive summers)

new york university is an affirmative action/equal opportunity institution.

new from palgrave macmillan distributor of berg publishers, i.b.tauris, manchester university press, and zed books ( ) - • fax: ( ) - • www.palgrave-usa.com

shakespeare’s entrails skepticism, solitude and the interior body david hillman palgrave shakespeare studies pp. / - - - / $ . cl.
three seventeenth-century plays on women and performance edited by hero chalmers, julie sanders and sophie tomlinson revels plays companions library pp. / - - - / $ . cl. manchester university press
drama of the english republic, - plays and entertainments janet clare the revels plays companion library pp. / - - - / $ . pb. manchester university press
transversal enterprises in the drama of shakespeare and his contemporaries fugitive explorations bryan reynolds pp. / - - - / $ . cl.
death in henry james andrew cutting pp. / - - -x / $ . cl.
john donne richard sugg critical issues pp. / - - - / $ . cl. - - - / $ . pb.
gissing and the city cultural crisis and the making of books in late victorian england edited by john spiers pp. / - - - / $ . cl.
an edith wharton chronology edgar f. harden author chronologies pp. / - - - / $ . cl.
now in paperback! heartbreakers women and violence in contemporary culture and literature josephine gattuso hendin pp. / - - - / $ . pb.
pat barker john brannigan contemporary british novelists pp. / - - - / $ . pb. - - - / $ . cl. manchester university press
writing chinese reshaping chinese cultural identity lingchei letty chen pp. / - - - / $ . cl.
angela carter a literary life sarah gamble literary lives pp. / - - - / $ . cl.
the palgrave literary dictionary of chaucer malcolm andrew pp. / - - - / $ . cl.
the contemporary british novel since edited by james acheson and sarah c.e. ross pp. / - - - / $ . cl. - - - / $ . pb.
now in paperback! chaplin and agee the untold story of the tramp, the writer, and the lost screenplay john wranovics pp. / - - - / $ . pb.
disraeli the victorian dandy who became prime minister christopher hibbert june / pp. / - - - / $ . cl
conversations with edward said edward said and tariq ali pp. / - - - / $ . cl. seagull books
conversations with jean-paul sartre jean-paul sartre, simone de beauvoir, perry anderson, quintin hoare, and ronald fraser pp. / - - - / $ . cl. seagull books
now in paperback! declining by degrees higher education at risk edited by richard h. hersh and john merrow foreword by tom wolfe pp. / - - - / $ . pb.

for your students
key concepts in contemporary literature john peck and steve padley palgrave key concepts pp. / - - - / $ . pb.
key concepts in postcolonial literature gina wisker palgrave key concepts pp. / - - - / $ . pb.
mastering english literature third edition richard gill palgrave master series pp. / - - - / $ . pb.

www.simonsaysacademic.com features the tools to meet all your educational needs: • title suggestions • teaching guides • reading guides • catalogs • newsletters • free book offers • conference schedules • desk and exam copy online requests sowing the seeds of knowledge. the premiere resource for teachers and professors!

introducción a la morfofonología contemporánea, rosa ana martín vegas, universidad de salamanca
el subtitulado cinematográfico: fusión de palabra, gesto y movimiento escénico, maría josé gonzález rodríguez, universidad de la laguna
ambipositions, alan libert, university of newcastle

historical grammars and most treatises on the morphology of the romance languages do not devote a separate chapter to morphophonology. nor are there many extensive theoretical studies that focus on morphophonological alternations. this work aims to fill that theoretical gap by compiling, from a critical perspective, all the factors that determine the history of these phenomena, which the various linguistic schools have treated unevenly. this study is thus a theory of morphophonology exemplified mainly with cases of alternations in spanish. the research is structured in two domains. the first part is a theoretical reflection on the characterization of morphophonological processes. it presents the problems of delimiting these alternations from other types of allomorphy and analyzes the features that condition their history. it defends the theory of reanalysis, as opposed to morphemic segmentation, as a possible explanation of the lexicalization and analogical change of some alternations. the second part is a history of morphophonology as a theoretical discipline. it critically analyzes the descriptive and/or explanatory treatment that the various linguistic schools have given to morphophonological alternations throughout history. it also proposes an explanatory model of a cognitivist kind, built mainly on the assumptions of natural theory, bybee's model of lexical and morphological organization, skousen's analogical model, and some psycholinguistic research. isbn . . pp. usd . . .

whereas a dubbed film is seen and heard simultaneously, a subtitled film introduces an added component of time pressure: the act of reading. the purpose of this work is to account for the prominent role of subtitles through a characterization of the basic routines used in preparing them, together with the conventions governing the use of spanish in subtitling english-language films. in this respect, the results show to what extent subtitles add meaning to a film, and how in a subtitled film the requirement to read becomes an obstacle that the viewer can overcome to the point of fixing attention on the basic experience of the film, since the temporal dimension is fully controlled. isbn . . pp. usd . . .

two major categories of relational words are prepositions and postpositions, the difference between them having to do with whether they precede or follow their object. there is a relatively small group of words of the same general type which can be placed either before or after their object. such words have been given the name ambipositions. a possible (though not uncontroversial) example from english is through, e.g. he walked through the forest and he slept the whole night through. other examples are german entlang and ancient greek peri. this book is a detailed examination of this unusual type of word. contents: preface, abbreviations, introduction, ambipositions with simple behavior, meaning differences depending on position, ambipositions with case marking differences in different positions, differences in types of complement allowed, differences in form of prepositional and postpositional occurrences, ambipositions from an historical point of view, conclusion, references. (with examples from more than languages) isbn . . pp. usd . . .

lincom handbooks in linguistics, edición lingüística, lincom studies in language typology

lincom europa academic publications webshop: www.lincom-europa.com lincom gmbh gmunder str. , d- muenchen fax + lincom.europa@t-online.de

empire of letters letter manuals and transatlantic correspondence, – eve tavor bannet $ . : hardback: - - - : pp
rainer werner fassbinder and the german theatre david barnett $ . : hardback: - - - : pp
magic on the early english stage philip butterworth $ .
: hardback: - - -x: pp gentility and the comic theatre of late stuart london mark s. dawson $ . : hardback: - - - : pp shakespeare and republicanism andrew hadfield $ . : hardback: - - - : pp london literature, – ralph hanna $ . : hardback: - - - : pp the cambridge guide to literature in english rd edition dominic head $ . : hardback: - - - : pp now in paperback! a history of african american theatre errol g. hill and james v. hatch $ . : paperback: - - -x: pp british poetry in the age of modernism peter howarth $ . : hardback: - - - : pp pamela in the marketplace literary controversy and print culture in eighteenth-century britain and ireland thomas keymer and peter sabor $ . : hardback: - - - : pp shakespeare’s tragedies violation and identity alexander leggatt $ . : hardback: - - - : pp $ . : paperback: - - - the modernist novel and the decline of empire john marx $ . : hardback: - - - : pp new and noteworthy for more information, please visit us at www.cambridge.org/us or call toll-free at - - - prices subject to change.     children of the queen’s revels a jacobean theatre repertory lucy munro $ . : hardback: - - - : pp now in paperback! the cambridge history of literary criticism volume , the eighteenth century edited by h. b. nisbet and claude rawson $ . : paperback: - - - : pp john lydgate and the making of public culture maura nolan $ . : hardback: - - - : pp drama, theatre, and identity in the american new republic jeffrey h. richards $ . : hardback: - - -x: pp anger, revolution, and romanticism andrew m. stauffer $ . : hardback: - - - : pp ethics and nostalgia in the contemporary novel john j. su $ . : hardback: - - - : pp cathedral and civic ritual in late medieval and renaissance florence the service books of santa maria del fiore marica s. tacconi $ . : hardback: - - - : pp women on stage in stuart drama sophie tomlinson $ . : hardback: - - - : pp shakespeare’s humanism robin headlam wells $ . 
: hardback: - - - : pp
literature and the taste of knowledge michael wood $ . : hardback: - - - : pp $ . : paperback: - - -
print and the poetics of modern drama w.b. worthen $ . : hardback: - - - : pp
the italian encounter with tudor england a cultural politics of translation michael wyatt $ . : hardback: - - - : pp
from cambridge

fear of small numbers an essay on the geography of anger arjun appadurai “arjun appadurai is already known as the author of striking new formulations which have greatly illuminated contemporary global developments, notably in modernity at large. in this new book, he tackles the most burning and perplexing problems of collective violence which beset us today. the book is alive with new and original ideas, essential food for thought not just for scholars, but for all concerned with these issues.”—charles taylor, author of modern social imaginaries pages, paper $ . public planet

neoliberalism as exception mutations in citizenship and sovereignty aihwa ong “aihwa ong’s keen ethnographic perspective brings into sharp relief some of the differences that are essential not only for understanding the contemporary global economic and political systems but also for struggling against them to make a better world.”—michael hardt, coauthor of multitude and empire pages, b&w photos, paper $ .

duke university press toll-free - - - www.dukeupress.edu

critical interventions from duke

the age of the world target self-referentiality in war, theory, and comparative work rey chow “rey chow is one of the most learned and imaginative left critics writing today, and the age of the world target is possibly her finest book yet.
elegantly traversing philosophy, literature, history, and politics, chow refracts our political times through our academic practices in a fashion that is alternately pedagogical, biting, lyrical, and profound.”—wendy brown, author of edgework: critical essays on knowledge and politics pages, paper $ . next wave provocations

scandalous knowledge science, truth, and the human barbara herrnstein smith “elegantly written and constructed, amusing and energetic, scandalous knowledge continues barbara herrnstein smith’s edgy and distinctly partial commentary on the science wars between realists and constructivists. constructivists will be intrigued by the novel, and sometimes critical, avenues the book explores. realists will be, well, scandalized.”—andrew pickering, author of the mangle of practice: time, agency, and science pages, paper $ . science and cultural theory

taboo memories, diasporic voices ella shohat “from her keen observations about the politics of knowledge production in the u.s. university, to her canny elucidation of the gendered geographies of colonial cinema, to her critical engagements with post-zionist discourse, ella shohat’s bold intelligence is unparalleled. this volume collects her key interventions that have shaped and illuminated the debates we have come to know as multiculturalism, postcolonial discourse, and transnational feminism.”—lisa lowe, author of immigrant acts: on asian american cultural politics pages, b&w photographs, paper $ . next wave: new directions in women’s studies

murambi, the book of bones boubacar boris diop translated by fiona mc laughlin “this novel is a miracle. murambi, the book of bones verifies my conviction that art alone can handle the consequences of human destruction and translate these consequences into meaning. boubacar boris diop, with a difficult beauty, has managed it. powerfully.” —toni morrison paper $ .
american sweethearts teenage girls in twentieth-century popular culture ilana nash imagining girlhood from nancy drew to buffy the vampire slayer. paper $ .
ladino rabbinic literature and ottoman sephardic culture matthias b. lehmann views tradition and modernization among sephardic communities in the ottoman empire through the lens of rabbinic literature written in ladino. cloth $ .
the slave’s rebellion literature, history, orature adélékè adéèkó how the slave rebellion haunts the black imagination. paper $ .
geomodernisms race, modernism, modernity edited by laura doyle and laura winkiel exciting new scholarship on the globalization of modernist literature and culture. paper $ .
other routes years of african and asian travel writing edited by tabish khair, justin d. edwards, martin leer, and hanna ziadeh brings together important primary work by travel writers from asia and africa in english translation. paper $ .
don owen notes on a filmmaker and his culture steve gravestock a groundbreaking study of one of canada’s most influential directors. paper $ .
moving experiences understanding television’s influences and effects david gauntlett a newly revised and expanded edition of the classic critique of media effects studies. paper $ .
visual delights ii exhibition and reception edited by vanessa toulmin and simon popple explores visual culture in the late th and early th centuries. paper $ .
reel tracks australian feature film music and cultural identities edited by rebecca coyle examines the role of music in contemporary cinema. paper $ .
the habit of art best stories from the indiana university fiction workshop edited by tony ardizzone stellar examples of new american short fiction. paper $ .
the variorum edition of the poetry of john donne volume , part : the holy sonnets gary a. stringer, senior textual editor; paul a. parrish, volume commentary editor the latest volume in the distinguished donne variorum series. cloth $ .
- - iupress.indiana.edu
influential works

featuring the work of more than poets, this stunning collection redefines the great canon of american poetry from its origins in the seventeenth century right up to the present. it is a must-have anthology for anyone interested in american literature and a book that is sure to be consulted, debated, and treasured for years to come. new from pp., hardcover - - - $ . , pp., hardcover - - -x $ .

this hugely entertaining anthology ranges from chaucer to the present day, with anecdotes that are hilarious, outrageous, inspiring, and sometimes downright weird. the new oxford book of literary anecdotes is a book not just for lovers of literature, but for anyone with a taste for the curiosities of human nature.

available wherever fine books are sold. www.oup.com/us cast your vote for america’s favorite poem at www.oxfordpoetry.com

the ohio state university press the ohio state university prize in short fiction mexico is missing and other stories j. david stevens www.ohiostatepress.org - - consuming fantasies labor, leisure, and the london shopgirl, – novel professions interested disinterest and the making of the professional in the victorian novel jennifer ruth the imagination of class masculinity and the victorian urban poor dan bivona and roger b. henkle $ . cloth - - - $ . cd - - - $ . cloth - - - $ . cd - - - the reverend mark twain theological burlesque, form, and content joe b. fulton lisa zunshine why we read fiction $ . paper - - -x $ . cloth - - - a thousand words portraiture, style, and queer modernism jaime hovey $ . cloth - - - $ . cd - - - the economics of fantasy rape in twentieth-century literature sharon stockton $ . cloth - - -x $ . cd - - - the old story, with a difference julian wolfreys pickwick’s vision $ . paper - - - $ . cloth - - -x $ . cd - - - a superficial reading of henry james preoccupations with the material world thomas j. otten $ . cloth - - - $ .
cd - - - narrative causalities emma kafalenos $ . cloth - - - $ . cd - - - $ . paper - - - $ . cloth - - - $ . cd - - - $ . paper - - - $ . cd - - -x lise shapiro sanders $ . cloth - - - $ . cd - - - theory of mind and the novel     � � dante university™ po box wellesley ma fax: - www.danteuniversity.org danteu@danteuniversity.org five on-line enrichment courses ($ . each): . aspects of italian and american history . neapolitan songs in the lives of italian immigrants in america . observations in poetry and pictures . a history of jews in italy (being up-loaded) . comprehensive italian conversation (up-coming) tools for teachers and students: kaso english to italian (phonemic) dictionary, isbn $ . --(free access on, www.danteuniversity.org kaso verb conjugation system (cd or down-load—english isbn , italian isbn --$ each; english/italian isbn $ ) bilingual two language assessment battery of tests, isbn , $ . , (english and italian, spanish, portuguese, french, and vietnamese—see review by kenneth beare, www.about.com) books of italian american interest: prince, machiavelli/goodwin/martinez/caso, isbn , $ . , illustrated, paper we, the people—formative documents, adolph caso, isbn , $ . , cloth inferno, dante/kilmer/martinez, isbn , $ . , illustrated, cloth italian poetry – , ridinger/renello, isbn , $ . , paper to america and around world logs of columbus & magellan isbn $ . paper on persecution, identity & activism, cristogianni borsella, isbn , $ . paper marconi my beloved, maria marconi, isbn , $ . , photos, cloth     university of nebraska presspublishers o f b is o n b o o ks w w w .n e b ra sk ap re ss .u n l.e d u | . . at home on this moveable earth by william kloefkorn the third volume in a poet’s elemental four-part memoir: earth. $ . cloth | - - - - also available this death by drowning $ . 
paper - - - - restoring the burnt child | a primer $ cloth | - - - - transatlantic cooperation in research (transcoop) one goal: collaborative research two partners: scholars in the humanities, social sciences, economics and law three countries: canada, germany, and the united states the facts: through its transcoop program, the alexander von humboldt foundation provides one half of the funding—up to eur , over three years—for a proposed research collaboration. u.s. and/or canadian funds must cover the balance of the cost of the project. transcoop funds may be used by all partners for short-term research stays at the partners’ institutions, travel expenses, conference organization, material and equipment, printing costs, and research assistants. applications should be submitted jointly by at least one german and one u.s. or canadian scholar. ph.d. required. deadlines: april and october . for information about this and other opportunities, go to www.humboldt-foundation.de or contact the foundation’s u.s. liaison office at avh@verizon.net. � � why? charles tilly “readers will find this book stimulating, amusing, enlightening and engaging. the veteran analyst of political conflict and change has shifted the scale and style of his analysis once again. the result is a tour de force.”—viviana zelizer, princeton university cloth $ . - - -x politics and the passions, – edited by victoria kahn, neil saccamano & daniela coli “this is a distinguished collection of essays on a com- pelling topic by major scholars and theorists. passion, emotion, and affect have been placed once again on the agenda of the humanities but these topics have been less scrutinized in political matters than else- where.”—ian balfour, york university paper $ . - - - cloth $ . 
- - - due july selected writings on aesthetics johann gottfried herder translated and edited by gregory moore “only a small fraction of herder’s writings has been translated into english, and such translations are often archaic and/or unreliable. i know of no previous trans- lation of the critical forests, for example, although the first and fourth parts, which appear in the present vol- ume, are of major interest to students of aesthetics.” —hugh barr nisbet, university of cambridge cloth $ . - - - due july science on stage from doctor faustus to copenhagen kirsten shepherd-barr “kirsten shepherd-barr explores contemporary theater at the intersection of science and performance. she deals with subjects such as quantum mechanics, chaos theory, evolution and genetics and focuses on work by superb playwrights such as michael frayn and tom stoppard, as well as alternative theatrical events that literally change the way of doing theater.” —brian schwartz, the graduate center of the city university of new york cloth $ . - - - due july volume history, geography, and culture cloth $ . - - - due july an international reassessment of the first global literary form the novel edited by franco moretti editorial board: ernesto franco, fredric jameson, abdelfattah kilito, pier vincenzo mengaldo, mario vargas llosa nearly as global in its ambition and sweep as its subject, franco moretti’s the novel is a water- shed event in the understanding of the first truly planetary literary form. a translated selection from the epic five-volume italian il romanzo ( - ), the novel’s two volumes are a uni- fied multiauthored reference work, containing more than one hundred specially commissioned essays by leading contemporary critics from around the world. volume forms and themes cloth $ . 
- - - due july - - read excerpts online www.pup.princeton.edu princeton university press     the plum in the golden vase or, chin p’ing mei translated by david tod roy in this planned five-vol- ume series, david roy provides a complete and annotated transla- tion of the famous chin p’ing mei, an anony- mous sixteenth-century chinese novel, known primarily for its erotic realism. it is a landmark in the development of narrative art—not only from a specifi- cally chinese perspective but also in a world-historical context. praise for volume one: “reading roy’s translation is a remarkable experience.”—robert chatain, chicago tribune review of books princeton library of asian translations volume two: the rivals new in paperback $ . - - - due june volume three: the aphrodisiac cloth $ . - - - due june in hora mortis / under the iron of the moon poems thomas bernhard translated by james reidel “if bernhard is, as he has been called, ‘an instrumen- talist of language,’ then reidel has written for that language a symphony of lyric art, and in so doing, rescued for the world a major twentieth-century poet.” —carolyn forché, author of blue hour: poems lockert library of poetry in translation: richard howard, series editor paper $ . - - - cloth $ . - - - due june enough to say it’s far selected poems of pak chaesam pak chaesam translated by david r. mccann and jiwon shin this is the first english translation of selected poems by one of the most important and unusual modern poets of south korea. pak chaesam writes with a spareness of presentation but with a cornucopia of imagery, meticulously exploring objective and subjective realms of existence and memory. encouraging the reader to see and listen, and to allow the sensory to reshape the analytical, pak’s poetry opens up new realms of experi- ence. a fellow korean poet described pak’s poetry as being “the most exquisite expression of the korean sense of han,” or melancholy. 
lockert library of poetry in translation: richard howard, series editor paper $ . - - - cloth $ . - - - due july new in paperback one of the chicago tribune’s best books of the bells in their silence travels through germany michael gorra “gorra has made a notable effort to write a truthful book that, while colorful and impressionistic, also draws thoughtful conclusions about what he encoun- ters. . . . his accounts of his wandering are peppered with literary and historical reflections as well as musings on the nature of travel literature. . . . [t]he results can be stunning.” —brooke allen, new york times book review paper $ . - - - - - read excerpts online www.pup.princeton.edu princeton university press � �     � � at the university of north carolina at greensboro, our ph.d. program in english stresses strong emphasis on professional development and collegiality and concentrates on developing scholars and teachers. doctoral students choose one primary and two secondary areas and specialize in any period of english or american literature or in rhetoric and composition. in addition to coursework that provides a broad foundation as well as focus, students are encouraged to grow as scholars through publica- tions, conference participation, and other related activities. with the guidance of faculty mentors, they develop innovative peda- gogies for teaching writing and literature. the english faculty is comprised of distinguished specialists and award-winning teachers, and our intellectual community is enhanced by visiting scholars. fellowships, teaching assistant- ships, tuition waivers, and other forms of financial support are available. the department has an excellent record of academic placement of its graduates. 
for more information or to apply, contact: university of north carolina, greensboro department of english director of graduate studies a mciver, uncg greensboro, nc - phone: ( ) - fax: ( ) - www.uncg.edu/eng a doctoral program that will inspire you to succeed. in the mla series approaches to teaching world liter ature modern language association broadway, rd floor, new york, ny - phone - • fax - • www.mla.org approaches to teaching emily brontË’s wuthering heights sue lonoff and terri a. hasseler, eds. “wuthering heights is a major literary text taught in a wide variety of courses, from freshman writing courses to graduate seminars. this excellent addition to the mla approaches to teaching series is not only needed and useful but mandatory.” — anne humpherys city university of new york now available. vii & pp. cloth isbn - - - $ . paper isbn - - - $ . in the mla series new     herman melville’s “typee” a fluid text edition edited by john bryant working from the existing chapters of melville’s own draft of typee, john bryant attempts to re-create the novel’s actual writing process as a chronological sequence. this edition also offers a complete diplomatic transcription of the manuscript and a high-resolution photograph of each manuscript page. clotel, or the president’s daughter a narrative of slave life in the united states william wells brown edited by christopher mulvey the first african american novel, clotel was published in in london, when its author was still legally a slave in the united states. the work’s stature derives not only from its remarkable origin but from its explosive content, which is freely based on the relationship between thomas jefferson and sally hemings. this digital edition of clotel presents, for the first time together, the full extant texts of the four versions of clotel. these texts— pages in all, imaged and coded—may be read individually or in parallel, allowing the user to explore the relationships among the various versions. 
published by the electronic imprint of the u n i v e r s i t y o f v i r g i n i a p r e s s - - www.upress.virginia.edu the letters of matthew arnold edited by cecil y. lang this work, years in the making, represents the most comprehensive and assiduously annotated collec- tion of arnold’s correspondence available. the six print volumes are now a single online archive and include close to four thousand letters. the letters of christina rossetti edited by antony h. harrison this digital archive combines all four volumes of the print edition, making available all of rossetti’s extant letters, almost two-thirds of which had never before been published. the journal of emily shore revised and expanded edited by barbara timm gates this precocious young victorian woman wrote of politics, natural history, her progress as a scholar and scientist, and the worlds of art and literature. emily shore wrote, too, of her illness and impending death. her journal is a record of a brief but remarkable life, and this new digital edition is expanded to include transcriptions from two recently dis- covered manuscript volumes and a new introduction by the editor. n e w f r o m r o t u n da ’ s nineteenth-century literature & culture collection digital scholarship from the electronic imprint of the university of virginia press rotunda publications are available for purchase either separately or as packages, with pricing for libraries and schools based on institution type. pricing is also available for consortia and for individuals. arrange for a free trial, or inquire about pricing and availability: contact jason coleman, electronic marketing manager, at - - or jgc h@virginia.edu. or visit http://www.rotunda.upress.virginia.edu � � a research guide for undergraduate students english and american literature th edition nancy l. baker and nancy huling “this title holds place in the under- graduate reference canon alongside the mla handbook for writers of research papers. 
it belongs in every undergraduate library and in the hands of students writing research papers on american or english literature.” —choice fully updated and revised, the sixth edition of the research guide for undergraduate students shows undergraduates how to locate and evaluate material available from electronic databases and the internet. viii & pp. • x paper isbn - - - $ . suggested retail a u t h o r i t a t i v e. p r a c t i c a l . e s s e n t i a l . modern language association broadway, rd f loor new york, ny - phone - fax - www.mla.org n o w a v a i l a b l e new th editio n     traces e r n s t b l o c h tr a n s l a t e d b y a n t h o n y a . n a s s a r traces, a masterwork of twentieth-century philo- sophy, is the most modest and beautiful proof of bloch’s utopian hermeneutics, taking as its source and its result the simplest, most familiar and yet most striking stories and anecdotes. meridian: crossing aesthetics $ . paper $ . cloth h. c. for life, that is to say... j a c q u e s d e r r i d a tr a n s l a t e d b y l a u r e n t m i l e s i a n d s t e f a n h e r b r e c h t e r h. c. for life, that is to say . . . is jacques derrida’s tribute to hélène cixous—the author, her works, and their lifelong mutual reading and intellectual friendship. meridian: crossing aesthetics $ . paper $ . cloth the end of art readings in a rumor after hegel e va g e u l e n tr a n s l a t e d b y j a m e s m c f a r l a n d readings of hegel, nietzsche, benjamin, adorno, and heidegger trace the role that the discourse on the end of art has played in post-hegelian philo- sophical aesthetics. $ . paper $ . cloth reflections of equality c h r i s t o p h m e n k e tr a n s l a t e d b y h o w a r d r o u s e a n d a n d r e i d e n e j k i n e the book argues that the center of political modernity is determined by a conflictive relation between the liberal core concept of political equality and the idea of individuality. cultural memory in the present $ . paper $ . 
cloth imagining the gallery the social body of british romanticism c h r i s t o p h e r r o v e e reading portraiture as a national rhetoric during the romantic period, imagining the gallery reveals a pervasive cultural discourse that reflects and propels sociopolitical shifts taking place in late eighteenth- and early nineteenth-century britain. $ . cloth crowds e d i t e d b y j e f f r e y t. s c h n a p p a n d m at t h e w t i e w s crowds presents several layers of meditation on the phenomenon of collectivities, from the scholarly to the personal; it is the most compre- hensive cross-disciplinary publication on crowds in modernity. $ . paper $ . cloth underwriting the poetics of insurance in america, - e r i c w e r t h e i m e r this book is about the historical influence insurance has had on american culture. $ . cloth borderlines the shiftings of gender in british romanticism s u s a n j . w o l f s o n revealing how the revolution-era debates of the s redefined notions of gender across the nineteenth century, borderlines provides fresh readings of the works, careers, and volatile receptions of felicia hemans, m. j. jewsbury, lord byron, and john keats, showing how senses (and sensations) of gender shape and get shaped by sign systems that prove to be arbitrary, fluid, and susceptible of tranformation. $ . cloth the unthought debt heidegger and the hebraic heritage m a r l È n e z a r a d e r tr a n s l a t e d b y b e t t i n a b e r g o drawing on heidegger’s corpus, the work of his- torians and biblical specialists, and contemporary philosophers like levinas and derrida, zarader brings to light the evolution of an impensé—or unthought thought—that bespeaks a complex debt at the core of heidegger’s hermeneutic ontology. cultural memory in the present $ . paper $ . cloth new from stanford university press . . w w w. s u p . o r g u n i v e r s i t y p r e s s stanford � � collective representation has long been at the heart of academic governance. 
as an outgrowth of that tradition and in response to the profound changes in the aca- demic labor market, many academic employ- ees have turned to collective bargaining to enhance shared governance and to advocate for improvements in working conditions. contributors to this volume aim to educate readers about the historical and practical contexts of collective bargaining. the essays collected here explore the perspectives, suc- cesses, failures, and approaches of those who have collectively bargained so that readers can assess the pros and cons of unionization. jointly published by the american association of university professors and the modern language association pp. • paper isbn - - - $ . (mla & aaup members $ . ) aaup members : please enter discount code aaup when ordering the book at www.mla.org. modern language association broadway, rd floor, new york, ny - phone - | fax - a defi nitive resource on academic collective bargaining available at www.mla.org now available     dog days an animal chronicle patrice nganang translated and with an afterword by amy baram reid “with dog days, patrice nganang has established himself at the forefront of the new generation of african francophone writers. with swiftian tones which give this young author an authentic and original voice, he leaves no doubt in our minds that the next african revolution will come from its cities.”—emmanuel dongala, author of little boys come from the stars and johnny mad dog caraf books $ . cloth, $ . paper writing rumba the afrocubanista movement in poetry miguel arnedo- gómez arising in the heyday of the music recently made famous by the buena vista social club, afro- cubanismo was an artistic and intellec- tual movement in cuba in the s and s that tried to convey a nation- al and racial identity. 
through poetry, this movement was the first serious attempt on the part of mostly white cuban intellectuals to produce a national literature that incorporated elements from the afro-cuban tradi- tions of lower-class urban blacks. the first book-length treatment of the poetry of this movement, writing rumba questions the assumption that the poetry did manage to symbolize racial reconciliation and unification. at the same time it reveals a process of literary transculturation by which the dominant literature of european ori- gins was radically transformed through the incorporation of formal principles from afro-cuban dance and music forms. new world studies $ . cloth, $ . paper guarding cultural memory afro-cuban women in literature and the arts flora gonzález mandri “guarding cultural memory con- tributes much to our understanding of an ‘erased’ chapter of cuban culture while enhanc- ing at the same time the crucial role that afro- cuban culture played in the formation of a national culture. . . . a much- needed cultural and historical archive.” --adriana méndez-rodenas, author of gender and nationalism in colonial cuba: the travels of santa cruz y montalvo, condesa de merlin new world studies $ . cloth, $ . paperemerson bicentennial essays edited by ronald a. bosco and joel myerson drawn from papers presented at the conference that celebrated the two- hundredth anniversary of his birth, emerson bicentennial essays presents seventeen studies of emerson that address five general themes: “the construction of emerson,” “emerson’s audience,” “emerson the reformer,” “emerson the poet,” and “emerson and the world of ideas.” distributed for the massachusetts historical society $ . cloth carolyn g. heilbrun feminist in a tenured position with a new epilogue susan kress “a fascinating biography, carolyn g. heilbrun: feminist in a tenured position now includes a new epilogue that probes the painful mystery of heilbrun’s suicide. . . . 
the data swamp: trends from the academic business library directors’ year in review report -

ticker: the academic business librarianship review, : ( ) http://dx.doi.org/ . /ticker. . . © christina sylka, zaida diaz and leigh plummer

christina sylka, university of british columbia, christina.sylka@ubc.ca
zaida diaz, university of maryland, zdiaz@umd.edu
leigh plummer, university of maryland, leighplu@umd.edu

abstract

a summary of academic business library trends from the academic business library directors (abld) year in review report - . areas covered include new and ongoing initiatives, organizational changes, physical space changes and collection and vendor issues related to abld member libraries. the paper also highlights major issues, organizational changes and new initiatives related to abld member business schools.

keywords

academic business libraries, library trends, organizational changes, new initiatives, business schools, academic business library directors (abld) year in review

in preparation for the academic business library directors’ (abld) annual spring meeting, individual abld members each submit an annual report that summarizes new and ongoing initiatives, organizational changes, physical space changes and collection and vendor issues related to their libraries. they also summarize issues, organizational changes and new initiatives related to their business schools. this year’s annual meeting took place in seattle at the university of washington from may - , . the theme for the meeting was the data deluge.
each meeting includes a presentation summarizing major themes from the year in review. this presentation is framed by a unifying metaphor that complements the meeting theme. this year, that unifying metaphor was the data swamp.

swamps have an unfair reputation in the popular imagination as waste land, toxic, dangerous and ready to be drained. however, “… despite its outward appearance, a healthy swamp is a rich, diverse, sometimes messy ecosystem. one in which a profusion of plants and animals thrive in happy symbiosis or grudging respect” with no expectation that everything needs to conform to a single pattern (nevala, ). academic business libraries share this capacity and flexibility in supporting the diverse and messy spectrum of academic life uniquely, using spaces, services and collections to accommodate the entry level needs of first year undergraduates, the advanced, interdisciplinary data needs of faculty and the various needs of entrepreneurship, innovation and emerging technologies.

notable trends overall

swamps are incredibly productive: “wetlands are among the most productive ecosystems in the world, comparable to rain forests and coral reefs” (u.s. environmental protection agency, ). academic business library ecosystems are similarly prolific. overall, the demand for instruction is increasing, the need for and access to online resources is proliferating, the demand for support for data acquisition, visualization and analytics is multiplying, entrepreneurship programs are expanding and spaces are both heavily used and being renovated.
all of this activity is taking place within a shifting and dynamic organizational environment, with significant staff and librarian turnover and structural and organizational change occurring within both the larger library systems and the business schools.

data, data, data

the data deluge was the theme for the / abld meeting. this past year, data has been a significant driver of the external facing work of abld members’ services, facilities and collection development roles. these include managing requests from and identifying resources for faculty and graduate students; acquiring, licensing, managing and trouble-shooting the data sets; and designing spaces and delivering learning opportunities in support of data visualization and analytics.

a substantial desire among graduate students and faculty for unique data sets was noted in the reports of the university of maryland, wake forest, the university of british columbia, new york university, penn state, and indiana university (abld, , pp. , , , , & ). penn state also noted that these requests are coming from disciplines outside of business and economics (abld, , p. ). dartmouth emphasized that its rising data demand included requests for geospatial and mapping data in subject domains such as energy (abld, , p. ). cornell highlighted how text mining and sentiment analysis continue to be of interest to faculty and phd students but acknowledged that finding funding for and determining availability of these resources remain challenging (abld, , p. ).

data also infused internal activities, as libraries gather assessment and use data in support of more evidence-based decision making in business design, space and service allocation, and collection renewals. during / , four institutions embarked on or were planning space assessments.
hec montréal’s library revised its entire business model in a year-long process in order to focus on research data and knowledge management and employed “an optimisation approach process”, analyzing circulation desk and user traffic data to advocate for extending the library’s opening hours (abld, , p. ). southern methodist university gathered true assessment data of its information literacy plan for undergraduates (abld, , p. ). purdue just completed a year-long review of the parrish library space to gather user feedback about the current use of the space and to suggest future changes. the university of chicago conducted space assessment across the libraries.

duke university noted that metrics from database vendors can provide useful information during the renewal decision making process but that, “unfortunately, vendor metrics are often: inconsistent; unshared … irrelevant (market value of content) and inappropriately applied (pricing based on institutional prestige, size of endowment)” (abld, , p. ). the university of illinois would like abld to work with vendors to get better and more meaningful usage metrics (abld, , p. ).

entrepreneurship and innovation programs are on the rise

increased campus interest in entrepreneurship and innovation was realized in abld member libraries and business schools in a variety of ways. purdue identified a growing demand for entrepreneurship collaborations; the university of washington launched a new masters of science in entrepreneurship; penn state’s business school introduced two new entrepreneurship programs and the library hired its first entrepreneurship librarian; the university of pittsburgh’s chancellor identified innovation and entrepreneurship as priorities, and the library started an entrepreneurship research interest group (abld, , pp.
, , - & ). the university of california berkeley identified an increase in short-term, non-degree entrepreneurship programs in uc berkeley schools and research centers (abld, , p. ). the university of maryland libraries partnered with campus stakeholders “to develop a “start-up hub” within the um libraries’ research commons to bring a consultation team approach and programming on various aspects of support needed in areas of business/market research/new product development; grants; data analytics/data repositories; gis services; patents/trademarks/copyright, etc.” (abld, , p. ). both hec montréal and new york university partnered with the university of toronto’s rotman school of management creative destruction lab, a program designed to maximize value for science and technology start-ups (abld, , pp. & ). boston university’s library connected with the idg capital student innovation center (build lab) to communicate how the library’s resources and services can support new ventures (abld, , p. ).

raising the profile of diversity, inclusivity, and social justice

mit’s report highlighted the fact that campuses are committing to and raising the visibility of diversity, inclusivity and social justice issues (abld, , p. ). vanderbilt university has hired a vice provost for inclusive excellence and the business school devoted significant time to inclusivity during orientations; indiana university libraries released its iu libraries diversity strategic plan (abld, , pp. & ). in april, the university of british columbia (ubc) opened the indian residential school history and dialogue centre and issued a statement of apology for ubc’s “involvement in the system that supported the operation of the indian residential schools” (ubc, ).
increasing demand for teaching, training, and outreach

in her presidential address at the fall meeting of the association for public policy analysis and management, ellen schall employed the metaphor of the swamp as a way to understand the important, complex, and messy problems, resistant to technical analysis, that are part of public service leadership. in order to be effective in public leadership, people need to lead and manage in the swamp, understanding the reality of their complex, messy world, and to be able “to reflect on and learn from your own and others’ experience to make sense of things” (schall, , p. ). schall’s metaphor is resonant for academic business librarians in leadership roles, and also for those who design, deliver, and advocate for instruction, reference and outreach in a constantly shifting environment. the metaphor of a complicated and convoluted landscape also reflects the experience of patrons trying to navigate the complex world of business research resources and data, which is why instruction and outreach remain such vital activities.

abld member reports identified growth in instructional load, demand for programs and demand for research support. thirteen institutions identified an increasing number of requests for in-person and online instruction, and six identified newly initiated or continued programs, workshops or courses. four libraries had new or re-designed online presences, and three identified increased research consultations and program attendance.

abld member libraries offer a broad spectrum of instruction, programming and outreach, ranging from workshops to in-class instruction to for-credit courses, to course integrated sessions, to online tutorials.
instruction activities manifest differently in different libraries, reflecting the institutional ecosystems in which those libraries operate. penn state’s instruction program continues to grow, with librarians conducting almost instruction and outreach sessions (abld, , p. ). michigan state university business librarians continued to be heavily involved with instruction, including teaching a - credit business information literacy course (abld, , p. ). ubc’s co-curricular programming in the forms of writing coaching, peer assisted study sessions, and bloomberg and capital iq workshops was well attended, with attendance at pass sessions rising % from the previous year (abld, , p. ). wake forest cited challenges in securing time for the library in undergraduate orientations; the university of pennsylvania experienced its first year of wharton , the new foundational class for wharton undergraduates, and, as a way of finding a niche, the library experimented with creating a libwizard tutorial focused on finding articles (abld, , pp. & ).

experimenting with new techniques, exploring new technologies or platforms, assessing existing programs and reflecting on instructional practice offer avenues to enhance the impact of libraries’ instructional roles. yale’s reference instruction and outreach group worked with the yale center for teaching and learning to host a series of three workshops that offered librarians a forum to develop their skills and discuss emerging trends in instruction (abld, , p. ). southern methodist university completed its first year of the emba library research program, building on four previous years of the outreach to emba program (abld, , p. ). hec montréal discussed integrating library resources and access to case studies and course material into zonecours . , with dynamic electronic links to the library’s e-resources (abld, , p. ).
the university of michigan library launched its new website, explored using new platforms like blue jeans to instruct students, began library information sessions for alumni, integrated library instruction in new courses, and obtained a new reference desk and an improved service point for the exam review program (abld, , pp. - ). michigan state university provided increased support to students enrolled in online graduate degree programs, launched a chat service, and automated the process of linking their libguides in the course management system (abld, , p. ).

organizational changes and new staff

swamps are transition areas, not quite water and not quite forest, and are often temporary homes for migrating birds (new hampshire pbs, ). the most significant areas for transition and upheaval for abld member libraries were related to staff and librarian turnover and organizational change. eighteen institutions hired or were in the process of hiring new librarians; seventeen positions were vacated, left open, or lost. ten member libraries experienced reorganization within their departments. nine institutions spoke to the slow search processes involved in hiring in an academic environment. the leaders of seven member institutions had title changes or increased responsibilities; six institutions hired new staff and six hired new administration.

brand new positions were approved at harvard, yale and the university of pittsburgh (abld, , pp. - , & ). institutions with vacant positions, retirements, or positions not approved or lost through restructuring included cornell, emory, yale, southern methodist university and vanderbilt (abld, , pp. , - , , & ). mit articulated the challenges of succession planning (abld, , p. ). northwestern explained that turnover in key positions has led to reorganizations in public services (abld, , p.
). at notre dame the collections strategy and subject services program was reorganized again (abld, , p. ). at the university of alabama both the business branch head and the business collections manager had significant increases in responsibility across multiple branches (abld, , p. ). ucla experienced change at the top administrative level, with the deputy university librarian becoming university librarian at ubc, as well as across the libraries, due to the user engagement reorganization in july, which resulted in the formation of five functional teams (ft): collections, outreach, research assistance, research partnerships, and teaching and learning (abld, , p. ). the university of washington’s libraries introduced a new reporting structure in the branch libraries, “where classified staff report to a central operations manager and subject librarians report to a research services director” (abld, , p. ). at western university, the libraries also underwent an organizational renewal that raised concerns in the business library that the level of support traditionally provided will decrease due to the staffing changes that are being implemented (abld, , p. ). the university of toronto’s administrative relationship changed from reporting to the business school to reporting to the library, and this change has resulted in collection issues becoming more straightforward, more databases being acquired this year than ever before, and evidence of continued relationship building with the central library being a boon to the business library and the business school (abld, , p. ). emory and harvard reflected on the challenge of finding experienced talent in the workforce (abld, , pp. & ).
harvard’s baker library acknowledged that, as their products and services become more diverse and technology driven, there is an impact on the traditional make-up of staff and librarians: “attracting specialized skill sets to what seems a very “academic” setting plus onboarding and growing … capabilities is extremely difficult. our searches for these non-traditional library skill sets are increasing in length… [as we work to] articulate what we need and build a quality candidate pool” (abld, , p. ).

facilities

filip tkaczyk writes that “the fluctuating levels of water in a swampland can itself be both a blessing and a challenge, and permanent structures in such landscapes must be able to adapt to such changes” (tkaczyk, n.d.). in this respect, the swamp metaphor is resonant for academic business libraries, as well. they operate in shifting environments, changing their physical facilities to meet patron needs and demands more fully, working within existing constraints, and working within larger campus ecosystems in which other partners may compete for their space or other factors lead to the erosion of the physical facilities. during / , eighteen institutions either had completed, were in the midst of embarking on, or were planning to embark on renovations to their spaces. four institutions either lost or re-allocated their collections’ spaces to study or office space. three institutions had offices or spaces that were either moved or moving; three institutions received new furniture; physical collections of two institutions were being or about to be moved, and two libraries were renamed.
from august until may the university of wisconsin – madison’s business library was closed, renovated and combined with other spaces to form a business learning commons, resulting in a , square foot increase and an additional seats (abld, , p. ). vanderbilt’s total renovation of its business library was completed in august (abld, , pp. - ). the university of pittsburgh, indiana university and the university of southern california all had library spaces that were recently or are currently being renovated (abld, , pp. , & ). the university of maryland renovated its research commons (abld, , p. ). georgetown received new carpet, new cubicles and less workspace. as a response to growing enrollment at yale college and the opening of two new residential colleges near the library, yale completed a renovation that resulted in a new seminar room, map room with consultation space, av studio for producing videos and web tutorials, and a variety of new study spaces for students with additional seats (abld, , p. ). uc berkeley has been approached by its business school to discuss giving up the lower level in exchange for a complete renovation of the upper level (abld, , p. ). the university of illinois’ business information services was moved out of the business school during a renovation and doesn’t expect to return even after the renovation is finished (abld, , p. ). michigan state university libraries launched a new digital scholarship lab and, in response to demands for more collaborative space, created four group rooms, each containing a large screen tv and airmedia technology that enables patrons to project wirelessly from their device/computer to the screen (abld, , pp. - ). emory’s library will be acquiring all new furniture for its offices over the summer, and the design will accommodate and align with use patterns, including regular team consultations (abld, , p. ).
duke is working on a new plan to reconfigure its reading room, replacing some book stacks with new study tables to add an additional forty seats; babson has an upcoming renovation, and dartmouth’s feldberg library building is scheduled for a renovation in (abld, , pp. , & ). the university of southern california’s crocker business library was renamed the gaughan & tiberti library, and the university of toronto’s business information centre was renamed the milt harris library (abld, , pp. & ).

collections and vendor issues

a swamp is “defined as a wetland dominated by trees or dense shrub thickets…” (shaw, ), and a library is often defined by its collections, at least to people outside the library. collections remain a source of significant activity for abld member libraries. one hundred and seven new databases/datasets/modules were acquired or added during / , while sixty-five subscriptions were either cancelled or reduced. eleven institutions identified an increased focus on assessing online resources, while ten institutions commented on the increased number of online resources. ten institutions spoke to vendor relationships, particularly challenges in communicating and negotiating licenses with and obtaining useful metrics from vendors. four institutions discussed the impact of flat or reduced budgets, as well as cuts resulting from consortia agreements. two libraries (babson and the university of washington) highlighted deselection and weeding activities (abld, , pp. ).

the university of toronto acknowledged that greater nimbleness is needed when considering either the acquisition or cancellation of resources (abld, , p. ). flat or decreased collections budgets were specifically noted by cornell and the university of washington (abld, , pp. & ).
impacts of consortia cost changes were noted by ucla and cornell; the university of chicago is in the beginning stages of joining the big ten academic alliance shared print repository (abld, , pp. , & ). the university of pennsylvania loaded about , papers from wharton faculty into scholarly commons, their institutional repository, after many years of outreach and relationship building (abld, , p. ). the university of pennsylvania highlighted challenges in negotiating with new vendors, “which leads to long delays in getting access to new resources, or long delays in not getting access to new resources. we are also experiencing more difficulty than seems reasonable in working out contract language with existing vendors …” (abld, , p. ). carnegie mellon and mit both acknowledged frustration with downloading restrictions from certain business resources, which make it hard to create data sets for faculty and phd research (abld, , pp. & ). the university of toronto, western and hec montréal all mentioned the canadian dollar as an issue that affected collections, but hec montréal confirmed that good vendor relationships and vendor understanding of the impact of the exchange rate meant that the library was able to retain its current subscriptions (abld, , pp. , & ).

business school issues and new initiatives

one of the vital functions that wetlands perform is that they “will slow down the progress of a storm surge” (masters, n.d.). academic business libraries are deeply connected with the business schools that they support, and changes or storms in those environments affect these libraries. during / , the business schools supported by abld members experienced tremendous change. seventeen business schools were either establishing new or expanding existing programs, certificates, institutes or centers. ten institutions had either hired or were in the process of searching for a new dean.
program directors or top administrators from nine business schools had either left or were in the process of leaving their positions. seven business schools were looking to cost share or expand programs with other schools or programs. other trends identified by at least two institutions were increased focus on faculty research support, new faculty hires and increased support of students and families. two business schools were working through reorganizations, two received major grants, two were focusing more on specific industries and two were splitting or reconfiguring their focuses. administrative changes or reorganizations within business schools often surface vulnerabilities in funding structures. if a new dean decides to make strategic investments in non-library areas, such as faculty development, student scholarships, or new programs, the library may face competition for funding. expanding or establishing programs, certificates, and institutes, as well as fostering formal cross-campus relationships, also have an impact on libraries’ service models, particularly if the libraries are asked to provide increased support without additional funding.

external factors affect both libraries and swamps

abld members’ year end reports also surfaced additional factors that have an impact on their libraries. fifteen institutions discussed changing or increasing patron demands; twelve libraries expressed the impact of reduced or flat administrative budgets. ten institutions spoke to the uncertainty bred by new or upcoming administrative changes at their libraries, business schools and universities. search processes at these senior levels are even more complicated, challenging and slow than they are at the librarian and staff levels.
at least five member institutions spoke to the ways in which librarians contributed to the profession. three institutions were either needing or in the process of establishing a new strategic plan. two member libraries highlighted the importance of communicating and proving the value and role of the library in the academic environment.

in terms of changing patron demands, new york university found that, after years of being focused on developing spaces for students, they have recently been experiencing increasing demand from faculty for dedicated spaces to work in the library when conducting research (abld, , p. ). while the university of chicago library was focused on supporting graduate students for many years, they are now facing an expanding undergraduate college that has grown by approximately %; while there are on-campus dormitories, the library is the only common gathering space, resulting in an increased use of the library as social space and a higher demand on staff time and resources in support of undergraduate users, a new user group (abld, , pp. - ). the university of washington elevated the budget discussion to include the state level, recognizing that decisions and divisions in the state legislature and competition for state funds yield diminishing resources and unpredictable budgeting for business libraries, for the larger library systems in which they operate, and for universities themselves (abld, , p. ).

contributions to the profession made by librarians at abld member institutions were a significant feature of reports. these contributions included conference presentations and posters, service to professional associations, and articles in books and journals, as well as participation in symposiums, workshops and other professional events.

conclusion

libraries, like swamps, are complex ecosystems that exist within and contribute to bigger systems.
sharing our experiences enables us to identify trends and similarities, as well as to acknowledge the complexity and variety that each institution faces.

bibliography

academic business library directors (abld). ( ). year in review report / . unpublished report.

biodivcanada.ca. ( ). wetlands. retrieved from http://www.biodivcanada.ca/default.asp?lang=en&n=f d a-

masters, p. (n.d.). storm surge reduction by wetlands. retrieved from https://www.wunderground.com/hurricane/surge_wetlands.asp

nevala, k. ( ). in defense of the data swamp. dataversity. retrieved from http://www.dataversity.net/defense-data-swamp

new hampshire pbs. ( ). wetlands. natureworks. retrieved from http://www.nhptv.org/natureworks/nwep e.htm

schall, e. ( ). learning to love the swamp. journal of policy analysis and management ( ) pp - . retrieved from http://www.appam.org/assets/ / /schall_learning_to_love_the_swamp_reshaping_education_for_public_service.pdf

shaw, e. ( ). climate of wetland swamp ecosystems. retrieved from https://sciencing.com/climate-wetland-swamp-ecosystems- .html

tkaczyk, f. (n.d.). the swamp ecosystem: ecology and survival. retrieved from https://www.wildernesscollege.com/swamp-ecosystem.html

university of british columbia. ( ). indian residential school history and dialogue centre. retrieved from https://ceremonies.ubc.ca/irshdc-opening/

u.s. fish and wildlife service. ( ). national wetlands inventory. retrieved from https://www.fws.gov/wetlands/

u.s. environmental protection agency. ( ). wetland functions and values.
retrieved from https://www.epa.gov/sites/production/files/ - /documents/wetlandfunctionsvalues.pdf

cm&r : (march) hmorn – selected abstracts

c-d - : governing access to a distributed research network’s data resources

beth l syat, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; kimberly lane, mph, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; jeffrey s brown, phd, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; david magid, md, mph, institute for health research, kaiser permanente colorado; joe v selby, md, mph, division of research, kaiser permanente northern california; richard platt, md, ms, department of ambulatory care and prevention, harvard medical school and harvard pilgrim health care; andrew nelson, mph, healthpartners research foundation

to answer many public health questions, it is essential to use information from more than one electronic data system, and efficient ways are needed to securely access and use data from multiple organizations while
respecting the regulatory, legal, proprietary, and privacy implications of this data use and access. one approach centers on the development of distributed research networks that allow data owners to maintain confidentiality and physical control over their data, while permitting authorized users to ask essential questions. once such a network is fully operating and key elements are in place, sharable data resources can be made available to approved network users, under approved conditions. for instance, data from a large cohort of hypertensive patients with five years of utilization (a hypertension cohort) could be available on the network. the following questions will need to be addressed: who can have access? under what conditions should access be granted? what policies/procedures are required? to address the specific needs associated with governance of a network’s resource(s), the authors call for the establishment of user eligibility requirements, policies to deal with funders (i.e., access rules for study funders), clear standard operating procedures, and guidelines for accessing the network. recommendations to meet those needs include: establishing data oversight policies; defining responsibilities for data resource access; defining responsibilities for data owners at each site (i.e., responding to queries when requests come in); creating standard operating procedures for the data resource; creating collaboration guidelines for external partners; and monitoring overall resource use. for the purpose of this poster, we propose to illustrate responsibilities for data owners at each site.
ps - : digital scholarship: scientific publishing at the crossroads

virginia d scobba, mls, ma, group health center for health studies, group health cooperative

background/aims: scholarly communication is the system through which research and other scholarly writings are created, evaluated for quality, disseminated to the scholarly community, and preserved for future use. the traditional formal means of interchange, publication in peer reviewed journals, is at the core of the communication infrastructure. however, the structures and processes by which scholars communicate have undergone a major transformation in recent years with the advent of the digital age. new electronic technologies for access to information appear to be revolutionizing scholarly publishing, aptly defined by the term, digital scholarship. current trends in the chaotic scholarly publishing market can be perceived as both opportunities for and threats to digital scholarship.

methods: digital scholarship is in a state of unprecedented upheaval as publishers, librarians, legislators, scholarly societies, scientists and other scholars engage in tactics to propel change in directions that promote their individual goals. strategies involve remodeling the publishing market, modifying academic and research institutional procedures, and influencing public policy.

results: emerging digital publishing technologies, increasing volume of scholarly works, and decreasing satisfaction with a costly and dysfunctional economic model are changing the fundamental structure of scholarly publishing. research institutions, as well as government and funding agencies, are implementing or exploring strategies which promote free and open access to research results. these include alternative copyright arrangements, e-print archives and digital repositories.

conclusion: scholars, researchers, and society at large gain tremendous benefits from the expanded dissemination of research findings.
however, several factors have impeded the progress of digital scholarship, including efforts to protect publishing revenues and profits, legal licensing restrictions, and the traditional culture of academia. it is therefore critical that the scientific community is actively engaged to ensure that the advancement of scholarship takes priority in the development of new publishing models.

ps - : developing an analytical tool for assessing the adequacy of state health information exchange laws

randy mcdonald, jd, lovelace clinic foundation; maggie gunter, phd, lovelace clinic foundation; shelley carter, rn, mph, lovelace clinic foundation; bob mayer

aims: to develop and test an analytic legislative tool that provides states with the ability to analyze and propose reform to laws related to the exchange of electronic health information.

background: through extensive research, the multi-state harmonizing security and privacy law collaborative (hsplc) found myriad barriers to health information exchange in laws and business practices. in some cases, barriers are beneficial because they protect people’s privacy. however, barriers can be problematic when they prevent the timely exchange of information needed for the treatment of patients. there are many inconsistencies in state and federal laws and among state statutes in their definitions, organizational structure, and content. some states have adopted new legislation that addresses the exchange of health information that may further exacerbate differences among states and impede interstate exchange of electronic health information.

methods: hsplc developed a set of analytical tools and a narrative guide, the roadmap, to assist states in implementing an effective legal framework for the review and adoption of legislation that supports health information exchange (hie).
the tools and roadmap were created through extensive research to identify best practices for identifying, evaluating, and reforming state laws related to the disclosure of electronic health information.

results: hsplc found that various state resources (legal, legislative, healthcare policy, healthcare providers, and consumers) are necessary for successful completion of the roadmap to identify opportunities for legislative reform. hsplc believes that states will have greater likelihood of success in achieving legislative reform if they use the roadmap and reach out to other states contemplating a change in legislation. interstate collaboration and coordination are essential if we are to achieve a national legal and technical infrastructure that facilitates health information exchange.

conclusions: legislation in most states does not adequately address the exchange of electronic health information. drafting of legislation must take into account a state’s unique environment and culture, and the needs and support of stakeholders. the goal of using the analytic tool is to protect health information while removing barriers that impede the exchange of vital information. the hsplc roadmap provides a step-by-step process to analyze and reform state legislation.

ps - : optimizing health informatics interventions from the patient’s perspective: focus group on improving safe nsaid use

douglas w roblin, phd, kaiser permanente georgia; richard m shewchuk, phd, university of alabama at birmingham; jeroan j allison, md, msc, university of alabama at birmingham; renny varghese, mph, kaiser permanente georgia; suzanne baker, mph, university of alabama at birmingham; catarina i kiefe, md, phd, university of alabama at birmingham

background: patient-provider messaging in an electronic medical record (emr) system provides an opportunity to create and sustain productive patient-provider interactions.
we elicited patient perspectives on design, benefits, and concerns to improve usability and efficacy of a proposed health informatics intervention to support surveillance of, and provider feedback on, over the counter (otc) non-steroidal anti-inflammatory drug (nsaid) use.

methods: we conducted four focus groups involving kaiser permanente georgia (kpg) adults – years old who had a medical condition for which nsaids should be used cautiously or had a recent prescription for nsaids. the focus group elicited information regarding: otc nsaid use (including recognition of risks and side effects), design of an otc nsaid survey to be delivered via kp.org (the secure kp internet portal for patient-physician messaging), benefits and concerns about transmission of this information via electronic messaging to their primary care physicians, and

healthcare article

impact and effectiveness of legislative smoking bans and anti-tobacco media campaigns in reducing smoking among women in the us: a systematic review and meta-analysis

yelena bird , ladan kashaniamin , chijioke nwankwo and john moraros ,*

director ican research group, brandon, mb r a v , canada; yelenabird@gmail.com
department of community health and epidemiology, university of saskatchewan, saskatoon, sk s n z , canada; ladan.kashani@usask.ca
school of public health, university of saskatchewan, saskatoon, sk s n z , canada; ckn @mail.usask.ca
faculty of health studies, brandon university, brandon, mb r a a , canada
* correspondence: morarosj@brandonu.ca

received: november ; accepted: january ; published: january

abstract: background: the purpose of this study is to systematically review the literature addressing the effectiveness of legislative smoking bans and anti-tobacco media campaigns in reducing smoking among women.
methods: medline, pubmed, cinahl, and abi/inform were searched for studies published from onwards. meta-analysis was conducted using a random effects model and subgroup analysis on pre-selected characteristics. results: in total, articles were identified, and five studies satisfied the inclusion criteria. the studies varied from school-based to workplace settings and had a total of , women participants, aged to years old. three studies used legislative bans, one study used anti-tobacco campaigns and another one used both as their intervention. the overall pooled effect of the five studies yielded an odds ratio (or) = . (c.i. = . – . and i = . %). subgroup analysis by intervention revealed a significant pooled estimate for studies using legislative smoking bans or = . (c.i. = . – . and i = %). conclusion: legislative smoking bans were found to be associated with a reduction in the smoking rates among women compared to anti-tobacco media campaigns. further research in this area is needed. keywords: smoking; women; legislative smoking bans; anti-tobacco media campaigns . introduction the first report of the united states (us) surgeon general’s advisory committee on smoking found that cigarette smoking is a probable cause of lung cancer and poses a serious risk of death and disease for women [ ]. more than years later, smoking is still the leading cause of premature death among women in the us and across the world [ , ]. despite increased awareness of the harm caused by cigarette smoking, the effectiveness of global tobacco control initiatives has been questionable and the gains modest. the world health organization (who) predicts that cigarette smoking will continue to kill approximately eight million people a year, resulting in more than one billion deaths over the course of the st century [ ]. men continue to have higher smoking rates than women in the us and across the world but the gap has steadily decreased over the last couple of decades [ , ]. 
the narrowing of the gap suggests that women now share a much larger burden of smoking-related diseases, morbidities and mortalities than ever before. for instance, between and , death rates from lung cancer among us women increased by more than % [ ]. starting in the late s, lung cancer surpassed breast cancer to become the leading cause of cancer death among women in the us [ ]. studies have shown the risk for chronic diseases and dying due to smoking to be considerably higher in women than men over the past years [ – ].

healthcare , , ; doi: . /healthcare www.mdpi.com/journal/healthcare

tobacco companies have increasingly used a gender-specific approach in their marketing campaigns to effectively target women. many studies have shown that advertising campaigns by the tobacco industry seek to connect smoking to desirable women behaviours and attributes [ – ]. behaviours are linked to the importance and value of smoking to women in creating fun-loving environments, strong social relationships, positive body image, weight control, independence, social status, sexual desirability and self-relaxation/medication [ ]. attributes such as cigarette size (i.e., long and slim), packaging (i.e., glitzy and sexy), and taste (i.e., light and flavourful) have been changed and designed to attract women. for instance, brands such as vogue, silk-cut, and virginia slims have introduced attractive packaging styles like purse packs and a number of limited edition cigarette packs that have been heavily promoted by famous women celebrities and even fashion designers [ ]. it is ironic that tobacco companies have linked smoking to women’s independence/social status and well-being and yet cigarette smoking has had the opposite effect on their economic empowerment and physical health.
without empowerment and health, women cannot achieve equality and certainly cannot prosper. research has shown that girls and women who smoke are more likely to be socioeconomically disadvantaged and/or marginalized [ – ]. therefore, women are a top priority population for tobacco control and prevention efforts. the framework convention for tobacco control (fctc) led to a multi-national treaty to help combat the global scourge of the tobacco epidemic among a number of vulnerable populations including women [ ]. the fctc identified legislative bans and anti-tobacco media campaigns as important levers to help reduce smoking rates among women [ ]. a systematic review by hoffman et al. found that legislative bans and anti-tobacco media campaigns are effective tools in reducing smoking rates among countries that ratified the fctc treaty [ ]. additionally, bala et al. and de kleijn et al. concluded that mass media campaigns can be effective strategies in smoking reduction and cessation efforts among adults [ , ]. however, there is a significant gap in the literature regarding the systematic assessment of this important topic, specifically among women. therefore, the purpose of this study is to conduct a systematic review of the literature for quantitative evidence that determines the impact and effectiveness of legislative smoking bans and anti-tobacco media campaigns in reducing smoking among women in the us.

. methods

. . selection of studies

identified studies were screened for eligibility by two reviewers.
articles were considered eligible for inclusion in the present study if they: ( ) evaluated the effects of legislative smoking bans and/or anti-tobacco media campaigns among populations that included women years old or older; ( ) evaluated smoking status before and after the establishment of legislative smoking bans or anti-smoking media campaigns; ( ) had a comparison group included in the study; ( ) reported quantitative outcome measures specifically for women; and ( ) were published in the english language in peer-reviewed journals since , and available in full text.

. . search strategy

search terms related to legislative smoking bans and anti-tobacco media campaigns were used to search four online databases including: ( ) medline; ( ) pubmed; ( ) cinahl; and ( ) abi/inform. a grey literature search was also conducted on google and on the proquest dissertations & theses global database. the references of relevant articles were also carefully reviewed to identify possibly related studies. search results were imported to separate excel spreadsheets by using reference management software (zotero, corporation for digital scholarship, vienna, virginia, usa) and duplicate articles were removed.

. . data extraction and analysis

using excel spreadsheets, characteristics of selected studies were extracted including author, publication year, type of study, number of women participants, type of intervention and effect estimates. crude odds ratios and % confidence intervals were computed using the online medcalc tool for studies that did not provide them but had cross-tabulated data [ ]. meta-analysis was conducted using a random effects model. the random-effects model was used to determine the pooled mean effect size because it enables comparisons between the statistical results arising from the different samples and methodology (measuring methods and units) found among the selected studies [ ]. the primary outcome measure was the odds ratio (or).
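the two computations described above (a crude odds ratio from cross-tabulated counts, and random-effects pooling of study-level odds ratios) can be sketched as follows. this is an illustrative sketch only, not the authors' medcalc or stata workflow: the function names, the 2x2 cell layout, the 1.96 quantile (an assumed 95% level) and the choice of the dersimonian-laird estimator are all assumptions, since the text does not name the estimator used.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """crude odds ratio with a wald interval from a 2x2 table.

    a, b: smokers and non-smokers in the exposed group (hypothetical layout)
    c, d: smokers and non-smokers in the comparison group
    z:    normal quantile; 1.96 corresponds to an assumed 95% level
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(or)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def pool_random_effects(log_ors, ses, z=1.96):
    """dersimonian-laird random-effects pooling of study log odds ratios."""
    k = len(log_ors)
    w = [1 / se ** 2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))  # cochran's q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    # i-squared: share of total variability attributed to heterogeneity
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if k > 1 and q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se_pooled),
        "ci_high": math.exp(pooled + z * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }
```

a study's standard error can be recovered from a reported interval as (log(upper) - log(lower)) / (2 * 1.96), again assuming a 95% level, so that extracted and computed odds ratios can be pooled together.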
or calculations relied on study participant responses based on their smoking habits before and after establishment of legislative smoking bans or anti-smoking media campaigns. ors and % confidence intervals (cis) were either extracted from the articles or calculated by the authors using the quantitative data provided in the studies. heterogeneity was assessed using the higgins i-squared statistic [ ] and further explored with subgroup analysis on predetermined characteristics such as the study design, type of intervention and type of outcome assessed. the robustness of the findings was assessed by determining the influence of each individual study on the overall pooled estimate using tobias’ method [ ]. publication bias was assessed using a funnel plot and egger’s test. all analyses were carried out using stata/ic version . (statacorp lp, college station, texas, usa).

. results

. . article identification

in total, articles were identified ( from a database search, eight from the grey literature and eight using a snowball search). after removing duplicates, articles underwent a two-step screening process. the first step included a review of all titles and abstracts for relevance. following this step, studies were excluded. the second step included a careful review of the remaining full text articles. following this step, only five studies [ – ] met the eligibility criteria and were included in the meta-analysis. the summary of our study selection is shown in a prisma diagram (figure ).

. . study characteristics

the total number of women participants was , . the age of the women ranged from to years old. all studies were based in the us. the studies varied from school-based [ ] to workplace [ ] settings. out of the five studies included, only one [ ] used a high-quality study design (i.e., quasi-experimental) with control groups.
the other four studies [ , – ] used a lower-quality study design (i.e., cross-sectional) without control groups. among the five studies eligible for inclusion, three [ , , ] used legislative bans, one used anti-tobacco campaigns [ ] and another one used both [ ] as their intervention. the study characteristics are presented in table .

figure . prisma diagram, study selection process. preferred reporting items for systematic reviews and meta-analyses (prisma) flow diagram.

table . study characteristics.
columns: year; author; purpose; study type; data source; number of females; age range of females (years); type of intervention; type of outcome; or (ci); strengths; limitations.

zablocki et al.
purpose: to assess the association of smoking ban policies with smoking reduction and quit attempts among california smokers.
study type: cross sectional
data source: california longitudinal smokers survey
age range of females (years): ≥
type of intervention: home smoking ban, work place smoking ban, perceived city/community smoking ban
type of outcome: smoking prevalence
or (ci): or: . ( . – . )
strengths: participants are randomly selected; first study to examine the association of perceived city/town smoking bans at outdoor locations with smoking behaviors.
limitations: intervention and outcome data were assessed using self-reported data. only % of the sample participated in the follow-up.

page et al.
purpose: to examine the effect of a citywide smoking ban in comparison to a municipality with no smoking ban in colorado on maternal smoking outcomes and subsequent fetal birth outcomes.
study type: natural experiment
data source: state of colorado department of health, colorado birth registry, and the infant mortality registry data
number of females: ,
age range of females (years): all ages were included.
type of intervention: legislative smoking ban
type of outcome: smoking prevalence
or (ci): or: . ( . – . )
strengths: first evidence in regard to improvement of fetal outcomes and preterm birth as a result of smoking ban in the united states. including a comparison group with same demographics in the study.
limitations: self-reported data is used; mothers’ exposure to second-hand smoke was not measured directly. paternal smoking history was not included in the data to estimate shs. maternal self-report is probably under-reported due to social stigma related to smoking during pregnancy. mothers reported lifetime smoking, not in the time period close to the pregnancy.

rose et al.
purpose: to assess the prevalence of work place and home smoking bans and their associations with intention to quit, quit attempts, and -month sustained abstinence among employed females.
study type: cross sectional
data source: cross-sectional data from the / tobacco use supplement to the current population survey
age range of females (years): –
type of intervention: home and work place smoking bans
type of outcome: smoking prevalence
or (ci): aor: . ( . – . )
strengths: first study to examine the association of full smoking bans (at home and work place) with smoking behaviors among employed female smokers. effect of complete work and home ban was analysed in addition to their separate effects.
limitations: employed indoor females were included in this study; therefore these data may not be generalizable to all females. the data reported are cross-sectional and do not allow for causal associations. self-reported data is used. detailed information such as coworkers’ and spouse smoking and quitting, which may be influential on smoking behaviour, was not collected in the dataset.

levy et al.
purpose: to examine the association between tobacco control policies (clean air laws and media campaigns) with smoking prevalence.
study type: cross sectional
data source: tobacco use supplement to the current population survey –
number of females: total sample: , (number of females not mentioned)
age range of females (years): ≥
type of intervention: antismoking policies, anti-smoking media campaigns
type of outcome: smoking prevalence
or (ci): or (clean air): . ( . – . ); or (media): . ( . – . )
strengths: examining the effect of different tobacco control policies on smoking prevalence. a dataset related to a large population was used. age and gender variations in addition to variations over time were considered in the study.
limitations: different forms of policies that may have different effects were included in the policy measure. socio-economic factors were not considered in the study.

terry-mcelrath et al.
purpose: to examine the association between anti-tobacco advertising and smoking related outcomes with respect to gender and race/ethnicity
study type: cross sectional
data source: th, th, and th grades student data in – collected by monitoring the future study
number of females: ,
age range of females (years): –
type of intervention: anti-smoking media campaigns
type of outcome: smoking prevalence
or (ci): or: . ( . – . )
strengths: first study to examine the association between exposure to anti-tobacco advertising and smoking outcomes in th, th, th grades students. comparison among males and females and among different racial/ethnic groups was performed.
limitations: hispanics were included in the study population. however, spanish-language tv channels were not included.

. . pooled analysis

the overall pooled effect size (es) of the five studies yielded an or = . (c.i. = . – . and i² = . %). subgroup analysis by study design revealed a significant pooled estimate or = . (c.i. = . – . ) for the quasi-experimental study [ ] and a non-significant pooled estimate or = . (c.i. = . – . ) for the cross-sectional studies [ , – ]. subgroup analysis by intervention revealed a significant pooled estimate or = . (c.i. = . – . and i² = %) for studies using legislative smoking bans [ – , ] and a non-significant pooled estimate or = . (c.i. = . – . and i² = . %) for studies using anti-tobacco media campaigns [ , ] (figure ).

figure . overall pooled estimates.

. . risk of bias

all five studies were reviewed for risk of bias by using a modified newcastle ottawa scale (nos) [ – ]. this nos scale includes three components and eight items: ( ) selection of study groups (four items); ( ) comparability of the groups (one item); and ( ) ascertainment of the outcomes of interest (three items) [ ]. the quality of each study was determined by assigning it to one of three subgroups: ( ) good (≥two stars for selection of study groups, one star for comparability of the groups, and three stars for ascertainment of the outcomes of interest components); ( ) fair (one star for selection of study groups and two stars for ascertainment of the outcomes of interest components); or ( ) poor ( stars for selection of study groups, stars for comparability of the groups, and ≤one star for ascertainment of the outcomes of interest components). risk of bias was designated as: low, if there was good quality in all components; unclear/moderate, if there was fair quality in one or more components without poor quality in any components; or high, if there was poor quality in any one of the components. the four cross-sectional studies were found to have an overall moderate risk of bias, whereas the quasi-experimental study had a low risk of bias (table ).

table . risk of bias assessment using modified newcastle ottawa scales (noss).
items scored under each component: selection (representativeness, ascertainment of exposure, outcome at start); comparability (controls for gender, controls for covariates); outcome (assessment of outcome, completeness of outcome). only the component ratings are listed below.

nos, cross sectional
zablocki et al.: selection good; comparability good; outcome fair; risk of bias moderate
rose a et al.: selection good; comparability good; outcome fair; risk of bias moderate
levy et al.: selection good; comparability good; outcome fair; risk of bias moderate
terry-mcelrath et al.: selection good; comparability good; outcome fair; risk of bias moderate

nos, quasi experimental
page et al.: selection good; comparability good; outcome good; risk of bias low
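the designation rule stated in the risk of bias section (low if every component is good, moderate if any component is fair and none is poor, high if any component is poor) can be written as a small function. this is a hypothetical sketch for illustration; the function name and the string labels are assumptions, and it takes the three component ratings as given rather than deriving them from star counts.

```python
def risk_of_bias(selection, comparability, outcome):
    """map three nos component ratings to an overall risk-of-bias label.

    each argument is one of "good", "fair" or "poor" (labels assumed here).
    """
    ratings = (selection, comparability, outcome)
    allowed = {"good", "fair", "poor"}
    if any(r not in allowed for r in ratings):
        raise ValueError("ratings must be 'good', 'fair' or 'poor'")
    if "poor" in ratings:   # poor quality in any one component
        return "high"
    if "fair" in ratings:   # fair in at least one component, none poor
        return "moderate"
    return "low"            # good quality in all components
```

applied to the ratings in the risk of bias table, risk_of_bias("good", "good", "fair") gives "moderate" for each cross-sectional study and risk_of_bias("good", "good", "good") gives "low" for the quasi-experimental study, matching the reported designations.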
discussion overall, we found that the odds of smoking, though not statistically significant, were reduced by %. however, with a stratified pooled analysis by the type of intervention, we found that the odds of achieving smoking reduction among women with the implementation of legislative smoking bans were significantly higher by %. similarly, several studies in the scientific literature corroborate our findings [ , – ]. it is important to note that much of the global progress made in reducing the prevalence of smoking can be specifically attributed to the efficacy of legislative smoking bans [ – ]. smoke-free policies banning smoking in public places and workplaces are known to be the most effective measures. such policies are shown to help denormalize tobacco use [ ], reduce smoking prevalence [ ], limit exposure to smoke [ ] and mitigate negative health outcomes [ ]. our subgroup analysis found that anti-tobacco media campaigns had no statistically significant effect on smoking among women in the us. the pooled odds of smoking due to the implementation of anti-tobacco media campaign among women increased by %. there are several plausible explanations for this finding. we posit this may be a reflection of the broad and non-gender-specific messaging of the majority of anti-tobacco campaigns. additionally, it has been suggested that the effectiveness of media campaigns may be lessened among women because they watch fewer hours of television, when compared to men, and are, therefore, less likely to be exposed to the televised anti-smoking messages [ ]. our findings contradict the evidence reported by several studies, which show anti-tobacco media campaigns to be a useful tool in the reduction of smoking rates [ , , – ]. however, it is important to note that the reduction rates reported in the literature were obtained from generalized populations and not specifically for females. 
despite increased global awareness and numerous interventions on this important public health front, smoking rates among women continue to increase dramatically [ – ]. this development is concerning and may be in part attributed to changes in the marketing approach employed by the tobacco industry, as a greater focus is now placed on the use of new social media platforms that lack strict regulatory controls [ , ]. additionally, the tobacco industry expertly uses various gender-based advertising techniques to glamorize smoking in pop culture, as evidenced in many popular movies, music videos and fashion shows [ , ]. these venues are used as social cues to depict female characters as cool, independent, adventurous and edgy and therefore, strongly appeal to a wide range of young females [ ]. a recent report found that young people are exposed to an astounding . billion tobacco impressions in youth related films annually and that the overall number of tobacco incidents within us movies has increased by % from to [ ]. this is an important development as the us surgeon general found a causal link between exposure to these types of images and smoking initiation, especially among young women [ ]. . . strengths and limitations the present study is one of a few to examine the impact and effectiveness of policy measures (i.e., legislative smoking bans and anti-tobacco media campaigns) on smoking reduction specifically among women in the us. by pooling the various effect estimates, we were able to increase the sample size and thus, the power of the study to assess the desired effect. our study provides significant evidence that can be used as a reference point for future research. despite these notable strengths, our study is not without its limitations. first, there are only a small number of studies that use quantifiable data and a sound research methodology to study this topic, which limited our meta-analysis. 
second, there was a marked heterogeneity among the included studies and therefore, the pooled results should be interpreted with some degree of caution. third, the included studies obtained information through follow up surveys but did not account for loss to follow up and its resultant bias. finally, the majority of the studies were cross-sectional in nature and thus, reported on associations but cannot infer causation.

. . implications for policy and/or practice

successfully thwarting and/or reversing increases in tobacco use among women will lead to improved quality of life, positive health outcomes and major disease prevention opportunities. our findings have important implications for interventions aimed at reducing smoking among women in the us. legislative smoking bans were found to be associated with a reduction in the smoking rates among women, while anti-tobacco media campaigns were not. legislative smoking bans need to be promoted, enforced and further strengthened. anti-tobacco media campaigns need to be thoughtfully reviewed and revised and specifically tailored so as to effectively counter the tobacco industry’s targeting of women and expose its deliberate efforts to link smoking with women’s issues of independence, rights, status and progress in society [ ].

. conclusions

the complex and critical connection between smoking and women’s health needs to be widely acknowledged and fully elucidated, along with a gender-based analysis of tobacco use, advertising and legislation. this meta-analysis sought to determine the impact and effectiveness of counter tobacco marketing and legislative smoking bans in the reduction of smoking rates among women.
this topic requires urgent attention, comprehensive policies and further high-quality research (e.g., using control groups for comparison analysis, pre- and post-ban data and robust biochemically measured outcomes), with large sample sizes and longer follow up periods (six months or longer) to determine the most effective strategies for implementation and enforcement of smoking bans to prevent and/or reduce the extent of the tobacco epidemic among women in the us and across the world. author contributions: conceptualization, y.b. and j.m.; data curation and formal analysis, l.k. and c.n.; methodology, supervision and validation, y.b. and j.m.; writing—original draft, all authors; final review and editing, y.b., j.m. and l.k. all authors have read and agreed to the published version of the manuscript. acknowledgments: the corresponding author acknowledges and thanks the faculty of health studies, brandon university. conflicts of interest: the authors report no conflicts of interest in this work. references . united states surgeon general’s advisory committee on smoking and health; united states public health service; office of the surgeon general smoking and health. report of the advisory committee to the surgeon general of the public health service; office of the surgeon general: washington, dc, usa, . . public health agency of canada. smokers, by sex, provinces and territories—open government portal. available online: https://open.canada.ca/data/en/dataset/d cbc b-ae b- cd- - b a ff (accessed on june ). . jamal, a.; homa, d.m.; o’connor, e.; babb, s.d.; caraballo, r.s.; singh, t.; hu, s.s.; king, b.a. current cigarette smoking among adults—united states, – . mmwr morb. mortal. wkly. rep. , , – . [crossref] [pubmed] . mathers, c.d.; loncar, d. projections of global mortality and burden of disease from to . plos med. , , e . [crossref] . reid, j.l.; hammond, d.; burkhalter, r.; ahmed, r. 
tobacco use in canada: patterns and trends; propel centre for population health impact; university of waterloo: waterloo, on, canada, .
. u.s. department of health and human services. the health consequences of smoking— years of progress: a report of the surgeon general; u.s. department of health and human services, centers for disease control and prevention, national center for chronic disease prevention and health promotion, office on smoking and health: atlanta, ga, usa, .
. novotny, t.e.; giovino, g.a. tobacco use. in chronic disease epidemiology and control; brownson, r.c., remington, p.l., davis, j.r., eds.; american public health association: washington, dc, usa, ; pp. – .
. office on smoking and health (us). women and smoking: a report of the surgeon general; centers for disease control and prevention (us): atlanta, ga, usa, .
. huxley, r.r.; woodward, m. cigarette smoking as a risk factor for coronary heart disease in women compared with men: a systematic review and meta-analysis of prospective cohort studies. lancet lond. engl. , , – . [crossref]
. bloch, m.; althabe, f.; onyamboko, m.; kaseba-sata, c.; castilla, e.e.; freire, s.; garces, a.l.; parida, s.; goudar, s.s.; kadir, m.m.; et al. tobacco use and secondhand smoke exposure during pregnancy: an investigative survey of women in developing nations. am. j. public health , , – . [crossref]
. u.s. department of health and human services. the health benefits of smoking cessation: a report of the surgeon general; u.s. department of health and human services, public health service, centers for disease control, center for chronic disease prevention and health promotion, office on smoking and health: washington, dc, usa, .
. amos, a.; greaves, l.; nichter, m.; bloch, m.
women and tobacco: a call for including gender in tobacco control research, policy and practice. tob. control , , – . [crossref] . brown-johnson, c.g.; england, l.j.; glantz, s.a.; ling, p.m. tobacco industry marketing to low socioeconomic status women in the u.s.a. tob. control , , e –e . [crossref] . toll, b.a.; ling, p.m. the virginia slims identity crisis: an inside look at tobacco industry marketing to women. tob. control , , – . [crossref] . marmot, m. smoking and inequalities. lancet lond. , , – . [crossref] . lorenc, t.; petticrew, m.; welch, v.; tugwell, p. what types of interventions generate inequalities? evidence from systematic reviews. j. epidemiol. commun. health , , – . [crossref] [pubmed] . hill, s.; amos, a.; clifford, d.; platt, s. impact of tobacco control interventions on socioeconomic inequalities in smoking: review of the evidence. tob. control , , e –e . [crossref] [pubmed] . world health organization. who framework convention on tobacco control ; world health organization: geneva, switzerland, . . hoffman, s.j.; tan, c. overview of systematic reviews on the health-related effects of government tobacco control policies. bmc public health , , . [crossref] [pubmed] . bala, m.m.; strzeszynski, l.; topor-madry, r.; cahill, k. mass media interventions for smoking cessation in adults. cochrane database syst. rev. , cd . [crossref] [pubmed] . de kleijn, m.j.j.; farmer, m.m.; booth, m.; motala, a.; smith, a.; sherman, s.; assendelft, w.j.j.; shekelle, p. systematic review of school-based interventions to prevent smoking for girls. syst. rev. , , . [crossref] [pubmed] . schoonjans, f. medcalc statistical software. available online: https://www.medcalc.org/ (accessed on june ). . hedges, l.v.; vevea, j.l. fixed and random effects models in meta-analysis. psychol. methods , , – . [crossref] . higgins, j.p.t.; thompson, s.g. quantifying heterogeneity in a meta-analysis. stat. med. , , – . [crossref] . tobias, a. 
assessing the influence of a single study in the meta-anyalysis estimate. stata tech. bull. , , .
. levy, d.t.; tauras, j.a.; gerlowski, d.a.; bergman, j.; compton, c. the decision to smoke and the frequency of smoking by age and gender. j. appl. econ. policy highl. heights , , – .
. page, r.l.; slejko, j.f.; libby, a.m. a citywide smoking ban reduced maternal smoking and risk for preterm births: a colorado natural experiment. j. womens health , , – . [crossref]
. rose, a.; fagan, p.; lawrence, d.; hart, a.; shavers, v.l.; gibson, j.t.; rose, a.; fagan, p.; lawrence, d.; hart, a.j.; et al. the role of worksite and home smoking bans in smoking cessation among u.s. employed adult female smokers. am. j. health promot. , , – . [crossref]
. terry-mcelrath, y.m.; wakefield, m.a.; emery, s.; saffer, h.; szczypka, g.; o’malley, p.m.; johnston, l.d.; chaloupka, f.j.; flay, b.r. state anti-tobacco advertising and smoking outcomes by gender and race/ethnicity. ethn. health , , – . [crossref]
. zablocki, r.w.; edland, s.d.; myers, m.g.; strong, d.r.; hofstetter, c.r.; al-delaimy, w.k. smoking ban policies and their influence on smoking behaviors among current california smokers: a population-based study. prev. med. , , – . [crossref] [pubmed]
. wells, g.; shea, b.; o’connell, d.; peterson, j.; welch, v.; losos, m.; tugwell, p. the newcastle-ottawa scale (nos) for assessing the quality of nonrandomised studies in meta-analyses; ottawa health research institute: ottawa, on, canada, .
. naiman, a.; glazier, r.h.; moineddin, r. association of anti-smoking legislation with rates of hospital admission for cardiovascular and respiratory conditions. cmaj can. med. assoc. j. j. assoc. med. can. , , – . [crossref] [pubmed]
. callinan, j.e.; clarke, a.; doherty, k.; kelleher, c. legislative smoking bans for reducing secondhand smoke exposure, smoking prevalence and tobacco consumption. cochrane database syst. rev. , cd . [crossref]
. evans, w.n.; farrelly, m.c.; montgomery, e. do workplace smoking bans reduce smoking? am. econ. rev. , , – . [crossref]
. who. who report on the global tobacco epidemic , monitoring tobacco use and prevention policies; world health organization: geneva, switzerland, .
. crotty, j.; driffield, n.; jones, c. regulation as country-specific (dis-) advantage: smoking bans and the location of foreign direct investment in the tobacco industry. br. j. manag. , , – . [crossref]
. durkin, s.j.; biener, l.; wakefield, m.a. effects of different types of antismoking ads on reducing disparities in smoking cessation among socioeconomic subgroups. am. j. public health , , – . [crossref]
. hamilton, w.l.; biener, l.; brennan, r.t. do local tobacco regulations influence perceived smoking norms? evidence from adult and youth surveys in massachusetts. health educ. res. , , – . [crossref]
. hafez, a.y.; gonzalez, m.; kulik, m.c.; vijayaraghavan, m.; glantz, s.a. uneven access to smoke-free laws and policies and its effect on health equity in the united states: – . am. j. public health , , – . [crossref]
. frazer, k.; callinan, j.e.; mchugh, j.; van baarsel, s.; clarke, a.; doherty, k.; kelleher, c.
legislative smoking bans for reducing harms from secondhand smoke exposure, smoking prevalence and tobacco consumption. cochrane database syst. rev. , , cd . [crossref] . mackay, d.; haw, s.; ayres, j.g.; fischbacher, c.; pell, j.p. smoke-free legislation and hospitalizations for childhood asthma. n. engl. j. med. , , – . [crossref] . parkinson, c.m.; hammond, d.; fong, g.t.; borland, r.; omar, m.; sirirassamee, b.; awang, r.; driezen, p.; thompson, m. smoking beliefs and behavior among youth in malaysia and thailand. am. j. health behav. , , – . [crossref] [pubmed] . hyland, a.; wakefield, m.; higbee, c.; szczypka, g.; cummings, k.m. anti-tobacco television advertising and indicators of smoking cessation in adults: a cohort study. health educ. res. , , – . [crossref] [pubmed] . warner, k.e. the effects of the anti-smoking campaign on cigarette consumption. am. j. public health , , – . [crossref] [pubmed] . janz, t. current smoking trends. available online: https://www .statcan.gc.ca/n /pub/ - -x/ / article/ -eng.htm (accessed on june ). . goel, s.; tripathy, j.p.; singh, r.j.; lal, p. smoking trends among women in india: analysis of nationally representative surveys ( – ). south asian j. cancer , , – . [crossref] . li, x.; holahan, c.k.; holahan, c.j. sociodemographic and psychological characteristics of very light smoking among women in emerging adulthood, national survey of drug use and health, . prev. chronic. dis. , . [crossref] . cavazos-rehg, p.a.; krauss, m.j.; spitznagel, e.l.; grucza, r.a.; bierut, l.j. hazards of new media: youth’s exposure to tobacco ads/promotions. nicotine tob. res. off. j. soc. res. nicotine tob. , , – . [crossref] . liang, y.; zheng, x.; zeng, d.d.; zhou, x.; leischow, s.j.; chung, w. exploring how the tobacco industry presents and promotes itself in social media. j. med. internet res. , , e . [crossref] . kaleta, d.; usidame, b.; polanska, k. tobacco advertisements targeted on women: creating an awareness among women. cent. eur. j. 
public health prague , , – . [crossref] . tynan, m.a.; polansky, j.r.; titus, k.; atayeva, r.; glantz, s.a. tobacco use in top-grossing movies—united states, – . mmwr morb. mortal. wkly. rep. , , – . [crossref] . anderson, s.j.; glantz, s.a.; ling, p.m. emotions for sale: cigarette advertising and women’s psychosocial needs. tob. control , , – . [crossref] http://dx.doi.org/ . /cmaj. http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . / .cd .pub http://dx.doi.org/ . /aer. . . http://dx.doi.org/ . / - . http://dx.doi.org/ . /ajph. . http://dx.doi.org/ . /her/cym http://dx.doi.org/ . /ajph. . http://dx.doi.org/ . / .cd .pub http://dx.doi.org/ . /nejmoa http://dx.doi.org/ . /ajhb. . . http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /her/cyh http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /ajph. . . http://www.ncbi.nlm.nih.gov/pubmed/ https://www .statcan.gc.ca/n /pub/ - -x/ /article/ -eng.htm https://www .statcan.gc.ca/n /pub/ - -x/ /article/ -eng.htm http://dx.doi.org/ . / - x. http://dx.doi.org/ . /pcd . http://dx.doi.org/ . /ntr/ntt http://dx.doi.org/ . /jmir. http://dx.doi.org/ . /cejph.a http://dx.doi.org/ . /mmwr.mm a http://dx.doi.org/ . /tc. . healthcare , , of . u.s. department of health and human services. preventing tobacco use among youth and young adults: a report of the surgeon general; u.s. department of health and human services, public health service, office of the surgeon general: rockville, md, usa, . . gilmore, a.b.; fooks, g.; drope, j.; bialous, s.a.; jackson, r.r. exposing and addressing tobacco industry conduct in low-income and middle-income countries. lancet , , – . [crossref] © by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /). http://dx.doi.org/ . /s - ( ) - http://creativecommons.org/ http://creativecommons.org/licenses/by/ . /. 
a socio-environmental geodatabase for integrative research in the transboundary rio grande/río bravo basin
scientific data | ( ) : | https://doi.org/ . /s - - - www.nature.com/scientificdata
sophie plassin, jennifer koch ✉, stephanie paladino, jack r. friedman, kyndra spencer & kellie b. vaché
integrative research on water resources requires a wide range of socio-environmental datasets to better understand human-water interactions and inform decision-making. however, in transboundary watersheds, integrating cross-disciplinary and multinational datasets is a daunting task due to the disparity of data sources and the inconsistencies in data format, content, resolution, and language. this paper introduces a socio-environmental geodatabase that transcends political and disciplinary boundaries in the rio grande/río bravo basin (rgb). the geodatabase aggregates gis data layers on five main themes: (i) water & land governance, (ii) hydrology, (iii) water use & hydraulic infrastructures, (iv) socio-economics, and (v) biophysical environment. datasets were primarily collected from public open-access data sources, processed with arcgis, and documented through the fgdc metadata standard. by synthesizing a broad array of datasets and mapping public and private water governance, we expect to advance interdisciplinary research in the rgb, provide a replicable approach to dataset compilation for transboundary watersheds, and ultimately foster transboundary collaboration for sustainable resource management.
background & summary
finding solutions to manage scarce water resources to meet human water needs while sustaining ecosystems has become a priority for research and policy makers. in , the united nations (un) adopted sustainable development goals (sdg), including “ensuring access to water and sanitation for all” (sdg ). however, increasing pressure on water resources, due to population growth and climate change, may be an obstacle to reaching this goal and may lead to increased competition and political tensions. this especially applies to transboundary river systems, which represent % of freshwater supply basins and almost % of the total land area on earth. in transboundary basins, data- and information-sharing provide a mechanism to foster cooperation among countries and, ultimately, achieve more sustainable water management and maintain peace and security. synthesizing a broad range of environmental and social datasets is further critical for in-depth understanding of watershed dynamics and for assessing the sustainability of water allocation. in large areas, spatial datasets are also valuable to capture the spatial heterogeneity of local conditions and for better understanding of different development trends across a region. however, integrating scattered cross-disciplinary and multinational datasets is challenged by the disparity of data sources, transboundary discontinuity and the inconsistencies in format, content, spatial and temporal resolution, languages, and institutional norms. this can limit the study of transboundary socio-environmental systems, and calls for the development of a harmonized, transboundary, open-access database.
university of oklahoma, department of geography and environmental sustainability, e boyd st, suite , norman, ok, , usa. merolek research, po box , athens, , ga, usa. university of oklahoma, center for applied social research, partners place, stephenson parkway, suite , norman, , ok, usa.
oregon state university, biological and ecological engineering, sw th street, gilmore hall , corvallis, , or, usa. ✉e-mail: jakoch@ou.edu
the rio grande/río bravo basin (rgb) epitomizes the challenges of managing water resources under conditions of scarcity and transboundary cooperation issues. covering , km , the basin is shared by two countries (united states and mexico) and eight states, three in the u.s. (colorado, new mexico, and texas) and five in mexico (chihuahua, coahuila, durango, nuevo león, and tamaulipas). distribution of water resources is governed by distinct national and state water rights regimes, as well as bi-national and inter-state agreements. the river is highly engineered via extensive damming and channelization, which significantly disturb the natural flow regime. classified as one of the most endangered in the world, the river hosts several endangered and threatened species, such as the rio grande silvery minnow, and the southwestern willow flycatcher. despite chronically severe water scarcity, human water demand has grown along with an increase in population (current population is estimated at . million people). furthermore, irrigated agriculture makes up approximately . % of the total surface and groundwater withdrawals, while cultivated areas cover only . % of the basin. projected climate change and population growth are likely to lead to a growing imbalance between water supply and demand, potentially impacting the sustainability of the basin and resulting in a cascade of negative impacts (e.g., increased risk of wildfires).
to overcome the challenges of data- and information-sharing in transboundary basins, several initiatives were launched in recent years. the transboundary freshwater dispute database gathers global and regional information for the world’s international river basins, including the rgb. examples of topics include population, land-cover, irrigation, rainfall, and dams (http://gis.nacse.org/tfdd/index.php). in the rgb, a basin-level database synthesizing hydrologic, administrative, land-use/cover, and water management datasets was also developed at a finer resolution. however, we are not aware of a similar approach to capture the social heterogeneity and complexity across the basin. hence, we compiled a comprehensive socio-environmental geodatabase for the rgb, encompassing geospatial data sets related to water & land governance, hydrology, water use & hydraulic infrastructures, socio-economics, and biophysical environment. an innovative contribution of our geodatabase is to provide open access to a thorough collection of reliable and well documented, basin-wide geospatial data sets, as well as, to our knowledge, the most extensive spatial picture of multi-scale water governance in the rgb. the geodatabase expands the availability of information detailing the social and environmental characteristics of the basin and enables an integrated and cross-disciplinary approach to the watershed. many outputs can be derived from the geodatabase, including maps, spatially explicit integrated models, and statistical analyses (see usage notes for additional details). the data synthesis approach presented here is transferable to other transboundary basins, and the geodatabase will help the research, management, and policy-making communities to foster transboundary collaboration.
methods
dataset preparation.
the development of the socio-environmental geodatabase followed four steps: (1) data identification; (2) collection of raw data; (3) assembling, harmonizing, and geoprocessing of datasets; and (4) organization of the final geospatial data sets into the geodatabase (fig. ).
first, we drew on an interdisciplinary collaboration among social and environmental scientists to identify critical data requirements to support socio-environmental research in the rgb. the environmental anthropologists shared primary data analysis from ethnographic research and developed a qualitative typology of actors that we used to identify water and land governance datasets. we integrated land management as a key component for the study of socio-environmental dynamics in the watershed because land management decisions affect water run-off, stream flow, aquifer recharge, and soil erosion. the actor typology distinguishes the actors according to their roles in water and land management (water supply, consumption, infrastructure management, environmental protection, recreation) and their spatial scale of action (from local to bi-national level; supplementary information ).
second, we gathered raw datasets from distinct sources totalling around gb. where possible, and if the spatial resolution was high enough, we prioritized global datasets over national datasets and national datasets over regional datasets to limit data inconsistency and heterogeneity. most of the raw datasets were collected from national agencies ( %), followed by state agencies ( %), global agencies ( %) and binational agencies ( %). we also collected datasets from joint research projects ( %) and private institutions ( %). most of the input datasets were available open-access online. we also contacted public or private institutions to access unpublished information and were granted permission to publish the synthesized data. the original formatting of the raw datasets varied greatly. % were vector data ( .
% shapefiles, % kml/kmz files, . % feature class), % were raster files, and % were only available in tabulated format (comma-delimited text or microsoft excel files). in terms of language, % of the raw datasets were in english and % in spanish. the detailed list of raw datasets used to produce the geospatial output datasets is available in supplementary information . for each raw dataset, we included a short description, the spatial and temporal dimensions, the format, language, data source, and date of access.
third, we assembled, harmonized, and geoprocessed the datasets using esri’s arcgis . and, where appropriate, the python programming language (version . ) and the arcpy package for automating the geoprocessing workflow. geoprocessing operations varied according to the type of raw dataset, and a detailed description of these operations is available in the metadata for each geographic layer. in general, we tried to synthesize scattered state and national raw data into one data layer. where necessary, we harmonized the information in the datasets (variables, values, units of measurement) and translated spanish information to english. geospatial operations included the projection of the final datasets to the most appropriate projected coordinate system that minimized areal distortion for the whole rgb (north america albers equal area conic), and clipping datasets to the rgb boundary. instead of using the hydrological basin boundary, we used the spatial boundaries of the rio grande/río bravo socio-environmental system as delineated by koch et al. for all political jurisdictional boundary information (national, state, or county), given the spatial mismatch with the catchment boundary, we prioritized the use of administrative boundaries over catchment boundaries in locations where they overlap.
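the harmonization described in the third step (aligning variables, values and units of measurement, and translating spanish attribute values to english) can be illustrated with a minimal plain-python sketch. it is not the authors' arcpy workflow: the field names (`COUNTY`, `MGAL_D`, `MUNICIPIO`, `USO`, `HM3_ANIO`), the source units and the translation table are all hypothetical and chosen only for illustration.

```python
# Minimal sketch of attribute harmonization across two national sources.
# All field names, units and lookup values below are hypothetical.

GALLON_M3 = 0.003785411784  # one US gallon in cubic metres (exact by definition)
DAYS_PER_YEAR = 365

# Source-specific field names mapped onto one shared schema.
FIELD_MAP = {
    "us": {"COUNTY": "unit_name", "SECTOR": "sector", "MGAL_D": "volume"},
    "mx": {"MUNICIPIO": "unit_name", "USO": "sector", "HM3_ANIO": "volume"},
}

# Spanish-to-English lookup for categorical values.
SECTOR_EN = {
    "agrícola": "irrigation",
    "público urbano": "public supply",
    "industrial": "industrial",
}


def to_mcm_per_year(value, source):
    """Convert a source volume to million cubic metres per year."""
    if source == "us":
        # million gallons per day -> million cubic metres per year
        return value * GALLON_M3 * DAYS_PER_YEAR
    # cubic hectometres per year already equal million cubic metres per year
    return value


def harmonize(record, source):
    """Rename fields, translate the sector label and unify the volume unit."""
    out = {"country": source}
    for src_field, target in FIELD_MAP[source].items():
        out[target] = record[src_field]
    label = out["sector"].lower()
    out["sector"] = SECTOR_EN.get(label, label)
    out["volume_mcm_yr"] = round(to_mcm_per_year(out.pop("volume"), source), 3)
    return out


if __name__ == "__main__":
    print(harmonize({"COUNTY": "El Paso", "SECTOR": "Irrigation", "MGAL_D": 10.0}, "us"))
    print(harmonize({"MUNICIPIO": "Juárez", "USO": "Agrícola", "HM3_ANIO": 12.5}, "mx"))
```

keeping the field map and the translation table as data rather than code means a new source can be added by extending the dictionaries, which is one plausible way to keep such a workflow documented and repeatable.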
fourth, we stored the resulting geographic layers, including vector data and raster data (grid), in a geodatabase with a total size of . gb. we organized the datasets into five main themes: water & land governance, hydrology, water use & hydraulic infrastructures, socio-economics, and biophysical environment.
metadata.
for each individual data layer, we produced a standardized metadata record, following the federal geographic data committee content standard for digital geospatial metadata (fgdc csdgm). the standard includes seven types of information: identification information (basic information about the dataset such as the citation, description, spatial domain, access constraints, etc.), data quality information, spatial data organization information, spatial reference information (geographic coordinate system definition), entity and attribute information, distribution information, and the metadata reference information. the metadata records have been created through the arcmap graphical user interface and are automatically included with the hosted layer item. the user can export the metadata in xml format through the arcmap graphical user interface if needed.
fig. schematic overview of the workflow applied to produce the socio-environmental geodatabase for the rio grande/río bravo basin.
data records
the socio-environmental geodatabase is freely and publicly available on the osf repository. the collection of datasets can be downloaded as a single file (.gdb) or as individual geospatial output layers (shapefiles or raster files). here, we describe the geospatial datasets available for the five themes water & land governance, hydrology, water use & hydraulic infrastructures, socio-economics, and biophysical environment.
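for readers who export a layer's metadata to xml, the seven csdgm sections listed under metadata can be checked with the python standard library. the sketch below assumes the export uses the standard's short element names (`idinfo`, `dataqual`, `spdoinfo`, `spref`, `eainfo`, `distinfo`, `metainfo`); the sample record, its title and its abstract are invented placeholders, not content from the geodatabase.

```python
import xml.etree.ElementTree as ET

# CSDGM short element names for the seven top-level sections.
SECTIONS = {
    "idinfo": "identification information",
    "dataqual": "data quality information",
    "spdoinfo": "spatial data organization information",
    "spref": "spatial reference information",
    "eainfo": "entity and attribute information",
    "distinfo": "distribution information",
    "metainfo": "metadata reference information",
}

# Invented placeholder record, deliberately incomplete.
SAMPLE = """<metadata>
  <idinfo>
    <citation><citeinfo><title>hypothetical layer: gauges_usgs</title></citeinfo></citation>
    <descript><abstract>placeholder abstract for illustration</abstract></descript>
  </idinfo>
  <spref/>
  <metainfo><metstdn>FGDC Content Standard for Digital Geospatial Metadata</metstdn></metainfo>
</metadata>"""


def summarize(xml_text):
    """Return the layer title and which of the seven sections are present."""
    root = ET.fromstring(xml_text)
    title = root.findtext("idinfo/citation/citeinfo/title")
    present = [name for name in SECTIONS if root.find(name) is not None]
    missing = [name for name in SECTIONS if root.find(name) is None]
    return {"title": title, "present": present, "missing": missing}


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(summary["title"])
    for name in summary["missing"]:
        print("missing section:", SECTIONS[name])
```

a check of this kind is a convenience for users of the exported files; it says nothing about how the records themselves were authored, which the text attributes to the arcmap graphical user interface.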
online-only table and tables – provide a systematic overview of the individual layers, i.e., the name, description, format and data sources for each theme. a comprehensive list of acronyms is included in supplementary information .
water & land governance (gov).
the water & land governance category is divided into sub-categories, representing political jurisdiction boundaries, surface water management agencies (binational, federal, state), state and interstate multi-stakeholder platforms, groundwater-focused institutions, irrigation and conservation organizations, land management, and border control. a description of each entity with their missions is provided in supplementary information .
political jurisdiction boundaries. the geodatabase provides the boundaries of four political jurisdiction divisions: nation, state, county/municipio, and places (cities, towns and villages). the geodatabase also provides a point dataset of the most populated places in both countries, which includes the main u.s.-mx sister cities along the border.
binational surface water management agencies. the geodatabase gathers vector datasets of binational infrastructure that spatialize the joint operations of the u.s. international boundary and water commission (ibwc) and its mexican counterpart the comisión internacional de límites y aguas (cila), regarding national allocation of water, dam and reservoir operations, river course management (the banks, the berms, the vegetation), water quality, sanitation, and flood control in the border region.
federal surface water management agencies. this sub-category includes two datasets of federal dams. one gathers all federal dams operated by the comisión nacional del agua (conagua) in mexico. the other maps all dams owned by a federal agency in the u.s., including the u.s. army corps of engineers (usace), the bureau of indian affairs (bia), the bureau of land management (blm), the bureau of reclamation (usbr), the u.s.
fish and wildlife service (usfws) and the u.s. forest service (usfs).
state and intrastate surface water management agencies. this sub-category depicts the boundaries of the offices of the state engineer (ose) and their regional offices (divisions and districts) in charge of the administration of water rights and interstate compacts in the u.s. for mexico, we generated spatial layers representing state-level offices of the federal agency conagua. we also included entities that engage in planning, coordination, and oversight of domestic water supply and wastewater treatment state-wide: in chihuahua, the junta central de agua y saneamiento (jcas); in coahuila, the comisión estatal de aguas y saneamiento (ceas); in durango, the comisión de agua del estado de durango (caed); in nuevo león, the servicios de agua y drenaje de monterrey (sadm); and in tamaulipas, the comisión estatal del agua en tamaulipas (ceat).
datasets | output geospatial layers | description | format | input source
watershed boundary | sub_basins, rgb_basin | watershed boundaries | polygon | fao
river network | rivers | river network | line | fao
aquifer boundaries | aquifer | aquifer boundaries in u.s. and mexico | polygon | usgs, conagua
gauging stations | gauges_usgs; gauges_ibwc; gauges_conagua | gauges operated by usgs in u.s.; gauges operated by ibwc along the border; gauges operated by conagua in mx | points | usgs, usibwc, conagua
quality stations | quality_stations | quality stations managed by ibwc and tceq | point | usibwc
table . list of geospatial layers for the hydrology category including their description, format and input data sources.
state and inter-state multi-stakeholder platforms. the geodatabase includes five platforms:
• the rio grande compact commission — u.s. inter-state platform — monitors water flows and administers water-sharing among the states of colorado, new mexico, and texas. we mapped its domain through three spatial layers: one representing the spatial scope of the interstate compacts (including the rio grande compact), one identifying the gauges used to monitor and account for states’ water delivery obligations to each other, and one identifying the “post-compact” reservoirs used to manage states’ water delivery obligations.
• the san juan chama project (sjcp) — u.s. inter-state platform — moves water through a trans-mountain diversion from the colorado river basin in the state of colorado (san juan river) to municipal and irrigation users in the upper and middle rio grande valley of new mexico. we mapped this project with three vector datasets: one for the trans-mountain diversion, one for the reservoirs where imported water is stored and transferred, and one displaying the users of that water (counties, municipalities, irrigation districts, tribe).
• the rio grande project — binational and inter-state platform — supplies water for irrigation from the elephant butte reservoir to several u.s. and mexican irrigation districts. its domain is represented through two vector datasets: one representing the dams used for storage (elephant butte, caballo) plus the diversion dams for distributing water in the u.s. and mexico (percha, leasburg, mesilla, american, riverside — which is inactive, international); and one representing the irrigation districts supplied by the rio grande project (ebid, epwid# and dr valle de juarez).
• the consejo de cuenca del río bravo (rio bravo watershed council) — mexican inter-state platform — is a multi-state entity that develops and implements a basin management plan for the conagua hydrological-administrative region vi in mexico. we mapped its domain using its administrative boundary.
• the rio grande basin roundtable in southern colorado — intra-state platform — brings together various stakeholders such as counties, municipalities, and conservancy districts to identify consensual strategies that meet competing needs in the region, and to provide support and funding for basin projects. the geodatabase maps the boundary of its domain of action.
groundwater-focused institutions. this dataset maps the local entities and related infrastructures associated with the administration, management, monitoring and conservation of groundwater in the u.s. states of colorado, new mexico, and texas.
• in colorado, we mapped three levels of groundwater management: the rio grande water conservation district (rgwcd) that manages groundwater at the regional level (san luis valley); the groundwater management subdistricts that manage groundwater at the local level (subdistrict); and the high capacity wells with augmentation/replacement plans. for the regional level, we provided the spatial domain of the rgwcd, the location of unconfined and confined wells that the rgwcd monitors, and the boundaries of the closed basin, an unconfined groundwater salvage project maintained by rgwcd that transfers water from the aquifer to the surface water basin.
• the state of new mexico is divided into declared groundwater basins where the ose assumes jurisdiction over the appropriation and use of groundwater. we mapped the basins overlapping the rgb.
• in texas, we mapped the domains of the groundwater management areas (gma), the groundwater conservation districts (gcd) in charge of the implementation of groundwater management plans within the gma, and the priority groundwater management areas (pgma) located in areas with critical groundwater problems.
in mexico, except for the groundwater permits (títulos y permisos de aguas nacionales) available in the section water use & hydraulic infrastructures > water rights/permits, we did not find any comparable administrative or governance mechanism explicitly set up on a spatial basis to manage groundwater per se.
datasets | output geospatial layers | description | format | input source
withdrawals | withdrawals_ | amount of surface water and groundwater used in mcm by the different sectors on a county/municipio basis in | polygon | usgs, conagua
irrigation | irrigation_us; irrigation_mx | spatial coverage of irrigated lands and irrigation systems (surface, sprinkler, micro-irrigation, other) on a county/municipio basis | polygon | usgs, inegi
water rights | co_waterrights; nm_waterrights; tx_waterrights; coa_waterrights; chi_waterrights; dur_waterrights; nvl_waterrights; tam_waterrights | surface and ground water rights in colorado (division ); surface and ground water rights in new mexico; surface water rights in texas; surface water & ground water rights in coahuila; surface water & ground water rights in chihuahua; surface water & ground water rights in durango; surface water & ground water rights in nuevo león; surface water & ground water rights in tamaulipas | point | cdwr, nmose, tceq, conagua
dams | dams_us; dams_mx | all dams in u.s. derived from usace-nid; all dams in mx derived from cenapred | point | usace, cenapred
water diversion structures | diversion_us; diversion_mx; diversion_co | water diversion (canal/ditch, connector, pipeline, underground conduit) in u.s.; water diversion (canals, aqueducts) in mx; canals in co | line | usgs-nhd, inegi, cwcb/dwr
wells | wells_us_nhd; wells_co_cwcb; wells_nm_nmose; wells_tx_twdb | wells in u.s. derived from usnhd; wells in co compiled by cwcb/dwr; wells in nm compiled by nmose; wells in tx compiled by twdb | point | usgs-nhd, cwcb/dwr, nmose, twdb
table .
list of geospatial layers for the water use & hydraulic infrastructures category including their description, format and input data sources.
at the transboundary level, the u.s. and mexico also established a binational agreement, the transboundary aquifer assessment program (taap), to strengthen collaborations among mexican and u.s. institutions to jointly assess priority shared aquifers along the u.s.-mexico border. two of the four transboundary priority aquifers underlie the rgb: mesilla/conejos-médanos and hueco bolson.
irrigation distribution organizations. this sub-category maps the boundaries of the organizations that divert and coordinate surface water delivery to irrigators, and ensure maintenance of the irrigation conveyance systems, in both the u.s. and mexico. our dataset provides the name and the area managed by the organizations, in addition to other specific information for each state and country. we sought to gather the spatial boundaries of three organizations: the irrigation districts (distritos de riego in mexico), ditch companies, and community ditch associations, an umbrella term under which we include the acequia systems of colorado and new mexico, and unidades de riego (irrigation units) across the states of mexico. however, spatial datasets of ditch companies and community ditch associations were unavailable for new mexico and mexico.
land management. this dataset maps four types of land management: certified ejido & communal land, public, native american/tribal, and private/not reported.
the dataset results from the compilation of two national datasets: the national surface management agency area polygons released by the bureau of land management (blm) that maps state and federally owned lands as well as native american tribal lands in the u.s., and the perimetrales de los núcleos agrarios certificados produced by the registro agrario nacional (ran) that maps ejido and comunidad (communal) lands in mexico. when compiling the datasets, we labelled all areas that were classified by neither the blm nor the ran as “private/not reported”.
protected areas. this dataset maps the areas designated as protected by the international union for conservation of nature (iucn).
border control. this sub-category maps three key components of border control along the international reach of the rio grande/río bravo: the u.s. southwest border patrol sectors associated with the migration apprehension statistics; the border crossing ports of entry; and vehicle and pedestrian fences/barriers. u.s. border control is very active and has important implications for the management of riverbank and riparian areas. however, this research to date has not covered policies or practices by mexican border-related agencies that affect water management or flow.
soil and water conservation districts (included only for the u.s.). the vector dataset provided in this sub-category maps the outlines of the soil and water conservation districts (swcds) in colorado, new mexico, and texas. swcds foster voluntary conservation practices among private and public landowners, helping to manage and protect soil, water, forests, and wildlife at the local level. the swcds are a specific kind of district, with a mandate that is particular to the legal and institutional history of land and water management in the u.s. our research to date has not found anything equivalent to this formation in mexico, and even less so, any spatial datasets for it.
collaborative conservation projects.
this sub-category includes conservation projects where members of multiple sectors and/or institutions actively work in collaboration to identify, protect, manage, monitor, and/or develop policy for the conservation of specific sites. our research to date has not extended to documentation of all such projects in either the u.s. or mexico, and did not uncover spatial data sets that document their location. our datasets map three collaborative conservation projects. two of them, the landscape conservation cooperatives (lccs) and the north america bird conservation joint ventures (bjvs), are transboundary. for both, we mapped the boundaries of the four lccs and the five bjvs that span the rgb. while the u.s. fish and wildlife service recently discontinued its funding for the lccs network, we decided to include these cooperatives in our dataset since the lccs generated highly relevant information for the rgb in the past. the third project refers to the instream flow program in colorado. for this project, we mapped the streams and lakes affected by the instream flow program.
category | output geospatial layers | description | format | input source
population | population | population in , , [county/municipio-based] | polygon | us census bureau, inegi
population density | popdens_; popdens_; popdens_; popdens_; popdens_ | population density in , , , , [ arc-second resolution] | raster | ciesin - columbia university
income | income_us; distributionincome_mx; distributionincome_mx; distributionincome_mx | personal income in u.s. from to [county-based]; income distribution in mx [municipio-based] in , , | polygon | usbea, inegi compiled by conabio
number and size of farms | farms_us; farms_us; farms_mx | number of farms, farm area (in hectares and acres) and farm area distributed by farm size in us, in – [county-based]; number of farms and farm areas (hectares) in mx in [municipio-based] | polygon | usda-nass, inegi
transport infrastructures | roads; railroads | roads; railroads | line | ciesin - columbia university/itos - university of georgia, inegi/nrcan/usgs/cec
table . list of provided geospatial layers for the socio-economics category including their description, format and input data sources.
hydrology (hyd).
the hydrology category includes five surface water and groundwater related sub-categories.
watershed boundary. the geodatabase provides the outlines of the rio grande/río bravo basin and of its sub-basins. the watershed boundary dataset is derived from the world wildlife fund’s (wwf) hydrosheds product, created from nasa’s shuttle radar topography mission (srtm) -second digital elevation model (dem) and released by the food and agriculture organization (fao) geoportal.
river networks. this dataset of vectorized river reaches, derived from the wwf hydrosheds products, was also accessed from the fao geoportal.
aquifer boundaries. we mapped the aquifer boundaries by assembling two national geospatial datasets compiled by the national system of water information of conagua and the usgs.
gauging stations. the geodatabase provides datasets for the location of discharge monitoring stations operated by usgs (above elephant butte dam), ibwc/cila (below elephant butte dam and along the border to the gulf of mexico), and conagua (tributaries of the rio grande/río bravo coming from mexico). each dataset provides a url to the historical time series of the stream flow.
water quality stations. this dataset provides the location of water quality stations monitored by ibwc and tceq for the u.s. portion of the rgb.
water use & hydraulic infrastructures (use).
water use & hydraulic infrastructures datasets provide information related to consumption and diversion of water.
withdrawals.
this dataset provides estimates in million cubic meters per year of water used at the u.s. county and mx municipio level. for both countries, estimates are available by type of source (surface or groundwater) and by sector of use (e.g., public supply, domestic, industrial, irrigation, thermoelectricity). for the u.s., the dataset also breaks down estimates by fresh water and saline water.
irrigation. we provide two datasets (one per country), which report on a county/municipio basis the irrigated area estimated in hectares and the spatial coverage of different systems of irrigation (surface, sprinkler, micro-irrigation and other), estimated in hectares for the u.s. and in number of farms for mexico. for the u.s., we used the original classes (surface, sprinkler, micro-irrigation); for mexico, we clustered the classes as follows: lined and earthen canals in surface; sprinkler and micro-sprinkler in sprinkler; and drip in micro-irrigation.
table . list of provided geospatial layers for the biophysical environment category including their description, format and input data sources.
datasets | output geospatial layers | description | format | input source
ecoregions | na_eco_ ; us_eco_ | ecoregions level iii for u.s. and mexico; ecoregions level iv for u.s. | polygon | us-epa ,
habitat | us_critbab_line, us_crithab_poly; us_riparian; us_wetlands; mx_ramsar_site; mx_sap; mx_spr | u.s. fws threatened & endangered species active critical habitat boundaries; riparian areas in the u.s.; wetlands in the u.s.; ramsar sites in mexico; priority focus sites for biodiversity conservation in mexico; priority sites for restoration in mexico | polygon, line | usfws – , conanp , conabio ,
soil cover | soil | soil units based on fao taxonomy | polygon | fao ,
elevation | elev | elevation derived from the srtm dem and the usgs gtopo [ arc-second resolution] | raster | fao/iiasa/isric/iss-cas/jrc
slope | slopedeg; slopepct | slope in degree generated from the elevation; slope in percent generated from the elevation | raster | fao/iiasa/isric/iss-cas/jrc
land cover | lc | land-cover in [ m resolution] | raster | ccrs/ccmeo/nrcan/conabio/conafor/inegi/usgs
land use | cdl , cdl , cdl , cdl , cdl , cdl , cdl , cdl , cdl , cdl , cdl ; mx_lu_ , mx_lu_ ; mx_ancrop_winter , mx_ancrop_spring , mx_percrop | time-series ( – ) of the cropland data layer in us [ m resolution]; land-use in mx in and [ m resolution]; annual winter, annual spring and perennial areas planted in mx in [municipio-based] | raster, polygon | usda-nass , inegi , , inegi
water rights/permits. this sub-category displays the allocation of surface water rights and groundwater rights at the state level for the u.s. and mexico. a water right is a property right that conveys the right to use a certain amount of water. these rights are administered by the offices of the state engineer in the u.s. and by conagua in mexico. the datasets report the geographic coordinates of the water right holder, and when available, the amount of water right authorized (in acre-feet in the u.s. and in cubic meters in mexico), the name of the water right holder, the adjudication date, the rank of priority (i.e., the water right’s seniority in a water drainage), or the type of use.
dams. we provide two vector datasets. the first one, derived from the national inventory of dams, maps all dams in the u.s. exceeding a certain height and capacity of storage. the second one maps all dams recorded by the cenapred in mexico.
water diversion structures. the geodatabase provides three vector datasets of water diversion structures (ditch/canal, pipeline, underground conduit and connector). the first one covers the five mx states and is derived from a mexican national data source. the second one covers the three u.s. states and is derived from a u.s. national data source.
the third one covers colorado with more details and is derived from a state data source.
wells (included only for the u.s.). we compiled four datasets of well locations. one is derived from the national hydrography dataset (nhd) for the whole u.s. portion of the rgb; the other three were compiled by the ose for each u.s. state and provide more details. our research to date has not uncovered data for wells in mexico.
socio-economics (se). this category includes five datasets related to demography, economics, and transportation networks.
population. this vector dataset reports the total population for the years , , and on a county/municipio basis. we assembled national demographic census information distributed by the u.s. census bureau and inegi.
population density. this raster dataset depicts population density (number of persons per square kilometre) at arc-second resolution (approximately km at the equator) for the years , , , , and .
income. due to the lack of consistent data for both countries, the spatial database includes two heterogeneous datasets: the personal income for the u.s. counties from to and the income distribution (i.e., the distribution of the working population per category of income) in , , and for the mexican municipios.
fig. map of the main dams and irrigation districts in the rio grande/río bravo basin.
number and size of farms. this sub-category provides county/municipio-based statistics about the number of farms, the total farmed area, and the average farm size (in and for the u.s. and in for mexico).
transport infrastructures. this sub-category contains two datasets: the roads derived from the global roads open access data set, version (groadsv ) and the railroads derived from the north american environmental atlas.
biophysical environment (bio).
the biophysical environment category provides ecological, pedological, topographical, and land-use/land-cover datasets.
ecoregions. ecoregions are areas with similar patterns of biotic and abiotic phenomena, including geology, physiography, vegetation, climate, soils, land use, wildlife, and hydrology. ecoregions can be mapped at different nested levels. for north america (canada, u.s. and mexico), data are available for three levels (level i being the coarsest and level iii the finest). level iv ecoregions have also been delineated for the u.s. but not yet for mexico. each of those datasets is included in the geodatabase, although ecoregion level iv is missing for mexico.
habitat. we compiled six national datasets due to a lack of transboundary coverage. in the u.s., we mapped wetlands and deepwater habitats (extent, approximate location, and type), riparian areas, and critical habitat for endangered and threatened species. in mexico, we mapped ramsar wetlands sites (representative, rare or unique wetlands that are important to conserve biological diversity), priority sites for restoration (areas of high biological value that require restoration actions), and priority focus sites for biodiversity conservation (conserved habitats that host endangered and threatened species and are adjacent to the protected areas).
soil cover. this vector dataset displays different soil units based on the fao taxonomy at : , , scale.
elevation. this raster dataset, derived from the harmonized world soil database (version . ), provides the median elevation aggregated to arc-second grid cells. it is derived from the nasa shuttle radar topographic mission digital elevation model and the usgs gtopo .
slope. we calculated the median slope in percent and in degrees at a arc-second resolution from the elevation data described above, using the spatial analyst slope tool in arcgis.
land cover.
this raster dataset, derived from the north american land change monitoring system (nalcms), depicts the land cover in the rgb, based on landsat satellite imagery. nalcms uses level ii of the land cover classification system (lccs) standard developed by fao.
fig. map of the rio grande/río bravo basin depicting land ownership, protected areas, and major rivers, generated from the geospatial datasets of the geodatabase.
land use. due to a lack of consistent transboundary datasets, we gathered national datasets to map land-use in the rgb. for the u.s., we used the yearly usda cropland data layer, a high-resolution crop-specific land-use image, available from to at a -meter resolution. for mexico, we derived our datasets from two governmental datasets. the first one is a spatial dataset of land uses for and at a -meter resolution, with a coarse classification of crops (rain-fed vs irrigated crops). the second one, non-spatial, is derived from the census and provides on a municipio basis planted area estimates (in hectares) of annual spring and winter crops and perennial crops.
data gaps. although we developed an extensive database, it is not comprehensive for either the u.s. or mexico. our approach led us to identify several spatial data gaps for socio-environmental research in the rgb.
yearly crop-specific datasets for mexico. supplying yearly crop-specific time-series datasets at a -m resolution will be helpful for comparing cropland dynamics in the u.s. and mexico.
boundaries of community ditch and acequia associations (in nm) and unidades de riego (mx). to the authors’ knowledge, these spatial boundaries are not publicly available. mapping them is important because of their water management implications. the nmose is currently working on updating and assembling this information for new mexico.
land ownership.
for mexico, we did not find any spatial datasets delineating public lands, except the certified ejidos and communal lands. generating a dataset that differentiates, in more detail, public from private lands in mexico would be valuable for land and water management research.
groundwater wells (mx). we were not able to access a spatial dataset of wells for mexico. mapping well locations is of primary interest for surface and groundwater management in the basin.
binational infrastructures. some information on binational infrastructure is missing, such as the morillo drain (described in supplementary information ).
climate records are also critical for understanding hydrological processes in the basin. while not included in our geodatabase, livneh and others published a transboundary historic dataset that derives observed daily precipitation, minimum and maximum temperature gridded to a / ° (~ km) resolution for the time period – .
fig. map comparing the total amount of water withdrawals (millions of cubic meters per year) in with the population on a county/municipio basis.
technical validation
to ensure the reliability of the spatial database, we draw on a critical assessment of three factors: the quality of the source of information, transboundary discontinuities, and geoprocessing. regarding the sources of information, we sought to download most of the raw datasets from official sources such as intergovernmental organizations and governmental agencies (e.g., fao, usibwc, usgs, usda, conagua, inegi, conabio, conanp, codwr, nmose, tceq) or from joint projects overseen by reputable universities, research institutes, and non-governmental organizations (e.g., from the nasa socioeconomic data and applications center, the north american environmental atlas).
we compiled a few datasets (around %) from private entities (associations, districts) as they were the only source of the desired information. for example, this is the case for the border fences. we downloaded the spatial data from reveal from the center for investigative reporting and openstreetmap contributors (https://github.com/cirlabs/border_fence_map) who traced the fences along the border by using open-source mapping tools and by digitizing detailed pdf maps obtained through a freedom of information act request . the creators of the data set acknowledge slight differences with the official government agency numbers. however, this is the most detailed border fence map publicly available. we also gathered four datasets — that were not downloadable online — by email requests from three organizations. the rgwcd provided the boundaries of the rgwcd and the groundwater management subdistricts # , # , # , # , # , # in colorado. the cscb (colorado state conservation board) and the tsswcb (texas state soil and water conservation board) shared the boundaries of the swcds of colorado and texas, respectively. although these datasets were not gathered from governmental sources, we believe that these datasets are the best available and we rely on the accuracy of data collection of the three organizations. whenever possible, we used global or bi-national input datasets to avoid inconsistencies related to differences in resolution, type of information, or geoprocessing methods among counties and states. around . % of our raw datasets have a global or bi-national spatial extent, and therefore cross international and state boundaries. nevertheless, a significant amount of the raw datasets we gathered were nation-wide ( %) or state-wide ( . %), which could lead to some of the inconsistencies previously mentioned. all information gathered from mexican sources was translated from spanish to english and verified by a bilingual speaker. 
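
differences in units and classes between national sources are one concrete form of the inconsistencies mentioned above. as a minimal sketch, and not the authors' arcpy scripts, the python snippet below illustrates two such cases taken from the data records: the clustering of mexican irrigation systems into the u.s. classes, and a conversion a user might apply to compare water right volumes reported in acre-feet (u.s.) with those reported in cubic meters (mexico). all function and dictionary names are hypothetical, the spelling of the source class labels is assumed, and the fallback to "other" for unlisted systems is an assumption.

```python
from collections import defaultdict

# 1 acre-foot = 43,560 cubic feet = 1233.48183754752 cubic meters
ACRE_FOOT_TO_M3 = 1233.48183754752

# clustering described in the text; label spellings are assumed
MX_TO_COMMON_CLASS = {
    "lined canals": "surface",
    "earthen canals": "surface",
    "sprinkler": "sprinkler",
    "micro-sprinkler": "sprinkler",
    "drip": "micro-irrigation",
}


def acre_feet_to_cubic_meters(acre_feet):
    """convert a volume in acre-feet to cubic meters."""
    return acre_feet * ACRE_FOOT_TO_M3


def common_irrigation_class(mx_class):
    """map a mexican irrigation system label to the common class.

    unlisted labels fall back to "other" (an assumption of this sketch).
    """
    return MX_TO_COMMON_CLASS.get(mx_class.strip().lower(), "other")


def cluster_farm_counts(farms_by_mx_class):
    """sum the number of farms per common irrigation class."""
    totals = defaultdict(int)
    for mx_class, n_farms in farms_by_mx_class.items():
        totals[common_irrigation_class(mx_class)] += n_farms
    return dict(totals)


if __name__ == "__main__":
    # illustrative numbers only, not values from the geodatabase
    print(cluster_farm_counts({"lined canals": 10, "earthen canals": 5, "drip": 2}))
    print(acre_feet_to_cubic_meters(100.0))
```

keeping the mapping in a single dictionary makes the clustering rule explicit and easy to audit against the metadata of each layer.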
regarding the geoprocessing operations, for all datasets, we used several tools and functions available in arcgis, such as project, clip, dissolve, slope, join, and merge, and we rely on the accuracy of this software to create the final outputs. a series of quality checks was carried out to catch as many potential coding errors as possible. attributes were checked by visual inspection. for quality control, transparency, and reproducibility, processing steps are thoroughly detailed in the metadata of each dataset.
usage notes
this spatial database offers a wide range of applications for transboundary and interdisciplinary research. we provide three examples drawing from our own research.
1. a broad range of mapping products can be created for the whole rgb or sub-sections, such as a map of the main irrigation districts and dams in the rgb (fig. ), a map of land managers and protected areas (fig. ), and a comparative map of water use and population on a county/municipio basis (fig. ).
2. the geodatabase can support spatially explicit and integrated modelling for the rgb. we are currently developing a simulation model for the rgb that draws on water and land governance datasets (political jurisdiction boundaries, land management and protected areas, irrigation district outlines), as well as hydrologic and biophysical environment datasets (river network, elevation, water bodies, gauges, land-use), socio-economic datasets (population), and water use and hydraulic infrastructure datasets (dams).
3. the geodatabase can be used to conduct spatiotemporal and geostatistical analyses. examples of interdisciplinary topics include: (i) evaluating whether water use correlates with environmental variables (ecoregions, land use), socio-economic factors (population), or governance (e.g., related to land ownership, irrigation district, or state); and (ii) assessing land-use change at multiple scales (local, state, national).
for example, we are currently analysing the spatial patterns of land fallowing in the basin.
users of the compiled datasets should cite this data paper following the recommended citation format of the journal. furthermore, we encourage users to refer to the sections “credits” and “use limitations” in the metadata of each layer file, as additional data constraints and required credits of the original data sources may apply.
in conclusion, to achieve more sustainable transboundary water management, it is necessary to approach problems from a boundary-spanning and interdisciplinary perspective. by creating a basin-wide, socio-environmental geodatabase for the rgb, we address this issue and support further interdisciplinary research in the basin. the main novelty of our approach was to include geospatial datasets mapping public and private water governance, which contributes to better understanding of scale interactions and mismatches between ecological and social (decision-making) boundaries. moreover, we were able to identify data gaps, especially for the mexican side of the rgb, and therefore the need for further research to reduce data imbalance between the two sides of this transboundary basin. even though our study is limited to the socio-environmental boundaries of the rgb system and the typology of actors is specific to the shared basins between the u.s. and mexico, we consider that our approach to assembling a wide range of data is transferable to other shared river basins to serve transboundary and cross-disciplinary collaboration and research.
code availability
we created the individual output datasets using geoprocessing scripts in the python programming language and the arcpy package.
all geoprocessing scripts are available on github and are accessible through the open science framework (osf) repository . received: august ; accepted: february ; published: xx xx xxxx references . garcia, x. & pargament, d. reusing wastewater to cope with water scarcity: economic, social and environmental considerations for decision-making. resour. conserv. recy. , – ( ). . green, p. a. et al. freshwater ecosystem services supporting humans: pivoting from water crisis to water solutions. glob. environ. chang. , – ( ). . jaeger, w. k. et al. finding water scarcity amid abundance using human–natural system models. proc. nat. acad. sci. usa , – ( ). . wwap (world water assessment programme). managing water under uncertainty and risk. the united nations world water development report . unesco. paris, france ( ). . united nations. the sustainable development goals report . new york, ny ( ). . munia, h. et al. water stress in global transboundary river basins: significance of upstream water use on downstream stress. environ. res. lett. ( ). . unece/unesco. good practices in transboundary water cooperation ( ). . troy, t. j., konar, m., srinivasan, v. & thompson, s. moving sociohydrology forward: a synthesis across studies. hydrol. earth syst. sci. , – ( ). . leslie, h. m. et al. operationalizing the social-ecological systems framework to assess sustainability. proc. nat. acad. sci. usa , – ( ). . hanspach, j. et al. a holistic approach to studying social-ecological systems and its application to southern transylvania. ecol. soc. ( ). . livneh, b. et al. a spatially comprehensive, hydrometeorological data set for mexico, the u.s., and southern canada – . sci. data. , – ( ). . tucker lima, j. m. et al. a social-ecological database to advance research on infrastructure development impacts in the brazilian amazon. sci. data. , – ( ). . unep & oregon state university. atlas of international freshwater agreements. vol. (united nations environmental programme, ). . nava, l. f. 
& sandoval-solis, s. multi-tiered governance of the rio grande/bravo basin: the fragmented water resources management model of the united states and mexico. international journal of water governance. , – ( ). . dean, d. j. & schmidt, j. c. the role of feedback mechanisms in historic channel changes of the lower rio grande in the big bend region. geomorphology. , – ( ). . blythe, t. l. & schmidt, j. c. estimating the natural flow regime of rivers with long-standing development: the northern branch of the rio grande. water resour. res. , – ( ). . wong, c. m., pittock, j. & schelle, p. world’s top rivers at risk. gland, switzerland (wwf international, ). . sandoval-solis, s., teasley, r. l., mckinney, d. c., thomas, g. a. & patiño-gomez, c. collaborative modeling to evaluate water management scenarios in the rio grande basin. j. am. water resour. assoc. , – ( ). . comisión nacional del agua (conagua). statistics on water in mexico, edition. ministry of environment and natural resources. méxico, d.f. ( ). . maupin, m. a. et al. estimated use of water in the united states in . u.s. geological survey circular , https://doi. org/ . /cir , (u.s. geological survey, ). . jun, c., ban, y. & li, s. china: open access to earth land-cover map. nature. , – ( ). . mu, j. e. & ziolkowska, j. r. an integrated approach to project environmental sustainability under future climate variability: an application to u.s. rio grande basin. ecol. indic. , – ( ). . llewellyn, d. & vaddey, s. west-wide climate risk assessment: upper rio grande impact assessment (u.s. bureau of reclamation, ). . wolf, a. t. the transboundary freshwater dispute database project. water int. , – ( ). . ortiz-partida, j. p., sandoval-solis, s. & diaz-gomez, r. assessing the state of water resource knowledge and tools for future planning in the rio grande-rio bravo basin, https://doi.org/ . /c bc d ( ). . patiño-gomez, c., mckinney, d. c. & maidment, d. r. 
sharing water resources data in the binational rio grande/bravo basin. j. water resour. plan. manag. , – ( ). . koch, j., friedman, j. r., paladino, s., plassin, s. & spencer, k. conceptual modeling for improved understanding of the rio grande/ río bravo socio-environmental systems. socio-environmental systems modelling. ( ). . plassin, s. et al. geospatial data for the rio grande/río bravo socio-environmental system. open science framework. https://doi. org/ . /osf.io/ ( ). . dinatale water consultants. rio grande basin implementation plan. alamosa, co (rio grande basin roundtable, ). . u.s. geological survey (usgs). the transboundary aquifer assessment program (taap), https://webapps.usgs.gov/taap/. . new mexico office of the state engineer (nmose), dhsem, edac & fema. acequia mapping project outreach, https://ose.maps. arcgis.com/apps/cascade/index.html?appid=b f edf d a dd c b a d. . corey, m. & becker, a. the wall: building a continuous us-mexico barrier would be a tall order, https://www.revealnews.org/article/ the-wall-building-a-continuous-u-s-mexico-barrier-would-be-a-tall-order/ ( ). . cumming, g., cumming, d. h. m. & redman, c. scale mismatches in social-ecological systems: causes, consequences, and solutions. ecol. soc. ( ). . food and agriculture organization (fao). hydrological basins in central america (derived from hydrosheds), http://www.fao. org/geonetwork/srv/en/main.home ( ). . food and agriculture organization (fao). rivers in central america (derived from hydrosheds), http://www.fao.org/ geonetwork/srv/en/main.home ( ). . u.s. geological survey (usgs). ground water atlas of the united states. aquifers vector digital data, https://catalog.data.gov/ dataset/aquifers ( ). . comisión nacional del agua (conagua). sistema nacional de información del agua (sina), http:// . . . /sina/ ( ). . stewart, d. w., rea, a. & wolock, d. m. usgs streamgages linked to the medium resolution nhd, https://doi.org/ . /ds ( ). . u.s. 
international boundary and water commissions (ibwc). gis portal, https://appportal.ibwc.gov/ibwc_geo/public_portal/ ( ). . comisión nacional del agua (conagua). base datos bandas, ftp://ftp.conagua.gob.mx/bandas/bases_datos_bandas ( ). . dieter, c. a. et al. estimated use of water in the united states county-level data for (ver. . , june ). u.s. geological survey data release, https://doi.org/ . /f tb v ( ). . instituto nacional de estadística y geografía (inegi). censo agrícola, ganadero y forestal , https://www.inegi.org.mx/programas/cagf/ /default.html ( ). . colorado division of water resources (cdwr). dwr water right - net amounts, https://data.colorado.gov/water/dwr-water-right-net-amounts/acsg-f s/data ( ). . new mexico office of the state engineer (nmose). open data site, http://geospatialdata-ose.opendata.arcgis.com/ ( ). . texas commission on environmental quality (tceq).
wrap input files and gis files by river basin, https://www.tceq.texas.gov/ permitting/water_rights/wr_technical-resources/wam.html/#wrapinput ( ). . comisión nacional del agua (conagua). tablero sina: registro público de derechos de agua (repda)/volúmenes inscritos, http://siga.conagua.gob.mx/repda/menu/menukmz.html ( ). . u.s. army of corps engineer (usace). national inventory of dams (nid), http://nid.usace.army.mil/ ( ). . centro nacional de prevención de desastres (cenapred). presas, http://catalogo.datos.gob.mx/dataset/presas ( ). . u.s. geological survey (usgs). national hydrography dataset – medium resolution ( : , ), https://www.usgs.gov/core- science-systems/ngp/national-hydrography ( ). . instituto nacional de estadística y geografía (inegi). conjunto de datos vectoriales de la serie topográfica escala : , , . acueducto y canal, https://www.inegi.org.mx/app/biblioteca/ficha.html?upc= ( ). . colorado water conservation board (cwcb) & division of water resources (dwr). colorados’ decision support systems, https:// www.colorado.gov/pacific/cdss/gis-data-category ( ). . texas water development board (twdb). gis datasets, http://www.twdb.texas.gov/mapping/gisdata.asp ( ). . u.s. census bureau. profile of general population and housing characteristics: , , , https://factfinder.census.gov/ faces/nav/jsf/pages/searchresults.xhtml?refresh=t ( ). . instituto nacional de estadística y geografía (inegi). censos y conteos de población y vivienda. serie histórica censal e intercensal ( – ), https://www.inegi.org.mx/programas/ccpv/cpvsh/ ( ). . center for international earth science information network (ciesin) - columbia university. gridded population of the world, version (gpwv ): population density, revision , https://doi.org/ . /h c vhw ( ). . u.s. bureau of economic analysis. annual personal income by state: – , https://apps.bea.gov/regional/downloadzip.cfm ( ). . comisión nacional para el conocimiento y uso de la biodiversidad (conabio). 
ingreso en méxico por municipio, , , . datos estadísticos del instituto nacional de estadísitca y geografía (inegi), http://www.conabio.gob.mx/informacion/gis/ ( ). . u.s. department of agriculture - national agricultural statistics service (usda-nass). census of agriculture and , https://www.nass.usda.gov/agcensus/index.php ( ). . center for international earth science information network (ciesin) - columbia university & information technology outreach services (itos) - university of georgia. global roads open access data set, version (groadsv ), https://doi.org/ . / h vd wct ( ). . instituto nacional de estadística y geografía (inegi), natural resources canada (nrcan), u.s. geological survey (usgs) & commission for environmental cooperation (cec). north american environmental atlas - railroads, , http://www.cec.org/ tools-and-resources/north-american-environmental-atlas/map-files ( ). . u.s. environmental protection agency (epa). level iii and iv ecoregions of the continental united states, https://www.epa.gov/ eco-research/level-iii-and-iv-ecoregions-continental-united-states ( ). . u.s. environmental protection agency (epa). level iii ecoregions of north america, https://www.epa.gov/eco-research/ ecoregions-north-america ( ). . u.s. fish & wildlife service (usfws). national wetlands inventory. a system for mapping riparian areas in the western united states, https://www.fws.gov/wetlands/other/riparian-product-summary.html ( ). . u.s. fish & wildlife service (usfws). national wetlands inventory - version - surface waters and wetlands inventory, https:// www.fws.gov/wetlands/data/data-download.html ( ). . u.s. fish & wildlife service (usfws). threatened & endangered species active critical habitat report, https://ecos.fws.gov/ecp/ report/table/critical-habitat.html ( ). . comisión nacional de Áreas naturales protegidas (conanp). sitios ramsar, https://datos.gob.mx/busca/dataset/coberturas- para-manejadores-de-sig ( ). . 
comisión nacional para el conocimiento y uso de la biodiversidad (conabio). sitios de atención prioritaria para la conservación de la biodiversidad, http://www.conabio.gob.mx/informacion/gis/ ( ).
comisión nacional para el conocimiento y uso de la biodiversidad (conabio). sitios prioritarios para la restauración, http://www.conabio.gob.mx/informacion/gis/ ( ).
food and agriculture organization (fao). the digital soil map of the world, version . , http://www.fao.org/geonetwork/srv/en/metadata.show?id= ( ).
sanchez, p. a. et al. digital soil map of the world. science. , ( ).
fao/iiasa/isric/iss-cas/jrc. harmonized world soil database (version . ), http://webarchive.iiasa.ac.at/research/luc/products-datasets/global-terrain-slope-download.html ( ).
canada centre for remote sensing (ccrs)/canada centre for mapping and earth observation (ccmeo) natural resources canada (nrcan), comisión nacional para el conocimiento y uso de la biodiversidad (conabio), comisión nacional forestal (conafor), instituto nacional de estadística y geografía (inegi) & u.s. geological survey (usgs). land cover of north america at meters. edition . , http://www.cec.org/tools-and-resources/north-american-environmental-atlas/map-files ( ).
u.s. department of agriculture - national agricultural statistics service (usda-nass). – cropland data layer, https://nassgeodata.gmu.edu/cropscape/ ( ).
instituto nacional de estadística y geografía (inegi). conjunto de datos vectoriales de uso del suelo y vegetación , escala : , , serie v (capa unión), https://www.inegi.org.mx/temas/usosuelo/ ( ).
instituto nacional de estadística y geografía (inegi). conjunto de datos vectoriales de uso del suelo y vegetación , escala : , , serie vi (capa unión), https://www.inegi.org.mx/temas/usosuelo/ ( ).
u.s. census bureau. tiger/line® shapefiles, https://www.census.gov/ ( ).
instituto nacional de estadística y geografía (inegi). marco geoestadístico, http://www.beta.inegi.org.mx/app/biblioteca/ficha.html?upc= ( ).
instituto nacional de estadística y geografía (inegi), natural resources canada (nrcan), u.s. geological survey (usgs) & commission for environmental cooperation (cec). north american environmental atlas - populated places, , http://www.cec.org/tools-and-resources/north-american-environmental-atlas/map-files ( ).
texas commission on environmental quality (tceq). gis data, https://www.tceq.texas.gov/gis/download-tceq-gis-data ( ).
u.s. geological survey & u.s. department of agriculture natural resources conservation service. federal standards and procedures for the national watershed boundary dataset (wbd) ( ed.). techniques and methods –a (u.s. geological survey, ).
comisión nacional del agua (conagua). regiones hidrológicas administrativas. (organismos de cuenca). shapefile. escala: : , http://www.conabio.gob.mx/informacion/gis/ ( ).
rio grande water conservation district (rgwcd). well information, https://www.rgwcd.org/well-information.
teeple, a. p. geophysics- and geochemistry-based assessment of the geochemical characteristics and groundwater-flow system of the u.s. part of the mesilla basin/conejos-médanos aquifer system in doña ana county, new mexico, and el paso county, texas, – . scientific investigations report. reston, va, https://doi.org/ . /sir , (u. s. geological survey, ).
driscoll, j. m. & sherson, l. r. variability of surface-water quantity and quality and shallow groundwater levels and quality within the rio grande project area, new mexico and texas, – . scientific investigations report. reston, va, https://doi.org/ . /sir , (u. s. geological survey, ).
bureau of land management (blm). national surface management agency area polygons - national geospatial data asset, https://landscape.blm.gov/geoportal/catalog/main/portal.page ( ).
registro agrario nacional (ran). perimetrales de los núcleos agrarios certificados, http://datos.ran.gob.mx/conjuntodatospublico.php ( ).
commission for environmental cooperation (cec), comisión nacional de Áreas naturales protegidas (conanp), conservation areas reporting and tracking system (carts), ministère du developpement durable et de la lutte contre le changement climatique (quebec-mddelcc) & u.s. geological survey (usgs). protected areas of north america, , http://www.cec.org/tools-and-resources/north-american-environmental-atlas/map-files ( ).
homeland infrastructure foundation-level data (hifld).
canada and mexico border crossings, https://hifld-geoplatform.opendata.arcgis.com/datasets/canada-and-mexico-border-crossings ( ).
homeland security infrastructure program (hsip). freedom office of border patrol sectors, https://www.arcgis.com/home/item.html?id=e c f b b f e b bf a ( ).
u.s. customs and border protection (uscpb). southwest border migration apprehension statistics. fiscal years and , https://www.cbp.gov/newsroom/stats/usbp-sw-border-apprehensions ( ).
u.s. fish & wildlife service (usfws). lcc network areas, https://www.sciencebase.gov/catalog/item/ b ade b a b b d ( ).
u.s. fish & wildlife service (usfws). north american joint ventures, https://ecos.fws.gov/servcat/reference/profile/ ( ).

acknowledgements
the project was funded by grant no. g ap from the united states geological survey. its contents are solely the responsibility of the authors and do not necessarily represent the views of the south central climate adaptation science center or the usgs. this manuscript is submitted for publication with the understanding that the united states government is authorized to reproduce and distribute reprints for governmental purposes. we are grateful to all the institutions that helped us find the input datasets, shared them, and allowed us to reproduce them. we thank the digital scholarships lab at the university of oklahoma libraries (sarah pugachev, tara carlisle and jeff widener) for advice on the database development and publication. we would also like to thank the anonymous reviewers for their thoughtful comments, which helped us to improve our manuscript.

author contributions
j.k. and j.r.f. designed the study. s.pa. and j.r.f. conducted the ethnographic research and proposed the qualitative typology of actors. all co-authors contributed to the identification of datasets. s.pl. collected and processed the input datasets, generated the spatial database and created the outputs with contributions from j.k., k.s. and k.v. s.pa.
verified and edited spanish-english translation. s.pl. led the manuscript writing and all authors reviewed and edited the manuscript.

competing interests
the authors declare no competing interests.

additional information
supplementary information is available for this paper at https://doi.org/ . /s - - - .
correspondence and requests for materials should be addressed to j.k.
reprints and permissions information is available at www.nature.com/reprints.
publisher’s note springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

open access this article is licensed under a creative commons attribution . international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the images or other third party material in this article are included in the article’s creative commons license, unless indicated otherwise in a credit line to the material. if material is not included in the article’s creative commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this license, visit http://creativecommons.org/licenses/by/ . /.
the creative commons public domain dedication waiver http://creativecommons.org/publicdomain/zero/ . / applies to the metadata files associated with this article.
© the author(s)

fig. schematic overview of the workflow applied to produce the socio-environmental geodatabase for the rio grande/río bravo basin.
fig. map of the main dams and irrigation districts in the rio grande/río bravo basin.
fig. map of the rio grande/río bravo basin depicting land ownership, protected areas, and major rivers, generated from the geospatial datasets of the geodatabase.
fig. map comparing the total amount of water withdrawals (millions of cubic meters per year) in with the population on a county/municipio basis.
table list of geospatial layers for the hydrology category including their description, format and input data sources.
table list of geospatial layers for the water use & hydraulic infrastructures category including their description, format and input data sources.
table list of provided geospatial layers for the socio-economics category including their description, format and input data sources.
table list of provided geospatial layers for the biophysical environment category including their description, format and input data sources.

for what it’s worth – the open peer review landscape
this is a repository copy of for what it’s worth – the open peer review landscape. white rose research online url for this paper: http://eprints.whiterose.ac.uk/ / version: accepted version
article: tattersall, a. ( ) for what it’s worth – the open peer review landscape. online information review, ( ). - . issn - https://doi.org/ . 
/oir- - -
eprints@whiterose.ac.uk https://eprints.whiterose.ac.uk/

for what it’s worth - the open peer review landscape
andy tattersall
scharr, university of sheffield
a.tattersall@sheffield.ac.uk

structured abstract
purpose - the aim of this paper is twofold: firstly, to discuss the current and future issues around pre and post publication open peer review; secondly, to review some of the main protagonists and platforms that encourage open peer review, pre and post publication.
approach - the first part of the paper discusses the facilitators and barriers that enable and prevent academics engaging with the new and established platforms of scholarly communication and review. these issues are covered with the intention of proposing further dialogue within the academic community that ultimately addresses researchers' concerns, whilst continuing to nurture a progressive approach to scholarly communication and review. the paper then looks at the prominent open peer review platforms and tools and discusses whether open peer review can become a standard model in the future.
findings – the paper identifies several problems, not exclusive to open peer review, that could inhibit academics from being open with their reviews and comments on others’ research, whilst highlighting the opportunities to be had by embracing a new era of academic openness.
practical implications – the paper summarises key platforms and arguments for open peer review and will be of interest to researchers in different disciplines as well as the wider academic community wanting to know more about scholarly communications and measurement.
keywords - open peer review; open access; peer review; post publication peer review; altmetrics

introduction
the purpose of this paper is to discuss the development of open peer review and the various platforms that are exploring and championing this method of measuring and communicating research. open peer review is not a new idea and was trialled in the late s by (smith ) and (godlee et al. ) in the high profile journals bmj and jama. other notable publications followed, including nature in . peer review in all its guises is a practice within academia that pre-dates the invention of the scholarly journal, originating with the formation of the national academies in the th century (fitzpatrick ). this paper explores the different approaches taken by ten of the leading academic open peer review platforms. other social tools, including non-academic ones such as twitter, are potential platforms for open peer review, but for the purpose of this paper we will explore the formal academic examples. peer review of research, once on the surface a simple, mostly blinded pre-publication model, is now exploring new methods of assessing research quality. some of these methods revolve around openness and discourse and go by several names, such as post publication review and open peer review.
a literature review of open peer review by (ford, ) found there was no established definition of the term accepted by the scholarly research and publishing community. however, (ford, ) found several common open peer review characteristics that describe the openness of the review process: signed review, disclosed review, editor-mediated review, transparent review, and crowdsourced review. three additional characteristics describe review timing, similar to traditional peer review: prepublication review, synchronous review, and post-publication review. this paper will focus on this evolving method of open peer review, whether that be pre or post publication, with much of the focus on the various platforms that are now exploring open peer review and comment.

there are several models of peer review currently in practice, with the standard models being single or double blind. single blind is usually where the author’s identity is revealed to reviewers, and double-blind is where all identities are kept hidden. blind peer review is very much focused around pre-publication of research and acts as quality control. this, however, is not always successful due to various issues including bias (lee et al. ) as well as plagiarism, self-citation, conflicts of interest and the holding back of competing research. in addition there are fictitious pieces of research (baxt et al., ), which can on occasion be set up to sting peer reviewers and editors. finally, there is the simple factor of bad research reviewed by bad or inappropriate reviewers. some of these problems could be negated by open peer review, which allows authors and reviewers to be aware of each other’s identities. open peer review can take place pre and post publication, with the idea being that it creates a long tail of communication and knowledge exchange between researchers. there may be variants on this model, as reviewers and reviewees may be given the option to supply their name and details to each other.
for some platforms there has to be agreement before the review begins that it is a fully transparent process, warts and all. platforms like plos one make their pre-publication review process open as an option, whilst post publication review is open. platforms like peerage of science take the approach of encouraging open peer review rather than enforcing it.

as (smith ) points out, people have a great many fantasies about peer review, and one of the most powerful is that it is a highly objective, reliable and consistent process. yet in reality many are discontent with the model of traditional blind peer review, thinking of it only in negative terms: lacking in rewards, slow in return, inconsistent, and occasionally open to fraud and bad behaviour. a systematic review conducted by (jefferson et al. ) found that the practice of peer review is based on faith in its effects, rather than facts. peer review does not have the best reputation within some areas of academia and is treated with a degree of distrust. equally, open peer review could create new feelings of discomfort if not handled properly. the feeling of discomfort is partly due to a change in the academic landscape that is happening on a variety of different fronts. moocs, the growing impact agenda, social media, open access, altmetrics and big data are affecting how universities and research centres operate. all of these changes have leanings towards openness and may be seen as a threat by those academics who prefer to carry out their work in isolation and with a degree of anonymity.

open post-publication review is nothing new, as the bmj and other research publications have accepted letters, email communications and rapid responses about research published in their journals for some time. tools such as blogging and social media have more recently created platforms for researchers to discuss others’ work.
more recently, websites such as the conversation have enabled academics to publish their ideas, thoughts and research to wider audiences, all of whom can pass comment on their work directly. other discussions take place very publicly on news sites and forums, or in member-only forums. it would not take long for many researchers to find some mention of their work on the web that goes beyond citations.

traditional peer review
despite its key, idealised role in the history of scholarship, peer review has at times been subject to criticism (sullivan ), whilst the traditional academic publishing model has been criticised for lagging behind the rest of the modern publishing industry. much of this criticism is fair, as a piece of research which can take over a year to complete can then take even longer to be published. after such time, work in that area may have moved on; new methods, technologies and ideas may have surfaced. open peer review could help highlight these problems and may make researchers aware of potential future collaborators or similar research already being undertaken. we have to weigh the benefits of open peer review, which can encourage collaboration, wider input and knowledge transfer, against the negative costs. these include bias, trolling, abusive behaviour and misinformation. whilst these problems may seem like barriers, it would be foolish to think that they do not already exist within academia, or that they are confined to the office space and traditional peer review. for peer review to be truly beneficial it has to be open. websites such as youtube have allowed aliases, and therefore trolls, to flourish due to the anonymity they afford. whilst trolling may not seem like the behaviour of an intelligent, logical person, such as an academic, it can be (klempka & stimson ).

citations and peer review
not every piece of research published commands an open peer post-publication review, as not every piece of research gets cited.
figures bandied around the web on the percentage of papers that never get cited range wildly from about % in medicine (larivière & gingras ) to an unsubstantiated claim of % for the humanities within the first years of publication. whilst figures of up to % have been shared across academic websites and blogs with no evidence, it seems impossible to get a true figure. nevertheless, it should follow that a large number of research papers will never get commented on in open post publication review platforms. whilst we have to remember that some areas of research are less reliant on the journal publishing model, it does not mean open peer review is not beneficial to their advancement; it could provide new untapped opportunities. where some papers were never destined to get cited, they may receive post publication comments and potentially useful insight. academic debate using the many social and open peer review tools freely available has so far been embraced by a small number of academics. papers are frequently shared using tools such as twitter, google+ and linkedin rather than being discussed on these platforms, but that is reflective of social media content on the web in general. it is far easier and less time-consuming to share content on the web than to review it. proper reviewing takes time and requires more considered thought than most other content shared on the web, such as music and film, which is often more subjective.

the evolving publication
in the term web 2.0 started to be popularised around the internet. originally coined by (dinucci ) some five years earlier, it was popularised by tim o’reilly of o’reilly media, who had also pushed the term ‘open source’ in . web 2.0 was the point where the web could be manipulated by wider audiences without the need for web authoring and publishing skills such as html coding. this new era opened up the possibility for anyone to publish, catalogue, communicate, share and network on the web, including academia.
in the decade or so since, little has changed in academia: research is often conducted in private, and findings are published in journals and presented at conferences. many research papers, such as those in health research, conclude with the suggestion that more research is needed (phillips ). therefore the idea of supplanting some of this published research with open post review, new data and insights, rather than wholly new inconclusive publications, could prove beneficial. open peer review could not only add new insights to existing research but also form new collaborations. systematic reviews could be periodically updated and enhanced, with previous versions publicly available. this would not work for all research but could serve as a way to bring experts together to solve problems, forge networks and create a knowledge transfer economy.

a fear of openness
many researchers can feel uncomfortable speaking about others’ research (smith ). for those who put their heads over the open peer review parapet there is the fear that they will be humiliated by their peers or more senior colleagues. this is especially so of junior researchers and their senior peers (walsh et al. ). this is understandable, as bullying and intimidation happen face to face within universities, so they should inevitably extend to the web. the open public face of the web does not deter people from behaving horribly to others. given that the established peer review system has for years shielded reviewers from a right to reply for their comments, it should follow that going public may require some to change their tack. it is no different to how some lecturers feel uncomfortable giving feedback to their students, especially when that feedback is negative. no one likes to receive bad news, just as no one likes to deliver it, certainly face to face.
it is understandable that no researcher wants to hear negative comments about their own hard work, especially via open peer review when delivered on a public platform. throughout the history of research, whether as we know it now or going back hundreds of years, there has always been some element of fear. this includes the fear of failure, of not receiving acceptance, of being wrong or too radical. add the often mentioned ‘publish or perish’ and academia can be an intimidating arena. imagine, for example, one of history’s scientific greats such as galileo and consider if he lived today. how would open peer review respond to his support of the idea that the earth revolved around the sun and not the other way round, as was commonly believed? four hundred years ago he was opposed by astronomers and the catholic church for supporting such revolutionary ideas. could open peer review lead to the public humiliation and quashing of such incredible minds and their ideas? yet if we think about the period of history in which galileo and his peers lived, the fear was of imprisonment, punishment and even death, far worse than comments posted on the web. as the nursery rhyme goes, ‘sticks and stones will break my bones, but words will never harm me’. nevertheless, we know this to be untrue for some that have gone public with their ideas on the web.

at present the decision lies with the individual academic, as there will always be those maverick enough to voice their ideas publicly and take the flak, and those who are not. researchers may not feel the desire to review a peer’s work publicly, but they may struggle to resist reading what others have said about their own work. whilst most researchers may not be aware of this shift towards open peer review in their own fields of work, they may be aware of it in others that extend beyond academia. this existing world of review and comment is where much of the anxiety is likely to come from.
popular websites such as youtube are full of hostile and negative comments, twitter is renowned for trolling behaviour, and comments on stories published in the media will be enough to concern academics. these comments can become personal, malicious and toxic. that is not to say open peer review will stoop to these lows; it is, however, still possible. the problem for some academics is that the more vociferous and aggressive among them could draw others into public arguments that serve no one well. this already happens between researchers and the public over controversial research, so it should follow that it can happen between peers.

the thorny issue we have now is that everyone has an opinion and can find a public platform on the web on which to voice it. the web has facilitated an opinion culture to the point where ‘trolling’ is now an acknowledged and serious problem. academics are more than capable of making barbed comments, but making unjustified ones online will help no one, especially in the advancement of knowledge via discourse. despite this potential for negativity, it is important that this issue is discussed openly and maturely, as the pattern is quite clear: research in whatever format is becoming more open, not less so. this is through public engagement, social and traditional media, open access, blogging and news curation sites. the modern web that was predicted in the last century (dinucci ) is now one where anyone can publish and comment on any platform from just about anywhere. newspapers, youtube and twitter are just three platforms that are rife with a multitude of polarised, mischievous, hostile and un-evidenced comments. despite these issues, there is a real need for academics to be mature about open peer review and embrace the openness of the web, which in the longer term has real potential to carry their research much further afield than ever before. the aforementioned platforms, which are often poorly moderated or totally un-moderated, exist in an open environment.
there are no requirements to be an expert, whilst the communities that populate them can be diverse, with polarised agendas. that is not to say academics are any different, yet their professional communities are more homogenised, many know each other personally, and professional reputation plays an important role.

there are two problems with any kind of commenting and reviewing model (open or blind). the first is that you do not always know who you are talking to. for all someone knows, they may as well be talking to a dog over the web (adrian, ). you may know their name, and you may have information about where they work, what they do for a living and what they do in their personal time. however, if you have never met them in person it can be hard to gauge the tone in which they are communicating with you. this is true of any kind of textual communication and can lead to misunderstandings. the second problem is that of online personality changes: (joinson ) showed that individuals can behave differently when engaging with others over the web, whilst at the other end of the spectrum one study (buckels et al. ) highlighted cyber-trolling as an internet manifestation of sadism. those capable of behaving badly in person will continue to do so over the internet.

open peer review is much the same as social media and how it has been applied within universities: the reality is that some academics will get it horribly wrong. there have been various incidents of university teaching staff being suspended or having contracts cancelled as a result of demeaning comments posted openly online about colleagues or students. the key for academics who are confronted by abusive or inappropriate open peer review is either to make the following communications private or to report it. the worst thing to do is get drawn into a public argument online, as rarely does either side come out looking good.
there is a clear difference between reviewing, discussing and commenting, something kent anderson touched on in the scholarly kitchen. (anderson ) summarised that today’s commentators seem to have many axes to grind. far too often, commentary forums degrade into polemical attacks with win or lose dynamics at their heart. the pursuit of knowledge and science isn’t the goal. capitulation of one combatant to another is. anderson questioned the validity of comments being championed by publications and websites, arguing that they could never be considered in the same light as peer review.

not everyone gets it right first time

open peer review is unlikely to iron out the imperfections of blind peer reviewed research, as this can never happen. there are various factors that will always work against a utopian publishing model. predatory journals will publish poor quality research, and authors will attempt to hoodwink reviewers and editors with previously published, fictional and poor quality work. (rowland ) adds the issues of salami publishing (producing too many articles out of one piece of research) or duplicate publication, and also the omission or down-grading of junior staff by senior authors who effectively steal their subordinates’ work. reviewers can also misinterpret research or get things wrong; this is not exclusive to open peer review, but it is potentially more humiliating for the reviewer and reviewee than under the blind model. they may misunderstand the research findings or have inadequate knowledge of the topic they are reviewing (rowland ), whilst some may more purposefully steal authors’ unpublished work or deliberately delay competing work (rennie ). by opening up peer review to a totally transparent process complete with timelines, the opportunities for gamekeepers to go poaching will diminish; that is, those involved in editorial and blind peer review roles will have less opportunity to steal.
this could add a higher level of responsibility and accountability that extends not just to author and reviewer but also to commentators and post publication reviewers (rennie ). research carried out by the bmj (rooyen et al. ) into the effects of informing reviewers that reviews might be posted publicly on the web found no discernible effects on the review process. the study did, however, identify potential negative implications: open peer review could reduce the number of willing reviewers and increase the time taken to review. naturally a researcher’s time is very precious and there are increasing pressures on them to expand into other avenues of work such as impact evidence. nevertheless, improving peer review can only benefit the quality of research. open peer review post publication could give authors a right to reply once their work is public. this may be little compensation once negative comments have been left on a public platform and then shared across the web. a system of moderation can help, as can the option to remove inaccurate, biased and malicious comments posted by those not involved in the formal peer review. anyone leaving comments about a piece of research, or responding to them, must then think carefully before they hit the send button. the web is a very public place, so once a comment is posted it could be days, weeks or even years before it is removed or corrected.

a review of the platforms

as open peer review, pre or post publication, continues to gain traction, the number of platforms aligned to research publications, individuals or groups will undoubtedly continue to grow in line with other research technologies. the web is very good at reacting to supply and demand, with more academics and aligned professionals employing online technologies in greater numbers.
the platforms below are not an exhaustive list but do account for a lot of the current activity and discussion around open peer review, whether pre or post publication. the websites reviewed were chosen as the leading established platforms in the area of open peer review and its variants. many of them are firmly established not just as review platforms, but as databases, social networks and journal publishers in their own right.

f research

faculty of combines different strands, all committed to publishing research and communicating its findings. firstly there is f prime, which is a personalised recommendation system for biomedical research articles from f . like plos one, f research is an open science journal that tries to speed up publishing turnaround times with a transparent referee model. the final component is f posters, which is an open repository for conference posters and slide presentations. f research’s approach to peer review is totally open: referee comments are published along with subsequent replies by the authors. the commenting system is no different from what you would see in a newspaper or blog post, where reader comments are replied to on an individual basis by the original authors. as with blind peer review, articles are ‘approved’ at once, ‘approved with reservations’ or ‘not approved’. this absolutely open approach ensures that not only the author’s research is revealed to the wider world but also the competencies of the reviewer. each comment is date stamped and allows for a right to reply by the authors. visitors to f research can track the conversation and even discuss the article at the foot of the page. this gives a good snapshot of the research publishing timeline, with the entire process of paper, review and discussion taking place on one webpage. even referees’ reports can be cited in f research and are published under a creative commons by attribution license.
a doi (digital object identifier) is assigned to every referee report, so it can be cited independently from the article.

http://f research.com/

open review

open review is part of the popular academic social network researchgate. open review gives researchers the ability to publish an open and transparent review of any paper they have read, worked with, or cited. researchgate’s approach is to try to look at the evaluation of research in a different way and ask whether the research is reproducible. registered users select an article that is listed on researchgate and then go through a simple review process. this process asks simple yes and no questions relating to the research’s methodology, analyses, references, findings and conclusions. supporting resources can be attached and the reviewer can leave free text statements supporting each field. the completed review can be seen with each aspect scored, and over time further reviews would be collated alongside it. reviewers can add the names of other colleagues involved in the review process but they must consent to their inclusion.

http://www.researchgate.net/publicliterature.openreviewinfo.html

peer j

peer j is an open access peer-reviewed scientific journal with a focus on publishing research in the biological and medical sciences. it received substantial financial backing of $ , from o'reilly media whose founder tim o’reilly famously popularised the term web . . it is also part of the same publishing company co-founded by publisher peter binfield (formerly at plos one) and ceo jason hoyt (formerly at mendeley), who bring with them a lot of experience in scholarly communications. peer j operates a points system for authors and commentators as their incentive to publish and comment on research. a reviewer can gain anything from points for being an editor or an author on a peerj article to just one point for receiving an ‘up vote’ for a reply to a question or comment.
the website hosts tables showing the top authors and reviewers, which can be filtered by topic area, publication date and those who have asked the most questions and given the most answers. the questions and answers aspect is different to the commenting approach seen on other platforms. it does potentially open up further dialogue between authors and commentators, although as with other similar platforms there is not a lot of activity at present in this area. as for the points ranking system, it will appeal to some researchers, especially those with a competitive edge, but on the flipside it may make others equally uncomfortable. academics like to see their research measured in different ways, some captured qualitatively, others quantitatively, and some both. it is an interesting approach towards existing peer review models that is certain to split the academic community.

https://peerj.com/

peerage of science

peerage of science is not explicitly an open peer review platform but does give authors who submit content the option for reviewers to see their details. the website does encourage authors to remain anonymous but it is not compulsory. the main purpose of this platform is to offer authors an opportunity to have their manuscripts reviewed by qualified, non-affiliated peers. there are some merits for authors in submitting their manuscripts to such a site, especially researchers who do not have contact with peers in the field they are submitting to. for reviewers, the purpose of peerage of science is to build their reputation as a reviewer of research. it operates like an agency that matches reviewers with manuscripts.
the concern with such a model is that reviewers risk building a reputation based on the quantity of reviews, not the quality. that said, given the problems some authors have in getting opinions on their work, the benefits could outweigh the risks.

https://www.peerageofscience.org/

plos one

plos one is an open access, mostly traditional peer-reviewed scientific journal published by the public library of science. whilst pre-publication submissions usually undergo a blinded review, unless a reviewer indicates otherwise, the open peer review happens post publication. registered plos one members are able to comment on and discuss published research. it is the world’s largest journal based on the number of papers it publishes and has given itself a mandate to make research easier to reach and discuss by speeding up the publication process whilst ensuring authors retain copyright. plos one allows users to comment on research it publishes in the same way newspapers and blogs allow visitors to comment on their news articles. registered users are required to comment with the purpose of adding to the research or making clarifications. this involves identifying and linking to materials that will lead to threaded discussions about the content of the published research. there is no limit as to what a commenter can post; it can be as simple or in-depth as they wish. they may want to focus on just part of the research, such as the results, the methodology or the conclusion, and write just a few lines, whilst others may take more time to write a longer, in-depth review of the whole paper. anyone wishing to comment on papers in plos one must be a registered user and identify any competing interests. the rules are quite simple and say that anyone commenting on someone else’s research must not post content as stated below:

1. remarks that could be interpreted as allegations of misconduct
2. unsupported assertions or statements
3. inflammatory or insulting language

those who break these rules are removed and their account disabled, although this does not prevent them from creating new accounts. this is not a problem exclusive to open peer review websites.

http://www.plosone.org/

pubmed commons

pubmed is a huge publicly accessible search engine that accesses the medline database of references and abstracts on life sciences and biomedical topics. pubmed commons was launched as a platform to enable eligible authors to post comments on research accessible via pubmed. eligibility is based on being an author of a publication in the database, which should prevent just anyone from going onto the site and leaving spurious or mischievous comments. emails of eligible authors are collected from the national institutes of health, the wellcome trust and authors’ emails within pubmed and pubmed central. in addition, authors can ask a colleague who is already on the system to send them an invite. pubmed commons has tighter controls than plos one and other such sites. anyone wishing to leave comments must use their real name and disclose any conflicts of interest.

http://www.ncbi.nlm.nih.gov/pubmedcommons/help/guidelines/
http://www.ncbi.nlm.nih.gov/pubmedcommons/

publons

publons applies a different approach to open peer review by switching the focus more towards the reviewer. the primary aim of publons is to highlight and aid researchers and their reviewing activity. peer review is often regarded as a necessary chore for academics that is rarely acknowledged as part of their public profile and kudos. publons aims to give credit for their peer review work. working for peer-reviewed journals has often been seen by researchers as adding to their growing workloads for the benefit of others: financial benefit for the publisher and recognition for the authors.
reviewing is very much part of the academic’s profile building exercise, but given the hidden element of this role it is not always as easily quantifiable as that of editor or author when applying for jobs or promotion. peer reviewing may bring rewards for the researcher’s cv and promotion prospects, and reviewers also get to see emerging research, but claiming those rewards is much harder given the existing anonymous culture. however, it is no less part of the system that is the research publishing cycle. publons’ aim is to work with reviewers, publishers, universities, and funding agencies to turn peer review into a measurable research output. this is done by collecting peer review information from reviewers and publishers, and using the data to create reviewer profiles. this information is verified by publishers so that researchers can add these contributions to their cv. reviewers can control how each review is displayed on their profile, whether that be blind, open, or published. reviewers can add both pre-publication reviews they do for journals and post-publication reviews of any article.

https://publons.com/

pubpeer

pubpeer is an online journal club, one that allows users to search for papers via dois, pmids, arxiv ids, keywords and authors amongst other options. its purpose is to create an online community of academics that comments on and discusses published research results. researchers can comment on almost any scientific article published with a doi or as a preprint in the arxiv. they can also browse an extensive list of journals with comments, although the majority of titles only have one or two comments at present. unlike some of the other tools mentioned, pubpeer also allows for anonymous commenting, which can be accomplished without the user signing up. these comments are moderated first, and how quickly an anonymous comment is posted depends on the number of items in the queue for moderation.
this model, as with any kind of anonymous commenting, is always susceptible to trolling and abusive behaviour as reviewers feel an extra level of protection from what they say. one researcher filed a lawsuit over anonymous comments which they claim caused them to lose a job offer after accusations of misconduct in their research.

https://pubpeer.com/topics/ / f ff a fb e caad #fb
https://pubpeer.com/

scienceopen

scienceopen is an independent publishing platform that operates an open peer review system with full transparency of reviewers and comments. scienceopen makes its referee reports available under a creative commons by attribution licence and is part publishing platform, part social network. as with other platforms, this allows reviewers to build a public collection of reviews with the aim of showcasing researchers not just as authors but as critical reviewers. once users register for an account it can be automatically synchronised with their orcid profile.

https://www.scienceopen.com/home

the winnower

the winnower is another platform committed to open research that extends to a long tail of discovery and dialogue. the winnower states that: “the winnower is founded on the principle that all ideas should be openly discussed, debated and archived.” one of the smaller platforms for open peer review, the winnower is very much in the mould of so many of the new academic start-ups in that it began life thanks to a phd student. a small platform may seem less appealing to researchers wishing to review others’ work, especially compared to large established sites such as plos one and pubmed. this can matter when trying to attract a larger audience, but it is useful to remember that from acorns oak trees grow.
take for example mendeley, which started life thanks to three early career researchers and was reportedly acquired by elsevier for $ m in . the winnower provides an interesting angle that extends beyond successfully published research to that which was retracted, with its own ‘grain’ and ‘chaff’ page. the 'grain' features publications with more than citations or an altmetric score above , whilst the 'chaff' includes papers that were pulled from publication, offering authors an opportunity to talk about their research rather than just providing a ‘name and shame’ list. despite being a fledgling academic start-up and only having a handful of reviews, the winnower does provide another take on the open peer review landscape.

https://thewinnower.com/

platform | open pre or post publication review/comment | level of openness | owner | year established | key audience | other services | creative commons licence
f research | post | open | faculty of | as faculty of biology (now f prime) | life sciences | f prime, f posters, f specialists, f journal clubs | na
open review | post | open | researchgate | as researchgate open review | non-specific | researchgate | na
peer j | pre and post | open review encouraged | jason hoyt, pete binfield | | biology, medicine | peerj computer science, peerj preprints | cc-by- .
peerage of science | pre | open - onymous | janne kotiaho, mikko mökkönen, janne-tuomas seppänen | | science | | na
plos one | pre and post | optional for pre-publication. open for post comment | the public library of science | | medicine, science | | cc by .
pubmed commons | post | open | u.s. national library of medicine | | biomedicine | pubmed | cc by .
publons | pre and post | optional | andrew preston, daniel johnston | | non-specific | | cc by .
pubpeer | post | optional | na | | non-specific | | na
scienceopen | pre and post | open | scienceopen | | non-specific | scienceopen research, scienceopen posters | cc by .
the winnower | pre and post | open | josh nicholson | | non-specific | | cc by .

table . comparison of pre and post publication open review and comment platforms

the previous table indicates some of the issues that the development of open peer review must address: there are many facets and many different opinions on how best to improve scholarly measurement and communication via peer review, open or otherwise. not only do we have the option of pre and post open peer review, but also the ability for researchers and reviewers to agree the level of openness. some of the platforms are backed by larger entities such as researchgate and the u.s. national library of medicine, whilst others are start-ups involving just a handful of individuals. this list is not exhaustive and more platforms will appear, and with them potentially more iterations of open peer review. the websites in table feature peer review, commenting, discussion and points systems in addition to questions and answers. some options will be more popular than others depending on the researcher’s personal beliefs, as well as peer and subject area influence. researchers and reviewers will have their own agendas and biases as to why they would open up the peer review process. some may feel they have nothing to hide, whilst others may feel hard done to by blind peer review. choosing the correct and most rewarding platform will cause some researchers and reviewers concern and confusion.

other notable mentions

it is worth mentioning the various platforms that also explore and have explored the pre and post open peer review landscape. papercritic worked in conjunction with mendeley to monitor papers in your reference collection and via your mendeley contacts list. it ceased posting updates on its various social media platforms in early , which is never a good sign. chapter swap aimed to democratise the peer review system by offering authors a grassroots approach to peer review and inviting authors to swap draft copies of their work for review.
chapter swap’s target is postgraduates and postdocs working within the disciplines of the arts and humanities. again like papercritic, its previously active twitter feed ceased in , indicating that the service was no longer active. libre is an open peer review platform hosted by open scholar c.i.c that operates solely within the academic community. it aims to put authors in charge of the reviewing process, which is open and published under a creative commons licence. at the time of writing this paper libre was still in a testing phase and users could sign up in readiness for the first stable release. science open review (not to be confused with the aforementioned scienceopen), or scior, is based at queens university in canada. its remit is to connect authors with reviewers in author-led non-blind peer review. again, as with some of the previous tools, there appears to be little activity in the year prior to publishing this paper. finally the journal of visualized experiments (jove) is the leading online video journal with a remit to aid the replication of published research. the pre-publication model is anonymous, as is part of the post publication comment. its inclusion is based on it allowing commentators to leave comments that include their first name and the initial of their surname, possibly enough for recognition.

a mixed model approach

at present we still have an entrenched system for how we measure and understand the scope of research through citations, indexes and impact scores. add to that altmetrics, snowball metrics and similar systems of measurement and communication through open and blind pre and post publication review and we should be in a good place to sort out the wheat from the chaff. a novel idea that has been suggested is that of giving contributors digital badges for their role in a piece of research (cantor & gero ) to create an r-index scale of reviewer recognition.
a formalised approach to this is certainly more attractive to most than independent review sites such as rate my professor, whilst shit my reviewers say operates as a way for researchers to share reviewer comments and “aims to collect the finest real specimens of reviewer comments since .” as with open access, the purpose is to remove access barriers, not quality filters (suber ). (ford ) notes that whilst open access and open peer review go hand in hand, open peer review does not need to occur only in open access journals. open peer review could remove some of the barriers discussed earlier in this paper and improve filters. the evolving publication model could in time encourage more academics to discuss their work more openly on the web, rather than operate in silos. this may take place via other public discussion forums but also through blogging and social media platforms such as twitter, researchgate and piirus, to name but a few. digital scholarship can only have meaning if it marks a radical break in scholarship practices brought about through the possibilities enabled in new technologies. this break could encompass a more open form of scholarship (pearce et al. ). the benefits of this new openness are also highlighted by david goldstein, director of the duke university centre for human genome variation (mandavilli ). open review and commenting on published research can help identify incorrect findings. goldstein (mandavilli ) said: “when some of these things sit around in scientific literature for a long time, they can do damage: they can influence what people work on, they can influence whole fields.”

conclusion

at present most of the post publication open peer review platforms have just a few comments for some research articles, and the majority of articles have none. despite the sheer amount of published research, navigating and responding to these comments is still quite manageable.
as more researchers begin to comment on and review work openly, it opens up more conversation and communication, which could lead to a cacophony of noise if not properly moderated. (shirky ) suggests that our problem is not one of information overload, it’s filter failure. whether or not that is the case for some, there still appears to be a genuine problem of information and communication overload for many, academics included. responding and leaving comments can be another potentially disruptive interruption to their workflow as comments go back and forth between various parties. as with social media and discussion forums there is the temptation to continually peek back to see if anyone has responded to your own review or comment. as with email and social media, there may be too strong a temptation to get the last word in. despite that, for open peer review to thrive it needs researchers to leave comments, and constructive ones at that, although that is perhaps too much to expect of all of them. it also needs engagement and discussion that brings with it tangible benefits, the most important being the advancement of knowledge within that research area. before the advent of the web, researchers worked in deeper silos, which meant that it was often not until publishing work and speaking at a subsequent conference that they became aware of similar research taking place. now researchers using tools such as twitter or mendeley can get a feel for what research is going on around them that is of interest and has some overlapping features. the debate on which is the best way forward for open peer review will continue, as will other topics that look at the measurement of research. there appears to be no single solution, with at present a collection of websites and tools, sometimes operating in silos, all offering to solve the problem of how to improve the quality of published research.
as with predatory journals and conferences we are likely to see similar ventures in open peer review. the key to improving this process is more active participation by researchers, reviewers and editors in the discussion of how to negate the various problems related to traditional peer review. peer review may not be perfect, but as the social web becomes more useful as a formal and informal platform for discussion and knowledge sharing within the academic community, it makes sense that these options are explored. the co-existence of blind and open peer review, alongside post publication review, can help shape a better system. this is similar to the argument for altmetrics, first seen by detractors as wholly alternative to the traditional measurement of citations and now more eloquently argued as an alternative indicator, rather than total measurement. open peer review platforms need to be explicit in their aims and any considerations and explain them clearly to readers and reviewers. like social media, it is unlikely that we will see every researcher using these platforms unless they become standardised, formalised and part of the research cycle. it is optional, as with academic discussion lists, where some of the most insightful and on occasion barbed communications take place. whilst many authors may feel vulnerable in making their research open for comment, they have to realise that this happens already. certainly a sizable chunk of unqualified reviewing happens courtesy of the general public when research makes local or national news. the web has evolved into a democratic and social platform that gives anyone connected via the internet a voice. open peer review has been theorised and trialled for some time, but as yet it remains the junior partner to the traditional model of peer review.
for platforms covered previously there is still some way to go for a majority acceptance, and key to this is a support mechanism for reviewers and authors that is structured, aided by moderation and authentication. if not, as (van noorden ) asks; “will online comments look more like a scattered hodgepodge of reviews, comments and discussions across websites unlinked to original publications?” nevertheless, the world of open post publication peer review is happening right now and someone may have already commented on your research, whether you respond remains your choice. references anderson, k., . stick to your ribs: the problems with calling comments “post- publication peer-review.” the scholarly kitchen. available at: http://scholarlykitchen.sspnet.org/ / / /stick-to-your-ribs-the-problems-with- calling-comments-post-publication-peer-review/ [accessed may , ]. anonymous. . shit my reviewers say. @yourpapersucks [twitter] available from: https://twitter.com/yourpapersucks [accessed may ] adrian, a. ( ) no one knows you are a dog: identity and reputation in virtual worlds. computer law & security review [online]. available from: http://www.sciencedirect.com/science/article/pii/s baxt, w. et al. ( ) who reviews the reviewers? feasibility of using a fictitious manuscript to evaluate peer reviewer performance. annals of emergency …. ( ), – . [online]. available from: http://www.sciencedirect.com/science/article/pii/s x buckels, e.e., trapnell, p.d. & paulhus, d.l., . trolls just want to have fun. personality and individual differences, , pp. – . available at: http://www.sciencedirect.com/science/article/pii/s cantor, m. & gero, s., . the missing metric: quantifying contributions of reviewers. royal society open science, ( ). available at: http://rsos.royalsocietypublishing.org/content/royopensci/ / / .full.pdf. dinucci, d., . fragmented future. print julyaug, (july/aug), pp. – . fitzpatrick, k., . 
planned obsolescence: publishing, technology, and the future of the academy, new york: nyu press. available at: https://books.google.co.uk/books?hl=en&lr=&id=wf ry m ulmc&oi=fnd&pg=pr &d q=planned+obsolescence:+publishing,+technology,+and+the+future+of+the+academ y&ots=kt_-web ud&sig=fwzqj rm-hfm dkec d_rqqlfgm [accessed june , ]. ford, e., . defining and characterizing open peer review: a review of the literature. journal of scholarly publishing, ( ), pp. – . available at: http://muse.jhu.edu/journals/journal_of_scholarly_publishing/v / . .ford.html godlee, f., gale, c. & martyn, c., . effect on the quality of peer review of blinding reviewers and asking them to sign their reports: a randomized controlled trial. jama, ( ), pp. – . available at: http://jama.jamanetwork.com/article.aspx?articleid= jefferson, t. et al., . effects of editorial peer review: a systematic review. jama. available at: http://jama.jamanetwork.com/article.aspx?articleid= joinson, a.n., . disinhibition and the internet nd ed. j. gackenbach, ed., san diego, ca: elsevier academic press. available at: https://books.google.co.uk/books?hl=en&lr=&id=_cypiidzy yc&oi=fnd&pg=pa & dq=disinhibition+and+the+internet&ots=mbnm_mx iw&sig=usbckwsomyd clotm vnhoei byi klempka, a. & stimson, a., . anonymous communication on the internet and trolling. concordia university saint paul. available at: https://comjournal.csp.edu/wp- content/uploads/sites/ / / /trollingpaper-allison-klempka.pdf. larivière, v. & gingras, y., . the decline in the concentration of citations, – . journal of the american society for information science and technology, ( ), pp. – . available at: http://onlinelibrary.wiley.com/doi/ . /asi. /abstract lee, c., sugimoto, c. & zhang, g., . bias in peer review. journal of the american society for information science and technology, ( ), pp - . available at: http://onlinelibrary.wiley.com/doi/ . /asi. /full mandavilli, a., . trial by twitter. nature. 
available at: http://www.axeleratio.com/news/trial_by_twitter_nature .pdf [accessed march , ]. van noorden, r., . the new dilemma of online peer review: too many places to post? nature newsblog. available at: http://blogs.nature.com/news/ / /the-new- dilemma-of-online-peer-review-too-many-places-to-post.html [accessed may , ]. pearce, n. et al., . digital scholarship considered: how new technologies could transform academic work. in education. available at: http://ineducation.couros.ca/index.php/ineducation/article/view/ [accessed march , ]. phillips, c. v, . the economics of “more research is needed”. international journal of epidemiology, ( ), pp. – . available at: http://www.ncbi.nlm.nih.gov/pubmed/ rennie, d., . freedom and responsibility in medical publication: setting the balance right. jama, ( ), pp. – . available at: http://www.ncbi.nlm.nih.gov/pubmed/ rooyen, s. van et al., . effect of open peer review on quality of reviews and on reviewers’ recommendations: a randomised trial. bmj. available at: http://www.bmj.com/content/ / / .short rowland, f., . the peer-review process. learned publishing. available at: http://www.ingentaconnect.com/content/alpsp/lp/ / / /art shirky, c., . it’s not information overload. it's filter failure. in web . expo. new york: web . expo. available at: http://www.web expo.com/webexny /public/schedule/detail/ . [accessed may , ]. smith, r., . opening up bmj peer review: a beginning that should lead to complete transparency. british medical journal, (january), pp. – . smith, r., . peer review: a flawed process at the heart of science and journals. journal of the royal society of medicine. available at: http://jrs.sagepub.com/content/ / / .short suber, p., . open access first., cambridge ma: mit press. available at: http://mitpress.mit.edu/sites/default/files/ _open_access_pdf_version. pdf. [accessed march , ]. sullivan, m., . peer review and open access. openstax. 
available at: http://cnx.org/contents/ f- dd - d -b f- ddb c @ [accessed march , ]. walsh, e. et al., . open peer review: a randomised controlled trial. the british journal of psychiatry. available at: http://bjp.rcpsych.org/content/ / / .short

riding a wave of creative destruction – reflections on ecology and society

copyright © by the author(s). published here under license by the resilience alliance. gunderson, l., c. folke and m. lee. . riding a wave of creative destruction – reflections on ecology and society. ecology and society ( ): . [online] url: http://www.ecologyandsociety.org/vol /iss /art /

editorial

lance gunderson , carl folke , , and michelle lee

key words: ecology and society; creative destruction

not too long ago, checkout clerks at food stores inquired, “paper or plastic?”, referring to the type of bags to carry home one’s groceries. a similar question is now being asked about much published scholarship, except that the choice is now paper or electrons. by electrons, i mean a digital medium, in which the information is developed, produced, and stored entirely on computers, in stark contrast to the volumes of paper journals that now line the shelves of libraries and many professors’ offices. to be fair, most journals that we read are now published in both paper and electronic formats. this shift marks more than a change of format, however. it is a sea change, the kind of transformation that the austrian economist joseph schumpeter said was caused by “gales of creative destruction.” in his model, systems change when new ideas, products, and technologies bring about the destruction of the old. the emergence of digital scholarship appears to follow this model: the freely accessible electronic medium is replacing the more expensive, paper-based one. ecology and society (www.ecologyandsociety.org) was one of the first entirely digital journals.
many journals that began in print now produce both paper and electronic versions, but ecology and society has always lived exclusively in a virtual world. it was developed following a challenge to a handful of graduate students at carleton university in the mid s. after four years of hard work gathering funds and writing software, the first issue was published (“posted” might be a better word) in . at the time of this writing, the th issue is underway. for the past seven years, we have edited the journal. over the past decade, the journal has grown into an internationally recognized, highly respected publication. ecology and society focuses on the interactions between people and their environment. actually, we regard people and the environment as intertwined social-ecological systems. specifically, we publish articles that deal with “the management, stewardship and sustainable use of ecological systems, resources and biological diversity at all levels, and the role natural systems play in social and political systems and conversely, the effect of social, economic and political institutions on ecological systems and services.” the journal is an open access journal, which means that all of its content is available for free on the internet. we believe that making such work freely and readily available will contribute to a greater global exchange of knowledge and information. the journal has more than , subscribers in more than countries, with most of the subscribers located in north america and europe. [author affiliations: emory university, beijer institute, stockholm resilience centre, stockholm university, resilience alliance. contact: lgunder@emory.edu, carl.folke@beijer.kva.se, managing_editor@ecologyandsociety.org]
subscribers pay nothing, but they do register with the journal in order to receive information such as notices when a new issue is published. the journal publishes about articles per year. funding for the journal comes from a combination of sources, including grants, institutional dues, and page charges.

lessons from the journal

focus on quality. early in the planning stages of the journal, the founders set the goal of developing a high-quality product. the threshold for internet publishing is very low; once a few technical obstacles are overcome, such as learning how to develop and create a website, anyone can publish anything. ecology and society set out to become a credible scientific outlet by asking respected scientists to be members of the editorial board and by implementing a double-blind review process for each manuscript. as a result, more than half of the submitted manuscripts are declined. in the past few years, the journal has been ranked among the top half a dozen environmental science and environmental studies journals. indeed, one of our editors, elinor ostrom, was awarded the prize in economic sciences in memory of alfred nobel in . encourage creativity. the journal fosters publications that explore novel ideas and the application of those ideas by awarding annual prizes. a prize of euros is awarded to the author(s) of the most original paper that integrates different streams of science to assess fundamental questions in the ecological, political, and social foundations for sustainable social-ecological systems. another award of euros is given annually to the individual or organization that is the most effective in bringing transdisciplinary science into practice. both of these prizes are donated by a private european foundation. stay small and efficient. the journal has become successful because of the hard work of a small group of people. an executive director handles all of the financial and managerial work.
a managing editor with great colleagues effectively handles the flow of manuscript submissions, reviews, editing, and final publication. custom-designed software was developed for the journal in order to manage the entire production process. the editors in chief evaluate each manuscript and oversee the review process, which is handled by a network of subject editors and reviewers. the executive director, managing editor, and editors in chief coordinate the large task of manuscript review using the custom software. develop an open network of scholars. a pleasant surprise has been the network of thinkers that has formed around the journal. that is, the scholars who contribute to the journal, the subscribers who utilize the information, and a small group who facilitate publication all form a network. the first editor suggested that the journal has created a virtual institute, one without walls and with minimal support infrastructure. this network has helped with the funding of major research centers in sweden, australia, england, canada, and the united states, as well as with the production of dozens of scholarly books and hundreds of publications.

the march and june issues

ecology and society fills an interdisciplinary niche. interest in publishing in the journal continues to increase. we continue to expand our qualified editorial board to meet this growth, and accepted papers appear on the web as soon as they are ready for publication. previously we collected all published papers into an issue twice a year. starting this spring we have decided to publish four issues a year to provide a more comprehensive overview for the reader of the journal. however, as before, we will only write two editorials a year. we are most pleased with the two current issues reported on here.
they cover a broad spectrum of topics: from long-term and even archaeological perspectives to traditional ecological knowledge, local practice, and diverse worldviews; from co-management, participatory assessments and experiments to the social-ecological challenges of managing biodiversity; and from valuing ecosystem services and performing ecosystem-based management to stewardship of landscapes and seascapes and the role of social innovation and transformation, often in the context of vulnerability, adaptability, and resilience. several papers deal with water issues, from collaborative learning platforms and participation to transboundary water governance and institutional challenges. however, all of the published papers share the same nature of truly integrating the social and the ecological. all in all, there are close to contributions with about half as part of the special features. four of the special features are now complete. the first is navigating trade-offs: working for conservation and development outcomes, edited by bruce campbell, jeff sayer and brian walker. this issue synthesizes lessons from integrated conservation-development initiatives in developing countries. the contributions emphasize that at the heart of achieving positive outcomes is a core of institutional issues involving landscape governance, trust building, empowerment, and good communication, all implying long-term commitment by, and flexibility of, external actors. the second is realizing water transitions: the role of policy entrepreneurs in water policy change, edited by dave huitema and sander meijerink, which explores how individuals instigate, implement, and occasionally prevent changes to water management policy.
a third feature, edited by daniel bottom, kim jones, and charles simenstad, is titled pathways to resilient salmon ecosystems and includes a series of articles that reflect on the past, present, and future of salmon fisheries management and their effect on the complex ecological and social systems of which they are a part. lastly, the feature managing surprises in complex systems: multidisciplinary perspectives on resilience, edited by lance gunderson and pat longstaff, presents ideas about resilience from a multidisciplinary vantage point. many thanks to all of the guest editors for their tremendous work! we receive many requests for special features, decline a lot, and invite those that we believe really can take an issue forward. in this context there are publications in the two issues that open four new special features: (1) transitions, resilience and governance: linking technological, ecological and political systems, (2) resilience and vulnerability of arid and semi-arid social ecological systems, (3) long-term vulnerability and transformation, and (4) landscape scenarios and multifunctionality – making land use assessment operational. although ecology and society has become a globally recognized outlet for scholarship, it will inevitably be subject to change. given schumpeter’s model, it may in the future be replaced by something totally new.
responses to this article can be read online at: http://www.ecologyandsociety.org/vol /iss /art /responses/

rural resilience in a digital society: editorial

journal of rural studies ( ). contents lists available at sciencedirect. journal homepage: www.elsevier.com/locate/jrurstud

editorial

introduction

the development of digital technology across the globe has taken place at considerable speed; however, this has not been at an even pace within all places (graham, ; philip et al., ; riddlesden and singleton, ). there has been a fundamental unevenness to the delivery of digital technology in all its forms that has been shaped by existing geographic and social inequalities (graham et al., ; townsend et al., ) and has, in turn, shaped the characteristics of new inequalities.
this special issue critically explores how, in different rural spaces, the delivery and use of digital technologies differs massively and how this can impact on the ability of rural communities to be resilient in an increasingly digital world. in following the multiple variations in availability, accessibility, quality and use of digital technologies in rural communities, this special issue highlights, first, how different rural communities have been significantly disadvantaged by slow delivery of post-dial up (‘narrow band’ or ‘first generation’) internet telecommunications infrastructure and, second, going beyond an infrastructure-based narrative, how rural communities have utilised pre-existing resilience to maintain and improve social and economic relations where telecommunications infrastructure development has failed to keep pace with national and international advances. this special issue originates from a working group convened at the th congress of the european society for rural sociology, , organised by researchers from the rcuk dot.rural digital economy hub at the university of aberdeen. the working group brought together european-based scholars concerned with the level of broadband infrastructure available to rural communities in the context of the digital agenda for europe (dae). this translated at that time, across many countries, as the market-led roll out of superfast broadband. papers presented at the congress explored the types and degrees of disadvantage associated with the lack of access to broadband infrastructure and technologies that rural, particularly remote, communities experience and the ways they seek to overcome the challenges arising from barriers to fit-for-purpose internet access and associated relative disadvantage.
in this special issue contributions from those who participated in the esrs congress are joined by contributions from other, non-european, scholars to lend a more international perspective, albeit one that focuses on the global north. the special issue marks the current ‘state of play’ for rural-digital agendas. this editorial introduction highlights the major contributions that the collection of papers offers in terms of interventions within the overlapping academic literature on rural digital divides, digital inclusion, rural development and resilience. it draws together policy recommendations (roberts, anderson, skerratt and farrington, ; salemink, strijker and bosworth, ; philip, cottrill, farrington, williams and ashmore, ) and outlines ‘ways forward’ for ongoing research in this field. furthermore, it speaks to wider concerns in rural studies around neo-endogenous development and how conceptualisations of the ‘networked’ or ‘relational’ rural (heley and jones, ; shucksmith, n.d.; woods, ) are complicated or re-stated by (lack of) access and use of internet-enabled technologies, as well as explorations of multifunctional rurality and diversification, through reference to a range of sectors (business, heritage, health) and their interactions with internet-enabled technologies (beel, wallace, webster, nguyen, tait, macleod and mellish, ; townsend, wallace, fairhurst and anderson, ; hodge, carson, carson, newman and garrett, ). the special issue also provides a much needed reminder to contemporary digital sociological and digital geography scholars of the implicit urban bias in ‘pervasive’ and ‘ubiquitous’ technologies discourse. for example, the proliferation of smart cities, creative cities and recently published work on neogeography (graham et al., ; haklay et al., ; wilson and graham, ) is overwhelmingly situated in an urban context. this does not reflect the life-worlds of everybody and papers in this special issue contribute to the body of evidence on how the rural sits in relation to technologies discourse. our collection of papers highlights the differentiation of rural internet users through empirical case studies of rural creative industries and high-skilled workers (townsend et al., ; ashmore et al., ), of older rural populations (hodge et al., ), of rural service providers (pant and hambly-odame, ; hodge et al., ; beel et al., ) and in terms of peripheral and isolated communities and socio-economic differences (philip et al., ; park, ; wallace et al., ). it also highlights varying contextual factors such as policy across rural communities and at uk national and european scales (roberts et al., ; salemink et al., ; philip et al., ). a strength of this special issue is the combination of scales and methods at which analyses are carried out; the contributions range from fine-grained, qualitative research on community-level case studies, to large systematic policy and literature reviews at european and international scales, to quantitative national-level and regional studies. contributions to the special issue are grouped into two sections. the first group are presented under the heading ‘ict, infrastructure and digital divides’. these contributions synthesise current literature on the rural digital divide, assess national-level policy responses and evaluate community-led alternatives for accessing broadband infrastructure. the second group deal more broadly with the use and benefits of the internet in rural areas. under the heading ‘harnessing digital technologies and crossing divides’, these papers illustrate how broadband internet access has provided opportunities (although barriers still exist) in different rural places and overlapping rural sectors including business, health, heritage and local services. we first introduce all the contributions to the special issue below, followed by reflections on relationships between rural digital society and notions of ‘rural resilience’ that stem from the research our contributing authors have presented. we conclude by suggesting how we can move forward with regards to future research on rural resilience and digital technology.

© the authors. published by elsevier ltd. this is an open access article under the cc by license (http://creativecommons.org/licenses/by/ . /). http://dx.doi.org/ . /j.jrurstud. . .

ict, infrastructure and digital divides

digital divides refer to the uneven ways in which people have access to digital technology. they arise from a number of factors, including, for example, the accessibility of different technologies (e.g. expensive equipment), the provision of technologies (e.g. the telecommunications infrastructure), and education (e.g. not knowing how to use different technologies). singly or in combination these factors contribute to the ways in which people are disadvantaged in their ability to make use of digital technologies. the first set of papers in this special issue address the issue of digital divides from a number of illuminating positions. they reflect a more nuanced conception of digital unevenness than a simple rural-urban divide. salemink, strijker and bosworth's paper offers a comprehensive review of the literature on digital divides and charts its progression over the last decade or so, drawing international comparisons. it reviews digital policy from countries across the global north and concludes with recommendations for future policy that suggest how to better position rural areas in future digital society developments.
the contribution distinguishes two major strands of research, connectivity research and inclusion research, and argues that these strands should be combined to create ‘customised policies’ to address digital divides in future digital policy agendas. roberts, anderson, skerratt and farrington scrutinise the european rural-digital policy agenda in their paper, using a community resilience framework to critically assess the mechanisms and assumptions through which it functions. community resilience, sustainability and associated proxies are frequently mentioned in inclusion and digital infrastructure policy statements, via assumed future benefits and the responsibilisation of local groups to create their own access (community broadband initiatives) and support structures (digital inclusion voluntary charters or champions). focusing on the translation of the european policy agenda into a uk context, they find that the language surrounding rural broadband infrastructure policy in the uk contains normative claims about its capacity to aid rural development, offer solutions to rural service provision and the challenges of implementing localism. however, their analysis suggests that digital inclusion policy is currently piecemeal, focusing on ‘show cases’ without a coherent rural focus. philip, cottrill, farrington, williams and ashmore's paper follows the rollout of broadband to the ‘final few’ rural communities within the uk. the paper reports an analysis of data published by the uk's telecommunications regulator, ofcom, and a series of qualitative vignettes which together highlight the real and lived uneven geography of digital infrastructure supply to rural areas. it then shows how this impacts most heavily on the most remote areas. the paper contributes to our understanding of the paradox faced by rural communities and policy makers in delivering broadband through a market driven approach.
that is, the rural communities that would potentially benefit most from better broadband connectivity in both economic and social terms are always furthest away from that delivery. this raises serious questions about the economic viability and long term sustainability of remote rural communities, as well as affecting the ability of such communities to be resilient in difficult economic times. finally, the paper also challenges public policy makers to think through better ways of delivering broadband provision so that rural communities are not further disadvantaged by market driven approaches. sora park highlights the intersection of multiple factors that influence rural digital exclusion. she uses data from the australian bureau of statistics to show that whilst remoteness was a key determinant of rural digital exclusion, other sociodemographic variables including, for example, educational achievement and employment status also played a significant role. the need for building better capacity in rural areas is stressed, with the author arguing that both supply (infrastructure) and demand (education and employment opportunities, industry sector and socio-demographics) must be considered in the development of future rural digital inclusion strategies. ashmore, farrington and skerratt move the scale of analysis to the community level. their paper compares two rural community-led broadband initiatives, one in northern england and one in scotland. they find that strong leadership and processes and structures that actively encourage participation can enhance resilience-building overall, but that this is best served by a joined-up approach that links actors and development priorities at local and extra-local levels. for example, digital champions or leaders are critical for resource identification and gaining engagement within a community when starting the process of setting up a local digital infrastructure network.
however, leaders can sometimes entrench existing inequalities and feelings of exclusion, ultimately detracting from other community members' capacity or desire to engage.

moving beyond simple rural-urban digital divides: harnessing digital technologies

the second set of papers sits within the wider literature on digital divides that explicitly seeks to move digital divide debates beyond considerations framed around a simple user and non-user binary (park et al., ; robinson et al., ). internet users do not all have access to the same spectrum of online activities, reflecting differences in users' abilities to consistently access reliable, high speed internet connectivity or to access the technologies that enable them to use the internet effectively at a reasonable cost. multiple socio-economic factors influence an individual's capacity to go online, including potentially fluctuating interest and needs. the second set of papers encourages us to think about what qualifies as ‘digital participation’ or ‘engagement’ alongside better understandings of levels of use, the utility of digital connectivity and its ‘meaningfulness’ for individuals and rural communities. the contributions all illustrate why it is important to move beyond viewing rural (non) users as a homogenous group. wallace, vincent, luguzan, townsend and beel's paper introduces social cohesion in terms of system integration (organisational, communal spaces on and offline) and social integration (informal, networks, sense of belonging). this conceptualisation is a useful point of entry for an evaluation of intertwining on and offline relationships at community level and the extent to which these foster social cohesion, an important contributor to community resilience.
contrasting two rural communities in northern scotland, their study concludes that ict is becoming an integral part of rural social relations but it can play very different roles with regards to promoting and sustaining social cohesion for different social and cultural groups, as well as for different kinds of locational communities. this paper draws on research undertaken in two communities with access to broadband internet and, like park et al., shows that factors other than access/no access to broadband determine the extent to which rural community groups are moving online. the characteristics of the two case study communities may typify rural areas in many other national contexts and this paper's findings are of relevance to most other global north rural contexts where 'traditional' social networks are being reshaped and reformed through development of online presence. the canadian context is illustrated in pant and hambly-odame's paper, which operates at two levels, offering detailed analysis of the uses and benefits of a rural region's high-speed broadband network for local businesses and business support organisations, and reflecting on the process of working with the partnership that delivers this broadband infrastructure. this contribution reveals location- and sector-specific benefits of broadband that rural small businesses and community organisations have realised from increased access (including availability and affordability) to broadband as well as stressing the importance of the reliability of internet connections. pant and hambly-odame, like roberts et al. and salemink et al., stress the necessity of flexible digital infrastructure delivery programmes rather than a fixed, one-size-fits-all approach. townsend, wallace, fairhurst and anderson examine the benefits of digital connectivity to rural businesses in a scottish context.
their paper focuses on the creative industries, recognized as an increasingly important contributor to the rural economy. they find that being digitally connected is essential for the creative sector and that online applications are used to support a variety of business related activities. the extent to which broadband connectivity can alleviate the penalty of distance for rural creative practitioners is dependent on whether digital connections can support the download and upload speeds required to perform business-related activities. significantly, the paper reports that a lack of access to adequate broadband is perceived as such a barrier to business sustainability that it is a factor that could influence decisions for creative practitioners to relocate their business and their households away from rural, and especially remote rural, areas. resilient rural communities need to be able to sustain an active working age population and support a diverse economic base; inadequate connectivity means that the creative sector is a vulnerable sector within the rural economy. beel, wallace, webster, nguyen, tait, macleod and mellish's paper asks how community activity, connectivity and digital archives can support interest in local heritage as well as help to develop more resilient communities. through the example of two case studies of community digital-heritage projects this paper explores the role of cultural practices in building community resilience and empirically ‘places’ cultural resilience. it explores the role of ‘bottom-up’ volunteer labour, and contextual factors such as place identities and knowledges, traditions, histories and customs, and the role the process of digitizing archives plays in strengthening community cohesion as well as supporting the development of wider socio-economic benefits. the paper provides a practical demonstration of how appropriate digital technology can have a real and positive impact in rural areas.
the final paper of this special issue moves back to the australian context to look at the relationship between internet connectivity and rural service provision for the elderly. hodge, carson, carson, newman and garrett identify the nature and extent of digital interactions between older people and service providers, and the enablers and challenges for online service engagement. older participants demonstrated considerable interest in learning how to use the internet for accessing particular services, with social support networks and third party facilitators being crucial enablers. service providers’ ambitions to engage with older people online appeared more limited as a result of entrenched stereotypes of older non-users and a lack of digital skills within service provider organisations, alongside organisational and funding constraints. this paper illustrates how digital applications could be of considerable benefit to rural communities at a time when increased withdrawal of physical services is being experienced, and highlights that digital divides can be reinforced by increasingly outdated stereotypes. these need to be challenged to ensure that digital exclusion is not further entrenched at a time when digital service provision will become increasingly prevalent.

rural resilience?

current research about and using the concept of resilience abounds in the social sciences. resilience as a concept relevant to research in the social sphere has been criticised for being too loosely conceived and all-encompassing, for not being aligned with a clear methodology and for overlooking how power functions in decision-making for resilience (anderson, ; cote and nightingale, ; davidson, ; mackinnon and derickson, ).
these critiques notwithstanding, a resilience perspective has been used to good effect in insightful (normative) work exploring how and what makes individuals, businesses, communities and regions more resilient, and it is this, we argue, that justifies a focus on resilience in rural social science research. our understanding of resilience refers to both short- and long-term socio-political processes of change, not the more commonly cited ecological definitions in which resilience means adaptability or bounce-back-ability from ‘shocks,’ which in social systems often translates as natural disasters. in these papers, the ability to get online in a meaningful way is both an outcome of being or having resilience and a process through which resilience characteristics are exhibited, as well as a context to resilience and, in some cases, a social change that it is necessary to become resilient to.

anderson ( ) argued that the strongest work on resilience borrows from a broad framework, lending specificity and appropriate selection of factors from typologies and motifs covered across the literature. the papers in this special issue each approach resilience, as a concept and methodologically, in different ways. they bring the ‘connections between resilience and specific economic-political apparatus, including neoliberalism, into a question to be explored rather than a presumption from which analysis begins’ (anderson, p. ). digital agendas exhibit distinctly neoliberal features. through analysis at different scales, locales, and with reference to various combinations of economic drivers and policies, the contributions in this special issue begin to unpack what it means for rural communities to be or have resilience in a digital age. we respond, for example, to weichselgartner and kelman's question about how urban and rural resilience are, or should be, differentiated by exploring a specific empirical issue: rural broadband adoption and use.
the papers in this special issue share the strong conclusion that those who can access (acceptable) broadband internet connectivity within rural areas are able to reap rewards in economic and cultural terms. yet not everyone chooses to connect, and those who are online do not all have access to a reliable, fast broadband service. relative disadvantage and likely exclusion from dimensions of an increasingly globalised rural society can be experienced by both rural users and non-users of the internet. the conclusions from research reported in this special issue's contributions include proposals for localised and responsive approaches to rural inclusivity in a digital society.

in the context of rural digital policy agendas, we propose that key resilience terms are especially helpful for thinking about how and why communities benefit or become disadvantaged in the ways they do. some of our contributors provide support for scott's ( ) claims for understanding rural resilience, allowing for an exploration of historical path dependencies, ‘lock-in’ to development trajectories, deliberative modes of decision making, and the mix of endogenous and exogenous forces interacting at the local level. for example, salemink et al. ask ‘to what extent are rural communities, united in civic initiatives or community action groups, and telecommunications companies able to regulate this process together, and where do they need government support?’ (p. ). roberts et al. find that in rural digital policy, ‘resilience’ can be an effective discourse for ‘responsibilising’ the community (anderson, ): on the one hand, to develop their own broadband networks and support systems within the community to build digital capacity and, on the other hand, to remove financial and regulatory support mechanisms (via partnerships that work on a voluntary and often non-transparent basis) within a neoliberal, localism backdrop. ashmore et al.
highlight some of the weaknesses and mutability of the resilience concept by illustrating how community endeavours can be vulnerable to the capital (network, knowledge and determination) of one or a few leaders, and pay heed to the non-neutrality of resilience processes, whereby not all members of the community gain buy-in to a community broadband scheme for reasons related to dynamics and power relations across socio-economic and geographical groupings. similar uneven distribution is evident in wallace et al.’s analysis of broadband usage for enhancement of quality of life between and within communities in a commuter belt rural village and a more peripheral rural village. across the papers generally, digital capital (in the form of access, literacy, use, benefits) can be seen as mutually supporting other forms of capital that enhance rural resilience. indeed, in this issue, hodge et al. argue that strategies for enhancing social capital (through networks and inclusion) are at least as important as improving technical capabilities for rural elderly populations ( p ).

we can only provide a snapshot of rural digital society in a constantly evolving technological landscape. our contributions could exhibit the same potential limitation that much work on technology does, in that they will quickly become redundant. however, although digital technologies have changed and will continue to change, the issue of ‘lagging behind’ and inequality of opportunity within rural areas caused by predominantly neoliberal structures has not changed over the last several decades, so it is more than likely that the central contributions of the special issue will carry forward as a digital society becomes an even more entrenched aspect of modern life.
concluding thoughts: ongoing rural digital scholarship

contributions to this special issue sit within a literature that understands that digital inequality and exclusion cannot be analysed in isolation, separate from offline disadvantage, and that the continued integration of digital technologies into new aspects of daily life means that forms of disadvantage mutate (helsper, ; robinson et al., ). this collection of papers provides evidence of distinctive rural forms of digital disadvantage and vulnerability which take shape within, and in turn create, a variety of different forms of social, economic and cultural disadvantage. we feel that there is something particularly punitive to those who live in rural locations being unable to fully exploit the opportunities afforded by digital technology. if digital telecommunication infrastructure and applications are not equally available to all, regardless of location, those working and living in not served or underserved areas, such as many rural areas, are disadvantaged. this in turn restricts the ability of rural locations to grow economically, socially and culturally on their own terms. in stressing this final point, the special issue has sought to show how rural communities have embraced digital technologies when they are available to them.

any attempts to close the digital divide and to allow rural communities to fully engage in a digital society must have a territorial focus. a continuing digital inclusion agenda for rural communities, based upon flexible, responsive and inclusive (participatory and equal opportunity) policy, one that is cognisant of and concerned to address uneven digital geographies of place is, we argue, crucial to the future sustainability and resilience of rural communities and rural places. looking forward, we anticipate that work on rural digital divides will need to take into account changing landscapes in technological provision.
despite ongoing uneven infrastructure provision, the landscape does and will change quickly in terms of fixed and mobile connectivity. a future research agenda should take these rapid changes into account and interrogate the extent to which governmental promises of reducing the rural-urban digital divide are being delivered on. we propose that future research in this area takes in-depth and longitudinal approaches that look at motivations, attitudes and barriers of rural users and how these respond to changes in technological provision. there is a need for ongoing studies that critically question the different uses and benefits of technologies across diverse rural groups, and studies that consider the relationship between socio-demographics, rurality and digital inclusion. finally, we suggest that future research in this field considers the development of appropriate technologies and policies, and the most effective routes to implementation, whether these be through bottom-up community-led initiatives, through government-led investments and schemes or through partnerships which encompass multiple approaches.

acknowledgement

the four editors of this special issue led research at the dot.rural digital economy hub, a research centre funded by the rcuk digital economy programme (award reference: ep/g / ). some of this research is presented in papers in this special issue.

references

anderson, b., . what kind of thing is resilience? politics ( ), – .
cote, m., nightingale, a.j., . resilience thinking meets social theory: situating social change in socio-ecological systems (ses) research. prog. hum. geogr. ( ), – .
davidson, d.j., . the applicability of the concept of resilience to social systems: some sources of optimism and nagging doubts. soc. nat. resour. ( ), – .
graham, m., . time machines and virtual portals: the spatialities of the digital divide. prog. dev. stud. ( ), – .
graham, m., hale, s., stephens, m., .
featured graphic: digital divide: the geography of internet access. environ. plan. a ( ), – . retrieved march , , from editor & translator.
haklay, m., singleton, a., parker, c., . web mapping . : the neogeography of the geoweb. geogr. compass ( ), – .
heley, j., jones, l., . relational rurals: some thoughts on relating things and theory in rural studies. j. rural stud. ( ), – .
helsper, e.j., . a corresponding fields model for the links between social and digital exclusion. commun. theory ( ), – .
mackinnon, d., derickson, k.d., . from resilience to resourcefulness: a critique of resilience policy and activism. prog. hum. geogr. ( ), – .
park, s., freeman, j., middleton, c., allen, m., eckermann, r., everson, r., . the multi-layers of digital exclusion in rural australia. in: proceedings of the annual hawaii international conference on system sciences, pp. e , emarch.
philip, l.j., cottrill, c., farrington, j., . “two-speed” scotland: patterns and implications of the digital divide in contemporary scotland. scott. geogr. j. ( – ), – .
riddlesden, d., singleton, a.d., . broadband speed equity: a new digital divide? appl. geogr. , – .
robinson, l., cotten, s.r., ono, h., quan-haase, a., chen, w., schulz, j., et al., .
digital inequalities and why they matter. inf. commun. soc. ( ), – .
robinson, l., cotten, s.r., ono, h., quan-haase, a., mesch, g., chen, w., et al., .
digital inequalities and why they matter. inf. commun. soc. ( ), – .
scott, m., . resilience: a conceptual lens for rural studies? geogr. compass ( ), – .
shucksmith, m. (n.d.). future directions in rural development executive summary.
townsend, l., sathiaseelan, a., fairhurst, g., wallace, c., . enhanced broadband access as a solution to the social and economic problems of the rural digital divide. local econ. ( ), – .
weichselgartner, j., kelman, i., . geographies of resilience: challenges and opportunities of a descriptive concept. prog. hum. geogr. ( ), – .
wilson, m.w., graham, m., . situating neogeography. environ. plan. a ( ), – . retrieved march , , from editor & translator.
woods, m., . rural geography: blurring boundaries and making connections. prog. hum. geogr. ( ), – .

elisabeth roberts*, david beel, lorna philip, leanne townsend
university of the west of england, q frenchay campus, coldharbour lane, bristol, bs qy, united kingdom
* corresponding author.
e-mail addresses: roberts.elisabeth@googlemail.com (e. roberts), d.e.beel@sheffield.ac.uk (d. beel), l.philip@abdn.ac.uk (l. philip), l.townsend@abdn.ac.uk (l. townsend).
available online july

rural resilience in a digital society: editorial . introduction . ict, infrastructure and digital divides . moving beyond simple rural-urban digital divides: harnessing digital technologies . rural resilience? . concluding thoughts: ongoing rural digital scholarship acknowledgement references

the cultural shaping of icts within academic fields: corpus-based linguistics as a case study
doi: . /llc/ . . corpus id:
@article{fry thecs, title={the cultural shaping of icts within academic fields: corpus-based linguistics as a case study}, author={j. fry}, journal={lit. linguistic comput.}, year={ }, volume={ }, pages={ - } }
j. fry. published in lit. linguistic comput. fields: computer science, sociology.

the aim of this paper is to show that the appropriation of icts is determined by a field's specific cultural identity. knowledge is not a homogeneous whole, but a patchwork of heterogeneous fields. these fields are most visible as embodied in academic disciplines, which have distinct cultural identities shaped by intellectual and social considerations. scholarly communication systems evolve over time within the context of these cultural identities. the paper discusses the cultural shaping of…

a prototype for authorship attribution studies

patrick juola∗ juola@mathcs.duq.edu
john sofko sofko @hotmail.com
patrick brennan brennan @comcast.net
duquesne university, pittsburgh, pa, united states of america

abstract

despite a century of research, statistical and computational methods for authorship attribution are neither reliable, well-regarded, widely-used, nor well-understood.
this paper presents a survey of the current state-of-the-art as well as a framework for uniform and unified development of a tool to apply the state-of-the-art, despite the wide variety of methods and techniques used. the usefulness of the framework is confirmed by the development of a tool using that framework that can be applied to authorship analysis by researchers without a computing specialization. using this tool, it may be possible both to expand the pool of available researchers and to enhance the quality of the overall solutions (for example, by incorporating improved algorithms as discovered through empirical analysis [juola, a]).

introduction

the task of computationally inferring the author of a document based on its internal statistics (sometimes called “stylometrics,” “authorship attribution,” or, for the completists, “non-traditional authorship attribution”) is an active and vibrant research area, but at present largely without use. for example, unearthing the author of the anonymously-written primary colors (joe klein) became a substantial issue in . in , “anonymous” published imperial hubris, a followup to his (her?) earlier work through our enemies’ eyes. who wrote these books? did the same person actually write these books? does the(?) author actually have the expertise claimed on the dust cover (“a senior u.s. intelligence official with nearly two decades of experience”)? and why haven’t our computers already given us the answer?

(∗corresponding author. footnote: according to news report consensus, as first revealed by jason vest in the july edition of the boston phoenix, the author is michael scheuer, a senior cia officer. but how seriously should we take this consensus?)

determining the author of a particular piece of text has been a methodological issue for centuries.
questions of authorship can be of interest not only to humanities scholars, but in a much more practical sense to politicians, journalists, and lawyers, as in the examples above. in recent years, the development of improved statistical techniques [holmes, ] in conjunction with the wider availability of computer-accessible corpora [nerbonne, ] has made the automatic inference of authorship (variously called “authorship attribution” or more generally “stylometry”) at least a theoretical possibility, and research in this area has expanded tremendously. from a practical standpoint, acceptance of this technology is dogged by many issues (epistemological, technological, and political) that limit and in some cases prevent its wide acceptance. part of this lack of use can be attributed to simple unfamiliarity on the part of the relevant communities, combined with a perceived history of inaccuracy (see, for example, the discussion of the cusum technique [farringdon, ] in [holmes, ]). since , however, the popularity of corpus linguistics as a field of study and the vast increase in the amount of data available on the web have made it practical to use much larger sets of data for inference. during the same period, new and increasingly sophisticated techniques have improved the quality (and accuracy) of the judgments the computers make.

this paper summarizes some recent findings and experiments and presents a framework for development and analysis to address these issues. in particular, we discuss two major usability concerns, accuracy and user-friendliness. in broad terms, these concerns can only be addressed by expansion of the number of clients (users) for authorship attribution technology. we then present a theoretical framework for the description of authorship attribution, intended to make the development and improvement of genuine off-the-shelf attribution solutions easier and more practical.
background

with a history stretching to [mendenhall, ], and , hits on google, it is apparent that statistical/quantitative authorship attribution is an active and vibrant research area. with nearly years of research, it is surprising that it has not been accepted by relevant scholars: “stylometrics is a field whose results await acceptance by the world of literary study by and large.” this can be attributed at least partially to a limited view of the range of applicability, to a history of inaccuracy, and to the mathematical complexity (and corresponding difficulty of use) of the techniques deployed.

(footnotes: phrasal search for “authorship attribution,” june , . anonymous, personal communication to patrick juola, .)

for example, and taking a broad view of “stylometry” to include the inference of group characteristics of a speaker, the story from judges : – describes how tribal identity can be inferred from the pronunciation of a specific word (to be elicited). specifically, the gileadites captured the fords of the jordan leading to ephraim, and whenever a survivor of ephraim said, “let me cross over,” the men of gilead asked him, “are you an ephraimite?” if he replied, “no,” they said, “all right, say ‘shibboleth.’” if he said, “sibboleth,” because he could not pronounce the word correctly, they seized him and killed him at the fords of the jordan. forty-two thousand ephraimites were killed at that time.

a more modern version of such shibboleths could involve specific lexical or phonological items; a person who writes of a “chesterfield” as a piece of furniture is presumptively canadian, and an older canadian at that [easson, ]. [wellman, ][p. ] describes how an individual spelling error (an idiosyncratic spelling of “toutch”) was elicited and used in court to validate a document for evidence. at the same time, such tests cannot be relied upon.
idiosyncratic spelling or not, the word “touch” is rather rare ( tokens in the million-word brown corpus [kučera and francis, ]), and although one may be able to elicit it in a writing produced on demand, it’s less likely that one will be able to find it independently in two different samples. people are also not consistent in their language, and may (mis)spell words differently at different times; often the tests must be able to handle distributions instead of mere presence/absence judgments. most worryingly, the tests themselves may be inaccurate [see especially the discussion of cusum [farringdon, ] in [holmes, ]], rendering any technical judgment questionable, especially if the test involves subtle statistical properties such as “vocabulary size” or “distribution of function words,” concepts that may not be immediately transparent to the lay mind.

questions of accuracy are of particular importance in wider applications such as law. the relevance of a document (say, an anonymously libelous letter) to a court may depend not only upon who wrote it, but upon whether or not that authorship can be demonstrated. absent eyewitnesses or confessions, only experts, defined by specialized knowledge, training, experience, or education, can offer “opinions” about the quality and interpretation of evidence. u.s. law, in particular, greatly restricts the admissibility of scientific evidence via a series of epistemological tests. the frye test states that scientific evidence is admissible only if “generally accepted” by the relevant scholarly community, explicitly defining science as a consensus endeavor. under frye, (widespread) ignorance of or unfamiliarity with the techniques of authorship attribution would be sufficient by itself to prevent use in court.
the daubert test is slightly more epistemologically sophisticated, and establishes several more objective tests, including but not limited to empirical validation of the science and techniques used, the existence of an established body of practices, known standards of accuracy (including so-called type i and type ii error rates), a pattern of use in non-judicial contexts, and a history of peer review and publication describing the underlying science. (footnote: frye vs. united states, ; daubert vs. merrill dow, .)

at present, authorship attribution cannot meet these criteria. aside from the question of general acceptance (the quote presented in the first paragraph of this section, by itself, shows that stylometrics couldn’t pass the frye test), the lack of standard practices and known error rates eliminates stylometry from daubert consideration as well.

recent developments

to meet these challenges, we present some new methodological and practical developments in the field of authorship attribution. in june , allc/ach hosted an “ad-hoc authorship attribution competition” [juola, a] as a partial response to these concerns. specifically, by providing a standardized test corpus for authorship attribution, not only could the mere ability of statistical methods to determine authors be demonstrated, but methods could further be distinguished between the merely “successful” and “very successful.” (from a forensic standpoint, this would validate the science while simultaneously establishing the standards of practice and creating information about error rates.) contest materials included thirteen problems, in a variety of lengths, styles, genres, and languages, mostly gathered from the web but including some materials specifically gathered for this purpose. two dozen research groups participated by downloading the (anonymized) materials and returning their attributions to be graded and evaluated against the known correct answers.
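the grading step described above, and the error-rate information it is meant to produce, can be sketched as follows. this is a minimal illustration only, not the contest's actual scoring procedure: the function name `grade`, the dictionary layout, and the document ids and labels in the usage note are all hypothetical. per-author false positives and false negatives are used here as a rough stand-in for type i and type ii errors with respect to the claim "this author wrote the document".

```python
def grade(attributions, answers):
    """Score returned attributions against known answers.

    Both arguments map a document id to an author label (hypothetical layout).
    Returns (accuracy, errors), where errors maps each author involved in a
    mistake to counts of false positives (documents wrongly attributed to that
    author) and false negatives (that author's documents attributed elsewhere
    or left unattributed).
    """
    if not answers:
        raise ValueError("no answers to grade against")
    correct = 0
    errors = {}
    for doc_id, truth in answers.items():
        guess = attributions.get(doc_id)  # None when no attribution was returned
        if guess == truth:
            correct += 1
            continue
        # The true author was missed: a false negative for that author.
        errors.setdefault(truth, {"false_pos": 0, "false_neg": 0})["false_neg"] += 1
        # The guessed author was wrongly credited: a false positive for them.
        if guess is not None:
            errors.setdefault(guess, {"false_pos": 0, "false_neg": 0})["false_pos"] += 1
    return correct / len(answers), errors
```

as a usage note, calling `grade({"d1": "twain", "d2": "twain"}, {"d1": "twain", "d2": "crane"})` returns an accuracy of 0.5 together with one false positive recorded for "twain" and one false negative for "crane". averaging such accuracies across problems would give a per-method success rate of the kind reported for the contest, under the assumption that every problem is weighted equally.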
the specific problems presented included the following:
• problem a (english) fixed-topic essays written by thirteen duquesne students during fall .
• problem b (english) free-topic essays written by thirteen duquesne students during fall .
• problem c (english) novels by th century american authors (cooper, crane, hawthorne, irving, twain, and ‘none-of-the-above’), truncated to , characters.
• problem d (english) first act of plays by elizabethan/jacobean playwrights (jonson, marlowe, shakespeare, and ‘none-of-the-above’).
• problem e (english) plays in their entirety by elizabethan/jacobean playwrights (jonson, marlowe, shakespeare, and ‘none-of-the-above’).
• problem f ([middle] english) letters, specifically extracts from the paston letters (by margaret paston, john paston ii, and john paston iii, and ‘none-of-the-above’ [agnes paston]).
• problem g (english) novels, by edgar rice burroughs, divided into “early” (pre- ) novels, and “late” (post- ).
• problem h (english) transcripts of unrestricted speech gathered during committee meetings, taken from the corpus of spoken professional american-english.
• problem i (french) novels by hugo and dumas (père).
• problem j (french) training set identical to previous problem. testing set is one play by each, thus testing ability to deal with cross-genre data.
• problem k (serbian-slavonic) short excerpts from the lives of kings and archbishops, attributed to archbishop danilo and two unnamed authors (a and b). data was originally received from alexsandar kostic.
• problem l (latin) elegiac poems from classical latin authors (catullus, ovid, propertius, and tibullus).
• problem m (dutch) fixed-topic essays written by dutch college students, received from hans van halteren.

the contest (and results) were surprising at many levels; some researchers initially refused to participate given the admittedly difficult tasks included among the corpora.
for example, problem f consisted of a set of letters extracted from the paston letters. aside from the very real issue of applying methods designed/tested for the most part for modern english on documents in middle english, the size of these documents (very few letters, today or in centuries past, exceed words) makes statistical inference difficult. similarly, problem a was a realistic exercise in the analysis of student essays (gathered in a freshman writing class during the fall of ) – as is typical, no essay exceeded words. from a standpoint of literary analysis, this may be regarded as an unreasonably short sample, but as both a realistic test of forensic attribution and a difficult problem for testing the sensitivity of techniques, these samples are legitimate.

results from this competition were heartening. (“unbelievable,” in the words of one contest participant.) despite the data set limitations, the highest scoring participant [koppel and schler, ] scored an average success rate of approximately %. (juola’s solutions, in the interests of fairness, averaged % correct.) in particular, schler’s methods achieved . % accuracy on problem a and . % accuracy on problem f, both acknowledged to be difficult and considered by many to be unsolvably so. more generally, all participants scored significantly above chance. perhaps as should be expected, performance on english problems tended to be higher than on other languages. perhaps more surprisingly, the availability of large documents was not as important to accuracy as the availability of a large number of smaller documents, perhaps because they can give a more representative sample of the range of an author’s writing. in particular, the average performance of a method on english samples (problems a-h) correlated significantly ( . , p < . ) with that method’s performance on non-english samples.
correlation between large-sample problems (problems with over , words per sample) and small sample problems was still good, although no longer significant (r = . ). this suggests that the problem of authorship attribution is at least somewhat a language- and data-independent problem, and one to which we may be able to expect to find wide-ranging technical solutions for the general case, instead of (as, for example, in machine translation) having to tailor our solutions with detailed knowledge of the problem/texts/languages at hand. in particular, we offer the following challenge to all researchers in the process of developing a new authorship attribution algorithm: if you can’t get % correct on the paston letters (problem f), then your algorithm is not competitively accurate. every well-performing algorithm studied had no difficulty achieving this standard. statements from researchers that their methods will not work with only a handful of letters as training data should be regarded with appropriate suspicion.

finally, methods based on simple lexical statistics tended to perform substantially worse than methods based on n-grams or similar measures of syntax in conjunction with lexical statistics. we continue to examine the detailed results in an effort to identify other characteristics of good solutions. unfortunately, another apparent result is that the high-performing algorithms appear to be mathematically and statistically (although not necessarily linguistically) sophisticated. the good methods have names that may appear fearsome to the uninitiated: linear discriminant analysis [baayen et al., , van halteren et al., ], orthographic cross-entropy [juola and baayen, , juola and baayen, ], common byte n-grams [keselj and cercone, ], svm with a linear kernel function [koppel and schler, ]. these techniques can be difficult to implement, or even to understand or to use, by a casual, non-technical scholar.
at the same time, the sheer number of techniques proposed (and therefore, the number of possibilities available to confuse) has exploded, which also limits the pool of available users. we can no longer expect a casual professor of literature (let alone a journalist, lawyer, judge, or interested layman) to apply these new methods to a problem of interest without technical assistance.

new technologies

the variation in these techniques can make authorship attribution appear to be an unorganized mess, but it has been claimed that under an appropriate theoretical framework [juola, b], many of these techniques can be unified, combined, and deployed. using this framework, it is possible (indeed, we hope to demonstrate as the basis for incremental improvement) to develop “commercial off the shelf” (cots) software to perform many of the technical analytic aspects. the initial observation is that, broadly speaking, all known human languages can be described as an unbounded sequence chosen from a finite space of possible events. for example, the ipa phonetic alphabet [ladefoged, ] describes an inventory of approximately different phonemes; a typewriter shows approximately different latin- letters; a large dictionary will present an english vocabulary of – , different words. an (english) utterance is “simply” a sequence of phonemes (or words).

the proposed framework postulates a three-phase division of the authorship attribution task, each of which can be independently performed, rather in the manner of a unix or linux pipeline, where the output of one phase is immediately made available as the input of the following one. these phases are:
• canonicization: no two physical realizations of events will ever be exactly identical. we choose to treat similar realizations as identical to restrict the event space to a finite set.
• determination of the event set: the input stream is partitioned into individual non-overlapping “events.” at the same time, uninformative events can be eliminated from the event stream.
• statistical inference: the remaining events can be subjected to a variety of inferential statistics, ranging from simple analysis of event distributions through complex pattern-based analysis. the results of this inference determine the results (and confidence) in the final report.

as an example of how this procedure works, we consider a method for identifying the language in which a document is written. the statistical distribution of letters in english text is well-known (see any decent cryptography handbook, including [stinson, ]). we first canonicize the document by identifying each letter (an italic e, a boldface e, or a capital e should be treated identically) and producing a transcription. this canonicization process would also implicitly involve other transformations, for example, partitioning a pdf image into text regions to be analyzed as opposed to illustrations and margins to be ignored. a much more sophisticated canonicization process, following [rudman, ], could regularize spelling, eliminate extraneous material such as chapter headings and page numbers, and even “de-edit” the invisible hand of the editor or redactor, to approximate as closely as possible the state of the original manuscript as it left the pen or typewriter of the author. the output of this canonicization process would then be a sequence of linguistic elements. we then identify each letter as a separate event, eliminating all non-letter characters such as numbers or punctuation. a more sophisticated application might demand instead that letters be grouped into morphemes, syllables, words, and so forth. finally, by compiling an event histogram and comparing it with the known distribution, we can determine a probability that the document was written in english.
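the language-identification example above can be sketched in java as follows. this is only an illustrative sketch under stated assumptions, not the jgaap code: the class and method names (LetterHistogramSketch, canonicize, letterHistogram, distance) are hypothetical, canonicization is reduced to case folding, and a simple l1 distance between relative-frequency histograms stands in for the probability estimate mentioned in the text. the reference sentence in main is a placeholder; a real reference histogram would have to come from a large english sample or a published letter-frequency table, which is not reproduced here.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class LetterHistogramSketch {
    // Phase 1 (canonicization), reduced here to case folding so that
    // "E" and "e" count as the same event. Hypothetical helper.
    static String canonicize(String raw) {
        return raw.toLowerCase(Locale.ROOT);
    }

    // Phase 2 (event set): every letter is one event; digits, spaces and
    // punctuation are dropped. Returns relative frequencies per letter.
    static Map<Character, Double> letterHistogram(String canonical) {
        Map<Character, Double> freq = new TreeMap<>();
        int total = 0;
        for (char c : canonical.toCharArray()) {
            if (Character.isLetter(c)) {
                freq.merge(c, 1.0, Double::sum);
                total++;
            }
        }
        final int n = total;
        freq.replaceAll((letter, count) -> count / n);
        return freq;
    }

    // Phase 3 (inference): L1 distance between two histograms. This is an
    // assumed stand-in for the probability computation in the text.
    static double distance(Map<Character, Double> a, Map<Character, Double> b) {
        Set<Character> keys = new TreeSet<>(a.keySet());
        keys.addAll(b.keySet());
        double sum = 0.0;
        for (Character k : keys) {
            sum += Math.abs(a.getOrDefault(k, 0.0) - b.getOrDefault(k, 0.0));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Placeholder reference text, far too short to be representative.
        Map<Character, Double> reference =
            letterHistogram(canonicize("The quick brown fox jumps over the lazy dog."));
        Map<Character, Double> unknown =
            letterHistogram(canonicize("An italic e, a boldface e, or a capital E."));
        System.out.println(distance(unknown, reference));
    }
}
```

a smaller distance to an english reference than to references for other languages would suggest english; how small is small enough is an empirical question.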
a similar process would treat each word as a separate event (eliminating words not found in a standard lexicon) and compare event histograms with a standardized set such as the brown histogram [kučera and francis, ]. note that the difference between an analysis based on letter histograms and one based on word histograms is purely in the second, event set determination, phase; the statistics of histogram generation and analysis are identical and can be performed by the same code. the question of the comparative accuracy of these methods can be judged empirically.

the burrows methods [burrows, , burrows, ] for authorship attribution can be described in similar terms. after the document is canonicized, the document is partitioned into word-events. of the words, most words (except for a chosen few function words) are eliminated. the remaining word-events are collected in a histogram, and compared statistically via principal component analysis (pca) to similar histograms collected from anchor documents. (the difference between the and methods is simply in the nature of the statistics performed.) even wellman’s “toutch” method can be so described; after canonicization, the event set of words is compiled, specifically, the number of words spelled “toutch.” if this set is non-empty, the document’s author is determined.

this framework also allows researchers both to focus on the important differences between methods and to mix and match techniques to achieve the best practical results. for example, [juola and baayen, ] describes two techniques based on cross-entropy that differ only in their event models (words vs. letters). presumably, the technique would also generalize to other event models (function words, morphemes, parts of speech), and similarly other inference techniques would work on a variety of event models.
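the point that letter-based and word-based analyses differ only in the event-set phase can be made concrete with a small sketch. the names below (EventModelSketch, LETTERS, WORDS, histogram) are invented for illustration and are not taken from jgaap; the histogram code is shared, and only the event model passed to it changes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

class EventModelSketch {
    // Event model 1: every letter is an event (non-letters are dropped).
    static final Function<String, List<String>> LETTERS = text -> {
        List<String> events = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                events.add(String.valueOf(c));
            }
        }
        return events;
    };

    // Event model 2: every whitespace-delimited token is an event.
    static final Function<String, List<String>> WORDS = text -> {
        List<String> events = new ArrayList<>();
        for (String w : text.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                events.add(w);
            }
        }
        return events;
    };

    // Shared histogram code: identical for every event model.
    static Map<String, Integer> histogram(Function<String, List<String>> model, String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String event : model.apply(text)) {
            counts.merge(event, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "the cat the";
        System.out.println(histogram(WORDS, text));   // word events
        System.out.println(histogram(LETTERS, text)); // letter events
    }
}
```

a function-word model in the style of the burrows methods could be added as a third Function that keeps only tokens found in a chosen word list, again without touching the histogram code.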
it is to be hoped that from this separation, researchers can identify the best inference techniques and the best models in order to assemble a sufficiently powerful and accurate system.

demonstration

the usefulness of this framework can be shown in a newly-developed user-level authorship attribution tool. this tool coordinates and combines (at this writing) several different technical approaches to authorship attribution [burrows, , juola, , burrows, , kukushkina et al., , juola, b, keselj and cercone, ]. written in java, this program combines a simple gui atop the three-phase approach defined above. users are able to select a set of sample documents (with labels for known authors) and a set of testing documents by unknown authors. the three-phase framework described above fits well into the now standard modular software design paradigm using java’s object-oriented framework. each of the individual phases is handled by a separate class/module that can be easily extended to reflect new research developments.

the original jgaap prototype was developed in july, . (footnote: jgaap is the java graphical authorship attribution program; the authors invite suggestions for a better name for future versions.) it served as a proof of concept for automating authorship attribution technologies. unfortunately, the prototype was not developed with extensibility in mind. the architecture used was not clearly defined and the application was not easily modified. these design issues were addressed in the second (current) version of jgaap. nearly all of the original source code was refactored to conform to the new design framework. the new jgaap framework is devised from a strongly object oriented perspective. the core functionality of jgaap is distilled into seven basic operations.
these seven operations include:
• core classes
• document input
• creating events
• document preprocessing
• document scoring
• displaying results
• graphical user interface

the directory structure of the application reflects these operations, making the source code easy to follow and understand.

core classes. as the name implies, the core classes provide the basic framework of the application. by themselves, they provide no application functionality. they are necessary, however, when implementing java interfaces to extend functionality.

document input. the document input module provides methods for importing documents into jgaap. currently, jgaap provides input from local files only, although ongoing improvements to accept documents by remote file transfer or from the web are in the process of being added.

creating events. the events module modifies the input documents prior to scoring. these events specify the means by which the documents are presented to the scoring method. currently, jgaap provides two types of events: letters or words.

document preprocessing. the document preprocessing module provides methods of modifying the documents prior to scoring as detailed above. currently, we have made available the following preprocessing options: removing end punctuation, removing html tags, removing non-letters, removing numerals (and replacing them with a <num> tag), removing spaces, and conversion of documents to lower case.

document scoring. the document scoring module contains methods for document comparison. these methods apply authorship attribution techniques to compare the input documents and provide a quantitative score for each comparison.

displaying results. this module contains implementations that are utilized to display scoring results to the end user. the scoring methods currently output a matrix that contains the result of comparing each unknown document with all documents of known authorship.
code within this module may reformat this information into a visual representation of the matrix. currently, jgaap provides output of the matrix to the console, to file, or via message box.

graphical user interface. this module contains the methods responsible for creating the user interface of jgaap. the user is able to select from a menu of event selection/preprocessing options and of technical inference mechanisms. specifically, we designed a multi-menu, panel-based gui that resembles microsoft software to facilitate ease of use. the menus are clearly marked and set up so the flow of work is fairly linear and maps closely to the phase structure described above. the documents to be analyzed and the pre-processing and methods of the analysis are selected by the user, then that data gets sent to the (computational) “backend”, which returns the results to the gui to be displayed.

there are still a number of substantial issues to address in further versions of jgaap, including improvement of existing factors and the development of new features. first, we are unsatisfied with the saving/loading method currently implemented in jgaap. while it is functional, it relies on absolute path names, so it is not as flexible as we should like, and specifically is restricted only to local files. we would like to add the capability of dynamic path-based “manifest” files in folders of documents. when you load the manifest, you would only have to point jgaap to where the folder of files is located and it would do the rest of the work for you. we also hope to incorporate state-based processing, where the program generates a static list that loads along with the program when it starts a new session. this list would have on it all the documents that have been previously input into the program along with the program saving local copies of the tests.
when a user wishes to analyze documents, he can select which documents he wishes to check from the permanent list, instead of having to load in the documents every time he wishes to analyze them. while it might almost mean a complete re-design of the gui from the ground up, it could drastically improve functionality for users that check the same documents over and over. we would like to get the opinions of the community as a whole, because it might not be all that useful. we also hope to get opinions from the community on how they would like to see the data graphically interpreted and displayed. because this is being developed for the community as a whole, it is important for them to give feedback on how they would like to see the data presented. finally, we wish to add a wizard mode and in-context help files to assist new users of the jgaap program. as more features get added, the complexity of the program will warrant helping the user as much as possible, especially if the idea is to make the program suitable for the general user. parties interested in seeing or using this program, and especially in helping with the necessary feedback, should contact the corresponding author.

design issues

the framework outlined above relies heavily on the java concept of interface. a java interface provides a powerful tool that can be used to create highly extensible application frameworks. conceptually speaking, an interface is a defined set of functions (formally, “methods”) that a piece of code can “implement” using any algorithm desired. this permits other pieces of code to use differing variations of the same interface with no changes, permitting easy updates as new techniques are developed and implemented. within the interfaces directory of the application, there exist five defined interfaces: display, event, input, preprocess, and score. these interfaces specify required methods that must exist in classes that intend to implement the respective interfaces.
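as a rough illustration of this design, the sketch below defines a preprocessing interface, two implementations, and a core method that accepts any implementation. all names and signatures here (PreprocessSketch, process, LowerCaseSketch, NumeralTagSketch, applyAll) are assumptions made for the example; the text does not give the actual method signatures of the jgaap preprocess interface.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical interface: one required method, any algorithm behind it.
interface PreprocessSketch {
    String process(String document);
}

// One implementation: conversion of the document to lower case.
class LowerCaseSketch implements PreprocessSketch {
    public String process(String document) {
        return document.toLowerCase(Locale.ROOT);
    }
}

// Another implementation: replace each run of numerals with a <num> tag.
class NumeralTagSketch implements PreprocessSketch {
    public String process(String document) {
        return document.replaceAll("\\d+", "<num>");
    }
}

// Core code depends only on the interface, so new implementations can be
// added later without changing this class.
class PreprocessCoreSketch {
    static String applyAll(String document, List<PreprocessSketch> steps) {
        String current = document;
        for (PreprocessSketch step : steps) {
            current = step.process(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<PreprocessSketch> steps =
            Arrays.asList(new LowerCaseSketch(), new NumeralTagSketch());
        System.out.println(applyAll("Call 911 NOW", steps));
    }
}
```

the same pattern would apply to the other four interfaces: the core class accepts the interface type as a parameter, and any class supplying the required method can be passed in.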
the classes within the core classes directory contain methods that accept interfaces as parameters. for example, we will assume that a future developer wants to create a new method to display score output. according to the display interface, the new method may implement display if and only if it contains the public void display() method. the core class display contains a public void display(displayinterface display) method. this method accepts an object of type displayinterface as a parameter and calls that object’s public void display() method. conversely, any code with a public void display() method can be called as a display interface, so a technically sophisticated user who wants to see dendrograms as output need only write a single function, one that takes the matrix results from document scoring and computes (and displays) an appropriate dendrogram. this function can be added on the fly to the jgaap program and further can be re-used by others, irrespective of the different choices they may have made about the documents, the event model, or the statistics.

similarly, preprocessing can be handled by separate instantiations and subclasses. even data input and output can be modularized and separated. as written, the program only reads files from a local disk, but a relatively easy modification (in progress) would allow files to be read from the network (for instance, web pages from a site such as project gutenberg or literature.org).

discussion and future work

from initial impressions, this tool is usable and fulfills part of the need of non-technical researchers interested in authorship attribution. on the other hand, this tool is clearly a “research-quality” prototype, and additional work will be needed to implement a wide variety of methods, to determine and implement additional features, and to establish a sufficiently user-friendly interface. even questions such as the preferred method of output – dendrograms? mds subspace projections?
fixed attribution assignments as in the present system? – are in theory open to discussion and revision. it is hoped that the input of research and user groups such as the present meeting will help guide this development.

most importantly, the availability of this tool (which we hope will spur additional research by the interested but computationally unsophisticated) should also spur discussion of the role to be played by commercial, off the shelf (cots) attribution software. as discussed in depth by [rudman, ], authorship attribution is a very nuanced process when properly done. ideally, as rudman’s law puts it, the closest text to the holograph should be found and used. the editor’s pen, the typist’s fingers, and the printer’s press can all introduce errors – and when a document exists only in physical or image form, the errors introduced by an ocr process [juola, b] can entirely invalidate the results. only if all of the analytic and control texts are valid can the results be trusted. this includes not only issues of authenticity, but also of representativeness – if an author’s style changes over time [juola, a, juola, ] a work from outside the period of study will be unrepresentative and may poison the analytic well. similarly, texts with extensive quotation may be more representative of the quoted sources than of the official author. texts from the internet in particular may well be regarded with suspicion due to the poor quality control of internet publishing in general. only once a suitable test suite has been developed can the computational analysis truly proceed, but even here, there are possible pitfalls.

the analyst should also be aware of some of the issues introduced by the computational tool. for example, jgaap uses a fairly simple (and naive) definition of a “word”: a maximal non-blank string of characters. this means that some items may be treated as multiple words (“new” “york” “city”) while others are treated as a single word (“non-blank”).
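the consequences of this definition are easy to see in a few lines of java. the sketch below assumes that “blank” means java regex whitespace; the class and method names are hypothetical and the code is not taken from jgaap.

```java
import java.util.ArrayList;
import java.util.List;

class NaiveWordSketch {
    // A "word" is a maximal run of non-whitespace characters, so hyphens
    // and punctuation stay attached to their neighbours.
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.split("\\s+")) {
            if (!w.isEmpty()) { // split can yield a leading empty string
                out.add(w);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("new york city").size()); // prints 3
        System.out.println(words("non-blank").size());     // prints 1
    }
}
```

under this definition, “new york city” yields three word-events while “non-blank” yields one.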
an analysis based on part-of-speech types [juola and baayen, ] will depend upon the accuracy of the pos tagger as well as on its tag set. such subtle distinctions will almost certainly have an effect in some analyses and be entirely irrelevant in others. the computer, of course, is blissfully ignorant of such nuances and will happily analyze the most appalling garbage imaginable. a researcher who accepts such garbage as accurate (garbage in, gospel out) may be said to deserve the consequences. but a lawyer’s client wrongly convicted on such weak evidence deserves better.

have we, then, made a faustian bargain in creating such a “plug and play” authorship attribution system? we hope not. the benefits from the wide availability of a tool to the reasoned and cautious researchers who will benefit from it should outweigh the harm caused by misuse in the hands of the injudicious. it is, however, appropriate to consider what sort of safeguards might be created and to what extent the program itself may be able to incorporate and to enforce automatically these safeguards.

from a broader perspective, this program provides a uniform framework under which competing theories of authorship attribution can both be compared and combined (to their hopefully mutual benefit). it also forms the basis of a simple user-friendly tool to allow users without special training to apply technologies for authorship attribution and to take advantage of new developments and methods as they become available. from a standpoint of practical epistemology, the existence of this tool should provide a starting point for improving the quality of authorship attribution as a forensic examination – by allowing the widespread use of the technology, and at the same time providing an easy method for testing and evaluating different approaches to determine the necessary empirical validation and limitations.

references

[baayen et al., ] baayen, r. h., van halteren, h., neijt, a., and tweedie, f.
( ). an experiment in authorship attribution. in proceedings of jadt , pages – , st. malo. université de rennes.
[burrows, ] burrows, j. ( ). questions of authorship: attribution and beyond. computers and the humanities, ( ): – .
[burrows, ] burrows, j. f. ( ). ‘an ocean where each kind...’: statistical analysis and some major determinants of literary style. computers and the humanities, ( - ): – .
[easson, ] easson, g. ( ). the linguistic implications of shibboleths. in annual meeting of the canadian linguistics association, toronto, canada.
[farringdon, ] farringdon, j. m. ( ). analyzing for authorship: a guide to the cusum technique. university of wales press, cardiff.
[holmes, ] holmes, d. i. ( ). authorship attribution. computers and the humanities, ( ): – .
[holmes, ] holmes, d. i. ( ). the evolution of stylometry in humanities computing. literary and linguistic computing, ( ): – .
[juola, ] juola, p. ( ). what can we do with small corpora? document categorization via cross-entropy. in proceedings of an interdisciplinary workshop on similarity and categorization, edinburgh, uk. department of artificial intelligence, university of edinburgh.
[juola, a] juola, p. ( a). becoming jack london. in proceedings of qualico- , athens, ga.
[juola, b] juola, p. ( b). the time course of language change. computers and the humanities, ( ): – .
[juola, a] juola, p. ( a). ad-hoc authorship attribution competition. in proc. joint international conference of the association for literary and linguistic computing and the association for computers and the humanities (allc/ach ), göteborg, sweden.
[juola, b] juola, p. ( b). on composership attribution. in proc. joint international conference of the association for literary and linguistic computing and the association for computers and the humanities (allc/ach ), göteborg, sweden.
[juola, ] juola, p. ( ). becoming jack london. journal of quantitative linguistics.
[juola and baayen, ] juola, p. and baayen, h. ( ).
a controlled-corpus experiment in authorship attribution by cross-entropy. in proceedings of ach/allc- , athens, ga.
[juola and baayen, ] juola, p. and baayen, h. ( ). a controlled-corpus experiment in authorship attribution by cross-entropy. literary and linguistic computing, : – .
[keselj and cercone, ] keselj, v. and cercone, n. ( ). cng method with weighted voting. in juola, p., editor, ad-hoc authorship attribution contest. ach/allc .
[koppel and schler, ] koppel, m. and schler, j. ( ). ad-hoc authorship attribution competition approach outline. in juola, p., editor, ad-hoc authorship attribution contest. ach/allc .
[kukushkina et al., ] kukushkina, o. v., polikarpov, a. a., and khmelev, d. v. ( ). using literal and grammatical statistics for authorship attribution. problemy peredachi informatsii, ( ): – . translated in “problems of information transmission,” pp. – .
[kučera and francis, ] kučera, h. and francis, w. n. ( ). computational analysis of present-day american english. brown university press, providence.
[ladefoged, ] ladefoged, p. ( ). a course in phonetics. harcourt brace jovanovich, inc., fort worth, rd edition.
[mendenhall, ] mendenhall, t. c. ( ). the characteristic curves of composition. science, ix: – .
[nerbonne, ] nerbonne, j. ( ). the data deluge. in proc. joint international conference of the association for literary and linguistic computing and the association for computers and the humanities (allc/ach ), göteborg, sweden. to appear in literary and linguistic computing.
[rudman, ] rudman, j. ( ). on determining a valid text for non-traditional authorship attribution studies: editing, unediting, and de-editing. in proc. joint international conference of the association for computers and the humanities and the association for literary and linguistic computing (ach/allc ), athens, ga.
[stinson, ] stinson, d. r. ( ). cryptography: theory and practice. chapman & hall/crc, boca raton, nd edition.
[van halteren et al., ] van halteren, h., baayen, r. h., tweedie, f., haverkort, m., and neijt, a. ( ). new machine learning methods demonstrate the existence of a human stylome. journal of quantitative linguistics, ( ): – .
[wellman, ] wellman, f. l. ( ). the art of cross-examination. macmillan, new york, th edition.

summary findings of neh digital humanities start-up grants ( – )
september
www.neh.gov/odh

table of contents
acknowledgements
introduction
start-up awards by project director’s discipline
start-up applicants from universities
start-up applicants from non-universities
start-up funding by year
map of awarded start-up grants
summary findings, –
outcomes
unanticipated problems
implications
future plans
conclusion
attachments
a. list of websites
b. software or tools
c. blogs/media/press
d. publications
e.
exhibits, workshops, and conferences

acknowledgements

first, a big thank-you to the staff of the office of digital humanities who read many hundreds of draft proposals, chaired the bulk of the panels, championed projects, and made this program a success: michael hall, jason rhody, and jennifer serventi. also, thanks to the many guest program officers from other neh divisions and offices who volunteered their time chairing start-up panels: barbara ashbrook, julie goldsmith, karen mittelman, julia nguyen, tom phelps, danielle shapiro, stefanie walker, david weinstein, and joel wurl. a special thanks to fred winter, who wrote the original guidelines for the start-up grant program and jerri shepherd, from the neh’s office of grant management, who awarded, processed, and managed all the successful projects. the front cover image was created using jonathan feinberg’s wordle software (http://www.wordle.net/). it is a word cloud taken from the abstracts of all the funded start-up grants. lastly, thanks go to kathy toavs, neh management and program analyst, who surveyed all our grantees from – and wrote the summary findings that form the bulk of this report.

introduction

on march , , the neh invited a number of scholars to our offices in washington to help us brainstorm on how the agency might best help the field when it came to the digital humanities. we dubbed this meeting the “digital humanities mini-conference” and had a day-long conversation with a number of neh-funded scholars who had done pioneering work in digital scholarship.
in attendance were john unsworth, michael mcrobbie, david bodenhamer, bernard frischer, janet murray, ken price, worthy martin, vernon burton, and tom scheinfeldt. lisa spiro, clifford lynch, and roy rosenzweig sent us suggestions via e-mail but were unable to attend in person.

the group discussed the increasing impact of technology across all humanities disciplines. several themes seemed to emerge from the conversation. one important point was that the stuff that humanists study (books, newspapers, music, images) was increasingly becoming available in digital form. this increased access had many advantages but also had implicit dangers. to paraphrase historian roy rosenzweig, we seemed to be headed for an age of abundance, as literally millions of pages of materials were being put on the web, forever changing the methods of scholarship.

this led to the second point, which is the importance of “digital humanists” in the overall humanities landscape. these are people who are comfortable both in the humanities disciplines and in the disciplines of library and information science, computer science, and other technical areas, and who can help to build the humanities archives, libraries, and research tools necessary for the field. there was a feeling that the well-established system of humanities graduate training wasn’t currently emphasizing this new breed of scholar nor recognizing how important they would be over the coming years. nor did the entrenched promotion and tenure system reward scholars who worked collaboratively with others outside of their discipline on projects that were heavily technology focused.
the group suggested that the neh might be well-suited to use our imprimatur to help move the humanities forward in this regard and to start a much more sustained conversation around the topic of digital humanities and the importance of building what the acls (american council of learned societies; http://www.acls.org/programs/default.aspx?id= ) would later refer to as a humanities “cyberinfrastructure;” that is, the technology tools, standards, best practices and, most importantly, people and organizations capable of guiding the humanities through the digital era. across campus, we have already seen how technology has greatly changed the way scientists do their work. it has not simply allowed for faster or more efficient research; rather, it has allowed for research that could not take place before. the humanities needed to play a role in building its own technology infrastructure and, the group argued, the neh could play a role in making that happen.

one specific suggestion the group had was for an neh grant program that funded innovative new methods but that cut across our traditional divisions. the group noted that technology was breaking down walls between research, education, public programs, preservation and access. a digital edition, for example, may well contain a public programming element, be assigned in a classroom setting, be used by researchers, and also provide better access to materials. so where does it fall? the group felt that a grant program that focused on the digital scholarship and cut across the traditional program boundaries might be an excellent way to spur innovative work.

immediately after the conference, the neh formed a digital humanities working group to discuss ways in which we might address the issues raised at the mini conference. this group included representatives from across the endowment. in just a matter of weeks, this working group put together the guidelines for the digital humanities start-up grant program which was officially announced to the public in the summer of .

the start-up grant program had a number of interesting features. it would be cross-cutting; that is, the work proposed could involve aspects of research, education, public programming, preservation, or access. the work could also focus on new methods, on specific humanities content, or a combination of both. this methodological focus proved to be a key, as many digital humanities projects focus on developing the underlying methods of scholarship. the hope was to give projects an opportunity to develop these new methods, tools, and technologies so that, down the road, they could be used in a wide variety of humanities settings, e.g. in a research project, in an education project, or in a museum or other public venue.

another interesting feature of the start-up grant program was a focus on innovation, future potential, and “high risk/high reward.” like a basic research grant program in the sciences, the guidelines were designed to encourage applicants to propose innovative projects that had long-term potential but in the short term needed funds to do preliminary work, to test out ideas, to develop prototypes, to get their planning in order, and perform other tasks necessary for the successful implementation of a digital project.

we recognized the fact that digital projects can be expensive – not necessarily because of the technology, per se, but rather because of the people involved. unlike the stereotypical single-authored monograph project in the humanities, digital projects are almost always collaborative.
the best projects bring together people from multiple specialties, including scholars, librarians, information scientists, computer scientists, museum professionals, and others. one hope was that this start-up program would be an opportunity for the team to use the modest start-up funds to test out some ideas, bring the right team together, meet with other scholars, and basically do the legwork that would later put them in better position to win a larger award from another neh grant program (or, for that matter, from another funder). the program also encouraged projects that studied the impact of technology, both on our culture as well as on the practices of the academic humanities itself. due in part to the important involvement of libraries in digital humanities projects, our colleagues at the institute of museum and library services agreed to contribute some funds to the start-up grant program. (the chart on page breaks down each agency’s contribution). since the announcement in , the “sug” program, as it is fondly known, has quickly grown into one of the most popular programs at the neh. the sug program has two deadlines per year. while having two deadlines is more work for staff, it enables applicants to hear back quickly and gives them time to revise and resubmit their application to the next deadline. this is our attempt at keeping up with “internet speed.” all sug awardees, as of , are required to submit an end-of-grant “white paper” which is posted on the neh’s own web site. this white paper, freely shared with the public, is an opportunity for these projects to share their lessons learned with their colleagues and the general public. building an infrastructure is not a solitary task; our white paper library of funded projects is becoming a valuable resource for the field. at the conclusion of the fourth year of the sug program, we have received a total of applications and made awards (meaning a very competitive % funding ratio). 
over that time, we brought in peer reviewers to evaluate the applications. it is important to note that we’ve rarely had a peer reviewer serve more than once, as interest in serving on a digital humanities panel continues to grow. in the pages immediately following, we have put together some charts demonstrating what kinds of institutions are submitting these applications.

the bulk of this summary report reflects work done by the neh’s kathy toavs, who got in touch with of the project directors from the first two years of the program ( and ). we chose just the first two years because we wanted to talk to project directors who had concluded their work to find out more about outcomes. kathy provides an overview of her research, including a thorough discussion of the many publications, conferences, web sites, and software tools that emerged from the first two years of the sug program. she also asked the project directors for their feedback on the program and provides an excellent summary of their thoughts.

on the whole, we have been delighted with the direction of the sug program and very encouraged about the fact that many of the projects have not only produced excellent results but also used the grant as a stepping-stone to further funding. we have seen many examples of this. recently, the acls announced the winners of their prestigious digital innovation fellowships and we were pleased to see that three of the five awardees were former sug projects. we’ve seen other projects graduate from the sug program and move on to major funding at other agencies like the nsf and private funders like the andrew mellon foundation. other sug projects have moved on to larger grants in neh programs offered by other offices and divisions.
also useful to hear were the project directors’ thoughts on the impact of the sug program on their careers and on the field in a larger sense. many of the project directors quoted in this report make mention of the importance of the neh imprimatur for their careers in the nascent digital humanities field.

in a bit of late-breaking news, i was quite surprised and encouraged to see a front-page story in the new york times on august , entitled “scholars test web alternative to peer review” (http://www.nytimes.com/ / / /arts/ peer.html) that focused on two of the start-up grant projects. many start-up grants receive media coverage, of course (see attachment c), but the fact that this piece peaked as the number one most-emailed article on the times’ website seems to demonstrate wide interest. while small, we feel these grants have had an impact larger than their budgets might suggest and we look forward to watching them continue to develop over the coming years.

brett bobley
chief information officer
director, office of digital humanities
national endowment for the humanities
bbobley@neh.gov

acls digital innovation fellowships: http://www.acls.org/research/digital.aspx?id=

digital humanities start-up grant awards sorted by project director discipline

american history ( ) hd- - university of central florida, orlando project director: lori walters come back to the fair hd- - university of virginia project director: scot french and bill ferster jefferson's travels: a digital journey using the historybrowser hd- - university of richmond project director: andrew torget visualizing the past: tools and techniques for understanding historical processes hd- - connecticut humanities council project director: bruce fraser connecticut's heritage echosystem: resolving the challenges to interoperability across disparate digital
repositories hd- - university of illinois project director: s. edelson the cartography of american colonization database project hd- - university of california, riverside project director: steven hackel the early california cultural atlas hd- - kansas state university project director: bonnie lynn-sherow lost kansas: recovering the legacy of kansas places and people hd- - marist college project director: ron coleman a digital pathfinder for historic sites hd- - western reserve historical society project director: edward pershey (ai) artificially intelligent artifact interpreter hd- - university of richmond project director: edward ayers landscapes of the american past: visualizing emancipation hd- - bank street college of education project director: bernadette anand civil rights movement remix (crm-remix) hd- - university of maryland, college park project director: david lester mith api workshop hd- - montana preservation alliance project director: kathryn hampton the touchstone project: saving and sharing montana's community heritage hd- - university of california, riverside project director: steven hackel the early california cultural atlas american literature ( ) hd- - university of texas, austin summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ project director: samuel baker the ecommentary machine project hd- - hofstra university project director: john bryant melville, revision, and collaborative editing: toward a critical archive hd- - american association for state and local history project director: matthew gibson online encyclopedia best practices and standards hd- - university of nebraska, lincoln project director: andrew jewell the crowded page hd- - suny research foundation, college at purchase project director: m. 
jon rubin internationalizing humanities education through globally networked learning hd- - cuny research foundation, nyc college of technology project director: matthew gold looking for whitman: the poetry of place in the life and work of walt whitman hd- - electronic literature organization project director: joseph tabbi electronic literature directory: collaborative knowledge management for the literary humanities hd- - university of maryland, college park project director: tanya clement professionalization in digital humanities centers american studies ( ) hd- - duke university project director: matthew cohen interface development for static multimedia documents hd- - lake forest college project director: davis schneiderman virtual burnham initiative ancient literature ( ) hd- - university of california, berkeley project director: niek veldhuis berkeley prosopography services: building research communities and restoring ancient communities through digital tools anthropology ( ) hd- - washington state university project director: kimberly christen mukurtu: an indigenous archive and publishing tool hd- - sweet briar college project director: lynn rainville african-american families database: community formation in albemarle county, virginia, - hd- - lewis and clark college project director: oren kosansky intellectual property and international collaboration in the digital humanities: the moroccan jewish community archives archaeology ( ) summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ hd- - cuny research foundation, brooklyn college project director: h. 
arthur bankoff cuneiform forensics - d digital analysis of cuneiform tablet production hd- - university of pennsylvania project director: david romano digital corinth synchronized database project hd- - mississippi state university project director: paul jacobs distributed archives transaction system hd- - university of chicago project director: nadine moeller digital documentation of a provincial town in ancient egypt hd- - state of vermont division for historic preservation project director: giovanna peebles creating a sense of place through archeology: moving archeology from deep storage into the public eye through the internet hd- - michigan state university project director: ethan watrall red land/black land: teaching ancient egyptian history through game-based learning architecture ( ) hd- - university of new mexico project director: jennifer von schwerin digital documentation and reconstruction of an ancient maya temple and prototype design of internet gis database of maya arch hd- - university of california, los angeles project director: lisa snyder software interface for real-time exploration of three-dimensional computer models of historic urban environments hd- - university of new mexico project director: jennifer von schwerin digital documentation and reconstruction of an ancient maya temple and prototype of internet gis database of maya architectur hd- - university of georgia project director: stefaan van liefferinge ai for architectural discourse archival management and conservation ( ) hd- - northeast historic film project director: karan sheldon finding and using moving images in context hd- - city of philadelphia, department of records project director: joan decker historic overlays on smart phones art history and criticism ( ) hd- - coastal carolina university project director: arne flaten ashes art: virtual reconstructions of ancient monuments hd- - old north foundation of boston, inc. 
project director: laura northridge tories, timid, or true blue? hd- - unaffiliated independent scholar project director: amy gansell summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ identifying regional design templates of ancient near eastern ivory sculptures of women using computer technology hd- - alexandria archive institute project director: nada shabout the open modern art collection of iraq: web tools for documenting, sharing and enriching iraqi artistic expressions hd- - unaffiliated independent scholar project director: paul kaiser spatialising photographic archives hd- - university of california, san diego project director: lev manovich interactive visualization of media collections for humanities research british literature ( ) hd- - university of california, berkeley project director: alan nelson records of early english drama: digital innovations for enhanced access hd- - new york university project director: robert squillace simonides: a student-centered humanities learning tool hd- - drew university project director: martin foys digital mappaemundi: a resource for the study of medieval maps and geographic texts hd- - early manuscripts electronic library project director: adrian wisnicki the nyangwe diary of david livingstone: restoring the text classics ( ) hd- - university of virginia project director: bernard frischer new digital tools for restoring polychromy to d digital models of sculpture hd- - university of virginia project director: david koller supercomputing for digitized d models of cultural heritage composition and rhetoric ( ) hd- - hope college project director: christian spielvogel living in the valley of the shadow: the creation of a web-based, role-playing simulation on the civil war hd- - michigan state university project director: william hart-davidson archive . 
: imagining the michigan state university israelite samaritan scroll collection dance history and criticism ( ) hd- - university of virginia project director: bradford bennett artefact movement thesaurus education ( ) hd- - university of nebraska, lincoln project director: brian pytlik zillig summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ evince visualization and analysis tool hd- - wayne state university project director: nardina mein the digital learning and development environment hd- - apprend foundation project director: laurel sneed crafting freedom along nc :discovering hidden history with mobile technology hd- - kent state university main campus project director: mark van't hooft the geohistorian project hd- - center for civic education project director: kaavya krishna project citizen casebase: strengthening youth voices in an open-source democracy hd- - publicvr project director: jeffrey jacobson egyptian ceremony in the virtual temple- avatars for virtual heritage english ( ) hd- - loyola university, chicago project director: peter shillingsburg humanities research infrastructure and tools (hrit): an environment for collaborative textual scholarship hd- - university of southern california project director: bruce smith and katherine rowe the cambridge world shakespeare encyclopedia: an international digital resource for study, teaching, and research hd- - cuny research foundation, nyc college of technology project director: matthew gold looking for whitman: the poetry of place in the life and work of walt whitman - level hd- - university of south carolina research foundation project director: george williams braillesc.org far eastern history ( ) hd- - university of california, santa cruz project director: alan christy eternal flames: living memories of the pacific war film history and criticism ( ) hd- - university of chicago project director: yuri tsivian 
cinemetrics, a digital laboratory for film studies. folklore/folklife ( ) hd- - university of kentucky research foundation project director: jeanmarie rouhier-willoughby russian folk religious imagination hd- - piedmont folk legacies, inc. project director: greg adams vernacular music material culture in space and time summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ french language ( ) hd- - old dominion university research foundation project director: betty facer the impact of academic podcasting: emerging technologies in the foreign language classroom hd- - university of chicago project director: robert morrissey dictionnaire vivant de la langue francaise (dvlf): expanding the french dictionary geography ( ) hd- - kohala center project director: karen kemp hawaii island digital collaboratory history ( ) hd- - university of california, irvine project director: patricia seed the development of mapping: portuguese cartography and coastal africa - hd- - eldridge street project, inc./museum at eldridge street project director: hanna griff-sleven illuminating the immigrant experience: level i digital start-up grant hd- - university of nebraska, board of regents project director: william seefeldt sustaining digital history hd- - george mason university project director: dan cohen scholar press hd- - university of north texas project director: andrew torget mapping historical texts: combining text-mining & geo-visualization to unlock the research potential of historical newspapers history and philosophy of science, technology, and medicine ( ) hd- - indiana university, bloomington project director: colin allen inpho: the indiana philosophy ontology project history of religion ( ) hd- - george mason university project director: sharon leon crowdsourcing documentary transcription: an open source tool humanities ( ) hd- - maine humanities council project director: brita zitin 
podcasting and the maine humanities council: integrating a new tool for public humanities education hd- - university of arizona project director: davison koenig virtual vault hd- - brown university summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ project director: julia flanders encoding names for contextual exploration in digital thematic research collections hd- - university of north carolina, chapel hill project director: natalia smith image to xml (img xml) hd- - itasca community college project director: timothy powell gibagadinamaagoom: an ojibwe digital archive hd- - university of north carolina, chapel hill project director: natalia smith main street, carolina: uncovering and reclaiming the history of downtown hd- - brown university project director: andrew ashton semantically rich tools for text exploration hd- - georgia tech research corporation project director: douglas (fox) harrell gesture, rhetoric, and digital storytelling hd- - brown university project director: julia flanders a journal-driven bibliography of digital humanities hd- - illinois state university, milner library project director: cheryl ball building a better back-end: editor, author, & reader tools for scholarly multimedia hd- - dartmouth college project director: mikhail gronas mapping the history of knowledge: text-based tools and algorithms for tracking the development of concepts hd- - st. 
louis university project director: james ginther the t-pen tool: sustainability and quality control in encoding handwritten texts interdisciplinary ( ) hd- - unaffiliated independent scholar project director: michael newton building information visualization into next-generation digital humanities collaboratories hd- - university of virginia project director: worthy martin presenting progressions hd- - texas a & m research foundation project director: wei yan high dynamic range imaging for preserving chromaticity information of architectural heritage hd- - emerson college project director: eric gordon the digital lyceum: emerging frameworks for participation in live humanities events hd- - indiana university, indianapolis project director: david bodenhamer conceptualizing humanities gis: an expert planning workshop on religion in the atlantic world hd- - unaffiliated independent scholar project director: daniel visel sophie search gateway summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ hd- - wheaton college project director: mark leblanc pattern recognition through computational stylistics: old english and beyond hd- - university of massachusetts, boston project director: joanne riley online social networking for the humanities: the massachusetts studies network prototype hd- - unaffiliated independent scholar project director: bob stein where minds meet: new architectures for the study of history and music hd- - university of maryland, college park project director: douglas reside electronic broadway project hd- - ohio state university research foundation project director: h. 
lewis ulman and melanie schlosser reliable witnesses: integrating multimedia, distributed electronic textual editions into library collections hd- - university of maryland, college park project director: jennifer golbeck visualizing archival collections hd- - plymouth state university project director: casey bisson scriblio mu hd- - center for independent documentary project director: michael epstein murder at harvard mobile hd- - california state university, dominguez hills foundation project director: vivian price new approaches: tradeswomen archive project (tap) hd- - indiana university, bloomington project director: christopher raphael optical music recognition on the international music score library project italian literature ( ) hd- - university of oregon, eugene project director: massimo lollini oregon petrarch open book journalism ( ) hd- - loyola college in maryland project director: elliott king the journalism history hub: developing a research-based interdisciplinary social network and meta-conference languages ( ) hd- - university of alaska, fairbanks project director: siri tuttle minto songs library science, archival management, and conservation ( ) hd- - syracuse university project director: anne diekema enhanced access to digital humanities monographs hd- - drexel university summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ project director: robert allen automatic extraction of article metadata from digitized historical newspapers hd- - kent state university main campus project director: michael kreyche a bilingual digital list of subject headings hd- - new york university project director: brian hoffman mediacommons: social networking tools for digital scholarly communication hd- - willamette university project director: michael spalti bridging the gap: connecting authors to museum and archival collections hd- - university of virginia project director: 
bethany nowviskie and adam soroka neatline: facilitating geospatial and temporal interpretation of archival collections hd- - university of massachusetts, amherst project director: james allan ocronym: entity extraction and retrieval for scanned books hd- - university of nebraska, lincoln project director: katherine walter centernet: cyberinfrastructure for the digital humanities hd- - university of washington project director: ann lally collecting online music project hd- - columbia university project director: haimonti dutta leveraging "the wisdom of the crowds" for efficient tagging and retrieval of documents from the historic newspaper archive hd- - boston university project director: jack ammerman evolutionary subject tagging in the humanities linguistics ( ) hd- - unaffiliated independent scholar project director: richard cook the character description language (cdl) digital humanities start-up hd- - university of montana project director: mizuki miyashita computer-based data processing and management for blackfoot phonetics and phonology literature ( ) hd- - university of maryland, college park project director: douglas reside digital tools hd- - university of maryland, college park project director: matthew kirschenbaum approaches to managing and collecting born-digital literary materials for scholarly use hd- - university of south carolina research foundation project director: randall cream the sapheos project: transparency in multi-image collation, analysis, and representation hd- - pennsylvania state university, main campus project director: jacqueline reid-walsh summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ learning as playing: an animated, interactive archive of th- th century narrative media for and by children hd- - university of arizona project director: hale thomas-hilburn poetry audio/video library phase media-general ( ) hd- - university of virginia 
project director: johanna drucker artists' books online: from prototype to distributed community hd- - dartmouth college project director: mary flanagan digital humanities start up grant: metadata games -- an open source electronic game for archival data systems hd- - lower eastside girls club of new york project director: dave pentecost the lower eastside girls club girl/hood project medieval studies ( ) hd- - university of kentucky research foundation project director: abigail firey carolingian canon law project: a collaborative initiative hd- - john woodman higgins armory museum, inc. project director: jeffery forgeng virtual joust: a technological interpretation of medieval jousting and its culture. music history and criticism ( ) hd- - north carolina central university project director: paula harrell training to establish the north carolina central university/african american jazz caucus jazz research institute digital lib. hd- - university of texas, austin project director: robert freeman utunes: music . hd- - haverford college project director: richard freedman the chansonniers of nicholas du chemin ( - ): a digital forum for renaissance music books hd- - american university project director: fernando benadon the map of jazz musicians: an online interactive tool for navigating jazz history's interpersonal network. nonwestern religion ( ) hd- - university of california, riverside project director: justin mcdaniel digital humanities start-up grants: thai digital monastery project religion ( ) hd- - claremont mckenna college project director: daniel michon virtual taxila: a web-accessible, multi-user virtual environment (muve) of an ancient indian city summary findings of neh digital humanities start-up grants ( - ) ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ slavic languages ( ) hd- - university of georgia research foundation, inc. 
project director: victoria hasko telecollaborative webcasting: strengthening acquisition of humanities content knowledge through foreign language education spanish literature ( ) hd- - duke university project director: margaret greer manos teatrales: cyber-paleography and a virtual world of spanish golden age theater theater history and criticism ( ) hd- - university of maryland, college park project director: douglas reside camp: the collaborative ajax-based modeling platform hd- - university of california, san diego project director: emily roxworthy drama in the delta: digitally reenacting civil rights performances at arkansas' wartime camps for japanese americans hd- - buffalo and erie county public library project director: anne conable "re-collecting the depression and new deal as a civic resource in hard times" hd- - university of california, san diego project director: emily roxworthy drama in the delta: digitally reenacting civil rights performances at arkansas' wartime camps for japanese americans

[chart: types of universities that applied to sug program. sug applicants (universities) by category: associate's colleges; bac. colleges--general; bac. colleges--liberal arts; bac./associate's colleges; doctoral/research uni--extensive; doctoral/research uni--intensive; master's colleges & uni i; master's colleges & univ ii; other specialized institutions; schools of engineering & tech; teachers colleges]

[chart: types of non-university applicants to sug program. sug applicants (non-university) by category: archives; art museum; arts related organizations; center for advanced study/research institute; community-level organization; educational consortium; general museum; historic preservation organization; historical site/house; historical society; history museum; independent production company; independent research library; indian tribal organization; libraries; museums; national organization; non-profit educational center; philanthropic foundation; professional association; professional school; publishing; school district; science and technology museum; state humanities council; state/local/federal government; television/station]

[chart: start-up grant funding by year, showing imls and neh contributions]

[map: all awarded sug projects]

summary findings from completed start-up grants ( – )

kathy a. toavs
management and program analyst
national endowment for the humanities

the national endowment for the humanities digital humanities start-up grant program offers relatively small planning grants that encourage innovations in the digital humanities.
the first applications for this program were accepted beginning in november , and the first grants were awarded in february . a two-year study was initiated by the office of digital humanities to assess the effectiveness of the program. to accomplish this, a survey of six questions was sent to the project directors who received start-up grants in and . the request was initially made on august , with follow-ups on september and september , . the survey questions were:

1) what is the current status of your project?
2) did your project lead to any of the following:
   a) a project website? (please provide links)
   b) journal articles or other publications? (please provide links)
   c) a museum exhibit or other public program? (please provide link)
   d) software or tool? (please provide links)
   e) a class, workshop, etc? (please provide information)
   f) mentions in the press/blogosphere? (please provide links)
3) has your project continued beyond the start-up phase (if appropriate)?
   a) if yes, tell us how?
   b) did you receive money from any other funders? who? if so, did having an neh sug help you in obtaining further funding?
   c) if you were turned down for further funding, can you give us an idea why? what barriers did you encounter?
   d) if your project has not continued, please tell us why? (e.g. fully complete, no funding, etc)
4) what are your general feelings about your project?
   a) did you accomplish the goals you set out to do?
   b) what lessons did you learn? (e.g. what worked? what would you do differently?)
5) if you have now completed the start-up phase, have you sent the neh a white paper about your project to place in the odh library of funded projects on our website?
6) what are your overall thoughts about the start-up grant program? was it helpful for your work? your career?
answers were provided by ( %) of the project directors, and details varied depending on the stage of the project. those project directors who had already completed their projects ( of the responders) were able to provide more detailed answers regarding the effectiveness and future direction of their work. because the survey was sent so close to the beginning of the academic year, many of the project directors were only able to provide short answers. others were limited because of travel, or felt they were not able to answer all the questions completely because their projects were still ongoing. almost all of the participants agreed that the start-up grants were beneficial and hugely successful. thirty ( %) of the responding project directors have created new websites (see attachment a); another one has expanded an existing site. seven others ( %) responded that they are in the process of developing new sites. twenty-four tools or new software have resulted from these grants; another six are planned or are in testing. for a list of some of these tools/software, see attachment b. at least of the projects have received press, media, or blogosphere coverage (attachment c). one project even received honors in for its podcasting research. over articles or chapters have been published as a result of the start-up grants; at least another are forthcoming or under review. for a list of publications, see attachment d. at least conferences, symposiums, or speeches (attachment e) have already taken place, although this is a conservative estimate. it is reasonable to assume that there are many more, as project directors tended to answer this question with “several” or “many”, and those answers were only counted once in tallying the survey results.
also, at least eighteen classes, workshops, lectures, or podcasts have incorporated some element of the start-up grants; and others are forthcoming. as with the conferences, above, many of the project directors answered this question with “several” or “many”, so this is a rounded, conservative number. thirty-seven of the responders believed that they had achieved the goals as set out in the grants. another eleven, whose grants are ongoing, answered that they had not yet done so. the two remaining project directors did not answer this question, although both of their projects are still in progress. several noted that their goals had changed somewhat in the course of development, mostly for the better. one noted that the project would now be “more useful to users than what we initially thought we would accomplish.” beginning with the competition, grantees were expected to prepare a “lessons learned” white paper. twelve of the directors with grants stated that their papers had already been submitted. interestingly, many of the grantees, while not required to do so, indicated that they were planning to submit a paper as well.

outcomes

almost all of the project directors who responded to the survey thought that their projects had been very successful, and highly praised the neh digital humanities start-up grants program for giving them the impetus they needed to move forward. they emphasized the advantages of these grants, not only to their institutions, but to the larger humanities community.
one director wrote, “this project exceeded our expectations in terms of the positive reactions and involvement of humanities professionals statewide.” some of the kudos for the program included such statements as, “the project fulfilled everything that we hoped that it would…the greatest value was in being able to quickly and effectively perform necessary (but usually hard to come by) preliminary research and networking on historical visualization work that will enable us to develop new software and research techniques for such work. that would simply not have been possible without the grant.” another wrote, “it was a fantastic opportunity to develop a really important educational initiative that will have legacy value for a long time to come.” most were satisfied that their teams were able to work together efficiently and productively. neh grant support for internships was indispensable in some cases. “we were able to successfully complete the project goals within the allotted time (plus a summer extension), in part because we had the wonderful advantage of a highly cooperative library environment…and terrific grad student (supported by the neh funds).” “the primary reason for success in my view was having two graduate students and an undergraduate programmer who combined a passion for the subject material with technical skills in computing, and thus we did not have the problems of communicating between content experts and technology experts that is often a barrier to such projects.” another boasted that, “we accomplished more than we had hoped. people everywhere…tell us they are shocked at the quality of work our students produce.” “the student development team thought outside the box and brought to the project a young peoples’ view of how things should work.” project team members expressed a variety of lessons learned during the grant period. 
one director stated that, “as with most issues in the field of humanities computing, the lessons learned involved technical, organizational and human behavior aspects. over the course of the project year, we learned a great deal about the benefits and challenges of incorporating free, online applications…into the operations of non-profit organizations. we also learned that humanities professionals are very willing to participate in an online network of this sort, when it is tuned to their professional interests and needs. however, it became apparent that many of them need a level of technical training before they feel comfortable with the online tools, and able to participate to the extent they desire.” many of the lessons learned had to do with technical issues, especially in learning how to navigate between the priorities and realities of the humanities scholar and various technical personnel. “on the technical side, we have gained much experience in navigating the sometimes tricky relationship between the highly-skilled consultants…and the rest of the project team. we have found it extremely beneficial to engage “technology translators,” those who have both a grasp on complex technical issues and the importance of practical solutions that help achieve project goals. technologists tend to get caught up in the challenge of the technology itself and need strong guidance on remaining in the “real” world. 
too, we’ve found the need to develop strong project management tools to help focus the technology consultants and keep them on track.” another project director noted that “finding a programmer who is thoroughly comfortable with the humanist inclination to have inspirations or visions along the way is a real asset in the collaboration between the technical and humanistic personnel.” other comments regarding technical issues and lessons learned included the following: “we tried a few methods in the field first and not all of them were successful, we have to improve some of the data collection process.” “i learned that outsourcing some technical work is difficult and it should be done yourself.” “i would ask for technical support for website building.” “it is essential to build infrastructure on a flexible, open-source platform to avoid creating a mere boutique tool; that an in-house programmer can respond more fluidly to the changing needs of a complex process than a vendor; that it is essential for academics and it specialists to communicate fully about the nature of a project’s pedagogical goals and the systemic effects of individual technical choices.” “i learned that a software development project focused on integrating multiple applications requires a person in the middle to be actively engaged in all aspects of code development and testing.” “next time i’d really focus on creating one tool, or even just a set of guidelines: that way, the project could take its time to link together all of the different scholars and projects in the digital humanities world that are working on similar issues and create something even more broadly useful.” “i would build in additional opportunities for crowdsourcing.
much of our content is housed on third-party platforms (google earth), and this seems to be much smarter than developing a database or presentation system that will be obsolete due to technological changes. and yet, i’d like to have even more content filter through third-party channels in partnership relations.” “i would encourage future applicants to look around to find better "off the shelf" solutions before reinventing the wheel.” the need to set goals was critical. “the main lesson i personally learned is that even in a short-term project it is necessary to put intermediate goals in writing.” one director “learned that it’s worth taking the extra time to define your terms as clearly as you can from the outset.” teamwork and cooperation were needed at all stages of the process. one director noted that they had “learned more about the value of teamwork in complex technology projects and got better at working that way across disciplines.” other teamwork related comments included: “we learned that we have to work as a team, not as a set of individuals. once we had a good team ethic, the work really took off.” “weekly meetings of the project team proved vital to progress, review, and development.” “collaborations with a variety of people (technologists, students, scholars) is crucial.” project directors learned that engaging the assistance of others, especially within the organization, had a large influence on how well goals were met. “partnering with museum and library professionals on the project encouraged us to develop the tool with both academic and general audiences in mind. we learned that students can be active collaborators on digital humanities start-up projects and produce exemplary work.
finally, we presented the project at numerous workshops and conferences and recruited several “early adopters” to help in the development of the tool.” one of the directors expressed praise for workshops, “especially one that links technology and content. they allow projects to gain a quick start, but expanding the core is more difficult because other scholars bring different knowledge and potential different directions.” however, some directors learned that other personnel were not always available or willing to meet a project’s needs or deadlines. “i learned that artists are difficult to work with, professional librarians and curators are not. i learned that funding is essential for interns because the workload of current library and museum professionals is so tight and they cannot add new tasks to their jobs.” another lamented that “it was difficult to achieve good “buy in” with other faculty and staff on the project, given the limited funding.” yet another complained that, “if there is a single lesson we have learned it is the need for a clear development structure with a concrete time commitment from the academic project participants. because our project focuses on faculty development in an area unfamiliar to the participating faculty, it was hard to define what we expected of them on a weekly basis and this made scheduling training workshops quite difficult. in the future we plan to support faculty by enrolling them in a training course with a more clearly pre-defined workload.” while setting goals and encouraging teamwork were necessary, a project also needed to plan for contingencies.
this was especially clear when faced with changes in institutional personnel and job reductions: “in the future, i will have contingency and alternate plans for each goal.” a technology-intensive project means that “everything takes longer than we think it should and everything is more difficult than it at first appears.”

unanticipated problems

none of the survey participants cited any failures with their projects, although several of the projects did encounter some unanticipated problems and issues. technical problems included frustration over proprietary issues. owners of data were reluctant to release data or allow permissions. “wide adoption of cdl font technology has been limited by the proprietary nature of the source code.” at least one project team became creative in working around this problem by the “writing of pseudo-agents” to include “data from sources outside the data sets of data owners”. another wrote, “…the goal of involving other holders of digital data into the project has been very slow in developing. though people express agreement that the idea of having a central place to use for collecting information is ideal, all too often there is a reluctance to release data to the use of other scholars and to the public.” other technical issues had to do with service providers (“we ran into technical challenges that arose from the online service provider changing its terms midstream”), and lack of institutional support (“we had trouble finding the technical support we wanted within the university.
it may have been easier for us to start from scratch with a more common set of programming languages and tools than to stick with our original prototype and require a developer to work with that.”) still other problems cited had to do with issues of long-term preservation of complex digital projects, data collection issues, and finding/maintaining qualified/interested encoders/programmers. by far, the biggest problem encountered by project directors had to do with personnel issues, either internally or with outside collaborators. one project director wrote that “personnel issues made the project immensely frustrating, costing me and others immense personal time and psychological energy.” many realized that while their projects required library professionals and other support personnel, that assistance was not always timely or forthcoming. other problems cited included changes in personnel, job reductions, mandates, and time restraints. one director wrote “i spent an unwanted additional amount of time coping with fractious bureaucrats.” a co-director, faced with the economic recession, had to take another position, causing problems in communication between team members and a re-focus on priority issues. still another was not pleased with the role of the advisors to the project. other collaborations did not work out as well as planned (“involving the native american community was harder than i had thought. community members don’t typically attend meetings.”). other difficulties revolved around time and budget issues. some were concerned that the project took more time than was anticipated. eleven ( %) noted that they had been granted extensions by neh odh. “as is often the case, it was a lot more work than we anticipated…doing the work that we outlined as ‘preliminary’ in the grant turned out to be a huge project in itself.” others underestimated the amount of funding needed. 
“what we learned is that the funding (at the initial $ , cap for odh digital start-up grants) was really inadequate for the amount of effort, even with a modest investment in digital tools and technology,” stated one director. another explained that “…since i needed to collaborate with other institutions the high overhead rates cut deeply into the work i wanted done, and curtailed the amount which my team could have accomplished.”

implications

every project director cited benefits gained from the start-up grants. advantages included personal career enhancements, institutional enhancements, and implications for the broader scholarly community.

personal

most project directors thought that the dh start-up grants had been beneficial to them in some way. one project team claimed that the project has been the “finest achievement of our pedagogical careers.” quite a few emphasized how useful the grants were in establishing some form of credibility and legitimacy, as well as enhancing their reputations in the field. “i have credibility in my own eyes and that of the digital community i would not have had without it.” “the grant provided legitimacy to my ideas to a skeptical traditional history department that is now looking to make digital a cornerstone of their nascent public history program.” “i am now recognized as a leader in providing a model for digital scholarship in my academic specialty, and the grant certainly confirmed for my department and college that digital scholarship is alive and well.” “it helps tremendously within our institutions that, though the funds are relatively small, they came from the neh.
the external grant gives credibility to our project within our institution, even in the start-up phase.” “the willingness of neh to invest in our project, based on the recommendations of an independent scholarly review panel, provided us with external validation, so crucial to internal funding decisions and professional advancement.” others cited improvements to job satisfaction (“it’s given my work (and my career, i hope!) a tremendous boost”), promotions and tenure (“this grant was also helpful in my career, forming part of my promotion to full professor dossier”), and opportunities for future research and long-term career trajectories (“it has been extraordinarily useful for me. i’ve entered the world of digital humanities; i got a new “dream” job at uva, i think in part because the grant demonstrated my seriousness with digital humanities work; and i’ve simply learned a lot in doing this work that will benefit my own research”). another stated that “the sug has given me the chance to undertake innovative digital work that i’d be unlikely to undertake on my own.” only one gave a negative response to the question of whether the grant was beneficial to the director’s career. that person stated that the “real challenge confronting academic historians who do digital history is the fact that there is no tangible professional recognition for this work.” institutional benefits cited were not limited to the careers of the project team members. other beneficiaries of the start-up grants included students. 
one director explained that, “the start-up grant program enabled me to begin a digital project on our campus that will not only benefit our students but the "international" classroom.” yet another stated that, “the neh grant was helpful in promoting a new career path for our students, exposing this technology to the university, increasing the skills and knowledge of the grant personnel making them (and myself) more marketable in the workplace.” more impressive still were the comments offered by the director who boasted, “i view this as a legacy project before i retire in a few years that is of great benefit to the state.” similar claims were made by those who exclaimed, “the start-up grant we received was indispensable to our work. it allowed us to attract the necessary interest and support to turn concept into reality and a small program’s dream into a university-wide vision that promises to spread out to many institutions.” citing the importance of neh to their institution’s future, another project director was pleased that, “although our university is late in beginning digital projects, the neh sug was instrumental in providing the first trained university personnel who can now continue work in this area.”

broader scholarly community

the start-up grants allowed for development and testing of new ideas, tools, and software beneficial to the international community. the following are just some of the comments related to broader implications: “this was an excellent opportunity for us to experiment on a small scale and develop procedures and prototypes that could be scaled up later on.” [we] “made significant progress in creating a new field.” “this was a wonderful opportunity to put into practice many of the new media ideas i had played with only theoretically.
i made connections with numerous like-minded colleagues, and thought through enormously enjoyable technical issues.” “this start-up grant has become a model example on our campus for how to start a new interdisciplinary project and get external funding for them…we now have a queue of external scholars who are either directly trying our tools, modifying our software, and/or seek to collaborate so we can design and implement experiments in their area of the corpus.” “i think for the localized purpose of giving the community a way to get together, talk face-to-face, ask questions, debate answers, and come to some consensus on what we all need that can make our processes more efficient was more than well spent time and money.” “as a senior scholar in my field, it has also given me the chance to reach out to other specialists—at research universities and liberal arts colleges alike—in an effort to encourage collaborative work of a sort that is all too often absent in the humanities.” “it was very helpful for envisioning what is possible and for bringing together a group of researchers and technologists, some of whom continue to build out projects from those early ideas.” [this grant] “enabled us to begin important conversations among like-minded scholars scattered widely.”

future plans

one of the survey questions asked if work was going to be continued on the grant once the start-up phase was completed. most of the project directors responded that they have continued, or plan to continue their projects. the grants were helpful in establishing credibility and “demonstrated that other people thought the project worth funding.” other awardees agreed, claiming that “having an neh start-up grant gave our project the imprimatur of a major humanities organization and served as external validation of the methods employed.
it also helped convey the scholarly value of our work to chairs, deans, provosts, and others in a position to support it.” wrote one director, “…without this support our project would not have moved forward.” yet others stated that the grants had been helpful in “demonstrating and promoting” project goals, and served as a “gold seal of approval” for securing further funding. some other examples of how projects have expanded their original goals include:

national and international collaboration

several of the projects intend to expand their projects to include collaboration with other institutions, either to build functionality and content, or to disseminate findings in joint publications and conference presentations. one director claimed that, “further collaborations are also in the offing...we fully expect an international collaboration.” another professed that “we will be working over the next six months to share the product within our professional communities as well as asking some of the questions posed by our experience working across applications and communities of interest.” yet another director stated that, “several important one-of-a-kind projects have been identified to produce after the completion of this project.”

incorporation into established programs

beyond the benefits already cited elsewhere for the university and students, other potential benefits were also mentioned.
one explained that, “this project will be a permanent part of our digital library collections web site and is being incorporated into our permanent program of offerings.” another stated that, “the funding allowed us to complete preliminary work that was essential to establishing the basis for several possible projects, which are currently in the process of being put together for larger and more sustained grant possibilities.” several project directors, while admitting that they have not officially entered a new phase of development, stated that they have already exceeded their initial goals. one director explained that, “essentially we used the start up for much more than just starting up. we used it as implementation as well.” for those projects that have not continued, many of the project directors explained that, while they had an interest in continuing, they were stymied by their inability to procure additional funding. at least eight of the start-up projects were turned down by neh for further funding, some multiple times. only one project was successful in receiving another neh grant after an initial failure. one project director expressed frustration that there is a “disconnect between neh’s desire for digital projects and yet reviewers who still have a traditional model of research.” citing comments from reviews, it seemed to the director that “the greatest single objection, however, seemed to be to the proposition that results (scholarly editions produced by users of our resources and tools) would appear in print format.
the enraged (in some instances) comments suggest that there is an unfortunate hardening of the divisions between scholars who still see value in print and those committed to digital humanities.” other reasons for projects not proceeding included problems with “social engineering” questions, hesitation on the part of granting agencies to fund projects that rely on undergraduate work, and issues of methodology. others felt that there were misunderstandings about the field and the related procedures, and fears that the projects were too ambitious, or the projects “did not sufficiently match the parameters of the granting agency.” “generally, granting agencies seem to be worried about funding projects that rely heavily on undergraduate work. i hope that our project can help to change that, because the product (completely designed and implemented by undergrads) is as professional as anything produced by graduate students or in the professional world.” one suspected that “funding of the boring part of the project (populating and refining a database) will be harder to come by.”

conclusion

all of the project directors who answered the survey had the utmost praise for the neh digital humanities start-up grants program. most responded along the lines of the director who stated, “we feel this is an excellent program: the small scale of both the target projects and the funding application make it easy to try out ideas without committing enormous amounts of time.” another opined, “the startup grants are a great idea. humanities funding is so hard to get, this is a good way to spread it around and stimulate new projects.” “this is a wonderful program.
it is so unusual to have such an opportunity to take risks in exploring cutting edge approaches to humanities education. it has served to help legitimize our method of post-secondary course internationalization and has seeded further research and training in this area.” “this grant program was a god-send! it provided us with the resources to experiment with a tool that had great potential, but which required time and focus from a wide variety of people in order to assess the scope of its usefulness for humanities professionals.” some likes and dislikes regarding the program included the following: several expressed satisfaction with the opportunity to gather together at the project directors’ meeting. one wrote, “although the day-long meeting was almost too much to grasp, i got a fuller sense of what a digital project might entail.” “the start-up grant was very helpful, and the project directors’ meeting in dc was especially good as it provided direct encouragement for our approach and the incentive to take the project further.” “we were very impressed by neh’s willingness to take risks here by investing in a broad spectrum of promising but preliminary approaches to the digitization and access of historical materials. one of the lasting memories of the project for me was the extraordinary sophistication of the projects funded under the start up program evident in your project directors’ meeting in washington, the enthusiasm of the group and the remarkable ways in which these projects promised to re-imagine and re-invigorate the discipline. it was a delight to be included.” suggestions for improvement included asking neh advice on how to create “a smooth way of creating ‘layers’ of participants—owner of site; editor for individual projects; and contributors who might pose questions or offer solutions.
i am sure that other project directors supported by the odh have faced and solved this sort of problem, but in some ways i am at a loss to know exactly how to discover exactly who could help. i am confident that we will shape a workable solution, but wonder whether there might be some mechanism beyond the odh website for sharing solutions to this sort of problem.”

on the program guidelines, one director questioned the requirement for (c)( ) status of applicants: “i would think that this restriction reduces competition among applicants, and makes limited funding more limited (given excessive institutional overhead requirements).”

however, the complaints about the neh digital humanities start-up grant program were minimal. although some of the projects encountered various obstacles, most of them were able to find solutions to their problems. one director, faced with job reduction in the it staff, was forced to reach out to another institution. as a result, “the subsequent partnership for technical guidance and development has been wonderfully creative and productive.”

it was deemed “an extremely valuable program, and frankly perhaps the most innovative thing i've seen come out of the neh. the sug established legitimacy and credibility for our work, and was an important vehicle for allowing us to publicize what we were doing (which in turn put us into contact with other individuals who are attacking similar problems and with whom we can collaborate in the future). i very much hope to see the dh sugs continue.”

“i have always (and continue to) believed that the start-up grant program was extremely important in assisting humanities scholars to investigate whether their scholarship can be enhanced by technological developments.
i believe that on a local level, a start-up grant can provide the motivation for the humanities scholar to seek out technologists at their institution and leeway for the technologists to allocate personnel resources to the scholar’s problem. i believe that the size of the grants makes it difficult to measure their impact in traditional ways, for example, the opening of effective communication lines between scholar and technologists may not yield documentable evidence, yet still have a sustained impact on the scholarship of the grantee.” “this comes as close as anything in the humanities to a broad-scale address to the problem of training, collaboration, and development in humanities computing. the resources and time involved in developing tools, debates, training, and archives in the humanities today are more similar to the resources and time needed in the natural sciences. this program recognizes that challenge, and it was crucial to allowing me to move forward in my digital work while also generating the kinds of traditional scholarship required for tenure in the literary humanities. again, the neh is providing an extremely valuable service to the nation by helping to spur the embrace and adoption of digital tools through the odh program. 
it has helped to provide a new window of perspective into our understanding of the humanities, as well as a new platform for sharing that understanding.”

attachment a: websites

http://www.wenlin.com/cdl/ (character description language - hd - )
http://tdm.ucr.edu/ (thai digital monastery project - hd - )
http://www.rch.uky.edu/rfri (russian folk religious imagination - hd - )
http://www.artistsbooksonline.org/ (artists’ books online - hd )
http://corinthcomputerproject.org (digital corinth synchronized database project - hd - )
http://www.pmoca.org (portuguese cartography and coastal africa, - - hd - )
http://www.coastal.edu/ashes art (ashes art: virtual reconstructions of ancient monuments - hd - )
http://www.reed.utoronto.ca/downloads.html (records of early english drama - hd - )
http://movingimagesincontext.org (finding and using moving images in context - hd - )
http://inpho.cogs.indiana.edu (inpho: the indiana philosophy ontology project - hd - )
http://lcsh-es.org (a bilingual digital list of subject headings - hd - )
http://www.asmodeus.ws/cohenlab/annotations.htm (interface development for static multimedia documents - hd - )
http://evince.unl.edu/index.html?file=../xml/base.xml (evince visualization and analysis tool - hd - )
http://utunes.utexas.org (utunes: music . - hd - )
http://sophieproject.org/ (sophie search gateway - hd - )
http://lexomics.wheatoncollege.edu (pattern recognition through computational stylistics: old english and beyond - hd - )
http://www.jeffersonstravels.org ; http://www.historybrowser.org ; http://www.viseyes.org (jefferson's travels: a digital journey using the historybrowser - hd - )
http://mastudies.ning.com (hd - )
http://www.datsproject.org/ (distributed archives transaction system - hd - )
http://vbi.lakeforest.edu (virtual burnham initiative - hd - )
http://www.people.virginia.edu/~msg d/idea (project temporarily housed at this site) (online encyclopedia best practices and standards - hd - )
http://ricercar.cesr.univ-tours.fr/ -programmes/emn/duchemin/ (the chansonniers of nicholas du chemin ( - ): a digital forum for renaissance music books - hd - )
http://ccl.rch.uky.edu/ (carolingian canon law project - hd - )
http://dsl.richmond.edu/workshop/ (visualizing the past: tools and techniques for understanding historical processes - hd - )
http://libmedia.willamette.edu/acom/neh/ (bridging the gap: connecting authors to museum and archival collections - hd - )
http://emergentmediacenter.com/vtarch/ (creating a sense of place through archeology - hd - )
http://www.telledfu.org (digital documentation of a provincial town in ancient egypt - hd - )
www.literae.com/echo (prototype) (connecticut's heritage echosystem - hd - )
http://www.wide.msu.edu (archive . : imagining the michigan state university israelite samaritan scroll collection - hd - )

attachment b: software or tools

http://www.wenlin.com/cgi-bin/wenlinsvghelp.pl (hd - )
http://www.pmoca.org/ (hd - )
http://www.flintbox.com/technology.asp?page= (hd - )
http://valleydev.cs.hope.edu; www.valleysim.com (hd - )
http://inpho.cogs.indiana.edu/taxonomy/ (hd - )
http://lcsh-es.org (hd - )
http://www.structuralknowledge.com/markup_demo/markup/ (hd - )
http://evince.unl.edu/index.html?file=../xml/base.xml (hd - )
http://www.historybrowser.org ; http://www.viseyes.org/edit.htm (hd - )
http://lexomics.wheatoncollege.edu (currently only available to developers) (hd - )
http://mastudies.ning.com (forthcoming sep ) (hd - )
http://mel.hofstra.edu/textlab/ (hd - )
http://ricercar.cesr.univ-tours.fr/ -programmes/emn/duchemin/ (hd - )
http://www.stoa.org: /cclxtf/search (hd - )
http://www.wide.msu.edu/content/archive/ (hd - )
https://source.sakaiproject.org/contrib/simonides/ (hd - )
www.telledfu.org (hd - )
http://emergentmediacenter.com/vtarch/ (hd - )

attachment c: blogs/media/press

old dominion university’s research/innovations/breakthroughs quest, volume , issue , summer (hd - )
old dominion university’s center for learning technologies: video broadcast to prospective students on distance learning/podcasting technology. august , . (hd - )
the virginian-pilot, “ipod instruction,” by janette rodrigues, february , . (hd - )
http://movingimagesincontext.org/blog/ (hd - )
http://www.research.iu.edu/news/stories/ .html (hd - )
http://www.insideindianabusiness.com/newsitem.asp?id= (hd - )
http://www.nydailynews.com/ny_local/queens/ / / / - - _time_warp_ _world_fair_to_make_web_comeback.html (hd - )
http://collocate.wordpress.com/ / / /experimental-spanish-version-of-lcsh/ (hd - )
http://bengu-cn.blogspot.com/ / /more-chinese-translations-of-ifla.html (hd - )
http://splconferences.blogspot.com/ / /at-last-bilingual-subject-access-to.html (hd - )
http://laureltarulli.wordpress.com/ / / /standards-are-like-toothbrushes-a-good-idea-but-no-one-wants-to-use-anyone-elses/ (hd - )
http://lumagoo.wordpress.com/ / / /session-thoughts-the-future-is-now-global-authority-control/ (hd - )
http://www.libraryjournal.com/blog/ /post/ .html (hd - )
http://www.uaf.edu/research/frontiers/studying/index.xml (hd - )
http://www.datsproject.org/blog/ (hd - )
http://grou.ps/digitalobjects/talks/ (hd - )
http://placebased.typepad.com/placebased_education/theory_and_practice/ (hd )
http://museum-musings.blogspot.com/ / /national-council-on-public-history.html (hd - )
http://mastudies.ning.com/" #ncph : pm apr rd (hd - )
http://www.alexandriaarchive.org/blog/?p= (hd - )
http://vbi.lakeforest.edu/press.html (hd - )
http://hangingtogether.org/?p= (hd - )
http://chronicle.com/article/archiving-writers-work-in/ (hd - )
http://news.haverford.edu/blogs/digitalduchemin/ (hd - )
http://chronicle.com/blogpost/archive-watch-good-samaritans/ / (hd )
http://www.archivesnext.com/?p= (hd - )
http://beyondwordsblog.com/ / / /archive- - -transforms-traditional-practices/ (hd - )
http://www.youtube.com/watch?v=uavlts kqus . (hd - )
http://abclocal.go.com/wjrt/story?section=news/localandid= . (hd - )
http://www.youtube.com/watch?v=uavlts kqus (hd - )
http://www.cal.msu.edu/samaritan.php. (hd - )
http://www.archives.gov/research/online-access-newsletter/ -december.pdf. (hd - )
http://www.lansingstatejournal.com/article/ /news / / /news . "msu newsroom." (hd - )
http://www.capitalgainsmedia.com/inthenews/texts .aspx. (hd - )
http://news.msu.edu/story/ /. (hd - )
peck, jim, prod. "samaritan scrolls." samaritan scrolls. big ten network. jan. . (hd - )
sorg, walt. ""archive . "" amlansing with walt sorg. -wils, lansing, mi. dec. . (hd - )
http://www.amlansing.com/amlansing/hart-davidson_ . .html. (hd - )
http://www.globeinvestor.com/servlet/story/bwire. . /gistory (hd - )
http://finance.yahoo.com/news/nyu-and-unicon-present-sakai-bw- .html?x= &.v= (hd - )
http://www.tmcnet.com/viewette.aspx?u=http% a% f% fwww.tmcnet.com% fusubmit% f % f % f % f .htm&kw= (hd - )
http://www.freshnews.com/news/ /nyu-and-unicon-present-sakai-portfolio-track- th-annual-sakai-conference (hd - )
http://www.reuters.com/article/pressrelease/idus + -jul- +bw (hd - )
http://rds.yahoo.com/_ylt=a geu.btbrpku bnzjxnyoa;_ylu=x odmtezdwlucgnmbhnlyw nzcgrwb mdmtcey sbwnhyziednrpzanimtg xzc /sig= pvepl /exp= /** http% a//www.forbes.com/feeds/businesswire/ / / /businesswire .html (hd - )
http://www.champlain.edu/emergent-media-center/projects/virtual-archeology-museum.html (hd - )
http://www.timesargus.com/article/ /features / / /features (hd - )

attachment d: publications

“performative metatexts in metadata, and mark-up” in european journal of english studies, vol. , no. (august ), pp. - . (hd - )
speclab: digital aesthetics and speculative computing, university of chicago press, . (one chapter) (hd - )
academic podcasting and mobile assisted language learning: applications and outcomes.
edited by betty rose facer and m’hammed abdous, scheduled to be published in . (hd - )
“mall technology: use of academic podcasting in the foreign language classroom.” recall, ( ), uk: cambridge university press (january ). http://journals.cambridge.org/action/displayabstract;jsessionid= d f e a c cf d a c.tomcat ?frompage=online&aid= (hd - )
“the impact of academic podcasting on students: learning outcomes and study habits” in research on e-learning methodologies for language acquisition, ed. rita marriott and patricia torres (new york/uk: information science reference/igi, july ). http://books.google.com/books?id=yhw_opo aa c&pg=pp &lpg=pp &dq=betty+rose+facer&source=bl&ots=kvnfn hqp &sig=o z d qfecuhltjzqfcpr oaq&hl=en&ei=-ssnstlwijlqlafitpmrbw&sa=x&oi=book_result&ct=result&resnum= #v=onepage&q=betty% rose% facer&f=false (hd - )
“the cone of africa…took shape in lisbon,” humanities magazine, nov . (hd - )
arne r. flaten and alyson gill, eds. and contributors, visual resources, an international journal of documentation, special edition: using digital representations in the humanities (london: taylor & francis/routledge, forthcoming december ); essays: “state of the discipline (gill),” and “ashes art: digital models of th century bce delphi, greece (flaten).” under contract. (hd - )
arne r. flaten, “ashes art as pedagogical experiment,” in peer-reviewed proceedings of computer applications & quantitative methods in archaeology th annual meeting, budapest; accepted. (hd - )
alyson a. gill, “’chattering’ in the baths: the urban greek bathing establishment and social discourse in classical antiquity,” in peer-reviewed proceedings of computer applications & quantitative methods in archaeology th annual meeting, budapest; accepted. (hd - )
arne r. flaten, “ashes art: digital collaboration in the humanities,” in book: new technologies to explore cultural heritage (washington and rome: national endowment for the humanities and the consiglio nazionale delle ricerche, ). (hd - )
alyson gill and arne r. flaten, “digital delphi: the d virtual reconstruction of the hellenistic plunge bath at delphi,” in the digital heritage: proceedings of the th international conference on virtual systems and multimedia, ed. m. ioannides, a. addison, a. georgopoulos, l. kalisperis, hungary: archeolingua, . (hd - )
flaten and gill, “ashes art: collaboration and community in the humanities,” in first monday: peer-reviewed journal on the internet , ( august ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/index (hd - )
flaten and gill, “virtual delphi: two case studies,” in the isprs international archives of the photogrammetry, remote sensing and spatial information sciences, xxxvi- /c , ; also published in cipa international archives for documentation of cultural heritage, xxi- . (hd - )
rajan, p and yan, w. “cast shadow removal using time and exposure varying images”, proceedings of the th international conference for advances in pattern recognition (icapr), kolkata, india. . http://www.isical.ac.in/~icapr / (hd - )
yan, w. and rajan, p. “towards digitizing colours of architectural heritage”, proceedings of the conference on virtual systems and multimedia (vsmm) ' : dedicated to digital heritage, october - th, , limassol, cyprus. http://www.vsmm .org/ (hd - )
china in the world: a history since , published by cheng & tsui, . http://www.cheng-tsui.com/store/products/china_world [book & cd] (hd - )
“inpho: the indiana philosophy ontology”, american philosophical assn. newsletter on philosophy and computers ( ) : - . http://www.apaonline.org/documents/publications/v n _computers.pdf (hd - )
“answer set programming on expert feedback to populate and extend dynamic ontologies.” in proceedings of st flairs. aaai press, ; - . http://inpho.cogs.indiana.edu/papers/ -inpho-flairs.pdf (hd - )
“the world is not flat: expertise and inpho.” first monday [online], volume number ( august ). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ / (hd - )
“working the crowd: design principles and early lessons from the social-semantic web.” in proceedings of the workshop on web . : merging semantic web and social web - (sw)^ at acm hypertext, turin, italy, . http://sunsite.informatik.rwth-aachen.de/publications/ceur-ws/vol- / (hd - )
cameron buckner, mathias niepert, and colin allen. from encyclopedia to ontology: toward a dynamic representation of the discipline of philosophy. in a special issue of synthese, springer-verlag, (forthcoming) http://inpho.cogs.indiana.edu/papers/taxonomizingideas.pdf (hd - )
david bodenhamer, john corrigan, and trevor harris, eds., the spatial humanities: gis and the future of humanities scholarship (indiana university press), inaugural volume in a series on the spatial humanities, with bodenhamer, corrigan, and harris as general editors (two other titles under contract; another volume on religion and the atlantic world under development). http://www.iupress.indiana.edu/catalog/product_info.php?products_id= (hd - )
drout, m., kahn, m., leblanc, m.d., jones, a. ‘ , kathok, n. ‘ , and nelson, c. ’ . “lexomics for anglo-saxon literature.” old english newsletter, [in press] fall . (hd - )
drout, m., kahn, m., leblanc, m.d. (submitted june ). lexomic methods for analyzing relationships among old english poems. journal of english and germanic philology. (hd - )
[in progress] downey, s., drout, m., kahn, m., kisor, y., leblanc, m. ( ) “lexomic evidence for the relationship between guthlac and vercelli homily .” (hd - )
[in progress] downey, s., drout, m., kahn, m., kisor, y., leblanc, m. ( in preparation). “‘us gewritu secgað’: lexomic evidence for an unknown source for the lamech material in genesis a.” (hd - )
jacobs and holland, “sharing archaeological data: the distributed archives transaction system” (invited article) in near eastern archaeology, . (hd - )
jacobs, “getting data into the hands of archaeologists: dats” in proceedings of conference “co-operation networks for the transfer of know-how in d digitization applications” at the cultural and educational technology institute/”athena” r.c., xanthi. (hd - )
jacobs, “coroplastic studies, an argument for total publication” in coroplast studies interest group (csig) newsletter no. , summer . http://www.coroplasticstudies.org/images/csig_newsletter_ _ .pdf (hd - )
holland, “a distributed archive for coroplastic research: www.datsproject.org” in coroplast studies interest group (csig) newsletter no. , summer . http://www.coroplasticstudies.org/images/csig_newsletter_ _ .pdf (hd - )
connect, “simonides: a faculty-led, student-centered technology initiative” (forthcoming; november, volume , number ) (hd - )

attachment e: exhibits, workshops, and conferences

“spreading the word: reaching out to students and faculty.” iallt annual conference, georgia state university, atlanta, georgia, may , . (hd - )
“authentic materials as portable media content," at the summer institute - center for learning technologies, "web . : social networking at odu” at old dominion university, norfolk, virginia, may , .
(hd - )
"academic podcasting for foreign language, literature, and culture," at the summer institute - center for learning technologies, "making the magic happen” at old dominion university, norfolk, virginia, may , . (hd - )
"academic podcasting technology," at the summer institute - center for learning technologies, "technology fair” at old dominion university, norfolk, virginia, may , . (hd )
“academic podcasting technology: the impact on foreign language acquisition.” research expo , “communities of research: discovery, innovation & entrepreneurship,” old dominion university, norfolk, virginia, april , . (hd - )
“the impact of academic podcasting on student learning outcomes.” calico with iallt annual conference, university of san francisco, san francisco, california, march , . (hd - )
“the impact of academic podcasting on student learning outcomes: emerging technologies in the foreign language classroom (neh-dhi).” seallt-maallt joint conference , pine crest preparatory school, ft. lauderdale, florida, february , . (hd - )
"ipods, podcasting and podagogy: the new generation of technology for foreign language education," at the summer institute “poducation: all about podcasting,” center for learning technologies at old dominion university, may , . (hd - )
conference presentation at http://www.vsmm .org/ (hd - )
presentation on “archive . ” at the european conference of digital libraries, september -october in corfu, greece on the panel “digital libraries, personalisation, and network effects - unpicking the paradoxes.” the full description is available here: http://www.ionio.gr/conferences/ecdl /pnl_per.php (hd - )
the jefferson’s travels demonstration project inspired an interactive exhibit in the new monticello visitor center and has led to collaborations on other historybrowser/visualeyes projects with the smithsonian institution www.viseyes.org/show/?base=smithson and the hagley library http://www.historybrowser.org/brower.php?id= (hd - )

in addition to designing textlab, the project’s tasks were for hofstra to host a day-long mini-conference (called melcamp) on the shaping of the melville electronic library (mel) and for me to write an neh grant proposal for launching mel online. melcamp met on october , , with over twenty melville and digital scholars in attendance. the neh start-up grant provided some travel reimbursement. hofstra provided matching funds for travel and footed the food expenses for the day. in november, , i submitted a proposal for a scholarly editions grant to fund the launching of mel. it was provisionally accepted in may, , with the condition of budget downsizing. the revised proposal for $ , for two years was finally accepted in august.
the new grant will begin in november, . (hd - )
digital humanities presentation http://www.mith .umd.edu/dh /?page_id= (hd - )

developing an open journals hosting service: a case study from liverpool john moores university (ljmu)

the rise of the concept of ‘library as publisher’ has caused many university libraries to consider their role in the world of open access (oa) publishing and how that supports digital scholarship at their institutions. this paper outlines liverpool john moores university (ljmu) library services’ first steps into that world through the offering of an open journals hosting service. it begins by explaining the background and justification for the library offering such a service and details the pilot undertaken to test the chosen system, open journal systems (ojs). it considers what policies, procedures and support need to be in place in order to run a successful open journals hosting service. lessons learned and observations gathered during the pilot are shared to help others considering setting up an open journals hosting service in their own institution. finally it looks at the next steps for ljmu in taking this pilot forward to a full service offer.

introduction

‘library as publisher’ is a concept that has been gaining ground in recent years and collister et al. argue that participation in publishing is a natural area for libraries to engage in. this concept forms part of the broader area of digital scholarship, which cox argues provides opportunities for new roles for library staff. it seems a sensible step to move from disseminating other people’s content to disseminating content produced at our own institution.
this begins with making outputs available via an institutional repository, in particular to support hefce’s (the higher education funding council for england’s) requirements for the next research excellence framework (ref), but should not end there. open access (oa) publishing is a logical area for libraries to get involved in. with this in mind, library services at liverpool john moores university (ljmu) made the development of the library as publisher a key strategic aim in – to support the university’s embedding research and scholarship strand of the – strategy plan. to begin to realize this aim, open journal systems (ojs) was investigated. a number of other university libraries are engaging with ojs and this appears to be a growing area, as demonstrated by a recent event held at the university of manchester. the focus there was on student publishing, but some universities (such as edinburgh and warwick) offer hosting for both student- and academic-level journals.

background

a subscription to ojs, hosted by the university of london computer centre, had been purchased as part of a curriculum development project, but the potential of the system had not yet been fully explored. ojs is a journal management and publishing system developed by the public knowledge project with the purpose of making oa publishing a viable option for more journals. in order to determine whether an open journal service was something the library service could and should offer as part of its digital scholarship strand, it was decided to run a project to pilot the system. we based this on the open journals hosting services at the universities of edinburgh and st andrews, and adapted their materials for our purposes.
insights – ( ), july developing an open journals hosting service at ljmu | cath dishman
cath dishman, open access and digital scholarship librarian, liverpool john moores university

we approached two existing journal managers at ljmu who, because their web pages had been moved behind a log-in, we knew were looking for a way around this and potentially a new platform to host their journals. one journal, spark, was a student journal which gave students an opportunity to adapt an assessed piece of work into a journal article suitable for publication. this journal also had a staff and student editorial team. the other journal, innovations in practice, was a staff journal which enabled staff to publish work around the areas of pedagogic research and teaching practice. at an initial meeting it was agreed that the journal managers of these two journals would pilot ojs for us and i would take on the role of project manager. at the same time one of our liaison librarians had been approached by an academic staff member in the school of nursing and allied health for support in setting up a new journal. again they were looking for somewhere to host this. we agreed this should be based on the spark journal, being a place for students to publish their work but also offering them a chance to be editors as well. from the perspective of the project as a whole, this was a useful development as it meant that the documentation we would need for setting up a new journal could be fully tested and revised accordingly so it would be ready to use if this was to go forward as a service after the pilot. it was agreed that a member of the project team (the liaison librarian in the subject area) and i would also be part of the editorial board for this journal, links to health and social care, in order to fully understand the system and what support a new journal would need.
the pilot

training on the editorial aspects of the system was offered to all the journal managers and editors for the three journals. this was challenging as the system focuses very much on individual roles and in some cases participants took on more than one role. one particularly confusing session involved participants playing the role of author for some papers and editor for others, where it became clear that some participants were confused as to which role they were playing at which point in the training. as a result it was decided that the best way to move forward was for a member of the project team to play the author and the participants to take on the editorial role. this provided them with a clearer picture of what was required of them. it was agreed with all three journal teams that copyright would sit with the authors of the papers and that papers would be published with a cc by-nc-nd (creative commons attribution-noncommercial-noderivatives) licence to allow reuse for non-commercial purposes. it was agreed that journals would be open access and advice on the types of licence that could be used was given. each of the journals was set up on ojs, which involved creating a ‘new journal’ on the system for each one, i.e. deciding on a name if new, adding the appropriate editors and reviewers, agreeing policy information surrounding the journal and adding this information into ojs. this would then be visible to future authors so they could see, for example, the focus and scope of the journal, what type of papers would be accepted, what the review process would be and turnaround for review. some degree of customization of the interface was also involved: adding a journal header, applying a unique brand to each journal and ensuring the ljmu logo was included. the aim was for all journals to release an issue before the close of the project. progress was slow to begin with as the project team and editorial teams got to grips with the system.
all three journals produced an issue during the project, but all took a slightly different approach.

spark

after the setting up of the journal (see figure for the home page), initial training of editors and the setting up of authors on the system, there was very little input needed from the library project team. the editorial team already had an established process in place for editing an issue and only needed support when using ojs for submissions. they used the system for submitting articles and some elements of the review and copy-editing process. however, they decided not to publish each article as a separate pdf within the issue, instead producing a full-issue pdf as they had done previously, so just required support to upload this onto ojs. this provided a place to host the journal and a stable url to distribute.

innovations in practice

this journal had only one editor/journal manager and took a different approach (see figure for journal home page). due to the additional support needed for authors to ensure the quality of paper the journal manager was looking for, articles were not submitted, reviewed and copy-edited through the system, so the journal manager worked closely with each author to develop their work to make sure it was ready for publication. support was provided with layout, creating the pdfs for each article and uploading them to the system and publishing the issue. more support was needed here in the latter stages when the papers were ready to be formatted and published.

links to health and social care

the system was fully tested with this journal. the editorial team used all areas of the review, copy-edit and layout sections and were supported at each stage.
being part of the editorial team allowed me to play a full part in testing the system and understanding where changes to supporting documentation needed to be made. the challenges we faced as an editorial team did not just relate to the system itself but were to do with understanding the various stages of publishing a journal, as it was new to us all. the experience was invaluable as it gave us a clear understanding of what the system could do and what the editors and authors were required to do (and what support they needed) at each stage of the process. it was necessary to create guides that were specific to that journal and covered the requirements of the editors/reviewers in terms of the content of the journal (and requirements of the journal manager) whilst giving them the practical steps for what needed to be done on the system itself (where to upload documents, how to e-mail authors, etc.). working closely with the journal manager, we produced a guide to support editors/reviewers (a joint role for this journal) and another one for authors. these new guides were tested for the second issue of the journal and resulted in fewer queries from authors and editors/reviewers on how to use the system and what was required of them at the various stages of the editorial process. working with the links to health and social care team was useful as it enabled us to fully understand what a journal team needs from ojs and what their support needs associated with this are likely to be. figure shows the home page. all three journals are now well established and are approaching their third issue. all journal teams are now more independent and only need help with the final stages of compiling the issue prior to publication. to begin with, the system can seem a little overwhelming and the ‘editing’ section (shown in the screenshot, figure ) in particular caused some confusion, necessitating additional support. 
training on the system is therefore essential for any new journal editorial team members.

figure . spark journal home page (http://openjournals.ljmu.ac.uk/spark)
figure . innovations in practice journal home page (http://openjournals.ljmu.ac.uk/iip)
figure . links to health and social care journal home page (http://openjournals.ljmu.ac.uk/lhsc)
figure . standard ojs ‘editing’ screen

the journal manager of links to health and social care and one of the student editors and authors presented at the royal college of nursing education forum ‘nursing education and professional development: the global perspective’ in march on their experiences with ojs. they received very positive feedback and there was a lot of enthusiasm and discussion about the possibility that it might help to meet some of the new nursing and midwifery council standards for nursing education. innovations in practice has been included in the road directory of open access scholarly resources and articles have been cited in other international peer-reviewed education journals. the journal manager is very keen to raise the profile and impact of this journal.

considerations when starting an open journal service

policy

it is important to develop a policy for the service outlining what the service does and does not offer and what level of support is available. think about the level of customization that as a service you are willing and able to offer each journal and ensure this is outlined in the policy. think about how much time will be given to each journal for the first issue, second issue and beyond.
this will probably vary but needs to be laid out, and it will also influence how many new journals the service will be able to take on at any one time. it is necessary to consider ownership of individual journals. who is responsible for the content, and for ensuring progression if key people leave? consider also what happens if the journal folds. a successful service would not be one which holds a multitude of journals that only produce one issue and then fold! the reputation of the institution needs to be taken into consideration as well, and faculty staff are best placed to offer some kind of quality control. having backing from senior managers in the appropriate area of the institution, for example directors of school or research groups, means that there is a check in place prior to a new journal being started.

time/capacity of support staff

it is essential to ensure there is enough capacity in the team to deliver the level of support described in your policy. the initial set-up of the journal can take time, especially with inexperienced journal managers, and once the journal is set up, journal managers will generally need more support to produce their first issue using ojs. however, there will be ongoing support needs which need to be taken into consideration. a recommendation would be to start small and then it is possible to assess the level of support that will be needed.

buy-in

there needs to be a commitment on the part of the journal manager to develop their procedures and policies relating to their journal, and with a new journal more support is needed in this area. we are aiming to develop workload estimates for journal managers and editors based on the number of issues and articles per issue for the proposed journal. this will help journal managers and editors understand from the outset the degree of commitment needed to maintain a journal.
in addition, library management need to be committed to offering this service in order to ensure adequate resource is available to support it. at ljmu we had the support of library management as the project supported the university’s embedding research and scholarship strand of the – strategy plan and our own management team acted as project board for the pilot project. finally, to ensure a successful service, there is a need for organizational-level buy-in. if support is gained from the organization, this will boost the profile of the service and make it easier to deal with ‘problem’ journals. it can be a challenge convincing senior management that offering an open journals service is a worthwhile venture for the institution. it requires linking it to your strategic plan and outlining potential benefits.

next steps for ljmu

having developed our service policy, the next step for us is to get university approval for it and to gain the support of key groups. in particular there is a need for schools and faculties to be aware of and take responsibility for any new journals that are proposed. we would like to get more research- rather than teaching- and practice-focused journals using the system so we can test other areas like blind peer-review and situations where the roles of editor and reviewer are carried out by different people. another area we need to develop is around the marketing of current journals. how do we get the word out and improve the profile of our existing journals? some of the responsibility for this sits with the journal managers, but in order to boost the profile of the service as a whole the library team needs to take a role in this. to aid discoverability we would support journal managers with applications for inclusion in the directory of open access journals.
we are also beginning to tweet links to new articles from our research support twitter feed @ljmuresearch and will include promotion of our current journals at our teaching and learning conference, as well as encouraging potential new journal managers to get in touch. there is also the potential for partnerships with other institutions. the links to health and social care team in particular is keen to work with other institutions to further boost the profile and reach of the journal. as the system is online and can be accessed anywhere, this works well for cross-institutional collaboration. the institutions do not even need to be in the same country, and the links to health and social care team is looking to collaborate with karelia university in finland.

conclusions

ojs is a versatile system and it is not necessary to use every aspect of it to benefit from it. when we started this project, we envisaged that all the journals would use the system in the same way, but this did not turn out to be the case. buy-in from your organization, library management, your service manager and journal managers is essential to make a journal service work as, without commitment, journals will fold. providing appropriate levels of support is key too because if inexperienced journal managers are left too much to their own devices, especially at first, they may feel overwhelmed and the journal is more likely to fail. in the broader context, working with journal managers and editors offers an excellent opportunity for library staff to develop relationships and promote the library as a partner in the enabling of digital scholarship within their organization.
abbreviations and acronyms

a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘abbreviations and acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa

competing interests

the author has declared no competing interests.

references

collister l, deliyannides t and dyas-correia s, the library as publisher, the serials librarian, , ( – ), – ; doi: https://doi.org/ . / x. . (accessed may ).
cox j, communicating new library roles to enable digital scholarship: a review article, new review of academic librarianship, , ( – ), – ; doi: https://doi.org/ . / . . (accessed may ).
higher education funding council for england, policy for open access in the next research excellence framework: updated november , , london, hefce: http://www.hefce.ac.uk/pubs/year/ / / (accessed may ).
liverpool john moores university, – strategic plan, , liverpool, liverpool john moores university: https://www.ljmu.ac.uk/~/media/files/ljmu/public-information-documents/strategic-plan/strategic_plan_ _ .pdf?la=en (accessed may ).
public knowledge project: https://pkp.sfu.ca/ojs/ (accessed may ).
dobson h, february , supporting student publishing: perspectives from the university of manchester and beyond, library research plus blog: https://blog.research-plus.library.manchester.ac.uk/ / / /supporting-student-publishing-perspectives-from-the-university-of-manchester-and-beyond/ (accessed may ).
creative commons: https://creativecommons.org/licenses/by-nc-nd/ . / (accessed may ).
liverpool john moores university, ref. .
directory of open access journals: https://doaj.org/ (accessed may ).
ljmu research support: https://twitter.com/ljmuresearch (accessed may ).

article copyright: © cath dishman. this is an open access article distributed under the terms of the creative commons attribution licence (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use and distribution provided the original author and source are credited.

cath dishman
open access and digital scholarship librarian, library services, liverpool john moores university, aquinas building, maryland street, liverpool l de, uk
e-mail: c.l.dishman@ljmu.ac.uk
orcid id: http://orcid.org/ - - -

to cite this article: dishman c, developing an open journals hosting service: a case study from liverpool john moores university (ljmu), insights, , ( ), – ; doi: https://doi.org/ . /uksg.

published by uksg in association with ubiquity press on july
http://www.uksg.org/ http://www.ubiquitypress.com/

networked participatory scholarship: emergent techno-cultural pressures toward open and digital scholarship in online networks

computers & education ( ) –
journal homepage: www.elsevier.com/locate/compedu

george veletsianos*, royce kimmons
university of texas at austin, united states
* corresponding author. e-mail address: veletsianos@gmail.com (g. veletsianos).
doi: . /j.compedu. . .

article info
article history: received july; received in revised form september; accepted october
keywords: networked participatory scholarship; digital scholarship; open scholarship; online networks; social media; techno-cultural pressures

abstract

we examine the relationship between scholarly practice and participatory technologies and explore how such technologies invite and reflect the emergence of a new form of scholarship that we call networked participatory scholarship: scholars’ participation in online social networks to share, reflect upon, critique, improve, validate, and otherwise develop their scholarship.
we discuss emergent techno-cultural pressures that may influence higher education scholars to reconsider some of the foundational principles upon which scholarship has been established due to the limitations of a pre-digital world, and delineate how scholarship itself is changing with the emergence of certain tools, social behaviors, and cultural expectations associated with participatory technologies. © elsevier ltd. all rights reserved.

introduction

the last thirty years have seen an explosion of information and communication technologies that have impacted diverse aspects of our lives. for instance, the financial, medical, travel, and publishing industries have seen great advancements in how business is conducted on a day-to-day basis, in large part as a result of technological innovations. the higher education sector has not been immune to such innovations. the development of the contemporary university is intertwined with the development of technology and other aspects of society reflecting a rich history of interaction between the institution, society, and available technology tools (alexander, ; rhoads & liu, ; siemens & matheos, ), lending credence to the notion that just as societies, governments, and other social groups adapt and change over time, so too do universities, the work that they do, and how they do that work. in this paper, we examine the relationship between scholarly practice and technology and explore how online social networks invite emergence of a new form of scholarship that we call networked participatory scholarship. questions relevant to this investigation are: what does the participatory web mean for scholarship? what is the relationship between technology and scholarly practice? are scholars engaging in scholarly digital practices merely as a result of opportunities presented by contemporary forms of technology? is technology simply an outlet through which scholars enact scholarly practices that they value?
we provide a conceptual map for thinking about these questions through an examination of digital scholarship, the historical relationship between technology and scholarship, and emergent techno-cultural pressures that may lead scholars to reconsider some of the foundational principles upon which scholarship has been established due to the limitations of a pre-digital world. we focus on higher education stakeholders and specifically the educational scholar. any reference to the term scholar should be understood to refer to individuals who participate in teaching and/or research endeavors (e.g., doctoral students, faculty members, instructors, and researchers). while the term scholarship may traditionally be perceived as referring to scientific discovery, any reference to the term in this paper should be understood to include both teaching and research activities (boyer, ; hutchings & shulman, ). this view of scholarship was originally proposed by boyer ( ) whose empirical evaluation of scholars’ activities in higher education led him to argue that this perspective better describes the types of activities that scholars engage with because knowledge is generated and acquired not just through research, but through teaching as well. this perspective has been gaining increasing acceptance both among individuals studying digital scholarship (e.g., pearce, weller, scanlon, & kinsley, ) and education researchers (e.g., foster et al., ). scholarly practice, on the other hand, should be taken to refer to those activities undertaken by scholars for teaching and research purposes.
unsworth ( ) suggests that scholars engage in “scholarly primitives” that are common across disciplines, and these are the practices of discovering, annotating, comparing, referring, sampling, illustrating, and representing, while palmer, teffeau, and pirmann ( ) suggest that scholars engage in the activities of searching, collecting, reading, writing and collaborating. beyond these specific activities, boyer ( ) summarizes scholarly practice in terms of the scholarship of discovery, the scholarship of integration, the scholarship of application, and the scholarship of teaching. the relationship between technology and scholarship has not attracted much empirical attention in the education literature (greenhow, robelia, & hughes, ; veletsianos, in press). yet, kumashiro et al. ( , p. ) warned the education community that “technological changes are going to flood how we currently think about, do, and represent research. although [this issue] is being largely ignored in colleges of education, other than in simplistic and trivial ways. [and] most education faculty have no real sense of this inevitable change”. examples of the use of technology in scholarship vary widely. commonplace practices include the use of bibliographic management software, data analysis tools, and transcription services to aid efficiency. scholars may also publish in online journals and self-archive their publications on personal or institutional websites, use web logs (blogs) to share instructional materials with colleagues (martindale & wiley, ), employ microblogs to network with diverse audiences (veletsianos, in press), use social networking sites to publicly reflect on in-progress manuscripts (e.g., conole, ), release instructional content free-of-charge through open courseware initiatives (pearce et al., ) and collect artifacts from distributed internet contributors (wesch, ).
while adoption of any technology varies according to numerous variables (e.g., perceived value to professional endeavors, familiarity with the technology, etc.), a number of technological and cultural stimuli have encouraged at least some faculty to utilize participatory technologies and online social networks for scholarly purposes. to understand these stimuli, and the forms of scholarship emerging from them, we will first discuss how prior literature has formulated the role that digital technologies play in scholarly work and how these formulations lead to a transformed understanding of scholarship that is both networked and participatory. we will then show that this transformation is not without precedent by delineating the negotiated relationship that technology and scholarship have historically had with one another, wherein technological innovation and cultural norms have shaped scholarship in a variety of ways. we will conclude by examining current and emerging techno-cultural pressures on scholarly practice that arise in the dominant culture, amongst scholars, and within scholarly journals to understand how these pressures will continue to transform scholarship in ways that are both networked and participatory.

from digital scholarship to networked participatory scholarship

over the past few years, there has been growing interest in “digital scholarship” as some scholars have sought to use technology to enhance their research, typically in hopes of making it accessible faster and cheaper (andersen, ). a number of researchers have also noted the value of technology in fostering scholarship that is social (berge & collins, ; chong, ; cohen, ; greenhow, ), conversational (oblinger, ), and open (pearce et al., ). cohen ( , ¶ ) notes that social scholarship “is the practice [...] in which the use of social tools is an integral part of the research and publishing process.
[and is characterized by] openness, conversation, collaboration, access, sharing and transparent revision”. pearce et al. ( ) argue that digital scholarship is “more than just using information and communication technologies to research, teach and collaborate, but it is embracing the open values, ideology and potential of technologies born of peer-to-peer networking and wiki ways of working in order to benefit both the academy and society.” while cohen considers scholarship to refer to the process of academic research, greenhow, pearce et al., and the authors of this paper consider scholarly practice to include teaching endeavors. researchers have also attempted to characterize the individuals that employ digital tools in their scholarship (burton, ; cohen, ; weller, ) and have described these scholars in a variety of ways (e.g., social scholars, open scholars, digital scholars, etc.). though individual authors’ definitions may vary slightly from one another depending upon the scholar behaviors they are emphasizing, such definitions tend to focus on a few common components, including technology, collaboration, sharing, and openness. for example, cohen presents a list of fourteen characteristics that describe social scholars (e.g., “a social scholar initiates or joins an online community devoted to her topic, using any of a number of social software services or tools,” ¶ ). burton argues that “the open scholar is someone who makes their intellectual projects and processes digitally visible and who invites and encourages ongoing criticism of their work and secondary uses of any or all parts of it–at any stage of its development” (¶ ). weller summarizes the digital scholar as someone who is “open, digital, & networked” (¶ ). the visions presented for these kinds of scholars contrast to the dominant conceptualization of scholarly practices which are often seen as monastic and lacking ongoing participation, support, and conversation (kumashiro et al., ).
while it could be argued that scholars have always shared their work with colleagues (e.g., face-to-face, via correspondence, over the telephone, through conferences, etc.), and disciplines have always had open (and less open) scholars, questions that we need to consider are: how are the (ongoing and new) needs and values of educational scholars supported by new technologies and participatory practices? conversely, how do current conceptualizations of scholarly practice and online participation hinder our scholarly goals? and how do online networks, and the ability to have instant and continuous access to networks of colleagues, impact the way that we research and teach? attempts at using technology to enhance scholarly practice have so far been met with skepticism and reluctance, as departmental requirements for tenure and promotion in institutions of higher education remain unchanged (ayers, ; kiernan, ; purdy & walker, ). for many faculty members, then, the potential value of “going digital” has not been worth risking tenure and departmental stigma (ayers, ¶ ). though scholars and their universities may generally look upon digital scholarship with a receptive air (andersen, ; kiernan, ), departmental acceptance has been found to vary by discipline, with the hard sciences being the most receptive, followed by the social sciences and the humanities (andersen, p. ). the reason for such differing levels of acceptance may largely be due to what is being done by scholars in the current formulation of digital scholarship. as borgman ( ) points out, much of the effect that digitization has had upon scholarship revolves around the blurring of primary and secondary sources, wherein primary sources (i.e., data sets) are made more widely available to researchers in the form of publications and are more widely being listed on curriculum vitae.
in the hard sciences, it is generally suggested that making data sets more widely available would have great value. for instance, providing access to genome mapping data may offer large societal benefits because multiple research teams can analyze such data concurrently. within the education and social sciences, however, such motivations for collaboration may be of less interest (cf. thagard, ). additionally, the publication and dissemination of secondary sources via digitization is much more problematic, as academia lacks an established framework of evaluation for judging the legitimacy or quality of interpretive or positional work that is distributed via non-traditional channels, such as videos on file-sharing sites or multimedia narratives (cf. borgman, ; purdy & walker, ). such a framework would need to consider complex aspects of digital publication such as time invested, originality, transferability, impact, peer judgments, and usefulness to the field and to society (andersen, ; kiernan, ) and has yet to evolve. as a result, digital scholarship lacks appeal for scholars in the social sciences. when considered through another lens, both the value and limitation of digital scholarship may lie in its framing of technology through a lens of amplification. thus, fields that require mass data collection and access have much to gain from the approach, while fields which rely more upon positionality and interpretation of theory (in addition to data) may find that digital tools which focus on improving data sharing do not help them drastically improve existing practice. however, if we consider that technology may replicate, amplify, or transform scholarly practice depending on how it is used (cf. hughes, thomas, & scharber, ; king, ) then we can begin to see that technology may have untapped potential in that it may not just improve what it is we are already doing, but, rather, it may actually transform our scholarly practice in positive ways.
thus, we view how scholars use digital technologies to support scholarship as including a set of practices and dispositions that have the potential to fundamentally alter the way we view scholarship. we define these practices as networked participatory scholarship (nps) and consider them to go beyond digital scholarship in both scope and value. networked participatory scholarship is the emergent practice of scholars’ use of participatory technologies and online social networks to share, reflect upon, critique, improve, validate, and further their scholarship. though this practice is at an early stage of adoption and development within the scholarly community, one can find examples of it in public online networks. such networks include social media sites appropriated and repurposed to fit scholarly objectives (veletsianos, ; veletsianos, in press) or social networking sites specifically targeting scholars. examples of the former include video-sharing sites (e.g., youtube), micro-blogging and social networking services (e.g., twitter), question-answering services (e.g., quora), and blogs, and specific instances of scholarly use include: semingson ( ) who has recorded and shared a selection of her reading and literacy lectures on youtube and conole ( ) who shared in-progress drafts of her book and received comments and feedback from colleagues on its content. examples of the latter include environments such as academia.edu, cloudworks, and mendeley, and specific instances of scholarly use include the use of mendeley by the language learning and social media project ( ) to develop an open bibliography enabling distributed individuals to share manuscripts related to a topic of shared interest. it is important to note here that, while some of these networks might be designed with specific uses in mind (e.g., twitter as an information-sharing tool), individual scholars might use them in unanticipated ways. 
for example, veletsianos (in press) notes that non-scholarly social interaction on twitter provides opportunities for discovering shared interests and igniting opportunities for scholarly collaboration. understanding the historical relationship between technology and scholarship, and the emerging stimuli that exert pressure on scholarship, may help us explain the reasons networked participatory scholarship came into being. we turn to these issues next.
the shared history of scholarship and technology
by considering how technology has influenced the development of scholarship into its current state, we may gain insights into how emerging technologies might further propel the field of scholarly work into new directions. over seventy-five years ago, binkley ( ) noted how mass publishing had largely changed the culture of scholarly work and argued that several centuries before, with the invention of the printing press, scholarly materials that had largely been inaccessible to those interested in doing scholarly work were made readily available to both professional and amateur scholars alike. whereas the canon of scholarly resources had previously only been held in monasteries and libraries, the printing press made duplication of those materials so easy that they soon became accessible to a far larger group of scholars than was previously possible. through the technological innovation of printing “it became possible for the moderately wealthy man to possess what previously only princes or great religious establishments could afford – a fairly complete collection of the materials he desired” (binkley, , ¶ ). improved access, however, was reversed in the early nineteenth century by the mass publishing of scholarly work.
whereas in the past the “moderately wealthy” could keep pace with “princes” and “religious establishments” by purchasing common scholarly works, due to the exponential growth in scholarly works, the only scholars who could keep pace with the mass publishing of specialized field data were those who had direct access to a university with resources sufficient to continually purchase recurring publications that were extremely diverse and specialized. thus, binkley ( , ¶ ) explains that “the qualities of the printing process that began in the fifteenth century to make things accessible have now begun in our different circumstances to make them inaccessible”, leading to the death of the “amateur scholar” and the shift of research from the realm of an “honored sport” to that of an “exclusive profession”. ultimately, though binkley’s heralding of “revolutionary” technological innovations such as microphotography and “near-print” replication may not have panned out as he had anticipated, binkley’s analysis shows that the culture of scholarship has historically been refined, or even changed, by technological innovation and that certain, though not all, technological innovations have the capacity to lead to a fundamental rethinking of how research is done. it may not, however, be the case that this rethinking occurs in anticipated ways. upon revisiting binkley’s work twelve years later, tate ( ) points out that though microphotography did not lead to the rebirth of the amateur scholar, as binkley had hoped and anticipated, it did create new issues in scholarship that would fundamentally change scholarly practice. for instance, given the “oceans of documentation” which emerged from some of binkley’s technologies, researchers found themselves “confronted” and “confounded” by the amount of data now available to them, which led to a rethinking of the role of scholarly aides as guided assistants who waded through the vast number of available reports and articles.
such rethinking is ongoing. for instance, peacock, robertson, williams, and clausen ( ) predict that learning technologists’ roles will expand to include support for faculty who conduct technology-enhanced research (e.g., virtual focus groups).
as digital technologies have emerged, and have become increasingly prevalent in recent years, we have begun to witness similar transformations in the ways that research teams use networking technologies to share and collaborate, publishers use online spaces to collect, review, and disseminate research articles, and educators use social media and networking technologies to enhance various aspects of their teaching. as technologies change and cultures shift, so too do the literacies and skills necessary to operate in professional contexts. for instance, jenkins, purushotma, weigel, clinton, and robinson ( ) suggest that due to the proliferation of emerging technologies and their effects on the world, in order to successfully participate in the world, individuals need to develop a new set of competencies that include skills such as appropriation, transmedia navigation, and networking. as scholars similarly find themselves confronted with the challenges of emerging technologies and shifting cultures, they too are being led to adapt and acquire new competencies in order to function in their changing world. we should be careful, however, in attributing causation to technology with regard to shifts in scholarly culture. to illustrate, in anthropological studies, a piece of technology (e.g., pottery, printing press, radio, computer, youtube, etc.) can be used as a reference for gaining a greater understanding of a particular culture and must be understood as a co-evolutionary artifact with other aspects of culture like language and social behavior (pfaffenberger, ).
similarly, in the current discussion, technology may just as validly be seen as a reflection of cultural trends as a cause of them. in the case of the printing press discussed above, for instance, it may just as arguably be stated that the printing press came about as a result of a widespread cultural belief in the value of accessibility as the reverse. thus, inferring causality between technology and culture remains a fuzzy issue. as a more modern example, solum ( ) argues that the growing practice of legal blogging is an effect – a symptom of how legal scholarship has already changed – and not a cause of cultural changes. similarly, rather than asking how emerging technologies will transform the culture of education scholarship, we could ask what the emergence and use of such tools as facebook, twitter, wordpress, ipads, smart phones, and so forth reveals about scholars in both a cultural sense (with regard to how knowledge in our culture has come to be acquired, tested, validated, and shared) as well as within the subculture of the university. though it is not the goal of this paper to definitively answer causal questions regarding the relationship between technology and culture, we will continue on the premise that the two influence and reflect one another in a complicated way.
emergent techno-cultural stimuli exerting pressure on scholarly practice
if we are to understand how networked participatory scholarship (nps) is materializing today, we need to recognize that current trends in the dominant technophilic consumer culture, discussions within scholarly subcultures, and developments in journal publishing point to a deep-rooted rethinking of some fundamental beliefs upon which scholarly structures are built. we examine these trends individually in the sections that follow and discuss how these emergent factors exert pressures for nps adoption and the rethinking of scholarly practice.
in the dominant culture
much has changed in the world since the widespread introduction of the internet and, later, the participatory web. these changes have affected how we make and spend money, how we communicate, how we work, how we collaborate, how we play, how we create and sustain relationships, how we talk (e.g., “google it.”), and how we find and validate information. jenkins et al. ( ), for instance, note that we live in a participatory culture, or, a society in which, empowered with participatory technologies, the consumer no longer passively receives information, media, and artifacts, but also produces them. what are the implications for scholarly practice when everyone is able to contribute information on a massive scale using tools such as twitter, youtube, and facebook? within scholarly circles, the effects of these changes have been experienced in varying degrees. one of the most important of these changes, insofar as networked participatory scholarship is concerned, relates to an emergent emphasis upon collaborative work in the form of ‘collectives’ or aggregations of the actions of individuals that are organized in a complex manner to benefit those individuals (dron & anderson, ). as bull et al. ( , p. ) explain, online “[c]ollaborative projects such as wikipedia demonstrate that a previously unexploited collective intelligence can be tapped when the right conditions are established”, and the resultant collective artifacts of these exploits have the potential of spurring innovation. as a culture, we have quickly found great value in online collaborative projects. the english wikipedia alone, for instance, boasts a collection of . million articles collectively written by distributed individuals (wikipedia, ), and it has consistently remained in the list of the top ten most visited sites on the internet (alexa top global sites, ).
firefox, as another example, is a community-developed web browser that is currently the most popular web browsing software in the world (browser statistics, ). further, even though lay users may not explicitly recognize other collective software products which they use on a daily basis, by virtue of the fact that the average internet user employs web server technologies to open web pages and to access content, we, as an internet-using community, have further found great implicit value in other open and collaborative projects like apache, gnu/linux, php, mysql, and python which are persistent in web server environments. though such collective projects may have been initiated by a relatively small number of technological savants, collectivist models of development and production have diffused into a multiplicity of realms. “wiki”, for instance, has quickly become a common word as several platforms (e.g., mediawiki, pbwiki, etc.) have emerged and been adopted as valuable information-sharing platforms. a further outgrowth of this phenomenon can be seen more generally in the emerging interest in many fields to study the development and growth of online networks and communities, by which we seek to understand the reality and implications of our interdependence (briggle & mitcham, ). though scholarly practice that utilizes collective ways of thinking may be difficult to find, a few examples have recently surfaced:
• timothy gowers used his blog as a platform to engage numerous individuals in producing ideas and solutions to a complex mathematical problem, generating substantial contributions from individuals, and announcing a proof of the problem approximately a month and a half after the inception of the project (gowers & nielsen, ).
• a team of ichthyologists called on their facebook contacts (the majority of whom held doctorates in ichthyology-related fields) who helped identify more than % of the fish specimens collected in the cuyuni river of guyana (smithsonian science, ).
• alec couros taught an online course in fall that was entitled “social media and open education” that was available to non-credit participants for free. couros asked colleagues who were online to help him in teaching the students who expressed interest in enrolling as non-credit students by acting as online network mentors and actively supporting these students. within a few days, individuals volunteered to serve as mentors (a. couros, personal communication, june , ) and collectively aided couros in teaching the non-credit students (couros, ).
though causal relationships between technological innovation and culture may be unclear, the examples above indicate that there seems to be a case for arguing that technological innovation and the way technologies are used in the larger culture influence various subcultures (e.g., academic publishers, research communities, etc.). for instance, as individuals connect via online social networking sites such as facebook, this connectedness may lead researchers to utilize their connections for improving scholarly practice (as in the case of the team of ichthyologists mentioned above). to illustrate using a historical example, though the emergence of the printing press may not have reflected the value systems of all scholarly subcultures (e.g., some may have been interested in keeping knowledge sources restricted to elite groups), it could be said that its emergence did reflect the dominant culture of the time (i.e., the common people who were interested in gaining access to knowledge sources), which then influenced elite subcultures.
likewise, it could be argued that though the emergence of technology-driven activities like blogging, social bookmarking, and social networking may not reflect the culture of university scholarship, they might very well reflect aspects of the dominant culture, which then gains power, via the tool, to influence scholarly cultures. thus, though the relationship between the dominant culture, technology, and subcultures may be ill-understood and extraordinarily complex, it is important to recognize that there is an interplay between the three by which changes in the dominant culture or technology may either reflect or influence transformation in the subculture in a complex and negotiated manner. we should emphasize, however, that scholarly work does not exist in a vacuum and that how we view scholarship as a society changes in conjunction with a variety of other factors (e.g., technological innovations, dominant cultural narratives, etc.), which are currently and continually in a state of flux.
amongst scholars
for this reason, our understanding of scholarship has been in a state of transformation in recent years. to illustrate, researchers have recently asked foundational questions about the nature of education scholarship as they have reflected on the pursuits of educational scholars (e.g., berliner, ; bulterman-bos, ; capraro & thompson, ; labaree, ) and issued calls for a broader vision of scholarly activity and what it means to be a scholar in general (boyer, ; pellino, blackburn, & boberg, ). such evaluations of current scholarly practice may be the result of a fundamental re-conceptualization of scholarship that seeks to move away from emphases on disembodied, autonomous practice to community-conscious approaches (briggle & mitcham, ; buckley & du toit, ).
within the realm of learning theory, a preparatory shift for this realization has gradually come as objectivist epistemologies and behaviorist learning theories have made way for constructivist and socio-constructivist views, which hold that knowledge is constructed in the mind of the learner and, as such, cannot exist independently of knowers (jenkins, ; lowenthal & muth, ). this transformed view of the mind from a disembodied and objectivist reasoning tool to an embodied, experiential, and social faculty calls into question the validity of monastic scholarly practices which attempt to disassociate the mind, knowledge, and research from social experience. this view paves the way for rethinking how scholarly knowledge is acquired, expanded, and validated given the embodied, social nature of human experience. nevertheless, we should be clear that even though such embodied practice is present in some aspects of academe, it does not represent the dominant academic culture. further, emergent learning approaches which seek to account for increasingly important aspects of social experience in a connected, digital world are coming to the forefront of learning theory discussions. according to connectivist views, for instance, learning is a negotiated, interconnected, cross-disciplinary, and inherently social process within complex environments (siemens, , ). though many of these ideas regarding learning are not new (kop & hill, ) and have been discussed by vygotsky ( ), lave and wenger ( ), and others, they are, nonetheless, finding growing interest amongst practitioners and researchers as evidenced by increasing offerings of freely available online courses dubbed massively open online courses (parry, ).
such approaches are noteworthy for the mere reason that they break away from norms of th century university scholarship with regard to fundamental epistemological questions regarding what knowledge is, how it is gained, how it is verified, how it is shared, and how it should be valued. these epistemological reframings of learning take form in scholarly practice in a variety of ways, but they are perhaps most noticeable in how scholars are increasingly beginning to question many heretofore non-negotiable artifacts of the th century scholarly world. peer review and online education are prime examples of such artifacts. peer review is the first example of how seemingly non-negotiable scholarly artifacts are currently being questioned: while peer review is an indispensable tool intended to evaluate scholarly contributions, empirical evidence questions the value and contributions of peer review (cole, cole, & simon, ; rothwell & martyn, ), while its historical roots suggest that it has served functions other than quality control (fitzpatrick, ). on the one hand, neylon and wu ( , p. ) eloquently point out that “the intentions of traditional peer review are certainly noble: to ensure methodological integrity and to comment on potential significance of experimental studies through examination by a panel of objective, expert colleagues”, while scardamalia and bereiter ( , p. ) recognize that “like democracy, it [peer-review] is recognized to have many faults but is judged to be better than the alternatives”. yet, peer review’s harshest critics consider it an anathema. 
casadevall and fang ( ), for instance, question whether peer review is in fact a subtle cousin of censorship that relies heavily upon linguistic negotiation or grammatical “courtship rituals” to determine value, instead of scientific validity or value to the field, while boshier ( ) argues that the current, widespread acceptance of peer review as a valid litmus test for scholarly value is a “faith-” rather than “science-based” approach to scholarship, citing studies in which peer review was found to fail in identifying shoddy work and to succeed in censoring originality. the challenge for scholarly practice is to devise review frameworks that are not just better than the status quo, but systems that take into consideration the cultural norms of scholarly activity, for if they don’t, they might be doomed from their inception. a recent experiment with public peer review online at nature, for example, revealed that scholars exhibited minimal interest in online commenting and informal discussions, with findings suggesting that scholars “are too busy, and lack sufficient career incentive, to venture onto a venue such as nature’s website and post public, critical assessments of their peers’ work” (nature, , ¶ ). shakespeare quarterly, a peer-reviewed scholarly journal founded in conducted a similar experiment in (rowe, ). while the trial elicited more interest than the one in nature with more than individuals contributing who, along with the authors, posted more than comments, the experiment further illuminated the fact that tenure considerations impact scholarly contributions. cohen ( ) reported that “the first question that alan galey, a junior faculty member at the university of toronto, asked when deciding to participate in the shakespeare quarterly’s experiment was whether his essay would ultimately count toward tenure”.
considering the reevaluation of such an entrenched and centripetal structure of scholarly practice as peer review, along with calls for recognizing the value of diverse scholarly activities (pellino et al., ), such as faculty engagement in k– education (foster et al., ), we find that the internal values of the scholarly community are shifting in a direction that may be completely incompatible with some of the seemingly non-negotiable elements of th century scholarship. online education is the second example of how seemingly non-negotiable scholarly artifacts are currently being questioned: online education has traditionally been organized and supported through learning management systems (lms), largely as a result of these systems offering opportunities for organization and efficiency (lee & mcloughlin, ). while lms are popular in higher education settings, with one survey indicating that more than % of responding higher education institutions in the united states use an institutional lms (green, ), scholars have begun questioning whether organization and efficiency should be the guiding principles for integrating technology in their classroom and have begun reflecting on the constraints that these systems impose upon students’ learning experiences and opportunities (veletsianos & navarrete, in press). for instance, lms may hinder pedagogical choice through their default settings and familiar affordances (lane, ). as a result, a number of scholars have begun using tools that reside outside the control of their home institution and have employed them in the service of teaching (e.g., social networking sites, self-hosted blogs). at the same time, we have seen these tools used in the emergence of open online courses. while traditional online courses are most frequently organized as groups (dron & anderson, ), a number of open online courses are organized as networks.
the distinctions between the two are important, because they help illuminate practices that align with networked participatory scholarship. groups are structured around particular tasks, encompass designated roles and hierarchies, include access controls, and are tightly knit (e.g., group members know each other’s names). networks, on the other hand, are fluid organizational structures in which participation consists of distributed individuals connected in loose and strong ties, membership is mostly unrestricted, and participants may know some but not all members of the network. how does a course structured as a network differ from a course structured as a group, and why is this an example of nps? in courses organized around networks, course materials are made available to participants (both within and outside of the institution) who then have the ability to self-directedly create networks with other participants to achieve shared learning goals (bell, ). individuals define their participation and their learning goals, and course activity occurs in distributed online fora. the use of social technologies that networked courses employ goes far beyond merely allowing people to “take a course online” and empowers learners to participate in self-defined ways. this type of online course breaks away from the norm of th century university scholarship by positioning knowledge around social connections rather than around content, enabling scholars to re-envision teaching, instruction, their role as teachers, and the ways that knowledge is acquired in modern society. individual scholars’ networked participatory scholarship practices further illustrate this point: in recent years numerous scholars have engaged in using technological tools in their research, classrooms, and personal lives in ways that differ from th century paradigms of scholarship (katz, ; kirkup, ).
early adopters continue to use these tools despite incompatibilities with social or institutional structures, because they recognize how such tools have the power to support, amplify, or transform their scholarship in positive ways (katz, ). for instance, engaging in nps via such tools as blogs and online social networks may enable scholars to remain current in their research field, explore new approaches to teaching via networking with colleagues, interact with individuals mentioning their research/work, and expose their work to larger audiences. consider how scholars use blogs to support scholarly endeavors. prior research has identified that blogs are used (a) as debate platforms for scholars who seek to live as public intellectuals, (b) for recording and sharing logs of “pure” research, and (c) as a sort of tongue-in-cheek (often pseudonymous) water cooler around which critical discussions of the scholarly experience can occur (kirkup, ; walker, ). scholars’ uses of blogs have also extended beyond research to include teaching endeavors such that it has become necessary to establish frameworks for understanding the educational affordances that blogs may offer (deng & yuen, ). in a similar manner, scholars have been drawn to social networking sites (sns) in recent years for a variety of reasons. in a recent survey, moran, seaman, and tinti-kane ( ) found that nearly % of faculty surveyed had posted to an sns in the past month and that over % had visited an sns in that time. with such prevalent use of snss, it is no wonder that faculty are beginning to consider how snss can be used to help them teach (cho, gay, davidson, & ingraffea, ; dunlap & lowenthal, ). veletsianos (in press), for instance, found that early twitter adopters used twitter to make instructional information and resources available to non-students and provided students with opportunities to interact with professional communities outside of the classroom.
in each of these cases, we see an ideological shift occurring amongst scholars from established frameworks of academic scholarship and discourse toward structures that are more participatory and empowering, as nps participation in social media allows the scholar to connect with others (e.g., other scholars, practitioners, and the general public) in ongoing discussion and reflection. given these growing phenomena, we should ask: why might scholars be interested in engaging such audiences? through ethnographic interviews, nardi, schiano, and gumbrecht ( ) found that bloggers use their blogs to “( ) update others on activities and whereabouts, ( ) express opinions to influence others, ( ) seek others’ opinions and feedback, ( ) ‘think by writing’, and ( ) release emotional tension”. if this pattern holds true for scholars, then it seems safe to say that a growing number of scholars, as evidenced by an ever-growing number of scholarly blogs (kjellberg, ), are interested in connecting their research with their identities. such a connection may serve to frame their research in a way that is increasingly embodied, experiential, and social, as scholars and faculty members use participatory technologies to circumvent established systems that are neither designed to value nor equipped to support such approaches to reflection and inquiry. couple this with solum’s argument ( ) that the emergence of blogging is a symptom of changing trends in societal thought and values, and it follows that though blogging may not be transforming scholarship per se, growth in academic blogging may reflect a changing set of values amongst many scholars regarding their profession. thus, though participatory technologies may not necessarily serve as catalysts for changing scholarly norms, their growing use by scholars expresses that the current norms of scholarship are in a state of change.
within scholarly journals
even with such changing definitions of scholarship, a discussion on scholarly practice inevitably turns to outlets of scholarly work: the valued media by which scholars connect with the culture that values their work. technological innovation and cultural shifts have had, and continue to have, an impact on scholarly journals, and developments in this domain parallel the nps practices that we have described above. these developments can be summarized in three related themes. first, we have seen a transition from print-only journals to print and online journals. second, open access publishing has experienced increasing interest. third, researchers and institutions have sought new ways to evaluate the impact and reach of scholarly work. these issues are examined in detail next. the dawn of the digital age has had a marked influence on print publishing as stakeholders have realized the benefits afforded by digital dissemination. for example, scholars can access scholarly work published in electronic outlets, such as digital databases, more efficiently, and publishers can make scholarly work available faster than if the work were published in print-only form. in interviewing authors who disseminated their books online for free, hilton and wiley ( ) also found that authors perceived that this act enabled them to reach a greater and wider audience without negatively impacting the sales of their books. additionally, digital publishing enables alternative forms of content in scholarly work including dynamic content, visualizations, and multimedia integration, such as audio or video interviews (pearce et al., ). the transition to online journals, however, has had further influences on access and journal usage; reports from electronic journal introductions, for instance, indicate that print journal usage has decreased significantly after the introduction of online journals (de groote & dorsch, ; rogers, ).
since the development of the printing press and through the transition to online journals, scholars have embraced methods of broad dissemination of their work. cultural shifts, such as the open access (oa) movement, have shown promise for democratizing access to knowledge and exerted significant pressures on academic publishing. the budapest open access initiative ( , ¶ ) defines oa as literature which is made available for free online “permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself”. since the launch of the educational resources information center (eric) in the u.s. in as a medium for providing access to education research and information, oa has continued to evolve and has received increased interest in recent years, evidenced in part by the recent wave of higher education institutions passing open access resolutions and mandates requesting faculty to share their scholarly work in an open access manner (harnad, ). for instance, a recent study of a random sample of peer-reviewed publications found that . % of them were available for free online either through the publisher’s website or through a web search (björk et al., ). as we have already noted, cultural and technological shifts are difficult to differentiate because they influence each other. in the case of oa, however, it would have been physically impossible for scholars to make their work available in an oa manner during the age of print publishing. digital publication provides scholars with the ability to disseminate work without physical or economic barriers.
currently, the number of reputable and peer-reviewed oa journals in the field of education is limited, even though such journals are quickly becoming viable options for scholars to consider (furlough, ). scholars have proposed that numerous benefits can be derived from publishing their work in ways that align with the spirit of open access. empirical investigations comparing oa and non-oa academic journals indicate that (a) oa publications tend to be cited more heavily than non-oa (noa) publications and (b) there is no evidence to suggest that oa publishing harms citations. for instance, hajjem, harnad, and gingras ( ) evaluated . million articles published in ten disciplines between and and found that noa papers that were self-archived have had more citations than papers that were not self-archived. eysenbach ( ) reached a similar conclusion in a longitudinal analysis of paper citations, when he found that oa papers were more likely to be cited than noa papers. zawacki-richter, anderson, and tuncay ( ) compared six oa and six noa journals in the field of distance education and found no significant differences in terms of citation counts between the two. additionally, empirical evidence relating to citation metrics indicates that oa articles may be cited earlier than noa articles (eysenbach, ; zawacki-richter et al., ), suggesting that oa may allow faster access to scholarly work and thereby accelerate scholarly dissemination and development. finally, as researchers and institutions seek new ways to evaluate the impact and reach of scholarly work, the field has seen renewed emphasis on journal impact factors, article-level metrics (such as citation counts), and journal quality. 
while scholarly publishing has traditionally been evaluated in terms of citation counts and the quality of the journal in which a paper was published (goodyear et al., ), varied technology-informed metrics have recently been proposed in an attempt to more fully capture the influence of scholarly work. for instance, the public library of science ( ) has begun publishing a variety of metrics for each of their publications including article usage statistics (e.g., page views), comments/notes/ratings left by article readers, and blog posts citing published articles. priem and hemminger ( ) call attention to scientometrics . as the idea of using social media to examine journal article use and citations in the participatory nature of the web. such data may help scholars gain a firm understanding of the impact of their scholarship and outreach, provide transparency to the research community, and allow richer depictions of a scholar’s influence and impact. nevertheless, despite the opportunities that participatory technologies present for scholarly dialog, neylon and wu ( ) indicate that papers published in science-related journals with online commenting platforms exhibit a low volume of comments. the issues, these authors suggest, are partly social as scholars (a) lack incentives to spend the time to post comments on online publications and (b) may be unsure of what is appropriate to post in these emergent fora.

. conclusion

in this paper, we explored the meaning of networked participatory scholarship and the historical relationship between technology and scholarly endeavors. we then discussed cultural and technological trends influencing scholars to adopt networked participatory practices and factors impacting the rejection of digital practices, thereby attempting to illuminate the complexity of the issues involved.
we have claimed that the emergence of networked participatory scholarship as a practice has extensive implications for scholars, scholarship, and academic institutions and that the cultural shifts underpinning such a transition are an important dimension in any discussion surrounding higher education and scholarship. what, then, is the role of the scholar in the participatory age? the discussion and observations outlined above suggest that emergent practice in networked participatory scholarship is still largely in a phase of ongoing development within the larger, ever-fluctuating profession of scholarship. whether they recognize it or not, scholars are part of a complex techno-cultural system that is ever changing in response to both internal and external stimuli, including technological innovations and dominant cultural values. though such an understanding may lead to a certain level of trepidation regarding the shape of scholarship’s uncertain future, we should take an active role in influencing the future of scholarship and establishing ourselves as productive participants in an increasingly networked and participatory world.

g. veletsianos, r. kimmons / computers & education ( ) –

acknowledgements

the authors would like to thank the anonymous reviewers for their feedback and suggestions. an earlier version of this manuscript was shared as a “discussion paper” with the instructional technology forum (itforum) during april , and the authors would like to thank itforum participants and moderators for their feedback.

references

alexa top global sites. ( ). retrieved on . . , from. http://www.alexa.com/topsites. alexander, k. ( ). balancing the challenges of today with the promise of tomorrow: a presidential perspective. in m. d’ambrosio, & r. ehrenberg (eds.), transformational change in higher education: positioning colleges and universities for future success (pp. – ). cheltenham, uk: edward elgar publishing. andersen, d. ( ).
digital scholarship in the tenure, promotion, and review process. new york, ny: m.e. sharpe. ayers, e. ( , january ). doing scholarship on the web: years of triumphs and a disappointment. the chronicle of higher education, retrieved from. http://chronicle. com/article/doing-scholarship-on-the-web-/ . bell, f. ( ). connectivism: its place in theory-informed research and innovation in technology-enabled learning. international review of research in open and distance learning, ( ), – . berge, z., & collins, m. ( ). computer-mediated scholarly discussion groups. computers & education, ( ), – . berliner, d. ( ). educational research: the hardest science of all. educational researcher, ( ), – . binkley, r. c. ( ). new tools for men of letters. the yale review, , – . björk, b.-c., welling, p., laakso, m., majlender, p., hedlund, t., & guðni, g. ( ). open access to the scientific journal literature: situation . plos one, ( ), e . borgman, c. ( ). scholarship in the digital age: information, infrastructure, and the internet. hong kong: mit press. boshier, r. ( ). why is the scholarship of teaching and learning such a hard sell? higher education research & development, ( ), – . boyer, e. ( ). scholarship reconsidered: priorities for the professoriate. princeton, nj: the carnegie foundation for the advancement of teaching. briggle, a., & mitcham, c. ( ). embedding and networking: conceptualizing experience in a technosociety. technology in society, ( ), – . browser statistics. ( ). web statistics and trends. retrieved on . . , from. http://www.w schools.com/browsers/browsers_stats.asp. buckley, s., & du toit, a. ( ). academics leave your ivory tower: form communities of practice. educational studies, ( ), – . budapest open access initiative. ( ). retrieved on . . from. http://www.soros.org/openaccess/read.shtml. bull, g., thompson, a., searson, m., garofalo, j., park, j., young, c., et al. ( ). 
connecting informal and formal learning experiences in the age of participatory media. contemporary issues in technology and teacher education, ( ), – . bulterman-bos, j. a. ( ). will a clinical approach make education research more relevant for practice? educational researcher, ( ), – . burton, g. ( ). the open scholar. blog entry in academic evolution. retrieved on . . from. http://www.academicevolution.com/ / /the-open-scholar.html. capraro, r. m., & thompson, b. ( ). the educational researcher defined: what will future researchers be trained to do? the journal of educational research, ( ), – . casadevall, a., & fang, f. c. ( ). is peer review censorship? infection and immunity, ( ), – . cho, h., gay, g., davidson, b., & ingraffea, a. ( ). social networks, communication styles, and learning performance in a cscl community. computers & education, ( ), – . chong, e. k. m. ( ). using blogging to enhance the initiation of students into academic research. computers & education, ( ), – . cohen, l. ( ). social scholarship on the rise. blog entry posted to library . : an academic’s perspective. retrieved on . . from. http://liblogs.albany.edu/library / / /social_scholarship_on_the_rise.html. cohen, p. ( ). scholars test web alternative to peer review. the new york times, retrieved on . . from. http://www.nytimes.com/ / / /arts/ peer.html? pagewanted¼ . conole, g. ( ). book: designing for learning in an open world. retrieved on . . from. http://cloudworks.ac.uk/cloudscape/view/ . cole, s., cole, j., & simon, g. ( ). chance and consensus in peer review. science, , – . couros, a. ( ). call for network mentors – follow-up. retrieved on . . from. http://educationaltechnology.ca/couros/ . de groote, s. l., & dorsch, j. l. ( ). online journals: impact on print journal usage. bulletin of the medical library association, ( ), – . deng, l., & yuen, a. ( ). towards a framework for educational affordances of blogs. computers & education, ( ), – . dron, j., & anderson, t. ( ). 
how the crowd can teach. in s. hatzipanagos, & s. warburton (eds.), handbook of research on social software and developing ontologies (pp. – ). london: igi global. dunlap, j., & lowenthal, p. ( ). tweeting the night away: using twitter to enhance social presence. journal of information systems education, ( ), – . eysenbach, g. ( ). citation advantage of open access articles. plos biology, ( ), e . fitzpatrick, k. ( ). planned obsolescence: publishing, technology, and the future of the academy. new york, ny: new york university press. foster, k., bergin, k., mckenna, a., millard, d., perez, l., prival, j., et al. ( ). partnerships for stem education. science, , – . furlough, m. ( ). open access, education research, and discovery. the teachers college record, ( ), – . goodyear, r. k., brewer, d. j., gallagher, k. s., tracey, t. j. g., claiborn, c. d., lichtenberg, j. w., et al. ( ). the intellectual foundations of education: core journals and their impacts on scholarship and practice. educational researcher, ( ), – . gowers, t., & nielsen, m. ( ). massively collaborative mathematics. nature, ( ), – . green, k. ( ). campus computing, : the st national survey of computing and information technology in u.s. higher education. retrieved on . . from. http:// www.campuscomputing.net/summary/ -campus-computing-survey. greenhow, c., robelia, b., & hughes, j. e. ( ). learning, teaching, and scholarship in a digital age: web . and classroom research: what path should we take now? educational researcher, ( ), – . greenhow, c. ( ). social scholarship: applying social networking technologies to research practices. knowledge quest, ( ), – . hajjem, c., harnad, s., & gingras, y. ( ). ten-year cross-disciplinary comparison of the growth of open access and how it increases research citation impact. ieee data engineering bulletin, ( ), – . harnad, s. ( ). waking oa’s "slumbering giant": the university’s mandate to mandate open access. new review of information networking, ( ), – . 
hilton, j., & wiley, d. ( ). free: why authors are giving books away on the internet. tech trends, ( ), – . hughes, j., thomas, r., & scharber, c. ( ). assessing technology integration: the rat – replacement, amplification, and transformation framework. in c. crawford (ed.), proceedings of society for information technology & teacher education international conference (pp. – ). chesapeake, va: aace. jenkins, h., purushotma, r., weigel, m., clinton, k., & robinson, a. j. ( ). confronting the challenges of participatory culture: media education for the st century. cambridge, ma: the mit press. jenkins, j. ( ). constructivism. in f. w. english (ed.), encyclopedia of educational leadership and administration (pp. – ). thousand oaks, ca: sage reference. hutchings, p., & shulman, l. ( ). the scholarship of teaching: new elaborations, new developments. change, ( ), – . katz, r. ( ). scholars, scholarship, and the scholarly enterprise in the digital age. educause review, ( ), – . kiernan, v. ( ). rewards remain dim for professors who pursue digital scholarship. the chronicle of higher education, retrieved on . . from. http://chronicle.com/ article/rewards-remain-dim-for/ . king, k. ( ). educational technology professional development as transformative learning opportunities. computers & education, ( ), – . 
kirkup, g. ( ). academic blogging: academic practice and academic identity. london review of education, ( ), – . kjellberg, s. ( ). i am a blogging researcher: motivations for blogging in a scholarly context. first monday, ( ). kop, r., & hill, a. ( ). connectivism: learning theory of the future or vestige of the past? international review of research in open and distance learning, ( ), – . kumashiro, k., pinar, w., graue, e., grant, c., benham, m., heck, r., et al. ( ). thinking collaboratively about the peer-review process for journal-article publication. harvard educational review, ( ), – . labaree, d. f. ( ). the peculiar problems of preparing educational researchers. educational researcher, ( ), – . lane, l. ( ). insidious pedagogy: how course management systems impact teaching. first monday, ( ). language learning and social media project ( ). mendeley group. retrieved on . . from http://www.mendeley.com/groups/ /language-learning-social-media/. lave, j., & wenger, e. ( ).
situated learning: legitimate peripheral participation. cambridge, uk: cambridge university press. lee, m. j. w., & mcloughlin, c. ( ). beyond distance and time constraints: applying social networking tools and web . approaches to distance learning. in g. veletsianos (ed.), emerging technologies in distance education, (pp. – ). edmonton, ab: athabasca university press. lowenthal, p., & muth, r. ( ). constructivism. in e. f. provenzo (ed.), encyclopedia of the social and cultural foundations of education (pp. – ). thousand oaks, ca: sage publications inc. martindale, t., & wiley, d. a. ( ). using weblogs in scholarship and teaching. tech trends, ( ), – . moran, m., seaman, j., & tinti-kane, h. ( ). teaching, learning, and sharing: how today’s higher education faculty use social media for work and for play. pearson learning solutions. retrieved on . . from. http://www.pearsonlearningsolutions.com/blog/ / / /teaching-learning-and-sharing-how-todays-higher-education-facutly- use-social-media/. nardi, b. a., schiano, d. j., & gumbrecht, m. ( ). blogging as social activity, or, would you let million people read your diary? in proceedings of the acm conference on computer supported cooperative work, (pp. – ). new york: acm press. nature. ( ). peer review and fraud. nature, , – . neylon, c., & wu, s. ( ). article-level metrics and the evolution of scientific impact. plos biology, ( ), e . oblinger, d. g. ( ). from the campus to the future. educause review, ( ), – . palmer, c., teffeau, l., & pirmann, c. ( ). scholarly information practices in the online environment: themes from the literature and implications for library service development. retrieved on . . from. http://www.oclc.org/programs/publications/reports/ - .pdf report commissioned by oclc research. parry, m. ( ). u. of illinois at springfield offers new ‘massive open online course’. the chronicle of higher education, retrieved on . . from. 
http://chronicle.com/blogs/ wiredcampus/u-of-illinois-at-springfield-offers-new-massive-open-online-course. peacock, s., robertson, a., williams, s., & clausen, m. g. ( ). the role of learning technologists in supporting e-research. research in learning technology, ( ), – . pearce, n., weller, m., scanlon, e., & kinsley, s. ( ). digital scholarship considered: how new technologies could transform academic work in education. in education, ( ). pellino, g., blackburn, r., & boberg, a. ( ). the dimensions of academic scholarship: faculty and administrator views. research in higher education, ( ), – . pfaffenberger, b. ( ). social anthropology of technology. annual review of anthropology, ( ), – . priem, j., & hemminger, b. h. ( ). scientometrics . : new metrics of scholarly impact on the social web. first monday, ( ). public library of science. ( ). article level metrics. retrieved on . . from. http://article-level-metrics.plos.org/. purdy, j., & walker, j. ( ). valuing digital scholarship: exploring the changing realities of intellectual work. profession, , – . rhoads, r., & liu, a. ( ). globalization, social movements, and the american university: implications for research and practice. in j. smart (ed.), higher education: handbook of theory and research, xxiv (pp. – ). new york, ny: springer-verlag. rogers, a. ( ). electronic journal usage at ohio state university. college & research libraries, ( ), – . rothwell, p. m., & martyn, c. n. ( ). reproducibility of peer review in clinical neuroscience: is agreement between reviewers any greater than would be expected by chance alone? brain, , – . rowe, k. ( ). from the editor: gentle numbers. shakespeare quarterly, ( ), iii–vii. scardamalia, m., & bereiter, c. ( ). pedagogical biases in educational technologies. educational technology, ( ), – . semingson, p. ( ). big ideas on doing an "interactive read-aloud" when reading to children. retrieved on . . from. http://www.youtube.com/watch?v¼hlroolqhuvs youtube video clip. 
siemens, g. ( ). connectivism: a learning theory for the digital age. journal of instructional technology and distance learning, ( ). siemens, g. ( ). knowing knowledge. vancouver, bc, canada: lulu publishers. siemens, g., & matheos, k. ( ). systemic changes in higher education. in education, ( ). smithsonian science. ( ). facebook friends help scientists quickly identify nearly fish specimens collected in guyana. retrieved on . . from. http:// smithsonianscience.org/ / /facebook-friends-help-scientists-quickly-identify-nearly- -fish-specimens-collected-in-guyana/. solum, l. b. ( ). blogging and the transformation of legal scholarship. washington law review, , – . tate, v. d. ( ). from binkley to bush. the american archivist, ( ), – . thagard, p. ( ). collaborative knowledge. noûs, ( ), – . unsworth, j. ( ). scholarly primitives: what methods do humanities researchers have in common, and how might our tools reflect this?. in symposium on humanities computing: formal methods, experimental practice london: king’s college, retrieved on . . , from. http://jefferson.village.virginia.edu/wjmu m/kings. - / primitives.html. veletsianos, g. ( ). a definition of emerging technologies for education. in g. veletsianos (ed.), emerging technologies in distance education, (pp. – ). edmonton, ab: athabasca university press. veletsianos, g. (in press). higher education scholars’ participation and practices on twitter. journal of computer assisted learning. veletsianos, g. & navarrete, c. (in press). online social networks as formal learning environments: learner experiences and activities. the international review of research in open and distance learning. vygotsky, l. ( ). mind in society. london, uk: harvard university press. walker, j. ( ). blogging from inside the ivory tower. in a. bruns, & j. jacobs (eds.), uses of blogs (pp. – ). new york, ny: peter lang. weller, m. ( ). thoughts on digital scholarship. retrieved on . . from. 
http://nogoodreason.typepad.co.uk/no_good_reason/ / /thoughts-on-digital-scholarship.html. wesch, m. ( ). the visions of students today – call for submissions. blog entry in digital ethnography @ kansas state university. retrieved on september . . from http://mediatedcultures.net/ksudigg/?p= . wikipedia, ( ). in wikipedia, the free encyclopedia. retrieved on june , from http://en.wikipedia.org/wiki/wikipedia. zawacki-richter, o., anderson, t., & tuncay, n. ( ). the growing impact of open access distance education journals: a bibliometric analysis. the journal of distance education/revue de l’Éducation à distance, ( ).

sshoc social sciences & humanities open cloud
position statement

table of contents
preamble
introduction
eosc
sshoc
inclusiveness
tech and human centric
what is the research community expecting from eosc?
sshoc data infrastructure for open science
eu wide availability of high quality ssh data
eu wide availability of high quality “cloud ready” ssh tools
sshoc open-source software and service repository – the market place
availability of an eu wide, easy to use ssh open market place
sshoc secured environment for data analysis
eu wide availability of trusted and secure access mechanisms for ssh data
sshoc engagement and communication
data sharing is accepted practice (the “new normal”) among the different ssh communities
state of the art advanced through dedicated ssh data pilots cluster projects
the social sciences and humanities are seamlessly integrated in eosc
what added value will eosc bring to the research?
increase the efficiency and productivity of researchers
contribute to the creation of a cross-border and multi-disciplinary open innovation environment
new discoveries
which main issues should be considered by eosc executive and governing boards?
source image via: https://ec.europa.eu/commission/presscorner/detail/en/speech_ _

preamble

in this document, we address three questions:
1. what is the research community expecting from eosc?
2. what added value will eosc bring to the research?
3. which main issues should be considered by eosc executive and governing boards?

introduction

eosc

eosc is about connecting research data with (e-infrastructure) tools & services, following the fair principles – not just for research data, but also for software and the way science is carried out. eosc is about breaking down silos and providing seamless access to research data, tools and facilities.

ec president ursula von der leyen at the world economic forum, davos, january: “we are creating a european open science cloud now. it is a trusted space for researchers to store their data and to access data from researchers from all other disciplines. we will create a pool of interlinked information, a ‘web of research data’.”

the project aims at realising the transition from the current landscape with disciplinary silos and separated e-infrastructure facilities into a cloud-based infrastructure where data are fair, and tools and training are available for scholars - especially from those domains in the social science and humanities that have adopted a data-driven scientific approach and that have an interest in the innovation and integration of their methodological frameworks.

specific objectives
• build the ssh cloud
• maximise re-use through open science and fair principles
• interconnect existing and new infrastructures
• governance for ssh-eosc

sshoc

the overall objective of the social sciences and humanities open cloud (sshoc) project is to realise the social sciences and humanities’ part of european open science cloud (eosc). the overall impact and end-result of sshoc will be a ssh data ecosystem in which researchers and other interested parties have seamless access to high quality data.
source: blomberg, niklas, & petzold, andreas. ( , january ). esfri thematic cluster view on eosc. zenodo. http://doi.org/ . /zenodo.

inclusiveness

all ssh esfri landmarks and projects, as well as relevant international ssh data infrastructures and the association of european research libraries participate in this project. this will ensure an inclusive approach. moreover, the consortium has the expertise to cover the whole data cycle: from data creation and curation to optimal re-use of data, and can also address training and advocacy to increase actual re-use of data.

tech and human centric

although often overlooked, the community aspect and mutual understanding that we term “human centric” is the key for successful uptake of the infrastructure. therefore, development of synergies and complementarity between involved research infrastructures is of the utmost importance, thus contributing to the development of a consistent ssh research infrastructures ecosystem. community building around the cloud, as well as a strong connection with end users and intermediaries (e.g. research libraries and their institutions), and social networks are essential for its success.

the ssh cloud infrastructure will be distributed, which will be beneficial to its scalability. it will foster data, tools & services in different domains and elaborate on connecting data to increase reusability. the consortium is very well placed to address ssh specific challenges such as the distributed character of its infrastructures, multi-linguality, huge internal complexity of some of the data it deals with, and secured access to sensitive data. the project will pool, harmonise and make easily usable tools & services that allow the research community and other interested users to deal with the vast heterogeneous collections of data available, to process, enrich, analyse and compare it across the boundaries of individual repositories or institutions.
[figure: sshoc project structure]

[figure: data lifecycle (concept, collection, processing, distribution, discovery, analysis). source: ddi alliance]

what is the research community expecting from eosc?

in line with the objectives of open science, the sshoc project will improve access to data and provide tools, enabling new and interdisciplinary research leading to new insights and innovation for society. the overall impact and end-result of sshoc will be a ssh data ecosystem in which researchers and other interested parties have seamless access to high quality data and this system will be integrated in the european open science cloud.

sshoc data infrastructure for open science

eu wide availability of high quality ssh data

sshoc will develop shared web-based tools and services to assist data producers at different stages of the data lifecycle to collect, process, archive and share high-quality cross-national research data and metadata in a cost-effective and streamlined way. secondly, sshoc will explore how data producers can employ technology to add value to existing ssh primary data collections by, for example, automating the gathering and incorporation of contextual information. such innovations will encourage re-purposing and reuse of data across disciplines.
it will benefit data users (analysts) and policy makers as well as the data producers who are able to deliver more with less and increase the visibility and value of data to the scientific and policy communities. the results will be made openly available, unless privacy regulations require secured access.

eu wide availability of high quality “cloud ready” ssh tools

many tools for data managing and processing already exist, but only some are ready for deployment in the cloud. sshoc will therefore adjust and enrich existing tools and services for managing and processing ssh data, thus making them “cloud ready”. in this context “cloud ready” refers to making them interoperable, citable and findable and advertised in the market place, actionable via the sshoc switch board and packaged for deployment in the eosc. sshoc will also develop a new suite of tools and services for managing and processing ssh tools and services that are central to the ssh communities.

sshoc open-source software and service repository – the market place

ssh open market place

availability of an eu wide, easy to use ssh open market place, where tools and data are openly available

the tools and data developed in sshoc will be made widely available through the sshoc open market place where scholars from the broader ssh domain can find solutions and resources for the digital aspects of their research. it will adopt a platform which allows for contextualisation and interrelation of datasets, tools, and services offered, etc., with screenshots, tutorials and links to training material, user stories, showcases, and other related resources. it will also encompass community features and contain a rating and assessment feature, based on previous work in humanities. the sshoc market place is a platform with (free and commercial) services, and tools for working on the ssh data cycle: from creation to reuse of data.
these services include training and will also link existing data catalogues of the ssh infrastructures.

sshoc secured environment for data analysis
eu wide availability of trusted and secure access mechanisms for ssh data, conforming to eu legal requirements

research data in the social sciences is often connected to individuals, requiring special security and protection. data anonymization can limit risk, but a loss of detail reduces utility and thereby limits research potential. to maximise data utility, research value, and policy impact, an interdisciplinary approach to secure data sharing is required. sshoc will therefore develop services which will offer secure and trusted repositories for storing and accessing ssh data. this will be built on an open source software platform, customised to the needs of the european ssh community. the service will be developed in such a way as to ensure its sustainability after the end of the action.

in this context, sshoc will also address the legal issues related to open access and reusability of ssh research data, as well as issues related to legal and ethical implementation of the fair principles. the impact on ssh research data of the gdpr, ownership and intellectual property rights (ipr), the new european e-privacy regulation, as well as ethical issues will be analysed. one of the results will be the development of a common ssh gdpr code of conduct to support the realisation of the eosc. to demonstrate that sensitive data can meet the standard of fair access to data by being made “intelligently open”, sshoc will provide a framework of confidentiality levels for ssh data (based on global ‘data tagging’ categorisation) and show how it might be applied to other data to meet fair principles.
sshoc engagement and communication
data sharing is accepted practice (the “new normal”) among the different ssh communities

within the eosc context, a successful implementation of the sshoc requires a strong and sustained engagement with the user base and their communities. sshoc will foster a data sharing culture according to fair principles by providing context-driven training around the sshoc infrastructure. in this context sshoc will harmonize existing training initiatives and expand the portfolio of training materials and actions from an ssh perspective in cooperation with other relevant eu funded projects in the area (e.g. foster plus, eosc-pilot, eosc-hub, openaire-advance, freya).

sshoc d . system specification - ssh open marketplace doi . /zenodo.

state of the art advanced through dedicated ssh data pilots
cluster projects

sshoc will undertake several pilot studies on implementing fair principles in research communities:
• increase findability within migration & mobility by connecting with ec projects on this subject;
• improve accessibility within humanities and language, also to transform data into information of interest to decision makers;
• realise interoperability by linking election data using semantic techniques by connecting with an ec project on historical economic & company data;
• encourage reusability by developing training in digital heritage science and work on heritage science data transformation (text & data mining, machine learning, predictive modelling) to enable large heritage datasets to be interpreted.

[image source: https://dit.libguides.com/c.php?g= &p= ]

the social sciences and humanities are seamlessly integrated in eosc

for sshoc to reach its stated objective of pulling down traditional silos and building an integrated cloud infrastructure, it needs to interact seamlessly with other data clusters within eosc. for this purpose, sshoc will develop and implement a common governance model for the project results as part of eosc.
it will include common policies on fair principles, data stewardship and harmonization, as well as quality assessment (including certifications), legal and ethical issues, and fostering strong collaboration with different eosc thematic clusters. sshoc will also assist (inter)disciplinary user communities in overcoming the challenges that they encounter when attempting to contribute to sshoc and eosc by implementing principles, procedures, tools and services developed in specific research communities and other projects.

sshoc d . challenges that user communities face when attempting to contribute to sshoc doi: . /zenodo.

what added value will eosc bring to the research?

increase the efficiency and productivity of researchers by providing a full-fledged social sciences and humanities cloud where data, open data tools and services are offered as part of the infrastructure, with easy and seamless discovery, access, and re-use for users of social, humanities and cultural heritage data.

contribute to the creation of a cross-border and multi-disciplinary open innovation environment by fostering the innovation of infrastructural support for digital scholarship, stimulating multi-disciplinarity and collaboration across the various subfields of social sciences and humanities and with other science domains.

new discoveries

from the davos speech of ec president von der leyen: there are “hidden treasures and untapped opportunities in the data we generate. …every researcher will be able to better use not only their own data, but also those of others. they will thus come to new insights, new findings and new solutions”. if we want to solve societal challenges (e.g. the sustainable development goals, ec mission), scientists must be able to cooperate beyond disciplinary barriers and be able to use each other’s data.

[image source: https://www.eltis.org/sites/default/files/news/shutterstock_ _ .jpg]
the ec mission on ‘climate-neutral and smart cities’ will require experts from social behaviour, economics, urban planning, biology, geography, chemistry, medicine, environmental science, computer science, physics, etc.

which main issues should be considered by eosc executive and governing boards?

• address and coordinate on key elements of a platform or research data commons
• persistent identifiers – for (meta)data, publications, services – and possibly organisations and researchers.
• metadata standards for the respective domains.
• ssh will use metadata templates, controlled vocabularies and data models used in well-curated datasets. but especially controlled vocabularies need support and require agreement over domains and on a global scale.
• user interfaces – for service catalogues, market place, and other key parts of eosc.

foster the interoperability and reuse of research data
• secured data environments for storage, management, access, computation and analysis of research data.
• instruments to describe connections & interoperability between data sets.
• for example, ssh will use ddi (xml) as a rich schema to support extensive variable metadata and references to other data.
• a key issue for use of platforms is trust.
• this implies quality assurance for the data, the services, the (software) tools that are in eosc.

ensure and foster the composability of eosc
• breaking down silos – between data, computing power, storage and networks – is a salient value added of eosc. end-users should be able to compose combinations of services themselves in an easy and straightforward way.
• encourage eosc as a working space for researchers – to work in secured environments, to combine data from various domains, to use tools & services from the market place.

provide stability and sustainability in providing eosc
• partners – either commercial or non-commercial – will only invest in eosc if there is a guarantee that eosc will be ensured for a long-term period.
• implement an operational, scalable and sustainable eosc federation, allowing seamless alignment and convergence with data infrastructures.
• provide clear rules of participation with defined funding responsibilities and transparent and ongoing financial commitment by all relevant stakeholders.
• take advice and align efforts with ongoing (especially domain specific) initiatives and h projects.
• provide and maintain communication channels and close connection to the research community.

cf. nih commons https://nihdatacommons.us/
sshoc d . report on sshoc (meta)data interoperability problems doi: . /zenodo.

sshoc, “social sciences and humanities open cloud”, has received funding from the eu horizon research and innovation programme ( - ); h -infraeosc- - , under the agreement no.

inconsistent screening for lead endangers vulnerable children: policy lessons from south bend and saint joseph county, indiana, usa

j public health pol ( ) : – https://doi.org/ . /s - - -
original article

heidi beidinger‑burnett · lacey ahern · michelle ngai · gabriel filippelli · matthew sisk
published online: december © the author(s)

abstract

lead exposure is a major health hazard affecting children and their growth and is a concern in many urban areas around the world. one such city in the united states (us), south bend, indiana, gained attention for its high levels of lead in blood and relatively low testing rates for children. we assessed current lead screening practices in south bend and the surrounding st. joseph county (sjc). the – lead screening data included , unique children. lead screening rates ranged from . to . %. more than % of children had ‘elevated blood lead levels’ (ebll) ≥ micrograms per deciliter (µg/dl) and . % had an ebll ≥ μg/dl.
over % of the census tracts in sjc had mean ebll ≥ μg/dl, suggesting widespread risk. inconsistent lead screening rates, coupled with environmental and societal risk factors, put children in sjc at greater risk for harmful lead exposure than children living in states with provisions for universal screening. indiana and other states should adhere to the us centers for disease control’s guideline and use universal lead testing to protect vulnerable populations.

keywords lead poisoning · testing rate · universal lead testing

* heidi beidinger-burnett hbeiding@nd.edu
lacey ahern lhaussam@nd.edu
michelle ngai michelle.ngai. @nd.edu
gabriel filippelli gfilippe@iupui.edu
matthew sisk lhaussam@nd.edu
eck institute, flanner hall , notre dame, usa
department of earth sciences, indiana university-purdue university indianapolis, north blackford, indianapolis, in , usa
center for digital scholarship, university of notre dame, g hesburgh library, notre dame, in , usa

introduction

beginning in the early s, manufacturers incorporated lead into a host of products, mostly lead-based paint and leaded gasoline; global contamination of air, water, and soil resulted [ ]. global production of lead, largely for lead-acid batteries, has increased substantially since the s, even as lead was largely phased out of paint and gasoline [ ]. lead poisoning is estimated to kill on the order of , people per year globally, with % of deaths due to lead occurring in low and middle income countries [ ]. the primary source of lead exposure in low and middle income countries differs from that in the united states (us), with the former dominated by battery manufacturing and recycling and the latter by the legacy impacts of lead-based paint and fuel that resulted in concentration of lead in surface soils [ ].
globally, young children living in proximity to areas with high environmental lead contamination are at greatest risk of lead poisoning that can cause detrimental life-long neurological and physiological damage [ , ]. lead can permanently decrease iq, academic achievement, and economic achievement [ – ]. further, studies have shown a strong association between childhood lead poisoning and criminal arrests [ , ], as well as other non-cognitive health issues [ , ].

after decades of progress decreasing lead hazards, the us continues to battle the environmental legacy of leaded paint and gasoline. the us centers for disease control (cdc) reports there are at least million households in the us with high levels of environmental lead in which children reside. it further estimates that , children ages – have elevated blood lead levels (eblls) above micrograms per deciliter (µg/dl), the threshold set by the cdc to initiate case management [ ]. us state laws governing lead testing of children vary greatly and are implemented inconsistently, resulting in abysmal testing rates. in the us, only states and the district of columbia require universal testing and states require targeted testing [ ]. targeted testing refers to the criteria established to identify children who are at risk of lead poisoning. the remaining states only have screening recommendations and no formal policy [ ].

the state of indiana has no formal policy for blood lead screening levels in children and the overall lead screening rate in indiana for all children – years was only % [ ]. the us federal government finances a medical services program, called medicaid, for some low income children. as part of its services, medicaid requires that children enrolled in the program receive lead testing in accordance with cdc guidelines [ ]. in , % of all children in indiana between and years of age were medicaid-eligible, of whom only . % were tested [ ].
a national survey of the number of children tested and the rates of eblls in children at the census tract or county level highlighted the failure of the current strategy for identifying and protecting children from the harm of lead exposure [ ]. the survey report pointed to a short list of nine “troubled communities” (those with high percentages of children with elevated lead levels) scattered around the us. although large municipalities appeared on this list, so did several smaller communities with little support for public health surveillance, such as the small post-industrial/university city of south bend, indiana. up to % of children under the age of tested by public and private health providers in south bend had unsafe levels, the highest in the state of indiana [ ].

we aimed to develop an epidemiologic profile of lead testing. we designed our study to describe and analyze the demographics and spatial distribution of children under years of age ( – . years of age) with elevated lead levels in st. joseph county (sjc), indiana. this includes south bend. we chose under years of age to be more consistent with cdc screening guidelines, in contrast to reuters [ ] who used the state-reported values from to for children under years of age. we also examined whether blood lead testing rates and trends adequately captured those children at highest risk for eblls.

methods

the sjc health department maintains a lead database containing the results of every blood lead test conducted, along with associated patient information. the information captured includes, but is not limited to: demographics, pregnancy status, medicaid status, blood lead level, reason for testing, and physician and laboratory names. for this cross-sectional study, we extracted data between and and de-identified individuals to ensure confidentiality and compliance with hipaa regulations.
hipaa is the health insurance portability and accountability act of ; us legislation that provides data privacy and security provisions for safeguarding medical information. as part of de-identification, we aggregated at the census tract level the locations of individual residential addresses. we extracted tract boundaries for this, and all other spatial analyses, from the us census bureau tiger/line dataset [ ].

after applying exclusion criteria (explained below), , lead tests represented , unique children under . because the data did not allow for appropriate differentiation between confirmatory and routine screening, we included the first lead test for each unique child only. we are aware that this is slightly different from the conventional method using the highest venous or lowest capillary test. we chose this method to draw attention to cases without adequate follow-up. statistically, the conventional aggregation method changes the mean values slightly, but does not impact the patterns discussed below. the exclusion criteria included: ( ) duplicate records, ( ) records noting the same unique identifier, date of birth, and specimen date reporting different blood lead levels, ( ) records with the same unique identifier, but different date of birth, ( ) individuals ≥ years of age on the date of initial lead test, and ( ) records where the residential address could not be allocated to a specific census tract.

we extracted additional data pertaining to population size, poverty level, and housing age from the american community survey (acs) -year estimates at the census tract level in sjc, indiana. we performed descriptive analyses for patient demographics and medicaid status, stratified by blood lead level outcomes. we calculated the annual lead testing rate of children under at between and for several census tracts (acs yearly estimates began in ).
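the record selection described above can be sketched in python. this is a minimal illustration, not the authors' code (the study reports using r and arcgis): the tuple layout, the function name `first_test_per_child`, and the `max_age_years` parameter are hypothetical, and the order in which the exclusion steps are applied is an assumption.

```python
from datetime import date

# Hypothetical record layout (not from the paper):
# (child_id, date_of_birth, specimen_date, blood_lead_ug_dl, census_tract)

def first_test_per_child(records, max_age_years):
    """Apply the five exclusion criteria, then keep each child's first test."""
    # (1) drop exact duplicate records, preserving input order
    unique = list(dict.fromkeys(records))

    # collect the dates of birth seen for each identifier, for criterion (3)
    dobs = {}
    for cid, dob, *_ in unique:
        dobs.setdefault(cid, set()).add(dob)

    # collect the lead levels seen for each (id, dob, specimen date), for criterion (2)
    levels = {}
    for cid, dob, spec, bll, _ in unique:
        levels.setdefault((cid, dob, spec), set()).add(bll)

    kept = [
        r for r in unique
        if len(levels[(r[0], r[1], r[2])]) == 1  # (2) no conflicting lead levels
        and len(dobs[r[0]]) == 1                 # (3) no conflicting dates of birth
        and r[4] is not None                     # (5) address allocated to a tract
    ]

    # keep the earliest specimen for each child
    first = {}
    for r in sorted(kept, key=lambda r: r[2]):
        first.setdefault(r[0], r)

    def age_years(dob, on):
        # completed years of age on a given date
        return on.year - dob.year - ((on.month, on.day) < (dob.month, dob.day))

    # (4) drop children at or above the age cutoff on the date of the initial test
    return [r for r in first.values() if age_years(r[1], r[2]) < max_age_years]
```

keeping the first test rather than the highest venous or lowest capillary result mirrors the choice explained in the text; swapping the `sorted` key would give the conventional aggregation instead.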
we generated maps to determine the spatial distribution of eblls, in addition to indicators such as median housing age and poverty rates. we evaluated the correlation between socioeconomic risk factors and eblls with spearman’s rank correlation coefficient. we conducted all analyses in r version . . and esri arcgis . .

results

over the -year period ( – ), health providers performed , blood lead tests for children under years of age. this led to identification of , unique individuals (table ). of the , unique individuals, . % ( , ) had an ebll ≥ µg/dl. both sexes are roughly equally represented in the testing with . and . % females and males, respectively (table ). the distribution of ebll among sexes was similar, with a slight increase among males for ebll – and ≥ µg/dl. the racial composition of the tested individuals consists of . % white, . % black, and fewer than % for other subgroups, including asian/pacific, american indian, and multiracial. with respect to ethnicity, a greater proportion of non-hispanic individuals had been tested, compared to hispanics. the racial and ethnic composition is based on a person’s self-identification with a racial/social group as defined by the us census bureau.

eblls are linked to medicaid status (table ). medicaid status is a proxy for low socio-economic status because medicaid serves children who are from families with low income and fewer assets. for children who receive medicaid, . % had an ebll μg/dl or higher, and for those who were non-medicaid recipients (usually higher income), . % had an ebll ≥ µg/dl (p < . ). additionally, the proportion of medicaid recipients increases with each incremental rise in blood lead level. for example, while . % of medicaid recipients had a bll of μg/dl, . % were between and μg/dl, . % were between and μg/dl, and . % were μg/dl or higher.
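spearman’s rank correlation, which the methods use to relate socioeconomic risk factors to eblls, is the pearson correlation computed on the ranks of the two variables. the sketch below is a plain-python illustration in which tied values receive their average rank; the function names are hypothetical, it assumes neither input is constant, and it is not the authors' r code.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)
```

because only ranks enter the calculation, the coefficient captures any monotonic relationship, for example between a tract's poverty rate and its share of children with an ebll, without assuming that the relationship is linear.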
due to poor data collection and/or reporting, a significant proportion of the data contained an empty field or was documented as “unknown”. for race and ethnicity data, . and . % of the children, respectively, were recorded as unknown. similarly, . % of the medicaid status data of the children was recorded as “unknown”.

over the -year period, lead testing of children was inconsistent (fig. ). in , testing peaked at children, representing a testing rate of . %. this peak in testing coincided with a us housing and urban development grant awarded to the south bend housing authority for lead screening. the trend, however, steadily declined and in , health care providers tested only children, representing a testing rate of . %.

the rate of children identified with ebll greater than or equal to μg/dl has significantly decreased from . per in to approximately . per during – (fig. ). when disaggregated by ebll, the rate of children with ebll – μg/dl has declined substantially, plateauing from – at approximately per , while the rate of ebll ≥ μg/dl has remained relatively constant, averaging per .

table demographics of children under years of age stratified by elevated blood lead level (ebll) in st. joseph county, in; analysis conducted –
columns: total; ebll μg/dl; ebll – μg/dl; ebll – μg/dl; ebll ≥ μg/dl
rows: children under; sex (female, male, unknown); race (american indian, asian/pacific, black, white, multiracial, other, unknown); ethnicity (hispanic, non-hispanic, unknown); medicaid status (yes, no, unknown)
[cell values, given as % (n), are missing from the extracted text]

in general, during the period – , . % of children tested had a reported ebll of μg/dl or greater. individuals with an ebll ≥ μg/dl appear to be more centrally located in south bend within sjc and more densely concentrated within nine census tracts, specifically , , , , , , , , and (fig. a). census tracts and had the highest rates of ebll ≥ μg/dl at . and . %, respectively. both tend to contain ( ) a higher proportion of households living in poverty and ( ) homes constructed before . both housing age and poverty were associated with an ebll with a spearman’s correlation of . and . (p < . ), respectively. the strength of these relationships indicates that multiple factors affect eblls, rather than a single variable.

disconcertingly, there appear to be low testing rates in these high-risk areas, and more specifically, in census tracts with a higher percentage of impoverished households. in , between and % of individuals in census tract resided in poverty, but only . % of children under were reported to have been tested for lead levels (fig. b). we examined this lack of testing further using a -year time-series analysis. we observed low and inconsistent testing rates in several census tracts (table ). census tract appears to have the lowest testing rate, with an average of . % over the -year period. testing in census tract has been inconsistent, with rates ranging between . and . %. census tract requires special attention; this area has one of the lowest lead testing rates ( . %) yet contains the highest proportion of children with eblls ( . %).

fig. lead testing for children under the age of in st.
joseph county, in. the number of children tested has been inconsistent over the -year span, potentially due to gain and loss of monetary incentives to the county health department. the rate of elevated blood level (ebll) – μg/dl steadily declined while the rate of ebll ≥ μg/dl has remained relatively constant. analysis conducted –

fig. spatial map of a central location in st. joseph county. a depicts a cluster of census tracts that have a relatively high percentage of children with an ebll ≥ μg/dl for – . census tracts and (shown in red) are considered “hot spots”, with over % of tested children having an ebll ≥ μg/dl. b the color represents percentage of individuals living in poverty within that census tract; the color deepens with increasing percentage. the superimposed numbers indicate the corresponding lead testing rate within that census tract for . analysis conducted –

table lead testing rates between and in children under years of age in st. joseph county, in; analysis conducted –
[layout: census tracts by year; cell values are missing from the extracted text]

discussion

blood lead testing rates varied substantially among census tracts and over time in sjc, and without any evident strategy to protect children at greatest risk of ebll. on average, only % of children under had been tested for lead during – and no census tract reached close to the % recommended by the us cdc. of the census tracts in sjc, ( %) had an ebll greater than or equal to μg/dl, suggesting that risk exists beyond the center city of south bend. such low and inconsistent testing makes it difficult to draw any conclusion about the scope of the ebll problem among children in sjc. this, in turn, makes it difficult for public health officials to allocate resources properly and protect children from lead poisoning.
this study shows the importance of describing the environmental and societal risk factors associated with ebll. housing age and poverty are correlated with an ebll, placing children at greater risk of lead poisoning. homes constructed prior to were painted with lead-based paint, exposing children to chips of paint and leaded dust. poverty is likely correlated with the state of disrepair of older homes and the inability to remediate lead hazards. the spatial analysis shows that older homes and poverty are concentrated in the inner city of south bend, exemplifying the negative impact a child’s address can have on his or her health. spatial analysis can aid in the identification of high-risk populations, promoting a more cost-effective method for resource allocation and targeted testing.

public health implications

even after decades of research and legislation to remove lead from paint and gas, globally the incidence of lead poisoning remains high in urban areas. children of color in low income households who inhabit the polluted centers of our older cities without the benefits of adequate nutrition, education, and health care remain at particular risk [ ]. to create lead-safe environments and to provide environmental justice for urban dwellers, newer approaches are needed to assess current lead exposure mechanisms and to understand fully the health implications of chronic lead exposure.

there are significant economic and societal costs associated with lead poisoning, and with lack of action to test children, remediate housing, and treat those poisoned. at the time the poisoned child enters school, he or she may not be capable of performing well without extra educational support (‘special education services’ required by law in the us), resulting in increased costs and resources. a child who drops out of secondary school is likely to see a significant decrease in lifetime earnings, and that decreases tax revenues and increases social service costs.
prevention of lead poisoning is cost-effective and necessary to avoid the life-long effects and costs. based upon data, there were , children under living in sjc. a cost–benefit analysis was conducted to quantify the economic benefit of lead exposure prevention and the investment needed to conduct lead screening [ ]. given that the cost of a blood lead test ranges from $ to $ , screening , children in sjc would cost between $ , and $ , , [ ]. this initial investment of $ –$ would be paid for by public (medicaid) and private health insurance. our cost–benefit analysis estimates that an investment of $ –$ in lead screening would yield a return of $ –$ in decreased health care costs, lifetime earnings, and direct costs of criminal activity [ , ]. thus, sjc could expect a return of $ , , –$ , , . those returns would be distributed across various entities. the health care system (public and private) would achieve cost savings because early detection of a lead poisoned child would lower life-time health care costs. schools would serve fewer children with developmental disabilities, resulting in cost savings. ultimately, children would experience the greatest benefit, resulting in greater student achievement and increased lifetime earnings.

this study has demonstrated a lack of coherent laws governing lead testing and case management that has led to haphazard and inequitable strategies in indiana. the result is that children are at undue risk for developing long-term problems. while indiana does not have formal policy governing lead testing, states such as maryland, iowa, and vermont benefit from laws on universal lead testing for children and have achieved significant decreases in lead poisoning rates [ ]. since , maryland has accomplished a % decrease in childhood lead poisoning, as a result of the implementation and enforcement of maryland’s reduction of lead risk in housing act [ ].
in indiana, ‘case management’ remains inconsistent, with state law requiring case management for eblls ≥ μg/dl and the indiana state department of health recommending the initiation of case management for eblls ≥ µg/dl [ ]. case management includes a nutritional and developmental milestones assessment, an environmental assessment to identify the sources of lead exposure, and follow-up testing to monitor lead levels [ ]. we recommend a statewide policy in indiana that adopts the us cdc’s guideline to require universal lead testing and implement case management at µg/dl or higher.

the world health organization (who) also recognizes that there is a lack of guidelines and regulations for lead prevention and management globally. currently, the who is developing a set of guidelines to provide evidence-based strategies for policy makers and health providers [ ]. recognition of the adverse effects from lead exposure has received international attention and, as early as , prompted its inclusion in conventions, such as the convention on the rights of the child. while many countries have actively engaged in efforts against lead poisoning, especially through the enforcement of bans on leaded gasoline and paint, other countries have yet to adopt these measures. a significant reduction in blood lead levels in children is a direct result of these restrictions. vigilance in recognizing, reducing, and possibly eliminating other key sources of lead, such as lead-acid batteries, is essential in the continued efforts moving forward. to address lead as a global health issue, concurrent action will be needed: surveillance and testing, enacting policy to reduce lead poisoning, coupled with strong and consistent implementation. mindful of who’s objectives and with success seen in other us states, there is no safe level of lead; policy makers must act to protect our children and prevent lead poisoning.
open access this article is distributed under the terms of the creative commons attribution . inter- national license (http://creat iveco mmons .org/licen ses/by/ . /), which permits unrestricted use, distribu- tion, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. http://creativecommons.org/licenses/by/ . / h. beidinger-burnett et al. references . landrigan pj, and others. the lancet commission on pollution and health. lancet. ; : – . . gbd . risk factors collaborators. global, regional, and national comparative risk assess- ment of behavioural, environmental and occupational, and metabolic risks or clusters of risks, – : a systematic analysis for the global burden of disease. lancet. ; : – . . filippelli gm, taylor mp. addressing pollution-related global environmental health burdens. geo- health. ; ( ): – . . bellinger d. very low lead exposures and children’s neurodevelopment. curr opin pediatr. ; : – . . clune alfh, riederer am. mapping global environmental lead poisoning in children. j health pollut. ; : – . . lead. center for disease control; . cdc.gov. https ://www.cdc.gov/nceh/lead/. accessed may . . bellinger d. childhood lead exposure and adult outcomes. jama. ; : – . . binns hj, campbell c, brown mj. interpreting and managing blood lead levels of less than  μg/ dl in children and reducing childhood exposure to lead: recommendations of the centers for dis- ease control and prevention advisory committee on childhood lead poisoning prevention. j pedi- atr. ; :e – . . tarrago o, brown mj, atsdr. case studies in environmental medicine: lead toxicity. center for disease control; . https ://www.atsdr .cdc.gov/csem/lead/docs/csem-lead_toxic ity_ .pdf. accessed sep . . mielke h, zahran s. the urban rise and fall of air lead (pb) and the latent surge and retreat of soci- etal violence. environ int. ; : – . . 
needleman h, mcfarland c, fienberg s, tobin m. bone lead levels in adjudicated delinquents: a case control study. neurotoxicol teratol. ; : – . . obeng-gyasi e, armijos r, weigel a, filippelli g, sayegh m. cardiovascular-related outcomes in u.s. adults exposed to lead. int j environ res public health. ; : – . . obeng-gyasi e, armijos r, weigel a, filippelli g, sayegh m. hepatobiliary-related outcomes in us adults exposed to lead. environments. ; : – . . dickman j. children at risk: gaps in state lead screening policies. safer chemicals, healthy families; . https://saferchemicals.org/children-at-risk/. accessed sep . . childhood lead surveillance report: environmental public health led and healthy homes program. indiana state department of health; . https://www.in.gov/isdh/files/lead% report% -new.pdf. accessed sep . . coverage of blood lead testing for children enrolled in medicaid and the children’s health insurance program. centers for medicaid and medicare services; . https://www.medicaid.gov/federal-policy-guidance/downloads/cib .pdf. accessed jan . . lead screening requirements and medical management recommendations: for children ages to  months. indiana state department of health; . https://www.in.gov/isdh/files/case_management_chart_rev_h_-_ .pdf. accessed sep . . pell m, schneyer j. unsafe at any level: the thousand of u.s. locales with lead poisoning is worse than in flint. reuters investigates; . http://www.reuters.com/investigates/special-report/usa-lead-testing/. accessed sep . . filippelli gm, laidlaw mas. the elephant in the playground: confronting lead-contaminated soils as an important source of lead burdens to urban populations. perspect biol med. ; : – . . geography. us census bureau; . https://www.census.gov/geo/maps-data/data/tiger-line.html. accessed dec . . kemper ar, bordley wc, downs sm. 
cost-effectiveness analysis of lead poisoning screening strategies following the guidelines of the centers for disease control and prevention. arch pediatr adolesc med. ; : – . . gould e. childhood lead poisoning: conservative estimates of the social and economic benefits of lead hazard control. environ health perspect. ; : – . . mckinney j. lead poisoning in maryland drops to lowest recorded levels, testing increases in first year of state initiative. maryland government: department of the environment; . http://news.maryland.gov/mde/ / / /lead-poisoning-in-maryland-drops-to-lowest-recorded-levels-testing-increases-in-first-year-of-state-initiative/. accessed may . . who. lead poisoning and health. world health organization; . http://www.who.int/news-room/fact-sheets/detail/lead-poisoning-and-health. accessed sep . . centers for disease control. recommended actions based on blood lead level. . https://www.cdc.gov/nceh/lead/acclpp/actions_blls.html. accessed nov . . who. lead poisoning prevention week: ban lead paint. 
world health organization; . http://www.euro.who.int/en/health-topics/environment-and-health/pages/news/news/ / /lead-poisoning-prevention-week-ban-lead-paint. accessed sep . 
heidi beidinger‑burnett ph.d. is assistant professional specialist (faculty) in the eck institute for global health. i have spent my career focused on public health and public education. currently, i am engaged in community-based participatory research focused on lead prevention, hiv and infant mortality. my current projects are focused on the development of a low-cost, scalable home lead test kit and the barriers and facilitators of mental health care for persons living with hiv. in addition to my research, i teach scientific writing, qualitative research methods and leadership. prior to my appointment at notre dame, i worked as a consultant in k- education for nearly  years. i developed expertise in leadership, school redesign, curriculum evaluation, and development.
lacey ahern bs, mph, is program director at global partners in care, notre dame, indiana, an organization that works to enhance access to palliative care globally, and an adjunct assistant professor in the eck institute for global health at the university of notre dame, notre dame, indiana, usa. she works on community-based health interventions, palliative care access, use of mhealth technologies, and maternal and child health services, particularly looking at quality of care and infant mortality and perinatal surveillance. she has spent the past  years engaged in international health and development work, including time living and working in east africa with the united nations population fund, the congregation of holy cross, uganda martyrs university, the palliative care association of uganda, and other local organizations. she holds a master of public health degree in global health from the rollins school of public health at emory university. 
michelle ngai works with students in the master of science in global health program at the eck institute for global health at the university of notre dame. in this role, she is training students to join the global health sector by instructing them to critically evaluate issues impacting the health of the world’s population, and promoting cross-disciplinary and cross-cultural dialogue. originally from canada, michelle is a graduate of mcgill university, with a bachelor of science degree in physiology and international development. she also completed a master of public health degree at imperial college london, followed by a ph.d. at the university of notre dame.
gabriel filippelli bs, ph.d. is professor, department of earth sciences, indiana university purdue university, indiana, usa. i have worked extensively on the chemistry and geologic history of nutrient cycling in the ocean and on land. current research projects involve determining the controls on nutrient cycling on land during glaciation, examining the timing and driving forces of biological productivity in the ocean, assessing the content and distribution of the potentially harmful element mercury in coal resources of indiana and examining the links between lead distribution and children’s blood lead levels in urban areas.
matthew sisk is a spatial data specialist and geographic information systems librarian based in notre dame’s navari family center for digital scholarship. his research focuses on human–environment interactions, the spatial scale of environmental toxins and community-based research. he received his bs from the university of south carolina in marine science and anthropology and his ma and ph.d. in archaeology from stony brook university. 
inconsistent screening for lead endangers vulnerable children: policy lessons from south bend and saint joseph county, indiana, usa
provided by the author(s) and nui galway in accordance with publisher policies. please cite the published version when available. downloaded - - t : : z some rights reserved. for more information, please see the item record link above.
title: communicating new library roles to enable digital scholarship: a review article
author(s): cox, john
publication date: - - 
publication information: cox, john. ( ). communicating new library roles to enable digital scholarship: a review article. new review of academic librarianship, ( - ), - . doi: . / . . 
publisher: taylor & francis
link to publisher's version: http://dx.doi.org/ . / . . 
item record: http://hdl.handle.net/ / 
doi: http://dx.doi.org/ . / . . 
https://aran.library.nuigalway.ie http://creativecommons.org/licenses/by-nc-nd/ . 
/ie/ communicating new library roles to enable digital scholarship: a review article abstract academic libraries enable a wide range of digital scholarship activities, increasingly as a partner rather than as a service provider. communicating that shift in role is challenging, not least as digital scholarship is a new field with many players whose activities on campus can be disjointed. the library’s actual and potential contributions need to be broadcast to a diverse range of internal and external constituencies, primarily academic staff, university management, library colleagues and related project teams, often with different perspectives. libraries have significant contributions to offer and a focused communications strategy is needed to embed libraries in digital scholarship and to create new perceptions of their role as enabling partners. introduction digital scholarship has generated new roles for libraries in recent years. it spans all disciplines, ranging in terminology from e-science to the digital humanities. neat definitions of digital scholarship are elusive, however, and waters ( , p. ) notes hundreds of definitions even of digital humanities on three different websites. lynch ( , p. ) refers to a digital scholarship disconnect, questioning the need to describe scholarship as digital. he does, however, recognise digital scholarship as a term applicable to the transformation of most areas of scholarly work by technologies such as high-performance computing, visualisation and the manipulation of large datasets. computational, data-intensive science is seen as representing a new paradigm (lynch, , p. ; tenopir, sandusky, allard, & birch, , p. ). new methods of enquiry characterise digital scholarship, especially in the humanities. waters ( , pp. 
, - ) sees the defining feature of digital humanities as the application of digital resources and methods to humanistic enquiry, identifying three broad areas of investigation and tool sets: textual analysis, spatial analysis and media studies. sinclair ( ) observes that “new hybrid communities of inquiry are increasingly visual, collaborative, and spatial, or simply seek to make new connections possible in a digital world”, thanks to technologies such as data visualisation and mapping applications, to which can be added tools for text and data mining. new approaches to publishing findings and sharing data, often on an open access basis, are very much in scope across all disciplines too. digital scholarship relies on collections of information and data, along with a range of tools, infrastructures and, above all, people. libraries have embraced this opportunity to take on a variety of roles, encapsulated by calhoun ( , p. ), alexander ( ), vinopal ( , pp. - ) and sula ( , pp. - ), and including:
• digitisation and digital preservation, often of archives and special collections
• metadata creation and enhancement for linked data, exchange and reuse
• assignment of identifiers to promote discovery
• hosting of digital collections in library repositories
• publishing of faculty-edited journals
• open access dissemination of research outputs and learning materials
• management of research data
• curation of born-digital collections
• advice on copyright, digital rights management and the application of standards
• participation in text mining, data analysis and geographic information systems (gis) projects
• provision of spaces, tools, equipment and training for digital scholarship
these roles have represented a fundamental shift for libraries towards publishing of digital content and active participation in research projects. 
they bring with them many communication challenges in terms of the environment of digital scholarship, the diversity of audience interests, important messages to be communicated and the range of channels for doing so.
a challenging communications space
library roles to enable digital scholarship are multi-stranded, reflecting the field itself. rockenbach ( , p. ) describes digital humanities as “messy”, while she and others (lippincott, hemmasi, & lewis, ; schaffner & erway, , p. ; vandegrift & varner, , p. ) emphasise its experimental approach, indicative of a rapidly evolving field without clear boundaries. establishing and communicating a clear library offering in response is, not surprisingly, often difficult. an ithaka study of institutional models of support for digital humanities outputs (maron & pickle, , pp. - ) identifies some further characteristics, including piecemeal approaches, multiple players on campus and a lack of joined-up campus-wide strategies. the range of stakeholders with whom the library may need to communicate includes university leadership, administration, it services and the research office, as well as the different academic departments or research centres involved in digital scholarship, among whose ranks may be scholars, doctoral students, interns, web developers and programmers. achieving effective communication across all of these constituencies is problematic. the ithaka study (maron & pickle, , p. ), while urging regular communication, noted that dissemination is a function that is not owned by any unit and therefore sporadic, resulting in lack of awareness of projects in the absence, typically, of any directory of campus-wide projects (p. ). schrier ( ) too observes, somewhat depressingly, that digital collections “often remain obscure, unknown, and therefore inaccessible to their intended user populations”. there are many audiences and many perspectives. 
university leadership will want the benefits of digital scholarship for the institution’s research profile but may be unwilling to invest in understanding fully the range of activity involved in order to enable a coherent resourcing strategy to emerge. academic staff may embrace involvement by libraries or may be slow to ask, preferring a self-sufficient, independent and autonomous approach (schaffner & erway, , p. ). equally, library staff may fail to connect with their diverse audiences. an earlier ithaka study on the sustainability of digitised special collections (maron & pickle, , p. ) notes that “investments in understanding the needs of the audience are quite low”. this does not bode well for successful audience engagement with libraries’ digital scholarship activities. mismatches in perspective are particularly evident in the areas of open access and research data management. each is a hard sell to academics who may not see the need to engage, especially if they perceive that further work, primarily of an administrative nature, may come their way. pinfield ( , pp. - ) notes continued “significant levels of disinterest, suspicion and scepticism about oa amongst researchers”. he (p. ) and creaser et al. ( , pp. - ) report strong loyalty to the traditional publication system, and in particular to journals. calhoun ( , p. ) cites problems with the way that librarians talk to faculty about open access, often emphasising a subscriptions crisis that academics do not recognise as needing attention. similarly, librarians’ promotion of their roles in research data management may face barriers in the shape of researcher negativity towards data sharing (pinfield, cox, & smith, , p. ) . convincing library staff that libraries should adopt new roles to enable digital scholarship can also be an issue. the messy, unpredictable nature of digital scholarship asks questions of libraries in terms of agility and risk taking. 
its experimental approach, with projects prone to failure, may not sit well with libraries’ tendency towards orderliness and predictable outcomes (posner, , p. ). a clash of cultures is evident here. equally, the culture of easy creation of content and its publication to the social web may clash with librarians’ values of authority and authenticity (calhoun, , p. ), limiting their full engagement with social media and thereby with new modes of scholarship. library staff may not recognise the validity of adding a publishing role to existing offerings (huwe, , p. ). rockenbach ( , p. ) identifies tensions between traditional notions of library service and new models of user engagement. this is most manifest in a debate, further discussed later in this article, as to whether librarians should take a supporting role in digital scholarship or should see themselves as active partners. the support model is traditional but there is a strong body of literature which sees it as sub-optimal (posner, , p. ) and advocates an equal partnership approach, with some (vandegrift & varner, , p. ) adducing a problem of librarian timidity based on an inferiority complex in relation to academics. librarians’ lack of confidence in their own skills can hold back progress in areas such as research data management (tenopir, et al., , p. ). all of this creates a strong imperative for library leaders to communicate very effectively the strategic importance of new digital scholarship roles and initiatives to library staff as well as external audiences. the preceding paragraphs have focused on challenges, but there are great opportunities for libraries to broadcast a series of very positive messages about their contribution to digital scholarship. libraries have some real strengths to communicate and these are the focus of the next section. a recurring theme is the importance of relationships in this space (lippincott, et al., ; rockenbach, , pp. 
- ; vandegrift & varner, ) and libraries have a successful tradition of building good relations (pinfield, et al., , p. ; rockenbach, , p. ). uncertainties regarding the sustainability of digital scholarship projects and ongoing responsibility for them (arms, calimlim, & walle, ; kitchin, collins, & frost, ) can be turned to advantage by libraries through the more stable funding models they typically enjoy. the greatest strength for libraries, however, is that they have shared interests with their constituencies, and particularly with the humanities, in “collecting, organizing and preserving our shared collective memory”, helping to “remember the past, understand the present and build the future” (vandegrift & varner, , p. ). libraries and digital scholarship are, in fact, a natural fit and this should shape communications around them.
key messages to communicate
libraries have much to offer to digital scholarship and need to communicate these advantages strongly. sinclair ( ) argues that libraries are natural incubators for digital scholarship, and others (alexander, et al., ; rockenbach, , pp. - ) make a similar case in relation more specifically to the digital humanities. positive features include libraries as neutral, interdisciplinary spaces with staff who can bring together the many different and often disparate players on campus, at a minimum enabling dialogue but often also productive partnership between them. strong relationships with faculty and a habit of collaboration and connecting can be leveraged to the full in this regard. the library as place is a significant asset and there has been a move towards establishing digital scholarship centres in library buildings, with numerous examples in the united states in particular (sinclair, ). a particular advantage the library can offer is to make expensive technologies available for use and experimentation at an accessible and welcoming location by anyone on campus (lippincott & goldenberg-hart, , p. ). 
the traditional skills of librarians and the areas of focus of libraries match well with the needs of digital scholarship. these include cataloguing, curation and sharing of information, translating in more recent times to metadata, digital preservation and open access. library collections, notably archives and rare materials, are the backbone of many projects, especially, but not only, in the digital humanities, and their digitisation enables new forms of enquiry (green & courtney, ). there are therefore vital human and documentary resources to offer and promote. another essential infrastructure, in which libraries are often lead investors on campus, is the hardware and software environment for digital preservation, publishing and presentation, as well as open access and data curation. experience and expertise with platforms such as fedora, open journal systems, omeka, dspace and dataverse places library staff in valued advisory and consultancy roles. academic staff and other stakeholders, including university leadership, whose perception of libraries can be somewhat dated, may not appreciate the key roles that the library can play in digital scholarship, so communicating them actively and effectively is essential. the concept of library as equal partner in digital scholarship is key and should be communicated clearly, with positive linkage both to success and sustainability. such partnership need not be seen as a departure from traditional research library strengths (vandegrift & varner, , p. ). the opportunity to move from established service-based approaches to research collaboration (brown, wolski, & richardson, , p. ) and co-contribution to the creation of new knowledge (monastersky, , p. ) should be embraced. 
librarians have clearly asserted this partner role in some areas, notably research data management, as at griffith university in australia (searle, wolski, simons, & richardson, ), while digital scholarship centres have enabled engagement with constituents as partners rather than clients (lippincott & goldenberg-hart, , p. ). service models are limiting and library roles should more productively be marketed in terms of expertise (lippincott & goldenberg-hart, , p. ). posner ( , p. ) emphasises the valuable digital humanities work that library professionals have conceived and performed and the importance both of ensuring it is credited and of promoting it as a vital and rare skill, “not a service to be offered in silent support of a scholar’s master plan”. the skills and resources libraries can bring to digital scholarship will be more effectively harnessed through partnership and this outlook should pervade library communications. partnership represents enlightened self-interest for all parties too. sustainability is a core issue for digital scholarship, often due to its experimental nature, and many projects encounter an uncertain future beyond any initial funding. it is no coincidence that the ithaka study on sustaining the digital humanities (maron & pickle, , p. ) places knitting deep partnership among campus units, including libraries, at the top of its list of success factors for developing a system to sustain digital humanities resources. the mutual support at the university of maryland between the libraries and the maryland institute for technology in the humanities is provided in the ithaka study as an example of good practice. the partnership model at digital scholarship centres has also been seen as likely to generate sustainable results and to involve the library in funding proposals and grant applications (lippincott & goldenberg-hart, , p. ). 
faculty partnerships have proved vital to digitisation projects, as at the university of nevada, las vegas (lampert & vaughan, , pp. - ). libraries take a long view of digital resources and have a particular interest in promoting their sustainability and preservation. they can leverage their more stable budget model (schaffner & erway, , p. ) to advantage, both for others on campus and for themselves. in the latter context it is important to make a statement of intent by putting the library’s own digital scholarship engagements, staffing and infrastructures on a long-term footing (posner, , p. ). articulating to funders and stakeholders the benefits of digital scholarship, associated projects and the library’s involvement is key to the sustainability agenda. surprisingly, deficits have been noted in terms of dissemination of information about projects and resources (maron & pickle, , p. ), and the literature on marketing of digital collections is thin (schrier, ). failure to communicate the value of digital scholarship initiatives is likely to have negative implications in terms of funding and long-term sustainability. those benefits will vary from institution to institution but some are common enough and are well presented in a report on the impact of uk investment in digitised resources (tanner & deegan, ). this report outlines benefits for research, such as enabling new areas of enquiry and allowing scholars to concentrate on analysis instead of data collation, and for teaching through access to a more varied and rich range of materials (pp. - ). other benefits to be promoted locally may include text and data mining opportunities, wider access to the institution’s research, stronger interdisciplinary collaboration and partnerships with other institutions. communicating a clear value proposition is vital to sustainability (calhoun, , p. ; maron, smith, & loy, , pp. - ). 
this could focus on the unique features of a digital resource and the scholarship it enables or the time a new platform saves. equally, alignment with the institutional mission may be emphasised, for example higher rates of citation for open access publications or the institutional credit bestowed by the publication of high-quality digital resources such as the university of virginia’s valley of the shadow (http://valley.lib.virginia.edu/) project. communication strategies also need to look beyond emphasising immediate and local benefits. libraries have rightly begun to move away from a collection-centric focus (calhoun, , p. ) to a broader view of the positive social influence of digital initiatives, recognising that the collection is only a means to an end (schrier, ). wider, often global, benefits to promote include the advancement of knowledge, more equitable sharing of research outputs through open access, cultural engagement, economic benefits, bringing communities together and achieving long-term preservation (calhoun, , pp. - ; tanner & deegan, , pp. - , - ). the delos digital library manifesto captures well the social and intellectual function of digital libraries, emphasising their facilitation of communication, collaboration and other forms of interaction and placing them at the centre of intellectual activity (candela et al., ).
returning to a local focus, a further area for communication is the library’s capacity to enable digital scholarship and how this will be managed relative to demand and expectation. as mentioned earlier, capacity can take the form of space (sometimes incorporating digital scholarship centres), equipment, storage, and hardware and software platforms. people, however, represent the most valuable resource the library can offer. telling the story of previous or current involvements and initiatives is a good indicator of success and potential for future engagement. 
identifying and promoting the teams, roles, skills and individuals available to participate in digital scholarship is important. job titles and team nomenclature can convey a lot. new library job titles have emerged, such as digital humanities librarian and digital humanities design consultant (rockenbach, , p. ), as have new teams, examples being the scholarly communications team at the university of edinburgh and the open access and data curation team at the university of exeter (corrall, , p. ). brown university (http://library.brown.edu/cds/) is interesting in that its center for digital scholarship represents a cross-departmental library team, led by a digital scholarship services manager and incorporating posts such as scientific data management specialist, manager of imaging and metadata services and data visualization coordinator, with other new posts on the horizon, including digital scholarship editor and information designer for digital scholarly publications, enabling partnership through all steps of the research cycle (maron, , p. ). managing the library’s involvement in digital scholarship is challenging and there needs to be clarity around what can and cannot be done within finite resources in a climate of high expectation and demand. digitisation, in particular, has created unrealistic expectations that any collection can be made accessible in digital format without consideration of cost, complexity or copyright, and librarians have to explain the need for selectivity (mills, , p. ). it is interesting to note the inclusion of a sub-section on managing expectation in an earlier version of the digitisation strategy of the university of manchester library ( , p. ). the management of expectations is a recurrent theme in the literature (maron & pickle, , p. ; schaffner & erway, , p. ; vinopal & mccormick, , pp. - ). 
strategies include publishing criteria for project selection, developing service level agreements, using scale solutions, implementing project and portfolio management, and cost recovery. some of these measures, especially when they involve saying no or levying costs, are unpopular. standing firm and communicating a clear position calls in particular on library leaders to take a strong and active role and to be decisive with regard to prioritisation (vinopal & mccormick, , pp. - ). without clear communication strategies, resources will be spread too thinly, or invested inappropriately, and the library’s reputation as a key player in digital scholarship will be compromised.
communication strategies
promotional campaigns could be regarded as the most likely way to broadcast the library’s capacity to deliver new value and new services, but communicating new library roles to enable digital scholarship poses different challenges. there is a stronger emphasis on understanding, having a facilitative mindset, being “of” the relevant communities, actively delivering, advocating effectively and using social media to build community. delivering on digital scholarship projects and infrastructures is probably the best advertisement for what the library can do. resources and communication effort can, however, be misdirected without a full appreciation first of the local landscape. investment is vital in understanding the priorities of the range of audiences involved and recognising their diverse skills, culture, needs and challenges (lewis, spiro, wang, & cawthorne, ). calhoun ( ) rightly emphasises this point and it is no coincidence that in her table (p. ) of barriers to institutional repositories and possible responses the most common action recommended is conducting audience needs assessments. surveys have also proved to be valuable tools in understanding perspectives on open access (moore, ), including different disciplinary attitudes (creaser, ). 
they can helpfully inform the creation of digital collections (green & courtney, ) by elucidating the complex requirements of users and creating an understanding of how such collections are integrated into humanities scholarship. consultation engages users with the selection of digitisation projects (mills, ) and is essential to the development of policies for research data management (digital curation centre, ; pinfield, et al., , pp. , ). observation is also recommended in assessing the library’s level of engagement with digital humanities and noting gaps to fill (schaffner & erway, , p. ), while there is value in online forms of listening by following social media to learn of developments and to understand language and cultural norms (schrier, ). the mentality that libraries bring to digital scholarship underpins how they communicate their roles. it has already been noted that this field is multi-stranded, experimental and lacking clear boundaries. this calls for an agile outlook from libraries, characterised by “flexibility, inquisitive practices, collaboration, starting with "yes," and being courageous” (alexander, et al., ). a level of confidence, positivity and openness is implied, as is curiosity, which can manifest itself in a willingness to learn and to explore possibilities. it has been noted that the traditional reference interview offers an ideal foundation in this regard (vinopal & mccormick, , p. ). what is needed is to orient it in the direction of open-ended exploration instead of guidance towards specifics (vandegrift & varner, , p. ). a good understanding of user needs can generate a solutions-focused approach. libraries’ digital scholarship websites may communicate this “can-do” approach effectively. 
the emory center for digital scholarship website bills the center as providing “a one-stop shop for anyone at emory interested in incorporating digital technology into teaching, research, publishing, and exhibiting scholarly work” (http://digitalscholarship.emory.edu/). the website of the center for digital scholarship (cds) at brown university has a section titled “how can i work with cds?” which shows what the center can do for users by translating its activities into typical actions for users, followed by photos of staff who can help, creating a very confident offering and a highly positive impression (http://library.brown.edu/cds/). there is no shortage of problems to solve, or user needs to be addressed, and libraries can productively focus their efforts and communications accordingly. for example, discoverability of their digital projects and publications is known to be a concern for scholars (calhoun, , p. ; schaffner & erway, , pp. , ). libraries have always been committed to discovery and have taken on new roles in minting digital object identifiers (dois) and promoting the use of author identifiers such as orcid to associate authors unambiguously with their content. these roles should be positively communicated as value-added solutions from the library. a participative mentality is also needed, and immersion into the digital scholarship community is an effective way of promoting the contributions of librarians. this happens readily when digital scholarship centres are based in libraries, encouraging also a social dimension (lippincott & goldenberg-hart, , pp. - ). any form of proximity certainly helps and co-location at national university of ireland (nui), galway, of the library’s archives and special collections with two major humanities and social sciences research institutes in a new research building has opened up new digital project collaborations (cox, ).
going out of the library and having conversations with a range of stakeholders makes a statement of engagement and builds trust. this may involve attending digital scholarship events in academic departments or presenting papers at seminars and conferences outside the institution (vandegrift & varner, , p. ). libraries can host their own events with positive impact. examples of such events include a programme of digitisation workshops at university college dublin ( ), and a seminar on creating and exploiting digital collections at nui galway ( ) which brought together a number of players across the campus and promoted engagement with the library’s digital scholarship enablement strategy. actively participating in conversations is important and can advance the library role in research data management policy (erway, ) or prove the value of digital collections (schrier, ). relationships are of particular importance in digital scholarship (lippincott, et al., , pp. , ; rockenbach, , p. ), need investment by libraries (posner, , p. ) and can be mutually supportive (vandegrift & varner, ). ultimately, participation is communication. a track record of delivery on digital scholarship projects and infrastructures is the best credential for library capability. libraries commonly use their websites to advertise successful project involvements, examples being the digital humanities center at the university of rochester (http://humanities.lib.rochester.edu/) and the digital scholarship lab at the university of richmond (http://dsl.richmond.edu/). staff expertise is a vital strength and is prominently featured by, among others, the center for digital scholarship at brown university library (http://library.brown.edu/cds/).
documenting progress and achievement through publications can be effective, as experienced at nui galway which has issued annual reports (http://tinyurl.com/legpsxk) of its project to digitise the archive of the abbey theatre (bradley & keane, ), focusing strongly on scholarly engagement with the digital archive. a compelling approach to communicating the library’s role is to link its contributions to all stages of the research lifecycle. good examples of this can be seen at king’s college london (http://www.kcl.ac.uk/library/researchsupport/index.aspx) and the university of california irvine (http://www.lib.uci.edu/dss/). the library can be a leader as well as a partner. librarians develop and lead their own digital humanities projects (posner, , pp. - ) and these need to be promoted. librarians have exercised leadership on campus in open access and, more recently, research data management. each of these areas is complex and in need of people who can advise knowledgeably on policy formulation, interpretation and implementation (briney, goben, & zilinski, ). librarians have established and communicated strong credibility, often as “resident experts in campus discussions” (fruin & sutton, , p. ). advocacy forms part of the communications strategy across all areas of digital scholarship. this is especially the case for open access and research data management, the benefits of which, as already noted, may not be understood or embraced by faculty. promoting each successfully requires an appreciation of campus politics and cultivation of good relations with senior personnel such as research or it directors (pinfield, et al., , p. ), or respected academics who can partner in developing policy and be effective champions in selling it (fruin & sutton, , pp. - ).
keeping documentation concise, clear and benefits-focused is important. an example of how this approach works was in the drafting of a two-page open access policy at nui galway (http://tinyurl.com/pfpslqd). language is significant too, and a very helpful guide to open access policies (harvard university) includes a section on “talking about a policy” which notes terminology to promote or avoid. the word “mandate”, for example, may prove problematic in creating a perception of institutional coercion. empathy with academic concerns and articulation of differentiated audience-specific benefits (calhoun, , p. ) will enhance communication and successful implementation. marketing techniques come into play too and branding can communicate important messages. nui galway’s library has published a digital scholarship enablement strategy (http://tinyurl.com/next cw), deliberately choosing the word “enablement” rather than “service” or “support”. succinct branding is evident in “collaborate → iterate → discuss” for the university of virginia library’s scholars’ lab (http://scholarslab.org/), or “partnering to advance scholarship” at the digital scholarship lab in the j. murrey atkins library, university of north carolina at charlotte (http://dsl.uncc.edu/). the latter institution also offers an example of the successful use of “joined-up” marketing campaigns to promote the library’s publishing services through a variety of channels, including campus conversations, newsletters, guides and a launch party to mark the publication of its first journal issue (wu & mccullough, , pp. - ). multi-faceted campaigns can be built around events such as international open access week (http://www.openaccessweek.org/) every october, the publication of a digital collection at harvard university (madsen, , pp.
- ), or the establishment of a new research storage service at griffith university (searle, ). the use of social media has become a vital component of libraries’ communication strategies, enabling them not just to promote digital scholarship roles and resources but to engage users and build communities. usage of channels such as blogs and twitter is common enough but libraries’ exploitation of the full potential of social media has been limited by a collection-centric rather than people-centric worldview (calhoun, , p. ), with a tendency to promote collections rather than engage users (schrier, ). there has, however, been a definite shift in perspective in recent times from collections to networked communities, from repositories to social platforms and from content consumers to content creators and contributors, creating new roles for libraries on the social web and impacting scholarship more widely as well (calhoun, , pp. - ). researchers have embraced scholarly social networks such as researchgate, academia and mendeley as they enable sharing, discovery and new contacts. similar benefits are expected of digital scholarship platforms and institutional repositories have integrated rss feeds, altmetrics and social media functionality (marsh, , p. ). libraries have used social media optimisation strategies to make it easy to share, bookmark and comment on digital content (calhoun, , pp. - ). crowdsourcing approaches such as transcription, supplementing metadata and the identification and provenance of materials (peaker, ) have also actively engaged audiences and built communities around projects. 
examples include diy history (http://diyhistory.lib.uiowa.edu/) at the iowa digital library, which has engaged participation in the transcription of over , pages of handwritten archival material to date, and the university of pennsylvania libraries’ provenance online project (https://provenanceonlineproject.wordpress.com/) which sources information on the provenance of rare books. value-added participation by librarians in social media conversations around digital collections, and posting of contributions targeted at known areas of interest to a community, are also seen as ways of enhancing credibility, developing trust, building relationships and engaging support (schrier, ). finally, as noted earlier, library managers in particular need to communicate effectively with their own staff. library staff with traditional views of service boundaries may be sceptical about engagement with digital scholarship and the investment of resources in that direction, especially when this represents the replacement of positions formerly assigned to more established, possibly legacy, functions. a clear and ongoing articulation by library leadership of the strategic importance of new digital scholarship roles is needed (vinopal & mccormick, , pp. , , ), incorporating messages around vision, rationale, expectations, priorities and challenges. ensuring connectivity between digital scholarship staff and the rest of the library is important too. briefing sessions to all library staff about activities and initiatives are valuable. they have, in the author’s experience, proved effective at nui galway, enabling face-to-face communication and discussion. linkage with established areas like archives or research services is needed and can be cultivated. the number of library staff involved in digital scholarship is typically small relative to the whole library team and this creates its own pressure.
such staff may be overextended, in need of guidance or direction, challenged by the evolving skillset required or frustrated by slow progress. they too need particular communication from library leadership to support, guide, reassure and encourage, as well as to commit the necessary resources, including training or development opportunities and even the permission to fail (posner, , p. ). effective communication structures within a digital scholarship team, including regular meetings, will ensure awareness of activities as well as sharing of, and learning from, experience.
conclusion
digital scholarship is a relatively new field of activity and is presenting both opportunities and challenges for libraries. the field is multi-stranded and the library response has mirrored this, with a wide range of initiatives and innovations in evidence. there are many communities involved in digital scholarship and a distinctive, experimental culture has developed, often resulting in a somewhat disjointed approach across the campus. libraries need to make their contribution and to communicate their roles in this environment, recognising and overcoming potential mismatches in culture and perspective. some big positives are the strong relationships that libraries have typically built with their academic communities, the natural fit between digital scholarship and the library mission, and the need for library contributions, both of themselves and to deliver sustainability. communication on campus and beyond about digital scholarship projects, by libraries and others, has not always been a strength. library roles may not be recognised and it is vital to get out important messages about people, skills, capabilities, collections, spaces and infrastructures, as well as the benefits delivered.
these are valued, as is the move towards a partnership approach which can also be promoted in new job titles and team names. a specific communications strategy is needed, one that focuses on inserting the library into digital scholarship communities, mirroring their experimental mindset, and projecting a confident, “can-do” outlook. librarians need to participate, attend, present and converse, in general by being “out there”, communicating by doing and by sharing expertise. all of this must, however, be based on understanding the nature and needs of those involved in digital scholarship and their range of activities in order to communicate added value and to advocate effectively and sensitively. online communications are important, especially the strategic use of social media to build trust and community. engaging all library staff also needs effort so that they understand and can promote the library’s new roles as an enabling partner in digital scholarship.
references
alexander, l., case, b. d., downing, k. e., gomis, m., & maslowski, e. ( ). librarians and scholars: partners in digital humanities. educause review, ( june ). retrieved from http://er.educause.edu/articles/ / /librarians-and-scholars-partners-in-digital-humanities
arms, w. y., calimlim, m., & walle, l. ( ). escience in practice: lessons from the cornell web lab. d-lib magazine, ( / ). retrieved from http://www.dlib.org/dlib/may /arms/ arms.html doi: . /may -arms
bradley, m., & keane, a. ( ). the abbey theatre digitisation project in nui galway. new review of information networking, ( - ), - . doi: . / . .
briney, k., goben, a., & zilinski, l. ( ). do you have an institutional data policy? a review of the current landscape of library data services and institutional data policies. journal of librarianship and scholarly communication, ( ). retrieved from doi:http://dx.doi.org/ . / - .
brown, r. a., wolski, m., & richardson, j. ( ). developing new skills for research support librarians. the australian library journal, ( ), - . doi: . / . .
calhoun, k. ( ). exploring digital libraries: foundations, practice, prospects. london: facet.
candela, l., castelli, d., pagano, p., thanos, c., ioannidis, y., koutrika, g., . . . schuldt, h. ( ). setting the foundations of digital libraries: the delos manifesto. d-lib magazine, ( / ). retrieved from http://www.dlib.org/dlib/march /castelli/ castelli.html
corrall, s. ( ). designing libraries for research collaboration in the network world: an exploratory study. liber quarterly, ( ). retrieved from http://liber.library.uu.nl/index.php/lq/article/view/ /
cox, j. ( ). the strategic significance of the hardiman research building retrieved november , from http://www.slideshare.net/jjcox/the-strategic-significance-of-the-hardiman-research-building- jan
creaser, c. ( ). open access to research outputs—institutional policies and researchers' views: results from two complementary surveys. new review of academic librarianship, ( ), - . retrieved from doi:http://dx.doi.org/ . /
creaser, c., fry, j., greenwood, h., oppenheim, c., probets, s., spezi, v., & white, s. ( ). authors’ awareness and attitudes toward open access repositories. new review of academic librarianship, (s ), - . retrieved from doi:http://dx.doi.org/ . / . .
digital curation centre. ( ). five steps to developing a research data policy retrieved november , from http://www.dcc.ac.uk/sites/default/files/documents/publications/dcc-fivestepstodevelopinganrdmpolicy.pdf
erway, r. ( ). starting the conversation: university-wide research data management policy. educause review, ( december ). retrieved from http://er.educause.edu/articles/ / /starting-the-conversation-universitywide-research-data-management-policy
fruin, c., & sutton, s. ( ). strategies for success: open access policies at north american educational institutions. college & research libraries. retrieved from http://crl.acrl.org/content/early/ / / /crl - .full.pdf
green, h. e., & courtney, a. ( ). beyond the scanned image: a needs assessment of scholarly users of digital collections. college & research libraries, ( ), - . doi: http://dx.doi.org/ . /crl. . .
harvard university. good practices for university open access policies retrieved november , from http://cyber.law.harvard.edu/hoap/good_practices_for_university_open-access_policies
huwe, t. k. ( ). digital publishing: the next library skill. online searcher, ( ), - .
kitchin, r., collins, s., & frost, d. ( ). funding models for open access digital data repositories. online information review, ( ), - . retrieved from doi:http://dx.doi.org/ . /oir- - -
lampert, c. k., & vaughan, j. ( ). success factors and strategic planning: rebuilding an academic library digitization program. information technology and libraries, ( ), - .
lewis, v., spiro, l., wang, x., & cawthorne, j. e. ( ). building expertise to support digital scholarship: a global perspective clir publication , retrieved from http://www.clir.org/pubs/reports/pub
lippincott, j. k., & goldenberg-hart, d. ( ). cni workshop report. digital scholarship centers: trends and good practice retrieved from https://cni.org/wp-content/uploads/ / /cni-digitial-schol.-centers-report- .web_.pdf
lippincott, j. k., hemmasi, h., & lewis, v. m. ( ). trends in digital scholarship centers. educause review, ( june ). retrieved from http://er.educause.edu/articles/ / /trends-in-digital-scholarship-centers
lynch, c. ( ). the “digital” scholarship disconnect. educause review, ( ), - . retrieved from https://net.educause.edu/ir/library/pdf/erm .pdf
madsen, c. ( ). the importance of ‘marketing’ digital collections: including a case study from harvard’s open collections program. aliss quarterly, ( ), - . retrieved from http://issuu.com/alissinfo/docs/october
maron, n. l. ( ). the digital humanities are alive and well and blooming: now what? educause review, ( ), - . retrieved from http://er.educause.edu/articles/ / /the-digital-humanities-are-alive-and-well-and-blooming-now-what
maron, n. l., & pickle, s. ( ). appraising our digital investment: sustainability of digitized special collections in arl libraries retrieved from http://www.sr.ithaka.org/publications/appraising-our-digital-investment/
maron, n. l., & pickle, s. ( ). sustaining the digital humanities: host institution support beyond the start-up phase retrieved from http://www.sr.ithaka.org/wp-content/uploads/ / /sr_supporting_digital_humanities_ f.pdf
maron, n. l., smith, k. k., & loy, m. ( ). sustaining digital resources: an on-the-ground view of projects today ithaka case studies in sustainability, retrieved from http://www.sr.ithaka.org/publications/sustaining-digital-resources-an-on-the-ground-view-of-projects-today/
marsh, r. m. ( ). the role of institutional repositories in developing the communication of scholarly research. oclc systems & services: international digital library perspectives, ( ), - . retrieved from http://dx.doi.org/ . /oclc- - -
mills, a. ( ). user impact on selection, digitization, and the development of digital special collections. new review of academic librarianship, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
monastersky, r. ( ). publishing frontiers: the library reboot. nature, , - . retrieved from doi: . / a
moore, g. ( ). survey of university of toronto faculty awareness, attitudes, and practices regarding scholarly communication: a preliminary report retrieved from https://tspace.library.utoronto.ca/handle/ /
national university of ireland, galway. james hardiman library. ( ). creating and exploiting digital collections: seminar, july retrieved november , from http://tinyurl.com/oo vz d
peaker, a. ( ). crowdsourcing and community engagement. educause review, ( ), - . retrieved from http://er.educause.edu/articles/ / /crowdsourcing-and-community-engagement
pinfield, s. ( ). making open access work: the ‘state-of-the-art’ in providing open access to scholarly literature. online information review, ( ), - . retrieved from doi:http://dx.doi.org/ . /oir- - -
pinfield, s., cox, a. m., & smith, j. ( ). research data management and libraries: relationships, activities, drivers and influences. plos one, ( ), e . doi: . /journal.pone.
posner, m. ( ). no half measures: overcoming common challenges to doing digital humanities in the library. journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
rockenbach, b. ( ). introduction. journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
schaffner, j., & erway, r. ( ). does every research library need a digital humanities center? retrieved from http://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-digital-humanities-center- .pdf
schrier, r. a. ( ). digital librarianship & social media: the digital library as conversation facilitator. d-lib magazine, ( / ). retrieved from http://www.dlib.org/dlib/july /schrier/ schrier.html doi:http://dx.doi.org/ . /july -schrier
searle, s. ( , september ). a communication and marketing campaign for research data storage. retrieved from www.samsearle.net/ / /a-communication-and-marketing-campaign.html
searle, s., wolski, m., simons, n., & richardson, j. ( ). librarians as partners in research data service development at griffith university. program: electronic library and information systems, ( ), - . retrieved from doi:http://dx.doi.org/ . /prog- - -
sinclair, b. ( ). the university library as incubator for digital scholarship. educause review, ( june ). retrieved from http://er.educause.edu/articles/ / /the-university-library-as-incubator-for-digital-scholarship
sula, c. a. ( ). digital humanities and libraries: a conceptual model. journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
tanner, s., & deegan, m. ( ). inspiring research, inspiring scholarship: the value and benefits of digitised resources for learning, teaching, research and enjoyment retrieved from http://www.kdcs.kcl.ac.uk/fileadmin/documents/inspiring_research_inspiring_scholarship_ _simontanner.pdf
tenopir, c., sandusky, r. j., allard, s., & birch, b. ( ). research data management services in academic research libraries and perceptions of librarians. library & information science research, ( ), - . doi: http://dx.doi.org/ . /j.lisr. . .
university college dublin digital library. ( ). going digital: the application of new technologies to facilitate research insights retrieved november , from http://libguides.ucd.ie/ld.php?content_id=
university of manchester. john rylands university library. ( ). digitisation strategy retrieved november , from http://www.library.manchester.ac.uk/services-and-support/staff/teaching/services/digitisation-services/about/_files/digitisationstrategyfinal.pdf
vandegrift, m., & varner, s. ( ). evolving in common: creating mutually supportive relationships between libraries and the digital humanities. journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
vinopal, j., & mccormick, m. ( ). supporting digital scholarship in research libraries: scalability and sustainability. journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
waters, d. j. ( ). an overview of the digital humanities. research library issues, ( ). retrieved from http://publications.arl.org/rli /
wu, s. k., & mccullough, h. ( ). first steps for a library publisher: developing publishing services at unc charlotte j. murrey atkins library. oclc systems & services: international digital library perspectives, ( ), - . retrieved from doi:http://dx.doi.org/ . /oclc- - -

davin heckman and james o’sullivan, “electronic literature: contexts and poetics”
dlsanthology.mla.hcommons.org
this essay is part of the third iteration of the anthology. since public review and commentary help scholars develop their ideas, the editors hope that readers will continue to comment on the already published essay. you may also wish to read the draft essay, which underwent open review in , and the project history.
introduction
what is electronic literature? producing a conclusive answer requires a response to a different but related perplexity that has persisted for far longer: what is literature? for derrida, the “institutionless institution” of literature is “a paradoxical structure,” “constructed like the ruin of a monument that basically never existed” ( ). electronic literature should be construed not as other but rather as a construction whose literary aesthetics emerge from computation—a system of multimodal forces with the word at its center. since first garnering critical attention, electronic literature has been theorized and critiqued in a variety of ways, but it remains as ambiguous as ever. it is ambiguous because it is amorphous, and for each trait that might be classified, a new form, or potential, emerges from previously unanticipated evolutions or juxtapositions.
¶ in its earliest days, electronic literature was closely associated with the literary hypertext. the emergence of narrative selections—of choice—was not exclusive to digital media, but the computer allowed these selections to be rendered in previously unforeseen ways. with the proliferation of new technologies, this trend shows no sign of abating: practitioners have a continuous stream of new modes of production to adopt and manipulate for the purposes of artistic expression. where we once had the hypertext, we now have, for example, augmented reality, and there is no predicting where the literary may reside decades from now. what has remained constant, however, not just within the context of this digital epoch, but over centuries, is the presence of the literary. ¶ electronic literature, essentially, must be electronic and literary. even if we cannot define the literary, we can at least recognize it, and, from recognition, we can begin to build meaning. this chapter attempts to do just that: offer readers an account of some of the contexts that suggest literature that is inherently digital and extrapolate from those contexts a poetics suited to works of this nature. ¶ technological influences on contemporary modes of expression have given rise to new literary forms that continue to attract authors and intrigue critics. while the origins of electronic literature can be traced back several decades, the field, as both an artistic movement and a branch of scholarship, is still in its formative stages. being literary and bound to rapidly evolving digital aesthetics, electronic literature resists stable definition, but some aspects of it lend themselves to classification.
electronic literature, as the term has come to be used by the broader field of digital scholarship, does not simply refer to static text offered through screen media. n. katherine hayles defines a work of electronic literature as “a first-generation digital object created on a computer and (usually) meant to be read on a computer” ( ). a more recent definition, by serge bouchardon, is based on the same principal distinction between “digitized and digital literature”: ¶ we can retain the idea that the mere fact of being produced on a computer is not enough to characterize digital literature. digital literature uses the affordances of the computer to dynamically render the story. if an e-reader simply displays text in the way a printed book displays text—the only difference being that to advance the text one scrolls rather than turns a page—this is not “digital literature.” it is printed work digitized for optimal display in a portable computational environment. digital literature is algorithmic. it changes as the reader engages it. ( ) ¶ electronic literature has emerged from intermedial juxtapositions of literary and computational aesthetics, and it resides at the juncture between the most contemporary linguistic and multimodal aesthetics, manipulating language through digital paratextuality and technical structures. in this sense electronic literature, or e-lit, is not to be confused with text that has merely been remediated; remediation being “the representation of one medium in another” (bolter and grusin ). ¶ literature probes the entire apparatus of linguistic communication, expands the range of expression, and debunks the illusory certitudes of ordinary speech.
in an age pulled apart by the crisp declarations of twittering tyrants and the general malaise of a postfactual society at war with itself, literature doubles down: it seeks meaning in nonsense and makes strange what is known. instead of tearing down one slogan to replace it with another, the literary imagination seeks to carve out worlds within. to be sure, literature, electronic or otherwise, is not the only political project that matters; it is not even, in itself, a “political project” at all. rather, it is liberation by another means. to illustrate this, one might think of language as the historical image of the police call box: a ubiquitous reminder of order, a means to mobilize police action, and a holding cell for those who violate laws. but in the hands of literary artists (and their companions, the readers who travel with them), this box is bigger on the inside than it is on the outside, it bends the spatiotemporal laws that keep us bound, and it brings us opportunities to witness, wonder, intervene, reflect, and transform. the digital has simply expanded the scope of such opportunities, but with every expansion there is also constraint, and the hand of the author or artist produces meaning from within such confines. in short, what we have here is literature, but of a different sort, and difference is valuable. ¶ while print can complement a work of electronic literature, computation should constitute some inherent component of the piece’s aesthetics. even where a material connection between print and digital is absent, many aesthetic conventions persist between the forms: “digital technology advances poetry into dynamic areas that were at least partially available in the prehistoric and even pretechnologic era” (funkhouser, prehistoric digital poetry ).
identifying the precise point of demarcation between literature that has been remediated and literature that is born digital can prove problematic. as readers, we must be cautious not to confuse formats with poetics, placing artificial boundaries between forms of digital artistry for critical convenience. while the aesthetics of electronic literature should not be reduced to text on a screen, a piece of digitized print literature could incorporate some innovation that allows us to classify the work, in some respect, as born-digital. what we can gather from classifying works is that the practice of digitizing print literature in itself does not constitute electronic literature and that print literature can be reimagined through computation. ¶ while hayles’s definition of electronic literature—as “a first-generation digital object created on a computer and (usually) meant to be read on a computer”—is perhaps the most widely used, many critics have elaborated on the nature of the art. espen j. aarseth’s cybertextuality, or what he referred to as “ergodicity,” was among the first of the major “post-hypertextual” theories. a text is considered ergodic when “nontrivial effort is required to allow the reader to traverse the text” ( ). early delineations tended to focus on nonlinearity and on the potential for electronic literature to possess a perceived “ability to vary, to produce different courses” ( – ). traversal functions have remained central to the appreciation and interpretation of electronic literature, but more recent examinations of the form have jettisoned the precarious notion of linearity. noah wardrip-fruin notes that electronic literature is simply “a term for work with important literary aspects that requires the use of digital computation” ( ).
this is aligned with the electronic literature organization’s definition, which encompasses any work “with an important literary aspect that takes advantage of the capabilities and contexts provided by the stand-alone or networked computer” (“what is e-lit?”). ¶ the evolutionary essence of electronic literature makes settling on a consistent ontology a difficult, if not altogether undesirable, undertaking. the rapid proliferation of creative technologies has lent itself to this transience. scott rettberg hits upon the crux of the matter when he describes the field as “a kind of moving target” ( ). situating digital constructs on a spectrum of computational art is perhaps a more pragmatic strategy than precise ontologizing. astrid ensslin’s literary-ludic spectrum is the methodological realization of ludoliteracy’s tendency to “exhibit various degrees of hybridity,” the “complex expressive processes” of digital media meaning that this mode typically refuses to fall “neatly into generic or typological categories” ( – ). accepting that electronic literature can be many things across a broad spectrum allows us to move beyond the quandaries of definition to an inclusive critical framework that is more readily applicable to interpretations of born-digital art. electronic literature can take many forms—hypertexts, codeworks, literary games, augmented realities—so much so that many forms of its earliest manifestations have already been lost to history, and there exists an array of future iterations yet to be conceived. ¶ as counterintuitive as it may seem, electronic literature needs to be considered as an umbrella term that incorporates an ever-increasing range of literary forms that use a larger sensorium of effects than traditional literature—electronic literature is inherently multimodal.
electronic literature consistently relies on language and computation: the latter establishes meaningful rules that manipulate the former, sometimes based on reader interactions. these rules shape the content through dynamic procedures that cause the literary to emerge as much from the medium as from its content. e-books, for example, usually contain print literature that has been relocated from the page to the screen—these books benefit from technology’s disseminative potential but typically not its creative affordances. digitized and digital literature differ in their presentation and expression—digitized literature mirrors the codex on a screen, whereas digital literature allows computer-driven transformations to occur beyond the surface; the impact of the digital is not merely seen in the display, but embedded throughout the entire aesthetic configuration. electronic literature is work that could only exist in the space for which it was developed/written/coded—the digital space, which, while commutative, cannot be without the technical affordances of its underlying systems. the emergence of electronic literature ¶ electronic literature is a continuation of aesthetic practices that were in existence long before the advent of digital computing. while ease of dissemination is now a major benefit of the medium, prior to consumer electronics and the contemporary web, works of creative computation presumably went largely unpublished and have since been lost. some first-generation works have been preserved to a degree, but first generations begin at the point of general discovery, and one can only speculate about the vast quantities of material that never entered the public sphere. the sad reality is that there
are probably hundreds of obsolete drives containing electronic literature’s earliest experiments, and these dormant literary archives are more likely to occupy a landfill than a library. ¶ some of the earliest works of electronic literature that received (relatively) popular and critical attention are judy malloy’s uncle roger, first released in as a serial on the well’s art com electronic network; john mcdaid’s uncle buddy’s phantom funhouse, a hypertext novel produced with hypercard . and commercially released in ; shelley jackson’s patchwork girl, originally published in on . floppy disks and more recently released on flash drives; and bill bly’s we descend, which initially appeared in and was re-released, with new content, on the web in (malloy – ). robert coover, in “literary hypertext: the passing of the golden age,” his october keynote address at the digital arts and culture conference in atlanta, georgia, refers to michael joyce’s afternoon, a story, malloy’s its name was penelope ( ), stuart moulthrop’s victory garden ( ), and patchwork girl as the “early classics.” ¶ much of electronic literature’s first generation of works formed part of the eastgate school, which saw the commercial publication of numerous canonical hypertextual fictions through eastgate systems’s storyspace platform. foremost among these early hypertexts was michael joyce’s afternoon, a story, first demonstrated at the meeting of the association for computing machinery and published in . joyce presented the paper in question, “hypertext and creative writing,” alongside jay david bolter. in describing the mechanics of the literary hypertext, bolter and joyce pointed to “a new literary dimension” in which authors can work: “instead of a single string of paragraphs, the author lays
out a textual space within which the fiction operates” ( , ). many of the early eastgate titles were constructed this way, offering a variety of paths through which the reader can traverse literary fragments known as lexia. ¶ as more intuitive and sophisticated multimedia applications and computer systems became available, electronic literature evolved into a variety of increasingly intermedial forms. in scott rettberg, robert coover, and jeff ballowe founded the electronic literature organization (elo), a nonprofit initiative intended to “promote the reading, writing, teaching, and understanding of literature as it develops and persists in a changing digital environment” (“history”). founded in chicago, the elo established its first institutional headquarters at the university of california, los angeles, in . in the organization moved to the university of maryland, college park, before relocating to the massachusetts institute of technology in . this year saw the elo move to its current headquarters, at washington state university, vancouver. the publication of the elo’s first electronic literature collection in october (fig. ) was a milestone in the advent of electronic literature’s being regarded as more than merely hypertextual. it is, as chris funkhouser claims, “the first major anthology of contemporary digital writing” (“electronic literature”). edited by hayles, nick montfort, rettberg, and stephanie strickland, the collection marks electronic literature’s progression toward increased multimodal, intermedial, and computational complexity. ¶ fig. . a screenshot of the home page of electronic literature collection, vol. , , collection.eliterature.org/ /index.html. ¶ composed of sixty works of electronic literature, the collection offers readers an opportunity to browse by genre.
the electronic literature collection embraces a range of technologies, including “ambient,” “animation/kinetic,” “constraint-based/procedural,” “generative,” “flash,” “javascript,” “shockwave,” and “vrml” (“contents by keyword”). mark c. marino’s review in digital humanities quarterly refers to the collection as a “menagerie of forms” that “offer a sense of the perpetual metamorphosis of electronic literature.” this collection, as marino rightly asserts, is all about “variety.” individual authors had moved beyond the hypertext long before , but publication of the elo’s first collected volume was the field’s first definitive statement on electronic literature’s being more than just links. in february , the second volume of the electronic literature collection, edited by laura borràs, talan memmott, rita raley, and brian kim stefans, was published, followed by a third volume in , edited by stephanie boluk, leonardo flores, jacob garbe, and anastasia salter. the ways in which the field has evolved can be appreciated through these collections, which offer snapshots of the movements, technologies, and techniques favored by artists at different times. the canon is, of course, far broader than what can feasibly be presented in any set of anthologies, and as more development companies turn to the ludoliterary, we are seeing a much higher volume of electronic literature permeating the mainstream. ¶ although several books provide a historical perspective of electronic literature, much work remains to be done to build a literary history of electronic literature.
recent research by moulthrop and dene grigar for pathfinders, a preservation project funded by the national endowment for the humanities, has uncovered historical information about the aforementioned early works of electronic literature by mcdaid, malloy, jackson, and bly. the pathfinders project is a significant contribution to the field’s relatively sparse, and increasingly jeopardized, literary history. ¶ rather than approach the question of electronic literature by mapping out its historical development or its relation to social and institutional organizations that engage in its creation, consumption, criticism, and curation, one can attempt to interrogate the ways “writing with” a computer can help authors add new dimensions to the literary as a species of form. as flusser explains, writing has some preconditions: the blank surface; a means to mark the surface; an alphabet; knowledge of a “convention” that allows this alphabet to correspond to something else; knowledge of the proper form for constructing this alphabet; knowledge of a specific language; knowledge of this language’s rules of writing; an idea that can be communicated through writing; a motive to communicate the idea through writing ( ). ¶ for flusser, these preconditions recede into the background of our consciousness as the habit of writing supplants the conscious effort with which we learn to write. for instance, it is difficult to know when a child recognizes the relation between written and spoken words, and a child learns the significance of specific words later. later still, a child begins reading new words. and, of course, it is entirely possible for a child to never learn the written language and still be able to communicate complex ideas through verbal means alone.
what we should note is that writing itself does not enable complex communication—it simply complicates communication. but if we do not make these preconditions explicit, we forget how writing works. ¶ the introduction of an accessible form of recording and transmission, the emergence of democratic theories of governance, and the dream of universal literacy engage the general public in the translation of everyday practices into written text. these everyday practices, in turn, feed into abstract practices of documentation, planning, and conceptual thinking surrounding archivable, teachable, and replayable formats that permit us to further distinguish between noise and pattern, introducing notions that the patterns themselves might be compared, scrutinized, rejected, accepted, and hypothesized. this feedback loop provides the foundation for critical thinking and public discourse. thus, the historical coincidence between the emergence of print literacy and the accelerated production of knowledge has conditioned us to think of these two practices as intrinsically linked. however, as the electronic literature movement has shown, other routes to the same goal are possible. ¶ as a number of scholars have found, many of the insights and impulses we associate with contemporary digital writers were anticipated in the work of earlier writers. chris funkhouser’s prehistoric digital poetry, the po.ex digital archive of portuguese experimental literature, and george landow’s hypertext are projects that represent the practical and theoretical ways that the qualities we associate with digital media were conceptually evident to writers before the development of advanced digital technology.
once the computer became available, even before digital literary texts were formally produced, literature saw a period of intense protodigital experimentation and reflection. nowhere is this clearer than with the oulipo writers, who explored notions like creating all the possible works through a mathematical formula (as raymond queneau did in his cent mille milliards de poèmes, a work that contains 100,000,000,000,000 poems) or the creative possibilities of writing under constraint (like georges perec in his la disparition, a novel that does not include the letter e). although the appeal of such works often resides in concepts, the notion that literature can be understood through formal processes reflects the sheer impact of the technical worldview on our understanding of human expression. ¶ yet, there is something critical to the relation between print literature and electronic literature. funkhouser, for instance, explains, “poetry is poetry, and computer poetry—though related to poetry—is computer poetry” (prehistoric digital poetry ). in the context of the argument he has developed, this distinction is significant: reading electronic literature as a strict continuation of a literary history or as a digitization or extension of print misses the point. the taxonomy of literature does not produce even parallels across its subdivisions, so it is a mistake to believe that digital mediality would simply mirror the generic features of the neighboring branches. in poetry, aural qualities are formal elements that allow one to draw distinctions. in the novel, themes, tropes, and narrative qualities are prioritized. however, though a sonnet has certain sonic qualities that designate it as such, these formal characteristics are tied to narrative and thematic qualities as well.
thus, a sonnet might have some topical affinity with, say, the low literary form of the contemporary romance novel. this is simply to say that literature, even at its most canonical, suffers from a promiscuous ontology. at some level, the application of this ontology to emerging media, while a useful heuristic at times, must occasionally be hacked, transmigrated, or overwritten to permit recognition of different formalities. ¶ any reader who expects digital works to simply continue down the path of print literature as it has progressed through the twentieth century is going to find that electronic literature is inferior or imitative in some respects. for instance, developing a voice that is convincingly personal in its human patterns while exhibiting naturalistic eccentricities is something that computers are not good at yet—either the program exhibits recognizable character traits through generalization, or the program generates surprise through randomization, each representing abstracted and extreme qualities that are successfully balanced in the well-rendered character. to transcend this, writers can intervene directly in the process through writing, or they can experiment with algorithms, parameters, or databases to craft a more nuanced generalization. the third option, which is much harder for readers and writers nurtured on traditional forms—but which finds encouragement in aspects of the avant-garde sensibility, without necessarily carrying the ideological weight—is to simply explore the limits of the available tools without worrying about whether or not works line up with prior practices (in other words, does a work of prose fiction have to look like a novel? does a poetic work have to look like a poem? what signs can literature be made of?).
for purely historical reasons, we must, as demonstrated by funkhouser and hayles, consider that electronic literature is materially different from print literature and can thus benefit from a liberal attitude toward historical literary criteria—a liberality that is offset by a rigorous analysis of the properties of the medium itself. when the inherited literary criteria do not apply, or only partially do, the attentive reader should recognize that something else might be happening in the text beyond mere novelty. ¶ however, by working with and against these technical limits, the writer is engaged in a kind of poiesis that parallels the challenge that words have historically presented to authors, only by way of an altered system of representation. if early novelists, for instance, explored the potential of the epistolary form to create the pretext necessary for the experience of the text as literature, one can argue that contemporary writers are engaged in similar practices with computers. is the epistolary format strictly “about” letters being exchanged? or is it about simulating a record of a familiar form of text-based communication between two subjects? if writers developed and readers learned dialogue conventions that enabled conversations to unfold on the printed page, we can say that digital pioneers are exploring and contemporary readers are field-testing new conventions for the experience of a literary representation. the goal of the author, then, is not to mimic the formal practice of indicating dialogue, but to facilitate a calculated transmission of that dialogue to a hypothetical reader in a manner consistent with the formal, technical, and narrative priorities of the work.
this insight is important for critics because it suggests that there is enormous potential in treating electronic literature like traditional print literature, provided we engage in this treatment retrospectively rather than the other way around. if we look at literature and ask how electronic literature represents a hypothetical future, we judge the not-yet-created based on the material accidents of the old. however, if we accept electronic literature without speculation as contemporary literature and read backward into history, we can see old literary techniques more clearly, recognize the determining aspects of history, unveil components of the dialectical process that are otherwise concealed, and, finally, improve more broadly on the theories of literature, literacy, and, ultimately, language itself. ¶ today, it is difficult to imagine writers who do not employ some aspect of digital process in their work, in composing, editing, or publishing, but the fact remains that the digital is not simply a technology that has washed over the field of literature, resulting in electronic literature as a default practice. indeed, electronic artists, while often striving toward the cutting edge, are also likely to spend years exploring a particular format to experience the full range of affordances that might be found, recognizing that some affordances only arrive through habitual use as the form itself becomes representative of something. writers make creative use of ubiquitous forms, expanding the range of expression while having fun with emerging habits of web readership. many writers have found in the techniques and technologies of writing occasions to reflect on the act of writing itself.
this reflection is so focused, in fact, that there is a community of writers, publishers, and critics that labors specifically in and around the affordances and limitations of the computer. ¶ in the work of richard holeton, readers will find a consistent tendency to exploit commonplace digital forms for literary effect. his early work frequently asked questions about “hypertext” (fig. ) uses the frequently asked questions (faq) convention to support a comprehensive satire of digital forms. ¶ fig. . a screenshot of the home page of richard holeton’s frequently asked questions about “hypertext.” from electronic literature collection, vol. , electronic literature organization, , collection.eliterature.org/ /works/holeton__frequently_asked_questions_about_hypertext/index.html. ¶ the piece purports to answer common questions about the anagrammatic poem “hypertext,” by “alan richardson,” and it performs this work, appropriately, as a hypertext. those familiar with the faq format understand that it is implicitly fictional, since faqs are written in anticipation of a hypothetical reader’s questions. at best, faqs are culled from actual questions and streamlined into the simulated perspective of a typical reader. at their most inventive, faqs contain questions that are purely speculative, reflecting what the creators think a reader should know. in keeping with the pragmatic mission of the faq format, the questions and answers tend toward a kind of abstract precision.
when faqs fail to answer a reader’s question, it is usually because of a solipsism and circular ontology that, in itself, is a conceptual hypertext that leads toward an idealized form of “customer satisfaction.” in the process of satirizing the faq format, holeton tells a story about the controversy surrounding the poem and thus manages to pull a host of other aspects of digital communication into this elegant work. the fictional poet, alan richardson, is alleged to be a tech-boom millionaire whose poem was circulated virally through e-mail. yet, he is a mysterious figure who has “disappeared,” exciting the interest of conspiracy theorists, literary critics, fan-fiction communities, and hackers, who are all represented in the faqs. what at first appears to be a simple satire of digital banality gives way to a sprawling world of competing speculations that undercut the solidity of the work’s asserted form. in other works, like “custom orthotics changed my life” or voyeur with dog, holeton uses the professional slideshow format, complete with bullet points and colorful charts, to tell comically banal stories of human folly and tragedy. although these works are new arrivals on the literary scene, they evoke an entire history of literary practices that exploit the norms of language and explore its potential. ¶ the evocation of this history is evident in contemporary screen fictions, even in technically complex developments that incorporate state-of-the-art components like expansive playable spaces, physics engines, and virtual and augmented realities. ensslin’s spectrum is both expanding and contracting: the range of technologies that offer creative affordances is growing, but the aesthetic boundaries that dissect this scale are being drawn closer together.
the great irony of electronic literature, often heralded as an esoteric field on the periphery of literary, media, and digital scholarship, is that literary games have never been more popular. in the mobile game market, where the audience is usually casual gamers, we see that hypertext has fashioned a revival: games like reigns ( ) and lifeline ( ) appear like recent additions to the ios games catalogue, but they are structurally no different from the fictions of the eastgate school—the narrative progresses as the user chooses among a selection of paths, which lead to different lexia. it is true that these games have been adapted for the specifics of the platform—lifeline mimics mobile communications, whereas reigns operates as something of a commentary on tinder (players select narrative paths by swiping left or right)—but the affinities with their antecedents outweigh these particulars. ¶ fig. . a screenshot of mez breeze and andy campbell’s all the delicate duplicates. from steam, feb. . ¶ duplicates is as beautiful as it is technically impressive, and it signals how artists like breeze and campbell are drawing electronic literature in from the outskirts of the canon—this work has received mainstream accolades. among other awards, it won the tumblr international digital media prize, and it was an official selection at the showcase parallels freeplay independent games festival, as well as a finalist for the bbc writersroom / the space prize for digital theatre. ¶ such works are both the present and future of electronic literature—a future that possesses forms we cannot even begin to anticipate.
consider the trajectory of breeze and campbell: like their contemporaries, they would have started with command-line, inherently textual environments—mezangelle is representative of these beginnings. literature has always been textual, and the computer afforded an opportunity for a reciprocative textuality. now the domain is one in which the real and fantastical are continuously merged, through immersion and augmentation. but even as technologies advance and the works of the pioneers look increasingly archaic, their significance has never been more apparent. the schemata that the pathfinders—to borrow from grigar and moulthrop—established remain evident, even as the successors have overlaid them with increasingly intricate multimodal mosaics. the aforementioned mobile games and the prize-winning game worlds produced by breeze and campbell are but the most contemporary iteration of a long line of literary practices. electronic literature now has its own lineage; where shelley jackson used hyperlinks between segments of text, breeze and campbell use -d objects developed using a resource-intensive engine. all this—drawing attention to examples of digital works, both old and new—is simply a means of demonstrating that the goal of the form remains consistent: to manipulate language, to transform the linguistic into the literary by means of computation. ¶ in many cases, the work of electronic literature practitioners results strictly in objects that could not exist on the printed page, but we should be reluctant to say that the concepts these writers explore would be inconceivable to anyone else. oral poetry, song, and dramatic literature are all time-based. gaming, ritual, and call-and-response performances are all interactive or collaborative storytelling techniques.
pictographic writing systems, religious art, ritual, and drama are all visual. music, oratory, and performance all have audio components. many games and rituals include elements of chance or creative modes of meaning generation. architectural spaces and medieval manuscripts are hypertextual for readers. at times the print tradition has looked to these close relations to achieve a perspective of estrangement from conventional language, to introduce a reflexive process into the act of reading and writing text. the miracle of electronic literature is not that computers are current; the miracle is that it is so thoroughly anticipated, suggesting that the literary perspective is a viral, feral, primordial tendency of human consciousness. but everyday linguistic practices reflect how human beings cannot live without contemplating, modifying, and sharing ideas. the literary mode seeks to represent and reproduce these practices in technical objects. though hardly the expression of individual artistic genius, memes circulate through this raw literary tendency. the aggregate effects of small acts of liking, sharing, and making as a twenty-first-century writing practice constitute a mode of poetic activity to which the main channels of literary theory have not responded. electronic literature as a creative practice, a focal point for a community of readers, and a subject of scholarly discourse provides an alternative zone in which the techniques and technologies of language are open for criticism and speculation in a period of radical transformation. notes ¶ . the treatments of this lineage that readers may find useful include glazier’s and di rosario’s. ¶ . ludic refers to the characteristics of play: in this context, characteristics one would typically associate with a video game. ¶ .
we would like to acknowledge dene grigar, president of the electronic literature organization and associate editor of this anthology, for her guidance on this section. ¶ . for more information about uncle roger, see “judy malloy’s uncle roger,” a section of grigar and stuart moulthrop’s pathfinders project (scalar.usc.edu/works/pathfinders/judy-malloy). ¶ . the well, or whole earth ’lectronic link, is a virtual community started in by stewart brand and larry brilliant (www.well.com/aboutwell.html). ¶ . for additional information about mcdaid’s work, see “john mcdaid’s uncle buddy’s phantom funhouse,” part of the pathfinders project (scalar.usc.edu/works/pathfinders/john-mcdaid). ¶ . hypercard is a hypermedia programming application for early apple systems, such as the apple macintosh and apple iigs, that predates the web. launched in , its last stable release was offered in , before being withdrawn from sale in . ¶ . for more information about patchwork girl, see the “shelley jackson’s patchwork girl” section of the pathfinders project (scalar.usc.edu/works/pathfinders/shelley-jackson). ¶ . for further information about we descend, see “bill bly’s we descend,” part of the pathfinders project (scalar.usc.edu/works/pathfinders/bill-bly). ¶ . collecting many of these works for a exhibition entitled early authors of electronic literature: the eastgate school, voyager artists, and independent productions, dene grigar uses the term school to describe a body of works published by eastgate. thus “the eastgate school” denotes literary hypertexts authored using eastgate’s storyspace software, which assisted in the creation and reception of many early works of electronic literature. although eastgate was not the only early publisher (or the first publisher) of electronic literature, it succeeded in creating an identity for literary hypertext that could facilitate critical discourse for an emerging community. in an interview with jill walker rettberg, mark bernstein, eastgate’s editor and chief engineer, explains, “[t]he fact that there was a publisher that looked like a recognisable sort of organization gave the critics a chance to pitch their stories to their editors, and editors who were inclined to find a technological line, or at least not repulsed by the idea of literary machines, could be convinced, since there was something that looked like a small press. that was important.” in an interview with judy malloy, bernstein explains that the standardization offered by a committed authoring system and literary publisher “gets us beyond the broad generalities and simple-minded media essentialism that still dominates so much discussion of the web.” this collection of works by eastgate establishes an identity for an important aspect of the field, with anchor points that enable thoughtful comparisons and evaluations of work. ¶ . these books include chris funkhouser’s prehistoric digital poetry ( ) and eduardo kac’s media poetry ( ). ¶ . this section provides little more than a frame of reference for those new to the field; readers with a particular interest in the history of electronic literature would be better served by engaging with such projects as pathfinders and, indeed, by contributing their own research to help fill a major gap in the field. ¶ . mezangelle is a language developed by the electronic literature artist mez breeze, who describes it in detail in an interview ( ) that accompanied a presentation of her work in rhizome’s net art anthology. useful resources ¶ cell: consortium on electronic literature cellproject.net ¶ electronic literature collection collection.eliterature.org/ ¶ electronic literature organization eliterature.org/ ¶ electronic literature timeline electronicliterature.org ¶ elmcip electronic literature knowledge base elmcip.net ¶ pathfinders dtc-wsuv.org/wp/pathfinders/ ¶ i ♥ e-poetry iloveepoetry.com/ ¶ zotero bibliography of electronic literature www.zotero.org/groups/electronicliterature works cited aarseth, espen j. cybertext: perspectives on ergodic literature. johns hopkins up, . bernstein, mark. “the history of hypertext literature authoring and beyond.” interview by judy malloy. authoring software: application software of electronic literature and new media, edited by malloy, aug. , narrabase.net/bernstein.html. bolter, jay david, and michael joyce. “hypertext and creative writing.” hypertext ’ : proceedings of the acm conference on hypertext, , pp. – . bolter, jay david, and richard grusin. remediation: understanding new media. mit p, . bouchardon, serge. “towards a tension-based definition of digital literature.” journal of creative writing studies, vol. , no. , . breeze, mez. “mezangelle, an online language for codework and poetry.” interview by aria dean. rhizome, dec. , rhizome.org/editorial/ /dec/ /mezangelle-an-online-language-for-codework-and-poetry/. “contents by keyword.” hayles et al., collection.eliterature.org/ /aux/keywords.html. coover, robert. “literary hypertext: the passing of the golden age.” digital arts and culture conference, georgia tech university, atlanta, . derrida, jacques.
“this strange institution called literature: an interview with jacques derrida.” acts of literature, edited by derek attridge, routledge, , pp. – . di rosario, giovanna. electronic poetry: understanding poetry in the digital environment. . u of jyväskylä, phd dissertation. ensslin, astrid. literary gaming. mit p, . flusser, vilem. “the gesture of writing.” flusser studies, vol. , . funkhouser, chris. “electronic literature circa www (and before).” review of hayles et al., electronic book review, , www.electronicbookreview.com/thread/electropoetics/collected. —. prehistoric digital poetry: an archaeology of forms, – . u of alabama p, . glazier, loss pequeño. digital poetics: the making of e-poetries. u of alabama p, . grigar, dene. “early authors of e-literature, platforms of the past.” mla annual convention, jan. , seattle, wa. hayles, n. katherine. electronic literature: new horizons for the literary. u of notre dame p, . hayles, n. katherine, et al. electronic literature collection, vol. , electronic literature organization, , collection.eliterature.org/ /. “history.” electronic literature organization, eliterature.org/elo-history/. holeton, richard. “custom orthotics changed my life.” kairos, vol. , no. , , http://kairos.technorhetoric.net/ . /disputatio /holeton/index.html. ———. frequently asked questions about ‘hypertext.’ hayles et al., collection.eliterature.org/ /works/holeton__frequently_asked_questions_about_hypertext/index.html. ———. voyeur with dog. counterpath press online, . landow, george p. hypertext: the convergence of contemporary critical theory and technology. johns hopkins up, . malloy, judy. “uncle roger, an online narrabase.” leonardo, vol. , no. , , pp. – . marino, mark c. review of hayles et al., digital humanities quarterly, vol. , no. , . rettberg, jill walker. “electronic literature seen from a distance: the beginnings of a field.” dichtung digital, vol. , , www.dichtung-digital.org/ / /walker-rettberg.htm. rettberg, scott. “editorial process and the idea of genre in electronic literature in the electronic literature collection, volume .” archiving electronic literature and poetry: problems, tendencies, perspectives, vol. , nos. – , , pp. – . wardrip-fruin, noah. “reading digital literature: surface, data, interaction, and expressive processing.” a companion to digital literary studies, edited by susan schreibman and ray siemens, blackwell, , pp. – . “what is e-lit?” electronic literature organization, eliterature.org/what-is-e-lit/ doi: . /lsda. . international journal of environmental research and public health article access to healthcare during covid- alicia núñez ,*, s. d. sreeganga and arkalgud ramaprasad , citation: núñez, a.; sreeganga, s.d.; ramaprasad, a. access to healthcare during covid- . int. j. environ. res. public health , , . https://doi.org/ . /ijerph academic editors: monica wendel and gaberiel jones, jr. received: february accepted: march published: march publisher’s note: mdpi stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. copyright: © by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (https://creativecommons.org/licenses/by/ . /).
department of management control and information systems, school of economics and business, universidad de chile, santiago , chile ramaiah public policy center, bengaluru , india; sreeganga.sd@rppc.ac.in (s.d.s.); arkalgud.ramaprasad@rppc.ac.in (a.r.) information and decision sciences department, university of illinois at chicago, chicago, il , usa * correspondence: anunez@fen.uchile.cl abstract: ensuring access to healthcare is critical to prevent illnesses and deaths from covid- and non-covid- cases in health systems that have deteriorated during the pandemic. this study aims to map the existing literature on healthcare access after the appearance of covid- using an ontological framework. this will help us to formalize, standardize, visualize and assess the barriers to and drivers of access to healthcare, and how to continue working towards a more accessible health system. a total of articles are included and considered for mapping in the framework. the results were also compared to the world health organization guidelines on maintaining essential health services to determine the overlapping and nonoverlapping areas. we showed the benefits of using ontology to promote a systematic approach to address healthcare problems of access during covid- or other pandemics and set public policies. this systematic approach will provide feedback to study the existing guidelines to make them more effective, learn about the existing gaps in research, and the relationship between the two of them. these results set the foundation for the discussion of future public health policies and research in relevant areas where we might pay attention. keywords: health equity; healthcare access; ontology; covid- . introduction the covid- pandemic continues to be a major global public health threat, challenging the provision of healthcare services and their accessibility.
it has even affected those countries with high availability of healthcare facilities, cutting edge technologies, and a reasonable number of healthcare professionals. therefore, regardless of the country or continent, all have had to adapt their systems to prompt access and find the best way to respond to this virus. healthcare access refers to the ease with which individuals can obtain needed healthcare. it is generally defined as the opportunity to use appropriate services in proportion to healthcare needs [ , ]. if services are available, then an opportunity exists to obtain medical care; however, it is also limited by other barriers such as financial, organizational, social, cultural issues, etc. [ ]. in this sense, the level of access influences the use of medical services, and therefore the health status of the population. access was a problem prior to the pandemic. as of today, there is preliminary evidence of racial and socio-economic disparities in the population affected by covid- [ , ] due to the reduction in access to and utilization of healthcare services. as a result, inadequate access to healthcare services has exacerbated the existing social disadvantages, stressing the system even more. many resources and staff are being diverted from their normal activities to test and provide treatment for covid- cases. supplies are limited and people fear accessing healthcare providers [ ]. nowadays, the population is also starting to fear the effects from the covid- vaccine. therefore, it is essential to ensure access to medical care to prevent illnesses and deaths from covid- and non-covid- cases in already weak health systems. the reinforcement of strategies and the establishment of proactive measures to ensure that access to healthcare is not disrupted is important to mitigate the effects and spread of covid- [ ]. reduced access to care, surgeries, and other hospital services, combined with fear of exposure to the virus, have led to a significant drop in access. thus, many diseases that develop symptoms have been treated by using telemedicine. telemedicine has surged as a feasible tool to maintain patient care and reduce the risk of covid- exposure to patients, healthcare workers, and the public [ , ]. there is evidence of patients who have been managed by using telemedicine and expressed satisfaction with the services received, demonstrating that telemedicine helped assessing, diagnosing, triaging, and treating patients with covid- while avoiding a visit to an emergency department or an outpatient clinic. these experiences include patients with transplanted kidney, diabetes, prenatal care, emergency ophthalmological disorders, couple and family therapies, colorectal surgery, cancer, among others [ – ]. these practices emphasize the opportunities that telemedicine offers to maintain an uninterrupted follow-up care for complex patients, today and beyond the pandemic. yet, some barriers have been identified: telemedicine does not fully replace face-to-face interactions, and increased privacy, regulatory and insurance coverage concerns must be addressed by policymakers. additionally, more research is needed to assess its efficacy and quality of care it delivers [ , ]. the health threat caused by this virus also has particular implications for the vulnerable population—i.e., people living with disabilities, migrants, homeless, etc. [ ].
these groups of people probably already live under disadvantageous conditions which have been aggravated by the pandemic, and they do not have access to telemedicine. vulnerable-based proactive strategies need to be developed to cope with their specific needs. additionally, the pandemic has brought serious mental health effects, worsening psychological distress at all ages [ , ]. this is especially the case now, as there has been a significant impact on local economies given the strict measures imposed to contain the spread of the virus. this has resulted in isolation and increased unemployment rates and also affected insurance coverages [ ]. local governments will need to use appropriate data and consider their population characteristics and needs to help combat this virus. therefore, it is imperative at this point to have a global view of the studies carried out since the covid- pandemic started, to assess them and learn how to reduce the associated risks and improve access to healthcare services. the aim of this study is to map the existing literature of healthcare access after the covid- pandemic using an ontological framework [ ] to visualize the barriers to and drivers of access to healthcare and how to continue fighting the pandemic and having a health system accessible to all. ontologies can describe relationships to model high-quality, linked and coherent data to share common understanding among people, and are a good holistic representation to simplify the available literature in the domain. as with any method, ontologies can have disadvantages. the structured natural language of the ontology may be unsuited to some researchers and contexts. it may not capture the full semantic range of a natural language narrative. some of the weak signals in the natural language narrative may be lost in the process of structuring it. however, it is effective in providing a systemic view of a domain and addressing the issues systematically.
the research coverage mapped onto the ontology was also compared with the world health organization (who) guidelines on “maintaining essential health services: operational guidance for the covid- context” [ ] to determine the overlapping and nonoverlapping areas. this analysis will help improve the feedback and learning from the translation of research to practice and of practice to research. . materials and methods . . ontology of access to healthcare during covid- the ontology of access to healthcare during covid- defines its dimensions, elements, and boundaries [ ]. it deconstructs the policy problem’s complexity hierarchically [ ], visualizes it in structured natural english, and encapsulates its combinatorial logic [ ]. it organizes the terminologies, taxonomies, and narratives of the policy problem systemically, systematically, and symmetrically [ – ]. it is a cognitive map of the system [ – ] to: (a) design the policy alternatives, (b) determine effective, ineffective, and innovative policies, and (c) direct the choice through feedback and learning [ , ]. it is a qualitative theory [ ] of the policy problem that can be used to describe the problem, explain its dynamics, predict the outcomes, and control the system through feedback and learning. similar ontologies have been used to conceptualize and analyze learning surveillance systems [ ], mobile health (mhealth) [ ], healthcare systems [ ] and higher education policies [ ]. the development and application of the ontology follows the description of the logic and process by ramaprasad and syn [ ]. for this study, we borrowed and applied the ontology from a previously developed ontological framework of barriers to and facilitators of access to healthcare [ ]. the ontology of access to healthcare during covid- is shown in figure .
it encapsulates the various resources that affect access to healthcare such as spatial, temporal, financial, informational, human, and technological ones. these resources can be barriers, inhibitors, catalysts, or drivers to physical and virtual access to different types of healthcare. these forces could affect preventive care, wellness, episodic illness, chronic illness, rehabilitative, and palliative healthcare for different population segments such as the urban, rural, underprivileged, indigenous, disabled, and the elderly populations. access to healthcare may be provided by varied personnel that include general physicians, specialist physicians, traditional healers, health workers, pharmacists, social workers, care providers, peers, and family.
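the combinatorial logic of the ontology can be illustrated with a short sketch. the code below is not the authors' tooling; it is a minimal python example that assumes the dimensions and elements as transcribed from the ontology figure, and the sentence template in `component_sentence` is an assumed reading of the figure's connectors ([to/of], [access to], [healthcare by], [for]). the names `ONTOLOGY`, `count_components`, `component_sentence`, and `iter_components` are hypothetical.

```python
from itertools import product
from math import prod

# Dimensions and elements as transcribed from the ontology figure
# (an assumption of this sketch; check against the published figure).
ONTOLOGY = {
    "resource": [
        "spatial--distance", "spatial--location",
        "temporal--availability", "temporal--scheduling",
        "financial--income", "financial--expenditure",
        "informational--stimulant", "informational--educational",
        "human--psychological", "human--sociological", "human--cultural",
        "technological--it", "technological--transportation",
        "technological--medical",
    ],
    "force": ["barrier", "inhibitor", "catalyst", "driver"],
    "access": ["physical", "virtual"],
    "healthcare": [
        "preventive", "wellness", "illness--episodic",
        "illness--chronic", "rehabilitative", "palliative",
    ],
    "personnel": [
        "physicians--general", "physicians--specialist",
        "traditional healers", "nurses", "health workers", "pharmacists",
        "social workers", "care providers", "peers", "family",
    ],
    "population": [
        "urban", "rural", "underprivileged",
        "indigenous", "disabled", "elderly",
    ],
}


def count_components(ontology):
    """Number of sentences obtained by picking one element per dimension."""
    return prod(len(elements) for elements in ontology.values())


def component_sentence(resource, force, access, healthcare, personnel, population):
    """Join one element per dimension into a structured natural-language sentence.

    The template is an assumed reading of the figure's connectors.
    """
    return (
        f"{resource} {force} to {access} access to {healthcare} "
        f"healthcare by {personnel} for {population}"
    )


def iter_components(ontology):
    """Lazily enumerate every combination, in the dimension order given above."""
    for combo in product(*ontology.values()):
        yield component_sentence(*combo)


if __name__ == "__main__":
    print(count_components(ONTOLOGY))
    print(next(iter_components(ONTOLOGY)))
```

with the elements as transcribed here, one element per dimension gives 14 × 4 × 2 × 6 × 10 × 6 = 40,320 possible sentences. the generator is lazy so that individual components can be inspected without building the full list.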
figure . ontology of access to healthcare during covid- . it: information technology. the figure lists six dimensions and their elements: resource (spatial: distance, location; temporal: availability, scheduling; financial: income, expenditure; informational: stimulant, educational; human: psychological, sociological, cultural; technological: it, transportation, medical); force (barrier, inhibitor, catalyst, driver); access (physical, virtual); healthcare (preventive, wellness, illness--episodic, illness--chronic, rehabilitative, palliative); personnel (physicians--general, physicians--specialist, traditional healers, nurses, health workers, pharmacists, social workers, care providers, peers, family); population (urban, rural, underprivileged, indigenous, disabled, elderly). the dimensions are joined by the connectors [+], [to/of], [access to], [healthcare by], and [for]. . .
method we visually synthesized the state of research in healthcare access during the covid- pandemic by mapping the research onto the ontology. the mapping was then used to generate the monad map and theme map to visualize the landscape of the domain. the visualization highlights the barriers to and drivers of access to healthcare during covid- . the corpus of research was created from searching scopus on title-abstract-keywords of the articles indexed in the database. we experimented with different search terms. the broad term (healthcare w/ access and covid- ) yielded documents. the narrower term (health w/ care w/ access and covid- ) yielded documents. finally, the search term (healthcare and access and covid- ) was used to retrieve items on september . the items included different document types such as review, note, letter, conference paper, editorial, and other types of documents. we retained only journal articles which represent a high-quality collection of peer-reviewed research on healthcare access during covid- . we further filtered out the selected articles with the word “access” in them. based on the first iteration, the author with domain expertise further filtered articles that were not relevant including protocols for hospital implementation. after this, all the authors agreed and further excluded articles that were not related to healthcare access during covid- . thus, articles were included and considered for coding. figure details the search process and results, following the prisma reporting guidelines [ ]. we then downloaded the title, abstract, and keywords of selected articles and imported them into an excel spreadsheet for mapping. the reference management software zotero (corporation for digital scholarship, vienna, va, usa) was used to store the selected corpus.
figure . search process and results. the corpus of articles was coded into the ontology through an iterative process between the three authors.
the coding of all the articles went through two iterations by each of the three authors to ensure its reliability and validity. further, we also used a glossary of elements to ensure the validity of coding. after the rounds of individual coding, the coders discussed the discrepancies in their coding and arrived at a consensus for the final coding. only the dimensions and elements explicitly articulated in the title, abstract, and keywords were coded. elements that were implicit in the section were not coded. the coding was binary (1 for present, 0 for absent) and was not scaled or weighted. in the analysis, both presence and absence of elements convey equally important information.

results

the results of mapping the corpus onto the ontology are presented through a monad map (figure ) and a theme map (figure ). they are described next.

figure . monad map of research on access to healthcare during covid-19. dimensions and elements shown:
resource: spatial--distance, spatial--location, temporal--availability, temporal--scheduling, financial--income, financial--expenditure, informational--stimulant, informational--educational, human--psychological, human--sociological, human--cultural, technological--it, technological--transportation, technological--medical
force: barrier, inhibitor, catalyst, driver
access: physical, virtual
health care: preventive, wellness, illness--episodic, illness--chronic, rehabilitative, palliative
personnel: physicians--general, physicians--specialist, traditional healers, nurses, health workers, pharmacists, social workers, care providers, peers, family
population: urban, rural, underprivileged, indigenous, disabled, elderly

figure . theme map of the research on access to healthcare during covid-19.

monad map

the monad map in figure numerically and visually summarizes the frequency of occurrence of each dimension and element of the ontology. the number adjacent to each dimension name and element is its frequency of occurrence in the papers on access to healthcare during covid-19 that were reviewed and mapped. the bar below each element is proportional to the frequency relative to the maximum frequency among all elements. since each item can be coded to multiple elements of a dimension, the sum of the frequency of occurrence of elements may exceed the frequency of occurrence of the dimension to which the elements belong.

the dominant focus of the research was on the resources ( ), force ( ), and healthcare ( ) during covid-19. there is substantial focus on the type of access ( ) and the personnel ( ). there is less focus on the population type ( ). the research covers a spectrum of resources for access to healthcare and is heavily focused on temporal availability ( ) and technological it ( ) resources. there is medium emphasis on informational educational ( ), technological medical ( ), and temporal scheduling ( ).
there is some emphasis on spatial distance ( ), spatial location ( ), financial expenditure ( ), human psychology ( ), human sociology ( ), and financial income ( ). the least emphasized resources are informational stimulant ( ), technological transportation ( ), and human cultural ( ).

a significant proportion of articles consider the forces that affect access to healthcare. the most focus is on the barriers ( ) to access; there is lesser emphasis on the catalysts ( ) and drivers ( ). there is little emphasis on inhibitors ( ) to access. specific forces, particularly barriers, received significant attention in the research. there is more attention on barriers than on drivers.

although all the articles are linked to healthcare, only a subset specify the type of care. the dominant focus is illness care, both chronic ( ) and episodic ( ). the next significant emphasis is on wellness ( ) and preventive ( ) care. palliative ( ) and rehabilitative ( ) care are given little attention. specific types of care have been given some attention in the research, whereas relatively less attention has been paid to rising healthcare needs such as palliative care and rehabilitative care.

again, although all the articles are linked to access, only a subset specified the type of access. physical access ( ) has been emphasized the most and a few articles deal with virtual access ( ). specific types of access have not been given enough attention in the research. the focus has largely been on the traditional concept of access rather than on the emergent perception.

among personnel, the majority focus has been on specialist physicians ( ), followed by nurses ( ), general physicians ( ), and health workers ( ). the rest, namely family ( ), social workers ( ), pharmacists ( ), care providers ( ), and peers ( ), received little attention. there is no mention of traditional healers in the research.

research focuses the least on the target population dimension.
among the different segments of the population, it largely focuses on the underprivileged population ( ). the other population segments such as elderly ( ), rural ( ), urban ( ), disabled ( ), and indigenous ( ) populations have not been given much attention in the research.

theme map

the theme map visually summarizes the co-occurrence of elements of the ontology in the population of articles. hierarchical cluster analysis was done using spss (statistical package for social sciences; ibm: chicago, il, usa) with the simple matching coefficient (smc) as the distance measure and the nearest-neighbor aggregation procedure. smc weights the presence (coded “1”) and absence (coded “0”) of elements equally. the detailed rationale for the choice of the clustering method and the presentation of the results are given in syn and ramaprasad [ ] and la paz et al. [ ]. the five themes represent the five equidistant clusters in the dendrogram of the agglomeration [ ]. the colors in figure highlight the elements of the five themes.

the primary theme (in red) is the temporal availability barrier to physical access to chronic illness healthcare. the secondary theme (in brown) is access to episodic illness healthcare by specialist physicians. the tertiary theme (in yellow) is the technological it catalyst/driver of virtual access to preventive/wellness healthcare for the underprivileged. the quaternary theme (in blue) is the spatial (distance/location)/temporal (scheduling)/financial (expenditure)/informational (educational)/human (psychological)/technological (medical) inhibitor to healthcare by general physicians/nurses/health workers.
the quinary theme (no color) is financial (income)/informational (stimulant)/human (sociological/cultural)/technological (transportation) access to rehabilitative and palliative healthcare by traditional healers/pharmacists/social workers/care providers/peers/family for urban/rural/indigenous/disabled/elderly population.

the themes are in order of decreasing dominance in the research: the primary theme is the most emphasized and the quinary theme denotes nonexistence. the focus of the research is skewed to just a few forces, types of access, resources, types of healthcare, and population segments. none of the themes comprehensively covers all the dimensions of the ontology. for example, the primary theme excludes personnel, and the secondary omits resources, force, access, and population. overall, the research corpus coverage is segmented and not systemic.

discussion

the ontology-based analysis of research journal publications on access to healthcare during covid-19 shows the thematic selectivity and segmentation in the research.

research in the primary theme is personnel and population agnostic. the theme shows the research emphasis on the temporal availability barrier to/of physical access to chronic illness. the availability of chronic care has deteriorated due to the diversion of medical specialists, as a “call of duty,” to urgent covid-19 cases [ ]. the pandemic has further affected those seeking care for chronic conditions in areas without well-established telemedicine [ ]. telemedicine helps provide routine care for patients with chronic diseases who are at increased risk of severe illness if exposed to the virus. covid-19 has made facility-based care for chronic conditions a major challenge. chronic conditions such as chronic obstructive pulmonary disease, diabetes, and hypertension have been the most impacted due to the decline in access to care [ ].
during this time, it becomes essential to at least monitor and manage patients with chronic conditions and prioritize outpatient visits based on disease severity [ ].

there is a significant contrast between the research on the primary theme and the who’s guidelines. while the research emphasizes the temporal availability barrier to/of physical access to chronic illness, the who operational guidelines on “maintaining essential health services” [ ] address measures beyond the availability of care for chronic conditions. going beyond the provision of medicines, supplies, and support from front-line workers, they call for action on functional mapping of health facilities for chronic, acute, and long-term care, including those in private (commercial and nonprofit), public, and military systems. they support the research in redesigning management strategies around the limited availability of care providers. while research on the primary theme remains population agnostic, the who guidelines specify chronic care for children and the elderly. the guidelines move beyond teleconsultation and promote actions such as activating dedicated helplines and examining other outreach mechanisms. the importance the guidelines place on educating chronic care patients about accessing telehealth and online services and about self-managing their condition brings out critical elements missing in the research. additionally, now that we are moving to a new stage of immunization, some guidelines have been established to prioritize the population that receives the vaccine. this prioritization depends on the distribution principle for equitable access and fair allocation defined by each country, and may result in those living with chronic conditions being considered in a second stage of inoculation [ ].

the secondary cluster indicates the research emphasis on episodic illness healthcare by specialists. the theme indicates a siloed focus on healthcare and personnel with emphasis only on episodic illness and specialists.
episodic illness and seeking care require the specialists to address the issues promptly to prevent aggravation. addressing episodic illness through specialist care during this pandemic requires revamping of protocols so that there is standardization of outpatient activities with remote triage, protections, diagnostic tests, and precautions that allow provision of care while minimizing risk for both surgeons and patients [ ]. with the diversion of all personnel resources, maximizing the availability of specialists for treatment of episodic illness requires adopting alternative treatment strategies [ ].

with the research emphasis being siloed, the who guidelines give additional direction to make it more systemic by providing a systematic approach. for episodic illness care, the guidelines highlight the need for time-sensitive interventions. they indicate having primary venues to address episodic care with settings that are suited for high-volume care. modification of treatment pathways for specialist services through remote digital platforms during initial assessments is highlighted. the directions from the who further place importance on prioritizing access for acute management of complications by considering repurposed facilities that ensure h acute care. overall, the guidelines cover more elements from the resource, force, and access dimensions, making them more systemic and systematic than the research corpus.

even so, the guidelines raise significant ethical concerns, among which is the allocation of scarce resources. the guidelines consider the prioritization of patients to allocate scarce resources in relation to potential complications and specialist demand. however, given the pandemic, there are many kinds of scarce resources, such as infrastructure, general and specialist physicians, and clinical resources, among others. this prioritization has aroused great resentment and triggered a public debate about the right to access healthcare services [ ].
this was a matter of concern before the covid-19 pandemic but has become more evident today. although everyone has a right to receive care, it is not feasible to overlook the medical conditions and biological characteristics that differentiate one patient from another, which have more relevance today given the scarcity of resources faced by countries. this is all the more so when we add to the equation the additional demand for services from covid-19 patients suffering from persistent medical conditions beyond the acute illness. ensuring health equity is a challenge, and this pandemic has exposed the gaps existing worldwide and stressed public healthcare systems [ ].

the third theme emphasizes technological it catalysts/drivers of virtual access to preventive/wellness healthcare for the underprivileged. it covers important practice insights but neglects a comprehensive focus on different population segments. with the technological penetration in healthcare, information technology (it) today plays a critical role in acting as a catalyst or a driver for providing virtual care. virtual clinics with their technological tools are today readily available for access and are used to deliver care. deploying these mechanisms has provided high satisfaction to the patients, and clinics are adopting this model, especially in resource-limited settings [ ]. it and virtual access to healthcare have extended from prevention and wellness care to other healthcare requirements such as prenatal care and wellness, with tailored telehealth regimens for surveillance and/or counseling [ ]. medical practice has changed in unprecedented ways and there is increased use of telemedicine services in safety and mental health, reproductive life planning, and routine screening for breast cancer [ ]. today, different technological approaches such as participatory digital contact notification are in practice in countries with limited access to healthcare resources and advanced technology [ ].
the who guidelines emphasize digital modalities for various purposes to maintain the essential health services. the guidelines are in line with the current research and practice of using telemedicine solutions as catalysts and drivers, such as clinical consultations conducted via video chat or text message, e-pharmacies, staffed helplines, and mobile clinics with remote connections. they also support the practice of using digital health technologies as a proactive measure for people to manage their own health. the guidelines reflect the importance of prevention and wellness in terms of mental healthcare for populations such as school children and adolescents. they go beyond telecounselling and place importance on following up with school dropouts and instituting support mechanisms. the who guidelines broadly set the priority on wellness in terms of nutrition, monitoring the status of noncommunicable diseases, and mental health. the research falls short of the detailed approach laid down by the who to maintain the wellness of vulnerable populations.

the fourth/quaternary theme is siloed and segmented, with the dominant focus on two dimensions: resources and personnel. there is also selective emphasis on the force. the theme is not comprehensive as it does not cover elements of type of access, healthcare, and population. the pandemic, in general, has brought into focus the utilization of available resources by the personnel in the health system. research under this theme mainly focuses on the change in healthcare modalities that have been shifted to different forms and strategies by healthcare professionals [ ]. further, there is significant emphasis on the shortages in medical equipment and the transfer of all human resources to addressing the pandemic, which has led to revamping and redirection of resources through different triage approaches and prioritization [ ].
the research emphasis in the quaternary theme aligns with the who guidelines in terms of the different resources employed to support timely action by the healthcare professionals. it highlights the repurposing of human, financial, and material resources, and mobilizing additional resources. additionally, it aligns with the research emphasis on expenditure through reprogramming of budgetary resources, while monitoring expenditures to guarantee the effective use of resources and accountability [ ].

the quinary cluster highlights many parts of pathways to healthcare access that have been missed in the research. there is limited research on crucial aspects such as sociological, cultural, and income resources that play a significant role in accessing healthcare during the pandemic. structural and societal factors concerning income, employment, health inequality, and racial bias add to the crisis [ ]. such factors call for a more comprehensive approach to access to care during covid-19, with early testing and sustained, affordable access to healthcare [ ]. further, the resources affecting healthcare access for different population segments are significantly neglected in research. several social, environmental, and health risk factors have affected indigenous populations during this pandemic, and strengthening of the health system with a community-based approach is vital [ ]. among the population segments, the elderly and disabled are likely to require palliative care during this time of the pandemic. unique methods of health service delivery are necessary to ensure that vulnerable populations in underserviced metropolitan areas receive adequate and prompt palliative and rehabilitative care [ ].

some of the above research gaps are also missing from the who guidelines. the cultural resources that would play a role in accessing healthcare during this time are not mentioned.
additionally, the roles of personnel such as traditional healers and social workers in maintaining the essential health services have been missed. while there are some elements that are not in focus in both research and guidelines, the guidelines address more elements present in the quinary theme. the who gives importance to addressing the needs of marginalized populations, such as migrants and refugees, indigenous peoples, sex workers, and the homeless. the guidelines place detailed emphasis on maintaining essential health services and access to care for older people. this ranges from care of their mental health to rehabilitative and palliative care. in providing different types of care to the elderly, the guidelines place importance on the role played by caregivers, peers, and family.

the pandemic has affected the mental health of all population segments, including the personnel involved in healthcare [ , ]. the who guidelines on care for mental health are extensive. they go one step beyond and integrate psychological and sociological factors into providing psychosocial support for different population segments such as addicts, the elderly, and school children.

the who guidelines are recommendations to ensure continuity in the access of essential care; however, each country may adapt them to its reality. ontologies are an underexploited element of effective knowledge organization [ ] that can help in this decision-making process. they can be used to:
• provide a systemic view of the problem for advancing research and developing guidelines.
• systematically analyze the emphases and gaps in research and practice and develop a balanced roadmap for both.
• systematically analyze the gaps between research and practice and develop a strategy for effective translation between the two through feedback and learning.
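the binary coding, the monad-map frequencies, and the simple matching coefficient (smc) clustering that underlie this kind of gap analysis can be sketched in code. the following is a minimal illustration in python, not the authors' spss procedure: the article ids and their codings are hypothetical, only six ontology elements are used, and the function names (element_frequencies, bar_lengths, smc, single_linkage) are invented for this sketch. it counts how often each element is coded, scales the counts by the maximum as the monad-map bars do, and then merges elements by nearest-neighbor (single-linkage) agglomeration on the distance 1 - smc.

```python
from itertools import combinations

# Hypothetical toy corpus (assumption): each article is coded 1 (present)
# or 0 (absent) for every ontology element, as in the binary coding scheme.
ELEMENTS = ["barrier", "driver", "physical", "virtual",
            "illness--chronic", "technological--it"]
CODING = {
    "article_a": [1, 0, 1, 0, 1, 0],
    "article_b": [1, 0, 1, 0, 1, 0],
    "article_c": [0, 1, 0, 1, 0, 1],
    "article_d": [1, 0, 1, 0, 0, 0],
}


def element_frequencies(coding, elements):
    """Count the articles coded 1 for each element (monad-map counts)."""
    return {e: sum(row[i] for row in coding.values())
            for i, e in enumerate(elements)}


def bar_lengths(freqs):
    """Scale each frequency by the maximum frequency among all elements."""
    top = max(freqs.values())
    return {e: (f / top if top else 0.0) for e, f in freqs.items()}


def smc(u, v):
    """Simple matching coefficient: share of positions where u and v agree.

    Matching 1s and matching 0s count equally.
    """
    return sum(a == b for a, b in zip(u, v)) / len(u)


def single_linkage(vectors, k):
    """Nearest-neighbor agglomeration on distance 1 - SMC until k clusters remain."""
    clusters = [[name] for name in vectors]

    def dist(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(1 - smc(vectors[a], vectors[b]) for a in c1 for b in c2)

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# One vector per element: its 0/1 codes across all articles.
element_vectors = {e: [row[i] for row in CODING.values()]
                   for i, e in enumerate(ELEMENTS)}

freqs = element_frequencies(CODING, ELEMENTS)
bars = bar_lengths(freqs)
themes = single_linkage(element_vectors, k=2)
```

on this toy corpus the elements fall into two groups, one around barriers to physical access to chronic illness care and one around it drivers of virtual access. with real data the number of clusters would be read from the dendrogram rather than fixed in advance.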
conclusions

for effective access to healthcare during pandemics such as covid-19, the research and the guidelines must be systematically directed by a systemic framework. further, the research must complement the guidelines and the guidelines must complement the research. significant improvements can be made in the roadmaps for research and guidelines, as shown in the above analysis. the gaps in the research and the potential inclusions of practice guidelines give a picture of the currently selective and segmented approaches to providing access to healthcare during covid-19. while there are pathways unique to the research and to the practice guidelines, there is an overlap as well. a systemic ontology such as the one presented in this paper can promote a systematic approach to address the problems of access to healthcare during covid-19 and similar pandemics. a systematic method for driving the research and guidelines will provide feedback on and help us learn about the gaps in the research, in the guidelines, and between the two. the feedback and learning will reduce the gaps and make both research and guidelines more effective. this study needs to be updated based on the new knowledge that is generated day by day with the development of the pandemic. however, it is a starting point for making informed decisions in public policy.

author contributions: conceptualization, a.n. and a.r.; methodology, a.r.; validation, a.n., a.r. and s.d.s.; formal analysis, a.n., a.r. and s.d.s.; writing (original draft preparation, review and editing), a.n., a.r. and s.d.s. all authors have read and agreed to the published version of the manuscript.

funding: this research received no external funding.

institutional review board statement: not applicable.

informed consent statement: not applicable.

data availability statement: the datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.
conflicts of interest: the authors declare no conflict of interest.

references

daniels, n. equity of access to health care: some conceptual and ethical issues. the milbank memorial fund quarterly. health soc. , , – .
whitehead, m. the concepts and principles of equity and health. int. j. health serv. plan. adm. eval. , , – .
gulliford, m.; figueroa-munoz, j.; morgan, m.; hughes, d.; gibson, b.; beech, r.; hudson, m. what does ‘access to health care’ mean? j. health serv. res. policy , , – .
abedi, v.; olulana, o.; avula, v.; chaudhary, d.; khan, a.; shahjouei, s.; li, j.; zand, r. racial, economic, and health inequality and covid-19 infection in the united states. j. racial ethn. health disparities , – .
azar, k.m.j.; shen, z.; romanelli, r.j.; lockhart, s.h.; smits, k.; robinson, s.; brown, s.; pressman, a.r. disparities in outcomes among covid-19 patients in a large health care system in california. health aff. (proj. hope) , , – .
world health organization. maintaining essential health services: operational guidance for the covid-19 context. world health organ. . available online: https://www.who.int/publications/i/item/who- -ncov-essential-health-services- . (accessed on december ).
okereke, m.; ukor, n.a.; adebisi, y.a.; ogunkola, i.o.; iyagbaye, e.f.; owhor, g.a.; lucero-prisno, d.e., rd. impact of covid-19 on access to healthcare in low- and middle-income countries: current evidence and future recommendations. int. j. health plan. manag. , , – .
abuzeineh, m.; muzaale, a.d.; crews, d.c.; avery, r.k.; brotman, d.j.; brennan, d.c.; segev, d.l.; al ammary, f.
telemedicine in the care of kidney transplant recipients with coronavirus disease : case reports. transplant. proc. , , – .
al-sofiani, m.e.; alyusuf, e.y.; alharthi, s.; alguwaihes, a.m.; al-khalifah, r.; alfadda, a. rapid implementation of a diabetes telemedicine clinic during the coronavirus disease outbreak: our protocol, experience, and satisfaction reports in saudi arabia. j. diabetes sci. technol. , , – .
aziz, a.; zork, n.; aubey, j.j.; baptiste, c.d.; d’alton, m.e.; emeruwa, u.n.; fuchs, k.m.; goffman, d.; gyamfi-bannerman, c.; haythe, j.h.; et al. telehealth for high-risk pregnancies in the setting of the covid-19 pandemic. am. j. perinatol. , , – .
bourdon, h.; jaillant, r.; ballino, a.; el kaim, p.; debillon, l.; bodin, s.; n’kosi, l. teleconsultation in primary ophthalmic emergencies during the covid-19 lockdown in paris: experience with patients in march and april . j. fr. d’ophtalmol. , , – .
burgoyne, n.; cohn, a.s. lessons from the transition to relational teletherapy during covid-19. fam. process. , , – .
lonergan, p.e.; iii, s.l.w.; branagan, l.; gleason, n.; pruthi, r.s.; carroll, p.r.; odisho, a.y. rapid utilization of telehealth in a comprehensive cancer center as a response to covid-19: cross-sectional analysis. j. med. internet res. , , e .
velásquez, j.r.m. teleconsulta en la pandemia por coronavirus: desafíos para la telemedicina pos-covid-19. rev. colomb. de gastroenterol. , (suppl. ), – .
hardcastle, l.; ogbogu, u. virtual care: enhancing access or harming care? healthc. manag. forum , , – .
lau, j.; knudsen, j.; jackson, h.; wallach, a.b.; bouton, m.; natsui, s.; philippou, c.; karim, e.; silvestri, d.m.; avalone, l.; et al. staying connected in the covid-19 pandemic: telehealth at the largest safety-net system in the united states. health aff. (proj. hope) , , – .
aragona, m.; barbato, a.; cavani, a.; costanzo, g.; mirisola, c. negative impacts of covid- lockdown on mental health service access and follow-up adherence for immigrants and individuals in socio-economic difficulties. public health , , – . [crossref] . roy, a.; singh, a.k.; mishra, s.; chinnadurai, a.; mitra, a.; bakshi, o. mental health implications of covid- pandemic and its response in india. int. j. soc. psychiatry . [crossref] . javed, b.; sarwer, a.; soto, e.b.; mashwani, z.u. the coronavirus (covid- ) pandemic’s impact on mental health. int. j. health plan. manag. , , – . [crossref] [pubmed] . choi, s.e.; simon, l.; riedy, c.a.; barrow, j.r. modeling the impact of covid- on dental insurance coverage and utilization. j. dent. res. . [crossref] [pubmed] . núñez, a.; ramaprasad, a.; syn, t.; lopez, h. an ontological analysis of the barriers to and facilitators of access to health care. j. public health , – . [crossref] . ramaprasad, a.; syn, t. ontological meta-analysis and synthesis. commun. assoc. inf. syst. , , – . [crossref] . simon, h.a. the architecture of complexity. in proceedings of the american philosophical society; american philosophical society: philadelphia, pa, usa, ; volume , pp. – . . cimino, j.j. in defense of the desiderata. j. biomed. inform. , , – . [crossref] [pubmed] . chandrasekaran, b.; josephson, j.r.; benjamins, v.r. what are ontologies, and why do we need them? ieee intell. syst. , , – . [crossref] . cameron, j.d.; ramaprasad, a.; syn, t. an ontology of and roadmap for mhealth research. int. j. med. inform. , , – . [crossref] [pubmed] . gruber, t.r. toward principles for the design of ontologies used for knowledge sharing. int. j. hum. comput. stud. , , – . [crossref] . gruber, t.r. ontology. in encyclopedia of database systems; liu, l., Özsu, m.t., eds.; springer: berlin/heidelberg, germany, . . ramaprasad, a. cognitive process as a basis for mis and dss design. manag. sci. , , – . [crossref] . ramaprasad, a.; mitroff, i.i. 
on formulating strategic problems. acad. manag. rev. , , – . [crossref] . ramaprasad, a.; poon, e. a computerized interactive technique for mapping influence diagrams (mind). strateg. manag. j. , , – . [crossref] . ramaprasad, a. revolutionary change and strategic management. behav. sci. , , – . [crossref] . ramaprasad, a. on the definition of feedback. behav. sci. , , – . [crossref] . quine, w.v.o. from a logical point of view (second, revised); harvard university press: cambridge, ma, usa, . . gadicherla, s.; krishnappa, l.; madhuri, b.; mitra, s.g.; ramaprasad, a.; seevan, r.; sreeganga, s.d.; thodika, n.k.; mathew, s.; suresh, v. envisioning a learning surveillance system for tuberculosis. plos one , , e . [crossref] . núñez, a.; ramaprasad, a.; syn, t. national healthcare policies in chile: an ontological meta-analysis. stud. health technol. inform. , , . [crossref] http://doi.org/ . /hpm. http://doi.org/ . /j.transproceed. . . http://doi.org/ . / http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . /s- - http://doi.org/ . /j.jfo. . . http://doi.org/ . /famp. http://doi.org/ . / http://doi.org/ . / . http://doi.org/ . / http://doi.org/ . /hlthaff. . http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . /j.puhe. . . http://doi.org/ . / http://doi.org/ . /hpm. http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . / http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . /s - - - http://doi.org/ . / cais. http://doi.org/ . /j.jbi. . . http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . / . http://doi.org/ . /j.ijmedinf. . . http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . /ijhc. . http://doi.org/ . /mnsc. . . http://doi.org/ . /amr. . http://doi.org/ . /smj. http://doi.org/ . /bs. http://doi.org/ . /bs. http://doi.org/ . /journal.pone. http://doi.org/ . / - - - - - int. j. environ. res. public health , , of . ramaprasad, a.; singai, c.b.; hasan, t.; syn, t.; thirumalai, m. india’s national higher education policy recommendations since independence. j. educ. 
plan. adm. , , – . . liberati, a.; altman, d.g.; tetzlaff, j.; mulrow, c.; gøtzsche, p.c.; ioannidis, j.p.a.; clarke, m.; devereaux, p.j.; kleijnen, j.; moher, d. the prisma statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. bmj , , b . [crossref] [pubmed] . syn, t.; ramaprasad, a. megaprojects—symbolic and sublime: an ontological review. int. j. manag. proj. bus. , , – . [crossref] . la paz, a.; merigó, j.m.; powell, p.; ramaprasad, a.; syn, t. twenty-five years of the information systems journal: a bibliometric and ontological overview. inf. syst. j. , – . [crossref] . mauro, v.; lorenzo, m.; paolo, c.; sergio, h. treat all covid -positive patients, but do not forget those negative with chronic diseases. intern. emerg. med. , , – . [crossref] [pubmed] . chudasama, y.v.; gillies, c.l.; zaccardi, f.; coles, b.; davies, m.j.; seidu, s.; khunti, k. impact of covid- on routine care for chronic diseases: a global survey of views from healthcare professionals. diabetes metab. syndr. clin. res. rev. , , – . [crossref] . who. who sage roadmap for prioritizing uses of covid- vaccines in the context of limited supply. world health orga- nization. . available online: https://www.who.int/docs/default-source/immunization/sage/covid/sage-prioritization- roadmap-covid -vaccines.pdf?status=temp&sfvrsn=bf _ (accessed on october ). . bennardo, f.; antonelli, a.; barone, s.; figliuzzi, m.m.; fortunato, l.; giudice, a. change of outpatient oral surgery during the covid- pandemic: experience of an italian center. int. j. dent. , , – . [crossref] . kumar, s.; chmura, s.; robinson, c.; lin, s.h.; gadgeel, s.m.; donington, j.; feliciano, j.; stinchcombe, t.e.; werner-wasik, m.; edelman, m.j.; et al. alternative multidisciplinary management options for locally advanced nsclc during the coronavirus disease global pandemic. j. thorac. oncol. , , – . [crossref] . mannelli, c. whose life to save? 
scarce resources allocation in the covid- outbreak. j. med. ethics , , – . [crossref] . benjamin, g.c. ensuring health equity during the covid- pandemic: the role of public health infrastructure. rev. panam. de salud pública , , e . [crossref] . cohen, m.a.; powell, a.m.; coleman, j.s.; keller, j.m.; livingston, a.; anderson, j.r. special ambulatory gynecologic considera- tions in the era of coronavirus disease (covid- ) and implications for future practice. am. j. obstet. gynecol. , , – . [crossref] [pubmed] . cheng, w.; hao, c. case-initiated covid- contact tracing using anonymous notifications. jmir mhealth uhealth , , e . [crossref] . jiménez-rodríguez, d.; garcía, a.s.; robles, j.m.; salvador, m.m.r.; ronda, f.j.m.; arrogante, o. increase in video consultations during the covid pandemic: healthcare professionals’ perceptions about their implementation and adequate management. int. j. environ. res. public health , , . [crossref] . andrews, e.e.; ayers, k.b.; brown, k.s.; dunn, d.s.; pilarski, c.r. no body is expendable: medical rationing and disability justice during the covid- pandemic. am. psychol. . [crossref] . barroy, h.; wang, d.; pescetto, c.; kutzin, j. how to budget for covid- response? a rapid scan of budgetary mechanisms in highly affected countries. who. . available online: https://www.who.int/docs/default-source/health-financing/how-to- budget-for-covid- -english.pdf?sfvrsn= a _ (accessed on october ). . braithwaite, r.; warren, r. the african american petri dish. j. health care poor underserved , , – . [crossref] . de león-martínez, l.d.; de la sierra-de la vega, l.; palacios-ramírez, a.; rodriguez-aguilar, m.; flores-ramírez, r. critical review of social, environmental and health risk factors in the mexican indigenous population and their capacity to respond to the covid- . sci. total environ. , , . [crossref] [pubmed] . lakhani, a. 
which melbourne metropolitan areas are vulnerable to covid- based on age, disability, and access to health services? using spatial analysis to identify service gaps and inform delivery. j. pain symptom manag. , , e –e . [crossref] [pubmed] . kontoangelos, k.; economou, m.; papageorgiou, c. mental health effects of covid- pandemia: a review of clinical and psychological traits. psychiatry investig. , , – . [crossref] [pubmed] . ungureanu, b.s.; vladut, c.; bende, f.; sandru, v.; tocia, v.; turcu-stiolica, r.-v.; groza, a.; balan, g.g.; turcu-stiolica, a. impact of the covid- pandemic on health-related quality of life, anxiety, and training among young gastroenterologists in romania. front. psychol. , , . [crossref] [pubmed] . whaley, p.; edwards, s.w.; kraft, a.; nyhan, k.; shapiro, a.; watford, s.; wattam, s.; wolffe, t.; angrish, m. knowledge organization systems for systematic chemical assessments. environ. health perspect. , , . [crossref] [pubmed] http://doi.org/ . /bmj.b http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . /ijmpb- - - http://doi.org/ . /isj. http://doi.org/ . /s - - -z http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . /j.dsx. . . https://www.who.int/docs/default-source/immunization/sage/covid/sage-prioritization-roadmap-covid -vaccines.pdf?status=temp&sfvrsn=bf _ https://www.who.int/docs/default-source/immunization/sage/covid/sage-prioritization-roadmap-covid -vaccines.pdf?status=temp&sfvrsn=bf _ http://doi.org/ . / / http://doi.org/ . /j.jtho. . . http://doi.org/ . /medethics- - http://doi.org/ . /rpsp. . http://doi.org/ . /j.ajog. . . http://www.ncbi.nlm.nih.gov/pubmed/ http://doi.org/ . / http://doi.org/ . /ijerph http://doi.org/ . /amp https://www.who.int/docs/default-source/health-financing/how-to-budget-for-covid- -english.pdf?sfvrsn= a _ https://www.who.int/docs/default-source/health-financing/how-to-budget-for-covid- -english.pdf?sfvrsn= a _ http://doi.org/ . /hpu. . http://doi.org/ . /j.scitotenv. . 
library trends v. , no. winter : http://hdl.handle.net/ /

abstract

portico, a digital preservation archive for the scholarly community and a national digital information infrastructure and preservation program (ndiipp) partner, has successfully extended the ndiipp network to include a diverse and broad set of publishers and libraries through the development of a model that encourages institutions of all sizes to participate in digital preservation. over the past two and a half years of archive operations, portico has learned a number of lessons, most importantly that responsiveness to community needs is key to successful preservation.

defining the risk: assured access to scholarly resources requires new infrastructure

in recent years, academic libraries’ expenditures to purchase or license digital content for their communities have increased dramatically. between and , electronic materials expenditures at the libraries of the association of research libraries (arl) increased over five times more rapidly than total library materials expenditures (lme), and in the – academic year, these libraries spent an average percent of total lme on e-resources. twenty-three arl libraries spent more than percent of their materials budget on electronic resources (fig. ; kyrillidou & young, ). the average percentage of lme that the association of college and research libraries (acrl) institutions devoted to e-resources in the – academic year was only slightly smaller than that of the arl institutions (see fig. ).
expanding the preservation network: lessons from portico
amy j. kirchhoff
library trends, vol. , no. , winter (“the library of congress national digital information infrastructure and preservation program,” edited by patricia cruse and beth sandore), pp. – (c) the board of trustees, university of illinois

these expenditures are driven in part by the dramatic increase in faculty reliance on digital resources over the past decade, which can be seen through responses to various faculty surveys over the past thirteen years:

• a cross-disciplinary survey of faculty concluded that the respondents were beginning to use networked resources but had “lack of trust in” e-journals (budd & connaway, ).
• but a – survey by the electronic publishing initiative at columbia (epic) of faculty and students in the fields of international affairs, environmental science, and political science found that “ % [of respondents] somewhat or strongly agree that they would rather settle for what they can find online, even if it is not quite what they wanted, in order to save making the trip to the library” (epic faculty survey, ).
• a survey by jstor of over four thousand faculty in the social sciences and humanities found that more than percent of the faculty who responded considered electronic databases to be invaluable (guthrie, ).
• a follow-up survey by ithaka of faculty found that over percent of faculty respondents believed that “electronic research resources are invaluable research tools” (guthrie & schonfeld, ).
• a faculty survey by ithaka found that in some disciplines over percent of faculty agreed very strongly with the statement that “i will become increasingly dependent on electronic research resources in the future” (guthrie, ).
students, even more than faculty, are dependent on electronic content, with the majority of – -year-olds more willing to give up television or radio than to give up the internet (zogby international, ). in a paradigm shift from older generations, today’s students are accessing their information over a large variety of electronic devices (simple cell phone, desktop computer, laptop computer, mp /mpg player, handheld game device, pda, or smart phone) with over percent of students owning more than three of these devices (caruso & salaway, ). as joan lippincott of the coalition for networked information notes, students today are producers of digital content, not simply consumers; they interact with multimedia, not simply text; they use computers and electronics as social and participatory activities, not simply individual activities; and this all makes them very visible in the digital world, not invisible (lippincott, ).

figure . arl e-resource expenditures as percentage of lme over time

a serious question is raised by the transition to this new academic world where the scholars of today and tomorrow and the libraries and publishers that support them are highly dependent on electronic resources: how will access to e-resources be assured over the long term? over centuries, libraries developed substantial, institutional, physical infrastructure (real estate, buildings, and shelves) to ensure ongoing access to print resources. but as is stated in the european commission report, study on the economic and technical evolution of the scientific publication markets in europe, “the electronic era has brought a major paradigmatic change in the provision of access to back issues of journals: in the print era, libraries were acquiring print journals and took in charge their preservation so that they remain accessible to their user community in the long term.
in the digital era, libraries and their user community are licensed online access to electronic journals for a determined and limited duration” (dewatripont et al., ). as such, “unless and until it creates digital archiving services, the academy cannot fully shift to electronic-only journal publishing, and cannot fully achieve the system-wide savings and benefits associated with such a shift” (digital library federation, ), a reality noted in the “urgent action needed to preserve scholarly electronic journals” statement issued by academic community leaders in september .

yet even as reliance on e-resources grows, many libraries do not wish to take possession of the digital files comprising electronic publications, even when publishers allow it, as it requires significant technological infrastructure and “there are no practical means in place for [a vast majority of these] libraries to exercise their permanent usage rights” (digital library federation, ). implementing the technological infrastructure to provide long-term access to e-resources locally is financially and technologically possible only at a handful of the world’s largest institutions, as institutional resources and capacity vary significantly. a review of average and median lme across institutions provides one illustration of the wide variance that currently exists, if lme is taken as a proxy measure of capacity. the average lme of the arl institutions in the – academic year was percent more than the average lme of acrl doctoral institutions and percent more than the acrl bachelor institutions (see fig. ).

figure . arl & acrl expenditures as percentage of lme, –

nonetheless, as we saw in figure , institutions of all sizes are spending an ever-growing portion of their lme on e-resources and they, consequently, must protect the investment they have made in e-resources key to fulfillment of their institutional missions.
figure . average & median lme by class in –

responding to the risk: building an approach with the community

the need for the preservation of electronic scholarly content without incurring the burden and expense of creating many local instances of complex and costly technological infrastructure was clearly highlighted in the statement, “urgent action needed to preserve scholarly electronic journals,” endorsed by the association of research libraries, canadian association of research libraries, and many others. the statement observed that

libraries must invest in a qualified archiving solution. a library may itself operate a qualified archive . . . otherwise, research and academic libraries may collaborate in the form of an insurance collective, or mutual assurance society. such an entity may be governed in a variety of ways, but libraries would exercise their preservation obligation, in part, by paying fees to support the archive. in the event of a loss of access to an archived journal through the publisher, only paying participants would be able to have access to lost content through the archive. the collective would institute financial and other measures to ensure that potential participants who might choose initially to withhold support would pay their full fair share should they eventually need access to preserved materials. (digital library federation, )

the library community was also clear in asserting that e-journals were regarded as the content most at risk.

in response to this expressed need in , portico (originally known as the jstor electronic archiving initiative) began to work with the community to build a technological and economic model that could support the development, operation, and maintenance of a third party digital preservation archive. for the first two years, portico staff worked on the development of technologies necessary to meet the project’s objectives.
simultaneously, staff engaged in extensive discussions with publishers and libraries to craft an approach that would balance the needs of both communities while researching what would be necessary to build a sustainable business model for the archive.

what emerged from portico’s analysis and community discussions is a model for the long-term preservation of e-journals built on two keystones that balance the needs of libraries, publishers, and scholars.

access must be limited to well-defined instances

digital content tends to be valuable to content owners for a much longer period than was traditionally true for print because it can be packaged in new ways and as new products. to encourage participation in preservation arrangements by content providers, the archive cannot threaten the content providers’ business needs. yet library needs for assured long-term access must also be addressed. to balance the needs expressed by the community, portico’s model provides a “dark archive” with clearly defined and limited access conditions. e-journal content preserved within portico is made accessible for broad use by faculty, staff, and students only at participating institutions and only in the case of a trigger event: when a publisher ceases operations, ceases to publish a title, no longer offers back issues, or suffers catastrophic and sustained failure of its delivery platform. to address post-cancellation access concerns, publishers may also designate portico as a method of meeting the post-cancellation needs of their library subscribers.

the costs must be shared across the system

portico’s operating costs are covered from diversified funding sources in order to avoid the vulnerability that comes from reliance on any single source of support. the chief beneficiaries of the archive, libraries and publishers, participate in and make an annual contribution to support the preservation service.
for e-journals, publishers’ annual contributions are tiered and vary according to the size of a publisher’s annual journal revenue. libraries’ annual contributions are also tiered and vary according to a library’s total lme. this model allows the costs of digital preservation to be spread across the broad scholarly community, including libraries and publishers of all sizes, with no single institution required to bear all the costs of digital preservation alone. in addition to producing savings for individual institutions, distributing the costs broadly means that the constrained budgets of individual institutions do not threaten the future preservation of and access to the scholarly record.

portico was launched in with support from jstor, ithaka, the andrew w. mellon foundation, and a three million dollar grant from the library of congress’s national digital information infrastructure and preservation program (ndiipp). portico began active preservation of e-journals in early . twenty-nine months later, libraries from countries and publishers participate in portico. the portico archive preserves nearly million articles from over , journals, and another , journals are committed to the archive. portico has the capacity to ingest and preserve an additional to million articles every month.

extending the community’s long-term access protection

while the preservation of e-journals is a complex challenge, the digital preservation needs of the scholarly and library communities extend well beyond e-journals, as does portico’s mission to preserve scholarly literature published in electronic form and ensure that these materials remain accessible to future generations of scholars, researchers, and students. over the past year, portico has begun exploring with the community how it might address other preservation needs and continue to extend the community’s preservation infrastructure and network. a sampling of these activities is described below.
e-books

even as portico built preservation infrastructure and began the work of ingesting and preserving e-journal content, we received queries from publishers and libraries about the preservation of e-books. as the portico e-journal preservation process matured and the queries from our community of publishers and libraries increased, we leveraged our earlier experiences designing a preservation service for e-journals to develop preservation for e-books. in late , portico undertook an e-book preservation study that included conversations about content formats and preservation needs with six publishers, three e-book aggregators, eleven libraries, and one library consortium. in addition, portico made a technical assessment of e-book data provided by the publisher survey participants. we found that publishers are now actively seeking preservation arrangements for their growing e-book collections, and they hoped that portico would provide a way to meet this need. from the sampled e-book data it was clear that the e-journal preservation infrastructure could readily be extended to receive e-books. in addition we learned that libraries desire e-book preservation, even as they strive to establish collection development policies for this still young genre.

as a result of these discussions, portico has extended to e-books the model developed for e-journals, including trigger event driven access, which limits access to archived content to well-defined instances and audiences. as with e-journals, costs are shared by libraries and publishers across the system. as of august , elsevier has signed an e-book agreement with portico, committing more than , e-books to the archive, and discussions are under way with several other publishers.
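
the trigger event driven access rule described for e-journals, and extended here to e-books, can be pictured as a small decision function. the sketch below is illustrative only: the class names, field names, and the way post-cancellation access is recorded are assumptions made for this example, not a description of portico’s actual systems.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class TriggerEvent(Enum):
    """The four trigger events named in the text."""
    PUBLISHER_CEASED_OPERATIONS = auto()
    TITLE_CEASED_PUBLICATION = auto()
    BACK_ISSUES_NO_LONGER_OFFERED = auto()
    CATASTROPHIC_PLATFORM_FAILURE = auto()


@dataclass
class ArchivedTitle:
    # Hypothetical record for one preserved journal or e-book.
    name: str
    trigger_events: set = field(default_factory=set)
    # True if the publisher designated the archive for post-cancellation access.
    post_cancellation_designated: bool = False


@dataclass
class Institution:
    # Hypothetical record for a library.
    name: str
    participating: bool
    cancelled_titles: set = field(default_factory=set)


def may_access(institution: Institution, title: ArchivedTitle) -> bool:
    """Return True only in the well-defined instances the model allows."""
    # The archive is dark to non-participants in every case.
    if not institution.participating:
        return False
    # Any declared trigger event opens the title to all participants.
    if title.trigger_events:
        return True
    # Otherwise access exists only as designated post-cancellation access.
    return (title.post_cancellation_designated
            and title.name in institution.cancelled_titles)
```

the ordering of the checks reflects the text: participation is a precondition, a trigger event opens content broadly, and post-cancellation access is a narrower, publisher-designated case.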
digitized collections

as our discussions regarding e-book preservation progressed we found that libraries, publishers, and aggregators also had significant concerns about preservation of large digitized collections such as historical newspapers or early texts. these collections present specific and deep collections of historical content, and individual digitized collections can often exceed one terabyte in size. in our discussions to date, portico has received suggestions that the e-journal and e-book model would also be appropriate for this content, and our discussions with publishers and aggregators are moving ahead as of this writing.

locally created content

as portico has engaged with librarians about digital preservation and how best to meet this challenge, librarians have regularly expressed concern about how best to preserve locally created or digitized content. preservation of locally created content via an external party will likely require an approach that differs from that taken with e-journals, e-books, and digitized collections. for example, “trigger events” may not be a relevant concept and different cost sharing models may be required. to investigate what technologies and models are most appropriate, portico is working with fifteen libraries to explore the preservation needs and potential models to support preservation of locally created or digitized content. this exploration is expected to conclude in mid- , and portico will share its findings on this project with the community as it progresses.

support for community preservation tools

to meet the need of guaranteeing long-term access to scholarly digital content through digital preservation, portico relies upon a variety of policies and tools. wherever possible, portico engages with the community on standards and tools development to secure the advantages that knowledge sharing and collaborative tool development offer.
portico has participated in projects ranging from the premis (preservation metadata: implementation strategies) working group ( ), the national library of medicine journal archiving and interchange dtd (national center for biotechnology information, ), and jhove (jstor/harvard object validation environment; http://hul.harvard.edu/jhove/) development, and each of these projects has informed portico’s approach.

a key policy at portico is that all content should be preserved within a single, generic content model that has sufficient metadata to manage the long-term preservation of digital, scholarly content. the portico metadata has been heavily influenced by premis, and portico’s chief technology officer, evan owens, worked with the premis working group to develop the data dictionary for preservation metadata. portico is currently revising its content model and metadata gathering requirements, and one of the goals of this process is to assess our working experience with each of the premis data elements. as our analysis is concluded, we will share lessons learned with the premis community and gather input on any adjustments that may be useful to enhance premis.

portico also participated in the original development of jhove and, with the california digital library and stanford university library, is now engaged in the ndiipp-supported jhove project to further develop this tool. jhove is a tool that can be used to identify the format of a file, to determine whether the file is valid to its format specification, and to characterize the file in order to determine its format-specific significant properties. every file portico preserves in the archive is processed by jhove, and jhove has been widely adopted by other preservation entities for similar purposes.

lessons learned

at portico, as with many projects, as we have gained experience we have made adjustments and drawn conclusions about lessons learned.
our hope, as we continue our digital preservation work, is to continue to learn and to share helpful findings with the community. from our experience to date, the lessons described below have been important in shaping our ongoing work and may offer value to other members of the preservation network.

models must be responsive to community needs

the initial model portico explored with publishers and libraries proposed a light archive where content was made available to participants after an extended predetermined time period. upon discussion with the community, however, it became clear “that preservation of electronic journals is a kind of insurance, and is not in and of itself a form of access. preservation is a way of managing risk: first, against the permanent loss of electronic journals and, second, against having journal access disrupted for a protracted period following a publisher failure” (digital library federation, ). based on our discussions, portico revised the initial proposed model to arrive at the current trigger event driven approach. this adjustment has yielded a model that more closely targets libraries’ most pressing needs for long-term access without threatening publishers’ revenue models or creating unacceptable barriers to entry for content providers.

while e-books and digitized collections also appear to fit well into a trigger event oriented model with the broad community sharing the preservation costs, a different model may be required for the preservation of locally created content. as we continue our initial discussions with libraries about local preservation needs, building from our model development experience, we expect to be open to revisions and adjustments to the model as the community’s needs become clearer. as in the start-up of any new endeavor, there must be willingness and ability to take risks, try new ideas, and make adjustments.
identify policies through practice

a preservation service with a long time horizon must be able to modify its processes and procedures over time. portico did not start production in early with a formal set of preservation policies; rather we entered production with a set of guiding principles, including:

• the integrity of the scholarly record must be preserved.
• source files reliably capture the intellectual content of electronic scholarly journals.
• preservation can be achieved through migration.
• reliance upon accepted standards enhances archival reliability.

these guiding principles have been enacted in numerous ways and have enabled us to develop more specific policies. for example, portico maintains the original publisher-supplied files in the archive, in addition to all migrated copies. whenever we determine that the publisher may have erroneously supplied extraneous files that should not be maintained in the archive, a review and decision-making process is invoked to determine the proper course of action (retention or rejection of the files). we are now codifying the rules and processes we have developed over more than two years of experience into formal preservation policies and procedures that can be more readily shared with the community.

infrastructure and scale can be extended

as portico began its work, the community clearly identified preservation of e-journals as the most pressing priority. from a technical perspective, e-journals were a particularly challenging beginning point due to the extensive diversity and complexity of data structures in use over time and across the publishing community.
however, because portico’s initial, generic content model and infrastructure were developed specifically to support this diverse and challenging content, it is now possible with much less effort to extend this work to new content types such as e-books and digitized collections, and possibly to locally created content. the lesson learned is that sometimes it is best to begin with the complex case. although the costs to initially develop preservation infrastructure were significant, this investment can now pay ongoing dividends as the generic content model is extended to the preservation of additional content types.

impact of scale

in order to process content at scale, it is impractical and cost prohibitive to make decisions on an article-by-article basis. instead, the supplied content must be analyzed in automated ways and tools developed to handle the majority of cases, noting exceptions only as needed. for example, when content includes extraneous files that can neither be clearly determined to be erroneously supplied nor associated with a specific article, portico’s system collects these and preserves them as a single content unit. while this conservative approach may result in unusual files being retained (u.s. postal service forms, for instance), it also helps to ensure that content is not inadvertently lost.

preservation work is constant

although much yet remains to be learned about preservation costs and their distribution over time, at least one model, the life project, proposes that ongoing preservation costs will include low, steady ongoing technology watch costs with occasional peaks of expenses to implement preservation actions (see fig. ). our preservation work at portico thus far would indicate that preservation actions will be less intermittent and more steady than proposed. digital preservation will require ongoing, active management.
the archive requires steady maintenance to keep it secure, including regular processes to check the fixity of files to determine if content has been corrupted and is in need of repair, to ensure successful replications, and to prepare for audits. in addition, there is a need for regular projects to maintain the archive. portico’s current review of our content model to make it even more generic to better manage a diverse set of content (e-books, e-journals, digitized collections, etc.) is another example of the kind of ongoing maintenance that an actively managed preservation archive requires. implementing this new content model will require that later this year we re-create the metadata files for every item currently preserved in the archive. we are learning that ongoing archive management actions are required in addition to the intermittent peaks of more intense preservation actions envisioned by life. we anticipate that intermittent activities will be less extensive because of ongoing management activities, but more experience is needed to test this assumption.

figure . life project cost estimates for preservation activity over time (mcleod, wheatley, & ayris, )

selecting content is challenging

there is an ever growing number of journals being published electronically today and with this proliferation comes the need to establish preservation priorities. these priorities are best established with input from a wide range of parties with a vested interest: libraries, publishers, and scholars. gathering this input may require new forums not yet formed or new uses of existing forums to ensure that preservation priorities are widely communicated and well understood.

a broad network can include diverse participants

the library of congress ndiipp program “has over partners who share knowledge and experience … [and] is reliant on individuals and organizations willing to embark on cutting-edge programs” (ndiipp partners, n.d.). portico has learned that it is possible to design a preservation service that can expand the network of entities supporting digital preservation beyond those who have the technological and financial ability to participate in digital preservation in a hands-on manner.

as shown in figure , even very small academic institutions are spending over percent of their lme on e-resources, and the portico model, which distributes the costs of the archive broadly, allows even small institutions to participate in, and benefit from, the preservation network that ndiipp has helped to establish. libraries participating in portico range from large u.s. university systems to the university of chittagong in bangladesh. similarly, the portico model encourages participation from scholarly publishers from across the spectrum. in building this broad participant base portico has extended the ndiipp preservation network to a diverse set of more than libraries and nearly publishers. through formal agreements these contributors to the network are positioned to remain engaged well past the duration of the ndiipp grant program.

conclusion

through its collaborations with the community portico has demonstrated that a model can be developed and operationalized that enables community supported preservation that begins to address the needs of the academic community for reliable preservation infrastructure. with nearly eight million articles preserved, portico now serves as one node within the network of preservation entities necessary to ensure that digital scholarship available today will remain so for future generations. as the community continues to develop new forms of e-scholarship, new digital preservation challenges will continue to emerge, and portico looks forward to working with the broader preservation network to address these as they arise.

notes
these percentages are from the annual arl statistics - research trends sections (association of research libraries, n.d.).

these percentages were computed by portico from the acrl statistics dataset that underlies the acrl academic library trends & statistics print volumes (american library association, ).

the acrl averages are drawn from data available in the acrl statistical summaries (american library association, ). the arl averages come from an analysis of the arl statistics dataset for – (association of research libraries, – ).

the contributors to the initiative were drawn from a broad range of the scholarly publishing community and included formal participation from ten publishers: the american economic association, the american mathematical society, the american political science association, the association for computing machinery, blackwell, the ecological society of america, the national academy of sciences, the royal society, the university of chicago press, and john wiley & sons.

the libraries participating in this exploration include american university, baylor university, binghamton university, brigham young university, california state polytechnic university, pomona, city university of new york, colorado state university, mcmaster university, middlebury college, northwestern university, trinity college dublin, university of british columbia, university of notre dame, university of queensland, and vassar college (see http://www.portico.org/news/preservation.html retrieved on august , ).

per e-mail communications of the jhove working group, jhove is in use at: deutsche nationalbibliothek (german national library), ex libris, fedora, florida center for library automation, the global digital format registry project, koninklijke bibliotheek (national library of the netherlands), dspace, u.s. national archives and records administration, the national library of australia, the national library of new zealand, and the u.s. library of congress.

references

american library association. ( ). acrl academic library trends & statistics. retrieved january , , from http://acrl.telusys.net/trendstat/ /
american library association. ( ). acrl statistical summaries. retrieved january , , from http://www.ala.org/ala/mgrps/divs/acrl/publications/trends/ /index.cfm
association of research libraries. (n.d.). statistics and measurement. retrieved august , , from http://www.arl.org/stats/annualsurveys/arlstats
association of research libraries. ( – ). arl statistics dataset for – . retrieved january , , from http://www.arl.org/bm~doc/arlstats .pdf
budd, j. m., & connaway, l. s. ( ). university faculty and networked information: results of a survey. journal of the american society for information science, ( ), – . retrieved june , , from http://www.interscience.wiley.com/cgi-bin/abstract/ /abstract
caruso, j. b., & salaway, g. ( ). the ecar study of undergraduate students and information technology, : key findings: educause. retrieved august , , from http://connect.educause.edu/library/ecar/theecarstudyofundergradua/
dewatripont, m., ginsburgh, v., legros, p., walckiers, a., devroey, j.-p., dujardin, m., vandooren, f., dubois, p., foncel, j., ivaldi, m., heusse, m.-d. ( ). study on the economic and technical evolution of the scientific publication markets in europe: european commission. retrieved may , , from http://europa.eu.int/comm/research/science-society/pdf/scientific-publication-study_en.pdf
digital library federation. ( ). urgent action needed to preserve scholarly electronic journals. ( ). retrieved may , , from http://www.diglib.org/pubs/waters .htm
epic faculty survey. ( ). retrieved may , , from http://www.epic.columbia.edu/eval/facsurv. .ppt
guthrie, k. ( , june ). what do faculty think of electronic resources?
(paper presented at the jstor ala annual conference participants’ meeting) retrieved may , , from http://www.jstor.org/about/faculty.survey.ppt
guthrie, k. ( , january ). who’s in charge: reflections on faculty and librarian surveys concerning changes in scholarly communication. (paper presented at the uc berkeley new directions, berkeley, ca) retrieved may , , from http://www.ithaka.org/research/uc% berkeley% final% - - .ppt
guthrie, k., & schonfeld, r. ( , april ). what do faculty think of electronic resources? findings from the academic research resources study. (paper presented at the cni task force meeting, alexandria, virginia) retrieved may , , from http://www.cni.org/tfms/ a.spring/presentations/cni_guthrie_what.ppt
kyrillidou, m., & young, m. ( ). arl statistics – : a compilation of statistics from the one hundred and twenty-three members of the association of research libraries. washington, dc: association of research libraries. retrieved may , , from http://www.arl.org/stats/annualsurveys/arlstats/arlstats .shtml
lippincott, j. k. ( ). web . for learning discovery: net gen students, net gen scientist. (paper presented at the iatul conference, auckland, new zealand) retrieved august , , from http://www.iatul.org/doclibrary/public/conf_proceedings/ /joanlippincott .pdf
mcleod, r., wheatley, p., & ayris, p. ( ). lifecycle information for e-literature: a summary from the life project. (paper presented at the life conference, apr ) retrieved may , , from http://eprints.ucl.ac.uk/ /
national center for biotechnology information. ( ). nlm journal archiving and interchange tag suite. retrieved june , , from http://dtd.nlm.nih.gov/
ndiipp partners. (n.d.). retrieved august , , from http://www.digitalpreservation.gov/partners/index.html
premis (preservation metadata: implementation strategies) working group. ( ). retrieved june , , from http://www.oclc.org/research/projects/pmwg/
zogby international.
( , january ) what is privacy? poll exposes generational divide on expectations of privacy, according to zogby/congressional internet caucus advisory committee survey. zogby international news. retrieved august , , from http://www.zogby.com/news/readnews.dbm?id=

amy kirchhoff is the archive service product manager at portico, where she supports creation and execution of archival policy and oversees operation and development of the portico web site. prior to her work at portico, ms. kirchhoff held positions at jstor and ithaka. she has a ba in russian from the university of rochester and a masters of arts in library science from the university of arizona.

city, university of london institutional repository

citation: robinson, l. & bawden, d. ( ). 'the story of data': a socio-technical approach to education for the data librarian role in the citylis library school at city, university of london. library management, doi: . /lm- - -

this is the accepted version of the paper. this version of the publication may differ from the final published version.

permanent repository link: http://openaccess.city.ac.uk/ /
link to published version: http://dx.doi.org/ . /lm- - -

copyright and reuse: city research online aims to make research outputs of city, university of london available to a wider audience. copyright and moral rights remain with the author(s) and/or copyright holders. urls from city research online may be freely distributed and linked to.

city research online: http://openaccess.city.ac.uk/ publications@city.ac.uk

accepted for publication in library management

'the story of data': a socio-technical approach to education for the data librarian role in the citylis library school at city, university of london

lyn robinson and david bawden

accepted for publication in library management, april doi .
/lm- - -

abstract

purpose
this paper describes a new approach to education for library/information students in data literacy - the principles and practice of data collection, manipulation and management - as a part of the masters programme in library and information science (citylis) at city, university of london.

design/methodology/approach
the course takes a socio-technical approach, integrating, and giving equal importance to, technical and social/ethical aspects. topics covered include: the relation between data, information and documents; representation of digital data; network technologies; information architecture; metadata; data structuring; search engines, databases and specialised retrieval tools; text and data mining, web scraping; data cleaning, manipulation, analysis and visualization; coding; data metrics and analytics; artificial intelligence; data management and data curation; data literacy and data ethics; and constructing data narratives.

findings
the course, which was well-received by students in its first iteration, gives a basic grounding in data literacy, to be extended by further study, professional practice, and lifelong learning.

originality/value
this is one of the first accounts of an introductory course to equip all new entrants to the library/information professions with the understanding and skills to take on roles in data librarianship and data management.

introduction

a role of considerable and increasing importance for librarians, and other information professionals, is the handling of data resources, both on behalf of their users and for their own purposes. this role, or perhaps it is better to say spectrum of roles, parallels that in the more traditional world of text and image resources.
in supporting users, this ranges from a concern with the overall institutional, or even wider, policies for the management and curation of datasets of all kinds, to assisting an individual user with the detail of small-scale data handling and analysis. it also includes the collection, analysis, management, and use of data relating to library operations, and their use as metrics for service evaluation and improvement; an extension of the well-established 'library statistics'.

the recent great expansion of the amount of available data, and of public and institutional awareness of the importance of data, lends an urgency to the need for library/information specialists to be fully aware of the new 'data dimension' to their work, and this certainly amounts to a new role for librarians, in line with the theme of this special issue. as ekstrøm et al. ( ) write:

"imagine a librarian armed with the digital tools to automate literature reviews for any discipline, by reducing thousands of articles' ideas into memes and then applying network analysis to visualise trends in emerging lines of research. what if your research librarian could then dig deeper and use [a digital tool] to map in which sections of articles your key research terms appear? imagine the results confirmed that your favourite research term almost never appears in the results sections, but cluster only around introductions and perspectives? and what if the librarian did not stop there, but zoomed into the cloud of data with savvy statistics, applying the latest text and data mining techniques to satisfy even the most scrutinising scientific mind, before formulating an innovative research question?"

not all librarians, even in academic and research settings, will become data specialists to this extent, although many certainly will. but all library and information professionals, in all sectors, will need to gain at least a basic appreciation of the issues around data, both technical and socio-ethical.
this role certainly exists now, but will become of greater significance and ubiquity in future years. as kirkwood ( , p. ) puts it:

"data are nothing without analysis, and many librarians currently lack the data fluency to work confidently in a world of dynamic content creation ... librarians need both to re-skill and to change their self-identification and the philosophy that underlies it, if they are to achieve confident data fluency."

this need for many, if not all, librarians to become more confident in dealing with data, a role which only a few years ago would have been relevant to very few within the profession, is a vital one. the issue is not merely one of technical competence, important though that is, but of a confident appreciation of all the issues surrounding the good use of data, including the legal and ethical; much as librarians have traditionally had a confident appreciation of text-based publications.

if librarians - in general, and beyond a few specialists and enthusiasts - are to be effective in this new role, professional education will have to adapt accordingly; see, for example, the surveys of data-focused provision in courses in the us (tang and sae-lim ) and in china (si, zhuang, xing and guo ). in the us, courses focusing on aspects of data science, data handling and data management are offered within most educational programmes for library/information specialists, particularly, though not exclusively, in the ischools.

one response is to provide programmes which specifically prepare students for the new data-centric roles, such as data librarian, data steward, data curator, research data manager and data archivist. such programmes necessarily focus strongly on the development of technical and managerial skills of data handling, and are intended for students aiming at a clearly data-focused career within the library/information sector.
examples of these are the programmes offered by the ischools at the university of pittsburgh (lyon, mattern, acker and langmead ), and at the university of sheffield (university of sheffield ).

another response, which is the rationale for the course described in this article, is to adapt curricula to ensure that all new entrants to the library profession are given at least a basic foundational understanding of both the technology of data handling and management, and its social and ethical implications. the two aspects are of equal importance, and cannot sensibly be separated. without a detailed and practical appreciation of the technical issues, consideration of social and ethical matters will necessarily be ungrounded and general; while without a socio-ethical appreciation it will be difficult for students to understand how technical skills should best be applied. for library/information professionals dealing with data in any respect, while technical competence is a necessity, it must be framed within an understanding of the social and ethical - and indeed the wider cultural and political - environment.

this paper describes an initiative, following the latter approach, within the library/information science masters programme at city, university of london (citylis). this involves the repositioning of an introductory information technology (it) course within the programme as a course dealing with data in all its aspects of relevance to the library/information professions, and from a socio-technical and ethical perspective.

the data challenge for librarians

of the many changes and challenges impacting on the work of the library and information professional, the 'data deluge' is certainly among the most significant. the greatly increased amount and diversity of data available is one of the most important changes in the information landscape.
this applies both to the very large and heterogeneous datasets which tend to be termed 'big data', and to the smaller, but no less important, bodies of data collected for specific purposes (sugimoto, ekbia and mattioli ; borgman ). the significance of data in the library/information context is two-fold.

first, information professionals may need to become involved in data support, research data management, data curation, data governance, data quality evaluation, data citation, data literacy training, and similar activities, as a part, or all, of their professional remit (koltay , ; rice and southall ). this may involve, at its most formal: assisting with, or managing, research data management policies and plans (briney ); developing and managing data repositories; overseeing a data curation programme (nielsen and hjørland ; oliver and harvey ); designing training programmes for data literacy and associated skills, including basic coding, in environments including university, school and public libraries (macmillan ; carlson, nelson, johnson and koshoffer ; crystle ); or dealing with data within an overall framework of digital scholarship (borgman ; mackenzie and martin ). or it may, in a less formal way, involve giving advice to individual users on how best to deal with their data, in the way that librarians have always advised on dealing with bibliographic references. becoming, in part or in whole, a data librarian, in rice and southall's terminology, is simply a new extension of the information provision/information management function, albeit that it may be a new role description.

second, it is important for information professionals, even if they have no special role in helping their users deal with data, to be able to handle data of all kinds confidently for their own purposes; to use data analytics to improve their library services, for example (farmer and safer ; kirkwood ; showers ).
when these two developments are considered together, it is clear that new entrants to the information professions must be equipped to deal as confidently with data, in its variety of forms, as they have traditionally dealt with text information. achieving such data confidence means having a conceptual understanding of data, and the issues around it, plus the technical capabilities of 'data scraping' and 'data wrangling': the abilities to find, extract, collect, clean, organise, analyse, and present data.

furthermore, there are two inter-related aspects to the kind of data fluency that the new environment demands of information professionals: the technical, and the social and ethical. there is little point in a librarian being able to code, to scrape data from websites, to clean and analyse datasets, and to produce metrics on demand, if they are unfamiliar with the legal requirements of, and ethical considerations implicit in, what they are doing. but equally, there is little point in such a person being able to fluently debate the social and ethical niceties, if they are unable to get the data they need, in the form they need it in, and to draw from it the meaningful information that is of use. the two go hand in hand, and the understanding of data that the library/information professional must possess must be a socio-technical understanding, enabling them to deal with data with technical competence and with ethical confidence. there is, of course, also a legal dimension to the proper use of data; this is mentioned where necessary in the course described here, but a full treatment of legal issues comes in courses elsewhere in the city programme, dealing with information law.

the importance of these issues has been emphasised repeatedly, as may be shown by the following examples.
the sheer volume of data to be dealt with is illustrated by the general acceptance that we have entered the 'zettabyte era', in which annual data traffic on global networks exceeds the zettabyte level (cisco , floridi ). in response to this, the uk government has explicitly recognised the importance of data literacy as a way of helping non-data specialists make the most of data science (parkes ), while the us national information standards organization (niso) is planning training webinars for putting data literacy on a par with digital literacy (niso ).

in the library sector, a bibliography on research data curation noted items published between and (bailey ). 'dealing with data' was named as one of ' technical skills that information professionals should learn', according to an entry on the cilip (chartered institute of library and information professionals) blog in march (pennington ). this emphasised the need to deal with four distinct types of data: structured (e.g. spreadsheets and relational databases); semi-structured (e.g. files of metadata records); unstructured (without any table or field structure and encompassing big data); and linked data. similarly, 'using social media analytics' was named as one of the 'top five library technology topics' by the techsoup for libraries' blog in december (gilbert-knight ).

training for librarians has begun to develop to match these perceived needs.
to give three examples: the library of north carolina state university hosts a week-long 'data science and visualization institute for librarians' (north carolina state university ); the library of congress held a conference on 'collections as data' in october , with the two main themes that digital collections are composed of data that can be acquired, processed and displayed in many ways, and that we should always remember that data is derived from, and manipulated by, people (ashenfelder ); and the american library association and google, through their libraries ready to code project, are seeking to equip librarians to teach coding and data handling in public and school libraries (american library association ). these kinds of developments support the need for all librarians to have a solid socio-technical grounding in data issues.

it teaching at citylis

an introductory information technology course has been offered as a compulsory part of the library/information programmes at city since masters level teaching in the subject was established in its current structure in the late s (robinson and bawden ). this course has always been seen as an introduction to basic concepts, and a preparation for more specialist courses. [note that in this paper we use the term 'programme' for the whole masters scheme of study, and 'course' for this specific part.]

this course was initially called 'computers and communications technology', and the very broad syllabus was:

information systems and technology. an introduction to computers, hardware, software, operating systems, programming languages, software packages, databases, word processing, spreadsheets. terminology and basic concepts of telecommunications. telecommunications-based systems, including telex, fax, electronic mail, teleconferencing, videotex, electronic journals, document delivery systems, office automation.
hard copy techniques, including copying, duplicating, printing, graphic design and composition, desktop publishing. microforms and their applications. introduction to systems analysis.

in , the masters programme was restructured on a modular basis, and the course renamed 'information technology', with a greater digital emphasis. by - , the course was named 'data representation and management', and by then focused entirely on digital systems. its emphasis was on software systems for handling various kinds of information: text handling and word processing systems, spreadsheets, web authoring, databases, etc.

in , the course was renamed 'data and information technology and architecture' and shortly afterwards 'digital information technologies and architecture'; de-emphasising data handling and taking a wider perspective. its aim was to "provide the technical background required to store, structure, manage and share information effectively". it still included material on specific kinds of software, but was increasingly focused on web-based systems, search engines, blogs and wikis, semantic web, information retrieval, etc., and on information architecture, and issues such as open access and repositories.

in academic year - , this course was given a major overhaul. it was realised that the introductory material on software use was no longer necessary, while the detailed material on web-based systems, retrieval and information architecture was better left to later specialist and elective courses. eliminating this material allowed for a new focus on the handling of data in all its aspects, as the essential background preparation for the new data roles mentioned above; a return to the data focus of earlier years, but with a very different treatment appropriate to the new environment.
it was also felt essential to introduce a strong flavour of ethics, and social implications, hitherto missing in what was very much a technical course. the revised course, with its strongly socio-technical perspective, was renamed 'digital information technologies and applications', to indicate that information architecture was no longer such a central point. it took the strapline 'the story of data', to match another part of the programme called 'the story of documents'.

the story of data

the stated aim of the restructured course is to "provide the technical and philosophical background required to collect, store, describe, structure, manage and share information effectively in the digital society", by engaging with the deluge of digital data, and distilling information from it. the theme "finding the i in data" is emphasised, with a double meaning: finding meaningful information (i) in data, and also considering how data represents or misrepresents us as individuals (i). there is also a strong focus on implications for library/information applications and issues, to ensure that the course does not become a generic 'data science lite'.

in drawing up the syllabus, we were particularly influenced by north carolina's 'data science and visualization institute for librarians' mentioned above, and by modules in the oxford internet institute's masters programme in 'social science of the internet' (oxford internet institute ). we drew from these programmes ideas for both the balance of technical and conceptual material, and the balance of practical activities with consideration of conceptual and managerial aspects, as well as the general 'flow' of the course. more specifically, they influenced our decisions to use the python language to illustrate the value of coding, and to use examples of scraping data from the web whenever possible.
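
as an illustration of the kind of small scraping exercise that python makes possible, the following is a minimal sketch only, and is not taken from the course materials. it assumes a hypothetical html fragment (`SAMPLE_HTML`) holding a table of loan counts per library branch, and the names `TableScraper` and `scrape_table` are invented here. it uses only the `html.parser` module from the python standard library and does not fetch anything from the network.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, invented for illustration; a real scraper would
# first download the page and check that the site permits this use.
SAMPLE_HTML = """
<table id="loans">
  <tr><th>branch</th><th>loans</th></tr>
  <tr><td>central</td><td>1200</td></tr>
  <tr><td>north</td><td>450</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows, each a list of cell strings
        self._row = None        # row currently being read, or None
        self._in_cell = False   # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
total_loans = sum(int(record["loans"]) for record in records)
```

the scraped values arrive as text, so the loan counts have to be converted to integers before they can be summed; this small cleaning step is typical of the 'data wrangling' that follows scraping.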
although there is no single recommended text for the course - the material is too broad and diverse - the technical content is roughly matched by herzog ( ) and the socio-ethical content by floridi ( ). for the central concept - data itself - we follow floridi's definition: data is any discernible difference, or lack of uniformity; information is well-formed, meaningful and truthful data (floridi ).

the course is organised in ten sections: their titles are stated here to show the trajectory of the story, and discussed below:

the story of data
finding the 'i' in data
you will be assimilated
data about data
taming of the data
searching for the data
working with the data
counting the data
the meaning in the data
ai: the data will replace you
making data work

each section includes two class sessions - presentations, demonstrations and practical work - plus significant independent student work; the whole course (a uk credit module) accounting for a nominal hours student work. this is sufficient to ensure that all students have the opportunity to gain an appreciation of each topic, conceptually and practically, and to be in a position to learn more, either during their studies or in the workplace. for some sections, guest lecturers from institutions such as the uk digital curation centre, altmetric, and cilip offer the viewpoint from the world of practice. considering each section in turn, we now briefly outline its content.

finding the 'i' in data

this introductory section considers the modern phenomenon of the data deluge, and its implications for the individual. it considers: the relation between data, information and documents (floridi ); the historical development of computer systems, and the ways in which computers represent and handle data - turing and von neumann architectures, bits and bytes, and coding systems (ince ); and socio-technical issues, particularly for the library/information profession.
this section establishes the conceptual framework for the course, and provides the understanding of basic issues needed by any librarian dealing with data.

you will be assimilated

this section introduces networks and digital network technologies, specifically the internet and the web, and the standards and protocols which underlie them, most notably tcp/ip and html. the concepts of the web and web pages are used to introduce some basic ideas of information architecture (rosenfeld, morville and arango ). some social and ethical implications of data transfer and sharing - including individual presence and privacy online, digital divide, net neutrality, and the implications of the design of network infrastructures - are considered. this establishes an understanding of the web environment in which virtually all data in the library context resides.

data about data

this section considers the ways in which data forms documents (furner ), and how different kinds of documents are defined, described and organised, leading to an introduction to metadata standards and applications. following the approach of pomerantz ( ), this treats metadata very broadly, giving some attention to bibliographic and web resource metadata, but focusing equally on metadata for datasets. this provides a link between the metadata concepts familiar to librarians and their application in the less-familiar dataset context.

taming of the data

this section considers the structuring of data into organised data files of various kinds: flat files, csv files, database structures including relational, and standards, including xml, rdf and linked data. this leads to a discussion of the processes of data management, for research and for other purposes, and of data curation (briney ; oliver and harvey ).
a conceptual understanding of, and an ability to work with, data files of these kinds is fundamental to the success of librarians in confidently dealing with data collections.

searching for the data

this section considers how to find data of various forms, building on earlier discussions of data structure. it covers the range of search tools for various forms of data collection: search engines, relational database systems and sql, full text bibliographic search systems, and other specialised retrieval tools (carlson, nelson, johnson and koshoffer ). it subsumes the text retrieval and bibliographical retrieval systems familiar to most librarians within the broader framework of systems which retrieve data of all kinds.

working with the data

this section focuses on the ways data can be collected from web services and apis, such as twitter, and then cleaned, manipulated and analysed; what is sometimes termed 'data scraping' and 'data wrangling'. software such as hawksey's tagsexplorer and openrefine (groves ) is used to illustrate collection, summarisation and visualisation. a facility with this kind of process will be particularly valuable to librarians seeking to become experts in helping their users deal with data issues, as it is becoming a widespread form of data usage.

counting the data

this section examines data metrics, introducing basic analytics, basic bibliometrics (as an introduction to the study of bibliometric laws and applications later in the programme), and altmetrics (tattersall ). while counting data is now technically quite straightforward, we ask what we are measuring when we measure data, and what it means. again, this is an extension of issues familiar to librarians - the bibliometrics of conventional publication - into the less-familiar data realm.

the meaning in the data

this section examines tools for exploring data to find meaning in it, including tools for text and data mining, and for visualization.
standard packages - wordle, tagxedo and voyant tools (megan , moorfield-lang ) - are used for collection and analysis of both structured and unstructured data from the web. there is a basic introduction to coding in the python language, including use of general and specialised subroutine libraries, web scraping via api wrapper, and regular expressions. the aim is to illustrate the purpose of coding, and where it offers advantages over the standard packages, with examples of library/information applications. the ability to undertake basic coding is now a valuable skill in many library contexts, including modifying bibliographic records, enriching metadata, converting record formats, customising interfaces, and linking systems. this section also considers the discipline of digital humanities, which has provided many of these tools, and its relationship to lis (svensson and goldberg ; robinson ).

ai: the data will replace you

this section examines artificial intelligence (ai), from popular visions and historical developments to current practice, and implications for the information professions. topics include machine learning, automatic indexing, tagging, classification and categorisation, artificial agents, web bots in general and chatbots in particular, and robots. issues include whether librarians will really be replaced by robots, what the likely balance of the human and the digital will be, and what some of the ethical implications are, following the approaches of boden ( ), and floridi ( , ). some understanding of these issues is essential for new entrants to the library profession, as the impact of ai on all sectors will be significant.

making data work

this section, in a sense, circles back to the first section, considering the importance of data handling and management for the future library/information professional.
how can they best contribute to managing the data deluge, and how can data be used to improve, justify and show the impact of, library/information services? no attempt is made to give definitive answers to these questions; rather this section opens a discussion, to be continued throughout the citylis masters programme.

all aspects of the learning context have been changed to emphasise the integration of the technical and social/ethical treatment of data issues. previously the course had been run by formal lectures followed by practical classes in a computer room. the computer room classes have been abandoned in favour of using a seminar room for all sessions, and encouraging students to bring and use their devices (laptops, tablets, smartphones) for short practical in-class exercises, which can be naturally integrated with presentation, and which encourage discussion and peer support. practical exercises have been adjusted so as to be doable without any special hardware or software, by using standard web-based systems: voyant tools, wordle, tagxedo, tagsexplorer, openrefine, etc. for the introduction to coding, which uses the python language, we are able to recommend a choice of online tutorials for practice, including one which requires only a web browser, rather than any special software. those students with a strong interest can, of course, take things further by using special purpose hardware and software available at the university.

the purpose of including the coding component is not to train the class to become efficient coders: that would be neither desirable nor feasible in the time available. it is not necessary that all librarians be coders, but it is necessary that they understand the nature and purpose of coding, and when and why writing code may be preferable to using pre-packaged software. in order to do this, it is necessary to have some practical experience of coding.
this course provides this, in the context of data collecting and processing, for those students who have not encountered coding before. for those who have, it provides an introduction to a language, python, with a rich provision of libraries and subroutines for accessing and manipulating data of the kinds of most interest to library/information practitioners. the aim is not to try to develop professional programming skills, but to show coding as a tool for creative exploration of data, following the approach espoused by montford ( ). students needing a more in-depth treatment of programming can find it elsewhere in the programme, especially by following technology-oriented electives, and by participating in 'out of hours' optional technology training. an example of the latter is citylis's hosting of the first library carpentry software training (playforth ).

background reading and resources for each section are designed to cover three perspectives: the technical, the socio-ethical, and the professional, the last outlining the implications for library/information professionals. while the sections are distinguished mainly by their technical content, the social and ethical concerns tend to overlap, since their principles are applicable in many aspects of information and data management (floridi , floridi and taddeo ).

the assessment for the course is an essay or report on a topic chosen by the student, but which must incorporate both technical and socio-ethical aspects. students are also required to set up a blog, if they do not already have one, and use it to reflect on their learning as the course progresses, and are also encouraged to use other forms of social media such as twitter, so as to ensure that all are comfortable with communicating via digital media.

conclusions

at the time of writing, the course had been given for the first time.
reaction from students, and from the expert practitioners acting as guest lecturers, suggests that this is an engaging and effective way of introducing students to the role of library/information professionals in managing data, understanding both the technology and its social and ethical dimensions. a more thorough and formal evaluation at the end of the academic year will influence the future direction of the course. the fact that it is compulsory for all library/information students, and indeed is the first course they encounter in their studies, helps emphasise the importance of understanding data and its implications in all library/information contexts.

data issues are clearly here to stay as a significant aspect of the work of all librarians, and other information professionals, and all entrants to the profession need a good socio-technical grounding as a basis for professional practice, and - vitally - continuing learning throughout professional life. we hope that this new citylis offering, which will be further developed over future years, will serve this purpose for our students, and may be a useful example to others.

references

american library association ( ), equipping librarians to code: ala, google launch ready to code university pilot programme, [blog post], available at http://www.ala.org/news/press-releases/ / /equipping-librarians-code-ala-google-launch-ready-code-university-pilot, accessed january .
ashenfelder, m. ( ), data and humanism shape library of congress conference, [blog post], available at http://blogs.loc.gov/thesignal/ / /data-and-humanism-shape-library-of-congress-conference/?loclr=eadpb, accessed january .
bailey, c.w. ( ), research data curation bibliography (version ), houston tx: digital scholarship, available at http://digital-scholarship.org/rdcb/rdcb.htm, accessed january .
boden, m.a. ( ), ai: its nature and future, oxford: oxford university press.
borgman, c.l.
( ), big data, little data, no data: scholarship in the networked world, cambridge ma: mit press.
briney, k. ( ), data management for researchers: organize, maintain, and share your data, exeter: pelagic publishing.
carlson, j., nelson, m.s., johnson, l.r. and koshoffer, a. ( ), developing data literacy programs: working with faculty, graduate students and undergraduates, bulletin of the association for information science and technology, ( ), - .
cisco ( ), the zettabyte era - trends and analysis, [online], available at http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html, accessed january .
crystle, m. ( ), libraries and facilitators of coding for all, knowledge quest, ( ), - .
ekstrøm, j., elbaek, m., erdmann, c. and grogorov, i. ( ), the research librarian of the future: data scientist and co-investigator, lse impact of social sciences blog, december , available at http://blogs.lse.ac.uk/impactofsocialsciences/ / / /the-research-librarian-of-the-future-data-scientist-and-co-investigator/, accessed march .
farmer, l.s.j. and safer, a.m. ( ), library improvement through data analytics, london: facet publishing.
floridi, l. ( ), charting our ai future, project syndicate, [online], available at https://www.project-syndicate.org/commentary/human-implications-of-artificial-intelligence-by-luciano-floridi- - , accessed january .
floridi, l. ( ), should we be afraid of ai?, aeon essays, [online], available at https://aeon.co/essays/true-ai-is-both-logically-possible-and-utterly-implausible, accessed january .
floridi, l. ( ), the fourth revolution: how the infosphere is reshaping human reality, oxford: oxford university press.
floridi, l. ( ), the ethics of information, oxford: oxford university press.
floridi, l. ( ), information: a very short introduction, oxford: oxford university press.
floridi, l. and taddeo, m.
( ), what is data ethics?, philosophical transactions of the royal society a, : , http://dx.doi.org/ . /rsta. .
furner, j. ( ), "data": the data, in kelly, m. and bielby, j. (eds), information cultures in the digital age, wiesbaden: springer vs, pp - .
gilbert-knight, a. ( ), your top library technology topics, techsoup for libraries blog ( december ), available at http://techsoupforlibraries.org/blog/your-top- -library-technology-topics, accessed january .
groves, a. ( ), beyond excel: how to start cleaning data with openrefine, multimedia information and technology, ( ), - .
herzog, d. ( ), data literacy: a user's guide, london: sage.
ince, d. ( ), the computer: a very short introduction, oxford: oxford university press.
kirkwood, r.j. ( ), collection development or data-driven content curation? library management, ( / ), - .
koltay, t. ( ), data literacy: in search of a name and identity, journal of documentation, ( ), - .
koltay, t. ( ), data governance, data literacy and the management of data quality, ifla journal, ( ), - .
lyon, l., mattern, e., acker, a. and langmead, a. ( ), applying translational principles to data science curriculum development, in ipres , november , chapel hill, north carolina, available at http://d-scholarship.pitt.edu/ /, accessed january .
macmillan, d. ( ), developing data literacy competencies to enhance faculty collaborations, liber quarterly, ( ), - .
mackenzie, a. and martin, l. (eds.) ( ), developing digital scholarship: emerging practices in academic libraries, london: facet publishing.
megan, w.e. ( ), review of voyant tools, collaborative librarianship, ( ), - .
montford, n. ( ), exploratory programming for the arts and humanities, cambridge ma: mit press.
moorfield-lang, h. ( ), infographics: information gets visual, information searcher, ( ), - .
nielsen, h.j. and hjørland, b.
( ), “curating research data: the potential roles of libraries and information professionals”, journal of documentation, ( ), – .
niso ( ), niso two-part webinar: digital and data literacy, available at http://www.niso.org/news/events/ /webinars/sept _webinar, accessed january .
north carolina state university ( ), data science and visualization institute for librarians, [online], available at https://www.lib.ncsu.edu/datavizinstitute, accessed january .
oliver, g. and harvey, r. ( ), digital curation, london: facet publishing.
oxford internet institute ( ), msc social science of the internet, [online], available at https://www.oii.ox.ac.uk/study/msc, accessed january .
parkes, e. ( ), data literacy: helping non-data specialists make the most of data science. government digital service blog post, available at https://gds.blog.gov.uk/ / / /data-literacy-helping-non-data-specialists-make-the-most-of-data-science, accessed january .
pennington, d. ( ), technical skills information professionals should learn. cilip blog ( march ), available at http://www.cilip.org.uk/blog/ -technical-skills-information-professionals-should-learn, accessed january .
playforth, c. ( ), why the information profession needs library carpentry [blog post], available at https://blogs.city.ac.uk/citylis/ / / /why-information-profession-needs-library-carpentry, accessed january .
pomerantz, j. ( ), metadata, cambridge ma: mit press.
rice, r. and southall, j. ( ), the data librarian's handbook, london: facet publishing.
robinson, l. ( ), are the digital humanities and library and information science the same thing? [blog post], available at https://thelynxiblog.com/ / / /are-the-digital-humanities-and-library-information-science-the-same-thing/, accessed january .
robinson, l. and bawden, d. ( ), information (and library) science at city university london: years on educational development, journal of information science, ( ), - .
rosenfeld, l., morville, p. and arango, j. ( ), information architecture for the web and beyond ( th edn.), sebastopol ca: o'reilly media.
showers, b. ( ), library analytics and metrics, london: facet publishing.
si, l., zhuang, x., xing, w. and guo, w. ( ), the cultivation of scientific data specialists: development of lis education oriented to e-science service requirements, library hi tech, ( ), – .
sugimoto, c.r., ekbia, h.r. and mattiolli, m. ( ), big data is not a monolith, cambridge ma: mit press.
svensson, p. and goldberg, d.t. ( ), between humanities and the digital, cambridge ma: mit press.
tang, r. and sae-lim, w. ( ), data science programs in u.s. higher education: an exploratory content analysis of program description, curriculum structure, and course focus, education for information, ( ), - .
tattersall, a. (ed.) ( ), altmetrics: a practical guide for librarians, researchers and academics, london: facet publishing.
university of sheffield ( ), msc data science, [online], available at http://www.shef.ac.uk/is/pgt/courses/ds#tab , accessed january .

social media in the emergency medicine residency curriculum: social media responses to the residents' perspective article

ucsf uc san francisco previously published works
permalink: https://escholarship.org/uc/item/ jj g
journal: annals of emergency medicine, ( )
issn: -
authors: hayes, bryan d; kobner, scott; trueger, n seth; et al.
publication date: - -
doi: . /j.annemergmed. . .
peer reviewed

education/special contribution

social media in the emergency medicine residency curriculum: social media responses to the residents’ perspective article

bryan d.
hayes, pharmd*; scott kobner, bs; n. seth trueger, md, mph; stella yiu, md; michelle lin, md

*corresponding author. e-mail: bhayes@umem.org, twitter: @pharmertoxguy.

in july to august , annals of emergency medicine continued a collaboration with an academic web site, academic life in emergency medicine (aliem), to host an online discussion session featuring the annals residents’ perspective article “integration of social media in emergency medicine residency curriculum” by scott et al. the objective was to describe a -day worldwide clinician dialogue about evidence, opinions, and early relevant innovations revolving around the featured article and made possible by the immediacy of social media technologies. six online facilitators hosted the multimodal discussion on the aliem web site, twitter, and youtube, which featured preselected questions. engagement was tracked through various web analytic tools, and themes were identified by content curation. the dialogue resulted in , unique page views from cities in countries on the aliem web site, , twitter impressions, and views of the video interview with the authors. the five major themes identified in the discussion were curriculum design, pedagogy, and learning theory; digital curation skills of the st-century emergency medicine practitioner; engagement challenges; proposed solutions; and best practice examples. the immediacy of social media technologies provides clinicians the unique opportunity to engage a worldwide audience within a relatively short time frame. [ann emerg med. ;-: - .]

- /$-see front matter
copyright © by the american college of emergency physicians.
http://dx.doi.org/ . /j.annemergmed. . .

introduction

in , annals of emergency medicine and academic life in emergency medicine (aliem) began a joint social media–based global emergency medicine journal club.
with this series’ increasing popularity, as well as the concurrent growing use of online resources and discussion among emergency medicine trainees, the collaboration was extended to the annals residents’ perspective series. this article summarizes the social media–based discussion hosted by aliem about the article by scott et al, “integration of social media in emergency medicine residency curriculum.”

the accreditation council for graduate medical education residency review committee now permits emergency medicine programs to fulfill up to % of their weekly didactic conference requirement with an average of hour per week of “individualized interactive instruction,” potentially allowing programs to leverage home consumption of online resources into the formal residency curriculum. in a previous residents’ perspective, reiter et al discussed the implementation of asynchronous resources into the curriculum at mount sinai’s emergency medicine residency, including the challenges of meeting the residency review committee’s broad but vague requirements: an evaluation component, faculty oversight, monitoring for resident participation, and monitoring for effectiveness. the committee is clear that simple consumption of online educational content does not suffice, and proper faculty and program leadership involvement with -way interaction is required.

in addition to the accreditation challenges, program leadership is also faced with a paucity of evidence. a systematic review of the use of social media in education found only relevant studies, mostly small, focused on reflective writing or using course-specific content. the focus in many currently published articles on social media in resident education is the risks, particularly with respect to professionalism and privacy, rather than instruction on how to develop a curriculum.
blog posts and podcasts previously discussing social media in emergency medicine residency educational curricula are presented in table . there were blog posts, podcasts, and news network articles.

the featured residents’ perspective describes many of the available social media modalities that program leadership could incorporate: blogs, podcasts, videos, twitter, and google hangouts. the authors provide examples of specific resources to use and descriptions of how their emergency medicine residency at the university of pennsylvania uses each modality. they also describe their use of an innovative, debate-based, flipped classroom model. finally, the authors address their approaches in dealing with common drawbacks in social media use for resident education, including the potential for “information overload,” which was a topic covered by thoma et al in an accompanying residents’ perspective.

table . online blog posts, podcasts, and news network articles discussing social media in emergency medicine residency educational curricula.

| web site | author | title | type | country | date |
|---|---|---|---|---|---|
| american academy of emergency medicine resident and student association (aaem/rsa) | meaghan mercer | foam—this is not the future of medicine, it is medical education now! | blog | united states | july , |
| aliem | andrew grock | new air series: aliem air | blog | united states | july , |
| aliem | nikita joshi | lost in translation: what counts as asynchronous learning? | blog | united states | january , |
| acep now | jeremy faust | tweets from emergency medicine–related conferences relay latest research about social media and critical care, resuscitation procedures, ultrasounds, and toxicology | news network | united states | may , |
| acep now | jeremy faust | foamed appeal is simple: get more, pay nothing | news network | united states | february , |
| emergency medicine cases | anton helman | social media & emergency medicine learning | podcast | canada | june , |
| emergency medicine cases | anton helman | best case ever rob rogers on social media in em education | podcast | canada | june , |
| emergency medicine news | paul bufano | news: how twitter can save a life | news network | united states | april , |
| emergency medicine news | gina shaw | breaking news: don’t call it social media: foam and the future of medical education | news network | united states | february , |
| emergency physicians monthly | nicholas genes | pro/con: why #foamed is not essential to em education | news network | united states | april , |
| emergency physicians monthly | joe lex | pro/con: why #foamed is essential to em education | news network | united states | april , |
| iteachem | robert cooney | how we are flipping em education | blog | united states | january , |
| takeokun | jason nomura | resident education in ultrasound using simulation and social media aium | blog | united states | april , |
| the poison review | leon gussow | must-read: getting started in online emergency medicine education and foamed | blog | united states | june , |
| the rolobot rambles | damian roland | #foamed and #smacc: revealing the camouflaged curriculum | blog | united kingdom | july , |
| the skeptics guide to emergency medicine | ken milne | tiny bubbles (#foamed and #meded) | podcast | canada | april , |
| ultrasound podcast | matt dawson | social media and medical education. #foamed talk from #acep | podcast | united states | may , |

air, approved instructional resources.
aliem explored these challenges and opportunities in integrating social media into a residency curriculum, using social media platforms, including a twitter conversation, web site discussion, and live video interview with the authors and key subject-matter experts. this article aims to curate (ie, collect, organize, and summarize) the online discussions from the global community of practice and highlight potential challenges and strategies. we also report objective web analytics for the various online platforms used.

materials and methods

the annals editors selected the residents’ perspectives article, and facilitators were chosen by aliem for their expertise in medical education and active presence on social media. two are experienced bloggers on aliem (b.d.h., m.l.), and all have active twitter accounts that had follower numbers greater than (@skobner), (@mikegisondi), (@stella_yiu), , (@mdaware), , (@pharmertoxguy), and , (@m_lin) at the time of the discussion. the discussion was hosted by aliem (http://www.aliem.com), which is a public, wordpress-based, educational blog web site created in , with currently greater than million page views annually, greater than , facebook fans, greater than google+ followers, and greater than e-mail subscribers.

figure . featured aliem blog questions.

promotion of the event

promotion for the discussion included notices on the aliem web site, aliem facebook page, aliem google+ page, and the annals’ and facilitators’ individual twitter accounts, using the #aliemrp hashtag. it began in the days leading up to the dialogue, which started july , , and then continued several times a day during the first days of the discussion period.
tweets were directed toward individuals who also follow established hashtags, such as #foamed (“free open access medical education,” or foam), as well as the followers of the facilitators.

social media discussion period

the facilitators’ goal during the discussion, which began july , , was to encourage sharing and reflection on preselected discussion questions (figure ) about social media in the emergency medicine residency curriculum from the perspective of learners, educators, and programs. on july , (day ), a live panel discussion was hosted on google hangout on air, featuring of the authors of the highlighted article, kevin scott, md, and mira mamtani, md (university of pennsylvania); established educators, stella yiu, md (university of ottawa), and michael gisondi, md (northwestern university); and the annals’ assistant social media editor, seth trueger, md, mph (university of chicago). michelle lin, md (university of california, san francisco), and scott kobner (new york university) participated off camera by live-tweeting the event. bryan d. hayes, pharmd (university of maryland), moderated the discussion. the video interview was automatically recorded and archived into aliem’s youtube account (aliem interactive videos). the discussion was hosted on the aliem web site, with comments moderated on the blog web site and twitter.

table . aggregate analytic data from various social media–based discussions for the first days of the event.

| social media analytic aggregator | metric | metric definition | count |
|---|---|---|---|
| google analytics: a free online service to track page views and other blog metrics | page views | number of times the web page containing the post was viewed | , |
| | users | number of times individuals from different ip addresses viewed the site (previously termed “unique visitors” by google) | , |
| | number of cities | number of unique jurisdictions by city as registered by google analytics | |
| | number of countries | number of unique jurisdictions by country as registered by google analytics | |
| | average time on page | average amount of time spent by a viewer on the page | : minutes |
| aliem social media post widget: a web-based tool embedded into each blog post, which tracks engagement metrics for multiple social media platforms | number of tweets from page | number of unique -character notifications sent directly from the blog post by twitter to raise awareness of the post | |
| | number of facebook likes | number of times viewers “liked” the post through facebook | |
| | number of google+ shares | number of times viewers shared the post through google+ | |
| aliem comments section | number of site comments | comments made directly on the web site in the blog comments section | |
| | average word count per blog comment (excluding citations) | | |
| symplur analytics: a free online service to track metrics for twitter engagement of health-related hashtags; used to track twitter hashtag #aliemrp | number of tweets | number of tweets containing the hashtag #aliemrp | |
| | number of twitter participants | number of unique twitter participants using the hashtag #aliemrp | |
| | twitter impressions | how many impressions or potential views of #aliemrp tweets appear in users’ twitter streams, as calculated by number of tweets per participant and multiplying it by the number of followers that participant has | , |
| youtube analytics: a free online service to track youtube video viewing statistics | length of videocast | total duration of recorded google hangout videoconference session | min s |
| | number of views | number of times the youtube video was viewed | |
| | average duration of viewing | average length of time the youtube video was played in a single viewing | min s |

figure . geographic distribution of readers who viewed the blog post during the first days of discussion.

curation of multimodal discussion

transcripts from twitter, the blog web site, and the video interview discussions were analyzed for broad emerging themes during the -day period (july to august , ) by authors (b.d.h., s.k., m.l.) independently. to ensure logical organization and comprehensiveness, and to settle any disagreements between the primary analysts, the other authors subsequently reviewed these themes. participants were self-selected, and discussants were excluded from analysis only if their commentary was blatantly inappropriate to the topic of conversation, purposefully inflammatory, or otherwise inconsistent with constructive behavior.
a more purposive sampling of participants was not used because the original intent of this multimodal discussion approach was not to achieve saturation in a selected population but rather to provide a novel digital space for a virtual global community of practice to openly engage in academic discourse.

web analytics were recorded for this -day discussion period. viewership and engagement were measured with such tools as google analytics, the aliem social media post widget, youtube analytics, and symplur. table provides descriptions for each of these tools. the number of comments and words per comment in the web site discussion were also calculated, excluding the initial comments by the facilitators and references.

results

web analytics

the -day analytics data for the multiplatform discussion about social media in the emergency medicine residency curriculum from july , , to august , , are summarized in table . the google hangout video interview, posted on july , , was viewed times during the -day period, rarely in its entirety, with an average viewing time of minutes seconds of the full duration of minutes seconds. analytics data indicate that . % of page viewers were using a mobile platform (telephone or tablet) to access the video. figure displays the global geographic distribution of participants who visited the blog. a full transcript of the blog web site discussion is archived at http://www.aliem.com/social-media-in-the-em-curriculum-annals-em-resident-perspective-article/, all tweets with the #aliemrp hashtag are archived on symplur.com at aliem.link/ tkvadd, and the google hangout video can be accessed at https://www.youtube.com/watch?v=kyetj sxzci.

summary of the online discussion

five major domains were identified, which included curriculum design, pedagogy, and learning theory; digital curation skills of the 21st-century emergency medicine practitioner; engagement challenges; proposed solutions; and best practice examples.

figure . tweet by christopher doty, md.

figure . tweet by lauren westafer, do.

curriculum design, pedagogy, and learning theory

even before considering the role of social media and technology in the emergency medicine residency curriculum, several blog comments and tweets supported a broader and more scholarly thought process in addressing the integration of social media into the curriculum. many supported the notion that pedagogy, the science and theory of teaching, should drive medical education and the role of technology. furthermore, javier benítez, md, (no affiliation) advocated that “learning theories give us a framework with which to describe how learning might happen under certain circumstances. i think the use of social media to engage learners in the acquisition, participation, and creation of knowledge should be explored on the basis of different learning theories. ...
the use of technology should be to support effective pedagogic practices which include self-regulated learning, critical thinking, information management, and more.” as an example, jeffrey hill, md, (university of cincinnati) suggested using the community of inquiry framework, which is an instructional design model based on social constructivist education theory. this might help guide the incorporation of social media into residency education. the community of inquiry framework defines a “good” learning environment as having a cognitive, teaching, and social presence. lin, editor-in-chief of the aliem blog, agreed that “you need to get buy-in from the learners about the intrinsic value of the content, a strong facilitator presence, and a ‘safe’ community to grow and learn.”

teresa chan, bed, md, (mcmaster university) provided a slightly different perspective, advocating that “theory and technology have to evolve side by side.” as an example, she referenced the lave and wenger theory of situated learning, which posits that learning should not be the mere transmission of noncontextualized knowledge to individuals but rather a socially driven process in which contextualized, coconstructed knowledge occurs. this may explain how clinicians in communities of practice “go from peripheral participants to experts within a community. this theory likely explains much of the foam community, and yet it never truly anticipated the role that online, asynchronous virtual spaces might play on developing online communities. adaptations of these theories, but also being open to altering them in light of new phenomenon [sic] brought forth by the technology may be a new and intriguing merger of both curricular design and implementation.”

in addition to pedagogic frameworks, christopher doty, md, (university of kentucky) also advocated faculty development on andragogy, which is the art and practice of teaching adult learners (figure ).
digital curation skills of the 21st-century emergency medicine practitioner

a separate discussion examined the pervasive use of educational social media among residents but soon evolved into a dialogue about the importance of critical appraisal of information. anand swaminathan, md, mph, (new york university/bellevue) observed that “residents have already embraced much of this [social media]. each of them has their preferred blogs and podcasts. the information has been integrated into their general learning, and i hear things on shift that i know echo things they learned from foam.” although other commentators were concerned with the possibility of information overload for residents, benítez suggested that this use of social media represents a unique opportunity to teach a valuable skill. he argued that “medical education has always been burdened with information overload as reported by anderson, et al. in . one way to alleviate information overload and decisionmaking is by training physicians on information management with the aid of technology.” lauren westafer, do, (baystate medical center) concurred (figure ).

figure . tweet by jeff riddell, md.

figure . tweet by jeff riddell, md.

in developing residents’ critical appraisal and information management skills, robert cooney, md, mmed, (conemaugh memorial medical center) also suggested that faculty advance their own ability to curate information alongside learners in this digital age. discussants agreed that this would provide a more robust relationship between faculty and learners, encouraging the development of the curation skills of both. teaching critical appraisal of online resources is a new challenge for educators who may not have robust curation skills themselves.
this challenge was the impetus for the creation of the aliem approved instructional resources series, which attempts to address observations that “residencies struggle in evaluating which blog posts and podcasts are appropriate and high quality for resident education.” swaminathan contended that “programs like the aliem approved instructional resources will make some of this easier,” in reference to lists of expert-reviewed and vetted social media content published for programs to easily integrate into their curricula.

engagement challenges

several respondents identified engagement as the central challenge to using social media in education. learner engagement depends on both learners and educators. from the learner perspective, the consensus seemed to be that learners are naturally gravitating toward reading and listening to online content to supplement their clinical education. one key obstacle identified by george miller, md, (louisiana state university health sciences center), however, was the issue of distraction. because social media technologies draw in both professional and personal content to central platforms, learners may become distracted during educational sessions.

engagement with social media platforms in residency education was identified more as a problematic issue for educators and other stakeholders, such as those in key residency leadership positions. anecdotally, skeptical faculty members have highlighted the lack of formal peer review and quality control in contrast to more established educational modalities such as textbooks and journal publications. others have questioned the need for new educational tools and the potential of this new and as yet unproven educational approach. furthermore, several noted that the lack of published outcomes data using social media in medical education contributes to the skepticism about incorporating social media into the curricula (figure ).
the central argument illustrating a need for a culture change with more faculty buy-in was best summarized by shannon mcnamara, md, (st. luke’s–roosevelt), who stated that “as educators, we need to meet our learners where they’re at.” jeff riddell, md, (ucsf-fresno) echoed this in his tweet (figure ). proponents advocated that educators embrace the digital modalities that residents are already learning from to help them navigate knowledge gaps and provide individualized feedback. online educational content should be accepted and valued by all stakeholders (learners, educators, and administration) as a complementary resource to traditional resources, rather than as nonessential, “extra, noncore material.”

in addition to a more engaged educator presence, others recommended the concept of learner agency, which is giving the learners the autonomy or power to control their own education. educators should learn to encourage learners to initiate and invest in their own educational experience in a virtual community of practice. multiple respondents asserted that they hope this will shift education away from the current teacher-centered, passive model to one in which learners are intrinsically motivated, are centrally involved, and develop lifelong learning skills in the process.

in a more learner-centered model, educators thus may need to change their approach to teaching. lin commented, “it is incredibly hard for educators (myself included) to shift away from giving the stock lecture that i’ve given every year. there’s a growing trend toward less ‘talking at’ (lectures) and more ‘talking with’ (facilitating) learners.” furthermore, hill stressed that the issue of engagement can be more difficult with online teaching compared with traditional teaching, such as classroom-based lectures. “in live teaching you can adjust your teaching style on the fly if it seems you are losing them versus when you are responding to discussion board posts or on a comment thread.” the modern educator will need to evolve to incorporate these new skills in teaching, facilitation, and engagement in online environments.

ultimately, the overarching principles remain the same as when designing nondigital educational curricula. several individuals agreed with hill that creation of social media–based products should still keep in mind that “the ultimate goal of any initiative should be to create a robust venue for interaction between learner and instructor” for meaningful learning experiences. doty said that learner-centered curricula will help ensure learner engagement, as long as it is remembered that social media technologies are merely operational tools (figure ).

figure . blog comment by anand swaminathan, md, mph.

figure . tweet by christopher doty, md.

proposed solutions

multiple suggestions were put forth to address the barriers and challenges in social media adoption for emergency medicine residency education. first, swaminathan proposed that each residency program have faculty champions, who are active in social media and medical education, in their local departments to start the culture change from within (figure ). second, multiple commentators agreed there should be more widespread faculty development efforts on learning theories, especially as they relate to digital scholarship and technologies. third, readers said that there must be recognition that generational differences exist in how technologies have played a role in our lives. different expectations and uses cascade into generational differences with respect to educational preferences. program leadership may even consider incorporating reverse mentoring of educators by learners on educational technologies. fourth, there was a call for developing academic incentives to help support the efforts of digital educators.
currently, online educational efforts do not receive as much academic credibility as traditional educational endeavors in terms of academic value for promotion and tenure advancement. this may contribute to problems in recruiting and engaging faculty to join this digital educational movement. chan challenged the academic norm by stating, “just as some medical schools are giving credit for editing wikipedia perhaps there can be some sort of ‘credit’ given to those who are engaging in kt [knowledge translation] and/or review-based scholarship in the foam world? i think it is imperative that our community not only train clinicians who can be informed consumers of foam, but also those who can be active contributors back to the community.” real-time web analytic data demonstrating objective measures of viewership and engagement with social media may help educators quantify and legitimize their efforts. fifth, riddell also championed legitimizing social media technologies from the perspective of learners, stating, “how do we engage learners? one way is to make sure we get credit for our time. ... many in our program are regular consumers of podcasts, videocasts, and blogs and we get little to no official ‘credit’ for our hours of learning. ... the easier it is for us to get ‘conference credit’ for our time, the more engaged people will be.”

best practice examples

throughout the course of discussion, many respondents highlighted their experience with social media in emergency medicine education. assembled in table is a summary of their innovative examples across a diversity of settings and social media platforms.

limitations

as with the previous aliem-annals residents’ perspective discussion on multiple mini-interviews, our results may be susceptible to selection bias because voluntary participants in social media discussions may represent a distinct subset with views different from those of the broader population.
furthermore, the views of vocal participants may be disproportionately represented over those of stakeholders who did not participate in the public discussion. additionally, individuals in geographic regions in which access to the social media modalities used in this study is censored could not be included in this study. socioeconomic or technologic barriers to entry in this discussion may have also introduced sampling bias. in addition to potentially having selection bias, our study was not designed to reach saturation, and thus some relevant themes may not have emerged.

table . summary of best social media integration practices into education mentioned by contributors to the blog discussion.

contributor | residency program | social media incorporation
brian adkins, md | university of kentucky | for medical students, uses twitter for a quiz contest that translates toward examination bonus points and a comedic prize
robert cooney, md, mmed | conemaugh memorial medical center | implements wikis to redesign curriculum in real time; incorporates social media content as supplement to primary literature for flipped classroom
sean fox, md | carolinas medical center | summarizes residency conference teaching points on http://www.cmcedmasters.com/core-concepts to create a bedside resource for residents
andrew grock, md | suny downstate | cocreated aliem air series to identify expert-identified blog and podcast content for residency programs
jeffrey hill, md | university of cincinnati | published an online orientation curriculum for flight physician course; hosts google+ literature discussion forum for educator-learner discussion
david marcus, md | long island jewish medical center | cocreated popular #emconf hashtag on twitter, which allows emergency medicine residency programs to share conference pearls worldwide through tweets
shannon mcnamara, md | temple university | incorporates podcasts and blogs into the emergency medicine curriculum to increase learner engagement
tamara moores, md | university of utah | described her residency program practices: provides ipads to all interns, which includes downloaded textbooks, rss setup, twitter, and medical apps; provides automatically synchronized summaries from conference through a shared residency evernote folder; encourages asynchronous learning using social media/multimedia resources (eg, podcast) and integration into residency conference discussion
jeff riddell, md | ucsf-fresno | incorporates social media discussion about timely scientific developments (eg, peitho and mopett trials), such as twitter and blogs to trigger conversation in morning report
stephen smith, md | university of minnesota | incorporates his blog’s content (smith’s ecg blog) into a -hour ecg course

objective web analytics have many limitations. for instance, twitter and symplur data capture only tweets that include the #aliemrp hashtag and may understate the number of participants and the full extent of discussion. on the other hand, the “impressions” statistic represents an upper bound of potential reach. we were able to measure tweets by individuals, for a total of , impressions.

furthermore, the nature of this online journal club lends itself to a potentially recursive conflict of interest because the study authors, blog editors, and journal all stand to benefit from mutual collaboration. such inherent conflicts may be initially unavoidable in this evolving new endeavor to expand a journal article’s reach and engagement through social media technologies with trainees and practicing clinicians.

discussion

this article presents the results of the second aliem-annals collaboration using a variety of social media approaches to explore the topic of social media integration into emergency medicine residency curricula.
in analyzing the themes that emerged from the audience, some echoed the existing literature and featured article and others generated novel hypotheses for further study. emergent themes included using learning theories to develop learner-centered curricula. respondents stressed that curation skills are vital for both learners and educators while remembering that social media should not be used as a sole appraisal method for primary literature. a key overarching theme throughout the discussion involved the issue of learner and educator engagement. all of the themes aligned with the current literature on the challenges and best practice strategies about collaboration, cooperative learning, engagement, and learning in a social context. ultimately, a major culture shift is necessary among all stakeholders, including learners, educators, the residency leadership, and academic institutions, to accept and adopt these social media–based resources as legitimate forms of education and scholarship because learners are already independently incorporating them into their medical education.

in the incorporation of these new technological approaches into formal curricula, it is essential to focus faculty development on teaching today’s clinician-educators about pedagogy, specifically, curriculum design and learning theories. major educational efforts, regardless of the role of technology, can usually benefit from established instructional and curricular design approaches, such as kern’s 6-step model for curriculum design, which includes problem identification and general needs assessment, needs assessment for targeted learners, goals and objectives, educational strategies, implementation, and evaluation and feedback.

social media: a new frontier in scholarly discussions

as described previously, our methods for scholarly discussion represent a departure from traditional approaches to critical appraisal of literature by providing a free, public, global, asynchronous forum for conversations among authors, learners, experts, clinicians, and educators. by continuing to explore new media for academic discourse, such as twitter, blogs, podcasts, and youtube, we have demonstrated the continued feasibility and potential of social media as means of meaningful discussion and engagement with primary literature.

throughout our -day, multiplatform discussion of the annals’ article, we were able to reach a larger audience than in our previously reported discussions. as shown in table , although the views and reaches on the web site alone were consistent with the engagement observed in our previous discussion, we observed a doubling of twitter “impressions,” which are defined as the number of potential views of all tweets by unique twitter users. we also observed an increase in twitter participants who used the #aliemrp hashtag in the discussion. this could be partially confounded by a growing twitter user base compared with that of the previous discussion, with more followers per user, thus creating more impressions. twitter involvement grew disproportionately to engagement with the web site, possibly because iterative social media discussions are recruiting greater numbers of viewers from the sidelines and encouraging them to actively participate in the discussion. alternatively, the increase in active participation may be the direct result of the subject matter.
further investigation into the engagement habits of viewers of online scholarly discussions and influx of emergency physician twitter users should be conducted to explore these possibilities.

again, the youtube analytics for the google hangout video broadcast continued to demonstrate notably less success than the blog and twitter discussions. similar to that of the previous installment in the series, the average viewing time was minutes seconds out of minutes seconds. although the online video format is both accessible and familiar, many factors could contribute to these relatively meager statistics. for instance, mobile platforms enable viewers to engage in brief discussions in various environments (such as during commutes). these environments may be more prone to interruption and less suitable for longer-form videos. furthermore, watching a filmed discussion is a more passive learning experience than engagement through blog comments or twitter; therefore, participants might be less responsive to this medium of discussion. digital scholarly discourse may be more suited to the relatively shorter time frame often associated with blogging and microblogging media. whether this is due to preference of medium versus the content itself is a question for future study.

even with these incomplete viewings, we believe that the video interview with the authors and experts has unique value by connecting learners with experts more than was previously possible. in addition to the blog and twitter discussion, the video likely reached learners who otherwise would not have become familiar with the annals article. some critics may be concerned that learners will increasingly depend on secondary sources instead of critically reading the primary literature. though some viewers of our discussion might return to the original annals article, others who might not seek out primary literature were still exposed to the discussion and critical appraisal process.
we hold that this form of digital scholarship we present here exceeds traditional expectations of secondary sources by teaching learners how to critically analyze and discuss primary literature. though this approach may not be ideal, it at least promotes scholarly inquiry among learners who might not have engaged otherwise, increasing, not substituting, engagement.

as the landscape of digital scholarship evolves, our broad analytic approach to study multiple social media platforms establishes the importance of our digital, asynchronous form of academic discourse in medicine. the ability to communicate with thousands of learners, educators, and clinicians (across time, geographic barriers, and resource availability) represents a powerful opportunity that goes far beyond anything possible in traditional academic settings.

conclusion

from an educational innovation perspective, this multimodal approach provided a novel venue for asynchronous, scholarly discussions about a controversial topic published in the journal literature. it was able to attract , unique readers from countries, using social media modalities that included a medical education blog, twitter, and live video interview. our social media–based approach showed the power of online engagement with multiple experts and a diverse audience to detect new and emerging themes as framed by existing literature. this method may allow more rapid hypothesis generation for future research and enable more accelerated knowledge translation. however, the online community demonstrated here is only a small subset of hundreds of thousands of emergency medicine practitioners worldwide, many of whom do not engage in digital scholarly practices. as the methods presented here continue to be refined, it is hoped that a larger proportion of emergency medicine educators and learners will become engaged in this approach to knowledge translation.

the authors acknowledge the aliem blog discussion participants: brian adkins, javier benitez, teresa chan, robert r. cooney, sean m. fox, andrew grock, bryan d. hayes, pharmd, justin hensley, jeffery hill, scott kobner, michelle lin, md, david marcus, shannon mcnamara, tamara moores, jeff riddell, stephen smith, and anand swaminathan; the #aliemrp twitter participants (and the number of their followers): @_drjeffy ( ), @_nmay ( , ), @aaimonline ( ), @aliemconf ( , ), @alsugairmd ( ), @amosshemesh ( ), @amyjwal ( ), @annalsofem ( , ), @aylc ( ), @bonnycastle ( , ), @choo_ek ( , ), @chsu ( ), @damian_roland ( , ), @davelew_ ( ), @davidjuurlink ( , ), @ditchdocrn ( ), @docamyewalsh ( ), @docmaj ( ), @dr_jibbajabba ( ), @drmshakeeb ( ), @eajkd ( , ), @elghulmd ( ), @em_educator ( , ), @emcases ( , ), @emeducation ( , ), @emergencymedbmj ( , ), @eminfocus ( ), @emswami ( , ), @emtdocandy ( ), @erikhandberg ( ), @eusmd ( ), @felixankel ( ), @fltdoc ( , ), @galal ( ), @haldunakoglu ( ), @hp_ems ( , ), @jameslhuffman ( ), @jeff__riddell ( ), @jeremyfaust ( , ), @jllaidlaw ( ), @journalofgme ( ), @jvrbntz ( , ), @k_scottmd ( ), @kasiahamptonmd ( ), @kfontes ( ), @lasvegasem ( ), @lsaldanamd ( , ), @lwestafer ( , ), @m_lin ( , ), @majthagafi ( ), @mayoclinicem ( ), @mdaware ( , ), @medieditor ( ), @mikegisondi ( ), @mikepaddock ( ), @miramamtanipenn ( ), @mkchan_rcpsc ( ), @mkleinmd ( ), @nickjohnsonmd ( ), @njoshi ( , ), @northwesternem ( ), @nxtstop ( , ), @nysuri ( ), @parzivalinc ( ), @pennsomelab ( ), @petradmd ( , ), @pharmertoxguy ( , ), @poppaspearls ( ), @purdy_eve ( ), @saemonline ( , ), @salvo_fedele ( ), @sjpoon ( ), @skobner ( ), @sono_kids ( ), @southerngis
( ), @stella_yiu ( ), @tchanmd ( , ), @uchicagoem ( ), @ucmorningreport ( ), @ultrasoundjelly ( ), @umanamd ( , ), @umjacksonem ( ), @ummedschool ( , ), @upennem ( ), @yaniralandaver ( ), and @zindoctor ( ); and the google hangout video interview participants: michael gisondi, bryan d. hayes, pharmd, mira mamtani, kevin scott, n. seth trueger, md, mph, and stella yiu, md. annals of emergency medicine supervising editor: michael l. callaham, md author affiliations: from the department of pharmacy, university of maryland medical center, and the department of emergency medicine, university of maryland, baltimore, md (hayes); the new york university school of medicine, new york, ny (kobner); the section of emergency medicine, university of chicago, chicago, il (trueger); the department of emergency medicine, university of ottawa, ottawa, canada (yiu); and the department of emergency medicine, university of california, san francisco, and the mededlife research collaborative, san francisco, ca (lin). funding and support: by annals policy, all authors are required to disclose any and all commercial, financial, and other relationships in any way related to the subject of this article as per icmje conflict of interest guidelines (see www.icmje.org). dr. trueger reports receiving a stipend and writing fees as social media editor for emergency physicians monthly, and receiving a stipend and writing fees as social media editor for twitter.
annotation as a new paradigm in research
archiving
two case studies: republic of letters - hebrew text database
dirk roorda
data archiving and networked services (knaw)
p.o. box ab den haag netherlands
dirk.roorda@dans.knaw.nl
charles van den heuvel
huygens ing (knaw)
p.o. box lt den haag netherlands
charles.van.den.heuvel@huygens.knaw.nl
knaw = royal netherlands academy of arts and sciences, knaw.nl
abstract
we outline a paradigm to preserve results of digital scholarship, whether they are query results, feature values, or topic assignments. this paradigm is characterized by using annotations as multifunctional carriers and making them portable. the testing grounds we have chosen are two significant enterprises, one in the history of science, and one in hebrew scholarship. the first one (ckcc) focuses on the results of a project where a dutch consortium of universities, research institutes, and cultural heritage institutions experimented for years with language techniques and topic modeling methods with the aim to analyze the emergence of scholarly debates. the data: a complex set of about . letters. the second one (dthb) is a multi-year effort to express the linguistic features of the hebrew bible in a text database, which is still growing in detail and sophistication. versions of this database are packaged in commercial bible study software. we state that the results of these forms of scholarship require new knowledge management and archive practices. only when researchers can build efficiently on each other’s (intermediate) results, they can achieve the aggregations of quality data by which new questions can be answered, and hidden patterns visualized. archives are required to find a balance between preserving authoritative versions of sources and supporting collaborative efforts in digital scholarship. annotations are promising vehicles for preserving and reusing research results.
keywords
annotation, portability, archiving, queries, features, topics, keywords, republic of letters, hebrew text databases.
asist , october - , , baltimore, md, usa.
introduction
in the early modern history of europe, letters were by far the most important means of communication and played a role in the emergence of scholarly communities. although from the s onwards this one-to-one means of exchange of knowledge was gradually replaced by a more public form of scholarly communication via learned periodicals, communication via letters did continue. given the role of the letter in scholarly communication and the emergence of scientific communities in europe it is not surprising that the so-called “republic of letters” became a recurrent theme in the history of the humanities and sciences. with the introduction of digital tools, various new projects were set up to map the exchange of knowledge and to analyze the creation of scholarly networks in europe. the beautiful visualizations of the project mapping the republic of letters of the stanford humanities research center made the headlines of the new yorker. in europe, oxford university in the cultures of knowledge project is building a large repository to make the research on the republic of letters available on an international level. although we cooperate with these consortia to create a digital republic of letters, one of the two projects discussed here, circulation of knowledge and learned practices in the th century dutch republic: a collaboratory around correspondences (ckcc) (see also roorda & bos & van den heuvel ), is different in geographical range and in its analytical depth. first, ckcc does not follow the correspondences of all european scientists, but of those scholars that lived or sojourned extensively in the low countries. the scientific revolution of the th century was driven by discoveries at sea, in observatories, in workshops of artisans, and in libraries.
the dutch republic with its global trade network, its book printing industry, and its relative tolerance to religious differences became a refuge for intellectuals from around europe: an information society avant la lettre. as such, it is an interesting counterpart to traditional studies in which knowledge production primarily is described as a scientific revolution driven by protagonists in the galileo-descartes-newton tradition. a second difference with the above-mentioned projects is in the depth of analysis. instead of focusing on metadata to explore the exchanges of knowledge between scholars in europe and overseas, ckcc focuses on the data, on the letters themselves. it does not only try to answer how knowledge was disseminated in correspondences, but also to establish how new information was picked up, processed, and finally accepted in scholarly communications. what is the impact of the correspondences and how did new scientific topics and scholarly debates around them emerge? to answer those questions ckcc digitized the corpora of published editions and of unpublished letters from the scholars caspar barlaeus ( - ), isaac beeckman ( - ), rené descartes ( - ), hugo grotius ( - ), constantijn huygens ( - ), christiaan huygens ( - ), dirck rembrandtsz van nierop ( - ), johannes swammerdam and anthoni van leeuwenhoek ( - ). software has been developed to analyze this machine-readable corpus of approximately . letters to detect topics and to visualize meaningful patterns in the networks of scholars that discuss them by using a combination of text mining, topic modeling, language technology, and visualization techniques. this analysis works in two different ways: a researcher can query the database with specific keywords, and will get a presentation of all the letters in which these words occur. (project website: ckcc.huygens.knaw.nl.)
apart from the fact that all the queried keywords will light up in the text, the computer generates the most frequent words (in a different color) in relation to them. this way a researcher can test hypotheses about the expected outcomes of her/his queries, but at the same time serendipity has a better chance because of the computer generated terms that might convey unexpected meanings, which have to be put into context by additional research. the complexity of this set is mainly caused by the multiple languages in historical forms that occur in the corpus: latin, french, italian, dutch and english, not to mention the spelling variants. after several years of experimenting we are entering a phase in which the database can be opened up in a web-based collaboratory and more data added. moreover, the data is to be enriched with annotations. we state that the results of these experiments with topic modeling, language detection, and visualization require new knowledge management and archive practices. to that end, we will formulate a new paradigm where annotations will play a key role.
knowledge management of mixed and partial documents
the challenge of ckcc is to study the appropriation of knowledge in an international context and to recognize the development of themes of interest and debates between scholars or in larger networks distributed over space and time. in order to recognize meaningful patterns in the machine-readable corpus, topic modeling is used. it is based on the distribution of words over the text in the documents and is used to find similar words, similar documents or documents similar to arbitrary text. it does this by calculating similarities between words and texts, which constitutes a statistical approach to topics. however, the specific characteristics of the corpus of about .
letters not only complicate the analysis and visualization of meaningful patterns but also require a particular management for pre-processing the datasets for use and re-use in digital humanities research (roorda & bos & van den heuvel, ; wittek & ravenek ). letters often address more than one topic and their rhetorical opening and closing phrases are seldom relevant to their content. for that reason it is not only important to be able to segment the letters to the paragraph level but also to exclude certain phrases from content extraction. not only are the various digitized corpora so different in format and coding that much data curation is needed to make them suitable for analysis, but also the multilinguality and spelling variations in the letters require additional operations. the choice of language is very inconsistent over the corpus as a whole. about % of all letters are written in dutch, latin, and french, the rest in german, greek, italian, and english. for some languages, it is a profitable investment to use additional language resources and tools, but not for all of them. moreover, the letters themselves are not monolingual, but even inside sentences language switches do occur. these th century letters exhibit much spelling variation. for instance, the name of christiaan huygens van zuylichem in the ckcc corpus is spelled in more than different ways. this requires additional language tools and methodologies, such as named entity recognition to improve the recall of queries and of computer generated topics. in the first phase of ckcc, the topic model of latent dirichlet allocation was used. in this model, documents are considered as random mixtures over latent topics, where each topic is characterized by a distribution of words. using lda the computer generated strings of related words; each string was manually labeled by researchers within the team based on their domain expertise. 
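In the usual textbook formulation of latent Dirichlet allocation, "documents as random mixtures over latent topics" corresponds to the generative assumptions below. The symbols are standard notation and are not taken from the ckcc publications; the block is a reminder of the general model, not a description of the project's implementation. Here $\theta_d$ is the topic mixture of document $d$, $\phi_k$ the word distribution of topic $k$, $z_{d,n}$ the topic of the $n$-th word $w_{d,n}$, and $\alpha$, $\beta$ are assumed hyperparameters.

```latex
\begin{aligned}
\theta_d &\sim \mathrm{Dirichlet}(\alpha), &
\phi_k &\sim \mathrm{Dirichlet}(\beta), \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), &
w_{d,n} \mid z_{d,n} &\sim \mathrm{Categorical}(\phi_{z_{d,n}}), \\
p(w \mid d) &= \sum_{k=1}^{K} \phi_{k,w}\,\theta_{d,k}.
\end{aligned}
```

The "strings of related words" that the researchers labeled correspond to the highest-probability entries of each $\phi_k$.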
after a year of developing the topic model, the database was tested by participants of an international workshop mathematical life in the dutch republic ( , december - , lorentz center, leiden, tinyurl.com/lorentz-mat-life). we asked three test groups of in total circa historians of science to explore the possibilities of this tool and to inquire in what ways it could contribute to their historical research. although the researchers acknowledged the potential of the database, they identified two serious problems. they experienced the database as a black box which was hard to control, and their queries often had a limited recall. to overcome these problems a mixed strategy was developed for the next phase of the project. faceted search was improved to enhance the manipulation possibilities of the researchers. experiments were set up with two other topic models next to latent dirichlet allocation (lda), namely latent semantic analysis (lsa) and random indexing (ri), combined with language normalization. (these three methods and their application in ckcc are explained in wittek & ravenek, .) researchers were involved in labeling the terms ( in a subset of letters) that were generated during these experiments to enable an evaluation of the outcomes. it goes beyond the scope of this article to describe these experiments in detail, but best results were achieved with ri in combination with stemming and removal of stop words. for the implementation, a combination of lsa and ri was used in two scenarios. (1) query terms of users are forwarded to lsa and ri models that return a ranked list of keywords that are the most relevant to the topic(s) underlying the query. (2) text fragments specified by users are forwarded to both models and the lsa and ri models return a ranked list of letters that are most relevant to this input.
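The two scenarios can be illustrated with a small random indexing sketch. This is not the ckcc implementation: the vector length, the number of non-zero entries, the stop-word list, the use of whole letters as contexts, and the miniature "letters" are all assumptions made for illustration, and stemming is left out.

```python
import math
import random
from collections import defaultdict

# Toy stop-word list; a real run would use per-language lists and stemming.
STOP_WORDS = {"the", "of", "a", "and", "in", "to", "on"}
DIM = 300      # length of every vector (assumed value)
NONZERO = 6    # number of +1/-1 entries per index vector (assumed value)


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def index_vector(rng):
    """Sparse random vector: NONZERO entries set to +1 or -1, the rest 0."""
    vec = [0.0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def add_into(acc, vec):
    """Add vec to acc in place."""
    for i, x in enumerate(vec):
        acc[i] += x


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train(letters, seed=0):
    """Give each letter a random index vector and sum them into word vectors."""
    rng = random.Random(seed)
    word_vecs = defaultdict(lambda: [0.0] * DIM)
    for text in letters.values():
        letter_index = index_vector(rng)
        for word in tokenize(text):
            add_into(word_vecs[word], letter_index)
    return dict(word_vecs)


def embed(text, word_vecs):
    """Represent a query or a letter as the sum of its known word vectors."""
    vec = [0.0] * DIM
    for word in tokenize(text):
        if word in word_vecs:
            add_into(vec, word_vecs[word])
    return vec


def suggest_keywords(query, word_vecs, top=2):
    """Scenario 1: rank vocabulary words by similarity to the query."""
    q = embed(query, word_vecs)
    asked = set(tokenize(query))
    scored = [(cosine(q, v), w) for w, v in word_vecs.items() if w not in asked]
    return [w for _, w in sorted(scored, key=lambda item: -item[0])[:top]]


def similar_letters(fragment, letters, word_vecs, top=2):
    """Scenario 2: rank letters by similarity to a text fragment."""
    f = embed(fragment, word_vecs)
    scored = [(cosine(f, embed(t, word_vecs)), lid) for lid, t in letters.items()]
    return [lid for _, lid in sorted(scored, key=lambda item: -item[0])[:top]]


# Invented miniature "letters"; not taken from the ckcc corpus.
LETTERS = {
    "letter-1": "the telescope lens and the rings of saturn",
    "letter-2": "a lens in the telescope saturn moon",
    "letter-3": "pendulum clock longitude on sea",
    "letter-4": "the clock and the pendulum",
}
MODEL = train(LETTERS)
print(suggest_keywords("telescope", MODEL))
print(similar_letters("saturn seen in a telescope", LETTERS, MODEL))
```

Words that occur in the same letters end up with similar vectors, so a query term pulls in co-occurring terms without any dictionary. With historical spelling variants, each variant would still be a separate vocabulary entry, which is one reason the text pairs these models with language normalization.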
in short, full text search can be enhanced with query terms suggested by the topic model, and it is possible to query for letters that are similar to a given text fragment. despite its potential usefulness there is still a long way to go given the multilingual situation and the spelling variants. to improve the recall of keywords, other experiments are set up involving named entity recognition. once again, researchers play a role in the evaluation of the automatically generated terms, so that after several iterations of feedback the recall can be improved. thus, enhancing the queries by topic modeling requires annotation, for the presentation of the results to the end user as well as for the experts' feedback to the software.
the need to annotate and new paradigms in archiving notes in humanities research
several studies have pointed to the different nature of data in the humanities. they are often multilingual, historically specific, geographically dispersed, and ambiguous in meaning (acls, ), (borgman, ). humanities scholars are concerned with the problem of meaning: how it is created, communicated, manipulated, and perceived. in order to contextualize data, they require annotation. contextualization by annotation has a long history. famous is the history of the footnote by (grafton, ), but its future is still unclear. “as the footnote reconfigures itself for the digital world, opportunity and danger are waiting side by side for it” (zerby, : ). bader stated in the new york times: “forget footnotes. hyperlink. old media, meet new media”. she claimed that after the eviction of the footnote by book publishers, it would find a new home in the hyper-link construction of the world wide web. “indeed the web has not only revived the footnote, it has spawned a cross-referencing craze that renders the formerly complete media event into a […] wallflower waiting to be courted by the next available annotator” (bader, ).
the statements of zerby and bader reveal two problems. first, we no longer know what function the footnote, which played such an important role in the contextualization of (humanities) research, has in digital environments. secondly, we are in need of new paradigms to preserve the digital counterpart of the footnote, the annotation by man or by machine, for re-use by researchers. the new role of the footnote in the virtual research environment has hardly been explored. an interesting exception is the study that presented the multimedia digital annotation system, madcow (bottoni & al., ), in which a functional taxonomy of “content” annotations (explanation, comment, question, example etc.) as opposed to metadata annotations was formulated based on rhetorical structure theory (rst). such a functional taxonomy can be developed to assign attributes for the contextualization of topics. moreover, the madcow project signaled the problem that users are either limited to navigating through specific browsers with annotation facilities on a restricted set of contents or have to disrupt their navigation to start an annotation application. the madcow tool allowed users to switch between navigating and annotating modalities with the web content. here we try to extend these modalities beyond web content, to include in principle all sorts of documents, and to explore their implications for preservation practices.
annotations and digital scholarship
the practice of annotating is a traditional ingredient of research. how can annotating support modern, digital forms of research? is the digital version of an annotation versatile enough to express new results? how do digital annotations behave in the total workflow of exploration, hypothesizing, testing, publishing, and archiving?
in order to gain practical insights in these matters we have considered two significant projects that truly are representative of digital humanities research of which the first one is ckcc, described above, and the second one is rather a programme than a project: data and tradition. the hebrew bible as a linguistic corpus and as a literary composition (dthb) . this work builds on a multi-decade effort to linguistically markup the complete text of the hebrew bible. the result is a text database where morphemes, words, phrases, and higher-level text objects carry many features. a version of this database has been deposited into the dans archive, where it is stored as a compressed sql dump (talstra, ). this act happened during the workshop: biblical scholarship and humanities computing: data types, text, language and interpretation where an international group of experts reflected on how to bring these resources to better fruition in the digital age. live versions of this database run on researcher’s computers, where they can craft queries of which the results may or may not support specific interpretations of the text. if a linguistic peculiarity shows up in a difficult passage, one can query the database and see whether it is a true exception to the known rules, or just an instance of a regular but rare pattern, to name a typical use case. hundreds of queries have been crafted, run, and studied, all in relation to interpretation issues. both ckcc and dthb have produced curated sources plus analytical results. yet it is far from clear how these results can partake in a process of accumulation and sharing. here lies our motivation to explore the power of annotations. the central statement of this part is that annotations are indeed a powerful carrier of digital scholarship and that they can bridge the gap between past and future research, provided they conform to a generic model that supports preservation and sharing. project data and tradition. 
the hebrew bible as a linguistic corpus and as a literary composition. initiated by eep talstra, from - - to - - . see tinyurl.com/nwo-nl-dthb. more projects in the same programme are listed at tinyurl.com/nwo-nl-talstra. , february - , lorentz center, leiden, tinyurl.com/lorentz-hum-comp.
in order to substantiate this statement, we have to argue that:
• there are frameworks for web-based, digital annotations;
• annotations are versatile: they can express queries, features, keyword and topic assignments;
• annotations can be made portable: they still make sense when their targets move or change;
• annotations must and can be managed with their metadata, provenance, and types;
• annotations can “drive” end user applications.
of course, we cannot rigorously prove these assertions. we will draw on our own experiences in building (demo) applications that are driven by queries and features as annotations in the dthb case, and by topics and keywords as annotations in the ckcc case.
open annotation collaboration
the realization that annotations are important carriers of scholarship, and the fact that in practice annotations tend to become locked up in the systems used to create them, has led to several attempts to standardize annotations and turn them into web resources. two of those attempts, the annotation ontology (ciccarese, ; see tinyurl.com/annot-ont) and the open annotation model (henceforth oam) (sanderson & van de sompel, ; see openannotation.org), are currently under consideration of the w3c open annotation community group (see tinyurl.com/w -annot) with the aim of reconciling the two into a common, rdf-based specification (rdf: resource description framework, the language of the semantic web, also known as linked data; see linkeddata.org). the guidelines in (sanderson & van de sompel, ) are particularly concise and revealing. to summarize even more: the oam focuses on the basic structure of an annotation: a body is taken to comment on one or more targets, and the annotation binds them together. annotations, body, and targets are all addressable as web resources. they can all have separate metadata, including authorship, but the metadata is not part of the model. the model is agnostic to the specific protocols, platforms, and applications with one exception: everything is geared to the architecture of the web with its http protocol. the implicit consequence is that oam-annotations can be expressed as rdf and become part of the semantic web. so far, the guidelines reveal that very important goals are being achieved: annotations can be shared easily across applications, platforms, and institutions. they can be discovered, filtered by the metadata they are linked to, and organized by the resources they target, and moved around and aggregated by discovery services. yet, the guidelines also point to challenges:
(1) real annotations need to target fragments of resources, but how can they be specified in interoperable ways?
(2) resources tend to move and change, so how are the annotations that link to them, either by body or by target, to be maintained?
(3) the basic model is bare, and lots of information about annotations has to be expressed in ways not prescribed by oam, so how much interoperability can be actually achieved?
from the perspective of a research archive, which preserves resources past their active lifetime in an encapsulated form, in order to revive them when somebody is interested in them, exactly these two issues of addressing and metadata are of utmost importance. in our view (1)+(2) are fundamental issues that require additional concepts. we address them in section portable annotations. as to (3), there is a general tendency in archives, repositories, and cultural heritage institutions to conform their metadata to the ontologies that are being designed on the semantic web, not only for the metadata profiles, but also for the actual values that metadata fields may take (gradmann, ).
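The basic structure summarized above (a body, one or more targets, an annotation that binds them, and metadata kept outside the model) can be sketched in a few lines. This is an illustration of the shape only: it is not RDF and does not use the oam vocabulary, and every URI, field name, and author below is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotation: a body commenting on one or more targets."""
    uri: str        # the annotation is itself an addressable resource
    body: str       # URI of the body resource
    targets: tuple  # URIs of the target resources


# All URIs and names are invented (example.org is reserved for examples).
ANNOTATIONS = [
    Annotation("http://example.org/anno/1",
               "http://example.org/body/comment-1",
               ("http://example.org/letter/12",)),
    Annotation("http://example.org/anno/2",
               "http://example.org/body/comment-2",
               ("http://example.org/letter/12", "http://example.org/letter/40")),
]

# Metadata lives outside the annotation, keyed by the URI it describes.
METADATA = {
    "http://example.org/anno/1": {"author": "researcher-a"},
    "http://example.org/anno/2": {"author": "researcher-b"},
}


def annotations_on(target, annotations):
    """Organize by resource: every annotation that targets `target`."""
    return [a for a in annotations if target in a.targets]


def annotations_by(author, annotations, metadata):
    """Filter by linked metadata rather than by anything inside the model."""
    return [a for a in annotations
            if metadata.get(a.uri, {}).get("author") == author]


print([a.uri for a in annotations_on("http://example.org/letter/12", ANNOTATIONS)])
print([a.uri for a in annotations_by("researcher-b", ANNOTATIONS, METADATA)])
```

Because targets here are whole-resource URIs, the sketch sidesteps challenge (1): addressing a fragment of a letter would need an additional, agreed-upon way of writing the address.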
oam is very well poised to take advantage of these developments, since it is itself defined in semantic web terms.
http: hypertext transfer protocol. defined here: tinyurl.com/ietf-http.
queries, features, topics and keywords as annotations
as discussed above, the results of ckcc and dthb are predominantly queries and features (dthb) and topics and keywords (ckcc). here we explain how we translated these items all into annotations. we subsequently wrote two web applications that present these annotations next to the resources in one interface. qfa (queries/features as annotations) (figure ) is written for dthb, and tka (topics/keywords as annotations) (figure ) is written for ckcc material. the intention was to explore if one could build usable interfaces that are driven by annotations, and with limited effort. to this end we developed two end-user applications that directly operate on sets of annotations using the abstract model, and connect them with data sources that they are about. we assume that both data sources and annotations have been previously imported into relational databases. (see further, portable annotations below).
queries as annotations
queries are active, dynamic forays into landscapes of data. annotations are passive, static comments on fragments of data. what do they have in common? one might expect that we are preserving queries with the aim to be able to run the query over and over again for the indefinite future. or do we? it would require that we remain familiar with that version of the query language, and with the corresponding version of the database system forever and ever. it will become increasingly difficult to compare those query results with later ones because the modern query will not run on the old system and vice versa. the matter is not academic. in this particular case, the queries are expressed in mql, which is an implemented version of ql, defined by (doedens, ) as a query language specifically geared to text databases .
although the implementation, emdros (petersen, ; see emdros.org), is open source, well documented, and a powerful solution for text databases, it is definitely not a mainstream application, and its life span is hard to guess. for preserving the results of scholarship, there is a better option. we can select the important queries, those that have been used to obtain new interpretations that have been published in journals. the query instruction is then the body of an annotation, and the query results are the (many) targets of that same annotation. annotations will be linked to metadata specifying the related research problem, the author of the query, and the moment of its last run. that will give the future user a good picture of past research. in addition, in current research users can stumble upon query results as targets of annotations, so that these annotations lead them from passages to queries, exactly in the opposite direction that one usually follows with queries. it is the direction of serendipity.
application: tinyurl.com/demo-taa, wiki: tinyurl.com/wiki-taa. application: tinyurl.com/demo-qaa, wiki: tinyurl.com/wiki-qaa.
the acronym ql may best be read as: quest-like query language, and mql stands for mini ql. appendix of (doedens, ) contains a historical account.
figure . screenshot of queries/features as annotations
figure . screenshot of topics/keywords as annotations
features as annotations
in the dthb case features are linguistic properties of the form key=value that apply to text objects of nearly every granularity, from morpheme through part-of-speech up to book. these features are the product of many years of manual labor, combined with automatic processing. they have been checked and revised. they constitute a treasure trove. they live in the same implementation of text databases, emdros, as the queries above. by transforming the features into annotations, we potentially unlock the value that is hidden here.
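Queries and key=value features fit the same annotation shape, which the following sketch makes concrete. It is an illustration only: the field names, the target addresses, and the metadata values are invented placeholders, and the query body is a stand-in string rather than real mql. The reverse index shows the "direction of serendipity", from a passage back to the queries that once selected it.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str      # "query" or "feature"
    body: str      # the query instruction, or a key=value string
    targets: list  # symbolic addresses of the text objects concerned
    metadata: dict = field(default_factory=dict)


# All bodies, addresses, and metadata values below are invented placeholders.
ANNOTATIONS = [
    Annotation(
        kind="query",
        body="<text of a published mql query>",
        targets=["passage-a", "passage-b"],
        metadata={"research_problem": "<problem>", "author": "<name>",
                  "last_run": "<date>"},
    ),
    Annotation(kind="feature", body="gender=masculine",
               targets=["word-1", "word-3"]),
    Annotation(kind="feature", body="gender=feminine", targets=["word-2"]),
]


def index_by_target(annotations):
    """Map each target address to the annotations that point at it."""
    index = defaultdict(list)
    for annotation in annotations:
        for target in annotation.targets:
            index[target].append(annotation)
    return index


INDEX = index_by_target(ANNOTATIONS)

# From a passage back to the queries that once selected it.
print([a.body for a in INDEX["passage-a"] if a.kind == "query"])
# From a word to the feature values it carries.
print([a.body for a in INDEX["word-2"] if a.kind == "feature"])
```

Nothing in this structure needs the query engine to be available: the preserved result set is readable even when the query can no longer be run.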
in this case, we simply chose as bodies strings of the form key=value. the targets are the objects that carry that feature value. in our demo application qfa we give the user key=value combinations at the word level to play with. as an example, a user can tell the application to show all verbs with tense=imperfect in blue and all verbs with tense=perfect in red. this helps to interpret narrative structures, even if you do not know hebrew, although being a linguist helps. again, this is a case of annotations with (very) many targets: the annotation with body gender=masculine has targets! the number of targets of gender=feminine is left as an exercise to the reader.
topics and keywords as annotations
extracting topics from texts is as useful as it is challenging. topics are semantic entities that may not have easily identifiable surface forms, so it is impossible to detect them by straightforward search. topics live at an abstraction level that does not care about language differences, let alone spelling variations. therefore, if one has a corpus with thousands of letters in several historical languages, and wants to know what they are about without actually reading them all, a good topic assignment is a very valuable resource indeed. there are several ways to tackle the problem of topic detection, and they vary in the quality of what is detected, the cost of detection, and the ratio between manual work and automatic work. several of these methods have been (and are being) tested in ckcc as explained before. it is not the purpose of this paper to go into topic modeling in depth. here we are concerned with gathering results, even intermediate results, and making them re-usable for subsequent attempts to uncover the semantic contents of the corpora involved. for our demo application, we gathered three kinds of intermediate results: (1) automatic keyword assignments, (2) manual keyword assignments, (3) automatic topic assignments detected by a specific algorithm.
we used the complete corpus of letters from and to the dutch th century scholar christiaan huygens ( letters). the mapping from keyword assignments to annotations is simple: bodies are the keywords; targets are the letters to which the keywords are assigned. there is no fragment addressing here. topics reveal two complications when translating them into annotations. (1) a topic is not a single word but a complex object in itself. in this context, it is a collection of words that span a semantic field. moreover, each word contributing to a topic does so with a certain relative weight. (2) when a topic is assigned to a letter, the assignment has a certain confidence, expressed as a number. this could be modeled as an extra annotation on top of the annotation that merely links a topic to a letter: the extra annotation has the confidence as body and the other annotation as target. in our application, however, we have opted to include topic and confidence in one body, as distinct fields. there are even more options, for which we refer to the wiki about tka.

portable annotations
beyond rdf
so our annotations are not coded in rdf, they have no uris, and they do not conform to the linked-data aspects of oam. there are good reasons for this: neither the sources nor the annotations that result from ckcc and dthb are currently web resources. nevertheless, there is a sense in which we conform to oam: the annotations reside in a different database than the sources do, and the link between annotations and their targets is strictly symbolic, not dependent on database modeling and technology (no foreign key constraints). one could say that we enforce modularity between sources and annotations, in the sense that annotations can be ported from one source to a comparable source.
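the mapping just described (bodies of the form key=value, keywords, or topic plus confidence; targets that are purely symbolic anchors; sources and annotations kept in separate stores) can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the code of qfa or tka: the class name, the function names, the anchor syntax, and all sample data (including the keyword and the topic words) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    body: dict      # what is said, e.g. {"feature": "tense=perfect"}
    targets: list   # symbolic anchors (plain strings), never foreign keys
    metadata: dict = field(default_factory=dict)


# Source store: anchor -> fragment. Anchors and contents are made-up placeholders.
SOURCES = {
    "gen 1:1 w1": "word-a",
    "gen 1:1 w2": "word-b",
    "letter-001": "text of a letter",
}

# Annotation store, kept apart from the sources (hypothetical sample data).
ANNOTATIONS = [
    Annotation({"feature": "tense=perfect"}, ["gen 1:1 w2"]),
    Annotation({"feature": "tense=imperfect"}, ["gen 1:1 w1"]),
    Annotation({"keyword": "optics"}, ["letter-001"]),
    # Topic and confidence live in one body, as distinct fields.
    Annotation({"topic": {"lens": 0.6, "telescope": 0.4}, "confidence": 0.8},
               ["letter-001"]),
]


def colour_map(annotations, colours):
    """Map each targeted anchor to the colour chosen for its feature value."""
    result = {}
    for ann in annotations:
        colour = colours.get(ann.body.get("feature"))
        if colour is not None:
            for anchor in ann.targets:
                result[anchor] = colour
    return result


def topics_for(anchor, annotations, min_confidence=0.0):
    """Return (topic, confidence) pairs assigned to an anchor above a threshold."""
    return [
        (ann.body["topic"], ann.body["confidence"])
        for ann in annotations
        if "topic" in ann.body
        and anchor in ann.targets
        and ann.body["confidence"] >= min_confidence
    ]


def dangling(annotations, sources):
    """Anchors that do not resolve in the source store; no constraint forbids them."""
    return sorted({a for ann in annotations for a in ann.targets if a not in sources})
```

calling colour_map(ANNOTATIONS, {"tense=imperfect": "blue", "tense=perfect": "red"}) yields the highlighting instruction for the example in the text. because targets are plain strings, nothing in the annotation store breaks when a source is replaced; the dangling function merely reports anchors that no longer resolve.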
from here it is not a big step to completely conform to oam: (1) import real rdf annotations to local database tables from where they drive local applications; (2) if a local application produces annotations that must be shared: export them as rdf. in both cases, local addresses must be translated into absolute uris.

footnotes: tinyurl.com/wiki-taa-topics. uri: uniform resource identifier, which can be dereferenced by means of the http protocol. the definition of uri is at tinyurl.com/ietf-uri.

usefulness of porting annotations
now we arrive at a tempting picture: annotations that are portable. many sources are available in several versions, in many copies, in different formats, in multiple languages, and in diverse media. many annotations on a resource still make sense if one explores other variants of it. here are some examples:
(1) (from dthb): there are various authoritative versions of the hebrew text. we have compared the biblia hebraica stuttgartensia (bhs) with the westminster leningrad codex (wlc). most of the differences are different word divisions and different diacritical marks. that means that the vast majority of feature and query annotations based on the bhs also apply to the wlc. moreover, there is a set of features, by a different enterprise (groves & lowery ), based on the wlc, which can also be applied to the bhs. even the mismatches are interesting!
(2) (from dthb): there are word-by-word translations of the hebrew bible into english. for non-hebrew-readers, it might be interesting to see which words in such a translation derive from a masculine and which from a feminine word. such an observation can be easily achieved if we can port the feature annotations from the hebrew source to such a translation.
(3) (from ckcc): the manual topic assignments are a valuable resource. new attempts at topic modeling could make good use of that, for training or testing purposes. in those cases, it would be convenient to retrieve such annotations from an archive and then to be able to reapply them on new incarnations of the sources.

footnotes: tinyurl.com/bhs-browse. tinyurl.com/tanach-tech. the westminster hebrew morphology: tinyurl.com/groves-whm.

uris, anchors, frbr
oam requires that annotations point to their bodies and targets in the linked data way: by proper http uris. if the resources in question are stable and being maintained by libraries, archives and cultural heritage institutions, it becomes possible to harvest many sorts of annotations around the same sources. this is an organizing principle that is quite new and from which huge benefits for data mining and visualization are to be expected. in practice, however, there are several scenarios in which (fragments) of resources are not addressed in a stable way. this happens for instance when resources go off-line into an archive. in case we want to restore those resources later on, the means of addressing them from the outside may have changed. moreover, there might not be a unique, canonical restored incarnation of that resource. for that reason one needs anchors to resources that enable the re-use of annotations that have been archived in the past. the solution adopted in qfa and in tka is to work with localized addresses. these are essentially relative addresses that point to (fragments) of local resources that are part of a local corpus.

there is an ontological consideration involved here. the model of functional requirements for bibliographic records (ifla, - ) makes a distinction between work, expression, manifestation, and item. work is a distinct intellectual or artistic creation. as such, it is a non-physical entity. expression, manifestation, and item point to increasing levels of concreteness, an item being a concrete entity in the physical world. wikipedia gives a nice example from music, see table . the full refinement of these four frbr concepts is probably not needed for our purposes.
yet, a distinction between the work, which exists in an ideal, conceptual domain, and its incarnations, which exist in physical reality, is too important to ignore. it bears on the ways by which identifiers to works and incarnations can be kept stable. identifiers to works identify within conceptual domains, but they have no function in physically locating works. these identifiers are naturally free of those factors that make a typical hyperlink such a flaky thing. so whenever annotations are about aspects of a resource that are at the work level, they had better target those resources by means of work identifiers.

table . frbr's view of the world (footnote: tinyurl.com/wikip-frbr)
frbr concept | example | characteristic
work | beethoven's ninth symphony | distinct creation
expression | musical score | specific form
manifestation | recording by the london philharmonic in | physical embodiment
item | record disk | concrete entity

moreover, the distinction between work and incarnation also applies to fragments of works. most subdivisions, such as volumes, chapters, and verses in resources do exist at the work level, although some fragments are typically products of the incarnation level, e.g. lines and pages. we can now define our anchors as identifiers at the work level, for resources and their fragments. this is in fact the nature of our localized addresses. quite often, the sources themselves and their fragments have anchors that are recognized by whoever is involved with them. take the books, chapters, and verses in the bible, for example. even where there are no universally recognized anchors, it is easier to translate between rival anchoring schemes than to maintain and multiply stable identifiers at the incarnation level. lurking below the surface there is the question: to what extent are differing versions incarnations of the same work? can we keep fragment identifiers stable under versioning?
this is really a complex issue, and we plan to devote a completely new demo application to it in a new use case (see the wiki on portable annotations, tinyurl.com/wiki-pa; beware that this is work in progress).

statement
not all variance between sources can be productively addressed with time-based versioning. there are deeper reasons for variation and deeper reasons for identification than sequences of surface forms. if we ignore those reasons, and if we omit to base our identifiers on them, we will not have truly portable annotations.

annotation management: metadata, provenance and types
the role of metadata for annotations is (at least) twofold. first, metadata make it possible to assess the quality, significance, and meaning of an annotation. quality judgments can be made based on the provenance: who made the annotation, for which project, when? significance can be gleaned from a list of publications that are associated with that (set of) annotations. meaning can be retrieved from pointers to reference materials. as oam is firmly integrated in the semantic web effort, there are no conceptual limitations on linking metadata to annotations. the second role of metadata is to enable annotation-driven applications to decide how best to filter and display the annotations. here the typology of annotations comes in. we have presented four not too ordinary types of annotation, each with its own requirements for display. the unlimited linking of metadata to annotations is problematic for generic applications. how do applications recognize what metadata is available and by which metadata they should let themselves be controlled? here we find ourselves on the middle ground between the rigor of what is within the limits of oam and the polymorphism of what lies outside it. for dedicated applications, there is no problem: you can tell them where to look, but fully generic annotation-driven applications will have difficulties here.
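to make the idea of porting via work-level anchors more concrete, here is a minimal python sketch. it assumes that annotations are (body, anchors) pairs, that each incarnation of a work is a mapping from anchors to fragments, and that rival anchoring schemes can be bridged by a translation function. the function names, the anchor syntax, and the sample data are hypothetical; they do not reproduce the bhs/wlc comparison itself.

```python
def port(annotations, new_source, translate=lambda anchor: anchor):
    """Re-apply annotations to another incarnation of the same work.

    annotations: list of (body, [anchors]) pairs
    new_source:  dict mapping anchors of the target incarnation to fragments
    translate:   maps an anchor from the original scheme to the target scheme
    Returns (ported, mismatches); mismatches are kept, since they are interesting.
    """
    ported, mismatches = [], []
    for body, anchors in annotations:
        kept = []
        for anchor in anchors:
            new_anchor = translate(anchor)
            if new_anchor in new_source:
                kept.append(new_anchor)
            else:
                mismatches.append((body, anchor))
        if kept:
            ported.append((body, kept))
    return ported, mismatches


def to_dotted(anchor):
    """Translate a hypothetical 'book chapter:verse word' anchor to dotted form."""
    book, ref, word = anchor.split()
    chapter, verse = ref.split(":")
    return f"{book}.{chapter}.{verse}.{word}"


# Hypothetical second incarnation: a different word division leaves no third word.
VERSION_B = {"gen.1.1.w1": "word-a", "gen.1.1.w2": "word-b"}

# Hypothetical feature annotations made on the first incarnation.
FEATURES = [
    ("gender=masculine", ["gen 1:1 w1", "gen 1:1 w3"]),
    ("gender=feminine", ["gen 1:1 w2"]),
]

ported, mismatches = port(FEATURES, VERSION_B, to_dotted)
```

in this sketch all the version-specific knowledge sits in the translate function, so switching to another anchoring scheme means swapping one function while the annotations themselves stay untouched. the mismatches are returned instead of discarded, in line with the remark that even the mismatches are interesting.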
annotation-driven applications how difficult is it to develop an annotation-driven application that deals with significant amounts of data and annotations, and that presents a usable interface to the end user? design the demo applications qfa and tka are driven by a database containing the source materials and a separate database with the annotations. there is no mingling or tight coupling between the sources and the annotations. the only links are the anchors: symbolic expressions in the annotation targets that refer to fragments of the sources. functionality both applications display the source material in a broad column, and the annotations in narrower columns next to the sources. the targets of the annotations can be highlighted in the sources, and the user has some control over the highlighting, depending on the type of annotation. we invite the reader to explore these applications to get a more detailed picture. in short, these applications visualize the annotations and the sources in basic, not too crude ways, adapted to the different kinds of annotation. implementation in order to rapidly implement our ideas concerning annotations and sources we needed a simple but effective framework on which we could build data-driven web applications. we found it in the shape of web py (di pierro, - ). we needed very little code on top of the framework, just a few hundred lines of python and javascript each. deployment of these apps is completely web-based, and only takes seconds. most work went into the data preparation stage, where we used perl and shell scripts to compile data from various origins into sql-imports for sources and annotations. these scripts were also in the few hundred lines range. missing link what these demos still lack is full rdf capability. once these sources are truly web resources, we expect that it is easy to make an import/export facility to turn database annotations into real rdf annotations. 
how to translate our fragment anchors into http uris is still an open question. finally, work is to be done in order to get the best of the worlds of relational databases and of linked data, see e.g. (baron & di pierro, ). conclusion we have investigated the feasibility of using annotations as portable carriers for diverse results of scholarship in the humanities. we found that annotations are versatile enough to carry the products of digital scholarship such as query results, features, topics, and keywords. the open annotation model represents annotations as web resources, which makes them easy to share beyond the systems in which they originated. annotations can be managed by unlimited association of metadata. the development of annotation-driven applications is doable: the focus remains on the data, and does not shift to the software. yet, the web-based model for annotations is not fully compatible with the process of archiving and re-use. this would greatly be improved if we could make annotations more portable across variant resources. and that, in turn, boils down to using anchors for targeting resources and their fragments. anchors are identifiers at the work-level in the frbr sense. let us briefly consider what this outcome means for digital humanities in general. in the non-digital ages before us, scholars relied on harmonization efforts such as standard editions of historical texts, because the source materials were simply too complex to deal with in their raw form. it had the character of projecting the data on a space of one dimension. now there is a growing pressure to investigate (again) the raw data, find new perspectives, and preserve the connections between interpretations and data in a much more transparent way. this shift in research paradigm can only succeed if it is matched by a shift in archiving methods. annotations have the potential to unlock data that is behind the barriers of application interfaces and data models. 
they facilitate deep linking to fragments. they can be instrumental in identifying interesting slices of the data that could not be accessed as such before. this is particularly useful in disciplines whose business it is to make distinctions between objective data and many layers of interpretation, where those interpretations are based on the data themselves in combination with any amount of data from the context. the fabric of objects and meanings that humanities research is creating must be taken care of in such a way that it remains navigable from all imaginable entry points in all conceivable directions. we have shown that annotations are up to the task. their way into the web of linked open data is being paved. if, in that process, they can play nice with the distinction between concept and realization, they constitute a new archiving paradigm.

acknowledgments
walter ravenek (huygens ing) for helpful comments on topic modeling; eko indarto (dans) for helping to develop a first version of qfa in very short time; andrea scharnhorst (dans) for granting additional time for research; joris van zundert (huygens ing) for facilitating an inspiring interedition bootcamp (tinyurl.com/intered-lvn) which set me (dirk) on the track of rapid development.

references
acls ( ). our cultural commonwealth: the report of the american council of learned societies’ commission on cyberinfrastructure for humanities and social sciences. retrieved - - from http://www.acls.org/cyberinfrastructure/ourculturalcommonwealth.pdf
bader, j.l. ( ). forget footnotes. hyperlink. the new york times, sunday july section week in review.
baron, c., di pierro, m. ( ). publishing linked data using web py. school of computing, depaul university of chicago. retrieved - - from tinyurl.com/web py-ld-article (pdf).
bottoni, p., civica, r., levialdi, s., orso, l., panizzi, e., trinchese, r. ( ). madcow: a multimedia digital annotation system. in m.f. costabile (ed.), proc.
working conference on advanced visual interfaces (avi ) (pp. - ). new york: acm press.
borgman, c. ( ). scholarship in the digital age. information, infrastructure and the internet, cambridge (mass.), london: the mit press.
ciccarese, p., ocana, m., castro, l.j.g., das, s., clark, t. ( ). an open annotation ontology for science on web . . j. biomed semantics , (suppl ):s ( may ).
di pierro, m. ( - ). web py. full stack web framework, th edition. online book. retrieved - - from web py.com.
doedens, c.f.j. ( ). text databases. one database model and several retrieval languages. language and computers, number . editions rodopi amsterdam. amsterdam and atlanta, ga. isbn: - - - .
gradmann, s. ( ). knowledge = information in context: on the importance of semantic contextualisation in europeana. white paper. retrieved - - from tinyurl.com/europeana-gradmann (pdf).
grafton, a. ( ). the footnote. a curious history. cambridge (mass.): harvard university press.
groves, a., lowery, k. (eds.). ( ). the westminster hebrew bible morphology database. philadelphia: westminster hebrew institute.
ifla (international federation of library associations and institutions) ( - ). functional requirements for bibliographic records. final report. retrieved - - from tinyurl.com/ifla-frbr (pdf).
petersen, u. ( ). emdros - a text database engine for analyzed or annotated text. proceedings of coling . – . retrieved - - from tinyurl.com/emdros-coling (pdf).
roorda, d., bos, e-j., van den heuvel, c. ( ). letters, ideas and information technology. using digital corpora of letters to disclose the circulation of knowledge in the th century. in digital humanities conference abstracts king’s college london - july (pp. - ).
sanderson, r., van de sompel, h. (eds.). ( ). open annotation: beta data model guide. web document. retrieved - - from openannotation.org.
talstra, e., sikkel, c., glanz, o., oosting, r., dyk, j.w. ( ). text database of the hebrew bible.
dataset available from data archiving and networked services after permission of the depositor through retrieved - - from tinyurl.com/dans-wivu.
wittek, p., ravenek, w. ( ). supporting the exploration of a corpus of th century scholarly correspondences by topic modeling. in b. maegaard (ed.), proceedings of supporting digital humanities : answering the unaskable. copenhagen.
zerby, c. ( ). the devil's details: a history of footnotes. new york: touchstone.

journal of e-media studies volume , issue , dartmouth college
can digital humanities mean transformative critique?
alexis lothian and amanda phillips

we need new hybrid practitioners: artist-theorists, programming humanists, activist-scholars; theoretical archivists, critical race coders. we need new forms of graduate and undergraduate education that hone both critical and digital literacies. we have to shake ourselves out of our small, field-based boxes so that we might take seriously the possibility that our own knowledge practices are normalized, modular, and black boxed in much the same way as the code we study in our work. ––tara mcpherson, “why is the digital humanities so white?” ( )

we were invited to this issue of the journal of e-media studies because we gave something a name. we are two participants in a group of early-career queer, feminist, and ethnic studies scholars of media, literature, and culture who are interested in digital scholarship, who kept meeting at conferences and wondering why the critical frameworks and politicized histories of our activist inquiry were so rarely part of the conversations we were having about scholarly technology.
the series of academic conference events that led us to converge as a collective have by now been hashed and rehashed many times: there was an idea at thatcamp socal in response to anxiety at mla ; then a small but productive panel at asa (american studies association) ; some blog posts on hastac (humanities, arts, sciences, and technology     advanced collaboratory) and elsewhere, a tumblr; and the birth of a hashtag that finally caught the attention of the digital humanities (dh) twittersphere. somebody made a google doc, some bodies attended a panel, and some buddies were in the collective hoping that people would take over the hashtag and submit to the tumblr and blog about why #transformdh was cute but vague and ultimately misguided. but, ultimately, the project’s goal was to put a name to a feeling and see who else was thinking the same thing. that there are now names out there, records of attendance, email trails, and other evidence for the future tenure files that might take such endeavors into account, was a side effect that has taught us much about the power of naming–– you might even say of branding––when you want to get an idea into circulation. what was the idea? in short, #transformdh is an aggregated statement of the obvious. first of all: the emergent methods and practices we call digital humanities are not only for traditional work. years of dh criticism might point to the banality of this sentiment; the changing shapes of communication and technology alter the terms of scholarship, and keeping afloat in the coming century will require mastery over new tools and methods. the revolution of dh is in full swing, with the force of multicampus institutions, internet portals, and federal funding at its back. the histories that dh as a discipline traces back through practices of humanities computing have indeed done transformational work on the structures of scholarship and the bureaucracies that shape our careers. 
yet the bright lights and marching bands of the so-called big tent outshine less marketable histories of engagement with technology that have emerged from standpoints that critique the privileging of certain gendered, racialized, classed, able-bodied, western-centric productions of knowledge. in a recent blog entry, filmmaker, feminist, and academic alex juhasz describes why she does not affiliate herself wholeheartedly with digital humanities:     the “field” does the amazing potentially radicalizing work of asking humanities professors (and students) to take account for their audiences, commitments, forms, and the uses of their work. but this was always there to take account of, being obscured by the transparent protocols of publishing and pedagogy that have been revealed because of the force of the digital. however, this turn is occurring, for the most part, as if plenty of fields, and professors, and artists, and students, and humanists hadn’t been already been doing this for years (and therefore without turning to these necessarily radical traditions of political scholars, theoretical artists, and humanities activists). #transformdh was our attempt to turn the digital humanities toward these radical traditions, as well as toward the bodies of critical work in new media studies by wendy chun, lisa nakamura, anna everett, tara mcpherson, and many others, that unpack the politics inherent in the force of the digital, the powers that shape the hardware and software that in turn shape our scholarly work. we wanted to think about the institutions that were forming in this ever more amorphous thing called digital humanities. we didn’t want the ways of engaging knowledge that were important to us to be left out. 
we felt it would be too easy to say that we were doing something other than dh, whether that be new media studies or critical cultural studies with a focus on the digital; instead, we wanted to bring what juhasz calls “necessarily radical traditions,” which have nourished us, into the dh field in which we also felt at home. if humanities scholars in critical media and cultural studies, queer studies, ethnic studies, disability studies, and related areas are doing work in and with the digital, we should lay claim to our place within digital humanities. we should explicitly occupy that space and assert––as mcpherson and jamie “skye” bianco,     among others, have recently done––that the honorable history of humanities computing is not the only one that matters for whatever it is we mean when we talk about the field. inclusivity is important to dh practitioners in the humanities computing tradition. we share that goal, but it is not the heart of our project. in “whose revolution? towards a more equitable digital humanities,” matthew k. gold’s mla talk reflecting on his book debates in the digital humanities, gold raises the question of which hierarchies, uneven distributions of labor, and value systems dh might preserve even as it seeks to change the way academic work is done. his important discussion focuses on the vital and often overlooked power of institutional resources to shape what scholarly work gets done. yet the metaphor that comes after his set of concrete and useful suggestions for diversifying dh is interesting: “as any software engineer can tell you, the more eyes you have on a problem, the more likely you are to find and fix bugs in the system.” if the system of dh were to run smoothly, gold implies, it would not perpetuate hierarchies or inequalities. 
gender, race, sexuality, ability, and class––and the marked bodies on which they become most visible––can be content that would fit within the forms already being established and funded for digital work: the on-campus centers, the annotated archives. but what we know about the academy, from its constitutive imbrications with nationalism and empire to the structures of race and gender that still shape its labor practices, suggests otherwise. content and form are not so separable; truly accounting for one will unavoidably change the other. so instead of smoothing out the bugs in the digital academy, we wonder how digital practices and projects might participate in more radical processes of transformation––might rattle the poles of the big tent rather than slip seamlessly into it. to that end, we are interested in digital     scholarship that takes aim at the more deeply rooted traditions of the academy: its commitment to the works of white men, living and dead; its overvaluation of western and colonial perspectives on (and in) culture; its reproduction of heteropatriarchal generational structures. perhaps we should inhabit, rather than eradicate, the status of bugs––even of viruses—in the system. perhaps there are different systems and anti-systems to be found: diy projects, projects that don’t only belong to the academy, projects that still matter even if they aren’t funded, even if they fail. what would digital scholarship and the humanities disciplines be like if they centered around processes and possibilities of social and cultural transformation as well as institutional preservation? if they centered around questions of labor, race, gender, and justice at personal, local, and global scales? if their practitioners considered not only how the academy might reach out to underserved communities, but also how the kinds of knowledge production nurtured elsewhere could transform the academy itself? these questions are not hypothetical. 
these digital humanities already exist. here we offer a curated list of projects, people, and collaborations that suggest the possibilities of a transformative digital humanities: one where neither the digital nor the humanities will be terms taken for granted. the transformative digital humanities will not be found only among the members of our ad hoc collective. nor will it be found only where the funding is, where the easily recognized and intensively supported dh projects are. we’ve gathered a selection of projects, ranging from institutionally sponsored archives of less-than-traditional materials to networks that purposefully have no direct connection to the academy as such. none belongs to a core member of our     collective, because we are becoming a little alarmed at the publicity our act of naming has begun to generate. all the projects put the questions of decades of feminist, queer, and critical race theory (all of which share significant temporal nodes with the politicized computing movements at the heart of much dh philosophy) at the center of their work, leveraging the affordances and methodologies for social justice. here one can find collaboration pushed to collectivity, interdisciplinarity that reaches outside of the ivory tower, and art that builds its own theory. these are only beginnings, suggestions; you may disagree that these are projects worth gathering, or you may wish to suggest other projects for consideration. your feedback, critiques, and additions will help us to build a transformative digital humanities together. curation transformative archives archives may be the most legible form of digital humanities production, as digital tools have been developed to preserve, gather, and share historical documents. digital humanities practitioners have increasingly been theorizing the power structures and silences of the archive, as well as drawing on materials less often granted the legitimacy of academic preservation.     
adeline koh: digitizing “chinese englishmen”: representations of race and empire in the nineteenth century adeline koh’s online digitizing “chinese englishmen” project is an early step in the direction of decolonizing the archive, offering a forum for collaborative annotation and novel social media intervention on texts that expand the victorian anglophone repertoire beyond its current “narrow geographical boundaries.” koh’s project carves out a space for the postcolonial archive: the website is meant to be both a “decentralized” and a “postcolonial” archive. by a “decentralized” archive, it refers to one which provides modes for democratic access and exchange. on first glance, the term “postcolonial” nineteenth century archive may appear anachronistic, as no colonies were in fact “postcolonial” in this time period. my use of the term “postcolonial,” however, derives more from the type of postcolonial     literary criticism and postcolonial theory commonly associated with edward said and the subaltern studies collective than with movements towards decolonization before and after the second world war. in this definition, a “postcolonial” archive is one which examines and questions the creation of imperialist ideology within the structure of the archive. additionally, it aims to assemble a previously unrepresented collection of subaltern artifacts. (“addressing archival silence on th century colonialism – part ”) straits chinese magazine, the project’s source text, offers readers a complicated, alternative view of what it meant to be both an englishman and a chinese gentleman in the th century. koh’s archive makes no effort to resolve or simplify the complicated identity practices of the chinese englishmen, hoping instead to offer a platform to evaluate them without the colonial impulse to reduce these victorians to paragons of false consciousness or imitations of “real” british gentlemanliness. 
digitizing “chinese englishmen” expands the archive beyond colonial representations of nonwhite peoples in the th century, leveraging the reach of the digital to transform the face of th-century studies.     women who rock: making scenes, building communities at the university of washington women who rock is an oral history archive at the university of washington, built from the ground up on the principles of women of color feminism: collaboration across difference, intersectional critique, and accountability to communities outside the academy. participation in the project provides training for women’s and ethnic studies graduate students in the digital skills that suit their research interests, from web design to video production. headed by michelle habell-pallán, this is one of the few well-established, institutionally supported dh projects that are rooted in critical feminist media theory and praxis. women who rock research project (wwrrp) supports, develops, and circulates cultural production, conversations and scholarship by cultural producers and faculty,     graduate students, and undergraduates across disciplines, both within and outside the university, who examine the politics of gender, race, class, and sexuality generated by popular music. our goal is to generate dialogue and provide a focal point from which to build and strengthen relationships between local musicians and their communities, and educational institutions. (women who rock project: making scenes, building communities) [video by angelica macklin: http://vimeo.com/ ] oral histories such as this are committed to the production of knowledge from below, bringing people and practices who have traditionally been excluded from academic spheres––or simply not taken seriously there––into the frameworks of institutional preservation. 
in the case of women who rock, the preservation of popular music’s communities and histories is also aimed at a transformation of the institutional archive itself, bringing down barriers between the university and the knowledge worlds that lie outside its walls. transformative artistic production definitions of the digital humanities do not often include digital artistic production. but why not? the borders of artistic practice, software design, political activism, and critical knowledge production are porous.     micha cárdenas: transreal politics a queer performance artist currently working toward a phd in the university of southern california (usc)’s interactive media arts and practice program, micha cárdenas uses art, theory, and technology to encourage social justice thinking, which results in a unique brand of art-theory that pushes each of the fields in which it engages. cárdenas develops new software applications, designs and builds electronic gadgets that challenge hegemonic regimes, and infuses each performance with theoretical writing. cárdenas’s collaborative work has resulted in two theoretical texts so far: trans desire/affective cyborgs, coauthored by barbara fornssler and wolfgang shirmacher, and the transreal: political aesthetics of crossing realities, coauthored with zach blas, elle mehrmand, and amy sara carroll. cárdenas’s work takes trans- to its fullest extent, crossing realities, genders, theoretical perspectives, and technical design.     the video featured here, “becoming transreal,” a performance in collaboration with elle mehrmand and chris head, focuses the attention of the digital back on the material body and its entanglement with global capital, reminding us, through the pain of transgender experience braided with a dystopic science fiction narrative, that technology is of concern to bodies (and corporations) most of all. from the video’s description: what if you could become anything? 
what happens after species change surgery becomes a reality? becoming transreal speculates on a future in which the promises of bionanotechnology have become realized, and yet as capitalism has continued to fail, both the interiors of our bodies and the virtual world have become totally commodified. you can become anything, but to finance your whims of identity transformation, the same nanohormones that transform your body are also producing drugs for others. becoming transreal looks at transgender experience through a lens of slipstream science fiction poetry about bio-nano drug piracy. the performance uses motion capture to interface with second life avatars [http://en.wikipedia.org/wiki/second_life] and d stereoscopic imagery to immerse the audience in this transreal world. cárdenas operates in the tradition of mixed-reality performance, which steve benford and gabriella giannachi define broadly as a subset of performance art, including augmented reality and pervasive gaming, that combine “many real, virtual, augmented reality, and augmented virtuality environments into complex hybrid and distributed performance stages” ( ). although many mixed-reality works, such as blast theory’s uncle roy everywhere or entertainment’s     i love bees, focus on direct user participation and mobile technologies, cárdenas invites the audience to enter the world of the performance through indirect means such as audience props and immersive presentation technologies. using large-scale projection equipment and biometric sensors keyed to the performers’ bodies, cárdenas’s transreal performances bridge a physical installation space with the virtual world of second life, (dis)embodying their own content through form. cárdenas creates a performance space and temporality layered with autobiography and speculative fiction, physical bodies and digital avatars. 
[video by micha cárdenas and elle mehrmand, “becoming transreal”: http://vimeo.com/ ] zach blas: queer technologies zach blas’s queer technologies project invites viewers to rethink the role of critical theory by bringing it out of academic language and into the realm of product design. blas’s art reimagines     queer theory as a high-design brand, building objects that we can imagine as desirable accessories for the discerning plugged-in activist, and challenging us to pay attention to the commodification of art and ideas. part manifesto, part news report, part critical essay, queer technologies’ suite of instructional videos takes digital production as both theory and praxis. each video documents a queer weapon of resistance that responds to, yet participates in, the methods of the technological tools of empire. blas’s playful, speculative products ironically reproduce the signifiers of global capital while offering queer possibilities for undermining them, as indicated by the promotional speech embedded in each video: queer technologies is an organization that develops applications for queer technological agency, interventions, and social formation. we use technology to make queer weapons of resistance. these include: transcoder, a queer programming anti-language software development kit; engendering gender changers, a solution to gender adapters’ male/female binary; gay bombs, a technical manual manifesto that outlines a how-to of queer networked activism; and grid, a mapping application that tracks dissemination of queer technologies and maps the battle plans to more thoroughly infect networks of global capital. you can find our products at the disingenuous bar, a center for political support for technical problems, or in various consumer electronics stores, such as best buy, radio shack, and target. 
this sarcastic pr spin calls into question the apple products and slick gadgetry on which media-inclined academics depend; indeed, queer technologies asks us to consider not only the ends to which we apply our digital tools, but also the troubling legacies and potential applications of cutting-edge developments in science and technology. the video “fag face, or how to escape your face” responds to biometric technologies that enlist the face in governmental control systems, whose applications range from commercial digital camera software to surveillance technologies used by local law enforcement. responding to legacies of homophobia and neoliberal governance with deleuze, guattari, and gay pornography, “fag face” offers a new way to think about and produce critical theory.
[video by zach blas, “fag face”: http://vimeo.com/ ]
from the center
scholarship and activism, academy and community, theory and pedagogy are often considered to be separate. by including this project, in which researchers and technology educators work with incarcerated women of color using digital storytelling techniques, we hope to challenge readers to think about what it might mean to allow our ideas about scholarship and political commitment to be transformed from the ground up. digital scholar, poet, and university of california–berkeley graduate student margaret rhee serves as project co-lead and conceptualist. at the hastac conference, rhee spoke of this collaborative activist work as “counterintuitive to the logics and rewards of the academy”––yet absolutely necessary.
as feminists in our new media age, we believe women should be the authors, directors and storytellers of our own lives. we re-imagine how new media technologies can provide a vital intervention for all women, even those whose voices are subsumed in larger hegemonic discourse.
oftentimes, incarcerated women and issues of race, class and sexuality are unacknowledged even in interdisciplinary areas such as ethnic, women and queer studies and in larger conversations and decisions of hiv/aids prevention education, policy and new media technologies. “from the center” derives from intersectional issues, domains and disciplines. we hope to bridge seemingly disparate subjects: feminist praxis, hiv/aids education, digital storytelling, the prison industrial complex, women’s studies, ethnic studies and new media studies. thus, we question, hope and urge a re-articulation of women’s identity, hiv/aids education and the digital divide by centering the issues and concerns of incarcerated women. (from the center) the field of digital humanities has become well known for its willingness to challenge academic conventions on one level: the idea that a phd constitutes professional training that should lead invariably to a tenure-track university teaching position. yet the vision of from the center, and rhee’s insistence that her work should be considered part of a scholarly project, highlights the limits of the academic transformations suggested by the increasingly celebrated alt-ac narrative (which encourages phds to seek careers in non-teaching roles in the university). from the center is a far more radical vision of what alternative scholarly knowledge projects and professional     practices could be. it is not uncommon for scholars with particular political commitments to use their skills for activist projects in addition to their university work of teaching, research, and (in the age of dh) digital projects. but what would it mean to slip the bounds of the neoliberal academy, even for a moment, and imagine this work as the center of scholarly activity? [digital story, “miracle”: http://vimeo.com/ ] because i want to help women know that it is okay to go through things like that, this life. because i have someone in my family who has hiv. 
and i learned from her how to have safe sex and get tested. from the evocative intensity of the video to the straightforward statements that highlight a reality too rarely acknowledged within scholarly spaces, knowledge is being produced and transmitted here. when from the center’s team travels to conferences, its presenters include formerly incarcerated participants as well as academics and professional activists. their presence suggests that the privileged sphere of digital scholarship need not remain hermetically sealed from those who “go through things like that, this life.” transformative networked pedagogies connections and support networks among those engaged in knowledge production are central to the growth of the digital humanities sphere. much unacknowledged work of consolidation, mentorship, and intellectual framing takes place in and through digitally mediated social     networks. here we highlight two examples that make the work of theory/practice explicit and conscious, building collaborative spheres on feminist principles and connecting transformative praxes inside and outside the academy. fembot collective: feminism, new media, science and technology the fembot collective consists of faculty, graduate students, and librarians who created a portal for feminist scholarship about technology. committed to the ideals of open source, fembot hosts an online journal, ada: journal of gender, new media and technology, with an open peer editorial process, an expanded notion of what “article” means, and a built-in system to help contributors bolster promotion and tenure portfolios: fembot has developed a framework for a two-level review process that includes an open editorial peer review and a community level of review for works in progress. valuing both the scholarly works and participation in the community of review, fembot will     provide metrics on article views/downloads and the usefulness of comments. 
these metrics will be aggregated into a portfolio, which is conducive to forming an incentive to participate in the community and support an argument for value toward promotion. in addition to its transformation of scholarly publishing, fembot contributes pedagogical tools on the undergraduate and graduate levels, hosting blog posts in the site’s laundry day section that outline short, teachable moments in feminist technology scholarship, and providing tenure policies and dissertation prospectuses for use in professionalization training. most recently, fembot acts as the portal for femtechnet, a feminist technology teaching network that hopes to launch a course taught worldwide, dialogues on feminism and technology, in . billed as a “distributed online collaborative course,” femtechnet is an attempt at developing a viable model for transdisciplinary, transnational, transmedial collaborative pedagogies, and a feminist intervention on the mooc (massive open online course) model that is prevalent and controversial in current digital humanities discourse. in the future, fembot will host peer-evaluated readings, videos, bibliographies, and other teaching resources to aid participants in tailoring local instances of the course to its networked goals. experiments such as femtechnet and ada position the fembot collective as an innovator in scholarly communicative possibilities. crunk feminist collective “mission statement”     in its mission statement, the crunk feminist collective throws off “hegemonic ways of being” in favor of reveling in and sharing the intoxicating effects of women of color feminisms with its readership and commenting community. this blogging community provides a space for women of color to commune, critique, and call out hegemonic culture in ways that reach across the divide separating academia from the popular. 
beat-driven and bass-laden, crunk music blends hip hop culture and southern black culture in ways that are sometimes seamless, but more often dissonant. its location as part of southern black culture references the south both as the location that brought many of us together and as the place where many of us still do vibrant and important intellectual and political work. the term “crunk” was initially coined from a contraction of “crazy” or “chronic” (weed) and “drunk” and was used to describe a state of uber-intoxication, where a person is “crazy drunk,” out of their right mind, and under the influence. but where merely getting crunk signaled that you were out of your mind, a crunk feminist mode of resistance will help you get your mind right, as they say in the south.
casting off stilted academic speech for lyrical manifestos, insisting on the utility of affect for deep and considered arguments, and refusing to disconnect deeply personal stories from the project of scholarship, the crunk feminists’ commentary is more timely than journal production and more effective in enlisting the passion and drive of reader-students for social justice purposes. the collective’s interventions in internet and popular culture have included critiquing mainstream media for its coverage of olympians gabby douglas and claressa fields, covering the triumphs and missteps of the popular the misadventures of awkward black girl web series, and offering film and television reviews that range from love and hip hop to pariah. the crunk feminists also offer practical career advice for young academics and swap experiences and strategies for the unique struggles of the black feminist running a university-level class. as the blog’s large community of regular readers and commenters attest, the tactics and philosophies of crunk feminism reach into academia and beyond, educating and transforming their corner of the web.
conclusion as the tools and methods of the digital humanities take up their new positions of prominence, we can only hope that they will begin to take on the mutations and instabilities represented by the practitioners and projects featured here, rather than settle into the creaky machine of the corporate university. whatever its future, dh has already proved its power to unsettle the old guard, inducing anxious and skeptical blog posts from high-profile critics and me-too conference panels spreading the word to far-off disciplines. the spirit of #transformdh is not to arrest this momentum, but to channel it in truly transformative directions—to avoid trading whiteness for more whiteness, heteropatriarchy for more heteropatriarchy, one imperialist hierarchy for another. we hope the community at large will continue to find and go viral with the social justice-minded hybrid practices, identities, and collaborations elaborated in mcpherson’s epigraph to this work     of curation and analysis—the antiracist archives, the queer art-theories, the collaborative feminist pedagogies, the crunk academic activisms, the critical race coders. #transformdh is a convenient means to do so, but in the spirit of transformative work, we hope it will be supplanted by something else soon. about the authors alexis lothian is assistant professor of english at indiana university of pennsylvania, where she researches and teaches at the intersections of cultural studies, digital media, speculative fiction, and queer theory. she is the editor of an upcoming special issue of ada: journal of gender, new media and technology on feminist science fiction, a coeditor of a social text periscope dossier on speculative life, and a founding member of the editorial team for the journal transformative works and cultures. her work has been published in international journal of cultural studies, cinema journal, camera obscura, and journal of digital humanities. 
amanda phillips is a phd candidate in the department of english with an emphasis in feminist studies at the university of california–santa barbara. her dissertation takes a vertical slice of the video games industry to look at how difference is produced and policed on multiple levels of the gamic system. her interests more broadly are in queer, feminist, and race-conscious discourses in and around technoculture, popular media, and the digital humanities. in addition to participating in the humanities gaming institute , sponsored by the national endowment for the humanities (neh), amanda has been a hastac scholar since ; she has also hosted, in conjunction with margaret rhee, an online hastac forum on queer and feminist new media     spaces, the organization’s most commented on forum to date. she has presented at the conferences for ucla queer studies, the american studies association, the modern language association, the popular culture association, and the conference on college composition and communication, and has participated in unconferences such as hastac’s peer-to-peer pedagogy workshop, thatcamp socal, and the transcriptions research slam. most recently, she has been involved with the #transformdh collective’s efforts to encourage and highlight critical cultural studies work in digital humanities projects. bibliography benford, steve, and gabriella giannachi. performing mixed reality. cambridge, ma: mit press, . blas, zach. “fag face, or how to escape your face.” vimeo. . accessed may , . http://vimeo.com/ . ———. “queer technologies: automating perverse possibilities.” queer technologies. . accessed may , . http://www.zachblas.info/projects/queer-technologies/. cárdenas, micha. transreal.org. . accessed may , . http://transreal.org/. cárdenas, micha, and elle mehrmand. “becoming transreal.” ucla freud playhouse, los angeles, ca. performed nov. , . vimeo. may , . the crunk feminist collective. the crunk feminist collective. –present. accessed may , . 
http://crunkfeministcollective.wordpress.com/. ———. “mission statement.” the crunk feminist collective (blog). mar. , . accessed may , . http://crunkfeministcollective.wordpress.com/about/.     the fembot collective. fembot: feminism, new media, science and technology. . accessed may , . http://fembotcollective.org/. gold, matthew k. “whose revolution? towards a more equitable digital humanities.” the lapland chronicles (blog). jan. , . accessed may , . http://mkgold.net/blog/ / / /whose-revolution-toward-a-more-equitable-digital- humanities/. gonzález, isela, margaret rhee, allyse gray, and kate monico klein. from the center: facilitating feminist digital theory and praxis in a digital environment (blog). . accessed may , . http://hastac.org/blogs/alexislothian/ / / /hastac - center-facilitating-feminist-digital-theory-and-praxis-dig. graduates of from the center. “miracle.” vimeo. . accessed may , . http://vimeo.com/ . juhasz, alex. “two conferences: one students’/women’s media power.” media praxis: integrating media theory, practice and politics (blog). apr. , . accessed may , . http://aljean.wordpress.com/ / / /two-conferences-one-studentswomens- media-power/. koh, adeline. “addressing archival silence on th century colonialism – part : the power of the archive.” adeline koh (blog). mar. , . accessed may , . http://www.adelinekoh.org/blog/ / / /addressing-archival-silence-on- th-century- colonialism-part- -the-power-of-the-archive/. ———. “addressing archival silence on th century colonialism – part : creating a nineteenth century ‘postcolonial’ archive.” adeline koh (blog). mar. , . accessed may ,     ). http://www.adelinekoh.org/blog/ / / /addressing-archival-silence-on- th-century- colonialism-part- -creating-a-nineteenth-century-postcolonial-archive/. ———. digitizing “chinese englishmen.” . accessed may , . http://chineseenglishmen.adelinekoh.org/. macklin, angelica. “i saw you on the radio!” vimeo. . accessed may , . http://vimeo.com/ . mcpherson, tara. 
“why is the digital humanities so white?, or, thinking the histories of race and computation.” in debates in the digital humanities, edited by matthew k. gold, – . minneapolis: university of minnesota press, .
women who rock project: making scenes, building communities. . accessed may , . http://womenwhorockcommunity.org/.
published by the dartmouth college library. http://journals.dartmouth.edu/joems/ article doi: . /ps . - .a.
digital scholarship: panama silver, asian gold: migration, money, and the making of the modern caribbean course
http://dloc.com/digital/panamasilver
panama silver, asian gold: migration, money, and the making of the modern caribbean; & panama silver, asian gold: reimagining diasporas, archives, and the humanities
the panama silver, asian gold: migration, money, and the making of the modern caribbean course was collaboratively created and taught at amherst college, university of florida, and university of miami in . the new course version, panama silver, asian gold: reimagining diasporas, archives, and the humanities, is being collaboratively created and taught at amherst college, university of florida, university of miami, and the university of the west indies, cave hill, barbados in . this page provides a guide to the course materials, including teaching materials, sources, and student work and projects (with permissions granted by the faculty and students for sharing these materials to support caribbean studies). this page is being created with all currently available materials; more to be added with the new course version in spring .
currently:
  for teaching materials and student work, search dloc for "panama silver"
  permissions form for student work
  for source materials, see the panama and the canal collection
course materials (all)
  syllabus
  assigned readings
  reading form for student responses to readings
  assignment description
  additional assignments
  class activity description
  grading rubrics
  instructor notes
  lesson plans
  guest lectures
    victor chang
    sonja watson
  powerpoint slides
    selected views on indentured labor in the caribbean, –
    michel-rolph trouillot - "the power in the story"
  student submitted assignments (tagged for each assignment)
  student-created metadata added to items, specific examples include:
    madison dutkiewicz and rachael schaaf, natives
    chelsi mullen, east chamber of gatun lock
    francis urroz, kayli smendec, and christine csencsitz, -b front street and panama r.r. yard - colon -
    berta gonzalez, american steam shovel
    prea persaud, depth of the culebra cut
    annemarie nichols, american steam shovel
    stephanie dhuman and kassie renneker, u.s. dredge sandpiper
    daniela bernal, ancon hospital french
    dayna clark and tasheik kerr, -x hindoo laborers
    alexandra graham, great fire of colon
    reuben jimenez, panama canal commissary, with personnel, showing the "silver" and "gold" entrances
    amelie steer, -b widening of sidewalks - panama
    laurin lavan, # taboga island church (item still in process, so digital version not online as of oct. )
  digital projects
    kim bain, ghosts in the water: chinese women in trinidad
    dhanashree thorat, indian indenture in trinidad
    yasmina martin, encountering cultures: the role of the chinese shop in jamaica, -
    yilin andre wang, mapping lgbt caribbean literature
  feedback/reflections
related material
  planning notes (documentation on how the course came to be)
  pedagogical approach
  bios
  introductory/summary document (when/where course was taught, who was involved, how many students enrolled, etc.)
  conference presentations & publications (about the course)
    feminist pedagogy for the digital age ("a feminist mooc")
    panama silver, asian gold: collaborative pedagogy for the digital age
    what is dh (digital humanities)
    dloc and digital humanities
    finding the silver voice: afro-antilleans in the panama canal museum collection at the university of florida
    breaking frontiers: the panama canal museum collection at the university of florida/ rompiendo barreras: la colección del museo del canal de panamá en la universidad de florida
  online exhibits and exhibit materials
    panama canal centennial online exhibit
    documenting presence (materials for physical exhibit)
  workshops and events
resources used in the course
the course is helping to increase the availability of relevant resources in dloc. in addition to the explicitly referenced resources, related resources can be accessed using keywords. keywords include geographic area, genre/format, and subject keywords. the advanced search supports searching by types of keywords.
  geographic area (e.g., trinidad and tobago; panama; jamaica; guyana)
  genre/format (e.g., autobiography, map)
  subject (e.g., migration, chinese, west indian, panama canal)
existing keywords can be seen when browsing/searching by using the facets. when viewing browse/search results, the facets are on the left and grouped by keyword type with the option to show more for each category. example of browse all for the panama and the canal collection.

terry carter: exploring the possibilities of digital scholarship for faculty performance
abstract: this article shares the author’s exploratory journey as a senior professor eager to understand and to showcase digital scholarship during periods of faculty performance evaluations. the purpose of this article is to highlight and bring attention to the possibilities of digital scholarship for tenure, promotions, and faculty evaluations.
title: exploring the possibilities of digital scholarship for faculty performance
the purpose of this article is to highlight and bring attention to the possibilities of digital scholarship for tenure, promotions, and faculty evaluations. the presence of digital scholarship is ubiquitous; however, how should those in positions to evaluate that scholarship weigh it appropriately and fairly in our current academic culture, which values traditional publications in journals and books? readers know faculty have digital outlets for their writings and scholarly explorations; such digital outlets include websites, blogs, wikis, podcasts, and open access journals. with so many opportunities for faculty to disseminate scholarly ideas, the academy should be loudly encouraging faculty to pursue alternative digital publication outlets. now is the time for organizations in higher education to reaffirm or create highly visible guidelines and position statements about digital scholarship as qualifying evidence for tenure, promotions, and faculty evaluations.
a review of the american association of university professors’ academic freedom and electronic communication policies reveals that the organization has done its due diligence to acknowledge that digital scholarship and current technological means of disseminating scholarship must be considered in connection with academic freedom. the policy does not specifically refer to digital scholarship; rather, it refers broadly to new mediums of communication that often serve as launching sites for sharing digital scholarship. the policy also has been updated several times since , which perhaps indicates an increasing awareness that technology and faculty use of technology continue to reconfigure the relationship between scholarship activities and academic freedom. aaup policy makers appear to be aware that the traditional gatekeepers (academic publishers) who defined and constrained the showcasing of scholarship should no longer stand between faculty and the dissemination of their authorial works.
with academic freedom in mind, should faculty in this era of digital publication opportunities be required to pursue traditional publications? in certain fields of study, peer-reviewed publication in traditional and highly reputable journals is difficult to achieve without the appropriate connections in one’s field of study. in fact, the published scholarship that arises from discipline-specific networked connections often serves to silence the work of many deserving scholars. the beauty of current digital publication opportunities is that they create space for mature and less mature scholarship outputs, both of which show evidence of faculty productivity. in fact, one could argue that the current era has opened the academy up to a true and egalitarian way for faculty to engage and share their scholarship.
in , jason priem’s beyond the paper predicted that in the future we would see totally different ways of assessing and valuing digital scholarship. he made some very bold predictions; two relevant points that caught my attention are the statements that “the reward structure of scholarship will change” and “tenure and hiring committee will adapt [towards respecting digital scholarship] . . . with growing urgency” (p. ). if either of those predictions had come to fruition, we would by now be accustomed to universities across the nation sponsoring and leading workshops that encourage faculty to move boldly into digital publishing. one might also expect universities and their academic units to be at the point of clearly articulating support for digital scholarship. as a researcher, i mistakenly assumed as much when i set out on a brief exploration of available promotion and tenure guidelines to justify my digital scholarship efforts during annual evaluation periods.

interestingly, a large number of universities have supported digital publications, but they do so to support the production of free open educational resources (oers). even a reader who does not know exactly what oer means has most likely seen the acronym in an email from an academic administrator at some point. nationwide and global open educational resource initiatives are being promoted and funded to reduce costs associated with textbook usage in college and k- classrooms. the funds to support faculty development of oers in my home state have been awarded consistently for at least years now; oer funding has been available during periods of budget constraints and even more so during times of strong revenue growth. imagine if the momentum behind digital scholarship were similar to the momentum behind support for faculty to create oers; perhaps priem’s bold prediction would now be a reality.
the oer movement and the digital scholarship movement are connected by the “digital”; yet the managerial institutions of today place a higher value on the appearance of meeting student needs, without regard to their own need to recruit and retain faculty, who often must reckon with the ever-increasing pressure to publish or perish. the goal of oers is to reduce student costs; a goal of digital scholarship is to reduce faculty dependence on the ever-shrinking traditional outlets for scholarship dissemination; both are important investments to ensure student and institutional success. today the publish-or-perish faculty most in need of clear and respected digital scholarship guidelines are junior faculty who must convince evaluators of their productivity. those junior faculty often face an uphill battle because many of the evaluators are likely to be senior professors who have more trust in printed publications than in any digital equivalent. thus, the purpose of the remainder of this article is to share my own journey as a senior professor eager to understand and to showcase digital scholarship during periods of faculty performance evaluations. i hope faculty who are considering the possibilities of digital scholarship find what i share below to be of assistance.

********************************

in , i earned the rank of full professor at southern polytechnic state university (spsu); shortly thereafter, my career shifted toward writing center administration and faculty mentoring and away from higher expectations to engage in traditional publication. however, due to a fall mandated university consolidation, spsu merged with and became part of kennesaw state university, and my career shifted back to the work of traditional faculty, which included once again focusing on publication projects.
during this period of transition, i developed an interest in digital scholarship after having attended a digital humanities conference, and i later developed a small irb research study that would allow me to systematically learn about digital scholarship by concurrently researching, presenting, and engaging in its production. the research project focused on gathering and analyzing faculty and administrator perceptions of digital scholarship publications in comparison to traditional academic publications including peer reviewed journals and books. in fall of , i gave a presentation entitled “academic freedom in the digital technology age: exploring guidelines for evaluating digital scholarship for faculty promotion and evaluations” during an aaup shared governance conference held in washington dc. the presentation argued that more guidelines were needed to encourage institutions to accept digital scholarship. the rationale for encouraging digital scholarship was that many academics develop good quality manuscripts that are never published due to lack of space and increased competition for peer reviewed and other types of traditional print publications. during the presentation most of the audience members agreed with arguments and rationales that were outlined and discussed. ultimately what i learned was that my views and my aaup presentation audience views were in alignment relative to the perils of pursuing traditional academic publications and in alignment relative to the possibilities for engaging in digital scholarship to enhance faculty performance requirements. my aaup presentation was based on professional experiences and the results of my irb project. 
the project made use of three survey questions designed to elicit responses that would help capture the perceptions of others at and outside my institution about the value of nontraditional methods of disseminating scholarship. question one asked for a yes or no answer in response to “do you believe digital scholarship or creativity (exclusively online journal, informational websites, blogs, podcast, etc.) should count toward tenure and promotion?” question two asked respondents to qualify their agreement or disagreement with the following statement: “digital scholarship holds potential to be weighted equally alongside peer-reviewed print publications.” question three allowed for written feedback from respondents about their understanding of and perspective on digital scholarship. my survey results (see appendix) were not generalizable due to the small sample of respondents; however, i learned that most of the survey participants, like those in my fall aaup conference audience, did indeed recognize the possibilities and perils of pursuing digital scholarship rather than traditional routes for faculty publication.

with my survey results and the aaup presentation feedback in hand, i moved ahead to get my feet wet with some type of digital publication, to test how it might be viewed during annual performance evaluation. i found my opportunity when i was invited to write an article about mentoring that would be disseminated as a blog posting via a higher education organizational website. i completed the article, which was subsequently reviewed and edited by the marketing and leadership team members before being published. mission accomplished was my inner pronouncement.
my mission accomplished pronouncement later turned sour when i learned during my annual evaluation review that my department had not yet developed guidelines that would clearly recognize my digital blog publication efforts. i believed my activities in the area of digital scholarship were aligned with the recognizable scholarship activities of professors at my university, but after receiving faculty performance evaluation feedback in , my beliefs turned out to be faulty assumptions. for that annual evaluation period, my academic blog publication was a praiseworthy activity; however, it was not a “creditable” scholarly activity. after consulting with colleagues and carefully reviewing departmental scholarship guidelines, i learned that digital publications such as academic blogs were not clearly designated as scholarly and creative output. i later learned through informal internet research that many institutions and departments across the nation did not have written guidelines to account for digital scholarship.

after that sour experience, i decided to dig a bit deeper into the history of digital scholarship before committing more time to any other digital scholarship projects. during the summer of , i spent time querying my institution’s library databases for journal articles and books using “digital scholarship” key word searches; eventually i settled on five publications related to the topic for in-depth reading. the dates of those published resources ranged from to . in addition to searching the library databases for source material, i also reviewed web-accessible departmental, college, and university-level guidelines that described faculty expectations for research and scholarly activities.
from late summer of up to the early winter of , i read, reviewed, and annotated the aforementioned source material in an effort to increase my understanding. once i had completed my readings, my comprehension of the potentials and the pitfalls of pursuing digital scholarship had indeed increased. below is a synopsis of what i learned that i believe may be of value to others interested in pursuing digital scholarship:

• the topic of how to assess digital scholarship has been an ongoing conversation for almost two decades now. the modern language association (mla) approved in may of “guidelines for evaluating work in digital humanities and digital media” to help disciplines contextualize the credibility of this type of scholarship, yet articles and books published since that time demonstrate that the academy as a whole continues to question the meaning and value of digital scholarship (borgman, ; friedberg, ; ren, ).

• digital scholarship’s meaning tends to vary; it may refer to research about the impact of digital publications as well as to the digital platforms used to disseminate information (ren, ; raffaghelli, ). disciplines that lay claim to engaging in digital scholarship activities include the humanities, information science, and information technology. because several disciplines engage with digital scholarship in their own ways, users of the term should provide a contextualized definition for reviewers of such works.

• sole reliance on digital scholarship publications such as academic blogs, deposits in digital repositories, and multimedia products is not recommended for faculty who are required to publish scholarship for promotion and tenure. the quality and significance of such publications may be sound; however, institutional guidelines may not allow for full recognition of such digital publications (braun, ; ren, ).
unfortunately, perception of the quality and significance of digital publications in the academy is often less favorable than that of traditional print and peer-reviewed publications.

based on the synopsis above and my own recent experiences, i would advise untenured professors to be careful when pursuing digital scholarship projects in order to satisfy scholarly and creative publication requirements. i would likewise advise tenured professors to be careful; however, based on the point of view of other academics (braun, , p. ) and my own observations, i would also strongly encourage those already tenured or in senior-rank faculty positions to pursue their interest in digital scholarship. doing so sets precedents that will hopefully benefit upcoming generations of academic professionals, who must increasingly invest in digital scholarship activities to maintain relevancy in their disciplines against the backdrop of limited traditional print publication possibilities. finally, regardless of rank and tenure, i strongly recommend reviewing specific departmental, college, and institutional guidelines before investing time in digital publication projects, because not doing so may increase the likelihood of unexpected negative performance feedback during administrative review of faculty performance.

bibliography

borgman, christine. “the continuity of scholarly communication.” in scholarship in the digital age: information, infrastructure and the internet, edited by christine borgman, - . cambridge, massachusetts: the mit press, .

braun, catherine. “scholarship through a new lens: digital production and new models of evaluation.” in cultivating ecologies for digital media work: the case of english studies, edited by catherine braun, - . carbondale, illinois: southern illinois university press, .

friedberg, anne. on digital scholarship. cinema journal, , no. ( ): - .

guidelines for evaluating work in the digital humanities and digital media (february, ). retrieved from https://www.mla.org/about-us/governance/committees/committee-listings/professional-issues/committee-on-information-technology/guidelines-for-evaluating-work-in-digital-humanities-and-digital-media

priem, jason. beyond the paper. nature, ( ): - .

raffaghelli, juliana. exploring the (missed) connections between digital scholarship and faculty development: a conceptual analysis. international journal of educational technology in higher education, ( ): - . retrieved from https://doi.org/ . /s - - -x

ren, xiang. the quandary between communication and certification: individual academics’ views on open access and open scholarship. online information review, , no. ( ): - . retrieved from https://doi.org/ . /oir- - -

appendix: digital scholarship survey september results

q - do you believe digital scholarship or creativity (exclusively online journals, informational websites, blogs, podcast, etc.) should count toward tenure and promotion?

table columns: #, field, minimum, maximum, mean, std deviation, variance, count, bottom box, top box.

q - digital scholarship holds potential to be weighted equally alongside peer-reviewed print publications:

table columns: #, field, minimum, maximum, mean, std deviation, variance, count, bottom box, top box.

q - please use this space to provide any additional feedback about digital scholarship from your perspective and understanding. (note: minor spelling corrections for inclusion as an appendix document; otherwise content appears as written by respondents.)

i don't see any distinction between a "digital scholarship" and "paper" articles if we are talking about peer review. most peer-reviewed articles are in online journals these days. so, i don't see how this differs. blogs are different of course. they are not peer-reviewed. but an article that is peer-reviewed and the journal is an online journal should be equal. in other words, i disagree with your premise that digital scholarship is scholarship for tenure if it is just a blog. anyone can publish anything in blog format. getting a peer-reviewed article published (online or in print) is much different and this should count more for tenure. it depends upon the type of digital scholarship--if it is peer-reviewed digital scholarship, definitely. i guess it depends if the digital scholarship is also peer reviewed. i don't think, for instance, that blogs written for a book publisher about one's discipline should count as much as a peer-reviewed article (online or in print). but, digital scholarship should count as well anyway. it also, though, should be measured in some way through reach or response or impact in order to recognize that digital scholarship is vast and should not automatically constitute equal weighted-ness to peer reviewed scholarship.
publications of articles in online journals must be reviewed carefully, especially given the proliferation of fraudulent online journals that currently invite manuscripts for "peer-review." also, in the case of websites, blogs, and podcasts, i believe these should be valued in tenure and promotion cases, but universities also need to specify the criteria for equivalency. often digital scholarship has a wider circulation than traditional forms, especially print only forms. if influence in the field is something our programs are looking for they should definitely consider the range of influence that can be obtained through online, especially multi-modal, venues, through popular and well-, wide-read blogs. as long as the journal is refereed, whether it is hard copy or digital should be irrelevant. a peered review online journal publication should carry a higher weightage compared to a blog, podcast for that purpose. digital publications should be held to similar standards of quality and peer review as print publications--when this is the case, it should count equally towards tenure and promotion.

q - please click and read the following: online survey consent form. after reading, you may continue or opt out by selecting the appropriate response.

answer options: i choose to continue the survey; i choose not to continue the survey.

table columns: #, answer, %, count; and #, field, minimum, maximum, mean, std deviation, variance, count, bottom box, top box.

uc berkeley
lauc-b and library staff research

title: digital publishing from the library: a new core competency
permalink: https://escholarship.org/uc/item/ b dk
journal: journal of web librarianship, ( )
authors: huwe, terence; lefevre, julie
publication date: - -
peer reviewed

journal of web librarianship issn: - (print) - (online) journal homepage: http://www.tandfonline.com/loi/wjwl

to cite this article: julie lefevre & terence k. huwe ( ) digital publishing from the library: a new core competency, journal of web librarianship, : , - , doi: . / . .

to link to this article: http://dx.doi.org/ . / . .

published online: jun .

published with license by taylor & francis
digital publishing from the library: a new core competency

julie lefevre
institute of governmental studies, university of california-berkeley, berkeley, california, usa

terence k. huwe
institute for research on labor and employment, university of california-berkeley, berkeley, california, usa

in the earlier years of the web, libraries focused on moving services online and building digital collections, but in recent years, libraries have emerged as key players in the world of digital publishing. librarians possess all of the necessary skills to act as digital publishers; they join the ranks of many others who have discovered the barriers around digital publishing are lower than ever. library-based digital publishing solutions have matured to a point that the act of digital publishing could, and should, become a new core competency for the library profession. to explore this hypothesis, the researchers offer a working definition of digital publishing and assess the key roles that traditional publishers have historically offered over time. they find that librarians already possess the requisite skills to become digital publishers, and the collaborative culture of the library profession is a strength for this new role. examples of digital publishing from two libraries at the university of california-berkeley offer a proof of concept. services at these libraries include the conceptualization of overall web site strategies, a content plan that emphasizes distinctive and original material, and special projects that promote digital publishing at the local level, even as they take advantage of campus- or system-level services. researchers find that offering library-based web publishing services can reinforce overall information management programs and also advance the status of libraries within their respective host organizations. the comparative ease of digital publishing has opened an opportunity for librarians to follow the user as they use the web in creative ways.

keywords: academic libraries, collaboration, digital libraries, digital publishing, information competency standards, social technology, web publishing, web services

© julie lefevre and terence k. huwe
received october ; accepted november .
address correspondence to julie lefevre, institute of governmental studies, university of california-berkeley, moses hall # , berkeley, ca - . e-mail: jlefevre@library.berkeley.edu; terence k. huwe, institute for research on labor and employment, university of california-berkeley, channing way # , berkeley, ca - . e-mail: thuwe@library.berkeley.edu

the pace of innovation in digital media has deeply influenced the practice of librarianship. librarians routinely assume a wide array of new roles using new media, including managing the lifecycle of text- and image-based digital files, developing mobile apps, integrating social media into library services, and exploiting the potential of cloud-based applications. new-generation librarians enter the field with comprehensive digital information skills, offering fresh energy for repositioning the profession in the twenty-first century. the world wide web provides a space for experimentation but also continues a longstanding process of knowledge dissemination: publishing. librarians staked an early claim in digital publishing by building robust web sites, debating the future of intellectual property and copyright at the international level, and testing new ways to publish online. this history in the web environment enables librarians to become full-scale digital publishers, crafting robust ways of using digital media in support of scholarship.
digital publishing as an idea is partially obscured by the general tendency to view library-based web services as a series of related projects and ventures but not necessarily a unified program to manage digital information in all its forms. at the same time, wholly new areas of digital expertise compete for librarians’ attention. these include digital curation, rights management, and “digital scholarship”: the wholesale relocation of teaching and research to online domains. yet within this larger context of innovation, digital publishing advances have made its long-term potential more visible and obvious. the profession faces a new opportunity to establish digital publishing as a core competency, and to do so is essentially a matter of recognizing what is already under way.

framing the full array of library-based web and digital media services as “digital publishing” is a relatively new concept, and definitions are still in an evolutionary stage. john battelle, one of the founders of wired magazine and a leading commentator, offers one of the better summations: “publishing means connecting a community through the art and science of communication” (battelle ). his broad definition encompasses process and also captures a new reality in the publishing world: engaging with readers takes many forms. general definitions from popular web reference sites offer simple explanations, stating that books and other works can appear as web sites, audio materials, and more (tomlinson ). wikipedia defines “electronic publishing” as a series of content creation procedures (wikipedia ). john w. warren of the rand corporation wrote that publishing in the future will be “more about managing digital assets and metadata for increased customization and findability” (warren , ).
part of the problem lies in the absence of formal standards, which would offer stability. during the election cycle for the international digital publishing forum board, candidates acknowledged that standards and definitions are works in progress. one candidate, rob reynolds of mbs direct digital, an e-learning firm, asserted he would “advance the creation of digital content standards and the evolving definition of publishing across the [publishing] industry” (reynolds ). although definitions are emerging, those who are most closely associated with the work itself are still in search of consensus. the evolutionary dialogue about the future of digital publishing suggests the intellectual terrain is still open to interpretation, creating opportunities for the library profession and others who possess insight and expertise about how readers use digital artifacts.

for this article’s investigation, we offer this working definition of digital publishing as a core competency for the profession:

digital publishing is a role of agency in creating and disseminating text and visual artifacts in a networked electronic environment. it constitutes a comprehensive content management service libraries can offer to user communities as part of their overall mission. it encompasses both formal and casual styles of electronic publishing, ranging from peer-reviewed e-journals and policy reports to blogs and social media. it includes robust metadata, cross-platform document design, rigorous editing and quality control, adherence to requisite copyright and commercial laws, and attention to long-term preservation strategies.

under this definition, digital publishing encompasses all the work related to content creation, content acquisition, and long-term preservation of digital content that originates with libraries or that is acquired and managed by them. currently, digital publishing goes by many names, such as electronic resources management or electronic content aggregation.
but as descriptive as these names are, they do not fully define the growing importance of digital publishing from the library, particularly from a strategic viewpoint. indeed, the percentage of time librarians already devote to digital publishing is a key element in the hypothesis that it is becoming a new core competency with a significant value proposition.

armed with a working definition, we evaluate digital publishing and assess whether library skills are closely akin to publishing skills. the assertion that digital publishing can be a new core competency for the library profession is demonstrated in two case studies which describe how digital publishing evolved at two libraries at the university of california-berkeley. we further assert that the library profession faces a clear choice: either embrace digital publishing as a core competency or risk losing the opportunity to serve user communities in the dynamic arena of user-generated web content.

digital publishing: the library’s value proposition

three trends fuel the growth of digital publishing. first, the forces of digital convergence and new technology have lowered the barriers to publishing in general, and more people know how to act as digital publishers on the web. a review of curricula for fields such as commercial publishing, communications, information science, librarianship, and related lines of work reveals striking similarities; for example, rosemont college’s graduate degree in e-publishing covers skills that are also common to journalists, editors, librarians, analysts, and web administrators (rosemont college ).

digital information management skills are more transferable than ever. journalists now practice computer-assisted journalism and are expert researchers in their own right. authors can choose to act as their own publisher, using the web to circumvent the traditional publishing process. information technologists and programmers have added skills in assessing human factors and are much more attuned to the challenges people face as they search networked information and use web interfaces. similarly, there is a natural fit between the publishing process and the practice of librarianship, which has retained a strong focus on how people use information in all its forms.

second, there is growing demand for help in digital publishing on the web and other platforms, and librarians are well-positioned to offer support. in both academic and corporate settings, it is common for a variety of departments to offer web publishing services. web service providers might be found anywhere: in technical support departments, corporate communications offices, public relations offices, central administrative offices, and more. they are able to offer digital publishing services because their staff possesses the necessary skills, and the general demand for effective digital publishing is high. even so, many user communities continue to be under-served, ill-served, or not served at all. this opens an opportunity for librarians to offer targeted web publishing services and offer advice on how to develop a comprehensive digital publishing strategy, which is quickly becoming a crucial element for most contemporary organizations.

at the same time, independent authors who act as their own publishers often find their e-books go undiscovered, that they lack polish in formatting and style, and that professional editing is crucial. here again, librarians have ready-made skills to offer.
they inhabit a professional sphere that encompasses skills in the technical workings of the web and also a growing degree of familiarity with the rigorous work of creating, editing, and producing a quality digital product. with desktop publishing programs increasingly merged with web development software, developers can create digital files for the web that may also be repurposed as print documents. therefore, layout and design, crucial features of quality print or digital publications, may have a place in the array of skills librarians must possess as the web evolves.

university library web sites, which attract millions of visitors, are among the finest and most prolific digital publishing platforms, even though they are not commercially focused. therefore, in a sea of multiple service providers, libraries are just as capable of offering digital publishing as any other player. the key conceptual challenge in grasping the viability of digital publishing as a core competency is to perceive that converging technologies are increasingly being folded into long-term conceptions of what collections and services are all about in a digital library environment.

third, with the publishing marketplace in a deep state of flux and transferable skills widely dispersed throughout all of the knowledge-handling professions, there is no obvious reason why librarians or anyone else should refrain from experimentation with digital publishing. effective user services require that librarians develop collaborative solutions, and this makes them useful partners for others who are developing digital publishing strategies. moreover, a wide variety of web users who are already developing creative works with sophisticated text documents, videos, simulations, and data analyses are within reach, and they also are looking for collaborators.
converging skills

if librarians are to consider digital publishing as a potential core competency, it is important to establish how the process of publishing—whether commercial or not—operates. publishers’ fundamental work classically falls in four categories: agency, editorial, design and marketing, and sales (fister ). however, just like everyone else, publishers must develop a fifth category in response to the digital era: digital media skills. converging digital technologies open all five functions to those with the right experience, including librarians. taken one by one, these categories illustrate how the activities librarians and publishers have performed over time are converging to such a degree that the digital publisher role is well within reach of the library profession.

agency

librarians work very closely with faculty members, who provide opportunities to learn a great deal about scholarly publishing. librarians are becoming experts in copyright and open-access publishing, and many are authors and editors themselves. they also manage large web sites and use content management systems, such as drupal, joomla, and wordpress, that involve direct contact with content creators and frequently require editorial involvement. because of this pervasive involvement in knowledge creation, librarians interact with authors in ways that are strikingly similar to acquisition editors and other publishing staff.

editorial work

in addition to editing newsletters, library publications, and peer-reviewed journals and books, many individual librarians have also gained substantial experience in editing information for the web and for distribution via social media.
academic librarians, who are encouraged or required to write for professional publication and who are familiar with the requirements of usage and style from their graduate studies, are rapidly becoming de facto editors by virtue of their digital publishing activity on the web. digital publishing has become a consumer-driven process, as is already seen in the self-publishing sector of web services, and new opportunities will appear for those who can expand their editorial skills.

design and marketing

production and marketing are dominated by web technology, making it possible for web administrators, librarians, news centers, and public affairs departments to use digital technology to their advantage. desktop publishing skills, once aimed solely at print media, have become powerful tools for generating digital publications of many varieties, including the open source epub format for e-books. electronic files can be repurposed for the web to take advantage of various publishing platforms, and this role is open to anyone who can use the requisite software. professional-quality document design has become necessary in a wide variety of fields, including librarianship. individual librarians who know or can learn how to use the dominant desktop publishing programs can use this skill in high-quality document creation as a selling point in developing a library-based digital publishing initiative.

at the same time, marketing itself is evolving quickly. viral marketing that begins with twitter and facebook can boost sales—or download traffic—in quick spurts of activity. professionals involved in outreach and marketing are experimenting with the new potential of web publishing in conjunction with social media, further leveling the playing field. self-publishing authors find marketing is their principal challenge, since the name of an established publishing house still gives an imprimatur. however, universities also carry an imprimatur, and this is a chief reason why academic library web sites are becoming repositories and outlets in addition to aggregators of digital information products.

sales

with the transition to digital products, publishers are increasingly favoring licensing-based models of distribution, which have in many cases negatively affected libraries’ ability to acquire and distribute content. by expanding their role into digital publishing, libraries could acquire, publish, or curate material in-house, avoiding these restrictions. sales continue to define the transactional relationship between creator and consumer, but on the web, the barriers around setting up a dissemination outlet are constantly shrinking. at the same time, managing the full lifecycle of knowledge resources and generating high download traffic are vital parts of distribution, inspiring efforts to build long-term relationships with readers. publishers have made significant strides in developing post-purchase relationships with their readers, employing interactive features and author web sites to sustain interest. the same paradigm for relationship building can work well for anyone who possesses the right skills, which are learned by handling web administration, using social media, and trial and error. librarians in particular have gained considerable experience in building collaborative outreach efforts, which lend themselves well to the sales process.

digital media skills

librarians have overseen complex digital libraries for many years and are well positioned to take on an expert role as digital publishers. traditional print publishers have been playing a game of catch-up with digital technology, primarily because legacy publishing requires substantial resources and staff expertise.
but once new technology becomes widely available, it is possible for formerly hidebound industries to take sudden leaps of innovation, and so the publishing industry is engaged in experimentation on a broad scale. publishers are focusing on reader forums, author web sites, and tie-in materials that create information ecologies around authors and book series. this process enlivens both their new catalogs and their backlists. these recent initiatives show that publishers have the mettle to change with the times and experiment with new ways to reach their readers to take better advantage of new media.

at the same time, the library profession is enjoying an influx of personnel who may not focus on the historically sharp distinction between print and digital publishing. as a result, libraries have become test beds for trying new and bold ideas, particularly in online education. with a strong urge to take advantage of digital solutions, librarians have become competitive in the technological arena and bring direct knowledge of user preferences into their development projects.

formal and informal, global and local

academic libraries oversee a digital crossroads of repositories, datasets, and library databases as well as local published reports and policy briefs. at the campus level, library-based publishing utilities offer a wealth of options for electronically publishing pre- and post-prints and e-journals and e-books. librarians are lead developers in areas such as data management tools and web archiving. the university of california’s california digital library has become a clearinghouse for such activities, and many other universities have formed their own digital publishing services. universities have also created consortia that share resources and expertise across organizational boundaries.
the sustained investment in digital publishing at the organizational and consortial level has already formalized digital publishing skills within the library profession.

at the same time, local content creation continues in myriad places, and system-level digital publishing utilities, however excellent they might be, do not reach all potential customers and capture all intellectual output. librarians who are involved in local-level user communities already educate their users about existing enterprise-level publishing services, and they can also offer their own digital publishing solutions that dovetail with existing tools and services. digital publishing at the local level can also emphasize informal publications, simulations, and policy briefs that nonetheless have academic value. even in instances where content management systems such as wordpress or drupal are used to empower users to upload their own content, there is still a role for expert assistance, which can be offered by the library as another service to patrons.

librarians can reframe their identities in ways that go beyond traditional librarianship, using digital technology to build new relationships and publishing digital media to the web in distinctive ways. this work can exist both in the form of sophisticated, system-level publishing tools and as strategically organized use of local web sites, social media, and cloud-based services. all the tools needed to make digital publishing a core competency for the library profession have existed for some time; however, library literature has only begun to report on this trend in recent years, as the following review illustrates.

literature review

librarians were quick to see the web’s potential as a tool to further their reach into user communities, and as a result, their initial strategies focused more on updating existing information-management skills and making them relevant to the new platform than wholesale experimentation with new roles such as digital curation or digital publishing. consequently, the pace of experimentation with web management has produced a steady stream of insights about the transferrable nature of various web skills. a number of articles have emphasized how librarians and library staff who manage web resources have gained substantial ability in project management and that these skills are both transferable and in demand (burich et al. , ; fagan and keach , ). the value of outreach activity has been another recurring theme; jeanie welch ( , ) outlined crucial activities involving partners, such as fundraising and how library web services can strengthen alliances. leslie delserone, julia kelly, and jody kempf ( , ) evaluated a robust outreach effort conducted at the university of minnesota in broad terms, where the web played a supporting role in the development of innovative services. susan hubbs motin and pamela salela ( , ) assessed the potential of library liaisons (or “embedded” librarians) as enablers of effective teamwork, and charity hope and christina peterson ( , ) emphasized the importance of collaboration across all outreach activity and the growing dominance of web-based initiatives.

as more and more academic librarians consolidated their presence on the web and enriched library web sites, it became increasingly clear that digital media were changing the entire organization of the library and its host organizations. this pace of change in turn led to greater interest in what new roles librarians could play in evolving organizations. around , the literature began to reflect the perceived importance of library web sites—welch ( ) described the academic library web site as an “electronic welcome mat” ( )—along with the idea that the web site could be an impetus for facilitating collaboration within an organization (motin and salela , ). erik mitchell ( , ) enumerated the challenges facing academic library web sites, arguing that new skills were needed as work styles continued to evolve. he questioned when web sites should be managed by editors rather than web developers. he found that highly-skilled web developers might be hired only to find themselves stretched thinly across related but distinct tasks such as marketing, updating, and outreach. he also identified the emerging potential of cloud-based systems and their benefits in saving time. debra riley-huff ( , ) identified web services as public services and stated that librarians must adopt an activist culture built upon aggressive attention to emerging digital technologies, cooperative relationships, and a focus on customers.

as more coordinated approaches to web oversight appeared, it became clear to managers that content provision had crucial strategic value and that the location of content was also diffuse. this distributed information ecology created opportunities to leverage library-based skills (frumkin and reese , ). much attention has also been given to the hazards of passive behavior during the digital era. isaac gilman and marita kunkel ( , ) argued that librarians should take a more active role in content creation, dissemination, and preservation of internally-produced scholarship. they cited this strategy as a means for advancing librarians’ academic status in the eyes of the faculty, but their argument is germane to the endeavor of web services in and of themselves.
after ten years of web development and administration, library literature began reporting on intensive study of the “user experience,” the collapsing boundaries between academic disciplines, and the need to develop comprehensive online environments. brian detlor and vivian lewis ( , ) found that the interactive web requires library web administrators to rethink their assumptions and to offer users a rich and interactive experience. carla stoffle and cheryl cuillier ( , ) argued that in order to thrive, the library must formulate an overall strategic approach that encompasses all activities within its host organization, including web services, venturing quite close to recognizing that web services are growing into a form of digital publishing. shu liu ( , ) provided a comprehensive overview of how academic libraries can harness web . , forecasting that providing rich media experiences and harnessing collective intelligence will become essential. a ithaka report asserted that libraries play a key role in developing innovative web tools to make scholarship available online (brown, griffiths, and rascoff , ). kevin hawkins also noted in a presentation to visitors at the university of michigan libraries that evolving web technology makes it possible for libraries to build web services that benefit their specific communities ( ). by – , it became clear that digital services had diversified the opportunities for academic publishing beyond the traditional university-press-printed monograph. megan oakleaf ( , ) noted that libraries, as the academic heart of a university, are deeply connected to disciplines and departments and are well positioned to provide services that demonstrate the library’s value throughout the institution.

although the literature has provided many insights about how to manage library web services, the idea of the library acting as the digital publisher has only recently gained attention. this new attention has come in two waves.
the first was triggered in – by the maturation of institutional repositories and their obvious potential for a number of related publishing activities. karla hahn ( , ) identified publishing services as an emerging role for libraries, particularly with respect to services such as dspace and the berkeley electronic press’s digital commons (http://www.dspace.org; http://www.bepress.com). other library-based projects demonstrate the library’s potential role as a digital publisher by focusing on the new digital-to-digital workflow in scholarly publishing. the scholarly publishing office at the university of michigan library has built a low-cost, scalable publishing model that supports digital and print publication of monographs and journals (jöttkandt, willinsky, and kimball ). similarly, griffith university (australia) closely incorporated the library in a project to develop new service models, including online publishing, tacitly acknowledging librarians’ contributions to a university’s research impact (o’brien ). the sydney university press was re-launched in under library management focusing on digital and print-on-demand works. colin steele ( ) asserted that this hybrid approach, mixing curation, management, and access to digital scholarship, will result in better access to this content.

the second wave of interest is more recent, although the seeds can be seen as early as . the second wave is characterized by growing interest in user services that emphasize creativity. this trend is sometimes referred to as “maker culture”: providing users with the tools they need to become authors, artists, or any other kind of “maker” (enis ; koerber ). collaborative technologies have advanced to such an extent that they are creating opportunities to reframe web services as a form of digital publishing with high production standards.
in the past three to five years, digital publishing as a library-based competency has begun to appear in the literature, perhaps as a result of the rise of maker culture but also in response to the parallel interest in digital curation and digital scholarship. however, it is at this juncture that awareness begins to grow that the actual publishing process, shifted to a digital platform, carries considerable continuity and can be viewed as a distinct set of skills. janeke adema and birgit schmidt ( , ) made a strong case for collaborative, open-access book publishing, whether in concert with publishers or as lone agents. tyler walters and katherine skinner recommended librarians use their position to take on new roles in content production through e-publishing ( , ). walt crawford ( , ) found considerable inspiration in “micropublishing”: libraries providing space, equipment, and software that allow users to publish their own print and electronic books. nate hill ( ) echoed the potential of this enabling role in the context of public libraries, although the same principles could apply in academic settings, especially in conjunction with large-scale publishing tools such as dspace. jennifer howard ( ) noted the number of academic libraries involved in publishing services has expanded rapidly, and the concept is gaining traction through collaboration. she described how more than academic libraries have launched the library publishing coalition to promote new projects and ongoing development. serious collaboration among institutions is a common signal that an idea is gaining broader acceptance, and the emergence of consortia devoted to digital publishing suggests the profession at large is becoming aware of its potential.

these are examples of homegrown initiatives that clearly depend on mettle and imagination, but at the same time, library software vendors are beginning to add creative functions to their products.
in october , auto-graphics integrated self-publishing software into its library management software (http://www.ac-canada.com). the software links with ondemand’s espresso book machine, which enables patrons to print their own book even as the epub and pdf document formats enable them to host them online (schwartz ; http://www.ondemandbooks.com).

these recent developments suggest that the information professions are furthering their awareness of how digital publishing can include not only the library’s own content but the content of others. the growing importance of digital publishing in the overall university environment is also strengthened by institution-level publishing tools for the creation of e-journals, e-books, and more (hahn , ). however, universities are not alone in discovering the potential of empowering internet users to publish their own material. venture capitalists continue to fund new ideas that transform digital publishing and add value. in summer , obvious corporation began testing an online publishing tool called medium (http://medium.com). medium promises to organize blog posts, images, tweets, and more into collections defined by a theme and template (evangelista , d ). medium’s approach—a form of digital publishing influenced by social media—also shares ideological roots in collection development and library-based community outreach practices. branch media, another startup, defines its branch service as a means of “turning monologues [e.g., blogging] into dialogues,” making digital publishing a communal experience (http://branch.com).

two case studies from the university of california-berkeley

the following case studies outline how two special collections libraries at the university of california-berkeley have employed digital publishing strategies to advance their status.
the two collections are associated with advanced research institutes: the institute of governmental studies (igs) and the institute for research on labor and employment (irle). these two collections are “affiliated libraries” of the university, meaning the library directors report not to the university librarian but instead to the director of a research institute, who in turn reports to the vice chancellor of research. these two libraries share a crucial characteristic: they are embedded in their parent organizations and are thus focused on research and community outreach. this has presented some intriguing opportunities over time. in particular, both libraries have become the digital publisher for their parent organization, directly handling digital publishing for faculty members and other user communities.

igs and irle share several attributes. they both support the research activity of faculty members from a number of departments and professional schools, and they both encourage multi-disciplinary research. igs has faculty affiliates and five program units that handle conferences, fundraising, visiting scholar programs, publications, and event management. irle has faculty affiliates, eight program units with strong emphases on community outreach and policy analysis, and a top-ranked scholarly journal, industrial relations: a journal of economy and society. both receive funding from the state and university of california and also oversee sponsored research funded by private foundations and government agencies. the populations they serve are diverse and scattered across the uc berkeley campus.

the libraries offer similar services and share enthusiasm for web-based outreach. both manage special collections and provide reference and research support to their extended communities, which number about in both instances.
they also serve extended patron communities beyond campus that include the public policy sphere, the human resources and labor sphere, and the general community. both libraries volunteered to manage web services for faculty, students, and staff during the web’s infancy, circa – . at first, web development was an add-on to existing work, but long-term success in providing web services eventually resulted in staff increases. igs employs a digital services librarian, while irle has built a web team consisting of two applications programmers and the library director.

igs: matching research, analysis, and current information

igs serves an extended community of elected officials and public policy professionals along with its academic community. it provides research support to the public policy institute of california, and thus the library is highly attuned to the world of electoral politics and policy initiatives. igs launched its web services by developing and maintaining static informational pages, but it quickly explored a number of interactive services that could operate as information clearinghouses. the following web projects illustrate igs’s delivery of web services to its clients and program units.

california policy inbox

the california policy inbox used a blog-style interface to aggregate news about the many initiatives that typically drive california politics. the inbox offered a one-stop location for rapid updates on policy, legislative, and electoral news and was aimed at meeting both academic and political needs.

california political blogs

this blog archive uses the web archiving service developed by the california digital library in conjunction with the library of congress. political debate lies at the heart of the blogosphere, and political bloggers have gained outsized clout in the political process.
however, this state of affairs could evolve with time; the california political blogs archive will keep a permanent record of discourse that future scholars will find useful.

californiachoices.org

californiachoices is a joint venture including uc berkeley, uc san diego, and next , a non-partisan education organization. it pools the deep talent and awareness of faculty and affiliated experts in analyzing the movement and dynamics of public policy and aims to provide accessible commentary to all levels of readers (see figure ). the igs library, long known for publishing incisive non-partisan guides to california ballot initiatives online, was recruited to contribute this content to the californiachoices site. library staff designed the layout for site pages devoted to the ballot initiative guides and developed a highly interactive endorsements table that allows users to share their opinions via e-mail or social networking sites.

figure the igs library publishes ballot initiative resource guides on the californiachoices site. (color figure available online.)

oclc contentdm

using oclc’s digital archiving tool, igs has developed an archive of government e-resources that are found through library catalogs (http://www.oclc.org/contentdm). this service seeks to keep a persistent record of documents that are available online for a limited time, making them accessible to scholars for the long term (see figure ). with oclc’s web harvesting tool, igs archives specific policy and planning documents from municipal and county government web sites in california and catalogs them for discoverability through both worldcat and the contentdm interface.
figure using contentdm, igs has built several collections of california local government documents, harvested from web sites. (color figure available online.)

escholarship repository

the escholarship repository is a system-level publishing service adapted by the california digital library in conjunction with the berkeley electronic press. escholarship provides a centralized repository that eases the discovery process for faculty publications that are either pre-prints or that were never aimed at peer-reviewed venues.

irle: comprehensive web publishing and program outreach

as an institute, irle has long been concerned with supporting faculty research that has both applied and theoretical applications. this dual approach was strongly influenced by the state of california’s early interest in studying labor-management cooperation in the aftermath of world war ii. irle addressed its research to management groups, labor groups such as unions, and the global community of economists, sociologists, and others who studied the nature of work and the workplace. its scholarly journal spent most of its years of publishing as either the first or second-ranked journal in the field. irle’s long-term success has depended upon creating a lively intellectual zone of common ground for the free flow of ideas, despite the often fractious nature of labor relations.

the irle library identified the need to spread irle’s research output using new media, which began with the digital publication of papers, policy briefs, grant results, and topically driven study of current events. web development often dovetailed with desktop publishing, which created an impression of the library as the digital publisher for irle’s faculty and member programs.
the library web team has frequently designed publications for print and digital dissemination and has involved themselves in strategic planning initiatives as a result of web service delivery. the following highlights illustrate the mission-critical nature of web outreach at irle and how library skills advanced the success of the institute itself.

research reports, policy briefs, and proceedings

the library web team determined the most valuable and distinctive content irle had to offer was its faculty research and the reports and publications of its program units. from the inception of the irle web, links and architecture emphasized this content and spurred high traffic. in its early years ( – ), the irle web received the highest traffic of any research unit reporting to the vice chancellor for research. the importance of policy briefs became very clear when the center for labor research and education published a series of briefs covering “big box” employers, including walmart. during the – academic year, downloads on these briefs exploded; when a new brief was announced and appeared, traffic exceeded , downloads in one hour and averaged more than , per week for months. the research gained national attention and was featured on public television. walmart’s legal counsel protested, although in the long term they withdrew their complaint as the research about the impact of low-wage jobs was substantiated by several other researchers.

local online journal creation

california public employee relations is the leading legal magazine about public employees in california and has a loyal following among attorneys, arbitrators, and government employees. in fall , the library web team created a fully-operational online journal, using the wordpress platform, to take the place of the print version. the journal architecture supports subscription management, e-commerce, and editorial support. the wordpress application resides within the larger irle web site but is visually distinct (see figure ).

figure cper online is built on the wordpress platform, enabling integrated subscription management and online payments. (color figure available online.)

working papers

using the escholarship publishing tool, the library web team oversees irle’s five working paper series. the library director manages the series, including calls for papers, follow-ups, and usage reports. a faculty editor reviews the papers for suitability. the principal working paper series sees more than , monthly downloads.

mass digitization

the irle web team received funding from the university of california labor and employment research fund to digitize the proceedings and publications of the california labor federation, afl-cio, and then host the resulting collection in perpetuity on university of california servers. a full century of proceedings, legislative voting records, and other publications are available to all web users. the federation agreed to send new proceedings and legislative analysis to the library as they appear on a biannual basis (see figure ).

web archiving

using the web archiving service tools, the library created an archive of the web sites for the afl-cio and change to win, a splinter group of unions that left the afl-cio. the archive will preserve the web output of these two organizations, which might reunite at a later date.

california’s living new deal

professor richard a. walker and author gray brechin developed a strategy to create an online mapping display of all of the artifacts that were created by franklin d. roosevelt’s new deal initiative.
the library web team wrote a gps-driven architecture to allow non-technical staff to enter data, which could then be displayed visually on the living new deal web site using open source gps tools. the regents of the university of california copyrighted the architecture, and it has been used as a template by other states and their universities. once the platform was mature and stabilized, it was sent to the department of geography, where it still resides.

e-commerce

the uc berkeley campus does not offer a single, campus-wide e-commerce solution, so campus departments must develop and maintain their own e-commerce sites. cloud-based services are making this easier, although the responsibilities associated with e-commerce are substantial. the library web team selected a commercial product ( shoppingcart; http://www. shoppingcart.com) and developed a php program that links the cart to cybersource, a pci-compliant credit card handler (http://cybersource.com). the payment card industry data security standard requires that no credit card or identity data can touch a uc server, and the overall system is overseen by the university cashier’s office.

[figure: the california labor federation repository utilizes the metadata encoding and transmission standard and is fully searchable. (color figure available online.)]

web-enabled registration

prior to the emergence of efficient, cloud-based online registration programs, irle staff managed multiple concurrent registration processes with spreadsheets and other desktop programs. using php, the library web team created a form-and-database application that piped registrations directly into spreadsheets, saving staff a significant amount of time.
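the form-to-spreadsheet idea described above can be sketched in a few lines. the original application was written in php and its code is not reproduced in the article, so the python sketch below is only an illustration under stated assumptions: the field names (`event`, `name`, `email`), the helper `append_registration`, and the csv layout are all hypothetical and are not the irle implementation.

```python
import csv
import os
from datetime import datetime, timezone

# Hypothetical column layout; the article does not list the real form fields.
FIELDS = ["submitted_at", "event", "name", "email"]
REQUIRED = ("event", "name", "email")


def append_registration(csv_path, form_data):
    """Validate one submitted form and append it as a row to a CSV file.

    The CSV file can be opened directly in a spreadsheet program, which is
    the "piping registrations into spreadsheets" step in miniature.
    """
    missing = [f for f in REQUIRED if not form_data.get(f, "").strip()]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    row = {
        "submitted_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "event": form_data["event"].strip(),
        "name": form_data["name"].strip(),
        "email": form_data["email"].strip(),
    }

    # Write the header row only when the file is new or still empty.
    write_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

a web form handler would call `append_registration("registrations.csv", submitted_fields)` once per submission. appending to a single file keeps each event's registrations in one place that staff can open without any database tooling, which is one plausible reading of the time savings the authors report.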
epub file creation

irle publishes books as well as a wide array of policy briefs and reports. as the epub open source format becomes more ubiquitous, the team is developing plans for managing locally created e-books, including local e-commerce sales and sales by third parties, such as amazon.

national health care calculator

the library collaborated with irle’s center for labor research and education to develop a national health care calculator (http://laborcenter.berkeley.edu/healthpolicy/calculator). under the affordable care act, beginning in , many individuals and families will be eligible to receive subsidized coverage in health care exchanges if they are not eligible for medicare, medicaid, or the children’s health insurance program, or not offered affordable coverage through their employer. the national health care calculator allows web users to enter family income, family size, and age of youngest adult to receive estimates of the amount eligible individuals and families will spend on premiums and maximum out-of-pocket costs for an exchange health plan under the law. the calculator is a mix of algorithms and php coding and has attained copyright under the regents of the university of california. it is being evaluated for acquisition by a number of for-profit health care providers. if acquired, it will generate royalties that would flow to the regents, irle, and the authors of the calculator and its underlying php code (written by library staff).

discussion

it is important to note that many other libraries—notably public and special libraries—are experimenting with similar approaches to develop digital publishing programs (hill ).
these two case studies are distinctive in that they demonstrate how user communities can be effectively leveraged as “clients” and that libraries have the expertise to provide digital publishing services that meet their needs. the two libraries have been flexible in how they support their parent organizations and tailored their strategies to encompass a range of activities beyond the creation of files, web pages, or e-books. they launched their outreach early in the web era, and as a result library staff developed reputations as the “go-to” team for publishing solutions. in addition, the case studies illustrate how specific local research interests generate new sources of information, and this in turn creates opportunities to test the idea that digital publishing can be a core competency.

the two libraries have also balanced local activity with already-existing system-level ventures at the university of california. they provide scholars with additional tools to share scholarship and analyze policy and research projects rapidly, reaching a wide audience. at the local level, digital publishing has strengthened relationships with user communities, solidifying the library as an important contributor to scholarly communications. local services also help to introduce the faculty to system-level services such as repositories and data curation and management.

the case studies imply that the benefits of digital publishing as a new core competency fall into two principal categories. first, the web is the public face of the organization and therefore central to every aspect of its work. managing digital publishing for the faculty, affiliated centers, and programs pushes the library directly into the workflows of its clients. this raises the public profile of the library as a digital publisher and content creator.
from this vantage point, library staff can offer organizational advice on how to manage projects, comment on the direction and pace of digital innovation, and act as experts in academic uses for rich media. if successful, taking this role can boost support for library services, particularly with respect to adding technologically savvy staff. digital publishing also frees center and program staff to focus on their core mission and leverages web skills so more innovation is possible. the demand for service at irle was heavy from the beginning, and as a result, irle’s program units agreed to help fund a second full-time programmer to meet the demand.

the second benefit is a rise in status. the importance of digital publishing to the institutes’ core goals has a halo effect insofar as other library work, such as digital reference and electronic collection development, has increased status in the eyes of the faculty. at igs, high-level aggregation of data is an indispensable element for conducting research, and the library is the solution lab. at irle, oversight of working paper generation, the rollout of e-commerce, and a variety of publishing projects have reinforced the peer status of the librarians with the faculty. library staff are centrally involved with the scholarly journal industrial relations: a journal of economy and society, assisting with marketing, editorial planning, and designing. the high traffic on all digital working paper series, which averages about , per month, has also increased faculty interest in new digital publishing avenues. irle’s library developed and licensed a full-blown architecture for local history-making about the new deal, and this gained national attention among historians. this rise in status is not only welcome as a form of recognition, but it can also lay a foundation under library services that can protect them during the cyclical expansions and retractions of budgetary support.
the challenges are associated with workflow patterns and thus are more predictable in their nature, but they must be explicitly understood if digital publishing is indeed to become a new core competency. first, it is vital to evaluate all new projects carefully to avoid being overwhelmed with ongoing work. second, success breeds demand. non-technical staff may not know how much time goes into managing services such as digital repositories, electronic collections, twitter feeds, blog updates, or php form management; it is easier for them to categorize these functions as simply “putting them on the web.” consequently, user education is a constant challenge. third, it is important to note that the early simplicity of networked information architecture is quickly becoming a thing of the past; the contemporary web runs on a matrix of several scripting and programming languages, all of which must be monitored and adjusted over time. given the situation, staff time naturally drifts toward code writing, debugging, and maintenance, and this can conflict with ongoing activities, such as web site redesigns, content updates, and client contact.

conclusion

as networked information evolves, one can expect increased direct involvement by users in the digital publishing process—as authors, commentators, and as publishers in their own right. this challenges libraries to take on larger technical and teaching roles and to join in collaborative ventures. these two case studies present a strategy to involve the library in digital publishing in ways that extend its overall mission and build relationships. in this respect, the strategy of taking on the publisher role is a response to the rapidly changing conditions by filling a service gap, using widely-available web technologies.
the move to enter digital publishing is a major policy-level decision for individual libraries as well as for library systems and as such is a serious decision to make. however, following the user means following them into content creation too, so there is a strong case that this new kind of work is relevant and important. as digital convergence continues to change education, scholarship, and work, the library profession must evaluate every opportunity to maintain relevance and be prepared to establish new core competencies that are responsive to the changing times.

whether digital publishing becomes a core competency or not, librarians will face two strong social and technological trends that will influence how the profession gauges the options. first, the internet empowers end users—and those who help them. programming trends favor empowering users directly, with tools such as twitter, apps, and blogs. however, users still require support services. librarians can offer support in many ways, and they learn best from users themselves. in an era when anyone can become a digital publisher, those who get it right will gain an advantage. following users into digital publishing is one way to learn what works and what does not.

second, quality still matters. although the internet is certainly an empowering force, it spins out a vast variety of tools, ideas, and applications. there is much attention going to publishing solutions, including the epub open access book publishing platform, new forms of “micropublishing,” and new ways of building collections of material, as seen in medium. library-based initiatives that take advantage of the digital publishing marketplace can make significant contributions to the future direction of web authoring and may be a positive influence on the issue of preserving quality.
digital publishing is a form of following the user in new directions. it is a substantial new mandate, requiring thought and investment of scarce time and resources. as a core competency, it organizes all the activities of content creation under one category, which in itself brings clarity and reveals new potential for library services. beyond the profession, the idea of what publishing is and what it can become is being widely explored; fortunately, the unsettled environment favors bold action and initiative—two qualities that the information professions have embraced.

references

adema, janeke, and birgit schmidt. . “from service providers to content producers: new opportunities for libraries in collaborative open access book publishing.” new review of academic librarianship (s ): – . doi: . / . . .
battelle, john. . “toward a new understanding of publishing,” part . accessed january , . http://signal.federatedmedia.net/toward-a-new-understanding-of-publishing-part- /.
brown, laura, rebecca griffiths, and matthew rascoff. . “ithaka report: university publishing in a digital age.” http://www.sr.ithaka.org/research-publications/university-publishing-digital-age.
burich, nancy j., anne marie casey, frances a. devlin, and lana ivanitskaya. . “project management and institutional collaboration in libraries.” technical services quarterly ( ): – . doi: . /j v n _ .
crawford, walt. . “micropublishing.” online ( ): – .
delserone, leslie, julia kelly, and jody kempf. . “connecting researchers with funding opportunities: a joint effort of the libraries and the university research office.” collaborative librarianship ( ): – .
detlor, brian, and vivian lewis. . “academic library web sites: current practice and future directions.” the journal of academic librarianship ( ): – . doi: . /j.acalib. . . .
enis, matt. . “to remain relevant, libraries should help patrons create.” the shifted librarian, may .
http://www.thedigitalshift.com/ / /ux/to-remain-relevant-libraries-should-help-patrons-create/.
evangelista, benny. . “a boost for collaboration: medium, from twitter backers, tries to improve online publishing.” san francisco chronicle, august .
fagan, jody condit, and jennifer a. keach. . “managing web projects in academic libraries.” library leadership & management ( ): – . http://journals.tdl.org/llm/article/view/ / .
fister, b. . “trade publishing: a report from the front.” portal: libraries and the academy ( ): – . doi: . /pla. . .
frumkin, jeremy, and terry reese. . “provision recognition: increasing awareness of the library’s value in delivering electronic information resources.” journal of library administration ( – ): – . doi: . / . . .
gilman, isaac, and marita kunkel. . “from passive to pervasive: changing perceptions of the library’s role through intra-campus partnerships.” collaborative librarianship ( ): – .
hahn, karla. . research library publishing services: new options for university publishing. washington, dc: association of research libraries. http://www.arl.org/bm∼doc/research-library-publishing-services.pdf.
hawkins, kevin. . “the library as publisher: creating scholarly resources.” presentation to visitors from tianjin, people’s republic of china, april . http://hdl.handle.net/ . / .
hill, nate. . “a two part plan to make your library a local publisher.” the pla blog: official blog of the public library association, february . http://plablog.org/ / /a-two-part-plan-to-make-your-library-a-local-publisher.html.
hope, charity b., and christina a. peterson. . “the sum is greater than the parts.” journal of library administration ( – ): – . doi: . / j v n _ .
howard, jennifer. . “for new ideas in scholarly publishing, look to the library.” the chronicle of higher education, february .
http://chronicle.com/article/hot-off-the-library-press/ /.
jöttkandt, sigi, john willinsky, and shana kimball. . “the role of libraries in emerging models of scholarly communications.” paper presented at lianza, christchurch, new zealand, october . http://openhumanitiespress.org/jottkandt_ - - _lianza.pdf.
koerber, jennifer. . “the makings of maker spaces, part : espress yourself.” the shifted librarian, october . http://www.thedigitalshift.com/ / /public-services/the-makings-of-maker-spaces-part- -espress-yourself/.
liu, shu. . “engaging users: the future of academic library web sites.” college & research libraries ( ): – . http://crl.acrl.org/content/ / / .full.pdf+html.
mitchell, erik. . “the organizational role of web services.” journal of web librarianship ( ): – . doi: . / . . .
motin, susan hubbs, and pamela m. salela. . “a liaison model for integrating the library, it, web, and marketing teams.” technical services quarterly ( ): – . doi: . /j v n _ .
oakleaf, megan. . value of academic libraries: a comprehensive research review and report. chicago: association of college and research libraries. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/value/val_report.pdf.
o’brien, linda. . “the changing scholarly information landscape: reinventing information services to increase research impact.” elpub —conference on electronic publishing, helsinki. http://hdl.handle.net/ / .
reynolds, rob. . “international digital publishing forum, november election: candidate statement.” accessed january , . http://idpf.org/board-elections- – /candidate-statements/reynolds.
riley-huff, debra a. . “web services as public services: are we supporting our busiest service point?” journal of academic librarianship ( ): – .
rosemont college. . “graduate publishing program.” accessed october , .
http://www.rosemont.edu/gps /graduate/academics/publishing/study.php.
schwartz, meredith. . “auto-graphics adds self-publishing tool to library software.” the digital shift, october . http://www.thedigitalshift.com/ / /software/auto-graphics-adds-self-publishing-tool-to-library-software/.
steele, colin. . “scholarly monograph publishing in the st century: the future more than ever should be an open book.” journal of electronic publishing ( ). doi: . / . . .
stoffle, carla j., and cheryl cuillier. . “from surviving to thriving.” journal of library administration ( ): – . doi: . / . . .
tomlinson, shawn m. . “the definition of digital publishing.” accessed january , . http://www.ehow.com/about_ _definition-digital-publishing.html.
walters, tyler, and katherine skinner. . new roles for new times: digital curation for preservation. washington, dc: association of research libraries. http://www.arl.org/bm∼doc/nrnt_digital_curation mar .pdf.
warren, john w. . “the progression of digital publishing: innovation and the e-volution of e-books.” the international journal of the book ( ): – . http://www.rand.org/content/dam/rand/pubs/reprints/ /rand_rp .pdf.
welch, jeanie m. . “the electronic welcome mat: the academic library web site as a marketing and public relations tool.” the journal of academic librarianship ( ): – .
wikipedia. . “electronic publishing.” accessed january , . http://en.wikipedia.org/wiki/electronic_publishing.

why open science?
elena giglia
this work is licensed under a creative commons attribution-sharealike . international license.
photos are mine, available for reuse on flickr, https://www.flickr.com/photos/eg /albums/
elena giglia, messina, nov. , elena.giglia@unito.it @egiglia
http://creativecommons.org/licenses/by-sa/ .
/

take away messages
open access/open science are opportunities, not threats
open science and open innovation are connected
open science: a different way to do science, not a set of rules
…take open science «one step at a time»
…the opposite of open science is «bad science», not «closed science»
…barriers are social and cultural, not technical…

open science?
…open science holds a huge transformative potential…
if you don’t focus on its real value, it will be seen as the umpteenth administrative burden …

open science in practice?

open science
https://twitter.com/caal/status/

…thank you for your undivided attention!

please…
…today let’s look at scholarly communication with fresh eyes…

questions
why do you do research?
does it suit you, the way it is?
what’s your shade of «open»?

scholarly communication is complex…
e.giglia, open access, ovvero... aviano, september
research evaluation
new models (sustainability)
costs (real costs – «anelastic market»)
economy (and profits)
technology
preservation
access
rights management (authors, readers, publishers…)
disciplines and their tools (books, journals…)
production
[impact factor]

scholarly communication: functions
roosendaal h. – geurts p., forces and functions in scientific communication: an analysis of their interplay, crisp
journal
registration
certification
awareness
archiving
reward
https://twitter.com/eggersnsf/status/ feb.
, http://www.physik.uni-oldenburg.de/conferences/crisp /roosendaal.html

scholarly communication: processes
submission
peer review
acceptance/rejection
publication
no economic return …expected return: citations, prestige

a blurred environment…
http://figshare.com/articles/ _innovations_in_scholarly_communication_the_changing_research_workflow/

… scholarly communication, today…
the money already is in the system – and we can spend - % …
may
in europe meuro (underestimated)
global , billion ( ) + %
key message / : today reading is not for free … we are paying commercial publishers to lock up our content …
https://eua.eu/downloads/publications/ % big% deals% report% v .pdf

…what does it look like?
jon tennant, open science: just science done right, sept.
a.holcombe, aug.
key message / : there are huge commercial interests (and a huge waste of public money)
nov. ,
https://figshare.com/articles/open_science_is_just_good_science/
https://twitter.com/protohedgehog/status/
https://twitter.com/ceptional/status/
https://twitter.com/alexis_verger/status/ ?s=

…and a bit of monopoly
peter kraker, march
nov. ,
https://zenodo.org/record/ #.xjtt-rh m
https://twitter.com/johanrooryck/status/

«access»?
…the same and the ones that until march were closed behind subscriptions so expensive that harvard can no longer afford…
march : thomson reuters, elsevier, nature open for free all the articles dealing with nuclear pollution
https://twitter.com/jkamens/status/

… if not, sci-hub would not exist
march , may ,
http://www.sciencemag.org/news/ / /whos-downloading-pirated-papers-everyone
https://twitter.com/bernardrentier/status/
https://www.theguardian.com/higher-education-network/ /may/ /scientists-access-journals-researcher-article

[alternative ways to get a pdf]
feb. ,
unpaywall … but it works only if authors self-archive
http://www.openaccess.nl/en/events/alternative-ways-to-access-journal-articles

…does it work?
…average publication time: - months bjork
…growing number of retractions due to falsified/fabricated data fang, casadevall
…in the most «prestigious» journals
…reproducibility crisis p.masuzzo, sept.
…huge delay march
…self-citations + % sept. ,
doi: . /j.joi. . .
http://iai.asm.org/content/ / / .full
https://doi.org/ . /zenodo.
https://twitter.com/mcpievatolo/status/
https://doi.org/ . /journal.pone.

… the system is broken
june
research culture is broken, open science can fix it
https://youtu.be/c-bemnz-iqa

[retractions]
…detected by a phd candidate told to «shut up and write»
march , oct. ,
retractions. if you cut them off, the systematic reviews show increased risk of mortality and renal failure
https://retractionwatch.com/ / / /stem-cell-researchers-investigated-for-misconduct-recommended-for-roles-at-italys-nih/#more-
https://www.repubblica.it/salute/medicina-e-ricerca/ / / /news/noto_cardiologo_piero_anversa_harvard_chiede_il_ritiro_di_ _sue_pubblicazioni- /?ref=rhrs-bh-i -c -p -s .
-t
https://retractionwatch.com/ / / /does-scientific-misconduct-cause-patient-harm-the-case-of-joachim-boldt/#more-

… what about impact factor?
brembs, digital scholarship and open science need a digital infrastructure , nov.
impact factor (year x) = citations in year x to articles published in x-1 and x-2 / total «citable» articles published in x-1 and x-2
j.tennant, barriers for young researchers, sept
http://www.slideshare.net/brembs/digital-scholarship-and-open-science-need-a-digital-infrastructure
https://figshare.com/articles/barriers_to_open_science_for_junior_researchers/

… evaluation? «obsession»
roars, march
may ,
https://goo.gl/p vzas
https://www.roars.it/online/impact-or-perish-lossessione-per-limpatto-delle-pubblicazioni-scientifiche-genera-frodi-e-condotte-abusive/
http://blogs.lse.ac.uk/impactofsocialsciences/ / / /the-academic-papers-researchers-regard-as-significant-are-not-those-that-are-highly-cited/

[your choice if you want to stay in the game, as it’s a dirty game]
springer prospectus apr.
https://twitter.com/jscaux/status/
https://t.co/elpg zfgnk

…a deadly embrace [we are on the wrong road]
sept. ,
• publishing «a result» has become more important than publishing a correct result
• gaming metrics is an occupational requirement for scientists
https://www.nature.com/articles/s - - -
©tom toro, http://tomtoro.com/cartoons/#jp-carousel-
photo: noaa national weather service national hurricane center

…what about a different landscape?

…a bit of inspiration…
the best thing about the internet is that it’s open. in every field it lets us share and innovate. in science, openness is essential. open science doesn’t mean ignoring economic reality. of course we need business models to be sustainable. but that doesn’t mean we have to carry on doing things the way they have always been done.
so, wherever you sit in the value chain, whether you’re a researcher or an investor or a policy maker, my message is clear: let’s invest in collaborative tools that let us progress… let’s tear down the walls that keep learning sealed off. and let’s make science open.
n. kroes, let’s make science open, june
http://www.youtube.com/watch?v= sjbi eapxc&list=pl f be eaef&index= &feature=plpp_video

…another world is possible
p.masuzzo, sept.
https://doi.org/ . /zenodo.

open science
http://opendefinition.org/
c. mac callum, uksg, april
sept. ,
https://www.slideshare.net/uksg/uksg- -breakout-setting-your-cites-to-open-i oc-maccallum
https://twitter.com/openscience/status/ ?s=

open science
foster taxonomy
https://www.fosteropenscience.eu/foster-taxonomy/open-science-definition

open science
tony ross-hellauer, tennant sept.
https://www.slideshare.net/openaire_eu/peer-review-in-the-age-of-open-science
https://twitter.com/protohedgehog/status/

open science
sept. ,
https://twitter.com/protohedgehog/status/ ?s=

open science and sdg
interview, min
keynote h
https://www.youtube.com/watch?v=z dmww tvii
https://www.youtube.com/watch?v=bv wbpciki
aug.,

open science
https://doi.org/ . /febs.

… a non-dialogue
https://goo.gl/pbylmm

open [collaborative] science
being inclusive
. inclusive open science, sept.
sept. , oct. ,
manifesto
https://www.youtube.com/watch?v=mibxpc zrte
https://twitter.com/lerunews/status/ ?s=
https://www.idrc.ca/en/book/contextualizing-openness-situating-open-science
https://ocsdnet.org/manifesto/open-science-manifesto/

open science
p.masuzzo, sept.
https://doi.org/ . /zenodo.

open science rainbow…

open science: roadmap
may
https://www.leru.org/files/leru-ap -open-science-full-paper.pdf

another world is possible?
b.
rentier, https://orbi.uliege.be/handle/ /

… [italy / new players: miur]
national plan open science
working group open science (rectors, researchers, publishers, librarians, research infrastructures)
«coordination-strategy»

…another way of doing research
plos
https://t.co/zwc qadzvd
https://journals.plos.org/plosmedicine/article?id= . /journal.pmed.

…another way of opening up
register the grant proposal
pre-register your study
https://osf.io/registries/
https://aspredicted.org/

…another way of being reproducible
r.ainsworth, sept.
https://the-turing-way.netlify.com/introduction/introduction.html
https://doi.org/ . /zenodo.

…another way of assessing
multiple criteria
b. rentier, june
consider signing dora!!!
https://sfdora.org/
https://indico.cern.ch/event/ /contributions/ /

…you need publications… so, open access
self-archiving
publishing

[houston, we have a problem]
open access in italy (perception)
- journals only
- always paying for publishing
- always predatory publishers
march ,
https://peerj.com/preprints/ /

why do we need open access? [or: where does the money go?]
corina logan, https://osf.io/sy f /

why do we need open access?
corina logan, http://bulliedintobadscience.org/
https://osf.io/sy f /

[a call: plan s]
…is a transition lasting years still a transition? or is it more a «further exploitation»? we need radical and robust actions
july , min. . - . and . - .
plan s is a way to force the system to change
https://www.youtube.com/watch?v=typfnriezwo

[plan s]
sept. ,
• reactions
• debate
• no more hybrid journals
• capped apcs
• when [and only when] apcs are due, the institution pays
• authors retain copyright via cc by
out of journals with apcs %
revised in feb.
postponed to jan
https://www.scienceeurope.org/coalition-s/
https://unlockingresearch-blog.lib.cam.ac.uk/?p=

[the biggest inhibitor is the system itself…]
march th,
https://www.researchresearch.com/news/article/?articleid=

[transformative agreements]
rome, feb.
https://oa .org/b -conference/
https://www.oa.unito.it/new/wp-content/uploads/ / /transformative-agreements-come-e-perch%c %a - -pdf-optimiert-edited.pdf

… another way of writing
https://www.authorea.com/
https://hypothes.is/
https://www.overleaf.com/
http://thepund.it/
http://help.osf.io/m/projects

…another writing /
https://www.qeios.com

… not only articles…
may,
preprint and open notebook
preprints added value:
• immediate publication
• scientific priority
• no post submission uncertainty
• focus on the content, not on the venue
https://doi.org/ . /journal.pcbi.
http://jupyter.org/index.html

… another peer review
https://f research.com/articles/ - /v
• reviews are «pieces of knowledge»
• they get a doi
• they are citable
• they should be evaluated as research outputs

opr in practice
poschl
https://www.frontiersin.org/articles/ . /fncom. . /full

…not only texts…
you can deposit data, software, images, posters, protocols, workflows…
https://github.com/
https://zenodo.org/
https://www.protocols.io/

… fair data…
f a i r
metadata, persistent identifiers… trusted repositories, formats, ontologies, standards, licenses and documentation
fair guide, nature, march
to know more: https://vidensportal.deic.dk/rdmelearn
to know how: https://www.nature.com/articles/sdata

[shades of fair]
https://www.ands-nectar-rds.org.au/fair-tool
training sept.
, fair maturity
https://www.ands-nectar-rds.org.au/fair-tool
https://www.ands.org.au/working-with-data/fairdata/training
https://doi.org/ . /s - - -
https://fairsharing.github.io/fair-evaluator-frontend/#!/

[as now we have the eosc!]
vienna, nov. ,
seamless access to open by default fair data
https://eosc-launch.eu/fileadmin/user_upload/k_eosc_launch/eosc_vienna_declaration_ .pdf

…enabling services
possible only if authors self-archive in open access
pubmed linkout
downloads . downloads may - aug. [ average]
https://openaccessbutton.org/

…enabling text and data mining
journalists
industry (bayer)
…you can’t separate commercial/non commercial
https://twitter.com/libereurope/status/

…going for a new discovery
https://openknowledgemaps.org/

…[and above all, keep your rights!!!]
all rights reserved
some rights reserved
don’t give away your rights
https://creativecommons.org/choose/?lang=en

open science???
https://www.fosteropenscience.eu/toolkit
http://openscienceguide.tudelft.nl/

open science: messages
science was founded on openness. we closed it down. it’s time to open it up again.
j. tennant
oct. ,
https://figshare.com/articles/how_to_foster_a_community-led_cultural_shift_towards_open_scholarship/

…whose side are you on?
[when the wind of changes blows, some people build walls, some people build windmills]
…what about you?

thank you!
transcript: laura k.
morreale, distant gatherings: a test-case for digital manuscript collaborations. delivered for the th annual (virtual) schoenberg symposium on manuscript studies in the digital age, manuscript studies in the digital covid- age, november - . all right, so thank you very much, and a special thanks to lynn ransom and everyone at the schoenberg institute for all the hard work that's gone into organizing this conference. i'm going to start my talk, "distant gatherings: a test case for digital manuscript collaborations," with a slightly personal statement that i know will resonate among many of you. very simply, i've missed seeing you all. for reasons we all understand, our plans for projects, for meetups, and for face-to-face exchanges have been eclipsed by this covid moment and have kept us from seeing each other in the ways that we all rely upon, in the ways that nourish our scholarship and our intellectual lives. and yet, as we've all discovered, life goes on, and we have all learned how to cope and maybe even to flourish within this unexpected moment. what i'd like to suggest today is that, due to this very dramatic change in how we're living our lives and the expectations we continue to bring to our scholarly work, we've begun to see this world differently, to imagine ways of using our skills as medievalists that benefit from the digital tools we have at our disposal and our newfound proficiencies with them.
i came to this conclusion after making two observations. first, that the zoom-only conferences which have become our default simply replicate older forms of in-person interaction without allowing us to profit from the real benefits that face-to-face meetings allow. this is why, despite all the hard work and effort that go into organizing these events, there remains this sense of dissatisfaction when we all individually turn away from our screens at the end of a presentation and go back to our lockdown lives. we all know that zoom fatigue is real. and yet, secondly, it's not the medium that is the problem, only the way we've chosen to use it. we only need to recognize that people still, even in lockdown, willingly spend a lot of time communicating online, on twitter and other social media platforms for example, to understand that there's some value to be had in digital engagement. so how do we harness the power of digital interactions for scholarly work without sacrificing the important exchange that occurs in more traditional formats? the challenge in my mind is less to build new tools or structures to support the distant gatherings we now attend than to reconfigure what we currently have to capitalize on the unique characteristics of the digital medium. and what are these characteristics? computer-enabled work relies upon its own strengths, including the ability to disseminate information efficiently, to collate and visualize data too great for humans to manage on their own, to virtually bring together items that are physically separate, and to manipulate and modify sources without causing harm to the originals. distanced scholarship that relies upon the online world capitalizes on the very qualities we have come to expect in our everyday communicative practice. it's dynamic, user driven, interconnected, and responsive to visitors, all characteristics we should first identify and then use to our advantage when we undertake digital scholarship.
now i have found that time-bound digital events that rely heavily on the dynamic quality of virtual exchange, that promote transparent and public-facing communities, and that circle around a circumscribed set of questions or problems, much like a facebook post that poses a thorny question and then elicits a lengthy response, have been very successful ways to bring people together to engage in online scholarship. now what does that mean in real life, how does that work out? today i'll be profiling three different manuscript-based projects that have capitalized on the qualities of online communication i've just outlined. these include the deiphira translation project, the la sfera and image du monde transcription challenges, and the pelerinage de damoiselle sapience transcription event which amy just mentioned, which is taking place right now as we speak. all three of these projects rely upon the same basic workflow, but they unfold over different time spans. the pelerinage event, as we know, was engineered as a three-day sprint, the transcription challenges are styled as two-week efforts, and the deiphira stretches over a longer period of time but still requires a relatively minimal weekly time commitment from project participants. in all of these projects we began with a manuscript or a set of manuscripts that had been digitized so that the images are freely accessible to anyone. once we located the desired manuscripts for each of these projects, we loaded images into a transcription platform called fromthepage so that everyone involved in the project could access the manuscript and type up a transcription of the work directly in the transcription pane. now in transcribing, or in the case of images describing, what we see in the digitized copy, we scrutinize the copy of the medieval work, we create our own 21st century version of that text, and we record our observations of the manuscript or the text along the way. once the transcription is complete we're then able to
export it in a text file and create any kind of final product that we would like. the tech part here is so easy as to be almost a non-issue, which i think is an important part of what has made these projects effective. these then are the mechanics of these online projects: digitized manuscript, to transcription platform, to exported text, to final product. but of course the machines and digitized images are not doing the scholarly work that needs to get done. it's people who do that. we must also recruit and organize participants in the project, then provide the framework for how the work will proceed. naturally the framework for how the work gets done varies according to the project parameters, so i'm going to start to talk a little bit about the deiphira translation project, centered around the typ manuscript housed at harvard's houghton library. the deiphira is a minor literary work by the well-known humanist author leon battista alberti, and it's been identified as the first dialogue on love in italian vernacular literature, so a work of some consequence. the dialogue is about a woman named deiphira, and the conversation takes place between palimacro, the man who loves her, and filarco, his interlocutor, who claims to be wise in the ways of love and especially of women, but we're all sort of doubtful of that at this point. if you look closely at the image on the frontispiece of the houghton typ manuscript, which i have reproduced here for you, you can see those three figures, presumably the characters we meet in the dialogue. the deiphira is the only text in the folio manuscript, and it's written in a nice clear humanist script. so starting in april of this year a group of nine scholars, who i like to call the deiphira, began a collaborative project to transcribe and translate the digitized version of the manuscript.
the impetus for the project was largely selfish. as the grip of lockdown tightened, many of us were feeling isolated, and we were looking for ways to connect with other scholars, to continue to research, and to share our expertise. so the deiphira decided to meet up using the tools that the digital world now offers us: zoom meetings and shared google docs, a wordpress website to collect our materials (you can see the address at the top of the slide if you want to check it out), and then the digitized-manuscript-to-keyboard-text workflow that i just outlined for you. as we have transcribed and translated this work into english, we have come to enjoy and to really capitalize on the digital environment as we build our working community of friends centered around this text. we meet twice a week on zoom for an hour each time, first to do a very close analysis of the manuscript and the version of the text found in it, and then to produce an english language translation, and i'm pretty strict about keeping it at just one hour so people know what they're in for and that the project is not burdensome. during these meetings we've already completed a full transcription of the manuscript, and just this past week we completed the first run through our english translation. our next steps will be to do another read through and to normalize our translation, but these things will all be done collaboratively, which means we benefit from everybody's expertise and perspective. in terms of the research products that have come out of our efforts so far, we've put together a collation diagram using viscoll (so thank you, dot and, uh, alberto), we've isolated various hands within the manuscript and the corrections made to it, we've discovered that the typ is actually extracted from a much larger manuscript, and we've posted some of these findings in a preliminary project at digital mappa that brings together the manuscript images, our transcription, our translation, and the collation diagrams.
we hope then to gather all of these materials into a digital site housed at georgetown university and add a collation of our edition to the standard print edition from the s. our georgetown-hosted edition will also include annotations, a manuscript recension, and supplementary essays on the text and our methodology, and we hope to have this done in the next six months or so. now the second project, or rather group of projects, i'll talk about is a series of two-week manuscript transcription challenges that stanford rare books curator benjamin albritton and i organized and staged over this past summer and fall. to date there have been three of these transcription challenges, the first featuring multiple manuscripts of goro dati's th century geographic text called la sfera, and the most recent, which just finished up in october, treating the first half of the first recension of gossuin de metz's mid th century scientific treatise, the image du monde. all of that information is there for you. now the inspiration for the first challenge, a competition to transcribe versions of dati's work, was the result, quite honestly, of just too much time on twitter. in the earlier days of lockdown in the us, ben and i were lamenting our mutual feelings, once again, of isolation, and we came up with this idea to connect with other scholars who might enjoy gathering around and engaging with the text through the act of transcribing it collaboratively. we identified la sfera as a good candidate since we could easily locate digitized versions of the text at yale, the arsenal in paris, and the vatican library. we loaded the images of each of these versions into stanford libraries' instantiation of fromthepage, and then we created a quick wordpress website where we collected all the materials that we would need for the challenge and then set out the terms of the competition.
we created team pages with links to the transcription platform, a project log, and a link to the rules and guidelines so everyone sort of knew what they were up to. through our personal networks and the magic of twitter we were able to build three teams of scholars each: team usa, equipe france, and squadra italia. the rosters were filled and a group of team captains put in place about a week before the competition began, at which time each team would have two weeks to complete their transcriptions and submit them to the panel of three judges. they assessed each submission according to speed, accuracy, and collaborative participation. and once this first competition began officially, the work was really fast and furious. within about hours of the start time, roughly half of the transcription work had already been done and teams were moving on to the review stage. ben and i, along with many other competitors, were constantly tweeting updates to track our progress or to highlight what was discovered in our manuscripts, whether it was a strange writing style or an unusual map or even a party of dragons who were hanging out in a misidentified tower of babel. during this first day of competition, that first hours or so, our tweets attracted a lot of attention, from art historians, from librarians who had a copy of la sfera in their own repositories, to fellow medievalists who were just interested in what was going on, and dh scholars who were really intrigued by the competition's framework. after the first flush of excitement the teams began to buckle down and engage in the nitty-gritty of producing a clean transcription to submit to the judges.
now to do this, teams used several forms of digital communication, including slack for inter-team communication, twitter to speak more publicly, and then the challenge team pages, where members would post observations, some research, and some substantive findings. the first event was so successful and the demand for a second phase was so strong that we brought in five more copies of la sfera and ran a second transcription challenge in late july, meaning that at the end of the second two-week challenge we had eight full transcriptions of this text from eight different manuscripts. along the way our transcription work inspired a deep engagement with the text itself on the part of our transcribers, who produced some amazing scholarship, like this blog post authored by team captain carrie benes comparing the different scripts across the manuscripts, and you can still read carrie's small essay on the team vatican page of the la sfera website. we've also cataloged and made public all the data, all the transcriptions created during these events, so that other scholars can learn about the challenges and use our transcriptions in their own work. now one of the greatest points of pride, for me at least, over the course of these challenges is that all told somewhere between and medievalists have participated, each bringing his, her, or their own expertise and enthusiasm to the effort in their own ways, and i know a lot of transcribers are out in the audience right now listening. it's been really gratifying to see our community come together through these projects, and i hope that other scholars will come forth who want to adopt this model and propose their own texts. and speaking of other texts, i'll just take my last few minutes to talk about the pelerinage de damoiselle sapience transcription event that's going on as a part of the schoenberg conference.
the task for the event participants, just as we have mapped it out on the event website, is to transcribe the ten folios of the pelerinage de damoiselle sapience, a previously unedited work from upenn's ms , then to prepare a set of rules for the transcription we make (that's today's task), and then to create a narrative section to introduce our methodology and the work itself. our goal is to prepare all of these scholarly products, bring them together into one document, and then submit our work to the digital medievalist for possible publication in the journal as one of their methods articles. for this event we've chosen the metaphor of the relay race, and this is the spirit that has been brought to the work since yesterday morning. we have three teams working together, passing the baton from one to the next. team transcription began the event by transferring the text from the digital images of upenn's ms to the transcription pane and then passing the baton off to team revision, today's team, who's now reviewing the version created by the first leg of the race and creating their transcription statement that sets out the rules. tomorrow team submission will take us all the way to our finish line of a fully transcribed text with a transcription statement and a small narrative on how we got our work done. now in a relay race you watch your teammates intently, you support them and encourage them when you can, and you build the strategy for your own participation based on what they accomplish. so with this ethos in mind we've been watching intently what our teammates have been doing as they complete their leg of the race, cheering them on through channels like twitter or facebook, for example, and strategizing for our chance to take the baton.
this event then is a test case for a digital methodology that brings together transcribers from near and far, as mapped on the damoiselle sapience participant map, and it's an experiment with different ways that our workflow might function on a very limited timeline. so to wrap up, i think the elements that make these events work so well are the following: there's a well-defined time period that allows participants to know what they're in for, the easy tech allows for a very low barrier of entry, and i think they both promote and rely upon communication and community. these are all important aspects. so where do we go next? these projects have tested the waters of our digital scholarly practice, and they represent but one way to gather together around manuscripts while at a distance from the work and from each other. while virtual manuscript work will never replace interacting with the material objects, that's for sure, we can enjoy these distant gatherings for now, knowing that we're engaging in real scholarly work even as we look to the day that we can finally map out new journeys to unite with these materials and with each other. now i started with a personal statement and i will end with one as well. like many medievalists who followed a non-traditional career path, who could never displace themselves to spend extended periods of time in the archives or take up visiting teaching positions, i found that the shift to online work was really a comfortable one for me. in fact i had already spent years learning how to work at a distance when our lockdown lives began. certainly postcovid scholarship will look different than what has come before as medievalists recover from this moment and we are forced to recalibrate our scholarly methods, as i'm sure we will.
i encourage the community to look especially to those scholars with non-traditional backgrounds who may already have navigated some of the challenges that will face us as we move beyond this pandemic moment. and i'll stop there, and thank you very much for your attention. thank you very much. i feel as though we have talked a lot about how we can use digital methods to bring scattered manuscripts together, but it's really heartening to hear about using the tools to bring scattered scholars together. so now we will take time for questions on this talk. we already have one question for laura. federico botana, do you want to ask directly? if you could turn your video on, i'll let you take it away. yes, done, thank you. i have worked on la sfera and i had a lot of problems working with the printed editions, which i think is more or less the early s edition which was re-edited in the th century, and there's so much difference in manuscripts. it's still the same text, we can see where the errors are, and i think there's this big need for a digital critical edition. with what you've been producing, is that a possibility? i think it would be very helpful for everybody.
well you know at this point we're kind of taking the the approach that um each manuscript is is its own product and there's there's an amazing joy um i think that the participants felt in engaging so deeply with all these variations and then i mean it's really great to be able to see all eight of them up and see where things don't match up and um you know even the sort of adjustments that the scribes were making right as they were um working through these you know different variations and copying and so we're just i mean it's such a huge tradition i think there are over um extant versions um so it would be an enormous amount of work if someone wanted to undertake it and you and i should talk um but um yeah but for the moment it's more um you know giving sort of honoring those those individual witnesses um so but we'll see where the project goes thanks yeah it would be great to see i’d love to see it because i must say i have to struggle a lot with some of sometimes it's a sound only that makes um things change but anyway i stopped there thank you very much for watching absolutely. 
uh all right i see catherine chandler has a question hi yes um i pers participated in the most recent transcription project which was very challenging and a lot of fun um and so i’m i’m not sure if it's important to have a critical edition come out of the project or what but um somebody who's more of a liturgist and somebody who works more with music manuscripts um is it possible that there might be a project involving liturgical texts or music texts ones that involve more latin latin west liturgical things um thank you so much and thanks for your participation i will just tell you too that the judges for the image du monde got in touch with me last night and there is um there is a decision so that will come out very soon um but in terms of what comes out and what sort of challenges appear later um what i really hope and i know um ben albritton and i agree on this is that we hope that people will take the model and will run their own challenges because frankly it's a lot of fun but it's a lot of work right and so and you know to tell you the truth i’ve learned so much doing them about those particular texts and so i think it's a really great way both to advertise the text that you're interested in in the work that you've done on it right but also to you know bring that expertise it's it's i never could have imagined what people you know what they brought to the challenges it's it was so amazing um the work and the the genie, right that people brought so thanks it was great fun thank you so much for all the hard work you do it's a lot of fun thank you are there other questions from the audience if not i have one and so uh i’ll go ahead and ask it and if anyone has another question they can write it in um i just kind of following up on that um and also the lovely visual of the map with people all over the world participating um i wonder like how much you thought about like how this pro this kind of project might scale up i’m not sure it needs to scale up but you know 
how do you make it larger how do you make it how not you but how does one make it larger how does one make it more inclusive um and what what would be the challenges if there are any other than just the fact that it's a lot of work for one person to manage like are there ever platforms to put this out on or there probably are other platforms i mean you know part of the reasons that we use the tools that we did is because they're um almost universally accessible and you know people you know most everybody can get onto a google doc um so you know that we relied on on those tools in that way in terms of how might one might scale it up um certainly you could do more delegation um you know my role as um the coordinator of these events um particularly la sfera and the image du monde was really just to absorb any problems that would come along and try to fix those things and so i’m sure if you were to have you know three people doing that job um that that could be done as well um i mean i haven't really thought making it bigger i you know i was i was overwhelmed that plus you know scholars were participating in each of these things it's um it was great so sorry lynn no i mean i i mean i think it's fantastic i don't mean to suggest that it needs to get bigger or anything but it just seems like such it's such a good idea and you could really just get so much new data out there right anyway well congratulations it's it's been fun to watch um okay so one person has another question has come in uh sarah savant you want to sure i was actually asking sort of the opposite question to you i think or a slightly different question which is how did you get started in the first place and what is the i mean it's quite a lot to get people i think and what is kind of the value that you think brought people to want to do this um you know it strikes me a little bit like hackathons but um that people do in computer science but what you know what what motivates everyone and people don't have 
time i mean you know even in lockdown there's a lot of people who are juggling more not less. i got to tell you sarah i mean i i was amazed that people would do this but you know one of the things that i like to to build into any of these um events is a little bit of fear i mean i think the competition thing gets people going i mean we're all you know polite people but at the end of the day like we want to win right and so um i think there's that little bit that's that that's in there and that's you know i i built that into the damoiselle sapience thing too like we need to be done by friday night right and the challenge is how can you know how can we do that so i think it's the combination of that little bit of sort of thrill fear um and then the limited time span when you say to somebody listen, for the challenges, i would say, listen if you're just doing transcription ,if you're not a team captain or anything like five to six hours ,tops um you get to interact with everybody you get to do something that you really enjoy doing and that maybe you know your ten-year-old doesn't care about who's sitting right um and so it's five to six hours tops um you get to be part of this team interact with people and then two weeks we're done like we're done right and so i think that allows people a little bit of joy in what they're doing. a really strict project timing like knowing and and being and being loyal to that okay and i love the competition idea we all do we all do, yeah i know exactly thank you durham research online deposited in dro: january version of attached �le: accepted version peer-review status of attached �le: peer-reviewed citation for published item: crang, m. ( ) 'the promises and perils of a digital geohumanities.', cultural geographies., ( ). pp. - . further information on publisher's website: http://dx.doi.org/ . / publisher's copyright statement: crang, m. ( ) 'the promises and perils of a digital geohumanities.', cultural geographies., ( ). pp. - . 
© the author(s) . reprinted by permission of sage publications. additional information: use policy: the full-text may be used and/or reproduced, and given to third parties in any format or medium, without prior permission or charge, for personal research or study, educational, or not-for-profit purposes provided that: • a full bibliographic reference is made to the original source • a link is made to the metadata record in dro • the full-text is not changed in any way. the full-text must not be sold in any format or medium without the formal permission of the copyright holders. please consult the full dro policy for further details. durham university library, stockton road, durham dh ly, united kingdom tel : + ( ) | fax : + ( ) https://dro.dur.ac.uk
the promises and perils of a digital geohumanities
abstract this intervention asks to what extent do developments of digital media offer new objects that demand new methods, and to what extent they create new methods that might be applied to older cultural fields creating a digital humanities. it argues that digital media sometimes reanimate older debates and issues not only in what we study but how we do so, and their significance may be less in new techniques than altering the general tools of our trade in cultural geography. the paper looks at both new digital cultures, such as gaming and new converging media, and new methods, be they analysing the data exhaust of digitally mediated social lives, or using new software in literary analysis. profound tensions exist between quantitative imaginaries of a massive stock of texts yielding determinate meanings and deconstructive visions of texts yielding indeterminate and proliferating meanings. big data sits uneasily with big interpretation.
the paper suggests a materialist semiosis is needed to attend to the permutations where new digital techniques may form affective technologies conveying meanings as much as effective analytical tools. keywords: methods, digital humanities, new media, literary geography introduction there has never been a shortage of hyperbole regarding digital media’s always imminent, somehow never arriving, effects on society and the academy, be they from cheerleaders or prophets of doom. and yet there is a remarkable sensation i sometimes feel about the relationship of research and digital devices that everything has changed and nothing has changed. there was a moment doing fieldwork in for me when i looked at the back seat of a hire car and realised there were two digital cameras, one digital camcorder, a voice recorder and laptop. they were all making digital files which i would process and link with a bibliographic database package that in turn linked increasingly either digital versions of papers or my own notes on publications stored as digital files. i had perhaps consequently chosen a moleskine notebook for my fieldnotes as something of a retro affectation. this short intervention seeks to unsettle the continuity of methods in a world whose cultures are increasingly lived through digital media, whilst probing some of the claims of new ‘digital’ methods that seem to promise miraculous solutions to well-worn problems. this commentary draws on materialist media studies that point to the effects of technologies of media on how we think as well as the effect of their content on what we think about. they refuse to divide ‘container’ and ‘content’ or ‘atoms’ and ‘bits’. all media have complex materialities and are not dematerialised information. like johanna drucker i want to resist the ‘pixel-plagued bit-weary’ investment in a form of materiality that creates a false binary of ‘the matter of the real’ in opposition to an immateriality attributed to the ‘virtual’. 
i to exemplify this it is perhaps salutary to return to an old technology when it was new. at some point in , friedrich nietzsche bought a typewriter. he bought it since his failing eyesight meant staring at the page whilst writing brought on terrible headaches. typing by touch, he could write with his eyes closed. but it also began to inflect what nietzsche wrote. friedrich kittler notes that nietzsche's prose ‘changed from arguments to aphorisms, from thoughts to puns, from rhetoric to telegram style’. ii nietzsche himself recognized that ‘our writing equipment takes part in the forming of our thoughts’ suggesting we all need to think about the tools of our trade. nor can we opt out of new technologies, not even by using a retro notebook. we are all now digital scholars these days – even if it is ‘digital lite’ in terms of using various forms of digital mediation in various aspects of our work. this general engagement should not be overlooked, when the banal technologies of storing, filing and writing work as the scaffolds for our practice. as katherine hayles notes for contemporary thinking ‘the keyboard comes to seem an extension of one’s thoughts rather than an external device on which one types. embodiment then takes the form of extended cognition, in which human agency and thought are enmeshed within larger networks that extend beyond the desktop computer into the environment’. iii the impact of new media is not just on what we study but how we think. there is then no opting out of digital scholarship. as kittler reminds us, heidegger highlighted that: ‘whether or not we personally ever use the typewriter is not important. what is important is that all of us are thrown into the age of typewriting, whether we like it or not. of course, heidegger himself preferred to continue his work in his own handwriting’ iv but we do not need to rely on the slightly romantic and anti-technological bent of a thinker like heidegger. 
walter benjamin was similarly intrigued by the way technologies for storing, processing and presenting information shaped our thinking. as he put it looking at the desk of the s, ‘the card index marks the conquest of three-dimensional writing, and so presents an astonishing counterpoint to the three-dimensionality of script in its original form as rune or knot notation. and today the book is already, as the present mode of scholarly production demonstrates, an outdated mediation between two different filing systems. for everything that matters is to be found in the card box of the researcher who wrote it, and the scholar studying it assimilates it into his own card index.’ v it is well known then that he attempted to create a form of writing that enabled such three-dimensionality through ‘files’ of examples rather than linear prose, all underpinned by the humble technology of filing, indexing and referencing: a reminder that the database has a longer pedigree in academic work than its digital form. in the s there was a rash of work heralding the new possibilities and forms of electronic writing as a hyperlinked database, pointing to the almost embarrassingly literal enactment of deconstructive theory’s destabilisation of the text. vi and yet, as jacques derrida reflected, the texts of his that were ‘most disobedient to tenets of linear writing,’ he wrote before computers enabled dislocations and grafts; in fact glas was written with an olivetti typewriter: ‘it was theorized and it was done – yesterday. the path of these new typographies, which have become common today, was blazed in an experimental fashion a long time ago. 
it’s thus necessary to invent other ‘disorders’, more discreet, less jubilatory and exhibitionist ones which this time would be contemporary with the computer.’ vii this intervention will seek those modest disruptions in arenas of cultural geography both where the object of our study has become digital and where our mode of study now uses digital media. i use that as a heuristic, conscious that it risks restaging the dualism of field and academy, things and thought, res extensa and res cogitans, which material media analysis debunks. viii within these two broad areas, i want to make a further subdivision between the ‘migration of our cultural legacy into digital form and the creation of new, born-digital materials’. ix i will first, and most briefly, look at new digital objects of study; secondly, the ways the prevalence of digital media renders the social perceptible in new ways. likewise i divide the effects of digital media on academic practice into, first, converging forms of knowledge as they become digital and, second, the application of new digital and computational techniques to old issues – exemplified by new approaches to literary geographies. in doing this i cut across the terrain staked out as a project of geohumanities and that of a digital humanities. ketchum, luria, dear, and richardson’s recent text speaks then of four dimensions of geohumanities, ‘geocreativity (creative places), geotexts (spatial literacies), geoimagery (visual geographies), and geohistories (spatial histories)’. x this intervention stresses the challenge of new media to the first three with the creation of new kinds of place and modes of interacting, with new textual and visual apparatus. it comes back to the ‘spatial humanities’ which they bracket as one and the same as ‘the digital humanities’ which is described as ‘the absorption of methods of geographical information science into humanities scholarship’. 
that constricted definition of the digital humanities is not one many who identify as practitioners would accept – even if they acknowledge the rise of spatial databases and spatialised presentations of data. i want to hold to katherine hayles’s expansive set of possible outcomes. xi to do this means seeing the digital geohumanities as an oscillation between using digital technologies in studying traditional objects and using humanities methods in studying digital objects. xii digital cultures and born digital objects there are new cultural forms and practices created through digital media that would seem at first blush to beckon for newly digital modes of analysis. and yet on probing, these born digital cultural artefacts – like computer games – turn out to be ‘remediating’ previous media structures. xiii we need to ask what are the continuities and what are the changes, for instance attending to how game aesthetics remediate landscape. there is little novel in the ideological content of, say, many video games’ depiction of racialised others. the long running franchise of grand theft auto plays on an american urban imaginary taken from tv and film. it also trades on racialised and sexualised stereotypes to animate urban spaces. xiv we might see a not so crypto-orientalism in the conversion of middle eastern cities into backdrops and theatres of action, and of people into targets and victims, for ‘first person shooter’ games positioning the player as a western soldier. however, games such as full spectrum warrior: ten hammers remediate that by using simulated contemporary cnn style news media coverage as framing devices. xv in contrast to first person view games, some, such as age of empires, use map views deploying the cartographic device of a slow map reveal in the corner of the screen and play around the different spatial knowledges and forms of representing travel and mapping. 
xvi studies have looked at the socialities (and less often spatialities) of online worlds as new creative places for either simulated social encounter (as in virtual worlds) or collaborative quest based games (massively multiplayer online roleplaying games (mmorpgs)). the coming together in a simulated place by distributed actors challenges the embodied copresence assumed in participant observation manuals. and yet, as illustrated by longan’s piece in this issue, research practice seems a very familiar ethnographic one at heart. so we have new forms, recycling formats, being analysed by methods that fuse old techniques with new practices. digital lives: rendering culture perceptible. the percolation of new media into everyday life suggests separations of real and virtual, material and cyberspace are misconceived. this is not the place to explore the transformations this enables in the organisation of culture, society, economy and urban life; my concern here is more with the methods it enables to study these. suffice to say initial prophecies of placeless or dematerialised living have now been replaced by an attention to how, for instance, media enable local life to function. xvii beyond that, far from there just being social networks, there has been a proliferation of location based social networks, xviii and far from virtual games there are ‘hybrid’ digital games that embed themselves into places – with simulated zombie attacks in real life streets or playing around and against other nearby participants. xix moreover art works increasingly annotate spaces and layer digital content onto places, or use places to inform media content. xx a variety of geowebbed media are now also conveying stories in situ, for instance in the work of janet cardiff, or indeed repopulating urban settings with past soundscapes or artistic interventions. xxi here then new media are allied with a distributed archive imbricated in spatial practices. 
others mash multiple different forms of data together to alter the experience of the spaces and add possibilities for popular archiving. of course popular archiving and authorship then (as ever) reflect the multiple dimensions of power, in this case mediated through new technology, about who gets to author what. it is at this point then possible to analyse the layering of different signifiers in different variants of media onto places – to look at either a palimpsest or indeed competing media and their differing constituencies of users trying to create hegemonic meanings for places. xxii we have then a very literal enactment of long established arguments about contested and polysemic landscapes. but here then for cultural geography are new tools for intervention, where we might add public archival annotation via geotagging to existing techniques. social media render the back and forth of social life perceptible to analysis (be that by academics, governments or more often corporations) through the digital traces – the data exhaust – they leave. our banal social lives become digitally mediated and can be subject to quantitative encapsulation through lexical analysis. for instance alan mislove and colleagues applied a word-rating system – scoring positive and negative connotations – to us-based geolocated tweets to produce stunning time lapse maps of the ‘mood of the nation’. xxiii similar approaches have correlated postings with stock market movements xxiv and yet so far the conclusions have been banal. the poetics and affective power of the visualisation have often been more powerful than the supposed ‘result’. one result that is clear is the institution of the social media used as ‘evidential’, and media researchers who use it are ‘not mere observers or utilizers of social media content but are promoters of this infrastructure’ who, by framing an issue via a specific media platform, risk reproducing how that media frames the issue. 
xxv such analyses render apparent the centrality of the transmission of affects and feelings to the going on of social life. xxvi and yet, this is done via quantification whereas so much work on emotions or affects has started from postulating them as unquantifiable. latour and lepinay turned to gabriel tarde, who saw the economy and the social as a series of quantifiable intensities, in response. at the start of the twentieth century, tarde argued that the problem of scientific study of society was not that it quantified, but that its metric was wrong. he wondered about the possibilities of developing metrics for fame, charisma, happiness by creating ‘valuemeters’ or ‘glorimeters’. they note that we should nowadays ‘have no difficulty understanding what digitization has done to the calculation of authority, the mapping of credibility and the quantification of glory’. xxvii however, enthusiasm for alternate metrics rather underplays possible alienation by any and all metrics. so to take an example close to home, the concatenation of student evaluations, national research evaluations, league tables, citation analyses, twitter buzz (where the circulation of academic work on social media is logged by initiatives like alt.metrics xxviii ) and so forth that increasingly governs academic life does not seem to promote positive affects. xxix these metrics do not simply report the world, but rather format it in their own image. rendering cultural life more perceptible, and thus amenable to action by different groups, highlights what kittler called ‘institutions of selection’ which attribute significance. xxx it is also the case then that such data, by replicating current patterns of interaction, tend to be conservative both in repeating what is currently dominant and in restaging a simplistic monism. xxxi digital convergence if new media are recording traces of cultural life, so too are the old media being transformed. 
cultural geography has tended to focus on the meaning rather than the substance of media, saying much more about the dematerialised ‘text’ than the ‘book’. as keighren and withers note, most work in geography focuses on the content of printed narratives to the neglect of epistolary conventions. xxxii we have been (too) eager to read artefacts like texts, and rather less adept at seeing texts as artefacts. xxxiii as kittler argues, media are ‘material devices for producing, processing, transmitting and storing information’. xxxiv what is underway with the increasing use of digital media is a mnemotechnical shift from the library to the database. xxxv roger chartier suggests that the result is a dedifferentiation of discourses that were previously held apart by material differences and associated conventions. chartier is led to speculate that ‘in the digital world all textual entities are like databases that offer fragments, the reading of which in no way implies the perception of the work or the body of works from which they come’. xxxvi the question is whether the database is antithetical to narrative, as argued by lev manovich, or symbiotic with it, as katherine hayles would have it. xxxvii this suggests attending rather more to the conventions and modes of information presentation. drucker argues it highlights the spatial organisation of texts and how those structure semantic relations. xxxviii in this she raises two approaches made possible: the first she calls speculative computing, the other a digital humanities where there is a scientific attachment to objective data. the former kind of vision draws on the power of visualisation to produce an affective response and drucker calls for ‘diagrammatology’ where the compositional possibilities and distribution of materials perform relations. 
xxxix the latter approach of digital humanities mines the universe of digital textual objects to reveal patterns of relations through data visualisation or ‘infovis’ techniques: ‘infovis uses graphical primitives such as points, straight lines, curves, and simple geometric shapes to stand in for objects and relations between them - regardless of whether these are people, their social relations, stock prices, income of nations, unemployment statistics, or anything else. … this reductionism becomes the default “meta-paradigm” of modern science and it continues to rule scientific research today.’ xl this approach may well unnerve many, for it is not only radically quantitative but informed by a reductionist sensibility: ‘in the sciences, theory distils from experience a few underlying regularities, thus reducing a seemingly infinite number of particularities into a parsimonious few. the more instances that can be reduced, the more powerful the theory is meant to be […]. reduction is good, proliferation is bad’ xli this reductive digital humanities is exemplified in ‘culturomics’ xlii that mines the digitised books available through google to chart, for instance, the frequency of emotive terms over time and between countries or look at the rise and fall of key terms about climate change. xliii however, looking for cultural markers as metonyms of wider cultural units is something that has rather gone out of fashion in cultural geography. by contrast when social media are mined, what is being traced are performative flows rather than markers of specific cultures. i share delyser and sui’s concern that such approaches might submerge traditional interpretative scholarship with superficial number crunching that does not situate the object or process of analysis. xliv mays argues powerfully that deconstructive and quantified approaches view texts in contrasting ways. 
deconstruction tends to focus on a specific work to show its meaning is indeterminate, open to proliferation and contested interpretation, whilst quantitative methods grasp the proliferation of texts assigning them determinate meanings. xlv quantifying the aesthetic or the aesthetics of quantification?: digital mapping and literary geographies in this context we might revisit the notion of mapping texts inspired by authors such as franco moretti: what do literary maps allow us to see? two things basically. first, they highlight the ortgebunden, place-bound nature of literary forms, each of them with its peculiar geometry, its boundaries, its spatial taboos and favourite routes. and then, maps bring to light the internal logic of narrative: the semiotic domain around which a plot coalesces and self-organises. xlvi the rise of gis has eased the literal mapping of all the places mentioned or where scenes occur in books, though such mappings mostly follow the relatively inert idea of space in moretti. the restricted spatiality may stem from a sometimes shallow engagement of gis work with other work on literary geography. for instance, piatti et al. imagine themselves viewing ‘the horizon of a promising interdisciplinary research field – a future literary geography’. xlvii according to the ‘literary geographies’ blog xlviii some works on place, space and literature had already been published that decade. the spatio-temporal forms, relations and analyses worked through in those offer rather richer concepts than typologising types of spaces (visited by characters, scenes of action, imagined or spoken of, or routeways) and their frequency of occurrence. one can mine william wordsworth’s poems xlix or joyce’s ulysses for place names but it is less clear how that gets us very far in understanding ideas of the natural in the former or, say, the influence of vico’s geopoetic theory of scalar recapitulation in the latter. 
l indeed when travis uses vico’s recapitulative time to understand s dublin in o’brien’s at swim-two-birds, he ends up moving away from what he terms a scientific metonymic gis to a metaphorical gis. the former can trace timespace paths of o’brien’s narrator but to include vico inspired temporality he has to employ metaphorically separated layers. the result is less an analytical map than an evocative visualisation; less digital humanities than speculative computing in drucker’s terms. li there is also traffic the other way, where literary material is infusing mapping. so now there are geowebbed applications that, say, transpose the places of ulysses back onto dublin or, more in the spirit of diagrammatology, take the relations between places and transpose them onto entirely different cities. agendas i finish then not with conclusions but with issues that develop or challenge both digital approaches and non-digital techniques. it seems to me that it is impossible to ignore these challenges to how cultural geographies approach their objects of study. equally, it seems unproven that some of the new techniques lead to much conceptual advance. four points do emerge across the range of approaches presented here. first, many of these approaches serve to mobilise texts and destabilise the relationship with people. texts are made much more strongly performative than representational and people are no longer ‘autonomous’ actors. moreover, digital media shift attention from stocks of information (in archives and libraries which people may choose to visit) to flows of information (even if people try and ignore them). as kittler puts it ‘persons are not objects but addresses which make possible the assessment of further communications’. lii second, this seems to decentre the agency of the human actors or, as kittler puts it, the hylomorphism of media as matter and content as spirit, where ‘living spirit’ is opposed to ‘the dead letter’. 
liii third, there is a challenge from digital geohumanities to reconcile the elaboration of meaning from a specific body of material and the reduction of a massive corpus to a pattern. here then we must ask about the desire behind analysing big data and how it throws into relief cultural geographers’ taste for ‘big interpretation’. forms of monism may have made something of a comeback in cultural geography, but not the reductive forms of social physics sometimes underpinning calculative analysis. liv fourth, digital media affect all our research not just by creating new ‘objects of study in new formats’, but by shifting ‘the critical ground on which we conceptualize our activity’. lv acknowledgements: i would like to thank dydia delyser for her forbearance and her engagement which along with the comments of three referees improved this immeasurably. author biography: mike crang (durham university) edited the anthology virtual geographies more than years ago (when it seemed they might be), and he has since done projects and published on digital divides, digital cultures and urban spaces, and on the mediation of tourism landscapes. other work has addressed literary geographies, and the values of preservation or disposal. i j. drucker, entity to event: from literal, mechanistic materiality to probabilistic materiality, parallax ( ) p. ii f. kittler, discourse networks / (stanford, stanford university press, ), p. iii n. k. hayles, how we think: digital media and contemporary technogenesis (chicago, university of chicago press, ), p. - iv j. armitage, from discourse networks to cultural mathematics: an interview with friedrich a. kittler, theory, culture & society ( ), p. v w. benjamin, one way street & other writings (london, verso, ) p. vi g. landow, hypertext: the convergence of contemporary critical theory and technology (baltimore, johns hopkins university press, ); j. d. 
bolter, writing space: the computer, hypertext and the history of writing (nj: lawrence erlbaum, ) vii j. derrida, word processing, oxford literary review, ( ), p. viii m. crang, telling materials, in m. pryke, g. rose and s. whatmore, eds. using social theory, (london, sage, ) p. - ix j. drucker cited in s. mays, literary digital humanities and the politics of the infinite, new formations ( ), p. x ketchum, j., s. luria, et al. geohumanities symposium: editors response. progress in human geography ( ) p - . xi hayles, how we think. xii k. fitzpatrick, the humanities, done digitally, in m. k. gold, ed. debates in the digital humanities (minneapolis, university of minnesota press, ), p. xiii j. d. bolter and r. grusin, remediation. configurations ( ) p. ; remediation: understanding new media (cambridge, mit press, ). xiv r. atkinson and p. willis, charting the ludodrome: the mediation of urban and simulated space and rise of the flaneur electronique, information, communication & society ( ), p. - . xv j. höglund, electronic empire: orientalism revisited in the military shooter. game studies ( ), np; n. poor, digital elves as a racial other in video games. games and culture ( ), - xvi s. lammes, playing the world: computer games, cartography, spatial stories. aether: the journal of media geography ( ), - ; spatial regimes of the digital playground: cultural functions of spatial practices in computer games, space and culture ( ), - ; terra incognita: computer games, cartography and spatial stories, in m. van den boomen, s. lammes, a.-s. lehmann, j. raessens and m. t. schäfer, eds. digital material: tracing new media in everyday life and technology, (amsterdam: amsterdam university press, ) p. - ; m. longan, playing with landscapes: social processes and spatial forms in video games. aether: the journal of media geography ( ), - xvii e. gordon and a. de souza e silva, net locality: why location matters in a networked world. (oxford, john wiley & sons, ); crang, m., t. 
crosbie, & s. graham, technology, timespace and the remediation of neighbourhood life, environment and planning a ( ): – . xviii a. de souza e silva and j. frith, locative mobile social networks: mapping communication and location in urban spaces, mobilities ( ) ( ): – . xix a. de souza e silva and l. hjorth, playful urban spaces, simulation & gaming ( ) ( ) p. - ; licoppe, c. and y. inada, geolocalized technologies, location-aware communities, and personal territories: the mogi case, journal of urban technology ( ) ( ) p. - . xx m. berry, m. hamilton, et al. transmesh: a locative media system. leonardo ( ) ( ): - ; berry, m. and o. goodwin, poetry u: pinning poems under/over/through the streets. new media & society ( ) ( ) p. - ; f. timeto, redefining the city through social software: two examples of open source locative art in italian urban space, first monday ( ) ( ); a. kraan, to act in public through geo-annotation: social encounters through locative media art, open ( ), p. - . xxi s. barns, street haunting: sounding the invisible city, in m. foth, l. forlano and c. satchell, eds. from social butterfly to engaged citizen: urban informatics, social media, ubiquitous computing, and mobile technology to support citizen engagement, (cambridge ma: mit press, ) p. - ; m. berry and o. goodwin, op. cit.; d. pinder, ghostly footsteps: voices, memories and walks in the city, ecumene ( ), p. - . xxii m. graham, m. zook, and a. boulton, augmented reality in urban places: contested content and the duplicity of code. transactions of the institute of british geographers ( ): - ; graham, m. neogeography and the palimpsests of place: web . and the construction of a virtual earth, tijdschrift voor economische en sociale geografie ( ): - . xxiii a. mislove, s. lehmann, y-y. ahn, j-p. onnela, and j. rosenquist’s visualisation of the twitter pulse of the nation can be accessed at http://www.ccs.neu.edu/home/amislove/twittermood/ xxiv j. bollen, h. mao, and x. 
zeng, twitter mood predicts the stock market. journal of computational science ( ): - . xxv m. w. wilson, morgan freeman is dead and other big data stories. cultural geographies online early. xxvi as in the travels of a football related fracas far from its origin, see j. w. crampton, m. graham, a. poorthuis, t. shelton, m. stephens, m. w. wilson, and m. zook. beyond the geotag: situating "big data" and leveraging the potential of the geoweb. cartography and geographic information science ( ), p. - . xxvii b. latour and v. a. lépinay, the science of passionate interests: an introduction to gabriel tarde's economic anthropology (chicago, prickly paradigm press, ) p. xxviii the project seeks to use twitter and social bibliography notes, on services like mendeley or citeulike, to record the equivalent of corridor conversations to reveal the travel and influence of work; see http://altmetrics.org/manifesto/ xxix r. burrows, living with the h-index? metric assemblages in the contemporary academy. the sociological review ( ), p. - . xxx in j. armitage, p. xxxi t. j. barnes, big data, little history. dialogues in human geography ( ), p. - ; t. j. barnes and m. w. wilson, big data, social physics, and spatial analysis: the early years. big data & society ( ), p. - . xxxii i. m. keighren and c. w. j. withers, questions of inscription and epistemology in british travelers’ accounts of early nineteenth-century south america, annals of the association of american geographers ( ), p. xxxiii n. k. hayles, writing machines (boston, ma, mit press, ) p. xxxiv in n. gane, radical post-humanism: friedrich kittler and the primacy of technology, theory, culture & society ( ), p. . xxxv mays, literary digital humanities. xxxvi r. chartier, languages, books, and reading from the printed word to the digital text. critical inquiry ( ) p . xxxvii n. hayles, how we think xxxviii j. drucker, diagrammatic writing, new formations ( ), p. . xxxix n. hayles, writing machines xl l. 
manovich, what is visualization? paj: the journal of the initiative for digital humanities, media, and culture ( ), np. xli n. hayles, writing machines, p. xlii d. delyser and d. sui, crossing the qualitative-quantitative divide ii: inventive approaches to big data, mobile methods, and rhythmanalysis, progress in human geography ( ), p. - . xliii a. acerbi, v. lampos, p. garnett, and r. a. bentley. the expression of emotions in th century books, plos one ( ):e ; r. a. bentley, p. garnett, m. j. o'brien, and w. a. brock, word diffusion and climate science, plos one ( ):e . xliv delyser and sui, crossing divides ii, p. xlv mays p. - xlvi f. moretti, atlas of the european novel - (london, verso, ) p. . xlvii b. piatti, h. r. bär, a.-k. reuschel, l. hurni, and w. cartwright, mapping literature: towards a geography of fiction, in w. cartwright, g. gartner and a. lehn, eds. cartography and art, (berlin, heidelberg: springer, ), p. emphasis added xlviii http://literarygeographies.wordpress.com/ xlix d. cooper and i. n. gregory, mapping the english lake district: a literary gis. transactions of the institute of british geographers ( ), p. - . l m. crang, placing stories, performing places: spatiality in joyce and austen, anglia: zeitschrift für englische philologie ( ), p. . li c. travis, transcending the cube: translating giscience time and space perspectives in a humanities gis. international journal of geographical information science ( ), p. - . lii f. kittler, the history of communication media, ctheory ( ). liii f. kittler, number and numeral, theory, culture & society ( ), p. ; mays, literary digital humanities, p. liv t. j. barnes and m. wilson, a brief history lv j drucker writing, p. the ucar open access mandate marlino et al. 
page submitted march , the ucar open access mandate: a community-centered model of action mary marlino, jamaica jones, karon kelly, mike wright university corporation for atmospheric research boulder co, usa introduction in its role of managing the us federally-funded national center for atmospheric research (ncar), the non-profit university corporation for atmospheric research (ucar) has a strong history of supporting and promoting the atmospheric sciences and related fields. in september , ucar joined a growing number of other institutions worldwide in passing an open access mandate requiring that all peer-reviewed research published in scientific journals by its scientists and staff be made publicly available online through its institutional repository, opensky. the new policy and accompanying repository will enable ucar to compile, preserve and share a complete record of its intellectual output; increase its community visibility and impact; and advance research in the atmospheric sciences by providing free, worldwide access to ucar and ncar scholarship. the passage of the ucar open access policy was especially noteworthy as it marked the first instance of a national science foundation-funded national laboratory mandating open access. also noteworthy was the broad community-driven process that the ncar library, as the leader in this initiative, employed. this presentation will outline the three-phase process adopted by the library in its effort to reflect both institutional and disciplinary community values and needs through opensky services and policies. phase i: institutional and sponsor engagement ucar is a research consortium with members from nearly us universities and over international academic institutions offering graduate programs in the atmospheric and related sciences. 
the institution was created by the university community fifty years ago to enhance the computing and observational capabilities of the universities and to focus on scientific problems that are beyond the scale of a single institution. ucar is the governing body that manages the national center for atmospheric research (ncar), sponsored by the us national science foundation (nsf), and has conducted research in the atmospheric sciences since its inception in . for fifty years, we have been a leader in the development of climate models, software tools, support facilities ranging from computational to aircraft, and support services required to perform innovative and global research.

given our structure and funding model, the network of stakeholders in our research is unique. universities are our constituents; their faculty members are also our colleagues and peers. the community is highly diverse, especially in terms of each institution’s ability to provide scholarly resources to support scientific research. these institutions have looked to ucar, in keeping with its role as a leader in atmospheric science, for access to our publications. unfortunately, owing to the traditional restrictions inherent in the scholarly publishing system, we have not been able to provide our members direct access to our research as published in scholarly journals.

motivated to find a solution to this challenge, and buoyed by recent successes within the larger open access ecosystem, the ncar library began its advocacy efforts with an extensive internal public relations campaign, reaching out to all possible stakeholders, from the library’s own advisory board to ncar and ucar governance bodies, including the ucar board of directors, the scientific and support staff, and the nsf, our primary funding agency, all of whom gave the open access policy their support.

concerns were raised during this process.
the most frequent was a genuine concern about our relationships with publishers and scientific societies. unlike a traditional university setting, which supports many disciplines, ncar research is focused primarily on the atmospheric and related sciences. over % of our scientists are members of at least one of the two major professional societies serving the atmospheric sciences: the american meteorological society (ams) and the american geophysical union (agu). ncar and ucar have strong institutional ties to these societies and a deep respect for the integral relationship between their health and stability, their role in promoting scholarship, and the development of a scientific workforce. over % of our peer-reviewed papers are published by either ams or agu. recognizing the heightened tension among publishers in general over open access, and the importance of publishing as a revenue stream for societies in particular, it became apparent to us that for our open access initiative to have broad community support, the second step in the process had to ensure that we moved forward in concert with ams and agu.

phase ii: professional society engagement

historically, professional societies have promoted the development and dissemination of disciplinary information and education. more recently, revenues from publishing have become increasingly important to subsidizing those society services and operations. the transition to online-only publishing is an attempt to reduce their increasing costs while preserving support for other society services. however, in the absence of new and viable production and business models, it is clear that most societies will continue to rely heavily on journal subscriptions to underwrite operations for some time to come.

extensive communication and trust-building were critical to the success of the partnership between the ncar library, ams, and agu.
opening the conversations, we reiterated our belief that the health of the professional societies and the health of the discipline were deeply intertwined. further, we suggested that open access represents a potential and significant reimagining of their business models, rather than a threat. while the conversation was at first a challenge, we were ultimately asked to take the lead in developing a partnership, and to spearhead a collaborative approach to forging innovative solutions to what are now well-documented tensions between open access advocacy by libraries and the interests of academic societies and publishers. having worked in tandem for nearly a year, the ncar library, ams, and agu maintain a healthy respect for each other’s positions and now strive for mutually profitable solutions. as a result of our conversations with ams and agu, both professional societies have taken significant steps towards more openness, not only in reductions to their embargo periods but, equally important, in their willingness to engage with us in the conversation.

phase iii: user engagement in repository design

with support from our internal and external advisory boards, the ucar open access policy went into effect in september . its primary provision states that all ucar authors must deposit the final manuscripts of their published, peer-reviewed works into our institutional repository, opensky. the obvious next step, then, was to begin development of this resource. building upon infrastructure that was developed in two us digital library initiatives over the last decade - the digital library for earth systems education (dlese) and the national science digital library (nsdl) - opensky will be formally released in the fall of . throughout our early development efforts, we maintained ongoing communications with the many scientific divisions that comprise our institution.
in so doing, we were delighted to discover sustained interest in the project and a broad diversity in potential collections. we also received requests for help in tracking usage metrics and impact factors for a number of resources, such as the ncar supercomputing facility, which researchers throughout the university community use. accommodating these requests is a high priority.

in response both to these needs and to the distributed nature of the organization, the ncar library adopted a new model of delivery for opensky. this model relies on an underlying repository, which will house all digitized and born-digital library resources (including archival resources, special collections such as the ncar technical reports, and opensky content). the traditional repository functions, including storage, search and discovery, embargo control, and metadata services, will be provided. however, we will also provide a suite of web services that integrate repository resources into departmental and divisional work practices in ways that have been tailored to their specific needs and interests.

we also anticipate supporting links to related information, including primary atmospheric data managed by the ncar data center. although the ncar library itself does not manage data, we are cognizant of the increasing demand for access to derived atmospheric data and related data products in journal publications, along with the new demands for openness and interoperability amongst data, systems, tools and archives that this will impose on scientific repositories. finally, we are challenged by the need to accommodate the decentralized nature of our institution and the irrepressible “creative adaptation” of community resources by divisions and programs.
given the distributed nature of born-digital scholarship, as well as our organizational culture, the challenges inherent both in discovering repurposed objects and in preserving the original object are considerable.

reflections and considerations

we recognize that these are early days in our development; however, we are cautiously optimistic that, owing to our approach, opensky will be successful in dealing with persistent areas of concern for repositories, such as deposit compliance. based on our experience thus far, we conclude with several reflections and considerations:

1. open access is inevitable; however, its implementation is still very much in play. advocacy and explicating the issues remain important, but equally important is the development of new business models that will ensure the vitality of our academic societies.
2. technological models that will fulfill traditional ir roles must also meet a community’s expressed needs, and in a manner that is consistent with the community’s “lived” culture. long-standing workflows and processes, some of which are highly idiosyncratic, are not changed without resistance and, in some cases, apathy. there is a tension inherent in designing both for those needs and for new delivery models that will probably not be resolved in the very near term.
3. know your community and your disciplinary culture; build advocacy from the beginning of the effort. find the strengths and the “ties that bind” you to your stakeholders. use these “ties” as tools in the development of community-centered approaches. because of the time that we invested in understanding our culture and the changes to work processes that a successful and vibrant repository will require, we were able to build advocacy not just internally, but with the professional societies as well.
4. finally, ideology is inspiring, but not practical. it is critical to be honest about the nature of your community, the relationship between the producers and the providers, and what is sustainable and what is not. in our case, we understood early on that if we did not engage with the societies as partners in this endeavor, we would fail. we believe that the social and political aspects of repository development and management are absolutely essential to building advocacy and participation, which ultimately will determine the success of any institutional repository.

ucar is a unique institution supporting a unique community. this truth does not diminish the broad implications or applications of its successes in the advocacy or implementation of its recent open access initiatives. we believe that the lessons we have learned over the past twelve months may be useful to other academic and research communities, and look forward to contributing to and supporting further advances in the effort to make research more freely and openly available to all.
double gamers: academics between fields

author: cristina mendes da costa (cristina .costa@uwe.ac.uk), associate professor in learning and teaching

abstract
the field of academia is frequently associated with traditional norms that aim to regulate scholarly activity, especially research. the social web, as another field, is often viewed as challenging long-established conventions with novel knowledge production practices. hence, the two fields seem to oppose rather than complement each other. using a bourdieuian lens, this research examines research participants' accounts of their approaches to practice on the social web in relation to academia. the paper reports on the habitus dissonance between the two fields, before discussing the effects of the two fields’ competing doxas on individuals’ habitus.

citation
costa, c. ( ). double gamers: academics between fields. british journal of sociology of education, ( ), - . https://doi.org/ . / . .

journal article type: article
acceptance date: oct ,
publication date: jan ,
journal: british journal of sociology of education
print issn: -
publisher: taylor & francis (routledge)
peer reviewed: peer reviewed
keywords: digital scholarship, doxa, symbolic violence, pierre bourdieu, the social web
public url: https://uwe-repository.worktribe.com/output/
publisher url: https://doi.org/ . / . .
additional information: this is an accepted manuscript of an article published by taylor & francis in british journal of sociology of education on th december , available online: https://doi.org/ . / . . .
organisation(s): ace dept of education and childhood
research centres/groups: bristol inter-disciplinary group for education research
d-lib magazine
march/april
volume , number /

reconstructing the past through utah sanborn fire insurance maps
a geospatial approach to library resources

justin b. sorensen
j. willard marriott library, university of utah
justin.sorensen@utah.edu

doi: . /march -sorensen

abstract

cartographic maps have the ability to convey information and ideas in ways text cannot. the utah sanborn fire insurance maps are one such resource, depicting detailed information on buildings, layouts, compositions and boundaries of cities and towns. as time has progressed, the interest in these resources has continued to grow, opening the door for the creation of an updated method for viewing and examining this valuable collection. through the incorporation of gis and geospatial technology, the printed and scanned materials have been converted into georeferenced raster datasets, allowing viewers the ability to geospatially interact with the information and apply the information to their research in new and exciting ways. this article describes the digital scholarship lab's endeavor to convert these valuable resources into research-driven geospatial datasets, providing a new format for how the library information is presented as well as a new method for interacting and examining the information in detail.

introduction

sanborn fire insurance maps are a highly requested resource in libraries. each collection of hand-made maps contains detailed surveyor information on commercial, industrial and residential sections of cities and towns ranging from the mid s to the mid s (figure ). while the j.
willard marriott library has scanned and created digital versions of the original printed maps, the digital scholarship lab established a goal to further develop, enhance and utilize the information contained within these maps. as a result, an innovative project was created that not only offers these resources openly to students, staff, faculty and visitors of the university of utah, but also creates a method for displaying and examining each map within physical space through the incorporation of geospatial software and -dimensional technology.

figure : portion of original scanned mount pleasant sanborn map (close-up detail).

project development and process

beginning in , sanborn fire insurance maps (published by the company that daniel alfred sanborn founded) were created for the purpose of assessing fire insurance liability within urbanized areas of the united states, depicting detailed information on cities and towns consisting of: building compositions and structural information, layouts, business and street information, property boundaries and much more ("sanborn maps", ). author kim keister describes this collection well, stating "the sanborn maps survive as a guide to american urbanization that is unrivaled by other cartography and, for that matter, by few documentary resources of any kind" ("sanborn maps", ). as time has progressed, each published volume and its descendant updated versions have become highly requested resources in fields such as historical research, planning, preservation and the study of urban geography ("sanborn maps", ). while many of these collections are available at academic institutions, most are archived in their original printed format for preservation, thus limiting the ability to openly obtain information contained within each detailed map.

in , the j.
willard marriott library's digital technologies staff began the sizable task of scanning each utah sanborn fire insurance map contained within the special collections department (arlitsch, ). this effort produced high-resolution scanned tiffs of the entire collection (numbering over , maps in total) made viewable through a webpage specifically designed to present the entire scanned map collection (figure ). as time progressed and the interest in this resource continued to grow, an updated method for viewing this valuable collection while interacting and examining the information closely was in order. what better way to accomplish this goal than to incorporate gis and geospatial technology, thereby converting the printed and scanned materials into geospatial raster data while giving the j. willard marriott library a new method and format for offering these resources and giving individuals a convenient way to access and examine the detailed information remotely?

figure : screenshot of scanned utah sanborn fire insurance map portal located in the j. willard marriott library's digital library.

gis stands for geographic information system, which is an assemblage of computer hardware, software and data designed to examine and present geospatial data by combining spatial mapping and analysis with database technology. with software such as this, it becomes possible to turn ordinary printed data and imagery into geospatial data, identifying geographic features, locations or boundaries on the earth. for the utah sanborn fire insurance maps, this process would involve converting each scanned map into a georeferenced overlay, a process by which a -dimensional printed map's or photograph's existence is defined within physical space by giving the image latitude and longitude coordinates dependent upon a particular map projection system, resulting in the digital map's alignment to its appropriate geographic location within a virtual environment.
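the georeferencing idea described above can be illustrated with a minimal python sketch. it assumes an unrotated, north-up scan that exactly fills a known bounding box, and it maps pixel positions to map coordinates with the six-parameter affine form used by world files. the names `Affine`, `north_up_transform` and `pixel_to_map` and the sample extent are hypothetical; the project itself georeferenced the maps in arcgis against reference layers, which this sketch does not reproduce.

```python
from typing import NamedTuple, Tuple


class Affine(NamedTuple):
    """Six-parameter transform: x = a*col + b*row + c, y = d*col + e*row + f."""
    a: float  # width of one pixel in map units
    b: float  # rotation/shear term (0 for a north-up image)
    c: float  # x coordinate of the centre of the upper-left pixel
    d: float  # rotation/shear term (0 for a north-up image)
    e: float  # height of one pixel in map units, negative when rows run south
    f: float  # y coordinate of the centre of the upper-left pixel


def north_up_transform(west: float, north: float, east: float, south: float,
                       width_px: int, height_px: int) -> Affine:
    """Build the transform for an unrotated image that fills a bounding box."""
    x_size = (east - west) / width_px
    y_size = (south - north) / height_px  # negative: row numbers grow downwards
    return Affine(x_size, 0.0, west + x_size / 2, 0.0, y_size, north + y_size / 2)


def pixel_to_map(t: Affine, col: float, row: float) -> Tuple[float, float]:
    """Return the map coordinates of the centre of pixel (col, row)."""
    return (t.a * col + t.b * row + t.c, t.d * col + t.e * row + t.f)


if __name__ == "__main__":
    # Hypothetical extent and image size, chosen only for illustration.
    t = north_up_transform(-112.0, 40.8, -111.9, 40.7, 1000, 1000)
    print(pixel_to_map(t, 0, 0))      # centre of the upper-left pixel
    print(pixel_to_map(t, 999, 999))  # centre of the lower-right pixel
```

historic sheets rarely line up this neatly, so in practice the coefficients would be fitted from several control points rather than derived from a bounding box.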
the original scanned tiffs created by the j. willard marriott library's digital technologies staff were first converted from large file sizes (approximately mb per map) to smaller web-sized jpeg images ( - mb per map) by digital scholarship lab staff using photoshop to allow for easy remote access while preserving image quality. the map collection was georeferenced using arcgis software, aligning each map to its appropriate geospatial location through established georeferencing protocols to maintain consistently projected digital overlays for all of the historic maps. these protocols include the use of reference layers composed of satellite imagery, street centerlines and parcel data as well as the utilization of a nad projected coordinate system (north american datum, ). the georeferenced maps were then converted to kmz files ("keyhole markup language", ) using global mapper software, allowing each map to be viewed in detail using free and openly available google earth software (figure ). while great care has been taken to maintain georeferencing accuracy, it is important to note that many map features have changed or vanished over time, so each historic map has been georeferenced as close to its appropriate geographic position as possible.

figure : digital aerial view of the georeferenced salt lake city sanborn map collection overlaid within google earth on present-day satellite imagery for comparison.

with the completion of the georeferencing process, all georeferenced map files were delivered to staff members of uspace (the university of utah's institutional repository), who uploaded each georeferenced map individually as a compound object using contentdm software while entering geospatial metadata for each of the georeferenced maps, opening access to the newly created materials via url links accessible within the j. willard marriott library catalog.
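the packaging of a georeferenced image as a kmz file can be sketched in a few lines of python. this is an illustration under stated assumptions, not the global mapper export used in the project: the function names and sample values are hypothetical, the overlay is described by a simple latitude/longitude box, and the coordinates are assumed to have already been converted to the wgs84 longitude and latitude that kml expects.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def ground_overlay_kml(name: str, image_href: str, north: float, south: float,
                       east: float, west: float) -> str:
    """Return a KML document with one GroundOverlay draped over a lat/lon box."""
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <GroundOverlay>
    <name>{escape(name)}</name>
    <Icon><href>{escape(image_href)}</href></Icon>
    <LatLonBox>
      <north>{north}</north>
      <south>{south}</south>
      <east>{east}</east>
      <west>{west}</west>
    </LatLonBox>
  </GroundOverlay>
</kml>
"""


def build_kmz(name: str, image_name: str, image_bytes: bytes, north: float,
              south: float, east: float, west: float) -> bytes:
    """Zip doc.kml and the image it references into an in-memory KMZ archive."""
    kml = ground_overlay_kml(name, image_name, north, south, east, west)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as kmz:
        kmz.writestr("doc.kml", kml)            # main KML document at the archive root
        kmz.writestr(image_name, image_bytes)   # overlay image referenced by <href>
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder bytes and bounds; a real run would read a JPEG from disk.
    data = build_kmz("sample sheet", "sheet.jpg", b"not a real jpeg",
                     40.8, 40.7, -111.9, -112.0)
    print(len(data), "bytes of kmz")
```

a kmz file is a zip archive whose main kml document is conventionally named doc.kml, so an archive built this way from real image bytes and bounds should open in google earth as a draped overlay.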
as the geospatial components were becoming openly available to students, staff, faculty and visitors of the university of utah, a method for quickly accessing the information and datasets contained within the new collection was in order. this led to a number of brainstorming sessions on how best to present the information and on which display components visitors would use prior to download. as a result, a library study guide hosted on the j. willard marriott library's website was created (figure ), guiding visitors through the entire map collection. each city tab within the study guide represents one of several geographic locations depicted in the utah sanborn fire insurance map collection, displaying each geographic set by the year in which it was created, ranging from the mid s to the mid s. links to each year's collection are available for the original scanned tiffs as well as the newly georeferenced materials available for download, while embedded google earth gadgets offer visitors the opportunity to interact with each map collection within a -dimensional environment prior to download.

figure : screenshot of the project study guide hosted on the j. willard marriott library's website including links to original scans, georeferenced materials and embedded interactive google earth gadgets displaying each collection by location.

benefits of this project

adding a geospatial component to library materials allows the information to be displayed, expressed and presented in ways standard printed or scanned information cannot. by utilizing gis and geospatial technology with the incorporation of geospatial datasets such as these, a new realm for library research is opened, allowing research institutions a new method for sharing information in a world more and more reliant on digital information.
individuals from multiple disciplines are now able to use the georeferenced utah sanborn fire insurance maps in new and exciting ways, ranging from the creation of historical reconstruction models (figure ) and interactive -dimensional model overlays (figure ) to planning analysis and the study of change over time.

figure : detailed -dimensional model created by caitlyn tubbs (digital scholarship lab) based on information contained in a georeferenced salt lake city sanborn fire insurance map.

figure : interactive -dimensional model created by justin sorensen (digital scholarship lab) based on information contained in a set of georeferenced salt lake city sanborn fire insurance maps.

project results

this project has resulted in the creation of an innovative portal for all utah sanborn fire insurance maps hosted at the j. willard marriott library. the portal provides links to the individual maps contained within our collection; georeferenced maps available for download in both kmz (google earth) and zipped geo-raster jpeg (arcgis) formats for access by students, staff, faculty and visitors of the university of utah; and interactive google earth gadgets embedded within each geographic page of the project study guide, "reconstructing the past through utah sanborn fire insurance maps", which display each collection of georeferenced maps overlaid on a virtual -dimensional model of the earth.

conclusion

this project demonstrates not only how printed, scanned and highly-requested library resources such as the utah sanborn fire insurance maps can be converted into research-driven geospatial datasets, but also one of the many ways in which gis can be beneficial in sharing library collections while taking library research to a new level.
geospatial technology is an amazing resource, and in a world continually moving towards the digital realm, gis will be one of the many tools libraries have available to help them share their resources geospatially with others.

acknowledgements

the author would like to acknowledge the work of fellow digital scholarship lab staff member caitlyn tubbs (geospatial data & visualization intern) for her georeferencing assistance and uspace staff members donald williams (ir coordinator) and cindy russell (ir workflow specialist) for their work uploading each of the map files and applying metadata.

notes

the gadgets contained on each city page of the study guide require a google earth plug-in to operate. if you experience trouble loading the interactive map windows, please verify that the plug-in is installed on your browser (recommended browser: firefox).

references

[ ] "sanborn maps." wikipedia. wikimedia foundation.
[ ] arlitsch, kenning. "digitizing sanborn fire insurance maps for a full color, publicly accessible collection." d-lib magazine, vol. , no. / , july . http://doi.org/ . /july -arlitsch
[ ] "north american datum." wikipedia. wikimedia foundation.
[ ] "keyhole markup language." wikipedia. wikimedia foundation.

about the author

justin sorensen is the gis specialist for the j. willard marriott library's digital scholarship lab. a graduate of the university of utah, justin has a strong background in geography and geospatial technology and has been creating, developing and managing geospatial projects for the digital scholarship lab since .

copyright © justin sorensen

diversity and inclusion in digital scholarship and pedagogy: the case of the programming historian

this article presents several inclusion and diversity policies and strategies for digital scholarship and pedagogy, using the programming historian as a case study.
by actively supporting and working towards gender diversity, as well as multilingualism, cultural inclusivity and open access, the programming historian aims to further enhance what is meant to be open in the context of access, diversity and inclusion in digital scholarship and pedagogy. keywords digital pedagogy; diversity; multilingualism; open access the programming historian: identity and history this article describes work undertaken by the editorial board of the programming historian to situate diversity at the heart of our open access (oa) project. founded in by william j turkel and alan maceachern, the programming historian publishes novice-friendly, peer-reviewed tutorials that help humanists learn a wide range of digital tools, techniques and workflows to facilitate research and teaching. turkel and maceachern focused their initial lessons on the programming language python, and these were published oa as a network in canadian history & environment (niche) ‘digital infrastructure’ project. in the programming historian expanded its editorial team and launched as an oa peer-reviewed scholarly journal of methodology for digital historians. in we added a spanish-language sub-team to the initial english-language team, and in started publishing translated tutorials under the title the programming historian en español. in we hosted our first spanish-language writing workshop in bogotá, colombia, issued a call for new tutorials in spanish, and began to plan for translating tutorials into english. in the same year we added a french-language sub-team and in launched the programming historian en français. at the time of writing, the programming historian has published tutorials: in english, in spanish, and two in french. insights – , diversity and inclusion: the programming historian | anna-maria sichani et al.
james baker senior lecturer in digital history and archives university of sussex maria josÉ afanador-llach assistant professor in digital humanities and history universidad de los andes brandon walsh head of student programs, scholars’ lab university of virginia anna-maria sichani research fellow in media history and historical data modelling university of sussex the editorial board of the programming historian (consisting of individuals at the time of publication) has long considered its work as consisting of much more than merely running journals. all three publications are embedded within the existing infrastructure of scholarly publications: they all have issns and are listed in the directory of open access journals. all tutorials are subject to peer review, and come with all the usual publications metadata associated with a journal. but, even with these features, the programming historian is more properly an ongoing project: it requires software development, conducts community surveys and solicits community input. it also seeks to actively rebalance global access to computational skills and methods, and advocates against the bifurcation of technical and scholarly roles in digital research. additionally, it takes seriously its commitment to transparency, accountability and diversity. in this case study we explore four aspects of the programming historian as a project, with a focus on practice that is intended to work towards diversifying digital history. in the first part we examine the conception and implementation of our policy on editorial board diversity. in the second part we discuss our translation initiatives: how they came about, the new structures that were required to support them, and how translation has created not only new audiences for our work but has also drawn in reviewers, editors and supporters from communities not previously considered by the editorial board. 
thirdly, we explore a consequence of translation: the need to examine the anglophone preconceptions and prejudices of our documentation, workflows and tutorials, and a drive to ensure that all our tutorials – english, spanish, or french – are written for international audiences. finally, we describe our open ethos: a key part of our socio-technical project infrastructure. we argue that publishing of oa material alone does not rebalance global access to computational skills and methods. only by making our process open (from site updates to peer review) have we been enabled to work towards our ambition of diversifying digital history. the programming historian and diversity diversity policy – gender and cultural since its relaunch as a publication in , the programming historian has been actively committed to diversity. to quote from our diversity policy: ‘we insist on a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience’. given the complexities of gender expressions and national identity, we recognize that this is not an easy conversation with easy solutions. we therefore see the text of our policy as a living document and hope it can create opportunities for further conversation. at times, despite these commitments, the programming historian has become aware of the ways in which its own structure, processes and editorial board might inadvertently be reinforcing barriers to access (see, for example, adam crymble’s research ). addressing these shortcomings and honouring these commitments is a process and requires ongoing work. to that end, our commitment to diversity also extends to our editorial board, and our diversity policy ensures that members from any one gender or any one nationality do not comprise more than % + of the members on the board.
this policy ensures that the project continues to benefit from diverse viewpoints. in terms of authors’ diversity, at the time of writing the programming historian contained tutorials written by authors ( men and women) affiliated with institutions from the uk, belgium, canada, and the usa. at the time of writing, the editorial board has eight editors (four men and four women) affiliated with universities from the usa and the uk; the spanish editorial team has five members (three men and two women) affiliated with institutions from mexico, germany, the usa and colombia, and the french editorial team has three members (one man, two women) affiliated with institutions from france and canada. any time the board grows or shrinks, the diversity policy ensures that we examine anew our make-up to ensure that we are doing the best we can to represent and address the needs of our diverse international audience. translation full-language initiatives as mentioned earlier, in the programming historian added a spanish-language sub-team to the initial english-language team, and we launched the programming historian en español with spanish translations of existing lessons. after the writing workshop in bogotá, colombia, we started the editorial process to add original tutorials in spanish, and we have just published the first original spanish lesson. in the french-language sub-team joined the project, and in the programming historian en français was launched. the programming historian is now a proudly multilingual project involving a large team. translation requires extensive teamwork among the language sub-teams and co-ordination across our editorial team. but translation is not purely a matter of converting lessons from english into another language.
these new full-language initiatives have challenged technical infrastructure and our operation as an oa scholarly publication. for example, as we are committed to publishing openly reviewed tutorials to a high standard, there is an extensive set of technical, editorial and administrative processes and policies in place (from peer review, technical infrastructure to issn and indexing). as new language sub-teams have joined the project, we have endeavoured to ensure consistency in the implementation of these processes and policies, whilst at the same time ensuring that lessons from the work of the sub-teams can enrich and enhance the project as a whole. to this end, we put in place the additional language sub-teams policy, a document that has the function of underscoring the effort and commitment that a translation initiative requires of a language sub-team, both in terms of development and maintenance. ad hoc translation initiating and hosting a language initiative requires a huge effort and commitment both in terms of development and maintenance. with this in mind, our decision to integrate full translation initiatives has been carefully thought through. a by-product of this is our work to support and encourage ad hoc translation of the programming historian tutorials. since this is an oa publication, all our lessons are published under the creative commons attribution licence (cc by) and this allows anyone to distribute, remix, reuse and build upon the lesson as long as the original source is credited. by choosing one of the more liberal of the cc licences, we allow derivative works of the lessons, including translations (and even more creative adaptations of them), and we enable onward reuse. although we are not able to host or maintain ad hoc translations as part of our own infrastructure, we actively encourage ad hoc translations by providing information and tips on the translation process based on the existing language initiatives. 
through ad hoc translations, we will be able to celebrate once more the benefits of oa educational content, to map our audience’s linguistic diversity and enable creative reuse of the project. internationalization writing for an international audience the programming historian editors, authors and readers live all around the world and operate in a range of language and cultural contexts. publishing in more than one language, as we have done since , is helping us to reach that global audience, and we have an ambition to – where possible – translate all published tutorials across our language initiatives. but translation alone is not enough to ensure reaching an audience. through our work translating and adapting the tutorials from english to spanish, we came to understand that a global project places additional responsibilities on authors, editors and reviewers. and so in april we developed our guidelines on writing for an international audience that formalize a requirement for authors to take steps to write tutorials that are accessible to as many people as possible. the guidelines begin by recognizing that not all methods or tools are fully accessible to international audiences. they then make specific recommendations that emerge from our experience of translation. initially, they ask authors to consider whether their chosen method is reproducible in languages other than those in which they have advanced proficiency. in part this relates to choosing to write tutorials for digital research tools with multilingual documentation, but it also relates to the anglophone assumptions of some of these tools. this is particularly acute for text analysis tools, which we know from experience do not always support different character sets (e.g. accented characters, non-latin scripts, etc.).
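the character-set problem mentioned above can be illustrated with a minimal python sketch. it is not taken from any programming historian lesson; the helper name normalize_token and the sample word are assumptions chosen only for illustration. the sketch shows that an accented word can be stored either with a precomposed character or with a base letter plus a combining accent, so a tool that compares raw strings treats the two as different words unless it normalizes them first (here with the standard library's unicodedata.normalize).

```python
import unicodedata


def normalize_token(token: str) -> str:
    """Return the token in Unicode NFC form, case-folded for comparison.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFC", token).casefold()


# The same word, encoded in two different ways.
composed = "Bogot\u00e1"      # 'á' as a single precomposed code point
decomposed = "Bogota\u0301"   # 'a' followed by a combining acute accent

# Both strings render identically, but their raw code points differ.
raw_equal = composed == decomposed

# After normalization the two spellings compare as the same token.
normalized_equal = normalize_token(composed) == normalize_token(decomposed)

print(raw_equal, normalized_equal)
```

a tutorial author could apply a step like this before counting or matching words, so that readers working with spanish or french sources are more likely to get the same results as readers working with unaccented english text.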
building on this, our guidelines ask that authors consider the primary sources they chose for their tutorial, and whether alternative data sets from outside their geographical expertise can be suggested for readers to explore. finally, we ask that authors consider international audiences when constructing their prose, avoiding language that is nationally or regionally specific. so, we ask authors to be sensitive to how specific cultural references, idiomatic expressions, or tones, might not register for all audiences, and, in addition, how readers might not be familiar with persons, organizations, or historical details specific to a particular culture. for example, a lesson that might joke, ‘don’t throw away your shot! try text analysis today!’ will parse for people who have seen the play hamilton, but even in this case we cannot assume that everyone will understand the reference. readers without that cultural familiarity may be confused. we also ask that examples of code and metadata use internationally recognized standards for date and time formats. taken together, these guidelines not only ease translation but they also work towards ensuring that all of our international audiences find the tutorials we publish approachable and intelligible. as part of this internationalization strategy, in august of we hosted a writing workshop in bogotá, colombia. with sponsorship from the british academy, we brought together humanities scholars from across the americas (chile, argentina, cuba, mexico, colombia, brazil, canada and the usa) with the objective of writing tutorials on digital humanities methodologies that specifically addressed research needs in latin america and the hispanic world. until now, the programming historian had exclusively contained lessons originally written in english that were later translated into spanish. 
we are beginning to reverse this, and we have received our first lessons originally written in spanish, ready to be translated in the opposite direction. neutral political policy as part of our internationalization agenda, we developed a neutral political policy. the programming historian is an international publication and welcomes readers and contributors with a wide range of political, cultural and religious views. while the members of our editorial board are undoubtedly passionate about a range of issues, we decided that the programming historian and the editorial board must remain apolitical with regard to party politics, elections, referendums and matters of international relations. this extends to but is not limited to posts on the programming historian blog, and any social media outlets maintained by or on behalf of the programming historian. editors are free to express their own views, but should do so in a manner that makes it clear that they speak on behalf of themselves and not the project or its editors. open access/open ethos for the programming historian, oa is about not only offering freely accessible tutorials on digital methods and tools to a truly international and wide audience, but also developing an overall workflow that embraces the values of transparency, collaboration, mutual respect and open peer review in scholarly communication. to this end, we are committed to and are actively working towards making as much as possible of our process, communications, decisions and workflow (from site updates to policy discussion) publicly available on github, a platform for sharing, managing and versioning coding projects. the centrepiece of our open ethos is that the peer review for each tutorial happens in public.
and so while each tutorial receives the committed attention of a particular editor and pair of reviewers, each review also can, for a time, receive input from anyone through the github interface. this process has led to fruitful – if challenging – discussions about inclusion, intellectual diversity and internationalization that would not be possible in a closed peer-review pipeline. in addition, the project engages with the technical protocols of github as a means of deliberately slowing down its procedures in order to focus, in public, on the translation process. with the exception of new tutorials, each time an editor seeks to edit or add new text to the vast majority of pages on the programming historian website, those edits or additions must be approved by at least two fellow editors before they are ‘pushed’ to the live site. as all of our pages – from our diversity policy to author guidelines – are published (at present) in three languages, one of the approvers must come from a language sub-team. this not only alerts the relevant translation teams of the intention to add or change text, but it also starts a process whereby the proposed changes are not incorporated until they have been translated into each language in which we publish. the result is a slow process of simultaneous translation, but one that ensures internationalization is publicly incorporated into regular workflows rather than as an afterthought. conclusion – future directions the programming historian, an oa online publication actively operating in multiple languages, has developed a model to address the problem of global and linguistic access to digital resources, methodologies and tools for the humanities. this commitment to linguistic and geographic diversity in the digital humanities means understanding the limits and possibilities of the institutional, historical, cultural and economic contexts in regions like latin america.
during the course of our work in first the spanish-speaking and, latterly, the francophone world, the editors of the programming historian came to understand that oa alone cannot foster diversity and inclusion; rather, we found that the expansion of a community of open practice raises new questions about what it means to be open in the context of access, diversity and inclusion. the programming historian has presented work at various international venues, and continues to actively seek out partnerships to promote these values. thus, by bringing an inclusive process of internationalization and dialogue about these limitations to the fore, the programming historian hopes to serve as a model for future global collaborations. abbreviations and acronyms a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘full list of industry a&as’ link: http://www.uksg.org/publications#aa competing interests the authors have declared no competing interests. article copyright: © anna-maria sichani, james baker, maria josé afanador-llach and brandon walsh. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.
corresponding author: anna-maria sichani research fellow in media history and historical data modelling university of sussex, uk e-mail: a.sichani@sussex.ac.uk orcid id: http://orcid.org/ - - - to cite this article: sichani a-m, baker j, afanador-llach mj and walsh b, “diversity and inclusion in digital scholarship and pedagogy: the case of the programming historian,” insights, , : , – ; doi: https://doi.org/ . / uksg. submitted on february             accepted on march             published on may published by uksg in association with ubiquity press. references . the programming historian: https://programminghistorian.org/ (accessed april , ). . note that we do not issue dois, though this has been discussed by the editorial board on multiple occasions and in multiple contexts: see “issues tracker: search for doi”, the programming historian: https://github.com/programminghistorian/jekyll/issues?utf =%e % c% &q=doi (accessed march , ). ultimately, we have prioritized dedicating our resources to the diversity and internationalization agenda discussed in this article over implementing unique identifiers. . “diversity policy”, the programming historian: https://programminghistorian.org/en/about#diversity-policy (accessed march , ). . adam crymble, “identifying and removing gender barriers in open learning communities: the programming historian” in: blended learning in practice (autumn, ), – : http://researchprofiles.herts.ac.uk/portal/files/ /blip_ _autumn_ _final_autumn_ .pdf (accessed march , ). . “diversity policy”, the programming historian. . “additional language sub-teams policy”, the programming historian, last modified may : https://github.com/programminghistorian/jekyll/wiki/additional-language-sub-teams-policy (accessed march , ). . “write for a global audience”, the programming historian: https://programminghistorian.org/en/author-guidelines#write-for-a-global-audience (accessed march , ). . 
adam crymble and maria josé afanador-llach, “writing workshop report bogotá, colombia”, the programming historian (blog), august , : https://programminghistorian.org/posts/bogota-workshop-report (accessed march , ). . jennifer isasi and josé antonio motilla, “convocatoria para lecciones en español en the programming historian”, the programming historian (blog), april , : https://programminghistorian.org/posts/convocatoria-de-tutoriales (accessed march , ). . “neutral political policy”, the programming historian: https://github.com/programminghistorian/jekyll/wiki/neutral-political-policy (accessed march , ).
how digital are the digital humanities?
an analysis of two scholarly blogging platforms city, university of london institutional repository citation: puschmann, c. & bastos, m. t. ( ). how digital are the digital humanities? an analysis of two scholarly blogging platforms. plos one, ( ), e . doi: . /journal.pone. this is the published version of the paper. this version of the publication may differ from the final published version. permanent repository link: http://openaccess.city.ac.uk/ / link to published version: http://dx.doi.org/ . /journal.pone. copyright and reuse: city research online aims to make research outputs of city, university of london available to a wider audience. copyright and moral rights remain with the author(s) and/or copyright holders. urls from city research online may be freely distributed and linked to. city research online: http://openaccess.city.ac.uk/ publications@city.ac.uk research article how digital are the digital humanities? an analysis of two scholarly blogging platforms cornelius puschmann *‡, marco bastos ‡ faculty of social sciences, zeppelin university, am seemooser horn d, friedrichshafen d- , germany, franklin humanities institute, duke university, s. buchanan blvd, bay box , durham, north carolina , united states of america ‡ these authors contributed equally to this work. * cornelius.puschmann@hiig.de abstract in this paper we compare two academic networking platforms, hastac and hypotheses, to show the distinct ways in which they serve specific communities in the digital humanities (dh) in different national and disciplinary contexts. after providing background information on both platforms, we apply co-word analysis and topic modeling to show thematic similarities and differences between the two sites, focusing particularly on how they frame dh as a new paradigm in humanities research.
we encounter a much higher ratio of posts using humanities-related terms compared to their digital counterparts, suggesting a one-way dependency of digital humanities-related terms on the corresponding unprefixed labels. the results also show that the terms digital archive, digital literacy, and digital pedagogy are relatively independent from the respective unprefixed terms, and that digital publishing, digital libraries, and digital media show considerable cross-pollination between the specialization and the general noun. the topic modeling reproduces these findings and reveals further differences between the two platforms. our findings also indicate local differences in how the emerging field of dh is conceptualized and show dynamic topical shifts inside these respective contexts. introduction the advent of the internet has profoundly affected scholarly communication [ – ]. few scholars, whether in the sciences, social sciences, or humanities can imagine conducting research or organizing teaching without relying on email, digital library services, or e-learning environments. formal academic publishing has undergone a series of changes with the increased availability of electronic publications, whether under an open access or toll access regime [ ]. structural changes in the dissemination of knowledge have largely been gradual and evolutionary: while the volume of scholarly publications has greatly increased in the past decades and the formal and distribution models have diversified, the form and function of research articles and scholarly monographs have remained relatively stable [ ]. plos one | doi: . /journal.pone. february , / open access citation: puschmann c, bastos m ( ) how digital are the digital humanities? an analysis of two scholarly blogging platforms. plos one ( ): e . doi: . /journal.pone.
academic editor: vincent larivière, université de montréal, canada received: june , accepted: november , published: february , copyright: © puschmann, bastos. this is an open access article distributed under the terms of the creative commons attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. data availability statement: all relevant data are within the paper and its supporting information files. funding: this work was supported by the national science foundation under grant number and the german research foundation under grant number pu / - . the funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. competing interests: the authors have declared that no competing interests exist. meanwhile, the range of avenues available for the dissemination of informal scholarly communication has increased exponentially. in addition to formal publication venues, scholars can now communicate their findings in (micro)blogs, wikis, social networking sites (sns) and countless other social web platforms [ – ]. such services carry both opportunities and risks for early-career researchers, and they are used for a wide variety of purposes and with a range of motives [ – ]. while researchers are able to disseminate their findings more quickly and reach out to broader audiences than was previously possible, they also risk that their work will not be acknowledged in more traditional and hierarchical professional structures. informal genres of scholarly communication frequently lack peer review and rely on new measures of impact, rather than the established currency of acceptance within a field [ ].
as a result, researchers have overall been very careful in their acceptance of digital formats that compete with established forms of expert knowledge dissemination, largely choosing instead to focus on established formats [ ]. this is especially true in the humanities, where conservatism towards new formats is particularly strong. digital humanities (dh) can be broadly characterized as the adoption of an array of computational methodologies for humanities research [ , ]. during the early nineties, dh scholarship developed under the umbrella of several academic organizations dedicated to what was then commonly referred to as humanities computing [ ]. these organizations brought together scholars from different fields interested in exploring computational methods for traditionally-defined humanities scholarship [ ]. the prefix “digital” is increasingly used to delineate the new computational areas of humanities research (e.g. digital literature, digital archaeology, digital history, etc.). the introduction of computational methods aims among other things to supplement established humanities research routines and explore new methodological avenues, such as text analysis and encoding; archive creation and curation; mapping and gis; and modeling of archaeological and historical data [ , ]. since the early s the term digital humanities has also been used to refer to humanities research defined by a data-driven approach, in which summarization and visualization are important methodological cornerstones. media and cultural studies, library and archival studies, digital pedagogy, and the recent emergence of moocs have also been referred to as digital humanities in a more general sense [ ]. as a result, dh has evolved to incorporate a range of different definitions and is subject to considerable interpretative flexibility [ ].
the central hypothesis of this study is that the variety of terms and topics associated with dh is locally configured, and that their makeup reflects different (and to a degree contradictory) conceptualizations of what constitutes dh. dh and social media because of its interdisciplinary and international character, its affinity for digital media, and its recent emergence as a scholarly movement, dh has been comparatively strongly impacted by informal communication tools such as blogs and twitter, with junior scholars invested in dh research using such tools widely to organize, network, and collaborate. kirschenbaum notes the important role of social media for establishing and galvanizing dh as a movement: “twitter, along with blogs and other online outlets, has inscribed the digital humanities as a network topology, that is to say, lines drawn by aggregates of affinities, formally and functionally manifest in who follows whom, who friends whom, who tweets whom, and who links to what.” [ ] usage of twitter and blogs has contributed to establishing dh as a brand, and it has helped to increase its visibility on a global scale [ ]. while actively using social media does not make one a digital humanist, social media applications seem to be perceived as valuable instruments for intra-community communication in the dh community, rather than being used just out of curiosity or for self-promotion [ ]. crucially, there are scholars who take up blogging and twitter because they are important channels of communication in the dh community. such tools therefore increasingly constitute scholarly infrastructure to their users in the same sense that library services and communal mailing lists constitute infrastructure.
while traditional scholarly organizations are struggling to integrate social media, dh scholars, especially junior researchers, have taken up such tools to a considerable degree, as reflected for example in the strong use of twitter at the annual digital humanities conference [ , ]. dh can therefore be characterized as an emerging digital scholarly network, that is, a group of scholars that has integrated digital genres of scholarly communication into its communicative infrastructure from the outset. inside such a network, in which heterogeneous links connect different actors, it should be possible to study the flow of ideas, trends, and discourses much more effectively through social media than purely by assessing formal publications in scholarly journals and monographs [ ].

hastac

the humanities, arts, science, and technology alliance and collaboratory (hastac) is an online community and social network that connects researchers, young scholars, and the general public interested in a wide range of subjects associated with dh and peer-to-peer learning. founded in by davidson and goldberg [ ], hastac emerged as a consortium of educators, scientists, and technology designers funded by the national science foundation, the digital promise initiative, and the macarthur foundation, with infrastructure provided by duke university and the university of california humanities research institute. hastac differs from similar initiatives in that it is largely decentralized, with content generated by a network of over ten thousand members including university faculty, students, and the general public. the network platform is built on the drupal content management system and requires a free-of-charge membership that is open to all.
member participation varies widely: many members register but interact with the website only passively by reading content, while a robust minority express their thoughts and communicate their interests by writing or commenting on blog posts, joining discussion forums, or contributing information about current events. according to the initiative’s website, “hastac members are motivated by the conviction that the digital era provides rich opportunities for informal and formal learning and for collaborative, networked research that extends across traditional disciplines, across the boundaries of the academy and the community, across the two cultures of humanism and technology, across the divide of thinking versus making, and across social strata and national borders.” [ ]. while the platform is interdisciplinary in nature, it is strongly focused on learning and dh-related topics.

hypotheses

hypotheses is a publication platform for academic blogs. launched in , it is funded and operated by the centre for open electronic publishing (cléo), a unit that brings together two major french research institutions and two universities: the centre national de la recherche scientifique (cnrs), the École des hautes Études en sciences sociales (ehess), the aix-marseille université, and the université d’avignon. in addition to hypotheses, cléo provides other tools via the openedition portal: revues.org, a platform for journals in the humanities and social sciences, and calenda, a calendaring tool. according to the hypotheses website, “[a]cademic blogs can take numerous forms: accounts of archaeological excavations, current collective research or fieldwork; thematic research; books or periodicals reviews; newsletter etc. hypotheses offers academic blogs the enhanced visibility of its humanities and social sciences platform. the hypotheses team provides support and assistance to researchers for the technical and the editorial aspects of their project.” [ ] to publish on hypotheses, a blog must first be admitted by the platform’s editorial team. only researchers employed by institutions of higher learning are eligible to join hypotheses after having been evaluated, and the criterion for positive evaluation is a consistent focus on academic issues. through its policy the platform maintains some characteristics of a formal publication outlet, aiming to stimulate both open discussion within scholarly disciplines and exchange with the broader public.

hypotheses is based on the wordpress content management platform, with a home page that features current contributions from participant blogs. in addition to english, a large portion of hypotheses’ content is composed in french, german, spanish, and other languages, but for the purpose of this study we only considered posts published in english.

similarities and differences

both platforms share strong similarities: they aim to promote new forms of scholarly communication and knowledge dissemination. at the same time, there are also considerable differences: hastac places a clear emphasis on learning and also mentions media and communication in its self-characterization. while hypotheses is also interdisciplinary in character, it has a stronger slant towards traditional humanities subfields, and specifically towards history. the concept of scholarly blogging outlined on the hypotheses website points to its role for intradisciplinary communication, whereas hastac is more geared towards interdisciplinary exchange. despite these differences, the two platforms make an ideal case for comparison on the grounds of their functional similarities. both are related to dh, both seek to integrate blogging into scholarly communication, and both are publicly funded.
furthermore, both platforms have been operational for a similar timespan and attract broadly comparable user communities.

research design

our aim is to characterize differences in the discourse that takes place on hastac and hypotheses, differences that reflect different cultural implementations of dh and different understandings of what constitutes dh. to this end, we formulated two research questions: how frequent are particular keywords associated with (digital) humanities on the two platforms (h ), and what are the thematic differences in the distribution of topics on the two sites (h )? we approached the first question by counting the co-occurrence of humanities-related terms and their digital equivalents (e.g. history vs. digital history) in blog posts. in a second step we applied topic modeling to the post content to identify substantial thematic differences between the communities on both platforms and their respective approaches to blogging. based on the self-characterizations of both platforms, we expected there to be both overlap and variation with regard to the adoption of dh-related labels and overall disciplinary focus.

data

the data from the two platforms were collected from database dumps containing the sql table structure and the blog post content. hastac data included content posted between august , and august , , together with the profile data of , users. most users shared brief biographical information and identified a set of topical interests, institutional affiliation, and links to personal websites. in addition to the posts themselves, the hypotheses data included metadata such as author information, timestamp, text, and internal and external links in each post, collected between the st of july and the rd of june .

the language of posts was detected automatically using the language identification system langid.py for python, which supports a large number of languages and achieves a high level of accuracy without requiring prior in-domain classifier training [ ]. the material initially included a large number of posts in languages other than english ( , posts), published over different periods of time. for the purpose of this investigation, we only considered blog posts in english published between the st of july and the th of june , thus extracting , posts from hastac and , posts from hypotheses. we performed a co-word analysis over these , posts [ ] and subsequently extracted a random sample of , posts from each platform to perform topic modeling. fig. shows a frequency histogram of blog posts in the abovementioned period on a logarithmic scale, with hastac posts being comparatively more frequent from to , and posts on hypotheses being comparatively more frequent in the period thereafter. activity on both platforms drops during the summer vacation months (july for hastac and august for hypotheses), reflecting seasonal work patterns.

methods

we approached our first question (h ) by means of a co-word analysis of keywords associated with humanities and digital humanities research [ ].
we used one vector of twenty humanities areas (anthropology, archaeology, archive, art, culture, ethnography, history, humanities, learning, libraries, literacy, literature, media, pedagogy, preservation, publishing, rhetoric, scholarship, storytelling, knowledge) and another, otherwise identical vector with the prefix “digital” added (digital anthropology, digital archaeology, digital archive, digital art, digital culture, digital ethnography, digital history, digital humanities, digital learning, digital libraries, digital literacy, digital literature, digital media, digital pedagogy, digital preservation, digital publishing, digital rhetoric, digital scholarship, digital storytelling, digital knowledge). these keywords include terms that describe fields or general domains associated with the humanities, chosen on the basis of raw token frequencies identified in the two datasets.

fig . english-language blog posts published on both sites between and . doi: . /journal.pone. .g

this approach comes with considerable limitations. firstly, the semantics of the terms differ considerably, as some describe fields of scholarship (history vs. digital history), while others are more general and tend to be polysemous (knowledge, media). the same applies to their prefixed counterparts, with digital history likely identifying a field, while digital media most likely describes certain kinds of technical media. furthermore, issues of precision and recall arise: not all discussion of the relevant phenomena is reliably captured, and some of what is captured relates to other concepts. in spite of these limitations, we found co-word analysis to be useful, because it shows the entrenchment of the terms as convenient and fashionable labels on both platforms.
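the keyword matching described above can be sketched in a few lines of python. this is a minimal illustration and not the authors' code: the function names, the three-term subset of the keyword vector, and the rule that an occurrence directly preceded by “digital” counts only towards the prefixed label are assumptions made here, and the four sample posts are invented.

```python
import re

# Three of the twenty unprefixed keywords; the full vector works the same way.
TERMS = ["history", "archive", "media"]

# Invented sample posts, for illustration only.
SAMPLE_POSTS = [
    "Digital history builds on history.",
    "A digital history project.",
    "Media and history of the archive.",
    "New media art.",
]


def has_prefixed(text, term):
    """True if the label 'digital <term>' occurs in the lowercased text."""
    return re.search(r"\bdigital " + re.escape(term) + r"\b", text) is not None


def has_unprefixed(text, term):
    """True if <term> occurs at least once without 'digital ' in front of it.

    Assumption: occurrences inside 'digital <term>' do not count as unprefixed.
    """
    pattern = r"(?<!digital )\b" + re.escape(term) + r"\b"
    return re.search(pattern, text) is not None


def coword_counts(posts, terms=TERMS):
    """Count, per term, the posts using the unprefixed label, the prefixed label, and both."""
    counts = {t: {"unprefixed": 0, "prefixed": 0, "both": 0} for t in terms}
    for post in posts:
        text = post.lower()
        for t in terms:
            u = has_unprefixed(text, t)
            p = has_prefixed(text, t)
            counts[t]["unprefixed"] += int(u)
            counts[t]["prefixed"] += int(p)
            counts[t]["both"] += int(u and p)
    return counts


if __name__ == "__main__":
    for term, c in coword_counts(SAMPLE_POSTS).items():
        print(term, c)
```

exact string matching of this kind makes the precision and recall limits concrete: plural or inflected forms are missed unless they are listed as keywords, and polysemous terms such as media are counted regardless of sense.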
we accept that such labels do not narrowly identify concepts, but believe that they are suitable to characterize the success of particular terms around which the dh community can rally.

using these terms we generated a series of term-document matrices for each of the networks. we visualized the association between humanities and dh by performing a multinomial logistic regression on the terms. we relied on the textir package for r [ ] to convert the term-to-term co-occurrence matrix to a matrix of the log-odds ratios of co-occurrence. the resulting matrices (hastac and hypotheses) scale the word similarity as a function of word frequency, with terms of similar semantic content numerically represented as being similar to one another [ ]. after converting the log-odds ratios to distance matrices using cosine similarity [ , ], we relied on multidimensional scaling [ ] to visualize humanities and dh terms in a latent semantic space [ ] with a two-dimensional density surface [ ].

the second question (h ) was addressed using an implementation of latent dirichlet allocation [ ] for r [ ]. the r package topicmodels allows the probabilistic modeling of term frequency occurrences in documents and the estimation of similarities between documents and words using an additional layer of latent variables referred to as topics. the package provides the basic functions for fitting topic models based on data structures from the text mining package tm [ ]. topics were modeled using a mixed-membership approach in which documents are not assumed to belong to single topics, but to simultaneously belong to several topics, with varying distributions across documents. to equally represent both platforms, we drew a random sample of , posts from each platform from the data previously described.
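the step from co-occurrence to distances described above can be illustrated with a small, dependency-free sketch. it is not the textir procedure: the log-odds ratio here is computed from a plain 2x2 contingency table per term pair with a 0.5 continuity correction, which is an assumption chosen for simplicity, and the function names and the toy presence matrix are hypothetical. the multidimensional scaling step is left out.

```python
import math


def log_odds_matrix(presence):
    """Pairwise log-odds ratios of co-occurrence.

    presence: one row per document, one 0/1 flag per term.
    Assumption: a 0.5 correction is added to every cell so that
    empty cells do not produce log(0) or a division by zero.
    """
    n_terms = len(presence[0])
    lor = [[0.0] * n_terms for _ in range(n_terms)]
    for i in range(n_terms):
        for j in range(n_terms):
            if i == j:
                continue
            both = sum(1 for d in presence if d[i] and d[j])
            only_i = sum(1 for d in presence if d[i] and not d[j])
            only_j = sum(1 for d in presence if not d[i] and d[j])
            neither = sum(1 for d in presence if not d[i] and not d[j])
            lor[i][j] = math.log(
                ((both + 0.5) * (neither + 0.5)) / ((only_i + 0.5) * (only_j + 0.5))
            )
    return lor


def cosine_distance(u, v):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(lor):
    """Cosine distances between the rows (term profiles) of a log-odds matrix."""
    return [[cosine_distance(r, s) for s in lor] for r in lor]


if __name__ == "__main__":
    # Toy data: terms 0 and 1 always appear together, term 2 never appears with them.
    presence = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]
    for row in distance_matrix(log_odds_matrix(presence)):
        print([round(x, 3) for x in row])
```

a distance matrix of this shape is what a multidimensional scaling routine takes as input; terms with similar co-occurrence profiles end up close together in the resulting two-dimensional layout.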
prior to mapping the documents to the term frequency vector, we tokenized the posts and processed the tokens by removing punctuation, numbers, and stop words and by stemming, in order to sparsen the matrices. we also omitted very short documents (< characters) for the same purpose.

ethics statement. the authors confirm that the study is in compliance with the terms and conditions of hastac and hypotheses.

results

co-word analysis

with respect to our first research question (h ) we found that unprefixed keywords occurred far more often than their prefixed counterparts. table shows the number of occurrences of humanities and dh terms on both platforms, with a high concentration of posts focusing on art, media, history, culture, and humanities, followed by learning, publishing, and libraries. the areas of research with fewer occurrences are archaeology, storytelling, ethnography, and preservation. hastac presented a much higher number of references to humanities ( , ) and dh ( , ) in comparison to hypotheses ( , and , respectively). the ratio of posts with humanities-related terms to posts with dh-related terms also differs between the platforms: on hastac there are seven posts on humanities to each post on dh, while on hypotheses the ratio is fifty-one posts on humanities to each post on dh. in fact, we found no mention of nine areas of dh in the hypotheses sample.

although the distribution of humanities and dh terms is skewed towards hastac, the distribution per area of research on humanities is fairly similar. fig. shows a cluster dendrogram of term co-occurrences based on euclidean distance, with humanities areas appearing at the top of the hierarchical structure and dh terms appearing near the bottom. art, culture, and media are likely to also refer to general terms rather than only humanities disciplines, therefore presenting a higher value of intergroup dissimilarity and appearing higher up in the hierarchy.
more narrowly defined areas such as learning and digital media follow on hastac, while the hierarchical clustering of topics on hypotheses is topped by history and publishing. fig. shows internal differences and dissimilarities between the two platforms in their usage of the labels listed in table . dh subfields are much more distinct from other terms on hastac than they are on hypotheses, where many of the dh labels are either uncommon or not used at all.

unsurprisingly, we found that most blog posts that made reference to dh terms also included references to the unprefixed terms, but not the other way around. from the , posts on hastac that included references to humanities-related terms ( , occurrences), % of them also included references to the corresponding label in dh. however, from the , posts on hastac that included references to digital humanities terms ( , occurrences), only % of them also included references to the corresponding term in the humanities. this asymmetry is actually more pronounced in the hypotheses network. from the , posts on hypotheses that included references to humanities-related terms ( , occurrences), % also included references to the corresponding term in dh. however, from the posts on hypotheses that included references to dh-related terms ( occurrences), only % also included references to the corresponding humanities area.

the dependence of digital humanities on established humanities labels is consistent, but it varies considerably within each of the areas investigated. the average percentage of posts per area that include reference to both humanities and dh is still quite skewed, as % of posts on hastac (mean = . , median = . ) and hypotheses (mean = . , median = . ) dedicated to digital humanities areas also included references to the main humanities area. the reverse dependency is also observed in the aggregated data per area, as less than % of posts on hastac (mean = . , median = . ) and hypotheses (mean = . , median = . ) dedicated to humanities also included references to the related dh area.

table . number of occurrences of humanities and dh terms.

term            hastac hu   hastac dh   hypo hu   hypo dh
anthropology                                      na
archaeology                                       na
archive
art
culture
ethnography                                       na
history
humanities
knowledge                                         na
learning                                          na
libraries
literacy                                          na
literature                                        na
media
pedagogy                                          na
preservation
publishing
rhetoric                                          na
scholarship
storytelling

doi: . /journal.pone. .t

however, the dependency is noticeably lower in some fields of humanities. preservation and archival studies presented a much lower ratio of posts dedicated to digital humanities that also referred to the associated humanities area ( % and % on hastac, and % and % on hypotheses). storytelling, literacy, and pedagogy are also particularly independent in the hastac network, with %, %, and % of posts making reference to digital terminology without mentioning the related humanities field. on hypotheses, art is the term most detached from the main humanities area, with % of posts dedicated to digital art not making reference to the unprefixed field.

fig . hierarchical cluster dendrogram of term co-occurrences in both platforms. doi: . /journal.pone. .g

some areas show a strong intersection of humanities and dh terms. a considerable proportion of articles that refer to humanities, storytelling, and libraries also made reference to digital humanities, digital storytelling, and digital libraries ( %, %, and % on hastac, and %, %, and % on hypotheses).
media, scholarship, literacy, and preservation also presented higher-than-average levels of cross-pollination on hastac, with %, %, %, and % of the articles focusing on these terms also making reference to their niche digital humanities label. most of these terms also presented a considerable level of intersection of dh with general terms.

we further explored the interplay between humanities and dh by performing a multinomial logistic regression on the terms. the matrices of log-odds ratios of co-occurrence indicate the word similarity and allow for visualizing humanities and dh terms in a latent semantic space with a two-dimensional density surface. fig. shows a contour-sociogram of the terms with substantial cross-pollination across different topics of humanities and digital humanities research. hastac posts with humanities and dh terms are clearly clustered around four main groups. the first includes terms associated with humanities at large, culture, and arts; the second is dedicated to education and learning; the third to archives and libraries; and the last clusters terms associated with anthropology and history. on the other hand, hypotheses posts with humanities and dh terms are mostly concentrated in a single cluster due to many topics lacking more entry points. nonetheless, humanities content published on hypotheses presents clusters around humanities and media; archives, history, and arts; and one cluster grouping library-related materials.

the vast majority of articles focusing on digital media, digital libraries, digital art, digital humanities, digital culture, and digital publishing also included references to the main humanities area. this is particularly the case on hastac ( %, %, %, %, %, and %, respectively), but also on hypotheses ( %, %, %, %, %, and %, respectively).
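the two directions of dependency reported above are simple conditional shares, and a short worked example makes the asymmetry explicit. the function name is hypothetical and the counts below are invented for illustration; they are not figures from the study.

```python
def directional_shares(unprefixed, prefixed, both):
    """Return (dh_to_humanities, humanities_to_dh) as percentages.

    unprefixed: posts mentioning the humanities term (e.g. "history")
    prefixed:   posts mentioning the dh label (e.g. "digital history")
    both:       posts mentioning both
    A share is None when its denominator is zero, e.g. for a dh label
    that never occurs on a platform.
    """
    dh_to_hum = 100.0 * both / prefixed if prefixed else None
    hum_to_dh = 100.0 * both / unprefixed if unprefixed else None
    return dh_to_hum, hum_to_dh


if __name__ == "__main__":
    # Invented counts: 200 posts use the unprefixed term, 40 use the dh label, 30 use both.
    dh_to_hum, hum_to_dh = directional_shares(200, 40, 30)
    print(f"dh posts that also use the humanities term: {dh_to_hum:.1f}%")
    print(f"humanities posts that also use the dh label: {hum_to_dh:.1f}%")
```

with these made-up numbers, three quarters of the posts using the dh label also use the unprefixed term, while only 15% of the posts using the unprefixed term mention the dh label. the same 30 shared posts therefore look like strong dependence in one direction and relative independence in the other, purely because the two denominators differ.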
in short, the results predictably show a considerable one-way dependency of dh on the unprefixed keyword, and a relative independence of the latter from the former. however, there are a few dh areas that presented substantial independence from the related humanities area, namely preservation, archive, storytelling, literacy, and pedagogy. we interpret this emancipation as an indicator for the establishment of these terms as convenient labels, which, while not necessarily identifying clear-cut concepts, provide attractive brands for the dh community to rally around.

fig . density curves of log-odds co-occurrence ratios between humanities-related terms. larger labels represent thematic areas manually identified. doi: . /journal.pone. .g

topic modeling

we proceeded by exploring the topical differences between the two platforms to test our second research question (h ). we modeled twenty topics for the combined corpus of both platforms ( , posts each). table provides an overview of twelve selected topics and their ten most distinct terms by rank, some of which are related to particular domains (health, history, law, art, games), while others are related to more general themes (chatter, learning). topics were labeled through a qualitative interpretation of the most salient topic keywords and reading a sample of the associated blog posts, meaning that they retain a certain subjective bias.

table . common topics on hastac and hypotheses.

topic : health    topic : cold war    topic : law    topic : dh
health            war                 law            digital
medicine          university          legal          humanities
medical           korean              series         university
history           history             turkish        hastac
food              korea               history        new
university        cold                also           media
social            culture             said           will
urban             women               one            scholars
care              art                 book           technology
research          visual              new            research

topic : socmed    topic : data        topic : art    topic : urban std
social            can                 university     urban
can               data                art            social
new               use                 museum         political
media             will                history        new
one               digital             heritage       international
cultural          information         museums        studies
culture           project             cultural       european
time              also                music          global
digital           site                new            economic
space             work                sound          management

topic : gaming    topic : chatter     topic : learn  topic : energy
game              one                 students       energy
games             people              learning       climate
video             like                will           change
play              can                 can            policy
virtual           just                class          countries
world             time                new            will
one               even                education      global
can               think               digital        gas
gaming            now                 one            carbon
worlds            many                work           paper

doi: . /journal.pone. .t

most domain areas identified are strongly associated with content published on hypotheses through individual blogs with a clear and consistent topical focus (e.g. health, history, law, energy), while hastac has a stronger association with metatopics such as learning, data, and gaming. some topics of general interest (e.g. social media and data) are shared between the platforms. conference calls and job advertisements form two distinct yet evenly distributed topics based on their stylistic uniformity.

in addition to pointing out thematic differences, topics also reflect differences in style between the two sites. topic # (chatter) is lexically distinct from other topics in that it uses much more general nouns (time, people) and verbs (think, know).
it reflects a set of essayistic posts, particularly on hastac, which discuss controversial issues and tend to be relatively short. spam is also a distinct topic, but one that is shared between both sites.

we also found that while some topics overlap somewhat, many are highly characteristic of one of the two platforms. topics # (health), # (cold war), # (law), # (art), # (urban studies), and # (energy) are relatively clearly associated with hypotheses, while topics # (digital humanities), # (gaming), # (chatter), and # (learning) are prevalent on hastac. topics # (social media) and # (data) show a more even distribution between the two sites. similar to our findings in the co-word analysis, # (digital humanities) is more prevalent on hastac than on hypotheses. the distribution of topic scores suggests that a number of linguistically distinct thematic areas exist on hypotheses, and that these areas follow disciplinary patterns. by contrast, hastac posts are less clearly associated with a single field of inquiry and most closely associated with metatopics such as learning and general conversation. hastac posts are also linked to the discussion of digital humanities and the usage of labels related to dh. the differences between the two platforms may point to diverging goals associated with scholarly blogging: addressing broad interdisciplinary issues before a wider public vs. conducting focused scholarly discussion within fields.

the difference in the number of unique authors between the two platforms ( authors on hastac vs. authors on hypotheses) may influence the result of the topic modeling, with a few very specific topics present on hypotheses not represented on hastac (e.g. cold war).
nonetheless, the results confirm the observations drawn from the co-word analysis, with topics on hypotheses tending to be more disciplinarily aligned and connected exclusively to a single area of research, while posts on hastac are more likely to pick up interdisciplinary and general themes. fig. shows the topic scores in the selected topics, with each dot representing a post and its color indicating the platform.

discussion

the results reported in this study can be summarized in two parts. firstly, we found a substantial one-way dependency of dh terms on their unprefixed counterparts, as most blog posts dedicated to dh also included references to the corresponding humanities term ( % on hastac and % on hypotheses). dh-related labels are considerably more frequent on hastac, pointing to an unequal adoption of digital humanities-related terms in different local contexts. secondly, we found a tendency on hypotheses towards focused thematic areas representing disciplinary interests, contrasted with a tendency to discuss more general, cross-disciplinary themes on hastac.

in terms of institutional branches of humanities research, history is the area with the largest number of posts across the networks for the sample of topics considered in this study. areas that are not traditionally associated with humanities research (or institutions that support the field), i.e. library and media, also account for a considerable portion of the posts. we also found considerable topical differences between the two platforms. while traditional areas of the humanities and social sciences (history, art, law) are clearly represented on hypotheses, hastac is topically more cross-disciplinary and less focused on single disciplines. some of these topics show considerable overlap between the networks (i.e.
social media and data), highlighting the fact that there are areas in which users of hastac and hypotheses have similar interests, while others are considerably more predominant in one of the networks. although both networks are at the forefront of the digital humanities research agenda, they differ considerably in how they treat disciplinary labels: hastac uses new disciplinary labels explicitly, while hypotheses addresses well-established disciplinary themes without explicitly associating them with dh.

the differences we observed highlight that two platforms that attract broadly similar user communities may still differ considerably with regard to topics. we interpret the differences in adoption of digital humanities terminologies and topics across the networks to mirror different developments in dh. whereas digital learning, digital literacy, and particularly digital scholarship are prominent labels on hastac, hypotheses is mostly focused on digital libraries, digital history, and digital archives. these differences are both qualitative and quantitative in nature; they reflect not just the personal preferences of bloggers and users but may also indicate broader conceptual differences. while blog posts on hastac tend to raise issues suitable for (controversial) discussion, contributions on hypotheses more closely mirror traditional expository humanities genres (e.g. book chapters or essays). moreover, while hastac is a social network in which users can create profiles and interact with other users by posting and commenting on content, hypotheses is a publishing platform with less emphasis on community building than hastac, and a closer alignment with traditional genres of publishing.

the content of each network also presents considerable variation in terms of formats and style. the prominence of topic # (chatter) on hastac indicates that hastac’s blog entries are conceptually more like casual conversation than academic writing.
as blogs serve different purposes for different users, the data necessarily includes posts of different genres, comprising short essays, conference reviews, book reports, group discussions, and general academic advertising. while hastac and hypotheses are interdisciplinary in character, they have a strong slant towards the humanities, particularly towards learning and digital media on hastac, and specifically towards history on hypotheses. common to both networks is the small proportion of users producing the large majority of the content, which leads to a typical long-tail distribution of content within the platforms.

fig . distribution of posts per topic, with posts in red from hastac and posts in blue from hypotheses. doi: . /journal.pone. .g

in the final analysis, the results reported in this study show that the variety of terms and topics associated with dh is locally configured and reflects different conceptualizations of what constitutes dh. we expect this study to be informative for future research grappling with the rapid establishment of dh in humanities departments. at any rate, it will be interesting to follow the ongoing maturation of both platforms and their respective approaches to scholarly blogging, as well as the different conceptualizations of digital humanities scholarship in north american and european contexts.

supporting information

s materials. hastac dataset with , entries including timestamp and blog posts. (zip)
s materials. hypotheses dataset with , entries including timestamp and blog posts. (zip)

acknowledgments

we are thankful to david sparks for helping with hastac data analysis and visualization, the hastac team for providing access to the hastac data, ruby sinreich for providing important feedback to this research, and marin dacos for providing access to the hypotheses data.
author contributions

conceived and designed the experiments: cp mb. performed the experiments: cp mb. analyzed the data: cp mb. contributed reagents/materials/analysis tools: cp mb. wrote the paper: cp mb.

references

borgman cl ( ) scholarship in the digital age: information, infrastructure, and the internet. cambridge, ma: mit press. p. doi: . /jxb/erm pmid:
meyer et, schroeder r ( ) the world wide web of research and access to knowledge. knowl manag res pract : – . doi: . /kmrp. . .
nentwich m, könig r ( ) cyberscience . : research in the age of digital social networks. frankfurt am main: campus. p. doi: . /s - - - pmid:
dutton wh, jeffreys pw, editors ( ) world wide research: reshaping the sciences and humanities. cambridge, ma: mit press. p. doi: . /mitpress/ . . .
evans ja ( ) electronic publication and the narrowing of science and scholarship. science : – . doi: . /science. pmid:
cope ww, kalantzis m ( ) signs of epistemic disruption: transformations in the knowledge system of the academic journal. first monday . available: http://firstmonday.org/article/view/ / . accessed december .
mahrt m, weller k, peters i ( ) twitter in scholarly communication. in: weller k, bruns a, burgess j, mahrt m, puschmann c, editors. twitter and society. new york: peter lang. pp. – .
puschmann c, mahrt m ( ) scholarly blogging: a new form of publishing or science journalism . ? in: tokar a, beurskens m, keuneke s, mahrt m, peters i, et al., editors. science and the internet. düsseldorf: düsseldorf university press. pp. – .
puschmann c ( ) (micro)blogging science? notes on potentials and constraints of new forms of scholarly communication. in: bartling s, friesike s, editors. opening science. berlin, heidelberg: springer international publishing. pp. – . doi: . / - - - - .
shema h, bar-ilan j, thelwall m ( ) research blogs and the discussion of scholarly information. plos one : e . doi: . /journal.pone. pmid:
kjellberg s ( ) i am a blogging researcher: motivations for blogging in a scholarly context. first monday . available: http://firstmonday.org/article/view/ / . accessed december .
gruzd a, staves k, wilk a ( ) connected scholars: examining the role of social media in research practices of faculty using the utaut model. comput human behav : – . doi: . /j.chb. . . .
rowlands i, nicholas d, russell b, canty n, watkinson a ( ) social media use in the research workflow. learn publ : – . doi: . / .
priem j, hemminger bh ( ) scientometrics . : new metrics of scholarly impact on the social web. first monday . available: http://firstmonday.org/article/view/ / . accessed december .
bar-ilan j, haustein s, peters i, priem j, shema h, et al. ( ) beyond citations: scholars’ visibility on the social web. in: archambault É, gingras y, larivière v, editors. proceedings of the th international conference on science and technology indicators. montréal: science-metrix and ost. pp. – .
schreibman s, siemens r, unsworth jm, editors ( ) a companion to digital humanities. oxford: blackwell publishers. p. pmid:
gold mk, editor ( ) debates in the digital humanities. minneapolis, mn: university of minnesota press. p. doi: . /s - - - pmid:
berry d ( ) understanding digital humanities. basingstoke: palgrave macmillan. p. doi: . /s - - - pmid:
kirschenbaum mg ( ) what is digital humanities and what’s it doing in english departments? ade bull : – . . juola p ( ) killer applications in digital humanities. lit linguist comput : – . doi: . /llc/ fqm . . moretti f ( ) graphs, maps, trees: abstract models for a literary history. new york: verso. p. doi: . /j.encep. . . pmid: . mcpherson t ( ) introduction: media studies and the digital humanities. cine j : – . doi: . /cj. . . . pinch tj, bijker we ( ) the social construction of facts and artefacts: or how the sociology of sci- ence and the sociology of technology might benefit each other. soc stud sci : – . doi: . / . . ross c, terras m, warwick c, welsh a ( ) enabled backchannel: conference twitter use by digital humanists. j doc : – . doi: . / . . puschmann c, weller k, dröge e ( ) studying twitter conversations as (dynamic) graphs: visuali- zation and structural comparison. in: taddicken m, editor. proceedings of general online research . düsseldorf: dgof. . yan e, ding y ( ) a framework of studying scholarly networks. in: archambault É, gingras y, lari- vière v, editors. proceedings of the th international conference on science and technology indica- tors. montréal: science-metrix and ost. pp. – . . davidson cn, goldberg dt ( ) a manifesto for the humanities in a technological age. chron high educ: b . . hastac ( ) about hastac. available: http://www.hastac.org/about. accessed december . . hypotheses.org ( ) about hypotheses. available: http://hypotheses.org/about/hypotheses-org-en. accessed december . . lui m, baldwin t ( ) langid.py: an off-the-shelf language identification tool. in: li h, editor. proceed- ings of the th annual meeting of the association for computational linguistics. jeju island, korea: acl. pp. – . . callon m, courtial j-p, turner wa, bauin s ( ) from translations to problematic networks: an intro- duction to co-word analysis. soc sci inf : – . doi: . / . . taddy m ( ) multinomial inverse regression for text analysis. 
j am stat assoc : – . doi: . / . . . how digital are the digital humanities? plos one | doi: . /journal.pone. february , / http://dx.doi.org/ . / - - - - http://dx.doi.org/ . /journal.pone. http://www.ncbi.nlm.nih.gov/pubmed/ http://firstmonday.org/article/view/ / http://dx.doi.org/ . /j.chb. . . http://dx.doi.org/ . /j.chb. . . http://dx.doi.org/ . / http://firstmonday.org/article/view/ / http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /s - - - http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /s - - - http://dx.doi.org/ . /s - - - http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /llc/fqm http://dx.doi.org/ . /llc/fqm http://dx.doi.org/ . /j.encep. . . http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /cj. . http://dx.doi.org/ . / http://dx.doi.org/ . / http://dx.doi.org/ . / http://www.hastac.org/about http://hypotheses.org http://hypotheses.org/about/hypotheses-org-en http://dx.doi.org/ . / http://dx.doi.org/ . / . . . lipsitz sr, laird nm, harrington dp ( ) generalized estimating equations for correlated binary data: using the odds ratio as a measure of association. biometrika : – . doi: . /biomet/ . . . . moody j, light r ( ) a view from above: the evolving sociological landscape. am sociol : – . doi: . /s - - - . . leydesdorff l ( ) top-down decomposition of the journal citation report of the social science ci- tation index: graph- and factor-analytical approaches. scientometrics : – . doi: . /b: scie. . .e . . kruskal jb ( ) multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. psychometrika : – . doi: . /bf . . moody j ( ) the structure of a social science collaboration network: disciplinary cohesion from to . am sociol rev : – . doi: . / . . wickham h ( ) ggplot : elegant graphics for data analysis. berlin, heidelberg: springer. p. pmid: . blei dm, ng ay, jordan mi ( ) latent dirichlet allocation. j mach learn res : – . doi: . /jmlr. . . - . . . 
grün b, hornik k ( ) topicmodels: an r package for fitting topic models. j stat softw . pmid: . feinerer i, hornik k, meyer d ( ) text mining infrastructure in r. j stat softw . how digital are the digital humanities? plos one | doi: . /journal.pone. february , / http://dx.doi.org/ . /biomet/ . . http://dx.doi.org/ . /biomet/ . . http://dx.doi.org/ . /s - - - http://dx.doi.org/ . /b:scie. . .e http://dx.doi.org/ . /b:scie. . .e http://dx.doi.org/ . /bf http://dx.doi.org/ . / http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /jmlr. . . - . http://dx.doi.org/ . /jmlr. . . - . http://www.ncbi.nlm.nih.gov/pubmed/ << /ascii encodepages false /allowtransparency false /autopositionepsfiles true /autorotatepages /none /binding /left /calgrayprofile (dot gain %) /calrgbprofile (srgb iec - . ) /calcmykprofile (u.s. web coated \ swop\ v ) /srgbprofile (srgb iec - . ) /cannotembedfontpolicy /error /compatibilitylevel . /compressobjects /tags /compresspages true /convertimagestoindexed true /passthroughjpegimages true /createjobticket false /defaultrenderingintent /default /detectblends true /detectcurves . 
/colorconversionstrategy /cmyk /dothumbnails false /embedallfonts true /embedopentype false /parseiccprofilesincomments true /embedjoboptions true /dscreportinglevel /emitdscwarnings false /endpage - /imagememory /lockdistillerparams false /maxsubsetpct /optimize true /opm /parsedsccomments true /parsedsccommentsfordocinfo true /preservecopypage true /preservedicmykvalues true /preserveepsinfo true /preserveflatness true /preservehalftoneinfo false /preserveopicomments true /preserveoverprintsettings true /startpage /subsetfonts true /transferfunctioninfo /apply /ucrandbginfo /preserve /useprologue false /colorsettingsfile () /alwaysembed [ true ] /neverembed [ true ] /antialiascolorimages false /cropcolorimages true /colorimageminresolution /colorimageminresolutionpolicy /ok /downsamplecolorimages true /colorimagedownsampletype /bicubic /colorimageresolution /colorimagedepth - /colorimagemindownsampledepth /colorimagedownsamplethreshold . /encodecolorimages true /colorimagefilter /dctencode /autofiltercolorimages true /colorimageautofilterstrategy /jpeg /coloracsimagedict << /qfactor . /hsamples [ ] /vsamples [ ] >> /colorimagedict << /qfactor . /hsamples [ ] /vsamples [ ] >> /jpeg coloracsimagedict << /tilewidth /tileheight /quality >> /jpeg colorimagedict << /tilewidth /tileheight /quality >> /antialiasgrayimages false /cropgrayimages true /grayimageminresolution /grayimageminresolutionpolicy /ok /downsamplegrayimages true /grayimagedownsampletype /bicubic /grayimageresolution /grayimagedepth - /grayimagemindownsampledepth /grayimagedownsamplethreshold . /encodegrayimages true /grayimagefilter /dctencode /autofiltergrayimages true /grayimageautofilterstrategy /jpeg /grayacsimagedict << /qfactor . /hsamples [ ] /vsamples [ ] >> /grayimagedict << /qfactor . 
/hsamples [ ] /vsamples [ ] >> /jpeg grayacsimagedict << /tilewidth /tileheight /quality >> /jpeg grayimagedict << /tilewidth /tileheight /quality >> /antialiasmonoimages false /cropmonoimages true /monoimageminresolution /monoimageminresolutionpolicy /ok /downsamplemonoimages true /monoimagedownsampletype /bicubic /monoimageresolution /monoimagedepth - /monoimagedownsamplethreshold . /encodemonoimages true /monoimagefilter /ccittfaxencode /monoimagedict << /k - >> /allowpsxobjects false /checkcompliance [ /none ] /pdfx acheck false /pdfx check false /pdfxcompliantpdfonly false /pdfxnotrimboxerror true /pdfxtrimboxtomediaboxoffset [ . . . . ] /pdfxsetbleedboxtomediabox true /pdfxbleedboxtotrimboxoffset [ . . . . ] /pdfxoutputintentprofile () /pdfxoutputconditionidentifier () /pdfxoutputcondition () /pdfxregistryname () /pdfxtrapped /false /createjdffile false /description << /ara <feff a e f f f a b f a a a f c a c f a b a a d b a e f f f f e f a d f b e f e f a d f b e> /bgr <feff f e b d e a c a f e a c d c c a c b d e f e d e a e a d f f f d f e e e a e a d e a c d c e e f f f e b e> /chs <feff f f fd e b bbe b a b efa f e e ad d cf d a ef ee f f f c f e ee ca f ad c f b efa > /cht <feff f f e b a d f e efa acb f ef bc ad c cea d a ef ee f f f c f e ee ca f ad c f b f df efa acb ef > /cze <feff f e e ed f e a b e e ed f b d e f f c b e e a c e f ed f b c e ed b e f e e f b d e d f e e e f ed f d f f e e f b a ed e> /dan <feff e c c e e c f f d f b d e c e c d b e e f a b c e f d f b d e b e e e f c c f e f e e> /deu <feff e e e c c e e a d c c e f e f d f b d e e c f e e e f d b a e d f e e c c d f b d e b f e e e d f e f e f f f e e e> /esp <feff c f e f e f d e f f f d f e d f c c c e e f d e f f f e f c f e f e f f e> /eti <feff b e e b c fc b c e d a f b f c b f d f b d e c f f d b e c f f d f b d e f d d f e e f e a d f f e e d a> /fra <feff c a f f e e e f d e f f e c e d f e e e c f d e e e e ea f e f c e f e f e c e e> /gre <feff a c b c b bc bf c bf b 
ae c c b b c c ad c c b c c c b bc af c b b c b b b bd b b b bc b bf c c b ae c b c b ad b b c b c b f c bf c b af bd b b ba b c b be bf c ae bd ba b c ac bb bb b bb b b b b c c bf d b ba c c c c c b ba ad c b c b b c af b c c c b bb ae c c bf b cc c b c b c e a b ad b b c b c b c bf c ad c b c b b b bc b bf c c b ae c b b bc c bf c bf cd bd bd b b bd bf b c c bf cd bd bc b c bf f c c bf f e ba b b bc b c b b b bd ad c c b c b c b ba b cc c b b c e> /heb <feff d e ea de e d d d d d e d ea d dc d db d d dc d e d e de e de db d f d de d ea d de d dd dc d d e e ea e d dd d d e d e d d db d ea d ea e de e de db d e e d e e d e d ea e d dd dc e ea d d d d d de e e d ea f d d f e d d e e d d ea de ea e d de d ea d d ea e e d de d dd dc d f d c e d d e d d de d e d da dc de e ea de e e dc f e de e de db d e e d e e d e d ea e d dd dc e ea d d d d d de e e d ea f d d f e d d e e d d ea de ea e d de d ea d d ea e e> /hrv (za stvaranje adobe pdf dokumenata najpogodnijih za visokokvalitetni ispis prije tiskanja koristite ove postavke. stvoreni pdf dokumenti mogu se otvoriti acrobat i adobe reader . i kasnijim verzijama.) 
/hun <feff b e c f d e e e f d c b e a ed e f d e f a c e b e d c c f f b d e d f b a b b c e c c ed e f b b c b e a ed e c e f a f f b d e d f b a f e a f e c a f c b e a f b b c e f b d e> /ita <feff c a a d f a f e f d e f f e d c c e e f d e f f e f f e f f e f e e> /jpn <feff ad c cea a d ea d ec b fa b f f e f c b f f e e a d b a f c c f d a a eb f f a f e ee d b f c d e e a d b a b f d a f c e cb fbc f c fc > /kor <feffc c c c c acc a d c ec ace d c c c dcd d c c c c d ac c a c d d c f bb c cb c c c d b c b e e c b ac c c c b c bb c cb f bc f f e c c c c d c c c f c c c b b c b e e> /lth <feff e f b f d e f d b f f b d e c b c b b f b f b e e d e d e b f b d e c b f d f f e c e d a f d e> /lvi <feff a d e f a f b a d c c f f f b d e c b b d f b c d e b e a f a f b d e c b f f e f e c b b f a e b d a d e> /nld (gebruik deze instellingen om adobe pdf-documenten te maken die zijn geoptimaliseerd voor prepress-afdrukken van hoge kwaliteit. de gemaakte pdf-documenten kunnen worden geopend met acrobat en adobe reader . en hoger.) 
/nor <feff b e e c c e e c e f f d f b d e f d e f f b b b f b c e d f b d e e b e e e f c c f e c c e e> /pol <feff e f f a e f b d e f a a e a f e f b f f b a a b f b e f b d e d f c e f f d f f e e f a d e> /ptb <feff c a f e e f f d f d e f f d f e d d f c c e f f d e f f f d f f d f f f f e f f f e> /rum <feff c a e f d e f e c f e f d e c f f c f e f e c c f e> /rus <feff f e b c d d b d e a b f e d f e a c d e f c c a c b c d e f e e f b f b e a e a d d e e e f d e e b e e e d d b d e a c d b c e d e e a b c f e c e c e f f e e b f e d e> /sky <feff f e e f e e e e f b d e f f c b f e e a c f e b c e fa c d e f e e f b d e d f e e e f f f d f f f e e f ed e> /slv <feff e f a a e a f b d e f f c b f e a d e a a b b f f e f b e a f e b e a e f b d e a d f f d f a f e f e e e f a d e> /suo <feff b e e e e e b c b e c f c e e e e e e b e c d c f f e f f d f b d e a e c f d f b d e f e f c c a f e a c c a d d c c e> /sve <feff e e e e e e c c e e e f d c c b f d f b d e f d e c e d c f d b d f b c e b d f b d e b e f e f f f e f e e> /tur <feff fc b b b c c f e a d b e e c b f c c f c f d b e e c b c c e e e f c f c e c c f f e f e e b fc fc d c c e c c e> /ukr <feff a e e f c b f e d d f e a c d f c f a d a f e f c b f e a e f a d e e f a e e e a e e d e a c d c e d a f f e e f d e e> /enu (use these settings to create adobe pdf documents best suited for high-quality prepress printing. created pdf documents can be opened with acrobat and adobe reader . and later.) >> /namespace [ (adobe) (common) ( . ) ] /othernamespaces [ << /asreaderspreads false /cropimagestoframes true /errorcontrol /warnandcontinue /flattenerignorespreadoverrides false /includeguidesgrids false /includenonprinting false /includeslug false /namespace [ (adobe) (indesign) ( . 
d-lib magazine
july/august
volume , number /

using data curation profiles to design the datastar dataset registry

sarah j. wright, wendy a. kozlowski, dianne dietrich, huda j. khan, and gail s. steinhart
cornell university

leslie mcintosh
washington university school of medicine

point of contact for this article: sarah j. wright, sjw @cornell.edu

doi: . /july -wright

abstract

the development of research data services in academic libraries is a topic of concern to many. cornell university library's efforts in this area include the datastar research data registry project. in order to ensure that datastar development decisions were driven by real user needs, we interviewed researchers and created data curation profiles (dcps).
researchers supported providing public descriptions of their datasets; attitudes toward dataset citation, provenance, versioning, and domain-specific standards for metadata also helped to guide development. these findings, as well as considerations for the use of this particular method for developing research data services in libraries, are discussed in detail.

introduction and background

the opportunities afforded for new discoveries by the widespread availability of digital data, the evolving data management and sharing policies of research funders, including the us national science foundation and the national institutes of health, and the roles of academic libraries in facilitating data management and sharing have attracted a great deal of attention in the past several years. libraries and librarians have embraced a variety of opportunities in this arena, from conducting research to understand data management and sharing practices of researchers, to education and outreach on funder requirements. some organizations are developing web-based tools to support data management planning, such as the data management plan online tool (dmp online) with a u.s. version supported by the california digital library and partners (dmptool); others are developing research data infrastructure to support data curation and escience, such as johns hopkins university's data conservancy, the purdue university research repository, and rutgers' ruresearch data portal.

one of cornell university library's (cul) efforts in this area of digital data discovery is the datastar project. in its first incarnation, datastar was envisioned as a "data staging repository," where researchers could upload data, create minimal metadata, and share data with selected colleagues.
users could optionally create more detailed metadata according to supported standards, transfer the dataset to supported external repositories when they were ready to publish, and obtain assistance from librarians with any of these processes. the system was based on the vitro software, a semantic web application that also underlies the vivo application, developed at cul's mann library. a prototype system with capabilities for specialized metadata in ecology and linguistics was successfully developed, and its use was piloted as a mechanism for deposit of data sets to the data conservancy and cornell's ecommons repository. we found, however, that converting these specialized metadata schemas and standards into semantic ontologies, and developing reasonably user-friendly editing interfaces based on these ontologies to support the entry of detailed metadata, were far too labor-intensive to be realistically implemented or sustained, and that phase of the project has been brought to a close.

in reconceiving datastar, we chose to focus on the idea of developing a research data registry to support basic discovery of data sets. prior research showed that cornell researchers would be interested in a basic data registry to demonstrate compliance with funders' requirements, and early work at the university of melbourne, which extends vivo to accommodate basic data set description, also stimulated our thinking in this area. the current project will transform datastar, extending the vivo application and ontology to support discipline-agnostic descriptive metadata, enabling discovery of research data and linking this research data with the rich researcher profiles in vivo. adapting vivo for this purpose confers the advantage of exposing metadata as linked open data, facilitating machine re-use, harvesting, and indexing. as part of this project, we will develop and document a deployable and open-source version of the software.
cornell, in partnership with washington university in st. louis, is continuing development of the datastar platform with this new focus.

with this core purpose of data discovery in mind, we wanted to ensure development decisions were driven by real user needs. to that end, we used the data curation profile toolkit to conduct interviews and create a set of data curation profiles (dcps) with participants selected at cornell university (cu, ithaca, ny) and washington university in st. louis (wustl, st. louis, mo). others have used the data curation profile toolkit to understand researchers' data management and sharing needs and preferences more broadly, but to the best of our knowledge, ours is the first application of the toolkit to inform software development for a particular project. we report our findings regarding researcher data discovery needs vis-à-vis the evolving datastar platform and our experiences with this particular method for informing software development.

methods

we conducted structured interviews with researchers at cu and wustl using the data curation profile (dcp) interview instrument to inform the ongoing development of a data registry. we conducted our interviews according to the procedures outlined in the dcp toolkit v . , with minor modifications to the interview instrument. modifications were made after reviewing the interview worksheet to determine whether it included questions that would help us prioritize development decisions. we identified those that did, and inserted four additional questions. no questions were eliminated from the instrument, even though some were not directly applicable to our immediate investigation. the questions added and the corresponding interview modules were:

please prioritize your need for... the ability to create a basic, public description of (and provide a link to) my data. (module , added to question )
please prioritize your need for... the ability to easily transfer this data to a permanent data archive. (module , added to question )
please prioritize your need for... the ability to track data citations (module , added to question )
please prioritize your need for... the ability to track and show user comments on this data (module , added to question )

creation of data curation profiles

interviews were conducted by cul and wustl librarians as described in the dcp user guide; interview subjects were identified by the librarians after institutional review board approval was obtained at each institution. researchers from a broad range of disciplines and data interests were invited to participate, and eight completed interviews (table ). all interviews were recorded (audio only), and interviews were transcribed when the interviewer indicated that information was not robustly captured by the interview worksheet or notes. each interviewer constructed a dcp according to the template in the toolkit. participants were given an opportunity to suggest corrections to the completed profiles, names and departments were removed from the profiles, and the resulting versions were made available both on the data curation profile website and in cornell's institutional repository.

| university | research topic | dcp filename |
| --- | --- | --- |
| cornell university | biophysics | cornelldcp_biophysics.pdf |
| cornell university | sociology / applied demographics | cornelldcp_demographics.pdf |
| cornell university | ecology & evolutionary biology / plant biology | cornelldcp_herbivory.pdf |
| cornell university | genetics / plant breeding | cornelldcp_plantbreeding.pdf |
| cornell university | linguistics | cornelldcp_linguistics.pdf |
| washington university in st. louis | archaeology / gis | washudcp_gis.archaeolog.pdf |
| washington university in st. louis | physical / theoretical chemistry | washudcp_theoreticalchemistry.pdf |
| washington university in st. louis | public health communication | washudcp_publichealth.pdf |

table : participant interview research topics and completed data curation profile file names in cornell's ecommons repository.
analysis and interpretation of completed data curation profiles

much of the information contained in the dcps could help shape both the current software development project and other future data curation services. therefore, it was important to evaluate and prioritize the information gleaned from the dcps in order to guide datastar development. the analysis team (librarians and additional project personnel) performed an initial analysis of the profiles upon their completion.

first, the team read selected sections from the completed profiles with each section assigned to two readers. each reader individually summarized any observed trends in the information collected, identified the most useful information that might guide development and identified the most interesting results without specific regard for datastar. each pair of readers then jointly reached agreement on these topics and shared the findings with the analysis and development team, which produced a summary of findings for internal use.

second, we evaluated the quantitative responses from the interview worksheets. all questions with discrete responses (e.g. prioritization of service needs; yes/no) were evaluated using an assigned numerical value system. for the prioritization questions, "not a priority" was assigned a value of , "low priority" a , "medium priority" a , and "high priority" a . for the yes/no style questions, a "no" was assigned a value of and a "yes" a . responses of "i don't know or n/a (dkna)" and instances where no response was recorded were also tabulated; questions where such responses were greater than % (i.e. greater than two of eight profiles) of the total were discussed. these numbers were used to calculate the average prioritization value (apv), which represents the overall desire for a given service and helped guide further discussion during analysis.
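the scoring step described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the numeric values assigned to each response are missing from this copy of the text, so the 0 to 3 scale below is hypothetical, as are the names `PRIORITY_SCORES`, `DKNA` and `summarize`. the sketch also assumes that dkna and missing responses are counted but left out of the average, which the text does not state explicitly.

```python
from statistics import mean

# Hypothetical scale: the actual values are not legible in the source text.
PRIORITY_SCORES = {
    "not a priority": 0,
    "low priority": 1,
    "medium priority": 2,
    "high priority": 3,
}
DKNA = "dkna"  # "i don't know or n/a"; None stands for no recorded response


def summarize(responses, max_unscored=2):
    """Return (apv, unscored_count, needs_discussion) for one question.

    apv is the mean score over the responses that map to a priority level.
    Assumption: DKNA and missing responses are tallied but not averaged.
    needs_discussion is True when more than `max_unscored` responses
    (two of eight profiles in the study) could not be scored.
    """
    scored = [PRIORITY_SCORES[r] for r in responses if r in PRIORITY_SCORES]
    unscored = len(responses) - len(scored)
    apv = mean(scored) if scored else None
    return apv, unscored, unscored > max_unscored


# Made-up responses from eight interviewees, for illustration only.
example = ["high priority"] * 4 + ["medium priority"] * 2 + ["low priority", None]
apv, unscored, needs_discussion = summarize(example)
print(apv, unscored, needs_discussion)
```

keeping the count of unscored responses next to the average makes it easy to see when an apv rests on only a few answers.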
this value system was not meant to show statistical differences; it was used to evaluate the relative importance of these services to the researchers and to reflect an overall prioritization across the entire pool of respondents. after determining the average prioritization values, we reviewed the verbal interview answers to confirm the replies.

relevant findings and discussion

key findings regarding data sharing and the relationship of scientific research data to institutional repositories are covered in cragin et al. our findings with respect to data sharing reinforce many of the general conclusions in that paper and will not be detailed here. instead, our focus is on those issues that directly influenced the current iteration of datastar or will be considered for future development of datastar.

findings that directly influenced current datastar development

after evaluation and prioritization of the findings from the interviews, a set of responses that were particularly relevant to the current iteration of datastar emerged. these findings are summarized in figure .

figure : summary of prioritization responses for features included in the current version of the datastar registry. asterisk (*) denotes a question added to the original dcp tool for this project. total number of interviewees in all cases was eight.

discovery and a basic public description

one of the primary goals of datastar as a data registry is to have it serve as a hub for discovery of research data. some of the more traditional ways of locating data of interest to researchers have included direct contact with peers or collaborators or finding related work via publications. however, the large and quickly growing volume of information about scientific efforts and accomplishments now available on the internet has opened a new route of discovery: internet search engines.
one of the simplest elements that can be used to facilitate internet discovery is a basic description of the item (in this case the dataset) of interest. to learn how to make datastar effectively support data set discovery, we asked researchers about the importance of internet discoverability, types of metadata elements to enhance discovery, and more generally, how they envisioned prospective users finding their data sets.

first, researchers were asked to prioritize "the ability for people to easily discover this dataset using internet search engines (e.g. google)." the response was strongly in favor of internet discoverability: six respondents rated it as high or medium, one ranked it of low importance, and one researcher did not reply. other routes of discovery that were mentioned were individual researcher web pages, laboratory web pages, department web pages, google scholar, web of science, and domain-related web pages (e.g. "language page" for the linguist, web interfaces for gis data for the archaeologist). for the researcher who ranked internet discoverability as low, it is important to note that the researcher felt "confident that the existing channels of data distribution (both online and print) were sufficient to ensure the farmers, breeders and researchers in counterpart programs in other agricultural extension services were discovering the data"; in this case, the need for broader discovery was perhaps not prioritized because the current system was deemed adequate.

second, researchers were asked the open-ended question, "how do you imagine that people would find your data set?" the demographer was already aware that the project website was found online by searching for the project name or for terms related to the project topic such as "county population" or "projection".
the ecologist studying herbivory mentioned that a basic public description could aid in discoverability, drawing a parallel to literature database searches that are done on geographic location or period of time in conjunction with a basic project description (i.e. topic). this researcher specifically mentioned the idea of using a "sort" or "find more data like this" tool or using descriptive terms such as keywords, species and geographical inputs. while a primary set of attributes useful for description and discovery had already been identified (e.g. author, title, subject), the interviews confirmed those attributes and identified the additional facets geography and time, which were incorporated into the current version of datastar. future iterations of datastar may also include search terms like species or other facets that would enable researchers to "find more data like this."

finally, researchers were asked to prioritize "the ability to create a basic, public description of (and provide a link to) my data." the positive response (six researchers ranked it high or medium priority, one answered it was not important, and one researcher did not answer the question) indicated that a publicly accessible data registry would likely be of interest to researchers. when asked "who would you imagine would be interested in this data?" the majority of respondents (five) felt their data could potentially be used by groups requiring general public access such as "students," "farmers," "policy overseers," "language learners," and "educators." more than half of the researchers interviewed believed their work to be applicable to such broad-range audiences; this reinforces the need for a means of easy, public discovery of current work and data in a wide variety of fields. the responses to all these questions are consistent with the current development trajectory for datastar.
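to make the discussion of descriptive attributes and facets more concrete, the following python sketch shows one way a basic public description and a simple "find more data like this" lookup might be modeled. it is not the datastar or vivo data model: the class `DatasetRecord`, its field names, and the functions `shared_facets` and `more_like_this` are hypothetical, and similarity is assumed to mean nothing more than the number of facet values (subject, geography, time) that two records share.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Basic public description of a dataset (hypothetical structure)."""
    title: str
    authors: list[str]
    subjects: set[str] = field(default_factory=set)
    places: set[str] = field(default_factory=set)   # geography facet
    years: set[int] = field(default_factory=set)    # time facet
    url: str = ""                                   # link to the data itself


def shared_facets(a, b):
    """Count the facet values that two records have in common."""
    return (len(a.subjects & b.subjects)
            + len(a.places & b.places)
            + len(a.years & b.years))


def more_like_this(target, registry, limit=5):
    """Return up to `limit` other records, most shared facet values first."""
    scored = [(shared_facets(target, r), r) for r in registry if r is not target]
    scored = [(score, r) for score, r in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:limit]]


# Made-up records, for illustration only.
a = DatasetRecord("plot survey", ["a"], {"herbivory"}, {"new york"}, {2010})
b = DatasetRecord("leaf damage", ["b"], {"herbivory"}, {"new york"}, {2011})
c = DatasetRecord("word list", ["c"], {"linguistics"}, set(), {2010})
print([r.title for r in more_like_this(a, [a, b, c])])
```

a new facet such as species would be one more set-valued field plus one more term in `shared_facets`.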
data citation and citation tracking

scholarly manuscript citations have quickly become "the currency of science", with research resources dedicated to tracking, monitoring and linking of publications (e.g. scopus, web of science, google scholar, journal impact factors). more recently, citation of datasets has gained attention of this kind as well. web of knowledge has added a new database, the data citation index, which is designed to help users discover datasets and provide proper attribution for those datasets. datacite, established in and with member institutions around the world, is another effort to support data archiving and citation.

proper, consistent, relevant citation of a dataset serves multiple purposes. from the researcher perspective, dataset citation can allow proper credit to be given to both the original researcher and, when applicable, to the dataset source. a critical step in the process of scientific collaboration and data sharing, dataset citation forms a connection between the data and related publications, as well as a connection between related datasets. from a data management perspective, citation lays the foundation and framework for capturing relationships between datasets, publications and institutions, enabling further analysis of the impact and usage of the data.

when asked to prioritize "the ability to enable version control for this dataset," five researchers considered this to be a high priority and one, who ranked version control as not important, felt the "data do not have different versions". the ecologist felt strongly that "the link between the data and the related publication(s) is very important" and went so far as to say that he "would like any repository to provide the link and to make that linking easy". similarly, the biophysicist indicated that "the ability to cite the publicly available protein crystal structure is a high priority, as it will be referenced in the published paper and others should be able to find it".
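as an illustration of how a registry entry could carry both a citation and the dataset-to-publication links the researchers asked for, here is a minimal python sketch. it is hypothetical throughout: `CitableDataset`, `format_citation` and `link_publication` are invented names, the citation pattern (creators, year, title, publisher, identifier) is only one plausible layout and is not taken from datastar, and the example values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class CitableDataset:
    """Minimal metadata needed to cite a dataset (hypothetical structure)."""
    creators: list[str]
    year: int
    title: str
    publisher: str
    identifier: str  # e.g. a doi or another stable url
    related_publications: list[str] = field(default_factory=list)


def format_citation(ds):
    """Build a citation string: creators (year): title. publisher. identifier"""
    names = "; ".join(ds.creators)
    return f"{names} ({ds.year}): {ds.title}. {ds.publisher}. {ds.identifier}"


def link_publication(ds, publication_id):
    """Record a link from the dataset to a related publication, once."""
    if publication_id not in ds.related_publications:
        ds.related_publications.append(publication_id)


# Made-up values, for illustration only.
ds = CitableDataset(
    creators=["doe, j.", "roe, r."],
    year=2013,
    title="example crystal structure",
    publisher="example repository",
    identifier="https://example.org/ds/1",
)
link_publication(ds, "https://example.org/paper/42")
print(format_citation(ds))
```

storing the related publications on the record itself keeps the link the ecologist asked for in one place, whatever citation style is eventually chosen.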
while the ability to easily create a citation for a dataset was prioritized highly, on average the need to track citations was ranked as not quite as high an importance (apv for tracking citations was . compared to . for ability to cite the data). five of the eight researchers interviewed ranked tracking citations of medium importance, two ranked it of high importance and one researcher did not know how important it was to him personally, "since he believed people would do this regardless". for those who did rank it as a high priority, one researcher did specifically mention that "he considers this the real measure of the value of his data". prior to the interviews, providing a means for citing described datasets was being considered for inclusion in datastar. the researchers made clear its importance and a recommended citation format for the dataset(s) will be a future feature in the application. versioning a natural part of the lifecycle of many datasets is the potential for re-processing, re-calibration or updating, either of a part or of the whole file. for example, remote sensing data from satellites is often released near real-time; it is not uncommon for instrument calibration changes to be applied to those data after the initial release, requiring the dataset, or a portion thereof, to be re-processed to reflect the correct instrument parameters. in other datasets, errors can be found and corrected, or versions can be created for restricted use access that are different than those destined for public access; data can also be appended or added, keeping a time-series up to date or supplementing information already within the dataset. tracking of such changes can be important when trying to re-create an analysis or interpret an outcome, and ensures long-term usability of a dataset. 
when asked to prioritize "the ability to enable version control for this dataset," five researchers considered this to be a high priority, and the one who ranked version control as not important felt the "data do not have different versions". interestingly, the same researcher mentioned "new data is being added to the data set but no data are being changed"; it is possible that the researcher had a different concept of versioning that did not include time series data. both the plant geneticist and the biophysicist mentioned their labs presently employ protocols that track data versions; for those researchers and others like them who already maintain records of changes and modifications, transfer of this information to a data registry should be a logical next step. interestingly, while the biophysics lab did track data processing versions, the project principal investigator felt this was more critical to internal use of the data than to the shared version. nonetheless, datastar development plans include offering the ability to record information about dataset version or stage. provenance key to the success of sharing and re-using data is the user's confidence in the quality of the datasets; with the advent of publicly available and openly shared data, the risk of not knowing the source of and possible changes to a dataset before use is greatly increased. the process of documenting the origins of a dataset and tracking the movements, transformations and processes applied to it is called data provenance. to determine whether data provenance is important to researchers, we asked them to prioritize their need for "documentation of any and all changes that were made to the dataset over time." this documentation of changes or provenance of the dataset was ranked as a high priority by three researchers interviewed, with one specifically mentioning the desire to know "whether any changes had been made, and by whom".
in the case of the ecologist, data provenance was only a medium priority, but this was "because he didn't feel like any changes should be made" to the dataset. the biophysicist stated data provenance was not applicable to his dataset, because "all processing and re-processing of the data starts from the initial files... and progresses linearly through the data stages". in this case, the researcher did feel that preservation of such initial files was "critical for internal use" but not necessarily of importance as "part of the scientific record". given the combination of the range of responses to the idea of tracking provenance with the technical demand of a full implementation of workflow provenance, datastar will have the ability to record whether or not a dataset is the original version or the derived version and to document the originating dataset for a derived version. datastar will not automatically capture this information; responsibility for recording provenance will rest in the hands of those who submit the data. formal metadata standards the application of formal metadata standards to datasets establishes the framework on which both the technical and practical implementation of organization and discovery can be built. numerous metadata standards applicable to scientific data exist (e.g. darwin core, federal geographic data committee content standard for digital geospatial metadata (fgdc/csdgm), ecological markup language, discovery interchange format, iso , etc.), but when specifically asked to prioritize their need to "apply standardized metadata from your field or discipline to the dataset," only one researcher felt this was a high priority, three said it was a low priority, and the remaining half of the researchers either did not know or did not respond. 
looking at the pool of dcp's that have been completed at other institutions, similar responses are common, with seven of thirteen profiles not mentioning the prioritization of standardized metadata application. this notable lack of response or interest in formalized metadata standards could be due to a variety of reasons. it is possible that no domain-specific metadata standards exist in the participants' respective fields or that they fail to use standards that do exist. failure to use existing standards could simply be due to a lack of awareness, or because the importance and function of metadata may not be valued by busy researchers. there are barriers to the use of metadata standards by non-experts — some standards exist but lack openly available and usable tools (e.g. data documentation initiative (ddi)), or are complex and difficult to use (e.g. ddi, fgdc/csdgm). another likely reason for the researcher responses could be one of semantics. the word "metadata" may not be understood across subject domains; sometimes "documentation" or "ancillary data" or another domain-specific term is more readily recognized by researchers when talking about scientific metadata. indeed, while some locally developed standards, such as meta-tables, geo-referencing and data diaries are regularly employed, no researcher in this study mentioned any formal metadata standards that were currently being used on their datasets. the linguist did identify this lack of metadata standardization as a problem, adding that it "may be partially due to the very narrow and specific nature of each researcher's project". similarly, another researcher said that no formal standards were applied to describe or organize their data because of "the highly specialized and unique nature of the lab's research". although responses were few and mixed, we feel that there is substantial doubt surrounding the reasons for this lack of interest in standardized metadata. 
therefore, datastar will provide the option to provide information about formal metadata standards used to describe the datasets. if the researcher chooses to use a metadata standard, that information can be leveraged and becomes an attribute of the dataset itself. we believe that this additional, optional descriptive facet will enhance data discovery, one of the basic functionalities of the datastar registry.   findings that may influence future datastar development some findings were beyond development capabilities for the current iteration of datastar. they were still useful because they helped to provide additional context and details about the ways researchers prefer to interact with their data. many of these findings were related to tools used to generate and use data, as well as linking and interoperability of data. although not all of these findings will result in new functions in later versions of datastar, the answers to these questions will almost certainly direct future datastar development. these findings are summarized in figure . figure : summary of prioritization responses for features not necessarily implemented in the datastar registry. asterisk (*) denotes a question added to the original dcp tool for this project. total number of interviewees in all cases was eight. connecting or merging data the overarching goal of datastar is to improve discoverability, and in turn, re-use of data sets. a primary way that data sets are re-used is by connecting and merging them with other related data sets, whether that relationship is topical, geographic, temporal or through some other affiliation. among the different services associated with tools, linking and interoperability of datasets, the "ability to connect or merge your data with other datasets" emerged as the highest priority. most of the researchers (six) considered connecting or merging data a high to medium priority. 
two said they didn't know or it wasn't applicable; none of the researchers interviewed considered the ability to connect or merge data either low priority or not a priority. the motivations researchers gave for merging datasets varied, although meta-analyses were cited more than once. the ecologist considered merging datasets a high priority because he already does this, but currently extracts the raw data needed to compile meta-analyses from the papers themselves. this method is less than ideal, so he is enthusiastic about a tool that would allow him to ( ) identify related datasets via a "find more data like this" function and ( ) access and combine the related data sets into meta-analyses. the sociologist's research involves acquiring demographic data from a variety of sources, processing, analyzing and aggregating the data in their own database, so merging datasets is a very high priority for this research group as well. in contrast, one of the few researchers interviewed who did not identify merging or connecting datasets as a priority, explained that this was due largely to the specificity of the research project and a lack of standardization in the field. it is worth speculating whether datastar can provide services in this arena beyond the basic service of fostering discovery. as a registry, datastar is not focused on storing the data itself and will not support data integration, analysis, or comparison tools in this phase of development. using datastar, one will be able to find information about the dataset, including where it is located, but will not be able to manipulate the datasets themselves. by improving discoverability, datastar will have the potential to aid with activities such as meta-analyses by improving researchers' abilities to find related datasets. data visualization and analysis there are many commercial and open source tools used to analyze and visualize data (e.g. open refine, r, arcgis, and many more). 
half the participants reported using excel, both to analyze data and to create charts and graphs. a variety of other tools were mentioned by researchers, including google apis, sas, matlab, instrument-specific software and proprietary code generated by the lab. it was not surprising, then, that "the ability to connect the data set to visualization or analytical tools" emerged as a high priority when researchers were asked to rank their need for it. four researchers considered this a high priority and three a medium priority; only one, the public health communications researcher, considered it a low priority. interestingly, the same researcher uses a number of sophisticated software programs for analyses and data visualization. the demographer, who considered data visualization and analytical tools a high priority, reported "if the data were hosted in an external repository, it would be high priority to be able to continue to use visualization tools such as google maps and google charts". because the need for analysis and visualization of data was highly ranked by the majority of the researchers interviewed, we are exploring the types of support datastar, as a dataset registry, might be able to provide. one consideration may be to employ visualizations for the discovery of datasets such as interactive maps or timelines showing the relationships between registered datasets. more exploration and thought will be required before we attempt to address these needs. usage statistics website usage statistics are commonplace. in the library environment, usage statistics are frequently used for collection development decisions and recent investigations have focused on exploring whether a new measurement of journal impact might be based on electronic usage statistics instead of the typical measure of citation frequency.
researchers, however, largely rely on citations as their measure of success, so it was of interest to learn whether they prioritized the collection of usage statistics for their data. the "ability to see usage statistics (i.e. how many people have accessed the data)" received the full range of prioritization, from not a priority to high priority. half of the researchers (four of eight) considered collecting statistics on use a medium priority, two considered it not a priority, and one each considered it low and high priority. in contrast, the majority of the researchers interviewed considered the ability to track citations either equal to or higher priority than collecting usage statistics (apv = . vs . , respectively). one researcher would like to collect both metrics in order to measure "conversion to scholarship," or how many uses of the data resulted in publications or other tangible scholarly products. the demographer already tracked usage of the data, using google analytics on the project website; this researcher's highest priority is tracking "the internet domains of users and the referring keywords used to find the website". these comments are indicative of a general attitude that citations are the basis of the scholarly record, while usage statistics are seen as a less important, administrative function. it was interesting to note that every researcher answered the question, indicating that they were familiar with the concept of collecting usage statistics. one researcher even said it and citation tracking were of low importance, because "people would do this regardless". given this expectation, it is probably important that datastar give researchers some measure of data use, even if it is as basic as a report of the number of page views. this is a feature that hadn't been considered before performing the interviews, but that will be considered for future iterations of datastar. 
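the comparison of average prioritization values (apv) above can be reproduced in outline from the response counts reported in the text. the sketch below is illustrative only: the numeric scale is an assumption, since the scoring behind the reported apv figures is not stated here, so the averages it produces should not be read as the article's own values.

```python
from collections import Counter

# Hypothetical numeric scale; the scoring behind the reported APV is not given.
ASSUMED_SCALE = {"not a priority": 0, "low": 1, "medium": 2, "high": 3}


def average_priority(responses):
    """Mean score over ranked responses; unranked answers are skipped."""
    scored = [ASSUMED_SCALE[r] for r in responses if r in ASSUMED_SCALE]
    if not scored:
        return None
    return sum(scored) / len(scored)


# Response counts as reported in the text for the eight interviewees.
usage_statistics = ["medium"] * 4 + ["not a priority"] * 2 + ["low", "high"]
tracking_citations = ["medium"] * 5 + ["high"] * 2 + ["did not know"]

print(Counter(usage_statistics))
print(average_priority(usage_statistics))  # 1.5 under the assumed scale
print(average_priority(tracking_citations))
```

under any scale that orders the four rankings this way, tracking citations comes out ahead of usage statistics, which matches the direction of the comparison reported above. skipping "did not know" answers rather than scoring them as zero is a design choice of this sketch.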
user comments user comments are not unknown in the scholarly publishing and scientific communities, but they are still a controversial idea among many researchers. it is important to consider the purpose since researchers may have very different opinions of comments in the context of a social network vs. the context of public peer review. a recent study of public peer review found that short comments submitted by the "interested members of the scientific community" were far less useful for selection and improvement than were the comments of invited reviewers. furthermore, when nature tried open peer review, they found that it was far from popular; both authors and reviewers disliked commenting on the scholarly record in this manner. less formal examples of user comments can be found employed in scientific social networking sites like nature network, biomed experts and mendeley. the purpose of the interactions in these and other social sites is generally to collaborate, post articles and trade ideas. since datastar is a dataset registry, not quite fitting in the category of peer-reviewed content or social networking site, it is an open question whether user comments would be of interest to researchers. the interview instrument contained two questions related to user comments, one about "the ability of others to comment on or annotate the dataset" and one about "the ability to track and show user comments on [the] data." only three researchers gave a high or medium priority to the possibility of users commenting on or annotating datasets, while four stated tracking or showing comments was at least a medium priority. only one researcher was consistently enthusiastic about this idea and talked about the potential of user comments to provide "new life to the data, and could serve as a global virtual lab meeting". 
two others were moderately enthusiastic, categorizing commenting or annotating a medium priority and tracking comments a high and medium priority, respectively. most of the other researchers were either hesitant or completely dismissive of any possible utility of user comments; five responded that enabling user comments was either low priority or not a priority at all. one expressed concerns about enabling unmediated public comments, noting that misinformed comments can permanently tarnish a publication or dataset. therefore, if user comments are a feature of a future version of datastar, adding the ability to moderate comments should be considered. because the researchers interviewed had very mixed opinions of these services, user comments are a possible future development, but are not a high priority for the first iteration of the datastar data registry.   methodological considerations the data curation profile toolkit has been used to discover and explore researchers' data management and sharing needs and attitudes, but to the best of our knowledge this is the first time the toolkit has been used to inform the development of a specific tool. we found it a useful method for informing software development because it helped to better define the research context in which the tool will potentially be used. we do have some concerns about our use of this tool. one issue already discussed to some extent in the section on formal metadata standards is that of the terminology used in the toolkit. half of the researchers either did not answer or responded that they did not know whether they considered the ability to apply standardized metadata from their discipline a priority. this may indicate that there is not an accepted (or at least highly used) metadata standard for their area of research, but descriptive metadata also has many different names and it might have been helpful to familiarize ourselves with the research area's preferred term ahead of time (e.g.
"code book", "ancillary data" or "documentation") to minimize the use of unfamiliar terminology. indeed, several researchers discussed locally developed standards, including geo-referencing and data diaries, but only one prioritized "formal metadata standards" as it was presented in the interview questionnaire. another issue was how much of an effect the interviewee's focus on one particular dataset had on the nature of the answers given. the interviews may reflect the researcher's needs and attitudes concerning one particular dataset, not the body of their research or views on data sharing in general. however, we feel that the potential pitfalls of asking about one data set are outweighed by the benefit of getting concrete answers when the researchers have a real example in mind. also worth considering is whether responses would have differed if we had described the goal of our project, i.e. developing a dataset registry. we did not describe the current project, in order to avoid introducing bias; however, a fuller description of the tool in development might have elicited different responses. despite any lingering reservations, the process of interviewing researchers and creating profiles provided a useful view into a cross-section of research disciplines and helped us to gain insight into the environment in which the tool in development will be used. as libraries become increasingly involved in developing data management services and tools, methods such as this one are important to ensure that services are based on real user needs.   conclusions at the outset of this project, it was our goal to develop a research data registry to support basic discovery of datasets, and we had already identified interest from cornell researchers in a data registry to demonstrate compliance with funders' requirements. in order to ensure development decisions were driven by user needs, we performed structured interviews using the data curation profile toolkit.
the eight researchers interviewed represented a wide range of research areas and had a similarly wide range of priorities with respect to managing their data. although no priorities were shared by all researchers, the process did help to identify some services that are likely to be valued and utilized across research areas. those cross-cutting high priority services included creating a basic public description of datasets, enabling citation of datasets in publications, support for version control and requiring the citation of datasets by users. several of these had already been identified as priorities in the current phase of datastar. the interviews provided affirmation that these services were indeed prioritized by researchers in different fields. in addition to confirming ideas regarding researcher priorities, the interviews also helped identify additional services important to researchers, including data citation and discovery by web search engines, both of which can be incorporated in the current iteration of datastar. while these researchers also prioritized the ability to connect and merge data sets and to perform visualization and analysis of data sets, a data registry cannot support these functions. performing user studies at the outset of our project to redirect datastar development allowed us to discover how researcher needs converged (and diverged) across disciplines. this in turn informed our decisions regarding how datastar can best balance the needs and interests of those sharing their data, support requirements of funding agencies, and incorporate emerging organizational principles of linked open data, facilitating machine-reuse, harvesting, and indexing. as one might have expected, we were also left with more questions and more ideas for future projects and tool development to foster data stewardship.   
acknowledgements we thank kathy chiang, keith jenkins, kornelia tancheva, and cynthia hudson for completing data curation profiles for this project, and are grateful for additional assistance from jennifer moore, robert mcfarland and sylvia toombs.   references national science foundation, "dissemination and sharing of research results"; national institutes of health, "nih data sharing policy". catherine soehner, catherine steeves, and jennifer ward, e-science and data support services: a study of arl member institutions, washington, dc: association of research libraries, . martin donnelly, sarah jones, and john w. pattenden-fail, "dmp online: the digital curation centre's web-based tool for creating, maintaining and exporting data management plans", international journal of digital curation , no. ( ); california digital library, "dmptool: guidance and resources for your data management plan"; johns hopkins university, "data conservancy"; purdue university libraries, "purr: purdue university research repository"; rutgers university libraries, "ruresearch data portal". dianne dietrich, "metadata management in a data staging repository", journal of library metadata , no. ( ): - http://doi.org/ . / . . ; huda khan, brian caruso, jon corson-rikert, dianne dietrich, brian lowe, and gail steinhart, "datastar: using the semantic web approach for data curation", the international journal of digital curation , no. ( ): - http://doi.org/ . /ijdc.v i . ; gail steinhart, "datastar: a data staging repository to support the sharing and publication of research data", paper presented at the st annual international association of technical university libraries conference, west lafayette, in, june - , . cornell university library, "vitro: integrated ontology editor and semantic web application"; medha devare, "vivo: connecting people, creating a virtual life sciences community". d-lib magazine , no. / ( ). http://doi.org/ . 
/july -devare khan, caruso, corson-rikert, dietrich, lowe, and steinhart, "datastar: using the semantic web approach for data curation", - . gail steinhart, eric chen, florio arguillas, dianne dietrich, and stefan kramer, "prepared to plan? a snapshot of researcher readiness to address data management planning requirements", journal of escience librarianship , no. ( ) http://doi.org/ . /jeslib. . ; university of melbourne, "vivo & ands: building a turnkey research data metadata store on vivo". michael witt, jacob carlson, d. scott brandt, and melissa h. cragin, "constructing data curation profiles", international journal of digital curation , no. ( ): - . http://doi.org/ . /ijdc.v i . purdue university libraries, "data curation profiles toolkit"; melissa cragin, carole l. palmer, jacob r. carlson, and michael witt, "data sharing, small science and institutional repositories", philosophical transactions of the royal society a: mathematical physical and engineering sciences , no. ( ): - ; witt, carlson, brandt, and cragin, "constructing data curation profiles", - . ibid. purdue university libraries, "data curation profiles toolkit". kathy chiang, dianne dietrich, keith jenkins, kornelia tancheva, sarah wright, leslie mcintosh, cynthia hudson, and j. moore, "data curation profiles completed for the datastar project", ecommons@cornell university library, http://hdl.handle.net/ / . cragin, palmer, carlson, and witt, "data sharing, small science and institutional repositories", - . chiang, dietrich, jenkins, tancheva, wright, mcintosh, hudson, and moore, "data curation profiles completed for the datastar project", plant breeding. ibid. demographics. ibid. herbivory. eugene garfield, "the use of journal impact factors and citation analysis for evaluation of science", presented at cell separation, hematology and journal citation analysis, rikshospitalet, oslo, . thomson reuters, "thomson reuters unveils data citation index for discovering global data sets".
datacite, "datacite: helping you to find, access, and reuse data". chiang, dietrich, jenkins, tancheva, wright, mcintosh, hudson, and moore, "data curation profiles completed for the datastar project". herbivory. ibid. biophysics. ibid. herbivory. ibid. linguistics. peter buneman, s. khanna, and w. c. tan, "why and where: a characterization of data provenance". paper presented at the th international conference on database theory (icdt ), london, england, ; peter buneman, and susan b. davidson, "data provenance - the foundation of data quality". chiang, dietrich, jenkins, tancheva, wright, mcintosh, hudson, and moore, "data curation profiles completed for the datastar project". demographics. ibid. herbivory. ibid. biophysics. shirley cohen, sarah cohen-boulakia, and susan b. davidson, "towards a model of provenance and user views in scientific workflows", presented at the third international workshop in data integration in the life sciences, proceedings, . national science board, dissemination and sharing of research results, washington d.c., , report no. nsb- - . purdue university libraries, "data curation profiles directory". chiang, dietrich, jenkins, tancheva, wright, mcintosh, hudson, and moore, "data curation profiles completed for the datastar project". ibid. linguistics. ibid. biophysics. ibid. herbivory. ibid. demographics. ibid. linguistics. ibid. demographics. lutz bornmann, hanna herich, hanna joos, and hans-dieter daniel, "in public peer review of submitted manuscripts, how do reviewer comments differ from comments written by interested members of the scientific community? a content analysis of comments written for atmospheric chemistry and physics", scientometrics , no. ( ): - . "overview: nature's trial of open peer review", nature international weekly journal of science ( ). http://doi.org/ . /nature lisa r. johnston, "nature network", sci-tech news , no. ( ): - .
chiang, dietrich, jenkins, tancheva, wright, mcintosh, hudson, and moore, "data curation profiles completed for the datastar project". herbivory. ibid. demography and linguistics. ibid. biophysics. oliver pesch, "usage factor for journals: a new measure for scholarly impact", the serials librarian , no. / ( ): . chiang, dietrich, jenkins, tancheva, wright, mcintosh, hudson, and moore, "data curation profiles completed for the datastar project". public health. ibid. demographics. ibid. theoretical chemistry. purdue university libraries, "data curation profiles toolkit"; cragin, palmer, carlson, and witt, "data sharing, small science and institutional repositories", - ; witt, carlson, brandt, and cragin, "constructing data curation profiles", - . chiang, dietrich, jenkins, tancheva, wright, mcintosh, hudson, and moore, "data curation profiles completed for the datastar project". steinhart, chen, arguillas, dietrich, and kramer, "prepared to plan?". purdue university libraries, "data curation profiles toolkit".   about the authors sarah j. wright is life sciences librarian at albert r. mann library. her interests include data curation and scholarly communication and information trends in the molecular and life sciences disciplines. in addition to providing reference services and serving as liaison for the life sciences community at cornell, she also participates in cornell university library's research support service initiatives in the area of data curation. she is a member of cornell university's research data management services group and cornell university library's data executive group.   wendy a. kozlowski is the science data and metadata librarian at the john m. olin library, where her work focuses on issues related to the data curation lifecycle, and outreach and consultation on information management and other research data related needs. 
she coordinates cornell university's research data management service group, a campus-wide organization that links faculty, staff and students with data management services, and co-chairs the library data executive group. before coming to cornell, she spent years in biology and oceanography research.     dianne dietrich is physics & astronomy librarian at the edna mcconnell clark physical sciences library. she has scholarly communication, collection development, instruction, and reference responsibilities, and is responsible for assessing and meeting emerging data curation needs for cornell's physics and astronomy communities.   huda j. khan is the lead datastar developer at cornell university library and has also worked as a developer on the vivo project. her work with both vivo and datastar delves into semantic web representations and her research background and interests include human computer interaction. ms. khan received her joint ph.d. in computer science and cognitive science from the university of colorado at boulder.   gail s. steinhart is research data and environmental sciences librarian and a fellow in digital scholarship & preservation services, cornell university library. her interests are in research data curation and cyberscholarship. she is responsible for developing and supporting new services for collecting and archiving research data, and serves as a library liaison for environmental science activities at cornell. she is a member of cornell university library's data executive group and cornell university's research data management service group, which seek to advance cornell's capabilities in the areas of data curation and data-driven research.   
leslie mcintosh oversees the medical informatics services within the center for biomedical informatics, facilitating the implementation and adoption of a clinical database repository, and an in-house developed software application allowing access to the aggregated patient electronic medical records from washington university school of medicine and the bjc hospital system. she holds a masters in public health with an emphasis in biostatistics and epidemiology in addition to a ph.d. in public health epidemiology. her primary interest is in facilitating medical and health research to elucidate information and knowledge from data in order to ultimately improve health.

copyright © sarah j. wright, wendy a. kozlowski, dianne dietrich, huda j. khan, gail s. steinhart and leslie mcintosh

linking, publishing and evaluating linked open data for language resources
francesco mambrini francesco.mambrini@unicatt.it
scs annual meeting | washington, dc | january ,

table of contents: introduction; treebanks and linguistic annotation; linked open data; lod for language resources; the l-lod network; lila: linking latin; conclusion
full name & full name | università cattolica del sacro cuore, circse

treebanks!
figure: morphosyntactic information (text, morphology, syntax) stored in an xml file of the ancient greek and latin dependency treebank.

treebanks: a success story?
figure: a comment posted on facebook about the workshop of the papygreek project, helsinki .

open problems
• sparseness: there is a multitude of projects involving linguistic annotation;
• standardization: projects jealously hang on to their guidelines and tagsets and refuse to consider any form of standardization;
• interoperability: there is no way to make morphosyntactic annotation interact with other levels of information (e.g. lexical resources);
• usability: there is a lack of general-purpose tools for annotating, manipulating and querying the data.

use-evaluation-correction: a virtuous circle

lod and semantic web in the classics
figure: linked open data: the recommendations.

pelagios: a lod network of annotations
• annotate place references using gazetteer uris from pleiades
• publish annotations using the oac vocabulary
how? don’t unify the model – annotate!

the pelagios model: strengths and weaknesses
• decentralization: pelagios only links data from many different projects;
• a simple model: based on one minimal vocabulary (no effort of conversion/mapping);
• community effect: pelagios is nowadays more than a successful platform; it is a well connected and motivated community of people;
• de facto standard: pelagios has achieved the critical mass to be a de facto standard.

the l-lod network
figure: the linguistic linked open data (llod) cloud, from lod-cloud.net. legend: corpora; lexicons and dictionaries; terminologies, thesauri and knowledge bases; linguistic resource metadata; linguistic data categories; typological databases; other.

why lod?
• sparseness: all independent and self-standing projects can live and prosper across the web;
• small size/marginality: newcomers can be adequately represented along with the “big players”;
• lack of interoperability: as many layers of annotation as needed can be added to encode information about any level of linguistic analysis (syntax, morphology, semantics, pragmatics...);
• lack of standardization: the adoption of common vocabularies is crucial for any lod enterprise;
• usability issues: interoperable and standardized data are ready to be reused; data integrated in a lod network are easier to discover and thus to reuse.

the lila project
https://lila-erc.eu/
• funded under the erc program (principal investigator: marco passarotti)
• aims to connect linguistic resources (lexica, corpora, nlp tools) of latin
• uses the lemma as the linking element (pretty much as pelagios uses the gazetteer id)
• provides uris for latin lemmas, using an ontology based on ontolex
• the collection of lemmas (and the first resources linked) can be visualized at https://lila-erc.eu/lodlive/ and queried at https://lila-erc.eu/sparql/

lila: link via the lemma
“causa” in thomas aq. scg . .

summing up
with lod we can produce data that are:
1. more connected
2. more discoverable
3. more standardized
4. easier to reuse

edinburgh research explorer
so you want to reuse digital heritage content in a creative context? good luck with that.
citation for published version: terras, m , 'so you want to reuse digital heritage content in a creative context? good luck with that.', art libraries journal, vol. , no. , pp. - . https://doi.org/ . /s
digital object identifier (doi): . /s
link to publication record in edinburgh research explorer: https://www.research.ed.ac.uk/portal/en/publications/so-you-want-to-reuse-digital-heritage-content-in-a-creative-context-good-luck-with-that( -c cd- ad-b - ef bbce ).html
document version: peer reviewed version
published in: art libraries journal
general rights: copyright for the publications made accessible via the edinburgh research explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights.
take down policy: the university of edinburgh has made every reasonable effort to ensure that edinburgh research explorer content complies with uk legislation. if you believe that the public display of this file breaches copyright please contact openaccess@ed.ac.uk providing details, and we will remove access to the work immediately and investigate your claim.
download date: . apr.

so you want to reuse digital heritage content in a creative context? good luck with that.
accepted, art libraries journal .
professor melissa terras, department of information studies, foster court, university college london, gower street, wc e bt, m.terras@ucl.ac.uk

abstract
although there is a lot of digitised cultural heritage content online, it is still incredibly difficult to source good material to reuse, or material that you are allowed to reuse, in creative projects. what can institutions do to help people who want to invest their time in making and creating using digitised historical items as inspiration and source material?
the garden of earthly delights, repurposed over at etsy

introduction
we live at a time when most galleries, libraries, archives and museums are digitising collections and putting them up online to increase access, with some (such as the rijksmuseum, lacma, the british library, and the internet archive) releasing content with open licensing actively encouraging reuse. we also live at a time when it has become increasingly easy to take digital content, repurpose it, mash it up, produce new material, and make physical items (with many commercial photographic services offering no end of digital printing possibilities, and cheaper global manufacturing opportunities at scale being assisted with internet technologies). what relationship does digitisation of cultural and heritage content have to the maker movement? where are all the people looking at online image collections like europeana or the book images from the internet archive and saying... “fantastic! cousin henry would love a tea-towel: i'll make some christmas presents based on that!”? the british library is currently tracking their public domain reuse in the wild, looking to see where the million images they released into the public domain, and on flickr, end up being used, manually maintaining a list of creative projects of what people have done with their content.

notes:
https://www.etsy.com/uk/search/handmade?q=garden% of% earthly% delights& order=most_relevant&ref=auto &explicit_scope=
https://www.rijksmuseum.nl/en/explore-the-collection
http://collections.lacma.org/
http://britishlibrary.typepad.co.uk/digital-scholarship/ / /a-million-first-steps.html
http://blog.archive.org/ / / /millions-of-historic-images-posted-to-flickr/
people are using digitised material: visit a commercial fabric printing service like spoonflower and you can see people reusing creative commons images such as those from wikipedia as a design source and inspiration, although many don’t quote the source of the images used as a basis for fabric design. on etsy, an online marketplace for handicrafts, you can see historical art and culture turned into material for sale, such as coasters, corsets, bangles, pillows, phone cases, jewellery, etc - although, again, where the source images came from is not usually made clear. overall, though, the question is why more creative use isn't made of online digital collections. why haven’t we seen the "maker's revolution" where everyone is walking around going "this old thing? i cobbled it together from public domain images on wikimedia and had a tailor on etsy run it up for me!" - or even seen more commercial companies start to use this content as the basis for their home and fashion collections on the high street? there are now funding programs and efforts to try and help the exchange between the "multiple sub-sectors of the creative industries and the public infrastructure of museums, galleries, libraries, orchestras, theatres and the like" and funds for "collaboration between arts and humanities researchers and creative companies". in this new "impact" world, allowing reuse of digitised content will have on-going benefits, but what can institutions be doing to make sure the digitised content they spent so much time creating is used, and reused, further? institutions who have made their out of copyright images freely available for reuse should be applauded: it’s absolutely the right thing to do (there are, of course, many institutions who haven't made their digitised content available). but with that caveat in place, unfortunately, the remainder of this article is an expression of sheer frustration at the current state of play of delivering digitised content online to users.

notes:
for example, http://www.photobox.co.uk/ and http://www.snapfish.co.uk/
http://www.bagsoflove.co.uk/
see http://www.alibaba.com/showroom/digital-printing-silk-scarves.html for an example of where you can organize foreign suppliers to create materials.
http://en.wikipedia.org/wiki/maker_movement
http://www.europeana.eu/
https://www.flickr.com/photos/internetarchivebookimages/
http://britishlibrary.typepad.co.uk/digital-scholarship/ / /tracking-public-domain-re-use-in-the-wild.html
http://britishlibrary.typepad.co.uk/digital-scholarship/ / /a-million-first-steps.html
http://blpublicdomain.wikispaces.com/creative+projects
http://www.spoonflower.com/welcome
http://www.spoonflower.com/fabric/
for example see http://www.spoonflower.com/fabric/ , http://www.spoonflower.com/fabric/ , http://www.spoonflower.com/fabric/ , http://www.spoonflower.com/fabric/ , http://www.spoonflower.com/fabric/ , http://www.spoonflower.com/fabric/ .
https://www.etsy.com/
https://www.etsy.com/uk/listing/ /ancient-world-map-marble-coaster-set-of?ref=sr_gallery_ &ga_search_query=ancient&ga_search_type=all&ga_view_type=gallery,
https://www.etsy.com/uk/listing/ /degas-ballet-corset-historical-corset?ref=sc_ &plkey= bea ab c d b f b f d e d % a &ga_search_query=degas&ga_search_type=all&ga_view_type=gallery,
https://www.etsy.com/uk/listing/ /large-size-brass-cuff-with-graphic-lady?ref=sc_ &plkey= e e d c b e e cafc a % a &ga_search_query=shallot&ga_search_type=all&ga_view_type=gallery,
https://www.etsy.com/uk/listing/ /sandro-botticelli-birth-of-venus-pillow?ref=related- ,
https://www.etsy.com/uk/listing/ /on-sale-samsung-galaxy-s -case-gustav?ref=sr_gallery_ &ga_search_query=klimt&ga_search_type=all&ga_view_type=gallery,
https://www.etsy.com/uk/listing/ /charm-bracelet-degas-ballerina-theme?ref=sr_gallery_ &ga_search_query=degas&ga_search_type=all&ga_view_type=gallery,
https://www.etsy.com/uk/listing/ /van-goghs-starry-night-landscape-with?ref=sr_gallery_ &ga_search_query=gogh&ga_search_type=all&ga_view_type=gallery,
https://www.etsy.com/uk/listing/ /botticelli-pop-remix- -on-stretched?ref=sr_gallery_ &ga_search_query=remix+art&ga_search_type=all&ga_view_type=gallery
http://www.react-hub.org.uk/about/

so much stuff, such poor interfaces.
there is now a vast amount of digitized content online: europeana now has over million items online from institutions. flickr is now being used, independently of the commons, to host tens of millions of digital cultural heritage objects, by thousands of institutions. but for a user, browsing through this content, it is nigh on impossible to navigate or search in any meaningful way, simply because interfaces are so poor (and often the content isn't tagged very well, so isn't very findable). what if institutions have their own content management system? "user friendly" interfaces, such as aquabrowser or digitool, are often anything but. unless you know exactly what you are looking for, it's incredibly difficult for a user to browse and view image content. finding images that are interesting from a design perspective is a time consuming, utterly frustrating task, as users try to navigate (mostly unsuccessfully) what the cultural heritage sector has spent millions of pounds putting online.

suggestion: institutions should employ graphic designers to sort through their thousands of images and present to their users a curated collection of a few hundred really good things which are ripe for using. in amassing some downloadable packs of images of art, logos, boats, trains, halloween, christmas, etc you will encourage reuse. at the moment institutions are making users work too hard to sort through the digital haystack to find the interesting, usable needle. no wonder much of the content isn't used or even viewed: people simply can't find it, or they walk away from horrible interfaces before finding that digitisation diamond.

notes:
http://www.europeana.eu/
http://www.proquest.com/products-services/aquabrowser.html
http://www.exlibrisgroup.com/category/digitooloverview

the shackles of copyright, part : aesthetic.
copyright free images which are put online with free to use licenses are out of copyright (of course) which means they are from a particular time period: generally pre- s (depending on international copyright laws). there's a lot of stuff, but an incredible amount of it is victoriana, which has a particular aesthetic. this is great if you are into steampunk (a look at the first few pages of the internet archive book images flickr stream will explain that fashion) but this doesn't suit all users, particularly those who are interested in th century design.

suggestion: institutions should cherry pick a few in-copyright items that are really very reusable, and pre-emptively clear copyright under various licenses. here are fabulous s illustrations which we have arranged for you to use under a creative commons license!
(there are some examples of this on flickr commons, but it is in the minority). there are resources which are required for this, but really, institutions could be leading the way in making images of selected in-copyright items available and usable for people, to encourage uptake and creativity. or, at the very least, institutions could make processes for chasing copyright clearance clearer to users. it is often impossible to even find out who to email in an institution about rights clearances.

the shackles of copyright, part : cowardice.
let's address the majority of institutions who do not make material available for reuse. for example, say you’d like to make some stationery, and visit europeana to find some interesting images of old envelopes to print up on some notecards (not to sell! just for your own use!). images are labelled "envelope" in europeana. the licensing for these - what you can and can't reuse - is incredibly confusing. only of these items have been put into the public domain. a quarter of these digitised items have licenses which allow access but no further reuse of the images. why not? what are institutions scared of? that someone is going to pop over to photobox (other commercial photo printers are available) and make up some notelets? that someone will make a corset out of images and sell it on etsy? if material is out of copyright, and an institution does not have the nous or can't afford to employ a graphic designer to turn images of envelopes into going commercial concerns, why shouldn't anyone else?

notes:
http://en.wikipedia.org/wiki/steampunk
https://www.flickr.com/photos/internetarchivebookimages/
http://www.europeana.eu/portal/search.html?query=envelope&rows=
http://www.europeana.eu/portal/search.html?query=envelope&rows= &qf=rights% ahttp% a% f% fcreativecommons.org% fpublicdomain% fmark% f . % f*
why are you putting images online if your message to users is "you can't use it. at all"? what are institutions afraid of? (we must not presume that users will not use digital images when they don’t have permission to do so: they will take them and use them anyway). what would happen if we just let people reuse (out of copyright) digital content? what is the worst that could happen? that something archival takes off and becomes another "keep calm and carry on" meme? wouldn’t institutions love to be the source of one of those, for perpetuity? all over the world, institutions are digitising cultural heritage content and putting it online with restrictive licensing which means that users cannot do anything at all with it (at least not without jumping through lots of begging hoops, or using it illegally). this is a complete waste of limited resources in the sector. what "access" are institutions actually providing, if it’s only of the "look but don't touch" variety?

suggestion: if institutions are not going to monetize an out of copyright digitized item themselves, they should make it available for others to reuse, with a generous license.

image quality
for creative reuse, a clear dpi (or higher) image of the digitised item is needed. it is no use saying "this is in the public domain!" if you only provide dpi: nothing can be done with low resolution images, except putting them on other webpages. so much of the "public domain" material is low resolution, which stops people from using the images for creative purposes (which is perhaps deliberate: that'll thwart those corset makers!). institutions should allow access to reasonably high resolution images, and let users play with them. additionally, maintaining white space around images (without cutting off subject matter) ensures images are reusable.

suggestion: provide at least dpi images to users.
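the resolution point above comes down to simple arithmetic: the largest print an image supports is its pixel dimensions divided by the print resolution. the sketch below is a minimal illustration only; the function names and the example values (a 3000 x 2400 pixel scan, 300 pixels per inch) are hypothetical and are not figures taken from the article.

```python
import math


def max_print_size_inches(width_px, height_px, dpi):
    """Largest (width, height) in inches printable at `dpi` pixels per inch."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return (width_px / dpi, height_px / dpi)


def pixels_needed(width_in, height_in, dpi):
    """Pixel dimensions required to print at the given size and resolution."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    # Round up so the print never falls below the requested resolution.
    return (math.ceil(width_in * dpi), math.ceil(height_in * dpi))


# Hypothetical example values, chosen only to illustrate the arithmetic.
print(max_print_size_inches(3000, 2400, 300))  # (10.0, 8.0)
print(pixels_needed(8.5, 11, 300))             # (2550, 3300)
```

an institution could run the same calculation in reverse when deciding what scan size to publish: pick the largest physical item a maker is likely to print, then check that the download meets the pixel dimensions that size requires.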
a thought on makers
some digitised content may be made freely available, but it remains quite costly for people to do anything creative with it where digital printing is concerned, especially in small print runs, or making individual items. it takes significant investment of time and resources to take an archival tiff and turn it into, say, a cushion (or a corset). this should offset the feeling that institutions are giving content away for nothing. it becomes co-creation, rather than mere duplication, taking skill, resources, training, and talent. this maker activity should be respected, as well as the source of the inspiration: love the provision of high quality digital heritage imaging online, but love the people who have the sewing chops to make the corsets.

suggestion: wonderful things can happen when individuals work with institutional digitised content: we should be celebrating this form of public engagement, and doing all we can to support it.

notes:
http://www.europeana.eu/portal/search.html?query=envelope&rows= &qf=rights% ahttp% a% f% fwww.europeana.eu% frights% frr-f% f*
http://mw .museumsandtheweb.com/paper/where-do-images-of-art-go-once-they-go-online-a-reverse-image-lookup-study-to-assess-the-dissemination-of-digitized-cultural-heritage/
http://en.wikipedia.org/wiki/keep_calm_and_carry_on

conclusion
overall, here is what institutions can do if they want people to really use digitised content:
• put out of copyright material in the public domain to encourage reuse. go on! what are you scared of?
• provide dpi images as a minimum. make sure the image quality is good before putting it online.
• curate small collections of really good content for people to reuse. present them in downloadable "get all the images at once" bundles, with related documentation about usage rights, how to cite, etc.
• think carefully about the user interface you have invested in. have you actually tried to use it? does it work? can people browse and find content?
• make rights clearer. give guidance for rights clearance for in-copyright material, and perhaps provide small collections with pre-cleared rights, to allow some th century materials to be reusable.

what do we want! curated bundles of dpi images of cultural heritage content, freely and easily available with clear licensing and attribution guidelines! when do we want that? yesteryear! institutions can be doing so, so much more to help those wanting to use digitised content creatively, and to unlock the potential of our large scale investment in digitized cultural heritage content. with the simple measures described here, we could open up access to a whole range of activities which could transform engagement with digital cultural heritage, which can only be a good thing for both users and institutions.

across the great divide: findings and possibilities for action from the summit meeting of academic libraries and university presses with administrative relationships (p l)
by mary rose muccie (temple university press), joe lucia (temple university libraries), elliott shore (arl), clifford lynch (cni), peter berkery (aaup)

acknowledgments

financial support
andrew w. mellon foundation

summit planning & development
julia claire blixrud, – , deserves special recognition as the originator of the idea to convene this forum bringing presses and libraries together.
rikk mulligan, visiting program officer at arl for scholarly publishing july –june , for extensive work calling together the p l summit organizers & participants, helping to shape the program and agenda, and coordinating event logistics with arl & temple university staff.
monica mccormick, digital scholarly publishing officer, new york university, for event facilitation and general guidance with respect to issues and challenges involved in library / press cooperation and collaboration.

p l summit program committee members
lisa bayer, university of georgia press
jane frances bunker, northwestern university press
joe lucia, dean of libraries, temple university
peter berkery, executive director, aaup
brenna mclaughlin, director of marketing & communications, aaup
mary rose muccie, temple university press
elliott shore, executive director, association of research libraries
karen williams, dean of university libraries, university of arizona

event day note-taking & documentation
sara cohen, editor, temple university press
aaron javsicas, editor-in-chief, temple university press
annie johnson, scholarly communications specialist, temple university
elizabeth waraksa, program director for research and strategic initiatives, association of research libraries

event administrative and logistics support
amy eshgh, association of research libraries
christine jones, temple university libraries
marianne moll, temple university libraries

this work is licensed under a creative commons attribution . international license. https://creativecommons.org/licenses/by/ . /

context and rationale
partnerships and collaborations have become standard responses to the multiple challenges that both higher education and scholarly publishing face.
organizing the work of the academy, either on one campus or across institutions, around collaborative partnerships often enables cost reduction, increases efficiencies, and perhaps most usefully, builds connections between distinct domains to achieve greater strategic impact. in the area of scholarly communication, new or revived partnerships between the university press and the academic research library are an opportunity to re-imagine functions that have been separated from one another through custom, convenience, professional practices, or standard administrative operation. in many of these re-imaginings, provosts and higher-education funders view the library as an appropriate host and sponsor for experiments, situated as it is at the center of many campuses, and in light of its role in the collection, preservation, and dissemination of information and scholarship. instructional technology support, writing centers, digital scholarship centers, visualization labs, and carefully designed collaborative learning and research facilities are examples of the ways in which academic libraries have adapted to reaffirm their positions as centers for discovery, knowledge creation, and scholarship within a college or university. at the same time, the university press occupies a complementary position on the outer boundaries of a university, attracting and disseminating the work of the global academy. as a public-facing unit that generally operates on a different (and often increasingly problematic) budgetary basis than the library or instructional units, university presses have been challenged to leverage linked information technologies that take a new vision of scholarly communications from imagination to reality, while maintaining standards of scholarly merit vis-à-vis consistently applied peer review and editorial best practices. 
interest in partnership between press and library demonstrates an appreciation that the skills, roles, and capacities of these two institutional units can together support a common mission. increasingly these partnerships start with an administrative merger that subordinates one unit to the other at an organizational level, i.e., the press reporting to the library. in some cases the institution is trying to solve one or more of a set of issues that arise from the changing roles and operating environments of both the press and the library; in others, both units are operationally viable and are linked to increase reporting-line efficiency. both institutions sit at a nexus of issues that have come into better view as the revolution in linked information technologies continues to change the way scholarly communications are produced and disseminated. scott waugh, provost at ucla, in his plenary remarks opening the presses reporting to libraries (p l) summit offers the following prescription:

we need to foster consortia of presses and libraries that aim to achieve a common view of and role in the dissemination and preservation of knowledge, data, and scholarship. the p l movement is a step in that direction, and there are many individual projects confronting this need. we also need to encourage and foster collaborative efforts that are designed to support the dissemination and preservation of scholarship on a broad scale. consortium arrangements, such as jstor or hathi trust, have been a major benefit to libraries and presses, helping them operate more efficiently while expanding their reach and increasing the services they offer. more can be done.
true collaboration will require libraries, presses, university administrators and faculty to reach decisions about complex issues: how to reduce redundancies and capitalize on specialties; how to work across institutional boundaries to achieve efficiencies and lower expenses; and how to recognize comparative advantages and give priority to other institutions. universities, faculties, presses and libraries are all part of one large, endangered eco-system. although competition is integral to higher education and has spurred important advances, we all inhabit the same system and need to cooperate and collaborate for the welfare of the system. the complete text of provost waugh’s opening remarks is included as appendix to this white paper. the p l summit convened jointly by the association for research libraries (arl), the association of american university presses (aaup), and the coalition for networked information (cni), funded by the andrew w. mellon foundation, and hosted by temple university libraries and temple university press, the p l summit was held in philadelphia on may and , . in the first such meeting of members of this particular community, teams of press directors and library deans/ directors with an administrative relationship (typically involving the press reporting into the library) discussed the benefits of, challenges in, and possibilities around this relationship. (see appendix for attendee list.) p l explored how these separate components of the scholarly communications ecosystem (e.g., libraries and publishers) might move beyond relationships often established for administrative convenience and think together, leveraging the skills and strengths of their distinctive enterprises to move toward a unified system of publication, dissemination, access, and preservation that better serves both the host institution and the wider world of scholarship. 
p l was an important first step toward a shared action agenda for university presses and academic libraries that supports and updates traditional approaches to scholarly publishing, broader scholarly communication through established and emerging channels and practices, and digital scholarship services for faculty and students. this shared action agenda also must seek to adapt to the new challenges of the digital environment in commitments such as the preservation of the scholarly record. through a series of guided working sessions, attendees shared experiences and brainstormed about areas of common interest that, through partnership, can strengthen and expand their joint mission. they opened a dialogue and strategized about larger issues and challenges that cut across the domains of libraries and publishing, thus laying the groundwork for a follow-up summit (p l ) dedicated to formalizing this list of areas, concretizing next steps, and drafting implementation plans.

format

meetings of the arl, aaup, and the american library association (ala) include sessions dedicated to libraries and publishing, as do events such as the library publishing forum, sponsored by the library publishing coalition. what made p l particularly important was that it brought together, for the first time, pairs of library and press directors from institutions where an established university press now reports to the library, a critical common ground that the conveners believed offered the participants the opportunity to work together effectively on new challenges. rather than attending and passively listening to speakers, summit participants were divided into small teams for a set of four working sessions organized around clarifying the benefits of partnership, identifying key challenges, and proposing experiments for overcoming those challenges to build on the strength of library and press collaboration.
working-session topics were developed from invitees’ detailed answers to a survey, with questions designed to ensure that each pair of attendees talked about these issues before their arrival in philadelphia. (see appendix , survey analysis.) the survey provided organizers with an understanding of the motivating factors that create a successful campus partnership, the different ways institutions manage and leverage this type of press-library relationship, and the areas of common interest for the future. (see appendix , agenda.)

key issues

there is no single model for the relationship between the library and the press, yet similar challenges exist across the spectrum of p l institutions regardless of location, size, or public vs. private designation. discussions at p l backed up the frequently heard statement that presses and libraries want the same thing, that is, widespread, cost-effective distribution of scholarly products. they have shared problems and a shared future. however, there is a need for bidirectional education on the challenges each side is facing as well as for frank conversations about opportunities for change. overgeneralizing is not effective; even within the p l participant group, there were significant differences of perspective among the libraries and presses represented.

tensions can and do exist between the two units. libraries want presses to put more effort into clear mission-oriented work. presses want libraries to think in practical business terms. university presses are typically run as cost-recovery operations with complex budgets whereas the library is a budgeted academic service operation. added to this is the antithetical reality of a press running a business in an educational environment while at the same time operating a mission-based program within a publishing business. it was clear that presses have work to do in terms of educating libraries on their missions and the reasons behind what they do.
as one press director said, “i constantly have to evangelize among librarians, and tell them we’re mission driven. they are [at first] suspicious of my motives. they think it’s all about profit.”

areas for understanding

libraries and publishers have long experienced the tensions inherent in the traditional buyer-seller relationship. those tensions change and grow when a member of the sales community has a reporting relationship to, and often a shared budget with, a member of the purchasing community. in acknowledgment of those tensions, when managing a shared budget, it is clear libraries and presses should approach the budgetary relationship as a partnership, not as patronage; at the same time there need to be frank conversations about the extent to which the press is expected to be financially self-sustaining and the implications of this for other mission priorities. when talking press finances within the library and with the university administration, the p l group identified a need to develop a “script” to follow that frames the funding conversation as mission-based support, not as subvention. as part of this development, many presses pointed out the need to recognize how lean press staffing really is. with the majority of time spent meeting contractual obligations, presses can have little to no time for new initiatives. launching a new initiative may mean a reduced annual publication portfolio unless new resources are available.

developing a shared vocabulary and an understanding of the other’s skills is essential. the institution itself usually lacks knowledge of the publishing business as well; this gap can be tackled first at the library-press level and, from there, carried into broader conversations on campus.
the skills and expertise of publishing professionals are poorly understood in general, particularly in the areas of acquisitions; finance, including author royalties; contractual obligations and subsidiary rights; channel sales and marketing; and publicity beyond the local community. presentations at conferences such as the library publishing forum, arl, and the charleston conference that educate librarians on the varying structures of university presses, their approaches to the publishing process, and the skills embodied by their staffs offer opportunities to share information with the broader library community. by the same token, press employees may have little knowledge of the role librarians play in discoverability through the creation and dissemination of metadata, or in historical preservation through the collection of primary sources. something as simple as shared organizational charts can shed light on the workings of one’s partner.

the press’ role on campus

while presses are developing more services in support of their host institutions’ direct priorities, many in partnership with the library, a press’s overall traditional validation in the united states comes from the world outside of its home university. presses contribute to the public and local good, but do so primarily for the academy broadly at an international level. university presses play an important role in the development of scholarly disciplines as well as that of individual scholars, something poorly understood by libraries, and indeed by academics in disciplines dominated by journal publications. while university libraries also collaborate and contribute in support of the academy broadly, they tend to be more institutionally oriented and focus on research support and teaching and learning at a local level, interacting with faculty primarily as users whereas presses see them as authors and researchers.
and although showing value to the university is essential for a press, equally important is maintaining editorial independence and quality.

it takes work on the part of both the press and library to change the way the university administration sees its press. a university press is a key component of the university’s academic reputation, a tool to support and advance the university mission. titles with the press imprint market the university worldwide. the library leadership is positioned to advocate for the press, and the work of the press and library should reflect the way the university thinks of itself. both need to be seen as strategic mission-driven advantages. and it’s key that their strategic goals be both integrated and complementary.

it is perhaps more critical that press and library leadership develop a common vocabulary and messages that speak to the stressors in the current scholarly publishing ecosystem when engaging top administrators (presidents, provosts, and financial officers). a coherent presentation of the underlying financial, production, and consumption challenges for scholarly output is a necessary framework for these discussions. posing the cost/value trade-offs in the academic enterprise is central to this framing. the ways in which press-library collaboration locally and at more global levels can work toward the twin goals of sustainability and transformation need to be at the center of the conversation.

preliminary recommendations

a tighter coupling of library initiatives and press intellectual capital can open up new ways of thinking about publishing as a core function of the academic environment. this link is integral to moving from shared one-off projects to scalable solutions. p l participants identified a number of concrete opportunities for closer ties and strategic and tactical integration of libraries and presses.

• integrate press and library staff as much as possible.
include the press director on the library management team, form working groups and committees that include staff from both organizations, and develop a joint strategic plan. this high-level integration supports broad strategic initiatives key to changing the local environment. on the operational side, presses and libraries often share services such as it support for online journals, use of the repository environment as an ebook publishing platform, and backlist digitization projects or combined backlist/holdings digitization projects. hr support, joint fundraising and shared development staff, and shared events are common. integrating salary lines to include the press director’s salary in the library budget makes a statement about shared commitment and frees up money the press can invest in new initiatives or in something as simple as increasing travel for the acquisitions editors.

• partner on developing publishing expertise as an educational asset. create and host an undergraduate research journal or develop a program to educate graduate students on open access, authors’ rights, copyright and permissions, publishing in a socially responsible way, and even finding the right publishers for their work.

• leverage the strengths of both the library and press to create open educational resources. open educational resources (oer) are a hot topic on many campuses and are an underused route for library-press collaboration. libraries have a window into the university’s pedagogy and the opportunity to start conversations with faculty about textbook affordability. beyond managing print-on-demand editions, press expertise can be used to work with faculty to develop a project, have it fully peer reviewed, add the press imprint, publicize it beyond the author’s home university, and create standards for authors so the process is replicable.
many oer titles are not adopted in research institutions; adding the imprimatur of a press and formal peer review could encourage broader use.

• develop a shared approach to digital scholarship. digital scholarship and digital humanities projects are both challenges and opportunities for libraries and presses. they provide a chance to develop policies and standards for the university but also raise questions. how are the roles and responsibilities of the library and press defined? what is the response when a faculty member brings a project? that is, is it automatically supported or first evaluated for value and impact? is it a one-off project, a prototype, or part of a broader infrastructure? can the options of both the library as a partner for press projects and the press as an advisor for library projects be supported? who owns the resulting work? what is “publishing” in these cases? many digital humanities/digital scholarship projects would benefit from editorial vision and review; how and when is this input gathered? and how do we do this at scale? defining the skill sets is essential so that each unit can be drawn on effectively.

the european perspective

the summit ended with a presentation by wolfram horstmann, director of göttingen state and university library at georg august university, göttingen, germany. his talk allowed participants to compare and contrast experiences in the united states and canada with those in germany. of the university presses in germany, are run by their libraries and are fully open-access publishers. the connection between a university press and its home institution is much more overt than in north america. that is, universities are expected to develop the capacity to distribute their own faculty’s research, and thus a german press reflects the profile of its founding university. although some cost-recovery tools exist, typically presses are supported by the university.
in addition, the german political climate strongly favors free and open dissemination of research across all disciplines, and german libraries have created services in support of creation and distribution of scholarship.

german libraries are building support for the increasingly data-intensive research methods used by faculty. as wolfram noted, this is a new area for presses and one in which working together can produce robust frameworks for support. in addition, he sees value in libraries helping presses leverage institutional repositories, digital collections, text corpora, tools for digital editing, and research-data publication workflows.

wolfram concluded his talk with a number of observations on german university press publishing and library publishing support that apply equally to p l participants. to summarize, libraries there and here are moving beyond consumption toward assistance with production of content, and a new generation of university presses focusing on electronic publishing and open access has formed.

conclusion and next steps

the library-press relationship explored in p l allows for transformative approaches in support and dissemination of scholarship. effective exploitation of these partnerships is in the early stages and there is an opportunity to influence the outcomes to ensure they are as broadly applicable and scalable as possible. as cliff lynch (cni) noted in his summary of the day’s conversation, we must do more exploration of both intra-institutional (library and press) and cross-institutional collaborations. he provided several compelling suggestions for partnerships, including new ways to promote and leverage library special collections as well as ideas for increasing discoverability of press content. (see appendix for the full text of his remarks.)
addressing the challenges around implementing the ideas and recommendations resulting from p l and moving toward the library and press futures that participants and speakers envision requires broader and deeper investigation. building on the success of p l, a subsequent summit (p l ) will continue the collaborative conversation, tackle the issues raised as well as others facing library-press partnerships, and delve deeply into the recommendations from this meeting as well as those proposed in other contexts. open to a wider audience, p l will be structured to allow more time for moderated discussion. sessions focused on collaboration, both intra- and inter-institutional, would be paramount. examples could include creating and leveraging shared skills, sharing support for data within the university and in the press author pool, and partnering on scalable scholarly communication and library publishing programs. p l would focus on strategies to reinforce the library and press joint mission and advance the shared goal of promulgating scholarship.

appendix : the role of libraries and university presses in the scholarly eco-system: a provost’s perspective

scott waugh, executive vice chancellor and provost, ucla

in recent years, north american libraries and university presses have been jolted by a series of shocks that jeopardize their mission and, in the case of some presses, their very existence. indeed, these tremors have upset what might be called the scholarly eco-system, of which presses and libraries are constituent elements, prompting worries about the stability of the entire system. solutions to the problems of presses and libraries, of scholarly communication in general, therefore, will require large-scale cooperation and collaboration among all elements of the eco-system to find ways of meeting the risks and promises of the digital age and ensuring the survival of the system as a whole.
in the second half of the th century, the scholarly eco-system that developed in the us and canada for the production and dissemination of important research proved to be brilliantly successful. based on the network of research universities that expanded from the later s onward, this eco-system consisted of four interlocking elements:

• discovery – research has flourished across the disciplines, with unbounded reach in space, time, and subject.
• dissemination – it is necessary not only to compile data, but to disseminate it as broadly as possible to stimulate and inform further research as well as educate students. university presses perform this role and add value to the scholarship by shaping and refining it.
• preservation and archiving – the products of research have to be readily available to scholars. university libraries set about gathering, collecting, cataloguing, and archiving research products that could be widely and easily accessed. to this end, they purchase monographs published by the university presses, providing a stable market for their product.
• validation and authentication – the entire system depends on faculty, and most importantly, on peer review: faculty acting as the reviewers, assessors, and validators of research, proposals, publications, the appointment and promotion of faculty, and the admissions and certification of students (especially graduate students).

this eco-system flourished and expanded, producing a vast array of research and scholarship that was disseminated around the world and making north america the leader in higher education and scholarly research of all kinds.

the scholarly eco-system was based in research universities and was nourished and sustained by the revenue model that supported the universities. the model consists of five elements: (1) state funding, (2) tuition and student fees, (3) federal funding and foundation support, (4) private giving and endowments, and (5) self-supporting sales and service functions.
this pool of revenues provided for the compensation of faculty, the support of research facilities, the growth in graduate education, the support of scholarly societies and organizations, the expansion of libraries, and the growth of journals and scholarly publications of all kinds. although reductionist, the model makes the point that every part of the scholarly eco-system is fueled by the same sources of revenue flowing into the universities.

this scholarly eco-system thrived and expanded as long as the revenues and costs remained roughly in equilibrium. in the last two decades, however, the equilibrium has been upset by uncertainties in the revenue streams and an inexorable growth of expenses:

• state funding has declined almost everywhere and is increasingly uncertain.
• tuition growth has slowed or stalled in the face of mounting student debt and concerns about affordability.
• federal funding and foundation expenditures have been nearly flat, while more institutions and faculty are competing for grants.
• endowment growth and payout have fluctuated, and wealth is unevenly distributed among institutions.
• although many universities have successfully pursued new revenues, these additional funds tend to be restricted to specific purposes.
• the costs of running a university have sharply increased, leading to competing pressure for every dollar.

these factors, along with competition for reputational prestige, have created dysfunctional relationships in the dissemination of scholarship, driving university presses and libraries apart. it is a familiar picture: the costs of some prestigious journals have skyrocketed, limiting the ability of libraries, once the reliable partner of university presses in purchasing their scholarly output, to acquire new materials. at the same time presses have experienced declining revenues, while the costs of producing a book or monograph have risen.
as a result, university presses struggle to make money on scholarly publications: it is estimated that % of new books lose money, % break even, and only % generate profit. in short, presses and libraries, which previously were partners in the scholarly eco-system, have become rivals for university subsidies in an age when university budgets everywhere are strained. pitting one against the other endangers the entire eco-system.

information technology and digitization have complicated the picture. they have held out the promise of seamless and limitless access to all knowledge of all time, all of the time, and all “free.” they have also made possible a radical diversification of scholarly communication and modes of publication, enhancing the dissemination of scholarship, a critical feature of the scholarly eco-system. the open access imperative of the federal government – demanding that all research data and materials produced under federal grants be publicly available, as well as the final products whether books or articles – has dovetailed with and accelerated this vision of an electronic cornucopia of knowledge, fundamentally altering the nature of the scholarly eco-system. digitization has blurred the bright line between dissemination and archiving, and the open access movement has underscored the need for universities to figure out how to do both well.

both libraries and university presses are central to the open access movement and should be partners and leaders in that effort, drawing on their combined expertise. yet, thus far it has only increased pressure on their budgets and, hence, on universities generally. aside from journals, technology is the fastest rising expense for libraries, and it has been equally challenging for presses. digitization raises a host of difficult decisions about how to organize, store, and provide access to digital materials. the technical requirements of open access are daunting and expensive.
individual institutions have developed their own projects using their own protocols and platforms, leaving a plethora of projects and data scattered across the web. bringing them together or developing a common platform has proved to be enormously challenging. while such efforts as the committee on coherence at scale for higher education and the share project are addressing these challenges, they only scratch the surface of the problem.

we need to foster consortia of presses and libraries that aim to achieve a common view of and role in the dissemination and preservation of knowledge, data, and scholarship. the p l movement is a step in that direction, and there are many individual projects confronting this need. we also need to encourage and foster collaborative efforts that are designed to support the dissemination and preservation of scholarship on a broad scale. consortium arrangements, such as jstor or hathi trust, have been a major benefit to libraries and presses, helping them operate more efficiently while expanding their reach and increasing the services they offer. more can be done.

true collaboration will require libraries, presses, university administrators and faculty to reach decisions about complex issues: how to reduce redundancies and capitalize on specialties; how to work across institutional boundaries to achieve efficiencies and lower expenses; and how to recognize comparative advantages and give priority to other institutions. universities, faculties, presses and libraries are all part of one large, endangered eco-system. although competition is integral to higher education and has spurred important advances, we all inhabit the same system and need to cooperate and collaborate for the welfare of the system.

a basic obstacle to modernizing the scholarly eco-system is that all these efforts depend on the original funding model, which is increasingly rickety.
it is critical, therefore, for presses and libraries to engage provosts and demonstrate the importance of the issues they are grappling with. second, they must encourage provosts to engage the faculty. the eco-system today is no less dependent on faculty as at its inception, not only as producers and consumers of scholarship, but also as reviewers and validators. faculty must become aware of the fragility of the system and their role in it. they must recognize the many trade-offs involved in budgeting for academic activities. they must acknowledge that the thrill and prestige of publishing in some journals can crowd out the ability of the library to purchase other publications or perform other services. faculty need to consider ways of vetting scholarly products that are less costly than at present and find other ways of determining and ascribing quality and prestige. the faculty is at the heart of these issues and integral to the survival of the eco-system. across the great divide appendix : participants attendees first last title institution email john weaver dean of library services and educational technology abilene christian university library john.weaver@acu.edu jason fikes director abilene christian university press jason.fikes@acu.edu bryn geffert librarian of the college amherst college library bgeffert@amherst.edu mark edington director amherst college press medington@amherst.edu john unsworth vice provost, university librarian and chief information officer brandeis university library unsworth@brandeis.edu sylvia fuks fried director brandeis university press fuksfried@brandeis.edu guylaine beaudry university librarian concordia university library guylaine.beaudry@concordia.ca geoffrey little editor-in-chief concordia university press geoffrey.little@concordia.ca elizabeth kirk assoc. 
librarian for information services dartmouth university library elizabeth.e.kirk@dartmouth.edu john zenelis dean of libraries and university librarian george mason university library jzenelis@gmu.edu john warren head, mason publishing/george mason university press george mason university press jwarre @gmu.edu chris bourg director mit library cbourg@mit.edu amy brand director mit press amybrand@mit.edu carol mandel dean, division of libraries new york university library carol.mandel@nyu.edu ellen chodosh director new york university press ellen.chodosh@nyu.edu sarah pritchard dean of libraries northwestern university library spritchard@northwestern.edu jane bunker director northwestern university press j-bunker@northwestern.edu faye chadwell university librarian and press director oregon state university library and press faye.chadwell@oregonstate.edu tom booth associate director oregon state university press thomas.booth@oregonstate.edu barbara i. dewey dean of university libraries and scholarly communications penn state library bdewey@psu.edu patrick alexander director penn state university press pha @psu.edu peter froehlich director purdue university press pfroehli@purdue.edu barb martin director southern illinois university press bbmartin@siu.edu pamela hackbart- dean interim co-dean, library affairs southern illinois university carbondale library phdean@lib.siu.edu david seaman dean of libraries and university librarian syracuse university library dseaman@syr.edu alice pfeiffer director syracuse university press arpfeiff@syr.edu joe lucia dean of libraries temple university library joseph.lucia@temple.edu mary rose muccie director, temple university press and scholarly communications officer, temple university library temple university press maryrose.muccie@temple.edu june koelker dean texas christian university library j.koelker@tcu.edu dan williams director texas christian university press d.e.williams@tcu.edu bella gerlich professor and dean of libraries 
texas tech university library bella.k.gerlich@ttu.edu courtney burkholder managing director texas tech university press courtney.burkholder@ttu.edu jon miller director university of akron press mjon@uakron.edu mailto:john.weaver@acu.edu mailto:jason.fikes@acu.edu mailto:bgeffert@amherst.edu mailto:medington@amherst.edu mailto:unsworth@brandeis.edu mailto:fuksfried@brandeis.edu mailto:guylaine.beaudry@concordia.ca mailto:geoffrey.little@concordia.ca mailto:elizabeth.e.kirk@dartmouth.edu mailto:jzenelis@gmu.edu mailto:jwarre @gmu.edu mailto:cbourg@mit.edu mailto:amybrand@mit.edu mailto:carol.mandel@nyu.edu mailto:ellen.chodosh@nyu.edu mailto:spritchard@northwestern.edu mailto:j-bunker@northwestern.edu mailto:faye.chadwell@oregonstate.edu mailto:thomas.booth@oregonstate.edu mailto:bdewey@psu.edu mailto:pha @psu.edu mailto:pfroehli@purdue.edu mailto:bbmartin@siu.edu mailto:phdean@lib.siu.edu mailto:dseaman@syr.edu mailto:arpfeiff@syr.edu mailto:joseph.lucia@temple.edu mailto:maryrose.muccie@temple.edu mailto:j.koelker@tcu.edu mailto:d.e.williams@tcu.edu mailto:bella.k.gerlich@ttu.edu mailto:courtney.burkholder@ttu.edu mailto:mjon@uakron.edu across the great divide first last title institution email linda cameron director university of alberta press cameronl@ualberta.ca karen williams dean university of arizona library karenwilliams@email.arizona.edu kathryn conrad director university of arizona press kconrad@uapress.arizona.edu tom hickerson vice provost and university librarian university of calgary library tom.hickerson@ucalgary.ca brian scrivener director university of calgary press brian.scrivener@ucalgary.ca julia oestreich senior editor university of delaware press joestrei@udel.edu toby graham university librarian and associate provost university of georgia library tgraham@uga.edu lisa bayer director university of georgia press lbayer@uga.edu mary beth thomson senior associate dean university of kentucky library mbthomson@uky.edu jonathan allison interim 
director university press of kentucky jonathan.allison@uky.edu james hilton university librarian and dean of libraries; vice provost for digital education and innovation university of michigan library hilton@umich.edu charles watkinson director, university of michigan press / aul, publishing university of michigan press watkinc@umich.edu gregory c. thompson associate dean for special collections university of utah library greg.c.thompson@utah.edu john alley editor in chief university of utah press john.alley@utah.edu michael burton director university press of new england michael.p.burton@dartmouth.edu jon cawthorne dean, university libraries west virginia university library jon.cawthorne@mail.wvu.edu lisa quinn associate director wifrid laurier university press lquinn@wlu.ca gohar ashoughian university librarian wilfrid laurier university library gashoughian@wlu.ca observers first last title institution email alex holzman president alex publishing solutions aholzman@temple.edu harriette hemmasi university librarian brown university library harriette_hemmasi@brown.edu becky brasington clark director of publishing library of congress recl@loc.gov blane dessy director, national enterprises library of congress bdes@loc.gov sarah lippincott program director library publishing coalition sarah@educopia.org jill oneill educational programs manager niso jilloneill@nfais.org xuemao wang dean and university librarian university of cincinnati library wang xm@ucmail.uc.edu mary case university librarian and dean of libraries university of illinois at chicago library marycase@uic.edu meredith babb president, aaup and director, university press of florida university press of florida and aaup mp@upf.com peter potter director, publishing strategy virginia tech library pjp @vt.edu don waters program officer, scholarly communications andrew w mellon foundation djw@mellon.org kathleen fitzpatrick associate executive director and director of scholarly communication mla 
kfitzpatrick@mla.org
chuck henry president council on library and information resources chenry@clir.org

speakers
first last title institution email
wolfram horstmann director of the göttingen state and university library university of göttingen horstmann@sub.uni-goettingen.de
scott waugh provost ucla swaugh@conet.ucla.edu

facilitator
first last title institution email
monica mccormick program officer for digital publishing nyu libraries & nyu press monica.mccormick@nyu.edu

others
first last title institution email
peter berkery executive director aaup pberkery@aaupnet.org
brenna mclaughlin director of marketing and communications aaup bmclaughlin@aaupnet.org
rikk mulligan program officer for scholarly publishing association of research libraries rikk@arl.org
elliott shore executive director association of research libraries elliott@arl.org
elizabeth waraksa program director for research & strategic initiatives association of research libraries elizabeth@arl.org
clifford lynch director cni cliff@cni.org
sara jo cohen editor temple sara.cohen@temple.edu
annie johnson library publishing and scholarly communications
specialist temple annie.johnson@temple.edu
aaron javsicas editor-in-chief temple aaron.javsicas@temple.edu

appendix : p l survey analysis

executive summary

participants: this report considers the submissions of teams of press and library deans/directors. the p l survey received submissions, including those from three observers (the library of congress, brown university, and virginia tech), both teammates from two institutions (southern illinois university and wilfrid laurier university), and three of four participants for the university press of new england (two from brandeis university and one from the librarian of dartmouth college). four of the expected participating partners did not complete the survey. the full text answers are available in the attached tabbed excel workbook file.

the strategic plans of the majority of these press-library relationships are aligned or coming into alignment, although a few are not in synch because of institutional issues explained in individual responses. the challenges for most remain financial (budgetary) with a particular focus on both sustainable operations and a growing need to produce open access (oa) scholarship. some are looking to use this alignment of press and library or expansion of the library’s mission to also operate as a press to move toward a new model for scholarly publishing that privileges oa.

the budgetary and operational relationships of these libraries and presses are aligned but not necessarily integrated. several presses either share budgets with the library or come under their library’s budget.
most of the presses receive technical support from the library and share its infrastructure, although several continue to require specific platforms and software packages to publish. governance for the majority of these partnerships is integrated or in the process of becoming so, as is operational alignment. there are very few shared staff between press and library; comments suggest these are technical positions and functions including it, institutional repositories, and web content. some of these partnerships are cross-training staff in the libraries and presses to support one another and to possibly integrate functions in the future.

the support for digital scholarship broadly is relatively new and in its earliest stages in many of these partnerships. several presses also produce digital supplements to traditional print publications, although the sophistication of these products varies widely. many of these presses are involved in producing digital formats beyond books and articles, but most of these efforts are in their initial stages. the majority of these libraries offer some form of digital publishing service, although what this entails differs widely in the comments. in some instances, when the press has been “grown” within the library, its peer-review and editorial processes are meant to be integrated into the digital research production process, not as a later stage after the project has been created. other institutions are adding digital components or are offering print-outputs to digital projects.

strategic alignment

are the strategic plans of the library and press created in partnership?
yes: ( % of responding teams) | no: ( %) | unclear: ( %)
southern illinois university answered both yes and no. both wilfrid laurier partners answered that at this stage in the process it remains unclear.

is the strategy aligned with the strategic planning of the parent institution?
yes: ( %)
all institutions answered “yes,” although answers were qualified by stating plans are in development.

is there a shared vision of the future of scholarly communications and academic publishing? please explain.
yes: ( %) | no: ( %) | in process: ( %) | unclear: ( %)
most of these press and library partnerships operate under a shared vision or are coming to operate under such a vision, although for some this is more of a spectrum than an absolute alignment. several of those who said “no” or that it was complicated point out the tension or conflict between the library’s support for open access and the mission of the press to sustainably disseminate research.

what are the challenges in planning future endeavors?
the foremost challenges are financial: limited and reduced library budgets, presses operating at a loss, the need for a sustainable oa (and digital publishing) business model, and the cost of software platforms and digital publishing infrastructure. staff, personnel, and skills development are issues linked to budgets and support. beyond the immediate financial challenge is a lack of buy-in or support from faculty and administration for the traditional mission to publish peer-reviewed work. amherst college in particular notes the need to define and promote a new model for scholarly publishing that fulfills the mission of higher education and is also more efficient and effective. see comments from amherst college press, george mason, temple, and wilfrid laurier for the best detail and range.

budgetary and operational relationship

budgets: the press and library operate under:
shared budget: ( %) | separate budget: ( %) | other: ( %)
the majority of those with separate budgets ( / %) say it is the policy of the parent institution, with one stating that it is more of a partnership and another that it is their dean’s choice.
three of those who selected “other” explain that the library is responsible for the press budget in one form or another, from the press being a library line-item to having its budget monitored by the library business office.

shared technical infrastructure:
shared desktop support: ( %)
shared software licenses: ( %)
shared technical staff: ( %)
shared application environment (web servers, cms, ojs, etc.): ( %)
shared hardware budget: ( %)
other: ( %)

where services and support are not provided by university/central it, the presses either rely on the library’s it staff or contract work out. desktop support comes from library it in most cases, followed by campus it. the university or library provides licenses for most common or standard software such as ms office, and sometimes adobe creative suite and indesign, although adobe packages are through the press in other instances. some libraries provide the open journal systems (ojs) platform and a repository platform to the press as well. some institutions, such as temple university, also host a digital scholarship center within their library, offering another source for specific support to the library and press.

governance: fifteen ( %) press and library partnerships share internal governance while ten ( %) do not. sixteen ( %) library directors sit on press boards and seven ( %) sit on press committees; this includes three ( %) library directors who sit on both press boards and press committees. fewer press directors sit on library boards ( , %), but a large majority sit on library committees ( , %), with five ( %) press directors sitting on both library boards and library committees.

operational alignment: are the press and library aligned operationally?
yes: ( %) | no: ( %) | other: ( %)
for those who answered “other” there is a lack of clarity regarding what “operational alignment” means.
• arizona: facilities, human resources support, and payroll are managed by the libraries. press staff are engaged in a wide variety of ways, including service on committees, cross-training in the finance area, etc. press staff participate in libraries’ shared governance associations, the libraries’ social committee, etc. other functions operate independently.
• dartmouth: both library and press report to the provost, with the dean of libraries acting as press governor.
• kentucky: moving in that direction. the press director reports to the dean of libraries and serves on the organization’s executive committee along with associate deans. there has been good collaboration between the press and the library’s scholarly communications area. we continue to merge it, business, and hr operations to create efficiencies. libraries director of philanthropy now providing support to press.
• new york: these questions do not make any sense. the press is part of the division of libraries. i have no idea what you mean by “internal governance” or “aligned operationally.”
• northwestern: not sure what is meant by this question.
• temple: we’re not sure what “operational alignment” means. we share a vision for scholarly communications and work together on implementation. we share a staff person who is tasked with creating and supporting a library publishing and scholarly communications program.
• wilfrid laurier: we are working on identifying the degree of operational alignment.

for those who answered “yes,” the details of their alignment vary. most appear to be aligned in terms of administrative infrastructure: hr, accounting/financial systems, and some other services, but many presses continue to have specific needs outside these alignments.
• abilene christian u: the press director is part of the library leadership.
• alberta: ualberta press’s reputation for quality and impact of its scholarly publications by supporting changes in research directions and dissemination needs in the humanities and social sciences and a strategy in that regard is to collaborate with the libraries on alternate, library-based research dissemination channels and initiatives.
• amherst college: it’s not entirely clear what this question has in view. the library and the press are, on our campus, a single, integrated entity. the press exists to advance the research and scholarly communications objectives of the amherst college library, and by extension those of amherst college. the press director holds two distinct roles: that of director of the amherst college press (in which he reports to the librarian of the college), and that of publisher of the lever press, a parallel initiative encompassing the support of a coalition of liberal arts college libraries (in which he reports to the “oversight committee,” a governing board established by the consortium). the operations of both of these presses take place within the framework of amherst college’s personnel and management policies, financial systems, and technological infrastructure.
• calgary: the press is a unit of the university’s libraries and cultural resources division. we share services and staff.
• george mason: mason publishing (including press) reports to digital programs and services, which is one of three operational divisions within the university libraries.
• georgia: in areas including hr and development, they are aligned. in others, including the basic business functions of the press, it is relatively distinct.
• michigan: cemented in the structure since director of press is also aul for publishing and the press is treated as a designated not auxiliary unit. we continue to find ways of bringing the operational activities of publishing into the rest of the library.
• oregon state: i answered yes, but this alignment is ongoing. we do share the same central hr and financial/accounting personnel in the business center that works for osulp. all employees are evaluated on their contributions to the overall plan. we have also sought to assess performance based on how employees’ work reflects our core values. however, there are times when the press is still outside some activities of the organization. obviously some of the financial and accounting issues of the press are different. the press staff do not regularly attend administrative briefings. the associate director is a part of the osulp management team and is on the listserv but he doesn’t attend the meetings on a regular basis.
• penn state: press director is on the dean’s library council with other department heads and participates in discussions and policy making.
• purdue: the press is a unit of the library.

staffing: only six institutions ( %) share personnel between their library and press. these tend to include technology-related jobs and functions or professional administrative and scholarly communications work.
• amherst college: all personnel within the press are employees of the library.
• calgary: design: fte . press/ . library; admin staff: combined . ftes of library staff shared by press
• georgia: we have shared cost of a marketing/design position and, until fy , shared an it person.
• michigan: a number of positions are funded from the materials budget, but these tend to focus on open access/pub services initiatives
• purdue: it, hr, scanning, ip/legal counsel, and digital humanities
• syracuse: one position at present

job functions: have you reduced (or are there plans to reduce) duplication among staff by retooling, educating, or retraining? please explain.
four ( %) institutions affirmed they had eliminated duplication through layoffs, retirements, or retraining and repurposing.
most expressed that there were few redundancies and duplications; however, it, hr, financial services, fundraising, and grant writing are noted as functions to integrate and retrain. future-looking training includes shifts in scholarly communications, digital publication and curation, and cross-training to support institutional repository document processing and digital humanities production.
• abilene christian: digital publication and curation.
• kentucky: it, business and hr staff.
• mit: fundraising and grant writing staff; possibly shared hr.
• northwestern: scholarly communications.
• purdue: it, hr, fundraising. cross-training libraries staff in publishing workflow support: repository document processing. also cross-training copy-editors, sales in dh production and communication.
• temple: possibly scholarly communications and digital scholarship.
• wilfrid laurier: integrating it and financial services.

digital scholarship

does the library have a support center or formal program to facilitate digital humanities/digital scholarship activities?
yes: ( %) | no: ( %)
many of these centers and programs are quite new and still developing, most having been established in or more recently. new york university has a digital scholarly publishing program officer who works with both the library and digital scholarship services group. several institutions are beginning to align some aspects of open educational resources, digital scholarship, and digital publishing.

is the press currently involved in any publishing ventures that involve digital supplements (data, software, apps, etc.) to traditional book or journal publications?
yes: ( %) | no: ( %)

is the press currently engaged in any projects that involve publication of scholarly materials in “non-traditional” digital formats (e.g., not books or articles)?
yes: ( %) | no: ( %)

does the library have a digital publishing service?
yes: ( %) | no: ( %)
the details offered suggest a very broad range of understandings regarding what a digital publishing service means. some equate it to the institutional repository, others to separate products, blogs, or file preparation services.

are research products from the digital scholarship enterprise being considered for potential press publication in any format?
yes: ( %) | no: ( %) | not yet: ( %)
examples include print-versions or variants of digital scholarship; print-on-demand and pdf versions; and an open access journal. many presses would like to be doing work like this but projects have yet to reach this stage or have yet to be proposed. the two best exemplars submitted are:
• amherst college: we speak of ourselves as an open access, “digitally native” publisher. this means we work to explore with authors how their scholarship, which increasingly begins within digital infrastructures and is authored using digital tools, can more effectively communicate its ideas through the use of digital tools. in exploring these possibilities we bring as well the perspective of our library colleagues who look to the long-term sustainability of digital artifacts of scholarship.
• georgia: our stated goal with regard to the digital humanities lab is that the press will provide peer review and marketing of its scholarly projects. we also have a series, new perspectives on the civil war, that was purposefully designed to include a digital component.

what didn’t we ask that you think we should know?
• abilene christian university: we seek strategies for promoting the press as a vital part of the university, and for realizing new efficiency. e.g., we are cross-training librarians as copy-editors for the press.
• dartmouth (upne): we believe that a closer reporting relationship will be made in the future, with the press reporting directly to the library.
• george mason university/university libraries: it would be helpful to know what other small library publishing/university press groups are using for publishing platforms. for example, what good (and low cost) platforms are being used to publish (oa) journals? are there alternatives to ojs? what book production/publishing management/marketing software is available that is low cost but productive? what approaches are new library publishing/press ventures using to engage and entice the university community to opt for their services? what metrics are they using to show their value to the university?
• mit: the libraries and the press have recently launched some joint fundraising initiatives, including new funds that are explicitly designated as joint library/press funds (for digitization and for oa).
• new york university: this survey assumes a certain outlook that just makes no sense to respond to in our environment. from our perspective it sets up a mental model of press vs. library that does not exist here. we certainly have a library and a press, but the questions imply a nature of interaction that does not reflect our deeper coordination and collaboration.
• northwestern university: do the faculty understand, and/or take advantage of, the library/press relationship? (in our case the answer is probably no.) this p l summit as currently configured has struck some as too narrowly defining “partnerships,” a la scholarly communication and publishing, as opposed to broader service collaborations. where is the reader/researcher in all this? are we paying attention to what they want and need?
• oregon state university libraries: we continue to see benefits from the organizational alignment of the press and the libraries. obstacles and even resistance remain but there is much more openness to change and experimentation on both sides. one of the biggest benefits for the press has been heightened visibility across the university.
another huge benefit has been increased awareness of university press publishing challenges and issues within the library.
• purdue: do all players share a similar definition of what publishing is and might become or of what scholarly communications is and might become? in a post-open- or post-public-access world, who are we working for?
• syracuse university: the formal relationship between the library and the university press is still rather new, and evolving. we also collaborate on design of library promotional materials, and on the development of donors through the library’s assistant dean of advancement. we engage in regular cross promotion of services and publications. this survey was completed jointly by david seaman and alice pfeiffer, director of the press.
• tcu: issues of open access, shared initiatives
• texas tech university press: the new dean of libraries is very event-oriented, and the library building is well set up for events. the library is taking advantage of press authors to give presentations as part of their library event series. also, the library will be selling press titles at the front circulation desk.
• university of arizona libraries: there is a shared development program between the libraries and press.
• university of delaware: how the press-library relationship is channeled/presented to university administration. is the partnership between the two clear to administrators, or does at least one of the parties need to do more to advocate for the other to administration? do administrators understand the shared values of presses and libraries and why those values are critical for institutions of higher ed?
• university of georgia: the press has reported to the libraries for approximately nine years. unlike other similar arrangements, the press was (and remains) financially strong prior to the move.
while the reporting relationship has afforded many unforeseen benefits detailed above, the original decision to move the press to the libraries was motivated by a new provost’s desire to have fewer reporting lines. the arrangement has worked out splendidly at georgia.
• university of michigan: the press is seen as an approach to publishing defined by its editorial board and functions but it is integrated into the library at michigan, a type of organization that is not recognized by many of the questions about. there are interesting cultural issues that we have encountered that are not recognized above, especially around the integration of staff from a library and publishing background.

appendix : aaup/arl/cni p l summit agenda

monday, may th
: – : pm registration desk open
doubletree by hilton
s broad st, philadelphia, pa
: – : pm reception and dinner
estia, a greek mediterranean restaurant
– locust street, philadelphia, pa
http://estiataverna.com/
reception: – : non-alcoholic beverages (coffee, tea, iced tea, soft drinks, juice) will be available as well as a cash bar for those wishing to purchase drinks.
pm welcome: joe lucia and mary rose muccie

tuesday, may th
: – : am registration and continental breakfast
: – : am summit introduction: monica mccormick, nyu
: – : am keynote: scott waugh, ucla
: – : am working session: challenges and barriers
we’re separating publishers from librarians for this session, to encourage candor about the challenges of working together. there are many visions for press/library missions and collaboration: what are the obstacles in your institution, from your position in either the press or library? what do you wish people in the other organization understood? what are some structural, financial, administrative, technical, or social barriers?
: – : am break
: – : am working session: alignment (mission and identity)
in thinking about the evolving mission of both entities, what are some ways in which they can come together? traditionally, publishers have focused on the production of scholarship and libraries on consumption; in the st century people don’t necessarily consider these as widely separated anymore. is there an evolution in what people expect from content and how they may get it? is greater mission alignment both desirable and possible as these expectations shift? can an alignment of goals offer strategic advantages in planning shared innovation and processes? how can an aligned press and library further the greater institutional mission in ways not possible before?
: am – : pm lunch
: – : pm working session: financial (budget and staffing)
will closer collaboration and partnership between the library and press help manage the total cost of the scholarly publishing system? how? framing the discussion in terms of production and consumption, how can sustainable financial models for university-based scholarly publishing be developed that combine the strengths of each unit and move toward shared skills and infrastructure? the pre-summit survey revealed that institutions have strategically aligned the budgets of press and library; ten reported that budgets are still entirely separate: what are the advantages of these different situations? what shared infrastructure, workflows, and cross-training opportunities offer the greatest promise for both press and library?
: am – : pm working session: digital scholarship and dissemination
explore the possibilities of digital scholarship not only to maximize access, but also to better support interdisciplinary scholarship, teaching, and learning across the institution, from the position of an aligned library and press.
areas of exploration include: new and experimental modes of scholarly research, publication, and dissemination; the creation of data management plans; open access models; print-and-digital hybrid scholarship; partnering with or creating digital scholarship centers; discoverability of new scholarly publication forms; and preservation of digital research publications and products.
: – : pm break
: – : pm plenary: wolfram horstmann
: – : pm summit summary
cliff lynch: what have we learned today
peter berkery, elliott shore: defining action steps for the future
: pm wrap and close

appendix : reflections on aaup/arl/cni meeting and opportunities for library-press collaboration
clifford lynch, executive director, coalition for networked information
may , ; revised oct ,

i had the opportunity to provide some summary reflections for the association of research libraries (arl)/association of american university presses (aaup)/coalition for networked information (cni) convening of university libraries and university presses in may . this is an edited, abstracted, and summarized version of my remarks.

convened jointly by arl, aaup, and cni, funded by the andrew w. mellon foundation, and hosted by temple university libraries and temple university press, the presses reporting to libraries (p l) summit was held in philadelphia on may – , . in the first such meeting of members of this particular community, teams of press directors and library deans/directors with an administrative relationship (typically involving the press reporting into the library) discussed the benefits of, challenges in, and possibilities around this relationship.

my remarks fall into three categories: macro issues, specific observations (“gems”) that i thought were really important, and questions i was surprised not to hear much about in the conversation, but that seem important to me.

macro issues

we heard much talk of ecosystems throughout the conversations today.
ecosystems were an integral part of scott waugh’s opening plenary, and i think he accurately described much of what’s going on in what he characterized as the scholarly communications ecosystem. but we should not be thinking in terms of ecosystems, i believe. this is a terrible mistake as we try to understand the implications of recent developments. ecosystems are nasty places when left unsupervised and uncivilized. darwin rules. here we find “nature, red in tooth and claw” (tennyson, in memoriam a.h.h.); existence is “nasty, brutish, and short” (hobbes, leviathan). the academy can make other futures, if it has the will. the difference between ecosystems and societies is the introduction of not-necessarily-darwinian values and moral structures (e.g. don’t eat the weak or elderly). here i must recognize, with a great debt of thanks, timothy norris (formerly a council on library and information resources fellow, norris is now at the university of miami), whose excellent blog post has been haunting me for the past few years. scholarly publishing needs to be a society; the academy is a society. talking of ecosystem rather than society in this context is an abdication of responsibility. we must invent our own future deliberately, not simply let it evolve from marketplace competition.

timothy norris, “morality in information ecosystems,” september , , http://connect.clir.org/blogs/tim-norris/ / / /morality-in-information-ecosystems.

peter berkery of aaup earlier spoke of defining a space, a sphere of university press activity and responsibility and of clearly distinguishing this from the commercial scholarly publishing space. i think this is going to be essential, and most effectively and easily done in the world of scholarly monographs and also, perhaps, in humanistic journals.
there is great opportunity for scoping territory in new long-form argument genres in the digital realm. this sphere needs to be clearly delineated as part of the society of the academy, not the broader ecosystem and marketplace of scholarly publishing.

inside this society i think we are going to need different, or additional, economic models to support the dissemination of scholarly work, particularly long-form arguments. organizationally and with regard to budgets, treating presses explicitly as part of the host university’s scholarly communications portfolio and strategy is a central step towards making this possible. all of the institutions represented here have at least taken the first steps along this path. note that there’s recent data questioning the value of the apparently very minimal editorial contributions of science publishers, for example. in stark contrast, i think that the contributions that the best university presses make in taking a monograph from first draft to final version are widely understood to be very, very large.

as part of this we need to understand and define “us and them” to identify who is within the collaborative and collectively supported society and who stands outside as competition, as pure marketplace players and competitors. this is a very nasty and potentially controversial question that needs to be taken up. where do the big, wealthy university presses, that are so important in the monographic marketplace, like oxford, cambridge, harvard, etc., fit in? they aren’t here because they are among the university presses that have not restructured their reporting relationships. are they commercial publishers in all but name, or are they instruments of the academy that can be brought within this new sphere? what about all of the other smaller university presses?
we need to understand the various lines and axes of collaboration: at this meeting we have focused mainly on intra-institutional (library and press) collaborations rather than cross-institutional collaborations involving libraries and presses from several institutions. libraries have, in some areas at least, a very strong record in this kind of inter-institutional work; the library-press collaborations need to build on this and span the nation. we must focus more on common platforms and on ways to make library systems better accommodate presses broadly (e.g. today’s discussion on metadata workflows).

it is clear, at least at the institutions represented here, that presses are moving from the periphery, from ancillary services, to the core and center of the academic enterprise. this trend is hugely important, and it allows, indeed invites, presses to re-consecrate themselves to their genuine fundamental mission: to abandon subventions for genuine budgets and to be funded, at least in part, as components of the central academic enterprise, as part of a university scholarly communications and stewardship strategy. it makes it possible to stop doing “stretch” quasi-mass-market publications to help cross-subsidize what they are really supposed to be focusing on.

on the editorial contributions of science publishers, see sharon farb et al, “how much does $ . billion buy you? a comparison of published scientific journal articles to their pre-print version,” presentation from cni fall membership meeting, https://wp.me/p lnct- j. another presentation on this subject, by martin klein et al, “comparing published scientific journal articles to their pre-print versions,” was made at the joint conference on digital libraries .

finally, it is very striking to me today that there is no consensus among the scholarly and funder communities about the vision of the desirable future for the monograph.
contrast this to the scientific journal, where it is clear that scientists, funders and policymakers in the us, uk and elsewhere have broadly agreed that the desirable end-state and goal is open access (oa), though there is argument about the pathways (green, or gold, or other means) to reach that desired future, with the emerging consensus varying from nation to nation, and we are still struggling to understand the economics and other implications of the alternatives. note also that the current us funder requirements for public access to journal articles are substantially different than the open access approaches that libraries have been advocating to faculty over the past decade or more. but for science, and for the journal article, there’s a rough general consensus as to where we should be headed.

see, for example, the berlin declaration on open access to knowledge in the sciences and humanities, the office of science and technology policy memo “increasing access to the results of federally funded scientific research,” the finch report on open access, the royal society’s “science as an open enterprise,” and many others.

is there agreement that the future goal for monographs in digital form is open access? what, if any, is the role of the embargo? what, if any, are the models for commerce and oa co-existence? further, there’s the question of what to do with out-of-print works, and what to do when books in digital form never go out of print (though contracts between author and publisher may expire). i think we do not have a consensus on this, in fact, not even the beginnings of a consensus, and i think that there is great urgency in attempting to develop this consensus.

gems: observations that got my attention (and my own extensions or re-interpretations of these)

• we must figure out how to do cross-institutional subvention for individual monographs relatively routinely. this is hard, but seems clear, at least to me.
• open educational resources as they are now emerging are a fertile area for new collaborations that include press and library. but beware: these resources heavily engage non-print media and are going to require new skill sets that are often present neither in the library nor the press.
• enlist the press’s marketing arm to feature institutional special collections and ir materials.
• there are rich opportunities for bibliographic curation of university press publications: authors are collaborators, so it would be useful (and wise) to include links to some content, reviews, tables of contents, etc. use the library to feed these elements into the bibliographic record continuum. turn university press books into “featured items” in discovery systems and make these publications stand out. also, use these publications as an opportunity for bringing people to campus for symposia: connect and curate these materials, and use the institutional repository (ir) and the press (either the local press, the press that published the monograph originally, or both) to disseminate these materials, all linked back to the original monograph. in almost all cases, the number of books published by local university presses is quite small (these are events, as opposed to the comparatively vast and routine local faculty publications in scientific journals, for example): honor these. the local library really can support this.
• include press projects as part of the development portfolio. this strategy is stunningly obvious, but i fear very rare.
• include in press portfolios the work of scholars (not necessarily faculty at the press’s institution) with research focus on local special collections. there are some fabulous opportunities here. furthermore, do this with university museums, archives and other campus collections. develop models to scale up to multiple institutions, not all of which will have local university presses. this is a really, really exciting idea. this strategy also provides a pathway to independent scholars and citizen scholarship connected to local collections.

things we did not talk about

personally, i think that one of the great intellectual challenges of our times is to re-conceptualize the children of the monograph for the digital world. it is not how we move pdfs around or remarket fragmented pdfs of monographs, but what monographs morph into in the digital world. we need to talk about standards, templates and preservability. experiences like the mellon gutenberg-e project offer a wealth of insight that has not been fully harvested and acted upon. we need to orchestrate focused efforts to engage this problem. it’s really hard, and really important. we did not talk about it here, and i’m not sure why. perhaps it’s because university presses feel that it is far away from their existential issues, or that it’s just too long term, or maybe it was simply that there just wasn’t time to get into it. this is one that keeps me up at night. implicit in it is challenging historic assumptions about press editorial roles and contributions, and about the traditional length constraints and other practices related to scholarly monographs. one of the challenges here is to balance, or perhaps provide alternative choices, among the editorial investments, length and prospective estimated size of readership for monographs in the making. a second challenge is how to deal with the potential separation but inter-connectedness of evidence and analysis, and facilitate the reuse of the underlying evidence; this is a fundamental problem facing all disciplines and all forms of scholarly communication.

related resources

coalition for networked information. “institutional strategies for open educational resources (oers): report of a cni executive roundtable held april & , .” august . https://www.cni.org/go/oers -cni-report.
“keynote presentations day .” pacific neighborhood consortium annual conference and joint meetings. august . https://youtu.be/roi f el .
lynch, clifford a. “the battle to define the future of the book in the digital world.” first monday : (june ). http://firstmonday.org/ojs/index.php/fm/article/view/ / .
lynch, clifford a. “a matter of mission: information technology and the future of higher education.” in the tower and the cloud, edited by richard n. katz, – . boulder: educause, . http://www.educause.edu/research-and-publications/books/tower-and-cloud/matter-mission-information-technology-and-future-higher-education.
lynch, clifford a. “the scholarly monograph’s descendants.” in the specialized scholarly monograph in crisis, or how can i get tenure if you won’t publish my book?, edited by mary m. case, pp. – . washington, dc: association of research libraries, . https://www.cni.org/go/scholarly-monographs-descendants.

publishing archaeological excavations at the digital turn
rachel opitz
university of glasgow, glasgow, uk

journal of field archaeology issn: - (print) - (online). journal homepage: http://www.tandfonline.com/loi/yjfa
to cite this article: rachel opitz ( ) publishing archaeological excavations at the digital turn, journal of field archaeology, :sup , s -s , doi: . / . .
to link to this article: https://doi.org/ . / . .
© the author(s). published by informa uk limited, trading as taylor & francis group. published online: nov .

“like the folk tale or the three-act play, the excavation report has become a literary genre, a conventional kind of writing to which most authors conform.” (bradley )

abstract
this paper engages with repeated calls within archaeology for a re-envisioning of the excavation report, contextualized by the transformation of scholarly communication taking place across the humanities and social sciences. this widespread transformation is rooted in a growing interest in showing data together with synthesis and argument, the importance afforded to public engagement, and the proliferation of digital platforms that enable creative presentations of scholarly work.
in this context, we discuss our experience producing an excavation report that attempts to integrate several forms of scholarly and public-facing communication on a digital platform, and aims to engage audiences at multiple levels, while simultaneously facilitating data reuse and laying out the authors’ current interpretations. we consider the benefits and challenges of producing work in this way through the example of producing the gabii project’s first volume, a mid-republican house from gabii, developed through a collaboration between the gabii project team and the university of michigan press. this experience is contextualized within the broader discourse surrounding changing expectations about open access, authorship and credit, and sustainability of digital scholarship in academic publishing.

keywords: publication; multimedia; digital humanities; methodology; 3d

introduction: digital publication, humanities scholarship and writing archaeological excavations

this paper reflects on the experience of producing an excavation report, a mid-republican house from gabii (opitz et al. ), that attempts to take advantage of the flexibility of current digital platforms to write and create content for audiences from the interested member of the public to the academic disciplinary specialist, and integrates the publication and presentation of basic data with that of synthesis and argument within a single work. this volume, the first report in the gabii project’s planned core publication series, was developed through a collaboration between the gabii project team and the university of michigan press. through the process of developing, publishing, and revising this volume, we have engaged with aspects of the extended and multifaceted discourse in archaeology surrounding how we communicate the excavation and research process and its outcomes.
our effort follows in the footsteps of experiments in archaeological excavation publication, many of which were carried out in the early s. key examples include works on the excavations at Çatalhöyük (discussed in tringham and tringham and stevanović ), numerous projects linking articles and digital archives as exemplified by those carried out in relation to the leap project (richards et al. ), e.g. clarke and colleagues’ ( ) publications of the silchester excavations, and the growing number of excavation project teams making their data and reflections available on the web through interactive sites and databases. examples of the latter range from development-led work at heathrow terminal (http://www.framearch.co.uk/ t /) to the long-term research excavations at the athenian agora (http://www.agathe.gr). our project also draws on the active efforts across the humanities and social sciences to reconsider strategies for the presentation and publication of data. archaeological excavation reports exemplify the data-rich humanities publication, and provide a useful lens for considering the ways in which the digital format can present humanities scholarship, which is increasingly cognizant of complex data, together with that data in whatever form it takes. in this context, we face questions germane to debates on open access, authorship and credit, the sustainability of digital scholarship, and connecting diverse audiences with scholarly work, all subjects of debate in both the domain of archaeology and in scholarship at large (e.g. seidemann ; heath et al. ; lake ; kansa ; kansa et al. ; pratt ; kratz and strasser ; moore and richards ; richards and hardman ). this article reflects on some of the choices made in creating a mid-republican house from gabii and their implications, specifically for the archaeological excavation report as a genre and broadly for scholarly humanities publications, as we look to continue to improve the approach taken in our own work.
in this light, we present some of the challenges and broader impacts of creating a multi-layered publication to which interactive media is integral, which aims to be credible with specialists, and which attempts to engage non-specialists. at a time when bringing the humanities into the public square and demonstrating its value is a pressing concern (ang ; jay ; pearce et al. ; scanlon ), many scholars are focused on engaging non-specialists in their research. this mode of publication and effort to address multiple audiences requires attention to how text, media, and data are presented and involves close collaboration between authors, editors, technologists, and designers.

this is an open access article distributed under the terms of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
contact: rachel opitz, rachel.opitz@glasgow.ac.uk, archaeology, school of humanities, gregory building, lilybank gardens, university of glasgow, glasgow g qq
by focusing on the form and style of the excavation report as central to appealing to a spectrum of contemporary audiences and as essential to addressing long-standing concerns about the mismatch between the desires of archaeological readers and the reality of the (primarily) print report or monograph, this paper takes up points raised by the frere report ( ) and commentary by hodder ( ) on the evolving, problematic style of archaeological site reports. it further engages with the discourse on the relationship between the intended audience, increasing professionalization and institutionalization, and writing style as expressed in the puns report (jones et al. , ), and in work by joyce ( ) and fagan ( ) on the impact of language and the modalities of archaeological storytelling. hodder ( ) writes, “at best the [impersonal style] reports are dull, excessively long, detailed and expensive and read by no one except the delirious specialist” and boivin ( ) describes them as “boring, boring, boring … ,” surely a characterization to be avoided. the arguments and reflections presented here are based on the specific experience of producing the gabii project’s first excavation report in an experimental format, and address issues still faced, after twenty-odd years of efforts by numerous projects, by teams working to produce innovative publications that present their data, ideas, and reasoning in new ways.

looking inward: remaking the academic excavation monograph

since the gabii project has conducted survey and excavation at the ancient city of gabii, situated approximately kilometers east of rome, italy. the project has maintained a commitment to techniques of digital field documentation, resulting in the accumulation of an extensive body of digital data ranging from a database of written field observations to photorealistic 3d models of architecture and stratigraphy. in the project began to plan the publications of the results of the excavation.
the team aimed to develop effective and innovative ways of publishing and sharing the project’s rich digital dataset, resulting in the “gabii goes digital project” (ggdp), which ran from – , thanks to an initial grant of approximately $ , from the national endowment for the humanities office of digital humanities. the ggdp provided an opportunity to develop innovative modes of publication for our own data, and to address broader issues in the communication and publication of born digital, non-traditional data sources in the humanities and social sciences. the ggdp resulted in the prototype for the design of a mid-republican house from gabii (opitz et al. ). this volume presents the archaeological story of a single mid-republican house at various levels of detail and sophistication intended for different audiences, within a single digital product through a multi-layered textual narrative, a fully searchable database, and an interactive 3d representation of the archaeological remains and reconstructions.

in looking to design a contemporary excavation monograph, we have started from the principle that reader experience is central, while attempting to adhere to the norms of the genre closely enough that the product is still recognizable as scholarship and as an excavation report. while traditionally composed as long-form linear prose, providing a narrative of excavation strategy, stratigraphic sequence, key material finds, and various categories of supporting and complementary evidence, along with parallels to evidence from other projects, excavation reports are rarely read as a narrative. rather, they are skimmed and mined for the information required by each reader. specialists in ceramics target pottery quantifications and typologies and someone excavating a house nearby flips straight to the descriptions of domestic architecture (mccarthy et al. ; richards and hardman ).
this style of reading suggests that search functions and linking will be essential, and both are well supported by an interactive digital format. the choice of a digital-only format for the gabii project’s excavation reports series was further encouraged by the project’s substantial investment in digital recording and media, particularly image collections and 3d photorealistic models. the presence of a resource in itself can serve as a pressure to take advantage of that resource, compelling us to incorporate our digital archive, including the visual and 3d components, into publications. the digital publication of excavation archives is not new, and many fine examples may be found in repositories including the digital archaeological record (tdar), opencontext, and the archaeological data service (ads), in addition to individual university archives (kansa ; marwick ). however, most of these archives remain relatively separate from the monographs and reports that present those same excavations, and are provided or linked as a supplement or appendix (see discussion in the context of the leap project [richards et al. ]; for a recent and rich example of linking reports and archives see milner et al. a and b). the publication of excavation archives should, we argue and have done in this volume, be an integral part of the ‘main’ publication. one expression of this is the presence of the 3d interactive model, by default occupying half the screen when reading, containing the spatial data archive in the form of 3d models and surveyed limits of stratigraphic units. the presentation of the 3d interactive archive side-by-side with the text, and relative scarcity of descriptions of spatial relations in the text, is an attempt to push readers to engage with the archive and its data together with the text.
the inclusion of fairly dense links between the text and the 3d interactive model serving as a gateway to the archive is another device through which we attempt to more closely integrate the data archive and the written narrative. if the published archive can perform the heavy lifting of presenting primary evidence, and provide added benefits of improved searchability, space is opened within the written excavation volume itself for more synthesis, interpretation, and storytelling. this reformulation requires that the archive be closely and coherently linked to the published written work, and that the structure and relative roles of the prose, data, and visual media be rethought (figure ).

figure . (a, b) early drafts of the interface, including the interactive 3d interface and ‘infoboxes’ that summarize data. (c, d) the revised interface, with additions including a minimap and compacted ‘infobox’ layout to facilitate seeing summaries of information and the 3d model space simultaneously. the text is placed beside, rather than below, the model in the revised interface, to better support parallel reading and viewing.

the restructuring of the publication to closely integrate the archive with the synthesis and narrative relies, as noted above, on linking between different components. there have now been numerous experiments with the use of hypertext and multimedia presentations to support non-linear narratives, multivocality, and different forms of writing in both scholarly and lay contexts, and further discussions of these experiments (clarke ; early-spadoni ; franze et al. ; webb ; rettberg ; schreibman ). starting from the principle of linking the excavation archive and the monograph, the detailed evidence with the argumentation built on top of that evidence, we decided to pursue the use of linked text to present narratives targeting different audiences, allowing a reader to move between different streams and levels of detail, an approach discussed in more detail below. the motivation is to support both productive skimming and deep dives into specific areas of interest, taking advantage of the digital format to support the kind of reading and searching we see as prevalent within our readership, while simultaneously providing a new means of engaging with archaeological evidence to a broader readership.

this aim of reaching an expanded readership reflects the growing importance of public archaeology, and the fact that a broad communication of the process and results of archaeological excavation and interpretation has become a priority for many scholars and professionals. how can we present primary evidence, which requires background knowledge and context to be well interpreted, to a general audience without becoming tedious and sunk in minutiae? how can we simplify complex argumentation to require less background knowledge? at this point, this is not a problem specific to archaeological excavation monographs, but one relevant to humanities and social science disciplines at large. how do we present difficult data, which is in reality open to multiple interpretations, in a responsible way, in formulations appropriate to engaging with diverse audiences? specialists in science communication will no doubt find this challenge all too familiar (logan ; besley and tanner ; fischhoff and scheufele ; krause ), but we would argue, as have others (jay ; coble et al. ; green ), that the advent of digital publication has pushed us all to become better communicators of our research, and what has previously been a specialist concern is now the business of all scholars and researchers.
the first volume of the gabii project reports is the result of an initial experiment in writing linked and layered text addressing different audiences, directly connected to primary data and media, in a restructuring of the excavation monograph.

the format of the volume

the text of the first gabii volume is written in three layers, ‘story,’ ‘more’ and ‘details’ which link to one another, to the site’s database and to an interactive 3d model of the physical remains excavated in one area of the former town of gabii (figure ). in addition there is an introduction to the volume, which explains the format and provides a ‘guide to reading,’ and an ‘apologia’ which explains the project’s methodology. each layer of text addresses a different audience and, while linked to the others, is a self-contained unit telling the whole story of the archaeology in question.

story—talking about what happened

the first layer of text seeks to tell, in the simplest terms possible, our current understanding of what happened in this part of the ancient town of gabii. this highly simplified narrative, which attempts to avoid jargon or the assumption of specialist knowledge, was the result of an extended exercise in distillation from the minutiae of individual stratigraphic units and ceramic sherds down to the story that starts with ‘once upon a time there was a house.’ the attempt at extreme simplification, which when assessed using common measures of readability achieved a . on the gunning fog scale or grade level: – years old (eighth and ninth graders) on the automated readability index (see brewer ), forces us, as authors, to drill down to the essentials to address a broad and general audience. inevitably there are details that don’t quite fit, irreconcilable differences between parts of the record, much like the differences between what is seen by two witnesses to a crime.
it’s all too easy as experts to feel it impossible to reconstruct a narrative of events without attempting to include or explain the details, yet this exercise is necessary for effective communication. also needed to engage a broader audience is a shift in tone and vocabulary. in the mid-republican house the first level of text, that is the ‘story,’ is intended to use the tools of fiction writing to engage readers and communicate complex ideas through simple language. in short, as authors, we have chosen to be engaging in narrative style rather than exacting in the details. while these shifts in detail, vocabulary, and style are expected for a presentation of scholarly work in a public communication venue, their use in a scholarly venue is unconventional. the disjuncture between the storytelling mode of ‘humanities in the public square’ or science communications and the scholarly venue is, it can be argued (culler and lamb ), one caused by the perceived necessity of a serious and impartial tone in academic writing. the perceived need for scholarly prose to be serious in tone has been discussed frequently in academic literature, coming not infrequently under critique. hyland ( : ) summarizes the situation, stating, “the convention of impersonal reporting remains a hallowed concept for many, a cornerstone of the positivist assumption that academic research is purely empirical and objective, and therefore best presented as if human agency was not part of the process.” this is, he notes, a learned behavior, as reflected in manuals for academic writing which include statements like, “in general, academic writing aims at being ‘objective’ in its expression of ideas, and thus tries to avoid specific reference to personal opinions. your academic writing should imitate this style by eliminating first person pronouns … as far as possible” (arnaudet and barrett : ).
we see the same intellectual history and links with positivism implicated in discussions of archaeological writing. expectations for a serious and impersonal tone in archaeological excavation reports emerged, as noted by hodder ( ), out of the professionalization of archaeology and have strong roots in the processual school. given the body of critiques of the writing of academic prose as impenetrable or dry in the name of seriousness or scientific impartiality, the publication of writing guides in book and article form that urge more creativity, and the current emphasis by funding agencies such as the us national endowment for the humanities and the uk arts and humanities research council on public engagement, one might expect a significant shift toward academic writing in experimental and creative forms as part of the production of core scholarship. archaeological scholarship has produced some important examples of creativity in writing and mickel ( : ) argues that, “by capitalizing on the tropes of fictive narrative, archaeologists will be better able to discuss more vividly, complexly—and therefore accurately—the procedure and outcome of an excavation. moreover, a more fictive writing style enables greater transparency, as well as active engagement with more diverse audiences, enlisting invested communities in discursive participation with the epistemological processes of archaeological research.” while we agree with mickel’s perspective and her arguments that fictive narratives have great potential for communicating archaeology (mickel , ), creative works remain in the minority compared to the number of conventionally composed articles.

figure . the main component parts of a mid-republican house, which comprise layers of narrative text, interactive media, and data. links allow the reader to move between different components.
further, the preponderance of experimental writing seems to take place outside the confines of formal peer-reviewed publication, primarily through personal websites, social media, and blogs. if we accept that these alternative venues for publication, though increasingly respected, remain for the moment outside core scholarship, then we must admit a sea change toward more diverse forms of writing and publication has not truly taken place, and the potential of the form remains untapped. we might blame a perceived increase in risk (of rejection by publishers, of career consequences, of negative perception by peers) if one experiments with new forms rather than undertaking boilerplate scholarly writing. further, most of us are not trained as creative writers or storytellers. in academic circles, we are well habituated to critiquing our peers’ writing for content, prior to and through peer review, but less effort has been made to dissect one another’s prose not for content but for style and narrative arc. in archaeology, a discipline with a humanistic past and its guts tied up over scientific legitimacy, we hesitate to draw attention to the difficulties of good storytelling and the relative scarcity of professional preparation for this task.

more—the importance of true stories

the first layer of text in a mid-republican house experiments with style by telling a story grounded in our understanding of the archaeological evidence. this grounding is what makes it a real story and, we argue, a scholarly text. the grounding in the evidence, providing the reasoning and first line of evidence-based argument behind the story, is created through links to the second layer of text.
this second level of text was originally simply labeled ‘more’ and aims to reach a broad audience of archaeologists and students of archaeology who would be interested in the specific case of the site of gabii and how it fits in with broad pictures about roman urbanism, the roman countryside, emerging regional economies, and a variety of other topics of academic interest. the material here essentially represents what would go in an academic journal intended for a broad audience, e.g. antiquity or world archaeology or the journal of archaeological research. for any of these venues we would expect the audience to have a solid background in archaeological method and theory broadly writ, and an interest in the big picture questions about the development of society and the unrolling of history, seen through a material culture lens. however, we would not expect deep foreknowledge of the details of the evolution of roman republican architecture and the organization of domestic space, nor of the ceramic sequences that ground chronologies. this level of writing, achieving a gunning fog index score of . or an automated readability index score of grade level: – years old (college level), adheres the most closely to the experience of academic authors, at least as reflected by our group. it is the synthetic and analytical prose composed after careful study, giving enough, but not overwhelming, detail, in the spirit of cunliffe’s “cake baked by an expert” (wills ).

while this synthetic style of reporting the findings of an excavation is more readable than a set of catalogs, publications of excavations which provide only the ‘analysis and interpretation’ without clear links to the full dataset have come under increasing critique, in particular from scholars advocating for scientific reproducibility and open knowledge, precisely because they synthesize and leave out detail, making it difficult to interrogate the interpretations put forward.
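as an aside for readers unfamiliar with these readability measures, the sketch below shows how the two scores are conventionally defined. it is a minimal python illustration, not the tool used to score the volume: the function names are hypothetical, the syllable counter is a crude vowel-group heuristic, and published calculators will give somewhat different values.

```python
import re


def _words(text):
    # Words are runs of letters (apostrophes allowed); a simplification.
    return re.findall(r"[A-Za-z']+", text)


def _sentence_count(text):
    # Count runs of sentence-ending punctuation; assume at least one sentence.
    return max(1, len(re.findall(r"[.!?]+", text)))


def _syllables(word):
    # Crude estimate (an assumption): each run of vowels is one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text):
    # 0.4 * (average sentence length + percentage of "complex" words),
    # where a complex word has three or more syllables.
    words = _words(text)
    complex_words = [w for w in words if _syllables(w) >= 3]
    return 0.4 * (len(words) / _sentence_count(text)
                  + 100 * len(complex_words) / len(words))


def automated_readability_index(text):
    # 4.71 * (characters per word) + 0.5 * (words per sentence) - 21.43
    words = _words(text)
    chars = sum(len(w) for w in words)
    return (4.71 * chars / len(words)
            + 0.5 * len(words) / _sentence_count(text) - 21.43)
```

both formulas reward short sentences; the fog index additionally penalizes words of three or more syllables, while the automated readability index penalizes long words measured in characters.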
the solution taken up by some projects (e.g., athenian agora: http://www.ascsa.edu.gr/index.php/excavationagora/publications-and-resources/; villa magna: http://archaeologydata.brown.edu/villamagna/) is to produce parallel monographs or reports in print form and ‘data publications’ which are usually digital and placed with a repository. more recently the ‘data paper’ has emerged (e.g. framework archaeology , appearing in internet archaeology, which together with the journal of open archaeological data has been at the forefront of the development and promotion of the ‘data paper’) to provide a bridge between the paper monograph and the data deposited with, in their case, the ads repository. these parallel data publications attempt to bridge the divide between synthesis and conclusions and the data in which they are grounded. however, as the two publications are often housed in separate institutional contexts and in different media, there are challenges in linking between them to facilitate the kind of re-investigation of the analysis and conclusions proposed, as discussed below.

details—how do we talk about data?

archaeological excavation monographs have traditionally dealt with the publication of data in several ways: through the publication of catalogs of specific classes of material (e.g. the lamps from cosa), through the publication of table-heavy appendices, through long descriptive sections where specifics about features and finds are detailed, and through the publication of plans and measurements. two important criticisms have been raised against the approach of publishing catalogs and appendices: first, that this data, in its print form, is difficult to reuse, and second, that close links between the conclusions and the data are often lacking (see connah : – for a discussion of the problems of paper publication of excavation data).
for larger projects, particularly those taking place in an academic context, the publication of volumes on materials often came years after the publication of the synthetic volumes, further reducing the legibility of the data. the publication of annotated plans, widely viewed as another basic form of data, is generally carried out within the main volume and seen as critical to the presentation of the site, but this aspect of the traditional excavation report faces many of the same criticisms. these plans are, as mapmakers and surveyors have regularly pointed out, synthetic and interpretive documents in their own right (dinsmoor : – ). they leave some things out and add other things in, projecting straight lines and completing corners. this makes them equally difficult to reuse as supports for alternative or revised interpretations. further, the most common criticism on this subject is simply that there never seem to be enough detailed plans, that we lack access to basic spatial data (figure ).

with these dissatisfactions in mind, and the complaints of graduate students, fellow researchers, and ourselves about the difficulties of working with ‘other people’s data’ echoing in our ears (allison ; atici et al. ; baird and mcfayden ; huggett ), we must consider what we want from our own published data. further, we must consider the implications of the form in which we are publishing our data.
as huggett asks ( : ), “how does our relationship with archaeological data change as the observations, measurements, uncertainties, ambiguities, interpretations and values encapsulated within our datasets are increasingly subject to scrutiny, comparison, and re-use? what are the implications of increasing access to increasing quantities of data drawn from different sources which are more or less open, more or less standardized, and increasingly reliant on search tools with greater degrees of automation and linkage?”

figure . an overview plan from a mid-republican house which, while useful in that it provides an overview of the excavated area, does not readily allow for interrogation or reuse of the project’s spatial data.

one proposed solution, with origins primarily in science communities, suggests the use of well described vocabularies of common terms, structured metadata explaining how the data was collected, and ontologies showing how different data categories and elements relate, in order to allow us to engage with data in its digital form. these suggestions for a formal knowledge and data modeling approach have been translated from the sciences into the digital humanities community, with an emphasis on ontologies to describe systems of knowledge (kintigh ; faniel et al. ; dallas ; meghini et al. ). another proposed solution to bridging between data and synthesis is that of the ‘data narrative’ or ‘data publication,’ and this is the route a mid-republican house has pursued. the ‘details’ level of our text, sitting two levels below the basic story, is essentially a data narrative or data paper. like the ‘more’ level, it is intended for an academic audience, and has a similar readability score, achieving . on the gunning fog index. it finds historical parallels in the descriptive sections of a traditional excavation monograph, and serves some of the same purposes.
this level, with references to specific sets of stratigraphic units or classes of ceramics, is where we want to achieve productive skimming. here, along with the ‘more’ layer of text, is also where we should be achieving the ‘careful and detailed argument’ that is at the heart of a successful monograph or report. the target audience is the specialist, who wishes to know what supports the arguments made at the synthetic level. most of this text will never be read by most readers, but each bit of text will be closely scrutinized by a small number of expert readers. by separating out the detailed description into its own layer, and linking it to the synthetic layer, we hope to achieve a good balance between the need for detailed explanation and not obscuring the key findings contained in the ‘more’ layer.

the inclusion of a data narrative may seem unnecessary given the incorporation of the database into the publication. one might argue that long written lists of individual stratigraphic units assigned to each phase are redundant when that information can be called up by searching the published data. however, the relationship between the structure of the data and its interpretation, as discussed by llobera ( ) and huggett ( ), is complex. the data narrative serves as another transitional layer, providing context by revealing the way in which data was selected and aggregated, showing which data were most important to us when making the interpretations presented in the ‘more’ level, and highlighting connections made between individual bits of data. this is close to what huggett ( : – ) describes as ‘tacit knowledge,’ necessary to connecting data and interpretation. there is a persistent myth that data can speak for itself. if it does, it speaks rather incoherently. the body of published data includes much information that was collected but never or only lightly used.
documentation of soil color is practiced by many archaeologists: boxes for describing or categorizing soil color are regularly found on recording forms, and munsell charts in excavation kits. while under some circumstances these data may be central to the interpretations made, in the case of the publication of a mid-republican house later incorporated into a public complex, this data element was not central to our interpretive process. including lists of stratigraphic units per phase and details on their stratification, but not details on their color, suggests which data elements were most used when analyzing and interpreting the mass of data collected. the data narrative should draw some order out of the sea of data. the data narrative’s primary role, then, is to give insight into our reading of the data and reasoning about the patterns we can see. this level of text is, unsurprisingly, most densely linked to the data itself (figure ). it is also the level of text with the most visualizations and charts, reflecting its role in summarizing and highlighting patterns identified in the data.

this approach differs from efforts to use metadata, ontologies and vocabularies together to fulfil the role of supporting the arguments made in synthetic articles, reports or monographs, and make chains of data selection and reasoning clear. dallas ( : – ) discusses attempts by roux and courty ( ) along these lines, and draws out the work of gardin to lay a theoretical foundation for this new mode of publication. in this model:

publications of archaeological research are framed not as passive diagrammatic summaries, but as performative, interactive mechanisms (cf. roux and courty ), allowing active access to descriptions and interpretations of the archaeological record, conceived as a schematized sequence of inferences between propositions organically connected with supporting archaeological data.
readers (“consultants”) of a digitally enabled logicist archaeological publication would be able to navigate interactively through its argumentation structure, traversing the inference tree of the authors’ arguments and filtering, juxtaposing and analyzing data, both qualitatively and quantitatively (gardin and roux : – cited in dallas : ).

a key point here is that the reader can navigate. gardin’s “vision for a radically different model of archaeological publication, based on the schematization of archaeological syllogisms and their reliance on the construction of the archaeological record through recording and documentation, and served by semantic, interactive technologies of presentation, linking and reasoning” to achieve, “semantically enriched information integration that does justice to the complexity and human agency underlying knowledge construction in archaeology” (dallas : ) will require that we develop extensive semantic and data literacy as a community to be good readers of these works. we would argue that at this stage most of us are not habituated to reading the data-metadata-ontology triad directly, and the data narrative remains a useful tool for linking written interpretation and data. this points us to a fundamental question: how densely should we link between our different layers of text and the data in our current structure?

figure . densely linked text from the ‘details’ section of a mid-republican house, providing access to the interactive 3d model on a per stratigraphic unit basis to support the data narrative.

linking and navigating text, data and media

in laying out the body of evidence and argumentation for our interpretation of the archaeological record at gabii, everything is connected and we could easily produce a dense mesh of linkages, to the point that every word links out to another part of the text or to the data. this is not what we have done; we have linked selectively, even sparsely.
the primary purpose of the links is to encourage a reader to go deeper into the text and to see connections between data, argumentation, and interpretation. by placing a link from a specific piece of text, we are saying at the most basic level “there’s more here.” the links act as highlighters, pointing out places where the argument might be contested or where there is a particularly dense summarization that deserves further consideration. in our volume, the links within the text are one-directional, allowing a reader to drill down. each word in the text, in our current approach to linking, has only one target. this is an obvious limitation, as a single piece of argumentation in the ‘more’ layer may draw on multiple points in the ‘details’ layer. an alternative linking scheme might provide a way to navigate back up from deeper layers, something many readers may desire, but presents several design challenges. first, a single piece of data in the details section might support several points made in the more section, the inverse of the limitation mentioned above. second, from a design perspective there would need to be a visual difference between links that let you drill down from more general discussion to the more detailed layers and links that let you move up from the data and details into the ‘story’ and ‘more’ levels of text, in addition to the existing visual distinction between links within the text, shown with solid underline, and links from text to 3d model, shown with dashed underline. we might design a system of links that allows for multiple connections flowing arbitrarily between layers of text, as well as those connecting to the data and the interactive 3d environment. navigating through such a densely and intricately linked text presents its own challenges.
this approach is reminiscent of the wikipedia style of reading, where one link leads to another and hours later you’ve somehow moved from reading about the geology of volcanic tuffs to the punk music scene in s manchester. on the one hand, dense multi-linking may, rather than clarifying relationships, lead to a reader feeling overwhelmed by the number of connections to explore from any given point. on the other hand, our current approach of using curated links flowing in a single direction is problematic in that it likely simplifies too much, leaving out many possible connections. we suggest that the best balance between these factors will be specific to each publication, and that the design of appropriate linking systems for excavation reports is an area for further experimentation and research.

similar challenges in design exist for the system of links between text and the interactive 3d environment that provide access to the spatial data collected by the project. the design of the interface embedded in the publication, discussed in opitz and johnson ( ), is intended to provide any reader with an intuitive physical sense of the physical remains of the house, which is fundamental to our interpretation and the narrative constructed through the data and the text (figure ). this environment has effectively been designed to operate at the ‘story’ and ‘more’ levels of the publication, in that it does not supply tools that would allow for detailed investigation, e.g. taking measurements or cutting sections. those activities are supported through the database, which provides access to individual 3d models of stratigraphic units which may be downloaded, measured, sectioned, annotated, etc. effectively, the ‘details’ level interaction with the spatial data is accessed through a separate interface, another design choice.
figure . the combination of photorealistic models of stratigraphic units and reconstructions of the house presented in the 3d interactive environment is intended to provide an intuitive sense of the physical remains and the structures interpreted based on them.

we face similar issues when considering designing an interface that would allow for exploration from entries in the database up through various layers in the text. we might append links to database entry pages that connect them to their mentions in various parts of the text, adding new links as new publications appear. doing so would benefit a reader who began by exploring the data, or ‘drilled down’ to data from a given text, explored laterally, and then wished to see the contextualized discussion of a given piece of data. a system like this would also support a multivocal approach, desirable to many (e.g. joyce and tringham ; habu et al. ; beale and reilly ; shillito ), where discrete texts representing different perspectives and interpretations are clearly linked to the same data. however, a system like this would require ongoing updates of the data layer as new text referring to the same data emerged. further, as noted above, this approach would likewise lead to a dense mesh of links, as each data element refers back to multiple points in various texts, and dense linking may introduce confusion or prove overwhelming for some readers. as above, while not arguing against data-to-text linking, we emphasize that design choices about the level of granularity and flexibility provided by linking must be made in the context of each project’s larger goals.
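the trade-off between curated one-directional links and a navigable route back up from the data can be made concrete with a small sketch. the python below is an illustration under stated assumptions, not a description of how the publication or its platform is implemented: the anchor identifiers are invented, and each anchor is given a single target, as in the current linking scheme.

```python
from collections import defaultdict

# Hypothetical anchors for illustration only: each source has exactly one
# target, mirroring the one-directional "drill down" links described above.
forward_links = {
    "more:phase-1-summary": "details:phase-1-units",
    "more:courtyard-argument": "details:phase-1-units",
    "details:phase-1-units": "data:su-1001",
}


def build_backlinks(forward):
    """Invert the forward links so each target lists every source citing it."""
    backlinks = defaultdict(list)
    for source, target in forward.items():
        backlinks[target].append(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}


def drill_down(anchor, forward):
    """Follow single-target links from an anchor until no further link exists."""
    path = [anchor]
    seen = {anchor}
    while path[-1] in forward:
        nxt = forward[path[-1]]
        if nxt in seen:  # guard against accidental cycles
            break
        path.append(nxt)
        seen.add(nxt)
    return path
```

the reverse index shows the design problem directly: one ‘details’ anchor collects several ‘more’ anchors, so an interface offering upward navigation must present a choice of destinations rather than a single link, and the index must be rebuilt whenever a new text citing the same data appears.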
as illustrated by the discussion above, the design of the publication’s structure and interface, allowing for exploration and interrogation of primary data and media at several levels, and the design of the text to address multiple audiences, raise issues of broad interest in archaeology and in the humanities, beyond the particular problem of the excavation monograph. there have been numerous discussions on the future of the academic monograph (hill ; crossick ; lyons and rayner ; deegan ; jubb ; o’sullivan ). the production of a mid-republican house and its design, carried out in collaboration with the university of michigan press, reflect these discussions, and through them the impacts of current thinking on digital media and writing for the web, the prioritization of public engagement and demonstrating the value of humanities scholarship, and the growing influence of open access policies. below, we briefly discuss the production context with the aim of situating a mid-republican house in this broader landscape and highlighting potential future directions for the reformulated excavation report.

implications for excavation reports and excavation monographs as a genre

in order to consider the current and potential impact of the digital format on the publication of an archaeological excavation monograph, it is useful to review the roots and traditions of the genre. excavation monographs and reports have a somewhat troubled past. it is a common complaint that excavations take an excessively long time to publish after their completion and that many never come to publication at all.
the problem of non-publication is emphasized by the introduction to the new excavation report guides for the national museum service (nms) in ireland, which states, “it is apparent, however, that the pressure of work now associated with the profession has led to variable quality of reportage and these deficiencies must be rectified” (duffy ), a clear reference to the widely acknowledged problem of poor or non-existent excavation reporting. the situation is widespread enough that several professional bodies have produced guidance on writing excavation reports, aimed both at improving the format and quality of the content and encouraging publication in the first place. these guides address many of the same fundamental challenges we discuss in the academic context. the guide for professionals reporting in ireland, for example, comments on content and style for the concise report, intended for a general readership, and the final report, which is more technical. jigsaw, a community archaeology group based in cambridgeshire, uk, provides both an introduction to report writing which contains suggestions about the aims of the report and likely readership and a structured template for a report, complete with section headings (clarke ). bajr (connolly ) and unesco (maarleveld ) provide similarly detailed guidance. the unesco guide attributes the format and style of the excavation reports to reports written by british scholars working in the 19th century. “the format of excavation reports dates back to the 19th century based on pitt-rivers’ cranborne chase model. this generally comprises summary/abstract, introduction/background, description of features, structures and stratigraphy, discussion, catalogues/specialist reports/appendices. in addition, the volumes on the cranborne chase excavation contain useful relic tables summarizing context details including features, stratigraphy, and finds.
now, in the 21st century excavation reports contain more data with more specialist reports, but follow the same format, without relic tables” (structure of a report [rule ] unesco reporting guidelines, in maarleveld ).

the format of the publications that do appear, either as reports or monograph series, has likewise come under criticism variously as unnecessarily dense, characterized by unreadable prose, and fragmented with specialist reports pushed into appendices or separate volumes (bradley ). the format for the presentation of primary data, a task which is heavily descriptive by nature and is often executed in a strictly pro forma style encouraged by strong disciplinary norms or professional societies whose guidance leaves little scope for creativity, only increases the problems identified above. this discourse was picked up later by perry and morgan, who in the mad project undertook the excavation of a hard drive and, in writing up the results, comment on the current state of site reports but also their necessity: “these reports are usually articulated in coded language, primarily only comprehensible to experts and written in the passive tense. there is much to be critiqued about both the style and the legacy of such reporting, and we note with some despair the lack of progress over the years in rethinking its dimensions … ” (perry and morgan ).

given that it benefits scholars and professionals to publish, and indeed it is mandated for excavators working in many western countries, and given the existence of extensive guidance on both content and style, we must ask why accessible excavation reports and monographs seem to continue to be such a struggle to produce. we suggest, as have others, that the difficulty emerges at least in part from disjunctures between the character of contemporary archaeological data, the aims of the excavation report or monograph, and their expected format.
the unesco guide, in acknowledging the essentially 19th century format of excavation publications and at the same time noting that the amount of data needing publication has greatly increased, hits on the first disjuncture. the greatly increased amount of data and variety of types of data makes the exhaustive publication of a large excavation archive in a traditional format an overwhelming task for the authors (thomas ; mccarthy et al. ; hodder ). while relatively small catalogs and tables of data in print form are readable and digestible, in larger quantities, this information also becomes awkward for the reader to consume (aitchison ). summary charts and graphics are widely used by specialists to get around this problem, together with the selection of exemplar artifacts. while this is effective to an extent, the approach remains limited for larger projects. the aim of these publications is, first, to present primary archaeological data, and second, to provide a compelling interpretation of that data. the sheer quantity of data creates problems with the first aim, and the space taken up by the data presentation can easily obscure the useful interpretive sections or the links between the data and the interpretation, making achieving the second aim more difficult.

the challenges of balancing the desire for full publication with the expense and difficulty of publishing large archives were recognized in the s, as noted in the frere report ( ). this report advocated four levels of recording, appropriate to different situations, and a division of the archive or database from the publication, which was widely viewed as a pragmatic solution. this proposed solution, however, was not entirely satisfactory, and the frere report was followed by the cunliffe report ( ), which drew attention to the subsequent problems of re-use of the divided archives and reports. further criticisms were raised, e.g.
by hodder ( ), of the divide between description and interpretation, which he likewise saw as creating a barrier for re-interpretation and data re-use. the desire for greater integration of description and interpretation, and an emphasis on synthesis is likewise reflected in the puns report (jones et al. ; jones et al. ) and in bradley ( ), which also criticize the separation between reports on stratigraphy, specialist reports, and discussion as making the conclusions drawn difficult to critique. many of these same issues were taken up by the cifa/he workshop “challenges for archaeological publication in a digital age” in (wills ).

following the thread of this discussion, spooled out over fifty years, the current tasks are to retain the benefits of the archive—interpretive-narrative divide, while providing enough connections to facilitate re-interpretation and re-use of archival data, and to produce more synthetic and clearly written narratives. several publications starting from the early s can be highlighted as efforts in this direction. these exceptions to the picture of traditionalist publications (e.g. given and knapp ; mickel ; tringham and stevanović ) share some characteristics in their format, including an emphasis on visual design, some experimentation with the style of the narrative, inclusion of digital components, and a move away from the suggested pro forma structure and categories of information. in order to continue to pursue the reimagining of the excavation report or monograph in a digital context, we turn to the broader changes in scholarly humanities publications, which have accelerated since the s under the digital humanities banner.

looking outward: scholarly publishing, digital humanities and new media

there is a wide-ranging conversation within the digital humanities community about the impact of digital media and writing for the web on scholarly communication, and the scholarly monograph in particular (e.g.
earley-spadoni ; dougherty and nawrotzki ). writing history in the digital age (dougherty and nawrotzki ) discusses the impact of digital media on scholarly writing for historians. scholars in media studies have emphasized the role of digital media in promoting multimodal publications. in writing with sound: composing multimodal, long-form scholarship (sayers ) the author discusses the creation of multimodal publications in scalar. the popularity of platforms like scalar and omeka attests to the broad community of scholars working and experimenting in the format of digital publication. these communities are explicitly discussing the approach we (and some of them) have taken, merging the publication of narrative, database and archive. the database | narrative | archive publication explicitly reflects on multiple attempts to stitch these components together. they rely on the concepts of transmedia storytelling and database narrative (conceptually linked to gabii’s data narrative) to explore new modes of presentation. the introduction, written in , explains that the contributors to the volume are “investigating and addressing critical, conceptual, and creative questions at the heart of contemporary nonlinear storytelling in this formative era of the web, while underlining connectivity and historical resonances with earlier media forms and texts.” while we are working within the fulcrum platform through our collaboration with the university of michigan press, the basic issues remain the same. these platforms provide the sandbox in which we can experiment with the form of publication, but they do not define the new structures or conventions for mixing and presenting text, media and data.

the new structures and conventions needed to bring the monograph into the ‘digital age’ are the subject of several long running projects. jstor’s ‘reimagining the digital monograph’ project (humphreys et al. ) has been the impetus for much discussion in recent years.
their ‘topographic’ tool essentially supports the ‘productive skimming’ we describe as a primary mode of engagement for academic readers approaching an excavation monograph. this mode of reading is commented on by the jisc survey on the role of the monograph (oapen ) highlighting the need to support skimming as a reading mode, and to create bridges between skimming and deeper reading. in the uk context, the ahrc funded ‘the academic book of the future’ project (lyons and rayner ; deegan ; jubb ) plays a similar role in drawing together the current discourse and stimulating further discussion on the direction of academic publications, with a particular emphasis on the impact of digital media in a range of formats.

as asked in the context of the mellon funded symposium on digital publication in the humanities, and mellon’s broad effort to reimagine scholarly communication in the humanities: “what features define the quality of scholarly argument? if the monograph is increasingly being challenged as a viable component of systems of scholarly communications, what other genres are needed to disseminate knowledge in the humanities?” moreover, as john maxwell of simon fraser university observed in response to a request to review mellon’s approach to this complicated system, “the inward-facing importance of the monograph as a credential has often overshadowed the outward-facing features of the monograph, which are intended to promulgate broad understanding of humanities research” (waters ). if the emphasis for the new monograph is placed more heavily on promoting broad understandings, it is worth looking at parallel approaches to communication developed in contexts such as museums and explicitly ‘public humanities’ projects.
looking outward: public engagement and academic publishing
in the context of ‘public archaeology’ there is a growing number of high-quality presentations of archaeological materials and reports from excavations that take advantage of digital media to present the site for a variety of audiences, emphasizing communication and promoting understanding over presenting an academic facade. as an example of the genre, we can point to a publication of the serf hillforts project (http://www.seriousanimation.com/hillforts/) which describes itself as a ‘digital engagement’ and a web app. the introduction to this project states,
archaeological visualisation, or the act of picturing the past in the present, is a complex area of research which exists at the convergence of evidence, interpretation, scientific data collection and artfully crafted storytelling. it is a process which at its core relies on a personal engagement between practitioner, practice and the archaeological record. traditional modes of representation ask for visuals which embody a somewhat conclusive and didactic voice. how then might we use visualisation to better reflect the fluidity of the interpretive process and engage audiences more meaningfully with the ways in which the excavated evidence challenges archaeologists? this work aims to develop creative methodologies and outputs which more accurately reflect the multi-layered, multi-vocal and ambiguous processes involved in archaeological interpretation. the interface demonstrates the possibilities for bringing together a range of visual digital media (photogrammetry, aerial photography, rcahms survey data, 3d reconstruction, film-making) to open up the processes behind the excavation and interpretation to a general audience and act as a dynamic archive now that the excavations have concluded.
(serf hillforts project )
clearly the authors are addressing the same issues at hand in our work, as discussed in this article. is the difference merely a matter of what we choose to call the product? we have chosen to publish the gabii project reports with an academic press, giving an isbn to the digital volume and dois to individual data elements, to highlight our contention that this digital archaeology report is a scholarly work. is this imprimatur of the ability to be cited an important differentiator between a work that is primarily scholarly and one primarily intended for public engagement? is there an implied level of synthesis, inclusion of comparanda, and interpretation required to move from ‘digital engagement’ to ‘digital scholarship,’ or is it a matter of including certain elements or following specific conventions of form? we suggest that the main difference is one of stated intended audience, and that there is much overlap in the actual elements included and means of interaction provided in public-facing and scholarly digital publications. we also highlight the importance of a reference that can be persistently cited, in this case the dois for data elements and an isbn for the volume. stable citations play a key role in the process of academic scholarship and publication, particularly over the long term, and the non-persistence or instability of many digital projects is often cited as a danger, leading to a push for replication of digital projects into print, pdf or other media deemed more likely to remain accessible.
the serf hillforts project publication provides a number of means of engagement with its materials. it uses a conventional form for the majority of the text, which appears as pdf site reports, both annual and specialist. the project focuses on visualizations as an alternative mode of engagement, with a carefully designed interactive 3d interface.
the stated aim to “more accurately reflect the multi-layered, multi-vocal and ambiguous processes involved in archaeological interpretation” is one our project, and many scholarly publications, share. given shared aims and common structural elements, continued cross-fertilization between digital public engagement projects and digital scholarship should provide the impetus for innovation in both domains.
public engagement, open access and economics
strengthening connections between writing for public and scholarly audiences is not without challenges, and questions of audience inevitably raise issues related to cost. at present the introduction to the text in a mid-republican house and the data itself are freely available, while the ‘scholarly’ layers of text are available for purchase. if we are to truly encourage the public to engage across all the layers of the text, we must consider the price point of these products. we see the questions of open access and public engagement as closely linked to one another, and at the heart of an ongoing debate about the financial structure behind the publication of scholarly digital long-form works.
the need for a new financial structure for digital scholarship and the often high barrier to entry created by the cost of scholarly books have been addressed in recent reports on the state of publication in the humanities. at one extreme, elliot ( ) supports the view that a move to university-funded open access publication is the way forward for the digital monograph.
we are endorsing a model of university-funded publication that results in an open access digital publication, as well as a print-on-demand physical product sold for an appropriate list price. we are aware that several university presses are currently developing an infrastructure (often supported, it seems, by mellon foundation resources) for digital publication. we have followed these developments carefully and find them encouraging.
if a model of university-funded publication is to succeed, there must be a variety of presses that have the capacity and the willingness to participate in such a program. one of the values that a university press brings is its ability to cultivate and market specialized lists of authors and titles in particular fields, and faculty will continue to seek those presses that can place their scholarship in an appropriate intellectual network. (elliot )
the case put forward by elliot is one many academics and members of the public may agree with in principle, but the question ‘who really pays?’ within this scenario still requires an answer. a parallel report notes that “in a recently published mellon-funded study, the university presses at indiana and michigan put the average costs respectively at $ , and $ , … the study reports average costs ranging from $ , per book for the group of the smallest university presses to more than $ , per book for the group of the largest presses” (hilton et al. ).
in parallel, in the uk the ref, which has a strong influence on academic publication strategies, is increasingly requiring open access publication in order for a work to qualify for submission. while this does not yet extend to monographs (but will from according to current guidance), there is a growing culture of open access publication as a gold standard, and some scholars and institutions have begun to push for open access monographs. in response, presses are increasingly charging fees on the order of £ , to offset lost revenues. the situation raises a number of questions: “how are these costs to be afforded in a new regime of long-form monographic publishing, with growing pressure for open access? can the need to advance scholarship be reconciled with the need to drive down the costs of both monograph and other long-form publication to affordable levels” (waters )?
this discussion is also relevant in the context of reports for developer-led archaeological projects, as the profession considers the cost and effort of producing excavation reports, in balance with the imperative to inform and engage the public. in both the academic and professional communities, the financial questions are crucial because of their impact on the ability and willingness of authors, managers, and publishers to experiment with new digital publication formats.
further, as noted by hilton and colleagues ( ), the above “are costs for monograph publication only; the costs of innovative long-form genres that are non-linear, data-intensive, or multimedia rich are still not yet well understood” (waters ). the implication of this last statement is that “non-linear, data-intensive, or multimedia rich” digital publications are likely to be particularly costly, and consequently are seen as higher risk projects and less publishable. in the case of a mid-republican house, which actively experimented with the form of the publication, the gabii project and university of michigan press were fortunate to receive support from the national endowment for the humanities, the mellon foundation, and the university of michigan to defray some of the costs of developing the publication and its platform. this situation, however, must be the exception when considering a widespread shift in the form of publications across projects at varying scales, with substantially lower production costs for most projects. for example, in elliot’s suggested scheme, authors and publishers may elect to split the content for a digital project into two parts, “long-form scholarship published digitally with a strong resemblance to print monographs, and then a separate supplement with materials that do not fit into a format that mirrors print publication” (type in the elliot classification of forms of digital scholarship).
this split allows for part of the project to be completed following established publication workflows, reducing cost and risk. in parallel, the community may coalesce around a limited set of platforms that have been developed and are maintained through the efforts of a few projects, shared on an open access basis.
the discussion in this section has focused on the costs of producing digital publications. the cost of the preservation of a complex digital publication, both in the archival sense and as a functioning accessible product, likewise must be considered when discussing the economics of producing digital excavation reports. the issues surrounding archiving digital data have been well and repeatedly rehearsed (e.g. faniel et al. ; jeffrey ; kansa et al. ; kansa ), covering the creation of appropriate metadata, the selection of stable and open formats, and the costs of producing and maintaining community archives. in one sense, the archiving of a complex digital publication like a mid-republican house falls well within the scope of current good practice, as the component parts of publication exist in archive-friendly formats. preserving and maintaining a ‘live’ version of the interface presents a greater challenge, and one that should be met if we are to succeed in making the interface and interactive format of the publication integral to its character and to the way in which the narrative and interpretation are constructed. the archiving of digital interfaces has seen less attention, and this topic, we argue, should be collectively addressed by creators of digital projects, archivists and librarians. beyond eventual archiving, the maintenance of a ‘live’ version of a digital project that depends on rapidly changing technologies requires careful negotiation and commitments from both publishers and authors.
planning for future forward migrations of the technological platform and interface at the moment of contract negotiation may provide one route forward. these challenges are not insurmountable, but will require discussion and change across the entire scholarly publishing system.
considering the costs and benefits of production, maintenance, and archiving, at a community scale a balance must be struck between developing a new excavation report format that will not be prohibitively costly or complex to produce for most projects and recognizing the limitations imposed by approaches like the ‘digital supplement’, while encouraging experimentation with and development of more interactive multimedia platforms. for a mid-republican house we have elected to pursue a format not suitable for print publication in order to keep the interactive 3d models, active links, and searchable database integral to the written work. this approach was pursued because of the conviction that if we produced a ‘digital supplement’ work that separates the main text from the online content, we would risk the online-only material becoming inherently undervalued and accessory. the ‘digital supplement’ approach to publication also forbids, to a great extent, strong or dense connections between the text and other components of the publication, a real detraction if we wish both to enforce links between data and interpretation, as suggested by dallas ( ) and huggett ( ), and to encourage contextualized data reuse (faniel ) to form new interpretations and understandings. thus while acknowledging the costs, in developing a publication that is more reliant on its digital format, we attempted to tightly integrate various components of digital scholarship, e.g.
databases, visualizations, and interactive archives, into the excavation report in its digital form, with the intention to add value through the enhanced ability to communicate the archaeological story of a part of the town of gabii in direct connection to the evidence on which it is based.
conclusions
this article discusses the experience of producing a mid-republican house from gabii in the context of over twenty years of experiments and attempts to reform the excavation report within both developer-led and academic archaeology and a broad transformation in scholarly communication driven by increasingly data-embedded humanities scholarship, the emerging prioritization of public engagement, and the opportunities afforded by digital platforms. our proposed contemporary excavation monograph format integrates several forms of scholarly and public-facing communication to create a digital product that reaches multiple audiences and serves both as a platform for data reuse and for communicating current interpretations. it wraps together data publication and archiving with primary publication, reflecting increased contemporary concern with responsible digital data practices. this form requires us to be flexible as readers and consumers of archaeological information. many challenges remain in developing linked multi-layered texts and creative interfaces for prose, media, and data that simultaneously are works of rigorous scholarship and platforms for public engagement and future research. in grappling with these emerging digital forms, we see an opportunity to reinvigorate the publication and reading of archaeological excavations’ data and stories.
acknowledgements
this article is the product of extended, forthright discussion among the gabii team over many years, and like a mid-republican house, it would not exist without the contributions and goodwill of the whole group.
the ideas developed here likewise grew out of ongoing conversations with colleagues at the university of michigan press, whose support and collaboration have been crucial to the success of this project. the early stages of this project were supported by the national endowment for the humanities office of digital humanities (award #hd- - ) under the title “the st c. data, st c. publications. d model publication and building the peer. reviewer community project” and later stages of the project benefitted from support from the mellon foundation, through their award to the university of michigan press. finally, i would like to thank michael given and william caraher for insightful comments on early drafts of the text. all faults and errors remain my own.
disclosure statement
no potential conflict of interest was reported by the author(s).
funding
this work was supported by andrew w. mellon foundation [building a hosted platform for managing monographic source materials]; national endowment for the humanities [hd ].
notes on contributor
rachel opitz (phd , university of cambridge) is a lecturer of archaeology at the university of glasgow. her research focuses on rural western mediterranean societies and landscapes in the st millennium b.c. the foundations of this work are in remote sensing and survey, human perception of the built and natural environment as studied through formal exercises in 3d modeling and analysis of visual attention, and the material culture of rural communities and the towns emerging within them. her recognized methodological expertise includes photogrammetric modeling in the context of excavations, lidar-based analysis of sites and landscapes, and developing information metrics to ask new archaeological questions using 3d data.
orcid
rachel opitz http://orcid.org/ - - -
references
aitchison, k. . “grey literature, academic engagement, and preservation by understanding.” archaeologies ( ): – .
https://doi.org/ . /s - - -
allison, p. . “dealing with legacy data—an introduction.” internet archaeology . https://doi.org/ . /ia. .
ang, i. . “from cultural studies to cultural research: engaged scholarship in the twenty-first century.” cultural studies review ( ): – .
arnaudet, m., and m. barrett. . approaches to academic reading and writing. englewood cliffs, nj: prentice hall.
atici, l., s. w. kansa, j. lev-tov, and e. c. kansa. . “other people’s data: a demonstration of the imperative of publishing primary data.” journal of archaeological method and theory ( ): – . https://doi.org/ . /s - - -
baird, j. a., and l. mcfadyen. . “towards an archaeology of archaeological archives.” archaeological review from cambridge ( ): – . http://www.societies.cam.ac.uk/arc/home.html
beale, g., and p. reilly. . “digital practice as meaning making in archaeology.” internet archaeology . https://doi.org/ . /ia. .
besley, j. c., and a. h. tanner. . “what science communication scholars think about training scientists to communicate.” science communication ( ): – . https://doi.org/ . /
boivin, n. . “insidious or just boring? an examination of academic writing in archaeology.” archaeological review from cambridge : – .
bradley, r. . “the excavation report as a literary genre: traditional practice in britain.” world archaeology ( ): – . https://doi.org/ . /
brewer, j. c. . “measuring text readability using reading level.” in encyclopedia of information science and technology, fourth edition, edited by m. khosrow-pour, – . hershey pa: igi global.
clarke, a., m. g. fulford, m. rains, and k. tootell. . “silchester roman town insula ix: the development of an urban property c. ad – –c. ad .” internet archaeology . https://doi.org/ . /ia. .
clarke, j. r. . “3d model, linked database, and born-digital e-book: an ideal approach to archaeological research and publication.” in 3d research challenges in cultural heritage ii, vol. , edited by s. münster, m. pfarr-harfst, p. kuroczyński, and m.
ioannides, – . cham: springer international publishing. https://doi.org/ . / - - - - _
clarke, r. . an introduction to archaeological report writing. cambridgeshire: jigsaw. accessed august , . https://jigsawcambs.org/images/introduction_to_archaeological_report_writing.pdf
coble, z., s. potvin, and r. shirazi. . “process as product: scholarly communication experiments in the digital humanities.” journal of librarianship and scholarly communication ( ): – . https://doi.org/ . / - .
connah, g. . writing about archaeology. cambridge: cambridge university press. https://openresearch-repository.anu.edu.au/handle/ /
connolly, d. . record sheet and report templates risk assessment form & useful guides. london: molas.
crossick, g. . “monographs and open access.” insights ( ). https://doi.org/ . /uksg.
culler, j. d., and k. lamb, eds. . just being difficult?: academic writing in the public arena. stanford, ca: stanford university press.
cunliffe, b. w. . the publication of archaeological excavations: report of a joint working party of the council for british archaeology and the department of the environment. department of the environment, uk.
dallas, c. . “jean-claude gardin on archaeological data, representation and knowledge: implications for digital archaeology.” journal of archaeological method and theory ( ): – . https://doi.org/ . /s - - -
database | narrative | archive (http://dnaanthology.com/anvc/dna/index).
deegan, m. . the academic book and the future project report: a report to the ahrc and the british library, london. london: the british library.
dinsmoor, w. b. . “the archaeological field staff: the architect.” journal of field archaeology ( ): – . https://doi.org/ . /
dougherty, j., and k. nawrotzki. . writing history in the digital age. ann arbor, mi: university of michigan press. https://muse.jhu.edu/book/
duffy, p. . nms ireland excavation reports guidelines for authors. national monuments service of ireland.
https://www.archaeology.ie/sites/default/files/media/publications/excavation-reports-guidelines-for-authors.pdf
earley-spadoni, t. . “spatial history, deep mapping and digital storytelling: archaeology’s future imagined through an engagement with the digital humanities.” journal of archaeological science : – . https://doi.org/ . /j.jas. . .
elliot, m. . “the future of the monograph in the digital era: a report to the andrew w. mellon foundation.” journal of electronic publishing ( ). http://doi.org/ . / . . .
fagan, b. . writing archaeology: telling stories about the past. london: routledge.
faniel, i., e. kansa, s. whitcher kansa, j. barrera-gomez, and e. yakel. . “the challenges of digging data: a study of context in archaeological data reuse.” in proceedings of the th acm/ieee-cs joint conference on digital libraries, edited by j. downie, – . new york: acm. https://doi.org/ . / .
fischhoff, b., and d. a. scheufele. . “the science of science communication ii.” proceedings of the national academy of sciences (supplement ): – . https://doi.org/ . /pnas.
framework archaeology. . “heathrow terminal excavation archive (data paper).” internet archaeology . https://doi.org/ . /ia. .
franze, j., k. marriott, and m. wybrow. . “what academics want when reading digitally.” in proceedings of the acm symposium on document engineering, edited by s. simske, – . new york: acm. https://doi.org/ . / .
frere, s. . principles of publication in rescue archaeology: report by a working party of the ancient monuments board for england. united kingdom: committee for rescue archaeology.
fulcrum platform. (https://www.fulcrum.org/).
gardin, j.-c., and v. roux. . “the arkeotek project: a european network of knowledge bases in the archaeology of techniques.” archeologia e calcolatori : – . accessed october , .
given, m., and a. b. knapp. . the sydney cyprus survey project: social approaches to regional archaeological survey. monumenta archaeologica . los angeles: cotsen institute of archaeology, university of california, los angeles.
green, a. r. . “integrity, advocacy and the public purpose of scholarship.” in history, policy and public purpose: historians and historical thinking in government, edited by a. r. green, – . london: palgrave pivot. https://doi.org/ . / - - - - _
habu, j., c. fawcett, and j. m. matsunaga, eds. . evaluating multiple narratives: beyond nationalist, colonialist, imperialist archaeologies. new york: springer.
hardman, c., and j. d. richards. .
“stepping back from the trench edge: an archaeological perspective on the development of standards for recording and publication.” in the virtual representation of the past, edited by m. greengrass and l. hughes, – . farnham, uk: ashgate publishing, ltd.
heath, m., m. jubb, and d. robey. . “e-publication and open access in the arts and humanities in the uk.” ariadne ( ).
hill, s. a. . “making the future of scholarly communications: making the future of scholarly communications.” learned publishing : – . https://doi.org/ . /leap.
hilton, j., c. walters, p. courant, s. smith, w. kahn, c. watkinson, j. jackson, s. smart, g. dunham, s. pekala, and n. fitzgerald. . “a study of direct author subvention for publishing humanities books at two universities: a report to the andrew w. mellon foundation by indiana university and university of michigan.” accessed august , . https://deepblue.lib.umich.edu/handle/ . / ?show=full.
hodder, i. . “writing archaeology: site reports in context.” antiquity ( ): – . https://doi.org/ . /s x
huggett, j. . “digital haystacks: open data and the transformation of archaeological knowledge.” open source archaeology: ethics and practice. warsaw: de gruyter open.
humphreys, a., c. spencer, l. brown, m. loy, and r. snyder. . reimagining the digital monograph design thinking to build new tools for researchers. jstor labs report. journal of electronic publishing ( ). http://doi.org/ . / . . .
hyland, k. . “humble servants of the discipline? self-mention in research articles.” english for specific purposes ( ): – . https://doi.org/ . /s - ( ) -
jay, g. . “the engaged humanities: principles and practices for public scholarship and teaching.” journal of community engagement and scholarship ( ): – .
jeffrey, s. . “a new digital dark age? collaborative web tools, social media and long-term preservation.” world archaeology ( ): – .
jones, s., a. macsween, s. jeffrey, r. morris, and m. heyworth. .
from the ground up: the publication of archaeological projects: a user needs survey. york: council for british archaeology.
jones, s., a. macsween, s. jeffrey, r. morris, and m. heyworth. . “from the ground up: the publication of archaeological projects: a user needs survey. a summary.” internet archaeology . https://doi.org/ . /ia. .
joyce, r. . the languages of archaeology: dialogue, narrative, and writing. oxford, uk: blackwell publishing.
joyce, r. a., and r. e. tringham. . “feminist adventures in hypertext.” journal of archaeological method and theory ( ): – .
jubb, m. . academic books and their futures: a report to the ahrc and the british library. london: british library.
kansa, e. . “openness and archaeology’s information ecosystem.” world archaeology ( ): – .
kansa, e. c., s. w. kansa, and l. goldstein. . “on ethics, sustainability, and open access in archaeology.” the saa archaeological record ( ): – .
kansa, e. . “ . . click here to save the past.” mobilizing the past. http://dc.uwm.edu/arthist_mobilizingthepast/
kintigh, k. . “the promise and challenge of archaeological data integration.” american antiquity ( ): – . https://doi.org/ . /s
kratz, j. e., and c. strasser. . “researcher perspectives on publication and peer review of data.” plos one ( ): e .
krause, k. . “a framework for visual communication at nature.” public understanding of science ( ): – . https://doi.org/ . /
lake, m. . “open archaeology.” world archaeology ( ): – .
llobera, m. . “archaeological visualization: towards an archaeological information science (aisc).” journal of archaeological method and theory ( ): – . https://doi.org/ . /s - - -
logan, r. a. . “science mass communication: its conceptual history.” science communication ( ): – . https://doi.org/ . /
lyons, r. e., and s. j. rayner, eds. . the academic book of the future. london: palgrave macmillan. https://doi.org/ . /
maarleveld, t. j., u. guérin, and b. egger, eds. .
manual for activities directed at underwater cultural heritage: guidelines to the annex of the unesco convention. paris: unesco.
marwick, b. . “computational reproducibility in archaeological research: basic principles and a case study of their implementation.” journal of archaeological method and theory ( ): – . https://doi.org/ . /s - - -
mccarthy, m., t. padley, and c. brooks. . “not drowning but waving: one approach to the problem of the publication of large archaeological assemblages.” antiquity ( ): – .
meghini, c., r. scopigno, j. richards, h. wright, g. geser, s. cuy, and a. vlachidis. . “ariadne: a research infrastructure for archaeology.” journal on computing and cultural heritage ( ): : – : . https://doi.org/ . /
mickel, a. . archaeologists as authors and the stories of sites: defending the use of fiction in archaeological writing. saarbrücken: lambert academic publishing.
mickel, a. . “the novel-ty of responsible archaeological site reporting: how writing fictive narrative contributes to ethical archaeological practice.” public archaeology ( ): – .
milner, n., c. conneller, and b. taylor. a. star carr volume : a persistent place in a changing world. york: white rose university press. https://doi.org/ . /book
milner, n., c. conneller, and b. taylor. b. star carr volume : studies in technology, subsistence and environment. york: white rose university press. https://doi.org/ . /book
moore, r., and j. d. richards. . “here today, gone tomorrow: open access, open data and digital preservation.” in open source archaeology: ethics and practice, edited by a. t. wilson and b. edwards, – . warsaw/berlin: walter de gruyter gmbh & co kg.
oapen. . researcher survey . http://oapen-uk.jiscebooks.org/files/ / /oapen-uk-researcher-survey-final.pdf.
omeka (http://omeka.org/).
opitz, r., m. mogetta, and n. terrenato, eds. . a mid-republican house from gabii. ann arbor, mi: university of michigan press. https://doi.org/ . /mpub.
opitz, r. s., and t. d. johnson. .
“interpretation at the controller’s edge: designing graphical user interfaces for the digital publication of the excavations at gabii (italy).” open archaeology ( ): – . https://doi.org/ . /opar- -
o’sullivan, j. . scholarly equivalents of the monograph? an examination of some digital edge cases (report). the academic book of the future project. https://cora.ucc.ie/handle/ /
pearce, n., m. weller, e. scanlon, and s. kinsley. . “digital scholarship considered: how new technologies could transform academic work.” in education ( ). http://ineducation.couros.ca/index.php/ineducation/article/view/
perry, s., and c. morgan. . what archaeologists do: the site report & what it means to excavate a hard drive. accessed august , . https://savageminds.org/ / / /what-it-means-to-excavate-a-hard-drive/.
pratt, d. .
“not an either/or proposition: combining interpretive and data publication.” journal of eastern mediterranean archaeology & heritage studies ( ): – . rettberg, j. w. . “electronic literature seen from a distance: the beginnings of a field.” dichtung-digital. accessed july , . http://www.dichtung-digital.org/( )/ /walker-rettberg/walker- rettberg.htm# richards, j. d., and c. s. hardman. . “stepping back from the trench edge: an archaeological perspective on the development of standards for recording and publication.” in the virtual representation of the past, edited by m. greengrass and l. hughes, – . farnham, surrey and burlington, usa: ashgate publishing company. richards, j. d., j. winters, and m. charno. . “making the leap: linking electronic archives and publications.” in on the road to reconstructing the past, proc. th int. conf. on computer applications and quantitative methods in archaeology (caa), budapest, hungary, , edited by e. jerem, f. redö, and v. szeverényi, – . budapest: archaeolingua foundation. roux, v., and m.-a. courty. . “introduction to discontinuities and continuities: theories, methods and proxies for a historical and sociological approach to evolution of past societies.” journal of archaeological method and theory ( ): – . https://doi.org/ . /s - - -y sayers, j. . writing with sound: composing multimodal, long- form scholarship. in digital humanities conference, university of hamburg. accessed june , . http://www.dh . uni-hamburg.de/conference/programme/abstracts/writing-with-sound- composing-multimodal-long-form-scholarship/. scalar (http://scalar.usc.edu/scalar/). scanlon, e. . scholarship in the digital age: open educational resources, publication and public engagement.” british journal of educational technology ( ): – . https://doi.org/ . /bjet. schreibman, s. . non-consumptive reading. in from literature to cultural literacy, edited by n. segal and d. koleva, – . london: palgrave macmillan. https://doi.org/ . / _ seidemann, r. m. . 
“authorship credit and ethics in anthropology.” anthropology news ( ): – . serf hillforts project. . “designing an interactive archaeological resource: serf hillforts”. https://www.gla.ac.uk/schools/humanities/research/archaeologyresearch/projects/serf/serfhillfortdigitalresource/. shillito, l. m. . “multivocality and multiproxy approaches to the use of space: lessons from years of research at Çatalhöyük.” world archaeology ( ): – . thomas, r. . “drowning in data?: publication and rescue archaeology in the s.” antiquity ( ): – . https://doi.org/ . /s x tringham, r. . interweaving digital narratives with dynamic archaeological databases for the public presentation of cultural heritage. bar international series : – . tringham, r., and m. stevanović. . last house on the hill: bach area reports from Çatalhöyük, turkey. los angeles: cotsen institute of archaeology press. waters, d. j. . monograph publishing in the digital age. https://mellon.org/resources/shared-experiences-blog/monograph-publishing-digital-age/ webb, s. . “reconfiguring narrative” using digital tools. scholarly and research communication ( ). https://doi.org/ . /src. v n a wills, j. . “ st-century challenges for archaeology. workshop : challenges for archaeological publication in a digital age. who are we writing this stuff for, anyway?” draft proposed actions, summary of issues discussed, and notes. accessed april , . https://www.archaeologists.net/sites/default/files/ st-century% challenges% workshop% % draft% proposed% actions% c% summay% and% notes% v % for% website% consultation% % % .pdf.
seams and edges: dreams of aggregation, access & discovery in a broken world

abstract

visions of technological utopia often portray an increasingly ‘seamless’ world, where technology integrates experience across space and time.
edges are blurred as we move easily between devices and contexts, between the digital and the physical. but mark weiser, one of the pioneers of ubiquitous computing, questioned the idea of seamlessness, arguing instead for ‘beautiful seams’ — exposed edges that encouraged questions and the exploration of connections and meanings. with discovery services and software vendors still promoting ‘seamless discovery’ as one of their major selling points, it seems the value of seams and edges requires further discussion. as we imagine the future of a service such as trove, how do we balance the benefits of consistency, coordination and centralisation against the reality of a fragmented, unequal, and fundamentally broken world?

this paper will examine the rhetoric of ‘seamlessness’ in the world of discovery services, focusing in particular on the possibilities and problems facing trove. by analysing both the literature around discovery, and the data about user behaviours currently available through trove, i intend to expose the edges of meaning-making and explore the role of technology in both inhibiting and enriching experience. how does our dream of comprehensiveness mask the biases in our collections? how do new tools for visualisation reinforce the invisibility of the missing and excluded? how do the assumptions of ‘access’ direct attention away from practical barriers to participation? how does the very idea of systems and services, of complex and powerful ‘machines’ ready to do our bidding, discourage us from seeing the many, fragile acts of collaboration, connection, interpretation, and repair that hold these systems together? trove is an aggregator and a community; a collection of metadata and a platform for engagement. but as we imagine its future, how do we avoid the rhetoric of technological power, and expose its seams and edges to scrutiny?

paper

in march the sydney electrical and radio exhibition opened in a blaze of excitement.
aboard his yacht in genoa, inventor guglielmo marconi triggered a radio signal that reached across the world and switched on more than electric lights at the sydney town hall. ‘all in less than a second!’, exclaimed the sydney mail, ‘here was magic! arabian nights recede into remoteness: their magic was nothing compared to this’.[ ] radio had ‘eliminated time and distance’, argued the sydney morning herald, seeing in the exhibition a future where electricity would free the world from drudgery.[ ] about a month later the british and australian prime ministers spoke for the first time via wireless telephone. the british pm, ramsay mcdonald, suggested that the technology ‘would be the means of knitting the two countries closer and closer together’. ‘these were days for the annihilation of time and space’, he proclaimed.[ ] from railways to the telegraph, radio, and the internet, the progress of technology has often been imagined as a battle against time and space. progress has been measured in the seconds we save, in the distances we conquer, in the barriers of terrain and politics we bridge. in the realm of information this march of conquest is accompanied by adjectives such as ‘instantaneous’ and ‘seamless’. no need to wend your way between separate sources and services, technology promises a future beyond silos. you don’t have to look too hard to find software and service vendors touting the promise of ‘seamless discovery’. indeed, it turns out that ‘seamless discovery’ itself is the registered trademark of a video discovery platform used by foxtel and others.[ ] in the library world, seamless discovery is commonly associated with what are variously called ‘next-generation catalogues’, ‘web-scale discovery services’ or ‘discovery layers’.[ ] the idea is familiar and seductive. 
instead of forcing searchers to construct multiple queries across a variety of databases, systems and interfaces, these services aggregate metadata from different sources and offer access through a single search portal. the march of library technology promises to annihilate the legal and technological barriers that interrupt our information-seeking journey. a seam-free service is one that maximises ease-of-use. library users already have a very clear picture of what such a service might look like. every day they undertake a wide variety of social and economic exchanges mediated through the infrastructure of search. google might not be the only platform for online discovery, but it has played a central role in re-engineering our understanding and expectations of online experience. search is no longer just a task to be accomplished in pursuit of a particular goal — to find a desired resource or piece of information. ours is increasingly a ‘culture of search’ where the technologies of discovery are naturalised ‘into the backgrounds, fabrics, spaces and places of everyday life’.[ ] i search, therefore i am. it’s natural then that users of other discovery services will approach them with a set of expectations shaped by the googlisation of modern culture. it’s not just the simplicity of that single search box, it’s our faith that search will just work. every time google responds to our query about some obscure piece of television trivia with million results, we cannot fail to be impressed by the power at our fingertips. every time google predicts our query or customises our results we are beset with awe — a combination of fear and wonder. this must be magic.[ ] library services cannot compete with google’s oracular power, but they can at least aim to offer users a comparable level of simplicity. the features of ‘next-generation catalogues’ or discovery layers tend to follow a familiar check-list: single search box, faceted navigation, and relevance-ranked results. 
the pursuit of seamless discovery likewise mirrors google’s totalising reach. one search box to access a whole world of data. there’s nothing wrong with this — we all want to make life as easy as possible for the people who use our services. the question is how the pursuit of a google-like experience constrains our options and assumptions.

despite the mathematical foundations of google’s pagerank algorithm, there are politics at work in calculations of relevance and criteria for inclusion.[ ] google’s dominance gives it immense power in presenting to us an image of the world constructed to its own secret formula. this power bears ontological weight — if we can’t find something on google does it exist? if we are concerned with absence as well as inclusion, with addressing the silences within our cultural record, we need to be wary of sharing in google’s aura of completeness.

seams are not simply obstacles to a smooth user experience, they’re reminders that our online services are themselves constructed. there’s nothing natural or inevitable about a list of search results. mark weiser, one of the pioneers of ubiquitous computing, argued against seamlessness because it made everything seem the same. instead he imagined systems with ‘beautiful seams’.[ ] the possibilities of ‘seamful design’ have been taken up by other researchers, exploring ways that users can be empowered to discover and manipulate their contexts and connections.[ ] as mitchell whitelaw notes ‘seamfulness is also an ethical and political stance’ — it’s a commitment to exposing the interpretative distance between our collection data and its online representation.[ ] there are opportunities here not only for transparency, but to explore alternatives to google’s template for discovery.
research into the visualisation of large cultural heritage collections, by whitelaw and others, has emphasised that search is only one way of representing a collection.[ ] by focusing on the stylish minimalism of the search box, we discard opportunities for traversing relationships, for fostering serendipity, for seeing the big picture. it’s important to recognise, however, that this type of research is not aimed at supplanting search, nor building a better google. nor indeed should alternative collection interfaces be judged on narrow measures of utility. this is building as critique — each alternative interface offers a means of questioning our assumptions about the discovery of online collections. as matt ratto argues in his discussion of ‘critical making’, ‘these material interventions provide insubstantiations of how the relationship between society and technology might be otherwise constructed’.[ ] by playing around with our expectations we can start to think differently, to develop new metaphors for our online experience. my own eyes on the past, which allows you to find your way into trove’s digitised newspapers through machine recognised faces and eyes, is far from a practical discovery tool.[ ] but building on my earlier work using facial detection technology as a means of archival intervention, it opens up questions about the lives embedded within our collections — we see them differently, we feel differently. a google-like search experience offers utility at the expense of critique. its technologies are black boxed, its assumptions obscured. how do those of us in the discovery business respond? how do we create a buffer for critical reflection while still meeting user expectations? by unpicking a few seams, cultural institutions can open up a space for discussion, but what does this actually mean for a service such as trove that must deal with thousands of users a day? 
i’d suggest we start with an acknowledgement of our limits, an attempt to trace the edges and the fractures that are too often glossed over in our pursuit of seamlessness. i also think we should take our metaphors seriously, not just as marketing hype, but as the means by which we structure the realm of what is possible. let’s start by admitting what trove is not:

1. trove is not perfect
2. trove is not everything
3. trove is not a machine

trove is not perfect

trove is an aggregator. it pulls together metadata from a variety of different sources, applies some normalisation across the required fields, and sends the results off to be indexed. with close to million resources harvested from hundreds of contributors through an assortment of different pipelines, it’s inevitable that there will be errors and oddities. descriptive standards vary, and sometimes the assumptions trove makes about the data it’s getting are wrong.

if you want to see errors, of course, you can head along to the trove newspapers zone where the limitations of optical character recognition are on display for all to see. unlike some full-text databases, trove exposes the raw output of its ocr processing. the accuracy of ocr is heavily dependent on the quality of the source material which, in the case of historical newspapers, varies considerably.[ ] a few years ago, as part of a separate research project, i made an attempt to estimate ocr accuracy in trove across a sample of , newspaper articles.[ ] i basically just compared the ocr output to a dictionary list of words and calculated the accuracy of each article as the percentage of its words that appeared in the list. variations were considerable across both time and titles, but the average was around %. a much more rigorous analysis of the british library’s digitised th century newspapers found an overall word accuracy of %.[ ]

trove’s transcriptions are improving all the time thanks to the efforts of thousands of online volunteers who correct the raw ocr output.
astonishingly, more than million lines of text have been corrected by trove users, in what is rightly touted as a highly successful crowdsourcing initiative. but it’s also important to put this effort in perspective. head across to the trove newspapers zone and enter ‘has:corrections’ into the search box to retrieve all the articles that have at least one crowdsourced correction.[ ] at the time i wrote this, the figure was , , or just . % of the total number of newspaper articles in trove. paul hagon’s analysis of trove crowdsourcing behaviour also indicates there is a flattening out of growth in corrections. despite their important efforts, trove’s volunteers will never be able to produce a perfect rendering of the newspaper content.

but what is ‘perfection’ anyway? ocr accuracy is important only in so far as it supports the interests and activities of users. for the purposes of discovery the accuracy of common search terms such as names, places or events is likely to be most important. but a much broader range of words would be significant in an analysis of changes in language across time. accuracy is something that needs to be assessed and understood within the context of a specific research activity. researchers using digitised text collections need to consider the impact of technologies such as ocr on their methodologies, or else, in tim hitchcock’s words, ‘this is roulette dressed up as scholarship’.[ ]

services like trove can support rigorous digital scholarship by exposing as much information as possible about the technologies they employ and any known limitations. this applies not just to ocr, but to fundamental technologies such as keyword search and relevance ranking. if we are developing resources for scholarly use we cannot simply black box our tech and trade on trust. that’s google’s game. we have to be prepared to expose configurations and assumptions so that analyses can be replicated and exposed to critique.
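
in that spirit, the dictionary comparison i described earlier can be sketched in a few lines of python. this is a minimal illustration rather than the code used for the original estimate: the function name, the tokenising rule (runs of the letters a to z) and the tiny word list are all assumptions made for the example.

```python
import re


def estimate_ocr_accuracy(text, known_words):
    """Return the share of word tokens in `text` that appear in `known_words`.

    The result is a float between 0.0 and 1.0, or None if the text
    contains no word tokens at all.
    """
    # Assumption: a "word" is any run of ASCII letters, compared in lower case.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return None
    recognised = sum(1 for token in tokens if token in known_words)
    return recognised / len(tokens)


# Hypothetical word list and OCR sample, for illustration only.
known_words = {"the", "exhibition", "opened", "in", "a", "blaze", "of", "excitement"}
sample = "The exhibitlon opened in a blaze of excitcment"

accuracy = estimate_ocr_accuracy(sample, known_words)
print(f"{accuracy:.0%}")  # prints 75%
```

a measure like this is crude by design. it counts a correctly transcribed name that is missing from the word list as an error, and an ocr mistake that happens to form a real word as a success, which is one more reason to publish the method alongside the number.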
querypic is a simple tool that visualises search results in the trove newspapers zone. querypic lets you see patterns and trends across the whole database but, as the help system warns, it creates ‘sketches, not arguments’ — critical interpretation is always required.[ ] when did the ‘great war’ become the ‘first world war’?[ ] querypic can be used to explore this shift in terminology, but if you examine the results closely you’ll notice a small bump in the graph indicating that the term ‘world war i’ was being used during world war i. huh? if you drill down through the results you’ll find that this is because trove users have been busily adding the tag ‘world war i’ to selected articles, and by default trove searches user tags and comments as well as article text. the bump is an artefact of trove’s search configuration. trove’s primary function is discovery — to make it as easy as possible for people to find things they’re interested in. but the sort of fuzziness that supports discovery works against other forms of analysis. we should make these sorts of assumptions more obvious, and provide opportunities for researchers to question things like relevance ranking. by showing our seams, exposing our imperfections, we have the opportunity to educate. as well as helping people use trove, we can open up bigger questions about the way search works on the web. trove is not everything there’s nothing natural about our cultural collections or their digital representations — they have been created by many acts of selection, neglect, vision, accident and planning. if you graph the number of newspaper articles in trove by state and year you’ll notice a rather dramatic spike around .[ ] figure — trove newspaper articles by state/year why? were more newspapers printed during the war era? the answer is simply funding. 
as part of the australian newspaper digitisation program, the nsw and victorian state libraries have chosen to invest in the digitisation of newspapers from the world war i period. the contents of trove’s newspaper zone, like any online collection, are constructed — shaped by many competing priorities.

the consequences of this process are not always obvious. in a competition for resources what gets digitised and why? there’s a danger that the sheer scale of aggregation services like trove will reinforce existing prejudices. people already struggling for visibility and recognition within our cultural record might be lost amidst the overwhelming numbers of the safe and the sanctioned. the ontological weight of search can too easily equate absence with non-existence.

but aggregation also offers new opportunities for analysis. questions of representation and diversity can be explored through the metadata itself. mitchell whitelaw notes that some collection interfaces are already exploring ways of representing absence. perhaps we can extend this evolving language across large aggregated collections to reveal not only what is found, but what is missing.

figure — trove resources and contributors by state

by way of a quick example, i used the trove api to harvest raw numbers of holdings and contributors for each state. it was a simple matter to combine these with population data to create a crude graph of resource representation by state.[ ] obvious anomalies, such as queensland’s apparent underrepresentation, might be simply explained by demographics, but the point is that aggregated data enables us to frame these sorts of questions without undertaking a major research project.
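
the arithmetic behind a comparison like this is simple enough to sketch. the python below is an illustration only: the function name is hypothetical, the state labels and numbers are made up, and harvesting the real counts from the trove api is left out entirely.

```python
def per_capita(counts, population, per=1000):
    """Return the number of items per `per` residents for each state.

    States with a missing or zero population figure are skipped
    rather than guessed at.
    """
    return {
        state: counts[state] * per / population[state]
        for state in counts
        if population.get(state)
    }


# Made-up numbers for illustration; not real Trove or census figures.
resources = {"state a": 5000, "state b": 1200, "state c": 300}
population = {"state a": 2_000_000, "state b": 1_000_000, "state c": 100_000}

rates = per_capita(resources, population)

# List the states from least to most represented.
for state, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{state}: {rate:.1f} resources per 1,000 people")
```

in this toy data the state with the fewest resources turns out to be the best represented once population is taken into account, which is exactly the kind of reversal that raw totals hide.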
perhaps more interestingly, i was able to easily compare the languages spoken at home in australia, according to the census, with the languages of resources in trove’s book zone.[ ] it’s fascinating to consider how we might use socio-economic data to slice our cultural collections across the grain to reveal different patterns of access and exclusion. there are other opportunities as well. like trove, the digital public library of america aggregates metadata from a wide range of cultural organisations. the dpla has taken a public stance on diversity, monitoring its own holdings to highlight questions of underrepresentation, and working proactively to fill known gaps.[ ] by admitting the constructed nature of our collections, the gaps and the silences as well as their strengths, perhaps aggregations like trove can become sites of both analysis and activism. trove is not a machine trove is not a single application, it’s a complex system with multiple components. this size and complexity focuses our attention on the technology — on the lines of code and racks of servers. but the system only exists to support human creativity and cooperation. is it a machine, a community, or something else? i often talk about trove as a platform — it can be built upon in many ways, both through code and collaborations.[ ] in particular, by providing an open api, trove invites the public to create new tools, analyses and interfaces. but there are metaphorical dangers lurking here as well. social media services such as facebook and youtube also describe themselves as platforms — staking out a space alongside traditional media outlets, while seeking to expand through developers and new technology partners. as tarleton gillespie notes, ‘these terms matter as much for what they hide as for what they reveal’.[ ] in this case, the ‘platform’ label can divert attention away from the analysis of business models involving the monetisation of personal data. 
if we are to embrace the ‘platform’ metaphor we must also be ready to unpack its implications. writing about an earlier generation of information infrastructure metaphors — superhighways, virtual communities and digital libraries — peter lyman argued that such terms contained ‘an indirect dialogue about questions of social and economic justice in an information society’.[ ] if we want progressive platforms we need to honestly address issues of openness, participation, and accessibility. every api is an argument and no data is ever truly ‘open’.[ ]

for me the term ‘platform’ speaks of something unfinished — an invitation and an opportunity. trove is permanently under construction, constantly improved through the labours of its developers and community. this is most evident in the work of trove’s text correctors, whose many small acts of repair help the technology to function more efficiently. but each tag or comment also changes trove — aiding discovery, adding context, or creating new connections.

the trove api is not merely a plaything for tinkerers like me. no interface can ever serve the needs of all users — there will inevitably be biases and assumptions that limit engagement. but the api at least keeps open the possibility of alternative troves that address existing biases and meet the needs of specific communities in a way that a single, centralised portal can never do.

other trove-building activity is less visible, and the responsibilities more distributed. for example, trove is currently working with victorian collections to bring many small, local collections from across victoria into trove.[ ] but this collaboration is itself built on the labours of many people over many years — from the museums australia staff who train community groups, to the local volunteers who painstakingly digitise and describe their collections. trove helps bring these efforts to the attention of the web, and is itself enriched.
as peter lyman notes, for all the new terms we have for systems and devices we have thus far failed to find a language to describe online collaboration and social engagement. instead we fall back on the awful term ’user’ — ‘a word that places technique at the centre, and even contains a hint of dependence upon or subordination to technology’. by drawing attention away from ‘the machine’ to the many small acts that sustain and enlarge a service such as trove, we create a space where language might evolve. broken worlds instead of visions of technological progress, steven j. jackson presents a vision of a fundamentally broken technosocial world barely held together by numerous acts of concern and repair.[ ] most technological futures are ultimately alienating and disempowering — we are passive consumers of the latest wonders and gadgets. by focusing on ‘repair’, as jackson suggests, we see the human agency at work, the possibilities for change. similarly, by seeing our seams and edges as sites of repair rather than speed bumps in the onward march of progress, we can open spaces for dialogue, for sharing, and for learning — for imagining something different. references . ‘when marconi switched on the lights the sydney electrical and radio exhibition’, sydney mail (nsw), april , p. . <http://nla.gov.au/nla.news- article > ↩ . ‘tales of the genii’, the sydney morning herald (nsw), march , p. . <http://nla.gov.au/nla.news-article > ↩ . ‘wireless telephony. england and australia. prime ministers converse. brisbane, april ’, cairns post (qld.), may , p. . <http://nla.gov.au/nla.news-article > ↩ . digitalsmiths seamless discovery, <http://www.digitalsmiths.com/solutions/seamless-discovery/features/> ↩ . joshua barton and lucas mak, ‘old hopes, new possibilities: next-generation catalogues and the centralization of access’, library trends, vol. , no. , , pp. – . <http://muse.jhu.edu/journals/library_trends/v / . .barton.html> ↩ . 
ken hillis, michael petit, and kylie jarrett, google and the culture of search, routledge, , p. . ↩ . ken hillis, michael petit, and kylie jarrett, google and the culture of search, routledge, , p. ff. ↩ . lucas d. introna and helen nissenbaum, ‘shaping the web: why the politics of search engines matters’, the information society, vol. , no. , , pp. – . <http://www.tandfonline.com/doi/abs/ . / >; laura a. granka, ‘the politics of search: a decade retrospective’, the information society, vol. , no. , september , pp. – . <doi: . / . . > ↩ . quoted in matthew chalmers and ian maccoll, ‘seamful and seamless design in ubiquitous computing’, in workshop at the crossroads: the interaction of hci and systems issues in ubicomp, . ↩ . matthew chalmers and areti galani, ‘seamful interweaving: heterogeneity in the theory and design of interactive systems’, in proceedings of the th conference on designing interactive systems: processes, practices, methods, and techniques, acm, , pp. – . http://dl.acm.org/citation.cfm?id= ↩ . mitchell whitelaw, ‘representing digital collections’, in performing digital: multiple perspectives on a living archive, ed. david carlin and laurene vaughan, ashgate publishing, farnham, uk, . ↩ . see for example the dl workshop, ‘the search is over! exploring cultural collections with visualization’. http://searchisover.org/ ↩ . matt ratto, ‘critical making’, in open design now: why design cannot remain exclusive, ed. bas van abel, lucas evers, and peter troxler, bis publishers, amsterdam, the netherlands, . http://opendesignnow.org/index.php/article/critical-making-matt-ratto/ ↩ . eyes on the past, <http://eyespast.herokuapp.com/>. for context see ‘eyes on the past’, <https://storify.com/wragge/eyes-on-the-past>. ↩ . rose holley, ‘how good can it get?: analysing and improving ocr accuracy in large scale historic newspaper digitisation programs’, d-lib magazine, vol. , no. / , march . <http://www.dlib.org/dlib/march /holley/ holley.html> ↩ . 
tim sherratt, ‘mining for meanings’, harold white fellowship public lecture, national library of australia, may . http://discontents.com.au/mining-for- meanings/ ↩ . simon tanner, trevor muñoz, and pich hemy ros, ‘measuring mass text digitization quality and usefulness: lessons learned from assessing the ocr accuracy of the british library’s th century online newspaper archive’, d-lib magazine, vol. , no. / , july . <http://www.dlib.org/dlib/july /munoz/ munoz.html> ↩ . <http://trove.nla.gov.au/newspaper/result?q=has% acorrections> ↩ . tim hitchcock, ‘historyonics: academic history writing and its disconnects’. <http://historyonics.blogspot.ca/ / /academic-history-writing-and-its.html>; ian milligan, ‘illusionary order: cautionary notes for online newspapers’, activehistory.ca. <http://activehistory.ca/ / /illusionary-order/> ↩ . <http://dhistory.org/querypic/help/> ↩ . <<http://dhistory.org/querypic/ />. see also tim sherratt, ‘when did the “great war” become the “first world war”?’. http://discontents.com.au/when-did- the-great-war-become-the-first-world-war/ ↩ . trovenewspapers ↩ . <https://plot.ly/~wragge/ > ↩ . <https://plot.ly/~wragge/ > ↩ . ‘digital public library of america » blog archive » diversity and the dpla’. <http://dp.la/info/ / / /diversity-and-the-dpla/> ↩ . tim sherratt, ‘from portals to platforms: building new frameworks for user engagement’, presented at the lianza conference, hamilton, new zealand, october . <http://www.nla.gov.au/our-publications/staff- papers/from-portal-to-platform> ↩ . tarleton l. gillespie, ‘the politics of “platforms”’, new media & society, vol. , no. , may . http://papers.ssrn.com/abstract= ↩ . peter lyman, ‘information superhighways, virtual communities and digital libraries: information society metaphors as political rhetoric’, in technological visions: the hopes and fears that shape new technologies, ed. marita sturken, douglas thomas, and sandra j ball rokeach, temple university press, philadelphia, , pp. – . ↩ . 
tim sherratt, ‘“a map and some pins”: open data and unlimited horizons’, presented at the digisam conference on open heritage data in the nordic region, malmö, april . http://discontents.com.au/a-map-and-some-pins-open-data-and-unlimited-horizons/ ↩ . ‘growing together – trove and victorian collections’, <http://www.nla.gov.au/blogs/trove/ / / /growing-together-trove-and-victorian-collections>. ↩ . steven j. jackson, ‘rethinking repair’, media meets technology, mit press, . ↩

report on the nd african digital scholarship and curation conference

d-lib magazine july/august volume number / issn -

martie van deventer, south african council for scientific and industrial research <mvandeve@csir.co.za>
heila pienaar, university of pretoria <heila.pienaar@up.ac.za>

the nd african digital scholarship and curation conference was held in pretoria, south africa, on and may . in addition, an e-research seminar for senior researchers was facilitated on may, and several post-conference workshops were held on may. the principal organizers of the conference were the university of pretoria, represented by dr. heila pienaar, and the university of botswana, represented by prof. tunde oladiran. this conference was a follow-up to two conferences that were held independently in and : the university of botswana's digital scholarship conference in december , gaborone, and the st african digital curation conference in february , pretoria (http://stardata.nrf.ac.za/nadicc/programme.html).

the main purpose of this nd conference was to identify opportunities, strategies and practical examples for new forms of research and scholarship, and for the management of the digital content of these activities by academics, researchers, scientists, information professionals and it experts.
in particular it was an attempt to pull subject expertise and advanced computer skills, as well as information science practitioners, into the same conversations. the collaboration between the two universities ensured a very successful conference!

the outcome of the e-research seminar was exceptionally rewarding. this half-day seminar was hosted by the university of pretoria and the council for scientific and industrial research as part of the southern education and research alliance activities. the main purpose of the seminar was to extract lessons learnt by overseas players active in the field of e-research and to dovetail the learning with local agendas, with a view to using these as input for the review and mobilisation of a south african e-research blueprint. despite the very limited time available and the highly condensed format of the workshop, the organisers, presenters and delegates generally expressed satisfaction with the quality of dialogue and knowledge exchange that took place and expressed confidence in the local players' ability to map out and implement a collaborative programme to benefit the south african research community. delegates agreed that the immediate and primary focus will have to be on the inclusive development of a strategic framework with a five- to ten-year horizon, and to have this adequately funded, resourced and governed to serve the south african research community with the technology backbone and service for effective and efficient linkages to similar users and providers, both locally and abroad.

the conference was planned as three parallel tracks but also with three plenary sessions. the plenary sessions attempted to provide an international perspective while the parallel tracks concentrated on practical implementations and ongoing activities, mainly in africa but also with perspectives from abroad.
tracks were subdivided as follows:

track a: e-research & e-science; it infrastructure for digital scholarship; collaboration; open scholarship and e-resources.
track b: digital preservation; digital data management and digital curation.
track c: digital divide; e-learning and distance learning; intellectual property issues; it adoption and perceptions; information literacy; ethics and trust in the digital world.

during the first plenary session of the conference, matthew dovey, of the jisc (joint information systems committee, uk), provided a very informative and entertaining overview of the uk's vre (virtual research environment) programme from to the present. lee dirks, of microsoft research, was the main speaker during the second plenary session. in his paper "transforming scholarly communication," he predicted that within the next five to ten years:

• open access to both text and data will be the rule, not the exception;
• publications will be live documents with links to (real-time) data and related software;
• new forms of peer review and social networking will have been accepted/adopted;
• blogs and wikis for collaborative research will be normal operating procedure;
• national and international repositories will be a key part of the scientific cyberinfrastructure;
• preservation and long-term access to data sets will be a mandated part of the scientific lifecycle;
• a service industry will develop around online data analysis, visualization and dissemination of scientific information; and
• most of the above scenarios will be cloud-based services, hosted by third parties and not the academic institution.

he pleaded for networking and collaboration – a message that again surfaced during the third plenary where myron gutmann (and anne green) of icpsr, in their paper "building partnerships among social science researchers, institution-based repositories and domain specific data archives", placed much emphasis on collaboration for the sake of curation excellence.
gutmann indicated that there is an increasing sense that partnerships can work but that such partnerships require effective agreement on repository formats and metadata standards. he indicated that, although much still needs to be done, participants are making gradual progress towards the shared goal of efficient digital preservation and data sharing. he is of the opinion that by sharing the workload it is possible to benefit from domain expertise, to make use of 'on the ground' services and to build up economies of scale. several other international speakers contributed to the sharing of learning and experience. david giaretta of the uk's digital curation centre (dcc) was perhaps the most prolific – delivering no fewer than three papers! conference attendees appeared surprised by the contributions from africa. papers varied from digitizing knowledge regarding the processing of indigenous fruit to investigating the need for a virtual research environment amongst researchers of malaria, to activities that produce the data related to sun spot activity, to intelligent transport systems and to cloud computing. the fact that e-research / e-scholarship had progressed as far as it had in a period of approximately months since the first conference, emphasized the dire need to also get the related curation activities up to speed. a paper on botswana's progress with regard to e-learning was well received while a paper relating initiatives to introduce science and technology to toddlers attracted much attention. the next stage of development would be to transfer some of the learning content to mobile technology – a solution to several obstacles within the african context. similarly, papers on the use of wikis, 'gaming' and virtual worlds to do information literacy training were well received and much discussed. the importance of trust in the digital environment was highlighted by several speakers. 
digital crime was seen as one of the key reasons why african scholars are not making use of the opportunity to collaboratively build knowledge. it was reported that digital crime is seen as an international constraint on collaboration. even inside south africa almost % of current scholars reportedly do not collaborate online. open access and open scholarship attracted much attention. the use of electronic content in special libraries of tanzania resulted in a recommendation that suppliers standardize the interfaces and functionality of their products to encourage use of content. on the other hand the university of pretoria was able to show the importance of collaboration between the library and the research office in ensuring that research statistics were accurate – especially when these have a direct impact on funding. they also made use of the opportunity to explain the process that led up to the university's open access declaration a week after the conference. another paper that caused much deliberation was one in which the authors made use of 'worldmapper' images to display africa's contribution to the published body of research literature (showing a very skinny and small africa) versus the image of a large, obese africa when it comes to issues such as poverty and illiteracy. the reality of africa where we do cutting edge basic science including cell biology, immunology, biochemistry, genetics, microbiology and molecular science, which in turn enables us to discover the origin and development of disease, was stressed. understanding the nature of human disease allows us to develop new drugs, accurate diagnostic tools, effective vaccines and other interventions that can be life saving. this in turn means that we can take our discoveries into clinical research, and translate what we do from the laboratory to the bedside with the intention of improving people's lives ... and yet we are struggling to communicate our research victories to the rest of the world. 
a serious plea was made that africans actively participate in the effort to find alternative publishing models that would reward research effort much earlier in the research cycle.

lastly, conference workshops provided hands-on training related to the management of spatial data; making use of web . to create second generation libraries; establishing institutional repositories; and promoting open access for the advancement of science and research. although some thought that the workshops were too basic, many expressed appreciation for the opportunity to experience and learn about the activities first hand.

all papers and several presentations have been made available via the conference web site. the conference proceedings may be found at <http://www.library.up.ac.za/digi/programme.htm>. planning for the third conference is already underway. the next conference will be in may in gaborone, botswana.

copyright © martie van deventer and heila pienaar
doi: . /july -vandeventer

modelling medieval vagueness: towards a methodology of visualising geographical uncertainty in historical texts

mateusz fafinski, michael piotrowski

abstract: the project “an agile approach towards computational modeling of historiographical uncertainty” is building a taxonomy of historiographical uncertainty. we are focusing on early medieval texts as our case studies, because they are characterised by a high degree of “high stakes” uncertainty and a varied historiography characterised by a vivid debate. the additional factor of manuscript text transmission ensures that the material aspect of textual study will also be covered in our attempt to build an adaptable taxonomy of historiographical uncertainty. computational humanities need a robust methodological platform that can be applied to a wide variety of projects.
uncertainty in general and geographical uncertainty in particular stand as the crucial aspects of this platform. we investigate a methodology of visualising geographical locales in historical texts and their historiographies that explicitly models uncertainty in.

keywords: uncertainty; mapping; historiography; medieval history

faculty of arts, department of language and information sciences, university of lausanne, bâtiment anthropole, lausanne, switzerland, mateusz.fafinski@unil.ch, https://orcid.org/ - «- « - »
faculty of arts, department of language and information sciences, university of lausanne, bâtiment anthropole, lausanne, switzerland, michael.piotrowski@unil.ch, https://orcid.org/ - «-«« - «
doi: . » /inf _ « (licence: https://creativecommons.org/licenses/by-sa/ . /)
r. reussner, a. koziolek, r. heinrich (eds.): informatik , lecture notes in informatics (lni), gesellschaft für informatik, bonn «

the problem of uncertainty in historical methodologies

the problem of uncertainty and vagueness in history and historiography is deeply embedded in historiographical practice. while vagueness and uncertainty are impossible to fully separate, they can nevertheless be modelled on a spectrum where vagueness is a category rooted on the side of the source and uncertainty on the side of the historiographical interpretation. while each of them is anchored at opposing sides of a gradient, they are both always present, and trying to fully separate them is counterproductive – as edgington [ed , p. «] remarked, “vagueness and uncertainty can interact.”

for the early narrative historians like thucydides [th ] uncertainty was more or less a question of the believability of sources. uncertainty was the absence of reliable information and not necessarily a presence of ambiguity. indeed, the citing practice of “it is said,” a distancing technique, allowed for a binary understanding of uncertainty between hearsay and “perfect” knowledge [gr ]. this is a feature, not a bug, of early historiographies, as uncertainty becomes essentially a narrative technique to make a point, a claim. this method underlies the early attempts to tackle uncertainty, which can be summarised as equating uncertainty with unreliability. this process was of extreme importance for the later methodology of history, as it put source criticism and narrative techniques at the very centre of strategies to deal with historical uncertainty. this strategy of choosing between variants, especially in ancient writers like herodotus or xenophon, has been deemed “narrative uncertainty” [ma , p. ]. as a strategy (not a model) it allowed the early narrative historians to choose among the variants in their sources in order to shape their stories. narrative uncertainty permeates all the levels and types of vagueness present in those texts. moreover, scholarly editions and digital facsimiles introduce another layer between us and the source and thus another level of uncertainty. imaging (or creation of digital facsimiles) is in this respect no different to any other form of processing of historical sources [pr ].

the focus to date in many disciplines of historical research has often been on reducing uncertainty [see, e.g., bl ]. even when acknowledged, uncertainty was to be modelled in order to be factored out rather than factored in. in this method the vagueness of the sources should be analysed to the point of the lowest possible uncertainty in their interpretation.
this reductive approach is caused by the deep unease with fuzziness in some methodologies of history, where it is seen as responsible for potentially false outcomes. the goal of the historian was in those approaches to reconstruct the one-dimensional facts of the past, “to extract the facts in such a way as to arrive at the truth” [sk , p. « ]. nevertheless, among the researchers of the historical method the need to model and factor uncertainty in has been recognised, including the importance it can play at the interface between history and informatics [to », pp. – «]. in this spirit, there is today a growing, although still mostly ad hoc, understanding in digital scholarship that this “spurious exactitude” [ta ] and attempts to force uncertainty out at every cost are detrimental to our ability to actually research the past. more and more projects are thus explicitly factoring in uncertainty in their individual methodologies [see, e.g., bi »].

factoring uncertainty in

as opposed to the minimising approach, we want to focus on the explicit modelling of uncertainty in order for it to become an integral part of computational humanities methodology, as we have already advocated elsewhere [pi ]. as our case study we have chosen the work of gregory of tours, a th-century historian concerned mainly with the events, locales, and persons in the territory of modern france, germany, italy, and spain [gr »]. we recognise the rich historiographical tradition on gregory and the fact that his work is in itself a historiography, in which vagueness and uncertainty are not a simple matter of a lack of knowledge but are conscious tools for creating community [re «], presenting a particular vision of the past [he »], and which have generated rich reflection already in the early medieval period [re ]. our attempt is based on a three-pronged approach to visualising geographical vagueness in early medieval texts.
first, we are concerned with the uncertainty concerning the manuscripts that transmit the texts, crucial to the creation of what we call today historia francorum – a very much interpretative creation on its own [go ]. their age and place of production are crucial for the editorial choices undertaken when producing the editions and translations of those texts and the introduction of the “editorial narrative” [ra , p. ]. second, we are concerned with the distribution of the vagueness and uncertainty within the text: its typology and ontology, an issue already flagged as crucial for knowledge retrieval from texts [kc ]. third, we are concerned with the actual mapping of the locations within the text: how the vagueness and uncertainty of the text of gregory is projected onto a two-dimensional map.

we can see that when it comes to modelling uncertainty there is a high degree of interrelatedness between those different types. because the aim of our project is to work towards a historiographical methodology of uncertainty we also try to identify not only its level but also the historiographical stakes involved. the level of uncertainty is established based on how much information the text delivers about a particular category. the historiographical stakes are determined based on how much this particular type of uncertainty influences the historiographical interpretation of the text itself. and so, we identify different forms of uncertainty in our case study and categorise them according to those two factors (uncertainty level/historiographical stakes of that form of uncertainty):

1. in-source uncertainty:
• sources of gregory (high/high)
• trustworthiness of his text (high/high)
• language of gregory (to what extent the texts that we have in later copies reflect the language of gregory himself); his orthography, matters of transition from late latin to romance (high/low)
• locations, dates, persons – the content uncertainty, the area where the most historiographical debates happen (low/high)

2. supra-source uncertainty:
• the manuscript transmission, which models also the extent to which the text that we have is actually the text of gregory (low/low)
• the texts for which gregory is a source (low/high)
• the historiographical uncertainty, i.e., the historiographical models and narratives built on the basis of particular interpretations of the in-source uncertainties (high/high)

in this paper we focus on the geographical uncertainty in both domains: in the text and outside of it.

visualizing geographical uncertainty

vagueness is inherent in the descriptions of locales mentioned in gregory’s writings. we recognise that these texts are imbued with a degree of vagueness and background noise – in effect every location is to a certain extent uncertain and so is its approximation on a two-dimensional map. in this respect as a work of history it shows striking similarities to literary texts – being in effect both – and requires similar attention to modelling its uncertain geodata [see rph «]. in geographical information systems (gis), uncertainty is often defined as “a measure of the user’s understanding of the difference between the contents of a dataset and the real phenomena that the data are believed to represent” [lo , p. ], i.e., the difference between the geographical position of a locale and the author’s understanding of that position. in our case, there are two additional levels. one is the semantic uncertainty: differing meanings that are assigned to the linguistic markers representing these locales [bgp ].
the second is the uncertainty of translation [he ], which features prominently in translation theory [see, e.g., hm ] and directly influences historiographies in various languages. in other words, our author operates on a high initial degree of vagueness (the difference between his understanding of the locales and their actual geographical positions is large); his understanding of the semantic quantifications of areas is uncertain (e.g., defining kingdoms as areas of influence of particular rulers); and those locales are originally described in latin, but are in modern historiographies translated into different languages.

geographical locations in historical texts might be referred to through terms, phrases, and concepts that have nothing – or very little – to do with geographical terminology. this renders any attempt to automate their extraction and visualisation without a robust uncertainty schema almost futile. inclusion of uncertainty modelling remains in this respect a crucial aspect. while in gis a strong focus is laid on the uncertainty of geospatial data [go ], when it comes to modelling uncertainty in historical and historiographical texts additional layers appear and we are confronted with a much richer structure of uncertainty.

visualising this vagueness requires the application of different degrees of uncertainty. even points on a map (e.g., “roma”) can be recognised as being in essence fuzzy approximations of (a) gregory’s understanding of where “roma” is; (b) our understanding of what area gregory means by “roma”; (c) our understanding of what “rom,” “rome,” “rzym,” etc., represent on a map. visualising historical sources without acknowledging and factoring in uncertainty is then in effect a visualisation of no more than a historiographical narrative – an interpretation of those sources.
oftentimes digital humanities projects leave the explicit acknowledgment of this narrative out in order to factor the uncertainty out, but in reality, by failing to make this narrative explicit, they are, simply speaking, mapping the wrong thing [fa ]. our understanding of the geographical space is also different from the understanding of the authors of our sources. this understanding has been progressively translated through various historiographical interpretations and created a new geography to be mapped: a subjective structure [to ], an additional layer of interpretative geography created by historians. thus vagueness and uncertainty make numerous (but nevertheless limited) historiographical narratives possible and lead to sometimes risky but high-stakes statements [ko ]. in historiography this creation of interpretative layers is a long-recognised phenomenon [see, e.g., wh «]. but with the advent of digital and computational humanities it has remained an intuitive and implicit element of the methodology of those new branches. it can help us, for example, recognise the geographical horizon of the author of a source through computational methods. the measure of the area which can be assigned as characterised by a low level of geographical uncertainty corresponds with the expression of the geographical horizon of the author in a particular text. but this method will only work if we recognise, model, and factor in the historiographical uncertainty associated with a particular source.

fig. : surviving manuscripts of historia francorum and their dating

we recognise this conundrum and see the need to assign different methods of mapping to different types of uncertainty in historical texts. and thus, while individual locales of low uncertainty can be assigned points, those of a higher degree need to be presented through polygons, and those exhibiting a high degree of uncertainty across the three domains need to be visualised using fuzzy methods. the three domains are: (1) uncertainty about a primary source author’s knowledge of a locale position; (2) uncertainty about a scholar’s understanding of a primary source’s reference to a locale’s position; (3) uncertainty about how far a single point can stand for the area(s) represented by a locale name. those problems are visible not only in the case of the in-text data but also outside of it, as exemplified by the manuscript transmission of gregory of tours’s main work, historia francorum (fig. ).

fig. : map of production locations of surviving manuscripts of historia francorum

the dating of various manuscripts as well as their assignment to a particular space reflects a historiographical tradition that is characterised by a very high degree of uncertainty. while palaeographical dating and localising remains the basic method of work with those manuscripts, and is characterised by taking into account a high degree of uncertainty, both the current predominance of digital facsimiles [te ] and the inherent lack of ability to accommodate fuzzy dating in catalog metadata [da ] make it difficult to include this uncertainty in current digital projects. moreover, the lack of a precise uncertainty taxonomy makes comparisons between those projects difficult, if not misleading: an understanding of, for example, what degree of correspondence exists between terms like “northern france” and “northern gaul”, and what their level of uncertainty is, is almost entirely lacking. we propose therefore, as a form of stop-gap solution and a stepping stone in modelling this particular form of historiographical uncertainty, to map the distribution of those manuscripts through 𝑘-means clustering and kernel density estimation.
this method, based on the idea of dividing observations into clusters with the nearest mean as a centroid [ma ] and the smoothing of data based on the bandwidth [pa ], showcases one possible example of computationally representing uncertainty of historiographical and chronological data on a two-dimensional map (see fig. ). it should also be noted that the use of fuzzy clustering (𝑐-means) did not produce significant differences at this scale and with this bandwidth.

this map (fig. ) is not so much a map of the provenance and dating of the manuscripts of gregory of tours’s historia francorum (although one might interpret it as such) as it is a map of the historiographical uncertainty about their localisation and dates of production: a map of uncertainty, if you will. this is even more visible through the nature of bandwidth in kernel density estimation: the choice of value of this parameter is in itself laden with uncertainty. this observation is crucial in order to use such visualisations at all. providing the correct context is an important step to make such maps usable. it has been noted by drucker [dr »] that while the methods underpinning the algorithms we use often lack contextualisation, it is the very goal of humanities to provide such context. we see the recognition of such visualisations as visualisations of uncertainty as an important step forward in this respect.

moving forward with uncertainty

there are tangible gains from including uncertainty in our models. as we strive to go beyond the narrow application inside a single case study, we want to highlight how modelling uncertainty and operating within a theoretically-based taxonomy might prove to be one of the crucial contributions of theoretical digital humanities [pi ] to computational humanities and to the historian’s toolbox alike.
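as an illustration of the clustering and density mapping discussed above, the following is a minimal sketch in plain python. it is not the authors' implementation: the coordinates in `POINTS` are invented stand-ins for manuscript production places, and the function names `kmeans` and `gaussian_kde`, the choice of two clusters, and the bandwidth values are all assumptions made for the example. the sketch uses lloyd's algorithm for 𝑘-means and an isotropic gaussian kernel for the density estimate.

```python
import math
import random

# Hypothetical (longitude, latitude) pairs standing in for manuscript
# production places; these are NOT the authors' data.
POINTS = [
    (2.35, 48.85), (2.10, 49.00), (2.60, 48.70),  # one invented group
    (7.75, 48.58), (8.00, 48.40), (7.50, 48.80),  # another invented group
]


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    move every centroid to the mean of the points assigned to it."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, labels


def gaussian_kde(points, bandwidth):
    """Return a density function: the average of isotropic 2-D Gaussian
    kernels centred on the points, with the given bandwidth."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))

    def density(x, y):
        return norm * sum(
            math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
            for px, py in points
        )

    return density
```

calling `kmeans(POINTS, k=2)` returns the two centroids and a cluster label per point, and `gaussian_kde(POINTS, bandwidth=0.5)` returns a function that can be evaluated on a grid to draw a density surface. evaluating the same location with a larger bandwidth spreads density into areas with no data points, which is one way to see why the choice of bandwidth is itself an interpretative decision.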
in order for computational humanities to function as a self-defined and independent field it requires a robust theoretical and methodological framework of its own. when it comes to uncertainty, a robust taxonomy will allow for the creation of a project-independent methodology. when it comes to mapping historical sources it will finally allow not only for a basis of comparison between projects but also for a distinction between mapping sources and mapping historiography, thus bringing the methodologies of computational humanities onto the same page as the methodologies of history. using a taxonomy of uncertainty might also help to fine-tune geotagging of historical sources. by modelling vagueness in and assigning the correct level of uncertainty, the most appropriate method of visualisation can be assigned to a locale. this method can supplement models based on fuzzy representation of spatial data in texts [bgp ].

conclusions

a robust methodology for uncertainty is a necessity for computational humanities to advance as a field. through factoring vagueness in and modelling it for our visualisations we can finally achieve a more stable common ground between various, currently methodologically disjoint, projects that constitute the field of computational humanities.

acknowledgments

this work is supported by a spark grant from the swiss national science foundation (no. « ) awarded to m.p.

bibliography

[bgp ] bordogna, g.; ghisalberti, g.; psaila, g.: geographic information retrieval: modeling uncertainty of user’s context. fuzzy sets and systems /, pp. – », june , , doi: . /j.fss. . . .
[bi »] binder, f.; entrup, b.; schiller, i.; lobin, h.: uncertain about uncertainty, different ways of processing fuzziness in digital humanities data. in: digital humanities » conference abstracts, lausanne july – , ». alliance of digital humanities organizations, pp. – , », url: http://nbn-resolving.de/urn:nbn:de:bsz:mh - .
[bl ] blau, a.: uncertainty and the history of ideas. history and theory /«, pp. « –« , , doi: . /j. - . . .x.
[da ] davis, l. f.: manuscript road trip: linked data, library science, and medieval manuscripts, dec. , , url: https://manuscriptroadtrip.wordpress.com/ / / /manuscript-road-trip-linked-data-library-science-and-medieval-manuscripts/, visited on: / / .
[dr »] drucker, j.: graphesis: visual forms of knowledge production. harvard university press, ».
[ed ] edgington, d.: validity, uncertainty and vagueness. analysis /», pp. «– », oct. , doi: . /analys/ . . .
[fa ] fafinski, m.: facsimile narratives: researching the past in the age of digital reproduction. digital scholarship in the humanities submitted/, .
[go ] goodchild, m. f.: how well do we really know the world? uncertainty in giscience. journal of spatial information science / , pp. – , , doi: . /josis. . . .
[go ] goffart, w.: from historiae to historia francorum and back again: aspects of the textual history of gregory of tours. in: rome’s fall and after. hambledon, london, pp. – », .
[gr ] gray, v.: thucydides’ source citations: “it is said”. the classical quarterly / , pp. – , , doi: . /s .
[gr »] gregory of tours: the history of the franks. penguin, harmondsworth, ».
[he ] hewson, l.: les incertitudes du traduire. french, meta / , pp. – , , doi: . / ar.
[he »] heinzelmann, m.: gregor von tours ( « – »): “zehn bücher geschichte”, historiographie und gesellschaftskonzept im . jahrhundert. wissenschaftliche buchgesellschaft, darmstadt, ».
[hm ] hewson, l.; martin, j.: redefining translation: the variational approach. routledge, .
mateusz fafinski, michael piotrowski
[kc ] kerdjoudj, f.; curé, o.: evaluating uncertainty in textual document. in: uncertainty reasoning for the semantic web. th international workshop on uncertainty reasoning for the semantic web (ursw ), co-located with the »th international semantic web conference (iswc ), bethlehem, pa oct. , . , url: http://ceur-ws.org/vol- /paper .pdf, visited on: / / .
[ko ] koselleck, r.: standortbindung und zeitlichkeit. ein beitrag zur historiographischen erschließung der geschichtlichen welt. in (koselleck, r.; mommsen, w. j.; rüsen, j., eds.): objektivität und parteilichkeit in der geschichtswissenschaft. dtv, münchen, pp. –» , .
[lo ] longley, p. a.; goodchild, m. f.; maguire, d. j.; rhind, d. w.: geographic information systems and science. wiley, .
[ma ] macqueen, j.: some methods for classification and analysis of multivariate observations. in: proceedings of the fifth berkeley symposium on mathematical statistics and probability, volume : statistics. the regents of the university of california, , url: https://projecteuclid.org/euclid.bsmsp/ , visited on: / / .
[ma ] marincola, j.: authority and tradition in ancient historiography. cambridge university press, .
[pa ] parzen, e.: on estimation of a probability density function and mode. annals of mathematical statistics ««/, pp. – , sept. , doi: . /aoms/ , url: https://projecteuclid.org/euclid.aoms/ , visited on: / / .
[pi ] piotrowski, m.: digital humanities: an explication.
in (burghardt, m.; müller-birn, c., eds.): proceedings of inf-dh , berlin sept. , . gesellschaft für informatik, , doi: . /infdh - .
[pi ] piotrowski, m.: accepting and modeling uncertainty. zeitschrift für digitale geisteswissenschaften/sonderband » die modellierung des zweifels, schlüsselideen und -konzepte zur graphbasierten modellierung von unsicherheiten, ed. by kuczera, a.; wübbena, t.; kollatz, t., , doi: . /sb _ a, url: http://www.zfdg.de/sb _ .
[pr ] prescott, a.: the imaging of historical documents. in (greengrass, m.; hughes, l. m., eds.): the virtual representation of the past. digital research in the arts and humanities, ashgate, aldershot, pp. – , .
[ra ] ralle, i. h.: maschinenlesbar – menschenlesbar, über die grundlegende ausrichtung der edition. editio « / , pp. »»– , , doi: . /editio- - .
[re «] reimitz, h.: cultural brokers of a common past, history, identity and ethnicity in gregory of tours and chronicles of fredegar. in (pohl, w.; heydemann, g., eds.): strategies of identification: ethnicity and religion in early medieval europe. cultural encounters in late antiquity and the middle ages (celama) «, brepols, turnhout, pp. –« , «.
[re ] reimitz, h.: history, frankish identity and the framing of western ethnicity, – . cambridge university press, cambridge, .
[rph «] reuschel, a.-k.; piatti, b.; hurni, l.: modelling uncertain geodata for the literary atlas of europe. in (kriz, k.; cartwright, w.; kinberger, m., eds.): understanding different geographies. lecture notes in geoinformation and cartography, springer, berlin, heidelberg, pp. « – , «, doi: . / - - - - _ .
[sk ] skinner, q.: sir geoffrey elton and the practice of history. transactions of the royal historical society /, pp. « –« , , doi: . / .
[ta ] tarte, s. m.: digitizing the act of papyrological interpretation, negotiating spurious exactitude and genuine uncertainty. literary and linguistic computing /«, pp. «» –« , , doi: . /llc/fqr .
[te ] terras, m. m.: artefacts and errors, acknowledging issues of representation in the digital imaging of ancient texts. in (fischer, f.; fritze, c.; vogeler, g., eds.): codicology and palaeography in the digital age . bod, norderstedt, pp. »«– , , url: http://kups.ub.uni-koeln.de/ /, visited on: / / .
[th ] thucydides: the peloponnesian war. hackett, indianapolis, in, .
[to »] topolski, j.: metodologia historii. państwowe wydawnictwo naukowe, warszawa, ».
[to ] topolski, j.: narrare la storia. nuovi principi di metodologia storica. mondadori, milano, .
[wh «] white, h.: interpretation in history. new literary history »/ , pp. –« », «, doi: . / .

ucaris – making the most of your current resources
procedia computer science ( ) –
available online at www.sciencedirect.com
© published by elsevier b.v this is an open access article under the cc by-nc-nd license (http://creativecommons.org/licenses/by-nc-nd/ . /). peer-review under responsibility of eurocris
doi: . /j.procs. . .
cris
ucaris – making the most of your current resources
leigh garrett*, carlos silva
centre for digital scholarship, university for the creative arts, falkner road, farnham, gu ds, england
abstract
this paper outlines the work carried out by the project team over the last three years to develop an in-house current information management system, focused on the specific need to gather information from across various departmental databases to fulfil the research excellence framework requirements for a specialist arts institution. the overall objective of the project was to support the university’s successful submission to the ref in november . the system was used to collate relevant information from various institutional databases and transfer this to the higher education funding council for england (hefce) submission system, thereby increasing institutional efficiency by reducing repetition of data entry and saving time in checking and organising information.
© the authors. published by elsevier b.v. peer-review under responsibility of eurocris.
keywords: cris management; linking data; database integration; eprints; visual arts
. introduction
the university for the creative arts (uca) is a leading art and design university in the south of england with campuses in canterbury, epsom, farnham, maidstone and rochester. the origin of the university lies in a number of independent public art and design colleges in the counties of kent and surrey, almost all of which had origins in the nineteenth century.
* corresponding author. tel.: + . e-mail address: lgarrett@ucreative.ac.uk
leigh garrett and carlos silva / procedia computer science ( ) –
as an active research institution, uca participates in the national assessment of research quality, the latest of which, the research excellence framework (ref), is currently underway, with outcomes to be made known at the end of . the higher education funding council for england (hefce) provides research funding according to the quality and score achieved during the assessment period, normally the previous to years.
in , at the end of the previous assessment exercise, the university received a couple of recommendations to inform and support its future submissions. these indicated that the university needed to have its own institutional repository and a system that could effectively support future submissions. although successful in its assessment return, the university recognised that its existing process of collecting the information needed for the submission was both lengthy and repetitive.
in the university participated in the kultur consortium (a partnership led by the university of southampton, in collaboration with university for the creative arts, university of the arts london and winchester school of art (part of the university of southampton)) and received funding from jisc under its repository start-up and enhancement strand. the project aimed at creating a transferable and sustainable institutional repository model for research outputs in the creative and applied arts, a discipline area where repository practice was very underdeveloped. as a result of this work, using the enhanced eprints platform, in uca launched its institutional repository under the name of uca research online (ucaro).
. ucaris
in funds were secured to develop the university’s research information system (ucaris). its objective was to gather relevant information from across the university in preparation for the submission to the research excellence framework (ref ).
the project proposed the development of a comprehensive, university-wide research information system using the eprints repository platform as a base. as well as being seen as critical for the ref submission, the overarching approach of the implementation of ucaris was that there should be ‘one input, many outputs’, in order to minimise the burden on academics and administrators in producing publications and reports for other internal and external purposes.
due to the costs and particular requirements of the university, ucaris is one of the few solutions developed ‘in house’ rather than purchased as a commercial system. the international interest that ucaris has attracted over the past three years shows the uniqueness of a specialist visual arts institution solving a specific problem using data mining and reusing research information from different sources.
in preparation for the ref, the centre for digital scholarship at uca worked with all the relevant internal departments across the university and the electronics department at the university of southampton to develop the eprints ref plugin to meet the university’s and ref requirements. as a result the university now has an appropriate publications strategy and a research information system that is flexible enough to return any publication data that the ref may require within the relevant units of assessment.
. . the schedule
the project started in september and was divided into three phases: planning, development and deployment. the project was officially signed off in september and the ref submission was made on th november .
phase ( / ) focused on investigating requirements both for the ref and appropriate solutions; technical enhancements to the institutional repository’s underlying infrastructure, including updating its software, ldap integration, and installation of additional tools; and identifying and meeting skills requirements.
phase ( / ) focused on the technical implementation and development of the approved solution; gathering relevant information and documentation from specified fields within university databases in accordance with the ref requirements; and making the data available in one single secured place and testing the approved solution.
phase ( / ) focused on interface and security tests; bug fixes; training key stakeholders; and supporting the final submission process.
as part of the project, it was acknowledged that the institutional repository uca research online (ucaro) was going to play a key role in the ref submission by providing content in terms of research outputs and staff profile information, neither of which was available in other institutional systems.
fig. . ucaris interfaces
the flexibility of eprints to harvest data from other systems and ‘push’ processed information both back to these and on to other systems meant that the development of ucaris would give the university an appropriate research information system capable of supporting its ref submission, with the potential to grow in future to meet new challenges and opportunities.
ucaris mainly consists of the adapted ref plugin (fig. ), which harvests the different information needed to produce a coherent report and then submits it to the different sections of the (hefce) submission system. these reports can be tailored to fit any data required where the ref plugin has access. furthermore, ucaris is able to harvest specific content from datasets instead of connecting directly to a database. this minimised the risks associated with handling confidential and researcher information on the same infrastructure as uca research online, and enabled the setting up of separate permissions to add, modify and delete data without interfering with production data.
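the dataset-based harvesting and the ‘one input, many outputs’ approach described above can be illustrated with a small sketch. this is not the ucaris or eprints ref plugin code: the csv exports, the column names (`staff_id`, `output_title`, and so on) and the function names are all assumptions made for illustration. it only shows the general idea of collating exported datasets, rather than querying live databases, and deriving more than one report from the same collated records.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV exports ("datasets") from two departmental systems.
# Column names are illustrative assumptions, not the UCARIS schema.
HR_EXPORT = """staff_id,name,department
s001,a. researcher,fine art
s002,b. scholar,architecture
"""

REPOSITORY_EXPORT = """staff_id,output_title,output_type
s001,exhibition catalogue,exhibition
s001,journal article on print,article
s002,design monograph,book
"""


def read_dataset(text):
    """Parse one exported dataset into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def collate(staff_rows, output_rows):
    """Join staff records to their research outputs by staff_id."""
    outputs_by_staff = defaultdict(list)
    for row in output_rows:
        outputs_by_staff[row["staff_id"]].append(
            {"title": row["output_title"], "type": row["output_type"]}
        )
    return [
        {
            "staff_id": staff["staff_id"],
            "name": staff["name"],
            "department": staff["department"],
            "outputs": outputs_by_staff.get(staff["staff_id"], []),
        }
        for staff in staff_rows
    ]


def summary_by_department(records):
    """A second 'output' derived from the same collated input."""
    counts = defaultdict(int)
    for record in records:
        counts[record["department"]] += len(record["outputs"])
    return dict(counts)


# One input (the collated records), two outputs (per-person and per-department).
records = collate(read_dataset(HR_EXPORT), read_dataset(REPOSITORY_EXPORT))
department_counts = summary_by_department(records)
```

because the sketch reads exported files only, the source databases are never opened by the reporting code, which mirrors the separation of permissions and production data that the paper describes.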
communications strategy
the project team worked with the ucaris steering group, which consists of key stakeholders including representatives from the library, human resources, marketing, it services and the research office. the steering group proposed the creation of a project board comprising members of it services and the library in order to ensure the successful delivery of the technical components of the project.
the project reported to other interested parties via the university committee structure, particularly the value for money group and the it strategy group. the team successfully submitted a paper and presented at the open repositories conference in prince edward island, canada, showcasing and promoting the specific development work carried out by uca and eprints to accomplish this project.
. . outcomes
the project was successful in its overall objective to support the university’s successful submission to the ref in november . the system was used to collate relevant information from various institutional databases and transfer this to the higher education funding council for england (hefce) submission system, thereby increasing institutional efficiency by reducing repetition of data entry and saving time in checking and organising information.
. . lessons learned
the primary lesson learned was to narrow our requirements and focus on achieving the specific goal of the project: to provide the right tools for a successful submission to the ref . originally we noticed that the software developed could be used for several other purposes; however, we immediately realised that by broadening the scope we were losing focus. the project team agreed on a strategy to focus only on gathering and collating the data essential to support the university’s ref submission.
collaborative working with colleagues and external service providers was critical but involved a lot of communication, planning and commitment from all the stakeholders involved.
with a clear plan and agreed timings, responsibilities and schedules, the project was able to run its course as expected.
. . future
the university has already started planning for ref and the most obvious solution is to learn from and build upon the success of ucaris and develop it into a more comprehensive current research information system (cris), which appears to be the sector-wide preferred route. if the university follows in a similar fashion, development and enhancement will include feeding data to other internal and external university systems to support and enhance marketing, learning, teaching and knowledge transfer activities as shown in figure . there is further potential for integration with other systems as required, using cerif, sword and oai-pmh protocols.
fig. . ucaris future
. conclusions
ucaris has ensured that the university had a system that was not only capable of generating its research outputs but also enhanced and enabled the deposit of these and related data into the (hefce) submission system. further, ucaris is able to gather specific information from different areas and departments and therefore provide a single point for analysing institutional data. this can be used to make decisions regarding marketing, review research targets, and inform and support internal planning and research funding.
acknowledgement
ucaris would like to thank the university for the creative arts for funding the project and in particular rosemary lynch from library & student services for championing and supporting the project from its inception. thanks are also due to eprints services for working in collaboration with the university for the creative arts and providing the expertise needed for the successful completion of the project.
references
. uca ( ). university for the creative arts http://www.ucreative.ac.uk (retrieved th april ).
. ref ( ).
research excellence framework http://www.ref.ac.uk/ (retrieved th april ).
. kultur ( ). kultur project http://kultur.eprints.org/ (retrieved th april ).
. ref plugin ( ). eprints ref plugin http://www.eprints.org/ref / (retrieved th april ).

mart van duijn & laurents sesink
july
a comprehensive approach towards the curation of born digital material by leiden university libraries
first university in the netherlands
founded by william of orange in as a reward for leiden’s resistance against the spanish
motto praesidium libertatis [bastion of freedom]
collections , , volumes , , e-books , current serials , e-journals , manuscripts , letters , maps , prints , drawings and , photographs
wicked problem
problem difficult or impossible to solve because of incomplete, contradictory, and changing requirements that are often difficult to recognize.
centre for digital scholarship
support and facilitate, in collaboration with expert centres, faculties and (inter)national partners, digital scholarship within the university.
• open access, copyright and publication advice
• data management
• digital preservation
• data science
special collections
curation of rare, valuable and vulnerable material.
• material preservation
• research support
• outreach
• acquisition
the challenge: curating born digital material
ecosystem mapping
problem statement
participants and stakeholders
scope
constraints
systems and applications
goal
overview/status report/roadmap
• e-books and e-journals
• research data
• virtual research environments
a digital heritage
• documents
• correspondence
• publications
• databases
• website
• software
• websites
• digital av-material
• web exhibitions
framework/grid (rows: context type; columns: content type)
content type: image | av | text | online resource
archive/heritage: press photos; conference photos; maps | recordings of an interview or a television performance | correspondence; lecture notes; presentations | website?
research output: poster; printed map | film on dvd; film on cd | publications (e-books and e-journals) | website?
research data: photos for research | recording for oral history | research report; dissertation | datasets
j.m.van.duijn@library.leidenuniv.nl
l.b.j.sesink@library.leidenuniv.nl

altmetrics in action: using metrics tools to extend library research support services
natalia madjarevic, training and implementation manager, altmetric
scott taylor, research services librarian, university of manchester
@nataliafay / @altmetric / @scott__tweets
altmetric.com
overview
@altmetric @nataliafay
• what are altmetrics? tracking attention to research outputs at altmetric
• altmetric for institutions: monitoring attention to your research outputs
• implementing altmetrics at the university of manchester
altmetrics in research support
• how can i disseminate my work as widely as possible?
• who’s downloading and citing my research?
• what platforms should i use to share my papers & data?
• many, many questions about open access policies
• how can the library help me with all of this?
top questions i was asked as an academic librarian…
developing library services to support digital scholarship… …driving research innovation and enabling open research.
• literature reviews, resources and training
• grant applications and funder reporting
• open access and dissemination
• research data management
• evaluation and impact assessment
all impact means is that we are engaged with the world, trying to make it a better place to live in. (professor michael stewart, ucl)
how does research contribute to changes in everyday decisions or working practices?
• did it help improve services or business?
• provoke debate?
• shape policy?
and how do you demonstrate this to funders?
journal impact factor; citation counts; h-index; number of publications
• tracking attention to research outputs in non-traditional sources, e.g. policy documents, news, blogs and social media
• help understand how research is being received and used
• complementary to traditional citation-based analysis
• indicators of research impact
what are altmetrics?
mentions in news reports; references in policy; mentions in social media; wikipedia citations; reference manager readers… etc.
academic attention; broader attention
alternative metrics “altmetrics” + traditional metrics
altmetrics are part of a broader conversation…
traditional bibliometrics; funding awards; awards and professional recognition; altmetrics; clinical guidelines; government debates; patents; changes to curriculum; public speaking events; teaching activities
multifaceted picture of engagement: audiences
practitioners; general public; professional communicators; interested parties; scholars
advantages of altmetrics
• moving beyond crediting only journal article contributions
• early career researchers whose work may not have accrued citations to demonstrate engagement
• real-time, immediate feedback on attention and dissemination of scholarly content
each day, we track ~ , new mentions of research across sources incl. social media, news, and policy docs. we’ve tracked almost m outputs. each week, ~ k unique articles are shared. mentions range in complexity, from quick shares to comprehensive reviews.
altmetric data, march
looking beyond numbers towards quality of engagement… a more well-rounded view of research impact…
altmetric for institutions
building altmetric for institutions
“please let me see data on papers from a person or department.”
real-time feed of research engagement
monitor your popular papers being discussed right now
how we populate altmetric for institutions
• integrations with criss, irs, profiles, excel, url tracking, custom apis, bespoke connectors
• low barrier technical setup for end users
• departmental hierarchy, author papers
• regular updates
institutional repositories
encourage staff to deposit in your research information management system
% of altmetric top were open access
altmetrics formula…
range of methods to understand research attention + library support services and training = adoption & better evidencing of research impact

implementing altmetrics at the university of manchester
wed th june, liber
scott taylor, the university of manchester library
why were we interested in altmetrics?
why altmetric?
how we trialled afi
set-up
• library provides altmetric with data
• altmetric configure uom app
create user groups
• rs team identify tier group
• ae team identify tier group
test phase
• user groups test application
feedback
• tier group attend focus group
• tier group provide written feedback
institutional repositories
tier user group
tier user group
by tweedewereldoolog-wiki (own work) [public domain], via wikimedia commons
our future plans
by chris brown (edgware road) [cc
by-sa . (http://creativecommons.org/licenses/by-sa/ . )], via wikimedia commons
thanks!
natalia@altmetric.com / scott.taylor@manchester.ac.uk
@nataliafay / @altmetric / @scott__tweets
altmetric.com

digital research at the british library
developments in digital scholarship: at the british library and at kitchen tables everywhere
dr. mia ridge, @mia_out
digital curator, british library
digitalresearch@bl.uk @bl_digischol
mitchell library, state library of new south wales http://acmssearch.sl.nsw.gov.au/search/itemdetailpaged.cgi?itemid=
teamwork
the digital research team is a cross-disciplinary mix of curators, researchers, librarians and programmers supporting the creation and innovative use of british library's digital collections.
@aquilesbrayner @ndalyrose @mia_out @miss_wisdom @benosteen @mahendra_mahey
with thanks to...
supporting digital scholarship @ bl
we support new ways of exploring and accessing our collections through:
• getting digital and digitised content available online
• collaborative projects
• digital scholarship support and guidance
• events, publications and training
• competitions, and awards (bl labs)
https://www.flickr.com/photos/britishlibrary/ / image taken from page of 'man, embracing his origin, ... civilization, ... mental and moral faculties. ... illustrated' british library hmnts .g.
custodianship; research; business; culture; learning; international
living knowledge: the british library – 'we make our intellectual heritage accessible to everyone, for research, inspiration and enjoyment'
ok, but how?
internal digital scholarship training
• staff across all collection areas are familiar and conversant with the foundational concepts, methods and tools of digital scholarship.
• staff are empowered to innovate.
• collaborative digital initiatives flourish across subject areas within the library as well as externally. • our internal capacity for training and skill-sharing in digital scholarship are a shared responsibility across the library. evaluating the training programme: goals • hands-on, practical exercises • time to explore innovative digital projects • trying new tools, particularly with bl or similar collections • the expertise and enthusiasm of instructors • meeting colleagues and learning about bl projects participants appreciated... • case studies / real world examples help • articulate learning outcomes, expected results for exercises • provide clear, printable instructions • no more than people in hands-on courses • allow time for all to complete exercises • build in optional activities for advanced participants practical digital training tips inspired... big data history of music how can vast amounts of bibliographic data held by research libraries be unlocked for music researchers to analyse? analyses and visualisations of these datasets exposed previously uncharted patterns in the history of music. 
• applying new skills within the library - more guidance and support within current library policy and infrastructure • reaching staff who work on rosters and cannot attend for a whole day • digital tools for non- western materials; working with non-textual digital collections such as the uk web and sound archives on-going challenges • software tools break/change - some courses need to be completely checked before they're run • digital curators need to maintain our skills and knowledge too – digital scholarship reading group - hour, once a month – hack and yack - hours, once a month – both are open to all challenge: the pace of change transkribus for handwritten text recognition datasets and digital collections datasets about our collections bibliographic datasets relating to our published and archival holdings datasets for content mining content suitable for use in text and data mining research datasets for image analysis image collections suitable for large-scale image-analysis-based research datasets from uk web archive data and api services available for accessing uk web archive collections digital mapping geospatial data, cartographic applications, digital aerial photography and scanned-in historic map materials http://bl.uk/digital http://www.bl.uk/collection-guides/datasets-about-our-collections http://www.bl.uk/collection-guides/datasets-for-content-mining http://www.bl.uk/collection-guides/datasets-for-image-analysis http://www.bl.uk/collection-guides/datasets-from-uk-web-archive http://www.bl.uk/collection-guides/digital-mapping http://www.bl.uk/bibliographic/download.html http://data.webarchive.org.uk/ ask me for a demo! 
contact us: digitalresearch@bl.uk @bl_digischol @mia_out @miss_wisdom
bl labs competition
bl labs competition finalists for
• black abolitionist performances and their presence in britain
• hannah-rose murray, phd student at the university of nottingham
• sherlocknet: using convolutional neural networks to automatically tag and caption the british library flickr collection
• luda zhao, brian do and karen wang, students from stanford university, california
political meetings mapper
'i was able to do in minutes with a python code what i'd spent the last ten years trying to do by hand!' - dr. katrina navickas, bl labs winner
political meetings mapper video: https://youtu.be/xabsuynkd s
spatial humanities
‘being able to link the map and the underlying text allows us to understand how patterns vary from place to place.’
combining text analysis and geographic information systems to investigate the representation of disease in nineteenth-century newspapers, lancaster
places associated with a range of common nineteenth century diseases.
bl labs bi-annual competition & awards
the british library labs awards deadline is september.
'a million images' on flickr commons
mario klingemann ' minerals'
'grassroots' projects
participatory history as learning
• constructive environments for learning and mastering skills
– palaeography
– tagging / classification
– source familiarity
– biographical research
– database and web design
– data collation, analysis, interpretation
– extended research projects
@marinelivesorg
http://digitalpopuplab.org/
olney & district historical society http://www.mkheritage.org.uk/odhs/
conclusion
kiitos / thank you!
mia ridge @mia_out
digital curator, british library
digitalresearch@bl.uk @bl_digischol

library trends v. , no.
winter : http://hdl.handle.net/ /
introduction: the library of congress national digital information infrastructure and preservation program
patricia cruse and beth sandore
this special issue of library trends is comprised of sixteen articles that tell fascinating stories about the ground-breaking efforts of numerous partners within the library of congress national digital information infrastructure and preservation program (ndiipp). since its inception in , ndiipp has grown from an experimental program into a true partnership of concerned organizations working together to sustain access to digital information that is critical to scholarship and cultural heritage nationwide. the seeds for ndiipp were initially sown in a report issued in july by the national research council titled lc : a digital strategy for the library of congress. the report, which was commissioned by the library and congress in , was an on-site study of the library’s technology practices, an initiative conducted by a committee of the computer science and telecommunications board of the national research council. among the recommendations of the lc report was the point that the library of congress should take the lead in the preservation and archiving of digital materials, but that it must continue to work with other institutions in determining collection policies for digital information, and it must accelerate its efforts to meet the growing demand. in december congress passed legislation asking the library of congress to develop a national program to preserve the ever-growing amounts of digital information, especially materials created only in digital format. this law was passed in order to ensure that this content would be accessible for current and future generations. this program was funded by a $ million congressional appropriation and was formally called the national digital information infrastructure and preservation program (ndiipp).
library trends, vol. , no. , winter (“the library of congress national digital information infrastructure and preservation program,” edited by patricia cruse and beth sandore), pp. – (c) the board of trustees, university of illinois
in december congress released $ million for the initial planning phase. from that point forward, the library of congress sought and solidified collaborations with numerous organizations, both public and private, to present a plan for a national digital preservation program to congress in . in this plan, librarian of congress james billington emphasized the urgent need to set in place a trusted solution for the preservation of scholarly and cultural heritage information nationwide through the ndiipp program mission:
never has access to information that is authentic, reliable, and complete been more important, and never has the capacity of libraries and other heritage institutions to guarantee that access been in greater jeopardy. recognizing the value that the preservation of past knowledge has played in the creativity and innovation of the nation, the u.s. congress seeks, through the library of congress, to find solutions to the challenges posed by capturing and preserving digital information of cultural and social significance.
with these words and the funding from the united states congress, the library of congress entered a new era that is marked by the necessity for cooperation and interdependence in order to sustain access to a highly distributed network of digital heritage and scholarship content. the sixteen manuscripts included in this special issue represent a microcosm of over sixty collaborative projects that were launched by the ndiipp. they are organized around three important themes that emerged from the individual and the collective efforts of the partners. these themes coalesced around the shared critical need to preserve significant born-digital and digitized legacy information.
the topics treated in this issue include, more specifically:
• new organizations and missions and new perspectives on sustainability;
• preservation of specific types of content, including web content, cultural heritage and special collections, ejournals, and geospatial information, and the format and metadata standards to support ingest, management, and migration of digital content;
• interoperability, data transfer and storage, and the future of digital preservation systems.
each article in this issue tells a compelling story conveying the sense of urgency that has pervaded the efforts of the numerous institutions and groups involved in ndiipp, many with little else in common but the need to develop policy, structure, process, commitments, and technologies to preserve significant cultural and historical content into the future. in the decade that has passed since the lc report was commissioned by the library and congress, there has been a significant amount of forward momentum aimed at setting in place the shared policy, practice, and technical infrastructure for the preservation of important yet at-risk cultural heritage and government information in digital form in this country. the ndiipp timeline document that follows this introduction (attachment ) provides a detailed and rich chronology of the numerous activities of the partnership and the library of congress that have focused on technical, policy, and organizational matters since the inception of the program. through the support from congress, matched dollar for dollar by the over sixty partner institutions, the ndiipp partners as a group have seized the opportunity to make headway on the challenges of a national yet decentralized digital preservation mandate through numerous coordinated efforts.
ndiipp is engaged in collecting and archiving at-risk content, cooperating on digital preservation best practices and standards, and developing tools and services to be shared within and beyond the partner network. the work of the partners comprises several critical areas including: selection and preservation; metadata for creative commercial content; tools and services for the network; collaborations to preserve state and local government information; u.s. federal agency working groups; the section study group; the national science foundation partnership; and the international partnership for archiving the web. the ndiipp status document that follows this introduction (attachment ) provides a more detailed description of the numerous collaborations and initiatives that represent the core of the national digital preservation partnership’s working agenda. from approximately through , the participants in the ndiipp partnership worked individually on specific projects and met periodically to work together toward defining a new construct—a nationwide digital preservation network. through the collective experiences of the past several years, the ndiipp partner institutions developed and communicated a strong and unified message about the critical need for institutions to work in concert to preserve digital scholarship and heritage information that the united states is at risk of losing permanently. the partners grappled with the daunting knowledge that although few institutions are able to appreciate and prioritize the urgent needs of digital preservation, there is great social capital in the shared understanding that all (scholars, congress, and citizens) have critical needs for sustained access to the wealth of literature and information that is produced in this country. the adage “the whole is greater than the sum of the parts” became a strong and persistent foundation for the ndiipp partnership.
as the ndiipp partners addressed digital preservation challenges, the need for new organizations emerged—structures, policies, and processes that centered on digital preservation. the experience of the ndiipp partnership suggests that groups that are focused on domain- or format-specific content have a strong likelihood of developing sustainable digital preservation models. in a number of these cases, new organizational models have been created to meet the emerging needs of these new collaborations. the articles in this issue that address the new organizational structures provide fascinating and diverse case studies in the approaches taken by several institutions to reshape existing programs or to develop new organizational structures to meet the needs created by digital preservation programs. often these groups had limited funding and time to capture critical resources at the risk of substantial loss of digital content due to deterioration of storage and the looming threat of obsolescence—of systems and content format. ndiipp provided the impetus for some existing organizations to refocus their missions, integrating digital preservation into the core mission. in the case of public television (wnet/channel , and new york university), the ndiipp program catalyzed a community around the development of standards and systems to support the preservation of public television content. through the ndiipp program, other organizations are redefining consortia partnerships, in the case of the social science data community (data-pass project, murray archive), and some have formed new types of not-for-profit consortia among cultural heritage institutions for the purpose of preserving digital special collections. the educopia consortium formed by the institutions involved in the metaarchive american south project provides a compelling story of the coordinated efforts of a group of u.s.
academic institutions to preserve and make accessible the digital cultural heritage materials that document significant points in the history of the american south. one of the most significant developments of the ndiipp partners has been the shared understanding of the high value that sustainability brings to digital preservation activities, for individual projects as well as across the partnership. in their article describing the history of social science data curation and the development of the data-pass project (data preservation alliance for the social sciences), gutmann et al. explore the challenges of preserving digital social science data. their sense of urgency was fueled by the knowledge that after a significant shift in survey research methods had occurred in the s, “less than half of the digital social science research content . . . has been preserved at a professionally managed archival institution.” altman also explores the changes in policies, process, and perspective that occurred with the preservation of digital social science data in the henry a. murray archive. in his article on the preservation of business records from the dot com era, kirsch describes his pioneering efforts to rescue the digital records of defunct law firms that represented failed dot com businesses—in a sense, although these records had custodial support, they were “orphaned” without access due to legal restrictions. if not for kirsch’s efforts, with support from the ndiipp program, the archive of the businesses’ legal transactions would have been inaccessible and eventually lost, representing not only an institutional loss, but more important, a loss to historians and society of a significant record of activities from an era in u.s. history that could not otherwise be studied or well understood.
lefurgy’s article explores the perspectives on sustainability of digital preservation efforts based on a survey of the ndiipp partners two years into the three-year term of their projects. his observations suggest that many of the participating institutions were grappling with the prospect of sustained funding and mainstream organizational models to support digital stewardship. the development of reliable methods to preserve the at-risk content of the ndiipp partners—scholarly journals, web content, geospatial information, video, and audio files—formed the dynamic core of exploration and discovery in the ndiipp program. the library of congress invested strategically in a number of institutions that pledged to investigate numerous challenges in preserving specific content formats. many of the articles in this volume focus specifically on community information needs for reuse and sustained access to the various types of content, placing emphasis on standards development and best practices. the articles by seneca and by hswe et al. provide equally compelling yet different approaches to web archiving that were developed by two ndiipp-sponsored projects to serve various government information, state archives, and cultural heritage communities. the common concerns that fueled these and other web archiving projects included the fleeting nature of content on the web, and the adverse impact of “lost evidence” on many aspects of everyday activities that have come to depend on web dissemination (e.g., federal, state, and local government agency websites, publications and legislation; scholarly presentations, technical reports, and working papers; etc.). the print counterparts for all of this information were previously deposited in libraries, archives, and organizations where the arrangements for stewardship of the print materials have been well established.
with the advent of the web, numerous government and private organizations that previously published materials in print have shifted to web dissemination, many simply ceasing the crucial deposit of a print archival copy. web archiving services provide the critical link that ensures long-lived access to digital documents that are now distributed through the web. the ndiipp program also contributed substantially to the development of production archiving services for reliable and sustained access to vetted scholarly publications and digital-only materials. two such services are featured in this issue, lockss and portico, both of which were initially developed to address e-journal archiving. the article by reich and rosenthal provides a rich account of the development of the lockss (lots of copies keep stuff safe) distributed approach to ejournal preservation, but focuses specifically on the concept of a private lockss network (pln). the private lockss network extends the concept of the lockss preservation strategy to fit the needs of a group or groups of institutions that share in a common commitment to preserve their collective digital scholarly content that is deemed to be significant to the group. skinner and halbert’s article on the metaarchive cooperative, also in this issue, provides a working example of a successful pln. in her article on the development of the portico ejournal archiving service, kirchhoff relates the evidence gathered by ithaka (echoed in similar surveys in the united kingdom) that confirms the rapidly evolving reliance of scholars on ejournals to the core business plans of the portico service. libraries, publishers, and scholars seek affordable organizational solutions to ensure that they will have sustained access to digital scholarship into the future.
kirchhoff also outlines new preservation areas in which the scholarly community is keen for portico to provide services, including e-books and digitized collections. because the nature of digital preservation needs varies by institution and intent, both centralized (portico) and distributed (lockss) services address important aspects of digital preservation that are both complementary and necessary. ndiipp also provided support for cultural heritage institutions to move forward their work in the arena of geospatial information preservation. this issue includes two articles that focus on different aspects of geospatial data collection and preservation. a team from north carolina (morris et al.) pursued preservation of state and local digital geospatial data and addressed the challenges of identifying and collecting from approximately one hundred state and regional agencies geospatial data that is used in applications such as tax assessment, transportation planning, hazard analysis, health planning, political redistricting, and utilities management. the goal of the north carolina effort was to start at the local level and work outward—establish a framework of partnerships among local, state, and federal agencies and work toward alignment of collection and preservation functions for gis information, first at the state level, then forming partnerships among states and other gis organizations. taking a slightly different approach, a team from stanford university and the university of california, santa barbara developed the national geospatial digital archive (ngda) with the goal of building a national, federated collecting network for the archiving of at-risk geospatial images and data. the article by erwin et al. presents a compelling story of the challenge of developing a collection framework and establishing the requisite format and metadata standards.
one important evolutionary change that occurred over the course of the ndiipp program was the gradual and shared realization that preservation as a function exists along the continuum of access to content. the results of innumerable conversations across the various user communities involved in ndiipp surfaced an important point—digital preservation actually means, in many instances, the process of ensuring sustainable access over time to critical scholarly and heritage content. for the user communities, successful digital preservation programs were those that could guarantee both preservation of and access to significant digital content into the future. preservation systems and data ingest and transfer protocols play a critically important role in ensuring that digital information is sustained in its original and intended form, or as close to that as possible. the articles in the final section of this special issue present the results of explorations in the technology of digital preservation systems and the movement of data both into and among preservation repositories. each participant in an ndiipp project committed at the outset of the program to transfer to the library of congress their digital content in the final year of the project. the article by mandelbaum et al. describes the collaboration between the library of congress and the san diego supercomputer center (sdsc) on models for mass data transfer and storage. this work considers the numerous human and technology factors involved in developing a reliable and replicable process for data transfer between two organizations. research at the university of maryland by jaja and song produced the adapt (approach to digital archiving and preservation technology) framework for digital content preservation that is based on a layered, digital object architecture and includes a set of modular tools and services built using open standards and web technologies.
in their article describing a framework for repository interoperability, habing et al. describe the development of the hub and spoke (hands) tool suite. these tools were developed, implementing the mets and premis standards for preservation metadata, to assist in the management of digital content in multiple repository systems while preserving valuable preservation metadata. the hands model provides a standards-based method for packaging content that allows digital objects to be moved between repositories more easily while supporting the collection of technical and provenance information crucial for long-term preservation. the final article in this volume from dubin et al. investigates the next generation of preservation repository architecture through the development of semantic repositories that take into account the meaning of relationships within and among digital objects. with the start of ndiipp, the library of congress and cultural heritage institutions in the united states entered into a new partnership focused on shared stewardship of significant cultural heritage and scholarly digital content. the members of this partnership have undertaken, along with the library of congress, to form the foundation of digital culture preservation in the united states. other critical efforts underscore the need for this national partnership to take hold and embrace other types of critical digital content. the interim report of the blue ribbon task force on sustainable digital preservation and access underscores not only the importance but the urgency of digital preservation efforts like ndiipp, and it seeks to identify sustainable economic models for accomplishing this goal. through the ndiipp program, institutions have worked in concert to develop a strong sense of shared ownership in both the problem and the various solutions.
this is reflected in the numerous cumulative accomplishments of the partnership, and the continuous growth in membership and outreach. the articles contained in this special issue of library trends tell the stories of some of those initial efforts, and contribute to the cumulative understanding of the challenges and the accomplishments in digital preservation and access.
note . for the most up-to-date information on the digital preservation efforts of the partnership, see http://www.digitalpreservation.org
references
blue ribbon task force on sustainable digital preservation and access. ( , december). sustaining the digital investment: issues and challenges of economically sustainable digital preservation. retrieved february , , from http://brtf.sdsc.edu/biblio/brtf_interim_report.pdf
library of congress. ( ). preserving our digital heritage: plan for the national digital information infrastructure and preservation program. a collaborative initiative of the library of congress. washington, dc: library of congress. retrieved february , , from http://www.digitalpreservation.gov/library/resources/pubs/docs/ndiipp_plan.pdf
national research council. committee on an information technology strategy for the library of congress; computer science and telecommunications board; commission on physical sciences, mathematics, and applications; national research council. lc : a digital strategy for the library of congress. washington, dc: national academy press. retrieved february , , from http://www.nap.edu/openbook.php?record_id=
attachment . national digital information infrastructure and preservation program timeline
http://www.digitalpreservation.gov
july national research council issues lc : a digital strategy for the library of congress.
commissioned by the librarian of congress in , the on-site study of the library’s technology practices and initiatives was conducted by a committee of the computer science and telecommunications board of the nrc. among its recommendations: the library should take the lead in the preservation and archiving of digital materials, but it must continue to work with other institutions in determining collection policies for digital information and accelerate its efforts to meet the growing demand.
december congress passes legislation asking the library of congress to develop a national program to preserve the burgeoning amounts of digital information, especially materials that are created only in digital formats, to ensure their accessibility for current and future generations. the program, funded by a $ million appropriation, is formally called the national digital information infrastructure and preservation program (ndiipp). congress releases $ million for initial planning.
march senator ted stevens (r-alaska), chairman of the senate appropriations committee and vice chairman of the joint committee on the library for the th congress, addresses the federal library and information center committee (flicc) of the library of congress on how to meet the challenge of preserving and providing access to authoritative federal information.
october the library of congress, in collaboration with the internet archive, webarchivist.org, and the pew internet & american life project, announces the release of a collection of digital materials called the september web archive, available at september .archive.org. the collection represents the library’s early attempts to capture information on the web before it disappears from the historical record.
february the library of congress convenes a group to discuss possible scenarios for the development of an infrastructure for the collection, access and preservation of digital information.
this scenario planning follows a series of convening sessions, held november , that brought together a cross section of industry and other stakeholder communities for their input on the first stages of the digital infrastructure program.
october ndiipp submits to congress for approval the results of its extensive meetings and planning sessions for the digital preservation program. the “master plan,” for a collaborative network to be formed by the library of congress, is called “preserving our digital heritage.”
january congress approves the plan and releases another $ million. the remaining $ million from congress must be matched dollar-for-dollar from non-federal, in-kind, or cash contributions.
august ndiipp issues an announcement seeking applications for projects that will advance the nationwide program to collect and preserve digital materials.
june the library of congress enters into a joint digital preservation project with old dominion, johns hopkins, stanford, and harvard universities to explore strategies for the ingest and preservation of digital archives. the archive ingest and handling test (aiht) is designed to identify, document, and disseminate working methods for preserving the nation’s increasingly important digital cultural materials, as well as to identify areas that may require further research or development.
june ndiipp partners with the nsf to establish the first research grants program to specifically address digital preservation. nsf is to administer the program, which will fund cutting-edge research to support the long-term management of digital information. the effort is part of the library’s collaborative program to implement a national digital preservation strategy.
september ndiipp announces awards of $ . million resulting from its august solicitation. the awards are received by eight consortia comprising thirty-six institutions.
the award winners agree to identify, collect, and preserve specific types of born-digital materials. these awards from the library are matched dollar-for-dollar by the winning institutions in the form of cash, in-kind or other resources.
march ndiipp holds the first semiannual meeting of the eight award winners. this is the first opportunity for the partners in ndiipp to meet each other and discuss how they will achieve the objectives of their own projects as well as those of the overall national program.
april the newly formed section study group holds its inaugural meeting at the library of congress. the goal of the group, named after the section of the u.s. copyright act that provides limited exceptions for libraries and archives, is to prepare findings and make recommendations to the librarian of congress by mid- for possible alterations to the law that reflect current technologies. this effort will seek to strike the appropriate balance between copyright holders and libraries and archives in a manner that best serves the public interest.
may ndiipp and the national science foundation award ten university teams a total of $ million to undertake pioneering research to support the long-term management of digital information. these awards are the outcome of a partnership between the two agencies to develop the first digital-preservation research grants program. the awards are matched dollar-for-dollar by the institutions.
july representatives from the institutions that received awards totaling nearly $ million in september convene to discuss the progress of their digital preservation projects, learn about related ndiipp undertakings, and discuss ways of moving forward. during the meetings, the partners break into so-called “affinity groups.” these groups are formed based on issues that are paramount among the thirty-six project partner institutions.
the four groups focus on intellectual property rights; collection and selection of digital materials; economic sustainability of the digital preservation projects; and the technical architecture.
august the library of congress launches a new public website to cover the groundbreaking work of the section study group. the site at http://www.loc.gov/section offers the group’s mission statement, its schedule of meetings and links to relevant sections of the copyright act. the site also offers links to background papers pertinent to libraries and archives and the rights issues they encounter when working with digital materials.
october ndiipp announces that it is making a $ million grant award for the development of portico, a nonprofit electronic archiving service being developed by ithaka. this award, to be matched by ithaka, is being used to support portico’s development of the archives’ technical infrastructure and an economically sustainable business model for a continuing archiving service for scholarly resources published in electronic form, beginning with electronic scholarly journals.
january held in berkeley, ca, the third meeting of the ndiipp project partners draws the largest crowd ever for this semiannual gathering. the meeting is designed to continue the work done during the two previous partner meetings: update the participants on the program and provide them with a forum to inform fellow participants on what they have learned so far and the common issues they face.
january the section study group announces that it will hold two public roundtables in march —in los angeles and in washington, dc—to gather insights and opinions on how to revise copyright exceptions for libraries and archives under section of the copyright act. the roundtables are open to the public.
april the library of congress holds a strategy session with leading producers of commercial content in digital formats and learns that creators of television, radio, music, film, photography, pictorial art, and video games are keenly interested in the preservation of their digital materials for archival and other purposes.
may the library of congress holds sessions with the united kingdom’s joint information systems committee to discuss collaboration and common issues.
may the library of congress launches website devoted to effort to capture websites for preservation at http://www.loc.gov/webcapture.
june the library of congress makes a $ , award to stanford university to collaborate on development of ndiipp technical architecture.
july digital preservation program seeks private sector partnerships with a request for expressions of interest to support preservation of creative works.
july the library of congress partnership supports preservation of foreign news broadcasts with an agreement with scola (http://www.scola.org) to ensure access to television programming of long-term research interest.
january ndiipp partner meeting convenes at san diego supercomputer center. over one hundred attendees work within three tracks of breakout sessions to advance work on strategic outcomes.
january congress rescinds $ million in ndiipp funds. pending awards for more than $ million to support technical infrastructure, collection of creative content and states information are suspended or reduced.
march eight digital preservation partners cooperative agreements are extended for eighteen months to run through spring of .
march four technical infrastructure projects are funded for one year at $ million to develop tools for the capture and evaluation of digital content and shared storage services to strengthen the network of partners.
august: the library of congress makes awards to private-sector producers of digital content in the areas of films, sound recordings, comics, pictorial art, video games, and virtual worlds to jump-start private sector preservation of their digital creative works.

november: martha anderson is named new director of program management for ndiipp.

january: twenty-one states, working in four multistate demonstration projects, receive a total of $ . million to preserve at-risk state and local government information.

march: ndiipp launches a monthly online digital preservation newsletter. http://www.digitalpreservation.gov/news/ / news _article_newsletter.html

march: section study group releases its report with recommendations for alterations to copyright law that address the handling of information in digital formats.

july: the library of congress releases the international study on the impact of copyright law on digital preservation. the report is a joint effort of ndiipp, the joint information systems committee, the open access to knowledge (oak) law project, and the surffoundation.

attachment . national digital information infrastructure and preservation program (ndiipp) partnership network

october,

summary: the national digital information infrastructure and preservation program (ndiipp) is engaged with over one hundred partners collecting and archiving at-risk content, cooperating on digital preservation best practices and standards, and developing tools and services to be shared with the partner network.

selection and preservation within a network

sixty-seven partners from the academic library, archives, and non-profit communities

• selection criteria and guidelines for at-risk born digital content, such as datasets, websites, geospatial, television, and business records.
• preservation strategies for the selected content
• tools and services for preservation of specific content types
• development of collaborative networks for digital preservation
• research on methods and infrastructure for preserving digital content

metadata for creative commercial content

twenty partners from the commercial content producer community

• photographers, graphic artists, motion picture, sound recording, and interactive media producers
• standards for commercial content formats and metadata to make the content discoverable by search engines

tools and services for the network

five partners from academic computer science and non-profit communities

• development of tools to collect, analyze, and extract metadata from content published on the web
• development and deployment of storage and preservation services for a variety of content

collaborations for the preservation of state and local government information

twenty-three partners from state libraries and archives

• collect and preserve state geospatial data, legislative records, court case files, web publications and executive agency records
• develop model data management and archiving systems
• provide access to content for congress and others

u.s. federal agency working groups

eleven federal agencies

• leading digitization working group for federal agencies includes national archives and records administration (nara), u.s. government printing office (gpo), smithsonian, national park service, holocaust museum, national gallery of art
• leading standards working group for federal agencies for digitizing still and moving images includes gpo, nara, national library of medicine (nlm), national agricultural library (nal), national technical information service (ntis), u.s. geological survey (usgs), smithsonian, department of transportation, national gallery of art, also affiliation with commerce, energy, nasa, defense information managers group (cendi)

section working group

nineteen members

• a select committee of copyright experts charged with updating for the digital world the copyright act’s balance between the rights of creators and copyright owners and the needs of libraries and archives
• work completed and report published march ,

national science foundation partnership

ten research projects

• co-sponsor with national science foundation (nsf) of digital preservation research agenda, dig-arch
• leader on nsf task force on sustainability, convened january

international partnership for archiving the web

thirty-nine national, state, academic, and non-profit libraries and archives

• working together to develop standards, tools, and processes for archiving the web
• developing national web archive collections based on common technical standards

beth sandore is associate university librarian for information technology planning and associate dean of libraries at the university of illinois library at urbana-champaign. in this role she focuses on shaping technology to foster accessible, effective library programs and services. the core of illinois’ library technology programs includes developing and evaluating innovative digital library technology, and ensuring sustained access to research and cultural heritage content over time. sandore received her a.m. from the university of chicago in and has held appointments at northwestern university, the university of california, the illinois institute of technology, and the national center for supercomputing applications (ncsa). her research has been supported by the institute of museum and library services (imls), the national science foundation, the library of congress, and private foundations.
she is currently a co-principal investigator with john unsworth for illinois’ $ . million national digital preservation partnership supported by the library of congress.

patricia cruse is the founding director of the california digital library (cdl) digital preservation program. she works collaboratively with the ten university of california libraries to develop sustainable strategies for the preservation of digital content that supports the research, teaching, and learning mission of the university. ms. cruse has developed and currently oversees several of cdl’s initiatives, including the ndiipp-funded web archiving service and the imls-funded digital preservation repository. recent activities include specifying preservation services for the hathitrust initiative and working with uc campus stakeholders to develop a set of digital curation micro-services.

student as producer and open educational resources: enhancing learning through digital scholarship

sue watling

sue watling, centre for research and development, university of lincoln, brayford pool, lincoln ln ts, swatling@lincoln.ac.uk

biography

currently a teaching and learning co-ordinator in the centre for educational research and development (cerd), sue has over years’ experience in education: in adult and community education, social services and for the past years at the university of lincoln where she supports staff in the use of virtual learning environments. she has a particular interest in the social impact of the internet and issues of digital inclusion. sue has mas in gender studies and open and distance education and is currently undertaking doctoral research based around teaching and learning in a digital age.
she has professional accreditation as a learning technologist with the association for learning technology (cmalt), is a member of the association for learning development in higher education (aldinhe) and a fellow of the higher education academy (hea).

abstract

at the university of lincoln, the student as producer agenda is seeking to disrupt consumer-based learning relationships by reinventing the undergraduate curriculum along the lines of research-engaged teaching. the open education movement, with its emphasis on creative commons and collaborative working practices, also disrupts traditional and formal campus-based education. this paper looks at the linkages between the student as producer project and the processes of embedding open educational practice at lincoln. both reinforce the need for digital scholarship and the prerequisite digital literacies that are essential for learning in a digital age.

key words: student as producer, digital scholarship, digital literacies, open education, open educational resources, creative commons

introduction

strong linkages are evolving between two major projects at the university of lincoln. these are student as producer and the hea/jisc-funded embedding oer (open educational resources) practice. both projects aim to support and enhance the learning experience: student as producer through promoting research-engaged teaching, and embedding oer practice through adopting a creative commons approach to teaching, learning and research. virtual learning environments (vles) have become integral to the he experience, and using them effectively involves a shift in practice to digital scholarship. becoming a digital scholar involves adopting those digital literacies which have become essential for learning in the st century. this paper will examine the framework of student as producer and the experience of promoting open educational practice at lincoln.
it will show how both these initiatives highlight digital scholarship and the digital literacies essential to it. the paper will begin with a brief outline of the philosophy of student as producer before looking more closely at its digital scholarship theme. the wider implications of engagement with virtual learning and the requirements of a digital scholar will be discussed. these include support for individual confidence and competence with educational technology, in particular the adoption of a tripartite model of digital literacies, which includes personal, professional and public dimensions. finally, the paper will show how the progressive pedagogy of student as producer has multiple linkages with the processes of embedding open educational resources and offers useful ways forward for the enhancement of learning in a digital age.

student as producer

student as producer is a major cross-institutional initiative at the university of lincoln. it involves establishing research-engaged teaching as the organising principle for curriculum design and development. student as producer is reinventing the undergraduate student experience. it is not a prescriptive approach to change but has been introduced as a platform for debate and intellectual discussion about the nature of teaching and learning and its relationship to research:

the essential aspects of research-engaged teaching and learning is that it involves a more research-oriented style of teaching, where students learn about research processes, and where the curriculum emphasises the ways by which knowledge is produced, rather than learning knowledge that has already been discovered. (neary : )

under the principles of student as producer, the revised curriculum supports opportunities for students to learn as researchers, through inquiry-based learning and problem-solving activities.
the philosophical underpinning of student as producer is one of critical pedagogy, where students are recognised as being participative constructors of knowledge rather than passive consumers:

students do come armed with their own experience, which critical teaching acknowledges through a dialogue with students. here the educator is also educated, and the student also becomes the teacher. (burawoy : )

student as producer is an attempt to restate the purpose of he by seeking to reconnect the core research and teaching activities of universities in a way that consolidates and substantiates the values of academic life (neary and winn ; hagyard and watling ). under student as producer, the student is encouraged not only to engage actively in the research process but also to become the producer of social reality and a citizen of the future. at a time when the social impact of the internet is adding digital dimensions to st century citizenship, student as producer actively recognises the need to support and resource the development of digital scholarship.

project websites: http://studentasproducer.lincoln.ac.uk and http://oer.lincoln.ac.uk

student as producer and digital scholarship

the progressive pedagogy of student as producer seeks to reinstate research engagement as an organising principle within a number of key themes, one of which is digital scholarship. the internet has great potential for enhancing and extending effective teaching and learning. this can happen through staff-led online activities, which support off-campus access and interaction, and through student-led use of social media like twitter and facebook. engaging with digital scholarship offers useful opportunities for establishing new collaborative working relationships between staff and students. mobile internet access via smartphones and tablets ensures continual connectivity and the ability to be in permanent contact with multiple sources of information.
this access to vast repositories of knowledge has the potential to empower students by disrupting traditional classroom practices. digitally adept students create their own personal learning environments built from digital tools they already use. favourite content is shared via social bookmarking software like diigo and delicious. slideshare and vimeo enable the uploading of student-created presentations, while blogs and wikis support student review, feedback and evaluation. students who are also digital scholars navigate complex websites and authenticate online content with confidence, accurately distinguishing between knowledge, information and personal opinion. working outside traditional barriers of time and distance, they participate in online communities of practice and expertise, leading to an abundance of collaborative online learning opportunities.

student as producer encourages students to see themselves as producers of social reality and citizens of the future. to be a digital scholar under student as producer is to engage with the wider social impact of digital ways of working. digital scholars have the personal skill-set to manage virtual learning effectively and also understand how digital literacies reflect individual identities and values. when they select the appropriate tools for tasks, they are also aware of the potential impact of their choices, particularly with regard to digitally inclusive practice. finally, digital scholars support the principles of open education, with its sharing of educational resources and the reusing and repurposing of content.

there are clear alignments between digital scholarship under student as producer and the philosophy and practice of the open education movement. both support a commons-based, peer-production approach to learning, and this connection will be addressed in more detail later in this paper.
before then, the relationships between digital scholars and their essential digital literacies will be examined.

• discovery: student as producer
• technology in teaching: digital scholarship
• space and spatiality: learning landscapes in higher education
• assessment: active learners in communities of practice
• research and evaluation: scholarship of teaching and learning
• student voice: diversity, difference and dissensus
• support for research-based learning through expert engagement with information resources
• creating the future: employability, enterprise, beyond employability, postgraduate

student as producer: digital scholars and digital literacies

digital scholarship is not determined by access to educational technology but by the ways in which it is used. this requires attention to digitally literate ways of working, with clear frameworks defining those most essential for learning in a digital age. prerequisite digital literacies should not be assumed. instead, their adoption requires explicit support structures which themselves are flexible and adaptable to change as the internet continues to develop and evolve. research has been funded through the hea/jisc developing digital literacies programme (www.jisc.ac.uk/developingdigitalliteracies). the programme offers a definition of digital literacies as those capabilities which fit an individual for living, learning and working in a digital society: “for example, the skills to use digital tools to undertake academic research, writing and critical thinking; as part of personal development planning; and as a way of showcasing achievements” (jisc ).

unpicking digital literacies in more detail reveals a complexity of issues. as well as the effective use of technology for education, these include the wider social dimensions that demand a more scholarly approach. the principles of student as producer frame students as the producers of social reality.
this can be usefully applied to a tripartite model of digital literacies, one that encompasses professional and public dimensions as well as competency-based personal ones. personal digital literacies are primarily about functionality. they describe the skill-set necessary for effective management of hardware, software, mobile technology and social media. personal digital literacies are recognised as essential requirements for learning. raising awareness of the professional and public dimensions of digital literacies can require a more sophisticated understanding of the broader social consequences of the move towards digital ways of working. there are exacting boundaries between private and professional online practices. establishing a virtual presence with friends can differ significantly from how one presents to colleagues, clients or service users. the professional elements of digital literacies include the construction of appropriate online identities. professional digital literacies include the potential for misuse of email or social media and understanding how the speed of online communication can encourage responses posted in haste and later regretted. the principles of online data protection, understanding the permanence of digital footprints and how user history is tracked and recorded are all essential to professional practice, as is an appreciation of the speed at which images and text can be taken out of context and spread across worldwide networks. as well as the professional dimension, there is a public aspect to digital literacies. this involves understanding them as learned social practices. it includes awareness of the potential for the replication and reinforcement of existing digital inequalities and exclusions. public digital literacies highlight the dichotomy of technology that enables access while also denying it unless steps are taken to ensure barrier-free ways of working. 
the university of the future needs to be many things, including the producer of students who are aware of the social shaping of technology and the parameters of digital divides. digital citizens have a responsibility to adopt holistic approaches and the progressive pedagogy of student as producer ensures that it is ideally placed to support digital citizenship as a new way of being in a digital age.

alongside student as producer, another initiative at lincoln offers opportunities for digital scholarship and acceptance of a triad of personal, professional and public digital literacies. this is the philosophy and practice of open education. the linkages between this and student as producer will be examined next.

student as producer and the open education movement

as student as producer seeks to disrupt consumer-based relationships between tutors and students, so open education has disrupted traditional provision of formal campus-based teaching, learning and research. the movement was initiated by the massachusetts institute of technology (mit) in . funded by the william and flora hewlett foundation, mit course materials were made freely available online for public access. the uk open university developed a free access site, openlearn, and the sharing of educational resources was promoted in the education sector by jorum, a national repository of free educational content. hea/jisc have funded three phases of research on the creation and embedding of open educational resources, resulting in the construction of a number of subject specialist repositories, for example openspires from the university of oxford, freely available modules such as chemistryfm from the university of lincoln and the development of free educational software for constructing oer such as xertes from the university of nottingham.
open education challenges historical conceptions of academic institutions as sole gatekeepers of information and knowledge. by taking advantage of the instant access to digital data afforded by internet technologies, open education provides platforms for user participation and sharing and in so doing creates internationally distributed networks of open educational resources. these support knowledge sharing through flexible and borderless online educational experiences, which fit well with student as producer’s active engagement with research and the production of new knowledge. working with open educational content supports digital scholarship and its associated digital literacies, for example searching, selecting and evaluating content, making measured judgments on authenticity and value, and being aware of the need for inclusive design in order to achieve maximum capacity for sharing.

student as producer and embedding oer practice

at the university of lincoln, staff are adopting the philosophy and practice of open educational resources as a whole-institution approach. embedding oer practice is a hea/jisc-funded project, running concurrently with an he change academy programme. using a macro and a micro approach, it aims to promote open education as a sustainable and effective way of supporting teaching, learning and research in a digital age. the micro element consists of six individual projects investigating the use of oer in different generic aspects of the student experience. project teams include current students who are actively encouraged to provide the student voice with regard to learning and to participate in the oer research processes. the six areas of oer practice are listed below:

• supporting transition with oer focuses on providing new students with access to library resources prior to enrolment.
• early reflective writing uses oer to support the processes of reflective thinking and writing early in the student experience in semester a, year one.
• employability explores oer for embedding graduate attributes in the undergraduate curriculum.
• practice education electronic resources (peer) is looking at oer for the construction and assessment of e-portfolios for students and practice educators or mentors on undergraduate and postgraduate work-based learning awards.
• exploring and embedding the use of oer on pgcert/he … and beyond is the development of an online postgraduate module called teaching and learning in a digital age.
• behind the scenes offers technical support for the using and reusing of oer and is making recommendations for policy and practice with repositories.

links: http://ocw.mit.edu/index.htm http://openlearn.open.ac.uk http://www.jorum.ac.uk http://openspires.oucs.ox.ac.uk http://forensicchemistry.lincoln.ac.uk http://nottingham.ac.uk/xertes http://oer.lincoln.ac.uk

alignment of oer practice with generic elements of learning reflects the macro dimensions of the project, which investigate strategic approaches for sustaining institutional change. sustainability of project outcomes will be enhanced through the purposeful alignment of themes with existing institution-wide strategies. getting started is the university of lincoln programme for transition support and the oer work of the library-based team will feed directly into this pre-existing framework. students can participate in a learn higher certificate experience, which involves evidencing extra-curricular activities and offers an institution-wide pathway for broader adoption of the oer work on employability.
highlighting shared student experiences, such as transition, reflective writing, employability and e-portfolios, offers multiple opportunities for attracting wider attention to oer and extending open practices across a range of subject disciplines. further commonalities include evidencing graduate attributes and difficulties with reflective practice. these are reported by all project teams and the linkages suggest the availability of free open educational resources, encouraging opportunities for employability and reflective processes, will support the embedding of oer practice as a whole-institution strategy. the oer research project is bringing together teaching and learning staff from across the university and asking them to work specifically in digital environments. this is raising awareness of the need for increased support for those digital literacies which are essential for oer practice: for example searching, selecting and evaluation, the identification of appropriate tools for content sharing, and the effective management of online ways of working. one way this need might be met is through raising the profile of the digital scholarship theme of student as producer. promoting digital scholarship through student as producer will help to ensure relevant institution-wide support for a broad range of digital learning experiences. this will include opportunities for enhancing the personal, professional and public dimensions of digital literacies as well as creating an institution-wide framework in which open education practices can be sustained. conclusion this paper has examined linkages between student as producer and research on open education at the university of lincoln. the digital scholarship theme of student as producer offers a useful framework for the adoption of open educational practices, which in themselves are a potential way forward for the university of the future. 
working with open educational resources highlights the potential of technology for education. it demands attention to digital scholarship and reinforces the development of a tripartite model of the personal, professional and public dimensions of digital literacies. student as producer has a natural affinity with the ethos of the open education movement; both have in common the disruption of traditional balances of power and provide valuable opportunities for discussion and debate about enhancing the learning experience. open education requires a whole-institution approach to digital scholarship, which in turn requires attention to the appropriate digital literacies. student as producer offers a strategic framework for enhancing the quality of the learning experience, one which aligns well with the philosophy and practice of open educational resources. through promoting open approaches to learning, student as producer can provide a mechanism for embedding digital scholarship in the curriculum and in so doing will create an effective learning environment which is relevant for a digital age.

references

burawoy, m ( ). what might we mean by a pedagogy of public sociology? address to c-sap annual conference. cardiff, november.

hagyard, a and watling, s ( ). the student as scholar: research and the undergraduate student. towards teaching in public. london: continuum.

jisc ( ). developing digital literacies. briefing paper in support of jisc grant funding / . available at: www.jisc.ac.uk/media/documents/funding/ / /briefingpaper.pdf

neary, m ( ). student as producer: research-engaged teaching and learning at the university of lincoln. available at: http://studentasproducer.lincoln.ac.uk

neary, m and winn, j ( ). the student as producer: reinventing the student experience in higher education. the future of higher education: policy, pedagogy and the student experience. london: continuum.
[pdf] original articles. | semantic scholar

doi: . /jpme. . . . corpus id:

original articles.

@article{kayaba originala, title={original articles.}, author={m. kayaba and k. iwayama and h. ogata and y. seya and k. kiyono and m. satoh}, journal={journal of clinical rheumatology : practical reports on rheumatic & musculoskeletal diseases}, year={ }, volume={ }, pages={ } }

m. kayaba, k. iwayama, + authors m. satoh. published: medicine; journal of clinical rheumatology : practical reports on rheumatic & musculoskeletal diseases

. mccoy d. back to basics for health care. mail & guardian ; may: . . census figures, downloaded from http://www.statssa.gov.za (accessed june ). . rabinowitz hk. a program to increase the number of family physicians in rural and underserved areas: impact after years. jama ; : - . . rabinowitz hk. evaluation of a selective medical school admissions policy to increase the number of family physicians in rural and underserved areas. n engl j med ; : …

tables and topics from this paper: table i mercury university of southern california norris comprehensive cancer center ehlers-danlos syndrome nucleus raphe magnus estimated ephedra sinica censuses hospital admission education, medical hill-sachs lesion hospitals, rural corporation doxorubicin/fluorouracil/mitomycin protocol

citations

• the incidence and clinical presentation of infantile rotavirus diarrhoea in sierra leone. f. d. de villiers, t. sawyerr, gillian k de villiers. medicine. south african medical journal = suid-afrikaanse tydskrif vir geneeskunde
• hemoglobin a c above threshold level is associated with decreased β-cell function in overweight latino youth. claudia m. toledo-corral, l. g. vargas, m. goran, m. weigensberg. medicine. the journal of pediatrics
• abdominal and pericardial ultrasound in suspected extrapulmonary or disseminated tuberculosis. m. patel, s. beningfield, v. burch. medicine. south african medical journal = suid-afrikaanse tydskrif vir geneeskunde
• n-terminal pro-brain natriuretic peptide and risk of coronary artery lesions and resistance to intravenous immunoglobulin in kawasaki disease. ken yoshimura, takahisa kimata, k. mine, t. uchiyama, s. tsuji, k. kaneko. medicine. the journal of pediatrics
• a neurobehavioral intervention and assessment program in very low birth weight infants: outcome at months. k. koldewijn, a. v. van wassenaer, + authors f. nollet. medicine. the journal of pediatrics
• moderate-to-vigorous physical activity, indices of cognitive control, and academic achievement in preadolescents. d. pindus, e. drollette, + authors c. hillman. medicine, psychology. the journal of pediatrics
• child allergic symptoms and mental well-being: the role of maternal anxiety and depression. a. teyhan, b. galobardes, j. henderson. medicine, psychology. the journal of pediatrics
• exercise capacity in pediatric patients with inflammatory bowel disease. h. ploeger, t. takken, + authors b. timmons. medicine. the journal of pediatrics
• central adiposity is negatively associated with hippocampal-dependent relational memory among overweight and obese children. n. khan, c. baym, + authors n. cohen. medicine. the journal of pediatrics
• does preterm birth influence cardiovascular risk in early adulthood? g. kerkhof, p. breukhoven, r. leunissen, r. willemsen, a. hokken-koelega. medicine. the journal of pediatrics

references

• two decades of haemophilia treatment in the netherlands, – . a. h. triemstra, c. e. smit, h. ploeg, e. briët, f. rosendaal. medicine. haemophilia : the official journal of the world federation of hemophilia
• initial presentations of pediatric hemophiliacs. j. conway, m. hilgartner. medicine. archives of pediatrics & adolescent medicine
• management of haemophilia in the developing world. c. lee, c. kessler, + authors c. karagus. medicine. haemophilia : the official journal of the world federation of hemophilia
• diagnostic symptoms of severe and moderate haemophilia a and b a survey of cases. r. ljung, p. petrini, i. nilsson. medicine. acta paediatrica scandinavica
• factor concentrates for haemophilia in the developing world. c. lee, c. kessler, + authors a. srivastava. medicine. haemophilia : the official journal of the world federation of hemophilia
• when are children diagnosed as having severe haemophilia and when do they start to bleed? a -year single-centre pup study. h. pollmann, h. richter, h. ringkamp, h. jürgens. medicine. european journal of pediatrics
• the cupped disc. who needs neuroimaging? d. greenfield, r. siatkowski, j. glaser, n. schatz, r. parrish. medicine. ophthalmology
• prevalence of human parvovirus b and tt virus in a group of young haemophiliacs in south africa. rubinstein, karabus, smuts, kolia, van rensburg. medicine. haemophilia : the official journal of the world federation of hemophilia
• [a case of chronic persistent cough (cpc) caused by gastroesophageal reflux (ger) (including a study of cpc caused by suspected ger)]. k. fujimori, m. satoh, m. sasagawa, e. suzuki, m. arakawa. medicine. arerugi = [allergy]
• original articles. j. assies, a. schellekens, j. l. touber. medicine. the netherlands journal of medicine
journal of library administration, : – , copyright © taylor & francis group, llc
issn: - print / - online
doi: . / . .

digital humanities and libraries: a conceptual model

chris alen sula
school of information & library science, pratt institute, new york, ny, usa

abstract. though there has been much discussion of the connection between libraries and digital humanities (on both sides), a general model of the two has not been forthcoming. such a model would provide librarians with an overview of the diverse work of digital humanities (some of which they may already perform) and help identify pockets of activity through which each side might engage the other. this article surveys the current locations of digital humanities work, presents a cultural informatics model of libraries and the digital humanities, and situates digital humanities work within the user-centered paradigm of library and information science.

keywords: digital humanities, academic libraries, research libraries, services, users, cultural informatics

introduction

in , the chronicle of higher education called digital humanities “the first ‘next big thing’ in a long time, because the implications of digital technology affect every field” (pannapacker, ).
by that point, several popular books had already been published (schreibman, siemens, & unsworth, ; cohen & rosenzweig, ; moretti, ; siemens & schreibman, ; boot, ), major journals established (digital humanities quarterly, digital humanities now, digital medievalist, international journal of humanities and arts computing, literary and linguistic computing), and dozens of federal grants awarded to projects in the area of digital humanities, not to mention many more ongoing projects at that time.

address correspondence to chris alen sula, assistant professor and coordinator of digital humanities, school of information & library science, pratt institute, west th st., sixth floor, new york, ny , usa. e-mail: csula@pratt.edu

while skeptics today remain unsure of the “newness” of digital humanities (dh) or how it will impact the content of scholarship (fish, , a, b; marche, ), dh has already had significant influence on discussions of scholarly communication, funding, and tenure and promotion. nearly digital humanities grants and fellowships have been awarded by the national endowment for the humanities (neh, a) since (this figure does not include grants for preservation, infrastructure, and cultural heritage, or funding from other agencies for humanities projects that include a digital component). the modern language association ( ) has issued guidelines for evaluating digital scholarship for the purposes of tenure and promotion, and job candidates lament that many openings in the humanities now require some background in digital humanities (mla jobs tumblr, ). for a growing list of dh jobs, see the digital humanities job archive ( ). given the impact of digital humanities on these institutionalized processes, it is natural to wonder how dh might be connected to one of the oldest institutions in knowledge work: the library.
discussion of digital humanities and its connection to libraries has grown rapidly in the past several years, and on both sides of the aisle. stephen ramsay ( ) has linked dh to one of the oldest functions of the library, namely knowledge organization:

    of all scholarly pursuits, digital humanities most clearly represents the spirit that animated the ancient foundations at alexandria, pergamum, and memphis, the great monastic libraries of the middle ages, and even the first research libraries of the german enlightenment. it is obsessed with varieties of representation, the organization of knowledge, the technology of communication and dissemination, and the production of useful tools for scholarly inquiry.

several others have asked if the library can function as a space for the digitization, computation, and preservation work that accompanies dh projects. for evidence of continuing interest in libraries, one need look no further than thatcamp, a series of locally-organized unconferences, attendance at which has been discussed as a defining characteristic of digital humanists. the pop-up topics at thatcamps frequently include the library, and a special thatcamp libraries was held in november in conjunction with the digital library federation forum.

within library and information science (lis), there is a corresponding (if more dispersed) discussion of dh. though dh is less prominent at national conferences, it has received attention within the field, including from major organizations. the american library association’s (ala) association of college and research libraries hosts a listserv for digital humanities discussion and recently launched a new blog that includes events, resources, case studies, and tools (http://acrl.ala.org/dh). the council on library and information resources and the association of research libraries have both published major reports on digital humanities centers, which are discussed in section two below. the institute of museum and library services (imls) has also supported collaboration between ischools and digital humanities centers, including internships for lis master’s students working in the digital humanities (ischools & the digital humanities).

[figure: digital humanities publications in library and information science, – .]

a search for “digital humanities” within library and information science literature reveals a steady increase in publications since in the library, information science & technology abstracts (lista) database, which indexes over journals as well as books, research reports, and proceedings (see figure ). it is remarkable that publications on digital humanities have nearly doubled in , with more still being indexed at the time of this publication.

a topic model of the sources returned by the query is given in table . these topics were generated using latent dirichlet allocation (lda) in a free tool based on the popular mallet toolkit (http://code.google.com/p/topic-modeling-tool). lda views each document as a mixture of topics and uses word distribution to calculate the probability that each document contains each topic. for example, the concepts library and archive might be distributed across a corpus such that documents containing the words ‘catalog,’ ‘book,’ and ‘barcode’ would have a probability of . of being about library, while documents containing ‘notes,’ ‘scope,’ and ‘provenance’ would have a . probability of being about archive. in practice, these topics are usually unknown at the start of the analysis and must be interpreted from a list of terms that are found to cluster together. thus, topic modeling using lda resembles an exercise in knowledge organization, in which higher-level categories must be created from lower-level “documents” (in this case, word clusters).
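
the library/archive example can be made concrete with a small sketch. what follows is not the mallet-based tool used for the analysis; it is a minimal pure-python collapsed gibbs sampler for lda, offered only to illustrate how per-document topic probabilities and clustered terms arise. the function name `fit_lda`, the hyperparameters `alpha` and `beta`, the iteration count, and the four toy documents are all assumptions introduced for illustration.

```python
import random
from collections import defaultdict


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (theta, top_terms): theta[d][k] is the estimated share of
    topic k in document d; top_terms[k] lists the most frequent words
    assigned to topic k. All parameter defaults are illustrative.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # count of topic k in doc d
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                        # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample: weight of topic t is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta)
                # / (topic total + vocab_size * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic mixture; each row sums to 1.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    # Words actually assigned to each topic, most frequent first.
    top_terms = [
        sorted((w for w, c in topic_word[t].items() if c > 0),
               key=lambda w: (-topic_word[t][w], w))[:3]
        for t in range(num_topics)
    ]
    return theta, top_terms


# Hypothetical toy corpus built from the example terms in the text.
docs = [
    ["catalog", "book", "barcode", "book", "catalog"],
    ["book", "barcode", "catalog", "barcode"],
    ["notes", "scope", "provenance", "notes"],
    ["scope", "provenance", "notes", "provenance"],
]
theta, top_terms = fit_lda(docs, num_topics=2)
for d, mixture in enumerate(theta):
    print(d, [round(p, 2) for p in mixture])
print(top_terms)
```

as in the analysis described above, the sampler returns only unlabeled term clusters and probabilities; deciding that one cluster is "library" and the other "archive" remains an interpretive step.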

table: topic analysis of “digital humanities” abstracts in lista ( – )

topics | top terms in topic
arts & humanities librarianship [ ] | humanities, web, access, scholars, tools, journals, students, art, academic, online
digital infrastructure [ ] | article, libraries, library, collections, content, national, computer, metadata, researchers, documents
knowledge production & collaboration [ ] | digital, paper, data, technologies, based, collaboration, knowledge, study, projects, approach
digital scholarship [ ] | digital, university, information, work, project, science, dh, technology, scholarship, projects
research communities [ ] | research, resources, text, analysis, twitter, social, conference, including, open, community

since topic titles involve significant interpretation, it is helpful to triangulate the assignments using a variety of methods. in the case of the lista abstracts, five topics were created using lda, and titles were assigned, first, by examining the term clusters and the abstracts in which they occur. for example, a number of abstracts in the first topic concerned access to arts and humanities collections, as well as online resources. since these fall under the province of subject librarians, the topic was titled “arts and humanities librarianship.” in some cases, it was helpful to examine the full dataset (not just clusters of top ten words) using a network graph (see figure ). in this graph, each document appears with its weighted relations (i.e., probability assignments) to topics. documents and topics that are more closely related appear together, while those that are unrelated or weakly related are pushed apart. this graph helped in assigning titles to topics and , which are more closely related to each other than any other pair in the corpus.
the titles “arts and humanities librarianship” and “research communities” (respectively) help express this relationship, since subject librarianship is indeed connected to understanding various research communities and their needs, resources, and methods of communication.

the five topics present in the lista abstracts show a wide range of engagement with the digital humanities. this interest also seems in keeping with several of the core competencies of librarianship described by the ala, which “a person graduating from an ala-accredited master’s program in library and information studies should know and, where appropriate, be able to employ” (american library association, ). among the most germane competencies to dh are those concerning information resources (esp. digital resources), knowledge organization (esp. cataloging and classification of dh materials), technological knowledge and skills (including the analytical, visualization, and content management tools used by digital humanists), and user services, which will be taken up in the fourth section of this paper (see table ).

[figure: network graph of topic analysis of “digital humanities” abstracts in lista ( – ).]

given this significant overlap in interests, competencies, and institutional structures, we are left to wonder not whether but how libraries can join in the work of digital humanities. some commentators follow micah vandegrift’s ( ) enthusiastic injunction, “stop asking if the library has a role, or what it is, and start getting involved in digital projects that are already happening.” (for more details on this view, see vandegrift and varner (this issue).) others are less sanguine about the realities of librarianship and the possibility for jumping into new, digital humanities projects.
miriam posner (this issue) highlights important institutional barriers to dh work in the library, including workload, conventions of assigning credit solely to faculty members, and lack of institutional commitment. further discussions of challenges are found in libraryloon ( ), furlough ( ), muñoz ( ), and galina russell ( ). these challenges doubtless vary among and within institutions, so a general formula for the connection between libraries and digital humanities does not seem forthcoming. what remains possible, however, is a sketch of the conditions under which libraries may be more favorable to digital humanities work (and when it may happen elsewhere) and a general conceptual model of libraries and the digital humanities. this latter project has two parts. first, it should be possible to articulate the variety of ways in which libraries engage with dh and to locate these interactions in some larger relational framework. such a model would provide librarians with an overview of the diverse work of digital humanities (some of which they may already perform) and help identify pockets of activity through which each side might engage the other. second, it should be possible to situate dh work in libraries within larger paradigms or philosophies of the field. doing so would integrate dh work more fully into the overall life of the library, providing grounds for establishing priorities and making decisions with respect to levels of commitment, funding, and support. the following sections take up these tasks by surveying the current state of digital humanities work within institutions, presenting a cultural informatics model of libraries and the digital humanities, and situating dh work within the user-centered paradigm of library and information science.

table: ala core competencies of librarianship related to digital humanities

a. concepts and issues related to the lifecycle of recorded knowledge and information, from creation through various stages of use to disposition.
b. concepts, issues, and methods related to the acquisition and disposition of resources, including evaluation, selection, purchasing, processing, storing, and deselection.
d. concepts, issues, and methods related to the maintenance of collections, including preservation and conservation.
b. the developmental, descriptive, and evaluative skills needed to organize recorded knowledge and information resources.
c. the systems of cataloging, metadata, indexing, and classification standards and methods used to organize recorded knowledge and information.
a. information, communication, assistive, and related technologies as they affect the resources, service delivery, and uses of libraries and other information agencies.
d. the principles and techniques necessary to identify and analyze emerging technologies and innovations in order to recognize and implement relevant technological improvements.
d. information literacy/information competence techniques and methods, numerical literacy, and statistical literacy.
e. the principles and methods of advocacy used to reach specific audiences to promote and explain concepts and services.
f. the principles of assessment and response to diversity in user needs, user communities, and user preferences.
g. the principles and methods used to assess the impact of current and emerging situations or circumstances on the design and implementation of appropriate services or resource development.
a. the fundamentals of quantitative and qualitative research methods.
a. the necessity of continuing professional development of practitioners in libraries and other information agencies.

a short history of digital humanities, and its current whereabouts

digital humanities focuses both on the application of computing technology to humanistic inquiries and on humanistic reflections on the significance of that technology. marija dalbello ( ) traces the history of digital humanities back to mid-twentieth century efforts in humanities computing and, in particular, to early forms of text analysis. with the growth of internet technology in the s, focus shifted to hypertexts, digital repositories, and multimedia collections. the 21st century has seen a dramatic rise in social networks and crowdsourcing, access to digitized cultural heritage materials, and interfaces for archives and collections that exploit the capabilities of linked data and visualization. this long and varied history helps to account for the wide range of topics currently found in digital humanities work, topics ranging from text analysis and visualization to digital pedagogy and new platforms for scholarly communication.

the location in which digital humanities work occurs is similarly varied. matthew kirschenbaum, for example, claims that digital humanities is often found within english departments because of historical connections between texts, computing, and composition, as well as interest in editorial processes, hypertext, and cultural studies ( , p. ). though english departments may be among the most prominent, digital humanities now includes faculty from the broad range of arts and humanities departments, including archaeology, art history, classics, comparative literature, history, music, performing arts, philosophy, postcolonial studies, religious studies, theatre, and more.

in a broader view, several studies have attempted to determine the location of digital humanities within the university at large.
in , the council on library and information resources (clir) commissioned a yearlong study of digital humanities centers to explore their financing, organizational structure, products, services, and sustainability (zorich, ). the study defined such centers as undertaking some or all of the following activities:

1. builds digital collections as scholarly or teaching resources,
2. creates tools for authoring, building digital collections, analyzing collections, data or research processes, managing the research process,
3. uses digital collections and analytical tools to generate new intellectual products,
4. offers digital humanities training,
5. offers lectures, programs, conferences or seminars on digital humanities topics,
6. has its own academic appointments and staffing,
7. provides collegial support for and collaboration with members of other academic departments at the home institution,
8. provides collegial support for and collaboration with members of other academic departments, organizations or projects outside the home institution,
9. conducts research in humanities and humanities computing (digital scholarship),
10. creates a zone of experimentation and innovation for humanists,
11. serves as an information portal for a particular humanities discipline,
12. serves as a repository for humanities-based digital collections, and
13. provides technology solutions to humanities departments. (pp. – )

though this study did not explicitly address connections between libraries and digital humanities, several of the defining tasks of dh centers could also be characterized as library activities, including the focus on building digital collections and associated tools, using these collections, and serving as a repository ( – , ). many of the other list items are service-oriented: offering training, collegial support, serving as an information portal for disciplines, and providing technology solutions ( , , , , , ).
the remaining features are either structural (appointments and staffing) or more oriented towards research and experimentation ( , , and to some extent ). based on the centers surveyed, the clir report concludes that broader-base initiatives, rather than siloed centers, may be more suited for meeting the needs of humanists, leveraging campus resources efficiently, and addressing large-scale community needs, such as long-term digital repositories.

two more recent studies have attempted to gauge the type and degree of interaction between digital humanities initiatives and libraries. the association of research libraries’ spec kit on digital humanities reports on the status of digital humanities within research libraries, with about half of the member libraries responding (bryson et al., ). the report finds that only % of libraries host a dedicated center for dh. more commonly, about half of the arl member libraries responding provide ad-hoc services, such as consultation, project management, or technical support, while one-quarter host a digital scholarship center that provides services to multiple disciplines, including the humanities. the authors suggest that libraries may be most useful for getting new dh projects off the ground (by providing pre-existing infrastructure) and for ensuring the long-term sustainability of projects (by bringing skills in digital management and preservation).

in a separate and ongoing effort, an imls-sponsored partnership between three graduate ischools (university of maryland college of information studies, university of michigan school of information, and university of texas austin school of information) and three nationally-recognized digital humanities centers (mith, cdrh, and matrix) maintains a crowdsourced spreadsheet of dh centers worldwide, with specific reference to their engagement with academic departments and libraries (ischools & the digital humanities, ).
as of november , nearly centers are listed, roughly half of them in the united states. of those centers, nearly half are located within libraries and another quarter maintain some informal relationship with libraries. outside of the u.s., library-hosted dh centers are much less common, and only a small number report informal ties to their library.

together, these studies suggest a wide range of models for institutional collaboration between libraries and digital humanities. in some cases, the choice of where to locate digital humanities may be arbitrary, academically speaking. it may have more to do with funding, local politics, or being first out of the gate at an institution rather than the location being chosen for more principled reasons. with this diversity in mind, we may now turn to the actual work of digital humanists to consider ways in which libraries and dh can be mutually supporting.

a conceptual model for digital humanities and libraries

as the reports cited in the previous section suggest, the work of digital humanists is diverse, and their collaborations with libraries idiosyncratic with respect to institutions. still, it is worth considering ways in which the work of digital humanists mirrors activities, resources, and skills found within many libraries. ben showers ( ), for example, highlights five areas of overlap between dh and libraries: managing data, “embedded” librarianship, digitization and curation, digital preservation, and discovery and dissemination. though these and other points of comparison are useful, a more conceptual comparison between dh and libraries would help locate these examples within a common schema and encourage both sides to envision further possibilities.

this section presents a conceptual model for digital humanities and libraries that is founded on a cultural informatics framework.
this term was first introduced by sengers ( ) to describe the “confluence of computation and humanities,” including both the ways in which computation could help cultural scholarship and the ways in which reflection on cultural background could change the development of technology (p. ). furner ( ) connects the term ‘cultural informatics’ to the specific way in which cultural heritage institutions (including libraries, museums, and archives) create, manage, and organize information artifacts. some of these artifacts are collected by institutions; others are created by the institutions themselves. this model stresses a continuum of information content associated with cultural heritage institutions. first, these institutions make available information artifacts produced elsewhere that are deemed worthy of preservation. in some cases, cultural heritage institutions may also create new information artifacts through research, reports, or the creation of digital objects from non-digital ones. all of these documents, broadly construed, represent information; the new products of cultural heritage institutions are no different, in principle, than the familiar sources of books, articles, images, sounds, recordings, sculptures, journals, notes, reports, and ephemera. the two are distinguished only by the site at which one is produced. in this sense, cultural heritage institutions create and make available “first-order” content.

second, cultural heritage institutions often work with content of a special type: “second-order” content, or content about the content of other information artifacts. this may include bibliographic records, resource guides, subject analyses, metadata, or even preservation data that facilitates the organization and understanding of information artifacts.
(preservation data is included here because it involves information about information artifacts in an organizational sense (e.g., put these documents in an environment below ◦), but preservation work itself seems to combine first- and second-order content by using second-order content to make available the first-order content of found artifacts.) it is worth noting that second-order content is often recorded in first-order artifacts, such as subject bibliographies, keywords, and encoded metadata. this is hardly surprising, since research of any kind (including second-order information) is often worthy of preservation. the work of analysis and organization produces the second-order content; the document itself may be treated as a first-order creation.

roughly speaking, we have here a distinction between pure content and pure representation, a distinction that often breaks down when examining any particular object. an archival letter may describe a map and how to use it, a scholarly article may point toward other sources via citation, and a visualization may contain as much interpretation and narrative in its design and presentation as it does first-order data that it represents. the point of this distinction is not to determinately classify information sources into one field or another; it is to capture the broad range of activities involved with the work of cultural heritage institutions. in some cases, they facilitate access (in a transparent way) to existing sources. in others, they engage in acts of research, analysis, and visualization and, in so doing, create new artifacts of knowledge. along this dimension of first- and second-order content, we can situate the traditional activities of cataloging, bibliography, collection development, preservation, subject analysis, and knowledge organization.

in addition to considering what kind of information is being produced or made available, cultural informatics also takes note of who or what is doing the producing.
at one end, it focuses on human actors who may be involved in communication, instruction, or other “manual labor” tasks at cultural heritage institutions. at the other, cultural informatics considers computer-driven technologies, such as automatic metadata extraction, online searching, and digital content management. these broad extremes are bridged by studies of human–computer interaction, which examines the many affordances that computing technologies provide to different users (card, moran, & newell, ).

on this dimension, it should be noted that many activities which start on the human side of things wind up drifting toward computation: card catalogs give way to search engines, manual classification is replaced by natural language processing. the history of automation suggests that tasks will generally be shifted from humans to computers to the extent possible for any given task. this trend does not imply that there is some fixed directionality to the map dynamics as a whole. on the contrary, each (technological) solution often brings with it a new (human) problem. technology may become more powerful, but it also brings with it increasingly specialized discourses and the need for teachers and translators of that technology. in some cases, computer innovations may enter the scene abruptly when it suddenly becomes possible to do some task that was impossible with mere human power (e.g., visualization allowing simultaneous representation of a million data points). these reflections suggest an equilibrium within the model: items may eventually accrue on the side of computation, but a snapshot of the field at any given time would probably reveal activities plotted across wide areas of the map. the overall model is thus a dynamic one, ranging over the shifting array of tasks and task locations. an overview of today’s field with respect to digital humanities is given in figure .

this model suggests a multiplicity of ways in which libraries and dh may support, engage, and create with one another. interestingly, current dh activities fall across a wide range of the map, and not merely the computational end. digital humanists may rely on libraries as much for access to digital collections and tools as they do for resource instruction and preservation. this overlap of first- and second-order content, human- and computer-powered work suggests that libraries and dh are indeed engaged in complementary activities (as commentators have suggested) and that dh has an enduring place within the world of libraries.

at the same time, not all digital humanists may engage in the full range of the activities listed in figure . this fact suggests that there is no singular answer from the perspective of library administration about how libraries should engage with dh. in some situations, a library would do well to focus on digitization and digital preservation; in others, it would do better to keep pace with emerging tools for text analysis. some dh support may be best accomplished by providing large-scale access to collections, datasets, or technology, while other situations may merit individual, customized collaboration with dh researchers (kamada, ).

[figure: a cultural informatics model for digital humanities and libraries]

though the broad question of dh and libraries has no determinate answer, it does not mean libraries are without guidance in how to support dh. after all, they are not without populations of users, users who bring with them particular information needs, and they are not without general strategies for library outreach, a longstanding tool for raising awareness of what libraries may offer.
discovery of user needs and fostering of new user populations both lie at the heart of user-centered librarianship.

an apology for local solutions

the lack of a general answer about how libraries can best engage with dh may be unsatisfying, but this also seems predicted by the user studies paradigm that has dominated the field for the past several decades. as several authors have pointed out, the user-centered tradition can be traced back to studies of scholarly communication in the s and s, which, to varying degrees, took stock of individual scholars’ information seeking behaviors (case, ; bates, ; talja & hartel, ). the user-centered tradition gained full steam with dervin and nilan’s seminal article, which called for a shift away from objective, mechanistic, and universal views of information needs toward more subjective, constructionist, and situated understandings ( , pp. – ).

table: neh digital humanities start-up grant criteria

• research that brings new approaches or documents best practices in the study of the digital humanities;
• planning and developing prototypes of new digital tools for preserving, analyzing, and making accessible digital resources, including libraries’ and museums’ digital assets;
• scholarship that focuses on the history, criticism, and philosophy of digital culture and its impact on society;
• scholarship or studies that examine the philosophical or practical implications and impact of the use of emerging technologies in specific fields or disciplines of the humanities, or in interdisciplinary collaborations involving several fields or disciplines;
• innovative uses of technology for public programming and education utilizing both traditional and new media; and
• new digital modes of publication that facilitate the dissemination of humanities scholarship in advanced academic as well as informal or formal educational settings at all academic levels.
(national endowment for the humanities, b)

rather than casting about for a general way in which libraries can fit in the larger dh movement, libraries can (and already do) focus on responding to the needs of their patrons. there is a well-established need for academic libraries and librarians to support faculty activities, most notably teaching and research, as well as student learning. these activities can be given further description within a digital humanities framework by examining the work that digital humanists actually do, much of which is described in the neh digital humanities start-up grants criteria (see table ). the guidelines are themselves significant because they reflect state-of-the-art work in dh and have been used to fund hundreds of projects to date—making them responsible, in no small part, for shaping the field. (it should be noted that guidelines for neh digital implementation grants follow essentially the same criteria but focus more on creating and supporting longer-term initiatives.) though the activities listed in table cover much of the ground of dh as discussed here, explicit recognition of the role of pedagogy is absent from the criteria. digital humanists are at the forefront of instructors using technologies to engage students in new forms of digital scholarship, communication, and dissemination of ideas. moreover, digital humanists are often responsible for training others in using particular tools or methods, particularly undergraduates, or for seeking instruction in those areas themselves. most often, this has been left to extracurricular skill-shares or workshops in which digital humanists can “catch up” on the latest trends. these tasks are far beyond merely providing technological resources, a model that pervades many it departments; they involve directed and creative uses of those resources, and the literacies required to sustain them.
libraries and librarians can fulfill a vital need here in supporting instructional technology and working with faculty to use technology more creatively in classroom settings. in addition to capturing the current work of dh, the activities listed in table also reflect a new type of academic library user that has emerged in the past decade, one that is focused on digital scholarship and research. this new type coincides with trends in other fields in terms of big data, access to datasets, and support for technology, including instruction. in this respect, a scientist seeking access to large databases for research and a digital humanist interested in text analysis using large corpora are quite similar in terms of information needs, and the role of libraries in providing such resources is basically the same. the major difference seems to be a historical one; science and technology-related fields have received this type of support more frequently in the past decade, while support for the humanities has been limited still to print collections or electronic journal articles. the growth in digital humanities offers an important opportunity to provide renewed support for the humanities and to bring library resources across the board up to speed with digital scholarship for the st century. though the possible roles for academic libraries within digital humanities seem relatively clear, engagement with dh in other types of libraries, particularly public libraries, may be quite different, at least from a user perspective. academic settings, particularly the institutions where digital humanities is growing, often have user populations that are technologically skilled, relatively speaking.
members of the public may also want new and exciting access to information—the very kind that digital humanities often brings—but others may simply rely on their libraries for more basic access to information, including job searches, research on immigration and legal procedures, internet and email, or child and youth programming. in some cases, these users may comprise a larger segment of the overall population, and there is a strong case for prioritizing these more basic needs over those of the most tech-savvy users. support for dh in non-academic libraries must be part of an overall needs assessment and may wind up taking a backseat to initiatives that serve a wider population of library users.

conclusion: from theory to action

the foregoing sections have attempted to locate digital humanities within the world of libraries in several ways: first by examining the institutional location of dh work, then by presenting a conceptual model of dh and lis, and finally by locating digital humanities within the overall user-centered paradigm of the field. at each turn, the points of connection between libraries and dh were varied and often dependent on the needs of particular faculty members (i.e., users) within an institution. though a general, cultural informatics model was presented, this model stresses the diversity of activities involved in dh and cultural heritage institutions and avoids totalizing recommendations about how such work is to be pursued. while this article has been focused on conceptual ties between libraries and dh, it is worth concluding with some more practical considerations about how such a model can be enacted. first, librarians (esp. subject librarians) can discover which of their users are working in digital humanities.
resources such as the humanities, arts, science, and technology advanced collaboratory (hastac) directory (located at http://hastac.org/members), which includes over , members, as well as social media sites (esp. twitter) can be useful for identifying local faculty with an interest in dh. second, librarians can attempt to survey the needs of these users (formally or informally), as well as faculty members in general, some of whom may be interested in digital humanities but unsure where to start. as part of this needs assessment, measures such as cost and impact may be considered. this method, again, suggests that different needs will emerge in different settings, even if faculty members bring diverse projects and issues with them. some of these needs may already be met by preexisting resources; others may require new purchases or changes in staffing. these needs and others may be compared to those plotted in figure , and some libraries may find it advantageous to focus on particular clusters of the grid, while others may find a more scattered approach to be justified. in particular, libraries would do well to identify mutually supporting activities, such as purchasing gis datasets together with offering gis workshops. although the landscape of digital humanities is complex and changing, libraries are well positioned to meet the needs of many digital humanists, both by expanding current offerings and by promoting existing skills and services that lie squarely within the field of library and information science.

references

american library association. ( ). ala’s core competencies of librarianship. retrieved from http://www.ala.org/educationcareers/sites/ala.org.education careers/files/content/careers/corecomp/corecompetences/finalcorecompstat . pdf bates, m. j. ( ). information science at the university of california at berkeley in the s: a memoir of student days. library trends, ( ), – . boot, p. ( ).
mesotext: digitised emblems, modelled annotations, and humanities scholarship. amsterdam: pallas publications. bryson, t., posner, m., st. pierre, a., & varner, s. ( ). spec kit : digital humanities. washington, dc: association of research libraries. digital humanities and libraries card, s. k., moran, t. p., & newell, a. ( ). the psychology of human-computer interaction. hillsdale, nj: lawrence erlbaum associates, inc. case, d. o. ( ). looking for information: a survey of research on information seeking, needs, and behaviour. london: academic press. cohen, d., & rosenzweig, r. ( ). digital history: a guide to gathering, preserving, and presenting the past on the web. philadelphia: university of pennsylvania press. dalbello, m. ( ). a genealogy of digital humanities. journal of documentation, ( ), – . doi . / dervin, b., & nilan, m. ( ). information needs and uses. in m. e. williams (ed.), annual review of information science and technology (arist) (vol. , pp. – ). white plains, ny: knowledge industry publications. digital humanities job archive. ( ). retrieved from http://jobs.lofhm.org fish, s. ( , december ). the old order changeth. the new york times. re- trieved from http://opinionator.blogs.nytimes.com/ / / /the-old-order- changeth/ fish, s. ( a, january ). the digital humanities and the transcending of mortality. the new york times. retrieved from http://opinionator.blogs.nytimes.com/ / / /the-digital-humanities-and-the-transcending-of-mortality/ fish, s. ( b, january ). mind your p’s and b’s: the digital humanities and interpretation. the new york times. retrieved from http://opinionator.blogs. nytimes.com/ / / /mind-your-ps-and-bs-the-digital-humanities-and-inter pretation/ furlough, m. ( ). some institutional challenges to supporting dh in the library. on furlough. retrieved from http://www.personal.psu.edu/mjf / blogs/on_furlough/ / /some-institutional-challenges-to-supporting-dh-in- the-library.html furner, j. ( ). cultural informatics. 
retrieved from http://furner.info/?page_id= galina russell, i. ( , august). the role of libraries in digital humanities. world library and information congress: th ifla general conference and assembly. lecture delivered from universidad nacional autónoma de méxico, san juan, puerto rico. retrieved from http://conference.ifla.org/past/ifla / -russell- en.pdf ischools & the digital humanities. (n.d.). retrieved from http://www.ischooldh.org ischools & the digital humanities. ( ). digital humanities centers. retrieved from https://docs.google.com/spreadsheet/ccc?key= alb dje v ncdfjuzf- vazgpovefoaf dwrvq rsqnc#gid= kamada, h. ( ). digital humanities: roles for libraries? c&rl news, ( ), – . available at http://crln.acrl.org/content/ / / .full kirschenbaum, m. ( ). what is digital humanities and what’s it doing in english departments? ade bulletin, , – . doi . /ade. . libraryloon. ( ). additional hurdles to novel library services. gaiva libraria. retrieved from http://www.inthelibrarywiththeleadpipe.org/ /dhandthelib/ marche, s. ( , october ). literature is not data: against digital humani- ties. los angeles review of books. retrieved from http://lareviewofbooks.org/ article.php?type&id= &fulltext= c. a. sula mla jobs tumblr. ( , september ). [web log post.] retrieved from http://mlajobs.tumblr.com/post/ /digital-humanities-asst-prof-in- american-or modern language association. ( ). guidelines for evaluating work in dig- ital humanities and digital media. retrieved from http://www.mla.org/ guidelines_evaluation_digital moretti, f. ( ). graphs, maps, trees: abstract models for a literary history. new york: verso. muñoz, t. ( ). digital humanities in the library isn’t a service. retrieved from http://trevormunoz.com/notebook/ / / /doing-dh-in-the-library.html national endowment for the humanities. ( a). information about grants made by the national endowment for the humanities [data files]. 
retrieved from http://www.data.gov/list/agency/ /∗ national endowment for the humanities. ( b). digital humanities start-up grants. retrieved from http://www.neh.gov/grants/odh/digital-humanities-start-grants. pannapacker, w. ( , december ). the mla and the digital humanities. chron- icle of higher education. retrieved from http://chronicle.com/blogpost/the- mlathe-digital/ / ramsay, s. ( , october ). care of the soul. lecture conducted from emory uni- versity. retrieved from http://stephenramsay.us/text/ / / /care-of-the- soul.html schreibman, s., siemens, r., & unsworth, j. (eds.). ( ). a companion to digi- tal humanities. oxford: blackwell. available at http://www.digitalhumanities. org/companion/ sengers, p. ( ). practices for a machine culture: a case study of integrat- ing cultural theory and artificial intelligence. surfaces viii. available at http://www.cs.cmu.edu/∼phoebe/work/papers/surfaces /sengers.practices- machine-culture.html showers, b. ( ). does the library have a role to play in digital human- ities? jisc digital infrastructure team. retrieved from http://infteam.jis– cinvolve.org/wp/ / / /does-the-library-have-a-role-to-play-in-the-digital- humanities siemens, r., & schreibman, s. (eds.). ( ). a companion to digital liter- ary studies. oxford: blackwell. available at http://www.digitalhumanities. org/companiondls/ talja, s., & hartel, j. ( ). revisiting the user-centred turn in information science research: an intellectual history perspective. information research, ( ) paper colis . available at http://informationr.net/ir/ – /colis/colis .html vandegrift, m. ( ). what is digital humanities and what’s it doing in the li- brary? in the library with a lead pipe. retrieved from http://www.inthe– librarywiththeleadpipe.org/ /dhandthelib/ zorich, d. m. ( ). a survey of digital humanities centers in the united states. washington, dc: council on library and information resources. 
retrieved from http://www.clir.org/pubs/reports/pub /reports/pub /pub .pdf

‘‘slavic studies and slavic librarianship’’ revisited: notes of a former slavic librarian

aaron trehub
auburn university libraries, auburn university, auburn, alabama, usa

this article revisits the author’s essay in solanus on the state of slavic librarianship at the turn of the twenty-first century in order to assess how the profession has changed in the interim. trehub notes that the single most important effect of the proliferation of new library information technologies has been a gradual shift in emphasis from curation to creation. the library may no longer be the first stop in a student’s research, but it remains the preferred venue for study, collaboration, social interaction, internet use, and access to and help with specialized resources. the job of librarians—including slavic librarians—is to build on their comparative advantage in these areas and re-integrate libraries into the students’ information workflow. the author suggests that the best way to maintain slavic studies’ viability as a discipline is to participate more fully and be represented more prominently in the technology-driven scholarly digital initiatives that are transforming librarianship in general.

keywords digital scholarship, slavic studies, information technology, information literacy, library, libraries

in , after almost twenty years of working as a slavic studies analyst and librarian, i traded in my shapka for a nascar baseball cap (so to speak) and accepted a position as the director of library technology at auburn university in alabama. so when the editors of seeir first contacted me about revisiting my solanus article for this special issue on slavic information literacy, my first reaction was to argue that my defection from the field disqualified me from commenting on more-recent developments.
on reflection—and with some persuasion—i decided that my experiences outside slavic studies might enable me to shed some light on and bring a different perspective to some of the issues facing the field today. this is especially true in the area of library technology, my current area of responsibility.

address correspondence to aaron trehub, assistant dean for technology and technical services, auburn university libraries, auburn university, mell street, auburn, al , usa. e-mail: trehuaj@auburn.edu

slavic & east european information resources, : – , copyright © taylor & francis group, llc issn: - print / - online doi: . /

before moving on to information technology, information literacy, and other library matters, i would like to fulfill the retrospective part of my charge and dwell for a moment on my treatment in the solanus article of the development of slavic studies and slavic librarianship in the united states— in particular, on the crucial importance i ascribed to the cold war and the us-soviet rivalry. this turned out to be an unexpectedly controversial point. although one can argue about the degree to which the cold war and us government policy shaped the field and helped to build the library collections that supported it, i believed that my nutshell history of slavic studies in the united states was essentially accurate. others took a different view, however, including the anonymous peer reviewers for this journal— which is how the article came to be published in solanus and not in seeir. i was therefore reassured to see, in the course of researching this piece, that my chronology and interpretation matched those of caryl emerson, professor of comparative literature at princeton university, who summarized the field’s history in an address to the annual convention of the american association of teachers of slavic and east european languages (aatseel).
‘‘the cold war was good for us professionally,’’ emerson said; ‘‘indeed, the cold war brought us into existence as a field.’’ she also acknowledged ‘‘the enormous role contemporary politics has always played in our establishment as a discipline.’’ this may be an unpalatable fact to some people in the field, but it strikes me as indisputable. equally indisputable are the dangers of prognostication, as i learned upon re-reading my article. in the course of rebuking francis fukuyama for proclaiming the ‘‘end of history’’ in , i promptly committed the same sin, writing with equally unfounded confidence that ‘‘it is clear that we have reached the end of history as we knew it in the years – .’’ one year after those words were published, the united states suffered the most devastating assault on its territory since the war of and embarked on its most costly and divisive military campaign since the vietnam war. i speak of course of the attacks of september , and the ensuing global war on terror—a war whose end is not in sight and whose ramifications are as far-reaching as those of the wars of the twentieth century, including the cold war. moral: never proclaim the end of anything based on current trends, especially where history is concerned. the same caveat may apply to my earlier comments about the future of slavic librarianship. at the end of the solanus article, i argued that slavic studies and slavic librarianship had experienced a post-cold war relapse into normalcy—a ‘‘withering into the truth’’ was how i showily put it, stealing a line from w. b. yeats. i also argued that the profession—slavic librarianship—was coping rather well with the fall from prominence of the field it serves. was i right about that then, and if i was, is it still true today? judging by the fact that slavic librarianship continues to survive as a distinct subspecialty, the profession appears to have successfully navigated the changes of the past decade.
however, since my involvement in the field these days is limited to lurking on the slavlibs e-mail forum—whose survival, by the way, could also be adduced as a sign of the field’s continuing viability—i must rely on some of my former colleagues for an assessment of the current state of affairs. and here the picture is mixed. the consensus among the people i asked appears to be that the profession is in pretty good shape considering the turmoil of the past fifteen to twenty years—or at least it’s in no worse shape than other area studies specialties. as one librarian put it to me in an e-mail, ‘‘slavic is changing just like everything else and i don’t think it is that different, other than the pace may be somewhat slower.’’ another former colleague has written of a ‘‘small renaissance’’ in slavic librarianship, with several new professionals entering the field in the past two or three years. i have come across other signs of continued vitality, such as the growing prominence of digital projects in slavic studies and slavic librarianship, a prominence that has found official embodiment in the digital projects subcommittee of the american association for the advancement of slavic studies (aaass) bibliography & documentation committee; or the andrew w. mellon foundation’s recent grant program aimed at promoting collaboration among scholarly publishers, including publishers in the field of slavic studies. and i was pleased to see that librarians at the university of california at los angeles and berkeley remedied a longstanding deficiency and published a comprehensive guide to slavic library and archival collections in the united states and canada in —the first guide of its type since . but there are also problems. in , i identified the main problems in day-to-day slavic librarianship as the erosion of bibliographic control, the disappearance of established vendors, inadequate acquisitions budgets, and ineffective exchange programs. 
although exchange programs appear to have lost their saliency—none of the slavic librarians i contacted for this piece mentioned them—the other three problems have not. indeed, problems with bibliographic control affect librarianship in general, not just slavic librarianship. for example, in late the us library of congress formed the working group on the future of bibliographic control. the working group issued its final report in january . the report focuses heavily on changes in information technology and recommends (among other things) that libraries ‘‘position our technology for the future’’ and ‘‘design for today’s and tomorrow’s user.’’ the consequences of failing to do so? ‘‘library users will continue to bypass catalogs in favor of search engines’’ and ‘‘resources needed to catalog at a sophisticated level [will become] increasingly difficult to sustain.’’ we are already seeing these trends at work in our library, and i know that we are not alone. the other two problems—disappearing vendors and inadequate budgets—remain as timely as ever. the sudden, unexpected, and disruptive demise of the russian press service in april is evidence of the former. for some slavic librarians, it revived memories of the turbulent days after the collapse of the soviet union in . brad schaffner, the head of the slavic division at harvard university’s widener library, expressed the hope that the demise of rps ‘‘is not the start of another round of collapse similar to the situation after the fall of the soviet union.’’ schaffner listed other points of concern: the weak dollar; prices for foreign publications outstripping already inadequate budgets; the rising cost of shipping materials (the immediate cause of rps’s going out of business); reports of declines in graduate enrollments in slavic studies programs; and moves to cut back, disperse, or dilute major slavic collections in the united states.
examples of the latter include proposals to dismantle the renowned slavic and east european library (and other area-studies libraries) at the university of illinois at urbana-champaign; and to turn the library of congress’s european reading room into an exhibit space. misguided as they may be, these initiatives point to a much larger issue: the erosion of confidence in the traditional mission of the humanities and the value of a liberal-arts education. in a recent article deploring the actual or impending closures of german departments at a large private and a medium-sized public university in california, will corral and daphne patai recalled the fate of the slavic languages and literatures department at the university of massachusetts at amherst, which was abolished in the s. ‘‘given the general lack of commitment to a coherent view of the humanities and their significance,’’ corral and patai wrote, ‘‘it is unavoidable that particular departments, especially in foreign languages, will be slated for elimination or revamping according to currently fashionable trends.’’ if corral and patai are right, then the implications in the long run for slavic studies—and slavic librarianship—are discouraging.

information technology, information literacy, and digital scholarship

but since, as j. m. keynes said, in the long run we are all dead (or at least retired), the real issue is what we can do to address the most important challenge facing slavic librarianship today. to quote brad schaffner again, ‘‘the biggest change to slavic librarianship’’ in the past decade ‘‘has come as a result of technical advancements.’’ as befits the head of a library with extensive collection development responsibilities, schaffner focused on the financial aspects of technological change, in particular the difficulty of supporting both digital and traditional collections on an analog-era budget.
however, i would suggest that it is just as important to figure out how we as a profession can best take advantage of new information technologies. this is where the notion of information literacy comes into play. the american library association defines information literacy as ‘‘the set of skills needed to find, retrieve, analyze, and use information.’’ more expansively, information literacy could be defined as the process of teaching people not only how to access and use new information technology, but also how to understand the way information is organized, so that they can assess for themselves its relevance and reliability and integrate it seamlessly into their work. readers who were educated in the analog era may be forgiven for spotting strong similarities between information literacy and the array of skills and practices described by jacques barzun and henry graff in their classic the modern researcher ( ). however broadly one defines it, adopting information literacy as a new liberal art at a time when the liberal arts are being ‘‘kicked off campus’’ may not be a winning strategy for slavic librarians, or librarians in general. still, there is general agreement that the advent of the new information technologies has fundamentally changed research, writing, and pedagogy. for example, when i took a slavic bibliography course at the university of illinois in the early s, the syllabus was almost exclusively paper-based (some microfilm was also involved), and much of it was late-nineteenth-century paper at that. by the time i left illinois in , the same course was heavily electronic, with a strong emphasis on online resources and digital collections. while it is not yet possible to search a fully indexed web facsimile version of v. i. mezhov’s russkaia istoricheskaia bibliografiia [russian historical bibliography], i am confident that, thanks to mass digitization, it soon will be. 
at that point the question will be: do all students who need to know about mezhov’s work actually know that it is available online? will they even know why they should consult mezhov in the first place? the growing importance of information technology and information literacy in slavic librarianship and librarianship in general is the result of three things: web browsers, affordable digital publishing tools, and google. what we have witnessed in the past fifteen years is nothing less than the birth and growth of an entirely new communications medium. i would argue that the web has already surpassed previous media—radio, cinema, and television—in its power, immediacy, and influence on our lives. blogs, wikis, social networking sites, and free or inexpensive digital content-management software have given millions of people the ability to be not only their own printing presses and publishing houses, but also wire services, photo agencies, recording studios, movie production companies, and advertising firms. google and other search engines enable millions of other people to find these self-published works. more to the point for our field, google has enabled millions of people to bypass libraries completely in their search for information. and millions of people are taking advantage of the opportunity. numerous studies confirm what reference librarians observe in their daily work: that students overwhelmingly prefer google and other commercial search engines to the specialized information resources libraries offer. for example, an oclc survey conducted in revealed that % of the college students questioned began their information searches with google or another search engine. even more disturbingly for reference librarians, the same survey revealed that only a small percentage of students—in this case, %—say that they ‘‘consult librarians when seeking help from a trusted source.’’ this finding echoes a study of usage patterns at the harold b.
lee library at brigham young university. for those who believe that information literacy means knowing which tools to use for a given purpose, this is not necessarily a bad thing. google is a powerful tool whose use should be encouraged, when appropriate. after all, the issue is not where researchers start their search, but where they end it. still, the very fact that the library is no longer the preferred starting point for most research projects has left many members of our profession feeling marginalized. and not without some justification, since google searches do not always (or even often) lead to the library, or to the very expensive proprietary databases that we spend millions of dollars each year to license. one point has emerged pretty clearly from the welter of technology-driven change: the library is no longer perceived as the sole or ultimate source of authoritative information, or of guidance on how to use it. however, the library does appear to be the preferred venue for study, collaboration, social interaction, internet use, and access to and help with specialized resources. the job of librarians—including slavic librarians—is to build on our comparative advantage in these areas and re-integrate libraries ‘‘into the information workflow of our students.’’ the question, of course, is how to do that. in my solanus piece, i identified the collaborative production of bibliographies and other online reference resources as one way to take advantage of the new technologies. as a former bibliographer, i retain a fondness for the well-crafted marc record and respect the purpose of bibliography as an enterprise, which is to organize created knowledge and present it in a usable way. so i was dismayed to hear my colleague tom wilson, the associate dean for library technology at the university of alabama, proclaim ‘‘the death of bibliography’’ in his keynote presentation to the electronic resources and libraries conference in atlanta, georgia.
wilson’s point was that people want access to the thing itself—for example, the full text of an article, book, or book chapter—and not a bibliographic surrogate for that thing, however well-crafted it may be. in this he is surely correct, and thanks to google books, the open content alliance, and other mass-digitization projects, the vision he describes is rapidly becoming a reality. but does that mean that bibliographies (like those of mezhov) and other traditional reference tools have entirely outlived their usefulness? perhaps. in the solanus article, i touted the collaboration between the american bibliography of slavic and east european studies (absees) and its european counterpart, ebsees. this now appears to have been an infelicitous example. absees is still alive and apparently well: it continues to be compiled at the university of illinois at urbana-champaign with help from contributing editors at other institutions, and is being marketed by ebsco information services. unfortunately, ebsees has not survived the technological and administrative challenges of the past decade, most importantly the loss of its long-time editorial headquarters at the école des hautes études en sciences sociales in paris. according to the osteuropa-abteilung of the staatsbibliothek zu berlin—the bibliography’s current home—no new records have been or will be added to ebsees after december , . the bibliography is now a static resource, covering the years – . so the dream of combining the two bibliographies to form a comprehensive guide to scholarship in our field published ‘‘north of the rio grande and west of the oder-neisse line’’ is defunct. on the plus side, ebsees is getting a new search interface, with faceted browsing and tag clouds. but perhaps wilson’s obituary for traditional reference tools is premature. take, for example, wikipedia.
as everyone reading this article probably knows, wikipedia is a free, multilingual, communally-edited online encyclopedia. launched in january by jimmy wales (an auburn university alumnus) and larry sanger, it currently contains million articles ( million in the english-language version) and attracts over million visitors each year. so it would appear that there is considerable demand for (some) traditional reference resources, albeit in new guises. this is a demand that librarians can take advantage of. some are: ann lally and carolyn dunford recently published an article in d-lib magazine on using wikipedia to direct students to digital collections at the university of washington libraries. we have started doing the same thing at the auburn university libraries, adding links to our digital resources in alabama-related articles in the encyclopedia. slavic librarians might consider following suit, contributing articles and external links to wikipedia—or to its peer-reviewed counterpart, scholarpedia, or maybe even a discipline-specific encyclopedia (slavipedia?) or wiki. another way for librarians to integrate slavic collections into students’ information workflow is to partner with teaching faculty in slavic studies departments and incorporate library resources into blackboard, sakai, moodle, and other web-based course-management systems. helen sullivan has written about doing precisely that at the university of illinois at urbana-champaign. collaborating directly with teaching faculty to include library holdings in the course syllabus is hardly a new activity for librarians—it has been going on for decades—but web-based course-management systems increase its reach and effectiveness. they also give librarians the opportunity to incorporate subject-specific guidance on research skills, strategies, and approaches—i.e., information literacy—directly into the course syllabus, in a way that can be consulted by students remotely and at any time.
for example, the resource page for a course on nineteenth-century russian history would be a good place to link to a librarian-produced, web-based video introducing students to mezhov and other bibliographic resources. in other words, bi on youtube. stranger things have happened. finally, there is digitization and the creation of new digital resources. a oclc report on library technology trends predicted that ‘‘digitization…may emerge as the most significant new format trend by .’’ that prediction seems well on its way to being fulfilled. many academic and even public libraries have embarked on local digitization projects and are adding digitization to their list of routine activities. slavic collections are contributing modestly to this trend. i have already mentioned the digital projects subcommittee of the aaass bibliography & documentation committee. the inventory of slavic, east european, and eurasian digital projects at the slavic and east european library at the university of illinois at urbana-champaign currently lists over digital projects at libraries, museums, and archives around the world. among many other projects, it includes seventeen moments in soviet history (a multimedia timeline of the years between and ) at michigan state university, the prokudin-gorskii collection of color photographs from pre-revolutionary russia at the library of congress, the harvard project on the soviet social system collection at the harvard college library, and russia engages the world at the new york public library. as ex-slavicist abby smith famously asked some years ago, ‘‘why digitize?’’ there are a number of reasons:
- improving accessibility and visibility. digitizing the library’s unique holdings makes them available to the largest possible audience and encourages their use in teaching, learning, and research.
- ‘‘branding.’’ increasingly, many academic libraries offer an identical selection of commercial databases to their users. digitization allows libraries to highlight their unique collections and establish a distinctive identity.
- conservation. digitization reduces the need to handle fragile or light-sensitive originals.
these are all good reasons for instituting a digitization program, but i would add another, even more compelling reason: to strengthen the library’s connection with teaching faculty and especially students by involving them in projects that combine primary source materials with advanced information technology to produce high-quality online resources. in other words, digitization can be used to position the library as a center for the production of digital scholarship and an equal partner in the teaching process. it is interesting and encouraging that the impetus for such an alliance is coming from scholars, primarily in the humanities. one of the most eloquent champions of library-based digital scholarship is ed ayers, formerly a professor of history at the university of virginia and currently the president of the university of richmond. in the s, ayers conceived and supervised the design of the well-known digital repository, the valley of the shadow: two communities in the american civil war. in the years since then, he has advocated assigning weight to digital scholarship in promotion and tenure decisions—a crucial point—and has spoken eloquently and amusingly about the successes and failures of his attempts to fuse scholarship and teaching with digital technology.
in our own field, patricia hswe has encouraged slavic scholars and librarians to get involved in digital scholarship, highlighting the possibilities for ‘‘collaborative efforts between faculty, students, librarians, and programmers.’’ she has also touched indirectly on the bi benefits of digital scholarship, pointing out that today’s students ‘‘are wired, connected, and geared to go with their gadgets’’ and calling digital collections ‘‘an incredible resource for librarians who are looking for up-to-date user services and relevant approaches to user education.’’ of course, not all students (or librarians, for that matter) are as tech-savvy as stereotypes would suggest. and, despite its manifest benefits, even encouraging faculty to get involved in digital scholarship can be difficult. in my comments on a panel of papers on this subject at the aaass conference in boston in , i identified several obstacles, most of them having to do not with technology, but with culture. specifically, with academic culture: academic politics, rituals, incentive systems, and bureaucratic styles. information scientists might call this phenomenon social informatics, while proponents of slavic information literacy might include it among the professional competencies expected of savvy library patrons. whatever it’s called, an easy familiarity with the academic culture of a specific discipline (in this case, slavic studies) is at least as important as a knowledge of scanning resolutions, file formats, controlled vocabularies, markup languages, and metadata schemas for the success of such programs. i have often heard presenters at conferences on institutional repositories say that the technological challenges of creating a repository are trivial when compared with the social and cultural ones. in other words, culture trumps technology. my experience as the director of library technology and as a manager of digitization projects at auburn university tends to support that axiom.
the first problem is prosaic. senior (read: tenured) faculty members do not need to work on digital projects. they have cleared the tenure hurdle and can work on what they choose, at their own pace. they may regard digital projects with skepticism, or as yet another unlooked-for and unwelcome intrusion. this is especially likely to be true if they happen to be older or unfamiliar with information technology, born-digital scholarship, and new models of scholarly communication. i don’t want to be unfair to older faculty members who are both interested in and knowledgeable about the new information technologies. in general, though, getting senior faculty members interested in digital projects is a tough sell. for their part, junior (read: untenured) faculty members have other things on their minds. the main one is getting tenure. digital projects still do not have a lot of clout where promotion and tenure are concerned (although this is changing, slowly, at some institutions). there’s little or no prestige in doing them. indeed, depending on the departmental culture, getting too much involved in digital projects can be detrimental to a junior scholar’s academic career. at least that is the perception, and it is hard to overcome it when the professional stakes are so high. apart from these prosaic considerations, there are more-substantive reasons for junior and senior faculty members to balk at investing time and effort in digital projects. i mentioned skepticism. faculty members—especially faculty members in the humanities and social sciences—may wonder, sometimes justifiably, whether digital projects really contribute to better learning and research. they may ask where exactly the value lies for them and their research. as a colleague at one of the aforementioned conferences on institutional repositories put it, faculty members and other researchers tend to be focused on ‘‘me, me, and me,’’ often for good reason.
they are unlikely to be persuaded by arguments about broader accessibility, better preservation, application of standards, and opportunities for collaborative research and instruction. and they may find it difficult to see exactly how creating and ‘‘publishing’’ high-quality digital resources for teaching and research purposes—resources that may be based on the library’s or archives’ materials but that go beyond those materials by including additional information and scholarly apparatus such as essays or links to related resources—can help them to address the crisis in scholarly publishing, a crisis that has been brought about by publishers selling back to the universities their own intellectual output. finally, there are concerns about the pedagogical value of digital resources per se. when i first started working at auburn, i gave a pitch about the auburn library’s digital projects at a monthly meeting of the auburn university history department. the reception was polite but reserved. this may have had to do with my deficiencies as a salesman, but it may also have reflected genuine reservations about what i was selling. as i was demonstrating a set of digitized historical photographs from one of the library’s special collections, a tenured but youngish faculty member said that this was all very nice, but in his view it just reinforced one of the most pernicious trends among his students—namely, their reluctance to read books and other traditional printed materials. the only response i could offer at the time was to speculate that involving students in the creation of new digital resources would force them to engage with texts and other primary-source materials, since these are the raw materials for most digital projects. since then, i have seen that digital projects can indeed be used pedagogically to promote the same level of engagement with primary-source materials that close reading of traditional texts does.
perhaps just as importantly, such projects allow faculty to integrate new technology and digital resources into their own teaching, thereby presenting them with an ideal opportunity to modify their students’ information-seeking behavior, instead of complaining about it. it’s not all bad news and pushback, however. we have found that there are ways to get teaching faculty interested and involved in digital projects. here are some practical suggestions for promoting digital scholarship through the library:
- start a lecture series in the library on the new information technologies and their impact on teaching and learning. for maximum credibility, have the guest lecturers be faculty members at other universities who have actually done digital projects and incorporated them into their classes or research.
- get support from the institutional office of information technology or the academic departments to send teaching faculty to educause, the coalition for networked information, the digital library federation, and other conferences devoted to information technology, libraries, and higher education in general.
- work on getting digital projects recognized in the promotion and tenure system at your institution. if you are a librarian, ask your dean or director to work with his or her counterparts on campus to put this on the administration’s agenda. until digital projects count in the tenure calculus, it will be an uphill battle to get faculty (especially junior faculty) to spend time on them.
- if gift resources are available—a big if—and the terms of the gift permit it, consider diverting money from a gift fund or endowment to create a modest grant program to support faculty-initiated digital projects in the library. the cornell university libraries have had such a program in place since .
to date, the faculty grants for digital library collections program has supported over twenty digital projects involving faculty from different departments at cornell, including an ongoing project based on the library’s holdings of underground polish publications from the solidarity era. the auburn university libraries launched a much smaller pilot version of the cornell program last year, using interest from a gift fund. the pilot program’s first project—a digital collection devoted to a pioneering american female philanthropist of the late nineteenth and early twentieth centuries, based on biographical research conducted by an auburn history professor—will be unveiled later this year. a second digital project—with an east european focus, as it happens—is in the works and should be ready to be made public in early . the grant amounts in both cases have been modest: between $ , and $ , per project, with most of the money being used to hire student assistants and fund research trips. the libraries are supporting the projects through staff time, consultation, and the use of the libraries’ digital production facilities. the collections are being hosted on library server computers and will be added to the auburn university digital library. there are other areas in which librarians can make a positive contribution to digital projects and digital scholarship. one is digital preservation. digital preservation is the flipside of creating digital collections. although it does not have a direct connection to bibliographic instruction, information literacy, or research, it is an increasingly important part of any digitization program.
with support from the institute of museum and library services (imls), auburn university is currently partnering with the alabama department of archives and history in montgomery and five other colleges and universities around the state to create a geographically distributed digital preservation network using stanford university’s lockss (lots of copies keep stuff safe) software. the alabama digital preservation network (adpnet) currently contains identical copies of selected digital collections on server computers at all seven member institutions. more content is on the way, and we hope to recruit more members to the network in the coming year. there is no reason why the major slavic collections in the united states could not join forces and set up a private lockss network (pln) for the preservation of digital collections in slavic studies.

conclusion: from curation to creation

re-reading one’s past work is educational if not always pleasant, and i would give my solanus article an overall grade of b+. in retrospect, i think i got more things right than wrong, and i am gratified to see that i devoted almost half the article to the effects of the new information technologies, including their potential negative effects. on the minus side, i think that i overestimated the importance of traditional bibliography and grossly underestimated the importance—or at least the popularity—of google and improved search technologies. and my perception of slavic studies’ viability as a discipline may have been too sanguine. it seems to me that there is little that slavic librarians can do to forestall the feared decline of area and language studies or the humanities in general. what they can do is to participate more fully and be represented more prominently in the technology-driven trends that are transforming librarianship and indeed other professions as well.
looking back over my twenty years as a librarian, i would say that the single most important development in that time has been a shift in emphasis from curation to creation. that is, from acting as the organizers and stewards of content created by other people and agencies to becoming content creators in our own right. stewardship and preservation of the materials entrusted to us is an honorable and important job, now perhaps more than ever. but i would like to see us—and here i rejoin temporarily the ranks of practicing slavic librarians—get more involved as a profession in designing and producing digital scholarship. in this connection, it is perhaps not a bad thing that some slavic librarians are leaving their home discipline for broader areas in librarianship and scholarly communication and are devoting themselves to promoting information literacy (in its broadest sense) for students and faculty members. at the beginning of this piece, i referred to my defection from the field. i now think that was the wrong word. venturing beyond the field would have been closer to the mark—and with the possibility of return. to close with another quotation, this time from one of osip mandel’shtam’s last poems: ‘‘my vernemsia eshche, razumeite!’’ [we’ll be back—count on it!]. notes . aaron trehub, ‘‘slavic studies and slavic librarianship in the united states: a post-cold war perspective,’’ solanus, n.s., ( ): – . . caryl emerson, ‘‘slavic studies in a post-communist, post / world: for and against our remaining in the hardcore humanities,’’ slavic and east european journal , no. ( ): , . see also victoria e. bonnell and george breslauer, ‘‘soviet and post-soviet area studies’’ and ellen comisso and brad gutierrez, ‘‘eastern europe or central europe? exploring a distinct regional identity,’’ both in the politics of knowledge: area studies and the disciplines, ed. david l. 
szanton, university of california press/university of california international and area studies digital collection, edited vol. , (berkeley, ca, – ), http://repositories.cdlib.org/uciaspubs/editedvolumes/ . . trehub, ‘‘slavic studies and slavic librarianship,’’ . . michael brewer, e-mail to the author, april , . . bradley schaffner, ‘‘message from the chair,’’ association of college and research libraries, slavic and east european section, newsletter ( ): – . . miranda remnek, ‘‘the aaass bibliography and documentation committee’s working group on digital projects,’’ slavic & east european information resources , no. ( ): – . . scott jaschik, ‘‘new collaboration for scholarly publishing,’’ inside higher ed, december , , http://insidehighered.com/news/ / / /mellon. . allan urbanic and beth feinberg, a guide to slavic collections in the united states and canada, special issue, slavic & east european information resources , no. – ( ). this is the first comprehensive guide to slavic collections in north america since melville j. ruggles and vaclav mostecky, russian and east european publications in the libraries of the united states (new york: columbia university press, ). . the working group’s charge, composition, and final report can be found at working group on the future of bibliographic control, http://www.loc.gov/bibliographic-future/. . richard amelung et al., on the record: report of the library of congress working group on the future of bibliographic control (washington, dc: library of congress, ), , http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan -final.pdf. . bradley schaffner, e-mail to the author, april , . . miranda beaven remnek, e-mail to slavlibs discussion group, february , ; jennifer howard, ‘‘scholars question library of congress’s plan to relocate a reading room,’’ the chronicle of higher education: daily news, march , , http://chronicle.com/daily/ / / n.htm. . will h.
corral and daphne patai, ‘‘an end to foreign languages, an end to the liberal arts,’’ chronicle of higher education, june , , a , http://chronicle.com/weekly/v /i / a .htm. . american library association, association of college and research libraries, ‘‘introduction to information literacy,’’ http://www.ala.org/ala/mgrps/divs/acrl/issues/infolit/overview/intro/index.cfm. for a skeptical take on information literacy, see stanley wilder, ‘‘information literacy makes all the wrong assumptions,’’ the chronicle of higher education, january , , b , http://chronicle.com/ weekly/v /i / b .htm. . jacques barzun and henry f. graff, the modern researcher (new york: harcourt, brace, ). . jeremy j. shapiro and shelley k. hughes, ‘‘information literacy as a liberal art: enlightenment proposals for a new curriculum,’’ educom review , no. ( ), www.educause.edu/pub/er/review/ reviewarticles/ .html/; jack miles, ‘‘three differences between an academic and an intellectual: what happens to the liberal arts when they are kicked off campus?’’ cross currents , no. ( ): – . . cathy de rosa et al., college students’ perceptions of libraries and information resources: a report to the oclc membership (dublin, oh: oclc online computer library center, april ), – . . de rosa et al., college students’ perceptions, – . . patricia a. frade and allyson washburn, ‘‘the university library: the center of a university education?’’ portal: libraries and the academy , no. ( ): – . . google’s effect on reading habits and attention spans is a separate but very interesting question, unfortunately outside the scope of this paper. see nicholas carr, ‘‘is google making us stupid?’’ atlantic monthly, july/august , http://www.theatlantic.com/doc/ /google. . ann m. lally and carolyn e. dunford, ‘‘using wikipedia to extend digital collections,’’ d-lib magazine , no. – ( ), http://www.dlib.org/dlib/may /lally/ lally.html. . thomas c. 
wilson, ‘‘e-resources: enigma or dilemma, or both,’’ (keynote address, electronic resources and libraries conference, atlanta, ga, march ), http://smartech.gatech.edu/handle/ / . . trehub, ‘‘slavic studies and slavic librarianship,’’ . . european bibliography of slavic and east european studies (ebsees), http://ebsees.staatsbibliothek-berlin.de/information.php. . wikipedia, http://en.wikipedia.org/wiki/wikipedia, s.v., ‘‘wikipedia.’’ . lally and dunford, ‘‘using wikipedia.’’ . scholarpedia, http://www.scholarpedia.org/. . an example is aaass bibliography & documentation committee, subcommittee on digital projects, the digital slavist, http://digitalslavist.xwiki.com/xwiki/bin/view/main/webhome. . helen sullivan, ‘‘slavic bibliography online: webct as a resource for online instruction,’’ slavic & east european information resources , no. ( ): – . . ‘‘five year information format trends,’’ oclc reports, march , . . aaass bibliography & documentation committee, subcommittee on digital projects, inventory of slavic, east european, and eurasian digital projects, http://www.library.uiuc.edu/spx/inventory/. . seventeen moments in soviet history, http://www.soviethistory.org/index.php; the empire that was russia: the prokudin-gorskii photographic record recreated, http://www.loc.gov/exhibits/empire/; the harvard project on the soviet social system online, http://hcl.harvard.edu/collections/hpsss/index.html; russia engages the world, – , http://russia.nypl.org/home.html. . abby smith, why digitize? washington, dc: council on library and information resources, , http://www.clir.org/pubs/reports/pub -smith/pub .html. . the valley of the shadow: two communities in the american civil war, http://valley.vcdh.virginia.edu/. . see, for example, edward l. ayers and william g. thomas, ‘‘time, space, and history’’ (paper, educause annual conference, dallas, tx, october , ), http://connect.educause.edu/library/abstract/timespaceandhistory/ . .
patricia hswe, ‘‘what you don’t know will hurt you: a slavic scholar’s perspective on the practicality, practicability, and practice of digital scholarship,’’ slavic & east european information resources , no. ( ): – . . see the section devoted to professional literacy on michael brewer’s web site, slavic information literacy for students, scholars, and educators, http://intranet.library.arizona.edu/users/brewerm/sil/prof/index.html. . cornell university, faculty grants for digital library collections, web site, http://dcaps.library.cornell.edu/facultygrants/; solidarność: cornell university library’s collection of polish underground publications, http://www.library.cornell.edu/colldev/slav/solidarityhome.html. . auburn university digital library, http://diglib.auburn.edu/. . lockss web site, http://www.lockss.org/. . alabama digital preservation network (adpnet), http://adpn.org/. . i am informed by the editors that the east coast consortium for slavic collections (eccsc) is spearheading a lockss-based initiative aimed at harvesting and preserving selected slavic e-journals in the regular lockss network. the eccsc is composed of several major academic and public libraries (cornell, dartmouth, duke, harvard, library of congress, new york public library, new york university, princeton, university of north carolina at chapel hill, university of toronto, and yale). . ‘‘kak po ulitsam kieva-viia…,’’ in osip mandel’shtam, sobranie sochinenii v trekh tomakh [collected works in volumes] (washington, dc: inter-language literary associates, ), : .

media, culture & society ( ) –
© the author(s)
reprints and permission: sagepub.co.uk/journalspermissions.nav
doi: . /
mcs.sagepub.com

digital media studies futures

ben aslinger
bentley university, usa

nina b. huntemann
suffolk university, usa

keywords
digital media, feminist, game studies, industry, new media, practice, theory

television and media studies embraced diverse methodologies as scholars sought to interrogate textuality, social contexts, economic and industrial imperatives, and policy and audience formations. media studies has always existed on the fuzzy boundary line (or border war, depending on your perspective) between the humanities and the social sciences; its hybrid nature has often provoked debates about the future of the field. current prescriptions for the future of media and cultural studies from john hartley and graeme turner provide new directions for researchers; our attempt here is to add to their provocations and think through some of the research and collaboration opportunities of digital media studies futures while pointing toward the economic, institutional, and disciplinary challenges of further enriching and hybridizing media studies. while we take seriously the critiques made by graeme turner ( ) against a small slice of scholarship on creative industries, new media studies, and/or convergence culture studies, we find it important to stress the variety and depth of digital media scholarship as well as the theoretical and methodological vigor of “new” media studies. we also want to stress a wider array of digital media scholarship than turner acknowledges and point to the ways that critical race, feminist, queer, postcolonial and globalization scholars interrogate digital poetics and politics and open up new pedagogical and research horizons. as scholars interested in game studies and digital media studies, we argue that future media scholars will need to do even more to talk and work across disciplinary boundaries. game studies and digital culture studies are not simply media studies.
corresponding author: ben aslinger, department of english and media studies, bentley university, forest st, aac g , waltham, ma , usa. email: baslinger@bentley.edu

computer scientists, software studies scholars, game designers, designers (in fields from interaction design to human computer interaction to industrial design), cultural anthropologists, sociologists, psychologists, and education policy and curriculum researchers, to name a few, are interested in creating and analyzing games and using games to address social and cultural issues. as scholars, we must rethink the ways that digital technologies ask us to grapple with diverse methodologies necessary to produce research questions and knowledge, and how conversations about digital culture must work within and among the sectors mentioned above. since game studies is not always media studies, how can media studies scholars participate in fields that transect technology, design, humanities, and social science departments, and in conversations that happen both inside and outside the academy? what methods of inquiry, what questions about participation, access, context, or engagement can we explore, and what do we need to learn to be able to participate in conversations about media cultures that are increasingly algorithmic and code-based? it is our contention that media studies scholars have important things to say about web-, network-, and game-based communication forms and experiences, but we risk irrelevance if we do not grapple with both the opportunity and the challenge of the digital.
as john hartley argues, media studies increasingly requires “new competencies,” “new horizons,” and “new problem situations” in order to explore the dynamics of systems that depend on technological protocols and scientific principles as much, if not more, than linguistic, visual, and sonic style ( : ). scholars, activists, and educators are using games and digital media to teach st-century literacies and approaching various media forms as opportunities for teaching system-based and design-based thinking. too many scholars to mention are working through the relationship between diverse but interlinked media platforms, technologies, and experiences, challenging “medium-specific” modes of analysis that have been central to some cinema and television studies approaches. new models of spectatorship, sharing, the dynamics of platforms, ecosystems of communication activity, norms and transgression, and distribution and circulation are being elaborated and debated. feminist/ queer digital media studies expands our understanding of the effect of networks on the creation and evolution of gendered and sexual subjectivities. but excitement about digital media can bleed into scholarly apprehension. scholars interested in the open and international flow of “born digital” work worry about geotracking; we need to recognize the often national contingency of material used in multimedia/digital scholarship. scholars interested in writing histories of emerging media wonder how technologies/platforms and post-structuralist historiographic method mix. researchers examining online and networked identities and cultures revise ethical guidelines for new types of research projects and wonder how to exercise good judgment when conducting research on provocative issues such as sexting, how to be fair to participants/communities, and how to resist the cooptation of research to feed moral panics.
the products of digital media are too often discussed as objects void of materiality – ethereal objects or simply “experiences” without a physical presence. however, richard maxwell and toby miller remind us: “before there can be a story to analyze, a message to decode, or a pattern to identify in collective or individual media use, there has to be a physical medium, a technical means of communication” ( : ). these technical means – cell phones, laptops, game consoles, e-readers, server farms, and so on – contribute to the increasing amount of e-waste generated every time we search, text, play, tweet, remix, or update our facebook statuses. digital media studies should not ignore the significant environmental impact of the material outputs of our digital lives. this involves tracing related objects through their entire life cycle – extraction, processing, assembly, distribution, consumption and disposal. digital media studies also means thinking through a variety of issues involving the nature, type, and speed of academic labor. how will media futures affect scholars and teachers working in a variety of contexts and types of institutions? how much do we need to know about coding and programming? if we didn’t learn coding or design skills in graduate school or our prior professional lives, how do we go about acquiring these skills within the frenetic rhythms of academic life? how do we combine established research methodologies with emerging ones (e.g. data visualization, big data, etc.)? given resource differences between institutions, who gets to work and play with new methods?
we love platforms such as flow, mediacommons, and in media res, but we believe that there needs to be a recognition that not everyone will be able to (or needs to) build their own platforms and that we need to find ways in the digital humanities and media studies to value both the creation and use of software and platforms. we are afraid that in an era of shrinking budgets, institutions and departments may be attracted by the allure of digital media studies’ “newness” and the way that “new” media courses attract students without realizing the significant investments of time and resources that doing digital media studies well requires. digital media studies depends on capital for the purchase and maintenance of computer lab/studio space and new forms of learning environments. we also want to make sure that “new” media studies remains a felicitous space for work on gender, sexuality, race, and class and for diverse scholars who challenge the too often described conflation/caricature of the new media scholar as an apolitical white heterosexual male academic. we also want to keep and expand “new” media studies’ focus on the particularity of place and conversation between disparately located researchers, as transnational and translocal flows have always been integral to “new” media analyses (in contrast to the lingering influence of the national cinema paradigm and the historical national focus of broadcasting in both commercial and public service broadcasting systems). new opportunities for pedagogy have been created by the emergence of new networks and platforms. how can we bring media production, even low-level production activities such as blogging, twitter, basic video editing, and simple game design, into critical studies classrooms?
this is not just to move beyond the traditional expository essay, but to allow students interested in media production, design, and expression to start using theories, concepts, and critiques as potential inspirations for narratives, experiments, and projects. pedagogical models that separate conceptual material from applied skills must be, at long last, replaced with the lessons learned from critical media literacy approaches that do not simply integrate theory with praxis, but fundamentally address these objectives as mutually constitutive. the process of creating something (a blog, a video, a game) will often reveal the constraints of a medium or set of practices, thus underscoring significant conceptual constraints that theory can explain. the limits of a theoretical approach are, similarly, best seen through application and case studies. this theory–praxis model should not only guide a curriculum program overall, but also be reflected in individual courses through what we would call critical production assignments. are there ways that the future of media studies could work out the production versus theory/studies issues that often plague our departments? theory–praxis models also extend to the relationships media studies scholars should seek to forge with industry, relationships that will benefit our teaching and our research. too often industry views the academy as outmoded and irrelevant, and sees scholars as misinformed, at best, and threatening, at worst. this is illustrated, for example, by the incorrect assumption often held by game developers that game studies researchers are universally investigating the negative effects of game play on young people. given the early history of game studies, particularly in north america, one can understand the origins of this view.
however, by pursuing opportunities to discuss our research with members of the industry at events such as the annual game developers conference (gdc) – as mia consalvo, jane mcgonigal and ian bogost did for years at their “game studies download” gdc session – we can correct misunderstandings about our work. this can facilitate access to the industrial processes digital media researchers wish to investigate. as we open industry–academy relationships, it is important, of course, to recognize that the objectives of our research will most often not align with the objectives of industry. and thus, collaboration is a negotiation between often incongruous goals. in the future, it will be harder to define oneself as a television, film, or game studies scholar, in part because economic imperatives in hiring are increasingly demanding multifaceted candidates. too many people to mention have weighed in on the issues articulated in this thinkpiece, but we hope that this piece elicits further conversation about what types of scholars we are and hope to be and how we can respond to the opportunity and challenge of digital media.
references
hartley j ( ) digital futures for cultural and media studies. malden, ma: wiley-blackwell.
maxwell r and miller t ( ) greening the media. new york: oxford university press.
turner g ( ) what’s become of cultural studies? thousand oaks, ca: sage.
downloaded from http://mcs.sagepub.com/ at sage publications on march ,

learning ecologies through a lens: ontological, methodological and applicative issues. a systematic review of the literature
british journal of educational technology doi: . /bjet. vol no – © british educational research association
albert sangrá, juliana elisa raffaghelli and montse guitert-catasús
albert sangrá is academic director of the unesco chair in technology and education for social change at the open university of catalonia. he is a researcher at the edul@b research group and full professor at the psychology and educational sciences department. at the uoc, he has served as director, methodology and educational innovation until , being in charge of the educational model of the university; director of the m.sc. program in education and ict (e-learning) ( – ), and director of the elearn center ( – ). his main research interests are ict uses in education and training and, particularly, the policies, organization, management and leadership of e-learning implementation, its quality assurance, and professional development for online teaching. juliana elisa raffaghelli is an associate professor at the faculty of education and psychology (universitat oberta de catalunya). she served as principal investigator at the project “professional learning ecologies for digital scholarship: modernizing higher education by supporting professionalism,” funded by the ministry of science, technology and university of spain under the program “ramon y cajal” ( - ). her research interests are connected to faculty development for the modernization of higher education, data literacy and the use of closed and open data models for the improvement of educational processes. montse guitert-catasús is full professor of psychology and education studies at the open university of catalonia (uoc). she is director of the digital literacy program and coordinator of the “ict competences” course since its inception ( ). she is a professor at the information and knowledge society doctoral programme; master in education and ict (e-learning) and master in open software at the uoc.
her research focuses on ict uses in education and training and, particularly, online collaboration, online teacher training and digital competences.
*address for correspondence: juliana elisa raffaghelli, faculty of psychology and education sciences, universitat oberta de catalunya, ramblas del poblenou , barcelona, spain. email: jraffaghelli@uoc.edu
abstract
the concept of learning ecologies emerged in a context of educational change. while the “learning ecologies” construct has offered a broad semantic space for characterizing innovative ways of learning, it is also true that its potential to promote innovative educational interventions may have been hindered by this same broadness. based on this assumption, in this paper the authors carried out a systematic review of the literature on learning ecologies with the aim of analysing: (1) the varying definitions given to the concept, including the ontological perspective underlying the phenomena studied; (2) the methodological approaches adopted in studying the phenomenon; and (3) the applications of the research on this topic. throughout this analysis, the authors attempt to describe the criticalities of the existing research, as well as the potential areas of development that align well with the theoretical/ontological issues, methodological approaches and educational applications. the authors selected and analysed articles, which they then classified in a set of categories defined by them on a theoretical basis. moreover, in order to triangulate the manual coding, a bibliometric map was created showing the co-citation activity of the papers. the emerging picture showed significant variability in the ontological definitions and methodological approaches. in spite of this richness, few educational applications currently exist, particularly with regard to technology-enhanced learning developments.
most research is observational, devoted to describing hybrid (digital and on-site) learning activities that bridge the gap between the school and social spaces. furthermore, many of the studies relate to the field of secondary education, with fewer studies exploring adult learning and higher education. the studies dealing with professional development relate mostly to teachers’ continuing education. the authors conclude that the concept of learning ecologies could be used to address further experimental and design-based research leading to research applications if there is proper alignment between the ontological, methodological and applicative dimensions. the main potential of this strategy lies in the possibility of supporting learners by raising their awareness of their own learning ecologies, thereby empowering them and encouraging them to engage in agentic practices. this empowerment could help maintain and build new and better learning opportunities, which every learning ecology can incorporate, amidst the chaotic abundance that characterizes the digital society.
practitioner notes
what is already known about this topic
• the “learning ecologies” (le) construct, widely adopted in the last years, has offered a broad semantic space for characterizing innovative ways of learning.
• several meanings have been assigned to this construct, which in some cases may conflict, ie, formal learning spaces and tools as le, and informal learning across several contexts as le.
what this paper adds
• the authors hypothesize that the power of the topic lies in its ability to support models and practices by overcoming the rigid separation between formal and informal learning. in fact, le conceptualize the relationships between formal and informal as a continuum across several learning contexts, mediated by digital technologies.
• the paper introduces a systematic review of the literature on le in which the inconsistencies in the definitions of the construct are analysed together with the methodological approaches and the research applications.
• furthermore, the alignment between these three key elements has been studied in order to explain the weaknesses in realizing the full potential of the construct.
implications for practice and/or policy
• clearer definitions of le may encompass new models and tools for analysing technology-enhanced learning processes, supporting learning visibility and learners’ awareness of the connections between the formal and the informal and vice versa.
• new research on digital tools for self-diagnosis and the development of learning ecologies might align with a perspective of self-directed and self-determined learning as a way to progress in lifelong learning.
introduction
the abundance of resources in the open and social web has created unprecedented opportunities for learning. key attributes such as “complex,” “self-organized,” “connected” and “adaptive” have been applied to depict the range of conditions underlying the learner’s freedom of choice (kop & fournier, ; siemens, ). moreover, the possibility of blending digital activities with on-site activities has led to the hybridization of learning contexts, where the learners experience a sort of “continuum” while searching for resources, cultivating relationships and engaging in activities to help them achieve their own, more or less, conscious learning goals (esposito, sangrà, & maina, ).
most of the literature produced in the last two decades in the field of technology-enhanced learning and online and blended learning has increasingly emphasized the centrality of the learner. the learner’s intentionality to achieve knowledge and develop skills is the axis for interpreting the concept as a unified lived experience, as it makes sense of the multiple relationships and resources that comprise the learning activities. in this regard, the concept of personalized learning environments gained popularity, due to its operational alignment with the idea of learners’ initiative, self-regulation, self-organization and leadership (attwell, ; dabbagh & kitsantas, ). however, other important constructs that characterize the changing landscape of learning in the digital era emerged. the conceptualizations emphasized the idea of learning everywhere and at any time, based on the rising phenomenon of access to the internet and the use of mobile devices for learning. in this respect, “seamless learning” (sharples, ; wong, milrad, & specht, ) and “ubiquitous learning” (virtanen, haavisto, liikanen, & kääriäinen, ) led the way. this development further enriched the technological landscape while also contributing to the debate on formal, non-formal and informal learning (mocker, ), since the digital tracking of activities and the digital presence in more informal spaces such as social media made the incidence and importance of unstructured forms of learning for lifelong learning more and more visible. a need was also detected to renew formal instruction by integrating or recognizing informal learning together with the formal curriculum (cross, ; kamenetz, ).
the techno-educational debate was accompanied by other important pedagogical debates, which contributed to the idea of self-determination and free appropriation of the digital abundance, eg, heutagogy, or a pedagogy of adult self-determination and awareness of one’s own abilities to continue learning (blaschke, ). notwithstanding the impressive corpus of literature on technology-enhanced learning, as it stands today, the aforementioned constructs have shown their ability to describe specific areas of learning. moreover, the ongoing hybridization of learning, both in terms of the medium (digital, on-site) and the type of learning (formal, non-formal, informal), reveals the need to generate new theories and constructs that are able to embrace the changing nature of the phenomenon in question, namely, lifelong learning. another critical point to be considered relates to the so-called “pedagogy of abundance” (kop, fournier, & mak, ), where the learners select, freely and at their own convenience, the digital resources, tools and environments that they prefer. the somewhat naïve assumption “the more, the better” is a fallacy and represents a techno-determinist approach, which has already been criticized in the literature (selwyn, ). therefore, considering the situation described above, there appears to be a need to explore and develop constructs that can explain technology-enhanced, lifelong learning. furthermore, these constructs should establish effective methodological approaches and enable knowledge transfer to applications in education. in this paper, the authors will explore a concept that has been frequently adopted in the literature on pedagogical innovations: the “learning ecology” (hereinafter le) construct, defined as the sum of contexts where the learner self-directs her activity, cultivating relationships and using, producing and sharing resources.
moreover, an le is deemed hybrid, both in terms of the medium, since the lines between physical and virtual configurations are blurred, and in terms of type of learning, since it integrates formal and informal learning. while this concept has offered a broad semantic space capable of encompassing innovative ways of learning, it is also true that its potential to promote innovative educational interventions may have been hindered by this same broadness. based on this assumption, in this paper the authors carry out a systematic review of the literature with the aim of analysing: (1) the varying definitions given to the concept, including the ontological perspective underlying le; (2) the methodological approaches adopted to study the phenomenon; (3) the applications of the research results on this topic with regard to educational interventions. throughout this analysis, the authors attempt to establish the criticalities of the existing research, as well as the potential areas of development which align with the theoretical/ontological issues, the methodological approaches and the educational applications.
background: the concept of le in the literature
the ecological perspective was adopted in the social sciences in the early eighties through bateson’s pioneering interdisciplinary approach to the study of human behaviour in his work “steps to an ecology of mind” (bateson, ). a little later, bronfenbrenner ( ) characterized human development as a process based on interactions at several social levels in what he called “the ecological systems theory.” in his approach, bronfenbrenner described the individual’s ability to appropriate several resources for competence development.
while both the aforementioned authors see the sociocultural system as complex and multilayered and developing in the same way as an ecology does, bronfenbrenner’s perspective places importance on learner agency in relation to her engagement with self-development. since the emergence of these two important theoretical contributions until the present day, the ecological approaches concerned with teaching and learning issues in the digital age have yielded a range of terms and conceptual definitions. these definitions range from those that are strongly linked to the legacy left by studies on biological ecosystems, which characterize the school, the classroom and the web as ecosystems for learning, to those that treat the web as a new kind of learning environment or as a component in a more complex entanglement of individuals and tools, which constitute ecological components (esposito, sangrà, & maina, ). nonetheless, a common theme across several studies is the ecological perspective conceived as a cyclical, complex and emergent phenomenon (haythornthwaite & andrews, ). in her study on achieving technological fluency, brigid barron ( ) made an early and relevant contribution. she analysed how technological fluency was achieved across a set of contexts and in terms of resources, activities and relationships, which provided opportunities for learning in physical or virtual spaces (barron, ). she compared the levels of expertise with the types of contexts and the frequency of activity within them. she then gave a definition of le from a sociocultural perspective, in which the transitions of the individuals across a range of formal and informal contexts providing diverse learning opportunities (barron, ) can improve the understanding of the interdependence of the institutional and personal levels in the educational use of emerging icts.
as in barron’s study, many authors have characterized le as combinations of formal, non-formal and informal learning contexts (wilkinson, kemmis, hardy, & edwards-groves, ). nevertheless, the term has often been adopted to describe the emergent dynamics of learning within the classroom (crick, mccombs, haddon, broadfoot, & tew, ) or within e-learning environments (richardson, ). moreover, the term has been used in several fields of education, including technologies and gender (barron, ), ict skills development (barron, ), collaborative learning (hodgson & spours, ), designs for learning with technologies (luckin, ), learning resources for homeless populations (strohmayer, comber, & balaam, ), teachers’ professional development (sangrà, gonzález-sanmamed, & guitert, ; van den beemt & diepstraten, ), personalized learning and lifelong learning (maina & gonzález, ), youth civic engagement (ige, ) and ubiquitous learning in higher education (díez-gutiérrez & díaz-nafría, ). jackson ( ) also deserves particular consideration, as he explores the construct of learning ecologies and introduces the very interesting concept of lifewide learning. although his studies are not as empirical as the ones mentioned before, they are experientially rich and have served as an inspirational conceptual basis for many empirical studies. while considered a powerful tool which has already been applied in several ways, the concept of le for lifelong learning has to overcome some issues to realize its full potential. firstly, there is an ontological problem posed by the different ways of defining le as an empirical phenomenon (technological resources, digital spaces, learning networks, etc.) and in some cases based on subsidiary theories.
secondly, these differences have led to a variety of instruments and methods of study being adopted that require defining, including new educational research methods, such as public data-driven research (kimmons & veletsianos, ), among others. therefore, the researcher interested in applying this perspective might be puzzled by the variety of approaches and instruments. a systematic review of the literature on the topic, to our knowledge non-existent until now, might shed some light on the areas where deeper exploration is needed, and on the associations between research subtopics, instruments and the interpretation of empirical data for the advancement of the field.
methodology
study design and sampling
this paper provides a systematic review of the literature on the topic, based on the prisma workflow (moher, liberati, tetzlaff, altman, & prisma group, ). systematic reviews entail a specific process of appraising, summarizing and communicating the literature, while dealing with otherwise unmanageable quantities of documents. moreover, the process also attempts to control researcher bias in data collection and analysis (petticrew & roberts, ). following this approach, five scientific databases that index peer-reviewed research were scanned (full names and urls are shown in table ). these databases were selected due to their coverage of: (1) peer-reviewed empirical research; (2) social research; (3) educational research. within each database, we adopted the query “learning” and “ecolog*” without time or disciplinary constraints. this search yielded papers. from these, were overlaps and, once eliminated, papers were considered for the screening phase. in this phase, three researchers read the abstract and excluded the papers that were not relevant for the analysis envisaged.
the exclusion criteria were: (1) not a journal or conference paper; (2) not empirical research; (3) no english language version available (for an international audience to follow the analysis with transparency); (4) superficial usage of the le concept: construct mentioned, but not used for or central to the research; (5) dealing with ecologies as a topic in science education rather than a pedagogical approach; (6) unavailable document (requested or searched via the library). in this regard, some authors considered important by experts on the topic of le, like n. jackson ( ), have published conceptual books, which are frequently referenced in the empirical research targeted in this systematic review. according to the scheme above, papers were excluded and papers were considered for final analysis. figure shows the prisma workflow. appendix in supporting information shows the complete list of authors and papers selected.
data analysis
as for the analysis, the papers were coded and classified into different categories, as defined by three authors, and further discussed in a session within an extended research group with eight experts and four phd students. as can be observed, the fields identified attempted to capture: (1) the research identity (authors, title, year, source title, no. of citations, doi, document type, publication type, publication source, author keywords, research area, geographical area); (2) the research focus on learning (type of learning, educational level, pedagogical granularity); (3) the epistemological approach (ontological definition, underlying theories, methodological approach, research applications); (4) the evaluation of alignment within the epistemological approach (alignment between ontology, theories, methodological approach, theories and research applications). some of these categories were shaped on the basis of prior studies, cited in the “description” column; in any case, they were discussed and adjusted in the above-mentioned session. table presents the set of categories defined and then validated.
table : database for the classification of articles
fields [variables] | description | subfields [codes assigned]
authors | authors in the paper |
title | publication title |
year | year of publication |
source title | journal, conference or other information indicating the type and context of publication |
cited by | number of authors citing the publication under analysis |
document type | type of publication | article, conference paper, book chapter
publication source | scientific database where the publication was found: scopus (www.scopus.com), wos (web of knowledge, www.webofknowledge.com), doaj (directory of open access journals, www.doaj.com), eric (education resources information center, https://eric.ed.gov/), editlib (learning & technology library, https://www.learntechlib.org/) | presence [ ]/absence [ ]
abstract | synthesis of the research, as provided by the authors |
author keywords | specific words describing the content/focus of the research |
research area | the overarching disciplinary field where the research can be placed, based on scopus and wos definitions | social sciences, computer science, health sciences, engineering, psychology, arts & humanities (including linguistics)
type of learning | characterization of the learning processes according to their structure, from more structured and institutionalized, to more open and unacknowledged by the participants, according to mocker ( ) | formal, non-formal, informal, mixed
educational level | characterization of the educational level taking into consideration the lifelong learning spectrum | early education and care, school primary, school secondary, teacher education & professional development, professional learning, adult learning
ontological definition | the conceptual and empirical definitions supporting the construct of le adopted in the study | a space, an environment, a metaphor, network, contexts for learning, available resources, a set of elements (resources, relationships, activities), a timeline describing learning transitions, a learning identity and the expertise on which it is based, unclear ontological definition
underlying theories/models | the most relevant theories detected in the papers cited, where present. variable coded openly and subfields created inductively. | connectivism, socio-constructivism, actor–network theory, self-directed learning, lifelong learning, communities of inquiry, critical pedagogy, mixed theories, unclear theoretical positioning
methodological approach | methodological choices made by the authors; codes elaborated from raffaghelli, cucchiara, and persico ( ) | literature review, conceptual paper, qualitative observational, quantitative observational, mixed observational, qualitative interventionist, quantitative experimental, mixed interventionist, unclear methodological definition
research applications (impact) | the extent to which the selected study envisaged research applications, derived from the concept of le adopted. the overall concept was based on bastow, tinkler, and dunleavy ( ). the type of impact was derived from discussion | a framework to observe learning processes, a framework to develop self-diagnosis empowering learners to engage in lifelong learning, a framework to develop learning needs analyses to design educational interventions, a framework to develop digital tools and environments for learning
alignment | the extent to which there is a powerful relationship between the definition of le as a concept, the empirical research and the research applications (between ontology, methodological approach, theories and research applications) | powerful alignment, good alignment with some weaknesses, weak alignment, no alignment
after consolidating the categories, the authors analysed papers (almost % of the overall dataset of papers) and the interrater agreement was calculated. the cohen’s kappa obtained was . , which can be interpreted as “substantial agreement” ( . to . ). one researcher therefore proceeded with the codification of the remaining papers, adopting the criteria discussed within the research group. the data collected through the database (cf. table ) were processed by adopting two techniques. first, descriptive univariate and bivariate statistics were adopted to better describe and summarize the numerous variables studied in the literature, according to the classification in table . second, a co-citations map was created. this is based on a text mining technique aimed at understanding the relationships between the citations as the dynamic used in carrying out the research activity about a particular topic (van eck & waltman, ). the interpretation of this data would lead us to further characterize the trends in the corpus of literature ( papers) under analysis.
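the two quantitative steps named in this data analysis, the interrater agreement check and the counting of co-citations, can be illustrated with a minimal python sketch. it is not the authors’ procedure or the software they used: the function names, the category labels and the reference identifiers (ref_a, ref_b, ...) are hypothetical, and the co-citation count is only the raw pair tally that a mapping tool would start from, without any clustering or visualization.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed share of
    agreement and p_e is the agreement expected by chance from each
    rater's marginal code frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical code throughout; kappa is formally
        # undefined (0/0), so this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited together.

    reference_lists holds one list of cited-reference identifiers per paper.
    Each pair is counted once per citing paper, with the pair sorted so that
    (x, y) and (y, x) land on the same key.
    """
    counts = Counter()
    for refs in reference_lists:
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical codes assigned by two coders to six papers.
    coder_a = ["formal", "informal", "mixed", "formal", "informal", "formal"]
    coder_b = ["formal", "informal", "formal", "formal", "informal", "mixed"]
    print(round(cohen_kappa(coder_a, coder_b), 4))

    # Hypothetical reference lists of three papers in a corpus.
    corpus = [["ref_a", "ref_b", "ref_c"], ["ref_a", "ref_b"], ["ref_a", "ref_d"]]
    print(cocitation_counts(corpus).most_common(1))
```

in this toy data the two coders agree on four of six papers, which gives a kappa of 10/22 once chance agreement is removed; the pair (ref_a, ref_b) is co-cited by two papers, so it would appear as the strongest link in a map built from these counts.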
the bibliometric maps are based on three main elements: statistical analysis of written publications (often including text and data mining); methods of visualization (distance-based; graph-based; timeline-based) and digital tools supporting analysis and visualization. not only do the forms of visualization explore a current, static relationship, but they also highlight groups (clusters) that are “closer” within the relationship, as well as their progression, if we take into consideration the timeline. although bibliometric maps were not developed for conducting literature reviews, the associated techniques and tools allow scientific information about a given field of research to be processed in order to analyse a set of bibliographical references to identify research agglomerates and visualize their connections. in our research, the total number of citations and the relationships between the authors cited most often and all authors were extracted from the corpus analysed; a specific dataset was created and the outputs were processed using specialized software that delivers the bibliometric maps as output. the software tool citnetexplorer (http://www.citnetexplorer.nl/), which enables the co-citations to be analysed and visualized, was used to carry out this phase of the study.
figure : prisma workflow - selection of articles
results
the results are presented according to the four main categories of analysis explained in the previous section, combining the numerous elements in order to gain a better understanding of the findings. the dataset and a more complete set of dynamic representations of the graphs in this paper are available at tableau public (https://tabsoft.co/ jbcp s). as we observe in figure , the number of papers increases throughout the period until , and decreases in and .
the number of citations also decreases, which is an effect that can be expected (it takes years for a publication to accumulate citations). the year yields the highest number of cited papers; taking into consideration the number of papers that year ( ), the attention is clearly focused on just a few authors. moreover, the latest contributions tend to adopt highly diversified sources of reference. interestingly, the three most cited contributions pertain to three very diverse disciplinary fields: abd-el-khalick and akerson ( ), out of total citations, whose topic is science education; barron ( ), out of citations, whose topic is technological education through a social lens (gender and inclusion in technological fluency); and gutiérrez ( ), out of citations, whose field of research is sociolinguistics.

when considering research productivity measured through the number of citations, combined with the type of learning and educational level (shown in figure ), we see that most citations can be connected to the study of secondary school education, with out of citations (see, eg, barron, , ; gurung & rutledge ; shaw & krug ), followed by teachers’ professional development, with (see, eg, van den beemt & diepstraten ); and higher education, with (see, eg, dron, seidel, & litten ; okamoto, kayama, cristea, & seki ). these are the easiest and most commonly studied levels in educational research, since the subjects are often engaged through institutional programmes or design experiments in class. however, if we combine this information with the type of learning under study, we can see that in spite of the high level of activity showing the continuum between informal, non-formal and formal learning at secondary school level ( citations combining all or at least types of learning), most studies at other levels concentrate on formal contexts and types of learning ( / in teachers’ professional development; / in higher education).
interestingly, many papers study learning processes which are informal yet connected to secondary students engaged in formal learning. however, most papers adopt the le concept whereby it reflects a balance of resources used within a class, where the introduction of digital technologies, or the use of alternative community spaces or time, expands the space or learning environment.

figure : number of papers per year and number of citations per paper
note. the scale for the number of citations is shown on the left axis and the scale for the number of publications is shown on the right axis; moreover, the two scales are made compatible to allow comparisons. the x axis shows the timeline.

having characterized the research productivity in terms of full citations along the time span considered, as well as per type of learning and level of instruction where the studies were placed, we will now go in-depth by analysing how the le have been characterized in ontological and methodological terms, as well as from the point of view of research applications. furthermore, we will consider the issue of alignment between the above-mentioned three attributes (ontology, methodology and research applications). lastly, we will analyse the co-citations between authors in order to search for recurrent information supporting our assumptions.

theories in le research

as for the theories adopted by the authors, we combined this information with the research areas and the educational level in order to see if there was a pattern in the use of theories (see figure ). as expected, most papers examining the research area and educational level adopted socio-constructivism as the main theory to support their work on le; see, eg, jocson ( ) and hernández-sellés, gonzález-sanmamed, & muñoz-carril ( ).
overall, the use of connectivism can be deemed relevant too (see, eg, jiménez cortés ; macleod, haywood, woodgate, & alkhatnai ). however, it should be considered that a significant number of papers were labelled as having an “unclear theoretical definition,” and this situation applied particularly to teachers’ professional development. it is also clear that the types of theories adopted are very diverse. if we only take into account the research area, it is evident that most papers fall under the area of social sciences and deal with pedagogy rather than with the specific teaching in a disciplinary field (see, eg, greenhow & robelia ; khau, de lange, & athiemoolam, ). however, research has been carried out that adopts the concept of le in the disciplinary areas of sociolinguistics, computer sciences and stem; see, eg, hibbert ( ) for the first area; tabuenca, kalz, and specht ( ) for the second; and johnston, southerland, and sowell ( ) for the third. in all these cases, highly diversified learning theories were used.

figure : type of learning × level of education × number of citations

ontological definitions in le research

we also investigated the ontological definitions adopted by the authors, which should shed light on how le were conceived as an object of study together with their connected empirical phenomena. we had hypothesized a rather uneven set of definitions. the theories underlying the various studies allowed us to imagine categories, which were theoretically elaborated concepts, to a greater or lesser degree. in fact, “resources for learning,” “sets of elements” or “learning environments” (eg, khau et al., ; okamoto et al., ) were less elaborated concepts and were more connected with the empirical phenomena of the technologies available or framing the learner’s experience.
“contexts for learning,” on the other hand, could be connected to the socio-constructivist approaches, using the idea of several social contexts (formal, non-formal, informal) where the learner interacts and builds knowledge. in the same vein, the learning ecology considered as a network refers to connectivist studies. while coding, a few other ontological definitions emerged (cf. table ), which were aligned with the complex picture already noticed while exploring the theories. as we observe in figure , consistent with the higher number of studies conducted under the theoretical approach of socio-constructivism, the ontological definition of “contexts for learning” appears in a significant number of studies ( / ), which is, however, identical to the number of studies where the concept of le is defined as “environment” ( / ).

figure : underlying theories combined with research area and educational level

methodological approaches

when merging the ontological definitions with the methodological approaches (cf. figure ), more relevant information emerges: most studies, independently of their ontological definition, are executed using conceptual approaches (no empirical research, / ; see, eg, johnston et al., ; okamoto et al. ) and exploratory, observational approaches ( / as qualitative observational, / as quantitative, / as mixed, with observational approaches totalling / ). very few papers adopt experimental/interventionist approaches ( / overall; see, eg, ozan, ; wong ), ie, attempting to modulate learner behaviour/opinions as well as to study the educational impacts of interventions. in a research topic spanning years of research, it is to be expected that the construct adopted would tend towards applied educational research (gorard, ).
we could assume here that development in the research field has been very slow, with most papers concentrating on observing and describing a phenomenon (the le) rather than confirming hypotheses and implementing experimental designs.

figure : ontological definitions and methodological approaches

alignment between ontological definitions, methodological approaches and research applications

lastly, we took into consideration the alignment between the theories adopted, the ontological definition of le and the methodological approach adopted to conduct the various studies. the aim of this focus of analysis related to our assumption that greater alignment could lead to better research quality and usage/applications. indeed, we considered the alignment together with the research applications. all in all, these two variables could explore the effectiveness of the research in the sense of putting the construct of le to work, addressing educational design, teaching and learning processes. not surprisingly, as shown in figure , le are mostly connected to conceptual applications, namely, to defining a framework for observing learning processes ( papers). the research alignment is considered mostly good with some weaknesses ( . , on a scale from to ). the next category relates to the applications of digital tools and environments for learning ( ), also rated mostly good with weaknesses ( . ). the third place is given to a high number of papers not defining any type of research application ( ) where the alignment could be deemed weak ( . ). the few remaining papers consider other types of application, such as defining learning needs analyses (two papers, with weak alignment [ . ]), defining educational interventions and assisting their implementation (one paper, with weak alignment) and a framework for training professionals (two papers, with good alignment with weaknesses [ . ]).
one paper developed a self-diagnosis approach to empowering lifelong learners that had an excellent level of alignment ( . ). this scenario supports the prior findings of a concentration of conceptual and observational studies, where the research applications have not been neatly developed. hence, while the observational approaches exploring and describing le prevail, the more “designerly” ways of research are almost non-existent.

figure : number of papers and level of theoretical, ontological and methodological alignment
note. the blue bar represents the level of alignment and shows the mean alignment score (scale = - , where is no alignment, = weak alignment, = good alignment with some weaknesses, = good alignment). the grey bars represent the number of papers in the category considered (ie, number of papers for “a framework to observe learning processes”). moreover, % and % of papers are concentrated in just two of the grey bars.

integrative analysis: co-citations map

the last type of analysis conducted on the papers sampled was the co-citations map. this was adopted as a method to gain a better understanding of the relationships and progress made in work relating to the concept of le. figure shows the co-citations map, where, at first sight, two main groups of nodes or authors cited (x axis) can be observed across a time span (y axis), with sparse elements at the centre and the beginning of the period ( ). the main and more compact group (also in terms of clusterization of nodes, which are shown in green) relates to the publications that cite the seminal work of barron ( ). these publications are mostly classified as using socio-constructivist theories and are placed in the area of social sciences, while other categories (such as the methodological approach and research applications and alignment) are more fragmented. there are four seminal works (abd-el-khalick & akerson, ; barron, ; okamoto & kayama, ; okamoto, kayama, inoue, & cristea, ) to which other papers can be connected. beyond the aforementioned work of barron, the other three works can be placed in the area of technology (development of e-learning environments) and stem education, supporting the idea of disciplinary fragmentation.

figure : co-citations map

discussion and conclusions

in our systematic review of the literature, we focused on three essential elements shaping the development of le as a research topic. these elements were the conceptual definition or the ontological perspective addressed by several studies, the methodological approaches and the applications of the research to several educational services, processes, practices, etc. these three elements were combined with dimensions characterizing the theories adopted and the disciplinary field or area of research as well as the context from which the empirical evidence was taken, such as the educational level and type of learning. according to our analysis of papers, we observed, firstly, that research in this field is growing at a slow but constant pace; however, there are some imbalances between the papers produced and coherent patterns of citation. while some of the works in the field are highly visible, others are somewhat submerged. this element, combined with other factors, shows a rather fragmented field where the concept of le could be said to be polysemic. as a matter of fact, the many theoretical approaches emerging, in spite of the small number of papers that focus on socio-constructivism combined with the secondary level and connectivism in higher education and adult learning, as well as the many ontological and methodological approaches observed, support the hypothesis of fragmentation. another important piece of evidence in this sense was the co-citations map, which showed separated areas of research (stem/computational focus and sociotechnical approaches). this form of fragmentation is quite usual, especially in fields that are multi- or interdisciplinary, as is the case with educational technologies and e-learning (sangrà, guàrdia, & gonzález-sanmamed, ).

with regard to contexts where empirical evidence was generated, it must be highlighted that most studies focused on the educational level of secondary education, in spite of interesting analyses observing the combinations of formal learning in class with informal learning activities complementary to the school. this situation aligns with the trend identified by zawacki-richter and latchem ( ), whose paper revisiting years of research in educational technologies found that most empirical research had been conducted on this educational level. it is evident that studies analysing the continuum between formal and informal learning in higher education and adult education, as well as vocational educational training, are still needed. moreover, most research dealing with professional learning focused on teachers’ professional development (tpd), with studies investigating formal learning processes within tpd activities; this result can evidently be connected to the facilitated access to formal contexts of learning (such as school and teacher education). this picture appears to show that the full potential of the concept of le remains unexploited. from the background literature analysed, it was made clear that people adopt technologies to flow across several experiences of learning, cultivating relationships and curating resources, and analyses focusing mainly on formal spaces show a very limited picture of these lifelong learning continuums.
however, the research analysed to date does not seem to provide strong evidence for research applications produced by educational interventions that make use of the concept of learning ecology. it seems clear that there is a need for further research to identify patterns that could lead to better educational designs in several fields (materials, resources, applications, guidance, etc.) across digital and physical contexts of learning to promote the visibility and development of le. in fact, since most studies are exploratory and observational, the analysis is limited to describing or explaining an existing le, and this is the case of studies with good alignment (even if there are no research applications, the ontological and methodological principles are in line with each other). however, the poor alignment observed in many studies seems to show that the concept is adopted only as an initial metaphor. going a step further, a number of authors have acknowledged the importance of making le visible (esposito et al., ; hernández-sellés et al., ; patterson, baldwin, araujo, shearer, & stewart, ), but, in line with the low alignment observed, very few studies provide design-based research that tests educational interventions based on the idea of the visibility of le. visibility of le is especially important to make learners aware of their le. as argyris ( ) stated, reflection is a key aspect of increasing the learning capabilities through double loop learning (acquiring specific skills and reflecting on the same achievements). moreover, as blaschke ( ) states, the visibility of le can promote learner empowerment in self-determining their own lifelong learning pathways (jackson, ). in fact, the concept of le could combine self-determined learning as a motivation for learning in the mid and long term, and self-directed learning as a motivation and direction of learning across immediately available contexts.
nonetheless, as we observed throughout the fragmentation of ontological perspectives, poor operational definitions hinder developments connected to visibility, indicating a possible way forward for future research. for example, one promising area of research could be connected to multimodal analytics based on learners’ activities in several learning contexts; the dashboards that help represent learners’ own le; and the opportunities for development. in this regard, the multiple apps helping learners to track activities in informal situations beyond the classroom could represent a new way of thinking about how learning processes bridge the formal and informal continuum along a timeline. however, if the research conducted on the basis of sociotechnical approaches is never linked to research in the computational sciences, it will be impossible to bridge pedagogical concepts and technological developments; this is a concern if we consider the situation of research on le as described in this paper.

our study is limited in offering a perspective of what le are or could actually be, but we have tried to show the critical issues that prevent progress in research on this topic, namely its fragmentation and lack of educational applications. furthermore, while the topic has advanced discussions on a conceptual basis, as is the case with jackson’s contribution through his books and online resources, our aim here was to show the problems of alignment between concepts and empirical research. this might be seen as a limitation on the study scope, ie, in covering the whole universe of scholarly work on le. however, it was a necessary step to achieve the systematic review’s goal.
the growing interest in le clearly demonstrates the attention paid by researchers to the need to overcome rigid separations between formal and informal learning, the digital and the physical, the pedagogical and the technological. a more rigorous conceptual and empirical alignment of research efforts would encompass clearer advancement towards models capable of characterizing patterns of ecological growth and maintenance, tools supporting the visibility of learning processes across contexts and tools to self-diagnose one’s own le and characterize the le of specific professional or disciplinary groups, etc. although the current situation does not address this potential, and the results of the systematic review suggest few current educational applications, there are some scenarios that could be considered promising.

on the one hand, there are a number of ongoing studies that could provide interesting contributions to the field from very different topics of research. studies that focus on the learning ecologies of entrepreneur mothers (johnson, ), online higher education students (peters, ) and media communication professionals (bruguera, ), which try to identify patterns in learners’ decisions and their inner motivations to learn, have the potential to support learning design for these emergent collectives, taking into consideration preferred and new resources that could result in new learning opportunities.

on the other hand, different methodological approaches can be envisaged: research-based designs resulting in frameworks and artefacts that could be used for self-diagnosis supporting learners’ awareness raising on their learning ecologies; longitudinal studies that could provide a wider picture of the learning ecologies of specific collectives and their lifespan approaches to learning, etc.
in the current landscape, new data-driven techniques and artificial intelligence could provide new tools to analyse the learning ecologies not only for diagnosis, but also with predictive and proactive approaches. however, more in-depth and extensive research on the topic is required, as le could become a lens for seeing more clearly how people organize their means of learning, namely, how they make decisions on what and how to learn. for learners’ autonomy and freedom entail the richest forms of learning, and it is the educational endeavour to capture and support such forms without limiting them. this is the potential enclosed in the construct of learning ecologies.

acknowledgements

the authors wish to thank the colleagues of edul@b for the insightful discussions about the concept of learning ecologies, and the research assistant alicia puig fernández for her valuable contribution. this research has been partially funded by the project “ecologías de aprendizaje a lo largo de la vida: contribuciones de las tic al desarrollo profesional del profesorado” (eco learn), spanish ministry of economy and competitiveness i+d (edu - ); the project “cómo aprenden los mejores docentes universitarios en la era digital: impacto de las ecologías de aprendizaje en la calidad de la enseñanza” (eco learn-he), spanish ministry of economy and competitiveness i+d (edu - -r); and the project “professional learning ecologies for digital scholarship: steps for the modernisation of higher education”, spanish ministry of economy and competitiveness, programme “ramón y cajal” ryc- - .

statements on open data, ethics and conflict of interest

the data processed and analysed in this paper is openly shared as open data. the main dataset has been published on tableau public, allowing the interested reader to take a look at dynamic visualizations.
moreover, there is a copy of data at zenodo, added in this paper as reference (fernández & raffaghelli, ; raffaghelli & fernández, ). the data can be used and shared citing the original work. all the information contained in the datasets is public and does not refer to sensitive information. the whole research project eco learn is compliant with the uoc’s institutional ethics committee guidelines. the authors declare no conflicts of interest.

references

abd-el-khalick, f., & akerson, v. l. ( ). learning as conceptual change: factors mediating the development of preservice elementary teachers’ views of nature of science. science education, ( ), – . https://doi.org/ . /sce.
argyris, c. ( ). behind the front page. san francisco, ca: jossey bass.
attwell, g. ( ). personal learning environments – the future of elearning? lifelong learning, (january), – . retrieved from http://www.elearningeuropa.info/files/media/media .pdf
barron, b. ( ). learning ecologies for technological fluency: gender and experience differences. journal of educational computing research, ( ), – . https://doi.org/ . / n -vv - rb - va
barron, b. ( ). interest and self-sustained learning as catalysts of development: a learning ecology perspective. human development, , – . https://doi.org/ . /
bastow, s., tinkler, j., & dunleavy, p. ( ). the impact of the social sciences: how academics and their research make a difference. london, uk: sage.
bateson, g. ( ). steps to an ecology of mind. collected essays in anthropology, psychiatry, evolution, and epistemology. northvale, nj: jason aronson inc. https://doi.org/ . /
blaschke, l. m. ( ). heutagogy and lifelong learning: a review of heutagogical practice and self determined learning. international review of research in open and distance learning, ( ), – . https://doi.org/ . /j.system. . .
bronfenbrenner, u. ( ). ecological models of human development. readings on the development of children.
retrieved from http://www.psy.cmu.edu/~siegler/ bronfebrenner .pdf
bruguera, c. ( ). the opportunities of social media for professional development: an exploration of the learning ecologies of digital communicators. th eden research workshop, barcelona, – october, .
crick, r. d., mccombs, b., haddon, a., broadfoot, p., & tew, m. ( ). the ecology of learning: factors contributing to learner-centred classroom cultures. research papers in education, ( ), – . https://doi.org/ . /
cross, j. ( ). informal learning: rediscovering the natural pathways that inspire innovation and performance. san francisco, ca: pfeiffer.
dabbagh, n., & kitsantas, a. ( ). personal learning environments, social media, and self-regulated learning: a natural formula for connecting formal and informal learning. internet and higher education, ( ), – . https://doi.org/ . /j.iheduc. . .
díez-gutiérrez, e., & díaz-nafría, j.-m. ( ). ubiquitous learning ecologies for a critical cybercitizenship. comunicar, ( ), – .
dron, j., seidel, c., & litten, g. ( ). transactional distance in a blended learning environment. alt-j, ( ), – . https://doi.org/ . /
esposito, a., sangrà, a., & maina, m. f. ( ). chronotopes in learner-generated contexts. a reflection about the interconnectedness of temporal and spatial dimensions to provide a framework for the exploration of hybrid learning ecologies of doctoral e-researchers. elearn center research paper series. retrieved from http://journals.uoc.edu/index.php/elcrps/article/view/ /n -esposito-epub
esposito, a., sangrà, a., & maina, m. f. ( ). emerging learning ecologies as a new challenge and essence for e-learning. in m. ally, & b. khan (eds.), international handbook of e-learning volume : theoretical perspectives and research (pp. – ). london, uk: routledge. https://doi.org/ . /
fernández, a. p., & raffaghelli, j. e. ( ). co-citations map: literature review on learning ecologies, – . zenodo. https://doi.org/ . /zenodo.
gorard, s. ( ). combining methods in educational and social research. maidenhead, uk: open university press – mcgraw hill education.
greenhow, c., & robelia, b. ( ). informal learning and identity formation in online social networks. learning, media and technology, ( ), – . https://doi.org/ . /
gurung, b., & rutledge, d. ( ). developing and implementing technology-integrated triad model in content areas for alternative students: an exploratory study. society for information technology & teacher education international conference (vol. ). aace.
gutiérrez, k. d. ( ). developing a sociocritical literacy in the third space. reading research quarterly, ( ), – .
haythornthwaite, c., & andrews, r. ( ). e-learning ecologies. in e-learning theory and practice (pp. – ). london, uk: sage. https://doi.org/ . / .n
hernández-sellés, n., gonzález-sanmamed, m., & muñoz-carril, p.-c. ( ). teacher’s roles in learning ecologies: looking into collaborative learning in virtual environments. profesorado, revista de currículum y formación del profesorado (vol. ). force.
hibbert, l. ( ). language development in higher education: suggested paradigms and their applications in south africa. southern african linguistics and applied language studies, ( ), - . https://doi.org/ . / . .
hodgson, a., & spours, k. ( ).
collaborative local learning ecologies: reflections on the governance of lifelong learning in england (report). leicester: national institute of adult continuing education.
ige, o. a. ( ). rethinking students’ dispositions towards civic duties in urban learning ecologies. international journal of instruction, ( ), – .
jackson, n. ( ). the concept of learning ecologies. in n. jackson, & b. cooper (eds), lifewide learning, education & personal development (chapter a ). retrieved from http://www.lifewideebook.co.uk/conceptual.html
jackson, n. ( ). exploring learning ecologies. betchworth, surrey: chalk mountain. isbn - - - -
jiménez cortés, r. ( ). aprendizaje ubicuo de las mujeres jóvenes en las redes sociales y su consciencia de aprendizaje. prisma social: revista de investigación social, issn-e – , no. , (ejemplar dedicado a: tecnologías móviles en la educación y sociedad actual), (pp. – ). fundación is+d para la investigación social avanzada.
jocson, k. m. ( ). “put us on the map”: place-based media production and critical inquiry in cte. qualitative studies in education, ( ), - . https://doi.org/ . / . .
johnson, n. ( ). capacity development through informal learning: an exploration of the digital learning ecologies of canadian female entrepreneurs. th eden research workshop, barcelona, – october, .
johnston, a., southerland, s. a., & sowell, s. ( ). dissatisfied with the fruitfulness of “learning ecologies”. science education, ( ), – . https://doi.org/ . /sce.
kamenetz, a. ( ). edupunks, edupreneurs, and the coming transformation of higher education. white river, vt: chelsea green.
khau, m., de lange, n., & athiemoolam, l. ( ). using participatory and visual arts-based methodologies to promote sustainable teaching and learning ecologies: through the eyes of pre-service teachers. the journal for transdisciplinary research in southern africa, ( ), . https://doi.org/ . /td.v i .
kimmons, r., & veletsianos, g. ( ).
public internet data mining methods in instructional design, educational technology, and online learning research. techtrends, ( ), – . https://doi.org/ . /s - - -
kop, r., & fournier, h. ( ). new dimensions to self-directed learning in an open networked learning environment. international journal of self-directed learning, ( ), – . retrieved from http://sdlglobal.com/ijsdl/ijsdl . - .pdf#page=
kop, r., fournier, h., & mak, j. s. f. ( ). a pedagogy of abundance or a pedagogy to support human beings? participant support on massive open online courses. international review of research in open and distance learning, ( ), – .
luckin, r. ( ). re-designing learning contexts. technology-rich, learner-centred ecologies. london: routledge.
macleod, h., haywood, j., woodgate, a., & alkhatnai, m. ( ). emerging patterns in moocs: learners, course designs and directions. techtrends, ( ), – . https://doi.org/ . /s - - -y
maina, m. f., & gonzález, i. g. ( ). articulating personal pedagogies through learning ecologies. in b. gros, kinshuk, & m. maina (eds.), the future of ubiquitous learning (pp. – ). hershey, pa: igi-global. https://doi.org/ . / - - - - _
mocker, d. w. ( ). lifelong learning: formal, nonformal, informal, and self-directed. adult education, ( ), . https://doi.org/ . /
moher, d., liberati, a., tetzlaff, j., altman, d. g. & prisma group. ( ).
preferred reporting items for systematic reviews and meta-analyses: the prisma statement. plos medicine, ( ), e . https:// doi.org/ . /journal.pmed. okamoto, t., & kayama, m. ( ). a collaborative environment for new learning ecology and e-ped- agogy. in a. tatnall, j. osorio, & a. visscher (eds.), information technology and educational manage- ment in the knowledge society (pp. – ). boston, ma: kluwer academic publishers. https://doi. org/ . / - - - _ okamoto, t., kayama, m., cristea, a., & seki, k. ( ). the distance ecological model to support self/ collaborative-learning in the internet environment. in proceedings – ieee international conference on advanced learning technologies, icalt (pp. – ). ieee computer society. https://doi. org/ . /icalt. . okamoto, t., kayama, m., inoue, h., & cristea, a. i. ( ). the integrated e-learning system rapsody based on distance ecology model and its practice. journal of educational technology & society, ( ), – . ozan, o. ( ). scaffolding in connectivist mobile learning environment. turkish online journal of distance education, ( ), – . https://doi.org/ . / .ch patterson, l., baldwin, s., araujo, j., shearer, r., & stewart, m. a. ( ). look, think, act: using critical action research to sustain reform in complex teaching/learning ecologies. journal of inquiry and action in education, ( ), – . peters, m. ( ). the contribution of digital online higher education: student engagement in the con- tinuum between formal and informal learning. th eden research workshop, barcelona, – october, . petticrew, m., & roberts, h. ( ). systematic reviews in the social sciences. a practical guide. oxford, uk: blackwell. raffaghelli, j. e., cucchiara, s., & persico, d. ( ). methodological approaches in mooc research: retracing the myth of proteus. british journal of educational technology, ( ), – . https://doi. org/ . /bjet. raffaghelli, j. e., & fernández, a. p. ( ). systematic review on the research topic learning ecologies – dataset and analysis. 
zenodo. https://doi.org/ . /zenodo. richardson, a. ( ). an ecology of learning and the role of elearning in the learning environment (global summit). auckland. retrieved from http://unpan .un.org/intradoc/groups/public/documents/apcity/ unpan .pdf sangrà, a., guàrdia, l., & gonzález-sanmamed, m. ( ). educational design as a key issue in planning for quality improvement. in m. bullen, & d. p. janes (eds.), making the transition to e-learning: strategies and issues (pp. – ). hershey, pa: igi global. https://doi.org/ . / - - - - .ch sangrà, a., gonzález-sanmamed, m., & guitert, m. ( ). learning ecologies: informal professional development opportunities for teachers. in ieee rd annual conference international council for education media (icem) (pp. – ). https://doi.org/ . /cicem. . selwyn, n. ( ). editorial: in praise of pessimism-the need for negativity in educational technology. british journal of educational technology, ( ), – . https://doi.org/ . /j. - . . .x https://doi.org/ . /s - - -y https://doi.org/ . / - - - - _ https://doi.org/ . / https://doi.org/ . /journal.pmed. https://doi.org/ . /journal.pmed. https://doi.org/ . / - - - _ https://doi.org/ . / - - - _ https://doi.org/ . /icalt. . https://doi.org/ . /icalt. . https://doi.org/ . / .ch https://doi.org/ . /bjet. https://doi.org/ . /bjet. https://doi.org/ . /zenodo. http://unpan .un.org/intradoc/groups/public/documents/apcity/unpan .pdf http://unpan .un.org/intradoc/groups/public/documents/apcity/unpan .pdf https://doi.org/ . / - - - - .ch https://doi.org/ . /cicem. . https://doi.org/ . /j. - . . .x © british educational research association british journal of educational technology vol no sharples, m. ( ). seamless learning despite context. in l. h. wong, m. milrad, & m. specht (eds.), seamless learning in the age of mobile connectivity (pp. – ). singapore, singapore: springer. https://doi. org/ . / - - - - _ shaw, a., & krug, d. ( ). 
youth, heritage, and digital learning ecologies: creating engaging virtual museum spaces. conference on educational media and technology. edmedia + innovate learning (vol., ), aace. siemens, g. ( ). new structures and spaces of learning: the systemic impact of connective knowledge, connectivism, and networked learning. in encontro sobre web . . braga, portugal: universidade do minho. retrieved from http://elearnspace.org/articles/systemic_impact.htm strohmayer, a., comber, r., & balaam, m. ( , april). exploring learning ecologies among people expe- riencing homelessness. in proceedings of the rd annual acm conference on human factors in computing systems (pp. – ). acm. tabuenca, b., kalz, m., & specht, m. ( ). tap it again, sam: harmonizing the frontiers between digital and real worlds in education. in ieee frontiers in education conference (fie) proceedings (pp. – ). ieee. https://doi.org/ . /fie. . van den beemt, a., & diepstraten, i. ( ). teacher perspectives on ict: a learning ecology approach. computers & education, – , – . https://doi.org/ . /j.compedu. . . van eck, n. j., & waltman, l. ( ). visualizing bibliometric networks. in y. ding, r. rousseau, & d. wolfram (eds.), measuring scholarly impact: methods and practice (pp. – ). cham, switzerland: springer. virtanen, m. a., haavisto, e., liikanen, e., & kääriäinen, m. ( ). ubiquitous learning environments in higher education: a scoping literature review. education and information technologies, ( ), – . https://doi.org/ . /s - - - wilkinson, j., kemmis, s., hardy, i., & edwards-groves, c. ( ). leading and learning: developing ecolo- gies of educational practice. australian association for research in education international conference. retrieved from https://espace.library.uq.edu.au/view/uq: wong, l.-h. ( ). analysis of students' after-school mobile-assisted artifact creation processes in a seamless language learning environment. journal of educational technology & society, ( ), - . wong, l. 
h., milrad, m., & specht, m. ( ). seamless learning in the age of mobile connectivity. singapore, singapore: springer. https://doi.org/ . / - - - - zawacki-richter, o., & latchem, c. ( ). exploring four decades of research in computers & education. computers & education, , – . https://doi.org/ . /j.compedu. . . supporting information additional supporting information may be found online in the supporting information section at the end of the article. https://doi.org/ . / - - - - _ https://doi.org/ . / - - - - _ http://elearnspace.org/articles/systemic_impact.htm https://doi.org/ . /fie. . https://doi.org/ . /j.compedu. . . https://doi.org/ . /s - - - https://espace.library.uq.edu.au/view/uq: https://doi.org/ . / - - - - https://doi.org/ . /j.compedu. . . department of library services library home libraries & units libraries economic & management sciences education engineering, built environment & information technology health sciences humanities law mamelodi music natural & agricultural sciences theology veterinary science units access and lending learning centre library technical services marketing & quality assurance scholarly communications research commons special collections study collection search catalogue worldcat discovery accredited journals databases e-book collections e-dictionaries e-journals e-reference books google scholar past exam papers subject guides upspace (institutional repository) services bindery borrowing copyright digitisation interlending lending services home makerspace membership plagiarism printing/photocopying referencing research support room bookings special needs clients training wi-fi access my library space log in renew library items ask a librarian chat to a librarian subject guides nevada (sa textbooks) my tuks login about us                   annual reports code of conduct contact us history hours management library newsletter rules & regulations service pledge vision & mission rate us department of library services + / 
personalising twitter communication: an evaluation of ‘rotation-curation’ for enhancing social media engagement within higher education
condie, j.m, ayodele, i, chowdhury, s, powe, s and cooper, am
http://dx.doi.org/ . / . .

type: article
url: this version is available at: http://usir.salford.ac.uk/id/eprint/ /
published date:

usir is a digital collection of the research output of the university of salford. where copyright permits, full text material held in the repository is made freely available online and can be read, downloaded and copied for non-commercial private study or research purposes. please check the manuscript for any further copyright restrictions. for more information, including our policy and submission procedure, please contact the repository team at: usir@salford.ac.uk.

draft only. final version available on publisher’s website: https://www.tandfonline.com/doi/abs/ . / . .

please cite and reference the published version: condie, j.m., ayodele, i., chowdhury, s., powe, s., & cooper, a.m. ( ) personalising twitter communication: an evaluation of ‘rotation-curation’ for enhancing social media engagement within higher education, journal of marketing in higher education.

authors
condie, j.m, school of social sciences and psychology, university of western sydney, australia
ayodele, i, school of health sciences, university of salford, salford, uk
chowdhury, s, school of health sciences, university of salford, salford, uk
powe, s, school of health sciences, university of salford, salford, uk
cooper, a.m., school of health sciences, university of salford, salford, uk
correspondence details: j.condie@westernsydney.edu.au tel: +

abstract

social media content generated by learning communities within universities is serving both pedagogical and marketing purposes. there is currently a dearth of literature related to social media use at the departmental level within higher education institutions (heis). this study explores the multi-voiced interactions of a uk psychology department’s ‘rotation curation’ approach to using twitter.
an in-depth analysis of a corpus of tweets by curators ( staff, students, and guest curators) was carried out using a combination of computer-assisted and manual techniques to generate a quantitative content analysis. the interactions received (e.g. retweets and favourites) and type of content posted (e.g. original tweets, retweets and replies) varied by curator type. student curators were more likely to gain interactions from other students in comparison to staff. this paper discusses the benefits and potential limitations of a multi-voiced ‘rotation curation’ approach to social media management.

keywords: social media; twitter; learning; higher education; engagement; marketing

introduction

maintaining a successful presence across the web within social media spaces is an increasingly important component of the ‘business’ of higher education institutions (heis). social media platforms make universities more visible and accessible to new and existing stakeholders, as well as provide “a potentially good measure of how institutions position themselves to maximize prestige within a globally competitive field” (shields, , p. ). the emergence of league tables that rank universities for their use of social media, such as theunipod social media rankings (theunipod, ) and top uk universities on social media (rise, ), exemplifies this point. rankings on such league tables have been acknowledged by heis and promoted by those highly positioned for their online ‘influence’ (see university of salford, , for example). as such, there is an emerging literature around how heis are using social media to market themselves using a new set of rules for engaging with different stakeholder groups (e.g. constantinides & zinck stagno, ; fagerstrøm & ghinea, ; rutter, roper, & lettice, ). this paper contributes an analysis of ‘rotation curation’ to the emergent literature around the use of social media platforms for university marketing and engagement activities.
made popular by twitter accounts such as @sweden, ‘rotation curation’ offers a co-produced, multi-voiced approach to the management of a social media account/presence (christensen, ; vandenbroek, ). this paper focuses on the use of ‘rotation curation’ on twitter by staff and students within a uk psychology department.

many academics and educators have embraced ‘the participatory web’ to engage in a plethora of scholarly activities online (costa, a; weller, ). the use of social media, generally defined as “a group of internet-based applications that build on the ideological and technological foundations of web 2.0, and that allow the creation and exchange of user generated content” (kaplan & haenlein, , p. ), is a prominent feature of contemporary academic practices (graham, ). for research purposes, academics are using social media to create online ‘communities of practice’ (lewis & rush, ), to produce knowledge collaboratively (cooper & condie, ), to network outside of their own universities with interested individuals and groups (lupton, ), and to disseminate research findings and publications (rowlands, nicholas, & russell, ). although digital forms of scholarship enable academics to do their work differently, using social media for research purposes is far from straightforward (costa, b); the adoption and application of new technologies vary widely in relation to dynamic social and cultural processes (lewis, marginson, & snyder, ; snyder, marginson, & lewis, ).

within teaching, academics are using social media platforms to extend learning dialogues beyond traditional educational contexts (see dabbagh & kitsantas, ; dhir, buragga, & boreqqah, , for reviews).
social media can offer ways to immerse students in deeper learning experiences (graham, ), particularly mainstream platforms such as facebook and twitter which act as commonplace sites for everyday social interactions and social support during university study (deandrea, ellison, larose, steinfield, & fiore, ). while some educators are focused on ‘e-professionalism’, emphasising the risks and misuses of social media platforms for both students and staff, fenwick ( ) notes that social media opens up new possibilities for student professionalism, which may be harnessed for students’ future employability. academics and students are therefore operating within, and navigating through “a kaleidoscope of interconnected digital, open and social practices” (atenas, havemann, & priego, , p. ).

from an organisational perspective, the potential of social media communications for marketing and engagement with a wider range of stakeholders is being recognised (kuzma & wright, ; rutter et al., ) as heis continue to ‘experiment’ with social media marketing (constantinides and zinck stagno, ). a key area of focus has been student recruitment and how social media marketing might impact student choices of university study and institution. when compared to traditional communication tools such as a course brochure, social media has been found to play only a minor role in university choices and was ranked last by students in a list of informational resources (constantinides and zinck stagno, ). constantinides and zinck stagno ( ) attribute this low ranking to a lack of relevant content, arguing that a “simple presence in the social media space is not enough” and “two-way communication, dialog and engagement” (p. ) with prospective students should be sought. indeed, recent research indicates that more interaction with social media followers has a positive impact on student recruitment performance (rutter et al., ).
within an increasingly competitive higher education sector, online content generated by users engaged in scholarly activities on social media platforms can be considered as multi-purposed in meeting both marketing and pedagogical objectives (fagerstrøm & ghinea, ; krachenberg, ). as boyd ( ) notes, social media creates a ‘context collapse’ where the audiences engaged and the purposes of use are more blurred than previously experienced. currently, much of the literature evaluating social media use in teaching and learning within higher education institutions (heis) does not consider who else might be engaging with such user generated content and how marketing and pedagogical objectives overlap. when analysed, sewell ( ) found that the followers of the texas a&m university medical library twitter account went beyond affiliated staff ( . %) and students ( . %) to include external organisations ( . %), alumni ( . %), and libraries and librarians from other institutions ( . %). when teaching and learning occurs on more open and accessible platforms, interactions amongst the immediate learning community of staff and students can potentially be viewed by others thus serving wider engagement purposes and contributing to the institution’s brand ‘identity’ (mirzaei, siuki, gray, & johnson, ) or ‘personality’ (neier & zayer, ). learning communities on social media may also answer calls for more two-way, dialogical interactions within university social media marketing activities (constantinides & zinck stagno, ; rutter et al., ).

curating an engaged department

research on how academics use social media platforms has steadily increased at the individual level (e.g. costa, b; lupton, ; mewburn & thomson, ; veletsianos & kimmons, ) yet there is currently a dearth of published research at a departmental level around the use of social media within heis.
within departments, there is a need to decide upon which new technologies to adopt to engage with key stakeholders and to develop online communities for learning and knowledge sharing. such decisions are also relative to wider organisational policies and procedures for social media use by individuals and groups within and across the university; the institutional stance on the management, protection and production of broader brand identity (mirzaei et al., ) and university reputation (erskine, fustos, mcdaniel, & watkins, ; mcneill, ).

twitter is a seemingly prevalent social media platform for academic departments to build an online presence and a way in which to take ownership of a department’s ‘brand’ or ‘identity’ (palmer, ). junco, elavsky, and heiberger ( ) attribute twitter’s popularity amongst academics to its microblogging format, which enables public dialogue to be both ongoing and ubiquitous. academics can also be ‘more willing’ to engage with twitter for professional scholarship than facebook, which is reportedly reserved for more personal and private networking purposes (junco et al., ). indeed, knight and kaye ( ) found that twitter was a particularly successful platform for increasing conversations and interactions between staff and students. however, in research on uk engineering departments’ use of twitter, palmer ( ) found that a ‘megaphone’ style of tweeting was common, where information is broadcast in a one-way style of communication. engineering departments that interacted more with their twitter followers had a larger follower base and were mentioned more by other twitter users (palmer, ). palmer’s ( ) conclusions align with those researching social media marketing for student recruitment (e.g. constantinides & zinck stagno, ; rutter et al., ), that heis should move towards an interactional, two-way use of social media to facilitate discussion amongst the learning community and to sustain an active presence online.
as a general rule, kaplan and haenlein ( ) advise that social media use should be active, interesting and honest. to maintain an active presence, encourage two-way interactions and build engaged communities on twitter, some departments (e.g. (deleted to maintain the integrity of the review process), @nursingsuni) have adopted the ‘rotation curation’ approach, where different people curate the account, representing the department’s online presence and contributing to its brand identity and community in their own way.

in terms of how collaborative, co-produced approaches to twitter management play out in practice, one of the most well-known examples of ‘rotation curation’ is the swedish tourism board’s account @sweden. although co-curation enables a range of voices to be heard and provides a snapshot of different people’s views and perspectives, it also presents significant challenges. christensen ( ) highlights issues of representation and whether curators are representative of community members (everyday swedes/swedishness) and what the account then represents to the wider audience. the @sweden account also engages in commercial nationalism (christensen, ), which may echo the issues of commercialisation in higher education, and the appropriation of learning communities for marketing purposes. for vandenbroek ( ), the various attempts at ‘rotation curation’ are an example of ‘technological solutionism’, and the success of the @sweden account is situated within the country’s historical, cultural, and nation branding practices. in other words, what works for one country may not work for another. similarly, ‘rotation curation’ may work for one academic department but not for another, particularly when broader institutional communication policies impact upon academic autonomy and social media innovation (erskine et al., ; mcneill, ).
co-curation of twitter accounts, which aims to facilitate multi-voicedness and dialogue, arguably captures how digital scholarship encourages “individuals to question established norms and adopt new philosophies of practice that challenge conventions implicit in academic work” (costa, a, p. ) and reflects how social media represents a breakdown of the traditional conventions, hierarchies and norms of scholarly communication. research on how an active, interactional social media presence at departmental level can be achieved and sustained would therefore be useful. indeed, a number of questions arise in relation to academic departments’ uses of twitter, such as what kind of voices are developed for departmental accounts and how networks of learners and interested communities are formed. building a twitter presence and network requires significant resources, particularly in terms of time, and thus ways to encourage engagement with key audiences (e.g. current and prospective students, industry and community partners and organisations) and use the platform more effectively are required. attempts to engage hei audiences in social media spaces also need evaluating to guide current and future engagement efforts (constantinides & zinck stagno, ).

to further understand the impact of such investment in social media engagement on the platform, this paper focuses on the multi-voiced ‘rotation curation’ example of twitter use by an academic department from the (deleted to maintain the integrity of the review process), which is in the rise ( ) top for social media engagement for universities in the uk. in march , the psychology department at the (deleted to maintain the integrity of the review process) started a twitter account (deleted to maintain the integrity of the review process).
the department opted for a ‘rotation curation’ approach where a different person every week – students, lecturers, and researchers – represents the department and runs the twitter account in their own way. ‘rotation curation’ was also adopted as an opportunity for staff and students to interact in new, meaningful ways in online spaces ((deleted to maintain the integrity of the review process) ). the account therefore presents an opportunity to understand how both marketing and pedagogical objectives are being aligned in social media spaces. the (deleted to maintain the integrity of the review process) account continues to operate a multi-voiced approach and curators are identified by their personal twitter username referenced in the account’s ‘bio’ section. this multi-voiced approach, where students also represent their department’s twitter presence, stands apart from many departmental and institutional accounts in its dialogical style.

aims of the current study

this paper aims to contribute to the growing body of research on social media use within higher education contexts. it contributes specifically to efforts around co-produced and multi-voiced approaches to social media management, in particular the use of ‘rotation curation’. the research objectives are as follows:

• to explore the relationships between types of curator (student, staff, guest) and interactions gained across different stakeholder groups;
• to understand how different staff and students approach their curation of a departmental twitter account.

methodology

this research is situated within the emerging turn towards ‘big data’ approaches for analysing user-generated content from social media platforms. however, ‘big data’ functions as an umbrella term and “remains a loosely defined often nebulous term for large datasets that require complex technologies for the capture, storage and analysis procedures” (murphy & burman, ). like other recent studies (e.g.
sewell, ; stephansen & couldry, ), this research uses the concepts of ‘big data’ on a ‘small scale’ in analysing a singular twitter account linked to the psychology department at the (deleted to maintain the integrity of the review process). this project was set up to explore the first year of a departmental twitter account, to help understand any impact and how the account was being used. as the (deleted to maintain the integrity of the review process) twitter account is active most days throughout the year, a ‘cut off point’ for data generation was from the initial tweet (march ) to june in order to create a static, finalised dataset suitable for analysis. to obtain the dataset, the twitter archive was downloaded directly from the twitter account into an ms excel file, which provides the following information: tweet id; in reply to status id; in reply to user id; tweet time stamp; content of tweet and expanded url linked to tweet. some of the initial data from the twitter archive was modified to assist analysis. for example, the time stamp was segmented into year, month, day, hour, and minute to understand engagement levels at different times. additionally, further data was generated through manual coding to understand the account’s audience, the interactions between curators and audiences, and the content, which received higher engagement levels. manual coding was carried out by three researchers (deleted to maintain the integrity of the review process) to ensure inter-coder reliability and to manage the volume of data. the coding scheme was developed through pilot coding and group discussions around how best to define and code the data into suitable categories. the downloaded twitter archive does not indicate who ‘retweeted’ (shared) or ‘favourited’ content. therefore, an online social tool called tweet tunnel was used to obtain this information. tweet tunnel shows which users have retweeted or favourited individual tweets sent from a twitter account. 
however, there are some limitations to using tweet tunnel. for example, it is restricted in how far back in a twitter account it can retrieve data (i.e., the last , tweets). also, tweet tunnel is a live dashboard and is constantly updating its content in relation to tweets sent from the account under analysis. as such, manual coding was carried out by (deleted to maintain the integrity of the review process) for tweets that were unavailable in tweet tunnel so that ‘retweets’ and ‘favourites’ were available for the entire dataset.

overview of the rotation curation approach used on this account

account curators are provided with guidelines around how to use the account for the week that they are representing the department. the guidelines cover details around ‘what to tweet’, ‘managing mentions’, ‘participant recruitment’, ‘managing followers and following’, ‘profile’, and ‘password protocols’. the guidelines aim to ensure that curators feel supported to take their own approach and talk about their particular stance on psychology and university life. curators are encouraged to use twitter in a way that provides information but also encourages others to interact by contributing their personal views and experiences of what it is like to study/teach/research psychology. staff and students are recruited as curators through a number of methods (e.g. twitter, lectures, and word of mouth), and the allocation of weeks is random. although there were no incentives or rewards offered to curate the account, it is seen as an important opportunity for students to gain social media work experience and develop their digital presence and professionalism in preparation for the graduate job market. the guests approached to curate the account were known to departmental staff, either as former students or colleagues at other institutions.
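The timestamp segmentation described in the methodology (splitting each tweet's time stamp into year, month, day, hour, and minute) can be sketched as follows. This is a minimal illustration, not the authors' procedure, which was carried out in MS Excel. The timestamp format, the sample values, the function names `segment_timestamp` and `percent_per_hour`, and the extra `weekday` field are all assumptions introduced here.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

# Hypothetical timestamps standing in for the archive's time stamp column.
# The dates are arbitrary and do NOT reflect the study period.
SAMPLE_TIMESTAMPS = [
    "2000-01-03 09:15:00",
    "2000-01-03 09:40:00",
    "2000-01-04 13:05:00",
    "2000-01-05 18:30:00",
]


def segment_timestamp(raw, fmt="%Y-%m-%d %H:%M:%S"):
    """Split one timestamp string into the parts used for analysis.

    The format string is an assumption; a real export may differ.
    """
    ts = datetime.strptime(raw, fmt)
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
        # Not listed in the source; added to support day-of-week counts.
        "weekday": WEEKDAYS[ts.weekday()],
    }


def percent_per_hour(timestamps):
    """Return the percentage of tweets sent in each hour of the day."""
    hours = [segment_timestamp(t)["hour"] for t in timestamps]
    counts = Counter(hours)
    return {h: 100.0 * n / len(hours) for h, n in sorted(counts.items())}


if __name__ == "__main__":
    print(segment_timestamp(SAMPLE_TIMESTAMPS[0]))
    print(percent_per_hour(SAMPLE_TIMESTAMPS))
```

Grouping the segmented rows by a curator-type column before calling `percent_per_hour` would give the per-group breakdown that the hourly comparison relies on.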
anonymity and other ethical considerations

as twitter is open and publicly available, the university ethics panel stated that no formal ethical approval was required as this is a large community dataset run by staff and students within a single department. however, it was important that throughout the analysis, the contextual integrity of the data was respected. throughout this process, the ethical principles of the british psychological society ( , ) were adhered to in relation to data protection, confidentiality, and anonymity. for example, by focusing on the data as a whole (in relation to deleted to maintain the integrity of the review process) via the quantitative coding procedure, and only recording the type of person who was curating the account (in relation to broad categories of staff, student, and alumni), it is not possible to identify individual people who have curated the account from the coding even if they have included a name in their initial tweet. furthermore, the (deleted to maintain the integrity of the review process) bio information only contains the current curator’s username, which is not included in the analysis due to the cut-off date used to create a static dataset.

analysis

the main data analysis was carried out within ms excel. a quantitative content analysis was conducted to identify frequencies between different variables, e.g. type of curator compared to the type of tweet. in order to provide context to some of the results, a qualitative content analysis was carried out to provide examples that explicate the quantitative findings. all tweets used as examples have been anonymised to ensure the confidentiality of those curating and interacting with the account. to understand 1) the relationship between the type of curator and the style of content curation and 2) the relationship between the type of curator and the audiences engaged, we categorised the data into type of tweet (i.e. retweet, reply, and original), type of curator (i.e.
staff, student, guest) (see table ), and type of stakeholder group (e.g. external professional, professional bodies, alumni) to perform χ² analyses.

overview of the data set

each curator was responsible for the account and its interactions for one calendar week throughout term time and the holiday period. detailed demographics for curators are not collected in the management records of the account. however, during the study period, there were student curators (undergraduate and taught postgraduate), staff (within the directorate and postgraduate research students) and guests (external to the organisation). as such, the main focus of the analysis is linked to student and staff curation but, where relevant, the impact of guest curators is also highlighted. as can be seen in figure , across all curators, tweets began around am and then from am – pm the percentage of tweets per hour ranged from . % to . %, indicating a fairly consistent use of the account throughout the day across the whole study period. however, when this is broken down by curator type, staff and guests used the account to a greater extent in the morning than through the afternoon and evening. students differed in that they tweeted more consistently throughout the working day, which follows a similar pattern to the overall analysis of the account. one explanation is that some curators (i.e. students) may find it easier to integrate curation into their working day (e.g. when they already use twitter on a regular basis) whereas for others, it may be more difficult to tweet during daily activities and thus they curate at other times (e.g. on the daily commute).

figure percentage of tweets per hour of the day in relation to overall and by type of curator

for all activity of the account, the highest numbers of tweets were sent on mondays (the ‘handover’ day when the curator changes for the week), with this gradually reducing over the working week.
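
the hourly and weekday breakdowns described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function names and the sample timestamps are hypothetical and are not data from the study, whose analysis was carried out in ms excel.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def percent_by_hour(timestamps):
    """Return {hour of day: percentage of all tweets sent in that hour}."""
    counts = Counter(ts.hour for ts in timestamps)
    total = sum(counts.values())
    return {hour: 100.0 * n / total for hour, n in sorted(counts.items())}

def count_by_weekday(timestamps):
    """Return a Counter mapping weekday name to number of tweets."""
    return Counter(WEEKDAYS[ts.weekday()] for ts in timestamps)

# hypothetical timestamps for illustration only (not study data)
sample = [
    datetime(2015, 3, 2, 9, 15),   # a monday morning
    datetime(2015, 3, 2, 9, 40),
    datetime(2015, 3, 2, 14, 5),
    datetime(2015, 3, 4, 20, 30),  # a wednesday evening
]
hourly = percent_by_hour(sample)
weekday = count_by_weekday(sample)
```

running the same two functions separately on the tweets of each curator type would give the per-group curves that the figure compares.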
although the number of original tweets reduced over the course of a week, the number of retweets remained stable, suggesting a change in how curators generated content to post on the account over the course of the week. when looking at staff and students separately, both groups had peaks of activity on monday and wednesday. to further understand the change in behaviour over the week, we inspected the tweets qualitatively, which revealed that the first tweet of the week and the next few tweets are often original content. the first tweet, usually a hello and introduction linking to the curator’s own username on twitter, gets a high level of engagement in terms of retweets and favourites. for this account, the level of engagement then decreases across the next few original tweets, which could explain the shift towards retweeting in the latter stages of the week rather than continuing with original tweets to encourage engagement. students tended to provide personal reflections on writing assignments and attending lectures, giving an insight into their experiences within higher education and giving prospective students an insight into the course, whereas staff used the account to provide reflection on their week, but also as a way to try and engage the student population by discussing specific lecture content or linking to resources relevant to recent teaching experiences:

“thanks to this morning’s seminar i now understand the essay and feel less stressed about submitting my dissertation ethics form! #phew” (tweet by student)

“seem to have positive remarks for my poster, hope everyone is getting on alright #socialpsych” (tweet by student)

“@[twitter name] reading over lecture notes i tend to write possible exam questions, helps a lot! i like mind mapping too w/ pictures/colours :)” (tweet by student)

“i've seen a few 'stressed' comments recently as deadlines loom, so just wanted to wish everyone good luck.
#itllbeworthitintheend” (tweet by directorate staff member)

“wish i'd seen this before my face perception lecture! your brain thinks emoticons are real faces #mediapsych http://t.co/ bkwtxpxog” (tweet by directorate staff member)

although fewer tweets were sent at the weekends (sunday % and saturday % of tweets), the account was often active outside of typical working week hours (i.e. monday-friday am- pm). this reflects the continuous (non-time bound) style of twitter (junco, elavsky, and heiberger, ) and also how engagement is often beyond traditional learning contexts (e.g. long after a lecture or seminar). in addition to examining the impact of weekdays and weekends, we also explored interactions during exam periods and holidays on the account. it seems from the current data that interactions with the account were not significantly influenced by exam periods or other holidays as favourites and retweets by staff and curators remained consistent. the use of guest curators occurred outside of teaching periods; during this time it was found that staff and students engaged less with the account. using guest curators therefore appeared to have a limited impact upon the immediate learning community. however, we should be aware that one of the limitations of twitter is that the curator is not aware of who is simply ‘listening’ and not interacting with their tweets in terms of replies, retweets and favourites; a feeling of no or limited interaction could potentially impact negatively upon curation behaviour. staff and student curators showed different levels of use during teaching and non-teaching weeks throughout the analysed period. when student curators were managing the account during teaching-time, a weekly average of . tweets was generated, whereas staff curators generated an average of . weekly tweets during teaching-time. student curators’ weekly average was . tweets/week during non-teaching time and staff curators generated a weekly average of .
tweets during non-teaching times. as such it can be seen that students were, on the whole, more active in terms of generating tweets across all time periods in comparison to staff curators. thus the rotation of curators helps to keep the account active during all periods within an academic year while enabling staff and students to navigate the impacts of social media on workload. it also demonstrates the added value of student curators in making the account more active and more diverse in terms of content. overall tweets were sent within the study period, of these were retweets ( . %), were replies ( . %), and were original tweets ( . %). due to the higher number of student curators, the majority of the tweets were sent by students ( tweets), followed by staff ( tweets) and guests ( tweets).

table frequencies and percentages of type of tweet posted by curator type
type of tweet | overall number in the sample n= (%) | guest n= (%) | staff n= (%) | students n= (%)
retweets (the exact version of the original tweet without comment or alteration) | ( . ) | ( . %) | ( . ) | ( . )
reply (response to another person’s tweet) | ( . ) | ( . ) | ( . ) | ( . )
original tweet (original thought or message sharing) | ( . ) | ( . %) | ( . ) | ( . )
total for each group | ( ) | ( ) | ( ) | ( )

a (curator: guest, staff, students) by (tweet type: retweet, reply, original) χ² found a significant relationship between curator type and tweet type, χ² ( ) = . p < . . to further understand this relationship, we ran three separate x χ² analyses focusing on student and staff curators, and grouping the data into two categories: 1) retweets and non-retweets (i.e. reply and original together), 2) original and non-original tweets, and 3) replies and non-replies. we found that students more heavily relied on retweets than staff did, χ² ( ) = . p < . , while staff were more likely to post original tweets than students, χ² ( ) = . p < . . staff and students similarly utilised replies to other twitter accounts, χ² ( ) = . p = . .
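
a minimal sketch of how a pearson χ² test of independence of this kind can be computed is given here in python. the counts are hypothetical and are not the study's data, the function names (chi_square, p_value, collapse) are illustrative, and the sketch assumes no continuity correction; the paper states only that the analysis was carried out in ms excel.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value(stat, df):
    """Upper-tail probability, using closed forms for df = 1 and even df only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df % 2 == 0:
        half = stat / 2
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("this sketch only handles df = 1 or even df")

def collapse(table, keep):
    """Collapse each row to two cells: column `keep` versus all other columns."""
    return [[row[keep], sum(row) - row[keep]] for row in table]

# hypothetical counts; rows: guest, staff, students; columns: retweet, reply, original
counts = [[20, 10, 30], [100, 80, 220], [400, 150, 350]]

overall_stat, overall_df = chi_square(counts)            # curator type by tweet type
staff_vs_students = collapse(counts[1:], 0)              # retweets versus non-retweets
follow_up_stat, follow_up_df = chi_square(staff_vs_students)
follow_up_p = p_value(follow_up_stat, follow_up_df)
```

collapsing the table before the follow-up tests mirrors the grouping into retweets and non-retweets, original and non-original tweets, and replies and non-replies.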
one possible explanation for this finding is that students feel less expert than staff and therefore choose to share other people’s content through retweets, whereas staff can more easily take up the role of ‘expert’ when sharing content on twitter. to examine how these curatorial differences impact on the account’s audiences (i.e. stakeholder groups), we now explore the interactions generated through the multi-voiced approach to twitter curation. unfortunately, it is difficult to investigate the impacts of type of curator on retweeted items because there is no way of distinguishing between the retweets from the original creator and this curated account.

how does the multi-voiced approach impact upon interaction?

as previously noted, data generated from the (deleted to maintain the integrity of the review process) twitter account offer a distinct opportunity to examine the varying impacts of student and staff curators on interactions within and beyond the immediate learning community of the department. while it is difficult to analyse interactions on retweets, what we can explore is the influence of curator type on interactions with non-retweeted content (i.e. original tweets and replies) posted by the account. in this section, we explore how student and staff curators generate interactions (i.e. retweets and favourites) by looking at raw frequencies. we next adjust the interactions to account for the curator tweet frequencies to explore how curator type influences who engages with the account (i.e. stakeholder groups).

table frequencies and percentages of stakeholder group interactions in terms of retweets and favourites by curator type
group | stakeholder group | staff: retweets n= (%) | staff: favourites n= (%) | student: retweets n= (%) | student: favourites n= (%)
professionals | directorate staff | ( ) | ( ) | ( ) | ( )
professionals | university staff | ( ) | ( ) | ( ) | ( )
professionals | external professional | ( ) | ( ) | ( ) | ( )
students | [@twittername] student | ( ) | ( ) | ( ) | ( )
students | [@twittername] alumni | ( ) | ( ) | ( ) | ( )
students | other salford students | ( ) | ( ) | ( ) | ( )
students | other university students | ( ) | ( ) | ( ) | ( )
organisations | professional bodies | ( ) | ( ) | ( ) | ( . )
organisations | external organisation | ( ) | ( ) | ( ) | ( )
other | unidentified individual | ( ) | ( ) | ( ) | ( )
other | anonymous | ( ) | ( ) | ( ) | ( )

are students or staff more likely to generate interaction?

for the purposes of inferential analysis, we binned the stakeholders into higher order groups for analysis (see table ). overall, students were more likely to generate interactions from professionals (n = tweets) in comparison to staff (n = ), external organisations (n = ) in comparison to staff (n = ), and other students (n = ) in comparison to staff (n = ), χ² ( ) = . p = . . this finding is likely due to the greater volume of tweets posted by student curators than staff curators. therefore, we adjust the data to account for differences in overall tweet frequencies by scaling the student interactions to the ratio of student-to-staff tweets. a (curator: staff, student) by (stakeholder group: professionals, students and organisations) χ² showed a significant relationship between curator type and stakeholder groups, χ² ( ) = . , p = . (in the higher order group analysis, ‘other’ was removed given the uncertainty around who belongs to the anonymous and unidentified stakeholder groups). more specifically, student curators were gaining more interactions from student stakeholders (n = ) than staff (n = ).
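
the volume adjustment can be illustrated with a short python sketch. it assumes that the scaling multiplies each student interaction count by the ratio of staff tweets to student tweets, so that both groups are compared as if they had posted the same number of tweets; the function name and all numbers are hypothetical and are not taken from the study.

```python
def scale_to_staff_volume(student_interactions, student_tweets, staff_tweets):
    """Rescale student interaction counts to the staff tweet volume.

    Assumption for illustration: each count is multiplied by
    staff_tweets / student_tweets.
    """
    ratio = staff_tweets / student_tweets
    return {group: n * ratio for group, n in student_interactions.items()}

# hypothetical counts for illustration only (not study data)
student_interactions = {"professionals": 40, "students": 120, "organisations": 20}
scaled = scale_to_staff_volume(student_interactions,
                               student_tweets=2000, staff_tweets=500)
```

the scaled counts, rather than the raw ones, would then be entered into the curator by stakeholder group χ² analysis alongside the unadjusted staff counts.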
however, there was little difference in student curator interactions with professionals (n = ) and organisations (n = ) relative to staff interactions with these stakeholder groups (n = , n = respectively). when students curate the departmental account, a main benefit is increased student interaction. student curators may indeed be contributing to an increased sense of community and belonging among students around the brand identity of the department. it is evident that students on the course are engaging with the account, which can help to create a sense of community across different cohorts and help students to feel part of the identity of (deleted to maintain the integrity of the review process). consequently, accounts that are solely curated by one group of people, or those with an unidentified voice, may have limitations in terms of the breadth of their reach (palmer, ). given that heis aim to engage with a variety of stakeholders, the multi-voiced approach offers potential to reach new audiences and interact with existing connections. what is not determinable from the analysis of this account is whether being part of (deleted to maintain the integrity of the review process) positively impacts students’ employability in the future, either through raising their awareness of organisations that relate to psychology careers or through a stronger professional online network and presence.

what content is posted and how does it impact interaction?

what does the content look like?

student use of retweets and staff use of original content (see table ) could reflect that students (who on the whole were undergraduates) know less than staff when it comes to posting psychology and research-related content. there were only a small number of cases where students took a strong stance on psychology research; students tended to prefer to tweet psychology-related content that related to themselves and their learning experiences.
often their tweets sought to signpost followers to useful information sources:

“bps research digest is a great way of finding out about interesting new psych research. but you knew that already... http://t.co/d qd jrgo”

students received interactions on non-retweeted content (i.e. original tweets and replies) but staff had only sent . % as many tweets. x . = tweets.

‘rotation curation’ perhaps acts as scaffolding for students to develop their digital literacy skills in relation to professional conduct and presence online (fenwick, ). in taking up the role of ‘information sharer’, students can learn how to portray themselves online and gain confidence to express their knowledge in a public arena.

figure percentage of the type of tweet content for the overall tweet corpus and by curator type

due to the nature of the account and the department it represents, it is not surprising that the highest proportion of tweets relate to psychology as a discipline (see figure ). tweets related to current research in the news, existing concepts and theories that had been learned, activities/events occurring in and around the department, and humour (e.g. psychology-related memes).

“what do you think about early childhood attachments? how do they influence our adult relationships? #attachment #devpsy”

“preparing for our last recording for psychologyfm today on 'learning difficulties and the big society' #allfm kindly funded by the bps”

humorous tweets and those that contained images often generated more interactions in terms of retweets and favourites. staff curators were more likely to tweet about both internal and external events than students, whereas students were more likely to post content related to promoting recruitment opportunities inside the university.
although not an explicit aim of the account, across all curators, it is interesting to note there was little activity around external job adverts and external training courses, which may suggest that the account is not being used to raise awareness of ways to gain further training beyond university-provided initiatives. nor is it being used as a potential way to remain in touch with alumni and support their continuing training needs. in contrast to those affiliated with the department, guests were more likely to tweet research and policy content beyond psychology, which may also help explain why they engaged with more external organisations than other curators. having guests external to an organisation take part in a ‘rotation curation’ approach diversifies the account’s content and reach, particularly as they shared their @[twitter name] content into their individual twitter accounts and thus with their own networks. however, due to the small number of guest curators within the dataset, it is not possible to know the full impact of guest curation, or how curators found their @[twitter name] experience. on a closer inspection of the data, often guest curators would retweet their tweets using their personal accounts to encourage their wider followers to engage. two-way conversations on the account varied in relation to different curator groups. for the purposes of this analysis, a conversation on twitter was defined as at least three tweets forming at least a two-way exchange. overall, there were tweets that aimed to begin a conversation. conversations were most frequently generated through original tweets but also by replies to other people’s tweets and by quoting a tweet. “i realise this could ruin any hope i have of being productive today, but...what is your fave study/piece of research?
and why?” within the dataset, of the tweets aimed at beginning a conversation were curator-initiated tweets that led to a conversation on twitter and a further tweets were initiated by other students within the department. the conversation below also provides an example of how curators plan to use the account, and how current students’ tweets function as marketing information for prospective and future students: “#[department hashtag] interesting #enviropsych lecture with @[twitter name] critical thinking, realism and social constructionism wow! @[twitter name] @[twitter name] cheers for posting [name removed]; nice for new students to hear about some interesting topics they can look forward to @[twitter name] no problem, next week when i take over they will get a feel of what it's like to be a #[department hashtag] student for a week! @[twitter name] sounds great. having followed this account for a while now, i wish it had been around when i was a new student. v helpful. @[twitter name] the future is now twitter and #socialmedia as learning tools @[twitter name] @[twitter name] @[twitter name] i'm still working out the possible uses of social media (ive never posted a pic of a cat!), but i'll get there @[twitter name] @[twitter name] the possibilities are only as endless as your imagination and invitation!” despite staff generating more original content, it was found that students tried to instigate interactions and conversations with greater frequency but were often not successful in gaining responses. this may be an indication of some of the challenges of twitter as a platform for conversations about topics, and further research may be able to demonstrate if other social media platforms have greater success in relation to two-way interaction. within this corpus of tweets, there were only a couple of instances where the curator tried to organise a tweetchat (e.g.
using a hashtag - deleted to maintain the integrity of the review process) at a given time about a given subject. “there's lots of you thinking about or doing dissertations at the mo? fancy a #[twitter name] dissertation tweetchat on wed evening?” through the analysis, it was found that these often led to only limited discussion with those associated with the course and also external people. there are examples of other departmental and organisational accounts that appear to have greater activity in tweet chats (e.g. @nursingsuni or @nsmnss), which could be reflective of disciplinary differences in online scholarly communication (holmberg & thelwall, ) or that the ‘macro’ level of hashtag based tweeting is less relevant to the @[twitter name] community (bruns & moe, ). however, further research is needed around the differences between online communities to understand why some tweet chats succeed and others fail.

conclusion

this paper examines the use of twitter within a higher education institution (hei) at the departmental level, specifically an account that adopts the ‘rotation-curation’ approach where different people (students, staff and guests) curate a departmental twitter account each week. within the analysis, we examined the levels of engagement and reach of the account in terms of key stakeholder groups (e.g. professionals, students, external organisations). although social media interactions with different stakeholders are likely important to the immediate learning community of staff and current students, there is a need for engagement beyond the department and the university, given the competitive globalised field of higher education (shields, ). while student curators reached many of the key stakeholder groups, interaction was most likely to be with other student groups.
it is not possible to determine interest from non-interactive users: those who listen and read but do not engage openly with the account (e.g. through favourites, retweets or replies). thus, further research would be useful to examine relevant stakeholder perceptions of the department’s social media presence (including its content, curators and approach to social media management). for example, how are student curators interpreted by local industry and community organisations? how do prospective students interpret the content posted and its curators? from a promotions perspective, the use of different curators diversifies the content posted and the stakeholder groups engaged. given the range of stakeholder groups that hei departments are encouraged to engage with, the use of multiple curators can result in a greater diversity of content for the account’s followers. when current students tweet about their experiences, it potentially offers prospective students peer insights into university study and life. in allowing different people to represent the department as well as different types of people (e.g. students, staff, and guests), the ‘brand identity’ of the account may be broadened, resulting in ‘shared’ brand associations that contribute to a ‘healthy’ university, particularly in terms of student recruitment success and student identification with the department (mirzaei et al., ). further research focused specifically on measuring the impact that ‘rotation curation’ has on student recruitment and retention performance could be useful, particularly considering the resource implications of sustaining an active social media presence in terms of staff time. when considered from a pedagogical perspective, ‘rotation curation’ has the potential to enhance student participation and encourage learning beyond traditional contexts.
although this research did not gain insights from curators about their experiences of tweeting for the department, handing over the social media reins provides opportunities for students to engage in networked, participatory practices in online open spaces within a supportive learning community. trusting students to represent the department arguably shifts their identity from student to ‘colleague in training’. this is in contrast to other educational approaches that typify e-professionalism as risk and misuse avoidance related to social media use for both students and staff (see fenwick, ). becoming a ‘social’ professional takes time and by implementing a multi-voiced approach to twitter, both students and staff have an opportunity to build confidence and an authoritative voice within their discipline by posting original content that engages various relevant stakeholder groups, and by participating in two-way interactions within and beyond their immediate learning communities. academics and students are participating, coincidentally and increasingly strategically, in marketing practices on social media platforms by identifying with the institution and therefore representing the university’s brand. the department’s twitter account in this paper was positioned primarily as a learning space, yet it functions as an important marketing tool to network new and existing stakeholders into the department’s core ‘business’ of learning, teaching and research. as marketing and pedagogical activities overlap (krachenberg, ), academic and marketing staff within universities could work more closely to enhance social media marketing strategies through teaching, learning and research practices. however, the tensions around using social media for both learning and marketing do require consideration, particularly in terms of the potential appropriation of learning communities for the more commercial purposes of university branding.
as the analysed account and the multi-voiced approach remain active and flourishing, there is potential merit in this method of social media management. ‘rotation curation’ could be extended beyond twitter to other social media platforms used for pedagogical and marketing activities such as facebook, instagram and snapchat. while twitter may be popular with academics, students are less likely to be using the platform for professional and learning purposes (knight & kaye, ). co-produced, multi-voiced approaches to social media management may generate more activity and interactivity when located in students’ social media spaces of choice, as well as having greater reach in terms of engaging and including prospective students. as the social media landscape evolves, heis and academics will need to evolve to engage with their key stakeholders, students included. the ability to engage with new and emerging social media platforms, and experiment with co-produced forms of social media management such as ‘rotation curation’, can be further understood within the wider institutional culture and the university support mechanisms in place that enable academic autonomy and innovation (mcneill, ). as outlined earlier, the analysed departmental account sits within a university rated within the top for social media in the uk (rise, ). in addition, two members of staff from the university within a different department have recently been named in the list of most influential higher education professionals (jisc, b), demonstrating a commitment by many to use social media to enhance aspects of higher education within the university. further research on how social media policies and practices within institutions play out at departmental and group levels is needed. the use of alternative social media practices such as the multi-voiced approach within this university is unlikely to be coincidence.
rather, its use more likely reflects the emphasis placed on the role of social media for both pedagogical and marketing purposes within the organisation.

acknowledgements

we would like to thank the curators and followers of the academic departmental twitter account and those who continue to support the ‘rotation curation’ approach. no funding was received to carry out this research.

references

atenas, j., havemann, l., & priego, e. ( ). opening teaching landscapes: the importance of quality assurance in the delivery of open educational resources. open praxis, ( ), – . http://dx.doi.org/ . /openpraxis. . . boyd, d. ( ). it’s complicated: the social lives of networked teens. retrieved from http://dl.acm.org/citation.cfm?id= british psychological society. ( ). code of ethics and conduct. leicester: british psychological society. british psychological society. ( ). ethics guidelines for internet-mediated research. british psychological society. https://doi.org/inf / . bruns, a., & moe, h. ( ). structural layers of communication on twitter. in k. weller, a. bruns, j. burgess, m. mahrt, & c. puschmann (eds.), twitter and society. new york: peter lang. retrieved from http://eprints.qut.edu.au/ / christensen, c. ( ). @ sweden: curating a nation on twitter. popular communication, ( ), – . http://dx.doi.org/ . / . . condie, j., & cooper, a. ( ). we are all@ salfordpsych the rotation curation approach and twitter as a learning tool in higher education. retrieved from http://usir.salford.ac.uk/id/eprint/ constantinides, e., & zinck stagno, m. c. ( ). potential of the social media as instruments of higher education marketing: a segmentation study. journal of marketing for higher education, ( ), – . https://doi.org/ . / . . cooper, a., & condie, j. ( ). bakhtin, digital scholarship and new publishing practices as carnival. journal of applied social theory, ( ), – . retrieved from http://socialtheoryapplied.com/journal/jast/article/view/ costa, c. ( a).
double gamers: academics between fields. british journal of sociology of education, – . https://doi.org/ . / . . costa, c. ( b). outcasts on the inside: academics reinventing themselves online. international journal of lifelong education, – . https://doi.org/ . / . . dabbagh, n., & kitsantas, a. ( ). personal learning environments, social media, and self-regulated learning: a natural formula for connecting formal and informal learning. the internet and higher education, ( ), – . https://doi.org/ . /j.iheduc. . . deandrea, d., ellison, n., larose, r., steinfield, c., & fiore, a. ( ). serious social media: on the use of social media for improving students’ adjustment to college. the internet and higher education, ( ), – . http://doi.org/ . /j.iheduc. . . dhir, a., buragga, k., & boreqqah, a. ( ). tweeters on campus: twitter a learning tool in classroom? journal of universal computer science, ( ), – . https://doi.org/ . /jucs- - - erskine, m., fustos, m., mcdaniel, a., & watkins, d. ( ). social media in higher education: exploring content guidelines and policy using a grounded theory approach. retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi= . . . . &rep=rep &type=pdf fagerstrøm, a., & ghinea, g. ( ). co-creation of value in higher education: using social network marketing in the recruitment of students. journal of higher education policy and management, ( ), – . https://doi.org/ . / x. . fenwick, t. ( ). social media, professionalism and higher education: a sociomaterial consideration. studies in higher education, ( ), – . https://doi.org/ . / . . graham, m. ( ). social media as a tool for increased student participation and engagement outside the classroom in higher education. journal of perspectives in applied academic practice, ( ), – . https://doi.org/ . /jpaap.v i . holmberg, k., & thelwall, m. ( ). disciplinary differences in twitter scholarly communication. scientometrics, ( ), – . https://doi.org/ . /s - - - junco, r., elavsky, c.
m., & heiberger, g. ( ). putting twitter to the test: assessing outcomes for student collaboration, engagement and success. british journal of educational technology, ( ), – . https://doi.org/ . /j. - . . .x kaplan, a., & haenlein, m. ( ). users of the world, unite! the challenges and opportunities of social media. business horizons, ( ), – . https://doi.org/ . /j.bushor. . . knight, c. g., & kaye, l. k. ( ). “to tweet or not to tweet?” a comparison of academics’ and students’ usage of twitter in academic contexts. innovations in education and teaching international, – . https://doi.org/ . / . . krachenberg, a. ( ). bringing the concept of marketing to higher education. the journal of higher education, ( ), – . https://doi.org/ . / kuzma, j., & wright, w. ( ). using social networks as a catalyst for change in global higher education marketing and recruiting. journal of continuing engineering education, ( ), – . http://dx.doi.org/ . /ijceell. . lewis, b., & rush, d. ( ). experience of developing twitter-based communities of practice in higher education. research in learning technology, ( ), – . https://doi.org/ . /rlt.v i . lewis, t., marginson, s., & snyder, i. ( ). the network university? technology, culture and organisational complexity in contemporary higher education. higher education quarterly, ( ), – . https://doi.org/ . /j. - . . .x lupton, d. ( ). “feeling better connected”: academics’ use of social media. retrieved from http://apo.org.au/research/feeling-better-connected-academics-use-social-media mcneill, t. ( ). “don’t affect the share price”: social media policy in higher education as reputation management. research in learning technology. retrieved from http://www.researchinlearningtechnology.net/index.php/rlt/article/view/ mewburn, i., & thomson, p. ( ). why do academics blog? an analysis of audiences, purposes and challenges. studies in higher education, ( ), – . https://doi.org/ . / . . mirzaei, a., siuki, e., gray, d., & johnson, l. ( ).
brand associations in the higher education sector: the difference between shared and owned associations. journal of brand management, ( ), – . https://doi.org/ . /bm. . neier, s., & zayer, l. t. ( ). students’ perceptions and experiences of social media in higher education. journal of marketing education, ( ), – . https://doi.org/ . / palmer, s. ( ). characterizing twitter communication–a case study of international engineering academic units. journal of marketing for higher education, ( ), – . http://dx.doi.org/ . / . . rise. ( ). top uk universities on social media. retrieved november , , from https://www.rise.global/top-uk-universities rowlands, i., nicholas, d., & russell, b. ( ). social media use in the research workflow. learned publishing, ( ), – . https://doi.org/ . / rutter, r., roper, s., & lettice, f. ( ). social media interaction, the university brand and recruitment performance. journal of business research, ( ), – . https://doi.org/ . / sewell, r. ( ). who is following us? data mining a library’s twitter followers. library hi tech, ( ), – . https://doi.org/ . / snyder, i., marginson, s., & lewis, t. ( ). “an alignment of the planets”: mapping the intersections between pedagogy, technology and management in australian universities. journal of higher education policy and management, ( ), – . http://dx.doi.org/ . / theunipod. ( ). theunipod social media rankings . retrieved july , , from http://www.theunipod.com/making-your-choice/university-rankings/social-media- rankings university of salford. ( ). salford university is among the most influential on social media in the uk. retrieved november , , from http://www.salford.ac.uk/news /salford- a-top- -university-for-social-media vandenbroek, a. ( ). tweeting sweden: technological solutionism,# rotationcuration, and the world’s most democratic twitter account. interface, ( ), – . http://dx.doi.org/ . / - . veletsianos, g., & kimmons, r. ( ). 
introduction

for well over a century, seattle’s lake union has been a site of rapid change. it has undergone successive waves of urban development and redevelopment since the mid- s, from colonization, deindustrialization, and the construction of interstate , to the development of biotechnology campuses in the south and the university of washington in the northeast. in recent decades, the lake and the many neighborhoods that ring its shores have undergone an immense transformation, driven by the region’s burgeoning tech industry. thus, for generations now, lake union has been at the geographic and symbolic center of urban growth and local conceptions of place, a hydrological backdrop to the actors and political processes entangled in such transformations. the questions of who controls growth, who makes decisions, and who has a say in this process are pertinent for understanding the future of lake union and seattle.

the task of the northlake collective (six graduate students from geography, history, social work, and the built environment involved in the larger lake union lab at the university of washington) was to conduct an exploratory place-based investigation of a slice of the city adjacent to lake union that ultimately might engage the public through a digital humanities portal. the broader goal of the northlake collective and the current paper is to address how we, as a team of emerging scholars, understand and investigate ‘cities’ in the current century as both networked at the global scale and dynamic places for everyday interactions and processes.
what emerged from our work is a transdisciplinary framework that proposes to enrich grounded urban theory and counter urban redevelopment marketed for the ‘good of all.’ the northlake collective began as a way of exploring more complex urban narratives beyond or between disciplinary frameworks and connecting these new narratives with university and community partners.

urbanists from a variety of disciplines have argued that city management and planning in the 21st century are oriented towards a city’s place in the global hierarchy, producing a metanarrative of ‘the city in crisis’ that competes on a global scale for finite capital resources and ideal urban dwellers. this manufactured ‘urban-crisis’ discourse is used to justify apolitical management by expert urban managers, who might argue that the issues are too pressing and concerns too imminent for a democratic process (davidson and iveson b; elwood and lawson ; marcuse ; rizzo and galanakis, ). the city in crisis often legitimizes urban renewal by elites for the ‘good of all,’ including mobile urban policies (davies and msengana-ndlela ; jacobs ), place branding and waterfront renewal (airas, et al. ), and the competition for managerial firms (davidson and iveson b). decision-making processes in the entrepreneurial/technologically managed city create a disconnect between the image of the city at the global scale and that in local practice (falahat ; foo, martin, wool, and polsky ), thus marginalizing and disenfranchising people of color, the poor, and the homeless (bose ).
one approach to countering this metanarrative is careful attention to the ways scholars represent and write about cities (marcuse ), employing an engaged and critical social science perspective (gleeson, ), and turning towards a more local ethnographic approach that takes into account relational processes and development at the city scale (davidson and iveson a; jacobs ; robinson ; secor, ), as well as complex intertwined histories (hayden ; loukaitou-sideris and ehrenfeucht ; massey ). the epistemological difficulties with investigating urban processes are ill-served by isolated disciplinary approaches, and so urbanists from a variety of disciplines have called for a thematic and cross-disciplinary approach to cultivate a more holistic view of urban concerns over a singular, hegemonic metanarrative (anderson, et al. ; davies ; manzo and perkins ; ramadier ; rizzo and galanakis ). our paper follows from these critical concerns about contemporary cities, as well as the numerous calls for greater collaboration within urban research, whether between or across disciplines (petts, et al. ; ramadier ; rizzo and galanakis ). this has meant focusing on a technology (amorim, et al. ) or artistic tool (rizzo and galanakis ) that crosses and mediates multiple disciplines and allows for a coming-together of multiple actors and stakeholders within a given urban locale to illustrate multiple histories, identify concerns, and develop solutions. although these technological tools are useful mediating devices, they are not a panacea to the challenges of interdisciplinary collaboration or public engagement. indeed, they produce their own particular challenges that we will discuss in more detail throughout the article. as a transdisciplinary research team, our approach was largely exploratory: to engage with a local site by employing the sorts of methods, data sources, and research products possible given the particular makeup and manifestation of our team. 
thus, in research and through this paper, we endeavored to put aside existing disciplinary methods and collectively construct a uniquely urban epistemological framework to explore the ongoing challenges of urbanism. we draw from previous scholarship, particularly rizzo and galanakis’ ( ) notion of transdisciplinary urbanism as a methodological framework that allows for the study of “uncertainty, chance and open-endedness, and to transparently renegotiate power structures in urban space” (p. ) by engaging various urban actors, theories, and practices. the paradoxes and complexity inherent to understanding the ‘city’ and how to address these concerns led us to develop a framework that might enrich grounded urban theory through three ‘enabling constraints’: place, technology and public (see figure ). constraints, in this undertaking, are reconceptualized with a positive and productive capacity, as opposed to a solely prescriptive and confining function (hayles ; introna ; mcdonnell ). place, technology, and public, formulated as ‘enabling constraints,’ set limits to our approach of the complexity of the city, while also opening up space for possibilities in that approach. place provided a certain malleability as a loosely bounded location that was also subjectively experienced, leading us to questions of scale, methods, and our epistemological rendering of place as geographically constrained. born-digital, our project saw technology or digital scholarship as a tool and end product for the power of visual argumentation that could be harnessed more fully in cross-disciplinary work—although those same productivities also imposed operational and typological limits. finally, public or public scholarship offered accessibility to the city and a common space for collaboration both within and beyond the academy, while also raising the challenge of the moral imperative of public engagement and constituting the ‘public’ itself. 
these affordances allowed us to deepen our work and produced the context for our research.

figure : enabling constraints and interstitial questions

at the intersections of these enabling constraints emerged questions regarding how issues of place, technology, and the public might affect one another (see figure ). reflecting on this particular set of challenges and their interrelatedness allowed us to identify the constituent elements of our collaboration and how these might lead to more meaningful research on cities. these challenges included our privileged position within a prominent, long-established university and our use of digital technologies, both of which undermined attempts at non-expert knowledge production. the limitations of our attempts at transdisciplinary urbanism, as well as our accomplishments, shed light on both the difficulties and the possibilities of novel research structures and approaches.

following a brief narrative of our collaborative process, the remainder of the paper is organized around discussion of each of the three enabling constraints, which combine to structure our epistemological framework for studying urbanism. we conclude by highlighting several dilemmas of transdisciplinary collaboration attempting to engage place, technology, and the public in order to better represent contemporary cities.

origins and objectives of lake union laboratory, a transdisciplinary urban research collective

the lake union laboratory (lulab) is a research collective at the university of washington founded to not only investigate the urban but to do so in a way that consciously incorporates diverse disciplinary perspectives, methodologies, and ways of knowing using digital and public platforms. lulab focused on lake union and its surrounding neighborhoods because the lake has long served as a bellwether for urban growth and change: lake union is at once emblematic of and somewhat divergent from several key trends in contemporary urbanization.
in ways that are resonant with similar experiences in countless other cities nationally and globally, the communities surrounding lake union have undergone several decades of post-industrial reinvention. at the same time, however, lake union offers a remarkable case in which a very particular combination of technological enterprises—not the usual suspects in large-scale urban redevelopment— have converged to channel huge amounts of capital into a large project of creative destruction and renewal in a world-class urban core. both of these realities are tightly bound to shifts in the structure of the global economy that have been unfolding for decades, and the outcomes for lake union could be informative for future urban planning and policy debates more broadly. (lake union lab, ) consisting of four core faculty members and a total of six graduate student researchers representing the college of the built environment, school of social work, and departments of geography and history, lulab primarily examines social, environmental, economic, and technological dynamics surrounding lake union, a freshwater lake located entirely within the city limits. since , the work of lulab has included a diverse array of undergraduate and graduate pedagogical projects, faculty research, and graduate student research projects. this paper will focus specifically on the contributions of the northlake collective, which included a series of research projects centered on the stretch of land along the north shore of lake union. as graduate students involved in lulab, we have spent the last three years exploring a small slice of seattle’s rapidly changing urban landscape. committed to place-based inquiry and public scholarship, we combined historical and socio-spatial ways of knowing in an effort to rethink how we produce and publicize knowledge about the city and urbanism. 
our primary goals throughout this project were to analyze and represent the complexity of urban spaces and engage in cross-campus collaboration to produce a proof-of-concept digital humanities project, while showcasing the value of transdisciplinary inquiry. after two years of working together, our effort resulted in three major components: 1) an online exhibit documenting a sensory exploration of northlake, 2) a microhistory of a neighborhood apartment building, and 3) an online digital mapping portal that compiled and displayed several data-driven research projects, ranging from an investigation into works progress administration-era housing designations to a study of mid-20th century urban renewal projects. additional work was done exploring laundries, census data, and ongoing development projects.

footnote: at this time, lulab is predominately located within the university, and has not expanded its focus or reach to include large numbers of non-academic collaborators. however, a related project, urban@uw, which grew out of the initial lulab project, prioritizes non-academic collaboration as key to the study and practice of urbanism at the university of washington (see http://urban.uw.edu/).

our choice of subject matter tilted heavily towards the social sciences and humanities, reflecting the disciplinary fields and research interests of the collective’s members. it is also an indication of our desire to work at multiple scales and temporalities. the resulting public digital humanities project represents northlake along the dimensions of time, place, and social life. an in-progress digital interactive map, the capstone of the northlake collective, brings together the sensory exploration of place and attempts at defining the ‘place’ northlake with disciplinary projects highlighting urban change across four eras (the progressive era, the new deal era, the urban renewal era, and the contemporary era), all crosscut by broad social, political, economic, and ecological themes.
the final public-facing product ideally will represent a unique approach to the study of cities in the 21st century, providing a resource for urban communities and stakeholders seeking to counter uncritical development narratives.

towards enabling constraints in transdisciplinary urban research

enabling constraints are those elements, material or conceptual, that function within a system or context to delimit the space of possibilities while simultaneously allowing productivity and creativity precisely by that delimitation (bullock and buckley ; hayles ; introna ; shogan ). enabling constraints are “necessary condition(s) for complex emergence” (davis, et al. : ), and thus particularly relevant for complex systems, such as the city and collaborative research of it. the positive, productive dimension of constraint develops out of complexity theory in evolutionary biology (hayles : ) and finds relevance in several fields, including art and architecture, cognition, communication, and philosophical inquiry. in evolutionary biology, constraints play a positive role in the development of organisms by bounding physical environments and creating feedback loops that allow only the most viable systems to emerge (hayles : ). in creative fields, designers impose upon themselves enabling constraints (thematic, aesthetic, and material as space, tools, and materiality) to prompt a potential for creativity but also to provide an internal coherence in their media and discipline to their process (hallam, et al. ; mcdonnell ). meanwhile, philosophical inquiry considers the conceptual dimension. enabling constraints, as considered by butler ( ) and then introna ( ), are used to understand agency: actors “operat(e) within a field of enabling constraints (or encodings) at the outset,” (butler : ), where constraints are norms- or rules-based, but are also insecure and thus enable revision and transformation.
enabling constraints as a concept has been meaningfully applied in various fields, although only with nascent inroads in current literature. in our own transdisciplinary inquiry into the urban, enabling constraints form a fulcrum around which revolve place, technology and public, extending both material and conceptual qualities that open up space for possibility and creativity while delimiting capacities. place emerged early on as a means of opening and bounding to develop our group purpose and project, the sites of our inquiry, and methodologies. the use of technology and digital scholarship, meanwhile, offered a platform to mediate transdisciplinary perspectives and facilitate the public aspect of the work. on the other hand, our iterative, collaborative process of producing a publicly accessible interface was restricted somewhat by the obligation to consider representational and technical questions during data collection and production phases. finally, the notion of public scholarship held promise for advancing academic research beyond the confines of campus, though the enterprise of genuinely engaging public stakeholders remained limited by conceptual challenges and institutional and practical barriers. our epistemological framework for approaching the 21st century city emerges at the intersection of these three constraints (see figure ), even as each constraint guided the project to practical ‘how’ questions.

place: an enabling constraint through location, meaning, and methodology

our place-based work evolved into a significant enabling constraint in these three ways: the process of seeking out place in constructing our group purpose and project; the site of our inquiry; and the methods from which we drew to explore, document, and publicize our findings. as we began to document the shifting character of a small slice of urban seattle, we found that grounding our project in place had both expected and unexpected effects on the nature of our findings.
the transdisciplinarity of the work, the availability of specific technologies, and the desire to engage various publics likewise had profound impacts on our understanding of the place under study. through the structure afforded by lulab, we sought to explore the reciprocal imprint of social, political, and economic processes on northlake and the ways northlake itself shaped the processes as they occurred in space. using the transdisciplinary structure and born-digital aspects of lulab to query the ways that these large processes shaped and were shaped by grounded conditions in northlake was an intriguing proposition, but our group first had to confront how we would approach the study of place. lulab faculty, in both undergraduate courses and other research projects, were already examining neighborhoods around the lake—in particular south lake union, a section of the city experiencing unprecedented growth and speculative investment. for our project, we chose to focus on northlake, a less-studied but no less interesting stretch of shoreline located along the lake’s north-central boundary. nestled among more established neighborhoods like fremont, wallingford, and the university district, northlake remains relatively undifferentiated. long a bastion of light industry, especially shipbuilding, northlake has undergone several concurrent economic and social shifts in recent years. both private capital and the public sector have ‘re-discovered’ the area with investments in the form of condominiums and office space, as well as new construction by the university of washington, which is aggressively seeking to expand its campus footprint into northlake.

northlake’s relatively low profile immediately interested us, provoking a series of exploratory questions focused on the nature of place-specific identity and meaning. members of the collective were intrigued that an area so centrally located in a city as rapidly changing as seattle remains so little known or investigated.
aside from being the name of a street and a local bar (the northlake tavern), the place designation itself is only sparingly used. and, in those rare cases where ‘northlake’ is invoked (for example, in the signage shown in figure ), the name seems to label underutilized or peripheral space. the somewhat elusive character of northlake proved to be appealing in these formative stages because it allowed each of us to approach the area with few preconceptions. at this stage of the project, we were still in a fully exploratory mode, having not yet decided on a particular medium or tool for the eventual display of our findings.

as we began our initial inquiry into northlake, it was clear that it, in fact, possessed a rich history, driving seattle’s development and transition at key moments. from its early role as a site of trade and commerce, to later uses as a distribution hub for regional extractive industries, including a large-scale coal gasification plant, to the remediation and adaptation of that plant into a popular city park, northlake has been the site of important economic, social and environmental shifts emblematic of larger forces shaping seattle and beyond (klingle ; morrill ; sanders ; thrush ). the university of washington has become a key player in seattle’s urban development surrounding lake union, particularly in the realm of biotechnology and life science research, which helped identify seattle as a ‘curative city’ (sparke ). with the university of washington’s seattle campus expanding from the east to meet growing capacity needs, and technology companies like adobe and tableau establishing campuses on its western edges, northlake appears to be on the cusp of additional transformations, making it a productive location for place-based transdisciplinary inquiry into the larger patterns and processes of urbanism.

footnote: for more about seattle’s recent tech boom, see balk, g. “seattle’s population boom approaching gold rush numbers,” seattle times, september , . http://www.seattletimes.com/seattle-news/data/seattles-population-boom-approaching-gold-rush-numbers/

footnote: see university of washington master plan seattle campus, semi-annual report, march : http://www.washington.edu/community/files/ / /uw-semi-annual-report- - .pdf

figure : “the elusive character of northlake,” as depicted by waterfront park signage lacking an obvious signified point of reference

ultimately, we drew from an understanding of place as simultaneously a cartesian entity, a marked boundary around a specific locality, and a subjective experience tied to particular localities (hayden ). we explored northlake through the ‘coming togetherness’ of multiple histories and experiences, mindful of the way a particular locality is created through relational processes with other times and places (massey ). our approach to northlake was not as an inert or known entity but rather as the ‘raw material’ through which creative social practices emerge and are reproduced (bondi ; bourdieu ; de certeau ; cresswell , ; massey , ; soja ). moreover, we were conscious that our work impacted the place itself: through the study of northlake, we were among those producing knowledge of the city and, perhaps unintentionally, involved in recreating dichotomies between an engaged academic, expert knowledge-making collective and residents of the places under study.

the transdisciplinary production of place-based knowledge is critical in 21st century cities for creating narratives outside of urban redevelopment and city branding (gibson ). davidson and iveson ( a) call for a view on cities that would enable more useful place-based politics that are relevant to urban dwellers but also extend beyond the city.
in the effort to reinvigorate urban politics, kurt iveson ( , ) and andrew kirby ( ) have argued that the inclusion of difference in urban planning is not found in making a plan that vacates conflict from public spaces, or planners and designers creating what constitutes ‘good’ urbanism; rather, inclusion in planning means engaging in a dialogue. a critical question in reinvigorating urban politics and creating more democratic urban planning and development is how to inculcate participation and incorporate the voices of urban residents. as we will discuss in greater depth in the technology and public portions of this paper, what is at stake in placemaking and urban narratives of place is the engagement, in both traditional and web-based forms, of urban actors outside of the academy. new technologies that allow for engagement outside of traditional venues such as public meetings or city council hearings offer one potential venue for dynamic exchange among various publics, but they are also fraught with their own means of exclusion. in our work on northlake, for example, we selected a platform without sufficiently considering how individuals and groups outside of the academy might respond (or not) to its modes of presentation and feedback mechanisms. working under multiple ways of understanding and operationalizing place required that we standardize and equalize our processes at the outset. rather than artificially constrain our inquiry, we decided to let the place itself drive our data collection and analysis. this decision resulted in the first completed project of our collective, “northlake: a sense of place.” methodologically, we sought to augment our individual disciplinary traditions by working collectively to explore northlake through our senses. 
this follows a rich tradition of sensory studies, which attempts to contest the so-called “primacy of the visual” (macnaghten and urry ) and open up analytical space for the consideration of auditory experiences (bull ), olfactory perspectives (classen et al. ), as well as taste and touch to understandings of the city. for our project, one student, employing the sense of sight, perused the area to photograph ‘swatches’ or visual/spatial compositions, sifting out textures, colors, patterns, vistas and viewsheds that, in aggregate, could be understood as characteristic of the particularities of the place, as simultaneously industrial and natural. then, ‘listening hard’ to northlake led another student to novel and surprising soundscapes, capturing ambient freeway noise, ducks in the water, and other exemplars of the sonic environment. appealing to the sense of touch, meanwhile, one student explored the various textures of northlake, revealing simultaneously the cold concrete and natural softness of place. finally, in search of taste, one student was drawn to social hubs, as reflecting forms of taste or distinctions that were also matters of socio-cultural positions: a pizza joint with a long history of recipes and gathering; a church that offered community sunday lunch; and a coffee shop representing an ‘up-and-coming’ side of northlake. following recent studies on urbanism, senses can emerge as salient in shaping the experience of the city, specifically in facilitating academic sensitivity to future urban vulnerabilities and mediating relations between ‘self and society’ and ‘idea and object’ (adams and guy ; bull et al : ; doherty ).

figure : shifting sites of place-based inquiry over the course of the project timeline

the decision to ground our study on a small area of the city proved foundational to the type of transdisciplinary work we accomplished.
it allowed us to partake in productive, early conversations about both the nature of our investigations and our group itself. however, this choice also led to a further engagement with questions surrounding the proper scale of our inquiry, an ongoing dialogue that shaped the contours of the project. as we continued, the size of the sites within ‘northlake’ expanded or contracted depending on the methods of engagement we preferred as well as the available primary source materials (see figure ). we began with a sizable neighborhood, roughly one square mile, before whittling the area of inquiry down to more easily accommodate our sensory exploration. we were mindful, however, of falling into the ‘local trap’ (born and purcell ; purcell ) by uncritically accepting a particular scale as inherently more valuable. therefore, we attempted to transcend a reliance on specific temporal and spatial scales by also considering the global/urban forces impacting northlake over time. for example, by tracing the stories of residents who lived in one particular apartment building, we discovered migration patterns linking northlake to several countries in scandinavia. additionally, the occupations of building residents, which were largely in extractive industries, like lumber and mining, echoed seattle’s early twentieth-century history as a node for natural resource markets in the american west. working with common sites of inquiry at different times in the project allowed for connections to be made across individual contributions, resulting in a coherent exploration of a place that incorporated multiple sensory perspectives and ways of knowing.

technology: a critical methodology and technique of public engagement

our second enabling constraint, technology, is both a critical tool and an end product. it serves as a medium for facilitating transdisciplinary research on the city and as a portal to engage the public and academic audiences in rich explorations of place.
at stake in this conversation was the extent to which digital tools could facilitate knowledge production and knowledge politics (elwood ). the use of digital humanities technology offered real potential for transdisciplinarity and innovative arguments in a study of place, as well as meaningful and accessible connections with the public. however, the field remains relatively fluid, with few stable methodological or theoretical boundaries (alvarado ). technology also put limits on the research process and exposed underlying disciplinary assumptions, especially as it related to our individual backgrounds and training in particular tools and methodologies. in this way, we experienced firsthand the reality that technology is never neutral, whether in its production or its reception.

our goal in using digital tools was to facilitate the transdisciplinary exploration of 21st century cities without falling back on disciplinary divides. digital humanities provided a common ground, since no single discipline represented in our research group could claim ownership of digital space, technology, or even visual arguments. by using a common platform, we hoped to prevent disciplinary domination of any one facet of knowledge production. yet, in the end, the structure of disciplinary fragmentation in the academy proved difficult to circumvent. because the rich history of northlake that we were developing spans over years, the final portal needed a series of contemporary basemaps to support the narrative we produced for the early s, s, s, and today. this meant digitizing historical maps in arcgis and delegating the associated digital tasks to members who had used this software in the past. ultimately, roles and responsibilities with regard to technology were unevenly distributed, falling along disciplinary lines.
while we are not arguing that it is antiquated or never useful, a disciplinary division of labor can plague interdisciplinary work that seeks to be genuinely collaborative (petts et al. ), particularly in our experience with an exploratory, technology-driven project on cities. even within the academy, then, technology can alienate by either compelling or discouraging involvement depending on one’s preexisting expertise. this echoes ongoing debates in digital humanities more generally surrounding how best to engage the “unskilled” in the initial development and later use of various software platforms, especially when the long-term stability of those platforms remains unknown (edwards ). outside of lulab itself, the disciplinary structure of the university has long been considered an impediment to the development and execution of interdisciplinary projects (petts et al. ). our account parallels the concerns of previous scholars about the hard-wiring of research organizations and support, specifically how funding structures, existing time commitments, and departmental allegiances serve as insidious barriers to cross-disciplinarity (petts et al. ). particular to the use of technology and the digital humanities, practical issues of funding streams and ownership of the physical server created several obstacles. space to host the omeka instances and maps, for instance, created real problems for transdisciplinary collaboration. the material space our data and visual arguments needed, as well as the emergent problem of connecting data and instances across servers, proved challenging and time consuming. these complications emerge from trying to fit a transdisciplinary research group into the material institutional framework of a disciplinary university, or how to fit one ideological project into the material form of another. the push to produce a public portal as end product provided potential for new and innovative ways of approaching the city.
access to technology that can be used to create legible ‘needs narratives’—thus addressing the unequal distribution of power between residents, community organizations, and urban managers and developers (elwood ; harwood ; roy )—was a significant driving factor for producing a digital humanities product. decisions regarding technology drew out issues of representation, as elaborated in the next section on public. neither solely ends nor means, digital tools in urban collaborative research are intended to question and, potentially, to overcome metanarratives of urban development by dispersing control of that narrative and destabilizing any one narrative’s ‘truth’ or permanence. technology, for collaborative urban research, becomes more fully enabling when we consider its power for orienting researchers to the needs narratives of urban stakeholders (elwood ). however, that same portal product as potential was also quite constraining. rather than allowing research questions to emerge from the data or using data to answer questions, we were repeatedly compelled throughout the research process to consider the form our output would take, as well as issues of representation with which it came. the use of images, for example, assumed an outsized importance, as did mapping, as users were more likely to engage the digital platform vis-à-vis these modalities. yet, both our selection of photographs and our choices of maps proved quite limited. in one instance, we had access to several historical maps dating to the s, though our textual research had revealed intriguing stories dating to the s. owing to the visual nature of the medium, however, we ultimately decided to return to the archives and re-focus on the depression era, as that was the time period covered by so many of our cartographic resources. significantly, had we more fully engaged the public in this stage of the project, we might have located more diverse sources, including family photographs, oral interviews, and neighborhood publications.
[note: the omeka program runs from individual instances installed on a server. while we explored other online options, having a concrete instance on a university-supported server meant we could ensure long-term maintenance of the product. at this time there are at least four instances of omeka (the central research instance and several teaching instances used for individual undergraduate classes) on four different servers that are all connected to the lulab project and are meant to come together in the public-facing portal. as we endeavor to produce a common portal and set of visual arguments to elucidate a history of place in northlake and engage with urban residents and stakeholders, we have had to copy and transfer instances to new servers due to memory availability and shifting funding streams.]
from the outset, the lulab group has had to consider whether the final product/digital portal would be an open access information dump (where stakeholders can access raw data, similar to the city gis archives) or a curated narrative that links the historical phases of development around lake union. the former offers the potential for users to create their own projects and initiatives, while offering input on what data should be gathered and shared, whereas the latter could incorporate writings, feedback, oral histories, and more from residents, workers, and other interested parties. additionally, the latter could engage the technical knowledge of users, who might have expertise in programming beyond that of the collective’s members. both options, we hoped, would allow the project to have an impact, however modest, on seattle’s rapidly changing urban landscape. while this influence would likely be centered on awareness and appreciation rather than concrete policy shifts, it nonetheless might capture the attention and imagination of community members.
during moments of rapid changes in the urban landscape, it can be argued that even marking and memorializing what came before has political meaning beyond mere sentimentality (till ). like any web-based product, each approach also necessitates consideration of long-term site management. as graduate students, our ability to maintain an open access site is limited, though this might also encourage more engaged participation from non-university participants. this still unresolved tension over final product continues to be a challenge as the portal moves forward. as an exploratory project, focusing our efforts through the omeka platform was useful. its origins as a content management tool specifically designed to aid in curation allowed us to collect and organize a range of seemingly disparate items and artifacts. however, our work being place-based and exploratory, we started with no explicit research questions or clear idea of what the final result might be. this ultimately meant that the platform played a more determinative role in our investigations than is likely optimal. quite simply, we lacked extensive experience in computer programming, and, as a result, the project was constrained by, and oriented towards, the capabilities and limitations of the software. in the end, while we may have gone off course from a traditional approach to research, were we documenting a transdisciplinary process or actually working towards a defined end product? each option necessitates different types of organization and accountability, most notably whether the technological platform could or should be decided in advance. for better or worse, in our case, it seems that the need to have a tangible outcome did, at times, sidetrack the transdisciplinary goals of our efforts. moreover, it limited the space, both actual and figurative, for exploring the possibilities of non-expert engagement, as access to the technology created barriers for a more widespread user base. 
in endeavoring to produce knowledge about st century cities and reinvigorate urban politics through knowledge production, we were working towards a deliverable that would allow us to connect with various publics and other academic communities. the question that persists is whether digital technologies currently exist that can at once serve as end products accessible to wider audiences and as tools in exploratory research. moreover, these technologies must also be both accessible and rigorous, specific and crosscutting, serving the needs of researchers and stakeholders outside the academy.
public scholarship: public engagement as an enabling constraint for urban research
from the formation of the northlake collective, we and our faculty mentors envisioned projects that would have some sort of public face and audience. the desire to make our work accessible grew out of three primary impulses: to democratize access to knowledge about the city; to influence interpretations and imaginings of the current and future lake union landscape; and to demonstrate the possibilities of collaborative, transdisciplinary work for investigating place. an important caveat at the outset: throughout the process of working together, impulses related to public scholarship frequently guided our discussions and decisions; however, the project did not involve formal public participation. that said, the drive to produce public scholarship provided us with conceptual and practical direction by opening up possibilities and imposing certain limitations. in many ways, as we discuss below, we found issues of the public to be the most challenging of our enabling constraints. the act of doing scholarship in public and in partnership with various publics has garnered increasing attention in recent years, especially in applied urban research (davies and msengana-ndlela ; hoyt ), public history (weyeneth ), place-based education (smith ), and web-based platforms (cohen and rosenzweig ).
by engaging the public, “the wall between school and community becomes much more permeable and is crossed with frequency” (smith : ). while the need for public engagement in urban research is well established and its application sought after from a range of stakeholders, there remains a need for publications that offer greater detail as to the key procedural and conceptual dimensions of carrying out such work. in our own transdisciplinary foraging for publicly engaged urban research, the ‘public’ offered license to go beyond the walls of the ivory tower to the city. however, the freedom and excitement of tinkering in the city was quickly arrested by emergent conceptual questions concerning the following: ( ) how ‘the public’ is constituted in this particular instance (i.e., who are we engaging with?), ( ) what form(s) of engagement our project could/should take (i.e., how are we engaging them?), and ( ) the broader potential impact of public engagement, including being accountable to communities beyond the academy (i.e., why does this matter?). these issues, which inherently crisscross place and technology, both pushed and pulled us in new and yet unsettled directions. just what constitutes a ‘public’ or ‘publics,’ as well as what ‘scholarship’ is or is not, remains hotly debated, as are the rubrics by which such forms of intellectual production should best be valued. place attachment and identities in urban neighborhoods are critical for understanding how urban actors—individuals, residents, and communities—behave through organization, planning, and development. the production of place and place-based identities through everyday practices and experiences re-historicizes actors, entangling and producing place-based actors (cresswell ; escobar ; hayden ; trudeau ). the debates in our own graduate student collective grappled with such issues of defining ‘our’ public(s) and deeply informed how we sought out the city and the technology we used for searching.
with faces and communities in the forefront, we found ourselves drawn to fresh research questions, unfamiliar approaches, and new digital platforms for public consumption. having a public—although yet-unidentified—so explicitly part of our research buoyed group discussions and research activities. yet, ultimately, our academic institution and its reward systems, deadlines, and responsibilities, as well as its bureaucratic mandates and expectations, curtailed implementation. chief amongst these institutional expectations was the location of lulab solely within the university and our lack of foresight to include non-expert public input in the research team. although the project was intended to be public-facing, its parameters did not allow for public engagement in the knowledge production process. in this way, our work perhaps regrettably became more of a “show and tell,” rather than a true “give and take” with diverse stakeholders. we join other scholars in viewing the institution as constraining—though also sanctioning in other ways—the potential of public engagement in transdisciplinary urban research (see petts et al. ). second, considering the form of public engagement the project might take raised questions surrounding the co-production of scholarship and how technological platforms and venues may or may not support co-production with various publics. emerging as an interstitial question between two enabling constraints, the form of our engagement and our intended audience depended, to a great extent, on the selection of a particular digital humanities platform. we discussed whether the web would ultimately serve as the sole mechanism for interaction with various publics. at one point, we evaluated but, in the end, discarded the possibilities of creating a pop-up museum in northlake or distributing hard-copy, interpretive materials to local public library branches.
in pursuing a digital portal to facilitate public engagement, we were confronted with the question of whether our site would be a highly-curated, but largely one-way, educational platform or a dynamic interface driven by user contributions. in an effort to engage the ‘public’ enabling constraint of our work and make meaningful connections with urbanites outside of the academy, we deliberated centering our efforts on the co-creation of information with various users. in this way, we would produce dynamic content but could use a largely static mode of delivery, as our use of online tools was largely limited by our training and familiarity with those tools. at the time of publication, any internet user could find or interact with the project’s growing suite of materials. moving forward, we aim to engage these materials in a way that encourages public engagement and curation. the broader impact and accountability to communities beyond the academy is of growing concern. the moral imperative of public engagement is tied to urban places (catungal and mccann ; deutsche ; kohn ; low and smith ; mccann ; mitchell ; ruddick ; rizzo and galanakis, ; sennett ; tuan ). urban places are the fabric through which urban processes are experienced by, and values communicated to, urban dwellers, citizens, and political actors (cresswell , ; hayden ; soja ; tuan ). as such, place attachment, enacted via lived experience and semiotic constructions of place, informs how people engage with urban politics and processes through participation, political agency, and decision-making (escobar ; till , ; davies and msengana-ndlela, ). publicly engaged urban research is doubly bound by its progressive potential to rewrite more complex urban histories that could counter neoliberal development in st century cities and by the limitations on community through cooptation and limited access to knowledge (rizzo and galanakis, ).
if our work was to have any impact on seattle’s rapidly changing urban landscape, even if that effect was largely one of awareness and appreciation of change over time, individuals and groups beyond the university had to be cognizant of its existence and, ideally, have a stake in its content. the promise of publicly engaged urban research presented itself in research design and deliberations, but conceptual challenges and institutional and technological barriers soon came to cloud the face of the public in the northlake collective. the exact form that this engagement would take, as well as who the intended audiences might be and how we would be accountable, remained largely indeterminate and unresolved. what remains to be determined is exactly the shape that such a creation might take, as well as how it would best incorporate the many born-digital aspects of our efforts.
discussion
the transdisciplinary concepts of place, technology, and public scholarship emerged as enabling constraints from within the context of our particular urban research engagement. we propose this tripartition as a framework for exploring the contemporary city via the structure afforded by transdisciplinary, born-digital collaborations. the emergent, indeterminate, and productive character of these concepts, combined with the practical constraints and interrelationships that they bring to bear, suggests that digital research on urban places might mirror the city’s own complexities. rather than view enabling constraints as “a frustrating ‘tyranny’ to be escaped wherever possible,” we came to consider them as productive constraints that should be “leveraged” (bullock and buckley, : ). welcoming ambiguity is particularly useful for transdisciplinary urbanism (rizzo and galanakis ); and by embracing and engaging with not just the potentials but also the limits of place, technology, and public, we were able to ‘lean into’ rather than oppose the paradoxes inherent in investigating the urban.
at the intersection of each pair of these enabling constraints emerged a set of questions (see figure ). within the conceptual framework structured by place, technology, and public, these interstitial questions directed the project toward more practical concerns. whereas we initially focused on defining how we might engage or confront each enabling constraint, we soon found that questions incorporating two or more of these enabling constraints compelled us to consider our project in more tangible and less ‘meta’ ways. thus, the practice of generating these interstitial questions was a productive step in moving our project forward. moreover, it challenged the distinctness of place, technology, and public by highlighting inherent overlaps and relationships between the concepts. for instance, we found that engagement with technology largely determined our research ‘deliverables.’ the pre-built platforms we used both facilitated and limited our investigations into urbanism, and the group collectively navigated tensions surrounding disciplinary silo-ing and divisions of labor that emerged from our chosen technological engagements. furthermore, the imperative of public-facing work shaped the questions we asked and the portals we used to publicize our work. however, our construction as a solely academic project located squarely within a university limited other forms of engagement with area residents. thus, we arrived at perhaps the most significant takeaway from our project: our set of emergent enabling constraints does not function in isolation but, rather, operates as an interrelated, productive unit that, together, raises fundamental methodological questions related to subject, site, representation, dissemination, and audience. the freedom to engage with transdisciplinary methodologies made possible the emergence of this tripartite framework.
by encouraging the development of new methodological expertise amongst the graduate student collective—especially through our project on the sensory exploration of northlake—we arrived at new ways to collectively and transdisciplinarily experience and produce knowledge about urban places. further, because no one discipline ‘owned’ northlake, and because this set of projects was independent of anyone’s graduate research in his or her department, we all felt more open to stepping outside of our comfort zones—and, more importantly, outside disciplinary bounds—to formulate this research. as we collectively constructed our project to understand and confront thorny urban problems, we found that in the absence of disciplinary constraints, the transdisciplinary constraints of place, technology, and public engagement provided the necessary contours for our investigation into urban processes. this theoretical framework synthesizes our intellectual engagement with northlake, with our research project, and with each other. our intellectual work prefigures the reflective and experiential aspects of our research processes and our development as transdisciplinary and urban scholars-in-the-making. our collaborative efforts interrogated conceptions of place within complex urban systems, while using shifting geographic boundaries to productively constrain our investigation. as befits contemporary investigation into complex urban processes, public engagement and the effort to be publicly relevant and accountable to various stakeholders in the areas under study resulted in productive moments of transdisciplinary collaboration, as well as individual and disciplinary-specific engagements with our research area. most importantly, transdisciplinary experimentation, guided by the questions emerging from our tripartite framework, yielded data sets from which a narrative of northlake began to crystallize and new insights into urban processes could be generated.
these topics are the subject of our forthcoming paper, meant to serve as complement to this theoretical paper, in which we deliberate more fully upon the experiential aspects of our collaborative process and its impact on us as emerging scholars.
references
adams, m. and guy, s. ( ). editorial: senses and the city. senses and society, ( ), – . airas, a., hall, p. v., and stern, p. ( ). asserting historical “distinctiveness” in industrial waterfront transformation. cities, , – . alvarado, r. c. ( ). the digital humanities situation. in matthew k. gold (ed.) debates in the digital humanities. minneapolis, mn: university of minnesota press. amorim, l. m. do e., barros filho, m. n. m., and cruz, d. ( ). urban texture and space configuration: an essay on integrating socio-spatial analytical techniques. cities, , – . anderson, p. m. l., brown-luthango, m., cartwright, a., farouk, i., and smit, w. ( ). brokering communities of knowledge and practice: reflections on the african centre for cities’ citylab programme. cities, , – . born, b. and purcell, m. ( ). avoiding the local trap: scale and food systems in planning research. journal of planning education and research. ( ), - . bose, s. ( ). universities and the redevelopment politics of the neoliberal city. urban studies, ( ), – . bourdieu, p. ( ). the logic of practice. palo alto, ca: stanford university press. bull, m. ( ). sounding out the city: personal stereos and the management of everyday life. new york: berg. bull, m., gilroy, p., howes, d. and kahn, d. ( ). introducing sensory studies. senses and society, ( ), - . bullock, s. and buckley, c. ( ). embracing the ‘tyranny of distance’: space as an enabling constraint. technoetic arts: a journal of speculative research volume , , - . butler, j. ( ). excitable speech: a politics of the performative. new york: routledge. catungal, j.p. and mccann, e.j. ( ). governing sexuality and park space: acts of regulation in vancouver, bc. social and cultural geography, ( ), - .
classen, c., howes, d. and synnott, a. ( ). aroma: the cultural history of smell. london and new york: routledge. cohen, d.j. and rosenzweig, r. ( ). digital history: a guide to gathering, preserving, and presenting the past on the web. philadelphia: university of pennsylvania press. cresswell, t. ( ). theorizing place. thamyris/intersecting: place, sex and race, ( ), – . cresswell, t. ( ). landscape and the obliteration of practice. in k. anderson, m. domosh, s. pile, and n. thrift (eds.), handbook of cultural geography (pp. – ). thousand oaks, ca: sage. cresswell, t. ( ). place: a short introduction. oxford: blackwell. davidson, m., and iveson, k. ( a). beyond city limits. city, ( ), – . davidson, m., and iveson, k. ( b). recovering the politics of the city: from the “post-political city” to a “method of equality” for critical urban geography. progress in human geography, ( ), – . davies, j. s. ( ). new perspectives on urban power and public policy. cities, , – . davies, j. s., and msengana-ndlela, l. g. ( ). urban power and political agency: reflections on a study of local economic development in johannesburg and leeds. cities, , – . davis, b., sumara, d. and luce-kapler, r. ( ). engaging minds: cultures of education and practices of teaching ( rd edition). new york: routledge. de certeau, m. ( ). the practice of everyday life (trans. steven f. rendall) berkeley: university of california press. deutsche, r. ( ). art and public space: questions of democracy. social text, , – . doherty, g.g. ( ). sensing and anticipating ecological vulnerabilities in urban environments. urbenviron: º seminário internacional de planejamento e gestão ambiental (brasilia), - . edwards, c. ( ). the digital humanities and its users." in matthew k. gold (ed.) debates in the digital humanities. minneapolis, mn: university of minnesota press. elwood, s. ( ). beyond cooptation or resistance: urban spatial politics, community organizations, and gis-based spatial narratives. 
annals of the association of american geographers, ( ), – . elwood, s., and lawson, v. ( ). whose crisis? spatial imaginaries of class, poverty, and vulnerability. environment and planning a, ( ), – . escobar, a. ( ). culture sits in places: reflections on globalism and subaltern strategies of localization. political geography, ( ), – . falahat, s. ( ). context-based conceptions in urban morphology: hezar-too, an original urban logic? cities, , – . foo, k., martin, d., wool, c., and polsky, c. ( ). reprint of “the production of urban vacant land: relational placemaking in boston, ma neighborhoods.” cities, , part b, – . gibson, t.a. ( ). selling city living: urban branding campaigns, class power and the civic good. international journal of cultural studies, ( ), - . gleeson, b. ( ). what role for social science in the ‘urban age'? in neil brenner (ed.) implosions/explosions: toward a study of planetary urbanism. berlin: jovis. hallam, j., leel, h. and das gupta, m. ( ). collaborative cognition: co-creating children’s artwork in an educational context. theory and psychology, ( ), – . harwood, s. ( ). geographies of opportunity for whom? neighborhood improvement programs as regulators of neighborhood activism. journal of planning education and research, ( ), – . hayden, d. ( ). the power of place: urban landscapes as public history. cambridge, ma: mit press. hayles, n.k. ( ). desiring agency: limiting metaphors and enabling constraints in dawkins and deleuze/guattari. substance: on the origin of fictions: interdisciplinary perspectives, ( , / , / ), - . hoyt, l. (editor) ( ). transforming cities and minds through the scholarship of engagement: economy, equity, and environment. nashville, tn: vanderbilt university press. introna, l. ( ). the enframing of code: agency, originality and the plagiarist. theory, culture and society, ( ), - . iveson, k. ( ). putting the public back into public space. urban policy and research, ( ), – . iveson, k. ( ). 
beyond designer diversity: planners, public space and a critical politics of difference. urban policy and research, ( ), – . jacobs, j. m. ( ). urban geographies i: still thinking cities relationally. progress in human geography, ( ), – . kirby, a. ( ). cities and powerful knowledge: an editorial essay on accepted wisdom and global urban theory [part i]. cities, , supplement , s –s . klingle, m. ( ). emerald city: an environmental history of seattle. new haven, ct: yale university press. kohn, m. ( ). brave new neighborhoods: the privatization of public space. new york, routledge. loukaitou-sideris, a., and ehrenfeucht, r. ( ). sidewalks: conflict and negotiation over public space. cambridge, ma: mit press. low, s., and smith, n. ( ). introduction: the imperative of public space. in setha low and neil smith (eds.) the politics of public space. london: routledge. lake union lab. ( ). about. retrieved september , from http://lulab.be.washington.edu/omeka/lulab. macnaghten, p. and urry, j. ( ). contested natures. thousand oaks, ca: sage. manzo, l. c. and perkins, d. d. ( ). finding common ground: the importance of place attachment to community participation and planning. journal of planning literature, ( ), – . marcuse, p. ( ). depoliticizing urban discourse: how ‘we’ write. cities, , – . massey, d. ( ). spatial divisions of labor: social structures and the geography of production. basingstoke: macmillan. massey, d. ( ). for space. thousand oaks, ca: sage. mccann, e. ( ). space, citizenship and the right to the city: a brief overview. geojournal, , – . mcdonnell, j. ( ). impositions of order: a comparison between design and fine art practices. design studies, , - . mitchell, d. ( ). the right to the city: social justice and the fight for public space. new york: guilford press. morrill, r. ( ). the seattle central district (cd) over eighty years. geographical review, ( ), - . petts, j., susan owens, s. and bulkeley, h. ( ). 
crossing boundaries: interdisciplinarity in the context of urban environments. geoforum, , – . purcell, m. ( ). urban democracy and the local trap. urban studies, , - . ramadier, t. ( ). transdisciplinarity and its challenges: the case of urban studies. futures, , – . rizzo, a. and galanakis, m. ( ). transdisciplinary urbanism: three experiences from europe and canada. cities, , – . robinson, j. ( ). developing ordinary cities: city visioning processes in durban and johannesburg. environment and planning a, ( ), – . roy, p. ( ). collaborative planning – a neoliberal strategy? a study of the atlanta beltline. cities, , – . ruddick, s. ( ). constructing difference in public spaces: race, class, and gender as interlocking systems. urban geography, ( ), – . sanders, j.c. ( ). seattle and the roots of urban sustainability: inventing ecotopia. pittsburgh, pa: university of pittsburgh press. secor, a. ( ). urban geography plenary lecture - topological city. urban geography, ( ), – . sennett, r. ( ). flesh and stone: the body and the city in western civilization. new york: w.w. norton & company. shogan, d. ( ). characterizing constraints of leisure: a foucaultian analysis of leisure constraints. leisure studies, ( ), - . smith, g.a. ( ). place-based education: learning to be where we are. phi delta kappan, , - . soja, e. ( ). the socio-spatial dialectic. annals of the association of american geographers, ( ), soja, e. ( ). thirdspace: journeys to los angeles and other real-and-imagined places. oxford: blackwell. sparke, m. ( ). global geographies. in michael brown and richard morrill (eds.), seattle geographies. seattle, wa: university of washington press. thrush, c. ( ). native seattle: histories from the crossing-over place. seattle, wa: university of washington press. till, k. e. ( ). artistic and activist memory-work: approaching place-based practice. memory studies, ( ), – . till, k. e. ( ). wounded cities: memory-work and a place-based ethics of care. 
political geography, ( ), – . trudeau, d. ( ). politics of belonging in the construction of landscapes: place-making, boundary-drawing and exclusion. cultural geographies, ( ), – . tuan, y.-f. ( ). the city as a moral universe. geographical review, ( ), – . weyeneth, r.r. ( ). what i’ve learned along the way: a public historian’s intellectual odyssey. the public historian, ( ), - . doi: . /tph. . . .
highlights
• a theoretical framework is proposed for conducting transdisciplinary urban research
• this is comprised of place, technology, and public as ‘enabling constraints’
• interrelationships between these dimensions lead to productive research questions
transformations: from social media campaign to scholarly paper
proceedings from the document academy volume issue special issue: neo-documentation around the world: global developments article
hilary yerbury, university of technology sydney, hilary.yerbury@uts.edu.au
ahmed shahid, university of sydney, gaafarushahid@gmail.com
recommended citation: yerbury, hilary and shahid, ahmed ( ) "transformations: from social media campaign to scholarly paper," proceedings from the document academy: vol. : iss. , article .
available at: http://ideaexchange.uakron.edu/docam/vol /iss / http://ideaexchange.uakron.edu/docam?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages http://ideaexchange.uakron.edu/docam/vol ?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages http://ideaexchange.uakron.edu/docam/vol /iss ?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages http://ideaexchange.uakron.edu/docam/vol /iss ?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages http://ideaexchange.uakron.edu/docam/vol /iss / ?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages http://ideaexchange.uakron.edu/docam?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages http://network.bepress.com/hgg/discipline/ ?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages http://ideaexchange.uakron.edu/docam/vol /iss / ?utm_source=ideaexchange.uakron.edu% fdocam% fvol % fiss % f &utm_medium=pdf&utm_campaign=pdfcoverpages mailto:mjon@uakron.edu http://www.akron.edu/ introduction at the meeting of the document academy, we presented a study of a social media human rights campaign about the disappearance of a journalist in maldives, analysing the relationships between the posts of the campaign and the documents from ngos, governments and the united nations responding to the social media discourse. in the discussion after the presentation, professor geoffrey bowker asked if we had considered the relationships between the posts of the campaign and our own scholarly paper. 
this was a challenge that we accepted and this paper presents our interpretation of the relationships between the posts, comments and tweets of the social media campaign and the paper now published in the proceedings of the meeting of the document academy (yerbury and shahid, ). the original study set out to explore how an understanding of the genre of a human rights document could shed light on the posts, comments and tweets of a social media campaign and the texts developed from the message of the campaign. the campaign was instigated in august by the friends and family of a young maldivian journalist who had disappeared, presumed abducted, to prompt the police to conduct a thorough investigation and provide some answers about his disappearance. the campaign, under the tagline find moyameehaa, used a website, facebook and twitter . in its first hundred days, the campaign did not result in any answers from the government nor did it prompt a thorough investigation by the police. however, it did lead to a range of outcomes, including a formal statement from amnesty international, an agenda item from the un and speeches or questions in the parliaments of australia, canada and the uk among others. we designed a study using genette’s notion of transtextuality ( ) to explore the standing of the posts, comments and tweets from this campaign and their relationship to other texts to determine whether they and the other texts, outcomes produced by other organisations, met the requirements for a human rights document. for genette, transtextuality is “all that sets the text in a relationship, whether obvious or concealed, with other texts” ( , p. ). his concern was to show that all published texts were linked to other texts, networked in some way. from this basic notion, he established five types of textual relationships. 
the first is intertextuality, where content from one text is inserted in another, usually through the conventional process of quoting; the second is paratextuality, a process involving content which surrounds and situates the text, making it more accessible, for example through a preface or foreword; the third is metatextuality, which permits commentary on the original; the next is architextuality, which links a text to others of the same genre. the final type is hypertextuality, which is concerned with the relationship between text a and text b, where b cannot exist without a (genette, , pp. - ). the original text can thus be referred to as a hypotext. [footnote on the campaign's channels: www.findmoyameehaa.com; https://www.facebook.com/findmoyameehaa; and #findmoyameehaa.] using this analytical frame and the processes and characteristics of a human rights document (guzman and verstappen, ), we concluded that the facebook posts, tweets and website on which the social media campaign was based did not constitute a human rights document; they do not encapsulate the documentary practices of witnessing and of human rights notifications. rather, they represent a series of lacks – lack of evidence, lack of documentary practices, lack of bureaucratic processes, lack of political will and even lack of compassion towards a grieving mother – which, in human rights terms, permit the creation of other documents. witnessing is something of a vexed concept in human rights practice, with its quasi-legal sense that somebody saw an abuse of human rights taking place and reported it to an organisation which was in a position to document it, verify it and take some action (mcclintock, ).
a broader conception of witnessing is being adopted through the use of interactive technologies, in projects such as the whistle (http://www.smhr.sociology.cam.ac.uk/), which is a digital human rights reporting platform that connects civilian witnesses and human rights ngos by supporting the verification process. mcpherson, the leader of this economic and social research council project, notes ( ) that civilian witnesses may have problems in getting their social media information verified because they may not have access to an appropriate network. from a documentation perspective, it is the voice of a writer that is significant, rather than the report of witnessing an action, with the document itself ‘standing in’ for the writer (levy, , p. ). for levy ( , p. ) this sums up the purpose of a document: “documents are exactly those things we create to speak for us, on our behalf and in our absence”. in the case of the findmoyameehaa campaign, the instigators are not witnesses, in the human rights sense. however, they seem to recognise the power of the document: although the instigators describe themselves as family and friends, and although they have no documented evidence of the moment of disappearance, one of the participants noted, ‘we are not a bunch of nobodies without voice or an audience’ (facebook post, august ). being able to use social media effectively to disseminate the message of the journalist’s alleged abduction may have had some impact on the influence of the campaign. genette’s notion of transtextuality made evident the relationships between the social media campaign, the notion of a human rights document and the texts created after the involvement of a range of different organisations; these texts are, clearly, human rights documents. the analysis
demonstrated the relationships and the transformations which co-existed between the text of the posts, comments and tweets of the social media campaign and the texts from the various organisations, with a particular focus on the transformations that led to them being seen as human rights documents.

linking the findmoyameehaa campaign and the docam paper

the purpose of this linking of the social media campaign and the published scholarly paper is not to answer the question of whether either or both can be perceived as a document, but to explore the ways they are related and the processes through which the paper transforms the posts, comments and tweets of the campaign to create a document. following buckland ( ), frohmann ( ) and lund ( ), we acknowledge that there is little value in discussing definitions and criteria by which either the posts, comments and tweets or the scholarly paper can be considered a document. we continue the approach from the original study, which recognised the value in considering a document as text which could facilitate an understanding of the complexities of communications in social media and the relationships among and between people and among and between messages. thus, genette’s notion of transtextuality is important to this investigation. his concern was to show that all published texts were linked to other texts, networked in some way. equally important is frohmann’s ( ) argument that our understanding of a document and the justification of that understanding are to be found in “the stories we tell”, the ways in which we talk about documentation and the document. these stories are not always the same, so one may find different justifications for the same action but also disagreements on what are appropriate actions.
at a more practical level, there are papers which draw the attention of the researcher to issues and problems involved in using social media sources as case study material or evidence in a study (e.g. boyd, ; king, ; weller, ), which bring into focus the perspectives of the researcher and of the users of social media. the complexity of the links between the findmoyameehaa campaign and the docam paper will be set out in three themes which themselves are fundamental to an understanding of documentation. these themes are labelled demonstrating relationships, transformations and the stories we tell. the notion of demonstrating relationships is inherent in one of the classic texts, suzanne briet’s what is documentation? here, she sets out the idea of an ‘initial’ document giving rise to ‘secondary or derived’ documents, a ‘documentary fertility’ which she encourages us to admire (briet, , pp. – ). the notion of transformations can also be seen in this early work; however, perhaps more relevant to this current discussion are the notions of fixity and fluidity explored by levy ( ). he argues that documents must be conceptualised as being fixed and unchanging on the one hand but fluid and capable of change on the other hand. his discussion of the fixed and the fluid in hypertextual relationships, while concerned with a particular technology, raises questions about the documentary systems that can emerge as a result of briet’s documentary fertility. levy’s idea ( , pp. - ) that not only do we use documents to tell stories but that documents ‘stand in’ for us and speak for us in our absence is relevant to the discussion of the theme of the stories we tell through our practices.
demonstrating these relationships will start from the existence of the docam paper, examining how what genette refers to as its architextuality – its genre – is fundamentally based on the notion of relationships with previous scholarly work and with evidence from a new study. it considers the role of intertextuality in creating a relationship based on quotation and allusion and explores hypertextuality as a way of indicating that, without the hypotext of the posts, comments and tweets, the paper could not exist. the section on transformations starts from metatextuality, commentary on a text. it is also concerned with the ways in which researchers transform social media into data and data into a report of research and the ways in which the technologies of the social media affect the actions and expressions of those who use them. finally the theme of ‘the stories we tell’ picks up on the genre of the scholarly article, showing that there is more than one way to enact the genre, discusses genette’s notion of paratextuality and concludes by showing how stories of hypotextual relationships exist even when the original is not mentioned.

demonstrating relationships

briet ( ) would argue that the journal article is a document derived from the initial text of the social media campaign and thus a clear relationship exists. genette’s analysis takes a more nuanced approach. following genette, the conference paper at the centre of this analysis is part of a web of relationships with other texts. architextuality, the genre of the text, is an essential starting point for noting relationships among and between texts (genette, , p. ).
a scholarly research article, such as this conference paper, has a clearly defined set of characteristics, usually including a research question, a review of the literature on the topic, a description of how the research question will be answered, an exposition of the evidence gathered to answer the question and a discussion of the relationship of this evidence to the findings of previous studies. the genre itself assumes a networking of ideas and evidence demonstrated through a process of argument. thus, the social media campaign, its posts, comments and tweets, and the associated documents are inextricably linked to the docam paper, since they provide the evidence on which the research question is based. going further, the form of publishing the scholarly article, in the online proceedings of the meeting of the document academy, makes it an example of a sub-genre of academic publishing, open access publishing. the proceedings of the meetings of the document academy are what is known as diamond open access (cf. fuchs and sandoval, , p. ), the most accessible category, being available without payment by either the author or the reader and without a requirement for registration or membership, an ideological position facilitating the widespread dissemination of scholarly work to anyone with access to a networked computer. the technologies available for open access online scholarly publishing vary in their sophistication and their abilities to position an article in the broader scholarship in the field. the technology used here, digital commons, developed by berkeley electronic press for use with digital repositories in universities and research centres in the us, is at the less sophisticated end of the spectrum.
it provides hot links to the findmoyameehaa website and facebook page but no links to the scholarly references, yet still enables material to be found through google, google scholar and other databases as well as through the digital commons network. thus, the published paper is well networked, through the tools of digital scholarship. in particular, the digital commons technology allows authors to know how many times a document has been downloaded and the city and country where the readers are. although the specific identity of the reader is not revealed, there is a sense in which through technological mediation these readers are not invisible in the way boyd ( ) indicates readers of social media may be. the link to google scholar can identify those papers which have cited the published conference paper and create a hyperlink to them; in this instance, the scholar involved is identified, as is the way in which the two documents are linked. this linking through the power of the technology opens the possibility of relationships with the findmoyameehaa campaign being created far beyond the original study. returning to the first-level relationships, relationships between texts are created through intertextuality, the use of quotations and allusions. this process is fundamental to the development of this scholarly argument and works at two levels. firstly, quotations are used to substantiate the argument of the paper, that the social media campaign is essentially a set of commentaries on the original statement of rilwan’s disappearance. quotes also show how expressions of emotion fill the gap left by lack of information and they give an impression of popular culture through the range of media used to express these emotions. they are, finally, used to give a glimpse of contemporary debates in maldivian society, showing something of the relationships between maldivian citizens and the government.
these quotes are fundamental to the scholarly argument, demonstrating a relationship between the paper and the posts, comments and tweets which is so fundamental that without them, the paper cannot exist. secondly, through the use of references to other studies, this paper shows that the interpretations being made in this study have been made previously in other scholarly contexts and in doing this, it locates the campaign in a new and different network, creating relationships that neither the participants in the social media campaign nor the authors of the scholarly articles could have imagined.

transformations

the act of carrying out the study led to a series of transformations of the posts, comments and tweets, raising questions of fixity and fluidity as levy ( ) discussed. these are transformations of meaning, transformations into artefacts of research and transformations of the relationships which the writers have with the text. the process of transformation is apparent in genette’s category of metatextuality, which involves critical commentary on an original text. when the original text comes from social media, such as facebook or twitter, understanding transformation becomes more complex. in a study involving social media, the metatextual relationship can be seen as the process of turning everyday experiences, recorded in facebook posts and comments and tweets, into data so that they can then be commented upon. in the first instance, this can be taken as a process of abstraction, a way of seeing the posts, comments and tweets of very many people not as a series of individual entities but as a whole, a narrative of collective experiences and expressions of emotion. from this process of analysis and synthesis arise the symbols and memes which exemplify a collectivity of sentiment and which can act as a shorthand to evoke the feelings of the participants in the campaign.
this transformation of individual experiences and expressions into data facilitates the suggestion that a collectivity exists, not yet a social movement with the features identified by castells ( ) but nonetheless, a group expressing despair and hope, formulating actions to address their concerns and using social media to do so. these symbols are identified and communicated through the published paper as including the passage of time since rilwan disappeared, exemplified in the endlessly repeated statement of the days, hours and minutes since rilwan’s disappearance, and the sorrow of rilwan’s parents, expressed in photos taken of them in public protests, with one of the photos of his mother becoming a poster and rallying image for a part of the campaign. they go some way to creating common ground between the reader of the scholarly paper and the men and women who have contributed to the social media campaign, bridging the gap to make the everyday experiences expressed through social media understandable to those reading in other parts of the world. another symbol, the use of the question mark in the thaana script, ؟, both as a twibbon in twitter and as the basis for creating a portrait of rilwan, evokes the unanswered questions on which the campaign is based. the use of the hashtag #findmoyameehaa makes clear the purpose of the campaign to monolingual english speakers, but speakers of dhivehi are urging others to ‘#find a madman’ and those maldivians who have cherished freedom of speech are noting the loss of a champion, whose blog handle was moyameehaa. this process of transformation can be considered from the perspective of the practice of the researcher, transforming data into evidence to support an argument.
here, when the data derive from social media sources, the concerns may be with the completeness of the data and the command of the tools to analyse it. king ( ), commenting on the value of real time data for research in the social sciences, notes that access to social media posts provides another avenue for understanding the opinions of citizens on an issue, adding to the commonly used option of the random survey. this process of transforming social media posts into research data is not unproblematic. weller states that whereas some researchers are so focused on the digital patterns emerging through social media usage that they may almost forget that there are actual users, others are so focused on the user perspectives that they overlook the impact of the technologies that allow the perspectives to emerge ( , p. ). our focus was on the users and their engagement in this social media campaign, but we did not completely overlook the technological aspect. we sought to sidestep the problems caused by the ‘ever-changing nature’ of social media (weller, , p. ). having decided to focus on the campaign over the one hundred days from the disappearance of rilwan, we also made the decision to accept the posts, comments and tweets as they were on that day as presented by the technologies of the website, facebook and twitter. we printed out the facebook posts and we printed twitter’s top tweets for the hashtag #findmoyameehaa for the same period. in this way, we stopped the movement inherent in social media and, in levy’s terms, created fixity ( ). this meant that we did not have a record of the real-time creation of the social media campaign; we acknowledged that there may have been posts that were deleted. perhaps more significantly, we recognised that in using the top tweets, we were accepting a transformation of the everyday already implemented by twitter, with no clear understanding of exactly how top tweets were derived (twitter n.d.).
but we had created immutable data, albeit data with potential flaws. our immutable data presented us with another challenge – the posts, comments and tweets were not all written in english, although the majority were: some were written in english and dhivehi and some in romanised dhivehi. images and linked documents from maldives were likely to be written in dhivehi, in thaana script. the technological tools for managing bi-lingual data of this kind appear not to be available, another reason why we opted for the printouts. we chose to use a ‘pencil and paper’ method for recording our analysis of the content of the posts, comments and tweets, acknowledging the human process of interpretation of the data, the bi-lingual nature of the posts and the lack of command of dhivehi of one of the researchers. our methodology, then, could be challenged for lacking rigour, not following the convention of the use of some technology for managing the data. in a review of another paper exploring notions of emotion and political accountability in this social media campaign, our methodology was challenged – not for incomplete data, but for not using computer software to analyse the posts, comments and tweets. the process of transformation can also be considered from the perspective of the individual whose posts, comments or tweets may be part of the body of data being analysed. here, the concern may be with unknown people reading posts, statements being taken out of context and with issues of the public and the private. boyd ( ) notes that there are three aspects of the digitised communications of social media that affect relationships: an invisible audience, a lack of boundaries in time, space or social interactions and a blurring of the line between notions of public and private.
as we transformed the experiences of everyday life into research data, we were an invisible audience for those posting on facebook and twitter. they would have been completely unaware of us, our reading of their posts and tweets and our offline analysis of these expressions of their thoughts and actions. in our analysis, we did not record the identity of those posting in the campaign nor where they were located but we were interested in the way in which the social media campaign was able to move beyond the maldivian context from which the campaign emerged. to that extent, we did not establish boundaries in space; we were criticised by another reviewer of the submission referred to above for not establishing in the title of the submission that the social media campaign was bounded by space. although we carried out the analysis after the days of the campaign, we recognised that marking the passage of time was a significant factor for people engaged in this campaign, marking the hours and days that rilwan had been missing, and included this as a meme in the published paper. equally significant for this exploration of the relationships between the posts, comments and tweets of a social media campaign and the published report of the study is the question of the public and the private. tweets, intended as they are for a broad audience, are generally not considered private communications, but with facebook posts, being potentially private communications but publicly accessible, the argument is not so clear. we took the view that the tweets using the hashtag #findmoyameehaa were public and intended to be so. the website and facebook page were set up to seek information about rilwan’s disappearance and to communicate information about plans for action; this was made clear in the ‘about’ section, thus positioning them as a public space.
to this extent, it is possible to consider these platforms as being expressions of a kind of public sphere (habermas, ), offering a possibility for anyone to participate in the campaign and its debates. however, the facebook page became a focus for expressions of grief and frustration by friends as well as by people who identified themselves as strangers; and while these form part of the public record of the campaign, some were clearly personal statements, not part of a public sphere. the campaign itself could be seen as a political action, drawing attention to the apparent lack of action of the police in investigating the disappearance of the journalist; posts, comments and tweets report on the attempts of the family to prompt the police to action. tweets and posts from some others are also political actions, critical of the police and of the government. in a country slipping away from democratic practices (mulberry, ), we confronted the situation expressed so clearly by boyd and crawford ( , p. ): “just because it is accessible doesn’t make it ethical”. the transformation of the social media campaign posts into data for a scholarly paper thus raised a question about the naming of individuals as authors of posts and tweets; some individuals were clearly significant contributors not only to this campaign but to broader debates on democracy and freedoms. it was easy to use google to identify many contributors directly; some used a pseudonym but it was still easy to piece together information to identify them; a few used a pseudonym that could not be related back to a named individual. a convention of research practices dictates that a source of data be identified. however, it was the text that was important to our argument about transformations – what was written or presented, rather than who wrote it.
we created a section in the published paper entitled ‘becoming something else’. here we focused on something beyond levy’s notion of fluidity, which seems more concerned with the artefact and the way it is perceived ( , p. ). instead, we focused on how the use of genette’s notion of transtextuality was able to demonstrate that social media enabled text which originally clearly was in one category of transtextuality to transform into another. a key example of that was the creation of memes. it was not important to know who echoed the passage of time, only that so many people did so that it became a refrain in the campaign. the photograph of rilwan’s mother, taken at one of the gatherings, was an example of a post that transformed from being a text in its own right to becoming a quotation (intertext) and then becoming an expression of collective sadness (metatext). creating a hypertext involves a process of transformation, as genette asserts ( , pp. - ), where a second text becomes possible because of the existence of the first, perhaps much as briet considered primary and secondary documents. in the original study, the concern was to show how the posts, comments and tweets led to the emergence of a range of hypertexts – human rights documents – and this transformation was flagged in the title – the becoming of a human rights document. similarly, the development of a scholarly paper also involves the transformation of ideas from existing scholarly works and evidence through a rational process which, in the social sciences, brings the possibility of new understandings. thus, the published paper is clearly a hypertext, with the posts, comments and tweets of the social media campaign as the hypotext, the base without which the published paper would not have come into existence. this paper exemplifies levy’s notion of fluidity ( ) in its own development.
our files show that we had worked on several drafts, which we exchanged and commented on. it was the fourth completed version that was regarded as fixed and submitted. the process of review caused that fixity to disappear as the text of the paper entered another phase of fluidity as we worked through other versions on our way to an agreed and therefore fixed text. the editorial process may introduce other aspects of fluidity before the final, authorised, version of the paper is published.

the stories we tell

as a preface to this section, it is important to acknowledge that the story we have told is based on the stories told by others, whose posts, comments and tweets in the social media campaign left powerful traces. without these stories, there was no basis for our study. researchers are essentially storytellers, making public the outcomes of investigation and the assumptions from which they undertook the investigation. in writing a paper for publication, we use a form of storytelling scholars acknowledge as being appropriate for the purpose of creating new understandings. a study may be telling the same story as has been told many times before, but with a different setting or different characters or even a different ending; that is, it may use a conceptual model or a theory which has been tested many times before, but will seek to extend the understanding by implementing it in a different context or with a different type of participant. this form of storytelling is monitored by a process of review, where other scholars indicate whether or not a particular article is suitable to be made public. from this perspective, the data, the posts and tweets from the social media campaign are the material on which the tool of theory works. the paper has been published as a peer reviewed contribution to the proceedings of the document academy meeting of . thus, it follows that the method and the data used, the posts and tweets, were considered appropriate.
the anonymous reviewer of another paper using the same social media campaign to explore grassroots collective action tells a different story. in his or her story, there are ‘two main setbacks [in the paper]. one of them is the material used: tweets and facebook’. the inference is that this leads to a fundamental weakness in the study. the editor of that journal, therefore, concluded that this was a story that should not be told in public. as storytellers, we can reasonably be assumed to have told stories before, based on this or some other study. this story is not told using the disembodied voice of some academic writing, “the voice from nowhere” criticised by haraway ( ). our names as authors of this paper form part of genette’s paratext. as authors, we are acknowledged as storytellers; a google search on our names links us to previous stories we have told, each from our distinct cultural and disciplinary context. the backstory we might tell here is of our own transformation in writing this paper, from independent storytellers to a collaborative partnership of storytelling, bridging the gaps between languages and cultural experiences and across disciplines and fields of practice. self-citing is another way of telling a story of the link between current work and previous work. in this way, the network of articles by a given author becomes apparent. we have not linked this study to our previous work together, nor to our work done separately. this seems to be an instance where, in genette’s terms, we have created a hypertext, the paper entitled ‘the becoming of a human rights document’, without acknowledging its other hypotexts, that is our previous published papers, written together or separately.
analysis of this previous work would show an emphasis on human rights and democratic values, especially as expressed in the work of keck and sikkink ( ), risse and sikkink ( ) and giddens ( ). it would show that our work can be seen in the utopian tradition of studies of social media (zemmels ), sharing the enthusiasm for the possibilities of social media enhancing opportunities for civic engagement (e.g. bakardjieva ). it would also show our concerns for active citizenship and for issues and practices of human rights.

there is one last piece of storytelling to offer here. the understandings gleaned from linking with these unmentioned hypotexts would perhaps indicate that a final link between the social media campaign and the scholarly paper is that the writing of the paper is itself a form of activism. the conference paper draws attention to an event in a part of the world that few people are familiar with, except perhaps as a holiday destination. it shows how local citizens, human rights bodies and the institutions of that state (that is, the police service, parliament and even the president) interacted with each other and with the event, the disappearance of a young journalist, during the first hundred days of the social media campaign to prompt action on his alleged abduction. it is also a way of considering human rights practices, showing how major agencies take up reports of human rights abuses and make statements, either as press releases aimed at no one in particular or at particular groups such as the united nations, or even as questions addressed to a government. we have no way of knowing whether readers of our paper have become active in the social media campaign; there has been a small increase in facebook likes since the original conference presentation, but we know from the data supplied by the document academy’s publishing software that none of the + readers of the paper currently lives in the maldives.
through this focus on human rights practices, the published paper broadens the understanding of what success in a grass roots action through social media might constitute: action that leads to formal statements by national, international or supra-national bodies must be considered successful, even if the accepted understanding of success is action by the government where the campaign is taking place.

conclusion

this paper took up the challenge to explore relationships between a published paper and a social media campaign. it has suggested that these relationships can be conceptualised in three ways: relationships which emerge from the links created between the content of the published paper and the content of the social media campaign posts, comments and tweets and those which emerge through the technology of online publishing; transformations which emerge as the social media campaign is incorporated into the scholarly work; and the stories we tell directly and indirectly through our practices of scholarship. this descriptive piece brings together several elements in the process of exploring this relationship. firstly, genette’s concept of transtextuality has been fundamental in showing that texts are linked at several levels and in many ways and these links are more complex and nuanced than the simple division between primary and secondary documents suggested by briet ( ). although transtextuality assumes the primacy of text, it acknowledges multiple authors and readers and the impact of context, leading to transformations both acknowledged and unacknowledged. as genette noted ( , p. ), the use of the framework of transtextuality has brought potentially concealed relationships with other texts to our attention.
architextuality, the understanding of genre and links among examples of the same genre, was a strong starting point for elucidating the complexities of the ways in which the genre of scholarly paper works and how a particular example of the genre is linked into other examples and, in this case, into another genre, the social media campaign. metatextuality, concerned with commentaries on an original text, was found to be very useful in exploring wide-ranging transformations in documents. in particular, the analysis of these transformations brought to light a range of issues emerging from their interpretation. this was significant in showing the importance of context in understanding documents and their relevance. it was also useful in shedding light on how intangibles such as fixity and fluidity in documents can emerge from work practices.

secondly, the role of technology has been significant and has worked together with genette’s conceptual framework to bring a more nuanced understanding of the relationships among and between texts. the technology of social media has created complex relationships between the texts of the posts, comments and tweets and the published paper, as well as relationships between the authors and the data, between the writers of the posts and the posts, and between the writers of the posts and the authors, following levy’s proposition ( ) that documents can stand in for the writer. the technology of online publishing has set up a different set of relationships between the published paper and other published papers, between the published paper and its readers, and between the authors and a range of indexes and sources of access to the published paper.
this notion of bibliometric links is not new, but here, it is suggested that conceptualised in this way, the links show the importance of context and the way in which decisions about the use of publishing technology can facilitate the creation of links beyond those of citing and cited papers.

finally, a consideration of the ways an author positions the work, giving a focus on the broader context of scholarly and professional practices demonstrates the importance of frohmann’s emphasis on ‘the stories we tell’ as we justify the documents we create. the paper is not presented as an objective, third person account, presented by disembodied individuals, but rather as a situated ‘story’. in this instance, the published paper ‘the becoming of a human rights document’ becomes situated in a societal context of campaigning against violations of human rights, showing its authors as scholars engaged in significant societal issues.

references

bakardjieva, m. , ‘subactivism: lifeworld and politics in the age of the internet’, the information society, vol. , no. , pp. - .
boyd, d. , ‘social media is here to stay... now what?’ microsoft research tech fest, redmond, washington, february . available at: http://www.danah.org/papers/talks/msrtechfest .html
boyd, d. and crawford, k. , ‘critical questions for big data’, information, communication & society, vol. , no. , pp. - , doi: http://dx.doi.org/ . / x. .
briet, s. , what is documentation? english translation of the classic french text. tr. and ed. r.e day and l. martinet. lanham, md., scarecrow press.
buckland, m. , ‘what is a document?’ journal of the american society for information science, vol. , no. , pp. - .
castells, m. , networks of outrage and hope: social movements in the internet age, polity press, cambridge.
frohmann, b. , ‘revisiting ‘what is a document?’’ journal of documentation, vol. , no. , pp. - .
fuchs, c. and sandoval, m. , ‘the diamond model of open access publishing: why policy makers, scholars, universities, libraries, labour unions and the publishing world need to take non-commercial, non-profit open access serious’, triplec, vol. , no. , pp. - . available at http://www.triple-c.at/index.php/triplec/article/viewfile/ /
genette, g. , palimpsests: literature in the second degree, university of nebraska press, lincoln, nebraska.
giddens, a. , the consequences of modernity, polity press, cambridge.
guzman, m. and verstappen, b. . what is documentation? huridocs, versoix. available at: http://www.huridocs.org/wp-content/uploads/ / /whatisdocumentation-eng.pdf
habermas, j. , the structural transformation of the public sphere; an inquiry into a category of bourgeois society, mit press, cambridge, mass.
haraway, d. , ‘situated knowledges: the science question in feminism and the privilege of partial perspectives’, feminist studies, vol. , no. , pp. – .
keck, m. & sikkink, k. , ‘transnational advocacy networks in international and regional politics’, international social science journal, vol. , no. , pp. - .
king, g. , ‘ensuring the data-rich future of the social sciences’, science, vol. , no. , - . doi: http://dx.doi.org/ . /science. . available at: http://nrs.harvard.edu/urn- :hul.instrepos: .
levy, d. , ‘fixed or fluid? document stability and new media’, echt proceedings of the acm european conference on hypermedia technology, acm, new york, ny., pp. - .
levy, d. , scrolling forward, arcade publishing, new york, ny.
lund, n. , ‘document theory’, annual review of information science and technology, vol. , pp. - .
mcpherson, e. , ‘digital human rights reporting by civilian witnesses: surmounting the verification barrier’, in r. a. lind, ed. produsing theory in a digital world . : the intersection of audiences and production in contemporary theory, peter lang publishing, new york, pp. – .
mcclintock, m. , a basic approach to human rights research. available at: http://humanrightshistory.umich.edu/research-and-advocacy/basic-approach-to-human-rights-research/
mulberry, m. , democratic decline in the maldives: will the world wake up? october . available at: https://www.opendemocracy.net/civilresistance/matt-mulberry/democratic-decline-in-maldives-will-world-wake-up
risse, t. & sikkink, k. , ‘the socialization of international human rights norms into domestic practices’, in risse, t., ropp, s. and sikkink, k. (eds.). the power of human rights: international norms and domestic change, cambridge university press, cambridge, pp. - .
twitter n.d. faqs about top search results. available at: https://support.twitter.com/articles/ ?lang=en#
weller, k. , ‘accepting the challenges of social media research’, online information review, vol. , no. , pp. - . http://dx.doi.org/ . /oir- - -
yerbury, h. and shahid, a. , ‘the becoming of a human rights document; an exploration of a human rights campaign’, proceedings from the annual meeting of the document academy: vol. , article . available at: http://ideaexchange.uakron.edu/docam/vol /iss /
zemmels, d. , ‘youth and new media: studying identity and meaning in an evolving media environment’, communication research trends, vol. , no. , pp. - .

proceedings from the document academy
transformations: from social media campaign to scholarly paper
hilary yerbury
ahmed shahid

a decade of rapid-reflections on the development of an open source geoscience code
cédric h. david, james s. famiglietti, zong-liang yang, florence habets, david r. maidment
hal id: hal-
https://hal.sorbonne-universite.fr/hal-
submitted on may
distributed under a creative commons attribution - noncommercial - noderivatives . international license
to cite this version: cédric h. david, james s. famiglietti, zong-liang yang, florence habets, david r. maidment. a decade of rapid-reflections on the development of an open source geoscience code. earth and space science, american geophysical union/wiley, , . / ea .
a decade of rapid—reflections on the development of an open source geoscience code

cédric h. david, james s. famiglietti, zong-liang yang, florence habets, and david r. maidment

jet propulsion laboratory, california institute of technology, pasadena, california, usa; center for hydrologic modeling, university of california, irvine, california, usa; department of earth system science, university of california, irvine, california, usa; department of geological sciences, jackson school of geosciences, university of texas at austin, austin, texas, usa; umr metis, cnrs, upmc, paris, france; center for research in water resources, university of texas at austin, austin, texas, usa

abstract

earth science increasingly relies on computer-based methods and many government agencies now require further sharing of the digital products they helped fund. earth scientists, while often supportive of more transparency in the methods they develop, are concerned by this recent requirement and puzzled by its multiple implications. this paper therefore presents a reflection on the numerous aspects of sharing code and data in the general field of computer modeling of dynamic earth processes. our reflection is based on years of development of an open source model called the routing application for parallel computation of discharge (rapid) that simulates the propagation of water flow waves in river networks. three consecutive but distinct phases of the sharing process are highlighted here: opening, exposing, and consolidating. each one of these phases is presented as an independent and tractable increment aligned with the various stages of code development and justified based on the size of the users community.
several aspects of digital scholarship are presented here including licenses, documentation, websites, citable code and data repositories, and testing. while the many existing services facilitate the sharing of digital research products, digital scholarship also raises community challenges related to technical training, self-perceived inadequacy, community contribution, acknowledgment and performance assessment, and sustainable sharing.

. introduction

driven by the need to understand earth’s dynamic climate, geoscientists have dedicated much effort to creating numerical models of the major components of the climate system and to analyzing their outputs. early modeling studies date back to the s and include simulations of the earth’s atmosphere [phillips, ], oceans [bryan and cox, ], land [manabe, ], and rivers [miller et al., ]. decades later, computer modeling and data-intensive analysis have become key elements upon which modern climate science has been built [e.g., intergovernmental panel on climate change, ], and numerous geoscientists therefore dedicate considerable research energy to such endeavors. computer-assisted research is equally ubiquitous in the broad scientific community, such that some have argued that computer modeling and data-intensive science be considered legitimate pillars of science, hence joining experimental science and theoretical science [bell, ; bell et al., ; hey et al., ; hey, ; hey and payne, ], although such a view is not without its critics [vardi, a, b]. nevertheless, computer modeling and analysis are now integral parts of many geoscience investigations.

the recent mandate [holdren, ] requesting that the direct results of federally funded scientific research in the u.s. be made further accessible (including availability of digital data) has spurred much discussion in the scientific community. kattge et al.
[ ] argued that while data sharing is necessary, associated hurdles subsist, and proper means of acknowledgment (i.e., citations) are needed so that scientists can benefit from the added burden. this argument was further supported by the survey of kratz and strasser [ ]. others have also suggested that the computer codes used to generate or to analyze data are equally important and should hence be made similarly accessible [nature, ; nature geoscience, ]. prior to the recent mandate, barnes [ ] had already advocated for sharing computer code so that, like any other scientific method, code development could benefit from the peer review process. additionally, the description of computations using only natural language or equations has inherent ambiguities that have unpredictable effects on results; hence, access to the source code is essential to reproducing the central findings of studies [peng, ; ince et al., ]. some of the largest geoscience modeling centers have been openly sharing their software for three decades [e.g., williamson, ; anthes, ; hurrell et al., ] and have already embraced state-of-the-art software sharing practices [e.g., rew et al., ]. the lack of training in software engineering and development, however, makes such endeavors more challenging for most scientists [hey and payne, ].

earth and space science, research article, . / ea
special section: geoscience papers of the future
key points:
• a reflection on the open source development of geoscience codes is presented
• sharing can be broken down into three phases: opening, exposing, consolidating
• free online services facilitate sharing and allow for further academic credits
correspondence to: c. h. david, cedric.david@jpl.nasa.gov
citation: david, c. h., j. s. famiglietti, z.-l. yang, f. habets, and d. r. maidment ( ), a decade of rapid—reflections on the development of an open source geoscience code, earth and space science, , doi: . / ea .
received oct; accepted mar; accepted article online apr
© . the authors. this is an open access article under the terms of the creative commons attribution-noncommercial-noderivs license, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.

river modeling, an area of earth system modeling concerned with the numerical simulation of water flow waves in surface river networks, does not have a strong tradition of open source codes (as discussed below) and can benefit from further sharing. at scales ranging from continental to global, the river models that are the most widely used in existing literature are perhaps those of lohmann et al. [ ], lisflood-fp [bates and de roo, ], rtm [branstetter, ], hrr [beighley et al., ], and cama-flood [yamazaki et al., ]. the degree to which these software packages are shared varies widely (see appendix a for further details). lisflood-fp can be obtained upon request to the developers and is only available in the form of an executable (i.e., not the source code) that is limited to noncommercial use. hrr and cama-flood offer a greater degree of openness than lisflood-fp because their respective source codes can be obtained, although this also requires contacting the developers. the code for rtm can be downloaded upon open registration, and that of lohmann is openly available. perhaps because their source codes are readily accessible, these two latter river models appear to be the only ones that are currently used by large modeling centers: rtm is an integral part of the community land model [oleson et al., ], and the code of lohmann et al.
[ ] is used along with the north american land data assimilation system [lohmann et al., ; xia et al., ]. the generally limited tradition of sharing within the continental to global scale river modeling community makes this field of earth system modeling a suitable candidate (out of potentially many) for a case study on sharing source code and data, which is the purpose of this paper.

we here reflect on the benefits and challenges that have been associated with the open source development of an alternative river model (routing application for parallel computation of discharge (rapid) [david et al., d]) based on years of experience since its inception. this paper starts with a short background on rapid (section ). we then highlight three consecutive but distinct phases of the sharing process: opening (section ), exposing (section ), and consolidating (section ). the implications for earth science are then presented (section ), followed by our conclusions. this paper is intended as a reflective perspective in line with this special issue on geoscience papers of the future. our reflection based on the decade-long history of rapid includes answers to questions that are relevant to sharing any other type of earth science model: how to share a geoscience model? what are the minimum sharing steps? how to share data? how to make both model and data citable? how to programmatically reproduce an entire study? how to test a model and how to do so automatically? what are the limits of sharing? such questions must be answered if, as a whole, the geoscience community is to abide by agu’s motto of “unselfish cooperation in research.”

. background on rapid

the routing application for parallel computation of discharge (rapid) [david et al., d] is a numerical model that simulates the propagation of water flow waves in networks of rivers composed of tens to hundreds of thousands of river reaches.
river routing in rapid is based on the traditional muskingum method [mccarthy, ] adapted to a matrix-vector notation [david et al., d]. given the inflow of water from land and aquifers (as commonly computed by land surface models), rapid can simulate river discharge at the outlet of each river reach of large river networks. river routing with the traditional muskingum method is performed using two parameters: a temporal constant (k) characteristic of flow wave propagation and a nondimensional constant (x) representative of diffusive processes. each river reach j of a river network can be prescribed its own set of parameters (k_j, x_j). rapid includes an automated parameter estimation procedure based on an inverse method that searches for an optimal set of model parameters by comparing multiple model simulations with available observations located across the river basins of interest while varying the muskingum parameters. different types of river networks can be used including those based on traditional gridded representations [david et al., a] and those from vector-based hydrographic data sets (the “blue lines” on maps) [david et al., d, a, b, ]. river routing and parameter estimation can both be performed on a subset of a large domain in order to study local processes. rapid has been applied to areas ranging from km to km (figure ) and can be run on personal computers as well as on larger parallel computing machines with demonstrated fixed-size parallel speedup [david et al., d, a, ].

the idea of building a river routing model capable of simulating river discharge in large vector-based hydrographic data sets was first discussed with the author in january (figure ; this idea was discussed in a meeting between cédric h. david and david r. maidment held on thursday, january at the center for research in water resources of the university of texas at austin).
the first lines of the rapid source code were written in september after a year of initial research. the development of rapid has been ongoing since then, with a source code now containing over lines. the code is written using the fortran programming language and three scientific libraries: the network common data form (netcdf) [rew and davis, ] for the largest input and output files, the portable, extensible toolkit for scientific computation (petsc) [balay et al., , ] for linear algebra, and the toolkit for advanced optimization (tao) [munson et al., ] for automatic parameter estimation. petsc and tao both use a standard for parallel computing called the message passing interface (mpi) [dongarra et al., ].

the community of rapid users has grown steadily since the software’s inception a decade ago and currently includes researchers in universities, government agencies, and industry. notable applications of rapid include studies of river/aquifer interactions [saleh et al., ; flipo et al., ; thierion et al., ], continental-scale high-resolution flow modeling [tavakoly et al., ], decision support during droughts [zhao et al., ], computation of river height at the regional scale in preparation for an expected satellite mission [häfliger et al., ], nitrogen transport modeling [tavakoly et al., ], reservoir storage simulations [lin et al., ], and operational flood forecasting for the united states [maidment, ]. the breadth of the current rapid users community can be explained, at least in part, by a conscious effort to share the software and related material in an open manner.

figure . selected applications of rapid. (left) the san antonio and guadalupe basins were used in david et al. [ d]. the upper mississippi basin was used in david et al. [ d, a]. the texas gulf coast region was used in david et al. [ b]. the mississippi basin was used in david et al. [ ]. (right) the domain of sim-france was used in david et al. [ a].
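
as an illustration of the muskingum routing described in this background section, the sketch below routes flow through a single river reach using the two parameters k and x. it is a minimal sketch and not the rapid implementation: rapid is written in fortran and applies a matrix-vector form of the method to whole river networks. the function names, the time step dt, and the use of the standard textbook three-coefficient update are assumptions introduced here for illustration only.

```python
from typing import List, Sequence, Tuple


def muskingum_coefficients(k: float, x: float, dt: float) -> Tuple[float, float, float]:
    """Textbook Muskingum weights for one reach (hypothetical helper).

    k  : temporal constant of flow wave propagation (same time unit as dt)
    x  : nondimensional constant representing diffusive processes
    dt : routing time step
    """
    denom = k * (1.0 - x) + dt / 2.0
    c1 = (dt / 2.0 - k * x) / denom          # weight on inflow at the new step
    c2 = (dt / 2.0 + k * x) / denom          # weight on inflow at the previous step
    c3 = (k * (1.0 - x) - dt / 2.0) / denom  # weight on outflow at the previous step
    return c1, c2, c3


def route_reach(inflow: Sequence[float], k: float, x: float, dt: float,
                initial_outflow: float) -> List[float]:
    """Route an inflow series through one reach; returns the outflow series."""
    c1, c2, c3 = muskingum_coefficients(k, x, dt)
    outflow = [initial_outflow]
    for t in range(1, len(inflow)):
        # New outflow blends current inflow, previous inflow, and previous outflow.
        outflow.append(c1 * inflow[t] + c2 * inflow[t - 1] + c3 * outflow[-1])
    return outflow
```

the three weights always sum to one, so a steady inflow yields the same steady outflow; in a network setting each reach j would be given its own pair (k_j, x_j), and the inflow to a reach would combine upstream outflow with the lateral inflow from land and aquifers.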
. initial steps of sharing geoscience software: opening rapid

while many geoscientists are seduced by the concept of open source software, the lack of knowledge regarding where to initiate the process of sharing seems to be the most commonly acknowledged impediment. the purpose of this section is therefore to expose the three most basic steps involved in open sourcing in order to overcome this initial roadblock.

. . authorship and licensing of the source code

perhaps the two most important steps prior to actually sharing software are to determine the authorship and the type of license for use. authorship is key because publication is a prime metric for the scientific community. software licenses are essential to specify what type of usage the authors allow and to avoid any ambiguity with regard to potential restrictions intended by them. while authorship is commonly valued by academics, license specification in geoscience software is relatively infrequent, either because of lack of computer science training or because the consequences of missing licenses are underappreciated. yet sharing software without a license is unwise because it implies that software developers leave copyright details unspecified and therefore subject to default applicable laws. such details include questions related to reproducing, creating derived products, distributing, displaying, and performing the work [e.g., us congress, ].

figure . timeline of the development of rapid. note that, with the exception of v . . , the official version numbers all correspond to exact snapshots of the source code used in the writing of published rapid articles. the v . . benefited from data model enhancements compared to the alpha version which had been used in david et al. [ d].
the list of acronyms used is the following: center for research in water resources, the university of texas at austin (crwr), mines-paristech (mines), department of geological sciences, jackson school of geosciences, the university of texas at austin (dgs), university of california center for hydrologic modeling (ucchm), jet propulsion laboratory, california institute of technology (jpl), u.s. national science foundation project ear- (nsf-ear), french projects vulnar/piren-seine and mines-paristech (fr), u.s. national aeronautics and space administration projects nnx al g (nasa-ids ) and nnx ae g (nasa-ids ), university of california office of the president (ucop), jet propulsion laboratory strategic research and technology development (jpl-r&td), national center for atmospheric research advanced study program (nc), american geophysical union horton research grant (agu), and microsoft azure for research (msr).

in the case of rapid the determination of authorship was fairly straightforward because a unique contributor had been writing the source code. however, some licenses allow for an organization to be named in the license. at the time of the initial release of rapid, its development had taken place while the author had been in three different departments of two institutions located in two countries; hence, the selection of the organization was a challenge. fortunately, some licenses allow the author name to be used as the organization name as well, which was the option chosen for simplicity. decisions concerning authorship and organization were both made with agreement from all scientists involved in guidance and funding during the source code development.

the will to foster collaborations among the institutions that had contributed to the development of rapid and the desire to encourage broader community use both justified the release of the code in an “open source” manner.
three of the commonly used license types for open source software were considered in the selection: the gnu’s not unix (gnu) general public license (gnu gpl) in its version . , the massachusetts institute of technology (mit) license, and the berkeley software distribution (bsd) -clause license, all of which are accessible from the open source initiative (https://opensource.org/). these licenses all permit private use, commercial use, modifications, and distribution. additionally, these licenses all include a requirement for inclusion of the license and copyright notice when sharing the source code, and a statement that neither the software author nor the license owner can be held liable for any potential damages. the selection of the license to use for rapid was made based on the few aspects in which these common licenses differ. the gnu gpl was not chosen because of a requirement to keep the same license when distributing derived works (i.e., copyleft), which was found too restrictive and could have limited potential collaborations with industrial partners. the mit license does not specifically state that the developers’ names cannot be used in potential advertisements over derived products, which made the author uncomfortable. the bsd -clause license was therefore selected for rapid with agreement from all scientists involved in guidance and funding during the source code development.

note that a full in-depth review of the details associated with available open source licenses is beyond the scope of this paper. however, online services such as http://choosealicense.com/licenses/ or http://oss-watch.ac.uk/ provide helpful information regarding open source licenses and may be valuable to readers when deciding on a license for their own software.
additionally, the book of rosen [ ] offers an excellent in-depth analysis of open source licenses and presents many important aspects that are not mentioned here, including collective works, work made for hire, and dual licensing. the specification of both authorship and license can therefore be seen as a key endeavor prior to sharing software because it prompts discussions that help clarify the intentions and concerns of all parties involved. the clarifications provided by licenses then govern any potential utilization of the software that might otherwise have been unforeseen. the inclusion of the license in the source code then allows for clearer and safer sharing outside of the circle of developers than would otherwise be possible.

. . software documentation

an important practice before sharing is to document the software. at the most basic level, documenting involves describing the various tasks performed by the program with comments in the source code. readability is also appreciated by users; hence, some cleaning and formatting of the code can be beneficial. an additional, valuable step in documenting software is to describe the platforms supported (operating systems), programming language, dependencies on other software, steps of installation, and commands (or options) for execution. this can be done in a users’ manual or a short tutorial. however, writing full software documentation can be very time consuming, and the associated academic credit is currently limited compared to that gained from preparing a scientific manuscript. in the case of rapid, software documentation therefore initially focused on commenting, cleaning, and formatting the source code and on preparing a basic tutorial. further documentation was later prepared on an as-needed basis and improved over time. the “metric” used for determining the need for additional documentation is the frequency of requests for help from different users on any given topic.
current rapid documentation includes a series of portable document format (.pdf) files with information related to the basic functioning of the model, installation procedures, and example tutorials. therefore, despite the widely accepted value of users’ manuals, an alternative approach has been taken for rapid, in which a basic tutorial was initially produced and additional documents have been prepared when needed and improved iteratively.

. . sharing the software

once authorship and license are determined, basic formatting of the source code is performed, and minimal supporting documentation is prepared, the source code can be deemed ready to be distributed. at this stage researchers may consider sharing their software with collaborators using the simplest modes of transmission, including email or portable storage devices. additionally, geoscience researchers might wish to share their software beyond their group of direct collaborators. the most direct way to enable discovery of and access to the software (or the supporting information) is to create a website. the initial benefits of sharing the source code and its supporting documents online are twofold. first, a website provides a way for collaborators to access the software and material at any time even if the developers are not available. second, many potential collaborators might want to assess the capabilities of a piece of software and gauge whether or not it could meet their needs prior to contacting the authors. the first rapid website was published in july to promote community usage and facilitate ongoing collaborations and was at the time hosted by the software author’s institution.
at that point in the development, rapid had reached a certain level of maturity, demonstrated by its ability to run on two conceptually different types of river networks: a vector-based hydrographic data set [david et al., d] and a grid-based network [david et al., a]. note, however, that the corresponding manuscripts had yet to be accepted for publication (figure ). the website creation and online sharing of the open source software therefore occurred approximately years after the first lines of code were written but a year before the first papers were published. various updates to the rapid code were then routinely made available as compressed linux archives (.tar.gz) in the download section of the website. over the years, the site was migrated twice as the employment of the author evolved. the urls of each website were all published in a series of peer-reviewed papers; therefore, continuity among websites had to be ensured. this was made possible through automatic html redirection of older urls to the newer ones. the rapid website (now at http://rapid-hub.org) is still regularly updated and contains a basic introduction, a download page, links to publications, supporting documents, animations of model results, information on training, and contact details.

unexpectedly to the developer, the most immediate benefits of sharing the rapid source code and documentation online have been community feedback and contributions. bug reporting from users on applications to various computing environments (e.g., operating system and compiler collection) has enabled the source code to be strengthened. requests for clarification and reports of mistakes in the tutorials have both improved the documentation. the most gratifying feedback, however, has been when users write their own documents or programming scripts to be shared back to the community of rapid users. such contributions are also available on the rapid website.
note that while the focus here is on classic dedicated websites, online software communities can also provide valuable hosting services [e.g., horsburgh et al., ] and sometimes even include the ability to run software online [e.g., peckham et al., ]. finally, this study presents an approach in which the source code is shared only after reaching a certain level of maturity, partly because scientists tend to refrain from sharing raw software [barnes, ]. an alternative approach where software is shared from the onset could be equally valuable and potentially lead to similar but earlier benefits.

. fostering sustainable use and development of a geoscience model: exposing rapid

the first few years following the initial sharing of rapid online were accompanied by a slow but steady growth of its user base. the increasing size of the user community exposed a series of challenges that hindered sustainability of the sharing endeavor. the following describes the difficulties encountered and the steps taken to empower users by increasing transparency. these steps also saved developer time while increasing scientific outreach and furthering academic credit.

. . version control system and online code repository

between july , when rapid was first shared online, and march , a total of successive archives of the source code were made available on the website. for each revision, the entire source code was included in an archive file with a name that included the date of creation. a text file briefly describing the changes made to the code was then maintained (but not shared) by the author. the increasing number of versions of the source code associated with the growth of the rapid user community made it difficult to support users, particularly when their applications were based on outdated rapid code.
it became clear then that there was a need for a more advanced code management strategy, including mechanisms for fully documenting and exposing code changes, and for enabling automatic updates by users. documenting code changes has traditionally been done using a type of computer science tool called version control systems (vcss). notable examples include the concurrent versions system (cvs), subversion (svn), mercurial (hg), and git (git). vcss have many capabilities, including tracking, commenting, labeling, saving, comparing, and sometimes merging code changes. these systems can be deployed and used by a single person or by a team of developers to track code changes locally. alternatively, vcss are capable of interacting with online servers called code repositories, in which case the source code is hosted in a remote location. examples of code repositories include sourceforge, codeplex, google code, bitbucket, and github. the increasing popularity and usage of both git and github motivated their use for rapid. these two tools are designed to work hand in hand and together allow tracking and fully exposing code changes online. the somewhat overwhelming tasks of tracking and documenting changes in all previously released versions of the rapid code using git and publishing the full repository on github were therefore undertaken in march . despite a steep learning curve associated with understanding the functioning and usage of these tools, the effort turned out to be more natural and less time consuming (approximately h) than initially anticipated. one of the capabilities of git that was valuable for this endeavor is the tagging of the source code at any given stage. tags can then be used to navigate through various snapshots of the software. in the case of rapid, tags consisting of the previous release dates were used and hence enabled instant retrieval of any of the past versions.
version control for all previous releases was completed on april , a date that also coincided with the first code update published on github. the same process has continued since april , and the repository now contains tagged versions including five official releases (see below). the full downloadable repository including all previously released versions of rapid with tracked changes is available at https://github.com/c-h-david/rapid (figure ). this github repository allows obtaining all previous versions of the code, navigating among them, and retrieving any available updates; each of these actions requires only a single-line command.

several unanticipated aspects of version control systems and code repositories have also been beneficial to rapid. among them is the ability to create official releases of the source code with tags using classic version numbers. to leverage this capability, it was decided to associate official release numbers with the snapshots of the code used in peer-reviewed papers written in the development of rapid [i.e., david et al., a, d, a, b, ] with incremental version numbers from v . . to v . . . a direct link to the latest official release of rapid is at: https://github.com/c-h-david/rapid/releases/latest. this study will be a candidate for a new official version number. the github official releases are also convenient because they can be assigned a unique citable digital object identifier (doi) through a data repository called zenodo (see section . ) as was done in this study. each one of the official release numbers hence now has a citable reference (respectively, david [ , a, b, a, b]). another advantage of code repositories is the ability to browse through the various versions of the source code online, which is helpful for training, for debugging, and for electronic communications. also of note are the social media aspects of code repositories that allow users to be kept apprised of new software developments.
incidentally, industrial partners started manifesting their interest in rapid after it was version controlled and published in a code repository, perhaps because these joint efforts contributed to making the software more professional. the combined use of version control systems and code repositories for geoscience software therefore allows levels of transparency in code changes and accessibility to updates that are not possible with the more common sharing of snapshots of the source code. together, vcss and code repositories can help foster further community use while lightening the user support load and increasing author acknowledgment through citable references.

. . sharing the data

over the years of development of rapid, the size of all input and output files used in associated peer-reviewed publications has grown. these data sets have been stored on various machines and their cumulative size approaches gb. such data can fill a large portion of the total storage capacity of current personal computers and therefore represent a potential issue for data storage and data preservation. additionally, these files provide examples of the inputs necessary to run rapid (and of the outputs it produces) and can therefore serve as the basis for training material. finally, such files can be used to check that potential new modifications to the software have not altered its overall functioning (i.e., software testing). there are therefore many aspects for which the sharing of files associated with previously published rapid studies can be valuable, including data preservation, software training, and software testing. as with software sharing, the choice of authorship, license, and repository are at the base of data sharing.
the most widely used data licenses appear to be those of the creative commons (cc). cc licenses and their differences share many commonalities with the various software licenses (section . ) and specify details related to attribution, derivatives, distribution, and commercial use. readers may find the online service for guiding the choice of a creative commons license (https://creativecommons.org/choose/) helpful when picking a license for their data. note that the selection of a data license is a requirement before sharing data on online repositories. three repositories currently seem to be the most popular for scientific data sharing: figshare (http://figshare.com/), dryad (http://datadryad.org/), and zenodo (https://zenodo.org/). these repositories vary in their cost and storage capabilities, but all provide a unique digital object identifier (doi) making the data fully citable in peer-reviewed publications. data repositories (like software repositories) also allow for the inclusion of a short description of the files.

the input and output files associated with rapid simulations consist of a series of small comma-separated values (csv) files and of larger netcdf files. the csv files contain information including river network connectivity, muskingum model parameters, existing observations, and subbasin specifications. the netcdf files contain the inflow of water from land and aquifers into river reaches and the outflow of water from the river reaches.

figure . the github repository for rapid is available at http://github.com/c-h-david/rapid and offers direct access to all previous releases of the source code, including official versions. the successive modifications of all files are fully tracked and documented. the status badges in the readme file automatically link to the pass/fail state of the latest automatic build, and to the latest citable version.

the input and output files corresponding to peer-reviewed publications of rapid are usually either generated from scratch or prepared based on off-the-shelf data sets or on files prepared by coauthors. generally, the amount of work involved in the choice of data sources, data preparation, or data transformation is well aligned with the authorship of the associated papers. it was therefore decided that the authorship of the data sets used in the preparation of existing peer-reviewed rapid publications would mirror the authorship of the corresponding papers.

the details associated with each of the few creative commons licenses vary from very permissive to very restrictive. as with the choice of software license for rapid, the selection of data license was made to maximize community usage. the creative commons attribution (also known as cc by) license was chosen for rapid-related data sets as it allows for distribution, modifications, derived works, and commercial use. the cc by license is the most accommodating of the creative commons licenses, and its only requirement is to credit the original author(s) of the work.

figshare, a free data repository, is probably the most popular of the available data sharing services, but its current mb limitation on any file in each data set makes it impossible to share rapid files (often larger than gb). dryad only accepts files corresponding to peer-reviewed publications, which would satisfy our needs, but its service is not free, and data publication charges increase steeply once the size of each data set goes beyond gb (here again common in rapid data sets). zenodo was therefore chosen to host data associated with our peer-reviewed publications as it allows for many large files (each up to gb) and remains free.
zenodo is supported by the european center for nuclear research and appears to be functioning under stable funding. furthermore, zenodo guarantees survival of data sets and of their doi. the files corresponding to the first two peer-reviewed articles using rapid [i.e., david et al., a, d] were published in two separate zenodo repositories (respectively, david et al. [ b, c]) for the purpose of this study. a short note ( – pages) included in each data publication summarizes the sources for all raw data used and explains the content of each file as well as how it was prepared.

in addition to fulfilling recent requirements for increasing access to funded research [e.g., holdren, ], data publication is therefore beneficial for data preservation and storage, for training, and for software testing. perhaps more importantly for geoscience researchers, data publications can be officially cited, resulting in potential academic credit for authors [kratz and strasser, ], although the related benefits are complex (section . ).

. . training courses

to date, three training courses have been organized for rapid (figure ). the first training course was held at the institute of atmospheric physics of the chinese academy of sciences in beijing, china, on – may . the second training course was part of the community wrf-hydro modeling system training workshop held at the national center for atmospheric research in boulder, colorado, on – may . the third training course was organized during the national flood interoperability experiment summer institute held at the national water center in tuscaloosa, alabama, between june and july . each one of the training courses enabled further growth of the user community. additionally, each training course allowed the evaluation of the ease of use of rapid and of the quality of the tutorials.
note that while the first two training courses were taught by the lead software developer, the instructor of the third training was instead a member of the rapid user community. the third training course hence marked a transition when rapid users started teaching themselves, which can be seen as a milestone in user support. the organization of training courses therefore allows for the evaluation of software usability, but can also be seen as an integral part of community growth and engagement.

. facilitating updates of a geoscience model: consolidating rapid

inevitably, with years of development, the complexity of the source code for rapid increased. additionally, as rapid reached a certain level of maturity, the user community gradually became more active, and the frequency of requests for updates, upgrades, and bug fixes grew. at this stage in model development, the need for regular testing of the code after each modification became crucial to avoid the risk of becoming inundated by bug reports from users. it followed that a more efficient approach for model testing was needed to avoid being overwhelmed by user support activities, while sustaining community engagement and accommodating user requests.

. . the transition from manual to automatic testing

the many steps involved in testing software are often repetitive, tedious, and prone to human errors. the most basic testing steps consist of running the program with a given set of instructions and verifying that the outputs that are generated are as expected. if the source code and/or example data are not available locally, testing also involves downloading a series of files. additionally, when the software depends on other programs or libraries, their installation is required (see section . ) prior to the creation of the program executable from the source code.
as the frequency of software updates increases, full testing can consume a great deal of developers’ time and is therefore sometimes avoided, increasing the risk of releasing faulty source code.

the testing of rapid, like that of many other geoscience models, involves all the aforementioned testing steps. when performing testing operations manually, many of these steps are achieved using graphical user interfaces (guis), i.e., the most natural tools for human-machine interactions. downloading of the code and data is done using an internet browser or a file transfer protocol client. the instructions for each simulation are created and modified manually using a text editor. model outputs are checked visually by reading (for text outputs) or inspecting graphics (e.g., hydrographs created from binary outputs). however, despite the value of guis in easing human-machine interactions, the automation of tasks performed in guis is a challenge.

in most operating systems, text-only interfaces called shells allow users to request tasks from the computer by typing commands. shells are the most fundamental way in which users can interact with the system; they were in fact the main tools for interactions with computers before the advent of guis. each command entered in the shell results in an action performed and/or in text output. the success (or failure) of each action is then summarized by an integer number called an exit code, which is generated after each command execution, although it is kept hidden from the user unless specifically requested. a series of consecutive operations can therefore be easily automated by creating a text file containing the corresponding commands (i.e., a shell script), and their respective success can be checked using exit codes. hence, the automation of model testing can be accomplished if all the steps involved can be summarized in a series of simple commands and included in a script.
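the role of exit codes described above can be illustrated with a minimal sketch. python is used here only for illustration (the scripts described in this paper are shell scripts), and the three commands are hypothetical stand-ins for steps such as downloading, building, and running a model. the function runs the commands in order and stops at the first nonzero exit code.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order and stop at the first nonzero exit code.

    Returns a list of (command, exit_code) pairs for the steps that ran.
    """
    results = []
    for cmd in steps:
        completed = subprocess.run(cmd)
        results.append((cmd, completed.returncode))
        if completed.returncode != 0:
            break  # a failed step makes the later steps meaningless
    return results


# Hypothetical stand-ins for real steps (download, build, run).
demo_steps = [
    [sys.executable, "-c", "print('step 1 succeeded')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],  # simulated failure
    [sys.executable, "-c", "print('this step is never run')"],
]

for cmd, code in run_steps(demo_steps):
    print("exit code", code, "for:", cmd[-1])
```

the same pattern in a shell script would typically rely on the exit status of each command; the point of the sketch is only that a single integer per step is enough to decide whether a whole test sequence passed.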
despite a few years of programming experience gained in the development of the rapid source code, the author was not familiar with the many command line tools allowing such automation. however, after overcoming the initial learning curve associated with the discovery and use of a series of these tools, it was found that many of the steps involved in the testing of rapid can be performed directly from the shell using existing programs (see appendix b for examples on linux). the few tasks that could not be directly performed with off-the-shelf tools were specifically related to the outputs of rapid and necessitated the creation of ad hoc programs.

as mentioned earlier (section ), two main modes of usage currently exist for rapid: the first consists of performing flow simulations, and the second is dedicated to parameter optimization. when simulating flows, rapid generates a netcdf file containing the discharge for each river reach and at each time step. when checking simulations manually, hydrographs are created based on output files and verified visually. in order to allow for automated testing of simulations, a new approach was needed to compare two separate netcdf files and return an exit code for success if the files are similar. the issue here is that different computing environments can lead to slightly different results (generally on the order of m /s in rapid simulations) due to variations in the ordering of floating-point arithmetic operations. a bit-to-bit comparison of files or of their corresponding floating-point values was therefore not possible. a fortran program for testing the similarity of two rapid output files was hence created to automatically compare all simulated discharge values with an option for specifying acceptable absolute and/or relative tolerances.
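a tolerance-based comparison of this kind can be sketched as follows. this is not the fortran program used for rapid: it is a minimal python illustration that works on plain lists of discharge values rather than netcdf files, and the acceptance rule (a value passes if it is within either the absolute or the relative tolerance) is an assumption made for the example.

```python
def values_are_similar(reference, candidate, abs_tol=0.0, rel_tol=0.0):
    """Return True if all value pairs agree within abs_tol or rel_tol.

    Assumed rule: a pair passes when |candidate - reference| is at most
    abs_tol, or at most rel_tol * |reference|.
    """
    if len(reference) != len(candidate):
        return False
    for ref, new in zip(reference, candidate):
        diff = abs(new - ref)
        if diff > abs_tol and diff > rel_tol * abs(ref):
            return False
    return True


def similarity_exit_code(reference, candidate, abs_tol=0.0, rel_tol=0.0):
    """Map the comparison to a shell-style exit code: 0 = success, 1 = failure."""
    similar = values_are_similar(reference, candidate, abs_tol, rel_tol)
    return 0 if similar else 1


# Hypothetical discharge values from two computing environments.
ref_flow = [10.0, 250.0, 0.5]
new_flow = [10.000001, 250.00002, 0.5]
print(similarity_exit_code(ref_flow, new_flow, abs_tol=1e-4))
print(similarity_exit_code(ref_flow, new_flow, abs_tol=1e-9))
```

returning an exit code rather than printing a message is what lets such a check be chained with other commands in an automated test script.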
when run in parameter optimization mode, rapid generates a text file including all parameter values tested and the corresponding cost functions obtained, along with the final optimized parameter values. however, because the optimization method used is unconstrained, the parameters found automatically are sometimes void of physical meaning, in which case one has to handpick the best possible parameters from the list of physically valid values. additionally, given the dependence of the search space on the initial values used at the start of an optimization procedure, comparisons among a series of optimization experiments help select the best possible parameters. finally, these best possible parameters need to be compared with previous results to perform software testing. three automatic tasks were therefore needed to test the optimization procedure in rapid: ( ) finding the best valid parameters in a given optimization experiment, ( ) picking the optimal parameters among a series of experiments, and ( ) comparing these optimal parameters with previously computed values. three shell scripts using off-the-shelf linux programs were prepared to perform these three tasks. these scripts all consist of a series of text editing tasks that were made possible by combining existing linux tools (see appendix b).

overall, the greatest challenge in creating programs for testing of rapid was in the discovery of available command line tools. once this learning curve was overcome, the automation process became straightforward. beyond their value for automatic testing (section . ), the custom programs that were created also turned out to be beneficial for day-to-day usage of the software. indeed, the few manual tasks involved in basic postprocessing of rapid outputs were time consuming and prone to human error.
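the three optimization-testing tasks listed above can be sketched in a few lines. the following python example is illustrative only and does not reproduce the shell scripts of appendix b: the trial format (tuples of two parameters and a cost) and the validity rule (first parameter positive, second between 0 and 0.5) are hypothetical assumptions, not values taken from rapid.

```python
import math


def is_valid(k, x):
    # Assumed validity rule (hypothetical): k > 0 and 0 <= x <= 0.5.
    return k > 0.0 and 0.0 <= x <= 0.5


def best_valid(trials):
    """Task 1: lowest-cost trial among the valid ones, or None if none is valid."""
    valid = [t for t in trials if is_valid(t[0], t[1])]
    return min(valid, key=lambda t: t[2]) if valid else None


def best_overall(experiments):
    """Task 2: best valid trial across a series of experiments."""
    candidates = [b for b in map(best_valid, experiments) if b is not None]
    return min(candidates, key=lambda t: t[2]) if candidates else None


def matches_previous(found, previous, rel_tol=1e-6):
    """Task 3: compare the found (k, x) with previously computed values."""
    return all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(found[:2], previous[:2])
    )


# Hypothetical (k, x, cost) trials from two optimization experiments.
experiment_1 = [(0.3, 0.1, 5.0), (-0.2, 0.7, 1.0), (0.4, 0.2, 3.0)]
experiment_2 = [(0.35, 0.15, 2.5), (0.5, 0.9, 0.1)]

best = best_overall([experiment_1, experiment_2])
print(best, matches_previous(best, (0.35, 0.15)))
```

note that the lowest-cost trial in each hypothetical experiment is deliberately invalid, which is the situation that makes the first task necessary.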
the development of automatic tools for testing, albeit motivated by user support, therefore turned out to be a valuable endeavor for the developer as well.

. . testing by reproducing entire studies programmatically

once the programs that are necessary for automatic testing of a piece of software are prepared, the actual process of designing tests can begin. a choice therefore had to be made regarding which tests to perform. in the case of rapid, the creation of tests was motivated by the desire to check that any potential modification of the source code did not alter the overall functioning of the model. it was therefore decided to test software updates by automatically reproducing all model runs that had been performed in previously published studies. the first two rapid papers [i.e., david et al., a, d] were selected for this endeavor. for each study, two main shell scripts were created: one for downloading the corresponding data and the other for reproducing and checking all model runs, hence fully exposing the provenance of past results.

creating programs that automatically download data is fairly straightforward if the data sets are published online (section . ) and can be done using off-the-shelf command line tools. the design of programs that automatically reproduce a series of model runs requires further effort. because the generation of an executable from the source code (i.e., software build) can be time consuming, it is important that the code is designed so that all model options can be accessed at runtime (without rebuild). such capability is usually achieved through the use of an input text file containing instructions. several model runs can then be performed automatically solely by modifying the instruction file prior to execution. finally, programs for resetting each instruction file to its respective default state (chosen arbitrarily) turned out to be useful when launching multiple model runs successively.
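the idea of driving several model runs by editing an instruction file, and then resetting it, can be sketched as follows. this python example is only an illustration: the `name = value` file layout and the option names (`run_mode`, `input_file`) are hypothetical and are not the actual rapid instruction file.

```python
import re

# Hypothetical default instruction file content (option names are assumed).
DEFAULT_INSTRUCTIONS = """\
run_mode = simulation
input_file = inflow_default.nc
"""


def set_option(text, name, value):
    """Return instruction-file text with the value of `name` replaced.

    Raises KeyError if the option is absent, so a typo fails loudly.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}[ \t]*=[ \t]*).*$", flags=re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text)
    if count == 0:
        raise KeyError(name)
    return new_text


def reset_to_default():
    """Return the default instruction text (the arbitrary reference state)."""
    return DEFAULT_INSTRUCTIONS


# Prepare the instructions for a second run, then reset before the next one.
run_2 = set_option(reset_to_default(), "run_mode", "optimization")
print(run_2)
print(reset_to_default())
```

starting every run from the default state, rather than from whatever the previous run left behind, keeps successive runs independent of each other.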
the programs that were created for the automatic reproduction of the first and second rapid papers together combine tests ( and , respectively). these programs allow about an order of magnitude more tests than were used when performing manual testing, and the tests run approximately an order of magnitude faster. such a stricter testing procedure allowed for both increased robustness of the software and time savings. here again, the value of these tests goes beyond their use for development work, as they provide examples of model usage as well as a means for users to check their installation (or modifications) of rapid. the tests corresponding to the first rapid paper were actually run successfully by attendees of the second rapid training (section . ). finally, the automatic reproduction of existing studies fully documents the provenance of all simulations performed in the corresponding studies.

. . continuous integration

the principal strength of automatic tests is that they allow checking that the piece of software functions as expected after creation of the executable (build). however, geoscience models are often built upon existing software, all of which is already installed on the developer’s computer. the only way for developers to ensure portability of their geoscience model (i.e., to verify that there are no unexpected dependencies on their own system) is to install the model on a blank machine. this machine can be a dedicated computer requiring frequent reinstallation of the operating system or a less costly and less time consuming virtual machine (vm) that can be regularly reset using vm snapshots. another approach to guaranteeing portability, one that does not require manual resetting of the machine (actual or virtual), is made possible by hosted continuous integration (ci) services.
the purpose of hosted ci services is to monitor changes to a source code repository and perform a set of given tasks upon publication of updates. the list of tasks to be performed is provided in a text file that is specific to the ci service and that is included in the source code. the most basic task performed by the ci server is to build the software using the instructions provided. any time that an update is published on the code repository, the ci server automatically creates a clean operating system, downloads the source code, and builds the software.

several services currently exist for hosted continuous integration, including travis ci, codeship, and circle ci. these services share many commonalities including the capacity to interact with github repositories. travis ci was chosen here for rapid because of its relatively higher popularity, its tight coupling with github (using the same identifier and password), and its free support of open source projects. surprisingly, activating travis ci for rapid turned out to be a very straightforward two-step process. the first step consists of adding a text file called .travis.yml to the github repository including the series of shell commands necessary to install rapid and its dependencies. these commands were already summarized in one of the existing tutorials, which eased this procedure. the second step consists of logging in on travis ci using github credentials and activating the monitoring of the rapid github repository. the result of this two-step process is available at https://travis-ci.org/c-h-david/rapid (figure ) and includes information on all previous builds of rapid since the continuous integration process was activated. of particular interest here is the pass/fail status of the latest build.
besides demonstrating the portability of the rapid source code through automatic building of the software, one of the direct benefits of hosted continuous integration was to fully trace the software dependencies and environment variables that are needed to build rapid. a successful build on a ci server guarantees that all necessary software was properly described and therefore that the installation procedure was fully documented. setting up continuous integration for rapid also proved to be immediately valuable as it brought to light some dangerous programming practices present in the source code (e.g., lack of variable initialization and implicit data type conversions). these weaknesses were exposed because the compiler collection used in the ci server provided different warnings than that of the development machine. continuous integration therefore had an immediate impact on the quality of the rapid code. another benefit of continuous integration is the ability to advertise the validity of the latest releases through a “status badge” confirming that the latest build was successful (figure ).
in addition to enhancing the portability and quality of the code, continuous integration helped with automatic testing of rapid. once automatic tests were included in the code repository, their use in the continuous integration process was easily activated through inclusion in the ci server instructions. the only difficulty associated with testing within the ci server was to ensure that the integration process (i.e., the combination of software building and testing) could be performed in a limited period of time ( min for travis ci), so tests had to be split into smaller groups running on different travis ci computers. because exit codes are the sole means for the ci server to determine success (or failure) of each command, their proper handling within testing scripts (section . ) was particularly key for continuous integration.
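the two practical points above, splitting tests into groups and relying on exit codes, can be sketched as follows. this is a simplified illustration and not the rapid testing scripts: the function names select_tests and run_group, the round-robin assignment of tests to workers, and the convention that a test is a command taking its number as argument are all assumptions.

```shell
#!/bin/sh
# select_tests TOTAL WORKERS WORKER (hypothetical helper)
# print the numbers of the tests (1..TOTAL) assigned to worker WORKER
# (1..WORKERS) when tests are dealt out round-robin.
select_tests() (
    total=$1; workers=$2; worker=$3
    n=$worker
    while [ "$n" -le "$total" ]; do
        echo "$n"
        n=$((n + workers))
    done
)

# run_group TOTAL WORKERS WORKER COMMAND (hypothetical helper)
# run "COMMAND n" for each test n of this worker. the exit code of the
# first failing test is passed on, so that a ci server sees the failure.
run_group() (
    total=$1; workers=$2; worker=$3; command=$4
    for n in $(select_tests "$total" "$workers" "$worker"); do
        "$command" "$n" || exit $?
    done
    exit 0
)
```

each worker would then call run_group with its own worker number, and the ci server would mark the build as failed as soon as one group returns a nonzero exit code.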
the benefits of the temporal investments made in sharing code, data, and in preparing automatic tests therefore became even greater when using continuous integration, as they together allowed for machines to take over many of the time-consuming parts of code development.
. implications for earth science: lessons learned
. . on the phases of sharing and their associated benefits
the first phase of sharing, i.e., “opening” a piece of software, consists in a series of consecutive steps. perhaps the most important step in this phase is to determine the authorship and license, hence clarifying potential ambiguities on permissions and restrictions intended by the authors prior to release. software description (through cleaning, formatting, and commenting the code, and/or through preparation of a basic tutorial) allows potential peers to independently start using the software and evaluating the associated capabilities. note that our recommendation for minimal initial software description is marginally more time consuming than sharing code “as is” [e.g., barnes, ]. the creation of a website is then the natural next step to provide continuous access to (and a means of discovery for) the software and associated documents, which together permit the fostering of a burgeoning community of users.
one of the benefits of opening software, albeit not easily quantifiable, is the satisfaction one may get in having their research used by peers, hence contributing to one’s community. in addition, opening software has direct benefits for the code itself. initially, community feedback on potential bugs in the code and on the clarity of the code and tutorials is to be expected. eventually, community contributions to the software knowledge base (tutorials and processing scripts) are even more rewarding.
the second phase of sharing, i.e., “exposing” the software and data, consists in using version control systems (vcss) to track code changes, in publishing code and data through online repositories, and in organizing training courses. vcss allow documenting code changes, labeling versions, comparing snapshots, and merging differences. code repositories are companions to vcss and facilitate browsing through code, increasing the transparency of changes, publishing official versions, and increasing accessibility to updates. the social media capabilities of code repositories also allow users to be kept apprised of software developments. online availability is equally helpful for discussing specifics of the source code with remote users. data repositories are valuable for storage and preservation, but also because sharing example data provides useful training material and later supports automatic testing. perhaps more importantly [e.g., kattge et al., ; kratz and strasser, ], data repositories enable potential furthering of academic credit through citable material (for both code and data) while fulfilling recent requirements in accessibility to direct products of funded research [holdren, ]. code and data sharing are particularly valuable once a small community of users exists. training courses allow for the evaluation of software usability and are also an integral part of user engagement. exposing software empowers users through eased access and usability and hence contributes to the growth of the users community while simultaneously saving developer time, therefore making the sharing process more sustainable. finally, this phase demonstrates a certain level of maturity in the software and thus helps attract more users including potential industry partners.
figure . the continuous integration server for rapid is available at http://travis-ci.org/c-h-david/rapid and provides details on all previous builds of the software. the continuous integration of rapid is currently running on fifteen separate travis ci workers. note that the build time provided is the sum of the run times of all workers and that the wall clock time is shorter as several workers run concurrently.
the third phase of sharing, “consolidating” software, becomes necessary as the community of users continues to grow and the frequency of requests for code modifications increases. an active user base is very valuable because it not only generates motivation for software improvements but also demands rigorous testing of updates, at the risk of otherwise being inundated by bug reports. manual testing (a repetitive, tedious, and time-consuming task that is prone to human error) is then no longer appropriate because the associated temporal investments become an impediment to user support and hence to sustainable sharing. at such a stage of open source development, the creation of automatic testing tools and the activation of hosted continuous integration become necessary. the automatic tests facilitate tremendous savings in developer time and enable users to check their installation and/or potential code modifications. continuous integration enhances software portability through ensuring a full and up-to-date description of software dependencies. together, continuous integration and testing eventually allow saving time, strengthening the code, and benefiting day-to-day operations. from a developer’s perspective, continuous integration enables a more sustainable approach to software development through letting machines take over many of the time-consuming development tasks and therefore frees up availability to further community engagement.
the three phases of open source sharing that are presented here are incremental and align with the size of the users community.
therefore, while every phase enhances the sharing endeavor compared to its predecessor, each implementation can be spread out over the lifetime of the software (figure ). note that the completion of the various steps in each phase (or of the phases themselves) can sometimes be made in a different order than that proposed here. for example, continuous integration can be implemented before automatic testing, or data can be published before using vcss and code repositories. likewise, vcss can be used prior to sharing the software, as can be needed if several developers are initially involved. however, we highly recommend the selection of a license (a step that is too often overlooked in practice) before sharing.
. . implications for source code development
our experience with the open source development that is presented in this study has highlighted three aspects of software design that are important to the sharing process: the necessity of a strong data model, the importance of instruction files, and the choice of the hosting location for material linking data and software.
the data model is the standard used to describe the content and format of the various files read or created by the earth science model, along with how they relate to one another. this description includes the name and content of variables accompanied by their computer-based representation (i.e., character, logical, integer, floating point, and associated precision), the sorting details (e.g., arbitrary, ascending, and descending), and the file type used to store the data (e.g., csv and netcdf). the data model has a direct impact on the functioning and performance of an earth science model and is therefore usually defined before or at the same time as the model, but it is almost always refined with time. one should hence actively promote early data model stability for model development.
additionally, from a data sharing perspective, one should also wait for some stabilization of the data model before publishing data sets, at the risk of otherwise making such data sets quickly obsolete. a stable data model is also advantageous because it facilitates the consistency of tutorials and automatic tests. in our experience, a good metric for determining the maturity of the data model has been its ability to accommodate conceptually different types of inputs.
at the initial stage of development, many variables are often “hardwired” in the earth science model source code, be it because a drive for early results prevails over recommended programming practices or perhaps because hardwiring can sometimes facilitate the detection of programming inconsistencies (e.g., array sizes) by compilers. this advantage comes at a price: the source code needs to be rebuilt every time new instructions are used. as the software matures, it is therefore good practice to combine all possible instructions in a single text file that is read at runtime. the instruction file then enables multiple simulations to be run without rebuilding the source code, which greatly eases day-to-day operations and automatic testing.
some files create a link between the source code and the data of an earth system model. examples include the data downloading script (which contains the names of all input and output files) and the instruction file (which can also contain their respective sizes). in the process of sharing source code and data, one may therefore wonder which of the associated repositories is the most appropriate to store these specific files. our experience has shown that it is helpful to include the data download scripts with the source code so that users can readily obtain example data to check their installation and/or modifications of the code and in order to ease the continuous integration process.
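to make the role of the instruction file described above more concrete, the sketch below reads settings from a text file at runtime, so that several simulations can be run with a single build. the "name = value" layout and the function get_setting are hypothetical; they illustrate the idea only and do not describe the actual rapid instruction file. the sketch also assumes that setting names contain no characters that are special in regular expressions.

```shell
#!/bin/sh
# get_setting FILE NAME (hypothetical helper)
# print the value given for NAME in FILE, where each setting is written on
# its own line as "NAME = value". if NAME appears several times, the last
# occurrence wins. nothing is printed if NAME is absent.
get_setting() (
    file=$1; name=$2
    # strip the name, the "=" sign, and surrounding blanks; keep the value
    sed -n "s/^[[:space:]]*${name}[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*\$/\1/p" "$file" |
        tail -n 1
)
```

with such a helper, a script could loop over several instruction files and hand each one to the same executable, which is what allows automatic tests to cover many simulations without rebuilding.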
additionally, while the file names and sizes corresponding to a set of example input/output files tend to remain stable, the associated variable names within the source code might evolve (e.g., to improve readability), in which case old instruction files would no longer be applicable. it is therefore good practice to also keep example instruction files with the source code and not with the data.
. . remaining community challenges and the limits of sharing
as geoscientists further embrace digital scholarship, a series of community challenges are to be expected. these challenges include matters related to the following: technical training, self-perceived inadequacy, community-based documentation, acknowledgment and assessment of digital scholarship, and sustainable sharing.
as mentioned earlier (section ), geoscientists seem to often justify the lack of sharing by a lack of know-how, at least in conversation. this paper touches on many aspects related to code and data sharing including licenses, repositories, versioning, testing, and continuous integration. while these subject matters are commonly taught in computer science departments, they are typically absent from most geoscience departments. such a lack of training has now been recognized by computer science colleagues [e.g., hey and payne, ]. if digital scholarship is expected of geoscientists, it must also become part of their curriculum in universities.
another common justification for the lack of sharing seems to be that computer codes created are too simple (or “not good enough”) to be made public. this apparent self-perceived inadequacy was humorously discussed by barnes [ ] who argued that if the code does the task that it is designed for, then it is good enough to be shared. our opinion is well in line with that of barnes, as too many of us geoscientists spend much of our time writing code that others have written before, and more will write in the future.
while much can be learned writing one’s own code, having access to examples could lead to community-wide temporal savings.
the need to document software can also be seen as an impediment to sharing. preparing documentation and keeping it up to date with latest code developments is indeed time consuming. while some argue for sharing as is [e.g., barnes, ], our experience has shown that if there is in fact a community need for the tools, users are likely to manifest themselves with questions. we therefore recommend here a limited amount of editing and formatting of the source code, as was already advocated for by easterbrook [ ], and suggest that the preparation of a small tutorial suffices as initial supplementary documentation. such an approach was taken for rapid and was later rewarded by contributions to the documentation from enthusiastic members of the users community.
this case study suggests that the many steps involved in sharing together require substantial dedication, particularly as the users community grows. in the words of easterbrook [ ], “making a code truly open source […] demands a commitment that few scientists are able to make.” the geoscience community should therefore acknowledge sharing efforts in a way comparable to traditional scientific article publications [e.g., kattge et al., ]. fortunately, modern technology now allows for digital products to be fully citable (see sections . and . ), which is a significant step toward acknowledging contributions [kratz and strasser, ]. however, some scientific software eventually grows beyond the publishing scientific community, in which case an alternative measurement of its broader impact would be valuable. unfortunately, there does not appear to be a way to track downloads in current data or software repositories. github does provide a download count, but this capability is limited to the binary files associated with official releases which only represent a portion of the total number of downloads. the lack of this capability is perhaps why many developers wish to be contacted by users before granting access to their code. this might also explain the registration systems used by some modeling centers to track the usage of their codes [e.g., hurrell, ].
further, while we agree on the many benefits of citable digital research products [kattge et al., ; scientific data, ; kratz and strasser, ], we argue here that an additional step is needed to promote open research: a cultural shift in the assessment of research performance. digital scholarship can be seen as impactful for research, education, and outreach, and researchers might respond more enthusiastically to the added burden if their peers (i.e., colleagues and tenure and promotion committees) valued the efforts. this expectation in turn means digital scholars must become advocates of their own cause.
finally, and contrary to common belief, open source software does not mean free user support [barnes, ; easterbrook, ]. this unfortunate misconception hinders sustainable sharing as it does sustainable research. an analogy between traditional publishing and digital scholarship can be made here. it is common that a given researcher x reads a scientific paper written by another researcher y, applies the published methods to his/her case study, and writes their own paper citing the work of y. however, it is less usual for x to ask y for help with the data collection or application of the methods to the new case study without an implied understanding of coauthorship. such is particularly true if the associated efforts require rigorous data collection, detailed data inspection, and/or enhancement of the methods. the same modus operandi can reasonably be applied to digital scholarship. citation of the digital research products is appropriate, and sufficient, when using these products as is.
however, if user support requires “substantial” expertise or involvement from the developers, coauthorship seems appropriate. similarly, if research proposals planning to use open source software are likely to necessitate assistance from the developers, a proportionate amount of funding can reasonably be requested. such funding can then be used to answer new scientific questions and leveraged for support. developers must therefore acknowledge that they too often drown (happily) in the time sink of user support. the benefits of community feedback cannot alone justify the associated efforts as developers’ time could be very well spent instead on new publications or new research proposals. as we encourage geoscientists to enthusiastically embrace the open source approach, our community must therefore also strive for a proper balance between further sharing and sustainable research.
. conclusions
as geosciences gradually evolve to rely on increasing amounts of computer-aided methods and our society clamors for further transparency in the products of the research it supports, many geoscientists are faced with the challenges of digital scholarship. the importance of learning best sharing practices is particularly acute in the general field of earth science modeling (i.e., the creation, update, and maintenance of numerical models used to study the dynamic elements of the earth), which is a key component of current climate change studies. this paper focuses on the specific field of continental to global scale numerical modeling of flow wave propagation in rivers, one of potentially many scientific areas in which open development is uncommon, merely as an avenue to reflect on the open sharing process in earth science.
this study presents reflections based on the years of development of an open source river model called rapid and highlights three consecutive but distinct phases of the sharing process: ( ) opening, ( ) exposing, and ( ) consolidating. each of these phases responds to a users community of increasing size. phase (opening) consists in selecting an open source license, cleaning and formatting of the source code associated with preparing a short tutorial, and publishing the source code on a dedicated website. this first phase is the minimal sharing phase and allows fostering a burgeoning community of users. phase (exposing) consists in tracking code changes using version control systems (vcss), publishing the source code and example files on citable software and data repositories, and organizing software trainings. this second phase becomes necessary as the user community grows, in order to facilitate transitions between existing versions of the source code and to provide example case studies. phase (consolidating) consists in creating a set of automatic tests and activating continuous integration of the software. this third phase becomes needed to ease the sharing process when the frequency of requests for software modifications increases along with the activity of the users community. note that while the phases presented here mirror the rapid development timeline, the order in which each phase (or its associated steps) is completed can vary among software projects.
the case study herein provides details on the several phases involved in open sourcing of a numerical model of a component of the earth system and highlights the many benefits (mutual to developers and to users) of open source development.
in addition to contributing to transparency in science, open source sharing allows for ( ) improvement of software and documentation, ( ) temporal savings through letting machines take over many of the repetitive aspects of software use and development, and ( ) academic credit through citable digital research products. note that all the services used in this study are available at no cost to open source developers.
however, the benefits of further sharing also come with a substantial time commitment as the sharing process becomes increasingly demanding with a growing user base. this added burden must be managed by the geoscience community as a whole, and several potential avenues are proposed here. first, the inclusion of digital scholarship methods in the geoscience curriculum of universities could greatly ease the associated learning curve. second, geoscientists need to overcome the self-perceived inferiority of their computer code and instead embrace the many benefits of peer review for code development. third, users of open source software should consider contributing to their own community through helping with the associated documentation. fourth, traditional scholarship and digital scholarship should weigh equally in the acknowledgment, evaluation, and promotion of geoscientists, because digital scholarship equally impacts research, education, and outreach. finally, the geoscience community might consider including open source developers in their peer-reviewed manuscripts and research proposals when making substantial use of the developers’ expertise in their endeavors in order to foster sustainable sharing practices.
appendix a: on the apparent level of sharing in selected river models
the information below was retrieved at time of writing ( august ) for selected river models with applicability from regional to global scale:
the code of lohmann et al.
[ ] is available for download as part of the north american land data assimilation system and does not appear to include a license: http://www.nco.ncep.noaa.gov/pmb/codes/ nwprod/nldas.v . . /sorc/nldas_rout.fd/
the website for lisflood-fp [bates and de roo, ] states, “we are happy to provide a copy of the executable for noncommercial studies”: http://www.bristol.ac.uk/geography/research/hydrology/models/lisflood/downloads/
rtm [branstetter, ] can be downloaded from the community earth system model [hurrell et al., ] website at http://www.cesm.ucar.edu/models/cesm . /, which states that “a short registration is required to access the repository.”
the website for hrr [beighley et al., ] states that “if you would like the source code, please email dr. beighley”: http://www.northeastern.edu/beighley/hillslope-river-routing-hrr-model/
a similar statement is provided for cama-flood [yamazaki et al., ]: “please contact to the developer (dai yamazaki) for the password to download the cama-flood package”: http://hydro.iis.u-tokyo.ac.jp/ ~yamadai/cama-flood/
appendix b: example programs and special characters for automatic testing in linux
the automation of rapid testing in this study was made possible through the use of a series of programs and special characters including some that, despite a few years of experience with programming on linux, were not already known to the author. a nonexhaustive summary is provided here in the hope that readers might find these helpful for testing their own geoscience software, merely as supporting information to comprehensive references [e.g., siever et al., ; jones, ].
the generation of an executable from the software source code (compilation and linking, or build) is a convoluted and multistep process that can be transformed into a one-line command by using a program called make, whose instructions are in a file called makefile. word count (wc) can be used to count the number of lines in a file.
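the programs and special characters summarized in this appendix can be combined as in the short sketch below. the function names are made up for illustration and do not come from the rapid testing scripts; the equality test assumes gnu bc, in which a comparison evaluates to 1 or 0.

```shell
#!/bin/sh
# count_lines FILE: number of lines in FILE (wc); "<" reads FILE as input
count_lines() { wc -l < "$1" | tr -d ' '; }

# has_string FILE STRING: exit code 0 if STRING occurs in FILE (grep)
has_string() { grep -q -F -- "$2" "$1"; }

# replace_string FILE OLD NEW: write a copy of FILE to FILE.new in which
# OLD is replaced by NEW (sed); ">" redirects the resulting text to a file
replace_string() { sed "s/$2/$3/g" "$1" > "$1.new"; }

# numbers_equal A B: exit code 0 if A and B are numerically equal (bc);
# "|" uses the text produced by echo as the input to bc
numbers_equal() { [ "$(echo "$1 == $2" | bc)" -eq 1 ]; }

# describe_args ARG...: print the special values available in a script
describe_args() {
    echo "number of arguments: $#"
    echo "first argument: $1"
    echo "all arguments:" "$@"
    has_string /dev/null "nothing to find"    # fails: /dev/null is empty
    echo "exit code of the previous command: $?"
}
```

a test script can chain such helpers and finish with the exit code of the first check that fails, which is the value a ci server relies on.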
globally search a regular expression and print (grep) can locate a string of characters in a text file. the stream editor (sed) can search for a string of characters in a file and replace it by another. the basic calculator (bc) can be used to compare two numbers and provide a boolean value summarizing whether or not they are equal. the worldwide web get (wget) allows for downloading files from the internet.
special characters in a given linux shell can also ease the automation of software testing. here we focus on one of the most common linux shells used in scientific computing called the bourne again shell (bash). helpful values can be obtained at the command line in bash or in a bash script: the number of arguments provided ($#), the list of all arguments ($@), the first argument ($ ), and the exit code of the previous command ($?). the text resulting from the execution of one command can be redirected to a text file (>) or used as the input to another command (|).
references
anthes, r. ( ), summary of workshop on the ncar community climate/forecast models – july , boulder, colorado, bull. am. meteorol. soc., ( ), – . balay, s., w. d. gropp, l. c. mcinnes, and b. f. smith ( ), efficient management of parallelism in object oriented numerical software libraries, in modern software tools in scientific computing, edited by e. arge, a. m. bruaset, and h. p. langtangen, pp. – , birkhauser, cambridge, mass. balay, s., j. brown, k. buschelman, v. eijkhout, w. d. gropp, d.
kaushik, m. g. knepley, l. curfman mcinnes, b. f. smith, and h. zhang ( ), petsc users manual (revision . ), argonne natl. lab, argonne, ill. barnes, n. ( ), publish your computer code: it is good enough, nature, , , doi: . / a. bates, p. d., and a. p. j. de roo ( ), a simple raster-based model for flood inundation simulation, j. hydrol., ( - ), – . beighley, r. e., k. g. eggert, t. dunne, y. he, v. gummadi, and k. l. verdin ( ), simulating hydrologic and hydraulic processes throughout the amazon river basin, hydrol. processes, ( ), – , doi: . /hyp. . bell, c. g. ( ), the future of scientific computing, comput. sci., , – . bell, g., t. hey, and a. szalay ( ), beyond the data deluge, science, ( ), – . branstetter, m. ( ), development of a parallel river transport algorithm and applications to climate studies, phd thesis, univ. of texas, austin. bryan, k., and m. d. cox ( ), a numerical investigation of the oceanic general circulation, tellus, ( ), – , doi: . /j. - . .tb .x. david, c. h. ( ), rapid v . . , zenodo, doi: . /zenodo. . david, c. h. ( a), rapid v . . , zenodo, doi: . /zenodo. . david, c. h. ( b), rapid v . . , zenodo, doi: . /zenodo. . david, c. h. ( a), rapid v . . , zenodo, doi: . /zenodo. . david, c. h. ( b), rapid v . . , zenodo, doi: . /zenodo. . david, c. h., f. habets, d. r. maidment, and z.-l. yang ( a), rapid applied to the sim-france model, hydrol. processes, ( ), – . david, c. h., f. habets, d. r. maidment, and z.-l. yang ( b), rapid input and output files corresponding to “rapid applied to the sim- france model”, zenodo, doi: . /zenodo. . david, c. h., d. r. maidment, g.-y. niu, z.-l. yang, f. habets, and v. eijkhout ( c), rapid input and output files corresponding to “river network routing on the nhdplus dataset”, zenodo, doi: . /zenodo. . david, c. h., d. r. maidment, g.-y. niu, z.-l. yang, f. habets, and v. eijkhout ( d), river network routing on the nhdplus dataset, j. hydrometeorol., ( ), – . david, c. h., z.-l. yang, and j. s. 
famiglietti ( a), quantification of the upstream-to-downstream influence in the muskingum method and implications for speedup in parallel computations of river flow, water resour. res., , – , doi: . /wrcr. . david, c. h., z.-l. yang, and s. hong ( b), regional-scale river flow modeling using off-the-shelf runoff products, thousands of mapped rivers and hundreds of stream flow gauges, environ. modell. software, , – . david, c. h., j. s. famiglietti, z.-l. yang, and v. eijkhout ( ), enhanced fixed-size parallel speedup with the muskingum method using a trans-boundary approach and a large sub-basins approximation, water resour. res., , – , doi: . / wr . dongarra, j., et al. ( ), special issue—mpi—a message-passing interface standard, int. j. supercomput. appl. high perform. comput., ( - ), – . easterbrook, s. m. ( ), open code for open science? nat. geosci., ( ), – . flipo, n., c. monteil, m. poulin, c. de fouquet, and m. krimissa ( ), hybrid fitting of a hydrosystem model: long-term insight into the beauce aquifer functioning (france), water resour. res., , w , doi: . / wr . häfliger, v., et al. ( ), evaluation of regional-scale river depth simulations using various routing schemes within a hydrometeorological modeling framework for the preparation of the swot mission, j. hydrometeorol., ( ), – , doi: . /jhm-d- - . . hey, t. ( ), science has four legs, commun. acm, ( ), doi: . / . . hey, t., and m. c. payne ( ), open science decoded, nat. phys., ( ), – . hey, t., s. tansley, and k. tolle ( ), jim gray on escience: a transformed scientific method, in the fourth paradigm. data intensive scientific discovery, edited by t. hey, s. tansley, and k. tolle, pp. xvii–xxxi, microsoft res, redmond, wash. holdren, j. p. ( ), memorandum for the heads of executive departments and agencies. increasing access to the results of federally funded scientific research, exec. off. of the pres., off. of sci. and technol. policy, washington, d. c. horsburgh, j. s., m. m. morsy, a. m. 
castronova, j. l. goodall, t. gan, h. yi, m. j. stealey, and d. g. tarboton ( ), hydroshare: sharing diverse environmental data types and models as social objects with application to the hydrology domain, j. am. water resour. assoc., doi: . / - . . hurrell, j. w. ( ), ncar in the st century building on a distinguished record of achievement, leadership and service. hurrell, j. w., et al. ( ), the community earth system model: a framework for collaborative research, bull. am. meteorol. soc., ( ), – , doi: . /bams-d- - . . ince, d. c., l. hatton, and j. graham-cumming ( ), the case for open computer programs, nature, ( ), – , doi: . / nature . intergovernmental panel on climate change ( ), climate change : the physical science basis. contribution of working group i to the fifth assessment report of the intergovernmental panel on climate change, edited by t. f. stocker et al., pp. – , cambridge univ. press, cambridge, u. k. jones, m. t. ( ), gnu/linux application programming, nd ed., course technol., cengage learn., boston, mass. kattge, j., s. diaz, and c. wirth ( ), of carrots and sticks, nat. geosci., ( ), – . kratz, j. e., and c. strasser ( ), researcher perspectives on publication and peer review of data, plos one, ( ), e , doi: . / journal.pone. . lin, p., z.-l. yang, x. cai, and c. h. david ( ), development and evaluation of a physically-based lake level model for water resource management: a case study for lake buchanan, texas, j. hydrol., (part b), – , doi: . /j.ejrh. . . . lohmann, d., r. nolteholube, and e. raschke ( ), a large-scale horizontal routing model to be coupled to land surface parametrization schemes, tellus, ser. a, ( ), – . lohmann, d., et al. ( ), streamflow and water balance intercomparisons of four land surface models in the north american land data assimilation system project, j. geophys. res., , d s , doi: . / jd . maidment, d. r. ( ), a conceptual framework for the national flood interoperability experiment, cuahsi. 
earth and space science . / ea
a decade of rapid

acknowledgments
this work was supported by the jet propulsion laboratory, california institute of technology, under a contract with the national aeronautics and space administration and by the university of california office of the president multicampus research programs and initiatives; both institutions are gratefully acknowledged. this research was also partially supported by a microsoft azure for research grant from microsoft research. the practical application of this study was enabled by the netcdf, mpich, petsc, and tao scientific libraries that were all built with the gnu compiler collection installed on a community enterprise operating system (centos). version control was performed using git. this study was also made possible using the following free online services: github (code repository), zenodo (data repository), and travis ci (continuous integration). comments from the editor, associate editor, and two anonymous reviewers on earlier versions of this manuscript are gratefully acknowledged. the authors are thankful to yolanda gil for stressing the importance of software licenses and for continuous support throughout the research presented in this paper, to an anonymous reviewer of a national science foundation review panel for suggesting the use of github, to luke a. winslow for demonstrating the power of continuous integration and for mentioning travis ci, and to lars holm nielsen of zenodo for support on hosting the rapid input and output files. thank you to the earthcube ontosoft leadership team and the advisory committee members for enlightening discussions. the two data sets used in this paper [david et al., b, c] are openly available through their respective digital object identifiers (doi).
the piece of software used herein includes the scripts to reproduce all numerical experiments performed in this paper (hence describing provenance of results) and is openly available at https://github.com/c-h-david/rapid/tree/ ; it will be assigned a doi [e.g., david, , a, b, a, b] pending publication of this study.

manabe, s. ( ), climate and the ocean circulation: . the atmospheric circulation and the hydrology of the earth’s surface, mon. weather rev., ( ), – . mccarthy, g. t. ( ), the unit hydrograph and flood routing. miller, j. r., g. l. russell, and g. caliri ( ), continental-scale river flow in climate models, j. clim., ( ), – , doi: . / - ( ) < :csrfic> . .co; . munson, t., j. sarich, s. wild, s. benson, and l. curfman mcinnes ( ), tao user manual (revision . ), math. and comput. sci. div., argonne natl. lab, argonne, ill. [available at http://www.mcs.anl.gov/tao.] nature ( ), code share, nature, , . nature geoscience ( ), towards transparency, nat. geosci., ( ), . oleson, k., et al. ( ), technical description of version . of the community land model (clm), tech. note ncar/tn- +str, ncar, boulder, colo. peckham, s. d., e. w. h.
hutton, and b. norris ( ), a component-based approach to integrated modeling in the geosciences: the design of csdms, comput. geosci., , – . peng, r. d. ( ), reproducible research in computational science, science, ( ), – , doi: . /science. . phillips, n. a. ( ), the general circulation of the atmosphere: a numerical experiment, q. j. r. meteorol. soc., ( ), – , doi: . / qj. . rew, r., and g. davis ( ), netcdf—an interface for scientific-data access, ieee comput. graphics appl., ( ), – . rew, r., d. heimbigner, and w. fisher ( ), announcing a transition to github. rosen, l. e. ( ), open source licensing: software freedom and intellectual property law, nd ed., prentice hall, upper saddle river, n. j. saleh, f., n. flipo, f. habets, a. ducharne, l. oudin, p. viennot, and e. ledoux ( ), modeling the impact of in-stream water level fluctua- tions on stream-aquifer interactions at the regional scale, j. hydrol., ( – ), – . scientific data ( ), more bang for your byte, sci. data, , . siever, e., a. weber, s. figgins, r. love, and a. robbins ( ), linux in a nutshell, th ed., o’reilly media, inc., sebastopol, calif. tavakoly, a. a., c. h. david, d. r. maidment, z.-l. yang, and x. cai ( ), an upscaling process for large-scale vector-based river networks using the nhdplus dataset. tavakoly, a. a., d. r. maidment, j. mcclelland, t. whiteaker, z.-l. yang, c. griffin, c. h. david, and l. meyer ( ), a gis framework for regional modeling of riverine nitrogen transport: case study, san antonio and guadalupe basins, j. am. water resour. assoc., , – , doi: . / - . . thierion, c., et al. ( ), assessing the water balance of the upper rhine graben hydrosystem, j. hydrol., – , – . us congress ( ), the copyright act of . vardi, m. ( a), author’s response, commun. acm, ( ), – , doi: . / . . vardi, m. ( b), science has only two legs, commun. acm, ( ), , doi: . / . . williamson, d. l. ( ), description of ncar community climate model (ccm b), ncar tech. note, natl. cent. for atmos. 
res., boulder, colo. xia, y., et al. ( ), continental-scale water and energy flux analysis and validation for the north american land data assimilation system project phase (nldas- ): . validation of model-simulated streamflow, j. geophys. res., , d , doi: . / jd . yamazaki, d., s. kanae, h. kim, and t. oki ( ), a physically based description of floodplain inundation dynamics in a global river routing model, water resour. res., , w , doi: . / wr . zhao, t., b. s. minsker, j. s. lee, f. r. salas, d. r. maidment, and c. h. david ( ), real-time water decision support services for droughts, pap. , pp. – .
science magazine january • vol issue sciencemag.org
news | in depth
publishing
the world debates open-access mandates
spurred by european funders behind plan s, many countries consider similar moves
by tania rabesandratana
illustration: davide bonazzi/salzman art

how far will plan s spread? since the september launch of the europe-backed program to mandate immediate open access (oa) to scientific literature, funders in countries have signed on. that’s still far shy of plan s’s ambition: to convince the world’s major research funders to require immediate oa to all published papers stemming from their grants. whether it will reach that goal depends in part on details that remain to be settled, including a cap on the author charges that funders will pay for oa publication (science, november , p. ). but the plan has gained momentum: in december , china stunned many by expressing strong support for plan s (science, december , p. ).
this month, a national funding agency in africa is expected to join, possibly followed by a second u.s. funder. others around the world are considering whether to sign on.

plan s, scheduled to take effect on january , has drawn support from many scientists, who welcome a shake-up of a publishing system that can generate large profits while keeping taxpayer-funded research results behind paywalls. but publishers (including aaas, which publishes science) are concerned, and some scientists worry that plan s could restrict their choices.

if plan s fails to grow, it could remain a divisive mandate that applies to only a small percentage of the world’s scientific papers. (delta think, a consulting company in philadelphia, pennsylvania, estimates that the first funders to back plan s accounted for . % of the global research articles in .) to transform publishing, the plan needs global buy-in. the more funders join, the more articles will be published in oa journals that comply with its requirements, pushing publishers to flip their journals from paywall-protected subscriptions to oa, says librarian jeffrey mackie-mason, the chief digital scholarship officer at the university of california, berkeley.

robert-jan smits, the european commission’s oa envoy in brussels, who is one of the architects of plan s, says publishers have stalled by emphasizing the need for broad participation. “the big publishers told me: ‘listen, we can only flip our journals [to oa] if this is signed by everyone. so first go on a trip around the world and come back in years. then we can talk again,’” smits recalls. “some people try to do anything to keep the status quo.”

oa mandates are nothing new: in europe, research funders require that papers be made free at some point, up from in , according to the registry of open access repository mandates and policies.
but existing policies typically allow a delay of or months after initial publication, during which papers can remain behind a publisher paywall.

plan s requires immediate oa; it also insists that authors retain copyright and that hybrid journals, which charge subscriptions but also offer a paid oa option, sign “transformative agreements” to switch to fully oa.

some european funders think plan s goes too far. “we and many german [organizations] think that we should not be as prescriptive as plan s is,” says wilhelm krull, secretary general of the volkswagen foundation, a private research funder in hannover, germany. the country is europe’s top producer of scientific papers, ahead of the united kingdom and france, whose main funding agencies have signed on to plan s. germany’s biggest federal funding agency, dfg, said it supports plan s’s goals but prefers to let researchers drive the change.

other funders, including the estonian research council, say the timeline is too tight, and they will reconsider joining when plan s’s impact is clearer. other european funders are weighing pros and cons. spain’s science ministry says it is analyzing the potential repercussions of plan s on the country’s science and finances, and on researchers’ careers.
fnrs, the fund for scientific research in belgium’s wallonia-brussels region, is waiting for plan s to announce its cap on article-processing charges (apcs), the fees for publishing in oa journals, which the coalition’s funders have pledged to pay. “we’re not ready to commit if the costs are too high,” says véronique halloin, secretary-general of fnrs, whose existing oa mandate caps apc reimbursement at € , which halloin admits is on the low side.

many await the european commission’s policy: although its grants represent a small percentage of research funding in europe, its oa rules can influence national mandates. the commission’s research chief, carlos moedas, supports plan s, and its -year funding program horizon europe, which will begin in , contains general statements of support for oa. plan s’s rules will go into the program’s model contract for grants, smits says.

smits has found unexpected support from china, which now produces more scientific papers than any other country. last month, china’s largest government research funder and two national science libraries issued strong statements backing plan s’s goals. “china needs to contribute to international open access [and] open its research results to its own people,” says zhang xiaolin of shanghaitech university in china, who chairs the strategic planning committee of the chinese national science and technology library. even if chinese organizations do not join plan s formally, similar oa policies in china would have a “huge, perhaps decisive impact on the publishing industry,” mackie-mason says.

for now, north america is not following suit. the bill & melinda gates foundation was the first plan s participant outside europe, and another private funder may follow. but u.s. federal agencies are sticking to policies developed after a white house order to make peer-reviewed papers on work they funded freely available within months of publication (science, april , p. ).
“we don’t anticipate making any changes to our model,” said brian hitson of the u.s. department of energy in oak ridge, tennessee, who directs that agency’s public access policy.

nor are the three main federal research funders in canada ready to change their joint oa policy. plan s is “a bold and aggressive approach, which is why we want to make sure we’ve done our homework to ensure it would have the best effect on canadian science,” says kevin fitzgibbons, executive director of corporate planning and policy at canada’s natural sciences and engineering research council in ottawa.

outside europe and north america, funders gave science mixed responses about plan s. india, the third biggest producer of scientific papers in the world, will “very likely” join plan s, says krishnaswamy vijayraghavan in new delhi, principal scientific adviser to india’s government. but the russian science foundation is not planning to join. south africa’s national research foundation says it “supports plan s in principle,” but wants to consult stakeholders before signing on. jun adachi of the national institute of informatics in tokyo, an adviser to the japan alliance of university library consortia for e-resources, says that despite interest from funders and libraries, oa has yet to gain much traction in his country.

south america has a strong tradition of oa repositories and fee-free publishing, often with government subsidies. bianca amaro, president of la referencia, a santiago-based latin american network of repositories, says plan s takes a more “systemic view” than previous policies, and she values its pledge to monitor apcs and their impact, a worry for lower-income countries. “we’ll see how europe handles this,” she says.

of course, mackie-mason says, not every funding agency will agree that plan s is the best way to universal oa. “but some will agree it’s good enough and perhaps our best chance to transform the publishing industry soon,” he says.
it comes in the wake of often incremental oa initiatives in the past years, and some disagreement about the best route to oa. “in the oa movement, it seems to a lot of people that you have to choose a road: green or gold or diamond,” says colleen campbell, director of the oa initiative at the max planck digital library in munich, germany, referring to various styles of oa. “publishers are sitting back laughing at us while we argue about different shades” instead of focusing on a shared goal of complete, immediate oa. because of its bold, stringent requirements, she and others think plan s can galvanize advocates to align their efforts to shake up the publishing system.

the plan s team predicts steady growth in the coming months. funders will discuss plan s in são paulo, brazil, at the may meeting of the global research council, an informal group of funding agencies. although smits will leave the european commission in march, the plan s coalition is seeking a replacement who can keep the momentum going.

“the combined weight of europe and china is probably enough to move the system,” says astrophysicist luke drury, of the dublin institute for advanced studies and the lead author of a cautiously supportive response to plan s by all european academies, a federation of european academies of sciences and humanities. if plan s does succeed in bringing about a fairer publishing system, he says, a transition to worldwide oa is sure to follow. “somebody has to take the lead, and i’m pleased that it looks like it’s coming from europe.”

with reporting by jeffrey brainard, sanjay kumar, dennis normile, and brian owens.

[figure: paper players. percentages of the world’s science articles by country: china, united states, germany, japan, india, united kingdom, france, italy, south korea, russia, canada, brazil, spain, australia, other countries. credits: (graphic) n. desai/science; (data) national science board, science & engineering indicators.]
the world debates open-access mandates
tania rabesandratana
science ( ), - . doi: . /science. . .

copyright © the authors, some rights reserved; exclusive licensee american association for the advancement of science. no claim to original u.s. government works. science (print issn - ; online issn - ) is published by the american association for the advancement of science, new york avenue nw, washington, dc . science is a registered trademark of aaas.

white paper report
report id:
application number: hd
project director: brian hoffman (bjh @nyu.edu)
institution: new york university
reporting period: / / - / /
report due: / /
date submitted: / /

mediacommons: social networking tools for digital scholarly communication
kathleen fitzpatrick, pomona college
brian hoffman, new york university
mark reilly, new york university
avi santo, old dominion university

grant period: / through / (originally / )
institution: the institute for the future of the book @ new york university
amount awarded: , .
project website: http://mediacommons.futureofthebook.org/
keywords: digital scholarly communication, social networking, media studies, mediacommons, in media res
neh digital humanities start-up grant: final report

project summary
over the last two years, with the assistance of a national endowment for the humanities digital start-up grant, we have built the mediacommons user profile system, within which members are able to consolidate much of the scholarly work that they produce across the web, and through which they are able to forge connections with other scholars in the field and develop collaborative projects. this profile system is a crucial step toward achieving the broader changes that mediacommons hopes to bring about: new modes of preserving, analyzing, and making accessible digital resources, as well as new digital modes of collaboration and publication.

in order to lay the groundwork for a community-driven system for publishing in the digital humanities, we needed to develop the technical systems that provide the social framework for that community. as the broader publishing network will provide access to a wide range of intellectual writing and media production, including forms such as blogs, wikis, and journals, as well as digitally-networked scholarly monographs, developing systems for the accreditation, dissemination, discussion, and preservation of such work is of paramount importance. these fundamentally social processes require a dynamic and complex membership system to serve as the backbone of mediacommons.
in the coming months we hope to continue building on this social networking architecture, providing more advanced functionality, including a dynamic textual-analysis-driven recommendations system and a set of feedback/rating tools that will allow us to develop the peer-to-peer review system of the future.

about mediacommons
mediacommons is an all-electronic scholarly publishing network focused on the digital humanities developed by dr. kathleen fitzpatrick (pomona college) and dr. avi santo (old dominion university), in collaboration with the institute for the future of the book and new york university libraries. fitzpatrick and santo first began to map out what this environment might look like following a workshop organized by the institute and the annenberg school at usc in may on the ongoing crisis in academic publishing, as university presses are increasingly threatened by a faltering economic model. mediacommons hopes to present one possible pathway out of this crisis, by re-imagining scholarly publishing in digital environments, and by focusing on the scholarly communities that such publishing is intended to serve.

mediacommons is attempting to re-imagine what academic publishing and scholarly review processes might look like in the digital age. though mediacommons was initially envisioned as a born-digital scholarly press, its creators quickly realized that academics publishing in a digital environment required more than just new modes of writing, but also new ways of thinking about the functions of scholarly writing. thus, mediacommons was re-conceptualized as a scholarly network dedicated to shifting the focus of scholarship back to the circulation of discourse by transforming what it means to “publish” in a digital environment.
as a scholarly network, mediacommons thus has as one of its key goals facilitating interconnections among scholars, students, and other interested members of the public, enabling a shift in the way scholarly work is disseminated, from the privileging of distinct, isolated texts to focusing upon continuous discourse among researchers and authors. mediacommons is also dedicated to making the process of scholarly writing and publishing transparent, encouraging authors, editors and readers to engage one another throughout. in so doing, it is our hope that new scholarly processes and forms of writing will emerge, at once collaborative, multi-nodal, open-ended, and multi-directional.

mediacommons serves as an umbrella/incubator/host for several innovative projects, each designed to re-imagine what scholarly publishing looks like and does in an online environment. below are descriptions of three current ongoing mediacommons projects, though we have ambitions for launching several others (described toward the end of this document) and are currently experimenting with ways of franchising our existing templates so that project editors can easily customize mediacommons’ design architecture to launch new works, while tapping into our diverse community of users.

in media res (imr) is currently the most visible project created by mediacommons and is one of several emerging sites experimenting with new approaches to scholarly writing in the digital era.
since its launch in october , imr has been dedicated to experimenting with collaborative, multi-modal forms of online scholarship. each weekday, a different media scholar curates a -second to -minute clip accompanied by a - -word impressionistic provocation. imr offers scholars opportunities to engage in both new ways of writing and new ways of thinking about writing in a digital environment. imr also regularly hosts "theme weeks", which are designed to generate a networked conversation between curators. all the posts for that week will thematically overlap and the participating curators each agree to comment on one another's work. we use the title "curator" because, like a curator in a museum, imr posters are repurposing a media object that already exists and providing context through their commentary, which frames the object in a particular way. imr’s goal is to promote an online dialogue amongst scholars and the public about contemporary approaches to studying media. curatorial notes are purposely short because they are intended to enable a lively debate in which the sum total of the conversation will be more valuable than any one particular voice. as of may , , we have had original curatorial posts to the site from different contributors, including some of the top media studies scholars in the field. in march , imr received unique visitors ( /day) averaging . visits each to the site for an average daily traffic of visitors.

mediacommons press: open scholarship in open formats was launched in august and is a live in-development component of mediacommons promoting the digital publication and peer-to-peer review of texts ranging from article- to monograph-length.
utilizing commentpress, a tool developed by the institute for the future of the book that allows for simultaneous, granular, paragraph-level horizontal commenting on web documents by multiple users, mediacommons press has spearheaded new modes of engaging with born-digital work, both prior to and post publication. to date, the press has hosted a complete draft of fitzpatrick’s forthcoming manuscript from nyu press, planned obsolescence: publishing, technology, and the future of the academy, along with two complete reviews of the manuscript. in so doing, the process of academic publishing is made transparent while the possibility for engaging with fitzpatrick as she revises the manuscript is made available to a community of online participants. to date, there have been comments made on the unpublished manuscript by different respondents, including fitzpatrick’s engagement with her readers/reviewers, producing a rich set of micro-conversations building outwards from multiple sections of the manuscript.

mediacommons press also recently hosted an open-review experiment for an upcoming special issue of the print journal shakespeare quarterly dedicated to “shakespeare and new media.” special issue editor katherine rowe selected four essays under consideration for possible publication in sq : and three reviews accepted for publication for limited time open peer review. from march through may , , the essays under review generated comments from different reviewers, including the essays’ authors’ engagement with one another, with the special editor and with their readers.
authors are currently revising their essays based on the feedback they received. most recently, mediacommons also added the new everyday (tne), an experiment in “middle-state publishing” (a web publication whose form approximates something “between a blog and a journal”) being undertaken as part of a two-year project by the new york visual culture working group, housed at nyu and funded by its humanities initiative. the project launched with a cluster edited by nicholas mirzoeff considering the murder of jorge steven lópez mercado; the pieces that form this cluster are open for discussion, and are intended to be seen, both collectively and individually, as remaining somewhat “in process.” a new publishing platform, which remixes features of both imr and commentpress, is currently in development and is expected to be online when tne officially launches in fall . project achievements in the two years since nyu was awarded the neh digital start-up grant the project has realized a number of achievements both directly and indirectly related to the grant. the network of sites is now running on an entirely new infrastructure, based on the open source drupal framework, and membership across the sites has grown to registered users (from on the th of january, ). the sites now include features such as customized content types, improved workflows, content syndication, a graphic design scheme that can be 'extended' for new mediacommons projects, and faceted search and browse capabilities. under the supervision of the editors, registered users can create and syndicate a wide array of different content types using easy-to-use, web . interfaces to input, preview, and publish their work. traffic to the sites has grown to over ten thousand unique visitors per month, with a peak of over , unique visitors in april . 
we also improved the scalability and sustainability of the site by implementing a code repository and separate environments for the development, staging, and release of site code. after the release of the new and improved mediacommons and in media res sites, serious work began on the user profile and e-portfolio system described in the grant proposal. the high-level functions described in the grant were translated into a data vocabulary and wireframe outlines of page layouts, which were then mapped to a set of community-created 'modules' and to custom functionality developed at nyu. the resulting version . of the profile system was released in late . the profile has three high-level functional areas: 1) self-representation; 2) social network linking among users, directly and via terms such as affiliation and interests; 3) aggregation of users' intellectual output in a range of formats and at a range of scales (from tweets to books).
http://mediacommons.futureofthebook.org/the-new-everyday/mercado
most of the development work for the portfolio system went into the aggregation of intellectual output. mediacommons users now have a suite of powerful tools for creating an electronic nexus of their output as print authors, bloggers, critics or contributors on mediacommons projects, and as users of services like facebook or hulu.
some of these tools are fully automated -- a user's commenting on an in media res piece will automatically be reflected on her profile -- while others, particularly the print publication database still require laborious manual input or esoteric file import abilities -- a drawback we hope to address soon by ) finding repositories of bibliographic information from which to harvest records and then devising a means of associating the right records with the right users and ) improving the overall usability of the profile management screens. design process the initial design challenge concerned the upgrade of the existing sites in preparation for the development of the user profile system. the technical team consulted with the editors to learn what they wanted the sites to do but were unable to realize within the original wordpress framework. for instance, the "theme week" concept is at the heart of in media res, but was initially an editorial construct without matching functionality: site structure was conveyed through conventions of titling, ordering, or by embedding links in text. working with the editors, the design team worked to translate concepts such as 'theme week' into a suite of content types, roles, and workflows that would be determinate enough to ensure that functionality and presentation could be built around them, yet flexible enough to allow for contingencies such as a theme week's being published before all its constituent elements were complete (for which we developed functionality for editors to insert temporary placeholders). when the time came to begin work on the profile system, we used a similar process. we began by discussing the concepts fully with the editors until distinct entities and functionalities emerged. these were documented as an "abstract data model" and a set of high-level functional requirements. 
when this documentation was complete, we sought a consultant to help us map it to existing and to-be-developed functionality available in the drupal framework, and ultimately hired jay datema of bookism.org. jay used the design documentation to research and install a suite of third-party drupal modules and to configure them in such a way that our core data model and functional requirements were met. we then spent a lengthy period of time refining, customizing, and debugging the system, and finally released it in january , roughly months after the initial documentation was completed.
http://mediacommons.futureofthebook.org/publications
challenges/lessons learned
1. technical issues
mediacommons and in media res are now running in drupal, an open source content management system with a dizzying and ever-growing suite of add-ons known as 'modules' created and managed by the user / developer community. keeping up with the latest state of the modules that are necessary to run the sites (by the time the new everyday is released, there will be over third-party modules in use across the mediacommons network) has proven to be a significant task in itself, requiring a regular review of the site and its dependencies by members of the technical design team. recently, the design team undertook and completed a restructuring of the databases and code such that each site is run completely independently and connects to a shared database for user and taxonomic data. this revision should enable the development of additional projects without compromising the performance or stability of those already in production.
as time permits, we expect to continue to implement architectural changes that reduce the interdependence of the sites, reduce unnecessary or redundant modules, and possibly eliminate the technical constraints currently impeding the rapid development of additional mediacommons projects. .  editorial issues  beyond the technical difficulties that the site faced, of course, lie a range of editorial and community-oriented difficulties; as many such projects have found out, just because a system can be built doesn't mean that it will be accepted, and just because it is built on a platform as flexible as drupal doesn't mean that everything we might want to do with it can be easily accomplished. the challenges that we've encountered in the editorial arena fall into a few different categories: the challenge of building participation among scholars in the field, the challenge of linking the profile system to the network's broader publishing goals, and the challenge of developing protocols for the review and assessment of the work produced within the network. a.  community participation the challenges involved in fostering discussion within digital scholarly publishing networks are no small matter; motivating and sustaining the desire in users to participate in online communities has been the issue over which many innovative digital projects have stumbled. mediacommons has developed a thriving user base ( active users as of may ), but participation in the kinds of discussion that the network is hosting, whether the day-to-day conversations on in media res or the larger-scale, project focused commenting on mediacommons press, has been extremely difficult to build. this is by no means a new problem; motivating scholars to participate in the frankly selfless processes of peer review has long been a challenge within scholarly publishing, as any journal or university press editor can confirm. 
the specific challenge for new modes of scholarly publishing like mediacommons is to model and inculcate generosity. this is easier said than done, perhaps; as a commenter on twitter noted after a recent talk fitzpatrick gave about peer-to-peer review, “being helpful is not really part of academic culture.” persuading scholars to take the time to participate in the process of reviewing, discussing, and assisting in the development of other scholars’ work won’t be easy, unless doing so is somehow in their interest. we can see two potential means of encouraging such self-interested altruism. the first is ensuring that the network within which scholars are publishing and commenting is composed of a community to which they are committed, and to which they feel responsible: the community of their peers. noah wardrip-fruin, who conducted an experimental blog-based review of his book-in-progress, expressive processing, noted that prior such experiments had sought to create new communities around the texts as they were published, and argued that “this cannot be done for every scholarly publication,” and, moreover, that there are in many cases existing communities that can be drawn upon to great advantage. such communities might include existing online social networks, but they might also include the clusters of scholars who already interact and discuss projects with one another in different formats, via disciplinary organizations and other professional groups, field-based listservs, and even more informal writing groups. making use of such existing communities will be necessary for motivating participation in online review precisely because scholars are already committed to the success of those groups, and to the opinion that those groups hold of their own work.
beyond such professional responsibility, however, a key factor in motivating participation in new modes of online peer review will be the visibility that these processes will provide for what is now an unrecognized—indeed, an invisible—form of academic labor. allowing scholars to receive “credit” for the reviews they do, both in the sense of making visible reviewers’ critical role in the development of arguments and texts and in the sense of rewarding good reviewing, could help foster a culture in which reviewing is taken seriously as a scholarly activity, and which therefore encourages participation in review processes. b. linking text and network of course, in order to foster such a culture, we need to determine and to demonstrate by example what “good reviewing” is, such that we can reward it. that determination will require that this publishing system develop some means not just of reviewing a text, but of assessing the comments that are left by reviewers. this process of reviewing the reviewers will be crucial to any open publishing and review process, as authors and readers will need to be able to judge the authority of the commentary that a text has received. there’s thus both carrot and stick involved in building the scholarly review community; the carrot is the ability of reviewers to contribute something positive to the community and be rewarded for it, while the stick is the ability of the community to call out those members who don’t contribute positively. this community regulation of peer review standards—not just the standards that texts under review are held to, but the standards that reviews themselves are held to—has the potential to greatly improve the quality of scholarly communication in a broad sense, reducing thoughtless snark and focusing on helpful dialogue between authors and readers. 
in order for that community regulation to develop, however, we need to have reliable knowledge of who our reviewers are and what work they’ve done within the publishing network. for that reason, we hope to build a bridge between the commentpress system in use at mediacommons press and the mediacommons profile system (see below). a full understanding of the context of and perspective represented within comments and reviews written on texts published within mediacommons will require linking those reviews to their authors' profiles. moreover, including the reviews in the information in a scholar’s profile -- and, further, including the community’s assessment of those reviews -- will allow the community to see clearly which members are active in the reviewing process, which members are highly thought of as reviewers, and which members could stand to become either more active or more helpful as reviewers. in this way, the stick in the carrot-and-stick approach to encouraging participation in an online reviewing process might allow the community to develop a “pay-to-play” relationship between reviewing and publishing, in which the right to publish one’s own texts within the network can only be earned by participation in the review process. it goes without saying that such a system will need to balance the desire to make the scholarly community self-regulating with certain fail-safes to prevent abuse of the system, such as logrolling, cliquishness, exclusionary behavior, and so forth. but we hope that by making all aspects of the reviewing system public and visible, and by tying the reviewing process to the community itself, we can promote an ethos of collegiality that will help guide the system’s development.
c.
creating assessment metrics
beyond developing and regulating the system of publishing and review, however, we need to find ways to communicate the value of the work that is produced within this publishing network to the scholarly community at large. much of the resistance of scholars to new modes of digital publishing tends to center on concerns that texts published in such venues won’t be taken seriously, and therefore won’t be seen to “count,” by their colleagues, their departments, their deans and provosts, and their promotion and tenure committees. and worse, to some extent, they’re right: scholars and administrators accustomed to evaluating print-based research products often don’t know how to assess the quality or impact of born-digital scholarship, and tend therefore to underestimate its value to the field. numerous attempts to close that gap in the assessment of digital scholarship are underway, through projects sponsored by disciplinary organizations such as the modern language association, as well as through policies developed at individual institutions. the documents being produced and circulated by these groups are helping to reshape the thinking of many review bodies with respect to the tenurability of scholars who work in digital forms.
however, such documents tend to emphasize “peer review” in a fairly traditional form, and ensuring that promotion and tenure committees take seriously the kinds of open review that texts such as those published by mediacommons press will undergo will no doubt require further intervention. but as michael jensen of the national academies press has argued, web-native scholarship has the potential to provide a much richer and more complex set of metrics through which the importance of scholarly texts can be judged. such metrics, which form the basis of what jensen has called “authority .
,” will make use of a range of data including numbers of hits and downloads, numbers of comments, numbers of inbound links, etc., gauging the impact a text has had by the degree of its discussion around the web. but it will also make use of more sophisticated, less popularity-driven data, including such factors as the “reputation” of a press, an author, or a reviewer. as a result, these developing metrics will not focus simply on quantity— how many people have read, discussed, or cited a text—but also on the quality of the discussions of a text and the further texts that it has inspired. the “review of the reviewers” that mediacommons proposes to develop might help provide some of those new metrics of scholarly authority. by computing a reviewer’s reputation based on the community’s assessment of the quality of his or her reviews, we can then bring that reputation to bear on subsequent comments by that reviewer, indicating clearly to readers involved in promotion and tenure processes which opinions are generally considered authoritative by the community. the use and interpretation of such metrics will never be as simple as the binary measurement that traditional peer review provides—either a text was or was not published in a peer-reviewed venue—but they will enable us to develop a much more informative picture of the impact a scholar’s work is having on the field. . institutional issues ownership of a project like mediacommons represents a significant challenge, one that many institutions will find themselves facing in the increasingly collaborative world of digital humanities research. the project's two principals are located at pomona college (california) and old dominion university (virginia), and the project was begun in collaboration with the institute for the future of the book (originally affiliated with the university of southern california and now with new york university). 
as a result of that collaboration, mediacommons's technical and design leads are located at nyu, and this start-up grant, originally written by personnel from the institute, was applied for through and administered by nyu -- an institution with which neither project principal has a direct relationship. the multi-institutional nature of this collaboration has presented challenges for everyone involved. for the project principals, issues of communication were paramount; being unable to walk across campus for regular discussions with the technical and design team at times left them feeling out of the loop. for that team, issues of institutional mandate loomed; it wasn't always clear how much labor they could provide in support of faculty who weren't members of their institution. and for everyone, the host institution's only slowly growing awareness of the project's significance and potential resulted in a series of bureaucratic obstacles to rapid progress.
these challenges highlight the growing pains felt as scholarly publishing shifts from its traditional location within the university press to take up a new position within the university library. presses have for the last several decades had an outward-facing orientation; their primary stakeholders are in fields whose members are dispersed around the world, and as a result, they have a somewhat attenuated relationship with their own institutions. libraries, by contrast, are largely inward-facing; their stakeholders are almost uniformly members of the institution that they serve, and for that reason, libraries tend to think institutionally rather than disciplinarily.
as scholarly publishing moves into the library, and yet maintains its cross-institutional field-based focus, libraries will be required to face the challenges involved in supporting faculty from other institutions in their projects, and will need to think about ways to communicate to their administrations the value for the institution in doing so.
further work
1. infrastructure
a. franchisable api
as mediacommons continues to grow, we expect an increasing interest among its users in spinning off new projects within the network (see content / project development section below for examples). currently, the design/technical team is developing a new journal of "middle state" work in collaboration with nick mirzoeff, an nyu faculty member. however, the ability of the design/technical team to take on similar projects for faculty at other institutions is expected to be extremely limited. for this reason, the editorial and design teams have discussed in our 'wish list' sessions the development of a 'franchisable api' that would allow members of the community to secure their own funding and / or technical support to develop projects whose timeline and ambition won't be constrained by the limited availability of the design/technical team. while we haven't developed a complete list of features, we do recognize that an api will be required for authenticating via mediacommons, consuming and updating profile information, and exposing content to the federated mediacommons search index. in addition to actual apis, we expect that we will also need to provide code and instructional material for people developing projects, for instance drupal themes and modules that would allow non-experts to graphically design a new mediacommons project by manipulating some simple controls, such as a color palette or drag-and-drop layout designer.
b.
long-term preservation strategy
one major hurdle to the adoption of the mediacommons network as a fully realized venue for scholarly communication is that persistence has not generally been addressed in web publication software as it has in print publication. a scholar asked to spend her limited time and energy submitting her ideas to a venue that has bounced around among institutions, disappeared for long periods of time, and lost access to its own content and that on which it depends may understandably be skeptical or choose to reserve her 'a' ideas for print.
http://mediacommons.futureofthebook.org/users/nickmirzoeff
for this reason, we recognize that it is critical for mediacommons to develop and implement a long-term strategy for the continuing accessibility, verifiability, and renderability of its content. there are many questions to be addressed before a solution can be designed and implemented. for example, while nyu's digital library technical services team has been producing sites like mediacommons with one hand, it has been constructing and refining an oais-based repository for preservation of digital materials with the other. there is no immediately obvious way in which these two efforts fit together. the task of replicating mediacommons content as "archival information packages" within the oais context raises a host of questions and problems: how would a blog post and its attendant comments be represented as an aip? how will the ability of content editors to continually edit their work affect the repository? and, not least, how much energy should be diverted from keeping mediacommons sites up to date in order to ensure that they last, and what if this diversion actually ends up undermining the longevity of the project?
one possible avenue may build on work that is beginning at nyu now to integrate dynamic online collections and objects with the preservation repository by means of an "archive it" button that would appear on content editing interfaces such as those used on mediacommons sites. a user with ownership over a particular piece of content, for instance an in media res posting, would have, in addition to the "publish" button that appears now, the option to "archive" the page in the repository. when she did so, the site would convert the various files and database fields that constitute the object into a data stream and send the "submission information package" to the repository for ingest and / or updating of an associated series of versioned aips. when finished with the ingest, the repository would send a message back to the site that would indicate the date and status of the last archive action. we don't know how feasible such a solution would be. one area that needs to be investigated and considered is the overhead of maintaining the ability to interact with the repository from a constantly evolving, community-driven publication platform (mediacommons has already undergone a wordpress to drupal .x migration, a drupal .x to .x migration, and will soon undergo a drupal .x to .x migration -- none of these are trivial). adding to this challenge is the rapidly expanding set of content types and structural relations conceived by editors and contributors in the mediacommons community. one possible meta-solution to these problems may entail the articulation of a constrained and generalized set of preservation types to which diverse mediacommons content types could easily be reduced. for instance, an in media res post's constituent parts might be characterized by highly general relations such as "is a response to" or "seed for conversation." 
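the "archive it" workflow and the constrained set of preservation types described above can be sketched as follows. this is a minimal illustration in python under stated assumptions, not the interface of nyu's repository: the names ContentObject, build_sip, ToyRepository, and PRESERVATION_RELATIONS are hypothetical, the repository is an in-memory stand-in, and the only relations allowed are the two examples given in the text.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical constrained vocabulary: only the two example relations from the text.
PRESERVATION_RELATIONS = {"is a response to", "seed for conversation"}


@dataclass
class ContentObject:
    """A site-side object such as an in media res post (illustrative fields only)."""
    object_id: str
    content_type: str
    fields: dict
    relations: list = field(default_factory=list)  # (relation, target_id) pairs


def build_sip(obj):
    """Flatten a content object into a submission information package (SIP)."""
    for relation, _target in obj.relations:
        if relation not in PRESERVATION_RELATIONS:
            raise ValueError(f"no preservation type for relation {relation!r}")
    payload = json.dumps(asdict(obj), sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"object_id": obj.object_id, "payload": payload, "sha256": digest}


class ToyRepository:
    """In-memory stand-in for the preservation repository (an assumption)."""

    def __init__(self):
        self.aips = {}  # object_id -> list of versioned packages

    def ingest(self, sip):
        """Store a new version only if the content changed; return a receipt."""
        versions = self.aips.setdefault(sip["object_id"], [])
        if versions and versions[-1]["sha256"] == sip["sha256"]:
            status = "unchanged"
        else:
            versions.append(dict(sip))
            status = "archived"
        # The receipt is what the site would show as date and status of the last archive action.
        return {
            "object_id": sip["object_id"],
            "version": len(versions),
            "status": status,
            "archived_at": datetime.now(timezone.utc).isoformat(),
        }


if __name__ == "__main__":
    repo = ToyRepository()
    post = ContentObject(
        "imr-post-1", "imr_post", {"title": "example clip"},
        [("seed for conversation", "imr-thread-1")],
    )
    print(repo.ingest(build_sip(post)))
```

keying versions on a checksum means that archiving an unchanged post again does not create a new version, which is one possible response to the question of how continual editing might affect the repository.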
in the case of in media res, whose underlying youtube-hosted artifacts are highly unstable and sometimes at odds with the perceived interests of for-profit intellectual property holders, it may even be necessary to devise a means of representing the "missing center" of an archived conversation.
c. recommendation engine
within the existing profile system, users are able to define their scholarly interests through tags; these tags are likewise used as a crowd-sourced taxonomy for the content published within the system. further, users of mediacommons are able to search for users and texts that deploy those tags, as well as for terms contained within the texts themselves, in order to find potential collaborators or texts of interest within the site. all of this, however, requires the user to initiate the discovery process. we hope in the coming months to begin the development of a recommendation engine that would use the information in a member’s profile, along with robust textual analysis of documents in the network, to present the user with frequently updated suggestions for texts to read, discussions to participate in, and collaborators to work with. such a system would help us encourage active use of and discussion on mediacommons, by providing frequent reminders about the material of interest to each of our members.
d. peer-to-peer review tools
alongside this recommendation engine, and as mentioned above in thinking through the challenges presented in linking the texts that we publish and the social network through which they're published, as well as in creating assessment metrics for the "success" of those publications, we hope to develop a set of tools that will turn the profile system we have built into the foundation of an open, online, peer-to-peer review network.
fitzpatrick has written extensively about the need for such a network in her project, planned obsolescence: publishing, technology, and the future of the academy. the nature of authority is changing dramatically in online scholarly communication, and we need new review metrics and tools that work with that change, rather than imposing older gatekeeping modes of peer review where they simply don't apply. we therefore want to encourage the open, ongoing review of all mediacommons-published texts, taking advantage of the network's capacity for discussion as a means of helping scholars filter the wealth of content that the network makes available. however, we believe that the most crucial aspect of such a peer-to-peer review system will be not the review of the particular texts, but the review of the reviewers, an ongoing assessment of members' critical practices within the site that will allow other users to determine the relative authority of those reviewers, and therefore the weight that their criticisms should be granted. such a review of the reviewers will require us to develop a "reputation" system that will allow users of the network to assess the quality of the comments left on texts published within mediacommons, and then reflect that assessment within the profile system.
2. content/project development
in addition to current projects like in media res and the new everyday already available through mediacommons, there are several other initiatives in development. these include open beta, led by nina huntemann (suffolk university), and an as-yet-unnamed media mash-up project edited by christian keathley and jason mittell (middlebury college), which will build on their proposed neh summer institute on producing media criticism in the digital age.
both projects would take advantage of mediacommons's franchisable api and user profile system in order to create customizable templates for their sites and to link them to our growing scholarly network. open beta is an experiment in scholarly gaming, combining video of academics playing video games with critical comments on the process. the mash-up project would provide a platform for critically analyzing these forms as well as a space where academics can experiment with and receive critical feedback on producing mash-ups. we are also committed to expanding the current functionality of our existing projects. we are developing further customization capabilities for imr that will allow posts to be better integrated into classroom curricula. in the future, registered imr users will be able to customize existing posts by adding lesson-specific questions for students to answer and by setting up a password protected private version of a post available only to students registered for a particular class. conclusion we are very excited about the changes that the mediacommons network has undergone in the two years since this start-up grant was awarded. in developing and deploying a . version of the user profile system, we have in fact surpassed our stated objective "to build a working prototype of a set of networking tools that will serve as the membership system for mediacommons." furthermore, by undertaking and executing this work, we have significantly improved both the organizational and technical foundations for further growth of the network and further exploration of the ideas from which the project originated. 
and, as we hope is clear from this report, we have reached a point where the success of the network is itself propelling us into new areas of scholarly and technical inquiry, so much so that despite what has been realized so far, we remain very much focused on creating new projects and on strengthening the productive partnership that has arisen among drs. santo and fitzpatrick, the institute for the future of the book, and nyu libraries.
http://mediacommons.futureofthebook.org/users/ninabeth
http://mediacommons.futureofthebook.org/users/christian-keathley
http://mediacommons.futureofthebook.org/users/jason-mittell

editorial • commentary
adding value to scholarship in residency: supporting and inspiring future emergency medicine research in canada
daniel k. ting, md*; blair l. bigham, md, msc†; shaun mehta, md‡; ian stiell, md, msc§
introduction
this is an exciting time for cjem; it has recently conducted its first review and plans several changes to enhance the journal’s impact and contributions to our field. cjem faces the challenge of publishing high-quality scholarly work while simultaneously providing opportunities for resident physicians to publish their work. resident publication in peer-reviewed journals remains the gold standard for measuring scholarly output, and we recognize that, as the calibre of cjem rises, so too must the calibre of academic output by residents.
the current landscape of resident research
residents face several barriers when aiming to publish rigorous, high-impact original research during dense and time-limited training programs.
more than half of the fellowship of the royal college of physicians (frcp) training programs expect residents to submit a manuscript to a journal and an abstract to an international conference, which may unintentionally emphasize completion over quality, and encourage residents to “go through the motions” rather than conduct valuable work with true curiosity. given the time and resource constraints of resident projects, residents often work alone and choose traditional projects, such as case reports and retrospective work at their home centre, with little expectation of producing high-impact scholarship. an approach where postgraduate training programs embrace a broader definition of scholarship and support the collaboration required to generate impactful science will encourage teamwork between residents across canada. removing requirements that encourage working in silos may allow for more impactful longitudinal or multicentre studies that span longer than a training period. we believe that such an approach would still develop the skills in research literacy required of modern clinicians, support residents on a more traditional research track, and lead to more impactful scholarship for residents less inclined toward academic careers. bringing purpose to scholarship scholarly activity is a fundamental part of all canadian emergency medicine (em) training programs. beginning research training at this early stage helps build a foundation of success for future scholars, practitioners, and leaders. historically, the focus of research has largely been in the basic or clinical sciences; however, the landscape of academia has changed significantly. there has been an explosion of academic work in areas such as medical education, patient safety, quality improvement, and administration. these projects are instrumental in generating creative solutions to systems-level issues and should be encouraged to optimize the way we train physicians and deliver care.
this shift is supported broadly, as evidenced by the updated definition of the canmeds scholar role, where emphasis is placed on developing structured critical appraisal skills. these principles were reinforced at the canadian association of emergency physicians (caep) academic symposium, with scholarly work defined as a systematic approach to identify a problem and design a response. trainees should pursue scholarship aligned with their career interests, instead of completing projects for the sake of requirement fulfillment. by welcoming a broad definition of scholarship, training programs can inspire a sense of ownership and meaning for trainees undertaking research projects. we urge institutions to recognize alternative forms of scholarship and to develop position statements, working groups, and discussion forums for less traditional albeit important topics.

from the *department of emergency medicine, university of british columbia, vancouver, bc; †division of emergency medicine, mcmaster university, hamilton, on; ‡division of emergency medicine, department of medicine, university of toronto, toronto, on; and the §department of emergency medicine, ottawa hospital research institute, university of ottawa, ottawa, on. correspondence to: daniel ting, pandosy st, kelowna, bc v y t , canada; email: daniel.ting@alumni.ubc.ca © canadian association of emergency physicians cjem ; ( ): - doi . /cem. . cjem • jcmu ; ( ) https://doi.org/ . /cem. . downloaded from https://www.cambridge.org/core. carnegie mellon university, on apr at : : , subject to the cambridge core terms of use, available at https://www.cambridge.org/core/terms.
nurturing research producers although training programs have similar formal academic requirements, their scholarly output is uneven; further, some residents are more interested in academic undertakings than others, showing differing levels of engagement and productivity based on their career goals. residents should be connected to mentors who are available and enthusiastic and who have scholarly experience in their area of interest. while some residents may not have a local mentor who fits these criteria, the rise of digital technologies overcomes traditional institutional and geographical barriers to scholarly collaboration. academic gatherings, such as the caep conference or the network of canadian emergency researchers (ncer) annual meeting, should be approached by attendees and organizers as an opportunity to spark academic networks. these collaborations can blossom in a “digital laboratory.” recently, some of these opportunities have taken on a formal structure, such as the canadiem digital scholars fellowship for medical education. furthermore, institutions should foster a culture that celebrates academic scholarship. scholarly inquiry is an exercise in delayed gratification whose traditional rewards manifest relatively rarely and are bestowed to selective endeavors (e.g., conference presentation). institutions can use social media as a simple and inexpensive strategy to increase the immediate impact of a broad range of scholarly work. for example, the university of ottawa has recently integrated a digital scholarship and knowledge dissemination program within the structure of its academic department. this includes funded faculty positions and an institutional website to publish a large range of local scholarly work, from grand rounds to journal club summaries to article publications; such programs can be replicated more broadly.
resident opportunities offered by cjem cjem created a new leadership team in november , and this team recently affirmed a vision of “inspiring excellence in emergency medical care.” we wish to be among the top three em journals within years. does this mean fewer opportunities for learners at cjem? most certainly not. the cjem editorial board recently approved this specific objective – “to provide scholarly opportunities for medical students, residents, and newly practicing physicians.” the entire editorial team wishes to work with our younger authors to assist the development of their scholarship skills. congratulations to daniel ting, a resident from university of british columbia, who has nearly completed his term as our first “cjem intern.” we welcome residents to apply for this internship position every year. cjem has many sections well suited to resident submissions. clinical correspondence (i.e., case reports, images) will be almost exclusively reserved for young canadians. resident issues publishes articles written on subjects of importance to em residents. residents are welcome to submit a commentary on other issues. journal club is a great opportunity because we expect these articles to be co-authored by a learner or new physician. residents are frequently involved in brief educational reports, submissions that discuss educational advances in em. finally, we have created a new category, brief original research, comprising no more than , words and references. this is well suited for smaller studies, some of which may have been resident research projects. bottom line – cjem loves publishing articles by residents! keywords: emergency medicine, medical education, residency acknowledgements: the authors thank gerhard dashi and eddy lang for their critical input in this manuscript. competing interests: none declared. references . stiell i, lang e, atkinson p. a new chapter for cjem. cjem ; ( ): - .
. calder la, abu-laban rb, artz jd, et al. caep academic symposium: how to make research succeed in your department: promoting excellence in canadian emergency medicine resident research. cjem ; ( ): - . . ioannidis jp. why most clinical research is not useful. plos med ; ( ):e . . richardson d, oswald a, chan m-k, et al. scholar. in: canmeds physician competency framework (eds. frank jr, snell l, sherbino j). ottawa: royal college of physicians and surgeons of canada; , - . . gottlieb m, fant a, king a, et al. one click away: digital mentorship in the modern era. cureus ; ( ):e . . gottlieb m, sheehy m, chan t. number needed to meet: ten strategies for improving resident networking opportunities. ann emerg med ; ( ): - . . zaver f, thomas a, shahbaz s, et al. the canadiem digital scholars program: an innovative international digital collaboration curriculum. cjem ; (s ):s . . rosenberg h. development of emottawa’s digital scholarship and knowledge dissemination program; . available at: https://emottawablog.com/ / /development-of-emottawas-digital-scholarship-and-knowledge-dissemination-program.
destroying the silo: how breaking down barriers can lead to proactive and co-operative researcher support

in this article we evidence some of the opportunities which the research excellence framework (ref) open access agenda has created for closer collaboration between the library research support team and the research office at liverpool john moores university (ljmu). drawing on personal reflections and shared experiences, we suggest that this new co-operative spirit has yielded three important outcomes. first, it has allowed for the dismantling of research support silos and forms of duplication. second, it has enhanced the visibility and profile of research within academic departments and across the university as a whole. thirdly, it has enabled relationships and engagement to be developed beyond our institution, allowing us to learn from others. this leads us to further suggest that in the context of research support within an institution, the library should take an active role to engage with collaborative forms of research support.
insights – , destroying the silo: proactive and co-operative researcher support | cath dishman and katherine stephan

cath dishman, open access and digital scholarship librarian, liverpool john moores university; katherine stephan, research support librarian, liverpool john moores university

keywords: collaboration; research support; network

new team, new service our own research support team was created as a direct result of the open access (oa) requirements for the research excellence framework (ref) . (ref is the system for assessing the quality of research in uk higher education institutions.) the ref required an adherence to an oa repository set-up and a focus on impact, collaboration and public engagement. liverpool john moores university (ljmu) is a medium-size university with a very active research community. as with many other universities, previously ljmu had principally supported researchers via academic liaison librarians. to meet the increasing research support demand, our small team (which consists of . full-time equivalent posts, including a research support manager, an open access and digital scholarship librarian and a research support librarian) came into existence in august , and members of the team immediately had to be strategic about how we went about supporting the research needs of the university. although there were job descriptions and mentions of research support within our library service’s plan, ostensibly our aim was to support academics and researchers throughout the entire research lifecycle. in that sense, the scope of what we could do was vast; the tricky bit was trying to narrow down our focus within our limited personnel resources. background to collaboration collaborating to meet university and strategic initiatives and goals is not a new concept, most certainly not with libraries.
there are many examples of libraries that have public/university partnerships, collection management and consortia. libraries thrive at partnerships and working together, with ‘evidence of the willingness of our profession to network, collaborate and share in order to develop services that meet the needs of our users’. what is particular to research support, however, is the need to teach ‘various literacies’ and for ‘research librarians … to try and understand, and address, the disparate support needs of their respective research communities’. effectiveness in this context means being able to actively support and work alongside academics, researchers and phd students through the deployment of general and specialist library skills. these skills can be measured out in terms of research knowledge, technical skills and softer, more person-centred skills (e.g. patience and mutuality). as with other projects that demand extensive time and effort, it is important to capitalize on a ‘commitment to work across units and divisions to deliver a better solution’. finding our purpose as a team, we started to crystallize our purpose when we looked beyond our own service and experiences of designing and delivering research support. we recognized the potential of working with other university partners when katherine presented a discussion on research cafés at our professional services conference, and the researcher development advisor from research and innovation support (ris) was in attendance. as part of her own role, she focused on developing training and support for researchers. after the session she approached katherine because she had identified possible overlaps in the work of the two teams but also the potential for collaboration and sharing. at the same time, the university appointed a new researcher development manager within the doctoral academy (da).
the researcher development manager was tasked with a similar role to that of the researcher development advisor/research support team, with the point of departure being an explicit focus on the training and development needs of phd students. it quickly became apparent that we shared a commonality of purpose, namely promoting and supporting research excellence. the research support team in the library shared the remit of both ris and the da: to support all researchers. through the simple expedient of talking and sharing experiences, we collectively came to the realization that we were often undertaking and/or planning complementary activities. similarly, we came to view that this lack of co-ordination and effective communication was unnecessary and unsustainable. in that sense, it provided an opportunity to capitalize on our shared intelligence and individual strengths, so as to deliver a more meaningful, better co-ordinated research support experience for academics, researchers and postgraduate researchers alike. working together we saw this initial, slightly informal conversation as a starting point for working more effectively together. consequently, we began to communicate with ris and the da via e-mail and regular catch-up meetings in an effort to foster awareness and avoid duplication. we also gave careful consideration to the question: who else within the university has a similar remit in ‘promoting and supporting research excellence’? and, relatedly: how could we best go about identifying and developing a relationship with them? to begin with, we decided to take a proactive stance.
this meant, for example, inviting ourselves and/or signing up to events and meetings we thought might be of interest and relevance (e.g. faculty research days, department events listed on ljmu’s staff intranet, eventbrite, staff talks and lectures). moreover, we discovered that one of our faculties was holding a research day, so we asked if it would be possible for us to attend the event. this provided us with a platform from which to further promote the library and form new relationships and research networks. as it so happened, they held a ‘marketplace’ over lunchtime. the marketplace brought together a diverse range of individuals and teams with different skills and experiences with the overall aim of facilitating discussion, learning and collaboration. we took up a stand, adapted a quiz from another event, and offered a prize. importantly, this gave us a talking point – a usp (unique selling point), if you will – and enabled us to position ourselves as a potential resource and research partner. as a result of this, we have been invited to each subsequent conference. we continue to have a visible presence at the marketplace and, as our relationship with the faculty has grown, we have also been asked to present on predatory publishing. another tactic was using our new-found (and now regular) contacts to get us invited to meetings, which, in fact, worked both ways. for example, we suggested meetings to our respective colleagues and they would recommend and invite us to others. a particularly fruitful example of this tactic was with international women’s day (iwd). our colleague from da was invited to help organize the university’s event and she subsequently invited us, enabling the library service to be pivotal to the organization and the event itself. as a result, we made new contacts amongst female researchers, increased our overall visibility and started to receive invitations to take part in or live-tweet with other events.
assisting with these types of events involves very little effort on our part and we reap large rewards in terms of exposure and awareness of the library. moving forward, we have developed a shared, centralized calendar of events with our colleagues in ris and the da. we have also started to offer sessions as part of each other’s respective training programmes. for example, we have run sessions as part of the activator researcher programme and da training on subjects like metrics, publishing and data management. within this, we are also trying to meet more regularly with other colleagues who support researchers, either from central teams or within faculties. this allows us to discuss planning joint activities as well as keeping abreast of developments within their respective areas. benefits of collaboration the positives of working jointly with others mean that you learn and benefit from their knowledge and experience. you are also able to promote the training and support they deliver, too. this joined-up approach also benefits those you are trying to support, as they do not have to look to a myriad places for help; they can get sustained support and know-how if they come to one team, as each team is aware of and promotes each other. we have benefited from joint sessions with ris and da that are not badged and/or hosted within the library. delivering library research support sessions as part of a wider researcher/phd training programme means that the library is recognized more widely as part of the research process, by both the researchers themselves and other research support staff across the institution. for example, our sessions on publishing that are advertised and facilitated as ‘researcher events’ are better attended than one-off library training events. collaboration has also led to involvement in other events, conferences and workshops and quite simply, has broadened our horizons by introducing us to a wider community of people and events. 
having that personal introduction can make a vast difference when asking individuals to present or take part in a panel discussion, for example. we feel strongly that the benefits of collaboration outweigh the time and effort exerted. taking part in events and working with others at conferences can lead to useful networking opportunities, which can then be brought into play in connection with future development activities. beyond the institution ljmu is a member of various consortia with other libraries, such as the north west academic libraries (nowal) and the northern collaboration, and we have attended events hosted by these organizations. when we started out, we wanted to make use of contacts and expertise from other institutions to create our own events, too. other benefits of hosting ourselves meant that we did not have to travel (which, with caring responsibilities, is an important practical consideration) and we were able to invite people whose work we wanted to know more about. building on those premises, we hosted our first exchange of experience about research support with nowal. in terms of speakers, we invited people we had seen on twitter, met at other events and heard about, as well as our own colleagues. as this first event on general research support was well received, we held a second event on bibliometrics and brought representatives from research support from around the region. later that same year, we were introduced to the newly appointed scholarly communications librarian at the university of liverpool, which prompted some joint working within our city. our first joint event with them was also a nowal event, this time on the theme of open research.
after this successful event, we met in october/november and discussed the idea of hosting a joint week of events during open data week in february . expanding on the theme of open data to open research in general, we branded it love open week and scheduled a planned week of activities. we held a joint love open research café with speakers from both universities, which was open to the public. the turnout for the event was excellent, with around phd students and staff from both institutions as well as members of the public in attendance. there was an interesting mix of speakers and an engaged audience that asked relevant and curious questions. at the everyman theatre, we held a joint screening of the film paywall, which discusses the costs and challenges with scholarly publications, followed by a panel discussion. to build on a cross-dialogue and discussion, our panel included the university of liverpool’s head of research support, a phd student and a researcher from ljmu, and the director of liverpool university press, and was chaired by our associate director. across the week during lunchtimes, we held a number of database awareness events at both campuses, which involved liaising with other librarians and departments. the love open week events involved a variety of stakeholders representing both universities. the success of the collaborative partnership has resulted in us organizing a joint research café as part of the university of liverpool’s impact week and a visit from the knowledge exchange unit later in the year. we are running love open week again in to coincide with open data week, with more events and activities. barriers we are really pleased with what has been achieved through our collaborative working, but it is important to acknowledge that there have been obstacles along the way.
the first important thing to consider is that there will always be people who do not see the benefit of working with you and, having experienced this, you may have to face this fact and move on. this can be difficult if the people concerned are key stakeholders, but there may be a way around the situation or you may just have to change your plans. it is imperative that you focus on the people who will work with you. if you can place emphasis on the positive relationships you have built up and not on a perceived barrier, you may find some success. the other thing to consider is that sometimes when working with others you may need to compromise in order to meet everyone’s needs. for example, when working on love open week our preference would have been to hold a research café over lunch in one of our own buildings, as we do at ljmu. however, as the university of liverpool’s da had offered to host and pay for drinks and cakes, the event really needed to be held in the afternoon on their campus. this was still branded as a joint event with speakers and attendees from both institutions. the key here is to be flexible as the benefits of holding the event outweighed any potential barrier. a creative way to overcome a barrier is to adjust your thinking and reassess. for example, last year the university held a prestigious event for international women’s day and, for various reasons, this year a similar event was not planned. we felt that something should be done about this but were not able to organize an event of the same scale. instead, we took a different approach, and along with our colleagues in the da, we arranged a lunchtime research café promoting the work of our postgraduate researchers, the remit being research by women, for women or about women.
another way to overcome a barrier is to invite yourself to events or meetings and take any strategic opportunities that come your way. for example, katherine was invited to give a quick, five-minute talk about research support as she walked into the first da conference (that she had invited herself to). as a result, we have been involved in the da conference in every subsequent year and are now seen as an integral part of the day. it is vital to recognize that small opportunities to network and liaise can pay dividends later. taking that small opportunity can lead someone to seek out your service at a later point. conclusion the advent of the ref oa policy opened up opportunities to work across teams within the institution, which provided the impetus to seek other ways we could work collaboratively to support researchers. by identifying others that share a similar purpose, we were able to develop relationships which led to interesting and fruitful collaborations, across the university and beyond. doing this saves time and effort but also provides more purposeful and meaningful research support, more widely. collaboration across teams utilizes the skills of individuals to produce a coherent message to those you are trying to reach. whilst there can be barriers at times, these are often not insurmountable and with adjustment you can still have a positive outcome. abbreviations and acronyms a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘full list of industry a&as’ link: http://www.uksg.org/publications#aa competing interests the authors are both employed by liverpool john moores university. references .
steve rose, “‘no librarian is an island’: developing a share and collaborative approach to service provision,” sconul focus : https://www.sconul.ac.uk/sites/default/files/documents/ _ .pdf (accessed september ). . kimberly douglass and thura mack, “what do you give the undergraduate researcher who has everything? an academic librarian,” the journal of academic librarianship , no. ( ): – ; doi: https://doi.org/ . /j.acalib. . . (accessed september ). . jayshree mamtora, “transforming library research services: towards a collaborative partnership,” library management , no. / : – ; doi: https://doi.org/ . / (accessed september ). . cathryn mahar, susan mikilewicz, and jenny quilliam, “a one-team collaborative approach to research outputs collection, management, and reporting to deliver enhanced services to researchers and the university community,” in collaboration and the academic library, ed. jeremy atkinson (chandos publishing, ), – ; doi: https://doi.org/ . /b - - - - . - . katherine stephan, “research cafés: how libraries can build communities through research and engagement”, insights, : ( ): https://insights.uksg.org/articles/ . /uksg. /; doi: https://doi.org/ . /uksg. (accessed september ). . north west academic libraries (nowal) web page: https://www.nowal.ac.uk/ (accessed september ). . northern collaboration web page: https://northerncollaboration.org.uk/ (accessed september ). . “love open, love research, love data monday february- february ,” research data management, university of liverpool: https://www.liverpool.ac.uk/library/research-data-management/love-data-week- / (accessed september ). . paywall: the business of scholarship web page: https://paywallthemovie.com/ (accessed september ).
article copyright: © cath dishman and katherine stephan. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited. corresponding author: cath dishman open access and digital scholarship librarian liverpool john moores university, gb e-mail: c.l.dishman@ljmu.ac.uk orcid id: https://orcid.org/ - - - co-author: katherine stephan orcid id: https://orcid.org/ - - - to cite this article: dishman c and stephan k, “destroying the silo: how breaking down barriers can lead to proactive and co-operative researcher support,” insights, , : , – ; doi: https://doi.org/ . /uksg. submitted on july accepted on september published on october published by uksg in association with ubiquity press.

correspondence: brent.thoma@usask.ca date received: june , doi: . /winn. . archived: december , keywords: tenure and promotion, academic merit, digital teaching, digital resources, scholarship citation: brent thoma, teresa chan, javier benitez, michelle lin, educational scholarship in the digital age: a scoping review and analysis of scholarly products, the winnower :e . , , doi: . /winn. . © thoma et al.
this article is introduction in boyer redefined the scope of scholarship in higher education with the definition of four overlapping subtypes of scholarship (discovery, integration, application, and teaching) (boyer ). prior to this redefinition, scholarship was largely considered to consist only of the discovery subtype. boyer’s influential definition paved the way for the recognition of a broader definition of scholarship that included teaching in addition to research. the explosive growth of digital products (resources used for the dissemination of information that exist primarily in digital formats) that has occurred since the internet was democratized in could not be predicted at that time (leiner et al. ). social media, online courses, blogs, podcasts and other digital products have since changed the way we teach, disseminate, and discuss scholarly ideas. their exclusion from traditional scholarly frameworks, combined with a lack of standards to ensure their quality, may explain why they are generally not viewed as scholarship by members of the academic establishment (brabazon ; hendricks ; kirkup ; savage ). scholars and educators are turning to digital methods for disseminating knowledge and reaching students (priem ). this has resulted in the creation of online communities of practice with benefits including: increased collaboration, enhanced knowledge dissemination, instantaneous scholarly discussion, and the generation of scholarly identity (kirkup ; gruzd, staves, and wilk ; maitzen ; shema, bar-ilan, and thelwall ). arguments against digital products note that they have not proven to be superior and that they require more time to develop (cooke ). the increasing prominence of digital products in medical education and the time being devoted to their development makes determining their scholarly value extremely important (cadogan et al. 
; medicine  educational scholarship in the digital age: a scoping review and analysis of scholarly products brent thoma , teresa chan , javier benitez , michelle lin . mededlife research collaborative . emergency medicine residency program, university of saskatchewan . simulation fellowship program, massachusetts general hospital . department of medicine, division of emergency medicine, mcmaster university . department of emergency medicine, university of california san francisco abstract boyer’s framework of scholarship was published before significant growth in digital technology. as more digital products are produced by medical educators, determining their scholarly value is of increasing importance. this scoping systematic review developed a taxonomy of digital products and determined their fit within boyer’s framework of scholarship. we conducted a broad literature search for descriptions of digital products in the medical literature in july using medline, embase, eric, psychinfo, and google scholar. a framework analysis categorized each product using boyer’s model of scholarship, while a thematic analysis defined a taxonomy of digital products. abstracts were found and met inclusion criteria. digital products mapped primarily to the scholarship of teaching ( . %) followed by integration ( . %), application ( . %), and discovery ( . %). a taxonomy of categories was defined. web-based or computer assisted learning ( %) was described most frequently. we found that digital products are well described in medical literature and fit into boyer’s framework of scholarship and proposed a taxonomy of digital products that parallel traditional forms of the scholarship of teaching and learning. this research should inform the development of tools to examine the impact and quality of digital products. 
✎ thoma et al the winnower june https://thewinnower.com/topics/medicine https://thewinnower.com/papers/ -educational-scholarship-in-the-digital-age-a-scoping-review-and-analysis-of-scholarly-products#submit https://thewinnower.com/papers/ -educational-scholarship-in-the-digital-age-a-scoping-review-and-analysis-of-scholarly-products#submit mailto:brent.thoma@usask.ca https://dx.doi.org/ . /winn. . distributed under the terms of the creative commons attribution . international license, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited. matava et al. ; bahner et al. ). in this scoping review paper, we quantify the increasing prevalence of digital products in the medical literature, develop a taxonomy of digital products, and compare the products in the taxonomy to traditional forms of the scholarship of teaching and learning. we hope that this will increase the awareness of this growing area of educational scholarship and classify digital products so that their value can be understood within the context of their traditional parallels. methods in concert with an expert librarian, an expert search strategy was developed using the medline, embase, eric, and psychinfo databases, as they were deemed to be the most likely to provide literature on digital products used in medical education. the search was not limited by year or language, and used the keywords and keyword variations of: (student, medical or medical student or “internship and residency” or intern or resident) and (education, medical or education, medical, graduate or education, medical, undergraduate or “medical education”) and (blog or weblog or microblog or social media or social network or “health . ” or “web . 
” or video or youtube or podcast or vodcast or webcast or screencast or wiki or widget or new media or new technology or mobile app or app, collaborative or cooperative behavior or conferencing or crowdsource or rss or “really simple syndication” or computer-assisted instruction or web-based instruction or “access to information” or open access or free access).

in addition to this traditional literature search, a previously described google scholar search methodology (chan et al. ) was conducted for five sets of keywords: “blogging and scholarship,” “digital scholarship medicine medical,” “free open access medical education,” “medical blogging” and “tenure and promotion blogging.” the first results for each keyword set were reviewed and relevant results were added to the findings.

a title review of the abstracts was performed by one author (bt). abstracts were excluded if ( ) there was no english-language abstract, ( ) they were duplicates, or ( ) they clearly did not address the use of digital products in medicine. the abstracts were coded and classified with a detailed abstract review conducted by two authors (bt, jb). upon abstract review, articles were excluded if ( ) no particular digital product was described, ( ) the digital product did not meet the criteria for scholarship based on boyer’s model, or ( ) upon closer inspection they met the initial exclusion criteria.

during the abstract review, two authors (bt, jb) performed both a framework analysis and thematic analysis of the digital products described in the abstracts. two reviewers (bt, jb) classified the digital products described in the first abstracts collaboratively to develop an initial taxonomy and set of definitions for the thematic analysis and to calibrate the coding schemes for the thematic and framework analyses. subsequently a constant comparator technique was used to perform both analyses whereby classifications were made independently in batches of approximately abstracts and compared.
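the batch-wise comparison of two reviewers' independent classifications can be sketched in python as follows. this is only an illustration under stated assumptions, not the reviewers' actual tooling: the function name `compare_in_batches`, the abstract identifiers, and the batch size passed by the caller are hypothetical, and each reviewer's classification is assumed to be stored as a set of boyer's scholarship types per abstract.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

# Hypothetical representation: one set of Boyer scholarship types per abstract id.
Labels = FrozenSet[str]


def compare_in_batches(
    rater_a: Dict[str, Labels],
    rater_b: Dict[str, Labels],
    batch_size: int,
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (batch_index, discordant_ids) for consecutive batches of abstracts.

    An abstract is discordant when the two raters' label sets differ in any way.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if rater_a.keys() != rater_b.keys():
        raise ValueError("both raters must classify the same abstracts")

    abstract_ids = sorted(rater_a)  # fixed order so batches are reproducible
    for start in range(0, len(abstract_ids), batch_size):
        batch = abstract_ids[start:start + batch_size]
        discordant = [i for i in batch if rater_a[i] != rater_b[i]]
        yield start // batch_size, discordant
```

in this sketch, the abstracts flagged in each batch would be discussed by the two reviewers before the next batch is coded, which is the reason for comparing frequently rather than once at the end.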
the frequent comparisons allowed the reviewers to ensure consistency within the analyses and to refine a consensus definition for each type of digital product in the thematic analysis. when available and necessary, full manuscripts were reviewed to accurately classify the digital products and their form of scholarship. discordant classifications were discussed by the reviewers and resolved by consensus when possible. when consensus was not reached, a third reviewer (tc) arbitrated disagreements. the third reviewer also audited the excluded abstracts to ensure that they met the review’s exclusion criteria. the year of publication of each abstract was also recorded to demonstrate the prevalence of digital products described each year.

while they were conducted concurrently, the two analyses were functionally independent. the thematic analysis was used to derive a taxonomy that defined and described all of the digital products found in the literature. additional items were added to the taxonomy as they were found and the definitions were frequently refined to accurately describe all of the digital products. the purpose of the framework analysis was to determine if and how digital products fit into boyer’s four types of scholarship (boyer, ). digital products were classified as one or more of boyer’s types of scholarship: discovery (original research for the advancement of knowledge), integration (contextualizing information across disciplines or into larger intellectual patterns), application (applying knowledge dynamically to inform and test new theories in an engaged fashion), and/or teaching (systematic study of teaching and learning in the presence of learners) (gale et al. ; boyer ). the intraclass correlation coefficient was calculated to determine a measure of agreement.
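for readers unfamiliar with the statistic, the following python sketch shows one way an intraclass correlation coefficient can be computed. the text does not say which form of the coefficient was used or how the classifications were coded numerically, so this is an assumption-laden illustration: it implements the two-way random-effects, absolute-agreement, single-rater form (often written icc(2,1)), and the function name `icc_2_1` is hypothetical.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` has one row per rated item and one column per rater.
    The choice of this ICC form is an assumption, not taken from the source.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two rated items")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need the same number (>= 2) of raters for every item")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: items (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denominator
```

with two raters who agree on every item the function returns 1; values near 0 indicate that differences between raters are as large as differences between items.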
the definitions resulting from the thematic analysis were assessed to determine if there were traditional scholarly products used for the same purpose. this comparison, while inherently subjective, was conducted to further contextualize the role of each type of digital product.

results
the flow diagram for the literature search, title review, and abstract review is presented in figure . the thematic and framework analyses were conducted on digital products described by the abstracts that met the inclusion criteria. an abstract published in described the oldest digital product.

figure . diagram illustrating the number of articles excluded through the title and abstract reviews.

the number of digital products described in the published medical literature between and july is illustrated in figure . the number of digital products for was projected to double because our literature search only included articles published through july .

figure . the number of digital products described in the medical literature over time.

framework analysis
table presents the results of the analysis mapping published digital products to boyer’s framework of scholarship (boyer ). the intraclass correlation between the raters was . , but disagreements were ultimately resolved through discussion and consensus. most products ( . %) were categorized under the scholarship of teaching. the scholarship of integration ( . %), application ( . %), and discovery ( . %) were described much less frequently. this table further stratifies these scholarship models based on the categories of digital products, as derived by our thematic analysis. of note, some products could be classified as more than one type of scholarship.
table : types and numbers of digital products mentioned in the literature and classified using boyer's framework of scholarship
digital product | discovery (%) | integration (%) | application (%) | teaching (%) | total
web-based or computer assisted learning | ( ) | ( . ) | ( . ) | ( . )* |
multi-modal products | ( ) | ( . ) | ( . ) | ( . )* |
social network | ( ) | ( . )* | ( . ) | ( . )* |
instructional video | ( ) | ( ) | ( . ) | ( . ) |
online repository | ( ) | ( . ) | ( . ) | ( . )* |
podcast | ( ) | ( ) | ( ) | ( ) |
online course | ( ) | ( ) | ( ) | ( )* |
video podcast | ( ) | ( ) | ( . ) | ( . ) |
blog | ( ) | ( . ) | ( . ) | ( . )* |
open access journal | ( )* | ( . ) | ( ) | ( . ) |
wiki | ( ) | ( ) | ( ) | ( )* |
website | ( ) | ( . ) | ( . ) | ( )* |
online discussion board | ( ) | ( ) | ( ) | ( )* |
e-mail | ( ) | ( ) | ( ) | ( )* |
application ("app") | ( ) | ( ) | ( ) | ( ) |
online textbook | ( ) | ( ) | ( ) | ( )* |
virtual reality | ( ) | ( ) | ( ) | ( )* |
search engine | ( ) | ( ) | ( ) | ( ) |
serious game | ( ) | ( ) | ( ) | ( )* |
total | ( . ) | ( . ) | ( . ) | ( . ) |

in table , the starred numbers represent the most popular type of scholarship for each product. the table includes abstracts that were classified as multiple forms of scholarship, resulting in totals ( ) greater than the number of abstracts reviewed ( ).

thematic analysis
table provides a taxonomy of the digital products described in the literature and derived from the thematic analysis. each category is defined and an example is provided. together, web-based learning and computer assisted learning ( %) were the most prevalent forms of digital product. a single category was created for these two types of digital products because prior to the democratization and widespread accessibility of the internet, web-based learning products were classified under the umbrella term of computer assisted learning. the significant overlap between these two terms necessitated their amalgamation into one category in our taxonomy.
social networks, instructional videos, online repositories, podcasts, online courses, video podcasts (also known as screencasts or vodcasts), and blogs had roughly similar prevalence and collectively comprised another % of the publications.

table : definitions and examples of digital products.
digital product | definition | example
applications (‘apps’) | a resource downloaded to a smartphone. | irash is an application that allows users to search and learn about various rashes (deveau and chilukuri )
blog | a website used to publish information in periodic posts that are primarily text-based. | a blog was created to host synopses of ‘morning report’ sessions run by chief medical residents (bogoch et al. )
e-mail | a common form of direct electronic messaging between a sender and one or more recipients. | e-mail was used to send questions to teach residents about pediatric emergency medicine (komoroski )
instructional video | a video demonstrating a skill (e.g. procedure, physical exam finding, ecg or x-ray interpretation, etc). | instructional video used to teach chest tube insertion (davis et al. )
multi-modal products | a product that consists of multiple digital products. | an online course on evidence based medicine and critical appraisal that used video podcasts, a wiki and blogs (tam and eastwood )
online course | a complete curriculum delivered using multiple online modalities. differs from multi-modal products in that it is organized into a formal curriculum. | the online genetic testing curriculum is a course about the ethical, legal, and social implications of genetic testing and counseling (metcalf, tanner, and buchanan )
online discussion board | an online forum that allows users to post and respond to other participants. | a clinical discussion board for learners to describe their rural medicine experiences (baker, eley, and lasserre )
online repository | an online database that resources can be drawn from and added to. | a repository of images of dermatologic findings in darker-skinned patients (ezzedine et al. )
online textbook | a textbook published online. | oditeb (open distributed text book), an online textbook that describes the diagnosis of gastrointestinal tumours (horsch et al. )
open access journal | a journal only available online that publishes articles without access restrictions. | various online journals have been created to decrease cost and allow open-access publication of scientific materials (davis and walters )
podcast | audio recordings that are published periodically with the intent of disseminating knowledge. | surgery podcasts are used to teach core principles to clinical clerks on their surgical rotation (white, sharma, and boora )
search engine | search engines used to find information online. | google, yahoo, dogpile, altavista, metacrawlers and ask were used to find information on scleroderma renal crisis (akbar and yacyshyn )
serious game | an online game designed to educate the players. | emedoffice, a serious game to teach practice management (hannig et al. )
social network | an online platform that allows synchronous and asynchronous communication between individuals. | twitter used to connect teachers with learners (forgie, duff, and ross )
video podcast | videos with embedded audio that are published periodically. differs from instructional videos because it focuses on knowledge rather than skill. | video podcasts used to teach embryology (evans )
virtual reality | a virtual environment used to present learning material. | a virtual reality simulator was used to simulate medical cases (alverson et al. )
web based learning or computer assisted learning | educational modules that may make use of multiple modalities. web-based learning is based online while computer assisted learning is not. these modalities were combined due to substantial overlap. | a web based module on pediatric pain management (ameringer et al. ); a computer based application about occupational lung disease (bresnitz, gracely, and rubenstein )
website | an online webpage that cannot be classified as any other digital product. | case based pediatrics is a website with a list of teaching cases for medical students and residents (falagas, karveli, and panos )
wiki | a website that can be openly edited by end-users. utilizes crowd-sourcing as a method for improving and revising the content. | a wiki site for orthopedic cases, utilizes a scoreboard to encourage participation (ma et al. )

historical parallels
as demonstrated by our framework analysis, digital products can be classified within the types of scholarship described by boyer (boyer ) and most fall under teaching and learning. following the completion of our thematic analysis, the definitions of the digital products were compared with traditional forms of the scholarship of teaching and learning. table outlines the parallels between traditional products and of the digital products described in the thematic analysis.
no product was found that was comparable to the digital product ‘virtual reality.’

table : comparing traditional products used for the scholarship of teaching and learning to digital products that are used for this purpose
types of teaching and learning resources | examples of traditional products | examples of digital products
interactive resources | small groups; workshops | online discussion board; social network; wiki
independent study resources | assignments; discussions with tutors; group work; laboratory work | e-mail; online course; serious game; virtual reality; web based and computer assisted learning
audiovisual resources | lecture; skill demonstration | podcast; video podcast; instructional video
point-of-care resources | guidebooks; pocketbooks | applications (‘apps’)
written resources | textbook; printed journals; medical journalism | online textbook; blog; open access journal; website
resource repository | library; library classification system | online repository; search engine

discussion
the growing number of digital products documented in the literature (figure and ) suggests that medical educators are increasingly using technology to engage in various forms of scholarship. while educators have discussed applying boyer’s traditional definitions of scholarship to digital products (heap and minocha ; pearce et al. ), we provide the first comprehensive framework analysis of these products. our framework analysis found that, following teaching and learning, integration ( . %), application ( . %), and discovery ( . %) were the most frequent types of scholarship found in digital products. we suspect that the digital products were predominantly consistent with scholarship of teaching and learning because, despite boyer’s reclassification of scholarship, educators have traditionally not had their scholarly contributions recognized. literature that assesses their innovations is one way to receive academic recognition for their work.
educators should keep in mind that digital products can be scholarly outside of their traditional realm of teaching. for example, boyer’s concept of application was demonstrated by the various ‘apps’ that allow translation of concepts at the point of care (graber, tompkins, and holland ), integration was illustrated by an online textbook that synthesized multiple resources into a single resource (horsch et al. ), and discovery was exemplified by open access online journals that fostered new scientific works (p. m. davis and walters ). social networks were the most versatile product with multiple examples of their use in teaching, application, and integration.

the thematic analysis described the diversity of digital products (table ). notably, web-based and computer assisted learning programs were prominently featured in the literature and there has been a recent uptake of social media (nickson and cadogan ; cadogan et al. ). social networks, in particular, seem to have impacted medical education by allowing scholars to share their digital products (boulos, maramba, and wheeler ).

a traditional parallel was found for nearly every digital product defined in the thematic analysis. the use of digital products was particularly prominent for the scholarship of teaching and learning. this may be because of their reach, customization, and updatability. whereas scholarly teaching was historically a fleeting event offered to a defined group (e.g. an address that was given in a lecture hall), digital products extend their reach to large numbers of learners who can access them at their convenience. this asynchrony allows learners to customize their experience (e.g. by speeding up or slowing down a lecture) and educators to update their products as needed.
that said, there is no compelling evidence that digital products are more effective for learning and they may take more time and resources to develop than traditional products (cooke ). they have also been criticized for their lack of editorial oversight and review (brabazon ; kirkup ). these shortcomings may limit their widespread endorsement and utilization. further research will be required to determine when and how they should be used.

while our results suggest that this research is increasingly being conducted, the role and value of digital products in our current academic schema for scholarship remains poorly defined, and hence, poorly acknowledged. institutions that do acknowledge digital products as scholarship for the purpose of promotion and tenure decisions have difficulty classifying them and quantifying their value relative to other scholarly pursuits (gruzd, staves, and wilk ; cheverie, boettcher, and buschman ; rockwell ; ruiz, mintzer, and leipzig ). novel ways to recognize digital products include publishing them on a platform with peer review and publication processes such as mededportal (ruiz, mintzer, and leipzig ; reynolds and candler, christopher ) or conducting educational research to evaluate their efficacy (cheston, flickinger, and chisolm ). regardless, the amount of academic recognition for digital products is relatively low compared to the effort expended to build and maintain them, which may limit their growth in the future (anderson et al. ; profhacker ).

limitations
while our literature search was intended to be as broad as possible, it is still likely that some digital products were missed since they may not have been reported in the literature. a broader review of grey and non-english literature would not have been feasible given the sheer volume of unreported products. for example, a recent report found that there were english-language blogs and podcasts in emergency medicine alone (cadogan et al. ).
additionally, we may have missed digital products of historic significance that were described using terms that are not applicable today. for example, cd-roms were likely to have been considered digital products in the past but were not included in our literature search. missing resources would change the number of products per year represented in figure and make our taxonomy of digital products incomplete. the exclusion of the mededportal database could also be considered a limitation as it publishes many digital products. however, our search explicitly attempted to quantify and describe the digital products described in the literature. mededportal’s publications are digital products, rather than descriptions of them, and for this reason they were considered to be outside of the scope of this review. finally, our quantification of the rapidly increasing number of digital products described annually in the literature fails to account for the increase in literature that has been published in general (larsen, ). unfortunately, we were unable to accurately quantify this growth for the body of literature that our review assessed. as the amount of research published annually is increasing (larsen, ), the increase in descriptions of digital products would have been less spectacular had we been able to take this into account.

future directions
since the digital products described in the medical literature fit within boyer’s framework, we feel strongly that they should be considered alongside other forms of scholarship. however, given the ease with which some products can be created, better evaluation tools will need to be developed to determine their quality, value, and relative impact.
educator portfolios are becoming accepted as a way to provide additional detail to the traditional curriculum vitae, which sub-optimally captures the scholarly efforts of educators (simpson et al. ; baldwin, chandran, and gusic ). in showing that digital products fall within boyer’s framework of scholarship, our findings suggest that we should look to apply other conceptual frameworks of educational scholarship to digital products or online educational resources. frequently, educators lean towards the criteria for assessing scholarship developed by glassick. assessment frameworks such as glassick’s criteria of scholarship are manifest in the aamc toolbox for evaluating educators and could be used to evaluate these portfolios (glassick ; gusic et al. ). table suggests multiple parallels between traditional and digital projects for teaching and learning that could guide how digital products should fit into these portfolios. developing a standardized approach would allow promotion committees and administrative leadership to evaluate digital and traditional educational efforts more rigorously. together, boyer’s and glassick’s frameworks provide a roadmap for educators interested in scholarship. digital scholars must take care that their digital products warrant scholarly respect by ensuring that they stand up to the scrutiny of these recognized conceptual frameworks.

conclusion
digital products are increasingly being described in the medical literature. they are likely to have a substantial impact on medical education and can readily fit into boyer’s established framework of scholarship. our taxonomy shows clear parallels between digital and traditional products and can hopefully provide a framework for further research on digital scholarship.

references
akbar, s, and e yacyshyn. . “is there relevant information about scleroderma renal crisis on most frequently visited internet search engines?” journal of rheumatology ( ): – .
alverson, dale c, stanley m saiki, summers kalishman, marlene lindberg, stewart mennin, jan mines, lisa serna, et al. . “medical students learn over distance using virtual reality simulation.” simulation in healthcare : journal of the society for simulation in healthcare ( ): – . doi: . /sih. b e f d .
ameringer, suzanne, deborah fisher, sue sreedhar, jessica m ketchum, and leanne yanni. . “pediatric pain management education in medical students: impact of a web-based module.” journal of palliative medicine ( ): – . doi: . /jpm. . .
anderson, michael g, donna d alessandro, dawn quelle, rick axelson, lois j geist, and donald w black. . “recognizing diverse forms of scholarship in the modern medical college”, – . doi: . /ijme. b . c.
bahner, david p, eric adkins, nilesh patel, chad donley, rollin nagel, and nicholas e kman. . “how we use social media to supplement a novel curriculum in medical education.” medical teacher ( ): – . doi: . / x. . .
baker, peter g, diann s eley, and kaye e lasserre. . “tradition and technology: teaching rural medicine using an internet discussion board.” rural and remote health ( ): . http://www.ncbi.nlm.nih.gov/pubmed/ .
baldwin, constance, latha chandran, and maryellen gusic. . “guidelines for evaluating the educational performance of medical school faculty: priming a national conversation.” teaching and learning in medicine ( ): – . doi: . / . . .
bogoch, isaac i, david w frost, suzanne bridge, todd c lee, wayne l gold, daniel m panisko, and rodrigo b cavalcanti. . “morning report blog: a web-based tool to enhance case-based learning.” teaching and learning in medicine ( ): – . doi: . / . . .
boulos, maged n kamel, inocencio maramba, and steve wheeler. . “wikis, blogs and podcasts: a new generation of web-based tools for virtual collaborative clinical practice and education.” bmc medical education (january): . doi: . / - - - .
boyer, e. . “scholarship reconsidered: priorities of the professoriate” the carnegie foundation for the advancement of teaching: princeton, nj.
brabazon, t. . “the google effect: googling, blogging, wikis and the flattening of expertise.” libri ( ): - . doi: . /libr. .
bresnitz, eddy a, edward j gracely, and harriet l rubenstein. . “a randomized trial to evaluate a computer-based learning program in occupational lung disease.” journal of occupational and environmental medicine ( ). http://journals.lww.com/joem/fulltext/ / /a_randomized_trial_to_evaluate_a_computer_based. .aspx.
cadogan, m., b. thoma, t. m. chan, and m. lin. . “free open access meducation (foam): the rise of emergency medicine and critical care blogs and podcasts ( - ).” emergency medicine journal, (february). doi: . /emermed- - .
chan, teresa m, clare wallner, thomas k swoboda, katrina a leone, and chad kessler. . “assessing interpersonal and communication skills in emergency medicine.” academic emergency medicine ( ): – . doi: . /acem. .
cheston, christine c, tabor e flickinger, and margaret s chisolm. . “social media use in medical education: a systematic review.” academic medicine : journal of the association of american medical colleges ( ): – . doi: . /acm. b e ffc .
cheverie, joan f., jennifer boettcher, and john buschman. . “digital scholarship in the university tenure and promotion process: a report on the sixth scholarly communication symposium at georgetown university library.” journal of scholarly publishing ( ): – . doi: . /scp. . .
cooke, david. . “futurecasting in education technologies: fun new toys and a reality check.” international conference on residency education, plenary session. available at: https://www.youtube.com/watch?v=xcodjukpuec&list=uu z-vvzoq cvwmvdzsh a. retrieved november , .
davis, james s, george d garcia, mary m wyckoff, salman alsafran, jill m graygo, kelly f withum, and carl i schulman. . “use of mobile learning module improves skills in chest tube insertion.” the journal of surgical research ( ). elsevier ltd: – . doi: . /j.jss. . . . davis, philip m, and william h walters. . “the impact of free access to the scientific literature: a review of recent research.” journal of the medical library association : jmla ( ): – . doi: . / - . . . . deveau, michael, and suneel chilukuri. . “mobile applications for dermatology.” seminars in cutaneous medicine and surgery ( ). elsevier inc. – . doi: . /j.sder. . . . educational scholarship in the digital age: a scoping review and analysis of scholarly products : medicine thoma et al the winnower june https://dx.doi.org/ . / x. . https://dx.doi.org/ . / . . https://dx.doi.org/ . / . . https://dx.doi.org/ . / - - - https://dx.doi.org/ . /libr. . https://dx.doi.org/ . /emermed- - https://dx.doi.org/ . /acem. https://dx.doi.org/ . /acm. b e ffc https://dx.doi.org/ . /scp. . https://www.youtube.com/watch?v=xcodjukpuec&list=uu z-vvzoq cvwmvdzsh a https://dx.doi.org/ . /j.jss. . . https://dx.doi.org/ . / - . . . https://dx.doi.org/ . /j.sder. . . evans, darrell j r. . “using embryology screencasts: a useful addition to the student learning experience?” anatomical sciences education ( ). wiley subscription services, inc., a wiley company: – . doi: . /ase. . ezzedine, k, a amiel, p vereecken, t simonart, b schietse, k seymons, b s ndiaye, et al. . “black skin dermatology online, from the project to the website: a needed collaboration between north and south.” journal of the european academy of dermatology and venereology : jeadv ( ): – . doi: . /j. - . . .x. falagas, matthew e, efthymia a karveli, and george panos. . 
“infectious disease cases for educational purposes: open-access resources on the internet.” clinical infectious diseases : an official publication of the infectious diseases society of america ( ): – . doi: . / . forgie, sarah edith, jon p duff, and shelley ross. . “twelve tips for using twitter as a learning tool in medical education.” medical teacher ( ): – . doi: . / x. . . gale, nicola k, gemma heath, elaine cameron, sabina rashid, and sabi redwood. . “using the framework method for the analysis of qualitative data in multi-disciplinary health research.” bmc medical research methodology ( ). bmc medical research methodology: . doi: . / - - - . glassick, charles e. . “boyer’s expanded definitions of scholarship, the standards for assessing scholarship, and the elusiveness of the scholarship of teaching.” academic medicine ( ), – . doi: . / - - graber, mark l, david tompkins, and joanne j holland. . “resources medical students use to derive a differential diagnosis.” medical teacher : – . doi: . / . gruzd, anatoliy, kathleen staves, and amanda wilk. . “tenure and promotion in the age of online social media.” proceedings of the american society for information science and technology ( ): – . doi: . /meet. . . gusic, m, j amiel, c baldwin, l chandran, r fincher, b mavis, p o’sullivan, et al. . “using the aamc toolbox for evaluating educators: you be the judge!” mededportal. doi: . / ​ mep_ - . . hannig, andreas, nicole kuth, monika Özman, stephan jonas, and cord spreckelsen. . “emedoffice: a web-based collaborative serious game for teaching optimal design of a medical practice.” bmc medical education (january): . doi: . / - - - . heap, tania, and shailey minocha. . “an empirically grounded framework to guide blogging for digital scholarship.” research in learning technology (august). doi: . /rlt.v i . . hendricks, arthur. . “bloggership, or is publishing a blog scholarship? a survey of academic librarians.” library hi tech ( ): – . doi: . / . horsch, a., p. 
hellerhoff, m. hogg, h. ahlbrink, t. balbacha, liss. t., k. minov, and p. gerhardt. . “concepts of a web-based open distributed textbook for the multimodal diagnostics of gastrointestinal tumours with mri, ct and video-endoscopy addressing students of medicine and students of medical informatics as two different target groups.” studies in health technology and informatics ( ): – . kirkup, gill. . “academic blogging: academic practice and academic identity.” london review of education ( ): – . doi: . / . komoroski, e m. . “use of e-mail to teach residents pediatric emergency medicine.” archives of pediatrics & adolescent medicine ( ): – . doi: . /archpedi. . . larsen, p. o., & von ins, m. ( ). the rate of growth in scientific publication and the decline in coverage provided by science citation index. scientometrics, ( ), - . doi: . /s - - -z. educational scholarship in the digital age: a scoping review and analysis of scholarly products : medicine thoma et al the winnower june https://dx.doi.org/ . /ase. https://dx.doi.org/ . /j. - . . .x https://dx.doi.org/ . / https://dx.doi.org/ . / x. . https://dx.doi.org/ . / - - - https://dx.doi.org/ . / - - https://dx.doi.org/ . / https://dx.doi.org/ . /meet. . https://dx.doi.org/ . /?mep_ - . https://dx.doi.org/ . / - - - https://dx.doi.org/ . /rlt.v i . https://dx.doi.org/ . / https://dx.doi.org/ . / https://dx.doi.org/ . /archpedi. . . https://dx.doi.org/ . /s - - -z leiner, barry m, david d clark, robert e kahn, leonard kleinrock, daniel c lynch, jon postel, larry g roberts, and stephen wolff. . “a brief history of the internet.” computer communication review ( ): – . doi: . / . ma, zhen-sheng, hong-ju zhang, tao yu, gang ren, guo-sheng du, and yong-hua wang. . “orthochina.org: case-based orthopaedic wiki project in china.” clinical orthopaedics and related research ( ): – . doi: . /s - - -z. maitzen, rohan. . “scholarship . : blogging and/as academic practice.” journal of victorian culture ( ): – . doi: . / . . . 
matava, clyde t, derek rosen, eric siu, and dylan m bould. . “elearning among canadian anesthesia residents: a survey of podcast use and content needs.” bmc medical education (january): . doi: . / - - - . metcalf, mary p, t bradley tanner, and amanda buchanan. . “effectiveness of an online curriculum for medical students on genetics, genetic testing and counseling.” medical education online (january): – . doi: . /meo.v i . . nickson, christopher p, and michael d cadogan. . “free open access medical education (foam) for the emergency physician.” emergency medicine australasia ( ): – . doi: . / - . . pearce, nick, martin weller, eileen scanlon, and melanie ashleigh. . “digital scholarship considered: how new technologies could transform academic work.” in education ( ). http://ineducation.couros.ca/index.php/ineducation/article/view/ / . priem, jason. . “beyond the paper.” nature : – . doi: . / a. koh, adeline. . “the challenges of digital scholarship.” chronicle of higher education. retrieved from http://chronicle.com/blogs/profhacker/the-challenges-of-digital-scholarship/ on d\necember , . reynolds, robby j., and s. candler, christopher. . “mededportal : educational scholarship for teaching.” journal of continuing education in the health professions ( ): – . doi: . /chp. rockwell, geoffrey. . “on the evaluation of digital media as scholarship.” profession : – . doi: . /prof. . . . . ruiz, jorge g, michael j mintzer, and rosanne m leipzig. . “the impact of e-learning in medical education.” academic medicine ( ): – . doi: . / - - savage, william w. . “the transom: you can’t spill mustard on a blog.” journal of scholarly publishing ( ): – . doi: . /scp. . . shema, hadas, judit bar-ilan, and mike thelwall. . “research blogs and the discussion of scholarly information.” plos one ( ): e . doi: . /journal.pone. . simpson, deborah, ruth-marie e fincher, janet p hafler, david m irby, boyd f richards, gary c rosenfeld, and thomas r viggiano. . 
proceedings of the th conference on language resources and evaluation (lrec ), pages – marseille, – may. © european language resources association (elra), licensed under cc-by-nc

language resources for historical newspapers: the impresso collection

maud ehrmann*, matteo romanello*, simon clematide†, phillip benjamin ströbel†, raphaël barman*
*digital humanities laboratory, epfl
†institute for computational linguistics, zurich university
*{maud.ehrmann, matteo.romanello, raphael.barman}@epfl.ch
†{siclemat, pstroebel}@cl.uzh.ch

abstract

following decades of massive digitization, an unprecedented amount of historical document facsimiles can now be retrieved and accessed via cultural heritage online portals. while this represents a huge step forward in terms of preservation and accessibility, the next fundamental challenge, and the real promise of digitization, is to exploit the contents of these digital assets, and therefore to adapt and develop appropriate language technologies to search and retrieve information from this ‘big data of the past’. yet the application of text processing tools to historical documents in general, and historical newspapers in particular, poses new challenges and crucially requires appropriate language resources. in this context, this paper presents a collection of historical newspaper data sets composed of text and image resources, curated and published within the context of the ‘impresso - media monitoring of the past’ project. with corpora, benchmarks, semantic annotations and language models in french, german and luxembourgish covering ca. years, the objective of the impresso resource collection is to contribute to historical language resources, and thereby strengthen the robustness of approaches to non-standard inputs and foster efficient processing of historical documents.

keywords: historical and multilingual language resources, historical texts, multi-layered historical semantic annotations, ocr, named entity processing, topic modeling, text reuse, digital humanities

introduction

digitization efforts are slowly but steadily contributing an increasing amount of facsimiles of cultural heritage documents. as a result, it is nowadays commonplace for many memory institutions to create and maintain digital repositories which offer rapid, time- and location-independent access to documents (or surrogates thereof), allow dispersed collections to be brought together virtually, and ensure the preservation of fragile documents thanks to on-line consultation (terras, ). beyond this great achievement in terms of preservation and accessibility, the next fundamental challenge, and the real promise of digitization, is to exploit the contents of these digital assets, and therefore to adapt and develop appropriate language technologies to search and retrieve information from this ‘big data of the past’ (kaplan and di lenardo, ).

in this regard, and following decisive grassroots efforts led by libraries to improve ocr (optical character recognition) technology and generalize full-text search over historical document collections (see, e.g., the impact and trove projects: http://www.impact-project.eu, https://trove.nla.gov.au), the digital humanities (dh), natural language processing (nlp) and computer vision (cv) communities are pooling forces and expertise to push forward the processing of facsimiles, as well as the extraction, linking and representation of the complex information enclosed in transcriptions of digitized collections. these interdisciplinary efforts were recently streamlined within the far-reaching europe time machine project (https://www.timemachine.eu), which aims, in general, at applying artificial intelligence technologies to cultural heritage data and, in particular, at achieving text understanding of historical material.

this momentum is particularly vivid in the domain of digitized newspaper archives, for which there has been a notable increase of research initiatives over the last years.
besides individual works dedicated to the development of tools (yang et al., b; dinarelli and rosset, ; moreux, ; wevers, ), or to the usage of those tools (kestemont et al., ; lansdall-welfare et al., ), events such as evaluation campaigns (rigaud et al., ; clausner et al., ) or hackathons based on digitized newspaper data sets have multiplied (see the edition of the coding da vinci cultural hackathon, https://www.deutsche-digitale-bibliothek.de/content/journal/aktuell/kicking-coding-da-vinci-berlin?lang=en). additionally, several large consortia projects proposing to apply computational methods to historical newspapers at scale have recently emerged, including viraltexts (a project aiming at mapping networks of reprinting in th-century newspapers and magazines; us, - ; https://viraltexts.org), oceanic exchanges (a project tracing global information networks in historical newspaper repositories from to ; us/eu, - ; https://oceanicexchanges.org), impresso (https://impresso-project.ch), newseye (a digital investigator for historical newspapers; eu, - ; https://www.newseye.eu), and living with machines (a project which aims at harnessing digitised newspaper archives; uk, - ; https://www.turing.ac.uk/research/research-projects/living-machines) (ridge et al., ). these efforts are contributing a pioneering set of text and image analysis tools, system architectures, and graphical user interfaces covering several aspects of historical newspaper processing and exploitation.

yet the application of text processing tools to historical documents in general, and historical newspapers in particular, poses new challenges (sporleder, ; piotrowski, ). first, the language under study is mostly of earlier stage(s) and usually features significant orthographic variation (bollmann, ). second, due to the acquisition process and/or document conservation state, inputs can be extremely noisy, with errors which do not resemble the tweet misspellings or speech transcription hesitations for which adapted approaches have already been devised (linhares pontes et al., a; chiron et al., ; smith and cordell, ). further, due to the diversity of the material in terms of genre, domain and time period, language resources such as corpora, benchmarks and knowledge bases that can be used for lexical and semantic processing of historical texts are rather sparse and heterogeneous. finally, archives and texts from the past are not as anglophone as today’s information society, making multilingual resources and processing capacities even more essential (neudecker and antonacopoulos, ).

overall, and as demonstrated by vilain et al. ( ), the transfer of nlp approaches from one domain or time period to another is not straightforward, and the performance of tools initially developed for homogeneous texts of the immediate past degrades when they are applied to historical material (ehrmann et al., ). this echoes the statement of plank ( ), according to whom what is considered standard or canonical data in nlp (i.e. contemporary news genre) is more a historical coincidence than objective evidence or reality: non-canonical, heterogeneous, biased and noisy data is more prevalent than is commonly believed, and historical texts are no exception.
in this respect, and in light of the above, historical language(s) can be considered to belong to the family of less-resourced languages for which further efforts are still needed.

to help alleviate this deficiency, this paper presents a ‘full-stack’ historical newspaper data set collection composed of text and image resources produced, curated and published within the context of the ‘impresso - media monitoring of the past’ project (https://impresso-project.ch). these resources relate to historical newspaper material in french, german and luxembourgish and include: ocred texts together with their related facsimiles and language models; benchmarks for article segmentation, black letter ocr and named entity processing; and multi-layer semantic annotations (named entities, topic modeling and text reuse). the objective of the impresso resource collection is to contribute to historical language resources, and thereby strengthen the robustness of approaches to non-standard inputs and foster efficient processing of historical documents. more precisely, these resources can support:

(a) nlp research and applications dealing with historical language, with a set of ‘ready-to-parse’ historical texts covering years in french and german, and a set of language models;
(b) model training and performance assessment for three tasks, namely article segmentation, ocr transcription and named entity processing (for the first time on such material for the latter), with manually transcribed and annotated corpora;
(c) historical corpus exploration and digital history research, with various stand-off semantic annotations.

to the best of our knowledge, the impresso resource collection represents the most complete historical newspapers data set series to date.
in the following, we introduce the impresso project, present the impresso resource collection, account for major existing historical language resources, and conclude.

mining years of historical newspapers: the impresso project

‘impresso - media monitoring of the past’ is an interdisciplinary research project in which a team of computational linguists, designers and historians collaborate on the semantic indexing of a multilingual corpus of digitized historical newspapers. the primary goals of the project are to apply text mining techniques to transform noisy and unstructured textual content into semantically indexed, structured, and linked data; to develop innovative visualization interfaces to enable the seamless exploration of complex and vast amounts of historical data; to identify needs on the side of historians which may also translate into new text mining applications and new ways to study history; and to reflect on the usage of digital tools in historical sciences from a practical, methodological, and epistemological point of view.
the project is funded by the swiss national science foundation for a period of three years ( - ) and involves three main applicants: dhlab from the ecole polytechnique fédérale de lausanne (epfl), icl from the university of zurich, and c dh from the university of luxembourg. its interface is accessible at https://impresso-project.ch/app/.

in pursuing these goals, impresso addresses the challenges posed by large-scale collections of digitized newspapers, namely:

- newspaper silos: due to legal restrictions and digitisation policy constraints, data providers (libraries, archives and publishers) are bound to provide incomplete, non-representative collections which have been subjected to digitization and ocr processing of varying quality;
- big, messy data: newspaper digital collections are characterised by incompleteness, duplicates, and abundant inconsistencies;
- noisy, historical text: imperfect ocr, faulty article segmentation and lack of appropriate linguistic resources greatly affect the robustness of image and text mining algorithms;
- large and heterogeneous corpora: processing and exploitation require a solid system architecture and infrastructure, and interface design should favor efficient search and discovery of relevant content; and
- transparency: critical assessment of the inherent biases in exploratory tools, digitized sources and the annotations extracted from them is paramount for an informed usage of data in a digital scholarship context.

with respect to source processing, impresso applies and improves a series of state-of-the-art natural language and image processing components which ultimately produce a large-scale, multilingual, semantically indexed historical newspaper collection. the various lexical and semantic annotations generated in this way are combined and delivered to digital scholars via a co-designed, innovative and powerful graphical user interface. furthermore, and this is the focus of the present paper, those sources and annotations are also published apart from the interface for further usage by cultural heritage partners and by the dh and/or nlp communities. finally, some of the text and image mining components are subject to systematic evaluation, for which ground truth data are produced.

all publicly released impresso resources, i.e. corpora, benchmarks and annotations, are published on the project’s website (https://impresso-project.ch/project/datasets) and on the impresso zenodo community (https://zenodo.org/communities/impresso) with detailed documentation. table summarizes the links and dois of the datasets.

impresso corpora

the first resource is a set of normalized, ‘ready-to-process’ newspaper textual corpora which, for copyright reasons, do not correspond to the full impresso newspaper collection accessible through the interface.

original sources

impresso gathers a consortium of swiss and luxembourgish research and cultural heritage institutions and focuses primarily on sources of these countries in french, german, and luxembourgish. provided by its partners, impresso original sources correspond as of november to newspapers. concretely speaking, sources consist of either both ocr output and images, or only ocr. images are either served online via the iiif image api of the impresso infrastructure, or accessed directly via the data provider’s iiif endpoint (iiif is defined by the international image interoperability framework, an interoperable technology and community framework for image delivery: https://iiif.io). text and layout acquisition outputs (i.e. ocr and olr) come, for their part, in a variety of mets/alto format flavors, sometimes complemented by proprietary formats of private service providers. overall, the current collection amounts to ca. tb, text and image combined. more newspaper titles in french and english will be acquired and ingested during the last year of the project.

legal framework

original sources are subject to copyright law and impresso has received permission from its partners to use them, provided that legal terms of use are respected upon online access and/or download.
these partners are, namely: the swiss national library, the national library of luxembourg, the media center and state archives of valais, the swiss economic archives, the journal le temps (ringier group), the journal neue zürcher zeitung, and other local and international data providers. more specifically, digital documents are subject to two different right statements:

- public domain, or unrestricted: documents are no longer in copyright and may be used without restriction for all purposes, including commercial;
- academic use, or restricted: documents are still under copyright and their use is restricted to personal and/or academic purposes, with or without the possibility to download the text.

the present impresso corpus release includes unrestricted documents and a part of the restricted ones (for personal and academic usage). depending on negotiations with data providers and on the inclusion of new collections, the situation is very likely to evolve, and the release of impresso original sources will be complemented accordingly.

source processing

the original files provided by our partners encode the structure and the text of digital objects according to the mets/alto xml library standards. mets (metadata encoding and transmission standard, http://www.loc.gov/standards/mets) encodes various metadata as well as information on the physical and logical structure of the object, while alto (analyzed layout and text object, https://www.loc.gov/standards/alto) represents the ocr-recognized text, i.e. describes text and layout information (coordinates of columns, lines and words on a page). while very precise and complete, these xml files contain more information than necessary in a text mining context, and are cumbersome to process.
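to make the alto description above concrete, here is a minimal python sketch that reads word-level text and coordinates from an alto-like fragment. the sample xml is hand-written for illustration and is not taken from the impresso sources; the element and attribute names (`TextLine`, `String`, `CONTENT`, `HPOS`, `VPOS`, `WIDTH`, `HEIGHT`) follow the alto standard, while the helper names `local_name` and `extract_lines` are hypothetical. this is not the impresso converter.

```python
import xml.etree.ElementTree as ET

# hand-written alto-like fragment (illustrative content only, not impresso data)
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="p1" WIDTH="1000" HEIGHT="1500">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine>
            <String CONTENT="gazette" HPOS="100" VPOS="50" WIDTH="180" HEIGHT="40"/>
            <SP/>
            <String CONTENT="de" HPOS="290" VPOS="50" WIDTH="40" HEIGHT="40"/>
          </TextLine>
          <TextLine>
            <String CONTENT="lausanne" HPOS="100" VPOS="100" WIDTH="200" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

BOX_KEYS = ("HPOS", "VPOS", "WIDTH", "HEIGHT")


def local_name(tag):
    """strip the xml namespace so the code tolerates different alto versions."""
    return tag.rsplit("}", 1)[-1]


def extract_lines(alto_xml):
    """return one entry per text line: the joined text and the word boxes."""
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        words = []
        for child in elem:
            # only <String> elements carry recognized words; <SP/> is a space marker
            if local_name(child.tag) == "String":
                box = tuple(int(float(child.get(key, "0"))) for key in BOX_KEYS)
                words.append((child.get("CONTENT", ""), box))
        lines.append({"text": " ".join(word for word, _ in words), "words": words})
    return lines


lines = extract_lines(ALTO_SAMPLE)
for line in lines:
    print(line["text"], line["words"])
```

even this toy fragment needs namespace handling and attribute parsing before any text can be used, which hints at why a lighter common format is attractive for text mining.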
moreover, the mets and alto schemas are flexible and libraries usually adapt them according to their text acquisition capacities, resulting in a variety of input variants. combined with the existence of different file hierarchies, source identifiers and image mappings, as well as other proprietary ocr/olr formats, these inputs require a great deal of processing before they can finally be parsed. to this end, each library input is converted into ‘canonical’ files where information is encoded according to impresso json schemas, from which ‘ready-to-process’ files can easily be derived. defined iteratively and shared with other newspaper projects, these json schemas act as a central, common format which:

(a) allows the seamless processing of various data sources;
(b) preserves only the information necessary for nlp processing and interface rendering; and
(c) drastically reduces file sizes, thereby allowing easier processing in distributed environments.

schemas and converters are published and documented online (https://github.com/impresso/impresso-schemas) and are not described further here. an important point to mention, though, is that we mint and assign unique, canonical identifiers to newspaper issues, pages and content items (i.e. newspaper contents below the page level such as articles, advertisements, images, tables, weather forecasts, obituaries, etc.).

release

the impresso corpora are released in two versions, both distributed as compressed archives (bzip ) of data in newline-delimited json format:

- the ‘canonical’ version, with a fine-grained logical and physical representation of newspaper contents, including image coordinates; and
- the ‘ready-to-process’ version, which offers ‘reconstructed’ content item full texts, that is to say continuous strings not divided by ocr token units.
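as an illustration of the release format just described, the sketch below writes and reads a bzip2-compressed, newline-delimited json file and rebuilds a continuous string from token units. the record layout (`id`, `tokens`, `tx`) and the sample records are assumptions made for this example; the actual field names are defined in the impresso json schemas and may differ. joining tokens with a single space is also a simplification of what a real reconstruction would need to do (e.g. for hyphenated words).

```python
import bz2
import json
import os
import tempfile

# hypothetical 'canonical'-style records: field names are assumptions for illustration
records = [
    {"id": "item-1", "tokens": [{"tx": "le"}, {"tx": "temps"}]},
    {"id": "item-2", "tokens": [{"tx": "neue"}, {"tx": "zürcher"}, {"tx": "zeitung"}]},
]


def write_archive(path, rows):
    """write one json object per line into a bzip2-compressed text file."""
    with bz2.open(path, "wt", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")


def iter_records(path):
    """stream records one line at a time, so the archive never has to fit in memory."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def reconstruct(record):
    """naive 'ready-to-process' view: a single string instead of token units."""
    text = " ".join(token["tx"] for token in record["tokens"])
    return {"id": record["id"], "text": text}


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "sample.jsonl.bz2")
    write_archive(archive, records)
    rebuilt = [reconstruct(record) for record in iter_records(archive)]

print(rebuilt)
```

streaming line by line is the main practical benefit of newline-delimited json here: each content item can be parsed independently, which suits distributed processing.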
this reconstruction significantly reduces the overhead when parsing the entire dataset, which amounts to gb compressed (restricted and unrestricted).

the impresso corpus currently contains newspapers: from switzerland and from luxembourg. as mentioned previously, contents are subject to different license regimes, depending on the permissions given by cultural heritage institutions and rights holders. in table we provide some basic statistics about our corpora, divided by license type. the release will contain all contents in the public domain (unrestricted), as well as those available for academic use and for which the text can be downloaded (restricted with download, negotiations ongoing). the released corpora amount to almost billion tokens of textual contents, covering a time span of more than years (see fig. ), and contain roughly million images.

table : global statistics on the impresso corpora. columns: unrestricted, restricted (with download), restricted (w/o download), total. rows: # issues, # pages, # tokens, # content items, # images.

metadata

contextual information about digital collections is essential and we attempt to provide as much information as possible, even though this is neither the core expertise nor part of the main objectives of the project. impresso newspaper metadata comprises descriptive (e.g. title, dates, place of publication), structural (issue, page, content items), and administrative metadata (file timestamps, file creator, preservation metadata).
these metadata were provided by cultural institutions and, most of the time, completed by the impresso team (with either technical or descriptive metadata). since this metadata set is not intended to replace professional library information but is rather meant for statistical ‘data science’ purposes, each record contains links to authority information such as the original bibliographic notice and the library portal. impresso newspaper metadata is encoded in json format, covers all newspapers and is published under a cc-by . license (https://creativecommons.org/licenses/by/ . /).

impresso benchmarks

in order to support the training and evaluation of some processing components, several benchmarks were produced. they include material from both restricted and unrestricted collections, for which rights clearance has been obtained. all are released under open licenses.

article segmentation ground truth

exploration and automatic processing of digitized newspaper sources is greatly hindered by the sometimes low quality of legacy ocr and olr (when present) processes: content items are incorrectly transcribed and incorrectly segmented. in an effort to address these shortcomings, impresso developed an approach for content item recognition and classification exploiting both textual and visual features (barman et al., ). the objectives were, on the one hand, to filter out noisy or unwanted material before the application of subsequent nlp processes (e.g. removing all weather tables and title banners before running topic modeling or text reuse detection) and, on the other hand, to allow faceted search on content item types (e.g. search “xyz” in items of type ‘editorial’). to this end, a set of newspaper images was manually annotated and several experiments were conducted (barman, ).
figure : distribution of tokens over years (whitespace tokenization was applied).

although newspaper content items can be of many types, we chose to focus on four classes that were deemed suitable for developing a first prototype, as well as meaningful within the impresso context (overall, there is little to no agreement among historians and/or librarians about a ‘base’ taxonomy of newspaper content items):

- feuilleton, i.e. an excerpt of a bigger work published over time in several issues of a newspaper, corresponding to the french roman-feuilleton or the english serial;
- weather forecast, i.e. a text or image with the prediction of weather, or even a report of past weather measurements;
- obituary, i.e. a small notice published by relatives of a deceased person;
- stock exchange table, i.e. a table reporting the values of different national stocks.

three newspapers from the french-speaking part of switzerland (the gazette de lausanne, the impartial and the journal de genève) covering a period of ca. years ( - ) were considered for the annotation. to obtain a diachronic ground truth, three issues were sampled every three or five years for the whole duration of each newspaper. the sampled images were annotated using the vgg image annotator v. . . (via), a simple web interface for annotating images with annotation export in json format (dutta and zisserman, ). concretely speaking, each annotated image is associated with the list of its regions (i.e. coordinates) and their corresponding labels. page scans were annotated (among which with at least one annotation), amounting to annotated regions. work is ongoing and once models have reached a satisfying level of precision, they will be applied to the whole collection to filter out elements before text processing and enable faceted search over content item types.
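the region-and-label annotations described above can be sketched as follows: a small python example that turns a via-style json export into a coco-style structure. the sample export, the `label` attribute key and the function name `via_to_coco` are assumptions for illustration; the sketch assumes rectangular regions stored as a list per image, and it omits the image width and height that a complete coco file would carry. it is not the conversion used for the published data set.

```python
# hypothetical via-style export: one entry per image, each with rectangular regions
via_export = {
    "page_001.jpg": {
        "filename": "page_001.jpg",
        "regions": [
            {
                "shape_attributes": {"name": "rect", "x": 10, "y": 20, "width": 200, "height": 50},
                "region_attributes": {"label": "obituary"},
            },
            {
                "shape_attributes": {"name": "rect", "x": 10, "y": 300, "width": 400, "height": 120},
                "region_attributes": {"label": "weather forecast"},
            },
        ],
    }
}

# the four classes named in the text
CLASSES = ["feuilleton", "weather forecast", "obituary", "stock exchange table"]


def via_to_coco(export, classes):
    """build a coco-style dict (images, annotations, categories) from via regions."""
    category_ids = {name: index + 1 for index, name in enumerate(classes)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": cid, "name": name} for name, cid in category_ids.items()],
    }
    for image_id, entry in enumerate(export.values(), start=1):
        coco["images"].append({"id": image_id, "file_name": entry["filename"]})
        for region in entry["regions"]:
            shape = region["shape_attributes"]
            if shape.get("name") != "rect":
                continue  # this sketch only handles rectangles
            label = region["region_attributes"]["label"]
            width, height = shape["width"], shape["height"]
            coco["annotations"].append({
                "id": len(coco["annotations"]) + 1,
                "image_id": image_id,
                "category_id": category_ids[label],
                "bbox": [shape["x"], shape["y"], width, height],  # coco: [x, y, w, h]
                "area": width * height,
                "iscrowd": 0,
            })
    return coco


coco = via_to_coco(via_export, CLASSES)
print(len(coco["annotations"]), coco["annotations"][0]["bbox"])
```

keeping categories in a separate list, as coco does, makes it easy to add further content item types later without touching existing annotations.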
this article segmentation data set (annotations and images) is published under a cc-by-sa . license, using via as well as the standard object annotation format coco (lin et al., ; http://cocodataset.org/#format-data).

black letter ocr ground truth

we created a publicly available ground truth (i.e., a manually corrected version of the text) for black letter newspaper print for the assessment of the ocr quality of the german-language neue zürcher zeitung (nzz) (ströbel, phillip benjamin and clematide, simon, ). we sampled one front page per year for the long period during which the nzz was published in black letter ( - ), resulting in a diachronic ground truth of pages. we used the transkribus tool (https://transkribus.eu/transkribus) to complete the annotations. we published the ground truth as tiff images and corresponding xml files (https://github.com/impresso/nzz-black-letter-ground-truth). first experiments on improving the ocr for this kind of data showed that elaborate deep learning models (weidemann et al., ) reach character accuracies of . % and that they are transferable to other newspaper data and to better images than those present in the ground truth (ströbel and clematide, ).

named entity processing ground truth

after image segmentation and transcription, the last impresso benchmark relates to an information extraction task, named entity (ne) processing. ne processing tools are increasingly being used in the context of historical documents, and research activities in this domain target texts of different nature (e.g. museum records, state-related documents, genealogical data, historical newspapers) and different tasks (ne recognition and classification, entity linking, or both). experiments involve different time periods, focus on different domains, and use different typologies. this great diversity demonstrates how many and varied the needs (and the challenges) are, but also makes performance comparison difficult, if not impossible.
in this context, the impresso project organises a clef evaluation lab, named 'hipe' (identifying historical people, places and other entities) (ehrmann et al., ; see clef : https://clef .clef-initiative.eu and hipe: https://impresso.github.io/clef-hipe- ). the hipe shared task puts forward two ne processing tasks, namely: (1) the named entity recognition and classification (nerc) task, with two sub-tasks of increasing level of difficulty with high-level vs. finer-grained entity types, and (2) the named entity linking task. the hipe corpus is composed of content items from the impresso swiss and luxembourgish newspapers, as well as from american newspapers (from the swiss national library, the luxembourgish national library, and the library of congress, respectively), on a diachronic basis. for each language, articles of four different newspapers were sampled on a decade time-bucket basis, according to the time span of the newspaper (the longest duration spans ca. years). more precisely, articles were first randomly sampled from each year of the considered decades, with the constraints of having a title and more than characters. subsequently, a manual triage was applied in order to keep journalistic content only and to remove undesirable items such as feuilleton, crosswords, weather tables, time schedules, obituaries, and items that a human could not even read because of ocr noise.

this material was manually annotated according to the hipe annotation guidelines, derived from the quaero annotation guide. originally designed for the annotation of 'extended' named entities (i.e. more than the or traditional entity classes) in french speech transcriptions, the quaero guidelines (see the original guidelines: http://www.quaero.org/media/files/bibliographie/quaero-guide-annotation- .pdf) have furthermore been used on historic press corpora (rosset et al., ). hipe slightly recasts and simplifies them, considering only a subset of entity types and components, as well as of linguistic units eligible as named entities.

the annotation campaign was carried out by the task organizers with the support of trilingual collaborators. we used inception as an annotation tool (klie et al., ), with the visualisation of image segments alongside ocr transcriptions. for each language, a sub-sample of the corpus was annotated by two annotators and inter-annotator agreement was computed, before and after adjudication. as of march , top-level entity mentions were annotated and linked to wikidata.

for each task and language the corpus is divided into training, dev and test data sets, with the sole exception of english, for which only dev and test sets are produced. these manually annotated materials are released in iob format with hierarchical information.

even though many evaluation campaigns on ne were organized over the last decades, only one considered french historical texts (galibert et al., ) and, to the best of our knowledge, this is the first multilingual, diachronic named entity-annotated historical corpus.
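as an illustration of the iob format mentioned above, the sketch below groups token-level tags into entity mentions. the sample tokens, the tag names (pers, loc) and the function name are hypothetical; the hierarchical information carried by the released files is not modelled here.

```python
def iob_to_mentions(tokens, tags):
    """Group IOB-tagged tokens into (entity_type, text) mentions."""
    mentions = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new mention starts: close the previous one, if any.
            if current_tokens:
                mentions.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the mention that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag (or an inconsistent I- tag): close any open mention.
            if current_tokens:
                mentions.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        mentions.append((current_type, " ".join(current_tokens)))
    return mentions


# Hypothetical example sentence with made-up tag names.
tokens = ["le", "maire", "de", "lausanne", "a", "rencontré", "jean", "dupont", "."]
tags = ["O", "O", "O", "B-loc", "O", "O", "B-pers", "I-pers", "O"]
mentions = iob_to_mentions(tokens, tags)
```

a nested or fine-grained annotation layer could be handled by running the same function over an additional tag column.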
(the hipe guidelines are available at: https://doi.org/ . /zenodo. ; previous ne evaluation campaigns include e.g. muc, irex, ace, conll, kbp, ester, harem, quaero, germeval, etc.)

impresso lexical and semantic annotations

finally, a wealth of annotations as well as language models are automatically computed over the whole impresso collection. they include: at lexical level, linguistic preprocessing (lemmatisation and historical spelling normalization), word embeddings, ocr quality assessment and n-grams; at referential level, ne mentions and linked entities; at conceptual level, topics, topic models, and topic-annotated content items; at collection level, text reuse clusters and passages; and, finally, visual signatures of photographs and pictures contained in newspapers. these enrichments of our content items are represented as stand-off annotations and are released under a cc-by or cc-by-sa . license. however, not all annotation data sets are fully ready at the moment; the following sections present those which are part of the current release.

ocr quality assessment

in order to automatically assess the loss of information due to ocr noise, we compute a simple ocr quality measure inspired by the spell-checker approach of alex and burns ( ). in our case, it basically corresponds to the proportion of words of a historical newspaper article that can be found in the wikipedia corpus of the corresponding language. given the multilingual nature of our texts and the large number of names in newspapers, this offers a practical approach, especially for german where normal nouns and proper nouns are capitalized. before actually comparing the words, we normalise diacritical marks the same way as our text retrieval system solr does before indexing the content. therefore, for instance, we consider the frequently occurring ocr errors bäle or bàle as equivalent to the correct spelling of the town bâle, because they are all normalized to the same string bale.
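a minimal sketch of such a measure, under stated assumptions: the tiny vocabulary below stands in for a wikipedia-derived word list, tokenisation is plain whitespace splitting, and diacritics are removed with unicode decomposition, which only approximates the solr normalisation referred to above. the function names are hypothetical.

```python
import unicodedata


def strip_diacritics(word):
    """Lowercase a word and drop combining marks (bâle, bäle, bàle -> bale)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ocr_quality(text, vocabulary):
    """Proportion of whitespace-separated tokens found in the normalised vocabulary."""
    normalised_vocab = {strip_diacritics(w) for w in vocabulary}
    tokens = [strip_diacritics(t.strip(".,;:!?\"'()")) for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in normalised_vocab)
    return known / len(tokens)


# Stand-in for a Wikipedia-derived word list (assumption for illustration).
vocab = {"la", "ville", "de", "bâle"}

score_clean = ocr_quality("la ville de bâle", vocab)
score_variant = ocr_quality("la ville de bàle", vocab)  # wrong diacritic, still matched
score_noisy = ocr_quality("la ville de bâte", vocab)    # wrong letter, not matched
```

in this toy setting the diacritic variant keeps the full score while the wrong-letter variant loses one token out of four, so the score only drops for errors that normalisation cannot absorb.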
the reason for this normalisation approach in ocr assessment is that we want to inform our impresso users about the real loss of recall they should expect when actually running standard keyword queries over our text collection (bäle will be found even if the user searches for bàle, but bâte would not return any result, and this is the loss we want to account for).

the ocr quality assessment is a number between and that is distributed along with our data as a stand-off annotation for each content item. impresso interface users will probably quickly grasp the meaning of the numbers by just being exposed to texts and their corresponding ocr quality assessment, and learn to interpret them with respect to the type of article, e.g. stock market prices with many abbreviations that will lower the score. as our approach is unsupervised, we need to formally evaluate it, similarly to alex and burns ( ), by testing whether there is a reasonable correlation between the automatically computed quality and some ground truth character error rate.

word embeddings

as mentioned earlier, the full impresso collection cannot be distributed due to copyright restrictions. having the material at hand, however, allows us to compute lexical resources specific to the historical newspaper genre, such as word embeddings, that can be distributed to the public. specifically, we build classical type-level word embeddings with fasttext. this choice is motivated by fasttext's support for subword modeling (bojanowski et al., ), which is a useful feature in the presence of ocr errors. there has been recent work on top of fasttext for bringing the embeddings of misspelled words even closer to the correct versions via supervised training material (piktus et al., ).
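why subword modeling helps with ocr noise can be illustrated with character n-grams. the sketch below extracts n-grams from a word wrapped in boundary markers, in the spirit of bojanowski et al., and compares a word with a misrecognised variant. the n-gram range, the example words and the function names are illustrative assumptions; this is not fasttext's actual implementation, which learns a vector for each n-gram.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the set of character n-grams of a word wrapped in '<' and '>'."""
    wrapped = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.add(wrapped[i:i + n])
    return grams


def ngram_overlap(word_a, word_b):
    """Jaccard overlap between the n-gram sets of two words."""
    a, b = char_ngrams(word_a), char_ngrams(word_b)
    return len(a & b) / len(a | b)


correct = "zeitung"
ocr_variant = "zeitnng"  # hypothetical OCR confusion of 'u' with 'n'
unrelated = "wetter"

shared = ngram_overlap(correct, ocr_variant)
different = ngram_overlap(correct, unrelated)
```

because the misrecognised variant still shares several n-grams with the correct form, a model that composes word vectors from n-gram vectors can place the two close together, whereas a purely type-level model would treat them as unrelated words.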
well-known drawbacks of type-level word embeddings are that (a) they force their users to adhere to the same tokenisation rules that their producers applied and, more severely, (b) they cannot differentiate the meanings of ambiguous words, or of words that change their meaning in certain constructions. the simple character-based approach proposed by akbik et al. ( ) ("contextualized string embeddings", https://github.com/zalandoresearch/flair) has successfully tackled these two problems and led to excellent results for ner. our own experiments with ner on noisy french historical newspapers additionally demonstrated the resilience of these embeddings, trained on in-domain material, to ocr errors (bircher, ).

within the impresso interface, word embeddings are mainly used for suggesting similar words in the keyword search (including cross-lingual), thereby supporting query expansion by semantic or ocr noise variants. query expansion is also offered for the lexical n-gram viewers.

two types of word embeddings derived from the impresso text material are published (https://doi.org/ . /zenodo. ): character-based contextualized string embeddings and classical type-level word embeddings with subword information (fasttext, https://fasttext.cc).

topic models

the impresso web application supports faceted search with respect to language-specific topics (french, german, luxembourgish). we use the well-known mallet toolkit, which allows the training and inference of topic models with latent dirichlet allocation (blei et al., ). first, linguistic preprocessing is applied to the data. for pos tagging, the spacy library is used because of its robustness in the presence of ocr noise. however, spacy lemmatization is not always very satisfactory and further analyzers and sources are used to complement its results.
for german, we rely mostly on the broad-coverage morphological analyser gertwol, and are currently working on the problem of lemmatization of words with historical spelling and/or ocr errors (see jurish ( ) for earlier work based on finite-state approaches for german). for french, we use the full-form lexicon morphalou (atilf, ) to complete lemma information not provided by spacy. dealing with the low-resourced luxembourgish language is more difficult (although spacy now has pos tagging support for this language), mostly because of the many spelling variants and reforms this language has seen over the last years.

then, under the assumption that topics are more interpretable if they consist of nouns and proper nouns only, we reduce the corpus size by excluding all other parts of speech based on the information obtained from spacy. as an additional benefit, this filtering drastically reduces the number of tokens that topic modeling has to deal with.

next, topics are computed on this reduced, preprocessed material. although the german part of the collection is of reasonable size, the french material is still too big for mallet, so we sample articles containing at least nouns and/or proper nouns. in order to keep the facets for topic search manageable and interpretable, and at the same time account for the diversity of contents found in newspapers, we set the number of topics for german and french to . for the french topics, we directly fit topic distributions for about a third of our overall data. topic inference with the model trained on the sample is used for the remaining articles. topic inference also solves the problem that our collection is continuously growing, and recomputing topic models from scratch each time is not feasible. additionally, historians prefer to have semantically stable topic models for their work. therefore, we also apply topic inference to newly added german texts.
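the filtering and sampling steps described above can be sketched as follows. the input is assumed to be already pos-tagged as (lemma, tag) pairs with universal pos labels; the threshold value, the example documents and the helper names are assumptions for illustration and do not reproduce the actual impresso pipeline.

```python
KEPT_TAGS = {"NOUN", "PROPN"}  # nouns and proper nouns only


def keep_nouns(tagged_doc):
    """Keep the lemmas of tokens tagged as noun or proper noun."""
    return [lemma for lemma, tag in tagged_doc if tag in KEPT_TAGS]


def select_for_training(tagged_docs, min_nouns):
    """Filter every document, then keep those with at least `min_nouns` lemmas left."""
    filtered = {doc_id: keep_nouns(doc) for doc_id, doc in tagged_docs.items()}
    return {
        doc_id: lemmas
        for doc_id, lemmas in filtered.items()
        if len(lemmas) >= min_nouns
    }


# Hypothetical pre-tagged documents as (lemma, universal POS tag) pairs.
tagged_docs = {
    "article-1": [("le", "DET"), ("conseil", "NOUN"), ("fédéral", "ADJ"),
                  ("siéger", "VERB"), ("à", "ADP"), ("berne", "PROPN")],
    "article-2": [("il", "PRON"), ("pleuvoir", "VERB")],
}

# The threshold of 2 is arbitrary and chosen only to fit the toy data.
training_docs = select_for_training(tagged_docs, min_nouns=2)
```

documents that fall below the threshold are not discarded for good in the workflow described above: they can still receive topic assignments through inference with the trained model.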
topic models, as well as topics and content item topic assignments, are released in json format (also documented online at https://github.com/impresso/impresso-schemas). topics are also available within the impresso web interface, where (a) they serve as search facets, i.e., users can restrict their search results to articles containing only certain topics; (b) users can select topics as entry points to explore the topic modeling based soft-clustering of articles over the entire corpus; and (c) they provide the basis for an article recommender system based on topic distribution similarity. future work will focus on the evolution of topics over time and cross-lingual topic modeling. (tools used for topic modeling: mallet, http://mallet.cs.umass.edu; spacy, https://spacy.io/; gertwol, http://www .lingsoft.fi/doc/gertwol; morphalou, http://www.cnrtl.fr/lexiques/morphalou.)

text reuse

text reuse can be defined as the meaningful reiteration of text beyond the simple repetition of common language. it is such a broad concept that it can be understood at different levels and studied in a large variety of contexts. in a publishing or teaching context, plagiarism can be seen as text reuse, should portions of someone else's text be repeated without appropriate attribution. in the context of literary studies, text reuse is often used as a synonym for literary phenomena like allusions, paraphrases and direct quotations.

text reuse is a very common phenomenon in historical newspapers too. nearly identical articles may be repurposed in multiple newspapers as they stem from the very same press release. in newspapers from the period before the advent of press agencies, text reuse instances can be interesting for studying the dynamics of information spreading, especially when newspapers in the same language but from different countries are considered. in more recent newspapers, text reuse is very frequent due to cut-and-paste journalism being an increasingly common practice.
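as a toy illustration of how nearly identical articles could be flagged, the sketch below compares two texts by the overlap of their word n-grams (shingles). this is a deliberately simple stand-in and not the detection method used for the impresso data; the example sentences, the shingle size and the function names are assumptions.

```python
def shingles(text, n=3):
    """Return the set of word n-grams (shingles) of a whitespace-tokenised text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shingle_similarity(text_a, text_b, n=3):
    """Jaccard similarity between the shingle sets of two texts."""
    a, b = shingles(text_a, n), shingles(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical example texts.
release = "the federal council met in bern on monday to discuss the new railway line"
reprint = "the federal council met in bern on monday to discuss the planned railway line"
other = "heavy snow fell over the alps and several mountain passes remain closed"

sim_reprint = shingle_similarity(release, reprint)
sim_other = shingle_similarity(release, other)
```

a single changed word only removes the shingles that contain it, so a lightly edited reprint keeps a high similarity while an unrelated article scores zero.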
we used passim (smith et al., ; https://github.com/dasmiq/passim) to perform the automatic detection of text reuse. passim is an open source software that uses n-grams to effectively search for alignment candidates, the smith-waterman algorithm to perform the local alignment of candidate document pairs, and single-link clustering to group similar passages into text reuse clusters.

as a pre-processing step we used passim to identify boilerplate within our corpus. this step allows us to reduce the input size by approximately %, by removing mostly short passages that are repeated within the same newspaper within a time window of days. we then run passim on the entire corpus after boilerplate passages have been removed: passim outputs all text passages that were identified as belonging to a text reuse cluster. as opposed to boilerplate detection, text reuse detection explicitly targets reuse instances across two or more sources (i.e. newspapers).

we post-process passim's output to add the following information:

• size, i.e. the number of text passages in the cluster;
• lexical overlap, expressed as the proportion of unique tokens shared by all passages in a cluster;
• time delta: the overall time window covered by a given cluster (expressed in number of days);
• time gap: following salmi et al. ( ), we compute the longest gap (expressed in number of days) between the publication of any two passages in a cluster.

this information is added to each text reuse cluster with the goal of easing the retrieval as well as the analysis of detected text reuse. since passim detects several million clusters in the entire impresso corpus, we need to further characterize each cluster if we want to enable historians to find instances of text reuse that are of interest to them. each of these additional dimensions characterizes a certain aspect of reuse: lexical overlap allows for distinguishing almost exact copies of a piece of news from re-phrasings or paraphrases; time delta is an indicator of the longevity of a given piece of news; and, finally, time gap captures the viral nature of news spreading, especially its pace of publication.

we release as a resource (in json format) the boilerplate and text reuse passages as detected by passim, as well as the additional information we compute at cluster level. this data can be used to filter out duplicates from the input corpus, given the detrimental effects that such duplicates have on semantic models (e.g. topics, word embeddings) (schofield et al., ).

text reuse information is currently used in the impresso interface as an additional navigation aid, as it points users to existing reuses of the news article in focus. future upgrades of the interface will include a dedicated text reuse explorer, which will allow users to search over and browse through all text reuse clusters, and to filter them based on several criteria (i.e. size, lexical overlap, time gap, time delta).

table : impresso datasets dois.

dataset | doi
impresso historical newspaper textual material | . /zenodo.
impresso newspaper metadata | . /zenodo.
impresso ocr quality assessment | . /zenodo.
impresso ocr ground truth | . /zenodo.
impresso article segmentation ground truth | . /zenodo.
impresso hipe shared task named entity gold standard | . /zenodo.
impresso word embeddings | . /zenodo.
impresso topic modelling data | . /zenodo.
impresso text reuse data | . /zenodo.

related work

this section briefly summarizes previous efforts with respect to historical language resources. we focus here on historical newspapers and refer the reader to sporleder ( ) and piotrowski ( ) for further information on historical language in general.
digitized newspaper corpora, understood here as consisting of both images and ocr, primarily exist thanks to the considerable efforts of national libraries, either as individual institutions or as part of consortia, e.g. the europeana newspaper project (neudecker and antonacopoulos, ). those institutions are the custodians of these digital assets which, after having long been hidden behind digital library portals, are now increasingly making their way to the public via apis and/or data dumps (e.g. the french national library apis, http://api.bnf.fr, and the national library of luxembourg open data portal, https://data.bnl.lu/data/historical-newspapers). impresso corpora are by no means meant to compete with these repositories, but rather to complement them with derived, working 'secondary' versions of the material in a form that is suitable for nlp needs. to our knowledge, and since corpus preparation is often done by private companies mandated to develop digital portals, no 'ready-to-process' historical newspaper corpus such as the impresso one exists.

several instances of ocr and article segmentation benchmarks exist thanks to, among others, the long-standing series of conferences and shared tasks organized by the document analysis community (in particular the icdar conferences, e.g. http://icdar .org/). impresso annotated data sets are, in this regard, not new but complementary: german black letter ground truth is not common and, given the variety of historical newspaper material, article segmentation over page scans of different sources is beneficial.

with respect to word embeddings, the companion website of hamilton et al. ( ) (https://nlp.stanford.edu/projects/histwords) provides word2vec embeddings for french and german derived from google n-grams. more recently, riedl ( ) released german word embedding data sets derived from historical newspapers.

in the last years, a few gold standards were publicly released for named entities: galibert et al. ( ) shared a french named entity annotated corpus of historical newspapers from the end of the th century and neudecker ( ) published four data sets of pages each for dutch, french, and german (including austrian) as part of the europeana newspapers project. besides, linhares pontes et al. ( b) have recently published a data set for the evaluation of ne linking where various types of ocr noise were introduced. in comparison, the hipe corpus has a broader temporal coverage and additionally covers english.

regarding topic modeling, yang et al. ( a) give an overview of earlier work on historical newspapers.

finally, as far as text reuse is concerned, very few resources and/or benchmarks have been published to date. franzini et al. ( ) have published a ground truth dataset to benchmark the detection of a specific type of text reuse (i.e. literary quotations). the viral texts project has published an online interface, the viral texts explorer (https://viraltexts.northeastern.edu/clusters), which makes text reuse clusters extracted from th century newspapers searchable and explorable. a similar online interface was also provided by salmi et al. ( ) for million text reuse clusters extracted from the finnish press ( – ).

conclusion and perspectives

we have presented a series of historical newspaper datasets (the impresso resource collection) composed of corpora, benchmarks, semantic annotations and language models in french, german, luxembourgish and english covering ca. years. produced in the context of a collaborative, interdisciplinary project which aims at enabling critical text mining of years of newspapers, this collection includes different types of resources that could support the needs of several communities. the textual corpora we release are large-scale, diachronic, multilingual and with real-world ocr quality. their availability will foster further research on nlp methods applied to historical texts (e.g. ocr post-correction, semantic drift, named entity processing). similarly, our benchmarks will fill an important gap in the adaptation of existing approaches via e.g. transfer learning, as well as enable performance assessment and comparisons. language models will naturally find their use in many applications, while lexical and semantic annotations will support historical corpus exploration and be suitable for use at public participatory events such as hackathons. as future work we intend to integrate more textual material (french and english notably), to release additional annotations (image visual signatures, historical n-grams and named entities) and to serialize our data in more formats in addition to json.

acknowledgements

we warmly thank the impresso team as well as student collaborators camille watter, stefan bircher, julien nguyen dang for their annotation work. the authors also gratefully acknowledge the financial support of the swiss national science foundation (snsf) for the project impresso – media monitoring of the past under grant number cr- sii . .

bibliographical references

akbik, a., blythe, d., and vollgraf, r. ( ). contextual string embeddings for sequence labeling. in proceedings of the th international conference on computational linguistics, pages – , santa fe, new mexico, usa, august. association for computational linguistics.
alex, b. and burns, j. ( ). estimating and rating the quality of optically character recognised text. pages – . acm press.
atilf. ( ). morphalou. ortolang (open resources and tools for language) – www.ortolang.fr.
barman, r., ehrmann, m., clematide, s., oliveira, s. a., and kaplan, f. ( ). combining visual and textual features for semantic segmentation of historical newspapers (submitted). journal of data mining and digital humanities. https://arxiv.org/abs/ . .
barman, r. ( ). historical newspaper semantic segmentation using visual and textual features. master thesis, epfl.
bircher, s. ( ). toulouse and cahors refer to locations, but t<<i*louse and caa.qrs as well. a neural approach for detecting named entities in digitized historical newspapers. master thesis, zurich university.
blei, d. m., ng, a. y., and jordan, m. i. ( ). latent dirichlet allocation. journal of machine learning research, (jan): – .
bojanowski, p., grave, e., joulin, a., and mikolov, t. ( ). enriching word vectors with subword information. corr, abs/ . .
bollmann, m. ( ). a large-scale comparison of historical text normalization systems. in proceedings of the conference of the north american chapter of the association for computational linguistics: human language technologies, volume (long and short papers), pages – . association for computational linguistics.
chiron, g., doucet, a., coustaty, m., visani, m., and moreux, j.-p. ( ). impact of ocr errors on the use of digital libraries: towards a better access to information. in proceedings of the th acm/ieee joint conference on digital libraries, jcdl ' , pages – , piscataway, nj, usa. ieee press.
clausner, c., antonacopoulos, a., pletschacher, s., wilms, l., and claeyssens, s. ( ). prima, dmas , competition on digitised magazine article segmentation (icdar ).
dinarelli, m. and rosset, s. ( ). tree-structured named entity recognition on ocr data: analysis, processing and results. in lrec, pages – .
dutta, a. and zisserman, a. ( ). the via annotation software for images, audio and video.
in proceedings of the th acm international conference on multimedia, mm ’ , new york, ny, usa. acm. ehrmann, m., colavizza, g., rochat, y., and kaplan, f. ( ). diachronic evaluation of ner systems on old newspapers. in proceedings of the th conference on natural language processing (konvens )), pages – . bochumer linguistische arbeitsberichte. ehrmann, m., romanello, m., clematide, s., and bircher, s. ( ). (submitted) introducing the clef hipe shared task: named entity recognition and linking on historical newspapers. in european conference on in- formation retrieval, lisbon, portugal, april. franzini, g., moritz, m., marco, b., passarotti, m., and cuore, s. ( ). using and evaluating tracer for an index fontium computatus of the summa contra gen- tiles of thomas aquinas. in elena cabrio, et al., editors, proceedings of the fifth italian conference on compu- tational linguistics (clic-it ). galibert, o., rosset, s., grouin, c., zweigenbaum, p., and https://viraltexts.northeastern.edu/clusters https://arxiv.org/abs/ . quintard, l. ( ). extended named entities annota- tion on ocred documents: from corpus constitution to evaluation campaign. in nicoletta calzolari (conference chair), et al., editors, proceedings of the eight interna- tional conference on language resources and evalua- tion (lrec’ ), istanbul, turkey, may. european lan- guage resources association (elra). hamilton, l. w., leskovec, j., and jurafsky, d. ( ). diachronic word embeddings reveal statistical laws of semantic change. in proceedings of the th annual meeting of the association for computational linguis- tics (volume : long papers), pages – . asso- ciation for computational linguistics. jurish, b. ( ). finite-state canonicalization techniques for historical german. doctoral thesis, universität pots- dam. https://nbn-resolving.org/urn:nbn:de:kobv: - opus- . kaplan, f. and di lenardo, i. ( ). big data of the past. frontiers in digital humanities, . kestemont, m., karsdorp, f., and düring, m. ( ). 
finding characteristic features in stylometric analysis

carmen klaussner∗†, john nerbonne‡, and çağri çöltekin§
trinity college dublin, ireland
university of groningen, the netherlands
university of freiburg, germany
∗klaussnc@tcd.ie †corresponding author ‡j.nerbonne@rug.nl §c.coltekin@rug.nl

abstract

the usual focus in authorship studies is on authorship attribution, i.e. determining which author (of a given set) wrote a piece of unknown provenance. the usual setting involves a small number of candidate authors, which means that the focus quickly revolves around a search for features that discriminate among the candidates. whether the features that serve to discriminate among the authors are characteristic is then not of primary importance. we respectfully suggest an alternative in this paper, namely a focus on seeking features that are characteristic for an author with respect to others. to determine an author's characteristic features, we first seek elements that he or she uses consistently, which we therefore regard as representative, but we likewise seek elements which the author uses distinctively in comparison to an opposing author.
we test the idea on a recently proposed task that compares charles dickens to both wilkie collins and a larger reference set comprising several authors' works from the th and th century. we then compare the use of representative and distinctive features to burrows' delta and hoover's cov tuning; we find that our method bears little similarity to either method in terms of characteristic feature selection. we show that our method achieves reliable and consistent results in the two-author comparison and fair results in the multi-author one, measured by separation ability in clustering.

introduction

this paper suggests a novel, complementary focus in stylometry, i.e. trying to identify characteristic features of authors rather than focusing on discriminating among authors, which is the common task in authorship attribution. the latter has served to focus scholars on a task with clear success criteria, certainly an achievement, but we suspect that its focus on finding discriminating features leads to an overemphasis on unusual features rather than characterizations of what is general and consistent about an author's style. we thus ask with others 'if you can tell authors apart, have you learned anything about them?' (craig, ). concretely, we try to identify words that dickens uses with a consistent frequency throughout a selection of his writings and which are used differently by other authors. we think that the approach might be used to analyze syntactic features, too, but we will not try to show that.

the field of stylometry in authorship studies has undergone considerable change in the course of the th century, whose beginning marked the tentative introduction of new measures to the field, heralding the rise of non-traditional, quantitative techniques to be established alongside the then predominant traditional methods (e.g. manuscript provenance or dating of materials).
in the interest of space we shall not summarize that history here, referring instead to excellent recent surveys (stamatatos, ; oakes, ). since burrows' work is a touchstone for many, we discuss it here specifically and compare our proposal to his work in more detail below.

burrows' delta (burrows, ) was designed for authorship attribution, seeking the most likely authorial candidate for a given document from a set of authors based on differences between z-scores of high-frequency items. delta is usually applied to the – most frequent words, i.e. the highest frequency stratum. this is an advantage since high-frequency words are likely to be encountered in most documents. note, however, that highly variable features could be useful for the task of identifying an author if they happened to occur almost exclusively in just one author's works; we would not regard them as characteristic, since they are not used consistently. burrows' iota and zeta (burrows, ; burrows, ; hoover, ) investigate words in middle-range and low-range frequency strata, and they look for words appearing consistently in one author's works and less frequently or not at all (iota) in the works of others. more recently, hoover introduced cov tuning, which uses the coefficient of variation to detect those frequent features that are most variable over a multi-author corpus (hoover, ).

we introduce a new technique, representativeness and distinctiveness, focusing on finding style markers that are used consistently in the works of one author and differently from those of others. concretely, we take up the task of detecting charles dickens' style presented by tabata ( ), who used random forest classification. we compare our results to tabata's in section . .

the remainder of this paper is organized as follows: we begin by introducing and further motivating representativeness and distinctiveness in section in the context of style analysis.
section gives an overview of the data; section continues by first exemplifying our technique's application to an actual task and subsequently comparing it to other methods in the field. we close the discussion in section .

footnote: it has been suggested that work in author profiling might be relevant to the task of finding typical features, and this is indeed similar, but the focus of profiling is rather on distinguishing groups of authors, e.g. by age or sex. see rangel et al. ( ) and references there.

finding characteristic features

rather than focusing exclusively on identifying stylistic features that discriminate among authors, we first seek features that an author uses consistently in his work, calling these features representative, and turn to distinctive features in a second step. in dialectology, where these methods were first used, we note, e.g. that the word used for the storage space in a car is fairly consistently called a 'boot' throughout the uk and similarly that the words 'cot' and 'caught' rhyme on the eastern seaboard of the us. this makes them representative. we do not have atomistic data of this detail in stylometry, where there is a long and serious tradition of looking first to word frequencies as style markers. we therefore focus on word frequencies here, but we might also have examined the frequencies of word bigrams or sequences of part-of-speech tags.

in order to identify what is consistent in an author's style, we consider not only the very highest strata of frequent words (i.e. – ), but rather a larger set (i.e. – ). the aim of this is to find features with a very even distribution over an author's works, both those used very frequently and those used less frequently. naturally, very infrequent features will suffer the instability problems associated with sparse data, so we do not imagine using them effectively.
distinctive features are always identified with respect to a set of comparable authors, and they are simply the features used differently by the candidate under examination and the comparable set.

we turn now to a more formal introduction of representativeness and distinctiveness and further explanation of how they can be used in stylometry. more specific applications of the method are presented in section , where we test the method in two different settings.

representativeness and distinctiveness

representativeness and distinctiveness were introduced in dialectology (wieling and nerbonne, ), with the goal of detecting linguistic features that 'marked' the speakers of a particular dialect in contrast to others. in the original paper the method is used to detect characteristic features (e.g. lexical items) that differ little within the target group of geographical sites (and may therefore be regarded as 'representative') and differ considerably more outside that group (so that they are also 'distinctive' with respect to the other group). it was later extended to function with numerical measures (prokić, çöltekin, and nerbonne, ), and since we will analyze frequency, we will focus on that extension.

in authorship analysis, we examine the words extracted from an author's documents compared to documents by another group of authors (the reference set). more exactly, we examine the frequency distribution of the author's vocabulary as it is used across the range of documents (or text segments). the technique begins by identifying which feature frequencies are consistent over the target author's document set. afterwards, it selects those consistent and thus representative features of that author that are also distinctive with respect to those documents in the (contrasting) reference set.

we assume a set of documents from an author under investigation, $D^{in}$, as well as a set of contrasting documents, $D^{ex}$, which we need if we are to identify distinctive features.
we may also refer to $D$, $D = D^{in} \cup D^{ex}$, the union of the two sets. we assume moreover a distance function $\mathit{diff}_f$, which for a given feature $f$ returns the distance between a pair of documents with respect to $f$. the formal definition of representativeness of a particular feature $f$ for a document set $D^{in}$ (belonging to the target author) is then based on the mean distance of the documents in $D^{in}$ with respect to $f$:

$$\overline{d}^{D^{in}}_f = \frac{2}{|D^{in}|^2 - |D^{in}|} \sum_{d,d' \in D^{in},\, d \neq d'} \mathit{diff}_f(d,d')$$

where the fraction before the summation is based on the number of non-identical pairs in the set $D^{in}$. naturally we also need to know the average distance between pairs of documents, where the first comes from $D^{in}$ and the second from $D^{ex}$. these allow us to compare the target author to others:

$$\overline{d}^{D}_f = \frac{1}{|D^{in} \times D^{ex}|} \sum_{d \in D^{in},\, d' \in D^{ex}} \mathit{diff}_f(d,d')$$

where we assume, as noted above, that $D = D^{in} \cup D^{ex}$. we implicitly appeal to the assumed definition in order to suppress the reference to two document sets on the left-hand side of the definition. we deliberately collect feature frequencies not only when they are greater than those in the reference set, but also when they are less.

in order to determine features both representative of a particular author as well as distinctive with respect to other authors, we normalize the two average values defined above:

$$\mathit{Repr}_f(D^{in}) = -\,\frac{\overline{d}^{D^{in}}_f - \overline{d}_f}{sd(d_f)}$$

$$\mathit{Dist}_f(D) = \frac{\overline{d}^{D}_f - \overline{d}_f}{sd(d_f)}$$

where $\overline{d}_f$ is the mean difference between all documents within the document set $D$, $D = D^{in} \cup D^{ex}$, with respect to the feature $f$, where $sd(d_f)$ is the standard deviation of differences between all documents in the document set with respect to $f$, and where we again implicitly assume that $D = D^{in} \cup D^{ex}$. note that $\mathit{Repr}$ is defined as the negative of the normalized $\overline{d}^{D^{in}}_f$, since smaller internal differences mean more consistent features.
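the definitions above can be sketched in a few lines of python for a single feature. this is only an illustrative sketch, not the authors' r implementation: the function name `repr_dist` is hypothetical, the distance `diff_f` is assumed to be the absolute difference between two relative frequencies, and the population standard deviation is assumed for `sd`; the text does not fix any of these choices.

```python
from itertools import combinations, product
from statistics import mean, pstdev


def repr_dist(freq_in, freq_ex):
    """Return (Repr_f, Dist_f) for one feature f.

    freq_in / freq_ex hold the frequency of f in each document of
    D_in / D_ex. Assumption: diff_f is the absolute difference.
    """
    def diff(a, b):
        return abs(a - b)

    # mean distance over all non-identical pairs inside D_in
    within = [diff(a, b) for a, b in combinations(freq_in, 2)]
    # mean distance over all pairs (d, d') with d in D_in and d' in D_ex
    between = [diff(a, b) for a, b in product(freq_in, freq_ex)]
    # mean and standard deviation over all pairs in D = D_in u D_ex
    everything = list(freq_in) + list(freq_ex)
    overall = [diff(a, b) for a, b in combinations(everything, 2)]

    d_bar = mean(overall)
    sd = pstdev(overall)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("feature does not vary at all; scores are undefined")

    repr_f = -(mean(within) - d_bar) / sd
    dist_f = (mean(between) - d_bar) / sd
    return repr_f, dist_f


if __name__ == "__main__":
    # toy frequencies: stable in both sets, but at different rates
    r, d = repr_dist([1.0, 1.0, 1.0], [3.0, 3.0, 3.0])
    print("repr:", r, "dist:", d, "sum:", r + d)
```

in the toy call the feature is perfectly stable within the target set and used at a different rate in the reference set, so both scores come out positive; a feature with the same spread of frequencies in both sets would instead receive negative scores.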
the normalization step also makes sure that representativeness not only measures consistent features within an author's documents, but that it also compares them to the rest of the documents. hence, only the features that are exceptionally consistent within the target author's documents in comparison to the other documents will receive higher $\mathit{Repr}$ scores. similarly, the $\mathit{Dist}$ measure does not just select highly variable features in the language, but will score highly those features whose use contrasts between the target author's documents and the reference set.

we define the features that are both representative and distinctive as the characteristic features of an author. in this paper we use the sum of $\mathit{Repr}$ and $\mathit{Dist}$ to obtain a single summary score representing how characteristic a feature is for the author of interest. we refer to this combined score ($\mathit{Repr} + \mathit{Dist}$) as the $RD_f$ score, and refer then to $RD_f(A,B)$ or $RD_f(D^{in},D^{ex})$. for different applications, other combinations of $\mathit{Repr}$ and $\mathit{Dist}$ may be more appropriate.

distinctiveness in comparing only two authors

representativeness and distinctiveness as defined above compare texts written by an author with a reference set typically comprising many other authors. in some of the experiments (reported in section . ), we present results comparing only two authors. this subsection discusses the interpretation of the measures in the two-author setting and clarifies further properties of the $RD_f$ score.

in the two-author setting, we have two sets of documents, one belonging to author $A$ and the other to author $B$ (or to $D^{in}$ and $D^{ex}$, respectively). we first consider the case where the same feature is representative in both authors' works. if the feature is used consistently at the same rate by both authors, it will be representative for both individually, but not distinctive. if it is used consistently by both but at different rates, then it may score well in distinctiveness depending on the size of the difference.
so representative features need not result in high $RD_f$ scores.

the $RD$ measure is symmetric, for example, when feature $f$ is representative in set $D^{in}$ because it occurs with a consistently high frequency. if the same feature $f$ is also representative in the opposing set, $D^{ex}$, but with a low frequency, then $f$ will be representative and distinctive for both sets, and $RD_f(A,B) = RD_f(B,A)$.

but the measure may be asymmetric, so that $RD_f(A,B) \neq RD_f(B,A)$, if, for example, the feature is highly representative in $A$ but not $B$. this means that a representative and distinctive feature for the candidate set $D^{in}$ may be unrepresentative for set $D^{ex}$ because its frequencies may vary too much in the documents in $D^{ex}$. although this feature is not representative for $D^{ex}$, it may still be distinctive in $D^{in}$ with respect to $D^{ex}$, because it is used with consistent frequency in $D^{in}$ but not in $D^{ex}$. thus, high $RD_f$ scores indicate consistent frequencies within the target author's documents that may either be inconsistent or be consistently different in the reference set.

the values obtained do not reveal whether an author consistently avoided or preferred a particular feature. a given feature $f$ may be scored highly relevant for both authors, so that $RD_f(A,B) \approx RD_f(B,A)$, meaning one uses it consistently less than the other, rendering it a good separator for the two authors.

general properties. from a performance point of view, the more features (or documents) one considers, the more expensive the computations will be, since the methods require pairwise comparisons of all documents for each individual feature.

data

in this section we introduce the data sets used in all the experiments reported on in section .
the exact composition of the data sets was motivated by a study by tabata ( ), where charles dickens was contrasted with both contemporary writer wilkie collins in a two-author comparison and a larger reference set comprising different authors from the th and th century and thus a reference for the average writing style of that time. for all experiments, we consider the data sets proposed by tabata ( ), namely a set consisting of twenty-four texts by dickens and collins each (shown in table and table respectively). thus, while the data set for the first experiment here is the same as used by tabata ( ), we assembled the data for the second experiment ourselves; these contain the same texts for dickens as in the first experiment while the reference set in this second case contains fifty-five texts by sixteen different authors. the texts are shown in table and table .

footnote: all computations for this paper, including representativeness and distinctiveness, were implemented using the statistical language r (r core team, ), using packages such as cluster, stats and mclust.

footnote: we would like to thank tomoji tabata for making his data set available to us.

this data set was preprocessed by removing all punctuation, but retaining contractions and compounds, and transforming the data by computing relative frequencies multiplied by . finally, we remove document-specific features over the whole corpus by probing whether a term appears in at least / of the documents and discarding it otherwise.

we note that both data preparation steps – limiting features to the most frequent ones and filtering those that do not appear regularly – serve to increase the chance of using features we would call 'representative'. eliminating infrequent features reduces noise and increases the chance of settling on statistically stable elements.

table dickens' texts.
author | texts | year
dickens | sketches by boz |
dickens | the pickwick papers |
dickens | other early papers |
dickens | oliver twist |
dickens | nicholas nickleby |
dickens | master humphrey's clock |
dickens | the old curiosity shop |
dickens | barnaby rudge |
dickens | american notes |
dickens | martin chuzzlewit |
dickens | christmas books |
dickens | pictures from italy |
dickens | dombey and son |
dickens | david copperfield |
dickens | a child's history of england |
dickens | bleak house |
dickens | hard times |
dickens | little dorrit |
dickens | reprinted pieces |
dickens | a tale of two cities |
dickens | the uncommercial traveller |
dickens | great expectations |
dickens | our mutual friend |
dickens | the mystery of edwin drood |

table collins' texts.

author | texts | year
collins | antonina |
collins | rambles beyond railways |
collins | basil |
collins | hide and seek |
collins | after dark |
collins | a rogue's life |
collins | the queen of hearts |
collins | the woman in white |
collins | no name |
collins | armadale |
collins | the moonstone |
collins | man and wife |
collins | poor miss finch |
collins | the new magdalen |
collins | the law and the lady |
collins | the two destinies |
collins | the haunted hotel |
collins | the fallen leaves |
collins | jezebel's daughter |
collins | the black robe |
collins | i say no |
collins | the evil genius |
collins | little novels |
collins | the legacy of cain |

table th century texts.

author | texts | year
defoe | captain singleton |
defoe | a journal of the plague year |
defoe | military memoirs of capt. george carleton |
defoe | moll flanders |
defoe | robinson crusoe |
fielding | a journey from this world to the next |
fielding | amelia |
fielding | jonathan wild |
fielding | joseph andrews i&ii |
fielding | tom jones |
goldsmith | the vicar of wakefield |
richardson | clarissa i-ix |
richardson | pamela |
smollett | peregrine pickle |
smollett | travels through france and italy |
smollett | the adventures of ferdinand count fathom |
smollett | humphrey clinker |
smollett | the adventures of sir launcelot greaves |
smollett | the adventures of roderick random |
sterne | a sentimental journey |
sterne | the life and opinions of tristram shandy |
swift | a tale of a tub |
swift | gulliver's travels |
swift | the journal to stella |

table th century texts.

author | texts | year
brontë, a. | agnes grey |
austen | emma |
austen | mansfield park |
austen | pride and prejudice |
austen | northanger abbey |
austen | sense and sensibility |
austen | persuasion |
brontë, c. | the professor |
brontë, c. | villette |
brontë, c. | jane eyre |
brontë, e. | wuthering heights |
eliot | daniel deronda |
eliot | silas marner |
eliot | middlemarch |
eliot | the mill on the floss |
eliot | brother jacob |
eliot | adam bede |
gaskell | cranford |
gaskell | sylvia's lovers |
gaskell | mary barton |
thackeray | vanity fair |
thackeray | barry lyndon |
trollope | doctor thorne |
trollope | barchester towers |
trollope | the warden |
trollope | phineas finn |
trollope | can you forgive her |
trollope | the eustace diamonds |
collins | after dark |
collins | the moonstone |
collins | the woman in white |

experiments

in this section, we begin by considering the task proposed by tabata ( ), i.e. that of determining dickens' characteristic features. we do this by first comparing his works to his contemporary collins and then to a reference corpus; this is done in section . and section . respectively. in order to analyze the extent to which the method proposed here is different from the machine-learning technique used by tabata ( ), we compare our results to tabata's in section . . further, we consider comparisons both to burrows' well-established method (burrows' delta in section .
), as well as to a more recently introduced technique (hoover's cov tuning in section . ).

dickens vs. collins

charles dickens is perceived to have a somewhat unique style that sets his pieces apart from his contemporaries (mahlberg, ). this makes him a good subject for style analysis, as there are likely to be features that distinguish him from others. thus, dickens has been the focus of numerous stylistic analyses (mahlberg, ; craig and drew, ; tabata, ).

the study presented by mahlberg ( ) describes a work aimed at introducing corpus linguistics methods to extract key word clusters (sequences of words) that can then be interpreted more abstractly in a second step. the study focuses on twenty-three texts by dickens in comparison to a th century reference corpus, containing twenty-nine texts by various authors and thus a sample of contemporary writing. according to mahlberg, dickens shows a particular affinity for using body part clusters, e.g. 'his hands in his pockets', which is interpreted as an example of dickens' individualization of his characters. although this use is not unusual for the time, the rate of use in dickens is remarkable, as dickens links a particular bodily action to a character more than average for the th century. the phrase 'his hands in his pockets', for instance, occurs ninety times and in twenty texts of dickens, compared to thirteen times and eight texts in the th century reference corpus. mahlberg concludes that the identification of body part clusters provides further evidence of the importance of body language in dickens. thus, frequent clusters can be an indication of what function (content) words are likely to be or not be among dickens' discriminators; in this case, we would expect there to be examples of body parts, such as face, eyes and hands.

for the comparison between dickens and collins, we consider the same data used by tabata ( ).
the combined data set contains twenty-four documents each for the two authors, for which the first ∼ most frequent words were extracted. for evaluation, we return to the authorship evaluation task, since, after all, characteristic words should serve to discriminate between authors, but we take care to attend to the words responsible for the discrimination as well. we use five-fold cross-validation and subsequent clustering of documents, which we evaluate using the adjusted rand index (ari) (hubert and arabie, ), where 0 is the expected (chance) value and 1 indicates perfect overlap with a (gold) standard. the input features for clustering are selected by considering the shared items of the n-highest rated features of the two authors, with n iterating from to the total length of the feature input list in steps of fifty, e.g. , , , ... . the distance matrix was computed using the 'manhattan' distance and subsequent clustering was performed using 'complete link' (manning, raghavan, and schütze, ).

table shows selected results, where input refers to the features originally selected and shared to those selected by the $RD_f$ scores for both authors and therefore retained for clustering. for each iteration, we show the ari for clustering on the complete data set and on the test set only. the results are very regular, even when increasing the feature input size dramatically. however, at shared features, the accuracy decreases, and this deterioration continues in subsequent iterations. fold one is considerably and consistently worse for the test set accuracy than the other folds. upon examining its test documents, it can be observed that two unusual pieces of collins are part of this set, antonina and rambles beyond railways, which tabata also identified as conspicuous in collins' works (tabata, ).

table . results for five-fold cross-validation for discriminating in the dickens/collins set, with input referring to the number of features selected from the (top of the) lists of the two authors' representative and distinctive features and shared to the number of those input features shared by both. the shared features are used in clustering. results for clustering on the entire set/test set are shown in the other columns.
columns: feature no. (input, shared); adjusted rand index (ari) for each of the five folds (full, test).

further, we can examine prominent features of the two authors in table , which shows the fifteen highest rated representative and distinctive features for each author. the six features marked with an asterisk are shared by dickens and collins and appear among the top fifteen items based on $RD_f$ scores. these features are thus not only distinctive, but also representative in their frequency distributions for dickens and collins. this means that one of them uses the item consistently more frequently than the other. considering the consistency of results, the method is likely to be appropriate for two-author comparisons.

table . representative and distinctive scores for highest features on input features in fold one. shared features are marked with an asterisk.

dickens | collins
left* | upon*
letter | though*
only* | such
first* | so
discovered | only*
later | being
but* | but*
produced | much
advice | many
wait | answer
upon* | very
though* | and
words | left*
future | to
news | first*

dickens vs. 'world'

in the second experiment presented by tabata ( ), the task was to identify dickens' style with respect to a larger reference corpus, in order to detect items that set him apart from other authors of his time rather than only collins. thus, we consider the same texts used in that exercise and transformed the data by computing relative frequencies and excluding words not present in at least / of the complete data set, which reduces it to ∼ input features (words).
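the two preparation steps just described (relative frequencies, then dropping words that occur in too few documents) can be sketched as follows. this is a minimal python illustration rather than the authors' r code; the function names and the parameters `scale` and `min_doc_fraction` are hypothetical stand-ins, since the scaling factor and the document-frequency threshold are not recoverable from the text.

```python
from collections import Counter


def relative_frequencies(tokens, scale=1.0):
    """Map each word of one document to its count divided by the document
    length, times `scale` (a placeholder for the unspecified multiplier)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: scale * n / total for word, n in counts.items()}


def filter_by_document_frequency(docs, min_doc_fraction):
    """Keep only words that occur in at least `min_doc_fraction` of the
    documents. `docs` is a list of {word: relative frequency} dicts.
    Returns one dict per document over the kept vocabulary, with 0.0
    for words a document does not contain."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in doc)
    kept = sorted(w for w, k in doc_freq.items() if k / n_docs >= min_doc_fraction)
    return [{w: doc.get(w, 0.0) for w in kept} for doc in docs]


if __name__ == "__main__":
    # three tiny toy "documents", already tokenized
    raw = [["the", "cat", "the"], ["the", "dog"], ["the", "cat"]]
    docs = [relative_frequencies(tokens) for tokens in raw]
    for row in filter_by_document_frequency(docs, min_doc_fraction=0.6):
        print(row)
```

in the toy run, "dog" appears in only one of three documents and is discarded, while "the" and "cat" are kept for every document.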
table shows the cross-validation results for clustering dickens vs. the reference corpus. as in the previous case, the distance matrix was computed using the 'manhattan' distance and subsequent clustering was done using 'complete link'. in contrast to the dickens-collins comparison, the results are less consistent. in order to obtain a fair number of shared features, the number of input features has to be much greater than in the two-author experiment.

table . results for five-fold cross-validation on the dickens/world set, with input referring to the number of highest features selected from dickens' and the reference corpus' representative and distinctive features and shared to the number of those input features shared by both sets – these are used in clustering. results for clustering on the entire set/test set are shown.
columns: feature no. (input, shared); adjusted rand index (ari) for each of the five folds (full, test).

in the previous case, there were two pieces in the first fold's test set that are likely to have lowered the overall ari (see above). of course this can happen in other trial runs based on a random five-fold cross-validation. if there are only a few documents of a given author and these are (almost) all missing from the training corpus, they are more likely to be misclassified in clustering. the test set in fold three is an interesting candidate; clustering based on a higher set of features is quite low, close to the expected value of random clustering, while the test set results based on fewer features are generally quite high. the test set for this fold consists of four novels by dickens, all six of the novels by austen in the data set and one each by smollett and sterne and each of the brontë sisters.
closer inspection reveals that the absolute distance between clusters is very slight for the test documents. clustering the complete data set shows that seven documents are misclassified – namely all three novels of charlotte brontë as well as one by thackeray, smollett, sterne and dickens each. interestingly, all of austen's novels are correctly attributed, despite the fact that none of her works were part of the training corpus, suggesting that her style is sufficiently similar to that of her peers. this might also suggest that austen is not only very consistent within her own texts, but presents a kind of average of the corpus, while certain authors/works deviate more from this.

the only fold that behaves more regularly is fold five, where both the full set and the test set have mediocre to fair results, suggesting that the test documents in this case (gaskell ( / ), eliot ( / ), trollope ( / ), collins ( / ), thackeray ( / )) were a better reflection of the training corpus, which in fact did contain samples of these authors. overall, one can conclude that the composition of the reference set, as well as the possible prevalence of particular authors, might considerably influence the selection of features.

table shows the fifteen highest rated features for both dickens and the reference corpus. in this case, the scores for each are considerably lower than for dickens and collins in the previous experiment. this suggests that consensus over features is more difficult to attain for the larger reference set, which in turn affects the degree of distinctiveness for dickens (even if his features' representativeness will be the same in this case). the number of shared items is also lower than it was previously when we considered the same number of highest features.
however, among the first thirty items of both lists, there are a number of body parts, such as head, faces, and legs, as well as words denoting action, such as looking, shaking and raising, indicating that these indeed distinguish dickens from his contemporaries, one giving preference to these expressions, while the others are rather avoiding them. while representativeness and distinctiveness cannot reveal which of these expressions dickens himself preferred, taking into consideration previous analyses (mahlberg, ; tabata, ), we might tentatively conclude that he used the above more frequently than his peers.

table . scores for highest features on input features in fold five. features are listed in rank order.

dickens feature    world feature
corner             head
given              corner
quiet              old
till               legs
for                various
return             hat
pleased            shaking
however            until
entirely           looking
give               remark
use                heavily
without            returned
able               raising
cannot             behind
upon               faces

. comparing to tabata’s random forests

in the following, we compare our results to the ones obtained by tabata ( ), who used random forests (rf) classification on the same two tasks we reported on in the last two sections.

random forests classification

random forests (rf) was first introduced by breiman ( ) and is based on ensemble learning from a large number of decision trees randomly generated from the data set. the “forest” is created by building each tree individually by sampling n cases (documents) at random with replacement (with n ∼ % of the complete data). at each node, m predictor variables are selected at random from all the predictor variables, finally choosing the variable that provides the best split, according to some objective function (m ≪ total number of predictor variables). a new document is classified by taking an average or weighted average or a voting majority in the case of categorical variables.
in terms of interpretability, rf classification offers more transparency than other machine-learning algorithms in that it indicates what variables were important in classification, in the present case, which words were best in separating dickens from collins or from the th/ th century reference set. for both experiments in tabata ( ), the most frequent words were used as input features, yielding a list of features for dickens and collins each, shown in table and one for dickens’ positive and negative features when compared to the larger reference corpus, as shown in table .

table . dickens’ markers, when compared to collins according to tabata’s work using random forests.

dickens’ markers: very, many, upon, being, much, and, so, with, a, such, indeed, air, off, but, would, down, great, there, up, or, were, head, they, into, better, quite, brought, said, returned, rather, good, who, came, having, never, always, ever, replied, boy, where, this, sir, well, gone, looking, dear, himself, through, should, too, together, these, like, an, how, though, then, long, going, its

collins’ markers: first, words, only, end, left, moment, room, last, letter, to, enough, back, answer, leave, still, place, since, heard, answered, time, looked, person, mind, on, woman, at, told, she, own, under, just, ask, once, speak, found, passed, her, which, had, me, felt, from, asked, after, can, side, present, turned, life, next, word, new, went, say, over, while, far, london, don’t, your, tell, now, before

table . tabata’s dickens markers, when compared to the larger reference corpus.
positive dickens’ markers: eyes, hands, again, are, these, under, right, yes, up, sir, child, looked, together, here, back, it, at, am, long, quite, day, better, mean, why, turned, where, do, face, new, there, dear, people, they, door, cried, in, you, very, way, man

negative dickens’ markers: lady, poor, less, of, things, leave, love, not, from, should, can, last, saw, now, next, my, having, began, our, letter, had, i, money, tell, such, to, nothing, person, be, would, those, far, miss, life, called, found, wish, how, must, more, herself, well, did, but, much, make, other, whose, as, own, take, go, no, gave, shall, some, against, wife, since, first, them, word

characteristic feature comparison

since representativeness and distinctiveness returns a combined measure of how consistent (representative) and distinctive a feature is with respect to a comparison author/authors, no attention is paid to the question of which author used a feature more frequently than the other if the feature is representative for both. thus, in contrast to the rf information that makes it possible to attribute particular features to authors, features may appear in both lists. since we are only given the forty to sixty most prominent features for each participant, an exact rankings comparison is not possible in this case. instead, we also consider the same number of most prominent representative and distinctive features and compare how many items are shared, when the same number of input features is considered, in this case the most frequent ones. table shows comparisons of the experiments. the number of directly shared items, for instance, items appearing under dickens under both rf and rd, is fairly high – rd shares eighteen words, or ∼ % of the sixty most prominent words for dickens under rf. considering collins, the overlap is comparable, namely twenty-one shared items of sixty-six words under rf (∼ %).
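the overlap counts reported in this comparison amount to intersecting the top k entries of two ranked feature lists. a minimal python sketch of that bookkeeping follows; the function name is hypothetical, and the two short lists are only the first five entries of the rf and rd columns for dickens, used here purely for illustration.

```python
def shared_top_features(ranked_a, ranked_b, k):
    """Return the features that occur in the top k of both ranked lists."""
    top_b = set(ranked_b[:k])
    # Keep the order of the first list so the result is reproducible.
    return [feature for feature in ranked_a[:k] if feature in top_b]


# First five Dickens markers under each method (illustrative subset only).
rf_dickens = ["very", "many", "upon", "being", "much"]
rd_dickens = ["first", "upon", "only", "left", "words"]

shared = shared_top_features(rf_dickens, rd_dickens, k=5)
overlap_ratio = len(shared) / 5
```

because the measure is symmetric in the two lists, only the cut-off k and the size of the input vocabulary change the outcome, which is why the overlap shifts when more input features are considered.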
however, what is noticeable is that some of tabata’s dickens features appear among our collins features, suggesting that they are good separators for the two authors, being more frequent for dickens, but more representative for collins. regarding the dickens/reference set comparison, there are two shared items for the forty most prominent words for dickens under each analysis, while there are twelve out of sixty-two for dickens’ negative words / the reference corpus. however, if we raise the number of features in the input, using ∼ for the dickens / collins comparison, the number of shared items for dickens falls to four out of sixty and eleven out of sixty-six for collins. considering ∼ most frequent words instead of for dickens / the reference corpus causes a drop to zero out of forty shared words for dickens and one out of sixty-two for the corpus. the fact that the two methods are similar given a more limited input is not necessarily surprising, but it indicates that while rf performs better on a few, more frequent features, this is not true for representativeness and distinctiveness. comparing the corresponding ari scores for those input features confirms this; for the two-author experiment, the ari is also high, but starts dropping relatively quickly on clustering the first - most prominent features. for the second comparison, the numbers become even less stable, which suggests, that the method struggled more on finding discriminators when only considering the most frequent features. thus, the above comparisons indicate that methods are more similar for two-class problems, although this could also be due to the fact that representativeness and distinctiveness might possibly be less suited for mixed set comparisons.

table . comparison of highest rated words under each method for both experiments. bold printed words indicate a direct correspondence with the other method. features printed in italic are indirectly shared, namely by the opposing author.

dickens              collins
rf        rd         rf        rd
very      first      first     upon
many      upon       words     first
upon      only       only      very
being     left       end       such
much      words      left      many
and       letter     moment    being
so        end        room      so
with      moment     last      indeed
a         enough     letter    only
such      answer     to        much
indeed    last       enough    air
air       such       back      on
off       very       answer    a
but       being      leave     great
would     on         still     and

dickens              world
rf        rd         rf        rd
eyes      till       lady      head
hands     for        poor      old
again     however    less      looking
are       give       of        returned
these     without    things    round
under     cannot     leave     down
right     upon       love      door
yes       looking    not       night
up        not        from      gentleman
sir       than       should    mr
child     but        can       to
looked    nor        last      here
together  about      saw       through
here      would      now       face
back      head       next      its

. comparing to burrows’ delta

in order to understand to what extent representativeness and distinctiveness are similar or different to other methods extant in the literature, we compare the features emerging from our analysis to those selected (or used) by two other techniques. we begin with a comparison to burrows’ delta (burrows, ). from a theoretical point of view, one central difference between the techniques is one of design; burrows’ delta was intended for authorship attribution, i.e.
measuring similarity between a test document and different candidate authors, indicating which author of those considered would be most likely to have authored this particular document. however, representativeness and distinctiveness aims at detecting characteristic stylistic features – thus one question addressed here would be to what extent characteristic stylistic features coincide with those found most discriminating in successful authorship attribution.

burrows’ delta is an authorship attribution technique used to identify the most likely author for a test document on the most frequent words ( – mfw). to perform the test, a corpus of candidate authors is assembled with a couple of documents each and both the mean and standard deviation for all features are calculated over the complete set of features (words). to compute z-scores for individual authors, for each author and feature, one takes the average standardized frequency over his documents and computes z-scores using mean and standard deviation over the whole corpus. the test document is treated similarly also using the corpus’ µ̂ and σ̂. we then compare the test piece’s scores to those of a candidate author and take the mean over the absolute differences to obtain a combined score. thus, delta is defined as ‘the mean of the absolute differences between the z-scores for a set of word-variables in a given text-group and the z-scores for the same set of word-variables in a target text’ (burrows, ). the delta scores emerging from the analysis quantify the individual comparisons for each author in the main corpus and a specific test piece, where the lowest distance indicates the closest fit. the delta z-scores refer to z-scores computed over the distribution of delta scores, e.g.
if a value (corresponding to the lowest distance) diverges a lot (from the mean of all differences), it indicates that the author’s piece and the test piece are unusually close and that there is no other close competitor (this can be quantified through the z-distribution).

delta experiment

since the two methods have different aims, there is no direct way of comparing the results. the output of delta consists of delta scores and delta z-scores corresponding to an aggregation over some number of most frequent words – this does not immediately reveal which words were determining the overall proximity or non-proximity to a test document. to determine what features were central in the analysis, one could examine z-scores of individual features before they are combined into the overall delta score. for instance, important features for dickens should show low absolute differences between z-scores of dickens’ set and one of his documents as a test document.

in the following experiment, we consider a classic delta analysis as well as one that allows for a comparison to characteristic features emerging from applying representativeness and distinctiveness to the same data. the data set used for the analysis is the same as the one used in section . . more specifically, there are twenty-four texts by dickens and fifty-five by sixteen other authors. although this would be a suitably balanced set for representativeness and distinctiveness, it is less well suited for applying delta due to the fact that dickens is dominating as a single author. for this reason, we reduce dickens’ set in order to prevent his style from dominating the mean and standard deviation over the entire corpus, which are crucial parameters for delta. we randomly extract eight documents for dickens and take the remainder as test pieces. the data was preprocessed as described in section . for the final input we retain the most frequent features.
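the delta computation described above can be sketched compactly. the python code below is a minimal illustration under stated assumptions, not the implementation used in this experiment: the function names are hypothetical, documents are assumed to be given as equal-length vectors of relative word frequencies, and the population standard deviation is assumed (the text does not say which variant was used).

```python
from statistics import mean, pstdev


def z_scores(freqs, mu, sigma):
    """Standardise a frequency vector with corpus-level mean and std."""
    return [(f - m) / s for f, m, s in zip(freqs, mu, sigma)]


def burrows_delta(author_docs, test_doc, corpus_docs):
    """Mean absolute difference between author and test z-scores.

    author_docs, corpus_docs: lists of relative-frequency vectors.
    test_doc: one relative-frequency vector over the same words.
    """
    columns = list(zip(*corpus_docs))
    mu = [mean(col) for col in columns]
    sigma = [pstdev(col) for col in columns]  # assumption: population std
    if any(s == 0 for s in sigma):
        raise ValueError("every word must vary across the corpus")
    # Average the author's documents per word, then standardise.
    author_profile = [mean(col) for col in zip(*author_docs)]
    z_author = z_scores(author_profile, mu, sigma)
    z_test = z_scores(test_doc, mu, sigma)
    return mean(abs(a - t) for a, t in zip(z_author, z_test))


# Hypothetical toy corpus: two authors, two words.
docs_a = [[1.0, 4.0], [3.0, 2.0]]
docs_b = [[5.0, 8.0], [7.0, 6.0]]
corpus = docs_a + docs_b
test_piece = [2.0, 3.0]

delta_a = burrows_delta(docs_a, test_piece, corpus)
delta_b = burrows_delta(docs_b, test_piece, corpus)
```

the per-word terms `abs(a - t)` are the quantities one would inspect before averaging in order to see which words drive the proximity of an author to a test piece.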
first considering a classic delta analysis of the data, the delta scores reveal that in all sixteen cases, dickens is rated closest to his own document. considering the distributions of delta over all authors, namely delta z-scores, it seems that under delta dickens’ documents are not extraordinarily similar to one another based on these test pieces and when compared to the other candidate authors (a typical result is shown in table ).

table . delta z-scores for candidate authors in corpus w.r.t. test text nicholas nickleby, indicating that dickens is not notably closer to the test document than the other candidates.

author        delta z-score
dickens       − .
eliot         − .
c. brontë     − .
gaskell       − .
thackeray     − .
collins       − .
trollope      − .
smollett      − .
austen        − .
sterne        − .
swift         − .
fielding      − .
richardson    − .
defoe         − .
e. brontë     .
goldsmith     .
a. brontë     .

feature comparison

in order to compare the two methods, we use the same training data (sixty-three documents on features) to compute representative and distinctive features (for delta, we consider the feature values corresponding to table ). to examine similarities in feature importance, we can compare the rankings of the features under the two methods. for delta, low values indicate greater importance, while in terms of representativeness and distinctiveness, higher values would be more desirable. we correlate the rankings for all features under each method using spearman’s ρ, which is bounded by [−1, 1]. thus, for a strong correlation in the present case, we would expect a large negative correlation. correlating all the rankings over all features returns a weak negative value: − . ; however, among those , there might be less accurate ones, so it remains to test higher rated features’ correlations. for this purpose, we reorder the features according to the highest representative and distinctive features and try different levels of highest values, shown in table .
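the rank correlation used for this comparison can be written out in a few lines. the following python sketch computes spearman’s ρ as the pearson correlation of average ranks; the function names and the toy scores are hypothetical and serve only to show why agreement between the methods would appear as a value near −1.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-feature values: low delta differences mark important
# features, high rdf scores mark important features.
delta_diffs = [0.1, 0.4, 0.2, 0.9]
rdf_scores = [0.9, 0.3, 0.7, 0.1]
rho = spearman_rho(delta_diffs, rdf_scores)
```

in this toy case the two orderings are exact mirror images, so ρ reaches its lower bound; the weak values reported for the real data indicate that the orderings are largely unrelated.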
the correlation between the number of features considered and the correlation between methods is − . , the mean of this over all sixteen test pieces is − . , with correlations ranging from − . to − . , which does not indicate a very stable relationship. but this does indicate that it is beneficial to include a larger number of features (words). thus, the degree of correlation seems to be subject to the particular test document, as well as the composition of test and training corpus. further, we can compare the number of top features shared between the methods. among the first ∼twenty to thirty most important features, methods share only one term, namely ‘hardly’. among the first words, there are nineteen shared ones: more, nothing, without, however, old, hardly, she, return, for, entered, stay, about, future, but, conduct, away, pleased, immediately, entirely, cold, be and than. considering the first most important ones yields sixty-three shared features; the first raises it to common features.

table . rank correlation of different numbers of features based on delta and rd; where a high negative correlation would be indicative of a strong similarity between the methods. columns: no. of features; spearman’s ρ.

the above comparison showed that there might not be a very strong or even consistent correlation between features emerging as important from the two methods. delta scores (per feature) and rdf scores correlate only weakly, from which we conclude that they are genuinely different. however, since they were designed for different purposes any comparison between them is unlikely to be ideal. in our case, delta requires that one includes fewer documents by dickens in the main corpus, while more documents would be better for representativeness and distinctiveness to estimate representativeness more reliably.
generally, features that are consistent for a particular author in terms of being avoided or preferred with respect to the main corpus are likely to emerge under both methods, provided the chosen test piece is also following this regular pattern.

. comparing to hoover’s cov tuning

for the comparison between the cov tuning method (hoover, ) and representativeness and distinctiveness, we again consider the dickens/collins data set. the cov tuning method was introduced to ‘identify words used fairly frequently and in many texts but with widely varying frequencies’. for this purpose, one considers a two-/multi-author text corpus and computes the coefficient of variation over the complete sample (for each feature f separately) by dividing the standard deviation σf by the mean µf (the computations are on the basis of relative frequencies). the resulting scores are then multiplied by 100 to express them as percentages. however, hoover notes that high covs are also awarded to features that are rare or only occur in a small number of texts, which necessitates choosing items that occur in a large number of texts. according to david hoover (email communication), there do not yet exist clear guidelines for choosing the number of documents a term has to appear in, so this is done here heuristically as well.

cov tuning experiment

since the methods operate on different levels of the data set, i.e. cov tuning being computed on the basis of the whole corpus and representativeness and distinctiveness requiring division of authors into sets, there is unlikely to be an ideal experimental design for comparison. similar to the previous experiment, there are different aspects one may consider to gain some intuition about the similarities and differences between the two techniques. to arrive at a good estimation for thresholds of input features, we analyze accuracy in clustering documents for the highest features under the cov tuning method.
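the cov score itself is simple to compute. the python sketch below illustrates the description above under stated assumptions: the function name and the `min_doc_share` parameter are hypothetical, documents are assumed to be dictionaries of relative word frequencies, and the population standard deviation is assumed since the text does not specify the variant.

```python
from statistics import mean, pstdev


def cov_scores(doc_freqs, min_doc_share=1.0):
    """Coefficient of variation (as a percentage) per word.

    doc_freqs: one dict per document mapping word -> relative frequency.
    min_doc_share: fraction of documents a word must occur in to be kept
                   (hypothetical parameter; 1.0 means "in every document").
    """
    n_docs = len(doc_freqs)
    vocabulary = set().union(*doc_freqs)
    scores = {}
    for word in vocabulary:
        values = [doc.get(word, 0.0) for doc in doc_freqs]
        doc_share = sum(v > 0 for v in values) / n_docs
        if doc_share < min_doc_share:
            continue  # too rare: a high CoV here would be uninformative
        scores[word] = 100 * pstdev(values) / mean(values)
    return scores


# Hypothetical toy corpus of two documents.
docs = [
    {"upon": 0.02, "letter": 0.01, "rare": 0.005},
    {"upon": 0.02, "letter": 0.03},
]
scores = cov_scores(docs)
```

words are then ranked by descending score; the document-share filter plays the role of the heuristic threshold discussed in the text, without which rare words would dominate the ranking.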
further, we examine similarities with respect to the features chosen by the cov as highest and look at the cov and rdf score correlations for these features. finally, we consider highly rated words shared by both methods, when representativeness and distinctiveness is applied as usual.

clustering with the cov

in order to restrict the number of input features, different thresholds were explored, but only a very high threshold of ‘appearance in at least % of the documents’ proved effective in terms of clustering (practically, this included features appearing in all documents). this reduced the data to input features. table shows the results for clustering different levels of top features for the cov. the distance matrix was computed using the ‘manhattan’ distance and clustering was done using ‘complete link’. the clustering result is evaluated using the adjusted rand index (ari). the results indicate that in this case at least features are required and clustering results are highest on – features.

table . cov tuning’s accuracy in clustering on the dickens/collins set, shown using different numbers of highest input features. columns: no. of features; ari.

comparing cov tuning and representativeness / distinctiveness

in order to investigate correlations between the two methods, we consider the highest features emerging under cov tuning with respect to clustering and consider the exact same features ordered by their rdf scores. a high correlation in terms of rank would be marked by a high spearman’s ρ, close to 1. table shows selected levels of the ranking correlations of cov and rdf scores for both dickens and collins. occasionally, there are stronger correlations for collins’ scores and the cov, but since these are also negative, it seems rather erratic. the correlation between the number of features considered and the correlation between methods is . for dickens and .
for collins, which indicates that the level is likely to be relevant here (the overall correlations were computed on a stepwise version of the data, e.g. for levels, there were ∼ correspondences). we interpret the low correlation to indicate that cov and rd are genuinely different concepts.

table . correlation of rankings on various levels of top features according to the features selected for the cov. columns: no. of features; spearman’s ρ (dickens, collins).

shared feature lists

as a final exercise, we look into size and type of features identified by the two methods where representativeness and distinctiveness are computed on the entire feature input of ∼ features. since the method is computed with respect to particular author samples, less frequent, but consistent features are considered likewise. thus, for each method, we order features according to prominence and consider the overlap at different levels of the ranked list. table shows the number of shared items at different steps. when considering both dickens and collins (for all features as input) the overlap with the features selected by the cov is not considerable – the top features only yield eight to eleven shared items, but which incidentally include upon and letter, which have previously been identified as dickens and collins markers (tabata, ). further, we compare the features chosen by cov and rd (for dickens) on the exact same input of features appearing in all documents. the overlap of highest ranked features is greater after the first words, but less than one might expect on the same input, if the methods were choosing features in a similar fashion. in terms of a general comparison, we note that cov tuning requires virtually no computation time compared to the expensive pairwise comparisons of documents needed for representativeness and distinctiveness.
disregarding any particular author in the set (unsupervised approach), as it is done in cov tuning, potentially offers more possibilities for evaluation than a supervised technique, where accuracy of selected features can only be heuristically evaluated, for instance by clustering. the fact that cov tuning is successful at all, considering it operates only by measuring variability of frequent features, is impressive; however, this potentially indicates a different application area than representativeness and distinctiveness, where the focus is on author-dependent consistency of usage regardless of exact frequency strata. there is an overlap, nevertheless, if only at a theoretical level, as items appearing in most documents as well as being highly variable might be more likely to vary between than within authors.

table . number of shared items at different levels of prominence, including the top features – for rd for both all original input features before ‘tuning’ and only using the features input to cov computations. columns: no. of features; input mfw (dickens, collins); input cov (dickens).

conclusion

this work has introduced representativeness and distinctiveness, a simple statistical measure to identify features that an author uses consistently and in a way that distinguishes him/her from others. the technique requires a substantial number of documents of each author (in order to gauge consistency), and its performance wanes when one set is less homogeneous. different comparisons to other techniques applied in the domain, both well established and recently introduced ones, indicate more differences than similarities to representativeness and distinctiveness. its ability to analyze both frequent and less frequent features renders it a powerful and promising technique for stylometric analysis in authorship.
future considerations

we should like to be able to characterize the extent to which one can consider a feature score high or low in an absolute sense as opposed to merely high or low with respect to the other features for a particular author. for instance, there are authors, such as jane austen, who are rather consistent in vocabulary use throughout their different works and who might thus be more likely to end up with higher representative scores than authors displaying less consistency, such as for instance mark twain, who is seen to be more volatile. future work might therefore include exploring the properties of high and low rdf scores in order to be able to generalize about the degree to which an author is consistent over his works and different from others.

our goal in this paper was to suggest an emphasis in stylometry on features whose frequency distributions might be regarded as fairly characteristic for a given author as opposed to those that serve to discriminate the author from others. our comparisons have indicated that these two characterizations may be very different. as stylometry evolves to encompass syntactic features, which we suspect will be less numerous than the very large vocabularies of authors, the shift in emphasis may become more important.

references

breiman, l. ( ). “random forests”. in: machine learning, pp. – .
burrows, j. ( ). “‘delta’: a measure of stylistic difference and a guide to likely authorship”. in: literary and linguistic computing . , pp. – .
burrows, j. ( ). “who wrote shamela? verifying the authorship of a parodic text”. in: literary and linguistic computing . , pp. – .
burrows, j. ( ). “all the way through: testing for authorship in different frequency strata”. in: literary and linguistic computing . , pp. – .
craig, h. ( ). “authorial attribution and computational stylistics: if you can tell authors apart, have you learned anything about them?” in: literary and linguistic computing . , pp. – .
craig, h. and drew, j. ( ). “did dickens write “temperate temperance”?: (an attempt to identify authorship of an anonymous article in all the year round)”. in: victorian periodicals review ( ), pp. – .
hoover, d. ( ). “corpus stylistics, stylometry, and the styles of henry james.” in: style . .
hoover, d. ( ). “tuning the word frequency list”. in: universite de lausanne: digital humanities : conference abstracts, pp. – .
hubert, l. and arabie, p. ( ). “comparing partitions”. in: journal of classification . , pp. – .
mahlberg, m. ( ). “clusters, key clusters and local textual functions in dickens.” in: corpora . , pp. – .
manning, c. d., raghavan, p., and schütze, h. ( ). introduction to information retrieval. vol. . cambridge university press.
oakes, m. p. ( ). literary detective work on the computer. vol. . john benjamins publishing company.
prokić, j., çöltekin, ç., and nerbonne, j. ( ). “detecting shibboleths”. in: proceedings of the eacl joint workshop of lingvis & unclh. eacl . avignon, france: association for computational linguistics, pp. – .
r core team ( ). r: a language and environment for statistical computing. r foundation for statistical computing. vienna, austria.
rangel, f. et al. ( ). “overview of the author profiling task at pan ”. in: notebook papers of clef. http://www.uni-weimar.de/medien/webis/research/events/pan- /pan -web/, pp. – .
stamatatos, e. ( ). “a survey of modern authorship attribution methods”. in: journal of the american society for information science and technology . , pp. – .
tabata, t. ( ). “approaching dickens’ style through random forests”. in: university of hamburg: proceedings of the digital humanities: conference abstracts, pp. – .
wieling, m. and nerbonne, j. ( ). “bipartite spectral graph partitioning for clustering dialect varieties and detecting their linguistic features”. in: computer speech & language . , pp. – .
report on the th international conference on electronic publishing: social shaping of digital publishing

d-lib magazine november/december volume , number /

tomasz neugebauer concordia university, montreal, canada tomasz.neugebauer@concordia.ca doi: . /november -neugebauer

abstract

elpub , "social shaping of digital publishing: exploring the interplay between culture and technology", the th annual conference on electronic publishing, took place - june at the university of minho in guimarães, portugal. this report summarizes some of the arguments and results presented, and offers some review and reflection on the contents.

introduction

the elpub conference has featured research results in various aspects of electronic publishing for the last years, involving a diverse international community of researchers in computer and information sciences, librarians, developers, publishers, entrepreneurs and managers. an analysis of elpub paper keywords carried out in showed that the most frequent subjects were "users", "web", "metadata" and "xml". over the course of the last decade, open access, intellectual property rights, and institutional repositories are also favorite themes among elpub authors. elpub , "social shaping of digital publishing: exploring the interplay between culture and technology" was the th annual conference on electronic publishing. it took place - june at the university of minho in guimarães, portugal.
the conference programme included a panel discussion, two keynotes and parallel sessions:

future solutions & innovations
digital texts & reading
digital scholarship & publishing
repositories & libraries
special archives

this conference report includes a review of the keynotes and the panel discussion as well as a review of a selection of two presentations from each of the parallel sessions. the purpose of this report is to summarize some of the arguments and results presented as well as offer some review and reflection on the contents.

kathleen fitzpatrick's keynote

the conference opened with a keynote by the director of scholarly communication of the modern language association, kathleen fitzpatrick. the keynote was titled the same as her latest book, planned obsolescence: publishing, technology, and the future of the academy. the budget slashing experienced by libraries and university presses during the dot-com bubble burst of was particularly devastating for university presses whose budgets decreased dramatically, while libraries managed their cuts through consortia agreements and improved interlibrary loans services. the unfortunate consequence is that marketability of content began to take precedence over high quality scholarly merit. kathleen fitzpatrick suggests that scholarly publishing has become obsolete and is in need of significant structural changes beyond a conversion from print to digital. while academic tenure and promotion ought to depend on peer review and scholarly merit, not on media format, online scholarship continues to pose evaluation difficulties. the in media res project and the media commons are examples of encouraging initiatives towards open peer review. although there are experiments in open peer review, such as the issue of shakespeare quarterly, the pace of change in academia remains 'glacial'.
if reviewers can sometimes miss the point of a work, it may mean that the structures of peer review are broken and in need of critical examination. in the world of print media, there was a true scarcity of space that required strict gate keeping. with the internet and digital media, that scarcity is now only recreated artificially. instead of gatekeepers, kathleen fitzpatrick suggests the need for progress in coping with abundance through filters that include post-publication review and the public community. a necessary component of the change is in the perception of online publishing by the academy: publishing online has to be acknowledged as both legitimate and sufficient. online publishing and the use of post-publication review are more collaborative in nature than the single authorship of a traditional monograph. however, perhaps it is precisely the notions of creativity and originality that will need to change if the academy is to avoid institutional obsolescence. the challenge to develop more effective information filters as a way of dealing with abundance, rather than focusing on gate keeping, will hopefully inspire progress in information retrieval technologies, but the specifics of these new filtering methods remain a challenge. electronic publishing and the web have indeed grown at an astonishing pace due in part to the fact that the costs of sharing content online are calculated according to different criteria than in the world of print. however, the social systems that produce scholarly publications are currently dependent on the gate keeping function of pre-publication peer review.

special archives

jelle gerbrandy described the design of, and lessons learned from, the development of the biography portal of the netherlands, a project in collaboration with els kloek from the institute for netherlands history.
the lessons learned call for addressing copyright issues early on in a project; in the case of the biography portal, unresolved license issues with the data providers pose a fundamental challenge. the biodes xml format for representing biographical information that was developed for this project would likely be more widely adopted by other organizations if it didn't deviate from the text encoding initiative (tei) standard for purposes of simplicity. maria josé vicentini jorente (universidade estadual paulista júlio de mesquita filho) presented an analysis of the national archives experience digital vaults as an example of a novel paradigm of information design of archival collections. the interaction design of national archives experience digital vaults is based on the linking of documents indexed independently from physical spaces, institutions, chronologies and archival fonds. maria josé vicentini jorente alluded to a tension between the user needs of professional visitors and those of the general public. the presentation of archives as fonds serves the former more, whereas new technologies in information retrieval have helped to facilitate a new "post-custodial paradigm in which any individual is able to access, research in and rebuild virtual collections, creating unique paths to approach historical contents". the digital vaults interface is indeed innovative in its design, but confirming the effectiveness of this design seems to call for further empirical study.   digital scholarship & publishing pierre mounier, from the centre for open electronic publishing at l'École des hautes études en sciences sociales (ehess) presented openedition freemium as a new commercial model devised for libraries interested in open access humanities and social sciences content. the term "freemium" was popularized by the journalist chris anderson in his book free: the future of a radical price . 
in this context, it means that the social sciences and humanities books and journal articles are available to all in html format, while subscribing libraries get access to premium services such as pdf and epub download, usage statistics, export in marc format, alerts, assistance, training, and more. openedition is a relatively new and promising platform with participation from publishers and subscribing libraries. the freemium economic model seems like an ideal solution for social sciences and humanities publishing. it provides basic open access to full-text content in html format. furthermore, subscribing libraries retain their role as mediators, purchasers and promoters of content, while publishers retain the subscription revenue stream that they depend on. caren milloy (jisc) spoke about the oapen-uk project, which aims to gather information on the potential for open access scholarly monograph publishing in the humanities and social sciences. in collaboration with five publishers, oapen-uk has set up an interesting comparative study on the impact of open access on a monograph's sustainability and profitability by measuring and comparing the usage, sales, citation and discoverability data of monographs. the publishers proposed pairs of monographs that are as similar as possible. this allowed the oapen-uk project to create two groups of monographs to compare: an experimental group that will be made openly available on the oapen library under a creative commons licence, and a control group that will be available only as ebooks for sale under the publishers' normal licensing. in addition to this experiment, the evidence gathering takes the form of focus groups composed of institutional representatives, publishers, authors/readers, funders, learned societies, ebook aggregators, research managers and administrators.
the results of the initial focus groups are already available on the oapen site, and the results of the comparative study will become a valuable source of evidence for the impact of open access on the sustainability of monograph publishing in the humanities and social sciences.   repositories & libraries the swedish study "accessibility and self archiving of conference articles: a study on a selection of swedish institutional repositories", presented by peter linde (blekinge institute of technology) et al., confirmed the importance of open access subject and institutional repositories in providing access to and preservation of conference papers. a significant number of the articles in the study were found in some type of oa archive, confirming that repositories are currently used for this purpose. furthermore, a striking % of the conference papers were not available at all in any format or platform and thus represent potential candidates for inclusion in institutional repositories. one of the interesting recommendations by the authors of this study is the development of a copyright policy database for conferences, similar to sherpa/romeo for journal publishers. in the report of her study "open access in developing countries: african open archives", lydia chalabi (university of algiers) points to the lack of research on the use of open archives and on the open access movement's impact on the scientific production of developing countries. using open access directories as a data source, lydia chalabi filters down to the open archives used for scholarly communication in african developing countries. an analysis of these reveals that the open archives in african developing countries are limited in various ways.
for example, more than half of the archives include content that requires a local login, only three offer usage statistics, and the existing content consists mostly of theses. the open access movement intends to improve access to scholarly communication by removing economic barriers faced by scholars. researchers in african developing countries can benefit from improved access to the outputs of research from other parts of the world, but it seems equally important to have sufficient support for contributing to the open access content through deposit/publication in open access archives.   panel discussion on academic e-books the topic of the panel discussion, moderated by peter linde, was "academic e-books: technological hostage or cultural redeemer?". kathleen fitzpatrick (modern language association and pomona college), antónio câmara (universidade nova de lisboa and ydreams inc.), delfim ferreira leão (university of coimbra) and karin byström (uppsala university) discussed the positive and negative aspects of e-books. although e-books represent a sustainable opportunity for academic publishing, their readability and access/impact will need to continue to improve. academic libraries face difficulties in selecting and acquiring e-books: many monographs still don't have an e-book version, many have embargoes/delays, and e-book publishers often have business models that prevent libraries from purchasing e-books for ownership and allow only limited licensed access. kathleen fitzpatrick pointed out that e-books are currently in an early stage of what they will become through the addition of video and interactive components. antónio câmara inspired a lively discussion on the future of teaching by arguing that open access video courses represent a fundamental and transformative change for the future of teaching and the university.
he argued that professors will become tutors that offer additional perspective and motivation to students that will increasingly choose to learn from the world's top professors through video lectures. antónio câmara predicts that books will continue to play an important role in teaching, while libraries will be involved in the development of sophisticated visualization tools.   antónio câmara's "publishing in " keynote antónio câmara continued the "video will be king" theme in his keynote speech. he paints a picture of the future where technology allows publishers to "print" interactive digital displays on "anything", such as product packages. he described the electrochromic display technology developed by the ydreams spinoff company ynvisible. currently, printing electrochromic displays is expensive, but as the costs go down, we can expect to see the digital become an even more integral part of physical product experiences. printing technology today still uses dots, but with over billion dollars in printed goods produced in , the race to produce inexpensive electrochromic printing is ongoing. futurology is a speculative activity and only time will tell if the digital world will make its way onto the physical printing presses of the future. contemplating the feasibility of accessing digital information sources through a printed interactive display on a physical object such as a magazine, a coffee cup or a postcard does help to generate an image of one possible future of "publishing in ".   future solutions & innovations carlos henrique marcondes (university federal fluminense) presented a paper titled "knowledge network of scientific claims derived from a semantic publication system". the textual format currently used for scholarly publishing is a metaphor of the th century print text model and restricts computer programs from precise and meaningful semantic analysis of content. 
carlos henrique marcondes presents a prototype of an enhanced bibliographic record and author deposit interface that allows for the encoding of the conclusions of articles with the use of linked data principles and the national library of medicine's unified medical language system (umls). authors are asked to enter a conclusion and natural language processing libraries are used to represent this knowledge as antecedent-consequent relations between phenomena using structured umls. the accuracy and reliability of the semantic formalization of article conclusions remains dependent on authors' familiarity with umls since they are asked to validate the extracted relations and mapping to umls terms during the deposit process. a vision of the opportunities in the "future of digital magazine publishing" was presented by dora santos silva. she cites encouraging statistics published by the association of magazine media (mpa), that % of subscribers renew their magazine subscription. furthermore, % of young consumers are reading magazines electronically while % of those who have downloaded apps have paid for magazine content. almost every print magazine has an online presence, although most of these are merely digital pdf replicas of the print. dora santos silva outlines key features that define a magazine: it has a beginning, middle and end; it is edited and curated; it has an aesthetic treatment; it is date-stamped and periodic; its contents are permanent, suffering only minimal corrections. she outlines the potential of digital magazines using the following examples: ifly magazine, zoo zoom magazine, viv magazine, and all out cricket magazine. three magazines were profiled and critiqued in detail: flypmedia, magnética magazine and the new yorker — ipad edition. 
although many of the usability problems outlined by jakob nielsen can be found in the ipad editions of digital magazines, and only pdf replicas of print magazines exist for many magazines, dora santos silva's paper presents advantages of the digital formats over traditional ones that represent an opportunity for publishers.   digital texts & reading celeste martin (emily carr university of art + design) spoke from the point of view of a designer in the presentation co-authored with jonathan aitken titled "evolving definitions and authorship in ebook design". the multimedia and social interaction potential inherent in ebooks challenges traditional notions of authorship resting with the creator of the text. the impact of the design of the user experience and user participation in ebooks elevates the role of the designer. realizing the full potential for new "enhanced" ebooks that are designed for tactile use on a tablet requires a collaborative effort between writers and designers. celeste martin describes the results of such collaboration in her classroom, where five authors agreed to work with groups of design students in repurposing their books into digital format. the results included traditional book elements such as pages and linear navigation systems, but they also included features that offer an intentionally different experience from the original text. the results also included reader participation through annotation upload, content sharing with other users, game-like interaction, and creative "vertical" and random navigation and exploration. it seems likely that implementing some of these e-book designs would bring to light usability challenges with readers who experience difficulties in learning how to use them. the aesthetics of ebooks emerge from an interdependence of form and content. 
chrysoula gatsou (hellenic open university) reported the results of a usability study co-authored with anastasios politis (technological educational institute of athens) and dimitrios zevgolis (hellenic open university) on the use of visual metaphors from "text vs visual metaphor in mobile interfaces for novice user interaction". the study observed younger and older novice (i.e., inexperienced with computers) users as they interacted with an application interface with two types of interaction icons: visual metaphor and text. the intention of a designer to create clear layout and comprehensive visual metaphors is insufficient to guarantee that the user will perceive and appreciate it as intended. chrysoula gatsou advises that choosing visual metaphors for interface icons requires careful consideration of their comprehensibility to users. this study reports that the metaphor of a "home" as the navigation button to return to the main menu caused problems for older users. the older users in the study performed better when interacting with text buttons whereas younger users performed better in their interaction with icons. ideally, the interface designer can find universally comprehensible visual metaphors for icons, but the challenge is very difficult due to differences in users' age, experience, and culture. the authors of the study do not mention this, but it seems like the design strategy of icons that use a visual metaphor and the text alongside may be an effective compromise. the authors of this study state that "cultural differences may also be defined by age", but the results show only that age correlates with some interface design preferences, not that age is a dimension of culture.   conclusion the conference on electronic publishing has provided a valuable venue for the exchange of ideas between librarians, computer scientists, publishers and others since it was first organized in . 
elpub's comprehensive programme includes an astonishing variety of research perspectives on electronic publishing. this year, the theme "social shaping of digital publishing: exploring the interplay between culture and technology" inspired fascinating discussions on the future of digital publishing and contributions of research results from a variety of perspectives including design, librarianship, archives, publishing and computer science. this year's presentation by carlos henrique marcondes on formalizing the conclusions of papers so that they can be published along with the article metadata seems to be indicative of the focus for elpub . next year, at elpub , the th international conference on electronic publishing, with the main theme "mining the digital information networks", we can expect a greater focus on text/data mining, machine processing and knowledge discovery. the ethics of text/data mining can be a particularly relevant aspect that will hopefully be addressed by some of the submissions. elpub is scheduled to take place june - , at blekinge institute of technology in karlskrona, sweden. in addition to the traditional themes of publishing and access, the main theme of extracting and processing data from digital publications as well as the use of this information in social contexts will be featured. all of the papers from this conference are available in elpub digital library and in proceedings published by ios press. in the spirit of antónio câmara's argument for the primacy of video, recordings for most of the presentations are also available at educast@fccn (https://educast.fccn.pt/vod/channels/r i amwrr and https://educast.fccn.pt/vod/channels/ks u khu).   notes fitzpatrick, kathleen. planned obsolescence: publishing, technology, and the future of the academy. nyu press, , new york university. anderson, chris. free: the future of a radical price. chris anderson, , isbn - - - - .   
acknowledgements thanks to peter linde for his helpful comments and suggestions.   about the author tomasz neugebauer is the digital projects & systems development librarian at concordia university libraries and editor of photographymedia.com. he holds a bachelor's degree in philosophy and computer science and a master's in library and information studies from mcgill university.   copyright © tomasz neugebauer

high integration of research monographs in the european open science infrastructure

deliverable . ms report on stakeholders and european infrastructures related to open science in hss (d . ) and informed communication and outreach strategy, based on coordinated core mission (ms )

grant agreement number :
project acronym : hirmeos
project title : high integration of research monographs in the european open science infrastructure
funding scheme : einfra- -
project’s coordinator organization : cleo-cnrs
e-mail address : pierre.mounier@openedition.org
website : http://operas.hypotheses.org
wp and tasks contributing : wp task . and ms
wp leader : ugoe
dissemination level : pu
due date : feb. ( . . for ms )
delivery date : mar.

the project has received funding from european union’s horizon research and innovation programme under grant agreement . this publication reflects only the author’s views – the community is not liable for any use that may be made of the information contained therein.

draft version

hirmeos

table of contents
disclaimer
abbreviations
executive summary
introduction and background
stakeholder communities
  members of hirmeos’ consortium, other projects related to hirmeos
  academic community (hss)
    senior faculty members
    junior scholars and academic students
  libraries and publishers
    libraries; oa publishers
  other projects and networks related to oa
  local, national and eu policy makers & funders
  it providers, start-ups and smes
  media and general public
communication and dissemination strategy
  objectives
  multilevel approach
communication channels and dissemination activities
  hirmeos website
  social media
  mailing list and newsletter
  events
    organization of roundtables and workshops
    participation in external events
  dissemination materials
  dissemination toolkit
  press relations
  publications
  feedback and evaluation
  dissemination timeline
conclusion

disclaimer
this document contains a description of the hirmeos project findings, work and products. certain parts of it might be under partner intellectual property right (ipr) rules, so prior to using its content please contact the consortium head for approval. if you believe that this document harms in any way ipr held by you as a person or as a representative of an entity, please notify us immediately. the authors of this document have taken every available measure to make its content accurate, consistent and lawful. however, neither the project consortium as a whole nor the individual partners that implicitly or explicitly participated in the creation and publication of this document accept any responsibility for consequences that might arise from using its content. hirmeos is a project funded by the european union (grant agreement no ).

abbreviations
hss = humanities and social sciences
oa = open access
os = open science
ws = workshop
wp = work package

executive summary
this document presents the report on stakeholders and european infrastructures related to open science (deliverable . ) of the hirmeos project. the report contributes to reaching the milestones by providing an informed communication and outreach strategy based on coordinated core mission. the strategy is intended to be updated mid-way through the project. all the members of the hirmeos consortium participate in the elaboration and implementation of the communication and outreach strategy.
ugoe, the leader of this work package (wp), is responsible for producing its deliverables. the main tasks of the wp are to:
● develop a communication and outreach strategy
● prepare efficient tools for communication with the various stakeholders
● reach out to the various stakeholders
● foster and strengthen communication with a network of key stakeholders and scientific communities

introduction and background
hirmeos is a -month project funded under the horizon program of the european commission. it focuses on the monograph as a significant mode of scholarly communication in the humanities and social sciences (hss) and tackles the main obstacles to the full integration of important platforms supporting open access monographs and their contents. the consortium of hirmeos is composed of nine partners from seven countries. the core mission of hirmeos is to:
● intensify usage of open access books through the addition of new services and data
● increase coordination and cross-linking between the platforms and with other indexing services
● integrate research books in the open science ecosystem
● boost user-driven innovation in the academic book publishing sector, both commercial and non-commercial
the platforms participating in hirmeos (openedition books, oapen library, ekt open book press, ubiquity press and göttingen university press) will be enhanced with tools that enable identification, authentication and interoperability, as well as with tools for information enrichment and entity extraction. new services will allow annotating monographs and gathering usage and alternative metric data. hirmeos will also enrich the technical capacities of the directory of open access books and develop a structured certification system to document monograph peer review. the wp applies a holistic approach to the design of hirmeos’ mode of communication and outreach.
the wp’s main goals are to strengthen hirmeos’ international network and core mission and to foster the joint development and aligned implementation of state-of-the-art services. the wp aims to reach out to all relevant stakeholders and communities, in research and the public, in order to ensure that hirmeos’ services for scholarly monographs are aligned and interoperable with existing infrastructures for open science and with stakeholders’ and communities’ needs. more specifically, the wp consists of four tasks:
● communication and outreach strategy
● dissemination activity
● community outreach
● alignment and exploitation

stakeholder communities
to achieve its communication and dissemination goals, hirmeos needs an acute understanding and precise delineation of the key stakeholder communities concerned by the project. each community has different needs and expectations. in order to contact and inform its stakeholders and maximize its visibility, it is important that hirmeos develops a multilevel approach, using varied communication styles and different communication channels. hirmeos also aims to promote citizens’ engagement in science and research. it will organize events and communication platforms to enable dialogue between academics and citizens. the annotation feature will increase citizens’ use of academic output, and new metric services will give authors feedback. the target groups of stakeholders, who will be approached at various levels as listed below, are the following:
● members of hirmeos’ consortium and members of other projects related to hirmeos
● academic community (hss)
● libraries and publishers
● other projects and networks related to open access
● local, national and eu policy makers & funders
● it providers, start-ups and smes
● media and general public
members of hirmeos’ consortium, other projects related to hirmeos
description: wp leaders and other partners involved in hirmeos’ consortium; members of other oa related projects (above all, operas and openaire).
importance: very high. all partners of hirmeos are involved in each wp; at the same time, all are well-networked and mature players in their respective fields. they therefore have to be well informed about the developments of the project and the surrounding landscape. the connection with operas, the dariah network at large and openaire is very important for the dissemination of hirmeos’ outcomes.
relevant news of hirmeos: technical implementations; new tools and services; best practice guidelines; community feedback, conferences and other liaison activities.
channels to be used in order to reach them: slack; emails; internal meetings; website.
assessment and feedback: direct networking and meetings.

academic community (hss)

senior faculty members
description: established professors and other senior hss scientists.
relevant news of hirmeos: new services and tools improving researching and teaching with open access research monographs in digital format; events, workshops and training for the new tools; use cases; community feedback and best practice guidelines.
importance: very high. they are creators and users of research monographs at the same time, and also gatekeepers, opinion leaders and multipliers within their fields who have a strong influence on their junior colleagues. they play a major role in the rejection, disregard, interest in or uptake of new practices and technologies. therefore, increasing their acceptance of oa research monographs will have a strong impact on the success of hirmeos.
channels to be used in order to reach them: direct networking; roundtables; liaison activities; website; newsletter; social media; scientific publications; conference presentations; dissemination toolkit (poster, flyers).
assessment and feedback: roundtables, reviews of monographs published on hirmeos’ platforms.

junior scholars and academic students
description: b.a., m.a. and ph.d. students, post-docs, research assistants, research training groups.
relevant news of hirmeos: new services and tools improving learning, researching and teaching with digital research monographs; events; workshops; training for the new tools; use cases, best practice guidelines.
importance: very high. like senior faculty members (see above), but potentially even more important because they are more familiar with digital tools and the methods of digital monograph publishing.
channels to be used in order to reach them: direct networking; roundtables; liaison activities; website; newsletter; social media; scientific publications; conference presentations; dissemination toolkit (poster, flyers).
assessment and feedback: workshops, feedback through the website.

scientific societies, deans, heads of department
description: societies related to topics of hss; faculty members with management duties.
relevant news of hirmeos: new services and tools improving researching and teaching with research monographs in digital format; events, workshops and training for the new tools.
importance: very high. they are both propagators and multipliers of relevant information for academics.
channels to be used in order to reach them: direct networking, roundtables, liaison activities; website; newsletter; publications; conference presentations; dissemination toolkit (poster, flyers).
assessment and feedback: roundtables.

libraries and publishers

libraries; oa publishers
description: university libraries actively engaged in os and oa; publishers producing hss research monographs in pure gold oa.
relevant news of hirmeos: overview about hirmeos; information about new services and new tools; best practice guidelines.
importance: medium high. they are already aware of the potential of digital humanities and oa.
they can take advantage of hirmeos’ enhancements for oa platforms. more publishers could be involved in the network of hirmeos’ platforms.
channels to be used in order to reach them: dissemination toolkit (poster, flyers); website; social media.
assessment and feedback: website.

commercial publishers
description: commercial publishers which do not publish hss research monographs in oa or only in green oa.
relevant news of hirmeos: overview about hirmeos; information about new services and new tools; best practice guidelines.
importance: medium. the dissemination of their products could take advantage of hirmeos’ platforms, but they are committed to traditional financial models which do not allow them to embrace full oa.
channels to be used in order to reach them: dissemination toolkit (poster, flyers); website; social media.
assessment and feedback: website.

other projects and networks related to oa
description: eu projects and research groups related to os, oa and digital humanities.
relevant news of hirmeos: overview about hirmeos; information about new services and new tools; best practice guidelines.
importance: medium. a well-structured synergy should improve the impact of hirmeos.
channels to be used in order to reach them: website; social media; newsletter; scientific publications; dissemination toolkit (poster, flyers).
assessment and feedback: social networks; workshops.

local, national and eu policy makers & funders
description: decision makers of governmental and funding agencies: national research councils, foundations supporting research and publishing in hss.
relevant news of hirmeos: overview about hirmeos; information about the benefits for scholars and other stakeholders; socio-economic significance of hirmeos and oa in general.
importance: high / very high. they provide infrastructure and services in support of implementation of oa policies. support is needed to strengthen the outcomes of hirmeos.
channels to be used in order to reach them: direct networking; social media; press releases.
assessment and feedback: evaluation of hirmeos’ report and other deliverables.

it providers, start-ups and smes
description: it solutions providers and new tech start-ups developing software products, tools & apps for monographs in the os paradigm.
relevant news of hirmeos: overview about the technical development of hirmeos; overview about needs of scholars and students.
importance: medium. exchange with this community could help to enhance the technical implementations of hirmeos.
channels to be used in order to reach them: social media; dissemination toolkit (poster, flyers); conference presentations.
assessment and feedback: workshops.

media and general public
description: media, particularly those paying attention to os, oa and academic policy in general.
relevant news of hirmeos: overview about hirmeos; information about the benefits for researchers and other stakeholders; socio-economic significance of hirmeos and oa in general.
importance: high. decision makers have to cope with public opinion, which is still mainly shaped by traditional news media. in some parts of the european press there is ideological resistance to oa, which hirmeos must address.
channels to be used in order to reach them: press releases; publications; social media.
assessment and feedback: publication of articles following hirmeos press releases.

communication and dissemination strategy

objectives
the main objectives of the hirmeos communication and dissemination strategy are to:
● understand the full life-cycle and landscape of the scholarly monograph as an important form of communication for book-oriented disciplines against the backdrop of digital humanities (dh), open access (oa) and open science (os);
● identify disciplines with low uptake in dh, oa and/or os and possible roads for change;
● identify communities already experimenting with or implementing dh, oa, os;
● involve stakeholders and target groups to gather input and evaluate services;
● promote the services and platforms developed by hirmeos;
● bring forward open science and open access in the disciplines with low uptake;
● create a strong and recognizable brand for the project, with an identity and key messages to be used on all dissemination material;
● generate positive media coverage for the project at a local, national, european and global level;
● support sustainability and visibility of the research results even after the project’s lifetime.

multilevel approach
hirmeos dissemination aims are set on different levels:
level : within hirmeos consortium
level : within own organization & networks
level : towards core target groups through direct networks
level : towards other stakeholders and decision makers in the field of the project
level : towards other countries and sectors
the following issues and messages will be disseminated:
● current developments
● achieved results
● achieved milestones
● published deliverables and other publications, like scientific articles
● attended events and own events, like the wp leaders meeting
● other important incidents

communication channels and dissemination activities
hirmeos website
the project website – http://www.hirmeos.eu – is a fundamental element of the hirmeos communication and dissemination strategy. it informs all the stakeholder communities about the mission and development of the hirmeos project and provides regular updates on relevant events and planned activities. it will also contain links to relevant publications and to other oa projects. a first neutral version of the website will be available at the end of month two.
the logo and other design elements that define the core identity of the hirmeos project will be added after the realization of the dissemination toolkit in month . the website is built with the web content manager wordpress, and a google analytics snippet will be coded into the website so that its usage can be monitored.

social media
hirmeos will take advantage of different social networks. members of the consortium will use the following media in order to disseminate innovations and events of hirmeos: twitter, linkedin, researchgate, academia.edu. these media will ensure an essential presence of the hirmeos project on the web and increase public awareness in the open science community.
what to announce
● news / newsletter / project articles: these should be retweeted as much as possible.
● technical implementations: any updates to the services and tools implemented on the five platforms connected in hirmeos.
● new publications: new research monographs in hss published on the platforms of hirmeos.
● external news: any global updates to policies and infrastructures, scholarly information topics, research data management.

mailing list and newsletter
a newsletter about developments of hirmeos, related events, publications and presentations concerning open access and open science in hss will be started after the development of the official logo of hirmeos and of the dissemination kit. it will be sent every two months. on the website hirmeos.eu we will add the possibility to subscribe to the hirmeos mailing list. we will also work to populate the mailing list through the networks of operas, dariah and openaire. every wp leader should provide a news item every month before the publication of the newsletter (so in m , , , , , , , , , , , , ). the news items are sent to the wp leader, who selects and edits them in a structured document designed according to the core identity of hirmeos.
the news editor chooses what goes into the newsletter.

events
organization of roundtables and workshops
hirmeos wp will organize a differentiated program of joint events such as conferences, workshops, presentations and round tables. direct networking with stakeholders like senior and young scholars will particularly, but not exclusively, concern german institutions related to ugoe (see list in appendix). the continuous exchange with these stakeholder communities will make it possible to organize round tables focusing on the specific needs of the different scholars in hss. the hirmeos consortium has a budget for the organization of workshops and similar events. these workshops are planned and discussed with the other wp leaders. over the forthcoming months we will organize at least workshops in order to present new services and tools implemented on the five platforms of hirmeos, as well as some best practice guidelines. the first workshop will take place in september and focus on the implementations concerning recognition and identification services; we are exploring the possibility of organizing this event immediately after the second meeting of the hirmeos wp leaders in berlin. a second workshop will concern the technical developments in annotation and should take place before the realization of deliverable d. . (due in month ), in order to receive feedback from the academic community. a third workshop, approximately after month , will present the new tools and methods for certification and metrics (deliverables d . and d . ). a fourth workshop should take place after the conclusion of all technical implementations and present specific use cases and best practices for the users of monographs published on the platforms of hirmeos.
at least the following workshops are to be organized:
workshop | location | date | goal
ws | berlin | september | informing about tools and services for identification and recognition implemented on the five platforms
ws | to be defined with the other wp leaders | before deliverable d . (due date month ) | getting feedback from academic community for the certification service implemented at partner level
ws | to be defined with the other wp leaders | after deliverables d . (due in m ) and d . (due in m ) | presenting new annotations service and metrics
ws | to be defined with the other wp leaders | after deliverable d . (due in m ) | presenting how hirmeos’ tools and services improve research activity. discussion of a concrete use case. best practice guidelines

participation in external events
active participation in external events (conferences, workshops, symposia and so on) will increase contact with the key stakeholders. the members of wp and, to a lesser extent, the other members of the consortium will give conference and poster presentations or participate as discussants in roundtables in order to present the mission and outcomes of the hirmeos project. some relevant events (only for , the list has to be updated in the course of the project):
name of event | location | date
a transition to fair open access | leiden | april
first conference workshop of the association of european university presses (aeup) | stockholm university library | may -
open access: publisering og arkivering av forskning | university of bergen, bergen, norway | june
dhbenelux conference | utrecht, netherlands | - july
st international conference on electronic publishing | limassol, cyprus | june -
cern workshop on innovations in scholarly communication | geneva, switzerland | june -
lodlam | venice, italy | june -
th liber annual conference | university of patras (greece) | july -
open-access-tage | dresden | september -
th conference on open access scholarly publishing | lisbon | september -
force | berlin | - october

dissemination materials
dissemination toolkit
the dissemination toolkit (d. . , due in month ) will provide a set of dissemination materials. poster and flyers will be available for download from the project website. its content is developed for multipliers to support their efforts to contribute to the project’s aims and activities. dedicated content will be provided to liaison partners who will multiply the project’s efforts and ensure wider reach and impact. the following resources will be created:
● logo
● letterhead
● factsheet
● powerpoint template
● word deliverable template
● flyers (three different versions: students; researchers; eu project coordinators)
● poster

press relations
in order to increase the impact of hirmeos and obtain good press coverage, press releases will be prepared. the first press release will be published after month , as soon as the dissemination kit is available.

publications
wp will work on two scientific papers to be published in academic journals. the first will concern the significance of nerd for the research monograph; the second will be about the impact of the hirmeos project on research activity. the target venues are renowned, peer-reviewed journals such as international journal of humanities and arts computing, digital humanities quarterly or digital scholarship in the humanities.

feedback and evaluation
the effectiveness of hirmeos’ dissemination and communication strategy will be regularly measured. all partners will contribute to the implementation of the communication plans. the performance targets are the following:
indicator | quantity
total number of website visitors |
number of eu countries reached through website |
number of contacts in the mailing list |
number of twitter followers |
workshops | -
conference presentations | -
roundtables with scholars | -
number of scientific societies reached per direct networking |
number of press releases |
newsletter |
number of published papers |
dissemination timeline
[timeline figure: dissemination roadmap. dissemination material: poster, flyers, portal updates, factsheet, dissemination kit. workshops and liaison activities: roundtables, st workshop, nd workshop*, rd workshop*, th workshop*. dissemination channels: newsletter, social media, press release, scientific papers. *date yet to be decided]

conclusion
this document presents the dissemination strategy and the key stakeholder communities for the hirmeos project. coordination of the activities will be done by ugoe, but the support of all other partners is essential to fulfilling this plan. the kick-off of hirmeos in january provided a chance to discuss some needs and requirements with the members of the consortium. depending on the feedback that we receive through the website and direct networking with our stakeholders, we may modify the dissemination and communication strategy in order to better fit the requirements of the different communities, above all with regard to the specific content of the planned workshops.

breaking it down: a brief exploration of institutional repository submission agreements
suggested citation: rinehart, a., & cunningham, j. ( ). breaking it down: a brief exploration of institutional repository submission agreements. the journal of academic librarianship, ( ), - . doi: . /j.acalib. . .
amanda rinehart a,⁎, jim cunningham b
a ohio state university, w. th avenue, columbus, oh , usa
b illinois state university, north school st., normal, il - , usa
⁎ corresponding author at: ohio state university, w. th avenue, columbus, oh , usa.
e-mail addresses: rinehart. @osu.edu (a. rinehart), jlcunni@ilstu.edu (j. cunningham).
http://dx.doi.org/ . /j.acalib. . .
© elsevier inc. all rights reserved.
article info
article history: received june; received in revised form october; accepted october; available online october

abstract
institutional repositories typically have a submission agreement that is meant to protect the institution hosting the repository and inform submitters of their rights and responsibilities. this article examines how various libraries have created submission agreements, enquires as to issues surrounding them, and identifies commonalities and unique statements. the authors deployed a survey to institutional repository administrators listed in opendoar in the united states. approximately % of the potential institutional repository managers responded. library administrators, institutional repository managers/architects, and legal counsel were the most likely to have input into the creation of the submission agreement; scholarly communications librarians were involved only % of the time. although submission agreements averaged words arranged in sentences, their reading complexity requires a university degree. commonalities include characterizing the agreement as a non-exclusive license, indicating the submitter's responsibility for obtaining permissions for any content that they did not produce, and confirming the right of the submitter to enter into the agreement. submission agreements are generally complex and do not accommodate the common practice of mediated submission. sharing submission agreements publicly may lead to simplified and standardized language and reduce barriers to submitters. © elsevier inc. all rights reserved.
keywords: copyright license deposit agreement author's rights ferpa mediated submission

introduction
at the turn of the century, the initial experiments to implement institutional repositories (irs) were actively ongoing.
crow ( ) differentiated irs from other repository types as having four primary characteristics: institutionally defined, scholarly in nature, cumulative and perpetual, and open and interoperable. it was predicted that the institutional and open components would result in increased prestige, as well as capture new and emerging forms of digital scholarship (crow, ; lynch, ). the cumulative and perpetual component addressed concerns regarding the ephemeral nature of the digital world and the need to preserve it, while the open and interoperable component represented a welcome alternative to traditional, monopolistic and rigidly controlled commercial journals (duranti, ; robertson and borchert, ; crow, ; bastos, vidotti, and oddone, ; bergstrom, courant, mcafee, and williams, ). the advent of disruptive technologies that allow the collection and distribution of material at a lower cost (blythe and chachra, ; heath, ; odlyzko, ) and the sheer increase in the overall volume of scholarly material (budd, ; rawls, ) made irs an attractive addition to the academic library's suite of services. in about % of research institutions in the us had irs, a rate that rose to % in (lynch and lippincott, ; markey and council on library and information resources, ). as of , opendoar listed irs in the us, which is an estimated % of degree-granting post-secondary title iv institutions having an ir (university of nottingham, ; u.s. dept. of education, ). this would indicate that the vast majority of higher educational institutions have either chosen not to implement an ir or may still be faced with that decision.

submission agreements
a key part of launching an ir is creating the submission agreement document.
this is a formal legal agreement that “defines the relationship between the individual submitting the content and the institution that is operating the repository” and grants “the repository the necessary rights to disseminate an author's work while affording the institution a measure of protection against submitted content that may violate legal or ethical boundaries” (gilman, , section repository submission agreements and contracts and licenses). it may be a paper copy, a check box on a ‘click-through’ agreement, or other digital signature mechanism (jones, ). these submission agreements may also be referred to as a license or a deposit agreement. for simplicity of language, the term ‘submission agreement’ or sa will be used for the remainder of this publication. as well, the term ‘author’ will refer solely to the generator of the material, while the term ‘submitter’ refers to the person that signs or approves the sa, regardless of whether they are the actual author. the primary motivation for an ir sa is to minimize the legal risk to the host while maximizing the ability to re-use the material. the sa should “clearly indicate that the repository is not responsible for any mistakes, omissions or infringements in the deposited work” (jones, andrew, and maccoll, , p. ). in particular, it should state that: “in the event of a breach of intellectual property rights, or other laws … the repository … is not under any obligation to take legal action on behalf of the original author, or other rights holders, or to accept liability for any legal action arising from any such breaches” (british library, n.d., section liability).
while protecting the institution, sas may also address a number of other components, such as: submitters' rights and responsibilities, end-user permissions for both full-text and the metadata created during submission, and various policies and laws. gilman ( ) recommends that the sa cover, at a minimum, the submitter's rights to enter into the agreement, granting of a license to the institution and assurances from the author of the legality of the content. however, many sas include other submitter rights and responsibilities, as well as end-user permissions. although sas are commonplace, their importance should not be underestimated. as early as , a coalition of library organizations found that more “work is needed on models for obtaining copyright clearance and models for contracts or agreements between rights owners/producers and archives/libraries” (rlg/oclc working group on digital archive attributes, research libraries group, and oclc, ). as well, the sa is mentioned six times in the trustworthy repositories audit and certification: criteria and checklist and is also a key part of the administrative function of the open archival information system reference model (center for research libraries and oclc, ; consultative committee for space data systems, ).

submitter rights
submitters may have the right to determine when end-users may have access to the material (embargo period), the circumstances for removal, and the terms of any re-use (jones et al., ). since one hallmark of the ir is openness, a re-use license is necessary to delineate how end-users may access, re-use and distribute the material (jones et al., ). this license is typically separate from the sa, and may range from the traditional ‘all rights reserved’ statement, which indicates that end-users cannot use the material for any purpose without permission from the owner, to more liberal creative commons licenses (creative commons, n.d.).
while creative commons licenses have become more popular, there is some controversy over which one constitutes true open access (andersen, ). although the different re-use licenses allow for flexibility, they add to the number of new decisions that a submitter may be confronted with when agreeing to an sa.

submitter responsibilities
it is the intention that the sa be signed or approved by someone who holds the copyright to the work or has permission from the copyright holder. while authors initially hold the copyright to their works, they often transfer the copyright to a commercial company during the publication process. ir submitters “are expected to read the licence carefully and ensure that they have the right, as confirmed by the publishers of a paper that might have appeared in a journal, to deposit the item in the ir” (tedd, , p. ). most sas then ask the submitter “to grant the institution a license to use the work in question…[and]…in all cases, the grant of rights should be nonexclusive” (gilman, , section grant of license to the institution; author's emphasis). by placing copyright clearance responsibilities on the submitter, the host of the ir is legally protected. however, the already complicated and confusing world of copyright becomes even more so when dealing with various types of items: articles, books, raw research data, unpublished reports, works for hire, etc. different types of material may be subject to other laws in place of, or in addition to, copyright law. a few examples include export control laws, the family educational rights and privacy act, the americans with disabilities act, and hipaa (u.s. department of state, n.d.; u.s. department of education, n.d.; u.s. department of health and human services, n.d.). in addition, there may also be local policies that appear to conflict, or do conflict, with the sa.
it is common for higher education institutions to have an existing intellectual property policy that states what types of works can belong to an individual and which are the property of the institution. often these intellectual property policies were created prior to the advent of technology that allows for easy distribution of digital material and without consideration of the diverse types of scholarship that now exist. therefore, the intellectual property policy may need additional interpretation for researchers to understand when they must seek permissions from their employer prior to submission. some sas seek to remind the submitter of these additional obligations.

end-user permissions - metadata
metadata that makes the content discoverable is created during the submission process, by the submitter or the ir staff. there is growing interest in mining this metadata with automated computer scripts (swanson and rinehart, ). as such, “the host repository may, or may not, wish to claim copyright in any additional data created during the submission and subsequent archiving of the work” (jones et al., , p. ). “it is advisable to state an explicit re-use policy for metadata, otherwise people will have to make assumptions – the failsafe being that they do not have permission … opendoar goes further in recommending that you even allow your metadata to be reused commercially. this is because any loss of potential revenue is far outweighed by the benefits accrued from the additional exposure of your material.” (jisc, n.d.a, section re-use of metadata). as interest grows in bibliometrics and meta-analysis, metadata ownership and automated access are becoming more important (dollar, king, knight, and leonard, ). however, specifying who may access and re-use metadata, particularly when it is created by multiple people, adds to the complexity of the sa.
end-user permissions - full-text
in addition to metadata mining, web robots are increasingly used to harvest full texts for various purposes. some of these purposes are beneficial to the ir - the obvious case being indexing by search services such as google. however, the ir may or may not want a third party to make cache copies of complete works, particularly as a collection of works often has greater value than the works individually. as well, irs may decide to harvest works from each other, triggering a number of unexplored questions. opendoar recommends that repositories allow transient harvesting of full items by robots for benign or beneficial purposes (jisc, n.d.a). since re-use statements almost always assume that end-users are individuals, this practice may not be considered in most sa documentation.

complexity of the sa
because of these complex issues – institutional protections, submitter rights and responsibilities, end-user permissions, and the plethora of laws and policies that may apply – it is easy for an sa to become a lengthy morass of legal language. the necessary level of detail and complexity must be balanced with the desire to keep it understandable. as jones et al. ( ) state, “these notices should be of a simple design, which can be easily understood” (p. ) and gilman recommends that libraries “construct clear and effective submission agreements that will grant the repository the necessary rights to disseminate an author's work while affording the institution a measure of protection” ( , chapter contracts and licenses). if users do not understand the sa, then they may be deterred from submitting their work, or submit in violation of law or policy. due to the complexity of the issues outlined above, it is not uncommon for legal counsel to be consulted during the creation of the sa (barton and waters, ).
as well, it is understandable that “some people have felt unable to sign the licence as we ask them to agree that they are the copyright holder and/or have the right to grant the licence – this has been overcome by offering advice and support on copyright policies” (barwick, ). it is now common practice in the us for irs or their institutions to offer some level of support for copyright clearance (hanlon and ramirez, ).

mediated submission; the end run around the sa
in addition to copyright clearance support, some authors rely on other people to submit their material. submitting works on behalf of an author is termed a ‘mediated submission’ or ‘mediated deposit’, and prevents the author from agreeing to the sa (jisc, n.d.b, section mediated deposit). hanlon and ramirez ( ) found that about % of repositories provided mediated submissions and another % provided a blend of self-submission and mediated submissions. mediated submissions place the legality of the sa into question, and therefore, “repository administrators may wish to consider collecting additional proof of the permissions being granted when deposit is taking place via a third party… while [these do not] constitute a complete legal defence, they are likely to demonstrate a responsible approach” (jisc, n.d.b, section mediated deposit). as markland and brophy ( ) stated “the legality of ‘click through’ licenses [is not] always clear, particularly when it is the repository manager, not the academic author who is depositing” (p. ). in the case of mediated submission, gilman recommends that “adequate information [be] gathered by the repository platform to allow reasonable certainty as to which individual completed the agreement …[and]… if work is submitted on behalf of another person (e.g. an assistant), … the language in the submission agreement [allows] that proxy to agree on behalf of the author” ( , section repository submission agreements).
if this language is not present in the sa, it is possible that this practice reduces, or even eliminates, its meaningfulness.

making sas easy
in lynch speculated that irs would “offer the opportunity for bottom-up, community-driven, consensus development about rights and permissions” and would result in a “relatively small number of sets of terms and conditions that can cover the majority of the materials” (p. ). however, a few years later, markland and brophy ( ) noted that “repository managers have differed in their attitudes towards what needs to be done by whom to keep within strict compliance with the law, a reflection perhaps on the vagueness of available information” (p. ). indeed, gilman's excellent work on the topic calls for librarians “to construct clear and effective submission agreements”, while simultaneously noting that “it is necessary to be aware not only of international differences, but of differences between states' laws and in the varying applicability of laws between private and public educational institutions”, “defamation, obscenity, privacy and accessibility laws also vary between countries”, and that sas are “shaped also by the local institutional setting – for example, issues such as liability risk tolerance and the institutional insurance coverage” ( , sections contracts and licenses, conclusion: context changes, but ethics remain). all these considerations make the sa a difficult document to construct. despite the many examples of ir sas, independent evaluations of sas are difficult to find and peer-reviewed literature on the topic is scarce. this knowledge gap is problematic for us institutions that wish to craft or change an sa. this brief exploration of sas and the practices surrounding their creation is a proof-of-concept effort in identifying issues surrounding sa creation and specific ir sa language.
ultimately, these methods may be used to aid those in the position of crafting an ir sa, and to provide empirical data to inform larger discussions regarding the legal partitioning of rights within the scholarly communications landscape.

materials and methods
using the e-mail distribution service from opendoar (university of nottingham, ), the authors deployed a survey (appendix a) to approximately ir administrators in the united states. the survey was deployed via survey select software, and due to logistics, ir administrators were only contacted once. the survey was deployed wednesday, october th, and responses were accepted for four weeks. the survey solicited information about the creation and deployment of the sa and general information about the ir and host institution. the results were summarized and analyzed using descriptive statistics and concept mapping. for initial text analysis, all sas were loaded into voyant tools (https://voyant-tools.org/) and frequencies of common words (such as ‘non-exclusive’, ‘rights’, and ‘intellectual property’) were noted. after initial exploration, each sa was read and entire phrases were classified into common concepts, such as ‘copyright permissions’, ‘infringement of rights – general’ and ‘infringement of rights – privacy’. therefore, some concepts bear great similarity to each other, but are listed separately, as they represent different sas. a spreadsheet was used to track frequencies of each concept. due to complex wording, some phrases included multiple concepts. in these instances, the concept category was either widened to be more inclusive, or the phrase broken down and counted in more than one category. the final concept categories and frequencies were then confirmed by the second author and any discrepancies resolved through discussion.
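the initial word-frequency pass described above can be approximated outside voyant tools. the following is a minimal sketch, not the authors' procedure: it assumes each sa is available as plain text, the function names `term_frequencies` and `corpus_frequencies` are hypothetical, and the search terms are simply the three examples named in the text.

```python
import re
from collections import Counter

# Hypothetical search terms: the three examples named in the methods text.
TERMS = ["non-exclusive", "rights", "intellectual property"]


def term_frequencies(text, terms=TERMS):
    """Count case-insensitive, whole-term occurrences of each term in one agreement."""
    lowered = text.lower()
    counts = Counter()
    for term in terms:
        # Words of a multi-word term may be separated by any run of whitespace.
        body = r"\s+".join(re.escape(word) for word in term.split())
        # The lookarounds stop 'rights' from matching inside 'copyrights'.
        pattern = r"(?<![\w-])" + body + r"(?![\w-])"
        counts[term] = len(re.findall(pattern, lowered))
    return counts


def corpus_frequencies(agreements, terms=TERMS):
    """Sum per-agreement counts over a dict mapping an sa label to its text."""
    total = Counter()
    for text in agreements.values():
        total.update(term_frequencies(text, terms))
    return total
```

counts of this kind only cover the exploratory step; the classification of whole phrases into concepts was done by reading each sa and is not something this sketch attempts.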
these concepts were then further sorted into four categories: protecting the host institution from unlawful activities, delineating the rights of the submitter, making specific promises to the submitter, and reminders of specialized circumstances or options. for sa language analysis, complete sas were submitted. however, two institutions submitted sas that are specific to student work and these sas are considered separately. to aid in visualization, the shortest and longest sa are compared to all of the sas by removing any proprietary names and creating collocate graphs in voyant tools. a collocate graph is a network graph that represents higher frequency terms (in blue) and terms that appear in proximity to them (in orange).

limitations
the term ‘submission agreement’ was not defined in the survey, although the sas that were collected all fit the basic definition given in the introduction of this work. due to the modest response rate, the results presented here are exploratory in nature. as well, the majority of responses are from doctorate institutions, which skews the observations towards that community. however, considering that all institutions are under similar legal burdens, these results may well apply to smaller, and/or non-research institutions. the survey was limited to the united states, as sas are heavily influenced by copyright laws, and these laws differ by country. therefore, the results of this survey are only generalizable within the united states. as well, since only irs are addressed in this study, other types of repositories may require different sas, which are not within the scope of this article.

[fig. . what entities had input into the submission agreement(s)? (n = ). bar chart of respondents (count) by entity: library administrators; institutional repository manager/architect; legal counsel; other library personnel, not individually listed here; scholarly communications librarian; university or college administrators; other campus-level entities, not individually listed here; other, please specify.]

results
demographics
twenty-two of the potential irs responded completely, and another responded partially. this yielded a % complete response rate. however, all responses were included when possible, resulting in an . % response rate for some questions. responses were relatively even across all four regions of the country, with % originating from the midwest, % in the west, % in the south and % in the east. about two-thirds, or %, of the responses were from doctorate granting universities, % from baccalaureate colleges, % from master's colleges and universities, and % from medical or special schools. dspace is the most commonly employed ir software ( %), with digital commons coming second ( %), and the remaining % switching software, using a homegrown platform, ir+ or fedora/islandora. on average, the irs have been operational for about six years, although the entire range spans from two years to ten (fig. ).

who had input?
as would be expected, library administrators had the most input into the creation of the sa ( or %, fig. ). this was closely matched by ir managers/architects ( or %), followed by legal counsel ( or %), and other library personnel ( or %) (fig. ). interestingly, only scholarly communications librarians had input ( %, fig. ). in of the sas, the legal department had no input into the creation of the sas. in another eight, the legal department only had a little input. the remaining sas are split between moderate legal department input ( ), no legal department ( ), and significant legal or entirely legal department input ( ).

self-submission and format
with regard to the sa format, are click-through ( %), with another being both click-through and paper ( %). four have no sa and three have a paper agreement.
nearly two thirds, or of the responses indicated that less than % of the material in the ir was self-submitted (fig. ). another reported no material that was self-submitted ( %, fig. ). only four have self-submissions as more than % of their material ( %, fig. ). therefore, most irs use mediated submission processes for much of their material.

[fig. . approximately how many years has your institutional repository been operational? (n = ). axes: years; respondents (count).]

[fig. . the approximate percentage of material in the ir that was self-submitted by presence or absence of a sa (n = ). axes: percentage self-submitted (none, - %, - %, - %, - %); respondents (count). legend as extracted: unknown sa no sa.]

of the six repositories that did not report self-submission, two still had a sa (fig. ). in contrast, there were two repositories that reported self-submission, but no sa (fig. ). all of these repositories contained both journal articles and book chapters, indicating probable commercially published content, and thus a probable need for the legal protection of a sa (data not published). however, for the two repositories without self-submission, but with submission agreements, it is unclear if the sa was created and then not used, or if the sa operates in a mediated submission process.

. . ir content

most of the irs (over %) contain journal articles, theses/dissertations, unpublished reports/working papers, multimedia/audio-visual materials, conference/workshop papers, and books, chapters and sections (fig. ). about two-thirds collect undergraduate work and other special item types (fig. ). it was less common to collect teaching materials ( %), datasets ( %) and software/computer code ( %) (fig. ). only a few irs were dominated by one type of material: namely, four irs consist primarily of theses/dissertations, three consist primarily of journal articles, two primarily consist of books, chapters and sections and one is more than % unpublished reports/working papers (fig. ). overall, about % of irs had less than % of any one type of materials, indicating that most irs have diverse collections (fig. ).

[fig. . approximately how much and what type of material is currently in your institutional repository? (n = ). axis: respondents (count). categories: software/computer code; datasets; teaching materials (lesson plans,…; other special item types; undergraduate work; books, chapters and sections; conference/workshop papers; multimedia/audio-visual materials; theses/dissertations; journal articles; unpublished reports/working papers. legend: - %, - %, - %, - %.]

. . sa text

in general, the sas included in text analysis ranged from to in word counts, with one lengthy outlier at words. the average was words arranged in sentences (without the outlier). however, common terms and their collocates varied widely, as illustrated by collocate graphs of the shortest, longest, and total sas (fig. ). additionally, the flesch reading ease averaged and the flesch-kincaid grade level averaged , regardless of length of the sa. therefore, sas generally require a university degree to be understood (a score of – ) or approximately years of education (kincaid, fishburne, rogers, and chissom, ). common concepts are in bold, followed by the frequency in which the concept appears in the sas and representative phrases (tables – ). it should be noted that none of these concepts, or the phrases that represent them, are intended to be used in isolation.

[fig. . collocate graphs of the shortest sa, all sas collected by this study (n = ), and the longest sa.]

. . potential changes

only five respondents clearly indicated that they would change anything about their sa. two clearly indicated a desire to simplify the agreement:

– “i wish it sounded less alarmist about copyright. we have had faculty refuse to submit because of issues with the language. but the language was drafted by univ. legal counsel.”
– “consolidate it”

another three indicated that their agreements were lacking:

– “will include firmer embargo sign-off by etd author and advisor.”
– “i'd like to have an easy way to capture author(s) permission when library staff deposit on behalf of author(s)”
– “would like option to separate it out from individual submissions so that agreement can still be electronic but so we don't have to require it for each submission.”

. discussion

. . sa contributors

surprisingly, scholarly communications librarians appear to rarely be involved in the creation of the ir sa. considering that they are often tasked with interpreting and explaining the document, this seems a strange disconnect. however, it is possible that the sa was created prior to the hiring of a scholarly communications librarian. other explanations may be the lack of a scholarly communications librarian entirely or the use of another title by the person that fills this role. in these cases, it is quite possible that the person responsible for interpreting the sa for end-users was involved in its creation, but that this was not measurable by this survey. only more detailed data collection from this community will confirm any disconnects between the crafting of the sa and the responsibility of interpretation and explanation.

considering that a primary motivation behind the sa is legal protection, the lack of legal counsel's involvement in the creation of the majority of the agreements is also surprising. when these results are combined with the total lack of responses to the question regarding any changes to the sa, the length of time that some of the irs have
been operational, and the lack of any known legal issues in the literature, this seems to indicate that the sas are quite durable despite lack of legal counsel input. anecdotally, many of the sas appear to be patterned after robust templates from either other institutions or ir software providers and these may have been initially crafted with legal counsel input. therefore, it is possible that additional legal input is not needed. indeed, if an sa has been created largely from a vetted, robust example, then a variety of personnel may not need to be involved in sa creation. conversely, any flaws in sa language, including those based on initial assumptions about the use of the ir, would be perpetuated. therefore, caution should be used in re-using sa language. unfortunately, it is not possible to determine from this brief survey if an individual sa has been copied or modified from another institution's sa.

however, legal issues are not necessarily readily volunteered and we must not forget that while we have an absence of evidence, this cannot be construed as evidence of absence. it is possible that there have been significant legal issues surrounding sa language, but they have been handled in such a manner as to not have made it into the literature or be common knowledge. only future research on the topic can provide conclusive evidence.

table. phrases to protect the institution from unlawful activities.
concept | usage rate (n = ) | phrase
copyright or third party permission, if no copyright to the material | % | “i warrant that i have the copyright or other intellectual property right or permission to grant these rights” or “i have obtained and uploaded… written permissions statement(s) from the owners of each third party copyrighted matter”
the submitter has rights to enter into the agreement | % | “you represent and warrant that ( ) you have all of the rights necessary to grant the license” or “you represent that … you have the right to grant the rights contained in this license.”
the work is original | % | “you represent that the submission is your original work”
no infringement on anyone else's copy or intellectual property rights | % | “you also represent that your submission does not, to the best of your knowledge, infringe upon anyone's copyright.” or “your submission does not, to the best of your knowledge, infringe upon any third party's copyright or other intellectual property rights;”
all third-party work is identified and acknowledged | % | “if the submission contains material for which you do not hold copyright, or for which the intended use is not permitted, or which does not reasonably fall under the guidelines of fair use, you represent that you have obtained the unrestricted permission of the copyright owner to grant [the ir] the rights required by this license, and that such third-party owned material is clearly identified and acknowledged within the text or content of the submission.” or “if the submission contains material for which you do not hold copyright and that exceeds fair use, you represent that you have obtained the unrestricted permission of the copyright owner to grant [the university] the rights required by this license, and that you have identified and acknowledged such third-party owned material clearly within the content of your submission.”
no infringement on others rights, in general | % | “i agree to hold the institution, department, [repository name] and their agents harmless for any liability arising from any breach of the above warranties or any claim of intellectual property infringement arising from the exercise of these non-exclusive granted rights.” or “the work is original and does not infringe upon the rights of others,”
no infringement on others rights, or contain any unlawful material, specifically named as libelous, privacy, and/or confidentiality | % | “the work is original and does not infringe upon the rights of others, does not contain any libelous content, and does not invade any privacy or confidentiality of third parties.” or “you represent and warrant that …the submission contains no libelous or other unlawful matter and will not violate the right of privacy or constitute an invasion of any other rights of any person or third party.”
no infringement on others rights, specifically named as patent or trade secret | % | “the work does not infringe any copyright, patent, or trade secrets of any third party.”

. . lack of self-submission

originally, irs were intended to accept content from the original authors, requiring them to both know, and understand, the copyright implications of posting their work freely on-line. however, of the respondents, or %, indicated that less than %, or none, of their material was self-submitted. as one respondent commented:

“at present, we search major 3rd party databases to identify articles written by authors. we contact authors for permission to deposit, but the actual submission (and clicking through the license agreement) is done by library staff. we retain the emails from faculty authors giving us permission to deposit.”

these results are consistent with other findings of low self-submission (royster, ; watson, ; salo, ). indeed, it's been suggested that self-submission may be discouraged due to administrator reluctance to relinquish control (dubinsky, ). considering our results, and the recent finding that self-submission may increase ir costs (burns, lana, and budd, ), we anticipate that mediated submission will continue to outpace self-submission.

as noted elsewhere in the literature, this lack of self-submission may result in very few authors actually seeing the sa (hanlon and ramirez, ; markland and brophy, ). indeed, one respondent noted that “so far, only some students and one faculty member are doing their own submissions. we do everything else”. therefore, it would appear that the legality of the sa may be undermined by the practice of mediated submissions. as no sa analyzed in this study mentioned mediated submission, it is unclear how this practice is being accommodated. due to this problematic issue, the authors did not attempt to craft a simplified sa from existing phrases, as this may inaccurately represent an ideal sa.

table. phrases that delineate the rights of the submitter.
concept | usage rate (n = ) | phrase
non-exclusivity, making no exclusive claim on the use of the material | % | “i hereby grant to [repository name here] and its agents the non-exclusive license to…”
the submitter or author retains all current ownership rights, including copyright | % | “you will retain your existing rights to your work, and may submit the work to publishers or other repositories without permission from [the institution].” or “you retain all of your ownership rights in all materials submitted to [the ir].”
leave the copyright with the submitter or author, specific to publication elsewhere and/or copyright transfer | % | “i retain all ownership rights to copyright of the thesis, dissertation or project. i also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project.” or “i understand that the work's copyright owner(s) will continue to own copyright outside these non-exclusive granted rights.”
option to choose an embargo and/or expiration date(s) | % | “i understand that accepted papers may be posted immediately as submitted, unless the submitting author requests otherwise or submits a revision.” or “i do not want my project(s) available to the larger scholarly community upon deposit in [repository name here]. i want my project(s) available only to the [repository name here] community, permanently or until (specify date) when it may be made available to the larger scholarly community.”

. . unique wording

emerging trends include characterizing the sa as a non-exclusive license, specifying the submitter's responsibility for obtaining rights or permissions for any content that they did not produce themselves or rights that were previously assigned to another entity, and confirming the right of the submitter to enter into the agreement. however, some sas accommodate a few unique circumstances.
for instance, two repositories submitted sas that are specific to student work and mentioned ferpa. as well, one respondent indicated that they “would like [the] option to separate [the sa] out from individual submissions so that [an] agreement can still be electronic but so we don't have to require it for each submission”. another sa addressed this very issue:

“in the event that you may, from time to time and subject to acceptance by [the university], deposit additional papers and other materials, this agreement shall be applicable to the additional deposit. a description of the additional papers and materials shall be provided by you and attached as an exhibit to this agreement.”

this is the only instance we received of one sa being used to cover multiple submissions. interestingly, this sa also addressed the assignment of responsibility after death. “[the university and ir] may assume custodial responsibility for previously accepted [works] orphaned by the death or dissolution of the submitter and not formally assigned to the custody of another agency.” these clauses address often overlooked issues that may merit future exploration.

table. phrases that imply specific promises to the submitter.
concept | usage rate (n = ) | phrase
digital preservation of the material | % | “to migrate the item (i.e., copy it) for preservation purposes.”
digital preservation of the material, detailing back-up and security | % | “you agree that [the ir] may, without changing the content, translate the submission to any medium or format, as well as keep more than one copy, for the purposes of security, backup and preservation”
the ir will not alter or change the work outside of the agreement | % | “[the institution] may not alter the content of the work.” or “[the institution] will clearly identify your name(s) as the author(s) or owner(s) of the submission, and will not make any alteration, other than as allowed by this license, to your submission.”

. conclusion

crafting a sa is often a balance between educating the submitter and protecting the institution. while it may be tempting to attempt comprehensiveness, it may not be advisable, or possible, to craft an all-inclusive sa. even a lengthy and complex sa may not accommodate the changing scholarly communications landscape, and unnecessary complexity, as noted by one respondent, deters submissions.

in recent years, irs have been under increasing pressure to manage a diverse array of material in repositories, including datasets and student work (newton, miller, and bracke, ). dataset sharing has been spurred by the ostp public access memo of , which called for “each federal agency with over $ million in annual conduct of research and development expenditures to develop a plan to support increased public access to the results of research funded by the federal government” (holden, ). these results are specified as both publications and datasets (rinehart, ). datasets are particularly confusing as copyright may, or may not, apply (enimil and rinehart, ). additionally, other legal responsibilities, such as export control laws, human subjects' privacy, and patent claims may apply. outside of federal pressure, students have readily adopted irs (passehl-stoddart and monge, ; rozum, thoms, bates, and barandiaran, ). however, the standard sa does not accommodate ferpa, which is necessary to properly ingest student work. when considering these new demands, it readily becomes apparent that attempting to encompass all potential circumstances in a single sa is improbable.

all of the sas were dense documents. this is appropriate if the reading audience is graduate students or faculty, but undergraduates may find the documents too difficult to understand. for instance, many sas had language that accounted for unlawful content: libelous, invading privacy, violating confidentiality, etc. some specified particular circumstances such as human subjects or interviews. various user groups, including younger students and international researchers, may not be overly familiar with these concepts. unfortunately, when submitters are faced with a complex and confusing sa, they may ignore it, decline to participate, or look to their peers for possibly misleading interpretations.

table. phrases that remind the submitter of specialized circumstances or other options.
concept | usage rate (n = ) | phrase
reference to sponsored research (or other contractual) obligations | % | “if the submission is based upon work that has been sponsored or supported by an agency or organization other than [the institution], you represent that you have fulfilled any right of review or other obligations required by such contract or agreement”
state law obligations | % | “this agreement shall be governed by and interpreted in accordance with the laws of the state of california.” or “this agreement will be interpreted and governed in accordance with applicable federal law and the laws of the state of maryland without reference to its conflict of laws rules.”
additional requirements for irb or human subjects research | % | “if your work includes interviews, you must include a statement that you have permission from the interviewees to make their interviews public.” or “no social security numbers are included in the work and all use of human subjects in support of the work has been cleared by the …university institutional review board.”
printing of the sa | % | “the above constitutes a legal agreement between you and [the university], so please print out a copy of this agreement for your records.”
creative commons licenses | % | “if you wish to attach a creative commons license to your work, please check which one”
student rights | n/a | “notice of submitted work as potentially constituting an educational record under ferpa: under ferpa ( u.s.c. § g), this work may constitute an educational record. by signing below, you acknowledge this fact and expressly consent to the use of this work according to the terms of this agreement. i further understand that i am waiving some of my rights to anonymity under the family education rights and privacy act of (ferpa) and am granting [the university] the right to display my name and year of birth in connection with any contribution of which i am an author or co-author.”

complexity of the sa can easily result in confusion. consider the basic purpose of sas: copyright. sas generally require that the submitter understand and know whether they have assigned their copyright to a publisher. this is typically stated by two different methods: the statement that the work is original and that any third-party works are accounted for properly. the distinction between original work and third-party work could be confusing should a submitter interpret the statements together: how can an (entirely) original work contain third-party work? similarly, is a co-author's work included in the original or treated as third-party work? co-authors may assume that as a shared copyright holder, every author must give permission in order to submit the work (personal experience, redacted for review). indeed, some legal counsel representatives have recommended this practice, possibly increasing this confusion (personal experience, redacted for review). similarly, many submitters may be confused about when they must seek permission from a third party and when they can simply provide a citation. while these issues may be readily transparent to those who commonly discuss copyright implications, they may not be so to the average submitter.
similarly, reminding submitters of their other obligations in all capital letters can deter them from participation. while it may be seen as an added emphasis, it can be taken as yelling or shouting.

the ideal sa is a simple, readily understandable document that requires no added emphasis. practices for simplifying the sa could include requiring the submitter to properly attribute or cite any work that is incorporated into the whole work, and if the copyright to the work has been assigned to another party, to request permission from that party. additionally, instead of listing all possible violations, it may be more beneficial to use the blanket ban of anything ‘unlawful’. this would protect the university without attempting to predict every future possibility. standardization of language between institutions would also make sas a great deal easier for mobile academics to understand.

however, a clear and simple sa may not cover the nuances that may be required for submitters to understand their rights and obligations. while this survey did not explore current educational resources that are associated with sas, it is common for librarians to provide additional education on these topics. these activities may benefit from being embedded in ongoing education practices. for instance, particular ‘unlawful’ circumstances may be better explained in annual responsible conduct of research training or in an faq. ongoing education, along with public availability, may increase submitter understanding. as one respondent noted: “we make the submission agreement available for people to see before they get into the submission process and we explain what it means”. because of this educational role, it is beneficial if the person who will be interpreting the sa for end-users, and providing accompanying education, be involved in its creation. more research within the communities of practice will reveal the most effective sa statements and accompanying education practices.
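one way to check whether a simplified sa actually reads more easily is to recompute the two readability measures reported in the results. the sketch below is a minimal illustration, not the procedure used in this study (the tool used to score the sas is not stated). the two formulas are the standard flesch reading ease and flesch-kincaid grade level equations; the syllable counter is a rough vowel-group heuristic assumed here for illustration, and the function names and sample text are hypothetical.

```python
import re


def count_syllables(word):
    """Rough estimate (an assumption): count vowel runs, discount a trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text):
    """Return (words, sentences, syllables) for a block of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words, sentences, syllables):
    """Higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Approximate U.S. school grade level needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical sample text, not taken from any surveyed agreement.
    sample = "You retain all rights. The work is original."
    w, s, syl = text_counts(sample)
    print(w, s, syl)
    print(round(flesch_reading_ease(w, s, syl), 1))
    print(round(flesch_kincaid_grade(w, s, syl), 1))
```

because the syllable heuristic is approximate, scores from this sketch may differ somewhat from those of dedicated readability tools; it is best used to compare drafts of the same sa against each other rather than against published thresholds.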
another consideration for the sa document is the practice of mediated submission. although the initial development of irs anticipated self-submission of material, these results bolster the observation that self-submission is relatively uncommon. the practice of mediated submission means that the sa may not be serving the purpose for which it was intended. perhaps a two-step workflow, with an initial agreement that the author signs to delegate the actual submission to another person, would resolve this dilemma.

from this brief exploration, we conclude that while the majority of sas cover the same common basic concepts, efforts to be comprehensive result in complex, dense documents, with variability in language and detail. if ir managers, library administrators, and library legal departments exchanged sa language and experiences, a crowd-sourced effort may rapidly optimize sa language. this would particularly benefit those that are striving to launch a new ir, as it would actively reveal any re-purposing of sa language. re-using language that has already been vetted may reduce or eliminate the need for a plethora of personnel, such as legal counsel, to contribute. alternatively, re-purposing sa language may not consider the changing scholarly communications landscape or unique local needs. regardless, sharing sa language would reveal the possibilities and pitfalls of current sa language.

acknowledgments

many thanks to jean bigger, of the illinois math and science academy, for revision and suggestions to the survey. this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

appendix a

e-mail message

“we'd really appreciate it if you could participate in a survey to document current practices in institutional repository submission agreements. if there are any questions, please contact rinehart. @osu.edu”. the survey link is below:
url: redacted
contact: rinehart. @osu.edu

institutional repository submission agreements survey

the intent of this survey is to document current practices in institutional repository submission agreements. participation in this survey is entirely voluntary. there are only thirteen questions and it should take less than twenty minutes. if there are any concerns regarding this survey please contact rinehart. @osu.edu.

. what approximate percentage of material in your institutional repository was self-submitted by depositors?
none, – %, – %, – %, – %

. if your depositors agree to a submission statement, please copy and paste it below:

. approximately how much and what type of material is currently in your institutional repository?
response columns: %, – %, – %, – %, – %
journal articles
theses/dissertations
undergraduate work
unpublished reports/working papers
books, chapters and sections
conference/workshop papers
multimedia/audio-visual materials
other special item types
teaching materials (lesson plans, syllabi, etc.)
datasets
software/computer code

page . ir submission agreement creation and changes

. please select all that apply to your institutional repository's submission agreement(s):
it is a click-through agreement
it is a paper agreement
there is no agreement required for submission
other, please specify

. what entities had input into the submission agreement(s)?
library administrators
university or college administrators
scholarly communications librarian
institutional repository manager/architect
legal council
other library personnel, not individually listed here
other campus-level entities, not individually listed here
other, please specify

references

andersen, r. ( ). cc by and its discontents: an oa challenge. library journal, ( ), .
barton, m. r., & waters, m. m. ( ). creating an institutional repository: leadirs workbook. massachusetts: mit libraries. retrieved from: http://dspace.mit.edu/bitstream/handle/ . / /barton_ _creating.pdf?sequence=
barwick, j. (may , ). building an institutional repository at loughborough university: some experiences. program: electronic library and information systems, ( ), – .
bastos, f., vidotti, s., & oddone, n. ( ). the university and its libraries: reactions and resistance to scientific publishers. information services & use, ( / ), – .
bergstrom, t. c., courant, p. n., mcafee, r. p., & williams, m. a. (january , ). evaluating big deal journal bundles. proceedings of the national academy of sciences of the united states of america, ( ), – .
blythe, e., & chachra, v. (september , ). the value proposition in institutional repositories. educause review, ( ), – .
british library.
(n.d.). ethos toolkit: a guide to using and participating in ethos. deposit agreements. retrieved from: http://ethostoolkit.cranfield.ac.uk/tiki-index.php? page=deposit+agreements budd, j. m. ( ). faculty publishing productivity: comparisons over time. college & research libraries, ( ), – . burns, s. c., lana, a., & budd, j. m. ( ). institutional repositories: exploration of costs and value. d-lib magazine, ( / ). http://dx.doi.org/ . /january -burns. center for research libraries (u.s.), & oclc ( ). trustworthy repositories audit & certi- fication (trac) criteria and checklist. chicago: center for research libraries. consultative committee for space data systems ( ). reference model for an open ar- chival information system (oais). washington, d.c: ccsds secretariat. creative commons. (n.d.). about the licenses. retrieved from: http://creativecommons. org/licenses/ crow, r. ( ). the case for institutional repositories: a sparc position paper. arl bi- monthly report. . dollar, d., king, l., knight, p., & leonard, p. (november , ). data mining on vendor- digitized collections [presentation]. charleston conference: issues in book and se- rial acquisitions. nc: charleston. dubinsky, e. ( ). a current snapshot of institutional repositories: growth rate, disci- plinary content and faculty contributions. journal of librarianship and scholarly communication, ( ), ep . retrieved from: http://dx.doi.org/ . / - . . duranti, l. ( ). the long-term preservation of the digital heritage: a case study of uni- versities institutional repositories. italian journal of library & information science, ( ), – . enimil, s., & rinehart, a. (december , ). can i copyright my data? [webinar]. present- ed for the association for library collections and technical services, a division of the american library association. gilman, i. ( ). library scholarly communication programs: legal and ethical consider- ations. oxford: chandos pub. hanlon, a., & ramirez, m. ( ). 
asking for permission: a survey of copyright workflows for institutional repositories. libraries and the academy, ( ), – . heath, f. ( ). documenting the global conversation: relevancy of libraries in a digital world. journal of library administration, ( ), – . http://dx.doi.org/ . / . holden, j. p. (february , ). increasing access to the results of federally funded sci- entific research. retrieved from: http://www.whitehouse.gov/sites/default/files/ microsites/ostp/ostp_public_access_memo_ .pdf jisc. (n.d.a) repositories support project: data re-use policies. retrieved from: http:// www.rsp.ac.uk/start/policies-and-legal-issues/re-use-policies/. jisc. (n.d.b) repositories support project: submission policies. retrieved from: http:// www.rsp.ac.uk/start/policies-and-legal-issues/submission-policies/. jones, c. ( ). institutional repositories: content and culture in an open access environ- ment. oxford: chandos. jones, r., andrew, t., & maccoll, j. ( ). the institutional repository. oxford: chandos publishing. kincaid, j. p., fishburne, r. p., rogers, r. l., & chissom, b. s. ( ). derivation of new read- ability formulas (automated readability index, fog count, and flesch reading ease formula) for navy enlisted personnel. research branch report – . naval air station memphis: chief of naval technical training. lynch, c. a. ( ). institutional repositories: essential infrastructure for scholarship in the digital age. portal: libraries and the academy, ( ), – . lynch, c. a., & lippincott, j. k. (september , ). institutional repository deployment in the united states as of early . d-lib magazine, , . markey, k., & council on library and information resources ( ). census of institutional repositories in the united states: miracle project research findings. washington, d.c: council on library and information resources. markland, m., & brophy, p. ( ). sherpa project evaluation, final report. 
manchester: cerlim (centre for research in library & information management) retrieved from: http://www.sherpa.ac.uk/documents/sherpa_evaluation.pdf newton, m. p., miller, c. c., & bracke, m. s. (january , ). librarian roles in institu- tional repository data set collecting: outcomes of a research library task force. collection management, ( ), – . http://dx.doi.org/ . / . . . odlyzko, a. ( ). the economics of electronic journals. first monday, ( ). http://dx.doi. org/ . /fm.v i . . passehl-stoddart, e., & monge, r. ( ). from freshman to graduate: making the case for student-centric institutional repositories. journal of librarianship and scholarly communication, ( ), ep . retrieved from: http://dx.doi.org/ . / - . . rawls, m. ( ). looking for links: how faculty research productivity correlates with li- brary investment and why electronic library materials matter most. evidence based library and information practice, ( ), – retrieved from: http://ejournals. library.ualberta.ca/index.php/eblip/article/view/ rinehart, a. (april , ). federal public access initiatives update [blog]. retrieved from: https://library.osu.edu/researchcommons/ / / /funding-for-data- science-research/ rlg/oclc working group on digital archive attributes, research libraries group, & oclc ( .). trusted digital repositories: attributes and responsibilities: an rlg-oclc report. mountain view, ca: rlg. robertson, w. c., & borchert, c. a. ( ). preserving content from your institutional re- pository. serials librarian, ( – ), – . royster, p. ( ). the institutional repository at the university of nebraska-lincoln: its first year of operations. oclc systems and services, ( ), – . http://dx.doi. org/ . / . rozum, b., thoms, s., bates, s., & barandiaran, d. ( ). we have only scratched the sur- face: the role of student research in institutional repositories in in creating sustainable community: the proceedings of the acrl conference. 
portland, or: association of college and research libraries, – retrieved from: http://www.ala.org/ acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/ /rozum_ thoms_bates_barandiaran.pdf salo, d. ( ). innkeeper at the roach motel. library trends, ( ), – . http://dx.doi. org/ . /lib. . . swanson, j., & rinehart, a. ( ). data in context: using case studies to generate a com- mon understanding of data in academic libraries. journal of academic librarianship., ( ). tedd, l. (ed.). ( ). institutional repositories. bradford, england: emerald group publishing. u.s. department of education, national center for education statistics ( ). chapter . digest of education statistics, (nces - ). u.s. department of education. (n.d.). family educational rights and privacy act (ferpa). retrieved from: http://www .ed.gov/policy/gen/guid/fpco/ferpa/index.html u.s. department of health and human services. (n.d.). understanding health information privacy. retrieved from: http://www.hhs.gov/ocr/privacy/hipaa/understanding/ u.s. department of state. (n.d.). overview of u.s. export control system. retrieved from: http://www.state.gov/strategictrade/overview/ university of nottingham ( , april ). opendoar email distribution service. re- trieved from: http://www.opendoar.org/tools/emailservice.html university of nottingham (january , ). opendoar; the directory of open access repositories. retrieved from: http://www.opendoar.org/index.html watson, s. ( ). authors and university irs. serials, ( ), – . 
content divide: africa and the global knowledge footprint

sponsored by: sig/iii

shimelis assefa, lis program, university of denver, sassefa@du.edu
abebe rorissa, department of information studies, university at albany, arorissa@albany.edu
daniel gelaw alemneh, digital libraries services, university of north texas, daniel.alemneh@unt.edu
kendra albright, school of lis, university of south carolina, kendraa@mailbox.sc.edu

abstract
the purpose of this panel is to discuss global knowledge output at a macro level, with a view to understanding the key inputs that foster scientific and research performance. here, knowledge production is limited to scientific and technical journals and patent registrations, which are used to gauge the performance of each region and continent the world over. greater emphasis will be placed on highlighting important indicators from the input side that help spur national research and innovation systems in africa. framing this gap as a “content divide,” panel members focus on key variables that help build the scientific and research capabilities of africa. the closely interrelated variables to be discussed include (1) access to the global knowledge base; (2) the role of higher education systems; (3) national, regional, and global research and education networks (rens); and (4) gross expenditure on r&d (gerd).

keywords
knowledge production, content divide, africa, higher education, innovation, sig/iii.

introduction
africa is the second largest continent in the world, with a population of a little over one billion living in about countries. according to the fifth edition of the guide to higher education in africa (international association of universities [iau], ), there are institutions of higher education in african countries.
not only is africa’s scientific and technical performance very low compared to other regions (king, ; may, ; tussen, ; teferra & altbach, ), africa also has limited access to critical content/knowledge, as evidenced by limited or no subscription to scientific and technical databases (zulu, ). it is this limited access to the global knowledge base, together with a very small contribution to it, that we construe as the “content divide.” this bi-directional challenge has both an “input” and an “output” dimension. to understand the divide, we used the number of (1) scientific and technical journals and (2) patent applications as indicators of knowledge production. research and development (r&d) activities by higher education systems and their affiliates, together with patent applications by universities and their respective countries, are often cited as a barometer of a nation’s scientific and innovation impact (king, ; may, ; powell & snellman, , p. ; vincent-lancrin, ). scientific and technical journals from isi’s master journal list for the science citation index, which has titles, are analyzed to provide a macro-level view of research outputs by africa in comparison to other regions. two additional datasets, i.e., data published by the patent co-operation treaty (pct) on the number of patent applications filed (world intellectual property organization [wipo], ) and data from the unesco institute for statistics ( ) showing the share of gdp devoted to r&d activities (gerd), were juxtaposed to find a reasonable explanation that can serve as a starting point for discussing the content divide (table ).

asist , october - , , baltimore, md, usa.

columns: continent; s&t jnl titles; % of titles; patents filed; % of patents; % of gerd
rows: africa; asia; europe; oceania; n. america; s. america

table .
s&t journals, patent applications, and r&d expenditure

the data in table show how africa fares in comparison to other regions. the key question for this panel is which variables on the input side contribute to innovation and scientific research activities. what are the key resources/inputs that help improve national research and innovation systems? how do we define key scientific and innovation capabilities in the context of africa? panelists address these questions by focusing on the following key variables/indicators, which we argue help bridge the content divide and thereby increase the knowledge production of the continent:
• access to the global knowledge base,
• the role of higher education systems,
• national, regional, and global research and education networks (rens), and
• gross expenditure on r&d.

issues to be discussed
in view of well-established facts, i.e., higher education systems as engines of knowledge production and economic development (altbach, ; arocena & sutz, ; marginson, ), and research outputs in scientific and technical fields together with patent applications as key indicators of innovation-related activities (acs, anselin, & varga, ; archibugi & coco, ), panel members will explore the above four key variables to find plausible answers to the following questions:
• what is the state of scientific and technical research outputs of african higher education systems?
• what key variables enhance the scientific and technical research performance of african higher education systems?
• what is the role of national, regional and global research and education networks (rens) in fostering an environment for africa to increase its knowledge production?
• to what extent does gross expenditure on r&d (gerd) improve research performance in africa?

panelists
dr.
kendra albright is associate professor at the school of library and information science, university of south carolina. dr. albright has wide-ranging international experience, and her research focuses on users and their social and cultural contexts. her work explores the individual and social contexts that generate problems to be solved and the way information and communication are used to solve those problems. drawing from information science, communications, psychology, public health and education, dr. albright will discuss the nature of the relationship between universities, government, and the private sector in enhancing the scientific and technical research activities of africa.

dr. daniel gelaw alemneh is a digital curator and project manager in the digital library division of the university of north texas libraries. academic libraries provide services to support the creation, organization, management, use and reuse of digital scholarship. dr. daniel will examine the critical factors that can be considered on the input side of building research capacity. he will provide a comparative analysis of r&d investment and the corresponding research productivity of individual countries and universities in africa and in other regions of the world.

dr. shimelis assefa is assistant professor in the library and information science program at the university of denver. his research interests include scholarly communication and measurement of knowledge production, value creation and organization-wide information systems, learning technologies, and health informatics. he will discuss the landscape of scientific and technical research outputs by african higher education systems vis-à-vis the contribution and access to the global knowledge footprint.

dr. abebe rorissa is associate professor in the department of information studies at the university at albany, state university of new york (suny). dr.
abebe’s research focuses on multimedia information organization and retrieval, measurement and scaling of users' information needs and their perceptions of multimedia information sources and services, and use/acceptance/adoption and impact of information and communication technologies (icts). he will discuss the role of national research and education networks in the context of africa as a means to foster more collaboration in accessing data, protocols, hardware, software, and laboratory instruments from other partnering organizations. he will take the ubuntunet alliance as a case to discuss how national, regional, and global research and education networks (rens) can tap into the global ren as well as share resources and expertise among themselves.

references
acs, z.j., anselin, l., & varga, a. ( ). patents and innovation counts as measures of regional production of new knowledge. research policy, ( ), - .
archibugi, d., & coco, a. ( ). a new indicator of technological capabilities for developed and developing countries (arco). world development, ( ), - .
arocena, l., & sutz, j. ( ). changing knowledge production and latin american universities. research policy, ( ), - .
castells, m. ( ). the university system: engine of development in the new world economy. in j. salmi & a.m. verspoor (eds.), revitalizing higher education (pp. - ). oxford: pergamon, published for the iau press.
delivery of advanced network technology to europe (n.d.). géant . european commission information society and media. retrieved from http://www.geant .net.
international association of universities ( ). guide to higher education in africa. th edition. basingstoke: palgrave macmillan.
king, d.a.
( ). the scientific impact of nations: what different countries get for their research spending. nature, , - .
mattoon, r.h. ( ). can higher education foster economic growth? chicago fed letter, .
marginson, s. ( ). higher education in the global knowledge economy. procedia - social and behavioral sciences, ( ), - .
may, r.m. ( ). the scientific wealth of nations. science, ( ), .
national science board ( ). science and engineering indicators . arlington, va.
teferra, d., & altbach, p. g. ( ). trends and perspectives in african higher education. in t. damtew, & p.g. altbach (eds.), african higher education: an international reference handbook (pp. - ). bloomington: indiana university press.
tusubira, f.f. ( ). creating the future of research and education networking in africa. retrieved from http://www.ubuntunet.net/publications.
tussen, r.j.w. ( ). africa’s contribution to the worldwide research literature: new analytical perspectives, trends, and performance indicators. scientometrics, ( ), - .
unesco institute for statistics ( , august). global investments in r&d. uis fact sheet, . retrieved from http://www.uis.unesco.org.
vincent-lancrin, s. ( ). what is changing in academic research? trends and prospects. in oecd, higher education , volume , globalization (pp. - ). paris: oecd publishing.
world intellectual property organization ( ). pct yearly review: the international patent system. geneva, switzerland: wipo.
zulu, b. ( , may ). bridging the scientific content divide in african universities. retrieved from http://goo.gl/radxs.
student as producer

the model united nations simulation and the student as producer agenda

simon obendorf and claire randerson

simon obendorf, senior lecturer in international relations, school of social sciences, university of lincoln, lincoln, lincolnshire, ln ts: sobendorf@lincoln.ac.uk (corresponding author)
claire randerson, senior lecturer in international relations, school of social sciences, university of lincoln, lincoln, lincolnshire, ln ts: cranderson@lincoln.ac.uk

biographies
simon obendorf was educated at the university of melbourne where he read for undergraduate degrees in political science and in law before completing a phd in international relations theory. he teaches and researches in the fields of international relations, postcolonial studies, and gender and sexuality.
claire randerson studied history and politics at the university of lancaster and international studies at the university of birmingham before taking up a lecturing position on the international relations degree at the university of lincoln. she teaches and researches in the fields of international relations, eu politics and genocide studies.

abstract
the authors of this paper introduced an assessed model united nations simulation as a core component of the undergraduate politics and international relations programmes at the university of lincoln. the authors use their experience of creating and delivering this module to reflect upon the institutional implementation of a student as producer agenda to guide curriculum development and pedagogy.
they conclude that many existing trends in the teaching and learning of politics and international relations are congruent with the emerging focus in british higher education on research-engaged teaching and learning and the development of students as producers of knowledge. they further suggest that these priorities are perhaps best implemented at degree programme level and that they should take greater account of a broad notion of internationalisation and the value of simulation-driven teaching and learning.

key words: model united nations, simulation, pedagogy, student as producer, research-engaged teaching, case-based learning, international relations, politics

the student as producer
reflecting contemporary concerns about the challenges facing higher education and a desire to recapture the nature of the university as a liberal humanist institution (neary and winn ), the university of lincoln-led student as producer project is funded by the higher education academy ( a) through its national teaching fellowship scheme ( b). in co-operation with the universities of sheffield, reading, warwick, oxford brookes, gloucestershire, wolverhampton and plymouth, the £ , project addresses concerns about the disconnect between research and teaching in higher education institutions and the notion of students as consumers and “passive recipients of knowledge” (ramsden : ). the focus of student as producer therefore is to rejuvenate the university project via a reconceptualisation of the relationship between research and teaching. it seeks to collaboratively engage students in the main function of academia, the production of knowledge.
central to the student as producer project therefore is the notion of research-engaged teaching and learning, defined by the university of lincoln as “a fundamental principle of curriculum design whereby students learn primarily by engagement in real research projects, or projects which replicate the process of research in their discipline. engagement is created through active collaboration among and between students and academics” (university of lincoln ). research-engaged teaching and learning has been designated a key institutional priority by the university of lincoln and represents an extension of the previous policy of research-informed teaching in which the connection between research and teaching resulted from curriculum content, enriched by an academic’s own research interests, transmitted to undergraduate students via lectures (university of lincoln : ). concerns about the potential negative effects of the research-informed teaching approach on student engagement have led lincoln to place research-engaged teaching and learning as the key paradigm around which curriculum design and delivery will be constructed in the future. this key priority reflects increasing interest in the wider academic community in reconfiguring the relationship between teaching and research at university level to enhance the student learning experience via student involvement in research (hattie and marsh ; neary and winn ; pascarella and terenzini ; zamorski ). central to the institutional priority of student as producer at the university of lincoln is its distinctive commitment to position research-engaged teaching and learning as a “unifying principle for its pedagogic practices” (university of lincoln : ), ensuring that the design and delivery of undergraduate programmes and modules engage with this principle throughout. 
at the same time, however, there is clear recognition that the concept and practice of research-engaged teaching and learning already exists both in the broader academic community and in existing teaching practice at the university of lincoln. in , the authors chose to introduce a version of the well-established model united nations simulation to the school of social sciences’ undergraduate programmes at the university of lincoln. model united nations is a compulsory part of the second-year assessed programme of the ba (hons) international relations. it is also open to second-year students studying other degrees in the school of social sciences. the existence of model united nations as an assessed module is a highly distinctive feature of lincoln’s international relations programme and a departure from the delivery of model united nations primarily as a co-curricular or extra-curricular activity at other institutions worldwide. delivery of the simulation as a compulsory module for level two international relations students means that it forms a core component of degree-level assessment and contributes to the overall degree classification of participating students. the decision to introduce a simulation-led module was influenced by increased evidence of the advantages of simulations in the delivery of political science and international relations programmes. simulations require students to learn and perform in interactive environments in which “it is the environment that is simulated … but the behaviour is real” (jones : ). as asal and blake ( ) have argued: simulations offer social science students an opportunity to learn from firsthand experience …this sort of experiential learning allows students to apply and test what they learn in their textbooks, and often helps to increase students’ understanding of the subtleties of theories or concepts and draw in students who can be alienated by traditional teaching approaches. 
by putting students in role-play situations where they need to make defensible decisions and often have to convince others to work with them, simulations also provide students with the opportunity to develop their communication, negotiation, and critical thinking skills, and in many cases, improve teamwork skills.

in this paper, we explore how simulation-led teaching and learning aids the delivery of the objectives envisaged by the student as producer project. we reflect on the contributions simulations can make in placing students and student production at the heart of the teaching and learning process. we also take the opportunity to examine how our experience of delivering a simulation can inform and enhance an institution’s understanding and implementation of the student as producer policy framework. some of the key areas in which simulations extend our understanding of student-led and research-engaged teaching are those of student voice and the need to embrace a broad notion of internationalisation.

the model united nations simulation
the model united nations simulation is a global phenomenon. according to the united nations association of the united states of america ( a,b), over , students worldwide participate in a model united nations simulation each year. having originated in the united states of america, the model united nations simulation has grown in popularity globally, especially since the end of the cold war (deutsche gesellschaft für die vereinten nationen ). the growth in membership and topical relevance of the united nations itself has been largely responsible for the worldwide dissemination of the model united nations project, providing the motivation for educators and students alike to embrace the popular model united nations simulation as a way of teaching and learning about transnational issues, global governance and diplomacy (muldoon : ; phillips and muldoon ).
the model united nations blends case-based instruction and investigation with aspects of problem-based learning (mcintosh : – ). participants are allocated specific roles as representatives of united nations member states or united nations observer states/bodies. after a period of preparation involving research on their allocated countries and designated policies, delegates participate in a strategically condensed simulation of the work of existing united nations bodies such as the security council, the general assembly or the economic and social council (ecosoc) (phillips and muldoon : ). the model united nations programme is best understood as an operational simulation, which is to say it seeks to simulate the work of an actually existing body and uses role descriptions and expectations of participants derived from the united nations itself. further, it encourages participants to engage with contemporary or historical events or issues of importance to the united nations system (muldoon : ). as a highly flexible simulation framework, the model united nations is suitable for delivery in a variety of contexts and at a wide range of learning levels. model united nations simulations are found in schools, colleges and universities around the world, and run variously as short duration in-class events, semester- and year-long programmes, and even major residential conferences, bringing together hundreds of participants from across nations, regions or the world. at many universities, the model united nations simulation operates as an extra-curricular activity open to the entire student body and staged by student clubs and societies, sometimes with the assistance or support of international relations programmes and academics. 
significant guidance is available on the logistics of organising successful conferences (endless and wolfe ): national united nations associations provide a wealth of teaching and learning resources to support model united nations programmes in their countries (united nations association of the united kingdom ; united nations association of the united states of america , c) and the united nations itself offers a range of model united nations resources through its website and other publications (barrs and juffkins ; united nations ). simulation in international relations teaching and learning despite its introduction prior to the adoption of the student as producer project at the university of lincoln (in ), it is clear that the module is aligned with many of the features and outcomes envisaged by the research-engaged teaching and learning and student as producer agendas. the decision to include an assessed version of model united nations on the international relations programme was informed by the extensive literature encouraging adoption of simulation models in international relations teaching and learning. this literature speaks of the desirability of broadening the range of teaching, learning and assessment methods in politics and international relations (hale ; ralph, head and lightfoot ; simpson and kaussler ). driving this concern is an awareness that students acquire skills and knowledge in various ways and that curricula and teaching practice should be designed to reflect this (kolb , ). fox and ronkowski ( ) draw on these insights to examine the learning styles of political science students. their research concluded that political science students learn in a diverse range of ways, highlighting a need for teaching methods to more fully reflect the diversity of learning styles among political science students, with potentially significant implications for improving knowledge and skills development, confidence building and student retention. 
the advantages of case- or problem-based learning and its applicability to the teaching of politics have been extensively surveyed by sarah hale ( , ). she maintains that “case based learning is an innovative teaching method that has a great deal to offer tutors and students in the social sciences, including increased inclusivity, deep learning, better retention of knowledge, development of critical and analytical skills, greater student interest and the development of key employability skills” (hale : ). research on the strengths of case- or problem-based learning validates the increasing use of simulations in undergraduate and postgraduate politics and international relations programmes. writing specifically about international relations courses, weir and baranowski ( ) emphasise the importance of simulations in promoting active learning and as a means of facilitating students’ ability to consider international politics from non-western perspectives. newmann and twigg ( ) argue that simulations enable students to experience and more fully understand theories, issues and concepts in international relations. they assert that “the simulation format provides students a better framework than do lecture notes for long term retention of important international relations concepts” (newmann and twigg : ). simpson and kaussler ( ) highlight the role of simulations in the development of key communication and analytical skills among students and the conveyance of empirical, issue-based knowledge and theoretical understanding. it is evident therefore that a broad consensus exists in the pedagogic literature about the value of simulations, such as model united nations, both in scaffolding student knowledge of global affairs and the politics of international organisation, and in developing key skills (research, negotiation, debating, public speaking, parliamentary procedure, etc.) (hazleton and jacob ; karns ; phillips and muldoon ).
model united nations at the university of lincoln

the lincoln model united nations takes place within a single-semester academic module and culminates in a day-long simulation of the united nations general assembly and two of its committees (the committees simulated vary from year to year according to the issues under debate). given the size of the student cohort, not all of the member states of the united nations are represented. each student is allocated a country by teaching staff in a way that maintains the proportional voting balance of the united nations’ regional blocs. students meet weekly in two-hour sessions, where vital materials are introduced, including briefings on the history and structure of the united nations, the functioning of the united nations regional bloc system, instruction in resolution writing, rules of procedure, etc. the two-hour teaching block facilitates a greater level of interactivity among students and staff and allows students to work in small groups on allocated tasks or skills development. these encompass the areas of public speaking, caucusing, negotiation, using the rules of procedure and giving in-class presentations on their country/issues. a concurrent fortnightly session is held in a computer laboratory to facilitate student research on their countries and the issues under debate. having familiarised themselves with their country and with united nations structure and procedure, each student produces a proposed draft resolution on a topic of relevance to their country. students then caucus among their fellow delegates in an attempt to reach a defined threshold of support for the inclusion of their resolution on the draft agenda. after this process is complete, a formatively assessed practice simulation of the general committee of the general assembly is held in which delegates debate and revise the ordering of resolutions on this draft agenda.
the formative nature of this process also enhances student ownership of the simulation and allows delegates to determine the topics addressed at the formally assessed final simulation. once the issues to be debated have been finalised, teaching staff offer detailed briefings on the specific subjects to be discussed, giving all students a baseline of knowledge from which to research and write about their countries’ positions on the chosen issues and to prepare for debate. the highlight of the module, a formal conference simulation, is held over an entire day, using the debating chamber of the lincolnshire county council. the position of chair is occupied by a member of the teaching team, with other staff members co-ordinating the activities of the secretariat, observing student performance for assessment purposes, and facilitating the smooth running of proceedings. several days after the simulation, students are invited to a debriefing session with teaching staff. evaluating the model united nations: the student as producer approach the decision to implement model united nations as an assessed module in the school of social sciences curriculum at lincoln was taken before the introduction of the student as producer programme and was informed by broader moves towards research-engaged and student-led teaching and learning in the british higher education sector. it was not, therefore, subject to the formal student as producer validation requirements envisaged in the current institutional policy framework (university of lincoln : ). it is therefore timely to consider the extent to which the lincoln model united nations operates to embed the principles envisaged by the student as producer project and in what ways the experience of delivering this unique module may contribute to future iterations of teaching and learning policy at both institutional and national levels. 
such a project is simplified considerably by the clarity of the key features of the student as producer programme delineated by its authors. these are:

• discovery: student as producer
• technology in teaching: digital scholarship
• space and spatiality: learning landscapes in higher education
• assessment: active learners in communities of practice
• research and evaluation: scholarship of teaching and learning
• student voice: diversity, difference and dissensus
• support for research-based learning through expert engagement with information resources
• creating the future: employability, enterprise, beyond employability, postgraduate
(university of lincoln, : )

in each of these areas it is possible to discern clear synergies between the institutional objectives of the student as producer agenda and the learning outcomes envisaged by the co-ordinators of the model united nations module. these are explored in greater detail in the sections that follow.

discovery

discovery is conceived of as a mode of teaching and learning that spans problem-, enquiry- and research-based learning. it envisages students working collaboratively to solve “challenging open-ended problems or scenarios” (university of lincoln, : ) under the guidance of teaching staff who function as facilitators of student learning. it also recommends the integration of practical use of disciplinary research methodologies to address “authentic research problems in the public domain” (university of lincoln, : ), again with support and instruction from both academic and library staff. in its blend of case-based instruction and investigation with aspects of problem-based learning (mcintosh : ), model united nations as an assessed module at the university of lincoln provides a clear example of research-engaged teaching and learning where “students learn as researchers [and] the curriculum is largely designed around inquiry based and problem solving activities” (university of lincoln : ).
model united nations requires undergraduate students to obtain and produce knowledge (often of cultures, nations and issues unfamiliar to them) using engaging and innovative methods and provides them with experience of independent research. students are also encouraged to conduct research in ways that emulate, as closely as possible, the practice of academic and foreign policy professionals. alongside library-based research, students have contacted foreign governments, made contact with diplomatic missions and interviewed diplomatic personnel, made use of non-governmental and intergovernmental organisations’ research findings, and in many cases have used the opportunity of travel abroad (whether personal or on university-organised study tours) to gather appropriate materials from overseas archives and sources. students are given skills in the management and evaluation of research sources through the assessment requirement of maintaining an indexed, annotated and comprehensive binder of research sources. this forms a valuable point of reference throughout the module and at the final simulation conference. it also serves as evidence of undergraduate research activity and engagement. students draw on their research in crafting country- and issue-specific policy position papers, and in preparing strategy, speeches and negotiation positions. this serves to link the craft of research to the practice of diplomacy and demonstrates its importance to students’ future professional careers.

technology in teaching

the student as producer agenda invokes a vision of digital scholarship, whereby information technology facilitates the teaching and learning process and shapes new intellectual relationships between teacher and student, and between students themselves. in this context, technology is an enabler of learning and performance. information technology and audiovisual support are integral to the delivery of model united nations.
students are required to disseminate their research outputs (country profiles, position papers, etc) via the virtual learning environment for the module hosted on the blackboard platform. this environment underpins the collaborative nature of the simulation, whereby each student’s individual performance and contributions are vital to the overall success of the simulation and inform other students’ participation. students also use computer-mediated communication technologies such as threaded discussion boards, social networks and email for caucusing, negotiation and collaboration. teaching staff design the blackboard site in order to encourage student participation. the site thus functions as an evidence base of student engagement and activity levels that can be taken into consideration in assessment. while virtual learning environments are a central complement to the simulation proceedings, they are not permitted to replace face-to-face contact, negotiation or debate. as with the real united nations, human interaction is vital (matthys and klabbers ). at the final simulation, audio and video of proceedings are recorded to a dvd for later playback and use in formal assessment and external examination procedures. space and spatiality one of the defining aspects of the student as producer model is its attention to the imbrication of teaching space and student experience. the model explicitly encourages staff to take the landscapes in which learning occurs into consideration in their planning and delivery of modules. such concerns are familiar to the teaching staff of model united nations, who have identified the significance of space and spatiality to student experience and utilise a diverse range of venues in the delivery of this module. the bulk of the module is delivered in a large, technology-enabled learning space. 
significantly, this space was one of a number of teaching rooms in the university that were remodelled as new learning landscapes (university of lincoln and degw ) and features open teaching space, collaborative table groups, facilities for small group work/brainstorming, as well as the usual range of conventional teaching tools (computer, projector, visualiser, etc). this space facilitates a high level of interactivity among students and staff. the student as producer project documentation speaks of the desirability of using venues such as “the library and elsewhere on and off campus to deliver enhanced teaching experiences” and to “engage with the community outside of the campus” (university of lincoln : ). for model united nations, important aspects of learning and student activity take place in the library and in computer laboratories. most significantly, the module culminates in a formal conference simulation held using the debating chamber of the lincolnshire county council. the use of this formal, horseshoe-styled debating chamber adds immeasurably to the student experience (see lincolnshire county council ). convening in such a venue contributes to the seriousness with which students approach the proceedings. the purpose-built debating chamber, equipped with microphones and large-screen closed circuit video projection of students as they address the chamber, facilitates discussion, negotiation and the efficient management of debate. assessment there are remarkable similarities between the guidance for assessment provided by the student as producer documentation and the assessment matrix adopted by the teaching staff for model united nations. the student as producer documents call for an approach to assessment that reflects the discovery mode of teaching and learning and which rewards research skills and outputs as well as creative problem solving. 
in the model united nations, assessment has been designed to evaluate and encourage the development of a range of skills and knowledge. the assessment matrix for the module comprises both formative and formally assessed components. students produce documents (including written country profiles, draft resolutions and amendments to resolutions), give speeches, debate, negotiate and caucus throughout the semester and are provided with feedback in both group and one-on-one tutorial sessions. there are four components of formal assessment of candidates’ performance: a formal written and researched country position paper (worth % of the overall grade); in-simulation participation (worth %); a collated, annotated and indexed binder of research sources (worth %); and a reflective essay in which the candidate is asked to link their experiences in the simulation to the theories and approaches studied in their degree programme (worth %). the nature of the module necessitates more consistent levels of student attendance and participation than many more conventional undergraduate modules. as a consequence, the authors have sought to ensure that assessment rewards and encourages continuous engagement and performance and completion of key tasks and stages throughout the module. the module also gives students some leeway in determining the subjects studied and the issues debated in the simulation. this emerges from the process of students researching their own country’s politics and international priorities and negotiating with each other for these to be included on the simulation’s agenda. a formative debate forum, simulating one of the united nations’ administrative committees, is built into the curriculum in order to facilitate this process. staff members have noted how this engages student interest and encourages student ownership of the module.
this fact exhibits a high degree of congruence with the student as producer model, which similarly calls for student participation in designing assessment and providing peer feedback. research and evaluation the student as producer programme calls for both curriculum development and learning styles to be developed in light of prevailing pedagogic research and to provide opportunities for students and staff to reflect on and disseminate their own teaching and learning experiences. again, this is an area where the lincoln model united nations has seemed to anticipate many of these preoccupations. certain formative and formally assessed tasks provide students with the ability to research, reflect and engage critically with their experience as learners. the reflective essay component of the module encourages students to reflect on how their model united nations experience has illuminated or informed other materials they have studied in their degree. they are also encouraged to reflect on the skills they have acquired through participation. participants are led to consider the extent to which the simulation and its outcomes reflect the ‘reality’ of diplomatic practice and united nations procedure. prior to the submission of this output, students are invited to a review session where they can share and discuss their experiences of the conference and the module. student feedback is collected at the end of the module and has been uniformly positive. in their feedback and evaluations, students have praised the module for developing their awareness of other cultures and countries, facilitating knowledge of the operation of international diplomacy and the work of the united nations, and providing them with key vocational skills. in this, the experience of both teaching staff and students bears out the conclusions drawn from academic studies of the value of simulations in higher education teaching and learning (and international relations studies more specifically). 
staff have used student feedback to inform continual improvement of the module and its delivery (one example is the introduction of a practice simulation session prior to the assessed conference). the lincoln model united nations is unique in its position as an assessed part of the core curriculum. teaching staff drew extensively on published pedagogic research on the value of simulations in higher education teaching and learning when developing the module. staff have also been involved in the dissemination of their pedagogic experience through the sharing of best practice and the preparation of research outputs.

student voice

it is in the area of student voice that the teaching model proposed by the student as producer programme and the nature of the model united nations programme might be seen to diverge, and this requires careful reflection. the student as producer project places emphasis on amplifying students’ voices and concerns and allowing greater levels of student input into the management and delivery of their own learning. model united nations, on the other hand, requires students to role-play, identifying, researching and representing established positions that in fact often diverge considerably from their own. in addition, the formally assessed nature of the module, as well as the strict rules of debate and conduct under which international diplomacy is conducted, can militate against providing students with excessive autonomy and influence over the module. staff are continually involved in balancing the competing demands of providing an accurate simulation of international diplomacy and the need to ensure equitable opportunities for participation and assessed performance.
however, while students may well be constrained by the demands of role-play and the requirements of the simulation itself, many of the issues that emerge for debate are clearly the product of students’ own preoccupations and concerns (while remaining relevant to the simulation framework). simulation role-play encourages students to examine their own prejudices and positions more thoroughly and to draw on a wider range of research materials. the requirement to represent a nation other than their own also ensures that students examine unfamiliar viewpoints and develop empathy for alternative voices and perspectives. in this, students have the ability to enrich their understanding and that of their peers. the cross-cultural awareness gained and the training in diplomatic protocol, formal debating standards and the processes of an intergovernmental organisation require students to encounter and engage with diversity and difference and to dissent respectfully and productively where appropriate. in this, the module scaffolds the student as producer programme’s concerns with developing new forms of citizen engagement. the collaborative nature of the module means that each student’s participation influences the success of the module. the absence of a particular country, or a disengaged delegate, has the clear potential to detract from the experience and performance of other delegates by denying them information, documentation, vital allies and opponents and by limiting the range of voices and positions aired. thus students see themselves as having a collegial responsibility to the student community to which the module is delivered. such concerns also form part of the student as producer agenda, which calls for the development of environments in which “students might support the learning of other students”.

support for research-based learning

library staff are valorised in the student as producer documentation.
similarly, library staff are an integral part of the delivery of the model united nations module. they provide expert training in research on diplomatic affairs and in the use of the sprawling and complex united nations research databases. they also provide key support to both students and staff in the identification of problem- and role-specific research information. much of this takes place in instructor-facilitated, research-focused laboratory/library sessions where students work collaboratively with teaching and library staff and each other (particularly in groupings of importance to the simulation) to identify, share, synthesise and absorb relevant research materials.

creating the future

one of the concerns of student as producer is to pay due regard to preparing the student for life after graduation. this spans inculcating broader graduate attributes as well as the provision of key vocational skills. it is clear that model united nations fulfils such a role. at the heart of the module is a far deeper concern with skills development than is usual in conventional undergraduate modules. students develop skills in the areas of public speaking, the practice and rules of formal institutional/parliamentary debate, report writing, the presentation of organisational and/or institutional views and opinions, collaboration, negotiation and caucusing, research and problem-based enquiry. it is worthy of note that many students use their model united nations research to scaffold other undergraduate research, such as in independent studies (ba honours dissertations) and in other subject-specific modules such as human rights, war crimes and genocide, and globalization and developing societies.
the centrality of model united nations to the international relations programme has also helped inform and shape other co-curricular activities such as student trips to the headquarters of intergovernmental institutions including the international monetary fund, the european union and the united nations itself. feedback also indicates that model united nations, and the skill-set it has fostered, features strongly in students’ applications for graduate employment or postgraduate course enrolment. evaluation frameworks: model united nations’ contribution to the student as producer objectives the student as producer programme may be evaluated at a range of levels: institutional, departmental, course/programme and module/unit. at the university of lincoln, a comprehensive evaluation framework exists to provide criteria for measuring the success and contributions of the student as producer programme as it is rolled out on an institution-wide basis. the opportunity therefore exists to explore the extent to which the implementation and delivery of the lincoln model united nations has contributed to meeting institutional objectives and entrenchment of the student as producer ideal. the discussion here naturally focuses on the desired objectives and projected longer-term impacts identified by the institution and those responsible for its embrace of student as producer. the student as producer evaluation framework (university of lincoln ) differentiates between internal and external institutional outcomes by which the project can be analysed. for instance, the framework calls for the measurement of tangible student research outputs, whether traditionally understood publications or student participation in academic conferences. the model united nations module’s embrace of student research leading up to and including the day conference contributes to this objective. 
the conference provides students with a controlled space in which to demonstrate the breadth and quality of their research, analysis and preparation. similarly, the range of extra-curricular model united nations conferences that exist at national, regional and global levels provides a strong platform which can be leveraged to showcase university of lincoln student research capabilities and scholastic achievements. elsewhere, the internal objectives identified by the evaluation framework call for a changed relationship between students and staff and heightened levels of student engagement. evidence from student feedback and module evaluation processes consistently reveals that model united nations is a highlight of many students’ undergraduate careers. the module enjoys high degrees of student engagement due to its unique blend of diplomatic simulation, student research and staff-led teaching. further, the fact that teaching staff also play roles in the simulation itself (as president of the general assembly, secretary-general, secretariat members, other delegations, etc.) means that students are required to form collaborative and interactive working relationships with teaching staff that are deeper and richer than is normal in traditional lecture/seminar style undergraduate teaching and learning. this qualitative improvement in staff–student interaction has a beneficial impact on the delivery of concurrent and subsequent modules in the programme, encouraging the development of collegiate working relationships that bridge the gap between teacher and learner. model united nations has also played a key role in contributing to many other internal institutional objectives. it forms a key feature of recruitment exercises and literature in the school of social sciences, providing a way of showcasing our embrace of the student as producer project and its unique curriculum to prospective undergraduates and their families.
the explicit focus on issues of diplomacy, international affairs and transnational organisation, the module’s requirements for students to represent and understand countries, opinions and policies other than their own, and the participation of students from a range of degree programmes all contribute to core objectives and desired graduate outcomes in the field of internationalisation. the high levels of student satisfaction in the module, evidenced in annual module evaluation surveys, have the potential to contribute to improved scores on national evaluation processes such as the national student survey. on the external front, the fact that model united nations is delivered as an assessed core module provides a key way of differentiating the university of lincoln’s social sciences programmes in general, and international relations programmes in particular, from those of its key competitors. model united nations as it is delivered at lincoln is one of the very few worldwide that is offered as a taught and assessed undergraduate module, providing scope for the university to be recognised as a key national and international leader in the higher education sector. it offers a key point of distinctiveness for the university and the school to leverage in their public relations, marketing and branding exercises. the involvement of local, national and transnational bodies in the definition and delivery of the programme indicates the extent to which the model united nations programme contributes to the student as producer project objective of having impact beyond the higher education sector. our key partner in this regard is the democratic services section of lincolnshire county council. the council generously provides access to its debating chamber for the running of the simulation each year as part of its broader educational outreach and democratic services projects.
similarly, the range of materials utilised from bodies such as the united nations associations of the united kingdom and the united states of america, as well as from the united nations organisation itself, is testimony to the ways in which the model united nations project enables teaching staff to draw on teaching and learning materials from non-traditional sources well beyond the higher education sector. the student as producer evaluation framework also looks forward to potential longer-term influences arising from adoption of this model. while model united nations has only been running for three years, some predictions in this area can be made. certainly the model united nations has encouraged students to meet in small learning groups in various locales around the campus, including it labs, library spaces and formal and informal learning landscapes. information technology, especially on mobile platforms, plays a key role in supporting these learning groups, and students move seamlessly between face-to-face interaction, group work, interaction on university-supplied virtual learning environments and engagement with each other in online social networks. these outcomes are envisioned by the student as producer project and it does not seem premature to point to the fact that model united nations has played a key role in bringing them into existence. similarly, many of the transformations in student attitude and experience predicted by the evaluation framework seem remarkably congruent with the experiences derived from the implementation of model united nations. the framework speaks of students being excited, stretched and engaged, all descriptions that apply to the students engaged in the model united nations module.
similarly, the framework speaks of encouraging the institution to develop unique, high-quality undergraduate teaching and learning, praised as best practice by regulators; comments that have been echoed by external examiners of the model united nations module.

conclusions

this institutional-level evaluation demonstrates the key values of the student as producer approach but also highlights the extent to which its successful delivery is dependent on institutional resourcing and infrastructure, collaborations with the wider community and the significant investments of staff time and enthusiasm in order to deliver innovative, research-engaged programmes. this latter point should not be taken lightly: staff involved in simulation-based teaching must be prepared to allocate significantly greater amounts of time and energy to the module than is the case in a traditionally conceived (lecture/seminar) module. this is particularly acute in the early stages of introducing and delivering such a module. as a consequence of this, it may be appropriate and practical to align degree programmes rather than individual modules/units with the requirements of the student as producer programme or to take a longer-term view of the broader implementation of student as producer requirements. not all modules can, or should, be delivered in the ways explored in this paper and not all modules will necessarily embed all of the principles of the student as producer framework. students learn in a variety of ways and staff may be more or less comfortable with different delivery techniques (especially where these involve major reconceptualisations of the relationship between lecturer and student). it is important that degree programmes reflect and reward this diversity, providing space for research-informed teaching (more traditionally conceived) as well as the delivery of required disciplinary knowledge. change, however, may not be as dramatic as many might perceive it to be.
as this paper has demonstrated, existing modules in a higher education context – especially those that have been introduced in the wake of the expansion of british higher education after and informed by new developments in pedagogic research – may already have high degrees of congruence with the outcomes and processes of student as producer. the lincoln model united nations, while not explicitly conceived of as an exemplar of student as producer teaching and learning, nonetheless operates to successfully embed the principles of the model in the school of social sciences undergraduate programmes. it is therefore possible that many other existing modules at institutions implementing student as producer (or programmes similar to it) may well already reflect many of its priorities. the student as producer documentation provides a convenient diagnostic framework for evaluating existing programmes, many of which may only require minor modification in order to be brought into alignment. the framework also guides academic staff to consider many recent developments in pedagogic practice and the study of higher education when conceiving and delivering academic programmes. however, the discussion here also indicates a number of areas which may require greater consideration in future revisions of the student as producer policy framework. most obviously, student as producer might pay far closer attention to the pedagogic value and challenges inherent in delivery of simulation-based teaching and learning, which currently do not feature in the documentation or analysis. as has been shown, simulation and role-playing have significant educational value. they support the acquisition of skills and curricular content as well as embedding core skills of academic and future vocational relevance, including knowledge production, research and analysis and collaboration.
the existing documentation may therefore require an expansion of student as producer’s existing notion of student voice in order to encompass the benefits of role-play in encouraging students to work in and represent unfamiliar paradigms, viewpoints and beliefs. the broader documentation may also wish to consider the value of simulation-based learning in providing a structure within which the student as producer goals may be realised. paralleling the introduction of student as producer has been a growth in awareness of the importance of internationalisation in higher education. the authors view internationalisation more broadly than simply a way of explaining the growing numbers of international students studying in the uk and justifying the attention paid to international student recruitment. internationalisation encompasses the need to prepare undergraduates (irrespective of their degree or national origin) for a culturally diverse and globally oriented workplace and for a world marked by the encounter with difference and an increasing demand for cross-cultural communication skills. model united nations embeds internationalisation in this broader sense and encourages students to regard internationalisation and intercultural awareness as key personal attributes gained through their studies. while to some extent this is reflected in the student as producer programme’s attention to employability and graduate attributes, the authors recommend that far greater emphasis be placed on the desirability of this broader notion of internationalisation throughout the various strands of student as producer. in a world where knowledge is produced and debated in global contexts, this has never been more relevant or necessary.
the reference librarian
journal homepage: http://www.tandfonline.com/loi/wref

helping lis students understand the reference librarian’s teacher identity

loriene roy (school of information, the university of texas at austin, austin, tx) and merinda kaye hensley (digital scholarship liaison and instruction librarian, university of illinois at urbana-champaign, champaign, il)

to cite this article: loriene roy & merinda kaye hensley ( ): helping lis students understand the reference librarian’s teacher identity, the reference librarian. to link to this article: http://dx.doi.org/ . / . . published online: apr .

material-mind-method: on the teaching of reference

the library and information science (lis) classroom setting for reference education is complex and busy. in a semester-long class, students are taxed with understanding concepts and acquiring skills. they take the first steps toward the challenging act of defining a reference source.
they begin to acquire close knowledge of many information sources as they evaluate their structure, compare their content, and extract relevant information by using them to answer simulated and real patron questions. in practicing the stages of the reference dialogue (i.e., the reference interview), lis students start to visualize their personal contributions to the reference encounter. adler ( ) suggests that we reassess the label of the reference interview by renaming our interactions the “reference dialogue” (p. ). the premise for this change in terminology is to construct an authentic space allowing for generative conversation, a balanced exchange where patrons not only bring their questions and experience but also bring substantial disciplinary knowledge. through all of these efforts, lis students become aware of the multiple roles of the reference librarian who serves as listener, inquirer, searcher, and provider of information. in , james elmborg argued, “the reference desk can be a powerful teaching station—more powerful, perhaps, than the classroom” (p. ). although the traditional reference desk has undergone changes in recent years with libraries shifting focus to virtual reference, the development of personal librarian services, and with some libraries closing their desks entirely, working with patrons in one-on-one reference interactions continues, and it is the responsibility of the lis educator to help students prepare for this changing reference environment. through learning how to enhance a reference consultation with intentional teaching strategies, lis students can promote learning among their patrons while creating an intellectual space for library patrons to personally connect with their librarian. elmborg proposes a student-centered pedagogy for the reference desk that mirrors the work of teaching in a one-to-one writing conference. lis educators can replicate this model by challenging their students to practice asking thoughtful questions; listening to patrons as they talk through the research process; and, perhaps most important, paying close attention to not overstepping the learning process by answering questions beyond the level of patron expertise.

contact: loriene roy, loriene@ischool.utexas.edu, school of information, the university of texas at austin, guadalupe street, suite # . , austin, tx . published with license by taylor & francis. © loriene roy and merinda kaye hensley

still, helping prospective reference librarians to craft their reference approach in a patron-centered manner is not enough. elmborg asserts, “above all, we need to oppose vigorously the notion that power and professionalism depend on maintaining special skills that librarians withhold from patrons or students” ( , p. ). in other words, lis educators need to help their students understand how to share their expertise in a manner that encourages library patrons to model similar ways of thinking and practicing as they develop from novice to experienced researchers. one way to accomplish this is for us to help lis students to actively cultivate their teacher identity during reference interactions and, in so doing, break down barriers of power between the librarian and patron. we propose three principles for lis students to consider as they strengthen their inner teacher for reference service. we ask them to (a) adopt a deep understanding of critical pedagogy and its impact on patron learning; (b) explore learning styles through the lens of diverse cultures; and (c) implement a critically reflective practice before, during, and after the reference conversation. adler reminds us that, “dialogue is at the heart of critical pedagogy, and it is at the heart of critical thinking” (p. ).
critical pedagogy (or critlib, as it is known in the library community) is a theory and approach to learning meant to guide the patron in questioning elements of power that influence beliefs and practices throughout political and social systems. one way to assist lis students in incorporating critical pedagogy into their reference dialogue is to encourage them to examine the effects of power on information structures. for the purposes of developing teaching skills, we must turn critical pedagogy back onto ourselves as lis educators. in other words, we need to make sure that we are not unintentionally reinforcing notions of power by the choices we make in teaching in our reference courses. for example, a reference desk can be seen as a place of power, creating a physical and metaphorical barrier that lis students can learn to easily fix by coming out from behind the desk to work with a patron side-by-side at a computer station. another troublesome practice that critical pedagogy can address is continually demonstrating the same resources for all patrons, as if a single database can solve each and every research question. as experts in finding information, we should be able and willing to model the alternative, helping our lis students learn to locate alternative voices represented in the scholarly conversation. our students, in turn, can learn to assist patrons during the reference dialogue in contextualizing the notions of authority. furthermore, new librarians should be cognizant that most often, our teaching role does not align with providing a specific answer for the patron. rather, we need to help our students learn how to work through a process of asking questions that will guide patrons through the research process.
these are just a few examples of how lis educators can incorporate the principles of critical pedagogy within their classrooms to guide reference students in more closely evaluating the role power plays during the reference dialogue. by actively working to break down barriers between themselves and their patrons, lis students will improve their teaching skills, increase communication, and, we hope, illuminate a clearer path to increased learning during the reference dialogue. how can lis educators and students learn more about critical pedagogy? there is a growing critlib community in librarianship, with a steadily increasing number of publications and conferences, as well as a weekly #critlib chat on twitter.

diverse learning styles

the second step in strengthening our students’ teacher identity is to help them better understand how the diversity of learning styles affects the way we approach the reference dialogue. a first step is to help students to examine their own assumptions about learning by completing a learning style assessment for themselves. we both use the kolb experiential learning model (kolb, ). even in a small graduate class, i have found that all four main kolb learning style preferences will be apparent; at least one student will prefer to start his or her learning with concrete experience, whereas others will prefer reflective observation, abstract conceptualization, or active experimentation. while kolb helps lis students become more aware of how they like to learn, it is also important to help them learn how to apply this information not only to their own learning but also to their interactions with their fellow classmates and to their library patrons. although kolb helps highlight diversity in learning styles, it is important to recognize that it does not fully address the cultural diversity that lis students and their library patrons bring to the reference dialogue.
lis students can learn from direct interactions with international students and with others from diverse backgrounds. for example, as an american indian woman, i often follow a protocol or etiquette in introducing myself that not only delineates aspects of my genealogy but also hints at my responsibilities and how i view the world. in joking, i sometimes say that i come from a land of corrupt politicians and bossy women. other american indians catch this inside joke when they realize that i am enrolled or listed on the roll of ojibwe indians of the white earth reservation and am a member of the minnesota chippewa tribe. by saying that i am mukwa or bear clan, i am disclosing that i am supposed to exhibit strength and courage (johnson, ). as with other native people, i might have a dark sense of humor, enjoy visual learning and learning by doing, and be less likely to self-promote. all of these characteristics illustrate the cultural influences that patrons and lis students bring to the reference dialogue.

critical reflective practice

one of the most common inclinations of new teachers in the classroom is to assert a “sage on the stage” presence by acting as a lecturer of knowledge instead of engaging students in their learning process. in the reference dialogue, this may translate to an authoritarian presence (e.g., a lack of awareness of body language or tone of voice) that can put up walls between a librarian and a patron. one way to help lis students counteract unintentional behavior is to assist them in learning how to adopt a critical reflection strategy, a strategy that has been discussed in the education literature and is increasingly used in assessment of information literacy instruction.
stephen brookfield ( ) defines critical reflection as a process through which a teacher considers how power can frame and distort the educational process and examines assumptions that, on the surface, may make teaching easier but ultimately impede learning. critical reflection is the glue that brings together critical pedagogy and an understanding of learning styles in the lis students’ “learning to teach” toolkit. establishing a routine of critical reflection can be as simple as advising lis students to take a few minutes after a reference dialogue to answer questions about the interaction. some questions to consider include the following: when did i feel connected/disconnected from the patron? was there anything about the interaction that made me feel anxious? what would i do differently if i had the chance for a do-over? what do i need to learn in order to improve my reference skills? did i adapt my knowledge to the patron’s level of understanding? in addition to collecting insights through self-reflection, brookfield offers three additional critically reflective lenses through which to view teaching: referring to the professional literature, seeking input from our patrons, and asking for feedback from our peers. how can critical reflection influence our work as reference librarians? ultimately, the focus for lis students should be on improving the reference dialogue across a spectrum of skillsets. brookfield points out the need to establish credibility with a patron but also emphasizes the need to refrain from blaming ourselves when teaching does not go as planned. in addition, critical reflection can also assist in controlling how we approach a reference dialogue so that we do not leave the outcome of an interaction to chance. a practice of critical reflection sets a foundation for examining a rationale for how librarians teach during a reference dialogue.
in reenvisioning brookfield’s theory for the reference dialogue, lis students learning about reference can lean on a theoretical underpinning to make informed decisions regarding how to approach working with patrons.

summary

lis students developing reference skills can be supported in understanding their identities as teachers by considering critical pedagogy and the effect of diversity on learning styles, and by engaging in self-reflection during and after the reference dialogue. this triad of concepts and skills is important as students navigate the potential power differential that occurs between the patron and the librarian. through learning how to welcome the patron as a partner in their research process, lis students will acquire behaviors that are supportive of and empathetic toward their patrons.

note

learn more about critical pedagogy at http://critlib.org/about/.

references

adler, k. ( ). radical purpose: the critical reference dialogue at a progressive urban college. urban library journal, ( ), – . retrieved from http://ojs.gc.cuny.edu/index.php/urbanlibrary/article/view/
brookfield, s. ( ). becoming a critically reflective teacher. the jossey-bass higher and adult education series. san francisco, ca: jossey-bass.
critlib: critical librarianship, in real life & on the twitters. retrieved from http://critlib.org/about
elmborg, j. k. ( ). teaching at the desk: toward a reference pedagogy. portal: libraries and the academy, ( ), – . doi: . /pla. .
johnson, b. ( ). ojibway heritage. lincoln, ne: university of nebraska press.
kolb, d. a. ( ). learning styles inventory: technical manual. boston, ma: mcber.
feature-finding for text classification

richard s. forsyth & david i. holmes, bristol stylometry research unit, department of mathematical sciences, university of the west of england, bristol bs qy, uk.
corresponding author: richard forsyth. email: forsyth_rich@yahoo.co.uk

[cite as: forsyth, r.s. & holmes, d.i. ( ). feature-finding for text classification. literary & linguistic computing, ( ), - . ]

"every man's language has, first, its individualities; secondly, the common properties of the class to which he belongs; and thirdly, words and phrases of universal use." -- samuel taylor coleridge ( [ ]).

abstract

stylometrists have proposed and used a wide variety of textual features or markers, but until recently very little attention has been focused on the question: where do textual features come from? in many text-categorization tasks the choice of textual features is a crucial determinant of success, yet is typically left to the intuition of the analyst. we argue that it would be desirable, at least in some cases, if this part of the process were less dependent on subjective judgement. accordingly, this paper compares five different methods of textual feature finding that do not need background knowledge external to the texts being analyzed (three proposed by previous stylometers, two devised for this study). as these methods do not rely on parsing or semantic analysis, they are not tied to the english language only. results of a benchmark test on representative text-classification problems suggest that the technique here designated monte-carlo feature-finding has certain advantages that deserve consideration by future workers in this area.

keywords: linguistic variables, minimum-deviance classification, monte-carlo methods, pattern recognition, stylometry, text categorization.
introduction

in their attempts to capture consistent and distinctive features of linguistic style, stylometrists have used a bewildering variety of textual indicators (see: holmes, ). in the majority of stylometric studies, however, the choice of which indicators (or `markers') to use in a given problem is left to the discretion of the investigator (e.g. dixon & mannion, ; matthews & merriam, ; merriam & matthews, ; holmes & forsyth, ). an advantage of this practice is that it allows the exercise of human judgement, and thus can sometimes save a time-consuming search for suitable descriptors. on the other hand, it also inevitably involves subjectivity. very often the choice of suitable linguistic markers is crucial to the development of an effective discriminant rule; but, being subjective, it may not be replicable on another problem. a further disadvantage is that each stylometrist typically has a `tool-kit' of favourite marker types which encompasses only a small fraction of those that might be used. the situation is similar in the related fields of multivariate pattern recognition and machine learning (everitt & dunn, ; quinlan, ): most studies begin by presuming that a suitable set of attributes or features has already been found. in text analysis this presumption is more than usually questionable. it is arguable, for instance, that mosteller and wallace ( [ ]), in their classic study of the federalist papers, brought a good deal of background knowledge to the task of finding features that would distinguish hamilton's from madison's writings, and that once they had discovered reliable verbal markers such as `upon' and `while' the game was almost over. as part of an automated inductive system, it would clearly be desirable for this part of the process to be less dependent on human expertise. for these and other reasons, a number of studies have appeared recently (e.g.
burrows, ; binongo, ; burrows & craig, ; kjell, ; ledger & merriam, ) in which the features used as indicators are not imposed by the prior judgement of the analyst but are -- at least to a large extent -- dictated by the texts being analyzed.

the main aim of the present paper is to advance this trend, by conducting a test of five different methods of textual feature-finding (three proposed by previous researchers and two newly devised) on a mixed set of text-classification problems. although no set of textual markers can be entirely free of preconceptions, the five methods of feature-finding tested here depend only minimally on human judgement. in addition, none of them presupposes that the text being analyzed is in english. a secondary aim is to show that categorization of quite short segments of text (shorter than most previous stylometrists have tried to classify) is feasible using a relatively simple algorithm -- provided that suitable features have been found.

thus this paper describes an experiment with a straightforward plan: a single classification technique is applied to test problems using five different types of textual marker. the main response variable is the proportion of correct classifications achieved on unseen test samples (each relatively short); the main factor of interest is the source of markers (with five levels). before describing these different marker sources, however, it is necessary to give a brief outline of ( ) the benchmark problems, and ( ) the classification algorithm used -- as both of these have novel aspects that will be unfamiliar to many readers.

the bristol benchmark suite

in many areas of computing, benchmarking is a routine practice. for instance, when compilers are tested for compliance to a programming-language specification, it is normal to apply them to a suite of benchmark cases and record any divergence from expected behaviour.
there is not room here to go into the pros and cons of benchmarking in any depth, except to acknowledge that sets of benchmarks do have drawbacks as well as advantages -- one disadvantage being that once a benchmark suite is widely accepted as standard, an incentive exists to optimize performance on that suite, possibly at the expense of other problem types. nevertheless benchmarking does have a role to play in setting agreed and objective standards. for example, it is arguable that in the field of forecasting, the work of makridakis and colleagues (e.g. makridakis & wheelwright, ), who tested a number of forecasting methods on a wide range of (mostly economic) time series, transformed the field -- leading to both methodological and practical advances. likewise, in machine learning, the general acceptance of the machine-learning database repository (murphy & aha, ) as a de facto standard, and its employment as the basis for extensive comparative tests (e.g. michie et al., ) has thrown new light on the merits and demerits of various competing algorithms.

although billion-byte public-domain archives of text exist, e.g. project gutenberg and the oxford text archive, stylometry currently lacks an equivalent set of accepted test problems. thus we have been forced to compile our own. for any deficiencies in it, we apologize; but it is hoped that the test problems described below will evolve in due course into something of value to the field as a whole. (it is fair to state that even in its present admittedly underdeveloped form the suite of problems described below already provides a more varied and exhaustive range of tests than any previously reported in the stylometric literature.)

the text-categorization problems in this suite were selected to fulfil a number of requirements.

  provenance: the true category of each text should be well attested.
  variety: problems other than authorship should be included.
  language: not all the texts should be in english.
  difficulty: both hard and easy problems should be included.
  size: the training texts should be of `modest' size, such as might be expected in practical applications.

the last point may need amplification. although some huge text samples are available, most text-classification tasks in real life require decisions to be made on the basis of samples in the order of thousands or tens of thousands, rather than hundreds of thousands or millions, of words. an enormous training sample of undisputed text is, therefore, something of a luxury. it was felt important that the method used here should be able to perform reasonably well without reliance on this luxury.

subject to these constraints, ten test problems were chosen. this suite (called here tbench ) contains three authorship problems, three chronology problems, two content-based problems, and two synthetic quasi-random problems. fuller information is given in appendix a.

pre-processing & other preliminaries

in order to impose uniformity of layout and thus reduce the effect of factors such as line-length (usually not an authorial decision and in any case very easy to mimic) all text samples were passed through a program called pretext before being analyzed. this program makes some minor formatting changes: tabs and other white-space characters are converted into blanks; runs of multiple blanks are converted into single blanks; and upper-case letters are converted into lower case. by far the most important change made by pretext, however, is to break running text into segments that are then treated as units or cases to be classified.

just what constitutes a natural unit of text is by no means obvious. different researchers have made different decisions about the best way of segmenting long texts and thus turning a single sequence of characters into a number of cases or observations. some have used fixed-length blocks (e.g.
elliott & valenza, ); others have respected natural subdivisions in the text (e.g. ule, ). both approaches have merits and drawbacks. because linguistic materials have a hierarchical structure there is no universally correct segmentation scheme. textual units could range from single lines or sentences at one end of the scale to chapters or even whole books at the other. generally speaking, smaller text units are too short to provide opportunities for stylistic habits to operate on the arrangement of internal constituents, while larger units are insufficiently frequent to provide enough examples for reliable statistical inference. the compromise adopted for the present study was to break all texts into blocks of roughly the same length. in fact, each block boundary was taken as the first new-line in the original text on or after the th byte in the block being formed. as a result, mean block size is always between and characters. such units will be referred to as kilobyte lines. the number of words per kilobyte line varies according to the type of writing. a representative figure for tbench as a whole is words per line. this is, in fact, the median word length per line of the five files nearest to the median overall line length (of bytes). the exact figure is unimportant, though it should be noted that each kilobyte line (almost invariably less than words) is very short in comparison with the size of text units that previous stylometrists have felt worth analyzing. in other words, this is an attempt to work with text units near the lower limit of what has thus far been considered feasible. evidence of this is provided by the two quotations below, made years apart. "it is clear in the present study that there is considerable loss in discriminatory power when samples fall below words". 
(baillie, )

"we do not think it likely that authorship characteristics would be strongly apparent at levels below say words, or approximately letters. even using word samples we should anticipate a great deal of unevenness, and that expectation is confirmed by these results." (ledger & merriam, )

further confirmation is provided in table , which collates information from a selection of stylometric studies, showing the length, in words, of text blocks that various researchers have tried to categorize. the numbers in the column labelled norm give the typical or recommended text size for the researcher(s) concerned, i.e. the size for which they have confidence in their methods. rows have been arranged in descending order of this norm. the range column gives the sizes, again in words, of the smallest and largest text block analyzed by the worker(s) concerned. it will be seen that there is wide variation. none the less, the size of text block that previous researchers have felt able to categorize is typically quite large, the median in the norm column being words. the smallest text segment ever given an attribution (so far as we know) is a -word poem analyzed by louis ule ( ), who also analyzed the next-smallest, of words, as well as some of the biggest.

table -- size of text blocks analyzed by various stylometrists.

  researchers               subject                        norm    range
  smith ( )                 elizabethan drama                      -
  ule ( )                   marlowe's writings                     -
  holmes ( )                mormon scriptures                      -
  butler ( )                sylvia plath's poems                   -
  merriam ( )               federalist papers                      [unknown]
  burrows ( )               bronte sisters                         -
  milic ( )                 jonathan swift                         -
  mosteller & wallace ( )   federalist papers                      -
  ledger ( )                platonic dialogues                     -
  matthews & merriam ( )    elizabethan drama                      [unknown]
  binongo ( )               nick joaquim's short stories           [unknown]
  elliott & valenza ( )     elizabethan poetry                     -
  thisted & efron ( )       shakespeare & contemporaries           -

it is very rare for anyone to attempt to classify a text segment of less than words; so the problem of classifying kilobyte-sized chunks, averaging around words and almost always less than words, must be regarded as a relatively stringent stylometric test. a method that is successful under these conditions thus deserves serious consideration.

classification by minimum deviance

the particular classification algorithm used is not the prime concern of this paper. for that reason it was decided to employ a relatively simple method for the trials reported here. we term this the `method of minimum deviance'. it is a variant of the nearest-centroid classifier, which is, in turn, related to the well-known and popular nearest-neighbour classifier ( -nnc): see, for example, dasarathy ( ). similarity-based methods of this general type have been shown in several empirical trials to be surprisingly robust (forsyth, ; aha et al., ; michie et al., ; mckenzie & forsyth, ). moreover, such methods are easy to understand and to implement.

basically, the minimum-deviance classifier, as implemented here, uses a training set of examples with known class membership to compute a centroid (multi-dimensional average) for each category.
then on a fresh or unseen case a measure of distance from (or equivalently, proximity to) each class centroid is computed and the current case is assigned to the category of the centroid which it most resembles. most such algorithms use a euclidean or city-block distance metric, but in the present case the `distance' measure used is termed deviance. it is computed as follows

$$\mathrm{deviance}(i, c) = \sum_j \frac{(x_{ij} - m_{cj})^2}{m_{cj} + \delta}$$

where $i$ is the current case, $j$ is a feature index, $c$ is a class code, $m_{cj}$ is the mean value of class $c$ on feature $j$ in the training set, and $\delta$ stands for the small positive constant added in the denominator. if the features are, for example, word frequencies then $x_{ij}$ is simply the number of times that word $j$ occurs in line $i$. (this relies on using lines, or blocks, of approximately equal length, as is done here.) this measure is asymptotically related to the chi-squared statistic,

$$\chi^2 = \sum \frac{(o - e)^2}{e},$$

with $m_{cj}$ being the expected value under the hypothesis that the instance belongs to category $c$. the constant $\delta$ in the denominator can be seen as a slight bias, downgrading the effect of infrequently occurring features, as well as avoiding division by zero. overall, minimum-deviance classification is a simple, fast and intuitively appealing technique which appears to give good results.

data-driven feature-finding

in this section we return to the main focus of the investigation by outlining three of the methods of data-driven feature-finding tested on the benchmark suite, namely those proposed by previous researchers. (the two novel methods need somewhat fuller discussion, and are covered in the next three sections.)

  letters (ledger & merriam, );
  most common words (burrows, );
  digrams (kjell, ).

the first and simplest method is simply to treat each letter of the alphabet as a feature, i.e. to count the frequency of each of the letters in each kilobyte line. at first glance this would seem not just simple, but simplistic.
however, several previous studies -- most notably ledger & merriam ( ), but also ule ( ) and ledger ( ) -- have reported surprisingly good results when using letter-counts as stylistic indicators. at the very least, it allows us to establish a baseline level of performance: more sophisticated feature sets need to outperform letter counting to justify their added complexity.

the second type of textual feature has been used by burrows ( ) as well as binongo ( ), among others, not only in authorship attribution but also to distinguish among genres. essentially it involves finding the most frequently used words and treating the rate of usage of each such word in a given text as a feature. the exact number of common words used varies by author and application. burrows and colleagues (burrows, ; burrows & craig, ) discuss examples using anywhere from the most common to the most common words. binongo ( ) uses the commonest words (after excluding pronouns). greenwood ( ) uses the commonest (in new testament greek). in the present study, the most frequent words in the combined training samples were used -- without exclusions. most such words are function words, and thus this approach can be said to continue the tradition, pioneered by mosteller & wallace ( [ ]), of using frequent function words as markers.

the third method tested here uses digram counts as features. kjell ( ) reported good results in assigning federalist essays written either by hamilton or madison to the correct authors using a neural-network classifier to which letter-pair frequencies were given as input features. the present method is a slight generalization of kjell's in that it uses character pairs rather than just letter pairs; so, for instance, digrams involving blanks or other punctuation marks may be used.
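The marker counts and the minimum-deviance rule described so far can be sketched together in a few lines of Python. This is an illustrative sketch rather than the authors' program: the function names are hypothetical, markers are counted as plain substrings (which fits letters and digrams but is only a rough stand-in for word counts), and the small constant in the denominator of the deviance is passed in as `delta` because its value is not legible in this text.

```python
from collections import defaultdict


def count_features(block, markers):
    """Count occurrences (overlaps allowed) of each marker string in one text block."""
    counts = []
    for marker in markers:
        n, start = 0, block.find(marker)
        while start != -1:
            n += 1
            start = block.find(marker, start + 1)
        counts.append(n)
    return counts


def train_centroids(blocks, labels, markers):
    """Return {class: mean count of each marker} over the training blocks."""
    sums, sizes = {}, defaultdict(int)
    for block, label in zip(blocks, labels):
        vec = count_features(block, markers)
        previous = sums.get(label, [0.0] * len(markers))
        sums[label] = [s + v for s, v in zip(previous, vec)]
        sizes[label] += 1
    return {c: [s / sizes[c] for s in sums[c]] for c in sums}


def deviance(vec, centroid, delta):
    """Sum over features of (x - m)^2 / (m + delta); delta is an assumed constant."""
    return sum((x - m) ** 2 / (m + delta) for x, m in zip(vec, centroid))


def classify(block, centroids, markers, delta):
    """Assign the block to the class whose centroid gives the smallest deviance."""
    vec = count_features(block, markers)
    return min(centroids, key=lambda c: deviance(vec, centroids[c], delta))


# Toy usage with made-up blocks; delta=0.5 is an arbitrary illustrative choice.
markers = ["a", "b"]
centroids = train_centroids(["aaab", "aaba", "bbba", "babb"], ["A", "A", "B", "B"], markers)
print(classify("abaa", centroids, markers, delta=0.5))
```

Because every block is assumed to be of roughly equal length, raw counts are compared directly with the class means, as in the text.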
another difference from kjell's work is that, instead of selecting letter pairs for their discriminatory ability, the commonest digrams in the combined training sets of the problem concerned were used, thus rendering this approach more directly comparable with that of burrows.

note that all three methods of feature-finding outlined above share four desirable properties: ( ) they are easy to compute; ( ) they are easy to explain; ( ) they are interlingual, i.e. they are not limited to english; ( ) they require no exercise of skill by a user but can be found quite automatically. these four properties also apply to the next two methods.

progressive pairwise chunking

to broaden the scope of this comparison somewhat, two novel feature-finding techniques were also tested. the first of these is here called progressive pairwise chunking. it attempts to avoid the artificiality of always using fixed-length markers (such as digrams or trigrams) while also allowing marker substrings that are shorter than words (e.g. an affix such as `ed ') or which cross word boundaries (e.g. a collocation such as `in the'). this is done by adapting a method first described (in different contexts) by wolff ( ) and dawkins ( ).

essentially the algorithm scans a byte-encoded text sequence, looking for the most common pair of symbols. at the end of each scan it replaces all occurrences of that pair by a newly allocated digram code. this process is repeated for the next most common pair and so on, till the requested number of pairings have been made. the program used here assumes that character codes from ascii upwards are free for reassignment (as is the case with tbench ), so byte codes from onwards are allocated sequentially. its output is a list of doublets. these are not always digrams, since previously concatenated doublets can be linked in later passes.
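The scan-and-replace loop just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in the study: the name `pairwise_chunk` is hypothetical, each merged symbol is kept as a Python string instead of being given a spare byte code, and the loop stops early once no pair occurs twice.

```python
from collections import Counter


def pairwise_chunk(text, n_pairings):
    """Return the doublets formed by repeatedly merging the commonest adjacent pair."""
    symbols = list(text)  # every symbol starts out as a single character
    doublets = []
    for _ in range(n_pairings):
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        (left, right), freq = pairs.most_common(1)[0]
        if freq < 2:
            break  # no pair repeats, so further merging is pointless
        merged = left + right
        doublets.append(merged)
        # Replace each occurrence of the pair, scanning left to right without overlap.
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                out.append(merged)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return doublets


print(pairwise_chunk("abababab", 5))
```

On the toy input the first pass merges `a` and `b`, and the second pass links two of the resulting doublets into a four-character chain, which is how longer substrings arise from later passes.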
thus the program can build up quite long chains, if they occur in the data -- identifying sequential dependencies of quite a high order (in a markovian sense) without demanding excessive computational resources. in particular, it does not need the huge but sparsely filled multi-dimensional matrices that would be required by a simple-minded approach to analyzing transition probabilities spanning more than a few items.

an extract from this program's output when applied to the namesake training data (containing poems by bob dylan and dylan thomas) is shown as table . for the sake of brevity, only the most common substrings are shown, plus a selection of less common strings that illustrate the potential of this method.

table -- example of substrings formed by progressive pairwise chunking.

  freqlist output; date: / / : :
  c:\bm \bd.trn bytes.
  c:\bm \dt.trn bytes.
  bytes. input files read.
  most frequent markers :
  `e `   ` t`   `th`   ` th`   `s `   ` a`   ` the`   `d `
  `in`   ` s`   `t `   ` the `   `, `   ` i`   ` w`   `an`
  `and `   ` and `   `ing`   `you`   ` you`   `ed `   ` to `   `wh`
  `'s `   `e, `   `s, `   ` the s`   `in the `   `ver`

it will be seen that, as well as pure digrams, this method tends to find common trigrams (e.g. `ver'), words (e.g. ` the '), morphemes (e.g. `ing', `'s ') and collocations (e.g. `in the'). some of the other substrings, such as `the s', do not fall naturally into any pre-existing linguistic grouping. as it turns out, dylan thomas is rather fond of following the definite article with the letter `s' -- a fact that more conventional methods of feature finding would not be able to exploit.

monte-carlo feature-finding

the fifth and last method of feature-finding tested in this study takes the idea implicit in the progressive chunking method (that it is desirable to employ a variety of marker substrings, both longer and shorter than words) to what may be thought its logical conclusion.
monte-carlo feature-finding is simply a random search for substrings that exist in the training data. here this process is implemented by a program called chisubs. this finds textual markers (short substrings) without any guidance from the user, merely by searching through a given set of training texts.

the operation of chisubs may be described very simply. firstly the program repeatedly extracts substrings of length s (where s is a random number between and l) from randomly chosen locations in the training text until n distinct strings have been found. next the best c of these substrings are retained and printed, where `best' means having the highest chi-squared score. chi-squared is used as an index of distinctiveness here because mcmahon et al. ( ) used it successfully for a similar purpose -- expected values being calculated on the basis of equal rates of usage.

in the experiments reported here, the values for the parameters mentioned above were as follows: l= , n= , c= . thus the program sought different substrings, of from to characters long, and then retained only the most discriminating of them.

[note: strictly speaking the program only dumps up to c substrings, since those with a chi-squared score less than k, the number of categories, are not kept; furthermore, substrings occurring less than times in the combined training text are also dropped. however, none of the benchmark sets used here actually gave rise to fewer than markers.]

table shows the result of running chisubs on training data from the federalist papers (see appendix a for details). it illustrates the sort of markers found by this process. only the best markers have been listed, to save space.

table -- marker substrings derived from feds data.

  chisubs output; date: / / : :
  gramsize =
  c:\bm \hamilton.trn bytes.
  c:\bm \madison.trn bytes.
  proportion in class = .
  proportion in class = .
  grams kept =

  rank  substring  chi-score  frequencies
  `pon`
  ` would`
  `there `
  ` wou`
  ` on `
  ` would `
  `up`
  `na`
  `owers`
  `partmen`
  `wers`
  `epa`
  `ould`
  `oul`
  ` on the`
  `ould `
  ` on`
  ` form`
  `court`
  `wo`
  `powers `
  `governm`
  `ou`
  `ernment`
  ` there`
  `presi`

[the grave accent (`) is used here as a string delimiter, since single and double quotation marks may occur in these text markers.]

to interpret this listing, it should be noted that, in the last two columns, the frequency of usage in hamilton's training sample comes before the frequency in madison's. as both authors are represented by almost the same amount of text, this can be read as saying that, for example, ` would' is a hamilton marker ( : ) while ` on the ' is a madison marker ( : ). thus even this simple printout provides some interesting information -- although many of these items appear to be what mosteller & wallace ( ) would call "dangerously contextual".

whether such reliance on contextual or content-bearing linguistic items is a weakness can be answered by empirical testing; but chisubs also suffers from two structural faults which, unless corrected, would detract from its appeal even if such markers prove effective in practice. firstly, as can easily be seen above, there is plenty of redundancy (e.g. ` would' as well as ` would '). secondly, these text fragments are often segmented at what seem to be inappropriate boundary points (e.g. `governm', which surely ought to be ` government'). the most notable example of improper fragmentation in table is `pon', which anyone who has studied the federalist problem will immediately realize is an imperfect surrogate for ` upon'.

how long is a piece of substring?
chisubs has no background knowledge: it knows nothing about words, morphemes, punctuation, parts of speech or anything specific to english or other languages. it treats text simply as a sequence of bytes. this lack of preconceptions is an advantage in that it could deal with other natural languages such as latin, artificial languages such as c++, or indeed non-linguistic material such as coded protein sequences, without amendment. but it has the disadvantage that it often produces substring markers which, to a user, appear to be truncated at inappropriate places. examples of this problem are `rpus' instead of `corpus' (from the mags data) and, most irritatingly, `pon' instead of ` upon' from the federalist samples.

so the program described here was modified to alleviate this problem -- without the need to introduce any background knowledge such as a lexicon or morphological rules (specific to english) that would reduce the generality of the method. the revised program, teff (text-extending feature-finder), picks short substrings at random by the same method as described, but each substring is `stretched' as much as compatible with the data as soon as it is generated and before being saved for evaluation.

the idea is that if a substring is embedded in a longer string that has exactly the same occurrence profile then retaining the shorter substring is an inadvertent and probably unwarranted generalization. for example, if `adver' happens always to be part of `advertise' or `advertising' or `advertisement' in every occurrence in a particular sample of text it seems a safer assumption that `advertis' characterizes that text than `adver', which could also appear in `adverbial' or `adverse' or `animadversion' or `inadvertent' -- which, with our knowledge of english, we suspect to characterize rather different kinds of writing.
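The sample, stretch and score cycle can be sketched in Python as below. This is a rough illustration, not the teff program itself, and several details are assumptions: all function names are hypothetical, the length bounds and the number of candidates are left as parameters because their values are not given here, `max_len=20` is an arbitrary cap, and "equal rates of usage" is read as an expected count proportional to the size of each class's training text.

```python
import random


def stretch(s, text, max_len=20):
    """Extend s at both ends while every occurrence in text has the same neighbour."""
    while len(s) < max_len:
        before, after = set(), set()
        start = text.find(s)
        while start != -1:
            end = start + len(s)
            # None marks an occurrence that touches the start or end of the text.
            before.add(text[start - 1] if start > 0 else None)
            after.add(text[end] if end < len(text) else None)
            start = text.find(s, start + 1)
        grown = s
        if len(before) == 1 and None not in before:
            grown = before.pop() + grown
        if len(after) == 1 and None not in after and len(grown) < max_len:
            grown = grown + after.pop()
        if grown == s:
            break
        s = grown
    return s


def chi_squared(s, texts):
    """Chi-squared score of s, expecting the same rate of usage in every class."""
    counts = [t.count(s) for t in texts]  # str.count counts non-overlapping matches
    total, size = sum(counts), sum(len(t) for t in texts)
    score = 0.0
    for observed, t in zip(counts, texts):
        expected = total * len(t) / size
        if expected > 0:
            score += (observed - expected) ** 2 / expected
    return score


def find_markers(texts, n_candidates, keep, min_len, max_len, seed=0):
    """Sample, stretch and rank substrings; texts holds one training string per class."""
    rng = random.Random(seed)
    found = set()
    for _ in range(100 * n_candidates):  # cap attempts so short texts cannot loop forever
        if len(found) >= n_candidates:
            break
        source = rng.choice(texts)
        length = min(rng.randint(min_len, max_len), len(source))
        start = rng.randrange(len(source) - length + 1)
        # Stretch within the class text the substring was drawn from.
        found.add(stretch(source[start:start + length], source))
    return sorted(found, key=lambda s: chi_squared(s, texts), reverse=True)[:keep]


sample = "we advertise by advertising an advertisement "
print(repr(stretch("adver", sample)))
```

In this toy sample `adver` is always preceded by a blank and always followed by `tis`, so it grows to ` advertis` and then stops, because the next character differs between occurrences.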
so teff employs a procedure that takes each proposed marker string and tacks onto it character sequences that always precede and/or follow it in the training text. the heart of this process is a routine called textend(s) that takes a proposed substring s and extends it at both ends if possible. an outline of its operation is given as pseudocode below.

    repeat
        if s is invariably preceded by the same character c then s = concatenate(c,s)
        if s is invariably followed by the same character c then s = concatenate(s,c)
    until s reaches maximum size or s is unchanged during loop

[note: originally `invariably preceded' (or followed) meant exactly that, but the process was rather slow, so current versions of this procedure actually stop looking after consecutive occurrences of the same predecessor or successor. this does not affect substrings that occur less than times, of course; and appears to make little difference to the rest.]

in teff, this procedure is only used within the same category of text that the substring is found in. for example, with the federalist data, if `upo' were found in the hamilton sample, as it most probably would be, then a common predecessor/successor would only be sought within that sample.

this is a simple but effective procedure which does seem to eliminate the most glaring examples of improper text fragmentation. as this is a rather subjective judgement, a specimen of the results of applying this procedure to a list of substrings produced by chisubs from the federalist data is given below as table . this shows each input substring, then a colon, then the resultant extended version of that substring -- both bounded by grave accents to show whether blanks are present before or after. thus,

    `deraci` : ` confederacies`

means that the th item was derived from the substring `deraci' which turned out always to be embedded within the longer string ` confederacies'.
from this listing it is hoped that readers will be able to appreciate how the program works and judge its effectiveness.

table -- examples of `stretched' substrings from federalist text.

  `upo` : ` upon `
  `pon` : `pon`
  ` would` : ` would `
  `there ` : ` there `
  ` on ` : ` on `
  `up` : `up`
  `na` : `na`
  `owers` : `powers`
  `partmen` : ` department`
  `wers` : `wers`
  `epa` : `epa`
  `ould` : `ould `
  ` on the` : ` on the`
  ` on` : ` on`
  ` form` : ` form`
  `court` : ` court`
  `wo` : `wo`
  `powers ` : `powers `
  `overnme` : ` government`
  `ou` : `ou`
  `ernment` : `ernment`
  ` there` : ` there`
  `presi` : `preside`
  ` cour` : ` cour`
  `nat` : `nat`
  `nmen` : `nment`
  `deraci` : ` confederacies`
  `dicia` : ` judicia`
  `he stat` : ` the stat`
  `heir` : ` their `
  `ed` : `ed`
  `cour` : `cour`
  `feder` : `federa`
  `nst` : `nst`
  `onsti` : ` constitu`
  `ve` : `ve`
  `e t` : `e t`
  `, would` : `, would `
  `pa` : `pa`
  `ep` : `ep`
  `dep` : `dep`
  `d by` : `d by `
  `ongres` : ` congress`
  `e` : `e`
  `the na` : ` the nat`
  `stituti` : `stitution`
  `xecutiv` : ` executive`
  ` by ` : ` by `
  ` govern` : ` govern`
  `execu` : ` execut`

it is hoped that readers will agree that expansions such as `partmen' to ` department', `dicia' to ` judicia', `he stat' to ` the stat', `ongres' to ` congress' and `heir' to ` their ' represent gains in clarity. teff, however, cannot eliminate short and apparently unsuitable substrings altogether. a case in point is the retention of `earn' among the mags markers as well as ` learn'. this brought to light an uncorrected scanner error (`iearner' in place of `learner') in the text from the journal machine learning, but even when this was corrected `earn' was retained as the training text also contained the proper name `kearns', which ensured that the substring `earn' was not in fact always preceded by the letter `l' in this file.
clearly this shows that the method is somewhat sensitive to misspellings, typographical errors and the presence of rare words or proper names. as it is desirable that a text classifier should be able to cope with at least some moderate level of spelling and/or typing mistakes, some consideration was given to the idea of modifying the system so that strings were extended if a high enough proportion ( % or %, say) of their preceding or succeeding characters were identical. such a program, however, would inevitably be more complex and slower than teff and would require a good deal of fine tuning, so it has been left as a future development. meanwhile, the results quoted in the following section were obtained with teff, which does, despite its simplicity, offer an improvement in intelligibility over the basic monte-carlo feature-finder (chisubs).

results

to recapitulate, textual features found by five different methods were tested on a range of text-categorization problems. these five methods will be referred to as shown in table . note that only the last type of marker is selected according to distinctiveness: the rest are chosen solely by frequency.

table -- types of textual marker tested.

  code number   name       brief description
  .             letters    letters of the roman alphabet
  .             words      most frequent words
  .             digrams    most frequent digrams
  .             doublets   most frequent substrings found by progressive pairwise chunking
  .             strings    most distinctive substrings found by teff program

thus this experiment has a simple -factorial design, with five levels on the first factor (source of text markers) and levels on the second (problem number). the latter is essentially a nuisance factor: the problems do differ significantly in difficulty, but this is of no great interest. the main response variable measured is the percentage of correct classifications made on unseen test data. mean values are shown in table .

table -- mean success rates on test data.

  source of textual marker   mean percentage success rate
  letters                    .
  words                      .
  digrams                    .
  doublets                   .
  strings                    .

these results appear in increasing order of accuracy, averaged over the test problems. it would seem that letters are less effective than the middle group of words, digrams and doublets, while strings are more effective.

to test this interpretation, a -way analysis of variance on these percentage scores was performed. as expected, there was a very highly significant main effect of problem (f , = . , p < . ). clearly some problems are harder than others. (in fact, namesake proved the easiest, with a mean success rate overall of . %; while augustan was the most difficult, with a mean success rate of . %.) more interestingly, there was also a highly significant main effect of marker type (f , = . , p = . ). in other words, even after allowing for differences between problems, the null hypothesis that all five marker types give equal success rates must be rejected. (this design does not permit testing for an interaction effect.)

to investigate the factor of marker type further, the effect of differential problem difficulty was removed by performing a -way analysis of variance not on the raw percentage success rates but on the deviations of each score from the mean for that problem, i.e. on the residuals. once again this revealed a highly significant effect of marker type (f , = . , p < . ). in addition, dunnett's method of multiple comparisons with a standard was performed (minitab, ). for this purpose, the success rate of digrams (the marker type giving the median mean score) was taken as a norm. using a `family error-rate' of . (i.e. with a % significance level overall) gave an adjusted error rate of . . at this level, scores obtained by letters were significantly different from those obtained by digrams (lower), as were scores obtained using strings (higher). the other two marker types (words and doublets) did not differ significantly from digrams in effectiveness.
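The residual step used above, in which each score is replaced by its deviation from the mean for that problem, can be illustrated with a short Python sketch. The function name `residuals_by_problem` is hypothetical and the percentages in the usage example are invented placeholders, not results from this study.

```python
def residuals_by_problem(scores):
    """Map scores[problem][marker] to its deviation from that problem's mean score."""
    residuals = {}
    for problem, by_marker in scores.items():
        mean = sum(by_marker.values()) / len(by_marker)
        residuals[problem] = {marker: value - mean for marker, value in by_marker.items()}
    return residuals


# Invented placeholder percentages, used only to show the arithmetic.
example = {
    "problem_a": {"letters": 60.0, "strings": 80.0},
    "problem_b": {"letters": 40.0, "strings": 50.0},
}
print(residuals_by_problem(example))
```

After this step every problem has a mean residual of zero, so what remains to be compared across marker types is no longer confounded with how hard each problem is.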
thus the appearance of a middle group consisting of words, digrams and doublets, with letters giving significantly worse results and strings significantly better, was confirmed. this is perhaps unsurprising -- at least with benefit of hindsight -- given the fact that letters implies using fewer attributes than the rest ( versus ) and that strings are preselected for distinctiveness whereas the other types of marker are selected only according to frequency. none the less, the fact that this preselection did not seem to give rise to overfitting was by no means a foregone conclusion.

in a further attempt to shed light on the effect of marker type as well as problem type, indicator variables (with values or ) were created and used in a regression analysis, with percentage success rate again used as the dependent variable. five of these binary variables indicated the presence or absence of a particular type of marker (as above); four of them indicated the type of problem (namely, authorship, chronology, subject-matter or random data). in addition, as the number of categories is clearly a determinant of how difficult any classification task is, the reciprocal of the number of categories was also computed ( /k) and used as a predictor variable. this variable gives the proportion of correct classifications expected by chance, assuming equal prior probabilities.

these ten variables (expected chance success-rate, five indicators of marker type and four indicators of problem type) were then supplied to minitab's stepwise regression procedure as explanatory variables for predicting success rate. the procedure halted after inclusion of three variables, giving the regression equation below, with an overall r-squared of . .

    percent = . - . *rand + *invcats + . *strings

this need not be taken too seriously as a predictive formula.
what it does reveal, however, is that only three of these variables were worth using (between them accounting for . % of the variance in classification success rate). these were, in order of inclusion: rand : whether the problem used random data or not; invcats : the reciprocal of the number of categories; strings : whether strings were used as markers. this can be interpreted as saying that, other factors being held constant: the random problems give success rates on average . % below other problems; teff strings give success rates . % above the expectation for other marker types; while /k is an estimate of the effect of varying k, the number of different categories, on percentage of correct classifications. for instance, using teff substrings on a non-random problem with only categories the success rate predicted by this formula is . %. of course the range of problems and markers used here is restricted, so no great weight should be given to the precise values of these regression coefficients; but this analysis does suggest that while random problems are much harder than the rest, the difference in difficulty between authorship, chronology and content-based classification problems is relatively minor. it also confirms the small but significant superiority of strings found by teff (the extended monte-carlo feature-finder) as compared with the other marker types.

. discussion

in this study we have attempted to examine the question of where stylometric indicators come from, a question that has, broadly speaking, only been answered implicitly by previous stylometric researchers. in doing so, we have arrived empirically at a preliminary `pecking order' among stylistic marker types based on results obtained on a benchmark suite of text classification problems.
this ranking suggests that letter frequencies are less effective than other sorts of textual markers which share with letters the desirable properties of being easy to compute and applicable to a range of languages, such as common words or digrams. thus researchers who employ letters in preference to common words or digrams in future stylometric studies may well be wasting information. the two novel types of textual marker tested in this paper performed creditably. we propose that they both merit serious consideration by future researchers in this area. doublets obtained by progressive pairwise chunking would appear to be at least as informative as common words or digrams. strings obtained by monte-carlo feature-finding gave significantly better results than the other types. while this study is limited in scope, our results do suggest that the emphasis on the word as the primary type of linguistic indicator may be counter-productive. of course this conclusion only carries force to the extent that the benchmark suite used here (tbench ) is adequate, and it can only be regarded as a prototype. nevertheless the very idea of using an agreed suite of benchmark problems to help provide an empirical perspective on various aspects of text classification is in itself, we believe, a contribution to stylometry; and while tbench is admittedly imperfect, it provides a starting point for future developments. we have also demonstrated here the feasibility of successfully classifying quite short text segments -- kilobyte lines, well under words long on average. this we hope will encourage further work on the precise relationship between likely classification accuracy and size of text unit. to date, so it would appear from table , stylometrists have `played safe' and used rather long stretches of text -- longer than strictly necessary. it would be useful to have more empirical data on this issue, to help future workers make an informed trade-off between size and accuracy. 
the present study at least provides a data point away from the norm. much else remains to be done. two avenues which we intend to follow concern: ( ) combining lexical markers such as used here with syntactic markers (as used, for instance, by wickmann, ) and/or semantic markers (as used, for instance, by martindale & mckenzie, ); ( ) investigating interaction effects between problem type (e.g. authorship versus chronology) and marker type. a more extensive benchmark suite might allow investigation of whether certain types of marker are better with certain types of problem. for example: are common words better in authorship studies while teff markers work better on content-based discriminations? there was some suggestion of this in the figures obtained here, but the results are inconclusive. an extended benchmark suite would permit serious investigation of such questions. «feature-finding for text classification» finally, more work is needed on feature-selection as well as feature-finding. by most standards, is rather too many variables for convenience. (even variables would be considered a large feature set in some quarters.) certainly, just showing a user a list of textual features, even if they are presented in order of distinctiveness, is unlikely to promote deep insight into the nature of the differences between the text types being studied. if, however, equivalent (or better) performance could be achieved with a reduced subset of markers then the development of an accurate classifier using them would have the beneficial side-effect of promoting deeper insight into the data -- possibly an even more valuable outcome than just having a good classification rule. initial experiments to this end, using a simple stepwise forward-selection procedure, have proved disappointing. 
it remains to be seen whether it is intrinsic to the nature of text that accurate classification requires the use of many indicators or whether a more efficient variable-selection algorithm, such as a genetic algorithm or simulated annealing (siedlecki & sklansky, ; reeves, ), would eliminate this problem. goldberg ( ) has argued that textual features have some inherent characteristics -- infrequency, skewed distribution, and high variance -- which together imply that simple yet robust classification rules using just a handful of descriptors will seldom if ever be found for linguistic materials. this is an interesting conjecture, which we believe is worth attempting to falsify. a variable-selection program, which could reduce the number of textual features needed for successful text classification from around to less than would settle this question. it would also be a useful text-analytic tool. moreover, a serious attempt to develop such a tool, even if it ended in failure, would have interesting implications, since it would tend to corroborate goldberg's thesis. we hope to pursue this line of investigation in future studies. «feature-finding for text classification» appendix a : details of benchmark data sets the text-classification problems that constitute tbench (text benchmark suite, edition) form an enhanced version of the test suite used by forsyth ( ). they also constitute a potentially valuable resource for future studies in text analysis. collecting and editing tbench has been an arduous chore, but enhancing and maintaining it could become a full-time job. already some of the problems of corpus management (aijmer & altenberg, ) have presented themselves. it is hoped that support to overcome such problems may in due course be forthcoming, so that successors to tbench may offer a genuine resource to the research community. ideally they could be made publicly available, e.g. 
on the internet, for comparative studies; but, as some of the original texts used are still under copyright, the best way of doing this will require legal advice. summary information about the selection of works from various authors and subdivision into training and test files is contained in section . here this is amplified by giving further details concerning the sources and sizes of the texts used in the benchmark suite. note: a policy adhered to throughout was never to split a single work (article, essay, poem or song) between training and test sets.

a. sources of benchmark data

authorship

namesake ( classes): poetry by bob dylan and dylan thomas. songs by bob dylan (born robert a. zimmerman) were obtained from lyrics - (dylan, ). in addition, two tracks from the album knocked out loaded (dylan, ) and the whole a-side of oh mercy (dylan, ) were transcribed by hand and included, to give fuller coverage. an electronic version of lyrics - is apparently available from the oxford text archive, but this was not known until after this selection had been compiled. further information about the oxford text archive can be obtained by sending an electronic mail message to archive@vax.oxford.ac.uk poems of dylan thomas were obtained from collected poems - (thomas, ) with four more early works added from dylan thomas: the notebook poems - (maud, ). most were typed in by hand.

ezra ( classes): poems by ezra pound, t.s. eliot and william b. yeats -- three contemporaries who influenced each other's writings. for example, pound is known to have given editorial assistance to yeats and, famously, eliot (kamm, ). a random selection of poems by ezra pound written up to was taken from selected poems - (pound, ), and entered by hand. it was supplemented by a random selection of pre- cantos, obtained from the oxford text archive. poems by t.s. eliot were from collected poems - (eliot, ), scanned then edited by hand.
a random selection of poems by w.b. yeats was taken from the oxford text archive. for checking purposes collected poems (yeats, ) was used. as is usual in machine learning the data was divided into training and testing sets. in the above cases this division was made by arranging each author's files in order and assigning them alternately to test or training sets. as this data is held (with a few exceptions) in files each of which contains writings composed by one author in a single year, this mode of division meant that works composed at about the same time were usually kept together; and, even more important, that single poems were never split between test and training files. the effect of this file-based blocking is presumably to make these tests somewhat more stringent than purely random allocation of individual poems to test and training sets would have been.

feds ( classes): a selection of papers by two federalist authors, hamilton and madison. this celebrated, and difficult, authorship problem -- subject of a ground-breaking stylometric analysis by mosteller & wallace ( [ ]) -- is possibly the best candidate for an accepted benchmark in this field. an electronic text of the entire federalist papers was obtained by anonymous ftp from project gutenberg at gutnberg@vmd.cso.uiuc.edu for checking purposes the dent everyman edition was used (hamilton et al., [ ]). here the division into test and training sets was as follows. author training paper numbers test paper numbers hamilton , , , , , , , , , , , , , , , , , , , , , , , , , , , , madison , , - - , , thus, for madison, all undisputed papers constitute the training set while the `disputed' papers constitute his test sample. this implies that we accept the view expounded by martindale & mckenzie ( ), who state that: "mosteller and wallace's conclusion that madison wrote the disputed federalist papers is so firmly established that we may take it as given."
for hamilton, a random selection of papers was chosen as a training sample with another random selection of papers as test set, giving test and training sets of roughly the same size as madison's.

chronology / english poetry

ed ( classes): poems by emily dickinson, early work being written up to and later work being written after . emily dickinson had a great surge of poetic composition in and a lesser peak in , after which her output tailed off gradually. the work included was all of a choice of emily dickinson's verse selected by ted hughes (hughes, ) as well as a random selection of other poems from the complete poems (edited by t.h. johnson, ). data was entered by hand.

jp ( classes): poems by john pudney, divided into three classes. the first category came from selected poems (pudney, ) and for johnny: poems of world war ii (pudney, ); the second from spill out (pudney, ) and the third from spandrels (pudney, ). all poems in these four books were used. john pudney ( - ) described his career as follows: "my poetic life has been a football match. the war poems were the first half. then an interval of ten years. then another go of poetry from to the present time" (pudney, ). here the task is to distinguish his war poems (published before ) from poems in two other volumes, published in and -- i.e. there are three categories.

wy ( classes): early and late poems of w.b. yeats. early work was taken as written up to , the start of the first world war, and later work as written in or after , the date of the irish easter rising, which had a profound effect on yeats's beliefs about what poetry should aim to achieve. same source as in ezra, above.

for these problems the classification objective was to discriminate between early and late works by the same poet.
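the file-based division described earlier for the authorship problems (arranging each author's files in order and assigning them alternately to two sets) can be sketched in python. this is a minimal illustration, not the program actually used: the function name and the file names are hypothetical, and the sketch assumes that sorting the names reproduces the intended ordering. because whole files are assigned, no single work can end up split between the two sets.

```python
def alternate_split(files):
    """Assign files, taken in sorted order, alternately to two sets.

    Whole files are assigned, so a work stored in one file is never
    divided between the two resulting sets.
    """
    ordered = sorted(files)
    set_a = ordered[0::2]  # 1st, 3rd, 5th, ... file in the sequence
    set_b = ordered[1::2]  # 2nd, 4th, 6th, ... file in the sequence
    return set_a, set_b


# Hypothetical file names, one file per year of composition.
files = [
    "yeats_1893.txt", "yeats_1889.txt", "yeats_1899.txt",
    "yeats_1904.txt", "yeats_1910.txt",
]
set_a, set_b = alternate_split(files)
```

which of the two sets serves as training data is left open in this sketch; it is a separate decision from the alternation itself.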
the division into test and training sets for emily dickinson and yeats was once again file-based: in these cases the files of each poet were ordered chronologically and assigned alternately (i.e. from odd then even positions in the sequence) to two sets. the larger of the two resulting files was designated as training and the smaller as test file. with john pudney, each book was divided into test and training sets by random allocation of individual poems. subject-matter mags ( classes): this used articles from two academic journals literary and linguistic computing and machine learning. the task was to classify texts according to which journal they came from. in fact, each `article' consisted of the abstract and first paragraph of a single paper. these were selected by taking a haphazard subset of the volumes actually present on the shelves in uwe's bolland library on two separate days, then scanning in (photocopies of) the relevant portions and editing them. the results were as follows. «feature-finding for text classification» literary & linguistic computing machine learning year articles words year articles words for both journals the articles from even years were used as the training sample and those from odd years the test sample. troy ( classes): electronic versions of the complete texts of homer's iliad and odyssey, both transliterated into the roman alphabet in the same manner, were kindly supplied by professor colin martindale of the university of maine at orono. traditionally each book is divided into sections or `books'. for both works the training sample comprised the odd-numbered books and the test sample consisted of the even-numbered books. the task was to tell which work each kilobyte line came from. randomized data augustan ( classes): the augustan prose sample donated by louis t. milic to the oxford text archive. for details of the rationale behind this corpus and its later development, see milic ( ). 
this data consists of extracts by many english authors during the period to . it is held as a sequence of records each of which contains a single sentence. sentence boundaries as identified by milic were respected. to obtain test and training sets a program was written to allocate sentences at random to four files. the larger two of these were then treated as training sets, the smaller two as test data.

rasselas ( classes): the complete text of rasselas by samuel johnson, written in . this was obtained in electronic form from the oxford text archive. for checking purposes, the clarendon press edition was used (johnson, [ ]). this novel consists of chapters. these were allocated alternately to four different files. files and became the training data; files and were used as test sets.

the inclusion of random or quasi-random data may need a few words in justification. the chief objective of doing so here was to provide an opportunity for what statisticians call overfitting to manifest itself. if any of the approaches tested is prone to systematic overfitting -- in the sense of exploiting random peculiarities in the training data -- then this last pair of problems will tend to reveal it. a success rate significantly below chance expectation on either of these data sets would be evidence of overfitting. our view is that, as a general rule, some `null' cases should form part of any benchmark suite: as well as finding what patterns do exist, a good classifier should avoid finding patterns that don't exist.

a. sizes of benchmark problems

problem categories kilobytes (training, test) dylans bob dylan dylan thomas , , ezra ezra pound t.s. eliot w.b yeats , , , feds alexander hamilton james madison , , ed emily dickinson to after , , jp john pudney: war poems poems from spill out poems from spandrels , , , wy w.b. yeats to after , , mags lit. & ling.
computing machine learning , , troy iliad odyssey , , augustan random sentences random sentences , , rasselas even-numbered chapters odd-numbered chapters , , «feature-finding for text classification» references aha, d.w., kibler, d. & albert, m.k. ( ). instance-based learning algorithms. machine learning, , - . aijmer, k. & altenberg, b. ( ) eds. english corpus linguistics. longman, london. baillie, w.m. ( ). authorship attribution in jacobean dramatic texts. in: j.l. mitchell, ed., computers in the humanities, edinburgh univ. press. binongo, j.n.g. ( ). joaquin's joaquinesquerie, joaquinesquerie's joaquin: a statistical expression of a filipino writer's style. literary & linguistic computing, ( ), - . burrows, j.f. ( ). not unless you ask nicely: the interpretive nexus between analysis and information. literary & linguistic computing, ( ), - . burrows, j.f. & craig, d.h. ( ). lyrical drama and the "turbid montebanks": styles of dialogue in romantic and renaissance tragedy. computers & the humanities, , - . butler, c.s. ( ). poetry and the computer: some quantitative aspects of the style of sylvia plath. proc. british academy, lxv, - . coleridge, s.t. ( ). biographia literaria. dent, london. [first edition, .] dasarathy, b.v. ( ) ed. nearest neighbour (nn) norms: nn pattern classification techniques. ieee computer society press, los alamitos, california. dawkins, r. ( ). hierarchical organisation: a candidate principle for ethology. in: p.p.g. bateson & r.a. hinde, eds., growing points in ethology. cambridge university press. dixon, p. & mannion, d. ( ). goldsmith's periodical essays: a statistical analysis of eleven doubtful cases. literary & linguistic computing, ( ), - . dylan, b. ( ). knocked out loaded. sony music entertainment inc. dylan, b. ( ). oh mercy. cbs records inc. dylan, b. ( ). lyrics - . harper collins publishers, london. [original u.s. edition published .] eliot, t.s. ( ). collected poems - . faber & faber limited, london. elliott, w.e.y. 
& valenza, r.j. ( ). a touchstone for the bard. computers & the humanities, , - . everitt, b.s. & dunn, g. ( ). applied multivariate data analysis. edward arnold, london. «feature-finding for text classification» forsyth, r.s. ( ). neural learning algorithms: some empirical trials. proc. rd international conf. on neural networks & their applications, neuro-nimes- . ec , nanterre. forsyth, r.s. ( ). stylistic structures: a computational approach to text classification. unpublished doctoral thesis, faculty of science, university of nottingham. goldberg, j.l. ( ). cdm: an approach to learning in text categorization. proc. th ieee international conf. on tools with artificial intelligence. greenwood, h.h. ( ). common word frequencies and authorship in luke's gospel and acts. literary & linguistic computing, ( ), - . hamilton, a., madison, j. & jay, j. ( ). the federalist papers. everyman edition, edited by w.r. brock: dent, london. [first edition, .] holmes, d.i. ( ). a stylometric analysis of mormon scripture and related texts. j. royal statistical society (a), ( ), - . holmes, d.i. ( ). authorship attribution. computers & the humanities, , - . holmes, d.i. & forsyth, r.s. ( ). the 'federalist' revisited: new directions in authorship attribution. literary & linguistic computing, ( ), - . hughes, e.j. ( ). a choice of emily dickinson's verse. faber & faber limited, london. johnson, s. ( ). the history of rasselas, prince of abyssinia. clarendon press, oxford. [first edition .] johnson, t.h. ( ) ed. emily dickinson: collected poems. faber & faber limited, london. kamm, a. ( ). biographical dictionary of english literature. harpercollins, glasgow. kjell, b. ( ). authorship determination using letter pair frequency features with neural net classifiers. literary & linguistic computing, ( ), - . ledger, g.r. ( ). re-counting plato. oxford university press, oxford. ledger, g.r. & merriam, t.v.n. ( ). shakespeare, fletcher, and the two noble kinsmen. 
literary & linguistic computing, ( ), - . makridakis, s. & wheelwright, s.c. ( ). forecasting methods for managers, fifth ed. john wiley & sons, new york. martindale, c. & mckenzie, d.p. ( ). on the utility of content analysis in authorship attribution: the federalist. computers & the humanities, , in press. «feature-finding for text classification» matthews, r.a.j. & merriam, t.v.n. ( ). neural computation in stylometry i: an application to the works of shakespeare and fletcher. literary & linguistic computing, ( ), - . maud, r. ( ) ed. dylan thomas: the notebook poems - . j.m. dent & sons limited, london. mckenzie, d.p. & forsyth, r.s. ( ). classification by similarity: an overview of statistical methods of case-based reasoning. computers in human behavior, ( ), - . mcmahon, l.e., cherry, l.l. & morris, r. ( ). statistical text processing. bell system technical journal, ( ), - . merriam, t.v.n. ( ). an experiment with the federalist papers. computers & the humanities, , - . merriam, t.v.n. & matthews, r.a.j. ( ). neural computation in stylometry ii: an application to the works of shakespeare and marlowe. literary & linguistic computing, ( ), - . michie, d., spiegelhalter, d.j. & taylor, c.c. ( ) eds. machine learning, neural and statistical classification. ellis horwood, chichester. milic, l.t. ( ). a quantitative approach to the style of jonathan swift. mouton & co., the hague. milic, l.t. ( ). the century of prose corpus. literary & linguistic computing, ( ), - . minitab inc. ( ). minitab reference manual, release . minitab inc., philadelphia. mosteller, f. & wallace, d.l. ( ). applied bayesian and classical inference: the case of the federalist papers. springer-verlag, new york. [extended edition of: mosteller & wallace ( ). inference and disputed authorship: the federalist. addison-wesley, reading, massachusetts.] murphy, p.m. & aha, d.w. ( ). uci repository of machine learning databases. dept. 
information & computer science, university of california at irvine, ca. [machine-readable depository: http://www.ics.uci.edu/~mlearn/mlrepository/html.] pound, e.l. ( ). selected poems. faber & faber limited, london. pudney, j.s. ( ). selected poems. john lane the bodley head ltd., london. pudney, j.s. ( ). spill out. j.m. dent & sons ltd., london. pudney, j.s. ( ). spandrels. j.m. dent & sons ltd., london. pudney, j.s. ( ). for johnny: poems of world war ii. shepheard-walwyn, london. quinlan, j.r. ( ). c . : programs for machine learning. morgan kaufmann, san mateo, california. reeves, c.r. ( ). modern heuristic techniques for combinatorial problems. mcgraw-hill international, london. siedlecki, w. & sklansky, j. ( ). a note on genetic algorithms for large-scale feature selection. pattern recognition letters, , - . smith, m.w.a. ( ). an investigation of morton's method to distinguish elizabethan playwrights. computers & the humanities, , - . thisted, r. & efron, b. ( ). did shakespeare write a newly-discovered poem? biometrika, ( ), - . thomas, d.m. ( ). collected poems - . j.m. dent & sons ltd., london. ule, l. ( ). recent progress in computer methods of authorship determination. allc bulletin, ( ), - . wickmann, d. ( ). on disputed authorship, statistically. allc bulletin, ( ), - . wolff, j.g. ( ). an algorithm for the segmentation of an artificial language analogue. brit. j. psychology, ( ), - . yeats, w.b. ( ). the collected poems of w.b. yeats. macmillan & co. limited, london.

a digital humanities reading list: part , skill building

liber’s digital humanities & digital cultural heritage working group is gathering literature for libraries with an interest in digital humanities. four teams, each with a specific focus, have assembled a list of must-read papers, articles and reports.
the recommendations in this article (the third in   the series) have been assembled by the team in charge of enhancing skills in   the field of digital humanities for librarians, led by caleb derven of the   university of limerick.   the third theme: skill building the recommended readings and tutorials in this post broadly focus on what   skills are needed for providing dh services in libraries and how library staff   can acquire these skills.   in the case of the former, we examined resources that resonated as   representative or evocative of what skills library staff might obtain allowing   them to participate in digital humanities work or practices. with the latter,   we’ve highlighted a few skills tutorials that provide practical instruction in   https://libereurope.eu/working-group/digital-humanities-digital-cultural-heritage/ https://web.archive.org/web/ /http://libereurope.eu/blog/dt_team/caleb-derven/ useful tools and skills for dh practice. of course, given the sheer plurality of   both web-accessible and published resources, this posting highlights a   sampling of what’s available. the working group’s zotero library , and items   specifically related to skill building within libraries, offers a surfeit of additional   starting places.   . coding for librarians: learning by example, andromeda yelton   this issue of library technology reports examines the contexts   of, the motivations for, and concrete examples of coding in   libraries. the chapters in the issue are notable for the range of   libraries represented (albeit in primarily north american settings),   from public to special to academic libraries. the chapters carefully   describe not only the what of coding (specific tools or approaches   used, the problems addressed by the coding, etc.) but also why   librarians should code, and through exploring political and social   dimensions of coding, outlines a sort of ethics of coding in   libraries. 
the issue makes a strong case for the active role of the   librarian in the creation of the digital library.   . using open refine to create xml records for wikimedia batch   upload tool: nora mcgregor   many of us working in dh or digital library projects that involve any   level of metadata clean-up, data munging or data transformations   have likely encountered open refine, a veritable panacea for many   data related issues. this blog post from the british library’s digital   scholarship department provides a comprehensive and detailed   description of a specific approach to uploading collection   metadata to wikimedia commons using open refine as a core   tool. the post highlights openness as both platform and tool.   . digital humanities clinics – leading dutch librarians into dh:   lotte wilms, michiel cock, ben companjen   this article describes a series of dh clinics run in academic and   research libraries in the netherlands aimed towards enabling   library professionals to provide services to students and   researchers, identify skill gaps and provide identifiable solutions   and to assist in automating daily work, echoing themes in the   library technology reports issue noted above. the librarians   involved in the project ran five dh clinics in and found that   the model of training collections librarians interested in dh all at   https://www.zotero.org/groups/ /liber_digital_humanities_working_group/collections/ zs ckrj http://dx.doi.org/ . /ltr. n https://britishlibrary.typepad.co.uk/digital-scholarship/ / /using-open-refine-to-create-xml-records-for-wikimedia-batch-upload-tool.html https://britishlibrary.typepad.co.uk/digital-scholarship/ / /using-open-refine-to-create-xml-records-for-wikimedia-batch-upload-tool.html https://hdl.handle.net/ / once worked very well, as you not only get the training part in order,   but also put a network in place.   . 
programming historian

as our first suggestion for dh-related tutorials, the programming historian provides lessons in a wide range of open skills, technologies and tools, from a variety of disciplinary perspectives, related to many data and content areas that librarians work with in dh contexts. the site covers a broad range of use cases that strongly reverberate with library dh work, from visualisation to textual analysis to gis and mapping contexts and digital publishing.

. library carpentry: what is library carpentry?

building on the lessons and approach of software carpentry and data carpentry, library carpentry could be viewed as an essential prologue before embarking on the deep dives of the programming historian lessons. the tools detailed in library carpentry’s lessons form the core of the work undertaken in many of the resources noted in this post.

. british library digital scholarship training programme

this collection of courses provided by the british library is aimed at librarians to provide them with an understanding of digital scholarship and to develop the necessary skills to deliver dh-related services. links are provided to all the slides and resources used in the training. the tools and approaches are consonant with resources noted above.

the skill-building team of the working group will be providing additional posts in the coming months that highlight both specific use cases faced in liber institutions and potential challenges in providing dh services.

https://programminghistorian.org/ https://librarycarpentry.org/ https://www.bl.uk/projects/digital-scholarship-training-programme

archived in anu research repository, http://www.anu.edu.au/research/access/

this is the submitted version of: dalvean, michael coleman ( ) ranking contemporary american poems, submitted for publication in the journal literary and linguistic computing.

this article was published online june in literary & linguistic computing as an advance access article. it is available online at: http://dx.doi.org/ . /llc/fqt

ranking contemporary american poems

michael coleman dalvean
australian national university
school of politics and international relations
michael.dalvean@anu.edu.au
january

ranking contemporary american poems – michael dalvean

introduction

the purpose of this paper is to examine what distinguishes a “professional” poem from an “amateur” poem. the central idea here is that professional poets are more likely than amateur poets to have grasped the basic skills associated with writing poetry and have therefore been able to produce poems of lasting quality. amateurs, on the other hand, are less likely to have mastered the basic required skills and are therefore less likely to have produced work of lasting quality. intuitively, we know that there are differences between the skills of amateurs and professionals in various fields and we are quick to make aesthetic judgments based on our raw subjective responses. however, the objective quantification of the factors that lead to such responses is rarely considered. by using computational linguistics it is possible to objectively identify the characteristics of professional poems and amateur poems. this way an objective basis for our subjective responses can be identified. the upshot of identifying the characteristics of high quality poems is that we can then come up with a means of placing poems on a continuum according to how much a poem exemplifies the characteristics of an amateur poem or, at the other extreme, a professional poem.
we can then use this continuum to rank professional poems and, in doing so, we can make some objective statements about which poems are “better”. there is a tradition of considering some poets as “minor” and others as “major” (eliot, ). placing poems on a continuum that is based on the extent to which poems possess the craftsmanship of a professional may be a step towards explaining why some poets are “greater” than others. thus, an important element of this paper is the creation of such a continuum using a corpus of contemporary american poets.

related work in computational linguistics
several computational linguistic approaches to the analysis of poetry have been made. rhyme and meter have been quantified (green, bodrumlu, & knight, ) and methods to classify poems according to individual authors and styles have been used (kaplan & blei, ). however, only two attempts have been made to isolate the variables associated with poetic talent. the first study to use computational linguistics to identify high quality poetry is forsythe ( ), which looked at the characteristics of english poems over the last years. the analysis here was based on a study group of poems that consistently appeared in recent anthologies. a control group was formed by selecting “obscure” poems, each initially published in the same year as one of the poems in the study group. the obscure poems had not subsequently appeared in an anthology. this resulted in a sample consisting of “successful” poems and “unsuccessful” or “obscure” poems matched by year of publication. the study found that the successful poems had fewer syllables per word in their first lines and were more likely to have an initial line consisting of monosyllables. it was also found that successful poems had a lower number of letters per word, used more common words, and had simpler syntax. thus, contrary to what we might expect, the more successful poems used simpler language.
in essence, poems that use language that is simple and direct are more likely to be reproduced in anthologies. the second study is that of kao and jurafsky ( ). this study used a study group of “successful” american poems, where success was defined as having been reproduced in the anthology contemporary american poetry (poulin & waters, ). they used a control group of amateur poems selected from an amateur poetry website (www.amateurwriting.com). in terms of effect size and statistical significance, the biggest difference was that the professional poets used words that were more concrete than the amateur poets. furthermore, the amateur poets were more likely to use perfect rhymes rather than approximate rhymes, more alliteration and more emotional words, both negative and positive. finally, professional poets tend to use a greater variety of words than amateur poets. that is, the number of different words in the professional corpus is greater than the number of different words in the amateur corpus. this is not to say that they use more complex words, merely that they use a greater variety of simple words.

an alternative approach
in this paper i attempt to extend the kind of analysis undertaken in forsythe ( ) and kao and jurafsky ( ). that is, i wish to determine what distinguishes a well-crafted poem from a less well-crafted poem. i use the same data as that used by kao and jurafsky ( ). however, i extend the analysis in two ways. firstly, i examine a broader range of linguistic variables than kao and jurafsky. the significant insight from kao and jurafsky’s ( ) analysis is that the concreteness of words is a far more important indicator of poetic quality than any of the characteristics we might usually associate with poetic craft, such as perfect end rhyme frequency or the type/token ratio.
therefore, if a search is made for linguistic characteristics using the types of variables that have been investigated in relation to language processing then there is the possibility that the insights gained by kao and jurafsky ( ) can be further extended. for this purpose i use linguistic variables derived from linguistic inquiry and word count (pennebaker, francis, & booth, ) and psycholinguistic variables from the paivio, yuille and madigan ( ) word norms. it will become apparent that this approach provides a further insight into the types of linguistic characteristics that distinguish professional from amateur poems. a second way in which i extend the analysis of kao and jurafsky ( ) is to use machine learning to develop a classifier. the idea here is that if there are characteristics that distinguish amateur from professional poems then it should be possible to classify a given poem as being more towards the amateur end of the spectrum or more towards the professional end. this being the case, it is also possible to rank individual poems according to their position on the spectrum. thus, given kao and jurafsky’s ( ) selection of professional poems it should be possible to rank them according to where they are on the spectrum. in this sense it is possible to state that, even among professional poets, some are better than others.

method

the data
the data consist of the poems used by kao and jurafsky ( ). of these poems, are professional poems drawn from contemporary american poetry (poulin and waters, ) and are amateur poems drawn from www.amateurwriting.com. the professional poems were written in the latter half of the th century by poets who have been members of the academy of american poets. in the poem corpus there are individual poets. the number of poems chosen from the anthology was in direct proportion to the number of poems the poet had in the anthology.
where a poem was over words it was removed and replaced by another poem by the same poet. the final selection of poems had an average of words (min = ; max = ) (kao & jurafsky, , p. ). the control poems were selected from www.amateurwriting.com which is a free website on which anyone is able to post their writing. of the available at the time of selection, were randomly selected and corrected for grammar and spelling. the average length of poems was words (min = ; max = ) (kao & jurafsky, , p. ).

note: i would like to thank justine kao for supplying me with the data used in the analysis.

the variables
the dependent variable in the analysis is a binary variable taking the value of if the poem is by a professional poet and if it is not. the independent variables are linguistic variables derived from two sources – linguistic inquiry and word count (liwc) and the paivio yuille and madigan ( ) word norms and their extension by clarke and paivio ( ). sixty-eight linguistic variables were derived from linguistic inquiry and word count (liwc). this program breaks text down into linguistic categories according to a specifically designed dictionary (pennebaker, francis, & booth, ). the categories used are based on common behavioural and cognitive processes and include negative emotion, affect, leisure, work, family, social activities and psychological processes. the categories were derived from lists of words empirically associated with each category. thus, the psychological processes category was derived from words developed from the positive affect negative affect scale (watson, clark and tellegen, , cited in pennebaker et al ), roget’s thesaurus, and standard english dictionaries. thus, with sixty-eight linguistic categories liwc captures a great deal of the linguistic content of a given text.
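as a rough illustration of dictionary-based category scoring of the kind described above, the following sketch counts how many tokens of a text fall into each category. it is not the liwc program: the three word lists are tiny hypothetical stand-ins for the real dictionary, the function name category_percentages is invented for this example, and expressing each score as a percentage of all tokens is an assumption.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary (category -> word list). The real LIWC
# dictionary is far larger; these entries are illustrative only.
CATEGORIES = {
    "article": {"a", "an", "the"},
    "insight": {"explain", "feel", "think", "know"},
    "affect": {"gentle", "terrible", "happy", "sad"},
}

def category_percentages(text, categories=CATEGORIES):
    """Return, per category, the percentage of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in categories}
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return {name: 100.0 * counts[name] / len(tokens) for name in categories}

# Seven tokens: two articles, one insight word, one affect word.
scores = category_percentages("The gentle rain. I feel the cold.")
print(scores)
```

each poem would receive one such score per category, giving one column of the data matrix per linguistic variable.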
an additional psycholinguistic variables were derived from paivio yuille and madigan’s ( ) word norms and the extension of these by clarke and paivio ( ). the paivio yuille and madigan ( ) and clarke and paivio ( ) (pymc) word norms are derived from a sample of nouns. for each word, linguistic and psycholinguistic variables were derived. some of these are structural, such as the number of letters and number of syllables. another set of variables was derived from subjects’ responses to the words by getting them to answer questions on a number of psycholinguistic dimensions. the variable “meaningfulness” was derived by asking subjects, for each word, how many associated words they could think of in seconds while the variable “age of acquisition” (aoa) was derived by asking subjects at what age they estimate they learnt each of the words. the result is that there are variables for each of the words that measure their structural and psycholinguistic properties. in order to illustrate how the poems were scored on each of these variables i shall use the “ease of definition” (def) variable. this variable was derived by asking how easy it was to define each of the words on a scale of (very hard) to (very easy). thus, for each of the words we have a def score. out of the word sample the word that was easiest to define was “baby” (score = . ) and the word that was the hardest to define was “gadfly” (score = . ). the average score for the words was . . words within this range were “vessel” ( . ), “warmth” ( . ), “alimony” ( . ) and “caravan” ( . ). to use the raw def scores to score poems, the first stage was to determine, for each poem, which of the words in the pymc sample were present. the average def score for each poem could then be calculated.
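the scoring step just described can be sketched as follows. the def values in the dictionary are placeholders chosen for illustration, not the published ratings, and the function name mean_norm_score is invented. the sketch also assumes that a word occurring several times in a poem is counted each time, which the description above does not settle.

```python
import re

# Placeholder ease-of-definition (def) ratings; NOT the published norms.
DEF_NORMS = {"baby": 6.5, "gadfly": 2.0, "caravan": 4.5}

def mean_norm_score(text, norms):
    """Average the ratings of the tokens in `text` that appear in `norms`.

    Tokens outside the norm sample are ignored. Returns None when the
    text contains no rated word.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    values = [norms[t] for t in tokens if t in norms]
    if not values:
        return None
    return sum(values) / len(values)

# "the" and "ridiculed" are not in the sample, so three words are averaged.
score = mean_norm_score("The baby ridiculed the gadfly's caravan", DEF_NORMS)
print(score)
```

the same function can be reused for every psycholinguistic variable by passing a different ratings dictionary.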
consider for example the sentence “the baby ridiculed the gadfly’s caravan”. in this sentence the words “the” and “ridiculed” are not in the word sample so they are not part of the calculation. the remaining words, “baby”, “gadfly”, and “caravan”, are in the sample and have scores of . , . , and . respectively. the sentence contains three words from the sample so the “def” score for the sentence is calculated as follows: ( . + . + . )/ = . . using this methodology we get a proxy for the average def (ease of definition) of words used in each poem. it is only a proxy because it is based on a word sample. the poems were scored on all psycholinguistic variables in the same way as described above for def. thus, the data consist of a corpus of poems with the professional poems scored as and the amateur poems scored as . for each of these poems there are linguistic variables derived from liwc and derived from the pymc norms.

machine learning
it is apparent that the number of variables under consideration is half the sample size. in traditional hypothesis testing this would be a problem. however, recent advances in machine learning have pointed the way towards making sense of situations in which there is a great number of independent variables. much of this approach has been developed in the context of gene sequencing in which it is not unusual to have a sample size of less than and yet the number of independent variables that need to be considered is several thousand. ultsch and kämpf ( ) give an example of a data set consisting of leukemia patients and variables. clearly there needs to be some way of selecting the variables that are likely to provide the best signal. the solution used in this paper is to use logistic regression with forward stepwise selection.
under this procedure variables are selected according to an algorithm that surveys all the independent variables and selects the independent variable that provides the best logistic fit for the dependent variable. this procedure continues until no additional variables can be found that add to the model’s ability to fit the data. clearly, this can lead to problems because it is possible that variables are selected due to their ability to learn the “noise” in the dataset rather than generalize. this is known as “overfitting” (hawkins, ). to prevent overfitting, an independent holdout sample can be used to check the generalization ability of the model at each of the steps in the stepwise procedure. the idea here is that the testing sample will be “held out” from the model building procedure and will only be used to test the generalization ability of the model at each stage of its development. typically, the generalization ability of a model rises with the first few independent variables added and then falls away as more independent variables are added. as independent variables are added the internal measures of model fit such as r² tend to rise consistently but the external generalization ability (that is, the ability to classify cases that were not used in the creation of the model – the “held out” cases) falls considerably after the first few variables are selected. the idea is to choose the model that maximizes the external generalization ability. it is important to specify the holdout sample correctly as it must at all times be separate from the sample of the data used to create the model. the idea here is that a certain proportion of the data p should be used to create the model and the remaining proportion 1 - p should be used to test that the model has not been overfitted. if the model is able to generalize then it should be able to correctly classify cases that were not used in creating it.
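a minimal sketch of forward stepwise selection checked against a holdout sample is given below. it is an illustration under stated assumptions rather than the procedure actually run for this paper: the logistic regression is fitted with a plain gradient-descent loop, "best logistic fit" is taken to mean lowest training log-loss, selection simply continues until every variable has been added instead of using a statistical stopping rule, and all function names and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    # weights[0] is the intercept; the rest pair up with the values in x.
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def fit_logistic(rows, labels, steps=500, lr=0.5):
    """Fit a logistic regression by batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def log_loss(weights, rows, labels):
    """Mean negative log-likelihood; lower means a better fit."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = min(1 - 1e-12, max(1e-12, predict(weights, x)))
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(rows)

def accuracy(weights, rows, labels):
    hits = sum((predict(weights, x) > 0.5) == (y == 1) for x, y in zip(rows, labels))
    return hits / len(rows)

def select(rows, cols):
    return [[row[c] for c in cols] for row in rows]

def forward_stepwise(train_X, train_y, hold_X, hold_y):
    """Return (best holdout accuracy, column indices of that model)."""
    remaining = list(range(len(train_X[0])))
    chosen = []
    best_acc, best_cols = -1.0, []
    while remaining:
        # Greedy step: add the variable that gives the best training fit.
        fits = []
        for c in remaining:
            cols = chosen + [c]
            w = fit_logistic(select(train_X, cols), train_y)
            fits.append((log_loss(w, select(train_X, cols), train_y), c, w))
        _, c, w = min(fits, key=lambda f: f[0])
        chosen.append(c)
        remaining.remove(c)
        # The holdout cases are only used to score the model, never to fit it.
        acc = accuracy(w, select(hold_X, chosen), hold_y)
        if acc > best_acc:
            best_acc, best_cols = acc, list(chosen)
    return best_acc, best_cols

# Toy data: column 0 separates the classes, column 1 is noise.
train_X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.4], [0.9, 0.8], [1.0, 0.2]]
train_y = [0, 0, 0, 1, 1, 1]
hold_X = [[0.1, 0.7], [0.95, 0.1]]
hold_y = [0, 1]
best_acc, best_cols = forward_stepwise(train_X, train_y, hold_X, hold_y)
print(best_acc, best_cols)
```

because a later step replaces the stored model only when it strictly improves holdout accuracy, ties are resolved in favour of the smaller model.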
using a “holdout” sample is one way of testing generalization and is a standard method of testing models in machine learning. another technique derived from machine learning is the use of an ensemble of models to increase the classification accuracy. the idea here is that averaging the outputs of several different models will likely increase the overall accuracy. this assumes that the errors of each constituent model in the ensemble are not correlated. one way to do this is to train different models on different subsets of the data. another way is to use different variables in each constituent model. in this paper the latter approach is the one used. before discussing the modeling process in detail it is worthwhile to consider a question that arises in relation to the studies that have been done with this data previously: why not simply use the logistic equation from kao and jurafsky’s ( ) analysis? the answer is that there is a problem with overfitting in any modeling and, although it is possible that their equation is not overfitted, in the absence of an independent test using a holdout sample or some similar method, it is always possible that the equation is overfitted to the data. in such cases the model does not truly generalize but instead “learns” the noise in the sample and is therefore not useful for actually classifying poems into professional and amateur. this is despite the fact that certain variables may have been identified as being important in such a classification scheme. there is a distinction between traditional hypothesis testing and machine learning. traditional hypothesis testing is based on the idea that the identification of statistically significant variables is the essential aim as it is required to develop theoretical explanations. the problem with such an approach is that it can lead to the identification of variables that have statistical significance but little discriminant power.
the central aim of machine learning, on the other hand, is classification so the discriminant power of the variables selected is crucial. the statistical significance of variables is not as important as whether they are able to increase the classification accuracy of the model.

modeling and results
the first stage of the modeling procedure is to divide the sample (n= ) into a training sample of n = and a testing sample of n = . the training sample will be used to create models using the stepwise procedure while the testing sample will be “held out” from the model building procedure and used only to test each model created at each step of the stepwise procedure. thus, of the amateur poems were randomly selected from the amateur poems and of the professional poems were randomly selected from the professional poems. the next stage of the process was to run the stepwise procedure using all linguistic variables. the stepwise procedure continued for iterations and then stopped. the best classification accuracy for the holdout sample occurred at step . this model consisted of two variables: article (e.g. “the”, “a”) and insight (e.g. “explain”, “feel”). both of these are liwc variables. the sensitivity was %, the specificity was % giving an overall accuracy of %. this yields a cohen’s kappa value of . which is highly statistically significant (test of ho: kappa= : z= . , p = . t.t.t.). parameter estimates for this model are presented in table .

table about here

the next model was created by removing the two variables article and insight from the pool of potential independent variables and running the stepwise procedure again. the stepwise procedure continued for iterations and then stopped. the best classification accuracy for the holdout sample occurred at step . this model consisted of two variables: affect (e.g. “gentle”, “terrible”) and cognitive mechanisms (e.g. “imagine”, “consider”).
both of these are liwc variables. the sensitivity was %, the specificity was % giving an overall accuracy of %. this yields a cohen’s kappa value of . which is highly statistically significant (test of ho: kappa= : z= . , p = . t.t.t.). parameter estimates for this model are presented in table .

table about here

the two variables affect and cogmech were removed from the potential pool of independent variables and the stepwise procedure was run again. however, subsequent models had a lower classification accuracy than models and . the summary accuracy and parameter estimates for models and are given in table .

table about here

the pymc variables were not selected by the search procedure in the creation of the first two models. in order to introduce them into the analysis a different search procedure was undertaken. all the liwc variables were removed from the potential pool and only the pymc variables were retained for subsequent model building. the idea here is that the stepwise procedure is a “greedy” search algorithm which takes, at each step, the variable with the greatest model fitting power. this means that some combinations of variables can be overlooked because some variables work best when combined with other variables, and such combinations may not be identifiable with individual sweeps of the data. the model building described above did not use any pymc variables because, as individual variables, the liwc variables performed better. by eliminating the liwc variables there is the possibility that some combination of pymc variables will be selected and, in combination with other pymc variables, perform well. thus, the same procedure as that enumerated above was undertaken but with only the pymc variables. that is, when the best model for a given iteration was identified, the constituent variables from that model were eliminated from the pool of potential independent variables and the procedure was run again.
the resulting models from this procedure are listed in table .

table about here

clearly, all the models created using the liwc variables (models and ) and those using the pymc variables (models , and ) are able to classify the holdout sample well beyond chance alone. the worst performing model is model and the cohen’s kappa for this model is . and this is well beyond chance (test of ho: kappa= : z= . , p = . t.t.t.). thus, we have five models each of which is able to classify the holdout sample (n = ) with an accuracy of between % (model ) and % (model ). the next stage is to average the results of all models to see if this increases the accuracy over that of the highest model in the ensemble. model has an accuracy of % and so the ensemble will only be considered an improvement if the ensemble classifies more accurately than this. the ensemble score is derived by averaging the logistic score for each case across the models. if the average is above . the case is scored as a while if the score is below . the case is scored as a . the result of the ensemble is a sensitivity of %, specificity of % giving an overall accuracy of %. the cohen’s kappa value for this result is . which is significantly above chance (test of ho: kappa= : z= . , p = . t.t.t.). thus, the accuracy of the ensemble of % is greater than the accuracy of any of the constituent models in the ensemble.

ranking the poems
the upshot of the preceding section is that we have an algorithm that is able to correctly classify poems as professional/amateur with an accuracy of % using linguistic variables. there are several applications for such an algorithm. for example, a publisher who needs a quick way of sorting through the voluminous submissions received on a weekly basis could first select a filtered list by running poems through such an algorithm. however, i wish to discuss a different application – the ranking of contemporary established poems.
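the ensemble scoring and the accuracy measures reported above can be sketched in a few lines. everything in the example is illustrative: the per-model scores are made-up numbers, the function names are invented, the cut-off of 0.5 is an assumption about the threshold, and a case whose average falls exactly on the cut-off is arbitrarily scored as amateur. the evaluation function also assumes that both classes occur in the sample.

```python
def ensemble_scores(model_scores):
    """Average the per-case scores across models (one inner list per model)."""
    n_models = len(model_scores)
    return [sum(case) / n_models for case in zip(*model_scores)]

def classify(scores, threshold=0.5):
    # 1 = professional, 0 = amateur; the 0.5 cut-off is an assumption.
    return [1 if s > threshold else 0 for s in scores]

def evaluate(predicted, actual):
    """Return sensitivity, specificity, accuracy and Cohen's kappa."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    n = len(pairs)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }

# Made-up logistic scores from three hypothetical models for four cases.
model_scores = [
    [0.9, 0.4, 0.2, 0.6],
    [0.8, 0.6, 0.1, 0.3],
    [0.7, 0.8, 0.3, 0.3],
]
actual = [1, 1, 0, 1]
predicted = classify(ensemble_scores(model_scores))
results = evaluate(predicted, actual)
print(predicted, results)
```

averaging the scores before thresholding, rather than taking a majority vote of the individual classifications, lets a confident model outweigh two uncertain ones.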
there is a tradition of regarding poets as “great” or “minor”. we tend to ignore the fact that some poets are not great or minor but are simply forgotten, as forsythe’s ( ) study emphasizes. t. s. eliot points out that there is a distinction between major and minor poets but that most people would disagree about which poets should be on which lists (eliot, ). the point of ranking poems using a classification scheme such as the one advocated in this paper is that such a method provides an objective measure of the likely subjective judgments of many individuals. the procedure is to use the ensemble classifier to give each of the established poems a score which can then be used to place them on a continuum from most professional to least professional. the score is simply the score derived by the ensemble classifier. that is, the score is the average logit score derived from the logit scores of the constituent models in the ensemble. the amateur poets are excluded from this comparison for the simple reason that their status is not in contention. however, it should be noted that there is no reason that we could not provide a score for the purposes of identifying amateur poets who are producing work of a professional standard. in this regard it is worthwhile noting that in the control group of amateur poets, there are with logit scores in the “professional” range of >. . of these , three score in the very high range of >. suggesting that these poems may be indicative of future poetic success. table lists the poems and authors in descending order of logit scores. the highest score is . for the poem working late by louis simpson. the lowest score is . for blackberry eating by galway kinnell.

table about here

the vast majority of the poems, out of , have scores in the “professional” range of >. . interestingly, of the poems score in the amateur range of <. .
in other words, there are poems that are more like amateur poems than professional poems. one explanation is that this is to be expected given that the classifier has a specificity of %. in other words, there will be up to % that are misclassified. the misclassified poems represent a misclassification rate of % which is within the expected error range. however, this interpretation has one important caveat in that when we compare the poets who have more than one poem in the corpus, there is a great deal of consistency in the classifications of their poems. of those poets who have more than one poem in the corpus, most show consistently high or low quality. for example, ai has two poems in the corpus, riot act april and twenty year marriage, which score in the high to very high range of . and . respectively. at the other extreme are galway kinnell and robert creeley who also have two poems each in the corpus but whose poems both score in the amateur range of <. . finally, there are poets who have poems in each of the high and low scoring categories. cd wright, for example, scores . for approximately forever and . for more blues and the abstract truth. carol frost has three poems in the corpus and these show great variation from . for sexual jealousy to . for the undressing and . for to kill a deer. in all there are six poets who straddle the two categories. given that there are poets with more than one poem in the corpus, the majority ( ) have poems in one category or another. thus, the that straddle two categories represent the exceptions rather than the norm. furthermore, where a single poet has more than one poem in the “amateur” range, this is not merely a result of the % error of the classifier but may indicate that the poems are in fact more like amateur poems than professional poems.
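the ranking and the per-poet consistency check discussed in this section amount to a sort and a grouping, sketched below. the titles, authors and scores are invented placeholders rather than values from the ranking table, the function names are hypothetical, and the 0.5 boundary between the two ranges is an assumption.

```python
from collections import defaultdict

def rank_poems(poems):
    """Sort (title, author, score) records from highest to lowest score."""
    return sorted(poems, key=lambda poem: poem[2], reverse=True)

def straddling_poets(poems, threshold=0.5):
    """Return the authors who have poems on both sides of the threshold."""
    sides = defaultdict(set)
    for _title, author, score in poems:
        sides[author].add(score > threshold)
    return sorted(author for author, seen in sides.items() if len(seen) == 2)

# Invented records: (title, author, ensemble score).
poems = [
    ("poem a", "poet x", 0.91),
    ("poem b", "poet x", 0.78),
    ("poem c", "poet y", 0.83),
    ("poem d", "poet y", 0.22),
    ("poem e", "poet z", 0.35),
]
ranked = rank_poems(poems)
mixed = straddling_poets(poems)
print([title for title, _author, _score in ranked], mixed)
```

a poet whose poems all fall on one side of the boundary never appears in the second list, which is the consistency pattern the text describes for most poets with more than one poem.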
conclusion
in this paper i have extended the work of kao and jurafsky ( ) in three ways. firstly, i have examined a greater number of linguistic variables and in the process i have identified a number of variables that have not previously been linked with poetic skill. secondly, i have created an ensemble classifier consisting of models. the classifier has a holdout sample accuracy of %. finally, i have used the classifier to rank a corpus of contemporary american poems. this ranking is an objective means of determining which poems are more like amateur poems and which are more like professional poems.

bibliography
clarke, j., & paivio, a. ( ). extensions of the paivio, yuille and madigan ( ) norms. behavioral research methods, ( ), - .
eliot, t. ( ). what is minor poetry? the sewanee review, ( ), - .
forsythe, r. ( ). pops & flops: some properties of famous english poems. empirical studies of the arts, ( ), - .
green, e., bodrumlu, t., & knight, k. ( ). automatic analysis of rhythmic poetry with applications to generation and translation. proceedings of the conference on empirical methods in natural language processing (pp. - ). emnlp .
hawkins, d. ( ). the problem of overfitting. journal of chemical information and computer science, ( ), - .
kao, j., & jurafsky, d. ( ). a computational analysis of style, affect, and imagery in contemporary poetry. naacl workshop on computational linguistics for literature. retrieved january , , from http://www.stanford.edu/~jurafsky/kaojurafsky .pdf
kaplan, d., & blei, d. ( ). a computational approach to style in american poetry. ieee conference on data mining. ieee.
paivio, a., yuille, j., & madigan, s. ( ). concreteness, imagery, and meaningfulness values for nouns. journal of experimental psychology, ( , pt ), - .
pennebaker, j., chung, c., ireland, m., gonzales, a., & booth, r. ( ). the development and psychometric properties of liwc . austin, tx.
retrieved march , , from www.liwc.net
pennebaker, j., francis, m., & booth, r. ( ). linguistic inquiry and word count (liwc). mahwah, nj: erlbaum.
poulin, a., & waters, m. (eds.). ( ). contemporary american poetry, eighth ed. houghton mifflin company.
ultsch, a., & kämpf, d. ( ). knowledge discovery in dna microarray data of cancer patients with emergent self organizing maps. esann proceedings - european symposium on artificial neural networks, - april, (pp. - ). bruges.
watson, d., clark, l., & tellegen, a. ( ). development and validation of brief measures of positive and negative affect: the panas scales. journal of personality and social psychology, ( ), - .

appendix: tables

table : parameter estimates for model .
variable b sig exp(b)
article . .
insight - . . .
constant - , . .

table : parameter estimates for model .
variable b sig exp(b)
affect - . .
cogmech - . .
constant

table : parameter estimates and accuracy data for models and
sensitivity specificity accuracy variables examples b sig exp(b)
model % % %
article "the", "a" . . .
insight "imagine", "contemplate" - . . .
constant - . . .
model % % %
affect "gentle", "terrible" - . . .
cogmech "imagine", "consider" - . . .
constant . .

table : parameter estimates and accuracy data for models , and
sensitivity specificity accuracy variables description b sig exp(b)
model % % %
emo emotional content - . . .
constant . . .
model % % %
img imagery . . .
rhy no. of rhyming words - . . .
emogd goodness deviation - . . .
constant . . .
model % % %
con concreteness . . .
gdn goodness - . . .
constant . . .

emotional content of nouns in the word sample was derived by asking subjects to rate words according to the degree to which the words would evoke a positive or negative emotional response from people. words that elicit strong feelings get high ratings. words that are not emotional get low ratings.
imagery of nouns in the word sample was derived by asking subjects to rate words according to the degree to which it was possible to imagine an image to represent the word. words that elicit strong/weak images get high/low ratings. imageability is highly correlated with concreteness.
the number of rhymes for words in the noun sample was derived by asking subjects, for each word, whether they can think of many words that rhyme with the given word (high rating) or few words that rhyme with it (low rating).
goodness deviation was calculated by taking the absolute deviation from neutral of goodness ratings (see note below).
concreteness ratings were derived by asking subjects how easy it was to form a sensory impression of the noun depicted. those that were easy/difficult to associate with a sense were rated high/low on concreteness.
goodness ratings for nouns in the noun sample were derived from subjects’ impressions of the extent to which the word evokes a high level of goodness (high rating) or badness (low rating).

table : professional poems ranked by logit scores
title | author | logit
working late | louis simpson | .
the image | robert hass | .
how simile works | albert goldbarth | .
eating alone | liyoung lee | .
facing it | yusef komunyakaa | .
nostos | louise gluck | .
hello | naomi shihab nye | .
twentyyear marriage | ai | .
the room of my life | anne sexton | .
years end | ellen bryant voigt | .
dearest reader | michael palmer | .
when you go away | ws merwin | .
power | adrienne rich | .
lying in a hammock at william duffys farm in pine island minnesota | james wright | .
university hospital boston | mary oliver | .
the prediction | mark strand | .
traveling through the dark | william stafford | .
the small vases from hebron | naomi shihab nye | .
japan | billy collins | .
to kill a deer | carol frost | .
variations on a text vallejo | | .
more blues and the abstract truth | cd wright | .
to dorothy | marvin bell | .
gin | david st john | .
cleaning a fish | dave smith | .
the fish | elizabeth bishop | .
glassbottom boat | elizabeth spires | .
the choir | olga broumas | .
writing in the afterlife | billy collins | .
dream song your face broods | john berryman | .
reuben reuben | michael s harper | .
fork | charles simic | .
b o d y | james merrill | .
the abduction | stanley kunitz | .
warning to the reader | robert bly | .
notice what this poem is not doing | william stafford | .
crossing the water | sylvia plath | .
animals are passing from our lives | philip levine | .
in trackless woods | richard wilbur | .
onions | william matthews | .
clear night | charles wright | .
may | sharon olds | .
those winter sundays | robert hayden | .
at the cemetery walnut grove plantation south carolina | lucille clifton | .
charles on fire | james merrill | .
thrall | carolyn kizer | .
why i am not a painter | frank ohara | .
the dancing | gerald stern | .
riot act april | ai | .
root cellar | theodore roethke | .
absences | donald justice | .
the porcelain couple | donald hall | .
minor miracle | marilyn nelson | .
this night | william heyen | .
aubade some peaches after storm | carl phillips | .
oranges | gary soto | .
the intruder | carolyn kizer | .
wingfoot lake | rita dove | .
to an adolescent weeping willow | marvin bell | .
they feed they lion | philip levine | .
heaven as anus | maxine kumin | .
the strange people | louise erdrich | .
the russian | robert bly | .
my noiseless entourage | charles simic | .
new vows | louise erdrich | .
the older child | kimiko hahn | .
my indigo | liyoung lee | .
nurture | maxine kumin | .
personal poem | frank ohara | .
her kind | anne sexton | .
the stairway | stephen dunn | .
tomatoes | stephen dobyns | .
letter | jean valentine | .
the undressing | carol frost | .
the mutes | denise levertov | .
degrees of gray in philipsburg | richard hugo | .
the summer day | mary oliver | .
our lady of the snows | robert hass | .
audacity of the lower gods | yusef komunyakaa | .
hay for the horses | gary snyder | .
a blessing | james wright | .
adultery | james dickey | .
celestial music | louise gluck | .
to speak of woe that is in marriage | robert lowell | .
for the anniversary of my death | ws merwin | .
fragments | stephen dobyns | .
the singing | c k williams | .
approximately forever | cd wright | .
scar | lucille clifton | .
the night the porch | mark strand | .
dream song the glories of the world struck me | john berryman | .
weddingring | denise levertov | .
pacemaker | wd snodgrass | .
sexual jealousy | carol frost | .
after making love we hear footsteps | galway kinnell | .
a lovely love | gwendolyn brooks | .
playing dead | andrew hudgins | .
the language | robert creeley | .
the warning | robert creeley | .
blackberry eating | galway kinnell | .

exploring the linguistic landscape of geotagged social media content in urban environments

tuomo hiippala
department of languages, university of helsinki, finland; digital geography lab, university of helsinki, finland; and helsinki institute of sustainability science, university of helsinki, finland

anna hausmann, henrikki tenkanen, and tuuli toivonen
digital geography lab, university of helsinki, finland; department of geography and geosciences, university of helsinki, finland; and helsinki institute of sustainability science, university of helsinki, finland

abstract

this article explores the linguistic landscape of social media posts associated with specific geographic locations using computational methods. because physical and virtual spaces have become increasingly intertwined due to location-aware mobile devices, we propose extending the concept of linguistic landscape to cover both physical and virtual environments.
to cope with the high volume of social media data, we adopt computational methods for studying the richness and diversity of the virtual linguistic landscape, namely, automatic language identification and topic modelling, together with diversity indices commonly used in ecology and information sciences. we illustrate the proposed approach in a case study covering nearly , posts uploaded on instagram over . years at the senate square in helsinki, finland. our analysis reveals the richness and diversity of the virtual linguistic landscape, which is also shown to be susceptible to continuous change.

correspondence: tuomo hiippala, university of helsinki, p.o. box , , finland. e-mail: tuomo.hiippala@helsinki.fi

digital scholarship in the humanities, vol. , no. , . © the author(s) . published by oxford university press on behalf of eadh. this is an open access article distributed under the terms of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. doi: . /llc/fqy. advance access published on october

introduction

staying connected to social media has become an inseparable aspect of everyday life for many. this kind of constant connectedness is enabled by mobile devices, such as smartphones and tablet computers, which allow users to create and share content and to maintain personal relationships while being on the move (deumert, b; baym, ). mobile devices are also increasingly aware of their geographic location due to widespread adoption of positioning technology in consumer electronics (kellerman, ). consequently, many social media platforms now allow and explicitly encourage users to anchor the content they create to specific geographic locations. this practice, known as geotagging, provides social media platforms with information about the mobility of their users, which can be used for targeting advertisements and profiling their consumer preferences.

geotagged social media content also holds potential for sociolinguistic inquiry. in this article, we adopt the term virtual linguistic landscape, which ivkovic and lotherington ( ) coined for discussing multilingualism on the web, to describe the languages present in geotagged social media content posted from a specific geographic location. we propose that the virtual linguistic landscape may be considered an extension of the physical linguistic landscape in the built environment.

to explore the characteristics of virtual linguistic landscapes, we analyse nearly , posts uploaded on instagram from the senate square in helsinki, finland, over a period of . years. we seek to answer the following research questions:

( ) how to characterize virtual linguistic landscapes in terms of their linguistic richness and diversity?
( ) how do virtual linguistic landscapes change over time?

given the high volume of data, we adopt methods from the field of natural language processing, namely, automatic language identification and topic modelling. to measure linguistic richness and diversity, we use established indices from the fields of ecology and biology, which have been previously applied to the study of linguistic landscapes (peukert, ; manjavacas, ). we also perform temporal analyses at various timescales to examine changes in the virtual linguistic landscape.
we do not, however, seek to compare or make claims about the respective characteristics of virtual and physical linguistic landscapes (cf. deumert, a, pp. – ). instead, we aim to develop methods for studying high volumes of geotagged social media content, setting the stage for approaches involving mixed methods, which are ultimately necessary for achieving a comprehensive view of virtual linguistic landscapes.

physical places and virtual spaces

androutsopoulos ( ) has observed that new sources of data for sociolinguistic inquiry are currently emerging at the intersection of research on computer-mediated communication (cmc) and linguistic landscapes. whereas cmc covers private and public communication in digital media, such as social media platforms, discussion forums, and email, the research on linguistic landscapes focuses on "signs and other artifacts in public space" (androutsopoulos, , p. , our emphasis). these definitions may reflect an emerging division of work between the aforementioned domains of sociolinguistic research, as the study of linguistic landscapes has traditionally focused on built environments, covering various locations ranging from tourist attractions (bruyèl-olmedo and juan-garau, ) to transportation hubs (soler-carbonell, ) and various media from billboards to shop signs (gorter, ).

at the same time, the broader notion of public space, which androutsopoulos ( ) assigns to the domain of linguistic landscapes, has been and continues to be transformed by digital technology in the form of both hardware and software (dodge and kitchin, ). in the field of human geography, one of the leading theorists of this transformation is aharon kellerman (see kellerman, , ), who has argued that mobile devices have enabled the emergence of a "double space" of intertwined physical and virtual spaces (see also zook and graham, ).
this double space now increasingly envelops its subjects, as access to the virtual space is no longer restricted by limitations arising from static hardware in the physical space, such as desktop computers.

due to the increased potential for spatial mobility, this double space can now fill or support many basic human needs, including those originally defined by abraham maslow (kellerman, ). for example, needs pertaining to esteem, such as status and reputation, are increasingly formed in virtual spaces (kellerman, , p. ). kellerman ( , p. ) identifies multiple connections between the physical and virtual spaces, which are grouped along several dimensions: organization, or how such spaces are structured; movement, or the connections between spaces; and users, who populate these spaces. two specific connections warrant further attention, namely, the convergence of physical and virtual places, and the languages encountered in virtual spaces, as both shape the virtual linguistic landscape.

first, kellerman ( , p. ) proposes that locations defined in the virtual space tend to converge with their 'real' counterparts in physical space. this tendency is also evident in user-generated social media content. for example, visual content on social media platforms for mobile photography, such as instagram, has been suggested to serve the purpose of mediating the user's presence or activities at some specific physical location (villi, ). alternatively, geographic information such as place names may be provided linguistically in the caption and/or in hashtags accompanying the visual content.
the most accurate form of geographic information, however, is produced by location-aware devices, which are now widely available to consumers through smartphones (kellerman, , p. ). together, the combination of new communicative practices and technological infrastructure may be suggested to drive the convergence of physical and virtual spaces.

second, in terms of their linguistic characteristics, kellerman ( ) suggests that physical spaces are characterized by domestic languages, whereas virtual spaces are dominated by english due to their international orientation. lee ( , p. ) has observed that assumptions about the dominance of english in virtual spaces have been common among both academic and popular audiences ever since the internet became widely used. yet measuring the actual linguistic diversity of virtual spaces remains a challenge (paolillo, ), which is also affected by how such virtual spaces are defined and delimited (leppänen and peuronen, ). however, the current consensus seems to be that languages other than english are becoming increasingly prominent on the internet (lee, , p. ).

in virtual spaces, the linguacultural make-up of users has the potential to be extremely diverse, because online interactions do not require physical presence, but allow participation from a distance, as illustrated in fig. . moreover, users may choose to use different languages for different audiences (androutsopoulos, ). it is also important to acknowledge that online interactions can be asynchronous and unfold over longer periods of time. moreover, not all social media content is necessarily created at the time of upload, as exemplified by the practice of posting content related to previous events under hashtags such as #throwback. similarly, the content associated with a specific virtual location need not necessarily be created at the actual physical location.
acknowledging the possibility of such temporal and spatial discrepancies, we build on the work of kellerman ( , , ) and propose that geotagged social media posts anchored to a specific geographic location act as an extension of the linguistic landscape of the corresponding physical environment. this extension is enabled by the double space, which encompasses both physical and virtual spaces, assisted by technologies such as satellite positioning. however, unlike signs and other objects found in the physical environment, social media posts cannot take a material form (although augmented reality may eventually allow them to be represented in physical space, cf. allen et al., ), as they exist on platforms in the virtual space, which may be accessed using any device capable of doing so, either from the actual location or from a distance.

like urban spaces in general, linguistic landscapes are dynamic and sensitive to social and economic changes (gorter and cenoz, ). as papen ( ) has shown, changes in the physical linguistic landscape may take place over longer timescales, occasionally spanning decades or more. the virtual linguistic landscape, in turn, may be more sensitive to short-term changes due to the immateriality of digital content. in addition, the use and status of a physical location are likely to influence its virtual linguistic landscape in geotagged social media, because these attributes may be expected to be carried over from the physical space to the virtual space. for example, the linguistic landscapes of tourist attractions, landmarks, or transportation hubs may be expected to be diverse due to their cultural value or role in the transportation network (bruyèl-olmedo and juan-garau, ; soler-carbonell, ). with these points in mind, we now turn our attention towards the data collected from the senate square and the methods applied to its analysis.

social media data and computational methods

data and location

we collected data from instagram, a social media platform for sharing photographs and short videos, using the platform's application programming interface (api). in total, we collected , posts uploaded by , unique users between july and february , that is, over a period of roughly . years. as illustrated in fig. , each geotagged post on instagram is associated with a specific location pre-defined on the platform, which means the geographic coordinates of an individual data point do not provide gps-level accuracy, unlike on some other platforms, such as twitter and flickr.

instead, the geographic coordinates associated with an instagram post refer to what is commonly termed a point-of-interest (poi) in the field of geoinformatics (hochmair et al., ). instagram pois are provided by the parent company, that is, facebook.

fig. a fictional example showing how ( ) two finnish users at the senate square speak finnish with each other, but one of them posts a photograph with an english caption on instagram, having a number of international users in her social network. ( ) associating the photograph with the location named helsinki cathedral allows a german user who searches for content from helsinki to discover the photograph. ( ) despite physical distance, german users can interact with the content and each other, contributing to the virtual linguistic landscape of the senate square. each step in this chain of events involves language choices, which all contribute to the virtual linguistic landscape
the response to any spatial query is therefore restricted to content associated with a poi on the platform. in our case, each post retrieved for the study was geotagged to a poi located within a -m radius from the point . latitude and . longitude (wgs- ), which lies at the centre of the senate square in downtown helsinki, finland.

we chose the location due to its status as a cultural landmark and a tourist attraction, which are likely to be reflected in its virtual linguistic landscape. overlooked by the lutheran cathedral and surrounded by the main building of the university of helsinki and the government palace, the senate square and its neoclassical architecture are widely recognized as one of the most important landmarks in helsinki and in all of finland. the lutheran cathedral, in particular, which is shown in fig. , is often used as a symbol for the city of helsinki (jokela, ). in addition to its role as a tourist attraction, the senate square serves as a venue for different events, ranging from concerts and festivals to protests and demonstrations.

identifying the language of social media content

like many other forms of digital data, geotagged social media content may be characterized as 'big' due to its high volume, velocity, and variety (kitchin, ). together, these characteristics present several challenges for the collection, processing, and analysis of social media data. challenges related to volume and velocity may be met by adopting a programmatic approach, that is, collecting data systematically via an api and processing the data accordingly (see tenkanen, , p. ). for mapping the languages that make up the virtual linguistic landscape, further processing involves automatic language identification, which is an active area of research within the broader field of natural language processing (zubiaga et al., ).

fig. social media platforms such as instagram ( ), twitter ( ), and flickr ( ) all allow users to embed geographic metadata into their content at various degrees of accuracy from gps coordinates to poi locations defined by the platforms

automatic language identification, however, is not a straightforward task due to the variety of the data, which in this case takes the form of linguistic variation. much has been written about the language of social media in recent years, revealing variation across different linguistic structures (see zappavigna, ; seargeant and tagg, ; hoffman and bublitz, ). on a more practical level, the length of social media posts is typically limited, which encourages the use of abbreviations, non-standard spellings, and other forms of creative language use (carter et al., , p. ). another challenge emerges from the use of hashtags, which are used to affiliate around shared values or topics (zappavigna, ). hashtags are often written in multiple languages (barton, ; lee and chau, ), which injects multilingual material into otherwise monolingual texts. the same holds true for usernames on social media platforms.

each of the aforementioned issues introduces additional challenges to performing automatic language identification. yet it should be noted that identifying the language of a sentence is not a straightforward task for humans either, due to ambiguous language use or orthographically similar words in multiple languages. for example, a caption consisting of a single proper noun, such as 'helsinki', may represent finnish, english, german, or some other language whose vocabulary includes this word, essentially preventing the identification of language.
we evaluated several state-of-the-art frameworks that provide pre-trained models for performing automatic language identification. the libraries considered for the current study are listed in table and introduced briefly below. the first framework, fasttext, relies on word embeddings, which is a technique for learning numerical representations of words in a vocabulary by observing their distribution in their context of occurrence (bojanowski et al., ). the second framework, langid.py, is designed to provide reliable language identification across multiple domains, such as official documents, newspaper articles, and social media messages (lui and baldwin, ). finally, the third framework, cld or the compact language detector , was originally developed for google's chromium open-source project but has not been documented in a peer-reviewed publication. for this study, we used cld via the polyglot natural language processing library.

all programs developed for this study were written using the python . . programming language, to take advantage of the wide range of libraries available within the python ecosystem. the libraries used include the natural language toolkit (nltk; bird et al., ), polyglot, spacy, and gensim (rehurek and sojka, ) for natural language processing; scikit-bio for diversity measures; and pandas (mckinney, ) and scikit-learn (pedregosa et al., ) for storing and manipulating the data. all code written for this study is made publicly available with an open licence at: https://doi.org/ . /zenodo. .

evaluating language identification frameworks

to evaluate how the language identification frameworks introduced above perform on our data, we created a ground truth by randomly sampling the data without replacement for , captions. we then applied the preprocessing steps described in table to these captions, extracting a total of , sentences.
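the preprocessing steps listed in the preprocessing table can be sketched in a few lines of python. this is a minimal illustration and not the code used in the study: the function name `preprocess_caption` and every regular expression are assumptions, emojis are assumed to have been converted to colon-wrapped shortcodes beforehand, and a naive split on sentence-final punctuation stands in for the punkt sentence tokenizer.

```python
import re
from typing import List

# All patterns are illustrative assumptions, not the study's own expressions.
SHORTCODE = re.compile(r":[A-Za-z0-9_&+\-]+:")   # colon-wrapped emoji shortcodes
MENTION = re.compile(r"@\S+")                     # words beginning with @
HASHTAG = re.compile(r"#\S+")                     # words beginning with #
REPEATED_PUNCT = re.compile(r"([!?.,;:])\1+")     # runs of one punctuation mark
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")     # naive stand-in for Punkt


def preprocess_caption(caption: str) -> List[str]:
    """Return cleaned sentences that could be passed to a language identifier."""
    # Replace line breaks with single spaces.
    text = re.sub(r"\s*\n\s*", " ", caption)
    # Remove emoji shortcodes, user mentions, and hashtags.
    text = SHORTCODE.sub(" ", text)
    text = MENTION.sub(" ", text)
    text = HASHTAG.sub(" ", text)
    # Drop words without any alphanumeric character, such as the smiley :-)
    words = [w for w in text.split() if any(ch.isalnum() for ch in w)]
    text = " ".join(words)
    # Shorten repeated punctuation (e.g. !!!) to a single character.
    text = REPEATED_PUNCT.sub(r"\1", text)
    # Split into sentences and discard empty strings.
    return [s for s in SENTENCE_BREAK.split(text) if s]
```

called on a caption resembling the example in the table, the function returns one cleaned string per sentence, each of which would then be classified separately.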
two annotators, namely, the first and the second author, subsequently identified the language of each preprocessed sentence manually. we annotated each language using its iso- code, such as 'en' for english, or using multiple codes joined by a + if the sentence featured more than one language, such as 'en+fi' for english and finnish.

to assess the level of agreement between the two annotators, we used the common metrics for measuring inter-rater agreement surveyed in artstein and poesio ( ), such as fleiss' κ ( . ), scott's π ( . ), and krippendorff's α ( . ) as implemented in nltk (bird et al., ). the average observed agreement between the two annotators was . . overall, these metrics suggest that the ground truth can be reliably used for evaluating the performance of language identification frameworks, particularly as the manual classification also accounted for code-switching within sentences.

table language identification frameworks used in the study
name | reference | number of languages supported
fasttext | bojanowski et al. ( ) |
langid.py | lui and baldwin ( ) |
cld | – |

for the final ground truth, we dropped captions whose language we disagreed on, retaining a total of , captions with , sentences, which was further reduced to , by leaving out sentences whose language could not be manually identified or which contained sentence-internal code-switching.

we then evaluated the language identification frameworks against the ground truth and examined whether their performance would improve by excluding sentences with a low character count.
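the evaluation logic described above can be sketched as follows. this is a from-scratch illustration and not the nltk implementation referred to in the text: scott's π for two label sequences is used here as a simple chance-corrected stand-in for krippendorff's α, and the function names, the dictionary layout of the results, and the thresholds passed in are all assumptions.

```python
from collections import Counter


def observed_agreement(a, b):
    """Share of items on which two label sequences agree (accuracy)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def scotts_pi(a, b):
    """Scott's pi: agreement corrected for chance using pooled label proportions."""
    n = len(a)
    a_o = observed_agreement(a, b)
    pooled = Counter(a) + Counter(b)
    a_e = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if a_e == 1.0:  # only one label in use, so agreement is trivially perfect
        return 1.0
    return (a_o - a_e) / (1 - a_e)


def evaluate_at_thresholds(sentences, gold, predicted, thresholds):
    """Score predictions against the ground truth for several length thresholds."""
    total = len(sentences)
    results = {}
    for t in thresholds:
        # Keep only sentences longer than t characters.
        keep = [i for i, s in enumerate(sentences) if len(s) > t]
        if not keep:
            continue
        g = [gold[i] for i in keep]
        p = [predicted[i] for i in keep]
        results[t] = {
            "accuracy": observed_agreement(g, p),
            "pi": scotts_pi(g, p),
            "data_loss": 1 - len(keep) / total,
        }
    return results
```

raising the threshold removes short sentences, which tends to increase agreement at the cost of data; reporting both figures side by side makes this trade-off explicit.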
fasttext and langid.py had a slight advantage over cld , as they supported all manually identified languages present in the ground truth, whereas cld did not support latin. however, the ground truth contained only three sentences in latin, so this disadvantage should not have a big impact on the performance of cld . table reports the reliability of predictions for each framework at different character thresholds, using krippendorff's α to correct for chance agreement. average observed agreement, or accuracy, is given in parentheses.

as table shows, the fasttext library and its pre-trained model provide superior performance compared to langid.py and cld regardless of the character threshold. langid.py and cld begin to match fasttext's baseline performance only at the threshold of thirty characters or above, which simultaneously involves losing nearly % of the data. this trade-off is obviously unacceptable, which is why we chose fasttext for automatic language identification.

table the individual steps of the preprocessing strategy were designed to counter common challenges in automatic language identification, such as emojis and smileys, excessive punctuation, multilingual hashtags and usernames, and sentence-level code-switching
step | caption after the step
the original caption includes hashtags, user mentions, and smileys and emojis | great weather in helsinki!!! on holiday with @username. :-) #helsinki #visitfinland
we begin by replacing any line breaks with whitespace and convert the emojis into their corresponding emoji shortcodes, which are wrapped in colons | great weather in helsinki!!! on holiday with @username. :-) #helsinki #visitfinland :nerd_face_&_sunny_&_passenger_ship:
the colons make finding the emojis easy using a regular expression, which we then apply to remove them | great weather in helsinki!!! on holiday with @username. :-) #helsinki #visitfinland
we then remove any words that begin with an @ symbol, which indicates a username | great weather in helsinki!!! on holiday with :-) #helsinki #visitfinland
next, we remove any hashtags, that is, any words beginning with a # | great weather in helsinki!!! on holiday with :-)
any remaining non-alphanumeric words in the caption, such as the smiley :-) are then removed using a regular expression | great weather in helsinki!!! on holiday with
longer sequences of exclamation or question marks (e.g. !!!), full stops, and other kinds of punctuation are shortened to just one of each character (e.g. !) | great weather in helsinki! on holiday with
these sequences can confuse the punkt sentence tokenizer (kiss and strunk, ), which outputs a python list containing sentence tokens. these tokens are then fed to the language identification frameworks one at a time | ["great weather in helsinki!", "on holiday with"]

measuring richness and diversity

to measure the richness and diversity of the languages that make up the virtual linguistic landscape, we adopt common indices used in the fields of ecology and information sciences, such as richness, menhinick's richness, berger–parker dominance, and shannon entropy. peukert ( ) provides a thorough introduction to using these indices to measure linguistic diversity, illustrating their application in a comparison of physical linguistic landscapes in two neighbourhoods in hamburg, germany and showing how these indices may be used to measure and compare linguistic diversity across locations. manjavacas ( ), in turn, applies similar indices to geotagged twitter posts from berlin, germany.
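the four indices named above have compact textbook definitions, which the following sketch implements for a mapping from language codes to sentence counts. it is an illustration only and not the scikit-bio code used in the study: the function names, the example counts, and the choice of the natural logarithm for shannon entropy are assumptions.

```python
import math


def richness(counts):
    """Number of distinct languages observed at least once."""
    return sum(1 for c in counts.values() if c > 0)


def menhinick(counts):
    """Menhinick's richness: distinct languages divided by sqrt of sample size."""
    n = sum(counts.values())
    return richness(counts) / math.sqrt(n)


def berger_parker(counts):
    """Berger-Parker dominance: share of the most common language."""
    n = sum(counts.values())
    return max(counts.values()) / n


def shannon_entropy(counts, base=math.e):
    """Shannon entropy of the language distribution (natural log by default)."""
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n, base) for c in counts.values() if c > 0)


# Hypothetical sentence counts per language for one time window.
example = {"en": 50, "fi": 30, "ru": 20}
summary = {
    "richness": richness(example),
    "menhinick": menhinick(example),
    "berger_parker": berger_parker(example),
    "shannon": shannon_entropy(example),
}
```

richness and menhinick's index respond to how many languages are present, whereas berger–parker dominance and shannon entropy respond to how evenly the sentences are spread across them, so the measures are best read together.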
because these indices are relatively new to the study of linguistic landscapes, we introduce them in greater detail in connection with the analyses of linguistic richness and diversity in section . .

exploring the virtual linguistic landscape

temporal patterns in social media activity

fig. presents instagram activity around the senate square over h. the figures show the average number of posts and their standard deviation for each hour of the day for four different samples. fig. a shows the hourly frequency of all posts in the data set over , days, which also includes posts without any linguistic content (n = , ). not surprisingly, this frequency reflects common hours of activity in the city, with approximately four to six posts per hour for daytime and evening hours. during the night, the number falls to roughly two posts per hour. a similar pattern may be observed in fig. b, which only includes posts with captions (n = , ).

the pattern changes when choosing different timescales and preprocessing the data for language identification (n = , ), as illustrated in fig. c and d, which show the average number of hourly posts for weekdays (n = , ) and weekends (n = ), respectively. whereas the weekdays show a peak around lunch hours, the activity increases considerably towards the evening during weekends.

a d'agostino–pearson test showed that none of the hourly observations in fig. c and d follow a normal distribution, which means that the statistical differences between hourly activity are best evaluated using tests that do not assume normality, namely levene's test and the mann–whitney u-test. for levene's test, which compares the variance of samples, the differences were found to be statistically significant for hours (w = . , p = . ), (w = . , p = . ), (w = . , p < . ), (w = . , p = . ), (w = . , p = . ), and (w = . , p = . ). the mann–whitney u-test, which examines the difference in averages, showed a statistically significant difference for hour (u = , . , p = . ).
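to make the second of these tests concrete, the sketch below computes the mann–whitney u statistic for two samples directly from its definition, counting for each pair of observations which sample holds the larger value. it is illustrative only: the function name and the example counts are hypothetical, no p-value is computed, and in practice a library routine such as `scipy.stats.mannwhitneyu` (or `scipy.stats.levene` for the variance comparison) would be a typical choice; the text does not state which implementation was used.

```python
def mann_whitney_u(x, y):
    """Return the U statistics (u_x, u_y) for two samples.

    u_x counts the pairs in which the value from x exceeds the value
    from y, with ties counted as one half.
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical post counts for one hour of the day on several days.
weekday_counts = [3, 4, 2, 5, 4]
weekend_counts = [6, 7, 5, 8]
u_weekday, u_weekend = mann_whitney_u(weekday_counts, weekend_counts)
```

a u value close to zero for one sample indicates that its observations are almost always smaller than those of the other sample, which is the kind of hourly difference between weekdays and weekends that the test is meant to detect.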
this suggests that social media activity is subject to temporal variation, which can be revealed by examining the data on different timescales. in other words, studying the activity at lunch hour during the working week will reveal a different picture than an analysis focusing on the late hours on the weekend. this variation will undoubtedly affect the appearance of the virtual linguistic landscape on the daily scale and beyond.

as a culturally valued landmark and a tourist attraction, the senate square also experiences seasonal variation, attracting a higher number of users during the summer months and christmas holidays, as shown in fig. a. the seasonal pattern becomes increasingly pronounced due to the rapidly growing popularity of instagram as a social media platform. fig. b, in turn, shows the average number of sentences per day of the week, which reveals increased activity during the weekend. this trend, however, becomes less pronounced due to loss of data when predictions are filtered using the probabilities provided by fasttext, which is visualized using the coloured bands in fig. b. generally, these probabilities are distributed over the languages supported by fasttext and range between and , which reflects how confident the framework is about its prediction.

table krippendorff's α scores for language identification frameworks at different character thresholds for preprocessed sentence length
framework | no threshold | > characters | > characters | > characters
cld | . ( . ) | . ( . ) | . ( . ) | . ( . )
fasttext | . ( . ) | . ( . ) | . ( . ) | . ( . )
langid.py | . ( . ) | . ( . ) | . ( . ) | . ( . )
data loss | % ( ) | . % ( ) | . % ( ) | . % ( , )
note: best result is marked in bold. for data loss, the value in parentheses reports the number of sentences lost.
requiring a certain level of confidence, as expressed by the probability associated with a prediction, naturally results in a trade-off between the quality of predictions and volume of data. including all predictions regardless of their level of confidence is likely to increase the number of errors, as very short sentences force fasttext to make uninformed guesses based on limited data. to improve the quality of language identification while preserving the temporal features of instagram activity at the senate square, we exclude predictions that fall into the first decile either in terms of their associated probability (< . ) or character length after preprocessing (< ), amounting to a loss of . % of the data. this left us with , sentences in eighty unique languages posted over , days for analysing temporal changes in the virtual linguistic landscape.

fig. the daily 'pulse' of the senate square on instagram. the line shows the average number of posts per hour, whereas the area indicates the standard deviation from the average. (a) average posts per hour for all posts in dataset (n = , ) over , days. (b) average posts per hour for posts with captions (n = , ) over , days. (c) average posts per hour during weekdays (monday to friday, n = , ) for captions whose language could be identified (n = , ). (d) average posts per hour during weekends (saturday and sunday, n = ) for captions whose language could be identified (n = , )

the distribution of languages over time

we now turn our attention towards the virtual linguistic landscape of the senate square by examining the sentence-level distribution of languages in the captions.
the chosen level of analytical granularity was not linguistically informed but defined by our preprocessing strategy, which uses sentence tokenization (see table ). our discussion focuses on fig. , which shows the top ten languages identified using fasttext, accompanied by . % confidence intervals estimated by drawing , bootstrapped samples from the underlying data. this means that the mean value lies within these intervals at . % probability. if the confidence intervals do not overlap, the difference between individual languages is significant at the . level.

the graphs in fig. are presented in pairs. on the left-hand side, the y-axes show the daily relative frequency, which is calculated by dividing the number of observations for each language by the total number of daily observations for all languages. this measurement is intended to capture the power relations and visibility of different languages in the virtual linguistic landscape. on the right-hand side, the y-axes give the number of sentences per day. this measurement is intended to account for the growing volume of data, which was observed in fig. a.

to begin with, fig. a shows the daily relative frequencies for the three most common languages (english, finnish, and russian) and the combined relative frequency for the remaining seventy-seven languages identified in the data (grouped together under the label ‘other’). these languages also underline the role of senate square as a tourist destination, as approximately half of the sentences are written in english. furthermore, english seems to be gaining most from the growing popularity of instagram, as indicated by the growing sentence count in fig. b. assuming that the dominance of english results from its role as a lingua franca, this raises questions about who the users of english are. we will return to this issue in section . . generally, the ‘big three’ (english, finnish, and russian) make up the vast majority of the virtual linguistic landscape.
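the two measurements described above, the daily relative frequency of each language and a bootstrapped confidence interval for a mean, can be sketched in python as follows. this is an illustration under stated assumptions rather than the procedure used in the study: the interval is a simple percentile bootstrap, and the confidence level (0.95), the number of resamples (1000), the function names and the example data are all hypothetical.

```python
import random
from collections import Counter


def relative_frequencies(labels):
    """Share of each language among all observations for one day."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {language: n / total for language, n in counts.items()}


def bootstrap_ci(values, n_samples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(values, k=len(values))) / len(values)
        for _ in range(n_samples)
    )
    lower = means[int((1 - level) / 2 * n_samples)]
    upper = means[int((1 + level) / 2 * n_samples) - 1]
    return lower, upper


# Hypothetical language labels for the sentences posted on one day.
day = ["en", "en", "fi", "ru"]
print(relative_frequencies(day))

# Hypothetical daily relative frequencies of one language over six days.
daily_shares = [0.40, 0.50, 0.60, 0.50, 0.45, 0.55]
print(bootstrap_ci(daily_shares))
```

the relative frequencies of a day always sum to one, which is why this measure reflects visibility rather than volume; the raw sentence counts are needed to see growth in the data.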
fig. monthly and weekly instagram activity around the senate square. (a) number of unique users per month. note that observations for and cover only a part of the year. (b) average sentences per day of the week for sentences whose language could be identified at various probability thresholds.

fig. daily relative frequencies for languages identified using fasttext, with . % confidence intervals estimated using , bootstrapped samples from the underlying data, which are marked by the shaded areas. the lines show a third-order polynomial regression fitted using ordinary least squares. (a) daily relative frequencies for the top- languages: english (en), finnish (fi), russian (ru) and other languages (n = ). (b) daily sentence counts for the top- languages. (c) daily relative frequencies for the top – languages: japanese (ja), korean (ko) and swedish (sv). (d) daily sentence counts for the top – languages. (e) daily relative frequencies for the top – languages: spanish (es), german (de), italian (it) and portuguese (pt). (f) daily sentence counts for the top – languages.

what is particularly worth noting in fig. a and b is that finnish overtook russian as the second most common language only in . traditionally, helsinki has been a popular destination among russians due to its proximity and accessibility via road, rail, sea, and air.
interestingly, the decline of the russian language coincides with the economic sanctions imposed on russia due to the invasion of ukraine, which caused the number of russian tourists visiting helsinki to dip in and (official statistics of finland, ). the difference between the daily relative frequencies for russian in and – was found to be statistically significant using the kruskal–wallis h-test (h = . , p < . ).

figure c–f zooms into the languages outside the top three, which were grouped together under the label ‘other’ in fig. a and b. note that this move is accompanied by a change of scale, as the relative frequencies and sentence counts for these languages are considerably lower than those in fig. a and b. the observations are split into different figures for a clearer view, but if fig. c–f were presented in a single graph, the confidence intervals would overlap for many languages, indicating that the differences in their frequencies and counts are not statistically significant. the way the relative frequencies of these languages fluctuate suggests that they contribute sporadically to the virtual linguistic landscape, which is also supported by their low sentence counts.

nevertheless, fig. c and d show how geographically remote languages such as japanese (ja) and korean (ko) contribute to the virtual linguistic landscape, even temporarily surpassing swedish, the second official language of finland. the relatively low proportion of swedish in the virtual linguistic landscape stands in stark contrast with the physical linguistic landscape, in which swedish remains very prominent, as public signs are required to be bilingual if the number of minority speakers in the municipality exceeds % or , individuals (syrjälä, , p. ). this is naturally the case with helsinki as well, which is historically a bi- and multilingual city.
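the kruskal–wallis h statistic used above for comparing the russian relative frequencies across periods can be computed with a short python sketch. this is a plain implementation of the textbook formula with tie correction, not the code used in the study; the function name and the example groups are hypothetical, and the p-value (which requires the chi-squared distribution) is deliberately left out.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for two or more groups of observations.

    Tied values share the mean of the ranks they occupy, and the statistic
    is divided by the usual tie correction. The function assumes that the
    pooled values are not all identical (the correction would be zero).
    """
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign each distinct value the average of the 1-based ranks it spans.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j

    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t**3 - t for t in tie_sizes) / (n**3 - n)
    return h / correction


# Hypothetical daily relative frequencies for two periods.
earlier = [0.21, 0.19, 0.24, 0.22]
later = [0.12, 0.15, 0.11, 0.14]
print(kruskal_wallis_h(earlier, later))
```

because the test works on ranks rather than raw values, it does not assume that the daily relative frequencies are normally distributed.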
however, fasttext cannot distinguish between standard swedish and finland-swedish, which means these observations should not be associated exclusively with the swedish-speaking minority in finland, but include visitors from sweden as well.

coming back to japanese and korean, it should be noted that although tourism statistics for helsinki show that visitors from european countries outnumber asians three to one (official statistics of finland, ), the widespread adoption of mobile technology among japanese and korean users may explain their prominence in the virtual linguistic landscape. these languages, however, decline towards the present, although tourism statistics show that arrivals from japan and korea continue to increase, which may suggest that these users are abandoning instagram. european visitors, in turn, are likely to include a sizeable number of business travellers, who may be less likely to contribute to the virtual linguistic landscape at the senate square, which may explain the relatively low proportion of major languages spoken in europe such as spanish, german, italian, and portuguese.

language choices among users

the most striking feature of the virtual linguistic landscape at the senate square is the dominance of the english language, as it is unlikely that half of the users active at the location would speak english as their first language. to investigate language choices among users, we retrieved the time and location of up to thirty-three previous posts for each user, limited naturally to those users who had posted captions whose language we could identify. to determine the likely country of origin for each user, we first retrieved the administrative region of each coordinate/timestamp pair in the location history using a point-in-polygon query. next, we used the timestamps to determine the overall duration of each user's activity within each region by calculating the time between the oldest and newest posts.
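the two steps described so far, assigning each coordinate/timestamp pair to a region and measuring the time between the oldest and newest posts in that region, can be sketched in python as follows. this is a simplified illustration and not the authors' implementation: regions are plain polygons given as lists of (x, y) vertices, the point-in-polygon test uses ray casting, and all names, coordinates and dates are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def point_in_polygon(x, y, polygon):
    """Ray casting: a point is inside if a ray to the right crosses an odd number of edges."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge spans the ray's y-coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def region_of(x, y, regions):
    """Return the name of the first region containing the point, or None."""
    for name, polygon in regions.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None


def activity_durations(history, regions):
    """Time between the oldest and newest post in each region.

    `history` is an iterable of (x, y, timestamp) tuples for one user.
    """
    stamps = defaultdict(list)
    for x, y, timestamp in history:
        region = region_of(x, y, regions)
        if region is not None:
            stamps[region].append(timestamp)
    return {region: max(ts) - min(ts) for region, ts in stamps.items()}


# Two hypothetical square regions and a hypothetical location history.
regions = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
history = [
    (5, 5, datetime(2017, 1, 1)),
    (6, 4, datetime(2017, 3, 1)),
    (25, 5, datetime(2017, 2, 1)),
    (26, 5, datetime(2017, 2, 8)),
]
durations = activity_durations(history, regions)
print(max(durations, key=durations.get))
```

in practice a spatial library with an index over real administrative boundaries would replace the hand-written polygon test, but the logic of the duration estimate stays the same.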
in addition to storing the region with the longest period of activity, we also recorded the region with the most activity. finally, we calculated the average duration of activity for each user by dividing the time spent at each region by the total number of regions visited.

the initial data for estimating the users’ country of origin contained , posts by , unique users. on average, the location history of a user contained . coordinate/timestamp pairs (sd = . ), whereas the average period of activity amounted to days (sd = ). to make our estimation more reliable, we discarded the first quartile for both coordinate/timestamp pairs and the longest period of activity. in practice, this meant excluding users with eleven or fewer coordinate/timestamp pairs and whose longest period of activity was days or less. for the final estimation, we retained a total of , posts by , unique users. for these users, we assumed that the administrative region where the users had been active for the longest period of time could be used to approximate their country of origin.

table presents the distribution of sentences in the six most frequent languages shown in fig. among users from the ten most frequent countries of origin. as may be expected, the majority of users active in the vicinity of the senate square come from finland, but what is surprising is that finnish users post nearly as much in english as in finnish. previous surveys on the role of the english language in finland have emphasized the popularity and importance of english, particularly among the youth (leppänen et al., ). this may be a source of bias, as youth are also more likely to use social media (longley et al., ; hausmann et al., ). nevertheless, the high proportion of sentences ( .
%) written in english warrants closer attention, as similar findings have been reported for other social media platforms, namely twitter, by laitinen et al. ( ).

to examine this more closely, we trained a topic model over monolingual english captions posted by users whose country of origin was estimated to be finland. these data consisted of , captions with , unique words after removing rare and frequent words that appeared in a single sentence or in more than % of the sentences. the model was trained using the latent dirichlet allocation algorithm for iterations with ten passes through the corpus, using the implementation provided in the gensim library (rehurek and sojka, ). to preprocess the data, we adopted the procedure set out in table . we also removed stopwords defined in nltk (bird et al., ) and lemmatized the words using the lookup table for english in spacy. finally, we calculated a coherence score, cv, for each topic, which has been suggested to correlate strongly with human evaluations of topic coherence (röder et al., ).

table gives the ten most prominent topics with their ten most frequent words. some of the coherence scores are fairly low, which is not surprising given the noisy social media data and the small size of the corpus. nevertheless, the topics can provide insights into the nature of the content posted in english by finnish users. to begin with, several topics seem to be strongly associated with the location, weather, leisure, and celebrations such as christmas and new year’s eve ( and ) and the lux light carnival ( ). many topics also feature words associated with a positive sentiment ( , – , , and ). this suggests that finns use english to connect with international audiences, appraising the physical location and the activities associated with it in the virtual space.
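the vocabulary filtering step that precedes topic modelling, removing words that occur in a single sentence or in too large a share of the sentences, can be sketched in plain python. the sketch below is illustrative only: the function name and the example captions are hypothetical, and the upper threshold of 0.5 is an assumed value, not the one used in the study.

```python
from collections import Counter


def filter_vocabulary(documents, min_docs=2, max_share=0.5):
    """Drop words that are too rare or too frequent across documents.

    A word is kept if it occurs in at least `min_docs` documents and in at
    most `max_share` of all documents. Each document is a list of tokens.
    """
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each word once per document

    n_docs = len(documents)
    keep = {
        word
        for word, freq in doc_freq.items()
        if freq >= min_docs and freq / n_docs <= max_share
    }
    return [[token for token in tokens if token in keep] for tokens in documents]


# Hypothetical preprocessed captions.
captions = [
    ["helsinki", "cathedral", "snow"],
    ["helsinki", "snow", "lux"],
    ["helsinki", "market"],
    ["helsinki", "sun"],
]
print(filter_vocabulary(captions))
```

here ‘helsinki’ is removed because it appears in every caption and the words seen only once are removed as too rare, which leaves only ‘snow’; the filtered token lists would then be passed on to the topic model.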
finnish users appear to participate in maintaining the identity of the location as a culturally valued landmark, at the same time construing the location as a tourist attraction. the role of english as the lingua franca of tourism (francesconi, ), which may also explain the choice of language, is also supported by a positive view of the language and a high level of proficiency in finland (leppänen et al., ). however, the preference for english holds for most, but not all linguistic groups contributing to the virtual linguistic landscape: table shows that russians clearly prefer their native language over english.

table. the distribution of the six most common languages among the users originating in ten most common countries
country | finnish | english | russian | swedish | japanese | korean | all
finland | , , ,
russia | , – ,
the usa | , ,
the uk | , ,
germany | ,
sweden | ,
spain | ,
italy | ,
france |
japan |
note: the countries are ranked by their popularity in the leftmost column. the rightmost column gives the total number of sentences written by users from the particular country in all languages.

the diversity of the virtual linguistic landscape

finally, we turn towards the richness and diversity of the virtual linguistic landscape, applying the indices introduced in section . . the following discussion focuses on fig. , which shows several indices applied to the results of automatic language identification. we introduce these indices and explain their implications below.

fig. a shows the linguistic richness, or simply the number of unique languages per day, and the number of singletons, that is, how many languages appear only once a day. in fig.
a, the parallel increase in unique languages and singletons suggests that smaller languages are driving the increase in linguistic richness. this observation was supported by a strong positive pearson correlation between the -day rolling averages for unique languages and singletons (r = . , n = , , p < . ). increasing linguistic richness also correlated with the increase in unique users (r = . , n = , , p < . ), as shown in fig. b. to summarize, fig. a and b suggest that the growing popularity of instagram has resulted in an increasingly rich virtual linguistic landscape at the senate square, as smaller linguistic groups have adopted the platform.

the simple richness index, however, does not account for the growing volume of data due to the increasing popularity of the platform. this perspective can be provided by menhinick’s richness index, which emphasizes the relationship between data volume and richness. menhinick’s richness index, shown in fig. c, reveals a decreasing trend over the . years. this trend suggests that despite increasing linguistic richness, driven by the increase in smaller languages, the virtual linguistic landscape is increasingly dominated by languages such as english, finnish, and russian (cf. fig. a and b). in other words, the growing volume of data has made the dominant languages increasingly prominent in the virtual linguistic landscape, which is reflected in a decreasing value for menhinick’s richness index.

table. a topic model trained over , captions written in english by finnish users, with one topic per column
helsinki | get | year | make | love | helsinki | good | town | look | day
christmas | cold | happy | start | night | lux | pizza | run | go | one
cathedral | menu | new | open | great | light | morning | conjurer | lot | independence
light | thing | well | art | enjoy | finland | beautiful | afternoon | let | church
market | finally | time | night | people | sunday | walk | friday | know | back
senate | ready | take | welcome | see | home | lovely | finnish | special | nice
square | new | week | way | last | festival | city | colour | right | finland
time | may | picture | wine | come | snow | sun | well | exhibition | sunny
lunch | always | thank | drink | december | amaze | blue | look | pretty | big
winter | taste | get | spring | weekend | wait | today | know | like | last
. | . | . | . | . | . | . | . | . | .
note: the words (rows) associated with each topic are sorted by their weight in a descending order. the final row gives the coherence score cv for the topic (röder et al., ).

fig. various diversity measures applied to the data set, with . % confidence intervals estimated using , bootstrapped samples from the underlying data. the line shows a third-order polynomial regression fitted using ordinary least squares. (a) richness and singletons. (b) richness and daily unique users. (c) menhinick richness. (d) berger–parker dominance. (e) dominance. (f) shannon entropy.

measuring the diversity of the virtual linguistic landscape requires indices that account for both the number of languages observed and their relative proportions. one such index is the berger–parker dominance index, shown in fig. d, which gives the fraction of observations for the language with the most posts per day. given the observations in fig. a, approximately half of the time the dominant language is english. the decreasing berger–parker index suggests that the dominant languages are losing ground to smaller languages, showing a drop of thirty points during the . years, which suggests that the virtual linguistic landscape of the senate square is becoming increasingly diverse. this observation is also supported by the decreasing dominance index in fig. e, which measures the respective proportions of languages: a dominance index of 0 would indicate that all languages are equally present, whereas an index of 1 would mean the total dominance of a single language.

finally, the observed increase in diversity is also supported by shannon entropy, shown in fig. f, which captures the amount of information required to describe the degree of order/disorder in a system. the higher the degree of disorder (in this case, the variety of languages and their respective probabilities of occurrence), the more information is required to describe the state of the system, that is, the virtual linguistic landscape. interestingly, the index for shannon entropy peaks in . this may suggest that the virtual linguistic landscape of the senate square has reached the maximal degree of diversity (with slightly over eight languages on the average day, as shown in fig. a) possible within the current userbase of instagram.

to summarize, several conclusions may be drawn from the indices in fig. . the richness of the virtual linguistic landscape increases as the number of users grows. although the number of languages found in the virtual linguistic landscape grows, dominant languages such as english, finnish, and russian gain the most from the growth, enabling them to consolidate their position. yet the proportion of dominant languages is decreasing, which indicates increasing diversity. put differently, smaller languages are gaining on the share held by the dominant languages. at the same time, the virtual linguistic landscape at the senate square seems to have reached a point where the linguistic diversity no longer increases.
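the indices discussed above can be computed from the language labels observed on a single day, as in the following python sketch. the formulas are the standard textbook definitions and are assumptions about how the indices were computed in the study: menhinick's index as the number of languages divided by the square root of the number of observations, dominance as the sum of squared proportions, and shannon entropy with the natural logarithm. the function name and the example labels are hypothetical.

```python
import math
from collections import Counter


def diversity_indices(labels):
    """Richness and diversity indices for one day's language labels."""
    counts = Counter(labels)
    n = sum(counts.values())
    shares = [count / n for count in counts.values()]
    return {
        # number of unique languages
        "richness": len(counts),
        # languages observed exactly once
        "singletons": sum(1 for count in counts.values() if count == 1),
        # richness scaled by the volume of data
        "menhinick": len(counts) / math.sqrt(n),
        # share of the most common language
        "berger_parker": max(shares),
        # sum of squared shares; 1 means a single language dominates totally
        "dominance": sum(p * p for p in shares),
        # entropy in nats; higher values mean a more even mix of languages
        "shannon": -sum(p * math.log(p) for p in shares),
    }


# Hypothetical labels for eight sentences posted on one day.
day = ["en"] * 4 + ["fi"] * 2 + ["ru", "sv"]
for name, value in diversity_indices(day).items():
    print(name, round(value, 3))
```

note that with a finite number of languages the sum of squared proportions cannot fall below one divided by the number of languages, so a dominance value of zero is only approached as the number of equally common languages grows.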
in other words, the number of languages in the virtual linguistic landscape remains the same, but the smaller languages change.

discussion and conclusion

our results suggest that virtual linguistic landscapes can be effectively characterized using computational methods, which are necessary for handling high volumes of social media data. with carefully planned preprocessing, automatic language identification and other natural language processing techniques can do most of the analytical work in a sufficiently reliable manner. however, insights provided by automatic language identification are limited without the means to evaluate the respective proportions of the observed languages. our analysis revealed a rich and diverse virtual linguistic landscape at the senate square, which is dominated by english, as the language is used extensively by both locals and tourists.

the results also emphasize the role of senate square as a highly valued cultural landmark and a tourist attraction (jokela, ). the cultural importance is manifested in the high number of posts by locals, whereas the impact of tourism is reflected by the high number of foreign visitors. in this respect, our findings support kellerman’s ( ) view that qualities associated with the physical place may be carried over to the corresponding virtual space. although we did not explicitly touch upon the issue in the analysis, it should be noted that global mobility and tourism are a privilege of a select few rather than the many, which is likely to be reflected in the linguistic landscape. choosing an alternative location for the study, such as a local transportation hub, would have likely yielded very different results (cf. soler-carbonell, ).

the richness and diversity of the virtual linguistic landscape also resonate with lee’s ( , p. ) proposal that user-generated social media content increases the potential for exposure to foreign languages.
geotagged social media content may be particularly effective for this purpose, as content associated with a location can be accessed through map interfaces instead of using hashtags or search terms in some specific language. this effect is further reinforced by instagram, which allows locations defined on the platform to have multilingual names. all the content associated with the locations named in different languages is then aggregated under a single point of interest. this is also likely to drive the formation and maintenance of the double space, as conceptualized by kellerman ( ).

in addition, the nature of instagram as a platform must be taken into account when interpreting the results. unlike twitter, which acts as a forum for public discussion, instagram may be preferred for sharing personal experiences (zappavigna, , ; tenkanen et al., ). together with the intended audience, the platform may affect language choices among users (androutsopoulos, ). tracing these linguistic repertoires would, however, require a much closer analysis of longitudinal data for individual users, which was beyond the scope of this article.

however, our proposed method could be easily adopted for a large-scale study of what pennycook and otsuji ( , p. ) have called “a geography of linguistic happenings”. such analyses, however, would still be limited by the spatial accuracy of instagram, as observed in section . . users may, for instance, associate content with locations higher in the poi hierarchy (such as ‘helsinki’ instead of ‘senate square’) or choose the wrong location altogether.
in terms of other limitations, the results are naturally affected by how widely instagram has been adopted by potential users of social media, and should be evaluated in the light of the inherent bias towards younger population found in social media data (longley et al., ; hausmann et al., ). furthermore, the proposed method cannot provide a fine-grained view of the linguistic landscape, because automatic language identification cannot detect code-switching within sentences, or distinguish between varieties of a single language, such as american and british english or finland-swedish and standard swedish, unless explicitly trained to do so.

despite these limitations, our results suggest that instagram and other social media platforms with geolocated content do nevertheless hold much potential for sociolinguistic inquiry, as suggested by androutsopoulos ( ). tapping further into this potential, however, would benefit from collaborating with geographers, to leverage more advanced methods for spatiotemporal analysis. such analyses could be used, for instance, to reveal where and when particular linguistic groups are active, to evaluate the potential for interaction between these groups. longitudinal analyses for individual users, in turn, could be used to investigate their linguistic repertoires. finally, because computational methods develop rapidly, analytical tools should be shared openly to enable the replication and reproduction of research, which would benefit the entire field of study.

a natural extension to the current work would be to take on what jaworski and thurlow ( ) have conceptualized as semiotic landscapes, whose analysis would include other modes of expression besides language in the virtual linguistic landscape. although research on artificial intelligence is making rapid progress in processing multimodal data (bateman et al., , pp.
– ), identifying fine-grained patterns of multimodal communication in high volumes of geotagged social media data is likely to remain a long-term endeavour. nevertheless, sufficiently mature computational techniques can already support the study of both virtual and physical linguistic landscapes, and their potential applications should be explored further.

funding

this work was supported by the finnish cultural foundation and the kone foundation.

references

allen, p. t., fatah, a., and robison, d. ( ). urban encounters reloaded: towards a descriptive account of augmented space. in jung, t. and tom dieck, m. c. (eds), augmented reality and virtual reality: empowering human, place and business. cham: springer, pp. – .
androutsopoulos, j. ( ). computer-mediated communication and linguistic landscapes. in holmes, j. and hazen, k. (eds), research methods in sociolinguistics: a practical guide. oxford: wiley, pp. – .
androutsopoulos, j. ( ). networked multilingualism: some language practices on facebook and their implications. international journal of bilingualism, ( ): – .
artstein, r. and poesio, m. ( ). inter-coder agreement for computational linguistics. computational linguistics, ( ): – .
barton, d. ( ). the roles of tagging in the online curation of photographs. discourse, context and media, , – .
bateman, j. a., wildfeuer, j., and hiippala, t. ( ). multimodality: foundations, research and analysis – a problem-oriented introduction. berlin: de gruyter mouton.
baym, n. k. ( ). personal connections in the digital age, nd edn. malden, ma: polity.
bird, s., klein, e., and loper, e. ( ). natural language processing with python. sebastopol, ca: o’reilly.
bojanowski, p., grave, e., joulin, a., and mikolov, t. ( ). enriching word vectors with subword information.
transactions of the association for computational linguistics, : – .
bruyèl-olmedo, a. and juan-garau, m. ( ). shaping tourist ll: language display and the sociolinguistic background of an international multilingual readership. international journal of multilingualism, ( ): – .
carter, s., weerkamp, w., and tsagkias, m. ( ). microblog language identification: overcoming the limitations of short, unedited and idiomatic text. language resources and evaluation, ( ): – .
deumert, a. ( a). digital superdiversity: a commentary. discourse, context & media, – : – .
deumert, a. ( b). sociolinguistics and mobile communication. edinburgh: edinburgh university press.
dodge, m. and kitchin, r. ( ). code and the transduction of space. annals of the association of american geographers, ( ): – .
francesconi, s. ( ). reading tourism texts: a multimodal analysis. bristol: channel view publications.
gorter, d. ( ). linguistic landscapes in a multilingual world. annual review of applied linguistics, : – .
gorter, d. and cenoz, j. ( ). translanguaging and linguistic landscapes. linguistic landscapes, ( – ): – .
hausmann, a., toivonen, t., slotow, r., tenkanen, h., moilanen, a., heikinheimo, v., and di minin, e. ( ). social media data can be used to understand tourists’ preferences for nature-based experiences in protected areas. conservation letters, ( ): e .
hochmair, h. h., juhász, l., and cvetojevic, s. ( ). data quality of points of interest in selected mapping and social media platforms. in kiefer, p., huang, h., van de weghe, n. and raubal, m. (eds), progress in location based services . cham: springer, pp. – .
hoffman, c. r. and bublitz, w. (eds) ( ). pragmatics of social media. berlin and boston: de gruyter mouton.
ivkovic, d. and lotherington, h. ( ). multilingualism in cyberspace: conceptualising the virtual linguistic landscape. international journal of multilingualism ( ): – .
jaworski, a. and thurlow, c. (eds) ( ).
semiotic landscapes: language, image, space, london and new york: continuum.
jokela, s. ( ). tourism and identity politics in the helsinki churchscape. tourism geographies, ( ): – .
kellerman, a. ( ). mobile broadband services and the availability of instant access to cyberspace. environment and planning a, : – .
kellerman, a. ( ). the satisfaction of human needs in physical and virtual spaces. the professional geographer, ( ): – .
kellerman, a. ( ). daily spatial mobilities: physical and virtual. new york and london: routledge.
kiss, t. and strunk, j. ( ). unsupervised multilingual sentence boundary detection. computational linguistics ( ): – .
kitchin, r. ( ). big data and human geography: opportunities, challenges and risks. dialogues in human geography, ( ): – .
laitinen, m., lundberg, j., levin, m., and martins, r. ( ). the nordic tweet stream: a dynamic real-time monitor corpus of big and rich language data. in mäkelä, e., tolonen, m. and tuominen, j. (eds), proceedings of the digital humanities in the nordic countries rd conference, helsinki, finland, march , pp. – .
lee, c. ( ). multilingual resources and practices in digital communication. in georgakopoulou, a. and spilioti, t. (eds), the routledge handbook of language and digital communication. new york and london: routledge, pp. – .
lee, c. ( ). multilingualism online. new york and london: routledge.
lee, c. and chau, d. ( ). language as pride, love, and hate: archiving emotions through multilingual instagram hashtags. discourse, context and media , – .
leppänen, s. and peuronen, s. ( ). multilingualism and the internet. in chapelle, c. a. (ed.), the encyclopedia of applied linguistics. oxford: wiley-blackwell.
leppänen, s., pitkänen-huhta, a., nikula, t., kytölä, s., törmäkangas, t., nissinen, k., kääntä, l., räisänen, t., laitinen, m., pahta, p., koskela, h., lähdesmäki, s., and jousmäki, h. ( ). national survey on the english language in finland: uses, meanings and attitudes, vol. of studies in variation, contacts and change in english. helsinki: university of helsinki.
longley, p. a., adnan, m., and lansley, g. ( ). the geotemporal demographics of twitter usage. environment and planning a: economy and space, ( ): – .
lui, m. and baldwin, t. ( ). langid.py: an off-the-shelf language identification tool. in proceedings of the th annual meeting of the association for computational linguistics, jeju island, korea, july . association for computational linguistics, pp. – .
manjavacas, e. ( ). mapping urban multilingualism through twitter. master’s thesis, the free university of berlin.
mckinney, w. ( ). data structures for statistical computing in python. in van der walt, s. and millman, j. (eds), proceedings of the th python in science conference, austin, texas, united states, june –july , pp. – .
official statistics of finland ( ). accommodation statistics. http://www.stat.fi/til/matk/index.html (accessed july ).
paolillo, j. c. ( ). how much multilingualism? language diversity on the internet. in danet, b. and herring, s. c. (eds), the multilingual internet: language, culture, and communication online. oxford: oxford university press, pp. – .
papen, u. ( ). commercial discourses, gentrification and citizens’ protest: the linguistic landscape of prenzlauer berg, berlin. journal of sociolinguistics ( ): – .
pedregosa, f., varoquaux, g., gramfort, a., michel, v., thirion, b., grisel, o., blondel, m., prettenhofer, p., weiss, r., dubourg, v., vanderplas, j., passos, a., cournapeau, d., brucher, m., perrot, m., and duchesnay, é. ( ). scikit-learn: machine learning in python. journal of machine learning research, : – .
pennycook, a. and otsuji, e. ( ).
metrolingual multitasking and spatial repertoires: ‘pizza mo two minutes coming’. journal of sociolinguistics, ( ): – .
peukert, h. ( ). measuring linguistic diversity in urban ecosystems. in duarte, j. and gogolin, i. (eds), linguistic superdiversity in urban areas: research approaches. amsterdam: benjamins, pp. – .
rehurek, r. and sojka, p. ( ). software framework for topic modelling with large corpora. in proceedings of th language resources and evaluation conference: workshop on new challenges for nlp frameworks, elra, pp. – .
röder, m., both, a., and hinneburg, a. ( ). exploring the space of topic coherence measures. in proceedings of the th acm international conference on web search and data mining (wsdm’ ), acm, pp. – .
seargeant, p. and tagg, c. (eds) ( ). the language of social media. basingstoke: palgrave.
soler-carbonell, j. ( ). complexity perspectives on linguistic landscapes: a scalar analysis. linguistic landscape, ( ): – .
syrjälä, v. ( ). naming businesses – in the context of bilingual finnish cityscapes. in ainiala, t. and östman, j.-o. (eds), socio-onomastics: the pragmatics of names. amsterdam: benjamins, pp. – .
tenkanen, h. ( ). capturing time in space: dynamic analysis of accessibility and mobility to support spatial planning with open data and tools. phd thesis, department of geosciences and geography, university of helsinki. http://urn.fi/urn:isbn: - - - - .
tenkanen, h., di minin, e., heikinheimo, v., hausmann, a., herbst, m., kajala, l., and toivonen, t. ( ). instagram, flickr, or twitter: assessing the usability of social media data for visitor monitoring in protected areas. scientific reports ( ).
villi, m. ( ). ‘hey, i’m here right now’: camera phone photographs and mediated presence. photographies ( ): – .
zappavigna, m. ( ). ambient affiliation: a linguistic perspective on twitter. new media and society ( ): – .
zappavigna, m. ( ). discourse of twitter and social media: how we use language to create affiliation on the web. london: continuum.
zappavigna, m. ( ). social media photography: construing subjectivity in instagram images. visual communication, ( ): – .
zook, m. a. and graham, m. ( ). mapping digiplace: geocoded internet data and the representation of place. environment and planning b, ( ): – .
zubiaga, a., vicente, i. s., gamallo, p., pichel, j. r., alegria, i., aranberri, n., ezeiza, a., and fresno, v. ( ). tweetlid: a benchmark for tweet language identification. language resources and evaluation, ( ): – .

note
http://www.instagram.com

white paper report
report id:
application number: hk- -
project director: nancy maron (nancy.maron@ithaka.org)
institution: ithaka harbors, inc.
reporting period: / / - / /
report due: / /
date submitted: / /

sustaining the digital humanities: lessons learned (neh white paper)
nancy l. maron and sarah pickle
ithaka s+r
june ,

ithaka s+r: sustaining the digital humanities: lessons learned (neh white paper) /

introduction
ithaka s+r recently completed a study, with generous funding from the national endowment for the humanities’ office of digital humanities, that explored the different models colleges and universities have adopted to support digital humanities (dh) outputs on their campuses.
the final report, entitled sustaining the digital humanities: host institution support beyond the start-up phase, and the accompanying sustainability implementation toolkit, are intended to guide faculty, campus administrators, librarians, and directors of support units as they seek solutions for coordinating long-term support for digital humanities resources at their institutions. by exploring both the assumptions and practices that govern host support, from the grant stage to the post-launch period, we hoped to gain a clearer understanding of the systems currently in place and to identify examples of good practice.

over the course of this study, ithaka s+r interviewed more than stakeholders and faculty project leaders at colleges and universities within the us. these interviews included a deep-dive phase of exploration focused on support for the digital humanities at four campuses: columbia university, brown university, indiana university bloomington, and university of wisconsin-madison. this research helped us to better understand how institutions are navigating issues related to the sustainability of dh resources and what successful strategies are emerging.

research for this study began in october and involved two stages:
• phase , sector-wide research: interviews and desk research with stakeholders at a variety of higher education institutions (public and private, teaching- and research-focused, large universities and small liberal arts colleges) provided an overview of the practices and expectations of digital humanities project leaders, funders, and their university administrators, as well as the challenges and successes they have encountered along the way.
• phase ii, deep-dive research: more extensive analysis of four institutions that have created and managed several of their own digital projects allowed us to develop a map of the full scope of their activities, the value they offer to the host university, and the dynamics that drive decision making around the role the university plays in supporting them.

unlike many other recipients of digital implementation grants who are developing digital tools and online resources, the primary deliverable for this grant is a white paper to share findings from our work. we refer our readers to that paper, sustaining the digital humanities: host institution support beyond the start-up phase, for the most comprehensive discussion of methodology and lessons learned. in this paper, we are pleased to have the opportunity to reflect further on the project as a project, and to consider its challenges and impacts.
• the url for the final report is: http://www.sr.ithaka.org/sites/default/files/sr_supporting_digital_humanities_ f.pdf
• the url for the sustainability implementation toolkit is: http://www.sr.ithaka.org/research-publications/sustainability-implementation-toolkit

lessons learned and changes in course
during the course of the study we chose to modify the methodology due to both a sharpening of the focus on institutional models and an awareness of the difficulty in collecting reliable financial data. this shift resulted in our conducting more case profiles and more interviews, but in collecting less financial data than first planned.
• landscape focus on campus profiles. our initial plan for our landscape review was to interview - individuals at institutions across the united states, in faculty, administration, and department head roles. as we sharpened our focus on institutional strategies, we decided to use the landscape phase of our research to create profiles of a dozen campuses.
rather than interviewing individuals specifically by job role, we chose campuses to profile and then sought key individuals on those campuses.
• expanded from two deep dives to four. we conducted four deep profiles, instead of two, as originally planned. this afforded us a greater understanding of both the common and the unique challenges faced by universities in this area, making it possible for us to describe in our report three campus “models” for supporting dh, while remaining attentive to the influences that local idiosyncrasies can have when adopting any one of these models.
• de-emphasized cost data. an initial goal of the study was to quantify the cost (to the pis, to their host institutions, to granting agencies) of creating and sustaining digital humanities resources. the motivation for attempting this was to develop a view of all the resources already being spent on doing this work in an ad hoc fashion. between the time of the grant proposal and our undertaking the work, however, we had completed another study [imls-funded case studies of digitized special collections and an arl-funded survey of digitized special collections] that had allowed us to do further cost data gathering, specifically at some institutions, including academic libraries with special collections. this exercise, as well as our experience in interviewing staff and faculty for this project, made it painfully clear that accurate cost data would be difficult to obtain, as in most cases neither faculty nor library staff were in the habit of tracking the time they were devoting to specific digital projects. we did gather some data concerning budgets in our faculty surveys, but chose to focus on the larger issue of which units were devoting time to specific activities, and determining whether or not they were doing so on an in-kind or paid basis.
• shifted timing of campus meetings. the initial plan was to visit each campus twice, first for interviews with senior administrators and support staff, and later on, to interview faculty once the survey results had been analyzed. due in part to the challenges of scheduling these sessions around holidays and campus schedules, we opted to conduct most faculty interviews via phone, to get to them more quickly. this turned out to be an even better plan; the second campus meetings were then devoted to sharing back our findings and hosting facilitated sessions with groups of stakeholders. these sessions offered us valuable feedback on our findings, and also were in some cases run as workshops, where senior administrators, faculty and unit heads actively discussed the roles they currently play and how they see their own systems developing to better manage the demands of faculty and the work they create.

perhaps the most difficult question was how to define the particular flavor of “digital humanities” we would examine. did we care about all the shapes and sizes that dh engagement comes in, or just about the large-scale digital outputs that seem to garner the most attention and funding? in the end, we developed a method we hoped would acknowledge and capture data on the widespread interest in digital humanities, while also identifying practitioners who are actually building and managing long-term resources. the survey was directed at all faculty in a few departments selected by our campus-based partners (often based in the library) and we tried to get as broad participation as possible.
but the survey also sought to identify those among the respondents who had managed or created digital projects that they considered to be for public use and that were expected to need ongoing support and development. this approach worked for the most part, but while we were eager to learn more about those major, public digital research initiatives, we soon realized that campus leaders still need a better understanding of what faculty (and even students) are doing, and to what extent those other activities generate materials that will require a support strategy. we hope that those who choose to undertake a campus-based survey for themselves will consider ways to capture more data about the sorts of files, formats, and intentions of even those practitioners whose work is not intended for public use. in other words, while we focused on a particular use case that is known to create significant sustainability challenges, there are many faculty and students who are creating other types of resources and data that may also pose challenges over the long term, and the survey could prompt respondents to offer greater detail about that work so that a better-informed and finer-tuned system of support could be developed.

accomplishments
the paper and toolkit were published on june , and represent the final deliverables from this grant. in the course of conducting the study and developing the paper and tools, we had several accomplishments worth emphasizing:
• we undertook and completed four full campus profiles, twice as many as originally proposed, by altering the methodology used to focus less on gathering cost data and more on understanding process and strategy.
• our original estimate was to interview about people in the course of this project. in the end, we interviewed over individuals, including some more than once.
• we held on-campus meetings to share back our findings and discuss them with campus stakeholders. each campus partner was offered a short menu of types of events we might host for them. this phase of the project was extremely productive; rather than just providing us with feedback on our work (though they served this purpose, also), in many cases, the sessions ended up being a good neutral ground for people across campus to begin to have substantive conversations about how to better coordinate their activities. several times, we were told that meetings like that were very valuable but “just don’t happen.” it may take some time to see the results from this work; we will continue to track evidence of people and teams using this approach to develop their own campus-based strategies.
• our marketing team developed lists of contacts and communications to disseminate the report and the toolkit. an announcement was sent to , contacts, including us library deans and directors, digital humanities centers, digital humanists, publishers, and higher education and libraries media. additionally, the announcement was posted on the acrl digital humanities interest group listserv, the acrl sustainability listserv, and on ithaka s+r’s blog and twitter account.

audiences
the readership for this report includes several groups. while it is too soon after publication to have a full picture of the impact the paper and toolkit will have, we expect the readership to include:
• library administrators and dh coordinators. we see as the main audience for this report those in the library who manage digital projects, whether for the library’s own collections or as a service to faculty who come to the library for support. we have heard from some library directors that the report will be useful to them and others who are considering developing dh strategies for themselves.
in just the last week, we have heard from an aul for technology at a major research institution (wisconsin) and a head of a liberal arts college publishing program (amherst) who reported that they had shared the report widely with campus colleagues.
• dh practitioners. faculty who are engaged in building digital projects of their own will be one of our audiences here, too. as many of the initiatives to gain further funding to support staff hires, technology capacity and education for practitioners are led by faculty members, we believe that the report will provide them with the tools they need to gather data on the nature of the need on their campus, and to have structured conversations with administrators about possible paths forward.
• heads of other related units on campus. many units on campus, from digital humanities centers to technology or visualization groups, to the university press, are or could be participating in the process of creating and managing the new digital research resources being created on campus. while digital humanities centers seem to be the obvious leader in these discussions, we hope that the paper encourages a discussion about the roles that the dh center does assume, and important roles that others will need to take on.
• senior administrators (deans, provosts). our research made clear that in most places, this issue is only beginning to emerge at the highest levels of administration, and yet the instances of greatest coordinated investment only occur with support from the top. we hope that senior administrators will find this to be a useful paper for framing the issues, and we imagine that library directors and faculty will direct them to it for this purpose.

the reach of this report and this topic is nation-wide and even international.
while geographic differences do exist concerning institutional strategy, the tools offered here are easily translatable to other settings. a complete list of interviewees is available in the appendices of the final report, starting on page . in total, we spoke with individuals from institutions of higher education and other organizations, such as funding agencies. those institutions included public and private universities and colleges. while most were research universities, were liberal arts colleges.

in terms of outreach, within the first ten days of publication (june – june ), we had total page views of the final report, which has been downloaded times. there have been page views for the toolkit, and various elements of the kit were downloaded times. social media has played a significant role in spreading the word about this publication. the initial ithaka s+r announcement was re-tweeted times, reaching , followers. another people and organizations tweeted independently about the project, and those tweets were re-tweeted times, for a total reach of , .

evaluation
the project was supported by an advisory committee, which included richard detweiler, president, great lakes colleges association; martin halbert, dean of libraries, university of north texas; stanley n. katz, director, center for the arts and cultural policy studies; lecturer with rank of professor, woodrow wilson school of public and international affairs; president emeritus of the american council of learned societies; maria c. pantelia, professor, classics, university of california, irvine; director, thesaurus linguae graecae®; richard spies, former executive vice president for planning and senior advisor to the president at brown university, former vice president for finance and administration at princeton university. ann j.
wolpert, director of libraries, mit, was a valued member of the advisory committee until her death in october .

the advisory committee offered valuable guidance at key milestones throughout the project:
• a conference call on may , allowed us to share findings from the sector-wide research that had been completed and to select the deep dive sites.
• feedback on the project leader questionnaire was solicited via email in september , after we had refined the instrument in collaboration with the campus coordinators.
• a conference call on december , served to discuss preliminary findings from the faculty surveys and to review early sketches of project lifecycles, as well as to discuss the format and emphasis of the final report.
• a final in-person meeting was held at the ithaka s+r offices in new york on march , . at this session, the committee reviewed draft profiles of brown and indiana and helped us to plan our on-campus workshops and roundtables.
• several members of the advisory committee read full drafts of the final paper and offered detailed comments and feedback.

in addition, we received valuable feedback from members of the community at different points throughout the project, thanks to our close working relationships with our partner campuses. meetings held at each of the four campuses permitted us to test out the ideas in the paper and those used to build the toolkit with groups of varying composition. the campus workshops included the following:
• columbia university: roundtable of several senior library directors and staff dedicated to supporting digital humanities work, including the aul for collections and services, the associate vp, digital programs and technology services, the director of the center for digital research and scholarship, the acting executive director for the center for new media teaching and learning, the director, humanities and history libraries, and the digital humanities coordinator.
• indiana university bloomington: roundtable with the libraries executive council, which included the dean of the library and five associate deans; a presentation of research findings attended by about thirty people, including the majority of the libraries executive council, several members of the libraries’ digital collections services, a handful of faculty members who have created dh projects, and a few support staff from other units around campus; finally, a library staff training session on sustainability principles attended by several members of the libraries executive council, members of the libraries’ digital collections services, and the reference librarians.
• university of wisconsin-madison: key stakeholder roundtable, including the dean of the library, an aul for library technologies, the cio, and two associate deans in the college of letters and science.
• brown university: key stakeholder roundtable, including the university librarian, two auls, the dh librarian, the deputy provost, two key administrators in other support units, and a handful of faculty with dh projects.

these sessions were structured to include a formal presentation of findings from the campus-based survey, including dh activity on campus; a review of overlaps and gaps in the current system of supporting services to digital humanities project leaders; and a facilitated discussion on the key motivators for offering dh support. the feedback from these sessions, and our observations of how the “key stakeholder” sessions helped to surface often sensitive topics in very productive ways, strongly influenced the final design of the sustainability implementation toolkit, in particular.

the broader public is just now starting to respond to the project, and we will continue to track this over the months ahead.
at the annual meeting of the association of american university presses (aaup, june ) a session on publishing and digital humanities included a brief synopsis and discussion of the paper. at the annual meeting of the american library association (june ) a discussion of the paper is on the agenda of the acrl interest group for digital humanities.

responses to the paper will vary for different categories of readers. dh practitioners, particularly faculty members, may find this useful as a way to raise awareness of the topic on their campuses. some well-known dh practitioners (alex gil at columbia and trevor muñoz at mith) were recently quoted in “when digital projects end,” an article in inside higher ed devoted to the study. gil pointed out that “the report does a fine job of teasing out the diversity of support approaches at different universities…now that they have brought this level of detail to the conversation, i hope we can begin expanding the concept of support that the study assumes to include the learning of faculty, students and librarians. nothing in my estimate will support digital scholarship and allow it to endure constant technological change -- on any campus -- more than shared knowledge.”

carl straumsheim, “when digital projects end,” inside higher ed, june , . http://www.insidehighered.com/news/ / / /study-preserve-digital-resources-institutions-should-play-their-strengths

continuation and long term impact
unlike some of the other grantees in this program, we consider this paper to be the end product of a successful research project, so there are no immediate plans to continue the project itself. ithaka s+r will continue to host the paper and the toolkit, and to promote them through webinars and other speaking engagements that we participate in. the papers that ithaka s+r publishes tend to remain relevant over many years, so we have reason to believe that the readership of this work will continue to grow, as we continue to promote it.

as a result of the project we came to know the senior library and dh leaders at the four campuses we worked most closely with: columbia, brown, indiana and wisconsin. these relationships have been wonderfully productive, not just for the paper, but in other ways as well. we are developing a training course, for example, and may now end up partnering with columbia in future years. this grant gave us license to speak with many of the leaders of the dh community, and this led to other possible partnerships, as well. it has been a pleasure getting to know many of the library directors, faculty, senior administrators and other departmental heads, and these relationships will certainly last well beyond the end of the grant.

we have started to hear of some encouraging illustrations of the impacts the process has had for those campuses we partnered with for this study. according to university librarian harriette hemmasi of brown university, “the process at brown heightened insight among the various stakeholders about the ways in which we see ourselves and each other as part of the campus infrastructure that supports digital humanities and digital scholarship, more generally. it also provided an impetus for increased collaboration, resulting in an award from the provost to fund a two-year digital humanities lecture series, including at least one short-term scholar-in-residence each year.”

according to lee konrad, associate university librarian, technology strategies and data services at university of wisconsin-madison, “the process helped to illustrate both the pros and cons of supporting [dh-related] work in a highly decentralized manner. i came away feeling that while this type of support model has its challenges, it also has great rewards in that it brings together scholars, technologists, and librarians from across the campus in ways that might be difficult in a highly structured environment. the process gave us a very important opportunity to work together at administrative levels, and … to discuss engaging in sustainable digital humanities work at scale.”

in addition, as is often the case, while this project has answered some questions it has also suggested others in need of further investigation. for example, it became clear that there is much more to discuss concerning what it means to “publish” or “disseminate” one’s work. many campus roundtables with library staff and faculty suggested that posting materials in a campus repository was all that was needed. and yet, we heard very little about significant impact or efforts to build audience for these projects, and even where there was a university press on campus, it was not generally considered a key player. we hope to further explore this topic, by working with members of the association of american university presses as well as with library publishing units that are starting to play a role in this area.
grant products
during the course of this grant, we wrote and published the final report, entitled sustaining the digital humanities: host institution support beyond the start-up phase, as well as the sustainability implementation toolkit. both are freely available and hosted on the ithaka s+r website: http://www.sr.ithaka.org/research-publications/sustaining-digital-humanities
• sustaining the digital humanities: host institution support beyond the start-up phase (http://www.sr.ithaka.org/sites/default/files/sr_supporting_digital_humanities_ f.pdf)
• sustainability implementation toolkit (http://www.sr.ithaka.org/research-publications/sustainability-implementation-toolkit)

the toolkit outlines three key phases, each including several downloadable files:

step one: assess the landscape (http://www.sr.ithaka.org/content/assess-landscape)
• survey of faculty creation of digital content, tools, and infrastructure
• customizing and implementing the survey
• interview guide: directors of support units
• interview guide: senior administrators
• interview guide: digital project leaders

step two: identify overlaps and gaps (http://www.sr.ithaka.org/content/identify-overlaps-and-gaps)
• analyzing the data gathered
• overlaps and gaps worksheet

step three: discuss and address institutional priorities (http://www.sr.ithaka.org/content/discuss-and-address-institutional-priorities)
• hosting a stakeholder roundtable
• stakeholder roundtable: presentation template

additional features of the toolkit include:
• a briefing paper for digital project leaders (http://www.sr.ithaka.org/sites/default/files/briefing_paper.pdf)
• intake questionnaire for new digital projects (http://www.sr.ithaka.org/sites/default/files/intakequestionnaire.pdf)
volume issue /

fingal’s cave: the integration of real-time auralisation and 3d models
shona kirsty noble
the glasgow school of art
renfrew st
glasgow g rq
united kingdom
s.noble@gsa.ac.uk

abstract: fingal’s cave: an audiovisual experience is an immersive virtual reality application that combines 3d models, a narrative soundscape and interactive auralisation in a recreation of a visit to fingal’s cave. this research explores the importance of audio in heritage visualisations and its practical implementation. fingal’s cave is a sea cave on the isle of staffa off the west coast of scotland, revered for its extraordinary acoustics. audio is extremely important in the history and culture of fingal’s cave and it has long been romanticised, inspiring countless works of folklore, art, poetry and music. the visualisation is designed to encourage viewers to become a part of the cultural narrative and explore the cave for themselves, moving around and speaking to hear their voice auralised as it would be inside the cave. this is the first time the acoustic characteristics of a heritage site have been included in a visualisation in this interactive manner. this paper reviews whether auralisation is effective and meaningful and supports a creative response to heritage sites. the impact of the visualisation in terms of engaging with communities of interest and in the field of audio in heritage visualisation is discussed.
the research suggests it is necessary that audio be included in heritage visualisations to give a full and complete understanding of how people experience such sites.

keywords: real-time auralisation, virtual reality, heritage visualisation, acoustic response, fingal’s cave, staffa, intangible heritage, audiovisual data

introduction
fingal’s cave: an audiovisual experience is a practical demonstration of the integration of auralised audio with a 3d model. in this research, the 3d model forms a representation of the visual elements of the example heritage site, fingal’s cave, while the auralised audio forms a representation of the acoustic properties. while there is an extensive body of research into the importance of aurality and its inclusion in studies of cultural heritage (especially works by foka & arvidsson, mattern and sterne), and similarly in aural reproduction using digital technologies (see works by brereton, guthrie and laird), further research is needed that fully addresses the intersection between them. this means that key issues relating to engagement and how dissemination modes ultimately influence our perception of heritage sites could benefit from applying the principles of the prototypes in this research. the very ephemeral nature of sound makes it one of the most intangible forms of heritage, but through full acoustic analysis, it can be further understood and replicated. this research aligns with the work of betts and veitch, in that it advocates moving away from a focus on vision as the sense that best describes an experience of a place, and instead invoking full embodiment. the application developed is a multisensory exploration featuring real-time auralisation to create an immersive experience that recreates the effects on sound of fingal’s cave.
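auralisation of this kind is commonly achieved by convolving a dry (unreverberated) signal with an impulse response measured in the space being recreated. the following is a minimal sketch of that general idea only, not the application’s implementation: it is written in plain python, uses direct time-domain convolution, and introduces a hypothetical `BlockAuraliser` class that processes audio one block at a time while carrying the reverberant tail over to later blocks (overlap-add). the three-sample impulse response in the usage note is invented for illustration; a real-time system would more likely use a measured response and fft-based partitioned convolution to keep latency low.

```python
from typing import List, Sequence


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Direct linear convolution of a dry signal with an impulse response."""
    if not impulse_response:
        raise ValueError("impulse_response must contain at least one sample")
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


class BlockAuraliser:
    """Hypothetical helper: convolves incoming audio blocks with a fixed
    impulse response and keeps the tail that rings on past each block."""

    def __init__(self, impulse_response: Sequence[float]) -> None:
        if not impulse_response:
            raise ValueError("impulse_response must contain at least one sample")
        self.ir = list(impulse_response)
        # Reverberant energy from earlier blocks that has not been output yet.
        self.tail = [0.0] * (len(self.ir) - 1)

    def process(self, block: Sequence[float]) -> List[float]:
        """Return as many wet samples as there are samples in `block`."""
        wet = convolve(block, self.ir)
        # Add the tail left over from previous blocks (overlap-add).
        for k, t in enumerate(self.tail):
            wet[k] += t
        n = len(block)
        self.tail = wet[n:]   # everything beyond this block rings into the next
        return wet[:n]
```

as a usage example, feeding a signal through `BlockAuraliser([1.0, 0.5, 0.25])` in blocks of any size and concatenating the results gives the same samples as the start of a single `convolve` call over the whole signal, which is the property that lets live microphone input be processed as it arrives.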
there is an interactive aspect which allows a participant to speak into a microphone and hear their own voice as if inside the cave. this draws upon research into virtual stage acoustics and involves real-time auralisation generated from an impulse response captured inside fingal’s cave. as well as applying real-time auralisation to the user’s voice, the virtual experience has a narrative soundscape featuring readings of poetry written about the cave, quotes from its historical visitors and ambient wave recordings. the output is implemented in unity, a free-to-use game development platform, and viewed through a virtual reality head mounted display (hmd).

this work supports the ongoing research of the historical archaeology research project on staffa (harps), which is a collaboration between the glasgow school of art’s school of simulation and visualisation and the national trust for scotland. harps is adopting a new multi-disciplinary approach to recording and exploring the historical archaeological potential of the island. work includes excavation, reflectance transformation imaging (rti) of tourist graffiti in fingal’s cave, and laser scan, audio and photogrammetric surveys. harps is the source of the base data sets used in this project, specifically the laser scan and audio sound sweep data of fingal’s cave, as well as photogrammetric data of staffa. staffa is now a national nature reserve, and its current owners, the national trust for scotland, want to attract people to visit the site.

anna foka and viktor arvidsson, ‘experiential analogies: a sonic digital ekphrasis as a digital humanities project’, dhq: digital humanities quarterly , no. , , http://www.diva-portal.org/smash/get/diva : /fulltext .pdf.
shannon mattern, ‘ear to the wire: listening to historic urban infrastructures’, amodern , no. october, , – , http://amodern.net/article/ear-to-the-wire/.
jonathan sterne, the audible past, duke university press, , https://culturetechnologypolitics.files.wordpress.com/ / /jonathan-sterne-the-audible-past-intro.pdf.
jude s brereton, damian t murphy and david m howard, ‘the virtual singing studio: a loudspeaker-based room acoustics simulation for real-time musical performance’, joint baltic-nordic acoustic meeting, odense, denmark, , – .
a guthrie et al., ‘using ambisonics for stage acoustics research’, international symposium on room acoustics, , – , ftp://s -s .cisti.nrc.ca/outgoing/isra /cd_isra /papers/p .pdf; anne guthrie, ‘stage acoustics for musicians: a multidimensional approach using 3d ambisonic technology’, rensselaer polytechnic institute, , http://www.fraufraulein.com/anne/anneguthrie_ _ _ _dissertation_finalsmall.pdf.
iain laird, damian murphy and paul chapman, ‘comparison of spatial audio techniques for use in stage acoustics laboratory experiments’, eaa joint symposium on auralization and ambisonics, berlin, germany, , – , http://eprints.whiterose.ac.uk/ /; iain laird, damian murphy, and paul chapman, ‘energy-based calibration of virtual performance systems’, c th international conference on digital audio effects, york, , – , http://eprints.whiterose.ac.uk/ /; iain laird, damian murphy, and paul chapman, ‘spatialisation accuracy of a virtual performance system’, joint baltic-nordic acoustics meeting, odense, denmark, ; iain laird et al., ‘development of a virtual performance studio with application of virtual acoustic recording methods’, audio engineering society th convention, london, , – .
eleanor betts, ‘towards a multisensory experience of movement in the city of rome’, rome, ostia, pompeii: movement and space, ray laurence and david j newsome, eds, oxford university press, , – , https://doi.org/ . /acprof:osobl/ . . ; eleanor betts, ‘the multivalency of sensory artifacts in the city of rome’, senses of the empire: multisensory approaches to roman culture, eleanor betts, ed, st ed., routledge, , – ; eleanor betts, ‘the sacred landscape of picenum ( - bc): towards a phenomenology of cult places’, inhabiting symbols: symbol and image in the ancient mediterranean, edward wilkins, john b. and herring, accordia s, eds, accordia research institute, university of london, , – .
jeffrey veitch, ‘soundscape of the street: architectural acoustics at ostia’, senses of empire: multisensory approaches to roman culture, eleanor betts, ed, st ed., routledge, , – .
harps, ‘the historical archaeology research project on staffa (harps) | society of antiquaries of scotland’, , http://www.socantscot.org/research-project/the-historical-archaeology-research-project-on-staffa/.
ibid.
however, there are several barriers to experiencing the island, such as physical access issues (including cost), weather, and the conservation needs of the reserve. using harps datasets, this paper discusses the benefits and problems of integrating auralised audio and 3d models in heritage visualisations. this aligns directly with the harps project’s objective, which is not to attempt to replicate the experience of the cave in vr but to create a parallel, affective and immersive experience of fingal’s cave for those who are unable to visit the site in person.

fingal’s cave: cultural and historical context

fingal’s cave is a sea cave on the isle of staffa, one of the inner hebrides off the west coast of scotland. it is to the west of mull, between iona and gometra, as shown in figure .

figure . staffa is located to the west of the isle of mull, scotland. courtesy google maps.

the name, staffa, comes from the norse ‘stafr,’ meaning ‘staff’, and the island is so named because of its distinctive columnar basalt structure. the prismatic basalt columns and quasi-hexagonal pattern were formed upon the rapid cooling and hardening of volcanic material, during which an enormous amount of tension builds up. the most economical way to release this tension is thought to be through hexagonal cracks in the rock. the island is about one mile long by half a mile across, but features at least twelve prominent sea caves, the most famous of which is fingal’s cave, shown to the right of figure .

national trust for scotland, ‘staffa - visit’, , https://www.nts.org.uk/visit/staffa/.
google maps, ‘staffa - google maps’, , https://www.google.co.uk/maps/place/staffa/.
donald b. macculloch, staffa, th ed., david & charles, .
philip ball, ‘pattern formation in nature: physical constraints and self-organising characteristics.’, architectural design , no. , , – , http://www.philipball.co.uk/images/stories/docs/pdf/admc_ball_final .pdf.
john patterson maclean, an historical, archaeological and geological examination of fingal’s cave in the island of staffa, rewritten, ulan press, .

figure . fingal’s cave on the isle of staffa. image © the author, .

the route from the landing place to the cave mouth is treacherous with its sea spray soaked stone, as can be seen in figure , but the effort is rewarded, not only through the cave’s striking visuals, but through its extraordinary acoustics. the many surfaces of the cave create a uniquely resonant space: when a sound wave is emitted into a space, it reflects off surfaces in the space and these reflections attenuate over a certain amount of time. this reduction is a result of the sound being absorbed by the walls of the space, and the speed at which it reduces depends on the material and size of the space. this attenuation is known as the reverberance of the space. the hard basalt rock columns are reflective and angular, creating a diffuse reverberant field in which sound is reflected in multiple different directions. depending on the weather this may be pleasing to the ear or unsettling; under certain conditions, the caves of the island make a loud ‘booming’ noise, which was said to terrify inhabitants of the island.

springer, springer handbook of acoustics, thomas d. rossing, ed, nd ed., springer-verlag new york, , https://doi.org/ . / - - - - ; laird, murphy, and chapman, ‘spatialisation accuracy of a virtual performance system’.

figure . the access causeway for fingal’s cave. image © the author, .
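the decay of reflections described above can be made concrete with a small numerical sketch. the python example below is illustrative only and is not part of the original study: it estimates a reverberation time (the time taken for the sound energy to fall by 60 db) from an impulse response using backward (schroeder) integration, a standard room-acoustics technique. the impulse response here is synthetic decaying noise with a known decay time, standing in for a real measurement, and every function name and parameter value is an assumption made for the illustration.

```python
import numpy as np


def schroeder_decay_db(ir):
    """Backward-integrated energy decay curve of an impulse response, in dB."""
    energy = np.asarray(ir, dtype=float) ** 2
    remaining = np.cumsum(energy[::-1])[::-1]  # energy still to arrive after each sample
    return 10.0 * np.log10(remaining / remaining[0])


def estimate_rt60(ir, sample_rate, fit_range_db=(-5.0, -25.0)):
    """Fit a line to part of the decay curve and extrapolate it to -60 dB."""
    decay_db = schroeder_decay_db(ir)
    upper, lower = fit_range_db
    idx = np.where((decay_db <= upper) & (decay_db >= lower))[0]
    times = idx / sample_rate
    slope, _ = np.polyfit(times, decay_db[idx], 1)  # dB per second (negative)
    return -60.0 / slope


# Synthetic stand-in for a measured response: noise whose level falls
# by 60 dB over true_rt60 seconds (hypothetical values).
sample_rate = 16000
true_rt60 = 1.5
rng = np.random.default_rng(0)
t = np.arange(3 * sample_rate) / sample_rate
envelope = 10.0 ** (-3.0 * t / true_rt60)  # amplitude: -60 dB at t = true_rt60
synthetic_ir = rng.standard_normal(t.size) * envelope

rt60 = estimate_rt60(synthetic_ir, sample_rate)
print(f"estimated reverberation time: {rt60:.2f} s")
```

a hard, reflective space such as a basalt cave would be expected to give a longer decay than a soft-furnished room of similar size, which is the dependence on material and size noted above.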
sound is particularly culturally important to the identity of scotland, and there is a long tradition of music, poetry and oral history being inspired by the landscape. fingal’s cave and its acoustics are a major contributor to this, with many historical references noting its evocative acoustic qualities. the influence of this natural geological feature is widespread and can be seen through the depth of artistic responses to staffa and fingal’s cave throughout the last two and a half centuries. its many layers of engagement and creative responses began with the visit of the botanist joseph banks in . banks made the connection between fingal’s cave and james macpherson’s translation of ossian’s poem, “fingal” from the th century. although its authenticity is widely doubted, the ‘translation’ nonetheless brought staffa and fingal’s cave further into the public view. since then, it has inspired countless poets, including wordsworth, hogg, scott and keats; musicians, most famously mendelssohn, and more recently pink floyd; artists, including turner; and literature by the likes of verne, among many others.

ralph crane and lisa fletcher, ‘inspiration and spectacle: the case of fingal’s cave in nineteenth-century art and literature’, interdisciplinary studies in literature and environment , no. , , – , https://doi.org/ . /isle/isv .
jennifer davis michael, ‘ocean meets ossian: staffa as romantic symbol’, romanticism , no. , , – , https://muse.jhu.edu/article/ ; hugh blair, ‘the poetical works of ossian by james macpherson with a critical dissertation by hugh blair’, , http://www.exclassics.com/.
nigel leask, ‘fingalian topographies: ossian and the highland tour, - ’, journal for eighteenth-century studies, , no. , .
the cave also features in multiple gaelic and irish legends, most famously in the story of fionn maccoul (fingal), who built a causeway of stepping stones from northern ireland to scotland to confront his rival benandonner. the giant’s causeway in northern ireland and fingal’s cave on staffa are said to be the last remaining pillars of this causeway. these romanticized interpretations marked a change in the view of scotland from a place to be feared to somewhere much revered and sparked a surge in ‘geotourism’. in modern times, these romanticized views are largely forgotten: fingal’s cave is seen primarily as a geological feature. gordon proposes that reconnecting with the ‘voices’ of the stones of scotland’s natural geodiversity will “link people today with their cultural roots and sense of place”. dynamic engagement with landscape and its past romanticism may open further opportunities to connect with the past, present and future of our natural heritage.

audio and heritage

archaeoacoustics, the study of acoustics in relation to archaeological and heritage sites, is critical if we are to fully understand the impact and importance of a heritage site. this is particularly true in the case of fingal’s cave, as much of the engagement with this site concerns its acoustic as much as its visual properties, which is attested by the numerous references to sound in the creative works inspired by it (see examples below). archaeological research in the last couple of decades has taken a much stronger interest in the specific relationships between artificial archaeological structures (stone circles, standing stones, chambered tombs and churches) and the role that acoustics may have played in their design and even in their psychological impact. for examples of this see works by diaz-andreu, mattern, murphy, and watson and keating.
bradley’s work on the archaeological significance of natural places is also relevant, and green, and murphy and shelley, have curated useful repositories of impulse responses recorded at heritage sites across scotland and beyond. despite this, there remains much less focus on the acoustics of natural places and the impact that this may have had on earlier people, although till, who studied the sound archaeology of the natural caves of altamira, spain, suggests acoustics play a strong role in their contextualisation and their construction as significant places in the past, something which the harps project argues also occurs at fingal’s cave.

yale center for british art, ‘staffa, fingal’s cave’, , http://collections.britishart.yale.edu/vufind/record/ .
visit scotland, ‘scottish folklore - ghosts, myths and legends | visitscotland — phantom piper, fingals cave, corryvreckan whirlpool’, , http://ebooks.visitscotland.com/ghosts-myths-legends/phantom-piper-corryvreckan/.
s schama, landscape and memory, fontana press, .
john e gordon, ‘engaging with geodiversity: ‘stone voices’, creativity and cultural landscapes in scotland’, scottish geographical journal , no. – , , – .
ibid., .
margarita díaz-andreu, ‘archaeoacoustics of rock art: quantitative approaches to the acoustics and soundscape of rock art’, caa . keep the revolution going: proceedings of the rd annual conference on computer applications and quantitative methods in archaeology, stefano campana et al., eds, archaeopress, , – , https://www.researchgate.net/profile/julio_del_hoyo-melendez/publication/ _colour_and_space_in_cultural_heritage_in_ ds_the_interdisciplinary_connections/links/ b ae e f de.pdf.
shannon mattern, ‘sonic archaeologies’, the routledge companion to sound studies, michael bull, ed, st ed., routledge, , – , http://wordsinspace.net/shannon/wp-content/uploads/ / /mattern_sonicarchaeologies_routledge_uneditedproofs.pdf.
damian murphy et al., ‘acoustic heritage and audio creativity: the creative application of sound in the representation, understanding and experience of past environments’, internet archaeology, no. , march , , https://doi.org/ . /ia. . .
aaron watson and david keating, ‘architecture and sound: an acoustic analysis of megalithic monuments in prehistoric britain’, antiquity , no. , , – , https://doi.org/ . /s x .
richard bradley, an archaeology of natural places, routledge, .
‘archaeoacoustics scotland - home’, , http://www.archaeoacousticsscotland.com/.
damian murphy and simon shelley, ‘openair | the open acoustic impulse response library’, , http://www.openairlib.net/.
rupert till, ‘sound archaeology: terminology, palaeolithic cave art and the soundscape’, world archaeology , no. , , – , https://doi.org/ . / . . .

many historic cathedrals are noted for their acoustic qualities, and this is important as fingal’s cave has often been referred to as a ‘cathedral’ and a ‘temple’.
for example, the poem ‘staffa, the island / fingal’s cave’ by john keats likens it to a church organ and cathedral:

“…this was architectured thus
by the great oceanus! —
here his mighty waters play
hollow organs all the day;
here, by turns, his dolphins all,
finny palmers, great and small,
come to pay devotion due, —
each a mouth of pearls must strew!
many a mortal of these days
dares to pass our sacred ways;
dares to touch, audaciously,
this cathedral of the sea! ...”

pentcheva’s study of the aural architecture of hagia sophia draws connections between the effects that both poetry and the polymorphic materials used in the construction of the church have on the human brain. it may be supposed that similar effects of the natural structure of fingal’s cave sparked poetic imagination. poetry moves beyond dryly descriptive language to conjure a dynamic and emotive vision of a subject, as in the following verse by james hogg:

“dark staffa! in thy grotto wild,
how my wrapt soul is tought to feel!
oh! well becomes it nature’s child
now in her stateliest shrine to kneel!
thou art no fiends’ nor giants’ home -
thy piles of dark and dismal grain,
bespeak thee, dread and sacred dome,
great temple of the western main! ...”

pentcheva’s research suggests that such ekphrastic descriptions of a space can integrate “a direct response to sensual materiality of the space and uncovers in it a metaphysical dimension”. this is reflected in the descriptive language used; for example, sound is often described in a physical way as ‘piercing’ or ‘thumping’, as it has a direct effect on the body. this notion helps us understand the importance of a space to past visitors. they were so taken by a space, visually, aurally and corporally, that they were compelled to write poetry or, in the case of mendelssohn and fingal’s cave, compose an overture. though the name, fingal’s cave, comes from the mythical celtic hero, the gaelic name is ‘an uamh bhinn’, which means the melodious cave.
one early visitor, pancoucke, noted the effects on his wife’s voice when singing inside the cave: “… her voice vibrated throughout the columns, becoming fuller and more powerful, the tones seemed to take on a new life, the held notes became stronger; the religious majesty of the location infused these harmonies with something beautiful and grandiose. everyone applauded and it seemed that even the gods of this enchanted place echoed this applause in the air.”

bartleby, ‘fingal’s cave. john keats ( - ). staffa, the island. henry wadsworth longfellow, ed. - . poems of places: an anthology in volumes. scotland: vols. vi-viii’, , http://www.bartleby.com/ / / .html.
bissera v. pentcheva, ‘hagia sophia and multisensory aesthetics’, gesta international centre of medieval art, / , , – , http://iconsofsound.stanford.edu/pentcheva.gesta .hagia sophia.pdf.
donald b. macculloch quoting hogg (entry in visitor book in ulva), staffa, th ed., david & charles, , .
pentcheva, ‘hagia sophia and multisensory aesthetics’, .
macculloch, staffa, .
translated extracts from ibid.

pentcheva notes the ‘performative’ characteristics of the space when the acoustics work together with the visuals, an insight this research draws upon. one visitor to fingal’s cave, queen victoria, noticed that “the rocks under water were all colours, pink, blue, green, which had the most wonderful effect”. the sea-soaked basalt pillars and shifting seas bounce light, creating extraordinary effects. this, coupled with the colours of the algae on the rocks and the misty sea spray, creates an opalescent effect, shown in figure .

figure . the effects of light in fingal’s cave. image © the author, .
the presence of light on the natural rock formations can be compared to the importance of light in churches. pentcheva observes that the built materials of hagia sophia “become animate in the shifting natural light, and these transient manifestations trigger the spectator’s memory and imagination to conjure up images”. perhaps these visual qualities, together with the dominance of wave sounds within the cave, are why fingal’s cave leaves such an impression on its visitors.

pentcheva, ‘hagia sophia and multisensory aesthetics’.
eve eckstein quoting queen victoria (diary entry ), historic visitors to mull, iona and staffa, excalibur press of london, , .
pentcheva, ‘hagia sophia and multisensory aesthetics’.
ibid., .

audio and visualisation

real-time auralisation is used frequently in stage acoustics research, which explores the effect that the spatial distribution of sound has on a musician’s experience in varying performance spaces such as concert halls. it is pertinent to clarify some terms here, most importantly an impulse response, which can be thought of as the ‘acoustic fingerprint’ of a space. an impulse response can be created by deconvolving a sine sweep recorded in that space: the known emitted signal is removed from the recording in order to discern the effects the space had on that signal. an auralisation is the resultant audio after an impulse response has been applied, representative of how a voice or other audio would sound within the physical space in which the impulse response was recorded. laird et al. developed a virtual performance studio in which a musician can “practice in a virtual version of a real performance space in order to acclimatise to the acoustic feedback received on stage before physically performing there”.
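applying an impulse response to audio, as defined above, is a convolution, and doing so on live input means processing the signal in short blocks as it arrives. the following python sketch illustrates one common way to do this (overlap-add fft convolution) and checks that the streamed result matches an ordinary offline convolution. it is a minimal illustration under stated assumptions, not the implementation used in any of the systems cited here: the class name, block size and the random stand-in signals are all hypothetical.

```python
import numpy as np


class StreamingConvolver:
    """Convolve an audio stream with a fixed impulse response, block by block (overlap-add)."""

    def __init__(self, impulse_response, block_size):
        self.block_size = block_size
        ir = np.asarray(impulse_response, dtype=float)
        # The FFT must be long enough to hold one block convolved with the whole response.
        self.fft_size = 1 << int(np.ceil(np.log2(block_size + ir.size - 1)))
        self.ir_spectrum = np.fft.rfft(ir, self.fft_size)
        # Reverberant tail produced by earlier blocks that has not been output yet.
        self.tail = np.zeros(self.fft_size - block_size)

    def process(self, block):
        block = np.asarray(block, dtype=float)
        if block.size != self.block_size:
            raise ValueError("block has the wrong length")
        spectrum = np.fft.rfft(block, self.fft_size) * self.ir_spectrum
        wet = np.fft.irfft(spectrum, self.fft_size)
        wet[: self.tail.size] += self.tail       # add the tail left by earlier blocks
        self.tail = wet[self.block_size:].copy()  # keep what spills past this block
        return wet[: self.block_size].copy()


# Stand-ins for a measured impulse response and for microphone input (hypothetical data).
rng = np.random.default_rng(1)
impulse_response = rng.standard_normal(300) * np.exp(-np.arange(300) / 60.0)
voice = rng.standard_normal(1024)

convolver = StreamingConvolver(impulse_response, block_size=128)
streamed = np.concatenate(
    [convolver.process(voice[i:i + 128]) for i in range(0, voice.size, 128)]
)
reference = np.convolve(voice, impulse_response)[: voice.size]
print("matches offline convolution:", np.allclose(streamed, reference))
```

in this sketch the block size sets the minimum delay between speaking and hearing the result, so a real-time system has to trade latency against processing cost; long impulse responses are often split into partitions for that reason.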
brereton implemented a virtual acoustic environment in which participants can physically move around the virtual space and, when they speak, hear the acoustics change depending on their position within the space. the acoustically rendered environments use real-time convolution with ambisonic impulse responses. ambisonic audio gives an impression of sound within the full sphere: an array of speakers (minimum of for first-order ambisonics) is placed around a sweet-spot from which a spatial reproduction of a sound field can be experienced.

real-time auralisation, derived from impulse responses captured within the cave, allows participants to hear themselves as if immersed in the reverb of the cave. this is achieved by first capturing an impulse response inside fingal’s cave, which is a short, sharp, clean sound representative of the acoustic ‘fingerprint’ of the space. it is then applied to, or convolved with, live speech to simulate the reverberation of the space where the impulse response was recorded. this project goes one step further by including a 3d recreation of fingal’s cave, tracked in 3d space and rendered in real-time depending on the position of the viewer’s head. in this way the 3d model is married with the live auralisations; what the user hears matches what they see. the visuals of the vr environment and the 3d sound work in unison, creating a highly immersive, multisensory experience.

as discussed, there is a strong case for including sound in digital reconstructions of heritage sites, but further to this, vr can be considered a platform for including kinaesthesia. as slaney, foka and bocksberger argue in their forthcoming article, kinaesthesia can help with fully understanding the physicalities of the space and “contribute to formulating conceptions of the ancient past”.
with virtual reality, the user can explore the virtual environment and interact with their surroundings and, as forte notes, “what really changes our capacities of digital/virtual perception is the experience, a cultural presence in a situated environment”. the use of vr is particularly appropriate in the case of fingal’s cave, as much engagement with this site concerns its experiential as much as its visual properties. however, jeffrey argues that a significant barrier to interaction with digital objects and spaces in vr is their ‘immateriality’, and that digital ‘recreations’ struggle to carry the same signs of use that a physical artefact would. the digital object is impervious to the marks of its visitors or users, making it difficult to imagine it has a past steeped in use and reuse.

francis stevens, ‘creswell crags - recording report’, york, , https://drive.google.com/file/d/ b ux jaxlczkuwx n rla zaym /view.
laird et al., ‘development of a virtual performance studio with application of virtual acoustic recording methods’.
ibid.
brereton, murphy, and howard, ‘the virtual singing studio: a loudspeaker-based room acoustics simulation for real-time musical performance’.
springer, springer handbook of acoustics, .
helen slaney, anna foka, and sophie bocksberger, ‘ghosts in the machine: experiencing animation’, forthcoming , .
maurizio forte, ‘virtual reality, cyberarchaeology, teleimmersive archaeology’, fabio remondino and stefano campana, eds, 3d recording and modelling in archaeology and cultural heritage: theory and best practices, , , , https://vle.gsa.ac.uk/bbcswebdav/pid- -dt-content-rid- _ /courses/acpg_ihv_ /virtual_reality_cyberaerchaeology_teleim.pdf.
stuart jeffrey, ‘digital heritage objects, authorship, ownership and engagement’, authenticity and cultural heritage in the age of 3d digital reproductions, paola di giuseppantonio di franco, fabrizio galeazzi, and valentina vassallo, eds, mcdonald institute for archaeological research, , , https://www.repository.cam.ac.uk/bitstream/handle/ / /authenticity_chapter .pdf?sequence= .
stuart jeffrey, ‘challenging heritage visualisation: beauty, aura and democratisation’, open archaeology, , , – , https://doi.org/ . /opar- - .

technical implementation

capturing the sound

staffa and fingal’s cave as heritage sites present a multitude of potential problems when it comes to data capture and processing. the methodological approach becomes an assault course in dealing with issues with field recording and ways to overcome them. for auralisation, i used an ambisonic impulse response that was recorded inside fingal’s cave as part of the harps project. since the cave has an extremely high noise floor due to waves and wind, the acoustic characteristics were captured using a swept sine wave with the receiver positioned mid-way through the cave. an exponential sine sweep is a computer-generated sine wave that increases in frequency exponentially from hz to khz; it is emitted into a space, reverberates around the space and is captured by a microphone and storage medium. laird et al. have shown this method of auralisation can create a ‘reasonably accurate simulation’ of the reverberation of a space. the high noise level experienced when recording the impulse responses is masked by including the ambient noise of the cave in the soundscape. for the purposes of this project, allowances must be made as to the quality of the auralisation since, as this is designed for use in tourism, the auralisation will be directly affected by the acoustic response of the space in which it is exhibited.

the sine sweep was deconvolved using matlab via a bespoke procedure written by iain laird. this procedure removes the original emitted signal from the recording in order to discern the effects the space had on that signal. ronan breslin, lecturer at the school of simulation and visualisation at the glasgow school of art, reduced audio distortions on the impulse response and applied it to the voice recordings in reaper to simulate the cave’s reverberation. audioclip and audioclip show the effect the auralisation has on the recorded speech:

audioclip . an example voice recording before the impulse response is applied. courtesy the author, . https://soundcloud.com/shonaknoble/before

angelo farina, ‘simultaneous measurement of impulse response and distortion with a swept-sine technique’, proc. aes th conv, paris, france, no. i, , – , https://doi.org/ . /aspaa. . ; angelo farina, ‘advancements in impulse response measurements by sine sweeps’, the nd convention of the audio engineering society, audio engineering society, , http://pcfarina.eng.unipr.it/public/papers/ -aes .pdf.
laird, murphy, and chapman, ‘spatialisation accuracy of a virtual performance system’.
mathworks, ‘matlab - mathworks’, , https://uk.mathworks.com/products/matlab.html?s_tid=hp_products_matlab.
stevens, ‘creswell crags - recording report’.
reaper, ‘reaper | digital audio workstation: audio production without limits’, , https://www.reaper.fm/.
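the measurement chain described in this section (emit an exponential sine sweep, record it in the space, deconvolve the recording) can be sketched in python. this is a minimal illustration under stated assumptions and is not the bespoke matlab procedure used in the project: the deconvolution here is a regularised spectral division, the ‘space’ is a synthetic response with one direct sound and one reflection, and the sweep range, duration and sample rate are arbitrary illustrative values rather than those used in the cave.

```python
import numpy as np


def exponential_sine_sweep(f_start, f_end, duration, sample_rate):
    """Sine wave whose frequency rises exponentially from f_start to f_end (Hz)."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    rate = np.log(f_end / f_start)
    phase = 2.0 * np.pi * f_start * duration / rate * (np.exp(t * rate / duration) - 1.0)
    return np.sin(phase)


def deconvolve(recorded, sweep, regularisation=1e-6):
    """Estimate an impulse response by regularised division of the two spectra."""
    n = len(recorded)
    sweep_spectrum = np.fft.rfft(sweep, n)
    recorded_spectrum = np.fft.rfft(recorded, n)
    power = np.abs(sweep_spectrum) ** 2
    eps = regularisation * power.max()  # avoids dividing by ~0 outside the sweep band
    response_spectrum = recorded_spectrum * np.conj(sweep_spectrum) / (power + eps)
    return np.fft.irfft(response_spectrum, n)


# Hypothetical measurement: a 2 s sweep played into a "space" that returns
# the direct sound after 100 samples and a half-strength reflection after 400.
sample_rate = 16000
sweep = exponential_sine_sweep(20.0, 7000.0, duration=2.0, sample_rate=sample_rate)
true_response = np.zeros(500)
true_response[100] = 1.0
true_response[400] = 0.5
recorded = np.convolve(sweep, true_response)

estimate = deconvolve(recorded, sweep)[: true_response.size]
print("direct sound found at sample", int(np.argmax(np.abs(estimate))))
```

because the sweep spreads its energy over a long duration, the recovered response stands out well above steady background noise, which is consistent with the choice of a swept sine for a space with a high noise floor.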
audioclip . an example voice recording after the impulse response is applied. courtesy the author, .

the latest version of unity, . , supports importation of ambisonic audio. the pre-auralised audio files are ambix (wav b-format) and require audio channels, which unity decodes. the oculus spatialiser plugin supports ambisonic audio in virtual reality. it uses an array of eight virtual speakers surrounding the viewer to decode and spatialise audio, which gives an impression of sound within a 3d environment. it is designed for use with ‘broadband’ audio such as wind and wave sounds, which feature highly in this soundscape. when -channel ambix format audio files are placed in the scene, this plugin makes use of the highly responsive head tracking mentioned above to alter the ambisonic orientation of the audio source with the movement and rotation of the hmd. the oculus spatialiser, together with the hmd technology, ensures the 3d modelled scene works in unison with the audio, so that what the viewer sees matches what they hear. they can physically move around the space and turn their head, and both the scene and the audio update accordingly. video demonstrates how the speaker’s voice changes with the orientation of the user’s head as they move around the space:

video . screen-grab video of fingal’s cave: an audiovisual experience in use, courtesy the author, .

unity, ‘unity - game engine’, , https://unity d.com/.
oculus, ‘playing ambisonic audio in unity . (beta)’, , https://developer.oculus.com/documentation/audiosdk/latest/concepts/ospnative-unity-ambisonic/.
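the way head tracking alters the orientation of an ambisonic sound field can be illustrated with a short python sketch. it assumes first-order ambix audio (channel order w, y, z, x with sn3d normalisation) and handles yaw only; it shows the underlying idea of counter-rotating the field as the head turns and makes no claim about how unity or the oculus spatialiser implement this internally. the function names and the test tone are hypothetical.

```python
import numpy as np


def encode_first_order(signal, azimuth, elevation=0.0):
    """Encode a mono signal as first-order ambiX (ACN order W, Y, Z, X; SN3D).

    Azimuth is measured anticlockwise from straight ahead, in radians.
    """
    w = signal
    y = signal * np.sin(azimuth) * np.cos(elevation)
    z = signal * np.sin(elevation)
    x = signal * np.cos(azimuth) * np.cos(elevation)
    return np.stack([w, y, z, x])


def rotate_for_head_yaw(ambix, head_yaw):
    """Counter-rotate the sound field so sources stay fixed in the world as the head turns."""
    w, y, z, x = ambix
    c, s = np.cos(-head_yaw), np.sin(-head_yaw)
    # W (omnidirectional) and Z (vertical) are unaffected by a rotation about the vertical axis.
    return np.stack([w, x * s + y * c, z, x * c - y * s])


# A source 90 degrees to the listener's left; the listener then turns to face it.
tone = np.sin(2 * np.pi * 440 * np.arange(480) / 48000)
source_left = encode_first_order(tone, azimuth=np.pi / 2)
turned = rotate_for_head_yaw(source_left, head_yaw=np.pi / 2)
print("source now straight ahead:", np.allclose(turned[3], tone))
```

after the turn, all of the directional energy sits in the front-facing x channel, which is the behaviour a listener perceives as the sound staying in place while their head moves.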
madeline carson et al., ‘surround sound impulse response measurement with the exponential sine sweep; application in convolution reverb’ (university of victoria, ), http://arqen.com/wp-content/docs/surround-sound-impulse-response.pdf.
oculus, ‘playing ambisonic audio in unity . (beta)’.

https://soundcloud.com/shonaknoble/voice-recording-after-auralisation
https://vimeo.com/

narrative soundscape

this research will afford engagement with and understanding of multiple layers of the cave’s history, encompassing multiple viewpoints, to enable people to write their own experience with the site as much as possible. voice recordings are layered with ambient wave noise and mendelssohn’s overture to tell the story of the cave’s romanticised history through an abstract narrative. the ambient wave and wind recordings were captured in fingal’s cave as part of the harps project, and the script features folklore, poetry and descriptions of the cave. it introduces the cave in multiple languages: french, gaelic and english. the voice audio was recorded in a studio environment to reduce room presence on the recordings as much as possible.

folkloric elements were included in the narrative; these tell of a rivalry between the giants fionn maccoul and benandonner and of how staffa and the giant’s causeway were made and unmade via their interactions. for the cultural narrative, poetry was chosen for its imagery of experiences within the cave. much of it describes the acoustics and comes from a time when recording the sounds in reality was simply not possible.
john keats seems in awe and has a thoroughly spiritual experience in his poem staffa, the island/fingal’s cave. he cites ‘organs’ and a ‘spirit’, and compares the cave to a cathedral. james hogg also compares the cave to a spiritual space, referring to it as a ‘temple’. however, hogg’s experience seems entirely different; his imagery conveys a much darker experience and illustrates the sheer power of the space.

the wave audio had to be layered carefully as it tends to sound like undifferentiated white noise. inspired by the work of tim nielsen in the animated film moana, the idea was to give character to the wave noise and create the feeling that although the ocean is ever present within the cave, it also ‘speaks’ to the cave’s visitors, inspiring poetry and leaving a lasting impression. to this end, the wave audio grows stronger at meaningful points in the narrative. this is coupled with evocative poetry by sir walter scott which reinforces the strength and character of the presence of the ocean within fingal’s cave.

audioclip . the full narrative soundscape. courtesy the author, . featuring quotes by c.l.f. pancoucke, queen victoria, d.b. macculloch, m. faujas and sir robert peel. folklore retold from visit scotland and the gaelic otherworld (j.g. campbell). music: the hebrides (fingal’s cave), op. , felix mendelssohn, from musopen. poetry: the lord of the isles: canto iv, sir walter scott ; staffa, the island/fingal’s cave, john keats - ; and untitled (entry in the visitor’s book at ulva), james hogg . voice over artists: norman mackay, robbie noble and shona noble.

musopen, ‘the hebrides (fingal’s cave), op. | free music’, , https://musopen.org/music/ /felix-mendelssohn/the-hebrides-fingals-cave-op- /; creative commons, ‘public domain mark - creative commons’, , https://creativecommons.org/share-your-work/public-domain/pdm.
visit scotland, ‘scottish folklore - ghosts, myths and legends | visitscotland - phantom piper, fingals cave, corryvreckan whirlpool’.
bartleby, ‘fingal’s cave. john keats ( - ). staffa, the island. henry wadsworth longfellow, ed. - . poems of places: an anthology in volumes. scotland: vols. vi-viii’, , originally published by james r. osgood & co., boston, – , http://www.bartleby.com/ / / .html.
donald b. macculloch quoting hogg (entry in visitor book in ulva), staffa, th ed., david & charles, , .
tim nielsen, ‘how tim nielsen and team made ‘moana / vaiana’ sound so good | a sound effect’, interview by jennifer walden for a sound effect, , https://www.asoundeffect.com/moana-sound/.
poemhunter, ‘the lord of the isles: canto iv. poem by sir walter scott - poem hunter’, , originally published by adam and charles black, edinburgh, , https://www.poemhunter.com/poem/the-lord-of-the-isles-canto-iv/.

https://soundcloud.com/shonaknoble/narrative-soundscape

the final quote is from sir robert peel: “i have stood on the shores of staffa; i have seen the ‘temple not made with hands’”. it is read in multiple voices, which are layered and desynchronized. this final gesture is intended to call to mind the ghosts of the past visitors to fingal’s cave and to inspire people to explore the cave for themselves.

live auralisation

to be able to speak within the vr experience and hear your own voice reverberated in the same way as it would be in the cave, live auralisation is required.
since unity does not support real-time auralisation, it is implemented via external software and delivered as an adjunct to the core unity experience. drawing again from stage acoustics research, the real-time auralisation is implemented using a headset microphone for participants to speak into. this live audio is routed from the microphone via a soundcard to the reaper digital audio workstation, where a single reaverb instance (a powerful reverberation plugin) is applied to each of the w, x, y, z audio channels (the four channels work together to give an impression of sound within the full sphere). each reaverb instance is loaded with the corresponding w, x, y or z channel of the impulse response that was recorded in the cave. the w, x, y, z channels are then routed to an ambisonic binaural decoder for headphone playback; the decoder used was the ambisonic toolkit. the decoded signal is then fed to headphones via the soundcard output.

this is the same process as with the pre-auralised audio, except that the convolution is performed in real time as the participants are speaking. consequently, some processing latency is present, which causes the speech to sound delayed. latency was reduced by reducing the recording buffer size, adjusting the resolution of the reverb’s fast fourier transform (fft) and changing the head related transfer function (hrtf) in the decoder. this is an iterative process, achieved by experimenting with the settings to keep unwanted feedback at bay while reducing processing latency as much as possible. this ensures responsiveness within the space.

in order to make applying the acoustic characteristics of the cave to live speech suitable for mobile dissemination, the reverberation could have been approximated using the built in unity audio source component.
this takes input from the microphone and applies reverberation which is mocked up by ear, listening and tweaking until the desired effect is achieved. although this is a viable option, for this project external software performs the live auralisation in order to make use of the impulse responses recorded inside the cave, as these give an authentic representation of the acoustic response rather than a generic or synthesised one. the cave has a unique acoustic response and, since the impulse response was recorded in situ, it is a more convincing and authentic representation, which is important if the visualisation is to give a considered understanding of an experience within fingal’s cave.

donald b. macculloch quoting sir robert peel (speech delivered in glasgow ), staffa, th ed., david & charles, , .
reaper, ‘reaper | digital audio workstation: audio production without limits’.
geoffrey francis, ‘supplement to reaper user guide the reaper cockos effects summary guide’, , , https://www.reaper.fm/guides/reaeffectsguide.pdf.
ronan breslin, ‘pers. comm.’, july , .
ambisonic toolkit, ‘atk for reaper’, , http://www.ambisonictoolkit.net/documentation/reaper/.
iain laird, ‘pers. comm.’, july , .
unity, ‘unity - manual: audio source’, , https://docs.unity3d.com/manual/class-audiosource.html.
laird, ‘pers. comm’.

audiovisual implementation

with the individual elements of the visualisation processed and ready to be brought together, the final stage of implementation of the project was to ensure that the 3d space and auralisation worked together in a coherent and meaningful way.
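before turning to the visual side, the live auralisation chain from the previous section (one convolution reverb per w, x, y, z channel, fed by a single microphone signal) can be illustrated offline. the sketch below is only a minimal illustration under stated assumptions: the function names, the toy sample values and the 256-sample buffer at 48 khz are hypothetical and are not taken from the project, and plain time-domain convolution stands in for the fft-based convolution that a real-time reverb plugin would normally use.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def auralise_bformat(mono_voice, ir_wxyz):
    """Convolve one mono voice signal with each B-format impulse response channel.

    ir_wxyz maps the channel names "w", "x", "y", "z" to impulse responses
    (hypothetical layout). Returns a dict with the same keys holding the
    reverberated channels, ready to be passed to an ambisonic decoder.
    """
    return {name: convolve(mono_voice, ir) for name, ir in ir_wxyz.items()}


def buffer_latency_ms(buffer_size_samples, sample_rate_hz):
    """Time needed to fill one audio buffer, in milliseconds."""
    return 1000.0 * buffer_size_samples / sample_rate_hz


# Toy data only: a 3-sample "voice" and made-up 2-sample impulse responses.
voice = [1.0, 0.5, 0.25]
toy_ir = {"w": [1.0, 0.5], "x": [0.5, 0.0], "y": [0.0, 0.25], "z": [0.0, 0.0]}

wet = auralise_bformat(voice, toy_ir)
print(wet["w"])  # [1.0, 1.0, 0.5, 0.125]
print(buffer_latency_ms(256, 48000))  # one buffer's worth of delay, in ms
```

the last helper shows why a smaller recording buffer lowers latency: the delay contributed by one buffer is simply its length divided by the sample rate, so halving the buffer halves that part of the delay, at the cost of leaving less time to process each block.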
a unity vr environment is where the 3d model and the pre-auralised narrative soundscape were brought together. simvis holds laser survey data of fingal’s cave, generated as part of the harps project, which was used for the 3d digital model. for use in unity, the model was simplified while keeping the optimal level of detail.

figure . point cloud data of fingal’s cave, captured by harps. screenshot from cyclone.
figure . the original mesh of fingal’s cave. screenshot from 3dreshaper.
figure . the processed mesh of fingal’s cave. screenshot from 3ds max.

it was not possible to use the original texture of the cave captured by the laser scanner due to gaps in the original dataset, so this was created within unity. the final stage was to scale the model so that it shows in the visualisation at : scale. to contextualise fingal’s cave within its positioning on the isle of staffa, a 3d digital model of staffa was created from photogrammetry data also captured as part of the harps project and processed using agisoft photoscan.

figure . digital mesh of the isle of staffa from photogrammetry data. screenshot from photoscan.
figure . textured model of the isle of staffa. screenshot from photoscan.

harps, ‘the historical archaeology research project on staffa (harps) | society of antiquaries of scotland’.
leica geosystems, ‘leica cyclone - 3d point cloud processing software - leica geosystems - hds’, , http://hds.leica-geosystems.com/en/leica-cyclone_ .htm.
3dreshaper, ‘3dreshaper | 3dreshaper’, , http://www.3dreshaper.com/en/.
autodesk, ‘3ds max | 3d modelling, animation & rendering software | autodesk’, , https://www.autodesk.co.uk/products/3ds-max/overview.
agisoft, ‘photoscan’, , http://www.agisoft.com/.
ibid.
ibid.
the visualisation is experienced in virtual reality (vr) through a head mounted display and head tracking system (an htc vive). this offers a highly immersive experience and gives a good sensation of being surrounded by the virtual environment. the user is tracked in position and orientation, using highly receptive head tracking, ensuring head movement and rendering of the virtual environment are synchronised. it uses a stereoscopic display, which means the viewer can perceive depth, which in turn informs interactivity.

on entering the unity application, the viewer is dropped into a scene where they can see staffa and fingal’s cave and ambient audio surrounds them. the narrative audio plays (this is a linear, -minute-long track), and as participants move around and explore their surroundings, the reverberation of the audio changes with their position and the direction they are facing (movements are tracked by the hmd and reverberation is controlled by the oculus spatialiser). once the narrative soundscape finishes, the ambient soundscape continues to play (for a further minutes before looping) and participants are invited to interact with the acoustics and speak to hear their voice auralised.

through use of interactive virtual reality, the exploration afforded in the virtual site surpasses the breadth of exploration available at the actual site. in this digital recreation, viewers can move all around the cave, from front to back and left to right; they are constrained only to the plane of the ocean, whereas visitors to the real cave are constrained to a narrow causeway on the right side that stops half-way inside.
this is designed to increase the ‘otherworldly’ nature of the experience and allow participants to feel like they are ‘walking on water’.

although the nature of the site made recording of both the laser scan data and ambisonic impulse responses difficult, there were workarounds in post processing available to prepare both for public consumption. the factors that made recording difficult, namely limited access and difficult conditions, directly contribute to the reasoning behind the creation of this visualisation. they further add to the allure of the cave, as not everyone will have the opportunity to visit.

impact

one aim of this project was to demonstrate that 3d models and interactive real-time auralisation can come together in a meaningful way. this research has proved their integration possible, but what is the impact of the visualisation in terms of engaging with communities of interest and in the field of audio in heritage visualisation?

the main interpretive aspect of the experience is the narrative soundscape, designed to give a sense of the different cultures and experiences of the many visitors to fingal’s cave throughout history. it is intended to afford engagement with and understanding of multiple layers of histories and meanings; a reflection of the intangible side to its heritage. it is purposefully abstract to allow for personalisation; everyone should take something different from it depending on their own knowledge and experiences. as ioannidis et al. suggest: “if digital storytelling used the right combination of interest to focus attention, empathy to make visitors feel they are part of the story world and imagination to let them fantasise alternative realities, then it could constitute a successful entertainment experience”.

by its very definition, folklore tells the stories of a community passed by word of mouth through generations.
its inclusion in the immersive experience means that the stories, traditions and beliefs of a community are woven into the narrative. the poetry was chosen for its differing imagery of experiences within the cave. the use of poetry, storytelling and voice give it cultural presence, as do the sea sound samples running underneath. storytelling through an abstract narrative soundscape brings the human experience to an otherwise inanimate visualisation. it is designed to make people want to visit the site, but equally, provide an authentic experience for those who cannot, hence enhancing virtual engagement off site and supporting communities of interest in their continuing interaction with the heritage site.

htc, ‘vive | discover virtual reality beyond imagination’, , https://www.vive.com/uk/.
steven lavalle, virtual reality, cambridge university press, , http://vr.cs.uiuc.edu/.
yannis ioannidis et al., ‘one object many stories: introducing ict in museums and collections through digital storytelling’, digital heritage international congress (digitalheritage) \ ieee , , , http://www.madgik.di.uoa.gr/sites/default/files/digitalheritage _ _ioannidisetal_ .pdf.

in terms of accessibility, the experience works as an interactive installation and is suitable for dissemination as an exhibit in art galleries, museums and heritage centres. the institution would need a pc with reaper installed, a vive and a space large enough to house the vive. an optional extra would be a screen that shows what the person wearing the hmd is seeing, making for a more inclusive experience.
this 3d tool has the potential to maximise visitor usage, for all internal, external, local and international audiences (it could be packaged for download from the internet). the core concept of the visualisation has been designed around contextualising and increasing understanding of the heritage site, both on and off-site. it goes beyond a static 3d model of the site, bringing visuals together with exploration and interactivity.

future directions

this research has established a prototype for a visualisation that acts as a profoundly immersive representation of a heritage site, highlighting the importance of including audio, as well as visual characteristics, in heritage visualisations if they are to give a more authentic and deeply engaging understanding of the site.

the next stage in the development of this project would be to engage in a full-scale evaluation. a possible evaluation methodology might follow the work of pujol-tost and rate the visualisation for the following factors, outlined as critical for achieving cultural presence: “1) realistic behaviour and scientific/cultural reliability of the virtual environment; 2) distinctive cultural elements (place, material culture, everyday life, people’s aspect); 3) presence of realistic, autonomous human characters; and 4) communicational aspects of technology (visual realism, affordances for interaction in environment; intuitiveness of interaction in devices)”. the [leap] project postulates that cultural presence, the sensation of ‘being there’, is determined by narrative and can boost social significance and understanding.

participants should be recruited in two groups: those who have visited the cave and could provide feedback on the application’s success at recreating this experience; and those who have not visited the cave and could provide feedback on the application’s success at inspiring them to visit.
both groups should be asked whether they draw any significance from the experience in terms of engaging with the cave’s heritage. the ‘voices’ of participants could be harnessed as a dataset that captures responses to the cave of people who are unable to visit it in person, allowing them to become part of the ongoing cultural narrative surrounding it.

laia pujol-tost, ‘being there and then. cultural presence for archaeological virtual environments’, eurovr international conference , unpublished, , , https://www.upf.edu/documents/ / /eurovr_ poster_cameraready.pdf/ bdae b- c c- - f - e a .
pujol-tost, ‘being there and then. cultural presence for archaeological virtual environments’.
forte, ‘virtual reality, cyberarchaeology, teleimmersive archaeology’.

through the technical implementation of this project, some areas of further development have arisen. although this visualisation serves as a demonstration of the core concept, it should be considered a prototype for which the following developments could be made in the future.

critically, the prototype requires the implementation of more impulse responses. since an impulse response measures the acoustic response of a space from the exact position from which it was measured, it is specific to that point. indeed, as sterne affirms, “a single impulse response no more captures the motion of sound in a room over time than a photograph of a person walking captures his or her route”. so, to truly represent the cave’s acoustics, many more impulse responses measured all around the cave would be needed.
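because each impulse response belongs to a single measurement point, a listener standing between two measured points needs some blend of the two. the sketch below illustrates one simple option, a linear cross-fade along a line between two positions. it is an assumption made purely for illustration, not something the project implemented: the function names, the one-dimensional positions and the toy sample values are all hypothetical, and a blended response only approximates what would actually be measured at the intermediate point.

```python
def crossfade_weights(position, pos_a, pos_b):
    """Linear cross-fade weights for a listener between two measured points.

    Positions are distances along a single line (hypothetical 1-D layout).
    Returns (weight_a, weight_b), which always sum to 1.
    """
    if pos_a == pos_b:
        raise ValueError("measurement positions must differ")
    t = (position - pos_a) / (pos_b - pos_a)
    t = min(1.0, max(0.0, t))  # clamp when the listener is outside the measured span
    return 1.0 - t, t


def blend_impulse_responses(ir_a, ir_b, weight_a, weight_b):
    """Weighted sample-by-sample sum of two impulse responses.

    The shorter response is zero-padded so both have the same length.
    """
    n = max(len(ir_a), len(ir_b))
    a = list(ir_a) + [0.0] * (n - len(ir_a))
    b = list(ir_b) + [0.0] * (n - len(ir_b))
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Toy example: the listener is a quarter of the way from point a to point b.
weight_a, weight_b = crossfade_weights(2.5, 0.0, 10.0)
blended = blend_impulse_responses([1.0, 0.5], [0.0, 1.0, 0.5], weight_a, weight_b)
print(weight_a, weight_b)  # 0.75 0.25
print(blended)             # [0.75, 0.625, 0.125]
```

because convolution is linear, convolving a voice with this blended response gives the same result as cross-fading the outputs of two separate convolutions, so either arrangement could be tried when experimenting.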
recording impulse responses from multiple positions, so as to survey the full acoustic footprint of the cave, requires further field work on staffa, something the harps project intend to do in future field seasons.

though the project makes use of free and open access software wherever possible, more work is needed to design integrated digital infrastructures for experience-based sensory representation that ease financial pressures on both the initial purchase and maintenance.

this research has highlighted a demand for live auralisation to be supported in game development engines such as unity. ideally, multiple impulse responses could be loaded into the game engine and the convolution linked to the position of the participant within the virtual scene. there would need to be some experimentation with regards to applying a cross-fade between impulse response positions. the implementation of the impulse responses into the core unity application would allow an application such as this one to be self-contained and consequently be widely disseminated through heritage organisations to reach new audiences, expanding engagement with the site.

this research has established a prototype for a visualisation that dynamically integrates 3d models and auralisation so that they work in unison in a highly responsive, multisensory virtual experience. it also champions the capturing of the cultural and intangible side to its heritage, moving people to want to visit the cave and equally to conserve it.

acknowledgements

sincerest thanks go to stuart jeffrey and the glasgow school of art for support throughout this project, as well as ronan breslin, victor portela, mike marriott, matthieu poyade, brian loranger, daniel livingstone and jessica argo. special thanks also go to iain laird and nick green for their expertise regarding auralisation. thanks indeed to helen slaney for sharing your forthcoming article with anna foka and sophie bocksberger.
finally, thank you to the view journal editors and the anonymous reviewers for your constructive criticism and guidance.

jonathan sterne, ‘space within space: artificial reverb and the detachable echo’, grey room, , , , http://sterneworks.org/reverb--sterne.pdf.
anna foka et al., ‘beyond humanities qua digital: spatial and material development for digital research infrastructures in humlabx,’ digital scholarship in the humanities , no. , , – , https://doi.org/ . /llc/fqx .

view journal of european television history and culture vol. , , doi: . / - . .jethc
publisher: netherlands institute for sound and vision in collaboration with utrecht university, university of luxembourg and royal holloway university of london.
copyright: the text of this article has been published under a creative commons attribution-noncommercial-no derivative works . netherlands license. this license does not apply to the media referenced in the article, which is subject to the individual rights owner’s terms.

biography

shona noble holds a master’s degree in international heritage visualisation from the glasgow school of art. she is passionate about telling the stories of past people and cultures, conveying many layers of meaning of arts and heritage through compelling narratives and visuals. her specialisms include researching and creating audience-focussed digital experiences using futuristic technologies, and developing interactive digital content for the arts and cultural heritage. her most recent research focussed on the integration of audio in heritage visualisation and culminated in a virtual reality project that presented an audiovisual exploration of fingal’s cave, demonstrating the extraordinary acoustics of the cave on the isle of staffa.
this project features real-time photorealistic visualisation and is designed to be entirely immersive, allowing uninterrupted exploration of the digitally recreated cave. it includes an innovative interactive element in which a viewer can speak into a microphone and hear their voice auralised as it would be inside the cave, a first in heritage visualisation.

http://creativecommons.org/licenses/by-nc-nd/ . /nl/deed.en_gb

why we need to find time for digital humanities: presenting a new partnership model at the university of sussex

article (published version)

harvell, jane and ball, joanna ( ) why we need to find time for digital humanities: presenting a new partnership model at the university of sussex. insights, ( ). p. . issn -

this version is available from sussex research online: http://sro.sussex.ac.uk/id/eprint/ /
recognizing that academic libraries should develop and nurture strong, mutually beneficial relationships with researchers in digital humanities, the authors believe it is strategically important to invest time and resources exploring ideas and partnering with academic colleagues on projects. this approach can provide many unforeseen benefits to both the library service and to the workforce. the article is based on our experience as core associates of the sussex humanities lab at the university of sussex. it outlines the impact this collaboration has had, including influencing working practices and culture within the library, involvement in research bids, informing the development of new services, and addressing library questions using digital humanities methods. most importantly, it exemplifies a new model of the librarian as equal partner in the research process.

introduction

this study shows how we responded to professor hitchcock’s impassioned plea to have us at the heart of the sussex humanities lab (shl), our digital humanities centre here at the university of sussex. we propose that there are significant and tangible benefits for libraries in carving out strong relationships with digital humanities (and other departments involved in digital scholarship), and that these develop from our potential to collaborate and contribute to research, rather than purely to support it.
at the university of sussex, we have set as a strategic priority the engagement of library staff with the work and research of digital humanities, and would fully endorse the finding of the recent rluk report that ‘participating in such a programme opens up opportunities for academic libraries and their staff and increases the visibility of their work and collections’. we also argue that there are noticeable benefits for staff skills and expertise. this article presents the model of engagement that we have fostered at sussex, and the impact on the library and its staff.

insights – ( ), november. digital humanities: a new partnership model at sussex | jane harvell and joanna ball
joanna ball, head of library content delivery & digital strategy, the library, university of sussex
jane harvell, head of academic services, the library, university of sussex

video clip: tim hitchcock, professor of digital history (co-director of the sussex humanities lab). interview by jane harvell, january . available for download here: https://doi.org/ . /uksg. .s https://www.youtube.com/watch?v=cciisgz n &feature=youtu.be

the sussex humanities lab and the lab model

the rluk report suggests that there is no single model for digital humanities within institutions. the shl has been established as a lab model, described by varner and hswe as having ‘a specific focus, tied either to the mission of the campus or to the aims of their founders, which necessarily means that many do not take on responsibility for digital projects that fall outside of the scope’. the lab is dedicated to developing and expanding research into how digital technologies are shaping our culture and society, and draws on expertise from a number of different disciplines to answer questions within arts and humanities.
library staff as core associates

the directors of the shl have a long-standing relationship with a number of library colleagues, and it felt very natural for us to be included within the funding bid to the university to establish the sussex humanities lab in . several library staff are named as core associates and the lab finances a research fellow based in the library who reports to our head of special collections. additionally, the trustees of the mass observation archive (moa), one of our major special collections, support a research fellow based in the lab who uses digital humanities techniques to work on the moa. embedding research activity within special collections is very different from the library’s more traditional support for research activity.

as core associates we are equal partners in the development of the lab, participating in awaydays and regular lab meetings, and encouraged to attend and organize seminars. involvement at this level brings us closer to our academic colleagues and so increases our confidence in offering input to research where we feel we can contribute. an example is the digital preservation for social sciences and humanities conference, a joint shl/digital repository of ireland event, where the organizing committee included a member of library staff. this brought the conference to the attention of a new audience, academic librarians, a number of whom attended an academic conference for the first time.

securing firm and enthusiastic support from senior library management has been crucial to success, as it is vital to identify and advocate for the benefits of such a relationship across the university – this is key to the work and aims of the lab.
in addition to traditional areas of engagement (for example, metadata creation), our association with digital humanities here at sussex has resulted in creative collaborations that require new ways of working (and thinking) not normally associated with libraries and their staff and more akin to the habits and behaviours of researchers. in libraries we design and deliver tightly managed projects and are accountable for their success. in contrast, within academia experimentation and failure are integral to the research process, and offer constructive ways of formulating questions and problem solving. our involvement with this different approach brings us closer to researchers’ working methods and provides the opportunity to transfer some of these practices into our library work. ‘there is no single model for digital humanities within institutions’ ‘we are equal partners in the development of the lab’ ‘firm and enthusiastic support from senior library management has been crucial to success’ library staff as collaborators library staff are also collaborators in research. their skills and expertise are revealed and recognized through the close relationship with the shl and they are now being named on research bids. our head of special collections, fiona courage (interviewed), has been included in a number of successful large bids to offer advice on appropriate archiving policy and practice and also in her capacity as curator of the moa. we anticipate future opportunities for the library to be included in research bids around digital preservation, research data management (rdm), metadata, data ethics and other areas where we have expertise and experience. video clip: fiona courage, head of special collections (core associate of the sussex humanities lab). interview by jane harvell, january . available for download here: https://doi.org/ . /uksg. 
.s

working on solutions to institutional problems

working with digital humanities presents opportunities for new perspectives on long-standing library problems. the university of sussex, along with many other similar higher education institutions, is going through a period of intense growth, and this has put pressure on services and spaces across the campus, in particular the library building. the information we have traditionally gathered on building usage has been based on a demographic breakdown of gate entries or rough headcounts. recently, user experience (ux) approaches have begun to bring us better insights into how the building is being used. working with the lab we have piloted a project that applies a digital humanities technique, big data analysis, to data from our wifi access points to track the devices of our users as they move around the building. this provides management information that can be used for the development of new services as well as to aid decision-making about building design. we have also benefited from shl collaboration and expertise in the development of other services, such as rdm and digital preservation. the library had previously struggled to engage humanities researchers with discussions on rdm, but input from the shl has enabled us to design a service that addresses the needs of humanities as well as sciences. research data and digital preservation are shared issues where the lab and the library each lack some elements of understanding, expertise or practical application, and so there is potential in combining efforts.

the library as a research subject

libraries have traditionally focused on opening up digital collections to their researchers, and more recently have been exploring how they can support researchers in applying text and data mining techniques to these collections. but libraries, and not just their collections, have the opportunity to be research subjects.
there is potential to open up the mass of data that libraries capture and create as management information to make it available for use by the research community, for student projects and in hackathons. at sussex we are exploring how we can provide data sets with open licences through our web pages. these types of service have the potential to be extended to other disciplines, and we are now being approached by colleagues within anthropology about a research project to use ux methodologies to understand the use of our spaces.

changing culture

our collaboration has had an impact on the wider library culture, as more of our staff participate in the research life of the lab, for example, by attending and contributing to the shl’s this and thatcamp. this is an opportunity not only for senior managers or subject librarians but for all library colleagues to get involved. in addition, we have recently established a library innovation group, comprising staff at all levels and from all sections of the library and including representation from the shl. this is an example of our relationship with the lab increasing our confidence in the entire research life cycle: supporting research is no longer limited to our research support team and we have an opportunity to expose more of our staff to the academic research life cycle. we have always provided training and advice to researchers, both as consumers of information, and, increasingly, as generators of information. however, our relationship with shl has enabled us to make this a two-way process, giving us an opportunity to enhance the digital skills of our library staff.
we recently collaborated on a library carpentry programme for our own staff and the wider community, giving attendees the skills to clean and manipulate their own data.

open scholarship

as librarians, we are advocates of open access (oa) and have taken on responsibility for managing the university’s oa funds and monitoring compliance with funder policies. however, our engagement with the lab has enabled us to develop a broader understanding of the requirements for open scholarship. we are currently working with one of the lab co-directors to develop a business case for a university press initiative for the creation of new types of open research outputs. this press would create capacity within the library to collaborate on innovative digital projects across campus. through our partnership with the shl we are able to support the university’s strategic ambitions to be an exemplar for open research.

conclusion and new ways of working

literature on libraries and the digital humanities has tended to focus on how libraries can support the digital humanities – an extended form of research support. we think that this is missing the point. professor caroline bassett, co-director of the sussex humanities lab (interviewed), proposes that what is exciting about our collaboration with the shl is that it is a new model of partnership, embedding the work of the lab within the library and vice versa.

video clip: caroline bassett, professor of media and communications (co-director of the sussex humanities lab). interview by jane harvell, january . available for download here: https://doi.org/ . /uksg. .s

just as librarians can partner with academics on their research outputs to support them as consumers and producers of information, so can digital humanities researchers partner with us to address library problems from a different perspective.
abbreviations and acronyms

a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘abbreviations and acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa

competing interests

the authors have declared no competing interests.

references

kamposiori c, the role of research libraries in the creation, archiving, curation, and preservation of tools for the digital humanities, , london, rluk: http://www.rluk.ac.uk/wp-content/uploads/ / /digital-humanities-report-jul- .pdf (accessed september ).
varner s and hswe p, special report: digital humanities in libraries, american libraries, : https://americanlibrariesmagazine.org/ / / /special-report-digital-humanities-libraries/ (accessed july ).
mass observation archive: http://www.massobs.org.uk/ (accessed september ).
digital preservation for social sciences and humanities: http://dpassh.org/ (accessed september ).
this and thatcamp: http://thisand.thatcamp.org/ (accessed september ).
library carpentry: https://librarycarpentry.github.io/ (accessed september ).

article copyright: © jane harvell and joanna ball. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.
corresponding author: jane harvell, head of academic services, the library, university of sussex, falmer, brighton bn ql, uk
tel: + ( ) ; internal extension: | e-mail: j.harvell@sussex.ac.uk
orcid id: http://orcid.org/ - - -

joanna ball
orcid id: http://orcid.org/ - - -

to cite this article: harvell j and ball j, why we need to find time for digital humanities: presenting a new partnership model at the university of sussex, insights, , ( ), – ; doi: https://doi.org/ . /uksg.

published by uksg in association with ubiquity press on november

provided by the author(s) and nui galway in accordance with publisher policies. please cite the published version when available. downloaded - - t : : z. some rights reserved. for more information, please see the item record link above.

title: a landscape archive: methods for interaction design, preservation, access, and mapping - a case study
author(s): joy, cillian; keane, aisling; corrigan, peter
publication date: - -
publication information: joy, cillian, keane, aisling, & corrigan, peter. ( ). a landscape archive: methods for interaction design, preservation, access, and mapping - a case study. journal of web librarianship, ( - ), - . doi: . / . .
publisher: taylor & francis
link to publisher's version: https://doi.org/ . / . .
item record: http://hdl.handle.net/ /
doi: http://dx.doi.org/ . / . .
https://aran.library.nuigalway.ie http://creativecommons.org/licenses/by-nc-nd/ .
/ie/

a landscape archive: methods for interaction design, preservation, access, and mapping - a case study

abstract

the national university of ireland, galway (nui galway) acquired the archive of tim robinson in . robinson is a cartographer and writer who lived, studied, and documented the landscape surrounding galway bay over the course of years. this paper describes the methods used for the digital preservation, access provision, discovery, and digital mapping of this landscape archive. we describe a user interface that allows exploration and discovery of the landscape archive on a digital map, linked to the archive, which allows the user to interact with the material from a perspective of place. the aim is to provide an enhanced user experience and create potential for teaching, research, and community engagement. the archive can also be accessed using more traditional hierarchical archival discovery interfaces and is selectively digitally preserved and accessible as a digital archive.

background to the archive

in , tim and máiréad robinson first visited the aran islands off the west coast of ireland. there, they discovered the islands had not been surveyed in almost a century and their most accurate maps were the six-inch ordnance survey (os). inspired by the landscape and its rich cultural imprint, robinson soon found himself sketching out a rough design for a new map. “i was anxious to get on with the actual mapping as soon as possible, though i had little idea of how to go about it” (robinson, ). in , having published two maps of the aran islands and the burren regions (midwest ireland), the robinsons founded “folding landscapes” as a publishing house for their critically and scholarly acclaimed original maps of the west of ireland. in , folding landscapes won the ford ireland conservation award, and subsequently won the european award in madrid for the project’s “unique combination of culture, heritage and conservation” (robinson, ).
the landscape archive

robinson amassed records of considerable breadth in his research, map-making, and writing work. they amount to over linear meters of shelf space in our archives’ strong room and include a townland index, a comprehensive collection of research maps and boxes of manuscript material that document robinson’s multi-disciplinary research (nui galway, b). the self-contained ‘townland index’ is a set of over , record-sized cards transcribed from robinson’s weather-beaten field notebooks. the cards describe townlands and are arranged by civil parish (nui galway, c). these cards trace the irish (gaeilge) and english language placenames, and their meaning and translation (joyce, ). the cards also record the local features of historical, ecclesiastical, archaeological, linguistic, and geological significance, as well as occasional snatches of place and folklore. in transcribing them to the index, their content was expanded to include contextual information derived from local authorities and published secondary sources. the information contained within the index marks the foundation stone of his mapping project. the cards are also an important resource for researchers interested in placenames, gathering together the official knowledge of the placenames branch of the os with local derivations and pronunciations of the name.

digital work

to stay true to the archive’s multi-disciplinary approach and to keep the mapping project alive, the authors are currently working to digitally preserve and map the archive. we aim to represent robinson’s work digitally to increase access and present the archive in a visually engaging manner. the digital work is divided into phases and this paper describes the first phase of this work, the digital preservation and mapping of the townland index. the phases of work break down as follows:

1. digital preservation, access, and mapping of the townland index cards (nui galway d, e)
2. transcription of the townland index cards
3. digital preservation and access to the archive’s manuscripts, photographs, and cartographical material
4. facilitating connections with other, similar national and international digital initiatives.

throughout all phases of this work we are guided by the principles of digital preservation, interoperability, and open access. we also strive to create innovative and transformative interfaces for an augmented user experience to further discovery, teaching, research, and community engagement. by extracting maps from physical collections and publishing them online in an open manner, we can greatly increase discoverability, usefulness, and research potential (hurley, ; cox, ; burns, ). importantly for landscape studies, using historical maps underpins current and future research (panagos et al., ). it is worth noting that, of the phases of work defined above, the delivery of phase 1 has been committed to by the library and its partners. the delivery and detailed planning of phases 2 to 4 are still under discussion and await funding.

digital scholarship in nui galway

the pace of scholarship-focused digital content creation has grown substantially in nui galway, both in the library and on a wider university level. new partnerships have both demanded and introduced an entirely new scale of digital preservation, digitization, and archive management. innovation and investment in both the management and services of the digital library, as well as in high profile archival collections, have resulted in new partnerships, working practices, and strategies. in , the nui galway library formed a digital library strategy group to improve and secure the future of its digital collections. the library has invested and innovated to create robust and flexible ways of working with digital objects.
we created and published a digital library strategy, implemented preservation and discovery infrastructures, evolved the internal thinking in relation to the digital library, created a multi-functional cross-unit team to run digital library projects, and embarked on interesting and innovative projects (nui galway, a). in terms of archives, our work integrates both traditional (archival finding aids) and new digital discovery methods (faceted search for archives), while also testing the capacity to create visually engaging ways (mapping and deep zoom) to reach new audiences (hurley, ; deal, ).

digital publishing infrastructure

the library has an existing and evolving digital publishing platform created from a suite of open source software. as discussed, the first phase of the work on the landscape archive is digital preservation, access, and mapping of the top-level townland index cards. this work, the main subject of this paper, takes place on the library’s institutional digital platform. key elements of this digital platform are:

• the digital repository using islandora for digital preservation, access, and niche discovery (moses and stapelfeldt, )
• digital exhibition and mapping using omeka and neatline (nowviskie et al., )
• archival management using calm (axiell, )
• wide discovery using primo (exlibris, ).

our digital repository, mentioned above, is implemented using islandora. islandora is an open-source software framework designed to manage digital objects, support long-term storage and preservation, and manage access. it is based on the flexible extensible digital object repository architecture (payette and lagoze, ), commonly known as fedora, a community driven open-source repository system. fedora’s main use is for the long-term preservation of digital content. it enables persistence, fixity, audit, versioning, and import/export (duraspace, ). in addition, other key components used for our digital collections are omeka and neatline.
omeka is used for exhibiting content (hardesty, ) and neatline is used for digital mapping of the archive (nowviskie et al., ). neatline allows for archival metadata and objects to be reimagined, while providing possibilities for deep interpretive or theory-based expression (deal, ; kramer-smyth, nishigaki, and anglade, ). at the same time, neatline acts as a communication intersection point between scholars and managers of digital collections (nowviskie et al., ). neatline is a low-barrier technology that can be used quickly by new scholars, non-technical stakeholders and staff, and technical experts. in our environment, omeka and neatline are managed by the library at an institutional level for the entire university as part of our digital publishing infrastructure. we encourage the use of these tools by undergraduates, researchers, and staff for digital projects.

digital preservation, access, and mapping of the top-level townland index cards

during the first phase of work, our aims and objectives are to preserve and create a visual interface to the townland index. when starting, we first identified potential user interactions with the content and subsequently drafted content flow and user interactions within our digital platform. see “figure . overview of tim robinson digital mapping project” for an overview of the user and digital platform interaction. access is provided via the following access points:

• a hierarchical listing of archival metadata using our archive search interface
• high resolution digital access to the objects along with their metadata using our digital repository
• a map interface where users interact with the digital objects using real polygon townland shapes on openstreetmap (osm).

furthermore, in terms of data flow and system interoperability, we aim for as little content duplication as possible while creating a usable and engaging interface.

figure . overview of tim robinson digital mapping project

methods

the work for phase (digital preservation, access, and mapping of the top-level townland index cards) can be broken into the following sections:

• archival listing and arrangement
• digitization
• digital repository (digital metadata, digital preservation, ingestion, and access)
• mapping (metadata transfer to omeka, creation of neatline exhibition, generation of geo-spatial metadata for each townland, enrichment of metadata with geo-spatial polygons, and adding a geo-referenced map layer to osm with archival maps sourced from the landscape archive).

archival listing and arrangement

as detailed previously, the robinson archive contains a substantial volume of material. the initial appraisal work on the archive identified three distinct sections, and the decision was made to approach archival listing in this order, making each section available once it had been listed. the archive was listed according to the general international standard archival description (isad (g)) and catalogued on the library’s archival management system (calm) (axiell, ). the priority was the townland index, a set of over , record-sized cards, containing information that was transcribed from robinson’s notebooks. in the archive, we have maintained their original order and created descriptive metadata at the level of townland. the second strand of the archive is a comprehensive body of reference maps and charts that fed into folding landscapes’ cartographic output. thirdly, robinson’s manuscript material builds a more complete picture of the evolution of his landscape work. arranged per the geographical areas on galway bay where tim worked (aran islands, the burren and connemara), it gathers together his research work for each area, documents his writing processes, and his involvement in local ecological matters such as the campaign to save roundstone bog, mullaghmore / mullach mór.
all sections of the archive have been described using isad (g) at item level.

digitization

the townland index contains approximately , cards. as a first step, we digitized, in tagged image file format (tiff), the title card of each of the townlands in the index. the title card provides the top-level information about the townland and most townlands have multiple cards associated with them. with thanks to the university’s centre for irish studies, we were the beneficiaries of a successful grant application that funded an intern for two weeks who carried out the initial digitization work for us. at the time of writing, the library has digitized all the index cards and published openly online the title card for each townland. currently, we are organizing the digital objects and metadata to prepare for ingestion to our digital repository. when ingested, all the townland index cards will be available openly online with repository features such as navigation, faceted search, descriptive metadata, and deep zooming.

digital repository

digitization provides the digital object in a format that can be preserved (brown, ; duraspace, ). to complete the preservation and access component of this work, we required metadata for each digital object and a location to securely store both the objects and their related metadata. for digital preservation, we use a double strength approach. specifically, we store objects and associated metadata in both our digital repository (using islandora to store high resolution versions and provide lower resolution access alongside the metadata) and local storage (to store high resolution versions and the metadata). a precursor to digital preservation is the creation of a package that aids preservation; we achieve this by combining the digital object and its metadata in an appropriate format (oais, ).
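as a rough illustration of the packaging idea described above, the following python sketch copies a digitized image and its metadata file into one folder and records sha-256 checksums so that fixity can be re-checked later. it is a minimal sketch under stated assumptions, not the library's actual workflow: the file names (card_0001.tif, card_0001.xml), the manifest.json layout, and the function names are all hypothetical.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_package(image: Path, metadata: Path, target: Path) -> Path:
    """Copy an image and its metadata into target/<image stem>/ and
    write a manifest.json of checksums (hypothetical layout)."""
    package_dir = target / image.stem
    package_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for source in (image, metadata):
        copied = package_dir / source.name
        shutil.copy2(source, copied)
        manifest[source.name] = sha256_of(copied)
    (package_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return package_dir


def verify_package(package_dir: Path) -> bool:
    """Recompute each checksum and compare it with the manifest."""
    manifest = json.loads((package_dir / "manifest.json").read_text())
    return all(
        sha256_of(package_dir / name) == digest
        for name, digest in manifest.items()
    )
```

calling build_package(Path("card_0001.tif"), Path("card_0001.xml"), Path("packages")) would produce packages/card_0001/ holding both files and the manifest; running verify_package on that folder at a later date is one simple way to detect silent changes in either copy.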
our already-created archival metadata was transformed from encoded archival description (ead) (pitti ) to metadata object description schema (mods) (guenther, ). ead was exported from our archival management system (calm) and transformed to mods using an extensible stylesheet language transformation (xslt). this process created one mods xml file for each digital object. the xml and the digital object (in tiff format) form the package for digital repository ingestion and local storage. the digital images, along with their corresponding metadata (in mods format), are uploaded to our digital repository, where they are organized into a structure defined by their civil parish. they can be navigated through that structure or searched. after ingestion, mods metadata is enriched to include descriptive metadata fields as per “table : metadata object description schema fields used”.

metadata field: description
filename: filename of the digital resource
id: unique identifier, matches to physical archival record
title: a word, phrase, character, or group of characters, normally appearing in a resource, that names it or the work contained in it.
type of resource: a term that specifies the characteristics and general type of content of the resource.
name *: the name of a person, organisation, or event (conference, meeting, etc.) associated in some way with the resource.
role *: designates the relationship (role) of the entity recorded in name to the resource described in the record. for example, creator
genre: a term or terms that designate a category characterising a style, form, or content, such as artistic, musical, literary composition, etc.
date created: date item was created
date issued: date digital resource was published
publisher: the name of the entity that published, printed, distributed, released, issued, or produced the resource.
description: abstract, summary of the contents
extent: normally physical number of pages
related item: information that identifies other resources related to the one being described.
subject > topic *: used as the tag for any topical subjects that are not appropriate in the <geographic>, <temporal>, <titleinfo>, <name>, <genre>, <hierarchicalgeographic>, or <occupation> sub elements.
subject > geographic *: used for geographic subject terms that are not appropriate for the <hierarchicalgeographic> element.
subject > temporal *: used for chronological subject terms or temporal coverage.
cartographics: cartographic (maps or charts) data indicating spatial coverage.

table : metadata object description schema fields used. * indicates that these fields can be repeated.

mapping

to present both the geographical range and subject material of the archive in a way that honors its visual qualities, we selected the neatline plugin for omeka. importantly, neatline was an existing component of our institutional digital platform and as such using neatline ensures the sustainability of this digital work. neatline facilitates the creation of a narrative using maps and timelines (nowviskie et al., ; evans and jasnow, ). neatline also allows for spatial data to be added, providing metadata for each townland, as a point or polygon. points or simple polygons for each townland are straightforward to add using the neatline interface but the visual representation of the townland would not be detailed enough, nor would it provide a modern and usable user experience. using a simple polygon to represent each townland visually provides little more than a single point or dot, when ideally the borders of each townland are what are required in the visual display. importantly, neatline allows for complex polygons to be added by using well-known text (wkt), a text markup language for representing vector geometry objects on a map.
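to make the wkt format concrete, the python sketch below serializes a polygon boundary into wkt text. it assumes the boundary arrives as a geojson-style geometry object (a "type" plus nested "coordinates" rings of [longitude, latitude] positions); the function names are illustrative, and this is not the authors' published conversion script, only a minimal example of the same kind of transformation.

```python
from typing import Any, Dict, Sequence


def ring_to_wkt(ring: Sequence[Sequence[float]]) -> str:
    """Format one linear ring as '(x y, x y, ...)'.

    Any third value in a position (e.g. elevation) is ignored.
    """
    return "(" + ", ".join(f"{x} {y}" for x, y, *_ in ring) + ")"


def geometry_to_wkt(geometry: Dict[str, Any]) -> str:
    """Convert a GeoJSON-style Polygon or MultiPolygon geometry to WKT."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Polygon":
        # A polygon is a list of rings: the outer boundary, then any holes.
        return "POLYGON (" + ", ".join(ring_to_wkt(r) for r in coords) + ")"
    if kind == "MultiPolygon":
        # A multipolygon is a list of polygons, each a list of rings.
        polygons = (
            "(" + ", ".join(ring_to_wkt(r) for r in polygon) + ")"
            for polygon in coords
        )
        return "MULTIPOLYGON (" + ", ".join(polygons) + ")"
    raise ValueError(f"unsupported geometry type: {kind}")


# A small triangle standing in for a townland boundary (made-up coordinates).
triangle = {"type": "Polygon", "coordinates": [[[0, 0], [4, 0], [4, 3], [0, 0]]]}
print(geometry_to_wkt(triangle))  # prints: POLYGON ((0 0, 4 0, 4 3, 0 0))
```

a real townland outline follows the same pattern with many more positions per ring, which is why scripting the conversion is preferable to tracing each shape by hand.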
the data points needed to trace the outlines or borders of these townlands are complex shapes with many points required to represent the real-life townland shape. thanks to the open source and crowd sourced communities, the townland shapes were already available on the web site of the irish open street map community in an open source geographic information system (gis) format, geojson. building from this, we first defined a manual process to convert existing townland geojson to wkt. we then scripted the retrieval and conversion of existing geojson to wkt for import to neatline. this script is open source and can be found on github (corrigan, ). this work created an interface, using open street map, which displays each townland visually. users can simply select a townland to view more information. for example, see “figure : tim robinson’s title cards mapped to townlands”. figure : tim robinson’s title cards mapped to townlands in addition to the osm base layer, neatline can also be used with further map layers (functionality that allows us to increase the complexity of the information we can make available thereby increasing access) to enhance both usability and the end-user experience as stated in our original objectives. we created another map layer using two of tim robinson’s maps from the region. these maps are of connemara and the aran islands. the goal is to enable users to visually interface with the townlands using the original robinson maps with all the features. for example, see “figure : tim robinson’s title cards mapped to townlands interfaced with original maps”. figure : tim robinson’s title cards mapped to townlands interfaced with original maps the digital mapping project (nui galway, e) can currently be viewed as a beta version. 
a user can either conduct a search for the townland they are interested in on the exhibition, http://exhibits.library.nuigalway.ie/neatline/fullscreen/tim-robinsons-townland-index-for-connemara-and-the-aran-islands or they can hover over the map. by selecting a townland, they will get the result shown in figure . here you see tim’s map, the townland of ballinahinch on it, the descriptive metadata, and a thumbnail image of the title card relating to that townland from the townland index. two considerable components of the archive are visually represented – the townland index and the maps. next steps our next steps in the project are to publish freely online the complete set of townland index cards. the archive has applications to many areas of research focus at nui galway including the irish language, history, archaeology, botany, marine science, and geology, as well as strong local community engagement in the west of ireland. our intent is to enable transformative uses of the materials with a cartographic base layer that can be built upon and enriched with multi-formatted digital objects. the material in the digital archive has connections with other national digital initiatives such as logainm.ie, the placenames database of ireland, and duchas.ie, the national folklore collection of ireland, and we are open to increasing connectivity with these digital resources as we progress our project. references axiell. . "calm." last modified - - , accessed . http://alm.axiell.com/solutions/technology/calm. brown, adrian. . "practical digital preservation: a how-to guide for organizations of any size." in, - . london : facet pub. . burns, jane a. . "role of the information professional in the development and promotion of digital humanities content for research, teaching, and learning in the modern academic library: an irish case study." http://dx.doi.org/ . / . . . doi: . / . . . corrigan, peter. . "geojson-wkt." accessed march. https://github.com/pcorrigan/geojson- wkt. 
cox, john. . "communicating new library roles to enable digital scholarship: a review article." http://dx.doi.org/ . / . . . doi: . / . . . deal, laura. . "visualizing digital collections." http://dx.doi.org/ . / . . . doi: . duraspace. . "fedora and digital preservation." accessed june. http://fedorarepository.org/fedora-and-digital-preservation. evans, courtney, and ben jasnow. . "mapping homer’s catalogue of ships." literary and linguistic computing ( ): - . doi: . /llc/fqu . exlibris. . "primo discovery and delivery." accessed march. http://www.exlibrisgroup.com/category/primooverview. guenther, rebecca s. . "using the metadata object description schema (mods) for resource description: guidelines and applications." http://dx.doi.org/ . / . doi: . hardesty, juliet. . "exhibiting library collections online: omeka in context." http://dx.doi.org/ . /nlw- - - . doi: nlw- - - .pdf. hurley, joseph a. . "developing a digital map collection for research and teaching: the “planning atlanta: a new city in the making, s– s” collection." http://dx.doi.org/ . / . . . doi: journal of map & geography libraries, vol. , no. - , january-august , pp. – . joyce, patrick weston. . the origin and history of irish names of places. dublin: gill. kramer-smyth, jeanne, morimichi nishigaki, and tim anglade. . "archivesz: visualizing archival collections." retrieved april : . moses, donald, and kirsta stapelfeldt. . "renewing upei’s institutional repository: new features for an islandora-based environment." the code lib journal ( ). nowviskie, bethany, david mcclure, adam soroka, jeremy boggs, and eric rochester. . "geo- temporal interpretation of archival collections with neatline." literary and linguistic computing ( ): - . doi: . /llc/fqt . nui galway, library. a. "digital scholarship - nui galway." nui galway library, accessed june . http://library.nuigalway.ie/digitalscholarship/. nui galway, library. b. "tim robinson collection - archival record." accessed june . 
http://archivesearch.library.nuigalway.ie/nuig/calmview/treebrowse.aspx?src=calmview.catalog&field=refno&key=p . nui galway, library. c. "tim robinson's townland index - archival record." accessed june . http://archivesearch.library.nuigalway.ie/nuig/calmview/treebrowse.aspx?src=calmview.catalog&field=refno&key=p % f . nui galway, library. d. "tim robinson's townland index for connemara and the aran islands. nui galway digital collections." accessed june . https://digital.library.nuigalway.ie/islandora/object/nuigalway:robinson. nui galway, library. e. "tim robinson's townland index for connemara and the aran islands. nui galway exhibits." nui galway library, accessed june . http://exhibits.library.nuigalway.ie/neatline/show/tim-robinsons-townland-index-for-connemara-and-the-aran-islands. oais. . reference model for an open archival information system. https://public.ccsds.org/pubs/ x m .pdf. panagos, panos, arwyn jones, claudio bosco, and p.s.
senthil kumar. . "european digital archive on soil maps (eudasm): preserving important soil data for public free access." international journal of digital earth. sep ( ): . doi: . / . . . payette, sandra, and carl lagoze. . "flexible and extensible digital object and repository architecture (fedora)." pitti, daniel v. . "encoded archival description: an introduction and overview." http://dx.doi.org/ . / . doi: new review of information networking, vol. , no. , , pp. - . robinson, tim. . setting foot on the shores of connemara & other writings: the lilliput press. robinson, tim. . "folding landscapes | roundstone, co galway, ireland." http://www.foldinglandscapes.com/.

white paper report
report id:
application number: hd
project director: edward ayers (eayers@richmond.edu)
institution: university of richmond
reporting period: / / - / /
report due: / /
date submitted: / /

landscapes of the american past: visualizing emancipation
white paper
submitted to the national endowment for the humanities office of digital humanities
scott nesbit and edward l.
ayers     june                   the  spatial  turn  in  the  humanities  has  coincided  with  increasingly  accessible  tools   for  cartography  and  the  powerful  emergence  of  desktop  gis  computing.    landscapes   of  the  american  past:  visualizing  emancipation  is  a  prototype  study  in  the   possibilities  for  creating  richly  interactive,  broadly  accessible  digital  projects  that   both  reflect  current  scholarly  understandings  of  large,  complex  processes  and   provide  tools  to  interrogate  those  understandings.    the  resulting  project,   “visualizing  emancipation,”  depends  upon  innovative  use  of  earlier  digital   scholarship  and  digitized  texts  and  offers  a  compelling  model  for  undergraduate   research  in  the  humanities.         “visualizing  emancipation”  is  the  first  map  of  the  most  dramatic  social   transformation  in  american  history,  the  freedom  of  four  million  slaves  in  the  civil   war.    in  mapping  this  social  transformation,  it  takes  a  new  perspective  on  a   significant  scholarly  question:  where,  when,  and  under  what  conditions  did  slavery   fall  apart?    it  brings  together  three  kinds  of  evidence  to  answer  this  question,   evidence  showing  where  slavery  was  protected  by  the  us  government  and  where  it   was  not  during  the  civil  war;  showing  the  approximate  locations  of  u.s.  troops   during  that  war;  and  showing  “emancipation  events,”  documented  instances  where   the  lives  of  enslaved  men  and  women  were  changing,  sometimes  for  good,   sometimes  for  ill,  during  the  war.    by  exposing  the  evidence  on  which  it  draws,  it   allows  students  and  the  public  to  access  the  sources  to  ask  their  own  questions   about  emancipation  and  find  out  how  their  own  locales  and  ancestors  might  have   experienced  the  end  of  slavery.    
it allows scholars to ask new questions about where, when, and how enslaved men and women escaped bondage, and what their lives may have looked like when they did so.

interpretation

like other maps, “visualizing emancipation” is both a tool for interpretation and an image that makes its own point: the end of slavery did not come about in an instant, with the emancipation proclamation. it began before shooting started and ended long after the last confederate armies surrendered. it followed, in w.e.b. dubois’ words, as a “dark human cloud that clung like remorse” on the rear of the union’s swift-marching columns. [note: w. e. b. dubois, the souls of black folk, ed. henry louis gates, jr. ( ; new york: oxford university press, ), .] as we indicated in “seeing emancipation: scale and freedom in the american south,” the first essay to be published in the award-winning journal of the civil war era, emancipation could be found in the interaction between men and women operating at multiple scales of action. it could be found in the escape of fugitives, in union and confederate armies’ conscription of enslaved men to work on fortifications, and in escaped slaves’ offers to guide u.s. troops through southern wilds.

opportunities for freedom could at times seem randomly distributed, as men and women participated in mass exodus on some plantations while others nearby were left enslaved. yet emancipation proceeded in patterns, not as a chaotic, secular rapture, in which men and women became free without discernible sequence, rationale, or order.
enslaved people, legislators, and armies, in fits and starts, imprinted the end of slavery on the american south.

patterns

enslaved men and women found release from their bonds in waves, rising and falling with the campaign seasons, the fortunes of union arms and the pitiful defenses of contraband camps. unlike the legal extension of freedom, which gathered momentum through acts, proclamations, and amendments, enslaved men and women did not experience emancipation as a process building on past success, pointing toward a future without bondage. as often as liberation was welcomed with exhilaration, men, women, and children also experienced war and freedom as dangerous flight and backbreaking labor, marked often by hunger, violence, and distrust of the liberating army.

[note: edward l. ayers and scott nesbit, “seeing emancipation: scale and freedom in the american south.”]

enslaved men and women were more likely to find freedom in some places than others. freedom and union arms pushed into the confederacy by water and rail. enslaved men and women living along the atlantic seaboard—the coast and sea islands of south carolina, within a day’s walk of the north carolina coast, and along virginia’s chesapeake bay—had the greatest and earliest opportunities to find freedom. enslaved men and women living along the south’s major rivers had a greater chance, too, especially those on the plantations of the mississippi delta, along the tennessee river in northern alabama, and along virginia tidewater’s potomac, rappahannock, and james rivers.
those living or working along the south’s  ,  miles of railroads were also more likely to find freedom. [note: william g. thomas, the iron way: railroads, the civil war, and the making of modern america (new haven: yale university press, ), ] confederate civilians along the line between corinth, mississippi, and decatur, alabama, complained to their government at the close of   that in the past year their enslaved workers “had been carried off in very large numbers, declared free, and refused the liberty of returning to their owners.” [note: civilians, “to the hon. secretary of war of the confederate states of america,” florence, al, january , , official records (hereafter or) i. .ii, - , http://dsl.richmond.edu/emancipation/#event/ .] union officers had “pressed all the negroes in this country” around the nashville-decatur line by the end of  . before he followed the rails through georgia, gen. william t. sherman moved his troops along the jackson-meridian line in mississippi with a train of refugee families extending as far as the column itself, or in sherman’s turn of phrase, “  miles of negroes.” some ran to u.s. lines of their own accord, others were dragged without their assent. once under union protection, men and women found themselves in a legal state of freedom, yet with immediate constraints no less coercive than those they experienced under slavery, as they were put immediately to work cooking, digging, farming, or marching to war.

emancipation was made of much more than the rush of enslaved people to union lines.
visualizing  emancipation  breaks  the  actions  and  experiences  of  enslaved  men   and  women  into  what  we  have  called  emancipation  event  types,  each  carrying  a   pattern  distinct  from  but  related  to  the  others.    we  were  particularly  interested  in   the  experiences  that  marked  the  end  of  slavery.    both  armies  conscripted  men  and   women  into  service,  pulling  them  away  from  their  homes  and  the  forms  of  slavery   they  had  known  before.    people  of  color  took  part  in  irregular  fighting,  raiding   plantations  while  not  enlisted  in  any  military  unit.    they  passed  intelligence  to  the   united  states  army  and  served  as  guides  to  troops.    african  americans  also  suffered   abuse,  were  rushed  away  from  oncoming  union  soldiers  so  that  their  owners  might   protect  their  human  property,  and  were  at  times  re-­‐enslaved  once  they  had  escaped   slaveowners’  control.                                                                                                                        granville  m.  dodge  to  ulysses  s.  grant,  pulaski,  tn,  december   ,   ,  or  i. .iii,   ,  http://dsl.richmond.edu/emancipation/#event/ .      william  t.  sherman  to  h.  w.  halleck,  meridian,  ms,  february   ,   ,  or  i. .ii,   -­‐ ,  http://dsl.richmond.edu/emancipation/#event/ .       security   the  patterns  made  by  a  few  of  these  kinds  of  events  suggest  how  emancipation   begins  to  look  differently  once  mapped  in  time  and  space,  and  broken  apart  by  the   different  experiences  black  southerners  encountered.    our  research  into  the  official   records  of  the  war  of  the  rebellion  shows  that  african  americans  were  victims  of   war-­‐related  abuse  more  frequently  once  black  men  began  fighting  for  the  united   states.    accounts  of  the  abuse  of  men  enlisted  in  the  u.s.  
colored troops, including the atrocities at fort pillow, are well known. attacks against non-uniformed black southerners also rose after  . occasionally this abuse came at the hands of undisciplined u.s. soldiers, such as those commanded by william dwight, who raped the enslaved women they found at new iberia, louisiana, four months after the emancipation proclamation went into effect. [note: william dwight, jr. to richard b. irwin, washington, la, april , , or i. .i, , http://dsl.richmond.edu/emancipation/#event/ .] more often abuse came at the hands of confederates, who killed unarmed men and women at goodrich’s landing, mississippi, on hutchinson’s island, south carolina, at helena, arkansas, and at a large number of other places dispersed throughout the south. violence against african americans composed a greater part of the war effort in the west than in the east. attacking black men and women was a regular part of bushwhackers’ attempts to control missouri, and women and children who worked u.s.-owned plantations along the mississippi were at constant risk of attack by small, marauding units of confederates.

freedom was more secure in the eastern theater of war, particularly in virginia and north carolina, than in the west, but more dangerous to achieve. in virginia, escaping slavery itself was an incredibly dangerous business because of the highly mobile and numerous confederate units operating throughout the state. the likelihood that an enslaved man or woman would be caught while attempting to get to union lines was great, even if they were accompanying a u.s.
unit.    confederate   troops  were  eager  to  attack  smaller  commands  that  had  moved  in  advance  of  the   main  body  of  u.s.  troops.  they  captured  hundreds  of  escaped  slaves  after  halting   brig.  gens.  james  wilson’s  and  august  kautz’s  raid  along  the  danville  railroad  in   june   .         yet  once  behind  union  lines  in  a  refugee  camp,  fugitives  from  slavery  were  relatively   safe.    few  raiding  parties  penetrated  union  lines  to  seize  black  southerners  living   around  fortress  monroe  in  virginia  or  in  new  bern,  north  carolina.    the  tens  of   thousands  of  african  americans  who  left  their  farms  in  the  tidewater  regions  of   virginia  and  north  carolina  were  secure  in  their  freedom  after  the  emancipation   proclamation.    refugee  camps  and  u.s.  owned  plantations  along  the  mississippi   river  did  not  share  the  natural  geographic  advantages  of  the  atlantic  seaboard.     these  farms  and  villages  were  often  lightly  guarded  and  suffered  frequent  raids,   some  of  which  re-­‐enslaved  hundreds  of  men  and  women.                                                                                                                        “reports  from  petersburg,”  richmond  daily  dispatch,  july   ,   ,   http://dsl.richmond.edu/emancipation/#event/ .   scale   the  events  we  gathered,  detailing  where  and  when  men  and  women  became  free,   should  be  viewed  together  at  multiple  scales.    from  the  widest  vantage-­‐point,  we   can  discover  differences  at  the  level  of  the  region,  distinguishing  between  the  likely   experience  of  men  and  women  in  the  east  from  those  in  the  west.    examinations  of   differences  at  the  local  level  require  different  vantage  points  and  data  with  greater   specificity.    
each  emancipation  event  is  encoded  with  a  geographic  precision  level,   which  appears  as  a  halo  around  events.    we  surround  events  about  which  we  lack   great  geographic  precision  with  large  halos,  warning  against  misinterpretation.     events  about  whose  location  we  have  very  specific  knowledge  do  not  receive  these   marks,  and  can  be  used  for  detailed,  local-­‐level  analysis.     for  example,  it  is  clear  that,  from  the  widest  vantage  point,  enslaved  men  and   women  ran  away  in  greater  numbers  when  united  states  army  units  came  near.    in   many  cases,  this  was  because  these  units  visited  southern  farms  and  either  invited   or  forced  enslaved  men  and  women  to  leave  with  them.    our  research  suggests  more   complicated  dynamics  at  work  as  well.    when  u.s.  units  led  by  maj.  gen.  david   hunter  entered  augusta  county,  virginia,  in  june   ,  they  created  new   opportunities  for  enslaved  men  and  women  there.    twenty  enslaved  men  and   women  working  at  the  central  asylum  in  staunton  left  with  the  union  troops.     confederates  stationed  nearby  reported  the  next  day  that  the  “yankees”  were   “capturing  negroes,”  and  were  intent  on  burning  the  railroad  bridge  at  the  cusp  of   the  blue  ridge  mountains.     not  all  those  who  left  their  owners,  however,  went  with  hunter’s  troops.    some  took   advantage  of  the  disruption  created  by  u.s.  forces  in  the  area  to  leave  the  area  for   their  own  purposes.    shortly  after  u.s.  troops  came  through,  a  man  named  jack  left   the  plantation  on  which  he  was  held.    his  owner  guessed  that  the  enslaved  worker   was  headed,  not  to  the  southwest  with  the  union  forces  but  east,  toward  his  family’s   home  in  petersburg.    the  patterns  that  we  see  turn  out  often  to  have  complex   backstories.    
enslaved  men  and  women  used  armies  to  find  freedom  and  each  other.       evidence     the  patterns  that  emerge  from  “visualizing  emancipation”  are  complex,  operating  at   multiple  scales  and  revealing  the  violence  that  attended  freedom  and  the   connections  tying  widely  disparate  actions.    gathering  and  encoding  the  evidence   upon  which  this  project  rests  likewise  required  attention  to  patterns  and  potential   linkages  between  disparate  sources  and  depended  upon  robust  collections  of   digitized  sources  and  the  interpretive  abilities  of  undergraduates,  given  a  controlled   research  environment.                                                                                                                      staunton  republican  vindicator,  july   ,   ,  valley  of  the  shadow;  francis  t.   nichols  to  john  c.  breckinridge,  lynchburg,  virginia,  june   ,   ,  or  i. .i,   -­‐ ,  http://dsl.richmond.edu/emancipation/#event/ .        staunton  republican  vindicator,  july   ,   ,  valley  of  the  shadow.   mapping  the  movement  of  united  states  troops  required  algorithmic  manipulation   of  previously  digitized  texts.    when  we  began  the  project,  we  intended  to  map  the   movements  of  united  states  armies  using  the  official  records  of  the  war  of  the   rebellion  at  the  level  of  the  army  and  army  group.    it  quickly  became  apparent  that   this  task  was  at  once  too  large  and  too  small,  too  large  because  even  acquiring  this   level  of  detail  from  the  collected  reports  was  far  too  ambitious  and  too  small   because  this  level  of  detail  would  not  enable  us  to  capture  the  movements  of  smaller   units  that  moved  throughout  the  american  south.    
when  it  became  clear  that   mapping  the  units  from  the  official  records  was  impracticable,  we  began  looking  for   other  ways  of  finding  the  places  civil  war  armies  moved.    one  source,  frederick   dyer’s  compendium  of  the  war  of  the  rebellion,  contained  this  information,  though  it   was  published  one  hundred  years  ago.    fortunately,  we  discovered  that  dyer’s   compendium  was  among  the  sources  that  researchers  at  the  tufts  university   perseus  digital  library  had  recently  digitized  and  deeply  marked  up  according  to   the  text  encoding  initiative  standards.    perseus  researchers  had  tagged  dyer’s   compendium  with  structured  xml  data,  indicating  the  names  of  regiments,  places,   and  dates  mentioned  in  the  text.         dyer  had  written  his  text  as  a  sequential  list  of  actions  taken  by  union  regiments  in   a  highly  structured  fashion.    because  he  had  structured  the  text  sequentially,  we   were  able  to  develop  relatively  straightforward  algorithms  that  associated  the   places  he  mentioned  with  the  appropriate  dates.    we  then  worked  with  university  of   richmond  undergraduates  to  clean  the  resulting  dataset  of  obvious  errors.    the   result  is  the  most  robust  map  to  date  of  union  army  movements,  a  dataset  including   more  than  forty  thousand  individual  unit  location/date  pairs  (for  more  on  this   dataset,  see  appendix  i).     “visualizing  emancipation”  depends  on  the  generosity  and  excellence  of  an  earlier   generation  of  digital  humanities  projects.    we  would  not  have  been  able  to  build  a   map  of  union  army  movements  in  the  limited  scope  of  a  digital  start-­‐up  grant   without  prior  digitization  efforts  and  experiments  in  automated,  deep  encoding  of   texts  by  the  perseus  digital  library.    
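the sequential pairing of places with dates described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the code the project used: the element names `date` and `placeName`, the `when` attribute, the invented sample regiment, and the rule that each place takes the most recent preceding date are all hypothetical stand-ins for whatever the perseus markup and the project's own scripts actually contained.

```python
import xml.etree.ElementTree as ET

# Invented sample in a TEI-like shape. The tag and attribute names are
# assumptions made for illustration, not the actual Perseus encoding.
SAMPLE = """
<regiment name="1st Example Infantry">
  <p>
    <date when="1862-03-01">March 1, 1862</date> Moved to
    <placeName>Nashville, Tenn.</placeName> and
    <placeName>Pulaski, Tenn.</placeName>
    <date when="1862-04-06">April 6</date> Duty at
    <placeName>Corinth, Miss.</placeName>
  </p>
</regiment>
"""


def pair_places_with_dates(xml_text):
    """Return (unit, place, date) tuples in document order.

    Assumed pairing rule: every place is paired with the most recent
    date that precedes it in the text. Places that appear before any
    date are skipped rather than guessed at.
    """
    root = ET.fromstring(xml_text)
    unit = root.get("name")
    current_date = None
    pairs = []
    # iter() walks the tree depth-first, i.e. in reading order.
    for elem in root.iter():
        if elem.tag == "date":
            current_date = elem.get("when")
        elif elem.tag == "placeName" and current_date is not None:
            place = " ".join((elem.text or "").split())
            pairs.append((unit, place, current_date))
    return pairs


if __name__ == "__main__":
    for row in pair_places_with_dates(SAMPLE):
        print(row)
```

a carry-forward rule of this kind is simple but fragile: a place mentioned before any date is dropped, and a mis-tagged date propagates to every place that follows it, which is one reason a manual cleaning pass over the output remains necessary.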
the emancipation events that form the core of our project’s dataset likewise rely on exemplary, freely accessible archival projects in the digital humanities published within the last two decades, particularly the university of virginia’s valley of the shadow, cornell university library’s making of america, and the university of richmond’s own daily dispatch archive. making use of these sources in order to harvest and encode emancipation events required a variety of methods and enabled us to think purposefully about the role of undergraduates in humanities research.

finding and encoding emancipation events required much more nuance than we could achieve using algorithms alone. it instead required a recursive, careful weighing of evidence and refinement of our hypotheses about what emancipation looked like in the civil war. while we knew that finding evidence of men and women becoming free would be a complicated task, we did not anticipate the difficulty we had in judging who was becoming free and who was not during the war. we quickly decided that we would look for a much more general set of events; we asked students to look for any document in which slavery was changing, or any evidence of african americans acting (outside their normal course of duty as members of the united states colored troops). while giving this broad directive, we asked students to describe what they found. after a few months of describing these emancipation events without a controlled vocabulary, we began refining the ways that we discussed emancipation events, combining some categories with large overlap, eliminating others that seemed too vague.
together  with  our  student-­‐ researchers,  we  decided  on  nine  emancipation  event  types  that  described  much  of   what  we  found  in  the  official  records  and  other  sources.    we  describe  these  event   types  in  appendix  ii.     expanding  research  opportunities     while  we  anticipate  that  the  results  of  this  research  will  be  significant,  we  believe   that  the  model  of  undergraduate  research  we  pursued  brings  just  as  important   ramifications  for  undergraduate  education  in  the  humanities.    humanists  have  often   labored  under  the  assumption  that  undergraduates  are  not  able  to  do  the  kinds  of   careful  work  required  for  effective  research  in  the  humanities.    our  experience  with   this  project  leads  us  to  believe  that,  given  proper  controls  and  guidance,   undergraduates  can  be  effective  researchers  in  large-­‐scale  humanities  projects.         we  made  two  decisions  that  we  believe  were  essential  for  coordinating   undergraduate  researchers.    first,  we  created  opportunities  for  controlled,   interpretive  decisions  that  did  not  rely  on  large  bodies  of  contextual  knowledge.    by   asking  students  to  describe  in  a  few  words  the  actions  they  found  within  the   documents,  we  enabled  them  to  practice  historical  interpretation  on  a  very  modest   scale.    by  recursively  moving  from  the  texts  they  studied  to  their  determinations  of   emancipation  event  types,  they  did  historical  work  manageable  for  many   undergraduate  students.    second,  we  offered  students  assignments  that  could  yield   interpretive  insight  at  multiple  scales.    undergraduates  could  find  patterns  within   their  own  documents  simply  by  examining  a  season  of  the  american  civil  war  in  a   single  place  using  one  source.    
their  contribution  to  the  larger  project  had  its  own   coherence  as  a  research  agenda,  over  which  they  could  rightly  claim  deep   knowledge  and  on  which  they  might  write  their  own  interpretations.         organizing  our  research  as  an  extensible  project,  amenable  to  the  contributions  of   undergraduate  researchers,  has  also  enabled  us  to  open  the  project  beyond  its  initial   creators  at  the  university  of  richmond,  to  the  public  and  undergraduates  involved   in  coursework  at  other  institutions.    azavea,  a  geospatial  development  firm  in   philadelphia,  proved  to  be  an  excellent  partner  in  developing  the  project’s  user   interface.    developers  at  azavea  built  a  crowdsourcing  system  for  “visualizing   emancipation,”  by  which  registered  users  of  the  project  from  anywhere  in  the  world   might  submit  emancipation  events  to  be  approved  by  scholars  at  the  digital   scholarship  lab  and  published  on  our  map.  members  of  the  public  have  begun   contributing  their  own  emancipation  events  to  the  project.    they  have  drawn  on   sources  available  online  and  in  archives  across  the  country  as  they  ensure  that  the   places  they  know  intimately  are  properly  represented  on  a  map  of  the  end  of   slavery.       since  we  believe  that  “visualizing  emancipation”  offers  a  model  for  undergraduate   research,  we  have  encouraged  instructors  at  other  universities,  colleges,  and   advanced  undergraduate  classes  to  organize  research  assignments  around  the  site.     we  look  forward  to  partnering  with  classes  to  upload  emancipation  events  based  on   local,  archival  newspaper  sources  and  those  held  by  the  library  of  congress  as  part   of  its  chronicling  america  newspaper  digitization  project  beginning  in  fall   .     
instructors  teaching  a  wide  range  of  courses,  from  graduate  research  seminars  to   american  history  survey  and  advanced  placement  u.s.  history  courses,  have   expressed  interest  in  contributing  to  “visualizing  emancipation”  in  this  way.     we  wholeheartedly  encourage  efforts  such  as  these  that  combine  face-­‐to-­‐face   classroom  instruction  with  digital  tools  and  materials.    we  have  been  interested  in   such  challenges  for  a  number  of  years,  starting  with  the  history  engine,  which  we   created  at  the  university  of  virginia  in    and  which  is  now  hosted  and  directed   by  the  digital  scholarship  lab.    asynchronous  collaborations  such  as  these   encourage  early  on  the  practices  of  history  that  we  find  most  compelling:  research   in  primary  sources,  the  careful  weighing  of  evidence,  and  the  crafting  of  narratives   based  on  research  in  primary  source  materials.    by  adding  to  ongoing,  large-­‐scale   datasets,  these  collaborations  among  strangers  bring  to  light  new  sources  for  the   public  and  scholars  alike.       “visualizing  emancipation”  is  an  ongoing  research  project-­‐-­‐necessarily  incomplete,   since  it  invites  the  contributions  of  the  public  and  classrooms  across  the  country.     we  have  also  begun  thinking  about  the  ways  in  which  “visualizing  emancipation”   might  be  extended  beyond  public  contributions  to  its  dataset.    extending  the   usefulness  of  databases  and  collaborative  projects  such  as  “visualizing   emancipation”  remains  an  opportunity.         as  the  project  grows,  we  expect  to  add  functionality  in  two  areas.    in  order  to  share   data  more  effectively,  it  is  important  that  we  build  a  tool  that  will  allow  for   download  of  the  latest  version  of  our  data.    
as we build a data download tool, we will also continue to clean our dataset and refine our metadata descriptions, so that our data will be of use to others. these modifications will make use of our strict division of data from the visualizations that rely on those data, giving us the flexibility to adapt our project as visualization technology changes in the future.

extending the usefulness of the project will also involve analyzing the effectiveness of the current user interface. we believe that the simple message to be taken away from “visualizing emancipation”—that the end of slavery occurred not simply through fiat in washington d.c., but through the actions of individuals throughout the american south—is best learned through exploratory interaction with primary sources. in order to make this exploratory environment accessible to teachers and students, we have begun developing lesson plans and learning modules to facilitate use of the project in classrooms at the middle and high school levels. these modules will include video tutorials introducing the project, its interface, and a number of narrative threads, pointing out to visitors some of the patterns in our large and growing database.

“visualizing emancipation” aims to organize the sources for the study of the end of slavery in time and space for a broad audience. the fundamental patterns of emancipation were geographic, as soldiers and slaves moved about the war-torn south. their interactions followed recognizable patterns, along rails and riverbeds, up coastlines and at strategic junctions across the south.
we  provide  a  platform  for   thinking  about  these  patterns  and  for  encouraging  other  scholars,  teachers,  and   students  to  understand  the  end  of  slavery  in  increasingly  sophisticated  ways,   fulfilling  our  ongoing  goal  of  creating  technically  innovative,  engaging,  scholarly   applications  for  the  public  good.                                       appendices           appendix  i:    union  army  regiment  locations     “visualizing  emancipation”  for  the  first  time  plots  the  locations  of  regiments  in  the   united  states  army.    these  locations  should  be  regarded  as  approximations  subject   to  a  number  of  caveats.     our  information  on  the  location  of  u.s.  regiments  comes  from  the  careful  cataloging   of  frederick  h.  dyer,  a  former  drummer  boy  in  the  united  states  army  who  went  on   to  compile  the  compendium  of  the  war  of  the  rebellion  ( ).    the  compendium   supplies  a  nearly  complete  list  of  union  regiments  during  the  civil  war  along  with   detailed  descriptions  of  those  units’  movements  over  the  course  of  the  war.    the   perseus  digital  library  at  tufts  university  digitized  this  text,  creating  approximately    files,  one  for  each  regiment,  encoded  according  to  the  standards  established  by   the  text  encoding  initiative  (tei).    scholars  at  perseus  used  algorithms  to  recognize   the  places  and  dates  mentioned  in  dyer’s  text.       scholars  at  the  digital  scholarship  lab  transformed  these  files  into  a  format  that   mapping  applications,  such  as  google  earth,  can  read.    we  paired  the  places  and   dates  that  perseus  identified  in  the  compendium,  then  went  about  checking  for   errors.         we  are  aware  that  errors,  unfortunately,  remain  in  this  dataset.    these  arise  from  a   few  different  sources.    
frederick  dyer’s  compendium  is  quite  reliable,  yet  even   more  detailed  and  thoroughly  researched  sources  exist  for  tracking  u.s.  civil  war   military  units,  particularly  the  supplement  to  the  official  records  of  the  war  of  the   rebellion.    some  errors  were  introduced  into  dyer’s  text  through  digitization,  and   more  errors  appeared  during  the  process  of  identifying  place-­‐names;  some   historical  places  are  not  listed  in  even  the  best  modern  gazetteers,  while  other   places  remained  ambiguous  to  the  computational  models  because  they  are  shared   by  multiple  locations.  the  digital  scholarship  lab  introduced  further  errors  in   computationally  pairing  dates  and  locations.    while  we  have  caught  hundreds  of   errors,  we  know  that  many  others  still  remain  to  be  corrected.    we  are  currently   looking  for  ways  to  correct  remaining  errors  in  the  armies  dataset.         appendix  ii:  emancipation  event  types     the  end  of  slavery  in  the  united  states  was  a  complex  process  that  occurred   simultaneously  in  courtrooms  and  plantations,  on  battlefields  and  city  streets.    it   involved  a  wide  variety  of  human  interactions,  many  of  which  we  represent  in  this   map  as  emancipation  events.    we  have  identified  ten  distinct  but  interrelated  kinds   of  events:     a.  african  americans  helping  the  union       over  the  course  of  the  civil  war,  african  americans  helped  union  troops  in  a  variety   of  ways.    this  event  type  tags  those  places  where  former  slaves  aided  troops  in   informal  capacities,  usually  outside  their  conscription  as  laborers  on  plantations,  as   soldiers,  or  as  cooks  in  military  camps.    we  have  especially  used  this  tag  to  note   where  people  of  color  gave  information  to  u.s.  
forces or served as guides for troops navigating the southern terrain. they did so throughout the south, unevenly over the course of the war. isaac i. stevens found enslaved men of great help during his navigation of the sea islands. near coosaw island he found cyas, who, he wrote, “subsequently proved of great service from the intimate knowledge he possessed of the country.” (or i. .i, - )

b. abuse of african americans

emancipation caused chaos on the land, and african americans bore the brunt of this disruption. this category indicates places where whites in either the union or the confederacy abused people of color during the war. documents tagged under this event type include incidents of murder, discriminatory pay, beatings, and starvation. perhaps the most infamous of these were the events at fort pillow. brig. gen. m. brayman wrote to his superiors, describing the events there: “fort pillow was taken by storm at p.m. on the th, with six guns. the negroes, about , murdered, after surrendering with their officers. of the white men, have just arrived, and sent to mound city; about are prisoners, and the rest killed. the whole affair was a scene of murder.” (or i. .ii, )

c. orders or regulations

emancipation came about not only through the initiative of enslaved people or the actions of individual soldiers, but through official orders, policies, and regulations. events tagged within this category were policy changes directly affecting the slave regime issued by the union and confederate governments.
among  other  events,  these   include  orders  declaring  enslaved  men  and  women  in  a  territory  free,  orders   requiring  commanders  to  send  enslaved  men  and  women  to  the  quartermaster,  and   confederate  responses  to  emancipation  and  the  enlistment  of  black  troops.  in   louisiana,  for  example,  confederate  authorities  struggled  with  the  best  approach  to   captured  african  american  troops.    while  they  saw  the  benefits  of  taking  a  hard  line   against  black  troops  by  enslaving  them,  they  worried  that  such  a  policy  could   backfire.  the  assistant  adjutant  general  in  confederate  louisiana  in   ,  charles   le  d.  elgee,  proposed  treating  us  colored  troop  soldiers  “with  all  proper  leniency,”   as  prisoners  of  war  in  order  not  to  dissuade  dissatisfied  black  troops  from  deserting   the  enemy.  (or  i. .ii,   -­‐ ).       d.  conscription  and  recruitment,  union     these  events  detail  the  marshaling  of  enslaved  men  and  women  in  the  fight  against   the  confederacy.  included  in  this  category  are  the  drafting  of  contraband  men  and   women  to  work  in  military  camps,  fortifications,  as  soldiers,  or  as  servants  in   various  capacities.  in  some  places,  this  was  a  systematic  effort  to  draw  upon  black   labor  to  the  greatest  possible  degree.  by  july   ,  gen.  nathaniel  banks  reported   from  louisiana  that  “every  negro  within  the  present  lines  of  this  department,  or   within  reach  of  them,  without  distinction  of  age,  sex,  or  condition,  is  in  the  service  of   the  government,  either  in  the  army  or  in  producing  food  for  the  army  and  its   dependents.”  (or  i. .i,   )       e.  
conscription, confederate

the confederacy depended upon slave labor on plantations to provide food and maintain the normal operations of its slave society, and near the front lines in direct service to the government. these events describe the ways that confederates were able to use african american labor for their war effort. the category includes orders and reports of impressment of slaves for use in building fortifications, railroads, and other efforts, while bypassing most mentions of african americans working on privately held farms. confederate conscription began early in the war. in late july, , gen. john b. magruder ordered that half the male slaves and all free men of color in gloucester, middlesex, and mathews counties muster “to finish the works around gloucester point.” magruder promised recompense to the slaveowners: “fifty cents a day and a ration for each negro man during the time he is at work.” (or i. .i, ) magruder sent agents into the county to enforce the order.

f. irregular fighting

this event category documents african americans’ involvement in irregular fighting and appropriation of property that accompanied the civil war, either as willing participants or as victims. within this category we have collected incidents involving african americans taking or destroying property claimed by landowners, enslaved men and women killing white civilians or military personnel, and instances where people of color were the objects of irregular fighting or pillaging.

included among these events are the regrets of maj. gen. samuel r. curtis in a letter to colonel n. p. chipman in helena, arkansas the day after the emancipation proclamation went into effect.
“i am sorry indeed,” curtis wrote, “to hear of the loss of mrs. craig’s house by burning.” curtis wrote of their wealthy mutual acquaintance in a mournful tone. “alas, this is war; although it was the negroes who did it, still, it is the result of war.” (samuel r. curtis to n. p. chipman, st. louis, mo, january , , or i. , - .)

g. capture/enslavement/re-enslavement of african americans by confederates

confederate troops and civilians made concerted efforts to re-enslave african americans who had escaped their control during the war and to enslave free blacks who lived in northern states. this effort included counterattacks and ambushes on smaller union regiments travelling with people of color, raids on contraband camps along the mississippi and atlantic seaboard, and dragnets at the edges of confederate-held territory watching for the escape of african americans from the southern interior.

during confederate general sterling price’s series of attacks in missouri in the autumn of , for example, a confederate scouting party ran into a train of wagons manned by a small number of federal troops. brig. gen. john shelby reported the results. they “captured , caissons, artillery horses with harness, negroes, and prisoners, besides killing and wounding a large portion of the guard.” (or i. .iii, ) confederate attacks on african americans such as this one appear throughout the u.s. south.

h. fugitive slaves/runaways

men and women ran from slavery to union lines before any major battles had been fought.
events  tagged  as  “fugitive  slaves/runaways”  are  instances  where  enslaved   people  ran  away  from  their  owners  or  turned  up  before  union  units  seeking   protection.    many  of  these  events  are  taken  from  newspaper  advertisements  seeking   the  return  of  escaped  slaves.    typical  is  john  werth’s  complaint  to  the  richmond   daily  dispatch,  promising  a  fifty  dollar  reward  “for  the  apprehension  and  delivery  to   me,  in  richmond,  of  jack  oseen,  a  slave,  who  absconded  last  week  from  the   fortifications  in  chesterfield  county.  jack  is  a  black  negro,  about    years  of  age,   slightly  built,  good  teeth,  but  rather  far  apart,  has  a  scar  on  the  right  hand,  and   another  on  the  left  wrist;  was  lately  purchased  from  near  goldsborough,  n.c.”  (“fifty   dollars  reward,”  richmond  daily  dispatch,  april   ,   )     i.  capture  of  african  americans  by  union  troops   if  many  african  americans  eluded  slavery  by  leaving  their  plantations  without   outside  intervention,  others  escaped  through  the  direct  intervention  of  united   states  troops.    in  many  of  these  cases,  military  reports  leave  some  ambiguity  to  the   question  whether  enslaved  men  and  women  had  any  choice  about  leaving  their   property,  neighbors,  and  homes.    we  have  assigned  instances  of  direct  military   intervention  on  plantations  to  this  category,  “capture  of  african  americans  by  union   troops.”  brig.  gen.  grenville  m.  dodge  reported  the  results  of  his  unit’s  expedition  in   northern  alabama  in  just  this  way:  “it  has  rendered  desolate  one  of  the  best   granaries  of  the  south,  preventing  them  from  raising  another  crop  this  year,  and   taking  away  from  them  some   ,  negroes.”  (or  i. .i,   ).       j.  
protecting slave property from union troops

slave owners in the border south and confederate states sought to protect their property in human beings from emancipation in any way they could. for slaveholders in the border south, this often meant pressing soldiers to return the men and women they claimed. in the confederate states, especially after the emancipation proclamation, slave owners transported men, women, and children to places they hoped would be “safe” from union troops and freedom. events of this type document the efforts of slave owners to retain their property. before his assault on atlanta, gen. william t. sherman complained that he was encountering very few african americans in northern georgia, “because their owners have driven them” to the southwest corner of the state. “negroes are as scarce in north georgia as in ohio. all are at and below macon and columbus, ga.” (or i. .ii, )

these event types together capture most of the events we gathered in visualizing emancipation. because these types of events are interrelated, many events are encoded with multiple types.

undergraduate researchers at the university of richmond recorded and coded events from a number of different sources. they searched through letters, diaries, and newspapers, particularly newspapers gathered in the valley of the shadow project and in the richmond daily dispatch. they spent by far the most time on a full canvass of the official records of the war of the rebellion.
in  most  cases,  we   depended  on  the  making  of  america  project  at  cornell  university  for  access  to  these   texts,  though  in  some  cases  we  supplemented  this  version  with  the  version  digitized   and  managed  by  e-­‐history  at  ohio  state  university.         students  searched  through  this  corpus  for  words  commonly  used  during  the  civil   war  to  refer  to  african  american  men  and  women  in  the  south:  contraband,  negro,   black,  colored,  slave.    if  the  document  detailed  the  changing  practice  of  slavery  or  its   dissolution,  students  recorded  it  along  with  a  number  of  pieces  of  information  about   that  event,  particularly  its  date,  location,  and  an  event  type.       we  were  not  always  certain  where  an  event  occurred.    some  events  we  were  sure   occurred  on  a  certain  city  block;  we  had  only  the  vaguest  sense  of  where  others   happened.    because  of  this  uncertainty,  students  recorded  a  precision  level  for  each   event.    we  represent  this  level  of  uncertainty  as  a  halo  around  the  events:  if  the  map   displays  events  at  a  zoom  level  that  implies  greater  certainty  than  is  warranted,  the   event  is  displayed  with  a  halo  that  grows  larger  with  our  uncertainty  about  that   event.         undergraduate  students  also  recorded  the  number  of  african  americans  affected  by   events.    some  events  describe  the  actions  of  only  one  or  two  enslaved  men  or   women;  others  describe  the  activities  of  thousands.    more  often,  the  sources  give   only  the  vaguest  suggestion  of  the  numbers  of  men  and  women  involved:  there  were   “several,”  “many,”  “masses.”    because  these  descriptions  are  so  unreliable,  we  do  not   currently  represent  on  the  map  the  number  of  men  or  women  involved  in  an  event.     
each documented event is represented with a dot of the same size and color.

forming relationships in digital humanities: building blocks for libraries

digital humanities (dh) is a collaborative discipline. libraries engaging in this field must build connections with dh research communities to be successful. our illustration below demonstrates ways in which libraries can build such relationships and points to various useful readings on this topic.

mind your language
language is key in articulating the library’s role. expertise such as metadata can and should be reframed for a dh context. job titles and strategy documents also help position library offerings. read: cox

find your level
tailor your approach to the stage of dh development your library is at. there is no one-size-fits-all approach. look for incremental steps to progress in a sustainable way. read: rluk, lse

know your audience
an environmental scan helps identify your local dh landscape and its focus areas. knowing strengths and gaps can inform library strategy. it can also act as a useful ice-breaker with faculty. read: ecar

be a translator
librarians’ work with digital collections requires understanding both intellectual and technical aspects. this in turn equips them to act as dh ‘translators’ between research and technical partners. read: star & griesemer

be loud, be bold
overcome “library timidity” and broadcast the library’s expertise as an enabler of digital scholarship and a go-to for digital research projects. be specific about offerings and expertise. read: vandegrift & varner

win friends & influence faculty
use any existing links with dh researchers as demonstrator projects or partnerships. this helps build momentum and advocacy, particularly if broader engagement is an early stumbling block. read: gerber et al.

play with structures
different models of structured collaboration, such as dh fellows in the library or ‘embedded’ librarians working on faculty projects, can enable a more formalised approach. read: wilson & berg, locke & mapes ( )

partnership vs service
how is this dh collaboration framed? is the library a partner or a service? local context can inform the answer. in some cases library collaboration itself is listed as a ‘service’ offered for digital projects. read: muñoz, dinsman

help from above
strategic commitment is vital for long-term sustainability. this means adequate resourcing for advocacy, research and skills development, along with acknowledgement of new measures of success. read: posner

strength in numbers
find external library colleagues. working or technical groups (e.g. liber wgs, carpentries) can offer shared expertise and augment teaching. rse offers an ‘alt-ac’ parallel here. read: rse

full references with links can be found here: !!!

this poster came out of a group discussion of the building relationships sub-team of the liber digital humanities working group.

edinburgh research explorer

palimpsest: improving assisted curation of loco-specific literature

citation for published version: alex, b, grover, c, oberlander, j, thomson, t, anderson, m, loxley, j, hinrichs, u & zhou, k , 'palimpsest: improving assisted curation of loco-specific literature', digital scholarship in the humanities, vol. , no. , pp. i -i . https://doi.org/ . /llc/fqw

digital object identifier (doi): . /llc/fqw
document version: peer reviewed version
published in: digital scholarship in the humanities
palimpsest: improving assisted curation of loco-specific literature

beatrice alex†, claire grover†, jon oberlander†, tara thomson•, miranda anderson*, james loxley*, uta hinrichs°, ke zhou‡

† ilcc, school of informatics, university of edinburgh, [balex][grover][jon]@inf.ed.ac.uk
• school of arts and creative industries, edinburgh napier university, t.thomson @napier.ac.uk
° sachi group, school of computer science, university of st andrews, uh @st-andrews.ac.uk
* school of literature, languages and cultures, university of edinburgh, [miranda.anderson][james.loxley]@ed.ac.uk
‡ school of computer science, university of nottingham, zhouke.nlp@gmail.com

abstract

text mining and information visualisation techniques applied to large-scale historical and literary document collections have enabled new types of humanities research. the assumption behind such efforts is often that trends will emerge from the analysis despite errors for individual data points and that noise will be dominated by the signal in the data. however, for some text analysis tasks, the technology is unable to perform as well as domain experts, perhaps because it does not have sufficient world knowledge or metadata available. yet, the advantage of language processing technology is that it can process at scale, even if not perfectly accurately.
geo-locating literary works is one example where human expert knowledge is invaluable when it comes to distinguishing between candidate works. this was the underlying assumption in palimpsest, an interdisciplinary digital humanities research project on mining literary edinburgh. from the outset, the project adopted an assisted curation process whereby the automatic processing of large data collections was combined with manual checking to identify literary works set in edinburgh. in this article, we introduce the assisted curation process and evaluate how the feedback from literary scholars helped to improve the technology, thereby highlighting the importance of placing humanities research at the core of digital humanities projects.

note: both tara thomson and ke zhou were employed at the university of edinburgh when they carried out the work for this project.

. introduction

the following quotation from john wilson’s the recreations of christopher north ( ) illustrates one of the many ways in which edinburgh has been used as a literary setting.

edinburgh castle is a noble rock – so are the salisbury craigs noble craigs – and arthur's seat a noble lion couchant, who, were he to leap down on auld reekie, would break her back-bone and bury her in the cowgate.

edinburgh, the first unesco city of literature, has a rich literary heritage which provides the backdrop for many novels and stories. this paper reports on interdisciplinary work carried out for the palimpsest project, which focussed on text mining literary works set in edinburgh. the project’s aim was to examine the dimensions of literary edinburgh by using text mining to scour accessible historical and fictional literary works in order to uncover those which mention edinburgh or places within it. the term “loco-specific literature” here describes the widespread use of non-fictional place names in literary texts.
this reflects an investment in place by these works, which through the use of a place name provide an anchoring mechanism that both enables and constrains the imagination of the author and the reader. we grounded “loco-specific” passages of text by identifying their latitudes and longitudes, so that both scholars and the public can geographically explore the historical and fictional city via the geo-located passages of text.

palimpsest (http://palimpsest.blogs.edina.ac.uk/) was a collaboration between literary scholars studying the use of place and place names in literature, and computer scientists working on text mining and information visualisation. through a range of maps and visualisations accessible via the litlong.org site, the product of palimpsest, users are now able to explore the associations of place names and the spatial relations of places in the literary city at particular periods in its history, in the works of specific authors, or across periods and authors. the palimpsest data is also accessible via the mobile litlong iphone app in situ while walking through the city.

in this article we present an overview of the project workflow and describe the assisted curation process adopted. this process involved automatic retrieval and ranking of accessible literature according to its loco-specificity, which was followed by the manual selection of ranked documents, resulting in a set of literary works identified as set in edinburgh. we also report on the fine-tuning of the retrieval and ranking prototype based on literary scholar annotators' feedback and explain how we evaluated it using standard retrieval metrics.

. palimpsest

the palimpsest project was an ahrc funded collaboration between: the university of edinburgh’s school of literatures, languages and cultures and its school of informatics; edina; and the university of st andrews’ sachi lab. the idea of the project arose from dr.
miranda anderson’s project called “palimpsest: the literary high street” for which a prototype map interface to literary quotes containing edinburgh place names was developed (http://palimpsest-eng.appspot.com). the data presented in this interface is a small set of around quotations from around works crowd-sourced from anderson’s colleagues. our aim with this project was to scale up this effort by relying on computer-assisted processing for some of the steps involved in collecting the data and geo-locating place name mentions within it. the litlong interfaces, the final outputs of palimpsest, link to more than , locations within edinburgh mentioned in over , literary excerpts from around books. they are aimed at scholarly and non-specialist audiences, including tourists exploring the streets of edinburgh virtually or physically, locals who want to discover how authors described their city years ago and literary scholars who are interested in place and the relations between place and literature.

. palimpsest workflow

figure shows the workflow adopted in palimpsest. the input data was made up of five literary document collections amounting to over , works, most of which are out of copyright, as well as a small set of modern books from authors who are well known for their literature being set in edinburgh (including, for example, irvine welsh, alexander mccall smith and muriel spark). the out-of-copyright collections are varied in content and quality and contain literary fiction and nonfiction genres mostly in english but also in other languages. they include the world public domain subset of the hathitrust data ( , documents), the british library nineteenth century books collection ( , documents), the project gutenberg data ( , documents), a collection of documents from the national library of scotland ( , documents) and the oxford text archive tei text data ( , documents).
unfortunately, our data collection and preparation work was carried out before the eebo-tcp and ecco-tcp data sets were made available for research. in future iterations of palimpsest, these collections should also be considered. in palimpsest, our analysis was limited to english language documents. if the information on the language of a text was not already present in the metadata for the document then we computed it automatically using the textcat language identification tool, which works very reliably even when given just a few sentences of text. in our workflow the input data was first converted to one common format necessary for the document retrieval component, where it was then indexed. edinburgh-specific candidate documents were then retrieved automatically from this index. this component outputs a set of ranked edinburgh-specific candidate documents per collection.

collection and tool websites: hathitrust (https://www.hathitrust.org), british library digital collections (http://labs.bl.uk/digital+collections+-+books+and+text), project gutenberg (https://www.gutenberg.org), national library of scotland (http://www.nls.uk), oxford text archive (http://ota.ox.ac.uk/catalogue/index.html), textcat (http://odur.let.rug.nl/~vannoord/textcat/).

[figure : palimpsest workflow. panel labels: input collections (hathitrust collection, british library nineteenth century books, national library of scotland collection, oxford text archive, project gutenberg, ...); text mining digitised documents; document retrieval & filtering; edinburgh gazetteer (gazetteer of edinburgh place names and their latitude/longitude pairs or shape files derived from several sources); ranked lists of edinburgh-specific candidates; manual curation (curation of edinburgh-specific literature and deduplication); fine-grained location extraction and geo-referencing using the edinburgh geoparser; geo-referenced locations, snippets, meta data; relational database; user interfaces. example ranked list shown in the figure: the journal of sir walter scott (scott, walter); robert louis stevenson (black, margaret moyes); the modern scottish minstrel, volumes i-vi. (various); spare hours (brown, john); the heart of mid-lothian (scott, walter); the works of robert louis stevenson (stevenson, robert l.); rab and his friends and other papers (brown, john); greyfriars bobby (atkinson, eleanor); ...]

the ranked output was loaded into a web-based annotation tool for manual curation. all edinburgh place names occurring in the document along with the snippets of text surrounding the place name mentions were displayed to aid the decision making of the annotators, three literary scholars from the english literature department at the university of edinburgh. the curators considered candidate literary works as set in edinburgh if the city featured as a setting but not necessarily as the primary setting. for example, a book containing a chapter with dense mentions of edinburgh place names was considered as being sufficiently set in edinburgh for inclusion, particularly since in the litlong interfaces excerpts are abstracted away from their source works. while going through the ranked list of documents, the curators decided to stop curating after going through the top % of the ranked documents per collection. aside from project time constraints, this decision was taken because the likelihood of a document being a true edinburgh-specific candidate decreases further down the ranking, so it took longer and longer to find genuine edinburgh-specific works in the ranked list. moreover, the annotators also added any documents which were not already selected as edinburgh-specific candidates as part of the assisted curation process, but which were in the pool of documents crowd-sourced manually for the palimpsest prototype.

the sub-set of works which were identified as edinburgh-specific in this way were then further processed by our text mining pipeline, which geo-references place names by grounding them to their latitude/longitude coordinates using the edinburgh geoparser (grover et al. , alex et al. ). the geoparser is set up to work by default with geonames, a global gazetteer.
we adapted it to work with the fine-grained edinburgh gazetteer, which was aggregated and cleaned during the palimpsest project. the text-mined output (geo-referenced location mentions, snippets etc.) was stored in the palimpsest database, which is accessible via our web-based litlong visualisations (see figure ), an iphone app and via a search api hosted by edina.

figure : the litlong web interface accessible at litlong.org

https://www.ltg.ed.ac.uk/software/geoparser http://www.geonames.org http://litlong.org/navigating-with-litlong/download-our-app/ http://litlong.edina.ac.uk/search/

. document retrieval and ranking

the aim of palimpsest was to be able to discover and make available for exploration a broad spectrum of books, including forgotten gems which are not part of the established canon of edinburgh literature. the main literary research question was: is the characteristic of literary setting, and the detailed ways in which this is narratively established, sufficiently amenable to machine reading to allow us to work automatically across large scale collections of digitised texts? this involved finding ways to define when a book qualifies as being edinburgh-specific, exploring the literary use of place names and their utility as a marker of setting (anderson and loxley, ), and developing a document retrieval and ranking tool to sift potential candidates out of the pool of literature to which we had access. in order to retrieve candidates of edinburgh-specific literature, the literary data was first indexed using indri . and ranked using a set of , edinburgh place name queries. we used the indri inference network language model based ranking approach (strohman et al., ). it combines the language modeling (ponte and croft, ) and inference network (turtle and croft, ) methods to information retrieval.
the resulting model allows structured queries similar to those used in inquery (callan et al., ) to be evaluated using language modeling estimates within the network. indri is a research-oriented information retrieval framework, which supports various effective ranking functions. we chose it over other search engines because its structured query representation allows us to enforce constraints for ranking. the document retrieval and ranking prototype was developed using the hathitrust data only, the largest of our document collections. metadata and ambiguity weightings were taken into account when computing the ranking. the ranking score of a document was computed by combining the score for the location query retrieved from the content of a book with a score based on information in the metadata of the book. for example, the ranking was increased given the presence of a set of favoured subclasses from the library of congress classification system, including pr (english literature), da (description and travel), pz (fiction and juvenile belles lettres), pn (literature (general)) and ps (american literature), or given a list of relevant subject terms (edinburgh, scotland, literature, fiction, novel, poetry, poem, story, stories, drama, novella, english, biography, ballads, ballad, scottish). this was to ensure documents with such metadata information appeared higher in the ranking.

http://sourceforge.net/projects/lemur/files/lemur/indri- . /
they included entries appearing in at least three of five resources used to construct the edinburgh gazetteer (openstreetmap, oslocator, royal commission for ancient historic monuments of scotland, historic scotland, quatroshapes of edinburgh areas).
at the same time, the ranking score was down-weighted for ambiguity of edinburgh place names in order to push documents which mention place names most likely not referring to a location within edinburgh down the list. there are various kinds of ambiguous place names, for instance: common place names which occur in other towns or cities (for example, ‘market street’); place names which are derived from person names or which describe a person (for example, ‘the town guard of edinburgh’) or place names which are also common nouns (for example, ‘trinity’). the weight for a location was determined by means of its frequency in geonames so that more frequently occurring place names are considered less likely to be locations occurring within edinburgh. the output of the document retrieval component is a set of ranked edinburgh-specific candidate documents per collection as depicted in figure . this data was then loaded in order of ranking into the curation tool.

by metadata we refer to traditional library cataloguing record data. in this article we also refer to this information as genre and subject, respectively.
by subject information we refer to marc bibliographic data information, in particular subject access field information stored under code (personal name), (topical term) or (geographic name).
the ambiguity-related weight was computed by dividing by the sum of the frequency of the place name in geonames and .

. assisted curation

the term assisted curation refers to the process of semi-automatically curating a set of edinburgh-specific literature from all accessible literature. this means that the results of the retrieval and ranking process were checked manually by literary curators. in the case of edinburgh, related endeavours to geo-locate literature have relied on the collection of titles, or passages, by a few individuals or via crowd sourcing (e.g. edinburgh reads run by edinburgh libraries or global bookmap ).
the book navigator, a web-based tool and mobile app interface which allows the users to manually geo-locate place name mentions in literary data directly in ebooks (hinze et al., ), could be used for such crowd-sourcing endeavours. as mentioned previously, the idea for palimpsest arose out of an initial prototype which visualises a small set of around extracts manually collected by literary scholars at the university of edinburgh. such an approach results in high-quality data with the disadvantage of missing less well-known but potentially interesting works. in this further iteration of palimpsest we considered the entire pool of literature accessible to us in order to determine a sub-set of highly ranked edinburgh-specific candidates automatically using location-based document retrieval. the aim was to reveal a wider range of edinburgh-specific literature, by uncovering now obscure and neglected literary works, and juxtaposing them alongside the more famous and well-known works. assisted curation by means of text mining alone has shown encouraging results in other domains (e.g. kristjansson et al., and alex et al., ). in palimpsest specifically, we combined text mining and information retrieval for assisted curation and studied how user feedback can improve the technical stages of this process. extending the same model to other collections for identifying edinburgh-specific place names, using the same parameters and setup, is straightforward in terms of running the text processing tools. the main effort would be in converting the data to the same input format required by the information retrieval and text mining pipeline. some decisions made in palimpsest and some resources used as part of this processing are very specific to edinburgh. 
this means that applying the model to a different city of literature would involve more work on the technical side, including the development of a place name gazetteer and some thinking about what constitutes a place name specific to the new location.

http://yourlibrary.edinburgh.gov.uk/fictionmap http://www.mappit.net/bookmap/

. curation tool

the manual annotation of the ranked candidates to select actual edinburgh-specific literature was done using the web-based annotation tool displayed in figure . all ranked documents are displayed on the left-hand panel, listing the title of each work, the author and publication date if available, a link to the original source document and a list of location mentions identified within the book. when clicking on a title and thereby selecting a document, additional information appears in the right-hand panel, including a graph showing occurrences of place names within a document and snippets containing edinburgh place names. based on this information and by following the link to the original source, the annotators were able to mark a work as edinburgh-specific or not, enter further comments and identify the start and end content pages of a document. the latter was useful to avoid identifying edinburgh place names in the paratext of a literary work. when clicking the submit button, a document annotation is saved to the database and disappears from the panel on the left. however, the annotators were also able to access all previous annotations by clicking on the link in the top right corner (“see list of annotated books”). the tool also allows users to search for an author or book title in the list of ranked documents using the search box in the top left corner.
the annotation tool was developed specifically for palimpsest as the curation process involved a specific set of aforementioned requirements (including rating, commenting, linking to original documents, annotated documents disappearing from view, highlighted location entities in context, searching, marking start/end content pages and graph of location occurrences across the document). we were aware of existing tools supporting such features but not one supporting them all.

figure : palimpsest annotation tool.

. annotation scheme

items were annotated using the annotation scheme shown in table . we consider documents annotated as yes or yes (except) as edinburgh-specific within palimpsest. the scheme was developed by the annotators while working on an initial ranking of hathitrust documents.

label          explanation
yes            fiction containing edinburgh place names
yes (except)   narrative non-fiction (incl. letters, memoirs, autobiographies, etc.) containing edinburgh place names
probably not   poetry containing edinburgh place names
maybe          literature containing edinburgh place names but not considered sufficiently place-rich
no             non-literary works containing edinburgh place names or literary works not containing edinburgh place names

table : annotation scheme

we excluded poetry but we annotated it (probably not) to be able to work on it in future.

the decision to include certain non-fictional works within the database, such as letters, memoirs, autobiographies, biographies and travel journals, reflects the widening of the literary canon more generally to include such non-fictional works, as well as the inclusion of such works in the palimpsest prototype, where literary scholars had suggested passages from both fictional and non-fictional works for inclusion. however, this added to the complexity of creating a reliable automated ranking system, as discussed further in later sections.
poetry was reluctantly excluded in this iteration of palimpsest due to the text mining challenges created by its form as well as its textual presentation. no distinction was made between non-literary works containing edinburgh place-names (e.g. gazetteers, directories etc.) and literary works not containing edinburgh place-names. both types were annotated using the “no” label. however, the annotators frequently added comments on the type of work in question which could help to make this distinction. . experiment data we used the world public domain hathitrust collection ( , documents) to develop the retrieval and ranking component. for setting up the prototype, we started with all hathitrust documents with available genre information in the metadata in the form of codes from the library of congress classification system. we found that , ( . %) of the works in that collection had genre information in their metadata records, i.e. library of congress classes and subclasses. applying the document retrieval and ranking prototype to this data yielded , ranked candidate documents containing one or more edinburgh place names. over a period of two weeks, the annotators curated the ranked documents in order. this resulted in , annotated documents, of which were considered edinburgh-specific literature. we considered this to be our gold standard data which we used later to test different document retrieval and ranking component modifications. poetry set in edinburgh was excluded from the palimpsest database accessible via litlong but was annotated during the manual curation as “probably not” for processing in possible future iterations of this work. the text processing stages would need to be tailored to poetry as they are currently developed for running text containing capitalisation and punctuation, conventions which are often not adhered to in poetry. , ( . %) documents had subject information stored in their metadata records. . 
initial feedback initially, the annotators reacted enthusiastically to the annotation and discovered numerous works set in edinburgh of which they had not been aware, such as margaret williamson’s john and betty’s scotch history visit ( ), a history and travel guide in the guise of a fictional story about two school children travelling scotland, and professor john wilson’s noctes ambrosianae, a collection of popular political and editorial columns originally published in blackwood’s edinburgh magazine between and . they were also pleased to discover a large number of travel memoirs from the mid- to late-nineteenth century, most written by americans travelling in scotland. as they worked through the documents, however, they lost trust in the ranking system. they noticed relevant documents appearing far down the list and sometimes had to go through many documents to find a positive example. given the sheer volume of the results, and the amount of time it was taking the annotators to examine each text for relevance, it was apparent that they would be unable to manually curate the full set of results. as such, improving the ranking system would be of paramount importance in curating a strong, relevant collection of works for the final database. the annotators identified two main issues with the ranking system, which had resulted in a substantial number of false hits appearing higher on the list than other more relevant results. the most urgent issue arose from the inclusion of ambiguous place names in the search gazetteer. these were of two varieties: edinburgh location names that also appeared with great frequency in other cities (for example, ‘commercial street’), and place names not specific to particular locations (for example, ‘town hall’). in the first category, there were a great number of texts appearing high in the ranking that were not set in edinburgh, but instead in other british and north american cities, particularly london and boston. 
most of those texts set in north american cities were histories of those regions, which had been given primacy in the ranking due to the high density of shared place names. the annotators observed that most of the edinburgh-specific texts also included reference to the name ‘edinburgh’, or one of its variants, such as ‘edinboro’, ‘edina’, or ‘embro’, among others, whereas it naturally did not feature as often in those texts set in other cities, especially in north america. in the second category, the place names resulting in false hits were largely general, rather than loco-specific, names, including ‘police station’, ‘the square’, ‘main street’, ‘town hall’, ‘medical school’, ‘great hall’, and others. many of the texts ranked high in our results included a high density of these general types of places, making the texts appear to be dense in edinburgh locations. there was also a high instance of names such as ‘trinity’, ‘the loan’, or ‘the murrays’, which name not only places but historical, social, or religious concepts, or even people. the annotators compiled a list of such ambiguous place names to feed back to the natural language processing developers in order to improve the document retrieval and ranking prototype. another problem with the ranking was the frequent occurrence of non-narrative non-fiction works, which the project team had not planned to include in the database, such as regional and family histories, encyclopaedias, dictionaries, catalogues, and county registers. one especially amusing result that appeared high in the ranking was a record of unfashionable crosses in shorthorn cattle pedigrees ( ), by f.p. and o.m healy, which dealt with cattle breeds in ohio that descended from imported british stock. it was apparent that non- fiction, in general, would pose an issue for ranking, as place names appear to be used with much greater frequency in non-fiction writing than in fiction. 
however, since some types of non-fiction were going to be included in the database, such as memoirs and literary correspondence, non-fiction as a general category could not be entirely excluded. in hopes of limiting manual curation of non-fiction works, the annotators observed a series of titular words that always marked a non-relevant text, including (but not limited to) ‘record(s)’, ‘register’, ‘catalogue’, ‘dictionary’, ‘encyclopedia/encyclopaedia’, ‘topography’, and ‘index’. the annotators then fed this list back to the language technology team so that documents with such titles could be removed from the ranking. where codes from the library of congress classification system were available, the annotators also suggested that giving literary categories a higher ranking than non-fiction categories might help bring fiction higher in the ranking, despite its often minimal density of place names.

. improving the ranking

in summary, the annotators recorded a list of ambiguous place names mostly referring to other locations and a list of words in titles suggesting non-literary content. they also observed that most edinburgh-specific documents contain at least one reference to edinburgh or one of its variants. based on this feedback, we then fine-tuned the retrieval component. there is a body of research on using relevance judgments for improving information retrieval, a good summary of which is provided by manning et al. . we tested the initial ranking (baseline) as well as the following three measures and their combination:

1. down-weighting further ambiguous place names identified by the annotators.
2. removing documents containing non-literary titular words identified by the annotators (‘catalogue’, ‘dictionary’, etc.).
3. ensuring that edinburgh or one of its variants (‘auld reekie’, ‘edinboro’, ‘edinbra’, ‘edinburg’, ‘edinbrughe’, ‘edinburrie’, ‘embra’ and ‘embro’) occurs in the work.
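as a rough illustration of the three measures listed above, the following python sketch shows one way such filters and weights could be expressed. it is not the palimpsest implementation, which was built on indri: the function names are hypothetical, the title-word list contains only the examples quoted in this article, and the numerator and offset of the ambiguity weight are required parameters because their values are not given here.

```python
import re

# Title words quoted in this article as markers of non-literary works.
# The annotators' full list was longer ("including but not limited to").
NON_LITERARY_TITLE_WORDS = (
    "record", "records", "register", "catalogue", "dictionary",
    "encyclopedia", "encyclopaedia", "topography", "index",
)

# 'edinburgh' plus the name variants listed under the third measure.
EDINBURGH_VARIANTS = (
    "edinburgh", "auld reekie", "edinboro", "edinbra", "edinburg",
    "edinbrughe", "edinburrie", "embra", "embro",
)


def _whole_word_pattern(terms):
    """Compile a case-insensitive regex matching any term as a whole word."""
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


_TITLE_RE = _whole_word_pattern(NON_LITERARY_TITLE_WORDS)
_EDINBURGH_RE = _whole_word_pattern(EDINBURGH_VARIANTS)


def ambiguity_weight(geonames_frequency, numerator, offset):
    """Weight that shrinks as a place name becomes more frequent in a gazetteer.

    Hypothetical form: numerator / (frequency + offset). Both constants
    must be supplied by the caller; they are assumptions, not reported values.
    """
    denominator = geonames_frequency + offset
    if denominator <= 0:
        raise ValueError("frequency + offset must be positive")
    return numerator / denominator


def has_non_literary_title(title):
    """Measure 2: does the title contain a non-literary marker word?"""
    return _TITLE_RE.search(title) is not None


def mentions_edinburgh(text):
    """Measure 3: does the text mention edinburgh or one of its variants?"""
    return _EDINBURGH_RE.search(text) is not None


def keep_document(title, text):
    """Apply measures 2 and 3 together as a filter on a candidate document."""
    return not has_non_literary_title(title) and mentions_edinburgh(text)
```

whole-word matching is a design choice of this sketch: it keeps ‘index’ from matching ‘indexed’ and treats ‘edinburg’ and ‘edinburgh’ as separate variants, as the list above does.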
the third measure also meant that the query gazetteer was increased from , to , place names in order to include the various name variants of edinburgh.

. results

as document retrieval systems produce ranked output, they are most commonly evaluated by means of the mean average precision (map) metric, which yields a single figure measuring quality across all recall levels (baeza-yates and ribeiro-neto, ; manning et al. ). the set of , annotated hathitrust works was used as an evaluation set to determine the effect of each modification. map scores are computed by comparing the system ranking to the ground truth of ratings of the same data created by the annotators. our baseline, the document retrieval and ranking prototype before the modifications were made, performed at a map score of . when retrieving , documents from the hathitrust collection. figure shows that down-weighting of ambiguous place names (see system ) resulted in a small improvement in the map score. filtering documents with non-literary title words (system ) gave the largest increase in the map score and also led to a sizeable reduction in the number of ranked documents, by . % compared to the baseline system. the condition that edinburgh or one of its variants must appear in the document (see system ) decreased the map score slightly, which is unsurprising since a small number of edinburgh-specific documents do not refer to the city itself; the feedback from the annotators was that in the majority of cases (but not all) the city name is mentioned. however, this measure resulted in a large decrease in the number of ranked documents, reducing the workload of the annotators significantly (by . %). we therefore consider measure to be beneficial as well. when combining all three measures, the retrieval component yielded a small improvement in the map score of . (compared to the baseline map of . ), and the workload of documents to be curated was reduced considerably by . %.
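to make the evaluation metric concrete, the python sketch below computes average precision for one ranked list and the mean over several lists, following the standard textbook definition. it is an illustration only: how the rankings were grouped for the scores reported above is not specified in this article, and the function names and the boolean relevance encoding are assumptions.

```python
from typing import Iterable, Optional, Sequence


def average_precision(ranked_relevance: Sequence[bool],
                      total_relevant: Optional[int] = None) -> float:
    """Average precision of one ranked list.

    ranked_relevance[i] is True if the document at rank i + 1 was judged
    relevant (here: edinburgh-specific). total_relevant is the number of
    relevant documents overall; if omitted, only the relevant documents
    that appear in the list are counted.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank: relevant documents seen so far / rank.
            precision_sum += hits / rank
    denominator = total_relevant if total_relevant is not None else hits
    return precision_sum / denominator if denominator else 0.0


def mean_average_precision(rankings: Iterable[Sequence[bool]]) -> float:
    """Mean of the average precision values over several ranked lists."""
    scores = [average_precision(ranking) for ranking in rankings]
    return sum(scores) / len(scores) if scores else 0.0
```

because precision is accumulated only at the ranks where a relevant document occurs, moving relevant documents towards the top of the list raises the score, which is why the metric suits a curation workflow that proceeds in ranked order.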
figure : performance of the document retrieval baseline and various modifications. we report mean average precision (map) and number of ranked documents (#ranked) per retrieval method.

the improved document retrieval and ranking component (when applying all three measures) was re-applied to the as yet un-annotated hathitrust collection (with/without metadata information) and was also run on the four remaining, out-of-copyright data collections listed in section . . the different ranking outputs (one per collection) were then presented to the annotators in the annotation tool for further manual curation.

. feedback from the curators

the annotators reported the new ranking to be significantly improved after their feedback had been taken into consideration when making the modifications to the prototype. while the annotation process was slowed a great deal, this was due to the increased instances of relevant documents demanding closer scrutiny. after down-weighting ambiguous place names and applying the ‘edinburgh +’ criteria, the annotators found more edinburgh-specific works rising to the top of the list, and fewer instances of works set in other cities. the occurrence of texts set in boston and other american cities almost entirely disappeared from the top % of results, although many texts remained that were set in london. the substantial overlap of place names in london and edinburgh, coupled with cultural connections that lead to mentions of edinburgh in these texts (for instance, minor references to a person from or a trip to edinburgh), would make these results challenging for the automated process to differentiate. however, the annotators noted particular place names that were more often associated with london than edinburgh (such as ‘haymarket’), which became red flags for the annotators, speeding up the curation process slightly.
adding variants of ‘edinburgh’ to the ‘edinburgh +’ criteria yielded positive results that would not have been identified in the early ranking phase, highlighting the value of including historical variants. for instance, william beatty’s novel the secretar ( ) is set in edinburgh and written partly in scots dialect, so ‘edinburgh’ does not appear in the text but ‘embro’ does. the exclusion of documents with non-relevant title words also made the curation process much more manageable, as stated, but non-fiction in general still dominated the higher ranked documents and the ranking system remained unable to make finer genre distinctions, especially where metadata was incomplete. fiction still ranked lower due to the lower density of place names; however, as annotators were not spending as much time sifting through obvious false hits, they were able to find fictional texts as they moved deeper into the ranked documents. this improvement in the ranking system seemed to be somewhat undermined by either limited or incorrect metadata in the results from other collections, in particular the national library of scotland’s digital collection and additional results from hathitrust without relevant metadata information, which led to non-relevant non-fiction (particularly family histories) rising in the ranking. however, these were smaller sets of results so the higher number of false hits was more manageable than with the main batch of results from the hathitrust collection. improvements in the ranking system enabled annotators to discover more relevant documents than they found initially, although it remained apparent that the ranking system would not be able to make reliable distinctions between imaginative descriptions of place and references to real-world locations. a telling example was the appearance of sir james matthew barrie’s novel quality street ( ) fairly high in the ranking. barrie is a scottish author, most famous for being the creator of peter pan. 
the front matter of the book contains a mention of edinburgh in one of the other works he is the author of (an edinburgh eleven), so the text met the ‘edinburgh +’ criteria. other place name mentions within the content of the book included ‘quality street’ and ‘the causeway’, actual locations in edinburgh; however, the novel is not set in edinburgh, but in a fictional small town. within the palimpsest project’s workflow and its scope of resources, works such as this could not be resolved through the improved ranking, only through human curation. this is a clear example of why domain expert knowledge within technology-assisted projects such as palimpsest is essential.

. discussion and conclusion

in the palimpsest project we have explored how to combine computational approaches (document retrieval and ranking, text mining and information visualisation) to facilitate literary research. the technical partners in the project built on their know-how already acquired in the trading consequences project (http://tradingconsequences.blogs.edina.ac.uk), a digital humanities collaboration involving environmental historians as domain experts (hinrichs et al., ). from this past experience, it was clear to the team even at the stage of writing the proposal that the involvement of domain experts, the literary scholars in the case of palimpsest, was fundamental to the success of this interdisciplinary digital humanities project. as a result, the assisted curation process undertaken in palimpsest, which we described and evaluated in this article, was planned right from the project outset. this process attempted to keep the user in the loop during the iterative technical development. we received very useful feedback from the literary scholars on issues that appeared as they curated documents and considered their suggestions in changing the underlying methods for retrieving and ranking edinburgh-specific literature.
our results show that system performance improved slightly and that curation workload was reduced substantially as a result. the improved method was subsequently applied to all document collections, which resulted in mostly positive feedback from the curators reporting that the ranking revealed more relevant documents. while working with the output data, the literary scholars became increasingly familiar with the strengths and limitations of the document retrieval and ranking technology and used this knowledge to their advantage to speed up their work. aside from providing valuable feedback for improving the technology, they also understood quickly that the automatic process was there to assist and not replace them. human curation was particularly vital for cases in which the system had insufficient knowledge or capability to perform a task such as the distinction between fictional and real place names or between different types of genre. palimpsest is therefore a good case study for illustrating the importance of humanities scholarship at the core of digital humanities research. the fact that the system struggled to differentiate between genre for works which did not contain this information in the metadata is not surprising. in such cases, the system has to rely mostly on the presence or absence of location mentions in the text. this signals the importance of the availability of document level metadata information. since our work was completed, ted underwood and his collaborators have developed a method to classify genre of hathitrust documents at the page level using machine learning (underwood, ). using their code, pages can be labelled with . % accuracy as either paratext (front matter, back matter, ads), prose nonfiction, poetry (narrative and lyric), drama (incl. verse drama), or prose fiction. 
this shows that certain types of metadata information can be inferred automatically with relatively high accuracy and that it does not necessarily require laborious manual curation to perform the bulk of such work. if we had had this genre classifier available to us from the outset of palimpsest, the genre distinctions could also have been considered by our document retrieval and ranking system, making its ranked output more reliable. the aim to uncover hidden literary gems set in edinburgh was clearly met by the assisted curation approach taken in the project. the underlying idea of the project was to go from big data (all of the accessible literature) to small data (the edinburgh-specific documents that finally made it into the palimpsest corpus). by starting with big data, we did not, as tim hitchcock put it rightly, want to ‘get away with dirty data’ (hitchcock, ). the combination of automatic and manual processing meant that we were able to identify a wide range of literary works set in edinburgh whilst at the same time ensuring that all documents visualised by the litlong tools contain edinburgh place name mentions. in future iterations of palimpsest, we would like to include additional collections which have become accessible more recently and would also like to adapt the language processing tools to process edinburgh-specific poetry already annotated during the manual curation phase. funding this work was supported by the arts and humanities research council [ahrc ah/l / ] via the digital transformations in the arts and humanities - big data programme. acknowledgements we would like to thank dr. harris-birtill at the university of st andrews who helped with the data processing for the annotation tool. 
we also thank the data providers (hathitrust, the british library labs, project gutenberg, the oxford text archive and the national library of scotland) as well as the authors and the publishers of a selected set of contemporary authors who have given us access to the texts for the purposes of palimpsest. references alex b., byrne k., grover c. and tobin r. ( ). adapting the edinburgh geoparser to historical georeferencing. international journal for humanities and arts computing, ( ), march . alex, b., grover, c., haddow, b., kabadjov, m., klein, e., matthews, m., roebuck, s., tobin, r. and wang, x. ( ). assisted curation: does text mining really help? in: biocomputing . in proceedings of the pacific symposium on biocomputing, pp. - . anderson, m. and loxley, j. ( ). the digital poetics of place names. literary mapping in the digital age. cooper, d., donaldson, c. and murrieta-flores, p. (eds.), routledge, pp. - . baeza-yates, r. and ribeiro-neto, b. ( ). modern information retrieval. boston: addison-wesley longman. barrie, j.m. ( ). quality street. new york: c. scribner's sons. beatty, w. ( ). the secretar. paisley & london: alexander gardner. callan, j.p., croft, w.b. and harding, s.m. ( ). the inquery retrieval system. in: proceedings of the rd international conference on database and expert systems applications, , pp. - . grover, c., tobin, r., byrne, k., woollard, m., reid, j., dunn, s. and ball, j. ( ). use of the edinburgh geoparser for georeferencing digitised historical collections. philosophical transactions of the royal society a, ( ), pp. - . healy, f.p. and healy, o.m. ( ). a record of unfashionable crosses in shorthorn cattle pedigrees. bedford, iowa: the authors. hinrichs, u., alex, b., clifford, j., watson, a., quigley, a., klein, e. and coates, c.m. ( ). trading consequences: a case study of combining text mining and visualization to facilitate document exploration, to appear in dhs, special issue of dh . hinze, a. m., littlewood, h. 
and bainbridge, d. ( ). mobile annotation of geo-locations in digital books. in: proceedings of the international conference on theory and practice of digital libraries. poznań, poland: springer. hitchcock, t. ( ). big data, small data and meaning. historyonics blog post, / / based on his keynote talk at the british library labs symposium on / / . kristjansson, t.t., culotta, a., viola, p. and mccallum, a. ( ). interactive information extraction with constrained conditional random fields. in: proceedings of aaai, pp. - . manning, c.d., raghavan, p. and schütze, h. ( ). introduction to information retrieval. new york: cambridge university press. ponte, j.m. and croft, w.b. ( ). a language modeling approach to information retrieval. in: proceedings of sigir, pp. - . strohman, t., metzler, d., turtle, h. and croft, w.b. ( ). indri: a language-model based search engine for complex queries (extended version), ciir technical report. turtle, h. and croft, w.b. ( ). evaluation of an inference network-based retrieval model. acm transactions on information systems, ( ), pp. - . underwood, t. ( ). understanding genre in a collection of a million volumes. interim performance report for the digital humanities start-up grant [hd ], university of illinois, urbana-champaign. williamson, m. ( ). john and betty’s scotch history visit. boston: lothrop, lee & shepard co. wilson, j. ( ). noctes ambrosianae. ferrier, j.f. (ed.), edinburgh: william blackwood & sons. wilson, j. ( ). the recreations of christopher north. boston: phillips, sampson, and company.

libraries as learning organisations: implications for knowledge management
priti jain and stephen mutula
library hi tech news number , pp. - , © emerald group publishing limited, - , doi . /

introduction
the concept of a learning organisation is relevant to all twenty-first century organisations because of increasing complexity, uncertainty and change (malhotra, ).
libraries can benefit significantly as learning organisations through reducing complacency; continuous learning, improvement and innovation (michael and higgins, ); being better equipped to deal with independent and distance learning (brophy, ); serving as a source of competition (fowler, ); promoting inquiry and dialogue; encouraging collaboration and team learning; establishing systems to capture and share learning; empowering people toward a collective vision; and connecting the organisation to its environment (watkins and marsick, ). the term “learning organisation” is defined in many ways. sutherland ( ) defines it as “an organisation in which people at all levels, individually and collectively, are continually increasing their capacity to produce results they really care about”. senge ( ) defines learning organisations as “organisations where people continually expand their capacity to create the results they truly desire, where new and expansive patterns of thinking are nurtured”. skyrme ( ), writing from a knowledge management (km) perspective, defines learning organisations as “organisations that have in place systems, mechanisms and processes, that are used to continually enhance their capabilities and those who work with it or for it, to achieve sustainable objectives – for themselves and the communities in which they participate”. giesecke and mcneil ( ) were more succinct, bringing into the definition the concept of knowledge, which the other definitions left implicit. they defined a learning organisation as “an organisation skilled at creating, acquiring and transferring knowledge and at modifying its behaviour to reflect new knowledge and insights”. sutherland’s and senge’s definitions may be perceived as leaning towards continuous learning, whereas skyrme stressed infrastructure, and giesecke and mcneil stressed km, as the foundation of a learning organisation.
given the implicit and explicit aspects of km in the definitions of a learning organisation, knowledge capital becomes imperative in the competitive operations of organisations. it is therefore prudent to suggest that a learning organisation is one that applies the principles of km in harnessing its human capital. consequently, the characteristics of a learning organisation may be perceived as similar to those of a knowledge-intensive organisation. several authors (skyrme, ; michael and higgins, ; sudharatna and li, ; brandt, ) have attempted to characterise a learning organisation as one that supports life-long learning and where all the employees exchange information and ideas freely. openness, learning from mistakes, trust and imagination are tenets of such a learning organisation. key management processes in such an organisation include strategic planning; participatory management; employee empowerment; competitor analysis; performance measures; and reward and recognition systems, with continuous updating of basic processes. tools and techniques also characterise learning organisations. these tools and techniques are learning and creativity skills to support individual and group learning, problem solving, interviewing, brainstorming, organising information, implanting new knowledge into mental models etc. (skyrme, ; michael and higgins, ; sudharatna and li, ; brandt, ). senge ( ) adds that learning organisations are characterised by systems thinking or holistic approaches. systems thinking integrates disciplines and compares them to appreciate their relationships. systems thinking, or the holistic perspective, sees an organisation as a whole, so that the impact of actions is seen on all parts of the system (skyrme, ). senge ( ) also considers personal mastery as a characteristic of a learning organisation. personal mastery refers to self-assessment and learning premised on the fact that organisational learning is dependent on its staff members’ learning efforts.
personal mastery has two components: one must define what one is trying to achieve (a goal), and one must have a true measure of how close one is to that goal. as a consequence, those who equip themselves with a high level of personal mastery continue to learn. learning organisations are also characterised by mental models, the ability to compare reality or personal vision with perceptions and to reconcile both into a coherent understanding. moreover, mental models imply a shared vision of a mutually desirable future and an incentive infrastructure, which are essential for encouraging adaptive and expected behaviour (brandt, ; senge, ). larsen ( ) points out that team learning also characterises a learning organisation. team learning has personal and career benefits because each member of the team draws talent, experience and knowledge from a variety of other people. in addition, all members of the team work for common goals, and teamwork gives greater insight into individual differences, enabling individuals to learn to work together cohesively.

learning organisation and km
in a km environment, the internet links people-to-people, people-to-business, people-to-information and people-to-culture where there is intensive creation, sharing and use of knowledge (american library association, ). learning organisations are engaged as part of their business in knowledge creation. consequently, from the perspective of information and km, it behooves librarians to increasingly apply information management approaches, because information has to be placed in a specific context to make it valuable to the user. moreover, with the knowledge economy expanding as the use of the internet increases in the business environment, effective km in learning organisations takes centre-stage.
ardern ( ) notes that km will become a major factor as businesses move to be more competitive, continue to downsize and try to determine how best to capture the knowledge of their staff. swan et al. ( ) established “a clear relationship” between a learning organisation and km by defining km as “... any process or practice of creating, acquiring, capturing, sharing and using knowledge, wherever it resides, to enhance learning and performance in organisations”. in table i, we suggest that learning organisations and knowledge-intensive organisations share similar systems and infrastructure, while bearing in mind that km is a field of study whereas a learning organisation is an entity or enterprise. learning organisation and km are inextricably intertwined. chase ( ) observes that km is about enhancing the use of organisational knowledge through sound practices and organisational learning. both km and organisational learning involve one or more of the following: capacity building through project teams, assigning staff responsibilities where their talents can be optimised, creating knowledge databases, institutional repositories, mentoring, etc. grey ( ) says km concerns critical thinking, innovation, intelligence, learning, competencies and sharing of experiences. white ( ) perceives km as a process of creating, storing, sharing and re-using know-how to enable an organisation to achieve its goals and objectives. the organization for economic co-operation and development ( ) observes that km is the collection of organisational practices related to generating and disseminating know-how, and to promoting knowledge sharing within an organisation and with the outside world.

challenges and opportunities for libraries as learning organisations
in the digital environment in which most libraries now find themselves, education, especially in the university environment, is rapidly changing, with academics increasingly adopting digital scholarship.
digital scholarship may include submission of articles, peer review and publication all done electronically; teaching by purely electronic or blended means; evaluation and assessment of academic work electronically; collaborative research by electronic means; and electronic communications. digital scholarship processes are supported by a variety of content in the form of e-journals, e-books, institutional repositories, databases and digital libraries. digital scholarship also enables the integration of various media such as text, graphics, animations, video and audio in teaching and research processes. digital scholarship has been made possible by the rapid development of emerging technologies such as web . . o’reilly ( ) refers to web . as “the network platform, spanning all connected devices and applications that make the most of the intrinsic advantages of that platform, delivering software as a continually-updated service that gets better the more people use it; consuming and remixing data from multiple sources, including individual users, while providing their own data and services in a form that allows remixing by others, creating network effects through an ‘architecture of participation’, and going beyond web . to deliver rich user experiences”. examples of web . applications include digital commons, blogs, social networking sites, wikis, etc. the other emerging technology that is influencing digital scholarship environments, especially with regard to information and km, is the digital library. digital libraries make information more available, raise its quality, and increase its diversity. they offer great user satisfaction; offer several ways in which libraries can improve services while reducing cost; provide instantaneous access to online information; offer / access to information so long as requisite infrastructure is in place; and overcome the problem of deterioration over time associated with physical media.
digital libraries demand that information professionals and their institutions provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities (digital library federation, ).

table i. commonalities of learning organisation and km-intensive organisation

learning organisation                            knowledge-intensive organisation
provides a learning culture                      provides a learning organisation environment
has shared vision and culture of sharing         has strong culture of sharing and creativity
provides team learning                           provides team work
key management processes present                 km process present
availability of learning tools and techniques    availability of km technologies and tools
applies systems thinking or holistic approach    a central knowledge repository
personal mastery                                 library expertise
mental models                                    knowledge mapping
learning organisation strategy                   km strategic plan
continuous learning                              regular update of knowledge
incentive infrastructure                         motivation
open system                                      partnerships

digital scholarship in the university environment is also being influenced by the use of open source software, especially with regard to providing e-learning platforms and also as infrastructure for developing institutional repositories. the major benefits of open source software include: reduced costs and less dependency on imported technology and skills; affordable software for individuals; low-cost licensing implications; and the ability to customise the software to local languages. moreover, in a digital scholarship environment, scientific journals that were a few years ago produced largely in print form are now rolled out first as e-versions before the print versions can appear.
libraries are also transforming their print collections into electronic formats through digitization or subscription to e-journals with or without print alternatives as a strategy to make them more accessible and to enhance resource sharing (youngman, ). with the transformation of what was largely a print environment into major digital collections, several issues arise that must be addressed such as integrity of the scholarly research process, intellectual property rights, privacy, security, etc. libraries provide a critical role as institutional repositories with an open access infrastructure which promote e-research, interdisciplinary work and cross-institution collaborations. stueart ( ) observes that the role of librarians has changed in parallel with changes in technology. librarians are increasingly engaged in perfecting tools and procedures to enhance access by creating portals, gateways and hypertext links to scholarly resources. similarly, sutherland ( ) suggests that with swift changes in technology, increased customer expectations, service competition, changing organisational values, interdisciplinary studies, e-learning and the demand for digital resources, it is critical that libraries become learning organisations. association of college & research libraries ( ) observes that the transitions in the production, dissemination and retrieval of information provide ample opportunities for academic libraries to lead their institutions in pursuing new modes of academic research and productivity. the changing paradigms of knowledge production, expanding sources and modes of dissemination and faster and broader accessibility to a growing range of information all offer entrepreneurial opportunities to academic librarians. a council on library and information resources ( ) study explains that the problem of managing and preserving knowledge in these shifting realms of digital proliferation is enormous, and librarians need to be an integral part of the solution. 
a texadata ( ) study found that libraries do not have a monopoly on publicly available information and library users do not believe that libraries provide unique information. miller and hart ( ) postulate that in the future, “there will be much less focus on providing a learning environment, instead the library will be ‘an information source’, accessible from essentially everywhere”. bennett ( ) notes that today’s academic library design should no longer be dominated by information resources and their delivery, but should “incorporate a deeper understanding of the independent, active learning behaviour of students and the teaching strategies of faculty meant to support those behaviours”. it is also important for academic libraries to redefine and expand their clientele by repositioning themselves to serve the entire world through the use of technology. strong online services can raise the profile of the institution, its scholars and its collections. institutional repositories, consisting of the scholarly work of an institution and its special collections, can increase access to scholarship and archive collections. e-learning is another opportunity for libraries in teaching the use of a variety of information and communication technologies to facilitate student-oriented, active, open and life-long learning skills (university of botswana, ). lewis ( ) presents what he calls a model of five strategic pieces “for maintaining the library as a vibrant enterprise worthy of support from our campuses”. the five strategic pieces for libraries include: complete the migration from print to electronic collections and capture the efficiencies made possible by this change; retire legacy print collections in a way that efficiently provides for its long-term preservation and makes access to this material available when required ... this will free space that can be repurposed; redevelop the library as the primary informal learning space on the campus – in the process partnerships with other campus units that support research, teaching and learning should be developed; reposition library and information tools, resources and expertise so it is embedded into the teaching, learning and research enterprises ... this includes both human and, increasingly, computer-mediated systems ... emphasis should be placed on external, not library-centered, structures and systems; and migrate the focus of collections from purchasing materials to curating content (lewis, ). several sources (senge, ; brandt, ; association of college & research libraries, ) note the skills and competencies that should characterise professionals working in a learning organisation, including academic libraries. these skills include: team skills, public relations and communication skills, ability to think in terms of the enterprise (strategically), creative thinking, use of new technology and information tools effectively, ability to train and educate the client effectively, customer oriented, intellectual property attorney, publishing consultants, content managers and the capability of working effectively in partnership with faculty members and other stakeholders. transforming into learning organisations is a process. giesecke and mcneil ( ) suggest a number of strategies to change culture, vision and objectives to become a learning organisation. these strategies include: commitment to change; connecting learning with the organisation’s operations; assessing organisational capacity; communicating the vision; modelling a commitment to learning; cutting bureaucracy and streamlining structures; capturing learning and sharing knowledge; rewarding learning; learning more about learning organisations; and continuously adapting and improving learning.
they further point out that a commitment to change is driven from the top to bottom with all library staff members expected to reframe their thinking with a positive attitude, having a clear vision and adapting to change whenever it is necessary. giesecke and mcneil remind us that learning organisations flourish only when knowledge is shared so that staff can benefit both from individual and team learning. the motivations for the transformation of academic libraries into learning organisations are many, especially in the context of km in business environments. the organization for economic co-operation and development ( ) attests to the importance of km as a way to enhance productivity and efficiency. furthermore, as organisations make attempts to increase flexibility and mobility, they create new opportunities that demand integration of knowledge from the outside. moreover, concerns for promoting life-long learning, the sharing of knowledge by different units in organisations, the realization that knowledge-enhanced organisations experience rapid creation of new knowledge, and the improvement of access to knowledge bases are factors that increase efficiency, innovation, quality of goods and services and equity. wimmer ( ) points out that knowledge-intensive environments are at an advantage where knowledge is created, shared, learned, enhanced, organised and utilised for the benefit of the organisation and its customers. such knowledge should therefore be effectively managed to give libraries a competitive advantage.

conclusion
today, the library has a central place and an integral role in supporting higher education.
as higher education evolves, the library and librarians have to evolve in equal measure and develop digital collections tailored to the information age, investigate options to provide access to digital collections, develop custom portals that provide specialized searching options for high-value collections, such as dissertations and special collections, and expose library digital collections to the world via institutional repositories, search engines and portals. libraries must evolve into learning organisations and the challenge is for leadership and staff to recast their identities in relation to the changing modes of knowledge creation and dissemination, and in relation to the academic communities they serve. to become a learning organisation, libraries should create the climate for change and innovation. libraries should create learning environments by working collaboratively with other disciplines, particularly educators and community developers, and be better equipped to cope with independent learning. libraries must also empower their employees to be flexible in order to take advantage of new and interchangeable roles as facilitators, mentors, coaches and stewards. finally, libraries need to promote a culture of knowledge-sharing, collective learning and collaboration.

references
(the) american library association ( ), “principles of the networked world”, available at: www.ala.org/ala/washoff/washpubs/principles.pdf (accessed march ). ardern, c. ( ), records and information management: a paradigm shift?, arma international, toronto. association of college & research libraries ( ), “changing roles of academic and research libraries”, available at: www.ala.org/ala/acrl/acrlissues/future/changingroles.cfm (accessed may ). bennett, s. ( ), libraries designed for learning, clir, washington, available at: www.clir.org/pubs/reports/pub /pub web.pdf (accessed may ). brandt, r.
( ), ‘‘on using knowledge: a conversation with bob sylwester’’, educational leadership, vol. no. , pp. - . brandt, r. ( ), ‘‘is this school a learning organization? ways to tell’’, journal of staff development, vol. no. , winter, available at: www.nsdc.org/library/ publications/jsd/brandt .cfm (accessed april ). brophy, p. ( ), the academic library, nd ed., facet publishing, london. chase, r.l. ( ), ‘‘knowledge navigators’’, available at: www.sla.org/pubs/ serial/io/ /sep /chase .html (accessed may ). clir ( ), ‘‘clir workshop on future of academic libraries’’, available at: http://sentra.ischool.utexas.edu/~adillon/blog/ archives/ (accessed march ). digital library federation ( ), ‘‘a working definition of digital library’’, available at: www.diglib.org/about/ dldefinition.htm (accessed july ). fowler, r.k. ( ), ‘‘the university library as learning organization for innovation: an exploratory study of college & research libraries’’, available at: www.ala.org/ala/ acrl/acrlpubs/crljournal/backissues b/ may /fowler.pdf (accessed april ). giesecke, j. and mcneil, b. ( ), ‘‘transitioning to the learning organization’’, library trends, vol. no. , pp. - . grey, d. ( ), ‘‘knowledge management and information management: the differences’’, available at: www.smith weaversmith.com/km-im.htm (accessed november ). larsen, k. ( ), ‘‘learning organizations’’, available at: http://home. nycap.rr.com/klarsen/learnorg/ (accessed may ). lewis, d.w. ( ), ‘‘exploring models for academic libraries’’, available at: http:// acrlog.org/ / / /exploring-models-for- academic-libraries/ (accessed april ). malhotra, y. ( ), organizational learning and learning organizations: an overview, available at: www.brint. com/papers/orglrng.htm (accessed april ). michael, t.s.c. and higgins, s.e. ( ), ‘‘nanyang technological university (ntu) library as a learning organisation’’, libri, vol. , pp. - . miller, r. and hart, d. ( ), ‘‘the future of academic libraries’’, university times, vol. 
, available at: http://mac .umc.pitt.edu/u/fmpro?-db=ustory&-lay=a&-format= d.html&storyid= &-find (accessed march ). o’reilly, t. ( ), “web . : compact definition”, available at: http://radar.oreilly.com/archives/ / /web_ _compact_definition.html (accessed september ). (the) organization for economic co-operation and development ( ), “the learning government: introduction and draft results of a survey of the knowledge management practices in ministry/departments/agencies of central government”, th session of the public management committee, chateau de la muette, paris, - april , available at: www.oecd.org/dataoecd/ / / .doc (accessed november ). senge, p.m. ( ), the fifth discipline: the art and practice of the learning organization, random house, london. skyrme, d. ( ), “the learning organization”, available at: www.skyrme.com/insights/ lrnorg.htm (accessed may ). stueart, r. ( ), “digital libraries: the future of scholarly communication”, paper presented at ub library auditorium, the university of botswana, gaborone, - august . sudharatna, y. and li, l. ( ), “learning organization characteristics contributed to its readiness-to-change: a study of the thai mobile phone service industry”, available at: www.fm-kp.si/zalozba/issn/ - / _ - .pdf (accessed may ). sutherland, s. ( ), “the public library as a learning organisation”, available at: www.ifla.org/iv/ifla /papers/ e-sutherland.pdf (accessed may ). swan, j., scarborough, h. and preston, j. ( ), “knowledge management – the next fad to forget people?”, proceedings of the th european conference on information systems, copenhagen, in loermans, j. ( ), journal of knowledge management, vol. no. , pp. - . texadata ( ), “white paper on the future of academic libraries”, available at: www.texadata.com/ / /whitepaper-on-future-of-academic.html (accessed march ). university of botswana ( ), university of botswana elearning, university of botswana, gaborone.
watkins, k.e. and marsick, v.j. ( ), sculpting the learning organisation: lessons in the art and science of systemic change, jossey-bass, san francisco, ca. white, t. ( ), “knowledge management in an academic library”, available at: http://eprints.ouls.ox.ac.uk/archive/ / / tatiana_white_km_article.pdf (accessed june ). wimmer, m.a. (ed.) ( ), “knowledge management in e-government”, proceedings of the rd international workshop (kmgov- ), copenhagen (dk), - may, schriftenreihe informatik # , trauner verlag, linz. youngman, f. ( ), “opening speech”, presented at digital scholarship conference, library auditorium, university of botswana, gaborone, / december .

further reading
bundy, a. ( ), “beyond information: the academic library as educational change agent”, paper presented at the th international bielefeld conference, bielefeld, germany, - february .

priti jain (jain@mopipi.up.bw) university of botswana, private bag , gaborone, botswana. stephen mutula (mutulasm@mopipi.up.bw) department of library and information science, university of zululand, kwadlangezwa, kwa-zulu natal, south africa.

stylometry and collaborative authorship: eddy, lovecraft, and ‘the loved dead’
alexander a. g. gladwin, matthew j. lavin and daniel m. look
st. lawrence university, canton, ny, usa

abstract
the authorship of the short story ‘the loved dead’ has been contested by family members of clifford martin eddy, jr. and sunand tryambak joshi, a leading scholar on howard phillips lovecraft.
the authors of this article use stylometric methods to provide evidence for a claim about the authorship of the story and to analyze the nature of eddy’s collaboration with lovecraft. further, we extend rybicki, hoover, and kestemont’s (collaborative authorship: conrad, ford, and rolling delta. literary and linguistic computing, ; , – ) analysis of stylometry as it relates to collaborations in order to reveal the necessary considerations for employing a stylometric approach to authorial collaboration.

introduction
when ‘the loved dead’ was published in the may-june-july issue of weird tales, credited to clifford martin eddy, jr., or c. m. eddy, jr., controversy followed. the issue was banned at least in indiana, if not nationwide (joshi, , p. ); the magazine’s editor, farnsworth wright, would remain wary of stories containing socially contentious subject matter for years. ‘the loved dead’ is a first-person narrative of a necrophiliac who is on the run from authorities as he explains the roots of his predilection for corpses. the material proved too explicit for audiences, and wright’s attempts to preclude another controversy caused relatively innocuous stories such as ‘in the vault’ by howard phillips lovecraft (better known as h. p. lovecraft, who was a friend of eddy) to be denied publication in in fear of another mishap (de camp, , p. ). although controversy has continued to surround the story, the focus is no longer on the subject matter. instead, the question of authorship has defined the discourse surrounding ‘the loved dead’, because lovecraft, one of weird tales’ most popular contributors, is known to have revised it. the extent of his revisions, though, remains unclear.
correspondence: alexander a. g. gladwin, franklin ave. columbus, oh united states. e-mail: aaggladwin@gmail.com
digital scholarship in the humanities © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqv digital scholarship in the humanities advance access published july ,

biographical and historical considerations
because direct historical evidence is typically privileged over computational analysis, we seek to exhaust these resources before enacting a less conventional approach. however, the results are not fruitful; when we consider the fact that lovecraft ‘revised’ the story, we find only more ambiguity. the term may imply copy-editing or content suggestions, but lovecraft often ghostwrote and collaborated on stories while retaining the title of ‘revisionist’. lovecraft biographer sunand tryambak joshi, or s. t. joshi ( ), identifies the author’s frequent undermining of his input on revised stories: he and his friend, robert h. barlow, cowrote ‘till a’ the seas’, but lovecraft allowed it to be published exclusively in barlow’s name to encourage the young author (p. – ). in another case, he took a two-sentence treatment by zelia bishop and turned it into a , word novella, ‘the mound’ (joshi, , p. ); he does not appear to have sought credit in its attempted publication. thus, the term ‘revision’ is separated from its denotation and even connotations in the context of lovecraft, and a careful suspicion of declaring the story to be eddy’s on that basis alone is warranted. to lovecraft, revision could mean copy-editing, minor rewrites, major rewrites, or sole writing based on ideas. current primary and secondary claims do not clarify the extent of his contribution. additional historical details about the authorship complicate the question further.
in terms of first-hand accounts, lovecraft ( / ) does refer to the story in a letter to his aunt as ‘poor eddy’s ‘‘the loved dead’’’ (p. ). however, eddy and lovecraft were friends; the latter visited the former on occasion, as both were permanent residents of providence, excepting lovecraft’s stint in new york city from to (joshi, , p. – ). if he was willing to allow his friend barlow and numerous others to take credit for stories that he worked on, then his referring to the story as eddy’s is inconclusive as to its authorship. however, in march , lovecraft ( / ) says in a letter to robert bloch, ‘it may interest you to know that i revised the now-notorious ‘‘loved dead’’ myself—practically re-writing the latter half’ (p. ). in contrast, eddy’s wife, muriel eddy ( / ), claims in her memoir about lovecraft that he merely ‘read the original manuscript and touched it up in places, with my husband’s full sanction, but it was entirely the brain-child of my husband’ (p. – ). joshi ( ) has called into question many of muriel eddy’s claims in this piece, citing a lack of supporting evidence and positing the possibility that she wished to capitalize on lovecraft’s fame (p. – ). there is no way to be certain in either case, as joshi himself is merely speculating, and there is no reason to dismiss muriel’s statement about ‘the loved dead’ out of hand. jim dyer (eddy’s grandson and head of fenham publishing, which continues to release collections of eddy’s stories) further argues in favor of his grandfather’s nigh-complete authorship: ‘there should be no confusion regarding my grandfather’s stories. lovecraft and my grandfather would read their stories aloud to each other and both would give advice and suggestions. my grandfather’s stories were written by him as any of the people who knew both lovecraft and my grandfather could attest to’ (personal communication, february ).
he echoes muriel’s viewpoints, and the opinion of eddy’s family on the matter is consistent. the historical evidence does not provide a clear, single narrative. joshi ( ) argues in favor of lovecraft, albeit without any concrete evidence: ‘there was . . . in all likelihood a draft written by eddy for this tale’, he says, ‘but the published version certainly reads as if lovecraft had written the entire thing’ (p. ). joshi compares the ‘adjective-choked prose’ to lovecraft’s contemporary short story, ‘the hound’. most likely, eddy wrote at least the plot, if not an entire draft. the uncertainty lies in how much of the printed tale comes from eddy’s draft, and how much lovecraft wrote or rewrote.

primary tests

background

to provide a more in-depth analysis, we look to stylometry, a field that analyzes, per holmes’ ( ) definition, implicit aspects of an author’s writing style through statistical analysis. there are numerous approaches and debates within the field about the most accurate metrics (e.g. should the focus be on common words such as articles in order to measure subconscious stylistic qualities, or on words that appear rarely in order to find authorial stamps), but the principle remains: authors have distinctive qualities to their writing that can be uncovered and statistically analyzed, which provides evidence regarding the authorship of a disputed text.

we first provide a historical overview of stylometric techniques, because many have come into popularity only to be dismissed as a result of further investigation. morton ( ) focused on words that appear only once in a text, i.e. hapax legomena, and their positions in sentences. smith ( ) revealed several flaws in the process: small sample sizes, loose statistical inferences, and improper data collection. to question specific stylometric methods is important, because it pushes scholars to find more sound and rigorous ones.
several tests have yielded useful results, chief among them being mosteller and wallace’s ( ) testing of the federalist papers. they first attempted to analyze the texts through synonym pairs, such as the selection of the word ‘big’ over ‘large’. however, upon realizing that this method would not work for the texts at hand, they altered their study to focus on function words such as conjunctions and articles, which have little meaning on their own but reveal relationships in the structure of the sentence. the federalist papers were written by john jay, alexander hamilton, and james madison and published under a pseudonym. scholars and historians were certain of the authorship of seventy-three letters, but the authorship of the remaining twelve was unclear. mosteller and wallace used bayesian statistics to test claims about which of john jay, alexander hamilton, and james madison wrote the disputed papers. they discovered that the measures were consistent with madison, which suggested a high probability that he was the author of all twelve, a claim agreed upon by historians (juola, , p. ). this case study suggests that, if statistical rigor is maintained, reliable results can be obtained. the important steps are procuring accurate measurements and tests, applying them to suitable texts, and carefully interpreting the content of the results.

there have been only a small number of stylometric analyses of texts where the authorship is collaborative. rybicki et al. ( ) explore the stylistic qualities of the inheritors, romance, and the nature of a crime, all three of which are credited to both ford madox ford and joseph conrad. their method of testing is particularly useful for differentiating authorship in texts that were written in segments.
the publications that contained lovecraft and eddy’s stories, such as weird tales, provide numerous authors and stories that can serve as test cases for collaborative authorship due to the literary context of ‘hack writing’, a context that adds weight to our inquiry into lovecraft as revisionist. usage of the term ‘hack’, from hackney, to describe a laborer or ‘drudge’ dates back to the th century, but the concept of literary hackwork has its roots in the history of popular periodicals, especially london’s grub street (cox and mowatt, ). in the context of the th- and early- th-century united states, literary hackwork was likewise linked to mass-market periodicals and pulp publications. because the second half of the th century saw the rise of the modern profession of authorship and with it a professional discourse of trade books and periodicals, there exist numerous articles defining hackwork and discussing its merits. these sources tend to agree that a hack ‘writes for pay, and, if he were not paid, could not write’ (lang, , p. ). to succeed, a hack had to be ‘a sort of all-trades’ jack’ (reeve, , p. ). in comparison with popular notions of authorial identity and celebrity, literary hackwork occurred behind the scenes and carried with it a decreased emphasis on receiving credit. broadly speaking, hackwork included various levels of unnamed collaborative labor, including ghostwriting, revision, line editing, and adaptation.

lovecraft in particular is a part of this context, although he does differ from the prescribed image of a hack writer. he often edited and ghostwrote for money, including a short story published in the same issue of weird tales as ‘the loved dead’: ‘imprisoned with the pharaohs’, originally titled ‘under the pyramids’, which was credited to famous magician harry houdini; lovecraft received $ for his work (joshi, , p. – ).
however, lovecraft does not fit the image of a hack writer in many ways, chiefly in that he was against the idea of commercial writing (joshi, , p. ), favoring the image of the amateur who wrote out of personal passion and expression. still, he did revise, ghostwrite, and line edit (at times for friends, at times for money due to his financial destitution), which places him in this tradition.

although rybicki, hoover, and kestemont have begun to study literary collaboration using computational tools, further investigation into the range of literary collaborations of the modern period is called for, an investigation for which hack writing is well suited. the kinds of collaborations we seek to study are notoriously opaque but tremendously common, and crucial to the production of popular and literary texts. successfully using authorship attribution techniques with these kinds of collaborations would have numerous implications for the study of literature and literary history. potential fruits include: a better understanding of how and when hackwork took place; better information about the extent of collaboration needed to affect an authorial signal in a text; and, more aspirationally, methods for detecting ghostwritten texts in a field of candidate texts.

lexical richness

we initially attempted to measure variables of lexical richness, which has a turbulent history in stylometry and has largely faded from popular use. juola ( ) argues that such measurements have not ‘been demonstrated to be sufficiently distinguishing or sufficiently accurate’, although he does admit that they cannot be dismissed outright with specific counterexamples (p. – ). scholars have found stability with certain variables, including look’s ( ) study of the works of robert e.
howard, which argues for the stability of the type-token ratio given certain constraints; holmes’ ( ) study of authorship in mormon scripture, which combines variables including the hapax dislegomena ratio h, honoré’s r, and yule’s k; and grieve’s ( ) multifaceted approach to lexical richness, which utilizes the aforementioned and numerous other variables to predict the authorship of a text, achieving a success rate ranging from to % when there are two potential authors. however, tweedie and baayen ( ) provide evidence toward the instability of r and k in relation to the total token count of a text, n. the general usefulness of lexical richness appears to depend heavily on having particularly apt test subjects, and even then, a level of certainty reaching even % could be seen as too low for predicting the author of a single disputed text. we measured t, h, r, and k and performed principal component analysis (pca) to condense and visualize the data, because lexical richness can provide insight into the style of a text, but our results were too invariable and muddled to use in our test case. however, lexical richness can be useful for stylometric analysis of collaborative authorship given particularly apt test cases, so we leave this information for those wishing to perform further testing on other subjects.

latent features

measurements of lexical richness can be useful, but we prefer elements of style that can more consistently allow us to distinguish between authors, and such elements have been found in the study of the parts of language that often fail to attract significant attention: function words. included in this umbrella term are conjunctions, articles, prepositions, particles, and auxiliary verbs. the principle behind measuring function word usage is that it reflects an author’s structural stylistic impulse.
thus, measuring the frequencies of the most common function words and words in general can provide information about an author on a level that cannot be easily imitated. we will discuss two particularly useful tools, both suggested by burrows, that focus on function words and common words across texts in general in order to reveal the relationships among corpora.

function words

in the first method, burrows ( ) outlines the process of taking function word frequencies and applying pca by analyzing the speech of certain characters in the works of jane austen, and then comparing the works of different authors. binongo ( ) hones the technique in a process that will be outlined here. his focus is the th book in the oz series, the royal book of oz, which was published years after the death of l. frank baum, the undisputed author of the first fourteen books. the controversy is due to a statement in the th book’s original publication that claims the text was written by baum and only ‘enlarged and edited’ by ruth plumly thompson (as cited in binongo, , p. ), who would go on to write the th to rd entries. however, this claim has been disputed. the most recent edition from recognizes thompson’s authorship, stirring the controversy.

binongo performs an authorship attribution test distilled from burrows’ technique. he takes the first thirty-three books in the oz series (including fourteen that are undisputedly by baum, eighteen undisputedly by thompson, and the royal book of oz) and treats these books as a single corpus. he removes non-prose sections for ease of study. then, he obtains the proportion of each word in the corpus, i.e. for every distinct word, he measures the number of occurrences in the text, say w, and calculates w / n.
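The proportion step just described (count each distinct word, w, and divide by the total token count, n) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by Binongo or by the authors: the function names `word_proportions` and `most_common_proportions` and the toy sentence are hypothetical, and tokenization is reduced to whitespace splitting.

```python
from collections import Counter


def word_proportions(tokens):
    """Map each distinct word to w / n (its count over the token total)."""
    n = len(tokens)
    counts = Counter(tokens)
    return {word: w / n for word, w in counts.items()}


def most_common_proportions(tokens, k):
    """Return the k words with the highest proportions.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    props = word_proportions(tokens)
    ranked = sorted(props.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Toy corpus (hypothetical); a real study would tokenize full texts.
corpus = "the cat and the dog sat in the house of the king".split()
props = word_proportions(corpus)
top = most_common_proportions(corpus, 3)
```

Because every token is counted exactly once, the proportions over all distinct words sum to one, which makes for a quick sanity check on real data.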
from this list, he records the top fifty function words, excluding auxiliary verbs and pronouns due to the former’s multiplicity of inflections and the latter’s dependence on factors such as point of view. binongo then subdivides each text into blocks of , words in order to see variations within a book, creating text blocks; next, he calculates the proportions of the top fifty function words within each , word subdivision. he then utilizes pca to distill the information from these measurements into easily visualized data.

binongo’s results are convincing (fig. ). when he visualizes the data, works by baum and those by thompson are divided along the x-axis. binongo notes that even the th oz book, glinda of oz, which was edited by baum’s son from a rough draft, falls distinctly in the cluster of baum’s works. the royal book of oz, however, clusters with thompson’s work, lending evidence to the claim that she is the likely author of the text. the fact that the revised version of baum’s work is notably similar to his other works is useful because the disputed text we will be considering, ‘the loved dead’, is possibly just a light revision by lovecraft. in theory, if he had no greater hand in the published version, we should obtain similar results in favor of eddy.

this process provides visualizable data on the differences in style between two authors by tracking subconscious stylistic qualities. the utility is clear and will provide information about the disputed text that we will consider, because a light revision by one author of another’s text will not likely change an author’s basic grammatical structures.

burrows’ delta

the second method outlined by burrows ( ) measures style based on common words, including non-function words, and does not privilege visualizations; rather, it yields a single value called delta that reflects the similarity of subsets of a corpus to a key subset.
in our study, that will mean treating ‘the loved dead’ as our key subset, and comparing stories by lovecraft and eddy to it. the main set will consist of all of these stories. the authorial subset that yields the smallest delta score will be the least unlike the key text in terms of common word usage. burrows emphasizes using the phrase ‘least unlike’ in order to clarify that the author with the lowest delta score does not necessarily have a similar style to the author of the disputed text, but rather is closer in style than any other author with a higher delta score. fortunately for us, this will not be an issue, as ‘the loved dead’ is all but certainly the work of lovecraft, eddy, or both. the purpose of using this test in conjunction with function word pca is that it provides another way to measure style as seen in common words (so the results for both tests should coincide with each other), but with a potentially different word set.

fig. binongo’s results for function word pca for baum and thompson

rolling delta

rybicki et al. ( ) use rolling delta in their study of collaborative authorship, which applies the same basic technique as burrows’ delta, but provides several delta scores for a single test text by setting two numbers (a ‘rolling window’ and a step size) and then ‘rolling’ through the text. the window is the number of words that will be considered when calculating delta, and the step size is the increment by which the index of the first word for the window increases. so, for example, if the window size were set to , and the step size , then that test would start with a window containing the first , words of the text, evaluate the delta scores for the authorial subsets, then measure another , word window starting with the st word, followed by a , word window starting with the , st, etc., until a window contained the last word of the text.
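The two ideas above, a delta score comparing a test text with an authorial subset and a window that rolls through the test text, can be sketched together in Python. This is a hedged illustration, not the authors' code and not the stylo implementation: the function names, the use of the population standard deviation, the choice to skip words with zero variance, and the decision to let the final window be shorter than the others are all assumptions made for the sketch. The delta shown is the commonly described form, the mean absolute difference of z-scored relative frequencies over the top k words of the main set.

```python
from collections import Counter
from statistics import mean, pstdev


def rel_freqs(tokens, vocab):
    """Relative frequency of each vocab word in one token list."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in vocab]


def top_words(texts, k):
    """The k most frequent words across every text in the main set."""
    total = Counter()
    for tokens in texts:
        total.update(tokens)
    ranked = sorted(total.items(), key=lambda item: (-item[1], item[0]))
    return [w for w, _ in ranked[:k]]


def burrows_delta(test_tokens, candidate_tokens, main_set, k):
    """Mean absolute z-score difference over the top k words.

    main_set is a list of token lists; candidate_tokens is assumed to be
    the pooled tokens of one authorial subset.
    """
    vocab = top_words(main_set, k)
    rows = [rel_freqs(tokens, vocab) for tokens in main_set]
    test = rel_freqs(test_tokens, vocab)
    cand = rel_freqs(candidate_tokens, vocab)
    diffs = []
    for i in range(len(vocab)):
        column = [row[i] for row in rows]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue  # assumption: a word with no variance is skipped
        diffs.append(abs((test[i] - mu) / sigma - (cand[i] - mu) / sigma))
    return sum(diffs) / len(diffs) if diffs else 0.0


def rolling_windows(tokens, window, step):
    """Slices of `window` tokens, advancing by `step`, until one slice
    contains the last token (that final slice may be shorter)."""
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break
        start += step
    return windows


# Toy data (hypothetical) standing in for two authorial subsets.
author_a = "the sea and the sky and the sea".split()
author_b = "of a night of a storm of a ship".split()
disputed = "the sea and the storm".split()
main_set = [author_a, author_b]

delta_a = burrows_delta(disputed, author_a, main_set, 4)
delta_b = burrows_delta(disputed, author_b, main_set, 4)
windows = rolling_windows(list(range(10)), 4, 3)
```

In this toy case the disputed text is "least unlike" author A, since `delta_a` is the smaller score. A rolling analysis would call `burrows_delta` once per window returned by `rolling_windows`, giving one score per author per window.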
there are two other differences between burrows’ delta and rolling delta. first, while the delta score is weighted by the number of top words considered, rolling delta is not. so, if we were to measure the frequencies of the thirty top words, and one author differed from the usage in the test text by standard deviation each time (and, in the case of rolling delta, for each rolling window), then the burrows’ delta score would be , but the rolling delta scores would be . second, rolling delta uses ‘culling’, which is the process of removing words that do not appear in a certain percentage of the texts. if the culling value is thirty, then only words that appear in at least % of the texts will be considered when calculating the rolling delta scores. this process is useful, in particular for larger texts, because it removes words that are used often overall but appear in only a limited share of the samples.

text selection and natural language processing

before any tests can be run, the data must be standardized. we choose to work only with prose because that is the genre of the disputed text. we want to have samples that are of the same language form as each other and as the test text to avoid unforeseen variables. moreover, because ‘the loved dead’ is fiction, we have decided to exclude essays and letters in case they interfere with the language structures that we are attempting to unearth. non-english language sections, epigraphs, and chapter numberings (e.g. ‘ii’ or ‘chapter ’) have been removed. because we only have twelve eddy stories to work with (table ), we create a subset of lovecraft’s horror prose that contains twelve tales contemporary to ‘the loved dead’ (table ).

contractions are not expanded in the delta tests as a result of hoover’s ( ) findings that doing so actually lowers accuracy, and are similarly not expanded for the function word pca test. spellings are not standardized in order to avoid altering the data.
thus, even sections of the stories written in dialect, e.g. zadok allen’s monologue in the shadow over innsmouth, are left intact, as they represent specific word choices by the authors.

the lovecraft set is tested against eddy in relation to not only ‘the loved dead’, but also a text that we know to be entirely by lovecraft: ‘the mound’, which is not in our test sets due to its technical designation as a collaboration between him and zelia bishop. testing ‘the mound’ provides base cases that should yield strong results in favor of lovecraft if the tests are accurately capturing style.

we perform tokenization in python using the natural language toolkit’s (nltk) tokenizer, which creates a list wherein each item is a word or piece of punctuation (http://www.nltk.org/). all of the tests and calculations for this paper, excluding rolling delta (which was performed using the stylo package for the r programming language, developed by eder, rybicki, and kestemont; https://sites.google.com/site/computationalstylistics/stylo), are written in python to optimize performance and collection.

validating our approach

to validate our method, we perform our tests in two separate scenarios. first, we run our tests on authors that worked during the same time and/or in the same genre as lovecraft to ensure that we can differentiate authors that share those qualities; second, we run our tests on established revisions/collaborations to see how our test results reflect the mixed authorship.
differentiating lovecraft from contemporaries

to explore how our tests interpret results for authors working in similar time periods or genres, we choose to test the lovecraft set against a selection of texts by three authors: booth tarkington, who wrote during approximately the same period as lovecraft, but not in the horror, fantasy, or weird fiction genres; edgar allan poe, whose horror stories greatly influenced lovecraft (joshi, , p. ), but who wrote approximately a century earlier; and arthur machen, who wrote before and during lovecraft’s lifetime and worked in the horror, fantasy, and weird fiction genres. for each author, we select a subset of his bibliography (table ), as well as one story that will act as a test text in the same way that we will use ‘the mound’. for tarkington, we use the magnificent ambersons; for poe, ‘the fall of the house of usher’; and for machen, ‘the white people’.

function word pca

we note that only twenty-four top words could be tested due to our smaller sample size and the constraints of our method of pca. thus, a lack of clarity in our visualizations could indicate an actual lack of distinction between the two authors in terms of function word usage, or that we are not measuring enough top words. this bolsters the importance of using burrows’ delta, as we can select any number of top words regardless of the number of samples.

the lovecraft and tarkington sets cluster distinctly across the first pc with no overlap (fig. ). both ‘the mound’ and the magnificent ambersons cluster with their respective authors.

for the comparison between the lovecraft and poe sets, the test does not differentiate the stories as clearly (fig. ); ‘the mound’ and ‘the fall of the house of usher’ appear between the two clusters.
this discrepancy shows why it is especially important to follow up on unclear results with other tests in order to check whether the lack of separation is due to the constraints of the test or the texts themselves.

the lovecraft and machen sets differentiate clearly (fig. ). ‘the mound’ and ‘the white people’ are substantially closer in terms of function word usage to the lovecraft set and machen set, respectively.

function word pca does differentiate the authors distinctly in most cases, with few cases of ‘mis-attribution’. the tests do not always demarcate the differences clearly enough, which shows that we should look more closely at the comparison either by increasing the number of top word counts measured or looking to other tests, namely burrows’ delta.

table lovecraft stories and token counts
title | token count
the festival | ,
herbert west-reanimator | ,
the hound | ,
hypnos | ,
the lurking fear | ,
the music of erich zann | ,
the nameless city | ,
the outsider | ,
the picture in the house | ,
the rats in the walls | ,
the temple | ,
the terrible old man | ,

table eddy’s stories and token counts
title | token count
an arbiter of destiny | ,
arhl-a of the caves | ,
ashes | ,
the better choice | ,
the cur | ,
deaf, dumb, and blind | ,
eterna | ,
the ghost-eater | ,
red cap of the mara | ,
sign of the dragon | ,
souls and heels | ,
with weapons of stone | ,

burrows’ delta

burrows’ delta provides the most accurate results in terms of predicting the authorship of our test texts (table ). although a delta score for an authorial subset on its own can reveal how much or little that author’s usage of common words resembles that of the test text, the number we will use in comparing authors is the difference between the delta scores. because a lower delta score means an author’s style is ‘less unlike’ that of the test text, the difference between delta scores reveals the extent to which one author is stylistically closer.
our results for the comparison of the lovecraft and tarkington sets yield smaller delta scores for lovecraft when ‘the mound’ is the test text, and similarly smaller delta scores for tarkington when the magnificent ambersons is the test text. the differences range from an absolute value of . to over . . our results for machen show that lovecraft’s style is closer to that of ‘the mound’ than machen’s, and machen’s is closer to that of ‘the white people’ than lovecraft’s.

poe, in contrast, produces similar difficulties to those seen with function word pca. although ‘the mound’ is closer to the lovecraft set, ‘the fall of the house of usher’ produces a difference in delta scores of approximately . in favor of lovecraft. thus, we have a ‘false positive’, or a result that inaccurately indicates the authorship of a disputed text.

table tarkington, poe, and machen stories
tarkington: alice adams; the beautiful lady; the conquest of canaan; the flirt; gentle julia; the gentleman from indiana; harlequin and columbine; penrod; penrod and sam; seventeen; the turmoil
poe: the black cat; the cask of amontillado; a descent into the maelström; the facts in the case of m. valdemar; hop-frog; ligeia; the masque of the red death; the murders in the rue morgue; the pit and the pendulum; the purloined letter; the tell-tale heart
machen: the angels of mons; far off things; a fragment of life; the great god pan; the great return; hieroglyphics: a note upon ecstasy in literature; the hill of dreams; the inmost light; the secret glory; the terror; the three impostors

fig. function word pca results for lovecraft and tarkington
fig. function word pca results for lovecraft and machen
fig. function word pca results for lovecraft and poe
to test whether delta is actually unable to differentiate the style of the two authors clearly, though, we return to burrows’ suggestion about the number of top words chosen: although thirty is often enough to differentiate, it is on the low end of the recommended values. thus, one way we can test our result is to perform delta tests for various top word counts. therefore, we perform delta tests with word counts ranging from to in increments of ten words. the results clearly show that usage of common words in ‘the fall of the house of usher’ is closer to that of the poe set (fig. ). in fact, the more top words we use, the clearer that result becomes. this suggests that if our results for an unknown text yield differences in delta scores that are below . , then we should further investigate the scores by testing with more top words to ensure that the result for thirty top words is not a misrepresentation of the actual stylistic qualities being measured. as a result, we will consider an author’s style to be notably closer to that of the test text if the difference between the authors’ delta scores is . or greater.

rolling delta

the results for rolling delta are similar to those for burrows’ delta. rolling delta is useful for examining the stylistic qualities of several sections of the test text, which informs collaborative authorship attribution by potentially revealing segmentation. our test texts, however, are not only unsegmented, but also non-collaborative, which means our results should resemble those we found for burrows’ delta.

in the comparisons between the tarkington and lovecraft sets, we see similar results to the ones from burrows’ delta; for ‘the mound’, with a window size of % of the windows yield smaller delta scores for the lovecraft set, with an average difference of . . for lovecraft, the delta scores range from . to . , and for tarkington, they range from . to . .
when we use the magnificent ambersons as a test text, we see similarly clear results: % of the windows yield smaller delta scores for tarkington, with an average difference of . in favor of tarkington. the values for tarkington range from . to . , and for lovecraft, they range from . to . .

the rolling delta results for comparing poe and the lovecraft sets similarly reflect our results for burrows’ delta. when we use ‘the mound’ as the test text, the lovecraft set has smaller delta scores for % of the windows, with an average difference of . in favor of lovecraft. the values for lovecraft range from . to . , and for poe, from . to . . with ‘the fall of the house of usher’ (for which we run all rolling delta tests with a window size of , and step size of ), the results are not as decisive, but still indicate that poe’s style is predominant: . % of the windows yield smaller scores for poe, with an average difference for those windows of . . for the windows that favor lovecraft, the average difference is . . the delta scores for poe range from . to . , and for lovecraft they range from . to . . although there is some misattribution here for the windows in ‘the fall of the house of usher’, we remind the reader that these scores are unweighted. thus, for thirty top words, the average differences of . and . are weighted to . and . . these small values indicate that the test text does not as clearly differentiate the style of the potential authors, and should be considered with the results from the other tests, or further testing.

table burrows’ delta results for lovecraft, tarkington, poe, and machen
test | author | delta | author | delta | diff
‘the mound’ | lovecraft | . | tarkington | . | .
the magnificent ambersons | lovecraft | . | tarkington | . | .
‘the mound’ | lovecraft | . | poe | . | .
‘the fall of the house of usher’ | lovecraft | . | poe | . | .
‘the mound’ | lovecraft | . | machen | . | .
‘the white people’ | lovecraft | . | machen | . | .

fig. delta differences for lovecraft and poe
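The window-level figures reported above (the share of windows favoring each author, the average difference within those windows, and the weighted version of that difference) can be derived from two lists of per-window scores. The sketch below is illustrative only. It assumes that "weighting" an unweighted rolling delta difference means dividing it by the number of top words, which is how the text describes the relationship between the two scores; the function name `summarize_windows` and all numbers in the example are hypothetical and are not results from the study.

```python
def summarize_windows(scores_a, scores_b, n_top_words):
    """Summarize unweighted per-window delta scores for authors A and B.

    A window "favors" an author when that author's score is smaller.
    Weighted differences divide by n_top_words (an assumption about
    how unweighted rolling delta maps onto burrows' delta).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equal-length, non-empty score lists")

    favor_a = [b - a for a, b in zip(scores_a, scores_b) if a < b]
    favor_b = [a - b for a, b in zip(scores_a, scores_b) if b < a]

    def average(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "share_a": len(favor_a) / len(scores_a),
        "share_b": len(favor_b) / len(scores_a),
        "mean_diff_a": average(favor_a),
        "mean_diff_b": average(favor_b),
        "weighted_mean_diff_a": average(favor_a) / n_top_words,
        "weighted_mean_diff_b": average(favor_b) / n_top_words,
    }


# Made-up scores for four windows, with thirty top words.
summary = summarize_windows(
    scores_a=[30.0, 45.0, 60.0, 33.0],
    scores_b=[36.0, 42.0, 66.0, 39.0],
    n_top_words=30,
)
```

With these made-up scores, three of four windows favor author A by an unweighted 6.0 on average, which corresponds to 0.2 once divided by the thirty top words, while the single window favoring author B differs by 3.0 unweighted, or 0.1 weighted.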
for machen, we again see reflections of our results for burrows’ delta. for ‘the mound’, the lovecraft set has smaller delta scores for . % of the windows, with an average difference for those windows of . in favor of lovecraft. for the windows that yield smaller delta scores for machen, the average difference is . . the values for lovecraft range from . to . , and for machen, from . to . . for ‘the white people’, . % of the windows have smaller delta scores for machen, albeit with a smaller average difference of . . for the windows in favor of lovecraft, the average difference is . . the values for machen range from . to . , and for lovecraft, from . to . .

taken as a whole, rolling delta always favors the proper author, and the cases of mis-attribution do not yield large differences in delta scores. moreover, rolling delta does get results that correctly identify the true author for test texts that are as small as , words. thus, it is sufficiently reliable and will help us test for a segmented collaboration in ‘the loved dead’.

determining authorship in an established collaboration

to test an established collaboration, we return to the field of ‘hack writing’, in this case focusing on stories featuring the character conan the barbarian. starting in and ending in , gnome press published five books containing robert e. howard’s original conan stories. in , a sixth book, tales of conan, was published. however, the stories contained in the sixth book were not original conan stories written by howard. rather, l. sprague de camp took four existing non-conan stories by howard, recast them as conan stories, and changed their titles, resulting in ‘the road of the eagles’, the flame knife, ‘hawks over shem’, and the blood-stained god.
as these works were originally penned by howard and then heavily edited by de camp (including changes to the setting, time period, and main characters), these stories are an example of a specific type of revision, one that borders on posthumous collaboration. given our interest in the nature of the revision versus collaboration, these texts will be compared to the nineteen conan stories by howard as well as sixteen non-conan stories by de camp (table ). for the collaborations, we will use ‘the road of the eagles’, the flame knife, and ‘hawks over shem’ as test texts.

table howard and de camp stories
howard: beyond the black river; black colossus; the devil in iron; gods of the north; hour of the dragon; jewels of gwahlur; the people of the black circle; the phoenix on the sword; the pool of the black one; queen of the black coast; red nails; rogues in the house; the scarlet citadel; shadows in the moonlight; shadows in zamboula; the slithering shadow; the tower of the elephant; the vale of lost women; a witch shall be born
de camp: the ameba; the clocks of iraz; the command; the emperor’s fan; the gnarly man; the goblin tower; the guided man; the hardwood pile; the hostage of zir; the inspector’s teeth; little green men from afar; the merman; nothing in the rules; reward of virtue; two yards of dragon; the unbeheaded king

function word pca

function word pca clearly divides the authors across the first principal component, and the test texts all cluster with howard (fig. ). the analysis of function words shows that heavily edited texts may retain stylistic similarities to the original author, namely in the use of function words; despite de camp’s heavy editing of the texts, these three test texts clearly cluster with the howard samples.

burrows’ delta

burrows’ delta corroborates our results from function word pca, with all three test texts yielding smaller delta scores for howard than de camp (table ).
this reflects the fact that, in terms of common word usage, the test texts resemble howard more closely than they do de camp. based on these tests, we can infer that even a heavily edited text can still bear stylistic similarities to the original author. notably, both author sets yield small delta scores, less than . for all of them, and howard’s delta scores for two of the stories are smaller by a margin greater than . . these results imply that both author sets have word usage closer to the test texts than tarkington, poe, and machen do to ‘the mound’, or than lovecraft does to those authors’ respective texts.

. . rolling delta

the results for rolling delta similarly reflect the clear stylistic resemblance of the common word usage in the test texts to that of the works of howard. for ‘hawks over shem’, . % of the windows yield smaller delta scores for howard, with an average difference of . , whereas the average difference for the windows in favor of de camp is . . the delta scores for howard range from . to . and from . to . for de camp. for the flame knife, . % of the windows favor howard with an average difference of . . the windows in favor of de camp have an average difference of . . for howard, the values range from . to . , and from . to . for de camp. for ‘the road of the eagles’, . % of the windows favor howard with an average difference of . , compared to the average difference of . for the windows in favor of de camp. for howard, the values range from . to . , and from . to . for de camp. for each text, several windows of the text are stylistically similar to de camp; this could stem from the authors’ varying word usage, or reflect sections more heavily rewritten by de camp. regardless, the results indicate that even though heavy editing can shift common word usage toward the style of the editing author, the style of the original author can remain predominant.

results for ‘the loved dead’
. function word pca results

the lovecraft and eddy sets differentiate with minor overlap, and ‘the mound’ is correctly clustered with lovecraft’s texts; ‘the loved dead’ clusters toward the rightmost edge of the eddy set, although we must again note that only words were tested here for the reasons mentioned in note (fig. ). these results would imply that ‘the loved dead’ is closer in style to eddy in terms of common function word usage, although not as drastically as ‘the mound’ is to lovecraft. these results could further imply that this test struggles to differentiate the authors, although the ability to accurately categorize ‘the mound’ so strongly would provide counter-evidence to that claim. still, another test of common word usage that approaches the measurements differently and considers a less selective set is useful in corroborating or contradicting these findings, and so we look to burrows’ delta.

fig. function word pca results for howard and de camp

. burrows’ delta results

preliminary burrows’ delta tests on ‘the mound’ with thirty top words and no subdivision clearly distinguish the two authors: lovecraft’s set has a delta score of . , and eddy’s of . (table ). the test captures large differences that indicate lovecraft is stylistically closer in terms of common (function and non-function) word usage. the difference of . between the two authors is notable for delta scores, and the low score corroborates the picture painted here that lovecraft is stylistically closer to the author of ‘the mound’ than eddy. when ‘the loved dead’ is the test text, the results are less striking: lovecraft’s set has a delta score of . , and eddy’s, . , resulting in a difference of . in favor of the eddy set. nonetheless, the results show that eddy is closer in common word usage than lovecraft.
a possibility for the lack of such large differences as those seen in ‘the mound’ is that ‘the loved dead’ is a significantly shorter text, nearly one-seventh of the lovecraft story’s token count. thus, in order to find out how this difference compares to other delta scores, we perform burrows’ delta with each of the lovecraft and eddy stories as the test text one at a time, tested against the lovecraft and eddy subsets. in each case, we remove the test text from the respective author subset. this allows us to see how many false positives are found in stories for which we know the author, and to find the range for the differences in delta scores.

we visualize the results for these tests by plotting the difference between delta scores for each test text, with a negative difference reflecting a smaller delta score for lovecraft, and a positive difference reflecting a smaller delta score for eddy. thus, a story that reads closer to eddy will have a positive difference, while a story that reads closer to lovecraft will have a negative difference. we find that there are no false positives for lovecraft, but two for eddy: ‘the ghost-eater’ and ‘deaf, dumb, and blind’ (fig. ). the differences in delta scores for these stories are smaller than most of the other tests, which implies that the common word usage is not nearly as distinguishing as it is for the other texts. to further investigate the false positives, we reapply the method we used when delta scores for lovecraft and poe tested against ‘the fall of the house of usher’ yielded a false positive: we test delta scores for word counts ranging from to in increments of ten. when we perform these tests, we see only confirmation that the styles of the stories are closer to lovecraft, rather than the clear picture we saw with ‘the fall of the house of usher’ (fig. ). ‘deaf, dumb, and blind’ in particular consistently yields a stronger score for lovecraft by a margin greater than . .

fig. function word pca results for lovecraft and eddy

table burrows’ delta results for howard and de camp
test | author | delta | author | delta | diff
‘hawks over shem’ | howard | . | de camp | . | .
the flame knife | | . | | . | .
‘the road of the eagles’ | | . | | . | .

‘the ghost-eater’ and ‘deaf, dumb, and blind’ are the only two stories that delta fails to distinguish in our testing. the data for the function word pca graphs reveal that the two rightmost eddy values (i.e. those closest to the lovecraft cluster, which implies they are less distinctly similar to eddy in terms of function word usage) are ‘the ghost-eater’ and ‘deaf, dumb, and blind’ (fig. ). this is notable because these are two stories on which lovecraft is also known to have done revision work. in terms of ‘the ghost-eater’, lovecraft (as cited in derleth, ) claims in a letter to muriel eddy that he ‘made two or three minor revisions’, which joshi ( ) finds agreeable, claiming that he ‘cannot detect much actual lovecraft prose, unless he was deliberately altering his style’ (p. ). in regard to ‘deaf, dumb, and blind’, there is little in terms of lovecraft’s own claims, although joshi adds that, while eddy only implies that lovecraft fixed up the last paragraph, ‘in truth, the entire tale was probably revised, although again eddy very likely had prepared a draft’ (p. ). although the margins for delta scores with ‘the ghost-eater’ are relatively small, the differences for ‘deaf, dumb, and blind’ are large enough to support the claim that lovecraft’s style is as present as eddy’s, if not more so. both stories are actually closer to lovecraft’s style in terms of common word usage than ‘the loved dead’, and are the only two of the eddy subset for which that is true.
thus, we have inadvertently found that these three stories are the most difficult to differentiate stylistically between lovecraft and eddy; therefore, the stories have measurable similarities to the former’s style. when we consider the delta scores that strongly favored howard over de camp, the scores for ‘the loved dead’, ‘the ghost-eater’, and ‘deaf, dumb, and blind’ could provide evidence toward claims that lovecraft revised the stories substantially, because even de camp’s extensive rewriting did not make his common word usage more predominant in the texts. this would imply that lovecraft rewrote at least portions of ‘the loved dead’, and the small difference in scores could imply that he did indeed rewrite the second half.

when we apply the varying word count methodology to delta tests for ‘the mound’ and ‘the loved dead’, we find a corroboration of our previous findings for the two stories, namely, ‘the mound’ is definitively closer in style to the lovecraft set, and ‘the loved dead’ is not significantly differentiated, as the author with the smaller delta score changes, given the number of top words considered, and the differences are within . of each other (fig. ).

fig. delta differences for lovecraft and eddy
fig. delta differences for lovecraft and eddy

table burrows’ delta results for lovecraft and eddy
test | author | delta | author | delta | diff
‘the mound’ | lovecraft | . | eddy | . | .
‘the loved dead’ | | . | | . | .

. rolling delta results

our rolling delta results largely reflect our burrows’ delta results. for ‘the mound’, . % of the windows have smaller delta scores for the lovecraft set when compared to eddy, with an average difference in delta scores of . . for the windows that favor eddy, the average difference is . . the delta scores for lovecraft range from . to . , and for eddy they range from . to . .
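the window-by-window comparison described here can be sketched in a few lines of python. this is a minimal illustration rather than the procedure actually used for the reported scores: the function names (`top_words`, `rel_freqs`, `burrows_delta`, `rolling_delta`) are hypothetical, the most frequent words are assumed to be drawn from the reference texts only, the population standard deviation is assumed for standardization, culling is not implemented, and the window and step sizes are left as required arguments because no particular values are implied.

```python
from collections import Counter
from statistics import mean, pstdev


def top_words(texts, n):
    """Return the n most frequent words across all reference texts."""
    counts = Counter()
    for tokens in texts:
        counts.update(tokens)
    return [word for word, _ in counts.most_common(n)]


def rel_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in one (non-empty) token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[word] / total for word in vocab]


def burrows_delta(test_tokens, author_texts, vocab, means, stds):
    """Mean absolute difference between test z-scores and an author's average z-scores."""
    def z(freqs):
        return [(f - m) / s for f, m, s in zip(freqs, means, stds)]

    z_test = z(rel_freqs(test_tokens, vocab))
    z_texts = [z(rel_freqs(tokens, vocab)) for tokens in author_texts]
    z_author = [mean(column) for column in zip(*z_texts)]
    return mean(abs(a - b) for a, b in zip(z_test, z_author))


def rolling_delta(test_tokens, authors, n_words, window, step):
    """Score every window of the test text against each author subset.

    `authors` maps an author name to a list of token lists. Returns a list
    of (window_start, {author: delta}) pairs.
    """
    all_texts = [tokens for texts in authors.values() for tokens in texts]
    vocab = top_words(all_texts, n_words)
    columns = list(zip(*(rel_freqs(tokens, vocab) for tokens in all_texts)))
    means = [mean(column) for column in columns]
    # Assumption: a word with zero variance gets a divisor of 1.0 to avoid dividing by zero.
    stds = [pstdev(column) or 1.0 for column in columns]

    results = []
    for start in range(0, len(test_tokens) - window + 1, step):
        chunk = test_tokens[start:start + window]
        scores = {name: burrows_delta(chunk, texts, vocab, means, stds)
                  for name, texts in authors.items()}
        results.append((start, scores))
    return results
```

for each window, the author with the smaller score is the one the window "favors"; the percentage of windows favoring an author and the average difference between the two scores can then be computed from the returned list.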
for ‘the loved dead’, interestingly, eddy yields smaller delta scores for % of the windows, with an average difference of . . the values range from . to . for eddy, and from . to . for lovecraft. to further investigate these results, we look into the effects of culling. because ‘the loved dead’ is such a small text, and we are already working with shorter and fewer texts, the effects of culling can be severe. even common words might not show up in a smaller text, and thus would be removed from the set due to our decision to set the culling rate to . when the culling rate is , we do not control the number of top words considered, which means the count can be below thirty. we will consider no culling with , , and top words.

when we perform rolling delta with thirty top words (fig. ), our results seem in line with our findings in burrows’ delta. sixty percent of the windows favor lovecraft, with an average difference of . , compared to the average difference of . for the windows that favor eddy. the values for lovecraft range from . to . , and from . to . for eddy. the windows are almost evenly split in terms of which author has a smaller delta score, which reflects the fact that the test cannot clearly differentiate the style of the author. however, the windows with the largest difference in delta scores are the last two, which due to the window size encapsulate the entire second half of the story. the differences for these windows are . and . , which are greater than the average differences by a factor of approximately .

when we perform rolling delta with (fig. ) and (fig. ) top words, the lovecraft set garners a smaller delta score for every window, with respective average differences of . and . . the values for the tests with fifty words range from . to . for lovecraft and from . to . for eddy. the values for the tests with words range from . to . for lovecraft and from . to . for eddy.
in both cases, the last two windows yield the largest difference in delta scores. for rolling delta with fifty top words, the differences for the last two windows are . and . (approximately twice the average difference), and for top words, the differences for the last two windows are . and . (again approximately twice the average difference).

conclusion

. interpretation of results

our tests consistently reveal that ‘the loved dead’ does not bear an overwhelming stylistic similarity to lovecraft or eddy, and is consistent with the conjecture that the story is a collaboration. we performed lexical richness pca to see if we could detect stylistic similarities in infrequently used words. however, these tests did not reveal anything regarding the authorship of ‘the loved dead’ and are known to be of questionable reliability in general. we performed function word pca because the analysis of common word usage is more reliable, and this test indicated that ‘the loved dead’ is closer in terms of function word usage to eddy’s style, although it does share similarities to lovecraft’s. we selected burrows’ delta because it proved to be the most reliable in our preliminary tests, and allows us to look at most common word usage and not just function word usage. again, our results showed that ‘the loved dead’ is stylistically closer to eddy, but also has stylistic similarities to lovecraft, which could imply that lovecraft’s presence as author in the story is a larger revision akin to de camp’s revision of howard stories. further study on the relationship between the extent of a collaboration and the margin of delta scores is warranted.

fig. delta differences for lovecraft and eddy
finally, we also tested with rolling delta to see if ‘the loved dead’ might be a segmented collaboration, and while our results generally suggest that both authors’ style is detectable in the whole story (again, implying that lovecraft likely rewrote sections of the text), lovecraft’s is particularly noticeable in the story’s second half, which corroborates his epistolary claim.

the results for all four tests, when considered in tandem, suggest that lovecraft likely edited ‘the loved dead’, perhaps extensively, but did not write the entirety or majority of the story as it appeared in weird tales. at most, it appears that his edits focused on the second half of the story, as he stated in his letters. this claim supports a middle ground between the claims made by joshi and eddy’s family. no definite claims can be made (as, we want to stress, this information is only evidence and should not be considered an endpoint for scholarship on the matter), but the evidence certainly goes against any claim that either author is solely responsible for ‘the loved dead’. we have found that the same is true for ‘the ghost-eater’ and ‘deaf, dumb, and blind’.

fig. rolling delta results for lovecraft and eddy
fig. rolling delta results for lovecraft and eddy

our set of tests allows for a deeper understanding of the extent to which multiple authors might have contributed to a text. although the tests do require interpretive work by the person performing the tests, the use of these four tests in tandem is useful because each provides unique insights while potentially corroborating the results found in the other three. although the results of stylometric analyses cannot demonstrate causality, mathematical methods can describe features that correlate with authorial identity, which is particularly useful when historical evidence is not present.
further, we find that ‘hack writers’ of the pulp market provide particularly fruitful test cases of collaborative authorship. future analyses of collaborative authorship might extend our findings by applying our methods to other collaborative case studies, comparing our methods with machine-learning methods, or studying how ‘collaborative proportion’ affects delta scores.

fig. rolling delta results for lovecraft and eddy

references

binongo, j. n. g. ( ). who wrote the th book of oz? an application of multivariate analysis to authorship attribution. chance, ( ): – .
burrows, j. ( ). ‘an ocean where each kind. . .’: statistical analysis and some major determinants of literary style. computers and the humanities, ( / ): – .
burrows, j. ( ). questions of authorship: attribution and beyond: a lecture delivered on the occasion of the roberto busa award ach-allc , new york. computers and the humanities, ( ): – .
cox, h. and mowatt, s. ( ). revolutions from grub street: a history of magazine publishing in britain. oxford: oxford up.
de camp, l. s. ( ). lovecraft: a biography. london: new english library.
derleth, a. w. ( ). [introduction]. in divers hands and h. p. lovecraft (author), the dark brotherhood and other pieces. sauk city, wi: arkham house, pp. ix–x.
eddy, m. ( ). the gentleman from angell street. in cannon p. h. (ed.), lovecraft remembered. sauk city, wi: arkham house, pp. – . (original work published )
grieve, j. ( ). quantitative authorship attribution: an evaluation of techniques. literary and linguistic computing, ( ): – .
holmes, d. i. ( ). a stylometric analysis of mormon scripture and related texts. journal of the royal statistical society. series a (statistics in society), ( ): – .
holmes, d. i. ( ). the evolution of stylometry in humanities scholarship. literary and linguistic computing, ( ): – .
honoré, a. ( ). some simple measures of richness of vocabulary.
association for literary and linguistic computing bulletin, : – .
hoover, d. l. ( ). delta prime? literary and linguistic computing, ( ): – .
joshi, s. t. ( ). a note on the texts [foreword]. in joshi s. t. (ed.) and h. p. lovecraft (author), the horror in the museum. new york: del rey, reprint edn., pp. xv–xviii.
joshi, s. t. ( ). i am providence: the life and times of h.p. lovecraft. new york: hippocampus.
joshi, s. t. ( ). [introduction]. in joshi s. t. (ed.) and h. p. lovecraft (author), the crawling chaos and others: the annotated revisions and collaborations of h. p. lovecraft, vol. . welches, or: arcane wisdom, pp. – .
juola, p. ( ). authorship attribution. foundations and trends in information retrieval, ( ): – .
kuiper, s. and sklar, j. ( ). principal component analysis: stock market values. in practicing statistics: guided investigations for the second course, st edn. upper saddle river, nj: pearson, pp. – .
lang, a. ( ). in defense of the literary hack. current literature, : .
look, d. m. ( ). statistics in the hyborian age: an introduction to stylometry. in prida j. (ed.), conan meets the academy: multidisciplinary essays on the enduring barbarian. jefferson, nc: mcfarland, pp. – .
lovecraft, h. p. ( ). letters to robert bloch. s. t. joshi and d. e. schultz (eds.). west warwick, ri: necronomicon press.
lovecraft, h. p. ( ). letter to lillian d. clark [ december ]. in joshi s. t. and schultz d. e. (eds), letters from new york, vol. . lovecraft letters. san francisco: night shade, p. .
morton, a. q. ( ). literary detection. new york: scribners.
mosteller, f. and wallace, d. l. ( ). inference and disputed authorship: the federalist. reading, ma: addison-wesley.
reeve, j. k. ( ). practical authorship. ridged, nj: editor company.
rybicki, j., hoover, d., and kestemont, m. ( ). collaborative authorship: conrad, ford, and rolling delta. literary and linguistic computing, : – .
smith, m. w. a. ( ).
hapax legomena in prescribed positions: an investigation of recent proposals to resolve problems of authorship. literary and linguistic computing, : – .
tweedie, f. j. and baayen, r. h. ( ). how variable may a constant be? measures of lexical richness in perspective. computers and the humanities, : – .
yule, g. u. ( ). the statistical study of literary vocabulary. cambridge: cambridge university press.

notes

this opinion seems to have evolved over two decades; writing three decades prior, joshi ( ) states, ‘the two authors probably contributed equally’ (p. xvii).

$t = v/n$, where $v$ is the number of distinct words in a text and $n$, the total number of words.

$h = v_2/v$, where $v_2$ is the number of words appearing exactly twice in a text.

$r = 100 \log(n) / (1 - v_1/v)$, where $v_1$ is the number of words appearing exactly once in a text; a value first suggested and defined in honoré ( ).

$k = 10^4 \left( \sum_i i^2 v_i - n \right) / n^2$, where $v_i$ is the number of words appearing $i$ times; this value was first suggested and defined in yule ( ).

the process for performing pca is outlined by kuiper and sklar ( ).

a corpus is defined as any collection of texts. in our case, we have three subsets: ‘the loved dead’, lovecraft texts, and eddy texts.

the mathematics behind calculating delta are outlined more completely in burrows ( ), but are outlined in short here. a main set (or, to use a familiar term, corpus) is created from all of the texts under consideration. any number of the most frequent words between and are found, and burrows separates homographs such as ‘before’, which can be either a conjunction or preposition, although the practice is not necessary. he then finds the percentage that each of these words takes up for each text of the corpus, and standardizes these percentages based on the mean and standard deviation for that word across the texts.
a delta score for a subset is the average absolute value of the difference between the key text’s z-scores and the subset’s average z-scores, for the number of words being considered, across all the texts in that subset. thus, if we have a key text $k$ and author subset $a$, the delta score for $a$ for a list of $n$ words is:

$$\Delta(a) = \frac{\sum_{i=1}^{n} \lvert z_{k,i} - z_{a,i} \rvert}{n}$$

the discarded draft of the shadow over innsmouth, a purposefully experimental piece, is not included, as it was a deliberate attempt by lovecraft to depart from his usual style (joshi, , p. ). ‘the thing in the moonlight’, which was adapted from notes in one of lovecraft’s letters, is also not included, as he did not write the published piece (joshi, , p. ).

we initially ran tests with this set and a larger set of all of lovecraft’s horror stories, which excludes his prose poems, humorous stories, collaborations, ghost-written projects, and so-called ‘dream cycle’ stories. all of the results were largely identical and did not add insight into our testing.

‘the mound’ contains , tokens and ‘the loved dead’, , .

tokenization involves three major steps: first, we remove apostrophes, semicolons, commas, and periods. one consequence of this process is that contractions can be confused with homographs, e.g. ‘can’t’ and ‘cant’; however, this does not skew our results significantly. second, we tokenize the texts with nltk’s function. third, we remove any remaining punctuation or leftover odd characters such as html tags from the tokenized list. the remaining list consists of words only, which allows us to perform the frequency tests.

we find the values used to calculate t, h, r, and k, as well as word frequencies, using methods in nltk’s freqdist class, which creates a frequency distribution of the tokens in a text.

we utilize scipy’s ‘numpy’ and ‘stats’ packages.
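as an illustration of the tokenization steps and the four richness values t, h, r, and k, the following is a minimal standard-library sketch. it is not the pipeline described in these notes: a regular expression stands in for the nltk tokenizer, `collections.Counter` stands in for nltk’s freqdist class, the function names `tokenize` and `richness` are hypothetical, and the constants in r and k (100 and 10^4) follow the usual textbook definitions of honoré’s and yule’s measures, which is an assumption about the values intended here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase, drop apostrophes, semicolons, commas and periods, keep a-z words only."""
    stripped = re.sub(r"[';,.\u2019]", "", text.lower())
    # Assumption: anything that is not a run of ASCII letters is treated as leftover debris.
    return re.findall(r"[a-z]+", stripped)


def richness(tokens):
    """Return the lexical richness values t, h, r and k for a token list."""
    counts = Counter(tokens)
    n = len(tokens)                       # total number of words
    v = len(counts)                       # number of distinct words
    spectrum = Counter(counts.values())   # spectrum[i] = number of words appearing i times
    v1, v2 = spectrum[1], spectrum[2]

    t = v / n
    h = v2 / v
    # r is undefined when every word appears exactly once; report infinity in that case.
    r = 100 * math.log(n) / (1 - v1 / v) if v1 < v else math.inf
    k = 1e4 * (sum(i * i * vi for i, vi in spectrum.items()) - n) / (n * n)
    return {"t": t, "h": h, "r": r, "k": k}
```

note that `tokenize("can't")` yields `cant`, which reproduces the contraction and homograph confusion mentioned above.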
numpy allows us to store data in a matrix format for use in pca computation, and stats allows us to standardize values as z-scores (http://www.scipy.org).

we perform pca using matplotlib’s ‘mlab’ module, which mimics matlab functions (http://www.matplotlib.org).

pca, in our case, treats texts as rows and measurements as columns, and the tool we use to run pca (matplotlib’s ‘mlab’ module) requires that there be more rows than columns.

we have chosen to perform these tests using only two authors at a time in order to most closely emulate the testing procedure we will use for ‘the loved dead’.

we use thirty as the default word count because it is, generally, sufficient. it also avoids any potential complications from including too many top words, e.g. incorporating words that are used commonly in only one set. this is the reason that, when the delta scores are close to each other, we try a range of values between and : it gives us a complete picture of the differences in common word usage between the sets, accounting for the potential errors associated with the extremes of our word count range.

we run this test at a culling rate of . unless otherwise stated, the window size is , and the step size is , .

two de camp samples are not visible in the selected window. they carried such highly positive values for the first principal component that their inclusion made the samples on the graph more difficult to discern.

we use a window size of , words and a step size of words for tests with ‘the loved dead’, as that is the low end of the range stated by burrows for delta.

from provider to partner: how digital humanities sparked a change in gale’s relationship with universities

the past decade has seen huge growth in the teaching and research of what is broadly called digital humanities (dh).
increases in computing power and data availability have seen a rise in individual researchers and research groups working on digital scholarship projects in the humanities, arts and social sciences. this article shows how publishers of traditional digital archives have adapted to the increasing prevalence of dh amongst their traditional customers. the success of this adaptation depends entirely on the relationship with the academic community, and gale has seen a shift from being a provider of products to a partner, trusted to help libraries, scholars and institutions achieve their objectives. as a leading global provider of digital archives, gale is well placed to review the current state of dh research and teaching, and this article will discuss significant academic events that have brought scholars, librarians and students together, and the lessons learned for institutions around the world looking to expand into dh. finally, the article looks at how working to understand the common challenges and barriers to dh research and teaching has pushed many archive publishers to re-evaluate traditional archive publishing and enable new and innovative ways to explore the past.

keywords: digital humanities; gale; digital archives; primary sources; libraries; academic publishing

an introduction to digital archives and gale

over the past years digital archives have become an essential resource in university libraries. a natural evolution of microfilm and cd-rom archive access, a digital archive usually provides cloud-hosted access to vast amounts of primary source material accessible through a web interface.
for researchers around the world, a digital archive democratizes access to many of the world’s leading research, national, private and public libraries, making access to material available on their desktops that would previously have necessitated a visit to the library in question. there are several private companies currently digitizing large archive collections, including gale, proquest, adam matthew, wiley and ebsco, and archives will generally be available for institutions to subscribe to or purchase. with recent increases in technological capabilities, there are numerous large regional/national open digitization projects, including europeana, the digital public library of america (dpla) and the hathi trust, that provide access to large digital archive collections, as well as smaller collections being digitized at the institutional level.

gale’s digital archive programme began in with eighteenth century collections online (ecco), one of the most ambitious digitization projects of the time. ecco provides digital versions of the th-century texts catalogued in the english short title catalogue, and gives researchers desktop access to ‘every significant english-language and foreign-language title printed in the uk between the years and ’.

insights – ,
from provider to partner: dh and gale’s relationship with universities | chris houghton and sarah ketchley
chris houghton, head of digital scholarship international gale primary sources
sarah ketchley, faculty affiliate instructor university of washington

the following year saw gale publish the times digital archive, and in the years since, we have published hundreds of digital archives containing over million pages of often unique, difficult-to-access documents from six continents, covering nine centuries.
digital archive publishing is becoming more prevalent as the technology for digitizing historical artefacts gets cheaper and more ubiquitous, and many universities, museums, libraries and other research institutions have digitization projects of their own. at the other end of the digitization spectrum, there are a number of commercial publishers working at a global scale, digitizing significant publications or national library collections; gale, proquest, adam matthew, alexander street press, ebsco, wiley and brill among them.

most commercially available digital archives are full-text searchable, meaning that any researcher can search for a word or phrase and theoretically find it anywhere in a document. this functionality requires a process of optical character recognition (ocr), which is used to transcribe large corpora of documents. for publishers of historical archives, this requires running ocr software over scanned images, converting the letters on the page into machine-readable text, and capturing their position on the page. as a result, users can search for words and the underlying ocr text will identify the page, and location on the page, of each matching term in the archive.

a changing relationship: archives as infrastructure

gale was founded in as a publisher of directories and reference titles and is now part of cengage learning, one of the world’s largest educational publishers. within cengage, gale used to be known as ‘library reference’, a name which accurately reflected the traditional business model for a publisher of primary source archives working with libraries to provide reference material. in the early s, around the world, large national consortia such as couperin and dfg operated as de facto buying groups, negotiating discounted access to products for their members. in the uk, jisc purchased the archive with funding it received to make the archives freely available to all uk higher and further education institutions.
after the financial crisis and subsequent funding cuts to higher education, the ability of many consortia to offer this kind of large investment in archives drastically reduced, and publishers found themselves having to change business models and operations to deal directly with university libraries on a much more regular basis. these fundamental changes to the university sector in many major markets prompted significant adaptation in operations. now, rather than dealing with a centralized funding body, commercial publishers had to have relationships with individual institutions. for gale, this meant a significant investment in customer-facing operations and a shift in focus to not just working with libraries but understanding what they needed to be successful. digital archives are a significant investment for any university, and the majority of libraries communicated the need to provide evidence of academic support for an archive within their institution before countenancing a purchase. this need for academic support meant that, with the agreement of the library, publishers were now doing more than ever to speak with, and understand the needs of, the primary users of archives – the teachers, scholars and academics who could use archives like ecco or the times digital archive in research or in the classroom. these relationships with academics would naturally become significant. by understanding the research topics, needs and objectives of scholars, archive publishers were able to ensure two things: that they digitized archives and created new products where there was a desire for the material and that digital archives were presented in ways that would best support research and teaching needs. 
as the s progressed, gale started to see the beginnings of an evolution in this relationship, especially in universities that had not traditionally purchased archives. the feedback we were getting was that many archives were now seen as crucial to the work of a humanities department; we would receive orders for digital archives because an academic was moving to a new institution and purchasing the archive was a condition of their move. this evolution played out in the way that libraries purchased. in many markets around the world, evidence started to build of a move away from traditional end-of-year purchasing, as libraries found new ways of funding these purchases. gale’s relationship with libraries was changing again, and we would find ourselves supporting significant capital bids for investment in digital archives. allied to this was significant growth in new markets such as china, where in we developed the gale scholar programme to help ambitious chinese institutions quickly develop digital libraries on a par with the top universities around the world, a programme that is now being exported globally due to its popularity.

first contact with digital humanities

supporting ‘gather data and analyse’

more and more, institutions were expressing a need to not just search and retrieve documents as had been commonplace, but also to gather data at scale and analyse it. digital archives started to respond by creating cross-searches of multiple archives, by incorporating analytical tools into archives and by facilitating access to ocr.
for gale, these developments took hold in and with artemis primary sources, one of the first major archive cross-searches (later rebranded as gale primary sources), which not only allowed the potential search of hundreds of millions of pages of primary source content, but included analytical tools to allow users to look at them through a different lens. significantly, users now had the ability to download the ocr for individual documents, which they could then combine with ocr for other documents into a corpus ready for analysis.

examples of digital scholarship

the other change that was happening during the early s was that we were starting to see requests to access the underlying data of archives, both metadata and ocr, for the purposes of text mining. from the start, we were keen to agree to these requests, but it began as an ad hoc process, often taking many months. for example, in gale was contacted by dr michaela mahlberg, phd, supervisor for kat gupta, who was studying at the university of nottingham. gupta was interested in getting access to the ocr for the times digital archive for the years – while researching their monograph, representation of the british suffrage movement. focusing on the times, gupta’s monograph ‘uses corpus linguistics to examine how suffrage campaigners’ different ideologies were conflated in the newspaper over a crucial time period’. gupta was able to extract certain sections of the newspaper and mine them for references to ‘suffrage’, ‘suffragism’, ‘suffragette’, etc. in order to identify the prevailing attitudes to the movement, amongst other conclusions. being exposed to scholarship really helped gale to understand how academics were using archive data, and whether they were using digital archives as we had envisioned or were making new applications. we would connect with researchers who were using metadata (such as word counts) that we had never considered making widely available as the basis of their research.
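
as a rough illustration of the kind of term search described above, the following python sketch counts whole-word occurrences of a few search terms across a small corpus of ocr text. it is a minimal sketch under stated assumptions, not a reconstruction of gupta's method: the function name `count_terms`, the term list and the two sample sentences are hypothetical, and a real corpus would be read from downloaded ocr files rather than typed inline.

```python
import re
from collections import Counter

# Hypothetical search terms, taken from the examples named in the text.
TERMS = ("suffrage", "suffragism", "suffragette")


def count_terms(documents, terms=TERMS):
    """Count whole-word, case-insensitive occurrences of each term across documents."""
    # \b anchors keep "suffrage" from matching inside "suffragette".
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    counts = Counter({t: 0 for t in terms})
    for text in documents:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


# Invented sample sentences standing in for OCR text (assumption, not archive data).
sample_corpus = [
    "The suffragette meeting was reported at length.",
    "Suffrage for women was debated; a suffragette spoke.",
]

result = count_terms(sample_corpus)
print(dict(result))
```

raw counts like these are only a starting point; ocr errors in the underlying text will cause some occurrences to be missed, which is one reason ocr quality matters for this kind of analysis.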
making data available

collaborating with researchers to provide data for these projects and many others like them really helped to give gale an understanding of dh and the dh community. in gale made the decision to move the data provision from an ad hoc process to something more structured, and gale became the first humanities publisher to make underlying ocr and metadata available to customers through text and data mining (tdm) drives. subsequently, most commercial publishers of digital archives make their ocr text and, in some cases, metadata available to researchers. this development helped to crystallize gale’s relationship with the dh community. for the first time, we knew exactly who was using our data, and had the option to remain in contact to understand how they were using it.

discovering common barriers

demographics around dh soon became apparent; there was a relatively small set of core practitioners: researchers who were creating projects, writing code and were comfortable with managing large data sets. however, most scholars and institutions around the world that were interested in working in dh would often find the path to successful projects barred by a few common barriers:

1) access to relevant data in an optimized format

bringing together a significant corpus of data for analysis often involved insurmountable challenges: finding the data in the first place; combining data from disparate sources; cleaning the data to prepare it for analysis. the time and technical skills needed to undertake these processes were proving to be an obstacle for many. figure shows a typical research process, based on academic insight. many researchers were telling us several of the research steps could each take up to % of the allotted project time. cleaning data and creating exploratory tools were proving to be extremely time-consuming activities.
2) hosting data

this challenge occurred frequently when universities purchased the tdm drives. for many gale digital archives, the ocr and metadata equate to several terabytes of data, which makes it sometimes problematic or expensive to host locally. anecdotal evidence suggested that some university bureaucracies made it hard for researchers to get access to the data that they had purchased, if the university was able to find the server space to mount the drives at all.

3) tools to analyse data can be challenging

the analysis of large corpora of ocr or metadata text typically requires a degree of coding proficiency. experienced dh practitioners can be coders, while we would often see data analysis from academics who might consider themselves traditional humanists but had taught themselves some basic coding. the message that came across strongly was that this need for coding often acted as a barrier to teaching dh in the undergraduate classroom, and to wider dh take up, as it required a significant time commitment.

figure . a typical digital humanities research process based on comhis collective, university of helsinki, text and data mining eighteenth century based on estc & ecco, bsecs conference , oxford. [slide ]

developing a solution

in consideration of these barriers, it soon became apparent that there was an opportunity for us to develop a solution to support the existing digital humanities community and help to spread its skills and insights beyond the core practitioners. gale started building a cloud-hosted text and data mining platform in , and in released gale digital scholar lab, the first (and currently only) product to combine the broad range of archives available from gale with powerful text mining and natural language processing (nlp) tools. developing the lab proved to be a significant process, featuring several redesigns.
at every stage, we made sure to solicit input from scholars to ensure that the product would deliver for established dh practitioners and those looking to break into dh. designed to overcome the three common barriers, the lab saw strong take-up immediately, with initial customers in china, singapore, australia and uae, spreading to europe and the united states. libraries identified the lab as a tool to help them support dh in a relatively low impact way, with vast archives of data optimized for use, simplified cleaning, and tool customization that did not require any existing coding knowledge.

challenges and implications

with a goal of continually developing the lab to support the needs of the dh community and the wider academic community, a series of technical challenges and content implications arise.

development challenges

for gale, the lab represents a new model for development. unlike a digital archive, which is essentially a static product, the gale digital scholar lab iterates and develops in line with user feedback and market need. this development work is expensive and time-consuming and, as in any large corporation with a varied product portfolio, this means competing to ensure that investment into the lab is consistent. the requirement to understand the market is now stronger than ever, as gale works to identify research trends, development needs and common issues in order to try and provide solutions where appropriate. as a publisher with a global profile, one extremely important facet is to make sure that there is input from academics around the world, especially non-english native speakers, a demographic currently under-represented in dh.

product challenges

developing a software solution often leads to challenges as myriad development paths become apparent, and prioritization is needed. some of the immediate challenges include:

• increasing and improving tools: the lab utilizes mostly open-source tools, which will be developed and expanded.
• increased outputs: by introducing tools to develop the kinds of outputs students are tasked to create in dh courses (interactive timelines, enhanced maps, scholarly editions, etc.) we can increase classroom efficiency.
• moving beyond tdm: providing the ability to analyse the many non-text components of gale’s archives, including pictures, adverts and photographs.
• supporting classroom use: to fulfil one of the most common requests, gale will partner with leading scholars to create material to contextualise the processes in the lab and teach with it.
• non-english content: by introducing new non-english language archives and the training tools to mine them, gale can further enable dh in non-english native countries.

content implications

giving users the ability to interact with digital archives in new ways by making ocr accessible, and through initiatives like gale digital scholar lab, has raised questions about archive publishers’ existing data, with potential implications for future archive digitization. the most obvious consideration involves the quality and accuracy of ocr. ocr has always been an imperfect process, relying as it does on software to interpret often unclear historical documents. accuracy of ocr can depend on when the archive was processed (since earlier versions of ocr software are less accurate) and on the age and clarity of the original document. now that ocr is more visible than ever before, gale is working with several academic groups to determine what choices there are to improve the quality of the underlying ocr in the digital archives. rescanning or repeating the ocr process comes at a prohibitive cost, but we want to determine whether there are systematic or crowd-sourced solutions to such a significant issue across all of dh. similar questions exist around metadata.
giving scholars the ability to enhance metadata would greatly increase the number of research questions answerable through digital archives. probably the most common customer request involving the lab is to make content hosted by the institution available for analysis through the lab. this is a huge and complicated project, simply because of the vast range of types of content hosted by universities and libraries around the world, not to mention inconsistencies in ocr standards, metadata and document format. however, given that this is an obvious area of need for institutions, gale is working on options to make it available in the gale digital scholar lab.

supporting and working with dh

the increase in collaboration with the dh community is set to continue. throughout the world, gale is working to contribute to the community, increasing visibility for academics and software developers, with a goal of acting as a partner, not solely a software provider.

bringing academics in house

since gale has employed academics in the us as dh specialists with a brief to advise on development, support new customers to the lab and help contextualize dh processes for research and teaching around the world. alongside their work for gale, dr sarah ketchley (university of washington) and dr wendy perla kurtz (ucla) teach dh courses in their institutions. ketchley first offered introductory dh classes for undergraduates and graduates in , integrating the gale digital scholar lab into her syllabus in late . as a cloud-based platform, with no local software installation requirement, the lab is an ideal platform for classroom use. the class featured undergraduate students from departments across campus, the majority with no prior experience in dh, working in teams to create and curate content sets, clean ocr text, and then analyse their collected research material using the digital tools incorporated in the lab.
this structured workflow presents opportunities to teach digital project management, data curation, the process of creating ocr text and the challenges of working with it. the course was again being offered in the summer quarter in an entirely online format, for which the lab is well suited. its suite of digital tools generates in-depth discussions about the nature of text mining, qualitative vs. quantitative analysis and the types of research questions that can be asked and answered by topic modelling or named entity recognition, for example. in lieu of a final research paper or exam, students exported the results of their research and analysis in the lab, including primary source document images, ocr text and visualizations, to build digital exhibits in omeka and interactive narratives in storymapjs, both third-party applications to publish and visualize digital projects.

sensitivity to dh community ethics

as the relationship between commercial vendors and academia becomes more involved, we are acutely aware of the ethics of the dh community, namely the aspirations for data and software to be open and research to be freely available. given the irreconcilable fact that our digital archives exist for universities behind a paywall, we are always working to make sure that the data is as available as possible, and several recent research projects have relied on gale providing specific aspects of archives not commonly available. the lab is a good example of gale’s desire to be as open and supportive as possible within the contractual boundaries of our agreements with source libraries, giving users the ability to export ocr, statistical analyses and visualizations at all steps of the workflow.

development partnerships

one positive outcome of gale’s increasing visibility in the dh community is the increase in opportunities to actively work together with academics and research groups.
in the pipeline are numerous joint partnerships exploring ocr correction, tool creation, development of pedagogy and many others. the ability to support and amplify innovative work is paramount for us, and the opportunity to (for example) develop cutting-edge tools to analyse gale archives is too good to miss.

evolving academic events

publishers, vendors and other content providers traditionally sponsor and exhibit at academic conferences and other events. the focus on collaboration and openness driven by dh has prompted an added emphasis in events for gale and, in the past year, we have started to organize our own events to bring together academics, developers and librarians from around the world. in november gale invited library directors from the leading chinese universities to an ‘advanced workshop of digitization, libraries and digital humanities’ in dali, china. then, in december , gale japan welcomed scholars to ‘an invitation to digital humanities’ at the tokyo international forum. in may , gale brought approximately european academics, librarians and students to the british library to hear talks from a panel of distinguished international speakers for the inaugural gale digital humanities day. the day was designed to provide insights into all aspects of dh, incorporating academic research sessions (literature and distant reading and computers reading the news), teaching sessions (digital humanities in the classroom), and sessions discussing institutional considerations (institutional support and infrastructure for digital humanities). the international panel featured speakers from the us, uk, netherlands, japan and australia, and one of the most striking points was how much similarity there was in approaches, methods and challenges. these events all featured academic speakers to provide a forum for knowledge exchange.
for gale, these events launch partnerships and are as useful for our education as that of the academic community, with events often including associated focus groups.

conclusion and future plans

numerous projections of future job trends, including this from the world economic forum (see figure ), indicate that analytical skills and the ability to work with large data sets will continue to grow in importance for jobs in the future. increasing numbers of universities are turning to dh as a method of providing humanities and social science graduates with these desirable analytical skills, providing experience to help them thrive in a rapidly changing job market. these changes in universities challenge publishers to evaluate their archives and the various ways to explore and interact with them. changing the fundamental ways of using content requires real engagement with the academic and library communities on numerous levels in order to deliver solutions that address real-world problems. dh is a fascinating, complex and exceptionally diverse field that is simultaneously challenging and enthusing gale. it can feel like a bold move to expose metadata and ocr through platforms like gale digital scholar lab because it exposes us to questions about their nature, format and quality. however, by taking this step, gale has begun numerous conversations with academics and institutions around the world about potential solutions for the long-standing problems of digitizing historical documents. the possibilities, even in the relatively small area of ocr correction and remediation, are incredibly exciting and we anticipate seeing substantive improvements in this area as we begin to partner with academics around the world on ocr projects.
in terms of software, the possibilities for developing software to support dh research and teaching are no less extensive and exciting. every interaction with an institution confirms that there is a huge appetite to research and teach dh, and for gale’s solutions to evolve to meet the challenges faced. future developments will see gale digital scholar lab grow to support teaching through pedagogical support; include local content upload to allow researchers to ingest their own content to use in the lab; and increase the range of tools to support as wide a range of analyses as possible. for gale, there is real opportunity in developing closer relationships with academia to fulfil our primary aim of advancing knowledge through a detailed exploration of the past and promoting opportunities for this kind of research as widely as possible. in the future, we will continue to work as closely as possible with academics while being respectful to the ethics of the dh community. by collaborating on building new platforms and pathways, we remain committed to advancing humanities scholarship and to amplifying it by supporting the global academic community.

abbreviations and acronyms

a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘full list of industry a&as’ link: http://www.uksg.org/publications#aa

competing interests

ch declares that he is employed by gale. sk has no competing interests.

figure . comparing skills demand: top ten in vs.
today: analytical thinking and innovation; complex problem-solving; critical thinking and analysis; active learning and learning strategies; creativity, originality and initiative; attention to detail, trustworthiness; emotional intelligence; reasoning, problem-solving and ideation; leadership and social influence; coordination and time management.
trending: analytical thinking and innovation; active learning and learning strategies; creativity, originality and initiative; technology design and programming; critical thinking and analysis; complex problem-solving; leadership and social influence; emotional intelligence; reasoning, problem-solving and ideation; systems analysis and evaluation.
declining: manual dexterity, endurance and precision; memory, verbal, auditory and spatial abilities; management of financial, material resources; technology installation and maintenance; reading, writing, math and active listening; management of personnel; quality control and safety awareness; coordination and time management; visual, auditory and speech abilities; technology use, monitoring and control.

references

“help for researchers,” british library, http://vll-minos.bl.uk/reshelp/findhelprestype/catblhold/estcintro/estcintro.html (accessed september ).
eighteenth century collections online, gale, https://www.gale.com/intl/primary-sources/eighteenth-century-collections-online (accessed september ).
gale primary sources, gale, https://www.gale.com/intl/primary-sources (accessed september ).
kat gupta, representation of the british suffrage movement (bloomsbury academic, ).
kat gupta, mixosaurus, http://mixosaurus.co.uk/publications/ (accessed september ).
dallas liddle, “reflections on , victorian newspapers: ‘distant reading’ the times using the times digital archive,” journal of victorian culture, , issue , ( june ): – , doi: https://doi.org/ . / . . (accessed september ).
exploring ecco: key moments in th-century philosophical literature. eetu mäkelä, vili lähteenmäki, antti kanner, ville vaara. never mine the mind? symposium on computational approaches to intellectual history and the history of philosophy. helsinki, may (slide ), https://comhis.github.io/assets/files/never_mine_the_mind__comhis_collective.pdf (accessed september ).
“gale digital scholar lab,” gale, https://www.gale.com/intl/primary-sources/digital-scholar-lab (accessed september ).
“the gale review,” gale, https://www.gale.com/intl/blog/ / / /gale-digital-humanities-day-at-the-british-library/ (accessed september ).
forecast of jobs report , world economic forum, http://www .weforum.org/docs/wef_future_of_jobs_ .pdf (accessed september ).

article copyright: © chris houghton and sarah ketchley. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.

corresponding author:
chris houghton
head of digital scholarship, international
gale primary sources
gale, a cengage company, uk
e-mail: chris.houghton@cengage.com
orcid id: https://orcid.org/ - - -

co-author: sarah ketchley
orcid id: https://orcid.org/ - - -

to cite this article: houghton c and ketchley s, “from provider to partner: how digital humanities sparked a change in gale’s relationship with universities”, insights, , : , – ; doi: https://doi.org/ . /uksg.

submitted on august
accepted on september
published on october

published by uksg in association with ubiquity press.

william g. thomas iii and edward l. ayers | an overview: the differences ...unities | the american historical review, . | the history cooperative

this project was funded in part by the national endowment for the humanities

an overview: the differences slavery made: a close analysis of two american communities

william g. thomas iii and edward l. ayers

the original idea for an electronic article seemed simple enough. using digital media, we wanted to give readers full access to a scholarly argument, the historiography about it, and the evidence for it. our early models of the article contained neat squares and lines and carefully arranged explanations of the links from one part to another.
we admired the recently published new york review of books article by robert darnton on the possibilities of digital scholarship, and after years of building the valley of the shadow project digital archive, we welcomed the opportunity to offer an interpretive analysis based on its sources. through two sets of readings by peer reviewers and presentations to a range of audiences, we have revised our presentation and our argument while maintaining the original purpose of the article. this essay introduces the electronic article and explains its development, as well as our intentions for it. the full electronic version of this article can be found at www.historycooperative.org/ahr/.

frontispiece: two hundred miles and several borders separated franklin and augusta, but both were part of the same topographic region, the great valley in the border region of the eastern united states. the two counties shared similar soils, climates, ethnic and religious background of their white residents, and access to national and regional markets. despite such similarities, however, slavery caused these communities to differ in a wide range of ways.

our principal goal was to fuse the electronic article's form with its argument, to use the medium as effectively as possible to make the presentation of our work and its navigation express and fulfill our argument. as a result, this piece of electronic scholarship operates on several levels to connect form and analysis.
first, it allows one to reconstruct the process by which our argument was developed, to follow the logic of our thinking, in effect to reconstruct the kind of "trails" that vannevar bush expected the technology to allow historians when he envisioned the future of computing in his seminal essay "as we may think." second, this electronic scholarship uses spatial analysis and spatial presentation to locate its subjects and its readers within the context of the historical evidence and interpretation. the methodology of the emerging field of historical gis (geographic information systems) informed our analysis and led to the creation of a comparative spatial database of our communities. in addition, we sought to express the spatial analysis of the argument in the article's structure. third, the electronic article presents itself in a form to allow for unforeseen connections with future scholarship. we consider this last goal critically important for scholars working in the digital medium because the rate of technological change will certainly offer new opportunities even as it displaces current practices. publishers, scholars, technologists, and librarians have hammered out international standards that govern the basic structures and forms of digital work to take advantage of technological development. our article seeks to work within these standards in expectation of change.

our analysis focuses on slavery and its relationship to modernity. historians have long studied, and argued over, slavery's association with new world capitalism. studies of the transatlantic slave trade, of the relationship between the modern world and slavery, and of the connections between spatial and temporal portrayals of slavery suggest that it might be time to reexamine the connections between slavery and modernity in the united states.
in the case of the united states, the political, economic, and ideological issues in the crisis of – turned around issues related to emergent forms of modernity: the integrity of the nation-state, the course of economic development, the meaning of participatory democracy, and the nature of individual autonomy. this conflict over modernity has long received attention from historians. many influential studies have relied on a vision of divergent societies, a modernizing north and a south resisting modernity. recent scholarship, however, has indicated that we might need to revisit that equation. the institution of slavery may have struck its own bargain with modernity in the nineteenth-century american south.

the test-bed for our article comes from the valley of the shadow project. that digital archive allows readers to examine two communities, one in the north (franklin county, pennsylvania) and one in the south (augusta county, virginia), during the coming, fighting, and aftermath of the american civil war. the two communities were chosen to provide something like a controlled experiment. they shared similar geographic locations, soil, climate, crops, white ethnicity, and religious denominations. only one major difference separated them: the virginia community was built around slavery and the pennsylvania community was not. yet that one difference extended its defining influence into all the social arrangements of the southern community, in ways obvious and otherwise, and pulled its white people into the confederacy and a war that destroyed what they fought to protect.

by choosing communities close to the mason-dixon line, we offer a rigorous test for slavery's influence: if slavery pulled everything into its orbit only two hundred miles from freedom, we could assume that its force was even greater farther south. if the community without slavery found its institutions and social life undiluted despite living on the border with bondage, we could assume that the patterns would be even stronger farther north.

we investigate the problem of modernity and its relationship to slavery in these communities by joining the tools of geography and cartography to those of social-science history. we created a detailed gis to compare these places and their social, economic, and political structures. our goal is to reconstruct the social, economic, and political geography of slave and free communities, to compare them, and to analyze the spatial relationships embedded within them.

our argument is that slavery was more central to the civil war than we have thought because it exerted a determining influence even where slavery did not take the form of cotton plantations and african-american majorities. slavery adopted the forms of modern life available in the mid-nineteenth-century united states: capitalist forms of investment and economic motivation, advanced transportation and communication, politics of broad participation by white men, and general white prosperity. the differences slavery made for white people were pervasive and structural but not intrinsically opposed to modernity. we do not find a different white culture in the south than in the north, a culture built around resistance to or even skepticism of modern life. instead, we find a politics built around the protection of slavery through whatever means necessary. many white men preferred to protect slavery with unionism, recognizing that the united states provided a safe haven for the institution where it existed, including virginia.
but, as the events of and unfolded, those white men came to believe that only secession could protect slavery. these men did not act as they did because they fostered a different political culture from the north, notions of race profoundly unlike those of most whites elsewhere, divergent forms of christianity, or any other characteristic that set them against modernity. instead, they sought to control the future of the united states by preserving a place for slavery as americans spread their dominion. white southerners projected the spread of modern slavery into part of an american empire. when that failed, they sought to create an empire of their own. the digital article makes this argument in a way that takes advantage of the medium's possibilities for precision and interrelation. in the digital article, we have sought to separate various strands of historical argument and evidence so that we can better understand their relationship to one another. thus we examine agriculture, demography, transportation, class relations, churches, and so on in individual nodes of analysis, comparing the northern and southern counties and placing them in regional perspective. we do the same for political affiliation and behavior, which we then relate to their material bases. by exploring these facets of social life as rigorously as possible, we hope to refine our questions and thus our answers. we recognize, of course, that two counties cannot stand in for the entire united states, much less as proxies for the problem of modernity and slavery. we also recognize that no two counties are typical of places as vast as the american north and south. but we have chosen these counties to offer a rigorous test for our argument, and it seems to us that they serve that purpose well. 
the experiences of our two counties show that slavery drove all the conflict that brought on the civil war but not in a simple way based on modernity, not in the way many imply when they speak of "economics" causing the war or of the "industrial" north against the "agricultural" south.

it is not yet clear how digital technology will affect the practice of history or whether historians will heed, for example, robert darnton's call to consider the advantages of electronic publication. in a show of leadership, darnton offered historians an example of what new electronic scholarship might look like, publishing an electronic essay in the american historical review. in recent years, several historians have been working toward common ends, experimenting with publication in the digital medium, seeking to join their analysis with the form of its presentation in innovative ways. philip j. ethington has written an impressive electronic essay and presentation for the world wide web on the urban history of los angeles. ethington "explores the hypothesis that the key concept in the search for historical certainty should be 'mapping' in a literal, not a metaphoric, sense." his work includes a wide range of media and sources to create, or rather recreate, the "panorama" of the city. ethington suggests that the web site can be read like a newspaper, inviting readers to wander through it, skipping from section to section, and focusing on what strikes their interest. motivated "simultaneously by two ongoing debates: one among historians about 'objective knowledge,' and another among urbanists about the depthless postmodern condition," ethington's electronic scholarship grasps the "archetype of 'hyperspace'" to address these concerns.
the goal of our article is to open the process of scholarly inquiry, to allow readers not only to confront our argument but also to work with its evidence and its constituent parts. the key to the article—indeed, to our decision to embark on a digital article in the first place—is the recent emergence of extensible markup language (xml). xml separates the structure of a text from its presentation, allowing authors to use the structural definitions of the text for searching, linking, and identifying discrete elements, all the while keeping the style and layout of the text's presentation in a separate set of controls (a style sheet called xsl). xml holds out possibilities for scholarship that take us far beyond html, the first language of the web, allowing dynamic and multiple linking among diverse sources. the sources we have included in evidence and historiography, for example, operate in a modular fashion; each node of the article's sections contains source information, citation information, linkages, and analysis of its relationship to the whole. the modular structure and the xml behind it make the article flexible yet rigorous, open to alternative presentations yet fixed within an international standard. the first peer reviewers of the piece questioned whether what we were producing could be called an "article" at all. some argued that in the article form there is an implied contract with the reader. the reader expects to allocate a set amount of time to read it, to find the argument laid out in a relatively familiar fashion, and to recognize the visual cues of footnotes, headers, captions, and other means of corresponding with the reader. early drafts of our article did not meet this contract but instead asked the reader to participate more in the process of investigation. 
the boundaries between authors and readers in hypertext have been a subject of sustained discourse among literary critics, and a few of these studies have moved well beyond the postmodern approaches of the initial wave of hypertext literature. this recent literature emphasizes the complex process of negotiation as readers and authors continually encounter familiar subjects in unfamiliar forms. the openness that the technology affords and the alternative readings possible within this article raise questions about the role of narrative in electronic scholarship, questions never far from our consideration. literary critics, such as espen aarseth, janet murray, jerome mcgann, and george landow, and historians, such as darnton, ethington, and roy rosenzweig, have speculated on the future of narrative in cyberspace. yet examples of non-linear narrative or hypertext remain few and far between. despite all of the new technologies and the excitement of the medium, the web, it turns out, is full of traditional linear narrative in large measure because so much of the material on the web has been migrated from print.

as this article evolved, two tensions came to the fore, an outgrowth of the possibilities that both the digital medium and the reader would allow. the first was that software and hardware configurations vary widely, so much so that decisions about technologies dramatically restrict audience and performance. some of the most powerful hypertextual technologies, for example, remain proprietary and, therefore, inaccessible to some web users. the second tension developed around the article's narrative structure and the potential non-linearity of our argument.
readers on the web have grown accustomed to conventions of text placement, symbols for various links, and navigational structures. in the process of the article's development, we presented our work to numerous audiences and saw a spectrum of readers. readers split over the purpose and character of narrative structure in the digital medium. some embraced non-linearity as the natural and most effective means of presenting digital scholarship. others considered an ordered, linear argument essential to historical scholarship of any form. after several drafts, which favored first one then the other approach, we concluded that a balance must be struck. the digital medium offers—in some respects, demands—a form of hyperlinking. it excels in the presentation of linked information and modules of analysis and explanation. the non-linearity that is necessarily a part of digital scholarship cannot serve to obscure its argument, yet the argument in digital scholarship cannot ignore the non-linearity of the medium. sustained argument in the digital medium must extend across and among interrelated parts, each piece of which must be understood as possibly the opening page for any given reader. the greatest challenge for the author of digital scholarship is that every page needs to have the codes, symbols, links, and information to allow readers to access the whole argument. without the conventions already inherent in the book or article form, such as page numbering, chapter organization, and indexing, the digital scholar must develop the argument with the appreciation that the reader might encounter it at several points along its explication. several problems confront digital scholarship at this stage. first, digital scholarship must consider how to cite "born digital" information, such as an evidence module or a page from an electronic article. 
second, it needs to examine how to give readers a sense of the scale of the article and a means by which to track their reading, and presumably its relationship to the overall work. the first problem stems in part from the nature of xml and dynamic database systems where there is no fixed or "hard" url or web page address for a particular page because each page is dynamically generated. the second is endemic to the medium; it is difficult to tell how "long" or "big" a web site is from the first page or from any page within it. only exploration and investigation will reveal the scale and scope of a digital work. our "reading record" and "citation lookup" tools were designed to address these fundamental issues for digital publications, and they were the result of much experiment and testing. while we consider them useful innovations, we expect that this area of digital work will develop new visualization technologies to make apparent to readers the reach and placement of digital analysis and argument.

the electronic article's maps and statistical tables derive from a geographic information system database in which thousands of households have been identified and linked to agricultural, slaveholding, and population census data. this map, showing the spatial distribution of slaveholders and non-slaveholders in augusta county, indicates the spread of slavery throughout the county.

as technology matures and readers become further acquainted with digital scholarship, the premium for authors in this medium will be on transparency. technology, no matter how interesting or innovative, should facilitate an argument and in doing so remain transparent to the reader.
the article as it finally appears in electronic form for the american historical review has, in some respects, been tamed in the peer review process. it follows a more traditional structure than our earlier drafts. it uses commonplace names for its parts. it does not include fancy diagrams for navigational schemes. instead, the article places the argument in front of the reader immediately and gives the reader a series of choices of ways to test, elaborate, or challenge that argument. as such, it is an extension, an enhancement, of normal scholarly practice in our discipline. the process of peer review and revision for this article was unusual because both the argument and the form were under review. as historians design and write pieces in digital format and as their peers consider how to review this electronic scholarship, they will come face-to-face with numerous fundamental questions. first, we need to recognize that these publications are highly collaborative and involve the creative and technical work of other professionals. the scholarship produced in the digital medium will continue to be characterized by intermediation, negotiation, and manipulation by a range of scholars and professionals. second, digital publications require a host of technical decisions on the part of authors and publishers: software platforms, server requirements, proprietary plug-ins, and, for example, browser specifications. an entire field of computer science, human-computer interaction, has developed in the last twenty years, testament to the importance of the selections we make in this new medium.

it seems to us that the range of digital scholarship might be quite broad and that the advantage of the digital medium for publication is the openness it provides to scholarly authors to design and create a presentation uniquely suited to their topic, field, period, or problem. in our case, we hoped to create a flexible framework for digital scholarship for historians. the idea behind our approach was to develop a set of common categories and provide some definition to their relationship. we tried to avoid creating idiosyncratic elements peculiar to our work, and we strove to produce a template other scholars might use for a piece of digital scholarship. the article's form, with its modules of refracted analysis, evidence, and historiography, is meant to instruct and carry forward the argument. we propose what we have called a "prismatic" model as an alternative to robert darnton's pyramid structure, one that allows readers to explore angles of interpretation on the same evidentiary and historiographical background. the prismatic functionality of the article offers to open the process of historical interpretation to the reader, providing sequential and interrelated nodes of analysis, evidence, and their relationship to previous scholarship. we plan to make something like this article's technologies broadly available, using its structures for different objectives, historical questions, periods, and concerns. as an extension of our work on this article, we have begun to create an application called chart, for comprehensive history analysis and research tool, which could work in a college classroom as well as in a professional journal. chart permits the use of xml and its advantages without requiring the large scale, daunting complexity, and considerable cost our prototype article has demanded.
it seems possible that digital scholarship is particularly well suited for some forms of historical analysis, and we put our attempt forward as an early experiment to see if that might be so. our close analysis of two american communities explores the relationship between modernity and slavery. the argument we offer seeks to overturn a longstanding argument about the coming of the american civil war, one that has taken form in the traditional medium of book and film and dominated our understanding of the character of slavery and freedom in the modern world. slavery, in our view, must be understood as having no single determinative value, no one experience or effect that can be either pointed to or dismissed; instead, its refractive powers touched every aspect of society. slavery and freedom each developed a spatial character in addition to, indeed in relationship to, a political, social, and economic structure. both societies had established a particular footprint in the landscape. our article seeks in its form to capture that spatiality and to represent its complexity and interrelatedness. if we are to show slavery's relationship to modernity and argue that it was pervasive, systemic, and spatially arranged, then the digital medium provides an essential means to make the argument. digital publication, in our argument, is not merely convenient or innovative but intrinsic. the electronic article has been intensely collaborative from the outset, both between the authors and among professional staff and research assistants at the university of virginia and the virginia center for digital history. kimberly a. tryka, associate director of the center, applied her valuable expertise in xsl style sheets and transformations, creating the innovative reading record tool as well as helping develop the fundamental structure of the article. her work on this article has been instrumental and critical to our effort. 
benjamin knowles of octagon multimedia productions gladly gave us his time and graphic design talent and web expertise, working with us to design the interface for this work. aaron sheehan-dean, now at the university of north florida, worked on the gis and spss data and offered his considerable expertise in civil war history. watson jennison, now at the university of north carolina, greensboro, energetically investigated the newspapers and compiled content analysis of them. steve thompson, now at the university of texas at austin, helped develop the original gis for augusta county. we also especially thank our colleagues at the corcoran department of history at the university of virginia for their helpful criticism of both our form and analysis in a draft version of this article at a department workshop. lloyd benson, john unsworth, and michael holt carefully read several drafts of this article and offered written comments. we appreciate especially the thoughtful readers of the ahr for their wise and judicious reading of this piece in draft form. finally, we would like to express our gratitude for michael grossberg's careful suggestions and patient support of our work and his leadership in bringing it to publication. we thank all these friends and colleagues for their invaluable help.

william g. thomas iii is the director of the virginia center for digital history and an assistant professor of history at the university of virginia. he is the author of lawyering for the railroad: business law and power in the new south ( ). since , he has managed and directed the development of the valley of the shadow: two communities in the american civil war project.
he is also the co-author and assistant producer of massive resistance, an emmy-nominated documentary on virginia's civil rights experience, and he is currently beginning research on an environmental and social history of the chesapeake bay region. edward l. ayers is the dean of the college of arts and sciences and the hugh p. kelly professor of history at the university of virginia. he is the author of the promise of the new south: life after reconstruction ( ) and in the presence of mine enemies: war in the heart of america, – ( ), a narrative history based on the valley of the shadow project.

notes

vannevar bush, "as we may think," atlantic monthly, july . for the most current and complete analysis of the challenges historians face working in the digital medium, see roy rosenzweig, "scarcity or abundance? preserving the past in a digital era," ahr (june ): – ; roy rosenzweig and michael o'malley, "brave new world or blind alley? american history on the world wide web," journal of american history , no. (june ): – ; and rosenzweig, "the road to xanadu: public and private pathways on the history web," journal of american history , no. (september ): – .

david eltis, the rise of african slavery in the americas (cambridge, ); robin blackburn, the making of new world slavery: from the baroque to the modern, – (london, ); david eltis, et al., the trans-atlantic slave trade: a database on cd-rom (new york, ); robert w. fogel, without consent or contract: the rise and fall of american slavery (new york, ); john thornton, africa and africans in the making of the atlantic world, – (new york, ).

for the most influential studies that emphasize the divergence of northern and southern societies in terms associated with modernity, see eugene d. genovese, the slaveholders' dilemma: freedom and progress in southern conservative thought, – (columbia, s.c., ); and genovese, roll, jordan, roll: the world the slaves made (new york, ); james m.
mcpherson, ordeal by fire: the civil war and reconstruction (new york, ); and mcpherson, battle cry of freedom: the civil war era (new york, ); william h. pease and jane h. pease, the web of progress: private values and public styles in boston and charleston, – (new york, ).

the url is http://valley.vcdh.virginia.edu . the valley project includes thousands of letters, tens of thousands of census entries, soldiers' service records, and newspaper articles. one of the purposes of the archive is to encourage students and other non-historians to write history for themselves with a capacious archive that permits people to make their own connections, to follow their own insights. but another is to permit professional historians to ask questions of greater specificity and precision than would be possible without having historical materials available in electronic form. that is what we attempt in our digital article. on the project, see william g. thomas iii, "in the valley of the shadow: communities and history in the american civil war," virginia magazine of history and biography (september ), http://jefferson.village.virginia.edu/vcdh/thomas.vmhb.html . for reviews and other analysis of the valley project, see andrew mcmichael, "the historian, the internet, and the web: a reassessment," aha perspectives (february ): – ; o'malley and rosenzweig, "brave new world or blind alley?" – ; rosenzweig, "road to xanadu," pars., /journals/jah/ . /rosenzweig.html ; gary j. kornblith, "venturing into the civil war, virtually: a review," journal of american history (june ): – .

for recent interpretations using gis, see anne kelly knowles, ed., past time, past place: gis for history (redlands, calif., ); and knowles, ed., the special issue of social science history , no. (fall ), http://muse.jhu.edu/journals/social_science_history/toc/ssh . .html.

philip j. ethington, "los angeles and the problem of urban historical knowledge," ahr (december ), http://www.usc.edu/dept/las/history/historylab/lapuhk/; robert darnton, "the new age of the book," new york review of books, march , ; darnton, "an early information society: news and media in eighteenth-century paris," ahr (february ), /journals/ahr/ . /ah .html.

see, for example, janet murray, hamlet on the holodeck: the future of narrative in cyberspace (new york, ); espen aarseth, cybertext: perspectives on ergodic literature (baltimore, ); jerome mcgann, radiant textuality: literature after the world wide web (new york, ); and also see anthony grafton, the footnote: a curious history (cambridge, mass., ).

the influence of language orthographic characteristics on digital word recognition

ofer biller, jihad el-sana and klara kedem
ben-gurion university of the negev, israel

abstract
this research studies the effect of language orthographic characteristics on the performance of digital word recognition in degraded documents such as historical documents. we provide a rigorous scheme for quantifying the statistical influence of the orthographic characteristics on the quality of word recognition in such documents. we study and compare several orthographic characteristics for four natural languages and measure the effect of each individual characteristic on the digital word recognition process. to this end, we create synthetic languages, for which all characteristics, except the one we examine, are identical, and measure the performance of two word recognition algorithms on synthetic documents of these languages. we examine and summarize the influence of the values of each characteristic on the performance of these word recognition methods.

introduction
research in digital script recognition can be classified into two main approaches: segmentation-based and segmentation-free recognition. the segmentation-based approach segments an input word into individual characters, which are then recognized and combined to identify the input word. the segmentation-free approach (the holistic approach) recognizes a whole word without segmenting it (as opposed to classical optical character recognition (ocr)). recently, the holistic approach for digital word recognition has been attracting more interest and has become widely accepted in the handwriting recognition research community, e.g. plamondon and guerfali ( ), plamondon and srihari ( ), gatos et al. ( ), biadsy et al. ( ), zagoris et al. ( ). this approach has been adapted for word recognition of historical documents, which usually suffer from a high level of degradation (lavrenko et al., ; rath and manmatha, ). in the holistic approach, the quality of the results depends on characteristics of the language.
for example, a word for which the lexicon contains many words of similar characters and length may have a higher chance of being misclassified. such an observation calls for examining the relation between language characteristics and the quality of automatic word recognition. the influence of several language orthographic characteristics on visual word recognition has been studied in the domain of cognitive psychology perception (richards and heller, ; coltheart et al., ; andrews, , , ; grainger, ; perea and rosa, ; new et al., ). in this research, we explore the influence of these characteristics on automatic word recognition for low quality and degraded documents, such as historical documents. we examine the following orthographic characteristics: word length, word orthographic neighborhood, and distribution of ascenders and descenders among text characters. intuitively, these characteristics statistically influence the quality of holistic recognition methods. here we provide a rigorous scheme for measuring and validating this intuition.

correspondence: ofer biller, department of computer science, ben gurion university of the negev, p.o.b be'er sheva , israel. e-mail: billero@cs.bgu.ac.il
literary and linguistic computing © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com. doi: . /llc/fqu . literary and linguistic computing advance access published october , . downloaded from http://llc.oxfordjournals.org/ at ben gurion university - aranne library on november , .

we start with learning the behavior of these characteristics in natural languages, by gathering text documents in english, hebrew, arabic, and russian and analyzing the distribution of these characteristics in them.
to isolate the effect of a specific characteristic, we generate synthetic languages where all characteristics are identical except the one we examine. then we produce document images for these languages, apply word recognition algorithms, and analyze the relations between each language characteristic and the performance of the recognition. we simulate degraded documents by adding random noise to the synthetically generated documents in a controlled manner.

the analysis of the relationship between language characteristics and the performance of recognition algorithms can help determine a suitable recognition approach for a given language and simplify the utilization of existing techniques. this is useful in the research of degraded documents in different languages. here we examine two approaches for word spotting (recognition): dynamic time warping (dtw), proposed by rath and manmatha ( b), and gradient-based binary features (gsc), used in zhang et al. ( ).

the article is organized as follows. in section , we outline related work. in section , we present the language characteristics and review their effect on human word recognition as reported in works in psychology and cognitive research. in section , we show the distribution of these characteristics for several natural languages. section measures the influence of each characteristic on word recognition, using the synthetically generated languages and degraded documents. finally, we draw conclusions in section .

related work
the relation between language characteristics and word recognition has attracted the interest of researchers in the field of psychology and cognitive research. the orthographic similarity (orthographic neighborhood) was shown to affect human performance of word recognition, e.g. grainger ( ), andrews ( , ), perea and rosa ( ). the commonly used definition for word neighborhood was introduced by coltheart et al.
( ) as the set of words that can be obtained by replacing a single letter in the original word. recently, a more flexible definition was suggested, which is based on levenshtein's string distance metric (levenshtein, ; yarkoni et al., ). the new method takes into account not only letter replacement but also insertion and deletion. word length has also been shown to influence visual word recognition (richards and heller, ; new et al., ).

the segmentation-free holistic approach to script recognition has become more popular in recent years, e.g. madhvanath and govindaraju ( ). this approach has been adapted for word recognition of historical documents, which usually suffer from a high level of degradation (lavrenko et al., ). spitz ( ) makes use of the presence of ascending and descending letters both for recognition and for language identification (spitz, ). some work has been done on image quality measurement and ocr result prediction, e.g. esakov et al. ( ); blando et al. ( ); ye et al. ( ); salah et al. ( ), but these mostly focus on ocr while our work deals with holistic word recognition.

natural language characteristics
below we present the language characteristics we use and our hypotheses on their influence on the performance of word recognition. the characteristics we examine are word length, orthographic neighborhood, and distribution of ascenders and descenders. in a preliminary stage we examined a number of possible characteristics and decided to focus on the characteristics above due to their high influence on the structural shape of words. throughout the article, we use the same characteristics for all the examined languages (as described below).
word length

we expect word recognition algorithms to perform better in languages with longer words. we take the average word length in a language as an orthographic characteristic.

orthographic neighborhood

this characteristic aims to measure the orthographic distribution of words in the language. the assumption is that a low orthographic distance between words manifests as a small distance between word images, which contributes to lower recognition rates. an accepted measure for this is the orthographic neighborhood. in this article, we use levenshtein distance to determine the orthographic neighborhood. we take the average neighborhood size over all words of the language as a representative measure.

ascenders and descenders

an important aspect that has a high impact on the quality of word recognition is the graphical properties of the character set. we focus on ascenders and descenders. graphically, a handwritten line has a lower baseline and an upper baseline, between which most letters are written. some letters exceed these baselines. those that exceed the upper baseline are called ascenders (e.g. b, d); those that exceed the lower baseline are called descenders (e.g. g, q). ascenders and descenders dramatically influence the outer shape of the word’s image and therefore might affect word recognition. several approaches use the presence of ascenders and descenders for recognition and language identification, but to the best of our knowledge, no work has been done to quantify this influence.

distribution of orthographic characteristics in natural languages

here we analyze the behavior of the characteristics described above in several natural languages. we collected a corpus of books in four different languages: english, russian, hebrew, and arabic.
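the levenshtein-based neighborhood measure from the orthographic neighborhood subsection above can be sketched in python as follows. this is a minimal illustration, not the authors' implementation: the function names are hypothetical, and treating two words as neighbors when their edit distance is at most 1 is an assumption (the text only states that levenshtein distance is used).

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-letter substitutions, insertions and deletions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def neighbor_counts(words: Iterable[str], max_distance: int = 1) -> dict[str, int]:
    """Number of neighbors of each distinct word within the given corpus.

    max_distance=1 is an assumed threshold, not a value taken from the article.
    """
    vocab = sorted(set(words))
    counts = {w: 0 for w in vocab}
    for i, w in enumerate(vocab):
        for v in vocab[i + 1:]:
            # Words whose lengths differ by more than the threshold cannot be neighbors.
            if abs(len(w) - len(v)) <= max_distance and levenshtein(w, v) <= max_distance:
                counts[w] += 1
                counts[v] += 1
    return counts


def average_neighborhood_size(words: Iterable[str], max_distance: int = 1) -> float:
    """Average neighbor count over all distinct words, used as the language-level measure."""
    counts = neighbor_counts(words, max_distance)
    return sum(counts.values()) / len(counts) if counts else 0.0
```

for the toy vocabulary `["cat", "cot", "coat", "dog"]`, the first three words are mutual neighbors and "dog" has none. the pairwise loop is quadratic in vocabulary size, which is acceptable for a sketch but would need indexing for a large corpus.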
the books we used were a random collection of prose books in text format (among them adventures of huckleberry finn, crime and punishment, and a variety of short stories that were available in several languages). we processed the text by extracting distinct words and building a dictionary for each language. we prefer collecting word statistics from documents rather than from dictionaries in order to get a better representation of frequently used words and to ignore rare and unused words. for each of the languages, our database contains about , unique words.

word average length

figure shows that both english and russian have on average longer words than hebrew and arabic and a wider spread of word lengths. the graph in figure illustrates the distribution of word length by showing, for each length, the percentage of words of that length among all words in the corpus. it shows substantial differences among the examined languages.

word orthographic neighborhood

to evaluate the orthographic density of the words in each language, we calculate for each word in a language the number of its neighbors (within the corpus). the graph in figure shows for each language the percentage of words having a specific neighbor count (the x-axis). as can be seen, in english and russian the main mass of words concentrates in the lower values of neighbors per word, while hebrew and arabic also spread over higher values of neighbors.

fig. distribution of word length per language: percentage of words per word length

ascending and descending letters

to measure the distribution of ascending and descending letters, we define ascender and descender configuration.
each configuration is represented by a sequence of letter types (ascending/descending/none). each word matches some configuration. for example, the word ‘dog’ matches the configuration ascend-none-descend. for each language, we compute how many words are in a given configuration set. in table we present the average size of the configuration sets in each language. a configuration is considered empty for a language if the language does not contain any words of that configuration. we expect languages with many small non-empty configuration sets to have more graphical diversity in words’ general shapes and therefore to have an advantage in word recognition.

summary of the results for the characteristics on the four natural languages

as seen above, there are substantial differences in the values of the checked characteristics among the languages. in general, english and russian have longer words and a lower number of neighbors per word than hebrew and arabic. as to ascender and descender distributions, hebrew words have the lowest diversity, and english has the widest spread among the examined languages.

experimental evaluation

in previous sections, we listed several natural language characteristics which we conjecture influence the quality of word recognition. we measured some of these characteristics on corpora of four natural languages and found that there are substantial differences in these characteristics among different languages. yet we have not given any estimate or measurement of the influence of each of these characteristics on word recognition. using real documents in different languages does not enable the isolation of the effect of each characteristic.

to measure the impact of a certain characteristic of natural languages, we generated a set of synthetic languages, based on the english alphabet and similar in every characteristic but the examined one. for the examined characteristic, the languages have different values.
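the ascender and descender configuration defined earlier in this section can be illustrated with a short python sketch. it is an illustration under stated assumptions rather than the authors' code: the letter classes below are an assumed classification of lowercase english letters (the text gives only b, d and g, q as examples), and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed letter classes for lowercase English print letters (not taken from the article).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def configuration(word: str) -> tuple[str, ...]:
    """Map a word to its sequence of letter types, e.g. 'dog' -> ascend-none-descend."""
    return tuple(
        "ascend" if ch in ASCENDERS else "descend" if ch in DESCENDERS else "none"
        for ch in word.lower()
    )


def configuration_sets(words):
    """Group distinct words by configuration; only non-empty sets appear in the result."""
    sets = defaultdict(set)
    for word in words:
        sets[configuration(word)].add(word.lower())
    return dict(sets)


def summarize(words):
    """Return (number of non-empty configurations, average configuration set size)."""
    sets = configuration_sets(words)
    count = len(sets)
    average = sum(len(s) for s in sets.values()) / count if count else 0.0
    return count, average
```

for example, "dog", "dig", and "bag" share the configuration ascend-none-descend, while "cat" falls into none-none-ascend, so this four-word vocabulary has two non-empty configurations with an average size of two words.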
for each language we generate documents, perform word spotting, evaluate the results, and compare their accuracy. because the generated languages and documents are similar in every aspect except the examined characteristic, a distinctive and stable variance in test results can be confidently attributed to the examined characteristic.

fig. the distribution of word neighbors count: the percentage of words having a specified neighbor count, for english, hebrew, russian, and arabic

table ascender and descender configurations: for each language, the number of non-empty configuration sets and the average number of words in a configuration set
language   non-empty configurations   average configuration size
english    ,                          .
russian    ,                          .
hebrew                                .
arabic     ,                          .

on clean, high-quality synthetic documents we would receive near-perfect word recognition results regardless of the values of the language characteristics. therefore, we added noise to the documents to create variance in the quality of test results. we used a few different combinations of degradation types and levels, such as simple random gaussian and salt & pepper noise, and kanungo’s degradation model (kanungo et al., ). in all experiments, the reaction to change in the characteristics was similar across different types of degradation. the results provided are the averages over the degradation models used. an example of the generated noise is shown in figure . the parametrization of the degradation models was adjusted to yield mediocre recognition results and stayed constant throughout all experiments.
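the simpler of the degradation types mentioned above, gaussian noise combined with salt & pepper noise followed by binarization, can be sketched in python as follows. this is a hedged illustration only: the function name, the parameter values, and the image representation (nested lists of intensities in [0, 1]) are assumptions, and kanungo's model is not implemented here.

```python
import random


def degrade(image, sigma=0.2, sp_prob=0.05, threshold=0.5, seed=None):
    """Add Gaussian and salt & pepper noise to a grayscale image, then binarize.

    image: list of rows, each a list of intensities in [0, 1].
    sigma, sp_prob and threshold are illustrative values, not the article's settings.
    """
    rng = random.Random(seed)  # a fixed seed keeps the degradation reproducible
    degraded = []
    for row in image:
        new_row = []
        for pixel in row:
            value = pixel + rng.gauss(0.0, sigma)  # additive Gaussian noise
            u = rng.random()
            if u < sp_prob / 2:
                value = 0.0  # pepper: force the pixel to the minimum intensity
            elif u < sp_prob:
                value = 1.0  # salt: force the pixel to the maximum intensity
            new_row.append(1 if value >= threshold else 0)  # binarization
        degraded.append(new_row)
    return degraded
```

keeping the parameters and the seed fixed across runs mirrors the requirement that the degradation settings stay constant throughout all experiments, so that differences in results can be attributed to the language characteristic under test.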
among the many available approaches for word recognition, we chose to start with the dtw-based approach presented by rath and manmatha ( b), combined with the profile features from rath and manmatha ( a), and the gsc method (zhang et al., ).

evaluation of the word spotting

as part of document generation, we construct a ground-truth database against which we test our recognition results. using both word recognition methods, we retrieve all the occurrences of each word in the corpus from the generated documents. the overall performance for a given method on a given language is the average f-measure over all the retrieved words.

the testing system

we have built a testing system that runs sets of tests. each set is composed of tests with similar languages which differ only in the examined characteristic. the process for one test includes creation of a language and a noised document in this language, running two word recognition methods, and evaluating the recognition results for both. the whole process of running a test set is automatic and can be run several times in order to increase the reliability of the results. test specifications and test results are written to a database, enabling fast creation of new tests and easy access to test results. the testing software is robust and enables conducting tests on other characteristics in the future.

stability calibration

the process of data generation includes many random elements, such as word generation in the synthetic language dictionary, word selection for the synthetic document, and degradation of the document. a set of predefined rules and parameters controls these random elements, yet they have some influence on the results. the larger the volume of text in the test, the smaller the effect of the random elements on the results. therefore, we had to find a sufficient size for the tested documents in order to reduce the random factor in the results.
we ran the system on inputs consisting of a growing number of lines and plotted the corresponding f-measure. as seen in figure , as the number of lines in the document increases, the f-measure stabilizes. to evaluate the fluctuation of the results, we calculate the standard deviation of the f-measure for each set of five subsequent tests. as the number of text lines used increases, the standard deviation of the subsequent test results decreases. this evaluation shows that above lines of text, the standard deviation stays below %. accordingly, we set the number of words per document to for all tests, i.e. over text lines in a document.

fig. stability calibration graph: test grade (f-measure) as a function of line count in the document

fig. two examples of noised document segments from the conducted experiments. the examples use different degradation models: the upper segment is degraded using gaussian and salt & pepper noise, and the lower is degraded using kanungo’s model. the same degradation and binarization processes were applied to all generated documents

experiments and results

next we describe the experiments performed on the test system for each of the examined language characteristics.

average word length: we generated nine languages, each with a different average word length, between and . the standard deviation of word lengths was set to one for all languages. the length of each word in a test set was randomly generated from a normal distribution with the specified average and standard deviation. the experiment was repeated three times to validate the results. figure shows two sample lines from three documents from three languages with word length averages of , , and , respectively.
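the word-length sampling described above can be sketched in python as follows. this is a minimal sketch under stated assumptions, not the authors' generator: the function names are hypothetical, and rounding to the nearest integer, the minimum length of one letter, and uniform sampling of letters from the lowercase english alphabet are all assumptions.

```python
import random
import string


def sample_word_length(mean, std, rng):
    """Draw a word length from a normal distribution, rounded and clipped to at least 1."""
    return max(1, round(rng.gauss(mean, std)))


def generate_dictionary(size, mean_length, std=1.0,
                        alphabet=string.ascii_lowercase, seed=None):
    """Generate `size` distinct synthetic words with normally distributed lengths.

    Letters are drawn uniformly from `alphabet` (an assumption); the loop expects
    `size` to be far smaller than the number of possible words.
    """
    rng = random.Random(seed)
    words = set()
    while len(words) < size:
        length = sample_word_length(mean_length, std, rng)
        words.add("".join(rng.choice(alphabet) for _ in range(length)))
    return sorted(words)
```

calling `generate_dictionary(200, 6, seed=1)` yields 200 distinct lowercase words whose lengths cluster around six letters; varying only `mean_length` between calls corresponds to the idea of languages that differ in a single characteristic.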
figure summarizes the results of the word length experiment. it displays the f-measure of word recognition as a function of average word length, for the dtw and gsc methods. we notice that the difference between the quality of the results of the two methods is high. the f-measure for a language with short words was very low for dtw, and above % for gsc. as the average word length increases, the f-measure for dtw approaches %, while the f-measure for gsc decreases for average word length and above. we believe that this is because gsc was fine-tuned for english, where the average word length is , as seen in figure .

ascending and descending characters: this experiment examines the influence of ascender and descender distribution on word recognition results. for that purpose, we randomly create synthetic test languages using a partial set of the english lowercase letters. the alphabet of each test language consists of characters with a different composition of ascenders and descenders. we used more than one test language for each composition to eliminate the influence of the choice of a specific character set. figure shows the average f-measure for each combination (#ascenders_#descenders_#normal).

the dtw with the projection profile features shows significantly better results for a balanced combination of ascenders and descenders, while the gsc method is less influenced by this characteristic. for both recognition methods the results are similar, with f-measure around and %, while the recognition by dtw for words with no ascenders and descenders is much lower (about . %).

fig. examples of spotting words in noisy documents (by dtw). two example lines are from three documents with average word lengths of , , and . the query words are marked with an underline, and the matches found by the system are marked with a rectangle (therefore the word in the first line marked with a rectangle and not underlined is an example of a false positive)

fig.
the graph depicts the f-measure as a function of ascender and descender composition in a language

fig. word length experiment: quality of the results (f-measure) as a function of average word length in a language

orthographic neighborhood: in this experiment, we created documents with different average neighbor counts for the words (and fixed the average word length at ). the created test set contained the values of . , , , and as average word neighbor count. the graph in figure displays the f-measure results for dtw and gsc, and shows that in both methods the quality of recognition declines as the number of neighboring words grows.

conclusions and future research

in this research, we examine the effect of several orthographic language characteristics on word recognition. we examine word length, orthographic neighborhood, and distribution of ascenders and descenders. these characteristics show very different behavior in the natural languages we tested; e.g., hebrew and arabic have on average shorter words than english and russian, a larger number of orthographic neighbors, and also fewer ascender–descender configurations.

we investigated the influence of these characteristics on the performance of word recognition in degraded documents by running word spotting tests on synthetically generated languages (all the synthetic languages were constructed from the english alphabet). the synthetic languages are built so that only the examined characteristic varies while all the others remain the same.
we conclude that for the ascender/descender characteristic and for average neighbor count, dtw and gsc demonstrate similar behavior with minor differences: a low number of neighbors per word and a balanced distribution of ascenders and descenders contribute to better word recognition results. on the other hand, for average word length, dtw clearly presents better results as words get longer, while gsc responds best when average word length is between – letters. this can be explained by the fact that dtw allows flexibility of word length (rath and manmatha, ), while gsc performs a subdivision of words into a predefined number of regions within the word frame (zhang et al., ), which leads to optimal behavior in a limited range of word lengths.

acknowledgements

this research was supported in part by the dfg-trilateral grant no. , the lynn and william frankel center for computer sciences, and the paul ivanier center for robotics and production management at ben-gurion university, israel.

references

andrews, s. ( ). frequency and neighborhood effects on lexical access: activation or search. journal of experimental psychology: learning, memory, and cognition, : – .
andrews, s. ( ). frequency and neighborhood effects on lexical access: lexical similarity or orthographic redundancy. journal of experimental psychology: learning, memory, and cognition, : – .
andrews, s. ( ). the effects of orthographic similarity on lexical retrieval: resolving neighborhood conflicts. psychonomic bulletin and review, : – .
biadsy, f., el-sana, j., and habash, n. ( ). online arabic handwriting recognition using hidden markov models. proceedings of the th international workshop on frontiers of handwriting and recognition, la baule, centre de congres atlantia, france. pp. – .
blando, l. r., kanai, j., and nartker, t. a. ( ). prediction of ocr accuracy using simple image features. montreal, canada: icdar, pp. – .

fig.
f-measure results as a function of average neighbor count

coltheart, m., davelaar, e., jonasson, j. f., and besner, d. ( ). access to the internal lexicon. in dornic, s. (ed.), attention and performance vi. hillsdale, nj: erlbaum, pp. – .
esakov, j., lopresti, d. p., and sandberg, j. s. ( ). classification and distribution of optical character recognition errors. proceedings of spie, the international society for optical engineering, orlando, florida, , pp. – .
gatos, b., konidaris, t., ntzios, k., pratikakis, i., and perantonis, s. ( ). a segmentation-free approach for keyword search in historical typewritten documents. proceedings of eighth international conference on document analysis and recognition, vol. . seoul, korea, pp. – .
grainger, j. ( ). word frequency and neighborhood frequency effects in lexical decision and naming. journal of memory and language, : – .
kanungo, t., haralick, r. m., and phillips, i. t. ( ). global and local document degradation models. icdar, tsukuba city, japan: ieee, pp. – .
lavrenko, v., rath, t. m., and manmatha, r. ( ). holistic word recognition for handwritten historical documents. document image analysis for libraries. palo alto, ca, pp. – .
levenshtein, v. i. ( ). binary codes capable of correcting deletions, insertions and reversals. soviet physics doklady, : .
madhvanath, s. and govindaraju, v. ( ). the role of holistic paradigms in handwritten word recognition. ieee transactions pattern analysis machine intelligence, ( ): – .
new, b., ferrand, l., pallier, c., and brysbaert, m. ( ). reexamining the word length effect in visual word recognition: new evidence from the english lexicon project. psychonomic bulletin and review, ( ): – .
perea, m. and rosa, e. ( ).
the effects of orthographic neighborhood in reading and laboratory word identification tasks: a review. psicológica, : – .
plamondon, r. and guerfali, w. ( ). why handwriting segmentation can be misleading? proc. intl conf. on pattern recognition. vienna, austria, pp. – .
plamondon, r. and srihari, s. n. ( ). on-line and off-line handwriting recognition: a comprehensive survey. ieee transactions pattern analysis machine intelligence, : – .
rath, t. and manmatha, r. ( a). features for word spotting in historical manuscripts. proceedings of seventh international conference on document analysis and recognition, vol. : – .
rath, t. and manmatha, r. ( b). word image matching using dynamic time warping. ieee computer society conference on computer vision and pattern recognition (cvpr’ ), vol. , ii– – .
rath, t. m. and manmatha, r. ( ). word spotting for historical documents. ijdar, ( – ): – .
richards, l. g. and heller, f. p. ( ). recognition thresholds as a function of word length. american journal of psychology, ( ): – .
salah, a. b., ragot, n., and paquet, t. ( ). adaptive detection of missed text areas in ocr outputs: application to the automatic assessment of ocr quality in mass digitization projects. is&t/spie electronic imaging. international society for optics and photonics, doi: . / . .
spitz, a. l. ( ). determination of the script and language content of document images. ieee transactions pattern analysis machine intelligence, ( ): – .
spitz, a. l. ( ). shape-based word recognition. icdar, ( ): – .
yarkoni, t., balota, d., and yap, m. ( ). moving beyond coltheart’s n: a new measure of orthographic similarity. psychonomic bulletin and review, ( ): – .
ye, p., kumar, j., kang, l., and doermann, d. s. ( ). unsupervised feature learning framework for no-reference image quality assessment. cvpr. providence, ri, usa: ieee, pp. – .
zagoris, k., papamarkos, n., and chamzas, c. ( ).
web document image retrieval system based on word spotting. ieee international conference on image processing , atlanta, georgia, usa, pp. – .
zhang, b., srihari, s. n., and huang, c. ( ). word image retrieval using binary features. proc. of the spie conf. on document recognition and retrieval xi, , san jose, california, usa.

bridging gaps: libraries and software computing for mbas
ticker: the academic business librarianship review, : ( )
http://doi.org/ . /ticker. . .
© hilary a. craiglow and kelly m. lavoice

hilary a. craiglow
vanderbilt university, nashville, tn
hilary.craiglow@owen.vanderbilt.edu

kelly m. lavoice
vanderbilt university, nashville, tn
kelly.lavoice@owen.vanderbilt.edu

abstract

increasingly, data comes in large sets, and meaning must be derived by the user. graduate business students and their future employers are looking for complementary computer programming knowledge to discover and display custom meaning from large sets of data. we present the context for graduate business programs and data analytics, the increasing need for knowledge of software computing and computer programming capabilities, and how libraries are well positioned to play a part in this data literacy. we also present a case study utilizing the software carpentries curriculum and lessons to provide a certificate program in software computing for graduate business students. finally, we suggest opportunities for the academic business library community to further this work.
keywords

data analytics, software computing, computer programming, coding, mba, python, sql, r/rstudio, data literacy, library instruction, the carpentries, software carpentry, micro-credential

context: mbas and data analytics

librarians understand that the amount of data, both freely available and commercially curated, is increasing exponentially. researchers have long used various statistical methods and computer programs to work with datasets that come without structure or custom searchable interfaces. while some tools have a strong history in academic research, including spss, sas, and stata, new tools are being developed by vendors and open-educational communities to provide researchers with additional ways to analyze and find meaning in large datasets.

while several software programs are freely available to analyze the ever-expanding universe of data, a challenge is finding experienced analysts able to work with these tools to bring meaning to data and translate meaning into action. datasets themselves may present trends or correlations, but a skilled analyst must also be able to communicate how data should be incorporated into strategies and business decisions.

mba programs prepare students to succeed in a wide range of business roles; critical thinking is important for all the career paths students will pursue after graduation. traditionally coveted roles, including those in long-established business spheres like consulting and investing, are becoming increasingly data-focused. an mba graduate will likely interface with business analytics in almost any role they accept. for example, a manager in a non-technical role may be responsible for supervising data analysts.
a successful manager needs to ensure that technical teams can communicate and collaborate with non-technical teams. learning to speak the language of business analytics is critical for success. in a november edition of bloomberg business week, professor paul oyer at the stanford graduate school of business explains the importance for both data scientists and managers to be able to communicate and translate data issues to solve problems. managers have a ton of institutional knowledge and know how a business works, but they’re sitting on troves of data that they don’t know what to do with. and data people don’t know what questions need to be answered through the data. the person who can go in the middle is becoming more and more valuable. (cohen, , para. ) in the graduate management admission council’s (gmac) corporate recruiters survey, data and data analytics were front and center. “overall, percent of employers plan to place recent business school graduates into data analytics roles in ” (graduate management admission council, , p. ). if a student does not intend to pursue a data analytics role, it is likely they will work with data specialists in some capacity in their future career. business schools are aware of the needs of industry and employers and have looked to address the demand for data skills in a variety of ways. this paper is not a comprehensive curriculum review; others have tracked the increasing representation of data analytics courses in the mba curriculum and seen the many ways in which business schools are working to incorporate these skills into student learning outcomes. warner ( ) provided a valuable overview of themes that should be taught in business analytics courses for mba students and highlighted pedagogical resources from vendors. 
warner stressed the importance of recognizing that using analytics to inform decision making is an iterative process that works best with a team of collaborators applying their own knowledge and expertise to the context at hand. while individual teaching tools and software programs continue to change and develop over time, warner mentioned challenges faced in that academic professionals still face today; these include “access to data sets, faculty expertise, student aversion to statistics, finding a suitable textbook and cases, and staying current with practice” (warner, , p. ).

gupta, goul, and dinter ( ) recognized the demand for managers and analysts who can succeed in business intelligence (bi) & analytics roles. they reviewed course materials posted on teradata university network’s web pages to develop a survey that was administered to participants at a bi event at the americas conference of information systems in detroit, michigan. using feedback from survey participants, mostly bi instructors, the authors developed a model curriculum to differentiate between the skills required in various programs. the authors discussed the differences among bi coursework for mba, master of science (ms), and undergraduate programs:

mba coverage predominantly focused more on strategically applying bi tools and technologies and solving business problems for competitive advantage, the ms coverage emphasized learning how to build bi applications using bi tools, and the undergraduate coverage focused on understanding bi and its tools for solving business problems. (gupta et al., , p. )

mccollum, gheibi, and doganaksoy ( ) highlighted the value of tools like r and jmp for exploratory data analysis in teaching business analytics in an mba course. the authors used a dataset with qualitative and quantitative variables from the federal railroad association. they concluded that “using a practical and relatable dataset, like the railroad incidents dataset that we have used for our analysis, motivates students’ learning significantly” (mccollum et al., , p. ). the authors highlighted the value of using examples from industry to help make connections between academic study and business operations.

some mba programs are adding courses, as electives or requirements, that aim to teach both programming languages and data analytics skills. for example, columbia business school introduced two such credit-bearing courses in : introduction to programming using python and introduction to databases for business analytics. according to columbia, “by the end of the - academic year, one-quarter of full-time mbas were learning how to code in either python or sql” (kurczy, , para. ).

other business schools have created new degrees to prepare students for future data-focused roles. the mit sloan school of business offers a master of business analytics. this -month program offers courses including from analytics to action, communicating with data, and analytics software tools in r, python, sql, and julia (mit sloan school of business, ). nyu also offers a master of science in business analytics, a one-year, part-time program offering courses like “foundations of statistics using r” and “data-driven decision making” (leonard n. stern school of business, ). this program could be pursued by a student working full-time to prepare to take on additional data-focused work.

business schools wishing to add data courses or programs face many challenges, such as meeting accreditation requirements, managing crowded curriculums, and hiring faculty to meet demand. this process is time-consuming and prescriptive. another possibility for introducing business students to software computing involves partnering with other programs on campus, such as computer science departments or online learning extension programs.
mba programs may allow students to take courses in other graduate programs on campus without pursuing dual degrees. vanderbilt mba students may take courses in other graduate programs, including computer science and data science. however, admission to graduate courses in either program requires that students demonstrate sufficient training in a combination of programming techniques, calculus, statistical regression techniques, and/or other foundational computer science concepts. these courses are not intended to teach software programming to novice learners.

mba students may look outside of their curricular studies to find learning opportunities to engage with software computing. harvard university, for example, offers both freely available moocs and fee-based harvard extension school courses that teach fundamentals of programming languages in a range of levels from introductory to advanced (harvard university, n.d.). as of december , harvard offers four freely available moocs, which range from five weeks to weeks. the harvard extension school offers two semester-long courses, programming languages and web programming with python and javascript. both options could be utilized by students at other institutions looking for introductory programming courses that can be completed online while pursuing an mba degree. for students looking for a brief introduction, wishing to understand the broad capabilities of these programs and how they are utilized in business contexts, these courses may not be an ideal fit. we also recognize that costs associated with these programs can create barriers for students.

there are vendors that will license and sell online learning modules that cover these topics to academic libraries; however, finding funds in library budgets to add new resources like this may be challenging, depending on other competing collection development priorities and budgetary pressures.
ticker: the academic business librarianship review, : ( ) http://doi.org/ . /ticker. . . © hilary a. craiglow and kelly m. lavoice
business librarians are in an excellent position to bring data analytics and software computing programming at an introductory level to their campus communities. librarians have traditionally built an expertise in purchasing research data. this skillset ranges from reviewing licenses and technical requirements to marketing the datasets and helping researchers determine which datasets are most appropriate for solving research problems. in addition to our traditional expertise with finding, evaluating, and purchasing data sources, librarians have often been able to assist their campus communities by developing relevant programming relatively quickly. library workshops and supplemental instruction support the curriculum and exist alongside credit-bearing courses, which take additional time and resources to develop. through our strong relationships with student, faculty, and staff stakeholders, we attempt to fill gaps in teaching and learning with timely, targeted programming. this ability to develop our own co-curricular courses, without going through a lengthy approval process, allows us to move quickly with implementing new offerings.
case study: walker management library
in spring , the walker management library, with the support of faculty, administrators, staff, and students, undertook a new initiative to support the growing interest in software and computing in vanderbilt university’s owen graduate school of management. this effort developed after librarians noticed an increase in student requests to career services, faculty, and librarians for opportunities to use tools like python or sql for both school projects and career readiness. these conversations almost always acknowledged that top potential employers were looking for these skills and experiences with certain software programs and coding languages on resumes.
in our process of exploring how best we could make a meaningful impact in this area, we spoke with the deans of faculty and the mba program about what was and was not represented in the curriculum. we wanted to be sure the program developed was co-curricular; our programming would complement and not compete with the curriculum. we recognized that the curriculum would change over time and, in turn, our library offerings would also evolve.
before creating our software computing certificate program, we had tested some brief low-barrier learning programs on data tools like bloomberg and capital iq. in fall , we had added single classes on tableau, r, python, and sql. these workshops always had strong attendance but were limited in time and scope. fifty-minute introductory sessions set the context for why these programs were relevant and how they are being used in industry, but did not allow time to learn, work through sample data projects, and explore program capabilities in a cohesive manner.
the library had also sponsored various data working groups. our business information & data analysis librarian sponsored the r working group, a time for students to collaborate and crowdsource ideas for data projects. while this group was open to any member of the vanderbilt community, many owen students attended. additionally, our library fulfilled requests from faculty to provide support to students completing assignments that involved data analysis or some form of coding. for example, in spring , an owen faculty member wanted to offer her students an opportunity to learn to web-scrape static web content for a course assignment. the library was able to develop and offer two optional workshops in the weeks before the course assignment was due.
as relative novices in this area of data course design, we looked to partner with others who have tried-and-tested curricular materials.
we found a good match with an organization outside of the library community, but with a deep commitment to supporting data needs in libraries and academia. libraries have a rich tradition of sharing content, including research guides and lesson plans. furthermore, professional organizations have routinely offered workshops and other learning materials for librarians to develop wide ranges of instructional content. the carpentries (https://carpentries.org) is an organization that pulls together both concepts: collaborative program development and library training. the carpentries curates an international community of instructors who create and refine lessons on software and data tools that develop skills and literacy. since , software carpentry (https://software-carpentry.org) events have reached more than , researchers internationally (software carpentry, n.d.). the carpentries collective develops curricula and lessons for two main tracks: software and data.
the mission of the carpentries is to build global capacity in essential data and computational skills for conducting efficient, open, and reproducible research. we train and foster an active, inclusive, diverse community of learners and instructors that promotes and models the importance of software and data in research. we collaboratively develop openly-available lessons and deliver these lessons using evidence-based teaching practices. we focus on people conducting and supporting research. (carpentries, a)
software carpentry was developed out of stem (science, technology, engineering, and math), but sample lessons representing a variety of social science and humanities disciplines are in development (carpentries, n.d.). although not perfect in translation to business and finance, the core lessons in this area resonate well.
content for the software carpentry programs varies, but the core lessons include: the unix shell, version control with git, programming with python, programming with r, r for reproducible analysis, using databases and sql, and programming with matlab. these are often organized into two-day workshops. while anyone may use the carpentry curriculum under a creative commons attribution license (cc_by), only workshops taught by at least one certified carpentry instructor can be branded as an official carpentry workshop (carpentries, b). many academic libraries have incorporated curriculum from the carpentries into their services for both library staff and the campus community. the university of oklahoma in norman first brought the software carpentry workshop to their campus in , as a partnership between campus it and the university libraries. over faculty, students, and staff from over departments have attended their carpentry programming. the university of oklahoma joined the carpentries organization, which allows instructors per year to be officially certified. this method ensures sustainability, with the instructional workload spread across many potential instructors (pugachev, ). in , a group of seven colleges and universities formed the new england software carpentry library consortium, piloting a consortial membership model with the software carpentry foundation. this allowed the institutions to gain the benefits of being carpentries members while sharing the membership cost. initial members included brown university, dartmouth college, harvard university, mount holyoke college, tufts university, university of massachusetts, amherst, and yale university. during the - pilot year, the consortium's members hosted nine workshops across new england. members of the group noted: it was important to include staff from both libraries and information technology because we often provide complementary or joint services to support researchers’ data needs. 
we needed to make sure that we could accurately and persuasively communicate the value of the carpentries and the consortium to our administrators. (atwood et al., , )
vanderbilt joined the carpentries in partnership with the office of the vice provost for research. like the new england software carpentry library consortium, we found that library staff must work closely with other groups on campus to support the complex research needs of our faculty and students. it is important to note that while our pilot software computing certificate course incorporated many lessons from the formal carpentries curriculum, and our main instructor was carpentries-certified, we did not brand our program as a carpentries workshop. in the summer of , vanderbilt’s digital scholarship and communications department hosted a -day carpentries-certified workshop for library staff. that program created new potential instructors for future library workshops targeting students, staff, and faculty as we work to build our carpentries connections across vanderbilt’s campus.
in spring , we piloted our software computing certificate program. the program was open to all graduate business students in the owen graduate school of management. the course ran for weeks during the spring semester, with a lunch session followed by hours of instruction each friday. in total, this course involved hours of instruction. to maintain a high instructor-to-student ratio, we capped our pilot program at just students. we opened registration at : a.m. on a wednesday morning; by : a.m., the program was full. within the hour, we formed a waitlist with names.
we quickly learned that there is a lot of interest in the topic and our students are willing to give up half of a day each week for six weeks to participate. while students did not earn credit towards their degree requirements, we did offer a certificate and online badge upon completion.
the effort took a significant amount of staff time. each workshop had one instructor who used the carpentries curriculum, and two, sometimes three, helpers who followed along and helped students who lagged behind during instruction or exercises. in this model, the instructor can focus on delivering the lesson, and helpers can troubleshoot with a participant without holding up the entire class. the primary instructor for the course was our business information and data analysis librarian. in addition, one session was taught by the director of research computing and another by the library’s data curation specialist. helpers included our librarian for geospatial data and systems and librarian for copyright and scholarly communications. it was a cross-campus team effort. our curriculum is outlined below:
i. program launch
the program was launched with opening remarks from our career management director, who discussed how valuable these skills would be on resumes. additionally, there was an introduction to the program, syllabus, lessons, workshop process, software, and code of conduct.
ii. command line and shell
the unix shell allows people to do complex things with just a few keystrokes and automate repetitive tasks. use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including “high-performance computing” supercomputers).
iii. version control, sharing and reproducing code
version control is the cornerstone of the digital world; it’s what professionals use to keep track of what they’ve done and to collaborate with other people. every large software development project relies on it, and most programmers use it for their small jobs as well.
iv. python part i & ii
this section provided an introduction to python, the popular open-source programming language. jupyter notebook was used and covered.
v. databases & sql part i & ii
three common options for storage are text files, spreadsheets, and databases. databases include powerful tools for search and analysis and can handle large, complex datasets.
students were required to participate in the first three lessons, plus lesson or , in order to qualify for a certificate. most enrolled students were able to successfully complete these requirements.
after the course, students were given an opportunity to provide feedback via an anonymous qualtrics survey. students were asked about the value of each individual lesson. when asked to rank the lessons in order of importance, sql came out first, followed by python, command line/shell/bash and version control/code sharing/git. the open-ended questions provided valuable feedback on the student experience. when asked “what part of the lessons worked well?,” students spoke to the value of the instructors and the value of active learning. when asked “what part of the lessons didn’t work well?,” answers spoke to an aversion to long periods of lecture time or exercises that involved following examples verbatim. additionally, a student mentioned that they desired an opportunity to work on a project through each lesson. to quote: “it would have been nice to work on a project through each of the lessons. i feel like we learned the capabilities of these languages, but i'm not sure i could work on a project or get a job done.” the same sentiment came up in an answer to a later question.
a supplementary issue presented itself in answers to multiple survey questions: the challenge of pacing the class. while some students came in with a basic knowledge of at least one coding language, some had no prior experience. while having additional helpers aims to ensure students who need additional support can have it without involving the entire class, we recognize that a one-size-fits-all class is challenging. pugachev ( ) mentioned similar feedback from students at the university of oklahoma. given the nature of the work involved in teaching this -hour course, at this time it is unrealistic for the library to sustain offering both an introductory and an intermediate version of the course.
moving forward: opportunities
as we look towards the future, we have identified three areas in which we hope to expand our course offering. barring finding a prepackaged business-focused curriculum as flexible as the carpentries curriculum and instructor community, we first want to identify business datasets that can be integrated into the lessons. while the carpentries team is working on an economics curriculum that would likely be relevant, we think we can identify business datasets in our current collections, available online, or from faculty research projects that can be used in lessons. these examples can be designed to reflect projects that students may be asked to work on as consultants or in other business-focused roles. we can use our expertise in finding and evaluating datasets to identify potential resources and formats that would work best for the lessons in our course. ideally, all examples will come from open-source material. we recognize the value of utilizing materials that can be shared with business librarian colleagues at other institutions. like business schools, we believe that case-study-based learning will only enhance the value of our course.
while we continue to explore new curricular materials and course structures for our certificate program, it is important to acknowledge that our current course is not built to scale to reach all current students in the owen graduate school of management. by licensing online learning materials like o’reilly for higher education, we can ensure students waitlisted for our course can engage in self-study with high quality resources. this will also allow students unable to complete our certificate course to gain the skills they will need to collaborate with peers in our software working groups and on special projects.
additionally, we want to provide students with an opportunity to build on the skills gained during the introductory course. students expressed a desire for relatable projects that would allow them to put their skills into practice. we plan to reach out to faculty members in the business school to identify a faculty partner who will provide students with access to a dataset and allow them to assist with data analysis. in the future, we may explore expanding this opportunity by identifying companies that utilize our students for internships and case competitions who can provide students with a real-life problem that can be solved with data analysis. we would work closely with our career services department, as well as our centers for social enterprise and entrepreneurship, to identify these potential partners.
finally, we hope to help students continue to build and sustain their own student learning networks, building coding communities of practice. as with learning any language, conversation is critical. while we host working groups for specific programming languages, such as r and python, we can encourage students who complete the software computing certificate program to continue collaborating as a cohort.
we welcome other business and data library colleagues to connect with us as we continue building our software computing certificate program at the owen graduate school of management. we aim to make our course as relevant as possible to our mbas, offering students an introduction to software computing programs and connecting them with resources and projects that they can use to continue independent or cohort-based project work. additionally, we hope to engage librarians with little prior software computing or programming experience to explore the possibilities of teaching this valuable content. we recognize that the curricular materials from the carpentries have a tradition of being used to teach both students and fellow instructors, and our course would be just as beneficial to library staff as it is to mba students.
libraries have a rich history of teaching students how to use information, data, and complementary tools. software computing is a natural progression of how business libraries prepare students for real-world business applications. librarians have the capability to provide students with an introduction to the software skills they need to be successful working with data and with data scientists. as data analysis continues to grow as a skillset desired by all industries, we hope to continue to grow our capacities to support student learning in this area.
references
atwood, t. p., creamer, a. t., dull, j., goldman, j., lee, k., leligdon, l. c., & oelker, s. k. ( ). joining together to build more: the new england software carpentry library consortium. journal of escience librarianship, ( ), e . https://doi.org/ . /jeslib. .
the carpentries. ( a). about us. retrieved from https://carpentries.org/about/
the carpentries. ( b). what is a carpentries workshop? retrieved from https://carpentries.org/workshops/
the carpentries. (n.d.). community developed lessons. retrieved from https://carpentries.org/community-lessons/
cohen, a. ( , november ). every mba needs tech skills--not just those headed for silicon valley. bloomberg businessweek. retrieved from https://www.bloomberg.com/news/articles/ - - /business-schools-bridge-gap-between-managers-and-data-scientists
graduate management admission council. ( ). corporate recruiters survey report . retrieved from https://www.gmac.com/market-intelligence-and-research/research-library/employment-outlook/ -corporate-recruiters-survey-report
griffel, m. ( ). the new languages of business: python, sql, r. ideas and insights. retrieved from https://www .gsb.columbia.edu/articles/ideas-work/new-languages-business-python-sql-r
gupta, b., goul, m., & dinter, b. ( ). business intelligence and big data in higher education: status of a multi-year model curriculum development effort for business school undergraduates, ms graduates, and mbas. communications of the association for information systems, , - . https://doi.org/ . / cais.
harvard university. (n.d.). online python courses. retrieved from https://online-learning.harvard.edu/subject/python
kurczy, s. ( ). mbas who code. ideas and insights. retrieved from https://www .gsb.columbia.edu/articles/ideas-work/mbas-who-code
leonard n. stern school of business. ( ). course index. retrieved from https://www.stern.nyu.edu/programs-admissions/ms-business-analytics/academics/course-index
mccollum, j., gheibi, s., & doganaksoy, n. ( ). introducing advanced exploratory data analysis tools in an mba program. global journal of business pedagogy, ( ), - .
mit sloan school of management. ( ). master of business analytics curriculum. retrieved from https://mitsloan.mit.edu/master-of-business-analytics#curriculum
pugachev, s. ( ). what are "the carpentries" and what are they doing in the library? portal: libraries and the academy, ( ), - . https://doi.org/ . /pla. .
software carpentry. (n.d.). about us. retrieved from https://software-carpentry.org/about/
warner, j. ( ). business analytics in the mba curriculum. proceedings of the northeast business & economics association (pp. – ).
genealogy
article
utilizing webs to share ancestral and intergenerational teachings: the process of co-building an online digital repository in partnership with indigenous communities
derek jennings *, michelle johnson-jennings, and meg little
community health and epidemiology, university of saskatchewan, saskatoon, sk s n a , canada; mjohnsonjennings@usask.ca
school of social work, university of washington, seattle, wa , usa
indigenous community health (rich) center, university of colorado, boulder, co , usa
department of pharmacy practice and pharmaceutical sciences, university of minnesota, minneapolis, mn , usa; littlem@d.umn.edu
* correspondence: drj@usask.ca; tel.: + - - -
received: may ; accepted: june ; published: july
abstract: indigenous knowledge and wisdom continue to guide food and land practices, which may be key to lowering high rates of diabetes and obesity among indigenous communities. the purpose of this paper is to describe how indigenous, ancestral, and wise practices around food and land can best be reclaimed, revitalized, and reinvented through the use of an online digital platform. key informant interviews and focus groups were conducted in order to identify digital data needs for food and land practices. participants included indigenous key informants, ranging from elders to farmers. key questions included: (1) how could an online platform be deemed suitable for indigenous communities to catalogue food wisdom? (2) what types of information would be useful to classify? (3) what other related needs exist? researchers analyzed field notes, identified themes, and used a consensual qualitative research approach. three themes were found, including a need for the appropriate use of indigenous knowledges and sharing such online, a need for community control of indigenous knowledges, and a need and desire to share wise practices with others online.
an online food wisdom repository that contributes to the health and wellbeing of indigenous peoples through cultural continuity appears appropriate if it follows the outlined needs.
keywords: indigenous health; world wide web; wise practices; digital repository; ancestral teachings
genealogy , , ; doi: . /genealogy www.mdpi.com/journal/genealogy
. introduction
they’re going to tell you a lot everyday . . . . you’re going to like knowing everybody around you, area, the people, you know, telling you something every day. (referenced flying objects in the skies.) it’s going to be just like that. you’re going to know everything, what’s going on in the world . . . . even, we going to have a spider web all over the country. spider web’s going to cover us.
prophecy by choctaw hopaii/prophet in the early s (referenced in mould , p. ).
historically, choctaw hopaii were storytellers of wisdom that crossed generations and time. while the above indigenous choctaw prophecy arose nearly a century ago, the second author, a choctaw tribal member, interpreted it as predicting the importance of being connected and delivering information through the world wide web, herein referred to as the web. while the hopaii’s role in the community was diminished through colonization, their lasting impact and prophecies continue to guide successive generations into the digital age and can help us realize technology’s full potential. for instance, the web currently hosts indigenous ancestral knowledge transmission and facilitates social connections around the world, among many indigenous persons. yet, critical questions arise regarding how to implement the teachings of previous generations within contemporary times.
the purpose of this paper is to describe indigenous communities’ needs regarding how best to reclaim and revitalize food and land practices through the use of an online digital platform. the term “wise” is used instead of “best” practices to indicate centering indigenous worldviews. wise practices further refer to the use of indigenous knowledges (iks) that inform locally appropriate actions, tools, principles, or decisions and contribute to the development of sustainable health practices (johnson-jennings et al. ). use of wise practices also counters the assumption that what is best for one group is generalizable to all others, as is implied by the reference to best practices (wesley-esquimaux and calliou ; thoms ).
. . cultural continuity: transmission through online platforms
online platforms may support continuity of cultural food and land practices if indigenous people are in control of their implementation. across centuries, indigenous cultures have been at the forefront of gathering new technology and reinventing its utility. for instance, indigenous nations are widely known to have quickly obtained glass beads and then integrated them into creating indigenous regalia and other traditional works, substituting difficult-to-obtain shells, pearls, stones, copper, dyed porcupine quills, and the like. the glass bead technology increased the efficiency of transmitting cultural symbols and designs through established patterns and codes, thereby also increasing transmission of ancestral knowledges. in this manner, indigenous communities exercised sovereignty and self-determination in using novel technology. today, online and social media platforms have been used to effectively disseminate cultural information and create online communities among indigenous peoples in efficient and self-determined manners.
for example, in , the center for digital scholarship and curation at washington state university developed the mukurtu wumpurrarni-kari archive (center for digital scholarship and curation at washington state university ). this platform serves as a community-driven, open source content management system that meets the “needs of diverse communities who want to manage and share their digital cultural heritage” on their own terms. idle no more is another recent digital social movement that illustrates the power of connecting with those around the world for social justice and change (tupper ). in both examples, digital online platforms remain important for not only connecting with others, but also for maintaining cultural practices, which can further cultural continuity (see champagne ). research indicates that as cultural continuity increases, health outcomes improve (chandler and lalonde ; oster et al. ). therefore, indigenous communities may support cultural continuity, and thereby improve their health outcomes, through the use of culturally situated digital tools.
. . the new smoke signals: digital technology as a potential form of disseminating indigenous knowledge
several indigenous cartoons or memes depict social media and technology as the new smoke signals, i.e., the ability to communicate with other indigenous groups across great distances. digital platforms have been seen to share indigenous knowledge (ik) appropriately and to efficiently transmit this information to many at once (hunter ). in fact, digital repositories can hold cultural artifacts and disseminate cultural beliefs, teachings, and practices in a culturally appropriate manner (hennessy et al. ). this is especially important for indigenous communities who have been historically fragmented, or isolated, through colonization and lack access to elders, knowledge keepers, and/or other readily available access to this information.
furthermore, digital platforms can unite indigenous persons efficiently over commonly held interests. in essence, an online social network may be just as powerful as in-person networks, and perhaps more efficient. given that indigenous people are often marginalized in the media, digital platforms can inform cross-national indigenous communities about pressing issues and call for immediate global responses. such quick responses were seen with the aforementioned idle no more movement. the web rapidly connected indigenous communities around the world in order to form online communities, share information, and organize protests for social change. one element of this web-based communication resulted in organized “flash mobs”, which rapidly occurred in unison in various public venues across the world (tupper ).
digital platforms can further increase cultural continuity, or the furthering of cultural practices, and encourage revitalization around food practices. unfortunately, colonization has systematically dismantled many indigenous families, community kinship networks, and land connections that provided emotional, spiritual, and physical support and guided cultural transmission of traditional ways (saslis-lagoudakis et al. ). yet, iks have survived in the pockets of communities and families across the world in various forms. though iks continue to be transmitted, sometimes remaining ik was protected and not shared with the younger generations for fear that they too would be oppressed if they maintained these practices. others, perhaps, did not share iks with the younger generations out of fear of cultural appropriation and/or misuse of iks by outsiders (johnson-jennings et al. forthcoming). as a result, iks sometimes became so heavily guarded that communities could not actively evolve or transform their cultural knowledges to fit modern-day life.
furthermore, iks are often land-based and, consequently, may not be shared outside of the local geographic area. given the removal of many indigenous persons from their tribal homelands and related original ik, they may now share more iks in common with the communities on whose homelands they currently reside. as a result, ancestral cultural information has become fragmented and sometimes inaccessible (nelson ), leaving some indigenous groups wanting to reconnect with the ik of their tribes via digital methods (saslis-lagoudakis et al. ). online digital platforms that are controlled by indigenous communities can ensure dissemination of cultural knowledges in appropriate ways. they can further provide an opportunity to revitalize and reclaim iks regarding land, medicines, and foods, as more communities may provide ongoing feedback to build and validate the data. for example, knowledge of indigenous herbal medicines and their evolution is related to an indigenous group’s environment or place, which includes how climate change has impacted these medicines (lynn ). this invaluable information could be shared through an online ik repository with, for example, indigenous persons who live in a region differing from their tribal homelands. these individuals could also learn about the medicines that grow locally and how they relate to the medicines belonging to their tribal homelands. elders could augment this information with stories or experiences, as appropriate, for others to learn. though digital platforms will not replace the physical connections to the land or to elders’ teachings, there exists potential to provide the resources and initial direction to guide knowledge creation, sharing, and reclamation.

re-storying through digital spaces

digital spaces can serve as excellent storytelling platforms, which may be particularly important for indigenous communities.
this is especially so given that indigenous communities have moved from solely surviving colonial genocide and ethnocide to a place of indigenous persons promoting health and wellbeing on their own terms (walters et al. ). furthermore, iks center on relationality to past, present, and future ancestors and the environment (moreton-robinson et al. ) and have been transmitted through storytelling for centuries, especially with regard to original instructions (oi), which refer to ancestral protocols that focus on relationships to others and the environment (nelson ). today, indigenous scholars and communities are seeking innovative means to support relational approaches for wellbeing in today’s world (martin and mirraboopa ). thus, the web can serve as one innovative method to present indigenous stories in a culturally relevant and relational manner while creating present-day and future solutions. for example, several indigenous grassroots methods exist for sharing indigenous knowledge systems in order to support thrivance, “or moving towards supporting existing health and wellbeing as guided by oi and the community” (p. s walters et al. ). to do so, indigenous persons must control the comprehensive management of such stories or data to ensure accuracy, validity, reliability, and fidelity. indigenous stories are embedded with iks and cultural protocols. place-based teachings, relational worldviews, and ways of being, knowing, and communicating in the world are embedded in iks and indigenous stories. ois are learned and interpreted by the subsequent generations through formal and informal teaching processes, often through stories (nelson ). in this way, indigenous ancestors and communities have been given practices and protocols for how to live and interact in the world (nelson ). thus, attempts at health interventions that ignore this relationship may not be as effective as those incorporating ois and related iks.
as the choctaw hopaii suggested, the web can bring together indigenous peoples across the world, while also holding promise for cultural perpetuity. in particular, health interventions that include narrative transformation can serve to decolonize internalized and embodied narratives of victimization into narratives of hope and wellbeing (walters et al. ). this has been done by indigenous communities for centuries through stories, art, song, ceremonies, and dance. because indigenous community members maintain, or have access to, internet connections and obtain information online as their preferred/primary source of health information (donelle and hoffman-goetz ), an online platform may be an appropriate place for stories and building relationships across the web. more specifically, digital spaces can be indigenized, or centered within iks and ois, in order to promote narratives of hope and healing instead of narratives of disparities and trauma, which are often highlighted in the research. thus, this study sought to identify community needs regarding a digital online platform and how best to develop a culturally appropriate repository, which would allow the communities to reclaim, revitalize, and reinvent food and land practices on their own terms.

methods

after a year of development, the authors began an initial online framework for a community-engaged indigenous food wisdom repository (repository) in february . dr. johnson-jennings (choctaw) is the primary investigator (pi), dr. jennings (sac and fox and quapaw) is the co-principal investigator (co-pi), and dr. little (ally) is the co-investigator (co-i). the positionalities of the researchers include dr. jennings being an indigenous health educator and professor from the anishinaabe sac and fox tribe as well as the quapaw dhegihan sioux tribe. dr. johnson-jennings is a choctaw nation indigenous clinical health psychologist, professor, and research center director. both dr.
jennings and johnson-jennings have decades of experience engaging with indigenous communities, organizations, and clinics across the us, new zealand, and canada and co-developing research projects. dr. little is a white ally, registered nurse, and professor who has worked with indigenous groups in food and health over the years. the first step in development was to determine whether a repository would provide the indigenous communities opportunities to access wise health practices, increase their cultural continuity, and support grassroots movements towards maximizing health. though formal interviews were initially proposed, key community stakeholders requested more informal interviews and focus groups in the form of talking circles; written field notes and observations were collected at these meetings. later, an online survey was completed and analyzed in a separate study. after receiving institutional review board exemption approval, between – , the pi and co-pi visited indigenous communities and events in the us, canada, and new zealand to conduct key informant interviews on traditional medicines and food, with a focus on identifying digital food data needs. participants were recruited via existing indigenous research partner referrals. these participants were seen as “experts” in indigenous food practices. the authors made contact via email or phone and requested an interview. after consent was given, data were documented via notes or audio recordings, which were destroyed after notes were taken. this part of the overall needs assessment included individual formal and informal interviews with indigenous food or food sovereignty experts and/or traditional food medicine experts from the united states (american indian, alaska native, and native hawaiian), canada (first nations), and new zealand (maori). professional backgrounds included a range of indigenous food activists, researchers, farmers, elders, and healthcare professionals.
age ranges were from to > . participants were selected based on referrals from pre-identified indigenous community leaders. the pis then contacted key informants via phone, in person, and/or at another food health meeting/conference. key informants included three elders, five persons working in healthcare or health research, two traditional healers, one program director, and two food sovereignty leaders and farmers. regarding food and digital repositories, the pis asked the following questions: (1) how could an online platform be deemed suitable for indigenous communities to catalogue food wisdom? (2) what types of information would be useful to classify? (3) what other related needs exist? two additional talking circles were held at international indigenous food-related conferences and meetings. the participants were asked to reflect on the above questions; the first two authors recorded field notes and analyzed results of individual interviews, as guided by the decolonizing (smith ) and two-eyed seeing frameworks (marsh et al. ; bartlett et al. ). the researchers sought to provide space for indigenous knowledges alongside any useful western knowledge that may assist them. they reviewed field notes, identified themes and domains, and subcategorized, drawing from the consensual qualitative research (cqr) approach (hill ). cqr is an inductive approach often used with small sample sizes and when open-ended interview questions are asked. the words were analyzed and multiple perspectives were considered. the first two authors coded all field notes and created domains and themes. the third author served as an external auditor. they used cross-referencing and reached consensus on the findings.

results

the research team identified and agreed upon the themes portrayed in table : (1) sharing ancestral food practice data, (2) needing indigenous community control of an online platform, and (3) the utility of wise food practices.
a number of domains, or subthemes, were also categorized. connection to land appeared important to participants, as most participants desired to conduct their individual interviews on the land, directly showing the growth of their food and/or offering food. each of these themes is addressed in the following sections.

table . summary of themes and domains using the consensual qualitative approach.

theme: appropriate use: sharing ancestral food and land practice data online
domains:
(a) learn/reconnect with traditional food and land practices within current climates
(b) revitalize land and food practices through knowledge accessibility and exchange in a digital multimedia format

theme: community control of indigenous knowledge (ik)
domains:
(a) indigenous community input and data management
(b) include elders’ knowledge in technology
(c) use multimedia, including videos and photographs
(d) use of geographic information system (gis) maps to portray data

theme: sharing wise practices
domains:
(a) share practices and stories with others
(b) learn regional wise practices
(c) learn global practices that relate to environmental changes and ecosystems
(d) use of wise practices for healing

(1) appropriate use: sharing ancestral food and land practice data online. sharing ancestral food and land practices online was considered appropriate. associated domains included (a) desiring to learn traditional food and land practices within current climates, and (b) desiring to revitalize land and food practices through knowledge accessibility and exchange in a digital format (table ). one participant noted that, simply put, “food is our medicine”, and we need to utilize online resources “to grow this knowledge and get back in touch with the earth”. overall, participants confirmed having access to online resources for other purposes and saw these resources as an appropriate medium for sharing iks.
even though participants discussed the need to share and access iks online, they also spoke about the importance of reconnecting with the land. one participant discussed the need for their youth and community to reengage with the land. another discussed the rainbow as “connecting the sky, water, and earth, which bring forth our kai (food)”. he then related the connections to food, water, and land by discussing the symbolism of animals within his culture. similarly, a participant stated that the octopus “represents our connection to the water and the lands, as well as our connections to each other as indigenous peoples”. she then went on to discuss the trade system of foods between indigenous peoples globally and the need to continue these exchanges by utilizing online resources. these experts also voiced a shared desire to reconnect with the land to revitalize traditional foods and knowledges. several participants further mentioned the need to discuss food within the context of changing climates.

(2) community control. the second theme highlighted the importance of community control of iks and ois, with domains including (a) indigenous community input and data management, (b) incorporating elder knowledge with technology, (c) the use of multimedia including videos and photographs, and (d) geographic information system (gis) maps to portray data. for instance, a participant argued: “the community doesn’t want our information taken and used, like usual. it would only be helpful if we had control over it and decided where that information went”. several participants discussed the fear of information being disseminated or misused without their control. it was suggested that including elder knowledge in technology would further cultural continuity and ensure that knowledge was displayed in culturally appropriate manners. “we really want our elders’ voices recorded and available to the youth. we are going to lose that if we don’t,” stated one participant.
overall, participants voiced the desire to share information online, mapping traditional lands and food sources, but also to limit more sensitive information to tribal members only.

(3) sharing wise practices. the third theme included the desire to identify and share wise practices through digital formats to improve health and wellbeing and to increase cultural continuity. this theme included the following domains: (a) sharing practices and stories with others, (b) learning regional wise practices, (c) learning global practices that relate to environmental changes and ecosystems, and (d) how wise practices are currently used for healing. several participants noted that wise practices were preferable to best practices. in fact, many indigenous participants argued that indigenous “ancestors knew best and what to keep us healthy”. they discussed the need for furthering the wise practices that “actually work” as opposed to colonial-based practices. “we need to tell our own food stories, and i wouldn’t mind learning what others are doing,” stated a participant. the participants discussed the need to re-story or to tell their views of food and health as informed by their iks and ois.

discussion

ancestral wisdom has guided indigenous peoples across generations into the digital age. while the utility of the web was perhaps prophesied long ago, this study illustrates that indigenous persons continue to envision the potential of digital platforms that provide opportunities to connect, reclaim, revitalize, and reinvent food wisdom. specifically, this study found that key indigenous community stakeholders from the us, canada, and new zealand shared the following views: (1) an online repository would be useful for sharing information about indigenous food and ways to improve their health and wellbeing, (2) indigenous communities need to remain in control of this knowledge, and (3) identifying and sharing wise practices may support cultural continuity by re-storying health.
these findings are among the first to demonstrate global indigenous community interest in utilizing an online repository to re-story and improve health and wellbeing by increasing cultural continuity. these findings will further help guide the development of a food wisdom repository for indigenous land and food practices.

digital platforms and sharing ik

digital platforms like the suggested repository can create innovative spaces for sharing ancestral wisdom, storytelling, and re-storying of indigenous teachings of health and wellbeing, as implied by our findings. digital tools provide an opportunity to share information in multimedia formats that are consistent with decolonizing methodologies (smith ). in this manner, indigenous cultural teachings and worldviews can be centered in a digital platform. this centering of iks has been done previously in other research. for instance, photovoice is a specific photographic technique that is compatible with indigenous approaches to furthering knowledges. it enables participants to record the strengths and concerns of their communities, promote critical group dialog, and influence policymakers (jennings et al. ; jennings et al. ; jennings and lowe ; wang and burris ). jennings and colleagues have shown that photovoice can assist in centering the voices of indigenous youth about foods in obesity interventions (jennings and lowe ; jennings et al. ). when communities tell their stories through digital means, western views of health are not imposed on them; instead, others can view the world as they see it. this can also occur within an online platform, as communities can post their views of healthy food practices, which may or may not differ from western views. an online digital platform can also host visual, oral, and written stories from indigenous community perspectives and reach a wide global audience rapidly.
when considering the need to appropriately share ancestral information online, our findings led the investigators to propose an online map of traditional food and land practices for specific geographical locations. as mentioned before, our study supported notions that indigenous groups living in the same ecoregion share more food and land practices than those indigenous people who may be linguistically and culturally related but live apart. thus, the online map will embed cultural narratives and storytelling around food practices for the region as dictated by elders and other knowledge keepers. at the same time, our findings further suggest that indigenous control of their knowledges is critical. thus, inputting knowledge cannot be overseen by those trained in western research paradigms alone, especially if indigenous participants seek to re-story views of their health from a story of deficit to one of strength. communities must be engaged throughout the process. our findings also indicate that indigenous communities wish to identify stories to be shared online and steward these within an open access platform. however, their stories need to retain their relational aspects. in particular, western research paradigms often devalue stories by dissecting them, reducing them to individual parts, and then examining the parts, while often ignoring the relationships between them. in contrast, storytelling is a characteristic of indigenous methodologies (kovach ), and a story, taken as a whole, has many complicated layers. if broken apart and disconnected from the people, relational stories lose their meaning. hence, stories require a relationship between the teller and receiver; they typically have a plot, an interrelated sequence of events. as wilson ( ) explains, knowing the storyteller allows the receiver to assess their credibility. the broader importance of stories lies in the rootedness of iks in land and place: “our way of mapping our territory is through our stories.
there is a story about every place. there are songs about each place. there are ceremonies that occur about those places. the songs, the stories, the ceremonies are our map” (p. , little bear ). elders and other knowledge keepers can help identify which stories to share. our findings further indicate that there is a need for including gis technology. gis software can facilitate mapping of ancestral and new stories that support healthy food practices while maintaining the relationships to the land and place. thus, an online map with the embedded stories could maintain relationality to place and provide ongoing feedback from the community. as guided by the desire to share food and land practices, the researchers will further map the proposed regions by ecosystems, such as the plains across north america, the woodland regions, the desert regions, coastal regions, etc. the authors will then link indigenous nations who may be further removed from one another. for instance, the anishinaabe peoples include multiple tribes who were separated during a mass migration, later during colonial assaults, and then more recently by government removal (american indian resource center ). thus, culturally and linguistically related indigenous nations exist across the americas (i.e., the ojibwe anishinaabe in canada and the us) and into mexico (the kickapoo anishinaabe). several key informants in this study indicated that these cultures still maintain similar cultural food and land practices. hence, the proposed online platform will connect similar tribes and their tribal food and land practices. indigenous stakeholders further wished to know how climate change and environmental pollution were affecting other indigenous groups and their food practices across the world. therefore, environmental changes must be considered alongside food practices and can serve to connect communities facing similar issues.
this will further facilitate knowledge sharing and wise practices that may assist others, as the repository will include blogs and discussion groups. our findings further indicate that elders’ knowledge is just as important as archival research findings. their stories should also be included in such digital platforms, as indicated by other research (marsh et al. ; satterfield et al. ). digital tools can address the need to maintain and utilize elder knowledge. for example, satterfield et al. ( ) indicate that elders remember a time before diabetes was rampant in indigenous communities; they hold knowledge that can guide programming and policy development by identifying the factors that promote health. in addition to their held knowledge, elders play an essential role in knowledge translation (ninomiya et al. ). through use of a shared online digital platform, indigenous knowledges will be centered, and indigenous elders will be sought to support cultural continuity through sharing food and land practices that were once disrupted. digitizing this knowledge keeps it available for future generations and furthers transmission across the generations.

indigenous communities maintaining control of ik

secondly, there exists a dire need for indigenous communities to maintain control of indigenous knowledges (ik) and original instructions (oi) on an online platform, given the potential for cultural appropriation and misuse by outsiders. indigenous community members wish to oversee both the input of data and online data management. therefore, the digital online platform should not be seen as a mere tool for outsiders to catalogue and categorize data, but should instead engage communities continually. the tribal health sovereignty model supports indigenous communities in maintaining control over their health and in taking charge in identifying specific indigenous views of health that may vary from western views (jennings et al. ).
similarly, indigenous data sovereignty (ids) “is a critical part of this re-storying because, over time, non-indigenous people with the power to select, record, and interpret data have colonized it . . . . information derived from implicitly or explicitly biased data will likely become single stories that range from liberating and empowering at one end to controlling and disempowering at the other” (johnson-jennings et al. ). hence, re-storying and revitalization of food and land practices can change the narrative from one of deficit (i.e., focused on health disparities such as obesity, cancer, and diabetes) to one of thrivance (i.e., focused on iks related to relationships with foods and lands, wise practices, and ancestral stories that continue to guide these relationships and health). drawing from a community-engaged framework, a community advisory council of elders from each region will be gathered to assist in identifying and stewarding the data. elders have vast sources of knowledge and are highly esteemed in indigenous communities. they can inform resilience strategies and share food and land narratives that may otherwise be missed (kahn et al. ). furthermore, elder advisory councils serve an important role in advising and informing researchers about the appropriate use of data, community considerations, and whether research is being conducted in a good way (christopher et al. ). elders can wisely assist with stewarding the data and addressing unforeseen community concerns as well. though the authors propose that elders and key stakeholders control and steward the data, this must be done in a manner that continually supports indigenous data sovereignty (ids) (for a more thorough discussion of ids and wise practices in relation to these findings and subsequent data collection, see johnson-jennings et al. ).
ids requires consulting with indigenous communities at each step of the project; yet, the very nature of an online platform emphasizes the developer and user as having little interaction once the platform is complete. indigenous methodologies support actively engaging elders and other knowledge keepers throughout all steps of developing an online digital repository and in maintaining ids. by using digital online software that can actively support ids, the authors aim to establish geographical and national-/tribal-level elders’ advisory councils to oversee the identification of iks, the most appropriate mediums for sharing, and the audiences with which to share. there are current examples of indigenous communities overseeing and stewarding their online data as a collective whole, which may be useful for the proposed repository. one available platform providing for online stewardship is the mukurtu software. using this software, indigenous communities can indigenize and digitize cultural protocols and secure specific web pages. designated community leaders can further add videos, photographs, stories, and/or other content as they desire (christen ). again, based on our findings, the repository needs to be fluid and include multimedia (including videos, photographs, gis maps, blogs, etc.) shared in a manner that is protective of iks, as well as ensuring that the community maintains ownership of all iks uploaded. an elders’ community advisory council in each region will be crucial in identifying who can manage the data, which data will be limited via passwords to local communities (including elders only or other specific sub-selected groups) or regional communities, and which data will be made publicly available.
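as an illustration only, the tiered access described above (data limited to elders only, to local communities, or to regional communities, and data made publicly available) can be sketched in a few lines of python. this is a minimal sketch under stated assumptions, not the repository's implementation and not the mukurtu software's interface: the names `Tier`, `Viewer`, `Item`, and `can_view` are hypothetical, and in practice the tier of each item would be assigned by the elders' advisory council.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    """Hypothetical access tiers assigned by an elders' advisory council."""
    PUBLIC = "public"
    REGIONAL = "regional"
    LOCAL = "local"
    ELDERS_ONLY = "elders_only"


@dataclass
class Viewer:
    """A person requesting access (hypothetical model)."""
    name: str
    regions: set = field(default_factory=set)      # regional communities
    communities: set = field(default_factory=set)  # local communities
    is_elder: bool = False


@dataclass
class Item:
    """A story, video, map, or other record held in the repository."""
    title: str
    tier: Tier
    region: str
    community: str


def can_view(viewer: Viewer, item: Item) -> bool:
    """Return True if the viewer may see the item under its assigned tier."""
    if item.tier == Tier.PUBLIC:
        return True
    if item.tier == Tier.REGIONAL:
        return item.region in viewer.regions
    in_community = item.community in viewer.communities
    if item.tier == Tier.LOCAL:
        return in_community
    # ELDERS_ONLY: restricted to elders of the owning community
    return in_community and viewer.is_elder


# illustrative records; the titles and community names are made up
garden = Item("community garden story", Tier.PUBLIC, "plains", "community_a")
plants = Item("plant knowledge", Tier.LOCAL, "plains", "community_a")
songs = Item("songs related to place", Tier.ELDERS_ONLY, "plains", "community_a")

visitor = Viewer("visitor")
member = Viewer("member", {"plains"}, {"community_a"})
elder = Viewer("elder", {"plains"}, {"community_a"}, is_elder=True)
```

the design choice in this sketch is that every restriction is tied to membership in the owning community or region, so that control stays with the community rather than with the platform developer.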
examples of publicly available data may be community stories of successful gardening projects, organizations that increase healthy food access, or stories that relate to place and are deemed appropriate to share with all, whereas password-protected data may include medicinal plants and their uses, stories specific to iks of the region, and songs related to place. again, the community will be in control of uploading this data, stewarding it through a community-appointed elders’ advisory council, and being able to remove content at will. though data storage of these geographic regions and food and land practices will initially occur at the university, a more suitable long-term community site will be found as advised by the elders’ advisory council.

wise practice sharing

our final finding implies that indigenous communities desire to share wise practices, rather than best practices, in order to learn about and improve food and health. indigenous communities often consider wise practices culturally appropriate and more likely to lead to sustained change given that they center indigenous worldviews and epistemologies (see johnson-jennings et al. ; jennings et al. ; satterfield et al. ). privileging indigenous worldviews remains important because ontological and epistemological beliefs drive methodology and methods, which then affect the credibility of results and their translation into evidence-based best practices (wilson ). likewise, credible indigenous wise practices arise from indigenous scientific research paradigms by way of their ontologies, epistemologies, methodologies, methods, results, and knowledge translation. each paradigm has its own ways of realizing its very divergent aims: generalizability versus sovereignty and self-determination. at the same time, global practices, healing, and a focus on environmental changes need to be included in the repository as desired by indigenous communities.
therefore, the food wisdom repository will be international in nature and will call for collective iks and ois sharing global wise practices that support thriving in the current world. through sharing these practices, indigenous communities may further their revitalization and reclamation of food practices in order to facilitate health and wellbeing.

conclusions

this study found that an online food wisdom repository may contribute to the health and wellbeing of indigenous peoples by supporting cultural continuity. wise practices that include ancestral wisdom and original instructions can guide cultural continuity of indigenous cultures and ever-evolving iks, as they have for centuries. even though colonization forced indigenous persons into a survival mode in which some practices were taken, forgotten, or stagnated and which did not allow evolution with changing technologies and times, indigenous communities have retained much knowledge that they now wish to share with others. this era of thrivance supports asserting data sovereignty and reclaiming, revitalizing, and reinventing iks around cultural food and land practices through new technologies. however, indigenous groups must remain engaged and in control throughout each step.

author contributions: d.j. and m.j.-j. oversaw the research design, data collection, and analysis. they further contributed to the writing of the original paper. they contributed equally to the overall project as pis. m.l. contributed as a co-investigator in verifying the data, contributed to the original paper, and provided overall administration for manuscript submission. all authors have read and agreed to the published version of the manuscript.

funding: this research was funded by the shakopee mdewakanton sioux seeds of native health food repository development award.

conflicts of interest: the authors declare no conflict of interest.

references

american indian resource center. . anishinaabe timeline.
available online: https://www.bemidjistate.edu/airc/community-resources/anishinaabe-timeline/ (accessed on april ).
bartlett, cheryl, murdena marshall, and albert marshall. . two-eyed seeing and other lessons learned within a co-learning journey of bringing together indigenous and mainstream knowledges and ways of knowing. journal of environmental studies : – . [crossref]
center for digital scholarship and curation at washington state university. . mukurtu. available online: https://mukurtu.org/about/ (accessed on december ).
champagne, duane. . social change and cultural continuity among native nations. lanham: altamira press. available online: https://primo.lib.umn.edu/primo-explore/fulldisplay?docid=umn_alma &context=l&vid=twincities&lang=en_us&search_scope=mncat_discovery&adaptor=localsearchengine&tab.=article_discovery&query=any,contains (accessed on march ).
chandler, michael j., and christopher lalonde. . cultural continuity as a hedge against suicide in canada’s first nations. transcultural psychiatry : – . [crossref]
christen, kimberly a. . mukurtu: an indigenous archive and publishing tool. humanities commons. [crossref]
christopher, suzanne, robin saha, paul lachapelle, derek jennings, yoshiko colclough, clarice cooper, crescentia cummins, margaret j. eggers, kris fourstar, and lennie webster. . applying indigenous community-based participatory research principles to partnership development in health disparities research. fam community health : – . [crossref]
donelle, lorie, and laurie hoffman-goetz. . an exploratory study of canadian aboriginal online health care forums. health communication : – . [crossref]
hennessy, kate, natasha lyons, stephen loring, charles arnold, mervin joe, albert elias, and james pokiak. . the inuvialuit living history project: digital return as the forging of relationships between institutions, people, and data. museum anthropology review : – .
available online: https://scholarworks.iu.edu/ journals/index.php/mar/article/view/ / (accessed on february ). hill, clara e. . consensual qualitative research: a practical resource for investigating social science phenomena. worcester: american psychological association, available online: https://www.apa.org/pubs/books/ (accessed on february ). https://www.bemidjistate.edu/airc/community-resources/anishinaabe-timeline/ https://www.bemidjistate.edu/airc/community-resources/anishinaabe-timeline/ http://dx.doi.org/ . /s - - - https://mukurtu.org/about/ https://primo.lib.umn.edu/primo-explore/fulldisplay?docid=umn_alma &context=l&vid=twincities&lang=en_us&search_scope=mncat_discovery&adaptor=localsearchengine&tab.=article_discovery&query=any,contains https://primo.lib.umn.edu/primo-explore/fulldisplay?docid=umn_alma &context=l&vid=twincities&lang=en_us&search_scope=mncat_discovery&adaptor=localsearchengine&tab.=article_discovery&query=any,contains https://primo.lib.umn.edu/primo-explore/fulldisplay?docid=umn_alma &context=l&vid=twincities&lang=en_us&search_scope=mncat_discovery&adaptor=localsearchengine&tab.=article_discovery&query=any,contains http://dx.doi.org/ . / http://dx.doi.org/ . /m q x http://dx.doi.org/ . /fch. b e f http://dx.doi.org/ . / https://scholarworks.iu.edu/journals/index.php/mar/article/view/ / https://scholarworks.iu.edu/journals/index.php/mar/article/view/ / https://www.apa.org/pubs/books/ genealogy , , of hunter, jane. . the role of information technologies in indigenous knowledge management. australian academic & research libraries : – . [crossref] jennings, derek, and john lowe. . photovoice: giving voice to indigenous youth. pimatisiwin: a journal of aboriginal and indigenous community health : – . available online: http://www.pimatisiwin.com/ online/wp-content/uploads/ / / jennings.pdf (accessed on february ). jennings, derek, meg m. little, and michelle johnson-jennings. . 
developing a tribal health sovereignty model for obesity prevention. progress in community health partnerships : – . [crossref] jennings, derek r., koushik paul, meg m. little, daryl olson, and michelle d. johnson-jennings. . identifying perspectives about health to orient obesity intervention among urban, transitionally housed indigenous children. qualitative health research : – . [crossref] johnson-jennings, michelle, derek jennings, and meg m. little. . indigenous data sovereignty in action: the food wisdom repository. journal of indigenous wellbeing. te mauri pimatisiwin : – . available online: https://journalindigenouswellbeing.com/media/ / / . .indigenous-data-sovereignty-in- action-the-food-wisdom-repository.pdf (accessed on july ). johnson-jennings, michelle, derek jennings, koushik paul, and meg little. forthcoming. an indigenous food wisdom repository: needs and uses for digital indigenous food knowledge and practices. kahn, carmella b., kerstin reinschmidt, nicolette i teufel-shone, christina e oré, michele henson, and agnes attakai. . american indian elders’ resilience: sources of strength for building a healthy future for youth. american indian and alaska native mental health research : – . [crossref] [pubmed] kovach, margaret. . emerging from the margins: indigenous methodologies. in research as resistance: revisiting aritical, indigenous, and anti-oppressive approaches, nd ed. edited by leslie brown and susan strega. toronto: canadian scholars’ press, pp. – . available online: https://books.google.com/books?hl=en& lr=&id= unvcgaaqbaj&oi=fnd&pg=pa &dq=decolonizing+methodologies&ots=wmy xv tfk& sig=gowqn-k toscfkipdexay molca#v=onepage&q=decolonizingmethodologies&f=false (accessed on march ). little bear, leroy. . aboriginal relationships to the land and resources. in sacred lands: aboriginal world views, claims, and conflicts. edited by jill oakes, rick riewe, kathi kinew and elaine l. maloney. edmonton: the university of alberta press, pp. – . 
available online: http://www.uap.ualberta.ca/titles/ - -sacred-lands (accessed on february ). lynn, kathy. . the impacts of climate change on tribal traditional foods. climatic change : – . [crossref] marsh, teresa naseba, sheila cote-meek, pamela toulouse, lisa m. najavits, and nancy l. young. . the application of two-eyed seeing decolonizing methodology in qualitative and quantitative research for the treatment of intergenerational trauma and substance use disorders. international journal of qualitative methods : – . [crossref] martin, karen, and booran mirraboopa. . ways of knowing, being and doing: a theoretical framework and methods for indigenous and indigenist re-search. journal of australian studies : – . [crossref] moreton-robinson, aileen, jean o’brien, and chris. andersen. . relationality: a key presupposition of an indigenous social research paradigm. london: routledge. mould, tom. . choctaw prophecy: a legacy for the future. tuscaloosa: university of alabama press. nelson, melissa k. . original instructions: indigenous teachings for a sustainable future. rochester: bear & company. ninomiya, melody m., donna atkinson, simon brascoupé, michelle firestone, nicole robinson, jeff reading, carolyn p. ziegler, raglan maddox, and janet k. smylie. . effective knowledge translation approaches and practices in indigenous health research: a systematic review protocol. systematic reviews : . [crossref] [pubmed] oster, richard t., angela grier, rick lightning, maria j. mayan, and ellen l. toth. . cultural continuity, traditional indigenous language, and diabetes in alberta first nations: a mixed methods study. international journal for equity in health . [crossref] [pubmed] saslis-lagoudakis, c. haris, julie hawkins, simon j. greenhill, colin a. pendry, mark f. watson, will tuladhar-douglas, sushim r. baral, and vincent savolainen. . the evolution of traditional knowledge: environment shapes medicinal plant use in nepal. 
proceedings of the royal society b biological sciences : . [crossref] [pubmed] http://dx.doi.org/ . / . . http://www.pimatisiwin.com/online/wp-content/uploads/ / / jennings.pdf http://www.pimatisiwin.com/online/wp-content/uploads/ / / jennings.pdf http://dx.doi.org/ . /cpr. . http://dx.doi.org/ . / https://journalindigenouswellbeing.com/media/ / / . .indigenous-data-sovereignty-in-action-the-food-wisdom-repository.pdf https://journalindigenouswellbeing.com/media/ / / . .indigenous-data-sovereignty-in-action-the-food-wisdom-repository.pdf http://dx.doi.org/ . /aian. . . http://www.ncbi.nlm.nih.gov/pubmed/ https://books.google.com/books?hl=en&lr=&id= unvcgaaqbaj&oi=fnd&pg=pa &dq=decolonizing+methodologies&ots=wmy xv tfk&sig=gowqn-k toscfkipdexay molca#v=onepage&q=decolonizingmethodologies&f=false https://books.google.com/books?hl=en&lr=&id= unvcgaaqbaj&oi=fnd&pg=pa &dq=decolonizing+methodologies&ots=wmy xv tfk&sig=gowqn-k toscfkipdexay molca#v=onepage&q=decolonizingmethodologies&f=false https://books.google.com/books?hl=en&lr=&id= unvcgaaqbaj&oi=fnd&pg=pa &dq=decolonizing+methodologies&ots=wmy xv tfk&sig=gowqn-k toscfkipdexay molca#v=onepage&q=decolonizingmethodologies&f=false http://www.uap.ualberta.ca/titles/ - -sacred-lands http://www.uap.ualberta.ca/titles/ - -sacred-lands http://dx.doi.org/ . /s - - - http://dx.doi.org/ . / http://dx.doi.org/ . / http://dx.doi.org/ . /s - - -x http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /s - - - http://www.ncbi.nlm.nih.gov/pubmed/ http://dx.doi.org/ . /rspb. . http://www.ncbi.nlm.nih.gov/pubmed/ genealogy , , of satterfield, dawn, john eagle shield, john buckley, and sally taken alive. . so that the people may live (hecel lena oyate ki nipi kte): lakota and dakota elder women as reservoirs of life and keepers of knowledge about health protection and diabetes. journal of health disparities research and practice : – . available online: http://digitalscholarship.unlv.edu/jhdrp/vol /iss / / (accessed on march ). 
satterfield, dawn, lemyra debruyn, marjorie santos, larry alonso, and melinda frank. . health promotion and diabetes prevention in american indian and alaska native communities—traditional foods project, – . mmwr supplement : – . [crossref] smith, linda tuhiwai. . decolonizing methodologies, st ed. new york: zed books limited. thoms, j. michael. . leading an extraordinary life: wise practices for an hiv prevention campaign with two-spirit men. toronto. available online: http:// spirits.com/pdfolder/extraodinarylives.pdf (accessed on march ). tupper, jennifer. . social media and the idle no more movement: citizenship, activism and dissent in canada. journal of social sciences education : – . [crossref] walters, karina l., michelle johnson-jennings, sandra stroud, stacy rasmus, billy charles, simeon john, james allen, joseph keawe’aimoku kaholokula, and mele a. look. . growing from our roots: strategies for developing culturally grounded health promotion interventions in american indian, alaska native, and native hawaiian communities. prevention science : – . [crossref] wang, c. c., and mary ann burris. . photovoice: concept, methodology, and use for participatory needs assessment. health education & behavior : – . [crossref] wesley-esquimaux, cynthia, and brian calliou. . best practices in aboriginal community development: a literature review and wise practices approach. banff. available online: https://www.researchgate.net/profile/brian_calliou/publication/ _best_practices_in_ aboriginal_community_development_a_literature_review_and_wise_practices_approach/links/ c a ef dfa /best-practices-in-aboriginal-community-development-a (accessed on march ). wilson, shawn. . research is ceremony: indegenous research methods. winneoeg: fernwood. © by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /). 
unam

first let me acknowledge and thank dra. elsa ramírez leyva, and the organizing committee for inviting me to speak with you today. i am honored to be here and to present ideas and projects my colleagues and i have undertaken at the ucla library. the university where i work, the university of california los angeles, is a large, public research institution. the ucla library is one of libraries within the university of california system, a system that collectively serves , students and faculty. i want to acknowledge the privileges that my background--earning a ph.d.
at uc berkeley, and working at ucla--has provided, and how that has shaped my perspective and led to this opportunity to speak with you today. i invite you to share your experiences and ideas during the question and answer period in order to broaden my own awareness and foster an open exchange of other perspectives.

my position within this large organization is as a front-line librarian in a public service unit. i lead a small team of librarians who dedicate % of their time to very small programs we develop to deepen connections with researchers at ucla. i have worked in the charles e. young research library since , when i accepted my position as librarian for digital research and scholarship, managing a renovated area known as the research commons. i am the liaison librarian to the anthropology department and the digital humanities program; ucla has a digital humanities minor and a graduate certificate program.
because of my background in digital humanities and my transition from a position at ucla’s center for digital humanities to working in the library, i co-founded a special interest group on libraries and digital humanities in the alliance of digital humanities organizations (adho). (i will talk more about this sig towards the end of my talk.)

with that background in mind, allow me to jump right in with an example of how the research partnerships functional team works to build communities of practice that support the research of faculty and students at ucla. as you’ll see, i am very much a ground-up builder, and a visual thinker, so bear with me as i start with an everyday, familiar anecdote and then, hopefully, draw connections to the larger picture.

yesterday, prof. todd presner, the chair of the digital humanities program, forwarded an email he received from a ucla graduate student in film and media studies asking for presner’s advice. brief synopsis: the student is working on his dissertation on silent and early sound cinema. his dissertation advisor, a professor in cinema & media studies, recommended that the student reach out to presner for advice on how to analyze an informal survey that a cinema journal of the period conducted, asking readers to list their favorite films. the student was hoping to create “visually appealing statistical graphics” that he might use for conference presentations and job talks!

these types of requests have, in the past, been very problematic, but let me tell you the happy ending, so you’ll appreciate the problems we’ve been working to solve. presner forwarded the message with a note to the student saying: “i am taking the liberty of cc-ing a colleague at the ucla digital library, who is particularly informed about digital methods and presentations of statistical data, dr. zoe borovsky. i would try to set up a meeting with her to discuss your data and possibilities for analysis.
she oversees a summer accelerator program, which may also offer future opportunities for support.” presner included a link to our website: http://dressup.library.ucla.edu/team/

here’s our website: our students use it to practice their github skills, so it’s always a bit chaotic. i am going to describe our dressup program fully a bit later: dressup stands for digital research start-up partnerships, and it is part of this community of practice, an ecosystem of support that i’ve developed at ucla.

before dressup, requests like this student’s were a growing concern for the digital humanities faculty and research staff at ucla. demand for research support outside of the program has grown at a faster pace than we can support. faculty teach capstones, required for graduation, as overloads. dh faculty who rely upon expert staff such as myself to teach hands-on workshops for their courses (workshops such as text-mining, network analysis, gis and mapping) are often competing for resources with students such as the one making this request. while there are well-funded units for instructional support, and students enrolled in dh courses get some support, it’s the graduate students at ucla in departments such as film and television, or even those in the social sciences division, who have the least support for digital research.

so here’s my problem statement: the cinema & media studies professor and student, on one side, have seen these amazing projects that presner and digital humanities students do, but they don’t even think to look for that type of expertise in the library. presner knows how to direct them to the program that we’ve built, but the challenge for the library is how to make the library’s role in the research process more apparent, more visible.
obviously, it’s not only a problem that our users don’t come to us directly; in order to argue for more resources from campus administrators, and to get the support of our development officers and donors, we need to amplify our impact. while the library renovation in provided spaces to showcase and demo projects, the library’s role (other than furniture and large monitors) was largely hidden. that was the motivation, back in , to launch dressup. dressup stands for digital research start-up partnerships -- it was designed to put the library, librarians, and library staff back in the loop, a way of showcasing them as active partners throughout the research process. i believe that, especially with digital research projects, when researchers create and curate their own collections, library expertise (scanning, ocr, metadata, data cleaning and curation, text-mining, archiving) is especially relevant.

the “ecosystem” that research partnerships built has the following main components:

1. dressup: the incubator was built first, in . it’s a six-week, summer-intensive program for graduate student researchers.
2. in , we began adding in some train-the-trainer components, with a small grant from the librarians association of the university of california (lauc).
3. in , we launched a series of workshops and events, based on workshops we developed in dressup, to meet the needs of researchers. the workshops are offered during the academic year, and are intended for a broader audience.

next, i’ll describe all three parts of our ecosystem! i’ll start with dressup, and spend the most time describing that, because workshops are an established and familiar way that libraries have provided research support. dressup, because it is high-touch and labor-intensive, is the more controversial part of our program.
digital research start-up partnerships (dressup) is a framework for engaging researchers with the library from the inception through the active stages of the digital research life-cycle: planning, collecting & curating, cleaning & refining, analyzing & visualizing, and sharing.

we ( library staff) have, over the last two years, operated on a shoestring budget. our university librarian provides about $ k so we can hire a graduate student assistant. we put out an annual call for proposals to graduate students in the humanities and social sciences; students apply and we accept up to proposals. over a -week period, we work with these students to “incubate” their projects. during the first weeks, we tailor workshops to their interests, following the framework. after weeks, the students should have a defined workflow that describes and documents their process through those stages. their mid-term is a report on the areas they want help addressing as they scale up from the sample to the whole. during the final weeks, we meet with them in groups and individually to focus on specific aspects of their research projects.

here’s an example of the type of projects we undertake: nina was an urban planning grad student working on her dissertation. she had collected months of twitter data related to sexual harassment; she was using atlas.ti during her coding process and had lost sight of the big picture. we showed her how to export the data out of atlas.ti and provided her with alternative ways of visualizing and analyzing her data. she was able to finish her dissertation much more quickly than her advisors had expected. nina is now a lecturer at cal state long beach, and she comes back to lecture at ucla and talk with students in the dressup program.

of course, students love dressup: it’s project-based learning, and they learn research data management as it applies to their project. they meet librarians with expertise they did not know existed!
we are able to demonstrate impact with success stories:

• grad students at the beginning of their programs have received grants (nsf, ford fellowships, etc.) that provide fellowship support during grad school.
• teaching: they use their projects or skills to teach undergraduate courses, give guest lectures for dh , or work in the undergraduate research center. some teach workshops as part of our workshop series, which i will talk about later.

with the principles of minimal computing and minimal design in mind, we embrace a philosophy of “minimalism,” especially where it helps to keep us nimble and sustainable. we aim to keep the program small in scale, but portable, modular, extensible, and reproducible. we are often asked, “but how do you plan to scale up so that you can teach all the things to all the people?!” the answer is, “we don’t!” our intention is to keep the summer incubator program small and self-selecting, with a focus on reproducibility, not on scaling up. our lessons are modular; we repurpose them during the academic year as part of our workshop series, which i’ll describe later.

following this “minimal” and “modular” approach allows us, as instructors, to move away from the generic, one-size-fits-all workshop, avoid burnout, and, during the summer, offer a more tailored experience that embraces the “boutique.” we are creating a growing curriculum that is ready to cover all manner of digital scholarship needs (from project planning and management to collecting, cleaning, and analyzing data); then we flex to fit each dressup cohort’s needs. for example, we might focus more or less on web scraping versus getting data out of licensed databases during the collection phase, as needed. this year, several researchers have projects involving oral histories, so we invited the director of our center for oral history research to offer a session dedicated to conducting and working with oral histories in research.

at this point, you may be asking: what’s different about dressup?
why not offer fellowships as many other research centers do? dressup takes a partnership approach to incubating digital research projects, rather than a fellowship model. we do not do projects for the participants, we provide an environment and infrastructure for them to learn to do their own projects. participating graduate students are not paid and they bring their own projects to the program. in this model, there is the expectation that participants put in the work they need to define and complete a portion of their own research project. when we say “partnerships”, we don’t mean that we are acting as partners on their research projects - rather it is a learning and teaching partnership. graduate students and librarians set the agenda together and the participants have the opportunity to share their own knowledge and skills with the group. this strategy is targeted to and works well with graduate students. within the ucla community, there is a noticeable gap in these types of services so the grad students are eager and motivated participants. dressup emphasizes process over product or tools. we encourage participants to focus on generating prototypes using data sets curated to the participants’ research question. in this way, we are teaching research as process – a framework that can be generalized and adapted to other research areas including grant writing, non-digital research methods, and professional activities. pictured is an example of the brainstorming/planning process that we use at the beginning of dressup. participants further define their research questions, then outline the project inputs, processes, and outputs. as they learn new methods and tools, and work with their sample data set, they revise this plan and document their workflow. one of the primary goals of the program is to build community and communities of practice around digital research at ucla. 
to this end, dressup provides graduate students with a cohort in which they can share and discuss their work with each other. the graduate students who participate also give back and contribute to the community by returning to teach or guest lecture for the next cohort of graduate students. they also take what they know into their departments and academic communities, thereby serving as vectors who spread knowledge about digital research practices and library expertise to faculty, fellow students, and other peers. likewise, librarians who join the dressup team take what they’ve learned and give back to the community by offering workshops and consultations, and by bringing their new expertise into the units they serve. in this way, we are growing the library’s capacity while also growing the community itself.

that brings me to the next section: train-the-trainer. the middle ground of the ecosystem has changed quite a bit, but i envision this as the trunk of the tree, the most important aspect of the program. it’s also the most difficult to sustain in an environment such as ucla, where research support is decentralized and where we rely on partners outside of the library for support.

we called this train-the-trainer effort “sharing the incubator” because, initially, we developed dressup as a local, sustainable model for ucla library staff to engage with researchers at an early stage of their careers. in addition to incubating graduate student projects, we are incubating an infrastructure that demonstrates to our colleagues how (although we all work in separate departments) we can work together as a team across library departments. during a restructuring of public services, we institutionalized that infrastructure and created the research partnerships functional team as one of functional teams across user engagement. (the other teams are teaching & learning, research assistance, etc.)
the research partnerships functional team has librarians inside user engagement, but also members from other library units and members from partner organizations at ucla: centralized ucla it organizations, the center for digital humanities, etc.

dressup is a high-touch, quality-over-quantity project, with a relatively equal number of librarians to grad students. benefits of this approach: each partnership is a detailed case study of researcher needs that a subject librarian, on their own, would not tackle. even better, it helps us see the edges of our own capacity too; it’s a self-assessing process.

as we spoke with our colleagues about dressup, they expressed interest in participating. this led us to develop a separate training program (dressup for librarians): librarians requested release time from their supervisors, spent spring quarter learning about digital research tools and methodologies, then worked with us during the summer. the release time for librarians was difficult, but ultimately the train-the-trainer effort has evolved into the research partnerships functional team. the librarians who participated in the train-the-trainer program are now members of the research partnerships functional team. we’re framing this as a success: we now have a formalized team, and the official support of an aul. we also obtained a seed grant for $ k from the vice chancellor of research, and began planning our next steps: the workshop series.

oftentimes, the ethos of librarianship presents it as a service model, and we want to challenge that, or at least add to that. we’re looking to enrich the profession, and give librarians good reasons to participate. within the library, we have had to argue for quality of experience over quantity (for example, of student consultations, etc.): dressup is professional development for us, the core team, and for our partner librarians.
the ecosystem helps the library by building capacity to meet researcher needs. as i mentioned earlier, researchers are our partners rather than fellows; they’re learning alongside us. the ecosystem, with the incubator in-house, provides a valuable, visible learning environment for library staff who want to skill up by building their own capacity and expertise, and hone new skills in a team-based environment. this way, librarians are active throughout the research cycle, not just in resource discovery and publishing. although we still send librarians to external training programs, the train-the-trainer component provides an immediate, practical application by immersing librarians wishing to learn skills in our team, helping them to forge ongoing, supportive relationships inside and outside the library. and, as we planned our next stage, the workshops, we realized we had lots of opportunities for librarians to co-lead workshops, develop materials, etc. they could build capacity at their own pace.

i want to mention that capacity building among our colleagues has been our biggest challenge, and the goals i’ll mention here are aspirational:

• move beyond one-shot instruction sessions and workshops, engaging, actively, as a team, in all phases of the research process.
• rethink divisions of library staff into units that are either designated as public service (user engagement) or are not (digital library).
• do this by providing ways for a broader range of library staff to engage with researchers.
• build community gradually (and organically) while assessing capacity/demands amongst our colleagues and constituencies.
• view the library as a collection of expert consultants who guide the research process as users and user communities assemble and analyze research collections or build upon/utilize the library collections.

let’s talk about workshops, which have been the focus of the research partnerships functional team during the academic year.
let me return to the request from the student, the cinema & media studies grad student who wrote to todd presner. did he have to wait for the summer to get started? no, because research partnerships offers workshops: i was able to send him the schedule, and he’s signed up for a tableau public workshop.

the value of the workshops to staff (and faculty such as presner) is that we’re not offering as many one-off consultations. those can be informative, but what we’ve seen, as staff, is that if we offer core skills in workshops, we save each other’s time, as well as the time of faculty members. we’ve held workshops in the library and brought a whole new set of users into the library. we had users sign up for winter workshops. the users came from all over campus, so we know that it’s not just dh students, but students from epidemiology, the anderson business school, urban planning, spanish and portuguese, etc.

to wrap up this presentation, i want to talk about sustainability: what makes this model sustainable within ucla, and externally, beyond ucla. internally, we took a hard look at why we, the staff supporting research (the dh program, as well as researchers outside of dh), were so exhausted. and, specifically, we looked for ways to address our collective lacks. a centralized calendar for workshops, and promotion in general, was one of those lacks. rather than forming a new task force and asking for a budget, we draw upon existing partnerships to distribute the work. our partners at idre used my team in the library, the research partnerships functional team, as a large stakeholder to argue for internal resources at idre to drive their endeavors. in many ways, the research partnerships functional team within the library acts as a broker for a larger co-op or matrix: one that runs on trust and partnerships between silos and divisions. we work together on grant proposals, coordinate events, and work to align resources with needs.
this is a little microcosm or model of working (building those communities of practice!!) that we then extend outward. in the same way that grad students act as our vectors, we’ve vectorized library staff! many of you are probably familiar with the external organizations that sustain library efforts such as ours. this slide is supposed to look like chaos! the problem at ucla has been that because research support is decentralized, coordinating external partnerships (in addition to internal ones) was contributing to exhaustion. but since i mentioned adho, and the special interest group on libraries and dh, i’ll use that as an example. adho, and specifically the special interest group on libraries and dh, has been an important way of sustaining our efforts at ucla as dressup evolved. we needed a light-weight, easy-to-maintain connection with colleagues, with:
- no heavy administrative responsibilities
- no big barriers for participation
- visibility for librarians’ research and initiatives
adho was great because it was designed to be a lightweight alliance of regional dh organizations and there’s less need for a lot of committee work. adho sponsors special interest groups but allows us to be very flexible. sigs work mainly to sponsor workshops and organize pre-conference events. the libraries and dh sig is designed as the “connective tissue” between
1) professional library organizations (ifla: int’l federation of library associations and institutions)
2) professional dh organizations (ach: association for computers and the humanities)
there are many others i could mention here; i’ve listed a few above. the goal, though, is to provide enough of an exchange that we can identify initiatives that push librarians beyond the service model of supporting individual projects, toward larger initiatives that allow us to build cross-institutional infrastructure. i’ll end with just a few examples:
• initiatives that span institutions: such as iiif.
our sig has sponsored pre-conference workshops at adho on
• collections as data
and ucla librarians have, ourselves, presented a paper on dressup at dh http://dressup.library.ucla.edu/posts/ / / /dh .html
next year, at adho’s conference at utrecht, our sig is planning to sponsor a pre-conference on libraries and dh: and the plans for that are just beginning. i promised to start at the ground up – and build from the tree to the broader view, from the anecdotal to the larger picture. so i’ll end with this image. and hope that this presentation will provide an invitation to walk with me a bit – on a pathway to talk about the bigger picture, or the root system, or other types of trees or forests that grow near you. thank you!

wadl workshop report
report on the workshop on web archiving and digital libraries (wadl )
edward a. fox, virginia tech, fox@vt.edu
mohamed m. farag, virginia tech, mmagdy@vt.edu
abstract
this workshop explored the integration of web archiving and digital libraries, so the complete life cycle involved is covered, from creation/authoring, uploading/publishing in the web (including web . ), (focused) crawling, curation, indexing, exploration (including searching and browsing), (text) analysis, archiving, and up through long-term preservation. it included particular coverage of current topics of interest: challenges facing archiving initiatives, archiving related to disasters, interaction with and use of archive data, applications on an international scale, working with big data, mobile web archiving, temporal issues, memento, and sitestory.
introduction
at the end of the acm/ieee-cs joint conference on digital libraries in indianapolis, the wadl workshop ran on the afternoon of thursday july and the morning of friday july , with most attendees also meeting for dinner on thursday. there were attendees: administrators, faculty, librarians, researchers, and students.
institutions represented included ball state university, brazilian development bank, harding university, los alamos national laboratory, old dominion university, stanford university, united nations, ucla library, and virginia tech. there were presentations, summarized in the next section, some given by groups. in addition, there were short personal introductions by other attendees, as well as a final plenary discussion, so everyone was engaged.
it became clear that web archiving is very important, adding a temporal dimension to the web. this is essential for historians, and for those in the future who will seek to understand the evolution of the modern world, since so much of the knowledge, culture, events, scholarship, and other activities of humanity often only has a fleeting presence on the web, and will be lost if there is not a comprehensive and systematic effort to develop reliable and persistent archives. further, to be useful, these must be supported by services enabling access, analysis, and interactive use.
it is a promising sign that the research, development, library, information science, computer science, and archiving communities are joining to address these challenges, which fit well with work on digital libraries and information retrieval. it is hoped that this report will inspire others to tackle the many challenges in web archiving that still need addressing.
acm sigir forum vol. no. december
presentations
. arcspread: enabling web archive analysis for non-cs experts, by andreas paepcke
andreas paepcke (stanford) presented a vision of how sociologists, political scientists, and historians might analyze web archives in the future. his project, arcspread [ ], designs and implements a spreadsheet-based approach to the problem. andreas worked through a hypothetical example using mockup components.
pieces of the three-tier architecture are implemented, but work remains around the interaction components, visualization tools, and the underlying distributed compute engine.
. applying web archives to real-time group source prediction of speech, by andreas paepcke
andreas paepcke also talked about one problem faced by a friend (henry). though he was an active and productive businessman, a sudden illness left him quadriplegic and unable to speak. his conversation partners would get bored waiting for him to finish typing with a head tracking device onto an onscreen keyboard. in an effort to engage these conversation partners, andreas and his team generated word trees that attempt to predict what henry is going to say, given his previously typed word [ ]. for the prediction statistics, they used three different underlying collections: henry’s web blog, a specialized crawl of m webpages, and a collection of k minute phone conversations. evaluation studies have compared the outcomes, so far purely regarding their effectiveness to produce word trees. discussion led to some suggestions for enhancements, extensions, and additional evaluation.
. united nations digital repository for un documentation, by ylva braaten
ylva braaten (united nations) talked about the work at the un library in new york, changing its indexing processes and moving to a digital repository for un documentation (parliamentary documents, conference related documents, publications, etc). she presented a status report on un movement towards more digital and automated processes, involving different departments, agencies, and duty stations.
. sitestory, archiving done differently, by martin klein and justin brunelle
martin klein (los alamos national laboratory) provided an overview of the sitestory [ ] concept and its functionality compared to crawler-based archiving. he discussed the sitestory approach and described various use cases.
he and justin brunelle further introduced the sitestory testbed and provided insight into novel results of benchmarking experiments.
. hiberlink, towards time travel for the scholarly web, by martin klein
martin klein also gave a brief overview of the recently launched hiberlink project [ ]. this project aims at quantifying the “citation rot” problem in scholarly articles, which is occurring at unprecedented scale, but also at proposing solutions for researchers and publishers to ensure the longevity of the content of research.
. temporal user intention modeling in social media, by hany salaheldeen
hany salaheldeen (old dominion university) talked about modeling temporal user intentions in social media. the web is stuck in the “perpetual now,” and web resources are prone to change, relocation, and deletion. an author could share a resource on his social network at a point in time in order to convey a certain message, having a specific intention in mind. after a period of time, if the state of the resource differed, the reader who reads the author’s post and examines the resource might not see and understand what the author intended. this change of intention and the resource’s state could cause a significant inconsistency in the published content. hany’s goal is to model the author’s intention across time, and make predictions to avoid the temporal inconsistency that might occur.
. needs and obstacles for a web archiving initiative at the ball state university libraries, by michael szajewski
michael szajewski (ball state university) summarized the digital library program at ball state university, including a contentdm and dspace repository. he focused on the opportunities for ball state university libraries to develop an initiative to capture, preserve, and provide access to archived dynamic web content.
he discussed requirements, needs, and expectations for such an initiative, including the ability for the program to clearly support teaching and scholarship in a pragmatic way and the ability to preserve student digital scholarship. obstacles at both the university and library levels were discussed.
. who and what links to the internet archive, by yasmin alnoamany
yasmin alnoamany (old dominion university) presented the results of research (with ahmed alsum, michele weigle, and michael nelson) on the internet archive’s wayback machine access logs, trying to answer questions such as what users are looking for, why they come to ia, where they come from, and how pages link to ia. her description of the most used languages found in web archives led to a broad discussion.
. web archiving profile overview, by ahmed alsum
ahmed alsum (old dominion university) presented his research results of profiling the existing web archives for top-level domains and content languages. he used these results to build a profile for each web archive and used these profiles to optimize the query routing for the memento aggregator.
. archiving the mobile web, by frank mccown, monica yarbrough, and keith enlow
frank mccown (harding university), with two students, explained that the web is going mobile, and archivists want to capture this ephemeral content before it disappears. but archiving the mobile web isn’t always as straightforward as it might seem; there are many hard research challenges. in his talk he shared his work on developing tools for web archivists to automate the discovery and archiving of mobile websites.
. temporal spread in archived composite resources, by scott ainsworth
scott ainsworth (old dominion university) talked about the temporal spread of archived content.
when a user retrieves a page from a web archive, the page is marked with the capture datetime of the root-level resource, which effectively asserts “this page looked like this at a particular point in the past.” however, the embedded resources, such as images and stylesheets, are nearly always archived at different times, although their capture time is not displayed to the user. the resulting presentation gives the appearance of a coherent, temporally aligned result, but can actually be composited from resources captured over a wide range of datetimes. he examined the temporal spread of composite archived resources (root plus embedded resources). he found that composite resources average . - . % complete and have a mean temporal spread of . - . days, depending on heuristic and source policy.
. the role of bndes in brazil preservation of heritage, by fernanda balbi
fernanda balbi (brazilian development bank) presented a brief explanation on the history of bndes and its role in brazilian development. she talked about the bndes performance in projects addressing preservation of the architectural heritage, preservation of collections, and strengthening of relevant cultural institutions, including digital library projects.
. measuring archivability of web resources and the damage when mementos are missing, by justin brunelle
justin brunelle (old dominion university), with slide title “how i spend my summer vacations,” discussed his current research efforts in measuring the archivability of web resources and measuring damage that occurs in mementos. the archivability portion of the talk discussed what makes pages more or less archivable and how archivability is changing over time. the damage portion of the talk discussed how we can measure memento importance and how we can measure the impact that a missing memento has on a larger resource.
.
crisis, tragedy, and recovery network digital library (ctrnet) + web archiving in qatar and vt, by edward a. fox, seungwon yang, and the ctrnet team
edward fox (virginia tech) talked about the ctrnet [ ] and ideal [ ] projects, as well as about archiving plans in qatar and at virginia tech. the ctrnet project [ ] has been archiving tweet and webpage collections related to various types of disasters that affect communities. research relates to social media use during crises, visualizing emergency phases or water main breaks, focused crawling, analyzing webpage collections with big data software, filtering using machine learning, and topic tagging. thanks to new support for the integrated digital event archiving and library (ideal) project [ ], the aims are being broadened to also encompass other types of events, e.g., political or community focused. in qatar, in collaboration with the national library, there are plans to deploy tools like seersuite, heritrix, and solr, to collect, archive, and make accessible both scholarly content and the broader web associated with the nation of qatar [ ]. at virginia tech, there are plans to further deploy heritrix, as well as sitestory and other tools, to help with campus web archiving.
discussion
after all the talks, there was a wide ranging plenary discussion. it became clear that investment in publishing on the web, and innovation in publishing approaches, are extensive. on the other hand, most organizations, and many researchers, consider web archiving as something that others will do for them. further, they are unaware that a very small community, with quite limited resources, is at work on methods for web archiving.
there was discussion of possible sponsorship to advance the state-of-the-art in web archiving, including development of more advanced methods and services. there was discussion of possible collaboration among the workshop attendees, since there are a number of related efforts.
good synergies seemed likely to enable some of the essential advances.
conclusions
there is a growing gap between methods for publishing/presenting on the web, and methods for archiving. there is need for funding to advance web archiving. there are many opportunities for information retrieval and digital library research to support web archiving.
for more information, please see the workshop announcement [ ] and the final website [ ], which includes powerpoint and other slides used for presentations, as well as audio recordings of presentations and discussion sessions. see also ahmed alsum’s trip report available as a blog posting [ ].
acknowledgments
thanks go to the organizing committee (paul bogen, tessa fallon, kristine hanna, eric hetzner, gina jones, martin klein, frank mccown, michael nelson, and andreas paepcke) for helping with the advertising, planning, and running of the workshop; the jcdl conference and organizing team also aided in these matters. thanks go to the ctrnet and ideal project teams for their assistance, and to virginia tech for hosting the workshop website. partial support was provided by the u.s. national science foundation through grants iis- and . additional support was provided by qatar through nprp - - - .
references
. ahmed alsum, - - : web archiving and digital libraries workshop - wadl trip report, blog article, old dominion university, august , http://ws-dl.blogspot.com/ / / - - -web-archiving-and-digital.html
. edward a. fox, web archiving and digital libraries (wadl ): a jcdl workshop (http://jcdl .org/workshops), webpage, virginia tech, , http://www.ctrnet.net/wadl
. edward a. fox and mohamed m. farag. wadl website, with presentation slides and audio recordings, virginia tech, . http://eventsarchive.org/?q=wadl
. edward a. fox and the ctrnet team. crisis, tragedy, and recovery network homepage. virginia tech, - . http://www.ctrnet.net
. edward a. fox and the ideal team.
events archive homepage. virginia tech, . http://www.eventsarchive.org
. edward a. fox and the qdl team. qatar digital library: helping build the future of digital libraries in qatar homepage. qatar university, - . http://qdl.qu.edu.qa/
. muriel mewissen. hiberlink: time travel for the scholarly web. edina, united kingdom, . webpage at http://edina.ac.uk/projects/hiberlink summary.html
. andreas paepcke and sanjay kairam. echotree: engaged conversation when capabilities are limited. . stanford infolab publication server. technical report and homepage at http://ilpubs.stanford.edu: / /
. sitestory team (lanl, odu). . sitestory web archive homepage at http://mementoweb.github.io/sitestory/
. siddhi soman, arti chharjta, alexander bonomo, and andreas paepcke. arcspread for analyzing web archives. . stanford infolab publication server. technical report and homepage at http://ilpubs.stanford.edu: / /

journal of functional morphology and kinesiology
review
eccentric training interventions and team sport athletes
conor mcneill *, c. martyn beaven, daniel t. mcmaster and nicholas gill
te huataki waiora school of health, adams centre, the university of waikato, tauranga, new zealand; martyn.beaven@waikato.ac.nz (m.b.); dmcmaste@waikato.ac.nz (d.t.m.); nicholas.gill@nzrugby.co.nz (n.g.)
new zealand rugby union, wellington, new zealand
* correspondence: cmfm @students.waikato.ac.nz
received: august ; accepted: september ; published: september
abstract: eccentric resistance training has been shown to improve performance outcomes in a range of populations, making it a popular choice for practitioners. evidence suggests that neuromuscular adaptations resulting from eccentric overload (eo) and accentuated eccentric loading (ael) methods could benefit athletic populations competing in team sports.
the purpose of this review was to determine the effects of eccentric resistance training on performance qualities in trained male team sport athletes. a systematic review was conducted using electronic databases pubmed, sportdiscus and web of science in may . the literature search resulted in initial articles, with included in the final analysis. variables related to strength, speed, power and change of direction ability were extracted and effect sizes were calculated with a correction for small sample size. trivial, moderate and large effect sizes were reported for strength (− . to . ), speed (− . to . ), power ( . to . ) and change of direction ( . to . ) outcomes. eccentric resistance training appears to be an effective stimulus for developing neuromuscular qualities in trained male team sport athletes. however, the range of effect sizes, testing protocols and training interventions suggest that more research is needed to better implement this type of training in athletic populations. keywords: eccentric; overload; training; athlete; team . introduction there is growing evidence in the literature that suggests eccentric resistance training is an effective stimulus for enhancing physical performance [ ]. during an eccentric contraction, kinetic energy is transferred and stored as elastic potential energy within the muscle tendon unit [ ], which can acutely enhance force production in the subsequent concentric contraction through the stretch-shortening cycle [ – ]. training methods that take advantage of the eccentric phase have been referred to in the literature as eccentric overload (eo) and accentuated eccentric loading (ael) [ – ]. these terms refer to the manipulation of force and time variables during the eccentric phase of exercise through the application of relatively high force or high velocity [ – ]. 
longitudinal eccentric training may benefit team sport athletes who are required to produce force quickly during rapid movement through favourable neuromuscular and morphological adaptations [ ]. the purpose of this systematic review was to investigate the effects of eo and ael on performance qualities in trained male team sport athletes.
j. funct. morphol. kinesiol. , , ; doi: . /jfmk www.mdpi.com/journal/jfmk
. materials and methods
. . search strategy
one reviewer conducted the literature search according to the preferred reporting items for systematic reviews and meta-analyses (prisma) guidelines for systematic reviews [ ]. the electronic databases pubmed, sportdiscus and web of science were searched up until may . no date ranges were imposed on the individual databases. search terms included: ‘eccentric exercise’, ‘eccentric training’, ‘eccentric contraction’, ‘strength’, ‘power’, ‘speed’, ‘velocity’, ‘force’, ‘hypertrophy’, ‘athletes’, and ‘team sports’. boolean operators ‘and’ and ‘or’ were used to combine key search terms. when applicable, filters were used during the initial literature search to identify relevant articles. full-text articles from peer-reviewed academic journals written in english were included, while articles involving animal (non-human), youth (< years old), and older (> years old) participants were excluded. once the initial search had been conducted, the articles were stored in reference manager software (zotero, version . . , corporation for digital scholarship, vienna, va, usa). duplicate articles were manually reviewed and merged using the included “duplicate items” function.
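as a rough sketch of this kind of duplicate check, the python below compares two records on title, doi, isbn, year and authors. it is not zotero's code: the `Record` type, the field names and the normalisation rules are assumptions made for illustration, and the example records are invented.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Record:
    """Hypothetical bibliographic record; not a Zotero data structure."""
    title: str
    year: Optional[int] = None
    doi: Optional[str] = None
    isbn: Optional[str] = None
    # Each author is a (last name, first name) pair.
    authors: List[Tuple[str, str]] = field(default_factory=list)


def _norm(text: Optional[str]) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical strings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", (text or "").lower()).strip()


def _field_agrees(a: Optional[str], b: Optional[str]) -> bool:
    """A field agrees when both values match or when either one is absent."""
    return not a or not b or _norm(a) == _norm(b)


def _author_keys(rec: Record) -> set:
    """Reduce each author to (last name, first initial)."""
    return {(_norm(last), _norm(first)[:1]) for last, first in rec.authors}


def likely_duplicates(a: Record, b: Record) -> bool:
    """Assumed reading of the rule: titles match, DOI/ISBN match or are absent,
    years are within one year, and at least one author key is shared."""
    if _norm(a.title) != _norm(b.title):
        return False
    if not (_field_agrees(a.doi, b.doi) and _field_agrees(a.isbn, b.isbn)):
        return False
    if a.year is not None and b.year is not None and abs(a.year - b.year) > 1:
        return False
    keys_a, keys_b = _author_keys(a), _author_keys(b)
    if keys_a and keys_b and not (keys_a & keys_b):
        return False
    return True


if __name__ == "__main__":
    first = Record("Eccentric Training: A Review", 2019, doi="10.1000/xyz",
                   authors=[("Smith", "John")])
    second = Record("eccentric training - a review", 2020,
                    authors=[("Smith", "J."), ("Lee", "Ann")])
    print(likely_duplicates(first, second))
```

a check like this only flags candidates; as in the review, the flagged pairs would still be inspected by hand before merging.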
from the company’s website, “zotero assesses records for duplicates based on the title, doi, and isbn fields to determine duplicates. if these fields match (or are absent), zotero also compares the years of publication (if they are within a year of each other) and author/creator lists (if at least one author last name plus first initial matches) to determine duplicates” (www.zotero.org). the titles and abstracts of the remaining records were then screened. articles not meeting the inclusion/exclusion criteria were removed, and the remaining records were assessed. those full-text studies meeting the eligibility criteria were then assessed for inclusion in the review. an additional search was carried out using the reference lists of articles; those records identified through the additional search were then subjected to the same systematic process. finally, all of the studies deemed to meet the criteria were assessed for methodological quality and included in the review.
. . eligibility criteria
studies meeting the following inclusion criteria were included in the review:
• participants were healthy, competitive, male team sport athletes above the recreational level (i.e., professional, national, elite) and were between and years of age.
• the sports included in the review following the screening process were basketball, soccer, handball, and rugby union.
• studies investigated the effects of longitudinal (≥three weeks) eo training interventions. eccentric training load (volume, intensity) needed to be quantified.
• data on at least one of the following outcome measures were reported: strength (e.g., rm, maximal voluntary contraction, peak torque), maximum sprint times (e.g., m, m, m sprint), power (e.g., jump height, rate of force development), and change of direction (e.g., t-test, cutting).
studies with the following exclusion criteria were not included in the review:
• participants were individual sport athletes (i.e., skiing, cycling, running) or untrained (students or with less than six months training experience). studies not listing the training experience/sport status of participants were also excluded.
• studies investigating male and female athletes were excluded if the results were not reported separately.
• the training intervention included injured participants.
• supplements or ergogenic aids were used in the intervention.
. . study selection
the eligibility assessment was performed in an unblinded manner by a single reviewer. the study selection process used is visually represented by figure . those studies identified through the initial search outlined above or identified through reference lists were then screened for eligibility criteria. if there was uncertainty about whether a study met the standard for inclusion, an additional reviewer (c. martyn beaven) was consulted and an agreement was reached (n = ).
figure . flow chart of the literature search process using the preferred reporting items for systematic reviews and meta-analyses (prisma) guidelines.
. . analysis of results
the remaining articles were evaluated using a item scale designed for exercise training studies [ ]. the goal of the scale was to assess the quality of strength and conditioning interventions, which might otherwise score poorly in assessments designed for healthcare research and interventions. this scale includes a item scale (range to ) designed for rating the methodological quality of exercise training studies (table ). two authors conducted the quality assessment independently; any discrepancies between the scores were discussed and a consensus was reached (n = ). the score for each criterion was as follows: = “clearly no”, = “maybe”, and = “clearly yes”. the items included: inclusion criteria were clearly stated; subjects were randomly allocated to groups; intervention was clearly defined; groups were tested for similarity at baseline; use of a control group; outcome variables were clearly defined; assessments were practically useful; duration of intervention was practically useful; between-group statistical analysis was appropriate; point measures of variability.
table . quality assessment for each study included in the analysis.
author | inclusion criteria | random allocation | intervention defined | groups tested for similarity at baseline | control group | outcome variables defined | assessments practically useful | duration of intervention practically useful | between-group stats analysis appropriate | point measures of variability
askling et al. ( )
brughelli et al. ( )
cook et al. ( )
de hoyo et al. ( )
de hoyo et al. ( )
iga et al. ( )
ishøi et al. ( )
krommes et al. ( )
maroto-izquierdo et al. ( )
mendiguchia et al. ( )
mjølsnes et al. ( )
sabido et al. ( )
suarez-arrones et al. ( )
sanchez-sanchez et al. ( )
one reviewer created a data extraction form based on several existing literature reviews [ , , , ] and variables of interest related to the research questions. this extraction form was created using microsoft excel (microsoft corporation, redmond, wa, usa). the reviewer then manually extracted data from each study for the following physical qualities: strength, speed, power and change of direction. where possible the mean, standard deviation, percent difference and effect size statistic were calculated. if the relevant information (sample size, standard deviation, change in means) was not available, then the authors’ reported values were used. effect size was calculated for each treatment to determine the magnitude of change in the outcome variable using the mean difference (mdiff), pooled pre- (sd ) and post-test (sd ) standard deviation and pre- and post-test sample size pairs (n) [ ].
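the effect size calculation described here, together with a small-sample (hedges) correction, can be sketched in python. this is a minimal illustration and not the authors' spreadsheet: it assumes the standardiser is the average of the pre- and post-test standard deviations and that the correction takes the common form 1 − 3/(4n − 9), where n is the sample-size term the review calls (n). the function names and the example numbers are made up.

```python
def cohens_d_av(mean_diff: float, sd_pre: float, sd_post: float) -> float:
    """Mean difference divided by the average of the pre- and post-test SDs (assumed form)."""
    average_sd = (sd_pre + sd_post) / 2.0
    if average_sd <= 0:
        raise ValueError("average standard deviation must be positive")
    return mean_diff / average_sd


def hedges_g_av(d_av: float, n: float) -> float:
    """Shrink d_av with the common small-sample correction 1 - 3 / (4n - 9) (assumed form)."""
    denominator = 4.0 * n - 9.0
    if denominator <= 0:
        raise ValueError("n is too small for the correction term")
    return d_av * (1.0 - 3.0 / denominator)


if __name__ == "__main__":
    # Invented example: a 5-unit improvement with SDs of 9 (pre) and 11 (post), n = 12.
    d = cohens_d_av(mean_diff=5.0, sd_pre=9.0, sd_post=11.0)
    g = hedges_g_av(d, n=12)
    print(round(d, 3), round(g, 3))  # prints: 0.5 0.462
```

the correction always pulls the estimate toward zero, and more strongly for small n, which is why it matters for studies with few participants per group.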
a majority of studies meeting the inclusion criteria had a sample size of fewer than participants; as such, hedges’ g correction was applied to cohen’s d, as it has been shown to correct for small sample bias [ ].

\[ \text{Cohen's } d_{av} = \frac{M_{diff}}{(SD_1 + SD_2)/2} \qquad (1) \]

\[ \text{Hedges' } g_{av} = \text{Cohen's } d_{av} \times \left(1 - \frac{3}{4(n) - 9}\right) \qquad (2) \]

values were interpreted as follows: . < trivial < . , . ≤ small < . , . ≤ moderate < . , . ≤ large < . , . ≤ very large < . [ ].
. results
. . participant characteristics
data for participant and training intervention characteristics are reported as mean ± standard deviation, unless otherwise stated. a total of studies met the inclusion criteria and were included in the review, with a summary of the participant characteristics provided in table . a total of participants were recruited and included in the analysis. of the total participants, were included in the experimental group, with the remaining participants serving as controls; one study used a crossover design with participants serving as their own controls (n = ). background variables were provided in all studies except one [ ]. participants took part in a range of team sports including basketball, soccer, handball, and rugby union. elite junior or academy athletes were recruited in four studies [ – ]. athletes from professional or division i sport organisations were recruited in six studies [ , – ]. the remainder of studies recruited semi-professional or lower division athletes.
. . intervention characteristics
training programs lasted from to weeks ( . ± . ), including to training sessions ( . ± . ), with the exception of suarez-arrones et al. [ ], whose intervention included sessions over weeks. studies utilised a wide range of equipment including free weights typically found in performance settings, inertial flywheel devices or bodyweight-exercise based equipment. the prescribed training volume ranged from one to six sets of to repetitions.
one study reported the number of sets (four) but not the number of repetitions [ ]. five studies in total followed the nordic hamstring exercise protocol (nord) as described by mjølsnes et al. [ ], which is a week intervention progressively increasing volume from two to three sets of to repetitions utilising the nord, performed concurrently with soccer specific training. the prescription method of exercise intensity used in the experimental groups was quantified in only three studies. these authors prescribed intensity based on percentage of one repetition maximum ( rm) [ ], percentage of bodyweight [ ] or through a familiarisation protocol [ ]. the remaining studies verbally encouraged participants to produce a maximal effort either against a flywheel device of varying inertial resistance [ , , , , ], or during the eccentric phase of bodyweight exercise [ , , , , ]. compliance to the training intervention was reported in all but three studies [ , , ]. compliance values ranged from % to % ( . % ± . %). all studies reporting concurrent sport practice in addition to the intervention reported no differences in sport-specific training volume between the experimental and control groups.

table . study characteristics for eccentric overload training interventions with male team sport athletes.
columns: study (year); sample size; population; age (years); height (m); body mass (kg); sport; quality assessment
askling et al. ( ) exp = swedish premier league . ± . . ± . . ± . soccer
con = . ± . . ± . . ± .
brughelli et al. ( ) exp = division spanish soccer . ± . . ± . . ± . soccer
con = . ± . . ± . . ± .
cook et al. ( ) exp = semiprofessional rugby union . ± . . ± . . ± . rugby union
exp = . ± . . ± . . ± .
exp = . ± . . ± . . ± .
exp = . ± . . ± . . ± .
de hoyo et al. ( ) exp = division spanish academy soccer . ± . . ± . . ± . soccer
con = . ± . . ± . . ± .
de hoyo et al. ( ) exp = division spanish academy soccer . ± . . ± . . ± . soccer
con =
iga et al.
( ) exp = english professional league . ± . . ± . . ± . soccer
con = . ± . . ± . . ± .
ishøi et al. ( ) exp = division danish academy soccer . ± . . ± . . ± . soccer
con = . ± . . ± . . ± .
krommes et al. ( ) exp = division danish professional soccer . ± . . ± . . ± . soccer
con = . ± . . ± . . ± .
maroto-izquierdo et al. ( ) exp = division professional handball . ± . . ± . . ± . handball
con = . ± . . ± . . ± .
mendiguchia et al. ( ) exp = semiprofessional spanish soccer . ± . . ± . . ± . soccer
con = . ± . . ± . . ± .
mjølsnes et al. ( ) exp = division – danish soccer soccer
exp =
sabido et al. ( ) exp = division handball . ± . . ± . . ± . handball
con =
sanchez-sanchez et al. ( ) exp = regional . ± . . ± . . ± . soccer/basketball
con =
suarez-arrones et al. ( ) exp = serie a professional . ± . . ± . . ± . soccer

outcome measures

strength

strength outcomes were assessed in of the studies included in the literature review (table ). five of the nine studies used an isokinetic dynamometer to perform the strength assessment. training interventions utilised inertial flywheel devices [ , , , , , , ], traditional isoinertial equipment [ , ] or exercises performed with bodyweight [ , , , , ]. mendiguchia et al. [ ] used both isoinertial and bodyweight exercises. studies including the nord reported effect sizes ranging from trivial (− . ) [ ] to moderate ( . ) [ ]. effect sizes ( . to . ) were calculated for research involving inertial flywheel devices. only cook et al. [ ] reported outcome data for strength testing and training with isoinertial equipment. the authors found large ( . , bench press) to very large ( . , back squat) effects when eccentric training was compared to traditional training.

speed

nine of the fourteen studies reported outcome measures of speed (table ): these measures included velocity [ ], top speed [ ] and time variables [ , , – , , , ].
for clarity, positive effect sizes represent a favourable change. training interventions with flywheel devices reported effect sizes ranging from . [ ] to . [ ]. there were mixed results for nord training interventions with some unfavourable (− . to − . ) and favourable ( . to . ) changes in speed outcomes. eo effects on short sprint (< m) times ( . to . ) [ , , , ] were also compared to longer sprint times (> m) (− . to . ) [ , , – ].

power

explosive movement was measured using a variety of tests (counter-movement jump (cmj), triple jump, leg press and throwing) in of the studies included in this review (table ). cmj was assessed in six of the studies investigating power. effect sizes for jump height (cmj) ranged from small ( . ) to moderate ( . ). cook et al. [ ] combined isoinertial eccentric training with overspeed exercises ( . ). flywheel studies investigating lower-body power measures reported effect sizes of . to . . krommes et al. [ ] reported an effect size of . after a nord intervention. one study examined upper-body power despite not including upper-body training (− . ) [ ].

change of direction

only three studies investigated the effects of eccentric training on change of direction performance (table ). the investigation by de hoyo et al. [ ] used force plates to capture kinetic data in crossover and sidestep tasks. contact time ( . to . ) and braking time ( . to . ) both displayed small to moderate effects. effect sizes for peak braking force ( . to . ) and braking impulse ( . to . ) ranged from small to moderate. two studies [ , ] reported moderate to large effect sizes ( . to . ) for changes in illinois and t-test performance.

table . eccentric training intervention characteristics for strength outcomes.
columns: study (year); weeks; sessions; sets × reps; equipment; intensity; prescription method; results
askling et al. ( ) × flywheel ° s−1 or . s “max effort” ekfpt ( , . %, g = . ); ckfpt ( , . , g = .
)
brughelli et al. ( ) – × ? bodyweight n/a “max effort” ckfpt (− , − %, g = − . ); ckept ( , . %, g = . )
cook et al. ( ) × isoinertial – % rm % rm bench rm (g = . ); squat rm (g = . )
iga et al. ( ) – × – bodyweight ° s−1 or s “max effort” ekfpt ( to , . % to . %, g = . to . )
ishøi et al. ( ) – × – bodyweight n/a “max effort” ekfpt ( . , . %, g = . )
maroto-izquierdo et al. ( ) × flywheel two . kg flywheels with moment inertia of . kg·m “max effort” legpress rm ( . , . %, g = . )
mendiguchia et al. ( ) – × – isoinertial + bodyweight – kg or – % bw absolute load + % bodyweight ckfpt ( . to . , − . % to . , g = . to . ); ekfpt ( . to . , . % to . %, g = . to . )
mjølsnes et al. ( ) – × – bodyweight n/a “max effort” ekfpt ( , . %, g = . )
sabido et al. ( ) – × flywheel flywheel disc with inertia moment of . kg m “max effort” halfsquat rm ( . , . %, g = . )
notes. results are reported as (change in mean, % difference, hedges’ g).
abbreviations. ekfpt (eccentric knee flexor peak torque), ckfpt (concentric knee flexor peak torque), ckept (concentric knee extensor peak torque), bench rm (bench press rm), squat rm (back squat rm), halfsquat rm (half squat rm), and legpress rm (leg press rm).

table . eccentric training intervention characteristics for speed outcomes.
columns: study (year); weeks; sessions; sets × reps; equipment; ecc load/intensity; prescription method; results
askling et al. ( ) × flywheel ° s−1 or . s “max effort” f m (− . , − . %, g = . )
cook et al. ( ) × isoinertial – % rm % rm eccentric + overspeed vs. traditional m ( . , g = . )
de hoyo et al. ( ) – × flywheel concentric = optimal power output (per inertia = . kg/m ) “max effort” m (− . , %, g = . ); f m (− . , . %, g = . ); m (− . , . %, g = . )
ishøi et al. ( ) – × – bodyweight n/a “max effort” m (− . , . %, g = . )
krommes et al. ( ) – × – bodyweight n/a “max effort” m (− . , − %, g = . ); m (− . , − %, g = . ); m ( . , . %, g = − . )
maroto-izquierdo et al.
( ) × flywheel two . kg flywheels with moment inertia of . kg·m “max effort” m (− . , − . %, g = . )
mendiguchia et al. ( ) – × – isoinertial + bodyweight – kg or – % bw absolute load + % bodyweight v m ( . , . %, g = . ); v m (− . , − . %, g = − . ); ts (− . , − . %, g = − . )
sabido et al. ( ) – × flywheel flywheel disc with inertia moment of . kg m “max effort” m (− . , − . %, g = . )
suarez-arrones et al. ( ) – × – inertial + bodyweight inertia . kg/m highest power output between two loads during familiarization m (g = . ); m (g = . ); m (g = . )
notes. results are reported as (change in mean, % difference, hedges’ g).
abbreviations. m ( m sprint), m ( m sprint), f m (flying m sprint), m ( m sprint) f m (flying m sprint), v m ( m velocity), v m ( m velocity), and ts (top speed velocity).

table . eccentric training intervention characteristics for power outcomes.
columns: study (year); weeks; sessions; sets × reps; equipment; ecc load/intensity; prescription method; results
cook et al. ( ) × isoinertial – % rm % rm eccentric + overspeed cmjpp (g = . )
de hoyo et al. ( ) – × flywheel concentric = optimal power output (per inertia = . kg/m ) “max effort” cmj ( . , . %, g = . )
krommes et al. ( ) – × – bodyweight n/a “max effort” cmj ( . , . %, g = . )
maroto-izquierdo et al. ( ) × flywheel two . kg flywheels with moment inertia of . kg·m “max effort” pwr ( . , . %, g = . ); pwr ( . , . %, g = . ); pwr ( . , . %, g = . ); pwr ( . , . %, g = . ); pwr ( . , . %, g = . ); cmj ( . , . %, g = . ); sj ( . , . %, g = . )
sabido et al. ( ) – × flywheel flywheel disc with inertia moment of . kg·m “max effort” cmj ( . , . %, g = . ); tj_r ( . , . %, g = . ); tj_l ( . , . %, g = . )
sanchez-sanchez et al. ( ) – × flywheel iso-inertial pulley ( . kg/ m ) and flywheel ( . kg/m ) “max effort” cmj ( . , . %, g = . )
suarez-arrones et al. ( ) – × – inertial + bodyweight inertia .
kg·m highest power output between two loads during familiarization halfsquat (g = . ); halfsquat (g = . ); rlhs (g = . ); llhs (g = . ); rlhs (g = . ); llhs (g = . )
notes. results are reported as (change in mean, % difference, hedges’ g).
abbreviations. cmjpp (counter-movement jump peak power), cmj (counter-movement jump height), halfsquat (half squat power at kg), halfsquat (half squat power at kg), pwr to pwr (power at % to % rm leg press (w)), rlhs (right leg half squat power at kg), llhs (left leg half squat power at kg), rlhs (right leg half squat power at kg), llhs (left leg half squat power at kg), tj_r (triple jump right leg), and tj_l (triple jump left leg).

table . eccentric training intervention characteristics for change of direction outcomes.
columns: study (year); weeks; sessions; sets × reps; equipment; ecc load/intensity; prescription method; results
de hoyo et al. ( ) – × flywheel concentric = optimal power output (per inertia = . kg/m ) “max effort” bt_crossover ( . , . %, g = . ); bt_sidestep ( . , . %, g = . ); ct_crossover ( . , . %, g = . ); ct_sidestep ( . , . %, g = . ); rb_imp_crossover ( . , . %, g = . ); rb_imp_sidestep ( . , . %, g = . ); rpb_crossover ( . , . %, g = . ); rpb_sidestep ( . , . %, g = . )
maroto-izquierdo et al. ( ) × flywheel two . kg flywheels with moment inertia of . kg·m “max effort” t-test ( . , . %, g = . )
sanchez-sanchez et al. ( ) – × flywheel iso-inertial pulley ( . kg/ m ) and flywheel ( . kg/m ) “max effort” illinois ( . , . %, g = . )
notes. results are reported as (change in mean, % difference, hedges’ g).
abbreviations.
bt_crossover (braking time in crossover cutting), bt_sidestep (braking time in sidestep cutting), ct_crossover (contact time in crossover cutting), ct_sidestep (contact time in sidestep cutting), rb_imp_crossover (relative braking impulse in crossover cutting), rb_imp_sidestep (relative braking impulse in sidestep cutting), rpb_crossover (relative peak braking force in crossover cutting), and rpb_sidestep (relative peak braking force in sidestep cutting).

discussion

the goal of this systematic review was to identify and evaluate the existing literature surrounding eo training interventions and their effects on performance measures in trained team sport athletes. the current evidence supports the inclusion of eccentric training in training programs to improve performance measures of strength, speed, power and change of direction. however, inconsistencies exist within the literature with regard to methodologies and variables of interest that need careful consideration before the results can be extrapolated to other athletes or populations.

the quantification and prescription of eo intensity in well-trained athletes were reported in only four studies focused on performance outcomes [ , , , ]. the loading parameters (i.e., load magnitude, repetitions, tempo) were unspecified in several training interventions [ , , , ], making it difficult to assess the connection between stimulus and adaptation [ ]. interestingly, cook et al. [ ] were the only authors to prescribe supramaximal training loads (> % rm), even though this method does not require specialised equipment beyond typical barbells and weight plates. the remaining two studies either estimated joint angular velocities [ ] or used submaximal loads [ ]. other prescription methods relied on instructing participants to perform one or more phases of the exercise with “maximal effort” [ , , , ], but provided no further evidence for eo. tous-fajardo et al.
[ ] demonstrated that the magnitude of eccentric peak force with a flywheel device is largely dictated by the trainee, and that differences in eo exist between those with and without flywheel training experience. thus, the quantification of neuromuscular and mechanical output data in eo exercises may be necessary to determine the extent of training load and subsequent adaptation.

strength

maximum strength, as measured by a single maximal voluntary contraction, is influenced by both neurological and morphological factors that can be modified through eo [ ]. studies involving trained [ – ] or untrained participants [ , ] have demonstrated an increase in maximal strength after eo interventions, which may be due to greater neurological contributions and/or type-ii muscle fibre hypertrophy. these adaptations are thought to be a result of the high levels of tension developed in the muscle fibres [ ], with relatively lower metabolic cost and levels of activation when compared to concentric contractions [ ]. these activation patterns may be a result of the distinct molecular and neural characteristics of eccentric contractions [ ], and could necessitate specific strategies in order to accurately assess and prescribe eccentric training [ ]. a recent review by douglas et al. [ ] highlighted that motor unit recruitment and discharge rates are contributing factors to improvements in eccentric strength following eccentric or heavy resistance training.

the velocity of eo training appears to play an important role in determining subsequent strength adaptation following the intervention. for example, roig et al. [ ] reviewed eccentric and concentric training studies and found that eccentric stimuli produced superior improvements in total strength. however, the effects were greater when the testing and training velocities were matched. the authors concluded that performance outcomes were mode- and velocity-specific.
in contrast to these findings, other investigations [ , ] have found that high-velocity eo training ( °·s−1) had a greater degree of transfer than slow velocity eo training ( °·s−1) during isokinetic testing. furthermore, one review [ ] challenged whether eo provided any additional benefit over traditional training, suggesting that specific populations such as athletes and the untrained may respond differently to eo as a training stimulus depending on baseline strength capabilities.

studies in the current review reported effect sizes from − . [ ] to . [ ] when eo training methodologies were applied to athletes. training interventions and assessments that were matched on mode of contraction reported effect sizes from . [ ] to . [ ]. brughelli et al. [ ] conducted concentric isokinetic testing following a training intervention that emphasized eccentric contractions. the authors did not speculate whether mode specificity might have played a role in their findings. the only three studies [ , , ] to report on dynamic (eccentric and concentric) strength tests ( rm leg press, back squat, half squat) demonstrated moderate ( . ) to large ( . ) effects. although some transfer effect has been noted between contraction modes [ ], this phenomenon is inconsistent within the literature [ ].

three studies investigating nord and strength outcomes found small to moderate improvements in trained athletes [ , , ]. the isokinetic assessments used by these authors matched the mode of contraction used in the training interventions. ditroilo et al. [ ] found that the nord is capable of producing emg levels greater than those reported in maximal eccentric isokinetic testing, which supports the nord as an effective eo exercise.
the effect and time course of training duration and number of sessions on strength outcomes following eo training are unclear, as both shorter ( weeks) and longer ( weeks) interventions and smaller ( ) [ ] and larger ( ) [ ] numbers of sessions resulted in improvements. more research is needed to understand the dose–response relationship between eo and strength in trained athletes. with the exception of brughelli et al. [ ], studies investigating strength-based outcomes in athletes appear to be in agreement with the literature that supports eo as a potent stimulus for neuromuscular strength adaptation [ ].

speed

the effect of eo training on measures of linear sprint ability is thought to involve numerous components [ ]. briefly, sprint performance consists of several phases, including acceleration and maximal speed, which have distinct kinetic and kinematic features [ ]. acceleration is characterised by extensor action in the hip, knee and ankle, maximal rate of force development over minimal time (rfd), maximal relative strength and higher ground contact time. maximum-velocity sprinting involves the hip and ankle extensors, rfd and relatively short ground contact time. these features are related to physical qualities such as mechanical stiffness and stretch-shortening cycle (ssc) performance [ , – ]. studies investigating the contribution of mechanical [ ] and neuromuscular [ , , – ] factors have demonstrated improvements as a result of eccentric training.

several articles identified throughout the literature search were in agreement with the apparent positive effect of eo on sprint performance. each of the nine studies included in the review reported trivial to moderate improvements in speed measures. investigations involving short sprints (< m) showed trivial to moderate effect sizes. krommes et al. [ ] reported moderate improvements in m ( . ) and m ( . ) sprint times, but moderately slower times in m sprints (− . ). cook et al.
[ ] reported that eccentric training alone did not improve m sprint speed. mendiguchia et al. [ ] described trivial results in m velocity ( . ) and m top speed (− . ). additionally, bilateral and unilateral eo training have been reported to increase power and change of direction ability without improving m sprint times [ ]. a recent review favoured the use of eo with a flywheel device over traditional resistance training for improving running speed [ ]. however, as mentioned previously, controversy exists as to whether certain inertial devices are capable of producing eo [ , ], as force production is in part determined by the experience of the trainee. acceleration and maximum-velocity sprinting are distinct physical abilities and may depend on separate performance qualities. thus, identification and examination of the factors associated with the prescription of eo to enhance specific aspects of sprint performance are necessary.

contraction velocity appears to be a contributing factor to subsequent adaptations in eo research [ , , ], with high velocities resulting in greater magnitudes of neuromuscular adaptation. a review by guilhem et al. [ ] stated that isokinetic angular velocity in the literature ranges from °·s−1 to °·s−1. askling et al. [ ] were the only authors in the current review to report the joint angular velocity ( °·s−1) used in training interventions for speed outcomes, which resulted in a moderate effect ( . ) in trained athletes. based on previous research investigating high- and low-velocity training [ , ], °·s−1 may represent a relatively low training velocity. limited availability of eo literature examining trained populations makes it difficult to draw generalisations; however, it is speculated that distinct sprint qualities may be differentially affected by eo training parameters such as velocity.
power

the storage and reutilisation of elastic potential energy during the eccentric phase of the ssc is thought to contribute to jump performance [ , – ]. in the current review, changes in lower-body power performance were measured primarily with jumping variations. these investigations [ , – , ] revealed small to moderate ( . to . ) improvements in cmj performance, while the inclusion of overspeed training protocols resulted in large ( . ) improvements [ ]. one reason for the efficacy of eo training in improving power performance may be that relatively fewer motor units are recruited, resulting in more muscle tension, especially at higher velocities [ , ]. the high levels of tension developed in the mtu may lead to adaptive responses in the elastic components of the muscle [ , ]. however, investigations using eo have also reported no change in squat jump, counter-movement jump or rate of force development [ ]. additionally, eccentric duration and execution of the correct technique [ , ] may actually decrease measures of velocity in a jump squat.

discrepancy in the effects of eo on power may be a result of individual differences (e.g., technique, anthropometric qualities, contractile and elastic capabilities), which have been shown to influence drop jump and cmj performance [ ]. eo has also been shown to suppress performance qualities for extended periods of time, which could potentially interfere with results dependent on the timing of the post-testing regime [ , ]. thus, although there seems to be a favourable effect on the expression of power following eo, it is unclear whether the effect sizes related to power production in the literature review are affected by individual recovery profiles or individual differences.

change of direction

despite recent evidence [ , , ] on the relationship between eccentric strength and change of direction performance, only three studies in the present review examined any measure of agility.
kinetic data for a novel crossover and sidestep cutting task displayed moderate to large ( . to . ) changes in the eccentric phase of muscle action following an eo training intervention. similar effect sizes ( . to . ) were reported for the illinois and t-test times in trained athletes [ , ]. interestingly, neither study reported performing agility-specific training as part of the intervention. this lack of task-specific activities suggests that lower-body flywheel training may transfer to complex skills such as change of direction ability. these findings are in agreement with existing literature that has reported improvements in change of direction ability following eo training interventions. gonzalo-skok et al. [ ] reported substantial improvements in measures of change of direction ability for two eo training programs. the authors reported that although bilateral and unilateral groups improved, differences existed between groups in power outputs and force-vector applications. these results support existing evidence [ , ] for the positive but differential effects of eo on change of direction performance.

conclusions

eo appears to be an effective training strategy for athletes and sports practitioners looking to improve measures of strength, speed, power and change of direction. the review highlights evidence suggesting eo can improve performance qualities even in experienced athletes. the exact neurological and morphological mechanisms underlying these changes have been the focus of a growing body of research. due to a large degree of variation in the existing research, a dose-response relationship for a specific method and its intended adaptation has yet to be determined in trained team sport athletes. future research should explore the quantification of eccentric ability, the prescription of eccentric training variables and the relationship between eo-induced adaptation and performance qualities.
author contributions: conceptualization, n.g., d.t.m., c.m.b., c.m.; data curation, c.m.; formal analysis, c.m.; investigation, c.m.; supervision, n.g., d.t.m., c.m.b.; writing—original draft, c.m.; writing—review and editing, n.g., d.t.m., c.m.b.

funding: this research received no external funding.

conflicts of interest: the authors declare no conflict of interest.

references

douglas, j.; pearson, s.; ross, a.; mcguigan, m. chronic adaptations to eccentric training: a systematic review. sports med. , , – .
lindstedt, s.l.; lastayo, p.c.; reich, t.e. when active muscles lengthen: properties and consequences of eccentric contractions. news physiol. sci. , , – .
komi, p.v. stretch-shortening cycle: a powerful model to study normal and fatigued muscle. j. biomech. , , – .
wilson, j.m.; flanagan, e.p. the role of elastic energy in activities with high force and power requirements: a brief review. j. strength cond. res. , , – .
cormie, p.; mcguigan, m.r.; newton, r.u. developing maximal neuromuscular power: part —biological basis of maximal power production. sports med. , , – .
isner-horobeti, m.-e.; dufour, s.p.; vautravers, p.; geny, b.; coudeyre, e.; richard, r. eccentric exercise training: modalities, applications and perspectives. sports med. , , – .
guilhem, g.; cornu, c.; guével, a. neuromuscular and muscle-tendon system adaptations to isotonic and isokinetic eccentric exercise. ann. phys. rehabil. med. , , – .
wagle, j.p.; taber, c.b.; cunanan, a.j.; bingham, g.e.; carroll, k.m.; deweese, b.h.; sato, k.; stone, m.h. accentuated eccentric loading for training and performance: a review. sports med. , , – .
chaabene, h.; prieske, o.; negra, y.; granacher, u. change of direction speed: toward a strength training approach with accentuated eccentric muscle actions. sports med. , , – .
vogt, m.; hoppeler, h.h. eccentric exercise: mechanisms and effects when used as training regime or training adjunct. j. appl. physiol. , , – .
schoenfeld, b.j.; ogborn, d.i.; vigotsky, a.d.; franchi, m.v.; krieger, j.w. hypertrophic effects of concentric vs. eccentric muscle actions: a systematic review and meta-analysis. j. strength cond. res. , , – .
douglas, j.; pearson, s.; ross, a.; mcguigan, m. eccentric exercise: physiological characteristics and acute responses. sports med. , , – .
moher, d.; liberati, a.; tetzlaff, j.; altman, d.g. preferred reporting items for systematic reviews and meta-analyses: the prisma statement. plos med. , , e .
brughelli, m.; cronin, j.; levin, g.; chaouachi, a. understanding change of direction ability in sport. sports med. , , – .
maroto-izquierdo, s.; garcía-lópez, d.; fernandez-gonzalo, r.; moreira, o.c.; gonzález-gallego, j.; de paz, j.a. skeletal muscle functional and structural adaptations after eccentric overload flywheel resistance training: a systematic review and meta-analysis. j. sci. med. sport , , – .
roig, m.; o’brien, k.; kirk, g.; murray, r.; mckinnon, p.; shadgan, b.; reid, w.d. the effects of eccentric versus concentric resistance training on muscle strength and mass in healthy adults: a systematic review with meta-analysis. br. j. sports med. , , – .
lakens, d. calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and anovas. front. psychol. , , .
hopkins, w.g.; marshall, s.w.; batterham, a.m.; hanin, j. progressive statistics for studies in sports medicine and exercise science. med. sci. sports exerc. , , – .
mjølsnes, r.; arnason, a.; Østhagen, t.; raastad, t.; bahr, r. a -week randomized trial comparing eccentric vs. concentric hamstring strength training in well-trained soccer players. scand. j. med. sci. sports , , – .
de hoyo, m.; pozzo, m.; sañudo, b.; carrasco, l.; gonzalo-skok, o.; domínguez-cobo, s.; morán-camacho, e. effects of a -week in-season eccentric-overload training program on muscle-injury prevention and performance in junior elite soccer players. int. j. sports physiol. perform. , , – .
de hoyo, m.; sañudo, b.; carrasco, l.; mateo-cortes, j.; domínguez-cobo, s.; fernandes, o.; del ojo, j.j.; gonzalo-skok, o. effects of -week eccentric overload training on kinetic parameters during change of direction in football players. j. sports sci. , , – .
ishøi, l.; hölmich, p.; aagaard, p.; thorborg, k.; bandholm, t.; serner, a. effects of the nordic hamstring exercise on sprint capacity in male football players: a randomized controlled trial. j. sports sci. , , – .
suarez-arrones, l. in-season eccentric-overload training in elite soccer players: effects on body composition, strength and sprint performance. plos one , , e .
askling, c.; karlsson, j.; thorstensson, a. hamstring injury occurrence in elite soccer players after preseason strength training with eccentric overload. scand. j. med. sci. sports , , – .
iga, j.; fruer, c.; deighan, m.; croix, m.d.; james, d.v. ‘nordic’ hamstrings exercise—engagement characteristics and training responses. int. j. sports med. , , – .
krommes, k.; petersen, j.; nielsen, m.b.; aagaard, p.; hölmich, p.; thorborg, k. sprint and jump performance in elite male soccer players following a -week nordic hamstring exercise protocol: a randomised pilot study. bmc res. notes , , .
maroto-izquierdo, s.; garcía-lópez, d.; de paz, j.a. functional and muscle-size effects of flywheel resistance training with eccentric-overload in professional handball players. j. hum. kinet. , , – .
sabido, r.; hernández-davó, j.l.; botella, j.; navarro, a.; tous-fajardo, j. effects of adding a weekly eccentric-overload training session on strength and athletic performance in team-handball players. eur. j. sport sci. , , – .
brughelli, m.; mendiguchia, j.; nosaka, k.; idoate, f.; arcos, a.l.; cronin, j. effects of eccentric exercise on optimum length of the knee flexors and extensors during the preseason in professional soccer players. phys. ther. sport , , – .
cook, c.j.; beaven, c.m.; kilduff, l.p. three weeks of eccentric training combined with overspeed exercises enhances power and running speed performance gains in trained athletes. j. strength cond. res. , , – .
mendiguchia, j.; martinez-ruiz, e.; morin, j.b.; samozino, p.; edouard, p.; alcaraz, p.e.; esparza-ros, f.; mendez-villanueva, a. effects of hamstring-emphasized neuromuscular training on strength and sprinting mechanics in football players: hamstring training and performance. scand. j. med. sci. sports , , e –e .
rey, e.; paz-domínguez, Á.; porcel-almendral, d.; paredes-hernández, v.; barcala-furelos, r.; abelairas-gómez, c. effects of a -week nordic hamstring exercise and russian belt training on posterior lower-limb muscle strength in elite junior soccer players. j. strength cond. res. , , – .
sanchez-sanchez, j.; gonzalo-skok, o.; carretero, m.; pineda, a.; ramirez-campillo, r.; nakamura, f.y. effects of concurrent eccentric overload and high-intensity interval training on team sports players’ performance. kinesiol. int. j. fundam. appl. kinesiol. , , – .
toigo, m.; boutellier, u. new fundamental resistance exercise determinants of molecular and cellular muscle adaptations. eur. j. appl. physiol. , , – .
tous-fajardo, j.; maldonado, r.a.; quintana, j.m.; pozzo, m.; tesch, p.a. the flywheel leg-curl machine: offering eccentric overload for hamstring development. int. j. sports physiol. perform. , , – .
friedmann-bette, b.; bauer, t.; kinscherf, r.; vorwald, s.; klute, k.; bischoff, d.; müller, h.; weber, m.-a.; metz, j.; kauczor, h.-u.; et al. effects of strength training with eccentric overload on muscle adaptation in male athletes. eur. j. appl. physiol. , , – .
helland, c.; hole, e.; iversen, e.; olsson, m.c.; seynnes, o.; solberg, p.a.; paulsen, g. training strategies to improve muscle power: is olympic-style weightlifting relevant? med. sci. sports exerc. , , – .
núñez, f.j.; santalla, a.; carrasquila, i.; asian, j.a.; reina, j.i.; suarez-arrones, l.j. the effects of unilateral and bilateral eccentric overload training on hypertrophy, muscle power and cod performance, and its determinants, in team sport players. plos one , , e .
vikne, h.; refsnes, p.e.; ekmark, m.; medbø, j.i.; gundersen, v.; gundersen, k. muscular performance after concentric and eccentric exercise in trained men. med. sci. sports exerc. , , – .
english, k.l.; loehr, j.a.; lee, s.m.c.; smith, s.m. early-phase musculoskeletal adaptations to different levels of eccentric resistance after weeks of lower body training. eur. j. appl. physiol. , , – .
farthing, j.p.; chilibeck, p.d. the effects of eccentric and concentric training at different velocities on muscle hypertrophy. eur. j. appl. physiol. , , – .
beaven, c.m.; willis, s.j.; cook, c.j.; holmberg, h.-c. physiological comparison of concentric and eccentric arm cycling in males and females. plos one , , e .
meylan, c.; cronin, j.; nosaka, k. isoinertial assessment of eccentric muscular strength. strength cond. , , – .
paddon-jones, d.; leveritt, m.; lonergan, a.; abernethy, p. adaptation to chronic eccentric exercise in humans: the influence of contraction velocity. eur. j. appl. physiol. , , – .
buskard, a.n.l.; gregg, h.r.; ahn, s. supramaximal eccentrics versus traditional loading in improving lower-body rm: a meta-analysis. res. q. exerc. sport , , – .
ditroilo, m.; de vito, g.; delahunt, e. kinematic and electromyographic analysis of the nordic hamstring exercise. j. electromyogr. kinesiol. , , – .
alcaraz, p.e.; carlos-vivas, j.; oponjuru, b.o.; martínez-rodríguez, a. the effectiveness of resisted sled training (rst) for sprint performance: a systematic review and meta-analysis. sports med. , , – .
von lieres und wilkau, h.c.; irwin, g.; bezodis, n.e.; simpson, s.; bezodis, i.n. phase analysis in maximal sprinting: an investigation of step-to-step technical changes between the initial acceleration, transition and maximal velocity phases. sports biomech. , – . [crossref] . brughelli, m.; cronin, j. influence of running velocity on vertical, leg and joint stiffness: modelling and recommendations for future research. sports med. , , – . [crossref] . farley, c.t.; gonzález, o. leg stiffness and stride frequency in human running. j. biomech. , , – . [crossref] . lópez mangini, f.; fábrica, g. mechanical stiffness: a global parameter associated to elite sprinters performance. rev. bras. ciências esporte , , – . [crossref] . voigt, m.; bojsen-møller, f.; simonsen, e.b.; dyhre-poulsen, p. the influence of tendon youngs modulus, dimensions and instantaneous moment arms on the efficiency of human movement. j. biomech. , , – . [crossref] . malliaras, p.; kamal, b.; nowell, a.; farley, t.; dhamu, h.; simpson, v.; morrissey, d.; langberg, h.; maffulli, n.; reeves, n.d. patellar tendon adaptation in relation to load-intensity and contraction type. j. biomech. , , – . [crossref] . liu, c.; chen, c.-s.; ho, w.-h.; füle, r.j.; chung, p.-h.; shiang, t.-y. the effects of passive leg press training on jumping performance, speed, and muscle power. j. strength cond. res. , , – . [crossref] . mike, j.n.; cole, n.; herrera, c.; vandusseldorp, t.; kravitz, l.; kerksick, c.m. the effects of eccentric contraction duration on muscle strength, power production, vertical jump, and soreness. j. strength cond. res. , , – . [crossref] . papadopoulos, c.; theodosiou, k.; bogdanis, g.c.; gkantiraga, e.; gissis, i.; sambanis, m.; souglis, a.; sotiropoulos, a. multiarticular isokinetic high-load eccentric training induces large increases in eccentric and concentric strength and jumping performance. j. strength cond. res. , , – . [crossref] . 
gonzalo-skok, o.; tous-fajardo, j.; valero-campo, c.; berzosa, c.; bataller, a.v.; arjol-serrano, j.l.; moras, g.; mendez-villanueva, a. eccentric-overload training in team-sport functional performance: constant bilateral vertical versus variable unilateral multidirectional movements. int. j. sports physiol. perform. , , – . [crossref] . oliveira, a.s.; corvino, r.b.; caputo, f.; aagaard, p.; denadai, b.s. effects of fast-velocity eccentric resistance training on early and late rate of force development. eur. j. sport sci. , , – . [crossref] . cormie, p.; mcguigan, m.r.; newton, r.u. changes in the eccentric phase contribute to improved stretch-shorten cycle performance after training. med. sci. sports exerc. , , – . [crossref] . mcguigan, m.r.; doyle, t.l.a.; newton, m.; edwards, d.j.; nimphius, s.; newton, r.u. eccentric utilzation ratio: effect of sport and phase of training. j. strength cond. res. , , – . . di giminiani, r.; petricola, s. the power output-drop height relationship to determine the optimal dropping intensity and to monitor the training intervention. j. strength cond. res. , , – . [crossref] . bridgeman, l.a.; mcguigan, m.r.; gill, n.d.; dulson, d.k. relationships between concentric and eccentric strength and countermovement jump performance in resistance trained men. j. strength cond. res. , , – . [crossref] http://dx.doi.org/ . /s - - - http://dx.doi.org/ . /journal.pone. http://dx.doi.org/ . /ssc. b e a http://dx.doi.org/ . /s http://dx.doi.org/ . / . . http://dx.doi.org/ . /j.jelekin. . . http://dx.doi.org/ . /s - - - http://dx.doi.org/ . / . . http://dx.doi.org/ . / - - http://dx.doi.org/ . / - ( ) - http://dx.doi.org/ . /j.rbce. . . http://dx.doi.org/ . / - ( ) -b http://dx.doi.org/ . /j.jbiomech. . . http://dx.doi.org/ . /jsc. b e bde f http://dx.doi.org/ . /jsc. http://dx.doi.org/ . /jsc. http://dx.doi.org/ . /ijspp. - http://dx.doi.org/ . / . . http://dx.doi.org/ . /mss. b e d e http://dx.doi.org/ . /jsc. http://dx.doi.org/ . /jsc. j. 
funct. morphol. kinesiol. , , of . laffaye, g.; wagner, p. eccentric rate of force development determines jumping performance. comput. methods biomech. biomed. eng. , , – . [crossref] . shepstone, t.n.; tang, j.e.; dallaire, s.; schuenke, m.d.; staron, r.s.; phillips, s.m. short-term high- vs. low-velocity isokinetic lengthening training results in greater hypertrophy of the elbow flexors in young men. j. appl. physiol. , , – . [crossref] . wirth, k.; keiner, m.; szilvas, e.; hartmann, h.; sander, a. effects of eccentric strength training on different maximal strength and speed-strength parameters of the lower extremity. j. strength cond. res. , , – . [crossref] . aboodarda, s.j.; page, p.a.; behm, d.g. eccentric and concentric jumping performance during augmented jumps with elastic resistance: a meta-analysis. int. j. sports phys. ther. , , . . bobbert, m.f.; huijing, p.a.; van ingen schenau, g.j. drop jumping. i. the influence of jumping technique on the biomechanics of jumping. med. sci. sports exerc. , , – . [crossref] . leong, c.h.; mcdermott, w.j.; elmer, s.j.; martin, j.c. chronic eccentric cycling improves quadriceps muscle structure and maximum cycling power. int. j. sports med. , , – . [crossref] . brandenburg, j.p.; docherty, d. the effects of accentuated eccentric loading on strength, muscle hypertrophy, and neural adaptations in trained individuals. j. strength cond. res. , , – . . spiteri, t.; nimphius, s.; hart, n.h.; specos, c.; sheppard, j.m.; newton, r.u. contribution of strength characteristics to change of direction and agility performance in female basketball athletes. j. strength cond. res. , , – . [crossref] . spiteri, t.; newton, r.u.; binetti, m.; hart, n.h.; sheppard, j.m.; nimphius, s. mechanical determinants of faster change of direction and agility performance in female basketball athletes. j. strength cond. res. , , – . [crossref] . tous-fajardo, j.; gonzalo-skok, o.; arjol-serrano, j.l.; tesch, p. 
enhancing change-of-direction speed in soccer players by functional inertial eccentric overload and vibration training. int. j. sports physiol. perform. , , – . [crossref] © by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (http://creativecommons.org/licenses/by/ . /). http://dx.doi.org/ . / . . http://dx.doi.org/ . /japplphysiol. . http://dx.doi.org/ . /jsc. http://dx.doi.org/ . / - - http://dx.doi.org/ . /s- - http://dx.doi.org/ . /jsc. http://dx.doi.org/ . /jsc. http://dx.doi.org/ . /ijspp. - http://creativecommons.org/ http://creativecommons.org/licenses/by/ . /. introduction materials and methods search strategy eligibility criteria study selection analysis of results results participant characteristics intervention characteristics outcome measures strength speed power change of direction discussion strength speed power change of direction conclusions references (accepted manuscript) version of record at https://www.tandfonline.com/doi/full/ . / . . cul: easy tools submission: the consequences of framing digital humanities tools as easy to use paige morgan p.morgan@miami.edu orcid: - - - abstract this article examines the recurring ways in which some of the most popular dh tools are presented as easy to use. it argues that attempts to couch powerful tools in what is often false familiarity, directly undermines the goal of encouraging scholarly innovation and risk taking. the consequences of framing digital tools as either easy or more difficult shapes the relationship between librarians and the students and faculty whose research they support, and, more broadly, the role and viability of libraries as spaces devoted to skill acquisition. keywords: infrastructure, digital humanities, dh tools, dh pedagogy a digital humanities librarian provides consultations to researchers who are developing or struggling with dh projects. 
frequently, these consultations begin with the researcher apologizing and explaining to the librarian their poor aptitude for digital humanities. in many cases, these researchers’ prior experience includes a referral to one or more digital humanities tools that have been branded as user-friendly or easy to use. at first, it can look as though this phenomenon is chiefly the result of the language and rhetoric used to frame various dh tools, a component influenced by the software industry’s move towards graphical user interfaces and towards marketing software for everyone to use, whether in the workplace or at home, regardless of gender, age, or other factors that affect digital tools. that language remains the article’s primary focus. however, the issue is not simply tool-framing language. the taglines and framing in tool documentation are the most visible and stable form, as opposed to more ephemeral instances of language in libguides, promotional materials, and workshops and conversations at conferences. researchers are encountering and struggling with an approach to dh growth and expansion that substantially relies on marketing aspects of dh research as easy. in other words, this article explores the way that our framing for dh tools and resources shapes researchers’ emotions and expectations. sociologist susan leigh star examined “the work behind the work” in scientific research contexts, meaning “the countless, taken-for-granted and often dismissed practices of assistants, technicians, and students that made scientific breakthroughs possible” (timmermans , ). the infrastructure set-up for digital humanities, and the pressures that it places on students, serve as a parallel area of hidden work that can be illuminated.
despite the presence of “easiness” rhetoric in multiple contexts, tool presentation language is often the most concrete example that is available for analysis. tool presentation language is the material that constitutes users’ introduction to the tool (usually the front page of a website, the about page, and any promotional videos): the materials that create a tool’s reputation. rather than residing in a particular tool or in the tool creators’ choices, the problem lies within the design of the larger field of the digital humanities, where it can remain largely invisible. recent efforts in library and dh scholarship have focused on illuminating work in digital humanities that tends to go unseen (shirazi ); by unpacking the challenges around tool framing, one can lay the ground for working with them more effectively.

defining easiness

ease of use is one of the most desirable characteristics for any given tool, rivaled only in popularity by the quality of being free. it is not merely a digital humanities fascination: developers have been pursuing the creation of user-friendly graphical interfaces since the late s. that pursuit has its own complex and continuing history, bound up in corporate rivalry and the outsized influence of certain tech leaders, such as steve jobs and his fascination with skeuomorphic design. as the tech industry has exerted influence on dh in many ways, it is unsurprising that dh tools have emulated this aspect of tech design. easiness can seem like an obvious goal for dh support practitioners and tool developers; it goes hand in hand with efforts to democratize the field and make learning and research opportunities more available, regardless of whether institutions have existing and active dh programs.
the easier it is to do dh, the more people will try it out, an appealing prospect at a time when humanities departments are looking for ways of asserting their continuing relevance, reinventing themselves in response to cultural shifts, and working to demonstrate that they provide students with job-ready skills. easiness is attractive in part because it is powerful. the availability of easy-to-use tools shapes dh support infrastructure and affects how dh is incorporated into the classroom, in terms of how much time is needed to show students how to configure a tool and begin using it. for individual scholars developing projects, perceived ease or difficulty can be a deciding factor if there are multiple tools from which to choose and may determine whether the scholar decides to pursue the project at all. transitioning to digital from conventional printed scholarship includes an adjustment to iterating through multiple stages, and may involve multiple, modular outputs, such as datasets, websites, and processing workflows (brown et al. , par. ). the technical and scholarly ambitions of a particular project will intersect with each other. depending on a scholar or team’s prior experience, the impacts of this intersection may be hard to predict (brown et al. , par. ). the problem of unpredictable challenges is complicated further by the pressure researchers face to show their deliverables to colleagues who may be less accustomed to the ups and downs of iteration but are still called to evaluate the work, either for promotion or degree completion.
while guidelines and articles from major disciplinary organizations (modern language association ; presner ; american historical association ) discussing the evaluation of digital scholarship acknowledge the iterative nature of digital work, it is harder for such guidelines to prepare colleagues for evaluating mid-stage outputs with aesthetics that may not match the sophistication of the various commercial websites that individuals encounter every day. all these factors contribute to making “easy” tools compelling. despite its considerable dazzle, easiness is an abstract and intangible quality; the promise of easiness, or an easy-to-use tool, is that some process (whether display, formatting, organization, or analysis) can be accomplished with minimal difficulty, confusion, or extra labor. when such processes are simplified, researchers feel more able to focus their learning on what they perceive as most relevant to their research question and intellectual work. in digital humanities, and in the context of technology generally, easiness is most likely to be associated with tools that are classified as “out-of-the-box,” meaning that they do not require configuration or modification to work, or “off-the-shelf,” meaning that they are standardized, rather than customized, and intended for general audiences to be able to use. because easiness is abstract, it can be taken as synonymous with other qualities, like speed (cf. various statements about accomplishing a process or analysis with “one click”). though the variants on “easy” are common in tool branding, terms like “fast” and “simple” are regular alternatives. for many tools, it would be more accurate to say that they make a given process not easy, but easier than an alternative. easiness is subjective: what is easy for one user may not be for another.
it is important to understand that easiness is subjective because it is situated and dependent upon other factors. these factors include the particular nature of the material being worked with (i.e., whether the material is text or image-based), and its condition (i.e., whether a dataset has been examined and normalized), as well as the availability (or lack) of training or experience that provides a user with relevant contextual knowledge. however, researchers may not see this situatedness clearly. finally, because easiness is both powerful and subjective, it is value-laden; and it carries a backlash for individuals who expect to find a process or tool to be easy yet discover the opposite. the backlash comes in part from researchers’ inexperience with the various interdependencies and situatedness of easiness, many of which are complexities of technological, academic, and library systems and infrastructure. ideally, a researcher pushes past the backlash, and over time they gain familiarity and experience that help them make choices about their research project or their career with greater autonomy. part of the reason that claims about easiness have such weight is that they inevitably tell us stories about the available infrastructure and its condition: whether or not there are opportunities to learn a particular skill (e.g., a coding language), and how legible and genuine those opportunities appear to the audience for whom the tool is intended. as a result, scrutinizing easiness rhetoric can be helpful for librarians and administrators who are trying to get a clearer sense of their patrons’ needs, or who want to think more critically about the type of support they are providing.
examples of easiness framing

easiness has become sufficiently important that in digital humanities libguides and tool bibliographies, it may be the first or second characteristic mentioned for any tool listed. a typical description might consist of one or two sentences explaining “[tool] is free and easy to use and allows you to [process/visualize/analyze content].” this sort of description echoes the taglines and catchphrases associated with various tools. besides omeka and scalar, there are stanford’s palladio (“visualize complex historical data with ease.”), the knight lab’s timelinejs (“easy-to-make, beautiful timelines”) and juxtaposejs (“easy-to-make frame comparisons”), and cartodb (“maps for the web, made easy”; while this is no longer cartodb’s official catchphrase, it is still widely visible in search results). although qualities such as access, sustainability, and portability are significant concerns in dh, in examining libguides and other dh tool roundups, one sees that they are referenced far less often than whether a tool will be easy. the guide authors try to succinctly articulate what each tool is meant to do; what processes it speeds up, facilitates, or makes easier; and the language that is used to present its capabilities and its value to potential users. in order to get a concrete sense of how this language appears, and the promises and assertions that tool framing makes, this article will examine three tools developed specifically for dh use within the last ten years. the point of this examination is not to critique or accuse the tools; they are merely the most concrete and available examples of a more widespread ephemeral phenomenon that shows up not only in written contexts, but also in workshops, webinars, and casual conversations.
omeka.net

omeka was released by the center for history and new media at george mason university in , and it is intended for an audience of users in the galleries, libraries, archives, and museums (glam) sector, as well as anyone else wanting to build exhibits and collections online. it allows for the creation of multiple collections of items with metadata structured according to disciplinary or institutional schemas and standards. users have the ability to follow widespread practices that will make their data interoperable, adjust those schemas to a local house style, or do a bit of each as needed. the sort of functionality that omeka makes possible is available in software developed for the glam community but is often priced at an institutional level that puts it out of reach of individuals and the smallest institutions. this sort of software may be available as open source, but may require experienced tech support personnel to manage the back-end setup and ongoing maintenance. since the initial release, the omeka development team has worked to improve the tool’s functionality and accessibility, both through the omeka.net subscription service and by making it available as a “one-click install” through internet service providers like reclaim hosting. omeka’s contributions are remarkable, though hard to explain succinctly for audiences who are unfamiliar with the existing software contexts. dan cohen summarized it as “wordpress for your exhibits and collections” at the original release, aiming at a description that would make it easy for people to describe the tool to others.
up until september , omeka.net featured a prominent tagline: “your online exhibit is one click away.” in its website redesign that tagline was replaced by a less exuberant description: “getting started is easy with omeka with our hosted service.” the omeka.org website continues marketing omeka via cohen’s original wordpress reference under the heading “simple to use”: “our ‘five-minute setup’ makes launching an online exhibition as easy as starting a blog. no code knowledge required.” this rhetoric isn’t precisely mismatched, because omeka does indeed allow users to start adding items and metadata right away. for those already versed in metadata standards and best practices, the main learning curve will involve getting accustomed to the interface. however, many digital humanists coming from departments such as english and history are unlikely to have received this training, and as such, face an additional and substantial learning curve, because there is more to a good omeka exhibit than simply getting content onto the web. the omeka.net documentation acknowledges this challenge in its getting started section, where it recommends that users plan out their content before building an omeka website and refers them to cohen & rosenzweig’s digital history: a guide to gathering, preserving, and presenting the past on the web. the omeka.org documentation goes further, recommending that users sketch out wireframes of their site prior to building it. both versions of omeka encourage new users to explore the showcases of existing omeka sites. but while omeka may make building an exhibit as easy as blogging on a technical level, its framing is easily misunderstood by users who fail to anticipate the complex intellectual work required to produce a site that is ready to share publicly.

scalar

scalar is the creation of the alliance for networking and visual culture (anvc) in association with vectors journal and the institute for multimedia literacy at the university of southern california. an open beta version was released in spring , and the current version, scalar . , was released in late . anvc presents their work as “explor[ing] new forms of scholarly publishing aimed at easing the current economic crisis faced by many university presses while also serving as a model for media-rich digital publication,” and describes scalar as a “key part” of this process, “facilitating collaboration and material sharing between libraries, archives, scholarly societies and presses” (anvc: about the alliance n.d.). these partnerships have resulted in one of scalar’s most distinctive features: the ability to add images and videos from organizations like the shoah foundation and the internet archive to a scalar site by performing a keyword search, selecting results with a checkbox, and clicking a button to import them, along with any associated metadata. this entire process (including the optional step of editing individual item metadata) can be performed within the scalar user interface. once imported, users can select from a few different layouts available via a dropdown menu in order to emphasize text or media, or split the emphasis between the two (scalar: selecting a page's default view, n.d.). the other feature that especially distinguishes scalar from other cmss is the structural freedom that it grants users. where blogging platforms like blogger, wordpress, and dreamwidth structure content chronologically, scalar has no default organizational structure. instead, it allows users to create pages, which can be combined into paths, annotated, tagged, or used as tags for other content.
this gives them multiple options for creating non-linear, nested, radial, recursive, and intersecting narratives. configuring these choices is accomplished primarily through a relationships menu at the bottom of each page created, below the main text input window. the actual, final steps of creating an organic structure through a combination of selecting objects and dragging and dropping them within a gui requires far fewer steps in scalar than it would in any other environment, and is further enhanced by the fact that scalar includes options to show visual representations of the structure (path view, tag view). however, this structural freedom is also the aspect of scalar that requires the most careful advance planning from users in order to avoid producing a tangle of disconnected, disparate files. as such, its organizational freedom is simultaneously the feature that most complicates scalar’s self-presentation of easiness. like omeka, scalar articulates its claim of easiness through a comparison to blogging (“...if you can post to a blog, you can use scalar”), pointing to the similarities of the wysiwyg interface in its text input window and those used by wordpress and other blogging platforms. the trailer also connects itself to the activity of blogging by emphasizing the simplicity with which authors can work with a wide range of media types — not just how easy it is to “import media directly without cutting and pasting code” but also combining different types of media, such as “tagging poems with videofiles or tagging images with audiofiles.” what the trailer wants to convey is that any media type the user could imagine — from images and text to maps and source code — can be juxtaposed within a scalar book, all without requiring the book’s author to have any knowledge of markup language. this emphasis on diverse media formats is (accepted manuscript) version of record at https://www.tandfonline.com/doi/full/ . / . . 
cul: easy tools submission: coupled throughout the trailer with statements about scalar’s ability to handle quantity — not only in terms of media, but also that scalar makes it “easy to work with multiple authors because each author’s contributions are tracked and all versions preserved.” as the trailer ends, the narrator reiterates that despite the wide variety of options available (visualizations, paths, annotations, etc.), “all these objects are designed to work together to make it easier for you to create objects to think with — the thinking is still up to you.” as was the case with omeka, scalar’s claims aren’t untrue – it does offer unique functionality that simplifies and streamlines the processes of juxtaposing media and crafting non-linear narratives; and it does so in a way that saves considerable technical labor. in emphasizing its most innovative functionalities, however, scalar’s framing underemphasizes that these functionalities come with their own particular workload. the more complex a narrative structure is, and the more material it contains, the more important it is to have experience managing data with workflows, strict file naming practices, and/or data dictionaries. without such practices, or a site structure that has been carefully determined in advance, users are more likely to end up with a tangled mess rather than the sophisticated site that they had hoped for. likewise, scalar’s documentation raises the question of what tool managers tell users to prepare them for the work of developing site structure. scalar’s presentation materials focus on the ease with which scalar can keep track of multiple users – however, this focus tends to obscure the social decision making that will almost certainly be required; as well as the emphasis on how much freedom to show different objects skirts around the reality that producing a good site is often a case of learning (accepted manuscript) version of record at https://www.tandfonline.com/doi/full/ . / . 
what not to show in order to keep the narrative streamlined and compelling, rather than simply showing a great quantity of objects.

dh box

dhbox (http://www.dhbox.org) is currently in development at the cuny graduate center. as the newest of the tools that i have examined in this piece, dhbox is an indication that easy tool rhetoric is still being used. dhbox uses containers to create remote environments in the cloud that are already configured for several popular and powerful dh tools, including ipython, rstudio, wordpress, and mallet. containers allow programs to run in virtual environments that are identical, rather than risking the possibility that some users’ settings and configurations will generate errors. using pre-configured container environments can substantially cut down on the set-up time before students can get started actually using tools. the streamlined setup enables students to work with complex tools like mallet and the nltk on their own laptops without needing a physical computer lab, or requiring the instructor to consult or negotiate with campus it personnel. dhbox makes a few prominent claims about its easiness. a brief statement centered on its front page explains that “setting up an environment for digital humanities computational work can be time-consuming and difficult. dh box addresses this problem by streamlining installation processes and providing a digital humanities laboratory in the cloud through simple sign-in via a web browser.” the “about” page reiterates that dhbox allows a cloud laboratory to be deployed “quickly and easily” from any computer with an internet connection, promising a device-agnostic lab ready to go in minutes.
though dhbox emphasizes how much easier it is to use than it is to create a lab from scratch, it is not actually intended for beginners, as a closer look at the about page shows. dhbox makes it simple to set up a lab if you have an internet connection and “some contextual knowledge.” this abstract phrase gets clarified further down: the tool is intended for users who “know what the command line is” and “what a server does.” for others, the creators recommend a list of four resources to help bring potential users into the target audience, including a portion of the apache http server documentation, shaw’s “the command line the hard way” book, lessons hosted at the programming historian site, and posner’s “how did they make that?” this is a substantial reading list, but one that should provide a novice digital humanist with a solid grounding in the relevant concepts. oddly enough, there is no explicit suggestion that individuals using dhbox need to understand how the gold-standard tools it contains work; the implication is that once the virtual lab is up and running, the rest of the progress will follow naturally. the idea of easiness, especially in tech contexts, is often associated with support for new and inexperienced users; however, dhbox is a reminder that the situated nature of easiness means that it can also be intended specifically for advanced users. the presentation materials for dhbox attempt to be direct with would-be users by offering two benchmark questions that must be answered in order to use the tool productively; and the creators acknowledge that users might need to learn more, rather than simply suggesting that the tool will have excellent results for anyone and everyone.
what tool users are looking for

tool users want the easiest experience possible, but looking at these three tools in particular enables one to more concretely define what easiness means in the context of dh. the emphasis on graphical user interfaces and on requiring no coding or technical knowledge suggests a desire for as little preparation as possible, and particularly a desire to avoid learning material that is purely technical and has no equivalent in users’ home disciplines, such as understanding image aspect ratios or file compatibility issues. for researchers who are already overburdened, this is an understandable rational economic choice. users are also looking for tools that give them the ability to fully realize their imaginations, and to produce something new and dramatically different from what non-dh methods allow. this output could be new because it is a highly visual digital exhibit, or because it features non-linear narratives or juxtapositions of strikingly different media, or because it makes it possible for an entire graduate seminar to have access to sophisticated analytical tools like rstudio and mallet. users may likewise be looking for tools that allow them to explore a particular method in depth, and achieve mastery, especially within a given period of time, i.e., one semester-long course (goldstone ). finally, though this is rarely made directly explicit by the tool presentations themselves, users want stability, and to feel that any effort that they make in a tool will be rewarded and worthwhile, rather than failing (terras a; terras b). this is most evident in language that gestures towards the tool’s output. sometimes this is conveyed by promising speed (an exhibit that is one click away) and sometimes by promising complexity.
scalar’s creators understand that “important topics require time and sustained attention to be fully explored,” and work to convey to authors that with scalar, they will be able to create a scalar book that is worthy of committed attention from readers. while digital humanists may want to avoid spending time acquiring extraneous knowledge, they are drawn to the field because they are willing to make an investment – but they want that investment to “provide a satisfying moment of completion” (brown , par. ) or move them closer to being able to declare the project finished (kirschenbaum , par. ). in light of these needs, we might ask whether easiness is a quality that digital humanities tool creators should pursue. in “blunt instrumentalism: on tools and methods,” dennis tenen ( ) argues in favor of caution around easiness in dh research, because prioritizing it often comes at the expense of understanding the critical inner workings of analytical tools. overreliance on out-of-the-box tools can result in researchers confusing the tools themselves with methodologies ( ), and the end result is that the scholarship is less fine-grained and rigorous. the best kinds of tools, according to tenen, are “the ones we make ourselves” – though he acknowledges the formidable labor involved in producing, marketing, and maintaining such tools, especially when working within academic contexts. tenen characterizes a preference for easiness as a sort of intellectual laziness or lazy thinking, when more attention to method is warranted ( ). in some cases, this critique is highly applicable; in others, it fails to take into account that the preference for easiness is influenced by a lack of infrastructure – and that some tools, like dh box, are intended specifically to solve the common infrastructure problem of a lack of physical space. out-of-the-box tools, which
might be better characterized as “entry-level” dh tools, are arguably fulfilling a community need. but whose role and responsibility is it to guide new users through those tools and into the more complex understanding of methodologies that might develop as users become more familiar with them?

how libraries fit into dh infrastructure growth

whether the field is identified as “digital humanities” or by previous terms like “humanities computing” or “technological humanities,” librarians and scholars have been using tools in research contexts for a long time. the current wave of dh seems to have begun around ten years ago, kicked off in part by the creation and release of affordable and user-friendly tools like omeka, as well as chnm’s zotero citation manager. william pannapacker’s pronouncement in the chronicle of higher education that dh seemed like “the first ‘next big thing’ in a long time” was disputed by digital humanists for whom the field was nothing new; still, pannapacker’s observation reflected the start of a rise in dh-focused hiring. while the quantity of available new dh-focused positions was overstated in some cases (risam ), there has been demonstrable growth in certain sectors. in , there were two searches for digital humanities librarian jobs, and that number has risen steadily since, with twenty-eight job searches for librarians or similarly titled library-based, front-facing positions (such as digital scholarship coordinator, digital scholarship lead) in both and – an indication that libraries are actively working to increase their direct involvement with dh (morgan and williams ).

as the field of digital humanities and the number of roles associated with it have grown, various concerns and questions have arisen about how to effectively build infrastructure and support systems that are both productive and scalable.
many of these discussions focus on the roles that libraries and librarians play – whether in supporting dh as a service, being the driving force or an active collaborator in dh growth, or providing much needed guidance for archiving and maintaining digital scholarly work. as projects and tools have been created and aged and sometimes disappeared, the larger dh community has begun to be more aware of the importance of sustainability (davis ). furthermore, in enterprise-level software and hardware provision, librarians have far more expertise and experience than traditional academic personnel. however, this pressure to achieve success and provide expertise risks becoming unsustainable for libraries themselves, while simultaneously failing to fully acknowledge the contributions that they have made to dh growth. there are several excellent articles and essays discussing the opportunities and challenges that libraries face as they develop involvement and support strategies for digital humanities and digital scholarship. in this instance, i want to focus on the challenges that out-of-the-box, easy-to-use tools seem to have the potential to ameliorate, if not solve completely. these include the tendency to assign librarians or coordinators ample amounts of responsibility for creating digital humanities successes without giving them the necessary authority to do so (posner , ), a lack of training opportunities (posner , ), and a tendency to award credit for achievements to faculty, rather than library collaborators (posner , ). these hurdles are further complicated by the sheer variety of requests that occur, many of which include requests for time-consuming and non-extensible customization (vinopal and mccormick , ).
libraries and librarians are under pressure to produce demonstrable results; to have learned enough from “intensive development for boutique projects” to provide the scalable support that scholars need, often as inexpensively as possible (maron and pickle , ); and to have a reproducible model that can be clearly articulated to stakeholders, and adapted as needed over time. easy-to-use tools can help with many of these challenges. because they are branded as entry-level tools, and have documentation, they are positioned to allow librarians to be more hands-off, relieving them of the responsibility for success. if librarians are more hands-off, they are less likely to go uncredited for their work; and if the tools can offer the right balance of restrictions and customization, then the library is absolved of that burden as well. the arl spec kit for digital humanities survey found that % of libraries characterized their digital humanities services as offered on an “ad hoc” basis (bryson et al. , ), sometimes described as a “service-and-support” model, where projects are initiated by faculty who approach the library with ideas (posner ; muñoz ). an alternate approach is the skunkworks or library incubator model (see muñoz ; nowviskie ), where the library develops dh projects in which it plays a leadership role and allows students and faculty opportunities to be involved. the ad hoc or service-and-support model can be problematic because relatively few members of the campus community have access to it. the skunkworks/incubator model depends on the library having the startup expertise it needs to develop and execute good projects that are compelling to faculty and students, and that provide them with opportunities to develop the experience and skills that they see as useful.
even when an incubator can successfully create opportunities that draw faculty and students in, access can be fairly limited. both of these models have risks in terms of sustainability and scalability. a third model has emerged, one that is more scalable and sustainable; let’s call it “lightweight-service-and-support.” this model may include one or more dedicated personnel, such as a dh librarian or a dedicated dh programmer, but it is resource-conservative, and cautious about providing too much one-to-one guidance that would be unfair to other support seekers, because such guidance would not scale, and would quickly constitute a significant/unsustainable time commitment for the librarian or team. the lightweight-service-and-support model relies heavily on easy-to-use tools, which offer researchers several options while still scaling well to a library’s support capacity. the tools’ user community, documentation, and popularity (which can result in how-to videos and example projects) help to lessen the amount of training, management, and outreach that librarians need to do. this model looks very similar to the second tier of support described by vinopal and mccormick ( ), who explain that the supported tools “should offer a fixed set of templates, so users can pick the format, style, or functionality that best meets their needs … if services at this level are well-designed and supported, a majority of scholars could rely on these sustainable alternatives to one-off solutions” ( ). vandegrift and varner likewise gesture towards this model when they provide a concise formula for how libraries should conceptualize their dh offerings: “the goal is to have the fewest tools to support that meet the most needs” ( , ). lightweight-service-and-support need not be the only tier of the
model, as vinopal and mccormick’s four-tiered model makes clear; however, in the absence of resources for higher tiers to develop potentially ground-breaking and grant-winning projects, lightweight-service-and-support can still serve a wide range of community members. establishing practices and models that can help make dh in libraries sustainable and scalable is important work that can and will help libraries continue evolving along with scholarly disciplines. but are the practices that are scalable and sustainable for libraries equally sustainable and scalable for the faculty and students who look to the library for dh opportunities?

dh as scalable and nonscalable

to explain further, anthropologist anna lowenhaupt tsing defines scalability as the ability to expand without having to rethink or transform the underlying basic elements. she examines scalability as a specific approach to design, one that has allowed for the precision of both the factory and the computer; and she argues that scalability is so ubiquitous and powerful that it stops us from noticing the aspects of the world that are not scalable. to push back against this suppressive impulse, tsing’s nonscalability theory is meant to allow us to see “how scalability uses articulations with nonscalable forms, even as it denies or erases them” (tsing , ). scalability prioritizes and values precision-nested fit, and it is the driving force behind much of our current infrastructure. the goals of nonscalability theory are to focus on perceiving the heterogeneous and nonscalable forms and understand that they, too, have roles to play in growth. at the heart of nonscalability theory is the question of how we look at, and how we handle, the idea of diversity – specifically, the diversity of objects that do not fit within the precision-nested growth structures of scalability.
diversity, argues tsing, isn’t simply different; it can contain the potential for transformative change. rawson and muñoz ( ) adapt tsing’s theoretical framework to unpack and examine their work “cleaning” data in the nypl’s “what’s on the menu?” archive, featuring over one hundred years of menus from restaurants, cafés, hotels, and other dining establishments. they argue that the concept of “data cleaning” and the use of the phrase “data cleaning” obscure the complex and heterogeneous details of the process as well as the degree to which it is high-stakes critical work with far-reaching effects that can impact the value of research findings. to reduce that process to “data cleaning” is to misunderstand a highly nonscalable process as a scalable one. rawson and muñoz set out to “clean” and normalize the data of different dishes and food items within the collection. although the nypl had arranged the menus in the collection to be interchangeable objects within the catalog, and although menus have a common overall format (i.e., food items with prices, grouped according to particular meals or particular sections of meals), each menu showed considerable variation. some of this variety was straightforward to normalize (e.g., fifteen variant listings for potatoes au gratin). to clean this data would be to make it scalable: to allow users to query the entire archive of menus to understand when, where, and how potatoes au gratin appeared, and get an accurate answer. however, as they worked to clean the data so that it would help answer research questions about the effect of wartime food rationing on menus or the changing boundaries of what constituted a dish over time, rawson and muñoz began to understand that reducing variants to a single value was “not a self-contained problem, but rather an issue that required returning to [their] research questions and investigating the foods themselves.” the individual menu items’ heterogeneity was central to answering the research questions, and what was needed was not to make each food item scalable, but instead to create a dataset that would be compatible with the nypl archive and illuminate (and allow users to interact with) the nonscalable heterogeneous aspects of the menu contents.
becoming aware of the pressures of scalability can be difficult even for experienced digital humanists. rawson and muñoz explain that when they began “cleaning” their data, they saw their main challenge and goal as “processing enough values quickly enough to ‘get on with it’” (page). the characteristics associated with scalability (speed, simplicity, and unimpeded growth) have considerable overlap with the characteristics associated with easiness. the tools we use, whether we are their creators or their consumers, are not immune to the pressure to be scalable. tsing’s theory of nonscalability, which rawson and muñoz have shown to have considerable implications for how we conceive of our goals when working with data, is equally relevant both to dh projects and to the infrastructure that we build for people who are working on them. dh projects are nonscalable. this means that they are particularly nonscalable with various out-of-the-box tools (not only omeka and scalar) because, as tsing explains, scalability is the “ability to expand without distorting the framework” (tsing , ). tools designed to present and process data may appear or present themselves as though they come with that framework in place. omeka has items and item types with metadata categories; scalar has pages, paths, and tags – but these components are building blocks, and a highly incomplete framework, if they
can be said to be a framework at all. and this is precisely as it should be: they are there to be distorted, or, rather, to be transformed, as researchers’ projects take shape. when tools present themselves as easy, quick, and simple, they are promising the user that working with them will be scalable. and when those of us who are in the position of introducing those tools reiterate and reinforce that presentation, we are likewise telling researchers that they should expect scalability and strive for it, despite the fact that they are engaging in an eminently nonscalable process. we are encouraging them to imagine the complex diversity of their material without preparing them for the transformative process that including it will require. instead of helping them learn to see heterogeneity, and find effective ways of interacting with it, by training them to expect easiness, we are leaving an empty space in their preparation – and that space is as likely as not to end up filled with a conviction of their own inadequacy. the consequence is not only this emotional plunge. out-of-the-box tools may successfully circumvent technical work, but in doing so, they may also bypass the thought process of imagining a research question and its answers beyond the constraints and affordances of a single tool. this can impact the depth and richness of the answer to the research question, as well as the project’s long-term sustainability. thinking beyond the capabilities of a particular tool can also be an opportunity for researchers to utilize their existing disciplinary expertise in making decisions about data categories and relationships between materials – and in the process, gain much needed confidence for future experimentation, allowing them to work with less dependence upon librarians or other support personnel.
possible avenues for intervention

the ways that “easiness” rhetoric can shape tool users’ expectations and experiences are a challenge. this challenge intersects with a related problem, namely, that the community of practice in dh is still grappling with how best to incorporate data modeling in dh. a data model defines the objects or entries that a database (or really any data presentation system, including content management systems) contains. it sets out the rules for how different pieces of data are connected with each other. it also specifies how to record any additional data that modifies those entries (e.g., a data model about individuals might include their nationality, and depending on the focus of the database, one part of the model might specifically focus on defining how to record complexities around nationality, such as individuals who are born in one country to parents who are citizens of another country). effectively incorporating data modeling involves both articulating the questions and complexities that accompany it in humanities contexts and doing the work of training dhers to understand their work with various tools as data modeling. posner has previously noted that “humanists have a very different way of engaging with evidence than most scientists or social scientists” (posner ). for example, close reading is more likely to work towards describing a specific pattern within a text and tracing it from its start to end point. the focus of many traditional humanities scholarly essays is identifying and elucidating one or a small number of objects which are unique. to use tsing here, humanities research is much more focused on illuminating and celebrating nonscalability; thus, it is no surprise that humanists have, even within the dh community, hesitated about invoking the idea of “data” in relation to their work.
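to make the nationality example above more concrete, here is a minimal sketch of what such a data model might look like when written out in python. it is only an illustration under stated assumptions: the class names (`Person`, `Nationality`), the `basis` field, and the sample values are all hypothetical and are not drawn from omeka, scalar, or any tool discussed in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Nationality:
    """One claim about a person's nationality (hypothetical entity)."""
    country: str
    basis: str  # how the claim arises, e.g. "birthplace" or "parental citizenship"


@dataclass
class Person:
    """An entry in the database, linked to zero or more nationality claims."""
    name: str
    nationalities: List[Nationality] = field(default_factory=list)

    def countries(self) -> List[str]:
        """Return each distinct country once, in the order it was recorded."""
        seen: List[str] = []
        for claim in self.nationalities:
            if claim.country not in seen:
                seen.append(claim.country)
        return seen


# Illustrative record: a person born in one country to parents who are
# citizens of another. The values are invented for the example.
example = Person(
    name="example person",
    nationalities=[
        Nationality(country="France", basis="birthplace"),
        Nationality(country="Senegal", basis="parental citizenship"),
    ],
)
```

the design choice worth noticing is that nationality is modeled as a list of claims, each with its own basis, rather than as a single text field. a single field would be the more “scalable” option, but it would erase exactly the complexity that the research question might depend on.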
however, organizing data is what allows researchers to produce scholarship (posner ). when the omeka documentation suggests that users should plan their site before beginning to use the tool, it is obliquely suggesting that scholars need to develop a data model that allows an omeka site to be driven by a more complex principle than “let me show you all my stuff.” scalar users face the same challenge, perhaps even more so, since in scalar the capacity for non-linear and intersecting paths plus the ability to display both text and media-focused pages means that scholars could conceivably be working with two interlocking data models: one for their narrative and one for their non-narrative content. and this need applies to other dh tools as well, including several of the tools available through dhbox. data modeling is not easy work, but helping students understand how it fits into the process of working with so-called “easy” tools would be one way of preparing them better. this example (and potential impact) of data modeling underscores that the problems created by easy tool rhetoric cannot simply be attributed to the tool creators and the teams that designed and wrote their publicity materials. if our libguides and workshop promotional materials draw on the same tool presentation that emphasizes easiness, then we are also using easiness rhetoric just as the tool makers are. who has the responsibility and capacity to intervene in this situation? what kind of intervention is appropriate? while tool creators bear some responsibility, there is, in most cases, a gap between the authors of a tool’s presentation site and the readers. librarians who are mentoring students and faculty who are learning new tools, or who are in charge of designing and maintaining a local infrastructure system, are positioned to fill that gap because they are usually closer to the learners than the tool creators are.
given humanists’ uncertainty around thinking of their materials as data (keener , par. ), librarians and instructors offering basic tool trainings are more likely to be successful because they can have conversations that go both ways in consulting contexts. our models for dh development and support in libraries need to consider not only what tools to provide, but also how those tools’ capabilities and reputation shape infrastructure, and how we can design around the tools’ rhetoric in response.
in “on nonscalability,” tsing points out several examples in which scalability has been achieved in part through a reliance on disciplined labor. one example that she uses is that of sugar cane cutters in puerto rico in the s. the workers had a limited time frame in which to work, and their working conditions were crowded and dangerous, especially because of the sharp machetes that each worker used. the result was that “workers were forced to use their full energy and attention to cut in synchrony and avoid injury” (tsing , ). by disciplining themselves to learn the skill of synchronous cutting, they solved the company’s problem and transformed themselves from nonscalable individuals into a scalable work force. disciplined labor can be created when any powerful entity (a factory, a corporation, or even a library) identifies an infrastructural problem that it then leaves to less powerful individuals to solve by changing themselves in some way. the creation of disciplined labor isn’t necessarily malicious. in the context of library infrastructure for dh tools, the problem is the nonscalability of individual dh projects versus the scalable support that we offer in the form of entry-level tools.
because the tools present themselves as easy to use, it is easier for libraries (and departments) to decide that only minimal training is needed, and that the rest can be left to the students themselves. the students become disciplined laborers because they see dh tool facility as leading both to greater prestige and to jobs. even when tools make beneficial achievements in terms of what is possible, the potential for problems exists. scalar, omeka, dhbox, and numerous other tools that can be used for dh make it possible for researchers to produce scholarly objects that would otherwise not have been possible without months or sometimes years of training. dhbox takes three tremendous difficulties (money, space, staff), and transforms them into a different difficulty (an individual user’s knowledge of servers and the command line). scalar and omeka take the challenge of needing knowledge of databases, html, and css, and transform it into the need for a user to understand how to develop an effective data model. all three tools are beneficial to the larger community of practice of digital humanities – and yet all three can be problematic as well, because through the combination of the way that libraries use them in building dh infrastructure, and the way that the tools present themselves, they shift tremendous responsibility for success directly onto the individual user and that user’s capacity to pick up wide-ranging (and not always easily accessible) knowledge on the fly. the resulting phenomenon is a form of what economist jacob hacker ( ) has identified as “risk shift.” hacker identifies risk shift by tracing changes in frameworks for economic protection (including banking, income, healthcare, and retirement).
risk shift is the phenomenon by which support provided by larger corporate and social entities (employers, insurance companies, banks) is withdrawn, and responsibility for preventing risks is placed on individual families. while hacker’s research traces this phenomenon through the larger american employment system, sociologist tressie mcmillan cottom’s recent book lower ed: the troubling rise of for-profit colleges in the new economy argues that the same risk shift can be seen in the higher education system as credential costs that used to be supported by federal grants have shifted more onto students. a certain reliance on dh tools marketed as “easy to use” creates a similar risk shift for the students and faculty learning to use them, as well as for librarians who are working with limited amounts of time to pick up dh skills and experience. there is no simple solution to the problems that can be created by “easiness” rhetoric. certainly, the answer is not that the tools featuring it are bad and that we should stop using them. nor is it for us to take a reverse approach and brand the tools as ultra-challenging, suitable only for hardcore data nerds (a problematic approach that has been an aspect of dh in the past in debates about hacking vs. yacking (cecire ; nowviskie )). training and dialogue specifically focused on data modeling throughout the community could and will be very helpful, but it will take time for that to happen. if it does, it will be well-augmented by a more complex understanding among dh infrastructure providers (whether in libraries, centers, or departments) of what scalability means with regard to dh.
among other things, this more complex understanding might involve scrutinizing what needs tools are meeting, especially as those needs appear through the tools’ marketing and self-presentation, and considering how those needs might shape infrastructure. one specific aspect of this might involve looking at the differences between what tool presentation leads users to think they need (e.g., lots of different types of media) vs. the contextual knowledge that more experienced digital humanists know they need (including naming conventions, data models, etc.). this doesn’t mean that libraries necessarily have to dramatically increase their dh infrastructure investment or expend substantially more resources; if we are alert, deliberate, and proactive, it is possible to build infrastructure that is scalable, both for libraries and for our users.

conclusion

when researchers embarking on a digital humanities project look for the right tool, the perceived easiness of that tool is an important consideration. tools that can provide an easy-to-use experience are becoming an important part of library infrastructure for dh because they seem to require less support and labor from library personnel involved in introducing dh methodologies to students and faculty. however, tools branded as “easy to use” can create a backlash in which users’ research stalls and they blame themselves when a particular tool is more difficult than they expected. this article has sought to better understand the challenges presented by easy tool rhetoric for dh service providers by examining the presentation and documentation of three digital humanities tools.
this examination revealed that though the tools have made valuable contributions that substantially simplify certain technical aspects of producing websites and multimedia objects, the rhetoric of their presentation tends to elide the vital and challenging critical thinking that users must do while using the tools. this elision underscores key competencies, such as data modelling, that the larger digital humanities community is only just beginning to grapple with. libraries have an important role to play in helping tool users develop knowledge that will avoid the backlash of easy tools.

[many thanks to yvonne lam for invaluable conversations throughout the development of this essay; and to alex gil, yvonne lam, emily mcginn, roopika risam, and rachel shaw for feedback on earlier versions.]

references

“alliance for networking visual culture.” n.d. http://scalar.usc.edu/
“alliance for networking visual culture » about the alliance.” n.d. http://scalar.usc.edu/about/
american historical association. . “guidelines for the professional evaluation of digital scholarship by historians | aha.” american historical association. june. https://www.historians.org/teaching-and-learning/digital-history-resources/evaluation-of-digital-scholarship-in-history/guidelines-for-the-professional-evaluation-of-digital-scholarship-by-historians
brown, susan, patricia clements, isobel grundy, stan ruecker, jeffery antoniuk, and sharon balazs. . “published yet never done: the tension between projection and completion in digital humanities research.” digital humanities quarterly ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html
bryson, tim, miriam posner, alain st. pierre, and stewart varner. . “digital humanities” (spec kit ). http://publications.arl.org/digital-humanities-spec-kit- /
cecire, natalia. .
“when digital humanities was in vogue.” journal of digital humanities ( ). http://journalofdigitalhumanities.org/ - /when-digital-humanities-was-in-vogue-by-natalia-cecire/
cohen, dan. . “introducing omeka.” dan cohen (blog), february . http://www.dancohen.org/ / / /introducing-omeka/
cottom, tressie mcmillan. . lower ed: the troubling rise of for-profit colleges in the new economy. the new press.
davis, robin camille. . “die hard: the impossible, absolutely essential task of saving the web for scholars.” presentation, eastern new york association of college & research libraries meeting, may . http://academicworks.cuny.edu/jj_pubs/ /
“dh box.” n.d. http://dhbox.org/
“dh box: about.” n.d. http://dhbox.org/about
goldstone, andrew. . “teaching quantitative methods: what makes it hard (in literary studies).” pre-print (forthcoming in debates in the digital humanities ). https://doi.org/ . /t g skg.
hacker, jacob. . the great risk shift: the new economic insecurity and the decline of the american dream. oxford, new york: oxford university press.
keener, alix. . “the arrival fallacy: collaborative research relationships in the digital humanities.” digital humanities quarterly ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html
kirschenbaum, matthew g. . “done: finishing projects in the digital humanities.” digital humanities quarterly ( ). http://www.digitalhumanities.org/dhq/vol/ / / / .html
maron, nancy l., and sarah pickle. . “sustaining the digital humanities: host institution support beyond the start-up phase.” ithaka s+r. https://digital.library.unt.edu/ark:/ /metadc /m / /high_res_d/sr_supporting_digital_humanities_ f.pdf
modern language association. .
“guidelines for evaluating work in digital humanities and digital media.” modern language association. january. https://www.mla.org/about-us/governance/committees/committee-listings/professional-issues/committee-on-information-technology/guidelines-for-evaluating-work-in-digital-humanities-and-digital-media
morgan, paige, and helene williams. . “the expansion and development of dh/ds librarian roles: a preliminary look at the data.” presentation, digital library federation forum . https://osf.io/vu f/
muñoz, trevor. . “in service? a further provocation on digital humanities research in libraries.” dh+lib. june . http://acrl.ala.org/dh/ / / /in-service-a-further-provocation-on-digital-humanities-research-in-libraries/
nowviskie, bethany. . “skunks in the library: a path to production for scholarly r&d.” journal of library administration ( ): . doi: . / . . .
nowviskie, bethany. . “on the origin of ‘hack’ and ‘yack.’” in debates in the digital humanities, edited by lauren f. klein and matthew k. gold. university of minnesota press. http://dhdebates.gc.cuny.edu/debates/text/
“omeka.net.” n.d. http://www.omeka.net/
“omeka.net: about.” n.d. http://info.omeka.net/about/.
pannapacker, william. . “the mla and the digital humanities.” hastac (blog). december . https://www.hastac.org/blogs/nancyholliman/ / / /mla-and-digital-humanities
posner, miriam. . “no half measures: overcoming common challenges to doing digital humanities in the library.” journal of library administration ( ): – . doi: . / . .
posner, miriam. . “humanities data: a necessary contradiction.” miriam posner’s blog. june . http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/
presner, todd. . “how to evaluate digital scholarship.” journal of digital humanities ( ).
http://journalofdigitalhumanities.org/ - /how-to-evaluate-digital-scholarship-by-todd-presner/
rawson, katie, and trevor muñoz. . “against cleaning.” curating menus (blog), july . http://www.curatingmenus.org/articles/against-cleaning/
risam, roopika. . “where have all the dh jobs gone?” roopika risam (blog), september . http://roopikarisam.com/uncategorized/where-have-all-the-dh-jobs-gone/
“scalar user’s guide: creative use of structure.” n.d. scalar user’s guide. http://scalar.usc.edu/works/guide/creative-use-of-structure
“scalar user’s guide: selecting a page’s default view.” n.d. scalar user’s guide. http://scalar.usc.edu/works/guide/selecting-a-pages-default-view
shirazi, roxanne. . conditions of (in)visibility: cultivating a documentary impulse in the digital humanities. invisible work in the digital humanities symposium. florida state university, november - . https://www.youtube.com/watch?v= livujbrs .
tenen, dennis. . “blunt instrumentalism: on tools and methods.” in debates in the digital humanities , edited by lauren f. klein and matthew k. gold. university of minnesota press.
terras, melissa. a. “a decade in digital humanities.” melissa terras’ blog (blog), may . http://melissaterras.blogspot.com/ / /inaugural-lecture-decade-in-digital.html.
terras, melissa. b. “reuse of digitised content: chasing an orphan work through the uk’s new copyright licensing scheme.” melissa terras’ blog (blog). february . http://melissaterras.blogspot.com/ / /reuse-of-digitised-content- -chasing.html.
timmermans, stefan. . “introduction: working with leigh star.” in boundary objects and beyond, edited by geoffrey c. bowker, stefan timmermans, adele e. clarke, and ellen balka. cambridge, massachusetts: the mit press.
tsing, anna lowenhaupt. . “on nonscalability: the living world is not amenable to precision-nested scales.” common knowledge ( ): – . doi: .
/ x-
vandegrift, micah, and stewart varner. . “evolving in common: creating mutually supportive relationships between libraries and the digital humanities.” journal of library administration ( ): – . doi: . / . .
vinopal, jennifer, and monica mccormick. . “supporting digital scholarship in research libraries: scalability and sustainability.” journal of library administration ( ): – . doi: . / . .

in memoriam

edna aizenberg, marymount manhattan college, april
nina joan auerbach, university of pennsylvania, february
jacques barchilon, university of colorado, boulder, june
nina baym, university of illinois, urbana, june
lawrence i. berkove, university of michigan, dearborn, may
maryellen bieder, indiana university, bloomington, january
richard joseph bourcier, university of scranton, may
l. ross chambers, university of michigan, ann arbor, october
brian w. connolly, xavier university, oh, may
june s. cummins, san diego state university, february
craig j. decker, bates college, march
david r. eastwood, united states merchant marine academy, july
milton d. emont, denison university, january
john halperin, vanderbilt university, march
jonathan hess, university of north carolina, chapel hill, april
barbara hodgdon, university of michigan, ann arbor, march
donald d. kummings, university of wisconsin, parkside, november
thérèse ballet lynn, chapman university, may
david d. mann, miami university, oxford, december
charles a. messner, jr., carleton college, march
sally todd nelson, dawson college, canada, april
anna norris, michigan state university, march
philip roth, cornwall bridge, ct, may
estella irvine schoenberg, buffalo state college, state university of new york, april
c. jan swearingen, texas a&m university, college station, june
betty perry townsend, university of maryland, college park, december
daniel d. townsend, anne arundel community college, md, may

this listing contains names received by the membership office since the march issue.
a cumulative list for the academic year – appears at the mla web site (www.mla.org/in_memoriam). in memoriam. pmla . ( ), published by the modern language association of america.
the american archivist vol. , no. fall/winter –

abstract

rumors of the deterioration of the historian-archivist relationship have been exaggerated. this article first traces the evolving historian-archivist bond over the last eight decades. second, it discusses the methods scholars have employed in studying historians, namely bibliometrics, questionnaires, interviews, and a combination. third, it describes the results and implications of those studies in three areas: locating sources, using primary and nontextual materials, and overall information-seeking and use. fourth, it considers the evolving and still ambivalent role of information technology in historians’ research. finally, it suggests possibilities for future research, highlighting digital history, personal archiving, web . , democratization and public history, crowdsourcing and citizen archivists, digital curation, activist archivists and social justice, diversity and the changing demographics of the archival profession, and education and training.
though historians and archivists may not always have used their relationship to clio’s maximum advantage, digital technology and an improved knowledge of historians’ work practices based on investigations by archival scholars engender new and better possibilities for collaboration.

archival divides and foreign countries? historians, archivists, information-seeking, and technology: retrospect and prospect

alex h. poole

© alex h. poole.

key words: historians, historiography, digital collections, online environment, archivist-historian relationship, archival profession, digital scholarship

perhaps the very closeness of that century-long historian-archivist relationship fostered perceptions in both professions that now hinder understanding the realities of archives and forging closer partnerships with each other. (terry cook)

the necessary reconsideration of how the historical art is practiced is not taking place universally or uniformly. (lena roland and david bawden)

historians’ culture and modus operandi have typically been the opposite of the speed and openness, the collaborative spirit and do-it-yourself mentality, that characterize the internet at its best. (kristen nawrotzki and jack dougherty)

the moments of discovery that scholars share with archivists were described by historians with delight and gratitude. (jennifer rutner and roger c. schonfeld)

introduction: why study historians?

the impact of historians’ work transcends their own academic communities: it percolates into public education curricula and influences multiple generations of students. historians, too, represent “researchers of last resort” and therefore wield disproportionate influence as users and as advocates. third, as frequent and experienced users, they constitute an identifiable and measurable user group.
historians and archivists alike can profit from a closer and more perfect union. both analysis and synthesis, this article first traces the contours of the historian-archivist bond over the last eight decades. second, it lays out the methods scholars have employed in studying historians, namely bibliometrics, questionnaires, interviews, and mixed approaches. it then describes the results of those studies in three areas: locating sources, using primary and nontextual materials, and overall information-seeking and use. fourth, it considers the evolving and still ambivalent role of information technology in historians’ research. finally, it suggests directions for future research, highlighting digital history, personal archiving, web . , democratization and public history, crowdsourcing and citizen archivists, digital curation, activist archivists and social justice, diversity and the changing demographics of the archival profession, and education and training. rumors of the deterioration of the historian-archivist relationship have been exaggerated. though historians and archivists may not always have used their relationship to clio’s maximum advantage, digital technology and an improved knowledge of historians’ work practices based on investigations by archival scholars engender new and better possibilities for collaboration.

historians and archivists

in , terry cook and francis x. blouin and william rosenberg provocatively argued that historians and archivists faced an unprecedented lack of common understanding. cook stressed that historians view archives as a “foreign country”; meanwhile blouin and rosenberg underscored an “archival divide” between the two professions.
whereas in cook’s opinion historians see archivists as “honest brokers,” archivists in fact “co-create” the archives. “that archivists are continually making such judgments may account for the historical profession’s sense of denial,” cook underlined, “or at least its failure to engage with the archival profession on matters of archival substance.” more alarmist still, blouin and rosenberg contended that such a divide imperils future historical research. “the structures and managerial demands of digital archives,” they insisted, “are almost certain to reinforce the separation between historians and archivists, between historical understanding and archival administration, that characterize the archival divide.” but the cleavage cook and blouin and rosenberg pointed out is overdrawn. it may well be more rhetorical (not to say polemical) than factual. both cook and blouin and rosenberg tended to dichotomize the two groups (“historians” versus “archivists”) even as they homogenized the members of each group. as archivist maygene daniels reminded us, “our unity seems to be as much in our diversity and the breadth of our interests as in any common professional core.” the historical profession similarly welcomes diversity, foregrounds inclusivity, and encourages variegated intellectual pursuits. sweeping generalizations must be made with great caution. far from a new phenomenon, the peculiar relationship among historians and archivists has long proved a source of concern, debate, and ambivalence. interpretations of the historian-archivist relationship, whether positive or negative, whether hortatory or admonitory, whether focusing on the personal or on the intellectual aspects of the relationship, have differed over time undoubtedly because of changing demographics, new areas of study, and new types of sources. professions, after all, are never static, and dynamism is not necessarily an indication of dysfunction.
rather, professional evolution may indicate new possibilities for or phases of symbiosis. in short, the temptation to tell a facile story of declension must be resisted. on the whole, studies of historians’ information-seeking practices and the relationship between historians and archivists since the middle of the twentieth century paint a more nuanced (and more favorable) picture of the relationship than cook and blouin and rosenberg would have us believe. historians are increasingly aware of the constructed nature of the archives, even if their work does not always explicitly address this construction. research since the early s in particular suggests that historians both need and appreciate archivists more than ever. gestating in the nineteenth century, the archivist-historian relationship remained pivotal both in facilitating historical research and in defining the identity of both archivists and historians. the ambivalent relationship between archivists and historians long predated even the era of the archival profession’s “founding brothers.” the relationship appeared turbulent between and , but a breach was unlikely, even after the society of american archivists coalesced in . in , the first president of the society of american archivists, historian albert r. newsome, opined, “perhaps an archivist ought not to be a historian, but a historian may well be an archivist.” from their professional birth, however, archivists feared encroachment by historians and librarians. presumably the feeling was mutual. many of the early characterizations of the historian-archivist relationship were anecdotal. in , the national archives’ philip brooks lamented a growing separation between the archival and historical professions.
but a year later, brooks retrenched, claiming, “for some years after the mid- ’s the close understanding between the majority of historians and the archivists seemed to diminish . . . . since world war ii the comity of interest has gradually revived.” historian donald mccoy agreed: the s ushered in a rapprochement between the professions. one of the “founding brothers,” lester cappon, a phd in history and director of the institute of early american history and culture at williamsburg, also seized upon the issue. he remarked in defense of historians, “we may criticize some of his ilk for their antiquarianism, but it was he who challenged careless state officials and won public support for modest archival agencies, who rescued discarded federal records from destruction, who awakened men in private life to awareness of the historical value of their family papers and offered preservation of these documents in the ‘cabinets’ of historical societies.” karl trever, who had also done doctoral work in history, similarly pushed for collegiality between archivists and historians. historian john edwards caswell soon lobbied for a closer relationship not only among archivists and historians, but also among archivists, historians, and records managers. cappon stressed, “the archivist is not a mere caretaker of the paper residue of the past but a person with scholarly proclivities and, at best, a scholar himself. and his field of scholarship . . . is history.” in much the same way, future archivist of the united states james b. rhoads, another history phd, noted, “the archivist is . . . an information specialist in the truest sense: a historian analyzing existing structure and providing information about the content of large bodies of historical material.” nevertheless, many historians demurred, framing the archivist as merely a “hack,” a “hewer of wood and a drawer of water.” but assessments of the relationship remained ambivalent if not contradictory. historian walter rundell jr., who chaired the survey on the use of original sources in graduate history training in the late s under the aegis of the national archives, thought the relationship for the most part satisfactory, though he conceded that both parties could work to improve it. archivist patrick m. quinn subsequently labeled the archivist-historian relationship as less than harmonious since the second world war. but he also maintained, “if in the past the historian has been the bricklayer and the archivist the hod-carrier, the future will witness at least an equalization of these roles.” though philip mason, founder of the walter p. reuther library and a history phd, conceded the stereotype of archivist as subordinate, he exhorted archivists not to accept “second class citizenship.” former acting archivist of the united states frank burke was hopeful: “maybe . . . there will emerge that reconciliation of the estranged parent and child, both having matured and recognized that each has a place in the other’s existence.” george bolotenko of the national archives of canada wondered: “is this modern banishment of historians from the role of keeper of the record not in some measure the conscious or subconscious revenge of the ‘little brother’ against ‘big brother’?” archivist mattie russell eschewed equivocation, insisting, “archivists should first be historians.” archivist william l.
joyce saw an “unfortunate” adversarial dynamic obtaining between the two groups, not least because many archivists were “tilling in a vineyard once looked down upon by historians.” but fredric miller, holder of both a phd in history and a master’s in library science, lauded “the dynamic and uniquely symbiotic relationship between archives and history.” scholars who studied the relationship empirically reached similar conclusions. archivists made appreciable intellectual contributions to historical scholarship, asserted barbara c. orbach, a phd student in american studies as well as the holder of a master’s degree in library and information science. a team of archivists and historians in the early s pinpointed the “natural partnership” among those who rendered evidence available and those who exploited it. the relationship between archivists and historians also appeared close-knit in the s. library and information science professor wendy duff and lis doctoral student catherine johnson believed archivists key in orienting historians to new archives and new collections. historians appreciated archivists’ social capital. the former group profited from archivists’ knowledge of recordkeeping, of archival systems, and of core concepts such as scope, content, and provenance. further, archivists proved key partners in historians’ verification efforts.
johnson and duff concluded, “historians develop different strategies to establish relationships with [archivists], including chatting, doing their homework, and offering to help with matters that concern the archivist, such as explaining collections, collaborating, and empathizing over professional problems.” again lauded as crucially important to and critically involved in the research process in the s, the archivist represented an “expert and a partner.” “the stereotype of the curmudgeonly archivist,” suggested one scholar, “is disappearing.” cook, however, pinpointed an “unhealthy divergence” between historians and archivists gathering momentum since the s, a perspective that blouin and rosenberg endorsed but that seems difficult to sustain on the evidence.
despite their tortuous bond, archivists and historians share fundamental concerns, as blouin and rosenberg conceded. each group wrestles with the nature of source materials, the phenomenon of social memory, and issues surrounding culture, power, and agency. the possibility for productive collaboration is perhaps unprecedented. the past may be a foreign country, as david lowenthal asserted, but surely historians and archivists can cocreate maps.
results of previous scholarship
methods and samples
challenges abound in studying the users of archives; historians are no exception. even years after arthur mcanally’s seminal study, donald case observed, “history remains an area in which actual behavior . . . has not been well studied.” nearly a quarter-century after case’s observation, the literature remains underdeveloped. archivists may focus on collections at the expense of their users. early scholarship on historians’ information-seeking and use dealt largely with the roles of libraries and librarians; only in the late s did archivists and archives-centered scholars enter the dialogue, with salutary results.
changing trends in historiography, namely the advent of the new social history in the s and the ascent of cultural history in the s, made such user studies all the more necessary. unfortunately, such evolving scholarly practices likely fed some scholars’ belief in a disjuncture between historians and archivists (as well as within the historical discipline itself). historian donald kelley argued:
the political and ideological confusions of the s produced more new histories and “turns” both left and right, social and linguistic as well as massive demographic expansion of the historical profession in the context of the vietnam war, with attendant repercussions, student movements, and radicalisms which sought not only to view history “from the bottom up” but also to shift it into a “new left” activist mode, though increasingly in the interests not of an imagined international proletariat but of women, blacks, neglected ethnic groups, and others seeking identity through a history of their own.
by the middle of the s, in fact, “factional polarization” or “fragmentary chaos” seemingly prevailed in the historical discipline.
some archival scholars also took heed of the shifting direction of historical scholarship. social history deeply affected not only the writing of history, but also the relationship between archivists and historians of all stripes. “the mundane and the ordinary” acquired unprecedented prominence; correspondingly, historians consulted a similarly unprecedented array of sources. more problematic for archivists, the field’s dynamism militated against traditional forms of archival organization.
accustomed to ordering materials by provenance and filing system, archivists found themselves confronted by research interests resistant to these schemas. one scholar skewered archivists for failing to respond to the challenges of the new social history. these challenges persist in the present.
the new social history aside, archivists also were faced with the efflorescence of cultural history in the s. cultural history’s apotheosis continued. more broadly, historiography turned away from national and international politics toward sundry topics such as childhood, death, and the body; from a focus on events to a focus on structures; from the efforts of “great men” to those of so-called ordinary people and the ways in which they experienced social change; and from the study of thought to the study of collective movements and trends. methodologically, moreover, historians turned away from traditional notions of objectivity and accepted heteroglossia. finally, historians proved increasingly amenable to a greater variety of evidence (oral, visual, and statistical, for instance) as opposed merely to traditional documents (namely official written records).
facing these challenges emerging from social and cultural history, both librarians and archivists studied historians largely as a way to improve services. though their units of analysis often varied, scholars have employed four methods in studying historians’ information-seeking practices: bibliometrics, questionnaires, interviews, and mixed methods.
bibliometrics
describing literature formally, bibliometrics provides “insights about research interests, resource needs, research behavior, interdisciplinarity, scholarly communication, and collection management.” bibliometrics comprises two classes. citation studies count each bibliographic unit each time it appears in a footnote; reference studies, by contrast, count each bibliographic unit in
the footnote only once. bibliometrics covers a tremendous amount of data, but it deals only with explicit data (occurrences and co-occurrences) in published texts. nor can bibliometrics show how researchers locate or obtain materials. bibliometric analysis also rests on potentially problematic assumptions: that citing a document implies that the author used it; that citing a document reflects its quality; that citations point to the best available works; that a cited document is related in content to the citing document; and that all citations are equal. for archival materials, finally, bibliometric analysis may be compromised by a general lack of common standards and terminology.
arthur mcanally; annie marie alston; clyve jones, michael chapman, and pamela carr wood; clark elliott; fredric miller; m. sara lowe; jana brubaker; graham sherriff; and donghee sinn employed bibliometrics in their studies (see table ). despite their common reliance upon bibliometrics, however, these scholars examined literature that generally varied in theme, time period, and geography; they also tended to use different sample sizes (number of journals, number of articles, and number of citations).
questionnaires
other scholars relied upon questionnaires. properly designed questionnaires are useful if the units of analysis are individuals. further, they may help collect original data when a population is too large to observe directly. on the other hand, questionnaires depend upon individual memory; similarly, they report what researchers claim they used and what materials they claim they found useful. what is more, they cannot show what scholars would have used had other materials been available.
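the distinction drawn earlier between citation studies and reference studies can be made concrete with a short python sketch. it is illustrative only: the function names and the sample footnotes are hypothetical, and it reads “only once” as once per footnote, which is an assumption about the definition quoted above rather than a rule taken from any of the studies reviewed here.

```python
from collections import Counter


def citation_counts(footnotes):
    """Citation-study tally: count a unit every time it appears in a footnote."""
    counts = Counter()
    for note in footnotes:
        counts.update(note)  # repeated mentions inside one footnote all count
    return counts


def reference_counts(footnotes):
    """Reference-study tally: count a unit at most once per footnote (assumed reading)."""
    counts = Counter()
    for note in footnotes:
        counts.update(set(note))  # duplicates within one footnote collapse to one
    return counts


if __name__ == "__main__":
    # Hypothetical footnotes; each inner list holds the units cited in one footnote.
    sample = [
        ["smith_papers", "smith_papers", "county_records"],
        ["smith_papers"],
        ["oral_history_12"],
    ]
    print(citation_counts(sample))
    print(reference_counts(sample))
```

in this sample, a source cited twice within a single footnote adds two to its citation count but only one to its reference count, so the two tallies diverge even though they describe the same footnotes.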
on a pragmatic note, achieving a desired response rate (e.g., %) challenges scholars, as does culling a representative sample from a larger population. moreover, disseminating the survey and following up often prove time-consuming and resource-intensive. michael stevens; peter uva; margaret stieg; dianne l. beattie; helen r. tibbo; wendy duff, barbara craig, and joan cherry; susan hamburger; and alexandra chassanoff used questionnaires to study historians. as with bibliometric studies, these studies varied considerably in their foci, which renders direct comparison difficult (see table ).
interviews
interviews provide “a window on a time and a social world that is experienced one person at a time, one incident at a time.” despite their advantages, namely in the area of “thick” description, interviews, like questionnaires and bibliometrics, evince possible shortcomings. for example, an interview presupposes that the interviewee will summarize his or her perception and behavior as it evolved throughout the research process. additionally, an interviewer necessarily assumes that the interviewee will reliably report his or her perceptions and behavior during the interview. to combat these limitations, scholars who undergirded their studies with interviews (barbara orbach, donald case, charles cole, robert delgadillo and beverly lynch, and wendy duff and catherine johnson) often combined structured and nonstructured methods or employed open-ended questions to spur discussions (see table ). yet again, these scholars’ foci varied considerably.
table .
studies employing bibliometrics
author | date | thematic focus | temporal focus | geographic focus | sample
mcanally (phd student, lis) | | not specified | – | united states | not specified
alston (master’s student, lis) | | not specified | not specified | united states | , references books book chapters journal articles
jones, chapman, & carr (librarians) | | not specified | – | united kingdom | , references journals articles
elliott (archivist) | | history of science | th and th c. | great britain, canada, united states | , references journals articles
miller (archivist/historian) | | social history | – | united states | journals articles
lowe (master’s student, lis and history) | | not specified | not specified | world | , references issues
brubaker (librarian) | | illinois state history | not specified | illinois | , references journal articles
sherriff (librarian) | | not specified | rd c.–late th c. | of north american | , citations master’s theses
sinn (professor, lis) | | not specified | not specified | world | , references journal articles
table . studies employing questionnaires table . extended
author(s) date subjects’ thematic focus subjects’ temporal focus subjects’ geographic focus sample size response rate subjects’ gender subjects’ rank
stevens (graduate student, lis) political ( %) th c. ( %) united states . % not specified not specified social ( %) th c. ( %) intellectual ( %) th or th c. ( %) diplomatic ( %)
uva (librarian) medieval ( . %) byzantine history –modern u.s. foreign country ( . %) . % not specified professors ( . %) far eastern ( . %) associate ( . %) modern european ( . %) united states ( . %) assistant ( . %) american cultural and intellectual ( . %)
stieg (professor, lis) general ( ) classical–modern u.s. united states approx.
% not specified not specified topical ( ) colonial ( ) th century ( )
beattie (archivist) women’s ( / ) not specified canada % not specified not specified
tibbo (professor, lis) all all united states % male ( %) full or associate professor ( %) female ( %) assistant professor ( %)
duff et al. (professors, lis) social ( %) th c. ( %) central canada . % male ( %) professor ( %) cultural ( %) th c. ( %) female ( %) assistant professor ( %) political ( %) th c. ( %)
duff et al. (professors, lis) social ( ) th century ( ) canada ( ) . % male ( %) professor ( %) cultural ( ) th century ( ) europe ( ) female ( %) associate professor ( %) political ( ) th century ( ) united states ( ) assistant professor ( %)
hamburger (archivist) not specified not specified not specified . % male ( %) faculty ( %) female ( %) graduate student ( %) unspecified ( %) undergraduate student ( %)
chassanoff (phd student, lis) women’s (approx. half); focus on social and cultural th c. united states not specified male ( %) professor ( %) female ( %) associate professor ( %) assistant professor ( %) adjunct professor ( %) teaching professor ( %) visiting professor ( %) professor emeritus/a ( %) research professor ( %) dean/department head ( %) endowed chair ( %) distinguished professor ( %) other ( %)
table . studies employing interviews table .
extended
author date thematic focus temporal focus geographic focus sample size subjects’ gender subjects’ rank
orbach (librarian/doctoral student, american studies) social ( %) th c. ( %) united states male ( %) professor ( %) political or political/diplomatic ( %) female ( %) assistant professor ( %) intellectual ( %) visiting lecturer ( %) cultural ( %) doctoral student ( %) general ( %)
case (professor, lis) social ( %) not specified united states male ( %) professor ( %) intellectual ( %) female ( %) chronological ( %)
cole (professor, lis) not specified th c. ( %) united kingdom not specified doctoral student ( %) th c. ( %) th c. ( . %)
delgadillo (master’s student, latin american studies); lynch (professor, education) not specified not specified continental europe ( %) male ( %) doctoral student ( %) latin america ( %) female ( %) master’s student ( %) united states ( %) far east ( . %) africa ( . %) great britain ( . %) eastern europe ( . %)
duff (professor, lis) and johnson (phd student, lis) social ( %) not specified not specified male ( %) associate or assistant professor ( %) political ( %) female ( %) legal ( %) aboriginal ( %) intellectual ( %) cultural ( %) material culture ( %)
johnson (phd student, lis); duff (professor, lis) historians: not specified not specified not specified historian (rank not specified) ( %) social ( %) political ( %) legal ( %); aboriginal ( %) intellectual ( %) cultural ( %) material culture ( %) doctoral students: doctoral student ( %) canada ( %) scottish ( %) german ( %) communication studies ( %)
rutner and schonfeld (consultants) not specified not specified not specified not specified faculty ( . %) doctoral student ( . %) support staff ( . %)
roland and bawden (professors, lis) not specified not specified not specified not specified historian archivist librarian web researcher
combination of methods
as work by raymond vondran, diane beattie, helen tibbo, deborah lines andersen, suzanne r. graham, ian anderson, and margaret stieg dalton and laurie charnigo showed, some scholars compared the results of different methods brought to bear on the same sample (see table ). using multiple methods may yield the most trustworthy results, but such studies may also be the most challenging to undertake. moreover, reconciling divergent findings produced by the use of different methods remains difficult, as beattie’s, anderson’s, and dalton and charnigo’s studies underscored.
irrespective of method, scholars faced similar challenges in designing their studies, namely corralling a representative sample from a larger population and isolating or accounting for numerous variables. indeed, many studies are not directly comparable as a result, as tables one through four suggest. what is more, there may be a disjuncture between what historians did and what they claimed they did with respect to their research processes. future research should grapple with these issues and compare more specifically the studies’ units of analysis as well as their findings.
studies of historians focused on how they located scholarly materials, on the ways in which they used those materials (especially primary and nontextual sources), and on their overall information-seeking and use strategies.
historians’ research processes
locating sources
in locating sources, historians’ favored methods remained consistent over time (see table ).
most notably, footnote or citation chaining remained foundational, ranking first in uva’s study; first in stieg’s; first in hernon’s; second in beattie’s; first in orbach’s; first in tibbo’s ( ); first in delgadillo and lynch’s; first in tibbo’s ( ); first in anderson’s; second and third, respectively, in duff, craig, and cherry’s two studies; second (primary sources) and third (secondary sources) in dalton and charnigo’s; and second in hamburger’s.
this long-standing preference for footnote-chaining aside, two trends are of particular importance for archivists. first, archivists themselves played an important role in stevens’s study (fourth), beattie’s (first), anderson’s (eighth), duff, craig, and cherry’s (first in “finding and using archival resources” and fourth in “historians’ use of archival sources”), and dalton and charnigo’s (eighth). orbach noted, “historians tended to accord repository staff due credit—and considerable power—in facilitating access to primary material.” along these lines, visits to archives were important for historians in dalton and charnigo’s (fourth in primary sources) and hamburger’s (third) studies. in both cases, there is clearly room for archivists to embed themselves more deeply, more frequently, and earlier on in historians’ work processes.
notwithstanding the role of archivists themselves, finding aids help scholars reduce uncertainty in working with unfamiliar repositories and materials. historians ranked finding aids fifth in orbach’s study; fifth in tibbo’s (“primarily history”); second in anderson’s; first in duff, craig, and cherry’s (“historians’ use of archival sources”); first in dalton and charnigo’s; and sixth in hamburger’s.
conventional finding aids aside, the web irrevocably changed the possibilities for locating sources. according to daniel v. pitti, the web “awakened an abiding but dormant aspiration: to provide comprehensive universal access to the world’s primary cultural and historical resources.” particularly after the turn of the millennium, scholars probed historians’ use of new technology to locate materials.
despite the advent of encoded archival description, however, simply mounting finding aids on the web was no panacea for historians. the search engine used, the skills of the user, and the amount of information available on the web played critical roles in historians’ successfully locating finding aids. indeed, only of tibbo’s historians (“primarily history”) were certain they had used ead finding aids (a further were unsure, and indicated they had not). furthermore, tibbo’s sample rarely consulted electronic databases; instead, they employed varied search methods to find primary sources, from footnote-chaining to web searching. building on these findings, tibbo emphasized the importance of developing user-friendly electronic finding aids and databases. indeed, webinars on ead or on tools such as archivegrid may prove a useful investment of repository resources.
consistent with tibbo’s finding, % of dalton and charnigo’s respondents never used electronic databases. anderson, however, found that united kingdom historians’ concerns about online finding aids did not indicate reluctance to use online retrieval tools per se. indeed, the same number of his interviewees ranked online methods and print and informal means “most effective.” anderson found that print-based (i.e., formal) retrieval methods were most significant for unpublished and government sources and informal retrieval methods were most significant with published sources and artifacts.
anderson also encountered a disjuncture that underscored the possible difference between use and usefulness.
nearly all of his sample claimed they used leads from print sources ( %) or informal contacts ( %) to locate materials, but only just above a quarter ( %) said these leads were the most effective method. in other words, the most popular retrieval methods were not invariably the most effective. ultimately, anderson’s historians sought more online retrieval options even as they remained beholden to print forms.
table . studies employing mixed methods table . extended
author date method thematic focus temporal focus geographic focus sample size subjects’ gender subjects’ rank
vondran (professor, lis) interviews not specified post-renaissance europe and americas interviews ( ) not specified not specified questionnaires: articles journals
beattie (archivist) – bibliometrics women’s history ( / ) not specified canada bibliometrics not specified not specified survey articles scholars questionnaires: sent out, returned
tibbo (professor, lis) interviews political ( %) th– th c. ( %) north america ( %) interviews ( ) male ( %) professor ( %) abstract analysis social/cultural ( %) th– th c. ( %) europe ( %) female ( %) associate professor ( %) diplomatic ( %) general modern (non-u.s.) ( %) u.s.-western history (joint history) ( %) instructor ( %) labor ( %) th– th c. ( %) africa ( %) historiography/research methods ( %) th c. ( %) latin america ( %) special groups/topics ( %) all centuries ( %) middle east ( %) u.s.
regional ( %) ancient/classical ( %) u.s.-asia (foreign relations) ( %) economic ( %) medieval ( %) u.s.-europe (foreign relations) ( %) constitutional/legal ( %) other ( %) other ( %) women ( %) quantitative ( %) military ( %) archives ( %)
andersen (professor, lis) interviews not specified not specified not specified interviews ( ) not specified professor (overrepresented) questionnaires surveys ( ) associate or assistant professor (underrepresented)
graham (librarian) bibliometrics not specified not specified not specified not specified not specified not specified questionnaire
anderson (professor, lis) interviews not specified not specified not specified interviews ( ) females overrepresented by % senior/principal lecturer ( %); questionnaires questionnaires ( ) lecturer ( %) professor ( %) dean/department head ( %) reader/research fellow ( %)
dalton (professor, lis); charnigo (librarian) questionnaires questionnaires modern/early modern questionnaires: questionnaires ( ) male ( %) professor (over %) citation analysis social ( ) united states ( %) citations ( , ) female ( %) women’s ( ) european ( %) cultural ( ) latin american ( %) religious ( ) asian ( %) scientific ( ) legal ( ) political ( ) medical ( ) intellectual ( ) foreign policy/foreign relations ( )
table . discovering and locating sources table . extended
author | year | source | source | source | source | source | source | source | source | source | source
stevens (master’s student, lis) | | secondary sources | nucmc | colleagues | archivists | journals | hamer’s guide
uva (librarian) | | footnotes/citations ( . %) | journals ( . %) | separately published bibliographies ( . %) | book reviews ( . %) | colleagues ( . %) | correspondence ( . %) | meetings ( . %) | indexes ( . %) (tie) | students ( . %) (tie) | other ( . %)
stieg (professor, lis) | | bibliographies/references in books or journals | specialized bibliographies | book reviews | library catalogs | abstracts or indexes | colleagues (outside home institution) | browsing library shelves | consulting experts | discussion with colleagues (home institution) | consultation with librarians
hernon (professor, lis) | | review of subject literature | book reviews | library catalogs | indexes/abstracts | colleagues outside home institution | browsing library shelves | consulting experts | colleagues at home institution | consultation with librarians
beattie (archivist) | - | consulting archivists ( . %) | citations in journals or books ( . %) | colleagues ( . %) | inventories/lists ( . %) | catalogs/indexes ( %) | published guides ( . %) | union lists ( .
%)
orbach (librarian; phd student, american studies) | | citations in footnotes/bibliographies ( %) | guides of some sort ( %) | card catalog/index ( %) | colleagues ( %) | finding aids ( %)
tibbo (professor, lis) | | footnotes/bibliographies from monographs ( %) | footnotes/bibliographies from journal articles ( %) | browse library subject catalog ( %) | specialized bibliographies ( %) | browsing library stacks ( %) | footnotes/bibliographies from dissertations ( %) | indexes and abstracts ( %) | general bibliographies ( %) | citation indexes ( %)
delgadillo (master’s student, latin american studies); lynch (professor, education) | | citations in secondary sources ( %) | bibliographies ( %) | library catalog ( . %) | talking with advisers ( . %) | talking with colleagues ( . %) | talking with instructors ( . %) | talking to librarians ( . %)
tibbo (professor, lis) | | leads/citations in printed books ( %) | library online catalog ( %) | printed bibliographies ( %) | printed repository guides ( %) | printed finding aids ( %) | other libraries’ catalogs ( %) | newspapers ( %) | repository websites ( %) | online bibliographic utilities ( %) | government documents ( %)
anderson (professor, lis) | | printed books/articles ( %) | in-person repositories’ physical finding aids ( %) | informal leads, e.g.
colleagues or browsing ( %) printed bibliog- raphies ( %) other institutions’ websites/opacs ( %) repository guides ( %) own institu- tion’s web- sites/ opacs ( %) archival/library staff or hired researchers ( %) govern- ment doc- uments ( %) newspapers ( %) duff, craig, and cherry (pro- fessors, lis) archivists ( %) footnotes/other references ( %) colleagues ( %) published bibli- ographies ( %) book reviews ( %) web ( %) indexes ( %) abstracts ( %) students ( %) duff, craig, and cherry (pro- fessors, lis) finding aids ( %) archival sources ( %) footnotes or refer- ences ( %) archivists ( %) colleagues (% n/a) published bibliogra- phies (% n/a) book reviews (% n/a) web ( %) dalton (profes- sor, lis); char- nigo (librar- ian) (primary sources) finding aids ( %) footnotes/cita- tions ( %) archives/library catalogs ( %) archival visits ( %) bibliographies ( %) bibliographic databas- es ( %) colleagues ( %) archivists ( %) websites ( %) reference librarians ( %) (tie) dalton (profes- sor, lis); char- nigo (librarian) (secondary sources) bibliographic databases ( %) reading other sources ( %) footnotes, ref- erences, notes, bibliographies in other works ( %) library catalogs ( %) bibliographies ( %) book reviews, new books, journal listings ( %) specialized bibliogra- phies ( %) (tie) colleagues ( %) (tie) browsing library stacks ( %) (tie) publisher catalogs ( %) (tie) hamburger (archivist) library opac ( . %) footnote chain- ing ( . ) in-person visit ( . %) library website ( . %) librarian/refer- ence staff ( . %) paper finding aid ( . %) colleagues/ friends ( . %) manuscripts card catalog ( . %) ocls/ rin ( . %) nucmc ( . %) (tie) d ow nloaded from http://m eridian.allenpress.com /doi/pdf/ . / - . . . by c arnegie m ellon u niversity user on a pril the american archivist vol. , no. fall/winter archival divides and foreign countries? historians, archivists, information-seeking, and technology: retrospect and prospect table . discovering and locating sources table . 
Ninety-one percent of Anderson’s sample used at least one electronic retrieval method, and % used between and . Participants used websites and OPACs heavily but rarely used search engines or Archon. Indeed, % of his interviewees deemed online retrieval most effective, but one-third claimed online retrieval was least effective (they expressed concerns about accuracy and completeness). These historians wanted a greater number of online finding aids ( % of the sample) and more digitized sources ( %) that provided context and peer-reviewed mediation. Overall, Tibbo’s ( ), Anderson’s, Dalton and Charnigo’s, and Duff et al.’s (both ) studies suggested historians’ willingness to adopt electronic resources as long as those resources were easily accessible and met traditional criteria for authenticity and reliability.

By , then, the web was of considerable importance in locating historical information. In one study, half ( %) of participants claimed the web was “very” or “somewhat” important in locating sources. In this sense, the advent of Google added a formidable arrow to historians’ quivers. Crucial in jumpstarting the research process, Google offered convenience, ease of use, and a broad scope of searchable material. Historian Daniel J. Cohen queried rhetorically, “Is Google good for history? Of course it is.”

Still, Google coexisted with traditional approaches, as Chassanoff determined.
Although the web represented a “ubiquitous, enabling tool,” participants in Rutner and Schonfeld’s study evinced concerns regarding its efficiency and comprehensiveness (mirroring the concerns expressed by Anderson’s participants nearly a decade earlier). Indeed, historians were not always amenable to the web’s promise of nearly instantaneous delivery of historical information.

Reflecting this ambivalence (or perhaps divided loyalty), % of Chassanoff’s sample still followed leads in books and articles in ferreting out primary sources. While % of her sample exploited a combination of online tools (Google key among them) to locate materials, they tended to use these tools early on in their research processes. Historians’ comfort level with print remained evident: continuity and change coexisted, if not always easily.

Using Materials

Primary Materials

In the s as in the s, primary sources constitute the foundation of historians’ work. Historians remain fixed on primary materials. Anderson’s study, for instance, found that % of the sample used primary sources. Though historians working on all historical periods used a broad array of sources, the ratio of archival to secondary materials remained relatively constant over time (see table ).

In his dissertation, McAnally found that between . % and . % of the citations referred to primary sources. Of these primary sources, between % and % denoted printed primary sources and between % and % manuscript materials. In their United Kingdom study, Jones, Chapman, and Woods found similar use of primary sources: . % of their references were to manuscripts.
But Elliott’s citation analysis, published just shy of a decade later, issued a corrective to McAnally and to Jones and his colleagues: approximately % of the references he analyzed referred to primary unpublished sources. Just under half ( %) of the references Elliott culled referred to primary published sources; therefore, fully three-quarters ( %) of references were to primary sources of some type, a finding that pointed to Uva’s (primary sources were the most important sources at almost every stage of the research process).

Elliott, like McAnally and Jones et al., found that historians preferred certain types of primary sources to others. Of primary unpublished sources in Elliott’s study, % referred to personal papers, and % of those referred to correspondence. The other % referred to corporate sources, and % of those referred to correspondence. (A full % of manuscript references harkened to correspondence.) Finally, those historians studying twentieth-century topics made the heaviest use of manuscripts (i.e., unpublished primary sources).

Reaffirming Stieg’s finding of , albeit with a different sample, Miller found that his sample depended upon primary sources almost as much as on books and periodicals. Three-quarters of Miller’s sample used primary unpublished materials substantively. Despite their concentration on the new social history, however, Miller’s historians made much use of tried and true sources such as personal correspondence. Nevertheless, Miller’s sample showed a slightly lower reliance on personal papers than Elliott’s. On the whole, Miller’s sample worked more intensively with sources such as case files and census records (i.e., public records) than with personal, family, and financial records.

The majority ( %) of Orbach’s interviewees used secondary sources initially and turned to primary sources only upon writing.
These historians turned back to secondary sources at the close of their projects; thus they resembled those Tibbo ( ) later studied. Whereas primary sources undergirded arguments, secondary sources “played supporting roles such as exposing untrod intellectual territory, providing background, supplying leads to pertinent sources, and filling in facts.” Mirroring the findings of earlier studies, % of her participants thought correspondence the most useful class of primary source.

Rooted in a combination of methods, Beattie’s study problematized earlier work. She found four contradictions. First, more than three-quarters of respondents to her questionnaire deemed primary manuscript materials the most useful type of textual materials, and fewer than half claimed government records were the most useful. But her questionnaire and the reference analysis revealed contradictory results, namely, twice as many references to manuscript materials as to government records. Second, Beattie’s historians claimed to use the personal papers of individuals ( %) and the records of women’s organizations ( . %). On the other hand, her citation analysis showed that half of respondents ( . %) cited organizational records, but that only . % cited personal papers. Third, two-thirds ( . %) of questionnaire respondents claimed to have used census records, but only % of their footnotes cited these records. Fourth, two-thirds ( . %) of respondents claimed to have used social service and court case files, but fewer than % cited these sources.

Table . Sources most commonly used
Author | Date | Sources
McAnally (PhD student, LIS) | | books ( . %); newspapers ( . %); public documents ( . %); journal articles ( . %); general manuscripts ( . %); archives ( . %); interviews ( . %)
Jones et al. (librarians) | | monographs ( . %); journal articles ( . %); printed documents/calendars ( . %); manuscripts ( . %); newspapers ( . %); contemporary pamphlets/ephemera ( . %); parliamentary debates/proceedings ( . %); published collections ( . %); reference works ( . %); government reports ( . %)
Elliott (archivist) | | primary published sources; secondary sources; unpublished primary sources (personal); unpublished primary sources (corporate)
Stieg (professor, LIS) | | books; periodicals; manuscripts; newspapers; microcopies; government publications (tie); theses/dissertations (tie)
Lowe (master’s student, LIS and history) | | monographs ( %); journals ( %); book chapter ( %); government material ( %); unpublished materials ( %); newspapers ( %); dissertations (. %); oral communications (. %)
Tibbo (professor, LIS) | | newspapers; unpublished correspondence; published pamphlets; handwritten manuscripts; unpublished diaries or journals; government papers or reports; typed manuscripts; government correspondence; unpublished minutes; photographs
Duff et al. (professors, LIS) | | manuscript records ( %); published printed records ( %); typescript records ( %); photographs ( %); maps ( %); moving images ( %); sound recordings ( %); architectural plans ( %)
Duff et al. (professors, LIS) | | original ( %); microfilm ( %); photocopy ( %); microfiche ( %); transcribed ( %); e-reproduction ( %); photographic facsimile ( %)
Dalton (professor, LIS); Charnigo (librarian) | | books ( %); journal articles ( %); manuscripts, archives, special collection ( %); dissertations ( %); newspapers ( %); government documents ( %); photographs ( %); maps ( %); publications of scholarly organizations ( %); websites ( %)
Brubaker (librarian) | | newspapers ( . %); archival sources ( . %); government documents ( . %); journals or serials ( . %); other ( . %)
Sherriff (archivist) | | books ( . %); periodicals ( . %); journal articles ( . %); government documents ( . %); book chapters ( . %)
Sinn (professor, LIS) | | secondary published materials ( . %); archival materials ( . %); web resource (. %); digital collections items (. %); multimedia (. %)
Chassanoff (PhD student, LIS) | | correspondence ( . %); newspapers ( . %); books ( . %); periodicals ( . %); manuscripts ( . %); photographs ( . %); diaries ( . %); legal materials ( . %); accounts ( . %); maps ( . %)

Faced with these conflicting findings, Beattie hypothesized that use was not tantamount to usefulness. Subsequent studies such as Tibbo’s, Anderson’s, and Duff et al.’s reaffirmed this point: the materials historians most often used were not always the ones they considered most useful. This issue remains unresolved in the s.

Debates over use and usefulness aside, web resources emerged as a key concern of scholars who studied historians in the early s.
Graham’s study unearthed still another contradiction: more than % of her sample had not sought out primary sources online even though nearly three-quarters ( %) felt “general satisfaction” with the quality of web information. Nearly half ( %), moreover, were confident that web resources had sufficient permanence to be cited in scholarship.

This reticence presaged Tibbo’s “Primarily History in America” study, which found historians still characterizing printed primary sources such as newspapers and unpublished correspondence as their most important sources. These historians’ preference for newspapers reinforced McAnally’s and Jones et al.’s decades-old conclusions.

Graham’s findings also pointed to Duff et al.’s “Finding and Using Archival Resources”: this sample’s most important sources were textual (manuscripts, printed records, and typescripts). Conversely, more than one-fifth ( %) of their sample used digital reproductions. Historians appreciated the potential for digitization because it could increase their access to documentary materials. Yet they wanted direct access both to original documents and to digitized finding aids. They trusted archivists, moreover, to take proper measures to ensure authenticity and integrity.

Conversely, some historians’ skepticism about web resources persisted: Brubaker’s study found historians citing very few electronic primary or secondary sources. Her sample eschewed electronic newspapers, journals, and serials; only . % of their citations to archival materials were to electronic versions.

Change was afoot by the s. Relatively few of Rutner and Schonfeld’s interviewees worked solely with physical primary sources. Instead, they used digital representations whenever possible to save time and money. Similarly, their sample unhesitatingly used digitized secondary sources such as books, book chapters, and articles. What was more, these scholars found working with digitized materials unprecedentedly convenient.
Mirroring Duff et al.’s sample of , these historians wanted more online finding aids as well as more digitized primary sources.

Like Rutner and Schonfeld, Chassanoff found nearly all of her sample ( %) relying upon digitized primary sources. While these historians often physically accessed accounts and ledgers, correspondence, diaries, and manuscripts, they frequently deferred to online versions of nontextual materials such as artworks, oral histories, photos, sound recordings, film, and video.

Familiar concerns persisted, however. Chassanoff’s historians showed most enthusiasm for digitized sources that effectively replicated the attributes of physical archival sources. Concerned with quality, her historians requested reproductions of original images, a finding that ran counter to the studies by Sinn and by Gibbs and Owens, both of which found that content outstripped quality in importance. Similarly, Chassanoff’s sample wanted to procure online materials from a reputable repository that provided a detailed finding aid. Other desiderata included collections of digitized primary materials supported by provenance information. Her historians’ digital wish list, finally, included full (and searchable) runs of historical newspapers as well as manuscripts, popular magazines, and diaries and journals. In keeping with their current use patterns, they hoped for increased online access to nontextual items such as photographs and oral histories.

Though comparisons are challenging because of scholars’ varying units of analysis, several trends particularly relevant to archivists can be discerned.
First, as shown in table , archival materials were used by almost all historians examined in these studies. Such materials cropped up in McAnally’s (fifth and sixth), Jones et al.’s (fourth), Elliott’s (third and fourth), Stieg’s (third), Lowe’s (fifth), Tibbo’s (second, fourth, fifth, seventh, and ninth), Duff et al.’s (first and third), Dalton and Charnigo’s (third), Brubaker’s (second), Sinn’s (second), and Chassanoff’s (first, fifth, and seventh) work. Second, newspapers were heavily used, as noted in studies by McAnally (second), Jones et al. (fifth), Stieg (fourth), Lowe (sixth), Tibbo (“Primarily History”) (first), Dalton and Charnigo (fifth), Brubaker (first, tied), and Chassanoff (first, tied). Third, periodicals ranked second in Stieg’s study, second in Sherriff’s, and third in Chassanoff’s. Fourth, diaries ranked fifth in Tibbo’s study (“Primarily History”) and seventh in Chassanoff’s. Finally, nontextual materials constituted an important primary source for historians, a finding discussed in greater detail below.

In setting their work priorities, archivists and archival scholars can learn much from these studies. First, they might probe the vexing question highlighted by scholars such as Beattie: how can the seeming disjuncture between use and usefulness be resolved? Determining which materials are most useful as opposed merely to most used seems critical for resource allocation as archivists increasingly seek to mount sources on the web. Second, archivists might determine how best to preserve traditional attributes such as provenance and authenticity in mounting materials on the web. Chassanoff’s, Sinn’s, and Gibbs and Owens’s studies suggest scholarly consensus is elusive: do historians prefer quality over quantity? Third, archivists might well prioritize digitization projects based on previous studies’ findings. It appears that correspondence represents a particularly good candidate for digitization. Newspapers and
periodicals seem in robust demand, as do nontextual materials. Finally, archivists would do well to explore Pitti’s suggestion of in the late s: how to link finding aids or other tools for locating materials with quality reproductions of the materials themselves.

Nontextual Materials

As shown in table , nontextual materials became increasingly important in historians’ work. On one hand, Stieg found the relative lack of her sample’s use of newer media formats “striking, if not surprising.” On the other hand, her historians used materials such as photographs (ranked number eight in her survey among formats), maps (ninth), sound recordings (tenth), films (twelfth), and videotape (thirteenth). Beattie’s sample often exploited nontextual primary materials: approximately three-quarters of respondents claimed to use photographs and nearly two-thirds used oral histories. That said, Dalton and Charnigo learned that the availability of nontextual sources and the time period being studied by historians circumscribed the extent to which such sources, especially film, were used.

Pursuant to Beattie’s work, over the next quarter-century scholars such as Lowe, Tibbo, Dalton and Charnigo, Duff et al., Rutner and Schonfeld, Sinn, and Chassanoff also found historians using nontextual materials. Lowe’s sample, for instance, used oral communications (ranked eighth), and Tibbo’s (“Primarily History”) used photographs (tenth). Dalton and Charnigo’s historians relied upon photographs (ranked seventh; % used them), maps (eighth; %), oral histories (twelfth; %), audiovisual materials (fourteenth; %), and artifacts or museum pieces (fifteenth; %).
Similarly, Duff et al.’s sample embraced photographs (ranked fourth; % of respondents called them “very” or “somewhat” important), maps (fifth; %), films and moving images (sixth; %), sound recordings (seventh; %), and architectural plans (eighth; %). Rutner and Schonfeld’s historians, too, embraced nontextual materials, namely audio/video, websites, and games. Locating, accessing, and working with such materials seemed unprecedentedly convenient to these scholars.

Also in , Sinn found photographs and artwork cited frequently ( . % of the total number of images). His sample also cited screenshots of film or multimedia (. %). Perhaps more important, archival materials were the most frequently used source for images ( . %) and digital archival collections contributed another . %.

Nearly % (ranked sixth) of participants in Chassanoff’s study relied upon photographs, . % (tenth) upon maps, . % (eleventh) upon oral history, . % (twelfth) upon artwork, . % (thirteenth) upon film, . % (fourteenth, tied) upon datasets, . % (fourteenth, tied) upon sound recordings, and . % (sixteenth) upon video. Perhaps more important, participants accessed artwork, oral histories, photographs, and sound, film, and video recordings more frequently online than in person.

On the whole, scholarly reliance upon nontextual images perhaps has been slow to evolve. That said, these studies suggested that historians were more willing than ever to consult sources other than traditional print. One strategy for quickening the pace of digitization and thus insinuating nontextual images into historians’ work is to include historians themselves in undertaking tasks such as metadata creation or descriptive tagging.
In light of findings such as Chassanoff’s, making these nontextual materials accessible electronically should be among archivists’ key priorities.

Historians’ Overall Information-Seeking and Use

As part of their user studies, some scholars traced the overarching information-seeking behavior of historians. For instance, Stevens stressed his sample’s reliance upon professional networks as opposed merely to the consultation of formal sources. Conversely, Uva’s historians used literature more extensively than personal channels; the latter were most germane in the early stages of scholars’ work. Nearly three decades later, however, Lowe reiterated that informal contact was a principal vehicle of scholarly communication.

Relying upon informal versus formal retrieval methods led to a larger conversation: whether historians participated in an invisible college. Scholars remained far from reaching consensus on this issue. Vondran, Stevens, Case, and Delgadillo and Lynch claimed historians did rely upon such a network. Conversely, Stieg, Hernon, and Dalton and Charnigo concluded that historians lacked an invisible college.

Given her verdict on historians’ lack of an invisible college, it was no wonder that Stieg thought historians’ research methods unsystematic; they also neglected to exploit all available resources. Although Miller believed social historians to be astute users of archives, he echoed Stieg: historians did not use as many sources as they might, nor did they fully exploit these sources’ potential. Orbach, too, underscored historians’ “neither entirely conscious nor entirely linear” research process. Satisficing or even the principle of least effort could triumph.
“Few scholars would argue with the ideal of thorough and painstaking research,” Orbach suggested, but “fewer still care to or can afford to engage full-time in this single pursuit until its completion.”

Challenging the findings of Uva and Stieg, scholars such as Beattie, Orbach, Case, Tibbo (“Primarily History”), Cole, Duff and Johnson, and Chassanoff argued that historians were methodical and organized, if nonlinear and iterative, in their pursuits. Indeed, an iterative approach could prove key in building context, “the sine qua non of historical research.” Case highlighted the number of archives and libraries containing materials of interest to historians, another factor that could engender perceptions of haphazardness. “Although the information seeking is partially blind or unconscious,” Cole concluded, “it is strongly motivated nonetheless.” Finally, Chassanoff’s sample harkened to Uva’s; her historians pursued a nonlinear search process involving multiple (an average of eight) strategies.

Whether methodical or not, historians’ research processes often revolved around collecting names and subjects. Both Stevens and Orbach stressed the importance of names and subjects; Orbach also emphasized the salience of chronological periods. Cole, too, discerned that collecting names of people and organizations constituted a vital information-finding strategy. The bedrock of original research, names allowed researchers both to access resources and to perform original research. Like Stevens’s, Orbach’s, and Cole’s samples, nearly all of Duff and Johnson’s historians collected names. Susan Hamburger also underlined her sample’s reliance upon personal names, though she thought personal names far from the most effective vehicle for searching.
historians’ strategies may be orderly to one degree or another, but whether they are maximally efficient is another matter. they would be well advised to involve archivists earlier and more frequently in the research process both formally and informally, especially given archivists’ technological savvy. computer technology renders archivists’ involvement with historians’ work all the more imperative—and all the more feasible.

historians and information technology

computers seemed promising for historical research as early as the s with the first scholarly use of punch cards. in the late s, frank and harriet owsley incorporated statistics into their work; a handful of other historians followed suit in the s. in the early s, some scholars focused on social mobility, urbanization, patterns of assimilation, ordinary people, and multiple causation. quantitative analysis and data modeling predominated. but the cost and time involved in training impeded such work.

notwithstanding microfilm, historians accrued only limited experience with new technology by the late s and in any event lacked a clear conception of how best to exploit it. the lack of standardized computer programs meant new methods spread slowly. nonetheless, archivist of the united states james b. rhoads thought computers were rising in archivists’ estimation as effective data manipulators.

yet historian h. j. hanham insisted that the computer had changed historians’ priorities little. for some historians, in fact, computing became negatively identified with quantitative history. another historian, however, examined the state of political, economic, and social history and found much to admire.
pointing to the nineteenth-century roots of the historical profession, he reflected, “perhaps we are closer than many thought possible a few years ago to realizing . . . a truly scientific historiography.” still another historian, joel h. silbey, concurred: “guidelines are well-defined, themes and patterns are well established . . . the sophistication attained and confidence exhibited suggest a great deal of useful work to come.” in , uva found that . % of respondents used computers in their research, though hardly for the advanced number-crunching advocated by silbey and others of his wont.

computers and networking were de rigueur in some quarters by the early s. merely using word processing software, for example, yielded a considerable payoff. the s witnessed important refinements in the use of databases for historical computing. yet lack of coordination and communication between historians and computer programmers festered. gerben zaagsma even contended, “american computer-aided historical research had all but died by the mid- s, the result of a backlash against quantitative approaches . . . to the detriment of traditional problem-oriented and narrative history.”

in the early s, adoption of information technology by humanists remained desultory. “a persistent skepticism still haunts the profession, as our machine-less colleagues still wonder whether historians who use computers are the vestal virgins of a new research paradigm or naked emperors proud of their virtual clothes,” remarked the head of the canadian committee for history and computing.

during this period, however, lis scholars such as case and tibbo probed historians’ willingness to use computers in unprecedented depth. case was of two minds. on one hand, he observed “an antitechnology bias in a tradition-oriented profession,” but, on the other, he found his sample of historians remarkably open to any strategies that would facilitate their research.
in case’s sample, of historians used computers and had done so for an average of . years. at the same time, nearly all still edited their manuscripts on paper; not one, moreover, used bibliographic databases. computers alleviated the tedium of composing, typing, and revising, but the work a computer could perform seemed but prefatory to critical interpretation. tibbo called for synthesizing traditional and new approaches.

surveying the offerings of the web in , two historians reflected: “we are impressed—even astonished.” historian and librarian robert darnton admitted, “like many academics, i am about to take the leap into cyberspace, and i’m scared. what will i find out there? what will i lose? will i get lost myself?”

barriers to the optimal enlistment of technology persisted. one study of humanists castigated historians’ “culture of low expectations.” in this vein, andersen’s sample feared that investing time in electronic resources would undercut scholarly productivity. some of her historians deemed lack of instruction and finding relevant information key impediments. foreshadowing a american historical association study, andersen found that many members of her sample neglected information technology, but that a minority made heavy use of it. they requested cutting-edge equipment; personalized, hands-on, and in-house training; and timely support, database information, and improved access to electronic information.

in a related study, andersen’s sample (like case’s) undercut stereotypes about technophobia. these scholars embraced computers for word processing, communicating, printing, and photocopying.
additionally, nearly all members of her sample thought electronic information access technologies essential, especially in verifying bibliographic citations or locating documents. but seldom did they use databases or spreadsheets. perhaps most important, responses indicated that many of these historians were unaware of the resources available to them. clearly, communication was at a premium.

history students meanwhile cited problems with electronic sources similar to those of full-fledged historians. cole’s sample of doctoral students stressed the challenge of assessing the quality and relevance of information as opposed merely to its quantity. though generally positive about computer use, delgadillo and lynch’s sample showed only limited use of computing technology. one-third of the sample had used email during the previous year, but they used computers mostly for consulting online catalogs and for word processing. overall, these students demonstrated many of the same information-seeking strategies as did their mentors, which likely helped to explain their hesitancy about adopting new technologies.

for historians and students alike the question remained open: could the web facilitate “serious” historical work? complicating this question, the web seemingly democratized history by allowing users of all stripes to create and place their own histories in the public domain. historians michael o’malley and roy rosenzweig thought the web showed that “meaning emerges in dialogue and . . . culture has no stable center, but rather proceeds from multiple ‘nodes.’”

a survey conducted by the american association for history and computing (aahc) identified much individual and institutional variety. every respondent used email and nearly all ( %) used computers for research. moreover, two-thirds of respondents ( %) felt dissatisfied with their institutions’ technology policies, initiatives, and plans. in the end, though, responses indicated a sentiment of “cautiously optimistic experimentation.”

optimistic or not, historians ratcheted up their expectations for the information age. for example, duff and cherry’s sample wanted the best of both worlds: easy access to electronic documents in good condition, on one hand, and the functionality of paper documents on the other. in terms of electronic resources, they requested comprehensive coverage, results ranked in order of relevance, provenance information about digitized images, browsing functionality, and search query assistance.

graham’s mixed-methods study determined that most respondents used electronic resources more in than in . nevertheless, her sample showed no particular interest in using electronic versions of sources, despite the latter’s advantages in search functionality. few of these historians cited electronic resources in their work, for they believed their colleagues respected print citations more than electronic ones. (sinn later hypothesized that new types of resources undergo a trial period in which they build up legitimacy among scholars.) finally, though % were uncertain whether digitized sources would positively affect their research, half were interested in learning more about digitized sources. in just this sense, roy rosenzweig propounded, “historians are not particularly hostile to new technology, but they are not ready to welcome fundamental changes to their cultural position or their modes of work.” statistical and mathematical tools, after all, could not supplant critical qualitative judgment.
anderson’s study similarly found historians willing to use online resources and tools as long as those resources met their needs. some historians’ problems with electronic sources stemmed from the scope and indexing of the source rather than from equipment or software. indeed, most members of dalton and charnigo’s sample were “highly appreciative” of electronic resources, though their use of online resources and tools in no way implied jettisoning traditional methods.

scholars also discussed generational issues with respect to technology. for instance, tibbo discerned a difference between junior and senior faculty: junior faculty were much more likely to search the web and opacs than their older colleagues. conversely, dalton and charnigo claimed, “the myth of the younger generation teaching the older appears . . . to be just that, a myth.” cohen and anderson concurred with dalton and charnigo: neither technophobia nor technophilia was the strict preserve of any age group.

historians voiced concerns about the migration of sources to the web. many historians found reading onscreen unpleasant. others evinced concern about the authenticity, reliability, persistence, stability, and legibility of sources on the web. holdouts such as historian alexander maxwell noted that historians traditionally were reluctant to embrace methods of digital scholarship. he insisted not only that original paper documents were always to be preferred, but also that digital archives should duplicate (not replace) such originals. any digital documents, he felt, should embody images of the original.

complicating maxwell’s argument, more than , members of the american historical association described their use of computer technology in . “power users” ( .
% of the sample) exploited multiple digital technologies; “active users” ( . %) employed a variety of online sources, adopted new technology, and taught themselves to use it; “passive users” ( . %) employed computers for word processing and for occasional online searches, but relied on others for training; and “avoiders” ( . %) shunned computers. power users worked with a greater number of programs ( . ) than active ( . ) or passive ( . ) users. whereas more than half of power users welcomed new software or digital tools, the remaining respondents favored a more cautious approach. notably, nearly half ( %) of passive users and avoiders claimed few programs or tools proved useful in their research.

despite their cautiousness, nearly all respondents used word processing and conducted some online searches; three-quarters ( %) used at least one other program or technology. therefore, the differences between power users and the other respondents perhaps hinged more on quantity than on use. but age and generation proved notably important in the study (concurring with tibbo but not with dalton and charnigo, cohen, or anderson), more so, in fact, than geographic field of specialization, type of employing department, or gender. historians over were twice as likely to be either technologically ambivalent or hostile as their counterparts under .

despite such checkered findings, a study determined that “more digital archival materials are used in historical research and . . . more historians are using digital archival materials for their research.” the use of digital archival materials—the actual number of items each year, the average number of digital items in the articles, the number of articles that used digital items, and the average number of articles using digital items each year—increased between and . but the actual use of such resources remained infinitesimal: web resources were . % of citations, digital archival collections were . %, and multimedia were . %.
even so, the small number of items cited might not indicate a minor scholarly impact. it seemed that electronic sources were slowly coming into their own. but a familiar conclusion emerged from gibbs and owens’s study: interest in new forms of data coexisted with traditional use of historical sources.

ultimately, scholars’ verdicts on the impact of computers on historical practice varied, but most opted for cautious generalization even as they hedged their bets rhetorically. for example, it seemed unclear to one group of researchers whether use of the web by historians constituted a sea change or merely an adaptation. rutner and schonfeld also temporized, “the underlying research methods of historians remain fairly recognizable even with the introduction of new tools and technologies, but the day to day research practices of all historians have changed fundamentally.”

according to toni weller, though few historians seemed “digital luddites,” by the early s, there remained “a degree of condescension and suspicion towards digital resources.” few historians leveraged digital tools for analysis, much less disseminated their work digitally. the phenomenon of “new media, old mentality” died hard. indeed, gibbs and owens determined that both history professors and graduate students exploited technology to streamline their traditional methods; they were relatively ignorant, however, of digital tools.

“the web may not be the brave new world or the postmodern inferno, but it is an arena with which everyone concerned about the uses of the past in the present should be engaged,” claimed o’malley and rosenzweig.
ultimately, technology can serve as an ever more powerful resource not only for effecting historical scholarship, but also for enabling new collaborations among archivists and historians.

possibilities for future research

possibilities for future research on historians, archivists, and information-seeking include digital history, personal archiving, web . , democratization and public history, crowdsourcing and citizen archivists, digital curation, activism and social justice, diversity and demographics, and education and training. these overlapping issues will profoundly affect both the writing of history in the future and the trajectory of the historical and archival professions.

digital history

digital history harnessed computers and software. “on one level,” noted william g. thomas iii, “digital history is an open arena of scholarly production and communication, encompassing the development of new course materials and scholarly data collection. on another, it is a methodological approach framed by the hypertextual power of these technologies to make, define, query, and annotate associations in the human record of the past.”

nevertheless, the bulk of professional historians vouchsafed little attention to digital history; digital scholarship itself comprised a sliver of american history overall as of the middle of the s. although scholars relied upon word processing, email, and web browsing, their computerized research skills were immature.

in spite of slow adoption by historians, digital media and networks quantitatively improved capacity, accessibility, and flexibility. they also promoted diversity, manipulability, and interactivity. conversely, they posed drawbacks.
concerns stemming from quality, authenticity, durability, readability, passivity, and inaccessibility loomed large. faced with this gordian knot, one historian complained, “i am up for tenure this year; i don’t have time for this electronic stuff.”

in the s, born-digital objects such as hypertextual maps, annotated letters, edited video, oral histories, and relational databases became part of some historians’ practices. some historical work showed the advantages of textual analysis and historical geographical information systems (hgis) in enriching or amending traditional interpretations. but exactly how “doing history” has changed remains an open question. not only are incentives sparse overall, but few students are trained in such methods. historians’ tendencies toward conservatism remain apparent. one historian chimed in, “i do not care a whit whether improved access to digital information comes about because of public-private partnership or changing attitudes among library professionals: i only care about improved access.”

flying in the face of such sentiments, gibbs and owens noted, “historical scholarship increasingly depends on our interactions with data, from battling the hidden algorithms of google book search to text mining a hand-curated set of full-text documents.” recent projects such as william g. thomas iii and richard healey’s “railroads and the making of modern america” and daniel j. cohen and his colleagues’ “using zotero and tapor on the old bailey proceedings: data mining with criminal intent” demonstrated the scholarly potential inhering in large quantities of data.
in these arenas, the computer qua research tool served as “a moveable and adjustable lens that allows scholars to view their subjects more closely, more distantly, or from a different angle.” ian anderson testified, “whether analyzing change over time or the relationship between cause and effect it is impossible to avoid talking about extent, range, scope, degree, duration, proportion or magnitude, whether one is using adverbs and adjectives or decimal points and chi-squares.” james crossman lobbied for combining historians’ and statisticians’ skills.

in , the american council of learned societies’ report our cultural commonwealth maintained, “digital technology can offer us new ways of seeing art, new ways of bearing witness to history, new ways of hearing and remembering human languages, new ways of reading texts, ancient and modern.” zaagsma recently noted, “i would hope that within a decade or so there will be no more talk of ‘digital history’ as all history is somehow ‘digital’ in terms of incorporation of new types of sources, methods and ways of dissemination.” how can archival principles and practices add value to digital history?
personal archiving

in , donald hawkins observed: “what we have written, what we have read, where we have been, who has met with us, who has communicated with us, what we have purchased, and much else is recorded in increasingly greater detail in personal digital archives, whether they are held by individuals, institutions, or commercial organizations, and whether we are aware of those archives or not.” personal digital archives thus constituted “an optional, even accidental, part of our collective cultural record.”

for archivists, personal archives introduced another degree of difficulty to existing practices. perhaps most important, archivists lacked input regarding the creation of personal archives. furthermore, archivists needed to take responsibility for preserving indefinitely the materials and their contextual relationships. finally, archivists needed to preserve the authenticity of these materials and to remain cognizant of privacy and intellectual property issues.

a number of scholars suggested the importance of personal archives for future scholarship and encouraged repositories to take heed of these materials lest they be lost irretrievably. granular studies of personal records’ creation and use seemed an overlooked area for research. archivists may profit from adopting the bodleian library’s recommendations. first, creators would benefit from exposure to digital curation expertise. second, though archival professionals have important skills to deal with these materials, they need to extend those skills, namely in learning how to exploit new tools. third, archivists can raise awareness of the need to preserve personal materials and can forge collaborations with creators and other stakeholders. what strategies for raising awareness and effecting collaborative outreach might be most effective?

web .

newfound interest by archivists in personal archiving channeled into their engagement with web . more broadly. web .
represents: the network as platform, spanning all connected devices; web . applications are those that make the most of the intrinsic advantages of that platform; delivering software as a continually-updated service that gets better the more people use it, consuming and remixing data from multiple sources, including individual users, while providing their own data and services in a form that allows remixing by others, creating network effects through an “architecture of participation,” and going beyond the page metaphor of web . to deliver rich user experiences.

in the s, web . infiltrated the archives as well. it offered not only new sources for archivists to preserve, but also new ways for them to reach out to professional and public historians and to other constituents. “do we dare to assign value to the words of those prophets written on subway walls and tenement halls, now more likely inscribed on web sites, blogs, twitter, facebook, youtube, and other digital social media?,” asked terry cook.

mary samouelian’s study found a gap between archivists’ awareness of the importance of web . and repositories’ actions in capitalizing upon it. of the repositories in her population that hosted digital collections, ( %) used web . applications. more auspicious, interviewees were “overwhelmingly positive” about using web . . another study found nearly one-fifth of repositories in the united states and canada using at least of web . applications (blogs, facebook, or twitter). nonetheless, their outreach efforts appeared relatively conservative.

samouelian’s participants’ most common motivations for embracing web . stemmed from their interest in promoting use and sharing of content.
interviewees mentioned requests for scans, interest in viewing original materials, and even inquiries about donations. on the other hand, the time necessary to maintain a web . presence proved the biggest drawback. the institutions most successful in attracting audiences had the luxury of devoting staff time to web . . all the same, web . projects may ultimately help archives large and small attract new staff and resources.

overall, web . changed archivists’ technological interaction with stakeholders. “in a web . world,” max j. evans argued, “researchers who discover collections and collection components should have several interactive choices: an email address or telephone number by which to contact an archivist to learn more; a way to schedule a visit; or a listing of hours and location so that an unannounced visit can be planned. or . . . detailed finding aids can also become the means to order up archival digitization-on-demand.” this ideal is yet to be realized.

apropos of preserving web . materials, archivists might consider targeting blogs, facebook, and twitter. blogs seemed the successor to that staple of historical research, diaries. through blogs, ordinary people are “confessing their sins, complaining about work, or celebrating small, personal achievements.” by extension, blogs in many ways democratized web . , shedding light on ordinary people (or at least ordinary web users, potentially a key distinction). barriers to entry are low: users need neither advanced technical skills nor design and literacy skills. moreover, they can exploit free software and online services.

but blogs posed challenges for archivists—not least because of their inherent instability. along these lines, preservation of blogs can prove labor-intensive and potentially duplicative. last and perhaps most important for future researchers, blogs could lose necessary context if separated from their environment of creation.

like blogs, facebook and twitter drew increased interest from archivists in several studies both as future historical sources worthy of preservation and as tools to connect with constituents such as professional and public historians. first, scholars considered preservation. one study’s participants showed “indifference, mistrust, and confusion about the preservation of their facebook records.” furthermore, participants assumed their records lacked historical or research value. suffice it to say, these results do not bode well for future research use.

additionally, preservation of tweets drew attention from archival scholars. four obstacles arose. first, tweets were ephemeral and lacked standards and best practices for collection and preservation. second, both experiential and contextual information could be lost in the course of preservation. third, it was difficult if not impossible to determine whether a given account was used by a single user or by multiple users, much less to verify the identity of a user or of users. finally, archivists faced two ethical issues: the anonymity and safety of users and the inability to secure consent. yet timothy arnold and walker sampson offered useful prescriptions for preserving tweets. first, they advocated for documenting the tools employed to gather any tweet collection(s). further, to preserve necessary contextual information, archivists should document the rationale behind their search parameters (for example their selection of keyword terms and hashtags).
second, both facebook and twitter spurred archives to strengthen bonds with existing constituents and to cultivate new audiences, public and professional historians among them. as an extension of existing outreach endeavors, facebook allowed archivists to keep pace with peer institutions’ outreach efforts and to raise the public profile of their own institutions as well as to share their collections. the vast majority of participants in a recent study ( of ) deemed facebook a “good” or “great” outreach venue.

like facebook, twitter increased its archival profile in the s, in no small measure because the library of congress committed in to preserve the entire twitter archives. a recent study found institutions successfully increasing awareness of and access to their collections through twitter. institutions engaged with their audiences through administrative updates ( . %), links to institutional site content ( . %), link sharing from other sites ( . %), interacting with twitter users ( . %), event promotion ( . %), and social media–focused tweets ( . %). two findings seemed propitious. first, through twitter, smaller institutions could have an impact disproportionate to their size. second, twitter encouraged reciprocity between users and institutions. indeed, one project on the war of testified to the possibilities of twitter vis-à-vis outreach. therefore, web . can potentially contribute to the democratization of history and highlight archivists’ public roles in the process.

despite the myriad possibilities offered by web . , archivists would be well advised to remember, as roland and bawden proclaimed, “among all the ‘noise’—blogs, emails, status updates, chat forums, tweets—there is also much silence.” the digital divide continues to loom large in the archives. web .
is no exception. scholars thus might explore how best to capture representative web . content for future historians. what selection and appraisal policies and practices are appropriate?

democratization and public history

carl becker famously declared, “the history that lies inert in unread books does no work in the world.” professional historians seemed oblivious. more than six decades later, douglas greenberg lamented that american professional historians lacked legitimacy with the general public, in no small measure because of their tendency to remain cloistered in the academy.

by the early s, however, the public history web seemed a reality based upon grassroots efforts that comprised individuals, nonprofit organizations, and government agencies. ideally, such democratization could counteract the narrowing of concerns of professional historians. rosenzweig argued, “the web takes carl becker’s vision of ‘everyman a historian’ one step further—every person has become an archivist or publisher of historical documents.” indeed, those who rarely if ever had access to historical materials could now access or even publish such materials.

in this vein, the very notion of who counted as a “historian” expanded to include amateurs, curators, documentarians, historical society personnel, teachers, and students. the web could render the past better documented, more diverse, and more democratic. moreover, web . fostered symbiosis between scholarly and popular history; this augured well for the examination of collective experience, consciousness, and public memory. greenberg maintained, “public historians can do their work for the public, by the public, and with the public.”

popular engagement, however, introduced potential drawbacks: users might not grasp the context(s) surrounding materials. similarly, users might reflexively presume sources’ impartiality or completeness or both.
more broadly, the democratization of history paradoxically could exacerbate the digital divide, whether between commercial and nonprofit entities or between resource-rich and resource-poor educational institutions. the ideal spelled out by the american council of learned societies, “we should place the world’s cultural heritage—its historical documentation, its literary and artistic achievements, its language, beliefs, and practices—within the reach of every citizen,” remains just that, an ideal. scholars might ask: how can the most diverse contributions to the public history record on the web be secured, and what roles might archivists (and historians) play in enlisting such contributions?

crowdsourcing and citizen archivists

crowdsourcing complements the democratization of history. “by designing platforms that make adding real value to our work intriguing, easy, and fun,” archivist of the united states david ferriero contended, “we can cultivate both professional and non-professional ‘citizen archivists.’” members of the public may contribute to public education.
“crowdsourcing,” asserted johan oomen and lora aroyo, “has the potential to help build a more open, connected, and smart cultural heritage with involved consumers and providers: open (the data is open, shared and accessible), connected (the use of linked data allows for interoperable infrastructures, with users and providers getting more and more connected), and smart (the use of knowledge technologies and web technologies allows us to provide interesting data to the right users, in the right context, anytime, anywhere).” for instance, “crowdsourcers” might engage in correction and transcription, contextualization, classification, curation, and crowdfunding. terry cook advocated for archivists’ public engagement as coaches, mentors, and partners. crowdsourcing qua peer production would likely thrive if contributors chose the projects they worked on and determined how much time to invest. should this succeed, “the archives of the people (as they have always been, but only in the abstract) thus become the archives by the people (who contribute and add value) and for the people (who now can actually use them).” in this vein, archivists could collaborate with historians to promote initiatives such as history harvest, which encourages citizens to contribute digitizations of their documents and artifacts for education and research. yet ensuring contributors are consistent and knowledgeable given their lack of training is a central challenge, as is maintaining an appropriate level of quality and accuracy of their products. (and who will determine what is appropriate?) marc parry asked, “will enough volunteers participate to sustain these projects? will the crowd care about less-sexy subjects, beyond war and famous individuals? and could transcribers’ political beliefs skew their work on documents related to sensitive history topics?” these are useful questions for scholars to unpack.
“how well we meet that challenge for more democratic, inclusive, holistic archives may determine how well we flourish as a profession in this digital century,” terry cook prognosticated. how might archivists encourage citizen participation in such endeavors?

the american archivist vol. , no. fall/winter alex h. poole

digital curation

digital curation centers on “planned, systematic, purposeful, and directed actions that make digital information fit for a purpose.” future historical research will depend upon the born-digital materials that digital curation addresses. archives and digital curation work are complementary. areas of knowledge overlap, including ownership, donor relations, intellectual property, appraisal, provenance and respect des fonds, the context of creation and use, authenticity, evidence, the life cycle, descriptive hierarchy, access and use restrictions, transfer of ownership, permanence, and metadata. a recent survey found that more than half ( %) of respondents, all of whom were college or university archivists, were involved in campus conversations about curation. nearly half of respondents collected institutional or research data in their repositories. nonetheless, institutional size mattered: the largest institutions saw the most archivist involvement. most striking, the vast majority of participants ( %) believed archivists should be involved with digital curation on some level, but only % of these respondents felt capable of fulfilling their perceived roles. for their part, historians demonstrated an inconsistent level of engagement with digital curation.
roland and bawden underscored historians’ potentially conflicting priorities: “while digitization of analogue collections is recognized as progressive in that it increases access to historical resources and knowledge, as well as enabling a more democratic, alternative history to be told, others regard the digitization of born-digital material such as blogs and datasets as more pressing due to its fragile and vulnerable nature.” in the digital as in the analog world, however, not everything can or should be preserved. appraisal and selection remain stumbling blocks. indeed, one study’s respondents wanted the selection criteria for digital data to mirror those of analog materials. hence their sample favored archivists making final appraisal and selection decisions. “the meaningful preservation of digital information will determine the stories future historians will (or will not) tell, the information they will (or will not) access, and the knowledge available (or not) for future generations to build upon.” how can archivists make historians more aware of the need for and benefits of digital curation? can they work together to develop criteria for selection and appraisal?

activist archivists and social justice

historian howard zinn insisted in , “the archivist, in subtle ways, tends to perpetuate the political and economic status quo simply by going about his ordinary business.” “the rebellion of the archivist against his normal role is not, as so many scholars fear, the politicizing of a neutral craft, but the humanizing of an inevitably political craft,” he maintained. but many archivists seemed slow or reluctant to heed zinn’s exhortation.
one united kingdom archivist observed, “thirty years on from zinn’s comments, there clearly remains a need to take up his call to become ‘activist archivists.’” though some archivists may view activism as “controversial, even inappropriate,” anne gilliland justified such activism. “with this agency and activism,” she stipulated, “comes a responsibility that needs to be informed by supporting evidence and appropriate technical and methodological expertise; broad critical consciousness; cultural awareness and sensitivity to the needs and rights of individuals who are the creators, subjects, or users of archival materials; robust and relevant professional ethics; and . . . strong self-reflection and public disclosure of the personal motivations behind one’s actions.” a recent exchange between mark a. greene and randall c. jimerson showed that the debate over the appropriateness or the nature (or both) of archival activism continued to thread professional discourse. greene asserted, “pursuing ‘social justice,’ as high minded and as universal an aspiration as it may sound, risks overly politicizing and ultimately damaging the archival profession.” he favored documenting controversial issues rather than participating in them. jimerson, on the other hand, remarked, “what the call of justice asks archivists to accept is a responsibility to level the playing field. the archival profession as a whole—but not necessarily each individual archivist or repository—should assume a responsibility to document and serve all groups within society.” the society of american archivists weighed in on the issue in and seemed to lean toward greene’s position. the organization concluded:

although some—or even most—of saa’s leaders, members, and staff may hold similar views on social issues and matters of social justice, the organization as a whole does not have the resources or knowledge of a consensus to comment or act on every social issue that emerges.
to choose to comment or act on one issue to the exclusion of others would raise concerns about how saa reaches a decision about when to become involved and when and how the broader membership is consulted (or even polled) about their individual positions on a given social issue.

the profession, it seems safe to say, remains divided on the issue of activism and social justice. might soliciting input from the historical profession given its own efforts in this area past and present enrich the conversation among archivists as well as among archivists and historians?

diversity and the changing demographics of the archival profession

nearly years ago, kathryn m. neal urged archival professionals to recruit minorities into the profession. interest in diversity and inclusivity soon burgeoned. younger archivists increasingly hoped to shed light on marginalized populations by unearthing and publicizing previously overlooked documentary materials. a diversity agenda that embraces multiculturalism should encourage multiple perspectives while highlighting the relationships among them. in the late s, the archival profession foregrounded three facets of diversity: within the profession at large, within the society of american archivists, and in the historical record.
elizabeth adkins averred, “after a long and somewhat tortuous journey, diversity is now a front-and-center priority.” greene subsequently cautioned, “unless and until archivists of the so-called majority culture immerse themselves in the challenging, sometimes harsh, frequently perplexing, and usually nuanced world of diversity issues, it is unlikely that our profession, our institutions, our collections, and our researchers will achieve truly fundamental and enduring successes in achieving the goals—unclear as those often may be—of multiculturalism in archives.” how can the archival profession recruit and retain archivists of color? how can collecting policies be developed to preserve the diversity of the cultural record? how can archivists ensure such diverse materials are made available and accessible to historians?

education and training

archivists, historians, and librarians still differ over the place of archival education in the curriculum. dissatisfaction with the graduate education of historians dates at least to the mid-twentieth century, as philip c. brooks, for one, lamented. matters scarcely improved in the late s. as one study noted, professors, themselves ignorant of research methods, often refused to admit their own limitations and insisted that students effectively teach themselves. two decades later, stasis still obtained. a team of historians and archivists soon weighed in: “nothing better illustrates both the uncertainty about teaching archival principles and the inadequacy of historical and archival cooperation than the state of graduate history courses in research methodology.” students relied upon trial and error, a strategy both expensive and time consuming: bridges et al. lobbied for a synthesis of historical and archival research methods.

nearly two decades later, many historians remained ill informed about the need to train their students to understand archival contexts. one study found that doctoral students’ training depended largely upon their advisers and that the process of learning to work with primary sources remained informal. these students received scant support for learning new research methods; they struggled to narrow the scope of their research, to refine their arguments, to manage their sources and notes, and to locate technological support. “current historical scholars do not really engage with the conceptual impact of the digital age despite using digital resources in their work,” weller recently asserted, “and consequently current students of history are often not taught to think about these conceptual issues or to apply traditional historical methodologies to their everyday digital and online experiences.”

historically, archival instruction for undergraduates was circumscribed to orientations, tours, and displays. there existed neither competencies nor learning objectives nor standards for undergraduate archival education; trial and error prevailed. but students often needed considerable guidance in using archival materials, and such guidance was rarely forthcoming. one study found a great deal of variation regarding how faculty members addressed archival research: for instance, some targeted it only toward history majors and others only for upper-level students. but archivists now have a prime opportunity to educate undergraduate and graduate students, exposing them to “clio in the raw.” an archives can serve as a “laboratory in critical thinking” that trains students to select authentic and credible evidence as well as to analyze and interpret primary sources.
archivists can introduce students to archival holdings and help them to discern research topics and to learn key skills. familiarizing students with primary sources in particular not only integrates archives specifically into the curriculum, but also introduces students to or reinforces research methods based upon an understanding of finding aids and archival concepts such as provenance. such instruction can also connect students to historical artifacts both emotionally and physically.

for instance, xiaomu zhou found that most students in her sample struggled to use primary sources. as such, she found that the teaching of basic archival skills was the most vital part of the orientation. similarly, magia g. krause found that even rudimentary archival education improved students’ abilities with respect to critical thinking and to grasping historical context. as important, archival education appreciably improved students’ ability to use primary sources. third, wendy duff and joan cherry’s study found undergraduate students’ confidence in using archival materials increased over the course of a semester following an archives orientation. on a -point scale, these students’ confidence in their ability to locate archival materials (mean) went from . to . . more than three-quarters ( . %) of the students thought the session provided “essential” or “generally good” knowledge, and the professors involved found the orientation both positive and impactful.

doris j. malkmus also explored primary sources and undergraduate education, but examined a digital component as well. two challenges remained: searching effectively in digital collections, not merely accessing them, and critically analyzing web resources instead of falling prey to the web’s easy gratification.
more archivists than ever graduated from library and information science programs in the s and s; these graduates were increasingly technologically literate. archivists are well placed to instruct students on using web resources in particular. to address archival education for students, sammie morris, lawrence j. mykytiuk, and sharon a. weiner proposed the concept of “archival literacy,” “the knowledge, skills, and abilities necessary to effectively and efficiently find, interpret, and use archives, manuscripts, and other types of unique unpublished materials.” archival literacy included understanding and locating primary sources; developing a research question and an argument; soliciting feedback and guidance from archivists; showing increasing familiarity with archives; adhering to publication standards; and progressively refining these skills.

perhaps most important, studies pointed to the potential for increased collaboration both among historians and archivists and among archivists themselves. for example, zhou determined an opportunity for collaboration among archivists and faculty, primarily in assessing students’ pre-existing knowledge before the orientation and thereby ensuring the orientation is tailored to student needs. similarly, in teaching students about online archival sources, archivists and faculty might collaborate in developing an online tutorial, as malkmus and morris et al. suggested. determining outcomes for archival education and methods for evaluating their success are crucial in informing optimum training programs. how can the sorts of collaboration noted by these scholars be refined and extended?

conclusion

despite concerns over mutual incomprehension among archivists and historians, their relationship may well be more symbiotic than ever. as suggested by the examination of previous findings and of possibilities for collaboration, notions of archival divides and foreign countries seem unduly alarmist.
archivists should resist being “society’s footnotes”; collaborating with historians in new and more proactive ways constitutes a crucial way of doing so. in this vein, more archivists should investigate historians’ work practices and should publish their findings.

“the past may be an undiscovered country,” toni weller asserted, “but the digital age demands its own bold historical exploration.” e. h. carr’s sage observation remains true: “there is no more significant pointer to the character of a society than the kind of history it writes or fails to write.” in this, historians and archivists alike bear a heavy responsibility.

notes

the author would like to thank helen r. tibbo for her comments on an earlier draft of this article. terry cook, “the archive(s) is a foreign country: historians, archivists, and the changing archival landscape,” the american archivist (fall/winter ): . lena roland and david bawden, “the future of history: investigating the preservation of information in the digital age,” library and information history , no. ( ): . kristen nawrotzki and jack dougherty, introduction, in writing history in the digital age, ed. kristen nawrotzki and jack dougherty (ann arbor: university of michigan press, ), . jennifer rutner and roger c. schonfeld, “supporting the changing research patterns of historians,” ithaka s+r ( ), , http://www.sr.ithaka.org/research-publications/supporting-changing-research-practices-historians. wendy duff, barbara craig, and joan cherry, “finding and using archival resources: a cross-canada survey of historians studying canadian history,” archivaria ( ): .
fredric miller, “use, appraisal, and research: a case study of social history,” the american archivist (fall ): . ian anderson, “are you being served? historians and the search for primary sources,” archivaria ( ): – . i use the term “archivist” in the broadest sense, consonant with the definition offered in the society of american archivists glossary of archival and records terminology: “n. ~ . an individual responsible for appraising, acquiring, arranging, describing, preserving, and providing access to records of enduring value, according to the principles of provenance, original order, and collective control to protect the materials’ authenticity and context. – . an individual with responsibility for management and oversight of an archival repository or of records of enduring value,” http:// www .archivists.org/glossary/terms/a/archivist. cook, “the archive(s) is a foreign country,” – ; francis x. blouin jr. and william rosenberg, processing the past: contesting authority in history and the archives (new york: oxford university press, ). cook, “the archive(s) is a foreign country,” , . cook, “the archive(s) is a foreign country,” – . blouin and rosenberg, processing the past, . blouin and rosenberg, processing the past, . maygene daniels, “on being an archivist,” the american archivist (winter ): . andrew abbott, the system of professions: an essay on the division of expert labor (chicago: university of chicago press, ). marlene manoff reflected, “interest in the archive is growing despite—or perhaps because of—the recognition of the holes in the historical record, the problems of its arbitrariness and lack of transparency.” see marlene manoff, “theories of the archives from across the disciplines,” portal: libraries and the academy , no. ( ): . in fashioning history: current practices and principles (new york: palgrave-macmillan, ), robert berkhofer probes the role of archives and archivists in historiography. see especially chapter . 
cook, “the archive(s) is a foreign country,” . on this ambivalence, see william birdsall, “the two sides of the desk: the archivist and the historian, – ,” the american archivist , no. ( ): – ; patrick quinn, “archivists and historians: the times they are a-changin’,” midwestern archivist , no. ( ): – ; luke gilliland-swetland, “provenance of a profession: the permanence of the public archives and historical manuscripts tradition in american archival history,” in american archival studies: readings in theory and practice, ed. randall jimerson (chicago: society of american archivists, ), – . on the “founding brothers,” see richard j. cox, charles dollar, rebecca hirsch, and peter j. wosh, “founding brothers: leland, buck, and cappon and the formation of the archives profession,” the american archivist ( /supplement): – . birdsall, “the two sides of the desk,” . albert r. newsome, “the archivist in american scholarship,” the american archivist , no. ( ): . j. frank cook, “the blessings of providence on an association of archivists,” the american archivist (fall ): . philip c. brooks, “archivists and their colleagues: common denominators,” the american archivist , no. ( ): . philip c. brooks, “archives and the young historian,” historian (march ): . donald mccoy, the national archives: america’s ministry of documents, – (chapel hill: university of north carolina press, ), . lester j. cappon, “the archival profession and the society of american archivists,” the american archivist , no. ( ): . karl l. trever, “the american archivist: voice of a profession,” the american archivist , no. ( ): . john edwards caswell, “archives for tomorrow’s historians,” the american archivist (october ): . lester j. cappon, “tardy scholars among the archivists,” the american archivist , no. ( ): . james b. 
rhoads, “the historian and the new technology,” the american archivist , no. ( ): . w. kaye lamb, “the archivist and the historian,” american historical review , no. ( ): . walter rundell jr., “relations between historical researchers and custodians of source materials,” college and research libraries , no. ( ): , . he appealed to historians’ self-interest too: “historians can only profit by establishing amicable relations with custodians of original sources necessary for their research,” . quinn, “archivists and historians,” . quinn, “archivists and historians,” – . philip p. mason, “archives in the seventies: promises and fulfillment,” the american archivist (summer ): . frank g. burke, “the future course of archival theory in the united states,” the american archivist (winter ): . george bolotenko, “archivists and historians: keepers of the well,” archivaria (summer ): . mattie u. russell, “the influence of historians on the archival profession in the united states,” the american archivist (summer ): . william l. joyce, “archivists and research use,” the american archivist (spring ): . miller, “use, appraisal, and research,” . barbara c. orbach, “the view from the researcher’s desk: historians’ perceptions of research and repositories,” the american archivist (winter ): – . edwin bridges, gregory hunter, page putnam miller, david thelen, and gerhard weinberg, “toward better documenting and interpreting of the past: what history graduate programs in the twenty-first century should teach about archival practices,” the american archivist (fall ): . wendy duff and catherine johnson, “accidentally found on purpose: information-seeking behavior of historians in archives,” library quarterly , no. ( ): – . catherine johnson and wendy duff, “chatting up the archivist: social capital and the archival researcher,” the american archivist (spring/summer ): . johnson and duff, “chatting up the archivist,” . duff, craig, and cherry, “finding and using archival resources,” . “historians profit from the archivists’ extensive knowledge and the history of the records on both sides of the archival threshold. archivists profit from the historians’ focus and topic interests which can reveal new connections among sources and may turn up additional knowledge that strengthens description,” . johnson and duff, “chatting up the archivist,” . rutner and schonfeld, “supporting the changing research patterns of historians,” , . doris j. malkmus, “teaching history to undergraduates with primary sources: survey of current practices,” archival issues , no. ( ): . cook, “the archive(s) is a foreign country,” . blouin and rosenberg, processing the past, . david lowenthal, the past is a foreign country (cambridge: cambridge university press, ). see, for instance, mary speakman, “the user talks back,” the american archivist (spring ): – ; paul conway, “facts and frameworks: an approach to studying the users of archives,” the american archivist (fall ): – ; william maher, “the use of user studies,” the midwestern archivist ( ): – ; roy turnbaugh, “archival mission and user studies,” the midwestern archivist ( ): – ; bruce dearstyne, “what is the use of archives?: a challenge for the profession,” the american archivist (winter ): – . donald case, “the collection and use of information by some american historians: a study of motives and methods,” library quarterly , no. ( ): – . mark a. greene and dennis meissner, “more product, less process: revamping traditional archival processing,” the american archivist , no. (fall/winter ): . donald kelley, frontiers of history: historical inquiry in the twentieth century (new haven: yale university press, ), .
peter novick, that noble dream: the “objectivity question” and the american historical profession (cambridge: cambridge university press, ), . tom nesmith, “archives from the bottom up: social history and archival scholarship,” archivaria ( ): . “fundamental to social history is a search for a determinative context, a respect for the cultures of different groups, and a recognition of the power of diversity.” see alice kessler-harris, “social history,” in the new american history, ed. eric foner (philadelphia: temple university press, ), . fredric m. miller, “social history and archival practice,” the american archivist (spring ): ; nesmith, “archives from the bottom up,” . miller, “social history and archival practice,” . dale c. mayer agreed: “both records groups and personal papers share organizational characteristics and descriptive practices which are not responsive to the needs of researchers in general and social historians in particular.” see dale c. mayer, “the new social history: implications for archivists,” the american archivist (fall ): . mayer, “the new social history,” . so multiple are the survivals of interest to students of history today that it is difficult to find a classification for them. see berkhofer, fashioning history, . “cultural history, once a cinderella among the disciplines, neglected by its more successful sisters, was rediscovered in the s.” see peter burke, what is cultural history? (malden, mass: polity, ), ix. burke, what is cultural history?, – . burke reflected in , “almost everything seems to be having its cultural history written these days.” peter burke, “the new history: its past and its future,” in new perspectives on historical writing, ed. peter burke (university park: pennsylvania state university press, ), – . h. white and k. 
mccain, “bibliometrics,” annual review of information science and technology ( ): ; graham sherriff, “information use in history research: a citation analysis of master’s level theses,” portal: libraries and the academy , no. ( ): . white and mccain, “bibliometrics,” . orbach, “the view from the researcher’s desk,” . linda c. smith, “citation analysis,” library trends , no. ( – ): – . miller, “use, appraisal, and research,” . arthur mcanally, the characteristics of materials used in research in united states history (phd diss., university of chicago, ). annie may alston, “characteristics of materials used by a selected group of historians in their research in united states history” (master’s thesis, university of chicago, ). clyve jones, michael chapman, and pamela carr woods, “the characteristics of the literature used by historians,” journal of librarianship and information science (july ): – . clark elliott, “citation patterns and documentation for the history of science: some methodological considerations,” the american archivist , no. ( ): – . miller, “use, appraisal, and research,” – . m. sara lowe, “reference analysis of the american historical review,” collection building , no. ( ): – . jana brubaker, “primary materials used by illinois state history researchers,” illinois libraries , no. ( ): – . sherriff, “information use in history research,” – . donghee sinn, “impact of digital archival collections on historical research,” journal of the american society for information science and technology , no. ( ): – . beattie, “an archival user study,” . questionnaire-based studies assume that what is being measured is equivalent to the information need of the scholar making the demands, a potentially misleading assumption.
see charles cole, “inducing expertise in history doctoral students via information retrieval design,” library quarterly , no. ( ): – . beattie, “an archival user study,” . earl babbie, the practice of social research (belmont, calif.: wadsworth/thomson learning, ). michael stevens, “the historian and the archival finding aid,” georgia archive , no. ( ): – . peter uva, information-gathering habits of academic historians: report of the pilot study (syracuse: state university of new york, upstate medical center, ), http://files.eric.ed.gov/fulltext/ed .pdf. margaret stieg, “the information needs of historians,” college and research libraries (november ): – . beattie, “an archival user study,” – . helen r. tibbo, “primarily history in america: how u.s. historians search for primary materials at the dawn of the digital age,” the american archivist (spring/summer ): – . duff, craig, and cherry, “finding and using archival resources,” – ; wendy duff, barbara craig, and joan cherry, “historians’ use of archival sources: promises and pitfalls of the digital age,” public historian (spring ): – . susan hamburger, “how researchers search for manuscript and archival collections,” journal of archival organization ( ): – . alexandra chassanoff, “historians and the use of primary source materials in the digital age,” the american archivist , no. (fall/winter ): – . duff, craig, and cherry, “finding and using archival resources.” duff, craig, and cherry, “historians’ use of archival sources.” herbert rubin and irene rubin, qualitative interviewing: the art of hearing data (thousand oaks, calif.: sage, ), . donald case put it skeptically, “interviews are time-consuming and are based on unreliable self-reports of thought, motivation, and action.” see case, “the collection and use of information by some american historians,” . raymond vondran, “the effects of method on the information-seeking behavior of academic historians” (phd diss., university of wisconsin–madison, ).
orbach, “the view from the researcher’s desk,” – . d ow nloaded from http://m eridian.allenpress.com /doi/pdf/ . / - . . . by c arnegie m ellon u niversity user on a pril the american archivist vol. , no. fall/winter archival divides and foreign countries? historians, archivists, information-seeking, and technology: retrospect and prospect case, “the collection and use of information by some american historians, – ; donald case, “conceptual organization and retrieval of text by historians: the role of memory and metaphor,” journal of the association of information science and technology (october ): – . charles cole, “information acquisition in history phd students: inferencing and the formation of knowledge structures,” library quarterly , no. ( ): – ; cole, “inducing expertise in history doctoral students via information retrieval design,” – ; charles cole, “name collection by phd history students: inducing expertise,” journal of the american society for information science and technology , no. ( ): – . roberto delgadillo and beverly lynch, “future historians: their quest for information,” college and research libraries , no. ( ): – . duff and johnson, “accidentally found on purpose,” – . vondran, “the effects of method on the information-seeking behavior of academic historians.” beattie, “an archival user study,” – . helen tibbo, abstracting, information retrieval and the humanities: providing access to historical literature (chicago: american library association, ). deborah lines andersen, “academic historians, electronic information access technologies, and the world wide web: a longitudinal study of factors affecting use and barriers to that use,” journal of the association for history and computing , no. ( ), http://quod.lib.umich.edu/j/jahc/ . . /--academic-historians-electronic-information-access?rgn=main;view=fulltext; q =deborah+lines+andersen. suzanne r. 
graham, “historians and electronic resources: patterns and use,” journal of history and computing (september ), http://quod.lib.umich.edu/j/jahc/ . . /--historians-and-electronic-resources-patterns-and-use?rgn=main;view=fulltext. anderson, “are you being served?,” – . margaret stieg dalton and laurie charnigo, “historians and their information sources,” college and research libraries , no. ( ): – . donald case, for instance, advocates coupling interviews with examination of behavioral artifacts or observations of scholars at work (“the collection and use of information by historians,” ). “because what people do often fails to match what they say—a behavior pattern to which historians were considered unlikely to be an exception—perception and reality can be two different things.” see dalton and charnigo, “historians and their information sources,” . even annie may alston’s study, completed one year after mcanally’s and aimed directly at juxtaposing findings, found that results were “seen to deviate in varying degrees in almost every analysis.” see alston, “characteristics of materials used by a selected group of historians in their research in united states history,” . duff, craig, and cherry, “finding and using archival resources.” duff, craig, and cherry, “historians’ use of archival sources.” duff, craig, and cherry asked participants to rank these sources as very important or somewhat important; the figures in this table represent the aggregation of these two categories. i was unable to discern exact numbers from duff, craig, and cherry’s figure. orbach, “the view from the researcher’s desk,” . duff and johnson, “accidentally found on purpose,” . daniel v. pitti, “encoded archival description,” in encoded archival description: context, theory, and case studies, ed. jackie dooley (chicago: society of american archivists, ), – . helen r. tibbo and lokman i. meho, “finding aids on the world wide web,” the american archivist (spring/summer ): – . 
they stressed that users needed to learn the search engine’s searching protocol, to employ phrase searching, and to search for materials using more than one search engine. tibbo, “primarily history in america,” – . dalton and charnigo, “historians and their information sources,” – . those who did use electronic resources deemed comprehensiveness their highest priority. anderson, “are you being served?,” – . duff, craig, and cherry, “historians’ use of archival sources,” . duff, craig, and cherry, “finding and using archival resources,” . rutner and schonfeld, “supporting the changing research patterns of historians,” . daniel cohen, “is google good for history?” (talk, american historical association annual meeting, january , ), http://www.dancohen.org/ / / /is-google-good-for-history/. chassanoff, “historians and the use of primary source materials in the digital age,” – . rutner and schonfeld, “supporting the changing research patterns of historians,” . sinn, “impact of digital archival collections on historical research,” . chassanoff, “historians and the use of primary source materials in the digital age,” – . duff and johnson, “accidentally found on purpose,” . andersen, “academic historians, electronic information access technologies, and the world wide web.” sinn, “impact of digital archival collections on historical research,” . mcanally, the characteristics of materials used in research in united states history. clyve jones, michael chapman, and pamela carr woods, “the characteristics of the literature used by historians,” journal of librarianship and information science (july ): . uva, information-gathering habits of academic historians. elliott, “citation patterns and documentation for the history of science,” – . miller, “use, appraisal, and research,” . 
in rank order, stieg’s historians favored books, periodicals, manuscripts, newspapers, microcopies, and theses/dissertations and government publications. miller, “use, appraisal, and research,” . helen r. tibbo, abstracting, information retrieval, and the humanities: providing access to historical literature (chicago: ala, ). orbach, “the view from the researcher’s desk,” . beattie, “an archival user study,” . duff, craig, and cherry, “finding and using archival resources,” ; tibbo, “primarily history in america,” . graham, “historians and electronic resources.” tibbo, “primarily history,” . duff, craig, and cherry, “finding and using archival resources,” , – . duff and cherry noted, “many users stated that they liked the original paper format most because it was the easiest to read and to navigate; it also provided a sense of the whole document. they highlighted the importance of the physical attributes of the original paper and commented on its authenticity, accuracy, trustworthiness, and completeness.” see duff and cherry, “use of historical documents in a digital world.” rutner and schonfeld, “supporting the changing research patterns of historians,” . along these lines, jones, chapman, and woods found that two repositories accounted for a full % of all manuscript references, perhaps an indication of the difficulty of accessing archival materials physically. see jones, chapman, and woods, “the characteristics of the literature used by historians,” . rutner and schonfeld, “supporting the changing research patterns of historians,” – . chassanoff, “historians and the use of primary source materials in the digital age,” – . “thus digital images of actual documents and the highest resolution of images may not be a necessary condition for the use of digital collections. 
what matters more seems to be the uniqueness of the content.” see sinn, “impact of digital archival collections on historical research,” ; fred gibbs and trevor owens, “building better digital humanities tools: toward broader audiences and user-centered design,” digital humanities quarterly , no. ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html. chassanoff, “historians and the use of primary source materials in the digital age,” – . stieg, “information needs of historians,” . stieg, “information needs of historians,” – . then again, photographs appeared only in one-fourth and oral histories in less than one-tenth of the citations she examined. beattie, “an archival user study,” – . dalton and charnigo, “historians and their information sources,” . duff, craig, and cherry, “finding and using archival resources,” . rutner and schonfeld, “supporting the changing research patterns of historians,” . on the other hand, these historians expressed concerns about properly capturing, presenting, and citing these materials. sinn, “impact of digital archival collections on historical research,” – . chassanoff, “historians and the use of primary source materials in the digital age,” – . valerie harris and peter hepburn, “trends in image use by historians and the implications for librarians and archivists,” college and research libraries (may ): . “while the availability of images online has grown steeply, the number of images in history journals has remained more or less level during the decade between and ,” . harris and hepburn, “trends in image use by historians,” . stevens, “the historian and archival finding aids,” . 
uva, information-gathering habits of academic historians, – . lowe, “reference analysis of the american historical review,” . vondran, “the effects of method on the information-seeking behavior of academic historians.” stevens, “the historian and archival finding aids,” . case, “the collection and use of information by some american historians,” . delgadillo and lynch, “future historians,” – . stieg, “the information needs of historians,” . hernon, “information needs and gathering patterns of academic social scientists,” . dalton and charnigo, “historians and their information sources,” . stieg, “the information needs of historians,” . miller, “use, appraisal, and research,” . orbach, “the view from the researcher’s desk,” . herbert a. simon, “rational choice and the structure of the environment,” psychological review , no. ( ): – ; donald o. case, “the principle of least effort,” in theories of information behavior, ed. karen e. fisher, sandra erdelez, lynne mckechnie (medford, n.j.: information today, inc., ): – . orbach, “the view from the researcher’s desk,” . orbach’s historians behaved pragmatically: of interviewees knew their intended end product before they began, and of had an embryonic thesis upon beginning research. factors that caused them to stop seeking information included their intended audience, impending deadline(s), the discovery of contrary evidence, or the exhaustion of themselves or of their resources. duff and johnson, “accidentally found on purpose,” . along these lines, johnson and duff suggested that electronic systems would need to mimic the archivist’s contextual knowledge of his or her collections. see johnson and duff, “chatting up the archivist,” . case, “the collection and use of information by some american historians,” . cole, “information acquisition in history phd students,” – . chassanoff, “historians and the use of primary source materials in the digital age,” – . orbach, “the view from the researcher’s desk,” – . 
cole, “name collection by phd history students,” . duff and johnson, “accidentally found on purpose,” . hamburger, “how researchers search for manuscript and archival collections,” . all the same, % of her sample ultimately found what they sought. gereben zaagsma, “on digital history,” low countries historical review , no. ( ): – ; william g. thomas iii, “computers and the historical imagination,” in a companion to digital humanities, ed. susan schreibman, ray siemens, and john unsworth (oxford: blackwell, ), http://www.digitalhumanities.org/companion/view?docid=blackwell/ / .xml&chunk.id=ss - - &toc.depth= &toc.id=ss - - &brand=default. dagmar horna perman, “bibliography and the historian,” in bibliography and the historian: the conference at belmont of the joint committee on bibliographic services to history, ed. dagmar horna perman (santa barbara, calif.: clio, ), . robert p. swierenga, “clio and computers: a survey of computerized research in history,” computers and the humanities (september ): . rhoads, “the historian and the new technology,” . h. j. hanham, “clio’s weapons,” daedalus (spring ): . ian anderson, “history and computing,” ( ), making history, http://www.history.ac.uk/makinghistory/resources/articles/history_and_computing.html. swierenga, “clio and computers,” . “the first halting steps are now in the past and the way is open,” . joel h. silbey, “clio and computers: moving into phase ii, – ,” computers and the humanities (november ): . uva, information-gathering habits of academic historians. charles dollar, “quantitative history and archives,” archivum ( ): . 
perhaps more important, he explicitly linked development in history with the relationship among historians and archivists: “the on-going revolution in computer and telecommunications processing virtually brings the future into the present, a situation which can be ignored only at great loss to archives and archivists throughout the world,” . richard jensen, “the microcomputer revolution for historians,” journal of interdisciplinary history (summer ): . jensen offered a striking parallel: “the impact of the microcomputer revolution is analogous to the impact of the personal automobile on the passenger transportation system,” . thomas iii, “computers and the historical imagination.” for instance, historians probed how database tools affected their interpretations and whether such tools could handle nontabular data. slatta, “historians and microcomputing, ,” . zaagsma, “on digital history,” . helen r. tibbo, “information systems for the humanities,” annual review of information science and technology ( ): . igartua, “the computer and the historian’s work,” . case, “conceptual organization and retrieval of text by historians,” ; case, “the collection and use of information by some american historians,” . case, “the collection and use of information by some american historians,” . gilmore and case, “historians, books, computers, and the library,” – . tibbo, “information systems for the humanities,” . michael o’malley and roy rosenzweig, “brave new world or blind alley? american history on the world wide web,” journal of american history (june ): . robert darnton, “a historian of books, lost and found in cyberspace,” chronicle of higher education, march , . pamela pavliscak, seamus ross, and charles henry, information technology in humanities scholarship: achievements, prospects, and challenges—the united states (washington, d.c.: american council of learned publications, ), http://archives.acls.org/op/ _information_technology.htm. 
andersen, “academic historians, electronic information access technologies, and the world wide web.” deborah lines andersen, “historians on the web: a study of academic historians’ use of the world wide web for teaching,” journal of the association for history and computing , no. ( ), http://quod.lib.umich.edu/j/jahc/ . . /--historians-on-the-web-a-study-of-academic-historians-use?rgn=main;view=fulltext. andersen, “historians on the web.” cole, “inducing expertise in history doctoral students via information retrieval design,” . delgadillo and lynch, “future historians,” – . nevertheless, delgadillo and lynch thought it unclear whether students used technology more frequently than their mentors. that is, “original work that is responsibly based on primary sources, is intelligently informed by relevant scholarship, and makes a clear argument or group of arguments.” see carl smith, “can you do serious history on the web?,” aha perspectives ( ), rrchnm, http://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid= . o’malley and rosenzweig, “brave new world or blind alley?,” . o’malley and rosenzweig, “brave new world or blind alley?,” . dennis trinkle, “computers and the practice of history: where are we? where are we headed?,” perspectives: american historical association newsletter ( ), http://www.historians.org/perspectives/issues/ / / not .cfm. they cited concerns such as rapid technological obsolescence, a lack of appropriate training, students’ resistance, a negative effect on tenure and promotion, and uncertainty about technology’s potentially deleterious effect on teaching. 
trinkle, “computers and the practice of history.” duff and cherry, “use of historical documents in a digital world.” graham, “historians and electronic resources.” roy rosenzweig, “scarcity or abundance? preserving the past in a digital era,” american historical review , no. ( ): . roy rosenzweig, “digital archives are a gift of wisdom to be used wisely,” chronicle of higher education, june , , http://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid= . william turkel and his colleagues returned to this point in , announcing, there is “no substitute for close and critical reading, for careful citation, or for reasoned judgment.” w. j. turkel, k. kee, and s. roberts, “navigating the infinite archive,” in history in the digital age, ed. toni weller (london: routledge, ), . anderson, “are you being served?,” . dalton and charnigo, “historians and their information sources,” . dalton and charnigo, “historians and their information sources,” . tibbo, “primarily history in america,” – . dalton and charnigo, “historians and their information sources,” . daniel cohen, “the future of preserving the past,” journal of heritage stewardship ( ), http://www.dancohen.org/files/future_of_preserving_the_past.pdf; anderson, “are you being served?,” . david a. bell, “the bookless future: what the internet is doing to scholarship,” new republic , nos. – ( ): ; gilmore and case, “historians, books, computers, and the library,” . daniel cohen, “history and the second decade of the web,” rethinking history , no. ( ): . alexander maxwell, “digital archives and history research: feedback from an end-user,” library review , no. ( ): . maxwell, “digital archives and history research: feedback from an end-user,” , . robert townsend, “how is digital media reshaping the work of historians?,” perspectives on history ( ), http://www.historians.org/perspectives/issues/ / / pro .cfm. sinn, “impact of digital archival collections on historical research,” . 
sinn, “impact of digital archival collections on historical research,” . fred gibbs and trevor owens, “building better digital humanities tools: toward broader audiences and user-centered design,” digital humanities quarterly , no. ( ), http://www.digitalhumanities.org/dhq/vol/ / / / .html. onno boonstra, leen breure, and peter doorn, past, present and future of historical information science (the hague: data archiving and networked services, ), – . rutner and schonfeld, “supporting the changing research patterns of historians,” . weller, “introduction: history in the digital age,” – . nawrotzki and dougherty, introduction, . turkel, kee, and roberts, “navigating the infinite archive,” . “technology is often used to make the traditional methods a little bit easier without challenging standards or creating alternative procedures and tactics.” gibbs and owens, “building better digital humanities tools.” o’malley and rosenzweig, “brave new world or blind alley?,” . “above all, the archivist in the internet age needs to stay ahead of technology.” see laura millar, archives: principles and practices (new york: neal schuman, ), . helen tibbo insists, “archivists of all stripes must understand technology issues including information system architecture, the nature of electronic records and databases, content and digital asset management systems, record migration, digitization, and web design and creation.” see helen r. tibbo, “so much to learn, so little time to learn it: north american archival education programs in the information age and the role for certificate programs,” archival science ( ): . douglas seefeldt and william g. thomas, “what is digital history? a look at some exemplar projects,” perspectives on history (may ). 
“on one level,” seefeldt and thomas claimed, “digital history is an open arena of scholarly production and communication, encompassing the development of new course materials and scholarly data collection efforts. on another level, digital history is a methodological approach framed by the hypertextual power of these technologies to make, define, query, and annotate associations in the human record of the past.” daniel j. cohen, michael frisch, patrick gallagher, steven mintz, kirsten sword, amy murrell taylor, william g. thomas iii, and william j. turkel, “interchange: the promise of digital history,” journal of american history , no. (september ): . o. v. burton, “american digital history,” social science computer review , no. ( ): , . boonstra, breure, and doorn, past, present and future of historical information science, . rosenzweig and cohen, digital history. shawn martin, “digital scholarship and cyberinfrastructure in the humanities: lessons from the text creation partnership,” journal of electronic publishing (winter ), http://quod.lib.umich.edu/j/jep/ . . ?rgn=main;view=fulltext. thomas iii, “computers and the historical imagination.” useful works on gis include joanna guldi, “what is the spatial turn?” (university of virginia library, spatial humanities, institute for enabling geospatial scholarship, ), http://spatial.scholarslab.org/spatial-turn/; anne kelly knowles, ed., placing history: how maps, spatial data, and gis are changing historical scholarship (redlands, calif.: esri, ); anne kelly knowles, ed., past time, past place: gis for history (redlands, calif.: esri press, ); and richard white, “what is spatial history?,” stanford university, spatial history project, , http://www.stanford.edu/group/spatialhistory/cgi-bin/site/pub.php?id= . william g. thomas iii, “blazing trails toward digital history scholarship,” social history , no. ( ): . maxwell, “digital archives and history research,” . 
fred gibbs and trevor owens, “the hermeneutics of data and historical writing,” in writing history in the digital age, . national endowment for the humanities, “railroads and the making of modern america,” grant number: hj- - , https://securegrants.neh.gov/publicquery/main.aspx?f= &gn=hj- - ; national endowment for the humanities, “using zotero and tapor on the old bailey proceedings: data mining with criminal intent,” grant number: hj- - , https://securegrants.neh.gov/publicquery/main.aspx?f= &gn=hj- - . christa williford and charles henry, one culture: computationally intensive research in the humanities and social sciences, a report on the first respondents to the digging into data challenge (washington, d.c.: council on library and information resources, ), . anderson, “history and computing.” james crossman, “‘big data’: an opportunity for historians?,” perspectives on history (march ). american council of learned societies, “our cultural commonwealth: the report of the american council of learned societies commission on cyberinfrastructure for the humanities and social sciences” (new york: american council of learned societies and the andrew w. mellon foundation, ), . zaagsma, “on digital history,” . donald t. hawkins, introduction, in personal archiving: preserving our digital heritage, ed. donald t. hawkins (medford, n.j.: information today, inc., ), . clifford lynch, “the future of personal digital archiving: defining the research agendas,” in personal archiving, . catherine c. 
marshall warned, “the very same characteristics that make personal digital assets attractive—the ease with which they are created, edited, copied, and shared; the fact that they don’t take up real space; and the long tail phenomenon, to name a few—also make digital stewardship a far greater burden.” see catherine c. marshall, “rethinking personal digital archiving, part : four challenges from the field,” d-lib magazine (march/april ), http://www.dlib.org/dlib/march /marshall/ marshall-pt .html. susan thomas, “paradigm: a practical approach to the preservation of personal digital archives” (oxford: bodleian library, ), . neil beagrie, “plenty of room at the bottom? personal digital libraries and collections,” d-lib magazine (june ), http://www.dlib.org/dlib/june /beagrie/ beagrie.html; thomas, “paradigm,” . jordan bass, “a pim perspective: leveraging personal information management research in the archiving of personal digital records,” archivaria (spring ): . thomas, “paradigm,” . tim o’reilly, “web . : compact definition?,” o’reilly radar, october , , http://radar.oreilly .com/ / /web- -compact-definition.html. kate theimer introduces the notion of “archives . ,” which pivots on openness, transparency, flexibility, experimentalism, user-centeredness, technology, measurement, assessment, efficiency, advocacy, proactivity, interpretation, shared standards, iterative products, and innovation. see kate theimer, “what is the meaning of archives . ?,” the american archivist (spring/summer ): – . terry cook, “‘we are what we keep, we keep what we are’: archival appraisal past, present and future,” journal of the society of archivists (october ): . roland and bawden advocated forcefully for saving web . materials. see roland and bawden, “the future of history,” . mary samouelian, “embracing web . : archives and the newest generation of web applications,” the american archivist (spring/summer ): , . 
sean heyliger, juli mcloone, and nikki lynn thomas, “making connections: a survey of special collections’ social media outreach,” the american archivist (fall/winter ): ; . for example, “special collections’ use of social media platforms closely follows the conventional format and usage of each: short, frequent tweets on twitter; somewhat less frequent, slightly longer updates on facebook; and infrequent, semiregular, lengthy posts on blogs” ( ). samouelian, “embracing web . ,” – . heyliger, mcloone, and thomas, “making connections,” . j. gordon daines and cory l. nimer, “web . and the archivist,” society of american archivists, the interactive archivist, may , , http://interactivearchivist.archivists.org/. max j. evans, “archives of the people, by the people, for the people,” the american archivist (fall/winter ): . catherine o’sullivan, “diaries, on-line diaries, and the future loss to archives; or, blogs and the blogging bloggers who blog them,” the american archivist (spring/summer ): . o’sullivan, “diaries, on-line diaries, and the future loss to archives,” . o’sullivan, “diaries, on-line diaries, and the future loss to archives,” – . craig blaha, “preserving facebook records: subscriber expectations and behavior,” preservation, digital technology, and culture , no. ( ): . blaha, “preserving facebook records,” . “twitter affords a unique opportunity to collect records of unfolding events unfiltered by mass media.” see timothy arnold and walker sampson, “preserving the voices of revolution: examining the creation and preservation of a subject-centered collection of tweets from the eighteen days in egypt,” the american archivist (fall/winter ): . arnold and sampson, “preserving the voices of revolution,” , , , . arnold and sampson, “preserving the voices of revolution,” – . joshua d. 
hager, “to like or not to like: understanding and maximizing the utility of archival outreach on facebook,” the american archivist (spring/summer ): – . hager, “to like or not to like,” . adam kriesberg, “increasing access in characters or less; or, what are archival institutions doing on twitter?,” the american archivist (fall/winter ): – . katy lalonde, chris sanagan, and sean smith, “the war of in characters or less: ‘supercool or super un-tweet worthy?,’” the american archivist (fall/winter ): – . roland and bawden, “the future of history,” . carl becker, “everyman his own historian,” american historical review vol. , no. (jan. ): . douglas greenberg, “‘history is a luxury’: mrs. thatcher, mr. disney, and (public) history,” reviews in american history (march ): – . “the road to xanadu: public and private pathways on the history web,” journal of american history (september ): . rosenzweig, “scarcity or abundance?,” . rosenzweig, “the road to xanadu,” . rosenzweig and cohen, digital history. weller, “conclusion: a changing field,” . greenberg, “‘history is a luxury,’” . weller, “conclusion: a changing field,” . rosenzweig and cohen, digital history. american council of learned societies, our cultural commonwealth, . “the practice of obtaining information or services by soliciting input from a large number of people, typically via the internet and often without offering compensation,” oxford english dictionary, s.v. “crowdsourcing,” http://www.oed.com/view/entry/ ?redirectedfrom=crowdsourcing. david ferriero, “cultivating citizen archivists,” national archives, aotus blog, april , , http:// blogs.archives.gov/aotus/?p= . ferriero, “cultivating citizen archivists.” johan oomen and lora aroyo, “crowdsourcing in the cultural heritage domain: opportunities and challenges,” proceedings of the th international conference on communities and technologies, june –july , (brisbane, australia), . oomen and aroyo, “crowdsourcing in the cultural heritage domain,” – . 
cook, “‘we are what we keep, we keep what we are,’” . evans, “archives of the people, by the people, for the people,” . evans, “archives of the people, by the people, for the people,” . william g. thomas, patrick d. jones, and andrew witmer, “history harvest: what happens when students collect and digitize the people’s history,” perspectives on history (january ), http://www.historians.org/publications-and-directories/perspectives-on-history/january- /history-harvests. oomen and aroyo, “crowdsourcing in the cultural heritage domain,” – . parry, “historians ask the public to help organize the past.” cook, “‘we are what we keep, we keep what we are,’” . national research council of the national academies, preparing the workforce for digital curation (washington, d.c.: national academies press, ), . boonstra, breure, and doorn, past, present and future of historical information science, . alex h. poole, “how has your science data grown? digital curation and the human factor, a critical literature review,” archival science , no. ( ): – . jackie dooley, “the archival advantage: integrating archival expertise into management of born-digital library materials” (dublin, ohio: oclc research, ); anne gilliland-swetland, enduring paradigm, new opportunities: the value of the archival perspective in the digital environment (washington, d.c.: council on library and information resources, ). daniel noonan and tamar chute, “data curation and the university archives,” the american archivist (spring/summer ): – . roland and bawden, “the future of history,” . roland and bawden, “the future of history,” . roland and bawden, “the future of history,” . roland and bawden, “the future of history,” . 
howard zinn, “secrecy, archives, and the public interest,” midwestern archivist , no. ( ): . zinn, “secrecy, archives, and the public interest,” . ian johnston, “whose history is it anyway?,” journal of the society of archivists , no. ( ): . anne gilliland, “pluralizing archival education: a non-zero-sum proposition,” in through the archival looking glass: a reader on diversity and inclusion, ed. mary k. caldera and kathryn m. neal (chicago: society of american archivists, ), . gilliland, “pluralizing archival education,” – . mark a. greene, “a critique of social justice as an archival imperative: what is it we’re doing that’s all that important?,” the american archivist (fall/winter ): . greene, “a critique of social justice as an archival imperative,” . randall c. jimerson, “archivists and social responsibility: a response to mark greene,” the american archivist (fall/winter ): . society of american archivists, “saa’s criteria for advocacy statements,” http://archivists.org/statements/saa%e % % s-criteria-for-advocacy-statements. “as an organization that values social responsibility, the public good, and the completeness of the public record and that understands the importance of advocacy, saa encourages its members to engage with social issues to the extent that they, as individuals, are able.” kathryn m. neal, “the importance of being diverse: the archival profession and minority recruitment,” archival issues , no. ( ): . mary k. caldera and kathryn m. neal, introduction, in through the archival looking glass, xix. terry cook was perhaps less impressed: “look around the conference rooms of every archival conference i’ve ever attended in the anglo-saxon world for over three decades now, and you see a white, middle-class, well-educated, and not very diverse group—the only significant change in that time is the male-gender demographic domination has been replaced by a female one.” see cook, “‘we are what we keep, we keep what we are,’” . 
valerie love and marisol ramos, “identity and inclusion in the archives: challenges of documenting one’s own community,” in through the archival looking glass, . rabia gibbs, “the heart of the matter: the developmental history of african american archives,” the american archivist (spring/summer ): . elizabeth adkins, “our journey toward diversity—and a call to (more) action,” the american archivist (spring/summer ): – . adkins, “our journey toward diversity,” . mark a. greene, “into the deep end: one archivist’s struggles with diversity, community, collaboration, and their implications for our profession,” in through the archival looking glass, . joseph m. turrini, “the historical profession and archival education,” aca news (summer ): . brooks, “archives and the young historian,” . walter rundell jr., “researching the american past for relevance,” the history teacher , no. ( ): . orbach, “the view from the researcher’s desk,” . bridges et al., “toward better documenting and interpreting of the past,” . bridges et al., “toward better documenting and interpreting of the past,” – . blouin and rosenberg, processing the past, . rutner and schonfeld, “supporting the changing research patterns of historians,” . weller, “introduction: history in the digital age,” . sammie morris, lawrence j. mykytiuk, and sharon a. weiner, “archival literacy for history students: identifying faculty expectations of archival research skills,” the american archivist (fall/winter ): – . doris j. malkmus, “teaching history to undergraduates with primary sources: survey of current practices,” archival issues , no. ( ): . hugh a. taylor, “clio in the raw: archival materials and the teaching of history,” the american archivist , (july/october ): . 
“the current emphasis on active learning provides a unique opportunity for archivists to reshape their traditional role in the education of undergraduates.” see malkmus, “teaching history to undergraduates with primary sources,” . marcus c. robyns, “the archivist as educator: integrating critical thinking skills into historical research methods instruction,” the american archivist (fall/winter ): , – . xiaomi zhou, “student archival research activity: an exploratory study,” the american archivist (fall/winter ): . michelle mccoy, “the manuscript as question: teaching primary sources in the archives—the china mission project,” college and research libraries (january ): . zhou, “student archival research activity,” , . in such teaching, the archivist leads by example, that is, she shows the students archival materials and offers her interpretation in real time. magia g. krause, “undergraduates in the archives: using an assessment rubric to measure learning,” the american archivist (fall/winter ): . wendy m. duff and joan m. cherry, “archival orientation for undergraduate students: an exploratory study of impact,” the american archivist (fall/winter ): – . malkmus, “teaching history to undergraduates with primary sources,” – . as rosenzweig argued, “thus far we have done much better at democratizing access to resources than at providing the kind of instruction that would give meaning to those resources.” roy rosenzweig, “digital archives are a gift of wisdom to be used wisely,” chronicle of higher education, june , , http://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid= . bridges et al., “toward better documenting and interpreting of the past,” ; turrini, “the historical profession and archival education,” . lis programs have also hired more archives-focused faculty: at the end of s, there were faculty members in lis and in history; by , however, lis programs had and history departments only . 
archival education came into its own as a field of education and research only in the s. see kelvin l. white and anne j. gilliland, “promoting reflexivity and inclusivity in archival education, research, and practice,” library quarterly (july ): . morris, mykytiuk, and weiner, “archival literacy for history students,” . morris, mykytiuk, and weiner, “archival literacy for history students,” – . zhou, “student archival research activity,” . malkmus, “teaching history to undergraduates with primary sources,” . wendy m. duff, amy marshall, carrie limkilde, and marlene van ballegooie, “digital preservation education: educating or networking?,” the american archivist (spring/summer ): ; morris, mykytiuk, and weiner, “archival literacy for history students,” – . cook, “the archive(s) is a foreign country,” . “the archival profession will not make significant inroads against its marginality without also encouraging its own scholarship.” see nesmith, “archives from the bottom up,” . weller, “introduction: history in the digital age,” . e. h. carr, what is history? (cambridge: cambridge university press, ), .
about the author
alex h. poole is an assistant professor at drexel university’s college of computing and informatics. he earned his phd at the school of information and library science at the university of north carolina at chapel hill. his website is https://alexhpoole.wordpress.com/.

reaching people: library strategy -
in , the national library of scotland launched the way forward: library strategy, - as the first of two five-year strategies to take the library to its centenary in . reaching people: library strategy, - is the second strategy, concluding in the year the library celebrates its th anniversary. there are strong elements of continuity between the two. this is clearest in our continued focus on safeguarding collections and improving access to them. these two areas were strongly supported in our public consultations and link directly to our statutory function as scotland’s legal deposit library. however, there are important differences between the strategies. reaching people: library strategy, - will emphasise connecting with multiple audiences and enriching lives with our content and services. the first five years of our ten-year journey to our centenary focused on building infrastructure, capacity and resilience, while growing partnerships, reputation and income. the second period concentrates on sharing information, knowledge and experiences with a diverse audience in scotland and around the globe. in developing this strategy, we have listened carefully to the results of audience surveys and to user feedback. these have told us clearly about the need to enhance the facilities in the george iv bridge building in edinburgh and the desire for greater access to collections in digital formats. to achieve this, we will launch a series of programmes and projects to deliver important aspects of our plans for the next five years. work has begun on the potential redevelopment of the main library building in edinburgh. a programme of preservation and digitisation of moving image and sound will save some of the most fragile formats in the country from disintegration. a project will be developed to preserve, digitise and make available to the nation scotland’s newspaper heritage. 
we will also continue to explore opportunities to reach people outside scotland’s central belt. we have partially addressed this by significantly expanding our digital services over the last five years. the new strategy aims to intensify this work. the events of have demonstrated how important this is. in a similar vein, we will bring to the foreground our web-archiving activity and electronic legal deposit of published content as a growing source for scotland’s online memory. in the first wave of initiatives, a large-scale digitisation project with an international partner will be pursued. the library will address the silences, including historical biases, in our collections. we will analyse our inventory to identify and recognise under-represented communities to allow us to create a more representative national collection into the future. a powerful backdrop to our deliberations has been a strong sense of scotland’s historical and cultural past. few countries have more fully embraced the ability of literacy, education and civic discourse to improve their people and develop their economy. the library will continue to be a place where ideas, debate and discussion take place. we have seen in the year we commence this new strategy how important it is that we are prepared for rapid change which will require flexible working and creative thinking. our response to the covid- pandemic demonstrated an agility that will frame our thinking for the next decade. of equal importance has been the recent publication of a culture strategy for scotland by the scottish government (february, ) which will provide important national guidance for our work. the library strategy has been produced in collaboration with the library board, our staff, our partners, groups and individuals. the feedback and ideas are reflected in the final document. 
i am extremely grateful for all the ideas, enthusiasm and support we have received throughout this conversation about the future shape and direction of the national library. working together, i am confident that reaching people: library strategy - will make the library a truly national library fit for the st century when we celebrate our th birthday.
introduction: dr john scally, national librarian
participation in cultural activities is at a historical high and the overall use of the library’s content is buoyant. however, the way the library is used is changing. though the number of reading room visits is steady, we have seen greater numbers visit to view the exhibitions, attend talks and take part in other educational activities. the way people access culture using digital technology also continues to evolve, so we must keep innovating to meet demand and stay relevant. in the coming years, it will be necessary to deliver our services through multiple channels for those who may not have access to our buildings. most of the library’s income is from the scottish government. the library has vast physical collections, requiring more than miles of shelving across four different sites. our expanding digital collections are held in two datacentres in edinburgh and glasgow and, increasingly, in the cloud. all of this needs to be regularly refreshed and upgraded to preserve the national collection from threats such as fire, water, bit-rot (deterioration of data) and cyber-attacks. nevertheless, we know that there are many competing demands and that pressure on public funding is likely to continue. over the past five years, there has been more reliance on external grants and donations. the diversity of the library’s income will continue to grow. although the library has always been a strong advocate of open access, we know there are socio-economic and geographical variances in how people engage with the library. 
accordingly, themes of equality, diversity and inclusion underpin the new strategy and we know this is an area where we must strive to make our collections and services more representative of the whole nation. partnerships will continue to be of critical importance. our relationship with the faculty of advocates goes back more than years and our work with the other five legal deposit libraries is fundamental in collecting, preserving and making available digital and physical content. we continue to learn by working with other organisations and these partnerships help us identify where we can become stronger. over the past five years, memoranda of understanding have been signed with the universities of glasgow and edinburgh, scottish enterprise, and, most recently, bbc scotland. the library has also gone on the road with its touring exhibitions. more of these partnerships and collaborations will help us develop services for the people of scotland and extend our user base. much has been achieved over the past five years. there has been more experimentation and some well-judged risk-taking. we have developed our own voice on social media – a mix of humour and candid views on the role of libraries in an era of misinformation.
the operating environment
the national library will be a familiar and valued institution across scotland, recognised for its outstanding collections and services. it will be known as a library with responsive and accessible services, simple to find and easy to use at the click of a button, the tap of a screen, or by simply walking into one of our buildings. as scotland’s premier research library, we will be a place where researchers and scholars are valued and supported as our original core audience. we are continually evolving our services and growing our collections to meet their varying needs. 
many of our visitors will be regulars, taking the opportunity to enhance their knowledge of family, community, hobbies or current issues, such as climate change. those studying towards qualifications, pursuing continuous professional development or researching a business idea will find a mixture of tailored resources to satisfy their information needs. young people and families will increasingly find the library a welcoming place, with a growing programme of events designed for them. the library’s collections will be more comprehensive and more representative of the whole nation. our physical and digital collecting will be guided by our role as the guardian of the published and recorded memory of scotland. our web-archiving and digital collecting activities across the nation will be deeper, wider and more representative. we will address the silences in our collections to ensure that a richer variety of voices, views and experiences are collected, described and curated. the library’s digital scholarship services will be amongst the best in europe, delivered through the data foundry, a dynamic destination where our data can be downloaded, reused and replayed. by , we will be a library that offers a personalised experience to every resident in scotland.
what the national library will look like in
open: we commit to openness and transparency in all areas of our work. we aim to make our collections and our related work free, open and reusable wherever possible.
trusted: we provide accurate and reliable information to support debate and discussion.
inclusive: we are responsive and inclusive as we build and interpret collections for current and future generations. we will challenge ourselves, our assumptions and our policies in order to create a more inclusive collection and a more diverse audience.
connected: we work collaboratively to improve our services and extend the benefits they offer. 
inspiring: we believe in the power of the collections to change lives through learning, research, discovery and improved wellbeing. we actively support participation in culture and heritage for everyone.
responsible: we commit to minimising our environmental impact, and to creating a more sustainable, resilient and healthy environment for future generations.
mission: to enhance scotland’s international reputation by making a significant and lasting contribution to global knowledge and the memory of the world.
vision: to create opportunities for people to participate in scotland’s rich cultural life as one of the leading national libraries in europe.
how we work: the national library of scotland is a place for inspiration, exploration, and enjoyment. everyone is treated with dignity and respect. these are our guiding principles:
we are the guardian of the published and recorded memory of scotland for current and future generations.
we will collect, preserve and make available diverse materials that represent the lives and memories of scotland’s people, and which contribute to world knowledge.
we will work with partners to secure the nation’s fragile moving image, sound and newspaper collections.
we will preserve and make available the online memory of scotland through our web-archiving activities, placing the nation at the forefront of open content archiving.
we will work to address the silences in the collections to ensure a richer and more representative variety of voices, views and experiences of st century scotland are collected and curated.
our strategic priorities for - : safeguarding collections
we make it easier for people to access the collections.
we will deliver outstanding digital engagement, helping people to use the collections in the most creative ways possible.
joining and using the library will be simple and seamless, opening up a personalised world of knowledge, learning and entertainment.
people will have access to more than million of the library’s items in digital format as we complete our ‘one third digital’ initiative.
it will be easier to discover the library’s special and hidden collections through our programme of online listing, cataloguing and discovery work.
we will provide a safe and trusted environment for informal learning and activity that promotes wellbeing.
improving access
we put audiences at the heart of everything we do and offer a rich variety of ways for people to participate and engage with their heritage.
we will take an audience-led approach to the development and delivery of all the library’s services and cultural experiences.
we will employ the latest technologies and expert staff to transform our public spaces into inspiring and welcoming destinations for research, discovery, lifelong learning and entertainment.
we will engage communities throughout scotland with the collections – through touring exhibitions, targeted learning and outreach activities, and innovative online content.
we will create new programmes and services to reach wider and more diverse audiences and to help support communities around scotland to thrive.
engaging audiences
we encourage and support research, learning and discovery.
we will support the contribution of new knowledge to the world by developing research collaborations and research fellowships.
we will encourage investigations of the collections from different angles, uncovering untold stories and giving fresh perspectives on society and culture, including key areas of public debate such as climate change and misinformation.
we will develop our digital scholarship service by presenting the collections as data, opening up new possibilities for research, learning and creativity.
we will provide people with access to learning through our collections in support of the curriculum, lifelong learning, creative practice and continuous professional development.
supporting learning, research and discovery
we will continue to be a great organisation to work for and with, developing new ways of doing, delivering and partnering.
people – we will support, develop and train our staff and recruit new talent to enhance our existing skills and knowledge.
sustainability – we will continue our leadership role in environmental sustainability, developing a fuller understanding of our impact on the environment. we will reduce, monitor and reduce further the environmental footprint of all areas of our operation.
estates – we will ensure our property assets are maintained and improved and collections are held securely in appropriate environments by implementing our property asset management plan.
data – we will use data to help the library optimise its services and business processes.
finance – we will derive ‘best value’ from our current core income sources, develop diverse income streams and clearly communicate the value of the library to external funders to attract additional income.
partnerships – we will grow our partnership activity both nationally and internationally with the aim of introducing new ways of working, reaching new audiences and supporting scotland’s vibrant cultural sector, especially libraries and archives.
developing the organisation
how we support a successful scotland
the preparation of the library strategy has been informed by the scottish government’s national performance framework (npf). the national performance framework was relaunched in and sets national outcomes. 
these outcomes are designed to support delivery of the scottish government’s purpose, which is: to focus on creating a more successful country, with opportunities for all of scotland to flourish, through increased wellbeing and sustainable and inclusive economic growth. although our work contributes to some extent to all of these outcomes, the library is most closely aligned to five and we will monitor our performance against these. the table below shows how the library’s outcomes match to the scottish government’s national outcomes. on the following pages, we outline examples of how the library’s work feeds into the national outcomes.
scottish government national outcomes:
we are creative and our vibrant and diverse cultures are expressed and enjoyed widely.
we are well educated, skilled and more able to contribute to society.
we have a globally competitive, entrepreneurial, inclusive and sustainable economy.
we tackle poverty by sharing opportunities, wealth and power more equally.
we value, enjoy, protect and enhance our environment.
national library of scotland outcomes:
we are the guardian of the published and recorded memory of scotland for current and future generations.
we make it easier for people to access the collections.
we put audiences at the heart of everything we do and offer a rich variety of ways for people to participate and engage with their heritage.
we encourage and support research, learning and discovery.
we will continue to be a great organisation to work for and with, developing new ways of doing, delivering and partnering.

we are well educated, skilled and more able to contribute to society.
the national library is widely acknowledged as the premier library for many of scotland’s research communities. we contribute to and create innovative resources for use in schools including ‘scotland on screen’ and the ‘national library learning zone’. 
we link with scottish universities, colleges and schools on innovative research projects. more than per cent of higher education students who completed the last audience survey said the library helped advance their education. by supporting the knowledge economy, we contribute to a modern, successful scotland.
we are creative and our vibrant and diverse cultures are expressed and enjoyed widely.
our collections help to enhance scotland’s international reputation for the quality of its literary, scientific and cultural heritage, and for treasuring this heritage. by collecting and recording the knowledge of scotland we preserve the memory bank of the nation. the library has the world’s largest collection of scottish gaelic material. research into family history is supported, helping many people trace their scottish family background. our exhibitions attract many international visitors, adding to their understanding of scottish identity. more than per cent of all respondents to the last audience survey said the library helped them better understand scotland’s culture and history.
we tackle poverty by sharing opportunities, wealth and power more equally.
we provide free access to all our collections, online and onsite. we continue to seek community benefits through our procurement activities. this includes fair work practices such as the living wage. we provide work experience and volunteer opportunities. we have an active outreach programme that works with schools, local community projects and community libraries across scotland. all our educational resources link to the curriculum for excellence and are promoted to schools across scotland.
we have a globally competitive, entrepreneurial, inclusive and sustainable economy.
our collection of business information resources is one of the largest collections of company and market data in the united kingdom, and is a key potential resource for scotland’s business community. 
we are the only national library in the united kingdom that provides direct access to an extensive range of market research reports, company and news data, and guides to starting and running a business directly via the web, free of charge, to registered users. we can deal with business enquiries in person, by phone or email, or via our library online chat service. we have worked with collaborators to develop the business & intellectual property centre in glasgow.
we value, enjoy, protect and enhance our environment.
we have reduced greenhouse gas emissions by more than per cent from - baseline levels. energy consumption has been reduced by per cent from - baseline levels. the percentage of waste that is recycled now exceeds per cent. we continue to operate a sustainable procurement policy.
the national library of scotland is a registered scottish charity. no. sc doi: . /zzxs-sr george iv bridge, edinburgh eh ew www.nls.uk

operationalizing “internationalization” in the community college sector: textual analysis of institutional internationalization plans
volume , accepted as an empirical research article by editor mary ann bodine al-sharif │received: april , │ revised: august , september , september , │ accepted: september , .
cite as: unangst, l., & barone, n. ( ). operationalizing “internationalization” in the community college sector: textual analysis of institutional internationalization plans. journal for the study of postsecondary and tertiary education, , - . https://doi.org/ . /
(cc by-nc . ) this article is licensed to you under a creative commons attribution-noncommercial . international license. when you copy and redistribute this paper in full or in part, you need to provide proper attribution to it to ensure that others can later locate this work (and to ensure that others do not accuse you of plagiarism). 
you may (and we encourage you to) adapt, remix, transform, and build upon the material for any non-commercial purposes. this license does not permit you to use this material for commercial purposes.
operationalizing “internationalization” in the community college sector: textual analysis of institutional internationalization plans
lisa unangst * boston college, chestnut hill, massachusetts, usa unangstl@bc.edu
nicole barone boston college, chestnut hill, massachusetts, usa baronena@bc.edu
* corresponding author
abstract
aim/purpose: this paper evaluates three community college internationalization plans using quantitative textual analysis to explore the different foci of institutions across three u.s. states.
background: one of the purposes of community college internationalization is to equip future generations with the skills and dispositions necessary to be successful in an increasingly globalized workforce. the extent to which international efforts have become institutionalized on a given campus may be assessed through the analysis of internationalization plans.
methodology: we use the textual analysis tool voyant, which has rarely been employed in educational research, being more frequently applied in the humanities and under the broad heading of “digital scholarship”.
contribution: extant literature examining internationalization plans focuses on the four-year sector, but studies centered on the two-year sector are scarce. this study addresses that gap and seeks to answer the research questions: how do community colleges operationalize internationalization in their strategic plans? what terms and/or concepts are used to indicate international efforts?
findings: key findings of this study include an emphasis on optimization of existing resources (human, cultural, community, and financial); the need for a typology of open access institution internationalization plans; and the fragmentation of international efforts at the community college level. 
impact on society: it is clear that internationalization at community colleges may take shape based on optimization of resources, which raises the question, how can education sector actors best support open access institutions in developing plans tailored to the local context and resources at hand?
future research: we recommend additional use of quantitative textual analysis to parse internationalization plans, and imagine that both a larger sample size and cross-national sample might yield interesting results. how do these institutional groupings operationalize internationalization in the corpus of their plans?
keywords: internationalization, community college, quantitative textual analysis, digital scholarship, international students
introduction
valeau and raby ( ) assert that community college international programs “play a key role in providing the skills needed for a competitive, globally competent workforce and for a citizenry who are cultured, transformative, and empowered to support reform at the local and global level” (p. ). international education scholars and practitioners have long recognized the influence of globalization on u.s. community colleges and how these institutions have responded over time (raby & valeau, ). globalization is understood here as a blurred economic and political phenomenon with neo-colonial aspects, and one which altbach and knight ( ) assert pushes “ st century higher education toward greater international involvement” (p. ). this shift also relates to internationalization, defined by knight ( ) as “the process of integrating international, intercultural or global dimensions into the purpose, functions, or delivery of postsecondary education” (p. ). 
recent scholarship on this topic has moved beyond the original, primary focus on the four-year sector, and an emerging area of interest addresses internationalization efforts in u.s. community colleges. over time, scholars and practitioners have emphasized the salience of community college internationalization. woodin ( ) asserted that how community colleges internationalize has critical implications for the global and local economies in which the institution is situated. similarly, dellow ( ) highlighted the importance of internationalizing community college academic and technical programs, citing globalization’s impacts on future employment opportunities for students. it is evident that community college internationalization is seen as a tool for advancing both economic and student development; it is through the various institutional conceptualizations of internationalization that we parse how international activities are defined and described through the textual analysis of internationalization plans at three institutions.

according to the mapping internationalization on u.s. campuses report, fewer than percent of community colleges surveyed had an articulated internationalization plan (helms & brajakovic, ). given the expanding nature of internationalization in u.s. higher education, the ability to access and analyze internationalization plans is necessary to describe, measure, and evaluate how internationalization transpires in the community college setting. through the use of voyant, a web-based text analysis tool, this study seeks to understand the salience and operationalization of internationalization efforts in community college internationalization plans through the following research questions:
1. how do community colleges operationalize internationalization in their strategic plans?
2. what terms and/or concepts are used to indicate internationalization efforts?
through the analysis of three community college internationalization plans, this study begins to address the gap in the literature on what we know about community college internationalization through the lens of strategic plans. background information on the history of internationalization, how internationalization has typically been measured, and barriers to internationalization at community colleges are presented. the study concludes with a discussion of the findings and recommendations for further research.

background

in order to assess the extent to which community colleges have kept abreast of internationalization activities, scholars have sought to measure internationalization across a number of domains using primarily quantitative approaches. for example, the survey conducted by the american council on education (ace) titled mapping internationalization on u.s. campuses (helms & brajakovic, ) measures the state of internationalization at u.s. higher education institutions across several comprehensive internationalization indicators: an articulated institutional commitment to internationalization, administrative structure and staffing, curriculum, co-curriculum, and learning outcomes, faculty policies and practices, student mobility, and collaborations and partnerships. though community colleges have typically represented a small portion of the institutions sampled in the mapping survey, scholars have called into question the extent to which all of the internationalization dimensions used in this survey and other tools that measure campus internationalization are relevant to community college settings (woodin, ).

one indicator that is noted as a particularly important dimension across many of the existing internationalization tools is an articulated institutional commitment or strategic internationalization plan (community colleges for international development [ccid], ; green & siaya, , helms & brajakovic, ; ivey, ).
while the current study does not seek to measure internationalization across u.s. community colleges, it does aim to shed light on community colleges’ articulated commitment to internationalization and how these institutions operationalize internationalization in the corpus of their internationalization plans. scott ( ) asserts that internationalization plans are critical to the advancement of a college or university’s internationalization efforts. these written commitments are crucial for “expressing institutional commitment, defining institutional goals, informing stakeholders’ participation, as well as informing and stimulating stakeholder involvement in internationalization initiatives” (childress, , p. ). it is clear that internationalization plans serve a vital role in how internationalization activities are identified, defined, and communicated.

yet what do we know about the study of internationalization plans and how internationalization is operationalized across various higher education institutions? in one study, childress ( ) analyzed internationalization plans at institutions and found that these plans accomplish several goals including offering guidelines for internationalization, garnering buy-in from campus stakeholders, explaining the meaning and goals of internationalization, encouraging collaboration between departments, and serving as a tool for fundraising. in another qualitative multi-case study on internationalization plans at jesuit institutions, nguyen ( ) found that the institutions in the study engaged in preliminary internationalization activities (e.g., recruiting international students, internationalizing the curriculum, increasing opportunities for study abroad and global partnerships), but operated with fragmented internationalization plans.
while these studies offer important insights into the role of internationalization plans, the literature on these plans is scant and these studies did not include any analysis of community college plans.

internationalization at community colleges

international education has been part of community colleges since the late s when those working to expand international education opportunities began to look to community colleges as a potential avenue and resource (raby & valeau, ). the increasing number of international education programs during this time sparked the establishment of the community colleges for international development (ccid) in , a non-profit organization that “empowers an international association of community, technical, and vocational institutions to create globally engaged learning environments” (ccid, n.d., para ). since its inception, ccid has collaborated with member institutions in order to integrate and embed international experiences across each campus sector. member institutions are encouraged to use ccid’s framework for comprehensive internationalization tool in order to self-assess progress on internationalization efforts (ccid, ). this tool is further detailed in the following sections.

as globalization played a larger role in american higher education, community colleges experienced increased growth in international activity, which was characterized by four phases: recognition, expansion and publication, augmentation, and institutionalization (raby & valeau, ). the recognition phase, which took place between the mid- s and the mid- s, was largely characterized by the establishment of study abroad and international student support programs at several community colleges, and the establishment of ccid and the american council on international and intercultural exchange (aciie) (raby & valeau, ).
this phase was supplanted by increased dissemination of information on community college internationalization, increased documentation and financial resources to support international efforts, and the expansion of international support offices (raby & valeau, ). the third phase, augmentation, endured during the s and was marked by concerted international student recruitment and the rise of study abroad programs (raby & valeau, ). finally, institutionalization has been characterized by the inclusion of international efforts into institutions’ strategic plans, mission, and vision statements, growth among study abroad and international student services, and a push for institutional leaders to drive internationalization efforts (raby & valeau, ).

though the past several decades have demonstrated valiant efforts on the part of community colleges to pursue international activities, these efforts have been measured and assessed using several tools and met with varying levels of success.

measuring internationalization at community colleges

with a greater emphasis being placed on creating campuses that are responsive to international education, scholars have sought to measure the extent to which community colleges are internationalizing. the mapping survey has often provided the data necessary for researchers to examine internationalization trends specific to two-year institutions. green and siaya ( ) reported on the first internationalization index developed from an ace survey created to measure the extent to which community colleges engaged in internationalization using the following metrics: articulated commitment, academic offerings, organization infrastructure, external funding, institutional investment in faculty and international students and programs.
the authors classified community colleges as highly active or less active based on their internationalization score, with percent of community colleges being categorized as less active (green & siaya, ).

harder ( ) drew upon mapping data to examine internationalization trends at suburban, rural, and urban community colleges. the author found that rural community colleges internationalized at lower levels than urban and suburban two-year campuses and institutional support was one of the leading indicators for successful internationalization (harder, ). this analysis resulted in recommendations for increasing institutional support in resource-constrained environments, including articulating a commitment to internationalization through strategic plans and mission statements, advocating for support from senior-level administrators, and establishing global focused learning outcomes.

around this same time period, the community colleges for international development (ccid) developed a framework for comprehensive internationalization that two-year institutions could use to self-assess areas for development and improvement. the iteration of this framework enables community colleges to assess their efforts across the following indicators: leadership and policy, organizational structure and personnel, teaching and learning, co-curricular, international student support, study abroad, professional development, and partnerships (ccid, ). this framework also allows institutions to assess their internationalization progress along a continuum (e.g., seeking, building, reaching, and innovating) (ccid, ).

copeland, mccrink, and starratt ( ) advanced harder ( ) and green and siaya’s ( ) work to create the community college internationalization index (ccii), which incorporates contemporary shifts in internationalization efforts.
this index seeks to measure internationalization efforts at the institution level for public community colleges (copeland et al., ). this tool allows institutions to track internationalization efforts while taking institutional context and community into account, a feature not present in green and siaya’s instrument.

higher education leaders, scholars, and practitioners continue to grapple with how best to internationalize their campuses, particularly in light of increasing enrollment of international students across all sectors of postsecondary education and continued political, social, and economic shifts around globalization. attempts to internationalize community colleges have not come without challenges, several of which are further described.

barriers to internationalization at community colleges

though there are several noted impediments to internationalization at community colleges, this study will briefly describe three of the most commonly cited obstacles: financial constraints, the inclusion of internationalization in the institution’s mission or strategic plan, and support from faculty and senior-level administrators, who serve as key drivers of international efforts in the two-year sector. in addition to examining these barriers, the authors also underscore the political context in which community colleges – and their respective internationalization plans – are situated. given that the majority of these institutions are public, they are subject to the political whims of the state policy contexts they are embedded in, as well as federal level trends, which indirectly impact state decision-makers as well as the overall education landscape. at present, the narrative of the ruling party in washington is one of significantly constrained immigration, including refugee and asylee admissions (trump, ). further, recent enrollment data reflects that the number of international students pursuing undergraduate and graduate degrees in the u.s.
has fallen since (redden, ); shifts in international student enrollment necessarily affect established and ongoing processes of internationalization. therefore, while this article describes barriers to internationalization in terms of resources, and a lack of strategic planning and engagement from key stakeholders, the authors understand that our analysis takes place in a challenging political time and context.

resource constraints

limited financial resources present a significant barrier to the development of international education and study abroad programs. amidst competing priorities, the revenue necessary to develop and maintain international programs proves challenging to sustain, particularly during a time when state support for higher education is increasingly on the decline (bissonette & woodin, ; green, ; raby & rhodes, ). it is typical that international programs and staff that support these programs are the first to be cut in times of financial austerity, thus limiting the extent to which international programs can be established and sustained (green, ; raby & valeau, ). due to the uncertain nature of funding for public higher education, community colleges find it challenging to designate resources in an institution’s budget for international efforts, which has implications for sustainability (bissonette & woodin, ).

articulated commitments

bissonette and woodin ( ) assert that “a well-communicated and well-implemented strategic plan will set international education on track for long-term success” (p. ). however, research consistently finds that the lack of a clear, articulated strategy on internationalization is a barrier to these efforts. in one qualitative study on internationalization at an urban community college, mcraven and somers ( ) found that there was disagreement among the college president, administrators, and trustees on whether a commitment to internationalization efforts should be included in the mission statement and strategic plan.
this point is particularly salient given that senior leaders and administrators serve as the “first line of advocacy and sets the tone for programmatic changes” (raby & valeau, , p. ). further complicating this issue is a lack of overall institutional strategy when deciding which international activities to pursue. green ( ) notes that typical internationalization plans refer to one or two activities (e.g., study abroad or recruiting international students) and fail to integrate internationalization goals with other institutional goals, including student learning outcomes. the importance of strategic plans cannot be overstated as these plans typically serve as one metric of an institution’s commitment to internationalization.

lack of faculty and leadership support

among both faculty and senior administrators, a lack of support for international efforts serves as another hindrance to achieving internationalization goals (green, ; harder, ). these views may stem from several sources. for example, scholars have noted a tension around the perception of the mission and goals of community colleges. raby and valeau ( ) stated, “although there is no national community college policy that opposes internationalization, there remains a belief that serving the local community is the opposite of a global connection” (p. ). the desire to serve local over global needs may also stem from how administrators and faculty value international activities. while some institutional leaders may fail to see the value in internationalizing their campuses, other faculty, staff, and administrators may hold negative views toward international education or intercultural learning in general (gore, ; green, ).
for administrators and faculty who hold positive views of international education, further restrictions are placed on internationalization efforts when faculty and leaders have limited experience or opportunities to engage in international activities (bista, ). unfortunately, these views have negative consequences which often result in fewer administrators and faculty leading international efforts, despite their critical role in this area.

methodology

this examination of three community college internationalization plans uses the open-source, online textual analysis platform voyant (described as a collection of analytical tools by its creators) (rockwell & sinclair, ). while voyant is not a new product – it was launched in – it has rarely been used in educational research, instead being more frequently applied in the humanities and under the broad heading of “digital scholarship”. we offer here a brief introduction to the platform before presenting our data, based on textual analysis of three internationalization plans.

the developers of voyant describe it as combining “the capabilities of personal-computer-based pre-indexing tools, such as tact, with more accessible web-based tools that can find text and create indexes in real time” (rockwell & sinclair, , p. ). in short, voyant is a website that allows any user to either upload documents or to enter website urls and conducts an indexing and correlation of the words contained in those documents or on those webpages. it then visualizes this data in a number of ways: through word clouds (of the corpus of documents and individual documents), distribution graphs, and indices of word frequency and phrase frequency (for instance, how often “international recruitment” is used in a given document). it also offers the correlation and significance level (p-value) of sets of two words within those documents or webpages.
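the word and phrase indices described above can be illustrated with a short python sketch. this is not voyant's code: the tokenizer, the function names (`tokenize`, `term_frequencies`, `phrase_frequency`), and the sample text are all assumptions made for illustration, and voyant's own tokenization rules may differ.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens.

    Assumption: a simple regex tokenizer; Voyant's own rules may differ.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def term_frequencies(text):
    """Return a Counter mapping each word to its number of occurrences."""
    return Counter(tokenize(text))


def phrase_frequency(text, phrase):
    """Count how often a multi-word phrase occurs as consecutive tokens."""
    tokens = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    if n == 0:
        return 0
    return sum(
        1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target
    )


# Hypothetical sample text, not drawn from any of the three plans.
sample = (
    "International recruitment supports the college. "
    "The college expands international recruitment and "
    "international partnerships."
)
freqs = term_frequencies(sample)
print(freqs.most_common(3))
print(phrase_frequency(sample, "international recruitment"))
```

a fuller treatment would also remove stopwords such as “the” before ranking terms, which is a common step in word-cloud tools.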
while there is no formal limit on the number of documents or website urls that may be analyzed at a given time, in general a larger corpus will result in a longer processing time, and may also return an error.

rockwell and sinclair ( ) view voyant as a component of a larger project they refer to as agile hermeneutics (ah), which is defined as

a pragmatic collaborative practice. at its heart it is pair work, and because only one person has his or her hands on the computer it requires dialogue between those participating. the paired scholars alternate between interpreting the results of text-analysis tools, and looking ahead, and reflecting back on what is needed. this then maximizes the dialogue between the scholar function and the development function to the point where they are woven into an organic whole. (rockwell & sinclair, , p. )

ah encourages interdependence of research community members and experimentation, which is also reflective of digital scholarship as a whole. here, improvements to software and analytic tools are largely open source and with attribution readily given. contributors, participants, and researchers may be pursuing vocational or avocational projects; in this mode, rockwell and sinclair ( ) have constructed voyant as a platform that “mixes tools as panels much like those in a comic book, creating a medley, or commedia, that encourages ‘serious play’” (p. ).

it is clear that voyant does not interpret meaning (rockwell & sinclair, ). for instance, a word cloud produced by a given university’s mission statement – taking our home institution boston college as an example – might display “boston” in the largest font size as the most frequently used word in the document. devoid of context, we might interpret this finding to mean that the mission statement is closely tied to local community-based initiatives, that town-gown relations are strongly valued by the college.
however, this analysis would miss that the use of boston might reflect the title of the institution being used repeatedly. this also relates to rockwell and sinclair’s warning of “the disappearance of the author” ( , p. ) and, by extension, context and intentionality, in voyant analysis.

finally, a word about our selection of community college sites. we employed purposeful sampling to obtain publicly available internationalization plans of three institutions located in different states with distinct frameworks for higher education policy and funding (patton, ). our decision to select three plans was based on the application of quantitative textual analysis, which can provide huge amounts of data from relatively short documents (data that, in turn, must be culled in order to present a standard length journal article). further, though we do not claim generalizability here, we found the examination of community colleges in varied contexts to be an important pre-requisite to looking at correlations and frequencies for the purposes of an exploratory analysis. thus, we emphasize that shoreline community college (washington), pima county community college district (arizona) and harper college (illinois) are not representative, but reflect distinct goals, contexts, and stakeholders (campus internationalization leadership team (cilt) shoreline community college, ; office of international education, ; pima county community college district board of governors, ).

analysis

in an effort to consider this grouping of internationalization plans, we first used voyant to evaluate the corpus of documents. the five most frequently used words in the corpus were: global ( ), international ( ), students ( ), community ( ), and college ( ) (sinclair & rockwell, c).
though this is not the particular emphasis of the paper at hand, we do find the use of “global” and “international” to pose several possible research questions: are these terms used interchangeably in all internationalization documents? does “global”, for instance, tend to refer to economic phenomena? a visualization of keywords in the corpus is provided in figure .

figure : word map, corpus of community college internationalization plans

we examined the corpus for keywords identified by ace and madeline green as essential to the campus internationalization process in the community college setting (green, ; helms & brajakovic, ). terms without statistically significant correlations at the p < . or p < . levels included “leaders*” (encompassing leadership, leaders, and leader), which was used times in the corpus; “study abroad” times; language* times; research* times; resource* times; agreement twice; and articulation once.
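the trailing-asterisk terms used in this analysis (for instance invest* or mobilit*) can be read as prefix matches over the tokens of the corpus. the python sketch below shows one way such a count might be computed; the function name `wildcard_counts`, the tokenizer, and the sample sentence are illustrative assumptions, not voyant's implementation.

```python
import re
from collections import Counter


def wildcard_counts(text, pattern):
    """Count tokens matching a term, treating a trailing '*' as a wildcard.

    'invest*' matches every token starting with 'invest'; a pattern
    without '*' must match a token exactly. This is a simplified reading
    of the wildcard syntax, assumed for illustration.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    pattern = pattern.lower()
    if pattern.endswith("*"):
        prefix = pattern.rstrip("*")
        return Counter(t for t in tokens if t.startswith(prefix))
    return Counter(t for t in tokens if t == pattern)


# Hypothetical sample sentence, not taken from the plans analyzed here.
sample = (
    "Investment in study abroad requires infrastructure. "
    "Leaders and leadership invest in mobility."
)
for term in ("invest*", "infra*", "mobilit*", "research*"):
    matches = wildcard_counts(sample, term)
    print(term, sum(matches.values()), dict(matches))
```

reporting the matched word forms alongside the total, as the loop does, makes it easy to check which variants a given wildcard actually captured.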
we further found that invest* (which would include investment), infra* (which would include infrastructure), and mobilit* (which would include mobility), were each used zero times. we find particularly notable the omission of mobility or mobilities here, given that these terms are so frequently employed in the internationalization literature (altbach & knight, ; clifford & montgomery, ; helms, rumbley, brajkovic, & mihut, ; hudzik, ; leask, ).

our impression at this point was that the emphasis of internationalization plans surveyed was on recognizing and capitalizing upon the international people and activities already present on the relevant campuses, rather than on intentionally expanding activities to pursue some new version of internationalization, for instance new student mobility initiatives or pools of funding for international faculty research. we further probed this concept by examining word pairing correlations in the corpus. correlations in voyant are:

calculated by comparing the relative frequencies of terms. a coefficient that approaches indicates that values correlate positively, that they rise and fall together. coefficients that approach indicate little correlation. approaching - , terms correlate negatively (as one term rises, the other falls). (dickerson, , p. )

most correlations generated by voyant across the three internationalization plans were not statistically significant at the p < . or p < . levels (sinclair & rockwell, b). we include in tables - select word pairings of particular interest. notably, one of the strongest positive correlations among all statistically significant results is between the terms “college” and “global” (r = . , p = . ). this of course reflects the topic of internationalization plans (in a sense it is reflexive) and again raises the question, how is “global” distinct from “international”? one of the strongest negative correlations was between the terms “global” and “diverse” (r = - . , p = .
), which is interesting as it seems to echo the artificial divide between diversity in the u.s. sphere and internationalism described by olson, evans, and shoenberg ( ) among others (maturana sendoya, ).

importantly, a number of keywords highlighted by ace and madeline green did appear in the corpus of internationalization plans examined, and could be tested for correlations through voyant (green, ; helms & brajakovic, ). these include both fund* ( results in corpus) and organiza* ( results in corpus). further, partner* was displayed times (including instances of “partnerships”, six of “partnership”, five of “partners”, and one instance of partner). in turn, collaborat* appeared times in the corpus (including seven instances of “collaboration”, four instances of “collaborate”, two instances of “collaborative”, and once for each of “collaborated”, “collaborations”, and “collaboration”). we include statistically significant word pairing correlations with partner* and collaborat* at the p < . level below. this was an area of particular interest, as we imagined that institutional partnerships and collaborations might discuss and display innovative or emerging approaches to internationalization.

table . word pairings with partner* and collaborat*, sorted by strength of correlation rounded to the one hundredth place
term term correlation (r) significance (p) term correlation (r) significance (p)
office partner* . . collaborat* . .
efforts partner* . . collaborat* . .
plan partner* . . collaborat* . .
academic partner* . . collaborat* . .
culture partner* . . collaborat* . .
new partner* . . collaborat* . .
goals partner* . . collaborat* . .
mission partner* . . collaborat* . .
statement partner* . . collaborat* . .
training partner* . . collaborat* . .
diverse partner* . . collaborat* . .
educational partner* . . collaborat* . .
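the correlations reported above compare the relative frequencies of two terms. a minimal python sketch of that idea follows, assuming the text is cut into equal-sized segments and a pearson coefficient (which ranges from -1 to 1) is computed over the per-segment frequencies. the segmentation scheme, the function names, and the omission of a significance test are all assumptions of this sketch; the source does not specify how voyant segments documents or derives its p-values.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relative_frequencies(tokens, prefix, segments):
    """Share of tokens starting with `prefix` in each equal-sized segment.

    Assumption: segments are consecutive token windows of equal size;
    the final window may be shorter, and fewer than `segments` windows
    can result when the token count does not divide evenly.
    """
    if not tokens or segments < 1:
        return []
    size = math.ceil(len(tokens) / segments)
    freqs = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        hits = sum(1 for t in chunk if t.startswith(prefix))
        freqs.append(hits / len(chunk))
    return freqs


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns NaN when either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


# Hypothetical usage: correlate two wildcard terms across ten segments.
# tokens = tokenize(corpus_text)
# r = pearson_r(relative_frequencies(tokens, "partner", 10),
#               relative_frequencies(tokens, "collaborat", 10))
```

with only a handful of segments, a coefficient near either extreme can arise by chance, which is one reason the significance level is reported alongside each r value.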
again, our impression based on the word pairings of partner* and collaborat* (which exactly parallel each other) is that the community colleges surveyed are primarily working within their existing structures and practices to develop and implement their internationalization plans. however, we observe that there is a positive correlation with “new”, which indeed indicates innovation in this area. we also underscore the appearance of both “mission” and “statement” here, pointing towards an alignment of institutional values and goals with potential or actual partner institutions.

finally, academic*, which appears times in the corpus, is correlated with several key terms at the slightly weaker p < . level (results displayed in table ) (sinclair & rockwell, b). no correlations were found to be statistically significant at the p < . level. we underscore these findings given that academic considerations are typically fundamental to an internationalization plan or strategy.

table . word pairings with academic*, sorted by strength of correlation rounded to the one hundredth place
term term correlation (r) significance (p)
office academic* . .
efforts academic* . .
plan academic* . .
culture academic* . .
goals academic* . .
new academic* . .
statement academic* . .
mission academic* . .
diverse academic* . .
training academic* . .
educational academic* . .

again, we observe here a close connection between academic functions and the existing structures of the given institution: existing offices, mission, institutional culture, and so forth. there is no clear emphasis, for instance, on emerging uses of technology to facilitate student group work across borders or a nascent focus on establishing a robust program of international visiting scholars (bissonette & woodin, ).

finally, we probed the term “internationalization” itself. the correlations of relevant word pairings at the p < .
level, we imagined, might indicate the key areas of focus across the three institutional plans as a whole (sinclair & rockwell, b). given that all statistically significant correlations with “internationalization” at the p < . level are negative, indicating an inverse relationship between the two terms in question, table displays words that tend not to appear together in the same phrase. as we do not see clear groupings by, for instance, internationalization as administrative function (briggs & ammigan, ; perez-encinas & rodriguez-pomeda, ), internationalization as economic goal (ho, lin, & yang, ; sá, ), internationalization in the classroom (leask & carroll, ; niehaus & williams, ), or internationalization as philosophy (brooks & waters, ; deardorff, ), we find that the appearance of “internationalization” throughout these documents is characterized by a lack of cohesion. in short, there seem to be multiple conceptions and operationalizations of “internationalization” at work. interestingly, the phrase that used “internationalization” most often across the corpus was “internationalization efforts”, which appeared six times, and can refer to any activity in place at a given higher education institution (hei).

table . word pairings with internationalization, sorted by strength of correlation rounded to the one hundredth place
term term correlation (r) significance (p)
director internationalization - . .
develop internationalization - . .
experience internationalization - . .
office internationalization - . .
efforts internationalization - . .
staff internationalization - . .
strategic internationalization - . .
academic internationalization - . .
plan internationalization - . .
new internationalization - . .
culture internationalization - . .
goals internationalization - . .
mission internationalization - . .
statement internationalization - . .
training internationalization - . .
educational internationalization - . .
diverse internationalization - . . in dividual in ternationalization plan s with respect to the particular areas of focus of the internationalization plans of harper college, pima county community college district and shoreline college, we find distinctions in the use of key terms. faculty engagement is clearly emphasized at harper college, student recruitment at pima community college district and research at shoreline college (see figures - ) (sinclair & rockwell, d). again, these disparate areas of focus seem to echo the results of word pairing correlations noted previously – each community college surveyed emphasizes a distinct area of internationaliza- tion that suits its state context, constituents, mission, and goals (bissonette & woodin, ). operationalizing “internationalization” in the community college sector figure : corpus, visualization of “faculty” word frequency (sinclair & rockwell, d) figure : corpus, visualization of “recruitment” word frequency (sinclair & rockwell, d) figure : corpus, visualization of “research” word frequency (sinclair & rockwell, d) given that our results based on the corpus of internationalization plans did not clearly elucidate how students are discussed by the heis, as a final point of analysis we examined how this term appears across the three documents. as displayed by figure , “students” was used much more often in the shoreline college internationalization plan ( times) than other plans surveyed ( times com- bined) (sinclair & rockwell, a). 
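the word-pairing correlations and term counts reported above were produced with voyant tools; the underlying idea can be illustrated with a minimal python sketch. it assumes that a text is cut into near-equal segments, that a wildcard such as academic* matches any token beginning with that stem, and that the correlation is a plain pearson r over per-segment counts. the function names, the segmenting scheme, and the sample sentence are illustrative assumptions rather than voyant's actual implementation, and the significance value (p) is not computed here.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def segment_counts(tokens, stem, n_segments):
    """Count tokens starting with `stem` in each of n_segments near-equal slices.

    Assumption: a wildcard term like academic* is treated as a prefix match.
    """
    size = math.ceil(len(tokens) / n_segments)
    counts = []
    for i in range(n_segments):
        chunk = tokens[i * size:(i + 1) * size]
        counts.append(sum(1 for tok in chunk if tok.startswith(stem)))
    return counts


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical sample text, not drawn from any of the three plans.
    sample = (
        "the academic office supports academic goals "
        "internationalization efforts involve staff "
        "the academic mission guides the office"
    )
    tokens = tokenize(sample)
    academic = segment_counts(tokens, "academic", 3)
    intl = segment_counts(tokens, "internationalization", 3)
    print(academic, intl, pearson_r(academic, intl))
```

a negative r from such a sketch would correspond to the inverse relationships described above: segments where one term is frequent tend to be segments where the other is rare. with so few segments the value is unstable, which is one reason a significance level is reported alongside each correlation.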
figure : bubble line chart of “students” among internationalization plans (sinclair & rockwell, a)

indeed, pima county community college district’s internationalization plan encompasses organizational goals (“establish a language institute” for esl), community-workforce goals (“identify opportunities for workforce development in the international arena”), and staff/faculty development goals (“intercultural training”) in addition to one student-centered goal (as distinct from increased international student recruitment). importantly, no action items for students themselves are identified in the plan – they seem not to be constructed as participants or agents in the proposed internationalization processes, but rather objects or passive recipients. for instance, “infusing global knowledge into the curriculum” is a process that has not included student experts reflecting on their own learning. this framing, of course, contradicts the student development literature and may be critically viewed (thomas, hill, o’mahony, & yorke, ; yosso, ).

somewhat similar to the formulation of internationalization goals at pima, harper college underscores organizational and faculty-oriented goals (“foster a culture of accountability in all areas of international education”), though it does seek to “optimize participation by students and faculty in international education programs”. interestingly, another strategic goal is to “achieve greater integration of international students into life of college”; again, we observe here a contradiction with the critically-oriented literature on international student experience, which resists a deficit orientation and emphasis on assimilation.
in contrast, shoreline community college states that its campus internationalization goals are “internationalizing the curriculum, creating opportunities for meaningful interaction between domestic and international students, enhancing the global competence of college employees, and engaging the community on international issues” (campus internationalization leadership team (cilt) shoreline community college, , p. vi). a main point of emphasis in shoreline’s internationalization plan is the overlap between the goals of the general education outcomes and campus internationalization, inherently centering the student experience. further, a key asset of the campus is identified as “a student body that is multinational, multiethnic, and multilingual. in particular, a majority of our international students are from the one of the more dynamic regions in the world: hong kong, indonesia, taiwan, korea, [and] china” (campus internationalization leadership team (cilt) shoreline community college, , p. v). in addition, shoreline surveyed peer institutions in the area to learn about the student support programs that might be relevant to diverse and international student groups, highlighting possible initiatives for adoption or adaptation at shoreline itself. however, students are also excluded from the leadership or organizing group, which is composed exclusively of faculty and staff members.

discussion

paucity and types of internationalization plans

as observed in our introduction, this paper is quite focused in its scope, comprising an analysis of three community college internationalization plans. this is a choice informed by the paucity of community college internationalization plans made publicly available, which may reflect several dynamics at individual institutions.
as suggested by childress ( ), institutional leadership may be reluctant to support the development of internationalization plans in resource-constrained environments that are also subject to close public scrutiny, which indeed describes almost all community college settings with very few, well-resourced exceptions. she writes that “if institutional leaders [are] not certain they could allocate the resources to carry out particular goals for internationalization, then written commitments to those goals in internationalization plans [are] neither in their best interest nor in the best interest of the institution” (childress, , p. - ).

by extension, we find it likely that an awareness of limited financial resources may be driving the emphasis on internationalization within established institutional structures and processes evidenced by this textual analysis. in other words, this may reflect internationalization through optimization of existing resources (human, cultural, community, and financial) rather than a framework for expansion of those same resources. this seems a ripe area for further inquiry: can internationalization as process and strategic goal at open-access institutions be seen through a lens of resource identification and capitalization, and if so, how might the outcomes of internationalization differ at the two-year level? further, how might a focus on optimization of resources be seen with a systems perspective; where are the detailed plans that outline how bureaucratic and siloed administrative units may be unified in pursuit of internationalization (mcraven & somers, )?

we also consider childress’ finding that “internationalization plans were explained as irrelevant for some institutions in which internationalization has already been integrated into the fabric of the institution” (childress, , p. ) and that plans may be most appropriate for community colleges that are in the initial stages of internationalization.
among the three internationalization plans in focus here, two (pima county and harper) seem to fall into this “early stage” category, urging greater institutional cohesion and basic organizational orientation towards international students and topics. however, shoreline’s internationalization plan is layered and reflects already-ingrained institutional approaches to supporting international students, activities, and curriculum, and thus seems to indicate that this college has found an internationalization plan relevant to its continued evolution, partially contradicting childress’ finding. might this indicate a consideration of internationalization plan typology, including the emerging (for colleges outlining nascent internationalization processes) and the evolving (for colleges seeking to deepen or iterate their internationalization processes)? this area for future research would be distinct from a typology of institutional culture as relevant to internationalization processes as proposed by bartell ( ).

“internationalization at home” (iah)

leask has argued the importance of internationalization at home in the four-year institutional context, observing that as the vast majority of students will not study abroad, internationalizing all aspects of a college or university’s operations to internationalize the home campus experience is vital ( , ). in practice, this can mean reviewing a core curriculum to integrate internationally-relevant learning objectives; revisiting existing syllabi to include international authors; requiring students to cite international authors in their written work; and many other initiatives within and outside of the classroom itself. this emphasis on iah has also been described and encouraged in the two-year sector by custer and tuominen ( ), among others. further, iah is sometimes characterized as optimizing existing institutional structures (beelen & jones, ), a clear theme identified in the analysis here.
working within established institutional frameworks also comes with its own set of challenges. as hunter ( ) recently argued in the european context, drawing from interviews with university administrators, the “staff interviewed highlighted that many of the challenges they faced in dealing with international activities lay in institutional structures and practices that were not supportive of the needs of internationalization” (p. ). this raises the question of the extent to which the community colleges surveyed here, as they propose various internationalization goals, have already streamlined structures to support success. in a sense, the advantage of creating new collaborations and initiatives in support of internationalization is that structures can be tailored; working through existing structures may optimize resources – but can those resources be appropriately channeled through potentially archaic or entrenched organizational channels? that is to say, are detailed plans for organizational realignment supporting internationalization missing from these internationalization plans because they have already been proposed or achieved? or are they not yet in existence? again, we seem to be pointing towards the need for a systems perspective or organizational theory frame, given that community college internationalization plans do not appear to be proposing new staff or faculty roles, and are instead reallocating institutional resources.

targeted internationalization

as noted, faculty engagement is emphasized at harper college, student recruitment at pima community college district, and research at shoreline college. further, across all three documents we fail to find clear trends in conceptualizing internationalization as administrative function, internationalization as economic goal, internationalization in the classroom, or internationalization as philosophy.
in short, “internationalization” lacks cohesion in the corpus of documents; instead, internationalization is operationalized in distinct and disparate ways, but ways that are clearly tailored to institutional context. this lack of cohesion may be seen as a fragmentation of what internationalization means in practice, which is not necessarily negative. this fragmentation would indeed indicate a different direction from the homogenization of higher education recently discussed by de wit, gacel-Ávila, and jones ( ), who have written that “little space is left for new and innovative ideas for internationalization, embedded in the local and institutional context” (p. ). what we seem to be observing in the community college space is a closer attention to stakeholder needs and institutional mission and resources. indeed, previous work has indicated that this is a distinctive feature of the open access landscape (bissonette & woodin, ; custer & tuominen, ).

however, fragmentation of internationalization in the community college sector might also indicate an opportunity to engage mid-level organizational units, such as district level “faculty curriculum councils [that] could dramatically enhance internationalization and create faculty buy-in with a relatively modest financial outlay” (mcraven & somers, , p. ). that is to say, while we acknowledge that two-year institutions are embedded in their local settings, district or even state level groups may also be well connected to stakeholder needs and resource constraints and be in a position to offer consistent guidance and useful resources in at least some areas. faculty councils, for example, might be in a position to identify specific texts or instructional tools appropriate to medical assistant programs and thereby “flesh out what it means to be international and local at the same time” (mcraven & somers, , p. ).
such an approach would also address the issue of including international content in “core” or required classes, rather than electives alone (beelen & jones, ). similarly, community actors may add capacity and direction to community college internationalization efforts. service learning programs by definition are meant to be guided by community-based actors, and frequently incorporate international and intercultural elements (berry & chisholm, ).

limitations

there are several limitations to this study that necessitate mention. the primary limitation is the sample size of data, guided by our experimentation with a new analytical tool producing large amounts of data. as previously noted, our sample was limited to publicly available internationalization plans that could be accessed online. future research might use different search criteria to locate community college internationalization plans including personal outreach to relevant staff members using member directories from community college professional organizations, such as community colleges for international development (ccid). it is clear from our analysis that future research using a similar analytic approach with a larger sample is warranted and necessary in order to advance our understanding of internationalization plans at these institutions.

a second limitation to our analysis was the type of data analyzed. we initially attempted to analyze mission and vision statements on community college websites, but found that these data sources did not include the type of information necessary for an in-depth analysis. there was little mention of global or international goals in the mission and vision statements on community college websites, which may be indicative of the extent to which community colleges publicly embrace internationalization efforts given their historical local focus.
finally, textual analysis on its own, devoid of context, can be considered a limitation. we have attempted to address this limitation of the tool by situating selected words and phrases into the broader context and conversation on internationalization at community colleges.

recommendations for future research

this study sought to identify how internationalization was operationalized in three community college internationalization plans. in our analysis and in line with existing literature, we found that financial resources may be a critical factor in determining internationalization activities. additionally, this analysis shows the extent to which institutional context and culture influence the stage at which community college internationalization plans are in their development.

while recommendations for further research are noted throughout this study, we acknowledge that our study was indeed limited by the extent to which internationalization plans were publicly available. it is possible that, though we were able to locate and access several plans, these documents were not intended for audiences outside of campus and community stakeholders. there may exist institutional plans that go into greater depth and detail around international activities and efforts. the analysis of a larger sample of strategic plans may yield additional findings and insights on internationalization trends across a greater variation of community colleges.

references

altbach, p. g., & knight, j. ( ). the internationalization of higher education: motivations and realities. journal of studies in international education, ( – ), – . https://doi.org/ . /
bartell, m. ( ). internationalization of universities: a university culture-based framework. higher education, ( ), – .
beelen, j., & jones, e. ( ). redefining internationalization at home. in a. curaj, l. matei, r. pricopie, j. salmi, & p.
scott (eds.), the european higher education area: between critical reflections and future policies (pp. – ). cham: springer. https://doi.org/ . / - - - - _
berry, h. a., & chisholm, l. a. ( ). service-learning in higher education around the world: an initial look. the international partnership for service-learning.
bissonette, b., & woodin, s. ( ). building support for internationalization through institutional assessment and leadership engagement. new directions for community colleges, ( ), – . https://doi.org/ . /cc.
bista, k. ( ). faculty international experience and internationalization efforts at community colleges in the united states. in r. l. raby & e. j. valeau (eds.), international education at community colleges: themes, practices, research, and case studies (pp. – ). new york, ny: palgrave. https://doi.org/ . / - - - - _
briggs, p., & ammigan, r. ( ). a collaborative programming and outreach model for international student support offices. journal of international students, ( ), – . https://doi.org/ . /zenodo.
brooks, r., & waters, j. ( ). student mobilities, migration and the internationalization of higher education. cham: palgrave macmillan.
campus internationalization leadership team (cilt) shoreline community college. ( ). advancing campus internationalization: report to the president’s senior executive team. shoreline.
childress, l. k. ( ). internationalization plans for higher education institutions. journal of studies in international education, ( ), - . https://doi.org/ . % f
clifford, v., & montgomery, c. ( ). designing an internationalised curriculum for higher education: embracing the local and the global citizen. higher education research and development, ( ), – . https://doi.org/ . / . .
community colleges for international development. (n.d.). mission statement.
retrieved from https://www.ccidinc.org/about-us/ community colleges for international development. ( . fci assessment tool. retrieved from https://www.ccidinc.org/fci-assessment-tool/ copeland, j. m., mccrink, c. l., & starratt, g. k. ( ). development of the community college internation- alization index. journal of studies in international education, ( ), – . https://doi.org/ . / custer, l., & tuominen, a. ( ). bringing “internationalization at home” opportunities to community col- leges: design and assessment of an online exchange activity between u.s. and japanese students. teaching sociology, ( ), – . https://doi.org/ . / x de wit, h., gacel-Ávila, j., & jones, e. ( ). voices and perspectives on internationalization from the emerg- ing and developing world. in h. de wit, j. gacel-Ávila, e. jones, & n. jooste (eds.), the globalization of in- ternationalization: emerging voices and perspectives (pp. – ). https://doi.org/ . / deardorff, d. ( ). internationalization: in search of intercultural competence. international educator, ( ), – . dellow, d. a. ( ). the role of globalization in technical and occupational programs. new directions for com- munity colleges, ( ), - . https://doi.org/ . /cc. dickerson, m. ( ). a gentle introduction to text analysis with voyant tools. irvine. retrieved from https://cloudfront.escholarship.org/dist/prd/content/qt jz sf/supp/dickerson_textanalysisvoyantt ools_ .pdf gore, j. e. ( ). faculty beliefs and institutional values: identifying and overcoming these obstacles to educa- tion abroad growth. in r. lewin (ed.), the handbook of practice and research in study abroad (pp. – ). new york, ny: routledge. green, m. f. ( ). internationalizing community colleges: barriers and strategies. new directions for community colleges, ( ), – . wiley. https://doi.org/ . /cc. green, m. f., & siaya, l. m. ( ). measuring internationalization at community colleges. american council on edu- cation: washington, d.c. harder, n. j. ( ). 
internationalization efforts in united states community colleges: a comparative analysis of urban, suburban, and rural institutions. community college journal of research and practice, ( ), – . https://doi.org/ . / . .
helms, r. m., & brajkovic, l. ( ). mapping internationalization on u.s. campuses: edition. washington, d.c.
helms, r. m., rumbley, l. e., brajkovic, l., & mihut, g. ( ). internationalizing higher education: national policies and programs. washington, d.c. retrieved from https://www.acenet.edu/news-room/pages/internationalizing-higher-education-worldwide-national-policies-and-programs.aspx
ho, h. f., lin, m. h., & yang, c. c. ( ). goals, strategies, and achievements in the internationalization of higher education in japan and taiwan. international education studies, ( ), – . https://doi.org/ . /ies.v n p
hudzik, j. k. ( ). comprehensive internationalization: from concept to action. washington, d.c. retrieved from https://shop.nafsa.org/detail.aspx?id= e
hunter, f. ( , january). training administrative staff to become key players in the internationalization of higher education. international higher education, , – . https://doi.org/ . /ihe. . .
ivey, t. ( ). curriculum internationalization and the community college. (published doctoral dissertation). retrieved from proquest dissertations and theses database. ( )
knight, j.
( ). updating the definition of internationalization. international higher education, , – . leask, b. ( ). “beside me is an empty chair”: the student experience of internationalisation. in e. jones (ed.), internationalisation and the student voice: higher education perspectives (pp. – ). new york: taylor & fran- cis. retrieved from http://www.routledge.com/books/internationalisation-and-the-student-voice- isbn leask, b. ( ). a conceptual framework for internationalisation of the curriculum. in b. leask, internationaliz- ing the curriculum (pp. – ). abingdon: routledge. leask, b., & carroll, j. ( ). moving beyond “wishing and hoping”: internationalisation and student experi- ences of inclusion and engagement. higher education research and development, ( ), – . https://doi.org/ . / . . maturana sendoya, i. ( ). internationalization and multiculturalism: natural and unaware partners. chestnut hill. mcraven, n., & somers, p. ( ). internationalizing a community college: a view from the top. community college journal of research and practice, ( ), – . https://doi.org/ . / . . nguyen, b. ( ). internationalization at jesuit colleges and universities in the united states: tensions between the jesuit mission and internationalization in strategic plans (doctoral dissertation). available from proquest dissertations and theses database. (umi no. ) niehaus, e., & williams, l. ( ). faculty transformation in curriculum transformation: the role of faculty development in campus internationalization. innovative higher education, ( ), – . https://doi.org/ . /s - - - office of international education. ( ). strategic plan for the internationalization of harper college. palatine. olson, c. l., evans, r., & shoenberg, r. f. ( ). at home in the world: bridging the gap between internationalization and multicultural education. washington, d.c. retrieved from http://store.acenet.edu/showitem.aspx?product= &session= b bd a ddf c f bb patton, m. q. ( ). qualitative evaluation methods. 
thousand oaks: sage publications.
perez-encinas, a., & rodriguez-pomeda, j. ( ). international students’ perceptions of their needs when going abroad: services on demand. journal of studies in international education, ( ), - . https://doi.org/ . /
pima county community college district board of governors. ( ). executive summary of the - outcomes of pcc’s strategic plan for internationalization. tucson.
raby, r. l., & rhodes, g. m. ( ). promoting education abroad among community college students. in h. barclay hamir & n. gozik (eds.), promoting inclusion in education abroad (pp. – ). sterling, va: stylus publishing, inc.
raby, r. l., & valeau, e. j. ( ). community college international education: looking back to forecast the future. new directions for community colleges, ( ), – . https://doi.org/ . /cc.
raby, r. l., & valeau, e. j. ( ). global is not the opposite of local: advocacy for community college international education. in r. l. raby & e. j. valeau (eds.), international education at community colleges: themes, practices, research, and case studies (pp. – ). new york, ny: palgrave. https://doi.org/ . / - - - - _
redden, e. ( , november ). new international enrollments decline again. inside higher ed.
retrieved from https://www.insidehighered.com/news/ / / /new-international-student-enrollments-continue- decline-us-universities rockwell, g., & sinclair, s. ( ). hermeneutica: computer-assisted interpretation in the humanities. cambridge: the mit press. sá, c. m. ( ). forget the competition trope. international higher education, , – . https://doi.org/ . /ihe. . . scott, r. a. ( ). campus developments in response to the challenges of internationalization: the case of ramapo college of new jersey. springfield, va: cbis federal. sinclair, s., & rockwell, g. ( a). bubblelines. retrieved april , , from http://voyant-tools.org sinclair, s., & rockwell, g. ( b). correlations. retrieved april , , from http://voyant-tools.org sinclair, s., & rockwell, g. ( c). terms. retrieved april , , from http://voyant-tools.org sinclair, s., & rockwell, g. ( d). trends. retrieved april , , from http://voyant-tools.org thomas, l., hill, m., o’ mahony, j., & yorke, m. ( ). supporting student success: strategies for institutional change. london: higher education academy. retrieved from https://www.heacademy.ac.uk/knowledge- hub/supporting-student-success-strategies-institutional-change trump, d. j. ( ). executive order protecting the nation from foreign terrorist entry into the united states. retrieved feb- ruary , , from https://www.whitehouse.gov/presidential-actions/executive-order-protecting-nation- foreign-terrorist-entry-united-states/ valeau, e. j., & raby, r. l. ( ). building the pipeline for community college international education leader- ship. in r. l. raby & e. j. valeau (eds.), international education at community colleges: themes, practices, research, and case studies (pp. – ). new york, ny: palgrave. https://doi.org/ . / - - - - _ woodin, s. ( ). calls for accountability: measuring internationalization at community colleges. in r. l. raby & e. j. valeau (eds.), international education at community colleges: themes, practices, research, and case studies (pp. – ). 
new york, ny: palgrave. https://doi.org/ . / - - - - _
yosso, t. j. ( ). whose culture has capital? a critical race theory discussion of community cultural wealth. race ethnicity and education, ( ), – . https://doi.org/ . /

biographies

lisa unangst is a phd candidate at the boston college lynch school of education and human development. her research interests include educational policy and practice supporting displaced students in comparative, trans-national context; international alumni affairs; and quantitative textual analysis. lisa worked previously at cal state east bay, the california institute of technology, and harvard university.

nicole barone is a phd candidate at the boston college lynch school of education and human development. her interests center on college access, diversity, and inclusion in international education, and study abroad at community colleges and minority serving institutions.
she has worked previously at diversity abroad, bunker hill community college, and the university of washington. operationalizing “internationalization” in the community college sector: textual analysis of institutional internationalization plans abstract introduction background internationalization at community colleges measuring internationalization at community colleges barriers to internationalization at community colleges resource constraints articulated commitments lack of faculty and leadership support methodology analysis individual internationalization plans discussion paucity and types of internationalization plans “internationalization at home” (iah) targeted internationalization limitations recommendations for future research references biographies - - -libraries-as-research-partners clarin & research libraries: a very short introduction leon wessels programme manager clarin eric clarin in eight bullets • clarin is the common language resources and technology infrastructure • esfri eric status since , landmark since • that provides easy and sustainable access for scholars in the humanities and social sciences and beyond • to digital language data (in written, spoken, video or multimodal form) • and advanced tools to discover, explore, exploit, annotate, analyse or combine them, wherever they are located • through a single sign-on environment • that serves as an ecosystem for knowledge sharing • and: ready for integration in eosc clarin eric in members and centres a consortium of: • members: at, bg, cy, cz, de, dk, dlu, ee, fi, gr, hr, hu, it, lt, lv, nl, no, pl, pt, se, si • observers: is, fr, sa, uk; • > centres clarin in data types • newspaper archives • literary texts • social media data • parliamentary records • historical letters • oral history data • disciplinary libraries • institutional archival data • broadcast archives • … collaboration with research libraries • collaboration on a national scale: – several national consortia include research 
libraries • collaboration on a supranational scale: – liber: mou, sshoc – europeana: dsi- want to know more? visit the clarin stand at the market place or www.clarin.eu / clarin@clarin.eu http://www.clarin.eu/ mailto:clarin@clarin.eu csdh paper-visual matters visual matters csdh-schn visual matters: experiments in the public visualization of text csdh-schn : presentation paper u of alberta: geoffrey rockwell, jingwei wang, bennett tchoh, chaolan wu, ali azarpanah mcgill: stéfan sinclair . introduction visualization is the new knowing. from big data to complex processes, visualization tools and data walls are deployed for discovering knowledge and representing it back to others. the tools, however, are not accessible to most scholars, especially humanists who are trained to work with text. the visual matters project takes a speculative design approach to prototyping alternative visualization environments that challenge ideas about the surveillance of data. we do this by adapting an existing text analysis environment, voyant tools (voyant-tools.org), so that it can be used in new ways by textual scholars to learn through the graphical. in this paper we will discuss: ● significance and methodology: we will outline the speculative design approach of prototyping that we have taken and why we think it is important. ● auditing voyant: we will discuss how we audited how voyant works (or not) on data walls and touch tables. ● book map: we will demonstrate a first working speculative design called the book map. ● prototypes: we will discuss two sketches for future research and development. ● conclusion: we will end with a few thoughts on visualization, ubiquitous screens and knowledge. . significance and methodology interacting is believing. visualizations work rhetorically when users feel they are exploring data and drawing their own conclusions rather than being told what to think. 
to understand the potential of visualization we therefore need to experiment with designing interactions. for this reason visual matters takes a speculative design approach of prototyping alternative visions. we do this through scenarios that use two future contexts for experiments in text visualization, . data walls - how can data walls be used by groups for research and instruction with voyant? . touch tables - how can touch tables be used by small groups with voyant? visual matters csdh-schn significance: why do these prototypes matter? the “csec presentation” that edward snowden and glenn greenwald leaked to the globe and mail in shows the olympia system developed by the communications security establishment of canada. from the powerpoint slides we can see how olympia allows intelligence analysts to combine and visualize both data from different surveillance databases, but also processes of querying across databases. it is both a data visualization tool and a visual programming environment. for those studying big data and surveillance who don’t have access to olympia, the slides are a trace or fiction of what intelligence officers imagine they could do (rockwell & sinclair ). they provide those of us without access with the equivalent to a design fiction with which we can speculate about the practices of surveillance in the age of big data. olympia is just one example of a big data visualization tool developed to manage information about people and communications. companies like palantir and ibm have developed other suites of tools that they can customize for organizations (palantir and watson analytics respectively). the rest of us can only imagine how our information is being managed without access to these industrial tools. promotional materials available online or through the spy files released by wikileaks (https://wikileaks.org/the-spyfiles.html) help us understand what is being developed and used, but they are difficult to find and digest. 
by contrast a speculative design approach can imagine how we are being managed. for example, christian laesser’s they know web site tells a story of how surveillance we have heard about from snowden could affect a fictional architecture student in berlin. the story is told through a short video which draws on speculative designs of what the systems look like. the web site includes a poster and documentation showing the sources of the speculations. similarly in this project we are prototyping how textual scholars might know using voyant together on data walls or touch tables. to do that we needed to audit how voyant works in these new contexts, and then design speculative prototypes. speculative design research: most design research practices are aimed at producing functional designs, but as dunne and raby point out in the "critical design faq" there is a tradition of design as critique that goes back to italian radical design of the s. dunne's book hertzian tales ( ), first published in , introduced critical design as an alternative to commercial product design. in the faq they define critical design as an approach that "uses speculative design proposals to challenge assumptions, preconceptions and givens about the role products play in everyday life." the idea is not to produce design ideas that are more efficient or that meet some need, but to create speculative designs that provoke critical thought about everyday life and its products. it is about the implications of things like visualizations rather than the applications. this approach isn't really a method, but a supplement to design practices that use design for critical purposes. dunne & raby further developed the approach in speculative everything where they provide examples of how speculative designs can present possible futures as scenarios. 
as such speculative design is related to science/speculative fiction in that science or speculative fiction writers use fiction to imagine the implications of technology in a way that can provoke reflection on the present. speculative design is likewise aimed at provoking reflection, but with fictional products rather than words. the fictional designs visual matters csdh-schn encourage us to imagine potential scenarios of use or alternative futures that change how we think. visual matters borrows this approach to critically explore how visualization is emerging as a way of knowing and specifically a way of knowing large collections of texts. visualization has traditionally been seen as a way of objectively representing data so that people can “carry out tasks more effectively.” (munzer , ) the idea is that visualization can show patterns in otherwise indigestible data making it easier for users to draw conclusions about the phenomena visualized. the challenge for people like tufte ( , ) was in how to best present data visually with the least distractions. the history of text analysis tools in the humanities is a history of developing forms of interactive reading (sinclair ). tools like tact (released in ) demonstrated a new way of using computers to interpret texts (lancashire ). instead of running batch processes on a mainframe that produced results that would be printed for study, the new interactive tools on personal computers could be used to query texts immediately on the screen. tact brought a primitive version of a graphical user interface and visualization to the humanist and their personal computer. this history of reading tools like tact and before it arras (smith ) is being lost as the tools are now difficult, if not impossible to run. while digital humanists have been developing visualization tools to study texts since the work of john b. 
smith ( , ), visualization got little attention in the humanities generally until franco moretti argued in graphs, maps, trees ( ) that computer visualizations enabled the “distant reading” of texts. distant reading was presented as an alternative to interpretative “close reading.” distant reading would help explain historical trends the way scientific visualization lets users explore a phenomenon. johanna drucker ( ) rightly critiques the naïve scientific view that visualization shows the phenomena itself, pointing out that data should be called “capta” as it is not given (data) so much as captured and represented through a surrogate (capta). she shows how the whole pipeline of visualization from the choices of what data to capture to the visual form chosen involves interpretation which should be critically interrogated. she imagines visualization tools that rather than hiding the agency of the person making the choices would be capable of showing interpretation in all its subjectivity. she imagines visualization tools that can show the ambiguity of evidence and point of view rather than presenting the illusion of a god's-eye view. in short she proposes generative visualization for humanists where we make new knowledge through interpreting with visualization rather than just representing knowledge. in this she is aligned with speculative design as imagined by dunne and raby. speculative design is about trying to create designs that provoke new thinking and knowledge rather than just illustrating what is known or what is given. visual matters draws on these ideas about design and visualization practice to think through how text visualization might be different than scientific visualization. the project creates a visual matters csdh-schn context where alternative tools can be prototyped and shared. 
the questions we will speculate about through design include those that drucker raises: ● how can visualization tools aid in exploring and representing to others a multiplicity of interpretations? ● how can visualization communicate where information is missing or ambiguous in ways that open room for speculation? ● how can visualization be a site for negotiating ideas about texts in a group rather than simply presenting them? ● how can a tool make clear the positionality of the interpreter? how can it present a story of different views on data? references: n. a. ( ). “csec presentation.” uploaded to scribd by c. freeze of the globe and mail. pdf of csec powerpoint slides from presentation at five eyes conference. <https://www.scribd.com/document/ /csec-presentation> drucker, j. ( ). graphesis: visual forms of knowledge production. cambridge, massachusetts, harvard university press. dunne, a. and f. raby (n.d.) “critical design faq”. <http://www.dunneandraby.co.uk/content/bydandr/ / > dunne, a. ( ). hertzian tales: electronic products, aesthetic experience, and critical design. cambridge, massachusetts, mit press. dunne, a. and f. raby ( ). speculative everything: design, fiction, and social dreaming. cambridge, massachusetts, mit press. laesser, c. ( ). they know. web site at <https://they-know.org/en>. accessed jan. , . lancashire, i., ed. ( ). using tact with electronic texts. new york, modern languages association of america. moretti, f. ( ). graphs, maps, trees: abstract models for literary history. london, verso. munzer, t. ( ). visualization analysis and design. new york, crc press. rockwell, g. and s. sinclair ( ). hermeneutica: computer-assisted interpretation in the humanities. cambridge, massachusetts, mit press. rockwell, g. and s. sinclair ( ). “watching out for the olympians! reading the csec slides.” information ethics and global citizenship: essays on ideas to praxis. eds. t. samek and l. shultz. jefferson, north carolina: mcfarland, p. - . 
sinclair, s. ( ). "computer-assisted reading: reconceiving text analysis." literary and linguistic computing. ( ): - . smith, j. b. ( ). “image and imagery in joyce's portrait: a computer-assisted analysis.” directions in literary criticism: contemporary approaches to literature. eds. s. weintraub and p. young. university park, pa, the pennsylvania state university press: - . smith, j. b. ( ). "computer criticism." style. xii( ): - . smith, j. b. ( ). "a new environment for literary analysis." perspectives in computing. ( / ): - . visual matters csdh-schn tufte, e. ( ). the visual display of quantitative information. cheshire, ct, graphics press. tufte, e. ( ). envisioning information. cheshire, ct, graphics press. . auditing voyant introduction voyant-tools (vt) is one of the most used free text analysis and visualization suites, available to the general public at voyant-tools.org. it was developed by stéfan sinclair & geoffrey rockwell with an initial release in . in the month of october , vt was accessed , times from different countries (rockwell & sinclair, ). vt has a variety of text analysis and visualisation tools which have options for customisation. although vt’s interface was designed for use on a personal computer by a single user, its many visualisations makes vt a good candidate to be tested on a huge data wall and a data table. our goal was to test how a group of scholars working on analysing a text could work interactively on vt displayed on a data wall and on a large data table. large displays are more common these days and can be found in many areas of the campus. they are owned by businesses and universities and are used to display commercial ads and to provide information to students and shoppers (among other applications). for our testing, we needed displays which are touch sensitive and accessible to students. we were able to find them in the digital scholarship centre (dsc) at the university of alberta. 
the dsc has study spaces, multimedia equipment, broadcasting spaces, modern computers, a d printer, virtual reality equipment and, of special interest to us, a . m x . m touch-capable data wall with a resolution of k and a inch data table. both displays run on a windows operating system and can be booked from the dsc website. the data wall and the data table were booked times, during which the interactive use of vt for text analysis was tested. unfortunately, we could not test our proposed solutions further because the dsc was closed due to the covid- pandemic. voyant tools on the data wall and touch table the data wall is run by a desktop computer to which a mouse and keyboard are attached. we decided to use only the touch-sensitive data wall as our input terminal, from launching the browser to loading and analysing the text. it quickly became apparent that working directly on the data wall was not very practical: the space is so large that the whole data wall could not be used without running back and forth. we realised that we needed only about a third of the data wall ( . m x . m), and that was still a very large space to work with. the dimensions of the data wall therefore make it possible for three groups of people to use it simultaneously. this usage has many challenges, however: because the data wall is run by one operating system, the three groups have to share, for example, the start menu and the keyboard. to launch a new program from the start menu, a user working on the right side of the data wall has to walk to the left side to access the start menu. the program window might open in full-screen mode on top of the windows being used by the other users. the newly opened program window then has to be reduced and dragged to the right, usually across the windows of other users.
a similar difficulty arose with the keyboard, which has to be shared among the groups of users and dragged from one area of the screen to another. vt’s default interface is made up of panels, and the user can set which text analysis tool or visualisation is displayed in each panel. on a desktop, the pointer changes to a resize pointer when placed over the edges of a panel. mousing over is not possible on a touch screen, so it was impossible to resize individual panels. vt also has a skin builder (discussed in more detail below) in which the number, size, and other parameters of each panel can be preset and placed into a new vt window, but it too requires the ability to mouse over, which is not possible on the data wall. the browser window could be resized, but the larger the window, the further users have to step back to keep the whole window in view. in addition, many vt panels use sliders for their visualization options. clicking and dragging other vt elements worked, but the slider did not work on the data wall; different levels could still be selected by clicking along the slider’s length. similar problems arose on the data table: it was difficult to resize the panels and the slider did not work. such problems are not uncommon when an interface designed for one platform is used on another. these issues could be addressed if microsoft changed the operating system to make common interactions like mousing over available on touch screens, but bringing them to the attention of a corporation as large as microsoft and having them addressed is not straightforward. a more realistic solution is for website designers to make design changes to their interfaces. we propose the following solutions to the problems we faced:
• instead of a slider, a clickable dropdown menu could be used. (windows surface tablets now permit mouse overs with the use of the stylus.)
• to make resizing easier, resize icons could be added to the corner of each panel.
• to address the problem of the shared keyboard, a mini virtual keyboard could be added that appears below a text input area each time the area is selected. studies have shown a preference for smaller keyboards (gu, shim, kim, & lee, ) and highly customisable keyboards have been created to meet this need (for example, virtual keyboard by mottie ( )).
these proposed solutions would improve the user experience on both the data wall and the data table, but they seem more useful for the data table since it was designed for direct use. the data wall has further common issues: its large size, poor touch calibration, possible harm to the eyes from a screen that has to remain bright to be visible to people standing at a distance, and physical strain from using such a large display. these make simultaneous remote access by multiple users through personal portable devices like smartphones, tablets and laptops a better solution for the interactive use of the data wall (rittenbruch, ).
references
gu, j., shim, y. a., kim, s., & lee, g. ( , november). “a small virtual keyboard is better for intermittent text entry on a pen-equipped tablet.” in proceedings of the acm international conference on interactive surfaces and spaces. pp. - .
mottie. ( ). virtual keyboard ( . . ) [mobile application software]. retrieved from https://mottie.github.io/keyboard/
rittenbruch, m. ( , october). “supporting collaboration in large-scale multi-user workspaces.” in proc. of the workshop: collaboration meets interactive surfaces: walls, tables, tablets and phones. pp. - .
rockwell, g., & sinclair, s. ( ). hermeneutica: computer-assisted interpretation in the humanities. cambridge, ma: mit press.
.
design : book map to explore the potential of visualization on data walls in public spaces, our team developed a prototype called the book map, which is an interactive visualization designed for the data wall of a public library. the book map shows the trajectories that a text (book) takes around the world. it presents an animation of the movement of the text on a map of the world. imagine this on a large data wall at the entrance of a public library. figure . : screenshots of the shared visual space and personal controls of the book map the design challenge we set ourselves with this prototype was to imagine a visualization that lots of people could interact with in a public space. to this end, users can scan the qr code in the lower left which will call up an interface on their smartphone where they can choose a book to map. multiple people can choose books at the same time and these books will then ping-pong around the world. the book map uses natural language processing, named-entity recognition, and geo-spatial data to visualize geo-locations in books on a public data wall. it was inspired by the voyant dreamscape tool (https://voyant-tools.org/?corpus=frank&view=dreamscape). there are special administrative interfaces for adding books and editing the list of locations recognized. technically, book map is a website built with python django framework, which involves multiple sections for different functionalities. visual matters csdh-schn challenges ● although ner performs efficiently in extracting locations from a book, there will be inevitably some error data that have to be revised manually. ● the geo-location system cannot handle the fictional locations from a book, which leads to another discussion about how to visualize a fictional location on a real map. ● it is still a challenge to display multiple books on the single book map. when a viewer chooses a book, the currently displayed book will be replaced by the new one. figure . : main map screen figure . 
is the main visualization screen of the book map, which is supposed to be projected to a big screen or data wall in the library. imagine it in the foyer of a public library. it displays the information of a chosen book (which will be explained in book bubbles screen), which includes: • the title of the book • the geo-locations of the book • all locations extracted from the text of the book will be displayed as pins on a world map • the name of the current geo-location • display the name of the current location, such as paris, venice, london, etc. • the geo-path of the book • an animated path shows the trace of the geo-locations in the book, in accordance to the sequence of the text of the book • the context around the current geo-location • display the context corresponding to the current location in the book. a single location may show different contexts as it may appear multiple times in the book. • the total occurrence of the geo-location • if the map screen is rendered on a pc or a touchscreen, the readers are allowed to tap the location pin to check its total occurrence in the book. visual matters csdh-schn • qr code to book bubbles screen • display the qr code linking to the book bubbles screen. figure . : book bubbles screen when viewers scan the qr code with their mobile phones from the map screen, they will be redirected to the book bubbles screen where they can choose a book using mobile phones. all the books are displayed in touchable bubbles. when a user wants to see the locations of a given book, he/she can drag the book bubble to the book icon at the top-right corner, and this book will be projected to the map screen in real-time. imagine different books being launched using different colours by visitors to the library. the books could be books about the local city or specific to celebrations that week. visitors could see where they could go with books in the library. 
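the map screen elements listed above (pins, the animated geo-path, the contexts for a location and its total occurrence) can all be derived from one ordered list of location mentions. the following is only a minimal sketch under that assumption, not the book map's actual code: the `Occurrence` record, its field names and the helper functions are hypothetical, and the sample coordinates and contexts are illustrative.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Occurrence:
    """One mention of a place in a book, in reading order (hypothetical schema)."""
    name: str
    lat: float
    lon: float
    context: str


def geo_path(occurrences):
    """Ordered (lat, lon) points that an animated path would visit."""
    return [(o.lat, o.lon) for o in occurrences]


def pins(occurrences):
    """One pin per distinct place name, keeping the first-seen coordinates."""
    seen = {}
    for o in occurrences:
        seen.setdefault(o.name, (o.lat, o.lon))
    return seen


def total_occurrences(occurrences):
    """How many times each place name is mentioned in the book."""
    return Counter(o.name for o in occurrences)


def contexts_for(occurrences, name):
    """Every context snippet recorded for one place name, in text order."""
    return [o.context for o in occurrences if o.name == name]


# illustrative sample data, not taken from any real book
book = [
    Occurrence("paris", 48.8566, 2.3522, "she arrived in paris at dawn"),
    Occurrence("venice", 45.4408, 12.3155, "the canals of venice were quiet"),
    Occurrence("paris", 48.8566, 2.3522, "they returned to paris in spring"),
]
```

keeping the mentions in text order means the path follows the sequence of the book, while a place that appears several times still gets a single pin with several contexts.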
book management toolkits book map provides a series of toolkits for librarians/administrators to add and manage books. figure . : extractor the extractor allows librarians to extract geo-locations from a given book. librarians first upload the text file of a book, and the program applies the following processes to it:
1. use stanford ner to recognize named entities in the book;
2. extract the location entities and store the context around them;
3. use the mapbox api to convert the location entities to geo-location data;
4. store the data as a json file on the server.
figure . : editor once a book has been extracted as a location json file, librarians can edit it manually to correct any errors introduced by automation. the editor provides the following features:
• rename a location. if librarians find something wrong with the name of a location, they can modify it.
• delete a location. a location can be removed from the data. note, however, that removing a single occurrence of a location deletes all the locations sharing the same name from the data.
• modify the geo-data of a location. librarians can modify the latitude and longitude of a selected location by dragging and dropping its pin on the world map.
• save the json file. once the modifications are done, librarians can overwrite the json file on the server to bring them into effect.
figure . : bubble configurator using the bubble configurator, librarians can edit the bubbles of the books to be displayed on the book bubbles screen. they can:
• add a new bubble. librarians first choose a book's json file on the server (generated by the extractor), then configure the title and size of the bubble and add it to the preview section;
• remove a bubble. librarians can drag and drop a bubble onto the delete icon at the top-right corner to delete it;
• save the configuration.
once finished, librarians can save the configuration so that the change is applied to the actual book bubbles screen. the book map is a working prototype that we hope to test in the digital scholarship centre of the u of alberta when people are allowed to gather again. . design : teamwork scenario touch tables have become increasingly popular in workspaces and study areas. people have taken advantage of their features and turned them into powerful tools for collaboration and presentation. for example, samsung has developed the flip display for collaboration (example : https://www.youtube.com/watch?v=vujd byfi ) and the kodisoft company has transformed the touch table into a foodservice table where customers can have their own menu and play games together while they wait for lunch. our team also came up with several scenarios for using touch tables for collaboration and presentation. because group projects and teamwork are prevalent in universities and workplaces, we first considered how the touch table could make collaboration on a group project more productive. hence, our team proposes a scenario in which a touch table is used for visualization teamwork in voyant. imagine that a group of students wants to use voyant to study their texts, using the touch table in the library. they hope that each student can explore the visualizations on their own, but also that the group can study the project together on a shared table where each team member can interact and explore. in our proposal, voyant creates a section for each student to play with visualizations, but also has a shared space for team collaboration. students can click save to spyral to take notes for themselves, and they can add a screenshot or note to a shared collaboration platform such as google docs. in the design, the touch table is separated into two parts.
the major one is the voyant interface, accompanied by a small section for the collaboration platform. here is how it looks: [figure . . : wireframe of scenario, with panels labelled section a, section b, collaboration platforms, and exported skin builder voyant-tools] the voyant interface is itself separated into two kinds of section. the shared sections are in the middle of the interface, and the sections on the two sides are individual sections where each member can play with the visualizations on their own and can swap their section into the shared section. here is how it works:
• if students want to play around with the visualizations on their own, each student can use the visualizations on the two sides. they can analyze the text individually, choosing different types of visualization. sections can be added on the two sides based on the number of students in the project, and each section can be rotated to a direction that is comfortable for the student to read.
• if students want to discuss their project together, they can use the shared space in the middle. in the shared space, students can open a new window for a visualization, or a student can switch one of their own visualizations to the middle section, sharing their work with the rest of the group.
• if students are preparing for a presentation, they can open a powerpoint or word document beside the voyant website and add screenshots and notes immediately.
. design : teaching or lecture scenario large scale visualization walls are increasingly becoming an important tool in teaching and research presentations. this experiment tries to use the capabilities of visualization walls when a group of participants investigates texts using voyant. specifically, the experiment aims to show that the data wall can provide a platform for presenting the process of digital humanities research.
a professor who wrote a book rich with analysis and data visualizations is invited to give a lecture about the book at the uofa. the professor wants to use the walls to show how voyant was used in the research, and plans to reconstruct the process of writing the book by dividing the data walls into four sections: figure . . : wireframe of scenario
1. in the left section, a short video shows how the professor gathered and cleaned the data.
2. in the second section, participants can try, through a file-sharing platform, the tools and code the professor used to build the corpus.
3. in the third section, participants pass works to voyant. they try entering resources in the different formats (plain text, ms word, ms excel, pdf files, xml files, webpages, etc.) gathered in the professor’s voyant corpus. they learn, for instance, to enter xml documents into the tool by using xpath expressions.
4. in the rightmost section, the results of the analysis and visualization produced by voyant are presented. participants work with voyant to generate different visualizations and learn about options such as the “stopwords” option.
there could be different configurations of tools. figure . . : wireframe of voyant skin . conclusion for thousands of years we have striven to know ourselves, but only recently has visualization become a truly ubiquitous form of mediation for that knowledge. we are beyond the titillating examples of science fiction interfaces offered to us through films like minority report and the matrix; we are inundated with challenging new visualizations in news media, video games, and even so-called smart devices that control our homes. one thing is sure: these systems will only grow in importance, not only for consumers but also for corporations, governments and other organizations.
we are just beginning to get a handle on what's possible in research labs at universities, where large data walls and large touch tables are increasingly accessible. this research was doubly motivated: first, to better understand the technologies and how they may be used by groups, and second, to better understand our existing tools and how well they work (or not) through these new interfaces. we started with a conceptual examination of visualizations, why they matter, who is currently using them and to what ends, as well as introducing speculative design as a way of understanding visualizations in a less functional way. we proceeded with an audit of voyant as used on a large touch table, which allowed us to perceive some greater potential while identifying important shortcomings in both the tools and the systems meant to support them. engaging with our own exercise in speculative design, we presented the book map project which was designed from the outset to support scaled interfaces, from phones to large data walls. finally, we outlined an exercise in collaborative design that leverages some of the customizability of voyant to allow a group to articulate how different tools might work together for different purposes, especially when supported by large scale displays. we don't wish to exaggerate the significance of this work, but we strongly believe that humanists must be involved in designing, prototyping and implementing new visualizations tools intended to help generate knowledge. we are just at the beginning and much work remains to be done. visual matters csdh-schn . 
resources

examples
- https://www.youtube.com/watch?reload= &v= oxgzyhfcxe
- https://www.youtube.com/watch?v=vujd byfi
- https://www.youtube.com/watch?v=oimhuvoylfw

useful resources:
- https://www.researchgate.net/publication/ _multimedia_exhibition_design_exploring_intersections_among_storytelling_usability_and_user_experience_on_an_interactive_large_wall_screen
- https://link.springer.com/content/pdf/ . /s - - - .pdf
- https://arxiv.org/pdf/ . .pdf
- https://www.researchgate.net/publication/ _use_of_large_multi_touch_interfaces_a_research_on_usability_and_design_aspects

resource review
doi: dx.doi.org/ . /jmla. .
journal of the medical library association ( ) october jmla.mlanet.org

zoterobib. corporation for digital scholarship, boone boulevard, suite , vienna, va ; https://zbib.org/; pricing: free.

general description

zoterobib is a free, web-based citation generator from the same team that developed zotero. zoterobib, or zbib, was released in and is maintained by the corporation for digital scholarship (also known as “digital scholar”). the corporation for digital scholarship is a nonprofit organization founded in that is dedicated to the development of software and services for researchers and cultural heritage institutions [ ].

while it might be easy to initially confuse zotero and zoterobib, the main difference lies in what each tool can do and who uses it. zotero is a citation/reference manager and is useful software for almost anyone who is doing serious, long-term research [ ]. zoterobib is a citation generator for those who create occasional bibliographies or individuals who are interested in teasing out metadata for quickly citing a resource. this review focuses primarily on the features of zoterobib, with a brief comparison to the other existing tools.

features

accessibility

all that is required to use zoterobib is a device and an internet connection.
no mobile app is available, but it is optimized for tablet and phone use. this resource is compatible with any browser and does not require users to download software or create an account. the bibliography is stored in the browser’s local storage, so these data remain entirely under the user’s control [ ]. if users are in private or incognito mode, their bibliographies will be deleted when they close their browser windows.

usability

using zoterobib is easy and straightforward: users insert an identifier, such as a uniform resource locator (url), international standard book number (isbn), digital object identifier (doi), or pubmed identifier (pmid), or the title of the citation, into the search bar. zoterobib will then automatically grab data from “newspaper and magazine articles, library catalogs, journal articles, sites like amazon and google books, and much more” [ ]. if the automatic import finds incomplete data or does not find any data for a reference, a manual editor can be used to create the citation. if the citation is entered manually, zoterobib enables the user to choose the type of citation, with typical options such as book or journal article, and allows choices like artwork, podcast, or statute. the entry fields change based on the citation type. after the citation is created, it is added to the “bibliography” section below the search bar. each citation can be manually edited in the bibliography, and the citation style can be altered at any time.

zoterobib supports more than , citation style languages, including widely used styles from the american psychological association (apa) and modern language association (mla). in addition, zoterobib has styles commonly used in medicine like american medical association (ama), vancouver, national library of medicine (nlm), and those used for specific medical journals like the new england journal of medicine and nature.
because zoterobib is connected with zotero, all of the styles are kept up to date with current style guide recommendations and publisher requirements. for example, when apa released the seventh edition of their style guide, zotero updated from apa 6 to apa 7, and the style format was updated in zoterobib.

once the user is satisfied with the bibliography, the options for copying or exporting the bibliography are:
• copy to clipboard (copy citations to a local computer’s clipboard)
• download rich text format (rtf) (download option for all word processors)
• copy hypertext markup language (html)
• download ris
• download bibtex
• save to zotero

another helpful option that zoterobib offers is the ability to link to the current collection of citations that users have gathered. the “link to this version” option generates a url that can be used to retrieve that version of a bibliography at a later date. the “link to this version” url is useful if the bibliography is loaded onto another computer or shared. the shared bibliography is generated as a read-only version but can be loaded into the editor if changes need to be made.

zoterobib has an faq web page that contains information separated into three areas: general, usage, and troubleshooting [ ]. if specific help is needed or errors are found, users can report the issue or ask for assistance in the zotero forums [ ]. requesting support from zotero (@zotero) on twitter also seems to work for any problems with using zoterobib.

brief comparisons

one of the most noticeable differences between zoterobib and other citation generators is that zoterobib presents a “clean” interface without advertisements. most citation generators are littered with bothersome ads, which are distracting for anyone trying to use these websites.
some of the other citation generator tools might also require that users create accounts to access premium features, such as additional citation styles or export features. this can be cumbersome for those who just want a quick but reliable way to get an accurate reference.

another advantage zoterobib has over other citation generators is that it can generate bibliographies in more than , different styles [ ]. this feature is especially useful for those in the health sciences. while most of the free citation generators have the apa citation style available, they may lack the ability to generate citations in ama, vancouver, nlm, or other popular medical citation styles.

in this reviewer’s opinion, the main disadvantage of zoterobib is that when a citation does not have an identifier, or when the tool pulls the wrong information from a url or a doi, the data have to be entered manually; however, as mentioned above, zoterobib will guide the user in entering the necessary information, based on the type of reference. another drawback is that zoterobib does not allow the user to change the order of the citations in the bibliography if the style dictates that the references are numbered. although the user can delete and correct the order of citations in the numbered bibliography, it is unfortunate that amending the order of the numbered references is not an option. finally, although perhaps not as necessary for health sciences librarians, zoterobib does not appear to provide as many alerts to users to be cautious about untrustworthy references. websites like mybib and easybib both encourage users to consider the credibility of a citation and remind users to look for certain information to be included (e.g., author, date of creation for a website entry) [ , ]. table provides an overview of the various features of different citation generators.
practical usage

zoterobib can be used in the daily life of a health sciences librarian to:
• show undergraduate students how to easily create a bibliography if they do not need to build an ongoing database of citations
• generate and export a list of detailed citations from grey literature sources that do not offer an easy export option of their metadata for a systematic review
• assist nursing students in how to properly cite their references in apa format
• help a frugal researcher with exporting all of their citations using pubmed central identifiers (pmcids), but without any hidden or unreadable code that could interfere with the particular formatting needed in a grant application
• generate and export citations of us legal documents or law information for a psychiatrist’s manuscript

summary

zoterobib is a free and easy-to-use tool for working with references. while it is not designed to be used for managing references for long-term research projects, it can help everyone from undergraduates to senior researchers quickly generate citations and bibliographies. it is a handy resource for anyone working in health sciences libraries.
table . overview of features of different citation generators

columns:
(a) pulls metadata for each citation with only entering one identifier (e.g., doi, isbn)
(b) no ads on website / does not require purchasing or creating an account for more functionality
(c) ease of use / no coding or programming experience needed
(d) no download required
(e) exports in multiple formats†
(f) generates bibliographies in multiple citation styles‡
(g) generates easily copied citations in multiple citation styles†
(h) generates in-text citations in multiple citation styles†
(i) helps check credibility of citation§

citation generator | (a) | (b) | (c) | (d) | (e) | (f) | (g) | (h) | (i)
zoterobib | y | y | y | y | y | y | y | y | n
ama citation generator (mick schroeder) | y | y | y | y | n | n | n | n | n
auratikum | y | n | y | y | uta | uta | uta | uta | uta
bibguru | y | y | y | y | y | n | n | n | n
bibliograph | y | y | n | y | y | uta | uta | uta | uta
bibliography.com | y | y | y | y | n | n | n | n | n
bibme* | y | n | y | y | n | n | n | n | y
citace pro | y | n | y | y | n | uta | uta | uta | uta
citationgenerator | n | y | y | y | n | n | n | n | n
citation machine* | y | n | y | y | n | n | n | n | y
citationsy | y | n | y | y | uta | uta | uta | uta | uta
cite me | n | n | y | y | uta | uta | uta | uta | uta
cite this for me* | y | n | y | y | n | y | y | y | n
citefast | y | n | y | y | n | n | n | n | y
crossref | y | y | y | y | y | n | n | n | n
doi citation formatter | y | y | y | y | n | n | y | n | n
easybib* | y | n | y | y | n | n | n | n | y
formatically | n | y | y | y | n | n | n | n | n
google scholar | y | y | y | y | y | n | y | n | n
harvard generator | n | n | y | y | n | n | n | n | n
knightcite v . | n | y | y | y | n | n | n | n | y
mybib | y | n | y | y | uta | y | y | y | y
refindit | y | y | y | y | n | n | y | n | n
researchomatic | n | n | n | y | uta | uta | uta | uta | uta
sciwheel (formerly, f workspace) | y | n | uta | y | uta | uta | uta | uta | uta
scribbr | y | n | y | y | n | n | n | n | n

y = yes; n = no; uta = unable to assess.
* chegg owns the following websites or web citation tools, and they all function essentially the same way: bibme, citation machine, cite this for me, and easybib.
† multiple export formats = more than just exporting to ms word or googledoc.
‡ multiple citation styles = more than just american psychological association (apa), modern language association (mla), chicago manual of style/turabian, and harvard styles.
§ the website/tool either had to provide feedback on credibility of citation information or remind the user of what to look for in a trustworthy resource.

references

corporation for digital scholarship. digital scholar [internet]. vienna, va: the corporation [cited jul ]. <https://digitalscholar.org/>.
corporation for digital scholarship. zotero: about [internet]. vienna, va: the corporation [cited jul ]. <https://www.zotero.org/about/>.
corporation for digital scholarship. introducing zoterobib: perfect bibliographies in minutes. zotero blog [internet]. may [cited jul ]. <https://www.zotero.org/blog/introducing-zoterobib/>.
corporation for digital scholarship. zoterobib faq [internet]. the corporation [cited jul ]. <https://zbib.org/faq>.
corporation for digital scholarship. recent discussions. zotero forums [internet]. the corporation [cited jul ]. <https://forums.zotero.org/>.
mybib. mybib – a new free apa, harvard, & mla citation generator [internet]. mybib [cited jul ]. <https://www.mybib.com/>.
chegg. easybib [internet]. santa clara, ca: chegg [cited jul ]. <https://www.easybib.com/>.

tara j. brigham, ahip, brigham.tara@mayo.edu, http://orcid.org/ - - - , assistant professor, medical education, and librarian, mayo clinic libraries, mayo clinic, jacksonville, fl

articles in this journal are licensed under a creative commons attribution . international license. this journal is published by the university library system of the university of pittsburgh as part of its d-scribe digital publishing program and is cosponsored by the university of pittsburgh press. issn - (online)
research publishing by library and information science scholars in pakistan: a bibliometric analysis

open access
accepted date: february ,
received date: october ,

*corresponding author: muhammad yousuf ali, deputy librarian, sindh madressatul islam university, karachi, pakistan. e-mail: myali@smiu.edu.pk

all jistap content is open access, meaning it is accessible online to everyone, without fee and authors’ permission. all jistap content is published and distributed under the terms of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /). under this license, authors reserve the copyright for their content; however, they permit anyone to unrestrictedly use, distribute, and reproduce the content in any medium as far as the original authors and source are cited. for any reuse, redistribution, or reproduction of a work, users must clarify the license terms under which the work was produced.

ⓒ muhammad yousuf ali, joanna richardson,

abstract

scholarly communication plays a significant role in the development and dissemination of research outputs in library and information science (lis). this study presents findings from a survey which examines the key attributes that characterize the publishing by pakistani lis scholars, i.e. academics and professionals, in national journals. a pilot-tested, electronic questionnaire was used to collect the data from the target population. respondents (or . % of target) provided feedback on areas such as number of articles published, number of citations, and the nature of any collaboration with other authors.
the findings of this survey revealed that, among the various designated regions of pakistan, the punjab region was the most highly represented. in articles published in national journals, there was a clear preference among all respondents to collaborate with at least one other author. the citation metrics for lis articles in national journals were relatively low ( . %), which aligns with scimago’s journal and country rankings. the uptake of social scholarly networks mirrors international trends. respondents were asked to score factors which could impact negatively on their ability to undertake research and/or publish the results. the study recommends that concerned stakeholders work together, as appropriate, to address concerns. in addition, it recommends that further research be undertaken to define patterns of pakistani co-authorship in the social sciences.

keywords: scholarly communication, lis researcher, scholarly publishing, citation analysis

joanna richardson, information services, griffith university, brisbane, australia. e-mail: j.richardson@griffith.edu.au
muhammad yousuf ali *, sindh madressatul islam university, karachi, pakistan. e-mail: myali@smiu.edu.pk

jistap journal of information science theory and practice http://www.jistap.org
research paper
j inf sci theory pract ( ): - , http://dx.doi.org/ . /jistap. . . .
eissn : - pissn : -

http://www.jistap.org research publishing by library and information science scholars

introduction

scholarly communication is a process through which “scholars share their findings with colleagues and claim precedent for their ideas” (case, , p. ). according to the association of research libraries, scholarly communication can be defined as “the system through which research and other scholarly writings are created, evaluated for quality, disseminated to the scholarly community, and preserved for future use.
the system includes both formal means of communication, such as publication in peer-reviewed journals, and informal channels, such as electronic listservs” (arl, n.d.). through this process, academics, scholars, and researchers share and publish their research findings so that they are made available to the wider academic community and beyond.

scholarly communication is an integral part of the research lifecycle. all scholars, either directly or indirectly, are involved in this process. when compared with international approaches, the situation is quite different in pakistan with respect to library and information science. there are only two major lis research journals published in pakistan: one is in print-only format, pakistan library & information science journal (plisj), and the other is in digital-only format, pakistan journal of information management & libraries (pjim&l, formerly pjlis). whereas naseer and mahmood ( ) have done a bibliometric analysis of plisj, warraich ( ) has done a bibliometric analysis of the latter journal. a number of pakistani lis research outputs are also published in social science journals, i.e. other disciplines; these tend to be published in english. mahmood ( ) has discussed in detail articles published by lis scholars in foreign journals.

however, no user-based study has been conducted recently of the attributes which most accurately characterize lis scholars’ publications in national journals. this article reports on a survey which investigates this topic in an effort to determine the overall quantity and quality of publishing among lis scholars in pakistan. for the purposes of this paper, the authors have used the term “scholar” to encompass lis academics, researchers, and professionals.

literature review

scholars work in a “rapidly evolving and transformative landscape” (wolski & richardson, , p. ). the research lifecycle, which underpins their work, has been discussed by kwon et al.
( ) from the perspective of the major research steps and research activities associated with research and development (r&d) projects, and by lee et al. ( ) in regard to how scientists specifically locate the information required. boote and beile ( ) have made a valuable contribution in this arena with their analysis of what constitutes “good research.” there has been a focus in recent years on how to refine academic services required to support the research lifecycle (todd, ; vaughan et al., ; deng & dotson, ).

the role which scholarly communication plays within this context has been explored by a range of contemporary thought leaders, including liu ( ), thorin ( ), borgman ( ), and maron and smith ( ). much has been written, for example, about the changing nature of scholarly publishing (rowlands et al., ). according to disabato ( ), the internet has disrupted the former relationships and roles among authors, publishers, and readers. in addition, new publishing and pricing models are being explored for journals, scholarly monographs, textbooks, and digital materials, as all the stakeholders try to establish sustainable business models (weller, ; arts & humanities research council, ), all of which is having a major impact on the ability of scholars to publish.

the actual measurable impact of scholarly publishing likewise continues to be widely discussed. brody ( ) has reported on the potential contribution of open access to increasing research impact. concern has been expressed as to whether citation analysis should continue to be the predominant standard by which research impact is measured, especially journal articles (moed, ; sarli et al., ; aksnes et al., ; nightingale & marshall, ; dowling, ). meho and yang ( ) have examined the implications of using major services such as web of science, scopus, and google scholar on the citation counts and rankings specifically of lis faculty.
borgman and furner ( ) have discussed the transformation of scholarly communication while posing fundamental questions about its impact:

    but how much has human behavior really changed? how much has the infrastructure for scholarly communication changed? are we witnessing a revolution in scholarly communication, or an evolution? or a co-evolution of technology and behavior? … and how do we determine what kinds of change are occurring? (p. )

in response they have proposed bibliometrics as a powerful tool for studying scholarly communication, especially citation analysis. on the one hand, scholarly communication often involves a subjective assessment of quality, which is frequently undertaken through a peer-review process. bibliometrics, on the other hand, is the application of mathematical and statistical, i.e. non-subjective, methods to books, articles, and other media of communication in order to measure attributes such as post-publication impact. according to agyeman and bilson ( , p. ), bibliometrics is a “research technique in library and information science that applies quantitative analysis and statistics to describe publication patterns in any field of knowledge.” it can provide a useful tool for benchmarking research output.

since the creation in of the higher education commission (hec), pakistan has become much more focused on the importance of research in helping the nation achieve a range of socio-economic goals. a number of studies have examined the education system for lis scholars as well as their research publications.

according to ahmad ( ), library education began at the university level at the turn of the twentieth century, with postgraduate diplomas offered as of and eventually progressing to the master of philosophy (mphil) and doctoral levels. in haider and mahmood reported on lis doctoral studies at punjab university and made a number of suggestions to improve the then nascent program.
bhatti and ariff ( ) suggested that a revised approach to distance education could enhance the lis curriculum in general in pakistan. the following year haider and mahmood ( ) examined the relative lack of success up to that time of a number of lis doctoral programs. issues included the absence of proper supervision, resulting in theses of poor quality; low esteem for national phd degrees, when compared with international offerings, in the eyes of professional colleagues; and little or no impact of early degree recipients on the profession. samdani and bhatti ( ) reported on a similar but updated study. their findings reported that lis phd theses had been produced by pakistani universities and a total of phd degrees awarded by foreign universities. they provided a number of recommendations to improve the situation.

in more recent years ameen ( ) has highlighted the challenge of creating reputable academic programs: “… library schools have their own deficiencies such as lack of senior phd faculty members, lack of financial and other material resources to be able to deliver quality education: it varies from school to school in absence of accreditation standards though” (p. ). ahmad and mahmood ( ) also have highlighted the lack of both finances and highly qualified staff; however, they do observe that lis “faculty members are inclined towards research” (p. ).

in examining the changing lis research environment in pakistan, mahmood and shafique ( ) have reported “a wide gap between demand and supply of lis professionals with research qualifications” (p. ). they have expressed concern as to the flow-on effect in filling lis leadership positions. in , ameen and ullah reported on the results of a survey of the chief librarians of university libraries located in islamabad and rawalpindi in regard to their perceptions about their positions being allocated faculty status.
a major finding was that “… the main barriers in getting faculty status are the librarians themselves, lacking preparedness in terms of qualifications and research output” (p. ).

from the perspective of research outputs, mahmood ( ) reported on a review of articles published in foreign journals on various aspects of lis services in pakistan. in consulting four major abstracting services, several of which had in excess of , records, only articles were found which contained information on pakistani librarianship. of the authors affiliated with those articles, only ( . %) were affiliated with libraries and lis schools in pakistan. the majority of the articles ( . %) had a single author. he concluded: “the promotion of research activities in the field of librarianship in pakistan is direly needed in this era of communication” (p. ).

more recently, naseer and mahmood ( ) have reported on their analysis of one of the two major pakistani lis journals in order to understand trends in lis research in pakistan. they examined the subject coverage and authorship characteristics of articles published in the pakistan library & information science journal (plisj) during - . major findings included:

    mostly asian authors, predominantly pakistanis, contribute to the journal. the state of collaboration among authors of plisj is not very encouraging as majority of the authors prefer to work in isolation. male authors lead the lis research scene but contributions from female authors have increased. descriptive articles still represent major part of plisj but articles based on empirical research have increased. mostly, articles written in english language are published in the journal but number of articles written in urdu has improved. (naseer & mahmood, , p. )

warraich and ahmad ( ) have conducted a similar study but of the other major national lis journal, the pakistan journal of library and information science (pjlis), recently rebranded as the pakistan journal of information management & libraries (pjim&l). they concluded that:

    authorship pattern shows that most of the papers were single authored and being pakistani origin journal majority of the authors belonged to pakistan. authors from foreign countries also contributed in this journal as found in the study of volumes. it shows that the journal has been internationally circulated. it needs wider circulation and that is the reason that for the last few years its issues have been made available online. authors from the university of the punjab contributed maximum papers followed by the university of karachi. majority of the papers were research papers and percent were written in english language. fifty one papers ( . %) have - references which seem to be a good trend in research while contributions ( . %) were without references. (p. )

from the literature reviewed to date, one can conclude that while scholarly communication, in particular scholarly publishing, is rapidly evolving at an international level with a range of consequences for scholars, the same cannot be said for lis scholarship in pakistan. the literature highlights a number of factors which have slowed the rate at which quality lis research is being undertaken at the national level. while several bibliometric studies have been done on lis outputs in foreign journals and in two specific national lis journals, no bibliometric analysis has been undertaken which examines lis articles in national lis and related social science journals. this study is intended to fill that gap.

objectives of the study

within the discipline of library and information science, it is common practice for scholars to publish in a range of journals.
the main purpose of this paper is to examine the key attributes of pakistani lis scholars’ research outputs, based on their contributions in national journals. based on the current gap in knowledge identified through the literature review, the following objectives are framed:

1. to identify relevant demographic information, e.g. gender and geographical affiliation
2. to determine the extent of collaborative authorship
3. to determine the extent of publishing based on geographical regions
4. to determine the strength of association between job title (seniority) and number of publications
5. to determine the strength of citation metrics for national outputs
6. to identify factors which may impact negatively on lis scholars’ ability to undertake research and/or to publish it

methodology

a questionnaire was designed based on the literature review undertaken for this project. pretesting was done to refine the original set of questions. an online survey form was distributed via email, yahoo groups, and facebook to lis scholars within each province: sindh, punjab, khyber pakhtunkhwa (kpk), baluchistan, gilgat baltistan & azad jammu and kashmir (ajk), and federal capital. the survey was left open from june to august , . collected data was analyzed in spss version .

from discussions with several members of the pakistan library association, it is estimated that there are currently about lis researchers / faculty members / postgraduate students. the authors chose ( %) as the target sample size for a geographically scattered population. the process involved selecting a random sample of respondents, where possible, from each of the designated provinces and administrative units so as to achieve adequate representation from each area. sindh, punjab, and the federal capital were able to meet the target of respondents from each province / administrative unit.
baluchistan, khyber pakhtunkhwa (kpk), gilgat baltistan (gb), and azad jammu and kashmir (ajk) did not meet the target. out of a target of scholars, responses were received, or a response rate of . %.

scope

the study was limited to articles by lis scholars in pakistani, i.e. national, lis and other social science journals. it was understood that not all articles by the target group would necessarily be restricted to just the lis domain. it is not uncommon, for example, for lis scholars to publish on lis topics in allied areas such as law and education.

results

demographical information

the data collected for this research was obtained from respondents. respondents were queried regarding the following demographical information: gender, organizational affiliation by sector, and geographical affiliation.

gender

table shows that ( . %) respondents were male and ( . %) were female. in naseer and mahmood’s study ( ), the ratio was male = . % and female = . %, with gender not determined for the remaining respondents. although fatima and bhatti’s study ( ) was limited to the punjab province, their results also indicated that male respondents were dominant.

affiliation by sector

the respondents were asked to indicate whether they currently were working in the public or private sector. ( . %) worked in public sector institutions and ( . %) respondents worked in private sector institutions. table indicates that public sector institutions were more highly represented in the results.

table . frequency distribution of respondents by gender [columns: gender, frequency, percent; rows: male, female, total]

table . frequency distribution of respondents’ affiliation by sector [columns: institutions, frequency, percent; rows: public, private, total]

geographical affiliation

table shows the respondents’ geographical affiliation within pakistan.
although the research objective was to have a random sample of respondents from each designated province and administrative geographical unit, this was not achieved for all regions. three regions (punjab, federal capital, and sindh) accounted for . % of all respondents’ affiliations.

table . frequency distribution by geographic affiliation (columns: region, frequency, percent; rows: punjab, federal capital, sindh, kpk, baluchistan, gilgit baltistan & ajk, total)

publishing history

respondents were asked to indicate when they had first begun to publish their research. table indicates that ( . %) respondents began publishing during - ; ( . %) began during - ; and ( . %) began during - . in the ten-year period from – , only ( . %) respondents had published their first output. before , only ( . %) had begun publishing. seventeen ( . %) respondents indicated that they had not yet published in any research journals. these respondents have recently commenced either a master of philosophy or other postgraduate study. in addition, although they have also carried out research for a master of library and information science degree, they have not yet published either their thesis or any of their findings.

respondents from the public sector accounted for . % of the total first publications and respondents from the private sector accounted for . %.

table . frequency distribution by year and by sector of respondents’ first publication (columns: year, public sector (a), private sector (b), sum (a + b), percent; rows: four date ranges, before , not yet published, total)

patterns of authorship

respondents were asked to indicate the total number of their lis scholarly publications which had been published in national, i.e. pakistani, lis and other social science journals. as of august , , scholars had published research articles, with a mean = . and standard deviation (sd) = . as shown in table .
table shows the authorship pattern of the scholarly outputs published in national journals. single-authored publications accounted for the highest percentage, i.e. ( . %, with mean = . & sd = . ); the number of two-authored papers was ( . %, with mean = . & sd = . ). the number of three-authored publications was ( . %, with mean = . & sd = . ). only articles ( . %, with mean = . & sd = . ) had four or more authors. it is not known to what extent the two-authored publications may represent collaboration between an early career researcher / professional and a more senior colleague or supervisor.

table . average number of scholarly publications in national journals per respondent (columns: number of publications, mean, sd)

table . distribution of lis publications in national journals by authors’ contribution (columns: type of authorship contribution, no. of publications, mean, sd; rows: single author, two authors, three authors, four or more authors, total)

publications by sector

table shows that whereas ( . %) respondents from the public sector produced ( . %) articles, ( . %) respondents from the private sector produced ( . %) articles out of a total of articles. this demonstrates a positive correlation between the sector with the largest percentage of respondents and the sector that produced the largest percentage of the total publications recorded.

table . distribution of lis publications in national journals by sector (columns: sector, respondents, percent, articles, percent, mean; rows: public, private, total)

publications by region

table shows the distribution of respondents and their publications by region. out of a total of articles, articles were published by respondents from the federal capital, from punjab, from sindh, from kpk, from baluchistan, and from gilgit baltistan / ajk.
while the top provinces along with the federal capital accounted for ( % of all publications), respondents from punjab accounted for % of that sub-set alone. given the relatively small sample size, spearman’s rank correlation was used to test whether a significant association could be shown between the number of scholars in a given region and the number of articles from that region:

$$ r_s = \frac{\operatorname{cov}(x_{ra},\, y_{ra})}{\sigma_{x_{ra}} \, \sigma_{y_{ra}}} $$

where $x_{ra}$ = rank of x values; $y_{ra}$ = rank of y values; $x_{ra} - m_x$ = x rank minus mean of x ranks; $y_{ra} - m_y$ = y rank minus mean of y ranks; and $\sigma_{x_{ra}}$, $\sigma_{y_{ra}}$ are the standard deviations of the x and y ranks.

x rank had a mean = . and sd = . . y rank had a mean = . and sd = . . the value of rho is . and the two-tailed value of p is . . by normal standards, the association between the two variables (scholars and publications) would be considered statistically significant. while it is not surprising that regions with a greater number of scholars had more publications, this does not account for the considerable variance in the number of publications among the top ranked provinces, all of which had nearly the same number of respondents.

table . distribution of lis publications in national journals by regions (columns: location, no. of scholars (x), no. of articles (y), rank of x values, rank of y values, x rank minus mean of x ranks = xra - mx, y rank minus mean of y ranks = yra - my, sum diffs = (xra - mx) * (yra - my); rows: federal capital, punjab, sindh, kpk, baluchistan, gilgit baltistan & ajk, total)

publications by job title

table examines the distribution of respondents across the major job titles / roles among lis academics and professionals within pakistan. “other” encompasses titles such as associate dean, assistant manager, information officer, as well as the status of “unemployed.”

as with table , spearman’s rank correlation was used to test whether a significant association could be shown between job title / role and number of publications. x rank had a mean = . and sd = . . y rank had a mean = . and sd = . . the value of rho is . and the two-tailed value of p is . . by normal standards, the association between the two variables would not be considered statistically significant.

table . distribution of lis publications in national journals by job title (columns: job title, no. of respondents (x), no. of publications (y), rank of x values, rank of y values, x rank minus mean of x ranks = xra - mx, y rank minus mean of y ranks = yra - my, sum diffs = (xra - mx) * (yra - my); rows: assistant professor, librarian, professor, chief librarian, library consultant, other, associate professor, assistant librarian, lecturer, deputy librarian, total)

citation analysis

respondents were asked how many of their lis publications in national journals had been cited. table shows that ( . %) publications were not cited. this result is considerably higher than the figure of approximately % reported by warraich and ahmad ( ) for the pakistan journal of library and information science (now the pakistan journal of information management & libraries).

table . frequency distribution of number of lis publications cited (columns: authorship publications (n= ), article citations (f), percent; rows: yes, no)

support for lis research in pakistan

respondents were asked to rate their level of agreement or disagreement with a series of statements regarding support for lis research in pakistan. five factors were presented as a closed list with a -point likert scale ranging from “strongly agree” ( ) to “strongly disagree” ( ).

the results are illustrated in table . the predominant concern was higher education commission of pakistan (hec) funding for publishing and participating in relevant conferences; many respondents (n= , . %) either strongly agreed or agreed that this was necessary for supporting lis research.
another concern was the level of report writing skills among researchers; a substantial number of respondents (n= , . %) either disagreed or strongly disagreed that the level was satisfactory. respondents’ satisfaction or dissatisfaction was fairly evenly divided for the level of support for the respondent’s publishing by their parent institution and the delay experienced in getting submitted articles actually published. a large number of respondents (n= , . %) either strongly agreed or agreed that their parent institution supported their publishing activities.

table . respondents’ satisfaction with support for lis research in pakistan (percentage of responses at each scale point* for each statement)
- there is sufficient funding for lis research within my institution
- my parent institution supports my publishing activities
- hec should fund lis researchers to both publish and participate in relevant conferences
- i am satisfied with current research report writing skills of lis researchers
- articles submitted for publication are usually delayed in a very long queue
*scales ( =strongly agree, =agree, =moderate, =disagree, =strongly disagree)

scholarly networks

scholarly communication has increased in recent years through scholarly networks such as academia.edu, google scholar, researchgate, and mendeley. these scholarly networks provide a new window of opportunity for pakistani scholars to display their research works. there is also an opportunity to connect internationally with researchers in a similar discipline. in addition, researchers can remain up-to-date with current research trends in their field. only ( . %) respondents indicated that they did not use any scholarly network.

discussion

in examining the key attributes of pakistani lis scholars’ research outputs in national journals, survey results have shown a continued dominance of males among respondents.
this gender distribution would seem to reflect the fact that in pakistani culture many women do not continue to study after marriage. it can be challenging for female professionals to continue undertaking research once they have family and household responsibilities.

the overall predominance of (1) punjab, sindh, and the federal capital as most highly represented in terms of respondents’ geographical affiliation and (2) the respondents’ affiliation with public institutions, principally universities, reflects the pattern of distribution of universities within pakistan.

in addition, a recent survey by jan and anwar ( ) offers further insights. in their study of publications by lis faculty from eight pakistani universities, five faculty members from a single (public) university, university of punjab, contributed ( . %). eleven more publications ( . %) were contributed by another (public) university, islamia university. this may indicate that some universities, especially within the public sector, have been able to attract a number of faculty members who are motivated to undertake research and publish their findings. it would be useful in future to investigate the motivational factors for those lis scholars, i.e. not just academics, who do publish.

not all respondents had yet published an article. significantly, by far the largest percentage of those respondents who had begun to publish had commenced doing so only in the last five years. a major contributing factor may have been the more stringent criteria set by the higher education commission of pakistan (hec) in recent years for academic promotion. another contributor may have been the realignment by the hec in of the length of bachelor and masters programs in pakistan with international standards.
as a result, some scholars who have wished to undertake doctoral study have first had to complete a two-year master of philosophy (mphil), whereas there would have otherwise been more lis post-bachelor students in the educational system. it is conceivable that some would have commenced scholarly publishing while studying for the mphil.

in the case of their first publication, respondents from the public sector accounted for . % of the overall total number of first publications, and respondents from the private sector accounted for . %. however, the ratio of public to private has seen a % increase from the private sector in the past five years. this is a trend which would be worthwhile to revisit within the next five years.

in national lis and related social science journals, single-authored publications were the most common ( . %), followed closely by two-authored outputs ( . %); together they accounted for nearly % of all lis publications in national journals. three-authored publications represented . % and publications with four or more authors were a distant . %. as a whole, co-authored publications accounted for % of total publications, which shows a marked increase when compared with the recent analysis of a single national journal, pjlis, undertaken by warraich and ahmad ( ), in which . % of articles were single-authored. further investigation would be useful to determine whether, for example, non-lis disciplines in the social sciences tend to have more multiple-authored publications than lis publications.

it was found that punjab province was the most highly represented in terms of journal articles by lis scholars. sindh and the federal capital were followed by khyber pakhtunkhwa, which had a moderate level. baluchistan, gilgit baltistan, and azad jammu and kashmir reported very few journal articles.
as indicated previously, although there was a significant association between the number of scholars and the number of publications, this did not account for the considerable variance in the total number of publications among the top ranked provinces, all of which had nearly the same number of respondents.

in punjab there are three public universities (university of the punjab, islamia university, and sargodha university) which have lis departments conducting classes at the undergraduate, graduate, and doctoral levels. in the private sector, minhaj international university also offers lis education. in addition, punjab has another public universities and private ones, which would account for many lis professionals. lis courses are offered at only two universities in sindh, at one in kpk, one in baluchistan, and one in the federal capital. this may help to account for the predominance of punjab in the number of articles by lis scholars in national journals.

survey results showed that among lis professionals there was no direct correlation between seniority of position and a higher rate of publishing. although more librarians were represented in the survey than chief librarians, the average number of publications for each was very close: chief librarians = . ; librarians = . . in academia, on the other hand, although unsurprisingly more assistant and associate professors were represented than professors, the average number of publications for each was markedly different: professor = . ; assistant professor = . . associate professors had an average of . publications.

in some cases, respondents will have published articles in international journals as well. therefore, the above figures are not intended to present an overall profile of respondents by job title/rank. it would be useful in future to compare publishing in both national and international journals to determine any significant comparative trends.
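
the rank correlations referred to above (scholars versus articles by region, and job title versus publications) use spearman’s rho, defined in the results as the covariance of the ranks divided by the product of their standard deviations. a minimal python sketch of that calculation follows. the function names and the counts are hypothetical and purely illustrative; they are not the survey data, and the sketch does not compute the p value.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Return 1-based ranks (smallest value = rank 1); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: covariance of the ranks / (sd of x ranks * sd of y ranks)."""
    xr, yr = average_ranks(x), average_ranks(y)
    mx, my = mean(xr), mean(yr)
    # population covariance, paired with population standard deviations below
    cov = sum((a - mx) * (b - my) for a, b in zip(xr, yr)) / len(xr)
    return cov / (pstdev(xr) * pstdev(yr))


# hypothetical counts for six regions (illustrative only, NOT the survey data)
scholars = [30, 30, 29, 12, 5, 3]
articles = [120, 210, 95, 40, 6, 2]
print(round(spearman_rho(scholars, articles), 3))
```

ranking in ascending or descending order gives the same rho as long as both variables are ranked in the same direction. for a two-tailed p value, a statistics package would normally be used (for example `scipy.stats.spearmanr`, assuming scipy is available).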
given that practitioners accounted for slightly more than % of all reported publications, further investigation could be undertaken to compare these metrics with work done by schlögl and stock ( ) and willard et al. ( ) regarding publishing practices among lis practitioners and academics.

while citation metrics are commonly used in lis bibliometric analysis, non-citation rates vary enormously by field. according to larivière et al. ( ), % of articles in the social sciences are not cited, compared with % for the humanities. as mentioned previously, current survey results indicated that . % of reported publications had not been cited. although warraich and ahmad have reported a much lower figure (approximately %), their study was limited to a specific journal, the pakistan journal of library and information science (now pakistan journal of information management & libraries). it would be useful to examine (1) how many of the journals covered in the current survey were available only in print format, unlike the case of pjlis (pjim&l); and (2) whether there was any correlation between year, i.e. age of publication, and number of citations.

the very low citation metrics ( . %) for lis publications in national journals align with scimago’s journal and country rankings (http://www.scimagoir.com/). in fact, there is only one journal represented in the rankings, pjlis (pjim&l). as of , the average citation per document rate in a two-year period was . . there have been only citable documents published between and . the journal ranks out of the worldwide lis titles listed in the scimago database. because the other major pakistani lis journal, plisj, is available only in printed format, it is not represented in this international resource.
this indicates that national lis publications may be struggling to compete at an international standard, an observation which has been corroborated by jan and anwar ( ).

there are various challenges faced by pakistani lis authors in undertaking and publishing their research. respondents felt quite strongly that the higher education commission of pakistan (hec) should provide funding for publishing and participating in relevant conferences.

the skills required to write an effective research report closely mirror those required for scholarly publishing. they include the ability to write clearly and concisely, a critical and thoughtful level of enquiry, and the ability to logically structure an argument. the survey results indicate that there may be a consequential training gap in this skills area; % either disagreed or strongly disagreed that the current research report writing skills of lis researchers were satisfactory.

pakistani lis researchers are not alone in decrying unreasonable delays which may occur in the scholarly publishing process. however, unlike the case of the international “publishing giants,” there may be an opportunity within pakistan to address this concern directly with editors and publishers. the authors would recommend that all stakeholders involved work together to address this issue.

in addition to several of the impediments to either undertaking research or publishing the results as rated by respondents, it would be useful to gather additional data on the topic. areas could include data analysis, language as a barrier, research training, and mentoring / coaching.

conclusion

the purpose of this study was to analyze the key attributes of articles published by pakistani lis scholars in national journals. compared with previous studies, results show that male authors continue to dominate and the punjab region continues to produce the highest percentage of lis articles.
given the historical challenges outlined by various authors in regard to the quality of academic lis programs in pakistan, it is significant that the largest number of articles represented in the study have only been published in the past five years. this may be a result of not only a comparatively recent emphasis at the national level on the importance of research to advance strategic imperatives but also the more stringent criteria set by the higher education commission of pakistan in recent years for academic promotion.

because citation analysis contributes to an understanding of the publication patterns in disciplines, it is significant that a very high percentage of articles had not yet been cited, which correlates with the statistics generated by the international benchmarking tool, scimago’s journal and country rankings. at the same time, the results of the study have highlighted a substantial increase in multiple-authored publications in comparison with an earlier survey. further research is indicated to determine whether there is any correlation between multiple-authorship and increased citation impact in general, and, if so, the potential impact on the citation metrics of national pakistani lis publications in future. additionally, it is expected that further studies will be conducted by the authors in the use of scholarly networks among pakistani lis scholars to promote their publications.

the research reported in this paper has highlighted additional areas for follow-up investigation so as to gain an in-depth profile of pakistani lis scholars. the results of such research should enhance understanding as to those critical success factors which might assist these scholars to improve their research publishing, thereby supporting national strategic objectives.

acknowledgements

the authors are grateful to the survey participants for their invaluable contribution.
we would like to thank the editor and the anonymous reviewers for their helpful and insightful comments.

references

agyeman, e.a., & bilson, a. ( ). research focus and trends in nuclear science and technology in ghana: a bibliometric study based on the inis database. library philosophy and practice. paper . retrieved from http://digitalcommons.unl.edu/libphilprac/ .
ahmad, p. ( ). lis education in pakistan at postgraduate level. pakistan library & information science journal, ( ), - .
ahmad, s., & mahmood, k. ( ). library and information science education in pakistan: a decade of development - to . pakistan library & information science journal, ( ), - .
aksnes, d. w., schneider, j. w., & gunnarsson, m. ( ). ranking national research systems by citation indicators. a comparative analysis using whole and fractionalised counting methods. journal of informetrics, ( ), - .
ameen, k. ( ). changing scenario of librarianship in pakistan: managing with the challenges and opportunities. library management, ( ), - .
ameen, k., & ullah, m. ( ). challenges of getting faculty status: perception of university librarians in pakistan. the international information & library review, ( ), - .
arts & humanities research council ( ). the academic book of the future: call for proposals. swindon, uk: ahrc. retrieved from http://www.ahrc.ac.uk/funding-opportunities/pages/future-of-the-academic-book.aspx.
aslam bhatti, m., & arif, m. ( ). library and information science distance education and continuing professional development in pakistan. library review, ( ), - . retrieved from http://dx.doi.org/ . / .
association of research libraries (arl) (n.d.). scholarly communication. retrieved from http://www.arl.org/focus-areas/scholarly-communication#. vvwa ez lmm.
boote, d. n., & beile, p. ( ). scholars before researchers: on the centrality of the dissertation literature review in research preparation. educational researcher, ( ), - .
borgman, c. l. ( ). scholarship in the digital age. cambridge, ma: mit press.
borgman, c. l., & furner, j. ( ). scholarly communication and bibliometrics. in b. cronin (ed.), annual review of information science and technology, vol. (pp. - ). medford, nj: information today.
brody, t. ( ). evaluating research impact through open access to scholarly communication (doctoral dissertation, university of southampton).
deng, s., & dotson, l. ( ). redefining scholarly services in a research lifecycle. in b. l. eden (ed.), creating research infrastructures in the st-century academic library: conceiving, funding, and building new facilities and staff (pp. - ). lanham, md: rowman & littlefield.
disabato, n. ( ). publication standards part : the fragmented present. a list apart, no. . retrieved from http://alistapart.com/article/publication-standards-part- -the-fragmented-present.
dowling, g. r. ( ). playing the citations game: from publish or perish to be cited or sidelined. australasian marketing journal (amj), ( ), - .
haider, s. j., & mahmood, k. ( ). post-master lis education at punjab university (lahore). pakistan library & information science journal, ( ), - .
haider, s. j., & mahmood, k. ( ). mphil and phd library and information science research in pakistan: an evaluation. library review, ( ), - .
jan, s. u., & anwar, m. a. ( ). impact of pakistani authors in the google world: a study of library and information science faculty. library philosophy and practice (e-journal). paper .
kwon, n. h., lee, j. y., & chung, e. k. ( ). understanding scientific research lifecycle: based on bio- and nano-scientists’ research activities. journal of the korean society for library and information science, ( ), - .
larivière, v., gingras, y., & archambault, É. ( ). the decline in the concentration of citations, – . journal of the american society for information science and technology, ( ), - .
lee, j. y., chung, e. k., & kwon, n. h. ( ). scientists’ information behavior for bridging the gaps encountered in the process of the scientific research lifecycle. journal of the korean society for information management, ( ), - .
liu, z. ( ). trends in transforming scholarly communication and their implications. information processing & management, ( ), - .
mahmood, k. ( ). library and information services in pakistan: a review of articles published in foreign journals. the international information & library review, ( ), - .
mahmood, k., & shafique, f. ( ). changing research scenario in pakistan and demand for research qualified lis professionals. library review, ( ), - .
maron, n. l., & smith, k. k. ( ). current models of digital scholarly communication: results of an investigation conducted by ithaka for the association of research libraries. washington, dc: association of research libraries.
meho, l. i., & yang, k. ( ). impact of data sources on citation counts and rankings of lis faculty: web of science versus scopus and google scholar. journal of the american society for information science and technology, ( ), - .
moed, h. f. ( ). citation analysis in research evaluation. dordrecht, the netherlands: springer.
naseer, m. m., & mahmood, k. ( ). lis research in pakistan: an analysis of pakistan library and information science journal - . library philosophy and practice, (june).
nightingale, j. m., & marshall, g. ( ). citation analysis as a measure of article quality, journal influence and individual researcher performance. radiography, ( ), - .
rowlands, i., nicholas, d., & huntington, p. ( ). scholarly communication in the digital environment: what do authors want? learned publishing, ( ), - .
samdani, r. a., & bhatti, r. ( ). doctoral research in library and information science by pakistani professionals: an analysis. library philosophy & practices (november ). retrieved from http://unllib.unl.edu/lpp/samdani-bhatti.pdf.
sarli, c. c., dubinsky, e. k., & holmes, k. l. ( ). beyond citation analysis: a model for assessment of research impact. jmla, ( ), .
schlögl, c., & stock, w. g. ( ). practitioners and academics as authors and readers: the case of lis journals. journal of documentation, ( ), - .
thorin, s. e. ( ). global changes in scholarly communication. in h. s. ching, p. w. t. poon, & c. mcnaught (eds.), elearning and digital publishing (pp. - ). dordrecht: springer.
todd, h. ( ). a partnership to support the research lifecycle: a case study from the university of queensland library. in international conference on change and challenge: redefine the future of academic libraries, peking university, beijing, china, - november .
vaughan, k. t. l., hayes, b. e., lerner, r. c., mcelfresh, k. r., pavlech, l., romito, d., & morris, e. n. ( ). development of the research lifecycle model for library services. jmla, ( ), - .
warraich, n. f., & ahmad, s. ( ). pakistan journal of library and information science: a bibliometric analysis. pakistan journal of library & information science, , - .
weller, m. ( ). the digital scholar: how technology is transforming scholarly practice. london, uk: bloomsbury.
willard, p., kennan, m. a., wilson, c. s., & white, h. d. ( ). publication by australian lis academics and practitioners: a preliminary investigation. australian academic & research libraries, ( ), - .
wolski, m., & richardson, j. ( ). a model for institutional infrastructure to support digital scholarship. publications, ( ), - .

tools for transparency and open access
june , | : – : cest
session

session ground rules
● this session is recorded. we’ll share the recording and slides afterwards.
● questions for the speakers? type them in the attendee chat on the left side of the screen. the chair will address these at the end of the session.
● technical issues. check your settings as well as your internet connection. no luck?
try to rejoin by closing your tab and reusing the link provided. thank you for your attention and enjoy the session!
tools for transparency and open access session
the session will be chaired by simone kortekaas, wageningen university and research – library, the netherlands
● transparency, provenance and collections as data: the national library of scotland’s data foundry sarah ames, national library of scotland, united kingdom
● library toolkit for open access: impacts and implications maurits van der graaf, pleiade management & consultancy, the netherlands
● speed talk – an oer for early career researchers to improve skills on sharing and publishing nicole krüger; dr. tamara pianos, zbw – leibniz information centre for economics, germany
transparency, provenance and collections as data: the national library of scotland’s data foundry dr sarah ames digital scholarship librarian national library of scotland @semames | #nlsdata
this talk digital scholarship at the national library of scotland practical work involved in turning collections into data how we are promoting and embedding transparency in our practice
the digital scholarship service the use of computational methods with national library of scotland collections to enable new forms of research
digital scholarship nls digital scholarship computational methods research activity collections
digital scholarship service encourage, enable & support use of computational research methods with the collections ensure that the collections are used to their full potential establish a library culture which understands digital scholarship practise and promote transparency in our data creation processes anticipate the future of research
service levels level : self-service (data, tools) level : self-service plus consultation time level : funded service – partnership/collaboration
why ‘service’? ‘digital humanities in the library isn’t a service’ (muñoz ) http://trevormunoz.com/notebook/ / / /doing-dh-in-the-library.html
• however:
• pragmatics: ‘service’ vs ‘project’, ‘programme’
• collaboration often builds on some level of underlying service provision
• some users want diy
. making data available text image metadata audiovisual maps web archive corporate
. external engagement collaboration projects phds, residencies, fellowships beyond research community
. internal engagement awareness training champions culture
collections as data
identifying user needs
pro: all the tech skills! will find a way to get the data no matter how presented but – has expectations of existing standards (where they exist) & consistency
intermediate: limited tech skills; understands value of different formats and approaches for research questions: theoretical rather than practical understanding; wants to get hold of the data easily to check what’s there; will employ an ra to do the work
beginner: no tech skills; wants to use online tools to explore datasets; just wants the text
plus broader audience includes other libraries (standards, presentation of data etc)
changing processes: digitisation to data selection rights and conservation assessments digitisation generate derivative images (thumbnails, crops, etc) files into repository – alto xml, txt, jpegs, pdfs, thumbnails, copyright info [retro-create alto] compile mets extract alto xml, txt, jpegs, pdfs, thumbnails and mets compile dataset: structure/naming conventions zip and move to cloud/local storage create doi publish online
making decisions
• standards:
• mets/alto and plain text
• marc/dublin core
• tiered downloads
• image sizes/quality to include
• storage (local/cloud)
• selecting the first five datasets
• what metadata to include and what is available
• how to be transparent: gathering and presenting dataset context
a whole-library effort developers curators digitisation metadata rights access external relations
embedding transparency in our practice
the value of transparency
• open practices: conveying processes, transformations, decision making
• ‘reproducible research’
• acknowledging and reducing bias
• providing counter-narratives
our principles . communicating transparency shining light on invisible labour . value of clarity in online presentation of data
open data plan
• level of open data ( *)
• formats
• where we publish data
• frequency of publication
• open data register
https://data.nls.uk/about/ .
data provenance standards
• https://data.nls.uk/about/standards/
• mets implementation
• technical provenance information…but what about reasoning and decision making process?
library collections digitised material data problematic, historic collection practices selection processes; copyright; conservation; funding resource; ocr; copyright collections in context?
contextualising our data collections
• what do we digitise?
• where?
• why?
• who is involved in the selection process?
• what funds it?
what are the responsibilities of libraries as we increasingly become producers of our own collections?
thank you! sarah.ames@nls.uk @semames | #nlsdata
library toolkit for open access: impacts & implications
[figure: % oa articles per publication year; series: oa total, gold oa, green oa, hybrid oa, bronze oa; data from the european open science monitor ( )]
what does this show us?
• green oa important contribution to oa albeit with a delay
• % gold and hybrid oa articles increasing to about % of the articles in
• gold and hybrid oa articles are for the larger part apc-paid
• % closed access articles:
– % of the articles
– % of the articles (from the viewpoint of )
library toolkit for oa focus on green oa transformative tools in the library toolbox closed access articles in subscription-only journals green oa: institutional repository closed access articles in hybrid journals apc oa articles in oa or hybrid journals apc-funds transformative agreements apc-free journals support by libraries
repositories [opendoar; - - ] growth - now
impact of green oa tools:
• repositories, mostly institutional
impact:
• publishers’ response: % of publishers allow self-archiving [sherpa-romeo]
• % of journal articles available via green oa [open science monitor]
library toolkit for oa focus on apc-gold oa transformative tools in the library toolbox closed access articles in subscription-only journals green oa: institutional repository closed access articles in hybrid journals apc oa articles in oa or hybrid journals apc-funds transformative agreements apc-free journals support by libraries
apc funding sources
• library funds and library-managed funds:
– transformative agreements
– institutional oa-funds and so-called ‘block grants’ (uk)
• researchers’ discretionary budgets:
– researchers’ discretionary (institutional) budgets
– author’s personal funds
– co-author(s) funds
• research funders:
– oa allocations from a research grant
– funds from a research grant not dedicated to oa
institutional funds research funder
apcs and authors % of the authors used provider of funding % of the authors used > provider of funding
• % institution funding
• % research funder
• % other funding sources
• % institution + research funder
• % institution + other sources
• % research funder + other funding
• % institution + research funder + other sources
survey among
authors; monaghan funding sources apc payments • %: institution paid • %: author paid • %: research funder paid • %: co-author arranged payment • %: other/don’t know four models for libraries regarding apc’s a: separate finance streams  research registration system and repositories (green oa)  no transformative agreements (“the library cannot do it alone”)  sometimes oa monitoring b: oa fund to fill the gap  research registration system and repositories (green oa)  oa fund for authors without research grants – often limited to pure gold journals sometimes oa monitoring d: library in the lead: transformation of the library budget  transformative agreements  library oa fund  oa monitoring  institution-wide apc tracking  research registration systems/repositories c: research funder in the lead: compliance is key [uk]  transformative agreements  block grants from research funders; sometimes also an additional library oa fund  oa monitoring  institution-wide apc tracking  research registration systems/repositories closer look at model c/d: the tale of the frontrunners impact and implications ( )  a few frontrunners expect that with agreements with the top- publishers about % of the articles produced by their university will be covered  additional mechanisms are needed in case the library wants to cover the other %: articles published by the long tail of smaller publishers  the same frontrunners report the need for to fte extra staff for handling the related workflows impact and implications ( ) new workflows and services:  controlling publishers’ deals: eligibility checks & monitoring uptake  institution-wide tracking of apcs  management of oa fund and/or block grant  oa monitoring using bibliographic databases  also: tools to inform authors of transformative agreements impact and implications ( ) fundamental changes:  submission and acceptance process is between publishers and authors: libraries need to be informed in time  journal acquisition 
budget becomes an article publishing budget  the stockholm university library has stated this explicitly: ‘the aim is a gradual and cost-efficient reallocation of the acquisition budget to pay for publishing rather than subscriptions’ improving the library toolkit: conclusions and outlook the why: business case for oa support by libraries  % more views for oa articles in comparison to closed access articles  % more citations for oa articles in comparison to closed access articles  thus: . oa publishing gives a university a clear competitive edge by the enhanced visibility of its research and its researchers during the oa transition. . the oa toolkit of the research library will align the library with the objectives of the university piwowar piwowar  model c/d: reality at relatively few research libraries  apc market is so far partially (directly or indirectly) funded by research funders  read & publish (hybrid journals) or publish-licenses (apc-oa journals) will shift the financial burden of these apcs to the library  therefore, sharing the financial burden with research funders often necessary  feasibility depends on national landscape of research funders: the european landscape makes this feasible conclusion ( ): european landscape favours r&d/p deals conclusions ( ): next steps  green and gold both add value to oa transition  liber vision of predominant oa in can be reached (but maybe a few years later)  for this next steps are needed:  more r&p agreements and p agreements  mechanisms for r&p/p for long tail publishers  library support for apc-free oa journals [ssh disciplines]: ‘subscribe to oa’, library publishing,… outlook  financial participation of research funders in transformative agreements often necessary  call for publishers to inform libraries in time of articles submitted/accepted  apc-market moves from author-market to institutional market (less ‘apcs in the wild’)  transformative agreements will also transform the libraries 
regarding:  journal collections  journal acquisition procedures monaghan et al; the zbw is a member of the leibniz association. an oer for early career researchers to improve skills on sharing and publishing tamara pianos & nicole krüger zbw – leibniz-information centre for economics session # – tools for transparency and open access liber online - building trust with research libraries june , page zbw – leibniz-information centre for economics sites > in hamburg and kiel / germany national research infrastructure economics and business studies information. www.zbw.eu http://www.zbw.eu/ research in the fields of: > open science > web science page zbw – leibniz-information centre for economics th anniversary in www.zbw.eu http://www.zbw.eu/ https:// years.zbw.eu/openup/ page search portal econbiz www.econbiz.de learning materials http://www.econbiz.de/ page econbiz guided walk for students https://www.econbiz.de/eb/en/gw https://www.econbiz.de/eb/en/gw/ https://www.econbiz.de/eb/en/gw a few weeks later ... page let‘s make it oer! page barriers technical realization • typo • bootstrap • html, css contents • screenshots • covers of publications no download option / sharability materials not sharable under cc- by license essential to consider oer from the very beginning – before designing the learning materials. lessons learned! page page - open source - interactive content - mobile friendly - download and re-use technical platform? page - open source and mobile friendly - interactive content elements (quiz, drag and drop, dialog cards, interactive video, …) - download und adapt in wordpress, drupal, moodle, blackboard, or typo extension (reuse) - or: embed into websites / learning management environments (embed) https://h p.org https://h p.org/ https://h p.org/ page what about fonts, images of persons, screenshots, logos? cc under cc-by?  cc materials could be wrongly re-used under cc-by license. publish under cc-by-nc? copyright / licenses? 
zbw legal department page - only cc licensed images, videos, and fonts or own materials - no persons displayed in images - no screenshots - disclaimer for logos. - cc-by license: cc-by-nc cannot be shared in portals with ads, … legal advice required ... page econbiz academic career kit for early career researchers page drag and drop / quiz page decision based progress page personal support and a bit of comic relief page drag and drop / quiz page comic relief page rewards page page further reading page last updated: a few weeks later ... page how can users or teachers find our materials? - not in: youtube, cc-search, wikimedia commons, pixabay, slideshare - + oer commons - not findable as oer in google - contents / texts of h p files not indexed by search engines - limited number of search portals for oer that cover materials across universities or countries. finding and sharing interactive oer (h p) page page „allow content created on your site to be shared on a global h p hub with a single click” “allow users to browse and reuse content from the h p hub without having to download and import activities as h p files“ forthcoming: h p search hub https://h p.org/roadmap (june th, ) https://h p.org/roadmap eduarc project "the project targets the development of a design concept for disseminated learning infrastructures that provide federated access to digital educational resources." funded by partners: page zbw – leibniz-information centre for economics https://www.zbw.eu/en/research/science- - /eduarc/ seite other formats for oer pdf and print booklet videos academic career kit as powerpoint / pdf findable / commonly used formats e.g. 
„publish your paper“ as ppt or pdf https://www.econbiz.de/eb/en/research-skills/how-to-guides/ https://www.econbiz.de/eb/en/research-skills/information-literacy-videos/ https://www.econbiz.de/eb/en/research-skills/academic-career-kit/ https://www.econbiz.de/eb/en/research-skills/how-to-guides/ https://www.econbiz.de/eb/fileadmin/user_upload/ppt/econbiz_academic_career_kit_publish.pptx https://www.econbiz.de/eb/fileadmin/user_upload/pdfs/econbiz_academic_career_kit_publish.pdf page zbw – leibniz information centre for economics contacts nicole krüger @elocin_ka dr. tamara pianos t: + - e: t.pianos@zbw.eu @taps https://www.zbw.eu/en/search/econbiz-mobile/tamara-pianos/ https://twitter.com/elocin_ka http://www.zbw.eu/de/recherchieren/econbiz-mobile/tamara-pianos/ mailto:t.pianos@zbw.eu https://twitter.com/taps https://www.zbw.eu/en/search/econbiz-mobile/tamara-pianos/ q&a thank you for participating! recordings will be made available in the near future! session # - tools for transparency and open access liber sa slides oa toolkit for libraries presentation liber maurits van der graaf _liber_online_slides_pianos_krueger_zbw introduction to digital humanities dhum fall ma program in digital humanities cuny graduate center wednesdays : pm– : pm - room faculty dr. matthew k. gold mgold@gc.cuny.edu http://mkgold.net @mkgold dr. kelly baker josephs kjosephs@york.cuny.edu https://kbjosephs.net @kbjosephs course blog: http://cuny.is/dhintro course group: http://cuny.is/group-dhintro course hashtag: #dhintro email the class: dhintro @groups.commons.gc.cuny.edu advisory fellows: micki kaufman (ma in dh); andi Çupallari (ms in data analysis/vis) course overview in this introduction to the digital humanities (dh), we will approach the field via a caribbean studies lens, exploring how an understanding of the digital based in the growing area of digital caribbean studies might shape the larger field of dh. 
the course aims to provide a landscape view of dh, paying attention to how its various approaches embody new ways of knowing and thinking, new epistemologies. what kinds of questions, for instance, does the practice of mapping pose to our research and teaching? how does the concept of mapping change when we begin from the global south? when we attempt to share our work through social media, how is it changed and who do we imagine it reaches? how can we visually and ethically represent various forms of data and how does the data morph in the representation? over the course of this semester, we will explore these questions and others as we engage ongoing debates in the digital humanities, such as the problem of defining the digital humanities, the question of whether dh has (or needs) theoretical grounding, controversies over new models of peer review for digital scholarship, issues related to collaborative labor on digital projects, and the problematic questions surrounding research involving “big data.” the course will also emphasize the ways in which dh has helped transform the nature of academic teaching and pedagogy in the contemporary university with its emphasis on collaborative, student-centered and digital learning environments and approaches. central themes in the course will emerge from our focus on the caribbean – in particular, how various technologies and technical approaches have been shaped by colonial practices; how archives might be decolonized and how absences in the archives might be accounted for; and how concepts like minimal computing might alter the projects we build. though no previous technical skills are required, students will be asked to experiment in introductory ways with dh tools and methods as a way of concretizing some of our readings and discussions.
students will be expected to participate actively in class discussions and online postings (including on our course blog) and to undertake a final project that can be either a conventional seminar paper or a proposal for a digital project. students completing the course will gain broad knowledge about and understanding of the emerging role of the digital humanities across several academic disciplines and will begin to learn some of the fundamental skills used often in digital humanities projects. note: this course is part of an innovative “digital praxis seminar,” a two-semester-long introduction to digital tools and methods that will be open to all students in the graduate center. the goal of the course is to introduce graduate students to various ways in which they can incorporate digital research into their work. learning objectives • students will become acquainted with the current landscape of the field of digital humanities and digital caribbean studies. • students will become conversant with a range of debates in the field of dh through readings and discussions. • students will create a social media presence and begin to prepare their own digital portfolios. • students will create a proposal for a digital project for possible development in the spring. • students will become familiar with the resources available at the graduate center to support work on digital teaching and research projects. requirements and structure: students in the course should complete the following work during the semester: reading and discussion (weekly) students should complete all weekly readings in advance of the class meeting and should take an active part in class discussions. blogging ( posts) • students are responsible for writing five blog posts on our shared course blog. these should be posted by monday night so that peers have the weekend to respond before class. – two short responses to our weekly readings or in-class discussions.
post your thoughts, reactions, questions, responses; – one post about a workshop you have attended, with the goal of helping other students understand what they may have missed and/or what you found valuable about it; – one post about a praxis assignment; – one post about your final project. • students who are not writing blog posts on a given week should comment on and respond to the posts of other students. workshops ( workshops) • in connection with gc digital initiatives (http://cuny.is/cunygcdi), we will be offering skills workshops throughout the semester (https://gcdi.commons.gc.cuny.edu/calendar/). students are responsible for attending a minimum of three workshops over the course of the semester. you are free to go to as many as you’d like pending space limitations. to satisfy this requirement, students can also attend workshops offered by the interactive technology and pedagogy program, the teaching and learning center, the gc library, and the quantitative research center. praxis assignments ( assignments) during the semester, we will ask you to complete two praxis assignments. these exercises are meant to be beginner-level; our interest in having you complete them lies in getting you to experiment with new tools. your results do not necessarily have to be significant or meaningful; the important thing is to engage with the activity and gain a better understanding of the kinds of choices one must make when undertaking such a project. we ask you to think, too, about both the strengths and the limitations of the tools you are trying out. our group on the cuny academic commons includes an integration with the dirt directory (look for the digital tools link in the group), which can help lead you to new tools to try. assignment options: .
mapping assignment (due oct – required of all students) praxis assignment – mapping create a map using one of the tools described in “finding the right tools for mapping.” you can create any map you’d like; we just want you to try to use one of these pieces of mapping software. should you feel so inspired, we invite you to explore one of the following options: • create a map that in some way attempts to work against the constraints of maps (generally) or the particular mapping software you are using. • create a map of something that is not necessarily (or traditionally thought of as) mappable. • create a map related to issues of sovereignty as discussed in the “visualizing sovereignty” article. • create a map of a novel, an author’s works, or some other data using google maps, cartodb, arcgis storymaps, or another mapping platform. please create a blog post describing your experiences. choose either: . visualization assignment (due oct ) description forthcoming or . text analysis assignment (due nov ) description forthcoming final projects: students may choose between a) writing a conventional seminar paper related to some aspect of our course readings; or b) crafting a formal proposal for a digital project that might be executed with a team of students during the spring semester. guidelines for the proposal will be distributed later in the semester. grading regular participation in discussions across the range of our face-to-face and online course spaces is essential. • participation and online assignments ( %) • final project ( %) accounts all students should register for accounts on the following sites: cuny academic commons, twitter, and zotero (the library staff offers several very good intro workshops on zotero that you are encouraged to attend).
remember that when you register for social-networking accounts, you do not have to use your full name or even your real name. one benefit of writing publicly under your real name is that you can begin to establish a public academic identity and to network with others in your field. however, keep in mind that search engines have extended the life of online work; if you are not sure that you want your work for this course to be part of your permanently searchable identity trail on the web, you should strongly consider creating a digital alias. whether you engage social media under your real name or whether you construct a new online identity, please consider the ways in which social media can affect your career in both positive and negative ways. books to purchase: you are required to purchase only one book for this course, though that book is also available in the library on reserve. all readings will be circulated via links on the web or via pdf. • benjamin, ruha. race after technology: abolitionist tools for the new jim code. medford, ma: polity press, . please check out our course schedule for our list of weekly assignments and readings. readings marked (pdf) will be made available via the files section of our course group. course schedule (subject to change) readings marked (pdf)** can be found in our course group** / - introductions course syllabus and site / - approaching the digital humanities, thinking the caribbean readings • matthew k. gold, “the digital humanities moment” - debates in the digital humanities ( ) • matthew k. gold and lauren f. klein, “digital humanities: the expanded field” debates in the digital humanities • matthew k. gold and lauren f. 
klein, “a dh that matters” debates in the digital humanities • kelly baker josephs and roopika risam, “digital black atlantic intro- duction” (draft) (pdf) • david scott, “on the question of caribbean studies” sites to explore • torn apart / separados • caribbean digital • create caribbean projects • the early caribbean digital archive assignment – blog post: • to what extent do these sites/projects reflect issues discussed in our readings? • or, if you were to center an understanding about what dh is around one of these projects/sites, how would dh be defined (or redefined)? / - epistemologies of dh readings • kim gallon, “making a case for the black digital humanities” • kelly baker josephs, “teaching the digital caribbean: the ethics of a public pedagogical experiment” https://commons.gc.cuny.edu/groups/introduction-to-digital-humanities- /documents/ https://dhdebates.gc.cuny.edu/read/untitled- c - - b-a be- fdb bfbd e/section/fcd c- - b- a -dc b baeec #intro https://dhdebates.gc.cuny.edu/read/untitled/section/ b b -bdda- f-b - ae fbbfd f#intro https://dhdebates.gc.cuny.edu/read/untitled/section/ b b -bdda- f-b - ae fbbfd f#intro https://dhdebates.gc.cuny.edu/read/untitled-f acf c-a - d -be - f ac e a /section/ cd - d b- f c- fdf- e c c #intro https://read.dukeupress.edu/small-axe/article/ / % ( )/ / /on-the-question-of-caribbean-studies http://xpmethod.plaintext.in/torn-apart/ http://caribbeandigitalnyc.net http://createcaribbean.org/create/projects/ https://ecda.northeastern.edu/ https://dhdebates.gc.cuny.edu/read/untitled/section/fa e e - c d- -a -d aac eb#ch https://jitp.commons.gc.cuny.edu/teaching-the-digital-caribbean-the-ethics-of-a-public-pedagogical-experiment/ https://jitp.commons.gc.cuny.edu/teaching-the-digital-caribbean-the-ethics-of-a-public-pedagogical-experiment/ • roopika risam, “what passes for human? undermining the universal subject in digital humanities praxis” • daniel paul o’donnell, katherine l. 
walter, alex gil, neil fraistat, “only connect: the globalization of the digital humanities” (pdf) • d. fox harrell, “cultural roots for computing” • ramsay, stephen, and geoffrey rockwell. “developing things: notes toward an epistemology of building in the digital humanities” debates in the digital humanities: , edited by matthew k. gold. university of minnesota press, . / - mapping readings • mark s. monmonier, how to lie with maps (pdf) • olivia ildefonso, “finding the right tools for mapping” • yarimar bonilla and max hantel, “visualizing sovereignty” • mayukh sen, “dividing lines. mapping platforms like google earth have the legacies of colonialism programmed into them” explore the following mapping projects: • slave revolt in jamaica, - • mapping inequality • renewing inequality • the decolonial atlas / - data and visualization readings • jennifer guiliano and carolyn heitman, “difficult heritage and the com- plexities of indigenous data” • tressie mcmillan cottom, “more scale, more questions: observations from sociology” • jessica marie johnson, “a review of ‘two plantations” • johanna drucker, “humanities approaches to graphical display” • lev manovich, “what is visualization?” sites to explore • a tale of two plantations https://dhdebates.gc.cuny.edu/read/d c ed - c - de - de- f fecd /section/ d cdb- a - e b- -bf cf bb #ch https://dhdebates.gc.cuny.edu/read/d c ed - c - de - de- f fecd /section/ d cdb- a - e b- -bf cf bb #ch http://eleven.fibreculturejournal.org/fcj- -cultural-roots-for-computingthe-case-of-african-diasporic-orature-and-computational-narrative-in-the-griot-system/ https://dhdebates.gc.cuny.edu/read/untitled- c - - b-a be- fdb bfbd e/section/c e- - e- f -e b a cac #ch https://dhdebates.gc.cuny.edu/read/untitled- c - - b-a be- fdb bfbd e/section/c e- - e- f -e b a cac #ch https://digitalfellows.commons.gc.cuny.edu/ / / /finding-the-right-tools-for-mapping/ http://smallaxe.net/sxarchipelagos/issue /bonilla-visualizing.html 
https://reallifemag.com/dividing-lines/ https://reallifemag.com/dividing-lines/ http://revolt.axismaps.com/ https://dsl.richmond.edu/panorama/redlining/#loc= / . /- . https://dsl.richmond.edu/panorama/renewal/#view= / / &viz=cartogram https://decolonialatlas.wordpress.com/ https://culturalanalytics.org/ / /difficult-heritage-and-the-complexities-of-indigenous-data/ https://culturalanalytics.org/ / /difficult-heritage-and-the-complexities-of-indigenous-data/ https://dhdebates.gc.cuny.edu/read/untitled/section/ e b - a- f - c - c bf #ch https://dhdebates.gc.cuny.edu/read/untitled/section/ e b - a- f - c - c bf #ch http://smallaxe.net/sxarchipelagos/assets/issue /review-johnson-plantation.pdf http://digitalhumanities.org/dhq/vol/ / / / .html http://manovich.net/content/ -projects/ -what-is-visualization/ _article_ .pdf http://twoplantations.com/ • around dh in days • data is beautiful: of the best data visualization examples from history to today / - history and the archive (guest visit from ada fer- rer) readings • linda m. rodriguez and ada ferrer, “collaborating with aponte: digital humanities, art, and the archive” • marlene l. 
daut, “haiti @ the digital crossroads: archiving black sovereignty” • jessica marie johnson, “markup bodies: black [life] studies and slavery [death] studies at the digital crossroads” (pdf) site to explore • digital aponte assignment: praxis mapping assignment due / - no classes / - no classes / - design / infrastructure readings • kelly baker josephs and teanu reid, “after the collaboration” (pdf) • angela sutton, “the digital overhaul of the archive of ecclesiastical and secular sources for slave societies (essss)” • bethany nowiskie, “capacity through care” • susan leigh star, “the ethnography of infrastructure” (pdf) • miriam posner, “see no evil?” • stephen jackson, “rethinking repair” • alex gil “interview with ernesto oroza” assignment: praxis visualization assignment due / - access / minimal computing readings https://arounddh.org https://www.tableau.com/learn/articles/best-beautiful-data-visualization-examples https://www.tableau.com/learn/articles/best-beautiful-data-visualization-examples http://smallaxe.net/sxarchipelagos/issue /ferrer-rodriguez.html http://smallaxe.net/sxarchipelagos/issue /ferrer-rodriguez.html http://smallaxe.net/sxarchipelagos/issue /daut.html http://smallaxe.net/sxarchipelagos/issue /daut.html http://aponte.hosting.nyu.edu/ http://smallaxe.net/sxarchipelagos/issue /essss.html http://smallaxe.net/sxarchipelagos/issue /essss.html https://dhdebates.gc.cuny.edu/read/untitled-f acf c-a - d -be - f ac e a /section/ a cbc - eee- a-a f - bb dfb c #ch https://logicmag.io/scale/see-no-evil/ https://sjackson.infosci.cornell.edu/rethinkingrepairproofs(reduced)aug .pdf https://dhdebates.gc.cuny.edu/read/untitled/section/f df - e- fe- -f dba c fb#ch • kathleen fitzpatrick, generous thinking (pdf) • cristina venegas, “tourism and the social ramifications of media tech- nologies”* • johanna drucker, “pixel dust: illusions of innovation in scholarly pub- lishing” • alex gil, “design for diversity: the case of ed” sites to explore • dhdebates site 
• sx salon/sx archipelagos • open-access publications on the cuny academic commons • manifold / - text readings • underwood, ted. “a genealogy of distant reading” digital humanities quarterly, vol. , no. , . • klein, lauren f. “distant reading after moretti” lklein, . • ramsay, stephen. “the hermeneutics of screwing around; or what you do with a million books” pastplay: teaching and learning history with technology, edited by kevin kee, university of michigan press, , pp. – . • witmore, michael. “text: a massively addressable object” debates in the digital humanities: , edited by matthew k. gold. university of minnesota press, . • so, richard jean. “all models are wrong.” pmla, vol. , no. , may , pp. - . (pdf) assignment: praxis text mining assignment due / - pedagogy readings • ryan cordell, “how not to teach digital humanities” • lizabeth paravisini-gebert, “review of puerto rico syllabus: essential tools for critical thinking about the puerto rican debt crisis” • roopika risam, “postcolonial digital pedagogy” (pdf) • marta effinger-crichlow, “a pedagogical search for home and care” https://lareviewofbooks.org/article/pixel-dust-illusions-innovation-scholarly-publishing/ https://lareviewofbooks.org/article/pixel-dust-illusions-innovation-scholarly-publishing/ https://des div.library.northeastern.edu/design-for-diversity-the-case-of-ed-alex-gil/ http://dhdebates.gc.cuny.edu http://smallaxe.net/sxarchipelagos/ https://commons.gc.cuny.edu/about/publications/ http://manifoldapp.org http://digitalhumanities.org/dhq/vol/ / / / .html http://lklein.lmc.gatech.edu/ / /distant-reading-after-moretti/ https://quod.lib.umich.edu/d/dh/ . . / : /--pastplay-teaching-and-learning-history-with-technology?g=dculture;rgn=div ;view=fulltext;xc= # . https://quod.lib.umich.edu/d/dh/ . . / : /--pastplay-teaching-and-learning-history-with-technology?g=dculture;rgn=div ;view=fulltext;xc= # . 
https://dhdebates.gc.cuny.edu/read/untitled- c - - b-a be- fdb bfbd e/section/ e e a- b- b - -a b e a#p b
https://dhdebates.gc.cuny.edu/read/untitled/section/ - c - c a-b b - e#ch
http://smallaxe.net/sxarchipelagos/issue /paravisini.html
https://dhdebates.gc.cuny.edu/read/untitled-f acf c-a - d -be - f ac e a /section/ a a - - a- -b #ch
multimedia to explore:
• online dh syllabi (selections)
• #sxcd - session : digital caribbean pedagogies
• digital pedagogy in the humanities
/ - ruha benjamin, race after technology
readings
read race after technology in full
/ - open topic (tba)
readings tba
assignment: final proposals due
/ - student presentations (guest professor visit)
/ - student presentations
/ - final projects due
https://github.com/curateteaching/digitalpedagogy/tree/master/keywords

exploring researchers’ participation in online research identity management systems

shuheng wu
queens college, the city university of new york
- kissena blvd., queens, ny
shuheng.wu@qc.cuny.edu

besiki stvilia
florida state university
collegiate loop, tallahassee, fl
bstvilia@fsu.edu

dong joon lee
texas a&m university
tamu, college station, tx
djlee@library.tamu.edu

abstract
prior studies have identified a need for engaging researchers in providing
and curating their identity data. this poster reports preliminary findings of a qualitative study exploring how researchers use and engage in online research identity management (rim) systems. the findings identify nine activity or task related motivations of using rim systems. this study also identified three levels of participation in rim systems: readers, personal record managers, and community members. most participants of this study fell into the category of personal record managers, who may maintain their own profiles in a rim system. this suggests that a majority of researchers may be willing to maintain their research identity profiles. institutional repository managers may consider recruiting researchers as not only research information and data providers, but also curators of their own research identity data. keywords research identity management systems, motivations, engagement, researchgate, google scholar. introduction there are different research identity management systems, often referred to as research information management (rim) systems, from publishers, libraries, universities, search engines, and content aggregators (e.g., google scholar, orcid, researchgate). these systems employ different approaches to curating research identity information or data: manual curation by information professionals and/or users (including the subject of identity data), automated data mining and curation scripts (aka bots), and some combination of the above. with universities engaging in curating digital scholarship produced by their faculty members, staff, and students through institutional repositories (irs), some of these universities and their irs try to manage the research identity profiles of their contributors locally (e.g., expertnet.org, stanford profiles). while knowledge curation by professionals usually produces the highest quality results, it is costly and may not be scalable (salo, ). 
libraries and irs may not have sufficient resources to control the quality of large scale uncontrolled metadata often batch harvested and ingested from faculty authored websites and journal databases. they may need help from ir contributors and users to control the quality of research identity data. the literature on online communities shows that successful peer curation communities that are able to attract and retain enough participants can provide scalable knowledge curation solutions of a quality that is comparable to the quality of professionally curated content (giles, ). hence, the success of online rim systems may depend on the number of contributors and users they are able to recruit, motivate, and engage in research identity data curation. there is a significant body of research (e.g., cosley et al., ; nov, ; stvilia et al., ) on what makes peer knowledge creation and curation communities successful. however, most of the previous research has focused on encyclopedia, question answering, and citizen science communities. there has been little investigation on the peer curation of research identity data. this study explores how researchers use and participate in rim systems. particularly, it addresses the need to have greater knowledge of how to design scalable and reliable solutions for research identity data curation by examining researchers’ perceived value of research identity data and services, and their motivations to engage in online rim systems. findings can enhance our knowledge of the design of research identity data/metadata services, and mechanisms for recruiting and retaining researchers for providing and maintaining their research identity data. related work there have been considerable deliberations on the needs for and uses of research identity data and how to manage that effectively in library and information science (lis) research and practice communities (e.g., niso altmetrics initiative; research data alliance). 
an oclc task group aiming to register researchers in authority files identified five stakeholder groups of research identity data: researchers, funders, university administrators, librarians, and aggregators (oclc research, ). for the researchers stakeholder group, the task group formulated five needs: disseminate research, compile all publications and other scholarly output, find collaborators, ensure network presence is correct, and retrieve others’ scholarly output to track a given discipline. this set of needs was compiled based on expert opinions of the group members, supplemented with a scenario-based analysis. it would be valuable to test this typology empirically and investigate what disincentives may keep researchers from participating in online research identity data sharing and curation.
asist , october - , , copenhagen, denmark.
different units in universities (e.g., office of research) are increasingly interested in collecting and analyzing research output for reporting, accreditation, and organizational reputation management. those activities and interests overlap with traditional interests of academic libraries, which have to better align their digital services with those broader organizational needs and priorities (dempsey, ; tenopir et al., ). one approach would be to add research identity management services to irs (palmer, ). there is evidence from practice that adding research identity management services to an ir might increase researchers’ interest in the ir (dempsey, ). however, increased interest in an ir might not always translate into increased use of the ir and/or increased engagement in research identity data curation, as multiple rim systems (e.g., researchgate, academia.edu) offer similar services and compete for researchers’ attention and contributions.
relying solely on automated mining, extraction, and aggregation of research identity data might result in poor-quality data. libraries need researcher engagement in identity data curation to provide scalable and high-quality research identity management services. the online community literature shows that volunteer knowledge curators in open peer-production systems like wikipedia are mostly driven by intrinsic motivations such as their interests in specific areas (nov, ; stvilia et al., ). previous studies also examined user motivations to contribute to other online communities. ames and naaman ( ) interviewed ‘heavy’ users of a flickr application and identified four types of motivations for tagging: self-organization, self-communication, social-organization, and social-communication. a study of flickr collections by stvilia and jörgensen ( ) listed eight motivations members might have when organizing photographs into groups. nov et al. ( ) found a positive relationship between the motivation of building reputation in the community and the amount of metainformation (i.e., tags) provided. similarly, in a study examining an online network of legal professionals, wasko and faraj ( ) found a significant positive effect of building reputation on the quality and volume of knowledge contribution. the online communities literature provides valuable insights for designing rim systems and building and maintaining user communities around those systems. however, more empirical research is needed to understand what motivates researchers to engage in rim systems.
research method
guided by activity theory (at; engeström, ), this study employed semi-structured interviews (blee & taylor, ) to answer the following research questions:
1. how do researchers use online rim systems?
2. what are the levels of researcher engagement in online rim systems?
this study defines the research population as employees and students of institutions having an ir and classified as research universities in the carnegie classification of institutions of higher education. the participants of this study must have at least one peer-reviewed research publication and have used at least one rim system by the time of interviews. this study used at and literature analysis to develop an interview questionnaire. the authors conducted semi-structured interviews with eight researchers from four institutions regarding their use of and participation in rim systems. one participant was a full professor, one was an associate professor, one was an assistant professor, two were postdoctoral researchers, and three were doctoral students. all interviews, ranging from to minutes, were audio recorded, transcribed, and coded with nvivo . two of the authors independently coded all the interviews using an initial coding scheme based on activity theory and literature analysis. after comparing, discussing, and resolving any differences in their coding, the two authors formed a new coding scheme with emergent codes and subcategories and recoded all interviews. findings activities and motivations from analysis of the interviews we identified the activities in which researchers engaged using rim systems (see table ) and the motives of those activities. find relevant literature one of the most frequent activities in which rim systems are used is literature search. outcomes of this activity can be used as input to other scholarly activities such as literature analysis, manuscript writing, or planning a research project. the literature search activity may include four actions: search, determine, select, and obtain. researchers may use different rim systems for different types of searches (e.g., known item, subject, navigational searches) based on the strengths or capabilities of rim systems. 
one participant explained how he used researchgate and google scholar for different purposes: i think they have different functions. like for researchgate i can follow some people, so i can have their most recent papers. but sometimes i also use google scholar when i have a specific paper that i want to look for. so if i know the title of the paper, or i know the author, and i want to see their publications, i will use google scholar. it’s convenient. researchers may also use rim systems to define and manage their own bibliographies by following or ‘bookmarking’ the core papers of a specific research area. one participant specified how he used researchgate to manage and expand his bibliographies: some of the big papers were sort of like in everyone’s research. these are the cornerstone articles that you base a lot of your research on ... i follow some of those articles [in researchgate]. to complete a literature search activity, researchers need to obtain the desired publications. researchers may be motivated to use rim systems providing open access to the self-archived versions of publications. rim systems with social networking features can attract researchers to contact authors and request a copy of publication they cannot access otherwise. one participant indicated what motivated him to use researchgate was that it provides open access to his works and allows requesting papers from others: it's good to have your stuff easily accessible because not everyone has access to databases, but if you're a researcher, it's easy to set up an account on one of these sites and connect with the authors to hopefully get the articles that you want to get. document manuscripts besides literature search, researchers may use rim systems in a manuscript writing activity to manage citations. they may use google scholar to verify bibliographic metadata of the resources cited in their papers, and/or obtain citations in a specific style. 
one participant revealed his use of google scholar when working on the reference list of his paper:
there are times that i need to verify the source just to make sure the title, authors, and year, and just to make sure the information i put in are correct. google scholar is doing a good job in accurately reflecting publications, so i use it as a [citation management] resource.
identify researchers
rim systems and their citation extraction and analysis functions can be used for identifying potential reviewers, collaborators, letter writers, students, and advisors, who may have similar or specific research interests. one participant explained how she used researchgate’s citation information to know other researchers and identify potential collaborators:
one of the advantages to using these [rim] systems is the ability to discover researchers that you may not have known like this ... i'm going to follow this guy from boston now because apparently he likes my work and i want to be helpful to him, and i want to see what he's doing with the stuff of mine that he's citing, because maybe we could be good collaborators.
a potential future collaboration can be one of the motivations to follow other researchers in rim systems. one junior researcher stated that she hoped to convert some of the connections she was cultivating with other researchers in researchgate into future collaborations.

activities | actions | rim functionalities
find relevant literature | search | search engine; author profile; follow papers; citing & cited papers
 | determine | author and publication profiles
 | select | citation count; author impact scores; publication venue impact scores; manuscript status
 | obtain | download a paper; request a paper from author(s)
document manuscripts | document sources | citation generator
 | verify sources | author profile
identify researchers | identify students, advisors, reviewers, collaborators, etc. | citing & cited papers; author profile; follow people; researchgate reads statistics
disseminate research | make papers accessible | upload a paper; paper self-archiving status determination
 | promote papers | recommend papers; recommend people
interact with peers | ask and answer questions on forums | q&a service
 | send and receive private messages | messaging service
monitor the literature | follow known researchers | receive updates on known researchers
 | follow papers | receive updates on papers
 | discover new papers | recommend papers; citing & cited papers
 | discover new researchers | recommend people; citing & cited papers
evaluate | evaluate papers | citation count; number of reads; manuscript status
 | evaluate people | author profile; h-index; export a cv; researchgate scores
curate | archive papers | upload papers
 | add and modify metadata for papers | add/update index terms; claim papers; disavow papers; add/update citation information
 | add and modify metadata for people | create/update profile; merge profiles; add/edit index terms; endorse people for expertise; add/remove suggested co-authors
 | review papers | open review
look for jobs | search | recommended job postings
table . activities and rim system functionalities.

disseminate research
one of the main motivations of using rim systems is to disseminate research results. researchers may use rim systems to share publications, data, and other research products. nearly all participants mentioned they used rim systems to promote their research. the dissemination activity may consist of making research results available and actively promoting those results. to make research results available researchers may upload copies of their papers, presentations, or data to a rim system. a service participants found particularly helpful in that action was the one that helped them determine whether a publication could be self-archived in rim systems based on the publisher’s policies.
after research results are uploaded to a rim system, the system then can use push services to promote results to the community. researchers may choose a specific rim system providing more effective mechanisms (e.g., social network) to promote research to the community that they want to reach. one participant emphasized the social network provided by researchgate for promoting his research to the peers: i used researchgate besides the google scholar because researchgate has slightly different methods of constructing the social network and the way they promote research is different – it’s more active than google scholar. in that sense, it serves my purpose of trying to promote my research in the peers. interact with peers scholarly work may involve interaction. researchers may interact about any aspect of research such as what design to employ for a particular research problem, what tools to use and how, or how to replicate research results. researchers may also interact to exchange information about employment opportunities, and to recruit students, collaborators, external reviewers, or letter writers for grant proposals or promotions. some rim systems provide researchers with q&a forums and a direct messaging service to communicate. in some cases, those communications channels become the only means on the web of reaching a particular researcher. one participant revealed how researchgate helped him communicate with a researcher he could not reach otherwise when he was looking for a recommendation letter from the industry: researchgate really gives you a way to connect to the researchers if you somehow cannot find their email address or other contact information from other channels … i was looking for some recommendation letters for personal use. i wanted one from industry. this company cited my paper … but for that specific case, the first author’s email was not on the paper. and the last author, the corresponding author, actually left the company. 
so i had nowhere to find them. then i checked researchgate. he was on researchgate. so i tried my last resort. i just sent him a message. and surprisingly, he replied. monitor the literature to stay current with the literature, researchers need to monitor the literature for new works and/or contributors. one participant indicated his motivation of using rim systems was to monitor his network of researchers: looking at what people whose work i'm interested in have cited, is useful for me and for following up on and finding out more about information that's useful for me in my own research. rim systems can be helpful to a researcher in monitoring literature by sending alerts about new works from the researcher’s network, and recommending new works and authors based on topical or co-citation matches. rim systems having social networking capabilities enable researchers to connect to and learn about junior researchers’ works, which may not be as visible as those of more established researchers. one participant explained how researchgate might enable her to know about junior researchers who otherwise could not be reached: researchers who are not that famous now, like junior faculty members or doctoral students, who are not big names, i probably cannot find another opportunity to know them all. if they also have a researchgate account and have some publications there, i hope this site can give me some automatic suggestions. evaluate evaluation can be a standalone activity (e.g., benchmarking oneself against other researchers) or part of a research process (e.g., evaluating papers for inclusion in a literature review). the targets of evaluation can be different entities such as a manuscript, a publication venue, an individual researcher, a lab, or an institution. researchers may play the role of evaluators or be the objects of evaluation by others. 
if the latter, a researcher can still be an active contributor to a distributed evaluation process by creating and maintaining a profile in a rim system to support his/her evaluation by others. the context of those evaluation activities may vary. one participant revealed that he created a google scholar profile to support his application for an award. another participant mentioned he used his google scholar profile and impact factors as evidence of his research impact when applying for u.s. permanent residency. a researcher’s career status may affect the types of evaluation activities she or he may engage in or is asked to perform. senior researchers may evaluate other researchers for promotion and tenure. doctoral students, on the other hand, may benchmark themselves against other doctoral students who are at a similar stage in their doctoral programs to assess their competitiveness for the job market. for example, one participant who was a doctoral student illustrated how she used researchgate to follow other doctoral students to help prepare herself for the job market:
i followed some students who are at the same stage as myself … in other schools to see their publication rate, how many publications they will get in one year … and then i can estimate how much work should be expected for a doctoral student at my stage, so later, when i’m actually in the job market, i will not be too far away.
curate
curation of research resources can be defined as a process of managing those resources for discovery and future use (lord & macdonald, ). a main component of curation activities is quality assurance, which is the process of assuring that the research products, including information resources, meet the needs and requirements of the activities in which they are used (stvilia et al., ). researchers may use rim systems to self-archive papers and data and to make them accessible.
researchers may create and manage metadata for those resources to make them findable and reusable, and also use the metadata to construct a cv for different purposes. rim systems with social networking capabilities allow researchers to request reviews of the content of their works from their peers. curation of research information enables all the other activities in which that information is used or reused. indeed, assuring the quality of their research identity metadata can be a motivation for researchers to establish a profile in a rim system. for example, one participant created a profile in google scholar to correct an error after she found google scholar had identified another researcher having the same name as the author of her article. furthermore, the quality of information determines the outcome of an activity using that information. concerns on the quality of an activity’s outcome using research information and its possible effects on a researcher can be a strong motivator for the researcher to engage in curating his/her research identity profile. one participant noted: if you don't maintain it [research identity profile], then it gives people an inaccurate view of your productivity, so you run the risk of potentially sending a signal about your productivity that's not accurate. all participants indicated that they maintained their own personal profiles at least in one rim system. their maintenance included adding bibliographic metadata and subject index terms to their publications, uploading full-text articles, and endorsing colleagues for their skills. look for jobs rim systems may serve as a social network for researchers to look for job information or find job candidates. one participant mentioned she used researchgate’s jobs service to look for relevant job postings. 
another participant described how he used researchgate’s messaging services to help a researcher in another country to find a job:
for the messages i received, the only one that’s not requesting a paper is the one from an italian researcher. she told me she’s going to graduate, and she’s applying for a postdoctoral position. she’s personally asked me if i knew any positions in the united states. so i replied her message, gave her some suggestions.
engagement
of the eight participants, four had public profiles in google scholar, seven used researchgate, and three had profiles in academia.edu. only one participant mentioned that he had an orcid account. when asked why they participated in a particular rim system, some participants recalled incidents that led them to create a profile in that system. some of them did not purposefully create profiles in a rim system to meet their research identity management needs, but the profiles were automatically generated and pushed on them. others mentioned they acted on a recommendation from friends, colleagues, or advisors when creating profiles. researchers can also be introduced to a rim system by another information system such as a search engine. they then perceived the value of membership after observing specific benefits provided by the system. for example, one participant revealed:
i first came to researchgate, because a paper i was looking for at that time only had full-text version on researchgate … then i noticed that's a benefit. i should create an account there.
levels of engagement
the data analysis identified three levels or categories of researcher participation in a rim system. researchers belonging to the first category have claimed or activated an account in a rim system but do not maintain it or interact with other members of the system. this category was called readers as they use rim systems mostly to access the literature.
researchers in the second category may maintain their profiles in a rim system, but do not contribute to the system beyond that and do not interact with other members of that system directly or indirectly. that is, they don’t ask or answer questions in q&a forums, endorse other members for their skills, send emails, or respond to other members’ emails or requests. this category was labeled as personal record managers. a majority of the participants (four out of eight) were grouped under this category. researchers in the third category not only maintain their own profiles, but also are willing to curate research information of other members by endorsing them for skills, and sharing information via messages, emails, or q&a services. this category was labeled as community members, who may be motivated by the feeling of reciprocity and being ‘a good member’ of the community.
discussion and conclusions
as mentioned above, an oclc task group formulated researchers’ five needs for research identity data (oclc research, ), which can be mapped to five of the motivations of using rim systems identified in the current study: find relevant literature, disseminate research, curate, identify researchers, and monitor the literature. however, the empirical data collected from the current study identified four more motivations of using rim systems or research identity data: document manuscripts, interact with peers, evaluate, and look for jobs. indeed, most irs do not support those four activities (lee, ). several participants of the current study mentioned they used researchgate as it provided a social network allowing them to follow and communicate with other researchers and look for jobs. irs may consider incorporating these functionalities to allow their users to communicate with each other, and generate profiles to support different evaluation activities (e.g., self-evaluation, annual review).
preece and shneiderman ( ) presented a framework to describe user engagement in online social communities consisting of four levels: reader, contributor, collaborator, and leader. the current study identified three levels of participation in rim systems: readers, personal record managers, and community members. these three categories can be mapped to the first three levels of engagement of preece and shneiderman’s framework. most participants of the current study fell into the category of personal record managers, who may maintain their profiles in a rim system, but do not contribute to the system beyond that nor interact with other members of that system. a study of data curation practices in irs found that ir staff’s curation activities focused on ensuring the quality of publication metadata for the long-term preservation of publications to increase their reusability (lee, ). findings of the current study suggest that a majority of researchers may be willing to maintain their research identity profiles. ir managers may consider recruiting researchers as not only research information/data providers, but also curators of their own research identity data. this study provides rich qualitative data regarding how researchers use and participate in online rim systems. still, this poster is limited as it reports preliminary findings based on interviews with eight participants from four institutions. more interviews will be conducted with researchers from other institutions and disciplines to gain different perspectives. based on findings of interviews, we will develop and implement a survey to reach more researchers, and develop a quantitative model of researcher participation in rim systems. acknowledgments the authors would like to express their appreciation to the researchers who participated in the study. 
this research is supported by an oclc/alise library and information research grant for and the national leadership grants from the institute of museum and library services (imls). the article reflects the findings and conclusions of the authors, and does not necessarily reflect the views of oclc, alise, and imls.
references
ames, m., & naaman, m. ( ). why we tag: motivations for annotation in mobile and online media. in b. begole & s. payne (eds.), proceedings of the sigchi (pp. - ). new york, ny: acm.
bauin, s., & rothman, h. ( ). "impact" of journals as proxies for citation counts. in representations of science and technology (pp. - ). leiden: dswo press.
blee, k. m., & taylor, v. ( ). semi-structured interviewing in social movement research. in b. klandermans & s. staggenborg (eds.), methods of social movement research (pp. - ). minneapolis, mn: university of minnesota press.
cosley, d., frankowski, d., terveen, l., & riedl, j. ( ). using intelligent task routing and contribution review to help communities build artifacts of lasting value. in proceedings of the sigchi conference on human factors in computing systems (pp. - ). new york, ny: acm.
dempsey, l. ( ). research information management systems - a new service category? retrieved from http://orweblog.oclc.org/archives/ .html
engeström, y. ( ). learning by expanding: an activity-theoretical approach to developmental research. helsinki: orienta-konsultit oy.
giles, j. ( ). internet encyclopedias go head to head. nature, ( ), - .
lee, d. j. ( ). research data curation practices in institutional repositories and data identifiers (doctoral dissertation). retrieved from http://purl.flvc.org/fsu/fd/fsu_migr_etd-
lord, p., & macdonald, a. ( ). e-science curation report: data curation for e-science in the uk: an audit to establish requirements for future curation and provision. bristol, uk: jisc.
nov, o. ( ). what motivates wikipedians. communications of the acm, ( ), - .
nov, o., naaman, m., & ye, c. ( ). analysis of participation in an online photo-sharing community: a multidimensional perspective. journal of the american society for information science & technology, ( ), - .
oclc research. ( ). registering researchers in authority files. retrieved from http://www.oclc.org/research/themes/research-collections/registering-researchers.html
palmer, d. ( ). the hku scholars hub: reputation, identity & impact management. how librarians are raising researchers’ reputations. retrieved from http://hub.hku.hk/bitstream/ / / /reputation.pdf
preece, j., & shneiderman, b. ( ). the reader-to-leader framework: motivating technology-mediated social participation. ais transactions on human-computer interaction, ( ), - .
salo, d. ( ). name authority control in institutional repositories. cataloging & classification quarterly, ( - ), - .
stvilia, b., twidale, m., smith, l. c., & gasser, l. ( ). information quality work organization in wikipedia. journal of the american society for information science & technology, ( ), - .
stvilia, b., gasser, l., twidale, m., & smith, l. c. ( ). a framework for information quality assessment. journal of the american society for information science & technology, , - .
stvilia, b., & jörgensen, c. ( ). user-generated collection-level metadata in an online photo-sharing system. library & information science research, ( ), - .
tenopir, c., birch, b., & allard, s. ( ). academic libraries and research data services: current practices and plans for the future. retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi= . . . . &rep=rep &type=pdf
wasko, m. m., & faraj, s. ( ). why should i share? examining social capital and knowledge contribution in electronic networks of practice. mis quarterly, ( ), - .
llc_preprint     significance testing of word frequencies in corpora author: jefrey lijffijt affiliation: aalto university current affiliation: university of bristol mail: university of bristol, department of engineering mathematics, mvb woodland road, bristol, bs ub, united kingdom. e-mail: jefrey.lijffijt@bristol.ac.uk author: terttu nevalainen affiliation: university of helsinki author: tanja säily affiliation: university of helsinki author: panagiotis papapetrou primary affiliation for this manuscript: aalto university current affiliation: stockholm university author: kai puolamäki primary affiliation for this manuscript: aalto university current affiliation: finnish institute of occupational health author: heikki mannila affiliation: aalto university     abstract finding out whether a word occurs significantly more often in one text or corpus than in another is an important question in analysing corpora. as noted by kilgarriff ( ), the use of the χ and log-likelihood ratio tests is problematic in this context, as they are based on the assumption that all samples are statistically independent of each other. however, words within a text are not independent. as pointed out in kilgarriff ( ) and paquot & bestgen ( ), it is possible to represent the data differently and employ other tests, such that we assume independence at the level of texts rather than individual words. this allows us to account for the distribution of words within a corpus. in this article we compare the significance estimates of various statistical tests in a controlled resampling experiment and in a practical setting, studying differences between texts produced by male and female fiction writers in the british national corpus. we find that the choice of the test, and hence data representation, matters. 
we conclude that significance testing can be used to find consequential differences between corpora, but that assuming independence between all words may lead to overestimating the significance of the observed differences, especially for poorly dispersed words. we recommend the use of the t-test, wilcoxon rank-sum test, or bootstrap test for comparing word frequencies across corpora.     . introduction comparison of word frequencies is among the core methods in corpus linguistics and is frequently employed as a tool for different tasks, including generating hypotheses and identifying a basis for further analysis. in this study, we focus on the assessment of the statistical significance of differences in word frequencies between corpora. our goal is to answer questions such as ‘is word x more frequent in male conversation than in female conversation?’ or ‘has word x become more frequent over time?’. statistical significance testing is based on computing a p-value, which indicates the probability of observing a test statistic that is equal to or greater than the test statistic of the observed data, based on the assumption that the data follow the null hypothesis. if a p-value is small (i.e. below a given threshold α), then we reject the null hypothesis. in the case of comparing the frequencies of a given word in two corpora the test statistic is the difference between these frequencies and, put simply, the null hypothesis is that the frequencies are equal. however, to employ a test, the data have to be represented in a certain format, and by choosing a representation we make additional assumptions. for example, to employ the χ test, we represent the data in a x table, as illustrated in table . we refer to this representation as the bag-of-words model. this representation does not include any information on the distribution of the word x in the corpora. 
when using this representation and the χ² test, we implicitly assume that all words in a corpus are statistically independent samples. the reliance on this assumption when computing the statistical significance of differences in word frequencies has been challenged previously; see, for example, evert ( ) and kilgarriff ( ).

table 1 the 2 × 2 table that is used when employing the χ² test

             corpus S    corpus T
word x       a           b
not word x   c           d

hypothesis testing as a research framework in corpus linguistics has been debated but remains, in our view, a valuable tool for linguists. a general account on how to employ hypothesis testing or keyword analysis for comparing corpora can be found in rayson ( ). we observe that the discussion regarding the usefulness of hypothesis testing in the field of linguistics has often been conflated with discussions pertaining to the assumptions made when employing a certain representation and statistical test. kilgarriff ( ) asserts that the ‘null hypothesis will never be true’ for word frequencies. as a response, gries ( ) argues that the problems posed by kilgarriff can be alleviated by looking at (measures of) effect sizes and confidence intervals, and by using methods from exploratory data analysis. our main point is different from that of gries ( ). while we endorse kilgarriff’s conclusion that the assumption that all words are statistically independent is inappropriate, the lack of validity of one assumption does not imply that there are no comparable representations and tests based on credible assumptions. as pointed out in kilgarriff ( ) and paquot & bestgen ( ), it is possible to represent the data differently and employ other tests, such as the t-test, or the wilcoxon rank-sum test, such that we assume independence at the level of texts rather than individual words.
an alternative approach to the 2 × 2 table presented above is to count the number of occurrences of a word per text, and then compare a list of (normalized) counts from one corpus against a list of counts from another corpus. an illustration of this representation is given in table 2. this approach has the advantage that we can account for the distribution of the word within the corpus.

table 2 the frequency lists that are used when employing the t-test. the lists do not have to be of equal length, as the corpora may contain an unequal number of texts.

corpus S                          text S_1   text S_2   …   text S_n
normalized frequency of word x    s_1        s_2        …   s_|S|

corpus T                          text T_1   text T_2   …   text T_m
normalized frequency of word x    t_1        t_2        …   t_|T|

we emphasize that the utility of hypothesis testing critically depends on the credibility of the assumptions that underlie the statistics. we share kilgarriff’s ( ) concern that application of the χ² test leads to finding spurious results, and we agree with kilgarriff ( ) and paquot and bestgen ( ) that there are more appropriate alternatives, which, however, have not been implemented in current corpus linguistic tools. we re-examine the alternatives and provide new insights by analysing the differences between six statistical tests in a controlled resampling setting, as well as in a practical setting. the question of which method is most appropriate for assessing the significance of word frequencies or other statistics is not new. dunning ( ) and rayson and garside ( ) suggest that a log-likelihood ratio test is preferable to a χ² test because the latter test is inaccurate when the expected values are small (< ). rayson et al. ( ) propose using the χ² test with a modified version of cochran’s rule. kilgarriff ( ) concludes that the wilcoxon rank-sum test is more appropriate than the χ² test for identifying differences between two corpora, but his study is limited to a qualitative analysis of the top words identified by the two methods.
kilgarriff ( ) criticizes the hypothesis testing approach because the χ test finds numerous significant results, even in random data. hinneburg et al. ( ) study methods based on bootstrapping and bayesian statistics for comparing small samples. paquot and bestgen ( ) present a study of the similarities and differences between the t-test, the log-likelihood ratio test, and the wilcoxon rank-sum test; however, their study is also limited to qualitative analysis of the differences. they recommend using multiple tests, or the t-test, if only one method is to be applied. lijffijt et al. ( ) illustrate that the bootstrap and inter-arrival time tests provide more conservative p-values than those that are provided by bag-of-words- based models (i.e. tests based on the assumption that all words are statistically independent), which includes the χ and log-likelihood ratio tests. lijffijt et al. ( ) conduct a detailed study of lexical stability over time in the corpus of early english correspondence, using both the log-likelihood ratio and bootstrap tests, and conclude that the log-likelihood ratio test marks spurious differences as significant. relevant, but not discussed further here, is the need for balanced corpora when comparing word frequencies (oakes and farrow, ). we find that some statistical tests that are commonly used in corpus linguistics, such as the χ and log-likelihood ratio tests (dunning, ; rayson and garside, ), are anti-conservative, that is, their p-values are excessively low, when we assume that a corpus is a collection of statistically independent texts. we perform experiments based on a subcorpus of the british national corpus (bnc, ) that     contains all texts from the prose fiction genre. we quantify the potential bias of the tests based on the uniformity of p-values when we randomly divide the set of texts into two groups. this method is further explained in section . 
moreover, we show that the errors in the estimates differ according to each word and the dispersion of the words in the corpus. to define the dispersion of a word, we consider a measure of dispersion, dpnorm, which was introduced in gries ( ) and refined in lijffijt and gries ( ). because the bias that we observe does not solely depend on word frequency, we cannot simply use higher cut-off values in the χ or log-likelihood ratio tests to correct the bias. notably, the rank of words, in terms of their significance, changes. finally, we perform a keyword analysis of the differences between male and female authors, as annotated by lee ( ), using two methods. we find that the differences between the methods are substantial and thus necessitate the use of a representation and statistical test such that the distribution of the frequency over texts is properly taken into account (the t-test, wilcoxon rank-sum test, or the bootstrap test). . why the bag-of-words model is inappropriate the χ and log-likelihood ratio tests are based on the bag-of-words model (illustrated in table ), in which all words in a corpus are assumed to be statistically independent. from the perspective of any word, the corpus is modelled as a bernoulli process, i.e. a sequence of biased coin flips, which results in word frequencies that follow a binomial distribution (dunning, ). the bag-of-words model implicitly assumes both a mean frequency and a certain variance of the frequency over texts and thus an expected dispersion. figure shows the observed frequency distribution of the word i in the british national corpus and the expected frequency distribution in the bag-of-words     model. the observed distribution and the distribution that is predicted by the bag-of- words model clearly differ. fig. the frequency distribution of i in the british national corpus. 
the grey bars show a histogram of the observed distribution, and the black dotted line shows the expected distribution in the bag-of-words model, on which the χ and log-likelihood ratio tests are based. compared with the prediction, the observed distribution has much greater variance and thus demonstrates that the bag-of-words model is not an appropriate choice when comparing corpora, even for highly frequent words. another example is presented in table , which depicts p-values for the hypothesis that the name matilda is used at an equal frequency by male and female authors in the prose fiction subcorpus of the british national corpus. this subcorpus is presented in section . the frequency for male authors is . per million words (absolute frequency ), and the frequency for female authors is . per million words (absolute frequency ). with more than occurrences in the fiction subcorpus, we may easily trust the results of the χ and log-likelihood ratio tests, which show that male authors use this name more often than female authors. however, the other tests (the t-test, wilcoxon rank-sum test, inter-arrival time test, and bootstrap test) indicate that the observed frequency difference is not unlikely to occur at random. the     reason that the methods disagree is that the word is used in only of total texts ( text written by a male author and texts written by female authors), with an uneven frequency distribution: one text contains instances, followed by, in the other texts, instances, instances, instances, and instance, respectively. this uneven distribution should lead to an uncertain estimate of the mean frequency. in other words, the variance of the frequency of matilda is very high. the χ and log-likelihood ratio tests do not account for the uneven distribution, as these tests use only the total number of words in a corpus, and as a result they underestimate the uncertainty. 
table 3 p-values for the hypothesis that male and female authors use the name matilda at an equal frequency, based on the prose fiction subcorpus of the british national corpus

test                        p-value
χ² test                     < .
log-likelihood ratio test   < .
welch’s t-test              .
wilcoxon rank-sum test      .
inter-arrival time test     .
bootstrap test              .

the remainder of this paper is structured as follows. in section , we present the significance testing methods, the uniformity test, and the dispersion measure. in section , we describe the data that are used. in section , we compare the methods in a series of experiments based on random divisions of the corpus, and in section we describe the differences between male and female authors that were identified using various methods. section briefly concludes the paper.

methods

in this section, we briefly discuss the mathematical models and assumptions that underlie each of the six methods discussed in the introduction. a summary of the essential differences is given in section . . the statistical test employed in the controlled random sampling experiment (section ) is presented in section . , and the measure of dispersion that we use is presented in section . . readers less interested in the specifics of the statistical tests may proceed directly to section . and then to section .

notation

we use q to denote the word that we intend to compare in two corpora, and S and T to denote the two corpora. corpus S contains |S| texts and size(S) words. we use subscripts to indicate individual texts: S_1, S_2, …, S_|S|. we express the relative frequency of word q in corpus S as freq(q,S). each of the following six methods computes a p-value for the hypothesis of a word having an equal frequency in the two corpora, freq(q,S) = freq(q,T), against the alternative hypothesis that the frequencies are not equal: freq(q,S) > freq(q,T) or freq(q,S) < freq(q,T). thus, conforming to the tradition in corpus linguistics, all methods provide two-tailed p-values.
method 1: pearson’s χ² test

pearson’s χ² test, which is also known as the χ² test for independence or simply as the χ² test, is based on the assumption that a text or corpus can be modelled as a sequence of independent bernoulli trials. each bernoulli trial is a random event with a binary outcome; thus, the entire sequence is similar to a sequence of biased coin flips. under the assumption of independent bernoulli trials, the probability distribution for the word frequency is given by the probability mass function of the binomial distribution. let n be the size of the corpus and p the relative frequency of a word. the probability of observing this word exactly k times is given by

$$\Pr(K = k) = \binom{n}{k}\, p^{k} (1 - p)^{n - k}. \qquad (1)$$

this distribution is approximately normal with mean $np$ and variance $np(1 - p)$ when $np(1 - p)$ is sufficiently large (dunning, ). the fact that this distribution is well approximated by a normal distribution is used in the χ² test. the test is conducted as follows. let $O_1 = \mathrm{freq}(q,S) \cdot \mathrm{size}(S)$ and $O_2 = \mathrm{freq}(q,T) \cdot \mathrm{size}(T)$, which are the observed frequencies of q in S and T, respectively. let p be the relative frequency over the combined corpora, i.e. $p = (O_1 + O_2)/(\mathrm{size}(S) + \mathrm{size}(T))$. we define the expected frequency in S and T as $E_1 = p \cdot \mathrm{size}(S)$ and $E_2 = p \cdot \mathrm{size}(T)$, respectively. the test statistic $X^2$ using yates’ correction is given by

$$X^{2} = \frac{(|O_1 - E_1| - 0.5)^{2}}{E_1} + \frac{(|O_2 - E_2| - 0.5)^{2}}{E_2}. \qquad (2)$$

the test statistic asymptotically follows a χ² distribution with one degree of freedom. the p-value can be obtained by comparing the test statistic to a table of the χ² distribution. the χ² test is available in most statistical software programs and implemented in tools such as wordsmith tools (scott, ) and bncweb (hoffmann et al., ).

method 2: log-likelihood ratio test

the χ² test is based on two approximations: the normal distribution approximates the binomial distribution, and the test statistic asymptotically follows a χ² distribution.
because of this double approximation, the χ² test is inapplicable when the word frequency is small (< ). for this reason, dunning ( ) introduces a test which is not based on the normality approximation but on the likelihood ratio. this test is called the log-likelihood ratio test and is also known as the G test. the likelihood function $H(p; n, k)$ is the same as $\Pr(K = k)$ in equation (1); the only difference is that we explicitly mention the parameter p. the likelihood ratio is the probability when we have only one parameter, $p$ (for both corpora), divided by the probability when we have two parameters, $p_1$ and $p_2$ (one for each corpus). the precise mathematical formulation is given by $p_1 = \mathrm{freq}(q,S)$, $n_1 = \mathrm{size}(S)$, $k_1 = \mathrm{freq}(q,S) \cdot \mathrm{size}(S)$, $p_2 = \mathrm{freq}(q,T)$, $n_2 = \mathrm{size}(T)$, $k_2 = \mathrm{freq}(q,T) \cdot \mathrm{size}(T)$, and $p = (k_1 + k_2)/(n_1 + n_2)$. the likelihood ratio is defined as

$$\lambda = \frac{H(p; n_1, k_1) \cdot H(p; n_2, k_2)}{H(p_1; n_1, k_1) \cdot H(p_2; n_2, k_2)}. \qquad (3)$$

we set the parameters $p_1$, $p_2$, and $p$ to the values that maximize the likelihood function. the full derivation can be found in dunning ( ). the log-likelihood ratio test is based on the fact that the quantity $-2 \log \lambda$ asymptotically follows a χ² distribution with degrees of freedom that are equal to the difference in the number of parameters between the two models (i.e. one in this instance). the quantity $-2 \log \lambda$ is used as the test statistic. dunning ( ) claims that this test statistic approaches its asymptotic distribution much faster than the test statistic in the χ² test and is thus preferable, especially when the expected frequency is low. again, the final p-value is computed by comparing the test statistic to a table of the χ² distribution. the log-likelihood ratio test is available in many statistical software programs and implemented in tools such as wmatrix (rayson, ), wordsmith tools (scott, ), and bncweb (hoffmann et al., ).
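The two bag-of-words statistics described above can be illustrated with a short Python sketch. It is not taken from the paper: the function names are hypothetical, the Yates-corrected statistic uses the two-term form given in the text, and the p-value uses the identity that the upper tail of a χ² distribution with one degree of freedom equals erfc(sqrt(x/2)). The sketch assumes the word occurs at least once in the combined corpora.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def yates_chi2(o1, size_s, o2, size_t):
    """Yates-corrected statistic in the two-term form given in the text.

    o1, o2 are the observed counts of the word in S and T (assumed o1 + o2 > 0).
    """
    p = (o1 + o2) / (size_s + size_t)
    e1, e2 = p * size_s, p * size_t
    x2 = (abs(o1 - e1) - 0.5) ** 2 / e1 + (abs(o2 - e2) - 0.5) ** 2 / e2
    return x2, chi2_sf_1df(x2)


def _log_lik(p, n, k):
    """log of p^k (1-p)^(n-k); the binomial coefficient cancels in the ratio."""
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total


def log_likelihood_ratio(k1, n1, k2, n2):
    """Return (-2 log lambda, p-value) for counts k1 of n1 and k2 of n2."""
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)
    log_lambda = (_log_lik(p, n1, k1) + _log_lik(p, n2, k2)
                  - _log_lik(p1, n1, k1) - _log_lik(p2, n2, k2))
    g2 = max(-2.0 * log_lambda, 0.0)  # guard against tiny negative rounding
    return g2, chi2_sf_1df(g2)
```

Both functions take only total counts and corpus sizes as input, which is exactly why neither can reflect how a word is distributed over texts.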
similar to the χ² test, this method is based on the bag-of-words model, the representation illustrated in table 1, and thus on the assumption that each word can be modelled as an independent bernoulli trial. as a result, the test ignores all structure in the corpus and even in texts and sentences. we refer to any method that is based on this assumption as a bag-of-words test. there exist other bag-of-words tests that are not based on approximations of the probability mass function given in (1) but are directly based on the summation of values in equation (1). such tests provide more accurate probabilities, especially for small frequencies, under the bag-of-words assumption. examples include fisher’s exact test and the binomial test. we expect these methods to perform similarly to the χ² and log-likelihood ratio tests for low word frequencies, and as the frequency increases, the results will converge because all of these tests are based on the bag-of-words assumption and equation (1). for brevity, we do not consider other bag-of-words tests in this paper.

method 3: welch’s t-test

a t-test is a significance test in which the test statistic follows a student’s t-distribution. we intend to compare two groups of samples and make a minimum number of assumptions. we use welch’s t-test, which is based on the assumption that the mean frequency follows a gaussian distribution. welch’s t-test is more general than student’s t-test because the former test does not assume equal variance in the two populations. welch’s t-test provides a p-value for the hypothesis that the means of the two distributions are equal. the test statistic is the normalized difference between the means of the word frequencies. let $\bar{x}_1$ be the mean of the frequency of q over texts in S, and let $s_1$ be the standard deviation. likewise, let $\bar{x}_2$ be the mean of the frequency of q over texts in T, and let $s_2$ be the standard deviation. the test statistic t is given by

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^{2}}{|S|} + \dfrac{s_2^{2}}{|T|}}}. \qquad (4)$$

the test statistic follows a student’s t-distribution with degrees of freedom that depend on the variance of the populations. an exact solution to this problem is unknown, but welch’s t-test is based on the welch-satterthwaite equation, which provides an approximate solution (welch, ). implementations of this test are available in statistical software programs, including r and microsoft excel.

nb. it is often claimed that student’s and welch’s t-tests are only applicable if the data follow a normal distribution. this is not true; the assumption is that the test statistic follows a normal distribution. in this case, the test statistic is the difference between the two means. this statistic does not in general follow a normal distribution. however, the central limit theorem (clt) states that, under very general conditions, the mean of a set of independent random variables approaches normality very quickly as the number of samples increases. since the frequency of a word per text is bounded, the conditions for the clt are met, and the means $\bar{x}_1$ and $\bar{x}_2$, as well as their difference, are approximately normal when the number of texts is sufficiently large. for small corpora, it is a priori not clear if the test is an appropriate choice.

method 4: wilcoxon rank-sum test

the wilcoxon rank-sum test, which is also known as the mann-whitney u-test, is a statistical test that does not make any assumption regarding the shape of the distribution for the quantity of interest. it is based on the fact that if the distributions of q for two corpora are equal, then it is possible to induce a probability distribution over the rank orders (wilcoxon, ; mann and whitney, ). the test is performed as follows. we order all samples based on the frequency of word q, regardless of the corpus in which these samples are located. this approach gives us a ranked series, an example of which is shown in table 4.
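As a brief aside on Welch's t-test (method 3 above), the statistic and the Welch-Satterthwaite degrees of freedom can be computed from two lists of per-text normalized frequencies. The following Python sketch is illustrative only: the function name `welch_t` is hypothetical, and the final p-value lookup in a Student's t-distribution is deliberately left to a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(freqs_s, freqs_t):
    """Return (t, df) for two lists of per-text normalized frequencies.

    Each list needs at least two texts and the word must vary in at least
    one corpus; df follows the Welch-Satterthwaite approximation.
    """
    n1, n2 = len(freqs_s), len(freqs_t)
    v1, v2 = variance(freqs_s), variance(freqs_t)  # sample variances s_1^2, s_2^2
    se2 = v1 / n1 + v2 / n2                        # squared standard error
    t = (mean(freqs_s) - mean(freqs_t)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df
```

Because the inputs are per-text frequencies, a word concentrated in a handful of texts inflates the variances and shrinks t, which the 2 × 2 representation cannot express.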
table 4 example of a ranked series

rank     1   2   3   4   5   6   7   8   9   10
corpus   S   T   T   T   S   S   S   T   T   S

the test statistic U is then defined as the sum of the ranks of texts of the smaller corpus. in this situation, because both corpora have a size of 5, we can select either S or T. we find that $U_S = 1 + 5 + 6 + 7 + 10 = 29$ and $U_T = ((n^{2} + n)/2) - 29 = 26$, where n = 10 is the total number of texts. we obtain a p-value for small n by comparing the test statistic with a statistical table, and for sufficiently large n the distribution of the test statistic is well approximated by a gaussian distribution using known parameters. implementations of this test are available in statistical software programs, such as r. multiple texts may have equal frequencies for a word. particularly for infrequent words, numerous texts in a corpus may have a frequency of zero. the wilcoxon rank-sum test accounts for texts with equal frequencies by assigning to each text the average rank over all equal-frequency texts. for example, if there are five texts with a frequency of zero, then each text is assigned a rank of 3.

method 5: inter-arrival time test

a novel significance test that is specifically designed for frequency counts in sequences is the inter-arrival time test, which was introduced by lijffijt et al. ( ). this test is based on the spatial distribution of a word in a corpus, as modelled by the distribution of inter-arrival times between words. the assumption is that the inter-arrival time distribution of a word captures the behavioural pattern of the word in a corpus. savický and hlaváčová ( ) use the inter-arrival time distribution to define a corrected frequency that captures whether words that are frequent in a corpus are ‘common’ or not, and altmann et al. ( ) report that the inter-arrival time distribution of a word, as summarized in a burstiness parameter, is a good predictor of word class. the significance test is performed as follows. the inter-arrival times are obtained by counting the number of words between each consecutive occurrence of word q, plus one.
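Returning briefly to the rank-sum statistic of the Wilcoxon test described above, the ranking with averaged ties can be sketched in a few lines of Python. The helper names are hypothetical and the sketch stops at the rank sums; turning them into a p-value (table lookup or Gaussian approximation) is left to a statistics package.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = ((i + 1) + (j + 1)) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sums(freqs_s, freqs_t):
    """Return (U_S, U_T), the rank sums of the texts in S and in T."""
    ranks = average_ranks(list(freqs_s) + list(freqs_t))
    u_s = sum(ranks[:len(freqs_s)])
    n = len(ranks)
    return u_s, n * (n + 1) / 2.0 - u_s
```

With five zero-frequency texts at the bottom of the ordering, `average_ranks` gives each of them rank 3, matching the tie rule in the text.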
the texts in the corpus are ordered randomly and the corpus is treated as though it were placed on a ring: the end of the corpus is attached to the beginning. we begin counting at the first occurrence and continue until we again reach the first occurrence. for example, assume that we have a corpus with ten words and two occurrences of word q (table 5).

table 5 example of a small corpus

index   1   2   3   4   5   6   7   8   9   10
word    x   x   q   x   x   x   q   x   x   x

the inter-arrival times for this corpus are 3 + 1 = 4 and 5 + 1 = 6; thus, the empirical inter-arrival time distribution is {4, 6}. by definition, the number of inter-arrival times is equal to the number of occurrences in the corpus, and the sum of the inter-arrival times equals the size of the corpus. the significance test is based on the production of random corpora by repeatedly sampling inter-arrival times from the empirical inter-arrival time distribution. the first occurrence must be sampled from a different distribution (lijffijt et al., ). after we obtain the index of the first occurrence, we sample uniformly at random an inter-arrival time from the empirical inter-arrival time distribution and insert a new occurrence of q at the position given by this inter-arrival time. this process is repeated until we exceed the length of the corpus. in lijffijt et al. ( ), the significance test is based on a foreground corpus S and a background corpus T. the test is performed by comparing the observed frequency of q in S to the frequency in randomized corpora with sizes equal to S but based on the inter-arrival time distribution of T. the test is one-tailed, and the alternative hypothesis is freq(q,S) > freq(q,T). the test is also asymmetrical in that the p-value for freq(q,S) > freq(q,T) is not necessarily the same as that for freq(q,S*) < freq(q,T*) if we set S* = T and T* = S because only one corpus is randomized. we adopt a slightly different approach that does not have these issues.
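The ring-based counting of inter-arrival times described above can be sketched as follows. This is a minimal Python illustration with a hypothetical function name; it covers only the extraction of the empirical distribution, not the sampling of random corpora.

```python
def inter_arrival_times(tokens, q):
    """Inter-arrival times of q in a token list treated as a ring."""
    positions = [i for i, tok in enumerate(tokens) if tok == q]
    n = len(tokens)
    times = []
    # pair every occurrence with the next one, wrapping from last to first
    for a, b in zip(positions, positions[1:] + positions[:1]):
        gap = (b - a) % n
        times.append(gap if gap > 0 else n)  # a single occurrence spans the ring
    return times


corpus = ["x", "x", "q", "x", "x", "x", "q", "x", "x", "x"]
print(inter_arrival_times(corpus, "q"))  # [4, 6]
```

The list has one entry per occurrence of q and its sum equals the corpus size, as the text states.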
we create random corpora $S_1$ to $S_N$, based on the inter-arrival time distribution of S, and random corpora $T_1$ to $T_N$, based on the inter-arrival time distribution of T, with all sizes equal to the smaller corpus. the one-tailed p-value is given by the mid-p test (berry and armitage, ):

$$p = \frac{1}{N} \sum_{i=1}^{N} H\big(\mathrm{freq}(q, T_i) - \mathrm{freq}(q, S_i)\big), \qquad (5)$$

where

$$H(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0.5 & \text{if } x = 0 \\ 0 & \text{if } x < 0. \end{cases}$$

we can convert this to a two-tailed p-value (dudoit et al., ) using the following equation:

$$p_{two} = 2 \cdot \min(p, 1 - p). \qquad (6)$$

because the p-value is an empirical estimate and the real p-value that we are approximating may be small, the use of smoothing is appropriate (north et al., ). thus, the final p-value is computed as follows:

$$p^{*} = \frac{p_{two} \cdot N + 1}{N + 1}. \qquad (7)$$

the value $p^{*}$ is used as the p-value for this test in our experiments. obtaining the p-values takes longer compared to the other methods, as it requires sampling many pseudorandom numbers. specifically, computing the p-values for all types takes a number of steps equal to N times the number of tokens in a corpus. for example, for the experiment presented in section , this process takes several minutes.

method 6: bootstrap test

bootstrapping (efron and tibshirani, ) is a statistical method for estimating the uncertainty of some quantity in a data sample by resampling the data several times. we can employ bootstrapping to create a significance test as follows. similar to the procedure used in the inter-arrival time test, we create a series of corpora $S_1$ to $S_N$, but we produce a random corpus by sampling |S| texts from S. likewise, we create a series $T_1$ to $T_N$ by repeatedly sampling |S| texts from T. the p-value is again obtained using equations (5) through (7). this method makes no assumptions regarding the shape of the frequency distribution for words and is thus generally applicable. this method is almost identical to the bootstrap test used by lijffijt et al. ( ), but our method differs in that we use a two-tailed p-value and resample both S and T concurrently.
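A minimal Python sketch of the bootstrap test with the mid-p, two-tailed, and smoothing steps described above is given here. It is not the authors' implementation (which the text says is available in R and Matlab). The representation of each text as a `(count, size)` pair, the function names, the default number of samples, and the use of sampling with replacement are assumptions made for illustration.

```python
import random


def _freq(sample):
    """Relative frequency of the word in a list of (count, size) texts."""
    return sum(c for c, _ in sample) / sum(s for _, s in sample)


def bootstrap_p_value(texts_s, texts_t, n_samples=1000, seed=0):
    """Smoothed two-tailed p-value for freq(q,S) = freq(q,T).

    texts_s, texts_t: lists of (count_of_q, text_size) pairs (assumed format).
    Both corpora are resampled at |S| texts, as described in the text;
    sampling is with replacement (an assumption, standard for the bootstrap).
    """
    rng = random.Random(seed)
    k = len(texts_s)
    total = 0.0
    for _ in range(n_samples):
        diff = _freq(rng.choices(texts_t, k=k)) - _freq(rng.choices(texts_s, k=k))
        total += 1.0 if diff > 0 else (0.5 if diff == 0 else 0.0)  # H(x)
    p = total / n_samples                                  # mid-p, one-tailed
    p_two = 2.0 * min(p, 1.0 - p)                          # two-tailed
    return (p_two * n_samples + 1.0) / (n_samples + 1.0)   # smoothing
```

The smoothing step means the smallest attainable p-value is 1/(N + 1), so the number of resamples bounds the resolution of the test.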
implementations in r and matlab can be found in lijffijt ( ).

summary of methods

table 6 summarizes the assumptions underlying the six methods that are described above. the χ² and log-likelihood ratio tests represent the data in a 2 × 2 table, while welch’s t-test, the wilcoxon rank-sum test, and the bootstrap test take as input a list of frequencies per text for each word. the inter-arrival time test is based on the spatial distribution of a word in the corpora. the wilcoxon rank-sum and bootstrap tests make the fewest assumptions regarding the frequency distribution and are thus the most generally applicable.

table 6 summary of the six methods that are presented in this paper and the assumptions regarding the frequency distribution for each test

test                        assumption regarding frequency distribution
pearson’s χ² test           all words are statistically independent (bag-of-words model)
log-likelihood ratio test   all words are statistically independent (bag-of-words model)
welch’s t-test              all texts are statistically independent, and the mean frequency follows a normal distribution
wilcoxon rank-sum test      all texts are statistically independent
inter-arrival time test     spaces between occurrences of the same word are statistically independent
bootstrap test              all texts are statistically independent

test for uniformity of p-values

all of the previously discussed methods yield p-values for the hypothesis that the frequencies of a word q in S and T are equal. several studies, including kilgarriff ( ), rayson et al. ( ), and paquot and bestgen ( ), have previously compared some of these methods. these studies have shown that p-values in the same setting are not equal: there are differences in the significance of a given frequency difference between one method and another. this finding is alarming because we do not know which test yields the best results.
we study the utility of these tests based on the criterion that if the data follow the distribution that is assumed in the null hypothesis and the test is unbiased, then the p-values given by the method should be uniformly distributed in the [0, 1] range. this criterion is applicable according to the definition of p-values: the probability of encountering a p-value of x or less is x itself. for example, there is % chance of observing a p-value of . or less, and a % chance of observing a p-value of . or less. if this criterion is not fulfilled, then the test is either anti-conservative (the probability of encountering a p-value of x or smaller is more than x) or conservative (the probability of encountering a p-value of x or smaller is less than x). see, for example, blocker et al. ( ).

when assessing a statistical testing procedure, testing for uniformity of p-values, either visually or by a statistical test, is a common practice in many disciplines such as particle physics; see e.g. figures – , - , and – in beaujean et al. ( ). a similar kind of experiment has been published in lijffijt ( ), while for example schweder and spjøtvoll ( ) study the uniformity of p-values for multiple-hypotheses adjustment procedures, and l’ecuyer and simard ( ) use the kolmogorov-smirnov test (also used here) to measure the uniformity of random number generators.

numerous statistical tests can be utilized to determine whether a distribution is uniform. we employ the kolmogorov-smirnov test (massey, ), which can be used to compare two distributions. the reference distribution f(x) that we use is the uniform distribution on [0, 1]. the test is based on a simple statistic: the maximum distance between the empirical cumulative distribution $f_n(x)$, which is based on the observed data, and the theoretical uniform cumulative distribution function $f(x)$:

$$d_n = \sup_x \left| f_n(x) - f(x) \right|.$$ ( )

the quantity $\sqrt{n}\, d_n$ follows a kolmogorov distribution.
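As a rough illustration of the uniformity check, the statistic above can be computed directly from a list of p-values. The sketch below is an assumption-laden example rather than the procedure used in the paper: the function name `ks_uniform` is hypothetical, and the p-value uses the standard asymptotic series for the Kolmogorov distribution, which is only approximate for small samples (statistical packages often use exact or corrected variants).

```python
import math


def ks_uniform(p_values):
    """One-sample Kolmogorov-Smirnov test against the uniform distribution on [0, 1].

    Returns (d_n, p) where d_n is the maximum distance between the empirical
    CDF and the uniform CDF, and p is an asymptotic p-value.
    """
    xs = sorted(p_values)
    n = len(xs)
    d_n = 0.0
    for i, x in enumerate(xs):
        # The empirical CDF jumps from i/n to (i+1)/n at x; the uniform CDF is x.
        d_n = max(d_n, (i + 1) / n - x, x - i / n)

    z = math.sqrt(n) * d_n
    if z < 0.2:
        # The alternating series converges slowly here; the true value is ~1.
        return d_n, 1.0
    # Asymptotic survival function: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 z^2)
    q = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
                  for k in range(1, 101))
    return d_n, min(1.0, max(0.0, q))
```

A low returned p-value suggests that the tested p-values are not uniformly distributed; a high value means the sample gives no evidence against uniformity.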
the associated p-value can be found by comparing $\sqrt{n}\, d_n$ to a table containing critical values for the kolmogorov distribution. implementations of this test are available in statistical software programs, including r.

. measure of dispersion: dpnorm

gries ( ) presents an overview of several dispersion measures and the disadvantages of each measure, and proposes a simple alternative that is reliable and easy to interpret: deviations of proportions (dp). the measure is based on the difference between observed and expected relative frequencies. the expected relative frequency is equal to the relative size of a text. let $v_1, \dots, v_n$ be the relative frequencies that are observed in texts $S_1, \dots, S_n$, and let $s_1, \dots, s_n$ be the relative sizes of the texts. dp is defined as

$$\mathrm{dp} = 0.5 \sum_{i=1}^{n} \left| s_i - v_i \right|,$$ ( )

and the normalized measure dpnorm is given by

$$\mathrm{dp}_{\mathrm{norm}} = \frac{\mathrm{dp}}{1 - \min_i (s_i)}.$$ ( )

the normalized measure, as presented by lijffijt and gries ( ), has a minimum value of 0 and a maximum value of 1, regardless of the corpus structure, whereas dp also has a minimum of 0, but its maximum depends on the corpus structure. because the dispersion is quantified as the difference between the expected and observed frequencies, a dispersion of 0 indicates that a word is dispersed as expected, whereas a dispersion of 1 indicates that the word is minimally dispersed. a word is minimally dispersed when it occurs only in the shortest text.

. data

for the purposes of our study, we require a relatively large and homogeneous data set containing information on the gender of the authors of the texts. to fulfil this requirement, we have selected a subcorpus of the british national corpus (bnc, ), namely the prose fiction genre. categorized by lee ( ), the genre excludes drama but includes both novels and short stories. lee ( , p. ) notes that ‘where further sub-genres can be generated on-the-fly through the use of other classificatory fields, they are not given their own separate genre labels, to avoid clutter’; thus, e.g.
children’s prose fiction is not separated from adult prose fiction because these two types of fiction can be distinguished through the ‘audience age’ field. as the sub-genres of prose fiction may differ from one another considerably, our material can be regarded as homogeneous only in relation to other super-genres, such as academic prose.

the prose fiction subcorpus contains texts or c. million words of present-day british english. according to burnard ( , section . . . ), most of the texts are continuous extracts with a target sample size of , words, but several texts are included in their entirety. the gender of the authors is known for texts or c. . million words, which are divided fairly evenly between male and female authors: texts were written by men, and texts were written by women (c. . and . million words, respectively). these texts form our data set. for the uniformity experiments in the following section, we use the first , words of each text, while for the gender study, we analyse the full texts. we preprocess the data set by lowercasing all text; furthermore, punctuation, lemmatization, parts of speech, and multi-word tags are ignored, and only the word forms (i.e. running words) are considered.

. uniformity of p-values

. randomly assigning the texts to two sets

the first experiment that we have conducted involves testing the uniformity of the p-values for each method. we have employed the following procedure. we randomly assign texts to corpus s and texts to corpus t, such that the corpora do not overlap. we then apply each method to all words with a frequency of or greater in the fiction corpus (there are , such words). the entire process is repeated times. because the corpus is split into two parts at random, the null hypothesis, that there is no difference between these parts, is by definition true.
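The random-split procedure can be made concrete with a small simulation. The sketch below is not the experiment reported here: it uses invented synthetic data (200 texts of 2,000 tokens each, with one word concentrated in every tenth text), 200 repetitions, and a plain Pearson χ² test on the 2 × 2 bag-of-words table without Yates' correction. All function names and parameter values are hypothetical.

```python
import math
import random


def chi2_p_value(a, b, size_s, size_t):
    """Pearson chi-square p-value (1 degree of freedom, no Yates' correction).

    a, b: occurrences of the word in S and T; size_s, size_t: corpus sizes in tokens.
    """
    c, d = size_s - a, size_t - b
    n = size_s + size_t
    denom = (a + b) * (c + d) * size_s * size_t
    if denom == 0:
        return 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def random_split_p_values(counts, lengths, repeats=200, seed=1):
    """Repeatedly split the texts into two random halves and test one word."""
    rng = random.Random(seed)
    idx = list(range(len(counts)))
    half = len(idx) // 2
    p_values = []
    for _ in range(repeats):
        rng.shuffle(idx)
        s, t = idx[:half], idx[half:]
        p_values.append(chi2_p_value(
            sum(counts[i] for i in s), sum(counts[i] for i in t),
            sum(lengths[i] for i in s), sum(lengths[i] for i in t)))
    return p_values


# Synthetic, poorly dispersed word: 40 occurrences in every tenth text, none elsewhere.
counts = [40 if i % 10 == 0 else 0 for i in range(200)]
lengths = [2000] * 200
p_values = random_split_p_values(counts, lengths)
share = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"share of p-values below 0.05: {share:.2f}")
```

Under a well-calibrated test, roughly 5% of the p-values would fall below 0.05 because the null hypothesis holds by construction. For a word this unevenly dispersed, the bag-of-words test should flag far more splits than that, which is the kind of anti-conservative behaviour the uniformity experiment is designed to detect.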
notice     that two random samples from a population are almost always different, as long as there is variation in the population the samples are drawn from. that means we expect that there will be differences between the two samples. however, since the assignment is random, any observed structure is fully explained by the artefacts of random sampling, and there is no true discriminative structure present in the data. this procedure is very similar to permutation testing, see for example good ( ). for example, assume that we have drawn two samples, and we observe that the word would is more frequent in s than in t. if we also find it has a low p-value, we may think that there is a real difference between the two populations. however, since s and t are drawn from the same population, we know that there is no true difference between the two populations with respect to the frequency of would. doing many comparisons aggravates this problem, because then we are liable to find many large differences, while there are in fact none. a statistical test quantifies how likely an observation is under the null hypothesis. perhaps counter-intuitively, this does not mean that a p-value is always when there is no true difference between the populations; it means that the distribution of a p-value should be approximately uniform on the range [ , ]. that is, there is a % probability that a p-value is . or lower, % probability that it is . or lower, % probability that it is . or lower, and so on. in that case, the test is neither conservative, nor anti-conservative. when we do multiple tests, we can use bonferroni correction, or a more powerful alternative, to ensure that the smallest p-value of a set of tests has a uniform distribution. the probability distribution of the minimum corresponds to the family-wise error rate. other post-hoc corrections may also have different aims.     due to the random sampling, the p-values will not be exactly uniform, but—as discussed in section . 
—we can employ the kolmogorov-smirnov test to quantify the uniformity of the p-values for one word for one test in a single p-value. we repeat this experiment for each word, and obtain for each of the , words six p-values that express the uniformity of the p-values for each of the six tests. this results in a total of , ⋅ = , p-values. we use a minimum frequency of because the frequency influences the uniformity of the p-values and the influence differs per method. we do not claim that the significance tests are inapplicable to lower frequencies (in fact, we would argue the opposite), but this experiment is not meaningful using lower frequency words. we have not optimized the frequency threshold, and, as shown below, a frequency of is often too low. further details regarding why the experiments are not meaningful with less frequent words can be found in the discussion of the experimental results below. a low p-value for the kolmogorov-smirnov test indicates that the p-value distribution over the random corpus assignments is not uniform. however, due to testing , hypotheses, we do not expect all p-values of the kolmogorov-smirnov test to be high. to correct for multiple hypotheses, we apply the bonferroni correction by multiplying each p-value by the number of hypotheses. if a p-value is greater than one after multiplication, then we set the value to one. the bonferroni correction ensures that there is only α probability of rejecting any sample. the correction is conservative, but we also prefer to be cautious and not reject any samples as being non-uniform unless we are certain of their lack of uniformity. for a review of multiple hypothesis correction methods see shaffer ( ) or dudoit et al. ( ).     figure shows an overview of the performance of each method. 
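The Bonferroni step described above (multiply each p-value by the number of hypotheses and cap the result at one) is simple enough to state as code. The function name below is a hypothetical choice for illustration.

```python
def bonferroni(p_values):
    """Bonferroni-corrected p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

A corrected value below the chosen significance level then means that the corresponding sample is rejected as non-uniform, even after accounting for the number of tests carried out.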
in the following discussion, we write, for brevity, that samples or words are rejected in the uniformity test, where we actually mean that the null hypothesis that the p-values follow a uniform distribution is rejected. fig. the results of the uniformity test for all six methods based on random text assignments. each dot corresponds to a word, which has a frequency (x-axis) and dispersion (y-axis). light grey dots correspond to rejected samples. a sample is rejected if the corrected p-value of the kolmogorov-smirnov test for uniformity is < . . the wilcoxon rank-sum and bootstrap tests demonstrate the best performance with . % rejected samples. we observe that . % of the samples are rejected for the χ test, even for the highest frequency, well-dispersed words. the log-likelihood ratio test performs even worse: % of the words are rejected, and these also include the most frequent and best dispersed words. the difference is probably caused by yates’ correction for the χ test.     the t-test, wilcoxon rank-sum test, and bootstrap test perform much better: although . % to . % of the samples are rejected, we observe that these rejected samples consist of infrequent, poorly dispersed words. thus, testing words with sufficient frequency and/or dispersion yields appropriate results. because of zipf’s law, we know that the number of infrequent words greatly exceeds the number of frequent words, and thus, if we had selected a lower frequency threshold, then the percentage of rejected samples would have been much higher. the inter-arrival time test has more rejected samples ( . %), but these samples again include frequent and well-dispersed words. this result indicates that the test does not capture all of the structure that is present in the texts. this result may have occurred because inter-arrival times have correlations within texts and these are not captured by the test. the wilcoxon rank-sum and bootstrap tests demonstrate the best performance. 
frequent and well-dispersed words always yield a uniform distribution. when comparing the bootstrap and t-tests, we observe that the samples for which the t-test does not provide a uniform distribution are all instances in which the bootstrap test does not provide a uniform distribution plus a few more. especially for infrequent but relatively well-dispersed words, the bootstrap test appears to outperform the t-test. in contrast, the wilcoxon rank-sum test appears to provide a tighter boundary for the rejected samples. finally, we have also tested the performance of all tests on words with frequencies between and . figure displays the results. we observe that the χ and log-likelihood ratio tests fail to yield uniform p-values in almost all cases. the t-test and wilcoxon rank-sum test fail in nearly half of the instances; almost all words that have     frequencies below or that are poorly dispersed are rejected. the inter-arrival time and bootstrap tests are more successful in yielding uniform p-values for low frequency words, with the bootstrap test being the most successful. fig. the results of the uniformity test for all six methods, based on random text assignments, for low frequency words. each dot corresponds to a word, which has a frequency (x-axis) and dispersion (y-axis). light grey dots correspond to samples for which the null hypothesis that the p-values follow a uniform distribution has been rejected. the null hypothesis is rejected if the corrected p-value of the kolmogorov- smirnov test for uniformity is < . . . randomly assigning the words to two sets the second experiment that we conducted is based on the random assignment of individual words to two sets rather than the assignment of entire texts. this approach should lead to a smoother distribution of frequencies, and we expect all methods to yield unbiased (i.e. uniform) p-values in this setting. 
we have used the following procedure to test this hypothesis: we randomly assign half of the , words to corpus s and assign the other half of the words to corpus t. we then apply each method     to all words with a frequency of or greater in the fiction corpus (i.e. the same , words that were used in the previous experiment). the entire process is repeated times. again, we expect the p-value distribution for each word to be approximately uniformly distributed over the repetitions. we use the kolmogorov-smirnov test as discussed above to obtain , ⋅ = , p-values. we use the bonferroni correction for multiple hypotheses to compute the final p-values. figure shows an overview of the performance of each method. fig. the results of the uniformity test for all six methods based on random word assignments (rather than texts, as in fig. ). each dot corresponds to a word, which has a frequency (x-axis) and dispersion (y-axis). light grey dots correspond to samples for which the null hypothesis has been rejected. the null hypothesis is rejected if the corrected p-value of the kolmogorov-smirnov test for uniformity is < . . surprisingly, we observe that the χ test fails to yield uniform p-values for nearly % of the words. this result may have occurred because the test statistic only asymptotically follows a χ distribution, and another contributing factor could be yates’ correction, which makes the p-values more conservative (perhaps excessively     conservative). the latter reason is easy to verify because the kolmogorov-smirnov test can also be employed as a one-tailed test. we computed the p-values again by testing only whether the p-values for the frequency test are excessively low. table presents the results. we now observe that % of the samples are rejected; this result confirms that yates’ correction leads to conservative p-values, which is not necessarily a disadvantage. 
table for each method, the percentage of samples for which the null hypothesis under the one-tailed kolmogorov-smirnov test is rejected, based on random word assignments as in fig. . the alternative hypothesis is that p-values are anti-conservative. test χ test log- likelihood ratio test welch’s t-test wilcoxon rank-sum test inter- arrival time test bootstrap test percentage of rejected samples . % . % . % . % . % . %     fig. cumulative distribution of p-values for each method for the word trip. the diagonal line indicates the uniform distribution, which we expect to be close to the actual distribution. the p-values of the uniformity tests are presented in parentheses. the first four tests show a jagged pattern because of the deterministic nature of these tests, i.e. the limited number of different inputs leads to a limited number of different output values. this behaviour causes the uniformity test to yield low p-values. the inter-arrival time and bootstrap tests are less affected by this limitation. table also shows that . % of the samples are rejected for the log-likelihood ratio test, t-test, and wilcoxon rank-sum test despite our use of the conservative bonferroni correction. perhaps surprisingly, the inter-arrival time and bootstrap tests have no rejected samples; thus, we can conclude that these tests consistently yield reasonably uniformly distributed p-values. figure shows that all of the rejected samples are infrequent words. because this difference is unexpected, let us examine an example of the p-values that are given by each method for an infrequent word. figure illustrates the p-values for the word trip. we notice a problem here: the first four tests do not yield the expected uniform distributions. the cause is visible in     the figure: the number of unique p-values that these tests yield is limited, and the tests give a similar p-value for many of the randomized inputs, because the number of distinct inputs is also very low. 
this behaviour is not necessarily unfavourable; if we assume that only a certain number of p-values are possible, then the observed distribution may be ‘as uniform as possible’ under the constraints. the reference distribution in our test—which is the uniform distribution on [ , ]—does not assume a finite set of possible values. this distribution could have caused the uniformity test to be slightly inappropriate and to reject many samples, especially those corresponding to infrequent or very poorly dispersed words. thus, we should not necessarily interpret the smoother curves given by the inter-arrival time and bootstrap tests as superior. however, we are not aware of any significance tests that would be more appropriate in this situation, and we leave this issue for further research. figure illustrates a comparison of the p-values for the frequent word would. we continue to observe the jagged pattern, but the pattern is now less severe. the high p-values for each of the tests demonstrate that the uniformity test now functions properly. this result corroborates the evidence in fig. that in this randomization setting (assigning each word in the subcorpus randomly to s or to t) none of the frequent words is rejected. we conclude that all of the methods yield uniform p-values in this setting, in which we randomly sample words rather than texts. thus, the differences between the methods in the first experiment are fully explained by the additional structure of the texts. this finding is important because, when creating a corpus, one usually samples texts from various sources rather than individual words. as a note of caution, the jagged patterns provide the first four tests with a disadvantage in the uniformity test; thus, we     cannot conclude that these four methods are all inferior. nonetheless, the evidence does not suggest that any test is superior to the bootstrap test either. 
based on the experiments that have been discussed thus far, we can conclude that under the assumption of randomly sampled texts the χ and log-likelihood ratio tests may lead to spurious conclusions, and we therefore recommend the use of a representation of the data and a statistical test that takes into account the distribution of the word within the corpus. fig. cumulative distribution of p-values for each method for the word would. the diagonal line indicates the uniform distribution, which we expect to be close to the actual distribution. the p-values of the uniformity tests are presented in parentheses. the first four tests show a jagged pattern because of the deterministic nature of these tests, i.e. the limited number of different inputs leads to a limited number of different output values. nonetheless, at this frequency, the uniformity test works properly.     . differences between male and female writing . the bootstrap test past research on the bnc reports statistically significant gender differences in word- frequency distributions in conversation (e.g. rayson et al., ) and in both the fiction and non-fiction genres (e.g. argamon et al., ). we next consider the extent to which word-frequency distributions display statistically significant gender differences in the bnc prose fiction texts using the bootstrap test. after we control for a false discovery rate (fdr; benjamini and hochberg, ) at α = . , which controls the expected relative number of false positives over all positives, the bootstrap test returns words (occurring , times or more in both corpora) whose frequency differs significantly between the male- and female-authored subcorpora. the minimum frequency of , was chosen for ease of illustration, as the list of significant words would have been considerably longer if lower frequencies had been considered (cf. fig. , below). tables and list the words that are most significantly overrepresented in male and female prose fiction, respectively. 
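For readers unfamiliar with false discovery rate control, the Benjamini-Hochberg step-up procedure cited above can be sketched as follows. This is a generic textbook version, not code from the study; the function name is hypothetical, and the default `alpha=0.05` is only an illustrative value, since the level used in the analysis is not reproduced here.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the null hypothesis is rejected
    while controlling the false discovery rate at level alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Remember the largest rank k whose p-value satisfies p_(k) <= alpha * k / m.
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected
```

Unlike the Bonferroni correction, which bounds the probability of any false rejection, this procedure bounds the expected share of false positives among the rejected hypotheses, so it typically marks more words as significant.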
table high-frequency words that are significantly overrepresented in male-authored prose fiction in the bnc according to the bootstrap test word males m/million females f/million dpnorm bootstrap a , , . , , . . . another , . , . . . by , , . , , . . . first , , . , . . .     from , , . , , . . . in , , . , , . . . its , . , . . . man , , . , , . . . of , , . , , . . . on , , . , , . . . one , , . , , . . . some , , . , , . . . the , , . , , . . . their , , . , , . . . they , , . , , . . . through , , . , . . . two , , . , , . . . us , . , . . . we , , . , , . . . were , , . , , . . . is , , . , , . . . left , . , . . . other , , . , , . . .     there , , . , , . . . are , , . , , . . . where , , . , , . . . he , , . , , . . .   table high-frequency words that are significantly overrepresented in female-authored prose fiction in the bnc according to the bootstrap test word males m/million females f/million dpnorm bootstrap ’ll , , . , , . . . ’m , , . , , . . . ’ve , , . , , . . . be , , . , , . . . come , , . , , . . . could , , . , , . . . did , , . , , . . . eyes , . , , . . . face , , . , , . . . for , , . , , . . . go , , . , , . . . her , , . , , . . . how , , . , , . . . if , , . , , . . . knew , . , . . . made , . , , . . .     make , . , . . . much , . , , . . . must , . , . . . n’t , , . , , . . . never , . , , . . . not , , . , , . . . own , . , . . . she , , . , , . . . should , . , . . . so , , . , , . . . thought , , . , , . . . to , , . , , . . . too , , . , , . . . want , . , , . . . when , , . , , . . . with , , . , , . . . would , , . , , . . . you , , . , , . . . your , , . , , . . . had , , . , , . . . look , . , , . . . take , . , . . . very , , . , , . . . do , , . , , . . . because , . , . . . put , . , . . .     that , , . , , . . . little , , . , , . . . ’re , , . , , . . . have , , . , , . . . well , , . , , . . . tables and are consistent with earlier research that has found gender differences based on word frequencies in prose fiction. 
overall, the tables suggest that male-authored fiction is dominated by more frequent use of noun-related forms than female-authored fiction, which is verb-oriented. male authors overuse articles (a, the) and prepositions (by, from, in, of, on, through), both of which are associated with nouns. similarly, male-authored fiction overuses other function words that are typically associated with noun phrases and nominal functions, such as another, first, one, some, two, and other. however, it is noteworthy that the list of significant items for male authors is shorter than that for female authors. the personal pronouns that are overrepresented in male-authored fiction are the first-person plural forms us and we and the third-person pronouns its, their, and they, while women’s fiction overuses the second-person forms you and your, which can have singular and plural referents. stereotypically, men tend to write about man and he, and women about her and she. these pronoun findings are consistent with those of argamon et al. ( , pp. – ) but deviate in that women do not significantly favour the first-person pronoun i, as the previous findings suggest. when the bootstrap method is used, personal pronouns do not emerge as unequivocal female-style markers in contemporary prose fiction.     table shows that female-authored fiction is marked by frequent verb use: there are more than twenty verb forms among the items overused by women (forms of be, do, and have; modals, such as could, should, must, and would; and activity and mental verbs, including come, go, make, knew, and thought). only three such verb forms are overused in male-authored fiction (were, is, and are). particularly salient features in women’s fiction are contracted forms (’ll, ’m, ’ve, n’t, ’re), negative particles (n’t, never, not), and intensifiers (much, so, too, very). 
these are all indicators that female- authored fiction employs a more involved, colloquial style than male-authored fiction, which, by contrast, is marked by features associated with an informational, noun- oriented style (for these distinctions, see biber, , pp. – ; biber and burges, ). however, these style markers may not be a simple reflection of the gender of the authors; rather, these differences may be correlated with target audience differences. both the male and female authors sampled for the bnc wrote for adults, and only a small minority wrote for children. however, c. million of the total of . million words in the male-authored fiction subcorpus was intended for a mixed readership, whereas half of the female-authored subcorpus (c. . million of . million words) targeted female audiences and may hence include more female characters and female- oriented topics than the male-authored subcorpus. previous research indicates that audience design is relevant in spoken interaction, and style shifting is typically a response to the speaker’s audience (bell, ). in weblogs, for example, the diary subgenre is reported to display more ‘female’ stylistic features, and the filter subgenre contains more ‘male’ stylistic features; in both cases the findings are independent of the gender of the author (herring and paolillo, ). it is plausible that different subgenres     of fiction and their target audiences also play a role in the word-distribution differences that are observed in the bnc prose fiction genre. . comparing the χ test with the bootstrap test the above analysis is based on words that are ranked as significant by the bootstrap test. most of these words are also significant according to the other tests, including those based on the bag-of-words model. however, how do we evaluate words that are ranked as significant by the bag-of-words tests, such as the χ test, but are considered insignificant by the more valid tests, such as the bootstrap test? 
tables and list high-frequency words (occurring , times or more in both subcorpora) for which the difference between the χ and bootstrap p-values is at least tenfold. by accounting for fdr control at α = . , we find that the χ p-values are significant, but the bootstrap p- values are not significant. all of the listed words are also significant according to our other bag-of-words test, the log-likelihood ratio. table high-frequency words that are significantly overrepresented in male-authored prose fiction in the bnc according to the χ test but not according to the bootstrap test word males m/million females f/million dpnorm χ bootstrap an , , . , , . . . . back , , . , , . . . . down , , . , , . . . . has , . , . . . . his , , . , , . . . . i , , . , , . . . . into , , . , , . . . .     my , , . , , . . . . off , , . , , . . . . old , . , . . . . or , , . , , . . . . out , , . , , . . . . people , . , . . . . them , , . , , . . . . this , , . , , . . . . up , , . , , . . . . which , , . , , . . . . who , , . , , . . . . then , , . , , . . . . looked , , . , , . . . . something , , . , . . . . just , , . , , . . . . turned , . , . . . .   table high-frequency words that are significantly overrepresented in female-authored prose fiction in the bnc according to the χ test but not according to the bootstrap test word males m/million females f/million dpnorm χ bootstrap all , , . , , . . . . and , , . , , . . . . any , , . , , . . . . as , , . , , . . . . away , , . , , . . . . been , , . , , . . . .     but , , . , , . . . . ’d , , . , , . . . . day , . , . . . . going , , . , , . . . . him , , . , , . . . . last , . , . . . . might , . , . . . . no , , . , , . . . . now , , . , , . . . . only , , . , , . . . . said , , . , , . . . . seemed , . , . . . . think , , . , , . . . . time , , . , , . . . . told , . , . . . . was , , . , , . . . . why , . , , . . . . room , . , . . . . know , , . , , . . . . about , , . , , . . . . even , , . , , . . . . after , , . , , . . 
. . long , . , . . . . tell , . , . . . .       some of the words in tables and appear to corroborate the above analysis: the writing style of women is more verb-oriented, whereas men overuse masculine and collective personal pronouns, such as his and them. however, the list of words for female-authored fiction also includes a male personal pronoun, him, and men appear to significantly overuse the first-person singular pronouns i and my, which is surprising in view of our general knowledge of gendered styles (argamon et al., ; newman et al., ). furthermore, men appear to overuse directional adverbs, such as back, down, out, and up; this result could be misinterpreted as an interesting discovery with regard to the focus of male prose writing on spatial orientation. if words of all frequencies are considered, then the most salient category of words that are ranked as significant by the χ test but not by the bootstrap test is proper nouns, as in the matilda example above. some of these words are also easily misinterpreted as genuine differences between subcorpora. even an experienced linguist cannot determine which bag-of-words results are genuinely significant; our comparisons show that such results can lead to conflicting interpretations. therefore, it is advisable to avoid the noise that is inherent in bag-of-words methods and to use a more valid test, such as the bootstrap test. . comparing the tests according to significance threshold figure summarizes the number of significant words that were returned by each test at varying significance testing thresholds. the t-test yields the least number of significant words, followed by the wilcoxon rank-sum and bootstrap tests in both figures. only the curve for the inter-arrival time test differs substantially between figs a and b. the test appears to have difficulty with comparing zero with non-zero frequencies and always deems such cases significant. 
we also observe that the χ and log-likelihood ratio tests     yield more words (by several orders of magnitude) as significant results than the t-test and the wilcoxon rank-sum and bootstrap tests. both axes have a logarithmic scale. fig. comparison of the number of significant words for the six methods. for each method, a curve demonstrates how the number of significant words increases as we increase the significance threshold in the male vs. female author comparison without correcting for multiple hypotheses. the x-axis shows the p-value threshold, and the y-axis shows the percentage of words that are marked as having significantly different frequencies between genders. the figure on the left (a) is based on all words in the prose fiction subcorpus, and the figure on the right (b) includes only those words with frequencies greater than zero for both genders. . conclusion many current corpus tools use the χ and log-likelihood ratio tests. we suggest that other tests be added to these tools for the reasons discussed in this paper. the core difference between the bag-of-words tests (the χ and log-likelihood ratio tests) and the other four tests (the t-test and the wilcoxon rank-sum, inter-arrival time, and bootstrap tests) is the representation of the data, and thus, the unit of observation: for the bag-of- words tests, the data are represented in a x table (table ) and the number of samples equals the number of words in a corpus, whereas for the other four tests, the data are     represented either by a frequency list (table ), or a list of inter-arrival times. in those cases, the number of samples is much lower than the number of words in a corpus. for the t-test, the wilcoxon rank-sum test, and the bootstrap test, the number of samples equals the number of texts in a corpus, and for the inter-arrival time test, the number of samples equals the number of occurrences of the word being tested rather than the total number of words. 
the number of samples generally determines our certainty with regard to the estimates, and the experimental results show that the bag-of-words tests have excessively high confidence in the estimates of mean word frequencies, in the context of the statistical comparison of two corpora. by studying the uniformity of the p-values that were given by each of the tests, we have shown that the choice of how to define independent samples and how to represent the data plays a major role in the outcome of a significance test. we have shown that bag-of-words-based tests may lead to spurious conclusions when assessing the significance of differences in frequency counts between corpora. note, however, that we are not suggesting that there is anything wrong with the χ² and log-likelihood ratio tests as such, but only that their application in this context is problematic. we have also shown that appropriate alternatives exist: welch’s t-test, the wilcoxon rank-sum test, and the bootstrap test.

we have considered the choice of statistical tests for comparing moderate-sized or large corpora (at least texts each). due to space limitations, we have not included a discussion of how to compare small corpora. this problem is briefly addressed in lijffijt et al. ( ). for small corpora, it appears that the advice on which statistical test to use is not as straightforward as for large corpora. the objections made in this paper against the bag-of-words tests hold for corpora of any size. however, in small corpora, counting all occurrences of a word in the same text as one sample, i.e., a sample equals a text, may preclude the detection of many patterns. we would expect the inter-arrival time test to be a tempting alternative in that setting, but further investigation into the use of statistical tests for comparing small corpora or individual texts is warranted.

notes

kilgarriff refers to this test as the mann-whitney ranks test.

in lijffijt et al.
( ) we set out to explore the question of lexical variation in a historical single-genre corpus of personal correspondence over time. comparing the log-likelihood ratio and bootstrap tests, we found that the two successive half-a-million- word subperiods of the corpus that we examined were more similar to each other with regard to their lexis than a bag-of-words method might lead one to postulate. we also observed that, besides the choice of method and the size of the corpus, the observed degree of similarity depends on several other factors, notably, the type of post-hoc correction, and the frequency cut-off and significance thresholds used. both p-values are actually using double precision floating point numbers; thus, these values are much smaller than . . acknowledgements we thank the anonymous reviewers for their valuable comments and suggestions. funding this work was supported by the academy of finland [grant numbers , ]; the finnish centre of excellence for algorithmic data analysis (algodan); the finnish centre of excellence for the study of variation, contacts and change in     english (varieng); the finnish doctoral programme in computational sciences (fics); the academy of finland’s academy professorship scheme; and the finnish graduate school in language studies (langnet). references altmann, e. g., pierrehumbert, j. b., and motter, a. e. ( ). beyond word frequency: bursts, lulls, and scaling in the temporal distributions of words, plos one, ( ): e . argamon, s., koppel, m., fine, j., and shimoni, a. r. ( ). gender, genre, and writing style in formal written texts, text, ( ): – . beaujean, f., caldwell, a., kollár, d., and kröninger, k. ( ). p-values for model evaluation, physical review d, ( ): . bell, a. ( ). language style as audience design, language in society, : – . benjamini, y. and hochberg, y. ( ). controlling the false discovery rate: a practical and powerful approach to multiple testing, journal of the royal statistical society, ( ): – . 
berry, g. and armitage, p. ( ). mid-p confidence intervals: a brief review, the statistician, ( ): – . biber, d. ( ). dimensions of register variation: a cross-linguistic comparison. cambridge: cambridge university press. biber, d. and burges, j. ( ). historical change in the language use of women and men: gender differences in dramatic dialogue, journal of english linguistics, ( ): – . blocker, c., conway, j., demortier, l., heinrich, j., junk, t., lyons, l., and punzig, g. ( ). simple facts about p-values, technical report     cdf/memo/statistics/public/ , laboratory of experimental high energy physics, the rockefeller university. bnc = the british national corpus, version (bnc xml edition) ( ). distributed by oxford university computing services on behalf of the bnc consortium. http://www.natcorp.ox.ac.uk/ (accessed november ). burnard, l. ( ). reference guide for the british national corpus (xml edition). published for the british national corpus consortium by the research technologies service at oxford university computing services. http://www.natcorp.ox.ac.uk/docs/urg/ (accessed november ). dudoit, s., shaffer, j. p., and boldrick, j. c. ( ). multiple hypothesis testing in microarray experiments, statistical science, ( ): – . dunning, t. ( ). accurate methods for the statistics of surprise and coincidence, computational linguistics, : – . efron, b. and tibshirani, r. j. ( ). an introduction to the bootstrap. new york: chapman and hall/crc. evert, s. ( ). the statistics of word cooccurrences: word pairs and collocations. dissertation, institut für maschinelle sprachverarbeitung, university of stuttgart. good, p. ( ). permutation, parametric, and bootstrap tests of hypotheses. rd edn., new york/heidelberg: springer. gries, s. th. ( ). null-hypothesis significance testing of word frequencies: a follow-up on kilgarriff, corpus linguistics and linguistic theory, ( ): – . gries, s. th. ( ). 
dispersions and adjusted frequencies in corpora, international journal of corpus linguistics, ( ): – .     herring, s. c. and paolillo, j. c. ( ). gender and genre variation in weblogs, journal of sociolinguistics, ( ): – . hinneburg, a., mannila, h., kaislaniemi, s., nevalainen, t., and raumolin- brunberg, h. ( ). how to handle small samples: bootstrap and bayesian methods in the analysis of linguistic change, literary and linguistic computing, ( ): – . hoffmann, s., evert, s., smith, n., lee, d., and berglund prytz, y. ( ). corpus linguistics with bncweb—a practical guide. frankfurt am main: peter lang. kilgarriff, a. ( ). comparing corpora, international journal of corpus linguistics, ( ): – . kilgarriff, a. ( ). language is never, ever, ever, random, corpus linguistics and linguistic theory, ( ): – . l’ecuyer, p. and simard, r. ( ). testu : a c library for empirical testing of random number generators, acm transactions on mathematical software, ( ): . lee, d. y. w. ( ). genres, registers, text types, domains and styles: clarifying the concepts and navigating a path through the bnc jungle, language learning & technology, ( ): – . lijffijt, j. ( ). bootstrap test for r and matlab. http://users.ics.aalto.fi/lijffijt/bootstraptest/ (accessed november ). lijffijt, j. ( ). a fast and simple method for mining subsequences with surprising event counts. in blockeel, h., kersting, k., nijssen, s., and Železný, f. (eds), proceedings of ecml-pkdd —part i. berlin: springer-verlag, pp. – .     lijffijt, j. and gries, s. th. ( ). correction to stefan th. gries’ “dispersions and adjusted frequencies in corpora”, international journal of corpus linguistics, ( ): – . lijffijt, j., papapetrou, p., puolamäki, k., and mannila, h. ( ). analyzing word frequencies in large text corpora using inter-arrival times and bootstrapping. in gunopulos, d., hofmann, t., malerba, d., and vazirgiannis, m. (eds), proceedings of ecml-pkdd —part ii. berlin: springer-verlag, pp. – . 
lijffijt, j., säily, t., and nevalainen, t. ( ). ceecing the baseline: lexical stability and significant change in a historical corpus. in tyrkkö, j., kilpiö, m., nevalainen, t., and rissanen, m. (eds), outposts of historical corpus linguistics: from the helsinki corpus to a proliferation of resources. studies in variation, contacts and change in english, vol. . helsinki: varieng. http://www.helsinki.fi/varieng/journal/volumes/ /lijffijt_saily_nevalainen/ (accessed november ). mann, h. b. and whitney, d. r. ( ). on a test of whether one of two random variables is stochastically larger than the other, annals of mathematical statistics, ( ): – . massey, f. ( ). the kolmogorov-smirnov test for goodness of fit, journal of the american statistical association, ( ): – . newman, m. l., groom, c. j., handelman, l. d., and pennebaker, j. w. ( ). gender differences in language use: an analysis of , text samples, discourse processes, : – .     north, b. v., curtis, d., and sham, p. c. ( ). a note on the calculation of empirical p-values from monte carlo procedures, the american journal of human genetics, ( ): – . oakes, m. p. and farrow, m. ( ). use of the chi-squared test to examine vocabulary differences in english-language corpora representing seven different countries, literary and linguistic computing, ( ): – . paquot, m. and bestgen, y. ( ). distinctive words in academic writing: a comparison of three statistical tests for keyword extraction. in jucker, a., schreier, d., and hundt, m. (eds), corpora: pragmatics and discourse. amsterdam: rodopi, pp. – . rayson, p. ( ). from key words to key semantic domains, international journal of corpus linguistics, ( ): – . rayson, p., berridge, d., and francis, b. ( ). extending the cochran rule for the comparison of word frequencies between corpora. in purnelle, g., fairon, c., and dister, a. (eds), le poids des mots: proceedings of the th international conference on statistical analysis of textual data (jadt ). 
louvain-la- neuve: presses universitaires de louvain, pp. – . rayson, p. and garside, r. ( ). comparing corpora using frequency profiling. in kilgarriff, a. and berber sardinha, t. (eds), proceedings of the workshop on comparing corpora. stroudsburg: association for computational linguistics, pp. – . rayson, p., leech, g., and hodges, m. ( ). social differentiation in the use of english vocabulary: some analyses of the conversational component of the     british national corpus, international journal of corpus linguistics, ( ): – . savický, p. and hlaváčová, j. ( ). measures of word commonness, journal of quantitative linguistics, ( ): – . schweder, t. and spjøtvoll, e. ( ). plots of p-values to evaluate many tests simultaneously, biometrika, ( ): – . scott, m. ( ). wordsmith tools, version . liverpool: lexical analysis software. shaffer, j. p. ( ). multiple hypothesis testing, annual review of psychology, : – . welch, b. l. ( ). the generalization of ‘student’s’ problem when several different population variances are involved, biometrika, ( – ): – . wilcoxon, f. ( ). individual comparisons by ranking methods, biometrics bulletin, ( ): – . a critical analysis of lifecycle models of the research process and research data management this is a repository copy of a critical analysis of lifecycle models of the research process and research data management. white rose research online url for this paper: http://eprints.whiterose.ac.uk/ / version: accepted version article: cox, a.m. orcid.org/ - - - x and tam, w. orcid.org/ - - - ( ) a critical analysis of lifecycle models of the research process and research data management. aslib journal of information management, ( ). pp. - . issn - https://doi.org/ . /ajim- - - © emerald. this is an author produced version of a paper subsequently published in aslib journal of information management. uploaded in accordance with the publisher's self-archiving policy. 
eprints@whiterose.ac.uk https://eprints.whiterose.ac.uk/

reuse
this article is distributed under the terms of the creative commons attribution-noncommercial (cc by-nc) licence. this licence allows you to remix, tweak, and build upon this work non-commercially, and any new works must also acknowledge the authors and be non-commercial. you don’t have to license any derivative works on the same terms. more information and the full terms of the licence here: https://creativecommons.org/licenses/

takedown
if you consider content in white rose research online to be in breach of uk law, please notify us by emailing eprints@whiterose.ac.uk including the url of the record and the reason for the withdrawal request.

a critical analysis of lifecycle models of the research process and research data management

abstract

purpose
visualisations of research and research related activities including research data management as a lifecycle have proliferated in the last decade. this study offers a systematic analysis and critique of such models.

design/methodology/approach
a framework for analysis, synthesised from the literature, is presented and applied to nine examples.

findings
the strengths of the lifecycle representation are to clarify stages in research and to capture key features of project based research.
nevertheless, their weakness is that they typically mask various aspects of the complexity of research, constructing it as highly purposive, serial, unidirectional and occurring in a somewhat closed system. other types of models such as the spiral of knowledge creation or the data journey reveal other stories about research. it is suggested that we need to develop other metaphors and visualisations around research.

research implications
the paper explores the strengths and weaknesses of the popular lifecycle model for research and research data management, and also considers alternative ways of representing them.

practical implications
librarians use lifecycle models to explain service offerings to users so the analysis will help them identify clearly the best type of representation for particular cases. the critique offered by the paper also reveals that because researchers do not necessarily identify with a lifecycle representation, alternative ways of representing research need to be developed.

originality/value
the paper offers a systematic analysis of visualisations of research and research data management current in the library and information studies (lis) literature revealing the strengths and weaknesses of the lifecycle metaphor.

introduction

in the last decade library and information studies (lis) has shown a growing interest in the detail of the research process, in part due to the creation of a new depth of research support services in libraries (corrall, ). the academic library’s role has been turned “inside out” from mainly providing the user community with access to literature, to stewarding the knowledge being created within the institution and making it discoverable to the wider world (dempsey, malpas & lavoie,
this includes an increasing investment in helping local researchers manage data within the research process, stewarding different versions and types of outputs, including data, and also supporting and measuring all sorts of dissemination and impact beyond academia. thus, libraries are seeking to offer services across the course of research as a whole, and as a consequence the black box of the research process has been opened. in this context a number of commentators and practitioners have commented on the proliferation of lifecycle models/visualisations in the research data management and research support area (wilson, ; l’hours, ; carlson, ). ball ( ) reviews nine such data lifecycle models; the committee on earth observation satellites (ceos) working group on information systems & services ( ) reviewed models. there is a significant amount of variation in their purpose and assumptions, but we can learn a lot about how research is conceptualised from examining them. the lifecycle has long been a favourite model in lis (ma and wang, a,b); in particular it is a core concept in records and archive management (williams, ). the appeal is the temporal dimension the metaphor adds to our understanding of the differing activities in view. the metaphor seems to be particularly appealing in the research area because it fits into thinking about designing systems workflows, be those administrative or it-based. yet the term lifecycle is also ambiguous implying the model of a birth to death journey or, somewhat in contrast, the pattern of birth and reproduction where the cycle is endlessly repeated or, indeed, enters a progressive upward cycle. a number of authors, notably carlson ( ); waddington, green & awre ( ) and wissik & durco ( ) have begun to reflect on these models. but there remain questions about what different types of model there are; what uses they have; and what assumptions are made in them. 
given the way that the lifecycle is becoming virtually the default way of representing the research process, it is important at a theoretical level to question whether this adequately conceptualises research. at a practical level, given their increasing use by practitioners to conceptualise and explain services, it is important to weigh up whether they capture how researchers themselves view the research process. in this context the aim of this paper is to explore the strengths and weaknesses of representing research and research data related activities in a lifecycle model, through a systematic comparison of some that have been published recently in the research data space. it does this by examining the literature to produce an analytic framework; undertaking a systematic analysis of a selection of nine models using the framework; and on this basis reaching some conclusions about the strengths and limits of this type of model/visualisation. having probed the typical limits of the lifecycle approach, thought can be given to the benefits of radically different representations such as the knowledge spiral or data journey which offer more critical perspectives. the lifecycle a number of authors have already attempted to differentiate different types of lifecycle. according to ma and wang ( a, b), the life cycle of information can be understood from two perspectives. the perspective of value focuses on how the worth of information changes, usually deteriorates, over time. the other more common perspective is of management. the authors identify six types of management based life cycle models. the commonest is what they call the chain model. it is a simple chain of steps. the second type, the matrix, expands the chain model by giving more detail on each of the stages in the chain. the third type is the circular model, with the crucial difference being that the end of the chain links back to the beginning thus restarting the cycle. 
the spiral extends the circular model by illustrating how each cycle is not just a repetition of previous cycles but builds and strengthens, as things such as audiences or the nature and structure of information change. an integrated model combines two of the other types as a way to indicate the complexity of processes or to bring in external factors. the final model is the wave, which is able to express an on-going activity but an overall decline in the value of data. this set of categories itself reveals the complexity of the concept.

whereas ma and wang ( a,b) emphasise the different visual structure of the models, carlson ( ) working more directly in the research service area emphasises differing underlying audiences. he suggests that life cycle models in the data service context can be divided into three types:

- individual-based, which are for a particular project giving detail on how it unfolds.
- organisation-based, which are for showing how services fit to different stages of research or how researchers should access different services at different stages in their work.
- community-based, which are for a particular academic community or discipline (including professionals that support them) to define existing or good practice.

this seems to be an insightful categorisation, although these are not mutually exclusive categories; for example, a community based model could include elements of organisational support. he also mixes descriptive and prescriptive elements.

another way of analysing such models is available in the work of möller ( ). in the context of seeking to develop a data lifecycle model for the semantic web, he analyses lifecycle models of “data centric systems” such as for multimedia, elearning, digital libraries, knowledge and content management, ontologies and databases.
he focuses on data in the context of technical systems, with less sense that they could be socio-technical systems including different social actors, activities etc. however, he does usefully suggest that such systems can vary across the following dimensions:

- data/metadata – whether data and metadata are differentiated.
- prescriptive/descriptive – whether the model seeks to define what would be good practice or is describing actual practice.
- homogenous/heterogeneous – whether the data in the system is similar or of different types. this would seem to be more a dimension of difference than a yes/no distinction.
- closed/open – whether the system follows a given set of data, or whether new data can join the system after the beginning of the cycle.
- centralised/distributed – whether the system is based on a single unitary infrastructure or is spread over multiple systems.
- lifecycle type – sequential/incremental/evolving. in a sequential model each step has to be worked through before the next can be started. in an incremental model a new lifecycle can begin before all the expected steps are worked through. in an evolving model several different iterations of the lifecycle can be occurring simultaneously.
- granularity – whether it is a high level or fine grained model.

some of these dimensions, such as the first, may be specific to möller’s particular study, but most of the others appear to be productive for thinking about research based lifecycle models too. thus we have the beginning of an analytic framework combining the visual dimension (ma and wang, a,b); main purpose (carlson, ); and characteristics of the elements or processes making up the model from möller ( ).
method

in order to carry out the study we collected a large number of models published in the last ten years in both the peer reviewed and practitioner lis literature relating to research and research data. inevitably this collection was not comprehensive. to represent the range of models being developed the search was not restricted to peer reviewed material; rather, it was highly relevant to include examples being published by practitioners. we also made the decision to include models dealing specifically with data as well as broader models of the research process as a whole, because they appeared to be published to the same audience and are often strongly inter-related. we then selected nine for more in-depth analysis. they were:

- research lifecycle (rin/nesta, )
- the research lifecycle at ucf (university of central florida libraries’ research lifecycle committee, )
- the scholarly knowledge cycle (lyon, )
- research lifecycle enhanced by an "open science by default" workflow (grigorov, et al. )
- the idealized scientific research activity lifecycle model (patel, )
- the integrated scientific life cycle of embedded networked sensor research (pepe et al., )
- e-science and the lifecycle of research (humphrey, )
- create and manage data (corti et al., )
- digital curation centre (dcc) curation lifecycle model (higgins, )

some such as “e-science and the lifecycle of research” and the “dcc digital curation lifecycle” have been very widely cited in the literature, and we know they have been very influential at the practice level too. some, including the dcc model, have also been well documented in the public domain, thus supporting more in-depth analysis.
the “integrated scientific life cycle of embedded networked sensor research” was developed and used in a number of publications by one of the leading researchers in the field, christine borgman, so again is considered particularly worthy of consideration. the open science by default model reflects current thinking around open science. thus nine models of research data and research, which appeared to be influential or interesting, were identified for analysis.

using ma and wang ( a,b), carlson ( ) and möller ( ) as starting points we defined a set of key dimensions by which the models could be compared, and through applying them to the sample models refined the criteria. this framework is in three parts: scope and point of view; elements and processes; and visualisation.

scope and point of view consists of three elements:

- the subject matter of the lifecycle, which could be anything from the whole research process at one end of the scale to a model just concerned with data, or some aspect of it, at the other.
- whether it is project-, organisation- or community-based, after carlson ( ).
- prescriptive/descriptive – whether the intention is to describe a real world state of affairs or define how things should be ideally (möller, ).

elements and processes consists of:

- level of abstraction, i.e., granularity, after möller ( ).
- homogenous/heterogeneous – whether constituent elements are similar or diverse (möller, ).
- closed/open – whether it is presented as a bounded, self-contained system or is open to outside elements or processes (möller, ).
- centralised/distributed – whether it rests on a single infrastructure or not (möller, ).
- uni/multi-directionality – whether the flow of activities goes in a single or multiple directions.
this seemed to be an aspect of möller’s ( ) “lifecycle type”, but as this was rather complex we thought it more simply articulated through and .
- seriality/simultaneity – whether activities occur as a set of stages or whether more than one process can occur at once.

visualisation consists of:

- what type of lifecycle it is, after ma and wang ( a,b).
- use of colour; this item and the ones that follow are more granular features of the visual representation that emerged from the data as useful.
- visuality/textuality – the balance of image and text.
- recti-/curvilinearity – whether the forms were primarily organic curved forms or more rectilinear ones.

findings

description of table

[insert table around here]

table offers a summary of the nine models selected, starting with the broadest, which deals with the whole of the research process, through to the most specific, purely concerned with data curation.

model : research lifecycle (rin/nesta )

the scope of the model is the whole research lifecycle for any discipline. published in the rin/nesta study “open to all?”, the lifecycle was developed to illustrate openness in research, dealing with issues such as what material outputs are made open and to whom, at each stage of the research lifecycle. in terms of elements and processes, it is at a high level of abstraction with the whole research process represented in just stages: conceptualising and networking; proposal writing and design; collecting and analysing; infrastructuring; documenting and sharing; publishing and reporting; translating and engaging.
the process is seen as largely closed (with a number of set elements in the system), unidirectional (traveling forward, without any sense of iteration or return to earlier steps) and serial (with a single sequence of steps taken one after the other) – though “infrastructuring” seems to happen in parallel with collecting and analysing. there is no indication that the processes would necessarily be occurring within one particular system, so it can be seen as distributed. visually the lifecycle type is circular. it is a true circular lifecycle in the sense it is clear that the stage of “translating and engaging” to disseminate research to different communities naturally merges into the “conceptualising and networking” around the next research idea, so one can understand a logic of the cycle repeating itself. a related table lists types of documentary or data outputs created at each stage in the lifecycle; thus it could be seen as a circular matrix within ma and wang’s ( a,b) terminology, because it is laid out as a circle, but has a table of explanation for each step in the circle. the low use of colour and text is consistent with a high level of abstraction. model : the research lifecycle at ucf (university of central florida libraries’ research lifecycle committee, ) as with the first model, the scope is the whole research lifecycle for any discipline. in this case it is a lifecycle for a particular organisation, because it is intended to show how different support services fit into the research workflow. this is what carlson ( ) calls an organisation based model. in terms of elements and processes it is a lot more detailed than the previous model, but remains unidirectional and serial. but it does include more heterogeneous elements of actors and their roles at different stages and appears to be distributed across multiple systems. 
using ma and wang’s terminology ( a, b) it is an integrated model, incorporating four different circular cycles (planning, project, publication and “ st century digital scholarship” – which includes dissemination and preservation) as well as some indication of other flows. it is clear how the cycle restarts with ideas springing from one piece of research naturally restarting a new planning cycle. it is one of the strongest designs with high use of colour to retain clarity despite its inclusion of lots of low level detail. the curvi-linear style is strongly suggestive of an organic process.

model : the scholarly knowledge cycle (lyon, )

the scholarly knowledge cycle is a high level abstraction, which, unusually, encompasses both research and teaching – reflecting the reuse of research data and outputs in the teaching context. the text describing the model suggests that it is generic, but elements are clearly relevant to a specific, individual project. although it is labelled as a knowledge cycle, this early model has a strong focus on data and information management, e.g., key elements are databases. in terms of elements and processes, the order of research stages is less clearly identified than in some of the other models, particularly because many of the linkages are bi-directional. this captures complexity but reduces readability. similarly, its inclusion of heterogeneous elements (institutional repositories, databases, metadata, research and teaching processes) and differing infrastructures, means it captures complex reality, but has reduced legibility. in terms of visualisation two cycles (research and teaching) are happening simultaneously.

model : research lifecycle enhanced by an "open science by default" workflow (grigorov, et al.
)

this lifecycle is again for the whole of the research process, with a prescriptive objective to explain how open science elements could potentially fit in at each stage of research. an inner circle sets out nine basic research stages, with a strong emphasis on dissemination and public engagement. an outer circle mirrors each step with open science innovations, e.g., research data management (rdm) around data, open access for outputs and citizen science applied to engagement activities. an additional element is a link to the author’s unique id, which meanders through the whole process, implying that it ties together disparate activities. in terms of elements and processes it is at a high level of abstraction and there is a strong sense of a serial, unidirectional flow. visually it is presented as two concentric circles, but there is a clear logic to how the process is repeated from engaging publics to having a new idea.

model : idealized scientific research activity lifecycle model (patel, )

the scope is again of the whole research lifecycle, specifically for science. the use of the word idealised in the title implies a prescriptive intent. the model is intended to show processes of a typical physical science experiment project and some idealised stages for supporting long-term research data management (patel, ). in terms of elements and processes there is quite a bit more complexity than the previous models. further, there is a definite sense of processes going on simultaneously in different areas, and some of the arrows suggest multi-directional flows. activity is going on across a distributed set of systems, not in a single one. it is unusual compared to most of the models in being presented in the form of a mass of text in rectangular boxes linked by straight lines, suggesting that it has been developed out of linking together a number of chains of steps. however, in fact there is a strong sense of circularity.
four key heterogeneous elements (types of activities) are included: research, publication, administration and archiving – and linked by information flows. each type of activity is colour coded to help the reader pick out related processes; but the heavy textual orientation makes the whole diagram feel very complex. in seeking to present significant detail the effect is of considerable complexity, and on first viewing the visualisation is hard to read. this may have more to do with the quality of the graphic design than with the underlying concept. legibility might also have been enhanced by organising some of the elements into clearer stages, e.g., there is no link between ‘publications databases’ and ‘publish research.’

model : the integrated scientific life cycle of embedded networked sensor research (pepe et al. )

again, the model’s scope is the whole research process, though in this case it relates to one specific research community. like most of the other models the cycle is closed, unidirectional and serial. it is a closed circular model of the research lifecycle at a medium level of abstraction, involving three sequential processes: experiment design and device calibration; data capture, cleaning, and analysis; and publication and preservation. what seems to be indicated by this visualisation is the importance of seeing design, analysis and publication as three relatively discrete processes. the low use of text and colour reflects little attempt to lend complexity to the representation. although the model is presented visually as circular, the potential link between last and first stage is not clear – since there is no focus on new idea generation or data reuse which might restart the process. thus in reality the three sub-processes seem to be linked together in a chain process, ending with publication.
model : e-science and the lifecycle of research (humphrey, )

despite the title of this model implying it is about the whole of research, its main focus is data, its use and re-use, not all aspects of research. the main chain of activity is from study concept and design, through data collection, processing, access and dissemination to analysis. this is at a high level of abstraction. above the data access chevron, a data repurposing loop starts. thus while the lifecycle is presented visually as a chain in that the main processes are indeed set out in order, the looping arrows point to data repurposing as a circular process, as well as suggesting a parallel process of research outcomes stimulating new study design. according to the author, the model aims to provide a better understanding of the relations between stages in research and increase the awareness of potential information losses between stages (humphrey, ). the focus on data loss is interesting; in reality, however, the concept of data loss is not very well visualised – it is supposedly indicated by gaps between each main activity in the chain, but this is not obvious to the viewer. as with most of the other models there is a strong sense of unidirectionality and seriality. because it is presented in rectilinear forms with reliance on text it does not convey much sense of an organic process.

model : create and manage data (corti et al. )

in terms of scope, unlike most of the other models it is a lifecycle purely concerned with research data. it is specifically for social science research. it is a data-centric view of the research process, thus it does not show the wider context of how a project is conceived or planned (and in this sense is not optimised for the researcher to link to their own process).
in terms of elements and processes it has a low level of abstraction, with much detail provided in accompanying text. it is largely closed, unidirectional and serial. although presented as a circle it could be considered a chain matrix, because it consists of a series of stages, with some (textual) detail offered of activities that occur at each stage. although presented as a circle there is no real basis for a “rebirth”: there is a discontinuous jump between making one’s own data accessible (an end of project activity) to then searching for that of others (implying a new project).

model : dcc curation lifecycle model (higgins, )

in terms of scope, unlike the other models the dcc curation lifecycle is specifically concerned with data curation only. it focuses not on the life of data from the researcher’s point of view, but on its preservation: indeed, it seeks to instantiate the oais reference model, a widely cited conceptual framework for digital archives (higgins, ). the intention is to define how things should be done, so it is a prescriptive model. it does not encompass the whole research process, such as idea generation. “conceptualise” as the starting point for data creation is symbolically set outside the main circular cycle. indeed, the model does not examine what the researcher might do in terms of collecting, manipulating etc. data. data creation is mentioned because the advice is that data has to be managed from birth, but the focus is on processes after its original purposes of collection. this makes it poor for explaining research services to end-users, but it is a strong representation of how data can be curated with an emphasis on reuse. in terms of elements and processes, it gives considerable low level detail of processes. it is closed and unidirectional, but has a strong sense of multiple processes happening simultaneously.
if only because it is concerned with one specific activity, activities seem to be happening within a single centralised system, unlike most of the other models. it is hard to understand in what sense “transform”, the last step in the cycle, can then lead on to “create/receive”, so it is arguably, like the oais reference model, essentially a chain or number of chains, even if presented visually as circular and sequential. sub processes are nested within the circle, thus producing an onion or target model, a type of visualisation not identified by ma and wang ( a,b).

discussion

the nine models analysed here represent a wide range of approaches and purposes in representing research and/or data: from abstract representations of the whole research lifecycle through to much more detailed models of data creation and reuse, and even just of data curation. each has its strengths relevant to its particular purpose. some are specific to a particular academic field or institution, others seek to generalise to any form of research. although not an exhaustive picture of the range of types of model that have been developed in the last few years in the research/research data arena, the analysis seems to support the value of the descriptive framework synthesised from previous authors. a few patterns are immediately apparent. sometimes the title of the model misrepresents the true scope, e.g., where a data lifecycle model is described as a research lifecycle (model ). similarly, it seems quite frequently to be the case that the visualisation misrepresents the process, e.g., is circular when in fact there is little explanation of how the loop is resumed (models , , ) or is set out visually as a chain when there are strong suggestions of circularity (model ). more complex models seem to be more successful when they clearly identify sub-cycles or distinct stages.
the dcc’s onion-type model, as a variant on the circular model, is a type of visualisation not mentioned by ma and wang ( a,b). the ukda model shows how a circular model can also have a matrix element, with more detailed descriptions of each stage. the strengths of the lifecycle model become clear through the analysis. it is a simple and understandable visualisation. its typically curvilinear shapes imply organic rather than highly managed or rationalised processes. as a means to make sense of research it offers more insights than simple lists of scholarly primitives (unsworth, ) or activities (palmer and cragin, ) because we need to have some notion of how such elements fit together. a temporal approach, often identifying discrete stages and their order, is the obvious way to link them. lifecycles also improve on the complex “information flow maps” presented in rin’s ( ) descriptions of life science research processes. these may reflect reality more fully, but are not as memorable or understandable, because they seem more like snapshots in time and do not give us a sense of research as a movement of stages towards an end-goal. furthermore, the strength of the lifecycle is that to some extent it reflects how researchers themselves perceive research. for example, in showing different stages of research they echo many research methods books which also represent research in flow chart form, moving across a number of stages (e.g., pickard, ) and visualised in what ma and wang ( a,b) would categorise as a chain. at a fundamental level much research is driven by the need to produce an output within the scope of a time limited project and so is progressive and linear. jeng et al. ( ) found that many researchers when asked to draw their research process represented it in a chain form. 
circular lifecycles can also be seen as having a strength in improving on the visualisation of research as a chain, by expressing the desire for data reuse, stressing that in some sense the process is to be repeated. hence briney ( ) writes about the “old data lifecycle” as picturing the purpose of data as achieved when a paper is written. the “new data lifecycle adds data sharing, preservation, and data reuse as steps in the research process.” (briney, : ). yet on close inspection many of the data oriented models examined here do not really capture a circular process. at a pragmatic level the organisational form of a lifecycle is a good way of talking about delivering services at the point of need/within the user/researcher workflow (carlson, ). it reflects a general recognition of the need to reflect patterns in the life of the user, rather than expecting users to fit into the life of the library (connaway, ). some models go further in capturing complexity, e.g., the “idealised scientific research activity lifecycle model”. this arises from seeking to represent a number of processes occurring simultaneously, multi-directionality, and relying on text to convey detailed activities. a counter trend is to boil things down to a minimum of basic elements, ones that resonate with all researchers regardless of discipline or method. this is exhibited, for example, in many university web sites where the simplest categories are used to organise an explanation of rdm such as “create, organise, keep, find and share” (university of leicester, ). nevertheless, there are weaknesses with the lifecycle model for representing research and research data. in particular, most of the models are closed, serial and unidirectional. this cuts against what we know of the typical character of research.
in the real world research is often accepted to be:
unique: each piece of research has its own pattern, shaped by the context of the research, particular choices in research design and contingent events.
shaped by discipline/sub-discipline and by methodology: there is a strong sense that the process of research differs between different academic “tribes”, particularly because of the use of different research methodologies. research methods textbooks suggest that at a high level quantitative and qualitative research may have fundamentally different structures. but there are also many mixed methods research designs (creswell and plano clark, ).
iterative and non-linear: research tends to be based on repeating steps a number of times or going back and forwards between different stages, and most significantly, it is non-linear. it may simply not progress by clear stages. stages may run in parallel or could be skipped entirely (pepe et al., ). a researcher may hit a problem and have to repeat steps early in the cycle (mattern et al., ) or plan the research as highly iterative. furthermore, at some point repeating the same method that has been used before has to break down and a new paradigm be constructed. so a linear narrative of research neglects the creativity and intuition that is also an important part of research. by definition there cannot be any recipes or prescriptions for the exploration of the unknown.
heterogeneous: research involves multiple types of information and data. the rin/nesta ( ) information flow maps express well the sheer variety of types and sources of data used in a single field of research; lifecycles typically do this less well.
open: new data and material can be drawn into research at almost any time (e.g., collecting new data or finding new literature). particularly in qualitative research with an exploratory or emergent design, there is a sense that new material can be brought into the research even at a late stage.
distributed: the infrastructure is typically not a unified one. this is true whether that be technical infrastructure, such as tools or storage spaces, or socio-technical infrastructure, e.g., the people, communities and places that make research possible.
concatenated: researchers often pursue an interest across multiple contexts, in multiple projects. the nature of the intellectual journey here is much richer and more complex to summarise than that encompassed by one lifecycle.
messy: this characteristic captures a sense of real world disorganisation, unplanned and unexpected outcomes (brannen, ). research does not play out in a predictable way. research is a bricolage: making the best of the material to hand to create a new understanding of a phenomenon (berry, ).

lifecycle models are explicitly abstractions and simplifications; that is their value. they clarify by making relevant simplifications (carlson, ). in terms of the elements and processes dimension they tend to represent research as homogeneous, closed, unidirectional and serial. in terms of visualisation they are simple, with low use of colour and text. in doing so they mask much of the complexity captured in the adjectives above. hence it is a weakness of many of the models that, though they represent research as containing heterogeneous elements and existing on distributed architectures, they see it as a rather closed system. they do not represent wider processes and influences. typically, they are unidirectional and based on a series of pre-set, serial stages. none of them were evolving spirals, which would have reflected deepening understanding and learning. this begins to open up a more critical viewpoint on the lifecycle model, which despite its proliferation, has limits of explanatory power. lifecycles may make most sense where research is seen as a time limited project.
and indeed this may be an increasing reality, one reinforced by funding, and academic promotion and tenure structures. it is consistent with some views of research that brew ( ) found in her study of the experience of research, such as seeing it as a series of steps (dominoes conception) or as a social market place (trading conception). however, brew ( ) also found that some researchers see the experience of research as a journey of personal transformation. a lifecycle is weak in capturing this. rather, we have resonances of the much reproduced visualisation “the island of research”, which imagines research as an exciting but hazardous journey through an exotic island landscape (harburg, ). here the process is complex, uncertain and messy; and research is understood as an intellectual but also an emotional, even physical challenge. a lifelong career of concatenated research, where rather intangible underlying themes are explored in an open ended way, has a very different quality to the individual project with its clear time horizon and outputs. researchers’ career narratives may often be presented as an open ended quest for knowledge. in a study at university of pittsburgh libraries one participant represented research as an “unending, repeating cycle” (mattern et al., : ). in the course of this, resources are drawn on in an open way, are heterogeneous, and the infrastructure is distributed. this is part of what is lost in many lifecycle models: another weakness. in the pittsburgh study this participant drew their research as a daisy shape, with activities around an enduring core of the scholar’s own interests (jeng et al., ). she is quoted as saying “in order to unlock a phenomenon, i’ve got to understand who i am as a researcher” (mattern et al., : ). from a theoretical viewpoint it is important that the concept of research in lis reflects this complexity.
it may be that researchers respond more strongly to these types of representations than to visualisations that are more simplistic and reductive. a critical point is where the lifecycle model turns full circle. in a few cases we have seen that the notion that there is a restart of the cycle has not really been explained. the real nature of how a process might be iterated is not developed. if we are following data, it may be reused but it is likely to be in a different context; often by another researcher. the simple circle of reproduction pattern does not capture this complexity. ma and wang ( a,b) emphasise that a commonality of writing about lifecycles is that the value of data is typically seen as declining from the moment it is created – a characteristic best captured in the wave model. on the other hand, if we are following a research lifecycle, it will never be simply asking the same question again. if we are building knowledge, a better way to represent growing understanding would be some sort of upwards spiral, since the whole point is that overall understanding is being accumulated. one thinks immediately of the spiral models in knowledge management such as the seci model proposed by nonaka, toyama and konno ( ). this after all is a model of knowledge creation: should not models of research also capture this upward spiral? it is noticeable that this is one of ma and wang’s ( a,b) types of lifecycle visualisation that is not found in this collection of lifecycles. from the information professional perspective, the (organisational and community) lifecycle is appealing as a way of bringing the library into the scholar’s “workflow”. but that currently popular term seems rather reductive, and suggests standardisation, an administrative or it perspective on research.
for some researchers the most engaging and highly meaningful aspects of research are not captured by the idea of workflow. so while this approach goes further in seeing research from the researcher’s perspective, the lifecycle still often lacks the inner grasp of research as a personal transformation or intellectual journey. a further weakness of the lifecycle model, is that within such models data seems to be a neatly bounded “thing”. data is being represented as a spreadsheet or output that appears to exist unproblematically as a distinct entity and is worked on through a series of stages. yet what we know about research data is the way it changes itself through the lifecycle: data is collected, quality checked, selected, combined, transformed (e.g., through simulations), it may be embargoed or released in edited form. the ukda model comes nearest to capturing some of this. but the other models we have looked at here do little to capture this sense of data as changing and relational (haider and kjellberg, ). even the dcc model tracks data as if it were a single entity. yet this is to blackbox one of the key information management challenges. we might also want to recognise the way that research data is managed within a wider personal information collection. thus data could be just one element within a scholar’s pics related to research and wider pics for other tasks such as teaching and administration (al omar and cox, ). this thought directs our attention to another alternative viewpoint and weakness in the lifecycle approach. this arises where we focus on the data itself as mobile across different social contexts. this is a perspective captured in the metaphor of the data journey (bates et al., ). here we follow data, from its creation, often outside a research context as it is moved, transformed and repurposed in the different domains within which it is used. to exemplify this journey, bates et al. 
( ) refer to the way weather data can be created in different contexts such as in weather stations or through extracting historic data from ship logs, and later used by the national meteorological office, but also move to be exploited in future markets in the city of london. movement across contexts is accompanied by mutability in the content and meaning of the data as it is combined and managed differently. data is created by managers of a local weather station in a spirit of openness, but is much more closely protected by others in the chain. bates et al. ( ) visualise the journeys of weather data within and outside the research context with an underground map type diagram with different coloured lines and different stops. this is actually a chain representation, with sharp disjunctions as data is fundamentally transformed in different contexts. it articulates scepticism about the meaning of open data. it is a sharp reminder that many cyclic models are driven by policy prescriptions about reuse. most of the models we have looked at are project oriented, reflecting the context of the use of data (and often its first creation) in one research project. the journeys model prompts us to consider the wider life of the data; just as the personal journey model prompts us to think about the wider narrative of the data in the researcher’s career.

conclusion

this paper presents a systematic investigation comparing the proliferating number of lifecycle models that have been generated in the area of research support and research data management, both in the peer reviewed and practitioner literatures. the analysis of the models revealed the very differing perspectives on research that are current and the different value of these, as well as reminding us that no model can capture a comprehensive viewpoint.
on a practical level, the analysis will help practitioners select or design models appropriate to the task at hand. a number of radically alternative visualisations/metaphors were also considered, such as the perspective held by many researchers of research as a transformational journey (which might be best represented as a spiral) and the rather discontinuous journey of data itself, which could be visualised in subway map form. given the complexity of both research and data, our discussion has revealed both the flexibility of the lifecycle and some of the perils in over-reliance on this one metaphor. burgi and roos ( ) suggest that when dealing with complex concepts, multiple metaphors are needed to avoid being trapped into simplistic assumptions. the lifecycle idea is an extremely useful metaphor, but it tends to encourage thinking that research processes are highly purposive, unidirectional, serial and occurring in a closed system. research is often not like this, and the analysis has exposed the limited thinking created by such an assumption. the lifecycle model also often implies a repeated cycle when there is no real basis for this. the conclusion must be to suggest a need to add other visualisations for research to our repertoire of conceptual models. the knowledge spiral and the data journey map are just two such examples that reveal that viewing research through different metaphors enriches our understanding. this is important theoretically because lis needs to develop a convincing understanding of the research process. it is important practically too, because while lifecycles are increasingly used to explain service offerings, the analysis shows that this may not always reflect researchers’ own understanding of the research process. in failing to do so they can alienate potential users. the paper developed and tested a framework for systematic comparison that can be reused or provide the basis for further study.
this highlights three sets of interdependent features, around scope and point of view; elements and processes; and visualisation. these features seem to capture the degree of relevant complexity retained in each model. as lis continues to unravel the complexities of research it seems likely that more models will emerge, helping us to build up an even richer account of research. in this context there is great value in asking researchers themselves about the metaphors through which they understand research. the drawing based method used at pittsburgh by jeng et al. ( ) and mattern et al. ( ) is a useful approach to elicit such representations. clearly what the researcher is being asked to represent is key: it could usefully try to get a representation of a research career, not just an individual project. developing the method with academics from a wider range of subjects and levels of experience would enable us to gain more insight into the nature of research. new powerful visualisations to explain research will also help all stakeholders understand their role and explain the place of services in the research process.

references

al-omar, m., & cox, a. m. ( ). scholars’ research-related personal information collections: a study of education and health researchers in a kuwaiti university. aslib journal of information management, ( ), – . http://doi.org/ . /ajim- - -
ball, a. ( ). review of data management lifecycle models. bath: university of bath.
bates, j., lin, y.-w., & goodale, p. ( ). data journeys: capturing the socio-material constitution of data objects and flows. big data & society, ( ). http://doi.org/ . /
berry, k. ( ). research as bricolage: embracing relationality, multiplicity and complexity. in k. tobin and j. kincheloe (eds.), doing educational research: a handbook, rotterdam: sense publishers, - .
brannen, j.
(ed.). ( ). introduction. in j. brannen (ed.) mixing methods: qualitative and quantitative research. routledge, xi-xvi.
brew, a. ( ). conceptions of research: a phenomenographic study. studies in higher education, ( ), – . http://doi.org/ . /
briney, k. ( ). data management for researchers: organize, maintain and share your data for research success. exeter: pelagic publishing.
carlson, j. ( ). the use of life cycle models in developing and supporting data services. in j. m. ray (ed.), research data management. practical strategies for information professionals. west lafayette: purdue university press, - .
committee on earth observation satellites (ceos) working group on information systems & services ( ) http://ceos.org/document_management/working_groups/wgiss/interest_groups/data_stewardship/white_papers/wgiss_dsig_data-lifecycle-models-and-concepts-v - _apr .docx.
connaway, l. s. ( ). introduction. in l.s. connaway (ed), the library in the life of the user: engaging with people where they live and learn. dub: oclc, i-ix. retrieved from http://www.oclc.org/content/dam/research/publications/ /oclcresearch-library-in-life-of-user.pdf
corrall, s. ( ). designing libraries for research collaboration in the network world: an exploratory study. liber quarterly, ( ), – .
corti, l., van den eynden, v., bishop, l., & woollard, m. ( ). managing and sharing research data: a guide to good practice. los angeles: sage.
creswell, j. w., & clark, v. l. p. ( ). designing and conducting mixed methods research ( nd ed.). thousand oaks, united states: sage publications.
dempsey, l., malpas, c., & lavoie, b. ( ). collection directions: the evolution of library collections and collecting. portal: libraries and the academy, ( ), – . http://doi.org/ . /pla. .
dooley, j. ( ). the archival advantage: integrating archival expertise into management of born-digital library materials.
dublin, ohio: oclc research. retrieved from http://www.oclc.org/content/dam/research/publications/ /oclcresearch-archival-advantage- .pdf
grigorov, i., carvalho, j., davidson, j., donnelly, m., elbaek, m., franck, g., jones, s., melero, r., knoth, p., kuchma, i., orth, a., pontika, n., rodrigues, e. and schmidt, b. ( ). research lifecycle enhanced by an "open science by default" workflow. doi: . /zenodo.
haider, j., & kjellberg, s. ( ). data in the making: temporal aspects in the construction of research data. in k. rekers, j.v. and sandell (ed.), new big science in focus: perspectives on ess and max iv (pp. – ). lund university.
harburg, e. ( ). the island of research. american scientist, ( ), .
higgins, s. ( ). the dcc curation lifecycle model. international journal of digital curation, ( ). http://doi.org/ . /ijdc.v i .
humphrey, c. ( ). e-science and the life cycle of research. retrieved from https://era.library.ualberta.ca/files/bvq zn q/lifecycle-science .pdf
jeng, w., mattern, e., he, d., & lyon, l. ( ). unpacking the “black box”: a preliminary study of visualizing humanists and social science scholars’ data and research processes. in iconference proceedings. http://doi.org/ . /
l’hours, h. ( ). workflows and lifecycles: the long-term view from a national data centre. rdmf , dcc june . retrieved from http://www.dcc.ac.uk/sites/default/files/documents/rdmf /herve.pdf
lenhardt, w. c., ahalt, s., blanton, b., & christopherson, l. ( ). data management lifecycle and software lifecycle management in the context of conducting science. journal of open research software, ( ), – . http://doi.org/ . /jors.ax
lyon, l. ( ). ebank uk: building the links between research data, scholarly communication and learning. ariadne. retrieved from http://www.ariadne.ac.uk/issue /lyon/
ma, f., & wang, j. ( )a.
a literature review of studies on information lifecycle i: the perspective of value. journal of the china society for scientific and technical information, ( ), – . ma, f., & wang, j. ( )b. the review of studies on information lifecycle ii: the perspective of management. journal of the china society for scientific and technical information, ( ), – . mattern, e., jeng, w., he, d., lyon, l., & brenner, a. ( ). using participatory design and visual narrative inquiry to investigate researchers’ data challenges and recommendations for library research data services. program, ( ), – . http://doi.org/ . /prog- - - möller, k. ( ). lifecycle models of data-centric systems and domains. semantic web, ( ), – . http://doi.org/ . /sw. . nonaka, i., toyama, r., & konno, n. ( ). seci , ba and leadership : a unified model of dynamic knowledge creation. long range planning, ( ), – .�https://doi.org/ . /s - ( ) - palmer, c. l., & cragin, m. h. ( ). scholarship and disciplinary practices. annual review of information science and technology, ( ), – . http://doi.org/ . /aris. . patel, m. ( ). idealised scientific research activity lifecycle model. university of bath. retrieved from http://opus.bath.ac.uk/ / /i s _researchactivitylifecyclemodel_ .pdf pepe, a., mayernik, m., borgman, c. l., & sompel, h. van de. ( ). from artifacts to aggregations : modeling scientific life cycles on the semantic web, ( ), – . http://doi.org/ . /asi pickard, a. j. ( ). research methods in information ( nd ed.). london: facet publishing. research information network (rin). ( ). patterns of information use and exchange: case studies of researchers in the life sciences. rin/british library http://www.rin.ac.uk/system/files/attachments/patterns_information_use-report_nov .pdf research information network (rin)/ national endowment for science technology and the arts (nesta). ( ). open to all? case studies of openness in research. 
retrieved from http://www.rin.ac.uk/system/files/attachments/nesta-rin_open_science_v _ .pdf
university of central florida libraries research lifecycle committee. ( ). the research lifecycle at ucf [online graphic]. retrieved from https://library.ucf.edu/about/departments/scholarly-communication/overview-research-lifecycle/
university of leicester ( ). data management support for researchers. retrieved from http://www .le.ac.uk/services/research-data
unsworth, j. ( ). scholarly primitives: what methods do humanities researchers have in common, and how might our tools reflect this? retrieved from http://people.virginia.edu/~jmu m/kings. - /primitives.html
williams, c. ( ). managing archives: foundations, principles and practice ( st ed.). oxford: chandos publishing.
wilson, j. ( ). mapping life-cycle models to the university – in theory and in practice. rdmf meeting, th june, . retrieved from http://www.dcc.ac.uk/sites/default/files/documents/rdmf /james.pdf
wissik, t., & Ďurčo, m. ( ). research data workflows: from research data lifecycle models to institutional solutions. in clarin selected papers, linköping electronic conference proceedings, annual conference , october – , , wroclaw, poland (pp. – ). linköping university electronic press, linköpings universitet. retrieved from http://www.ep.liu.se/ecp/ / /ecp .pdf

name of the model
1. research lifecycle (rin & nesta )
2. the research lifecycle at ucf (university of central florida libraries’ research lifecycle committee, )
3. the scholarly knowledge cycle (lyon, )
4. research lifecycle enhanced by an "open science by default" workflow (grigorov et al. )
5. idealized scientific research activity lifecycle model (patel, )
6. the integrated scientific life cycle of embedded networked sensor research (pepe et al. )
7. e-science and the lifecycle of research (humphrey, )
8. create and manage data (corti et al. )
9. dcc curation lifecycle model (higgins, )

| dimension | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| scope and point of view | | | | | | | | | |
| subject matter | research lifecycle (with material outputs table) | research lifecycle | data and information management, despite the use of the term “knowledge” in the name | research lifecycle | research and data lifecycle | research lifecycle | although the title suggests it is a research lifecycle model, it has a strong focus on data use | data lifecycle | data curation lifecycle |
| individual / organization / community based | community based | organization based | individual / organization based | community based | community based | community based | community based | community based | community based |
| prescriptive or descriptive | descriptive | mixed | mixed | prescriptive | prescriptive | descriptive | descriptive | mixed | prescriptive |
| elements and processes | | | | | | | | | |
| level of abstraction | high | low | medium | high | medium | medium | high | low | low |
| homogeneous / heterogeneous | homogeneous | heterogeneous | heterogeneous | heterogeneous | heterogeneous | heterogeneous | homogeneous | homogeneous | homogeneous |
| closed / open | closed | closed | open | open | closed | closed | closed | closed | closed |
| centralized / distributed infrastructure | distributed | distributed | distributed | distributed | distributed | distributed | centralized | distributed | centralized |
| uni- / multi-directionality | uni-directional | uni-directional | multi-directional | uni-directional | multi-directional | uni-directional | uni-directional | uni-directional | uni-directional |
| seriality / simultaneity | serial | serial | high simultaneity | serial | high simultaneity | serial | serial | serial | high simultaneity |
| visualisation | | | | | | | | | |
| lifecycle type | circular or circular matrix | integrated of four circular cycles | integrated of two circular cycles | circle | circular/chain | circular presentation but could be seen as chain | chain with circular elements | chain matrix | chain onion model |
| use of colour | low | high | low | low | medium | low | medium | low | medium |
| visuality / textuality | low text | low text | medium text | medium text | high text | low text | low text | low text (but matrix is highly textual) | low |
| recti- / curvilinearity | curvilinear | curvilinear | mixed | curvilinear | rectilinear | curvilinear | rectilinear | curvilinear | curvilinear |

table . a comparison of nine lifecycle models

lemmatization for variation-rich languages using deep learning
m. kestemont, g. d. pauw, r. v. nie and w. daelemans. digit. scholarsh. humanit. doi: . /llc/fqw

in this article, we describe a novel approach to sequence tagging for languages that are rich in (e.g. orthographic) surface variation. we focus on lemmatization, a basic step in many processing pipelines in the digital humanities. while this task has long been considered solved for modern languages such as english, there exist many (e.g. historic) languages for which the problem is harder to solve, due to a lack of resources and unstable orthography.
our approach is based on recent advances in…

institutional repositories: strategies for the present and the future
bepress, from the selectedworks of jean-gabriel bankier
available at: https://works.bepress.com/jean_gabriel_bankier/ /

this is a preprint of an article submitted for publication in the annual nasig conference proceedings volume of the serials librarian, published by taylor & francis.
copyright to the article is owned by the north american serials interest group (nasig). the serials librarian is available online at: http://www.informaworld.com.

institutional repositories: strategies for the present and future
strategy session, nasig rd annual conference june ,
by
connie foster, professor and head, department of library technical services, western kentucky university
jean-gabriel bankier, president, berkeley electronic press
glen wiley, metadata librarian, cornell university library

abstract
institutional repositories are tools to support, disseminate and showcase the scholarly communications and intellectual life of an institution. a successful repository requires planning and a defined focus, as well as an attractive name and design. to achieve success, the ir must serve faculty on faculty’s terms; the librarian’s role is to collaborate with faculty and ensure that the services of the ir meet their needs. foster, bankier and wiley offer strategies for success drawn from their work creating successful institutional repositories.

introduction
have you ever searched the departmental web site of a prolific science or psychology faculty member who receives frequent accolades for presentations and publications?
can you find their research and link to it, or even pinpoint their recent articles under departmental listings? results of this endeavor can be quite frustrating. while an institutional repository (ir) will not solve all of these access and retrieval difficulties, it can offer a way to bring together much of the intellectual and creative efforts of a university in one place and establish a permanent path to discovery and open accessibility for faculty and student research projects to researchers worldwide.

while some academic irs arose from the need to combat or present an alternative to the high cost of publishing journals and to engage in broad-based aspects of scholarly communication, government funding for research projects, publisher page charges, etc., not all institutions establish or implement an ir because of these issues and discussions. a rationale and definition could be one like that of western kentucky university’s (wku) topscholar™: a digital research repository, dedicated to scholarly research, creative activity and other full-text learning resources that merit enduring and archival value and permanent access within a centralized database that supports, reflects, and showcases the intellectual life of the university through easy searching and retrieval, and universal access and indexing.

getting started
successful repositories involve planning, commitment, and a defined focus. what follows are some guidelines that convey elements of the authors’ experience, success, and serendipity.

start with a task force to present recommendations. the idea or directive will have come from somewhere, so get a planning group going from day one.

develop a statement of purpose to convey what an institutional repository is and what will be placed there. what is the collection policy? where will the ir fall within library and university priorities, and who will manage it?
have a philosophical and financial commitment from the “top down” in the university administration and the library leadership for this collaborative effort. many libraries are fortunate to have well-staffed systems departments to implement an ir; others must rely on the university’s it staff. staffing is a major consideration in whether to customize and host one’s own repository or outsource to a ready-made publishing platform and server space. how will the system run, be serviced and supported, be customized, be enhanced? what will the response time be to problems and changes?

alma swan and sheridan brown, open access self-archiving: an author study (truro, u.k.: key perspectives ltd., ). available at http://www.keyperspectives.co.uk/openaccessarchive/reports.html.

weigh the financial implications of a hosted and a local (open source software) repository. what do you get for what you are paying? nothing is truly free. both architectures have associated costs. the hosted system has an annual subscription fee; the local one has staffing considerations and server space. calculate the total cost of ownership. if the initiative receives special funding for a year’s pilot project, as many institutions do, and the ir succeeds, what plans are in place for subsequent years? most of the time, the library becomes the funding source.[i] regardless of how costs are distributed, think about long-term options.
acknowledge and be knowledgeable about the university’s intellectual property policies and ethics policies. create a separate copyright form for authors to deposit content in the repository. the university’s lawyer or another administrator with suitable background can draft, review and also work with the service provider to craft suitable terms not only for the copyright form but for the service contract for the university if using an outside provider.[ii]

consider the name. name and publicize the repository as something other than an “institutional repository.” early in the development of repositories on campuses, susan gibbons stressed this point, believing that the phrase conveys a mandate-like quality.[iii] names gravitate towards “scholar,” such as scholarly commons@, scholarworks@, escholarship@, and digitalcommons@. ur research is the name of the university of rochester’s repository, and scholarsarchive at osu represents oregon state university. wku’s is named topscholar, and glen wiley found that renaming cornell’s ir from dspace@cornell to ecommons@cornell contributed to a sea change in perception about the ir. various faculty approached him to commend him for “leaving that dspace” and to express their excitement about “ecommons.” redefining the ir as the “research showcase” specific to the university creates a sense of ownership and excitement across the campus.

make it look good. the appearance of database results, a “vanilla” instance, and the word “pilot” generally look bad to faculty and imply that the library has not fully committed to the project of showcasing and promoting its faculty’s research. cornell’s ir achieved a great degree of success from its redesign. it went from looking like a standard, out-of-the-box instance of dspace to having a visual identity unique to cornell.
[i] the miracle census reports that in all applicable cases responding universities ranked as their top two sources of funding “special initiative supported by the library” and “costs absorbed in routine library operating costs.” see markey et al. “census of institutional repositories in the united states.” available at: http://www.clir.org/pubs/abstract/pub abst.html.
[ii] the university of georgia’s law school offers an instructive flow chart. see watson, carol a. and james m. donovan. “behind a law school’s decision to implement an institutional repository.” available at: http://digitalcommons.law.uga.edu/law_lib_artchop/ .
[iii] these points were modified from the institutional repository task force report at western kentucky university, http://digitalcommons.wku.edu/top_pres/ /.

[figure: cornell’s dspace implementation, before redesign.]
[figure: cornell’s repository after redesign.]
[figure: macalester college’s original digital commons design.]
[figure: macalester college’s digital commons after redesign.]
strategies for success
what makes an ir flounder? first, one must understand that campus “awareness” does not equal campus participation. in other words, just because faculty know about the ir doesn’t mean they will flock to it. generally, the compelling value propositions for the library, like persistent urls, handles, and long-term accessibility, are not as attractive to faculty. it is not enough to tell faculty and students about the new “features” of the ir; they must believe it is vibrant, active, and offers them useful services. dorothea salo, digital repository librarian for the university of wisconsin, explains, “the institutional repository and services associated with it must provide value to faculty on faculty terms before it will see more than scant, grudging use.”[iv]

the librarian at the helm is not only responsible for keeping the ir up and running; he or she is also a collaborator and a promoter. the librarian must know how to speak to faculty’s needs and values, often via one-to-one contact. frame the library as a service provider, and begin to ask faculty, “what can i do for you?”

faculty want clerical and consultative services. these services could include scanning, mediated deposits, copyright advising and rights-checking. easy as you think it may be for faculty and students to upload content, for initial and even subsequent database building, do it for them! paul royster, scholarly communications librarian at the university of nebraska, lincoln, asks faculty members to send him a cv, then searches and uploads their work himself.

remember, the institutional repository does not begin or end with preprints and postprints. faculty flock to opportunities to create original content. widen your definition of content, and begin to consider how “original content,” content created by faculty and published by the library in the ir, can be valuable, both to the careers of individual scholars and to the branding of your fledgling ir.
most repository platforms are full-text indexed by search engines, which offers scholars as well as the publishing institution high discoverability and wide dissemination opportunities. conference proceedings, working papers, newsletters and electronic theses and dissertations are also excellent additions to irs. traditional archival materials are possible areas for content growth. wku recently published several early wku essays compiled by the president in . these are the library’s first project under presidential papers.

e-journal publishing generates additional original content and offers the opportunity to expose paper publications to the digital world. offer faculty the opportunity to transition current paper publications to digital, and provide them help to start new born-digital journals. digital commons has seen a rapid uptake in electronic journal creation and adds at least five new journals a month across its near- institutions. within weeks of initial training, foster fielded a call from a wku professor of exercise science who wanted to start a journal in his field. within a year, the international journal of exercise science began publishing original, peer-reviewed research, and now has a second journal forthcoming.

[iv] salo, dorothea. “innkeeper at the roach motel.” library trends : (fall ). available at: http://digital.library.wisc.edu/ / .
faculty are also intrigued by personalized services like grad student e-portfolios, or individual bibliography pages, both of which act as additional channels for engagement.

how do you prove to the university community that the ir is working? the solution is to give them ways to assess the impact of their scholarship. scholars are most persuaded by measures of their own success and the success of their peers. cornell and digital commons have found that download reports are absolutely essential to creating and maintaining investment in the ir.[v] in addition to providing these monthly individualized usage statistics, digital commons offers top downloads, papers of the day, and most recent downloads on the home pages of its hosted irs. with its new user interface design, cornell is looking to add ecommons’s “greatest hits.”

finally, consider the role of the librarian. should the librarian be tinkering with code and making policy in a backroom? we think not. after the administrative considerations are tackled and you are ready to market the ir, be proactive. in the first stages of building the ir, one must target early adopters: young faculty looking to make their mark, proponents of open access, and faculty who respond to opportunities for self-promotion, to name a few. it is important to seek out those sources of existing and potential original content; it’s everywhere. if the library has a large enough staff, subject specialists and/or liaison librarians often work one-on-one with their faculty members to generate new content ideas and find existing content. gradually identify knowledgeable, dependable series administrators across campus who can assume certain responsibilities, and top-level administrators (usually in the library) who can distribute the workload as it increases.
digital commons librarians promote opportunities to publish electronic journals, conference proceedings and working paper series, and they give paper-published journals the opportunity to transition to paper-electronic hybrids, or go fully digital. library staff members at cornell have successfully begun to recruit from other sources. cornell’s ecommons is now home to web site archives, materials from the established national conference usain (united states agricultural information network), and materials that were “losing” their original home, for example, technical reports from two old irs.

don’t forget to market success. show it off! scholars are persuaded by the use patterns and successes of their peers. in addition to offering download reports, share your successes. talk up the values and benefits. create a marketing plan, regular outlets for implementation, training, and information, and ongoing meetings with individuals, departments, councils, etc. seize every opportunity. create talking points about the benefits and values so librarians and supportive administrators can engage in conversations with faculty. irs represent a new dimension in collection development. librarians are indeed building a database for students to see research (and get their own posted), for faculty to showcase or present ongoing instruction, for primary source documents to be uploaded, and for scholars to create new content and new publications.

[v] for example, doug white, professor of anthropology, used download statistics and citation counts of his first issue of structure and dynamics to demonstrate the initial success of the project. see bankier and smith. “establishing library publishing: best practices for creating successful editors.” elpub conference proceedings , p. . available at: http://works.bepress.com/jean_gabriel_bankier/ .

conclusion: the second year and beyond
the experimental phase is over. reality and growth set in: the one-by-one phase. serendipitous moments may occur, like foster’s phone call from that professor wanting to launch an electronic journal of student research in exercise science. be ready for challenges and opportunities. implementation brings to fruition what was recommended, planned, marketed, and launched. strategies, while subject to testing and modification, lay the groundwork with intensive training and initial content. year two builds upon this groundwork, and one-by-one effort grows as the ir becomes an inextricable part of the university’s scholarly landscape. the “container” swells as content increases. within two years, foster began receiving calls from the honors college to promote topscholar™ downloads in its promotional literature; emails from the provost reminding participants to send appropriate presentations from the annual faculty conference, engaging the spirit; meetings with the graduate council that finally result in the upload of masters theses on a regular basis; and, always, those one-to-one contacts.

success will come, but the commitment to nurturing that outcome is relentless. each institution is different, but commitment from administrators is essential. once identified and defined as an integral role for the library (to build this system for the institution), librarians become the true force behind it. keep the connections going; generate excitement about this new pathway of discovery. as charles e.
glassick notes, “the process, the outcomes, and especially the passion of discovery enhance the meaning of the effort and of the institution itself.”vi discovery, accessibility, and permanence are the cornerstones of any institutional repository. the journey is challenging; not everyone will possess the same level of interest as you do, despite incentives. be ready to experience endless and unparalleled opportunities as part of a growing effort in digital scholarship, scholarly communications, and opening up access to content and publishing opportunities that never before existed in this way. digital repositories really do create information possibilities.

vi scholarship assessed (san francisco, ca : jossey-bass, ), .

bepress. from the selectedworks of jean-gabriel bankier. institutional repositories: strategies for the present and the future

trames, , ( / ), , –

mapping life stories of exiled latvians

ieva garda-rozenberga
institute of literature, folklore and art of the university of latvia

abstract. by using the framework of digital humanities and narrative research, i will discuss why and how to map life stories of exiled latvians, what problems can occur during the mapping of oral history, and what are the first/main results and visualizations of the ongoing research. thereby, the basis of the article is the fundamental research in humanities whose approach regarding the use of information technology (it) will be twofold: (1) applicatory – employing it tools and digital humanities methods for data processing, visualization and analysis; (2) reflexive – bringing under scrutiny the opportunities of digital scholarship in oral history. this will allow sharing the new knowledge and ways of research about oral history and migration issues in latvia.
keywords: oral history, migration, narrative cartography, mapping

doi: https://doi.org/ . /tr. . . .

introduction

since when google released google maps, mapping has become a common practice of our daily life – we add missing places, edit information (such as phone numbers and addresses of places), add pictures of places, write reviews and validate information added by other people. therefore, maps seem to be the “conceptual glue linking the tangible world of buildings, cities and landscapes with the intangible world of social networks” (hall and abrams : ). researchers working in the digital humanities have also become interested in the possibilities afforded by interactive mapping technologies. on his blog in , digital historian john levin began collecting links to academic digital humanities gis (geographic information systems) projects in order to “see how space and place are being analyzed” in the digital humanities domain, “and what technologies are being used to do so” (stadler et al. : ). by september , levin had collected links to more than projects (http://anterotesis.com/wordpress/mapping-resources/dh-gis-projects/). currently, the interest in maps and their use in different contexts has become so persistent that, as william buckingham and samuel dennis argue, “a new world of spatial information” (buckingham and dennis : ), promising increased dialogue among cartography, geography, the humanities, and citizens, has been created.

influenced by the growing interest in digital humanities, oral history researchers have begun a wide variety of oral history mapping projects (caquard , high ) by bringing to the forefront the interaction between place, memory, identity and sense of belonging (see, for example, the anti-eviction mapping project ; mapping memories: experiences of refugee youth ).
one such project, with the goal of studying the migration of exiled latvians, is being conducted at the university of latvia’s institute of literature, folklore and art. the project “empowering knowledge society: interdisciplinary perspectives on public involvement in the production of digital cultural heritage” aims to carry out research on the interaction dynamics of digital humanities (dh) and the development of knowledge society by developing a virtual research and involvement laboratory with integrated dh research and participatory tools. one of the research work packages is intended to map culture multimodally. the objective of this workgroup is to create and test the functionality of an it tool for analyzing and visualizing geospatial data. the tool will be used for the analysis of different types of humanities sources, and thus the group has representatives from several branches of the humanities (folkloristics, oral history and literary studies). in the field of oral history, mapping is closely linked with the study of the sense of belonging that roots a person in a particular space/environment, and of how changes in or the loss of a geographic space are experienced. in order to explore these changes, this article will focus on the life stories of those latvians who, fearing the return of the soviet armed forces and a new wave of terror, fled from latvia during world war ii, especially in .

mapping qualitative data

the so-called spatial turn in the humanities was in large part closely linked with the rise of the geographic information systems (gis). it took, however, several decades until the system was adapted to the mapping of qualitative data (interviews, life stories, literary texts, images and sound), thus earning the name qualitative geographic information system (qgis).
today, qgis is employed not only for the integration of qualitative data; it also provides qualitative data analysis methods, such as grounded analysis or discourse analysis (elwood and cope , knigge and cope ). this has allowed researchers to generate knowledge about society and historical, cultural and social processes taking place within society. as a result, in place of performing quantitative spatial analyses, qgis is more likely to be used to interpret and understand people’s lived experience (kwan and ding ).

in step with qgis, cartography also developed by addressing its relationship not only with the related topics of geography and the use of maps, but also by reviewing key words in the field, especially the notion of ‘place’. the understanding of place as a “geographical space that is defined by meanings, sentiments and stories rather than by a set of coordinates” (opp and walsh : ) has recast the role of memory (hague , tasker ). highlighting the role of memory in creating the meaning of a place, which not only acts on but is itself embedded in memory, inscribed and shaped by landscapes, topographies and environment, gastón gordillo concludes that “every memory is, in a fundamental way, the memory of a place” (gordillo : ). this memory is then expressed, for example, via memorial plaques and instruments of cultural memory and, particularly, through stories. in agreement with geographer yi-fu tuan, who argues that “narrative produces place, and place in turn fosters and produces narrative” (tuan ), cartographers fairly quickly and almost unanimously acknowledged that the meaning of a place consists of its geographical coordinates, as well as its historical development and the people and stories associated with it (gordillo , hague , linhard , opp and walsh , pearce ).
since the end of the th century, maps and mapping have also become an analytical tool in history studies, where maps are supplemented especially by personal narratives such as oral histories, life stories, and biographies (knopf , knowles et al. , kwan and ding , linhard , madden and ross ). for example, “for indigenous communities, where oral traditions have mythical, historical, and spatial functions, map becomes the tangible link between the oral story and the ancestral occupation of the land” (caquard and dimitrovas : ). this idea can be applied to other geographical processes, such as migration that takes people away along certain paths, places, borders. however, researchers go beyond geographical information. they listen to personal narratives to find out “the factors that explain why they crossed at specific points, how they reached these points, who was or was not with them, how they traveled, and how the actual outcome of their border crossing differed from the expected outcome” (linhard : ).

migration and narration

tabea linhard and timothy h. parsons have stated that “migration does not just take place, it takes place between spaces, and it is in the precarious and often difficult position of the ‘in-between’ where most migration stories emerge” (linhard and parsons : ). the stories are full of powerful, living memories about life in the native land, about the narrators’ fates and migration to other countries, about the difficulties in settling into a new life, about the ups and downs. the position of the ‘in-between’ also raises the question of what it is that connects a person to a certain place/environment and how changes in – or even the loss of – that geographic space are experienced. this can also be seen in the case of exiled latvians. in the middle of the th century the most common reason for leaving for another country was the second world war and the subsequent occupation of latvia.
as a result of the soviet occupation, approximately , – , refugees fled from latvia at the end of the second world war – it was around per cent of the population (strods , veigners ). approximately , refugees ended up in germany’s western occupation zones (kangeris ), and more than , refugees fled in boats to sweden (jansone and robežniece , lasmane ). as a result, the largest diasporas of latvians were established in the united states, canada, sweden, germany, great britain and australia.

narrated memories play a special role in the creation of a diaspora. many crucial characteristics of a diaspora – such as the history of leaving the homeland, memories of home and a strong group consciousness – are largely created through narration (bela et al. ). the same is true of new experiences, which also have to be included in some form of cultural expression, most often in narratives. when listening to life stories of latvians living abroad, we can see that memories of the first years of life in exile – the so-called transition period, when most people still hoped to return to latvia – are composed of individual events, usually without creating a broader and more detailed narrative. however, as barbara kirshenblatt-gimblett ( ) has pointed out, every exile has a story to tell. first days, months, and years were filled with new experiences – often traumatic, confusing and cumbersome, hence giving rise to countless jokes and anecdotes about culture shock, name change, linguistic and cultural incomprehensibility, poverty and oddities in immigrant nature. therefore, stories play a large role in the lives of latvian exiles, because they allow them to discover, see and define themselves, especially at a time when they had lost everything: their homes, families, livelihoods. in addition, the stories also provide a crucial perspective on the present and the future, serving as the guides for how to exist and how to continue living.
telling stories in the exile community is a very important phenomenon both socially and culturally, because it does not only establish a united perspective on the past but also strengthens group identity. at the same time, telling stories is a process in which narrators recreate events, express their identity, confirm their own values and voice their own hopes and desires (bula ). thus the stories of the latvian diaspora express the cultural traditions that define its members as a group that is recognizable not only by the people of the country in which they live but also by those who have remained in their native land. to conclude, these memories established a bridge between parts of a nation that had been divided by the occupation. memories become a historical source with high moral status, because these memories belong to people who were actually there, people who experienced the events first-hand. life stories hold particular significance in historical situations when a large part of a nation’s population has not lived in their native land for half a century (zirnīte ).

mapping exiled latvian narratives

a year after the “empowering knowledge society: interdisciplinary perspectives on public involvement in the production of digital cultural heritage” project had begun, a study concept, a database of locations and a tool to geocode and map a variety of cultural content have been developed in close cooperation with the programmers and implementers of other project activities. the task of the researchers is to create representative corpora of research sources (folklore, life stories, and literary texts) and analyze them with the newly created tool to test its functionality. the textual corpus that was created for this purpose consisted of a hundred life stories.
these stories (1) reflect the diversity in the paths/geography of latvian migration; and (2) correspond in number to the proportion of recordings by exiled latvians from various different countries held in the garamantas.lv national oral history collection. in other words, the selected stories comprise units from the united states and canada, ten from england, ten from sweden, four from australia, three from germany and three from norway. interviews were made between and by the researchers of the national oral history (noh) archive in collaboration with the american latvian association’s oral history project (hinkle ). there are men and women among narrators. of them were born before , – between and , – after ; in cases the date of birth is unknown. the average length of a life story interview is minutes. in total, almost hours of narration (or transcribed pages) are selected for mapping and further analysis.

an analysis of the corpus identifies three main place types: (1) event places – such as place of birth, place of residence, etc.; (2) episodic places, which do not carry significant meaning in the narrator’s life, but allow us to see the whole picture of the narrator’s geographic perspective; and (3) projected places – real, existing places where the narrator him- or herself has never been and which are usually known only through the stories of parents or grandparents or through other cultural sources, for example, photographs, books, and letters. for example, anda (born in ) left latvia with her parents at the age of seven. she spent the first years, after becoming a refugee, in denmark. later she joined her father, who had a job in england. in the family had the opportunity to move to norway. at the end of the interview, when asked whether she feels any ties with latvia, she said:

yes, very lively. when we were refugees, my mother bought a lot of books. the paper was very bad, but i read all of them.
and then a picture from latvia began to emerge (nmv- ).

the mapping of these places is not easy – places in a life story can take on varied and subtle forms that are often difficult to identify and even describe. some places are the setting for events in the story, others are simply mentioned; some place names are specific (e.g. the name of a city), others are generic (e.g. the lake, the neighborhood); some place names are geographically precise (e.g. an address), others are much less so (e.g. a country); some places are described in rich detail by the narrator, while others are simply named; some place names have disappeared, and others have been modified (caquard and dimitrovas ). nevertheless, an analysis of the body of mappable texts reveals that places in latvia are mentioned quite precisely, often even including house and apartment numbers. however, as the narrators move beyond latvia’s borders as refugees, less attention is paid to place names. of importance is the place from which the path began and the main places where the latvians stayed along the way. for example, although narrators typically mention several longer stops along the way, crossing germany as refugees at the end of the war is very difficult to visualize because the narrators often provide a precise geographic name only for the end points in their journey, places where they spent a longer time and became relatively settled, such as the displaced persons (dp) camps. thus, marta, who was born in in riga, provides considerable detail and nuance as she tells about her childhood, youth and early days of employment until the outbreak of the second world war. she lists event locations quite precisely:

i taught history in the gymnasium [secondary school]. [---] it was a private school. [---] and i worked there until the school was merged with the other german schools – with the beatere [school], with the draudziņa [school] – and was renamed the city secondary school no. .
our classrooms were opposite the opera house, the valters-rapa store. secondary school no. was on brīvības iela at the intersection with stabu iela, across from the cheka... (nmv- )

during the second world war, marta’s husband was conscripted into the legion; the rest of the family decided to go to kurzeme, to liepāja and from there to germany (see figure ). when asked how they ended up in germany, she answers:

by horse to liepāja. and by ship from liepāja to danzig. and in germany, i lived in that first camp for a whole month (nmv- ).

she describes her subsequent journey in broad strokes, providing names for only a few stops along the way:

i didn’t know anything about my husband’s relatives, what had happened to them. but in the end we somehow contacted each other. they had gone to swabia. alright, so now we’ll go there. but we don’t have a horse or anything else. with a small cart, a little backpack on [my] back, we’re pushing that little cart, the baby is in the cart, [my] mother is also next to me. the bombs come, the communists are bombarding the road. dropping bombs, no armies here at all, just people. i even put [my] little girl down on the ground for a moment and lay down on top of her – let the shrapnel fall on me then, but none did. thank god. [---] that was in february of . we were barely dragging ourselves along through the snow, half-starved, and we made it to friedland, where people were being allowed across the border; it was americans and british there. there they gave hot coffee to the adults and porridge to the children. then i collapsed and began weeping. it was like day and night. the humaneness and serenity. it was a huge difference. when things are too difficult, you can’t cry or anything – you’re like a tree, you just walk – but here the humanity appeared. [---] and then we headed further south. to swabia. and then we had the opportunity to arrange travel by train for a part of the way.
life was already somewhat better there. and then we arrived in swabia. [---] in the morning i got a horse and driver to that farm to pick up my mother-in-law. and then from there we headed for the camps quite quickly, because the refugee camps were already being established. our first one was in offenbach (nmv- ).

the analysis of marta’s description of her journey, as well as descriptions in other mapped life stories, reveals the spatial structures of stories. event places are most frequently mentioned in the life stories – each story contains an average of geographically knowable places: homes, neighborhoods, buildings (schools, stations), districts, country. next in frequency are mentions of episodic places (~ ); mentions of projected places are uncommon (~ in each story). most often projected places are parents’ or grandparents’ homes, parishes or counties; they are often accompanied by an inherited sense of belonging, thus becoming intimate and domestic spaces.

however, as barbara piatti ( ) argues, the mapping is not an end in itself but a research tool that should help the investigation of many new questions, for example, what do we achieve by mapping oral history; how to map journeys of the narrators if the only information the life story delivers are some indications about stopovers or intermediate stations – while the rest of the route stays literally in the dark? likewise, my goal is not just to map geospatial information, although it could be a rewarding process and the practical marking of geographical places on a map could yield useful information. the issue at hand is the relationship between a person and his or her space – the perception and experience of a space – which is usually examined within the fields of human geography, cultural geography, environmental humanities, folklore studies, anthropology and many others.

figure . marta’s journey. january , – september ,
meaning of place

the various spatial expressions that embody our personal experience of the surrounding environment and help develop an understanding of these places (caquard : ) create so-called story maps (macfarlane ). in these story maps, memory and emotional episodes are linked with specific physical or imagined locations, thereby making these places qualitatively different centers of meaning (reinsone ), first of all at the time of the interview and again later if the place is digitized. it should be noted that british digital humanities researcher stuart dunn believes that digitizing a location (for example, from cartographic and/or textual sources) changes its nature from object to concept, usually created out of human experience, perception and memories (dunn ). following this line of thinking, not only are houses, apartments, streets, neighborhoods, villages and municipalities created as places in the stories of latvian exiles, but often so is latvia as a whole. digitizing or mapping locations therefore becomes a method to conceptualize latvia as a country or as a nation-state instead of a territorial location. thus, for example, valida says:

i, too, have dreamt a great deal about latvia. and most of all i’ve seen our house there, covered in snow. and the moon is shining. and i think to myself that i ought to paint that scene right away. but i still haven’t gotten around to it (nmv- ).

the temporal aspect is also important to understand the meaning or importance of place. we cannot speak of places as isolated from the past events associated with them – each place in a story is linked both to an event and the chronological boundaries of this event. therefore, places mentioned in life stories – whether event locations, projected locations or merely episodic locations – are recorded along with a time period.
however, at the moment of narration, narrators often do not remember exactly when and where they were or simply do not consider this information worthy of mention. the time in a narrative can be precise or vague (caquard and dimitrovas ). for example, a soldier from the th division of the latvian legion spoke in considerable detail about moving around within germany; however, without mentioning the regiment in which he served, it is impossible to determine when and how long he was in which place. he mentions a few times that “we were there until spring” (nmv- ), but most of the time, temporal markers are missing. yet, the blank spaces in the chronology of his movements do not prevent a landscape of his story from emerging, because the process of moving is as important as the actual arrival or departure (tilley ).

the element of time, however, carries a different meaning when the narrator speaks about projected places. in such cases, the category of time becomes quite unclear and nebulous, carrying an almost symbolic meaning. for example, a time frame may be referred to as ‘during my parents’ youth’ or ‘before they left latvia’. thus, these segments of time include the feeling of a lost home/homeland. as a result, the mapped narrative could be envisioned “as integrations of space and time; as spatio-temporal events” (massey : ). by adding an event and the element of time to a seemingly simple point on the map, we are no longer speaking about isolated locations but rather a landscape as a body of relational places in which the geographical, biographical (kwan and ding : ) and historical planes intersect. in addition, no matter whether the places are individually or collectively experienced (for example, the refugee camps in germany or boat landing sites on the swedish coast), they do not only link the respective places with the past/history (tilley ) but also create their meanings in the present (massey ).
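
the recording scheme described above (a place, its type, and a time period that may be precise, vague or symbolic) can be sketched as a small data structure. this is a minimal illustration in python, not the project’s actual tool: the class names, field names, the `story_id` value, the dates and the example entries are all hypothetical, and the only elements taken from the text are the three place types and the idea that a time period may be exact or given only as a phrase.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class PlaceType(Enum):
    EVENT = "event"          # e.g. place of birth, place of residence
    EPISODIC = "episodic"    # mentioned in passing, fills in the geographic picture
    PROJECTED = "projected"  # known only through others' stories, books, letters


@dataclass
class TimePeriod:
    # Hypothetical fields: date strings when the narrator is precise,
    # a free-text label when the time is vague or symbolic.
    start: Optional[str] = None
    end: Optional[str] = None
    label: Optional[str] = None

    def is_precise(self) -> bool:
        """A period counts as precise only if both boundaries are known."""
        return self.start is not None and self.end is not None


@dataclass
class PlaceMention:
    story_id: str          # hypothetical identifier of the life story
    name: str              # place name as given by the narrator
    place_type: PlaceType
    period: TimePeriod
    order: int             # position of the mention within the narrative


def count_by_type(mentions: List[PlaceMention]) -> Counter:
    """Count how many mentions fall under each place type."""
    return Counter(m.place_type for m in mentions)


def vague_mentions(mentions: List[PlaceMention]) -> List[PlaceMention]:
    """Return mentions whose time period lacks a known start or end."""
    return [m for m in mentions if not m.period.is_precise()]


# Illustrative entries only; the dates are invented placeholders,
# not values taken from any interview.
mentions = [
    PlaceMention("story-a", "example town", PlaceType.EVENT,
                 TimePeriod(start="1950-01-01", end="1950-12-31"), order=1),
    PlaceMention("story-a", "camp in germany", PlaceType.EVENT,
                 TimePeriod(label="until spring"), order=2),
    PlaceMention("story-a", "grandparents' parish", PlaceType.PROJECTED,
                 TimePeriod(label="during my parents' youth"), order=3),
]

counts = count_by_type(mentions)
vague = vague_mentions(mentions)
```

keeping the vague label as free text, rather than forcing it into a date, preserves expressions such as ‘until spring’ in the form the narrator gave them, while the `order` field retains the narrative sequence even when the chronology has gaps.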
often it is only considerably later that we can evaluate the meaning of a specific place in the experience of exiled latvians, whether it was merely a site in transit (tilley ) or a shorter or longer pause in lived space. the case of latvians in sweden is interesting from this point of view. for nearly , civilian latvians, who went across the baltic sea between and , sweden was the land of hope and possibility. however, after climbing out of the boat and onto dry land, and after the kind welcome they received, their opinions of sweden as a land of hope and possibility gradually changed. many found it difficult to find suitable and reasonably well-paid work. the swedish government’s decision to repatriate military refugees, as per the soviet union’s request, was a signal to many to continue their search for a safe haven in some other country. thus, many eventually settled in england, canada and australia. in fact, after arriving in sweden, latvians did not rush to settle down as they hoped to return home soon. this hope helped them emotionally, but not always practically, as can be seen in one of the stories:

we were waiting for the start of the third world war when the british and the americans will release the baltic and we will be able to go home. and we lived with this naïve hope from year to year for several decades (nmv- ).

this and many other stories testify that for many of the narrators sweden was in fact merely a transit port, where they would remain only until the border was opened and they could travel back to latvia. only later, due to daily life and its needs, did people adapt to the new land and its traditions, because an individual’s daily life is closely related to such areas as the language of communication, eating habits, work, and, of course, social life.

conclusions

oral history forms a bridge between individual experience and society, it changes viewpoints of history, and it opens up new fields of study.
attention is paid not only to what a certain individual has experienced, but also to the way in which that person understands, is aware of and narrates events and experiences as the consequences of those events. therefore, life stories of exiled latvians formed not only the basis of perceptions of latvian culture and statehood, but also the exile diaspora’s self-confidence and sense of community. namely, creating and narrating life stories, sharing the memories about homeland, recent history and common experiences within the circles of families and friends was one of the ways to create and maintain latvian identity. at the same time, the life stories in the studies of diaspora are used to create an understanding of the complex nature and changes of travelling memories (bela et al. ) experienced by both travelling in time and space, and crossing the political and community boundaries. mapping the geospatial information included in the life stories is of particular importance in the studies of latvian emigration and diaspora formation in the usa, great britain, sweden, germany and other countries after world war ii.

the question is, what do we achieve by mapping oral history? maps are regularly used to study the geographic nature of stories and are used to ground the story in real places (caquard and cartwright : ). agreeing with malcolm bradbury, who in his atlas of literature has observed that “[a] very large part of our writing is a story of its roots in a place: a landscape, region, village, city, nation or continent” (bradbury : ), it can be concluded that a life story is a story of its roots in a place, too. however, the points or lines on the map are just the beginning of the story: the real challenge, quoting margaret pearce, is to get “the geographies of human experience and place in the map” (pearce : ).
the mapping of exiled latvians’ life stories allows us to observe, first, the geographic structure of life stories, which is linked with individually significant places (residences, places of education and employment, points of crossing the borders and points of social activity). they are narrated as real places and are most often connected to a particular event and time period. second, it allows us to observe the projected landscapes, which are accompanied by an inherited sense of belonging and therefore make latvia a homeland for those latvians who have never been there. these mappings in turn take place in two different ways: (1) matter-of-factly, using it instruments and digital humanities methods for the processing, visualization and analysis of data, and (2) reflexively, by examining the possibilities of digital study in the field of oral history.

why map these locations? it could all be inferred without mapping, just by reading the stories. however, the maps let us see the big picture (caquard and cartwright , corbeil , linhard ), making the experience of exiled latvians more visible and, thus, more tangible.

acknowledgements

the article is based on a wider research project “empowering knowledge society: interdisciplinary perspectives on public involvement in the production of digital cultural heritage” [project no.: . . . / /a/ , funded by european regional development fund, - ]. i am therefore thankful to my colleagues from the institute of literature, folklore and art at the university of latvia for their valuable advice in the project implementation and writing of this article. i am also grateful to amanda m. jātniece for translating the article. above all, i am thankful to all of the authors of the life stories. without their contributions this research would not be possible.
address:
ieva garda-rozenberga
university of latvia
mūkusalas iela
riga, latvia
e-mail: ieva.garda@lulfmi.lv

life stories cited

interview with marta, national oral history archive catalogue reference nmv- .
interview with valida, national oral history archive catalogue reference nmv- .
interview with harijs, national oral history archive catalogue reference nmv- .
interview with anda, national oral history archive catalogue reference nmv- .
interview with jānis, national oral history archive catalogue reference nmv- .

references

bela, baiba, ieva garda-rozenberga, and māra zirnīte ( ) “migratory memories between latvia and sweden”. oral history , , – .
bradbury, malcolm, ed. ( ) the atlas of literature. london: de agostini editions.
buckingham, william r. and samuel f. dennis jr. ( ) “cartographies of participation: how the changing natures of cartography has opened community and cartographer collaboration”. cartographic perspectives , – .
bula, dace ( ) “ievadam”. in agita lūse, ed. cilvēks. dzīve. stāstījums. – , rīga: latviešu antropologu biedrība.
caquard, sébastien ( ) “cartography i: mapping narrative cartography”. progress in human geography , , – .
caquard, sébastien and william cartwright ( ) “narrative cartography: from mapping stories to the narrative of maps and mapping”. the cartographic journal , , – .
caquard, sébastien and stefanie dimitrovas ( ) “story maps & co. un état de l’art de la cartographie des récits sur internet/story maps & co. the state of the art of online narrative cartography”. m@ppemonde . available online at <http://mappemonde.mgm.fr/ _as /#englishversion>. accessed on . . .
corbeil, laurent ( ) “walking to the northern mines: mesoamerican migration in new spain”. in tabea linhard and timothy h. parsons, eds. mapping migration, identity, and space, – . cham: palgrave macmillan.
dunn, stuart ( ) “praxes of ‘the human’ and ‘the digital’: spatial humanities and the digitization of place”. geohumanities , , – .
elwood, sarah and meghan cope ( ) “introduction. qualitative gis: forging mixed methods through representations, analytical innovations, and conceptual engagements”. in sarah elwood and meghan cope, eds. qualitative gis: a mixed methods approach, – . london: sage.
gordillo, gastón ( ) landscapes of devils: tensions of place and memory in the argentinean chaco. durham and london: duke university press.
hague, cliff ( ) “planning and place identity”. in cliff hague and paul jenkins, eds. place identity, participation and planning, – , london and new york: routledge.
hall, peter and janet abrams ( ) else/where: mapping new cartographies of networks and territories. minneapolis: university of minnesota design institute.
high, steven ( ) “mapping memories of displacement: oral history, memoryscapes and mobile methodologies”. in robert perks and alistair thomson, eds. the oral history reader, – , london and new york: routledge.
hinkle, maija ( ) “american latvian association’s oral history project ‘exile life narratives’”. in māra zirnīte and maija hinkle, eds. oral history sources of latvia – history, culture and society through life stories. – , riga: latvijas mutvārdu vēstures asociācija “dzīvesstāsts”.
jansone, aija and inta robežniece, eds. ( ) latvijas gadi – . rīga: latvijas nacionālais vēstures muzejs.
kangeris, kārlis ( ) “trimdas sākumi: latvijas pavalstnieki kara laikā vācijā”. in trimdas arhīvi atgriežas: latviešu bēgļu gaitas vācijā. starptautiskas konferences materiāli, – . rīga: latvijas valsts arhīvs.
kirshenblatt-gimblett, barbara ( ) “studying immigrant and ethnic folklore”. in richard dorson, ed. handbook of american folklore, – , bloomington: indiana university press.
knigge, ladona and meghan cope ( ) “grounded visualization: integrating the analysis of qualitative and quantitative data through grounded theory and visualization”. environment and planning , – .
knopf, christina ( ) “sense-making and map-making: war letters as personal geographies”. special issue on cartography and narrative, nano: new american notes online, . available online at <https://nanocrit.com/issues/issue /sense-making-map-making-war-letters-personal-geographies>. accessed on . . .
knowles, anne kelly, levi westerveld, and laura strom ( ) “inductive visualization: a humanistic alternative to gis”. geohumanities , , – .
kwan, mei-po and guoxiang ding ( ) “geo-narrative: extending geographic information systems for narrative analysis in qualitative and mixed-method research”. the professional geographer , , – .
Ķīlis, roberts ( ) “sociālā atmiņa un nacionālā identitāte latvijā”. in roberts Ķīlis, ed. atmiņa un vēsture. no antropoloģijas līdz psiholoģijai, – , rīga: nims.
lasmane, valentīne ( ) pāri jūrai ./ . g. rīga: memento latvija.
linhard, tabea ( ) “moving barbed wire: geographies of border crossing during world war ii”. in tabea linhard and timothy h. parsons, eds. mapping migration, identity, and space, – . cham: palgrave macmillan.
linhard, tabea and timothy h. parsons ( ) “introduction: how does migration take place?” in tabea linhard and timothy h. parsons, eds. mapping migration, identity, and space, – , cham: palgrave macmillan.
macfarlane, robert ( ) the wild places. london: granta books and penguin books.
madden, marguerite and amy ross ( ) “genocide and giscience: integrating personal narratives and geographic information science to study human rights”. the professional geographer , , – .
massey, doreen ( ) “a global sense of place”. marxism today , – .
massey, doreen ( ) for space. london: sage.
opp, james and john walsh ( ) “introduction: local acts of placing and remembering”. in james opp and john c. walsh, eds. placing memory and remembering place in canada, – , vancouver and toronto: ubc press.
pearce, margaret wickens ( ) “framing the days: place and narrative in cartography”.
cartography and geographic information science , , – .
piatti, barbara, hans rudolf bär, anne-kathrin reuschel, lorenz hurni, and william cartwright ( ) “mapping literature: towards a geography of fiction”. in william cartwright, georg gartner, and antje lehn, eds. cartography and art, – , berlin and heidelberg: springer.
reinsone, sanita ( ) apmaldīšanās poētika, stāstos un sarunās. rīga: lu literatūras, folkloras un mākslas institūts.
stadler, jane, peta mitchell, and stephen carleton ( ) imagined landscapes. geovisualizing australian spatial narratives. bloomington and indianapolis: indiana university press.
strods, heinrihs ( ) “latvijas pilsoņu repatriācija no rietumiem .– . gadā”. latvijas vēsture , – .
tasker, nick ( ) “chairman’s message”. the society of cartographers newsletter, july, .
tuan, yi-fu ( ) “language and the making of place: a narrative-descriptive approach”. annals of the association of american geographers , , – .
tilley, christopher ( ) a phenomenology of landscape: places, paths, and monuments. oxford: berg.
veigners, ilgvars ( ) latvieši rietumzemēs. rīga: drukātava.
zirnīte, māra ( ) “dzīvesstāstu liecinājums”. in zirnīte, māra, ed. latvijas mutvārdu vēsture. spogulis, – . rīga: lu filozofijas un socioloģijas institūts.

networked participatory online learning design and challenges for academic integrity in higher education
case study, open access
judy o’connell
correspondence: juoconnell@csu.edu.au
division of student learning, u!magine digital learning innovation laboratory, charles sturt university, wagga wagga, australia

abstract
a new multi-disciplinary degree program in education and information studies was developed to uniquely facilitate educators’ capacity to be responsive to the demands of a digitally connected world.
charles sturt university’s master of education (knowledge networks and digital innovation) aims to develop agile leaders in new cultures of digital formal and informal learning. the co-construction of knowledge through interpersonal discourse creates a pedagogical tension between a focus on knowledge-based instruction and outcomes, and on praxis-based instruction. this digital context draws attention to academic integrity issues in online learning environments. through the new subject game-based learning, students engaged in theory, practice, trends in game designs and immersive aspects of games, utilizing the technology and pedagogical affordances of a range of online tools. the subject builds on the keystone subject and incorporates reflective participation and a culture of participatory learning through integration of social media, social scholarship and open sharing of ideas, resources and experiences online within the broader education community. subject engagement and assessment design incorporated academic integrity strategies, and needed individual and group collaboration to be fully integrated into the learning experience of the students, thus modeling practices relevant to the students’ own professional practices. this paper also considers the contribution of global connectedness to the success of the pedagogic processes used for embedding academic integrity through social scholarship into the curriculum and learning experiences.

keywords: participatory culture, game-based learning, academic integrity, online learning, information ecology

background
distance education and distance learning, once undertaken by one-to-one correspondence between learners and teachers, have been radically transformed into online learning, or e-learning, through the use of learning management systems and other web based or digital tools.
now this type of education is characterized not so much by ‘distance’ as by the mode of ‘electronic’ or ‘e’ learning environments, which are internet or web-based. it provides ongoing challenges for the researcher investigating professional contribution (i.e. teaching or educating) in higher education (thompson , p. ) and the facets of academic integrity for students learning in internet and online environments (sutherland-smith ).

international journal for educational integrity. © the author(s). open access: this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. o’connell international journal for educational integrity ( ) : doi . /s - - -

introduction
distance education has evolved through many technologies, in tandem with the affordances these technologies provided, and each mode or ‘generation’ has required that distance educators and students be skilled and informed to select the best mix(es) of both pedagogy and technology (anderson and dron ). the current generation of academic degree programs that are delivered fully online (rather than face-to-face) through the use of information and communication technology (ict) is doing more than simply delivering content through asynchronous distance education modes. rather, there has been a strong move to creating pedagogically enriched learning design within technology-rich contexts to support and improve learning experiences (ally ; kim and bonk ; siragusa et al. ; beetham and sharpe ).
a growing number of studies have considered learning and teaching activities in online learning environments in higher education, examining such aspects as multimedia resources, blended and online new technologies, and the many ways in which online learning and teaching are now being conceptualised and embraced to engage students in learning and promote positive online learning experiences (boling et al. ; cho et al. ; lang and lemon ; mcloughlin and lee ). in practical terms, when communication online becomes more relational, socialized and expressive, individuals are required to master an emergent, articulated repertoire of communicative competencies that mixes interpersonal and group process fluencies to make linkages and correspondences through a repertoire of competencies inextricably social and technological (lievrouw , p. ). in this way new communities of inquiry are formed around shared interest, activity and educational experiences (garrison et al. ; shea and bidjerano ), facilitated by the web as a platform for content creation and collaboration by multiple recipients (franklin et al. ). current online information environments and associated transactions are considered an important ‘information ecosystem’ (haythornthwaite and andrews ) influencing and shaping professional engagement and digital scholarship in communities of learning in the higher education sector (lee et al. ). john seely brown ( ) used an ‘ecology’ metaphor to describe the emerging technology landscape as “an open system, dynamic and interdependent, drivers, partially self-organizing, and adaptive” (p. ). thomas and brown ( ) also explored what they described as a new ‘culture of learning’ where information technology has become a participatory medium, giving rise to an environment that is constantly being changed and reshaped by the participation within information spaces.
they argued that traditional approaches to learning are no longer capable of coping with this constantly changing world. the information environment is a technology environment, which demands adaptation. as information is also a networked resource, “information absorption is a cultural and social process of engaging with the constantly changing world around us” (thomas and brown p. ).

information ecology at the heart of academic integrity
in other words, our digital information ecology is a remix of different forms of technology, devices, data repositories, information retrieval, information sharing, networks and communication. new technological tools are expanding and continually altering the ways students or educators can interact with the world. the implications for education that stem from new means for accessing information, communicating with others, and participating in a community call for a new brand of professional competences to thrive within the changing environment. haste ( ) recognised the co-construction of knowledge through interpersonal discourse and the tension within pedagogy between a focus on knowledge-based instruction and outcomes, and on praxis-based instruction. “while most pedagogy, of course, recognises the interaction of both in good practice, there is nevertheless an underlying epistemological gap; knowledge-based models are implicitly more ‘top down’ and praxis-based more ‘bottom up’. ‘knowledge’ implies that the route to understanding is in the structured transmission of information. ‘praxis’ implies a necessary interaction with materials, actions or other persons as a route to understanding” (haste p. ). while technology is changing the information environment (including information places and spaces), the transactional nature of information interactions and knowledge flow underpins learning.
information can comprise both physical and virtual parts for operation and interaction. a major challenge for education is to enable and facilitate the generation of new knowledge via an appropriate information environment, to facilitate integration of new concepts within each person’s existing knowledge structure. this is described as an ‘information ecology’. “information ecology examines the contexts of information behaviour by analogy with ecological habitats and niches, identifying behaviours in biological terms such as ‘foraging’” (bawden and robinson p. ). in this context of adaptive and responsive co-construction of knowledge, curriculum and subject delivery can be reshaped and reconstructed in a dynamic manner in response to changing environmental conditions or the personal professional needs of students. a digital information ecology provides the opportunity to work with information in the construction of knowledge in more dynamic ways, connecting learning experiences across the contexts of location, time, devices and platforms. this same digital information ecology has also extended the disciplinary and pedagogic challenges relevant to learning design and the broader institutional context of responsibility for academic integrity. while it is understood that learning and teaching require engagement with the relevant knowledge, skills, and values pertinent to the discovery and dissemination of new knowledge (turner and beemsterboer ), the connections to how and when this process relates to either embedding or fostering academic integrity are less clear. in the literature review undertaken by macfarlane et al. ( ), the predominant focus emerging in the literature was on investigating and illustrating a perceived lack or absence of academic integrity, and in this context the review identified a pressing need for greater understanding of academic integrity across all practice elements, including teaching.
the current proactive ethos towards academic integrity is influenced not only by policy actions, such as acknowledgement of academic integrity in syllabi and course outlines, ethical guidelines or codes of conduct, but also by the very capacity of academics to teach and/or influence the students’ awareness and acceptance of academic integrity standards (löfström et al. ). this information ecology nurtures and validates a pedagogic process that involves the creation of assessments and environments for knowledge building to enhance collaborative efforts to create and continually improve ideas. this type of approach to knowledge building “exploits the potential of collaborative knowledge work by situating ideas in a communal workspace where others can criticize or contribute to their improvement” (scardamalia et al. p. ). learners in online environments do not learn at exactly the same time and the focus on individualised learning is highlighted, even emphasizing the time constraints and restrictions in modes of collaboration (zheng and dahl ). while asynchronous and synchronous communication are characterized by different discourse features (romiszowski and mason ), it is recognised that two-way communication remains a critical feature of the e-learning educational process, as well as being a way of engaging the learners within the e-learning environment (desai et al. p. ). so while technology has not changed the fundamental capacity for learning, ict has changed how ideas and practices are established, communicated and maintained when considered in the context of the established educational discourse.
our work as educators has to centre on helping to meet future learning needs in courses/programs by fostering a culture of enquiry within a sustainable learning ecology that is shaped by the ubiquity of information, globally responsive pedagogical practices and academic integrity, and driven by collaboration and informal learning at multiple access points and through multiple media. the phenomenon of academic dishonesty has attracted much interest over the years, and the challenges and strategies for maintaining quality assurance are often addressed by policies, coupled with an investigation of new strategies for assessing the ‘igeneration’ (baggio and beldarrain ). what is required is a pathway forward to ensure that academic integrity in online learning programs and 21st century learning environments responds to open learning and collaborative practices, in pedagogy and professional skill development of students. the study by prescott ( ) corroborates this emphasis on choice and collaboration as providing a valuable framework for helping students build defences against inadvertent plagiarism in their study behaviours. both project premise and teaching aims coupled with good academic practice in collaborative writing activities demonstrated how online work in wikis could make writing ‘visible’ and create self-awareness with more effective self-monitoring with engaging resources.

case description
the master of education (knowledge networks & digital innovation) commenced at charles sturt university in and requires completion of sixty-four (64) points comprising two (2) core subjects and six (6) elective subjects, each worth eight (8) points, to meet the australian qualifications framework standards for a masters degree by coursework (council ). it is being delivered fully in online distance education mode, led by the courses (program) director and the education discipline team in the school of information studies, drawing on specialist adjunct staff associated with the school. the program is designed to respond to the following:
• literature and literacy experiences in digital environments, including children’s and young adult literature, e-book systems, management and development;
• information organisation in digital environments, information retrieval, content curation with the aid of mobile devices, online platforms and cloud based storage services;
• concepts and practices for curriculum integration of social media tools, services and platforms;
• information practices, with an emphasis on information fluency, critical inquiry and design thinking;
• digital citizenship essentials, including legal and ethical behaviour and open learning approaches;
• ict integration and innovation, demonstrating a technology infusion with mobile learning, tablets and devices for information rich learning experiences; and
• creative and intellectual leadership in a global environment.
the program is grounded in cross-disciplinary studies in education and information, allowing students to gain an advanced and integrated understanding of an important body of knowledge related to information and knowledge transactions in online knowledge networks, built on processes and interactions for innovative education practice. it aims to encapsulate a participatory information ecology that rests on the co-construction of knowledge through interpersonal discourse and on the tension within pedagogy between a focus on knowledge-based instruction and outcomes and a focus on praxis-based instruction, and that is both creative and dialogic.
the learning processes depend more on the coordination among all the interactions and activities that take place in different spaces of the learners’ lives (personal, home, and workplace) rather than only on interactions and activities developed in the spaces of formal learning within the interact (blackboard) learning management system. the academic program has also been designed to enhance personal professional networks and personal learning conversations, understanding that learning is social within communities of practice where learning happens through experience and practice as part of a community (lieberman and mace ). each subject is treated as an intensive professional development program, facilitated by social interaction through forums, twitter, adobe connect, and google hangouts, helping to facilitate greater insight into generic issues (rienties and kinchin ) through the various participatory learning experiences. the learning framework for the program is established in the keystone subject inf concepts and practices in a digital age, where a body of knowledge is introduced that includes a review of recent developments which are influencing learning and teaching in an increasingly digitally connected world. by examining key features and influences of global connectedness, information organisation, communication and participatory cultures of learning, students are provided with the opportunity to reflect on their professional practice in a networked learning community, and engage in dialogue to develop an authentic understanding of concepts and practices for learning and teaching in digital environments (o’connell ). through this questioning, review and reconstruction of understanding, the subject frames the challenges of learning in digital environments and sets the context for innovation and change in professional practice.
the subject is designed to provide: professional learning through authentic tasks and activities; opportunities for collaboration with peers; readings that are thought-provoking; study suggestions which encourage inquiry, reflection and analysis; and engagement with a curriculum unit/strategy to demonstrate application of new knowledge and understanding for learning and teaching practice. this foundation subject establishes connected learning within new information environments created by the social and technological changes of the digital age. the purposeful pedagogical praxis allows:
• interaction with a diversity of content materials;
• interactions within the cohort to improve learning and understanding in the formation of knowledge;
• interaction through use of social media communication channels; and
• interaction embedded in a multi-disciplinary information ecology.
by focusing on connectivity, communication, collaboration and convergence, the subject addresses the challenges, opportunities and emerging possibilities for learning and teaching in information-rich participatory environments. trends in knowledge construction, participation and social networks are explored, including information futures and digital convergence. the subject introduces education informatics and the scholarship of digital teaching, and models connected learning through group discourse and collaborative inquiry in digital environments, including the reflective and participatory experiences employed throughout the course. the first cohort of students was drawn from australian and international educators, who were in leadership positions in schools; classroom teachers and teacher librarians; e-learning leaders in schools and higher education; educational designers in higher education; program leaders in education organisations; and technology integrators in schools and higher education.
the range of admissions demonstrated well the multi-disciplinary program approach (table ). the ongoing admissions during and continued to attract applicants from the same cross-section of education, preparing an interesting cohort of students for the first delivery of the subject game-based learning in . the subject cohort in inf game-based learning was represented by the following group of students (table ).

table . first intake for med(kn&diginnov)
role / number
classroom teacher /
teacher librarian /
school leadership /
school e-learning integrators /
faculty education/instructional designers /
academic librarian /
total /

table . game-based learning subject cohort
role / number
classroom teacher /
teacher librarian /
school leadership /
school e-learning integrators /
faculty/tafe education/instructional designers /
faculty/systems engineer /
total /

discussion and evaluation
the design of assessments in participatory e-learning environments must emphasise digital flexibility and open collaboration. the ability to evaluate the validity and value of information accessed from multiple information environments is essential for scholarly st learning and academic integrity. approaches to assessment focus on participatory and digital experiences, in the context of program requirements, and include extensive use of formative activities, as part of knowledge flow and peer-to-peer learning/engagement. social media channels are a vital part of this approach. many students keep an open and public record of their learning, providing an easy (and open) way to see the range of digital learning/assessment experiences alongside a record of their participatory experiences and online interactions, in keeping with the global participatory nature of the program, e.g. http://thinkspace.csu.edu.au/becspink; http://thinkspace.csu.edu.au/msimkin/; http://thinkspace.csu.edu.au/andrewp/.
“in professional programmes in particular, it is useful if students keep a reflective journal, in which they record any incidents or thoughts that help them reflect on the content of the course or programme. such reflection is basic to proper professional functioning. the reflective journal is especially useful for assessing ilos (intended learning outcomes) in relating to the application of content knowledge, professional judgment and reflection on past decisions and problem solving with a view to improving them.” (biggs and tang , p. ). students are regularly required to reflect upon their practices, link their reflections to theories and communicate in writing an understanding of the connection between the reflection and theory. this encourages each student to become a proactive learner and reflective educator who is “committed to continuous improvement in practice; assumes responsibility for his or her own learning; demonstrates awareness of self, others, and the surrounding context; develops the thinking skills for effective inquiry; and takes action that aligns with new understandings” (york-barr et al. p. ). reflective thinking helps students develop a questioning attitude and new perspectives, identify areas for change and improvement, respond effectively to new challenges, and generalise and apply what they have learned from one situation to other situations (turner et al. ). this experiential engagement is employed to foster creativity and initiative for new situations in connected environments for professional practice, and a capacity for confident personal autonomy and accountability in knowledge networking. this participatory approach also provides the grounding for new approaches to academic integrity. when students are watching and learning from each other, they are learning to work with an ‘open education’ ethos, and begin to support each other in research and thinking.
by doing so, students can ensure the quality of their work online, and through public scrutiny are also enabling a new level of academic integrity beyond scrutiny by services such as turnitin. the collaborative nature of the pedagogical approach was transparent almost from the outset. the subjects all utilise new and emerging technologies (social networking, media production, content curation, innovative approaches to presentations and more), all hand in hand with traditional e-learning approaches. the effect of this was immediately highlighted by the public sharing from the keystone subject via twitter hashtag #inf . the bottom-up praxis was emphasised by a willingness of students to post a link to their assessments, via their reflective blog or relevant platform, even before the assessment was marked! after the assessments were marked, regardless of the grade level achieved, even more students willingly shared their work. a highlight for students was when an assessment went ‘viral’, being picked up by some knowledgeable people and organisations (see http://thinkspace.csu.edu.au/hbailie/ / / /going-viral/ and fig. ). students utilised participatory and collaborative tools and approaches throughout the keystone subject, with many learning for the first time how to engage at this level. excerpts from final blog reflections in the first subject highlighted the transformation: “my progression through inf has been a brilliant start to my journey along the masters of education (knowledge networks and digital innovation) path. the subject content has provided me strong foundations to build upon, and has been highly relevant to my workplace”.
“inf has convinced me even more of the need for all teachers to become digitally literate, connected educators” “#inf : concepts and practices for the digital age has left me continually thinking, questioning, reflecting on current practices causing the continual shift of opinions regarding technology and education. and this is only the tip of the iceberg.” “#inf has been invigorating, exciting, lots of hard work, overwhelming at times, but above all fun. i have loved connecting with the cohort, it’s been amazing. people have said to me “isn’t online study very impersonal and isolating” but i couldn’t disagree more. i feel infinitely more connected with my classmates than i ever did while studying in the traditional way.” students are therefore immersed in a participatory learning experience that not only maintains and promotes a high calibre of pedagogical knowledge encounters, but also frames a new model for promoting academic integrity in online environments through embedding open approaches to learning and assessment from the outset. this is in direct contrast to assessment practices that sit behind the ‘walled garden’, and do not connect directly with the global education experiences of the students.

fig. . twitter tells the story of academic work

game-based learning
the subject, inf game-based learning, was completed for the first time in session one (autumn) of . as a new subject, this provided another opportunity to embed the new approaches being used to promote academic integrity within the participatory experiences of the subject, and demonstrate how to respond to the challenges of quality assurance for higher education in participatory frameworks and open approaches to education. the subject is designed as an introduction to understanding the potential role of games and gaming for learning in the digital age.
trends in game designs, cultures and genres will be explored in the context of both educational games and commercial games, which can be successfully adapted for pedagogical, curriculum and individual needs of learners. the subject introduces the principles of game design, examines research literature surrounding games and learning, and includes reflective participation in gaming culture. the subject also covers the principles and theories of game based learning, narrative and gameplay; the characteristics of effective digital game media for a variety of uses; information behaviour and knowledge construction in game environments; pedagogical affordances of digital games; and implementing digital games into the learning environment.

throughout their study in this subject students continue to maintain their reflective blog at thinkspace – a platform provided for student use throughout their course/program. by providing this foundational and continuing connection point between all students and the lecturer, a vibrant community of sharing is both fostered and maintained. in this subject, the reach to the global audience was also increased by use of ifttt by the lecturer and subject team. see https://ifttt.com/

ifttt ‘recipes’ (the name of the formulae being used to gather or manage information via api services) run automatically in the background and replace manual steps in information curation and/or sharing. two important ‘recipes’ allowed for the following (fig. ).

rss (rich site summary) is a format for delivering regularly changing web content. many news-related sites, weblogs and other online publishers syndicate their content as an rss feed to whoever wants it. rss solves a problem for those who regularly use the web by making it possible to retrieve the latest content from the sites of interest.
each time a student posted to their thinkspace blog, this post was immediately and automatically added to the lecturer’s rss reader for the subject collection, which was set up at the beginning of the subject. using an rss reader (the reader chosen was feedly, http://feedly.com/), the lecturer could easily read new content and post responses. this new blog post, now added to the feedly account, was then ‘announced’ to the lecturer’s twitter feed, and made available to members of the cohort or other interested educators using the #inf twitter hashtag. this was possible through the automated ifttt ‘recipe’ process that was set up. particularly good posts gained feedback from beyond the cohort (fig. ).

however, there was another background recipe process taking place using the ifttt service (fig. ). each time a student posted a tweet about anything at all on twitter with the subject hashtag #inf the information was collected in a google spreadsheet. this aggregation of data provided rich material for future analysis, particularly when tracking several cohort experiences. during the subject, a total of tweets were recorded!

fig. thinkspace blog post

of course, students also engaged in focused discussion, using the blackboard discussion forum tool – for in-depth group conversations on topics/questions posed during the subject learning experiences. these created equally frequent and popular in-depth discussion and conversations which demonstrated the extent and multiplicity of engagement amongst students, ranging from rapid twitter conversations to in-depth discussions and personal reflections on thinkspace.
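the logic of the two ‘recipes’ described above can be approximated outside ifttt. the following python sketch is an illustration only and not the set-up used in the subject: it assumes a standard rss 2.0 feed, and the sample feed, the hashtag `#examplesubject` and the function names `new_posts`, `announce` and `log_tweets` are all hypothetical. it shows how new blog posts might be detected and turned into an announcement, and how hashtagged tweets might be gathered into a spreadsheet-style table (csv text here, standing in for the google spreadsheet).

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real recipe would poll each blog's RSS URL.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>example student blog</title>
<item><title>first reflection</title><link>http://example.org/first</link></item>
<item><title>second reflection</title><link>http://example.org/second</link></item>
</channel></rss>"""


def new_posts(rss_xml, seen_links):
    """Return (title, link) pairs for feed items whose link is not in seen_links."""
    root = ET.fromstring(rss_xml)
    posts = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link and link not in seen_links:
            posts.append((title, link))
    return posts


def announce(title, link, hashtag):
    """Recipe 1 (assumed form): format a new blog post as a short announcement."""
    return f"new post: {title} {link} {hashtag}"


def log_tweets(tweets, hashtag):
    """Recipe 2 (assumed form): keep tweets containing the hashtag, return CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["author", "text"])
    for author, text in tweets:
        if hashtag.lower() in text.lower():
            writer.writerow([author, text])
    return buffer.getvalue()


if __name__ == "__main__":
    seen = {"http://example.org/first"}  # posts already in the reader
    for title, link in new_posts(SAMPLE_FEED, seen):
        print(announce(title, link, "#examplesubject"))
    sample_tweets = [("student_a", "loving #examplesubject"), ("student_b", "lunch")]
    print(log_tweets(sample_tweets, "#examplesubject"))
```

in practice a scheduler would call something like `new_posts` at intervals and remember the links it has already handled; posting to twitter or writing to a hosted spreadsheet would need those services’ own apis, which are deliberately left out of this sketch.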
this combination of collaborative, social experiences has driven a very different learning experience, allowing affirmation of the formative learning experiences to be reinforced by the formulation of assessment strategies that represent a more collaborative and open approach to formal assessment. it is in the area of formal assessments that the true value of academic integrity fostered through social media and online open experiences comes to the fore.

the extent of global influence on formative approaches to subject design was exemplified by the inclusion of a student blog post entitled ‘digital game based learning levels up digital literacies’ http://thinkspace.csu.edu.au/anotherbyteofknowledge/digital-game-based-learning-levels-up-digital-literacies/ in episode tide today in digital education podcast from dai barnes and doug belshaw http://tidepodcast.org/. the blog post was described as being well referenced and an excellent starting point for anyone interested in understanding the connections between digital literacy and game-based learning. this kind of feedback is unique and only available to students participating in open and participatory learning experiences.

fig. #gbl on twitter

fig. collecting twitter data

the first assessment, a critical review of three articles, provides a ‘traditional’ approach to assessment to ensure connection and engagement with relevant research.
in a very straightforward way the assessment task is designed to provide students with opportunities to:
• demonstrate an understanding of the features, terminology, history and taxonomy of computer-based games and gaming applications
• evaluate and critically assess the relation between play, games and learning in formal and informal settings
• discuss the relationship between games, media-literacy, information fluency and digital identity.

by retaining assessment of this nature, students have a direct connection to standard academic assessment processes, running in parallel to the participatory and formative activities via thinkspace reflective blogging, discussion forums and twitter engagement. however, building on all the formative and participatory experiences, the final major assessment is also a collaborative and online learning experience.

compendium in game-based learning c #gbl

students were asked to “prepare a chapter as your contribution to the compendium: game-based learning c #gbl for an open educational resource”. the compendium is designed to bring students into the open publishing environment, while also fostering academic integrity built on the collaborative and participatory experiences in this subject and throughout the course/program. each student contributed a chapter to the digital compendium, and collaborated on the choice of topic through a proposal process managed through a purpose-built wikispace. as students wrote their proposals, the peer group critiqued the proposal, provided feedback or encouragement, and together a final format/topic was chosen. the topics were not open-ended, so much as clustered around a ‘provocation’.
“in this task, you will be guided to work collaboratively to choose your chapter contribution for one ( ) of the following sections of the compendium:
part : motivation: what reading, research, environments, and change factors are emerging that require or validate being interested and inspired to move into game-based learning?
part : provocation: through a case study, an environmental scan of your organisation, situational analysis, or other activity, develop and challenge readers with a perspective based on concrete settings or experiences.
part : invitation: invite an organisation, system or workplace to meet the challenge of game-based learning.”

we explained to the students that “by combining your work together in the compendium, we can achieve the equivalent of a book on the topic available online - that provides readers with rationale (part ), examples to show what is possible (part ) and ways to move forward (part ).”

this approach was modelled on the work undertaken at duke university, where a web journal of final projects in the augmented realities humanities course was published at http://sites.duke.edu/lit s_ _f _augrealities/.

students created their own chapter on their personal course thinkspace site, and a duplicate version was combined into the final compendium. this allowed individual promotion of their work via twitter and blog feeds and, finally, promotion of the compendium publicly via twitter as a whole piece of work, as well as sharing the pride and accomplishment of the final online product. the compendium was published online at http://thinkspace.csu.edu.au/gblcompendium/ for the purpose of taking the subject experience beyond the ‘classroom’ and into the public sphere.
the compendium experience was also intended to position game-based learning as the topic of open scholarly discourse, available for students, educators and practitioners alike, and where feedback and commentary is part of the participatory learning of online environments and in keeping with the flexible and learner-focused cognitive frame of #gbl. this approach provides a substantial mechanism for embedding scholarly practice as an action of engagement in open and participatory environments, with a focus on best practice, rather than avoidance of plagiarism. the process adopted is one of apprentice scholarly writing by participating in disciplinary activities and producing scholarly writing that is acceptable to the community (li ). more than just communicating the requirements of academic integrity, the compendium facilitates a leadership approach in day-to-day scholarly practice. because trust is a reciprocal process, the role of the lecturer is to show how trust in open environments translates into practice, utilizing a framework of trust within the instructional model to promote academic integrity (hulsart and mccarthy ) (fig. ).

fig. gbl compendium

despite the many advantages to publishing student work online, there were the usual questions asked by fellow academics. how do you stop them copying or plagiarizing? how do you check their work? the answers are multiple, but include the obvious point of peer critique and collaboration to choose a different topic focus for each chapter. peer scrutiny is intense, inspirational, but also critical of content. however, by asking students to first publish the chapter compendium as a web page on their thinkspace blog, the lecturer was also able to use the clearly browser plugin, which makes blog posts, articles and webpages clean and easy to read, to generate a pdf that could be submitted to turnitin.
all chapters passed turnitin scrutiny with ease, though two compendium chapters could not be assessed due to duplication in submission. the reaction of students, and the reach of the promotion of academic work, are captured in these snippets from twitter (fig. ).

in all subjects that have continued, following this particular subject, students continue to publicly share their gratitude to peers and their reaction to the experiences of the program. what is of value is that this celebration of learning includes the fundamental need to foster and embed new approaches to academic integrity in keeping with st participatory learning (fig. ).

fig. the compendium on twitter

fig. was it worth it?

conclusion

the creation of a multi-disciplinary program, built on a digital information ecology and student-focussed praxis, has created both a curriculum and learning approach that has facilitated understanding and knowledge construction in more dynamic ways, connecting experiences, reflective practices and online participatory experiences that epitomize a ‘new culture of learning’. both technical and pedagogical innovation should be hallmarks of the best learning environments we can create, and which incorporate a wide variety of pedagogical approaches, learning tools, methods and practices to support students’ diverse learning modes. the collaborative nature of the course/program has been highlighted, including a significant shift to public and open sharing of formal and informal assessments. this collaborative construct, with the approach to open and visible learning, has provided a transparent approach to explicitly teach academic integrity as a foundational requirement for, or enabler of, participatory learning.
this multi-disciplinary learning program has resulted in a proactive, dynamic and responsive participatory learning design that embeds an approach to academic integrity which is responsive to open, participatory, socially moderated online environments, by explicitly fostering attitudes and behaviours that not only demonstrate a successful approach to academic integrity, but provide the knowledge and skills needed by educators from a wide range of professional education sectors/positions.

authors’ information
judy o’connell is senior lecturer and quality learning and teaching academic lead (online) in the faculty of science, charles sturt university. as a courses director in the faculty of education her work has been recognised by two awards for academic excellence and leadership. her professional leadership spans school and tertiary education, with a focus on libraries, learning spaces, online learning design, innovation, social media and technology for learning and teaching. she has also been a member of the nmc k- horizon report advisory board since , and likes to stay in touch with emerging technologies, particularly in relation to learning experiences. judy writes online at http://judyoconnell.com.

competing interests
the author declares that there are no competing interests.

received: may accepted: september

references
ally m ( ) foundations of educational theory for online learning. in: anderson t, elloumi f (eds) theory and practice of online learning. creative commons athabasca university, athabasca
anderson t, dron j ( ) three generations of distance education pedagogy. int rev res open distrib learn ( ): – . retrieved from http://www.irrodl.org/index.php/irrodl/article/view/ .
baggio b, beldarrain y ( ) academic integrity: ethics and morality in the st century. in: anonymity and learning in digitally mediated communications: authenticity and trust in cyber education. hershey, p – . doi: . 
/ - - - - .ch
bawden d, robinson l ( ) introduction to information science. facet, london
beetham h, sharpe r (eds) ( ) rethinking pedagogy for a digital age: designing for st century learning. routledge, new york.
biggs j, tang c ( ) teaching for quality learning at university. open university press, london
boling ec, hough m, krinsky h, saleem h, stevens m ( ) cutting the distance in distance education: perspectives on what promotes positive, online learning experiences. internet high educ ( ): – , http://doi.org/ . /j.iheduc. . .
brown js ( ) learning, working, and playing in the digital age. serendip. retrieved on february , from http://serendip.brynmawr.edu/sci_edu/seelybrown/seelybrown.html
cho yh, choi h, shin j, yu hc, kim yk, kim jy ( ) review of research on online learning environments in higher education. procedia soc behav sci : – , http://doi.org/ . /j.sbspro. . .
council, aqf ( ) australian qualifications framework
desai ms, hart j, richards tc ( ) e-learning: paradigm shift in education. education ( ): –
franklin t, van harmelen m et al ( ) web . for content for learning and teaching in higher education., retrieved from https://staff.blog.ui.ac.id/harrybs/files/ / /web- -for-content-for-learning-and-teaching-in-higher-education.pdf
garrison dr, anderson t, archer w ( ) critical inquiry in a text-based environment: computer conferencing in higher education. internet high educ ( ): –
haste h ( ) what is ‘competence’ and how should education incorporate new technology’s tools to generate ‘competent civic agents’. curric j ( ): – . doi: . /
haythornthwaite c, andrews r ( ) e-learning theory and practice. thousand oaks sage publications, california
hulsart r, mccarthy v ( ) utilizing a culture of trust to promote academic integrity. j contin high educ ( ): – . doi: . / . .
kim kj, bonk cj ( ) the future of online teaching and learning in higher education: the survey says….
educause quarterly ( ): -
lang c, lemon n ( ) embracing social media to advance knowledge creation and transfer in the modernized university: management of the space, the tool, and the message. in: fitzgerald t (ed) advancing knowledge in higher education., pp – , igi global, retrieved from http://www.igi-global.com.ezproxy.csu.edu.au/gateway/chapter/full-text-html/
lee mj, mcloughlin c, chan a ( ) talk the talk: learner-generated podcasts as catalysts for knowledge creation. br j educ technol ( ): –
li y ( ) apprentice scholarly writing in a community of practice: an interview of an nnes graduate student writing a research article. tesol q ( ): – , http://doi.org/ . /
lieberman a, mace dp ( ) making practice public: teacher learning in the st century. j teach educ ( – ): – .
lievrouw la ( ) alternative and activist new media. polity press, cambridge
löfström e, trotman t, furnari m, shephard s ( ) who teaches academic integrity and how do they teach it? high educ ( ): – . doi: . /s - - -
macfarlane b, zhang j, pun a ( ) academic integrity: a review of the literature. stud high educ ( ): – . doi: . / . .
mcloughlin c, lee m j ( ) pedagogy . : critical challenges and responses to web . and social software in tertiary teaching. in: m lee c mcloughlin (ed) web . 
-based e-learning: applying social informatics for tertiary teaching. hershey, p – . doi: . / - - - - .ch
o’connell j ( ) a multidisciplinary focus on st century digital learning environments: new program at csu. in: hegarty b, mcdonald j, loke s-k (eds) rhetoric and reality: critical perspectives on educational technology, proceedings ascilite dunedin., pp –
prescott l ( ) using collaboration to foster academic integrity. open learn ( ): – . doi: . / . .
rienties b, kinchin i ( ) understanding (in)formal learning in an academic development programme: a social network perspective. teach teach educ : –
romiszowski a, mason r ( ) computer-mediated communication. in: jonassen dh (ed) handbook of research for educational communications and technology p – . lawrence erlbaum associates, mahwah
scardamalia m, bransford, kozma b, quellmalz e ( ) new assessments and environments for knowledge building. in: assessment and teaching of st century skills. springer, netherlands, pp –
shea p, bidjerano t ( ) learning presence: towards a theory of self-efficacy, self-regulation, and the development of a communities of inquiry in online and blended learning environments. comput educ ( ): –
siragusa l, dixon kc, dixon r ( ) designing quality e-learning environments in higher education, proceedings ascilite., retrieved from http://www.ascilite.org/conferences/singapore /procs/siragusa.pdf
sutherland-smith w ( ) plagiarism, the internet, and student learning: improving academic integrity. routledge, london
thomas d, brown js ( ) a new culture of learning: cultivating the imagination for a world of constant change ( ). createspace, lexington
thompson m ( ) from distance education to e-learning. in: haythornthwaite rac (ed) the sage handbook of e-learning research. sage publications, london, pp –
turner j, reid m, shahabudin k ( ) reflective thinking. in practice-based learning . 
study advice and maths support, university of reading, reading
turner sp, beemsterboer pl ( ) enhancing academic integrity: formulating effective honor codes. j dent educ ( ): –
york-barr j, sommers wa, ghere gs, montie j ( ) reflective practice to improve schools: an action guide for educators, nd edn. corwin, thousand oaks
zheng rz, dahl lb ( ) an ontological approach to online instructional design. in: h song t kidd (eds) handbook of research on human performance and instructional technology. igi global, p – . doi: . / - - - - .ch

open practices and identity: evidence from researchers and educators’ social media participation

george veletsianos

george veletsianos is canada research chair in innovative learning and technology and associate professor with the school of education and technology at royal roads university in victoria b.c. his research examines learners’ and scholars’ practices and experiences in emerging online settings such as online social networks and open courses. address for correspondence: dr george veletsianos, school of education and technology, royal roads university, sooke road, victoria, bc v b y , canada.
email: veletsianos@gmail.com

practitioner notes
what is already known about this topic
• scholars use social media in teaching and research.
• social media shapes and is shaped by scholarly practice.
• social media can be used not just to amplify practice but also to transform scholarly endeavours.
what this paper adds
• scholarly practices are enacted openly in digital spaces and social media.
• academics use social media to present a self that is disentangled from academic matters.
• scholars’ digital participation practices may at times stand in stark contrast to and defy the evaluation metrics traditionally used to judge their work.
• social media may be considered to be a place of gathering.
implications for practice and/or policy
• consider teaching “sharing” as a scholarly and educational practice in masters/doctoral programs.
• consider examining the values of networked, digital and open scholars in order to develop ways to innovate higher education.
• consider openness and sharing as part of the academic reward structure.

abstract
the ways that emerging technologies and social media are used and experienced by researchers and educators are poorly understood and inadequately researched. the goal of this study is to examine the online practices of individual scholars in order to explore and understand the activities and practices that they enact when they use social media for scholarship. using ethnographic data collection methods and basic interpretive analysis techniques, i describe two emergent phenomena evident in scholars’ social media participation: scholarly practices enacted openly in digital spaces and the self that is disentangled from academic matters. these phenomena raise issues related to “sharing,” scholar identity, participation and social media as a place of gathering.

british journal of educational technology vol no – doi: . /bjet. 
© british educational research association

contemporary universities are facing numerous powerful forces that may shape their future. these include changing demographics, globalization and competition, a worldwide economic downturn, curtailment of public funding, pressures for accountability, and emerging technologies (morrison, ; schwier, ; siemens & matheos, ; spanier, ). emerging technologies like social media have facilitated global collaboration and sharing, but only recently have educators and researchers embraced the social and participatory nature of these in their scholarship (veletsianos, ). social media and technology have penetrated the higher education sector, and it has been posited that they have influenced scholarly functions (pearce, weller, scanlon & kinsley, ) as well as the ways scholarship is organized, delivered, enacted and experienced (weller, ). while it is debatable whether social media have led to cultural changes within academia, or if individual scholars have always had a desire for connectivity and social media simply became available to satisfy that need (veletsianos & kimmons, a), what is undeniable is that social media have become part of the fabric of contemporary societies and our educational systems.

at present, the ways that social media is used and experienced by scholars are poorly understood and inadequately researched (veletsianos, , ). the goal of this study is to examine online practices of individual scholars to answer the following research question: what activities and practices are evident when educators and researchers use social media for scholarship? by researching the online practices of networked scholars, we can gain a better understanding of scholars’ values, the degrees to which their activities align with values that characterize academic institutions and the culture surrounding online academic practice.
while there is a growing body of literature reporting on student experiences with social media (eg, greenhow, ; minocha, ; selwyn, ), empirical accounts of scholars’ practices in these environments are sparse, making this research worthwhile from both a practical and scholarly perspective.

to examine this topic, i first review literature relevant to scholars’ digital practices. next, i present the methods used to examine the posed research question. finally, i present the results of this investigation and discuss my findings.

review of relevant literature

the contemporary web can be described as a social platform that centers on users’ abilities to read, create, re-mix and contribute content in a user-friendly manner without the need for technical know-how. as a result, even though social media use patterns vary between countries, millions of individuals worldwide are participating on social media on a daily basis (singh, lehnert & bostick, ). higher education scholars also use social technologies in their personal and professional lives (moran, seaman & tinti-kane, ), including social networking sites (eg, facebook), media-sharing sites (eg, youtube) and microblogging platforms (eg, twitter). jenkins, purushotma, weigel, clinton and robinson ( ) argue that we live in a participatory culture, in a society in which the consumer no longer passively receives information, media and artifacts but also produces such content. these developments are seen as promising for higher education, partly because a web that is connected, democratic and user centered appears to align well with a socio-constructive ethos of learning, participation and knowledge creation (lave & wenger, ; vygotsky, , ). as a result, recent years have seen increasing calls for the use and integration of emerging technologies in higher education settings, fuelled by a rhetoric of “radical and transformational” impact on education (mcloughlin & lee, , para. ).
however, technological panaceas to educational problems are not new (mishra, koehler & kereluik, ), and selwyn ( ) suggests that the ways that social media is used in higher education, and the outcomes associated with these ways, do not match the rhetoric and aspirations described in the literature.

emerging technologies, and social media in particular, are an integral part of digital scholarship, which is often seen as a major breakthrough in innovating the ways knowledge is created, negotiated and shared. a report funded by the canadian social sciences and humanities research council for example, argued that “sustaining digital scholarship in canada will safeguard the knowledge context that ensures the continued creativity of the digital economy, contribute exemplary content to canada’s digital identity, promote the public good, boost employment and increase the numbers of flexible, highly qualified personnel” (bretz, brown & mcgregor, , p. ).

emerging technologies have been defined as “tools, concepts, innovations, and advancements” that
• may not necessarily be new technologies;
• are evolving and coming into being;
• go through hype cycles;
• are neither fully understood and neither fully researched or researched in a mature way; and
• even though they offer powerful potential for change, their potential is largely unattained (veletsianos, , p. ).

this definition suggests that emerging technologies may both shape and be shaped by practice. in other words, in the context of this paper, when emerging technologies are integrated in scholarly practice (eg, a scholar uses twitter to network with colleagues), the technology is expected to shape the way the scholar networks with his or her colleagues, and the practice of scholarly networking will shape how the scholar will use various technologies for scholarly purposes.
while the term “digital scholarship” is often used in reference to the use of technology to conduct research more efficiently (eg, use bibliographic management tools) and provide access to it faster (eg, through e-journals), recent research suggests that social media can be used not just to amplify practice but also to transform scholarly endeavours (weller, ). in an attempt to explain scholarly practices online, veletsianos and kimmons ( b, p. ) suggest that open scholarship refers to “a collection of emergent scholarly practices that espouse openness and sharing . . . [encompassing] ( ) open access and open publishing, ( ) open education, including open educational resources and open teaching, and ( ) networked participation.” this study focuses upon the phenomenon of scholars’ networked participation defined as “scholars’ use of participatory technologies and online social networks to share, reflect upon, critique, improve, validate, and further their scholarship” (veletsianos & kimmons, a, p. ).

when employing social media in their practice, scholars appear to attempt to engage more effectively, and in different ways, with individuals and community groups interested in their scholarship (kirkup, ; kjellberg, ). for example, a number of researchers sought to examine the role and function of social technologies in scholarly lives. nardi, schiano and gumbrecht ( ) argued that scholarly blogs could “( ) update others on activities and whereabouts, ( ) express opinions to influence others, ( ) seek others’ opinions and feedback, ( ) ‘think by writing,’ and ( ) release emotional tension” (p. ), while walker ( ) noted that blogs could enable academics to launch critical discussions of their practice.
in interviewing academics about their blogging practice, kirkup ( ) found that blogs are closely connected to professional identity as they enable academics to become “public intellectuals.” this evidence is corroborated by kjellberg ( ) and martindale and wiley ( ) who suggest that one of the motivations for blogging is to reach multiple and diverse audiences. a number of these findings have been replicated across social technologies. for instance, veletsianos ( , p. ) analyzed tweets from higher education scholars and found that when using the service they “( ) shared information, resources, and media relating to their professional practice; ( ) shared information about their classroom and their students; ( ) requested assistance from and offered suggestions to others; ( ) engaged in social commentary; ( ) engaged in digital identity and impression management; ( ) sought to network and make connections with others; and ( ) highlighted their participation in online networks other than twitter.”

on the other hand, scholars’ social media practices are rife with tensions. evidence from procter et al ( ), for example, seems to suggest that social media supplements rather than displaces traditional media because current forms of information exchange are “adequate” and “entrenched” in academic reward and promotion structures. for example, emerging forms of scholarly engagement involving social media may reach different and diverse audiences, but the field lacks evaluation metrics and guidelines about how academics can report the impact of their scholarship resulting from such engagement. and even though new forms of peer evaluation may be devised (eg, reader evaluations), and some of these have already been used by journals (eg, plos one, ), alternative forms of scholarly evaluation are not yet well understood by academia (johnson, levine, smith & stone, ).
in addition, in a small-scale study of researchers at the university of milan, esposito ( ) found that even though a number of early adopters exist, some researchers are doubtful and uncertain about both the tools and the values underlying digital and open scholarship. further, in an in-depth study of three researchers’ lived experiences with online social networks, veletsianos and kimmons ( ) suggest that scholars’ use of social media may be characterized by a personal–professional tension: when using social media, scholars in the aforementioned study negotiate their participation in ways that allow them to maintain appropriate and meaningful connections with professional and non-professional contacts. thus, they (1) structure their participation such that others can scrutinize it or (2) limit their visible participation so as to reduce exposure to the tension between the personal and professional spheres of life. such managed participation appears to be important, as prevalent social media activities (eg, “friending” acquaintances, colleagues, and family) appear to introduce conundrums for scholars. at the same time, hall ( ) suggests that learning technology is not neutral and, rather than assuming its democratic, emancipatory or transformative potential, researchers need to critically examine the narratives of educational technology within the broader corporate and political agenda within which it exists.
towards this goal, veletsianos and kimmons ( b) identify a number of assumptions and challenges of the movement towards open and digital scholarship (eg, social stratification and exclusion), noting that even though aspirations to broaden access to knowledge “might arise from the inadequacies and shortcomings of the status quo,” this does not make digital/open scholarship systems, decisions and initiatives “exemplar or just.” for instance, it is often assumed that open systems may foster an environment of equality and democratization (eg, everyone can have free access to articles published under an open access license), but in reality, some scholars may be able to benefit more than others as a result of their “knowledge, wealth, power, and ability” (c.f. chander & sunder, ).

while researchers have begun examining the experiences and practices of academics and scholars who participate on social media, investigations that observe and examine situated practices directly are limited. in particular, ethnographic methods may yield findings that are not captured in studies guided by the research approaches utilized so far in examining the topic of this paper. in addition, ethnographic investigations of situated practices may contribute to an understanding of the social, political and cultural factors that influence technology use and nonuse. such investigations appear to be desperately needed in the educational technology field (selwyn, ). this investigation’s contributions therefore may be valuable in painting a more accurate and holistic picture of scholarly practices in online spaces.

research question

in this study, i posed the following research question: what activities and practices arise when researchers and educators use social media?
british journal of educational technology vol no © british educational research association

methods

the overarching objective of this research is to make sense of practices with social media and to understand why scholars use emerging technologies, such as social media, in the ways that they do. these phenomena are complex and the research question posed above, as well as the method of analysis to answer that question, seek to uncover ways individuals participate in online settings. this study was influenced by cyberethnography (ward, ) and virtual ethnography (hine, ). however, unlike these analytical approaches, and akin to watulak ( ), instead of doing ethnography (green & bloome, ), my use of ethnography is limited to ethnographic data collection methods. specifically, i have been an active participant and contributor on social media sites. in these spaces, i interact with educators, researchers and students within the field of educational technology, participate in virtual events (eg, open courses) and keep a journal of digital artifacts, reflections and observations. this journal consists of observations, thoughts, reflections, screenshots and news articles. some of these artifacts are derived from a wide array of social media sites such as blogs, microblogging sites and video-sharing sites. this journal centers around three questions: “what am i observing in my social media participation with regard to scholarly practice? what phenomena and/or issues arise? what do these observations mean for scholarly practice, and how can we make sense of them?” this journal, guided by my observations and experiences, then becomes the main data source that i analyze to answer the research question posed above. the result of that analysis is the qualitative study reported herein. data analysis was guided by the constant comparative method (glaser & strauss, ) in which i coded data to note emerging patterns.
in particular, i took a piece of data (eg, a journal entry), coded it, compared it with other pieces of data (eg, a category, a journal entry, a screenshot), examined whether the data were similar or different, and developed additional categories to capture similarities and differences between the data. this process was iterative. in other words, i coded data, entered new data (ie, continued journaling) and continued comparing new and existing data and categories in an ongoing fashion. at times, new data changed my understanding of the practices i was observing. at other times, new data reinforced my understanding of what i was observing. the patterns that i identified in my data were compiled and reanalyzed across journal entries. journaling and analysis are ongoing processes, but after going through the process multiple times, i felt that the data had been saturated and could be grouped into the two major themes that i discuss below. it is important to note that this study attempts to understand practices and phenomena through a reflexive, autobiographical and reflective lens. it is reflexive and autobiographical because i use my personal experiences as a way to frame the phenomena under examination, and reflective because i reflect upon both my experiences and the participation of others within social media. even though the insights i gleaned from these artifacts were triangulated with personal experiences and theoretical constructs, the research method that i followed poses threats to validity and reliability. by being so close to the culture under investigation, it is possible that i have internalized the values and norms that characterize scholars who use social media. in addition, my life experiences as a learner, researcher, educator and user of social technologies influence the ways in which i view the world.
however, rather than ignoring these threats, while conducting this investigation, i have consciously sought to contain and keep my prior experiences in check, a process known as “bracketing” (giorgi, ). i have done this by continuously reflecting on my analysis and interpretation, and persistently questioning the degree to which this analysis reflects my own preunderstandings of the phenomena under investigation. in addition, in presenting the results, i have attempted to provide rich descriptions of the phenomena examined so as to enable readers to evaluate the plausibility of the results (merriam, ).

methodologically, this investigation lies within the constructivist realm, as i do not purport to uncover a monolithic truth explaining the ways scholars act and participate online. rather, i believe that there exist multiple truths, some that are complementary and some that are contradictory. furthermore, these truths may coexist. for example, it may be true that some scholars are comfortable being on social media, while others may feel trepidation about participating online. it might also be true that the source of that uneasiness varies. for instance, some scholars might feel uneasy about participation because they see no real benefit to participation, while others see benefits but the potential costs of participation (eg, privacy concerns) outweigh those benefits. finally, the anecdotes and stories that i share below may not appear new or surprising to individuals who participate on social media. however, they are novel in the sense that they help describe and make explicit actions enacted on social media spaces that may either be implicit or left unexamined. as such, this essay may help us better understand how some scholars use social media in today’s environment and what their values are.
results

i remember the exact moment when i decided to join twitter and created a professional blog. i was reading chapter proposals for a book that i was editing and one proposal made such a big impression upon me that i decided to spend more time using these technologies outside of the courses that i was teaching. at first, i often struggled with the notion of public participation on social media, of “putting myself out there,” publishing draft ideas and sharing details of my professional and nonprofessional life that i assumed others would find incomplete, dull or irrelevant. in retrospect, the source of this struggle was partly the training and scholarly enculturation that i received during my graduate degree. this training, implicit as it may have been, highlighted the notion that researchers: (1) can be “scooped out of ideas” if they share ideas prematurely and (2) are experts, knowledgeable in their field of study, confident of their work and should present themselves as such.

since then, i have also learned that i am not the only scholar facing this struggle. others have expressed this struggle eloquently, noting that publishing “half-baked” ideas might be a dangerous career move for scholars because, even though an individual might benefit from thinking through ideas via writing, others may judge one’s work in a decontextualized manner without necessarily observing or understanding how a particular blog entry, for example, evolved over time. the struggles and tensions scholars may experience when participating on social media have also been reported elsewhere (eg, veletsianos & kimmons, ) wherein participants (1) wondered whether the time spent on social media could have been spent more productively elsewhere, and (2) struggled with the degree to which they should share professional versus nonprofessional information.
these issues point to an increasing tension between personal and professional identity, the spectrum of sharing that lies between the two, and the perception of who a scholar is and what she/he does. sharing and issues of scholar identity (eg, the degree to which scholarly identity is distinct from nonprofessional identity) permeate social media participation. these issues are not limited to social media. in discussing the tattoos of faculty members, for example, leonard ( , para. ) notes that “[i]n a university culture, where faculty are often reduced to numbers – grant dollars, student credit hours, teaching-evaluation scores, publication numbers – tattoos offer a space to disentangle our individual selves from the bureaucratic and corporate university.” in the same way, the social media uses that i have seen in my ethnographic journey appear to center around issues of identity and control. to explore these issues, i describe two emergent phenomena: scholarly practices enacted openly in digital spaces and the self that is disentangled from academic matters.

scholarly practices enacted openly in digital spaces

the image of the “lone scholar” tirelessly working on his or her research in a dimly lit office is in stark contrast to the connected and visible scholarly practices enacted online by the scholars that i have encountered in this investigation. while a number of specialized tools have been developed specifically for networked scholarly practice (eg, academia.edu, mendeley), scholars have also used their own individual websites, most often blogs, as spaces in which they enact and pursue their scholarship in a visible manner. in addition, personal websites often serve as areas where faculty’s digital presence is aggregated and various scholarly artifacts are featured.
for example, personal websites frequently feature the creator’s blog, twitter feed, reading list, teaching artifacts (eg, syllabi), scholarly artifacts (eg, copies of publications) and links to topics of interest (eg, professional organizations) or blogs that the individual reads. the question that is of interest here is: what types of scholarly practices are made visible through scholars’ online participation? examples of activities that i have frequently observed being enacted in public digital spaces include:
• announcements. examples include announcements of publications, awards and job opportunities;
• sharing drafts of manuscripts and requesting/receiving feedback on them;
• developing and releasing textbooks written as part of a course (eg, amado et al, ; correia, );
• sharing syllabi and instructional activities;
• live streaming or sharing video/photographs from one’s own teaching;
• live blogging and live tweeting a conference keynote or session;
• authoring and participating in the writing of collaborative documents pertaining to conference sessions/workshop;
• engaging in debates and commentary on professional issues;
• teaching. for example, scholars have taught courses that were made freely available to anyone who wished to participate (eg, hilton, graham, rich, & wiley, );
• making one’s tenure and promotion materials publicly available;
• reflecting on and conversing about the doctoral process and thesis/dissertation. doctoral students have used blogs and wikis to share their work, and reflect upon and document their progress. self-organized systems through which some of these activities are enacted have also been formed. for example, #phdchat is an active online community initiated by doctoral students.
individuals in the community use social media to update each other on their progress, share resources, learn about the profession, socialize and support each other;
• creating video trailers to describe, promote and highlight academic artifacts such as books;
• help-seeking with professional activities (eg, research, teaching), enacted in the form of crowdsourcing. for example, individuals have asked for help in locating research literature.

to further understand the values of scholars who use social media for scholarship, we can examine two of these practices in more detail. the first practice is the use of video trailers to describe, promote and highlight academic artifacts. these videos are often produced by individual faculty members, posted on a video-sharing website such as youtube and are then cross-posted on blogs and shared across social media services such as twitter and facebook. these videos are often of short duration (ie, about – minutes long) and attempt to relay the central tenets of the book, course, project or publication they advertise in a meaningful and appealing way. the fact that video trailers were originally conceptualized, developed and shared by individual faculty members (rather than by a production team at the academic institution which employs the individual) is important because it highlights (1) the ease with which individual scholars can develop and share media pertaining to their scholarship, (2) the creative opportunities afforded to scholars by emerging technologies, and (3) scholars’ willingness to share their work outside of formal and institutional structures. in reflecting upon this practice, we may note that faculty members, who are not traditionally trained in media production, are seeking ways to engage with audiences who may be outside the reach of the dissemination methods that have historically been at their disposal.
while the concept of media creation, curation, re-mixing and recirculation is not new (c.f. jenkins et al, ), one may see here the emergence of an academic subculture firmly grounded on participatory ideals.

the second practice that requires deeper analysis is that of accessing research literature through crowdsourcing. imagine being at a university whose library does not have access to a journal that has published an article that you need for your teaching or research. or imagine that, even though you have access to the journal, the year in which the paper was published falls outside of your institution’s subscription dates. for example, the journal imposes a -month lag between the time the paper is published and the time it becomes available electronically. what options are available to you? you could purchase the article. or, you could email the article’s author and ask for a copy of the paper. alternatively, you could search for the article online in the hopes that the author has self-archived the paper (eg, on his/her personal website). in addition to these approaches, i have discovered that individuals have also appropriated social technologies to find and share journal papers. reminiscent of peer-to-peer networks for music sharing, scholars have used pirateuniversity.org, thepaperbay.com, the scholar subreddit and the #icanhazpdf hashtag to access scholarship that they need. on these websites, individuals request papers that they do not have access to, and those who have access (eg, through their institution’s libraries) provide them with a copy of the papers. coverdale ( ) has also discussed the role of file-sharing sites in retrieving and providing access to academic work. these activities are important because they suggest that individuals are willing and able to circumvent and defy restrictions to the sharing of knowledge and research.
kroll ( ) describes this as “an act [of] civil disobedience toward the scientific publishing enterprise.” in fact, open access to scholarship is a value that is close to the hearts and minds of numerous scholars who use the internet for professional purposes, and even though they may not publicly embrace or endorse the activities described above, they often make their stance in support of open access known. for example, a number of them have used their blogs to explain that they refuse to publish in or review for journals that are not embracing open access ideals.

the self that is disentangled from academic matters

even though social technologies have been repurposed for scholarly activities, they were not originally designed for those purposes (hemmi, bayne & land, ). as the design of online social networks is framed around social and informal sharing, the presentation of the self that is disentangled from academic matters is a theme that is persistent in my observations of scholars’ activities online. scholars use social media for nonprofessional purposes, revealing multiple dimensions of their selves. for example, scholars have shared not just their scholarly successes, but their vulnerabilities and struggles (eg, with a divorce) and sought help with personal issues and causes that they are passionate about (eg, equal rights legislation and raising awareness concerning debilitating diseases). importantly, it appears that engagement with and sharing about issues unrelated to the profession is a value that is celebrated by this community. it is not uncommon, for example, to encounter blog entries discussing the positive outcomes of social sharing and twitter profiles proudly declaring that updates are composed of a mix of personal and professional tweets.
as i have argued elsewhere (veletsianos, ), even though some status updates, such as meal preferences, may appear to serve limited professional purposes when seen out of context, they might actually serve significant social functions. scholars’ participation in the “childhood walk” internet meme exemplifies this theme. an internet meme is a concept or artifact created by individuals (eg, an image with a caption, a re-enactment of an activity such as a dance) that spreads virally on the internet. examples of popular memes include: lolcats, three wolf moon t-shirt, numa numa dance and, more recently, the harlem shake videos. “a childhood walk” was a concept developed by internet performance artist ze frank, in which he asked individuals to use google street view to share stories and screenshots of childhood walks and memories. numerous individuals contributed stories (archived at http://www.zefrank.com/the_walk/), and among those, a number of researchers and educators shared their stories of childhood walks (eg, figure ). these stories present a self that is often unseen outside of small-group gatherings, and may enable individuals to develop relationships and bonds.

discussion

in this paper, i explored practices and activities that arise when researchers and educators use social media. i focused upon two phenomena and described how individuals have enacted scholarly practices openly in digital spaces and revealed a self that is disentangled from academic matters. while these practices may not be new, or even uniform across scholars and educators participating in social media, they nonetheless highlight a number of issues that are worthy of consideration and reflection. “sharing” is a persistent concept in these findings. the individuals who are embracing sharing practices are finding value in doing so, and often speak out in favor of sharing.
it is not unusual, for example, to encounter quotes such as “good things happen to those who share,” or “sharing is caring” or “education is sharing.” these quotes illustrate and exemplify the values of this subculture. while faculty members have historically shared their work with each other (eg, through letters, telephone calls and conference presentations), and open access publishing is gaining increasing acceptance (scanlon, in press; weller, ), educators and researchers are increasingly sharing their scholarship online in informal, open spaces. wiley and green ( , p. ) even argue that “[e]ducation is, first and foremost, an enterprise of sharing. in fact, sharing is the sole means by which education is effected.” however, education, from k-12 to higher education, has generally lacked a culture of sharing. barab, makinster, moore and cunningham ( ) note that “change efforts [in k-12] have often been unsuccessful due in large part to the lack of a culture of sharing among teachers (chism, ).” if we were to consider the values of this subculture moving forward and follow the example they set, what role would “sharing” play in our practice? a core value of this subculture is that sharing should be treated as a scholarly and educational practice. toward this goal, the practice of sharing needs to be taught in teacher preparation and doctoral programs alike, in the same way that student-centered pedagogies, digital citizenship and new media literacies are taught. the increasing calls for open access scholarship are a step in this direction.

figure : the author’s version of “a childhood walk”

scholars’ social media practices and innovations (eg, sharing draft versions of a manuscript, accessing literature through crowdsourcing) appear to question traditional elements of scholarly practice.
such elements (eg, peer-review, publishing refined rather than in-progress work) are elements upon which scholarship has been established and is being enacted today. however, techno-cultural advances appear to have led to increasing calls to reevaluate the ways scholarship is disseminated, negotiated and evaluated. taken further, some of the digital scholarly practices i observed could be characterized as small acts of defiance against institutional norms, tenure and promotion practices, and the status quo. for example, tenure and promotion materials are not usually made publicly available. on the other hand, this research has also demonstrated that the alternative is also true, namely that social media have been used in supporting commonly accepted forms of scholarly engagement such as knowledge creation and dissemination. as modern universities are facing pressures to change within a world increasingly dominated by managerial and corporate practices that value quantitative metrics of evaluation, scholars’ digital participation may at times stand in stark contrast to and defy the evaluation metrics traditionally used to judge their work. viewing scholarly participation online with this lens allows us to ask a set of future research questions: what challenges, if any, are scholars facing at their institutions as they enact the practices described in this paper? how have scholars’ institutions responded to these practices, and how do these responses differ across institutions?

the practices described in this paper suggest that social media can be viewed as a place where scholars can congregate to share their work, ideas and experiences. this place is similar to a conference gathering or a meeting of colleagues at a writing retreat. through social media gatherings, distributed individuals build ties, bonds and solidarity, even when they may not have met each other face-to-face.
shared values (eg, openness) contribute to the formation of bonds and solidarity. a worthwhile question to ask in future research might be, how has this community developed over time? and what role has the technology played in its development?

while prior research has explored the degree to which seniority may influence participation in digital scholarship (eg, kirkup, ), and researchers have argued that “professional seniority [gave professors] the confidence to invest time in non-traditional academic production” (p. ), this research has not differentiated participation between individuals at different levels of their career. instead, this research has focused on identifying emergent phenomena that were common across participants. however, it is important to note that individuals face different challenges and may come across different opportunities at different levels of their academic career. therefore, it would be worthwhile for future research to explore how scholars use social media to cope with the expectations of their academic roles (eg, being a doctoral student vs. being a newly hired faculty member).

finally, this paper shows that academics have taken advantage of the social and playful nature of participatory technologies to share aspects of their life that are often considered private. academic identity online is an issue that requires deeper analysis as recent investigations have sought to explore the relationship between identity and online participation. stewart ( ), for example, sees six key selves in social media, or six ways of being. these are: the performative, public self; the quantified or articulated self; the asynchronous self; the polysocial or augmented reality self; and the neo-liberal or branded self. these descriptors overlap with the types of participation described in this paper.
in addition, kimmons and veletsianos ( ) have found that when participating in online social networks, individuals put forth fragments of their identity that they deem to be acceptable to others. in short, individuals in their research presented their “true self” online, but actively managed that presentation to conform to what they perceived others would deem socially acceptable. it may be worthwhile for future work to further explore how scholars perceive and construct their identity online.

conclusion

in this paper, i have examined practices and activities that arise when researchers and educators use social media. the practices identified are varied. many of these practices are not new; rather, they are enacted in different spaces and under different constraints. yet, they are novel. they enable us to examine the values of a subculture of digital scholars and allow us to seek further understanding of why scholars engage in the activities that they do on social media.

acknowledgements

the stellar fellowship mobility program supported this research. the views and results presented in this report should not be taken to represent the views or opinions of the stellar fellowship mobility program.

notes

1. in this paper, “scholarship” refers to research, teaching and service activities.
2. at the time of writing, reddit was a popular content aggregator whose content was contributed by users. a subreddit was a community of users who shared a common interest (eg, exercise, veganism, education, etc).
3. a hashtag refers to the twitter practice of adding the # symbol in front of a word in order to categorize a message for indexing/retrieval. for example, a popular hashtag is #edtech and is used by individuals to categorize a message as being relevant to educational technology.

references

amado, m., ashton, k., ashton, s., bostwick, j., clements, g., darnall, r. et al ( ).
project management for instructional designers. retrieved february , from http://pm id.org/
barab, s., makinster, j., moore, j. & cunningham, d. ( ). designing and building an on-line community: the struggle to support sociability in the inquiry learning forum. educational technology research and development, , , – .
bretz, a., brown, s. & mcgregor, h. ( ). lasting change: sustaining digital scholarship and culture in canada. report of the sustaining digital scholarship for sustainable culture group. retrieved july , , from http://tinyurl.com/sustaindigsc
chander, a. & sunder, m. ( ). the romance of the public domain. california law review, , , – .
chism, n. ( ). the place of peer interaction in teacher development: findings from a case study. paper presented at the annual meeting of the american educational research association (aera), chicago, il.
correia, a.-p. (ed.) ( ). breaking the mold: an educational perspective on diffusion of innovation. retrieved february , from http://en.wikibooks.org/wiki/breaking_the_mold:_an_educational_perspective_on_diffusion_of_innovation
coverdale, a. ( ). free academic books here! retrieved november , , from http://phdblog.net/free-academic-books-here/
esposito, a. ( ). neither digital or open. just researchers. views on digital/open scholarship practices in an italian university. first monday, . retrieved may , , from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/ /
giorgi, a. ( ). the theory, practice, and evaluation of the phenomenological method as a qualitative research procedure. journal of phenomenological psychology, , , – .
glaser, b. & strauss, a. ( ). the discovery of grounded theory: strategies for qualitative research. chicago, il: aldine.
green, j. & bloome, d. ( ). ethnography and ethnographers of and in education: a situated perspective. in j. flood, s. heath & d. lapp (eds), a handbook of research on teaching literacy through the communicative and visual arts (pp. – ).
new york: simon & schuster macmillan.
greenhow, c. ( ). online social networks and learning. on the horizon, , , – .
hall, r. ( ). revealing the transformatory moment of learning technology: the place of critical social theory. research in learning technology, , , – .
hemmi, a., bayne, s. & land, r. ( ). the appropriation and repurposing of social technologies in higher education. journal of computer assisted learning, , , – .
hilton, j. l., graham, c., rich, p. & wiley, d. ( ). using online technologies to extend a classroom to learners at a distance. distance education, , , – .
hine, c. ( ). virtual ethnography. london: sage publications.
jenkins, h., purushotma, r., weigel, m., clinton, k. & robinson, a. j. ( ). confronting the challenges of participatory culture: media education for the 21st century. cambridge, ma: the mit press.
johnson, l., levine, a., smith, r. & stone, s. ( ). the horizon report. the new media consortium. retrieved february , , from http://www.nmc.org/pdf/ -horizon-report.pdf
kimmons, r., & veletsianos, g. ( ). the fragmented educator: social networking sites, acceptable identity fragments, and the identity constellation. under review.
kirkup, g. ( ). academic blogging: academic practice and academic identity. london review of education, , , – .
kjellberg, s. ( ). i am a blogging researcher: motivations for blogging in a scholarly context. first monday, , , – .
kroll, d. ( ). #icanhazpdf: civil disobedience? retrieved on november , , from http://cenblog.org/terra-sigillata/ / / /icanhazpdf-civil-disobedience/
lave, j. & wenger, e. ( ). situated learning: legitimate peripheral participation. cambridge, uk: cambridge university press.
leonard, d. ( ). the inked academic body. the chronicle of higher education. retrieved october , , from http://chronicle.com/blogs/conversation/ / / /the-inked-academic-body/
martindale, t. & wiley, d. a. ( ).
using weblogs in scholarship and teaching. techtrends, , , – . mcloughlin, c. & lee, m. ( ). future leaning landscapes: transforming pedagogy through social soft- ware. innovate, , . retrieved may , , from http://tinyurl.com/ qs q merriam, s. ( ). what can you tell from an n of ?: issues of validity and reliability in qualitative research. paace journal of lifelong learning, , – . minocha, s. ( ). an empirically-grounded study on the effective use of social software in education. education and training, , / , – . mishra, p., koehler, m. j. & kereluik, k. ( ). the song remains the same: looking back to the future of educational technology. techtrends, , , – . moran, m., seaman, j. & tinti-kane, h. ( ). blogs, wikis, podcasts, and facebook: how today’s higher education faculty use social media. boston, ma: pearson learning solutions. morrison, j. ( ). u.s. higher education in transition. on the horizon, , , – . nardi, b. a., schiano, d. j. & gumbrecht, m. ( ). blogging as social activity, or, would you let million people read your diary? proceedings of the � acm conference on cscw (pp. – ). pearce, n., weller, m., scanlon, e. & kinsley, s. ( ). digital scholarship considered: how new technolo- gies could transform academic work. in education, , . retrieved may , , from http://ineducation. ca/article/digital-scholarship-considered-how-new-technologies-could-transform-academic-work plos one ( ). article-level metrics information. retrieved january , , from http://www.plosone. org/static/alminfo procter, r., williams, r., stewart, j., poschen, m., snee, h., voss, a. et al ( ). adoption and use of web . in scholarly communications. philosophical transactions of the royal society, , – . scanlon, e. (in press). scholarship in the digital age: open educational resources, publication, and public engagement. british journal of educational technology. schwier, r. ( ). the corrosive influence of competition, growth, and accountability on institutions of higher education. 
journal of computing in higher education, , , – . selwyn, n. ( ). looking beyond learning: notes towards the critical study of educational technology. journal of computer assisted learning, , , – . selwyn, n. ( ). social media in higher education. in a. gladman (ed.), the europa world of learning (pp. – ). london, uk: routledge. siemens, g. & matheos, k. ( ). systemic changes in higher education. in education, , . retrieved july , , from http://ineducation.ca/article/systemic-changes-higher-education singh, n., lehnert, k. & bostick, k. ( ). global social media usage: insights into reaching consumers worldwide. thunderbird international business review, , , – . spanier, g. ( ). creating adaptable universities. innovative higher education, , , – . british journal of educational technology vol no © british educational research association stewart, b. ( ). digital identities: six key selves in networked publics. retrieved may , , from http://theory.cribchronicles.com/ / / /digital-identities-six-key-selves/ veletsianos, g. ( ). a definition of emerging technologies for education. in g. veletsianos (ed.), emerging technologies in distance education (pp. – ). edmonton, ab: athabasca university press. veletsianos, g. ( ). higher education scholars’ participation and practices on twitter. journal of compu- ter assisted learning, , , – . veletsianos, g. & kimmons, r. ( a). networked participatory scholarship: emergent techno-cultural pressures toward open and digital scholarship in online networks. computers & education, , , – . veletsianos, g. & kimmons, r. ( b). assumptions and challenges of open scholarship. the international review of research in open and distance learning, , , – . veletsianos, g. & kimmons, r. ( ). scholars and faculty members lived experiences in online social networks. the internet and higher education, , , – . vygotsky, l. s. ( ). thought and language. cambridge, ma: mit press. vygotsky, l. s. ( ). mind in society. 
cambridge, ma: harvard university press. walker, j. ( ). blogging from inside the ivory tower. in a. bruns & j. jacobs (eds), uses of blogs (pp. – ). new york, ny: peter lang. ward, k. j. ( ). cyber-ethnography and the emergence of the virtually new community. journal of information technology, , , – . watulak, l. s. ( ). ‘i’m not a computer person’: negotiating participation in academic discourses. british journal of educational technology, , , – . weller, m. ( ). the digital scholar: how technology is transforming scholarly practice. london, uk: bloomsbury academic. wiley, d. & green, c. ( ). why openness in education? in d. oblinger (ed.), game changers: education and information technologies (pp. – ). washington, d.c.: educause.

building bridges: pedagogical strategies for introducing digital humanities in the undergraduate and graduate classroom

sarah l. ketchley & wendy perla kurtz

this poster highlights some of the challenges of teaching introductory-level digital humanities courses in undergraduate and graduate classrooms, and describes pedagogical solutions developed by faculty at ucla and the university of washington to address these complexities. these solutions include identifying and developing the core skill sets students need to begin work in digital humanities, including best practices for project management, for working with data, and for interpreting and presenting analyses. from a faculty standpoint, the poster suggests strategies for building collaborative partnerships between libraries and faculty to best leverage each group’s expertise.

the literacies developed in the digital humanities classroom include, yet transcend, the ‘traditional’ passive literacies of reading, hearing and seeing, extending into the active realms of finding, evaluating, creating, engaging and communicating with an audience that may reach beyond institutional boundaries.

recognizing the overlaps between the core skills of digital pedagogy helps faculty identify how to leverage the expertise available through the library and centers for digital scholarship. including these campus stakeholders in the development of curricula, and potentially involving them in classroom instruction for discrete topics, fosters partnerships that can continue beyond a single course.

the authentic, collaborative classroom environment described throughout this poster necessitates partnerships between it staff, edtech groups, and the library. instead of framing the contributions from staff as a service, these genuine collaborations foster an integrated classroom that encourages the growth of cross-campus relationships.

the theoretical framework developed by eshet et al. provides a robust basis for designing a course geared towards cultivating digitally literate students.

in order to gather the data necessary to conduct meaningful research, work in the dh class combines ‘traditional’ linear search-and-retrieve exercises with a more iterative and cyclical flow: retrieving digital material, evaluating and cleaning it as appropriate, analyzing and visualizing, then returning to repeat the process until satisfactory results are achieved. in this scenario, students are required to evaluate and order considerable amounts of information from disparate sources in a short period of time. repeating this process on multiple occasions, and perhaps across different platforms, helps reinforce this mental model for students, moving it from an abstract concept to a defined workflow. the theory and practice of this type of work is the remit of librarians, who can offer advice on topics ranging from sourcing appropriately licensed data and initiating efficient and comprehensive searches to curating data and working with metadata. finally, libraries can act as ‘skill-hubs’, providing recommendations for campus training resources that may not otherwise be readily discernible.
overview
discussion and conclusions

as well as playing a central role in the provision of digital services to faculty and students, centers for digital scholarship should, arguably, be equally involved in dh pedagogy. often housed in the library, they are staffed by librarians whose skills range from traditional librarianship and technical expertise to digital project development and management, as well as the provision of training, workshop development and education. although humanities faculty teaching with or about dh tools and methodologies may draw on librarians for support in developing their digital humanities curricula, more often they develop course materials without taking advantage of this overlapping expertise and experience. the poster highlights areas of library expertise that we have drawn on in our own dh teaching, and explores potential models for collaboration between faculty and librarians which address the balance between technology and subject matter expertise.

introduction
contacts

sarah l. ketchley, ph.d., department of near eastern languages & civilization, university of washington, seattle wa. ketchley@uw.edu, @sarahketchley
wendy perla kurtz, ph.d., digital humanities program, university of california, los angeles, los angeles ca. wpkurtz@ucla.edu, @wendythedh

figure . syllabus for ‘geospatial humanities: mapping space and place,’ ucla
figure . after yoram eshet-alkalai. “digital literacy: a conceptual framework for survival skills in the digital era”. journal of educational multimedia and hypermedia ( ) ( ), - .
figure . syllabus for ‘an introduction to digital humanities’, university of washington, winter quarter

both syllabi are designed as foundational courses in computation for the humanities: one is an introduction to digital humanities and the other focuses specifically on geospatial techniques for the humanities. the commonalities are: 1) planning and scoping; 2) data wrangling; 3) analysis; and 4) a public-facing presentation.
these sample syllabi from our classrooms underscore the value of identifying common pedagogical ground in order to develop curricular materials and teaching strategies that are relevant to both digital scholars and librarians, and that are taught by specialists from both fields. the goal is to make dh pedagogy extensible and engaging for both instructors and students. just as dh has redefined the nature of traditional scholarship, so dh pedagogy is redefining how we build and deliver our courses.

conversations with librarians early in the planning process can help identify the most appropriate digital tools to use in the classroom; considerations may include ease of installation and management, sustainability, good documentation and extensibility.

the knowledge and skills developed by librarians naturally align with some of the principal concepts at work in dh pedagogical practices. these skills include traditional librarianship, technical expertise, digital project development and management, as well as the provision of training, workshop development and education.

miriam posner has identified three core levels to any dh project, no matter which methodology is used, from mapping to text mining. curating, processing and presenting, to which we have added preserving, are foundational to dh project work. these are also the core competencies of librarianship.

figure showcases the skills and expertise the library can offer in support of dh research and pedagogy on campus. it can also be used to uncover gaps which require further funding and support from administration. most importantly, it is a means to shed light on the work librarians are already doing in support of dh, underscoring the library’s status as a valuable stakeholder and partner.

fostering collaboration
promoting digital literacies

figure . librarians as digital experts: skills and expertise

sample syllabi

why collaboration matters

the average dh project has many collaborators, since it is understood that no one person can be expert in all aspects of successful project-building. the dh syllabi we showcase here represent a full, albeit compressed, project cycle, with the addition of a pedagogical layer. it is therefore unsurprising that collaboration in the dh classroom is as important as it is in a dh research project. project-based learning, and the group work that inevitably serves as the base for those projects, relies on successful team dynamics. but fostering productive partnerships takes time, patience, and mutual goals, which can be challenging. in addition, people have only so much bandwidth available in their current roles.

so, why should we strive to foster meaningful collaborations? relationship-building between faculty and staff or student peer groups enriches the educational experience for students. faculty teaching dh courses find the support they need to plan and run complex courses, and librarians can contribute their expertise in support of student research and learning outcomes. creating a working agreement, scoping the project, and establishing parameters and expectations are crucial to developing healthy and long-lasting collaborative partnerships.

course description

spatial humanities, geohumanities, gis humanities, “deep maps,” “digital culture mapping,” and “the spatial turn” are terms that have grown in popularity over recent years in the humanities and digital humanities. digital mapping makes it possible to create rich stories of culturally, socially, and historically relevant materials on a cartographic interface, converting a purely geographic space into a place. this course will provide an overview of gis and other mapping techniques through project-based assignments.
students with little to no gis experience will be exposed to the theories, concepts, and methods used for mapping projects in the humanities and social sciences. students with a gis background will have the opportunity to explore non-traditional uses of mapping systems.

course outcomes and learning objectives

by experimenting with a wide range of tools that employ a variety of data types (both structured and unstructured), students will learn the basics of mapping and geospatial information. students will build the skills necessary to enhance their spatial thinking and literacy. by learning to manipulate gis software and generate basic spatial analysis, students will be able to apply spatial research methods to their research in their own subject areas. students will also learn best practices for planning, managing, and carrying out a digital mapping project by creating a web or mobile-based project.

course description

a no-prerequisite course that introduces students to the concepts and methodologies of using digital humanities tools for dataset creation, analysis and presentation. students will explore primary source material related to the lives and achievements of early pioneers in near eastern archaeology, focusing specifically on the period known as the 'golden age' of egyptology at the end of the th and early th centuries. students will analyze primary source documents using text mining methodologies, build digital maps and timelines, and ultimately present research results on an online platform.

course outcomes and learning objectives

humanities and social sciences students will become familiar with a range of tools and technologies for text mining and text analysis that will enhance their abilities to succeed both as undergraduate researchers and in their lives after graduation. students in technology disciplines will be able to explore the applications of digital tools to humanistic endeavors. students in this course will:

1. learn the basic vocabulary of concepts and tools in digital humanities and become acquainted with a range of projects, best practices and resources in the field.
2. gain hands-on experience of humanities dataset creation, curation, analysis and presentation.
3. gain an introductory knowledge of many open source digital tools or methods useful to broad humanities disciplines.
4. create a digital narrative to present the results of work in ( ).

posthuman agency in the digitally mediated city: exteriorization, individuation, reinvention

gillian rose, department of geography, open university

annals of the american association of geographers, vol , no . methods, models, and gis. pages - . received aug , accepted oct , published online: mar . https://doi.org/ . / . .

abstract

accounts by geographers of the ways in which urban spaces are digitally mediated have proliferated in the last few years. this significant body of work pays particular attention to the production of urban space by software and digital hardware, and geographers have drawn on various kinds of posthumanist philosophies to theorize the agency of the technological nonhuman.
the agency of the human, however, has been left undertheorized in this work, often appearing in the form of excessive resistance to the agency granted to the digital. this article contributes to understanding the digital mediation of cities by theorizing a specifically posthuman agency; that is, a human agency both mediated through technics and diverse. drawing on the philosophy of stiegler as well as a range of feminist digital scholarship, the article conceptualizes posthuman agency as always already coconstituted with technologies. posthumans are simultaneously individuated and exteriorized in that coconstitution, and this permits agency understood as reinvention. the article also insists that such sociotechnical agency is differentiated, particularly in terms of the spatialities and temporalities through which it is organized. it concludes by arguing that geographers must reconfigure their understanding of digitally mediated cities and acknowledge the inventiveness and diversity of urban posthuman agency. 过去数年来, 地理学者对于城市空间透过数码中介的方式已提出诸多的阐述。此一显着的研究工作, 特别关注由软件和数码硬件所生产的城市空间, 而地理学者已运用各种后人类哲学, 对科技的非人类能动性进行理论化。但人类的能动性却被此一研究工作遗落, 鲜少受到理论化, 并经常以极度反抗赋予数码能动性的形式出现。本文特别理论化同时透过科技中介和多样的人类能动性, 对于理解数码中介的城市作出贡献。本文运用斯蒂格勒的哲学, 以及女权主义数码研究的范畴, 将后人类的能动性概念化为总是已与科技共同构成。后人类在该共同构成中同时具体化与形象化, 并使得能动性可被理解为再创造。本文同时坚称, 此般社会科技能动性是差异化的, 特别是就组织该能动性的空间性与时间性而言。本文于结论中主张, 地理学者必须重组他们对于透过数码中介的城市之理解, 并认识到城市后人类能动性的创造性与多样性。 durante unos pocos años recientes han proliferado los recuentos de los geógrafos acerca de los modos como los espacios urbanos son mediados digitalmente. tan importante cuerpo de trabajo pone singular atención a la producción de espacio urbano por software y equipos digitales, y los geógrafos se han apoyado en varias clases de filosofías poshumanistas para teorizar la agencia de lo tecnológico no humano. 
la agencia de lo humano, sin embargo, ha sido soslayada en ese trabajo, sin suficiente teorización, apareciendo con frecuencia en la forma de excesiva resistencia a la agencia asignada para lo digital. este artículo contribuye a la comprensión de la mediación digital de las ciudades teorizando una agencia específicamente poshumana; esto es, una agencia a la vez mediada por medios técnicos, y diversa. con base en la filosofía de stiegler lo mismo que en una gama de erudición digital feminista, el artículo conceptúa la agencia poshumana como ya co-constituida desde siempre con tecnologías. los poshumanos son simultáneamente individuados y exteriorizados en esa co-constitución, lo cual permite que la agencia sea entendida como reinvención. el artículo insiste también en que tal agencia sociotécnica es diferenciada, particularmente en términos de las espacialidades y temporalidades a través de las cuales está organizada. en el artículo se concluye arguyendo que los geógrafos deben reconfigurar su entendimiento de las ciudades mediadas digitalmente y reconocer la inventiva y diversidad de la agencia urbana poshumana.

key words: difference, digital, feminist, posthuman, stiegler
关键词: 差异, 数码, 女权主义, 后人类, 斯蒂格勒。
palabras clave: diferencia, digital, feminista, poshumano, stiegler

acknowledgments

i would like to thank the openspace work in progress group at the open university for their robust feedback on an early version of this article, as well as the article's referees.

notes

1. smart cities are those in which digital technologies are deployed to achieve economic growth (through innovating new products and markets), environmental sustainability (by encouraging more efficient use of resources), and openness (by enabling greater citizen participation in city governance). at least, these are the claims made on behalf of the smart city by its advocates.

2. for stiegler's scathing critique of contemporary art practice, see crowley ( ).

3. kinsley ( ) does acknowledge this aspect of stiegler's work.

4. it is interesting to note here a connection to the work of rancière, unsurprisingly, as both stiegler and rancière are deeply influenced by foucault (crowley ). for rancière, framings of time and space dictate who (and what) is visible and audible, where and when. power, he argued, resides in the hierarchies embedded in such framing; and “politics,” for him, “is made possible by subjects transfiguring, transforming, appropriating space for the manifestation of dissensus” (dikec , ). rancière located the agency of that transfiguring anywhere, with anyone because, as he insists in his book the emancipated spectator (rancière ), everyone is always constantly learning about the world, and becoming human through that learning: “the human animal learns everything in the same way … as it learnt to venture into the forest of things and signs that surrounding it, so as to take its place among human beings: by observing and comparing one thing with another, a sign with a fact, a sign with another sign” ( ). in an approach similar to stiegler's, this is not learning in the sense of gaining more and more knowledge; it is learning as an orientation in the world. “we also learn and teach, act and know, as spectators who all the time link what we see to what we have seen and said, done and dreamed” (rancière , ).

5. stiegler's account does not therefore assume that all human bodies always have similar kinds of posthuman agency (moore ). nor, in his logic, is such agency exclusive to them: other, entirely nonhuman entities could also be capable of invention, though as jöns ( ) suggested, posthumans perform it most intensely.

6. i would also emphasize that there are multiple forms of power in the digitally mediated city, in contrast to the somewhat binary accounts of corporations or governments and citizens that appear in some geographical accounts (see also buscher et al. ).

7. this suggests that geographers interested in the mediation of cities should study the practices in which all of these are embedded, not just the “resistant” (rodgers, barnett, and cochrane ).

notes on contributors

gillian rose is professor in the department of geography, the open university, walton hall, milton keynes mk aa, uk. e-mail: gillian.rose@open.ac.uk. her research interests include digital visualizations of cities, contemporary visual culture, and visual methodologies.
de-identification guidance

prepared by the portage network, covid-19 working group on behalf of the canadian association of research libraries (carl)

kristi thompson (western university)
erin clary (portage)
lucia costanzo (university of guelph)
beth knazook (portage)
nick rochlin (university of british columbia)
felicity tayler (university of ottawa)
jane fry (carleton university)
chantal ripp (university of ottawa)
kathy szigeti (university of waterloo)
qian zhang (university of waterloo)
roger reka (university of windsor)
minglu wang (york university)
rebecca dickson (coppul)
mark leggott (rdc-drc)
melanie parlette-stewart (portage)

september
portage network / canadian association of research libraries
portage@carl-abrc.ca

table of contents

de-identification guidance
identify and remove direct identifiers
how do i remove this information?
identify and evaluate indirect or quasi-identifiers based on perceived risk and utility
how do i figure out what combination of quasi-identifiers are a problem?
how do i assess the sensitivity of non-identifying variables in dataset?
considerations for qualitative data de-identification
brief considerations for social media, medical images, and genomics data
data collected from social media or social networking platforms (e.g., twitter, facebook)
medical images
genomics data, and other biomedical samples
appendix 1: code for checking k-anonymity
appendix 2: free de-identification software packages
appendix 3: fee-based services for de-identification
resources
references

de-identification guidance

the guidance below is intended to help you minimize disclosure risk when sharing data collected from human participants. if you use any of the following techniques to anonymize your data, please include this information in your documentation and readme file. for transparency, it should be clear how the dataset was modified to protect study participants.

before proceeding, please note that not all human participant data needs to be de-identified, or stripped of direct and indirect identifiers. please review your consent form and prepare your data to share only what participants have agreed to share. if you are unsure whether you need to de-identify your data, please see the portage help guide can i share my data? and consult with your institution’s research ethics board.

for help selecting a repository for your data, please see portage’s recommended repositories for covid-19 research data guide or consult with librarians at your institution to see if further support is available.

for help understanding any terms used in this document, please see portage’s glossary of terms for sensitive data used for research purposes. you may also wish to review portage’s human participant research data risk matrix and research data management language for informed consent for more information.

learn more about creating appropriate documentation for depositing your datasets in the portage covid-19 working group’s “documentation and supporting materials required for deposit,” september , , https://doi.org/ . /zenodo. .
portage covid-19 working group, “can i share my data?” september , , https://doi.org/ . /zenodo. .
portage covid-19 working group, “recommended repositories,” september , . https://doi.org/ . /zenodo. .
portage sensitive data working group, “sensitive data toolkit for researchers part : glossary of terms for sensitive data used for research purposes,” september , , https://doi.org/ . /zenodo. .
portage sensitive data working group, “sensitive data toolkit for researchers part : human participant research data risk matrix,” october , , https://doi.org/ . /zenodo. , and portage sensitive data working group, “sensitive data toolkit for researchers part : research data management language for informed consent,” october , , https://doi.org/ . /zenodo. .

identify and remove direct identifiers

direct identifiers are those which place study participants at immediate risk of being re-identified. unless explicit consent was received from study participants, they must be removed from any published version of your dataset. the following list is based on various sources, including guidance from major international funding agencies, the us health insurance portability and accountability act (hipaa) and the british medical journal. see preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers and list of items considered under hipaa to be identifiers.
direct identifiers are:
● names or initials, as well as names of relatives or household members
● addresses, and small area geographic identifiers such as postal codes / zip codes
● telephone numbers
● electronic identifiers such as web addresses, email addresses, social media handles, or ip addresses of individual computers
● unique identifying numbers such as hospital ids, social insurance numbers, clinical trial record numbers, account numbers, certificate or license numbers
● exact dates relating to individually-linked events such as birth or marriage, date of hospital admission or discharge, or date of a medical procedure
● multimedia data: unaltered photographs, audio, or videos of individuals
● biometric identifiers including finger or voice prints, and iris or retinal images
● human genomic data, unless risk was explained and consent to share data or consent for secondary use of data was received from study participants
● age information for individuals over years old

how do i remove this information?

removing direct identifiers from your data is relatively straightforward. you may either record this personal information in a separate document, spreadsheet, or database and link this to the other data points via a series of codes that can be removed before publishing or choose to delete the identifying data points entirely at the end of the project. refer to your consent forms to determine how to proceed. if you are unsure whether data can simply be unlinked or if it must be destroyed, consult your local research ethics board.

iain hrynaszkiewicz, melissa l. norton, andrew j. vickers, and douglas g. altman, “preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers.” bmj (january , ): c , https://www.bmj.com/content/ /bmj.c ; and steve alder, “what is considered phi under hipaa rules?” hipaa journal (december , ), https://www.hipaajournal.com/considered-phi-hipaa/.
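The linking approach described above can be sketched in code. The following is a minimal illustration in Python and is not part of the original guidance: it assigns each record a random code, moves the direct identifiers into a separate linking key, and leaves only the code and the remaining variables in the working file. The column names (`name`, `email`), the `code` field, and the function names are hypothetical, chosen only for this example, and the sketch assumes no existing column is already called `code`.

```python
import secrets

# Hypothetical column names; use the direct identifiers present in your own data.
DIRECT_IDENTIFIERS = ["name", "email"]


def split_direct_identifiers(records, direct_identifiers):
    """Split records into a de-identified table and a separate linking key.

    Each record gets a random code. The linking key holds the code plus the
    direct identifiers; the de-identified table holds the code plus the rest.
    """
    deidentified, linking_key, used = [], [], set()
    for record in records:
        code = secrets.token_hex(8)
        while code in used:  # collisions are very unlikely, but keep codes unique
            code = secrets.token_hex(8)
        used.add(code)
        key_row = {"code": code}
        data_row = {"code": code}
        for column, value in record.items():
            target = key_row if column in direct_identifiers else data_row
            target[column] = value
        linking_key.append(key_row)
        deidentified.append(data_row)
    return deidentified, linking_key


def drop_codes(deidentified):
    """Remove the linking codes, for example just before publishing."""
    return [{c: v for c, v in row.items() if c != "code"} for row in deidentified]
```

One way to use this is to keep the linking key in a separate, access-restricted location while the project is active, and to publish only the output of `drop_codes`. Whether the key may be retained or must be destroyed still depends on your consent forms and your research ethics board.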
identify and evaluate indirect or quasi-identifiers based on perceived risk and utility

indirect or quasi-identifiers are characteristics (such as demographic information) relating to individuals that could be linked with other data sources to violate the confidentiality of individuals. quasi-identifiers may not be identifying on their own but can be disclosive in combination. for instance, identifying a participant’s home community size within an overall limited geographic study area may allow someone to infer that participant’s location more precisely.

a variable should be considered a quasi-identifier if someone could plausibly match that variable to information from another source. see the international household survey network anonymization principles and the information and privacy commissioner ontario de-identification guidelines for structured data.

a list of potential quasi-identifiers:
● geographic identifiers (census geography, town name, urban/rural indicator) of home, place of birth, place of treatment, place of schooling, or other geography linked to individuals
● sex / gender identity, orientation
● ethnic background, race, visible minority, or indigenous status
● immigration status
● membership in organizations
● use of specific social networks or services
● socioeconomic data, such as occupation or place of work, income, or education
● household and family composition, marital status, number of children / pregnancies
● criminal records and other information that may link to public records
● generalized dates linked to individuals, e.g. age, graduation year, immigration year
● some full-sentence responses
  ○ note: these must be checked individually.
    for instance, the comment “the library should be open longer” is not identifying; however, a comment like “as chair of a research group that uses the library,…” is potentially identifying.
● some medical information (e.g. permanent disabilities or rare medical conditions) may be identifying; temporary illness or injury is less likely to be so. the test is whether this is information that can be found elsewhere and therefore could be used to re-identify the person.

how do i figure out which combinations of quasi-identifiers are a problem?

“anonymization principles,” international household survey network, accessed august , , https://ihsn.org/node/ ; information and privacy commissioner of ontario, deidentification guidelines for structured data, information and privacy commissioner of ontario, june , . https://www.ipc.on.ca/privacy-organizations/de-identification-centre/.

observe the possible combinations

a good first step may be to look at the demographic variables in the dataset and consider describing an individual to a friend using only the values of those variables. is there any likelihood that the person would be recognizable? for example, “i’m thinking of a person living in toronto who is female, married, has a university degree, is between the ages of and and has an income of between and thousand dollars.” even if there is only one such person in the dataset, this is likely not enough information to create risk unless contextual information about the dataset narrows things down further.
for instance, if your data is limited to a specific, narrow group of individuals, such as the referees for the ontario hockey association, the list of quasi-identifiers given above may be enough to uniquely identify an individual. quasi-identifiers need to be evaluated in the context of what is known or what may be reasonably inferred about the survey population.

assess these combinations mathematically

k-anonymity is a mathematical approach to demonstrating that a dataset has been anonymized, where k is an integer selected by the researcher that represents the minimum size of a group of records with the same information across all quasi-identifiers. within your dataset, a set of ‘k’ records (e.g., a set of or records) is called an equivalence class. to achieve k-anonymity, it should not be possible to distinguish one record from the other records in its equivalence class. for example, if you choose a k value of , each record in your dataset must have the exact set of quasi-identifiers that are present in at least other records in order to achieve k-anonymity.

k-anonymity only works to precisely estimate risk if a dataset is a complete sample of some population. k-anonymity considerably overestimates risk in the case of a dataset that is a subsample of a population. when determining the appropriate k value to use, consider:
● a lower k value of may be sufficient in datasets that contain small samples from a large population.
● a higher (or more conservative) k value should be used if a dataset is a complete sample of a population.

keep in mind that a dataset that is a complete sample of a known population may have additional risk factors. imagine that all the respondents in a particular equivalence class answered a question the same way - you would know how each person in the survey belonging to that equivalence class answered the question.
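As a rough illustration of the equivalence-class check described above, the sketch below groups records by their quasi-identifier values, counts the size of each group, and lists the groups that fall below a chosen k. It is written in Python rather than the Stata and R used in the appendix, and the column names, toy records, and function names are all hypothetical, invented for this example. The last function flags equivalence classes in which every member gave the same answer to a question, which is the complete-sample situation described in the preceding sentences.

```python
from collections import Counter, defaultdict

# Hypothetical quasi-identifier columns; substitute the ones in your dataset.
QUASI_IDENTIFIERS = ["age_group", "sex", "region"]

# Toy records invented purely for illustration.
RECORDS = [
    {"age_group": "30-39", "sex": "f", "region": "urban", "answer": "yes"},
    {"age_group": "30-39", "sex": "f", "region": "urban", "answer": "yes"},
    {"age_group": "30-39", "sex": "f", "region": "urban", "answer": "yes"},
    {"age_group": "40-49", "sex": "m", "region": "rural", "answer": "no"},
    {"age_group": "40-49", "sex": "m", "region": "rural", "answer": "yes"},
    {"age_group": "60-69", "sex": "f", "region": "rural", "answer": "no"},
]


def equivalence_classes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def small_classes(records, quasi_identifiers, k):
    """Return (size, combination) pairs for classes with fewer than k records,
    smallest first."""
    classes = equivalence_classes(records, quasi_identifiers)
    return sorted((size, combo) for combo, size in classes.items() if size < k)


def is_k_anonymous(records, quasi_identifiers, k):
    """True when every equivalence class contains at least k records."""
    return not small_classes(records, quasi_identifiers, k)


def uniform_classes(records, quasi_identifiers, sensitive):
    """Return combinations whose members all share one value of `sensitive`."""
    values = defaultdict(set)
    for r in records:
        values[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return sorted(combo for combo, seen in values.items() if len(seen) == 1)
```

With the toy records and k set to 3, `small_classes` reports one class of size 1 and one of size 2, so these records are not 3-anonymous on the three chosen variables. Any class returned by `uniform_classes` deserves a second look even when it is large enough to satisfy k.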
respondents to surveys are generally told that their responses will be kept confidential, not merely that no one will know which line of data contains their specific answers. a k-anonymous dataset that is a complete sample may not fulfill that promise.

the code in appendix can be used with your preferred statistical software package to create equivalence classes based on the quasi-identifiers in the dataset and to list them by size. if any equivalence class has fewer members than the value of k you selected, use the data reduction techniques below to further reduce dataset risk.

for more on k-anonymity, see international household survey network (ihsn)’s measuring the disclosure risk and the uk anonymisation network’s anonymisation decision-making framework section . . , guaranteed anonymisation.

khaled el emam and fida kamal dankar, “protecting privacy using k-anonymity,” journal of the american medical informatics association , no. (september ): – , https://doi.org/ . /jamia.m .

use data reduction techniques to address dataset risk

univariate frequencies and bivariate crosstabs can be used to identify small categories of quasi-identifiers. data reduction techniques can be used to mitigate risk once you have identified these small groups. there are three simple types of data reduction you may wish to consider:

the simplest is to completely drop risky variables from the dataset. this is an option for variables with relatively high risk that are not considered to be of high research value. (for example, in some datasets geography may be considered relatively less important than ethnicity or language.)
the second is global re-coding, or aggregating the observed values into a defined set of classes, such as transforming a variable with years of age into a variable of ten-year age categories, or top-coding a high income category to “$ , and above”.

a third option for unusual cases is to use local suppression. for example, a very young married respondent might have their marital status set to ‘missing’ as an alternative to globally re-coding the otherwise non-risky age variable into a larger group.

after each exercise in data reduction, repeat the test for k-anonymity described above and check equivalence classes until all groups are at least as large as your selected value for k. for more information, including information about more complex types of data reduction, see uk anonymisation network’s anonymisation decision-making framework section . , anonymisation solutions.

“measuring the disclosure risk,” international household survey network, accessed august , , https://ihsn.org/anonymization-risk-measure; and mark elliot, elaine mackey, kieron o’hara, and caroline tudor, the anonymisation decision-making framework. uk anonymisation network (ukan), university of manchester, , https://ukanon.net/ukan-resources/ukan-decision-making-framework/.
‘small’ is relative; as a first pass, groups smaller than % of the dataset or containing fewer than cases could be considered.
elliot, mackey, o’hara, and tudor, the anonymisation decision-making framework.

how do i assess the sensitivity of non-identifying variables in a dataset?
non-identifying information includes survey responses and measurements that are not likely to be recognizable as coming from specific individuals. examples include opinions, rankings, scales, or temporary measures such as resting heart rate after meditation or the number of times an individual ate breakfast in a week.

it is possible for non-identifying information to be highly sensitive as well. information that could be used to stigmatize or discriminate against an individual, such as a criminal record, sexual practices, illicit drug use, mental health and psychological well-being, and other sensitive medical information, increases the risk of the dataset and should be considered when deciding whether to release the data at all. you may wish to remove or modify these variables to create a less sensitive version of the data.

considerations for qualitative data de-identification

qualitative data describes qualities or characteristics that can be observed, but not necessarily measured. this type of data is collected through interviews, surveys, or observations, and may be in the form of transcripts, notes, video and audio recordings, images, and text documents. as with quantitative data, direct identifiers may appear in the form of names, date and place of birth, other locations, and even photos. these direct identifiers can be used along with indirect or quasi-identifiers, such as medical, education, financial, and employment information, to trace or determine an individual’s identity.

the process for removing identifying information in a video recording, audio interview, or oral transcript is very different from that used to de-identify a medical record. for one, it is harder to do programmatically. extremely detailed field notes or audiovisual information often require someone to read or watch the content thoroughly.

general advice
● avoid asking for identifying information in the first place.
  ○ it is easier to edit the information at the point of capture than it is to remove information after it has been recorded.
  ○ if you require identifying information at the research stage, try to capture it within the first few minutes of an interview or recording, so that it is easy to edit it out quickly. alternatively, transcribe the information in a separate document that can be removed from a person’s file.
● make de-identification a part of the process of informed consent.
  ○ ensure that study participants are aware of your planned use of the data, and the fact that their information may be anonymized to protect them. make it clear in your consent forms how extensively the data will be de-identified (i.e., what elements will be replaced or removed). while direct identifiers may be eliminated (name, address, birthday, etc.), there may be other subtle clues to their identities that remain within the recording or transcript.
  ○ agree in advance with participants which type of identifying information can be revealed in an interview. (for example, the participant may not wish to mention an employer’s name). this is easier than removing information after the fact.
  ○ keep in mind that not all data needs to be de-identified or anonymized. in some circumstances, you may be recording deeply personal accounts and should be mindful of a participant’s right to have their story told in their own words. some participants may have a personal interest in staying identified.
● use pseudonyms and change identifying details to protect anonymity.
  ○ if changing the person’s name, location of residence, or occupation can be done without compromising the dataset, this can help to protect their anonymity. be advised that this could influence the utility of a dataset as it may alter a future researcher’s perception of the interviewee’s socio-economic status or behaviour.
● if necessary, remove blocks of sensitive text or edit out portions within audio-visual data.
  ○ some portion of the research may need to be redacted. be wary of using search and replace techniques as it is easy to replace the wrong piece of information.
  ○ voices in audio recordings may need to be masked by altering pitches.
  ○ faces in visual data may need to be pixelated.
● restrict access.
  ○ this is not preferred, but some datasets will not remain useful if all identifiers are removed. it may be possible to allow researchers seeking secondary access to request that queries be performed by the original research team, who can then share results if they are non-disclosive or can be appropriately de-identified.

for more information, see the uk data service's guide to anonymisation of qualitative data.

“anonymisation: qualitative data,” uk data service, last modified june , , https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation/qualitative.aspx.

brief considerations for social media, medical images, and genomics data

data collected from social media or social networking platforms (e.g., twitter, facebook).

although information on social networking sites may be free to access or view, it does not automatically follow that it is free to redistribute. many platforms have terms of use that you will need to abide by, and the people who use the platform may have an expectation of privacy which must be respected. some platforms require users to register before content is visible, and others may have terms that prohibit data collection, data scraping, or republishing content elsewhere.
here is a series of questions to consider before you deposit social media data:
● could the topic you are studying be considered sensitive?
● could your data lead to stigmatization of, or discrimination against, the content author?
● is the study population vulnerable?
● what expectation of privacy might the individual users of this platform have?
● is it possible or reasonable to obtain informed consent?
● can or should the data be anonymized?
● do the platform’s terms of use allow you to redistribute content?

for example, twitter allows the content author to maintain control over their tweets. as part of twitter's policies, only numeric tweet ids and user ids should be redistributed. if you have weighed the questions above and decide to deposit your dataset, the tweets must first be ‘dehydrated’ (distilled down to just the tweet id) using a tool such as docnow’s twarc. any secondary use of the data would then require an end-user to “rehydrate” the tweet ids using the twitter rest api or an external tool such as docnow’s hydrator. content will not be returned for tweets that have since been deleted.

the following resources provide more in-depth guidance:
● zeffiro and brodeur, social media research data ethics and management (slides from a workshop presented at mcmaster university).
● ryerson university research ethics board’s guidelines for research involving social media.

“developer terms: more about restricted uses of the twitter apis,” twitter, accessed august , , https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases.
documenting the now, “docnow/twarc.” github, accessed august , , https://github.com/docnow/twarc.
documenting the now, “docnow/hydrator.” github, accessed august , , https://github.com/docnow/hydrator.
andrea zeffiro and jay brodeur, “social media research data ethics and management.” workshop presented april , , sherman centre for digital scholarship, mcmaster university, http://hdl.handle.net/ / .
ryerson university research ethics board, “guidelines for research involving social media,” ryerson university, november, , https://www.ryerson.ca/content/dam/research/documents/ethics/guidelines-for-research-involving-social-media.pdf.

● mannheimer and hull, sharing selves: developing an ethical framework for curating social media data.
● north carolina state university’s social media archives toolkit, which contains guidance on the legal and ethical implications of sharing social media data, and an annotated bibliography with further resources.

medical images

before you archive medical images, remove any direct identifiers you do not have explicit consent to share, such as name, patient id, and exact dates from the image header or embedded metadata, and black out any pixels in the image that contain identifying information. neuroimages must also be defaced using a tool such as pydeface.

the following resources provide more guidance for de-identifying dicom files:
● the cancer imaging archive (tcia) de-identification overview.
  ○ see specifically “table - dicom tags modified or removed at the source site” for a list of dicom tags deemed to be unsafe.
● the radiological society of north america (rsna) international covid- open radiology database (ricord) de-identification protocol.
● the dicom standard itself provides important guidance for de-identifying header information. specifically, dicom part : security and system management profiles, appendix e: attribute confidentiality profiles may be useful.
  ○ these profiles attempt to balance the need to protect privacy with the need to retain information so the data remain useful.

sara mannheimer and elizabeth hull, “sharing selves: developing an ethical framework for curating social media data,” international journal of digital curation , no. (april , ), https://doi.org/ . /ijdc.v i . .
“social media archives toolkit,” north carolina state university libraries, accessed august , , https://www.lib.ncsu.edu/social-media-archives-toolkit.
some repositories may be able to assist you or recommend tools for defacing. for example, the international neuroimaging data-sharing initiative (indi) can help researchers who plan to share their data on the indi platform. for further information, see the indi data contribution guide, accessed august , , http://fcon_ .projects.nitrc.org/indi/indi_data_contribution_guide.pdf. see also, omer faruk gulban, dylan nielson, russ poldrack, john lee, chris gorgolewski, vanessasaurus, and satrajit ghosh, “poldracklab/pydeface: v . . .” october , . http://doi.org/ . /zenodo. .
kirby, justin. “submission and de-identification overview.” the cancer imaging archive (tcia), university of arkansas for medical sciences, april , , https://wiki.cancerimagingarchive.net/display/public/submission+and+de-identification+overview.
“rsna international covid- open radiology database (ricord) de-identification protocol,” radiological society of north america, international covid- open radiology database, accessed august , , https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-covid- -deidentification-protocol.pdf.
medical imaging & technology alliance, dicom standards committee, “dicom part : security and system management profiles.” dicom standard (arlington, va: national electrical manufacturers association), accessed august , , https://www.dicomstandard.org/current/.
https://www.lib.ncsu.edu/social-media-archives-toolkit/legal
https://pypi.org/project/pydeface/
  ○ if it is necessary to retain identifiers, your reb application will ideally have referenced the profile you intend to use, and your consent form should clearly state what information will be shared.

de-identification of dicom files may be done programmatically, using software to strip identifiers from the header.
● tcia recommends the clinical trial processor (ctp) software developed by rsna.
● rsna’s covid- open radiology database (ricord) recommends another rsna program called anonymizer, and has published instructions on how to install and use it. anonymizer implements ricord’s de-identification protocol.
● there are many other non-commercial options available, such as the dicomcleaner™ tool.
● as with all de-identification software, results may be variable, and you should confirm that identifying information was removed before you share your images. note that:
  ○ vendors or end-users may not have always used dicom elements in a way that conforms to the standard.
  ○ private elements or private tags may have been used to store personal information, and the use of these tags may not be well-defined in the vendor documentation.

genomics data, and other biomedical samples

because each person's dna sequence is unique, human biological materials can never be truly anonymous. before you archive or biobank these data, please review your consent form.
ideally the consent process will have:
● provided participants with information about how their data will be used, analyzed, stored and shared,
● identified what information will be stored alongside the data,
● communicated what level of privacy or confidentiality a participant may expect, and who may have access to the data,
● indicated whether the data/samples will be stored in canada or outside of canada,
● acknowledged whether there is a possibility that the data will be used for commercial purposes,
● clearly explained the risks of disclosure.

“clinical trial processor (ctp),” radiological society of north america, medical imaging resource community (mirc), accessed august , , https://www.rsna.org/research/imaging-research-tools.
“rsna covid- dicom data anonymizer,” radiological society of north america, international covid- open radiology database, accessed august , , https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-anonymizer-program-instructions.pdf.
“rsna international covid- open radiology database (ricord) de-identification protocol,” radiological society of north america, international covid- open radiology database.
clunie, david a., “dicomcleaner™,” pixelmed publishing™, accessed july , , http://www.dclunie.com/pixelmed/software/webstart/dicomcleanerusage.html.
further information is available in tcps ( ), chapter : human biological materials including materials related to human reproduction (sections a and d specifically), and chapter : human genetic research. see also thorogood ( ) canada: will privacy rules continue to favour open science?

the nih privacy in genomics webpage provides a concise overview of some of the benefits and risks of sharing genetic information. for an example of how genetic information was used to identify study participants, see identifying personal genomes by surname inference, or a summary of the study in the nature editorial on genetic privacy.

for further information on ethics and consent in genomics, see the global alliance for genomics and health regulatory & ethics toolkit resources, such as data privacy and security policy and consent policy.

government of canada (canadian institutes of health research, the natural sciences and engineering research council of canada, and the social sciences and humanities research council), tri-council policy statement: ethical conduct for research involving humans, december , https://ethics.gc.ca/eng/policy-politique_tcps -eptc _ .html.
adrian thorogood, “canada: will privacy rules continue to favour open science?” human genetics : – (july , ), https://doi.org/ .
/s - - -z.
“privacy in genomics,” national human genome research institute, february , , https://www.genome.gov/about-genomics/policy-issues/privacy.
gymrek, melissa, amy l. mcguire, david golan, eran halperin, and yaniv erlich. “identifying personal genomes by surname inference.” science , no. (jan , ): - . https://doi.org/ . /science. ; and “genetic privacy” [editorial], nature (january , ): , https://doi.org/ . / a.
global alliance for genomics & health. genomic toolkit: regulatory & ethics toolkit. toronto, on: global alliance for genomics and health, accessed july , , https://www.ga gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/.
appendix : code for checking k-anonymity

-- stata --
* stata code for checking k-anonymity
* kristi thompson, may
* create the equivalence groups
egen equivalence_group = group(var var var var var )
* create a variable to count cases in each equivalence group
* note: _N (uppercase) is the number of cases in the by-group; lowercase _n is each case's position within it
sort equivalence_group
by equivalence_group: gen equivalence_size = _N
* list the id numbers of equivalence groups containing or fewer cases
tab equivalence_group if equivalence_size < , sort
* list the values of the quasi-identifiers for each small equivalence class.
list var var var var var if equivalence_group ==

-- r --
# r code for checking k-anonymity
# carolyn sullivan and kristi thompson, may
# install plyr, a useful data manipulation package.
install.packages("plyr")
# load the library.
library('plyr')
datafile <- " location of the data file - csv format - "
# read the csv file.
df <- read.csv(datafile)
# figure out what equivalence classes there are, and how many cases in each equivalence class.
dfunique <- ddply(df, .(var , var , var , var , var ), nrow)
dfunique <- dfunique[order(dfunique$v ),]
# note: the data viewer function is spelled View() with a capital v in r
View(dfunique)

the uk anonymisation network’s anonymisation decision-making framework, appendix b has code for doing this in spss.

elliot, mackey, o’hara, and tudor, the anonymisation decision-making framework. https://ukanon.net/ukan-resources/ukan-decision-making-framework/

appendix : free de-identification software packages

many of these tools take a hierarchical approach to de-identifying data, which means that you will need to pre-define possible generalizations for the quasi-identifiers in the dataset, and the program will search for possible solutions and recommend a set of the generalizations to use to best meet anonymization goals.
for datasets with a large number of quasi-identifiers, or cases where several datasets with similar quasi-identifiers need to be de-identified, this might be a useful approach. for smaller datasets, it may be more straightforward to work in a statistical package. the software packages included here all have some usability issues and fairly steep learning curves. amnesia and the graphical user interface to sdcmicro may be the most user-friendly.

recommended tools:
● amnesia
○ this software has both online and desktop versions; however, uploading sensitive data to a third-party web site is not generally recommended. if possible, install the software locally (windows or linux only).
○ amnesia supports k-anonymity and k^m-anonymity (a slightly more flexible approach to anonymity when the number of quasi-identifiers in a dataset is very high, as it requires only that every combination of up to m quasi-identifiers appears at least k times in the published data).
○ a few limitations: there is not currently a way to specify missing values, and documentation could be more thorough; for instance, defining hierarchies is not straightforward.
○ this software may work best for clinical data, or data which are not survey data.
● sdcmicro
○ an r package for statistical disclosure control (microdata anonymization). this software can read many data types (e.g., csv, sav, dta, sas bdat, xlsx) and can be used in windows, linux or mac operating systems. implements muargus code.
○ a graphical user interface is available, and there is a vignette with guidance called ‘using the interactive gui - sdcapp’ linked from the sdcmicro landing page in cran repository.
○ please be aware that large datasets take time to load, and computation time for large or complex datasets may be lengthy.
○ the statistical disclosure control for microdata practice guide section on sdc with sdcmicro in r may be helpful if you need further guidance installing and using the sdcmicro package, or see benschop’s sdcmicro gui manual documentation.
“using the interactive gui – sdcapp,” the comprehensive r archive network (cran), accessed august , , https://cran.r-project.org/web/packages/sdcmicro/vignettes/sdcmicro.html.
“computation time,” sdc with sdcmicro in r: setting up your data and more, sdc practice guide, , https://sdcpractice.readthedocs.io/en/latest/sdcmicro.html#computation-time.
“statistical disclosure control (sdc): an introduction,” sdc practice guide, , https://sdcpractice.readthedocs.io/en/latest/sdc_intro.html; and thijs benschop and matthew welch. statistical disclosure control for microdata: a practice guide for sdcmicro, international household survey network, accessed august , , https://sdcpractice.readthedocs.io/en/latest/index.html.

other tools that may be useful:
● arx
○ open source anonymization tool for use in windows, linux, and mac. provides support for sql databases, xlsx and csv files, and has a graphical user interface.
○ supports various privacy models including k-anonymity, and variants ℓ-diversity, t-closeness, β-likeness, and more.
○ allows end-users to categorize, top and bottom code, generalize, and transform data in more complex ways.
○ large datasets take time to load, and computation time for large or complex datasets may be lengthy.
● mu-argus
○ software to apply statistical disclosure control techniques. the program takes a hierarchical approach to de-identifying data.
○ jar file should be executable in windows or mac os.
○ a tester found that getting data loaded and correctly defined was a bit of a challenge and advised that the program could use better documentation on setting up hierarchies.
● the university of texas at dallas anonymization toolbox
○ the toolbox currently supports different anonymization methods and privacy definitions, including k-anonymity, ℓ-diversity, and t-closeness.
○ algorithms can either be applied directly to a dataset or can be used as library functions inside other applications.
○ this is a set of java routines. data curators who prefer to do their statistical programming in java might find it useful.
“privacy models,” arx – data anonymization tool, accessed august , , https://arx.deidentifier.org/overview/privacy-criteria/.

appendix : fee-based services for de-identification

a few fee-based services that researchers may opt to use for de-identification are included below:
● d-wise (american & european offices)
○ offering free anonymization services to anyone working on a covid- vaccine.
○ offering free anonymization services to researchers who deposit individual participant-level data from covid- clinical trials in vivli.
● inter-university consortium for political and social research (icpsr) (archive headquartered at university of michigan)
○ if you wish icpsr to conduct disclosure analysis of your data, you will need to purchase the professional curation package. cost is based on the number of variables and complexity of the data. contact icpsr acquisitions at deposit@icpsr.umich.edu for additional information (information obtained from open icpsr faq under pricing and sensitive data sections).
● privacy analytics (ottawa-based company)
○ privacy analytics can review datasets as part of their data privacy validation services.
○ methodology based on the hipaa expert determination de-identification standard.
○ to find out more about their services, please fill in the form at the bottom of their “certification” webpage.
“d-wise offers free transparency services accelerating covid- vaccine research,” cision prweb, march , , https://www.prweb.com/releases/d_wise_offers_free_transparency_services_accelerating_covis_ _vaccine_research/prweb .htm.
“d-wise offers anonymization services available on vivli covid- portal,” center for global clinical research data, april , , https://www.prweb.com/releases/d_wise_offers_free_transparency_services_accelerating_covis_ _vaccine_research/prweb .htm.
“faqs,” openicpsr, accessed august , , https://www.openicpsr.org/openicpsr/faqs.
“clinical trial transparency services,” privacy analytics, accessed on august , , https://privacy-analytics.com/clinical-trial-transparency/ctt-services/.
“double-check your data and leverage it with confidence,” privacy analytics, accessed on august , , https://privacy-analytics.com/health-data-privacy/health-data-services/expert-data-opinion-services/.

resources
amnesia https://amnesia.openaire.eu/
arx https://arx.deidentifier.org/overview/
d-wise https://www.d-wise.com/de-identification-services
mu-argus https://github.com/sdctools/muargus
inter-university consortium for political and social research (icpsr) https://www.openicpsr.org/openicpsr/
privacy analytics https://privacy-analytics.com/services/certification/
sdcmicro https://cran.r-project.org/web/packages/sdcmicro/index.html
the university of texas at dallas anonymization toolbox http://cs.utdallas.edu/dspl/cgi-bin/toolbox/index.php

references
alder, steve.
“what is considered phi under hipaa rules?” hipaa journal, december , . https://www.hipaajournal.com/considered-phi-hipaa/.
benschop, thijs, and matthew welch. “statistical disclosure control for microdata: a practice guide for sdcmicro.” international household survey network. accessed august , . https://sdcpractice.readthedocs.io/en/latest/index.html.
clunie, david a. “dicomcleaner™.” pixelmed publishing™. accessed july , . http://www.dclunie.com/pixelmed/software/webstart/dicomcleanerusage.html.
documenting the now. “docnow/hydrator.” github. accessed august , . https://github.com/docnow/hydrator.
documenting the now. “docnow/twarc.” github. accessed august , . https://github.com/docnow/twarc.
el emam, khaled, and fida kamal dankar. “protecting privacy using k-anonymity.” journal of the american medical informatics association , no. (september ): – . https://doi.org/ . /jamia.m .
elliot, mark, elaine mackey, kieron o’hara, and caroline tudor. the anonymisation decision-making framework. uk anonymisation network (ukan). university of manchester. . https://ukanon.net/ukan-resources/ukan-decision-making-framework/.
“genetic privacy.” [editorial]. nature (january , ): . https://doi.org/ . / a.
global alliance for genomics & health. genomic toolkit: regulatory & ethics toolkit. toronto, on: global alliance for genomics and health. accessed july , . https://www.ga gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/.
government of canada (canadian institutes of health research, the natural sciences and engineering research council of canada, and the social sciences and humanities research council). tri-council policy statement: ethical conduct for research involving humans. december . https://ethics.gc.ca/eng/policy-politique_tcps -eptc _ .html.
gulban, omer faruk, dylan nielson, russ poldrack, john lee, chris gorgolewski, vanessasaurus, and satrajit ghosh. “poldracklab/pydeface: v . . .” zenodo. october , . http://doi.org/ . /zenodo. .
gymrek, melissa, amy l. mcguire, david golan, eran halperin, and yaniv erlich. “identifying personal genomes by surname inference.” science , no. (jan , ): - . https://doi.org/ . /science. .
hrynaszkiewicz, iain, melissa l. norton, andrew j. vickers, and douglas g. altman. “preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers.” bmj (january , ): c . https://www.bmj.com/content/ /bmj.c .
information and privacy commissioner ontario. deidentification guidelines for structured data. information and privacy commissioner of ontario. june , . https://www.ipc.on.ca/privacy-organizations/de-identification-centre/.
international household survey network. “anonymization principles.” accessed august , . https://ihsn.org/node/ .
international household survey network. “measuring the disclosure risk.” accessed august , . https://ihsn.org/anonymization-risk-measure.
international neuroimaging data-sharing initiative (indi). data contribution guide. accessed august , . http://fcon_ .projects.nitrc.org/indi/indi_data_contribution_guide.pdf.
kirby, justin.
“submission and de-identification overview.” the cancer imaging archive (tcia), university of arkansas for medical sciences. april , . https://wiki.cancerimagingarchive.net/display/public/submission+and+de-identification+overview.
mannheimer, sara, and elizabeth hull. “sharing selves: developing an ethical framework for curating social media data.” international journal of digital curation , no. (april , ). https://doi.org/ . /ijdc.v i . .
medical imaging & technology alliance, dicom standards committee. “dicom part : security and system management profiles.” in dicom standard. arlington, va: national electrical manufacturers association. accessed august , . https://www.dicomstandard.org/current/.
moore, stephen m., david r. maffitt, kirk e. smith, justin s. kirby, kenneth w. clark, john b. freymann, bruce a. vendt, lawrence r. tarbox, and fred w. prior. “de-identification of medical images with retention of scientific research value.” radiographics , no. (may , ). https://doi.org/ . /rg. .
national human genome research institute. “privacy in genomics.” february , . accessed august , . https://www.genome.gov/about-genomics/policy-issues/privacy.
north carolina state university libraries. “social media archives toolkit.” accessed august , . https://www.lib.ncsu.edu/social-media-archives-toolkit.
portage covid- working group. “can i share my data?” september , . https://doi.org/ . /zenodo. .
portage covid- working group. “documentation and supporting materials required for deposit.” september , . https://doi.org/ . /zenodo. .
portage covid- working group. “recommended repositories.” september , . https://doi.org/ . /zenodo. .
portage sensitive data working group. “sensitive data toolkit for researchers part : glossary of terms for sensitive data used for research purposes.” september , . https://doi.org/ . /zenodo. .
portage sensitive data working group.
“sensitive data toolkit for researchers part : human participant research data risk matrix.” october , . https://doi.org/ . /zenodo. .
portage sensitive data working group. “sensitive data toolkit for researchers part : research data management language for informed consent.” october , . https://doi.org/ . /zenodo. .
radiological society of north america, international covid- open radiology database. “rsna international covid- open radiology database (ricord) de-identification protocol.” accessed august , . https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-covid- -deidentification-protocol.pdf.
radiological society of north america, international covid- open radiology database. “rsna covid- dicom data anonymizer.” accessed august , . https://www.rsna.org/-/media/files/rsna/covid- /ricord/rsna-anonymizer-program-instructions.pdf.
radiological society of north america, medical imaging resource community (mirc). “clinical trial processor (ctp).” accessed august , .
https://www.rsna.org/research/imaging-research-tools.
ryerson university research ethics board. “guidelines for research involving social media.” ryerson university. november, . https://www.ryerson.ca/content/dam/research/documents/ethics/guidelines-for-research-involving-social-media.pdf.
thorogood, adrian. “canada: will privacy rules continue to favour open science?” human genetics : – (july , ). https://doi.org/ . /s - - -z.
twitter. “developer terms: more about restricted uses of the twitter apis.” accessed august , . https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases.
uk data service. “anonymisation: qualitative data.” last modified june , . https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation/qualitative.aspx.
zeffiro, andrea, and jay brodeur. “social media research data ethics and management.” workshop presented april , . sherman centre for digital scholarship. mcmaster university. http://hdl.handle.net/ / .

old stories and new visualizations: digital timelines as public history projects

a thesis submitted to the temple university graduate board in partial fulfillment of the requirements for the degree master of arts

by mary o’neill
may

thesis approvals:
seth c. bruggeman, ph.d., advisor, temple university, history department
hilary iris lowe, ph.d., temple university, history department
dana dorman, archivist and librarian, historical society of haddonfield

this work is licensed under the creative commons attribution-noncommercial . international license.

abstract

this thesis explores the use and potential of digital timelines in public history projects. digital timelines have become a popular and accessible way for institutions and individuals to write history.
the history of timelines indicates that people understand timelines as authoritative information visualizations because they represent concrete events in absolute time. the goals of public history often conflict with the linear, progressive nature of most timelines. this thesis reviews various digital timeline tools and uses the print center’s centennial timeline as an in-depth case study that takes into account the multifaceted factors involved in creating a digital timeline. digital history advocates support digital scholarship as an alternative to traditional narrative writing. this thesis illustrates that digital timelines can enable people to visualize history in unexpected ways, fostering new arguments and creative storytelling. despite their potential, digital timelines often replicate the conventions of their paper counterparts because of the authoritative nature of the timeline form.

acknowledgments

this project has been a journey and many people have helped me along the way. i am indebted to my advisor seth bruggeman, who has offered more encouragement than i sometimes realize. i would like to thank elizabeth spungen, director of the print center, for hosting my internship and welcoming me into the organization over the summer. along with liz, the print center staff, steven alvarez, jacqui evans, and john caperton, were a pleasure to work alongside and provided needed guidance. hilary iris lowe, deborah boyer, and margery sly, along with my other professors at temple, have been wonderful to work with and have taught me a great deal. my fellow graduate students have been wonderful compatriots over the past two years. lastly, my parents, family, and friends have provided support and love throughout my schooling, for which i am very grateful.

table of contents

abstract
acknowledgments
list of figures
introduction
chapter
timelines
  history of timelines
  timelines as digital history: visualizations and narratives
  timeline tools
the centennial timeline: a case study
  introduction: commemorative practices and institutional history
  the print center’s history in commemorations past
  the print center’s centennial timeline
    developing the timeline
    forming entry content
    the published timeline
conclusions: developing timelines as digital history
  timelines as history at the print center
  reflections
bibliography
  books, articles, and blogs
  timeline resources
    digital timeline tools
    digital timeline examples
appendices
  a. figures
  b. user’s guide: the print center’s centennial timeline

list of figures

a small portion of joseph priestley’s a chart of biography
emma willard, temple of time,
wikipedia’s timeline of art, -present (above)
wikipedia’s timeline of art, -present (below)
michael friendly and daniel denis, milestones in the history of thematic cartography, statistical graphics, and data visualization, -
michael friendly and daniel denis, milestones in the history of thematic cartography, statistical graphics, and data visualization, - (detail)
the wright brothers, timeglider timeline (zoomed out)
the wright brothers, timeglider timeline (zoomed in)
the road to revolution, timetoast timeline (timeline view)
the road to revolution, timetoast timeline (list view)
western history, preceden timeline (above)
western history, preceden timeline (below)
tower of london, tiki-toki timeline ( d)
tower of london, tiki-toki timeline ( d)
revolutionary user interfaces, timelinejs timeline (above)
revolutionary user interfaces, timelinejs timeline (below)
the print center, centennial timeline,
the print center, centennial timeline,
centennial timeline entries, “a permanent home”
centennial timeline entries, “the print center permanent collection at the philadelphia museum of art”
centennial timeline, march ,
centennial timeline, march ,

introduction

timelines are a standard part of how people teach, learn, and conceptualize history. with a timeline, it is easy to project the history that the timeline author desires, leading to a teleological visualization of an imagined past. these selective linear narratives are well suited to tell institutional histories. digital timelines are increasingly accessible, for institutions and individuals. timeline tools enable people to actively participate in the fabrication of history, yet the effect of digital timeline tools becoming an ever more accessible technology is largely unexplored. the print center, a small contemporary art gallery in philadelphia, is celebrating its centennial in and among its commemorative activities the organization is creating a digital timeline to tell its history. for the print center the features of the digital timeline foster an interactive, dynamic history that is different from the organization’s earlier anniversary timelines. do digital timelines, in all their forms, offer anything new?
can digital technology make timelines that are effective visualizations, tools that scholars and the public can use to explore history rather than a means to recount a teleological narrative? can they be powerful public history projects and contribute to historiography as a part of the discourse rather than something that public historians simply discuss? this paper explores the use and potential of digital timelines in public history projects. the first part of this paper seeks to locate digital timelines within a historical context. i discuss what timelines are, as narratives and as visualizations. i study the history of timelines and their development to help me speculate about the impact of digital timelines. i then assess the range and development of digital timelines and timeline tools on the web, giving particular attention to how they can be of use to cultural institutions. after fitting digital timelines into historical context, i discuss the print center’s centennial timeline. this discussion takes into account multiple factors including the technology, the timeline’s development, its current use, and its potential. in computers, visualization, and history david staley states, “the time line is elementary in that it underlies many of our fundamental assumptions about the nature of time and our understanding of the past.” timelines are particularly authoritative because they represent concrete events in absolute time. public historians often want multiplicity in their pasts. public historians envision an inclusive past: a past that weaves together nonlinear narratives and undermines accepted grand narratives. as research tools, digital timelines can enable people to visualize history in ways that would be difficult without a computer, resulting in the ability to create new arguments and historical discussions. as narratives, digital timelines have the potential to tell creative, unexpected stories.
despite their potential, digital timelines often replicate the conventions of their paper counterparts because of the authoritative nature of the timeline form. david j. staley, computers, visualization, and history: how new technology will transform our understanding of the past (me sharpe, ), . chapter timelines history of timelines understanding what timelines are and how we use them entails an intersection of disciplines ranging from history, philosophy, information design and visualization, new media studies, and education. scholars who have contemplated timelines typically include the timeline within broader studies covering the philosophy of time or from the perspectives of information design and visualization. daniel rosenberg and anthony grafton’s cartographies of time is a comprehensive survey of the history of timelines and is the only book devoted to the subject. rosenberg and grafton distinguish the timeline as consisting in the placement of historical events on a straight line with a single axis, a form that developed in the late eighteenth century. rosenberg and grafton state, “in the modern historical imagination, the timeline plays a special role: it appears as a graphic instantiation of history itself.” rosenberg and grafton argue that the timeline developed along a complex trajectory, along which the most important factors were the ideas, concepts, and philosophical developments that urged people to visualize the past in the ways that they did. joseph priestley’s a chart of biography ( ) and a new chart of history ( ) are often given the credit for pioneering what would become the timeline form we know today. priestley’s a chart of biography, displays time as horizontal. the lives of in this paper i assume that my audience has a clear idea of what a timeline is, an idea correlating with rosenberg and grafton’s definition. daniel rosenberg and anthony grafton, cartographies of time: a history of the timeline (princeton architectural press, ), . 
famous individuals appear as lines along the horizontal. The length of a person’s life determines the length of their line and no line is bolder than another. The vertical axis displays categories in bands, with individuals placed in the appropriate category (fig. ).* Rosenberg and Grafton claim, “After Priestley, most readers simply assumed the analogy between historical time and measured graphic space, so the nature of the arguments around chronographic representation shifted dramatically.” Before Priestley, chronological and genealogical charts were the primary means through which people visualized the past. Priestley’s charts promoted the idea that the massive, messy, and often overwhelming amount of information from the past could turn into something useful if organized the right way.

Rosenberg and Grafton, - .

* Figures are listed in Appendix A.

Ibid., .

Chronicles and annals were popular means of organizing the past before timelines, with chronology being a field of study in its own right. An important aspect of chronicles was the ability to synchronize multiple histories. For example, the fourth-century chronicle of Eusebius traced the histories of multiple ancient cultures with the aim of placing these histories within the context of the rise of Christianity. For Eusebius the calculation of time for chronologies served the purpose of being able to put events in relation to one another. See: Brian Croke, "The Originality of Eusebius' Chronicle," American Journal of Philology , no. ( ): ; and Rosenberg and Grafton, .

The views of knowledge current in Priestley’s day and the increasingly popular conception of absolute time helped shape the way that he structured history in his visualizations. By the late seventeenth century, empiricism required the use of time as a variable in a variety of experiments. While natural philosophers were spending more time measuring the sun, stars, and planetary bodies with increasing exactitude, other scholarly pursuits felt the effects. Isaac Newton’s theories of absolute time separated the way that people experience time from the idea of time as a mathematical constant:

Absolute, true, and mathematical time, of itself, and from its own nature, flows equably without relation to anything external, and by another name is called duration: relative, apparent and common time, is some sensible and external (whether accurate or unequable) measure of duration by the means of motion, which is commonly used instead of true time; such as an hour, a day, a month, a year.

This concept of time supports a linear model within which time moves forward evenly. When Newton’s theories gained popularity so did his use of absolute time. In his book Temporalities, Russell West-Pavlov describes the benefits of Newtonian time, stating, “Homogeneity permits a plethora of different events of differing durations, scales, speeds, to be evenly reified within an abstract framework of time.” People began to understand histories along linear lines equivalent to how they understood linear, absolute time. This concept of time permeated the way that people understood how to conduct inquiries into the past. The past visualized as events taking place on an abstract timeline promotes a sense of objectivity and a scientific basis for historical work.

Isaac Newton, Mathematical Principles of Natural Philosophy, in Newton/Huygens, vol. of Great Books of the Western World, ed. Robert Maynard Hutchins (University of Chicago, ), .

Russell West-Pavlov, Temporalities (New York: Routledge, ), - .

The use of absolute time in graphic representations of history gained popularity in the early nineteenth century as emerging nationalism encouraged people to develop a sense of historicism about the growth of their countries. The combination of absolute
time and historicism affected the ways that timelines developed throughout the nineteenth century.

During the Middle Ages the line between studying the past and studying time was blurred. Monks often studied timekeeping in efforts to standardize calendars and time measurements such as hours. Their contemporary framework of time was directly connected to nature and the natural cycles that they experienced in their daily lives. Events such as the introduction of mechanical clocks as opposed to sundials or water clocks began to separate time from everyday life. Cyclical representations of time remained relevant in the use of calendars and timekeeping, but scholars with a holistic sense of natural philosophy aligned absolute time with the study of the past. See: Arno Borst, The Ordering of Time: From the Ancient Computus to the Modern Computer (University of Chicago Press, ); and West-Pavlov, - .

For the relationship between absolute time and a scientific view of history see West-Pavlov, - ; and Borst, . Also see Susan Schulten, "Emma Willard and the Graphic Foundations of American History," Journal of Historical Geography , no. ( ): ; and Stefan Tanaka, “Pasts in a Digital Age,” in Writing History in the Digital Age, ed. Jack Dougherty and Kristen Nawrotzki (University of Michigan Press, ), accessed March , , doi: http://dx.doi.org/ . /dh. . . .

Susan Schulten’s study of Emma Willard shows how history educators adopted timelines as a way to tell nationalistic histories. Willard created visualizations for her students in which she combined geographical structures like maps with representations of time. Willard’s Temple of Time ( ) is one such visualization. Temple of Time displays categories and prominent individuals like A Chart of Biography, and countries like Priestley’s A New Chart of History.
Instead of an even line, the information in Temple of Time is projected onto a temple form with older history barely visible in the back and more recent history front and center. This depth perspective literally makes recent history larger, and by association more important, while past individuals and nations fade into the background (fig. ). Willard’s visualizations show a progression of time culminating with the present. Schulten writes that Willard’s graphs “became a cumulative statement of nationhood, as each moment of the past was chosen for its place in an evolving story of territorial fulfillment.” These graphs emphasize the directionality of time in order to support a national perspective.

The alignment of absolute time with historicism did not defeat the tradition, common in chronicles and in genealogical visualizations, of using history as a means to claim power. The timeline that supports a history of a single entity, nation, individual, or organization, along a progressive linear trajectory is a standard that has become common in history texts, whether in the classroom or on museum walls. These timelines are the product of a particular concept of time and history.

Schulten, .

To illustrate the relationship between representations of time and power, Rosenberg and Grafton give the example of a genealogy shaped like a triumphal arch that Albrecht Dürer sketched for Maximilian I, which linked the Hapsburgs to biblical patriarchs. See Rosenberg and Grafton, .

This form of the timeline is also the object of criticism from historians. Historians’ main objection to timelines is rooted in the alignment of timelines with absolute time. This is because the association of timelines with a scientific representation of events can have the effect of taking the timeline outside the realm of historiography. The result is that timelines seem to present an objective reality, not an argument.
West-Pavlov tied linear historicism to the History Wars, stating, “The History Wars drew upon the self-evidence of a version of history understood as a sequence of empirically verifiable facts to vilify those who questioned national myths so as to lay bare a history of genocide.” Studying how Aboriginal history is depicted at public sites in Australia, Elizabeth Furniss refers to narratives of discovery, firsts, and pioneers as “timeline history,” within which “the past is made comprehensible by events being ordered sequentially in a linear pattern along a line of development and progress.” Public historians often criticize history that avoids controversy, struggles, and complications. Handler and Gable’s popular public history text, The New History in an Old Museum, advocates for a constructivist history that embraces multiplicity rather than the celebratory and overly simplified version of history that museums often present.

Regardless of the criticism they receive, timelines are a culturally relevant way to represent history. Can the growing variety and functionality of digital timelines defeat some of this criticism and make a place for timelines within scholarship? There is potential to build a scholarly discourse around the timeline’s evolving form.

West-Pavlov, . This quote refers to the Australian History Wars, but a similar sentiment abounded in the American disputes of the same name; see Edward T. Linenthal and Tom Engelhardt, eds., History Wars: The Enola Gay and Other Battles for the American Past (New York: Henry Holt and Company, ).

Elizabeth Furniss, "Timeline History and the Anzac Myth: Settler Narratives of Local History in a North Australian Town," Oceania , no. ( ): .

Timelines as Digital History: Visualizations and Narratives

Many historians and scholars of visualizations and digital media are optimistic about the potential of digital history projects.
In Digital History Daniel Cohen and Roy Rosenzweig state the advantages of digital media to be “capacity, accessibility, flexibility, diversity, manipulability, interactivity, and hypertextuality,” while simultaneously listing the potential drawbacks: “quality, durability, readability, passivity, and inaccessibility.” The advantages that Cohen and Rosenzweig list could greatly expand the potential of timelines as historical visualizations. A printed timeline is static and generally urges viewers to move in one direction. A digital timeline could have multiple starting points and links between events. Optimistically, digital timelines can facilitate multiple perspectives and interpretations.

Scholars who study visualizations are generally open about the types of representations that should be engaged as arguments. In "Digital Visualization as a Scholarly Activity," Martyn Jessop asserts, “Every representation, visual, or otherwise, is an effort to structure an argument and as such it is a rhetorical device.” Jessop equates visualizations with text. For digital timelines, the challenge is whether historians can learn to read timelines as a distinct type of visualization different from, but no less meaningful than, text.

Daniel Cohen and Roy Rosenzweig, “Introduction,” in Digital History: A Guide to Gathering, Preserving and Presenting the Past on the Web (Philadelphia: University of Pennsylvania Press, ), accessed March , , http://chnm.gmu.edu/digitalhistory.

Martyn Jessop, "Digital Visualization as a Scholarly Activity," Literary and Linguistic Computing , no. ( ): .

There are many reasons to be skeptical of visualizations. Johanna Drucker warns about the dangers of embracing visualizations, stating:

The persuasive and seductive rhetorical force of visualization performs such a powerful reification of information that graphics such as Google
Maps are taken to be simply a presentation of “what is,” as if all critical thought had been precipitously and completely jettisoned.

The sentiment in Drucker’s statement is that digital visualizations can prevent others from opposing or conversing with their content because of their presentation as facts. This critique is aligned with historians’ critiques of timelines. Drucker’s skepticism of visualizations highlights that the authors of any visualization need to balance their priorities against how people, unfamiliar with the nitty-gritty details of the research, will interpret the visualization.

Because reading visualizations critically is not intrinsic for everyone, studies from scholars such as Staley and Jessop, along with classic visualization studies from Edward Tufte, have advocated that visualizations need a set of best practices and guidelines similar to those that scholars have adopted for books and articles. Among the suggestions are that makers of visualizations be aware of their audience, be mindful of the ways that the visualization distorts data, and embrace open debate about the methodologies and technologies they used to create the visualization. For these scholars, guidelines can help bring the visualizations to a critical discourse in which they have traditionally not played a large part.

Johanna Drucker, “Humanistic Theory and Digital Scholarship,” in Debates in the Digital Humanities, ed. Matthew K. Gold (Minneapolis: University of Minnesota Press, ), accessed March , , http://dhdebates.gc.cuny.edu/debates.

One reason that scholars feel a set of best practices is needed for visualizations is that reading a historical narrative in an article or book is different from reading visualizations. Reading visualizations usually involves observing the whole, then letting the information guide where and how the viewers’ eyes travel, affecting the way that they interpret the visualization. Staley differentiates visualizations from prose because
visualizations can be multidimensional and multidirectional while prose is based on one dimension and read in one direction. Because of their one-way directionality, Staley suggests that timelines remind historians of sentences and are therefore one of the most common forms of historical visualization. As visualizations that are more linear and closer to text than others, timelines can easily obscure their visual components. With timelines it is important to remember that the deciding factor determining the arrangement of chronological information is an imaginary line of time.

For an overview of information visualization see Edward R. Tufte, Envisioning Information (Cheshire, Connecticut: Graphics Press, ); for visualization guidelines see Staley, - .

Staley argues that visualizations are just as much secondary sources as books and should be treated as such in historiographies; see Staley, .

Most historical arguments take the form of narratives, yet because timelines are generally excluded from historiography, they are rarely analyzed as narrative forms. Historian Hayden White’s studies of narratives are an example of how historians theorize about textual history and chronology. White states, “Unless at least two versions of the same set of events can be imagined, there is no reason for the historian to take upon himself the authority of giving the true account of what really happened.” For White, events in a narrative contribute to a plot that makes sense of the world in a manner in line with the moral standards of the society that created the narrative. Timelines are not typically presented in multiple versions and by White’s definition are not narratives. As a result of this denial of narrative status, there is little theoretical framework from which historians can study timelines. The view that lists of events are not narratives fits in with the perception that they tell a scientific history in absolute time.
Staley, , .

Hayden White, The Content of the Form: Narrative Discourse and Historical Representation (The Johns Hopkins University Press, ), . For White’s analysis of narratives see “The Value of Narrativity in the Representation of Reality,” in The Content of the Form, - ; and “The Question of Narrative in Contemporary Historical Theory,” ibid., - , particularly - .

Timelines have more potential to contribute to scholarly discourse as visualizations that contain narrative elements. Thinking about a timeline first in terms of its visual elements enables historians to see how the timeline is composed of decisions that they can engage, such as the choice to display certain date formats, the implications of using or not using measured time, and how the timeline compares with others. With digital timelines these decisions are often limited and many of the tools available do not offer the types of customizations that one would be able to execute by hand.

Rosenberg and Grafton end their book with ambivalence towards digital timelines. They state, “From the beginning the biggest challenge of the time chart was not to include more data, but to clarify a historical picture – to offer a form that was intuitive and mnemonic, and that functioned well as a tool of reference.” Timelines of the past, like those of Joseph Priestley and Emma Willard, relied heavily on printmaking techniques and the artistic talents of the engravers. Highly individualized, artistic visualizations are difficult to recreate in computer software. Digital timelines can be individualized, but most use, or are based on, a limited number of timeline tools that structure events into a visual order. Over the past ten years a number of these tools have developed a range of functions: from increasing the amount of data that timeline authors can employ to integrating multimedia content and social media.
The following section evaluates digital timeline tools available online and begins to discuss the impact of this popular form of historical representation.

Rosenberg and Grafton, .

For a discussion of the limitations of computer generated visual forms see Jessop, .

Timeline Tools

Most advocates for digital history and visualizations introduce timelines as a standard historical visualization technique and quickly move on to more complex and technologically advanced visualizations such as network mapping, virtual realities, 3D modeling, 3D scanning and printing, and mapping projects. In the eagerness to move on to flashier technology, the meaning of the digital timeline remains largely unexplored. Timelines are common on the internet; they appear on websites ranging from museums and course syllabi to blogs and social media outlets. Rosenberg and Grafton state, “Along with the list and the link, the timeline is one of the central organizing structures of the contemporary user interface.” Digital timelines exhibit a range of functions with varying complexity. Understanding digital timelines involves analyzing the relationships between software creators and developers, timeline authors, and the users who interact with the final product.

The most basic digital timeline does not require a special tool or advanced coding. A simple digital timeline could be a list of dates with some accompanying text. There are dozens of digital timeline tools available online that will format content in more complex visual ways.

For examples see Staley; and John Theibault, “Visualizations and Historical Arguments,” in Writing History in the Digital Age.

Rosenberg and Grafton, .

I have tried to keep these terms consistent for clarity about the roles of my actors. I also use the term “contributors” to refer to people who suggest timeline content, but do not necessarily author it. Occasionally these terms can cross over, as in timeline authors being software users.
While I have tried to keep the authorship of timelines clear, it is important to note that timeline authors must work within the bounds of what the developers created. When authors wish the timeline to be an exploratory research tool then the authors are also timeline users.

For this section I chose a selection of timeline tools to analyze. One main criterion in my selection process was that the timelines could be embedded on the users’ website. I left out timeline tools that their creators intended primarily for personal use, such as Aeon Timeline, and did not discuss non-historical timelines. I have tried to be aware of the cost of the tools that I discuss, but one cost that underlies all of these is the cost of web hosting, internet service, and computers. I have largely assumed that most organizations have a website, yet I am aware that individual historians and some smaller organizations do not, or rely on free services such as WordPress.com for their hosting. Embedding a timeline on a free hosting service can differ from embedding on a website where the site’s administrator has access to the site’s files. Because many small cultural institutions do not have a large, or any, technical staff, I have tried to discuss tools that require a minimal amount of coding. There are many customized timelines on the web, but I have mainly stayed away from these because they are not practical for most cultural institutions. (For an example of one of these customized timelines see The Evolution of the Web, http://www.evolutionoftheweb.com, collaboration between Google Chrome, Hyperakt, and Vizzuality, accessed March , .)

Wikipedia’s Timeline of Art is an example of a simple digital timeline. Wikipedia user Astarte created Timeline of Art on August , . At its inception the timeline consisted of sections listing decades beginning with the most recent decade at the top. Under the section heading for each decade were lists of years and links to Wikipedia’s “year in art” pages. The early version of Timeline of Art contained the entire twentieth century and sporadic decades from the nineteenth century and earlier with nothing before . The source content for this timeline was not a grand narrative of art history, but the “year in art” pages. The pages for the years absent from the timeline at its early stages had not been created yet (figs. - ).

A select number of individuals have authored the vast majority of Timeline of Art’s edits. These edits are concentrated in the nineteenth and twentieth centuries. After Astarte created the structure of the timeline, other Wikipedia users began to put text and links next to the years, primarily consisting of the births and deaths of prominent artists. It was not until September that an anonymous Wikipedia user completely filled out the decades from year AD and added sections for earlier dates through prehistoric art. Scrolling through the timeline, as it exists concurrent with this thesis, creates a sense that art today is more plentiful than ever before. The entries for the s are full of text and links. Scrolling down, this text becomes scarcer. At the end of the timeline there is a slight increase in text accompanying prehistoric entries.

“Timeline of Art,” Wikipedia, accessed March , , http://en.wikipedia.org/wiki/timeline_of_art; for the history of the Timeline of Art see “Timeline of Art: Revision History,” Wikipedia, accessed March , , http://en.wikipedia.org/w/index.php?title=timeline_of_art&dir=prev&action=history.

For a discussion of Wikipedia as a vehicle for public memory see Robert S. Wolff, “The Historian’s Craft, Popular Memory, and Wikipedia,” in Writing History in the Digital Age.

On Timeline of Art, the chosen organizational scheme has consequences for the content.
Pages related to art history organized by medium, period, or geographic region, for example “Chinese art” or “cave painting,” did not make the cut though they existed at the time of Timeline of Art’s creation. Because Timeline of Art uses Wikipedia content as the data source it would make sense that some material is left out. Simultaneously, using a data source that is not an art history textbook would have the potential to provide an alternative point of view. However, Wikipedia pages devoted to art tend to reflect standard art history divisions and the timeline reflects these tendencies. As in a standard text, non-Western art typically appears on pages that cover a specific culture or geography, while the larger art narrative is primarily Western. On Timeline of Art the prehistoric section mentions Africa and Australia and a few years from to list Chinese and Japanese artists. From onward Western art dominates. The text “The Aztec calendar stone is discovered” is listed next to the entry “ in art.” The perspective of this entry highlights the stone’s discovery, hundreds of years after its creation. This example illustrates biases that are difficult to escape, even when the organization and selection of content is not based on any established text.

“Chinese Art: Revision History,” Wikipedia, accessed March , , http://en.wikipedia.org/w/index.php?title=chinese_art&dir=prev&action=history; “Cave Painting: Revision History,” Wikipedia, accessed March , , http://en.wikipedia.org/w/index.php?title=cave_painting&dir=prev&action=history.

Many digital timelines are formatted as a simple list, like Timeline of Art. These timelines are easy to make and involve curatorial and narrative decisions, even if they do not involve extensive decisions about their visual design.

In , the SIMILE project from MIT created Timeline. David François Huynh, Timeline’s creator, describes it as “the equivalence of Google Maps but for temporal data - the first web API for rendering
interactive timelines in web pages.” SIMILE was a grant-funded project run out of MIT that developed a variety of open source tools to “empower users to access, manage, visualize and reuse digital assets.” Timeline was under active development between and . SIMILE is no longer attached to an institution or funded. The tools that the project developed are currently hosted on Google Code and maintained by the open-source community. The technology used to create Timeline has been incorporated into the Neatline exhibits on Omeka and the Zotero timeline project, both hosted out of the Roy Rosenzweig Center for History and New Media at George Mason University.

For quote see: David François Huynh, “Project Highlights,” accessed March , , http://davidhuynh.net/. For SIMILE Timeline see: “Timeline,” accessed March , , http://www.simile-widgets.org/timeline/; and the project’s source code at Google Code, accessed March , , https://code.google.com/p/simile-widgets/source/browse/#svn% ftimeline.

See SIMILE’s former homepage at http://simile.mit.edu/, accessed March , . The current SIMILE project is hosted at “SIMILE Widgets,” accessed March , , http://simile-widgets.org.

SIMILE’s Timeline was designed as an interactive timeline that could organize large amounts of data instantly. Timeline is a widget that uses a combination of HTML, XML, and JavaScript. The timeline is created using HTML div elements to make bands that create the illusion of infinite scrolling. The result is a horizontal timeline that authors can customize and that users can manipulate on the published timeline. With multiple bands synchronized together, the timeline enables users to view events at two different scales, for example as years on one band and months on another. Entries can either appear as points, represented by small dots with an adjacent title, or as durations, represented by
lines that span the duration based on the timescale of the band in which they appear. Entries can be divided into categories. Hovering over entries will show details that can include brief text, links, and images. Vertical columns of varying widths can be inserted into the band to distinguish important events and time spans.

See dates active on Google Code at https://code.google.com/p/simile-widgets/source/browse/timeline/tags/#tags, accessed March , .

For using SIMILE Timeline with Zotero see “Timelines,” Roy Rosenzweig Center for History and New Media, accessed March , , https://www.zotero.org/support/timelines; for using SIMILE Timeline with Omeka see “NeatlineSimile by Scholars’ Lab,” Roy Rosenzweig Center for History and New Media, accessed March , , http://omeka.org/add-ons/plugins/neatlinesimile.

The Milestones Project by Michael Friendly of York University and Daniel Denis contains a SIMILE timeline (figs. - ). The Milestones timeline aims to “span the entire development of visual thinking and the visual representation of data,” and includes entries, which Friendly and Denis call milestones, relevant to the development of visualization technologies. The Milestones timeline is centered at . Above the timeline is a key displaying different colored categories and below are a series of epochs into which the authors divided the milestones. The epochs enable users to navigate long periods without excessive scrolling, and also represent separate pages that users can reach by clicking “Milestones detail” on individual entries. These pages collect the entries from the epoch into one list with additional information, images, and relevant links.

The authors of the Milestones timeline selected entries based on secondary sources, cited within the entries’ detail page, as well as the authors’ own research. The way that SIMILE Timeline formats the data displays patterns of activity. The period between and is an active wave, peaking in the s and trailing off into the s.
Between and entries appear at a steady pace. The multitude of entries, especially when they appear within a few years of one another, makes the timeline difficult to read as a linear narrative. As a user it is easier to jump around from entry to entry. The colored categories become focal points that lump entries together.

“Timeline Basics,” SIMILE Widgets, accessed March , , http://simile-widgets.org/wiki/timeline_basics.

For the Milestones Project see Michael Friendly and Daniel Denis, Milestones in the History of Thematic Cartography, Statistical Graphics, and Data Visualization, accessed February , , http://www.datavis.ca/milestones. For quote see Friendly and Denis, “Milestones Project,” accessed March , , http://www.datavis.ca/milestones/index.php?page=milestones+project.

While the Milestones Project contains interpretive elements in the form of the epochs that the authors created to organize the entries, the SIMILE timeline on which the data appears can also be a research tool in and of itself. Users could look for connections between technological developments, in green, and graphics, in blue, or they could track particular technologies.

The Milestones timeline focuses on technological developments. The authors lump everything before into one epoch and list few entries after . The authors cite the primary audience for this timeline as visualization specialists who are often not taught the history of the field in their academic and professional training. In this sense, the timeline is not giving a concise history of any one field, but showing people how their specialty relates to other specialties. This specialist audience may not have much interest in entries before that do not include familiar forms of technology.
Likewise, the dearth of contemporary entries could reflect the authors’ desire for the timeline to be relevant to a wide variety of specialists, as well as the authors’ unwillingness to overpopulate the timeline with entries that have not yet had an impact on a variety of visualization specialties. It would be difficult to combine the information on the Milestones timeline, which seeks a multi-specialist audience, into a more traditional narrative history.

Ibid., “Milestones Project.”

SIMILE’s Timeline provides a means for people to visualize large amounts of information. The tool is free, highly customizable, can handle more data than proprietary alternatives, and includes detailed documentation that explains both the concept behind the timeline and how to implement many of its features. However, the technology does require that timeline authors work with code. For professional data visualization specialists this may not be a problem, but it could be a hindrance for small institutions low on skills, money, and time. As a visualization tool, Timeline formats and organizes historical data into a visual scheme that would be difficult to produce without a computer, and in doing so, enables users to see the data in a new light.

The horizontal design and concept of SIMILE’s Timeline formed the basis for multiple timeline tools that came after. Among the timeline tools that followed SIMILE’s Timeline are Dipity, Timeglider, TimeRime, Timetoast, and WhenInTime. (See figs. - for images of a few of these tools.) These timelines share the horizontal format of SIMILE’s Timeline and incorporate features such as zooming, which enables users to change the timeline’s scale of time, as well as multimedia and social media features. These tools do not require authors to do any coding; however, authors cannot customize these timelines and the free versions are limited.
with the exception of whenintime these tools cost money for timeline authors who want to work with their more advanced features. for historians and institutions interested in creating timelines these tools can save time, if the timeline author is willing to keep the tool as it comes and spare the cash. the overall effect of these tools is similar, though each offers slight variations in features and design. with these tools authors have some control over how people will read the chronology that the timeline presents, but as with simile’s timeline, the users can decide how to read the timeline independent of the author’s intentions. for example, timeglider enables authors to create categories and to label events by importance.

note: “timeline getting started,” simile widgets, accessed march , , http://simile-widgets.org/wiki/timeline_gettingstarted.

note: software homepages: dipity, http://www.dipity.com; timeglider, https://timeglider.com; timerime, http://timerime.com; timetoast, http://www.timetoast.com; and whenintime, http://whenintime.com. homepages last accessed march , .

note: the expense of these tools varies. monthly plans start at around $ a month, and max out with dipity’s $ a month professional plan. timerime offers users one-time purchase plans that start at € and can rise higher than € . without purchasing a plan the tools will have restrictions such as a limited number of entries and not allowing users to embed their timelines on other websites. for payment plans see: “professional solutions,” timerime, http://timerime.com/en/page/our_products/ ; “dipity premium plans,” dipity, http://www.dipity.com/premium/plans; “plans & pricing,” timeglider, https://timeglider.com/signup; “plans,” timetoast, http://www.timetoast.com/plans. plans last accessed march , .
the result, in conjunction with the tool’s zooming feature, is that users can zoom out to an empty timeline and zoom in with the entries labeled as more important appearing first and the others slowly populating the timeline around them (figs. - ). for authors, utilizing this feature means making a judgment about the relative importance of events and highlighting plot points like a more traditional narrative. with these timeline tools the technology determines the organization of the points in space. users could try to read the events chronologically if they zoomed in close enough for all entries to be visible. if they were zoomed out they would only see major plot points and would have to excavate the entries. a user could potentially do neither of these things and begin with the images that timeglider places at the top of the timeline. there are multiple ways to read these timelines and, as a result, multiple narratives based on the user’s navigation choices. social media integration is a main feature of these timelines. the creators of dipity, derek dukes and bj heinley, state, “dipity allows users to gather real-time sources from social media, traditional search services and rss to aggregate them in a single, easy to use, fun to navigate interface.” whenintime similarly enables users to grab data from wikipedia, twitter, and facebook feeds. traditionally historians sift through sources and select the content that will become history. the ability to grab and present information in a public forum without intervention from historians or archivists enables people to create history that bypasses institutional standards. in relation to these timelines historians can become moderators, but are not necessarily authors. these timeline tools are primarily intended for public consumption. they are about the exhibition of content and largely not used as research tools.

note: “how timeglider works,” timeglider, accessed march , , https://timeglider.com/how_it_works.
other digital timeline tools, such as preceden, focus on information organization and emphasize exhibition in a more limited capacity, such as for a small class (fig. - ). developer matt mazur created preceden in . preceden still presents a horizontal chronology, but it diverges from the other timeline tools in the way that it organizes data. preceden’s structure is similar to the priestley timelines. it is formed like a graph, with dates at the top of the timeline and layers that display categories listed down. preceden will change its height to accommodate layers and events. dates can be color coded to show additional categorization and can be either specific points or time spans. authors can set milestones, which will strike a vertical bar through the whole timeline. preceden’s focus on structuring data in a clear, readable way, highlighted by features such as the ability to add layers and expand the height of the timeline, eliminates some of the clutter that other tools can accumulate when too many entries populate an area. preceden works well as a study aid for students and has potential to visualize dates in ways that the other tools cannot. the dating options in preceden are highly customizable; the software enables authors to list entries in multiple formats, for example using a year for one entry and the month and day for another.

note: preceden costs a one-time $ payment that enables users to make as many timelines as they want and to embed the timelines on their own website. teachers’ accounts allow students to create timelines from within the teacher’s account that are not made public or visible to other students, but which the teacher can edit. for preceden homepage see http://www.preceden.com; preceden allows users a free timeline, but limits it to events only. see “sign up,” preceden, https://www.preceden.com/signup; for teacher accounts see “preceden for teachers,” http://www.preceden.com/teachers. preceden website last accessed march , .
authors can choose whether to display the horizontal bars representing time spans as solid or as fading in and out on the edges. this effect becomes the visualization equivalent of a “circa” date. another unique dating feature is the ability to set “dependencies.” this feature enables users to set an entry to begin when another ends, essentially connecting entries together. on most of the timeline tools the author making the timeline must input a date for all entries. for historians trying to establish chronologies that may include events for which dates are unknown, this feature enables approximations. without this feature, authors must either leave out material with unknown dates or claim authority and choose a date. the ability to visualize unknowns gives users the chance to guess dates on their own and makes the research process collaborative. preceden moves digital timelines in a direction focusing on data organization. similarly, tools such as chronos timeline in development at mit’s hyperstudio embrace the visualization of temporal data over social media and multimedia features. on the other end of the spectrum tools such as timelinejs and tiki-toki embrace the linearity of the timeline and highlight narrative, multi-media storytelling. instead of using the timeline itself as the main visual focus, these storytelling tools highlight entry content. tiki-toki launched in march of . tiki-toki’s default view is a horizontal layout with a small timeline band on the bottom and a larger band that displays entry content (fig. - ). tiki-toki includes an alternative 3d view that is unique among timeline tools. in the 3d view the user moves forward on a flat plane with future events visible on the horizon. rather than looking at history from a bird’s-eye view, the 3d perspective limits the user’s visibility and obscures past events once they go out of view. in one sense the 3d perspective makes entries more immediate for users. this view creates an effect similar to that of the emma willard graphics that embraced timelines as a nationalistic, progressive way to visualize history. tiki-toki can present a similar type of linear, progressive narrative. unlike willard’s timelines that reflect the will of one person, tiki-toki has a group edit feature. multiple authors can contribute content directly. this feature encourages collaborative storytelling that could potentially integrate multiple points of view. the final product is still a linear story, but the ability to have multiple authors removes the authority of the narrative from any one person. timelinejs has a similar storytelling function. timelinejs is an open-source timeline tool created by northwestern university’s knight lab. knight lab is primarily focused on journalism and creating tools to advance web-based new media. the intended audiences for timelinejs are journalists and media outlets. timelinejs encourages timeline authors to choose stories with a “strong chronological narrative,” and the software is optimized as a storytelling tool. a timelinejs timeline consists of a slider that displays content from an entry with a horizontal timeline underneath the slider.

note: “chronos timeline,” hyperstudio, digital humanities at mit, accessed march , , http://hyperstudio.mit.edu/software/chronos-timeline. chronos timeline is currently in development, but sample timelines are viewable from the website.

note: like the simile-inspired timeline tools, tiki-toki offers a limited free version and has pricing plans ranging from $ . to $ . a month. for tiki-toki homepage and plans see http://www.tiki-toki.com; a history of the product’s development can be found on the timeline connected to the homepage: “beautiful web-based timeline software,” tiki-toki, http://www.tiki-toki.com/timeline/entry/ /beautiful-web-based-timeline-software; website last accessed march , .
note: schulten, ; schulten describes the movement of time in willard’s timelines, stating, “time flows forward and widens toward the viewer, as opposed to the timeline’s trajectory across the page in fixed increments.”

note: group editing is only available with the paid version of tiki-toki. for an example see, “group edit demo,” alex kearns, tiki-toki, accessed march , , http://www.tiki-toki.com/timeline/entry/ /group-edit-demo/#vars!date= - - _ : : !.

note: for timelinejs homepage and quote see: http://timeline.knightlab.com; for information about knight lab see “about us,” http://knightlab.northwestern.edu/about; websites last accessed march , .

if the user employs the timeline to skip entries, the slider will quickly animate through the skipped entries, a similar experience to skimming text. the slider is an easier navigation option and presents users with a linear chronological narrative. using timelinejs does not require coding. authors can arrange their content on google spreadsheets with a template that knight lab provides, which they can then upload and publish as a timeline through the timelinejs website. it is possible for authors to create their own json files and customize the content and code if they wish and have the skills to do so. though authors can customize timelinejs, the abilities of the software are limited. the developers do not recommend more than entries, lest the author risk slow load times. the amount of recommended timeline content requires a time commitment similar to that of reading a newspaper article, further encouraging users to read the timelinejs timelines like stories. reading a larger number of entries would require more time than the casual user has and would be better represented by a timeline with less of a focus on storytelling. knight lab created the timeline revolutionary user interfaces to exemplify the functionality of timelinejs (figs. - ).
this timeline tracks the development of computing devices that have altered the way humans and machines interact. this timeline makes users think about the interfaces that they use every day, and consider the software that is presenting the timeline as a part of that history. as a promotional tool, revolutionary user interfaces serves as a type of institutional history that places knight lab at the forefront of current user interface design. like all timelinejs timelines, revolutionary user interfaces begins with an introduction at the earliest dated entry and moves the users forward. the timeline’s first entry is at with a machine called the antikythera mechanism and travels through a select number of innovations leading to a dramatic increase in the development of innovative user interface technologies in contemporary times with entries such as voice recognition. the timeline does not include failed technology or that which does not have a direct connection to the user interfaces of today. unlike the milestones timeline, which includes a wide range of data that requires the user to sift through entries and seek out connections, revolutionary user interfaces makes the connections for the user. because of timelinejs’s emphasis on storytelling, revolutionary user interfaces entails more curatorial decision making than other tools. the tools and timelines discussed thus far require that timeline authors work with and around the technology in different ways. rosenberg and grafton termed digital timelines “grassroots timelines,” implying that these timelines are taking the timeline outside of a professional realm and enabling people who otherwise would not have the resources to make timelines to become history producers.

note: “revolutionary user interfaces,” knight lab, accessed march , , http://timeline.knightlab.com/examples/user-interface.
these timeline tools do offer options for people to create timelines without any professional historical training or technical expertise. however, the pool of individuals and organizations making these tools represents a much smaller group with their own motives regarding the function of the technology. stephen ramsay and geoffrey rockwell question whether digital tools contain arguments in and of themselves or whether they just relay information. they conclude, “where there is argument, the artifact has ceased to be a tool and has become something else.” in many ways these timeline tools all impose a structure on timeline authors that affects the way users ultimately read the history that the finished timeline presents. mit’s simile timeline, along with its upcoming chronos software, and northwestern university’s timelinejs represent divergent paths for digital timeline technology. the former is data driven and encourages users to explore history with the visualization that the timeline provides. the latter is narrative driven and promotes the timeline as a means through which users can take part in a multimedia storytelling experience. the following section is an in-depth case study of one digital timeline, the print center’s centennial timeline. this section analyses the context behind the print center’s commemorations, discusses the organization’s past timelines, considers the practical limitations of creating a digital timeline from within an institution, and explores the merits of the final product as a public, digital history project.

note: rosenberg and grafton, .

note: stephen ramsay and geoffrey rockwell, “developing things: notes toward an epistemology of building in the digital humanities,” in debates in the digital humanities.

chapter
the centennial timeline: a case study

introduction: commemorative practices and institutional history

this section is a case study in timeline production focusing on the print center’s past and current commemorative activity.
the previous section discussed the promise of digital timeline tools and addressed the potential of digital timelines to defeat the notion that timelines can only tell flat, linear histories. in practice, the circumstances under which organizations choose to make timelines can obscure the more idealistic potential of the form. the timeline’s context greatly affects its purpose and content. this case study aims to reveal the combination of factors that affect the way organizations present history. the print center, a small nonprofit arts organization, is celebrating its centennial in and is using the occasion to undertake a digital timeline project. in the summer of i commenced an internship at the print center. during this internship i researched in the print center’s archives, participated in the planning of the centennial, and observed the daily operations of the organization. my time at the print center enabled me to reflect upon how the context of institutional history and commemoration affects public historical representations such as timelines. i have not created content for the centennial timeline, but while working on the centennial website that houses the timeline i became familiar with how the timeline could evolve and began developing a plan that could potentially guide the timeline through the centennial and beyond. this centennial is one of many anniversaries the print center will have celebrated over its history and the centennial timeline is one of many timelines that these anniversaries have brought to life. this timeline blurs the lines between commemoration, memory, and history. because it involves the print center’s community of supporters, the timeline is as much a tool that looks to the future, speculating about the organization’s next hundred years, as it is one that reflects upon its history.
the timeline uses the print center’s current institutional identity as a springboard from which to find points in the past that reveal how the organization became what its community knows it as today. public historians have grappled with commemoration and institutional histories in a struggle to understand why people remember institutions in the ways that they do. historian richard white frames the processes of history and commemoration as “forms of mediation between past and present.” a timeline of institutional history typically leads directly to the present, and the individuals who author these timelines need to be acutely aware of the repercussions of the past that they assign to an institution. public commemorations often use history in ways that can highlight present issues, conform to or challenge accepted narratives, and can be celebratory for those involved. unlike academic pursuits, public commemorations are often personal for participating individuals and groups. where a commemoration takes place, who runs the show, and what their motives are can greatly affect the nature of commemorative activity. the book history wars: the enola gay and other battles for the american past illustrates the controversies that arise when a professional historical retelling focuses on changing a national narrative from within a national institution. the print center is not contending with a national audience like the smithsonian, but its community does accept a narrative of the organization’s identity. within this narrative, the print center is an inclusive organization that supports artists and artistic innovations. this narrative drives the print center’s identity and image. the “about us” page on the organization’s website demonstrates how identity and history converge at the print center.

note: richard white, "a commemoration and a historical mediation." the journal of american history , no. ( ): .
the page states, “in , the print club changed its name to the print center to mark its commitment to serve both its members and its community.” in a courier-post article from february , , fred adelson reports, “in , the print club changed its name to reflect more accurately that it serves a broader public audience.” these statements imply that the name the print club isolated the organization’s community. the shift in perception that center is better than club likely resulted from a combination of external forces and internal developments that created a distance between the old use of club and the new center. regardless, the print center now commemorates itself as a center; club rarely appears on anniversary publications. altering the history that accompanies the print center’s identity as a center affects the way that the organization encourages people to give it support, particularly financial support. this example illuminates the political dimension to commemorations that can be difficult to reconcile with history.

note: fred adelson, “ years of printmaking is worth celebrating,” courier-post, february , , accessed march , , http://www.courierpostonline.com/story/entertainment/ / / /years-printmaking-worth-celebrating/ /?utm_campaign=% b% the+print+center+at+ % a+% vibrant+and+relevant% % % d&utm_source=% b% copy+of+pabf+% b+exhibitions% % d&utm_medium=% b% email% % d.

note: among the works that illustrate controversies that commemorations can bring about, in addition to history wars, are david thelen, "memory and american history," the journal of american history , no. ( ): - ; sarah j. purcell, "commemoration, public art, and the changing meaning of the bunker hill monument," the public historian , no. ( ): - ; and veronica strong-boag, "experts on our own lives: commemorating canada at the beginning of the 21st century," the public historian , no. ( ): - .
note: for the print center, i have tried to use the name of the organization relevant to the date of my sources; therefore i use both the print center and print club based on context.

for an organization with an upcoming anniversary the decision to write an institutional history is strategic. institutional histories can take a variety of forms, ranging from the paragraph about a company on the back of a granola bar to coffee table books filled with photographs from factory floors. the majority of institutional histories tell stories of success, progress, and power. an institution that publishes its history on its product wants to connect with its customers, appealing to their sentiments. the print center has a genuine concern for history, but the motivations for turning the magnifying glass upon the organization itself have more to do with the print center’s present struggle to stay relevant amidst the city’s large number of arts organizations. the print center in the past was one of a select number of philadelphia art institutions. the travel brochure “philadelphia in the spring” from the s only advertises two organizations specializing in contemporary art: the print club and the philadelphia art alliance. the current agency that markets philadelphia tourism, visit philadelphia, does not list the print center among the dozens of contemporary art galleries on its “all art museums and galleries” page. for the print center the centennial serves as the organization’s bid to publicize and reclaim its former status.

note: institutional histories often struggle between how to portray the institution, the institution’s need to stay in business, and how to portray more delicate, potentially controversial histories. works that cover issues of institutional histories include scholarly monographs such as richard handler and eric gable, the new history in an old museum: creating the past in colonial williamsburg (durham: duke university press, ); and tony bennett, the birth of the museum: history, theory, politics (new york: routledge, ); articles from historians include eric l. mckitrick and stanley elkins, "institutions in motion," american quarterly , no. ( ): - ; sally gregory kohlstedt, "institutional history," osiris ( ): - ; and abir-am, pnina g. “introduction.” osiris nd series, vol. , “commemorative practices in science: historical perspectives on the politics of collective memory,” ( ): - ; from the perspective of corporate history literature includes w. richard scott, "reflections: the past and future of research on institutions and institutional change," journal of change management , no. ( ): - ; and agnès delahaye, et al., "the genre of corporate history," journal of organizational change management , no. ( ): - .

note: “philadelphia in the spring,” brochure, box , folder , collection , print club archives, historical society of pennsylvania. (hereafter: print club archives, hsp.)

note: “all art museums & galleries,” visit philadelphia, accessed march , , http://www.visitphilly.com/music-art/art-museums-galleries/view-all.

the print center’s history in commemorations past

with its relevance and finances at stake, the print center could approach telling its history in the centennial in different ways. the print center has celebrated multiple anniversaries in the past with a range of programs, exhibits, and publications. the timelines that the organization created for these anniversaries present the history in a way that reflected the state of the organization at the time of the commemoration. most of the print center’s commemorative activity occurs through exhibits. the exhibits at the print center’s fiftieth and sixtieth anniversaries commemorate the institution through artwork chosen to represent history. in this way the institution is directly linked to developments in art and is able to portray itself as promoting and encouraging artistic developments. as commemorative events, these exhibits promoted a sense of reverence for the past and pride in the present. as historical narratives, they lacked conflict and their plot development was relatively flat. the print center intended these exhibits to encourage viewers to look forward and speculate about what type of art will be next. in the range of anniversaries that the print center has celebrated, a few turned the spotlight on the organization itself and told an institutional history as a part of the commemorative celebrations. publications from the fortieth and ninetieth anniversaries contain timelines that include this type of institutional history.

note: for example, during the fiftieth anniversary, the print club organized an exhibit in which prints from and were juxtaposed with prints from . this exhibit displayed a stark comparison between the types of prints popular at the founding of the club and contemporary prints. this selection of a group of prints from the founding year could be an immersive experience and give viewers a feel for what the club was like at its inception. the sixtieth anniversary in featured an exhibit called “ prints from years.” in this exhibit one print was chosen for every year from to from artists who had previously exhibited at the print club. viewers could draw a line from the organization’s founding to themselves. the systematic presentation of art encouraged viewers to feel detached from the history and potentially more reflexive about change than the fiftieth anniversary exhibit. simultaneously, because only one print represented a single year the overall historical narrative of the exhibit is a continuous linear progression over which change occurs at equal rates. see: “fiftieth anniversary exhibition,” box , loose paper, print club archives, hsp; and “ prints from sixty years,” box , folder , print club archives, hsp.
the brochure “fortieth anniversary: the print club” is a horizontally formatted booklet containing short essays about the organization and two separate timelines, one titled print club milestones and another titled exhibition highlights. each timeline consists of a vertical list of years, with accompanying descriptions. the section print club milestones is brief and relays the movement of the club leading to its permanent home at latimer street, staff increases, staff changes, and a few notable programs such as the artists’ assistance fund and the printmakers’ workshop. exhibition highlights lists the first dates of the print club’s various annual competitions and names of prominent artists who exhibited at the club in a given year, for example, “ – audubon.” in the exhibition timeline some years are listed twice and some skipped entirely. the separation of institutional history from exhibit history in the fortieth anniversary timelines creates a more layered depiction of the organization’s history than an exhibit alone. from exhibition highlights readers sense that the print club expanded its infrastructure in the s, establishing a series of competitions still in effect when the booklet was published. through the s and early s the entries display how the club integrated shows illustrating the history of printmaking with contemporary art. in the late s and s the entries become almost entirely contemporary. the last three entries highlight the club’s efforts to reach an international community, both in showing international artists and in sending american prints abroad. print club milestones has a cluster of entries between and that takes up more than half of the timeline. this section tells the early foundation of the club, primarily concerned with the establishment of a permanent headquarters.

note: “fortieth anniversary: the print club,” box , folder , print club archives, hsp.
one entry indicates conflict, stating, “ – headquarters relinquished because of war.” unlike exhibition histories and statements of growth in which the organization has agency, this entry indicates that the club does not have full control over its history or destiny. the entries from the s highlight the club’s incorporation and staff development. the remaining entries from the s to s shift to program development and staff expansion. together, these two timelines paint a picture of a small organization that established itself in the s, grew steadily through the s and s, and had established itself as a prominent arts organization by . the ninetieth anniversary publication from , “ years: nurturing the new,” contains another timeline. this publication is a tri-fold brochure with a timeline across the lower half. this timeline is split into nine decade-long sections beginning at . events that took place in the decade timeframe are stacked vertically over the decade. visually, the vertical events create an oscillating wave-like effect, steadily growing from the first decade, - , to the last, - . this timeline primarily focuses on the expansion and development of the print center with entries highlighting prominent individuals whose art the print center displayed or published, the establishing of programs, and notable firsts. a few undated entries tell general developments. this timeline integrates the print center’s exhibition history, program history, and internal history resulting in a timeline that has an official, authoritative feel. compared to the fortieth timelines less space is devoted to operations. the first entry about the building is , covering the purchase of latimer street.

note: see newspaper clipping, “she has spurred print club to new heights of prestige,” box , scrapbook - , print club archives, hsp.

note: “ years: nurturing the new,” (the print center, ). the director of the print center gave this brochure to me.
the ninetieth timeline skips the organization’s earlier struggle to find a space in which to operate. instead of internal developments, names are important on the ninetieth timeline. most of the names the ninetieth timeline relays are notable individuals. for example, “ - / purchased complex experimental prints by jasper johns and robert rauschenberg shortly after their creation.” this entry suggests that the club investing in these two recognizable artists early on is a feat in itself. the timeline does not give the reader any idea as to the other purchases the club made during that ten-year period, nor does the timeline indicate how prominent jasper johns and robert rauschenberg were at the time when the club purchased their work. this timeline also puts additional emphasis on naming funders such as the ford foundation and the pew charitable trusts. the emphasis on name recognition in the ninetieth timeline indicates that the purpose of the timeline is to be a physical manifestation of the print center’s prominence. the differences between the fortieth timeline and the ninetieth timeline indicate that the situation of the club in its respective present greatly affects the way it portrays its history. one of the critiques of timelines compared to more traditional historical narratives is that they offer little interpretive history. though both timelines portray linear, progressive, and relatively one-sided narratives, the differences between the two indicate that timelines are interpretive and that the historical quality of their content depends as much on their authors as it does on the timeline’s linear form. these timelines highlighted events that the authors felt were most instrumental in creating the print center that they knew. in the print club was experiencing growth. the club’s early troubles finding a headquarters were struggles that the club survived and prospered through.
in the print center was trying to re-imagine itself in a competitive nonprofit landscape and needed the history to emphasize impressive professional achievements. for its centennial in the print center is again rewriting its history.

the print center’s centennial timeline

the print center’s centennial timeline is the means through which the organization is currently publishing its institutional history. as a whole the print center’s centennial, “the print center: years in ,” will incorporate fundraising events, a gala, a collaborative art exhibit, and a website within which the timeline is only a small part. this section analyzes the development of this timeline, taking into consideration the timeline’s digital format, its content, and its overall potential as a means through which the print center can share its institutional history. (see figs. - for images of the centennial timeline.) [note: the information in this section about the print center’s centennial activity and website is current as of march , . the timeline can be viewed at “timeline,” the print center, http://printcenter.org/ /timeline- . on the webpage the timeline is titled timeline. because i am writing about multiple timelines i chose to name it the centennial timeline for clarity. in addition to the website, some of the information in this section, especially concerning the internal structure of the print center, is from my internship at the organization between may and july of .] the print center intends the design of the timeline to be the foundation from which to build an interactive institutional history that eventually will be integrated into the main website and actively updated with current events and historical findings from the archives. the centennial timeline will likely undergo various alterations over the course of the centennial. in addition to populating the timeline with information from the archives, the print center is soliciting its community to submit stories.
ideally the timeline will enable individuals with different interests and perspectives to become contributors. it is the print center’s hope that the timeline will become a collaborative, dynamic digital project that never reaches a state of completion. with goals emphasizing collaboration, shared knowledge creation, and interactivity, the spirit of the centennial timeline aligns with that of many other digital history projects. the print center established the basic concept and design of the timeline in the fall of . for the centennial timeline the print center chose a simple design matching the organization’s aesthetic. the design has a vertical orientation in which older events are positioned at the top. a single line centered on the page represents the timeline. small circles along the line represent entries, with entries alternating between left and right alignments. larger bold text displays the year of the entry and a title, below which is smaller text with a brief excerpt from the entry’s page. some entries have thumbnail images below the year and title. all entries link to an individual page with expanded content. the development of the timeline involved creatively using and manipulating the technology until the result resembled the print center’s initial vision.

developing the timeline

the centennial timeline is a part of the print center’s centennial website. the centennial website is built with wordpress. the print center chose a plugin that uses a custom post layout to generate the timeline. as with other digital timeline tools, the centennial timeline’s unique interface compels authors to add, edit, and narrate the history in particular ways. each timeline entry is an individual post to which the author must assign the category “timeline” in order for the post to appear on the timeline. the only content in the editor of the “timeline” page on the back end is the shortcode that activates the plugin.
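the relationship between posts, the “timeline” category, and the rendered page can be modeled in a few lines. the following python sketch is not the plugin’s code (wordpress plugins are written in php, and the plugin’s internals are not reproduced here). it only illustrates the behavior described in this section: posts in one category are selected, ordered by post date, cut to a short excerpt, and placed alternately left and right. the names `Post` and `build_timeline`, the default values, and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    """A simplified stand-in for a blog post (hypothetical model)."""
    title: str
    published: date      # the (backdated) post date
    categories: tuple    # category names assigned by the author
    body: str


def build_timeline(posts, category="timeline", ascending=True, excerpt_words=5):
    """Return display rows for the posts in one category, ordered by post date."""
    # Only posts assigned the chosen category appear on the timeline.
    selected = [p for p in posts if category in p.categories]
    # Order by post date, oldest first unless descending order is requested.
    selected.sort(key=lambda p: p.published, reverse=not ascending)
    rows = []
    for index, post in enumerate(selected):
        rows.append({
            "year": post.published.year,
            "title": post.title,
            # Every entry gets an excerpt of the same number of words.
            "excerpt": " ".join(post.body.split()[:excerpt_words]),
            # Entries alternate between left and right of the center line.
            "side": "left" if index % 2 == 0 else "right",
        })
    return rows


if __name__ == "__main__":
    # Invented sample posts, for illustration only.
    sample = [
        Post("second entry", date(2001, 1, 1), ("timeline",), "a later hypothetical event with more text"),
        Post("first entry", date(2000, 1, 1), ("timeline",), "an earlier hypothetical event"),
        Post("news item", date(2002, 1, 1), ("news",), "not shown on the timeline"),
    ]
    for row in build_timeline(sample):
        print(row["year"], row["side"], row["title"], "|", row["excerpt"])
```

in this model, a post that lacks the chosen category never reaches the timeline, which is why authors who forget to assign it will not see their entry appear.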
contributing to the timeline requires understanding the relationship between wordpress posts, the plugin’s code that structures them, and the page on which they end up. the possibilities and potential of the timeline become clearer and more realistic once the project’s authors make themselves aware of the limitations of the technology. the print center is a small nonprofit and has neither the funds to hire a developer nor any dedicated IT staff to create the centennial website. thus, the print center is heavily reliant on plugins and other free tools to produce web content that the organization would not otherwise have the resources and time to create. additionally, the democratic rhetoric of the organization makes open source software a good fit. the plugin that creates the centennial timeline is called “wordpress posts timeline.” this plugin is available for free through wordpress.org. the author of the plugin is wylie hobbs. hobbs released the first version of the plugin on may , . at its inception the plugin created a static timeline without the ability to link to the posts, a function that hobbs added on july , . the plugin’s last update, which took care of a few css issues and ensured compatibility with the most recent update to the wordpress software, was on february , . since that time wordpress has undergone multiple updates, while the plugin’s development has remained stagnant. [note: historian shawn graham emphasizes that digital technology, in its limitations, causes its users to think in specific ways. graham stated, “in digital work, these models are explicitly written in computer code. understanding how the code forces a particular worldview on the user is a key portion of becoming a “digital historian.”” see: shawn graham, “the wikiblitz: a wikipedia editing assignment in a first-year undergraduate class,” in writing history in the digital age.]
[note: “wordpress posts timeline,” wordpress.org, plugin directory, accessed march , , https://wordpress.org/plugins/wordpress-posts-timeline; the plugin’s development log is found here: https://plugins.trac.wordpress.org/log/wordpress-posts-timeline. for a summary of the plugin’s development see: wordpress-posts-timeline/tags/ . . @ /readme.txt, https://plugins.trac.wordpress.org/browser/wordpress-posts-timeline/tags/ . . ?rev= .] for hobbs, the financial gain from developing this plugin comes solely through voluntary donations. for the developer the total time involved in constantly upgrading the code, fixing bugs, and replying to user questions adds up to a cost that can exceed the benefits. though “wordpress posts timeline” has had over , downloads during its three-year existence, this user base has not been enough to build a community that will contribute to the plugin’s development, either financially or by improving and updating the code themselves. the print center cannot assume that hobbs will expand the timeline’s functionality and customization settings in the future. the dynamic environment revolving around free software means that organizations like the print center need to be flexible and adaptable when the tools they use are no longer supported. for now, the print center is using the plugin as is. the basic function of the plugin is to format the timeline author’s posts into a vertical timeline. hobbs did not publicize the plugin as a tool intended to handle complex historical timelines. the plugin pulls from the author’s wordpress theme for font and style, resulting in a simple timeline that is not as visually distinct as tools such as simile or tiki-toki. this absence of bells and whistles enables the print center to author a more individualized timeline than it would be able to create using a tool that more strictly controls the timeline content.
however, the simplicity of the timeline also means that the centennial timeline will not have some of the features that are standard on other tools, such as zooming and categorizing entries. the plugin does include an options page that allows authors to make some customizations without coding. [note: for more see cohen and rosenzweig, “preserving digital history – the fragility of digital materials” in digital history. cohen and rosenzweig discuss the fragility of digital history projects extensively.] the decisions that the print center makes regarding these options affect how users interact with the timeline and the history it portrays. customizations for the timeline include basic formatting such as the choice to have a thumbnail appear with entries, the date format visible on the timeline, and whether the author wants posts in ascending or descending order. these options appear as simple forms limited to a small range of choices, but are user-friendly for timeline authors who do not know how to code. the plugin will only allow one category of posts in the timeline and will not support multiple timelines unless the timeline’s authors customize the code themselves. as a result, the print center’s centennial timeline exists on one linear path with no visual distinction between exhibits, programs, or operations. the ability for the timeline to link out to posts can break some of the linearity for users and has the potential to create historical depth out of what previously would have been a simpler list of chronological events. the available options apply to the timeline universally. all dates are displayed in the same format. all excerpts are the same number of words. all images are the same size. as a gallery devoted to contemporary art, the print center’s aesthetic is in line with this simple, elegant design. for the purpose of telling a complex history, this elegance could be misleading.
the print center could add minor customizations to the timeline by accessing the plugin’s code. however, it is unlikely that interns, historians, or staff at the print center working on the timeline will have experience or training in web development. it is equally unlikely that anyone working on the print center’s timeline will write any code from scratch to add functionality to the timeline. nevertheless, understanding the plugin’s files can give authors a clearer idea of the plugin’s functionality and potential, essential for understanding the difference between the functionality of timeline entries as they appear on the timeline and the same entries as they appear as posts on their own pages. from the back end the entries that make up the timeline content are blog posts. the print center’s centennial website uses posts for multiple purposes in addition to the timeline. unlike some timeline builders like tiki-toki, which has an interface that mirrors the final design of the timeline, the print center’s centennial timeline can be cumbersome to edit from the back end. wordpress organizes posts in the back end by the date authors add them, not the historical date they represent. as a content management system for blogging, the wordpress interface works well. for the purpose of managing a historical timeline, the content can get jumbled amidst a variety of posts, especially when authors add entries out of order. from the front end, the entries’ titles navigate users out of the timeline and onto the individual post pages that wordpress automatically generates for all posts. the entries, as wordpress posts, have the capacity to utilize all of the functions available for any other wordpress posts. in addition to text, posts can support a variety of media including images, galleries, slideshows, and video. 
a major difference between the way the print center’s timeline functions and the way other timeline builders function is that the entry content is independent of the timeline software. for the centennial timeline, the print center chose the ability to generate customized content on entry pages in exchange for the ability to structure its data into a cohesive visual space. the flexibility of wordpress posts does put limits on the centennial timeline, notably in how the timeline portrays dates. wordpress post dating options determine the timeline entries’ dates. [note: for a more detailed explanation of the plugin see appendix b.] to accommodate this date structure the print center’s centennial timeline only shows years for items and does not show any months or days. when authors add an entry they must backdate the post to the year in which the entry will appear on the timeline. publishing the year alone ensures that the author adding entries to the timeline will not have to publish incorrect dates on entries which occurred over a long period or for which dates are unknown. the way dates appear on the timeline can affect how users perceive past events. some events in the print center’s history took place on specific, known days. usually these are events such as exhibit openings, lectures, and programs. other events took place in longer timeframes. the print center’s copper drive, wherein the organization collected metal plates from artists and donated them for repurposing during world war ii, took place over a few months. ideally the timeline would be able to indicate the specific day on which a lecture took place and the time period during which the copper drive happened, but that is not how wordpress works. technological limitations largely guided the centennial timeline’s date format. consequently, dates on the centennial timeline are similar to those on the previous anniversary timelines. from the front end each entry appears as a point, taking place in a single year.
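the effect of this date handling can be shown with a small python sketch. it is an illustration under stated assumptions, not a description of wordpress internals: the function names are hypothetical, the choice of january 1 as a placeholder day is arbitrary, and the sample dates are invented. the point is that an event with a known day and an event known only by year produce the same label once only the year is displayed.

```python
from datetime import date


def backdate_for_year(year):
    """Pick a placeholder post date for an entry known only by its year.

    January 1 is an arbitrary choice made for this sketch.
    """
    return date(year, 1, 1)


def timeline_label(post_date):
    """Reduce a full post date to the year-only label shown on the timeline."""
    return str(post_date.year)


if __name__ == "__main__":
    # Invented dates, for illustration only.
    known_day = date(2000, 3, 14)          # e.g. a lecture on a specific day
    year_only = backdate_for_year(2000)    # e.g. an event spanning several months
    # Both entries collapse to the same point on the timeline.
    print(timeline_label(known_day), timeline_label(year_only))
```

any finer detail, such as a specific day or a date range, has to be stated in the text of the entry itself, since the label no longer carries it.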
to users, the centennial timeline appears simple and elegant. the interface that authors navigate to create content was not developed for the purpose of writing historical content and can be cumbersome. regardless, using wordpress is relatively easy and is a good option for cash-poor organizations like the print center that already have web hosting. the software requires little coding and contains a range of free resources to do different tasks, qualities that can help small nonprofits. [note: similarly, wordpress does not give users the option to leave out days and months on posts. to prevent erroneous dates appearing within the entries’ pages the metadata has been removed through custom css for all posts. if known, the authors can specify dates and general time periods from within the text of the entry’s post. the result of removing post metadata is that each post page looks more like a standard webpage and not a blog post.] this discussion of the back end focused on the basic form of the timeline and the technology that creates it. designing content for the centennial timeline requires that the authors be aware of the limitations of the technology in addition to being conscious of how the timeline’s user audience interacts with the history on the published timeline.

forming entry content

there is a multitude of ways to format content on the timeline. the timeline could have a select number of entries with longer, more researched posts, or it could have a large number of posts with shorter, one to two paragraph descriptions. most of the decisions about content seem relatively mundane compared to the overarching historical narrative of the print center’s history. nevertheless, deciding how to title entries, how to format images within the entries, and what types of captions to use saves time for the authors and makes the timeline more cohesive.
for authors in charge of writing content, having an estimation of the required type and length of content provides a framework around which to focus. with a clear direction, authors can be more productive in their research and writing. keeping entries consistent is beneficial for users looking at multiple entries because consistency prevents users from readjusting their expectations with every entry and instead enables them to focus on the history and ideas. in the print center’s earlier timelines paper space was a huge factor in determining the length of the timeline and entries. [note: for example, the centennial timeline displays a five-word excerpt with entries and links to the entries’ post page. it is possible to display the entries’ full text. a benefit of putting all the content on the timeline is that it prevents users from constantly clicking out of the timeline then back in: an action that could become tiring and turn off users. with this format it would be beneficial to have short entries to ensure users would not get lost within a sea of text separating one entry from another. however, if the timeline displayed full entries the scrolling length of the timeline would increase significantly. what this speculation displays is that the decisions the print center makes regarding simple formatting have consequences that affect the way they narrate their history.] the print center believes that the centennial timeline will be different because of its digital form. in little ways the digital timeline is different. the ability to devote more time to an entry than simple descriptive text adds an interpretive dimension to the digital timeline absent in the organization’s previous institutional histories. the ways that users can scroll around the timeline, and the ways that users can link out of the timeline and back in again, change how people interact with the timeline. the print center describes the primary characteristic of the timeline as interactive.
calling a digital project interactive could mean different things. for the centennial timeline, the print center interprets interactive as a multimedia project with entries that promote the connections the print center has with other organizations and artists through linking outside of the print center’s website and which includes contributors from its community. ideally these elements create a different, more democratic, type of history than the previous timelines. one way to accomplish a cohesive and still engaging timeline would be to create a few standard entry types under which most timeline entries will fall. for the centennial timeline, text-based, media-based, and research-based entries could together fulfill the organization’s goals for the timeline. keeping most entries short would require the author to pinpoint potential entries relevant to the history and speculate about what users would be interested in. eliminating entries because they will not generate user interest could potentially skew the history and downplay important issues. however, institutional histories struggle with relevance. controlling the entries based on potential interest could make the timeline relevant to its audience. [note: for cohen and rosenzweig, interactive means, “a two-way medium, in which every point of consumption can also be a point of production.” with this definition, interactivity means enabling a dialogue in which the historian is not the only contributor. see cohen and rosenzweig, “exhibits, films, scholarship, and essays” in digital history.] for the print center staff with little time to work on the project, the small doses of research and writing could be encouragement to add entries because they will take up less of their time. having a select number of longer entries featuring research from the organization’s archives would be useful to put the shorter entries into a wider historical context.
longer entries would also serve a large portion of the print center’s audience: students, their professors, artists, and art collectors. the print center has close relationships with universities and art schools around philadelphia and commonly invites classes in art history, printmaking, and photography to the gallery. well-researched entries dealing with the arts community could appeal to this group. [note: see appendix b for expanded suggestions relating to content guidelines.] in addition to serving an academic community, a few longer entries would be a valuable tool for the print center to keep track of the research that has been done in its archives. [note: during my internship i stumbled across notes saved to the print center’s server from ten years prior that covered boxes from the archives that i had recently gone through on my own. ansley t. erickson cited a similar problem of keeping track of the notes and sources that she accumulated in digital formats. see ansley t. erickson, “historical research and the problem of categories: reflections on , digital note cards” in writing history in the digital age. using the timeline as a type of research aid would enable the print center to identify underrepresented areas to direct future researchers.] the main goals of creating a variety of standardized entries are to present users with choices, provide a variety of media to hold their attention, and keep the timeline as cohesive as possible. the options for formatting images, slideshows, and text on wordpress are plentiful and could become distracting if the timeline embraces too many different visual elements. [note: for more see cohen and rosenzweig, “designing history for the web,” in digital history. cohen and rosenzweig warn about how available formatting options for digital history projects can become a burden and distracting for the project’s authors and users.] the challenge with making decisions about content is in balancing the print center’s needs and resources with the users’ interests. the centennial timeline has the potential to form a network within entries wherein the accumulation of information encourages users to view the entries in conversation with one another and piece together a multifaceted story. ultimately, how users experience the timeline will determine its development moving forward. given the dynamic, collaborative nature of the timeline, its advantages could turn to disadvantages if the content becomes too dispersed and unfocused. users could perceive the timeline to be clean and simple, appreciating the linear format and the entry layout. users could also perceive the layout as tedious. many online timelines have sliders that give users a reference point from which they can easily jump around time periods. the print center’s centennial timeline does not have this. on the centennial timeline entries are not spaced according to measured time but are placed equidistantly along a line. there is little reference for users as to the total size of the timeline or how to find a particular time period. if users want to find the s they will have to scroll until they get there. consequently, the middle of the timeline easily feels like a cartoon ladder that has no beginning or end (figs. - ). currently, the small number of posts prevents the visual monotony from being detrimental. however, if more posts are added in the future the timeline could get so long that visitors will not have the time or patience to sort through it. at this future juncture the print center would have to figure out a solution, and possibly transfer the timeline to another interface. categorization is one organizational feature of most digital timelines that is absent from the centennial timeline. with categories users can quickly identify and sort entries without reading them. there are ways in wordpress to manage posts.
tagging and tag clouds would be the easiest and most efficient way to categorize timeline entries. with tags, such as “publications,” “exhibits,” and “education,” the centennial timeline could isolate histories and draw lines between non-consecutive points in the print center’s history. as an idea, tag clouds have the potential to add historical depth to the timeline, but actually making the idea work involves a few practical hurdles. [note: karen louise smith discusses the advantages of tag clouds, stating, “they do not exist solely to categorize data or content. they signal to others on the web, that there is a potential interest in collaboration and participation.” see karen louise smith, “from talk back to tag cloud: social media, information visualization and design.” science and technology for humanity (tic-sth), ieee toronto international conference. (conference dates, september - , ): . for the centennial timeline the downside to tag clouds is that tagging is a wordpress feature, not a feature of the timeline plugin. as a result posts on archive pages are not displayed as a timeline. making the tag clouds work requires customizing the website theme’s code.] the print center’s centennial timeline is in an early stage of development. thus far this section has contemplated how the organization could creatively collaborate with the technology to bring its vision to life. these contemplations are suggestions for how a historian would like to develop the timeline and do not represent the timeline as it is. understanding the historical message that the timeline relays to users requires analyzing the timeline as the print center has published it thus far.

the published timeline

the timeline currently exists as a page on the print center’s centennial website. at the top of the page is a brief introduction to the timeline accompanied by a photograph of the print center as it stands today. the text states, “the print center’s first years have been marked by an abundance of important milestones, both for the organization and for the fields that we serve: printmaking and photography.” emphasizing that the history extends beyond the organization into the fields of printmaking and photography implies the organization’s relevance. this statement of relevance justifies the timeline’s existence to potential audiences who are involved with photography and printmaking, but may not necessarily be familiar with the print center. [note: “timeline,” the print center, accessed february , , http://printcenter.org/ /timeline- .] the introduction uses collaborative, inclusive language. the text indicates that the source material for the timeline is both the organization’s archives and “the memories of the artists, members, staff and many others that have shared our history,” and provides users with a link where they can suggest posts. the timeline does not distinguish between memory and history. instead, the centennial timeline introduces the print center’s past as a shared history that ultimately produced the print center standing today. the rhetoric of the introduction portrays the print center as a community rather than an institution. describing the project, the introduction states, “the timeline, launched to celebrate our centennial, will be an ongoing effort – designed to grow and deepen over time.” as of february , , the print center’s centennial timeline had thirty-one entries. the majority of the entries cover leadership, exhibits, and program development. with the print center’s intended updates and user contributions, the timeline could eventually portray a narrative that reflects the community-oriented history advocated for in the timeline’s introduction.
presently, the thirty-one entries reflect a traditional selection of institutional events typically found in institutional histories. despite the organization’s hopes that the centennial timeline will represent a turn in the way the organization tells its history, the current entries form a narrative of the print center’s history that is similar in feel to the fortieth and ninetieth timelines. the centennial timeline as a whole tells a narrative of the print center’s history with three main plot points, roughly coinciding with different time periods. [notes: ibid. ibid. the centennial timeline underwent changes during the writing of this paper. see appendix a, figs. - for a discussion of how the timeline has evolved over the course of this thesis.] the timeline relays major firsts for the print center, followed by programmatic growth, and leads into major exhibits and publications that had significant financial support. the first entry is “ – the print center was founded.” this entry is identical to the corresponding entry on the ninetieth timeline. as on the ninetieth timeline, this entry contains no background information on the group that founded the club or the club’s transitory period leading to its move to latimer street. the following entries display growth and major firsts. beginning with the entry “ – drive to collect copper plates for the war effort,” the entries focus more on program development, such as the educational program prints in progress and director berthe von moschzisker’s support of modern art, as well as institutional growth, such as and the building expansion. beginning with the entry “ – the philadelphia portfolio,” most of the entries shift to being more about publications and exhibits that the print center commissioned. like the later entries from the ninetieth timeline, these entries highlight professional accomplishments and major gifts from large foundations.
the only entries that appear throughout the whole timeline are entries about changes in leadership. the overall narrative is celebratory and positive. at face value, the entries as a whole reveal how the timeline is a reflection of the way that the print center imagines its own identity. for a user, scanning on the surface of the timeline results in no significant differences from the print center’s earlier histories. the entries’ content supports the celebratory feel of the timeline. the entries primarily contain text that ranges from to words with general statements and no citations. this content has the most potential to challenge the previous timelines, but could simultaneously portray a history that is out of alignment with the image of the print center today. the entry “ – the print center permanent collection at the philadelphia museum of art” could potentially challenge the print center’s identity (fig. ). the entry contains a paragraph about the print club permanent collection, the collection that founded the print collection at the philadelphia museum of art. it includes links to biographical articles on carl zigrosser and lessing rosenwald, and an image of a print of carl zigrosser. different accounts of the permanent collection appeared in the fortieth and ninetieth timeline. the fortieth timeline describes the print club permanent collection as a committee that ran an annual fundraising campaign to purchase prints that would be given to the philadelphia museum of art. by the ninetieth anniversary the original relationship between the two institutions became muddled. the ninetieth timeline introduces the print center permanent collection as a collection that the print center had been accumulating independently since its inception and that the organization gave to the philadelphia museum of art in at the instigation of zigrosser and rosenwald. 
both versions emphasize that the philadelphia museum of art did not collect prints before the intervention of the print club. the difference lies in where the print center fits in the story. that a collection from a small organization founded the print collection of the philadelphia museum of art is something that elicits a sense of pride from the organization. research in the archives has brought to light that the print club permanent collection never existed as a collection at the print club. the print club hosted the committee, which in turn raised funds to purchase art. [notes: “the print center permanent collection at the philadelphia museum of art,” the print center, accessed march , , http://printcenter.org/ / / / /inauguration-of-the-print-center-permanent-collection-at-the-philadelphia-museum-of-art. “fortieth anniversary: the print club,” box , folder , print club archives, hsp; and “ years: nurturing the new,” (the print center, ). for documents relating to the print club permanent collection see: box , book , print club archives, hsp.] the print club was integral in founding the philadelphia museum of art’s print collection, but not in the way that has become a crucial aspect of the print center’s institutional identity. currently, the entry about the permanent collection on the centennial timeline does not specify the nature of the arrangement that the print club had with the philadelphia museum of art. writing and publishing an accurate description of the permanent collection would not just affect the timeline. the “history” section of the print center’s “about us” page on its main website claims, “in , the print club donated its collection of prints to the philadelphia museum of art forming the core of their fledging print department.” the print center advertises its relationship with the philadelphia museum of art to its community and revising this relationship could be a challenge.
in addition to challenging the accepted institutional narrative, entries could provoke conversation. the entry “ – alan freelon, first african american exhibiting member” has the potential to encourage users to contemplate history. this entry begins, “founded on democratic principles, women and people of color were included in the membership from our earliest days.” without directly stating the reality of racial prejudice in the twentieth century, this statement implies that the print club’s activity went against the current and that the founding ideology of the organization is relevant today. the text subsequently gives a brief biography of alan freelon, including his becoming a member of the print club in . the text also relays that before freelon became a member of the print center he attended the pennsylvania museum school of industrial art on a full scholarship and received degrees from the university of pennsylvania and tyler school of art at temple university. the year that he became a member he had a solo exhibit at the th street branch of the new york public library. freelon’s biographical background implies that multiple institutions supported him and his work before he became a print club member. according to the entry, freelon did not exhibit at the print club until eight years after becoming a member. knowing that alan freelon was an established artist by the time he became an exhibiting print club member takes some of the impact away from the organization’s assertion of its democratic foundations.

“about us,” the print center, accessed march , , http://www.printcenter.org/pc_about.html.
“alan freelon, first african american exhibiting member,” the print center, accessed march , , http://printcenter.org/ / / / /allan-freelon-first-african-american-exhibiting-member.
if the prerequisite for black artists gaining membership was acceptance in the arts community, then this particular entry is not evidence that the print center was more or less democratic than any of its contemporaries. instead of interpreting freelon’s acceptance in the print club as being a result of the democratic tradition of the organization, the timeline could use this event to provoke a discussion about the acceptance of black artists in philadelphia’s elite art community. in this scenario, the subject of the entry is no longer praise for an accomplishment, but a more reflective historical conversation. this type of entry content would shift the focus of the timeline away from the print center. the current focus on projecting a particular image of the print center works against many of the organization’s hopes for the timeline. the print center wants the timeline, as an interactive project, to include its community, but the content on the timeline suggests that only certain types of contributions belong there. the possibility that the print center’s community can submit stories is appealing, especially for public historians who want to share authority. crowdsourced content has the potential to create alternative chronologies that challenge accepted narratives. “the heritagecrowd project: a case study in crowdsourcing public history” by shawn graham, guy massie, and nadine feuerherm illuminates the challenges involved in crowdsourcing projects. the difficulties that the heritagecrowd project faced included people being unclear about the way that the group was collecting content and people feeling that their knowledge was unprofessional. the authors noted, “the terminology and structure of the platform as it currently stands give more authority to the data displayed than might be warranted.” the authority that the centennial timeline currently embodies is that of the institution and not of a wider community.
fortunately for the print center the centennial timeline is under active development and exists in a form that the organization can easily revise. unlike the printed timelines that came before it, the digital iteration of the print center’s institutional history is not limited to the text that can fit in a small, defined space. the timeline fits well among the commemorative anniversary activities that the print center has undertaken. the goals of the project put it in conversation with contemporary digital history projects, but the practical realities that the organization faces, in time, resources, technology, and the organization’s need to control its image all limit the potential of the timeline. as it stands the centennial timeline serves the same function as the paper timelines that came before it. the timeline is an authoritative conceptualization of the past. public history projects that seek to challenge the past are risky. for the print center, there is little gain in contesting its image and creating a controversial history in an authoritative venue such as a timeline.

for the field of public history, michael frisch is largely to credit for popularizing the concept of shared authority, see michael frisch, a shared authority: essays on the craft and meaning of oral and public history, (suny press, ).
shawn graham, guy massie, and nadine feuerherm, “the heritagecrowd project: a case study in crowdsourcing public history” in writing history in the digital age.

chapter conclusions: developing timelines as digital history

timelines as history at the print center

the centennial timeline case study in the previous section explored the challenges of creating a digital timeline within the context of a commemorative institutional anniversary. the print center wants to publish its history in a wider context while still protecting its image.
connecting an institutional history to people and events beyond the institution will put the history in conversation with a discourse that the institution has little control over. timelines are one of the most common and accessible ways to publish histories. because of their ubiquity and authority, timelines can often reflect the desire for self-preservation that underscores many teleological narratives. could an organization like the print center integrate a timeline with its centennial that investigates history without putting themselves at risk? the tools available to create timelines and the literature covering digital history and visualizations attest to the potential of the timeline as a tool for research and learning. in a best-case scenario, if the organization had the time and resources, the print center would be well situated to use a timeline to investigate the history of printmaking and photography. this hypothetical timeline could investigate the lines between art images and visual forms that do not typically fall under the umbrella of traditional art history studies. one of the print center’s early goals was to promote printmaking as art, but the organization did not shy away from printmaking that did not fit within the fine art umbrella. exhibitions at print club from the s and s often included shows dedicated to subjects such as illustration, cartoons, and even mapmaking, along with fine art printmaking. it was not until the s and s that the club began to reject its earlier, broader, exhibitions in favor of a narrower lineup of modern art. standard art history textbooks quickly move through art movements to set up a sense of continuity in art for students. one advantage of a digital art history timeline is that it could shift the perspective of a traditional art history text. 
this perspective is difficult to gain when the timeline contains objects traditionally included within art history study, such as in wikipedia’s timeline of art discussed earlier. a timeline of printmaking and photography could enable discussions of technology and techniques, in addition to the social impact of various forms. printmaking and photography are responsible for a wide array of popular and technical imagery. art historian james elkins studied the difference between traditional art images and non-art images. non-art images include a range of material from scientific illustrations to graphics. elkins characterizes these visual forms, stating:

they are sequestered for their apparent failure to achieve historical significance, their ostensive lack of expressive power, the technical demands they make on viewers, and the absence of visual theories and critical apparatus that might link them to fine art or argue for their importance.

non-art printmaking and photography share these characteristics with timelines and other information visualizations. most art history timelines follow textbook norms, but notable exceptions exist.

for a sampling of the print club’s early exhibitions see: scrapbooks s- s, box , print club archives, hsp.
for the print club’s shift to modern, see: scrapbooks - , box , print club archives, hsp. notable examples can be found in the newspaper clippings scattered throughout the scrapbooks, for example see walter baum, “print club lithography show stresses modernistic themes,” volume , scrapbook - , box , print club archives, hsp.
art history textbooks often receive criticism from teachers who cite their content in addition to their cost, and size. for an example see, michelle millar fisher, “bye, bye survey textbook,” art history teaching resources, march , , accessed march , , http://arthistoryteachingresources.org/ / / /bye-bye-survey-textbook.
james elkins, the domain of images (cornell university press, ), ix.
rosenberg and grafton discuss several art history timelines that became popular in the s, as art movements came and went with growing rapidity. these timelines challenged the division between art and non-art images, mostly by making the chronological graphic a work of art in itself. though they questioned what it meant to label something art, these timelines typically only included material commonly considered fine art. the inquisitive nature of a timeline that includes non-art content as a means to study mediums that have fine art counterparts could encourage the print center’s community to reflect on the organization’s contemporary art exhibits. this timeline would also situate digital timelines as a contemporary iteration of the types of visualizations that printmaking and photography produced before the advent of digital technology. for the print center, a timeline covering the history of the fields that the organization promotes could be a valuable educational resource. the print center has a history of supporting educational programming, yet with staff, time, and funding resources all unpredictable, it has been difficult to keep these programs afloat. online educational resources could enable the organization to expand its outreach without exhausting its current program resources.

rosenberg and grafton, - .
the print center’s artists-in-schools program is the organization’s most current educational program. this program connects schools with artists and provides philadelphia high school students with printmaking and photography classes. for more about the print center’s artists-in-schools program, see “about aisp,” the print center, accessed march , , http://printcenter.org/aisp/about-aisp.
in the s the print club began organizing children’s shows and programs, see “art is art, but ice cream – ah, that’s something good to eat,” volume , scrapbook - , box , print club archives, hsp.

timelines have long been in use as educational tools.
in their book doing history, linda levstik and keith barton criticized the timelines common in schools because they often highlight dates and politics, two things that children have little reference for understanding. understanding everyday things that have some type of imagery attached, like visual culture, is easier for children than understanding politics and government. with this in mind, levstik and barton note that children have a good visual sense of history and can usually sequence historical images. levstik and barton advocate for timelines that can serve as visual reference tools. as a reference tool, the timeline still presents an authoritative version of history and can obscure the decision-making process that went into its creation. as an art gallery, the print center has a different relationship with images and history than a standard classroom. the print center’s audience expects art, and with this expectation comes an array of assumptions about the meaning and content of the material that the gallery exhibits and produces. because of its audience’s knowledge of art, the print center can more easily produce content that would be controversial in a history museum. likewise, visualizations in history museums often have an authoritarian air, rather than being presented as things that should be read critically. the print center is in a position to create a visualization that interrogates the history of art because its audience assumes that one purpose of the visual objects in art galleries is critical inquiry.
for a history of timelines as educational tools see, rosenberg and grafton, , - ; for criticism of timelines from teachers’ perspectives see, michael kramer, “there is a timeline, turn, turn, turn,” hastac, november , , accessed march , , http://www.hastac.org/blogs/michael-j-kramer/ / / /there-timeline-turn-turn-turn; and christina davidson, “digital timelines,” hastac, october , , accessed march , , http://www.hastac.org/blogs/tinadavidson/ / / /digital-timelines- .
linda s. levstik and keith c. barton, doing history: investigating with children in elementary and middle schools (mahwah, nj: lawrence erlbaum, ), , , and .

this proposal could bring the digital timeline into a critical discourse absent from most timelines. it could also challenge art history historiography. through this musing i have tried to suggest that timelines could be a component of scholarly discourse. a question that this hypothetical timeline does not answer is whether creating a scholarly discussion with a digital timeline could represent a shift in how people understand and interact with the past.

reflections

the utility of digital tools for the purpose of historical scholarship is debated among historians and digital history proponents. some scholars argue that digital tools can shift the way that historians execute and present research. sherman dorn argues that although traditional historical research is presented as an argument, digital scholarship can turn into something else. fred gibbs and trevor owens similarly encourage historians to use digital tools to alter the way they approach their work. for gibbs and owens this possibility of digital history means “de-emphasizing narrative in favor of illustrating the rich complexities between an argument and the data that supports it.” timelines can present users with massive amounts of information that calls into question causality and narrative structures, but they can just as easily support linear, progressive, selective narratives.
whether digital timelines have more in common with their paper counterparts or whether they can fit within a model of digital scholarship that advocates for nonlinear narratives is a complicated question. the concept of hypertext forms the core of the sentiment that digital culture can promote nonlinear narratives. daniel rosenberg evaluates ted nelson’s affinity for hypertext as a way to defeat what nelson termed “the school problem,” that is, when subjects and ideas turn into timeslots and are set on a linear schedule that eliminates the potential interconnections between ideas. educators looking toward digital technology are interested in the ability of digital tools to enable students to understand multiple perspectives and question historical reality. timelines on paper generally do not do this. for nelson the possibility of hypertext was to “encourage the reader to make explicit or implicit comparisons, mental leaps, and intellectual choices.” hypertext allows ideas to become a network and presents an unfinished, uncertain narrative. digital timelines are typically not as simple as a line that measures time containing events listed when they happened. digital timelines can employ content to move users through time and they can manipulate the scale of time. even the print center’s centennial timeline, though it appears to be a rather standard timeline at face value, enables users to maneuver out of the timeline to read content, travel to related websites through links, and become contributors who write the history themselves. these features are unique to digital timelines and digital timeline tools. they make the timeline a more complex, layered visualization than most static images.

sherman dorn, “is (digital) history more than an argument about the past?” in writing history in the digital age.
fred gibbs and trevor owens, “the hermeneutics of data and historical writing” in writing history in the digital age.
in a small way, digital timelines are removed from a strictly linear reading. if visualizations can defeat linear narratives, this does not mean that they will necessarily produce the types of history that public historians would be content with. the dominant paradigm among public historians is that cultural institutions should share their authority, embrace a constructivist history, tell multiple stories, and not get bogged down by linear, authoritative narratives. even when digital projects accomplish some of these goals, the technologies that digital projects use contain limitations that affect the history. along with nonlinearity, staley lists characteristics of digital scholarship including analogy, synthesis, networks, and structure. for public historians, working with technology that imposes strict hierarchies can be a challenge because the rhetoric of the discipline has a strong foundation in the assumption that a purpose of the field is to defeat hierarchies. timelines challenge historians because they represent a logical structure of time relative to western culture. west-pavlov claims, “the timeline is one of the most vivid exemplifications of our linear concept of time.” west-pavlov concludes that the timeline has become a “hegemonic temporal metaphor” in western, capitalist societies. the timeline is a relevant form of visual representation for cultures that understand time linearly. history, the academic discipline, has little control over how people conceptualize time. in this sense, the timeline is a check against historians to ensure that academic representations of the past do not contradict conceptions of time from the wider culture out of which historians derive. in the ordering of time, historian arno borst states, “time can be aligned with perceptible experiences, in which case it will not be consistent, or else incorporated into a logical system of thought, in which case it will not be accurate.” for public historians, it can be difficult to reconcile the ways that humans and cultures experience time, which we readily admit is messy, with the task of arranging that mess into an argument, which the tenets of our discipline decree must involve some type of logic. public historians need a foundational philosophy that could guide the way they make and read timelines. the development of timelines as a popular form of visualizing history correlated with an understanding of absolute time as a universal constant measure, encouraging a sense of historicism in which people strove to document their world along a line of time culminating in the present. challenges to linear notions of time are abundant. philosophers such as edmund husserl established and advocated for phenomenology, wherein time is related to subjective, human experiences. einstein’s relativity undermines the notion of a universal temporal clock in physics. more recently, the humanities have been contesting linear notions of time.

daniel rosenberg, “electronic memory,” in histories of the future, ed. daniel rosenberg and susan harding (durham: duke university press, ), .
john allison, "history educators and the challenge of immersive pasts: a critical review of virtual reality ‘tools’ and history pedagogy," learning, media and technology , no. ( ): .
rosenberg, “electronic memory,” - .
ibid., .
staley, .
staley clarifies that humanities tend to deal with “high-context communication,” that is, the work of humanists is highly interpretive. technology is “low-context,” that is, with little room for error. the low-context nature of technology indicates that digital scholarship is not an anything-goes realm, even though the technology offers different, potentially new, directions for scholarship to turn. see staley, .
west-pavlov, .
ibid., .
west-pavlov looked at postcolonial theory and ideas of temporality, stating that postcolonial theories “lay bare the apparent tangles of plural, non-sequential historical processes which cannot be abstracted from the regenerative process of nature itself.” the way that people interact with the internet could involve a similar temporal shift. despite the challenges to absolute time and the optimism of advocates for digital history, there has yet to be an accompanying historical worldview that aligns with relativity as well as historicism aligns with absolute time. visualizations without a conceptual framework to aid interpretation easily drop out of use. public historians speaking about nonlinear time and multiple histories can only have a limited effect on the way that people visualize history, especially on popular forums like the print center’s centennial timeline. in their conclusion to writing history in the digital age, the authors are unsure whether digital history will actually be revolutionary, noting that the best digital scholarship encompasses qualities that historians cherish. this does not mean that digital timelines are incapable of being both good digital history and thoughtful public history. but it may imply that public historians should let go of some of their ideological hopes regarding the potential of digital projects and instead focus on how to make these projects the best that they can be with the resources at hand.

borst, .
for an introduction to the effects of relativity on the philosophy of time see west-pavlov, , ; and bradley dowden, “what science requires of time,” in the internet encyclopedia of philosophy, accessed march , , http://www.iep.utm.edu/requires.
west-pavlov, .
tanaka similarly advocates that digital media has the potential to escape the idea of universal time and illuminate the various time scales that different cultures use.
see tanaka, “pasts in a digital age.”
nomograms are a good example of a visualization that is no longer in popular use because, while useful, people generally do not intrinsically understand them. see thomas l. hankins, "blood, dirt, and nomograms: a particular history of graphs," isis , no. ( ): - .
jack dougherty, kristen nawrotzki, charlotte d. rochez, and timothy burke, “conclusions: what we learned from writing history in the digital age,” in writing history in the digital age.

bibliography

books, articles, and blogs

abir-am, pnina g. “introduction.” osiris nd series, vol. , “commemorative practices in science: historical perspectives on the politics of collective memory.” ( ): - .
allison, john. "history educators and the challenge of immersive pasts: a critical review of virtual reality ‘tools’ and history pedagogy." learning, media and technology , no. ( ): - .
bennett, tony. the birth of the museum: history, theory, politics. new york: routledge, .
borst, arno. the ordering of time: from the ancient computus to the modern computer. university of chicago press, .
cohen, daniel and roy rosenzweig. digital history: a guide to gathering, preserving and presenting the past on the web. philadelphia: university of pennsylvania press, . accessed march , , http://chnm.gmu.edu/digitalhistory.
croke, brian. "the originality of eusebius' chronicle." american journal of philology , no. ( ): - .
davidson, christina. “digital timelines.” hastac. october , . http://www.hastac.org/blogs/tinadavidson/ / / /digital-timelines- .
delahaye, agnès, charles booth, peter clark, stephen procter, and michael rowlinson. "the genre of corporate history." journal of organizational change management , no. ( ): - .
dougherty, jack and kristen nawrotzki, eds. writing history in the digital age. university of michigan press, . accessed march , , doi: http://dx.doi.org/ . /dh. . . .
elkins, james. the domain of images. cornell university press, .
fisher, michelle millar. “bye, bye survey textbook.” art history teaching resources. march , . http://arthistoryteachingresources.org/ / / /bye-bye-survey-textbook.
frisch, michael. a shared authority: essays on the craft and meaning of oral and public history. suny press, .
furniss, elizabeth. "timeline history and the anzac myth: settler narratives of local history in a north australian town." oceania , no. ( ): - .
gold, matthew k., ed. debates in the digital humanities. minneapolis: university of minnesota press, . http://dhdebates.gc.cuny.edu/debates.
handler, richard and eric gable. the new history in an old museum: creating the past in colonial williamsburg. durham: duke university press, .
hankins, thomas l. "blood, dirt, and nomograms: a particular history of graphs." isis , no. ( ): - .
jessop, martyn. "digital visualization as a scholarly activity." literary and linguistic computing , no. ( ): - .
kohlstedt, sally gregory. "institutional history." osiris ( ): - .
kramer, michael. “there is a timeline, turn, turn, turn.” hastac. november , . http://www.hastac.org/blogs/michael-j-kramer/ / / /there-timeline-turn-turn-turn.
levstik, linda s., and keith c. barton. doing history: investigating with children in elementary and middle schools. mahwah, nj: lawrence erlbaum, .
linenthal, edward t., and tom engelhardt, eds. history wars: the enola gay and other battles for the american past. new york: henry holt and company, .
mckitrick, eric l., and stanley elkins. "institutions in motion." american quarterly , no. ( ): - .
newton, isaac. mathematical principles of natural philosophy, in newton/huygens, vol. of great books of the western world, ed. robert maynard hutchins (university of chicago, ), - .
purcell, sarah j. "commemoration, public art, and the changing meaning of the bunker hill monument." the public historian , no. ( ): - .
rosenberg, daniel, and anthony grafton. cartographies of time: a history of the timeline. princeton architectural press, .
rosenberg, daniel and susan harding, eds. histories of the future. durham: duke university press, .
schulten, susan. "emma willard and the graphic foundations of american history." journal of historical geography , no. ( ): - .
scott, w. richard. "reflections: the past and future of research on institutions and institutional change." journal of change management , no. ( ): - .
smith, karen louise. “from talk back to tag cloud: social media, information visualization and design.” science and technology for humanity (tic-sth), ieee toronto international conference. (conference dates, september - , ): - .
staley, david j. computers, visualization, and history: how new technology will transform our understanding of the past. m.e. sharpe, .
strong-boag, veronica. "experts on our own lives: commemorating canada at the beginning of the st century." the public historian , no. ( ): - .
thelen, david. "memory and american history." the journal of american history , no. ( ): - .
tufte, edward r. envisioning information. cheshire, connecticut: graphics press, .
west-pavlov, russell. temporalities. new york: routledge, .
white, hayden. the content of the form: narrative discourse and historical representation. the johns hopkins university press, .
white, richard. "a commemoration and a historical mediation." the journal of american history , no. ( ): - .

timeline resources

digital timeline tools

chronos timeline (hyperstudio, mit): http://hyperstudio.mit.edu/software/chronos-timeline
dipity: http://www.dipity.com
preceden: http://www.preceden.com
tiki-toki: http://www.tiki-toki.com
timeglider: http://timeglider.com
timeline (simile, mit): http://www.simile-widgets.org/timeline
timeline js (knight lab, northwestern university): http://timeline.knightlab.com
timerime: http://timerime.com
timetoast: http://www.timetoast.com
when in time: http://whenintime.com

digital timeline examples

“centennial timeline,” the print center, http://printcenter.org/ /timeline- .
friendly, michael and daniel denis. “milestones in the history of thematic cartography, statistical graphics, and data visualization,” the milestones project, http://www.datavis.ca/milestones.
“revolutionary user interfaces,” knight lab, http://timeline.knightlab.com/examples/user-interface.
“timeline of art,” wikipedia, http://en.wikipedia.org/wiki/timeline_of_art.

appendix a

figures

figure . a small portion of joseph priestley’s a chart of biography. the original from contained six horizontal bands with dates along the horizontal axis from bc to ad. source: wikimedia commons, http://commons.wikimedia.org/wiki/file:priestleychart.gif.

figure . emma willard, temple of time, . image courtesy of the american antiquarian society.

figure (above), figure (below). wikipedia’s timeline of art, -present. source: http://en.wikipedia.org/wiki/timeline_of_art.

figure (above), figure (below). michael friendly and daniel denis, milestones in the history of thematic cartography, statistical graphics, and data visualization, - . the milestones project. source: http://www.datavis.ca/milestones. figure is the screen that the timeline will navigate to when the user clicks “milestone detail” on the timeline.

figure (above), figure (below). the wright brothers, timeglider timeline. these images show how the timeline begins at a zoomed out level and populates as users zoom in. source: http://timeglider.com/timeline/line_ c ed bbfb a ff .

figure (above), figure (below). the road to revolution, timetoast timeline. these images are set to different views, one as a timeline, one as a list. source: https://www.timetoast.com/timelines/road-to-revolution-timeline.

figure (above), figure (below). western history, preceden timeline. source: http://www.preceden.com/timelines/ -western-history.

figure (above), figure (below). tower of london, tiki-toki timeline. source: http://www.tiki-toki.com/timeline/entry/ /tower-of-london- d/#vars!date= - - _ : : !.

figure (above), figure (below). revolutionary user interfaces, timelinejs timeline. source: http://timeline.knightlab.com/examples/user-interface/

figure (above), figure (below). the print center, centennial timeline, . source: http://printcenter.org/ /timeline- .

figure (above), figure (below). centennial timeline entries. figure : “a permanent home.” source: http://printcenter.org/ / / / /a-permanent-home. figure : “the print center permanent collection at the philadelphia museum of art,” source: http://printcenter.org/ / / / /inauguration-of-the-print-center-permanent-collection-at-the-philadelphia-museum-of-art/

figure (left) and figure (right). these images show the whole centennial timeline at different points in its development. figure shows the timeline as it was during my research, as of march , . figure shows the timeline as it was on march , . the most striking difference between these two images is how much the timeline has grown, particularly with images. the added entries and images are almost exclusively publications, beginning with “ the print center publication: romas viesulas” and ending at “ the print center publication: matt neff.” these publications mean that my analysis of time periods of the timeline should be revised slightly. while doing my research the publications began at . the additions imply that the print center began creating exhibit publications in the late s, rather than the s. besides the slight shift in the time that the print center is emphasizing its professional accomplishments, the content otherwise is the same. one thing starting to happen as the print center populates the centennial timeline with publications is that it takes longer to scroll through the timeline. where, with the smaller timeline, the lack of navigational tools was manageable, scrolling through an ever-increasing number of entries will get more difficult and tedious with time.
appendix b
user’s guide: the print center’s centennial timeline

this user’s guide provides a basic introduction to editing the print center’s “centennial timeline.” it is primarily intended as a reference tool for individuals working on the “centennial timeline” who do not have technical backgrounds.

general information
• the print center centennial timeline exists on the print center’s centennial website and is found here: http://printcenter.org/ /timeline- /.
• the plugin “wordpress posts timeline” takes posts categorized “timeline” and, through a shortcode on the timeline page, formats them into the timeline that the website shows. the timeline page is the page that the menu on the homepage links to and that displays the timeline.
• the easiest way to work on the timeline is to keep a separate tab or window open to the front end, with the admin screen open in another.
• through the dashboard, the timeline page is accessed by going to ‘pages’ > ‘timeline.’ the shortcode, [wp-timeline], is the only timeline content on the timeline page.
• to access the timeline’s options through the dashboard go to ‘settings’ > ‘timeline settings.’

wordpress posts timeline plugin information
• the plugin “wordpress posts timeline” can be found here: https://wordpress.org/plugins/wordpress-posts-timeline.
• to access the plugin files go to ‘plugins’ > ‘wordpress posts timeline’ > ‘edit.’
• the file “wordpress-posts-timeline/readme.txt” contains background information about the plugin. this file contains a description, installation instructions, and a changelog that documents the plugin’s updates up to the time the print center downloaded it. the file “wordpress-posts-timeline/license.txt” is the gnu general public license, which specifies that the plugin is free software and relays the conditions that this freedom implies. there is no reason to edit these two files.
• the remaining three of the plugin’s five files are the actual code. the file “timeline_options.php” determines the options on the options page. the file “wordpress-posts-timeline.php” is the file that controls the timeline’s functionality and instructs the timeline to use the options that the user chose from the options page. the file “timeline.css” determines how those elements will look on the page. these files work together to create the timeline. for example, the print center uses thumbnails on posts in its timeline. the file “timeline_options.php” instructs the options page to ask the user whether or not to include a thumbnail. if answered yes, the file “wordpress-posts-timeline.php” instructs the server to grab the image from its respective post. the file “timeline.css” then determines the behavior of the image on the page displaying the timeline, for example, the image’s size and alignment on the post. the shortcode on the page where the print center wants to place the timeline directs the website to these files.

adding content
• timeline content is added through posts. the print center’s centennial website uses posts in different categories for multiple purposes. the post’s title is the title that appears in bold as the item’s title. the date is the last attribute that needs to be set before working on the item’s content.
• to add a post go to ‘posts’ > ‘add new.’ on the options panel to the right make sure that the category is set to “timeline.”
• the timeline automatically grabs uncategorized posts, so it is important to be on the lookout for stray posts intended for other parts of the website that may end up on the timeline if their author forgot to give them a category.
• to display an image for the item on the timeline add the image as a ‘featured image’ on the right panel.
• the year that the timeline shows is taken from the post date. since wordpress automatically dates posts as the day they are made, this must be changed to the past date of whatever item is being added.
• to date entries go to the top portion of the right panel and click ‘edit’ listed after ‘publish immediately.’ the year can be backdated to the year of the event.
• to get rid of the dates within the posts, i removed the posts’ metadata through the centennial website theme’s ‘custom css’ form. to access this code from the dashboard go to the left side panel to ‘appearance’ > ‘theme options.’ on the panel within the options go to ‘advanced settings,’ and scroll down to ‘custom css.’

proposed content guidelines
• text-based entries
  o text entries could be to words and could include an image if applicable.
  o these entries would require the author to be concise and add a little historical background without doing an overwhelming amount of research.
  o because these entries are short, their authors have to think carefully about the story they want to tell and quickly structure a narrative. entries about individuals who worked at or with the print center, such as the directors, could be text-based entries with a photograph.
  o images in this context are illustrating the text and are not telling the history. besides ensuring that authors do not spend too much time on individual entries, keeping the text brief will help keep users’ attention.
• media-based entries
  o would use multimedia elements such as photographs, images of art, sound clips, and video clips as the primary means through which users will interact with the historical content.
  o these entries would have shorter text, words or less, that serves to caption the media and provide any necessary background.
  o for images, the print center would have to think about placement and how a series of images creates a story. in this sense image entries involve similar curatorial and narrative decisions as text-based entries.
  o since wordpress supports a wide variety of sliders, light boxes, and gallery options for publishing images, the timeline would work best if the print center decided on a select number of presentation formats that repeat throughout the timeline. for example, a slideshow would work well with the print center’s exhibits and publications, while events and more narrative entries would work well in an article format with full-width images placed in the post so that the user would scroll down to read the story. similarly, video and sound entries would be consistently formatted.
• research-based entries
  o research entries would provide historical context to the text and media-based entries.
  o these posts would typically focus on a topic, contain citations, include relevant historical images and photos from the archives, and aim to be around to words.
  o these entries will contain more in-depth research from the archives.
  o there would be far more text and media-based entries than research entries, but the research entries would function to contextualize the often-unrelated smaller entries together through different histories.
  o after the research content, each of these longer entries could contain a list of the timeline entries that are connected to the topic, essentially nesting a smaller, themed timeline within the main centennial timeline. for example, an entry about the history of the print center’s educational programming would list the entries that deal with educational programs.
  o taking place over a range instead of on a specific date, the research-based entries would not fit easily onto the centennial timeline. one way to integrate them onto the timeline, and add historical depth to the timeline, would be to place links to the research-based entries at the top of the timeline in the form of a tag cloud. (this could be an alternative to tagging all the entries.)
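
for readers who find it easier to see the rules from the “adding content” notes as logic, the following is a minimal python sketch of which posts end up on the timeline and which year each one shows. it is only a model under stated assumptions, not the plugin’s actual php code: the `Post` class, the `timeline_entries` function, the example titles and dates, and the oldest-first ordering are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Post:
    title: str
    published: date                 # the post date set in the 'publish' panel
    category: Optional[str] = None  # None models a post left uncategorized


def timeline_entries(posts):
    """Return (year, title) pairs for posts that would appear on the timeline.

    Models two rules from the guide: posts categorized "timeline" are shown,
    and uncategorized posts are picked up as well. The displayed year is
    simply the year of the post date.
    """
    shown = [p for p in posts if p.category in ("timeline", None)]
    shown.sort(key=lambda p: p.published)  # oldest first: an assumption
    return [(p.published.year, p.title) for p in shown]


# Hypothetical posts, invented purely for illustration.
posts = [
    Post("backdated event", date(1950, 6, 1), "timeline"),
    Post("stray post", date(2015, 3, 2)),              # author forgot a category
    Post("news item", date(2015, 2, 1), "news"),       # belongs elsewhere on the site
    Post("not yet backdated", date(2015, 3, 1), "timeline"),
]

for year, title in timeline_entries(posts):
    print(year, title)
```

in this model the stray uncategorized post shows up on the timeline, and the entry that was never backdated appears under the year it was written, which are the two mistakes the guide warns about.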
social sciences and humanities pathway towards the european open science cloud proceedings of lr sshoc: workshop about language resources for the ssh cloud, pages – language resources and evaluation conference (lrec ), marseille, – may c© european language resources association (elra), licensed under cc-by-nc social sciences and humanities pathway towards the european open science cloud suzanne dumouchel, francesca di donato, monica monachini, yoann moranville, stefanie pohle, maria eskevich cnrs, net , cnr, dariah, mws, clarin eric suzanne.dumouchel@huma-num.fr​, ​didonato@netseven.it​, monica.monachini@ilc.cnr.it​, ​yoann.moranville@dariah.eu​, pohle@maxweberstiftung.de​, ​maria@clarin.eu abstract the paper describes a journey which starts from various social sciences and humanities (ssh) research infrastructures (ri) in europe and arrives at the comprehensive “ecosystem of infrastructures”, namely the european open science cloud (eosc). we highlight how the ssh open science infrastructures contribute to the goal of establishing the eosc. first, through the example of operas, the european research infrastructure for open scholarly communication in the ssh, to see how its services are conceived to be part of the eosc and to address the communities’ needs. the next two sections highlight collaboration practices between partners in europe to build the ssh component of the eosc and a ssh discovery platform, as a service of operas and the eosc. the last two sections focus on an implementation network dedicated to ssh data ​fairification ​. keywords:​ ssh, research infrastructure, eosc, data, fair . introduction the eosc implementation plan (dg research and innovation, ) is based on a federated model, aiming at creating, stimulating and implementing synergies between existing scientific resources, primarily through the research infrastructures (ri), including e-infrastructures, part of the horizon work programme. 
this paper guides through a long journey, articulated in a path which starts from the operas ri, and crosses various social sciences and humanities (ssh) research infrastructures in europe to arrive at the comprehensive “ecosystem of infrastructures”, namely the european open science cloud (​eosc​). it makes several stops at different crossroads to highlight the steps which contribute to developing ssh research both at european and international levels. by depicting this scenario, we aim at drawing the picture of an ecosystem, the european open science cloud (eosc). while the eosc implementation is a multi-year undertaking which is being addressed in practice in several stages, different european infrastructures are currently engaged in the activities in the field of open science in the ssh. most of them are dealing with data, especially to develop tools and guidelines for researchers to be able to share, use and host data, following the fair principles. in all initiatives, needs of collaboration emerge in order to reinforce the links between data and publications, especially regarding persistent identifiers (pid), data journals, etc. ​https://www.go-fair.org/fair-principles/ this paper highlights how the ssh open science infrastructures contribute on various levels to the goal of establishing the eosc. first, through the example of operas, the european research infrastructure for open scholarly communication in the ssh, to see how its services are conceived to be part of the eosc and to address the communities’ needs. then the paper points out collaboration practices between partners in europe to build the ssh component of the eosc (in the context of the sshoc h project) and a discovery platform specifically conceived as an operas service to be integrated into the eosc (triple h project). 
the last two parts of the paper focus again on collaborations: at a national level, through the eosc-pillar project, and internationally, through an implementation network dedicated to ssh data fairification.
. crossroad : operas-p and operas
operas-p is a two-year, european commission-funded project, aiming at the development of operas - open scholarly communication in the european research area for social sciences and humanities - as a european research infrastructure. operas-p project will develop a protocol and a roadmap for the inclusion of the operas research infrastructure services for ssh into the eosc portal. [footnotes: social sciences and humanities open cloud project. h -infradev- - . see: https://cordis.europa.eu/project/id/ . created in , operas consortium comprises organisations from countries and is led by a core group consisting of members. doi: . /zenodo. ] this protocol will be based on the rules and procedures already introduced by the eosc, while taking into account the work in progress of the sshoc project. the project will implement some of the following operas innovative services, which will be integrated with the eosc ecosystem:
a. operas discovery service. the triple project, described in detail below (see section ), will become the operas discovery platform, which will provide access to ssh resources, such as data and relevant publications, researcher profiles as well as project descriptions.
b. operas certification service.
​the directory of open access books (doab), which ensures discoverability of open access books and delivers global peer-review certification for funders and libraries, will be redeveloped to become a central service of operas as an open source platform based on dspace technology. this move is crucial for ssh researchers in the light of plan s and the global shift towards open science in europe. c. operas metrics service. the metrics service collects usage metrics and altmetrics from many different sources (google books, matomo analytics, world reader, etc.) about the usage of monographs. measures are displayed in a light javascript widget, broken down into types and sources, with links to the description of each measure. different components complement the service, including a data model, an open source tool suite to provide metrics to the service, a central operas database as well as a dashboard and a javascript widget for visualisation. d. operas publishing service portal. ​due to the fragmentation of services and tools, ssh researchers in europe struggle to define and implement their communication strategy in an uncoordinated communication landscape. the operas-p project will implement a common access point to the publishing services offered by its members. this access point is a web portal listing the relevant services provided by the operas infrastructure nodes and beyond. the portal will help researchers in selecting the appropriate publishing venue and defining their scholarly communication strategy. e. operas check-in. ​to support a transparent and seamless access to the operas platforms and to external sources of data, the egi check-in service will be adopted as authentication and authorization service within the operas ri. 
the service provides an identity and access management solution that facilitates the access to services and resources using the federated authentication mechanisms, thanks to the implementation of virtual organisation common for operas services and its users.
f. operas xml toolbox. in ssh, the community has to overcome a specific obstacle, i.e. the juxtaposition of two standards: xml jats, adopted by the academic publishing industry, and xml tei (text encoding initiative) adopted by the humanities research community for books and digital editions. operas-p will provide tools to achieve interoperability between these two standards. [footnote on plan s, mentioned under the certification service: https://www.coalition-s.org/]
the innovation part of the operas-p project is aimed at producing a robust, empirically tested and stakeholder-validated foundational body of knowledge relevant for the future development and functioning of operas. this includes the development of sustainable models of governance for infrastructures, business models for open scholarly publishing, groundbreaking concepts to address the fairification of ssh data, multilingualism, the future of scholarly writing as well as quality assessment of novel research outputs. in sum, operas-p means a process of transforming operas to the status of a mature community, with a set of services compatible with eosc, stable national nodes and innovative plans for future development.
. crossroad : building the ssh component of the eosc (sshoc)
the overall objective of the sshoc project is to build the ssh component of the eosc. the project aims at realising the transition from the current landscape with disciplinary silos and separated e-infrastructure facilities into a cloud-based infrastructure where data are fair, and tools and training are available for ssh scholars who have adopted, or want to adopt, a data-driven scientific approach and who have an interest in the innovation and integration of their methodological frameworks. the ambition of sshoc is to: a.
increase the efficiency and productivity of researchers - by providing a fully-fledged ssh cloud where data, tools and services are easily and seamlessly discoverable, accessible and (re)usable.
b. contribute to the creation of a cross-border and multi-disciplinary open innovation environment - by fostering the development of infrastructural support for digital scholarship.
c. strengthen/encourage the collaboration between the partners involved in the sshoc project that are representing the broad spectrum of the ssh community through the use and harmonisation of different technologies and services that are already available and also being developed within the course of the project.
the project therefore aims for synergies across disciplines and works towards a clustered cloud infrastructure that makes use of common elements, such as secured login, storage and computing power, and other e-infrastructures. the project is very well connected to national activities, thanks to the participation of all five ssh erics (european research infrastructure consortium). [footnote: infraeosc- - , social science and humanities open cloud https://www.sshopencloud.eu/about-sshoc.] furthermore, salient pan-european and global data surveys participate in the project. sshoc also participates in international activities such as the research data alliance and other initiatives of a similar nature. the sshoc ecosystem will use the existing infrastructures that are already provided by the project partners and will make existing tools and services easier to find and better available for diverse communities of potential use. in particular, the sshoc approach is to develop, enhance and integrate a set of tools and services for managing and processing ssh research data that are central to the communities of use in ssh, based on existing tools and functionalities, and requirements for interoperability.
existing tools and services will be adjusted and enriched, making connections to eosc-hub e-infrastructure for the sharing and use of tools and services useful for ssh. special attention is given to cross-disciplinary use of services, e.g. providing language technology for social sciences and humanities scenarios of use. the sshoc project will cover the full research and development and ready-to-market cycle: in particular, the ssh open marketplace platform will contain solutions, training materials, tools and services for researchers, all contextualised within one another. the lack of a central place integrating assets from all ssh-related project websites, service registries and data repositories is what drove the creation of this marketplace. the choice was made to provide datasets via the marketplace only when relevant in the context of tools, trainings or other materials. [footnote: triple could overcome potential gaps by providing access to other datasets, see https://doi.org/ . /zenodo. and section ] the marketplace has always, since its beginning, been conceptualised as a community-oriented platform where the community can directly take part in the curation of its data. the leveraged services will deeply embed open science and fair principles by making data findable, accessible, interoperable and re-usable.
. crossroad : building a european discovery service for ssh data (triple)
ssh research is divided across a wide array of disciplines, sub-disciplines and languages. while this specialisation makes it possible to investigate the extensive variety of ssh topics, it also leads to a fragmentation that prevents ssh research from reaching its full potential. use and reuse of ssh research is suboptimal, interdisciplinary collaboration possibilities are often missed, and as a result, societal, economic and academic impacts are limited (dallas c., ).
the triple project, which consists of a consortium of currently partners from countries, is a practical answer to the above issues, as it aims at designing and developing a multilingual and multicultural discovery platform dedicated to ssh resources at european scale. triple will improve the accessibility and dissemination of ssh resources through a single access point which allows free access to circa six million documents in the domain of social sciences and humanities, including peer reviewed journals, articles, books and blog posts, as well as to research data, projects and researcher profiles. the triple solution will provide linked exploration thanks to ( ) the isidore search engine, and ( ) a variety of connected innovative tools, which include visualisations, a web annotation service, a trust building system, a crowdfunding system and a recommender system. triple’s main objective is then to enable researchers to discover and reuse ssh data macro-typologies, related not only to publications, but also to people and projects. the integration of triple into the eosc will be performed according to eosc general principles and to the set of recommendations and guidelines, structured under the six priorities, i.e. landscape, fair, architecture, rules of participation and sustainability, skills and training, which are coordinated by the relative eosc working groups. a major strength lies in the composition of the triple consortium: not only are the main ris for ssh project partners, but several partners also play an active part in the eosc implementation. moreover, specific synergies are developed with sshoc, and memorandums of understanding (starting with sshoc) are planned. the triple solution is envisaged to be a major component of the ssh open marketplace (https://www.sshopencloud.eu/ssh-open-marketplace), which will be the entry door to the eosc for all the different ssh services.
the triple consortium is also experimenting with new forms of engagement and community-building through the triple forum, which will bring together relevant stakeholders. linked to the sshoc community and the ones served by the research infrastructures, triple forum will contribute to bringing the researchers into the eosc and more broadly into the open science movement. [footnote on isidore: isidore is a large-scale discovery service, developed by the tgir huma-num (cnrs) since (https://isidore.science/).]
. crossroad : beyond national services, how ssh open collaborations
the eosc-pillar project (https://www.eosc-pillar.eu/) aims to identify, coordinate and harmonize existing national initiatives for the national coordination of data infrastructures and services that recently started in many member states (ms) as one of the founding pillars for the development and the long-term sustainability of the eosc. [footnote: funded under the european commission program infraeosc- - “prototyping new innovative services”.] the idea is, thus, leveraging national initiatives of the ms and thematic initiatives (ti) developed by research communities working in national and european collaborations to build a future based on open science and fair data practices. concretely, that implies to:
a. support the coordination and harmonization of mature national initiatives for open data, open science services, cloud and data infrastructures.
b. facilitate the adoption and compliance with eosc standards… while proactively providing feedback to the eosc governance…
c. contribute to the creation of an achievable cutting-edge, end user-oriented environment for european data-driven science, through the promotion of fair practices and services.
the federation of national initiatives will be the catalyst for trans-national open data and open science services (common policies, fair services, shared standards, technical choices). the project gathers representatives of the fast-growing national initiatives for coordinating data infrastructures and services in italy, france, germany, austria and belgium. in this framework, the french very large research infrastructure huma-num and the center for direct scientific communication (ccsd), which created the hal open archive and is now in charge of its development and management, together with the conference management platform sciencesconf.org and the hosting platform of epi-journals, decided to join their efforts to propose a proof of concept (poc) around two of their services for ssh. this poc will link the huma-num repository nakala to the hal open archive to address the need for ssh to be able to prove the authenticity of data, and to guarantee accessibility to raw data which are at the root of research and innovation - this approach being in a perspective of reproducibility of the research. in eosc-pillar, the ssh community is built from regional areas. it highlights practices and opens opportunities for new collaborations with other disciplines, so as to bring researchers to new networks and innovative research projects.
. crossroad : beyond the eosc, implementing ssh data fairification
co-operas is an implementation network within the context of the gofair initiative (https://www.go-fair.org/implementation-networks/overview/co-operas/). it aims to bring ssh data into the eosc, helping communities to fairify them, and, in turn, to enrich the fairification process and registries with specific ssh standards. “define fair for implementation” is also the first recommendation of the dg research and innovation, .
the network was created and launched in , and is one of operas’ building blocks connecting european and international research communities through the fair principles as a common ground. in that sense, within the operas environment, co-operas’ activities represent a reciprocal movement towards and from the research infrastructure: on the one hand, it brings feedback and suggestions from specific communities in order to implement the services; on the other hand it brings coordination to fragmented and heterogeneous communities. co-operas stands right at the crossroad between data and publications, and it perfectly fits in the operas ecosystem as it more than integrates data and publications. as a community-based network, co-operas’ first aim is to define the term “data” in the field of ssh. to this purpose, regional and national workshops in different languages (e.g. italian, german, french…) are being organized. researchers are asked to provide their definitions of “data”, and then to assess the level of fairness maturity of the data they are using and creating. diversity comes along with fragmentation of practices and lack of standards. then, the ssh community needs to converge around shared expertise and practices. to do so, the fair principles are one of the most valuable tools as they are able to be broadly applied and widely shared. identifying the gaps and the critical issues is crucial in order to plan new useful services or to create new standards and promote their adoption. in parallel, operas’ services and related projects such as triple and sshoc will offer a field of application for concrete and improved fair data curation, discovery, harvesting, and reuse in the ssh. . conclusions building eosc components implies to be well-organised and coordinated at a european scale. for the social sciences and the humanities, often fragmented also from a linguistic point of view, the challenge is quite high. 
the initiatives surveyed above each focus on both general and specific aspects which, in the end, contribute to defining a set of rules and guidelines for the implementation of the ssh components of the eosc. this is why there is a strong need for collaborations between european research infrastructures, as well as for interoperability of the services. but what is most important is to share a common goal and to work in the same direction. in general, strong synergies are in place between all the described initiatives and projects:
- the main ris for ssh, i.e. clarin, dariah and cessda, are triple project members, and all the five erics (the three above plus share and ess) are sshoc project members;
- specific synergies are developed between triple and sshoc, where the coordinator (cessda) is a triple partner, and cnrs and cnr are sshoc partners;
- memorandums of understanding are planned;
- egi partnership within triple ensures that the technology will be fully interoperable with other e-infrastructures services, especially regarding aai technology and resource discoverability;
- the collaboration between triple, sshoc and the co-operas implementation network, of which triple partners and sshoc partners are respectively part, builds a bridge between ssh data and the eosc, widening the concept of “research data” to all types of digital ssh research outputs;
- numerous discussions about the eosc are linked to fairification of data, in the stm especially focusing on big data. in the ssh field, data does not always fit the definition of “big data”, but it still requires specific management and solutions. the co-operas work on ssh fairification, and specifically on fair implementation profiles and fair data objects, can be relevant for ssh initiatives.
ssh contribution to the eosc definition and implementation draws upon the strong efforts made within the different projects and initiatives to build a strong ssh community. within the ssh, communities of practice are very fragmented but with a high willingness to share practices and knowledge and to build upon the existing commonalities. links are strengthened between humanities, social sciences, cultural heritage, scholarly communication communities. all the above described initiatives show a common vision and complementarity while sharing common challenges, such as overcoming fragmentation and the lack of a single, central solution, addressing common issues such as multilingualism, interoperability, fairification, the eosc marketplace, language and discovery services, and the connection to national and international activities. these different initiatives could overlap in their activities at some point. however, this is not an issue. ssh are well-known for their diversity of interpretation and their critical dimensions. what is presented in this paper is a federation of ssh facets which help to avoid simplification and reduction in order to deploy complexity at a large scale through the different initiatives. this is where ssh, thanks to and through their multiple facets, can play a strong role in the building of the eosc: they anchor a practice in a history, in an area, in a future. [footnote: stm stands for scientific, technical and medical sciences.]
. acknowledgements
the work presented in this paper has been partly supported by infraeosc- - triple, transforming research through innovative practices for linked interdisciplinary exploration, and partly by infraeosc- - sshoc, social science and humanities open cloud.
. bibliographical references
burgelman j-c, pascu c, szkuta k, von schomberg r, karalopoulos a, repanas k and schouppe m ( ) open science, open data, and open scholarship: european policies to make science fit for the twenty-first century. front. big data : . doi: .
/fdata. . . barbot l, moranville y, fischer f, petitfils c, Ďurčo m, illmayer k, … karampatakis s ( ). sshoc d . system specification - ssh open marketplace (version . ). zenodo, . /zenodo. .the directorate-general for research and innovation ( ). turning fair into reality, - , european commission, brussels, - - - - , doi: . / . directorate-general for research and innovation ( ) european open science cloud (eosc) strategic implementation plan, - , european commission, brussels, - - - - , doi: . / . operas consortium. ( , july ). operas design study. zenodo. http://doi.org/ . /zenodo. von schomberg, r. ( ). “why responsible innovation?” in ​international handbook on responsible innovation a global resource​, eds r. von schomberg and j. hankins (cheltenham: edward elgar publishing), – . doi: . / . wenger, etienne ( ). communities of practice: learning, meaning, and identity. cambridge: cambridge university press; wenger, etienne; mcdermott, richard; snyder, william m. ( ). cultivating communities of practice (hardcover). harvard business press; st edition. het einde der tijden op twitter the apocalypse on twitter theo meder, dong nguyen, rilana gravel introduction nowadays, more and more research is being done using social media such as twitter. for instance, twitter can be used as the fastest news medium, commenting live on an earthquake, bad weather passing by or other news events. instead of polls, sentiment analysis of tweets could predict the outcome of elections. twitter can even be used for examining public health, from epidemics to use of medication. we decided to monitor twitter for folklore research. in the last four months of , during the so-called tinpot project of the meertens institute (knaw) in amsterdam (carried out in collaboration with teezir bv in utrecht), we were able to monitor all the dutch short message traffic on twitter related to urban legends and product rumours. 
however, as far as real products were concerned, the results were not as spectacular as we had hoped. most notable were the rumours saying that people would soon have to pay for each whatsapp message they sent, and rumours about pork fat in various products (both rumours reached some , tweets in four months). by pork fat, people on twitter meant gelatin, which supposedly can be found in products as diverse as sweets in general, and wine gums in particular, cheese, bread, mcdonalds’ french fries, pringles, the white filling in oreo cookies, and the foam layer produced by senseo and nespresso coffee machines. these messages about pork are not only bad news for muslims (the word "haram" is used regularly), but also for vegetarians and vegans. however, during the research period, there was another rumour that really became a trending topic: the impending apocalypse. secular (or at least: non-christian) stories testifying that the world would come to an end on the st of december have been circulating for years. the main argument was that on that day, the mayan calendar ends, and that the maya – so to speak – possessed unique cosmic knowledge. the new age writer who popularized this idea was the mexican-american art historian and peace activist josé argüelles ( - ) with his book the mayan factor ( ). most people did not believe him, nor were they convinced of his interpretation of the mayan calendar. besides, new mayan calendars were found that exceeded the year , while other people claimed that there would only come an end to an era when a new era is about to begin. nevertheless, the date of december , as the date of the apocalypse got implanted in many a human brain as a meme. by way of entertainment, hollywood even released a disaster movie called in . however, it remained to be seen how many people in the netherlands would really be afraid on the day of the alleged apocalypse. weeks and even months before, the media had created news items about the impending doom, and sometimes they had even managed to get someone in front of their cameras who was going to survive on a mountain top, or who was building a new ark. still, the question remains how deep the unrest was among large parts of the population. rumour had it that one was able to survive the end of days in the french pyrenees village of bugarach, but film crews that went there to document events on doomsday were merely filming each other. how can a researcher determine the average feelings among the dutch population, without extensive interviewing, unseen and without manipulating the outcome?
this article is a reworking of the paper theo meder presented at the nd international conference of the international society for contemporary legend research (isclr) in prague (czech republic), june , .
see becker, naaman & gravano ; sakaki, okazaki & matsuo ; and voets . dutch people on twitter were the fastest and best informed about what happened on memorial day ( may ) during the minute of silence; see meder , pp. - .
see o’connor, balasubramanyan, routledge & smith .
paul & dredze .
tinpot is a dutch acronym that stands for ‘language, identity, networks, and product rumours on twitter’. during the six months research period, we mainly focussed on language and identities, which resulted in a tool called tweetgenie: see www.tweetgenie.nl. looking at the linguistic behaviour of dutch people on twitter (and not at their profiles or pictures), tweetgenie can guess the gender and age of these people (see nguyen et al a, b and , and gravel ). i did a little research on product rumours in this period. we hope to continue research into networks and (linguistic) variation in a next project.
teezir bv monitors social media for sentiments about products (research commissioned by commercial companies).
although there is a distinction between the dutch population as a whole on the one hand and dutch people on twitter on the other, one can at least get a more or less fair indication by monitoring dutch tweets on twitter. instead of conducting time-consuming interviews for qualitative research, or costly surveys for quantitative research, we monitored twitter between december and december , so generally speaking from one week before until one week after the alleged apocalypse. to make sure that we would not monitor tweets worldwide, we harvested tweets by using unique dutch keywords or combinations (coming down to: end, world, dead, die, going down, mayan calendar, prediction, believe, doomsday):
einde wereld einde vergaat maya geloven einde maya eindvandewereld maya einde het einde dood het apocalyps maya het december wereld ik apocalyps maya voorspelling december dood wereld apocalyps december voorspelling december vergaat wereld ten onder het wereld vergaat jaar maya kalender het doomsday het wereld vergaat maya kalender wereld doemsdag laatste december maya kalender einde endoftheworldconfessions het
about bugarach, see campion-vincent . the documentary bugarach et les journalistes on youtube (http://youtu.be/ pvuuz z _w) has unfortunately been removed because of copyright infringement.
for a general introduction to - - , see de visser .
the twitter population is not fully representative of the population of the netherlands: only a minority of the dutch have a twitter account, young people are overrepresented while the middle aged and elderly are underrepresented, and finally some opinion leaders are more actively present than timid trend followers. see note as well.
sometimes words like 'het' (= the) and 'ik' (= i) are added to collect only dutch tweets and no tweets in other languages.
the last three keywords and combinations were added five days after the beginning of the survey, when we noticed these were used as well.
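the keyword-based selection described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the collection code used in the project: the function names and the three keyword combinations shown are hypothetical groupings chosen for the example, and a tweet is taken to match when all words of at least one combination occur in it as whole tokens.

```python
import re

# Illustrative subset of keyword combinations. These groupings are
# assumptions made for this example, not the project's actual query list.
COMBINATIONS = [
    {"einde", "wereld"},
    {"maya", "kalender"},
    {"doomsday", "het"},  # a Dutch function word keeps out non-Dutch tweets
]

# A token is a run of word characters, optionally preceded by # or @.
TOKEN = re.compile(r"[#@]?\w+")


def tokens(text):
    """Return the set of lowercase tokens, with a leading # or @ removed."""
    return {t.lstrip("#@") for t in TOKEN.findall(text.lower())}


def matches_any_combination(text, combinations=COMBINATIONS):
    """True if every word of at least one combination occurs in the text."""
    words = tokens(text)
    return any(combo <= words for combo in combinations)
```

requiring a function word such as 'het' next to a content word is what filters out tweets in other languages that happen to contain a word like 'doomsday'.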
collecting tweets we collected , tweets about the end of times in two weeks time. just to give you a rough idea: if you were to print all the messages in a document, you would get a pack of paper consisting of more than , a sheets. incidentally, we are not dealing with all original tweets: % of the tweets are actually retweets (the selection is based on the string: rt<space>). another part consists of non-literal repetitions, variations on already circulating tweets. tweets in total , number of retweets , ( %) the number of retweets is enormous; in a random sample of tweets, the percentage of retweets is much lower ( % according to boyd, golder & lotan ). it is also obvious that messages about the apocalypse differed in quantity per day. date number of tweets saturday december , sunday december , monday december , tuesday december , wednesday december , thursday december , friday december , saturday december , sunday december monday december , tuesday december wednesday december thursday december total , set out in a graph, the activity on twitter looks like this: the dutch tweets about the apocalypse were collected by dong nguyen, using the twitter api of our partner teezir bv in utrecht. the quantitative analyses were performed with the concordancer software program antconc. the value for the first saturday is not entirely valid, since we only started crawling the twitter api halfway through the day. nevertheless, the first small peak seems to appear when everyone is free during the weekend and more people have time to tweet. after sunday, the curve goes down a little, but shoots up again one day before the supposed apocalypse, and the largest peak occurs - as to be expected - on the st of december itself. the next day, after ‘doomsday’ virtually passed-off smoothly, the curve immediately drops dramatically. the following days, people gradually lose interest in the subject, as it slowly peters out. 
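the counts reported above (total tweets, share of retweets, tweets per day) can be derived from a list of dated tweets along the following lines. this is a sketch under assumptions rather than the procedure actually followed with antconc: the function names and the input format (pairs of a date string and the tweet text) are hypothetical, and the retweet test assumes that the marker is the uppercase token RT followed by a space, matched at a word boundary so that a word such as START is not counted.

```python
import re
from collections import Counter

# Assumed retweet marker: uppercase "RT" followed by a space,
# starting at a word boundary.
RT_MARKER = re.compile(r"\bRT ")


def is_retweet(text):
    """True if the tweet text contains the retweet marker."""
    return RT_MARKER.search(text) is not None


def summarize(tweets):
    """Summarize an iterable of (date_string, text) pairs.

    Returns (total, retweets, retweet_share, tweets_per_day), where
    tweets_per_day is a Counter keyed by the date string.
    """
    per_day = Counter()
    total = 0
    retweets = 0
    for day, text in tweets:
        total += 1
        per_day[day] += 1
        if is_retweet(text):
            retweets += 1
    share = retweets / total if total else 0.0
    return total, retweets, share, per_day
```

calling `summarize` on the collected corpus yields the figures for a table like the one above; sorting the items of the returned counter by date gives the series behind the activity graph.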
the most popular retweets
among the most popular tweets, we count the tweets that have been retweeted more than a hundred times. this is the case for tweets. some tweets were retweeted literally. in other tweets, micro-variations occurred, after which they were tweeted and retweeted again. below we present the top ten of popular dutch retweets concerning the apocalypse, each followed by an english translation:
top ten of most popular retweets
1. rt @mannen_humor: degene die gelooft dat de wereld op december vergaat kan op december zijn geld op mijn bankrekening storten. <diverse variaties en afzenders>
   rt @male_humor: those who believe that the world will end on december st can deposit their money into my bank account on december th. <several variations and senders>
2. rt @rtditbericht: rt als jij niet gelooft in het einde van de wereld in dec, : http://t.co/ jsznbyl <meerdere afzenders>
   rt @rtthismessage: rt if you do not believe in the end of the world in dec, : http://t.co/ jsznbyl <several senders>
3. #tweetvandeweek: de maya's hebben een nieuwe theorie waardoor we nog jaar leven. de wereld vergaat pas als feyenoord kampioen wordt. <diverse variaties>
   #tweetoftheweek: the mayans have a new theory that would allow us to live on for another years. the world will end as soon as feyenoord becomes champion. <several variations>
4. rt @randompuber: dus december vergaat de wereld? #retweet als jij dit ook bullshit vindt! <meerdere afzenders>
   rt @randomadolescent: so december st the world will end? #retweet if you too think this is bullshit! <several senders>
5. rt @top_moppen: de maya's hadden het mis. vandaag is het niet het einde van de wereld, maar slechts het einde van de week... #topmoppen <kleine variaties, meerdere afzenders>
   rt @top_jokes: the mayans were wrong. today isn’t the end of the world, it’s just the end of the week... #topjokes <small variations, several senders>
6. modernegezegden maya : hey, biertje doen? maya : ik werk aan die kalender, maar het is denk ik niet het einde van de wereld als ik 'm niet afmaak. <ook verzonden door @lachjerot>
   modernproverbs mayan : hey, grab a beer? mayan : i’m working on this calendar, but i guess it won’t be the end of the world if i don’t finish it. <sent by @laughingyourassoff as well>
7. rt @swagzinnetjes: hey wacht, de wereld vergaat niet want op andere plaatsen in de wereld is het al december. <ook verzonden door rt @ohtienershit>
   rt @swaglittlesentences: hey wait, the world won’t come to an end because in other places in the world it is already december st <also sent by @ohadolescentshit>
8. rt @mannen_humor: stille tocht voor #bultrug? serieus? doe ff normaal... de #maya's hadden gelijk, de wereld is naar de klote. het einde ... <variaties>
   rt @male_humor: silent procession for #humpback? seriously? get real... the #mayans were right, the world is fucked. the end... <variations>
9. rt @nuttelozefeiten: in australie is het al vrijdag / / <veel variatie en afzenders>
   rt @uselessfacts: in australia it is already friday / / <many variations and senders>
10. rt @robscheepers: de positieve kant van dat maya verhaal; áls de wereld vrijdag vergaat is #psv officieel kampioen seizoen / <meerdere afzenders>
   rt @robscheepers: the positive side of the mayan story; íf the world is gonna end on friday #psv will be official champion of season / <several senders>
feyenoord is a soccer club from rotterdam.
a humpback whale stranded on a sandbank near texel early december , and was nicknamed johannes (john). attempts to free the whale failed, and after a week, he (who turned out to be a she) died. there was a lot of turmoil about the question of who should be allowed to rescue the animal. after the humpback had died, some people suggested having a silent procession to commemorate the (unnecessary?) loss of life.
psv is a soccer club from eindhoven.
the picture with the message “rt if you do not believe”
the most popular retweets together consist of , tweets, which makes up % of all the tweets about the apocalypse. the top ten consists of , tweets, which is still % of the entire collection. what stands out most of all - not only in this top ten, but also in the most popular retweets – are the sobriety, humour and sarcasm with which people respond to the apocalypse. the vast majority of twitter users do not appear to believe in impending doom at all, contrary to what the media hype surrounding the issue might have suggested. there is only one tweet (which got retweeted times) that seems to take the possibility of a global downfall somewhat seriously, given the dry statement “het is december, we gaan dood. doei...” ("it is december , we are going to die. bye... ")
most of the popular retweets do not remain completely stable: variation occurs in most cases. popular tweets get retweeted a lot, and some people do so under their own names in a slightly different wording. tweets are especially popular when they come from accounts that have many followers: often not individuals, but communities like @mannen_humor (male_humor), @randompuber (randomadolescent), @rtditbericht (retweetthismessage), @despeld (theneedle, a dutch counterpart of the onion (usa) and of the daily mash and the poke (uk), publishing humorous fake news), @woordhumor (wordhumor) and @slechte_grappen (bad_jokes). still, it sometimes happens that tweets from individuals get retweeted without alteration. if such a person is a famous dutch personality, it is more likely that their tweet will be retweeted unchanged, with the famous person in the role of the authority. one of the best examples is the tweet from dutch weather forecaster and media personality helga van leur (rtl ): “de astronomische winter in begint op - om . uur. misschien was dat het bizarre op de #mayakalender!” (“astronomical winter starts on - at : p.m.
maybe this was the bizarre event on the #mayancalendar!”, retweeted unaltered times). (micro)variation as an example of (micro)variation, we took the “we gaan dood” ("we are going to die") tweet. from december on, tweets carrying this message were circulating, containing the string “we gaan dood”: - - kleinemann rt @claudiajelena_: afraid voor december. / we gaan dood rt @claudiajelena_: afraid of december st. / we are going to die - - rayfr day omgomg, bijna december. we gaan dood fissaaaaa. omgomg, almost december . we are going to die partyyyyy. - - boybelieber_nl @ we gaan dood december ....... xd @ we are going to die december st ....... xd - - matex morgen december we gaan dood tomorrow december we are going to die - - ikbenhetkelly morgen december we gaan dood.......;p tomorrow december we are going to die.......;p - - connorvans oh neeee het is december, we gaan dood :( oh noooo it is december , we are going to die :( the stretched outcry “fissaaaa” comes from “fissa”, which is urban slang for a party. the emoticons winking sticking out tongue (;p) and laughing with eyes closed (xd) both indicate that the authors do not take their own statement very seriously. however, the emoticon in the last tweet looks sad. the message that got retweeted most ( times literally) goes like this: - - becockstovers het is december, we gaan dood. doei. it is december , we are going to die. bye. a few people take over the tweet literally and turn it into their own communication. however, the following tweet only got retweeted times: - - ninormaal_ het is december, we gaan dood. doei. it is december , we are going to die. bye. here are some more examples of variations in retweets: - - vldnino het is december. ik ben dood. doei. it is december st. i am dead. bye. - - maudlynn_ hey het is december , we gaan allemaal dood fuck yeah hey it is december , we are all going to die fuck yeah - - gelenvin december we gaan dood bitches!!!! december st we are going to die bitches!!!! 
- - frans december!!!!!!!! nooooo we gaan dood #ikmerkeralozoveelvan december st!!!!!!! nooooo we are going to die #itissoooohappeningtome - - tjaberyayo omg het is december omg we gaan dood omg omg omg omg it is december st omg we are going to die omg omg omg the abbreviation "omg" stands for "oh my god". a phlegmatic response to the ‘we are going to die’ tweet is: - - toine haha hele tl met het is december we gaan dood doei haha entire tl with it is december we are going to die bye the abbreviation “tl” stands for timeline. fiercer comments on this kind of tweets appear as well: - - dream_tastic hou je bek over die we gaan dood op december oke shut up about that we are going to die on december st okay - - justinex december we gaan dood is gwn onzin. december st we are going to die is jst nonsense - - nerdyfreshh ik heb geen eens zin om op twitter te kijken door jullie december we gaan dood bullshit. i don’t feel like looking on twitter because of your december st we are going to die bullshit. - - marlouajs_ dat '' het is december, we gaan dood '' is nu al tering irritant. hjb gewoon ofzo. this “ it is december st, we are going to die “ is already pissing me off. just stfu or something. the abbreviation “gwn” stands for “gewoon” (just), and “hjb” is short for “hou je bek” which i translated as stfu (shut the fuck up). the day after the alledged doomsday, the next response was inevitable: - - assa_lovesyax het is december en we leven nog. met jullie achterlijke ''de wereld vergaat, we gaan dood'' theorie =s it is december , and we are still alive. you and your stupid “the world is ending, we are going to die” theory =s the emoticon “=s” probably stands for confusion or mixed feelings. one important aspect of variation has not been discussed yet, because most tweets are not too long, and can be retweeted without loss of information. this is different with a tweet like "hey, grab a beer?" (# in top ten). 
the tweet itself already contains characters in dutch (out of the authorized). by retweeting, more information is added like "rt <space> @modernproverbs <space>" ( extra characters in dutch). so the longer this extra string of characters is, the more information you will lose at the end of your retweet. this loss of information makes it less attractive to retweet again. truncating the text to prevent loss of the punchline (which is quite customary) has actually hardly been done; retweets have almost always taken place using complete, original tweets (about such phenomena see also boyd, golder & lotan ).
belief, fear and emotions
the word cloud of the entire corpus of apocalypse tweets shows in particular that according to the mayans, the world will end on december st and that there are plenty of retweets. telling desperate people to transfer their money into someone else's account is a motif that recurs in the word cloud as well.
word cloud of all dutch tweets about the apocalypse (some of the irrelevant function words were deleted). the most prominent words form the sentence: “rt the world ends in december”.
the number of tweets with an explicit religious response is fairly small. a search for the keywords "god" and "allah" resulted in a poor score of . % (the majority of the responses came from dutch muslims):
- “alleen god kent het einde” / “only god knows the end”
- “alleen allah kent het einde” (niet de maya’s) <uiteenlopende formuleringen en varianten> / “only allah knows the end” (not the mayans) <several different phrasings and variants>
even on twitter, concerns about the end of days are of course not completely absent, as already demonstrated by the stolid tweet: "it is december , we are going to die. bye." there are people who express their genuine fear, or at least their distress. many of these tweets are individual expressions that do not get retweeted frequently, and they total about .
this means that we are talking about % of the dutch tweets, and about a relatively small group of people. one of the more interesting and creative tweets was the following: - - annezeven ik denk dat johannes het begin van de apocalyps inluidt. vandaar ook de bijbelse naam natuurlijk #bultrug i think john heralds the beginning of the apocalypse. hence the biblical name of course #humpback at the beginning of december, a dying humpback whale was found on a sandbank called de razende bol (the stormy sphere) near the island of texel, and this whale was nicknamed john. from a biblical perspective, this is a meaningful name, for john was also the author of the last book of revelation about the apocalypse. in tweets, we searched for words like “bang” (fear, afraid, frightened), “angst(ig)” (afraid, fear(ful), anxiety), "zenuwachtig” (nervous), “zenuwen” (nerves), "stress" and "ongerust” (worried). here are some typical examples of fear and worry: - - elzederks ik ben egt bang dat de wereld vergaat december!! i am rlly afraid that the world will end december - - xjetnoraa_ pam en ik zijn bang dat op december de wereld vergaat =''( pam and i are afraid that on december st the world will end =''( - - xnobodycaresbby ik geloof niet dat de wereld vergaat december. maar ik zie zoveel mensen er bang voor zijn. en stiekum word ik zelf dan ook wel bang .. i don’t believe that the world will end december st. but i see so many people being afraid. and secretly this makes me scared as well .. - - liefsteliekex kutzooi, ben echt bang voor vrijdag .. straks vergaat de wereld echt. fucking shit, am really scared of friday .. maybe the world will end for real. - - emmaya was de vrijdag van deze week al voorbij! december geeft mij meer stress, want als de wereld echt vergaat? dan lacht niemand meer!! wishing this week’s friday was over already! december gives me a lot of stress, because if the world is really gonna end? then nobody will laugh anymore!! 
the emoticon =''( in the second example looks sad and cries. in the next three tweets, people indicate that they got scared by a certain article on the internet: - - hmfrisdrank ik wordt nu echt bang nu ik dit aan het lezen ben dfjaefhndaskfkeufefewkfhkeu http://t.co/ aehrtjt i’m getting really scared now i am reading this dfjaefhndaskfkeufefewkfhkeu http://t.co/ aehrtjt - - doerakx ik zweer ik ben bestwel bang eh nadat ik dit artiekel heb gelezen http://t.co/a bgxrrk i’m telling ya i got really scared eh after reading this article http://t.co/a bgxrrk - - marlythatsme http://t.co/o k ew hier wordt ik toch wel bang van.. http://t.co/o k ew this makes me kinda scared.. in all three cases, twitter users refer to a dutch article entitled “ redenen waarom de wereld vergaat op december ” (" reasons why the world will end on december , "), which can still be found on the internet: http://www.einde .nl/ -redenen-waarom-de-wereld-vergaat-op- - december- /. the internet article not only mentions the end of the mayan calendar, but also examines violent solar storms, the reversal of the magnetic poles, a collision with planet x and other doomsday scenarios. the final four examples we would like to present here: - - kellysky_ ik geloof niet in de apocalyps, maar ben toch wel ''n beetje nerveus voor vrijdag op de een of andere manier xd i don’t believe in the apocalypse, but still one way or another i am a little nervous for friday xd - - meialinde december ben ik vet zenuwachtig het idee dat elke seconde de wereld kan gaan trillen enzo hahaha december i am a nervous wreck the idea that any second the world can start shaking and such hahaha - - jenniferbonhofx ben bang dat wereld vergaat op december am afraid the world will end on december - - xniallxxx stiekem ben ik toch wel bang dat de wereld vergaat op decemberŠ! secretly i am afraid that the world will come to an end on december Š! 
the emoticon xD is the diminutive form of XD, which stands for loud laughter with eyes closed: so in this case there is some mild laughing going on. we have not yet worked out the meaning of Š. one may wonder about the age of all these anxious twitter users. judging from the thirteen examples we gave, we are dealing with ten teenage girls, two teenage boys, and one undetermined person who has closed his or her account in the meantime.
the spectrum of emotions on the total corpus of dutch tweets about the apocalypse can be analyzed using the program liwc: linguistic inquiry and word count.
liwc categories scored: affective processes; positive emotion; negative emotion; positive feeling; anxiety; optimism; anger; sadness; past; religion; present; death; future; swear.
the liwc sentiment analysis shows that affect and positive emotions on twitter predominate, that the tweets mainly radiate optimism, and that anxiety, anger and sadness play a minor role. compared to other blogs and to a representative number of random tweets, our apocalypse tweets prove to be fairly moderate, both in a positive and negative sense. in other words: emotions never run high.
most tweets dwell on the present and then the future, which represents an average distribution for blogs and random tweets. verbal abuse is infrequent. death is not a dominant issue, though it scores above average compared to blogs and random tweets. in the liwc analysis of apocalypse twitter messages, feelings of faith stand out the most. compared with a score of . for an average blog, and even . for a control group of random tweets, a score of . can be called high. in this case, it does not concern deep-rooted religious beliefs, but the fact that people often say they do not believe an apocalypse to be imminent. all in all, however, one should never confuse opinions and sentiments on twitter with the opinion of an entire nation: twitter is not the people.
conclusion
looking at the dutch tweets from the two weeks around december , we can conclude that only a very small percentage ( %) of the dutch people on twitter had real concerns about the supposedly imminent end of times. judging from their profiles, we are dealing with adolescents, mainly girls. the subject of the apocalypse produced a lot of short messaging on twitter (over , tweets), peaking at the th and st of december – that is: the day before and the day itself – but the general trend was that the twitter users took the prediction of the end of days with a grain of salt (quite unlike the predominantly indignant and appalled reactions to the product rumours about whatsapp and pork fat). in the tweets, positive emotions and belief in a good outcome dominated. new age believers in this profane apocalypse could expect to be faced with humor, sarcasm and mockery. serious religious responses on twitter were negligible – muslims responded more often than christians. in a sentiment analysis, the tweets' high score on religion is mainly due to the frequent lack of belief in the upcoming destruction of the world. in the end, twitter showed that the end of times on the st of december in the netherlands was ultimately a media hype rather than a dreaded belief event.
about this program, see tausczik & pennebaker (p. : liwc is pronounced as “luke”), and for the dutch version zijlstra, van middendorp, van meerveld & geenen .
see mustafaraj, finn, whitlock & metaxas , who for instance distinguish between the “vocal minority” and the “silent majority”; furthermore duits ; scheele (only % of the dutch people online use twitter; only % of them are between and years old).
acknowledgements
this research was conducted in the context of the so-called tinpot project ( - ). the tinpot project was carried out at the meertens institute in amsterdam, in collaboration with the university of twente and the company teezir bv, from utrecht, and was funded by the royal netherlands academy of arts and sciences (knaw) as a public-private collaboration digital humanities project (uva, vu and knaw). cooperating in the tinpot project were theo meder, dong nguyen, dolf trieschnigg, rilana gravel, charlotte van tongeren, daphne van kessel, leonie cornips, marc van oostendorp, and thijs westerveld.
literature
becker, hila, mor naaman & luis gravano: ‘beyond trending topics: real-world event identification on twitter’, in: proceedings of icwsm, barcelona, spain ; http://sm.rutgers.edu/pubs/becker -icwsm .pdf
boyd, danah, scott golder & gilad lotan: ‘tweet, tweet, retweet: conversational aspects of retweeting on twitter’, in: hicss- . ieee: kauai, hi, january , ; http://www.danah.org/papers/tweettweetretweet.pdf
campion-vincent, véronique: ‘bugarach (aude) et la fin du monde en l’an ’, in: schweizerisches archiv für volkskunde ( ), pp. - .
o’connor, brendan, ramnath balasubramanyan, bryan r. routledge & noah a.
smith: ‘from tweets to polls: linking text sentiment to public opinion time series’, in: proceedings of the international aaai conference on weblogs and social media, washington, dc, may ; http://www.cs.cmu.edu/~nasmith/papers/oconnor+balasubramanyan+routledge+smith.icwsm .pdf duits, linda : ‘les voor journalisten : twitter is niet het volk’, in : diep onderzoek - - ; http://www.dieponderzoek.nl/les-voor-journalisten-twitter-is-niet-het-volk/ gravel, rilana: tell me what you tweet and i tell you who you are: an analysis of language use on twitter. leiden . [unpublished master’s thesis] meder, theo: ‘”you have to make up your own story here”: identities in cyber space from twitter to second life’, in: violetta krawczyk-wasilewska, theo meder & andy ross: shaping virtual lives. online identities, representations, and conducts. lodz , pp. - ; http://depot.knaw.nl/ / mustafaraj, eni, samantha finn, carolyn whitlock & panagiotis t. metaxas: ‘vocal minority versus silent majority: discovering the opionions of the long tail’, in: privacy, security, risk and trust. boston , p. - : http://cs.wellesley.edu/~pmetaxas/silent-minority-vocal-majority.pdf nguyen, dong, rilana gravel, dolf trieschnigg & theo meder: ‘“how old do you think i am?”: a study of language and age in twitter,’ in proceedings of the seventh international aaai conference on weblogs and social media, a; http://dolf.trieschnigg.nl/papers/icwsm. 
.nguyen.pdf nguyen, dong, rilana gravel, dolf trieschnigg & theo meder: ‘tweetgenie: automatic age prediction from tweets’, in: acm sigweb newsletter autumn ( b): ; http://dolf.trieschnigg.nl/papers/sigweb. .nguyen.pdf nguyen, dong, dolf trieschnigg, a. seza dogruöz, rilana gravel, mariët theune, theo meder & francisca de jong: ‘why gender and age prediction from tweets is hard: lessons from a crowdsourcing experiment’, in: coling (computational linguistics); http://www.dongnguyen.nl/publications/nguyen-coling .pdf paul, michael j. & mark dredze: ‘you are what you tweet: analyzing twitter for public health’, in: the proceedings of the th international aaai conference on weblogs and social media (icwsm ), barcelona, spain. july ; http://cs.jhu.edu/~mpaul/files/ .icwsm.twitter_health.pdf sakaki, takeshi, makoto okazaki & yutaka matsuo: ‘earthquake shakes twitter users: real-time event detection by social sensors’, in: www proceedings of the th international conference on world wide web; http://csce.uark.edu/~tingxiny/courses/ spring /readinglist/www .pdf scheele, brian: ‘twitter, de stem van het volk?’, in: data denkers - - ; http://datadenkers.wordpress.com/ / / /twitter-de-stem-van-het-volk/ tausczik, yla r. & james w. pennebaker: ‘the psychological meaning of words: liwc and computerized text analysis methods’, in: journal of language and social psychology ( ) , pp.
- ; http://homepage.psy.utexas.edu/homepage/faculty/pennebaker/reprints/tausczik&pennebaker .pdf visser, petra de: een beter milieu begint bij… de verantwoordelijkheid voor de ecologie in eindtijdvoorspellingen voor . groningen [unpublished master’s thesis]. voets, johan: ‘twitter radar is buienradar volgens twitterend nederland’, in: numrush; february ; http://numrush.nl/ / / /twitter-radar-is-buienradar-volgens-twitterend-nederland/ zijlstra, hanna, henriët van middendorp, tanja van meerveld & rinie geenen: ‘validiteit van de nederlandse versie van de linguistic inquiry and word count (liwc). een experimentele studie onder vrouwelijke studenten’, in: nederlands tijdschrift voor de psychologie ( ), pp. - ; http://link.springer.com/article/ . % fbf #page- discover the digital scholarship center (disc) wendy mann, debby kermer, joy suh | datahelp@gmu.edu university libraries | george mason university dsc.gmu.edu software & technology • gis, statistical, qualitative, etc.
• collaborative workroom • scanners for digital projects • lab with windows and mac workstations consultations & workshops consultations: https://dataservices.gmu.edu/consulting workshops: https://dataservices.gmu.edu/workshops research data management • data management plans • metadata creation • long-term data preservation • archive & share data in mason’s institutional repositories data • data discovery & acquisition • text & numeric data • data cleaning & management • quantitative & qualitative analysis gis • geospatial data • mapping • spatial analysis digital humanities • digital tools • digital projects • data visualization • text mining & analysis data services fenwick library dataservices.gmu.edu what is digital scholarship? creating, finding, and using data; data management and curation; geographical information systems (gis); digital scholarship (humanities, social sciences and sciences); digital projects planning and management; and related scholarly communication issues (e.g., open access, open data). who do we help? disc works with students, faculty, and staff doing digital scholarship in the humanities, social sciences, and sciences. our clients use and can get assistance with software such as: arcgis, qgis, spss, stata, sas, nvivo, atlas.ti, qda miner, voyant, gephi, mallet, etc., and related functions in excel, python, and r. where are we located? fenwick library: a newly renovated space on the second floor of fenwick library. university libraries built disc by combining our existing data services program with two new lab areas. chapter acrl’s scholarly communications roadshow bellwether for a changing profession joy kirchner university of british columbia library kara j.
malenfant association of college and research libraries creative commons attribution-noncommercial (cc by-nc) http://creativecommons.org/licenses/by-nc/ . / common ground at the nexus of information literacy and scholarly communication introduction at its heart, the acrl scholarly communications roadshow program highlights the need to redefine what it means to be a librarian in the twenty-first century. for over a decade, the association of college and research libraries (acrl) has been committed to its scholarly communication initiative as one of its highest strategic priorities. professional development and continuing education for academic librarians are cornerstones of the initiative. the roadshow’s responsive curriculum has grown to support academic librarians as they stretch their professional muscles in new ways. attuned to a changing community, roadshow presenters continuously update the curriculum, and it has shifted focus from imparting a basic awareness of the dynamics in the current system of scholarly communication to facilitating participants’ deeper understanding and engagement or commitment to changing the system. more than meeting the community where it is, the roadshow program challenges participants to assume ever more active roles in accelerating the transition to a more open system of scholarship. the roadshow program has set goals to stimulate new thinking about the future of library services, to provide practical ideas on developing services, and to discuss emerging themes, such as the use of alternative metrics in reward systems and the intersections of scholarly communication and student learning. through the roadshow, acrl not only reached those who may not attend national conferences or work at large research universities, but also asserted that scholarly communication issues are central to the work of all academic librarians and all types of institutions.
in this chapter, we describe how the program has evolved to support academic librarians as they assume new roles as contributors to knowledge creation, advocates of sustainable models of scholarship, and partners of faculty in both research and educational processes. background and context within acrl’s current strategic plan, there are three primary goal areas of focus for – : the value of academic libraries, student learning, and the research and scholarly environment. the goal for the research and scholarly environment strategic area is, “librarians accelerate the transition to a more open system of scholarship.” the specific objectives are: 1. model new dissemination practices. 2. enhance members’ ability to address issues related to digital scholarship and data management. 3. influence scholarly publishing policies and practices toward a more open system. 4. create and promote new structures that reward and value open scholarship. (acrl ) this commitment to hastening a more open system of scholarship is not new. acrl has long endeavored to reshape the system of scholarly communication, focusing on the areas of education, advocacy, coalition building, and research. starting in january , an acrl task force on scholarly communication began discussing how acrl might contribute to shaping the future of scholarly communication and stated that such discussion “requires envisioning what such a future might be like” (english et al. , ). in its january report to the acrl board, the task force determined that the issues surrounding scholarly communication and publishing were of major import to acrl members. the task force recommended that acrl, as one of its highest strategic priorities, be actively engaged in working to reshape the current system of scholarly communication, with activities to include educational work, political advocacy, coalition building, and research.
in describing the broad-based educational work, the task force identified a new role for acrl: given the complexity of these issues, and the importance of working on them in a sustained way over time, we believe there is a critical need for acrl to mount ongoing programs to educate academic librarians about scholarly communication issues and for acrl to create support mechanisms, programs, and publicity efforts to help make faculty researchers and higher education administrators more aware of the importance of these concerns. acrl’s broad membership base and its strong record in programming and continuing education puts the association in a unique position to be effective in these areas. (english et al. , ) based on the recommendations in the report, acrl launched its scholarly communication initiative in spring as one of its highest strategic priorities. acrl’s new standing committee of the board, the scholarly communications committee, then focused on continuing education for academic librarians by developing a preconference for the ala annual in orlando, florida: scholarly communication : an introduction to scholarly communication issues and strategies for change. presentations from this preconference included: • anatomy of a crisis: dysfunction in the scholarly communications system (by lee van orsdel) • copyright, licensing, and information policy: mine, mine, and well, mine! (by dwayne buttler) • fostering a competitive market (by ray english) • open access (by karen williams) • scholarly communication: legislative and political advocacy (by james g. neal) • scholarly communication: strategies for change (by james g. neal) these presentation materials became the foundation of the acrl scholarly communication toolkit, which was launched in march to support advocacy efforts for academic and research libraries.
the path from this initial preconference to creating a sustained roadshow workshop with a “ ” basic level approach was not entirely linear, and next we describe the stages leading up to it. in addition to offering this very first preconference in aimed at a basic -level education, members of the scholarly communications committee, together with staff, began exploring a new project with the association of research libraries (arl) to jointly promote the development of library-led outreach programs. the two organizations recognized a shared concern for supporting academic and research libraries in their growing efforts to develop campus outreach programs. through the arl/acrl institute on scholarly communication (isc), the organizations have sought to aid libraries in developing their outreach programs by offering websites with resources and planning guides, topical webcasts, workshops, and an immersive learning experience. this signature two-and-a-half day event, first offered in july , prepares participants to become local experts within their libraries and provides a structure for developing a program plan for scholarly communication outreach that is customized for each participant’s institution. many of the members of acrl’s scholarly communications committee worked as faculty to design and deliver initial offerings of the immersive event for the isc. in this capacity, they recognized the wide variance in background understanding and engagement in scholarly communications as a critical perspective for academic libraries and librarians. they saw a strong need to provide librarians with contextual understanding in order to help them take action and develop campus outreach programs.
while many librarians understood that copyright, information economics, business models, open access, and other scholarly communications issues are important, they did not have enough background in these issues to begin taking action in their own library and campus settings. many academic librarians, therefore, continued to require a basic approach before being able to benefit from the more advanced work on program planning offered via the isc. to help this segment of the community, acrl committee members decided in to return to the “ ” idea and develop a workshop specifically targeting librarians who were new to scholarly communications issues. it was felt that such a program could serve as a bridge course toward more advanced opportunities such as the isc. as one way to understand the varying levels of readiness within the community, we looked to an article by joyce ogburn ( ), which has served as a cornerstone text for the isc. in it, she proposes a series of five stages through which libraries, by programmatic efforts, will advance: 1. awareness: having basic knowledge of the issues 2. understanding: higher order of knowledge, intelligence, and appreciation 3. ownership: commitment and obligation 4. activism: goal-directed, concerted, and purposeful action 5. transformation: attainment of a profound alteration of assumptions, methods, and culture defining and applying these stages, she wrote, “can help establish and guide a program by setting direction and goals, tracking progress, identifying landmarks, and noting achievements…the stages reflect an evolution from local action to collaborative efforts with the goal of achieving widespread change” (ogburn , ). these stages provide a useful theoretical framework against which to consider how acrl’s curriculum for the roadshow has evolved to support a community in transition, as we’ll describe next.
from conference workshop to roadshow the roadshow curriculum was initially developed by members of the acrl scholarly communications committee in a proposal for a basic half-day workshop offered in person as part of the acrl th national conference, push the edge: explore, engage, extend, in seattle, washington, march – , . workshop leaders, known experts from the committee, developed the curriculum based on learning outcomes and speaker guidelines delineated in the proposal. two members of the committee, joy kirchner (university of british columbia) and lee van orsdel (grand valley state), worked in consultation with staff liaison kara malenfant to lead and guide the development of the program in accordance with the committee’s goals and in keeping with acrl’s commitment to continuing education in this area. they created a twofold vision for the workshop: • develop an acrl educational offering that provides the library community with a well-developed basic scholarly communications program. • use the workshop as an opportunity to broaden expertise in scholarly communications by seeking out new, but knowledgeable and engaged, librarians for whom this opportunity to present would be good national-level exposure. partner these new librarians with seasoned scholarly communications committee experts or faculty from the isc. the workshop was titled “scholarly communication ” and was developed with the possibility of future offerings in mind. two other presenters joined kirchner and van orsdel in seattle: sarah shreeves (university of illinois at urbana-champaign and a faculty member with the isc), and molly keener (wake forest university), a newcomer. the four presenters worked together to develop the following modules for the half-day workshop: 1. introduction and economic issues 2. open access and openness as a principle 3. copyright and intellectual property 4.
new modes and models of scholarly communication while it was in development, the presenters discussed the upcoming workshop with the acrl scholarly communications committee at a january meeting. then-arl staff member karla strieb (née hahn) commented that librarians at the workshop might eagerly approach the presenters and invite them to offer a reprise on their campuses. sensing an opportunity to further acrl’s strategic goals by taking the workshop out and extending its reach, malenfant suggested the committee develop an acrl-subsidized roadshow program that institutions could apply to host. this plan was enthusiastically endorsed by the acrl scholarly communications committee and implemented quickly thereafter. in early march , acrl announced that it would carry the costs to take the workshop, “scholarly communications : starting with the basics,” on the road to five locations, chosen through a competitive process, in summer . promotion was queued up so that the roadshow was advertised with flyers and announcements at the acrl th national conference in seattle. in preparing to take the workshop on the road, presenters adapted the curriculum based on what they had learned. additional presenters were recruited from available isc faculty: kevin smith (duke university) and terri fishel (macalester college). when announced, the roadshow was promoted in this way: led by two expert presenters, this structured interactive overview of the scholarly communication system highlights individual or institutional strategic planning and action. four modules focus on new methods of scholarly publishing and communication, copyright and intellectual property, economics and open access.
as a result of the workshop, participants will understand scholarly communication as a system to manage the results of research and scholarly inquiry, enumerate new modes and models of scholarly communication and select and cite key principles, facts and messages relevant to current or nascent scholarly communication plans and programs at their institutions. “scholarly communication ” is appropriate for those with new leadership assignments in scholarly communication as well as liaisons and others who are interested in the issues and need foundational understanding. (ala ) mentoring new presenters in addition to developing programming that would educate librarians with new responsibilities for scholarly communication, the roadshow has also served as a vehicle for directly mentoring newer librarians by expanding the presenter pool to bring in different areas of expertise within scholarly communication at large. to that end, a call went out to both faculty members of the isc and the acrl scholarly communications committee asking for recommendations for new presenters. these calls resulted in recruiting a newer librarian, keener, to be part of the group designing and delivering the workshop at acrl national conference . once the roadshow was launched, more presenters were needed, resulting in a similar call to isc and committee members to recruit an additional newer librarian, molly kleinman (university of michigan). members of the committee mentored keener and kleinman as appropriate in both developing the curriculum and in teaching. as the roadshow continued, the team of expert presenters was enlarged to accommodate an expanded program and replace those who discontinued their service to the program.
a model for expanding the pool was discussed by the acrl scholarly communications committee, where it was decided that a formal selection process and mentorship program should be integrated into the roadshow program, with specific funding earmarked for this purpose. acrl sent out an announcement seeking expressions of interest from prospective presenters to all major scholarly communication lists in march . this opportunity was also widely advertised at the acrl conference, and a formal selection and interview process took place for two new presenters over a two-year period. ada emmett (university of kansas) was selected in , and stephanie davis-kahl (illinois wesleyan university) was selected in . program revision in constant revision and updates to the program have been a critical staple in the roadshow curriculum development. workshop presenters are active in developing the program because they are keenly aware of how quickly the scholarly communication arena is evolving. they collaborate frequently to reflect on the program deliverables, determine what improvements are necessary, and revise the program and handouts as new information emerges in this arena. they are attuned to the shifts they are observing in the community over time as library programs evolve through ogburn’s stages. recognizing this evolution relies on more than just a tacit sense; there is data to support the observation that libraries are becoming more engaged and taking on more activities related to scholarly communication education and outreach. prior to each roadshow, participants are asked to identify one person from each library to answer a series of questions, a census if you will. the purpose is to better understand the state of scholarly communication education and outreach efforts at the library level in the short term and the long term.
the online questionnaire presents a checklist of some eighteen scholarly communication activities (e.g., outreach events for faculty on scholarly communication topics, an institutional repository, a fund to pay author fees for open access journal publishing, etc.) and asks the submitters to identify their library’s current activities and its future plans. in nearly all cases, there has been an increase in the number of libraries offering these activities over the last four years. (for complete text of questions and data underlying the graph in figure . , see appendix . : responses to pre-workshop questionnaire on library-level engagement.) given this evidence that libraries felt a sense of ownership and were increasingly committing resources to implement education and outreach activities, it was felt that the community had largely moved beyond ogburn’s first stage of awareness of the issues and that it was time to shift the program offerings from a -level curriculum aimed at basic knowledge deliverables to a more advanced program. acrl sought to marshal resources to do this work well and named kirchner, who had been acting as coordinator of the presenter group, as acrl visiting program officer to lead that change. after three years of revisions, in the roadshow was substantially modified and renamed, “scholarly communications: from understanding to engagement.” this title dropped the designation and the term basics to reflect the transition from the program’s earlier goal of providing a base-level understanding of scholarly communication. as of , the program is now a more robust professional development offering and has extended from a half-day to a full-day workshop. new learning objectives were crafted to better reflect new deliverables (see appendix . .).
[figure . library-level activities of roadshow participants: percentage of libraries answering yes to each activity. activities shown: have held outreach events for faculty on sc topics; have held outreach events for students on sc topics; have held education events for library staff on sc topics; include sc topics in info lit instruction sessions; have a library web presence on sc aimed at campus; have a library web presence on sc aimed at library staff; job descriptions for library staff include sc duties; have assigned library staff members to be responsible for sc; have a library committee on sc that includes library staff; have a library committee on sc that includes other campus; offer services, such as copyright, author rights, etc.,; offer services on open access mandates compliance; offer services that support data management plan; have an institutional repository; have a fund to pay author fees for open access journal; library serves as publisher for new models of sc; discussions with faculty leadership regarding open-access; member of sparc (the scholarly publishing and academic; member of the alliance for taxpayer access] the roadshow is now aimed at those with administrative responsibilities or new leadership assignments in scholarly communication or digital publishing, as well as liaisons and any others who are seeking to advance their professional development in scholarly communication. broad goals of the revised program are designed to stimulate new thinking about the role of scholarly communication in the future of library services, to provide practical ways for participants to develop service models for scholarly communication in their libraries, and to empower participants to help accelerate the transformation of the scholarly communication system. as the program matured, acrl introduced a cost-sharing model to align the program more closely with other acrl professional development opportunities.
(acrl is committed to underwriting the bulk of the costs for delivering the roadshow, and the cost for the five successful host institutions is $ , . separate from this competitive application process, acrl will now offer the program at full cost to institutions wishing to license it.) the revised workshop was piloted at the ala midwinter meeting and was one of the best-attended acrl daylong offerings at an ala midwinter meeting in several years. in evaluating this offering, a standard acrl instrument was used to allow data to be collected in a way that would tie the data back to acrl’s key performance indicators for professional development programs. looking back the program was initially developed to help libraries that were just starting to consider how to develop campus outreach programs, with an aim of supporting ogburn’s first stage (awareness: having basic knowledge of the issues). however, it quickly evolved to assist participants in thinking through service models for their scholarly communication activities and rapidly began to incorporate a higher level of knowledge and appreciation of the more nuanced aspects of scholarly communication. for instance, while open access awareness and education was the chief discussion point in the first roadshow, by , presenters became aware that open access is largely well understood and that there was a need to shift that segment of the workshop to focus more on emergent areas and the politics of open access and openness in practice. the presenters have been increasingly challenged with developing a curriculum to suit all library types. they have increasingly recognized, throughout the four years of the roadshow, that scholarly communication is no longer the focus of just large, research-intensive institutions.
accordingly, the program evolved to broaden the discussion from a publishing and research perspective to encompass more of a teaching and learning perspective. this redesign allows it to resonate more with librarians undertaking scholarly communication activities at institutions with a primary focus on undergraduate education. the curriculum also evolved to include more material designed to assist liaison librarians who are working with students and with faculty as teachers, not just researchers. the presenters further redesigned the material to be applicable to any size institution. to further understand how the roadshow could better support liberal arts colleges, kirchner recruited davis-kahl (prior to her selection as a new presenter) to help the committee gather information about scholarly communication activities, priorities, needs, and current programs at small liberal arts colleges as a way of guiding our future training efforts. this work is currently underway in summer . in addition to adjusting the curriculum to meet the needs of librarians at different types of institutions, the presenters have discussed how to address the differences within disciplines regarding scholarly communications. it seems very important, and still more aspirational than real, for the roadshow to help librarians deal with the actual conditions and variance in attitudes regarding scholarly communication in art history, english, biology, or physics, for example. while presenters have declared themselves eager to address disciplinary differences, the best method of doing so is unclear. they see disciplinary differences as an important aspect of the intersection between scholarly communication and information literacy, and it is a recurring theme during debriefing calls and retreats.
emerging themes in in , the curriculum was reshaped to build in more engagement with participants on how their libraries could create value-added services in the system of scholarship. this included thinking beyond open access and institutional repositories to consider other mechanisms to enhance knowledge exchange and mobilization, new forms of both creation and dissemination of scholarship, and means for tracking those developments on our own campuses. the presenters more deliberately included case studies in the curriculum to both instigate discussion and showcase how other institutions created such value-added services as supports for the open exchange of scholarship, open education services, publishing services, and copyright services. several emerging themes surfaced by . these include e-science, data management, scholarly communication as it relates to student learning, and how emerging alternative metrics to evaluate scholarship may change faculty reward systems (e.g., promotion and tenure). while this chapter cannot explore each of these emerging themes in depth, we chose to focus on two that are relevant, given the subject of this book. first, we look at the emerging theme of scholarly communication and student learning. the roadshow program saw an increased interest in developing scholarly communication programs that focused on undergraduate publishing support as a result of the increased number of institutions placing strategic emphasis on undergraduate research. as a result, the roadshow provided more emphasis on ways in which scholarly communication programming can support such institutional imperatives. the roadshow presented case studies on how scholarly communication librarians or liaison librarians are working with faculty to provide avenues to give their undergraduate students publishing experience, typically through open access avenues.
examples include faculty who have created assignment-based models ranging from student article submissions to open access student journals to the launching of a student open access journal where students are assigned specific editorial roles as defined in such open access journal programs as open journal systems. other examples include student submission of exemplary undergraduate student work in institutional repositories. still other faculty are providing their students with opportunities for publishing experience through other “open avenues,” such as wikis or through submission to wikipedia. next we look at another emerging theme around the use of alternative metrics in rewarding and valuing open scholarship. the roadshow has always addressed the role of promotion and tenure in the segments on the system of scholarly communication and as an influencing factor in the economics of traditional scholarly publishing. however, in the most recent cycle of roadshows, there was increased interest in delving more deeply into exploring programmatic roles for libraries and librarians in promotion and tenure arenas. through facilitated dialogue, presenters and participants explore a role for libraries in assisting promotion and tenure committees with the evaluation of newer forms of scholarship. as promotion and tenure committees are increasingly faced with evaluating newer forms of digital scholarship, libraries could potentially play a role in providing context and understanding of new models of scholarship and supporting alternative metrics (altmetrics) on their own campuses as a means of offering support for scholarship or promotion and tenure cases that are not well supported by traditional citation metrics.
discussion included how libraries can play a role in supporting or creating altmetrics to provide other avenues to demonstrate impact of an author’s or creator’s work beyond traditional avenues and how such models would be especially useful for those faculty seeking to demonstrate value for new models of scholarship. presenters and participants have also discussed collections statistics, institutional repository statistics, and how libraries can utilize, support, or contribute to the growing number of emerging altmetric tools in development.

looking forward

to a large degree, the roadshow program focuses on transitions occurring in research, publishing, teaching, and learning practices brought about by new technologies. those changes, and the need to both respond to them and proactively shape a future that fully leverages the affordances inherent in new technologies, are at the heart of the roadshow programming. the roadshow curriculum is likely to evolve to capture more thinking about the following trends:

1. value-added library services and mechanisms to enhance knowledge exchange, translation, dissemination, and mobilization, especially to support open exchange of research and scholarship. linked to this discussion is the growing importance of the accessibility and reuse of research data as an important emergent and complex new arena in scholarly communication as libraries begin to develop service models in support of data management. the intersection of scholarly communication and data curation will need to be explored.
2. the intersection of information literacy and scholarly communication. acrl has begun to explore this trend through this book and a forthcoming white paper. likely the roadshow program will evolve as these investigations continue.
3. the growing value of “personal collections,” open education models, and open research data. how these collections contribute to scholarship and scholarly practice will likely be tracked in the roadshow program.
4. how actively institutions wish to support, preserve, and promote new forms of scholarship. as colleges and universities are faced with the challenges of reviewing emerging forms of scholarship and scholarly communication for promotion and tenure considerations, they (perhaps with help from their libraries) will need to address this issue.

key questions for future scholarly communication programming will likely include tracking and thinking through the following:

• how is the emerging landscape of scholarly communication and contribution shifting?
• how might promotion and tenure processes be adapted to support knowledge production, transmission, and preservation in an increasingly participatory culture?
• what approaches to promotion and tenure review are being adopted and used by leading institutions in light of the changing landscape of scholarly communication and contribution? are there emerging best practices at the disciplinary level that might serve as a model for others?
• what metrics of scholarly communication and impact will be relevant for promotion and tenure committees in a shifting landscape of scholarly communication? how will this differ by discipline? what role can librarians play in providing altmetrics in support of new models of scholarship?
• what is the role of community engagement in emerging forms of scholarly communication?
• in what ways can libraries assist with supporting sustainable scholarship in both its emerging formats and traditional formats?

conclusion

in its fourth year, and with the workshops completed, the roadshow will have visited seventeen different states, the district of columbia, one us territory, and one canadian province. the twenty workshops offered over these four years will have reached , participants from different colleges and universities. (for a breakdown, see appendix . .) participants have given consistently high evaluations with comments such as these:

• “i liked how simple the presenters made a very complex subject appear…i hope that i can do the same in the future.”
• “it helped me connect issues in a coherent way—the relationship between open movement, copyright, economics etc.—good to have a conceptual framework.”
• “my epiphany moment was how much faculty plays a role and how, as a library, we can engage faculty in these discussions.”
• “i came away with concrete ideas to take back to my campus. many time [sic] at conferences or workshops i come away inspired but lacking in concrete solutions or initiatives. this time i was not only informed and inspired, but came away with ideas appropriate for my institution.”
• “the two presenters were stunningly knowledgeable, but also very accessible and willing to field questions as they arose. great information presented. i came back energized and fired up.”

while it is clear that the roadshow has been a catalyst for many participants to create or expand scholarly communication programs in their own libraries (vandegrift and colvin ), there have also been some positive unexpected outcomes. presenters have heard that simply seeing the advertisement itself spurred some institutions to take scholarly communication more seriously. some prospective hosts, whether selected or not, reported that the act of applying (and securing partners for their application) has been a springboard for beginning their own local scholarly communication educational programs. several unsuccessful applicants, for instance, went ahead and launched their own local “roadshow” workshops. we have encouraged this by adding roadshow materials to the acrl scholarly communication toolkit under a creative commons license.
in extending the reach of the roadshow this way, we hope that librarians will make use of these tools, including short videos, presentation templates, and handouts, to enhance their own knowledge or adapt them to offer related workshops on their own campuses.

from a library association perspective, the roadshow has been an extraordinary opportunity to support members in a much-needed way. it has directly supported acrl’s strategic priorities, and the responsive curriculum is a model for how the association can meet the changing reality of our work as academic librarians. by subsidizing the roadshow, acrl has reached those who may not attend national conferences or work at large research universities. through the roadshow, acrl intends to send a clear message that scholarly communication issues are central to the work of all academic librarians and all types of institutions. acrl challenges all librarians to extend their curiosity and be more responsive to their community, finding appropriate insertion points where there is a need on their campus. through the combination of excellent presenters and forward-thinking curriculum, acrl is supporting members of our profession as they assume new roles as contributors to knowledge creation, advocates of sustainable models of scholarship, and partners of faculty in both the research and educational processes.

acknowledgements

the authors wish to acknowledge the excellent work of members of the roadshow presenter team who have undertaken the efforts described in this chapter with dedication and verve. they are ada emmett, stephanie davis-kahl, molly keener, joy kirchner, sarah shreeves, kevin smith, lee van orsdel, and past presenters molly kleinman and terri fishel.
appendix . responses to pre-workshop questionnaire on library-level engagement

columns, repeated for each of four sets of responses (the last set marked *): yes | no | no, but planned in the next months

have held outreach events for faculty on scholarly communication topics: % % % | % % % | % % % | % % %
have held outreach events for students on scholarly communication topics: % % % | % % % | % % % | % % %
have held education events for library staff on scholarly communication topics: % % % | % % % | % % % | % % %
include scholarly communication topics in information literacy instruction sessions for students: % % % | % % % | % % % | % % %
have a library web presence on scholarly communication topics aimed at campus community: % % % | % % % | % % % | % % %
have a library web presence on scholarly communication topics aimed at library staff only: % % % | % % % | % % % | % % %
job descriptions for library staff include scholarly communication duties: % % % | % % % | % % % | % % %
have assigned library staff members to be responsible for scholarly communication activities: % % % | % % % | % % % | % % %
have a library committee on scholarly communication that includes library staff only: % % % | % % % | % % % | % % %
have a library committee on scholarly communication that includes other campus stakeholders (e.g., faculty, editors, university press, research office): % % % | % % % | % % % | % % %
offer services, such as copyright, author rights, and/or open access mandates compliance advising: % % % | % % % | % % % | n/a n/a n/a
offer services on open access mandates compliance or advising: n/a n/a n/a | n/a n/a n/a | n/a n/a n/a | % % %
offer services that support data management plan compliance or advising: n/a n/a n/a | n/a n/a n/a | n/a n/a n/a | % % %
have an institutional repository: % % % | % % % | % % % | % % %
have a fund to pay author fees for open access journal publishing: % % % | % % % | % % % | % % %
library serves as publisher for new models of scholarly communication (e-journals, etc.): % % % | % % % | % % % | % % %
discussions with faculty leadership regarding an open access resolution for my campus: % % % | % % % | % % % | % % %
member of sparc (the scholarly publishing and academic resources coalition): n/a n/a n/a | n/a n/a n/a | % % % | % % %
member of the alliance for taxpayer access: n/a n/a n/a | n/a n/a n/a | % % % | % % %

appendix .
roadshow learning objectives

overall program learning objectives
participants will:
• enhance understanding of scholarly communication as a system to manage the results of research and scholarly inquiry.
• increase their ability to examine, and initiate or support, new models of scholarly communication (e.g., research and social interaction models such as blogs, new ways of peer review).
• select and cite key principles, facts, and messages relevant to their own scholarly communication plans and programs (current or nascent).
• identify concrete actions that they may take back to their institutions and in their positions to help accelerate the transformation of the scholarly communication system.

module learning objectives

scholarly communication system module
participants will:
• understand that the scholarly communication system is made up of many interlocking systems
• understand the basic, traditional iterations in the life cycle of scholarship
• identify how disruptions are changing the traditional system of scholarly communication

economics module
participants will:
• understand some of the basic economic realities of the traditional scholarly publishing system
• recognize the connection between authors’ copyright management practices and monopolistic pricing in the scholarly journal market
• consider and reflect on alternative models and funding sources for scholarly publishing

copyright module
participants will:
• understand how copyright arises and identify types of material that are likely to be subject to copyright protection
• identify the likely copyright owners of academic works and have a reasonable awareness of the rights attendant on such protection
• be familiar with rights transfer and retention language commonly used in publishing contracts

open and openness module
participants will:
• understand the conceptual underpinnings of open movements
• understand what the open access and public-access movements are
• identify current events within the open- and public-access movements
• identify other open movements

faculty and student engagement module
participants will:
• identify and examine current models and programming that support “openness”
• explore new models and tenure and promotion considerations
• explore models that you might consider piloting or experimenting with
• consider what next steps you might take

appendix . roadshow hosts and participants

year | host | location | # participants | # institutions
 | atlanta university center robert w. woodruff library | atlanta, ga | |
 | colorado state university | pueblo, co | |
 | james madison university | harrisonburg, va | |
 | university of new mexico | albuquerque, nm | |
 | university of toronto | toronto, on | |
 | city university of new york ( colleges) | brooklyn, ny | |
 | washington research libraries consortium | washington, dc | |
 | university of hawaii at manoa | honolulu, hi | |
 | st. thomas university | st. paul, mn | |
 | academic library association of ohio | columbus, oh | |
 | auraria library | denver, co | |
 | bryan college | dayton, tn | |
 | florida state university | tallahassee, fl | |
 | kansas state university | manhattan, ks | |
 | lehigh valley association of independent colleges | bethlehem, pa | |
 | acrl louisiana chapter | baton rouge, la | |
 | state university of new york | buffalo, ny | |
 | texas tech university | lubbock, tx | |
 | university of puerto rico at mayagüez | mayagüez, pr | |
 | washington university | st. louis, mo | |
total | | | , |

notes

when the workshop was offered in march , acrl was operating under its previous strategic plan, “charting our future: acrl strategic plan ” (acrl ). that plan contained forty strategic objectives, and in order to focus energies, in may the board identified six as strategic priorities for – (see acrl ).
identifying the top priorities further supported acrl’s decision to invest in offering the workshop as a roadshow because it directly addressed one of these six: “enhance acrl members’ understanding of how scholars work and the systems, tools, and technology to support the evolving work of the creation, personal organization, aggregation, discovery, preservation, access and exchange of information in all formats.”

examples showcased include university of british columbia’s dr. jon beasley murray’s undergraduate wikipedia assignment for his spanish literature class (jbmurray ).

see altmetrics at http://altmetrics.org/tools for a growing list of alternative metric tools in development.

the acrl scholarly communication toolkit is available at http://scholcomm.acrl.ala.org/.

references

ala (american library association). . “acrl offers scholarly communication road show.” news release. march . http://www.ala.org/news/news/pressreleases /march /acrlscroadshow.

acrl (association of college and research libraries). . “charting our future: acrl strategic plan .” approved june . last updated may , . http://www.ala.org/acrl/sites/ala.org.acrl/files/content/aboutacrl/strategicplan/acrl-sp- - .pdf

acrl (association of college and research libraries). . “strategic priorities: – .” last updated may . http://www.ala.org/acrl/sites/ala.org.acrl/files/content/aboutacrl/strategicplan/priorities .pdf.

acrl (association of college and research libraries). . “acrl plan for excellence.” april. http://www.ala.org/acrl/aboutacrl/strategicplan/stratplan.

buttler, dwayne. . “copyright, licensing, and information policy: mine, mine, and well, mine!” presentation at scholarly communication : an introduction to scholarly communication issues and strategies for change, preconference for the ala annual meeting, orlando, fl, june.

english, ray. .
“fostering a competitive market.” presentation at scholarly communication : an introduction to scholarly communication issues and strategies for change, preconference for the ala annual meeting, orlando, fl, june.

english, ray, karyle butcher, deborah dancik, james neal, and catherine wojewodzki. . report of the acrl scholarly communications task force. chicago: acrl, january. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/issues/scholcomm/doc . .pdf.

jbmurray. . “user:jbmurray/madness.” wikipedia user page. last updated may . http://en.wikipedia.org/wiki/user:jbmurray/madness#first_steps:_. our. _project_.

neal, james g. a. “scholarly communication: legislative and political advocacy.” presentation at scholarly communication : an introduction to scholarly communication issues and strategies for change, preconference for the ala annual meeting, orlando, fl, june.

neal, james g. b. “scholarly communications: strategies for change.” presentation at scholarly communication : an introduction to scholarly communication issues and strategies for change, preconference for the ala annual meeting, orlando, fl, june.

ogburn, joyce l. . “defining and achieving success in the movement to change scholarly communication.” library resources and technical services , no. (april): – .

vandegrift, micah, and gloria colvin. . “relational communications: developing key connections.” college and research libraries news , no. (july): – . http://crln.acrl.org/content/ / / .full.

van orsdel, lee. . “anatomy of a crisis: dysfunction in the scholarly communications system.” presentation at scholarly communication : an introduction to scholarly communication issues and strategies for change, preconference for the ala annual meeting, orlando, fl, june.

williams, karen. . “open access.” presentation at scholarly communication : an introduction to scholarly communication issues and strategies for change, preconference for the ala annual meeting, orlando, fl, june.
report

the costs of publishing monographs
toward a transparent methodology

february ,

nancy maron
christine mulhern
daniel rossman
kimberly schmelzinger

ithaka s+r is a strategic consulting and research service provided by ithaka, a not-for-profit organization dedicated to helping the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways. ithaka s+r focuses on the transformation of scholarship and teaching in an online environment, with the goal of identifying the critical issues facing our community and acting as a catalyst for change. jstor, a research and learning platform, and portico, a digital preservation service, are also part of ithaka.

copyright ithaka. this work is licensed under a creative commons attribution-noncommercial . international license. to view a copy of the license, please see http://creativecommons.org/licenses/by-nc/ . /. ithaka is interested in disseminating this brief as widely as possible. please contact us with any questions about using the report: research@ithaka.org.

table of contents

executive summary
acknowledgements
introduction
background
methodology
  developing a definition of cost
findings
  press-level observations
  title-level observations
conclusion
appendices
  appendix i: advisory board
  appendix ii: participating presses
  appendix iii: note on methodology

executive summary

the university press business model faces numerous challenges today, with revenues under pressure due to a host of factors, from the decline of bricks-and-mortar stores and shifting library purchase patterns to the still emerging distribution and revenue models made possible by digital books. over the last few years, certain forces have emerged and intensified (federal mandates for open access, declining sales reach, and the desire of university presses to build a greater audience for scholarly works), encouraging university presses to seriously consider what it would take to make their scholarly monographs openly available.

while there have been numerous efforts to understand the costs of publishing a scholarly monograph, this study is unique in that we worked with an advisory group of university press publishers to identify all of the cost components in scholarly monographic publishing and then worked with a wide variety of university presses to calculate their costs for each of those components in a bottom-up fashion.
in april , the andrew w. mellon foundation awarded a planning grant to ithaka s+r to convene a panel of experts to develop a study methodology for determining in as granular a way as possible the true costs of publishing scholarly monographs. the workshop brought together some of the best thinkers on this topic and resulted in a research methodology for this project, also funded by the andrew w. mellon foundation and conducted by ithaka s+r from january through november .

this research project takes on a fundamental question at the heart of any potential new model to support oa monographs: what does it cost to create and disseminate them? the goals of the research were to

• provide a comprehensive list of all of the activities needed in order to produce and disseminate a high-quality digital monograph;
• generate empirical data on what it costs presses today (what activities they are undertaking today) to produce those books; and
• offer recommendations of general principles to guide presses in seeking to establish price points for author-side payments for open access digital monographs.
data gathered from twenty participating presses, all members of the association of american university presses, suggest that monograph publishing today is considerably more expensive than has often been reported anecdotally or in other studies, and certainly more expensive than current price points for publishers with oa models would suggest. the study gathered costs of titles published in fiscal . the data gathered included estimates of staff time, direct expenses (such as the cost of a freelance copyeditor), and press-level overheads (such as legal support or rent).

the most important aspects of this study are the focus on developing a definition for the full cost of publishing, and developing cost estimates for staff and non-staff expenses in core press activities (acquisitions, manuscript editorial, design, production, and marketing), in part by having staff estimate their time spent on monographs and on certain activities involved in producing a book. by using the methodology of calculating the component activities of book publishing among twenty presses, we were able to examine the staff time allocations in bottom-up fashion, thereby separating academic monographs fairly cleanly from the presses’ other activities (including journals and trade publishing).

the study focused solely on the costs of producing the first digital copy of a “high quality digital monograph.” as such, the study excluded trade titles, textbooks and databases. it also excluded works in translation, edited volumes and paperback reprints. working with this data, the research team developed three cost “definitions”:

• basic: which includes the cost of staff time and other non-staff direct costs in acquisitions, manuscript editorial, design, production, and marketing.
• full cost: which includes the cost of staff time and other non-staff direct costs in acquisitions, manuscript editorial, design, production, and marketing; and also includes press-level overheads.
• full cost plus: which includes the full publishing cost, and also includes “in-kind” contributions.

this study of titles across university presses from four category types yielded a wide range of costs per title, from a low of $ , to a high of $ , and the range of costs is wide both within and across groups. ithaka s+r used the categories established by the association of american university presses: group , presses with annual revenue for the books division under $ . million; group , annual revenue for book division between $ . and $ million; group , annual revenues between $ and $ million; and group with annual revenue over $ million. while these averages provide a good sense of the range of costs involved in publishing, note that individual presses will want to develop their own figures. in the study, some presses reported elements as “departmental overhead” that might be best considered “basic” or direct costs.

table . full cost of a high-quality digital monograph (excluding in-kind cost)

group | group average | group median | th percentile | th percentile | highest cost title | lowest cost title
 | $ , | $ , | $ , | $ , | $ , | $ ,
 | $ , | $ , | $ , | $ , | $ , | $ ,
 | $ , | $ , | $ , | $ , | $ , | $ ,
 | $ , | $ , | $ , | $ , | $ , | $ ,

studying the costs of publishing nearly monographs across university presses provides a descriptive account of how university presses of different sizes and with different missions account for the costs of monographic publishing. as such, it offers a model for how a university press can study its cost structure. our hope is that publishers and funders alike will be able to describe more accurately what they are including (and not including) when the elusive subject of publishing costs arises.
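the three cost definitions above are cumulative: each one adds a component to the one before it. a minimal python sketch of that arithmetic follows. the class name, field names, and dollar figures are hypothetical illustrations chosen for this example; they are not data or tooling from the study.

```python
from dataclasses import dataclass


@dataclass
class TitleCosts:
    """Per-title cost components (hypothetical field names)."""
    staff_time: float        # staff time in acquisitions, editorial, design, production, marketing
    direct_non_staff: float  # other direct costs, e.g. a freelance copyeditor
    press_overhead: float    # press-level overheads allocated to this title
    in_kind: float           # in-kind contributions

    def basic(self) -> float:
        # basic = staff time + other non-staff direct costs
        return self.staff_time + self.direct_non_staff

    def full_cost(self) -> float:
        # full cost = basic + press-level overheads
        return self.basic() + self.press_overhead

    def full_cost_plus(self) -> float:
        # full cost plus = full cost + in-kind contributions
        return self.full_cost() + self.in_kind


# Illustrative figures only; they do not come from the report.
example = TitleCosts(staff_time=10_000.0, direct_non_staff=5_000.0,
                     press_overhead=3_000.0, in_kind=1_000.0)
print(example.basic(), example.full_cost(), example.full_cost_plus())
```

keeping the components as separate fields makes it explicit which elements a quoted per-title figure includes, which is the kind of transparency the report asks publishers and funders to provide.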
twenty presses joined this effort to dig deeply into staff, overhead, and other costs and shared data in a common way so that, as a community, we can begin to understand all of the variables in the equation. among the key findings:

• regardless of group type, the largest cost item for university presses is staff time, specifically the time related to activities of acquisitions, the area most closely tied to the character and reputation of the press. this activity is least likely to be outsourced, and is considered to be closely tied to the press’s financial success, since acquisitions editors are the ones with the skill, subject expertise, and relationships needed to attract the most promising authors and topics to the press.
• the working hypothesis at the outset of the study was that larger presses would demonstrate a lower per-book cost, presuming that larger houses are able to work more efficiently due to the economic benefits of scaling. based on the data contributed by the individual presses, the small university presses in group have been able to produce monographs at a lower cost than the other groups. it is impossible to determine whether this signals greater efficiency on the part of the small presses or simply means they underinvest in their publications.
• we looked for significant determinants of cost. while press size, page count, and number of illustrations showed a relationship to cost, other factors, including whether or not the title was a “first book,” whether or not the press was at an institution that required it to pay rent, or whether the press was at a public versus private institution, did not. an examination of disciplines was not conclusive, due to small sample size.

the study also captured costs concerning the press-wide overheads, defined here as departmental overheads and general and administrative expenses (g+a).
these figures are important to consider, since at the most basic level, they are integral to the press functioning as a business. and yet there is still real debate about the best and most equitable way to assign those costs at a product level. further complicating the question is the underlying reason for the costs themselves: should higher per-book costs be interpreted as a sign of press inefficiency, or a reflection of a healthy press, in a position to devote and invest greater resources to its work? in the months ahead presses, funders and university administrators may try to determine how these costs can be best used to construct a system to support author-side subsidy for monographs.

this study suggests the need for further study and discussion of the implications of oa, not just on libraries’ materials budgets, but on:

• the role of marketing in encouraging widespread dissemination of scholarly works. the work of marketing departments serves today to sell books, but in its essence it is a machine intended to reach the widest audience for a given book. if “impact” replaces “sales” as the best way of framing success, what effect may this have on the strategies presses adopt to market monographs? which specific metrics will the presses gather, and what systems will they need to harvest and aggregate these?
• the impact on revenue of introducing oa titles into the list. as publishers introduce oa titles to their lists, most, if not all, will still have print and perhaps even a premium or enhanced digital version for sale via consumer channels. what will the impact on revenue per title be as a result of this? while such sales for an oa title will likely decline, it may depend on the title and the formats offered.
• the credentialing role of publisher. in journals publication, credentialing is discussed primarily in terms of peer review.
for monographs it is important to also articulate the role of the acquisitions editor in identifying and developing scholarly work, the place of selection by an editorial board, and the way in which processes such as copyediting, design, and marketing also assure the quality of a work. as monographs transform into works of long-form digital scholarship, the values behind these processes of selection and quality control may stay the same, but the nature of the actual processes is likely to change to reflect the new demands of multimodal works.

acknowledgements

so many people contributed to this study in ways both big and small, but mostly big. first, sincere thanks to don waters and helen cullyer at the andrew w. mellon foundation, for entrusting this project to ithaka s+r. the advisory group that participated in the planning grant and the advisory committee for the current study included press directors who are deeply engaged in running their own businesses and sharply alert to the significance of the study. they have from the start made themselves available, both in person and via email, and their guidance in shaping the study and interpreting its findings has been invaluable.

the group of press directors who volunteered to take this adventure with us was exceptionally generous in the time and thoughtfulness they applied to this work, both in talking with us, in welcoming us to their offices, and most of all, in supporting this project and granting us access to their most valuable resource, their colleagues. our whole research team wishes to thank the press staffers who shared their experiences with us. while our main objective was to develop sound financial data, the staff meetings offered an opportunity for us to hear from the field’s most experienced professionals and its newest members. we learned a great deal from them about the cost drivers of publishing, but also about the passion and expertise they bring to their work.
the costs of publishing monographs introduction as funders and academic organizations have begun to address the implications of open access to scholarly resources, one question comes up repeatedly: what does it cost to create, produce and disseminate—to publish—a high-quality digital monograph? this study was conceived and designed as a means to capture not just a single bottom line figure, but to capture the component costs of creating a scholarly book. the “crisis” of university press publishing may well be, when viewed over the long term, more of a chronic condition. and yet, the current model faces particular challenges that require attention. the prevailing business model for many years, still largely in place today, assumed that a press would invest in creating a book, reproduce it in print form, and manage its distribution through relationships with scholars and scholarly institutions, in particular their libraries and the wholesalers who served them. sales through wholesalers, retailers, institutions, and individuals would generate the revenue to cover the initial investment, if not more. but the number of units sold per book has declined over time. university press directors describe a downward trajectory, perceptible over the past decade or so: what used to be a typical sale in the low thousands of books, is now often in the low hundreds. and yet the effort and investment needed to create that book has not decreased in step with declining revenues, leading to a financial situation many publishers understand to be unsustainable. while publishers are actively seeking solutions to counter declining sales of monographs, there is a growing expectation from some corners—and a bona fide legal requirement from large federal funding agencies—that scholarly content be made openly available, free to all. 
open access publishing models, where a fee is paid by the content creator (or funder) to support the costs of publication, have earned a strong foothold in journal publishing, where there is some evidence that an author-pays oa model can be economically viable. most book publishers have stayed on the sidelines in this discussion, for reasons both practical and philosophical, among them that books cost more to produce and humanities authors have fewer financial resources such as grants at their disposal to cover those costs. this moment offers an opportunity to re-examine book publishing business models and the assumptions that underlie them, with an eye to developing new models that can deliver sustaining revenues to the presses while affording greater access to end users.
john thompson, books in the digital age (cambridge: polity press, ): “in the s academic publishers would commonly print between , and , hardback copies of a scholarly monograph . . . today many academic publishers say that total sales of hardback-only monographs are often as low as - copies worldwide” ( - ).
indeed, university press directors, deeply aware of how declining monograph sales have had an impact on their revenues, have already begun in earnest to experiment with other models, with the hope of increasing both revenue and ultimately the reach of scholarly works. the university of california press (ucp) launched the monograph-focused luminos and journal-focused collabra in february . while collabra is focused on the sciences, where open access is already an established presence, luminos is humanities-focused and embraces a model whereby costs are shared by the parties who benefit from publication: author or institution, publisher, and libraries.
other established university presses, both large and small, have also launched oa models, including oxford university press and manchester university press; amherst college press was recently created explicitly to be an open access publisher, as were ubiquity press and open book publishers. and commercial academic presses as well, including elsevier, routledge, palgrave macmillan and wiley, have also entered this space. the presses that are currently engaging in publishing oa monographs have necessarily needed to develop pricing models. ucp’s luminos requires a fee of $ , per title, and others that have developed new models include palgrave open ($ , ), routledge and taylor and francis (£ , ), and brill ($ for cc-by-nc). some publishers offer different price points, depending on how restrictive (or not) the license is. publicly shared pricing models should not be confused with the actual costs of publication, which may be greater, the same, or less. a pricing model reflects strategic choices of the press: it could align with actual expenses incurred, include enough margin to be profitable, or be priced below cost, if the press can subsidize the expense in other ways, as a means to quickly enter the market with a competitive offer. while focused primarily on the uk, the hefce sponsored report and annexes offer a very good assessment of the current environment and issues facing monograph publishing today. geoffrey crossick, “monographs and open access: a report to hefce,” hefce, january , http://www.hefce.ac.uk/pubs/rereports/year/ /monographs/. other initiatives announced recently include getty publications, whose open-content initiative aims to make available, without restrictions, as many of the getty's digital resources as possible. johns hopkins university press has been awarded a grant from the andrew w. mellon foundation to support the development of muse open, a distribution channel for open access monographs through project muse. 
minnesota is exploring hybrid oa titles with print sales.
brill: http://www.brill.com/about/open-access/publication-charges; palgrave macmillan open access books: http://www.nature.com/openresearch/publishing-with-palgrave-macmillan/publication-charges/; taylor and francis’s routledge press refers to a policy on “research monographs”: http://www.tandfebooks.com/page/openaccess. npg offers a comparative list of oa monograph fees here: http://www.nature.com/openresearch/publishing-with-palgrave-macmillan/publication-charges/; ubiquity press: http://www.ubiquitypress.com/site/publish/; taylor and francis instructions to authors ( ): https://s -us-west- .amazonaws.com/tandfbis/rt-files/docs/instructions+for+authors.pdf.
ultimately, however, pricing models need to be premised on actual costs, and the “cost of a monograph” has been extremely hard to pin down. while publishers carefully track and forecast many types of costs, particularly the variable costs related to printing and other “out of pocket” expenses, internal staff time is rarely allocated back to the cost of producing a book. arriving at a figure that accurately represents the full costs of publishing a monograph may require publishers to adopt accounting practices that they have not used in the past. having a deeper understanding of the costs of all the activities that go into the creation of a book will be valuable to publishers, regardless of what business models they choose to engage in.
such an understanding is also important for libraries and other stakeholders who care about the preservation of a vibrant monograph publishing ecosystem, since relying on numbers based on evolving and idiosyncratic business models risks undermining long-term sustainability. should open access publication of monographs expand, it will be important to funders who will expect transparency in accounting from the publishers whose work they support.

background

over the years, several presses have developed their own means of determining the cost of publishing a monograph, with estimates for first-copy costs for print books ranging from $ , to $ , and higher. presses participating in the knowledge unlatched pilot in - each had to develop their own price of a book in order to participate; the price range of participating publishers was from $ , - $ , . a recent study conducted by indiana university and university of michigan identified the average cost of publishing a monograph at their two presses as being around $ , . others have suggested that full publishing costs for an academic monograph could be as high as $ , if all overheads are considered. some presses, like national academies press, are able to develop a per-book cost assessment, given their current practice of tracking staff time, though due to their particular organizational model, the costs do not include several categories of activity, such as peer review, that most university presses would include.
jennifer crewe, “scholarly publishing: why our business is your business too,” profession , - , http://dx.doi.org/ . / x .
this appears to be the same range of costs in the current round two model, presented in december and launching in .
appendix e of carolyn walters et al, “a study of direct author subvention for publishing humanities books at two universities: a report to the andrew w. mellon foundation by indiana university & university of michigan,” september , .
joseph esposito, “the natural limits of gold open access,” the scholarly kitchen, november , , http://scholarlykitchen.sspnet.org/ / / /the-natural-limits-of-gold-open-access/.
industry-wide statistics are harder to come by. while the aaup itself gathers statistics each year, these costs are captured at the press level. this is very useful for benchmarking, but assumes that the unit of measure is the press and not the book. those industry-based studies that have taken on some of the complexity of the book publishing model, including variable categories of costs, have not yet resulted in models that us-based publishers believe sufficiently take into account the categories of costs that a sustainable model would need to account for. a report from october, , from oapen-netherlands, entitled, “a project exploring open access monograph publishing in the netherlands,” starts to demonstrate this complexity. the study focused on a sample of titles from nine publishers in the netherlands and showed an average cost of € , (or about $ , ) and a range of costs for the books studied from € , to € , . in , during the planning phase of this project, two major papers appeared, outlining new ways to fund the creation of scholarly work. “a scalable and sustainable approach to open access publishing and archiving for humanities and social sciences,” issued in february by rebecca kennison and lisa norberg, outlined a model for an institution-based approach. in june , the association of american universities (aau) and the association of research libraries (arl) offered a “prospectus for an institutionally funded first-book subvention.” this approach also takes an institutional direction, with funding provided to universities to then redistribute to faculty members who are publishing first books.
the reports’ contribution was in developing models for supporting open access scholarly content subsidies on a large scale, by estimating the size of the market, the scale of the demand and potential sources of funding. yet both reports acknowledge that the figure of $ , per book was considered a placeholder only. it is difficult to directly compare almost any two studies’ figures, since the categories that researchers choose to include within the cost differ.
eelco ferwerda, ronald snijder, and janneke adema, “oapen-nl: a project exploring open access monograph publishing in the netherlands. final report,” october . also see guide to open access monograph publishing (a basic primer, but includes table citing current oa monographs pricing from several commercial presses), http://oapen-uk.jiscebooks.org/oaguide/.
rebecca kennison and lisa norberg, “a scalable and sustainable approach to open access publishing and archiving for humanities and social sciences,” april .
raym crow, “aau/arl prospectus for an institutionally funded first-book subvention,” june .
the andrew w. mellon foundation awarded ithaka s+r a planning grant in april to study the costs associated with academic monograph publishing. ithaka s+r convened meetings of experts and facilitated conversations aimed at developing an effective methodology to capture all of the activities and costs needed to produce and disseminate a high quality, digital monograph. the group met virtually starting in may and in person on june , in new york city.
participants included peter berkery, executive director, and brenna mclaughlin, director of marketing and communications, of the association of american university presses (aaup); charles watkinson, at that time director, purdue university press and head of purdue libraries' scholarly publishing services (and now associate university librarian for publishing and director of university of michigan press); ellen faran, director, mit press (now retired); mark saunders, director, university of virginia press; barbara kline pope, executive director, the national academies press; and kim schmelzinger, former associate director and chief financial officer of northwestern university press, who manages the annual aaup operating statistics survey and analysis. the methodology recommended in this proposal, described below, was developed throughout the course of the planning grant. methodology based on the recommendations of university press experts, ithaka s+r adopted a bottom-up approach, addressing cost at the title level and the press level. when considering approaches for the current study, the task force explored several alternatives, including an industry-wide survey and a top-down assessment of current press costs. the approach developed during the course of the planning grant addresses cost at the title level and the press level. by collecting detailed, activity-based costs and overhead expenses at the press level, we have undertaken a careful examination of the costs of published monographs. by conducting this exercise with a variety of university presses, including representatives in each size bracket of the aaup membership, we are able to analyze key press-level differences and also provide comparisons for non- participating presses to consider. 
the limitations of this methodology are discussed in the appendix, but we believe that one of the most important contributions of this report is the breakdown of costs by specific activities associated with monographic publishing.

scope

for the purposes of this study, a scholarly monograph was defined as “a work of scholarship on a particular topic or theme which is written by a scholar (or scholars) and intended for use primarily by other scholars.” we did not address costs of other publisher-issued content, such as regional and trade titles, textbooks or databases. our definition of “monograph” excluded works in translation and edited volumes. while these works have much in common in spirit with the single-authored works included in the study, the means of their production would have introduced more challenges to the study. the study collected data on the costs in the us market for all activities related to creating and disseminating the “first digital file” for a book. this excludes costs that are only related to printing, binding and distributing print copies of books. also excluded are advance payments made to authors and royalties from sales. the study captured both staff and other non-staff direct expenses attributed to specific titles, as well as operational overheads necessary for the publishing of a monograph. we consider these direct and overhead expenses together to make up the full publishing costs for a scholarly monograph. they include:
• the costs required to support a title over time (and not just up until publication).
• other costs considered vital to an ongoing publishing program, including those that serve to build and promote subject-related lists and authors (not just individual books) over time.
as the scope of the study does not include questions related to revenue, we did not capture cost of sales in its many forms including royalty, distribution/sales discounting/sales commissions, and so forth.
similarly, costs related to print production, warehousing, fulfillment and distribution were not captured. however, because even an entirely free digital file will require work and cost in order to go from publisher to its readership, we have included, where possible, costs related to distribution of and access to the file, including metadata creation, search engine optimization (seo), and e-promotion.
john thompson, books in the digital age (cambridge: polity press, ), - .

selection of presses

starting from the aaup’s categorization of presses by size, we selected five presses in each of the aaup’s four size categories (table ), taking some care for diversity even within each category concerning geography and publishing focus (monograph-centric, versus those with a balance of monographs and other publishing activities). directors agreed to permit the data gathered to be shared in aggregated, anonymized form in this publicly released report.

table . aaup press categories, fy (n= ). columns: group; annual revenue, books division; average full time equivalent employees; average # titles published. rows by annual revenue: under $ . million; $ . - $ million; $ - $ million; over $ million.

table . profile of participating presses. columns: group; average full time equivalent employees; average # titles published; average # monographs published.

data gathering

from march through may , our research team met in person with staff from the participating presses to gather data. in addition to meeting with the business managers and cfos to gather press-level financial data, the researchers held meetings with staff in acquisitions, edp (editorial, production, and design), and marketing, to walk through the staff time allocation together. data from staff and the business manager/cfo were entered into excel spreadsheets and sent to the research team at ithaka, where they were compiled and cleaned.
any anomalies that were spotted were reviewed with press directors in conference calls in may and early june . each press’s data were shared back with them in the form of excel files at that time. further detail on the method of data collection and analysis can be found in the note on methodology in the appendix.

developing a definition of cost

the question of which costs to include stimulated a lively conversation among members of the advisory committee. there are many types of costs that presses track and assess regularly, particularly those related to the “out of pocket” expenses needed to publish any given title. these are the costs that are considered on a new book project’s profit and loss statement, a document developed at the outset of any new book project to assess the investment needed to create the book and the sales expected/needed to cover its costs or better. the basic calculations done in early stages of a decision to publish will take into account known pre-production and production costs, including printing, for example, and will also include assumptions and estimates from the sales and marketing teams concerning future sales through specific channels. it can be harder to determine how to account for fixed costs, those expenses the press absorbs regardless of how many books it may produce and sell in a year. staff time, which includes salaries and benefits, is critical when talking about the cost of producing a monograph. an editor’s time spent selecting and developing a new work, for example, should be included. but accounting for expenses that are less directly tied to the production of a specific book is more difficult. should a share of the cost to have a company website be “charged” to each book? how about the director’s time? how is rent factored in?
most publishers do have some way to account for these “overhead” costs (in what is referred to as a "margin" calculation) when considering a book's profit and loss statement. there is, however, little agreement about the best way to do this, either concerning which costs to include or how to most “fairly” share them among the products or titles a press publishes. a main premise underlying the study is that the core activities that together are required to create and disseminate a high-quality digital monograph include activities grouped within the categories of acquisitions, manuscript editorial, design, production, and marketing. there are many ways to consider the elements that together comprise the full cost of publishing. the elements we considered included:
• staff costs: this includes salary and benefits for those working in the following departments: acquisitions, manuscript editorial, design, and production. staff time is the accumulated salary and benefit cost of the time that staff in these departments reported spending directly working on monographs. staff overhead refers to time that staff in these departments reported spending on work not directly related to monographs or other publishing activity excluded from this study.
• title-specific direct costs: this includes non-staff expenses that are directly attributed to the creation of a specific title, and are paid “out of pocket” by the press. these costs were gathered at the department level, at each of the departments identified above: acquisitions, manuscript editorial, design, production and marketing. for example, any outsourced work or services directly related to the creation of a monograph would fall into this category.
• press-level overhead: press level overheads were divided into two categories, general and administrative expenses (g&a) and departmental overheads.
g&a include staff and non-staff expenses in departments such as administration, accounting, legal and finance, as well as cost for rent and utilities. departmental overheads include non-staff expenses that are not directly attributed to the creation of a specific title from the departments identified above: acquisitions, manuscript editorial, design, production and marketing. for example, the acquisition department’s travel budget would fall into this category as well as activities like advertising that some presses do not break out by title.
• in-kind contributions: these include the many contributions the press may benefit from, but does not pay for. this can include rent, staff time from other departments, and work that the authors themselves take on, like clearing permissions or generating an index.
to report the average cost of monographs, we developed three definitions of “cost.” they are all legitimate expenses that the press incurs in the course of publication, but some are more directly tied to the publishing process than others. we anticipate that these definitions will be discussed and debated in some detail in the community, as pricing models are being developed.
• basic cost: includes just staff and non-staff expenses directly incurred when producing the book. some have referred to this as the “incremental” cost of a press adding “one more book” to its existing publishing operation. here, our definition includes staff time and staff overhead, from those who work in the core publishing departments of acquisitions, manuscript editorial, design, production and marketing. it also includes what we are referring to as direct or “out-of-pocket” expenses which can be captured by title.
• full cost: represents the basic cost, plus the press-level overheads.
• full cost plus: represents not only the overhead costs included above, but also in-kind contributions: the reported value of resources contributed to the project, but not paid for. these generally included contributed staff time, author-paid fees and office space.
the choices of what to consider when talking about costs are philosophical as much as economic in nature. arguments abound about what sorts of costs to count and which to not count, particularly when developing a pricing model. the methods used by the researchers are essentially choices, and publishers will need to decide how, ultimately, they will choose to calculate the costs of their work, as well as the implications for their business overall. but the report is intended first and foremost to provide data from the study, so that presses and funders will have a firm basis on which to make those choices. the pages that follow will include not just final figures, but discussion of how these totals were arrived at.
for a good discussion of accounting approaches, see thomas mccormack’s article, “book publishing accounting,” aaup wiki.

findings

press-level observations

university press monographic publishing requires a fairly high-touch process, with many individuals in a variety of roles helping to shape the scholarly work from concept through final execution and distribution. this study of titles across university presses yielded a wide range of costs per title, from a low of $ , to a high of $ , , and the range of costs is wide both within and across groups (table ). regardless of which definition we use, the total, average cost of publishing a high-quality digital monograph is considerable, and in almost every instance higher than the figure of $ , that has gained popular attention in recent months. among the titles in our sample, the overall average cost per title was $ , (basic) and $ , (full cost). from these data, several patterns emerge:
• generally, the smallest presses demonstrated lower costs and larger presses higher costs. yet this was not uniform across the sample. some smaller presses had very high per book costs, due to low title output relative to staffing levels.
• the data for presses in groups and demonstrated an interesting reversal, with smaller group presses having a higher average cost per book. according to aaup statistician kim schmelzinger, this is somewhat in line with other financial trends noticed among aaup member presses: "there seems to be an inflection point on the growth path as presses reach the $ . -$ . million mark where they become more efficient, most likely due to systems and processes being scaled properly for the output of the press." this study did not permit us to definitively explain why this is, but there does appear to be a value of scale or strategy with the group presses.
• scale? there was a working hypothesis at the outset of the study that larger presses would demonstrate a lower per-book cost, presuming that larger houses are able to work more efficiently due to the economic benefits of scaling. we did not see this in practice, as larger presses demonstrated higher costs, on average, per monograph. this came as a surprise to both the researchers and some directors. a few possible explanations were suggested: the larger presses carry higher costs because they can. the largest presses participating were located in metropolitan areas, where costs are higher. the financial health of these presses and their generous institutional hosts (including among the best-resourced universities in the country) permit them to take on investment in certain areas: staff salaries, marketing expenses, contributing to major overhead expenses like rent. in other words, a higher per-book cost in this case might be understood as signaling the ability to continue to invest in the business, not an inefficient use of funds.
or, perhaps the higher costs come from the infrastructure necessary to acquire better authors and to sell more copies of the books. it is worth pointing out that the additional expense in the group press overheads is somewhat idiosyncratic. one large press carries the cost of legal staff; another counts its digital division as overhead in monograph publishing, since its staff handle all metadata and formatting for monographs, as well; others pay significant rent.

table . average cost per title by group, shown by breakdown of cost type and by the three different definitions of cost. columns: group; staff costs; staff overhead; direct costs; press-level overhead; in-kind; basic; full cost; full cost plus.

staff cost

across all presses, the largest expense by far was in the staff time spent directly on monograph publishing, whether on acquisitions, manuscript editorial, design, production or marketing activities (table ). (“press-level overhead” also includes staff time for those in press-level roles, such as legal, accounting, and so forth, so should those press-level roles be counted here, the percentage of salary and benefit related costs would be even higher.) the split of time spent on monographs by staff, versus direct costs, is very similar across presses of all sizes. although outsourcing of services is a common practice, direct non-staff costs are still modest by comparison to the staff’s salary and benefits. we looked at this question in two ways, trying to capture both the time spent by staff in certain departments and the time spent on certain activities, regardless of the actual department people were assigned to. (the two versions are very close; the department-based version may be a little lower, since some non-department affiliated staff, like directors, would not be included.)
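the three definitions of cost reported per group (basic, full cost, and full cost plus) are cumulative sums of the cost components described earlier. a minimal python sketch of that arithmetic follows. the class name, field names and dollar figures are hypothetical illustrations and are not drawn from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TitleCosts:
    """Per-title cost components (hypothetical field names)."""
    staff_time: float       # salary and benefits for time spent directly on monographs
    staff_overhead: float   # core-department staff time not spent directly on monographs
    direct_costs: float     # title-specific "out of pocket" non-staff expenses
    press_overhead: float   # g&a plus departmental overheads
    in_kind: float          # contributed resources the press does not pay for

    @property
    def basic(self) -> float:
        # basic cost: staff time + staff overhead + direct costs
        return self.staff_time + self.staff_overhead + self.direct_costs

    @property
    def full(self) -> float:
        # full cost: basic cost + press-level overhead
        return self.basic + self.press_overhead

    @property
    def full_plus(self) -> float:
        # full cost plus: full cost + in-kind contributions
        return self.full + self.in_kind


# made-up figures, for illustration only
example = TitleCosts(
    staff_time=10_000.0,
    staff_overhead=2_000.0,
    direct_costs=3_000.0,
    press_overhead=5_000.0,
    in_kind=1_000.0,
)
print(example.basic, example.full, example.full_plus)
```

because each definition builds on the previous one, the full cost plus figure can never be lower than the full cost figure, which in turn can never be lower than the basic figure, provided no component is negative.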
the costs of publishing monographs table . average staff costs per monograph, by activities and by group group acquisitions manuscript editorial production design marketing total staff expenses $ , $ , $ $ , $ , $ , $ , $ , $ , $ , $ , $ , $ , $ , $ $ , $ , $ , $ , $ , $ $ , $ , $ , range of staff costs by activities acquisitions manuscript editorial production design marketing min $ , $ , $ $ $ , max $ , $ , $ , $ , $ , median $ , $ , $ $ , $ , average $ , $ , $ $ , $ , direct non-staff costs direct non-staff costs reflect the title specific costs incurred in producing each book at each of the core departments. each press reported by department and by title on the actual direct non-staff expenses associated with creating that book. direct costs illustrate where outsourcing is taking place. by comparing the same set of activities across staff time and direct costs, we can begin to see where at the press certain functions are being outsourced. tasks requiring specialized technologies, like printing, have long been outsourced. relevant to this study we are also seeing many at least partially outsourcing certain elements of the publishing workflow, from copyediting to compositing to graphic design. while some may choose to entirely outsource a function, it is more common to see presses outsourcing some of the work, on a title by title basis. it is best to assess the sum of staff time and direct costs to get a full picture of the costs of each activity. manuscript editorial, for example, often outsources copyediting, and design departments may choose to hire freelancers to design interior and covers. still, the costs of publishing monographs this more holistic view of both staff and direct costs, shows that acquisitions costs are still significantly higher than others, and this holds across presses of all sizes. (table ) table . 
staff cost and direct non-staff costs (note: this chart only includes staff time and direct costs, by department, and not staff time overhead as in the basic definition. the charts below are intended to illustrate the relationship of staff time spent on monographs to direct expenses.)

direct non-staff costs by department
columns: group | acquisitions | manuscript editorial | production | design | marketing | total direct costs

staff and direct costs by department
columns: group | acquisitions | manuscript editorial | production | design | marketing | total staff & direct expenses

acquisitions

acquisitions is the most expensive activity. among the five core publishing activities assessed, regardless of press size, the greatest share of staff cost is spent on acquisitions activities, with marketing and manuscript editorial activities vying for second (table ). the largest presses have higher costs for the acquisitions work than do the presses in groups , , and . this does not appear to be due to higher numbers of people doing the work: the larger presses reported fewer people contributing to acquisitions per title than many of the smaller presses. it may instead be due to factors that raise salaries, including that the largest presses are in metropolitan areas and that they are more likely to offer attractive salaries to recruit potential editors.

of the types of acquisitions work, the most time-consuming (and thus most expensive) tended to be author communications and support, followed by developmental editing and the peer-review process. acquisitions editors described working with authors, or potential authors, over the course of many years, sometimes meeting them as graduate students, or seeing a project in very early stages take shape over time.
the peer-review function was described as especially time-consuming, particularly from less senior staff, who are often tasked with identifying scholars willing to conduct the review and coordinating the process.

selection and taste are difficult to define, but central to the role of acquisitions. acquisitions editors exert editorial judgment in finding those manuscripts that fit the press’s mission. while some of the smallest presses make a limited investment in acquisitions, acquisitions editors of the larger presses described a major investment in immersing themselves in the fields in which they specialize. as one acquisitions editor from group described, “a common misperception is that we are simply reviewing things that come in through the door. that is not what we do. we are actively out there recruiting projects. a heck of a lot of our time goes into our ‘never published’ category. some of the things i am working on today will not be published until .”

while personal taste certainly plays a role, the final decision to publish at a university press depends on the work passing through a process of peer review. identifying reviewers for a work is a complicated and time-consuming task. editors research the field by reading articles, asking the author, and using their own contacts to build lists of potential reviewers. when a press ventures into new areas, developing lists of these contacts and new reviewers is a substantial time commitment. for acquisitions editors, a list of trusted reviewers is an invaluable asset. the peer review process itself can be time-consuming; editors and their assistants must choose to send out the book, invite the reviewers, send out the manuscript, and then wait for feedback, which can take months. some reviewers, not surprisingly, are better and more comprehensive than others.

some recent studies have demonstrated that peer review processes are a critical aspect of establishing trust.
see david nicholas, anthony watkinson, hamid r. jamali, eti herman, carol tenopir, rachel volentine, suzie allard, and kenneth levine, “peer review: still king in the digital age,” learned publishing , no. ( ): – , doi: . / . http://dx.doi.org/ . /

reviews come back too lightly considered, or too negative. some editors spoke to us about reviews that are “more about the reviewer than about the book.”

table . acquisitions basic costs (group averages)
columns: group | staff time | direct costs | total

manuscript editorial

as with acquisitions, a major cost in the manuscript editorial function was author communication and support. one production editor from group noted, “you just oversee everything... the copyeditor and the author constantly cc you on everything. when the copyeditor sends the first run through to the author, you can go over it, to make sure it is going alright.” when viewed through the lens of value added and competitive advantage, a publishing enterprise well known for effective author relations may have a definite advantage over its peers. most departments indicated that author relations was one of their most time-consuming, and by extension most costly, activities.

copyediting and proofreading reside within this function. copyediting is the process whereby a manuscript is shaped to meet house style standards and checked so that grammar, punctuation, and spelling are correct and consistent. the copyediting function may also be charged with ensuring that headings, data, references, footnotes, and other content elements are correct and consistent. copyediting was among the most commonly outsourced functions at the press. this is reflected in substantial costs appearing as direct expenses.

manuscript editorial table .
manuscript editorial basic costs (group averages)
columns: group | staff time | direct costs | total

design costs

changes in technology and the advent of xml workflow processes mean that display markup and composition (annotating documents in preparation for software transformation) can readily be combined with activities that have traditionally been undertaken at the copyediting stage, in what is considered file preprocessing. interior design, once aimed solely at providing a quality reading experience, is now linked to the need for distribution across multiple formats and platforms and so serves many masters; because of the use of templates and the leverage provided by technology, it can be performed in conjunction with many of the manuscript editorial and production functions previously described.

cost drivers for design, according to staff interviewed, included complexity issues related to illustrations and to supplemental and other materials. heavily illustrated monographs were consistently cited as using up vast resources in the preparatory stages.

because of digital portability, the need for marketing activity in support of titles, and much (real or hoped-for) crossover between monographs and the rest of publishers’ lists, full-color covers are now the norm for scholarly monographs. much of this work is outsourced and performed by freelancers, with staff who are not performing cover design functions themselves managing the work of others.

presses see design as a key differentiator and of real appeal to both authors and audience. a deputy director in a group press commented, “all of our books get the same design treatment... even books in series get the same effort, they all go through that process, not just the trade books. we will spend as much time working on a…monograph as working on a frontlist trade book. the distinction doesn’t apply to the covers.
we have a reputation in the academy of being good at this, and it helps us to attract prospective authors.”

table . design basic costs (group averages)
columns: group | staff time | direct costs | total

production

production functions in a traditional sense include digital file preparation activities such as managing fonts, images, and templates, the preparation of photos and other images, and the like. digital asset management activities, or the processes and systems used to find, organize, convert, and store large content collections, typically reside side by side with digital file preparation activities. file preservation and file distribution activities, while often performed by third parties, must be managed, and that management occurs here.

in many presses, the production and manuscript editorial processes described above are merged in a single department. because of the compressed workflow timelines and fluid relationships between these functions, those activities we think of as production activities in the traditional print publishing environment are often undertaken by the same staff members (regardless of the department’s name or reporting lines).

throughout the editorial, design, and production (commonly referred to as edp) functions, we observed the greatest amount of outsourcing. almost all (if not all) presses engage in some amount of freelance outsourcing in these stages. while work may be outsourced, there is still plenty of time devoted to project management, as press staff manage the freelancers and their work.

quality and high production and design standards are seen by many presses as real differentiators: areas where they add value and are able to compete for top manuscripts in their fields of specialization. the need to control quality, particularly with respect to aesthetics, was expressed by many presses to be a binding constraint.
several interviewees noted that they would not embrace a fully oa environment, or even a fully digital environment, because they perceived that this would compromise the visual aspect(s) of the content.

table . production basic costs (group averages)
columns: group | staff time | direct costs | total

marketing

marketing departments mentioned repeatedly their early and ongoing involvement with many activities aimed at promoting monographs. these include pre-publication activities aimed at publicizing monographs, or creating talk about and therefore consumer demand for them; the creation of detailed advertising plans; and copywriting for seasonal and specialty catalogs (as well as for jackets and covers). marketing staff engage in extensive planning aimed at generating review coverage. detailed award submission plans are put together, as well as plans for exhibiting books at conferences and exhibits. many of the direct expenses associated with these activities have been aggregated and included here as departmental overhead, but the pre-publication staff time was directly captured during staff interviews at the site visits.

seasonal catalogs, though expensive and time-consuming, are still very much part of the marketing process. title-specific web marketing has become quite important. these activities, routinely undertaken for monographs and trade books alike, include managing authors in creating book websites or blogs; generation of email newsletters and promotional emails; web-based publicity for review generation; tracking of web mentions; and the like. search engine optimization and metadata creation and distribution have become critical tools in pre-publication content marketing.
among the things that tend to drive the cost of marketing work at the participating university presses is that multiple streams of work take place in each department, and each staff member tends to touch every book in some way. (the exception to this tends to be people in publicity roles, who might focus on certain lists.) the specialized roles within the department include conferences and exhibits, web-based marketing/social media, producing the annual and disciplinary catalogs, and advertising. rarely is this work outsourced. as one press director at a group press described, “acquisitions is core to the press. it is the management of the selectivity, relationship building. the other core function is strategic marketing, with the emphasis on strategic. it’s making decisions about understanding particular fields and making appropriate marketing choices. everything else can be outsourced.”

author relations, also seen in other departments, can demand a great deal of time from staff. authors, even authors of academic monographs, might have expectations for reviews appearing in major mainstream publications. marketing staff are often the front line of discussion with authors, to get the most from their expertise and existing contact base and to manage their expectations.

table . marketing basic costs (group averages)
columns: group | staff time | direct costs | total

staff-level overhead

staff-level overhead includes the time staff in the core departments reported working on monographs that were not published and on other activities. while the former is almost entirely borne by the acquisitions department, this activity may on occasion have an impact on some other roles as well, such as marketing, which is typically called in to evaluate or even develop plans for prospective titles.
(table ) time spent on books not published was often characterized as a critical step in the path to discovering the best quality works that the press eventually does sign and produce.

staff discussions yielded some important context for understanding these figures: senior staff reported that acquisitions is more of a / job, with pitches and new ideas coming to them not just in the workplace and at conferences, but everywhere, including, as one press director noted, “in line at the grocery store.” the deluge of ideas, well before they are filtered through the editorial lens, can add up to significant time spent.

however, it is not just the senior editors who report much time spent on seeking new work. junior staff, whether assistants or editors, spend a great deal of time developing new leads, both for authors and for reviewers. should the press decide to enter a new discipline, or a field that crosses over multiple disciplines, significant time is needed to identify the right people and encourage them to work with the press.

indeed, the more senior staff tended to spend less time on manuscripts that were not published. some even noted that as a list becomes better known and better respected, it is less likely to receive manuscripts submitted in error, for example, or those that are just a bad fit, as authors are more attuned to the requirements of the list.

the time spent on books that don’t come to fruition tends to be at the early stages of possible publication; while editors report winnowing a list from to up front, once they have identified the projects they feel are viable, the success rate increases significantly. by the time a project makes it to the editorial committee/board, many acquisitions editors reported there may be close to a % chance it will make it through to publication.

table .
staff-level overhead: percentage of time spent on books not published
columns: group | total overhead costs (staff time) | staff time on monographs never published | staff time on other activities | cost of acquisitions time | time on monographs not published as % of acquisitions cost

press-level overheads

in order to develop an accurate picture of the full costs of publishing monographs, it was necessary to include some of the costs of running the business, even though they are not directly attributable to producing a specific title. it was important in this study to take overhead into account; editing and printing a book may be more directly relevant to publishing a new title, but no publishing house exists without support in areas like legal, accounting, and other necessary services.

departmental overheads

university presses may have non-staff-related expenses that are managed at the departmental level, for example, travel and expenses for acquisitions editors attending conferences and costs of marketing efforts that are otherwise not broken out by title. the data below reflect averages by group. note that in some cases, presses did not indicate any costs for certain departments. this may not necessarily be because they do not spend money on these activities, but rather because they were able to assign costs directly to the titles themselves, or it might be due to the way the press’s departments are organized. a press that has a combined design and production department, for example, may have reported the departmental cost on just one of those lines.

among the responses concerning departmental expenses for acquisitions were: business entertainment, travel, receptions, readers’ fees, free books and postage, training, memberships, and telecom.

table .
average by group of departmental overheads, by department and by group
columns: group | acquisitions | manuscript editorial | design | production | marketing
summary rows: total average | min | max

marketing was the only department for which we captured specific line items for departmental-level overhead costs. the categories requested in the questionnaire include those shown in table . other departments had much lower overhead costs, and these were not broken out but reported by the press as one aggregated figure per department.

table . all presses breakdown of departmental overheads: marketing detail
columns: email & social media | website | advertising & promotion | seasonal catalogs | conferences & exhibits | other
rows: average | min | max | # responses | response average

general and administrative costs

it is not uncommon for a press to develop a method for allocating a share of its overhead costs to the products it creates; however, there is tremendous variety in which costs a press actually pays itself and which are covered by its host university. almost all the presses reported paying directly for accounting; many also listed finance positions. hr, however, was often an in-kind contribution at small and medium-sized presses. very few presses listed costs for legal support, with the exception of one press with a strong focus on art and art history, which pays a substantial sum for legal staff. among the smallest presses, IT staff consisted of contributed staff time at all five participating presses. almost all the other presses reported paying for IT support directly, but with significant increase in cost from group (average = $ , ), group (average = $ , ) to group (average = $ , ).

further detail on the method used in this study is found in note on methodology in this report.
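the g+a share reported in the general and administrative costs table is a simple ratio of g+a per monograph to full cost per monograph. the sketch below shows that arithmetic, together with one possible way of spreading a press's g+a across its monographs. the even split by title is an assumption made for illustration only; the study's actual allocation method is described in its note on methodology, and the function names and figures here are hypothetical.

```python
def allocate_ga_per_title(total_ga: float, monograph_share: float, n_monographs: int) -> float:
    """Spread a press's G+A over its monographs (hypothetical even split).

    monograph_share is the fraction of press G+A attributed to the
    monograph program; this is an assumed input, not a figure from the study.
    """
    if not 0.0 <= monograph_share <= 1.0:
        raise ValueError("monograph_share must be between 0 and 1")
    if n_monographs <= 0:
        raise ValueError("n_monographs must be positive")
    return total_ga * monograph_share / n_monographs


def ga_share_of_full_cost(ga_per_monograph: float, full_cost_per_monograph: float) -> float:
    """Return G+A as a percentage of full cost per monograph."""
    if full_cost_per_monograph <= 0:
        raise ValueError("full cost must be positive")
    return 100.0 * ga_per_monograph / full_cost_per_monograph


# Illustrative figures only; not values from the report.
ga_per_title = allocate_ga_per_title(total_ga=400000, monograph_share=0.5, n_monographs=25)
share = ga_share_of_full_cost(ga_per_title, full_cost_per_monograph=40000)
print(f"g+a per title: {ga_per_title:.0f}, share of full cost: {share:.1f}%")
```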
table . general and administrative costs
columns: group | g+a per monograph | full cost per monograph | g+a as a percentage of full cost

in-kind contributions

it may seem obvious that if a press does not need to pay for something, it should not be factored into any calculation of cost, but there are two important reasons to take these costs into account. first, things change. what the university or library covers one year might not be included the next, and having a clear notion of what the actual resource needs are is important. second, and more to the point of our current study, some of the costs that are not paid by presses today, but that are integral to the book’s quality, are routinely covered by the author, often through grants and other funds the author is able to obtain.

a worthy example exists in the case of indexing. while most presses and authors still feel this is a worthwhile activity, few presses today are paying for it themselves. they may provide guidance to authors, or may edit the authors’ work, but it is the authors who take responsibility for hiring someone or doing it themselves.

a similar case exists for image rights and for other rights and permissions clearance. presses routinely expect authors to have done permissions research, and to pay any related rights fees, as part of the completed manuscript package they deliver to the publisher. these costs can vary widely, depending on the number and type of images or other rights needed. yet we also heard numerous times just how time-consuming the work of press staff is in vetting, guiding, and correcting the work done by authors in this activity.

title-level observations

an examination of the least and most costly titles offers a taste of the data at a more granular level.
while average page count across the full sample was , the most costly titles tended to be much longer works, with page counts well above the average ( pages on average for the top % titles). in table below, the most costly monographs had an average illustration count of images (or tables or charts) per book, while table shows that the least costly had an average of (and almost half had none). that page count and number of images/illustrations/tables drive cost is unlikely to come as a surprise.

table . highest % of titles, by full cost
columns: group | discipline | pages | # illustrations | cost from staff time | # people who worked on title | basic cost | full cost
disciplines, in table order: environment/conservation; economics; (not listed); environment/conservation; african american studies; economics; african american studies; agriculture; law; literature; history; architecture; communications media; art & art history; architecture; literature; biography; literature; archaeology

table .
lowest % of titles, by full cost
columns: group | discipline | pages | # illustrations | cost from staff time | # people who worked on title | basic cost | full cost
disciplines, in table order: political science; anthropology; urban studies; philosophy; classics; history; performing arts; slavic studies; medicine; philosophy; philosophy; film studies; performing arts; archaeology; sociology; literature; anthropology; religion; language

considering the averages as well as the extremes prompts a slew of further questions: how strong are the correlations between page length and cost, or illustrations and cost? what role, if any, does discipline play? while the sample, at titles, is neither particularly large nor completely representative of all university presses (or of all academic publishers in general), these data did allow us to theorize about, and test, the strength of some of these connections.

candidates for so-called “predictive variables” emerged from several sources. the data themselves, of course, suggested several possible areas for further analysis. during our interviews with press directors and staff, book-specific factors, such as the number and complexity of illustrations and whether authors were publishing their first book or were farther along in their publishing careers, were mentioned as important factors in driving costs. discussions with our advisory board members highlighted other potential types of cost drivers; some of the likely variables discussed included regional cost of living, public or private ownership, parent institution subsidy, press size, the number of monographs as a percentage of a press’s book publishing program, and the number of staff.
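one common way to test a candidate variable such as page count against cost is a pearson correlation coefficient. the sketch below is an illustration under stated assumptions: the text does not say which statistic or software the study used, and the page counts and costs are made-up values, not data from the sample.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical titles: page counts and basic cost per title (illustrative only).
pages = [180, 240, 300, 360, 420]
basic_cost = [22000, 27000, 26000, 34000, 39000]

r = pearson_r(pages, basic_cost)
print(f"pearson r between pages and basic cost: {r:.2f}")
```

a coefficient near zero suggests little linear relationship, while values closer to 1 or -1 indicate a stronger one; with a sample this small, any such figure should be read cautiously.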
below are some findings from this work examining correlations with various factors:

page count. there is a moderate, positive relationship between the number of pages per monograph and the basic (and full) cost per monograph.

illustrations. there is a moderate, positive relationship between the number of illustrated pages per monograph and the basic (and full) cost per monograph. our count included all sorts of illustrations (black and white, halftone, and color images, tables, charts, and figures), and edp staffers frequently cited illustrations as a cost driver.

disciplines. our sample was not structured to include a representative sample of different disciplines, but we tried to see what we might find by grouping the disciplines represented into groups that reflected possible cost drivers. table shows average cost for titles in each discipline listed, using the basic cost definition.

table : average basic cost per title by discipline
columns: discipline | mean | standard deviation | number
disciplines: american studies; anthropology; art/architecture; archaeology; area studies; arts/film; history; humanities; religion; science; social science

first book? do first books cost more to produce than books by more experienced authors? in this sample, we found that there was no statistically significant difference between the cost per monograph of “first books” versus other books.

table . comparison of the basic cost of first books and not first books
columns: count (n= ) | average basic cost
rows: first books | not first books

in addition, we examined characteristics of the presses themselves:

press size. average cost by press size shows that the smallest presses have the lowest average costs per title and the largest have the highest. but a closer look at the numbers suggests that this is not a simple progression.
that group presses have lower costs than group presses led some in our advisory group to suggest that there is in fact an element of efficiency at work here; the group presses have perhaps reached a scale at which they are able to produce books more efficiently, through higher title count per staff and some outsourcing. why group presses, then, do not demonstrate even lower costs is unclear, but this may have something to do with the range of activities they engage in and how they chose to report their costs.

table . average cost per monograph by press group
basic cost, columns: group | mean | standard deviation | number
full cost, columns: group | mean | standard deviation | number

press location. presses in the northeast and south had significantly higher average costs than presses in the midwest and west. it is also the case that four of the five group presses are in the northeast. the explanation for the high cost of presses in the south is less obvious.

rent – no rent. there is no statistically significant difference between the cost per monograph at presses that pay rent and the cost per monograph at presses that don’t pay rent. average basic cost per monograph at presses paying rent was $ , and was $ , at presses not paying rent.

public v private. there is no statistically significant difference between the cost per monograph at presses at public versus private institutions. the average basic cost per monograph at public institutions was $ , , while the average cost per monograph at presses at private institutions was $ , .

authors. perhaps the subtlest and most interesting cost driver cannot be discussed in strictly quantitative terms. while speaking with staff about what they felt made one title more time-consuming to produce than another, staff in every department in every press came back to the author.
when staff allocated additional time to working on a particular book, it was often because they could recall, often in some detail, the difficulties of working with a specific author on a specific project. this is not to discount the very real costs associated with complex production issues like illustrations, charts, or page count, but the theme arose so often that we feel it is worth exploring here.

authors themselves, without regard to subject matter, can be significant cost drivers, based on the time they require at every step of the process. this may be because they are new to the publishing process and require a good deal of hand-holding along the way (one manuscript editorial staffer described a major value in the work she does as “training scholars to be published authors”). a less appealing scenario occurs when an author is unwilling or unable to hold to the format or deadline requirements of the press. this can result in press staff spending time managing the situation internally, and can also trigger higher external costs if, for example, an author insists upon making significant changes once a manuscript has already been to the compositor.

two factors make this situation particularly difficult for presses to manage. the first is that it is harder up front to assess the likelihood of the author creating problems for the press than it is to analyze the complexity of her manuscript. second, staff at many if not all of the presses we spoke with referred to themselves at some point as being “author centered” and considered the care, time, and access they offer authors to be a key differentiator. the degree to which this is actually the case might be difficult to assess, but it is an area worth further exploration as well.

it is clear that there is a very real and valuable set of activities that publishers offer throughout the process of taking an author’s raw manuscript through to final publication.
whether offering guidance in developing the manuscript and its ideas, formatting and designing layout, or considering how best to make sure the work reaches the right audience, these steps involve many individuals with specialized knowledge and connections. presses in the future might make this value more explicit, as a means of attracting authors; as some commercial presses already do, they might consider offering a basic service for those whose manuscripts come in “ready to go” and add-on services for those desiring (or requiring) additional support. or they might consider what the sweet spot might be between a fully hands-off template model of publishing and one that offers a deep level of engagement.

conclusion

data gathered from the twenty participating presses suggest that monograph publishing is considerably more expensive than has often been reported anecdotally or in other studies, and certainly more expensive than current price points for publishers with oa models would suggest. the source of these costs is largely staff time, and specifically the time related to the activity of acquiring content, an area closely tied to the character and reputation of the press. this function is the least likely area to be outsourced at a press, and the most closely tied to its financial success: the acquisitions role, after all, is to bring skill, subject expertise, and relationships to bear in selecting and developing the most promising authors and topics for the press.

another major cost is the overall operations of the press itself, the press-level overhead expenses. these figures are important to consider, since at the most basic level, they are integral to how a press functions as a business. and yet, there is still real debate about the best and most equitable way to assign those costs at a product level.
further complicating the question is the underlying reason for the costs themselves: should higher per-book costs be interpreted as a sign of press inefficiency, or as a reflection of a healthy press in a position to devote and invest greater resources to its work?

these questions get not just to the heart of what it costs to produce a book, but what it costs to run a university press. though this study did not take on that particular question, it did point up some important issues worth further investigation. for most university presses, monographs are rarely profitable on a per-title basis. some may achieve sustainability at the “list” level (a line of books, whether by discipline or by product type, monograph or trade books, for example) or perhaps at the level of the “books division.” further complicating this, certainly, is that while the financial success of certain books might end up subsidizing the weaker sales of others, choosing the winners in advance is just not possible.

while the study did not address the presses’ financial performance, it did appear that presses that report being on good financial footing tend to be larger and have multiple streams of revenue that end up cross-subsidizing the monograph list. for some presses it may be a journals list, like chicago’s; others, like yale university press, have a strong textbook program; university of minnesota press has a professional testing program in its portfolio. to more fully understand to what extent this is the case, a fuller assessment of the economics of the university press, the whole press, would be needed.

implications for university presses

this study suggests some important next steps for university presses, as well as some open questions. how can these data be used to develop pricing models for open access funder subventions and publisher fees?
now that we know what it costs to produce and distribute a monograph, what (portion of the) costs will publishers need to obtain as an up-front subvention, in order to make the book openly available at publication? will publishers look to recover full costs, or be able to develop estimates that model just the revenue forgone with oa models? in other words, is the basic, full, or full plus cost the one which publishers will seek to recoup? unlike the retail book market, where publishers establish prices for the products they sell, it is not entirely clear where this pricing model will develop, or what figures will take hold. who, in the end, will be setting the prices: presses or funders?

how might a change in funding model result in changes to the publisher landscape? will new, institutional sources of support have the impact of incentivizing other, newer entrants into the field of scholarly book publishing? and will new entrants into the publishing field find ways to offer some, if not all, of the current list of services at prices lower than those publishers can support?

we heard from press directors and editors just how difficult it is to accurately guess which titles will be the bestsellers and which will underperform. how will this uncertainty influence the way that publishers make choices about implementing an oa model? will they act cautiously, considering which books are likely to sell few copies, and therefore require the subsidy? how will they place those bets, and might it have the unfortunate effect of keeping the potentially most valuable material tied to the pay-for-access model?

while there has been reference to publishers’ ability, in theory, to continue to monetize the monograph content in other ways, alongside an openly available digital version, how will this work, and what sort of revenue is likely? what licensing would be needed to ensure continued revenue generation from other formats?
how can publishers model future revenue across a monograph list, across the press, given certain titles most likely to become oa? what will happen to subsidiary rights and permissions income? what are the prospects for continuing to sell premium ebook versions and print copies? implications for libraries libraries and their patrons stand to gain quite a lot should open monographs become the norm. in the short term, this would remove a certain category of cost from the library collections budget (though this might be a short-lived benefit, should university-based models end up shifting funds from acquisition to subvention). however, some existential questions about the collection development role also arise in an environment where many monographs are oa, questions that also challenge the business models of the vendors that supply libraries. today, while multiple channels exist to carry, distribute and otherwise disseminate content to libraries, the primary routes of this distribution are premised on a sales model, where vendors, aggregator platforms, wholesalers and retailers each expect to see some gain from the sale of a book. absent the notion of “sales,” how robust will the notion of “distribution” and therefore discovery be? it is likely that fee-based models will emerge to support hosting oa content. unless the resulting oa books are as easy to identify, find and access, cataloged and indexed as their paid versions would have been, the aim of providing access to all will only be realized in theory. there are some interesting additional points of entry for libraries as publishers in this space, as well: for those institutions already undertaking peer-reviewed publishing, we suspect new models of subsidy might encourage some library presses to further pursue monograph publishing. 
we hope this report is useful for those new to this work, in that it outlines the categories of activity that are currently taking place at the high-quality university presses that are publishing monographs. for those in libraries already supporting the scholarly communications process, we hope that the description of the cost and value of the many stages of work that university presses currently undertake will be helpful in highlighting the degree of investment currently being made by established presses. new publishers or publishers-to-be can use this report as a way to consider all these phases of work, and to determine which to take on, which to outsource, or which they may choose to forgo.

***

towards a pricing model

in the months ahead, publishers will be considering not just what it costs to produce a monograph, but how much they would need a subvention to cover, in order to make that book available in an open access model. our findings suggest that the “basic” cost (including staff time and other direct non-staff costs in the five functional areas of the press) represents the bare minimum needed to develop, create and disseminate new works. we would also suggest that presses examining this for themselves look closely at what we have called here “departmental overhead.” while for our purposes, it was not possible to fully break this out, there may well be elements that presses will want to include as well.

we cannot estimate costs for models and book formats that do not yet exist, but it is worth mentioning that we are at a moment of experimentation with digital formats, which suggests that in the future, even “simple” monographs are unlikely to consist of just an uninterrupted text.
innovative projects underway today include the university of minnesota press’s manifold scholarship; the efforts of the university of michigan press and its partners to publish humanities monographs linked to rich media primary source materials using the hydra/fedora software framework; and stanford university press’s initiative to create a “peer-reviewed process for …interactive scholarly research projects.”

jason weidemann, “thoughts from editors and authors on what makes a good manifold project,” january , , http://manifold.umn.edu.
“hydra/fedora mellon project,” michigan publishing, http://www.publishing.umich.edu/projects/hydra/
gabrielle karampelas, “stanford university press awarded $ . million for the publishing of interactive scholarly works,” stanford university libraries, january , , http://library.stanford.edu/news/ / /stanford-university-press-awarded- -million-publishing-interactive-scholarly-works.

for monographs to have significant reach and impact, absent a sales-based distribution channel, where will the books “live”? how will readers and institutions know they have been published? aside from the work of marketing teams, the act of distribution via sales channels, it is assumed, would need to be replaced by something similar. would today’s aggregators want to host open content alongside paid content? if not, where would the oa titles be hosted? if so, what would the hosting cost be? these are issues that have critical implications for the discoverability and impact of long-form digital scholarship.

the humanities and social sciences open access movement has been stalled for a while, in the united states at least, with university presses uncertain of the payoff, authors not sufficiently demanding the change, and little funding to support the new model in the system. the success of this author-side model will lie in the ability of funders, and ultimately the universities, to continue to support it into the future. studies also funded by the mellon foundation, at indiana university and the university of michigan and at emory university, suggest that provosts may be willing to financially support a producer-pays model for monographs, although many questions remain; for example, about what will happen at smaller institutions and for faculty not on the tenure track. the lever press collaboration, recently formed to introduce a “platinum” oa business model for the arts, humanities, and social sciences, with a commitment to issue works that will be free both for the author to publish and for the researcher to read, suggests a somewhat different set of opportunities and challenges.

***

imagining a landscape where all new scholarship is quickly and freely available to anyone with an internet connection is deeply appealing. some of the questions concerning the mechanics of how to produce works of high quality, with digitally-enhanced features and functionality, will continue to evolve in the months and years ahead. some of the economic questions may be helped along with the data in this report and others like it. but the questions will not all be financial ones. what do we see today as the real value of publishing? if it is defined as a means of disseminating current research in a timely fashion, certain production-minded oa models are likely to prevail.
they may provide a quick and efficient means of getting information into the world, and they may focus on certain production and dissemination activities primarily. if, however, there is real value in publishing as an act of curation, selection and author development, then most of the activities included in this report must continue to be supported. the value of creating not just one book, but a sustained contribution to the development of a discipline through developing works that advance the field is something editors feel is a mainstay of their work. publishers also provide important support to authors, whether first-time writers or not, by framing their work in the context of broader disciplinary trends, identifying constructive peer reviewers, polishing their prose, packaging it, and positioning it to reach its intended audience, and generally by supporting their work in turning a manuscript into a polished final monograph.

michael elliott et al., “the future of the monograph in the digital era: a report to the andrew w. mellon foundation by emory university,” july , , https://pid.emory.edu/ark:/ /q fd ; and carolyn walters et al., “a study of direct author subvention for publishing humanities books at two universities: a report to the andrew w. mellon foundation by indiana university & university of michigan,” september , , http://hdl.handle.net/ . / .
see http://www.leverpress.org/ for details.

what models will emerge to financially support these high-quality, freely-available digital monographs? many issues remain to be resolved, not least of which is how publishers balance the different business models emerging in the oa space. perhaps it is possible to have it both ways, after all.
it may be that we are at a moment when the line between “scholarship distribution” and publishing (implying a deeper investment in an author and her book project) must be more clearly defined. unpacking the core elements that generate value (value to authors, to readers, to the academic disciplines) is likely to be the subject of debate and discussion as publishers and those they publish with and for work together to create a sustainable ecosystem of high-quality, easily accessible scholarship for all.

“economic analysis of business models for open access monographs: annex to the report of the hefce monographs and open access project,” http://www.hefce.ac.uk/media/hefce/content/pubs/indirreports/ /monographs,and,open,access/ _monographs .pdf

appendices

appendix i: advisory board

peter berkery, executive director, aaup
brenna mclaughlin, marketing and communications director, aaup
barbara kline pope, director, the national academies press
darrin pratt, director, university press of colorado
john rollins, chief financial officer, yale university press
mark saunders, director, university of virginia press
rebecca schrader, director of finance and operations (retired), the mit press
john sherer, spangler family director, the university of north carolina press
charles watkinson, associate university librarian for publishing and director, university of michigan press

appendix ii: participating presses

baylor university press
colorado university press
columbia university press
johns hopkins press
indiana university press
northwestern university press
rutgers university press
texas a&m university press
the university of chicago press
the mit press
the university of north carolina press
university of arizona press
university of arkansas press
university of georgia press
university of michigan press
university of minnesota press
university of nebraska press
university of virginia press
university of washington press
yale university press

appendix iii: note on methodology

in developing this methodology, we encountered certain challenges, some of which we understood going into the process, others of which became apparent as work progressed. many directors praised the approach for offering them an understanding of costs, particularly concerning staff time, that they did not have before. some staff were less convinced, and shared their concerns: about the value of staff allocations in general, about our method in particular, about potential uses of the data. below is a short outline of the issues encountered in gathering data directly from staff, and the steps taken to mitigate them.
• we designed our staff allocation tool to be completed on-site, by staff using desktop computers or laptops. this was a problem for several presses, whose staff did not have access to laptops. even where presses were able to set up computers for our meetings, some staff were new to using them, and basic set-up took additional time. where technology was an issue, we used a printed worksheet, allowing staff to write in their answers, which our research team entered into excel sheets back at the office.
• the question of how to allocate one’s time is not an easy one in general, and asking staff to come up with accurate figures proved difficult. the basic question asked was quite simple: how would you say you spend your time on monographs (as defined in the study), versus work on other types of publishing, versus time spent on other non-publishing things? staff in certain departments (marketing comes to mind) were more comfortable with making estimates. in other departments, staff were less comfortable with this approach.
a good deal of our on-site time consisted of group conversations, in which we discussed and debated what reasonable allocations might be. doing this in a group setting was invaluable: if one person was able to offer their approach to the group, others were then able to consider how their numbers might be different. this group discussion helped to normalize the responses, at least at the departmental level.
• gathering a comprehensive set of data required extensive follow-up with each press. not all staff could be there for our visits, and so our follow-up process involved reviewing the staff lists against the responses received and working with our press contacts to make sure all data were recorded.
• once the data were received from the presses, we reviewed and scoured them for anomalies. if most staffers in a department, for example, indicated spending little time in prior years on a title, but one staffer reported significant time in prior years, we might check with that person to make sure the response was accurate.
• in some cases, human error intervened. three questions asked people to assign values to different books or activities, adding to %. the spreadsheets included a check field, showing whether the figures entered did indeed equal %. many sheets came back without reaching this figure. if the error was - points off, we simply recalibrated all responses to equal %. if, however, the gap was greater than that, we went back to the staff to ask them to have another look at their response, and to re-submit.
• the study definition of “monograph” was itself an issue of discussion. the exclusions (translations, edited works) did not map directly to what presses consider to be their monograph list. this posed some challenges when asking staff to develop estimates for books falling into our definition of monographs.
• while presses were asked to submit titles each, three presses had fewer than titles.
two presses had such low title output that they ended up reporting on two years’ worth of data. in those cases, costs for the -year span were allocated to the monographs.

data gathering

we gathered three categories of data, using different tactics:
• staff time allocations
• direct costs per title (“title specific costs”)
• press-level overheads (g+a and departmental costs)

staff time allocations

on-site visits included minute sessions with staff in acquisitions, edp, and marketing departments. the sessions involved a round-table discussion, followed by time spent filling out the allocations grid together. the discussions were framed in a way to guide the group through three main areas:
• the process that they engage in
• what they see as the elements of their work that tend to drive cost
• what they see as the elements of their work that deliver the most value to the publishing enterprise

the one exception to this was yale university press, where the press chose to give us department-level data, rather than having staffers complete staff allocation forms themselves.

each staffer who responded to the study ( in all) completed a staff allocation form. this was completed in excel by most respondents. for some presses, without the ability to use laptops or otherwise complete the excel forms, we permitted staff to handwrite in the responses, which ithaka s+r staff later keyed in.

direct costs per title

the non-staff costs directly incurred by each title were gathered at the press, and entered into an excel workbook provided by ithaka s+r, with one tab for each of the twenty target titles. our main contact was asked to gather data on these costs (some did this themselves and others asked department heads to participate). the direct costs were grouped in five core categories, corresponding to the key departments (acquisitions, manuscript editorial, design, production, and marketing).
press-level overheads

to gather data on overheads, ithaka s+r provided the director and cfo with a sheet labeled “press-level costs.” this included two sections. one section was for the press-level g+a expenses, such as legal and accounting. the second section was for department-level overhead. this section was intended to capture any relevant non-staff costs not already captured on the “title-specific” sheets. indeed, the financial lead at each press was asked to make sure that the title-specific sheets were given priority over the department-level expenses; all costs should first be assigned at the title level, where possible. any remaining expenses could be included in the departmental costs, which we later allocated per title.

data analysis

staff costs

to arrive at staff cost, we had press employees in each of the five departments estimate the time they spent in several ways. first, they were asked to estimate time spent on monographs in , the year the study covered. staff had a list of the relevant titles in hand, for reference, and were guided by a member of the research team in thinking about ways to best estimate time. they were also asked to consider time they might have spent on the list in earlier years ( , ) and in later years ( ), and to assign percentages.

the calculation of staff cost involved adding the percentages of time people reported spending on monographs published in . each staff response was matched to the corresponding person’s salary and benefits. this cumulative percent was multiplied by the person’s salary/benefits to get a total cost for this person’s work, over several years, on monographs.

staff were then asked to consider which of the titles they worked on took more (or less) time than others, based on recollections of the work they had done.
some reported spending the same amount of time on all titles; this might be the case for a marketing person, for example, whose role is to present all titles at conferences. but most were able to distribute % of their cost among the titles, based on which took more or less time. (their resulting percentages were also displayed to them, by showing an estimate of equivalent days of work.) this total cost pool was then distributed across all the monographs they worked on. the cost of each staffer’s time on a specific title was added up across the press, resulting in a total cost for staff time on a specific title.

staff were also asked to report on those activities they performed at the press, again, assuming % as the total amount of time they devoted to monographs. they were given a full list of activities to consider. though presented in “departments” including acquisitions, manuscript editorial, design, production, and marketing, all staff had the opportunity to assign a percentage of time to activities in any department. this was necessary due to the great variety among presses in organizing the work, particularly of the edp functions. for this reason, the figures offered for “staff cost” are best understood as the cost of staff time devoted to those activities, rather than the cost of the specific staff members in a specific department.

note: since many directors also do acquisitions work, we privileged their work in acquisitions, charging a portion of their reported time to that line. their remaining time was then included in the g+a calculation.

staff overhead costs

at any publishing house (at any office, really) people spend time on things other than their primary activity, here producing books. when reporting on how they spent time in fy , staff were permitted to enter a percentage of time for “other” activities, including general staff meetings on things like health benefits, administration, and so forth.
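The staff-cost steps described under "staff costs" (summing the share of time a person reported on the monograph list across years, multiplying by salary and benefits, then spreading that pool across titles by relative time) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual workbook logic: the function names, the salary figures, the yearly shares and the title weights are all hypothetical.

```python
def staff_cost_pool(salary_and_benefits, yearly_shares):
    """Total cost of one staffer's work on the monograph list.

    yearly_shares: fractions of annual time (0.0 to 1.0) reported for each
    year in which the staffer worked on the list (hypothetical inputs).
    """
    cumulative_share = sum(yearly_shares)
    return salary_and_benefits * cumulative_share


def distribute_across_titles(pool, title_weights):
    """Split one staffer's cost pool across titles in proportion to weights.

    Weights are normalized first, so they need not sum to exactly 100.
    """
    total = sum(title_weights.values())
    if total <= 0:
        raise ValueError("title weights must sum to a positive number")
    return {title: pool * w / total for title, w in title_weights.items()}


def total_staff_cost_by_title(per_staffer_costs):
    """Add up every staffer's per-title cost to get press-wide totals."""
    totals = {}
    for costs in per_staffer_costs:
        for title, cost in costs.items():
            totals[title] = totals.get(title, 0.0) + cost
    return totals


# Hypothetical numbers, for illustration only.
editor = distribute_across_titles(
    staff_cost_pool(80_000, [0.10, 0.25, 0.05]),  # 40% of time over three years
    {"book_a": 50, "book_b": 30, "book_c": 20},   # more time on book_a
)
marketer = distribute_across_titles(
    staff_cost_pool(60_000, [0.20]),              # 20% of time in one year
    {"book_a": 1, "book_b": 1, "book_c": 1},      # same time on every title
)
press_totals = total_staff_cost_by_title([editor, marketer])
print(press_totals)
```

Normalizing the weights inside `distribute_across_titles` is a design choice of this sketch: it keeps the split well defined even when a respondent's percentages do not sum exactly to the full amount.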
they were also permitted to enter a percentage for time spent on “books never published.” this was a category specifically introduced for acquisitions editors, who may devote a great deal of time to identifying new works. the time spent on “general” activities and on “books never published” together comprises what we are calling staff overhead.

to arrive at a per-book staff overhead by press, the costs were pooled at the press level and then distributed across all other books and monographs published by the press in , though the methodology is different for these two types of costs. staff costs on general activities are distributed across all books based on the relative time spent on monographs versus non-monograph publishing activities in . the cost of time spent on “books never published” is pooled at the press level, then divided amongst the other monographs only. thus % of these costs are divided amongst all of the monographs.

costs on activities

each staffer who reported on time provided data on the percent spent on each activity related to their monograph work. we estimate the cost of an activity for a monograph by multiplying the percent of time someone spends on a particular activity by the cost of that person’s time on a particular monograph. for example, if one person’s cost for book a is $ , and % of her time is spent on author communication, that book has a cost of $ , on author communication from that person. these are added up across all people in the press to get a total cost by activity for each book and activity by department.

direct costs

the title-specific costs were added up across all titles. the worksheet used to gather direct costs by department also included a line to enter any estimates of cost for in-kind contributions.

press-level overheads

in this study “press-level overhead” includes general and administrative expenses as well as departmental overheads.
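The per-activity calculation described under "costs on activities" (a person's cost on a title, multiplied by the share of their monograph time spent on an activity, summed over everyone at the press) can be sketched as below. The record layout, names and figures are hypothetical and chosen only to make the arithmetic visible; they are not taken from the study.

```python
from collections import defaultdict


def activity_costs(staff_records):
    """Return {(title, activity): cost} summed over all staff.

    Each record is a dict with two hypothetical fields:
      title_costs     - {title: cost of this person's time on that title}
      activity_shares - {activity: fraction of this person's monograph time}
    """
    totals = defaultdict(float)
    for record in staff_records:
        for title, cost in record["title_costs"].items():
            for activity, share in record["activity_shares"].items():
                totals[(title, activity)] += cost * share
    return dict(totals)


# Hypothetical example: two people who both worked on "book_a".
records = [
    {
        "title_costs": {"book_a": 1000.0},
        "activity_shares": {"author communication": 0.25, "copyediting": 0.75},
    },
    {
        "title_costs": {"book_a": 2000.0},
        "activity_shares": {"author communication": 0.5, "marketing": 0.5},
    },
]
costs = activity_costs(records)
print(costs[("book_a", "author communication")])
```

The sketch applies one set of activity shares per person to every title that person worked on, which matches the description above; it does not attempt to model activity shares that vary by title.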
departmental costs, where possible, were allocated to specific titles. for some non-staff direct costs at the department level, such as exhibits for the marketing department or travel for the acquisitions editors, presses were able to assign costs to specific titles. where that was not possible, the cost was included as “departmental overhead,” added to press-level overhead and allocated as described below. neither was particularly easy to allocate, as presses had a wide variety of methods in use. we explored many ways to do this, and three are described below. only the third, the one based on “effort spent,” is shown in the data tables for the report.
• an even split. the first method is to distribute press-level overhead evenly across all books. here we calculated all costs on the press information tab and totaled them, then divided the total by the number of books the press produced in .
• weighted by title. the second method uses weights. starting with the total press-level overhead, we took a percentage of that cost, based on the percentage of the total number of titles published that the target titles represent. this new pool was then distributed across the target titles based on the weights described in the staff level data. these weights reflect relative time spent on each title. in this model, books that required more time (and expense) from staff are also charged a greater share of overhead. (the weights are generated by counting the total number of days staff spent on each target title. then the percent of time spent on each specific title is the number of days on that title divided by the total number of days on target titles. if this number is %, the monograph gets % of the overhead costs for target monographs.)
• weighted by press effort on monographs. the third method is to determine the relative effort spent across the press on the monograph list, as a means of determining what portion of the overhead ought to be allocated to monographs, as opposed to other product lines at the press. we did this by calculating a percentage of time spent by all reporting staffers, and assuming that non-reporting staffers spent no time on monographs. the average time spent on monographs was used to determine the pool of g+a to be charged to the monographs list. this pool was then divided evenly by the total number of monographs, to arrive at a g+a allocation per monograph for each press.

framing the experience: a study of the history of interfaces to digital humanities projects

the following paper is a slightly expanded version of the abstract that will appear in the conference book of abstracts. a longer version of the paper has been published before print by digital scholarship in the humanities at https://academic.oup.com/dsh/article-abstract/doi/ . /llc/fqz / . i am therefore not able to upload that paper to this repository, for copyright reasons. but please do read the full paper if you have access to it.

introduction:

drucker ( ) argues that, although the goals of human computer interface research are to render the interface invisible and facilitate access to digital content, interfaces themselves should be legitimate objects of study. yet little attention has been paid to this aspect of digital humanities resource design. the following paper therefore reports on a study of interfaces to long-lived dh resources to determine what information we may gain from them about the history of dh project development.

research questions:

the study addresses the following questions:
• what can we learn from a study of interfaces to digital humanities material?
• how have interfaces to digital humanities materials changed over the course of their existence?
• do these changes affect the way the resource is used, and the way it conveys meaning? • should we preserve interfaces for future scholarship? experimental design the following research therefore adopts a case-study approach to a study of the interfaces to digital humanities resources, analysing a sample of projects and their progress over time, in detail. the sample is as follows: • the women writers project- brown university and subsequently northeastern university • the valley of the shadow project–university of virginia • the william blake archive- university of virginia, university of north carolina • proceedings of the old bailey online- sheffield university and hertfordshire university • digital images of mediaeval music- kings college london and oxford university • the oxford text archive- oxford university • virtual seminars for teaching literature- oxford university the reasons for choosing these projects are largely pragmatic: to reach a detailed understanding of interface development over an extended period it was important that resources had as long a lifespan as possible, but remained accessible and usable. the above projects were established in the s or early s and are still accessible, even if in a somewhat different form; relatively few dh projects with such a long history are still easily available. nevertheless, this is a proof of concept study, to investigate whether the proposed method produces useful results. it is not intended to represent a comprehensive audit of all such surviving projects. undertaking a larger study of this type could represent the next development of this research, were funding to be granted to do so. although digital humanities is now a global field, its antecedents in literary and linguistic computing were largely anglo-american, based in a small number of universities, some of which are represented in the sample above. thus the sample is skewed towards english language resources. 
it was also important to have fluency in the language of the resources, to gain the most complex possible understanding of them, and their accompanying documentation. however, important work was being done in humanities computing in countries such as finland, germany and italy during the same period. thus future work could be carried out on a sample of projects in collaboration with researchers fluent in such languages.

the method of analysis is influenced by the work of vela et al. ( ) who used the internet archive’s wayback machine to investigate the design history of the perseus project. the wayback machine was therefore used to identify the original presentation of, then track every significant design change to, the websites in the sample. it is impossible to be certain when each change was made to the resources because, especially in its early days, wayback machine captures were relatively infrequent. nevertheless, this method does provide the most comprehensive insight currently possible into interface change over time. each website, and all significant design changes, were examined in detail, in terms not only of their visual design but also of their technical functionality, encoding and markup.

findings:

a great deal of valuable information may be derived from studying the interfaces of long-lived projects. the visual design can communicate subtle messages about the way the resource may have been conceived by its creators and the assumptions made, and perhaps subsequently altered, about user behaviour. it can also provide information about the changing place of digital humanities projects in local and national infrastructures, and the way that projects have sought to survive in challenging funding environments. a study of how interfaces have developed reminds us about how changing access conditions (modems to fibre broadband) and technical standards (hand-coded pages with blue hyperlinks to css, xhtml and postgres) have affected web design.
when founded, these projects were pioneering users of an experimental medium, so it was important to establish their intellectual credibility in the scholarly community. it is perhaps not surprising, therefore, that the interface designs of many early websites, such as that for brown women writers, referred to traditions of printed book design, and offered detailed information about the intellectual and technical credibility of the project team.

figure: original interface to the brown women writers project

early interfaces were often visually experimental. project teams could assume no knowledge of digital resources, and so were creative in the use of visual navigation devices such as colour, the arrangement of resources in tables, or even an image of a floor plan of a physical archive building.

figure: the floorplan navigational device for valley of the shadow

growing awareness of user interface design conventions led to many interfaces being redesigned. however, the sample projects have preserved visual links to their original identity, for example by using an original colour scheme, fonts or the arrangement of images into a table-like frame.

figure: detail of the original interface of virtual seminars for teaching literature

figure: interface to the ww digital archive, successor project to virtual seminars

some redesigns, such as that of the blake archive, radically changed the user experience and the visual identity, emphasising high-resolution images over the original dominance of text. if current users cannot access earlier versions of the site, then valuable information about the assumptions made in designing the original resource is lost.

figure: william blake archive - original interface

figure: william blake archive - current interface

conclusion:

interfaces to dh projects provide valuable information about how their creators strove visually to communicate the meaning and importance of the material.
subsequent changes provide evidence of how dh led, or responded to, advances in web technologies and interface design conventions. yet, while a great deal of attention is paid to digital preservation and curation, both in dh and information studies, the question of how, or whether, interfaces should be preserved remains unjustly neglected. it is still possible to find early versions of many digital resources using applications such as the wayback machine. however, this is not a perfect solution. once-experimental functionality, such as imagemaps, frames or animations, may be incompatible with the wayback machine’s harvesting technology. this means that resources are already either wholly or partially inaccessible in their original form, and this may become even more of a challenge in future. this paper will therefore argue that the dh community should work with libraries to preserve original interfaces and their subsequent iterations. it is better to make conscious decisions to archive all versions of sites that are still accessible, as part of an agreed preservation strategy. not to do so means that we risk losing a wealth of information about the development of the early web and the status of digital humanities resources.

references:

drucker, j. . “reading interface.” pmla ( ): – .
poole, a. h. . “how has your science data grown? digital curation and the human factor: a critical literature review.” archival science ( ): – .
vela, s., cerrato, l., ilovan, m., li, t., rockwell, g., ruecker, s. . “the biography of an interface: perseus digital library.” in canadian society for digital humanities/société canadienne des humanités numériques (csdh/schn) conference. st. catharines, ontario.

the ebetharké syriac digital library: a case study

i. beard. published in digit. libr. perspect. (computer science).

purpose: the purpose of this paper is to review and describe the teamwork, collaboration and learning experiences involved in meeting the unique challenges of establishing a new digital library for syriac collections. the ebetharké syriac digital library portal is a collaborative effort between the libraries at rutgers, the state university of new jersey, and the beth mardutho syriac institute, a traditional library of texts, to create a specialized digital library collection online…
cdh benevolence and excellence: digital humanities and chinese culture - , shanghai, china

president of adho constituent organizations board elisabeth burr

short introduction

a very good morning to all of you. i have the great honour and joy to bring the greetings from the alliance of digital humanities organizations to you. as my internet connection is shaky, you will watch the video i have recorded beforehand. please go to adho’s website if you want to know which associations make adho, as this part of my speech was, sadly, cut for time reasons. have a very good conference!

recorded speech

a very good morning to you all whether you are taking part in the conference in person or via computer screens. when i received the invitation to attend the opening ceremony of the chinese digital humanities conference from your colleague, dr jing chen, on behalf of the executive committee of this conference, i was thrilled. i could, however, not accept the invitation to be your distinguished guest straight away. as president of the alliance of digital humanities organizations (adho) i obviously had to inform, first of all, those who i represent and ask them for their opinion. as the reaction was unanimously positive i have now the great pleasure and honour to welcome you to this conference, to bring to you the collegial greetings of the alliance of digital humanities organizations and to wish your conference with the wonderful theme “benevolence and excellence: digital humanities and chinese culture” a huge success.
allow me to say a few words about the alliance of digital humanities organizations. the first digital humanities organisations, or better organisations for humanities computing, as the field was called at the beginning, were founded, as far as we know, in the ies. in fact, in the european association for literary and linguistic computing (allc) was founded at king’s college london, and in followed the foundation of the association for computers and the humanities (ach) in the united states of america. from onward, these two associations celebrated joint conferences, which took place one year in europe and the other year in the united states. in , discussions started about an umbrella organization which would foster closer collaboration and exchange within the field of digital humanities more widely and which also other organisations might want to join. these discussions led to the foundation of the alliance of digital humanities organizations, or adho as it is generally called, in . the first adho digital humanities conference was celebrated in paris in . adho’s aim is to promote and support digital research and teaching across all arts and humanities disciplines, and to foster excellence in research, publication, collaboration and training. over the years, more and more digital humanities associations applied to become constituent organisations of adho. in , the canadian society for digital humanities / société canadienne des humanités numériques (csdh / schn) joined adho, in centernet and the australasian association for digital humanities (aadh) followed, in the japanese association for digital humanities (jadh) became a constituent organisation of adho, saw the arrival of humanistica. l'association francophone des humanités numériques / digitales, and in adho welcomed the taiwanese association for digital humanities (tadh), la red de humanidades digitales (redhd) based in mexico, and the digital humanities association of southern africa (dhasa). 
as adho's legal entity, the stichting adho foundation, was established in the netherlands, adho needs to respect dutch laws. adho sponsors a whole range of special interest groups (sigs) which enable members of adho constituent organisations who have similar professional specialties and interests to exchange ideas, keep themselves up to date on developments in their specific field, and develop related activities. these include:

• digital literary stylistics (sig-dls)
• audiovisual data in digital humanities (sig avindh)
• global outlook::digital humanities (go::dh)
• geohumanities sig
• libraries and digital humanities
• linked open data (dh-lod)

adho offers its own constituent organisations and sigs a common infrastructure where web pages, mailing lists, and email addresses can be hosted, and where tools like wordpress, drupal and mediawiki and others are made available for the community. this infrastructure is also used by affiliated bodies like the text encoding initiative (tei) and by some of the digital scholarly journals, which are published by adho constituent organisations and which adho sponsors: the open access peer-reviewed digital humanities quarterly (dhq), digital studies / le champ numérique, and humanités numériques. the journal of the alliance of digital humanities organizations is the peer reviewed and impact factor holding digital scholarship in the humanities (dsh) published by oxford university press. among the many countries from which proposals for publication were submitted during the last year, the people's republic of china with submissions actually holds the top place of the list.

every year adho organises the digital humanities conference. at first, this conference took place either in europe or in the united states. when more associations joined, adho’s conference started to travel to other continents and countries. this year’s conference was supposed to be in ottawa, canada, but because of covid- it had to be cancelled.
this was a very sad and frustrating experience for us all. that we managed to have a virtual conference in the end was only possible because some of our colleagues were prepared to invest a lot of their time and energy in its realisation. obviously, this virtual conference could not do away with the loss we feel. it would have been so much better if we could have met in person as at least some of you can do at this conference. we were longing all through the year to meet our colleagues, exchange ideas with them, construct networks and collaborations and above all get to know each other better. we come after all from very different countries and continents, belong to diverse cultures, speak different languages, and have different views and perspectives on digital humanities. as adho’s next digital humanities conference scheduled for in japan had to be postponed for a year because of the pandemic we will now have to wait until before we can meet the global digital humanities community again in person. this also means that we have to wait much longer than planned, before we can welcome the japanese language among the conference languages. over the years and because of the hard work of the standing committee on multi-lingualism & multi-culturalism (mlmc) adho experienced a process of growing awareness that english is not everything and that the diversity of our community cannot be bridged by having english only conferences. we had to acknowledge instead that the languages and cultures we call our own determine our doing and our concepts of digital humanities. slowly but continuously languages which are spoken by the people or important communities of the country where our conferences took place were admitted as conference languages and ways were found to be inclusive when we present our papers. we were really looking forward to take up the challenges, which japanese will certainly present for most of us. 
adho is governed by two boards, a constituent organizations board, composed of one representative from each of the constituent organizations (cos) and the special interest groups (sigs) coordinator. the role of this board is to establish vision, strategy, and policy for adho. the second board is the executive board, which enacts the decisions taken by the constituent organisations board and administers the day-to-day running of the organisation. as president of the board, which represents all the digital humanities organisations which together make adho, and also personally, i would have wished to be able to bring you adho’s greetings in person and to get to know the digital humanities community which gathers at this conference at least a little bit, but the pandemic makes this impossible. i hope very much that the present virtual encounter will not remain the only one and that sometime in the future i will meet members of this community and will be able to exchange ideas with them. i also hope that at some point in the future the digital humanities community of the people’s republic of china will be part of the adho “family” and will, by contributing its own perspectives and cultures, help adho to become ever more embracing and sensitive to the diversity of the field and its scholarly community. may you have a great conference and enjoy very enriching scholarly debates and friendly human encounters. the american archivist vol. , no. spring/summer aarc- - - page pdf created: - - : : :am reviews trumps records. i believe this is not entirely the case. archives and records scholars who focus their research on organizational information cultures may be relatively uncritical of “information” by omission, rather than because they blindly embrace the view criticized by yeo. 
but, on the other hand, they also tend to understand “culture” in an almost anthropological sense (the constellation of practices, attitudes, and ideas that members of a collective use to make sense of their world), in contrast with yeo’s comparatively vague use of that concept.

these small issues aside, yeo’s book provides a lucid argument for the need for records managers and archivists to resist the song of the information sirens. philosophically grounded and analytically clear, records, information and data offers a view of records capable of acting as the foundation for a renewed archival discipline for the twenty-first century.

© juan ilerbaig
university of toronto

notes

geoffrey yeo, “concepts of record ( ): evidence, information, and persistent representations,” american archivist , no. ( ): – ; yeo, “concepts of record ( ): prototypes and boundary objects,” american archivist , no. ( ): – .
geoffrey yeo, “representing the act: records and speech act theory,” journal of the society of archivists , no. ( ): – .
geoffrey yeo, “‘nothing is the same as something else’: significant properties and notions of identity and originality,” archival science , no. ( ): – .
geoffrey yeo, “the conceptual fonds and the physical collection,” archivaria (spring ): – ; yeo, “bringing things together,” archivaria (fall ): – .
see, for instance, chris hurley, “what, if anything, is a function?,” archives and manuscripts ( ): – .

the eugenic rubicon: california’s sterilization stories
by jacqueline wernimont and alexandra minna stern. scalar, ca. . epub. freely available at http://scalar.usc.edu/works/eugenic-rubicon-/index.

the eugenic rubicon: california’s sterilization stories is an interdisciplinary, multimedia collaboration in digital medical humanities. it offers novel and compelling interpretations of the social history of eugenics, as well as glimpses of the potential of archives to serve emerging forms of scholarship.
the project is led by two distinguished academics: alexandra minna stern is professor and chair of the department of american culture at the university of michigan and director of its sterilization and social justice lab. at dartmouth college, jacqueline wernimont is distinguished chair of digital humanities and social engagement, and associate professor in the women’s, gender, and sexuality studies program. they are joined by a project team that includes students, scholars, and university of michigan epidemiologists associated with stern’s sterilization and social justice lab. eugenic rubicon is supported by research funding from the university of michigan and arizona state university, and a humanities collections and reference resources grant from the national endowment for the humanities.

eugenics is typically understood as a progressive-era movement to improve the genetic quality of the human race through controlled reproduction. conventional historical interpretations maintain that eugenics was abandoned after the exposure of nazi germany’s abuses. even those knowledgeable of the history of medicine may be surprised to learn that eugenics laws persisted in the united states long into the twentieth century and that eugenic procedures are still reported in the present day. eugenic rubicon is a strong contribution to new historical research that interprets eugenics within the framework of reproductive justice, focusing on how institutions applied eugenics laws to conduct social control along gendered and racial lines: stern’s own eugenic nation: faults and frontiers of better breeding in modern america is a foundational work in this area.
like eugenic nation, eugenic rubicon focuses on california, which had the most aggressive eugenic sterilization program in the united states. from the early to mid-twentieth century, approximately , patients in state institutions underwent forced sterilizations after being judged unfit to reproduce. the state law that supported this program was only repealed in . historians are bringing attention to how forced sterilization disproportionately targeted people of color, and children and women whose behavior did not align with social norms. eugenic rubicon’s focus on patient demographics and experience (rather than on legislative history or public policy) is a valuable effort to understand the impact of forced sterilization and to view these programs through the lens of social justice.

eugenic rubicon was developed on scalar, an open source publishing platform designed for the presentation of multimedia digital scholarship. while the resource is ostensibly presented in an e-book format with a traditional chapter structure and index, its introduction describes “a developing prototype that uses mixed media and digital storytelling methods.” indeed, the scalar platform explicitly supports a plurality of scholarly and creative approaches, presenting a resource as readable as a straightforward historical analysis, while also offering nontraditional treatments of the subject matter (a marketing video for scalar asks, “when does an electronic book become an object to think with?”).

eugenic rubicon’s chapter structure includes introductory text, an examination of the bureaucratization of human experience through recordkeeping, a historical overview of the sonoma state hospital, an invitation for creative engagement with the historical content, and a call for reparations for eugenics survivors. throughout, multimedia objects, digitized archival materials, and embedded external content augment the text. in fact, several of the chapters consist of text that exists primarily to support embedded content, such as external websites and journalism. the navigation invites the reader to simply proceed through the chapters or take detours to read poems written by a project participant, explore an interactive timeline, or leave the resource entirely in favor of other content. ideally, this structure should support diverse levels of engagement with the content and target multiple audiences, from medical researchers to creative artists and social historians. however, while the chapter presentation is thematically coherent, one gets lost in the multiple hyperlinked paths (internal and external), which are presented in both the primary text and the site navigation. working through the scalar presentation evokes memories of navigating websites in the s, which often presented a labyrinth of internal and external links, with an underlying presumption that the ultimate goal of web design is for the user to explore the interface. going forward, the eugenic rubicon team can enrich the experience of readers by more critically deploying scalar to serve the goals of the project.

those who manage medical archives, or conduct research in them, are sure to encounter records protected by the federal health insurance portability and accountability act (hipaa).
under the act, records held by a covered entity (such as most health care providers), and that disclose the individual identity of medical patients, are defined as protected health information (phi) and are restricted by hipaa’s privacy rules. at my own institution, a covered entity under hipaa, most researchers must complete an institutional review board (irb) process before accessing records containing phi. materials subject to hipaa include not only routine medical records, but correspondence, recordings, and photographs of patients. a few of our patrons are willing and able to undertake the irb process, but barriers to access deter most and they decide not to pursue this avenue of research. we see that, paradoxically, laws intended to protect the privacy of patients also prevent historians from researching and publishing on their experiences. the outcome, as eugenic rubicon notes and attempts to redress, is that the perspective of patients continues to be underrepresented in the history of medicine: “very little is known about the demographics and experiences of people sterilized, often against their will” (introduction). part of eugenic rubicon’s aim is to demonstrate the work of the sterilization and social justice lab, which identified a collection of nearly , patient records associated with california institutions from the s through the s. currently held on microfilm in the california state archives and protected by hipaa, the records constitute an invaluable, yet inaccessible, record of the identities and experiences of institutional patients.
eugenic rubicon provides a redacted sample of the records and promises that, as the records are digitized and made available within appropriate protocols, the resource will take on expansive new dimensions.

for archivists, the second chapter’s observations about recordkeeping will hit home. “turning people into paperwork” raises theoretical and practical questions about the dehumanization of individual experience through the maintenance of bureaucratic forms, which “demonstrate how medical paperwork encouraged doctors to understand their patients in terms of boxes to check and pre-defined diagnoses.” eugenic rubicon aims to at least make this bureaucratization of the patient experience visible, if not to rehumanize the system. those of us who manage collections of rote, systematic paperwork would do well to consider this problem and question how formulaic records in our own collections erase personal experiences and dehumanize the subjects they represent. we should also consider the redemptive possibilities of drawing individual experience out of systematic records, whether through our own interpretations or by supporting new scholarship that draws narratives from data.

eugenic rubicon provides a case study of the questions raised by new forms of digital scholarship. because multiple versions of pages have already been developed, and authors advise that the resource’s content and structure are subject to further change, it is challenging to review or cite the resource in a traditional manner. the extent to which eugenic rubicon’s content underwent peer review before publication is also unclear. do dynamic, mutable resources such as eugenic rubicon ever move from prototype into a final, scholarly product? if not, is that a problem, or just a demand for the paradigms of scholarly communication to adapt?
and, while print books and e-books can both suffer technological failures, such as a torn page or crashed download, projects hosted on platforms such as scalar can suffer glitches that potentially impact the scholarship itself: for example, how is the scholar or reviewer to evaluate a digital object that will not load? if embedded external content changes, disappears, or triggers browser security issues, how is the authority and impact of the scholarship affected? the scalar-supported offering of multiple interpretations and interactive paths presents another communication challenge, which could muddy the thesis of even the strongest scholarship. however, in the case of eugenic rubicon, access to the primary subject matter, forced sterilization patients, is so limited that alternative interpretations may be the best options currently available. both creative writing and well-sanitized multimedia can indirectly illuminate the experience of patients without breaching their privacy or imposing a viewpoint on them that is not their own.

setting aside scholarly communication questions and the emphasis on external content over original research in some of its chapters, eugenic rubicon’s scholarly text is novel, approachable, and appropriately academic. as a contribution to new scholarship on eugenics, its topical content and critical approach are relevant not only to historians and archivists working in the health sciences, but also to social historians, students, policymakers, and eugenics survivors. its critique of recordkeeping offers valuable perspectives to archivists who seek to support researchers in accessing and interpreting sensitive personal data.
it is deeply troubling that the first-person experiences of patients have been systematically erased or underrepresented in historical analysis, whether due to bias or the functional problems of hipaa. eugenic rubicon is an imperfect, yet promising, step toward finding those voices.

© maija anderson
oregon health and science university library

notes

for example, see wikipedia, s.v. “eugenics,” https://en.wikipedia.org/wiki/eugenics.
hunter schwarz, “following reports of forced sterilization of female prison inmates, california passes ban,” washington post, september , , https://www.washingtonpost.com/blogs/govbeat/wp/ / / /following-reports-of-forced-sterilization-of-female-prison-inmates-california-passes-ban.
alexandra minna stern, eugenic nation: faults and frontiers of better breeding in modern america (berkeley: university of california press, ).
alliance for networking visual culture, “about scalar . —trailer,” https://scalar.me/anvc/scalar.
for a more in-depth examination of the implications of hipaa for archivists, plus a set of recommended practices, see emily r. novak gustainis and phoebe evans letocha, “the practice of privacy,” innovation, collaboration, and models: proceedings of the clir cataloging hidden special collections and archives symposium, march , http://www.medicalheritage.org/wp-content/uploads/ / /gustainis-letocha.pdf.

recordkeeping informatics for a networked age
by frank upward, barbara reed, gillian oliver, and joanne evans. clayton, victoria, australia: monash university publishing, . pp. softcover and epub. softcover $ . , epub freely available at http://books.publishing.monash.edu/apps/bookworm/view/recordkeeping+informatics+for+a+networked+age/ / _cover.html. softcover isbn - - - - ; epub isbn - - - - .

business analysts, known as consultants, play a ubiquitous and accepted role in the most lucrative sectors across the globe. the methods they use to analyze and document business processes increasingly relate to sectorial informatics. the authors of this volume expand on their previous work to argue that records and archives professionals would do well to adopt similar methods so as

journal of cultural analytics, may ,

response by the special interest group on digital literary stylistics to nan z. da's study

j. berenike herrmann (a), anne-sophie bories (a), francesca frontini (b), simone rebora (c), jan rybicki (d)
(a) university of basel, germany; (b) paul valéry university, montpellier; (c) university of verona; (d) university of kraków, poland

abstract

the publication of nan z. da's study in critical inquiry has triggered a debate about the methodological and conceptual dimensions of digitally assisted inquiry in literary studies. nan z. da's fundamental critique of what she calls "computational literary studies" addresses the work of the international special interest group "digital literary stylistics" (sig-dls) of the alliance of digital humanities organizations (adho).
thus we, the five scholars forming the sig's current steering committee, would like to make a short statement.

initially, we found it a bit surprising that a paper with so many formal, methodological, and theoretical flaws has received so much serious attention. formal and conceptual problems of the paper have been documented in abundance (see andrew piper's do we know what we're doing?, the responses at the critical inquiry blog, and twitter), and we don't see it as our role to add much at this level of the discussion. however, from the point of view of an international sig dedicated to literary stylistics, and digital literary studies, stretching across traditional disciplinary and methodological boundaries, we would like to make a few observations.

first, we observe that nan z. da does not refer to the wealth of european, south-american, australian, african, and asian contributions to what she calls "computational literary studies." meanwhile, there is a substantial body of non-north-american contributions to international journals (such as digital humanities quarterly or digital scholarship in the humanities), and also our sig represents members from around the world. it thus appears that the paper is to be primarily understood within the north-american frame, including its particular reference frame of prestige, distribution of research funding and recruitment strategies.
a geographic, or cultural, bias may thus indeed be added to the problems that the paper has. and of course, there is room for asking questions about the author's actual motives. second, the fact that the paper has triggered a very serious debate points to a larger phenomenon extending beyond the north-american scholarly frame. it is thus of direct importance to any scholar using computational assistance in the study of literary texts: within the humanities there exists a number of scholars and institutions mounting an irreconcilable reproach against any "digital" or "computational" approaches to literary texts. this position centers around the contention that "literature" is not "reducable" to "numbers" (as well as on a perceived excess in distribution of funds to "dh"). in its extreme forms, this position goes beyond a "healthy skepticism." past experience shows that limiting "permissible" scholarly approaches for ideological reasons is both harmful and ineffective. third, we would like to highlight the difference between "cls" and "dls", thus, between "computational" and "digital" approaches, where "digital" is the more encompassing notion subsuming contributions from the established disciplines of computational linguistics, text mining, and nlp, as well as corpus linguistics and corpus stylistics. it thus comprises also computer-enhanced close reading, for example by means of keyword in context (kwic), or digital annotations of various—including hermeneutic—kinds. this factual practice counters da's statement that cls analyses are essentially run "without regard to position, syntax, context, and semantics" (p. ). da does not seem to be aware of the actual range of methods and the various traditions present in dls. whether in quantitative or qualitative studies, scholars have persistently striven to account for the complexity of literary discourse, and thus get much beyond "basic word frequencies" (p. ). 
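as an illustration of the point about keyword in context, the following is a minimal sketch of a kwic listing in python. it is not taken from any sig-dls tool; the function name `kwic`, the `width` parameter, and the sample sentence are assumptions made only for this example. it shows that even a very simple digital reading aid keeps each hit's position and its neighbouring words rather than collapsing the text into bare frequencies.

```python
import re


def kwic(text, keyword, width=3):
    """Return (position, left context, keyword, right context) for each hit.

    Hypothetical helper for illustration: tokenisation is a plain
    lowercase word split, which a real study would replace with a
    tokenizer suited to its corpus and language.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            # Keep up to `width` tokens on either side of the keyword.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((i, left, tok, right))
    return hits


# Made-up sample text, used only to show the shape of the output.
sample = "The sea was calm. She watched the sea until the light failed."
hits = kwic(sample, "sea", width=2)
for position, left, word, right in hits:
    print(f"{position:>3}  {left:>12} [{word}] {right}")
```

each row of the listing carries the token position and the surrounding words, so a reader can move from the concordance line back to the passage for close reading.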
fourth, digital literary studies, including cultural analytics, are an interdisciplinary, collaborative, and highly diverse endeavor. in opposition to traditional literary studies, digital studies require many hands, with labs as spaces for collaboration. it is precisely at these spaces, which can have many different organizational incarnations, where an interface of "hermeneutic" and "computational" communities is created. with a growing number of opportunities for making this kind of contact, it is up to the individual scholar to explore the full range of methods, scaling the grade of reduction, contextualization, and degree of direct scholarly interpretation.

finally, in his opening response, andrew piper (quite generously) states that nan z. da's paper "is part of a growing body of work that seeks to introduce the idea of replication into the humanities." we fully endorse this line of work, and see it as one of our sig's main offices to further it by fostering exchange and discussion, as well as methodological and terminological transparency, and the fit of models to data and method. one of our current initiatives is the dls tool inventory, and an upcoming workshop at dh dedicated to the critical assessment of widely used methods in dls.

title new directions for academic libraries in research staffing: a case study at national university of ireland galway
author(s) cox, john
publication date - -
publication information cox, john. ( ). new directions for academic libraries in research staffing: a case study at national university of ireland galway. new review of academic librarianship, ( - ), - . doi: . / . .
publisher taylor & francis link to publisher's version http://dx.doi.org/ . / . . item record http://hdl.handle.net/ / doi http://dx.doi.org/ . / . . https://aran.library.nuigalway.ie http://creativecommons.org/licenses/by-nc-nd/ . /ie/ new directions for academic libraries in research staffing: a case study at national university of ireland galway john cox, university librarian, library, national university of ireland galway abstract new research needs, global developments and local shifts in emphasis are demanding a broader range of interactions by librarians with researchers and are challenging previous staffing structures. research has a higher institutional profile and academic libraries have responded by creating new roles and staffing models, with stronger linkage across campus as partners rather than supporters. particular circumstances at national university of ireland galway have shaped its library's staffing configuration for research. these include the emergence of digital scholarship across campus, opportunities offered by a new research building, the growing importance of archives and the publication of a new institutional strategy. significant reductions in staffing and budget are influential too. distinctive features in the revised staffing model are organisation by function instead of subject, prioritisation of engagement with digital scholarship, distributed management of archives and special collections and a particular emphasis on contribution across multiple teams. this case study reports early gains and challenges. introduction staffing for research services has been an area of strong emphasis and typically expansion for academic libraries in recent years. this article describes experience at national university of ireland (nui) galway which has adopted a new staffing model, maximising specialist skills and teamwork to meet an expanded range of researcher needs and expectations. 
the case study explains local circumstances, the structure adopted, early experience and points of learning. it is preceded by a literature review to provide global context. global trends in research services a review of the literature from the past five years or so indicates, in the initial period at least, some concerns by library managers. these are directed in particular at a lack of recognition by researchers of libraries as partners or keys to productivity (corrall, ), along with reduced engagement, (delaney & bates, ), limited awareness of what librarians provide (cooke et al., ), and among doctoral students a lack of understanding of the information environment (british library & jisc, ). it is also understood that researchers can look elsewhere to meet emerging needs if libraries are unable to deliver, but this is also recognised as an opportunity to set out a new agenda and to develop new roles (auckland, ). academic libraries have created new staffing positions in response to stronger institutional focus on research, increased emphasis on measuring research performance, notably the uk research excellence framework (ref), and changes in the scholarly communications environment, including expectations of open access to funded research. these posts are usually of a specialist nature and targeted at “higher- end” (corrall, , p. ) engagement with researchers further upstream in their work (auckland, ). they have focused initially on areas such as systematic literature reviews, bibliometrics, open access and data curation. changes in the way research is recorded and communicated, often incorporating datasets and other outputs besides formal publications, have created a need for expert advice by libraries on publishing, discoverability, rights management and altmetrics (johnson, adams becker, estrada, & freeman, , pp. - ), including the development of social media strategies to promote research (persson & svenningsson, ). 
to some extent this first wave of new research-focused roles might be considered a natural evolution from traditional library expertise in discovery, information literacy, copyright, and the organisation of information. the active involvement of academic libraries in digital humanities initially and then digital scholarship at large has occasioned a more radical shift in the scope and nature of their engagement with research and researchers. digital scholarship is a term affording many definitions but is fundamentally linked to the opportunities offered by digital technologies and content to enable new modes of enquiry, commonly across disciplines, and to make new connections, often by manipulating vast quantities of data. libraries can take a relatively passive role as a supplier of digital resources or advice or can become an active partner, recognising a shift “from the consumption to the creation of scholarship” (clay, , p. ). ad hoc engagement with digital humanities in (bryson, posner, st pierre, & varner, ) has since then for many libraries become mainstream involvement with digital scholarship, spawning a series of new activities (cox, ). a survey of members of the association of research libraries (arl) on support for digital scholarship in identified activities, all of them featuring to some extent in the responding libraries (mulligan, ). these activities range from digitisation and metadata creation to digital mapping, data management and software development, representing a considerable expansion in library interaction with researchers. other headline findings included the involvement of librarians, archivists and it professionals in library teams, collaboration across and beyond the campus, the creation of new posts, typically at senior levels and often not requiring a professional librarian qualification, and the expectation of further growth in digital scholarship engagement. 
mackenzie and martin ( ) describe similar patterns and a breadth of activity for libraries in the uk and ireland, including a range of services established at the university of salford (clay, ), while mcrostie sets out an impressively ambitious research information program focused on maximising research data at the university of melbourne (mcrostie, ). clearly a significant expansion in library engagement with researchers has taken place and digital scholarship has been a factor. the concept of library as partner, not simply service, has also advanced, with library staff occupying a new space, located much more prominently in the researcher’s orbit (posner, ; vandegrift & varner, ). examples are the university of sussex humanities lab whose core staffing includes members of library staff (mackenzie, ) and the involvement of the library in the social and cultural informatics platform (scip) at the university of melbourne (mcrostie, ). a related development is the linkage of library provision to the research lifecycle to enable partnership at all stages rather than, as traditionally for libraries, only at the beginning and end (corrall, ). this can be seen in staffing structures at brown university where a recent recruitment process for its center for digital scholarship intends that librarians will work through all stages of the research process (maron, ), while at king’s college london the library research support home page is sub-titled “support through the research lifecycle” (king's college london library, ). close collaboration with other parties on campus is another recurring theme, described by corrall ( , p. ) as “boundary- spanning activities”. these include working with campus research offices, it services and academics around open access and research data and are very much in evidence in the case studies referenced earlier at the university of salford and the university of melbourne. 
some academic libraries have executed significant changes in their organisational structure for research. the subject librarian role has come under the spotlight, and the broadening of research support activities as well as trends towards interdisciplinary research have led to its supplementation with specialists at some institutions (jaguszewski & williams, ). other libraries have shifted from subject to functional models (hoodless & pinfield, ; jaguszewski & williams, ). those who have done so have been motivated by a desire to strengthen research support, putting an emphasis on improved cross-campus collaboration and communications (eldridge, fraser, simmonds, & smyth, ). the university of manchester led this change in and reported success in developing new or enhanced research services in areas such as open access, citation analysis and data management (bains, , ). multiple staffing models are possible for digital scholarship engagement activities (vinopal & mccormick, ) and can be seen across arl institutions (mulligan, ). whatever configuration is adopted, staff skills are a concern. auckland ( ) listed potential skills required by subject librarians and identified a significant gap in provision for nine of them . the arl digital scholarship survey specified visualisation, data curation and management, and computational text analysis as the three most critical areas for skills development to meet emerging demand (mulligan, ). inskip sees a need for librarian skills and competencies to change fundamentally and for the librarian to be a polymath or a blended professional with a merged set of identities and practices, capable of working across different groups on campus (inskip, ). library enablement of research at nui galway: local factors founded in , nui galway is a research-led and research-intensive university across a wide range of disciplines. 
it is one of seven irish universities, with over , students, who are increasingly international in origin, and , staff. research is focused on selected areas of strength, including biomedical engineering, digital humanities and data analytics. its strategic plan aims to achieve a position in the top universities worldwide by and it is currently ranked in the top . the james hardiman library places a strong emphasis on enabling research and this is reflected in its own strategy to . its vision is that the library will be a catalyst for success as the university’s hub for scholarly information discovery, sharing and publication. local factors and institutional priorities are rightly recognised as shaping any academic library’s staffing model for research (auckland, ; hoodless & pinfield, ). certain events at nui galway in recent years have been highly significant in this regard, none more so than the opening of the hardiman research building in . adjoining the library, this building co-locates archives, special collections, digitisation facilities and library staff who have research roles with the university’s leading research institutes for the humanities and social sciences. this shared location has had a huge impact in advancing the positioning of the library as a partner with the academic community, promoting collaborations around digital scholarship projects, seminars, exhibitions and a visiting fellowship scheme. the university has placed a particular emphasis on archives as a point of distinction and a vital research resource and the new building, in addition to providing vastly improved accommodation, has delivered much stronger use of collections and attracted a number of prestigious donations. a notable addition at the time of writing is the archive of former president of ireland and united nations high commissioner for human rights, mary robinson. 
another event with long-term impact for the library has been the establishment of a partnership in between the university and the abbey theatre, ireland’s national theatre and one of its foremost cultural institutions, to digitise its archive (bradley & keane, ). the creation of the abbey theatre digital archive, completed in late , was the world’s largest ever theatre archive digitisation project, comprising more than a million pages in addition to audio-visual and other media. the library led the project, working alongside an external contractor and learning a lot about large-scale digitisation and the academic potential of digital collections (mackenzie, ). positive outcomes included a further raising of the profile of archives in the university and a new perception on campus of the library as a source of expertise in digital humanities and digital scholarship. expectations of the library were certainly raised and openings emerged to work with a range of partners active in digital projects. this brought challenges in terms of staffing and infrastructure, but also opportunities and culminated in the publication by the library in , following a series of mutually beneficial consultations, of a digital scholarship enablement strategy (national university of ireland galway library, a). the strategy positioned the library as one of a number of key players in the university, eager to engage and collaborate. it foregrounded the quality and range of the library’s archives and special collections as a particular strength, recognising that research collections are a key factor in engagement by libraries with digital scholarship (mackenzie, ) the growth in multidisciplinary and interdisciplinary approaches to academic work in recent years, exemplified by digital scholarship, has asked questions of library staffing models (jaguszewski & williams, ). nui galway adopted a new academic structure in , consolidating over disciplines into five colleges and schools. 
this structure has promoted interdisciplinarity and blurred the subject divisions on which a large part of the library’s staffing was based. national factors also added impetus to a review of staffing. ireland suffered a particularly sudden and deep recession in which had serious implications for university staffing and budget levels (cox, ). although an economic recovery has taken place, this has not as yet meaningfully impacted the universities. the library at nui galway has almost % fewer staff than in september , reduced from . fte to . fte, but has needed to respond to a range of new researcher behaviours, requirements and expectations. incremental adaptations have become unsustainable, calling for a radical change and this realisation informed the development of a new library strategy in and . another national factor of relevance to library staffing models has been an increased emphasis on research, with a particular focus on measuring its impact. this is clearly stated in the irish government’s research strategy document, innovation (department of jobs enterprise and innovation, ), which also indicates a strong expectation of open access to research publications and data, mandating a range of actions and interactions for libraries, research offices and academic units. alignment with institutional strategy has been the other main driver for the library at nui galway. the university published its vision strategy in april (national university of ireland galway, ). the university librarian was a member of the drafting group and this proved advantageous in informing early work on the library’s strategy. the university strategy foregrounds research, emphasising citation impact, researcher training, international profile and global impact through publications in particular, while also referencing archives as a differentiator. 
these areas of emphasis influenced library discussions around staffing for bibliometrics, researcher skills, open access, archives and digital scholarship. the range of factors described in this section proved to be major influencers in the development of the library strategy to and they shaped the staffing models adopted for its delivery. a new library staffing model for research the thrust of the library’s journey to strategy is towards delivering new value by recognising greater self-sufficiency among researchers and targeting areas of new need (national university of ireland galway library, b). one of its six priorities is high-impact publication of research, data and digital content. the strategy takes a user-centred approach, looking outwards and putting the three main library assets of space, collections and staff skills at the disposal of researchers in different ways, all geared towards discovery, sharing and publication. it promotes interactive locations to showcase and engage communities with research through exhibition and maker spaces, accessible collections such as digital archives to enable new approaches to knowledge creation, and evolving skills and infrastructure for digital publication to bring research outputs to global audiences. the focus throughout is on a new value proposition and a major reshaping of the library agenda based on consultation, listening and recent experience. the strategy anticipates multiple, often overlapping, touchpoints for researchers and others with the library, calling therefore on fluidity in the team structure supporting it. the change in organisational structure has been radical rather than incremental. more than half of the library staff members have changed role or line manager following the creation of five new teams in early . introducing change on this scale has called for considered planning and management. 
a key principle from the outset was to invest heavily in consultation with staff to promote their fullest participation in the process of developing the new agenda, identifying emerging demands and shaping the new staffing model required. consultation included a series of horizon scanning workshops repeated until all staff had participated, individual meetings by the university librarian with each member of staff, and six thematic seminars, again repeated for full staff attendance, to agree priority actions for each of the six strategic thrusts in the vision for the library in (national university of ireland galway library, b) which had been developed collaboratively. meetings to brief staff took place as the new team structure was being created and populated. the whole process from initial workshops to establishment of teams took about two years, time well spent in terms of achieving staff buy-in. by the time the new teams were announced, there were few surprises. nevertheless the senior management team recognised as important the need to provide support for individuals and teams. not surprisingly, some staff expressed concerns about taking on new roles, giving up favoured activities, discontinuing some services and changing team or manager. there were questions too about the unproven value of new areas of priority. managers organised lots of team briefings and individual meetings to clarify queries. away days have been encouraged for teams as needed too. overall, there has been an impressive commitment to embracing new roles and taking on significant change.
the new teams are:
• operations: aligning library space and staffing with changing needs
• collections: integrating management and development of, and access to, information resources
• marketing and engagement: promoting the library and understanding user needs
• research and learning: enabling research and developing academic skills
• digital publishing and innovation: creating digital collections for innovative research

table shows the new team structure relative to its predecessor; the teams are presented in alphabetical order in each column as efforts to map new to old are of limited value.

table . new and previous team structure
new teams | previous teams
collections | customer focus and research services
digital publishing and innovation | information access and learning services
marketing and engagement | organisational development and performance
operations | staff development and service environment
research and learning |

in terms of its alignment to research there are a number of distinctive features of this new staffing model relative to that which preceded it. the first of these is the formation of a separate digital publishing and innovation team, representing a prioritisation of engagement with digital scholarship. although the team is relatively small, comprising . fte, it has a diverse skills profile which includes programming, metadata, web publishing, information services, strategy and communications. its members have library, archives and it backgrounds and job titles include digital publishing and data management librarian, digital archivist and digital library developer. this fusion of talents, backgrounds and perspectives is beneficial in an area as multi-stranded as digital scholarship and facilitates an agile response to the range of projects involved. the new team advantageously brings together staff who were formerly dispersed throughout the library in a variety of roles.
there is a particular need for it skills as the systems infrastructure is diverse and core to progress. it helps greatly that the head of the team has a strong track record of development with library technologies. another new departure is separate management of archives and special collections according to their public-facing activities and internal management processes. previously all aspects of archives and special collections were managed within the former research services division. the new arrangement allocates % of staff effort to public engagement in the research and learning team and % to the management of collections in the collections team. the rapid recent growth in archives had created difficulties in balancing demand between front- and back-of-house activities, sometimes to the detriment of the latter. new collections were acquired but time for optimal storage and cataloguing was compromised by the high profile of archives in research, teaching, exhibitions and information literacy. the intent is to manage acquisition, storage and cataloguing systematically by aligning these vital functions with the management of all library collections, while continuing to maximise the use of archives and special collections for academic and public engagement. a challenge will be to achieve the intended balance of effort, retaining some flexibility within a disciplined approach and recognising a need to manage new demand, sometimes to a more phased timetable than heretofore. the heads of the two teams understand the need for clarity of reporting and also for connectivity with the digital publishing and innovation team around digital archives. the engagement of three teams with archives and special collections represents a new level of distribution, with attendant risks but also the opportunity to connect this growing area of activity better with the library at large. 
nui galway has joined other universities in moving from a subject to a functional model of organisation, distributing its former subject librarians across the research and learning, marketing and engagement and collections teams. its reasons for doing so are similar to those reported elsewhere (hoodless & pinfield, ) but the staffing and budgetary cuts noted earlier have been a significant factor too. it is no longer feasible to devote a large proportion of professional staffing to individual colleges and greater flexibility of deployment is needed, especially as the range of expectations has broadened so much beyond collections, information literacy and reference, the core subject librarian areas of engagement (jaguszewski & williams, ). the new research and learning team now offers a more generic approach to information literacy, extending coverage to new skills needed in the changing scholarly communications environment. a particular emphasis is to complement the research services librarian whose remit is focused on specialist graduate skills, researcher enquiries, bibliometrics and open access advocacy across all disciplines. this is breaking down a division, artificial from the user perspective, between staffing for teaching and learning and research which the previous structure had surfaced. these new arrangements have not been immediately accepted by academic staff in all disciplines. a particular example has been the school of law. the loss of their subject librarian has created concerns and a number of meetings have taken place to monitor the transition to the new structure and to ensure the availability of subject-specific information skills training where required. members of the school of law are beginning to acknowledge some gains from the new strategy and teams too, including access to new archives and collaboration on the establishment of an open access journal.
some other, more general, points can be made about the modus operandi of the new structure in enabling library engagement with university research. a multi-contributory approach is at its core, taking into account the wider range of interactions with researchers, and teams have replaced divisions. success depends on strong collaboration and fluidity across teams, not just among the three teams working on digital archives, for example, but leveraging important contributions from the other two teams in relation to spaces, skills development and communications. a culture of collective responsibility for the new agenda at large, rather than only for a team’s immediate remit, is being encouraged. this has been a particular focus in the senior management team. the team has participated in two away days to reinforce mutual support, teamwork and ownership of the whole strategy and has initiated a new series of bimonthly meetings to exchange progress across all areas in the annual operational plan. the positive impact of these new approaches has been evident in a willingness by members to share staff between teams at busy times or to compromise on filling vacant posts in order to enable the recruitment of new skills of benefit to the library at large. as a result new positions have been created by filling vacant posts differently, sometimes at changed grades and in different teams. this has been unpopular at times as it has reduced the staffing in some teams but has been seen as essential in order to introduce specialist skills to meet new priorities (martin, ). because numbers are stretched there is openness to outsourcing, of digitisation or cloud-based it infrastructure for example, or to complementing rather than owning a function such as bibliometrics for which the university’s institutional research office is better resourced. 
depth often needs to be sacrificed in favour of breadth of coverage, but it seems better to engage across a range of emerging activities than to risk exclusion of the library when it has an important contribution to make. engagement brings opportunities, including participation in funding bids, to supplement reduced resources. the mentality is therefore one of maximum partnership with researchers, engaging visibly in their space at seminars and other events and recognising that participation opens up communication (cox, ).

the first year

the new structure has generated some benefits, as well as a few challenges, in its first year of operation. library engagement with digital scholarship has broadened and is on more solid foundations following the establishment of the digital publishing and engagement team which works to an annual plan and has developed a prioritisation process. the range of projects to create digital collections has expanded, notably the digitisation of the archive of ireland’s second largest theatre, the gate, as a complement to the abbey theatre digital archive. the library led the funding bid for that project and has also secured university funding for the cataloguing and digitisation of two other newly acquired archives. the presence of staff with it skills in this team has spawned new work around interface development, text and data mining of digital collections, and crowdsourced transcription. these are activities closely linked to the three areas referenced earlier as priorities for skills development in the arl digital scholarship survey (mulligan, ). the involvement of more than one team with archives has influenced a more holistic approach to their acquisition and processing, including new workflows to harmonise metadata across different platforms for paper and digital collections.
the development of resourcing plans and bids, in order to ensure a plan for storage and cataloguing of collections before a commitment to receive them is agreed, means that acquisition takes longer but will have greater academic impact. a new archives strategy committee, including a number of research leaders on campus, has been very supportive of this approach. the focus is on linking archives to strong academic engagement. partnering on bids for resources to enable faster processing of new collections has yielded funding for two contract archivist posts to date. the new committee has also overseen the creation and publication for the first time of an archives strategy (national university of ireland galway library, a), drafted through a collective exercise across three teams. the strategy charts a course for the period to , covering collections management and discovery, publication of digital collections, academic engagement, and resourcing. one early outcome is a project initiated by the collections team to audit and reorganise archival holdings, aiming to yield an increase of more than % in available space for new additions. another is a publisher contract for a book on irish theatre archives, to be edited by an archivist in the research and learning team with contributions from a number of academic and library staff in the university. that initiative signifies an escalation in library engagement with researchers, as partners where opportunity arises. this has included organising a series of digital publishing seminars to showcase work on digital collections, publishing a number of reports to highlight the academic impact of archives (national university of ireland galway library, ), and ongoing consultation around the new strategy. all of this activity has been backed by a more concerted use of social media, facilitated by the marketing and engagement team. 
stronger researcher engagement has had positive results, including the joint funding bids for archives resources already mentioned, a partnership to provide a drop-in advisory service for digital scholarship projects and a project, led by the operations team, to develop a reconfigured exhibition space. a closer working relationship with the research office has emerged too. collaboration includes the implementation of a university policy on open access to research outputs and better integration of the research management and repository systems, again facilitated by increased it expertise in the digital publishing and innovation team. connectivity with a range of partners on campus is underpinning the development of advisory services and technical infrastructure for research data management. this is an area of growing demand for which library involvement was difficult to activate under the previous structure. the reaction of the research community to new library initiatives has been encouragingly positive to date. attendance at events such as seminars on archives or digital publishing has been strong, sometimes with standing room only, while library staff have been invited to participate in academic projects, notably the digital cultures initiative (national university of ireland galway moore institute, ) which seeks to join up digital scholarship activities on campus. there are, of course, some downsides in the new organisational model. staffing is very thinly spread in places and delivering a broad range of information literacy programmes has been achieved with difficulty while a key staff member was absent through illness. a deeper offering in areas such as bibliometrics and systematic reviews is inhibited by lack of numbers. involving more teams on collaborative projects can sometimes take longer due to the need to convene larger groups, familiarise staff with new areas of work and fit around different peaks and troughs of demand for each team.
failure to see the joins across teams can happen and there is a need for the university librarian and senior management team to monitor the embedding of the new structure to ensure that collaboration happens, communication across teams takes place and stakeholders, including user communities, university management and funders, are fully engaged. the broadening of activity calls for an ongoing prioritisation of resources to meet demand and this involves some delicate balancing acts. keeping up with the need to develop staff skills is also very challenging and the value of a more strategic approach, as implemented at manchester when it adopted a new structure in , is evident (bains, ). unlike at manchester, no additional funding has been made available for staff development, although this line item in the library budget has been protected at least from the impact of cuts necessitated elsewhere. the new strategy and team structure highlighted a need for new skills early on. before the teams were formally established the head of operations met with each of the other heads to identify development priorities and has proceeded to source opportunities to acquire new skills; progress has therefore been gradual depending on when courses arise. development has been fastest when larger groups can participate together in webinars or courses organised on site such as one on social media strategy. learning on the job with mutual support among colleagues has been a common approach. participation in new working groups locally, for example one on research data management, has proved to be an effective route to learning and understanding. another approach which is receiving new emphasis is learning through stronger engagement than previously with cross-institutional initiatives. 
these include proposals for digital scholarship projects with external partners and membership of a series of new groups created at national level by the consortium of national and university libraries (conul). in addition, recruitment opportunities arising from upcoming staffing retirements are being targeted with a view to importing new skills of value across the whole library rather than filling posts on a team-specific basis.

conclusion

new research needs, global developments and local circumstances are demanding a broader range of interactions by librarians with researchers and are challenging previous staffing structures. doing more with less remains the mantra, but libraries are at a vital juncture and risk exclusion from emerging areas of contribution and from associated funding opportunities. the library at nui galway has faced many changes and challenges in recent years and adopted a distinctive organisational model to accommodate new researcher expectations, the mix of staff skills required and the need to maximise teamwork to deliver on a new agenda identified in its strategy to . some important points of learning have emerged along the way. these include fitting new staffing models to local factors such as the emergence of digital scholarship across the campus, the opportunities offered by a new building, the growing importance of archives and the publication of a new institutional strategy. planning for change across a generous time horizon, in this case two years, has been helpful in engaging library staff with shaping the new agenda and understanding the rationale underpinning it as well as the need for a different structure of teams and individual roles. communicating continuously with researchers post implementation is especially vital in addressing concerns, building confidence in the new agenda and selling its benefits positively and early.
a key event in this regard was the public launch of the strategy which attracted an engaged audience from across the campus and was led by the registrar and deputy president with speakers from leaders in teaching and learning and research. the launch provided an opportunity to emphasise the alignment of the strategy with university priorities and to confirm the library’s intent to generate new value and partnerships. it also reinforced the new teams and the commitment of individuals to the new directions being taken. a particular feature of this event was the honest account of their experience of change by two members of staff who had taken on new roles. the launch has been supplemented by an ongoing emphasis on proactive communications with influential groups such as the academic management team and the college vice-deans for teaching and learning and for research. fuller attention to communications, including an orientation of language used towards partnership instead of service, has to date generated an increased and more broadly based interaction with the academic community. a less close relationship is a concern commonly expressed about moving from subject to functional organisation (hoodless & pinfield, ) but this has not happened so far. the phrase carpe diem (seize the day) resonates strongly too. library managers have recognised that the opportunities offered by the new structure need to be taken early on if they are to be realised, even if this creates extra pressure. otherwise, traditional approaches may reassert themselves and the moment for change may be lost. a strong base has been established in terms of commitment to engaging with digital scholarship, developing partnerships around projects and resources, maximising archives and special collections as an institutional differentiator, and taking a more strategic approach to developing or introducing essential skills for the library team as a whole. 
the ongoing challenge is to build on these foundations by sustaining momentum, facing outwards and being alert to opportunities. it is early days in terms of the new agenda and structure, with numerous challenges in evidence but also many good signs and much to build on, especially if an improved funding climate makes it possible to deepen the resource base in areas where the library is creating new value. the university management team is very positively disposed towards this evolving agenda and the additional funding it has awarded to digital projects and archives processing positions the library better now and for the future.

references

auckland, m. ( ). re-skilling for research: an investigation into the role and skills of subject and liaison librarians required to effectively support the evolving information needs of researchers retrieved from http://www.rluk.ac.uk/wp-content/uploads/ / /rluk-re-skilling.pdf
bains, s. ( ). teaching ‘old’ librarians new tricks. sconul focus, , - . retrieved from http://www.sconul.ac.uk/sites/default/files/documents/bains.pdf
bains, s. ( ). manchester’s new order: transforming the academic library support model retrieved november , from http://www.rluk.ac.uk/news/manchestersneworder/
bradley, m., & keane, a. ( ). the abbey theatre digitisation project in nui galway. new review of information networking, ( - ), - . retrieved from doi:http://dx.doi.org/ . / . .
british library, & jisc. ( ). researchers of tomorrow: the research behaviour of generation y doctoral students retrieved from http://www.jisc.ac.uk/reports/researchers-of-tomorrow
bryson, t., posner, m., st pierre, a., & varner, s. ( ). spec kit : digital humanities retrieved from http://publications.arl.org/digital-humanities-spec-kit- /
clay, d. ( ).
building scalable and sustainable services for researchers. in a. mackenzie & l. martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ). london: facet.
cooke, l., norris, m., busby, n., page, t., franklin, g., gadd, e., & young, h. ( ). evaluating the impact of academic liaison librarians on their user community: a review and case study. new review of academic librarianship, ( ), - . doi: http://dx.doi.org/ . / . .
corrall, s. ( ). designing libraries for research collaboration in the network world: an exploratory study. liber quarterly, ( ), - . retrieved from https://www.liberquarterly.eu/articles/ . /lq. /galley/ /download/
cox, j. ( ). sharing the pain, striving for gain. serials, ( ), - . retrieved from http://doi.org/ . /
cox, j. ( ). communicating new library roles to enable digital scholarship: a review article. new review of academic librarianship, ( - ), - . doi: http://dx.doi.org/ . / . .
delaney, g., & bates, j. ( ). envisioning the academic library: a reflection on roles, relevancy and relationships. new review of academic librarianship, ( ), - . doi: http://dx.doi.org/ . / . .
department of jobs enterprise and innovation. ( ). innovation : excellence, talent, impact retrieved from https://www.djei.ie/en/publications/publication-files/innovation- .pdf
eldridge, j., fraser, k., simmonds, t., & smyth, n. ( ). strategic engagement: new models of relationship management for academic librarians. new review of academic librarianship, ( - ), - . doi: http://dx.doi.org/ . / . .
hoodless, c., & pinfield, s. ( ). subject vs. functional: should subject librarians be replaced by functional specialists in academic libraries? journal of librarianship and information science, ( ), in press. retrieved from doi:http://dx.doi.org/ . /
inskip, c. ( ). novice to expert: developing digitally capable librarians. in a. mackenzie & l. martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ).
london: facet.
jaguszewski, j. m., & williams, k. ( ). new roles for new times: transforming liaison roles in research libraries retrieved from http://www.arl.org/storage/documents/publications/nrnt-liaison-roles-revised.pdf
johnson, l., adams becker, s., estrada, v., & freeman, a. ( ). nmc horizon report: library edition retrieved from http://cdn.nmc.org/media/ -nmc-horizon-report-library-en.pdf
king's college london library. ( ). research support: support through the research lifecycle retrieved november , from http://www.kcl.ac.uk/library/researchsupport/index.aspx
mackenzie, a. ( ). digital scholarship: scanning library services and spaces. in a. mackenzie & l. martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ). london: facet.
mackenzie, a., & martin, l. (eds.). ( ). developing digital scholarship: emerging practices in academic libraries. london: facet.
maron, n. l. ( ). the digital humanities are alive and well and blooming: now what? educause review, ( ), - . retrieved from http://er.educause.edu/articles/ / /the-digital-humanities-are-alive-and-well-and-blooming-now-what
martin, l. ( ). the university library and digital scholarship: a review of the literature. in a. mackenzie & l.
martin (eds.), developing digital scholarship: emerging practices in academic libraries (pp. - ). london: facet.
mcrostie, d. ( ). the only constant is change: evolving the library support model for research at the university of melbourne. library management, ( / ), - . retrieved from http://www.emeraldinsight.com/doi/pdfplus/ . /lm- - -
mulligan, r. ( ). spec kit : supporting digital scholarship retrieved from http://publications.arl.org/supporting-digital-scholarship-spec-kit-
national university of ireland galway. ( ). vision : nui galway strategic plan, - retrieved from http://www.nuigalway.ie/vision /
national university of ireland galway library. ( a). digital scholarship enablement strategy retrieved from http://library.nuigalway.ie/media/jameshardimanlibrary/digital-scholarship-enablement-strategy.pdf
national university of ireland galway library. ( b). a vision for the library in retrieved from http://library.nuigalway.ie/media/jameshardimanlibrary/a-vision-for-the-library-in- .pdf
national university of ireland galway library. ( a). archives strategy, -
national university of ireland galway library. ( b). library strategy: the journey to retrieved from http://library.nuigalway.ie/media/jameshardimanlibrary/library-strategy---the-journey-to- .pdf
national university of ireland galway library. ( ). archives publications retrieved from http://library.nuigalway.ie/collections/archives/publications/
national university of ireland galway moore institute. ( ). digital cultures initiative retrieved from http://mooreinstitute.ie/digital-humanities/digital-cultures-initiative/
persson, s., & svenningsson, m. ( ). librarians as advocates of social media for researchers: a social media project initiated by linköping university library, sweden. new review of academic librarianship, ( - ), - . doi: http://dx.doi.org/ . / . .
posner, m. ( ). no half measures: overcoming common challenges to doing digital humanities in the library.
journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
vandegrift, m., & varner, s. ( ). evolving in common: creating mutually supportive relationships between libraries and the digital humanities. journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .
vinopal, j., & mccormick, m. ( ). supporting digital scholarship in research libraries: scalability and sustainability. journal of library administration, ( ), - . retrieved from doi:http://dx.doi.org/ . / . .

pushing the boundaries of the digital libraries field: preface ircdl procedia computer science ( ) – available online at www.sciencedirect.com - © the authors. published by elsevier b.v. this is an open access article under the cc by-nc-nd license (http://creativecommons.org/licenses/by-nc-nd/ . /). peer-review under responsibility of the scientific committee of ircdl doi: . /j.procs. .
. sciencedirect th italian research conference on digital libraries, ircdl

pushing the boundaries of the digital libraries field preface ircdl

maristella agosti a, tiziana catarci b, floriana esposito c,*

a department of information engineering, university of padua, via gradenigo /b, padua, italy
b department of computer control and management engineering “antonio ruberti”, sapienza university of rome, via ludovico ariosto , rome, italy
c department of computer science, university of bari “aldo moro”, via e. orabona , bari, italy

abstract

this contribution is the preface of the volume of post-proceedings of the th italian research conference on digital libraries, ircdl . the volume contains the reports on the invited presentations and the accepted papers. the accepted papers were initially reviewed for presentation at the conference, and after the presentation the papers were resubmitted by the authors in a revised version that included the suggestions received during the presentation at the conference. the resubmitted versions of the papers were reviewed by at least three anonymous reviewers and the accepted papers were revised by the authors taking into consideration the reviewers’ suggestions.

keywords: italian research conference on digital libraries; ircdl; digital libraries; digital scholarship; digital cultural heritage; scientific communications

ircdl is a yearly deadline for italian researchers on digital libraries related topics and ircdl marks an important milestone for the field of digital libraries in italy. in fact it is the tenth edition of this initiative, so, by looking back at what has been done and documented through the various editions of ircdl over the last decade, it reviews the current situation and lays the foundation for the future of the area.

* corresponding authors: all the editors.
e-mail addresses: agosti@dei.unipd.it, catarci@dis.uniroma .it, floriana.esposito@uniba.it

the italian community that meets annually at ircdl is an active community that contributes to the achievement of european and international projects, and through its results can inform of what is happening in general in this area and give some guidance on what will be the new challenges and the boundaries of the area. one of the points of focus of ircdl is emphasizing the multidisciplinary nature of research on digital libraries which not only goes from computer science to humanities but also crosses areas in the same field, ranging, for example, from archival to librarian sciences or from information management systems to new knowledge environments. this is an ongoing challenge for the digital library field and there is a need to continue improving the cooperation between the many communities that share common objectives. the invited presentation by tiziana catarci, angela di iorio and marco schaerf on the sapienza digital library addresses many of the challenges of this focus, reporting on the design choices and implementation activities of interest for a major university that actively operates in research and teaching serving more than , students and staff members.
to complement the topics addressed by the first invited talk, the invited presentation by rossella caffo, director of the central institute for the union catalogue of italian libraries and bibliographic information (iccu), which is an institute of the italian ministry of cultural heritage, activities and tourism (mibact), reports on the digital cultural heritage projects at the national and european level, where iccu is active and that are of interest in general and in particular for the libraries that face new challenges in contributing to the transfer of knowledge. the second major focus of ircdl is on the profound change that is happening in the world of scientific communication, where the object of scientific communication is no longer a linear text, although digital, but an object-centric network that consists of text, data, images, videos, blogs, etc. this change is likely to deeply modify the nature and the role of the digital library and its relationship with the thematic data center. to introduce the major challenges that this focus requires to see addressed and to lay the foundations of the area, two invited talks were given. the first of these was by gaby appleton (elsevier) on future trends in digital libraries and scientific communications, and the second was by costantino thanos (institute of information science and technologies “a. faedo”, italian national research council, pisa) on the future of digital scholarship, and they addressed the characteristics of modern science and the need for the publication and consumption of scientific information to undergo radical changes also to produce an important transformational impact on the scholarly record. during the conference, the debate on the contents of these two talks was enriched by the contributions of alberto del bimbo (department of systems and computer science of the university of florence) and carlo tasso (department of mathematics and computer science of the university of udine). 
in addition to the excellent invited speakers, from amongst the submitted papers we selected high-quality papers for inclusion in this volume of the elsevier procedia computer science series. the conference took place in the aula magna “antonio lepschy” of the department of information engineering of the university of padua, thanks to the support and sponsorship of the department and of the european project cultura - cultivating understanding and research through adaptivity - co-funded under the th framework programme of the european commission (http://www.cultura-strep.eu/).

organisation

honorary chair
costantino thanos, institute of information science and technologies “a. faedo”, italian national research council, pisa

program chairs
maristella agosti, department of information engineering, university of padua
tiziana catarci, department of automatic control and computer engineering management “antonio ruberti”, sapienza university of rome
floriana esposito, department of computer science, university of bari “aldo moro”

program committee
nicola barbuti, department of sciences of antiquity and late antiquity, university of bari “aldo moro”
giovanni bergamin, national central library of florence (bncf), florence
stefano berretti, department of information engineering, university of florence
marco bertini, department of information engineering, university of florence
maria teresa biagetti, department of documentary, linguistics and philology and geography, sapienza university of rome
rossella caffo, central institute for the union catalogue of italian libraries and bibliographic information (iccu), rome
michelangelo ceci, department of computer science, university of bari “aldo moro”
rita cucchiara, department of engineering “enzo ferrari”, university of modena and reggio emilia
paola de castro, publishing activities, national institute of health, rome, italy
alberto del bimbo, department of systems and computer science, university of florence
stefano ferilli, department of computer science, university of bari “aldo moro”
nicola ferro, department of information engineering, university of padua
paola gargiulo, cineca
costantino grana, engineering department “enzo ferrari”, university of modena and reggio emilia
maria guercio, department of history of art and the performing arts, sapienza university of rome
donato malerba, department of computer science, university of bari “aldo moro”
paolo manghi, institute of information science and technologies “a. faedo”, italian national research council, pisa
simone marinai, department of information engineering, university of florence
carlo meghini, institute of information science and technologies “a. faedo”, italian national research council, pisa
maurizio messina, marciana national library, venice
antonella poggi, department of documentary, linguistics and philology and geography, sapienza university of rome
giuseppe santucci, department of automatic control and computer engineering management “antonio ruberti”, sapienza university of rome
marco schaerf, department of automatic control and computer engineering management “antonio ruberti”, sapienza university of rome
gianmaria silvello, department of information engineering, university of padua
gianni solimine, department of documentary, linguistics and philology and geography, sapienza university of rome
anna maria tammaro, department of information engineering, university of parma
carlo tasso, department of mathematics and computer science, university of udine
francesca tomasi, department of classical philology and italian studies, university of bologna
paul g. weston, department of humanities, university of pavia

organizing committee
debora leoncini, department of information engineering, university of padua

ircdl steering committee
maristella agosti
tiziana catarci
alberto del bimbo
floriana esposito
carlo tasso
costantino thanos

information services & use ( ) – doi . /isu- ios press

an overview of the nfais annual conference: creating strategic solutions in a technology-driven marketplace

bonnie lawlor∗
guest editor, nfais honorary fellow, upper gulph road, radnor, pa , usa

abstract. this paper offers an overview of the highlights of the nfais annual conference, creating strategic solutions in a technology-driven marketplace that was held in alexandria, va from february - february , . the goal of the conference was to focus on how technological innovations, especially artificial intelligence and machine learning, along with changing market demands, are creating new opportunities within the information community.
speakers were invited to demonstrate that such innovations have the potential to provide researchers with new tools with which to advance their quest for scientific discovery and also have the potential to provide the much-needed insights to assist business leaders in making their strategic decisions with confidence. the diverse speakers made their point - technology is driving us forward. but the real message of the conference was all about the value of content and how that value can be increased by leveraging appropriate technology. like changing a rough stone into an incomparable diamond, technology can transform traditional content into a faceted gem!

keywords: semantic tagging, semantic search, information discovery, open access, open science, university presses, pre-print servers, plan s, artificial intelligence, blockchain technology, born-digital scholarship, internet-in-a box, scholarly communication, scholarly publishing, academic research libraries, library consortia

. introduction

technology has been transforming scholarly communication for centuries - slowly at first and much more rapidly over the past century. from proto-writing [ ] in the early days of man, through to the invention of paper [ ], the printing press [ ], computers [ ], the internet [ ], and now blockchain technology [ ], innovative technologies have been disrupting how information is created, disseminated, managed, stored, and even perceived. with each change has come opportunity for some and ultimate irrelevance for those too focused on their traditional world to understand the potential impact of change. today’s information technology has created a landscape that is now ripe for a major change in how the scholarly communication process will proceed into the future. with the launch of digital publications in the mid- s, information has become much more widely and easily disseminated.
these digital publications ultimately met the needs of the born-digital generation that developed from the launch of personal computers in the s - a generation that by the late s began to demand free access to digital information. that demand has since been heard and supported by research funders around the globe who now require that the outcomes of the research that they fund be made publicly available. and the researchers, who are also authors and users of information, expect that such free sharing of their scientific data be allowed. the stars are now well aligned for change!

indeed, gone is the traditional world of scholarly communication and publishing where publishers held the reins. at the nfais annual conference dr. joris van rossum, director special projects, digital science, noted that the publisher’s role is getting smaller as alternatives for the fulfillment of the publisher functions have emerged [ ]. indeed, his comment applies to all content providers - from primary publishers through to the abstracting and indexing services - for today advances in technology have provided information alternatives that are often “good enough” for the majority of those that seek it.

*e-mail: chescot@aol.com.
- / /$ . © – ios press and the authors. this article is published online with open access and distributed under the terms of the creative commons attribution non-commercial license (cc by-nc . ).
now that the stars are aligned, how will our world change? change it will - the process has already begun. this question was raised several times during the nfais annual conference that took place earlier this year when a group of researchers, publishers, librarians, policy makers, and technologists gathered together in an attempt to learn how all stakeholders in scholarly communication are grappling with the changes that advances in technology have wrought upon them. how do they remain relevant to their users and sustainable as an organization well into the future? this dialogue went on for two-and-a-half days while attendees discussed the new initiatives in open access, the growth of pre-print servers, user expectations, the democratization of information, and the innovative technologies that may allow them to retain relevance in the eyes of information seekers - a discussion sprinkled with philosophical debates over who does versus who should own data. it was no surprise that at the close of this conversation no one could predict the future configuration of the information landscape. what was a surprise was the call for less passionate and non-constructive rhetoric from both sides of the open access issue, and more rational and collaborative conversations about the future of the information community and its stakeholders. perhaps we are making our way to the table to work things out?

. opening keynote

the opening keynote presentation was given by dr. samuel zidovetzki, global health director, university of california riverside emergency medicine program, who spoke on the need for global access to up-to-date medical information. he noted that information is a commodity, one that is tightly controlled and regulated by many different entities, both public and private. and medical information is no different. for developing countries, gaining access to valuable and time-sensitive medical information can mean the difference between life and death.
equally important is access to quality medical education versus what exists now in many countries - educational curricula using decades-old medical texts. there are many barriers to the dissemination of accurate and timely medical information, and there are organizations such as wikipedia, wiki project medicine, and other non-governmental and governmental organizations working to get medical information to those who need it most. to prove his point, he gave several examples based upon his work in refugee camps and noted that the medical conditions are quite rudimentary. in the dominican republic newly-minted doctors go right from medical school to work in the field and, unlike in the usa, they do not have the benefit of supervision by experienced physicians. they identify a patient’s symptoms and then rely upon outdated textbooks, old equipment, and even old medical posters hanging on the walls in order to make a diagnosis. while they may possess a smartphone, internet access is unreliable and costly – the best that they can do is download pdfs to the phone when they have the opportunity and refer to them as needed. one very interesting and novel solution is the “internet-in-a-box” [ ] that originated in cuba. users come to a park and download what they need from a hot spot (see video at http://internet-in-a-box.org/). content can be customized based upon the setting and the box is available commercially. the box can also be used as a teaching tool and a side use has been the creation of a mobile classroom. he said that a study will be done in guatemala in may of this year to see what information is most used and most helpful.
zidovetzki said that he uses the internet-in-a-box in his own work and noted that physicians actually do rely upon online information such as wikipedia (even in the united states) [ ]. he offered the following statistics on wikipedia usage in the medical field: %– % of physicians, %– % of pharmacists, and % of medical students. even policy makers access the information, and it was the number one resource during the ebola crisis [ ]. the use of online materials by physicians was confirmed by violaine iglesias, a speaker later in the conference. zidovetzki and others are working diligently to gather relevant, quality medical information that can be stored on internet-in-a-box and utilized offline in rural areas. it is important to have the information in native languages, and it is a struggle to get it translated while keeping it current. but he noted that progress is being made and the ability to have offline access to quality medical information will make a big difference in underdeveloped areas. he believes very strongly that ensuring the flow of high quality, timely information to medical professionals either at home or abroad can make all the difference in improving patient care. dr. zidovetzki’s slides are not available on the nfais website, but more information can be found in an interview with him on his work with the wikimedicine project [ ].

. advancing knowledge and research in the st century

mary lee kennedy, executive director of the association of research libraries (arl), was the first speaker in this session and she discussed the work of research libraries in advancing knowledge in today’s research and learning ecosystem - an ecosystem that is focused on open science, the adoption of artificial intelligence and other fourth industrial revolution technologies [ ], and the debate about what constitutes research in the st century.
she noted that arl’s mission is to advance research, learning, and scholarly communications by fostering the open exchange of ideas and expertise, promoting equity and diversity, pursuing advocacy and public policy efforts, forging partnerships, and by catalyzing collective efforts. she said that they have been giving a lot of thought as to where arl can add the greatest value and they believe it is at the intersection of three communities: ( ) research libraries and their parent organizations; ( ) public policy makers; and ( ) research and learning communities. she noted that each community has its own topics under discussion. for research institutions the topics are: accountability/value/public good, trust, affordability, diversity, equity and inclusion, and demographics. for public policy makers the topics are: open science, higher education accreditation, budgets, copyright, accessibility, net neutrality, and privacy and security. and for the research and learning communities the topics are: open access, data, video, complex objects, text, etc., cyber-physical systems, collaboration and continuous authorship, and personalized learning. these “topics of conversation” are all related; e.g. trust vs. privacy and security vs. accessibility, open access and open science, diversity and net neutrality, etc. to be successful across the communities, the conversations need to be pulled together, and arl hopes to facilitate just that. their priority efforts are focused on the areas of scholars and research; diversity, equity and inclusion; and data and analytics.

with regard to scholars and research they convened a meeting in december attended by learned societies, publishers, librarians, funders, and universities to see what initiatives could be established to advance open scholarship in the social sciences.
by the end of the meeting, five group projects were proposed, with commitments from various participants to lead them. the projects will:
( ) conduct an authoritative investigation into scholarly society finances by a trusted third party, as the basis for financial and business model conversations with societies and external stakeholders.
( ) commission a paper on the role of scholarly societies and scholarly affiliation in a post-subscription environment.
( ) conduct a case study pilot on linguistics promotion-and-tenure (p&t).
( ) explore implementing peer review in socarxiv and psyarxiv.
( ) assess the impact of the reporting relationship between university presses and university libraries.
the full report can be accessed at https://www.arl.org/wp-content/uploads/ / / . . -arl-ssrc-meeting-on-open-scholarship.pdf.

with regard to diversity, equity and inclusion she noted their work with the university of british columbia in collecting and curating scholarly material created by indigenous communities. annually, the librarians produce a list of courses with significant indigenous content. for the winter session – , there were one hundred and eighteen courses, from thirty-five different departments. the xwi xwa library (see: https://xwi xwa.library.ubc.ca/) at the university collects materials written from indigenous perspectives, such as materials produced by indigenous organizations, tribal councils, schools, publishers, researchers, writers, scholars, filmmakers, and musicians. its collections and services reflect aboriginal approaches to teaching, learning, and research. she noted that the arl-sponsored ideal conference - advancing inclusion, diversity, equity, and accessibility in libraries & archives, will be held august – , in columbus oh [ ].

on the final area of focus, data and analytics, she reported on the work of the arl assessment program task force that met in december .
as a result of their work [ ] five team-based pilot projects have been launched this year to look at the following questions:
( ) how does the library help to increase research productivity and impact?
( ) how do library spaces facilitate innovative research, creative thinking, and problem-solving?
( ) how does the library contribute to equitable student outcomes and an inclusive learning environment?
( ) how do the library’s special collections specifically support and promote teaching, learning, and research?
( ) how do the library’s collections play a role in attracting and retaining top researchers and faculty to the institution?
the goal is to ultimately measure outcomes and results, not just things. library assessment is really a new field and is becoming very important. she noted that outcomes will be examined within the goal of supporting and advancing research libraries and scholarly communication with an eye on creating the workforce of – only eleven years away!

kennedy’s slides are available on the nfais website and a paper based on her presentation appears elsewhere in this issue of information services and use.

. virginia tech and open science

the second speaker in this session was julie griffin, associate dean for research and informatics at virginia polytechnic institute and state university. she opened with an overview of the institution, noting that there are over thirty-four thousand students and two hundred and eighty degree programs. it has a research portfolio of five hundred and twenty-two million dollars, a global education office, and has had a european campus in switzerland for more than twenty years.
she went on to discuss the evolution of virginia tech (vt) university libraries, with special attention given to program and service developments in data science, learning spaces, digital literacy, and open education. griffin noted that vt libraries facilitate knowledge sharing, use, and reuse through the provision of open publishing services and openly-accessible collections, with the goal of providing equitable access to information. she added that vt libraries help make the scholarly record more open by embedding library services in the university research and learning infrastructures, by teaching digital literacy skills, by advocating for open access and open data, by integrating systems, and by providing services that enable institutional stewardship. she noted that in vt entered an online publishing partnership with ubiquity press (see https://www.ubiquitypress.com/). however, her institution is not a newcomer to online publishing. vt began publishing online journals in , and was one of the first universities in the country to require students to submit theses and dissertations electronically. this new three-year agreement with ubiquity press will allow the libraries to gain a state-of-the-art web platform that increases their capacity to publish freely-accessible scholarly research in a variety of formats, such as journals, books, and conference proceedings, along with openly-licensed text, media, and other digital work used for teaching, learning, and research. according to the press release, “the move to ubiquity is part of a larger strategy by the libraries to build a publishing program - called vt publishing - designed for the st-century digital economy” [ ].
in closing she also discussed vt’s support of the national academy of sciences (nas) vision for open science by design – defined as “a set of principles and practices that fosters openness throughout the entire research lifecycle” [ ]. indeed, vt utilizes the open science framework (see https://osf.io/) to ensure that the university’s faculty and researchers can openly share their work. (note: in the q&a period following the presentations in this session, mary lee kennedy highly recommended that attendees read the free nas report on open science by design. she said that it is an easy read and that arl is following its guiding principles.)

griffin’s slides are available on the nfais website and a paper based on her presentation appears elsewhere in this issue of information services and use.

. publishers creating new value

the next session focused on how three publishers are creating new value for their content:

. . institution of engineering and technology (iet)

the first of the three speakers was vincent cassidy, director of academic markets at the institution of engineering and technology, iet, who discussed how they have added value to the inspec database via artificial intelligence and semantic tagging in order to create new services. he began by providing an overview of iet. it is a global professional society as well as a learned society with more than one hundred and seventy thousand members. headquartered in london, uk, it is a mid-size full-range publisher producing books, journals, magazines, standards, etc., as well as an abstracting and indexing database, inspec. the organization itself is one hundred and forty-eight years old and is very traditional in its thinking, so one of the major challenges moving forward was to step back and try to look at things in new and different ways.
inspec itself will be fifty years old this year. as of january , it had eighteen and a half million records in the fields of physics; electrical and electronic engineering; computing and control engineering; and production, manufacturing and mechanical engineering. it includes material from more than forty-five hundred journals, and three thousand other publications from seven hundred and fifty publishers. materials covered include books, journals, videos, dissertations, etc. the database makes scientific research discovery easier through high-quality classification and indexing, through the curation of source material, and through monitoring science and research for quality and relevance. inspec’s main customer base consists of universities, research institutes and corporations. cassidy noted that other a&i services in the audience can appreciate the effort that goes into creating and maintaining such a massive, curated database. it is very labor-intensive, especially in indexing.

so why change after fifty years of success? well, as cassidy pointed out, the research information environment has changed significantly. the advent of “good enough” alternatives has fundamentally changed user behavior - they prefer google - not a&i services, and as a result usage is declining. and while information professionals respect and value inspec, there has been a decline in subscriptions in inspec’s core base. iet wanted to take action now, before things slid further. and most of what they are now doing required new skill sets and the hiring of data scientists. their goal was to apply semantic tagging to the eighteen million records, to develop a domain model from their diverse ontologies, to better understand user workflow and pain points, to build an mvp [ ] (minimum viable product) platform from which to iterate, and to build and strengthen engagement with their customers.
he said that incorporating agile product management skills into a one hundred and forty-eight year old institution has been “interesting” to say the least! he noted that the culture of the organization had to change - not the vision, and that they had the full support of iet’s trustees. it has been a two-year process with frequent iterations, both for funding and with the market, to confirm assumptions. they have taken the eighteen million records and created six billion concept relationships in a knowledge graph and this new product will be launched in the spring of this year as part of inspec’s fiftieth birthday celebration. they faced several challenges during the process, beginning with their choice of technology partners; introducing effective agile project management and project momentum as noted earlier; understanding customer needs and the user workflow; and sustaining the core business through the two-year effort.

but cassidy has said that the effort has proven to be successful. they have reengaged with customers and users and the new value-add has triggered increased usage and growth. it is the first new organic growth that they have had in years. their conversations with their distribution channel partners have resulted in the emergence of new ideas and a fresh impetus on growth. and they have been able to refocus content curation and the inclusion of new content types. he said that they learned a lot as a result. his closing message was that highly-structured, human-curated databases can be repurposed and recycled to retain relevance and to provide new value propositions with a resultant growth in usage and in impact. he said that as an organization iet is no longer “traditional,” but rather has evolved into a one hundred and forty-eight year old “start-up” which is strange, exciting, and invigorating. he emphasized that this “new life” is not about technology - it is about data and innovative uses of data. and he recommended that organizations that want to reinvent themselves to remain relevant to the new generation of information users get the skills required to make the most of the data that they have invested in compiling and curating over the years. skill up!

cassidy’s slides are available on the nfais website.

. . association of university presses (aaup)

the second speaker in this session was peter berkery, executive director of the association of university presses (aaup), who discussed how university presses are finding new ways to support the output of scholars in the humanities and social sciences who have become increasingly comfortable with both digital research and publishing. in his presentation he highlighted four initiatives as follows.

. . . rotunda

rotunda was founded by the university of virginia press in to apply traditional university press (up) strengths to the research that was beginning to emerge from digital humanities centers (see https://www.upress.virginia.edu/rotunda), that strength being the up imprimatur (e.g. peer review). the press quickly discovered that these one-off boutique projects did not have the scale that libraries were looking for and it did a slight pivot towards documentaries which evolved over time into the founding fathers documentary editions, and again into literature and culture collections, such as emily dickinson’s correspondence, herman melville’s draft manuscript of “typee,” etc. today tension persists among scholars between digital humanities as digital affordances versus a true research and development effort; e.g., natural language search, word frequency analysis, data manipulation, etc.
the people at university of virginia press wrestle with this issue as well and they have found that sustainability issues still persist for boutique projects. they have found that their library customers want scalability: a perpetual license for what looks like content, and a small annual hosting fee for the platform (hosting, enhancements, browser compatibility). going forward, new projects will incrementally expand rotunda’s capabilities without compromising the press’ core values and they will continue to do what they have done so well for almost two decades.

. . . manifold

developed by the university of minnesota press, manifold is a do-it-yourself (diy) web-based publishing platform for scholarly publishers, university departments, and scholarly groups (see https://manifold.umn.edu/). a manifold project is composed of three parts: first, the base layer - the epub file or google doc that a publisher uploads; second, the resources - the media and other texts that an author and editor will add to a project; and finally, reader interaction, consisting of comments, highlights, etc. there are also fee-based services for those organizations that do not want to implement manifold on their own. early experience has shown that a lot of groups do not want to diy, even when it’s free!

. . . fulcrum

fulcrum was developed by a group of campus-based publishers working closely with disciplinary faculty and information science specialists who recognized the changing nature of scholarly publishing in the humanities and social sciences (see https://www.fulcrum.org/). initial development was supported by a grant from the andrew w. mellon foundation and implemented by the university of michigan library and press working with partners from indiana, minnesota, northwestern, and penn state universities.
it is a digital publishing platform and a set of publishing services that is committed to publishing scholarship in a flexible, durable, discoverable, and accessible form. it is flexible in that it connects to other open source tools and is responsive to the changing needs of digital scholars. it is durable because it has been built on a research library infrastructure that is a trusted steward committed to preservation and stability. discoverability is supported by the fact that the system is interoperable with other publishing tools and has been integrated into the information supply chain. and it is accessible in that it is dedicated to inclusive services and content for all readers.

. . . .supdigital

developed by stanford university press, .supdigital is attempting something transformative at the research level by applying the rigors of traditional university press publishing to born-digital scholarship. the goal is to provide a formal, peer-reviewed publication process for interactive digital scholarship, allowing scholars to create a digital object that presents, explains, and displays their research. once an overarching argument is embedded, the press can peer review and thus validate the work. thus far, .supdigital has developed standards for presentation, metadata, a hosting platform, and archiving capabilities (see video at http://blog.supdigital.org/).

berkery noted that the lessons to be learned from these initiatives are the following: going into a project, know whether the results are meant to be leverageable (standard format) or flexible. he used the digital einstein papers website developed by princeton university press as an example (see https://press.princeton.edu/einstein/digital). he noted that the website is fantastic, but the infrastructure is not applicable to anything else. this segues into the issue of sustainability, as these projects are expensive and the investment needs to be leveraged in multiple ways.
also, peer review is a requirement and must be incorporated as university presses develop new means to “publish” outcomes from digital humanities research. he added that, surprisingly, do-it-yourself implementations have had limited appeal, and finally, he noted that definitional issues surrounding the term “digital humanities” remain. in closing, he offered the following quote from alan harvey, director, stanford university press: “the goal is not to publish a book in digital form. the goal is to publish digital scholarship in its native form. that means embedding the scholarly argument within the digital object.”

berkery’s slides are available on the nfais website and a paper based on his presentation appears elsewhere in this issue of information services and use.

. . . mit knowledge futures group

the final speaker in this session was catherine ahearn, senior project editor, mit knowledge futures group (kfg) - a new joint initiative of the mit press and the mit media lab. it is the first of its kind between an established publisher and a world-class academic lab devoted to the design of future-facing technologies. ahearn noted that the kfg’s mission is to transform research publishing from a closed, sequential process, into an open, community-driven one, by incubating and deploying open source technologies to support both rapid, open dissemination and a shared ecosystem for information review, provenance, and verification. it provides support for mission-driven publishers and brings like-minded groups and individuals together. currently, there are four projects:

• pubpub: this is a free, open authoring and publishing platform initially developed as a media lab project.
it socializes the process of knowledge creation by integrating conversation, annotation, and versioning into short and long-form digital publications. currently there are more than one hundred pubpub communities (see https://www.pubpub.org/explore).
• underlay: a protocol for data interoperability. it is an open, distributed knowledge store that is architected to capture, connect, and archive publicly-available knowledge and its provenance. the underlay provides mechanisms for distilling the knowledge graph from openly-available publications, along with the archival and access technology to make the data and content hosted on pubpub available to other platforms.
• prior art archive: the first free and open archiving platform for prior technical art for the entire it industry, prototyped by mit and cisco. its goal is to help fewer bad patents be issued, by giving uspto examiners the tools they need to find old technology.
• ecosystem map: a mellon-funded environment scan to be published in june .

ahearn noted that there is value in experimentation. they are redefining the digital reading experience; supporting the development and evolution of new ideas; and introducing transparency and openness in the review process to provide greater value to authors and readers. she offered some interesting examples of what they have accomplished to date. the first was a collaborative reading experiment, frankenbook - a special edition for the th anniversary of mary shelley’s frankenstein that was published on pubpub in january . this edition included additional annotations from the editors; multimedia embedded in the text and annotations; labels added to the annotations for tailored reading; special functionality for use in classrooms; and community discussions around the text - currently frankenbook has four hundred and twenty discussions and more than seven thousand visits (see: https://www.frankenbook.org/). a second example is works in progress (wip).
these are written works in early stages of development that would benefit from an open peer review process. it offers authors the benefit of community feedback in the development of their ideas, as well as the ability to publish a version of their work before more formal publication. after the open review period, authors may revise the work and submit it for consideration for formal publication. the mit press will have right of first refusal, and all submitted manuscripts will be subject to the press’s usual rigorous peer review (see: https://wip.pubpub.org/). her final example was data feminism (see https://bookbook.pubpub.org/data-feminism), a contracted manuscript by catherine d’ignazio and lauren klein. it is available for peer-to-peer review and will be published by the mit press as part of their ideas series. she noted that all titles in the ideas series are open access and published on pubpub with support from the mit libraries. this manuscript had more than five hundred comments and more than three thousand visits at the close of open reviews. this was a successful manuscript that emerged from the wip program noted above.

ahearn said that while there have been many benefits from the experimentation that they have been doing, they have also faced some challenges, not the least of which have been funding models; creating new workflows and/or integrating with existing ones; and creating and communicating realistic goals. in closing, she invited all attendees to consider creating a community (go to: https://www.pubpub.org/community/create) or joining an existing one and noted that all of their code is openly-available on github.

ahearn’s slides are available on the nfais website.

. the challenges to information discovery

the final speaker of the day was tim mcgeary, associate university librarian for digital strategies and technology, duke university libraries, who gave one of the more thoughtful presentations at the conference. he discussed a major issue that all librarians have to address – how to provide state-of-the-art information discovery and personalization services while protecting user privacy. he opened with the following question: “how can a system be user-centric if it is not all about the user? and how can that happen without sacrificing some of our deepest values about privacy? are our values incompatible with user-centricity?” he noted that libraries have a unique position within the technology-driven marketplace - they are both an information-consumer and an information-provider. and, while libraries need to protect user privacy, they are competing for users with commercial services that are not so focused on user privacy and that gather a lot of user-information to provide very customized services - services that users also expect from libraries, which do not gather such personal data.

mcgeary looked at the past thirty years of information discovery - from the launch of web-based online public access catalogs [ ] (opacs) in the ’s, google in , faceted [ ] opacs in the early s, followed by web-scale discovery services [ ] offered by companies such as ebsco and proquest, and now index-based discovery services [ ]. he noted that modern discovery services and advanced opacs all use a single search box because that is what google has always used and google has (as we all know) totally reshaped user behavior and expectations. so what do users expect?
they want to find all of the information that they need in one place; they want to be able to access information online from any location and using any device; they want personalization and customized search; and they want just-in-time customer support. he noted that one of the biggest issues is being able to access information easily, observing that many academics who have legal access to content in their library actually use pirated material from sci-hub simply because it is easier to use than many library systems. he reviewed several of the new services that are attempting to alleviate this problem (see article by john seguin that appears elsewhere in this issue as well as several discussions from the nfais annual conference [ ]). he noted that universities and libraries now realize what commercial and consumer providers have known for a long time - that the data about their users is deep, untapped, and full of potential. he added that while libraries have long aimed to protect this data from being used harmfully, the era of constrained financial support, especially for libraries, requires much more intentional data-driven decision-making. in closing, mcgeary noted that the impact of technology on user-centric discovery has created a new paradigm for libraries, which must be willing to adapt to user expectations of personalization, options, and convenience. libraries should aim to serve their users better with the responsible use of data and be willing to go further than they have dared before in collaborating with, and obtaining informed consent from, their users. but what remains constant is that libraries must continue to protect their users’ intellectual freedom and privacy as a most foundational value, and the core of user-centric discovery. mcgeary’s slides are available on the nfais website and a paper based on his presentation appears elsewhere in this issue of information services and use.
. an automated solution to information access
day two of the conference opened with a plenary session given by sabine louët, ceo and founder of sciencepod, a start-up company in dublin, ireland that offers a digital content creation and publishing platform. the platform is designed to give people who are not familiar with the requirements of publishing the help that they need to create high-quality content telling the story of their research, as well as to provide experienced publishers with cost-effective tools to produce content (see: https://sciencepod.net/#splash). louët was formerly a news editor at nature biotechnology and the editor of euroscientist. she said that her company has a mission, and that is to explain the meaning of scientific research findings that are buried in the journal literature and in databases, and that they accomplish this by translating complex ideas into simple messages. she opened by saying that there are a lot of misconceptions about where innovation is coming from. it is not, in her opinion, coming from large, established companies who are perhaps a bit risk-averse, but rather it is coming from start-up companies. she said that she was at a meeting in berlin last week where someone presented a study of one hundred and twenty start-ups and found that % were truly stand-alone companies, not offshoots of a larger corporation, and twenty-three of the sample companies were ultimately acquired because of their innovations. she said that as an entrepreneur she has learned to ask the right questions and one of the big questions today is what is driving innovation in the information industry? she believes that one of the major forces is open access (oa) and she referred to a presentation made at the nfais annual conference [ ] by deni auclair of deltathink in which it was reported that oa accounted for twenty percent of the global content output, but only for ten percent of the revenue.
oa has a ways to go, but it is having a major impact today and she commented on plan s – the effort by coalition s, a group of mainly european research funders, to accelerate oa by making it a requirement that all research they fund be published in oa journals beginning in february [ ]. she gave as an example the major deal that wiley has entered into in germany allowing seven hundred german institutions access to wiley journals and also allowing their researchers to publish in wiley’s oa journals. the multi-million dollar deal was signed on january , and a public version of the contract is available [ ]. louët said that she believes that more of these deals will be announced in the not-too-distant future. many speakers throughout the conference referenced plan s and the wiley deal. she noted that the impact factor [ ] still dictates journal and author recognition, and most journals with high impact factors are not oa (one of the problems facing the implementation of plan s). she added that with social media and science networks we are entering an era where the author is becoming a “brand” more so than the journal. because of this, publishers are becoming more author-centric and are seeking to provide more author services, some of which are to assist authors in promoting their work, which is where her company comes in. her company is focusing on taking bundles of articles (including oa articles) and, using artificial intelligence (ai), creating a “story” of the research in those bundles with the goal of shining a light on the authors of those papers. put simply, they create “automated summaries.” she noted that the use of automation to “write” articles started in when the first news report on an earthquake in los angeles, ca was written by a robot [ ]. as noted at the time, algorithms won’t necessarily replace editors and reporters because they cannot generate proficient text, but automation can help speed up the writing process.
in closing, she noted that the summaries created by a combination of sciencepod editors, writers, and ai improve content discovery. she said that this combination makes complex topics accessible, using crystal-clear english, so that information is easily understood, and that it can save publishers editorial time and costs when creating content to raise the profile of their content and their authors. louët’s slides are available on the nfais website.
. unconventional partnerships
. . preprints and journals
the next session highlighted three speakers who focused on the importance of partnerships in the information industry. the first speaker was john inglis, co-founder of biorxiv and medrxiv, and executive director of cold spring harbor laboratory press. the goal of his presentation was to demonstrate ( ) the advantages that preprints offer individual scholars and their communities of practice, ( ) the integration and collaboration that is possible between servers and journals, and ( ) the ways in which upstream discovery and assessment of a preprint may help optimize the published version of record of the manuscript concerned. he opened with the definition of a preprint - a research manuscript, yet to be certified by peer review and accepted for publication by a journal, that is loaded on an online platform dedicated to the distribution of preprints. he then went on to demonstrate the power of preprints using (with permission) the example of kenji sugioka, ph.d., who is now an assistant professor at the university of british columbia. in january sugioka published his first paper as a postdoc at the university of oregon. in june of that same year he published a preprint of a minor project that he was working on, followed in july with another preprint that discussed some more important research with which he was involved.
the reaction to both preprints was very positive, so sugioka decided in october to start looking for a tenure-track position and, due to his preprints, he was invited to a lot of interviews. in january his minor project paper was published. by april he had accepted an assistant professor position at the university of british columbia, and in august the paper on his major work was published. this, inglis said, is the power of preprints. communication was done in parallel with the speed of research; the results were globally disseminated freely; and the results were shared, evaluated, and commented upon quickly. sugioka was able to clearly demonstrate his experience, knowledge, and value - and landed a tenure-track position in less than a year from when he started the search! inglis noted that preprints:
• uncouple distribution of results from certification through peer review,
• enable community awareness of recent work and the chance to comment,
• provide evidence of productivity before papers are published, and
• accelerate the pace of research and the advancement of science.
he offered the following quote in support of the above: “if one preprint inspired the work of just two other people, biologists would see a five-fold acceleration in scientific progress in a decade” (steve quake, president, czi biohub, wired july , ). he noted that the number of preprint servers is proliferating across all disciplines and sub-disciplines, each with its own technology and policies (for an excellent overview of preprint growth see the summary of a presentation given by shirley decker-lucke of elsevier on preprint servers at the nfais annual conference) [ ]. he then went on to describe the two preprint servers hosted by his organization, biorxiv and the soon-to-be-launched medrxiv. the former is a server for life science preprints. it is a five-year-old not-for-profit service, not a product, of cold spring harbor laboratory, and it is free for both authors and readers. it is hosted by highwire press, supported by the chan zuckerberg initiative, and is publisher-neutral. submitted manuscripts are screened, not peer reviewed, and then posted within twenty-four to forty-eight hours. submitters must declare ( ) that they have the right and permission to submit, and ( ) that the manuscript is unpublished. they must also register any clinical trials with a registry approved by the international committee of medical journal editors (icmje). options soon to come are funder acknowledgement and access to data via a link to a repository. in-house staff check that the submission requirements are met; that the manuscript is of an appropriate scope (not non-science or pseudoscience); that it does not contain any obscenity, defamation, or plagiarism; and that it is, indeed, a research paper. also, images of human subjects are not accepted for inclusion. manuscripts are either made live or flagged for further review after consideration of the following questions: is the content science? is the manuscript a complete research paper? is the content at a level appropriate for sharing with practicing scientists, regardless of quality or accuracy? does the content have potential to harm (or prompt behavior that might harm) individual patients or populations; e.g., articles about dual-use research; articles about vaccine safety or infectious disease transmission; articles promoting or disputing specific drug regimens; and articles about the toxicity/carcinogenicity of common substances? as of early this year there were forty-two thousand posted manuscripts and one hundred and seventy-seven thousand authors representing fourteen thousand seven hundred institutions in one hundred and ten countries. as of january there had been . m page views and . m pdf downloads. usage, downloads, and manuscript submissions are all growing.
he noted that use of twitter to talk about preprints is popular and helps accelerate growth in readership. he noted that biorxiv partners with publishers so that authors can ultimately submit their final manuscript to be published - approximately one hundred and forty-two journals are covered by these partnerships - and some journals actively encourage submission of the preprints themselves. he noted that sixty-seven percent of biorxiv preprints are published in journals within two years, with a median of eight months between posting and publication. preprints are given a forward link to the published version and the published version has the journal doi and a backward link to the preprint. inglis briefly discussed three new services focused on preprints that have recently emerged. the first is prelights, a community platform for selecting, highlighting, and commenting on recent preprints from across the biological sciences launched by the company of biologists [ ]. the second is prereview, a journal club for preprints that encourages scientists to post their outputs as preprints (see: https://www.authorea.com/inst/ -prereview). and the third is peer community in (pci), a non-profit scientific organization that aims to create specific communities of researchers that review and recommend, for free, unpublished preprints in their field (see: https://peercommunityin.org/). in closing, inglis noted that cold spring harbor laboratory press, jointly with yale university and the british medical journal, will be launching a preprint service for medicine (medrxiv) on june , [ ]. inglis’ slides are available on the nfais website and they provide a lot of detail.
. . blockchain, artifacts, and the max planck society
the second speaker in this session was david kochalko, co-founder of artifacts, a relatively new company that uses blockchain technology to support the flow of research from start to finish.
artifacts first came to my attention when courtney morris, the other co-founder of the company, spoke at nfais’ conference on the use of blockchain technology in scholarly communication last may [ ]. the company was launched in march and since then has entered into multiple partnerships, the most recent being with the max planck society and the bloxberg consortium [ ]. the consortium is attempting to provide an infrastructure that will allow researchers around the world to use blockchain technology for their collaborative research. their vision is to have sufficient representation from various scientific entities actively participating in the consortium, so that the blockchain network itself may replace the traditional scientific infrastructure, ultimately eliminating current challenges such as the closed-access publishing of research results, among others. so what is blockchain technology? according to a recent report from the national institute of standards and technology (nist), “blockchains are immutable digital ledger systems implemented in a distributed fashion (i.e., without a central repository) and usually without a central authority. at its most basic level, they enable a community of users to record transactions in a ledger public to that community such that no transaction can be changed once published.” that same publication concluded that “the use of blockchains is still in its early stages, but it is built on widely-understood and sound cryptographic principles. moving forward, it is likely that blockchains will be another tool that can be used to solve newer sets of problems… blockchain technologies have the power to disrupt many industries. to avoid missed opportunities and undesirable surprises, organizations should start investigating whether or not a blockchain can help them.”
[ ] the artifacts blockchain allows both individuals and organizations to get on the bandwagon. they have their own distributed ledger system (blockchain) that individual researchers can use, free of charge, to upload their research findings as they go through the research process all the way through to final publication. by doing so researchers have ultimate proof of their work and when it was done (entries are time-stamped). they can protect and manage their intellectual property while facilitating knowledge sharing, if and when they want to share; and they can get credit at any point for any type of research output - they do not have to wait until their research results are actually published. according to kochalko, there will ultimately be a “deep historical archive of published and discovered findings” that will be accessible to the broader scientific community [ ]. kochalko’s slides are available on the nfais website and a brief paper based upon his talk appears elsewhere in this issue of information services and use (isu). for more information on the use of blockchain technology in scholarly communication, i refer you to the special issue of isu on this topic that was published last year (see: https://content.iospress.com/journals/information-services-and-use/ / ).
. . the ebsco partnership strategy
the final speaker in this session was nathanael lee, strategy analyst at ebsco information services, who discussed ebsco’s philosophy of partnerships and how their partnership strategy has evolved over the years. lee defined “partnership” as “a relationship between two or more entities for the exchange of goods, services, and/or ideas in order to create outputs.” he noted that ebsco started as a subscription business helping publishers get their journals into the hands of librarians around the world and as a result they built a really strong publisher network.
as the information industry and technology changed, ebsco realized that librarians increasingly wanted digital information, they wanted to purchase “bundles” of journals, and they wanted more affordable and easy access to information overall. in the ’s ebsco turned their attention to technology and acquired a cd-rom business; then, in partnership with members of their publisher network, they built ebsco host as a platform for the digital distribution of journals. the system, lee added, is used by every major library in the world today. they then started buying major abstracting and indexing services such as h. w. wilson with the goal of building their own “indexes” to the journal literature and ultimately built a “discovery service” that libraries can use to access and search their holdings (see mention of such services in tim mcgeary’s presentation mentioned earlier). now ebsco is entering the third phase of their business – they are offering services and solutions. they have formed a partnership with openathens (see: https://openathens.org/) to provide easy user authentication/logins for ebsco services. they partnered with stacks to get a web-based content management system and have since acquired the organization. now ebsco offers the service to libraries together with log-in alternatives via openathens (see: https://www.ebsco.com/products/ebsco-stacks-library-websites). a more recent partnership that lee spearheaded is with stackmap (see: http://www.stackmap.io/), which offers a gps-like tool that allows library patrons to locate an object, e.g., a book, in the physical library.
ebsco has integrated that “service” into the search results of the ebsco discovery service so users get both the information that they need and directions to its physical location in the library - cool! another example is a partnership with folio, which stands for the “future of libraries is open.” it is a community founded in to develop a “reimagined” library services platform. while it is open source, ebsco works with folio to provide library support when needed (see: https://www.ebsco.com/partnerships/folio). lee said that ebsco is a mission-driven organization and that their mission is to “transform lives by providing reliable, relevant information when, how, and where people need it.” they look for partners with values that support the ebsco mission, such as thinking long-term and making society better off. they believe in supporting their partners in their shared goals. he closed by encouraging organizations represented in the audience to consider partnering with ebsco, with some of the reasons being that they have the largest sales force in the library industry, they offer proof-of-concept testing for new ideas, and they are willing to share the results of what they learn. for an interesting history of ebsco industries (of which ebsco information services is a part) see: http://www.fundinguniverse.com/company-histories/ebsco-industries-inc-history/. lee’s slides are available on the nfais website.
. library consortia - a changing world
the final speaker of the morning was roger schonfeld, director of libraries, scholarly communication, and museums program at ithaka s+r, who discussed some of the changes that the library community has been experiencing. he opened with a “pre-digital history” discussing the library dream of a universal collection delivered as efficiently and seamlessly as a system could enable – a dream that could not be fulfilled in a print world. resource sharing was in the form of inter-library loan and union catalogs.
eventually, new library systems actually brought these functions into the digital environment. he used the example of what is now the center for research libraries (crl), which originated in march , when ten major u.s. universities entered into a formal agreement to establish the midwest inter-library corporation (milc). initially the center’s chief activity was accepting and processing deposits of monographs, journals, and other materials that were transferred to it by member universities. in addition to accepting deposits, the center also began to subscribe to u.s. and foreign newspaper titles not being acquired by members, thus establishing a collection that remains one of its enduring strengths. in the s it expanded from a midwest regional organization to one with a national scope and in became the center for research libraries. as the organization grew its mission, out of necessity, changed because scope and mission, according to schonfeld, are intertwined.
with the ’s and ’s came the digital world with shared cataloging and oclc. originally founded as the ohio college library center in , it quickly became the online computer library center and the first online cataloging by any library in the world took place in . he discussed the regional membership-based, collaborative library networks that emerged, e.g., solinet, palinet, and others, and how they have either merged or disappeared over time as commercial entities emerged to provide sophisticated integrated library systems, and centralized repositories emerged to serve as centers for digital preservation.
these networks or consortia wielded considerable power as “buying clubs,” and their members still do – even outside of the usa (note: he referred to the projekt deal in germany in which libraries and research organizations get access to a set of wiley journals back to and whose scientists can publish in wiley open access journals without paying article processing charges (apcs). he added that most consortia lack the scale to secure such huge deals). schonfeld said that today library collaborative networks face three major challenges:
• licenses vs. open access: subscriptions are slowly giving way to a growing number of oa models. collaborative networks do not have the systems to handle this.
• resource sharing: most collaborative networks were established in the print environment. print has declined and cloud-based systems are the norm for handling digital content. with the rise of such cloud-based systems, print-based networks are no longer heavily used and are becoming unsustainable.
• funding: state support for higher education has declined, and continued scrutiny of library budgets has resulted in pressure to show value and differentiate against peers.
bottom line – what is the value in becoming a member of a collaborative library network in today’s world? schonfeld said that such membership organizations are in the midst of a crisis. he noted that “trust” is the great intangible in networks. members often want their organization to do almost everything involving collaboration, but the fact is that not every membership community infrastructure is well-suited to support a multi-purpose organization. membership models are durable primarily due to a sort of peer pressure to belong - no one wants to be left out of the game. it is also a fact that it is difficult to set an unambiguous strategic direction for a membership organization, and it is a challenge to follow a single strategic purpose over a long period of time.
he noted that many libraries belong to a number of consortia and that their parent organization, i.e., the university, is often unaware of the memberships. schonfeld said that libraries need to realign with their parent, directly contribute to the university’s purpose, and integrate themselves into the university system. he noted that the essential transformations in libraries today are:
• print vs. digital
• local vs. shared
• license vs. open access
• general vs. distinctive
• collections vs. workflows
• selector vs. enabler
• provider vs. partner
he added that all libraries need to accept these trends and restructure their operations and organizations accordingly. in closing, schonfeld said that the lessons learned are that:
√ every good idea does not require a new organization.
√ for a new non-profit organization, grants should be used to establish a business model.
√ membership models are not well-suited to product organizations and marketplace competition.
√ open source solutions have an especially precarious balance between community governance and strategic agility.
√ startups have a precarious existence.
he noted that library consortia need to do some self-reflection and ask the following questions: what is our strategic role? do we duplicate the work of others? do we have the right partners? is our governance model well-adapted to the strategic role that we envision? schonfeld’s slides are not available on the nfais website.
. scholarly publishing rebuilt from scratch
the speaker for the members-only lunch was dr. jon tennant, identified as a nomadic paleontologist, rogue open scientist, and an independent researcher and consultant who is working on public access to scientific knowledge. he asked the question: “what would scholarly publishing look like if it were built from scratch?” is there a need to rebuild? is there a problem?
tennant says it depends on your perspective and to whom you speak. publishing today is either brilliant or full of holes and there is even a movie about it entitled “paywall: the business of scholarship” (see: https://paywallthemovie.com/)! tennant said that in his opinion the answer to both questions above is “yes” and he purports that the problem is in many ways due to the business model - with access to information being “closed” and behind paywalls. there are efforts to change this and he mentioned some of the highlights from : sweden, germany, france, etc. all taking a strong stand for open access; plan s (see: https://www.coalition-s.org/); the springer nature failed/delayed ipo [ ]; the european open science cloud (eosc, see https://eosc-portal.eu/about/eosc), etc. he said that the current state of scholarly communication is that it is a th century process applied to a th century communication format that is slowly adapting to a s-based web technology and he asserts that we can do better. he added that our publishing failures are also due to poor communication, with strong rhetoric overcoming the rational debate of issues such as open access and closed paywalls. we talk past each other rather than build bridges and he used an example of researchers responding to an announcement from elsevier with regard to the four stated principles of their information system supporting research: source neutral, interoperable, transparent, and in the control of researchers. the negative tweets against elsevier went on for days (see: https://twitter.com/elsevierconnect/status/ ). he said that all too frequently the use of social media is reactive and not self-reflective and that the quality of social media discussions can vary widely. he briefly mentioned the open scholarship initiative (osi) which he believes is a brilliant idea. their goal is to bring all of the stakeholders together to have a reasonable discussion.
they assume that there is a reconcilable middle ground and that such a middle ground is worthy of attaining (see: http://osiglobal.org/). however, tennant questions whether or not it is an ultimately doomed quest because tensions are rife and he suggests that perhaps there are times when reaching a middle ground is not necessarily a good thing. but through such an effort we can gain a more empathetic ground and come to better understand what each party really wants and needs. he showed some charts about the growth of open access (oa) publishing and asked the audience if rapid growth of oa is a good thing. a show of hands indicated that the audience did not view it positively - a response tennant did not expect. he also discussed briefly how publishing/peer review is being changed to make the most of web technology and used f research as an example. this is an oa publishing platform where the authors themselves control the peer review process (see an example of one of tennant’s articles at: https://f research.com/articles/ - /v ). he calls this “constrained” innovation because most of the new tools are still being built around a journal-based system and therefore depend on publishers for sustenance [ ]. he noted that we all use networks and reviews to evaluate information on the web, e.g., trip advisor, but for some reason we do not do it in science. why not use the power of professional networks to evaluate and communicate scientific results? he said that we should really give this some thought and move away from the world of journals and articles to focus on the power of networked technologies and version control.
he asserted that research is a continuous process and should be communicated as such, and he supported david kochalko’s blockchain efforts to get all of the research “objects” - data, videos, protocols, etc. - into a place that is accessible by the entire community. he asserts that the technology exists and said that a marriage of github (see: https://github.com/), stack exchange (see: https://stackexchange.com/), and wikipedia could form the foundation of the perfect platform having the required elements for quality control and moderation, for certification and reputation, and for engagement incentives. he went on to show how this “platform” could change our traditional methods of scholarly publishing. for example, publishing would no longer be organized around papers and journals, but rather there would be unrestricted content types and formats, and gatekeeping would be replaced by collaboration and constructive criticism. he said that there is still a place for publishers as no one denies the value-add that publishers bring to the process of scholarly communication. if publishers compete fairly as service providers, it is possible to move towards open scholarship with for-profits as part of the system. but some stakeholders will find it difficult to collaborate if publishers work against researchers by locking in those profit margins. tennant said that the challenges to transforming scholarly communication are to ensure a shift to digital norms that will reflect the adoption of web-based processes; to co-ordinate strategic and stepwise changes towards “open science”; to understand the changing roles of stakeholders such as editors, librarians, publishers, etc.; to reconcile changes across/between disciplines/communities having diverse norms, practices, and biases; and to resolve - with civility - the major tensions that exist between all of the stakeholders.
in closing, he said that the ultimate goal is for science to be a public good for the betterment of society by pooling knowledge and resources to create a decentralized scholarly infrastructure based upon strong values, on the principles of open scholarship, and with communities as the focus. tennant’s slides are on the nfais website. they contain a lot of information and useful links to additional worthwhile reading.

miles conrad lecture

the first afternoon session was the miles conrad lecture. this presentation is given by the person selected by the nfais board of directors to receive the miles conrad award - the organization’s highest honor. this year’s awardee was martin kahn, chairman, code ocean, and long-time information industry executive. kahn opened by saying how surprised he was to get the award because he has not been associated directly with nfais through most of his career, although many of the companies with which he has been involved, e.g., proquest, have had close relationships with nfais. so he was trying to figure out what he did right to be so honored and he did come up with a few ideas.

first, twenty years ago he gave a presentation at an nfais annual conference in philadelphia and a lot of people liked it. they liked it because he took it almost word-for-word (with proper attribution, of course) from a great book by kevin kelly entitled new rules for the new economy [ ]. kahn then briefly reviewed his talk from twenty years ago to see what held up, saying that kelly had ten rules that are as follows:

(1) embrace the swarm. as power flows away from the center, the competitive advantage belongs to those who learn how to embrace decentralized points of control.
kahn said that what kelly meant was that the internet is our future - we are connecting everything to everything (this was before smartphones and high-speed telecommunications). he said that there is power and insight in what he called massive dumbness.

(2) increasing returns. as the number of connections between people and things add up, the consequences of those connections multiply out even faster, so that initial successes aren’t self-limiting, but self-feeding and self-reinforcing. kahn noted that the power of networked connections has been enormous, leading to a multitude of opportunities.

(3) plentitude, not scarcity. as manufacturing techniques perfect the art of making copies plentiful, value is carried by abundance, rather than scarcity, inverting traditional business propositions. kahn suggested that a network economy runs on plentitude, leading to a multitude of opportunities - a world of zillions!

(4) follow the free. as resource scarcity gives way to abundance, generosity begets wealth. following the free rehearses the inevitable fall of prices, and takes advantage of the only true scarcity: human attention. kahn said that all things get cheaper as they improve and that the best way to reach ubiquity is to give things away for free.

(5) feed the web first. as networks entangle all commerce, a firm’s primary focus shifts from maximizing the firm’s value to maximizing the network’s value. unless the net survives, the firm perishes. according to kahn, members prosper as the net prospers. it is now the rule of thumb to support the ecosystem in which we live along with other stakeholders. closed systems, with some exceptions, have disappeared or are dying.

(6) let go at the top. as innovation accelerates, abandoning the highly-successful in order to escape from its eventual obsolescence becomes the most difficult and yet most essential task.
kahn said that he believes that this is the most prescient rule - optimization precedes the demise of an organization, which is why it is much easier to start a new company than it is to change an older, established one. he referred to vincent cassidy’s earlier talk on how iet had to change the culture of its one-hundred-and-forty-eight-year-old organization to survive and was awestruck at their ability to do so. kahn noted that in times of change it is often the traits that were an organization’s original strength that can be the cause of its downfall - and we are certainly in times of change! kudos to iet for adapting and experimenting without ignoring their core competencies!

(7) from places to spaces. as physical proximity (place) is replaced by multiple interactions with anything, anytime, anywhere (space), the opportunities for intermediaries, middlemen, and mid-size niches expand greatly. kahn noted that well before it happened, kelly predicted the demise of “physical” spaces such as retail stores (amazon versus borders book stores).

(8) no harmony, all flux. as turbulence and instability become the norm in business, the most effective survival stance is a constant, but highly selective disruption that we call innovation. kahn noted that the net causes turbulence and uncertainty - but that we must seek sustainable disruptive equilibrium without succumbing to it or running from it. those who have thrived on the intellectual stimulation of change also are dismayed by where they are.

(9) relationship tech. as the soft trumps the hard, the most powerful technologies are those that enhance, amplify, extend, augment, distill, recall, expand, and develop soft relationships of all types. kahn said that this is the least realized - trust is still an issue, especially in consumer services that can misuse customer information, as tim mcgeary discussed earlier.

(10) opportunities before efficiencies.
as fortunes are made by training machines to be ever more efficient, there is yet far greater wealth to be had by unleashing the inefficient discovery and creation of new opportunities. kahn suggested that we should not focus on problem solving; rather, we should identify the opportunities that problems can often present. kelly saw problem-solving as looking backward. he believed that doing the smart thing was better than doing an old thing better. kahn said that this was the worst part of reliving that twenty-year-old talk. why? because the example that kelly used in the book was the industry’s beloved dialog online service - a service that kahn acquired ten years after kelly’s book was published and an acquisition that was the worst one of kahn’s entire career. if anything, however, the revisit convinced him that future trends are identifiable by those with a strong intellect and thoughtful clarity in thinking, even if the time-frame is not.

kahn said that the talk in philadelphia was not the reason that he received the miles conrad award. the real reason was his involvement with smart people at the right time, and he provided several examples. first, he was involved with bill marovitz, who saw the opportunity in the brs online service to provide medical information to practitioners without the use of intermediaries. this relationship ultimately led to ovid and mark nelson, who saw that online services were to be temporarily supplanted by cd-roms and that it was not the hardware that was essential, but rather that software was the invaluable jewel. nelson believed, correctly, that the price of hardware would eventually plummet, and the two of them managed the business accordingly and ultimately survived (although at times barely). mark had all of the ideas and strategies - kahn said that he was simply the “front man” as mark was painfully shy.
kahn’s next venture in was at proquest where he was presented with a detailed proposal (codename “magnolia”) from suzanne bedell (now at elsevier) and john law for what would become the summon discovery service. he jokingly said that he keeps it under his pillow as his own personal tooth fairy and good luck charm (suzanne was in the audience!). this was the first of the library discovery services and, according to kahn, the best thing that happened while he was at proquest.

in summary, kahn remarked that his journey to the miles conrad award was due first and foremost to his involvement with really good people and second to his involvement with really good technology. he was in the right place at the right time. he really had nothing to do with the creation of brs, ovid, or summon - they were not his ideas nor did he write a line of code. he was simply able to put his faith in the right people. he is now involved as chair of code ocean, an organization that was one of the start-up companies in the shark-tank shootout at the nfais annual conference [ ]. the company is early-stage and investor-owned. it is for-profit, but has yet to make a profit. eventually it will bring in revenue from fee-based services, but a large part of the company will remain focused on open source software for the individual researcher who wants to create a computational environment for collaboration and reproducibility (code ocean offers a collaboration platform for the creation of computational code and data; see: https://codeocean.com/). today it has more than ten thousand registered users, thirteen thousand private development projects, and more than five hundred published projects linked to peer-reviewed articles - many of which are in iet journals, with whom they have a productive relationship.
he said that code ocean follows many of kelly’s principles for success and that its founder, simon adar, is yet another remarkable person with whom kahn has had the good luck to become associated. in closing, kahn said he still really does not know why he was given this honor, but he is proud to accept it and not at all ambivalent about doing so because of all that simon adar and the code ocean staff are trying to accomplish. kahn’s slides are not available on the nfais website.

unlocking the benefits of semantic search

the remainder of the afternoon focused on the value of semantic search. the first of the three speakers in this session was bob kasenchak, director of product development at access innovations, inc. and a popular speaker at past nfais conferences. his discussion provided an overview of semantic search (including the diverse definitions floating around) and explained some of the related technologies and applications. in many ways his presentation framed the two following talks. kasenchak opened by saying that semantic search goes beyond keyword searching to examine the context within which the search terms “live.” basic search fails for several reasons: (1) simple search only matches text strings; (2) by its very nature language is ambiguous; and (3), as we all know, we have amassed an enormous amount of content and continue to do so. unfortunately, simple string matching is completely inadequate for large, specialized repositories of content that have a unique technical language that evolves over time. he frequently quoted the phrase “google says things not strings,” meaning that keywords are only an indicator of what the searcher is really looking for. one of the issues today is that search platforms vary and usually the default search option is the one chosen by the user - the more “advanced” options are usually ignored.
and simple searching is just that - matching a string of words against the exact same words in a database, often ignoring synonyms and word variations. he used the example of searching on google scholar for the term “horse,” for which he received more than three million results, and noted that he wondered how the results were prioritized, for the first hit listed was by an author whose last name is “redhorse.” certainly the results were sub-optimal and there was no option to eliminate author names from the search. he modified the same search by pluralizing the search term to “horses.” this time he received . million results that were completely different from those resulting from the first search. based on the content of the second search result set, he assumed that google scholar ranks terms that match an article’s title and author higher than those terms embedded in the text. he noted that google scholar does not appear to recognize that a singular and plural version of the same term may be related to the same concept! it only matches the exact term! (apparently a search in google itself, not google scholar, does recognize the relationship). he did a second search in a service offered by a creator of scholarly services using the term “unmanned aerial vehicle” (drone) and received one hundred and seventy-one hits, but when he used the acronym “uav” he received almost three times as many hits. obviously in this case the search platform - at least in the default option - does not consider acronyms, so users of the system are not getting all of the information that they are seeking. simple searching fails over and over again because it inherently looks only for exact term matches.
semantic search is any search that attempts to go beyond the text string in the box; the common denominator is that it tries to examine the context in which the text string lives in order to drive relevant results. the methodologies used can include synonym rings, taxonomies, lexical variants, fuzzy logic, the geographic location of the searcher, the searcher’s previous queries, previous similar searches, ontologies, knowledge graphs, and other strategies. some of these are quite simple and others are quite complex. for example, some will allow you to search a database even if you are not familiar with the language: if you are able to get part of a term correct, the search will bring up relevant hits. granted, there will be some level of “noise,” so the searcher needs to be careful when evaluating the results. these types of searches utilize the levenshtein distance [ ] or a similar measure to match misspellings and variants; e.g., “bob” is one edit away from “rob,” and the greater the distance, the less likely the match. some engines “parse” queries; e.g., if kasenchak searches on “harrison ford” he will get one set of hits, but if he asks the question “when is harrison ford’s birthday?” the string is parsed and google returns the appropriate hit. one is a keyword search, the other is not. he also discussed contextual searches that are frequently based upon geographic location or prior searches; e.g., a search on the term “pizza” brought up a number of places near him where he could buy pizza (this can be done if the search engine can infer the searcher’s location, for example from an ip address or a gps locator).
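the edit-distance matching mentioned above can be made concrete with a short sketch. the python below is illustrative only: it implements the standard levenshtein distance and uses it to rank candidate terms against a query. the function names, the sample vocabulary, and the `max_distance` cutoff are assumptions made for this example and do not describe any particular search engine.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(query, vocabulary, max_distance=1):
    """Return (distance, term) pairs within max_distance, closest first.
    max_distance is a hypothetical tuning knob for this sketch."""
    scored = [(levenshtein(query, term), term) for term in vocabulary]
    return sorted(pair for pair in scored if pair[0] <= max_distance)


if __name__ == "__main__":
    # Hypothetical vocabulary, for illustration only.
    print(fuzzy_matches("bob", ["rob", "robert", "bob", "job"]))
```

in this sketch “rob” and “job” are each one edit away from “bob” and are kept, while “robert” is four edits away and is dropped; raising `max_distance` admits more variants at the price of more noise.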
he also discussed the google knowledge graph that pops up on the side of the search results page and alerts the searcher to things that may be of interest to that specific person based upon all of the information that google has gathered on that person as a result of their past searches; e.g., he did a search on “jaguars” thinking that it would bring back results on the animal, but rather it brought back information on a sports team that he follows on a regular basis. the google knowledge graph connects search with known facts about entities, and this is becoming quite common across search engines today. this “graph” presents the user with a lot of diverse information and provides a much richer search experience. to see what i mean, go to google and search on “empire state building.” (note: the underpinning of this type of search is actually a very large ontology). kasenchak then moved on to the use of taxonomies, tagging, and controlled vocabularies by content providers. such use allows the context of terms in their databases to be more easily identified. he reminded everyone that while the searcher is interested in concepts, all that a search is based upon is words - they are the “window” to the content and all that the search engine has to go on. unfortunately, words can be ambiguous - hence there is an absolutely essential need for really good subject metadata. a search engine can be “tuned” to search tags before searching the free text, and the use of tags, taxonomies, and controlled vocabularies allows users not only to query, but also to browse. such use also allows search engines to suggest topics using type-ahead or “did you mean” and to leverage synonymy to deliver the same relevant results from various inputs. he gave the example of the public library of science (plos) where such taxonomies are used. it has a thesaurus of more than nine thousand preferred terms, of which about eight are applied per article. it also has about four thousand synonyms.
these are applied to documents and are actually exposed when users browse so that the user can click on the term for another search. they are also used to redirect search queries for synonyms. if the user believes that a term has been incorrectly applied they can click on it and notify plos - they use crowd-sourcing to keep their taxonomies honest! a second example came from jstor - specifically jstor labs - that actually uses the full article as the query, using a combination of traditional taxonomies and naïve artificial intelligence topic modeling. the searcher simply drags and drops a document into the search box. it can suggest related content for research and for bibliographies. go take a look. it is experimental, successful, and really something quite novel (see: https://www.jstor.org/analyze/)!

kasenchak closed by saying that the practical implementation of semantic search can be easy or complex. for the easy route, a content provider can do simple things such as use already-existing search software tuning/options, enable fuzzy matching, weight fields, document types, etc. where appropriate, and use dates to deliver recent results. this can be done directly if they have their own search software or via their distributor. the next level is more complicated and involves the creation of taxonomies and tags, the building of knowledge graphs, and a better understanding of the users of their data via the creation of user profiles, tracking, keeping a history of user behavior, and other targeted means. kasenchak then turned to the remaining two speakers to go into specific examples. kasenchak’s slides are available on the nfais website and an article based upon his presentation appears elsewhere in this issue of information services and use.
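the synonym handling that runs through kasenchak’s talk (the “uav” versus “unmanned aerial vehicle” example and the plos synonym redirects) can be sketched as a simple synonym ring. the python below is a minimal illustration under stated assumptions: the ring contents, the function names, and the sample documents are invented for this example, and matching is plain substring matching rather than anything a production search platform would use.

```python
# Hypothetical synonym rings: every term in a ring is treated as equivalent.
SYNONYM_RINGS = [
    {"unmanned aerial vehicle", "uav", "drone"},
    {"horse", "horses"},
]


def expand_query(query):
    """Return the set of terms to search for: the whole ring if the
    query belongs to one, otherwise just the normalized query itself."""
    normalized = query.strip().lower()
    for ring in SYNONYM_RINGS:
        if normalized in ring:
            return set(ring)
    return {normalized}


def search(query, documents):
    """Return ids of documents whose text contains any expanded term.
    documents maps a document id to its text."""
    terms = expand_query(query)
    return [doc_id for doc_id, text in documents.items()
            if any(term in text.lower() for term in terms)]


if __name__ == "__main__":
    docs = {
        1: "A survey of UAV navigation",
        2: "Unmanned aerial vehicle regulation",
        3: "Drone delivery trials",
        4: "Equine nutrition",
    }
    print(search("UAV", docs))  # all three drone-related documents match
```

with the ring in place, a search on the acronym and a search on the spelled-out phrase return the same documents, which is the behavior the default search in kasenchak’s example lacked.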
making semantic search work for you

the second speaker in this session was travis hicks, director of web operations at the american society of clinical oncology (asco), who discussed the steps that organizations can take to optimize their digital content for discoverability when it is indexed by a semantic search model, including content structure and metadata strategy. in addition, he looked at the steps an organization can take to better understand how far they should dip their toe into semantic waters when it comes to their own internal search engines, including the benefits of structured thesauri. hicks opened by saying that content providers must understand all aspects of search because external search engines are the number one method of choice for those seeking to discover unknown content. he added that the wrinkle is that search queries take many forms. as a result, all types of search queries need to be taken into consideration when constructing content - it needs to be optimized for external discovery. (note: this is related to the concept of “mental models” discussed by the next speaker). he noted that good content is discoverable content and, as kasenchak mentioned earlier, google parses content, not only for keywords, but also for phrasing in order to provide additional insight into relevancy (the “harrison ford” vs. “when is harrison ford’s birthday?” example above). in doing this google is able to identify terms that are semantically-linked in its analysis of billions of documents and has dis-incentivized poor content that is limited to keywords. he added that content with a high user value is discoverable content and that there are ten steps that content providers can take to boost search results. these are as follows:

1. focus on your users and their intent in content
2. write clean, concise copy
3. create links to reliable, high-quality internal and external content
4. use structured schemas that leverage updated html tags (see: https://schema.org/)
5. use bullets and organized lists (these are machine-preferred)
6. ensure that your content is mobile-friendly
7. utilize taxonomies
8. make sure site performance is strong - fast loads rank higher
9. ensure that the physical address is machine-readable
10. pay for it with google adwords (then again, maybe not - this can get expensive!)

he noted that it is important to understand user intent and to do this you need to review internal analytics; e.g.:

• what types of queries do your users use (e.g., keywords vs. natural language)?
• what are the topical buckets of your queries (e.g., what terms/concepts are users generally looking for)?
• what terms are searched, but do not produce any results?
• are your users more likely to search internally or externally?
• if you index content from multiple sites, do users search to get to those sites?
• if you have facets, do users actually use them?

he added that it is equally important to conduct user research: what is the overall level of satisfaction with search results and, if satisfaction is low, what trips users up? what areas or content types are the least discoverable? and what facets may be helpful in enabling users to access the content they are seeking? he agreed with kasenchak in that he encouraged all content providers in the audience to create and support their own thesauri, which he defined as a hierarchical classification system of terms. a good thesaurus will employ synonyms and facilitate the identification of semantic relationships. the addition of metadata will also enhance discoverability, and he added that synonyms can be utilized by search platforms such as solr (see: https://lucene.apache.org/solr/). indeed, the use of a thesaurus will enable a content provider to identify previously-unknown relationships across their content.
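two of the analytics questions above (keyword versus natural-language queries, and terms that produce no results) lend themselves to a quick pass over a search log. the python below is a rough sketch, not hicks’s method: the log format of `(query, result_count)` pairs, the list of question words, and the function names are all assumptions made for illustration, and the natural-language test is a deliberately crude heuristic.

```python
from collections import Counter

# Hypothetical heuristic: a query that opens with one of these words,
# or ends with a question mark, is counted as natural language.
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how",
                  "is", "are", "can", "does"}


def classify_query(query):
    """Label a query as 'natural language', 'keyword', or 'empty'."""
    words = query.lower().split()
    if not words:
        return "empty"
    if words[0] in QUESTION_WORDS or query.strip().endswith("?"):
        return "natural language"
    return "keyword"


def summarize_log(log):
    """log is an iterable of (query, result_count) pairs.
    Returns (counts per query type, zero-result queries by frequency)."""
    log = list(log)
    kinds = Counter(classify_query(query) for query, _ in log)
    zero_hits = Counter(query.strip().lower()
                        for query, count in log if count == 0)
    return kinds, zero_hits.most_common()


if __name__ == "__main__":
    # Invented sample log, for illustration only.
    sample = [
        ("breast cancer", 120),
        ("what are the side effects of chemotherapy?", 4),
        ("immunotherapy", 0),
        ("Immunotherapy ", 0),
        ("dues", 3),
    ]
    kinds, zero_hits = summarize_log(sample)
    print(kinds)
    print(zero_hits)
```

the zero-result list is the part most directly actionable: each entry is either a content gap or a candidate synonym for the thesaurus.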
hicks cautioned that it is very important to understand both the potential upside and the limitations of your internal search engine - it is unlikely that it is equivalent to google. know what it can and cannot do, and if there is functionality that is not being leveraged, identify what needs to be done to unlock the system’s capabilities. other reality checks that are equally important concern the staff resources that you actually have available: do you have internal development staff? do you have individual(s) involved with ongoing system evaluation? do you have a search governance plan for your search platform? and is there someone ultimately accountable for search performance - where does the buck stop? he used his own organization as a case study. their website provides information to their members as well as to the general public. there are about . million page views on an annual basis and about one hundred and eighteen thousand unique searches were executed in . the site has two primary user groups - “heavy” users such as board members, volunteers, and internal staff, and “occasional” users, such as the average member, meeting attendees, pharma staff, and the general public. the major challenge that they were facing was the fact that they use solr as their search engine and the specific model and relevancy algorithm was last updated in . in fact, while there had been some ongoing maintenance, no major work had been done since the last update. the primary issues that needed to be addressed were a general dissatisfaction by all user groups with search result relevancy - indeed, relevancy tuning had not kept up with content and business needs - and the fact that user expectations had shifted because of their experience with google: they needed a new user interface, new technology, and new functionalities.
an examination of their search logs told them that searches were primarily keyword - there were only twenty-two searches that could be considered natural language searches and only eleven that could be considered advanced. they found that they lacked content for some queries and that other content needed to be refocused. they also found that more than half of the search queries involved basic cancer types, treatments, etc. so what did they do with the information? their plan for improvement included, among other things, adjusting proximity parsing, enhancing taxonomies, adding new/improved search models, implementing a relevancy score floor, and improving the user interface. the relevancy score floor was implemented to reduce the number of overall results (noise). if content doesn’t reach a certain relevancy threshold, it will not be presented as a result; the threshold is determined based on testing after initial changes were made (e.g., where do results tail off?). with regard to the user interface, they added type-ahead and contextual searching (did you mean?), and they distinguished actions such as paying dues from search. success was measured via user experience testing that provided feedback from end users on the search experience over time as improvements were implemented. the first step was to have users complete search-related tasks with the search functionality and existing user interface in order to document usability of search/satisfaction with the current state. the second step was ongoing testing: as major milestones were met to improve search, users were asked to perform the same set of tasks that they completed during the baseline testing. by comparing the results of each round of testing, they confirmed that improvements had a positive impact on search usability and identified additional areas for improvement.
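the relevancy score floor described earlier in this case study can be sketched in a few lines. the python below is an assumption-laden illustration, not asco’s implementation and not a solr api: the talk says only that the threshold was chosen by testing where results tail off, so the `drop_ratio` heuristic used here to find that point, like the function names and sample scores, is invented for this example.

```python
def apply_score_floor(results, floor):
    """Keep only (doc_id, score) pairs scoring at or above the floor,
    ordered from most to least relevant."""
    kept = [(doc_id, score) for doc_id, score in results if score >= floor]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def suggest_floor(scores, drop_ratio=0.5):
    """Hypothetical heuristic for 'where do results tail off?':
    walk the scores from best to worst and return the last score
    before the first drop to below drop_ratio times its predecessor."""
    ordered = sorted(scores, reverse=True)
    for previous, current in zip(ordered, ordered[1:]):
        if previous > 0 and current < previous * drop_ratio:
            return previous
    # No sharp drop found: keep everything (or 0.0 for an empty list).
    return ordered[-1] if ordered else 0.0


if __name__ == "__main__":
    # Invented relevancy scores, for illustration only.
    results = [("a", 9.1), ("b", 2.0), ("c", 8.2), ("d", 8.7), ("e", 1.5)]
    floor = suggest_floor([score for _, score in results])
    print(floor, apply_score_floor(results, floor))
```

in practice the floor would be validated against user testing rather than derived from scores alone, since a floor set too high hides legitimate results.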
the third step was to rinse and repeat. for the long-term they will look at machine-assisted relevancy optimization; use solr logs and analytics to aid indexing and relevancy; and consider the implementation of personalized search that could initially be based on simple user history, including profile data, later adding known online user behavior and individual click data. they will also consider voice-assisted search. in closing, hicks offered the following conclusions: first and foremost, understand the expectations of your users, both internal and external; develop a strategy for content optimization and stick to it; high-value user content is, by its very nature, findable and discoverable; there is no silver bullet for improving internal search - you must constantly reach out to your users; and finally, all efforts need ongoing evaluation and iterative testing of improvements. hicks’ slides are available on the nfais website.

information design considerations for effective semantic search

the final speaker in this session was duane degler, strategic design consultant for design for context, llc, a consulting firm that specializes in usability and user-experience design (see: https://www.designforcontext.com/). degler opened by saying that semantic search seeks to enhance the meaning in content and to more closely align the searcher with the available information resources. as a result there needs to be a strong user-centered aspect in order to unlock the benefits. degler noted that users are highly knowledgeable and that their search experiences are longer. today, documents are larger, more topics are covered, and there is a lot of repeated information, so it is difficult for users to know what is and what is not relevant to their search. he talked about “mental models” and i was not sure what he meant, but apparently that is user experience design jargon. basically, a mental model describes how you or i would do something. it’s our thought process.
it’s our expectations [ ]. degler said that searchers have a variety of mental models. these are: survey (“something about…”), targeted (“find something i know…”), exploratory or archival (“do not miss anything”), routine (“regular searches”), or collaborative (working with multiple authors). the use of semantics can be very helpful when it comes to these “mental models,” but we must understand our users and use semantics optimally, such as in navigation and term relationships. he asked if the meaning of a search term includes context and commented that a user’s task has its own language. the task (an element of user context) shifts perception of what matters in content, so we need to ask if our sites and content need to respond to tasks. usually, if users are clear on the relationship between information and task, then their focus on the information subjects should be sufficient and they can draw meaning based on their context. but in dense information environments, users may need help, and the use of semantics can provide it. in particular, the use of semantics is extremely useful when it comes to search, content navigation, and term relationships. we need to know if an internal search is a first stop, a next step, or a last resort and need to consider the user’s journey and path analysis. we also need to consider the semantics of our site as the language used for navigation, links and headings sets user expectations of relevance. semantics plays a role in focusing search, in providing the pathways for browsing, and in building relationship structures across terms and concepts. in closing degler noted that as we model content, we must recognize that its character, structure, and context all matter. degler’s slides are available on the nfais website.
readers may also be interested in slides from some of his related talks that can be accessed at: https://www.designforcontext.com/insights/search-mental-models.

democratizing data - whose job is it?

the final day of the conference opened with a plenary session by dr. daniel barron, a resident psychiatrist at yale university. the focus of his presentation was on the challenges that we face today as more and more personal data is captured - what information researchers need, how others use it, what the implications are, and what policies are needed. his presentation on the need for open science and data sharing was refreshingly logical and non-combative. he noted that he is a clinician - both a resident physician and a clinical neuroscience trainee - and that he is also a researcher. his dissertation was on brain damage in patients with temporal lobe epilepsy. he uses open science tools every day, but has been warned that science is a competitive business, and to prove the point he gave the example of jack gallant, a neuroscientist, who published research in nature, but did not share his data for fear of competition. barron noted that what followed was “trial by twitter.” gallant was publicly shamed for not sharing his data and for a period of three weeks there were multiple twitter threads that included an attempt to get him banned from nature and to lose his research funding (see: https://twitter.com/gallantlab/status/ ?lang=en). gallant himself was a user of social media, but at the time of the nfais conference he was relatively quiet. barron went on to say that there are many ways of sharing data. for example, one can deposit data in a publicly-available repository or post it on a website. he noted that the latter is not a common practice in his discipline because of the data file sizes. they use mri scans ( –  mb each), usually eight per subject, and a study could include a hundred patients.
neuroscience data usually resides in a non-public repository that requires a credentialed login and password, or a credentialed login with certain strings attached such as citation requirements, authorship agreements, and peer-to-peer collaborative agreements. researchers share data to facilitate transparency and reproducibility and to avoid duplication of studies and bias in reports. also, many funding agencies around the globe demand that the results of government-funded research be made accessible. he noted that in the u.s. clinical trial data is considered confidential. the u.s. only requires that a summary report (including the questions asked) be filed. however, if the fda needs to see the results, it can request the details and has the right to see them. as of , the european union has said that clinical trial data is not confidential. ownership of data depends upon where you are located. researchers also share because data is expensive, and he gave the following example. between and there were about twenty-two thousand mri studies published. he said that if you take a conservative estimate that these papers represent twelve thousand studies with twelve patients per study, the time required to do the mri scans would be one hundred and forty-four thousand hours (one hour per mri). at yale the cost per mri (excluding ancillary labor costs) is six hundred dollars, so the cost of those studies - just for doing the mris - was $ . m. researchers share to exploit a limited and expensive resource through thoughtful stewardship. he noted the open fmri group at stanford that facilitates the uploading of these data sets to make them accessible.
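the scanning estimate above can be checked with a few lines of arithmetic. the following is a minimal sketch that uses only the figures quoted in the talk (twelve thousand studies, twelve patients per study, one hour and six hundred dollars per mri); the variable and function names are illustrative, and the assumption of one scan per patient is inferred from the quoted total of scanner hours rather than stated outright.

```python
# figures quoted in the talk; names are illustrative only
STUDIES = 12_000            # conservative number of studies
PATIENTS_PER_STUDY = 12     # patients scanned per study
HOURS_PER_SCAN = 1          # one hour per mri
COST_PER_SCAN_USD = 600     # quoted cost per mri, excluding ancillary labor


def scan_totals(studies, patients_per_study, hours_per_scan, cost_per_scan):
    """return (total scans, total scanner hours, total scanning cost in usd).

    assumes one scan per patient, which is what the quoted hour total implies.
    """
    scans = studies * patients_per_study
    return scans, scans * hours_per_scan, scans * cost_per_scan


scans, hours, cost = scan_totals(
    STUDIES, PATIENTS_PER_STUDY, HOURS_PER_SCAN, COST_PER_SCAN_USD
)
# the hour total matches the one hundred and forty-four thousand hours quoted above
print(f"{scans:,} scans, {hours:,} scanner hours, ${cost:,} in scanning costs")
```

changing any one constant shows how sensitive the total is to the "conservative" inputs; for example, the eight scans per subject mentioned earlier would multiply both totals by eight.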
the data is put in a standardized format and tagged for ease of retrieval (note: open fmri no longer exists; it is now open neuro (see: https://openneuro.org/), but all files uploaded to open fmri remain available (see: legacy.openfmri.org)). sharing is essential because science is collaborative. barron noted that he relies on the assistance of computational data scientists with whom he has to share data sets. also, researchers build on the work of their peers, which is yet another reason why they share data: they share to expand and advance science. if sharing is so important, why the battle to control data? one of the reasons jack gallant did not share his data was that he did not want his work to be pre-empted, as his grant funding could be jeopardized. others may be seeking patents for themselves or their organization, and future revenues are on the line. clinical data often includes personal data that is protected by law such as hipaa (the u.s. health insurance portability and accountability act of - see: https://www.hhs.gov/hipaa/index.html). also, if a researcher has a potentially-valuable clinical idea, significant investment will be required to bring that idea to fruition, and investors need to be shown that the intellectual property that is the basis of the idea is safe and protected; i.e., that the research results are indeed proprietary. investors expect a return on their investment. barron went on to look at who actually owns data and noted that there are legal owners. sometimes it is the person or organization who paid to collect the data, such as a pharmaceutical company. sometimes it is the institution where it’s collected, such as a university. and sometimes ownership is specified in a legal contract. he said that usually someone or something owns data and that data ownership is a legal issue, not an ethical or moral one. he noted that in a world of open science there are many discussions about who should control ownership.
he said that it is not an issue of who should, but who does. in academia it is usually specified in a contract, and he said that if researchers do not like the current state of affairs they need to change the system, which he believes is happening right now. the twitter storm against jack gallant is just an example of the passionate feelings surrounding the sharing of data. but he views that as a bottom-up approach to change, and he strongly believes that a top-down approach, such as the proposed legislation from nih, is better [ ]. why? because there are many types of data, some of which need to be proprietary (clinical data) and some of which need to be quickly shared. also, there are many stakeholders, and rational and constructive conversations need to be held (this sounds like jon tennant’s statement that reasonable and civil conversations are more effective than reactive and disruptive social media rhetoric). in closing, he referenced an article that he just published on open science [ ] and a podcast response to that article from orion open science [ ]. barron’s slides are accessible on the nfais website.

. shark tank shoot-out

the second session of the day was a “shark tank shoot out” in which four start-ups competed. each had ten minutes to convince a panel of judges that their idea was worthy of potential funding (the “award” was actually a time slot on a future nfais webinar). this session has been a “tradition” at the conference since and can be quite entertaining – and informative as well. this year was no different. the judges for the session were: kent anderson, ceo, redlink; ann michael, president and founder, delta think, inc.; and jignesh bhate, founder and ceo, molecular connections. . .
a maritime vessel index

the first presenter was peter mccracken, co-founder and publisher, shipindex.org (https://www.shipindex.org/), who spoke about a database that he has created for both maritime and genealogical research. he noted that the database can help you learn more about the ships that interest you – and it will lead the user to the books, journals, magazines, newspapers, cd-roms, websites, and online databases that mention them. simply put, shipindex is, at its core, an index. the business model offers both free and fee-based services. to provide access, the database uses a “freemium” model, in which over , citations are freely-available to all, without any registration or payment. the full database, currently at nearly . million citations, is available through subscription. individuals can subscribe for a set period of time - from $ for two weeks of access to $ for a year of access - or for a recurring monthly fee of $ , with coupon codes available to bring that down to $ per month. mccracken admitted that throughout its ten-year existence, shipindex has faced a number of challenges and that these continue, but that he and his team are committed to seeing the database succeed. he also alluded to a new business relationship that will help him reach his objective. i found the database very interesting, but as even mccracken said – it is a very niche product. however, he has a history of success. in , he co-founded serials solutions with his brothers and a high school friend and the company was acquired by proquest in . who knows where shipindex may end up? none of the slides from this session are posted on the nfais website. however, a paper based on mccracken’s short presentation appears elsewhere in this issue of information services and use. . .
information streaming

the second presenter was violaine iglesias, ceo and co-founder [ ] of cadmore media, founded in june , a company that provides video and podcast streaming for scholarly and professional organizations (https://cadmore.media/). she opened by asking the audience for a show of hands: ( ) if they ever watched a video on youtube (everyone), ( ) if they ever wished that the longer videos were shorter and got to the point more quickly (everyone), if they ever wished that the content of the video was available to read (just about everyone), and if they ever stopped watching because the video was either too long or too boring (everyone). with that, she said that everyone should be interested in her company because they are in the business of making video and audio material snappy and engaging while being informative at the same time. she noted that youtube is the second most popular search engine and that all content providers need to be visible on that site. video streaming [ ] is growing and is very popular with the younger generations, and it is also widely used as a teaching tool. she added that even surgeons use it to learn and fine-tune their skills, and that she would feel better as a patient if such teaching videos were curated and peer reviewed. she said that watching a video as opposed to reading a document can be more challenging – videos cannot be quickly scanned, the viewer may have difficulty if the speaker has an accent or uses unfamiliar terms, etc. her organization actually supplies manuscripts of the videos, creates metadata, and provides workflow tools and a media player to help content providers make their content usable and findable via streaming media.
iglesias gave a brief demo of the video player and showed how a transcript is displayed (transcripts can be provided in any language). the viewer can go to any part of the transcript to quickly access that part of the video – an asset for lengthy teaching tools. the demo was impressive (see: https://www.cadmore.media/playerdemo). their major competitor is youtube, which does not offer as many services as cadmore media does, such as the creation of transcripts and metadata and the assignment of dois. the service has a tiered subscription model for large and small organizations and is based upon content, volume and usage. a paper based upon iglesias’s presentation appears elsewhere in this issue of information services and use.

. . credit reports for responsible science

the third presentation was by rebekah griesenauer, data engineer, ripeta, and leslie mcintosh, ceo and co-founder of ripeta (https://www.ripeta.com/), an organization that is focused on improving the reproducibility of science by creating what is basically an automated credit report for responsible science. they said that the ripeta software identifies and extracts the key components of a research report/article, thereby shortening and improving the lengthy publication process while making the methods more easily discoverable for future reuse. this is important because scientific research is based on the principles of the scientific method, which assumes that research can be reproduced in future experiments. yet, an estimated fifty to eighty-five percent of research resources are wasted due to the lack of reproducibility. this not only wastes resources, it also reduces confidence in science as a whole. ripeta’s goal is to provide researchers, publishers, and funders with a streamlined method for assessing the quality and completeness of the scientific research. they do not judge the quality of the science that is being reported - they are evaluating the robustness of the report.
does there appear to be sufficient information to reproduce the experiment(s) contained therein? digital science (https://www.digital-science.com/) is a partner in this effort and a demonstration is available at: https://demo.ripeta.com/ - definitely worth a view! basically, a paper can be uploaded and scanned to see if best practices for reporting research have been followed, and a “credit report” is automatically generated. a new software release is planned for april . a paper based upon their presentation appears elsewhere in this issue of information services and use.

. . using artificial intelligence to identify concepts

the final speaker was nicole bishop, founder and ceo of quartolio (https://quartolio.com/), a company founded in that has created a cross-discipline research platform that transforms scientific documents into research intelligence with artificial intelligence (ai) [ ]. she noted that researchers read only about one percent of scientific articles and one can only wonder what key research is going unread. part of the problem is the paywalls to journals as well as the length of the publication process, and she believes that the open access movement will resolve these issues. the number of research publications is another part of the problem, which she believes can be solved using ai. her organization is attempting to leverage the growing body of open access research publications to create an ontology of scientific research through the application of ai. at quartolio they are using natural language processing to extract information from millions of oa articles to connect the dots across scientific disciplines and identify concepts that are then aggregated into “folios” in a digital library.
it is a cloud-based platform that uses library science and ai so that researchers can “discover, manage and curate research” in the folios. each folio is a collection of all the documents that the company has compiled on a single scientific concept. they also provide a service to organizations to do a similar analysis on their internal documents. clients upload their documents and the system applies the ai to “connect the dots.” they can process about five thousand documents in six hours. to date they have $ k in investment funding and as of the nfais conference they were in the process of closing a $ . m round. the judges closed the session and later in the day announced the shark tank winner: cadmore media.

. lightning talks

the final session of the morning was a series of lightning talks (six minutes duration each). there was no specific theme. they were to be “short, concise presentations that include sensible solutions to problems, overviews of new projects, news, etc., with the goal of sparking ideas and debate.”

. . discovery

the first speaker was mark gross, president, data conversion laboratory (dcl, http://www.dclab.com/), who talked about the challenge of having your content discovered. for some searchers google delivers – for some it does not. he used a search on brian may as an example - the results do not bring up any of may’s scholarly articles (refer back to kasenchak’s presentation for more examples). he referred to an article in the guardian [ ] which said that it takes an average of fifteen clicks for a researcher to access an article! he noted that scholarly publishers have a complex web to navigate as users seek content through many channels and from many locations, using a diverse array of devices. there is no “standard” xml format across the discovery vendors (ebsco, oclc, proquest, etc.). he noted that the challenge for publishers is to bring together their content and metadata and deliver them to all of their vendors at the right time.
but how can they do that? gross then went on to describe the services that dcl offers that can help content providers to meet that challenge. they offer digitization, taxonomy creation, semantic enrichment - all of the key discovery factors that were mentioned earlier by kasenchak, hicks, and degler. they have been in the business since , have seen the industry evolve, and have the knowledge and experience to be of assistance.

. . facilitating computational peer review and research reproducibility

the second speaker was pierre montagano, director of business development, code ocean. if you read the section on the miles conrad lecture you know a bit about the organization, as marty kahn, the miles conrad lecturer, is the chairman. code ocean (see: https://codeocean.com/) is a cloud-based, open, online code execution platform that integrates with any scholarly platform. it provides researchers and developers with an easy way to share, discover and run code that is published in academic journals and conference proceedings. it allows users to upload code, data, or algorithms and run them with a click of a button. the platform enables reproducibility, verification, preservation, and collaboration without any special hardware or setup. code ocean provides next generation tools to facilitate digital reproducibility, where users can access a working copy of a researcher’s software and data, configure parameters and run it regardless of the users’ operating systems, installation, programming languages, versions, dependencies, and hardware requirements. montagano reported that code ocean has recently launched a new service that provides publishers with a private link for executable code uploaded by the author during submission.
using container technologies, code execution is agnostic to programming languages, versions, or operating systems. the link can then be shared with peer reviewers who can easily change parameters, modify the code, upload data, run it again, and properly vet the submission. this will hopefully empower reviewers to conduct a more rigorous review of the science and help ensure reproducibility. this is being done in partnership with a number of publishers, including nature and elsevier. code ocean has a history with nfais. simon adar, code ocean founder and ceo, actually participated in the nfais shark tank shoot out the same year that he launched the company. at that time he noted that more and more research results include actionable data or code, but that the dissemination of that code relies on individuals to set up environments to reproduce the results. code ocean did not win, but based upon kahn’s earlier comments, it looks as though it is doing quite well.

. . author choices in an open access world

the third presenter was serena tan, senior editor, publishing development, john wiley & sons, inc. she opened with an overview of major open access [ ] (oa) initiatives that have taken place over the last two decades.
• : pubmed central is founded and biomed central, formed in , becomes the first commercial “born open access” publisher
• : the budapest oa [ ] initiative is signed; plos is founded and funded
• by : elsevier, wiley, taylor and francis, the royal society of chemistry, the american chemical society, and more all launch hybrid oa programs
• : nih mandates a twelve-month green oa policy
she noted that since then oa has really gone mainstream and publishers have been developing new models; e.g. wiley launched a full gold oa program. plos one has become the world’s largest oa journal. the usa expanded green oa to all federal funders. funders in china and india have also launched green oa policies.
the gates foundation launched a gold oa mandate, and this year wiley signed a read and publish deal in germany that covers the cost of oa publishing and subscriptions for their hybrid titles. the european union has announced oa  [ ] – and of course there is the ever-present plan s (see: https://www.coalition-s.org/). tan said that wiley believes that publishers and scientific societies play an essential role in enabling researchers to do their best work and that wiley embraces open access. it continues to develop sustainable new business models that support author choice and accelerate open access, and it innovates to create new products and services around open access and open research in order to meet researchers’ needs. she noted that authors want to choose how they disseminate the results of their research; that not all authors have the money to spend on article processing charges (apcs); and that they want publication outlets that match their desires and needs around scope, audience, and speed. as noted several times throughout this conference, in order to meet these needs wiley became the first publisher to partner with projekt deal for a countrywide deal in germany to better address the growing research market and evolving needs of researchers. projekt deal is a consortium of german libraries and research institutions commissioned by the alliance of science organizations in germany, represented by the german rectors’ conference, the hrk. deal represents more than seven hundred mainly publicly-funded academic institutions in germany, including the most important science and research organizations. tan went on to say that the agreement preserves author choice and promotes greater access while supporting great science and scholarship.
it provides german institutions with access to read wiley’s entire portfolio of electronic journals (e-journals) back to the year . in addition, corresponding authors from deal institutions will be able to publish in wiley’s gold open access journals without worrying about their ability to pay the apcs. they will also be able to publish open access articles in wiley’s hybrid journals, at no additional charge, if they so choose. in closing, tan said that wiley supports author choice by: providing both green and gold open access options for authors; converting previously-hybrid titles to gold oa when sustainable; supporting oa titles that select manuscripts based upon their potential to advance thinking in the field; launching sound science titles that support the reporting of incremental findings; supporting authors without access to apc funds; collaborating on sustainable solutions that best enable researchers to take advantage of these publishing options; and innovating in open research to support the aspirations of the diverse communities we serve. tan’s slides are available on the nfais website.

. . what is a library?

scott livingston, executive director of library management systems at oclc, was the next speaker and he gave quite an entertaining and informative presentation. he opened by saying that in the usa when people think “public library” they automatically think “books.” the figure he quoted was that seventy-five percent of all americans have that reaction. he added that library usage is declining - people just don’t use them as much as they did in the past, and that library circulation is declining as a result. so as a public librarian what do you do? livingston said that the heart of a library is not comprised of books, it is comprised of people, and that libraries really are about community engagement. but since the interests of communities are quite diverse due to factors such as geographic location, industries, etc., not all public libraries are equal.
hence, oclc offers a new library system, wise, that is actually a community-engagement system that allows librarians to more easily and quickly gauge the interest of their current and potential patrons. according to marshall breeding, “oclc wise has built-in tools to help the library develop and manage its collection in response to use patterns. the product relies on its internal, real-time data rather than having to rely on exports to a third-party service… these automated processes are based on policies and thresholds set by the library, which can be updated as needed and these processes result in a customer-driven collection development strategy” [ ]. livingston noted that wise uses key functions, such as circulation and acquisitions, to analyze usage data that can inform librarians how their patrons use or do not use their collections. it is designed around people, both library users and library staff, with the goal of delivering great public library experiences and enabling libraries to evolve as their communities evolve. it also allows library collections to reflect the preferences, not only of the entire community, but also at an individual branch level. in closing, he said that he is scott livingston - and he is not a book! livingston’s slides are on the nfais website and a brief paper based upon his presentation appears elsewhere in this issue of information services and use.

. . the future of information access

the next speaker was john seguin, president, third iron, llc, a library technology company (https://thirdiron.com/), who discussed some of the technology approaches to making information access easier while in parallel reducing the piracy rampant in the information industry.
specifically he addressed ra (https://ra .org/), google’s campus activated subscriber access (casa - https://www.igi-global.com/librarians/casa/), and third iron’s libkey (https://thirdiron.com/libkey-discovery/). he said that in order to make the information access process simpler, the systems must work with and understand the authentication mechanism that an institution (university, library, etc.) uses. they must understand what access a user is entitled to before the user authenticates, as well as understand the access rights to content – is it open access material or subscription-based content? if the latter is not available, the system must route the user to the institution’s fulfillment mechanisms. the goal is to generate links as close to the content item (pdf) as possible. he added that all three of the above systems have the same goals, but that they have different starting points and that all have limitations. casa is only available when a searcher uses google scholar and it is ip-based. for ra , an institution must use security assertion markup language (saml) [ ] and he noted that not all publishers participate in ra . also, ra is not aware of entitlements and this can produce confusion for users. libkey must be supported by the institution, but is not limited to searches via google scholar nor to participating publishers. he added that they have ten million users across more than seven hundred institutions in twenty-five countries. in closing, seguin said that because these technologies overlap in numerous ways, but do not compete with each other, users can easily utilize the technologies that are best for them to simplify their access journey. it should be noted that at the time of the nfais conference ra was still being developed. the final recommendation was just released by niso on june ,  [ ]. seguin’s slides are on the nfais website and a brief paper based upon his presentation appears elsewhere in this issue of information services and use. . .
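the routing requirement seguin described (check whether content is open access, check the user's entitlement, and otherwise hand off to the institution's fulfillment mechanism) can be sketched as a small decision function. this is a hypothetical illustration only: the class, field, and function names below are invented for the example, and it does not represent how any of the three systems discussed in the talk is actually implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    """possible outcomes of the routing decision (names are hypothetical)."""
    DIRECT_PDF = "link straight to the pdf"
    AUTHENTICATE = "send the user through the institution's login first"
    FULFILLMENT = "hand off to the institution's fulfillment mechanism"


@dataclass(frozen=True)
class Article:
    identifier: str      # placeholder identifier, not a real doi
    journal: str
    open_access: bool


def choose_route(article, subscribed_journals, already_authenticated):
    """pick the shortest path to the content item for one user request."""
    # open access material needs neither entitlement nor login
    if article.open_access:
        return Route.DIRECT_PDF
    # subscription content the institution is entitled to
    if article.journal in subscribed_journals:
        if already_authenticated:
            return Route.DIRECT_PDF
        return Route.AUTHENTICATE
    # not entitled: route to fulfillment (for example, interlibrary loan)
    return Route.FULFILLMENT


if __name__ == "__main__":
    holdings = {"journal a"}
    paywalled = Article("example-1", "journal b", open_access=False)
    print(choose_route(paywalled, holdings, already_authenticated=True).value)
```

the point of the sketch is the ordering of the checks: knowing entitlements before the user authenticates avoids sending someone through a login that ends at a paywall, which is the confusion the talk attributed to systems that are not aware of entitlements.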
a research data management academy

the final speaker in this session was jean shipman, vice president of global library relations for elsevier, who discussed the launch of a new free, online research data management (rdm) librarian academy. the main audience for the academy training is librarians of all types: academic, medical, special, public, government, and school. however, the key audience is the practicing librarian who is unable to leave the workforce to obtain additional formal training in rdm principles and best practices. as an aside, she noted that researchers can also benefit from participation. the training modules will be available to anyone around the world who has internet access. there are eight total units within the curriculum and each unit may be taken alone; if all the units are completed, continuing education certificates will be issued to those who want them. the “academy” is being developed by a team that includes librarians from harvard medical school, tufts health sciences, massachusetts college of pharmacy and health sciences (mcphs) university, boston university school of medicine, northeastern university, elsevier, and simmons university. simmons university will be the institution that grants continuing education credits, but a fee will be required to cover the costs of providing such documentation. shipman noted that this is a unique partnership between librarians, library educators, and a publisher. the need for this training was demonstrated through interviews and surveys that identified gaps in current training offerings and highlighted what skills librarians and researchers need to contribute to their rdm success. an inventory of existing courses was prepared as well.
in closing, she noted that the academy will be launched later this year. shipman’s slides are on the nfais website and a brief paper based upon her presentation appears elsewhere in this issue of information services and use.

. designing user experience

the final presentation of the conference was a very visual and humorous talk by willy lai, vice president of user experience at macy’s. he gave example after example of designs fated to provide dreadful user experiences. one was a hotel room safe that could only be locked by using your credit card – but then the safe was locked and your credit card was outside the safe. he showed photos of bathrooms where the toilet paper roll was totally unreachable – too far away. same thing for a bank drop off window – far too high! the visuals were extremely humorous and he more than made his point. lai said that the definition of user experience (ux) is “a person’s perceptions and responses that result from the use or anticipated use of a product, system, or service.” [ ] a bad ux design can hurt business, and lai said that studies show that seventy percent of customers abandoned a purchase due to the amount of frustration derived from trying to make that purchase online. about sixty-seven percent of customers said that a bad online purchase experience leaves them with a negative impression of the brand. he used zulily as an example. on their old website a customer had to fill out a form before they could even see what was being sold. now you can browse all you want and only need to register when you want to make a purchase (note: i used the site to check it out, and registration takes a second – not the hurdle that it once was!). lai said that good ux design has the opposite effect - it is really good for business (as zulily learned). he said that a well-designed site can have as much as a two hundred percent higher visit-to-order conversion than one that is poorly designed.
he added that some studies have shown that every dollar invested in ease-of-use can return anywhere from ten dollars to one hundred dollars, and a side benefit is that people are less likely to abandon a purchase. he suggested taking a look at vwo (visual website optimizer) at https://vwo.com/ and noted that the site has some good reading material on ux design (i checked it out and it does!). lai said that a good design needs three components: it needs to be business-viable, technically-feasible, and user-desirable. he added that users want everything, but actually need less - you need to learn what is at the core of what they want, and he used henry ford as an example. ford’s customers did not know that they really needed a car - they said that what they wanted was faster horses. lai said that you should not bring users in only for validation of the final product. the strategic approach is to involve them from the very beginning. you need to know and understand their unmet needs; involving them early gives you an opportunity to know what you don’t know and to see the user’s world. lots of feedback brings clarity and if there is ever confusion – just dig deeper. lai closed with some guiding principles for ux design:
• design for your target audience.
• provide all the essential information at the upper part of the site “above the fold” ( % of the time that users spend on a site is near the top – they do not scroll down).
• promote helpful information and make it look like relevant content (just like amazon).
• shrink or eliminate forms to be filled out, which will result in significantly more conversions and increase order values (learn from zulily).
• users do not read digital sites; they scan them first and then read what interests them.
his closing comment was remember that frank lloyd wright said that he never designed or built anything until he visited the site and got to know the people who would be living in “his” house. get to know your user! lai’s slides are not available on the nfais website. . conclusion when reading the title of the nfais annual conference, creating strategic solutions in a technology-driven marketplace, i thought that the speakers would focus on technology itself, but the real message, perhaps from my biased perspective, was that in order to adapt to changes in today’s information community, content - both new and existing - should and must be the focus of our attention. technology’s chief role is to increase the value of that content - through enhancements, increased ease of access and use, all the way to the creation of new services and perhaps new forms of content. from the opening keynote that discussed the use of the internet-in-a box to store and disseminate relevant, high-quality medical information for offline utilization in rural areas, to the rejuvenation of iet’s fifty-year old inspec database via the use of semantic tagging - it was content that reigned center stage. the key message is that stakeholders in the information community must adapt to change in order to remain relevant. the same can be said of almost every presentation. how university presses are using technology to enhance the value of their content in addition to improving the ease of accessibility and use of that content. how libraries are using technology to offer the services that their users have come to expect from commercial services such as amazon. how new entrepreneurial start-up companies are using technology to make the audio component of traditional videos readable and searchable. 
they all focused on improving the relevance of their content and services through the use of appropriate technologies with one additional caveat - all applied technology with the needs and expectations of their user communities in mind. the conversation was very informative and demonstrated that today’s stakeholders are not standing still wondering how to move forward - they are actually moving forward. to sum up, i paraphrase vincent cassidy’s presentation. he said that as an organization iet is no longer “traditional,” but rather has evolved into a one hundred and forty-eight year old “start-up” which is strange, exciting, and invigorating. he emphasized that this “new life” is not about technology - it is about data and innovative uses of data. what has made the nfais conferences so interesting and valuable over the years is that nfais provides a neutral venue in which controversial issues can be discussed productively and with respect for differing opinions, and this year continued the tradition. what was new was the fact that more than one speaker requested an end to “unproductive” rhetoric from those on both sides of the open access movement and less use of social media that often provides a knee-jerk reaction to a situation rather than the expression of a rational, well-constructed viewpoint. even jon tennant, who is very much for open access, stressed that we need “to understand the changing roles of stakeholders such as editors, librarians, publishers, etc., reconcile changes across/between disciplines/communities having diverse norms, practices, and biases; and resolve - with civility - the major tensions that exist between all of the stakeholders.” in closing i leave you with the following quotes: “our technology produces a state of chronic revolution” - aldous huxley - .
so even if today’s “revolution” ends, be prepared for the next, and to help you prepare, keep in mind the following recommendation: “after you’ve done a thing the same way for two years, look it over carefully. after five years, look at it with suspicion. and after ten years, throw it away and start all over.” alfred edward perlman (https://www.quotes.net/quote/ ). there is no information on a potential “nfais” conference in . niso indicated that they are considering the continuance of this traditional and well-respected event, but no decision has been announced. watch for details on the niso website at: https://www.niso.org/. note: if permission was given to post them, speaker slides used during the nfais conference are embedded within the conference program at https://nfais.memberclicks.net/ -conference-program, and if they are available, the term “slides” appears highlighted in blue next to the title of the presentation. about the author bonnie lawlor served from – as the executive director of the national federation of advanced information services (nfais), an international membership organization comprised of the world’s leading content and information technology providers. she is currently an nfais honorary fellow. prior to nfais, bonnie was senior vice president and general manager of proquest’s library division where she was responsible for the development and worldwide sales and marketing of their products to academic, public, and government libraries. before proquest, bonnie was executive vice president, database publishing at the institute for scientific information (isi - now clarivate analytics) where she was responsible for product development, production, publisher relations, editorial content, and worldwide sales and marketing of all of isi’s products and services. 
she is a fellow and active member of the american chemical society and a member of the bureau of the international union of pure and applied chemistry, for which she chairs the publications and cheminformatics data standards committee. she is also on the board of the philosopher’s information center, the producer of the philosopher’s index, and she serves as a member of the editorial advisory board for information services and use. she has served as a board and executive committee member of the former information industry association (iia), as a board member of the american society for information science & technology (asis&t), and as a board member of lyrasis, one of the major library consortia in the united states. ms. lawlor earned a b.s. in chemistry from chestnut hill college (philadelphia), an m.s. in chemistry from st. joseph’s university (philadelphia), and an mba from the wharton school (university of pennsylvania), with subsequent studies at insead in fontainebleau, france. contact: chescot@aol.com.

about nfais

founded in , the national federation of advanced information services (nfais™) is a global, non-profit, volunteer-powered membership organization that serves the information community; i.e., all those who create, aggregate, organize, and otherwise provide ease-of-access to and effective navigation and use of authoritative, credible information. member organizations represent a cross-section of content and technology providers, including database creators, publishers, libraries, host systems, information technology developers, content management providers, and other related groups.
they embody a true partnership of commercial, nonprofit, and government organizations that embraces a common mission - to build the world’s knowledgebase through enabling research and managing the flow of scholarly communication. nfais exists to promote the success of its members and for sixty-one years has provided a forum in which to address common interests through education and advocacy. at this conference it was announced [ ] that nfais would possibly be merged into niso, pending membership approval of each organization. this approval has been attained and the merger became official on june ,  [ ]. it marks the end of one era and the beginning of a new one! references [ ] proto-writing, wikipedia, https://en.wikipedia.org/wiki/proto-writing, accessed july , . [ ] the history of paper, wikipedia, https://en.wikipedia.org/wiki/history_of_paper, accessed july , . [ ] printing press, wikipedia, https://en.wikipedia.org/wiki/printing_press, accessed july , . [ ] the history of computers, http://www.softschools.com/timelines/computer_history_timeline/ /, accessed july , . [ ] history of the internet, wikipedia, https://en.wikipedia.org/wiki/history_of_the_internet, accessed july , . [ ] b. marr, a very brief history of blockchain technology everyone should read, forbes, february , , https://www.forbes.com/sites/bernardmarr/ / / /a-very-brief-history-of-blockchain-technology-everyone-should- read/# e bc , accessed july , . [ ] b. lawlor, an overview of the nfais annual conference: information transformation: open, global, collabora- tive, information services and use ( - ) ( ), , https://content.iospress.com/journals/information-services-and- use/ / - , accessed june , . [ ] internet-in-a-box, https://meta.wikimedia.org/wiki/internet-in-a-box, accessed june , . [ ] s.c. 
grover et al., comparison of the impact of wikipedia, uptodate, and a digital text book on short-term knowledge acquisition among medical students: randomized controlled trial of three web-based resources, jmir medical education ( ) ( ),https://mededu.jmir.org/ / /e /, accessed june , . [ ] s. harrison, why wikipedia medical content is superior, future tense, january , , https://slate.com/ technology/ / /wikipedia-doctors-medical-knowledge-study.html, accessed june , . [ ] a. gomez, exploring offline access to wikipedia: dr. samuel zidovetzki on wikipedia’s role in rural health initia- tives, wikimedia foundation, september , , https://wikimediafoundation.org/ / / /wikipedia-offline-access- samuel-zidovetzki/, accessed june , . [ ] e. schulze, everything you need to know about the fourth industrial revolution, davos world economic forum, january , , https://www.cnbc.com/ / / /fourth-industrial-revolution-explained-davos- .html, accessed june , . [ ] ideal - , https://library.osu.edu/events/ideal- -advancing-inclusion-diversity-equity-and-accessibility-in-libraries- archives, accessed june , . [ ] assessment program visioning task force and athenaeum consulting. arl assessment program visioning task force recommendations. washington, dc: association of research libraries, december , . available from: https://www.arl.org/wp-content/uploads/ / / . . -avtf-publicreport.pdf, accessed june , . [ ] j. boone, new library publishing program launched to support faculty scholarship, september , , https://vtnews.vt.edu/articles/ / /unirel-ubiquitypress.html, accessed june , . [ ] national academies of sciences, engineering, and medicine . open science by design: realizing a vision for st century research. washington, dc: the national academies press. https://doi.org/ . / . it can be freely- downloaded at: https://www.nap.edu/catalog/ /open-science-by-design-realizing-a-vision-for- st-century, accessed june , . 
[ ] minimal viable product, wikipedia, https://en.wikipedia.org/wiki/minimum_viable_product, accessed july , . [ ] opacs, wikipedia, https://en.wikipedia.org/wiki/online_public_access_catalog, accessed june , . [ ] faceted search, wikipedia, https://en.wikipedia.org/wiki/faceted_search, accessed june , . [ ] discovery services, library technology guides, https://librarytechnology.org/discovery/, accessed june , .
[ ] g.p. khiste and r.k. deshmukh, discovery services: an overview, library research world ( ) ( ), . [ ] b. lawlor, an overview of the nfais annual conference: information transformation: open, global, collaborative, information services and use ( - ) ( ), – , https://content.iospress.com/journals/information-services-and-use/ / - , accessed june , . [ ] b. lawlor, an overview of the nfais annual conference: the big pivot: re-engineering scholarly communication, information services and use ( ) ( ), , https://content.iospress.com/journals/information-services-and-use/ / ?start= , accessed june , . [ ] based upon comments made during the public review process, implementation of plan s has been postponed until , https://www.coalition-s.org/rationale-for-the-revisions/, accessed june , . [ ] publish and access agreement, projekt deal and wiley, https://pure.mpg.de/rest/items/item_ _ /component/file_ /content, accessed june , . [ ] impact factor, wikipedia, https://en.wikipedia.org/wiki/impact_factor, accessed july , . [ ] robot writes la times earthquake breaking news article, bbc news, march , , https://www.bbc.com/news/technology- , accessed june , . [ ] b. lawlor, an overview of the nfais annual conference: information transformation: open, global, collaborative, information services and use ( - ) ( ), , https://content.iospress.com/journals/information-services-and-use/ / - , accessed june , . [ ] k. brown and o.
pourquiè, introducing prelights: preprint highlights selected by the biological community, posted february , , http://thenode.biologists.com/introducing-prelights-preprint-highlights-selected-biological- community/news/, accessed june , . [ ] j. kaiser, medical preprint server debuts, science, june , , https://www.sciencemag.org/news/ / /medical- preprint-server-debuts, accessed june , . [ ] c. morris, powering research reputations: using real-time reputation building as an incentive to aid researchers in the sharing and discovery of knowledge, information services and use ( ) ( ), – ,  https://content.iospress.com/journals/information-services-and-use/ / , accessed june , . [ ] the novel blockchain consortium for science: bloxberg@blockchain munich meetup, october , , see: https://www.mpdl.mpg.de/en/about-us/news/ -the-novel-blockchain-consoritum-for-science-bloxberg-blockchain- munich-meetup.html, accessed june , . [ ] d. yaga, p. mell, n. roby and k. scarfone, nistir blockchain technology overview, national institute of standards and technology, u.s. department of commerce, january . [ ] editor, dave kochalko interview: how technology could restore trust in science publishing, the sciencepod maga- zine, march , , https://sciencepod.org/ / / /dave-kochalko-interview-how-technology-could-restore-trust-in- science-publishing/, accessed june , . [ ] r. schonfeld, why was springer nature’s ipo withdrawn? the scholarly kitchen, may , , see: https:// scholarlykitchen.sspnet.org/ / / /springer-nature-ipo-withdrawn/, accessed june , . [ ] j. tennant, academic publishing is broken. here’s how to redesign it, fast company ( ), https://www. fastcompany.com/ /academic-publishing-is-broken-heres-how-to-redesign-it, accessed june , . [ ] k. kelly, new rules for the new economy: ten radical strategies for a connected world viking press, , https://kk.org/mt-files/books-mt/kevinkelly-newrules-withads.pdf, accessed june , . [ ] b. 
lawlor, an overview of the nfais annual conference: the big pivot: re-engineering scholarly communica- tion, information services and use ( ) ( ), , https://content.iospress.com/journals/information-services-and- use/ / ?start= , accessed june , . [ ] levenshtein distance, informally, the levenshtein distance between two words is the minimum number of single- character edits (insertions, deletions or substitutions) required to change one word into the other. wikipedia, see: https://en.wikipedia.org/wiki/levenshtein_distance, accessed june , . [ ] k.k. berg, information behavior and mental models, search engine land ( ), https://searchengineland.com/ information-behavior-mental-models- , accessed june , . [ ] request for information (rfi) on proposed provision for a draft data management and sharing policy for nih funded or supported research (notice: not-od- - ). closed december . . see: https://osp.od.nih.gov/wp- content/uploads/data_sharing_policy_proposed_provisions.pdf, accessed june , . [ ] d. barron, how freely should scientists share their data?, scientific american ( ), https://blogs.scientificamerican. com/observations/how-freely-should-scientists-share-their-data/, accessed june , . [ ] good scientists share data, orion open science podcast, see: https://orionopenscience.podbean.com/e/good-scientists- share-data/, accessed june , . 
[ ] the other co-founder is simon inger, a well-known information industry consultant and past speaker at several nfais conferences. [ ] a landscape review of streaming media, renew publishing consultants ( ), see: https://renewconsultants.com/wp-content/uploads/ / /streaming-landscape- -renew-publishing-consultants-final.pdf, accessed june , . [ ] see short video from edtech week at https://www.youtube.com/watch?v=bqfiqavb jk, accessed june , . [ ] scientists should be solving problems, not struggling to access journals, the guardian, may , , https://www.theguardian.com/higher-education-network/ /may/ /scientists-access-journals-researcher-article, accessed june , . [ ] what is open access? https://www.springer.com/gp/authors-editors/authorandreviewertutorials/open-access/what-is-open-access/ , accessed june , . [ ] http://www.budapestopenaccessinitiative.org/read, accessed june , . [ ] new initiative to boost open access, https://www.mpg.de/openaccess/oa , accessed june , . [ ] m.
breeding, smart libraries newsletter, april, , https://librarytechnology.org/document/ , accessed june , . [ ] security assertion markup language, wikipedia, https://en.wikipedia.org/wiki/security_assertion_markup_language, accessed june , . [ ] recommended practices for improved access to institutionally-provided information resources. results from the resource access in the st century (ra ) project, niso, june , , https://www.niso.org/standards-committees/ra , accessed june , . [ ] international organization for standardization ( ). ergonomics of human system interaction - part : human- centered design for interactive systems (formerly known as ). iso f±dis - : , see wikipedia, https://en.wikipedia.org/wiki/user_experience, accessed june , . [ ] t. carpenter, niso and nfais announce plans to merge, the scholarly kitchen ( ), https://scholarlykitchen. sspnet.org/ / / /niso-and-nfais-announce-plans-to-merger/, accessed june , . [ ] merger of major information industry associations finalized, niso press releease, july , , https://www.niso.org/ press-releases/ / /merger-major-information-industry-associations-finalized, accessed july , . 
this is an author's accepted manuscript of an article published in journal of library & information services in distance learning volume , issue , (copyright taylor & francis), available online at: http://www.tandfonline.com/ (doi: . / x. . ).

program-integrated information literacy instruction for online graduate students

swapna kumar & marilyn ochoa
university of florida, usa

abstract

academic librarians often provide information literacy support for specific courses or topics in the form of research guides, one-shot training sessions, library orientations, or by embedding library content into online courses.
less frequently, they provide continuous program-level support on-campus or online. this paper highlights the value of sustained involvement of librarians at the program level to provide information literacy in an online environment. the description of implementation, research results, and strategies for sustainability will be useful to other online programs engaged in equipping online graduate students with essential information literacy skills to succeed in their academic endeavors.

keywords: online learning, information literacy, faculty-librarian collaboration, distance learning, library instruction

introduction

the number of students taking online courses at u.s. institutions of higher education has steadily increased in recent years. from the fall semester of to that of , the number of students who took an online course increased from . million to . million. students taking online courses need different forms of support to succeed at a distance. in addition to well-designed instruction, technical, administrative, and information literacy support is important for online students in post-secondary institutions. information literacy instruction can help online students succeed in several aspects of their academic endeavors. the significance of support for online students at the institutional, program, and course level for a quality online learning experience and for fostering connectedness to the institution has been highlighted by the standards for distance learning library services of the association of college and research libraries and by the distance education & training council.
this paper describes the design and implementation of information literacy instruction in an online graduate program, and reports on student perceptions of that instruction at the end of their first year in the program.

institutional context

the college of education at the university of florida offers several online graduate programs in education. assignments in a number of online courses in these programs require students to use online databases, integrate peer-reviewed literature into their writing, and craft annotated bibliographies. formal library instruction is not integrated in most courses, although some faculty may provide students with library links, contact a librarian for a session or research guide, or recommend that students consult with a librarian. incoming students to the online doctoral program in educational technology come from various disciplines such as mathematics, science, art, instructional design, or nursing, and work in diverse environments (e.g. elementary education, middle or high school, higher education, military); therefore, they are not all aware of digital resources and scholarship in the field of educational technology. notwithstanding their high level of technical skills, students in the online program are all employed full-time, do not live near campus, and several have returned to doctoral study after a long hiatus. students’ ability to access, find, evaluate, and synthesize prior research contributes largely to their progress in the doctoral program and their development as scholars and researchers.
instruction and support in accessing library resources would therefore help students complete program activities successfully and could even reduce frustration and drop-out rates. the importance of support was highlighted during research conducted with the first cohort of the online doctoral program where one-third of students reported that increased library instruction was needed for student success in the doctoral program. the education librarian and educational technology program coordinator thus collaborated to pilot a program-integrated information literacy project with an aim to (a) provide all students in the online program with information literacy skills and continuous support and (b) create online materials that could be embedded into online courses. prior research emphasizes the importance of faculty-librarian collaboration in providing online information literacy instruction and of providing such instruction early in an online program. moreover, the education library in the college strives to adhere to association of college and research libraries (acrl) standards that suggest creating “a program of library user instruction designed to instill independent and effective information literacy skills while specifically meeting the learner-support needs of the distance learning community.”

designing information literacy instruction

in an attempt to design the best possible information literacy instruction for the online program, prior research on the provision of library instruction to online students was reviewed.
in the past, libraries have provided instruction to distant learners by traveling to remote sites, but in the last decade, interactions between academic librarians and students at a distance have become easier with online asynchronous and synchronous technologies. researchers have reported on the use of online videos and tutorials to successfully explain literature search processes or the use of specific databases. online students can access such videos and tutorials at their convenience and also view them multiple times. kimok & heller-ross concluded that it was important to not only provide pre-created standard tutorials in a course, but to also create such tutorials in response to student needs during a course. in the context of the online doctoral program, it was decided that essential information literacy topics identified by the education librarian would be addressed in pre-created tutorials. additionally, online students’ needs would be surveyed both at the beginning and during the year to create asynchronous resources that they could use.

notwithstanding the value of asynchronous resources in online information literacy instruction, barnhart and stanfield, kontos and henkel, and lietzau and mann highlighted the need for synchronous interactions between librarians and online students where librarians can explain procedures in real-time online. librarians’ interactions with online students can take the form of a ‘one-shot’ session about a specific topic, database, or specific content within a course, or of an embedded librarian within a course or several courses in a college. such interactions enable students to clarify doubts immediately or follow along as a librarian explains a step-by-step procedure. despite the difficulty of coordinating a common time for all online doctoral students to meet, one synchronous session was planned during each semester in the first year of the doctoral program. the librarian and program coordinator decided that skills that could not be “left to chance” would be taught during such sessions, for instance, accessing library resources from off-campus, database searching, and an introduction to bibliographic management software.

while prior research has reported on the value of all these formats and forms of interaction between librarians and online students, course-integrated instruction where students have immediate opportunities to transfer the content of library instruction to course activities has been found to be most effective. course-integrated instruction is structured with a specific focus on course assignments and the goal of helping students complete those assignments, combined with information literacy standards and/or a basic set of information literacy skills that have been defined by an accreditation association or a librarian. program-integrated instruction that would be tailored to the needs of incoming students and help them acquire skills needed to succeed in the online doctoral program was considered crucial in this context. the education librarian and the program coordinator thus decided to develop systematic program-integrated information literacy instruction using asynchronous and synchronous interactions for incoming online doctoral students in . in the long term, it was hoped that the resources developed could also be useful to students in other programs or in future cohorts.

implementation
considering the wide range of skills and backgrounds of incoming online doctoral students in the educational technology program, a needs assessment of students’ prior knowledge and skills was conducted before they began the program in summer . in addition to identifying the information literacy skills that students would need in the program, it was important to identify their existing skills and needs in order to design appropriate instruction. the needs assessment gauged the students’ perceived ability to use resources, find appropriate literature, and cite and evaluate resources, as well as their preferred library instruction formats. library instruction content that encompassed essential information literacy skills and required skills for student success in initial doctoral courses had already been identified. this was adapted based on students’ survey responses (n= ; %) about their prior experiences with information literacy instruction, existing skills, and preference for asynchronous tutorials or synchronous instruction. instruction using both synchronous and asynchronous technologies was integrated into the program in the following manner:

summer . an introductory -hour session was held during the on-campus orientation week at the beginning of the program, which was attended by % of students enrolled at the time (n= ). topics covered included off-campus access to the library, library services for distance learners including interlibrary loan, and an introduction to catalogs and databases used to locate books or peer-reviewed materials.

fall . only half the students rated themselves as experienced or very experienced in using article databases for finding literature in the needs assessment survey. moreover, students indicated a preference for asynchronous instruction.
asynchronous learning objects such as quicktime video tutorials and documents were thus created for step-by-step instruction, placed in an online repository, and linked within the two courses required for all incoming online doctoral students in the program. tutorials and videos encompassed access and search topics, specific database use, and citation management as follows: connecting from off-campus; library introduction; searching the library catalog; database searching; wilsonweb and ebsco host; ulrich periodicals; eric thesaurus; isi web of knowledge; dissertation searches; citing and organizing citations; and annotated bibliographies. students were encouraged to view the learning objects and, if they had questions, to use the library help forum that was provided within the courses and monitored by a librarian. thirteen of ( %) students used the asynchronous learning objects. students struggled to identify peer-reviewed scholarship in an initial course activity; the librarian therefore taught a -minute synchronous session within the virtual classroom in the online course and answered students’ questions in real time. fifteen of ( %) students attended the synchronous session.

spring . in a poll at the end of , students reported their lack of experience using bibliographic management tools, a topic identified as important to their success in the doctoral program. the librarian subsequently conducted a synchronous session on refworks, bibliographic management software available at the university, which was attended by of ( %) enrolled students. in one of their required courses, students were also required to complete a group activity on the use of apa style and annotated bibliographies.
furthermore, to reinforce and check that students were familiar with off-campus access and able to access library resources, they were required to provide a screenshot of their connection to the library catalog from off-campus to the course instructor.

summer . students had to complete a literature review in the required summer seminar. to reinforce and extend the content of the prior online synchronous session, and to help them with bibliographic management, the librarian taught a one-hour advanced refworks session on-campus when students attended a one-week summer session. all ( %) attended and were able to ask questions in real time.

data collection and analysis

student perceptions of the information literacy instruction provided in the first year of the online doctoral program (june -june ) were assessed using a short anonymous survey in july . the survey included items about (a) student satisfaction with different components of library instruction and (b) their perceived value of those components to their learning in the program and for the improvement of their information literacy skills. a -point likert scale was used (e.g. not satisfied, satisfied, and very satisfied) and an option was included for students who had not attended the instruction or not used a resource (‘i did not attend this session/use this resource’). open-ended questions were included to gather student feedback on how well each component worked and to ask for students’ suggestions for improving online information literacy instruction for future students in the program. twelve of the students ( . %) enrolled in the program at the end of the first year responded to the survey.
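the mean and sd figures reported for each survey item can be reproduced along the following lines. this is a minimal sketch, not the authors' procedure: the numeric coding of the scale (1 to 3), the use of the sample standard deviation, and the function name `summarize` are all assumptions made for illustration, and the responses shown are invented rather than taken from the study.

```python
from statistics import mean, stdev

# Assumed numeric coding of the 3-point scale; the article does not state the codes.
SCALE = {"not satisfied": 1, "satisfied": 2, "very satisfied": 3}

# The opt-out option quoted in the survey description.
NOT_ATTENDED = "i did not attend this session/use this resource"


def summarize(responses):
    """Return (n, mean, sd) for one survey item.

    Opt-out responses are excluded before averaging, so n counts only
    students who actually rated the item. The sample standard deviation
    is used here as an assumption; the article does not say which was used.
    """
    scores = [SCALE[r] for r in responses if r != NOT_ATTENDED]
    n = len(scores)
    if n == 0:
        return 0, None, None
    sd = stdev(scores) if n > 1 else 0.0
    return n, mean(scores), sd


# Invented responses for illustration only; these are not the study's data.
example = [
    "very satisfied",
    "satisfied",
    "very satisfied",
    NOT_ATTENDED,
    "not satisfied",
]
n, m, sd = summarize(example)
print(f"n={n} mean={m:.2f} sd={sd:.2f}")
```

excluding the opt-out responses before averaging matters with a sample this small, since counting them as a rating would pull an item's mean down for reasons unrelated to satisfaction.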
findings

students’ reported satisfaction with the information literacy instruction provided and its perceived value for their completion of assignments during the first year of the online doctoral program are presented in table . on a scale of to , students were most satisfied with the synchronous online sessions (m= . & . ) and face-to-face sessions (m= . ) and perceived the online synchronous sessions as most valuable (m= . & . ) for completing assignments in the first year of the online program. the mean student rating for satisfaction with asynchronous learning objects and their value for completing assignments was or less. on a scale of to , the mean student rating for improvement in their ability to access online resources, to search for online resources, to distinguish between peer-reviewed and non-peer-reviewed resources, and to manage their research was higher than . (table ). students’ mean rating of the usefulness of instruction in helping them to cite appropriately was . , indicating that more instruction should be provided in this area.

table : students’ satisfaction and perceived value of library instruction in year

item | mean (n= ) | sd

please rate your satisfaction with the library instruction you received for conducting research in year (not satisfied, satisfied, very satisfied).
summer : on-campus library orientation session | . | .
fall : synchronous session about types of resources | . | .
fall : asynchronous learning objects (video tutorials, pdf documents) | . | .
spring : synchronous session about refworks | . | .
summer : on-campus session on refworks | . | .

please rate the value of the following to your ability to successfully complete assignments in year (not valuable, valuable, very valuable).
summer : on-campus library orientation session | . | .
fall : synchronous session about types of resources | . | .
fall : asynchronous learning objects (video tutorials, pdf documents) | . | .
spring : synchronous session about refworks | . | .
summer : on-campus session on refworks | . | .

please rate your agreement with the following (strongly disagree, agree, strongly agree).
the library instruction in year has improved my ability to access online resources | . | .
the library instruction in year has improved my search abilities | . | .
the library instruction in year has increased my confidence in finding resources | . | .
the library instruction in year has helped me distinguish between peer-reviewed and non-peer-reviewed resources | . | .
the library instruction in year has helped me manage my research | . | .
the library instruction in year has helped me cite appropriately | . | .

when asked to provide feedback on the different components of library instruction in the first year, six of twelve students highlighted the synchronous sessions as valuable, describing them as “well structured and informative,” “interactive, we could ask questions” and “valuable because you could refer back to them.” one student commented, i thought it covered everything that i needed to know. the format was easy to follow and flowed well from one topic to the next.
i found the sessions to be very helpful, and i feel comfortable using these resources on a daily basis. in response to a question about their continued learning needs, students requested advanced instruction in bibliographic management tools and in citation styles. when asked to provide suggestions for improving library instruction for future students, three students responded that it was “effective as-is.” one student wrote, “i wouldn’t change anything. i think it provided everything i needed to know, and i have found myself using everything that was discussed so it was all useful.” two students suggested providing students with access to the asynchronous resources before they begin coursework in the program, so that they can prepare for doctoral school.

discussion and implications

this study was conducted with a small sample of online doctoral students in education who were working professionals and may not be representative of the larger population of online graduate students in other disciplines at post-secondary institutions in the united states. due to several factors, the number of participating students decreased from to students during the first year of the program. nevertheless, the implementation of online information literacy instruction and the findings are discussed here with respect to implications for the online doctoral program at our college and for other online programs seeking to include program-integrated information literacy instruction.
in the needs assessment at the beginning of the year, % of incoming students (n= ) reported that they had previously used online tutorials for information literacy instruction, and that they preferred to learn from asynchronous resources (e.g. videocasts, video tutorials, adobe pdfs) rather than synchronous interactions with a librarian. the librarian and program coordinator, therefore, focused on the creation of asynchronous resources, but included synchronous interactions based on the research reviewed and their instructional experience. in the year-end survey, students reported high satisfaction with the synchronous interactions that they perceived as most useful to their success in program assignments. possible reasons for students’ low rating of asynchronous resources could be that they did not appreciate the importance of reviewing these resources or their relevance to program assignments. it is also possible that they were dissatisfied with the quality of the resources. fifty-nine percent of students reviewed the asynchronous resources whereas % to % of students attended the synchronous sessions offered during the year. students were required to attend face-to-face synchronous sessions and were requested to attend online synchronous sessions as much as possible, but they were not required to view the resources provided as part of their course grades or assignments. the program coordinator highlighted the value of the asynchronous resources in an email but did not require students to view them, which could be a reason for students’ low use of resources.
lessons learned for future implementation of information literacy instruction in our program were (a) not to rely completely on students’ professed preference for one format of instruction over the other, but to provide as many forms of interaction with the librarian as possible; (b) to stress the importance of students’ information literacy skills to their success in the program; (c) to require the viewing of asynchronous objects that contained content we had identified as crucial to students’ information literacy and success in the program; and (d) to place the asynchronous resources at strategic points before assignments or content in the course. in this instance, technical issues resulted in the placement of all the resources in one folder so that they could be accessed across courses in this online program as well as other programs. it is possible that the placement of asynchronous resources in the course might have been a reason for non-use. further research in the program will focus on the implementation of these lessons learned and students’ use of asynchronous resources or participation in synchronous interactions in information literacy instruction. based on these implementation experiences and students’ perceptions, we make the following suggestions for others interested in integrating information literacy support in online programs.

• a needs assessment is crucial: identifying the information literacy skills that students will need to succeed in an online program, combined with an assessment of their existing skills and experiences, can help determine the content and format of information literacy instruction in the online program.
• a continuous assessment of program and student needs should follow as students proceed through an online program: after students use pre-created instructional resources, the regular monitoring of students’ learning needs and course assignments can contribute to the development of new content and flexible instruction.

• the academic librarian should interact with students in different media and use multiple formats of instruction: while some students appreciate the real-time interaction with the librarian, not all online students can attend such sessions. providing archives of synchronous interactions and step-by-step asynchronous resources such as tutorials or pdfs can help students with different learning styles. the availability of a help forum and access to a librarian can also be invaluable.

• the placement of asynchronous resources and librarian availability within online courses and programs should be carefully considered: information literacy instructional resources should be placed at certain points in the program or coursework where students see them as relevant, and students should be required to view or use them. students might not realize the value of information literacy instruction and might not use the resources unless required to do so. in that context, highlighting the value of information literacy instruction to students’ success in an online program will contribute greatly to the use and success of such instruction.

• faculty-librarian collaboration is essential at every step of the process: beginning with the identification of program needs and the resources already existing in an academic library, the continuous collaboration between faculty and academic librarians is important to the integration of information literacy instruction at every point in an online program.
ideally, a collaborative formative evaluation and reflection on what can be improved in the future should accompany this process.

conclusion

information literacy skills are as important as writing skills in a doctoral program. graduate students who live and work at a distance from the university and might be returning to school after a hiatus cannot be expected to use electronic databases or to evaluate and integrate digital scholarship into their coursework without instruction. in this study, instruction was intentionally integrated into the program at regular intervals and in different formats (synchronous or asynchronous), enabling greater involvement between the librarian and students. online students’ needs were regularly explored, monitored, and addressed as students progressed through the program. built on strong librarian-faculty collaboration, a learner needs assessment, the use of different instructional formats, and an evaluation, this library instruction project is a model that can be customized and implemented in other online programs and disciplines to support online students and provide them with skills to succeed in online learning.

i. elaine allen and jeff seaman, “learning on demand: online education in the united states,” (babson park ma: babson college survey research group, ). retrieved from www.sloanconsortium.org/publications/survey/pdf/learningondemand.pdf.
i. elaine allen and jeff seaman, “class differences: online education in the united states, ,” retrieved from http://sloanconsortium.org/sites/default/files/class_differences.pdf.
maria lapadula, "a comprehensive look at online student support services for distance learners," american journal of distance education , no. ( ): - .
alan tait and roger mills (eds.), “rethinking learner support in distance education: change and continuity in an international context,” (london: routledge falmer, ).
"standards for distance learning library services, " association of college and research libraries, accessed december , , http://www.ala.org/acrl/standards/guidelinesdistancelearning.
“policies, procedures, standards, and guides of the accrediting commission of the distance education and training council, distance education training council,” (washington, dc: detc accrediting council, ).
cheng-yuan lee, "student motivation in the online learning environment," journal of educational media & library sciences , no. (june ): - .
swapna kumar, kara dawson, erik w. black, catherine cavanaugh and christopher d. sessums, “applying the community of inquiry framework to an online professional practice doctoral program,” international review of research in open and distance learning , no. ( ): - .
dee bozeman and rachel owens, "providing services to online students: embedded librarians and access to resources," mississippi libraries , no. (fall ): - .
linda l. lillard, et al., "embedded librarians: mls students as apprentice librarians in online courses," journal of library administration , no. / (january/march ): - .
acrl, "information literacy competency standards," p. .
k. stuart ferguson and alice ferguson, "the remote library and point-of-need user education: an australian academic library perspective," journal of interlibrary loan, document delivery & information supply , no. ( ): - .
jill s. markgraf, "librarian participation in the online classroom," internet reference services quarterly , no. / ( ): - .
wendy holliday, sharolyn ericksen and britt fagerheim, "instruction in a virtual environment: assessing the needs for an online tutorial," the reference librarian ( ): - .
elizabeth blakesley lindsay, lara cummings and corey m. johnson, "if you build it, will they learn? assessing online information literacy tutorials," college & research libraries , no. (september ): - .
rachel g. viggiano, "online tutorials as instruction for distance students," internet reference services quarterly , no. / ( ): - .
li zhang, "effectively incorporating instructional media into web-based information literacy," the electronic library , no. ( ): - .
debra kimok and holly heller-ross, "visual tutorials for point-of-need instruction in online courses," journal of library administration , no. / ( ): - .
anne c. barnhart and andrea g. stanfield, "when coming to campus is not an option: using web conferencing to deliver library instruction," reference services review, , no ( ): - .
fotini kontos and harold henkel, "live instruction for distance students: development of synchronous online workshops," public services quarterly , no. ( ): - .
julie arnold lietzau and barbara j. mann, "breaking out of the asynchronous box: using web conferencing in distance learning," journal of library & information services in distance learning , no. - ( ): - .
barbara kay adams, “library lecture seminars and workshops in course integrated instruction,” the southeastern librarian, ( ): - .
francesca allegri, "course integrated instruction: metamorphosis for the twenty-first century," medical reference services quarterly (winter ): - .
william badke, "ramping up the one-shot," online (weston, conn.) , no. (march/april ): - .
penny m. beile, "effectiveness of course-integrated and repeated library instruction on library skills of education students," journal of educational media & library sciences , no. (march ): - .
karen bordonaro and gillian richardson, "scaffolding and reflection in course-integrated library instruction," the journal of academic librarianship , no. (september ): - .
russ a. hall, “the "embedded" librarian in a freshman speech class,” college & research libraries news, , no ( ): - .
linda lawrence stein and jane m. lamb, "not just another bi: faculty-librarian collaboration to guide students through the research process," research strategies , no. ( ): - .
nancy h. dewald, ann scholz-crane and austin booth, "information literacy at a distance: instructional design issues," the journal of academic librarianship , no. (january ): - .
colin higgins, "applying instructional design theory in academic libraries," library & information update (july ): .
indira koneru, "addie: designing web-enabled information literacy instructional modules," desidoc journal of library & information technology , no. (may ): - .
alexius smith macklin, "theory into practice: applying david jonassen's work in instructional design to instruction programs in academic libraries," college & research libraries , no. (november ): - .
jerilyn veldof, “take a ride on the design cycle: instructional design for librarians. in integrating information literacy into the college experience,” (pierian press, ): - .
swapna kumar, marilyn n. ochoa and mary e. edwards, “considering information literacy skills and needs: designing library instruction for the online learner,” communications and information literacy , no. (in press).

graphs, maps, and digital topographies: visualizing the dunciad as heterotopia

allison muri

lumen: selected proceedings from the canadian society for eighteenth-century studies / travaux choisis de la société canadienne d'étude du dix-huitième siècle, volume ,
uri: https://id.erudit.org/iderudit/ ar
doi: https://doi.org/ . / ar
publisher: canadian society for eighteenth-century studies / société canadienne d'étude du dix-huitième siècle
issn: - (print), - (digital)
cite this article: muri, a. ( ). graphs, maps, and digital topographies: visualizing the dunciad as heterotopia. lumen, , – . https://doi.org/ . / ar
lumen xxx / - / / - $ . / © csecs / scedhs .

copyright © canadian society for eighteenth-century studies / société canadienne d'étude du dix-huitième siècle, . this document is protected by copyright law. use of the services of Érudit (including reproduction) is subject to its terms of use, which can be consulted online: https://apropos.erudit.org/fr/usagers/politique-dutilisation/

this article is disseminated and preserved by Érudit. Érudit is a non-profit inter-university consortium made up of the université de montréal, université laval, and the université du québec à montréal. its mission is to promote and disseminate research. https://www.erudit.org/fr/

hence springs each weekly muse, the living boast
of c — l’s chaste press, and l — t’s rubric post;
hence hymning tyburn’s elegiac lay,
hence the soft sing-song on cecilia’s day,
sepulchral lyes our holy walls to grace,
and new-year odes, and all the grubstreet race.
’twas here in clouded majesty she shone ...
alexander pope, the dunciad i. -

when strait i might discry, the quintescence of grub street, well distild through cripplegate in a contagious map.
john taylor, all the workes of iohn taylor the water-poet,

notes:
john taylor, all the workes of iohn taylor the water-poet (charlottesville: university of virginia library), <http://xtf.lib.virginia.edu/xtf/view?docid=chadwyck_ep/ uvagentext/tei/chep_ . .xml>.
pat rogers, grub street: studies in a subculture (london: methuen & company, ), .

“where was grub street?” pat rogers asked in his landmark study of “the topography of dulness” in eighteenth-century london. we might well ask where is the “here” where dulness shines in the epigraph above, and we would find the same answers that rogers did: topographically, grub street and dulness can be situated at particular coordinates of the city, with particular social and material histories; metaphorically, they are everywhere throughout the city, a permeating fog of intellectual turpitude and pedantry. if we were to construct a digital map of this literary territory, what could we discover that we cannot already visualize in the schematic maps provided by aubrey williams in , rogers in , or valerie rumbold in ? the following commentary starts from the position that digital scholarship and editing can provide enriched readings of literary texts through visualizations that extend beyond what is possible in print. this activity entails the “distant reading,” proposed by franco moretti, a method that involves schematic representations such as graphs and trees, statistical analysis, and mapping of texts as opposed to traditional literary “close” readings. this activity does not eliminate the possibility of close reading, since digital displays of both high-resolution facsimile and diplomatic editions also provide the opportunity for studying texts, intertextual references, and cultural relationships that inform traditional scholarship. as jerome mcgann has already argued, a digital archive of literary and artistic works can encompass a social text, a vast edition reflecting the social-text editorial procedure proposed by d. f. mckenzie. the digital edition displays both texts as “linguistic forms” and books as “social events”; it can allow readers to reconstruct a documentary record not only of how literary works might have existed as words and ideas in their time, but also of innumerable intertextual relationships, and of material evidence such as the typography, layout, and illustrations of these texts, visualizations of how/where/when they were transmitted, or how they constructed and were constructed by a material world of trade and commerce, of news, even of architecture, buildings, streets, and alleys.
as mcgann comments, we can now create “editorial machines capable of generating on demand multiple textual formations — eclectic, facsimile, reading, genetic — that can all be subjected to multiple kinds of transformational analyses.” with the arrival of digital scholarship, we have an opportunity to expand the possibilities for visualizing eighteenth-century london and its society of letters. i have written elsewhere of my grub street project to create a system to map and thereby visualize the material networks of print and other trades in london in the long eighteenth century, from the restoration through the georgian era. franco moretti’s “very simple question about literary maps,” that is, “what exactly do they do? what do they do that cannot be done with words ... ?” has yet, i think, to be fully explored. moretti asks, “do maps add anything, to our knowledge of literature?” clearly, i think they do, as they have done long before digitization.

aubrey williams, pope’s “dunciad”: a study of its meaning (baton rouge: louisiana state university press, ).

valerie rumbold, ed., the dunciad: in four books (new york: longman, ).

d. f. mckenzie, bibliography and the sociology of texts (cambridge: cambridge university press, ).

jerome mcgann, “from text to work: digital tools and the emergence of the social text,” romanticism on the net - (february-may ), <http://www.erudit.org/revue/ron/ /v/n - / ar.html>; see also dino buzzetti and jerome mcgann, “electronic textual editing: critical editing in a digital horizon,” electronic textual editing, ed. lou burnard, katherine o’brien o’keeffe, and john unsworth (new york: modern language association of america, ), - .
beyond this, the incorporation of high-resolution full-colour facsimiles of maps, prints and books with databases of contextual material will transform the way we visualize and research historical literature such as the dunciad of and dunciad variorum of .

to begin, then: where was grub street? in terms of the real place, a textual description such as john strype’s in is opaque: situated in cripplegate ward, he wrote, the street is “very long, coming out of forestreet, and running, northwards, into chiswel street; but some small part, to wit, from sun alley to chiswel street, is not in the ward, but in the liberty of finsbury. this street, taking in the whole, is but indifferent, as to its houses and inhabitants; and sufficiently pestered with courts and alleys ... ” more helpful for visualizing the poem’s topography, strype’s survey of london also provides prints that can illustrate the locations of the dunces’ race in fleet street, or rag fair where dulness was seated when the dunciad was first published in [figure ].

some illustrative figures are available in this article; for a larger set of visualizations in full colour, see <http://grubstreetproject.net/alexanderpope/dunciad - >.

allison muri, “the technology and future of the book: what a digital ‘grub street’ can tell us about communications, commerce, and creativity,” in producing the eighteenth-century book: writers and publishers in england, - , ed. laura runge and pat rogers (newark: university of delaware press, ), - .

franco moretti, graphs, maps, trees: abstract models for a literary history (london: verso, ), .

david l. vander meulen, pope’s dunciad of : a history and facsimile (charlottesville and london: university press of virginia, ), hereafter cited as dunciad.

allison muri, pope’s dunciad variorum of : a digital facsimile, <http://grubstreetproject.net/alexanderpope/dunciad >, hereafter cited as dunciad variorum.
unlike printed illustrations, where options for including high-resolution full-colour images are limited by financial considerations, a digital edition could also show a contemporary view of where the goddess was seated at bedlam near grub street in the dunciad, in john rocque’s survey of the cities of london and westminster ( ); readers could also view these sites decades after pope’s death but in greater detail, in richard horwood’s plan of the cities of london and westminster ( - ). readers could visualize the poem by following the dunces in their games from st. mary le strand, to curll’s shop (opposite st. dunstan’s church in fleet street or opposite catherine street in the strand, depending on when pope wrote this segment); then to bridewell, fleet ditch, and through ludgate into the city on an eighteenth-century map [figure ]. one could link to images of or maps showing smithfield where, as pope’s note informs his readers, “bartholomew fair was kept, whose shews, machines, and dramatical entertainments, formerly agreeable only to the taste of the rabble, were, by the hero of this poem and others of equal genius, brought to the theatres of covent-garden, lincolns-inn-fields, and the hay-market, to be the reigning pleasures of the court and town” (dunciad variorum, note for i. ) [figure ]. one could see strype’s image of bedlam hospital as robert hooke’s orderly “palace beautiful” contrasted with hogarth’s heterotopic representation of this hospital for the mad. we could map not only the fiction of print culture that pope creates, but also the historical print culture.

john strype and john stowe, a survey of the cities of london and westminster vol. (london: printed for a. churchill, j. knapton, etc., ), .

david vander meulen suggests that pope may have written a “progenitor of the full dunciad,” around ( ). paul baines and pat rogers indicate that curll’s shop was opposite st. dunstan’s church, fleet street, from to ; he moved to a house “next the temple coffee house in fleet street” at the end of , which may have been the same shop. curll was briefly in paternoster row in and moved to new premises in the strand in fall of (paul baines and pat rogers, edmund curll, bookseller. new york: oxford university press, , ). one could speculate that the composition of the passage depicting the race with curll slipping in his corinna’s “evening cates” predated his shop’s move from fleet street since the race would more sensibly remain on that street, logically proceeding in one direction from st. mary le strand eastward toward fleet ditch, bridewell, and ludgate, rather than looping west from st. mary le strand to curll’s shop in the strand before proceeding eastward again (for the locations of curll’s shops, see figure ).

we could do all this, but what might we see? what new discoveries could be made through this process? how would it carry the argument further? all these remnants of material culture can be made visible online, and they can lend some life to the space of the dunciad, but as rogers remarked, “the ‘map’ is finally only a palimpsest. for the grub street which shows up there as an outlying enclave is no more than a shadow on the wall of the cave. the real grub street is elsewhere: an outlying enclave in the metropolis of letters.” how valuable is mapping, when “grubstreet” is metonym for literary hacks and crass commercial print culture, or “bedlam” for madness, how valuable when “dulness” is both character and the nation’s spiritual state of “clouded majesty,” “cloud-compelling,” folly spread “o’er the land and deep”? does seeing the locations of the booksellers and authors implicated in the dunciad illuminate our reading of the poem?
my suggestion is that maps, along with the dynamic display of graphs and archival printed materials, can help us to investigate how that space, and its literary and social networks, as well as material trade networks, help us to see what pope and his readers saw in that figure of dulness that is impersonated in localizable points such as bedlam, fleet ditch, or st. mary le strand in fleet street but that also lies like a mist over the entire city. moreover, they can illuminate pope’s self-representation in the topography of dulness from which he, adopting a voice of omniscience from far above the excremental mire, excludes himself after the opening defense of “mr. pope’s integrity” in “the publisher to the reader.”

rogers, grub street, .

dunciad, .

i start from the premise that, in order to visualize the topography of eighteenth-century print culture and letters, we are best served by viewing and mapping eighteenth-century spaces as they were represented in the eighteenth century, especially strype’s maps published in and to some extent horwood’s more detailed map from the close of the century. what i am discussing, then, is in some ways an issue of editorial theory examined through the lens of the digital map: a process of creating a scholarly text that represents both the “real” eighteenth-century space and its representations in the imaginative spaces of fiction and poetry.

the process of mapping pope’s london is to read the dunciad as a network of literary communications, ideas, and physical-spatial relationships, to visualize it as a sort of countersite represented by maps of “the real” but absolutely different from those sites (which are reflected in the poem as imaginative vitriol, literary allusions, metaphor and ambiguity).
pope’s world of dunces, utterly embedded in the streets of london, is a virtual space, a mirror of the world, in which he reconstitutes himself as victim and as judge, while simultaneously eliminating himself from the streets. brean hammond was first to suggest that the dunciad could be read as “the imaginative representation of the popean heterotopia,” and here i will follow his lead. foucault defined heterotopias and utopias as having “the curious property of being in relation with all the other sites, but in such a way as to suspect, neutralize, or invert the set of relations that they happen to designate, mirror, or reflect. these spaces, as it were ... are linked with all the others, which however contradict all the other sites.” in opposition to utopias, which were “fundamentally unreal spaces,” foucault suggested heterotopias are real locales that function “like counter-sites, a kind of effectively enacted utopia in which the real sites, all the other real sites that can be found within the culture, are simultaneously represented, contested, and inverted.” places of this kind, he noted, are “outside of all places, even though it may be possible to indicate their location in reality.”

to map pope’s heterotopia of dulness, then, requires a visualization of the city and its book trades contrasted with the mirror or countersites presented in the poem. the first edition of the dunciad, beginning with the words “book[s] and the man,” situates its initial focus on the marketplace of literature, juxtaposing books in the first line with “smithfield muses” in the second. in the dunciad (i. ) and dunciad variorum (i. ), pope situates the temple of dulness in rag fair in the east, outside of the city of london (figure ). rosemary lane passed through the impoverished st. botolph aldgate parish at the eastern boundary of the city of london, and can be seen as representing a liminal thoroughfare between the relatively wealthy city and the poorer areas of middlesex. rag fair emphasizes the book trade’s market economy: as pope tells his readers in the variorum, this was “a place near the tower of london, where old cloaths and frippery are sold,” and as pat rogers explains more graphically, it was “a place infamous for crime, prostitution, poverty and cheap secondhand trading.” the association of rags with paper-making would also imply the filth and poverty, not to mention the cheapness of the very pages that held the imprints of dulness. one contemporary of pope, describing the diseases of “mechanicks and tradesmen,” included those who “gather rags, for making paper, which send forth noisome smells and steams, especially such as are made of foul linnen, or of beds, on which people have died of distempers.” ned ward described the street as a “savoury place which, in ridicule of fragrant fumes that arise from musty rotten old rags, and burnt old shoes is call’d by the sweet name of rosemary-lane.” it was a place where a “tatter’d multitude” assembled every day:

where all the rag-pickers about town ... have recourse, to sell their commodities, to cow-cross merchants, long-lane sharpers, and other brokers, who were as busy in raking into their dunghills of old shreds and patches, and examining their wardrobes of decay’d coats, breeches, gowns, and petticoats, as so many cocks upon a pile of horse-dung, scraping about the filth to find an oat worth picking at; or like a parsons hogs on a monday morning, routing about a church-yard to find a s————nce worth biting at.

notably, the imperial seat of fools in rag fair and the beginning of the games at st. mary le strand represent areas beyond the reach of guildhall and the legal limits of the city and its liberties.

brean hammond, “the dunciad and the city: pope and heterotopia,” studies in the literary imagination , no. ( ): .

michel foucault and jay miskowiec (trans.), “of other spaces,” diacritics , no. ( ): .

the edition reproduced in facsimile by vander meulen is the octavo that misprinted the first line as “book and the man.”

strype defines four parts of the city of london and westminster in : the city of london within the walls and freedom, inhabited by “wealthy merchants and tradesmen”; the city or liberty of westminster and adjacent parts; the part beyond the tower (including east smithfield); and southwark (strype, vol. , ).

dunciad variorum, note to i. .

rogers, grub street, .

samuel parker, george ridpath, et al., eds. the history of the works of the learned , no. (july ), in eighteenth-century journals: a portal to newspapers and periodicals (harry ransom center, university of texas), .

ned ward, “the london spy,” part (december ), in the london-spy compleat. in eighteen parts (london: printed for, and by j. how, and sold by eliphal jaye, ), .

if the first version of “smithfield muses” in referred to east smithfield outside of the city walls and freedom where we find rosemary lane, then the image of the market, of poverty, and of exclusion from the wealthy city would have been relevant. the poverty of that region seems, however, to have served as a less powerful signifier for the decay of literature than the heterotopia pope constructed of grub street, bedlam, and west smithfield inside the city liberties. indeed, specifying the location of the smithfield muses in west smithfield in the variorum edition, and moving dulness’s sacred dome from rosemary lane to bedlam in the dunciad in four books, reinforces a damning countersite for the topography of print trades in and around the city.
associating authorial muses with bartholomew fair and its raucous performances of jesters, dancers and tumblers effectively demoted authorial genius to a commerce of crass entertainment: ned ward for example had described the seesaw of conscience “between pride and profit” in an actor considering whether it is preferable to have the “honourable title of one of his majesties servants” or “bartholomew-fair-player,” and choosing to play the fool at the fair for more money. as pope would have it in and subsequent editions, these amusements that so basely prostitute artists, formerly agreeable only to the “taste of the rabble,” had been transplanted out of the city and moved westward, via the “theatres at covent-garden, lincoln’s-inn-fields, and the hay market,” to descend, finally, upon the polite society of “the court and town” in westminster. it is this action, or progress of dulness, that the dunciad is presumed to detail.

from rag fair, then, outside the city, the queen and her dunces head to the western boundary of the city, as described by martinus scriblerus’s helpful note in the variorum of (note to ii. ). specifically, the games begin at the church of st. mary le strand, just beyond the temple bar at the extremity of the city’s liberties and marking the border between london and westminster. the games continue to the east, concluding at the end of book ii with the participants passing through ludgate, and putting one another to sleep just inside the wall, close to stationers hall. dulness and her newly anointed king are returned to the temple of the goddess at the opening of book iii, presumably back at rag fair in rosemary lane where the “imperial seat of fools” and dulness’s “sacred dome” is located.

ned ward, “the london spy,” part (august ), .

dunciad variorum, note to i. .

dunciad, i. , i. .

the spatial tension pope creates with this movement of dulness eastward, opposite to the supposed movement westward to the court and town, has inspired numerous interpretations of pope’s intentions. aubrey williams in argued that the situation of the games’ commencement at st. mary le strand, “just beyond the western boundary of the city ... is highly significant”: the “exact geography of the games seems partly determined by the fact that the strand and drury lane were the actual sites of many printing-houses and theatres, and so could mark the encroachment of literary dulness on westminster.” alvin kernan, writing a few years later in , agreed: as williams demonstrated, he wrote, the poem’s action moving west from smithfield to “the ear of kings” suggests “the corruption of taste and the translation of vulgarity from the city to the court, from the center of commerce to the polite world.” in , however, pat rogers rightly noted that the dunces, whether in smithfield, bedlam, rag fair, or moorfields, are in fact situated outside of the old walled city, though largely within the liberties of the city, and “cannot be simply equated with the brokers and stockjobbers, the merchants and importers, the shopkeepers and moneylenders: for the good reason that they themselves had no real stake in the life of the city.” furthermore, the journey to fleet ditch “was not for them (as it was for the lord mayor and his party) a voyage out. it was not even an aimless peregrination. it was a homecoming.” in david sheehan described this progress as an “inward movement” in book ii, in “exactly the opposite direction” from the polite world in westminster “outside the limits of the city strictly defined, further and further inward, a movement of contraction and narrowing.” maynard mack, on the other hand, argued in that the movement in the dunciad is mostly “systolic” or “centrifugal,” with the only exception being “when the smithfield muses and their admirers ... migrate, not only from all quarters of the old city and its purlieus but seemingly from all england, to the west end.” finally, as brean hammond noted in , the poem’s topography suggests a “struggle between the purveyors of low-brow, popular, and irrational culture” and “those who wish to prevent their infiltration into respectable vicinities” as the dunces “migrate westward toward st. james’s palace and westminster.”

williams, pope’s “dunciad,” - .

alvin b. kernan, “the dunciad and the plot of satire,” studies in english literature, - , no. ( ): .

rogers, grub street, - .

david sheehan, “the movement inward in pope’s ‘dunciad,’” modern language studies , no. ( - ): .

maynard mack, alexander pope: a life (new york and london: w. w. norton & company, ), - .

in short, pope’s poem strains in two directions, and both the topography of dulness and the “action” of the poem escape definition, thanks to pope’s own paradoxical heterotopic vision. as others have argued, williams was taking pope at his word rather too easily when he adopted the model of dulness’s progression out of the city toward the world of civility. particularly if we map the action of the book trade itself, we can see that pope’s self-creation as a morally superior poet genius in opposition to the dunces relies upon a certain representation of that trade: when pope refers to the place “where fleetditch with disemboguing streams / rolls the large tribute of dead dogs to thames”; or to “the neighbouring fleet / (haunt of the muses)”; when he describes the dunces moving “in the strand, thence along fleet-street (places inhabited by booksellers)”; when he adds emphasis by changing “thro’ the gates of lud” to “thro’ lud’s fam’d gates, along the well known fleet ... ” in , he is systematically equating fleet street and the nearby sewer of fleet ditch with the diminishment of creative values wrought by the trade of the booksellers.
pope’s self-construction, then, as the voice of high-minded opposition to cultural commercialism and the decline of taste, relies on this localization of the booksellers and scribblers in this site of excrement and offal. if, however, we examine the locations where books were printed and sold, we get a somewhat different view of the participation in this marketplace by both the dunces and pope. pope’s fellow conspirator lawton gilliver, who published the variorum, set up his shop in close to the shops (or former shops) of pope’s featured dunces bernard lintot and edmund curll in fleet street and the strand, a short block and a half from temple bar. certainly these historical facts are somewhat circumstantial, but as i will show, the concentration of pope’s own book sales in the area next to temple bar at the boundary of the old city provides insights into his representations of the print trade, which were calculated to elevate his own supposedly blameless role in the rabble of “scribblers, booksellers, and printers” who were attacking his integrity. the locations of pope’s sales also indicate quite clearly his own progress westward out of the old city.

brean hammond, “the city in eighteenth-century poetry,” in the cambridge companion to eighteenth-century poetry, ed. john sitter (cambridge: cambridge university press, ), .

dunciad, ii. - .

dunciad, ii. - .

dunciad variorum, .

dunciad, ii. .

dunciad in four books, ii. .

david foxon, pope and the early eighteenth-century book trade, revised and edited by james mclaverty (oxford: clarendon press, ), .

dunciad, iii.

for the following analysis i am using a dataset from the english short title catalogue, comprising records of all monographs, newspapers, and other serial items published in london in the years just before pope’s dunciad was published. filtered by year ( - ) and place of publication (london), the estc shows , records of imprints from the start of the decade to the end of the year prior to publication of the dunciad on the th of may, . comparing publisher / bookseller locations from - shows an unsurprising preponderance of the book trades around st. paul’s and in paternoster row (st. paul’s appears in the imprint field of records, while paternoster row appears in , ), with fleet street and the area around temple bar making a strong showing as well (searching for “temple bar” or “temple gate” finds records, some of which of course list more than one bookseller’s shop in that location). notably, and not surprisingly, the sites occupied by pope’s fictional dunces are not noticeably saturated with the book trades. grub street, active in the s, by now shows no activity in the book trade. a search for smithfield in the imprint field results in just records, mostly associated with nearby bartholomew close. a mere records are associated with lincoln’s inn, with covent garden, and just with hay market.

i am grateful to virginia schilling at the center for bibliographic studies & research, university of california at riverside, for providing this dataset. the known limitations of the english short title catalogue (<http://estc.bl.uk/>) mean that, although the data i present here provides a very good general representation of london’s print culture, the precise figures must be considered somewhat imperfect and provisional. stephen tabor for example has noted that the estc provides a description of the ideal copy for each edition (“estc and the bibliographical community” the library , no. ( ): - ). thus any notion of a precise material reality in these numbers must be qualified, though tabor also notes the eighteenth-century records are relatively accurate. a second caveat is that for the individual authors examined here the numbers present only a general picture to demonstrate the usefulness of a statistical approach. often the imprint information does not provide a location. works such as henry plomer’s a dictionary of the printers and booksellers who were at work in england, scotland and ireland from to (oxford: oxford university press, ), the british book trade index (university of birmingham, http://www.bbti.bham.ac.uk/), book trade indexes by ian maxted (exeter working papers in british book trade history. <http://bookhistory.blogspot.com>; the london book trades - : a preliminary checklist of members. folkestone, england: dawson, ), and definitive studies by james raven (“constructing bookscapes: experiments in mapping the sites and activities of the london book trades of the eighteenth century” in mappa mundi: mapping culture/mapping the world, edited by j. murray, - . university of windsor: working papers in the humanities, ; “london and the central sites of the english book trade” in the cambridge history of the book in britain, volume v, edited by michael f. suarez and michael l. turner, - . cambridge: cambridge university press, ; and the business of books: booksellers and the english book trade - . new haven: yale university press, ) can sometimes provide corroborating evidence of a bookseller’s location in a particular year, but there is more research to be done to find exact locations for all the booksellers in question. for this study, if one record in a particular year names a bookseller’s location, and another does not, or if a location is named over a span of years but is missing in a record within that timeframe, i apply that location to the record(s) where there is none named.

pope’s dunces present a closer view of this scene. for example, lewis theobald’s publications up until the publication of the dunciad were, indeed, largely sold in fleet street close to the temple bar or temple gates [figure ]. a few of his works were sold inside the city wall, principally warwick lane, but certainly theobald represents the image pope presents of the spread of dulness westward. “ear-less” daniel defoe, however, shows that pope’s heterotopia of dulness reflects a distorted view. the majority of defoe’s booksellers in the decade up to the publishing of the dunciad were located within the old city walls, and in hardly scurrilous places: paternoster row, warwick lane, near stationers’ hall, st. paul’s churchyard, and cornhill near the royal exchange [figure ]. it is unremarkable, certainly, that pope’s world is unfaithful to the real one, but this mapping in particular shows first that the association of dulness with fleet street and fleet ditch does not apply to this dunce, and creates one telling instance of the heterotopic tension between reality (site) and representation (countersite). similarly, the locations of eliza haywood’s booksellers conflict with a representation of the marketplace of literature moving westward. a significant percentage of haywood’s publications in the years - were sold in the area around temple bar and temple gates, but booksellers in pall mall to the west and warwick lane are also well represented. these locations show haywood’s books straying from pope’s imagined centers of dulness and the sewer of fleet street, almost equally representing the established center of the book trade close to stationers hall inside the city wall, and the new “reigning pleasures of the court and town.”

pope’s own record shows that the booksellers marketing his books from the time of his first publication in poetical miscellanies, the sixth part ( ) to that of the dunciad are almost wholly centered in the temple district, where the gate out to the strand and to westminster is situated (largely because of his association with lintot) [figure ].
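the filtering, imprint search, and gap-filling rule described above could be sketched roughly as follows. this is a minimal python illustration under stated assumptions, not the procedure actually used for the study: the record fields (`year`, `place`, `bookseller`, `imprint`, `location`), the helper names, the sellers "a", "b", and "c", and every sample value are hypothetical, and a real estc export may be structured quite differently. the sketch also assumes that a gap is left empty when two named locations compete for it, a case the note above does not address.

```python
from collections import Counter


def rec(year, place, seller, imprint, location=None):
    """Build one hypothetical catalogue record as a plain dict."""
    return {"year": year, "place": place, "bookseller": seller,
            "imprint": imprint, "location": location}


# Invented sample data: the sellers "a", "b", "c", the imprint wording, and
# the years are placeholders, not values taken from the real catalogue.
RECORDS = [
    rec(1720, "london", "a", "printed for a, near temple bar", "temple bar"),
    rec(1720, "london", "a", "printed for a"),
    rec(1722, "london", "a", "printed for a"),
    rec(1724, "london", "a", "printed for a, near temple bar", "temple bar"),
    rec(1726, "london", "a", "printed for a"),
    rec(1723, "london", "b", "printed for b, in paternoster row", "paternoster row"),
    rec(1723, "london", "b", "sold by b, at the temple gate", "temple gate"),
    rec(1723, "london", "b", "printed for b"),
    rec(1723, "dublin", "c", "printed for c"),
]


def filter_records(records, first_year, last_year, place):
    """Keep records published in `place` between the two years, inclusive."""
    return [r for r in records
            if first_year <= r["year"] <= last_year and r["place"] == place]


def count_imprints(records, phrases):
    """Count records whose imprint text mentions any phrase (case-insensitive)."""
    wanted = [p.lower() for p in phrases]
    return sum(1 for r in records
               if any(p in r["imprint"].lower() for p in wanted))


def fill_missing_locations(records):
    """Return copies of the records with unnamed locations filled where possible.

    A gap is filled only when the record's year falls inside the span of years
    over which exactly one location is named for that bookseller. The same-year
    case is a span of length one. Ambiguous gaps are left empty (an assumption
    of this sketch, not something the article specifies).
    """
    spans = {}  # (bookseller, location) -> (first year named, last year named)
    for r in records:
        if r["location"]:
            key = (r["bookseller"], r["location"])
            first, last = spans.get(key, (r["year"], r["year"]))
            spans[key] = (min(first, r["year"]), max(last, r["year"]))

    filled = []
    for r in records:
        r = dict(r)  # copy, so the input records stay untouched
        if not r["location"]:
            candidates = [loc for (seller, loc), (first, last) in spans.items()
                          if seller == r["bookseller"]
                          and first <= r["year"] <= last]
            if len(candidates) == 1:
                r["location"] = candidates[0]
        filled.append(r)
    return filled


london = filter_records(RECORDS, 1720, 1727, "london")
temple_hits = count_imprints(london, ["temple bar", "temple gate"])
filled = fill_missing_locations(london)
tally = Counter(r["location"] for r in filled if r["location"])
print(temple_hits, dict(tally))
```

in the sample rows, the latest record for seller "a" stays unfilled because it falls outside the years in which a location is named, and the unnamed record for seller "b" stays unfilled because two locations are named in the same year. a tally like `tally` is the kind of per-location count that could then drive the circle areas in a bubble-style map.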
notably, as maynard mack explains, except for one or two pieces in other anthologies, after his first publication in poetical miscellanies (printed for jacob tonson within grays inn gate), pope never again published with tonson. mack suggests this was very probably “because they were ‘too well matched in the driving of shrewd bargains’”: pope was firmly situated in the landscape of both kinds of dunces, the booksellers driven by profit, and the writers scribbling away for the same end. moreover, while a significant portion of his publications were sold inside the old city walls at the start of his career, in the s up to the publication of the dunciad the bulk of his works were sold near the temple bar or gates in fleet street. pope’s own commercial profits squat squarely upon corinna’s evening cates.

see <http://grubstreetproject.net/alexanderpope/dunciad - >, figure .

mack, alexander pope: a life, .

see <http://grubstreetproject.net/alexanderpope/dunciad - >, figure .

i return now to my premise concerning digital topographies. my readers surely will have noted that i am presenting this information in print, but what i present here is of course a limited view of digital texts and visualizations, which can be dynamically generated from a database; which can connect with other collections online ranging from london lives - to eighteenth-century collections online to thconnect; and which are not limited in terms of the space allotted by the page format, the use of colour, the number of figures, or the generation of new views with new search queries. visualizing connections and connectedness (material, topographical, professional, temporal, symbolic) gives scholars an opportunity to “see” the poem differently than when we approach it through a printed edition.

for some examples of digital displays, and for more complete tables of booksellers and their locations, see <http://grubstreetproject.net/alexanderpope/dunciad - >.
these new readings, whether dynamically generated or carefully edited and statically displayed, can heighten and consolidate older connections that we have already made at the same time as they invite newer conceptualizations. pope’s self-construction in opposition to his named dunces is clearly rather hypocritical, situated as he was in the crass commercialism of the book trades expanding out of the city along with other commercial ventures encroaching upon the strand. this claim is not particularly new, but to see the book trade of pope and the dunces in this light suggests how powerfully motivated pope was to remove himself from the very locale of his commercial success: he was firmly positioned in the market he so despised, and situating lewis theobald or daniel defoe or eliza haywood in that locale would seem to be a move to displace his own complicity, and his own marked presence, in the money-grubbing market of the book trade in fleet street. the power of databases and digital visualization creates a new text against which to read pope’s heterotopic vision. digital mapping gives us, at the very least, the ability to see literary works in ways we have not yet seen them.

allison muri
university of saskatchewan

visualizing the dunciad as heterotopia

figure . the topography of “all the grubstreet race”
figure . the “taste of the rabble” progresses from smithfield to become the “reigning pleasures of the court and town.”
figure . all of lewis theobald’s booksellers in estc records up to publication of the dunciad ( – ). as in a bubble chart, the area of each circle corresponds to a value (here, the number of books published or sold in a given location).
figure . daniel defoe’s booksellers in estc records for the decade up to publication of the dunciad ( – ).
figure . all alexander pope’s booksellers in estc records up to the publication of the dunciad ( – ).
alexander pope’s booksellers in estc records for the decade up to publication of the dunciad ( – ).

graphic criticism and the material possibilities of digital texts

remaking collection

how to cite: lorber-kasunic, j and sweetapple, k graphic criticism and the material possibilities of digital texts. open library of humanities, ( ): , pp. – , doi: https://doi.org/ . /olh.
published: september
peer review: this article has been peer reviewed through the double-blind process of open library of humanities, which is a journal published by the open library of humanities.
copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.
open access: open library of humanities is a peer-reviewed open access journal.
digital preservation: the open library of humanities and all its journals are digitally preserved in the clockss scholarly archive service.

jacqueline lorber-kasunic and kate sweetapple
university of technology sydney, au
corresponding author: kate sweetapple (kate.sweetapple@uts.edu.au)

narratives of material loss are often attributed to the process of digitising cultural heritage collections. not being able to physically hold a literary artefact denies the reader an embodied understanding of the text made possible through tangible and contextual cues.
what the artefact feels like (the dimensions, weight, volume, and paper quality) and where it is located (the institution, collection, shelf, or archival box) all play a role in the production of textual meaning. thus, the argument stands that by removing these cues certain ways of knowing a text are diminished. the process of digitisation, however, is not solely one of loss. scholars working with digital texts are finding new ways to search, model, analyse, and rearrange written language, and in doing so are benefiting from the interpretive possibilities of textual mutability. while some scholars are taking advantage of digital materiality through computational text analysis, far less attention has been paid to the non-verbal materialities of a text, which also play a role in the production of meaning. to explore the potential of these non-verbal materialities, we take a digitised version of herman melville’s moby-dick; or, the whale and alter graphic features of the page such as line length, type size, leading, white space, and tracking. through a critical design practice we show how altering these non-verbal elements can reveal textual qualities that are difficult to access by close reading, and, in doing so, create new, hybrid works that are part literary page, part information visualisation.

‘and, like the whale, once it is carved and rendered into its separate components, it is no longer the same creature it was when it was whole.’ (middleton, : )

introduction

over the past two decades, the digitisation of cultural collections has generated vast numbers of textual surrogates. while these surrogates were initially understood to be place-holders for the original, recent discourse has begun to position them as unique artefacts with distinct qualities and capacities.
these artefacts are being reimagined as new, non-identical objects, with their own ‘ontological identity’, a process which challenges the conventional understanding of representations as mere copies and, furthermore, recognises that ‘for some purposes [the potential of these new forms] may exceed that of the originals’ (mueller, : para ). one significant way in which these artefacts ‘exceed that of the originals’ is through their digital malleability; that is, they can be computationally searched, modelled, analysed, and rearranged, unlike their print counterparts. while this malleability has enabled scholars to develop new textual practices in the humanities, these methods have not attended to the non-verbal elements of a text. these include the graphical qualities of a page, which, like written language, ‘make an important contribution to the production of semantic meaning … and can and should be understood as integral to textuality’ (drucker, a: ). nonverbal elements ‘often pass without registration or remark’ (mak, : ). when reading a page, we do more than attend to the written language; we register the typeface and size, number of columns, width of margins, absence or presence of headers, footers, or title—not consciously, nor as single entities, but rather as interdependent actors that shape the page and in turn shape our understanding of the textual artefact (drucker, b). at a macro level these visual qualities denote the genre of the artefact—for example a novel, a manuscript, a newspaper, a dictionary— while at a micro level they operate as a series of content directives—a subheading signalling a new section, an indent for a new paragraph, italics for emphasis, and indices and page numbers for navigational devices. 
these graphical features and the spatial relationships they generate are the under-acknowledged material qualities of the page, rendered invisible by habituation, yet critical to the interpretation of a text. it is this understanding of the graphic and spatial qualities of the page as having semantic agency that has been under-theorised in humanities scholarship. in this next section, we explore the productive potential of an expanded understanding of materiality by remaking the page; that is, by showing how making, or more specifically thinking-through-making, can be used as a method of inquiry. while humanities scholars have only recently begun to recognise the potential of thinking-through-making, it has long been an essential condition for design-based research (burdick et al., : ). this shift towards an epistemology of making suggests new possibilities for design in humanities scholarship. as thompson klein ( : ) explains, making ‘brings the creative practice of design to the centre of research, favouring process over product’. stephen ramsay ( : para ) goes so far as to call this process a ‘new hermeneutic—one that is quite a bit more radical than taking the traditional methods of humanistic inquiry such as reading, writing, analysis, and interpretation’. by employing this methodology, we begin to ask ‘how can making with a focus on the semantic potential of graphic materiality reveal new insights?’ or, more specifically, ‘what qualities of a text are we unable to apprehend through conventional practices of interpretation, such as close reading?’ to explore these questions, we take a digitised version of herman melville’s novel moby-dick; or, the whale and alter the graphic features of the page, such as line length, type size, leading, white space, and tracking.
even though we are using a digitised version of the novel, we maintain the format of a printed literary page, a persistent form encoded into digital interfaces. it is precisely the familiarity of the literary page that is required in order for defamiliarisation to occur. this process of graphically altering a text, what we refer to as graphical ‘deformance’, creates hybrid works that are part literary page, part information visualisation. these new forms provide scholars with additional ways of exploring a text, and therefore graphical deformance can be understood as a hermeneutic act.

footnotes: we are not assuming that meaning lies exclusively within the material, and is therefore accessible through a rich description of physical properties, which would be to fall into the trap of literal materiality (drucker, b). a material’s capacity to produce meaning is a consequence of its associations with particular cultural and social contexts, not its inherent properties. some recent notable exceptions include johanna drucker ( ), n. katherine hayles ( ) and bonnie mak ( ).

designing deformance

before introducing the notion of graphical deformance we will look at some historical precedents in literary studies. in this field the process of altering, disrupting, or re-organising a text in order to bring to the surface previously inaccessible or obscured qualities of a work is called ‘deformance’. deformance describes an intervention into a text in order to bring attention to textual qualities that elude conventional criticism. as jerome mcgann and lisa samuels ( : ), who introduced the term, write, through this process ‘we are brought to a critical position in which we can imagine things about the text that we did not and perhaps could not otherwise know’.
these interventions may include reading a poem backwards, reordering the lines of a poem, or isolating only the nouns and verbs in a poem. in the following example, samuels and mcgann rework the wallace stevens poem ‘the snow man’, by reading the poem backwards so that the final line becomes the first, the second-last line becomes the second, and so on (figure ). read backwards, the final stanza in stevens’ poem becomes: nothing that is not there and the nothing that is. and, nothing himself, beholds for the listener, who listens in the snow,

those of you familiar with the paris-based experimental collective ouvroir de littérature potentielle (oulipo), founded in , will recognise these playful yet unconventional approaches to literary criticism. the oulipians also experimented with the application of formal and procedural constraints in order to explore literature’s possibilities. one of their best-known formulae is ‘n + 7’. in this experiment a writer takes a poem already in existence and substitutes each of the poem’s nouns with the noun appearing seven nouns away in the dictionary. thus, n + 7. by taking wallace stevens’ ‘the snow man’ again, and applying this process, we end up with a new poem, ‘the soap mandible’ (figure ). therefore, ‘one must have a mind of winter’ becomes ‘one must have a miniature of wisdom’; ‘to regard the frost and the boughs’ becomes ‘to regard the fruit and the boulders’; and, ‘of the pine-trees crusted with snow’ becomes ‘of the pinions crusted with soap’.

footnote: for the poem, see https://www.poets.org/poetsorg/text/brief-guide-oulipo.

figure : ‘the snow man’ ( ) by wallace stevens.
figure : applying the n + 7 formula to ‘the snow man’ by wallace stevens.
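The substitution procedure described above (swap each noun for the noun seven entries further on in a dictionary) can be sketched in a few lines of Python. This is a minimal illustration, not the tool used to produce ‘the soap mandible’: the function name `n_plus`, the idea of passing in a plain list of nouns, and the wrap-around at the end of the list are all assumptions made for the example. A real run would need a full noun dictionary and proper part-of-speech tagging.

```python
import re
from bisect import bisect_left


def n_plus(text, noun_list, offset=7):
    """Replace every word found in noun_list with the noun `offset` entries later.

    noun_list is a hypothetical stand-in for a dictionary's noun entries.
    Words that are not in the list are left untouched.
    """
    nouns = sorted(set(word.lower() for word in noun_list))
    if not nouns:
        return text

    def swap(match):
        word = match.group(0)
        index = bisect_left(nouns, word.lower())
        if index == len(nouns) or nouns[index] != word.lower():
            return word  # not a known noun
        # Assumption: wrap around when the offset runs past the last entry.
        return nouns[(index + offset) % len(nouns)]

    return re.sub(r"[A-Za-z]+", swap, text)


# Toy dictionary of six nouns; far too short for a true seven-step shift,
# so the usage below steps by one entry instead.
toy_nouns = ["frost", "fruit", "mind", "mist", "snow", "soap"]
print(n_plus("to regard the frost and the boughs", toy_nouns, offset=1))
```

With the toy list, ‘frost’ steps to ‘fruit’ and ‘snow’ to ‘soap’, while ‘boughs’ is left alone because it is not in the list; which nouns appear in a full run depends entirely on the dictionary chosen.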
what is critical to understand about these practices of textual deformation is that the purpose is not to ascribe new meaning to a text; it is not, in the first instance, to assist with the act of interpretation. nor, writes stephen ramsay ( : ), is it for ‘the immediate apprehension of knowledge’. rather, deformance is used to defamiliarise the text, to enable texts to be seen anew. this process of defamiliarisation, referred to as ‘estrangement’ by russian formalists in the early part of the th century, is a method of presenting familiar things in an unfamiliar or strange way. critically, defamiliarisation demands a slowing down of the reading process and an increased awareness of the creative devices that construct a text. on reading backwards, mark sample ( : para ) writes that it ‘revitalizes a text, revealing its constructedness, its seams, edges, and working parts’. digital technologies have made deformance a more common practice. the transformation of print artefacts into machine readable forms, coupled with computational text analysis tools that can read them, allows text to be treated as infinitely malleable and mutable. this has enabled scholars to explore sources in ways that were previously difficult, if not impossible. sample ( : para ), paraphrasing ramsay, makes the point that in the branch of digital humanities that focuses on text analysis and data-mining, deformance is a key methodology: ‘computers let us practice deformance quite easily’, he writes, by ‘taking apart a text—say, by focusing on only the nouns in an epic poem or calculating the frequency of collocations between character names in a novel’. we are more interested, however, in practices of deformance that attend to the graphical, not linguistic, features of a text: the oft-forgotten ‘architectures of a page’ which are intrinsic to the interpretation of texts (mak, ).
we are specifically concerned with acts of deformance that transform graphical features such as line length, type size, leading, white space, and tracking. the page dimensions and the placement of the text box, however, reflect design conventions in this study. we retain these conventions not by default nor for reasons of nostalgia but because the standard form of the page is a critical reference point from which all subsequent alterations can be recognised. without conventions there can be no deformance.

footnote: for an excellent account of the typographic conventions of the novel and their disruption, see zoe sadokierski ( ).

the transformation of these graphical features, with the specific intent of revealing qualities of textual artefacts, is familiar territory to a handful of design practitioners. through visual means, designers stefanie posavec ( ), owen herterich ( ), and jonathan puckey ( ), among others, explore (respectively): abstracting the sentence length and paragraph structure of the opening chapters of classic novels; isolating the spoken discourse in novels to reveal frequency, intensity, and patterns of dialogue; and shifting font size and weight, and using redaction strategies to map the evolution of daily news (lorber kasunic and sweetapple, ). while none of these designers positions his or her work as graphic deformance, they are apprehending written artefacts visually and are therefore important reference points for this research.

experimentations

to explore the potential of graphical deformance as a critical strategy and to better understand how the formal elements of a page might shape the structure of a text, we remake all, or parts of, herman melville’s us edition of moby-dick; or, the whale.
this classic work of american literature is in part the story of captain ahab’s monomaniacal hunt for the white whale (‘moby dick’) and ishmael’s spiritual journey from ‘alienation to harmony to skepticism and finally to detached balance’ (middleton, : ). inspired by the story of the essex, a whaling ship that sank in after an encounter with a whale, moby-dick is a fictional voyage that follows ahab’s ship the pequod as it crosses the world. it draws extensively on melville’s own experiences as a seaman and his engagement with scientific issues of the th century (wilson, ). as a result of the novel’s availability on the web it has become a common text for digital humanities students to computationally analyse and study. this is partly due to its length (approximately , words), making it a good-sized corpus to algorithmically interrogate, and partly due to the voluminous critical attention dedicated to it through a variety of academic journals, centres, and platforms. this includes leviathan: a journal of melville studies, the melville society and the melville electronic library at hofstra university, to name but a few.

the aim of these experiments is to show how a focus on graphic materiality can operate as a critical approach to exploring a text. as we have discussed, literary scholars have largely ignored graphic materiality when apprehending a text. however, designers, with their epistemological understanding of the role of the visual in signification, are well placed to do this.

one chapter, two pages

one of our first experiments, one chapter, two pages, is the deformance of two well-recognised graphic elements: type size and leading. the design of a novel is highly standardised, with text blocks commonly set in a – point serif typeface, leading at – % of the type size (e.g.
pt text, pt leading), justified, margins of equal size, and paragraph indents about the width of the characters’ cap height. the sizes of novels are also standardised. in one chapter, two pages, we challenge these conventions by taking each chapter of moby-dick and resizing the type so that an entire chapter fits within the margins of a double-page spread. thus, chapters fall over double-page spreads. the size of the typeface is therefore determined by the length of the chapter and its capacity to fit across the two pages, starting at the top of the first page and ending at the bottom of the second. consequently, as the length of the chapters in moby-dick varies considerably, so too does the point size of the text. the shortest chapter, ‘midnight aloft – thunder and lightning’ (chapter , figure ), contains words and is set at . / . pt (type size over leading), whereas the longest chapter, ‘the town-ho’s story’ (chapter , figure ) contains words, and is set at . / . pt. figures – illustrate the increase in number of words per chapter and the corresponding decrease in type size.

one of the consequences of typesetting a book so that each chapter fills a double-page spread is that we end up with a novel of unusually large dimensions ( mm × mm). typically, the format of the book is decided by the publisher, leaving the designer to choose the typeface, size, and leading (as well as all the other paratextual elements) according to reading conventions. the length of the novel, or the number of pages, is a consequence of these typesetting choices and the word count. however, in one chapter, two pages, the format of the book is not given prior to its design. rather, it is determined by the size of the page required to typeset the largest chapter ( words) over two pages at a legible point size. and even though it is legible, it is difficult to read. not only is the type very small but the lines are well over the recommended line length: – characters compared to a more standard – characters. while this process of altering the graphical elements of a text (taking the shorter chapters and expanding them to fit across two pages while simultaneously compressing the longer chapters) may not tell us anything specific about the plot or the content of the novel, it does draw our attention to the book’s overall structure. what is revealed through graphical deformance is the way melville changes the pace of his text by dramatically shifting chapter lengths. this is illustrated by flipping through the pages, but also by the thumbnail overview at the end of our publication (figure ).

footnote: as long as the inner margins don’t make the text disappear into the gutter.

figure : typesetting chapter ‘midnight aloft – thunder and lightning’ as a double page spread.
figure : typesetting chapter ‘the street’ as a double page spread.
figure : typesetting chapter ‘the town ho’s story’ as a double page spread.

tracking and sentence length

the second act of graphic deformance also attends to melville’s writing style, but instead of focussing on the length of his chapters, we create a visual strategy that reveals the variation in sentence length throughout the novel. by varying the tracking (the space between the letters) in relation to the number of words in each sentence we can begin to see the rhythms or patterns in melville’s prose. any sentence with fewer words than the average ( . words per sentence) has incrementally reduced tracking (e.g. a – word sentence would have – tracking), and, conversely, any sentence longer than average has incrementally increased tracking (e.g. a – word sentence would have tracking).
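The rule just described, tighter tracking for sentences below the average word count and looser tracking for those above it, can be expressed as a small mapping. The sketch below is only an illustration under stated assumptions: the linear `step` per word, the clamp `limit`, and the naive sentence splitter are hypothetical choices, not the values or tooling used for the spreads, and the average is computed from whatever text is passed in rather than from the whole novel.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tracking_for(word_count, average, step=5, limit=100):
    """Map a sentence's word count to a tracking value.

    Negative values tighten the letters, positive values spread them.
    step and limit are hypothetical parameters chosen for illustration.
    """
    raw = round((word_count - average) * step)
    return max(-limit, min(limit, raw))


def track_text(text, step=5, limit=100):
    """Return (sentence, tracking) pairs for every sentence in the text."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    counts = [len(sentence.split()) for sentence in sentences]
    average = sum(counts) / len(counts)
    return [
        (sentence, tracking_for(count, average, step, limit))
        for sentence, count in zip(sentences, counts)
    ]


sample = "Call me Ishmael. This second sentence is deliberately much longer than the first one."
for sentence, tracking in track_text(sample):
    print(tracking, sentence)
```

In this toy run the three-word opening receives negative tracking and the longer sentence positive tracking. A typesetting script would then apply each value to the corresponding run of text.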
we decided to tighten tracking for short sentences and expand tracking for long sentences in order to emphasise and reinforce the effect of different sentence structures. short sentences grab the reader’s attention. they are quick and dynamic, creating drama and intensity, and are often used to describe action. longer sentences slow the pace of the narrative, can be reflective or rambling, and provide space for rich description or the building of suspense. it is, however, the combination of lengths that gives these varying sentences their potency. this process of graphic deformance, specifically the visual tightening of short sentences, quickly reveals a characteristic of melville’s writing: concise opening lines. leafing through the newly tracked pages shows tightly clustered letters at the beginning of many chapters and paragraphs. melville ( [ ]: ) starts his novel as he intends to proceed by opening with one of literature’s shortest and best-known first lines, ‘call me ishmael’. the lead sentences in chapter one’s fourth and fifth paragraphs are equally brief: ‘once more’ and ‘but here is an artist’ ( ) (figure ). this pattern repeats throughout the book.

figure : thumbnails of chapters – .

chapter marks the beginning of the three-chapter chase of the white whale, a climactic and fatal event which begins ‘it was a clear steel-blue day’ ( ) (figure ). its clarity and brevity do nothing to foretell the ensuing drama. and although many of the opening lines are longer than a handful of words, they are more often than not shorter than melville’s average sentence length ( . words). revealed through this graphic strategy is melville’s extraordinary range of sentence lengths, from one-word sentences to the seemingly endless -word sentence in chapter (‘the whiteness of the whale’).
in this chapter ishmael accounts for his fear of whiteness: ‘it was the whiteness of the whale that above all things appalled me’ ( ). he begins with a discussion of virtues commonly associated with the colour white, purity and even holiness, before moving into more philosophical terrain, where white is associated with ghostliness, absence, a void, the unknown. a -word sentence opens the chapter (tracked at – ), followed by sentences of , , , and words (tracked at , – , – , and , respectively) (figure ). this variation becomes apparent through the shifting visual density of the sentences and text blocks. in a serendipitous moment, the very subject of the chapter, whiteness, is rendered textually by the generous tracking of the third paragraph, which is a single sentence running to over four pages (figures – ). words are pulled apart, isolating letters which now float on the page and which are only partially tethered to a lexical unit. the horizontal lines of type that are typical of western reading and writing conventions momentarily disintegrate, tending instead towards vertical coherence at the edges of the text blocks.

figure : using tracking and sentence length as a graphic strategy on chapter ‘loomings’.
figure : using tracking and sentence length as a graphic strategy on chapter ‘the symphony’, a close up view.
figure : using tracking and sentence length as a graphic strategy on chapter ‘the whiteness of the whale’.
figure : an example of melville using long sentences in chapter .
figure : an example of the variation in tracking and sentence length in chapter continued.
the pull of justified type creates these perpendicular lines of code that seem to promise readability if read from top to bottom, not left to right. this promise, however, is quickly broken by a string of nearly-words: ‘thorb’, ‘peoct’, ‘maan’, ‘satal’, ‘dosab’ (figure ). towards the middle of the text blocks, the letters swim, belonging to neither warp nor weft, leaving white holes in the fabric of the page. rarely, however, does another long sentence immediately follow. when looking through the graphically altered pages a visual rhythm appears: long, airy sentences are followed by tightly written sentences, creating the illusion that the longer sentences are pushing up against the short. in the final paragraph of this chapter, aware of the complexity as well as the fragility of his long sentences, melville pulls tight the narrative thread by finishing with two short sentences: ‘and of all these things the albino whale was the symbol. wonder ye then at the fiery hunt?’ ( ) (figure ). the purpose of these sentences, ensuring the key ideas are set firm and clearly anchored in the reader’s mind, is visually reflected in the tightly knitted words, welded together by the negative tracking of a short sentence. this strategy of graphic deformance is simultaneously analytical and descriptive, drawing attention to the structural properties of the text whilst expressively embodying the function of the sentences: to draw out the narrative or pull it together.

figure : page of chapter ‘the whiteness of the whale’.

and while much is made of melville’s innovative and unorthodox approach to novel writing, chapter (‘the chase – third day’) illustrates how conventional his sentence structures can be.
in this, the final chapter (before the epilogue) and day three of the battle with the whale, it becomes apparent that only death will release captain ahab from his obsession. his crew and the pequod seem similarly doomed. melville delivers this inevitable end through a combination of short and long sentences; rarely are they average in length. this common writing strategy of short sentences to describe dramatic action and create urgency, and long sentences to build suspense and tension, is made visually apparent through graphic deformance. throughout this chapter we see pages of text rendered barely legible through the too-tight tracking of staccato sentences (figure ):

“is my journey’s end coming? my legs feel faint; like his who has footed it all day. feel thy heart,—beats it yet?—stir thyself, starbuck!—stave it off—move, move! speak aloud!—mast-head there! see ye my boy’s hand on the hill?—crazed;—aloft there!—keep thy keenest eye upon the boats:—mark well the whale!—ho! again!—drive off that hawk! see! he pecks—he tears the vane”—pointing to the red flag flying at the main-truck— “ha! he soars away with it!—where’s the old man now? see’st thou that sight, oh ahab!—shudder, shudder!” ( )

this is followed by a long sentence, spread wide, shifting the narrative pace and providing a break in the action (figure ):

the boats had not gone very far, when by a signal from the mast-heads—a downward pointed arm, ahab knew that the whale had sounded; but intending to be near him at the next rising, he held on his way a little sideways from the vessel; the becharmed crew maintaining the profoundest silence, as the head-beat waves hammered and hammered against the opposing bow. ( )

this long sentence also creates an opportunity to build tension before the next round of action which follows (figure ):

“drive, drive in your nails, oh ye waves!
to their uttermost heads drive them in! ye but strike a thing without a lid; and no coffin and no hearse can be mine:—and hemp only can kill me! ha! ha!” ( )

this pattern of short-long-short, visually rendered by compacted line lengths and loose, drawn-out sentences, continues until the end, when melville changes the pace to match the slow sinking of the pequod with the penultimate paragraph, which consists of two long sentences: and words. he concludes the chapter with a modest -word sentence, looking for neither drama nor suspense, but rather an ending in which the ship and all but one of the crew are buried beneath ‘the great shroud of the sea’ ( ) (figure ).

figure : an example of melville using short sentences for effect in chapter ‘the chase’.
figure : a pattern of short-long-short sentence lengths visualised in the penultimate paragraph of chapter .

what the changes in the length of chapters or sentences mean from a literary perspective is not for designers to speculate on. however, what we do know, as designers, is that meaning is graphically constituted, and therefore that making and remaking a text becomes a productive and generative research method through which to critically apprehend texts.

character speech

in this next deformance experiment, we first identify and then graphically isolate the speech in melville’s text. by speech, we refer to conversations between two or more characters, as well as soliloquies and monologues. in moby-dick, speech has an important function. it reflects the various stages of the pequod’s journey from nantucket to the pacific (eldridge, ) and also enables melville to radically alternate between the ‘active, strenuous presence of the crew’ and ‘the long, deliberate swell of the middle section of moby-dick [where] there is tense calm before the boiling climax of the chase’ (middleton, : ). melville uses speech as a way to increase and ease the dramatic tension in the text, as well as to ‘affect the structure, tone, narrative rhythm, and characterisation’ of the novel (middleton, : ).

to isolate the speech on the page, we erase the non-speech text, leaving the area blank. erasure, or the removal of text, has long been an important strategy in art and design practice. white space is not a void or an absence but a material element that is part of the semantic value of a text. it is integral to the way we read a text. as drucker ( : ) explains:

an unprinted area … is not a given, inert or neutral space, but an espace, or field, in which forces among mutually constitutive elements make themselves available to be read … white space is thus visually inflected, given a tonal value through relations rather than according to some intrinsic property.

this strategy of graphically omitting the non-speech passages enables us to read the speech as it occurs in the book, on each page, chapter by chapter. the most immediate effect of this graphic deformation is a sense of the volume of speech melville creates, and where and when it occurs in the novel. for instance, there are chapters where melville uses speech intensively, such as chapters , , and (‘sunset’, ‘dusk’, and ‘first night watch’) (figures – ). here the reader encounters a range of lengthy soliloquies (internal monologues) as the book shifts from the ‘colloquial speech of nantucket to the lingua franca of the sea itself’ (middleton, : – ). in contrast, in chapters to the speech is sparse and only appears occasionally on the page (three lines overall) (figures – ). there are also long periods in the novel where there is no speech at all, only an endless sea of white, such as chapter (‘cetology’), which is pages long.
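The erasure strategy described above, keeping speech in place and blanking everything else, can be approximated by replacing each non-speech character with a space so that the surviving words stay at their original positions on the page. The sketch below rests on a loud assumption: that speech is delimited by curly double quotation marks, as in the passages quoted in this article. The function name `isolate_speech` is hypothetical, and soliloquies or speech that runs across paragraphs without a closing mark would need manual tagging.

```python
def isolate_speech(page, open_quote="“", close_quote="”"):
    """Blank out every character that is not inside double quotation marks.

    Line breaks are preserved and erased characters become spaces, so the
    remaining speech keeps its position within the text block.
    Assumption: speech is marked with the given opening and closing quotes.
    """
    kept = []
    inside = False
    for char in page:
        if char == open_quote:
            inside = True
            kept.append(char)
        elif char == close_quote and inside:
            kept.append(char)
            inside = False
        elif inside or char == "\n":
            kept.append(char)
        else:
            kept.append(" ")  # erase narration but hold its place
    return "".join(kept)


page = "He said, “Ha! ha!” and left.\n“Drive!”"
print(isolate_speech(page))
```

Because the output has the same length and line structure as the input, it can be set with the original type size and leading, and pages without speech come out entirely blank.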
here melville categorises species of the whale as if he were cataloguing his library of folios. these sections with no speech are significant because they provide periods of peace which are periodically broken by the presence of drama such as the sighting of the whale: ‘there she blows!’ (see figures and ). phillip hoare ( : – ) explains that such sections lay out the whales’ physical structure with a wry mixture of known facts and arch analogy, and are void of dialogue.

footnote: a recent special issue of media-n: journal of the new media caucus which focuses on the ‘aesthetics of erasure’ is typical of the productivity of this practice in the digital realm.

footnote: seventeen of the book’s chapters focus on whale anatomy or behaviour. these chapters include ‘the sperm whale’s head – contrasted view’ and ‘the right whale’s head – contrasted view’.

figure : visually isolating melville’s use of speech in chapters , and (‘sunset’, ‘dusk’, and ‘first night-watch’).

figure : visually isolating melville’s use of speech in chapter ‘the sunset’.

while the speech first appears inconsistent, this process of graphic deformance reveals how melville’s use of speech signifies the changing geographic stages of the pequod’s journey across the atlantic ocean, indian ocean, eastern seas, pacific ocean, and finally the central pacific whaling grounds (eldridge, ).

figure : visually isolating melville’s intense use of speech in chapters and (‘first night-watch’ and ‘midnight-forecastle’) continued.

figure : chapter ‘ambergris’ contains little or no speech.

figure : chapter ‘the castaway’ contains little or no speech.

figure : chapter ‘the castaway’ using speech sparingly.
figure : the first three pages of chapter ‘a squeeze of the hand’ use no speech at all.

figure : an example of melville using no speech in his text.

figure : graphic evidence of melville using speech sparingly in chapter ‘the cassock’.

figure : graphic evidence of melville using speech sparingly in chapter ‘the try-works’.

by viewing the chapters as thumbnails, graphic deformance enables us to see melville’s arrhythmical use of speech, which is nonetheless strategic in the way it functions to drive narrative, and to mark the shifting geographic and psychological landscapes. this distant view reveals the changing dynamic between action and repose, described by middleton ( : ) as the ‘two signatures of the dual rhythm of moby-dick’.

figure : long periods of prose are broken up by sightings of the whale.

isolating the speech also enables intimate perspectives of the characters when viewed at the level of a single page. by removing the non-spoken text, the voice of an individual character is amplified. the detailed context within which the characters appear is silenced, enabling us to see their eccentricities and habits. figure , for example, shows an exchange between flask and stubb.

graphic deformance not only indicates the concentration of speech, but also the type: for example, dialogue or monologue. a monologue such as the one found in chapter , ‘the sermon’ (figure ), is easily recognised, because it appears as a solid block. in contrast to this, dialogue between two or more people can be identified through the shape of the isolated text, which consists of broken line lengths and the punctuated white space of the page (figure ). the dynamic visual relationship set up by these irregular shapes, and the spaces between them, prompts a to-ing and fro-ing between the characters. by enabling a distant view of the novel, one which encompasses all of the chapters, as well as a close reading of a single page, the semantic potential of graphic materiality is made available for visual processing, and thus provides an alternate possibility for critically appraising a text.

figure : long periods of prose are broken by sightings of the whale.

figure : an exchange between two characters, flask and stubb, in chapter ‘ahab’s boat and crew. fedallah’.

figure : an example of a monologue in chapter ‘the sermon’.

. moby-dick dictionary

the final experiment differs considerably from the three previous examples. it is simultaneously a linguistic and graphical deformance. we start by taking the entire novel moby-dick and transforming it into a hybrid dictionary-concordance. each unique word is identified computationally and placed in context; that is, each word is shown as it would appear in every sentence over the course of the whole book (figure ). graphically, it is typeset like a dictionary, a format that is highly regulated in terms of its structure, but it operates like a concordance. this process of expanding the text (taking an unusually long novel of , + words and making it even longer, over million words) can be understood as linguistic deformation. while the text has not been reduced, it has been significantly altered, no longer resembling the narrative that melville initially constructed.
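the dictionary-concordance transformation described above can be illustrated with a short sketch. this is a minimal python example under stated assumptions rather than the authors' actual pipeline, which the article does not document: sentences are split naively on terminal punctuation, words are lower-cased, a sentence is listed once per headword even if the word occurs in it several times, and the function name build_concordance is hypothetical.

```python
import re
from collections import defaultdict


def build_concordance(text):
    """Map each unique word to the sentences in which it appears.

    Entries are sorted alphabetically, like dictionary headwords; the
    sentences under each headword keep their order in the text.
    """
    # Naive sentence split (assumption): break after ., ! or ? plus whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    entries = defaultdict(list)
    for sentence in sentences:
        seen = set()
        # Words are runs of letters, optionally joined by ', ’ or -.
        for word in re.findall(r"[a-z]+(?:['’-][a-z]+)*", sentence.lower()):
            if word not in seen:
                seen.add(word)
                entries[word].append(sentence)
    return dict(sorted(entries.items()))


if __name__ == "__main__":
    # Invented sample text, used only to exercise the function.
    sample = "Call me Ishmael. The whale swam. I saw the whale!"
    for headword, contexts in build_concordance(sample).items():
        print(headword)
        for context in contexts:
            print("    " + context)
```

since every sentence is repeated under each of its words, the output grows much faster than the input, which is the expansion the passage describes as linguistic deformation.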
at first glance this example may seem to be a relatively conventional piece of design work, reflecting the typographic practices of a reference text. however, this example also exhibits high levels of graphic deformation if we are to consider its origin as a novel. by transforming a page of prose into two columns, and introducing indentation, bold, and italicised text, as well as the paratextual elements of a dictionary, we have transformed the text from one genre to another. although it remains a recognisable archetype, it has been significantly altered.

figure : dialogue expressed as broken line lengths and white space on page .

this process of graphically altering the text, or defamiliarizing the novel, draws attention to individual lexical units. for example, we become aware of how many times melville uses the word ‘whale’ ( times), and that he uses the word ‘giraffe’ (unsurprisingly) only once. there are also words in this dictionary that at the time were neologisms created by the author (for example, a curio, an unusual or odd piece of art or bric-a-brac). second, these chronological entries also provide the reader with a history of usage in moby-dick, showing how each word is employed and evolves over the course of the narrative.

perhaps one of the most generative aspects of visually representing every word and how it is used over the course of the novel is understanding each entry as a micro-narrative. in this mode, a single word steps you through a narrative, constructed one disconnected sentence at a time. the text is sequential but not strictly linear, and is re-authored by the principles of lexicography (figure ).
arguably different in approach to our previous experiments, this is still an act of graphic deformance, as the qualities of a page are altered so as to change the text’s interpretative framework, from novel to dictionary, therefore allowing us to encounter the text anew. here the purpose of the work is not to ascribe new meaning to a text but to ‘reconstitute the work’s aesthetic form, as if a disordering of one’s senses of the work would make us dwellers in possibility’ (mcgann and samuels, : ).

figure : a close-up of moby-dick; or the whale typeset as a hybrid dictionary-concordance.

figure : a close-up of the entry ‘day’ as seen in the dictionary.

conclusion

through these experiments, we demonstrate how the structural and formal aspects of melville’s writing can be brought to the surface through the alteration of graphical elements of a page, thereby asserting the often-neglected role of graphic materiality as a form of critique. these methods of graphical deformance are not intended to be used in isolation or to replace existing tools of literary criticism, whether they are close reading or computational text analysis. nor are they exhaustive: there are many other graphic and spatial qualities to explore. rather, they are designed to show how a digitised text can be productively manipulated to create alternate ways of critiquing a text graphically.

in this project, which klein ( : ) would refer to as ‘boundary work’, the question of domain expertise arises: does it lie with the literary scholar or the design researcher? the answer is not a simple privileging of one discipline over the other but rather a process of attunement to graphic materiality led by the designer. as burdick et al.
( : ) state: not every digital humanist will become a designer, but every good digital humanist has to be able to “read” and appreciate that which design has to offer, to build the shared vocabulary and mutual respect that can lead to fruitful collaborations. while we primarily position our experiments as graphic deformance, they can equally be understood as forms of information visualisation. through methods of visual representation (shifting type size, tracking, and isolation) we reveal patterns within textual data. these newly created pages operate as both quantitative and qualitative forms of inquiry. however, in this paper, what we have begun to show is that the distinction between qualitative and quantitative methods is too simplistic, and that many of the problems with quantification in humanities research lie not in what aggregation can tell us, but rather in the visual languages used to present these understandings, which borrow heavily from scientific positivism, and thus embody values that are in opposition to core humanist values such as subjectivity, partiality, and uncertainty (drucker ). through this process of graphically deforming the text we present not the facts of the matter, but rather the manner in which melville has gone about writing his novel. these experiments are therefore hybrid texts, part literary pages and part information visualisations, made possible by the techniques of digital materiality, which enable scholars to graphically explore the structural and lexical qualities of the written language and to in turn open up new lines of inquiry. drawing on the domain expertise of visual communication design, we are proposing an alternate way of critiquing a text, one that takes into account the importance of graphical materiality and therefore embraces the inherent epistemological value of the visual (drucker, ). 
this focus is central to the development of what we term ‘graphical criticism’, that is, criticism derived from the graphic manipulation of text. although nascent, such an approach has the potential to expand the way in which we explore digital texts, as well as helping to acknowledge the contribution of visual knowing to the emerging field of digital scholarship.

footnote: inspired by stefan ramsay’s ( : ) term ‘algorithmic criticism’: criticism derived from the algorithmic manipulation of text.

acknowledgements

thanks to gemma warriner who co-designed ‘one chapter, two pages’ and the ‘moby-dick dictionary’. thanks also to kristelle de frietas, design research assistant for the tracking and speech experiments.

competing interests

the authors have no competing interests to declare.

references

burdick, a, drucker, j, lunenfeld, p, presner, t, and schnapp, j digital_humanities. cambridge, ma: mit press.
drucker, j a speclab: digital aesthetics and projects in speculative computing. chicago and london: university of chicago press. doi: https://doi.org/ . /chicago/ . .
drucker, j b entity to event: from literal, mechanistic materiality to probabilistic materiality. parallax, ( ): – . doi: https://doi.org/ . /
drucker, j diagrammatic writing. new formations, ( ): – . doi: https://doi.org/ . /newf. . .
drucker, j graphesis: visual forms of knowledge production. cambridge, ma: harvard university press.
hayles, n k writing machines. mediawork pamphlet. cambridge, ma: mit press.
herterich, o to see and hear. available at: http://owenherterich.com/projects/dialogue/index.php (last accessed october ).
hoare, p cetology: how science inspired moby-dick. nature, : – .
klein, j t the boundary work of making in digital humanities. in: sayers, j (ed.), making things and drawing boundaries: experiments in the digital humanities. – . minneapolis: university of minnesota press. doi: https://doi.org/ . /j.ctt pwt wq.
lorber kasunic, j and sweetapple, k visualizing texts: a design practice approach to humanities data. in: maragiannis, a (ed.), proceedings of the digital research in the humanities and arts conference, – . london: drha.
mak, b how the page matters. toronto: university of toronto.
mcgann, j and samuels, l deformance and interpretation. new literary history, ( ): – . doi: https://doi.org/ . /nlh. .
melville, h [ ] moby-dick ( nd edition). new york: w w norton.
middleton, j a shark-talk: the uses of dialogue in ‘moby-dick.’ unpublished thesis (phd), indiana university.
mueller, m martin mueller on ‘morgenstern’s spectacles or the importance of not-reading’, january . available at: http://sites.northwestern.edu/nudhl/?p= (last accessed october ).
posavec, s writing without words. available at: http://www.stefanieposavec.co.uk (last accessed october ).
puckey, j zeitgeist, jonathan puckey projects. available at: http://jonathanpuckey.com/projects/zeitgeist/ (last accessed october ).
ramsay, s on building. available at: http://lenz.unl.edu/wordpress/?p= (last accessed march ).
ramsay, s reading machines: towards an algorithmic criticism. urbana, chicago, and springfield: university of illinois press.
sadokierski, z disturbing the text: typographic devices in literary fiction. book . , ( ): – .
sample, m notes towards a deformed humanities. may , . available at: http://www.samplereality.com/ / / /notes-towards-a-deformed-humanities/ (last accessed october ).
wilson, e melville, darwin, and the great chain of being. american literature, ( ): – . doi: https://doi.org/ . /saf. .

how to cite this article: lorber-kasunic, j and sweetapple, k graphic criticism and the material possibilities of digital texts. open library of humanities, ( ): , pp. – , doi: https://doi.org/ . /olh.

published: september

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

open library of humanities is a peer-reviewed open access journal published by open library of humanities.

received / / review began / / review ended / / published / / © copyright gisondi et al.
this is an open access article distributed under the terms of the creative commons attribution license cc-by . ., which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

curriculum design and implementation of the emergency medicine chief resident incubator

michael a. gisondi, adaira chou, nikita joshi, margaret k. sheehy, fareen zaver, teresa m. chan, jeffrey riddell, derek p. sifford, michelle lin

emergency medicine, stanford university school of medicine
emergency medicine, brigham & women’s hospital, harvard medical school
emergency medicine, massachusetts general hospital, harvard medical school
emergency medicine, university of calgary, calgary, alberta
faculty of health sciences, mcmaster university, hamilton, can
emergency medicine, keck school of medicine of the university of southern california, los angeles, ca
computer science, wayne state
department of emergency medicine, ucsf school of medicine

corresponding author: michael a. gisondi, mgisondi@stanford.edu
disclosures can be found in additional information at the end of the article

abstract

background
chief residents receive minimal formal training in preparation for their administrative responsibilities. there is a lack of professional development programs specifically designed for chief residents.

objective
in , academic life in emergency medicine designed and implemented an annual, year-long, training program and virtual community of practice for chief residents in emergency medicine (em). this study describes the curriculum design process and reports measures of learner engagement during the first two cycles of the curriculum.

methods
kern’s six-step approach for curriculum development informed key decisions in the design and implementation of the chief resident incubator.
the resultant curriculum was created using constructivist social learning theory, with specific objectives that emphasized the needs for a virtual community of practice, longitudinal content delivery, mentorship for participants, and the facilitation of multicenter digital scholarship. the -month curriculum included key administrative or professional development domains, delivered using a combination of digital communications platforms. primary outcome measures included markers of learner engagement with the online curriculum, recognized as modified kirkpatrick level one outcomes for digital learning.

results
an average of chief residents annually enrolled in the first two years of the curriculum, with an overall participation by % ( / ) of the allopathic em residency programs in the united states (u.s.). there was a high level of learner engagement, with an average of , messages posted per year. there were also small group teaching sessions held online, which included faculty and chief residents. the monthly e-newsletter had a . % open rate. digital scholarship totaled online publications in two years, with chief resident co-authors and faculty co-authors.

conclusions
the chief resident incubator is a virtual community of practice that provides longitudinal training and mentorship for em chief residents.

open access original article
doi: . /cureus.
how to cite this article: gisondi m a, chou a, joshi n, et al. (february , ) curriculum design and implementation of the emergency medicine chief resident incubator. cureus ( ): e . doi . /cureus.
this incubator conceptual framework may be used to design similar professional development curricula across various health professions using an online digital platform. categories: medical education, emergency medicine keywords: education, medicine, mentorship, resident, chief resident, social media, community of practice, online, digital, scholarship introduction in , there were , residency training programs in the united states (u.s.) accredited by the accreditation council for graduate medical education (acgme). most of these residency programs designate one or more of their senior residents as ‘chief residents,’ a one-year administrative designation with variable teaching and supervisory responsibilities [ - ]. chief residents are commonly selected for the role because of outstanding clinical performance in their first years of training [ ]. however, the duties of the chief resident are largely non-clinical. chief residents are often expected to manage resident clinical schedules, arrange didactic conference curricula, resolve conflicts among peers, and address issues of resident morale and wellness [ ]. these job expectations must be fulfilled while also advancing their own professional development and continued clinical training [ ]. chief residents who receive formal training for the role are better prepared to perform the assigned duties compared to those who do not [ ]. unfortunately, there are few opportunities for such training [ , ]. most commonly, residency directors send their chief residents to national conferences for specialty-specific, brief training courses. the acgme is one of several organizations that offer interdisciplinary training for chief residents; however, like national meetings, these courses occur over just a few days and do not provide longitudinal training or mentorship for their participants [ ]. 
in this paper, we describe the curriculum design and implementation of the chief resident incubator, an annual, -month training program and virtual community of practice for chief residents in emergency medicine (em). outcome measures in this study reflect learner engagement during the first two cycles of this novel curriculum.

materials and methods

the design and implementation of the chief resident incubator was informed by kern’s six-step approach for curriculum development for medical education, which includes problem identification, needs assessment for targeted learners, objectives, educational strategies, implementation, and program evaluation. the following summary of each step in the design and implementation process serves as the methodology for this curriculum project [ ].

. problem identification

most of the em residency programs in the u.s. select at least two senior residents to serve in the administrative role of chief resident; assuming a conservative minimum of two chief residents per program, there are an estimated em chief residents [ ]. these individuals constitute our target population of potential learners. em chief residents generally lack pertinent previous work experience and receive minimal training at their institutions to prepare them for the administrative duties of the role.

gisondi et al. cureus ( ): e . doi . /cureus.

. needs assessment

the study authors performed an informal needs assessment for em chief residents. two specialty organizations in em sponsor chief resident-specific tracks at their national meetings, each lasting one to two days [ - ]. additionally, the acgme multi-specialty and pediatric leadership skills training program for chief residents lasts five days and is available to senior residents from any specialty training program. there were neither online nor longitudinal training programs for em chief residents in the u.s. prior to the launch of this curriculum.
in may , academic life in emergency medicine (aliem) launched the chief resident incubator in order to provide job training and professional development for chief residents in em [ ]. aliem (https://aliem.com) is a not-for-profit, digital health professions education organization that focuses on providing quality clinical content, developing collaborative resources, and sustaining online communities for em practitioners. the study authors are all affiliated with aliem and receive a small annual stipend for their teaching and administrative efforts in the chief resident incubator.

. objectives

the study authors generated goals and objectives for the curriculum through iterative discussions informed by a literature review and expert opinion. the four curricular objectives of the chief resident incubator are ( ) to foster a virtual community of practice for em chief residents, ( ) to deliver a longitudinal curriculum in administration and professional development relevant to the role of a chief resident, ( ) to provide mentorship for participants, and ( ) to facilitate multicenter scholarship among participants [ ].

. educational strategies

two conceptual frameworks informed the design of this curriculum. first, we applied the lens of vygotsky’s constructivist social learning theory to create our virtual community of practice, with a goal of developing an online learning environment that encourages collaboration, co-creation, and group learning [ ]. the constructivist theory recognizes the social nature of learning as essential to effective learning and values contributions made by all participants of a community [ ]. chief residents are often divided as middle managers between their peers and faculty members [ ]. it is important to create a community of practice for chief residents that allows them to frequently interact with one another, thereby normalizing their experiences in the role and providing relevance to the curriculum content.
second, we incorporated elements from dube’s virtual community of practice framework to create the program’s infrastructure [ , ]. the chief resident incubator is a temporary community, in that the membership was defined by the period in which we aimed to provide a one-year curriculum; this was matched to the common one-year timeframe of a chief resident year. the membership was limited to a peer group of learners at a similar stage of training. participation was voluntary through enrollment in our closed educational platform. the geographic distribution of learners resembles that of virtually distributed networks commonly seen in global companies. program faculty were intentional in their stewardship of the virtual community, through weekly facilitation of online discussions to promote learner engagement.

the word ‘incubator’ was deliberately chosen to reflect our instructional approach [ ]. in business, an incubator is defined as a large initial investment in mentorship, capital, and networking for entrepreneurs, with goals to scale growth and ensure early success. this model translates well to the successful development of early academicians. preparing our senior trainees to assume early leadership roles in healthcare is critical, as medicine and medical education continue to increase in complexity [ ]. the chief resident role has traditionally been regarded as an entry-level leadership position that can launch the career of future academic faculty members [ ]. for these reasons, there has been increasing interest in finding ways to ensure the early career success of chief residents [ , ]. the chief resident incubator provides continuous mentorship, programmatic resources, and a virtual network of peers to promote success in the chief resident role.
similar to business incubators, the approach of our incubator leadership team to mentorship, team building, and learner engagement constantly evolves to meet chief resident needs and programmatic goals.

. implementation

the incubator leadership team directs the curriculum and actively facilitates participation of the chief residents. this team includes a chief operating officer, chief strategy officer, the aliem editor-in-chief, and several em faculty advisors who regularly execute initiatives within the curriculum.

a virtual community of practice benefits from the involvement of coaches, especially in its early, developmental stages [ ]. for this reason, we annually invite ‘virtual mentors’, each of whom serves as a month-long faculty expert and coach. virtual mentors are selected based on their national reputation and scholarly expertise in selected curricular content, as well as their influential presence within medical education and social media. additionally, we feature ‘insiders’ as guest faculty mentors; these individuals are exemplary physician-leaders who participate in one-time teaching sessions about leadership development and professional identity formation. for example, dr. richard carmona (the th u.s. surgeon general) and dr. mel herbert (founder of a popular educational podcast series, em: reviews and perspectives [ ]) both served as 'insiders' in the first year of the curriculum. together, the incubator leadership team, virtual mentors, and insiders comprise the program faculty.

the chief resident incubator requires a significant commitment of faculty time. collectively, members of the incubator leadership team each document between - hours of effort annually; consequently, small honoraria are awarded from aliem and its industry sponsors. the incubator leadership team initially identified topics to be delivered in a year-long curriculum, one per month, plus a four-week orientation to the program (table ).
each of the virtual mentors is assigned to facilitate one of the monthly topics, delivered using just-in-time teaching [ ].

month | topic
may | orientation
june | what makes a leader?
july | negotiation skills
august | best practices of clinical scheduling
september | bedside teaching
october | interviewing skills
november | building a personal brand
december | project management
january | learning from mistakes
february | middle management
march | working with difficult residents
april | physician wellness

table : chief resident incubator: curriculum

in addition to the monthly curriculum, the incubator leadership team offers special events throughout the year, including online book clubs, a financial advice series, and the 'insiders' seminar series. all presentations are conducted, recorded, and archived as a video, using google hangout on air (google inc., mountain view, ca; hosted on youtube live, https://youtube.com). each recorded session includes a virtual mentor, at least one faculty member from the incubator leadership team, and three to five participating chief residents; all participants provide verbal consent to be recorded.

we primarily use slack (slack technologies, san francisco, ca; https://slack.com), a virtual messaging and discussion platform, as the communication tool for incubator faculty and participants. slack discussions are summarized monthly by newsletter editors for the participants using an e-mail-based newsletter service, mailchimp (mailchimp, atlanta, ga; https://mailchimp.com). the combined use of slack, mailchimp, and google hangout on air videos supplants the need for a traditional digital course management system. the technologies used in the chief resident incubator are summarized in table .

platform: slack - slack technologies, san francisco, ca. https://slack.com
description: slack is a cloud-based, communication network that features a series of interconnected chat rooms, document storage, and integration with numerous third-party project management platforms.
use: slack is the primary communication tool for faculty members and participants.

platform: google hangouts on air, hosted by youtube live - google inc., mountain view, ca. google hangout on air, https://hangouts.google.com/; youtube live, https://youtube.com
description: google hangouts on air is a free, online communication platform that supports video chat among a group of individuals. discussions may be recorded, live streamed, and archived on youtube live.
use: small group discussions between faculty members and residents are recorded and archived using hangouts on air.

platform: mailchimp - rocket science group, atlanta, ga. https://mailchimp.com
description: mailchimp is an automated email marketing platform.
use: this platform is used for the creation and distribution of a monthly e-newsletter.

platform: academic life in emergency medicine (aliem). https://aliem.com
description: aliem is a not-for-profit, digital health professions education organization that hosts an educational blog, sponsors virtual communities of practice, and conducts scholarship.
use: the aliem blog is used to disseminate scholarship and other educational content developed by the program participants.

table : technologies used in the chief resident incubator

in addition to online lectures and discussions, the chief residents are invited to two in-person sessions, each lasting two hours. these sessions coincide with two major em national conferences that are often attended by chief residents: a program launch event at the society for academic emergency medicine annual meeting in may, and a mid-curriculum event at the american college of emergency physicians scientific assembly in october.
these sessions offer an opportunity for course faculty and participants to interact in person, ultimately enhancing our online community through a culture of psychological safety, trust, and cohesiveness for the remainder of the year.

professional development opportunities are also made available to the chief residents, such as the visiting chief resident grand rounds speaker series, educational grants, and mentored blog publications. first, chief residents can apply to present as an invited guest lecturer at grand rounds by submitting video vignettes of their teaching. those selected have been sponsored to lecture at such institutions as brigham and women’s hospital, new york university-bellevue hospital, northwestern university, and university of california san francisco. chief residents selected for these opportunities receive feedback on slide design and public speaking skills from members of the incubator leadership team before their presentations. second, there are educational grants available to participants. for example, ebsco health/dynamed plus© awarded a wellness grant to support residency-based wellness initiatives to one residency program and its chief residents. third, interested chief residents are provided the opportunity to receive mentorship in publishing on an educational blog website, which receives . million annual page views.

programmatic expenses include books provided to the chief residents during the curriculum, facility costs of the in-person events, and general administrative overhead. to finance the operating costs of the chief resident incubator, residency programs support tuition in the program at $ per chief resident. furthermore, the chief resident incubator is exclusively sponsored by an unrestricted educational grant from ebsco health/dynamed plus©.
program evaluation

we report descriptive outcomes of the chief resident incubator for the first two years of the curriculum, spanning the period of may , to april , . primary outcomes reflect modified kirkpatrick level one outcomes for digital learning: evaluation of reaction via online engagement [ ]. the outcome measures include (a) enrollment, (b) slack message activity, (c) google hangout on air participation, (d) e-newsletter engagement, and (e) digital scholarship. this project received exempt status from the stanford university institutional review board.

results

table summarizes learner engagement for the first two years of the curriculum. an average of chief residents from residency programs enrolled annually in the chief resident incubator from - , representing % of the allopathic em residency programs in the u.s.

measure                                   may – april     may – april
enrollment
  # chief residents
  # residency programs
slack message activity
  # total messages
  % public messages
  % private group messages
  % direct messages
google hangouts on air (ghoa)
  # ghoa small group presentations
  # faculty participants
  # chief resident participants
e-newsletter engagement
  % click rate
digital scholarship
  # total online publications
  # chief resident co-authors
  # faculty co-authors

table : measures of learner engagement with the chief resident incubator online curriculum, -

the slack platform hosted an annual average of public discussion channels on topics related to chief resident administrative activities, such as conference, scheduling, educational projects, and wellness. in addition to public channels, learners created private channels for group messaging and used direct messages among participants to facilitate projects and provide direct mentorship. an average of , messages were posted across two years: . % as public channel posts, % as private group messages, and . % as direct messages.
a total of google hangout on air presentations occurred annually, each of which was archived for review by all participants at their convenience. a total of faculty members and chief residents participated in the live broadcast of at least one of these google hangouts. monthly e-newsletters delivered to all incubator members were opened by an average of . % of participants; for comparison, the industry average reported by mailchimp for education-related newsletters is % [ ]. a bibliography of digital scholarship for each cycle of the curriculum is summarized in table , totaling online publications co-authored by chief residents and faculty members. as per aliem blog editorial standards, each of these items underwent a thorough peer review by one or more aliem editors using a previously published review process [ ].

–
joseph m, stuart r, paetow g, wojtal n, cohen v, lindsey j, gopalsami a, schneberk t, patel s, sheehy m. [blog post] “dear residents: things your new chiefs want you to know.” aliem, / / . http://www.aliem.com/ /dear-residents- -things-your-new-chiefs-want-you-to-know/
sheehy m, harding a. “aliem book club: the white coat investor: a doctor’s guide to personal finance and investing.” aliem, / / . http://www.aliem.com/aliem-bookclub-the-white-coat-investor/
shaikh s, risler z, hansen m, horan c, burrup d, harding a, chou a. [blog post] “ scheduling software options in the emergency department: an in-depth review.” aliem, / / . http://www.aliem.com/ / -scheduling-software-options/
stuart r. “aliem book club: dreamland: the true tale of america’s opiate epidemic.” aliem, / / . http://www.aliem.com/aliem-bookclub-dreamland-the-true-tale-of-americas-opiate-epidemic/
stuart r, joshi n. “aliem book club: bouncebacks! emergency department cases: ed returns.” aliem, / / . http://www.aliem.com/ /aliem-bookclub-bouncebacks/
trop a, zaver f, gottlieb m, glenn m. [blog post] “aliem chief resident incubator must read em journal articles - edition.” aliem, / / . http://www.aliem.com/ / -scheduling-software-options/
gottlieb m, habrat d, sheehy m, zidovetzki s, chou a, eds. [online textbook] aliem in-training exam prep: emergency medicine, st edition. aliem publishing: san francisco, ca, / / . isbn - - - - https://www.aliem.com/aliem-training-exam-prep-book-emergency-medicine/

–
rose c, weir a. [blog post] “top foam radiology resources: aliem chief resident incubator recommendations.” aliem, / / . https://www.aliem.com/ /top- -foam-radiology-resources/
craddick m, giordano j, gronowski t, kalnow d, pusateri m, sanford a, zaver f, chou a. [blog post] “top secrets to success as an emergency medicine resident.” aliem, / / . https://www.aliem.com/ / /top- -success-resident/
liu el, rose c, dyer s, zaver f, chou a. [blog post] “a starter’s roadmap to em resources: books, websites, and apps.” aliem, / / . https://www.aliem.com/ / /starters-roadmap-to-em-resources-books-websites-apps/
gronowski t, sanford a, md, liang l, glenn m, joshi n. “aliem book club: a thousand naked strangers: a paramedic’s wild ride to the edge and back.” aliem, / / . https://www.aliem.com/ / /aliem-book-club-thousand-naked-strangers/
burkhardt j, watsjold b, md, fan t, dyer s. [blog post] “mdpv card: introduction to ed charting and coding.” aliem, / / . https://www.aliem.com/ / /pv-card-ed-charting-and-coding/
dyer s, burkhardt j, watsjold b, fan t, trop a. [blog post] “mded charting and coding: history of present illness & past medical, family, social history.” aliem, / / . https://www.aliem.com/ / /ed-charting-coding-history-of-present-illness/
pusateri m, battaglioli n. [blog post] “ tips to become a successful interviewer: do’s and don’ts.” aliem, / / . https://www.aliem.com/ / / -tips-become-successful-interviewer/
watsjold b, dyer s, burkhardt j, trop a. [blog post] “mded charting and coding: review of systems (ros).” aliem, / / . https://www.aliem.com/ / /charting-coding-review-of-systems/
fan t, burkhardt j, watsjold b, trop a. [blog post] “mded charting and coding: physical exam (pe).” aliem, / / . https://www.aliem.com/ / /charting-and-coding-physical-exam/
abubshait l, gottlieb m, haas m. [blog post] “mdpv card: hip injuries | quick reference guide.” aliem, / / . https://www.aliem.com/ / /pv-card-hip-injuries/
watsjold b, dodd k, trop a. [blog post] “mded charting and coding: medical decision making (mdm).” aliem, / / . https://www.aliem.com/ / /charting-and-coding-medical-decision-making/
abubshait l, gottlieb m, haas m. [blog post] “mdpv card: knee injuries | quick reference guide.” aliem, / / . https://www.aliem.com/ / /pv-card-knee-injuries/
gottlieb m, zarzar r, bierny p, eds. [online textbook] aliem in-training exam prep: emergency medicine, nd edition. aliem publishing: san francisco, ca, / / . isbn - - - - https://www.aliem.com/aliem-training-exam-prep-book-emergency-medicine/
glenn m, little a, haas m. [blog post] “mdpv card: elbow injuries.” aliem, / / . https://www.aliem.com/ / /pv-card-elbow-injuries/
battaglioli n. [blog post] “winners of the ebsco health/dynamed plus wellness grant.” aliem, / / . https://www.aliem.com/ / /winners-ebsco-health-dynamed-plus-wellness-grant/
lee h, abubshait l. [blog post] “mdpv card: laceration repair and sutures – a cheat sheet guide.” aliem, / / . https://www.aliem.com/ / /pv-laceration-repair-and-sutures/

table : digital scholarship authored by participants of the chief resident incubator, -

discussion

we describe a unique, longitudinal training program that allows em chief residents to connect and form a virtual community of practice. to our knowledge, it is also the only virtual community of practice for chief residents. our results demonstrate active learner engagement and scholarly productivity, indicating that the design and implementation of such communities for resident trainees is valuable. two years of experience with this unique curriculum has provided valuable insights into the nuances and dynamics of fostering such virtual learning teams.

the chief resident incubator is administered using a variety of digital communications software to facilitate frequent and efficient interaction between learners, with a goal of fostering community and context. we observed that this effort was enhanced by in-person sessions, which promoted engagement and trust. our primary outcome measures for the evaluation of this curriculum were therefore chosen to reflect the degree of engagement and online interaction of our learners [ ]. we envision that virtual teams will become more popular in health professions education, as trainees, clinicians, and educators become more digitally proficient. technology has already facilitated the work of “globally distributed teams” in research collaboration [ ]. as with all new online courses, an orientation to the curriculum and its digital platforms was essential for early adoption and engagement. in particular, we observed that participants who had never used slack were slower to engage with the curriculum than those already familiar with the platform.

our experience with this curriculum suggests that new learning outcomes can be realized for teams of learners using the conceptual framework of a virtual community of practice. for example, we report a total of digital publications by chief resident co-authors over the first months of the chief resident incubator. these opportunities for digital scholarship, and the faculty oversight they required, would not have been possible in any other chief resident training platform to date. our positive experiences in building a virtual community for resident learners informed our decision to further invest in this model of training for fellows (aliem fellowship incubator) and junior faculty (aliem faculty incubator).

em as a specialty is extremely engaged in online and social media technologies for medical education [ - ].
it is likely that the assimilation of our faculty and chief residents into this new virtual community of practice resulted from familiarity with aliem, social media, digital learning, and the technologies that we used to deliver the curriculum. these technologies limited the scalability of our curriculum. for instance, google hangout on air has a bandwidth that supports a limited number of participants, and the no-cost version of slack limits the archiving of posts to , messages. building similar incubators for larger learner groups would require additional financial resources for more costly, yet scalable, online platforms.

there are several important limitations to our current program, its evaluation, or both. importantly, we sought to monitor outcome measures of engagement with the curriculum by learners, as surrogate markers for the success of our virtual community of practice [ ]. we have not yet studied the impact of our curriculum on the job performance of our chief residents, nor have we examined knowledge retention. though we have anecdotal evidence that our learners enjoyed the format of a virtual community of practice, we have not formally measured their satisfaction with the curriculum. future studies are needed to assess these learning outcomes.

the chief resident incubator was scaled for a cohort of em chief residents with a highly motivated group of expert facilitators. other online communities may not have similar successes, as these communities will vary based on the level of engagement of the facilitators and participants. our incubators are implemented by a geographically distributed team of academic emergency physicians who receive very small honoraria relative to the hundreds of hours of teaching effort they provide. it is unlikely that a competent volunteer model could be easily replicated.
conclusions

the chief resident incubator is a longitudinal, virtual community of practice that provides administrative training, professional development, and mentorship for em chief residents. the high level of curricular engagement by participants across the u.s. suggests that a digital platform can facilitate the development of unique communities of practice to address educational needs and foster scholarship.

additional information

disclosures

human subjects: all authors have confirmed that this study did not involve human participants or tissue. animal subjects: all authors have confirmed that this study did not involve animal subjects or tissue. conflicts of interest: in compliance with the icmje uniform disclosure form, all authors declare the following: payment/services info: this was a study of an educational product of academic life in emergency medicine (https://aliem.com), a not-for-profit, digital health professions education organization and educational blog. each of the study authors received small honoraria for their work on this aliem project. financial relationships: all authors: gisondi, chou, joshi, sheehy, zaver, chan, riddell, sifford, lin declare(s) honorarium from academic life in emergency medicine. this was a study of an educational product of academic life in emergency medicine (https://aliem.com), a not-for-profit, digital health professions education organization and educational blog. each of the study authors received small honoraria for their work on this aliem project and their contributions as faculty members in the reported curriculum. other relationships: all authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.

references

nelson cs, brown ie, rao tk: a study of the responsibilities of chief residents in anesthesiology with a suggested job description. anesthesiol rev. , : – .
kolade vo, staton lj, jayarajan r, bently nk, huang x: feasibility of an innovative third-year chief resident system: an internal medicine residency leadership study. j community hosp intern med perspect. , – . . /jchimp.v .
kilpatrick cc, doyle pd, reichman ef, chohan l, uthman mo, orejuela fj: emotional intelligence and selection to administrative chief residency. acad psychiatry. , : – . . /appi.ap.
nakayama dk, phillips lg, newsome re jr, fuhrman gm, tarpley jl: transition from chief residency to specialty training: issues and solutions. am surg. , : - .
walker t, dusabejambo v, ho jj, karigire c, richards b, sofair an: an international collaboration for the training of medical chief residents in rwanda. ann glob health. , : - . . /j.aogh. . .
wilder jf, plutchik r, conte hr: the role of the chief resident: expectations and reality. am j psychiatry. , : – .
gisondi ma, bavishi a, burns j, adler md, wayne db, goldstein jl: use of a chief resident retreat to develop key leadership skills. med sci educ. , : – . . /s - - -
thomas pa, kern de, hughes mt, chen by: curriculum development for medical education: a six-step approach. thomas pa (ed): johns hopkins university press, baltimore, md; .
singh d, mcdonald fs, beasley bw: demographic and work-life study of chief residents: a survey of the program directors in internal medicine residency programs in the united states. j grad med educ. , : – . . / . .
resident academic leadership forum (ralf). ( ). accessed: february , : http://www.saem.org/annual-meeting/education/forums/resident-academic-leadership-forum-(ralf).
emergency medicine residents’ association emra at cord. ( ). accessed: february , : https://www.emra.org/events/emra-at-cord- /.
academic life in emergency medicine, chief resident incubator. ( ). accessed: february , : https://www.aliem.com/aliem-chief-resident-incubator/.
dubé l, bourhis a, jacob r: towards a typology of virtual communities of practice. ijikm. , : – .
vygotsky ls: interaction between learning and development. mind in society: the development of higher psychological processes. cole m, john-steiner v, scribner s, souberman s (ed): harvard university press, cambridge, ma; .
dabrow sm, harris ej, maldonado la, gereige rs: two perspectives on the educational and administrative roles of the pediatric chief resident. j grad med educ. , : – . . /jgme-d- - .
dubé l, bourhis a, jacob r: the impact of structuring characteristics on the launching of virtual communities of practice. jocm. , : – . . /
entrepreneur: getting started with business incubators. accessed: february , : https://www.entrepreneur.com/article/ .
varkey p, peloquin j, reed d, lindor k, harris i: leadership curriculum in undergraduate medical education: a study of student and faculty perspectives. med teach. , : – . . /
em:rap, emergency medicine: reviews and perspectives. ( ). accessed: february , : https://www.emrap.org/.
brame c: just-in-time teaching. vanderbilt university center for teaching. accessed: february , : https://cft.vanderbilt.edu/guides-sub-pages/just-in-time-teaching-jitt/.
shappell e, chan t, thoma b, et al.: crowdsourced curriculum development for online medical education. cureus. , :e . . /cureus.
mailchimp, email marketing benchmarks. ( ). accessed: february , : https://mailchimp.com/resources/research/email-marketing-benchmarks/.
thoma b, chan t, desouza n, lin m: implementing peer review at an emergency medicine blog: bridging the gap between educators and clinical experts. cjem. , : – . . / . .
gajendran rs, joshi a: innovation in globally distributed teams: the role of lmx, communication frequency, and member influence on team decisions. j appl psychol. , : – . . /a
cameron p, carley s, weingart s, atkinson p: cjem debate series: #socialmedia - social media has created emergency medicine celebrities who now influence practice more than published evidence. cjem. , : – . . /cem. .
riddell j, brown a, kovic i, jauregui j: who are the most influential emergency physicians on twitter? west j emerg med. , : – . . /westjem. . .
report of the mla task force on evaluating scholarship for tenure and promotion
mla task force on evaluating scholarship for tenure and promotion
december
profession

all material published by the modern language association in any medium is protected by copyright. users may link to the mla web page freely and may quote from mla publications as allowed by the doctrine of fair use. written permission is required for any other reproduction of material from any mla publication. send requests for permission to reprint material to the mla permissions manager by mail ( broadway, new york, ny - ), e-mail (permissions@mla.org), or fax ( - ).
© by the modern language association of america

executive summary

in the executive council of the modern language association of america created a task force to examine current standards and emerging trends in publication requirements for tenure and promotion in english and foreign language departments in the united states. the council’s action came in response to widespread anxiety in the profession about ever-rising demands for research productivity and shrinking humanities lists by academic publishers, worries that forms of scholarship other than single-authored books were not being properly recognized, and fears that a generation of junior scholars would have a significantly reduced chance of being tenured. the task force was charged with investigating the factual basis behind such concerns and making recommendations to address the changing environment in which scholarship is being evaluated in tenure and promotion decisions.

to fulfill its charge, the task force reviewed numerous studies, reports, and documents; surveyed department chairs; interviewed deans and other senior administrators; solicited written comments from association members; and consulted with other committees and organizations. the most significant data-gathering instrument was a spring online survey of , departments in institutions across the united states covering a range of doctorate, master’s, and baccalaureate institutions. the response rate to the survey ( % of all departments and % of all institutions) provided a solid basis for the task force’s analysis and recommendations.

the information gathered by the task force substantiates some worries and mitigates others.
the results of the mla survey, which covered the academic years from – to – , initially seemed reassuring, since they suggested that there has been no perceptible lowering of tenure rates among those in the final stages of the tenure process, where the denial rate seems to be around %. but further research presented a more complex picture. the mla survey showed that well over % of tenure-track faculty members leave the departments that originally hired them before they come up for tenure. data from studies conducted by other groups suggest that fewer than % of the phd recipients who make up the pool of applicants for tenure-track positions obtain such positions and go through the tenure process at the institutions where they are initially hired, and a somewhat larger number of modern language doctorate recipients—more than %—never obtain tenure-track appointments. in the aggregate, then, phds in the fields represented by the mla appear to have about a % chance of getting tenure.

the mla survey further documents that the demands placed on candidates for tenure, especially demands for publication, have been expanding in kind and increasing in quantity. while rising expectations have been driven by the nation’s most prestigious research universities, the effects ripple throughout all sectors of higher education, where greater emphasis has been placed on publication in tenure and promotion decisions even at institutions that assign heavy teaching loads. over % of all departments report that publication has increased in importance in tenure decisions over the last ten years. the percentage of departments ranking scholarship of primary importance (over teaching) has more than doubled since the last comparable survey, conducted by thomas wilcox in : from . % to . % (comprehensive survey ).
judging from the mla’s survey findings, junior faculty members are meeting these ever-growing demands even though this is a time when universities have lowered or eliminated subsidies for scholarly presses and libraries have dramatically reduced their purchases of books in the humanities. and despite a worsening climate for book publication, the monograph has become increasingly important in comparison with other forms of publication. indeed, . % of departments in carnegie doctorate-granting, . % in carnegie master’s, and % in carnegie baccalaureate institutions now rank publication of a monograph “very important” or “important” for tenure. the status of the monograph as a gold standard is confirmed by the expectation in almost one-third of all departments surveyed ( . %) of progress toward completion of a second book for tenure. this expectation is even higher in doctorate-granting institutions, where . % of departments now demand progress toward a second book.

while publication expectations for tenure and promotion have increased, the value that departments place on scholarly activity outside monograph publication remains within a fairly restricted range. refereed journal articles continue to be valued in tenure evaluations; only . % of responding departments rated refereed journal articles “not important” in tenure and promotion decisions. other activities were more widely devalued. translations were rated “not important” by . % of departments (including . % of foreign language departments), as were textbooks by . % of departments, bibliographic scholarship by . % of departments, scholarly editions by % of departments, and editing a scholarly journal by . % of departments. even more troubling is the state of evaluation for digital scholarship, now an extensively used resource for scholars across the humanities: .
% of departments in doctorate-granting institutions report no experience evaluating refereed articles in electronic format, and . % report no experience evaluating monographs in electronic format.

given the trends the task force has identified, we offer the following recommendations to address this complex situation before it becomes a crisis.

1. departments and institutions should practice and promote transparency throughout the tenuring process.
2. departments and institutions should calibrate expectations for achieving tenure and promotion with institutional values, mission, and practice.
3. the profession as a whole should develop a more capacious conception of scholarship by rethinking the dominance of the monograph, promoting the scholarly essay, establishing multiple pathways to tenure, and using scholarly portfolios.
4. departments and institutions should recognize the legitimacy of scholarship produced in new media, whether by individuals or in collaboration, and create procedures for evaluating these forms of scholarship.
5. departments should devise a letter of understanding that makes the expectations for new faculty members explicit. the letter should state what previous scholarship will count toward tenure and how evaluation of joint appointments will take place between departments or programs.
6. departments and institutions should provide support commensurate with expectations for achieving tenure and promotion (start-up funds, subventions, research leaves, and so forth).
7. departments and institutions should establish mentoring structures that provide guidance to new faculty members on scholarship and on the optimal balance of publication, teaching, and service.
8. department chairs should receive guidance on the proper preparation of a tenure dossier.
9. departments and institutions should construct and implement models for intermediate reviews that precede tenure reviews.
10. departments should conduct an in-depth evaluation of candidates’ dossiers for tenure or promotion at the departmental level. presses or outside referees should not be the main arbiters in tenure cases.
11. scholarship, teaching, and service should be the three criteria for tenure. those responsible for tenure reviews should not include collegiality as an additional criterion for tenure.
12. departments and institutions should limit the number of outside letters (in general, to no more than six). scholars should be chosen to write letters based primarily on their knowledge of the candidate’s field(s). letters should be limited to evaluating scholarly work. candidates should participate in selecting (or rejecting) some of their potential reviewers.
13. the profession as a whole should encourage scholars at all levels to write substantive book reviews.
14. departments and institutions should facilitate collaboration among scholars and evaluate it fairly.
15. the task force encourages further study of the unfulfilled parts of its charge with respect to multiple submissions of manuscripts and comparisons of the number of books published by university presses between and .
16. the task force recommends establishing concrete measures to support university presses.
17. the task force recognizes that work needs to be done on several questions not asked in its survey: salaries of junior and recently tenured faculty members, the role of unions, tenure appeals processes, and the lengthening of the pretenure period.
18. the task force recommends that a study of faculty members of color be conducted.
19. the task force encourages discussion of the current form of the dissertation (as a monograph-in-progress) and of the current trends in the graduate curriculum.
20. departments should undertake a comprehensive review to ensure that their expectations for tenure are consistent with their institutions’ values and mission and that each step in the process is fair and transparent.

task force report

preamble

“. . . how can ever-increasing demands for publication as a qualification for tenure and promotion be sustained when scholars find it harder and harder to publish their books? on a broader level . . . how [will the] shifts in academic publishing . . . affect our scholarship, as well as the profession as a whole?” (mla ad hoc committee ). these questions defined the primary concerns of the report of the mla ad hoc committee on the future of scholarly publishing, chaired by judith ryan. the report delineated a “current crisis” ( ) in the profession, exacerbated by a perceived disjunction between what tenure committees value—the scholarly monograph—and what they reportedly devalue—editions, collections of essays, anthologies, and textbooks—scholarship that is “forming a larger and larger part of what scholarly presses wish to publish.” the possibility that “younger scholars may well be increasingly edged out of the publishing process” ( ), as the report concluded, led the mla executive council and its then president, stephen greenblatt, to write an open letter to the members of the association, outlining a “systemic, structural, and at base economic” problem: because of departments’ insistence “that only books and more books will do” to measure scholarly achievement, even as university presses cut back on the number of books they publish annually in certain fields, “higher education stands to lose, or at least severely to damage, a generation of junior scholars.” the ideas that greenblatt and the executive council solicited from mla members—specific ways to ease this complex and self-defeating situation largely by changing expectations for tenure— generated an outpouring
of responses. the executive council thus decided to create a task force in to continue analyzing these issues. this report summarizes the deliberations and findings of the task force.

the mla executive council wished to address not only the concerns in the committee’s report and the greenblatt letter and responses to that letter but also, and more broadly, the general anxiety that appears to exist in the profession about ever-rising demands on the scholarship, teaching, and service of junior faculty members coming up for tenure; the dismay over the reduced output from or, in some cases, the phasing out of the humanities lists of academic presses; the worry that forms of scholarship other than the monograph are not properly recognized or evaluated, especially when they come in electronic form; and ultimately, because of this confluence of factors, the fear that a generation of junior scholars will have a markedly reduced chance of being tenured. the council was alarmed, then, that a disjunction existed between rising expectations for tenure and diminishing availability of publishing outlets in the fields represented by the mla. it was concerned about the narrowing definitions of what constitutes scholarship for tenure and promotion, about the exclusion of new, alternative forms of scholarship, and about the failure to take account of the full range of practices that now constitute the system of scholarly exchange. finally, the council believed there was a need to revise procedures for evaluating scholarship to achieve equity and fairness throughout the tenuring process.
to address this set of issues, the executive council charged the task force on evaluating scholarship for tenure and promotion first and foremost “with examining the procedures used to evaluate scholarly publications for tenure and promotion.” in addition:

the task force will consider the effects of the widely discussed crisis in scholarly publishing (and the multiple forms that scholarly publishing now takes) on the criteria used to assess scholarly work in tenure reviews. the task force will review the guidelines and documents that the mla has issued in the past on matters relevant to its charge (e.g., recommendations on outside evaluation letters for tenure cases). the task force will acquire specific information from department chairs, deans of humanities, and faculty members recently reviewed for tenure about the requirements for tenure—and the outcome of tenure cases—in research institutions as well as liberal arts colleges and other kinds of postsecondary schools for the period – . for the same period, the task force will compare the number of books (volumes) in the fields represented by the mla published by academic presses. the task force will also discuss the issue of subventions and of multiple submissions of manuscripts to presses. (“meeting” – )

in striving to fulfill its charge, the task force reviewed numerous mla studies, reports, documents, and statements and discussed a number of readings on topics related to our concerns; they will be referenced throughout this report (see the works-cited list).
we also interviewed deans and other administrators; we published an item in the mla newsletter asking recently hired and recently tenured members of our profession to comment on the tenure process at their institutions and on their own experiences with this process; and we consulted with various mla committees (committee on the status of women in the profession, committee on information technology, committee on the literatures of people of color in the united states and canada, committee on scholarly editions) and held a meeting with the council of editors of learned journals. several members of the task force presented some of our preliminary findings at the mla annual convention.

the task force took seriously the charge to “acquire specific information from department chairs,” and we requested that the mla conduct a substantial survey to determine as precisely as possible how expectations for achievements in scholarship, teaching, and service had evolved from to and what rates of granting tenure were realized during this period. it seemed crucial to the task force that statistical information replace the anecdotes about heightened tenure criteria and diminishing tenure rates that have circulated in the profession and contributed to the belief in the often-mentioned crisis. the knowledge that chairs have of their departments’ practices and their institutions’ procedures and values needed to be described as concretely as possible.

the executive council approved funding for this survey, which was conducted in spring and based on a questionnaire drafted by members of the task force and the mla staff and designed for web-based administration. the mla’s file of department administrators was used to define a universe of , modern language departments in , four-year, degree-granting, title iv–participating carnegie doctorate-granting, master’s, and baccalaureate institutions in the continental united states.
of these , departments, a subset of , departments in institutions was contacted by e-mail and invited to complete the survey online. these , academic units included english departments, foreign language departments, and departments that encompass both english and foreign languages. data collection took place between march and may . of the , departments, ( . %) responded, representing institutions; at least one department responded in % of the institutions housing the entire set of , units canvassed in the survey.

the findings and analysis of the survey, which appear in this report, are based on responses from departments in institutions that reported having a tenure system: of the responding english departments and of the foreign language or combined departments. while these departments and institutions do not constitute a sample representative of all four-year united states colleges and universities, their answers nonetheless afford insight into practices and standards for tenure in english and other modern language departments across a significant swath of united states higher education.

in the interpretation of survey findings, the distribution of tenure-track faculty members across the different types of four-year institutions differs from the distribution of the institutions. carnegie doctorate-granting institutions constitute . % of four-year institutions covered in the united states department of education’s integrated postsecondary education data system (ipeds) but % of the tenure-track faculty members employed in four-year institutions. carnegie master’s institutions constitute . % of the four-year institutions in the ipeds but employ . % of tenure-track faculty members. carnegie baccalaureate institutions constitute . % of the four-year institutions but employ . % of tenure-track faculty members.
the large population of tenure-track faculty members in carnegie doctorate-granting institutions means that practices in the doctoral sector have an impact that goes far beyond the number of institutions represented.

a critical time, a new conjuncture

any serious attempt to understand the issues surrounding the evaluation of scholarship for tenure and promotion today must first take into account the shifting nature of academic work over the past decades; the changes in the resources for disseminating scholarship, including the condition of university presses; and the significant changes in educational policies—all of which have increased pressure on the tenure system. because of factors identified in our report, this conjuncture may well represent a threshold moment with large effects and consequences in the future.

first, we are seeing the effects of the cuts in funding for higher education at both the federal and the state levels. the subvention of substantial portions of the costs of education for undergraduate and graduate students, what the mla committee on professional employment’s report calls “the social and financial compact,” existed for decades but began to erode with the end of the cold war ( ). and as phyllis franklin noted in , citing from the report of the council for aid to education, “breaking the social contract”:

[t]he cost of educating college students grew by “more than sixfold between and , much faster than inflation, as measured by the consumer price index.” . . . [b]oth federal and state governments have underfunded higher education since the mid- s because entitlement programs like social security, medicare, and medicaid have required an increasingly large proportion of public funds. . . . thus, while the cost of educating a student increased, funding from the public sector decreased.
by the year , the [cae] report concludes, “the higher education sector will face a funding shortfall of about $ billion [in dollars]—almost a quarter of what it will need.” ( )

this underfunding has occurred at a moment of marked growth in the student body and in administrations of higher education institutions (schuster and finkelstein ). indeed, the labor sector that has been most downsized and underfunded in recent history is the faculty.

insufficient funding has gone hand in hand with—and is partly the result of—the corporatization of the university, which promotes a managerial-corporate culture that relies on the market to determine priorities and to value products and services to clients (rice and sorcinelli ). the adoption of business models that champion performance accountability and greater efficiency (mla committee on professional employment ) casts faculty members as productivity managers, in contrast to their traditional role as stewards of the educational mission of their institution, and undermines the commitment of academic administrations to what r. eugene rice and mary deane sorcinelli call a prestige economy ( ). for the faculty, this new economy means that the quantity of research takes on greater prominence. in many colleges and universities, the managerial-corporate culture has led to a work speedup that equates more publication with greater productivity and productivity with accountability.

as the mla committee on professional employment’s report points out, the tension in this country between increasing access to education and the expanding research mission of higher education was managed fairly well while resources were expanding in the s. with structural expansion, the increase in undergraduate enrollments created a need for more campuses and thus more career faculty lines.
in turn, the number of graduate students working toward the doctoral degree also increased; in the fifteen academic years between and , the expansion of graduate education led to an extraordinary rise in the number of phds. a limit on the capacity to fund new positions for faculty members was inevitably reached, however, and the market became saturated in the early s. since then, phds “have faced an academic marketplace where, with one brief, modest exception in the late s, qualified candidates seeking careers in tenure-track positions far outnumber available positions” (mla committee on professional employment ).

in the period from – to – , the overall number of phd recipients declined (by about %, as reported in the annual survey of earned doctorates for those years). there has also been a general improvement in the placement of phds to tenure-track positions and a decrease in placements to non-tenure-track (ntt) positions, both full-time and part-time, particularly for graduates in english. in the last three mla surveys of phd placement, placements to tenure-track positions for graduates in english increased from % in – to . % in – to . % in – . placements of english graduates to ntt full-time positions declined from . % to . % to %, whereas placements to ntt part-time positions declined from % to . % to . %. the pattern is similar, although less clear-cut, for graduates in foreign languages. in – . % of foreign language doctorate recipients were placed in tenure-track positions; tenure-track placements fell to . % in – but rose to . % in – . placement of foreign language graduates to ntt full-time positions declined from . % in – to . % in – but increased to . % in – . placement to ntt part-time positions declined from . % to . % to % (laurence and steward).
the shifting composition of the faculty and the shifting nature of academic work

the decrease in the number of faculty members gaining full-time tenure-track positions in language and literature during a period in which the number of undergraduate students increased means that full-time faculty members had a significantly heavier workload in the s than they did in the s. as jack h. schuster and martin j. finkelstein point out, the weekly work effort of faculty members across institutional types increased from hours per week in to . hours in , and it increased most dramatically, to . hours, at research universities, where the faculty has been subjected to both increasing instructional demands and increasing research demands ( ). across the board the proportion of faculty members working more than hours a week has doubled since , rising from a significant minority ( . %) in to nearly % in .

even more important, however, schuster and finkelstein’s study concludes that we are witnessing a reshaping of the faculty by academic function to an extent not seen since the emergence of graduate education in the nineteenth century ( ). most notable is the dramatic increase since the s in the use of full-time and part-time ntt faculty members, what has been called the casualization of the academic workforce. this casualization puts enormous pressure on the tenure system. as tenure-track jobs have declined relative to all faculty positions since the s and as the number of students has grown considerably, part-time and full-time ntt academic labor has increased throughout the academy. in the s, according to a study led by margaret cahalan, % of the faculty nationwide consisted of part-time employees; by , part-time employees constituted % of the faculty (cahalan et al. – ).
the american association of university professors (aaup) has noted that “through the s, in all types of institutions, three out of four new faculty members were appointed to non-tenure-track positions.” more recent studies suggest that the rate of casualization has only increased. a article in inside higher ed, reporting on a document released in may by the national center for education statistics, stated that “between and , the number of full-time faculty jobs at degree-granting institutions rose to , from , —a gain of , jobs. but the number of part-time jobs rose to , , up from , —a gain of , jobs. and as a percentage of faculty jobs at degree-granting institutions, part-time positions increased to percent, from percent, over those two years” (jaschik).

using figures derived from the national study of postsecondary faculty (cataldi et al.), a survey of a sample of postsecondary faculty members conducted by the united states department of education at five-year intervals, the aaup produced a chart showing that over % of faculty members in foreign language departments and over % of faculty members in the field of english and literature are part-time. because the aaup does not disaggregate four-year and two-year institutions (where there are large numbers of faculty members working part-time), the aggregate percentages mask the large disparity between two- and four-year sectors. when the figures for english and literature are combined with those in foreign languages and are disaggregated by carnegie type of institution, the number of part-time faculty members is . % in doctorate-granting, . % in baccalaureate, % in master’s, and . % in associate’s institutions (i.e., two-year colleges that grant associate degrees).

the figures for full-time ntt relative to part-time ntt reveal important differences by type of institution. of the ntt faculty members at doctoral institutions, . % are full-time and .
% are part-time; at master’s, . % are full-time and . % are part-time; at baccalaureate, . % are full-time and . % are part-time; and at associate’s, . % are full-time and . % are part-time. in other words, in all institutional sectors, the percentage of part-time ntt faculty members is greater than the percentage of full-time ntt faculty members. according to schuster and finkelstein, the prevalence of part-time ntt faculty members is more pronounced in english composition, foreign languages, mathematics, and business than in all other academic fields.

in schuster and finkelstein’s view, full-time ntt appointments, which usually have term limits, have “become the model type of full-time appointment for new entrants to academic careers.” there is “considerable permeability” between “off the tenure track” and “on track” full-time appointments, especially among those who have the doctorate; and men are more likely to move onto this track than women ( ). moreover, ntt faculty members appear satisfied with some aspects of their jobs, especially their workload and the level of control over their professional time, which seem to compensate partially for dissatisfaction with their organizational status, perquisites, and future prospects ( – ). for schuster and finkelstein, the emergence of full-time ntt faculty members has been less visible in the faculty’s structural evolution in the past two decades than the growth in part-time ntt faculty members, but they claim that this phenomenon “marks a seismic shift in the types of faculty appointments being made” and in the definition of the faculty’s function ( ).

the multitiered faculty structure in higher education, which is not likely to change in the foreseeable future, has direct implications for tenure.
the dramatic increase in the number of part-time ntt faculty members and in the number of term-limited full-time ntt faculty members puts increased demands and pressure on all full-time tenure-track and tenured faculty members in many areas for which the casualized work force is not—and should not be—responsible: service on department committees and in departmental governance; student advising; teaching upper-level undergraduate and graduate courses; directing dissertations; and, less concretely but no less importantly, contributing to intellectual community building in the department and outside it, in the college and the university (see mla committee on professional employment – ; schuster and finkelstein – ).

the condition of university presses and of humanities lists

the corporatization of the university and the imposition of business models of efficiency and output have affected not only the structure of the faculty but also the primary source for the dissemination of academic scholarship in the humanities and social sciences: university presses. as phil pochada has pointed out, these presses have increasingly been asked to operate as businesses that must cover their costs and have lost or had sharply reduced their subsidies from the institution (qtd. in mla ad hoc committee ). they have reacted to—and accommodated—this new situation in part by discontinuing publication in certain humanities subjects altogether, such as the series in contemporary poetry at oxford university press and the series in french studies at cambridge university press (mla ad hoc committee ), or by reducing the humanities list (as stanford university press has done) and the number of translations published (as northwestern university press has done). this narrowing of publishing possibilities, especially in fields viewed as marginal, may not have bottomed out but may in fact become more acute in years to come.
to be sure, some presses are now exploring the use of digital technology and collaborating with their campus libraries to find alternative ways to deliver humanities content, in a few cases restoring previously discontinued series. university presses have pursued further adjustments to the current situation. they have also learned to manage their inventories and distribution more effectively to ensure more efficient sales and fewer returns. they maintain their humanities lists—indeed, often subsidize them—by publishing marketable regional fiction, guides to wildlife and cooking, and encyclopedias of states; reference books, which can command high prices; and volumes for nonuniversity readers, such as midlist books that trade publishers are now less likely to produce. presses are thus devoting substantial editorial resources to finding and promoting profitable books that can keep the entire press enterprise afloat and, more pertinent to this report, that can sustain publication in the fields represented by the mla.

the plight of humanities publishing lists is a direct result of a lack of sales of new hardcover books from university presses. we frequently hear publishers describe a downward trend of monograph sales to libraries, from as many as – , clothbound copies for the initial printing to around today. accordingly, university presses have seen the unit cost of their titles increase markedly: in a study of libraries conducted by the association of research libraries (arl), the median cost of a monograph was $ . in and $ . in (an increase of %).

the drop in humanities books is particularly evident for sales to college and university libraries, the primary outlet for monographs. even where library budgets have grown, purchases of books in the humanities have not increased. libraries have scaled back on these purchases as costs of subscriptions to major scientific journals have skyrocketed; some subscription rates are in the thousands of dollars. the arl study
the arl study mla�task�force�on�evaluating�scholarship�|||� found that, from to , library expenditures for serials rose %, whereas expenditures on monographs rose % (“monograph and serial expenditures”); there was an increase of % in the number of serials purchased, but a % decrease in the number of monographs purchased. furthermore, investment in electronic journal subscriptions, such as the indispensable jstor, has reduced budgets for humanities monographs. as a result, “automatic buy lists”—agreements between libraries and lead-­ ing university presses that ensure the purchase of a press’s entire list of new books—have largely become a thing of the past. a corollary consideration by many editors concerns the extent to which books are being purchased and read by faculty members and students in the language and literature disciplines. if the sale of monographs is a rough indicator of readership (and, of course, it is a very rough measure given the availability of photocopying, course packs, the system of interlibrary loans, and, most recently, texts online), then the editors of presses conclude that the monographs they publish are being read by fewer and fewer scholars. even if this is true, the decline in readership can be attributed only in part to the rising cost of books. it is possible that the accumulated volume of scholarship in book form is increasingly difficult to master and that schol-­ ars tend to read monographs in very restricted contexts: in relation to their own research, requests for reviews, teaching, and the evaluation of tenure and promotion cases. thus editors express concern that too few mono-­ graphs draw a readership across different fields in language and literature and between these fields and the social and applied sciences. some scholars have suggested that online publishing would help re-­ solve the problem of the increased cost of books. 
however, as jennifer crewe of columbia university press has pointed out, only part of the cost of producing new books would be reduced by online publication, since the costs of editing and refereeing remain the same in either medium and keeping up with new developments in technology represents added costs (crewe ; see also mla ad hoc committee ). online publication by itself is therefore not likely to solve the problems underlying the financial difficulties faced by libraries and presses.

changes in educational policy and practices

over and beyond the different factors that we have enumerated—the problems of academic presses in publishing books in the humanities, the altered structure of the faculty, a dominance of casualized part-time ntt and full-time term ntt faculty positions, and the reduced funding for hiring faculty members and for supporting higher education in general in the past three decades—there have also been significant shifts in educational policy and, as a result, in educational practices. these shifts have produced revised criteria for determining what faculty members must do to be tenured and promoted.

in , ernest boyer published scholarship reconsidered: priorities of the professoriate, an enormously influential book that attempted to reframe the discussion of the nature and function of scholarship and to challenge the traditional dichotomy between scholarship and teaching and thus between content and process. boyer embraced an expanded view of scholarly work that highlighted teaching, engagement, integration, and discovery. based on a inquiry sponsored by the carnegie foundation for the advancement of teaching, scholarship reconsidered led to conceptualizing teaching as a scholarly enterprise in the s. there was an outpouring of pedagogical colloquiums, teaching circles, and the production of teaching portfolios, a practice that has become widely used in tenure and promotion cases.
advocates of boyer’s views went on to form the carnegie academy for the scholarship of teaching and learning and created a national network of institutions with teaching academies (see rice and sorcinelli – ). moreover, as the notion of a scholarship of engagement suggests, boyer and his followers believed that scholars should learn from and involve members of the community and apply their scholarship to the nation’s critical societal issues. the scholarship of integration was designed to combat fragmentation and alienation in the academy, to promote collaborative work, and to reach across disciplinary boundaries, not only connecting the usual disciplines but also achieving what david scott calls “transdisciplinarity” ( ), an ideal that is both invoked and underfunded on college and university campuses. boyer’s fourth category, the scholarship of discovery, the traditional goal of scholarship, was cast in scientific terms that had little to do with the humanities and, in general, seemed to be undervalued in his efforts to forge a more inclusive conception of scholarship that embraced teaching, application, and cross-disciplinarity. the faculty’s intellectual and scholarly interests were depicted as narrow and in need of infusion, or replacement, by civic values—if not depicted by boyer himself then by his followers, by higher education administrators, and by governmental voices in higher education reform who have been calling for a reorientation of faculty work.

the considerable amount of work required to implement the goals of scholarship reconsidered has been evaluated (see rice and sorcinelli – ), and its impact on institutional policy and practice has also been examined.
kerryann o’meara and rice’s faculty priorities reconsidered: rewarding multiple forms of scholarship features essays about the ideas of scholarship reconsidered, reports from nine campuses on how they have revised tenure and promotion policies, and a national survey of chief academic officers at four-year institutions on concrete ways in which their institutions changed in the decade following the publication of boyer’s work. the essays reveal that many institutions did revise their tenure and promotion policies in keeping with boyer’s views, most notably in the area of the scholarship of teaching and learning, spurred by public concern about the faculty’s commitment to the education of undergraduate students. yet in “scholarship reconsidered: barriers to change,” robert m. diamond observes that “many faculty members in key leadership roles were comfortable with the status quo and were reluctant to change reward structures or definitions of scholarly work that had worked so well for them” ( ). he concludes that there are several ways to overcome “potential barriers” to change: ensure top leadership support, select the leaders of the initiative with care, institutionalize the process, reinforce new policies and procedures as they are approved, and ensure that communications are open and that the entire faculty has the opportunity for input ( – ). the findings of the national survey of chief academic officers reported in faculty priorities reconsidered reveal that the primary change in the decade following the publication of scholarship reconsidered consisted of heightened demands that faculty members excel in every area, principally “publication productivity” ( % of the chief academic officers surveyed said that it “counts more”).
even more notable, the officers who classified theirs as “reform” institutions, that is, those that were changing tenure and promotion policies, as compared with “traditional” institutions, those that were not changing policies, reported that “emphasis has increased” in all areas surveyed: in publication productivity ( %), teaching ( %), engagement and professional service ( %), service to the institution ( %), and service to the profession ( %). in “traditional” institutions, the figures for increased emphasis were lower in all these categories: % in publication, % in teaching, % in engagement, % in service to the institution, and % in service to the profession. even those institutions that are not engaged in reforming their policies have increased expectations in all areas (o’meara and rice ). this startling conclusion seriously concerns the task force. change in favor of a more capacious conception of scholarship, which we strongly endorse, should not mean ever-wider demands on faculty members, most especially those coming up for tenure and promotion. we endorse the specific ways in which diamond suggests that change can be effected within an institution to combat the predominant satisfaction with the status quo that we found in our survey.
in the same vein, we endorse most of o’meara’s principles for “encouraging multiple forms of scholarship in policy and practice” ( ): prepare students in graduate school for the variety of roles and types of scholarship in which they will engage; socialize new faculty members to the broader institutional definition of scholarship; present clear expectations for scholarship in promotion and tenure guidelines; do not expect or reward the “overloaded plate”; provide useful feedback to faculty members during their evaluations; support pioneers with resources—structural and financial, training and development, political and symbolic; define and emphasize scholarship in the context of institutional mission; and resist increasing research expectations to enhance institutional prestige ( – ). it would appear that the decline in the importance of teaching, highlighted in scholarship reconsidered, has been followed by an increased emphasis on teaching since the mid- s, at least according to schuster and finkelstein’s findings. this increase is accompanied by a consistent, modest increase in emphasis on research spread throughout the higher education landscape ( – ). the findings of our survey confirm the conclusions of schuster and finkelstein. of the three criteria for tenure and promotion—scholarship, teaching, and service—teaching is the criterion where there is the least variation among the departments we surveyed (monograph publication has the most variation). teaching was rated “very important” for tenure by . % of departments in the doctoral institutions, % of the master’s, and . % of the baccalaureate, in contrast to the monograph, which was ranked “very important” by % of departments in the doctoral institutions but by only . % of the baccalaureate and . % of the master’s, and in further contrast to publication in general, which was ranked as “very important” by . % of the doctoral institutions, .
% of the baccalaureate, and . % of the master’s. overall, however, only . % of departments report that teaching has increased in importance (far fewer than the . % that report publication increased in importance and the % that report monograph publication has increased in importance). indeed, teaching increased most in importance in departments in doctoral institutions ( . %), as compared with . % of baccalaureate and . % of master’s institutions, where, presumably, teaching has always been considered important. that publication increased in importance most in baccalaureate institutions confirms the notion of increased demands for both teaching and publication throughout the carnegie system of institutions. (of the three traditional areas of faculty activity, service showed the lowest percentage increase in importance— . % across all sectors—and the highest decrease in importance— . %.)

teaching and service as forms of scholarship; a provisional definition of scholarship

although it is not the primary focus of this report, the task force wants to emphasize, along with boyer and others, the importance of teaching as a form of scholarship, indeed, as a venue for making scholarship public. as john guillory (a member of the task force) has written, most of us know “important teachers,” brilliant scholars whose scholarship went into the classroom and not onto the page ( ). charles e. glassic and his coauthors have rightly criticized the failure of the profession to devise ways to recognize this kind of scholarly achievement. similarly, service, traditionally viewed in research institutions as the least significant of the scholarship, teaching, and service triad, should be seen as a crucial part of faculty work that overlaps with—and involves—both other elements.
the report of the mla commission on professional service rejects the old “unwieldy, confused category, encompassing almost any faculty work that falls outside research and scholarship or teaching” ( ) in favor of the categories intellectual work and academic and professional citizenship ( ), which can occur at a number of “sites,” including classrooms, committee meetings, the internet, scholarly conventions, journals, and community boards ( ). in this schema, intellectual work “is not restricted to research and scholarship but is also a component of teaching and service,” and citizenship, which “encompasses the activities required to create, maintain, and improve the infrastructure that sustains the academy as a societal institution,” comprises aspects of research and scholarship, such as participating in promotion and tenure reviews, evaluating manuscripts, and serving on committees in professional organizations or on task forces in one’s field ( , ). the applied work of citizenship makes knowledge available to government, industry, the law, the arts, and nongovernmental organizations; examples include “serving on a state or local humanities council, helping a school system revamp its curriculum, working on a community literacy project, writing a script for public television, and consulting on expert testimony for congress” ( ). visual grids in the mla report on professional service convincingly show the overlapping, ambiguous, and connected activities in various faculty work efforts and among sites and serve as a model for rethinking the conventional triad of faculty work (see also stanton). this more complex and expanded view of the sites of scholarship is subsumed in our understanding, influenced by the work of guillory, that scholarship in the humanities constellates three activities: research, interpretation, and reflection.
research is not to be equated with scholarship; it is a component of scholarship that, in the fields represented by the mla (and more broadly, the humanities), can include archival, artifactual, or textual objects that essentially involve human matters. scholarship in our field requires (re)interpretation, an analysis or critique that calls for a revision or reconfiguring of what has previously been thought—what paul armstrong calls “the subtle altering of opinion about ideas long and securely held, . . . a more effective explanation and dissemination of concepts, interpretations, and information that originated with other scholars” (ade ad hoc committee )—and that enters into conversation with those other scholars and interpretations. but as guillory emphasizes, any serious work of scholarship in our field also demands a moment of reflection (or theorization), “a continuous examination of concepts and arguments as they arise from and are altered by the practices of interpretation and research” ( ), and a self-consciousness about the method appropriate to the object of study. this constellation of research, interpretation, and reflection can produce knowledge that makes sense of complex states of affairs in the human world or complex states as objectified in human artifacts. furthermore, scholarship should not be equated with publication, which is, at bottom, a means to make scholarship public, just as teaching, service, and other activities are directed toward different audiences. publication is not the raison d’être of scholarship; scholarship should be the raison d’être of publication. although the working definition of scholarship that we have adopted differs from that of boyer and his followers, our task force report certainly takes up some of his recommendations and, even more pertinently, those of the report from the mla ad hoc committee on the future of scholarly publishing, with which we began.
as the following pages show, our task force report shares the committee’s view that the goals of scholarship in the fields represented by the mla would be better served if the monographic book were not so broadly required or considered the gold standard for tenure and promotion, as it is in a growing percentage of institutions; that journal publication should continue to be a primary venue for the advancement of knowledge in our field; that we need to be open to a variety of models and other forms of expression to do better justice to different kinds of scholarly projects and alternative conceptions of the body of work; and that we always need to value quality over quantity (mla ad hoc committee ).

preliminary findings and conclusions

we are living through profound changes in the academy that should compel us to reexamine our ways of evaluating scholarship for tenure and promotion. these profound changes provide the context and the lens through which the principal findings of the mla survey need to be understood. we can report that our survey of , department chairs does not reveal a considerable lowering of tenure rates. our survey does reveal, however, the increased demands, particularly for publication, placed on candidates for tenure. while this development has been driven by carnegie research i institutions, its effects ripple through all the carnegie sectors, thereby placing greater emphasis on publication in tenure and promotion decisions. yet our data reveal that junior faculty members have risen to meet these ever-growing demands. the chairs responding to the mla survey indicate that until , candidates for tenure in their departments had not suffered for lack of publishing venues; and, in general, these chairs support the status quo regarding demands for tenure and the relative emphasis on scholarship over teaching and service in their departments and institutions.
thus we can state that faculty members hired to tenure-track appointments over the last ten years have been tenured in ways—and at rates—similar to their predecessors. there has, to date, been no “lost generation of scholars” from the tenure track. universities have lowered or eliminated subsidies for scholarly presses, and libraries have dramatically reduced the purchase of books in the humanities, thereby weakening the market for monographs, a trend that may not yet have peaked. nonetheless, up to , these developments had not resulted in substantially increased tenure denials or substantially decreased rates of tenuring, at least in terms of the candidates who came up for tenure and promotion. non-tenure-track appointments continue to constitute well over half of all faculty positions in four-year institutions for those whose principal teaching fields are english and foreign languages. and available data suggest that % or more of graduates who receive doctorates in english and foreign languages remain in ntt positions or leave the academy for employment elsewhere three to five years after receiving their degrees. further, findings from the mla survey, along with data from the national survey of college graduates (available from scientists and engineers statistical data system [www.sestat.nsf.gov]) about the employment situations of those holding doctorates in english and foreign languages, indicate that substantial numbers of junior scholars—from % to %—leave the institution before their final review for tenure begins. (the mla survey indicates that the largest number leave one tenure-track appointment for another. we have no way of tracking tenure outcomes for these junior faculty members.) by our best estimate, then, only about % of modern language doctorate recipients obtain a tenure-track position and go through the tenure process at the institution where they were hired (see fig. “estimated percentage”).
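the estimate above chains several stages, each retaining only a share of the previous one. a minimal sketch of that arithmetic follows, assuming the overall share is simply the product of the stage-to-stage rates. the function name `funnel_share` and the three rates are hypothetical placeholders chosen for illustration; they are not figures from the mla survey or from sestat.

```python
def funnel_share(stage_rates):
    """Multiply conditional stage rates into one overall share.

    Each rate is the fraction of people at one stage who reach the next,
    so the product is the fraction of the starting cohort that completes
    every stage.
    """
    share = 1.0
    for rate in stage_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"rate must be between 0 and 1, got {rate}")
        share *= rate
    return share


# Hypothetical placeholder rates (NOT survey results):
hired_to_tenure_track = 0.5    # doctorate recipients hired to a tenure-track post
considered_where_hired = 0.8   # of those, share who stay until the tenure review
awarded_tenure = 0.9           # of those reviewed, share awarded tenure

overall = funnel_share(
    [hired_to_tenure_track, considered_where_hired, awarded_tenure]
)
print(f"share of doctorate recipients tenured where hired: {overall:.1%}")
```

the point of the sketch is that a high tenure rate among those formally reviewed can coexist with a much lower share of the original cohort reaching tenure, because the earlier stages shrink the pool first.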
the task force believes that the declining number of tenure-track positions in relation to the total number of positions accounts in large measure for the widespread anxiety in the profession about standards for tenure and promotion. this anxiety may also be understood as unease about the continued escalation of quantitative demands for scholarly production, the extension of such demands to a widening circle of institutions of higher learning, and the harm being done to scholars and scholarship as demands increase. as a result of our survey of , department chairs, we clearly see increasing pressure on the system of scholarly production and the mechanics of its evaluation. scholars and institutions have adapted to changing circumstances over the years as requirements for tenure have expanded and demands have been increased throughout the profession. but the results have taken their toll on individual scholars and institutions—and on the academy’s infrastructure as a whole—and strained the profession in ways that are intensely serious but not yet well understood or articulated. our report aims to address this complex, critical situation before it becomes a crisis. the new conjuncture, born of emergent historical, educational, economic, and professional conditions, demands new thinking, new flexibility and openness, and new solutions. we are living through profound changes in the academy that require our ways of evaluating scholarship for tenure and promotion to undergo serious reexamination.

[figure: estimated percentage of english and foreign languages doctorate recipients who become tenure-track faculty members and achieve tenure at the institution where hired. stages shown: completed doctorate degree; hired to tenure-track position within five years; considered for tenure at institution where hired; awarded tenure at institution where hired.]
the task force concludes that we in the fields represented by the mla need to change our conception of what scholarship includes and excludes; that institutions should calibrate their demands for tenure to their particular mission and values and devise requirements that are appropriate to their own academic contexts; and that the profession as a whole should reconsider the processes used in evaluating scholarship for tenure in the interest of fostering all the forms of scholarly exchange in which scholars engage and in the name of fairness and transparency throughout the process of evaluation, principles that should exemplify the highest standards of equity for our profession.

part i: revising the meaning of scholarship

the dominance of the monograph; increased publication demands for tenure

the monograph lies at the nexus of a number of crucial issues: it represents a historical matter concerning changes in requirements for tenure, a practical matter involving the extension of the requirement of a tenure monograph beyond doctoral institutions to master’s and baccalaureate institutions, and a policy matter regarding its proper role in promoting research and scholarship and their dissemination in our fields. few concrete longitudinal data exist on when and where the publication of a scholarly monograph emerged as the chief requisite, the gold standard, for tenure. one survey from the late s, conducted jointly by the national council of teachers of english and the mla, with support from a grant from the then united states office of education, provides some answers. designed to produce systematic information about english departments, the survey, which was administered to a sample of departments in four-year institutions and received a remarkable . % response, included questions about criteria and procedures governing tenure and promotion.
according to the findings of the survey, which was prepared by thomas wilcox of the university of connecticut, before neither the monograph nor other kinds of publication were regarded as a principal requirement for tenure. asked to list in order of importance the criteria used to decide tenure and promotion, only . % of wilcox’s respondents ranked published scholarship of any kind, including the monograph, first or second. by contrast, in the new mla survey, . % of respondents rank publication “very important” for earning tenure in their departments. more specifically, publication of a monograph is now “very important” in . % of all departments in four-year institutions and in % of carnegie doctorate-granting institutions. these data document the change that has occurred in requirements for tenure and promotion and confirm beyond doubt that for the majority of institutions where the monograph became routine, it did so after . although requiring publication of a monograph remains a practice concentrated in carnegie doctorate-granting institutions, the findings of the mla survey also make clear that publication of a monograph is now generally regarded as desirable and considered a reasonable demand at four-year colleges and universities throughout the united states. the mla survey findings also indicate that the monograph has become increasingly important in comparison with other forms of publication, a fact confirmed by the requirement in many institutions of “progress toward the completion of a second book,” that is, a second monograph for tenure. a total of . % of responding departments consider “progress toward the completion of a second book” “important” or “very important” for tenure. among carnegie doctorate-granting institutions, . % of respondents say that progress toward the publication of a second monograph is “very important” or “important”; . %, or about half of the .
%, indicate that emphasis on this requirement has increased over the past decade, and . % indicate that it has remained about the same. these figures show that progress toward a second monograph was already well established ten years ago. in fact, . % of all departments surveyed report that publication has increased in importance over the last ten years, a figure that attests to the growing significance of publication in master’s and baccalaureate institutions. baccalaureate institutions rank the monograph even more important than do master’s ( % versus . %); the difference reflects the tenure criteria of selective private institutions in the baccalaureate sector. this dramatic change in the requirements for tenure—specifically, the rise of what we might call the tenure monograph—is a direct result of the abrupt collapse of employment opportunities for new phds in the academy, as wilcox observed in the early s. a buyers’ market thus emerged in the early s, which made it possible for hiring departments to demand more of candidates seeking entry-level jobs, particularly evidence of scholarly potential. often today, carnegie doctorate-granting departments expect that graduate students will already have published one or more articles by the time they graduate; these institutions expect the same of the junior faculty members they hire. such increased demands on graduate students exerted pressure on programs to adjust their curricula; as they did so, the dissertation was reconceived as the first draft of a publishable book. the appearance of the tenure monograph was thus linked to a reconceptualization of the dissertation. in turn, the expectation that the dissertation would be published after revision made it easier for departments to demand a monograph of tenure candidates.
in addition to the buyers’ market, two other dramatic shifts in the culture and demographics of the university, in the period after , made it possible and even routine for departments to demand more of candidates for both hiring and promotion. first, the number of women entering the profession increased greatly (women are now a numerical majority in the fields of language and literature) and further enlarged the available pool of academic labor. second, the field of literary study was shaken—some would say, transformed—by the emergence of theory, which provoked ideological factionalism in some, perhaps many, departments. these factors, along with the effort after to democratize academic structures, meant that the authority for tenure decisions was largely removed from the office of the chair, where, according to the wilcox survey, it had resided in the majority of institutions. responses to that survey indicate that tenure decisions were made by the chair alone in . % of the departments; in an additional %, tenure decisions were made “by the chairman and an advisory committee appointed by him” ( – ). findings about promotion decisions matched those for tenure: in . % of the departments, decisions about promotion were made by the chair alone; in . %, they were made by the chair and a committee appointed by the chair ( ). these statistics from the s describe key points of the old-boy system—wilcox called the concentration of power in the chair “this most autocratic, least democratic procedure” ( )—that over the following decades was swept away. in its place, ad hoc department committees and outside letters of reference emerged as a fairer and more objective modus operandi.
the new emphasis on publication and other criteria for tenure was an expression, then, not only of the higher demands created by a buyers’ market but also of the search for safeguards against the possible arbitrariness or bias of chairs and of department factions unsympathetic to the new demographics of the profession and to new developments in literary study. in certain respects, the emergence of expectations oriented more toward publication than toward teaching in the period between and corrected a system of promotion and reward that had few procedures of accountability. it could thus be argued that the buyers’ market made procedures of evaluation fairer throughout the profession, possibly even improving the general quality of the professoriat. what emerged after is a system of bureaucratic equity, the core principle of which is that personnel actions are to enact (or at least appear to enact) institutional rules and procedures rather than personal inclinations and biases.

the rates of tenuring, – ; gender difference in tenure rates

the increasingly prevalent demand for a tenure monograph, and sometimes a portion of a second monograph, along with the current problems that university presses are facing, has produced concerns throughout our profession about a possible decline in the rate of tenure. a narrow majority of respondents ( . %) agrees or strongly agrees that tenure prospects for junior faculty members in the humanities are at risk because of the financial difficulties university presses face ( . % agree and % strongly agree). but less than a quarter of respondents ( . %) affirm that the tenure prospects of one or more junior faculty members in their own departments are at risk ( % agree and . % strongly agree), and three-fifths ( . %) of respondents disagree that prospects for junior faculty members in their departments are at risk ( . % strongly disagree and % disagree).
a substantial group of respondents stated that they did not have enough information to adopt a position about the prospects of junior faculty members, either generally ( . %) or in their departments ( . %). the mla survey, however, indicates that, at least up to , anxiety about the future tenuring of young scholars had not yet become manifest as reality. asked whether any candidates in their departments had been denied tenure because of the limitations that university presses have placed on monograph publication, . % of respondents said, “no”; only % said, “yes, definitely,” and %, “yes, possibly.” but . % express concern that there may be such an effect in the future, and % are unsure but think such problems likely in the future. our data cannot describe situations that are in the making, but there are reasons to believe that there may be a narrowing of publishing opportunities in the future that can have a negative impact on tenure reviews. (once again, the survey yields information about completed tenure reviews through – and does not predict outcomes for junior professors now in pretenure positions.) as we indicated in the preamble, the tenure rate for candidates whom responding departments considered for tenure in the period – needs to be understood in the context of the current academic job market, where an estimated % of all those who receive doctoral degrees do not obtain tenure-track positions. for graduates who began tenure-track appointments, findings from the mla survey indicate that well over % leave their positions before being considered for tenure, some for tenure-track or non-tenure-track appointments at other institutions, some for pursuits outside college and university teaching.
for junior faculty members who completed their probationary period and came up for tenure, respondents reported a tenure rate averaging around %, with a slightly lower rate of about % for carnegie doctorate-granting institutions. the tenure rate did not vary between the two five-year periods the mla examined— – to – and – to – . taken together, information from the mla survey and other sources suggests that about % to % of the tenure-track assistant professors hired to a tenure-track position actually go through the tenure process and receive tenure at the institution where they held their original appointment. a recent study of tenure rates in the penn state university system, presented by michael dooris and marianne guidos, found that “[i]n the – academic year, percent to percent of the second-, fourth-, and sixth-year cases reviewed at the college level resulted in recommendations for continuation or early tenure” and went on to note that “similar studies have clearly shown that the approval percentage at the university level has almost always been over percent” ( ). dooris and guidos also found that, in the penn state system, where outcomes are examined from the point of hiring to a tenure-track position, “[f]or the last nine entering cohorts—that is, those beginning [as newly hired tenure-track appointees] in through — percent of new entrants had received tenure by the end of their seventh year on the tenure track” ( ). the success rate for junior faculty members formally considered for tenure masks the actual number of those who held a tenure-track position but did not come up for consideration at the institution that initially hired them. the mla study suggests that great demands have been made on the junior professoriat in the last several decades and that, by and large, the junior scholars who completed their probationary period have risen to meet these demands.
the heightened demands for the monograph and other forms of scholarship, coupled with the increased demands for teaching and service (which also have strong scholarly aspects), have not, in fact, depressed the tenure rate. yet when there are few tenure-track jobs available, department chairs and committees must work particularly hard to ensure that the faculty members hired will be strong candidates for tenure, since the failure to be approved at college committee and administrative levels can sometimes result in negative consequences for the department, such as losing the faculty line. a more worrisome possibility is that the need to try to ensure from the outset that the junior faculty member hired will be qualified to receive tenure may discourage hiring committees from taking risks on scholars who do not fit a narrow academic profile and on work that is not perceived to be mainstream. but it is also reasonable to infer that departments are being highly selective and taking advantage of a pool of well-qualified applicants in a buyers’ market for tenure-track positions. within the group of tenure candidates counted in the mla study, there are important gender differences. across all institutional categories, women consistently outnumber men in the pools of candidates considered for tenure and the number of candidates awarded tenure (in the candidate pools, % are women and % are men). the difference between the numbers of men and women considered for tenure is smallest in carnegie doctorate-granting institutions (where . % are women) and largest in carnegie baccalaureate institutions (where . % are women) in the period from – to – . for this same period, overall tenure rates for men and women are comparable: . % for women, . % for men. tenure rates for women are lower than for men in carnegie doctorate-granting institutions ( . % for women, .
% for men), whereas tenure rates for men in carnegie baccalaureate institutions are lower than for women ( . % for men, . % for women). almost all the differences between tenure rates for men and women can be traced to differences in tenure rates in foreign language departments. for the period from – to – , the tenure rate for men was % versus . % for women in english; but in foreign languages, it was . % for men versus . % for women. the tenure rate for women for this five-year period is markedly lower in doctorate-granting foreign language departments: . % for men versus . % for women. such differences also emerge in master’s-granting foreign language departments: . % for men versus . % for women. in baccalaureate-granting foreign language departments, this trend is reversed: . % for men versus . % for women. by comparison, the figures for various degree-granting types of english departments are: for doctorate-granting institutions, . % for men versus . % for women; for master’s institutions, . % for men versus . % for women; and for baccalaureate institutions, . % for men versus . % for women. the analysis of exit rates (cases where faculty members left a department before being formally considered for tenure) also reveals gender differences. women have an exit rate slightly higher than men ( . % versus . %), with the largest difference in departments in carnegie doctorate-granting institutions— . % for women versus . % for men. exit rates after the third-year review, however, are higher for men than for women: . % versus . %—more specifically, . % for men versus % for women in carnegie doctorate-granting institutions; . % for men versus % for women in carnegie master’s institutions; and . % for men versus . % for women in carnegie baccalaureate institutions.
the salient exception to this trend is for departments in institutions with the smallest tenured and tenure-track faculties, where there is a significantly higher exit rate for women as compared with men: . % versus . %.

mla task force on evaluating scholarship

. favoring the status quo

the tenure rate that emerges from the mla survey thus suggests that over the last decade the increasing demands for scholarly publication have not noticeably harmed the tenure rate of junior scholars. this finding counters some prevailing opinions about an inverse relation between demands and the percentage of junior scholars tenured and promoted. nevertheless, there are still compelling reasons to be concerned about the dominance of the tenure monograph in the system of evaluation. there is, inevitably, the worry that in a buyers’ market demands will continue to be raised, beyond those currently prevalent, although it is difficult to imagine that even more publication could be expected of candidates coming up for tenure than is already required. nevertheless, department chairs do not express concern about the current level of demands for tenure; on the contrary, they seem to approve of the status quo. only . % of respondents overall agree or strongly agree that the tenure and promotion process in their institutions overemphasizes book-length monographs and gives too little credit to refereed journal articles, although . % of respondents in carnegie doctorate-granting institutions see too much emphasis on monograph publication and too little credit for articles, as compared with . % of respondents in carnegie master’s and . % in carnegie baccalaureate institutions.
responses to several survey items that explored attitudes toward teaching and publication in tenure and promotion processes reveal the broad-based support that continues to exist for both current publication requirements and the current balance between publication and teaching across departments in different institutional sectors. more than % of all respondents disagree or strongly disagree that accomplishment in teaching should count more than it does in their institutions’ tenure and promotion decisions, and variation among departments in different institutional types is modest: . % of departments in carnegie doctorate-granting institutions disagree or strongly disagree that teaching should count for more, compared with . % of departments in carnegie master’s and . % in carnegie baccalaureate institutions. survey findings also indicate respondents’ broad-based support for the current emphasis on monograph publication in their institutions. when respondents were asked whether they agree or disagree that the tenure and promotion process in their institutions emphasizes monograph publication too much and gives too little credit to refereed journal articles, . % of respondents in carnegie doctorate-granting institutions disagreed or strongly disagreed, as did . % in carnegie master’s institutions and . % in carnegie baccalaureate institutions. even in the subset of carnegie doctorate-granting departments where monograph publication is either very important or important to earning tenure, only . % of respondents agreed or strongly agreed that publication of a monograph is emphasized too much and refereed journal articles too little. an even larger majority of respondents ( . %) reports that refereed journal articles can suffice to earn tenure in the respondent’s institution and disagrees that monographs are overemphasized. broken down by sector, the figures are . % of respondents in carnegie doctorate-granting, .
% in carnegie master’s, and . % in carnegie baccalaureate institutions. an additional . % of the respondents in carnegie doctorate-granting institutions disagree or strongly disagree that refereed journal articles can suffice to earn tenure in their institutions, but they also disagree or strongly disagree that journal articles are credited too little and monographs emphasized too much; that is, they are both subject to and support the monograph standard, at least for their departments and institutions. respondents who are both subject to the monograph standard and inclined to question it are limited to a group of . % respondents in carnegie doctorate-granting institutions. (these are respondents in the carnegie doctorate-granting sector who agree or strongly agree that there is too much emphasis on monograph publication and who also disagree or strongly disagree that publication of journal articles can suffice to earn tenure in their institution; outside the carnegie doctorate-granting sector, only two respondents agreed both that monographs are emphasized too much in their institutions and that articles cannot suffice to earn tenure, one respondent each from a carnegie master’s and a carnegie baccalaureate institution.)

this series of cross-tabulations suggests that chairs in a majority of departments surveyed and across all types of four-year colleges and universities regard the practice on their campuses and in their departments as sufficient. even in the subset of departments where respondents agree or strongly agree with the statement that candidates are unlikely to earn tenure without the publication of a book, almost two-thirds disagree or strongly disagree ( . %) that monograph publication is emphasized too much and refereed journal articles credited too little. when this subset of respondents is further limited to the respondents in carnegie doctorate-granting institutions, . % disagree and .
% strongly disagree that monograph publication is emphasized too much and journal articles credited too little (a total of . %), whereas . % agree and . % strongly agree that monograph publication is emphasized too much and journal articles credited too little (a total of . %).

examining responses for the entire group of departments included in the survey reveals the lack of consensus and the division of outlook among the respondents in carnegie doctorate-granting institutions as compared with respondents in other sectors. of the respondents in carnegie doctorate-granting institutions, . % agree or strongly agree that in their institutions candidates are unlikely to earn tenure without publication of a book but disagree or strongly disagree that in their institutions there is too much emphasis on monograph publication and too little credit for articles in refereed journals (that is, they appear to support the monograph standard); . % disagree or strongly disagree that in their institutions candidates are unlikely to earn tenure without publication of a book and also disagree or strongly disagree that in their institutions there is too much emphasis on monograph publication and too little credit for articles in refereed journals (that is, they appear to some degree not to be subject to the monograph standard); and . % agree or strongly agree that in their institutions candidates are unlikely to earn tenure without publication of a book and also agree or strongly agree that in their institutions there is too much emphasis on monograph publication and too little credit for articles in refereed journals (that is, they appear ready to question the monograph standard).

responses from those in carnegie master’s and baccalaureate institutions show a far higher level of consensus: . % of respondents in carnegie master’s and .
% of respondents in carnegie baccalaureate institutions disagree or strongly disagree that in their institutions candidates are unlikely to earn tenure without publication of a book and also disagree or strongly disagree that in their institutions there is too much emphasis on monograph publication and too little credit for articles in refereed journals (that is, they report that their institutions’ processes for tenure and promotion are not governed by a monograph standard).

. the extension of the monograph requirement; rethinking the preeminence of the monograph; revaluing the essay

the survey findings reveal the manifest extension of the monograph requirement to master’s and baccalaureate institutions, especially those where the teaching load is heavier than in the carnegie doctorate-granting system. the findings confirm the pressure of the doctoral system on all other types of higher education institutions and, conversely, the attempt of those other types to emulate the doctoral system. to be sure, this extension or pressure may devolve from the desire of non-doctorate-granting administrations (deans, provosts, and presidents) to improve their national standing in the eyes of their prospective students, donors, and state legislatures. the extension of carnegie doctorate-granting demands to those who teach - or - course loads in a given year, rather than - and even less (common in research universities), and to those who are expected to have excellent teaching evaluations, in addition to numerous service and extracurricular requirements, seems excessive and unfair. the task force strongly recommends that non-doctorate-granting departments resist the temptation to model tenure standards on the practices of research i institutions.
even if the prevalence of the monograph as the chief tenure requisite in a significant minority of institutions is deeply established and growing, the task force believes it is crucial to evaluate the status of the monograph as a tenure requirement more critically than our profession has done to date (irrespective of the situation of university presses). among the topics in such an evaluation, we believe that it is critical to consider whether, in the words of lindsay waters, there is a “tyranny of the monograph” (“rescue”); to evaluate how the tenure monograph has changed the profession as a whole; and to imagine alternatives to this kind of scholarly text.

the high esteem given to monograph publication should not mean that the monograph is the only, or the best, form of scholarly communication. the complaint occasionally heard today, and sometimes voiced by university press editors, that many monographs are really “articles on steroids” raises a serious question about how the demand for a book affects the quality of scholarship, and quality should be the first argument in favor of a tenure case. if, as the phrase “articles on steroids” implies, some books should more properly be articles, it could be argued that they do not take that form because of the tenure system’s demand for the monograph. departments should be wary of reflexively equating the publication of a monograph with the achievement of quality sufficient to merit tenure and with the “sole model for mature scholarship” (damrosch ). this idea seems to confuse scholarship’s substance with its form and to ignore changes in the practices of dissemination and publication over time. as david damrosch has argued, books and articles stood at relative parity in many fields in the early s, but the proliferation of journals, which outpaced the growth in university presses, had the “unintended effect of lowering the value of articles relative to books” ( – ).
because articles could be published more readily, their prestige was reduced, as was their impact, since a smaller proportion of scholars in a particular field would see a particular journal ( – ). in damrosch’s view, in recent years there has been a lessening of competition between journals and books; in fact, presses benefit if parts of a book first appear in various journals. moreover, because of costs, there is a premium today on short books, which reduces the difference in scale between the essay and the book ( – ). and damrosch rightly concludes that “we need not choose between these two forms” ( ).

the constraints on university presses make this a timely moment for us in the fields represented by the mla to reaffirm that the monograph is not the only form in which scholarship can be produced and that there is value in books of linked essays as well as in the stand-alone essay (that is, the journal article that is not a book chapter). if journal publication is to be sustained as an independent venue of scholarship, then stronger arguments must be made in tenure cases for such articles in all institutional sectors, notably in carnegie research i institutions, which seem to be driving the definition of tenure requirements throughout the system. this revalorization would raise the possibility that the contribution of important articles might be equal to, or greater than, the contribution of some books. indeed, for the profession as a whole, a renewed emphasis on the value of the essay can open up fresh possibilities for the production of knowledge and for intellectual exchange that are not constrained by the book. scholarly journals have an advantage that books, with their longer lead time to publication, lack: they provide a forum for new ideas and research and for a timely exchange of argument.

.
expanding the definition of scholarship and the body of work to be evaluated for tenure

the task force urges department tenure and promotion committees, department chairs, and other administrators and administrative bodies to rank the monograph equally with other forms of peer-reviewed scholarship and other professionally significant work. we oppose the preference for the monograph over all other contributions and its elevation to the gold standard based on the purely formal character of the book as book. the monograph has increasingly displaced other forms of scholarship and has become the sine qua non for achieving tenure and promotion, especially in those institutions regarded as the most prestigious. we further oppose the extension of this standard to other institutions in the carnegie system. we urge the members of the mla and of the wider academic community to recognize, and to act on the recognition, that valuable and important scholarship can take multiple forms and that requirements for tenure and promotion should be tailored to the mission of the institution. in our view, a body of essays or articles in peer-reviewed journals can demonstrate the quality of scholarly work as well as or, in some cases, better than a monograph of similar length. moreover, edited collections of articles, critical editions, annotated translations of important primary texts, essays written for a general audience, trade books, textbooks, and pedagogically useful monographs, as well as publications or other professional work in electronic form, may contribute to a body of scholarly and professional work that can meet the highest standards of scholarship in the tenure-review process.

our analysis of respondents’ assessments of how activities count in their institutions’ processes of evaluation for tenure points to the need for a more capacious notion of scholarship.
our survey findings generally show that the value department chairs place on different types of scholarly activity falls within a fairly restricted range. (it should be emphasized that respondents were asked how various activities do count, not how they ought to count.) most revealing are comparisons of the percentages of departments rating different items “not important.” the item that the lowest percentage of responding departments rated “not important” ( . %) was refereed journal articles; the item that the highest percentage of responding departments rated “not important” was articles for a general audience ( . %). this item was followed closely by a radio or television broadcast (rated “not important” by . % of departments) and books for a general audience (rated “not important” by . % of departments).

items at the next level were rated “not important” by respondents in % to % of departments and represent the categories where reconsideration is most needed, in our view. these include:
• translations, rated “not important” by . % of all responding departments and by . % of foreign language departments
• textbooks, rated “not important” by . % of departments
• bibliographic scholarship, rated “not important” by . % of departments
• books and articles oriented to classroom practice, rated “not important” by . % and . % of departments, respectively.

the next group, rated “not important” by respondents in % to % of departments, includes scholarly editions, rated “not important” by %, and editing a scholarly journal, rated “not important” by . %. this last item seems highly undervalued when we consider that editors disseminate new scholarship and further the arts, stimulate and direct inquiry in their fields of study, help produce new knowledge, and create communities for discussion and debate within and among disciplines. undoubtedly, editors play a critical role in shaping their disciplines.
indeed, the existence of a journal on campus can have the same impact on intellectual community building as the creation of an institute: both can be focal points of intellectual ferment and excitement and centers for the scholarly education and development of students at all levels, but especially for junior scholars.

publication of a scholarly print monograph was rated “not important” by . % of responding departments but by only . % of departments in carnegie doctorate-granting institutions (versus . % and . % of departments in carnegie master’s and baccalaureate institutions, respectively). examples of other activities with pronounced differences between institutional sectors are:
• translations, rated “not important” by . % of foreign language departments in carnegie doctorate-granting institutions but by . % and . % of those in carnegie master’s and baccalaureate institutions, respectively;
• textbooks, rated “not important” by . % of departments in carnegie doctorate-granting institutions but by . % and % of departments in carnegie master’s and baccalaureate institutions, respectively;
• books oriented to classroom practice, rated “not important” by . % of departments in carnegie doctorate-granting institutions but by . % and . % of departments in carnegie master’s and baccalaureate institutions, respectively;
• articles oriented to classroom practice, rated “not important” by . % of departments in carnegie doctorate-granting institutions but by . % and . % of departments in carnegie master’s and baccalaureate institutions, respectively;
• bibliographic scholarship, rated “not important” by . % of departments in carnegie doctorate-granting institutions but by . % and . % of departments in carnegie master’s and baccalaureate institutions, respectively.

these and other items can constitute valuable parts of a candidate’s scholarly portfolio.
such portfolios will necessarily, and happily, be diverse, in keeping both with the interests and talents of the junior scholar and, once again, with the mission of the institution and the ways it values faculty labor in scholarship, teaching, and service.

. scholarship in new media

digital scholarship is becoming pervasive in the humanities and must be recognized as a legitimate scholarly endeavor to which appropriate standards, practices, and modes of evaluation are already being applied. the rapid expansion of digital technology has been fundamentally transforming the production and distribution of humanities scholarship, generating not only new forms of publication and dissemination (ranging from web sites and e-journals to print-on-demand books) but also significant new modes of scholarship, including digital archives and humanities databases. large-scale digital archives, for example, are contributing substantially to the development of standard forms of editing, reproduction, cataloging, and reference that facilitate the archives’ entry into research libraries. indeed, such libraries are intensively engaged in solving the problems of collecting digital works.

as recent studies have revealed, scholars across the humanities now make regular use of electronic resources, and, as the report from the mla ad hoc committee on the future of scholarly publishing indicates, “online journals are already being used by many scholars in our fields, and this use is likely to increase” ( ). the availability of major journals in both print and digital formats only emphasizes the two forms’ increasing interdependence; in fact, the distinction between them is beginning to disintegrate. scholarship in new media, much of it supported by research universities, granting agencies, learned societies, and foundations (among them the neh, the andrew w.
mellon foundation, the getty fund, and the american council of learned societies), has resulted in the formation of international alliances, such as the digital library federation; standards organizations, such as the text encoding initiative; and discipline-based consortia, such as the networked interface for nineteenth-century electronic scholarship (nines). the mla committee on scholarly editions recently revised its guidelines to address electronic as well as printed editions in one set of recommendations, and in the mla published electronic textual editing, its first book devoted to electronic editing (burnard, o’brien o’keeffe, and unsworth).

the report of the mla ad hoc committee on the future of scholarly publishing recognizes that digital publication raises many issues, including the need to construct viable business models for launching and sustaining electronic publications, to establish standardized practices and adequate peer-review procedures, and to develop reliable preservation strategies. senior faculty members, outside reviewers, and administrative committees often share these anxieties, in part because the new technologies are rapidly evolving and some scholars have had little experience in using or evaluating them.

in our survey, from % to over % of department chairs report that they have had no experience evaluating scholarly work produced in these new forms by candidates for tenure and promotion; departments in carnegie doctorate-granting institutions consistently reported the highest percentages of inexperience. among the various forms of digital scholarship, experience evaluating refereed articles in electronic format is most widespread. even so, . % of the responding departments indicate they have as yet had no experience with them: specifically, . % of departments in carnegie doctorate-granting institutions, . % of departments in carnegie master’s institutions, and .
% of departments in carnegie baccalaureate institutions. asked about inexperience in evaluating scholarly monographs in electronic format, . % of departments in carnegie doctorate-granting institutions report that they have no experience, as compared with . % of departments in carnegie master’s and . % of departments in carnegie baccalaureate institutions. overall, . % of respondents report having no experience with evaluating monographs in digital form. as jennifer howard notes in “gutenberg-e lets historians present research in nontraditional ways,” monographs published digitally in the gutenberg-e project have been considered positively in tenure cases, although the officers of the american historical association, which collaborates with columbia university press on the initiative, had to send letters to department chairs to disseminate information that would help legitimize this new form of monograph publication.

more positively, higher percentages of departments across the institutional board regard work in digital forms as “important” rather than “not important” in evaluation for tenure and promotion. that is, most respondents who report evaluating digital forms of scholarship see the work as creditable, although sizable (if lower) percentages of respondents still consider digital work “not important” in their institutions’ assessment of candidates for tenure and promotion. refereed articles published in electronic form are regarded as “important” for earning tenure and promotion in . % of departments and “not important” in . %; . % of respondents indicate that monographs in electronic form count as “important” in their processes of evaluation, whereas .
% say they are “not important.” given the current rarity of digital monographs, it is, in fact, surprising that so high a percentage of department chairs state that such work counts in their institutions, an indication perhaps of willingness to recognize such work, were examples to be forthcoming.

the survey findings document the comparatively limited place and value that processes of evaluation give scholarship that appears in electronic formats. in part, this limitation reflects the significant proportion of departments that as yet have no experience with scholarship in digital forms. but the cause-and-effect relations work in both directions here: probationary faculty members will be reluctant to risk publishing in electronic formats unless they see clear evidence that such work can count positively in evaluation for tenure and promotion. the survey findings suggest that work presented in electronic formats is still in the process of gaining the recognition necessary for it to fulfill expectations and requirements for tenure and promotion. refereed articles in digital media count for tenure and promotion in less than half as many departments as refereed articles in print; print articles count in some fashion in . % of departments, as compared with . % for articles in electronic form. monographs in electronic formats have a place in the evaluation of scholarship for tenure and promotion in only about one-third as many departments as print monographs: . % as compared with . %. scholarship in electronic formats seems to be recognized when done in addition to work in print formats but may place a candidate at risk if presented as the sole or primary scholarly basis for consideration for tenure.
it is clear, however, that electronic journals are increasingly run by editorial boards committed to peer review and that major forms of digital scholarship can fully support the modes of review previously associated only with print publication. although digital forms of scholarship increasingly pervade academic life, work in this area has not yet received proper recognition when candidates are evaluated for promotion and tenure. we consider it essential that tenure committees continue to learn about digital scholarship. the mla has produced formal guidelines that spell out the responsibilities of candidates and of committees for preparing and evaluating digital scholarship: “the principle . . . is that when institutions seek work with digital media and faculty members express interest in it, the institution must give full regard to this work when faculty members are hired or considered for reappointment, tenure, and promotion” (mla committee on information technology). we should recognize that, while new media and the infrastructure that will support them are evolving in tandem under considerable internal and external pressure, the evolution of the two cannot be perfectly coordinated. nevertheless, in evaluating scholarship for tenure and promotion, committees and administrators must take responsibility for becoming fully aware both of the mechanisms of oversight and assessment that already govern the production of a great deal of digital scholarship and of the well-established role of new media in humanities research. it is of course convenient when electronic scholarly editing and writing are clearly analogous to their print counterparts. but when new media make new forms of scholarship possible, those forms can be assessed with the same rigor used to judge scholarly quality in print media.
we must have the flexibility to ensure that as new sources and instruments for knowing develop, the meaning of scholarship can expand and remain relevant to our changing times.

part ii: the responsibilities of hiring institutions in the tenure process

in a job system that has tenure-track positions for slightly more than half of those who earn a phd, departments can afford to be highly selective in the hiring process. but even in a buyers’ market, departments must maintain their professional and ethical responsibilities to the junior faculty members they hire. the time to initiate communications about tenure expectations is at the point of hiring. this initial set of communications should be the beginning of a clearly articulated, transparent tenure process, which will benefit not only the department’s faculty members but also the entire institution.

. written expectations and conditions of employment and of tenure

to ensure transparency, departments should enter into discussions with new faculty members and state clearly the conditions of employment and the requirements for tenure in the letter of hire or in a supplementary written document. the task force recognizes that departments hire untenured faculty members at different stages, with differing levels of experience and varying records of scholarly achievement, which may accrue from work they produced while at different institutions. and as we have already noted, many graduate students now have publication records at the time they enter the job market. the status of this work and the degree to which it will be recognized in the tenure process need to be stated and agreed on by the new faculty member, the department chair, and the dean.

departments evaluate the scholarship of newly hired faculty members in various ways.
some may disregard all previous scholarship and count for tenure only the body of work done once the faculty member has arrived, although this practice strikes us as ungenerous and unproductive, since scholarly achievements are not erased when one joins a new institution. others may give full credit to the work done at previous institutions as part of scholarly requirements for tenure. the decision about how much previous scholarly work counts toward tenure is often a function of the number of years that the new faculty member serves before being considered for tenure. candidates who have spent a number of years at another institution and who are to be considered for tenure after one year of service at the new institution should be judged predominantly on the basis of their record before they were hired, whereas candidates hired after a year or two at another institution and given reasonable time at the new institution may not always have their earlier work considered in the tenure dossier. it is critical to give careful consideration to the relation between the candidate’s previous work and the timetable for making the tenure decision. faculty members have a right to know from the outset what the institution’s policies and practices are, and institutions have an obligation to provide this information in clear terms at the time of hire.

the task force therefore recommends that departments explicitly state in writing the timetable for the tenure period and what parts of the already completed scholarly work of all newly hired faculty members will count in the tenure decision. such a letter or supplementary memo should also outline how various previous and future professional activities will count and thus what the department’s expectations of the newly hired faculty member are.
the letter should be reviewed and updated periodically throughout the pretenure period to reflect any changes in expectations or assigned responsibilities, such as substantial new administrative tasks that the junior faculty member might be asked to assume during the pretenure period. this written statement, which should be included in the tenure dossier, will clarify the department’s goals for new faculty members and, in turn, help them devise a schedule for their scholarly and professional work. it will also help protect faculty members from the declaration of new expectations by a new chair or tenure committee and serve to promote transparency, a value that should characterize the entire pretenure period.

. joint appointments

the interests of transparency and fairness demand that special attention be paid to joint appointments (appointments in more than one department or program) and that their conditions be clearly stated at the moment of hire. such appointments have become more frequent in recent years in keeping with the growing commitment to interdisciplinary work in the humanities. typically, joint appointments in the humanities designate a home department paired with either a program or another department, and the home department makes the decision recommending the candidate’s tenure. the other department or program, which sometimes lacks the power to grant tenure, often has minimal representation, and minimal say. such joint appointments are not truly joint when the interdisciplinary appointment becomes principally, if not exclusively, subject to disciplinary evaluation. the department that recommends the tenure decision will understandably receive the bulk of the candidate’s time and attention. this disciplinary imperative will tend to influence the scholarship of junior scholars to make it more acceptable to the tenure-granting department and less faithful to the interdisciplinary spirit of their appointment and their own scholarly interests.
if they remain true to the terms of the joint appointment, they will risk alienating certain department faculty members.

the task force recommends that both units involved in a joint appointment participate equally in the tenure review process from beginning to end. ideally, the two units should share the hiring decision, draft the hiring letter and supplementary memo together, and then collaborate throughout the period leading up to the tenure recommendation—a recommendation that they should also make together. in institutions that do not allow this kind of collaboration between the two—or conceivably three—units, a transparent process that is fair to jointly appointed candidates needs to be articulated from the beginning of their appointment to the institution. where the institution does permit this kind of collaboration, we recommend that the two units:

• together formulate the workload that the candidate will carry in both teaching and service to make sure that the demands from each side add up to a fair and manageable amount in conjunction with expectations for scholarship. because service in two units can pose particularly heavy demands on junior faculty members, the letter or memo should indicate precisely how much service the joint appointee is expected to perform in each unit.
• together mentor the junior faculty member in the areas of teaching, scholarship, and service (see section ).
• together assess and evaluate the candidate’s scholarship, teaching, and service through an intermediate review at the least but, preferably, more frequently during the pretenure years. as with faculty members with single appointments, intermediate evaluations of jointly appointed faculty members should be in writing and followed up by meetings in person (see section ).
• together make the recommendation regarding tenure and promotion through a specially formed tenure and promotion subcommittee on which the two departments or programs have equal representation. should there be disagreements between the two units on a personnel decision, the committee should produce a majority and a minority report.

other arrangements for interdisciplinary joint appointments exist, including uneven divisions of the faculty member’s time. in keeping with the recommendations made above, the task force proposes that the same basic arrangements obtain for an uneven joint appointment as for an even one, with the important exception of the composition of the tenure and promotion subcommittee, where the primary unit could have proportionately greater representation. the need for special attention to joint appointments is not only a matter of fairness to junior colleagues. the task force’s recommendations are also designed to make the commitment to interdisciplinarity more than a valued but abstract ideal, as it often is in united states colleges and universities today; it should be embodied in the concrete practices of the institution.

start-up funds, subventions, research funds, and leaves

to assemble the necessary elements of a scholarly and professional dossier for tenure and promotion, faculty members in the humanities should benefit from institutional support of the kind routinely provided to the sciences. these can include start-up packages, summer or other research funds, subventions for accepted manuscripts, and one-semester and full-year paid leaves. start-up funds for faculty members in the humanities have become fairly common in both private and public institutions.
they are widely available in carnegie doctorate-granting institutions, where % of the departments surveyed reported offering them to some or all junior faculty members, but are unusual in carnegie master’s institutions, where a third or fewer departments say that a start-up package is available to some or all junior faculty members. in the departments in our survey that provided information about the date they began to offer them, start-up packages have been available since before in . % of carnegie doctorate-granting, . % of master’s, and . % of baccalaureate institutions; from to , an additional . % of carnegie doctorate-granting, . % of master’s, and . % of baccalaureate institutions established start-up packages. these packages are a way of recognizing that the scholarship of language and literature professors carries with it costs that may not be as visible to the institution as they are in the sciences (laboratories, research assistants, and so forth) but that this work is no less important. institutions that expect humanities faculty members to produce significant scholarship have a duty to provide the conditions and financial resources to make this accomplishment possible throughout the entire probationary period and beyond. while it would be difficult to generalize about the dollar amount of start-up funds, a range of $ , to $ , seems appropriate to the task force. in the mla survey, however, the average maximum amount was $ , in carnegie doctorate-granting, $ , in carnegie master’s, and $ , in carnegie baccalaureate institutions.
these stipends seem low considering that they need to help cover the expense of books and other materials; computing equipment and software; research assistance; travel to do research and to participate in conferences; institutional library acquisitions; permissions, translations, and other costs related to publication; and training, consulting, and other educational and professional advancement opportunities.

at institutions with high scholarly demands, release time from teaching and paid leaves of absence from the campus are often built into initial faculty appointments. one-semester paid leaves are available to some or all junior faculty members in almost two-thirds of departments in private carnegie doctorate-granting institutions but in less than two-fifths of departments in public carnegie doctorate-granting institutions and in only % of departments in carnegie master’s institutions. full-year paid leaves are unusual—overall fewer than one in ten departments say they are available, and they are almost unknown in carnegie master’s institutions, where . % of departments report offering them. summer and other research leaves are well established and have been available since before the – academic year in close to three-quarters of the departments that offer them ( offer summer leave, report offering other types of research leave).

subventions for book publication often play a role in the support offered to faculty members when they are hired. they may be part of the start-up funds or be separate from them. some institutions distribute subvention funds on a limited, competitive basis (a pool of scholars apply for grants, and not all projects are funded). a common practice is to restrict the use of such subventions to not-for-profit publishers that use stringent peer-review processes. the mla survey indicates that (or . %) of the reporting departments offer them.
of the departments that provided more detailed information, . % in carnegie doctorate-granting institutions started offering subventions before , while % in carnegie master’s and . % in carnegie baccalaureate institutions did so. the period – saw departments in another . % of carnegie doctorate-granting, . % of carnegie master’s, and . % of carnegie baccalaureate institutions offer subventions. survey respondents cite very modest figures for the subventions they offer, with average maximums ranging from $ , at carnegie baccalaureate institutions to $ , at carnegie doctorate-granting institutions to $ , at carnegie master’s institutions.

many university presses have come to expect scholars to seek subventions for the publication of monographs, especially for those works that have limited sales potential. editors say that the presence or absence of a subvention is never the determining factor in considering a manuscript for publication. nevertheless, once a book is accepted, a stipulation for a subvention often figures in the contract. presses assume that subventions will come from university research funds, scholars’ start-up funds, or foundations. in the current climate, university and other scholarly presses cannot afford to publish all the manuscripts they wish to accept—particularly first books—without subventions. universities that place a high value on published research and scholarship from their faculty members should be prepared to provide funding for publication. some institutions will be more able than others to subsidize faculty research, but expectations should be commensurate with the ability to provide the means to achieve them.

intermediate reviews; mentoring

from the time junior faculty members are hired at an institution, it is imperative that they receive professional advice from the department chair and from colleagues about teaching, scholarship and publication, and service.
as we have already emphasized, chairs and administrators should make tenure and promotion expectations, in writing and in person, as transparent as possible to their junior colleagues from the beginning. moreover, the task force believes it is critical to have a third-year or intermediate review—some institutions conduct multiple reviews before a candidate comes up for tenure—with procedures, expectations, and goals for each review clearly outlined to the junior faculty member and carried out by a departmental committee (or a bi- or multidisciplinary committee for joint appointments or for work that goes beyond disciplinary boundaries). we oppose soliciting external letters for this intermediate review.

following an open and thorough intermediate review process, the department chair and the members of the review committee should have frank discussions with their junior colleague about the strengths and weaknesses of the emerging dossier for tenure; the completed and future projects of scholarship and publication; the teaching portfolio; and the record of service to the institution, the profession, and the community. this constructive feedback will give junior scholars time to strengthen their tenure profiles. mentoring should also include practical advice on how best to construct a tenure file that will demonstrate the candidate’s strengths when she or he approaches the tenure and promotion review process.

appropriate mentoring will facilitate and enhance the candidate’s journey from the point of hire to the tenure decision. a successful mentoring relationship does not simply concern the practicalities of the tenure process; it often touches on issues as far-ranging as time management, selecting an appropriate journal or press for publication, choosing service and teaching assignments carefully, and developing productive ties with colleagues and administrators.
senior colleagues should make themselves available to read junior colleagues’ scholarship and should make constructive comments to facilitate its publication. the best mentoring relationships are sensitive to the many differences in our professional and personal lives, including differences in ethnicity, gender, age, sexuality, and familial responsibilities. junior faculty members also have responsibilities, however, in seeking appropriate mentor relationships, asking for clarification if any confusion remains about professional expectations, and examining carefully the advice that they receive from outside as well as inside the department. mentoring relationships help prepare a junior faculty member not only for tenure and promotion but also for a long, vibrant, and productive career that recognizes in turn the value of collegiality and the ethics of responsibility toward junior scholars.

part iii: the mechanics of the tenure review

proper preparation of the tenure dossier by the chair and the department

the evaluation process that we recommend will be meaningless if a positive recommendation for tenure and promotion is accompanied by a poorly crafted and weakly made case presented to division, college-wide, and university administrators and committees. inevitably, these committees involve faculty members, administrators, and staff from nonhumanist disciplines who will not be familiar with the candidate’s disciplinary or interdisciplinary field(s) of specialization. the person who prepares the dossier, typically the department chair, should function as the advocate for the candidate’s promotion on behalf of the department that is recommending tenure and promotion. thus the chair should make the case for the significance to the humanities at large of the candidate’s scholarship (broadly defined) and publications and of the foci and critical approaches in his or her work.
the chair should foresee and respond in advance to questions or concerns that the committee reading the dossier may have—for instance, why the press publishing work in a recently emerging field does not come from the north american institutions deemed the most prestigious and whether criticisms made by external referees are justified. finally, it is important for the chair to comment on the significance of the candidate’s work and on his or her potential as a scholar, teacher, and contributor to the department, the institution, and the humanities.

the success of the tenure and promotion dossier should not depend on the chair’s skills at making a persuasive case. rather, it should rest on the merits of the candidate. it is incumbent on institutions to train incoming chairs in the preparation of a properly documented dossier, perhaps through workshops where outstanding and successful dossiers can be held up as models. in fact, the dean should return for revision those dossiers that do not do justice to the candidate’s scholarship, teaching, and service or that do not make an informed case for the candidate to colleagues from other fields, notably fields outside the humanities.

who does the reviewing—academic presses or internal and external referees?

lindsay waters has observed that in the current system of tenure and promotion at research universities, humanities departments “outsource” the substantive review of the scholarly work of their junior colleagues to university press readers (enemies ). as he points out, this process of external review serves to obviate the process of internal review: departmental committees behave as if they cannot or should not determine the value of their junior colleagues’ work unless university presses deemed sufficiently prestigious have determined the value of that scholarship for them ( ).
in fact, this practice of relying on university press readers continues today as if there were no systemic problems in scholarly publishing, even in fields (medieval studies, for example, or literatures in languages other than english) in which there are fewer venues for monograph publication and in which university presses have been scaling back production.

this disturbing overreliance on university press readers is not identical to the system of peer review, which depends chiefly on external referees chosen from a list of scholars in a candidate’s field of study. there are good reasons why the system of external peer review was established and why it should be maintained. as christopher jencks and david riesman argued in the academic revolution, the development of external peer review freed individual scholars from the vertical—and sometimes parochial and territorial—evaluation of their work by local college deans and upper-level administrators. and as jonathan culler pointed out in framing the sign: criticism and its institutions, the external review process enabled the rise of many forms of innovative and even controversial work in the humanities. nonetheless, this apparatus of external peer review also created the conditions whereby individual departments can practically abdicate their responsibility to review the scholarly work of the very colleagues they have appointed to tenure-track positions.

if internal review is to be a meaningful part of the tenure process, it cannot be used as a fallback mechanism when a junior scholar’s manuscript has not yet been accepted by a university press. rather, internal review should be as rigorous and as substantive as external review. department heads (and advisory or executive committees, where appropriate) must ensure that the fairness of the internal review is not compromised by departmental or intradisciplinary factionalism.
the only matter at issue should be the quality of the candidate’s dossier in scholarship, teaching, and service. this means that the candidate’s perceived collegiality should not be at issue. the task force agrees with the aaup’s argument (“on collegiality”) against the inclusion of collegiality as “a criterion for faculty evaluation” and tenure, almost as “a fourth criterion” beyond scholarship, teaching, and service. to be sure, as the aaup observed, collegiality, “in the sense of collaboration and constructive cooperation, identifies important aspects of a faculty member’s overall performance . . . not a distinct capacity . . . rather a quality . . . expressed in the successful execution” of the three functions. it thus recommended that the virtues of collegiality be reflected in the department’s and the institution’s definitions of scholarship, teaching, and service. like the aaup, this mla task force opposes practices that exclude faculty members on the basis of their difference from a perceived professorial norm. the charge of uncollegiality may also threaten academic freedom when collegiality is confused with the expectation that faculty members display deference to administrative or faculty decisions and help ensure internal harmony. at the risk of stating the obvious, criticism and dissent do not necessarily conflict with the substantive virtues of collegiality.

the task force recognizes that internal reviews may often be undertaken by scholars who are not specialists in the tenure candidate’s field of expertise. in small departments, this situation is frequently the case, but even large departments may not always be able to assign the work of a junior scholar to senior faculty members who are familiar with the candidate’s area of study. we believe nonetheless that any department capable of hiring a junior scholar must be capable of reviewing the work of that scholar for tenure and promotion.
senior scholars whose research lies outside the tenure candidate’s area of expertise are still able to discern the quality of the conceptual framework, the scholarly apparatus and documentation, the writing, and the impact of the candidate’s work on other scholars. internal review by nonspecialists and external reviews by specialists should be regarded as complementary parts of the whole tenure review process.

external letters: their number and form and the reviewer’s institution

judging from the findings of our survey, practices involving external reviewers vary widely from institution to institution. overall, ( . %) of the departments that responded to the mla survey reported that their institutions require them to seek letters from outside referees as part of the process of review for tenure and promotion. of the responding departments in carnegie doctorate-granting institutions, ( . %) said that they are required to seek outside letters, in contrast to only ( . %) of the departments in carnegie master’s and ( %) of the departments in carnegie baccalaureate institutions. over half of the departments in carnegie master’s ( . %), nearly a third of the departments in carnegie baccalaureate ( . %), but only . % of the departments in carnegie doctorate-granting institutions reported that neither are they required to seek outside letters nor do they seek them of their own volition.

of the responding departments required to solicit outside letters, provided information about the number of letters they must seek. departments in carnegie doctorate-granting institutions require the greatest number of external referees for tenure decisions— . % of these departments must seek four or more letters, compared with . % and . % of departments in carnegie baccalaureate and carnegie master’s institutions, respectively. among departments in carnegie baccalaureate institutions, . % report that they must seek one to three letters. over two-thirds ( .
%) of carnegie master’s institutions report the same requirement. by contrast, . % of departments in carnegie doctorate-granting institutions are required to seek six or more letters, and a handful of institutions require an astounding twelve, fourteen, and sixteen letters (one case each). nine or more letters were required for ( . %) of the departments that provided answers to this question.

it is nearly impossible to suggest best practices that will cover all institutions and all forms of external review. nonetheless, the task force developed guidelines for conducting external reviews. in ordinary circumstances, six external letters should be sufficient testimony to the value of a candidate’s scholarship. larger numbers inevitably add to the workload of leading senior scholars in a field, particularly in small subfields, where the same senior names come up again and again as potential reviewers. the crushing number of reviews some senior scholars are asked to do may negatively affect the quality of the individual letter, and the demand for a large number of letters is sometimes harmful to candidates as well, since some departments and colleges count it against candidates when potential referees decline to review their work. indeed, extrapolating from our survey data, specifically from the number of candidates our responding departments considered for tenure over five years— – to – —we can say that letters were solicited per year; and since our sample represents % of the , departments in the mla database that we assume seek letters (or % of the , departments in the database), then something on the order of , letters are sought annually.

in an effort to ensure the highest standards for external reviews, we recommend that the following procedures be adopted:

• departments should solicit letters from the most knowledgeable reviewers in a candidate’s field, regardless of the perceived status of the reviewers’ institutions.
some college and university guidelines urge department heads to seek reviewers exclusively from colleges and universities viewed as more prestigious than their own, even though there may be no reason to believe that the most knowledgeable reviewers will come from those institutions. specialists in some fields, particularly emerging fields, may be affiliated with all types of institutions, and it is the job of the department head or the departmental committee that prepares the tenure dossier to explain to deans and provosts why a candidate’s referees are qualified to conduct a tenure review even when they do not come from institutions that equal or exceed the perceived prestige of the candidate’s own.
• candidates should have the privilege and the responsibility of naming some of their potential reviewers (we recommend half) and excluding one or two figures in their field who they believe, for various professional reasons, would be inappropriate referees.
• external referees should be remunerated. we recognize that many department budgets are already strained. but refereeing is one of the most important evaluative mechanisms in the profession, and institutions should make funds available for external reviewers as a form of recognition for their work, along the lines of the $ –$ offered by most university presses. while this is hardly adequate compensation for the task, it does recognize refereeing as scholarly labor rather than as an obligation attendant on scholarly achievement (see “mla recommendations on extramural evaluations”).
• when approached for external evaluations months before the deadline, referees should respond promptly, particularly if they believe they will be unable to conduct the review.
when they agree to conduct a review, referees should complete their work in a timely and responsible fashion that includes a careful, lengthy assessment of the candidate’s primary works of scholarship and his or her potential to contribute to the field in the future. the review should present a critical evaluation of the work, not a list of superlatives for fear that anything more nuanced will be used against the candidate. referees should not be asked to judge the quality of teaching and service to the institution, which they have no way of evaluating.
• departments should request evaluations by means of a form letter to ensure consistency and thus equity. although there are potentially as many such letters as there are departments, we strongly recommend that the form letter not ask referees to indicate whether the candidate would receive tenure at their own institution. different institutions have different needs and expectations, and letters should not presume that one tenure standard fits all. in fact, the form letter should not ask the referee to adjudicate the question of tenure at all; rather, it should focus solely on evaluating the scholarly merit of the candidate’s work, leaving the tenure determination to the candidate’s department and institution.

the task force believes that it is incumbent on all departments and institutions to make their own substantive determinations about the quality of candidates’ scholarship, regardless of the venue in which it appears and even when some part of it has not been publicly disseminated at the time of the tenure review. external reviews and press readers’ reports are undeniably important to the tenure process, but they must be accompanied by rigorous and fair internal evaluation.

reviews of scholarly books

the book review plays an essential role in humanities scholarship, disseminating information about new works, critically evaluating them, and engaging them in often pointed debate.
as such, the best published reviews constitute an important scholarly activity that helps direct, alter, and sustain ongoing conversations in the field. while book reviews should be an important element of tenure evaluations, their special role in disciplinary and interdisciplinary conversations should be recognized and should not be confused or conflated with other forms of review, including internal and external evaluations and reader reports from presses. in fact, the growing emphasis placed on book reviews in tenure evaluations may have had a deleterious effect on this kind of scholarly forum, dampening disagreements and transforming critical interrogation into a standardized summation. many senior scholars, faced with evaluating a number of tenure dossiers, lack the time to write book reviews, and some journal editors have begun to reduce the amount of space dedicated to reviews, whereas others increasingly turn the task over to graduate students or junior scholars in the field, who may not be sufficiently specialized in the subject or treatment of the book to draft anything other than a summary. such reviews have little value as part of a tenure dossier.

in response to this many-faceted issue, we offer the following recommendations:

• tenure committees and senior administrators need to understand the nature of the humanities book review so that its generic distinction from other forms of peer evaluation can be recognized. a hostile or barbed review, for instance, should not necessarily be seen as an indication that the book is of poor quality. reviews can often be explicit in their disagreements and aggressive in their critique of a line of argument or mode of interpretation because they are taking a position within particular disciplinary debates.
• senior scholars should write book reviews to model this critical genre for younger scholars and to help identify significant new work.
• journal editors should cultivate a more critical culture of reviewing, seeking out well-qualified evaluators and asking tenured scholars, in particular, to make regular contributions.
• new forums should be developed for the circulation of reviews in the humanities, since they have diminished or even disappeared from many scholarly journals. we applaud the publication of state-of-the-field review essays in such journals as signs and pmla.

evaluating collaboration

solitary scholarship, the paradigm of one-author–one-work, is deeply embedded in the practices of humanities scholarship, including the processes of evaluation for tenure and promotion. collaboration, however, offers significant opportunities for enterprising, untenured scholars to tackle problems or interdisciplinary topics too formidable in scale or scope for an individual. sometimes collaboration simply offers the most satisfying way to approach an issue or problem in an article or a monograph. in fact, recent technological advances have made collaboration with distant colleagues easier, faster, and more efficient. and the special challenges involved in creating digital scholarship have led to new forms of collaboration in that arena as well.

such opportunities to collaborate should be welcomed rather than treated with suspicion because of traditional prejudices or the difficulty of assigning credit. after all, academic disciplines in the sciences and social sciences have worked out rigorous systems for evaluating articles with multiple authors and research projects with multiple collaborators. we need to devise a system of evaluation for collaborative work that is appropriate to research in the humanities and that resolves questions of credit in our discipline as in others. the guiding rule, once again, should be to evaluate the quality of the results.
going forward

it is a truism that one report begs for—and begets—other reports. this report is no exception. the questions and issues that the task force has considered, including the survey findings and the legitimate concerns of all faculty members (junior and senior, tenure-track and non-tenure-track), chairs, and administrators in language and literature programs as well as in other academic fields, have in no way been exhausted by this report.

to begin with, there were aspects of our charge that we did not fulfill. in our discussion of publication practices, we decided not to examine the issue of multiple submissions and to defer, for the time being, to the mla’s current statement, outlined in “advice for authors, reviewers, publishers, and editors of scholarly books and articles” (mla committee on academic freedom). this document states that publishers “should make available to prospective authors clear statements about editorial policies . . . and of procedures for the submission and review of manuscripts” but that, barring explicit strictures, an author could submit the prospectus of a book or monograph to several presses simultaneously. when a publisher asks to see the entire manuscript, however, the author should “inform the publisher if the manuscript has been submitted elsewhere” ( ). on the submission of articles and essays, the mla’s current position is that “an author submitting a manuscript to two or more journals simultaneously should notify each editor concerned,” although some journals, including pmla, do not allow multiple submissions because they “often create unnecessary work for reviewers and editors” ( ).

it should be noted that strictures against multiple submissions of book manuscripts are peculiar to academic presses; in trade publishing, multiple—and competitive—submissions are the norm, although trade editors do not usually commission peer reviews.
the current mla recommendation states that “if, after four months, a press that requires exclusive examination rights is unable to decide on acceptance or rejection and the publisher and author cannot agree on a reasonable timetable, the author may submit the manuscript elsewhere after writing to notify the publisher” ( ). since publishers rely on peers from the field to assist in evaluating manuscripts, scholars who agree to write reviews should comply rigorously with the agreed schedule for completion. all too often, however, as many of us have experienced or heard from younger colleagues, presses and journals delay decisions for a year or even more, thus restricting the options of junior faculty members who need multiple publications before the tenure process formally begins. given the realities of tenure demands and, as our report confirms, the raised and rising requirements for scholarly production from junior scholars, the task force urges all parties involved in the review process—authors, editors, and reviewers—to refrain from significant delays, especially for the untenured. we further recommend that the mla undertake a reexamination of its current policies on multiple submissions for books and articles.
aside from the issue of multiple submissions, we were not able to compare the number of books published by university presses in the fields represented by the mla between and , as the council charged the task force, because, as peter givler of the association of american university presses informed us, presses do not share a standard way of codifying those fields. further, givler doubted that a comparison between only two points in time (only a few years apart) would lead to significant conclusions; a much broader study of trends would be preferable, though even more difficult to conduct. in this interdisciplinary age, presses list books under multiple rubrics, in part to increase the books’ sales potential.
to obtain quantitative information about the number of books published in the fields represented by the mla, we would have to conduct a systematic survey of university press directors, which would be costly and time-consuming. the task force thus concluded that the mla’s funds should be used for a different survey, one that engaged the primary concerns of our charge.
as we reported, our survey did not confirm the existence of a crisis of publication in the humanities, at least as late as , although there are reasons to believe that publishing opportunities may be narrowing further. the troubling signals we noted give us, teacher-scholars in language and literature departments, ample reason for taking simple and constructive measures to support university presses on our campuses. we can become members of a permanent library committee and try to get our libraries to increase the size of their humanities budgets for books, reference works, and journals, and we can also offer to serve as members of press boards to affect publishing priorities and financial allocations (see also mla ad hoc committee ). moreover, we should urge our administrators to subsidize our academic presses and to earmark portions of the current or the next capital campaign for enlarging humanities library collections. the press is crucial to the mission of the university: the production of knowledge for society at large would be seriously compromised without the presses’ dissemination of innovative academic research and rigorous scholarship, on which policy makers, opinion leaders, and authors of works for the general public constantly draw.
in retrospect, our survey questionnaire did not investigate several topics and areas that warrant further consideration. for instance, we did not ask department chairs to report on the salaries of their junior or recently tenured faculty members.
we also did not inquire how unions at public institutions influence the evaluation of scholarship, even though they sometimes play an important role in establishing and defending procedural norms and rights in cases of tenure and promotion. further, we did not ask about existing appeals processes when tenure is not recommended at the department level or is denied at a higher administrative level—an area that certainly merits further study. and finally, the survey did not pose questions about lengthening the time to tenure or the probationary period before tenure to eight or nine years, which some institutions are now recommending.
within the purview of our survey findings, it is clear that issues related to faculty members of color (hiring, exiting, promoting, and tenuring) need further study and more precise documentation. the number of departments that reported cases of junior faculty members other than non-hispanic whites coming up for tenure is small, in some categories fewer than a dozen; indeed, the number is too small to allow for significant inferences. for instance, only of the english departments that responded to the survey (or . %) reported on the cases of african american candidates who came up for tenure between – and – . over the most recent five-year period covered in the survey ( – to – ), non-hispanic whites made up . % of the candidates considered for tenure in english departments that answered questions about their candidates’ race and ethnicity and . % of those awarded tenure. on the foreign language side, departments reported on non-hispanic white candidates, but only departments reported on african american candidates for tenure over the five-year period between – and – . fifty to departments reported on hispanic candidates who made up . % of foreign language department candidates.
because the small numbers make it difficult to discern general patterns, we urge the mla to undertake further analysis of the exit and tenure rates of academic populations other than non-hispanic whites. and we further urge the mla to study, through both quantitative and qualitative methods, the career paths of faculty members of color.
more broadly, the small number of departments reporting figures about faculty members of color who were considered for tenure is symptomatic of the minimal presence of diverse races and ethnicities in the united states system of higher education. although this issue is beyond the task force’s purview, we feel compelled to emphasize the importance of increasing diversity in the pool of doctorates in the fields of the modern languages and thus in the pool of applicants for academic positions and for tenure and promotion. we hope that our emphasis is not simply dismissed as a tropism and that mla members will urge their faculty and administration colleagues to undertake a campuswide study of faculty diversity, an essential component of a good education in the twenty-first century, and to try to find concrete ways of grappling contextually with the historical, ideological, economic, political, and educational causes of this complex problem.
our findings regarding the “tyranny of the monograph,” as it applies to a very influential sector of the carnegie institutions that we surveyed, led us to conclude that this is the moment to ask whether the needs of the profession—and of graduate students in the fields represented by the mla—are best served by the current idea of the dissertation as a book-in-progress. to be sure, we are not the first to raise this question. in we scholars: changing the culture of the university, david damrosch lamented the hegemony of the book-length dissertation in the modern languages and its contribution to the overproduction of monographs.
he asked, “why should the dissertation be presumed to be a protobook rather than a series of articles, each produced independently, sharing a common general theme or approach rather than developing a single argument?” ( ). he argued that a series of articles might well be as valuable a form of scholarship as a single book, without ruling out the possibility that the work of some graduate students might best take shape as a book. damrosch’s idea that dissertations could take the form of linked articles rather than a single monograph is even more timely now than a decade ago because boundaries of disciplinary inquiry have been significantly altered by multi- and interdisciplinary work.
a shift in the conception and structure of the modern language dissertation is one valid option for creating a more capacious understanding of scholarship. the monograph as the gold standard for tenure dossiers is a relatively recent development, and rigorous quality standards for scholarship are not tied directly to monograph production. for example, in departments of philosophy, as well as in most departments in the social sciences, faculty members uphold the essay as the fundamental unit of scholarly production. if departments of english and other modern languages were to encourage new structures for the dissertations required of their doctoral candidates, then deans and provosts would support such a change, in our view, because experts in the field had determined the validity and the value of such a change. indeed, if the institutions perceived as the most prestigious thought anew about the various ways and forms in which advanced graduate students in the humanities demonstrate that they are able to conduct sustained original scholarly inquiry, these universities would open the door to a long overdue reconsideration of the dissertation across the spectrum of graduate education programs in the united states.
of necessity, the relation of the dissertation to the monograph will have to be reconsidered as dissertations are made accessible in electronic form through institutional repositories, multischool consortia, or commercial distributors. advocates regard making dissertations available electronically as a logical and desirable replacement for dissertation abstracts and the microfilm or microform system. they consider it retrograde to continue using nineteenth-century technologies and modes of dissemination and to deny academics and others access to new research and scholarship. more pertinent to our purposes in this report is the impact of the potential shift to the electronic dissertation on the process of evaluating scholarship for tenure and promotion. for if dissertations are so freely and widely accessible that they are in effect published, the typical route of the revised dissertation to a first book will need to be fundamentally rethought. university presses could well decide that there would virtually be no sales of the printed form of a dissertation, even after it was revised. is it possible, however, as advocates of electronic dissemination claim, that those dissertations that generate interest online will make publication more attractive? perhaps, but the widely circulated electronic dissertation can create yet another level of increased expectations for writing and publication from nontenured members of the profession, who could be asked to produce an entirely new book-length manuscript in the postdoctoral pretenure years. to prevent a further escalation of demands on young faculty members, institutions could protect their students’ dissertations for three to five years before the dissertations are posted on university library sites and give students the right to decide the year in which their dissertations become electronically available.
this delay may only be a temporary measure if, as some advocates predict, all faculty work will eventually be uploaded and disseminated freely through library servers instead of taking the form of print books and journals; such an open archive system, which is currently the subject of considerable discussion within the research library community, would constitute a radical change in the ways in which scholarly communication functions.
it seems clear to the task force that our profession and the academy as a whole need to rethink not only the conception of the dissertation as a larval monograph but also, and more broadly, the entire graduate curriculum for students confronting a particular conjuncture of intellectual, academic, technological, and economic circumstances today. it is equally clear that the institutions perceived as the most prestigious in our fields will need to initiate such an effort if there is eventually to be a reconceptualization of the culminating piece of scholarship for the doctoral degree. but, in keeping with a principle that runs throughout this report, it is incumbent on each institution to define what type of final scholarly production is appropriate for its specific educational mission and for the kinds of academic positions that its phd recipients are likely to have, positions that may be very different from those of its own tenured senior faculty members. here again, we believe that the mla can act as a catalyst for this national conversation.
indeed, if there is a final recommendation that the task force would make, it is this: discussions of the issues we have raised must begin in departments and institutions that have not yet undertaken them and must continue in departments and institutions that have already begun them. our survey suggests that such deliberations are in fact going on in a substantial number—if not a majority—of the responding departments and thus in their institutions.
asked whether their institution is reviewing its processes and reconsidering its criteria for tenure and promotion, ( . %) departments said no in . but ( . %) said that their institution had completed such a review in the past three years, ( . %) that such a review was in progress, and ( . %) that such a review was under discussion. we advocate that discussions not be an end in themselves but that they result in a genuine, comprehensive review. institutions should rethink on a regular basis their requirements for tenure and the process by which they evaluate the ways in which junior faculty members have met those requirements. for, as this report has tried to show, the meanings and functions of scholarship and scholarly exchange are historical phenomena. the criteria for tenure and the processes of evaluation have shifted over the last few decades and will undoubtedly shift again. it is up to us, then, the teacher-scholars of the mla, to become agents in our academic systems and effect changes that reflect and instantiate appropriate standards of scholarly production and equity and transparency for our colleagues, our institutions, and our society.
summary recommendations
departments and institutions should practice and promote transparency throughout the tenuring process.
departments and institutions should calibrate expectations for achieving tenure and promotion with institutional values, mission, and practice.
the profession as a whole should develop a more capacious conception of scholarship by rethinking the dominance of the monograph, promoting the scholarly essay, establishing multiple pathways to tenure, and using scholarly portfolios.
departments and institutions should recognize the legitimacy of scholarship produced in new media, whether by individuals or in collaboration, and create procedures for evaluating these forms of scholarship.
departments should devise a letter of understanding that makes the expectations for new faculty members explicit. the letter should state what previous scholarship will count toward tenure and how evaluation of joint appointments will take place between departments or programs.
departments and institutions should provide support commensurate with expectations for achieving tenure and promotion (start-up funds, subventions, research leaves, and so forth).
departments and institutions should establish mentoring structures that provide guidance to new faculty members on scholarship and on the optimal balance of publication, teaching, and service.
department chairs should receive guidance on the proper preparation of a tenure dossier.
departments and institutions should construct and implement models for intermediate reviews that precede tenure reviews.
departments should conduct an in-depth evaluation of candidates’ dossiers for tenure or promotion at the departmental level. presses or outside referees should not be the main arbiters in tenure cases.
scholarship, teaching, and service should be the three criteria for tenure. those responsible for tenure reviews should not include collegiality as an additional criterion for tenure.
departments and institutions should limit the number of outside letters (in general, to no more than six). scholars should be chosen to write letters based primarily on their knowledge of the candidate’s field(s). letters should be limited to evaluating scholarly work. candidates should participate in selecting (or rejecting) some of their potential reviewers.
the profession as a whole should encourage scholars at all levels to write substantive book reviews.
departments and institutions should facilitate collaboration among scholars and evaluate it fairly.
the task force encourages further study of the unfulfilled parts of its charge with respect to multiple submissions of manuscripts and comparisons of the number of books published by university presses between and .
the task force recommends establishing concrete measures to support university presses.
the task force recognizes that work needs to be done on several questions not asked in its survey: salaries of junior and recently tenured faculty members, the role of unions, tenure appeals processes, and the lengthening of the pretenure period.
the task force recommends that a study of faculty members of color be conducted.
the task force encourages discussion of the current form of the dissertation (as a monograph in progress) and of the current trends in the graduate curriculum.
departments should undertake a comprehensive review to ensure that their expectations for tenure are consistent with their institutions’ values and mission and that each step in the process is fair and transparent.
domna c. stanton, graduate center, city university of new york (chair)
michael bérubé, penn state university, university park
leonard cassuto, fordham university, lincoln center
morris eaves, university of rochester
john guillory, new york university
donald e. hall, west virginia university, morgantown
sean latham, university of tulsa
notes
these percentages are derived from two ipeds components: the institutional characteristics file, which provides a systematic count of institutions covered in the ipeds, and the employees by assigned position file, which counts the employees in various categories that institutions have on their payrolls as of november of the given survey year. each data set was queried for degree-granting, title iv–participating, carnegie doctorate-granting, master’s, and baccalaureate institutions in the fifty states and the district of columbia.
our survey used the carnegie classifications, not the new carnegie classifications announced in . a comparative analysis of the and the classifications shows that more than % of the institutions in the classification that classify groups as carnegie doctorate-granting institutions remain in one of the three new categories—ru/vh: research universities (very high research activity), ru/h: research universities (high research activity), or dru: doctoral/research universities. more than % of the institutions in the classification that our survey grouped as carnegie master’s institutions remain in one of the three categories of master’s colleges and universities in the classification. three and a half percent of the institutions classified as carnegie master’s institutions in are classified in as research universities or doctoral/research universities. the remaining institutions are classified as master’s institutions in and placed in the baccalaureate colleges categories in . just over % of the institutions classified as baccalaureate colleges in are placed in one of the baccalaureate colleges categories in . most of the remainder ( . %) are institutions classified as master’s institutions in . we can safely state that the new classifications, while they allow for finer-grained analysis in some respects, affect the analysis of the survey findings only marginally.
the remaining percentage of phd recipients—ranging from % to %—represent the following categories: non-tenure-track full-time or part-time unknown; higher education, appointment type not specified; postdoctoral fellowships; academic administration; secondary and elementary school teaching; business, government, and not-for-profit organizations; self-employed; not employed seeking employment; and not seeking employment.
different types of faculty members can be classified under “full-time non-tenure-track.” these include, as we see in the “statement on non-tenure-track faculty members,” approved by the executive council of the mla in and endorsed by the delegate assembly in december , “external postdoctoral fellows, internal postdoctoral fellows, and more permanent ntts [non-tenure-track faculty members], the last of whom go by almost as many names as there are institutions.”
the percentages for full-time and part-time faculty members in english and foreign languages were calculated july using the united states department of education das-t online data analysis system, http://nces.ed.gov/das.
some would argue that tenure-track faculty members have benefited from the large presence of ntt colleagues because they are relieved from teaching lower-division courses. but it could also be argued that such purported relief increases the destructive consequences of departmental hierarchies and undermines the value of the tenure-track faculty in the eyes of administrations focused on the bottom line.
we are not suggesting that universities are asking their presses to become profit centers and sources of revenue to subsidize other parts of the institution, as the term “for-profit” business might imply, but we are emphasizing a new unwillingness to subsidize the not-for-profit university press.
penn state university press has revived its romance studies monograph series (http://romancestudies.psu.edu/) in collaboration with the penn state university libraries, the department of french and francophone studies, and the department of spanish, italian, and portuguese; monographs are offered in a free online version and for purchase in print. the university of california press is publishing monograph series online in collaboration with the california digital library (http://www.ucpress.edu/digpub/).
and oxford university press plans to sell licenses of its humanities list to libraries. in an e-mail message to david nicholls, director of mla book publications, sanford thatcher, director of the penn state university press, reported that the press is reviving a series in romance studies suspended several years ago by getting library support on the technical side (the press is becoming part of the library administratively). the program offers older titles in the discontinued series (then called penn state studies in romance literatures) online and for print on demand (pod). it also publishes new titles in romance studies (the new title of the series) in the same format; for new titles, the press expects sales of pod copies in the range of to to libraries and to to individuals. in an e-mail message to david nicholls, lynne withey, director of the university of california press, indicated that the press is also experimenting with monographs that faculty series editors select; the staff then posts the texts online in the institutional repository maintained by the california digital library and produces print copies on demand. in the series in science and in linguistics that the university of california press has launched, the press is selling to copies per series, which allows it to continue publishing specialized work. the press is about to announce a series in literature. both of these instances, which revive publication in the fields represented by the mla, confirm yet again the crucial importance of digital technology to the future of scholarship and publication in the humanities.
in “journals in the time of google,” lee c. van orsdel and kathleen born cite average prices per journal title in : whereas in chemistry it is $ , and in physics, $ , , in language and literature it is $ .
lindsay waters writes that in , % of the acquisitions budget in the university of california library system went for books, % for journals.
in the early s, these percentages changed dramatically: % went for books, % for journals (enemies ).
wilcox’s a comprehensive survey represents the report to the office of education in the united states department of health, education, and welfare, which provided the grant to support the survey, initiated as a project of the college section of the national council of teachers of english in . the project was subsequently endorsed by the mla and the ade, which participated in the advisory committee that was formed to oversee the project; wilcox was the survey’s director. the survey was mailed to a random sample of departments, which received the questionnaire in (the survey asked questions about the academic year – ); of the departments ( . %) returned questionnaires. wilcox’s the anatomy of college english drew on the survey but is not a report of the survey findings.
women continue to be underrepresented, however, at the highest rank; according to the national study of postsecondary faculty, women make up % of full-time tenured faculty members in english and foreign languages at the rank of professor. more encouraging, they make up . % of full-time tenured faculty members in these fields at the rank of associate professor (cataldi et al.).
findings about tenure rates from other recent surveys are comparable with those reported in the mla study. a survey conducted by the american historical association in fall (townsend) found that of the faculty members that the history programs surveyed had considered for tenure in the preceding five years, received tenure, a . % success rate across the range of institutions (not including two-year colleges). tenure rates at doctoral and research institutions and in phd programs were only slightly below this average at . %, and, when broken down by region, history programs in the midwest, which accounted for a quarter of the history faculty members who came up for tenure, reported a .
% rate, whereas the other regions were all close to the . % average.
the % exit rate among those hired between and was calculated from a base number that includes persons hired between and , most of whom would not as yet have been considered for tenure and who would be the least likely to have left the departments that hired them for any reason. the percentage of exits would doubtless be significantly higher, and the implied tenure rate lower, were the base number limited to cohorts of assistant professors whose members either had already been considered for tenure by or who would have been had they not left before their cases could be sent forward for review. we have no way of tracking persons in the % who were hired at other institutions and eventually received tenure, those in the percentage who went on to ntt positions, or those who left the profession. the discrepancy implied by the difference between dooris and guidos’s % success rate and the mla’s estimated % exit rate suggests the degree to which the mla exit rate is lowered because, unlike dooris and guidos’s study, it is calculated from a base that includes recently hired persons who are just beginning their probationary appointments.
there are also gender differences in the tenure rate according to the field. thus film and media studies have tenured more men than women; literary studies and creative writing have tenured about the same number of men and women; and rhetoric and composition, applied linguistics, and language learning and acquisition have tenured more women than men. it is, of course, difficult to know the historical and intellectual community factors that account for these differences among the fields represented by the mla.
to cite an obvious example, publication provides a poor framework for evaluating scholarship in digital form, including “print” books and articles aggregated in major databases, such as project muse.
digital repositories in libraries and at other sites may further diminish the distinction between the essay and the book.
see “the contributions of journal editors to the scholarly community” (council), which also urges recognition of editorial experience in hiring, tenure, and promotion. as this document argues, the conceptualization and execution of special issues of a journal are often commensurate with editing a collection of essays in book form. thus special issues are often subsequently published as books.
this acceptance of diversity should also hold true for scholarship published abroad, where journal and press editors sometimes serve as the peer-review body and readers’ reports may be unavailable. tenure review committees should always judge the merit of the scholarship itself, while understanding that international communities have different vetting processes for academic publishing.
increasingly, many journals possess a significant digital component and place electronic versions of print articles in repositories, such as project muse and jstor; this is the case with pmla. however, in “a celj snapshot of learned journals and e-publication in ,” based on the responses of member journals (out of to whom the questionnaire had been sent), elizabeth haluska-rausch and bonnie wheeler found “no immediate compulsion to add or change to an electronic format” among their respondents ( ). indeed, the respondents did not see an intellectual need to “go digital,” since they viewed the digital revolution as “market and consumer driven” ( ). editors also reported that for journals with print-only subscriptions or with both formats there was no evidence of decline in print sales due to electronic availability and that much of the evidence of decreased sales is anecdotal.
the mla international bibliography has been updated to provide links to digital materials of indexed items.
the task force thus supports the recommendation of the acls commission on cyberinfrastructure for the humanities and social sciences that “policies for tenure and promotion . . . recognize and reward digital scholarship and scholarly communication; recognition should be given not only to scholarship that uses the humanities and social science infrastructure but also to scholarship that contributes to its design, construction, and growth.”
this letter might include, for instance, how work with digital media in research, as well as in teaching and service, will be evaluated and credited. see “guidelines for evaluating work with digital media in the modern languages” (mla committee on information technology).
this mentoring process should include initiation into the workings and the policy statements of the faculty member’s professional organizations. see, for instance, “advice for authors, reviewers, publishers, and editors of scholarly books and articles” (mla committee on academic freedom) on issues of multiple submissions, appropriate time of response for submissions, and so on, as well as the “mla statement of professional ethics.”
this view is consistent with the provisions of the “mla statement of professional ethics” for external and internal reviews: “a scholar who has any conflict of interest or is so out of sympathy with the [colleague’s work] as to be unable to judge its merits without prejudice must decline to serve as referee or reviewer.”
see heather dubrow’s introduction to the special topic on collegiality in profession. she notes factors that cause a lack of collegiality in departments today: the job market, changes in career patterns, increased diversity within departments, changes in family patterns, and the influence of business models on the academy.
a significant, but puzzling, difference in tenure rates for women and men emerges in relation to the number of outside letters departments are required to solicit: women have a lower tenure rate than men in departments required to solicit the largest number of letters ( or more). for the years – and – , the figures show that when or more outside letters are required, . % of men versus . % of women are tenured; when or letters are required, . % of men versus . % of women are tenured; with to letters, % of men versus . % of women are tenured; and when letters are not required, . % of men versus . % of women are tenured. . in mla documents, the number of outside letters recommended ranges from to . the “ade and adfl statement on the use of outside reviewers” rec-­ ommends to letters; the “mla recommendations on extramural evaluations,” which was endorsed by the mla delegate assembly in and the mla executive council at its february meeting, recommends up to letters. in extraordinary circumstances, when there is substantial disagreement about a candidate or when a candidate’s work does not fit neatly into one disciplinary specialty, a dean or depart-­ ment head might ask for additional letters to supplement those of external referees who were already named or whose reviews were received. . see damrosch’s argument for collaboration as opposed to the idea of the iso-­ lated, individual scholar ( – ). . under the “statement of principles on academic freedom and tenure,” the aaup stipulated that “this period may not exceed seven years.” the association’s “statement of principles on family responsibilities and academic work,” how-­ ever, said that “a faculty member [can] be entitled to stop the clock or extend the probationary period, with or without taking a full or partial leave of absence, if the faculty member . . . 
is a primary or coequal caregiver of newborn or newly adopted children,” that “institutions [can] allow the tenure clock to be stopped for up to one year for each child, and . . . that faculty [can] be allowed to stop the clock only twice, resulting in no more than two one-year extensions of the probationary period” (n ). in our view, it would also be appropriate to stop the clock or extend the probationary period for primary care of significant others, infirm parents, or relatives. . making dissertations available in electronic form as an institutional requirement has elicited considerable comment at several universities, including ohio state university, columbus, especially from graduate students, who have expressed concern that their manuscripts may not be published in print form if they have already appeared on the web. because of such opposition, ohio state allows students a delay of one to three years in releasing their dissertations in electronic form, and the administration is considering a five-year delay to allow phd recipients more time to publish their work (carlson). . proquest/umi receives dissertations electronically, prints them out, and then microfilms them to create an archival copy. thus this company still considers the “retrograde” microfilm or microform system the most reliable way to create an archive of electronic texts. . this move raises the thorny issue of who owns the rights to the intellectual content of the dissertation, the author or the university? . see the final draft report from the acls commission on cyberinfrastructure for the humanities and social sciences, chaired by john unsworth. works cited acls commission on cyberinfrastructure for the humanities and social sciences. our cultural commonwealth. july <http:// www .acls .org/ cyberinfrastructure/ cyber .htm>. “ade and adfl statement on the use of outside reviewers.” ade bulletin ( ): .
<http:// www .adfl .org/ resources/ resources_ statement .htm>. ade ad hoc committee on changes in the profession: teaching and research. “ade statement of good practice: teaching, evaluation, and scholarship.” ade bulletin ( ): – . <http:// www .ade .org/ policy/ index .htm>. american association of university professors. “the devaluing of higher education: the annual report on the economic status of the profession, – .” academe mar.-apr. : – . boyer, ernest l. scholarship reconsidered: priorities of the professoriate. princeton: carnegie foundation for the advancement of teaching, . burnard, lou, katherine o’brien o’keeffe, and john unsworth, eds. electronic textual editing. new york: mla, . cahalan, margaret, et al. fall staff in postsecondary institutions, . nces - . washington: gpo, . carlson, scott. “students oppose ohio state’s plan to put dissertations online.” chronicle of higher education may : a . <http:// chronicle .com/ temp/ email .php ?id=pmdvbzdcjcqprd kzmyntb yhsdjjhjx>. cataldi, emily forrest, et al. national study of postsecondary faculty. nces - . washington: gpo, . council of editors of learned journals. “the contributions of journal editors to the scholarly community.” oct. . <http:// www .celj .org/ news .html # october >. crewe, jennifer. “scholarly publishing: why our business is your business too.” profession . new york: mla, . – . culler, jonathan. framing the sign: criticism and its institutions. norman: u of oklahoma p, . damrosch, david. we scholars: changing the culture of the university. cambridge: harvard up, . diamond, robert m. “scholarship reconsidered: barriers to change.” o’meara and rice – . dooris, michael j., and marianne guidos. “tenure achievement rates at research universities.” annual forum of association for institutional research. sheraton chicago hotel and towers. may . <http:// www .epi .elps.vt.edu/ perspectives/ tenureflow .pdf>. dubrow, heather. introduction. “collegiality.” profession .
new york: mla, . – . franklin, phyllis. “the debate over college costs.” mla newsletter . ( ): – . givler, peter. e-­mail to david nicholls. aug. . glassic, charles e., mary taylor huber, and gene i. maeroff. scholarship assessed: evaluation of the professoriate. san francisco: jossey-­bass, . greenblatt, stephen. “call for action on problems in scholarly book publishing.” modern language association. may . nov. <http:// www .mla .org/ rep_ scholarly_ pub>. guillory, john. “evaluating scholarship in the humanities: principles and proce-­ dures.” ade bulletin ( ): – . haluska-­rausch, elizabeth, and bonnie wheeler. “a celj snapshot of learned journals and e-­publication in .” celj. feb. . nov. <http:// www .celj .org/ news .html>. howard, jennifer. “gutenberg-­e lets historians present research in nontraditional ways.” chronicle of higher education july . <http:// chronicle .com/ weekly/ v / i / a .htm>. jaschik, scott. “the shrinking tenure track.” inside higher ed may . dec. <http:// www .insidehighered .com/ news/ / / /data>. jencks, christopher, and david reisman. the academic revolution. new york: dou-­ bleday, . laurence, david, and doug steward. “placement outcomes for modern language phds: findings from the mla’s surveys of phd placement.” ade bulletin – ( – ): – . “meeting of the mla executive council.” pmla ( ): – . mla ad hoc committee on the future of scholarly publishing. “the future of scholarly publishing.” profession . new york: mla, . – . mla commission on professional service. “making faculty work visible: reinter-­ preting professional service, teaching, and research in the fields of language and literature.” profession . new york: mla, . – . <http:// www .mla .org/ resources/ documents/ rep_ facultyvis>. mla committee on academic freedom and professional rights and responsibili-­ ties. “advice for authors, reviewers, publishers, and editors of scholarly books and articles.” ade bulletin ( ): – . mla committee on information technology. 
“guidelines for evaluating work with digital media in the modern languages.” may . ade bulletin ( ): – . <http:// www .mla .org/ guidelines_ evaluation_ digital>. mla committee on professional employment. “final report.” pmla ( ): – . <http:// www .mla .org/ prof_ employment>. mla committee on scholarly editions. “guidelines for editors of scholarly editions.” burnard, o’brien o’keeffe, and unsworth – . <http:// www .mla .org/ resources/ documents/rep_ scholarly/ cse_ guidelines>. “mla recommendations on extramural evaluations.” ade bulletin ( ): . <http:// www .mla .org/ pdf/ extramural_ evaluation .pdf>. “mla statement of professional ethics.” ade bulletin ( ): – . <http:// www .mla .org/ repview_ profethics>. “monograph and serial expenditures in arl libraries, – .” arl statistics – . washington: assn. of research libs., . <http:// www .arl .org/ stats/ arlstat/ graphs/ /monser .pdf>. o’meara, kerryann. “principles of good practice: encouraging multiple forms of scholarship in policy and practice.” o’meara and rice – . o’meara, kerryann, and r. eugene rice, eds. faculty priorities reconsidered: rewarding multiple forms of scholarship. san francisco: jossey-bass, . “on collegiality as a criterion for faculty evaluation.” american association of university professors. nov. . nov. <http:// www .aaup .org/ aaup/ pubsres/ policydocs/ collegiality .htm>. rice, r. eugene, and mary deane sorcinelli. “can the tenure process be improved?” the questions of tenure. ed. richard p. chait. cambridge: harvard up, . – . schuster, jack h., and martin j. finkelstein. with jesus francisco galaz-fontes and mandy liu. the american faculty: the restructuring of academic work and careers. baltimore: johns hopkins up, . scott, david k. “the scholarship of integration.” o’meara and rice – . stanton, domna c. “the labor of service.” mla newsletter . ( ): – .
“statement of principles on academic freedom and tenure.” american association of university professors. jan. . nov. <http:// www .aaup .org/ a aup/ pubsres/ policydocs/ statement .htm>. “statement of principles on family responsibilities and academic work.” american association of university professors. nov. . nov. <http:// www .aaup .org/ aaup/ pubsres/ policydocs/ workfam-­stmt .htm>. “statement on non-­tenure-­track faculty members.” . profession . new york: mla, . – . <http:// www .mla .org/ statement_on_nonten>. thatcher, sanford. e-­mail to david nicholls. may . townsend, robert b. “a survey of tenure practices in history: departments indicate books are key and success rates for tenure high.” perspectives . ( ). <http:// www .historians .org/ perspectives/ issues/ / / new .cfm ?pv=y>. van orsdel, lee c., and kathleen born. “journals in the time of google—periodi-­ cal price survey .” library journal apr. . aug. <http:// www .libraryjournal .com/ article/ca .html>. waters, lindsay. enemies of promise: publishing, perishing, and the eclipse of scholarship. chicago: prickly paradigm, . ———. “rescue tenure from the tyranny of the monograph.” chronicle of higher education apr. : b – . wilcox, thomas w. the anatomy of college english. new york: wiley, . ———. a comprehensive survey of undergraduate programs in english in the united states. ed . may . withey, lynne. e-­mail to david nicholls. may . evaluating the intellectual assets in the scholarship and collections directorate at the british library this is a repository copy of evaluating the intellectual assets in the scholarship and collections directorate at the british library. white rose research online url for this paper: http://eprints.whiterose.ac.uk/ / version: accepted version article: schofield, a.e., sen, b.a. and vasconcelos, a.c. ( ) evaluating the intellectual assets in the scholarship and collections directorate at the british library. library management, ( ). - . issn - https://doi.org/ . 
/lm- - - intellectual assets in libraries intellectual assets (ias) are assets that belong to an organisation, and benefit the organisation, but are intangible and have no direct financial worth. for example, the number of people employed by an organisation is a tangible asset which can be measured. the knowledge and expertise those staff members possess is an intangible, intellectual asset. due to their intangible nature, they cannot be assessed using traditional quantitative methods. nevertheless, these assets need to be optimised in order for an organisation to run as efficiently and effectively as possible. edvinsson and malone ( ) divide intellectual assets into three key areas: human assets, structural assets, and relational assets. human assets refer to elements like staff expertise and the quality of training, structural assets to the organisational infrastructure and value added by intellectual property, and relational assets to the quality of the relationships the organisation has with internal and external stakeholders.
corrall and sriborisutsakul ( ) add a fourth category specifically for libraries: collections and services assets. this takes into consideration the value added by the ways collections are used and the services offered by the library, and this category has been incorporated into this investigation. creating these categories allows for ias to be identified more easily, although it should be noted that some ias can be applied to more than one category and these groupings should perhaps only be used for the data collection process and discarded for the final report. this study focuses on the ias within the scholarship and collections directorate at the british library. the main aim of the directorate is to interpret and care for the library's collections. as a non-profit organisation and one which has a primary function of sharing knowledge and maintaining cultural heritage, intellectual assets are key to the library's success, and so an evaluation of the ias attached to the directorate would be of great value. the intention of this study is to determine how effectively the directorate is using its intellectual assets, indicating its strengths as well as areas for improvement, to enable the library to improve the services it provides. this research focuses on one sector of the british library: the scholarship and collections directorate. scholarship and collections (s&c) are responsible for the care, interpretation and development of the bl's collections of over million items. the directorate has just over members of staff, comprised of curators, archivists, restoration specialists, as well as individuals who have no direct contact with the collections at all. without the collections, the library could not exist. however, the collections as tangible assets are without value unless they are accessible, well utilised, and supported by human expertise. 
therefore, capitalising on the directorate's intellectual assets is essential in order to improve the library's services. figure 1. new structure of scholarship and collections an evaluation of the ias in scholarship and collections is of especial value at this time because the directorate underwent a major restructuring in - . the final report of the review (director of operations and services, : ) states that this change was prompted by a new emphasis on digital scholarship, as well as the 'changing external landscape in information provision and communications'. the report indicates that the library wished to demonstrate its change of priorities from the collections to access, and felt the need to be more recognisable in structure to users who were used to an academic library structure (director of operations and services, : 11). figure 1 demonstrates the directorate's new infrastructure: a thematic division rather than the previous structure which focused on the format of the collections. the restructure has inevitably caused disruption in certain areas and this, coupled with the diverse professional knowledge within the directorate, has meant that employees would benefit from a means of capitalising on their strengths, identifying where the directorate could be improved, and learning where certain areas of expertise can be found, all of which this investigation is expected to provide. methodology corrall and sriborisutsakul ( : ) suggest that using a mixed methodology is best for assessing the ias in academic libraries in order to fulfil 'both theoretical and practical aims'. that is, a library should seek to analyse the qualitative data provided by employees and external stakeholders, and should also attempt to provide some quantitative data and benchmarking in order to fully understand the strengths and weaknesses of its ias.
one of the main shortcomings of existing intellectual asset evaluation tools is the over-reliance on quantitative methods. while the aim is to 'measure' ias, the idea of measurement is somewhat misleading, as it implies a numerical value can be placed on ias, which simply isn't the case. it was determined that this investigation would consist mostly of qualitative data, only using quantitative methods as a support mechanism. in addition, this project is a case study which involves the input of bl staff members, making qualitative methods a natural choice. [labels from the figure showing the new structure of scholarship and collections: directorate of scholarship and collections; digital scholarship; arts and humanities; social sciences; collection care; content, strategy, research and operations] a phenomenographical approach was used. phenomenography is a philosophy developed in the s by ference marton and other researchers at gothenburg university in sweden, intended originally as a means of studying education and the ways that individuals learn. the approach is similar in many ways to phenomenology, except that rather than focusing on the phenomenon itself, ashworth and lucas (2000: 295) state that it 'seeks to identify the qualitatively different ways in which individuals experience such aspects of their world'. it focuses on individual experience of a phenomenon, based on the assumption that everyone experiences the world differently (ashworth and lucas, ). the aim of phenomenography is to reveal variation in experience in the human world; the variation between qualitatively different ways of seeing, experiencing and understanding the same phenomena (marton and pang, ). therefore, while phenomenology and phenomenography are alike in that both are concerned with a particular phenomenon within a particular environment, there are key differences. while the former tends to focus on the
while the former tends to focus on the ヴwゲw;ヴiエwヴげゲ ヮwヴiwヮデキラミ ラa ; ヮエwミラマwミラミが デエw ノ;デデwヴ aラi┌ゲwゲ ラミ デエw ヴwゲw;ヴiエ ゲ┌hテwiデゲげ w┝ヮwヴキwミiw ラa the phenomenon in order to formulate categories and find patterns of collective experience (andretta, ). also, while phenomenology emphasises individual experience, phenomenography attempts to reveal a collective experience by highlighting different facets of the phenomenon as experienced by different individuals (trigwell, ). this is more useful for this particular project, which deals with a diverse set of individuals within scholarship and collections who may experience the phenomenon of intellectual assets in very different ways. while objective results (e.g. the management of ias) can be documented from a first order perspective, phenomenography allows the researcher to investigate the second order perspective which focuses on the participants' internal understanding of the phenomenon and how this relates to the ways they manage ias. the primary method of data collection was a series of in-depth interviews with s&c staff members, as well as a selection of key stakeholders. the interviewees were taken from diverse areas of the directorate and had varying levels of responsibility so as to obtain as wide a field of results as possible. open-ended questions and prompts were used, as this allowed the subject to lead the discussion and bring up elements that the researcher, as an outsider to the directorate, may not otherwise have been aware of. the interviews were structured around the four areas of intellectual assets that have been identified: human, structural, relational, and collections and services, and interviewees were encouraged to give their personal opinions, allowing the researcher to gauge their feelings and beliefs about the strengths and weaknesses of intellectual assets in the directorate. 
the interviews were then transcribed and closely analysed with the purpose of finding key themes in the data, and divergences of opinion between interview subjects. the interviews were followed by a questionnaire which was distributed to all staff in s&c. the purpose of this was to allow everyone the opportunity to participate, to fill data gaps, and to allow for a small amount of statistical data collection by asking participants to rate certain elements of their experiences at the bl from one to five, with one being very poor and five being excellent. all the data was supported by reviewing bl documents in order to ascertain what the library is doing and how it is doing it, and to compare the official and unofficial accounts of working at the bl. findings the following are some of the most interesting outcomes of the data. defining the library participants were asked to define the british library, and it was telling that they found it much easier to state what the library is not. the most frequent statement was that the library is not a museum. while the motivation behind this (that the collections are there to be used and not kept behind glass cases) is understandable, it also limits the interpretation of the library. one of the bl's roles is to preserve and share the country's cultural heritage, and a large number of people who come to the library are not there to use the reading rooms. they are there to see the exhibitions and the precious objects. these visitors are using the library in the same way that they would use a museum, and they are a valuable source of revenue. it is also interesting to note that many of the interviewees praised the ways that museums such as the victoria & albert and the british museum promoted their collections and made them accessible. employees also pointed out that the library is not a university. again, in the strictest sense, this is true.
however, the bl is an internationally vital seat of learning, and home to world-class academic experts whose work should be promoted. university websites have easily accessible web pages for their academic staff outlining their areas of expertise and their publications, which is something the bl would benefit from. it is very difficult to find the experts in s&c despite their academic prestige. one s&c employee even stated that the bl is not a library which, in the traditional sense of libraries as a place to find a book on a shelf to borrow, is accurate. however, rather than defining itself by what it is not, it is suggested that the library adopt a more inclusive matrix approach to the way it sees itself. national libraries are unique even amongst themselves, but the bl has aspects of museum, university, library and many other types of institution. rather than rejecting these comparisons, staff should embrace these aspects in order to enhance the library's assets and make all staff feel that their work is valued by the bl. untapped resources it is obvious that s&c holds a great wealth of professional expertise, and that staff are passionate about what they do. however, many of these resources are untapped and obscured. several participants discussed how they found valuable contacts within the directorate through word-of-mouth, and around % stated that they did not believe the rest of s&c were really aware of what they do. many staff members stated that the directorate would benefit from more comprehensive profiles on the intranet system. at present, intranet profiles include job titles and extension numbers, whereas many participants felt that they could also include information about what their jobs involve, including areas of expertise and special interest. this would foster more of a matrix culture where employees could utilise their skills, and inter-departmental links could be made within the directorate.
digital scholarship there is also a large degree of confusion about digital scholarship. while all research participants agreed that it is a priority for the library, many professed confusion concerning what it actually means, or wariness over getting involved with it. the directorate would benefit from a clearer definition of digital scholarship and a greater degree of collaboration and understanding between those working with physical and digital media. it became clear throughout the research process that there is very little difference between the two. for example, staff working in collection care have largely the same aims as those working in digital scholarship: to ensure that collections are preserved and accessible. however, there is a divergence between the two departments, with many employees feeling there is an 'either/or' situation, which is not the case. annual reports the bl produces an annual report detailing its progress. however, it largely focuses on curatorial staff within s&c, leaving other employees unable to show what they have been doing. some sectors within the directorate are forming their own annual reports to enable them to demonstrate the intellectual value of the work they are doing. the directorate would benefit from acknowledging that departmental success cannot necessarily be determined by, for example, the percentage of collections that are digitally available, as not all departments are directly responsible for a collection. the impact of tangible assets while this project set out to evaluate only intellectual assets, it was quickly determined that it is impossible to completely separate intangible and tangible assets. budget cuts and the restrictions on staff employment have understandably limited what s&c are able to do.
it would therefore be unfair to criticise the directorate for something like not having collections extensively digitised, when the bl has no internal budget for digitisation and relies on external funding which is not always available. lack of money and reduced staff numbers have a negative impact on ias, and any evaluation should always be conducted alongside a financial review in order to form a fair judgement on what the directorate is able to do. discussion several attempts have been made over the past two decades to put a tangible value on ias. kaplan and norton ( ) developed the balanced scorecard to allow intellectual assets to be considered alongside financial evaluations, and sveiby ( ) and edvinsson ( ) have also produced notable ia evaluation tools. however, these models were not developed with non-profit organisations such as libraries in mind. white ( ) has suggested that the bsc can be adapted for the measurement of ias in libraries. the scorecard could be adapted specifically for ias to measure the four library ia components posited by corrall and sriborisutsakul ( ), and would allow a library to think laterally, make decisions about where their ia priorities lie, and track their progress. this, and the fact that the bl have used the scorecard in the past so staff will be familiar with it, suggested that it could be adapted for the purpose of this evaluation. the four aspects of intellectual assets fit neatly into the four scorecard areas ( ) financial ( ) customer ( ) internal business processes ( ) learning and growth, and it would allow the directorate to consider their ias laterally. having a limit of five or six key performance indicators allotted to each scorecard component allows s&c to identify and prioritise their ias. figure . scorecard adapted for the arts and humanities department in scholarship and collections the original aim was to create one evaluation tool for the directorate, but it soon became apparent that this was not practical. 
the expertise and responsibilities of the various departments within scholarship and collections are very diverse, and an intellectual asset that is vital to one department would be meaningless to another. for example, the accessibility of collection catalogues would be very important to someone in arts and humanities, but would mean very little to someone in research and operations who has no direct responsibility for the collections. one theme that emerged from the data is that some employees felt that the work they do is not fully acknowledged, and this evaluation needs to incorporate all aspects of what the directorate does. the solution was to create a scorecard for each of the five departments in s&c which would feed into a universal directorate-wide scorecard. this would allow individual key performance indicators (kpis) to be tailored to suit each department's needs and ensure equal representation, an example of which can be seen in figure . the general s&c scorecard would take into account all the results from each department, and would also evaluate how cohesively the directorate works together within the four components of intellectual assets (figure ). figure . general scorecard for scholarship and collections this scorecard can be updated over time as the directorate's priorities change to add different kpis, and would allow the directorate to clearly see where its strengths and weaknesses lie. it is suggested that all staff members participate in the evaluation by rating how well they think the directorate is achieving each kpi on a scale of one to ten. the mean score could then be calculated for the departments and the directorate. this is a way of quantifying qualitative data. it is also suggested that staff should be encouraged to comment on how well they think the directorate is achieving its goals and suggest improvements, perhaps through an anonymous messaging board.
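the scoring step described above (staff rate each kpi from one to ten, then a mean is taken per department and for the directorate) can be sketched as follows. this is a minimal illustration rather than the authors' tool: the department names come from the text, but the kpi labels, the sample ratings and the function name `mean_scores` are hypothetical, and the sketch assumes the directorate mean pools every individual rating for a kpi rather than averaging the departmental means.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (department, kpi, score on the one-to-ten scale).
ratings = [
    ("arts and humanities", "catalogue accessibility", 7),
    ("arts and humanities", "catalogue accessibility", 9),
    ("collection care", "staff expertise shared", 6),
    ("collection care", "staff expertise shared", 4),
    ("digital scholarship", "staff expertise shared", 8),
]


def mean_scores(rows):
    """Return (per-department KPI means, directorate-wide KPI means)."""
    by_dept = defaultdict(list)
    by_kpi = defaultdict(list)
    for dept, kpi, score in rows:
        # Reject anything outside the one-to-ten scale used in the text.
        if not 1 <= score <= 10:
            raise ValueError(f"score out of range: {score}")
        by_dept[(dept, kpi)].append(score)
        by_kpi[kpi].append(score)  # assumption: pool all staff ratings
    dept_means = {key: mean(vals) for key, vals in by_dept.items()}
    directorate_means = {kpi: mean(vals) for kpi, vals in by_kpi.items()}
    return dept_means, directorate_means


dept_means, directorate_means = mean_scores(ratings)
for (dept, kpi), value in sorted(dept_means.items()):
    print(f"{dept} / {kpi}: {value:.1f}")
for kpi, value in sorted(directorate_means.items()):
    print(f"directorate / {kpi}: {value:.1f}")
```

pooling individual ratings weights each staff member equally; averaging the five departmental means instead would weight each department equally, which may suit the stated aim of equal representation better. the text does not say which was intended.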
conclusion this tool for evaluating intellectual assets can be adapted not only for other libraries but for any public sector organisation by altering the kpis. it considers all aspects of intellectual assets and allows them to be considered alongside one another. it also allows the organisation to benchmark its ias over time, prove the value of libraries in an increasingly challenging climate, and ensure that intellectual assets are used in the most effective ways. moving forward, it would be interesting to incorporate the views of bl users into the study, something which the limits of this research project did not allow for. similarly, it would be useful to get the input of a wider number of external stakeholders. overall, this study helps to fill a gap in research concerning the intellectual assets of libraries. the value of libraries lies in their capacity for knowledge dissemination, powered largely by ias, and this tool will allow those assets to be identified, strengthened and demonstrated at a time when the financial climate has meant that libraries need to prove their value more than ever. references andretta, s., (2007). "phenomenography: a conceptual framework for information literacy education". aslib proceedings: new information perspectives, vol. ( ), pp. - . ashworth, p. and lucas, u. ( ). "achieving empathy and engagement: a practical approach to the design, conduct and reporting of phenomenographic research". studies in higher education, vol. ( ), pp. - . brennan, n. and b. connell ( ). "intellectual capital: current issues and policy implications". journal of intellectual capital vol. ( ), pp. - . the british library board, ( ). annual report: . london: british library. available at http://www.bl.uk/knowledge [accessed / / ]. corrall, s. and sriborisutsakul, s., ( ). "evaluating intellectual assets in university libraries: a multi-case study from thailand".
in chu, s., (ed.), managing knowledge for global and collaborative innovations. pp. - . london: world scientific publishing. director of operations and services, ( ). final report of the review of scholarship and collections. london: the british library. edvinsson, l. and m. malone ( ). intellectual capital. new york: harper collins. kaplan, r. and norton, d., ( ). "the balanced scorecard - measures that drive performance". harvard business review, (jan-feb), pp. - . marr, b. ( ). "strategic management of intangible value drivers". handbook of business strategy. vol. ( ), pp. - . marton, f. and pang m.f. (eds.) ( ). two faces of variation. paper presented at the th european conference for learning and instruction: göteborg, sweden, august - meritum ( ). guidelines for managing and reporting on intangibles (intellectual capital report). l. canibano, m. sanchez and m. garcia-ayuso. seville, meritum. available at http://www.pnbukh.com/files/pdf_filer/meritum_guidelines.pdf accessed october . trigwell, k. ( ). "a phenomenographic interview on phenomenography". in: bowden, j. a. & green, p. (eds.), phenomenography, pp. - . melbourne: rmit university press. http://www.bl.uk/knowledge http://www.pnbukh.com/files/pdf_filer/meritum_guidelines.pdf if you would like to learn more about arieal research centre, please visit us at: w: arieal.mcmaster.ca t: @arieal_research centre for advanced research in experimental and applied linguistics (arieal) title: concreteness and psychological distance in natural language use journal: psychological science author(s): snefjella, b., kuperman, v. year: version: post-print original citation: snefjella, b., & kuperman, v. ( ). concreteness and psychological distance in natural language use. psychological science, ( ), – . https://doi.org/ . / rights: © < >. this is the post-print version of the following article which was originally published by psychological science in : snefjella, b., & kuperman, v. ( ). 
concreteness and psychological distance in natural language use. psychological science, ( ), – . https://doi.org/ . / https://arieal.mcmaster.ca/ https://twitter.com/arieal_research https://doi.org/ . / https://doi.org/ . / arieal research centre (w: arieal.mcmaster.ca; t: @arieal_research) snefjella & kuperman, page concreteness and psychological distance in natural language use snefjella, b.*, kuperman, v. mcmaster university *department of linguistics and languages, mcmaster university, togo salmon hall , main st. west, hamilton, ontario, canada l s m abstract existing evidence shows that more abstract mental representations are formed and more abstract language is used to characterize phenomena that are more distant from the self. yet the precise form of the functional relationship between distance and linguistic abstractness is unknown. in four studies, we tested whether more abstract language is used in textual references to more geographically distant cities (study ), time points further into the past or future (study ), references to more socially distant people (study ), and references to a specific topic (study ). using millions of linguistic productions from thousands of social-media users, we determined that linguistic concreteness is a curvilinear function of the logarithm of distance, and we discuss psychological underpinnings of the mathematical properties of this relationship. we also demonstrated that gradient curvilinear effects of geographic and temporal distance on concreteness are nearly identical, which suggests uniformity in representation of abstractness along multiple dimensions. 
keywords psychological distance; construal-level theory; embodied cognition; social media; twitter; abstraction; concreteness https://arieal.mcmaster.ca/ https://twitter.com/arieal_research arieal research centre (w: arieal.mcmaster.ca; t: @arieal_research) snefjella & kuperman, page introduction one of the fundamental and unique abilities of the human mind is to transcend the boundaries of here and now: to imagine distant times, far-away places, and other people. the psychological mechanism of abstraction underlies this mental ability, but how this mechanism operates is a matter of continuing debate (barsalou, ; boroditsky & ramscar, ; burgoon, henderson, & markman, ; fischer & zwaan, ; gallese & lakoff, ; meteyard, cuadrado, bahrami, & vigliocco, ; paivio, ; schwanenflugel, harnishfeger, & stowe, ). yet social psychologists have noted a positive correlation between the perceived distance of an object or event and the level of abstraction at which that event is represented mentally. for instance, the influential construal-level theory of psychological distance (trope & liberman, , ) states that objects and events that are proximal (close to an egocentric self) are represented with rich, complex, concrete, contextual, and subordinate-level features. this is referred to as a low-level construal. a high-level construal is the representation of distal objects and events abstractly by their simple, invariant, superordinate-level characteristics. for example, if we are preparing a lecture for tomorrow (a proximal event), we will worry about which room to go to. when preparing a lecture for next month (a distal event), we will worry about its topic. according to construal-level theory, the distance-driven differences in construal arise because abstract representations and goals are more stable over time than concrete representations (the topic of my lecture remains the same, even if the location changes). 
thus, abstraction leads to successful traversing of psychological distances (trope & liberman, ). the relationship between abstraction and psychological distance is implicated in many personal and social phenomena, including the consistency of attitudes and evaluations in an individual (ledgerwood, trope, & chaiken, ), the actor-observer bias (nisbett, caputo, legant, & marecek, ), moral judgments (amit & greene, ), politeness (stephan, liberman, & trope, ), subjective judgments of truth (hansen & wanke, ), and consumer preferences (fiedler, ). the hypothesized positive correlation between the abstractness of mental representations and psychological distance has received support in many experimental paradigms and measures, including action identification (fujita, henderson, eng, trope, & liberman, ; liviatan, trope, & liberman, ), a “distance stroop” task (bar-anan, liberman, trope, & algom, ), and surveys and questionnaires (eyal, liberman, trope, & walther, ; trope & liberman, ; wakslak, trope, liberman, & alony, ; for a recent meta-analysis of research on the construal-level theory, see soderberg, callahan, kochersberger, amit, & ledgerwood, ). of greater importance for the present research are studies that capitalized on the ability of language to reflect abstractness of mental representations through abstractness of expressed meanings (paivio, ; schwanenflugel et al., ). these tasks elicited linguistic productions from participants while manipulating the distance of what participants were prompted to write about. psychological distance of described phenomena, typically conceptualized as their construal level, was then operationalized as relative abstractness of produced texts. a robust finding across the studies was that more abstract language is used to characterize phenomena that are more distant from the self temporally, spatially, or socially or are more hypothetical (trope & liberman, ). 
although valuable, the current methodology of using linguistic productions to study the link between abstraction and psychological distance is limited. in a typical laboratory experiment, small groups of undergraduate participants are prompted to write about distant or near objects or events, which often results in small samples of language obtained from individuals with a relatively homogenous age range and experience. the scale of data collection is further limited by labor-intensive manual coding of linguistic abstractness. for instance, fujita et al. ( ), stephan et al. ( ), and gong and medin ( ) coded the writing of participants using the linguistic-categorization model of semin and fiedler ( ). in a similar vein, alter and oppenheimer ( ) used three human coders to rate productions of participants as being abstract, concrete, or both. even more drastically, instead of being treated as a continuum, abstractness is routinely binned into discrete categories. as the meta-analysis of research on construal-level theory further shows, distance was dichotomized into “close” and “far” categories in all studies that met the authors’ inclusion criteria (soderberg et al., ). either categorization obscures the precise mathematical form of their functional relationship and prevents characterization of the effect of abstractness on psychological distance in a graded way. in the present study, we addressed these drawbacks by using norming megastudies with ratings of semantic word properties as well as vast collections of language productions in text corpora. we used these resources to examine—on a larger scale and with a broader range of examples than in previous studies—abstractness of linguistic productions that describe objects or events positioned at various psychological distances.
in our study, operationalization of construal level relied on a recent data set of concreteness ratings of , english words (brysbaert, warriner, & kuperman, ). ratings of words were made on a scale from (abstract) to (concrete) and averaged over participants: resulting concreteness norms ranged from . (“essentialness”) to (“pitbull”). using this data set, we were able to measure abstractness in language without human coders and with any number of productions. by handing the drudgery of coding to a computer, we were able to study psychological distance at scales not previously possible: millions of observations from thousands of language speakers varying in age, socioeconomic status, personality traits, and place of residence. furthermore, we analyzed natural language use, not language elicited in an experimental task; our study tested the ecological validity of the construal-level theory and complements experimental research. the meta-analysis by soderberg et al. ( ) pointed to the inability of current studies of psychological distance to identify the precise mathematical form of the gradient functional relationship between distance and abstractness. using clever analytical techniques, soderberg et al. ( ) predicted the relationship to be curvilinear. our approach charted the relationship along a continuum of psychological distances and provided other psychologically meaningful interpretations of its mathematical properties. we present four studies, each of which examined one of the critical dimensions of construal-level theory; all were based on social-media sources. specifically, we tested whether more abstract language is used in textual references to more geographically distant cities (study ), time points further into the past or future (study ), and references to more socially distant people (study ). 
in study , we examined whether a theme that is commonly experienced across all times and distances—death and dying—follows the pattern observed in the aggregate data. study : geographic distance method this study explored the role of geographic distance in explaining variability in the level of construal of u.s. cities. construal was operationalized as concreteness of language used in relation to the cities. we selected the most populous u.s. cities (with , inhabitants as an arbitrary lower threshold) using rankings of their population size in (http://en.wikipedia.org/wiki/list_of_united_states_cities_by_population), with the exception of washington, d.c., as the city name is homonymous with a name of a geographically distant state. new york city and oklahoma city, while homonymous with their respective states, are embedded in those states and thus were expected to introduce less noise in distance estimates. social media—a source of millions of data points, a broad selection of geographic objects, and a full range of possible distances—enables an expansion and refinement of prior experimental studies. using the publicly available data stream from the twitter application program interface at https://dev.twitter.com, we collected tweets that (a) were geo-tagged (i.e., had geographic information system coordinates of the location where the tweet was produced), (b) were sent from within the united states, and (c) contained the name of one of the major u.s. cities included in our analysis. to calculate the geographical distance between the location of the tweet production and the city of reference, we used the latitude and longitude coordinates of that tweet and of that city’s center, as supplied by the geocode() function of the ggmap package (kahle & wickham, ) in r statistical software (version .
; r development core team, ). we applied the haversine formula to obtain the greatcircle distance between these two points. to remove distributional skewness, we log-transformed (base ) all distances. all calculations and analyses in this and subsequent studies were made using r software. it is possible that a more psychologically valid measure of geographic distance is the time it takes to commute from the tweet location to the respective city center. we opted for the great-circle distance because estimates of the commute time (including waiting time in airports and traffic hours) are inherently variable. results a total of , tweets satisfying our criteria were collected between march and may . twitter is trendy and dynamic; collecting tweets over a wide time frame helps prevent trending topics from influencing the results. a further trimming retained tweets that contained four or more words (excluding the city name) that had concreteness ratings in brysbaert et al.’s ( ) data set: an average tweet contained . such words (sd = . ). this reduced the pool to , geo-tagged tweets. we calculated the mean concreteness of each tweet on the basis of words with available ratings (m = . , mdn = . , range = . – . , sd = . ). thus, the degree of construal was operationalized as a prevalence in the tweet of words that were rated in brysbaert et al.’s ( ) study as more concrete or more abstract when presented out of context. geographic distances between the point of origin of tweets and cities they referred to ranged from . km to , . km (m = . , mdn = . , sd = . ; see fig. for the distribution of distances). to decrease noise in the raw data, we binned observations into percentiles of the log distance distribution and calculated mean concreteness of each bin. a cubic function (y = . + . x – . x + . x ) provided an optimal polynomial fit to concreteness in a linear regression model with raw polynomials of log distance as predictors (adjusted r = . ; see fig. ). 
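The distance and concreteness measures described in this passage can be sketched in Python; the study itself used R, so this is an illustration rather than the authors' code. The Earth radius, the default log base of 10, the function names, and the toy concreteness values are all assumptions, and the toy values are not real ratings from the Brysbaert et al. norms.

```python
import math

EARTH_RADIUS_KM = 6371.0  # conventional mean Earth radius (assumption, not from the text)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def log_distance(km, base=10):
    """Log-transform a positive distance; the default base is an assumption."""
    return math.log(km, base)


def mean_concreteness(words, norms, exclude=(), min_rated=4):
    """Mean rating over words found in `norms`, skipping `exclude` (e.g. the city name).

    Returns None when fewer than `min_rated` words have ratings, mirroring the
    trimming rule of four or more rated words per tweet.
    """
    rated = [norms[w] for w in words if w in norms and w not in exclude]
    if len(rated) < min_rated:
        return None
    return sum(rated) / len(rated)


# Toy ratings for illustration only.
toy_norms = {"coffee": 4.8, "street": 4.7, "idea": 1.6, "hope": 1.3, "boston": 4.0}
tweet = ["coffee", "street", "idea", "hope", "boston", "the"]
score = mean_concreteness(tweet, toy_norms, exclude={"boston"})
```

The function returns None for short tweets instead of a mean over very few words, so those tweets can be dropped before any binning or curve fitting.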
table reports the goodness of fit of the cubic function as well as lower- and higher-order polynomial functions using hierarchical regression: successive models were compared using the anova() function in r. while a cubic function was the best polynomial fit to the data, other functional relationships might offer a similarly good fit (e.g., logistic curve). because there is no theoretically grounded expectation as to the form of the curvilinear dependency, we leave exploration of alternative functional forms to future work. the pattern in figure indicates a substantial decrease in the concreteness of tweets regarding cities with an increase in the distance of the tweeting person from that city: the drop in concreteness between the extreme distances was estimated at . points of concreteness or . % of the concreteness scale. this pattern, based on nearly half a million observation points and cities, is perfectly in line with the predictions and experimental validations of the positive correlation between psychological and geographic distance. moreover, we confirmed the predictions of soderberg et al. ( ) that construal and distance have a curvilinear relation. the functional form of the fitted curve, and its excellent fit to the data, enabled us to further interpret parameters of the cubic function. the inflection point (i.e., the point at which the second derivative of the function changes its sign) was estimated at . (in log- units) or around km from the city center. this implies that as a typical tweeting person moves away from the city center to this point, the concreteness of his or her references to the city decreases at a relatively high rate.
this decrease becomes less precipitous as distances increase beyond km, and, as the first derivative of the polynomial shows, there is little to no decrease in concreteness associated with distances above km. we further speculated that the inflection point is psychologically meaningful and demarcates a distinction between (a) being within a city, where concreteness of mental representation of that city is the highest (the level of construal the lowest) and decreases sharply from the city center to the outskirts and (b) being outside of the city, where the representation of the city is more abstract overall (the level of construal is higher) and is less affected by how far the twitter user is from the city. to test this hypothesis of immediacy of experience, we calculated the radius of the city area for each of the cities, under a simplifying assumption that cities have a perfect circular shape (estimates of the city area were obtained from http://www.citymayors.com/statistics/largest -citiesarea- .html). extreme radii were found for new york, new york (r = km) and el paso, texas (r = km), and the mean radius was km, close to the inflection point of the fitted curve at km. while more sophisticated measurements of the urban territory will be necessary, the observed value is consistent with the notion that the construal level increases (and concreteness of language decreases) more drastically as the speaker loses the immediacy of the urban experience when moving from the city center to its outskirts: once outside the city, the construal level is more stable and high. study : temporal distance method a robust finding in the literature on memory and prediction in relation to psychological distance is that remoter events, whether in the past or the future, elicit a higher level of construal (trope & liberman, , ). 
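Two quantities in this argument follow from short formulas, sketched here in Python under stated assumptions. For a cubic y = b0 + b1·x + b2·x² + b3·x³, the second derivative 2·b2 + 6·b3·x is zero at x = −b2 / (3·b3), which is the inflection point; a city of area A treated as a perfect circle has radius √(A/π). The function names are hypothetical, the coefficients in the usage lines are toy values (the fitted coefficients are not recoverable from this text), and the log base of 10 is an assumption.

```python
import math


def cubic_inflection(b2, b3):
    """x at which the second derivative of b0 + b1*x + b2*x**2 + b3*x**3 changes sign."""
    if b3 == 0:
        raise ValueError("not a cubic: b3 must be non-zero")
    return -b2 / (3 * b3)


def cubic_slope(b1, b2, b3, x):
    """First derivative of the cubic at x: the local rate of change in concreteness."""
    return b1 + 2 * b2 * x + 3 * b3 * x * x


def to_km(log_units, base=10):
    """Back-transform a log distance to kilometres; the base is an assumption."""
    return base ** log_units


def equivalent_radius_km(area_km2):
    """Radius of a circle whose area equals the city's area."""
    return math.sqrt(area_km2 / math.pi)


# Toy coefficients only: y = x**3 - 3*x**2 has its inflection point at x = 1.
x_star = cubic_inflection(b2=-3.0, b3=1.0)
slope_at_inflection = cubic_slope(b1=0.0, b2=-3.0, b3=1.0, x=x_star)
```

Comparing `to_km(x_star)` with the mean of `equivalent_radius_km` over the cities is the check the passage describes between the inflection point and the typical city radius.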
to test this dimension of psychological distance, we employed the usenet corpus collected by shaoul and westbury ( ), which consists of over billion word tokens of public usenet postings collected from , english-language news groups between october and january . several temporal terms were used to examine effects of temporal distance on concreteness of language in which past and future events are described. we explored distance both within specific time units (e.g., years ago vs. years ago) and between units (e.g., days from now vs. centuries from now; last week vs. next week). study a: “years ago.” soderberg et al. ( ) predicted the curvilinear relationship between distance and construal on the basis of their meta-analysis, which placed different studies of temporal distance along an objective timeline from to days. we were unable to re-create this finding in the corpus, as usenet contributors almost never refer to distances over days with the phrase “x days ago.” however, phrases such as “x years ago” yielded intriguing results. we identified all occurrences in the corpus of the phrase “x years ago.” we further extracted words to the left and to the right of the critical phrase, “years ago” (e.g., “wind generation of electricity years ago and they were commonplace then”). the -word window around the target word (henceforth referred to as the context) was chosen to approximately equate the number of words in the twitter (with its -character limit per tweet) and usenet extracts. we leave for further study the question of which window size provides the best accuracy.
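The extraction step for “x years ago” can be sketched as follows. This is a simplified Python illustration, not the authors' pipeline: the tokenizer, the small number-word table, and the function names are hypothetical, and the default window of five words per side is borrowed from the description of the later studies. The number itself is treated as part of the target phrase and kept out of the context.

```python
import re

# Small illustrative lookup; a real pipeline would need a fuller number parser.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "twenty": 20, "hundred": 100,
}


def parse_number(token):
    """Return the integer written as digits or as a known number word, else None."""
    if token.isdigit():
        return int(token)
    return NUMBER_WORDS.get(token)


def extract_contexts(text, window=5):
    """Find 'X years ago' and return (X, context words) for each occurrence.

    The context is up to `window` words on each side of the phrase.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    results = []
    for i in range(1, len(tokens) - 1):
        if tokens[i] == "years" and tokens[i + 1] == "ago":
            years = parse_number(tokens[i - 1])
            if years is None:
                continue  # e.g. "many years ago" carries no usable distance
            left = tokens[max(0, i - 1 - window):i - 1]
            right = tokens[i + 2:i + 2 + window]
            results.append((years, left + right))
    return results


contexts = extract_contexts(
    "We moved to this small town twenty years ago and it was very quiet then."
)
```

Each returned context can then be scored for mean concreteness and dropped if too few of its words have ratings.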
numbers preceding the target phrase (written either as words or numerals) served as the metric of the temporal distance from the time of e-mail submission to usenet. we restricted the time range to through years ago. finally, we removed all extracts with contexts that contained fewer than words with concreteness ratings available in brysbaert et al.’s ( ) data set. study b: “ago” versus “from now.” we were also interested in comparing the abstractness with which people refer to distances in the past or future, as measured by different time units. in the corpus, we identified all occurrences of the phrases “x ago” and “x from now” in which x was a unit of time: minute, hour, day, week, month, year, decade, or century. we further extracted five words to the left and five to the right of the critical phrase (e.g., “centuries ago,” as in “situation as recent as two centuries ago when much academic instruction was”). numbers (written either as words or numerals) were removed from the preceding context window. the resulting scale of time units was then ordinal (a week ago is further in the past than a day ago), rather than continuous, as in study a. finally, we removed all extracts with contexts that contained fewer than four words with concreteness ratings in brysbaert et al.’s ( ) data set. study c: “last” versus “next.” to ensure that observed differences between time units were not an artifact of our choice of the language denoting temporal distance (time units from now and time units ago), we conducted an additional set of analyses using contexts for phrases such as “yesterday,” “tomorrow,” “last week,” “next week,” and “last month.” contexts were defined as in studies a and b, and the trimming procedures were the same as in study b. results study a: “years ago.” a total of , extracts containing the phrase “years ago” were identified in the usenet corpus. because of skewness, temporal distances in years were log-transformed (base ). 
observations were binned by their temporal distance into intervals with open left boundaries—formed by the numbers to (in increments of ), to (in increments of ), and to , (in increments of )—and closed right boundaries. the histogram of the distribution of temporal distances is shown in figure a. mean concreteness of contexts was calculated for each bin and plotted against the log of the numeral in the interval’s left boundary (see fig. a). as with geographic distance, the best polynomial fit to concreteness was obtained with a cubic curve (y = . + . x – . x + . x ) and showed an excellent fit in a linear regression model with raw polynomials of log temporal distance (adjusted r = . ; see table for model comparison). verbal descriptions of past events were the more abstract (i.e., construed at a higher level) the more years had passed since the described event. the drop in concreteness between the extremes of the temporal range was fairly small and amounted to approximately . units of concreteness or . % of the concreteness scale. again, the observed pattern converged with the experimental evidence of the construal-level theory of psychological distance, in which construal is operationalized as concreteness of verbal description of the event. this also shows that the predicted curvilinear relationship (soderberg et al., ) between construal and distance holds for multiple dimensions of psychological distance. https://arieal.mcmaster.ca/ https://twitter.com/arieal_research arieal research centre (w: arieal.mcmaster.ca; t: @arieal_research) snefjella & kuperman, page a further inspection of the functional curve pointed to a faster drop in concreteness in verbal representations of events up to the inflection point of . log units (or years ago), a slower decrease in concreteness for more distant events past the inflection point, and virtually no change in concreteness of contexts associated with events taking place to , years ago. 
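The binning described at the start of this passage, with intervals that are open on the left and closed on the right and each bin keyed by its left boundary, can be sketched in Python. The boundary values below are toy numbers, since the actual increments are not recoverable from this text, and the function name is hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def bin_means(observations, boundaries):
    """Mean concreteness per interval (boundaries[i-1], boundaries[i]].

    observations: iterable of (distance, concreteness) pairs.
    boundaries: sorted list of interval boundaries.
    Returns {left_boundary: mean}; values outside the covered range are dropped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, concreteness in observations:
        # bisect_left returns i with boundaries[i-1] < distance <= boundaries[i],
        # which is exactly the open-left, closed-right convention.
        i = bisect_left(boundaries, distance)
        if 1 <= i < len(boundaries):
            left = boundaries[i - 1]
            sums[left] += concreteness
            counts[left] += 1
    return {left: sums[left] / counts[left] for left in sums}


toy_observations = [(2, 3.0), (5, 2.0), (6, 2.5), (10, 1.5), (1, 9.0), (11, 9.0)]
means = bin_means(toy_observations, boundaries=[1, 5, 10])
```

In the toy data the values 1 and 11 fall outside the covered range and are dropped, while 5 lands in the first bin because the right boundary is closed.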
drawing an analogy to geographic distance, we speculated that the inflection point demarcated a change in the immediacy of one’s experience with events that happened during one’s lifetime and those that preceded it. as the literature on collective and generational memory demonstrates, critical social events (wars, natural disasters, changes in political regime) are more salient in mental representations of the past in those individuals who were exposed to the event as it happened than in individuals who were not (pennebaker, paez, & rim, ). if true, we would expect the inflection point of the functional curve ( years of age) to be close to the average age of usenet users. although we were unable to find age data on usenet users, the available statistics on internet users did not diverge from this number (e.g., the average age of social-media users in was . ; pingdom, ; see also eisenstein, in press). again, the functional form of the effect of temporal distance on concreteness of language production suggests immediacy of one’s experience with events as an important factor in the construal level of mental representation and linguistic expression of those events. study b: “ago” versus “from now.” a total of , critical phrases and their surrounding contexts were identified with time units (minutes to centuries) followed by “ago” or “from now.” after trimming, the data pool contained , contexts. mean concreteness was calculated for each context and plotted against respective time units. figure b summarizes the functional relationship between temporal distance from “now” (the time of writing of the posting) and the level of construal of the temporally marked event. the data in figure b indicated a near-linear decrease in concreteness for events that are further away from the present on the ordinal scale of time units, in accordance with the hypothesized relationship between temporal and psychological distance. the maximum contrast between time units (hours ago and centuries ago) was .
units of concreteness, corresponding to % of the concreteness scale. regression models further indicated large effect sizes (r s = . and . for past and future, respectively). the patterns also showed that there was a preference for talking about past rather than future events, as evidenced in the circle sizes, proportional to log frequency of the phrase occurrence in the corpus. the intercepts of the regression lines further suggested that overall past experiences are represented in more detail (higher concreteness) than events that are envisioned in the future, which is in line with experimental research into the mental simulation of past and future events (d’argembeau & van der linden, ; johnson, foley, suengas, & raye, ). study c: “last” versus “next.” there were a total of , , contexts for phrases such as “last month” versus “next month.” figure c summarizes the results for all time units. we again noted a decrease in concreteness as temporal distance from the present increased. however, with these key phrases, the past and future appeared to be (almost) mirror images of each other, similar both in the slopes of the regression lines (β = . for the past and β = – . for the future events), the amount of explained variance (r s = . and . , respectively), and (log- ) frequencies of occurrence of respective phrases, shown as circle sizes in figure c. also, the contrast between maximally different time units (“yesterday/ today” and “last/next century”) was much larger than in the comparison in study b (“days ago/from now” vs. “centuries ago/from now”) and amounted to . units of concreteness or % of the concreteness scale. https://arieal.mcmaster.ca/ https://twitter.com/arieal_research arieal research centre (w: arieal.mcmaster.ca; t: @arieal_research) snefjella & kuperman, page study : social distance method using the same procedure as in study , we next investigated social distance by extracting from the usenet corpus five words on each side of a target word. 
previous experiments (liviatan et al., ; stephan et al., ) have shown that psychological distance between individuals is perceived to be larger, and the level of construal higher, if a social relationship between those individuals is more distant. to operationalize closeness of social relationships in a corpus, we took as a point of departure the bogardus social distance scale (bogardus, ; see also parrillo & donoghue, ). the scale evaluates the degree of willingness to establish social contacts with representatives of a racial, ethnic, socioeconomic, occupational, or other social group. the scale identifies closeness as the individual’s willingness to accept the group representatives using a -point scale: (a) potential partners in marriage, (b) close friends, (c) neighbors on the same street, (d) coworkers in the same occupation, (e) citizens in the same country, (f) only visitors to his or her country, or (g) people to be excluded from his or her country. to adapt the scale to the observed data, we converted the scale from a cumulative one (i.e., agreement with a higher degree of closeness implies agreement with all lower-degree categories) to a discrete ordinal one by identifying terms belonging to each of the scale’s categories (e.g., “friend,” “ally,” “confidant,” “pal,” “chum,” “buddy” for the close-friends category and “compatriot,” “countryman,” “countrywoman” for the visitors category). the full list of terms—created using our linguistic intuitions and the merriam-webster thesaurus (http://www.merriam-webster.com/)—is reported in table . results after trimming, , data points remained. figure plots the mean concreteness of the contexts, grouped by social-distance categories, against the ordinal scale of social distance. although the number of data points and confidence intervals varied across categories, the overall trend was in agreement with the hypothesized link.
groups of individuals that are considered more distant socially are also construed in less concrete terms, with a maximum contrast of . points (about % of the available concreteness scale) between the family members and foreigners categories. study : concreteness of the theme of death over time and geographic distance method two points of criticism can be raised with regard to studies through . first, we used aggregate measures of the concreteness of verbal contexts, which average over a multitude of phenomena and a diversity of personal and collective experiences, and thus might lead to ecological fallacy (robinson, ). second, there are alternative explanations as to why a person might choose more abstract over more concrete words when describing a remote phenomenon. it might be due to the linguistically faithful reflection of the distance-driven change in one’s mental representation, which would be consistent with the premises of construal-level theory. conversely, it might take place because one does not have direct experience with the phenomenon and has access only to its gistlike representation through language; thus, one can describe it only in abstract terms. this relative abstractness is not expected to vary with distance but only with the amount of experience. in study , we considered construal of events related to death as a function of geographic and temporal distance. death is a concept that is acquired early, is salient and memorable as an event, and occurs to all living beings at all times and all locations, which gives on average an equal probability to directly or indirectly experience (somebody else’s) death at all distances from the self.
finding a curvilinear relationship between the concreteness of texts constrained to a familiar, ubiquitous event and distance from the self would be a step toward ensuring that the aggregate patterns are made of converging individual patterns and that—at least in some cases— predictions of construal-level theory are due to the change in distance and not only to the change in the strength of personal experience. to address these issues, we extracted twitter messages containing the words “died,” “dead,” or “death” from the data pool of study and , contexts from usenet e-mails containing the same words and a target phrase, “years ago,” as in study a. mean concreteness of those tweets and those contexts were calculated for each bin of log geographic distance (in km) and log temporal distance (in years), with bins defined as in study and study a, respectively. a similar study of social distance was not feasible because some of the categories (e.g., compatriots) did not offer a sufficient sample size to allow comparison. results concreteness and log geographic distance of texts related to death demonstrated a curvilinear relationship, which was well approximated by a cubic polynomial function (y = . – . x – . x + . x ; r = . ). the top panel of figure both reports the relationship for the tweets referencing death (solid line) and—for reference—replicates at a different scale the curve from figure (dotted line) that summarizes the trend in all tweets about major u.s. cities in study . the thematically constrained subset of tweets showed a similar if slightly flatter pattern than the overall trend. tweets about death sent from the city center were maximally concrete; their concreteness dropped dramatically when outside of the city and leveled off at distances above km, with a slight increase in concreteness at very remote distances. 
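The cubic approximation reported above can be illustrated with a small least-squares sketch. This is not the authors' analysis code (the reference list cites R for the statistical computing); it is a minimal, standard-library Python illustration. The function names `fit_polynomial` and `r_squared` are hypothetical, and the data points are synthetic values invented for the example, not values from the study.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [b0, b1, ..., b_degree], lowest power first.
    """
    n = degree + 1
    # Normal equations (X^T X) b = X^T y, where X[i][j] = xs[i] ** j.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (rhs[i] - tail) / a[i][i]
    return coeffs


def r_squared(xs, ys, coeffs):
    """Proportion of variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    fitted = [sum(b * x ** k for k, b in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic bin means, invented for illustration (NOT values from the study):
# concreteness drops quickly at short log distances and then levels off.
log_distance = [i * 0.25 for i in range(17)]  # 0.0 .. 4.0
concreteness = [2.6 - 0.3 * x + 0.12 * x ** 2 - 0.015 * x ** 3 for x in log_distance]

linear = fit_polynomial(log_distance, concreteness, degree=1)
cubic = fit_polynomial(log_distance, concreteness, degree=3)
print("cubic coefficients:", [round(b, 4) for b in cubic])
print("R^2 linear:", round(r_squared(log_distance, concreteness, linear), 4))
print("R^2 cubic:", round(r_squared(log_distance, concreteness, cubic), 4))
```

Comparing the explained variance of successive degrees in this way mirrors, in spirit only, the comparison of polynomial models summarized in the regression table; a significance test of each increment would still be needed.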
similarly, the concreteness of usenet contexts containing the words “years ago” and death- related words was a sigmoid function of log temporal distance, which was well approximated by a cubic polynomial (y = . + . x – . x + . x ; r = . ). the bottom panel of figure plots the curve estimated for death-related messages (solid line) and the overall trend for all messages (dotted line), which replicates, with a correction for scale, the curve in figure a. death-related contexts were generally more concrete than the thematically unconstrained contexts, but much like the overall trend in study a (fig. a), they showed the maximum of concreteness for deaths that occurred very recently, a drastic decrease in concreteness as the past became less recent, and a leveled- off pattern after some three decades from the time the message was written. to sum up, the curvilinear relationship between language concreteness and log (geographic and temporal) distance was confirmed even with a constraint that focused on one class of phenomena (i.e., those related to death). thus, phenomena that are likely to be part of individual experience and have a similar probability of occurring in an individual life recently or a long time ago, close or far, are construed with a similar level of detail at different distances as the entirety of phenomena that our method captures in a language corpus. general discussion we present a new method of examining an aspect of embodied cognition (barsalou, ; fischer & zwaan, ; gallese & lakoff, ; meteyard et al., ): the interplay among perceived and objective distance, abstraction as a mental faculty, and abstractness as a property of language. we identified words or phrases that denoted an entity or event for which we had information about distance. this information could be encoded in the phrase (“spouse” vs. 
“coworker” for social distance) or explicitly stated as a number (“twenty years from now”). we then measured the concreteness of language that co-occurred with that word or phrase in texts and correlated this concreteness with distance. the utility of our method was demonstrated in studies of three critical dimensions of construal-level theory: spatial, temporal, and social distance. in all four studies, the predictions of construal-level theory held. tweets containing the names of an american city became more abstract as the geographic distance between the person sending the tweet and that city increased. similarly, verbal contexts of time points further into the past or future tended to be more abstract, as did verbal descriptions of more socially distant people. our use of multiple linguistic expressions of distance and massive amounts of linguistic productions in corpora allowed us to go beyond validation of prior experimental findings and answer outstanding questions about construal-level theory. one theoretical point raised by soderberg et al. ( ) was whether distance increased the processing of abstract information, decreased the processing of concrete information, or both. we note that in all our studies, greater distance led to more overall abstractness in language, but at every distance, the linguistic productions showed gradience in their abstractness and concreteness. specifically, our regression analyses of two continuous metrics of spatial (kilometers from the city) and temporal (years before writing) distance revealed that the relationship between log distance and abstractness of language is curvilinear and is well approximated by a cubic polynomial curve.
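The figure captions describe the inflection point of the fitted curve as the point at which the second derivative changes sign. For a cubic y = b0 + b1·x + b2·x² + b3·x³ the second derivative is 2·b2 + 6·b3·x, which is zero at x = −b2 / (3·b3). The sketch below works this out in Python; the function names are hypothetical and the coefficients are illustrative placeholders, not the coefficients reported in the article.

```python
def second_derivative(x, b2, b3):
    """Second derivative of y = b0 + b1*x + b2*x**2 + b3*x**3 at x."""
    return 2.0 * b2 + 6.0 * b3 * x


def inflection_point(b2, b3):
    """Value of x at which the second derivative of the cubic is zero."""
    if b3 == 0:
        raise ValueError("b3 is zero: the curve is not cubic and has no inflection point")
    return -b2 / (3.0 * b3)


# Illustrative coefficients only (assumed for the example, not from the study).
b2, b3 = 0.12, -0.015
x_star = inflection_point(b2, b3)

# The sign of the second derivative differs on either side of x_star,
# which is what makes x_star an inflection point.
before = second_derivative(x_star - 0.5, b2, b3)
after = second_derivative(x_star + 0.5, b2, b3)
print("inflection at x =", round(x_star, 3))
print("curvature before / after:", round(before, 3), round(after, 3))
```

Because the predictor is log distance, the inflection point would have to be transformed back with the same logarithm base used in the analysis to be read as kilometers or years; the base is not stated in this section, so that step is left out here.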
language used in relation to cities and events is at its most concrete (construal is at its lowest) when the experience of that city or that event is most immediate (e.g., being in the city center or occurring in the very recent past; cf. hirst et al., ). tweets become abstract more rapidly between a city center and its suburbs than between the city boundary and any other location in the country. time references become abstract more rapidly between the present and the time point in the past that indicates a typical life span than between events in the distant and very distant past. this is also true when texts were thematically constrained to refer to death- related phenomena. thus, we both confirmed and specified the curvilinear relationship between distance and abstraction predicted by soderberg et al. ( ). moreover, the similarity of effects that physical and temporal distance have on linguistic concreteness—displayed over all relevant contexts or only a thematically constrained set—corroborates the long-standing observation that language often expresses temporal relations via metaphors of space (boroditsky, , ; boroditsky & ramscar, ). finally, symmetrical effects of past and future temporal distance on concreteness suggest analogous cognitive processes involved in remembering past events and imagining future ones (e.g., schacter, addis, & buckner, ). as with any method, a corpus-based approach has limitations. we were unable to explore the construal of entities that did not correspond to a word or phrase or that did not occur in corpora with sufficient frequency. there is little doubt that noise was introduced into the data from homography and polysemy: “bank” as a financial institution and the edge of a river, “work” as a noun and a verb, “chicago” as a city and a musical. 
also, we used concreteness ratings for words presented out of context to calculate the average concreteness of sequences of words that occur in context, missing out on metaphoric word use and other context-driven changes in word meaning. it is improbable, however, that our patterns arose because of a systematic bias in our operationalization of context concreteness, as this would have required contexts not only to be consistent in how they changed word concreteness but also to modulate this amount and direction of change as a function of distance. many of these limitations are addressable: by carefully restricting the searched linguistic materials and their contexts (exemplified partly in study ); restricting the age, gender, or place of residence of contributors (as self-reported in several social-media sources); or taking temporal cross-sectional snapshots of the data. the utility of an observational approach based on corpora as a complement to experimental studies outweighs its limitations. it has the advantages of (a) ecological validity through observation of psychological distance in texts produced in natural communicative settings; (b) automatized ability to track psychological distance in vast spans of language created by heterogeneous, large populations; and (c) ability to investigate a very broad or a very focused range of entities or events. for instance, we chose american cities as our geographical objects of interest. any object for which we have a name and latitude and longitude coordinates can be explored for the effect of geographic distance on construal with the method presented in this article. equally, choosing one theme or a specific time slice in a corpus enables one to break down the aggregate trends demonstrated here into any level of granularity.
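As a rough illustration of how latitude and longitude coordinates can be turned into a distance predictor, the sketch below computes a great-circle distance in kilometers with the haversine formula and then takes its logarithm. This is an assumption made for illustration: the section does not say which distance formula, logarithm base, or zero-distance handling the authors used. The names `haversine_km` and `log_distance`, the offset of 1 km, and the sender coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2.0) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def log_distance(km, offset=1.0):
    """Base-10 log of a distance in km.

    The offset keeps a distance of zero defined; both the base and the
    offset are assumptions of this sketch, not choices taken from the article.
    """
    return math.log10(km + offset)


# Hypothetical message location and an approximate city-center coordinate.
sender = (41.0, -87.0)
city = (41.8781, -87.6298)  # approximate coordinates of chicago
km = haversine_km(sender[0], sender[1], city[0], city[1])
print("distance in km:", round(km, 1))
print("log distance:", round(log_distance(km), 3))
```

The same two functions apply to any named object with known coordinates, which is the generality the paragraph above points to; binning the resulting log distances would be a separate step.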
notably, corpora and social media free researchers to study psychological distance outside of the laboratory. author contributions b. snefjella developed the study concept. both authors collected, analyzed, and interpreted the data. b. snefjella drafted the manuscript, and v. kuperman provided revisions. both authors approved the final version of the manuscript for submission. acknowledgments we thank the sherman centre for digital scholarship and the research & high- performance computing support group at mcmaster university for technical support. we also thank emmanuel keuleers and two anonymous reviewers, as well as the audience of the th annual meeting of the psychonomic society and the annual meeting of the american association for the advancement of science for providing valuable feedback. declaration of conflicting interests the authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article. funding this research was supported by social sciences and humanities research council insight development grant no. - - , natural sciences and engineering research council of canada discovery grant no. - , national institutes of health grant no. r hd (principal investigator: julie a. van dyke), and an early researcher award from the ontario research fund to the second author. note . we are indebted to emmanuel keuleers for raising this point. references alter, a. l., & oppenheimer, d. m. ( ). effects of fluency on psychological distance and mental construal (or why new york is a large city, but new york is a civilized jungle). psychological science, , – . amit, e., & greene, j. d. ( ). you see, the ends don’t justify the means: visual imagery and moral judgment. psychological science, , – . bar-anan, y., liberman, n., trope, y., & algom, d. ( ). automatic processing of psychological distance: evidence from a stroop task. journal of experimental psychology: general, , – . barsalou, l. w. ( ). 
grounded cognition. annual review of psychology, , – .
bogardus, e. s. ( ). a social distance scale. sociology & social research, , – .
boroditsky, l. ( ). metaphoric structuring: understanding time through spatial metaphors. cognition, , – .
boroditsky, l. ( ). does language shape thought?: mandarin and english speakers’ conceptions of time. cognitive psychology, , – .
boroditsky, l., & ramscar, m. ( ). the roles of body and mind in abstract thought. psychological science, , – .
brysbaert, m., warriner, a. b., & kuperman, v. ( ). concreteness ratings for thousand generally known english word lemmas. behavior research methods, , – .
burgoon, e. m., henderson, m. d., & markman, a. b. ( ). there are many ways to see the forest for the trees: a tour guide for abstraction. perspectives on psychological science, , – .
central intelligence agency. ( ). the world factbook: north america: united states. retrieved from https://www.cia.gov/library/publications/the-world-factbook/geos/us.html
d’argembeau, a., & van der linden, m. ( ). phenomenal characteristics associated with projecting oneself back into the past and forward into the future: influence of valence and temporal distance. consciousness and cognition, , – .
eisenstein, j. (in press). written dialect variation in online social media. in c. boberg, j. nerbonne, & d. watt (eds.), handbook of dialectology. new york, ny: wiley.
eyal, t., liberman, n., trope, y., & walther, e. ( ). the pros and cons of temporally near and distant action. journal of personality and social psychology, , – .
fiedler, k. ( ). construal level theory as an integrative framework for behavioral decision-making research and consumer psychology. journal of consumer psychology, , – .
fischer, m. h., & zwaan, r. a. ( ).
embodied language: a review of the role of the motor system in language comprehension. the quarterly journal of experimental psychology, , – .
fujita, k., henderson, m. d., eng, j., trope, y., & liberman, n. ( ). spatial distance and mental construal of social events. psychological science, , – .
gallese, v., & lakoff, g. ( ). the brain’s concepts: the role of the sensory-motor system in conceptual knowledge. cognitive neuropsychology, , – .
gong, h., & medin, d. l. ( ). construal levels and moral judgment: some complications. judgment and decision making, , – .
hansen, j., & wanke, m. ( ). truth from language and truth from fit: the impact of linguistic concreteness and level of construal on subjective truth. personality and social psychology bulletin, , – .
hirst, w., phelps, e. a., meksin, r., vaidya, c. j., johnson, m. k., mitchell, k. j., . . . olsson, a. ( ). a ten-year follow-up of a study of memory for the attack of september , : flashbulb memories and memories for flashbulb events. journal of experimental psychology: general, , – .
johnson, m. k., foley, m. a., suengas, a. g., & raye, c. l. ( ). phenomenal characteristics of memories for perceived and imagined autobiographical events. journal of experimental psychology: general, , – .
kahle, d., & wickham, h. ( ). ggmap: spatial visualization with ggplot . the r journal, ( ), – . retrieved from http://journal.r-project.org/archive/ - /kahle-wickham.pdf
ledgerwood, a., trope, y., & chaiken, s. ( ). flexibility now, consistency later: psychological distance and construal shape evaluative responding. journal of personality and social psychology, , – .
liviatan, i., trope, y., & liberman, n. ( ). interpersonal similarity as a social distance dimension: implications for perception of others’ actions. journal of experimental social psychology, , – .
meteyard, l., cuadrado, s. r., bahrami, b., & vigliocco, g. ( ). coming of age: a review of embodiment and the neuroscience of semantics. cortex, , – .
nisbett, r. e., caputo, c., legant, p., & marecek, j. ( ). behavior as seen by the actor and as seen by the observer. journal of personality and social psychology, , – .
paivio, a. ( ). mental representations: a dual coding approach. new york, ny: oxford university press.
parrillo, v. n., & donoghue, c. ( ). updating the bogardus social distance studies: a new national survey. the social science journal, , – .
pennebaker, j. w., paez, d., & rimé, b. ( ). collective memory of political events: social psychological perspectives. new york, ny: psychology press.
pingdom. ( ). report: social network demographics in . retrieved from http://royal.pingdom.com/ / / /report-social-network-demographics-in- /
r development core team. ( ). r: a language and environment for statistical computing. vienna, austria: r foundation for statistical computing.
robinson, w. s. ( ). ecological correlations and the behavior of individuals. american sociological review, , – .
schacter, d. l., addis, d. r., & buckner, r. l. ( ). remembering the past to imagine the future: the prospective brain. nature reviews neuroscience, , – .
schwanenflugel, p. j., harnishfeger, k. k., & stowe, r. w. ( ). context availability and lexical decisions for abstract and concrete words. journal of memory and language, , – .
semin, g. r., & fiedler, k. ( ).
the cognitive functions of linguistic categories in describing persons: social cognition and language. journal of personality and social psychology, , – .
shaoul, c., & westbury, c. ( ). a reduced redundancy usenet corpus ( – ). edmonton, canada: university of alberta. retrieved from http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html
soderberg, c. k., callahan, s. p., kochersberger, a. o., amit, e., & ledgerwood, a. ( ). the effects of psychological distance on abstraction: two meta-analyses. psychological bulletin, , – .
stephan, e., liberman, n., & trope, y. ( ). politeness and psychological distance: a construal level perspective. journal of personality and social psychology, , – .
trope, y., & liberman, n. ( ). temporal construal and time-dependent changes in preference. journal of personality and social psychology, , – .
trope, y., & liberman, n. ( ). temporal construal. psychological review, , – .
trope, y., & liberman, n. ( ). construal-level theory of psychological distance. psychological review, , – .
wakslak, c. j., trope, y., liberman, n., & alony, r. ( ). seeing the forest when entry is unlikely: probability and the mental representation of events. journal of experimental psychology: general, , – .
table results of hierarchical regressions comparing models predicting context concreteness

columns: polynomial degree, then r, ∆r, and p (comparison with previous model) for geographic distance (study ), then r, ∆r, and p (comparison with previous model) for temporal distance (study a)
linear: . - - | . - -
quadratic: . . > . | . . .
cubic: . . < . | . . < .
quartic: . . >. | . . .

note: in study , the predictor was log geographic distance from the city; in study a, the predictor was log temporal distance from the event in the past. the cubic polynomial provided the best fit.

table terms used for the social-distance groups defined by bogardus ( )

family: husband, wife, spouse, consort
friends: friend, ally, confidant, confidante, alter ego, second self, pal, chum, buddy
neighbors: neighbor, neighbour, peer, homie, homeboy, homegirl
coworkers: coworker, co-worker, colleague, collaborator, workmate
compatriots: compatriot, countryman, countrywoman
visitors: visitor, tourist, traveler
foreigners: immigrant, foreigner, outsider, stranger, emigrant, nonmember, noncitizen, newcomer, alien

note: from top to bottom, the groups are arranged from the most proximal to the most distal.

fig. .
results from study : scatterplot showing the mean concreteness of twitter messages regarding u.s. cities as a function of log geographic distance. the dotted line represents the inflection point (i.e., the point at which the second derivative of the function changes its sign), and the histogram of distances is presented along the x-axis. the best-fitting regression line and equation are also shown.

fig. . results from study : mean concreteness of usenet postings as a function of (a) log temporal distance from to years (study a), (b) time units in the past (target phrase: “ago”) and the future (target phrase: “from now”; study b), and (c) ordered time units in the past (target phrase: “last”) and the future (target phrase: “next”; study c). in (a), the dotted line represents the inflection point (i.e., the point at which the second derivative of the function changes its sign), and the histogram of distances is presented along the x-axis. in (b) and (c), circle size is proportional to log-frequency of the target phrase. in all panels, the best-fitting regression line and equation are provided, and error bars (where visible) reflect % confidence intervals.

fig. . results from study : mean concreteness of usenet postings as a function of social-distance group (defined by bogardus, ). circle size is proportional to the log-frequency of search terms in each category. the best-fitting regression line and equation are provided, and error bars reflect % confidence intervals.

fig. . results from study : scatterplots showing the mean concreteness of death-related twitter messages regarding u.s.
cities as a function of geographic distance (upper x-axis) and of death-related usenet postings with the target phrase “years ago” as a function of log temporal distance (lower x-axis). solid lines represent best-fitting regressions, and dotted lines replicate the concreteness of twitter messages from figure and the concreteness of death-related usenet postings with “years ago” from figure a (upper and lower, respectively).

open access and promotion and tenure evaluation plans at the university of wisconsin–eau claire

stephanie h. wical (mugar memorial library, boston university, boston, massachusetts, usa) and gregory j. kocken (w. d. mcintyre library, university of wisconsin–eau claire, eau claire, wisconsin, usa)

citation (published version): stephanie h. wical, gregory j. kocken. . "open access and promotion and tenure evaluation plans at the university of wisconsin–eau claire." serials review, v. , issue , pp. - . https://hdl.handle.net/ /

this work was made openly accessible by bu faculty (boston university, openbu, http://open.bu.edu, bu open access articles).

serials review issn: - (print) - x (online). journal homepage: http://www.tandfonline.com/loi/usrv

© the author(s). published with license by taylor & francis group, llc. accepted author version posted online: jun . published online: jun .

keywords: open access publishing; scholarly communication; university promotion and tenure evaluation plans

abstract

department and program evaluation plans at the university of wisconsin–eau claire were examined to see if these documents provide evidence that could be used to justify supporting the publication of peer-reviewed open access articles toward tenure and promotion. in an earlier study, the authors reveal that faculty members at the university of wisconsin–eau claire are more unaware of open access publishing than their counterparts at larger universities. these findings dovetail with other studies that show that faculty members are reluctant to publish in open access journals because of concerns about the quality of those journals. the existing body of scholarship suggests that tenure-line faculty fear publishing in open access journals because it could adversely impact their chances of promotion and tenure.
the authors of this current study sought to determine if department and program evaluation plans could influence negative perceptions faculty have of open access journals. the implications of this study for librarians, scholarly communication professionals, tenure-line faculty, departments, and programs are addressed.

contact: stephanie h. wical, wical@bu.edu, electronic resources & acquisitions librarian, mugar memorial library, boston university, commonwealth avenue, boston, ma .

introduction

the library profession has a growing body of scholarship about perceived and real obstacles to the adoption of open access (oa) publishing and institutional repositories. in june of in the chronicle of higher education, jennifer howard ( ) observed that “junior faculty members concerned with tenure and promotion tend to be wary of repositories” (n.p.). research suggests that tenured faculty members are more likely to publish in open access journals (norwick, ; park, ). untenured faculty, on the other hand, may not believe they are able to take the risk of publishing with a journal that has not been around long enough to have established prestige (norwick, ; suber, ). using a web-based questionnaire, park ( ) found the following when looking at the responses of tenured and untenured faculty (pre- and nontenure line):

the tenured group accorded more importance to career benefit than did the untenured and not applicable groups. it is perhaps appropriate to say that while tenured respondents continue to pursue research and reputation, they likely are less concerned about where and what to publish to gain career benefit; in other words, career benefit may not be as high a priority for them as it is for non-tenured respondents. (p. )

norwick ( ) found that % of the faculty members surveyed feared open access publications would negatively impact their tenure and promotion reviews.
to further complicate this matter, much attention has been given to articles that have been published in journals that appear to have very little or no peer review (bohannon, ; beall, ). “fear of losing the peer-review aspect of publishing is often cited by faculty authors as a reason they are opposed to the open-access model” (corbett, , p. ). coonin and younce ( ) also observed that peer review is the most important factor when determining where to publish. while publishing in a predatory journal is a valid concern, faculty members, departments and universities can take steps to ensure that professors are not in danger of throwing away perfectly good manuscripts, which is an unfortunate occurrence for some who have published in a journal that has no valid peer-review process. while academic disciplines as well as the “research and publishing culture within the disciplines” have bearing on whether or not faculty select open access journals for their manuscripts (coonin & younce, , p. ), could the documents that communicate promotion and tenure policies and guidelines also communicate biases regarding open access journals?

the university of wisconsin–eau claire (uwec) serves as the focus for inquiry because a previous study explored the attitudes and awareness of uwec faculty toward open access publishing. that study revealed that nearly % of the faculty did not understand open access publishing, a significantly higher number than that reported in the literature (kocken & wical, ; xia, ). this current study expands on this earlier work, seeking to determine if department and program evaluation plans could account for some of the reluctance of uwec faculty to publish in open access journals.

© stephanie h. wical and gregory j. kocken. published with license by taylor & francis. this is an open access article. non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly attributed, cited, and is not altered, transformed, or built upon in any way, is permitted. the moral rights of the named author(s) have been asserted.

depending on an academic institution’s priorities, reexamining or updating the promotion and tenure process may be in order if the institution has made a commitment to support open access by adopting an open access mandate or other measure. “although it may not be possible to work directly with university administrators, it is important to take into account what their priorities are” (corbett, , p. ). since policy documents, such as those regarding promotion and tenure, have to be signed by high-level administrators before they can be applied, reviewing these documents is a good approach to determining if any biases or potential biases for or against open access exist. there is a lack of analysis of promotion and tenure documents in how they address or do not address emerging models of scholarly communication, including open access journals and open access publishing (anderson & trinkle, ). in this current study, the investigators hypothesized that the documents guiding the tenure and promotion process at uwec do not formally support open access publishing or distribution.

in , a time when major publishers were beginning to offer the first bundles that later came to be called “big deal” packages, “cronin and overfelt found that any formal analysis of promotion and tenure policies with regard to electronic publication was noticeably absent from the literature” (hattendorf westney, , p. ). while the tenure and promotion process varies from institution to institution, there are many aspects of the process that are similar at almost all institutions.
frequently, tenured faculty members review untenured faculty members. those reviewers often look at a wide variety of criteria, which often include scholarship, service, teaching, and advising. nevertheless, cronin and overfelt’s study “suggested, unsurprisingly, that there may very well be inconsistencies in interpretation and practice in the academic reward system” (hattendorf westney, , p. ). their conclusions have not been overturned in years since their research. “the florida state university system investigated the perceptions toward electronic publishing held by their university administrators and faculty. it concluded that there is a need to develop formal policies regarding the acknowledgement of electronic scholarly publishing in promotion and tenure decisions for faculty” (hattendorf westney, , p. ).

questions about the value of open access publishing have persisted among scholars. a survey conducted by the american association for history and computing revealed that “chairpersons questioned whether a peer-reviewed electronic journal article was ‘as good as’ a peer-reviewed print article.” in discussing this survey, the book digital scholarship in the tenure, promotion, and review process suggests that at that point years ago, “traditional modes of scholarship continue to remain the primary mechanisms by which all faculty in all disciplines are evaluated” (hattendorf westney, , p. ). “electronic journals, even when they have a strictly defined peer-review process, continue to be less widely perceived by scholars as being of the same scholarly caliber as are traditional paper publications” (hattendorf westney, , p. ). although now over a decade old, this concept of traditional modes of scholarship remains an issue that persists in the tenure and promotion review process, and there is nothing in the way of scholarship that explores promotion and tenure documents as they relate to open access publishing.
tenure and promotion committees were slow to address the issue of e-scholarship. in the early days of electronic journals, faculty and administrators were not sure what to do about electronic journals when it came to evaluating research output as it related to promotion and tenure (sweeney, ; hurrell & meijer-kline, ). more recent discussion of this topic explores whether the tenure and promotion process even needs to change to support open access publishing/distribution. david lewis ( ) argues for the inevitability of open access. “as open access comes to dominate the scholarly communication system, the current concerns about publishing in this venue, often related to promotion and tenure decisions, will diminish” (lewis, , p. ). if this proves to be true, then specific changes to the guidelines that direct tenure and promotion decisions would be unnecessary. however, just as early electronic journals were viewed with suspicion, open access journals (which are online only) suffer from their older relative’s reputation. “if tenure and promotion committees do not recognize newer forms of scholarly outputs, including oa materials, as legitimate, then authors may be reluctant to explore these options” (hurrell & meijer-kline, , p. ). moreover, in describing the results of a survey of university of california, corbett ( ) noted that “a traditional system of tenure and promotion was seen as hindering changes in faculty behavior regarding scholarly communication, i.e., deciding to publish in open-access journals or posting their publications in institutional repositories” (p. ).

serials review

while publication quality is the most important factor for faculty when choosing where to publish (warlick & vaughan, ), faculty authors often have to rely on other factors as a shorthand for quality in making their decisions (suber, ). “universities tend to use journal prestige and impact as surrogates for quality” (suber, , p. ).
therefore, journal reputation (antelman, ; suber, ) is what faculty authors base their decisions on. because of this, faculty authors are more likely to publish in older, more established journals rather than risk publishing in a new high-quality open access journal (suber, ). hattendorf westney ( ) observed that “the creation of formal criteria and guidelines for the assessment and evaluation of digital scholarship and teaching with technology for purposes of tenure, promotion and review remains largely in the discussion state” (p. ). this statement could also be made about open access scholarship.

chairpersons questioned the equivalency of electronic journals to print journals when electronic journals first started to appear (hattendorf westney, ; anderson & trinkle, ). similarly, technology-based projects, such as “computer software, articles in e-journals, internet based materials, videotapes, and audiotapes” were not valued by the institutions even though faculty and administrators valued them (hattendorf westney, , p. ). today promotion and tenure committees may still favor older, more established peer-reviewed journals that are not open access over those that are open access, just as promotion and tenure committees had been leery of electronic journals in their early days. as hattendorf westney ( ) observed over years ago, “while electronic journals offer many advantages to multiple constituencies, their acceptance by university promotion and tenure committees remains unclear” (p. ). these shortcuts of relying on prestige and impact to indicate quality could potentially be problematic because of two observations noted by suber ( ):

prestige can’t keep pace with quality, at least when there are many high-quality journals. if prestige is our measure of valuation, then it will inevitably undervalue some high-quality journals. (p.
)

if you’ve ever had to consider a candidate for hiring, promotion, or tenure, you know that it’s much easier to tell whether she has published in high-impact or high-prestige journals than to tell whether her articles are actually good. (p. )

suber ( ) observes that people who do not know that open access is compatible with publishing in a prestigious subscription journal assume that publishing in a prestigious subscription journal is incompatible with open access.

methodology

the university of wisconsin–eau claire (uwec) has departments or programs with evaluation plans that govern the tenure and promotion review process for faculty members. (accounting and finance, business communication, information systems, and management and marketing all share the same college of business evaluation plan.) see table . the investigators were able to obtain all of the evaluation plans and review them. this study focused on how these guiding documents could potentially influence faculty attitudes and decisions regarding open access publishing/distribution. the investigators contacted the chairperson of each of these departments and programs and requested access to the departmental evaluation plans or program evaluation plans (deps/peps) while also explaining the nature of the research.

table . departments and programs with evaluation plans at the university of wisconsin–eau claire.

department/unit/program name                date of review document approval

academic affairs
  mcintyre library                          september ,

college of arts & sciences
  american indian studies                   june
  art and design                            february ,
  biology                                   february ,
  chemistry                                 may ,
  communication and journalism              not available in reviewed document
  computer science                          may ,
  economics                                 april
  english                                   may ,
  foreign languages                         october ,
  geography and anthropology                september ,
  geology                                   september ,
  history                                   july ,
  latin american studies                    may ,
  materials science                         september ,
  mathematics                               not available in reviewed document
  music and theatre arts                    august ,
  philosophy and religious studies          not available in reviewed document
  physics and astronomy                     not available in reviewed document
  political science                         september ,
  psychology                                may ,
  sociology                                 october ,
  watershed institute                       november ,
  women’s studies                           not available in reviewed document

college of business
  accounting and finance∗                   not available in reviewed document
  business communication∗                   not available in reviewed document
  information systems∗                      not available in reviewed document
  management and marketing∗                 not available in reviewed document

college of education and human sciences
  communication sciences and disorders
  education studies                         december ,
  kinesiology                               may ,
  social work                               september ,
  special education                         august ,

college of nursing and health sciences
  nursing                                   july

∗all of the departments in the college of business use a combined evaluation plan for reviews.

wical and kocken

while each plan is individually approved by campus administration, many of these departmental plans are reinforced by a university-wide document that provides guidance for the tenure and review process. each plan was carefully analyzed for specific language regarding open access publishing.
the keywords “open access,” “internet,” “online,” “traditional,” and “repository” were carefully noted and analyzed in their contexts. in addition to the identification of specific words, a qualitative analysis of the language regarding scholarship requirements was conducted. this analysis explored an aspect of the context of how “open access” is supported or discouraged through an evaluation of the scholarship requirements in these documents. the research presented in this article is limited in its scope to uwec. every institution conducts tenure and promotion reviews in a different manner. uwec places a much greater emphasis on teaching than other institutions, and this emphasis influences the tenure and promotion review process as described in the official university documents. the focus of this study was to look only at these documents without any potential bias in interpretation contributed by interviews with department chairpersons or department evaluation committee members.

results

in general, department and program evaluation plans at uwec present a variety of accepted publication types and give some sense of the relative importance of the types of publications that tenure-line faculty are expected to produce in order to achieve promotion and tenure. while most departments and programs do not explicitly state the number of publications required for promotion and tenure, the expectations of departments in the college of business are more explicit: four peer-reviewed publications are expected to achieve promotion and tenure.
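The keyword pass described in the methodology can be pictured as a simple keyword-in-context search. The sketch below is only an illustration and not the investigators' actual procedure, which the article describes as a manual reading: the function name `keyword_in_context`, the `window` parameter, and the sample sentence are all assumptions made up for this example. Only the five keywords come from the article.

```python
import re

# The five keywords named in the methodology; everything else is illustrative.
KEYWORDS = ["open access", "internet", "online", "traditional", "repository"]


def keyword_in_context(text, keywords=KEYWORDS, window=40):
    """Return (keyword, snippet) pairs for every case-insensitive match.

    `window` is the number of characters of context kept on each side.
    Spelling variants such as "on-line" are NOT matched by this sketch
    and would have to be added to the keyword list explicitly.
    """
    hits = []
    for keyword in keywords:
        pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            hits.append((keyword, text[start:end].strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical sentence, not a quotation from any evaluation plan.
    sample = (
        "Scholarship published online is held to the same standard. "
        "Traditional peer-reviewed work is also valued."
    )
    for keyword, snippet in keyword_in_context(sample):
        print(f"{keyword}: ...{snippet}...")
```

A search like this only locates candidate passages; judging whether the surrounding language supports or discourages open access still requires the qualitative reading the investigators describe.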
of the evaluation plans examined, five distinct issues were identified that relate to open access scholarship in the review process: first, plans that directly address “open access” and provide context that supports open access scholarship; second, the degree of flexibility given to candidates under review to support the evaluation of their scholarship; third, dissemination of scholarship; fourth, the issue of equality of scholarship; and finally, the use of contradictory language that potentially diminishes support for open access scholarship. rather than attempt to address each evaluation plan separately, unique occurrences of these five distinct issues are addressed in this section.

none of the evaluation plans examined specifically mentioned “open access” anywhere within the department evaluation plan or program evaluation plan documents. this does not come as a surprise, but these plans did provide fascinating insight into how open access scholarship could be evaluated through the guidelines in these plans. the plans of three departments, communication sciences and disorders, english, and education studies, provided the strongest context toward addressing open access scholarship. the communication sciences and disorders department’s plan identified “articles in refereed journals, in print or on-line” as recognized scholarly activity (communication sciences and disorders department, , p. ). it is possible that the department attempted to stress format neutrality rather than support open access, but the step of identifying “on-line” is certainly uncommon among the plans evaluated. similar to communication sciences and disorders, the english department also made reference to “electronic or online scholarship.” the english department’s plan goes on to state, in parentheses, that this scholarship is “subject to the same standards as non-internet scholarship/creative activity” (english department, , p. ).
again, this is an effort to be format neutral, but it does recognize that scholarship can be published electronically. education studies identified “electronically published documents,” but this was identified in a nonessential category of scholarship termed “enhancing criteria” (education studies department, , p. ). although an analysis of the document does not provide any additional context, the placement of “electronically published documents” within the “enhancing criteria” section could be construed as suggesting electronic publishing is of lesser value to this department.

an overwhelming characteristic of the plans evaluated is the degree of flexibility present in these documents. from the perspective of supporting open access, plans with a great deal of flexibility in the types of scholarship accepted and the means of evaluating that scholarship offer the greatest potential support for open access. generally, most evaluation plans offered flexibility to reviewers and candidates regarding qualified works of scholarship. for example, the art and design department’s evaluation plan states, “the department encourages and recognizes a wide variety of scholarly activities and productions” (art and design department, , p. ). many of the plans evaluated cited a university-wide structural document that guided the formation of departmental evaluation plans. that document, the faculty and academic staff handbook, sets a tone of flexibility for the specific plans.

two of the evaluation plan documents examined highlighted the value of dissemination of scholarly contributions. open access platforms allow for some of the broadest dissemination options available to scholars.
the mathematics department states within its plan that the department values “dissemination to an appropriate audience, and submission of the product to the examination and critique of professional peers, either before or after dissemination, or both” (mathematics department, n.d., p. ). the chemistry department’s evaluation plan echoed the sentiments of mathematics, placing an emphasis on dissemination (chemistry department, , pp. – ). science, technology, engineering, and mathematics (stem) fields are earlier adopters of open access dissemination, and it is not too surprising to see an emphasis on dissemination within the plans of these departments.

another recurring issue, which again appeared in some of the documents examined, focuses on the issue of equality of scholarship. the psychology department’s plan “acknowledges that some review processes are much more rigorous than others in accepting materials for publication and presentation” (psychology department, , p. ). likewise, the political science department’s plan mirrored this statement, almost word for word (political science department, , p. ). the issue of equality of scholarship is something that scholars have often debated with regard to open access. the department’s perception of open access publication becomes more important within the context of these documents. the political science department’s plan further states, “it falls to the department personnel committee, or an appropriate subcommittee thereof, to make judgments on the quality and importance of these activities” (political science department, , p. ). the candidate under review is, in this instance, unable to defend open access publication and is left to the judgments of their senior colleagues on the department personnel committee. a few departments offer avenues through which candidates can defend the quality of their scholarly products.
the latin american studies department’s plan, for example, “encourages faculty members to make a case in review materials should they wish to recommend a different ranking for their own work. scholarly activity may be demonstrated in various ways, depending on the strengths, interests, and professional training of the individual” (latin american studies department, , p. ).

while use of language that specifically identifies open access scholarship remained absent, the evaluation of these plans did reveal vague and occasionally contradictory language that could create confusion regarding the status of open access scholarship in the review process. notably, use of terms such as “traditional,” “in-print,” or “papers” leads the reader, and reviewers, to make assumptions about the definition of these terms. the english department identified “traditional forms of peer-reviewed, discipline-centered scholarship” in their review document but failed to clearly define what this scholarship entails (english department, , p. ). “traditional” could potentially lead a reader to regard print publication as more important than online publication of scholarship. additionally, the geography and anthropology department uses the term “papers” but does not specify whether these could be either in print or online (geography and anthropology department, , p. ). that was the only reference to the term “papers” among the evaluation plans examined. some departments, such as the department of foreign languages, address this issue by using the broader term “publications” (foreign languages department, , pp. – ). other departments took more concrete steps by identifying specific journals or defining terminology used in their review plans. the department of music and theatre arts, for example, identified very specific journals for publication within their plan (music and theatre arts department, , p. ).
the mathematics department, when identifying the significance of quality scholarship, provided a set of characteristics that define “quality” for the department (mathematics department, n.d., p. ).

discussion

of all the departmental evaluation plans analyzed during this research project, only those of english and of communication sciences and disorders specifically identified internet publishing within the scholarship guidelines for review. the overwhelming majority of evaluation plans are generally silent on this issue of electronic publishing. on the surface, this might suggest that open access publishing is not readily adopted by faculty tenure and promotion review committees. most of these documents, however, provide reviewers with a tremendous degree of flexibility to evaluate the scholarly achievements of faculty peers. this is a strength and not a weakness of these review documents. given this, open access publishing is given equal weight alongside traditional publishing models. however, the promotion and review process is complex. the review document is ultimately interpreted by the reviewers. in the standard tenure and promotion review model, senior, tenured faculty members frequently review junior, untenured faculty members. the burden of determining a work’s value lies not with the individual under review but with the team conducting the review. the biases of the review members and perceptions of those biases by untenured faculty members become important factors in the process.

from reviewing the evaluation plans for departments and programs from uwec, there is no specific prohibition against publishing in open access journals despite anxiety faculty members may have about the quality of open access publications.
the recent proliferation of scam journals that have taken advantage of opportunities created by the gold open access model could have caught the attention of faculty who do not wish to publish in or be associated with anything other than reputable journals. jeffrey beall raised issues that are valuable to anyone on the tenure track. in his role as a scholarly communications librarian, beall cautioned scholars about the perils of not carefully vetting their choice of publications and editorship venues. additionally, he suggested that untenured faculty become familiar with their departments’ evaluation plans (beall, ). this is an area in which scholarly communications librarians can assist all tenured and tenure-track faculty members who may need advice about suitable journals to disseminate their scholarship as well as advice on how to evaluate journals in which their junior colleagues publish.

one would hope that faculty members could identify the subscription journals in their fields, but junior faculty are often encouraged to consult more senior faculty for a list of publications. if departments and programs want faculty to choose to publish in a limited number of journals, evaluation plans should explicitly say so. since open access journals are a recent development, more senior faculty members may not be as aware of reputable open access journals. also, newer open access publications may not have been around long enough to establish prestige (suber, ).

until recently, the availability of a blacklist of open access publishers and journals (scholarlyoa.com) provided a quick way for scholars to see if an open access journal or publisher was not worth their time or energy. scholarlyoa.com, also known as “beall’s list,” went dark in mid-january of . cabell’s international anticipates launching its own list of predatory journals and hired beall as a consultant (silver, ).
many of beall’s critics have suggested referring to a whitelist of reputable open access publishers as a better approach. the open access scholarly publishers association (oaspa) membership list could serve as such a list, and oaspa has a mission of setting and maintaining standards:

our mission is to represent the interests of open access (oa) journal and book publishers globally in all scientific, technical and scholarly disciplines. this mission will be carried out through exchanging information, setting standards, advancing models, advocacy, education, and the promotion of innovation. (oaspa, )

what is noteworthy is that this association is for open access publishers and not for the consumers of open access publications. scholarly open access listed journals and publishers that beall considered predatory because they exploited the gold open access model and engaged in deceptive practices, like saying they performed peer review when they really published everything they received. while compliance with the ethical standards of oaspa was something that beall looked at when reviewing a journal or publisher for inclusion or exclusion from beall’s list, it was not the only factor taken into consideration (beall, ). moreover, members of oaspa have found themselves on beall’s list, even though oaspa also places its members under review for falling short of oaspa’s ethical standards. the directory of open access journals has recently purged itself of journals that are not providing peer review and could also be used as a whitelist. what scholars want to avoid is submitting an article to or reviewing an article for a publication that could be considered predatory because it engages in practices that are considered unethical by the scholarly community.

ultimately, all reviews for promotion and tenure should be evaluated on a case-by-case basis.
while some policies will explicitly state the number of and specific quality of publications required to achieve promotion and tenure, nearly all policies at uwec leave a lot of flexibility for promotion and tenure committees. however, determining the quality of a publication may take much more work than most faculty members are willing or able to do with their current workloads. if, as suber ( ) states, “the key variables in journal quality are excellent authors, editors, and referees” (p. ), then perhaps faculty time is better spent analyzing the credentials of authors, editors, and referees rather than something like an impact factor, which is one of “the key variables in journal prestige” (p. ). the key variables that suber ( ) identifies are “quality, age, impact, circulation, and recognition by promotion and tenure committees” (p. ). in any case, committees are not looking at an article for its actual merit, but are making educated guesses based on what they believe to be true about journals.

the way that open access journals are often perceived compared to society and commercial journals that charge a subscription price in many ways resembles the ways that electronic journals were perceived in the early days of electronic journals. “scholars who had published electronically believed that there is a widespread perception that electronic publication is less significant than print publication. their beliefs continue to be reflected in the current practices of promotion and tenure” (hattendorf westney, , p. ). it is very likely that a bias against open access exists in the minds of faculty even if the policy documents that guide promotion and tenure decisions fall silent on the issue. how this will likely translate in the tenure, promotion, and review process is that “established methods of publishing and teaching will continue to be rewarded more often and consistently” (fountain, , p. ).
there is a lack of recent scholarship in this area, and it deserves further exploration, as scholarly communication has changed dramatically in the last years.

after reviewing numerous evaluation documents, it is apparent that universities can take steps to support open access publishing without needing to directly support open access in the guidelines that govern the evaluative process. the following are some suggestions of ways that scholarly communications librarians can help support open access within the academic landscape:

• encourage all faculty members to submit eligible scholarly works to an institutional or discipline specific open access repository.

this is simple, but can have far-reaching effects. submitting scholarly works to an open repository can build awareness simply by being part of the tenure and review process. in turn, faculty and other academic staff learn more about repositories and copyright. additionally, this activity could be viewed as service to the college or university (more on that in the following). self-archiving of publications remains a valid option for tenure track professors, even though kim ( ) found that faculty members surveyed believed “there would be little positive effect of self-archiving on tenure and promotion, especially when posting non-peer-reviewed materials” (p. ). kim ( ) indicated that two interviewees relayed that self-archiving their publications had helped enhance their reputations, and they believed that this helped them secure favorable recommendation letters. in the fall of , library faculty at the university of wisconsin–eau claire signed a declaration to put our own research in minds@uw, the university of wisconsin system’s institutional repository, whenever possible (free, ). how could we expect teaching faculty to submit their work to an institutional repository if we were not leading by example?
• scholarly communications librarians and open access advocates should review their institutions’ criteria for promotion and tenure.

• avoid using contradictory language within guidelines.

our words are often subject to interpretation. use of the term “traditional,” for example, is ambiguous and misleading. documents should be specific when identifying “traditional publications” or any time subjective language is used.

• provide a mechanism for faculty members undergoing a review to defend their publications and the journals they choose for publication.

allowing faculty members to address a review committee can often provide clarity for both parties during a review process. if given the opportunity, faculty members can provide alternate measures of scholarly impact, such as altmetrics.

• build partnerships to support change.

colleges and universities, especially publicly funded institutions, are frequently under scrutiny to prove their value. partnerships with an office of research or similar department that is receptive to supporting open access and institutional repositories provide colleges and universities with opportunities to demonstrate their value.

• provide a mechanism for untenured faculty members to initiate a revision of an evaluation plan.

the time that david lewis ( ) foresaw, when open access comes to dominate the publishing landscape, is not yet upon us. while this may someday be true, the opportunities presented here can help make the tenure and review process more amenable to open access without forcing the issue through institutional requirements or an unnecessary reward structure. as mercer ( ) observes, librarians can effect change by “enhancing the value of open access with administrators and promotion/tenure committees” (mercer, , pp. – ). scholars who are on the tenure track are encouraged to become familiar with their department or program evaluation plans and to begin conversations with librarians.
if a plan has outdated language that does not reflect the current realities of scholarly communication, untenured faculty members are encouraged to initiate a dialog with tenured faculty members in order to make much needed changes where appropriate. untenured faculty members can ask each of the members of their promotion and tenure committee and librarians which journals they believe are appropriate publication venues and can bring high-quality open access publications to the attention of committee members. one way might be for untenured faculty (and their librarian allies) to send their senior colleagues great articles from open access journals that are relevant to their senior colleagues’ research interests. if a senior colleague has never heard of the open access journal from which the article came and they find it highly useful, they may be more accepting of other open access journals that publish in areas outside of their research. another possibility is to seek out the mentorship of more senior colleagues by discussing possible journals where junior faculty may wish to publish their research. inviting librarians to departmental meetings to discuss scholarly communications issues is another way to effect positive change to update evaluation plans. in any case, the burden of pushing for updating department and program evaluation plans may fall on junior faculty, as senior faculty will have little incentive to change them unless they are charged with that responsibility on a formal committee.

our findings are limited in scope in that they focus on uwec, but it is our hope that this study could lead to important conversations about real and perceived values of open access publications to tenure and promotion committees. with the abundance of open access journals and the appearance of predatory publishers, it is crucial that tenure and promotion evaluation plans explicitly address what is considered legitimate scholarship.
conclusion

uwec program and department evaluation plans fall silent on the issue of open access publishing as a viable option for tenure line faculty. promotion and tenure guiding documents at larger research universities could produce different results, and campus climates could be very different, so more research in this area is needed. an additional area of inquiry is described by hurrell and meijer-kline ( ): “no study has specifically investigated the knowledge, attitudes, and beliefs around oa publishing among academic faculty and administrators who sit on tenure and promotion committees, and the effect that those attitudes might have on their judgments” (p. ). this would be a logical next step to the study presented here, especially since the documents as a whole leave much room for interpretation. corbett ( ) also recommends that librarians consider the priorities of administration. sweeney ( ) had surveyed administrators about their attitudes about electronic journals, and a similar study looking at administrator attitudes and beliefs about open access journals and predatory publishers would be an excellent area of future research. before open access gains wide acceptance, the way promotion and tenure committees evaluate scholarship needs to be examined. more research is also needed to better understand how open access journals are accepted by specific disciplines based upon the prevalence of open access journals in those disciplines.

while the program and department evaluation plan documents at uwec do not explicitly address the issue of open access, they do explicitly address peer-review as an indication of quality scholarship. the implications of this for tenure line faculty, departments, and programs as well as for colleges and universities are addressed.
our findings regarding the promotion and tenure documents are not surprising considering the prior survey findings of coonin and younce ( ) that “peer review and peer acceptance is at the heart of scholarly research endeavors” (p. ). because of the importance of peer review to scholarly communication, we may wish to turn our attention to ensuring that scholars are not lured into publishing in journals that have little or no peer review. not all peer review processes are equal, so helping scholars determine when to “walk away” from a questionable publisher is something that librarians are well positioned to do. in any case, scholarly communications librarians and open access supporters should familiarize themselves with their institution’s criteria for promotion and tenure to determine not only how they can better help tenure-track faculty but also how they can help policy evolve so that it corresponds well with the current landscape of scholarly communication.

orcid

stephanie h. wical http://orcid.org/ - - -
gregory j. kocken http://orcid.org/ - - - x

references

anderson, d. l., & trinkle, d. a. ( ). valuing digital scholarship in the tenure, promotion, and review process: a survey of academic historians. in d. l. anderson (ed.), digital scholarship in the tenure, promotion, and review process (pp. – ). armonk, ny: m.e. sharpe.
antelman, k. ( ). self-archiving practice and the influence of publisher policies in the social sciences. learned publishing, ( ), – .
art and design department. ( , february ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
beall, j. ( ). criteria for determining predatory open access publishers ( nd ed.). retrieved from http://scholarlyoa.com/ / / /criteria-for-determining-predatory-open-access-publishers- nd-edition/
beall, j. ( ). avoiding the peril of publishing qualitative research in predatory journals. journal of ethnographic & qualitative research, ( ), – .
bohannon, j. ( ). who’s afraid of peer review? science, ( ), – . doi: . /science. . .
chemistry department. ( , may ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
communication sciences and disorders department. ( ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
coonin, b., & younce, l. m. ( ). publishing in open access education journals: the authors’ perspectives. behavioral and social sciences librarian, ( ), – . doi: . /
corbett, h. ( ). the crisis in scholarly communication, part i: understanding the issues and engaging your faculty. technical services quarterly, ( ), – . doi: . /
cronin, b., & overfelt, k. ( ). e-journals and tenure. journal of the american society for information science, ( ), – . doi: . /(sici) - ( ) : . .co; - .
education studies department. ( , december ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
english department. ( , may ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
foreign languages department. ( , october ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
fountain, k. c. ( ). to web or not to web? evaluation of world wide web publishing in the academy. in d. l. anderson (ed.), digital scholarship in the tenure, promotion, and review process (pp. – ). armonk, ny: m.e. sharpe.
free, d. ( ). news from the field: uw–eau claire supports open access. college & research libraries news, ( ), .
geography and anthropology department. ( , september ). department evaluation plan.
unpublished internal document, university of wisconsin–eau claire.
hattendorf westney, l. c. ( ). mutually exclusive? information technology and the tenure, promotion, and review process. in d. l. anderson (ed.), digital scholarship in the tenure, promotion, and review process (pp. – ). armonk, ny: m.e. sharpe.
howard, j. ( , june ). digital repositories foment a quiet revolution in scholarship. chronicle of higher education. retrieved from http://chronicle.com/article/digital-repositories-foment/ /
hurrell, c., & meijer-kline, k. ( ). open access up for review: academic attitudes towards open access publishing in relation to tenure and promotion. open excess, ( ). retrieved from http://tsc.library.ubc.ca/index.php/journal /article/view/
kim, j. ( ). faculty self-archiving: motivations and barriers. journal of the american society for information science and technology, ( ), – .
kocken, g., & wical, s. ( ). “i’ve never heard of it before”: awareness of open access at a small liberal arts university. behavioral & social sciences librarian, ( ), – .
latin american studies department. ( , may ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
lewis, d. ( ). the inevitability of open access. college and research libraries, ( ), – .
mathematics department. (n.d.). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
mercer, h. ( ). almost halfway there: an analysis of the open access behaviors of academic librarians. college and research libraries, ( ), – .
music and theater arts department. ( , august ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
norwick, e. ( ). academic rank of authors publishing in open access journals. agricultural information worldwide, ( ), – .
open access scholarly publishers’ association [oaspa]. ( ). code of conduct.
retrieved from http://oaspa.org/membership/code-of-conduct/
park, j. h. ( ). motivations for web-based scholarly publishing: do scientists recognize open availability as an advantage? journal of scholarly publishing, ( ), – .
political sciences department. ( , september ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
psychology department. ( , may ). department evaluation plan. unpublished internal document, university of wisconsin–eau claire.
silver, a. ( ). controversial website that lists ‘predatory’ publishers shuts down. nature news (january , ). retrieved from http://www.nature.com/news/controversial-website-that-lists-predatory-publishers-shuts-down- .
suber, p. ( ). thoughts on prestige, quality and open access. logos, ( – ), – .
sweeney, a. e. ( ). tenure and promotion: should you publish in electronic journals? the journal of electronic publishing, ( ). retrieved from http://quod.lib.umich.edu/j/jep/ . . ?view=text;rgn=main doi: . / . .
warlick, s. e., & vaughan, k. t. l. ( ). factors influencing publication choice: why faculty choose open access. biomedical digital libraries, ( ). retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/pmc /
xia, j. ( ). a longitudinal study of scholars’ attitudes and behaviors toward open-access journal publishing. journal of the american society for information science and technology, ( ), – .
presenting the bangor autoglosser and the bangor automated clause-splitter

d. m. carter
the university of british columbia, okanagan campus, kelowna, british columbia, canada; centre for research on bilingualism, bangor university, gwynedd, wales

m. broersma
centre for language studies, radboud university, nijmegen, the netherlands; max planck institute for psycholinguistics, nijmegen, the netherlands

k. donnelly
centre for research on bilingualism, bangor university, gwynedd, wales

a. konopka
university of aberdeen, aberdeen, scotland

abstract

until recently, corpus studies of natural bilingual speech and, more specifically, codeswitching in bilingual speech have used a manual method of glossing, part-of-speech tagging, and clause-splitting to prepare the data for analysis. in our article, we present innovative tools developed for the first large-scale corpus study of codeswitching triggered by cognates. a study of this size was only possible due to the automation of several steps, such as morpheme-by-morpheme glossing, splitting complex clauses into simple clauses, and the analysis of internal and external codeswitching through the use of database tables, algorithms, and a scripting language.

introduction

one of the main challenges faced by researchers who study natural bilingual speech is the amount of time needed to collect, transcribe, and prepare the corpus data before any type of linguistic or sociolinguistic analysis can take place.
for instance, previous analyses of codeswitching patterns found specifically in the welsh–english siarad corpus utilized in our study relied on manual morpheme-by-morpheme glossing, clause-splitting (i.e. splitting complex clauses into simple clauses), and data preparation (carter et al., ; davies and deuchar, ; herring et al., ). the manual data preparation involved processes such as determining a main language and an embedded language for each bilingual simple clause (see section for details on the matrix language frame model; myers-scotton, , ). the result was a slow process that limited the number of clauses included in each analysis, ranging from a few hundred to a few thousand. one of the goals of our current study was to devise more efficient automated tools and techniques that would allow us to analyze all of the , clauses in the siarad corpus in a much shorter amount of time.

correspondence: d. m. carter, faculty of creative and critical studies, department of critical studies, ccs , university of british columbia, okanagan campus, research road, kelowna, bc v v v , canada. e-mail: diana.carter@ubc.ca

digital scholarship in the humanities, vol. , no. , . © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com doi: . /llc/fqw advance access published on february
downloaded from https://academic.oup.com/dsh/article-abstract/ / / / by mpi psycholinguistics user on april

in our article, we present the methodology and innovative tools that were essential to our study of codeswitching in the welsh–english siarad corpus of spontaneous bilingual speech. we believe that these tools will facilitate several steps in the analysis of monolingual and bilingual corpora.
for instance, the bangor autoglosser can be utilized to automatically gloss corpora that include languages with small speaker populations, given that tagging systems are often unavailable for languages with fewer than five million speakers. the bangor automated clause-splitter can be a helpful tool for any researcher who needs to divide complex clauses into smaller clauses for analysis, and it may be used for languages other than welsh, such as spanish.

previous work has successfully used automated tools to predict codeswitching in corpora (papalexakis et al., ; solorio and liu, ). in the present study, by contrast, we analyze actual occurrences of codeswitching. specifically, our study employed automated methods with the aim of analyzing both internal codeswitches (two languages used within the same clause) and external codeswitches (switches extending over the clause boundary) triggered by cognates (clyne, , ). clyne ( ) defines cognates, or trigger words, as proper nouns, bilingual homophones, and lexical transfers (items from one language that have become part of the lexicon of the speaker’s second language), and typically the default assumption is that cognates are nouns. however, in our study, we extended the definition to include all word types that overlap in form and meaning in the bilingual’s two languages. essentially, clyne’s triggering hypothesis proposes that cognates facilitate codeswitching, an effect that is the result of the selection of the cognate from the mental lexicon (broersma and de bot, ; broersma, ). it is argued that cognates may be strongly connected in the mental lexicon and that their conceptual representations are more closely connected than those of non-cognates. therefore, the activation of a word that is shared by two languages may lead to a change in activation of both languages at the lexical level.
this in turn may ‘boost’ the least active language to the extent that the next time a lemma is selected it may be one from the boosted language instead of the previously spoken language.

as with the welsh–english studies mentioned above, previous work on the triggering hypothesis was also performed manually and required over h to tag and analyze small corpora of – , words (broersma and de bot, ; broersma, ). however, through the implementation of the bangor autoglosser, the bangor automated clause-splitter, as well as database tables, algorithms, and a scripting language, we were able to successfully analyze almost , words in , clauses. in the following sections, we first describe the collection and transcription of the siarad corpus and then, crucially, the autoglossing and clause-splitting processes, and final data preparation.

data collection

here we describe the method followed to collect the large welsh–english corpus used in our analysis of triggered codeswitching. the welsh–english siarad corpus consists of , words from speakers across sixty-nine conversations. the corpus was collected over a year period in wales by bilingual welsh–english researchers who were local members of the community (deuchar et al., ). the participants were recruited through a variety of means, such as newspaper announcements, and the ‘friend of a friend’ approach (milroy, ). the speakers were told that the aim of the study was to record people having bilingual conversations with another bilingual friend or family member. conversations lasted between and min, with a mean length of min, and were recorded using a marantz hard disk recorder (carter et al., ; deuchar et al., ). researchers were not
present at the time of the recording, and the participants could discuss any topic of their choice. the speakers were left alone to minimize the observer’s paradox, which occurs as the result of having an interviewer or researcher present during the recordings (labov, ). it was important that the participants felt comfortable and unhindered given that informal situations, rather than formal interviews, are more likely to elicit natural bilingual speech.

after the recordings were finished, the participants were asked to complete a self-assessed background questionnaire consisting of twenty questions. the questionnaire elicited a wide range of information, such as the participants’ age, gender, occupation, language of education, age of exposure to each language, language of social network, language proficiency, and attitudes toward codeswitching. the anonymity of the participants was protected, which means that other researchers interested in studying social variables could also access the questionnaire data without harming anonymity.

transcription

all of the recordings were transcribed by welsh–english bilinguals using the chat transcription system in the computerized language analysis (clan) program (macwhinney, ). within the chat system, transcribers used language tags to differentiate between welsh, english, and cognate words. words in the most frequent language in each conversation were left untagged, while all other words were tagged according to their corresponding language using three-letter abbreviations of iso- - (e.g. @s:eng for english). transcribers employed a ‘dictionary method’ to allocate language tags and ensure consistency in the transcripts. words that occurred in the dictionaries of both english and welsh were considered cognates and tagged as @s:cym&eng, with the language tags in alphabetical order of the abbreviation.
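the tagging convention just described can be read back mechanically. the following is a minimal python sketch, not the project's own code: the function names, the regular expression, and the choice of welsh ("cym") as the default language for untagged words are assumptions made for illustration, and the sample utterance is a tagged sentence from the corpus with its final punctuation removed.

```python
import re

# Assumption: a tagged token has the shape word@s:eng or word@s:cym&eng.
TAG_PATTERN = re.compile(r"^(?P<word>[^@\s]+)@s:(?P<langs>[a-z]{3}(?:&[a-z]{3})*)$")


def parse_token(token, default_lang="cym"):
    """Return (word, languages) for one transcript token.

    Untagged tokens are assigned the conversation's most frequent language.
    """
    match = TAG_PATTERN.match(token)
    if match is None:
        return token, (default_lang,)
    return match.group("word"), tuple(match.group("langs").split("&"))


def parse_utterance(utterance, default_lang="cym"):
    """Split an utterance on whitespace and parse every token."""
    return [parse_token(tok, default_lang) for tok in utterance.split()]


def is_cognate(languages):
    """A word listed under more than one language counts as a cognate."""
    return len(languages) > 1


if __name__ == "__main__":
    sample = "ond dw i ddim actually@s:eng isio mynd i wrando ar y stuff@s:cym&eng"
    for word, langs in parse_utterance(sample):
        print(word, langs, "cognate" if is_cognate(langs) else "")
```

keeping the languages as a tuple rather than a single string means cognates need no special case later: any downstream step can test the number of languages attached to a word.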
these words included proper nouns, nouns, and verbs as well as other word classes. in the siarad corpus, welsh is the most frequent language in all conversations, ranging from to % of the words, with an average of %.

one of the key advantages of language tagging is that it greatly facilitates the identification of cognates and codeswitching in bilingual corpora. example ( ) below illustrates a transcription tier with language tags and a translation tier.

( ) ond dw i ddim actually@s:eng isio mynd i wrando ar y stuff@s:cym&eng.
‘but i don’t actually want to go and listen to the stuff’.

in addition to the tiers of transcribed speech and the translation tier, another tier was included in the corpus that was essential to our study: a morpheme-by-morpheme gloss. originally, all of these tiers were entered manually by a team of bilingual researchers. the newly developed bangor autoglosser provides researchers with a more efficient automatic method of glossing.

the bangor autoglosser

the transcription tiers were glossed with an innovative automated tool called the bangor autoglosser that followed the leipzig glossing conventions (carter et al., ; donnelly and deuchar, ). given that the existing tagging system used in clan only handles larger languages of over five million speakers (macwhinney, ), it was necessary to create a tool from scratch that could automatically gloss large multilingual welsh–english texts. the implementation of the bangor autoglosser involved a combination of digital dictionaries and the application of constraint grammar (karlsson, ; karlsson et al., ). constraint grammar assigns grammatical tags to text based on context-dependent rules written by a linguist. each rule selects, removes, adds, or replaces the tag on any given word by taking into account surrounding words and their tags. this was the first application of constraint grammar to mixed-language texts.
essentially the procedure involves the separation of text into words, the lookup of each word in a dictionary that gives possible lemmas and part-of-speech (pos) for that word, and the selection of the correct lemma and pos for the word in its current context. this is illustrated in table .

the autoglossing process is as follows. first, the bangor autoglosser imports each utterance from a transcript into an utterance table, as seen in fig. . the table facilitates the process of editing or adding items either directly to the table or to an exported spreadsheet version of the same table.

second, the words are imported from the database into a ‘words table’ and tokenized (fig. ). any mutations in welsh are removed (e.g. ‘gath -> cath’), and any elisions or regular verb endings in english are also removed (e.g. ‘gonna, i’ll’).

the language tags are used to decide which dictionary is consulted for the gloss. the correct dictionary accumulates all matching entries for each word and writes them in another file that is in the format required by the constraint grammar parser.

next, the parser applies the constraint grammar rules to the file. for example, in the case of the english word ‘dance’, you would have one reading: dance, sv, infin, meaning that ‘dance’ can be a singular noun, or a verb (with the combined tag ‘sv’), and if it is a verb, it is usually an infinitive. the constraint grammar rules then use context to convert the ‘sv’ tag into ‘n.sg’, or ‘v.pres’ (e.g. they dance). the constraint grammar rules for welsh are applied by the parser in the same way it would apply rules for any language. in other words, there is no need for a special algorithm to be written specifically for welsh.
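the lookup-then-disambiguate procedure can be sketched in a few lines of python. this is a toy illustration under stated assumptions, not the bangor autoglosser and not real constraint grammar syntax: the three-word dictionary, the function names, and the two context rules (a preceding determiner selects the noun reading, a preceding pronoun selects the present-tense verb reading) are hypothetical stand-ins for the dictionaries and the linguist-written rules described above.

```python
# Hypothetical mini-dictionary: surface form -> list of candidate readings.
ENGLISH_DICTIONARY = {
    "they": [{"lemma": "they", "pos": "pron"}],
    "the": [{"lemma": "the", "pos": "det"}],
    # 'sv' is the combined tag from the text: singular noun or verb.
    "dance": [{"lemma": "dance", "pos": "sv"}],
}


def lookup(word):
    """Collect all dictionary readings for a word (empty list if unknown)."""
    return [dict(reading) for reading in ENGLISH_DICTIONARY.get(word.lower(), [])]


def disambiguate(words):
    """Return (word, lemma, pos) triples, resolving 'sv' from the left context.

    Illustrative rules only: determiner + sv -> n.sg, pronoun + sv -> v.pres.
    An 'sv' reading with no helpful context is left unresolved.
    """
    glossed = []
    for index, word in enumerate(words):
        candidates = lookup(word)
        if not candidates:
            glossed.append((word, None, "unk"))
            continue
        reading = candidates[0]
        pos = reading["pos"]
        if pos == "sv" and index > 0:
            previous_pos = glossed[index - 1][2]
            if previous_pos == "det":
                pos = "n.sg"
            elif previous_pos == "pron":
                pos = "v.pres"
        glossed.append((word, reading["lemma"], pos))
    return glossed


if __name__ == "__main__":
    print(disambiguate(["they", "dance"]))
    print(disambiguate(["the", "dance"]))
```

a real constraint grammar parser keeps every candidate reading and lets ordered rules remove or select among them; the sketch collapses that into a single pass to show only the role that context plays.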
this is one of the features that allows constraint grammar to be used to tag multilingual text. one main difference between english and welsh, however, is the higher number of homonyms present in english. as a result, in welsh, each individual meaning tends to have a separate reading. the results of the application of the grammar rules are stored in a words table as a combination of a gloss and pos-tag (fig. ).

table . welsh dictionary layout
surface   lemma     enlemma   pos   gender  number  tense
bara      bara      bread     n     m       sg
cathod    cath      cat       n     f       pl
mynd      mynd      go        v                     infin
aeth      mynd      go        v             s       past
hapus     hapus     happy     adj
rhywsut   rhywsut   somehow   adv
heb       heb       without   prep

fig. example ( ) in the utterance table

finally, the entire chat file is written out of the database with a new autogloss tier that is generated from the glossed words. this output is illustrated in example ( ).

‘but i don’t actually want to go and listen to the stuff’

using this innovative method, glossed text was produced at a rate of , words per minute and the h siarad corpus was glossed in approximately . h. we performed manual checks of the complete outputs from five random transcription files which showed that the precision of the glossing was between and %, depending on the language.

in addition to its efficiency, another advantage of the automated glossing is that it is now possible to easily access any word or attribute of texts that are available in the database. through the use of a scripting language such as hypertext preprocessor (php) (lerdorf, ) or python (bird et al., ), it is possible for researchers to manipulate the database at this point and begin to analyze the corpus data. we used a scripting language in most stages of our study, including the development of the automated clause-splitter we describe next.
the bangor automated clause-splitter

given that the siarad corpus was not originally transcribed in simple clauses and no welsh parser existed, we needed to devise a way of automatically splitting complex clauses into simple clauses for our codeswitching analysis. this was an essential step so that we could apply the matrix language frame model (myers-scotton, , ) and determine a base or matrix language for each clause. according to the model, each codeswitched clause contains a matrix language that provides the morphosyntactic frame for the clause, and an embedded language that contains inserted material, mostly consisting of content morphemes. the matrix language can usually be determined by the language of the finite verb in each clause, which was found to be true for the entire siarad corpus. the large majority of the clauses in the siarad corpus have a welsh main verb, with english providing the inserted material.

as mentioned in the introduction, previous studies that involved manual clause-splitting took several weeks and many researchers to divide only a few thousand clauses (carter et al., ; davies and deuchar, ; herring et al., ). in the present study, we were able to analyze , clauses as a result of the creation of the automated clause-splitter. during the initial development phase, the first version of the clause-splitter was tested on the first utterances of a single file and was checked in detail, revealing an accuracy rate of %. in total, twenty-one ( %) of the utterances were split incorrectly. out of the twenty-one, eight ( %) were due to an incorrectly applied rule in the constraint grammar, and another three ( %) because of an error in the dictionary. the final ten ( %) were due to the splitter itself. to increase the accuracy rate of the clause-splitter, we made corrections to the constraint grammar application as well as the dictionaries. additionally, we revised some of the assumptions that the splitter uses.
for example, one assumption is that inflected verbs have the clause marker moved to the preceding word when the preceding word is a conjunction, a subordinator, or an adverb. the initial list of these words was increased because it was not exhaustive, thus causing inaccurate clause-splits.

fig. disambiguated words from example ( ) stored in the words table after the application of the constraint grammar parser

fig. example ( ) in the words list table

the clause-splitting procedure as applied to the siarad corpus can be summarized as follows. first, for the purpose of the present study, we removed all conversations containing more than two speakers, leaving us with fifty-two conversations and speakers; this was a preemptive step that would later facilitate the statistical analysis. note that the clause-splitter could handle conversations with more than two speakers without any problem. second, we omitted all interactional markers, which are utterances such as ‘uhhuh, mmhm’ that do not fulfill any syntactic role in everyday speech. next, we added role indicators in the ‘words table’ (fig. ) to every finite verb, which were then moved where necessary. in the following example ( ), the finite verbs are underlined, the clause-splits are marked with a forward slash (/), and the word onto which the clause-split marker was moved is in bold. the example illustrates how the marker is moved from ‘o’n’ to ‘pan’ (when) because ‘pan’ is a conjunction, following the assumption made by the clause-splitter that the marker be moved to the word preceding an inflected verb if that word is a conjunction.
( ) dw i yn cofio/o’n i yn gweithio ar y nos/pan o’n i yn gweithio yn beaumaris
‘i remember/i was working nights/when i was working in beaumaris’

as mentioned previously, spot checks of a random sample of the splits revealed that this method was over % accurate, which was deemed an acceptable rate given the speed of the process and the large number of clauses that were produced for analysis.

next, we determined the matrix language of each clause by detecting the language of the finite verb within that clause. this step was done automatically based on the language tagging in the transcripts. once the matrix language was assigned, we assessed whether there were any internal or external codeswitches. if two languages co-occurred within the same basic clause, it was considered an internal switch, but if the subsequent clause had a different matrix language from the previous clause, then it was an external switch.

finally, we generated additional data that characterized the clauses and the conversations. for example, we wanted to know the length of each clause in words, whether the clause contained cognates, and if yes, how many, the type of each clause, the length in letters of each cognate, and the language of the clause (welsh, english, or bilingual). other key information included the total number of words, clauses, cognates, and codeswitches in each conversation and per speaker. once the enriched data had been generated, they were exported to a comma-separated value file and could be analyzed using statistical software such as r (r development core team, ).
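the matrix-language and switch-detection steps described above lend themselves to a short python sketch. everything here is an assumption made for illustration rather than the project's own scripts: the token record built by `tok`, the function names, the decision to ignore cognates (words tagged with two languages) when counting the languages in a clause, and the second sample clause, which is invented english data.

```python
def tok(word, langs=("cym",), finite=False):
    """Build one token record; the field names are illustrative."""
    return {"word": word, "langs": tuple(langs), "finite": finite}


def matrix_language(clause):
    """Language of the first finite verb that carries a single language tag."""
    for token in clause:
        if token["finite"] and len(token["langs"]) == 1:
            return token["langs"][0]
    return None


def has_internal_switch(clause):
    """True if two languages co-occur in the clause (cognates are ignored)."""
    single = {t["langs"][0] for t in clause if len(t["langs"]) == 1}
    return len(single) > 1


def external_switches(clauses):
    """One flag per clause: does its matrix language differ from the previous one?"""
    flags, previous = [], None
    for clause in clauses:
        current = matrix_language(clause)
        flags.append(previous is not None and current is not None and current != previous)
        if current is not None:
            previous = current
    return flags


# First clause adapted from the tagged corpus sentence; second clause is made up.
clauses = [
    [tok("dw", finite=True), tok("i"), tok("ddim"), tok("actually", ("eng",)),
     tok("isio"), tok("mynd")],
    [tok("i", ("eng",)), tok("was", ("eng",), finite=True), tok("working", ("eng",))],
]

if __name__ == "__main__":
    for clause, external in zip(clauses, external_switches(clauses)):
        print(matrix_language(clause), has_internal_switch(clause), external)
```

per-clause values such as clause length in words or the number of cognates could be derived from the same token records before export to a comma-separated file.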
conclusions

in contrast to previous smaller-scale studies of codeswitching patterns in bilingual corpora, and specifically in the welsh–english siarad corpus, our research team was able to analyze the entire corpus of , clauses due to the development of innovative tools, namely, the bangor autoglosser, which applied constraint grammar to bilingual text for the first time, and the bangor automated clause-splitter that divided thousands of complex clauses into basic clauses at a rapid rate. all of the data were contained in database tables and were manipulated and analyzed through the use of a general-purpose scripting language, rather than a specific dedicated interface, such as the query application found in the clan (macwhinney, ) program. the scripts were written and utilized successfully to prepare a large quantity of clauses for the analysis of several variables pertaining to our study’s focus on triggered codeswitching. although a discussion of the statistical analysis and results is outside the scope of this article, it should be noted that without the use of the automated tools and scripts, it would not have been possible to process the large welsh–english siarad corpus with such speed, efficiency, and accuracy.

funding

this work was supported by a small research grant from the british academy awarded to the first and second authors (grant number ). we also gratefully acknowledge the support of the max planck institute for psycholinguistics, the centre for research on bilingualism in wales, and the university of calgary.

references

bird, s., klein, e., and loper, e. ( ). natural language processing with python. california: o’reilly media, inc.
broersma, m. ( ). triggered codeswitching between cognate languages. bilingualism: language and cognition, : – .
broersma, m. and de bot, k. ( ). triggered codeswitching: a corpus-based evaluation of the original triggering hypothesis and a new alternative. bilingualism: language and cognition, : – .
carter, d., broersma, m., and donnelly, k. ( ). applying computing innovations to bilingual corpus analysis. in valenzuela, e. and de la fuente, a. a. (eds), language acquisition beyond parameters: studies in honour of juana m. liceras. amsterdam: john benjamins.
carter, d., deuchar, m., davies, p., and parafita couto, m. c. ( ). a systematic comparison of factors affecting the choice of matrix language in three bilingual communities. journal of language contact, : – .
clyne, m. ( ). transference and triggering: observations on the language assimilation of postwar german-speaking migrants in australia. the hague: martinus nijhoff.
clyne, m. ( ). dynamics of language contact: english and immigrant languages. cambridge: cambridge university press.
davies, p. and deuchar, m. ( ). using the matrix language frame model to measure the extent of word order convergence in welsh-english bilingual speech. in breitbarth, a., lucas, c., watts, s., and willis, d. (eds), continuity and change in grammar. philadelphia, pa: john benjamins, pp. – .
deuchar, m., davies, p., and donnelly, k. ( ). building and using the siarad corpus: bilingual conversations in welsh and english. manuscript.
deuchar, m., davies, p., herring, j., parafita couto, m. c., and carter, d. ( ). bilingual language use. in thomas, e. and mennen, i. (eds), advances in the study of bilingualism. bristol: multilingual matters, pp. – .
donnelly, k. and deuchar, m. ( ). using constraint grammar in the bangor autoglosser to disambiguate multilingual spoken text. in proceedings of the nodalida workshop constraint grammar applications, riga, latvia: nealt proceedings series, tartu.
herring, j., deuchar, m., parafita couto, m. c., and moro quintanilla, m. ( ).
‘i saw the madre’: evaluating predictions about codeswitched determiner-noun sequences using spanish-english and welsh-english data. international journal of bilingual education and bilingualism, : – .
karlsson, f. ( ). constraint grammar as a framework for parsing unrestricted text. in proceedings of the th international conference of computational linguistics, vol. : – , stroudsburg, pa. doi: . / . .
karlsson, f., voutilainen, a., heikkilä, j., and anttila, a. ( ). constraint grammar: a language-independent system for parsing running text. natural language processing, . berlin and new york: mouton de gruyter.
labov, w. ( ). some principles of linguistic methodology. language in society, : – .
lerdorf, r. ( ). php on hormones—history of php. mysql conference. santa clara, california. http://web.archive.org/web/ id_/http://itc.conversationsnetwork.org/shows/detail .html.
macwhinney, b. ( ). enriching childes for morphosyntactic analysis. department of psychology. paper . http://repository.cmu.edu/psychology/ .
macwhinney, b. ( ). the childes project: tools for analyzing talk, rd edn. mahwah, nj: lawrence erlbaum associates.
milroy, l. ( ). language and social networks. oxford: blackwell.
myers-scotton, c. ( ). contact linguistics: bilingual encounters and grammatical outcomes. oxford and new york, ny: oxford university press.
myers-scotton, c. ( ). common and uncommon ground: social and structural factors in codeswitching. language in society, : – .
papalexakis, e., nguyen, d., and seza doğruöz, a. ( ). predicting code-switching in multilingual communication for immigrant communities. in proceedings of the first workshop on computational approaches to code switching. doha, qatar, october , pp. – .
r development core team. ( ).
r: a language and environment for statistical computing. vienna, austria: r foundation for statistical computing. isbn - - - . http://www.r-project.org.
solorio, t. and liu, y. ( ). learning to predict code-switching points. in proceedings of the conference on empirical methods in natural language processing. honolulu, october , pp. – .

notes

the siarad corpus of welsh–english data is available under open license at http://bangortalk.org.uk.

at the time the corpus was being collected, the chat system was one of the most suitable choices (deuchar et al., ). currently, there are other options available for multilingual data, such as elan (https://tla.mpi.nl/tools/tla-tools/elan/), although as macwhinney explains in the childes manual (http://childes.psy.cmu.edu/manuals/chat.pdf), the chat data can be translated to xml which can then be used in elan, among other programs.

in welsh, as in the other celtic languages, some word-initial consonants change (‘mutate’) to reflect morphological and syntactic relationships between the words of the utterance. for example: siop llyfrau da (a shop [with] good books), but siop lyfrau dda (a good bookshop), where the change d -> dd signifies that the adjective da (good) relates to siop (shop) and not to llyfrau (books). llyfrau is itself mutated ll -> l to show that it qualifies siop. another example is seen here where mae o’n marw means ‘he is dying’, but mae o’n farw means ‘he is dead’. the change m -> f signifies that marw (die, dead) is the adjective and not the verb. these mutations have to be removed to get to the underlying lemma.

the scripts for the constraint grammar rules for welsh are available at https://github.com/donnekgit/autoglosser.

d. m. carter et al. digital scholarship in the humanities, vol. , no. ,
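the mutation note above implies a lookup step: undo the initial-consonant change, then check which restored form is a known word. the following is a minimal python sketch of that idea for the soft mutation only. it is not the bangor autoglosser's implementation (the notes say that relies on constraint grammar rules); the function names, the reversal table and the four-word lexicon are assumptions made for illustration.

```python
# Welsh digraphs that count as a single initial letter.
DIGRAPHS = ("ch", "dd", "ff", "ng", "ll", "ph", "rh", "th")

# Soft mutation read backwards: mutated initial -> possible radical initials.
# 'f' is ambiguous (it can come from 'b' or 'm'). The g -> zero case and the
# nasal and aspirate mutations are deliberately left out of this sketch.
SOFT_REVERSALS = {
    "b": ["p"],
    "d": ["t"],
    "g": ["c"],
    "f": ["b", "m"],
    "dd": ["d"],
    "l": ["ll"],
    "r": ["rh"],
}


def split_initial(word):
    """Split a word into its initial letter (digraph-aware) and the rest."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph, word[len(digraph):]
    return word[:1], word[1:]


def candidate_lemmas(word, lexicon):
    """Return the forms of `word`, as given or de-mutated, found in `lexicon`."""
    initial, rest = split_initial(word)
    candidates = [word] + [radical + rest for radical in SOFT_REVERSALS.get(initial, [])]
    return [form for form in candidates if form in lexicon]


# Toy lexicon built from the examples in the note (hypothetical data).
LEXICON = {"siop", "llyfrau", "da", "marw"}

for token in ["siop", "lyfrau", "dda", "farw"]:
    print(token, "->", candidate_lemmas(token, LEXICON))
```

with a realistic lexicon more than one candidate can survive (an f- form may match both a b- word and an m- word), and choosing between them is the kind of decision that context-sensitive disambiguation rules are needed for.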
welcome to the colored conventions project
memo of understanding
project contact information
pedagogy
    quality control protocol
    sample undergraduate assignments
    sample graduate seminar assignment
technology
    instructions for publishing your biographical essays
    video tutorials for project technologies
    additional omeka resources
project research guides and guidelines
    research guides
    guidelines for sources and citation
welcome to the colored conventions project

thanks to you and your students for sharing your collective energy and intellectual acumen as we seek to learn more about the historic collective efforts of the colored conventions movement and to bring buried nineteenth-century black organizing to digital life.

“colored conventions in a box” is a curricular package that supports instructors as they engage in teaching that transforms the minutes of the convention they’ve chosen to teach into a rich and engaging series of cultural biographies and visual artifacts.

in the “pedagogy” section, you will find all of the resources necessary for your class unit on a convention, including sample assignments for lower- and upper-division undergraduate classes as well as for graduate seminars. the first entry, quality control protocol, explains the process that we have developed to ensure our project meets the highest standards in scholarly research.

in the “technology” section, you will find all of the resources necessary for your students to publish online using a digital humanities platform called omeka. this includes written and video instructions for uploading, as well as further resources that you might share with your students so that they can further explore the world of digital scholarship and engagement.

lastly, in the “project research guides and guidelines” section, you will find links to extraordinarily useful research guides developed by university of delaware’s research librarians in order to offer your students research guidance in one central location. we have developed a protocol so these guides can be adopted by your librarians as well. these guides compile a host of resources, including census directories, digital newspapers, image databases and quite useful collections of nineteenth-century letters, some of which are available through university subscriptions.
in this section you will also find the guidelines for sources and citations that will help us ensure best practices for citing sources, including works cited, and respecting intellectual property.

please do not hesitate to contact us with any questions about colored conventions in a box by writing to us at info@coloredconventions.org. we look forward to your and your students’ contributions to this exciting project!

yours,
the colored conventions team at university of delaware

memo of understanding

the colored conventions project attends to issues of race and gender equity and bias, historical and present-day. our commitment, therefore, requires that we confront the underrepresentation of women in the convention minutes and articulate their substantial contributions to reform and organizational movements of the nineteenth century.

because our project seeks to produce knowledge on the african-american experience online as a form of public engagement, students can publish assignments to our online site, coloredconventions.org. students can request to publish work online anonymously or can request removal by writing to info@coloredconventions.org.

by signing this memo of understanding, you agree to the following:
• our shared commitment to recovering a convention movement that includes women’s activism and presence, even though it is largely written out of the minutes themselves. this means that for every delegate you assign, students should also be assigned an associated woman, such as a wife, daughter, sister, fellow church member, etc.
• deliver each student biography (of a person or place) to ccp
• provide a student roster for omeka accounts for uploading
• fill out the quality control spreadsheet
• share our opt-out policy and inform students that their work may be published online though ccp reserves the right to remove and edit material as needed
• use the research guides (letter collections, databases, census records, etc.) provided
• respond to our assessment request about your “colored conventions in a box” teaching experience

in return for your commitment, we provide support in the following areas:
• quality control protocol
• library research guides, including bibliography of helpful primary sources
• video and written instructions on publishing to colored conventions website
• critical secondary literature (upon request)
• documentation of useful teaching methods and tools (e.g. writing assignments, lesson plans, group activities, and more)

by signing below, i acknowledge that i have read and agree to the colored conventions project’s terms and conditions. the ccp requires a signed copy of the mou before curriculum, and a completed online form at http://coloredconventions.org/memo-of-understanding. please return to gforeman@udel.edu.

name:
date:

project contact information

general contact: info@coloredconventions.org
faculty director: p. gabrielle foreman, gforeman@udel.edu
executive committee (faculty and graduate students):
p. gabrielle foreman, gforeman@udel.edu
jim casey, jccasey@udel.edu
sarah patterson, sarahp@udel.edu

pedagogy

quality control protocol

please review ccp’s quality control protocol for quality assurance of materials published to coloredconventions.org.
1. faculty partners can help the project greatly by ensuring high standards of research and writing in student biographies.
2. we ask instructors to make sure students document their research, preferably in mla formatting.
3. any use of images must be accompanied by complete citations from sources and publishers. please guide your students to seek out images and other media in the public domain. (for assistance finding usable images, please see the research guide and example biographical entries.)
4. before we can make any biographies publicly viewable, instructors must fill out the quality control spreadsheet provided by the ccp team.
this spreadsheet serves to advise us on which biographies meet academic standards and which are not recommended for publication.
5. the ccp team reserves the right to revise each student’s work on the site in the interest of quality and continuity.
6. all work on the site will be attributed. to ensure proper attribution for all contributing authors, please advise each student to sign every entry they create using the following format.
   a. student: emily johnson. taught by: gabrielle foreman, university of delaware, spring, .

sample undergraduate assignments

lower-division courses

exploratory essay ( - words)

this essay is designed to kick-start the initial research-intensive portion of the semester and will prepare you for the longer “research essay” that will follow. using your chosen colored convention as a case study, we will explore the relationship between history, technology and primary resources. in this paper, you will explain your research process as you searched for information about your assigned delegate, connected woman or cultural institution and construct a thesis concerning its importance alongside a theme, debate or the broader convention movement. the conclusion of your essay should offer an argument (thesis) that has the potential to be developed into a longer, analytical essay.
1. introduction: open the essay with an overview of your delegate’s involvement in the convention and the type of sources you will consult for more information (newspapers, letters, history books). describe why these texts matter to the process of discovering more information about your subject.
2. body: in the body of the essay, describe the primary items you find during your search. what is the nature of the content? where did you discover these texts?
3. closing: close your essay by posing a tension-filled thesis about the relationship between a delegate, connected woman, debate, or theme present in the minutes.
describe how the information you discovered impacts your argument. here, i want you to focus on how your research leads you to an argument.

sample themes: gender, education, civil rights (voting rights), travel, religion

delegate write-up ( - words)

trace your delegates’ social, geographical, organizational and familial networks, their geographical mobility (travel) and literary production (written contributions). if you’re assigned a location, trace its cultural history and use. you’ll use multiple newspaper databases, consult volumes of collected letters as well as images to develop a full visual and textual profile of your delegates. this essay should begin in the year of the convention that is your focus and give an overview of the lives of delegates starting at that time. instead of creating a chronological “biography,” your write-up will use your delegate’s concerns in that year as the central point from which all of the other information you share flows.

biographical essay peer review worksheet

focus on each person’s paper one at a time, working to find the answers to these questions together. begin by reading and annotating their essay with your feedback. then, working together, find or develop answers to the following questions on a separate sheet of paper.
1. does the name of the author of the biography appear anywhere on the page?
   a. example: “student name. taught by: instructor, full institution name, semester year.”
   b. this work is part of a larger digital humanities project about the colored conventions and will be attributed to you. if you do not want to be included after the assignment, write to info@coloredconventions.org with the subject line “yourname.optout”
2. what were the circumstances and major events of the lives of these individuals and how did these forces lead these individuals to become delegates at or connected to delegates at this convention?
3. why or how was this person involved with or connected to the convention or the convention’s concerns?
4. what were the delegate’s role and actions, or lack thereof, at the convention?
5. circle every historical reference (associations, causes, debates, issues, organizations, other conventions, churches, significant events, etc). does the biography fully explain what these are, and why they matter?
6. what gives the delegate’s life historical significance that differs from other delegates?
7. what areas in the biography require further research? are there gaps of information?
8. what can you tell about the author’s research? offer - suggestions about other tactics or approaches or topics that they might additionally consider.
9. what revisions do you, as a stand-in for the general-interest online reader, think would make the essay more readable?
10. looking at the page online, what do you notice about the design or layout? are there images? could the use of images appeal to potential audiences?
11. if images can be used, are they displayed randomly or deliberately?

upper-division courses

background: for the thirty-five years that preceded the civil war, free and fugitive blacks came together in state and national political conventions to strategize about how they might achieve educational, labor and legal justice at a moment when black rights were constricting nationally and locally. the delegates to these meetings include the most well-known, if mostly male, writers, organizers, church leaders, newspaper editors and entrepreneurs in the canon of early african american leadership, and many whose names and histories have long been forgotten. all that is left of this phenomenal collective effort are the minutes. even these materials are rare and can only be accessed through out-of-print volumes.
assignment description: in addition to official colored convention delegates, student researchers will be assigned women (wives, daughters, nieces, sisters) who are connected to male delegates or a convention “place” which can be researched using the same databases and scholarly resources. though women are under-represented in the delegate minutes, they are deeply involved in reform movements and are also involved in the businesses and in struggles for educational justice that the conventions discuss. be certain to begin your biography in the era you’re studying, in philadelphia if that’s the convention you’re assigned, or in california if you’re examining that later convention.

delegate write-up ( - ) + links + full bibliography. trace your delegate’s social, organizational, entrepreneurial and familial networks, their geographical mobility (travel), writings and other cultural production (songs they composed, art they produced, patents they filed, businesses they started, inventions they contributed). you’ll use multiple newspaper databases, consult volumes of collected letters, and find images to develop a full visual and textual profile of your delegates. your write-up should begin: “in xx (the year of the convention you’re studying). . .” and go on to develop an overview of delegate/affiliated woman/place starting at the time of the convention. instead of creating a chronological “biography,” your write-up will use your delegate’s concerns in the year of the convention you study as the central point from which all of the other information you share flows.

sample graduate seminar assignment

background: each student is assigned two delegates. as we note the gendered silences that convention minutes produced at the moment of archive/event production, we are deliberate in our effort to challenge such silences at the moment of event revisitation.
as a result, in your research, one of your goals is to resuscitate a woman connected to your delegates whose actions and activism undergird the movement’s goals. that person is often a family member, but is sometimes a close associate. so in total you’ll have four research subjects: two pairs, each of which includes a delegate.

what to do and bring to class when you present
• choose one of your two pairs to report on.
• “map” your delegate over time. because we’re interested in circuits of mobility, activism and enterprise, please provide dates and locations which allow you to plot your delegate’s travels and transplantations. be as specific as you can when you can be (addresses of businesses or homes you find in city directories, in solid secondary sources, in census records would be useful, for example).
• identify your delegate pair’s social network. who is in your delegate’s posse, how are they connected and for what duration? in other words, bring in a list of their professional and familial networks. can you name the social, political and geographical ties that link them?
• bring two copies of your work. members of the colored convention digital humanities omeka action team may be in class to help upload the information you’ve found.

write up due on (fill in here): on omeka. see omeka uploading video at coloredconventions.org. also please hand in a write up to your professor (share details here).

useful hints and research tools: almost all of the resources are available on the colored conventions library resource page.
• use multiple spellings when searching names (e.g. william, wm, wm.; robert, robt.); names and initials may appear one way in the convention minutes and another way in databases and articles.
• look for ads based on your subject’s occupation.
• note the methodological differences used in different secondary sources, produced at different times, intended for various readerships.
• get acquainted with ancestry.com (directories, census records, marriage/death, etc. available through ud’s database under ancestry library)
• library of congress daniel murray pamphlet collection http://memory.loc.gov/ammem/aap/aaphome.html
• the digital schomburg african american women writers of the nineteenth century http://digital.nypl.org/schomburg/writers_aa /

sample spread sheet (we will send your specific year to you)
you fill in the students’ names as you assign delegates and places of interest

delegate | name | email
watkins, william j. (ny, associate ed. fd paper) reason, charles l. (professor, activist, mathematician). | student | student.edu
robert morris (ma, lawyer) rachel cliff (pa) | student | student.edu
mary ann shadd (canada, editor) e.p. rogers (elymas payson, rev. nj) | student | student.edu
remond, charles lenox (salem, ma; speaker, business; (sarah remond) francis a. duterte (pa) | student | student.edu
william j. wilson (brooklyn; “ethiop,” frederick douglass paper. anglo african (on fiche here). josiah c. wears (pa, father) (sometime weare) and isaiah. second delegates. | student | student.edu
douglass, robert, jr. phl artist. grace bustill (mother), sarah mapps douglass (sister). robert, sr. (rev and barber) jr. has a photo studio on arch street near th. dr. bias, j.j. gould (pa) | student | student.edu
ray, charles bennett, (rev. ny) (cordelia ray). your principal focus. leonard a. grimes (second delegate, ma) | student | student.edu
william douglass (rev, pa) elizabeth armstrong (pa) | student | student.edu

sample syllabus that highlights colored convention teaching unit

week : march and .
t: reading: minutes of the national negro convention in philadelphia. assign delegates.
th: in-class uploading workshop. you’ll learn how to upload images, links, create metadata, tag information etc. by next thursday after class, you must send in required information to coloredconventions@gmail.com and gforeman@udel.edu and have uploaded all of your information. help is available.
presentation: reading: eric gardner, introduction to unexpected places, sakai or emailed by professor.

week : march and .
t: delegate presentations. trace your delegates’ social, geographical, organizational and familial networks, their geographical mobility and literary production. you’ll use multiple newspaper databases, consult volumes of collected letters, images, census and marriage records to develop a full visual and textual profile of your delegates. prepare a -minute powerpoint on your delegate and the person you’ve chosen as part of their social network.
due saturday: long assignment on conventions. see hand-out.

technology

instructions for publishing your biographical essays

updated august ,

when you have finished composing your biographical essay, you will publish it on a public website using a software program called omeka. please follow the steps below.

logging into coloredconventions.org:
• go to coloredconventions.org
• scroll down and click “team members click here to login.”
• enter your login username and password (you can get these from your instructor).

creating a biography page:
1. click on the tab at the top of the page labeled exhibits.
2. find the exhibit titled “ philadelphia convention” and click to edit the exhibit.
3. scroll down to the bottom of the page to the section named “sections and pages.”
4. click on the link that says add a page.
5. this will open up a page titled page metadata.
6. in the box for title, write the delegate’s full name. skip the box marked slug.
7. choose a layout. if you have many or a few images, choose the one that looks best.
8. click save changes.
9. the page displayed is the place to add in your text. the numbers on the page layout correspond to the boxes under page content. for example, box on the template will show the text entered into box under page content.
10. to insert a link, highlight any text and click the button titled insert / edit link.
11. at the bottom of the page, click save and return to section.

adding images to use on a biography page:

if you want to include images with your biography, follow these guidelines. in omeka, you must first add an item to upload a file to the site. then you can attach that item to the site.

add an item to the site
1. click on the tab at the top of the page labeled items.
2. click add an item.
3. the following instructions apply to each of the tabs on the add an item page. complete these before clicking the green button to add an item.
4. dublin core > follow the instructions that appear below.
   • title: a name given to the resource. typically, a title will be a name by which the resource is formally known, such as “portrait of william nell” or “the colored patriots of the revolutionary war.”
   • description: an account of the resource. description may include but is not limited to: an abstract, a table of contents of a longer work, a graphical representation, or a free-text account of the resource.
   • creator: the name of the person or organization that originally created the historical thing. examples of a creator include a person, an organization, or a service.
   • publisher: give the name of the resource where you found this item during your research process. for example, list the website, database, or books where you found this item. if at all possible, please include a url link to anything that appears online.
   • date: the point or points of time in the lifecycle of this item. please follow this format for dates: year-month-day to look like this: - -
   • contributor: write in your name as the person who has contributed this item to our project. if you do not wish to have your name or written work published, please write to ccp at info@coloredconventions.org with the subject line “yourname.optout” and your name or writing will be removed. please include your first and last name, university title, semester, and instructor, ex. student: emily johnson.
taught by: john freeman, university of delaware, spring, .
5. item type metadata > choose the type as document, still image, etc.
6. collection > philadelphia colored convention
7. files > choose file to upload the document.
8. tags > this will auto-suggest your choices. if none seem to fit, tag with their names (including incorrect spellings of names), locations, and all other relevant tags.
9. map > enter in the address, if available.
10. skip web map service

inserting an image into a biography page

go back to the editing interface where you created a page before by clicking exhibits > edit (next to philadelphia colored convention) > edit (next to the page you already created).
1. click the green button named attach an item
2. click show search form to search by title or collection for the file you already uploaded.
3. click on an item to highlight it and then click attach selected item.
4. when finished attaching all images, click save and return to section.

video tutorials for project technologies

contributing students and instructors may consult video tutorials for publishing your biographies and adding images to the site.
how to publish your biography: https://vimeo.com/
how to add an image to your biography: https://vimeo.com/

additional omeka resources
a brief introduction to omeka
screencasts from the makers of omeka
omeka.net offers omeka sites for free - start your own project!
using omeka by the florida state libraries
ideas on using omeka @ teaching history

project research guides and guidelines

research guides

the colored conventions project provides research guides available for faculty partners online.
for faculty partners at the university of delaware: http://guides.lib.udel.edu/coloredconventionsatud
for national faculty partners: http://guides.lib.udel.edu/coloredconventions

guidelines for sources and citation

images: if you find images on any of the library databases, try to replicate that find in the library of congress, google books, documenting the american south, project gutenberg (www.gutenberg.org), hathitrust.com, or the schomburg library (digital.nypl.org/schomburg/images_aa /); google images (images.google.com) allows you to paste in images to search for similar images.
• always indicate where you got the item. link to the website or source.
• make sure you include “specs” for images, that is, include the dimensions of the original image if available. please do not crop. please use the largest .jpg you can possibly get.

works cited/bibliography: because this information is hard to piece together, as you know, and can come from sources that don’t get it exactly right (as some of you have found out), it’s important to include a complete bibliography as you put together your delegate write-up.
• in addition to your conventional bibliography, create a “works cited page” that includes live links if you got information from a web source. if you find a frontispiece (the image of the front of a book or pamphlet), include that link.
• put your name, and the class, and the date of your work on every page. example: meredith sobel, engl , - - . your work will be part of a larger digital humanities project about the colored conventions, and will be attributed to you. if you do not want to be included, you should write to gforeman@udel.edu and coloredconventions@gmail.com with the subject line “yourname.optout”

world journal of computer application and technology ( ): - , http://www.hrpub.org doi: . /wjcat. .
knowledge management practices and performance of academic libraries: a case of mount kenya university, kigali campus library

wamalwa lucas wanangeye, benard omallah george*
department of library, mount kenya university-kisii campus, kenya

copyright© by authors, all rights reserved. authors agree that this article remains permanently open access under the terms of the creative commons attribution license . international license

abstract this research sought to determine, assess and evaluate knowledge management practices and performance in academic libraries. the researcher chose mount kenya university, kigali campus library as the case study. the general objective of the study was to understand the knowledge management practices used in enhancing the performance of academic libraries. specifically, the researcher identified the major drivers of knowledge management (km) practices in academic libraries; analyzed the km activities needed to prepare the academic library for proper km practice; and determined the challenges that academic librarians might face in implementing km. the study population included all the library staff in the university, and the researcher used a census covering this total population. data were collected using questionnaires. data analysis involved editing to verify the coherence of respondents’ answers to the questionnaires, coding to summarize and simplify the processing of data, and finally graphical representation to produce statistical frequency distributions.

keywords knowledge, knowledge management (km), academic libraries, information

1. introduction

knowledge has been increasingly seen as a key competitive resource in organizations and this has influenced selection and recruitment practices in many organizations.
indeed, as davenport and prusak ( ) reported: ‘companies hire for experience more often than for intelligence or education because they understand the value of knowledge that has been developed and proven over time’. knowledge can be either explicit or tacit; it is the implicit or tacit knowledge, the conceptual knowledge that comes from experience and gives rise to ‘wisdom’, that organizations seek in order to add value to their processes. the conversion of implicit into explicit knowledge makes a powerful contribution to sustainable competitive advantage for organizations. but this knowledge alone will not foster a learning organization; rather, it is through the sharing of knowledge that organizational learning is facilitated.

academic libraries have been transformed drastically: from machine readable catalogue (marc) and circulation desk to metadata and web information, from print collection and inter library loans to online databases and e-resources, from quiet areas to learning and knowledge commons, from bibliographic instruction to information literacy and life-long learning, and from information management to knowledge management. accordingly, the roles of academic librarians have changed radically for both library practitioners and library school educators. they are no longer traditional information protectors and managers. open access, knowledge management, digital scholarship and institutional repositories are all often owned by libraries and librarians.

knowledge management (km) has increased in popularity and credibility as a management tool, as well as a research discipline, over the past decade. there have been concerns about whether km is simply a fad, and researchers and academics have debated its fad-like characteristics. this paper adopts the view that km is certainly not a fad, for several reasons, and agrees with stankosky’s view that one of these reasons is that the knowledge economy is here to stay (stankosky, ).
knowledge management is therefore said to be slowly but surely capturing the attention of many organizations in a quest for competitive advantage and service delivery (boahene, ). the question of whether universities are ready for knowledge management practices is the perspective from which the researcher investigates the use of knowledge management practices in enhancing service delivery within higher education, and presents the nature of academics and universities and the related challenges for km implementation within this context.

world journal of computer application and technology ( ): - ,

a review of the library literature on km in libraries reveals that all types of libraries are applying some km principles in the provision of library services. townley ( ) pointed out that special libraries, especially business and corporate libraries, are taking the lead on km research; and academic libraries, public libraries and digital libraries are in the limelight. the literature review also reveals that within academic libraries, public services are taking the lead in the research and implementation of km (wen ).

mount kenya university is one of the most recognized and fastest growing private universities in east africa. just like other universities, it offers courses in technology, business, health, communication, engineering and other faculties, both national and international. having received a customer service charter (csc), the university keeps its focus on its mission, vision and philosophy in an effort to satisfy customers without losing sight of the expectations of other stakeholders, according to the mku charter ( ). it is from this perspective that the researcher chose mount kenya university, kigali campus as the case study.

literature review

according to a study carried out by sarrafzadeh, martin, & hazeri ( ), .
% library information science (lis) professionals regarded km as a survival factor for libraries to respond to challenges they face in a continuously changing environment. since km equips academic libraries with ample amenities to satisfy the incessantly changing library customer needs, it is a survival kit and a strategic tool for academic libraries.

increased visibility of libraries: libraries often have a poor image; they are not visible to their parent organization and work in isolation. the ultimate aim of km is to achieve an organization’s mission. therefore, all parts of an organization (including libraries) must ensure that km contributes towards the realization of the organizational mission and vision. adoption of km could assist library and information professionals in meeting user needs aligned with the organization’s strategic goals and objectives. in addition, km provides libraries with the opportunity to collaborate with other units in their organizations and hence become more integrated into corporate operations and enhance their overall visibility within the organization (sarrafzadeh, martin, & hazeri, ). thus, km endows academic librarians with various platforms to collaborate with academia, such as playing a leading role in electronic and open access publications by providing guidance on copyright issues, and self-archiving published articles in institutional repositories.

academic libraries are perceived as knowledge creating organizations, as a system of integrated activities and business processes that work together collaboratively to facilitate accomplishing overall organizational goals (daneshgar & parirokh, ). academic libraries are the treasure house of knowledge to cater for the needs of scholars, scientists, technocrats, researchers, students and others who are in the mainstream of higher education (guru et al, ).
librarians are acknowledged as knowledge creators through content management, organization of knowledge, and evaluating the validity and reliability of information obtained from unfamiliar sources (sinotte, ). librarians bring a set of values that are fundamental to the long-term survival of scholarship. librarians care about access and understand that some resources may have value to disciplines and time periods beyond their initiation (case, ). academia stimulates the creation and transmission of knowledge, and academic libraries have played a significant role in supporting such activities (kim & abbas, ). thus, academic libraries are knowledge creating and knowledge-based organizations. debowski ( ) puts emphasis on the need to cultivate new knowledge competencies through the development of appropriate work-based learning programmes for librarians as early advocates of knowledge management.

increased value of knowledge in the knowledge economy: in a study undertaken by roknuzzaman & umemoto ( ), the knowledge economy was considered to be one of the important drivers for libraries’ movement towards km. these authors noted that the value of knowledge has always been central to library practice, but that the new knowledge-based economy gives it more significance than ever before. according to the experts, human knowledge is doubling every thirty-two hours. as a result, we are in a state of information overload and decay of existing knowledge, which is continuously replaced with new knowledge. according to israel ( ), this information explosion affects library users in a variety of ways; it damages health, leads to bad decision making and creates information anxiety. in the same way, the information explosion confronts university librarians with many challenges, such as selection and acquisition of library resources, organization of acquired resources, collection development, cataloguing, and reference services.
at the same time it enables users to select from a wide range of resources (israel, ), which creates competition. information explosion and knowledge growth call for innovative approaches to manage the right knowledge. since km emphasizes updating knowledge regularly in order to remove obsolete information and make the most up-to-date information available, academic librarians using km systems can overcome the problem of information explosion to a greater extent.

research methodology

the researcher used both descriptive and analytical research designs based on both qualitative and quantitative data. issues related to knowledge management practices and performance in academic libraries, especially mku, were described and analyzed. the target population of the research was the general library staff of the mount kenya university library, especially full-time staff. the researcher decided to use the census technique because the targeted population, the librarians, was small. the researcher used questionnaires, interviews and observation to collect data.

results and discussions

all respondents were employees of mku library kigali campus and each had an email address. they all had computers and internet access at the time of this study. of the respondents, six were male ( %) and four were female ( %). of the ten questionnaires distributed, only seven ( %) were returned; three ( %) of the recipients did not respond. table represents the gender distribution of the respondents.

table . gender distribution of the respondents
gender | frequency
male
female
total

as shown above, male respondents accounted for % and female respondents for %, giving a total of ( %).

figure . major drivers of km practices in academic libraries.

table . respondents rating averages on major drivers of km in academic libraries
statement | strongly disagree | disagree | neutral | agree | strongly agree | mean | std. deviation
to improve library services
to improve library productivity
to produce more with less due to dwindling library budget
to leverage existing knowledge
achieve the library goals efficiently.
to manage information explosion
to make informed decisions
to establish best practices
to avoid duplication of efforts
improve my job performance.
enables me to react more quickly to change.
speeds up the process decision making

a cronbach’s alpha score of . indicates that the questionnaire was reliable. the table lists the major drivers of knowledge management in academic libraries. a rating average of . for leveraging existing knowledge indicates that most respondents were neutral rather than disagreeing or agreeing. a rating average of . indicates that most respondents agree that job performance is a major driver of km in academic libraries. respondents did not agree that improving library services is a major driver of km in academic libraries. these rating averages are depicted in the figure. a rating of confirms this, as many agree that job performance is a major driver of km. about respondents strongly disagreed with the statement that making informed decisions is not a major driver of km in academic libraries, while were neutral on whether leveraging existing knowledge is a major driver of km in academic libraries.

table . km practices needed to enhance the academic library for proper km practice.
statement | strongly disagree | disagree | neutral | agree | strongly agree | mean | std. deviation
facilitates strong culture of knowledge sharing.
focus on identifying personal expertise.
create system to capture the tacit knowledge of employees.
availability of knowledge enabling technology.
survey of knowledge within the library
focus on creativity and innovation
written knowledge management policy.
strong partnership with other libraries
identify knowledge required in next five years.
establish knowledge repository

to find out which activities are needed to enhance the academic library for proper km practices, questions about technology, innovation, creativity and expertise were raised. most respondents disagreed, with a mean of . , that km activities do not facilitate a strong culture of knowledge sharing. they however strongly agreed that creating a system to capture the tacit knowledge of employees is one of the km activities needed to enhance academic libraries for km practices. most of the respondents did not agree on the establishment of the knowledge repository, as most of them were not sure. these perceptions are reflected in the table.

table . challenges that might face the academic librarians in implementing km
statement | strongly disagree | disagree | neutral | agree | strongly agree | mean | std. deviation
constant budget decline
lack of incentives
inadequate staff training
limited expertise in km
lack of clearly defined guidelines on km implementation
insufficient technology
a lack of knowledge sharing culture
a lack of cooperation among juniors and seniors
there is no difference in job evaluation between those who practice knowledge sharing and those who do not.
lack of guidelines to support the sharing of knowledge.

a rating average of . reflects that the majority of the respondents agree that there is a lack of knowledge sharing culture among the librarians, posing a challenge to academic librarians in implementing km. an average rating of . also reflects that some respondents were neutral on the lack of clearly defined guidelines on implementing km. a few respondents also disagreed that lack of incentives cannot pose a challenge to academic librarians in implementing km. this is demonstrated in the figure.

figure . challenges that might face the academic librarians in implementing km

conclusions and recommendation

the study investigated km practices and performance in academic libraries where the operational culture of the library was not km. its purpose, to examine current library service in an environment where information was changing fast and where there was competition from other sources such as the internet, was achieved. after discussing the implications of km for the library, the suggestion made by wen ( ) can be a practical way of getting the km process in place: the librarian should consider himself/herself as the chief knowledge officer of the entire organization and should work together with the cio, heads of the planning department, the computer and information technology center, the human resources management department, the finance department, etc. to design and develop such a system.
such a knowledge management system should be built on existing computer and information technology infrastructures, including upgraded intranet, extranet, and internet, and available software programs to facilitate the capture, analysis, organization, storage, and sharing of internal and external information resources for effective knowledge exchange among users, resource persons (faculty, researchers, subject specialists, and so on), publishers, government agencies, businesses and industries, and other organizations via multiple channels and layers.

recommendation

it was recommended that a web . application such as delicious.com be used, since it enables the accumulation and organization of all resources as tags in an individual’s delicious.com account. resources discovered with the use of web-quests, for example, can all be organized in one place. the web-quest style of instruction has the potential to enable students to make material gathered on the web their own and to integrate the data from their own practical experiences into their constructive action projects, while at the same time providing further validation for their conclusions from mostly web sources. this requires a certain amount of creativity and critical/reflective thinking to be successful. faculty and librarians can provide coaching for this.

it was also recommended that, even when the library does not have enough manpower to monitor or carry out all the duties that a fully functional library could, library user feedback be used to improve products and services in the library. while some interview participants noted that the factors contributing to the inadequate state of library service included the negative attitudes of some of the faculty and their lack of awareness of the importance of library resources, the library can use existing know-how and collaboration in a creative manner for new applications.
the library can also continuously attempt to discover the service problems that cause gaps between targets and achievements. it is therefore practical for the library to try to counter dysfunctional beliefs within the university by utilizing multi-disciplinary teams to perform tasks and/or make decisions. additionally, through classifying documents, the library has the capability to integrate its knowledge across different subject areas and thus provide knowledge in a seamless manner.

references

[ ] abbot, r. ( ). subjectivity as a concern for information science: a popperian perspective. journal of information science, ( ): - .
[ ] abell, a. ( ). skills for knowledge environments. the information management journal, ( ): - .
[ ] abram, s. ( ). evolution to revolution to chaos? reference in transition. internet librarian international, ( ). available: http://www.infotoday.com/searcher/sep /abram.shtml (accessed / / ).
[ ] bailey, c.w. ( ). the role of reference librarians in institutional repositories. reference services review, ( ): - . available: http://www.digital-scholarship.org/cwb/reflibir.pdf (accessed / / ).
[ ] baker, l.m. ( ). observation: a complex research method. library trends, ( ): - .
[ ] barquin, r. ( ). what is knowledge management? knowledge and innovation. journal of the kmci, ( ): - .
[ ] caracelli, v. j. & greene, j.c. ( ). data analysis for mixed-method evaluation designs. educational evaluation and policy analysis, ( ): - .
[ ] carlin, a. p. ( ). disciplinary debates and bases of interdisciplinary studies: the place of research ethics in library and information science. library & information science research, ( ): - .
[ ] carpenter, c. & steiner, s. ( ). using web . technologies to push e-resources. available: http://smartech.gatech.edu/bitstream/ / / / -fri- _ .pdf (accessed / / ).
[ ] daud, s., rahim, r.e.a. & alimun, r. ( ). knowledge creation and innovation in classroom. international journal of social sciences, ( ): - .
[ ] davenport, t. h. & prusak, l. ( ). working knowledge: how organizations manage what they know. boston, mass.: harvard business school press.
[ ] mount kenya university schedule and statutes ( ).
[ ] stankosky, m. ( ) advances in knowledge management: university research toward an academic discipline, in stankosky, m. (ed.) creating the discipline of knowledge management. washington, elsevier butterworth-heinemann.

visual linguistic analysis of political discussions: measuring deliberative quality

valentin gold, department of politics and public administration, university of konstanz, germany
mennatallah el-assady, department of computer and information science, university of konstanz, germany
annette hautli-janisz, department of linguistics, university of konstanz, germany
tina bögel, department of linguistics, university of konstanz, germany
christian rohrdantz, department of computer and information science, university of konstanz, germany
miriam butt, department of linguistics, university of konstanz, germany
katharina holzinger, department of politics and public administration, university of konstanz, germany
daniel keim, department of computer and information science, university of konstanz, germany

abstract

this article reports on a digital humanities research project which is concerned with the automated linguistic and visual analysis of political discourses with a particular focus on the concept of deliberative communication.
according to the theory of deliberative communication as discussed within political science, political debates should be inclusive and stakeholders participating in these debates are required to justify their positions rationally and respectfully and should eventually defer to the better argument. the focus of the article is on the novel interactive visualizations that combine linguistic and statistical cues to analyze the deliberative quality of communication automatically. in particular, we quantify the degree of deliberation for four dimensions of communication: participation, respect, argumentation and justification, and persuasiveness. so far, these four dimensions have not been linked within a combined linguistic and visual framework; instead, each dimension helps determine the degree of deliberation independently of the others. since at its core, deliberation requires sustained and appropriate modes of communication, our main contribution is the automatic annotation and disambiguation of causal connectors and discourse particles.

correspondence: valentin gold, department of politics and public administration, po box , universität konstanz, konstanz, germany. e-mail: valentin.gold@uni-konstanz.de
konstanzer online-publikations-system (kops) url: http://nbn-resolving.de/urn:nbn:de:bsz: - -
published in: digital scholarship in the humanities ; ( ), . - pp. - https://dx.doi.org/ . /llc/fqv

introduction

for the last two decades, notions of deliberative democracy have been intensively debated within political science and related fields. in recent years, deliberation research has experienced an empirical turn (chambers, ).
in particular, the deliberative quality of communication and its consequences on the overall political decision-making process has attracted attention, partly in light of highly public resistance to political agendas with respect to major development projects (e.g. airports, train stations, fracking). some natural questions that arise with respect to this are whether the deliberative quality of political discussions has an impact on the final decisions that are taken and whether a higher deliberative quality leads to greater acceptance of the final decision that is taken. particularly, does a higher deliberative quality of a political discussion lead to greater acceptance of the arguments that have been brought forward and of the final decision that is taken?

a crucial component for finding an answer to these questions is the determination of the deliberative quality of a discussion. the tools that have been developed so far within political science to measure the deliberative quality of a given discussion are based on the manual coding of deliberative categories and a subsequent statistical analysis of the categories coded. the discourse quality index, developed by steenbergen et al. ( ) is thus far the most prominent manual coding scheme (hangartner et al., ; thompson, ; lord and tamvaki, ). this procedure, however, is beset with several difficulties. one difficulty is that manual coding is comparatively (labor) expensive and takes a long time (e.g. black et al., , p. ). another is the lack of inter-annotator agreement. the categories developed so far often fall prey to subjective judgments of the human annotators, thus leading to a problematic amount of disagreement among annotators (black et al., , p. ; dacombe, ).

a desideratum in research on deliberative democracy is thus the automatic coding and analysis of political discussions according to criteria which reflect the deliberative nature of a discussion.
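to make the notion of inter-annotator agreement concrete, the sketch below computes cohen's kappa for two annotators who coded the same utterances. this is only an illustration: the article does not say which agreement measure the cited studies used, and the category labels and the function name are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Share of items on which the two annotators chose the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's category frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codings of four utterances (labels are assumptions, not the article's scheme).
annotator_1 = ["justified", "justified", "unjustified", "justified"]
annotator_2 = ["justified", "unjustified", "unjustified", "justified"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

a value near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance, which is the situation the manual coding schemes are criticized for approaching.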
in this article, we present an approach that draws on a combination of linguistics and visual analytics in the creation of an automatic annotation system that can be used for the analysis of deliberative quality. the approach is interdisciplinary and falls under the purview of ‘digital humanities’.

from the political science perspective, deliberation is defined as a communicative process that is based on an inclusive and constructive debate between the participating stakeholders (habermas, ; gutmann and thompson, ). hence, this definition refers to the notion of the procedural importance of how decisions are made. since deliberative decision-making is complex, we assume that deliberative quality is a latent construct (i.e. not a directly observable variable) consisting of several observable measures that can be used to approximate the overall deliberative quality of a discussion.

from the linguistic perspective, these observable measures mainly consist of linguistic features present in the communication. examples are rhetorical devices designed to make the communication as persuasively effective as possible, e.g. the use of linguistic features to establish a common ground/understanding between the discussants (rhetorical questions, use of inclusive pronouns such as ‘we/us’ rather than exclusive ones such as ‘i/you’, discourse particles, tag questions, etc.), the presentation of justified arguments (identified, for example, by the linguistic feature of causal connectors, e.g. ‘because’), verbs signaling speaker stance with respect to a certain topic (‘think, believe’ versus ‘know’ or ‘accept, reject’), but also indications of respect and politeness conveyed by one speaker to another (e.g. interruptions are generally known to signal disrespect and tend to be employed by more dominant speakers, see brown and levinson ( ) for more details on linguistic markers of politeness).
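the observable measures described above can be sketched as simple per-utterance counts. the following is a minimal illustration rather than the authors' system: the marker lists contain only the english examples quoted in the text, and the category names and the function name are assumptions made for this sketch.

```python
import re

# Marker lists limited to the examples quoted in the text; a usable lexicon
# would be much larger and language-specific.
FEATURES = {
    "causal_connector": {"because"},
    "inclusive_pronoun": {"we", "us"},
    "exclusive_pronoun": {"i", "you"},
    "stance_verb": {"think", "believe", "know", "accept", "reject"},
}


def count_features(utterance):
    """Count how many tokens of an utterance fall into each marker category."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    return {
        name: sum(token in markers for token in tokens)
        for name, markers in FEATURES.items()
    }


counts = count_features("We believe this plan works because it helps us.")
```

counts of this kind are surface cues only; they say nothing about whether a given ‘because’ actually introduces a justification, which is why the article pairs such shallow cues with deeper linguistic analysis.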
taking together the political science and the linguistic perspectives, we have identified four broad areas for which to calculate deliberative quality: participation (section . ), respect (section . ), argumentation and justification (section . ), and persuasiveness (section . ). among these areas, we focus mainly on the area of argumentation and justification in order to demonstrate our overall linguistic visual analytics approach.

in particular, we provide a computational implementation that automatically annotates corpora of deliberative communications with respect to linguistic and meta-linguistic features in each of these areas. our implementation combines a rule-based system that reflects deep linguistic analysis with more shallow natural language processing (nlp) approaches that include standard strategies such as keyword identification, topic modeling, and calculations of utterance length, but also innovative perspectives on the data. the annotated data are further processed in the visual analytics system that ( ) depicts structures in the data through adapted textual data mining algorithms; and ( ) allows an explorative and interactive access to the underlying data.

details on this are presented in section , after a brief look at related work in section and a description of our underlying methodology and data in section . section concludes the article.

related work

there are several interdisciplinary approaches which focus on understanding argumentation in discussions. here, we briefly discuss the ones closest to our research interests and distinguish them from our approach.

a digital humanities cooperation between communication sciences, computer science, and computational linguistics is currently looking at how the exchange of arguments plays out in unmoderated exchanges within social media such as twitter. this collaboration does not contain a political science component, nor does it contain a visual analytics component.
however, like the work described in this article, a considerable part of the overall effort is directed at identifying, understanding, quantifying, and analyzing linguistic features of argumentation found in the data. twitter data are quite different from the political discussions investigated here; however, both our efforts focus on discussions in which the language of communication is german, and we have found that both our efforts have so far identified similar linguistic features as being relevant for an overall analysis of argumentation (e.g. we have both identified rhetorically significant interactions between discourse particles and other parts of the grammar, see scheffler, ).

the ‘eidentity’ project is also a digital humanities project. it involves a collaboration between political science and computational linguistics. however, it does not include a visual analytics component, and its research topic is quite different. the overall goal of the ‘eidentity’ project is to identify collective identities: how they are formed and how these change over time. the project works with large amounts of text and seeks to identify semantic fields that reflect complex concepts within these large corpora (blessing et al., ). the semantic fields (semantic co-occurrences of words related to collective identities) are meant to provide an automated assistance to the work conducted by a human researcher.

additionally, the goal of the ‘epol’ project is to measure neo-liberal argumentation and trace its impact over time (wiedemann et al., ). this project is again a digital humanities project and represents a collaboration between political science and computer science. the latter includes visual computing as well as nlp. ‘epol’ is also concerned with the identification of arguments in a political context. one difference to our work lies in the type of data they use.
while we work with oral negotiations, ‘epol’ uses newspaper articles on certain predefined political topics. in contrast to oral discussions, newspaper articles represent written and edited language. another difference lies in the depth to which the data are processed. rather than aiming at a deeper linguistic understanding of features or argumentation, the project focuses on text mining, i.e. on using shallow approaches to extracting information from a given text. we use these approaches as well as deeper linguistic knowledge.

finally, docuscope (kaufer et al., ) provides a text analysis environment to determine rhetorical effects. the software allows classifying over categories of rhetorical effects, e.g. emotions or confidence. in general, docuscope allows incorporating any dictionary and visualizes the categories. docuscope provides a good starting point for any dictionary-based approach. our approach, however, goes beyond visualizing dictionaries and provides various tools determining the degree of deliberative quality.

data and methodology

in this section, we briefly describe our data and overall methodology. the linguistic visual analysis is applied to two different types of data: ( ) multi-party negotiations which are the result of simulations conducted in an experimental setting; and ( ) real-world examples of political communication. in what follows, we provide details for each of the disciplines involved in the work.

political science

one of the main challenges in the analysis of deliberation is the collection and analysis of data with regard to oral communications. most of the work conducted does not study the effects of synchronic face-to-face communications and instead tends to analyze asynchronic communication via digital means (see also the discussion in section on arguments in twitter data). for instance, sulkin and simon ( ) allow s of computer-based communication in order to analyze the effects on decision-making processes.
persson et al. ( ) allow face-to-face deliberation but do not analyze the deliberative quality of communication. instead, they focus on the individual effects with regard to legitimacy of the decision-making outcome.

in order to overcome this shortcoming, we have run a large number of simulation-gaming experiments. in these simulations, experimental subjects were asked to discuss the pros and cons of fracking and to decide unanimously whether fracking should be allowed in general or not. each experimental subject had to argue either in favor of or against fracking. to allow for a comparative analysis, the experimental subjects were provided with a predefined set of arguments. moreover, the experimental subjects had to answer surveys before and after the discussion. overall, we have conducted thirty-four experiments. each of the experiments lasted about . h, with a total time of h of group discussion. in most simulations, the maximum of h of discussion was fully used by the subjects. this provides us with the necessary amount of comparative data to test and evaluate the deliberative structure and content of political discussions. it also provides us with more data than is feasible to annotate manually.

for the purposes of this article, we demonstrate our automatic visual linguistic approach not with respect to the fracking simulations, but with respect to a real-world example of political discussion. yet, we have used the experimental data to identify relevant deliberative features which can be further extracted and analyzed in real-world discussion. in this article, we work with the publicly available (transcribed) data of the public arbitration that took place with respect to ‘stuttgart (s )’, a railway and urban development project in southern germany. the project includes the restructuring of the central station in stuttgart. ever since the project was officially announced in the late s, criticism was expressed.
it was not until the late s, however, that large demonstrations and protests with over , participants took place. the protests were directed mainly against the demolition of the existing central railway station. on september , hundreds of protesters were injured when the police tried to secure the beginning of the construction work. this triggered massive public outrage (and a change in the government). in response, the (new) government agreed to establish a public arbitration procedure to discuss the facts of the project with both supporters and opposition. between october and november , the public arbitration took place. within eight rounds of arbitration, supporters and opposition discussed the merits of the project. the discussions were broadcast live. the data we use to demonstrate the automated methods are the official transcripts that are available online. overall, this provides us with a corpus of around , utterances.

linguistics and computational linguistics

the need for an automated annotation of relevant linguistic markers in the political discussions poses a challenge for linguistics and computational linguistics. the challenge for linguistic analysis lies in identifying and understanding the linguistic markers that are relevant for measuring the deliberative quality of a discussion. in particular, while much work has been done on understanding the pragmatic import of linguistic features within english, there is very little previous work to draw on for german. for example, while in english polar questions and hedges are known to be used to signal a broad range of speech acts and speaker stance (e.g. lakoff, ; asher and reese, ), these do not feature prominently in our corpora. instead, an interaction between discourse particles (which english does not have) and other parts of the grammar, such as causal connectors, appears to play a large role in conveying pragmatic speech acts. in our work, we have thus concentrated on these.
once relevant linguistic features have been identified, a further challenge must be surmounted with respect to computational linguistics. computational linguistics is concerned with the automatic extraction of linguistic information from a given data set. while fairly reliable tools exist for the annotation of german data with respect to morphological analysis (schiller, n.d.), part-of-speech (pos) annotation and syntactic analysis (schmid, ; dipper, ), it is notoriously difficult to automatically and reliably identify information at the semantic and pragmatic level. in our work, we have thus concentrated on finding those linguistic markers that can be identified reliably via automatic methods and have implemented programs to annotate the corpora automatically with the relevant information.

in a first step, the data sets to be analyzed are converted into an xml-readable format. this guarantees the exchange of data across different platforms (interoperability) and facilitates the annotation and subsequent extraction of linguistic information, as we can make use of the hierarchical organization of information that xml provides. in a next step, the data sets are organized in terms of elementary discourse units (edus) (marcu, ). this step also bears its own challenges: for our purposes, and conforming with general current practice within computational linguistics, all lexical items between two punctuation marks are treated as belonging to one discourse unit. the data are then further annotated with morphological and pos information. these annotations at a very basic level of linguistic analysis then provide the input for our more sophisticated annotation layer. for example, where possible, we annotate edus as to what kind of speech act is being performed by the speaker. consider here just the example of ‘justification’. the primary linguistic marker for this is taken to be causal connectors such as ‘weil’, ‘deshalb’, ‘da’ (‘because’).
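The segmentation convention just described (everything between two punctuation marks counts as one discourse unit) and the first pass over causal connectors can be illustrated with a minimal Python sketch. The function names and the three-word connector list are assumptions made for illustration; they are not the authors' implementation, and the flagged words are only candidates that would still need disambiguation.

```python
import re

# Candidate causal connectors named in the text. This short list is an
# illustrative assumption; a real inventory would be larger.
CANDIDATE_CONNECTORS = {"weil", "deshalb", "da"}


def split_into_edus(utterance):
    """Treat all lexical items between two punctuation marks as one unit."""
    parts = re.split(r"[.,;:!?]+", utterance)
    return [part.strip() for part in parts if part.strip()]


def flag_candidates(edus):
    """Pair each unit with the candidate connectors it contains.

    A hit is only a candidate: words such as 'da' have other readings,
    so a later disambiguation step is still required.
    """
    flagged = []
    for edu in edus:
        tokens = edu.lower().split()
        hits = [token for token in tokens if token in CANDIDATE_CONNECTORS]
        flagged.append((edu, hits))
    return flagged


edus = split_into_edus("er ist grün, weil ihm schlecht ist.")
flagged = flag_candidates(edus)
```

Splitting on punctuation alone is deliberately crude; it mirrors the convention stated in the text rather than a full discourse parser.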
however, most of the causal connectors used in german are ambiguous between a complementizer reading and a different pos. for example, ‘da’ is also used as a spatial term meaning ‘there’. in order to disambiguate the occurrences, we integrate information about position in the clause and the pos of the elements surrounding the word in question. the overall work represents a fairly deep linguistic analysis of each of the edus.

the overall result of the linguistic and computational linguistic work is a corpus that is annotated with several different types of linguistic information. some of this information can be used as is, some of it can be used together with other information contained in the corpus as the basis for calculating the effect of further interactions between different elements in an utterance (concrete examples are presented in section . ), or with respect to the overall patterns found in the data. this is exactly what the visual analytics component does.

visual analytics

visual analytics has been defined as ‘the science of analytical reasoning facilitated by interactive visual interfaces’ (thomas and cook, ). within the field of digital humanities, the approaches of distant reading (moretti, ) and algorithm criticism (ramsay, , ramsay, ) share fundamental principles with visual analytics. in contrast to the longer-standing field of information visualization, where data (typically numerical data) are directly transformed into visualizations, visual analytics involves automated algorithmic analyses of the data before and after visualization. this procedure is described by the visual analytics mantra ‘analyse first, show the important, zoom, filter and analyse further, details on demand’ (keim et al., ). it has been shown that visual analytics approaches can be very beneficial to the analysis of language and linguistic data.
first, statistical and algorithmic analyses are performed on text data and then suitable visual representations are designed to show the outcomes of the analyses. illustrative examples are visualizations of vowel harmonic constraints within languages (mayer et al., ), cross-linguistic comparisons of linguistic features (rohrdantz et al., ), or approaches for tracking semantic change (rohrdantz et al., ).

only a few approaches exist that closely relate to our goals and tasks. first, for the analysis of conversation content, angus et al. ( a,b) suggest conceptual recurrence plots. all utterances of a multiparty conversation are displayed as rectangles along the diagonal of a triangle. the rectangles within the triangle indicate for each pair of utterances how much they relate in content. different patterns within the triangle indicate different kinds of concept recurrence, e.g. utterances that summarize the content of several previous utterances. second, nguyen et al. ( ) introduce argviz, a visualization system of the topical structure of multiparty conversations based on topic modeling. a strength of the system is that topic shifts can be spotted in topic columns, which are coordinated with further standard views on the discourse. their topic modeling strategy, however, requires a whole corpus of related multiparty conversations in order to be trained. topics and text content are not contained in one single display, but distributed over coordinated views.

both approaches help in obtaining insight into the topical structure of conversation. our goal is to go beyond that and incorporate further characteristics that are of relevance for measuring deliberation into our analyses. we therefore adapt and extend established visualization technologies and introduce novel approaches in order to enable an interactive exploration of deliberative communication.
quantifying deliberative quality

in this section, we will demonstrate how deliberative quality can be measured automatically using linguistically informed visual analyses. for each of the four pillars we have identified as being significant for the measurement (participation, respect, argumentation and justification, persuasiveness), we will first briefly introduce the assumed causal link to deliberation before the applied method and some examples are presented.

role and structure of participation

one of the basic characteristics of deliberative communication is equality in participation. within a deliberative discourse, each proponent should be treated equally, i.e. equality is an indicator of deliberation when all stakeholders are heard. conversely, if some stakeholders manage to achieve conversational hegemony, this indicates inequality in participation and, consequently, no deliberative communication (e.g. habermas, ; steenbergen et al., ; edwards et al., ).

a simple way of assessing participation is to calculate the share of each individual in a multiparty conversation. the number of turns and the turn lengths can be measured; for a high deliberative quality, these should be equally distributed among the participants. beyond numbers indicating an equal or unequal participation with respect to a whole conversation, it is further interesting to inspect the course of the conversation more closely. do individuals only participate in certain phases of the conversation? how does the turn taking evolve? do certain persons tend to respond to certain others? are there sections with dialog structure within a multilog, i.e. a dialog with more than two participants? this is something that cannot be easily grasped by merely computing numbers, and it can therefore profit from the strengths of visualization. in some cases, visualizations may also reveal unexpected or unknown conversational patterns at a glance.
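The numeric side of this assessment is simple to compute. The following Python sketch derives each speaker's share of turns and mean turn length, plus a count of who takes the floor after whom as a rough proxy for the question of who responds to whom. The input format (a list of speaker and text pairs) and all function names are assumptions for illustration, not part of the system described in the article.

```python
from collections import defaultdict


def participation_stats(turns):
    """Compute turn share and mean turn length (in words) per speaker.

    turns: list of (speaker, text) pairs in conversation order.
    """
    turn_counts = defaultdict(int)
    word_counts = defaultdict(int)
    for speaker, text in turns:
        turn_counts[speaker] += 1
        word_counts[speaker] += len(text.split())
    total_turns = sum(turn_counts.values())
    stats = {}
    for speaker in turn_counts:
        stats[speaker] = {
            "turn_share": turn_counts[speaker] / total_turns,
            "mean_turn_length": word_counts[speaker] / turn_counts[speaker],
        }
    return stats


def who_follows_whom(turns):
    """Count how often one speaker takes the turn directly after another."""
    transitions = defaultdict(int)
    for (previous, _), (current, _) in zip(turns, turns[1:]):
        if previous != current:
            transitions[(previous, current)] += 1
    return dict(transitions)


example_turns = [
    ("a", "one two"),
    ("b", "three"),
    ("a", "four five six seven"),
    ("b", "eight"),
]
example_stats = participation_stats(example_turns)
example_transitions = who_follows_whom(example_turns)
```

Equal turn shares combined with very unequal mean turn lengths would already hint at an imbalance that a single aggregate number hides, which is the kind of pattern the visualizations are meant to expose.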
the topical structure of a discourse is also relevant when investigating participation. it may be the case that the participants of a multilog have an equal share in terms of turn numbers and speaking time, but that their contributions are distributed over quite different topics. we are also interested in cases where each participant perhaps tries to push their own topics of interest and does not respond to topics raised by others. instances of such a development in a multilog indicate a lower deliberative quality, as all topics should be treated equally.

fig. visualization demonstrating the topic distribution and basic statistics for each participant in the s arbitration. the saturation in the left matrix indicates the relative frequency of the topics as automatically learned using mallet (mccallum, ). the bar chart at the right side of the figure indicates the number of turns (length of the bar) as well as the average turn length from short (blue) to long (red). the figure is sorted by the participants’ position toward the s project and additionally by the number of turns.

how the topical preferences and the participation of different speakers go together is again best analyzed using visualizations. for example, visualizations can show who participated to which extent in the elaboration of which topic. this may be analyzed by aggregating proportions over the whole discourse. not only topic proportions of individuals but also topic proportions of opposing camps are of interest, for example the sets of participants in favor of and against the construction of the s train station. this can be achieved by generating views with a matrix or table structure. for instance, in fig. , the topic distribution as well as some basic statistics are shown.
the visualization makes it possible to determine, within each row, which topics the participants contributed the most to and, within each column, whether there are topics that are discussed by many participants or whether there are specific topics that are only mentioned by single participants. in particular, the participants at the top of each category contribute to nearly all topics, the participants at the bottom only to single topics. in combination with the colored bar chart at the right side of the figure, our approach gives an overview of the degree of thematic equality in participation.

going beyond aggregated views, representing the participation structure together with the topical structure over the course of a multiparty conversation is an even more interesting challenge for visualization research. in order to address this challenge, we have come up with several options. first, fig. introduces a novel visualization, showing how turn lengths and speaker participation develop over the course of day of the s arbitration. the rationale for this figure is to demonstrate patterns over the course of a discourse and to identify various types of communication. for instance, in fig. , a segment of intense dialog between the arbitrator heiner geißler and the representative of the german railway company, volker kefer, can be identified. in the subsequent segment, various blocks can be seen indicating long monologs (i.e. presentations) by external experts. finally, after the presentations were given, the floor was opened for discussion.

second, we automatically identify sections of the discourse where one topic is highly dominant and then label these sections in order to represent them visually. the label of a topic section contains up to five words, which correspond to the most frequent words belonging to the topic within the given section. fig. shows the statistically most significant topic sections of the complete s discourse.
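The labeling step can be sketched in a few lines of Python. The sketch assumes the section has already been detected and tokenized and that the set of words belonging to the topic is available from a trained topic model; both inputs and the function name `label_section` are hypothetical and only illustrate the "up to five most frequent topic words" idea.

```python
from collections import Counter


def label_section(section_tokens, topic_words, max_words=5):
    """Return up to max_words of the most frequent topic words in a section.

    section_tokens: tokens of one detected topic section (assumed input).
    topic_words: set of words assigned to the dominant topic (assumed input).
    """
    counts = Counter(token for token in section_tokens if token in topic_words)
    return [word for word, _ in counts.most_common(max_words)]


section = ["kosten", "euro", "kosten", "bahnhof", "euro", "kosten", "geld"]
label = label_section(section, {"kosten", "euro", "geld"})
```

How the dominant sections themselves are found is not covered here; the article only states that they are identified automatically.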
in a next step, the views of figs and can be integrated in order to show how topics and speakers are connected (cf. gold et al., ). fig. provides an example. again, the strategy is to let the computer automatically detect, structure, and display characteristics of the discourse in order to support the analysis process of the researcher.

respect

mutual respect in terms of reciprocity is seen as a prerequisite of deliberative communication (e.g. gutmann and thompson, ; fishkin and luskin, ; gastil and black, ). reciprocity requires both speakers and listeners to treat one another with respect and equal concern, no matter how intensive or emotional the debate is. this includes listening to and respecting each other’s arguments even though they may be inconsistent with one’s own beliefs and interests. a number of linguistic markers can indicate (dis)respect and/or (im)politeness. the challenge lies in being able to identify these consistently via automatic means. for example, rhetorical questions containing focus particles such as ‘even’ as in ‘have you ever even done a real cost calculation?’ signal that the speaker seriously doubts the overall competence of the addressee in a manner that is disrespectful to the addressee (e.g. guerzoni, ). however, it appears that at least with respect to twitter data, it is nontrivial to extract this kind of information reliably by automatic means (zymla, ). at this stage of our work, we have thus decided to first focus on easily detectable features such as patterns of interruptions. however, it is challenging to differentiate between (disrespectful) interruptions and regular (deliberative) crosstalk. we have found that some of the relevant features for the identification of these different types are the length of utterances, the distribution of utterances, and the degree of recurrence (i.e. the degree of similarity between the utterance and the previous utterances, see angus et al. ( a,b)).
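Two of these features, utterance length and recurrence, can be approximated with very little code. The Python sketch below uses a plain bag-of-words cosine similarity against a short window of preceding utterances. This is a simplified stand-in chosen for illustration: it is not the conceptual recurrence measure of Angus et al., and the window size of three is an arbitrary assumption.

```python
import math
from collections import Counter


def cosine(bag_a, bag_b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(bag_a[word] * bag_b[word] for word in bag_a if word in bag_b)
    norm_a = math.sqrt(sum(value * value for value in bag_a.values()))
    norm_b = math.sqrt(sum(value * value for value in bag_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def utterance_features(utterances, window=3):
    """For each utterance, return its length in words and its highest
    similarity to any of the `window` preceding utterances."""
    bags = [Counter(utterance.lower().split()) for utterance in utterances]
    features = []
    for index, bag in enumerate(bags):
        previous = bags[max(0, index - window):index]
        recurrence = max((cosine(bag, other) for other in previous), default=0.0)
        features.append({"length": sum(bag.values()), "recurrence": recurrence})
    return features


features = utterance_features(
    ["die kosten sind hoch", "moment moment", "die kosten sind hoch"]
)
```

A very short utterance with low recurrence, such as the second one here, is the kind of candidate one might inspect as a possible interruption; whether it actually is one still requires the interpretation discussed in the text.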
fig. result of an automated analysis of the entire s arbitration. topics are trained by feeding the set of all turns into the standard topic modeling provided by mallet (mccallum, ). after that, our algorithm identifies sections of the mediation where individual topics cluster heavily. in another step, the most frequent words used within a topic cluster are extracted and provided as labels to the left of the blocks representing topic clusters. as can be seen, at the beginning of the arbitration, there are fewer significant topic clusters, mostly related to the capacity of s , environmental and security issues. toward the end of the arbitration, a longer discussion with clearer topical focus evolved. the high costs of the project, which are the main issue, become more prominent, indicated by terms like milliarden ‘billions’, kostenkalkulation ‘cost accounting’, euro, risikopuffer ‘risk buffer’, millionen ‘millions’, preissteigerung ‘price hike’, preise ‘prices’, finanzierung ‘funding’, geld ‘money’, or kosten ‘costs’.

to determine the effects of interruptions, we have developed a visual framework that mainly works with the length of utterances. based on these lengths and the different colors for speakers, sections with interruptions can be identified. for instance, in fig. , the lime green speaker ms gönner, who is in favor of the project, interrupts her opponent, the orange speaker mr holzhey, even though the arbitrator heiner geißler has given mr holzhey the floor. as can be seen in the right panel, mr holzhey is irritated by this behavior and seeks to regain his turn: ‘moment moment!’ (‘wait a minute wait a minute!’); ‘ganz ruhig ganz ruhig!’ (‘be calm be calm!’). moreover, the green and gray participants also rise to speak out of turn. in the former case, ms gönner is interrupted by the green participant who demands an answer to his question (. . .ich hätte darauf gerne eine antwort ‘i would like an answer to that’).
however, ms gönner does not give up her turn and simply continues with her argument. in our interpretation, this is a show of speaker strength; however, the fact that an interruption was attempted is valued as a mark of low deliberative quality.

argumentation and justification

at its core, deliberation requires sustained and appropriate modes of argumentation (e.g. stromer-galley, ; gastil and black, ; thompson, ). for one, arguments should be properly justified. for another, arguments should make reference to a common set of principles; indeed, it is more likely that an argument will be successful if the speaker can appeal to a commonly agreed upon set of values or a commonly agreed upon understanding of the world (habermas, ). two of the linguistic features relevant for the determination of these aspects of deliberation are presented in this section: ( ) causal connectors that support the position of the speaker, as opinions are not only stated, but are justified as well; and ( ) discourse particles which provide information about speaker stance/attitude and/or which trigger conventional implicatures as to what common knowledge about the world should be assumed (common ground) and to which degree.

causal connectors in german can be divided into two classes: ‘markers of reason’ introduce the cause of an effect, while ‘markers of conclusion/result’ introduce a clause describing the effect of a previously stated cause. both markers relate two parts of a sentence or several sentences: one part contains the reason for a specific statement and the other part contains the result. the following two sentences demonstrate this relationship, where ( ) states a result followed by a reason (‘weil’), and ( ) states a reason, followed by a result (‘daher’) (for previous computational work on these, see e.g. dipper and stede ; schneider and stede, ).

( ) er ist grün, weil ihm schlecht ist
    he.nom be. .sg green because he.dat feel.sick be. .sg
    ‘he is green, because he feels sick’.

( ) ihm ist schlecht daher ist er grün
    he.dat be. .sg feel.sick that.is.why be. .sg he.nom green
    ‘he is feeling sick, that’s why he is green’.

fig. more detailed view on highlighted part from fig. . both the turn structure (to the right) and the content structure (to the left) have been integrated into one view. in contrast to fig. , the algorithm in this case has not searched for sections with topic clusters, but for sections with word clusters, which is another option of our method. during the dialog of heiner geißler (dark blue) and volker kefer (light green), the word kostenkalkulation ‘cost calculation’ clusters highly significantly, indicating that this is the main subject of their dialog section.

fig. visual framework to identify sections of interruptions. each colored row in the three panels belongs to one participant. the left panel provides an overview of the complete s arbitration. in order to allow a more refined overview, the visual framework provides zooming functionality. this is demonstrated in the middle and right panel.

there are several challenges in the automatic analysis of these relations. as already mentioned in section . , some of the connectors are ambiguous and need to be disambiguated via the application of a rule-based system containing deep linguistic knowledge. a second challenge is the determination of scope: the reason/result relation can scope over several discourse units (edus) or even sentences and is thus not limited to the edu containing the causal connector. in order to determine the scope (and to annotate the edus accordingly), further deep linguistic knowledge about the cues that delimit or license the relation is needed.
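To make the scope problem concrete, here is a minimal Python sketch of one possible scope rule for result connectors, written in the spirit of the rule-based example that follows in the text. The data layout (sentences as lists of EDU dictionaries with a precomputed, already disambiguated `connector` field) and the function name are assumptions for illustration. In particular, reading "unless encountering another connector" as "walk backwards through the previous sentence and stop at the first connector" is one interpretation, not a confirmed detail of the authors' system.

```python
def annotate_reasons(sentences):
    """Mark EDUs that act as the reason for a result connector.

    sentences: list of sentences, each a list of EDU dicts with keys
    'text' and 'connector' (None, 'result', or 'other'); assumed input.
    Returns a set of (sentence_index, edu_index) pairs marked as reason.
    """
    reasons = set()
    for s_idx, sentence in enumerate(sentences):
        for e_idx, edu in enumerate(sentence):
            if edu["connector"] != "result":
                continue
            if e_idx > 0:
                # Connector is not in the first EDU: mark everything from the
                # sentence start up to it, provided no other connector precedes.
                if all(p["connector"] is None for p in sentence[:e_idx]):
                    reasons.update((s_idx, i) for i in range(e_idx))
            elif s_idx > 0:
                # Connector opens the sentence: mark EDUs of the previous
                # sentence, walking backwards until another connector is met.
                previous = sentences[s_idx - 1]
                for i in range(len(previous) - 1, -1, -1):
                    if previous[i]["connector"] is not None:
                        break
                    reasons.add((s_idx - 1, i))
    return reasons


within_sentence = [[
    {"text": "ihm ist schlecht", "connector": None},
    {"text": "daher ist er grün", "connector": "result"},
]]
marked = annotate_reasons(within_sentence)
```

The sketch covers result connectors only; reason connectors such as 'weil' would need a mirror-image rule that marks the units following the connector.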
an example of the type of algorithm used in our rule-based system for the automatic annotation of causal relations is given in ( ).

( ) if result connector not in first edu of sentence
       and result connector not preceded by other connector within same sentence
    then mark every edu from sentence beginning to current edu with reason.
    elsif result connector in first edu of sentence
    then mark every edu in previous sentence with reason unless encountering another connector.

starting from a (disambiguated) causal connector encoded in the text, rules of the type in ( ) are used to annotate the preceding and following discourse units to indicate the speaker’s use of justification. an evaluation of our rule-based system with respect to a manually annotated gold standard has yielded precision, recall, and f-score values of . (bögel et al., ). an error analysis showed that the system can be improved further in future work. however, the present results are already of a high enough quality so that we can include this fundamentally important feature as part of the measurement of deliberative quality.

another relevant feature is the expression of common ground. in german, one of the very frequently encountered strategies for expressing whether a speaker considers information to be in the common ground (or whether they would like to have it be assumed as being in the common ground) is the use of discourse particles. german has an inventory of several different discourse particles, many of which are currently the subject of active linguistic research (see zimmermann, for a recent overview). for example, by using the modal particle ‘ja’, the speaker indicates that they assume that a given statement/proposition is already known to the addressee or is general knowledge; i.e. speaker and addressee share a common ground, and the speaker expects that the addressee will not contradict the statement (karagjosova, ; zimmermann, ). an example is given in ( ).
( ) first brother to second brother:
    morgen wird mama ja siebzig
    tomorrow be. .sg mum indeed seventy
    ‘tomorrow mum turns (as you know)’.

rhetorically, this strategy can be used to put the addressee at a disadvantage: if the addressee does not want to acknowledge information as being commonly agreed upon knowledge (common ground), then they have to explicitly reject it, something that is difficult to do since the speaker conveyed their assumption only indirectly via a conventional implicature (potts, ) in the first place.

a slightly different pragmatic import about the mutual common ground that is assumed is conveyed by doch ‘indeed’. in this case, the speaker assumes that the knowledge conveyed in the utterance is already in principle also known by the addressee, but that it is not at present activated in the common ground. the use of ‘doch’ is thus a signal that the speaker wishes to reactivate information that is assumed to already be in the common ground.

other particles like wohl ‘apparently’ signal speaker attitude toward a given proposition: the use of ‘wohl’ conveys a weak commitment to the proposition uttered. in contrast, as a discourse particle halt ‘stop/well’ is used to indicate that the speaker considers the topic talked about to contain an immutable (world) constraint and also to express a certain degree of resignation in the face of how the world is (and cannot be changed).

a study of these discourse particles showed that ‘halt’, ‘doch’, and ‘ja’ occur frequently in the s arbitration, whereas ‘wohl’ occurs only rarely (janka, ). the particles can be used in interaction with one another and also in interaction with causal connectors. an example from the s arbitration is shown in ( ). this example also illustrates that while causal connectors and modal particles each separately already serve as indicators for the determination of deliberation, their interaction is also significant.
thus, a justification that also includes a particle representing an immutable constraint (‘halt’ in ( )) indicates that the speaker considers this justification to be irrevocable; i.e. the speaker has made a point that they are not expecting to be contradicted. in ( ), the speaker (heiner geissler) states that most cars are present in a certain area. by using ‘halt’ in this context, he conveys that this point is absolutely true and does not need to be discussed any further.

( ) . . . weil halt da die meisten autos unterwegs sind . . .
    as halt there art most car.pl underway be. .pl
    ‘. . . because most cars are underway in this area’.
    (heiner geissler, s , november )

note that like the causal connectors, discourse particles also tend to be highly ambiguous; e.g. ‘halt’ also means ‘stop’ and ‘ja’ is also the word for ‘yes’. in order to achieve a successful identification of the discourse particles, a deep linguistic analysis is again necessary.

fig. use of justification, immutable constraints, and common ground assumptions by some of the speakers of the s mediation process, normalized according to the number of words each speaker uttered during the process. the number indicates the absolute value of the discourse units.

fig. shows a visual analysis based on the linguistic annotation with respect to causal connectors, discourse particles, and their interaction. the pragmatic import of these linguistic features is registered as ‘justification’, ‘common ground’, and ‘immutable constraint’. the figure shows which speakers justify their arguments with which frequency and whether they use discourse particles to convey that they consider certain information to be part of commonly agreed on knowledge (common ground) or to convey that they consider certain aspects to be hard, unchangeable facts (immutable constraints) about the world that cannot be discussed further and that thus make a solid point. the speakers depicted in fig.
are among the ones who spoke the most during the s arbitration. we have represented four speakers of the pro group and four speakers of the contra group. the bottom part of the figure shows an analysis of the arbitrator, heiner geissler. the visual analysis very clearly shows that heiner geissler makes the most use of common ground particles. a possible interpretation of the data is that geissler’s overall goal was to create a common ground for the two opposite groups, an attempt that is expected from a neutral arbitrator. he also brought in the most justifications and pointed out immutable facts about the world more than others. again, these are strategies that are expected from an arbitrator who is trying to reach a consensus on the arguments that are exchanged.

looking at the speakers in the pro versus contra groups, the visual analysis shows that the two top representatives of the pro group (kefer and gönner) use significantly more justification patterns than the other speakers. as the s mediation process was the result of an offensive against the pro s group, we can speculate that these representatives needed to justify their positions and decisions more during the arbitration.

persuasiveness

deliberation is a process whose aim is to exchange arguments and to find a common strategy. however, the process of political deliberation does not necessarily result in an agreement. from a theoretical perspective, deliberation has taken place if all the stakeholders have expressed their intention of coming to an agreement (even if none is reached) (e.g. gastil, ; mannarini and talò, ). however, due to real-world pressures and the necessity that the problem at stake needs to be resolved, most deliberations do end in an agreement. hence, the deliberative quality of a discourse can also be measured in terms of the degree of persuasiveness, i.e. who convinced whom and how/why.
with regard to our simulation-gaming experiments, we can evaluate information about persuasiveness since the experimental subjects had to note down their preferences after the discussion. for real-world conversations, analyzing who convinced whom is a more complex task since most agreements are based on a compromise between the contesting parties. this renders an analysis of the overall degree of persuasiveness difficult. hence, we propose a procedural measure for persuasiveness. based on holzinger and landwehr (holzinger, , ; landwehr and holzinger, ), we propose to measure the deliberative intentions of the stakeholders based on the types of speech acts expressed by performative verbs (e.g. ‘accept’, ‘threaten’) and the information conveyed by epistemic or attitude verbs (e.g. ‘believe, think, assume’ versus ‘know’) about speaker stance.

the idea is that this approach will reveal sequences within the discourse that are characterized by either extensive bargaining or intensive argumentation. moreover, if these sequences are linked to specific topics, it will be possible to identify the argumentative quality of specific topics and to discern instances of persuasion within the discourse.

summary and future work

this article presents work from an interdisciplinary research effort involving political science, linguistics, and visual analytics. the overall goal of our research is to find reliable indicators for the deliberative quality of a discussion. our strategy is to identify linguistic markers that pertain to a political science-oriented analysis of deliberation and that can be identified automatically via computational linguistic methods. this computational linguistic component is rule based and draws on deep linguistic knowledge. its outputs are automatically annotated corpora with relevant linguistic information.
these corpora are used as the basis for the visual analytics component, which incorporates shallow nlp methods and other sophisticated statistical analyses of various features of the discussions. we provide examples of visualizations with respect to the s arbitration process and demonstrate that our methods yield information that can ultimately be used to judge the deliberative quality of a discussion via the visual integration of very different types of information.

in the future of this research project, several steps are necessary in order to automate and refine the measures for deliberative communication. first, more features for quantifying the deliberative degree of communication need to be extracted and evaluated. for instance, similar to the deep linguistic analysis of argumentation and justification, we intend to apply some automated procedures also to reveal patterns of persuasiveness. second, to achieve a single automated measure for the degree of deliberation, a combination of the four deliberative dimensions is required. an evaluation will be conducted to determine the validity of the automated measure. overall, the combination of automated measures and visual analytics proves to not only be conducive to measuring the deliberative quality of communication but also to understanding the relevant features leading to deliberative decision-making.

references

angus, d., smith, a. e., and wiles, j. ( a). conceptual recurrence plots: revealing patterns in human discourse. ieee transactions on visualization and computer graphics, ( ): .
angus, d., smith, a. e., and wiles, j. ( b). human communication as coupled time series: quantifying multi participant recurrence. ieee transactions on audio, speech & language processing ( ): .
asher, n. and reese, b. ( ). negative bias in polar questions. in maier, e. bary, c., and huitink, j. (eds.), proceedings of sinn and bedeutung (sub) , nijmegen: nijmegen centre of semantics (ncs), pp. .
black, l.
w., burkhalter, s., gastil, j., and stromer galley, j. ( ). chapter : methods for analyzing and measuring group deliberation. in bucy, e. p. and holbert, r. l. (eds), sourcebook of political communication research: methods, measures, and analytic techniques. new york: routledge, pp. . blessing, a., sonntag, j., kliche, f., heid, u., kuhn, j., and stede, m. ( ). towards a tool for interactive concept building for large scale analysis in the humanities. proceedings of the th workshop on language technology for cultural heritage, social sciences, and humanities. association for computational linguistics (acl), sofia, bulgaria, pp. . bögel, t., hautli janisz, a., sulger, s., and butt, m. ( ). automatic detection of causal relations in german multilogs, proceedings of the eacl workshop on computational approaches to causality in language (catocl), gothenburg, sweden, pp. . brown, p. and levinson, s. c. ( ). politeness: some universals in language use. cambridge, ma: cambridge university press. chambers, s. ( ). deliberative democracy theory. annual review of political science, ( ): . dacombe, r. ( ). thinking about the quality of de liberative politics: a critical look at the discourse qual ity index. paper presented at the sspp annual research conference, king’s college london, june , . dipper, s. ( ). implementing and documenting large scale grammars german lfg. ph.d. thesis, ims, university of stuttgart. dipper, s. and stede, m. ( ). disambiguating potential connectives. in butt, m. (ed.), proceedings of konvens (conference on natural language processing), konstanz, pp. . edwards, p. b., hindmarsh, r., mercer, h., bond, m., and rowland, a. ( ). a three stage evaluation of a deliberative event on climate change and transformation energy. journal of public deliberation, ( ): article . fishkin, j. s. and luskin, r. c. ( ). experimenting with a democratic ideal: deliberative polling and public opinion. acta politica, ( ): . gastil, j. ( ). 
how balanced discussion shapes know ledge, public perceptions, and attitudes: a case study of deliberation on the los alamos national laboratory. journal of public deliberation, ( ): article . gastil, j. and black, l. w. ( ). public deliber ation as the organizing principle of political communi cation research. journal of public deliberation, ( ): . gold, v., rohrdantz, c., and el assady, m. ( ). exploratory text analysis using lexical episode plots. in bertini, e., kennedy, j., and puppo, e., (eds.), eurographics conference on visualization (eurovis) short papers. the eurographics association. . guerzoni, e. ( ). even npis in yes/no questions. natural language semantics, : . gutmann, a. and thompson, d. f. ( ). democracy and disagreement. why moral conflict cannot be avoided in politics, and what should be done about it. cambridge, ma: harvard university press. habermas, j. ( ). theorie des kommunikativen handelns. frankfurt am main: suhrkamp. habermas, j. ( ). the theory of communicative action. boston, ma: beacon. hangartner, d., bächtiger, a., grünenfelder, r., and steenbergen, m. r. ( ). mixing habermas with bayes: methodological and theoretical advances in the study of deliberation. swiss political science review, ( ): . holzinger, k. ( ). verhandeln statt argumentieren oder verhandeln durch argumentieren? eine empiri sche analyse auf der basis der sprechakttheorie. politische vierteljahresschrift, ( ): . holzinger, k. ( ). bargaining through arguing: an empirical analysis based on speech act theory. political communication, ( ): . janka, m. ( ). schattierungen in der argumentation modalpartikeln und kausale konnektoren. ba thesis, university of konstanz. karagjosova, e. ( ). the meaning and function of german modal particles. saarabrucken dissertations in computational linguistics and language technology. kaufer, d., geisler, c., vlachos, p., and ishizaki, s. ( ). chapter : mining textual knowledge for writ ing education and research: the docuscope project. 
in van waes, l., leijten, m., and neuwirth, c. m. (eds), writing and digital media. language and linguistic special. oxford: elsevier, pp. . keim, d. a., mansmann, f., schneidewind, j., thomas, j., and ziegler, h. ( ). chapter: visual analytics: scope and challenges. in simoff, s. j., bohlen, m. h., and mazeika, a. (eds), visual data mining. berlin, heidelberg: springer verlag, pp. . lakoff, r. ( ). language and woman’s place. new york, ny: harper & row. landwehr, c. and holzinger, k. ( ). institutional de terminants of deliberative interaction. european political science review, ( ): . lord, c. and tamvaki, d. ( ). the politics of justifi cation? applying the ‘discourse quality index’ to the study of the european parliament. european political science review, : . mannarini, t. and talò, c. ( ). evaluating public participation: instruments and implications for citizen involvement. community development, ( ): . marcu, d. ( ). the theory and practice of discourse parsing and summarization. cambridge, ma: mit press. mayer, t., rohrdantz, c., butt, m., plank, f., and keim, d. a. ( ). visualizing vowel harmony. linguistic issues in language technology, ( ): . mccallum, a. k. ( ). mallet: a machine learning for language toolkit. http://mallet.cs.umass.edu moretti, f. ( ). distant reading. london: verso. nguyen, v. a., hu, y., boyd graber, j., and resnik, p. ( ). argviz: interactive visualization of topic dy namics in multi party conversations. human language technologies: the annual conference of the north american chapter of the association for computational linguistics, : . persson, m., esaiasson, p., and gilljam, m. ( ). the effects of direct voting and deliberation on legit imacy beliefs: an experimental study of small group decision making. european political science review, ( ): . potts, c. ( ). the logical of conventional implicatures. oxford: oxford university press. ramsay, s. ( ). special section: reconceiving text ana lysis toward an algorithmic criticism. 
literary and linguistic computing ( ): . ramsay, s. ( ). reading machines: toward an algorithmic criticism. urbana: university of illinois press. rohrdantz, c., hautli, a., mayer, t., butt, m., plank, f., and keim, d. a. ( ). towards tracking semantic change by visual analytics, proceedings of the th annual meeting of the association for computational linguistics: human language technologies (short papers). portland, or: association for computational linguistics, pp. . rohrdantz, c., hund, m., mayer, t., wlchli, b., and keim, d. a. ( ). the world’s languages explorer: visual analysis of language features in genealogical and areal contexts. computer graphics forum, ( ): . scheffler, t. ( ). meaning variations in german tag questions. talk accepted at the dgfs (deutsche gesellschaft fur sprachwisenschaft) workshop on ‘‘the prosody and meaning of (non )canonical questions across languages’’. schiller, a. ( ) dmor user’s guide. technical report, universitat stuttgart, institut fur maschinelle sprachverarbeitung. schmid, h. ( ). improvements in part of speech tagging with an application to german, proceedings of the acl sigdat workshop, dublin, ireland. schneider, a. and stede, m. ( ). ambiguity in german connectives: a corpus study. in butt, m. (ed.), proceedings of konvens (conference on natural language processing). konstanz, pp. . steenbergen, m. r., bächtiger, a., spörndli, m., and steiner, j. ( ). measuring political deliberation: a discourse quality index. comparative european politics, ( ): . stromer galley, j. ( ). measuring deliberation’s con tent: a coding scheme. journal of public deliberation, ( ): article . sulkin, t. and simon, a. f. ( ). habermas in the lab: a study of deliberation in an experimental setting. political psychology, ( ): . thomas, j. j. and cook, k. a. ( ). a visual analytics agenda. ieee computer graphics and applications, ( ): . thompson, d. f. ( ). deliberative democratic theory and empirical political science. 
annual review of political science, ( ): . wiedemann, g., lemke, m., and niekler, a. ( ). postdemokratie und neoliberalismus zur nutzung neoliberaler argumentationen in der bundesrepublik deutschland . ein werkstattbericht. zeitschrift fur politische theorie, ( ): . zimmermann, m. ( ). discourse particles. in portner, p., maienborn, c., and von heusinger, k. (eds), semantics (handbucher zur sprach und kommunikationswissenschaft). mouton de gruyter, pp. . zymla, m. m. ( ). extraction and analysis of non canonical questions from a twitter corpus. ma thesis, university of konstanz. notes http://www.social media analytics.org/en/, last accessed october . https://www.ling.uni potsdam.de/acl lab/eidentity/ main.html, last accessed october . http://www.epol projekt.de, last accessed october . http://www.schlichtung s .de/dokumente.html, last accessed september . http://www.cis.uni muenchen.de/schmid/tools/ treetagger/, last accessed october . owls learn oer this work is licensed under a creative commons attribution . international license. owls learn oer shannon kipphut-smith, fondren library digital scholarship services october permanent link to this document: https://doi.org/ . / b-bs introduction to this course welcome to owls learn oer! your learning includes a series of self-paced online learning modules. the seven modules can serve as an introduction to open educational resources (oer) as well as an opportunity for further exploration and discovery of oer and open education practices. throughout the modules there are opportunities for you to test your knowledge or further explore a concept. the modules allow you to learn at your own pace. while you can follow the modules in any order, it is recommended that you start with module and progress through in order. 
by the end of this course, you should be able to:
● define open educational resources
● explain the rationale for oer adoption and use
● explain the differences between the six currently available creative commons licenses
● identify repositories and other resources for finding relevant oer
● use tools and criteria to evaluate oer
● recognize steps and associated criteria for adapting and creating oer with proper attribution and licensing
● create an open educational resource
if you have questions about, or suggestions for, these modules, please contact: fondren library digital scholarship services (http://library.rice.edu/dss), cds@rice.edu
attribution
this course is adapted from carrie gits's (for digitex, https://digitex.org/open-educational-resources/) "texas learn oer," licensed under a creative commons attribution . international license cc by . additional content for module , "creative commons licensing in-depth," is adapted from creative commons’ “creative commons certificate for educators and librarians” (https://certificates.creativecommons.org/cccertedu/), licensed under the creative commons attribution . international license (cc by).
course module outline
module : understanding oer
module : why oer?
why use oer?
oer: equity & openness
explore further
module : introduction to open licensing
what is copyright?
what is fair use?
understanding an open license
why is an open license important?
what is the public domain?
what is the difference between public domain and open license?
why open licensing matters
module : finding & evaluating oer
finding oer
evaluating oer
module : accessibility
accessibility and universal design
an overview of accessibility
choosing and using accessible video
choosing and using accessible images
choosing and using accessible course material
choosing and using accessible textbooks
resources
module : creative commons licensing in-depth
six licenses
attribution
combining cc licenses
choosing a license for your work
module : adapting, creating & sharing oer
adapting an existing open educational resource
creating open educational resources
sharing your work
module : understanding oer
by the end of this module, you should be able to:
● define open educational resources (oer)
● describe the rs of oer
● identify examples of oer types
● recognize the role open licensing plays in oer
● test your knowledge
introductory video: what is oer?
what are open educational resources (oer)?
open educational resources (oer) are teaching, learning and research materials in any medium – digital or otherwise – that reside in the public domain or have been released under an open license that permits no-cost access, use, adaptation and redistribution by others with no or limited restrictions. --unesco
types of oer include (but are not limited to) syllabi, lesson plans, learning modules, lab experiments, simulations, course videos, discussion prompts, assignments, assessments, library guides, and course design templates. the key distinguishing factor of this type of educational resource is the copyright status of the material.
if course content is under a traditional, all-rights-reserved copyright, then it’s not an oer. if it resides in the public domain or has been licensed for adaptation and distribution, then it is an oer.
the rs of oer
you recently viewed the introductory video (https://youtu.be/ldtcdmkldqw), where the presenters discussed how the rs are critical in defining and distinguishing open educational resources from other types of learning materials. these r permissions are what make oer different from material which is copyrighted under traditional, all-rights-reserved copyright. another way to frame this is that open in open educational resources doesn’t simply equate to being free; in fact, it can more accurately be described as:
open = free + permissions (the rs)
the rs are a useful way to appreciate the value of oer. these permissions help you, the user of openly licensed content, understand what you are allowed to do with the work.
these permissions are granted in advance and are legally established through the public domain or a creative commons license:
● revise – the right to adapt, adjust, modify, or alter the content itself (e.g., translate the content into another language)
● remix – the right to combine the original or revised content with other material to create something new (e.g., incorporate the content into a mashup)
● reuse – the right to use the content in a wide range of ways (e.g., in a class, in a study group, on a website, in a video)
● retain – the right to make, own, and control copies of the content (e.g., download, duplicate, store, and manage)
● redistribute – the right to share copies of the original content, your revisions, or your remixes with others (e.g., give a copy of the content to a friend)
open licensing & oer
going back to our definition of oer, we need to remember that these materials reside in the public domain or have been released under an intellectual property license permitting their free use and repurposing by others. the most commonly used intellectual property licenses for oer that permit free use and repurposing are called creative commons licenses. creative commons licenses work with legal definitions of copyright to automatically provide usage rights pertaining to that work. as you progress along your learning journey, module and module will provide you with the opportunity to fully explore creative commons licensing and to learn how to apply the appropriate licenses to the oer you and your learners create and use.
a (very) brief history of open educational resources
● - wayne hodgins coined the term “learning object”
● - david wiley coined the phrase “open content”
● - larry lessig, hal abelson, and eric eldred founded creative commons (https://creativecommons.org/)
● - mit introduced their opencourseware project (moocs) (https://ocw.mit.edu/index.htm)
● - unesco coined the term “open educational resources” (oer) (https://en.unesco.org/themes/building-knowledge-societies/oer)
● - unesco adopted the oer paris declaration, an international commitment to oer
● - unesco updates their definition of oer, creating conversation within the open community about the impact of this change on the ability to reuse oer (https://en.unesco.org/news/intergovernmental-expert-meeting-adopts-revised-draft-recommendation-open-educational-resources)
this movement continues to gain momentum, and the community of open education practitioners continues to expand. educators around the world are increasing their use and creation of these resources in their teaching and learning.
explore further
want to learn more about the history of oer?
bliss, t. j. and smith, m. ( ). a brief history of open educational resources. in: jhangiani, r s and biswas-diener, r. (eds.) open: the philosophy and practices that are revolutionizing education and science. (pp. – ). london: ubiquity press. doi: https://doi.org/ . /bbc.b
wiley, d. ( , january ). clarifying and strengthening the rs. iterating towards openness: pragmatism before zeal. https://opencontent.org/blog/archives/
module : why oer?
by the end of this module, you should be able to:
● articulate motivations for oer adoptions and use
● describe the benefits of oer for students
● describe the benefits of oer for faculty
● explore further benefits oer supports, such as equity and inclusion
understanding the why behind adopting oer
before we discuss the benefits of oer in detail, please take a few minutes to watch the video below. it reviews the definition of oer and provides a broad overview of why oer is an effective solution in addressing student barriers to high-quality learning materials. the video also provides examples of how faculty can use oer to enhance their teaching and improve student learning.
an introduction to open educational resources
you’ll notice that this module contains many external links to additional readings on the impact and benefits of oer. take the time to read these resources to explore further the concepts and points presented in this module.
why use oer?
oer supports a future where students and instructors have free access to a wide variety of high-quality educational resources that have been collaboratively developed, reviewed, revised, and shared across institutions. a future where educational resources can be easily adapted to fit within the context of specific courses, and to meet the needs of specific students. a future where the cost of creation, use, and maintenance is much lower than the current rising costs of textbooks and other classroom resources.
the scholarly publishing and academic resources coalition (sparc, https://sparcopen.org/open-education/) summarizes the why behind using oer with these four points:
● textbook costs should not be a barrier to education
● students learn more when they have access to quality materials
● technology holds boundless potential to improve teaching and learning
● better education means a better future
benefits for students
using oer can both provide tremendous cost savings for students and impact student success and completion rates. the cost of textbooks can be a huge financial burden on students, which not only affects student success, but could also delay graduation for students who are taking fewer classes per term because of that cost, further increasing financial costs for students over time. oer provide students with day-one access to free course materials, and research reviewed by the open education group shows that most students perform as well or better using oer course materials compared with students using traditional textbooks.
when faculty use oer, we aren’t just saving students money on textbooks: we are directly impacting that student’s ability to enroll in, persist through, and successfully complete a course. ~ jhangiani & derosa
the florida virtual campus’ and student textbook and course materials survey demonstrates that the cost of commercial textbooks continues to negatively impact student access, success, and completion.
benefits for faculty
imagine being able to edit, modify, update, and improve your course materials so the learning outcomes are met and the course material’s content is “exactly the way you want it.” oer allows for this!
faculty using oer enjoy great freedom in selecting course materials that they customize to fit the specific needs of their students and the goals of their classes. since most oer permit adaptation, educators are free to edit, reorder, delete from, or remix oer materials. oer provide clearly defined rights to users, so educators are not faced with interpreting fair use and teach act guidelines.
open educational resources also provide increased opportunities for faculty to engage in open pedagogical practices with their students. as mentioned above, students play a vital role in oer. student involvement also creates effective and successful open education programs at your institution. open pedagogy focuses on instructional approaches which allow students to use, reuse, revise, remix, and redistribute open content. in other words, students move from knowledge consumers to knowledge creators. the ability of students to engage more actively with the oer is a key pedagogical benefit for faculty and students - one that commercially published copyrighted course materials do not provide.
to explore the power of open pedagogy further, take a look at the recent publication open pedagogy approaches: faculty, library, and student collaborations (https://milnepublishing.geneseo.edu/openpedagogyapproaches/). this comprehensive collection is full of practical tips, ideas, and inspiring stories for faculty.
other key benefits to faculty include:
use, improve, and share
● save time and energy by adapting or revising resources that have already been created
● tailor resources to fit specific context within your courses and research
● expand interdisciplinary teaching by integrating resources from multiple disciplines
network and collaborate with peers (professional development considerations)
● access educational resources that have been peer-reviewed by other experts in your field
● explore reviews and annotations that provide more in-depth knowledge of the resource
● collaborate on creating new resources that can be used within or across disciplines
lower costs and improve access to information
● reduce the cost of course materials
● enable all students to have equal access to course materials
● provide students with the opportunity to explore course content fully before enrolling
oer: equity & openness
when discussing open educational resources and exploring their use and benefits, remember that access and equity are not the same.
this video explores how equity intersects with open education: equity in open education
the video above references challenges to oer, such as inequitable access to technology and resources among students and institutions. in addition to this barrier, there are other challenges related to equity in open educational resources. while open educational resources and open practices present opportunities to create and share diverse and inclusive resources, inequities in oer exist. for example, the open community is lacking in diverse voices who author oer. there also are known difficulties finding openly licensed content that is culturally relevant and inclusive. representation matters and there is work to do in this area!
the community college consortium for open educational resources (cccoer) has collected resources and articles exploring oer through the lens of equity, diversity, and inclusion. these resources are included (and continue to expand) on their equity & openness blog.
as you learn more about oer, consider how open education practices and the use of oer can enhance your own teaching practices and learning materials to become more equitable, diverse, and inclusive. as an oer champion, work consciously to resolve the known inequities that exist in open educational resources. make them truly culturally relevant, inclusive, and representative.
...oer provide a unique opportunity for educators to access learning materials, and then tailor them to the specific needs of their classroom. this is particularly important for teaching diverse groups of students.
where culturally-responsive curriculum redesign must include funding to print textbooks that often fail to reflect student diversity and quickly become outdated, oer could instead be used to give students access to high-quality learning materials that educators could then continue to adapt as understandings of student needs and identities change. ~ prescott, s., muñiz, j. & ishmael, k.
explore further
additional research and videos discussing the impact and benefits of oer for faculty and students are linked below.
colvard, n., watson, c. & park, h. ( ) the impact of open educational resources on student success metrics. international journal of teaching and learning in higher education, ( ), - .
grimaldi pj, basu mallick d, waters ae, baraniuk rg ( ) do open educational resources improve student learning? implications of the access hypothesis. plos one ( ): e .
hilton, j. ( ) open educational resources and college textbook choices: a review of research on efficacy and perceptions. education tech research and development, ( ), – .
reynado, kharl. ( , october ) oer diversity discourse: bring in the student advocates. openstax blog. https://openstax.org/blog/oer-diversity-discourse-bring-student-advocates
vézina, b. and green, c. ( , march ) education in times of crisis and beyond: maximizing copyright flexibilities. creative commons blog.
module : introduction to open licensing
by the end of this module, you should be able to:
● define an open license
● distinguish between materials that are all rights reserved, in the public domain, and openly licensed
● identify the four factors of fair use
did you realize these course modules are an oer? do you want to reuse the content, or modify it for your students or colleagues? guess what … you can...with attribution of course! you’ll learn more about reusing open content and explicit open license permissions, such as attribution, in module . however, understanding what makes it possible for you to reuse, modify, and reshare this work is the first step. these activities are legal because the author released this work with an open license when it was created.
when discussing open licensing it is also necessary to review definitions of important terms and the legal requirements and principles that apply to a creator’s work and govern how it can be used or reused. in addition to introducing and defining open licenses, this module will review and define copyright, fair use, and public domain.
what is copyright?
copyright is a legal right, grounded in the u.s.
constitution, that gives the owner of copyright the exclusive right to:
● reproduce the work
● prepare derivative works
● distribute the work
● publicly perform the work
● publicly display the work
● authorize others to exercise some or all of those rights
copyright takes effect immediately once a work has been fixed in tangible form--registration is optional.
what is fair use?
start with an overview of fair use by viewing this short video: fair use in seven words
fair use is a copyright principle based on the belief that the public is entitled to freely use portions of copyrighted materials for purposes of commentary and criticism. whether or not a specific use falls under fair use is determined by four factors:
● the purpose and character of your use
● the nature of the copyrighted work
● the amount and substantiality of the portion taken, and
● the effect of the use upon the potential market
these factors are weighed in each case to determine whether a use qualifies as a fair use. to learn more about fair use, please see fondren library’s copyright and fair use guide (https://libguides.rice.edu/copyright).
recognizing the differences between how copyrighted material and openly licensed or public domain material can be reused and shared legally allows for a comprehensive understanding of fair use.
understanding an open license
in module you learned that an open educational resource is either in the public domain or released with copyright permissions that allow for free use and repurposing by others. specifically, an open license exists as a way for the original creator to clearly inform others how their work can be used by granting permissions to share and adapt their work.
a public domain license and the variety of open license permissions known as creative commons (cc) are the predominant standards for open licenses. you will learn more about the six different cc license permissions in module . this video provides more information about the benefits of an open license and how this standard makes sharing and reusing resources easy: understanding an open license

why is an open license important?

the copyright status and license applied to a work determine what you can and cannot do with someone else’s creative work. knowing how to identify and differentiate between common types of copyright status will be useful when determining which content you may reuse, and how. assume that a work is all rights reserved unless the creator explicitly states otherwise or the user of the work can show otherwise. as you search for oer, you will become familiar with the markings of each license type.

what is the public domain?

a public domain work is a creative work that is not protected by copyright, which means it’s free for you to use without permission. works in the public domain are those for which intellectual property rights have expired, have been forfeited, or are inapplicable. here are some examples of works in the public domain:
● material created by the us government, such as pictures taken by nasa (https://images.nasa.gov/)
● materials for which copyright protection has lapsed, such as “new hampshire” by robert frost
● works released to the public domain when they were created, such as images on pexels (https://www.pexels.com/public-domain-images)

determining if a work is in the public domain can be difficult because the terms of copyright protection in the united states have changed over time.
the cornell university library copyright information center (https://copyright.cornell.edu/publicdomain) is a useful tool for understanding what works might fall into the public domain. fondren library staff are also available to help determine copyright status. email digital scholarship services at cds@rice.edu.

what is the difference between public domain and open license?

they both grant free access to the materials, but the scope and nature are completely different. open licensing recognizes clear ownership of intellectual property, and the work is still protected under copyright law, whereas works in the public domain are not protected by copyright law. therefore, users are required to follow the license requirements when using openly licensed materials.

the infographic below illustrates the differences between public domain, open license, and all rights reserved copyright. public domain is "most open" because the work is out of copyright or copyright has been waived. copyright owners retain copyright under both open licenses and "all rights reserved" copyright. however, open licenses grant specific uses to the user in advance. "all rights reserved" requires users to either request permission for each use or ensure that their use qualifies as fair use or another copyright exception.

"difference between open license, public domain and all rights reserved copyright" (https://commons.wikimedia.org/wiki/file:difference_between_open_license,_public_domain_and_all_rights_reserved_copyright.png) by boyoung chae is licensed under cc by .
why open licensing matters

the power of open licensing lies in its ability to clearly communicate how the creator intends the work to be used. a creator can explicitly share the work and control the licensing provisions while retaining ownership. remember, for a work without a copyright notice, all rights reserved is assumed. so if you want to openly share your oer with your students and faculty peers, or publish it online for the world to access, displaying an open license statement with the work ensures it will be easily and clearly adopted in the way you intend. in module you will spend more time learning about creative commons licenses.

module : finding & evaluating oer

by the end of this module, you should be able to:
● recognize the different types of oer
● apply effective search strategies when looking for oer
● identify several online repositories for oer
● utilize other oer search tools available
● investigate the available reuse options for oer - adopt, adapt, combine and create
● identify perspectives on evaluating and defining ‘quality’ as it relates to course materials
● utilize relevant rubrics for evaluating oer

modules - provided you with a solid introduction to various aspects of open educational resources such as the benefits of using oer, the r framework, and open licensing. in this module, you will apply what you now know about oer and start finding the variety of open resources available to you. through this module, you will be exposed to a variety of search strategies used in locating relevant oer, and you will explore some of the more useful online repositories and sites which host oer.
finding oer

by the end of this module section, you should be able to:
● recognize the different types of oer
● apply effective search strategies when looking for oer
● identify several online repositories for oer
● utilize other oer search tools available

recognizing different types of oer

remember, oer refers to educational materials that include permission for anyone to use, modify and share. in its simplest form, the term open educational resource describes any educational resource (including curriculum maps, course materials, textbooks, streaming videos, multimedia applications, podcasts, and any other materials that have been designed for use in teaching and learning) that is openly available for use by educators and students, without an accompanying need to pay royalties or license fees.

materials that are neither in the public domain nor released under a license that allows anyone to copy, adapt, and share them are not open educational resources. examples include resources available only through institutional library subscriptions, such as ebooks, online articles, and streaming media. you can use these materials only within fair use provisions or copyright exceptions.

what are you looking for?

perhaps the most useful first step when searching for oer is knowing what you are looking for. are you seeking oer video lectures that discuss microeconomics? alternatively, are you looking for a full oer course on psychology? if you can narrow your search to a particular topic and have an idea of the types of oer content you are seeking, your search will be much easier.

as you begin your search for relevant open educational resources, take a few pre-planning steps before diving into the various search tools available. for a moment, put yourself in the shoes of your students when they are asked to research a topic for a paper.
they identify a topic, outline keywords, plan their search strategy, compile relevant resources, and evaluate their results. your search for oer won’t be very different from this approach. below is a list of questions to ask yourself before you begin your search.
● what sparked your interest in oer?
● what type of oer are you looking for? a textbook? a video? a set of lesson plans?
● identify course objectives, topics, & outcomes the oer will need to cover.
● list what you like (or love) about your current course materials.
● list what you don’t like about your current course material.
● think about the effectiveness of the textbooks and course materials.
  ○ rank your top elements (are they current? accurate? do they cover course outcomes? are they professional?)
● have you used any open educational resources before? if yes, make a list.

once you’ve answered the above questions, you’ll have a better sense of where to start your search for oer.

where do you look for oer?

there are billions of openly licensed resources out there; it is easy to feel overwhelmed when trying to find relevant resources. this video provides a nice overview of some of the more common search repositories and tools for finding oer: how to find and evaluate oer

searching oer repositories

searching an oer repository can result in a faster and more productive search experience since the resources have been curated and organized into various categories including discipline, format, and open license. many repositories have either peer reviews or a rating scale where users have shared their perception of or experience with the resource. start by trying oer commons:
● oer commons (https://www.oercommons.org/) - the go-to repository if you are looking for supplementary resources from lesson plans to full courses.
due to the amount of material in oer commons, the site provides many options for limiting and filtering your searches, such as discipline, material type, format, education level and more. use the advanced search features to your advantage to fine-tune your results.

a list of additional oer repositories can be found on fondren library’s oer research guide.

searching for open textbooks

if you are looking for an open textbook to replace your current commercial textbook, start by visiting the two resources listed below:
● open textbook library (https://open.umn.edu/opentextbooks/) - supported by the open textbook network at the university of minnesota, available resources include mainly college-level open textbooks. the repository includes faculty peer reviews, licensing information, a summary of content, format availability, and direct links to resources. it can be searched by keyword or by browsing discipline areas.
● openstax (https://openstax.org/) - a nonprofit educational initiative based at rice university that publishes high-quality, peer-reviewed, openly licensed college textbooks that are freely available online and low cost in print.

using search tools to find oer

google advanced search https://www.google.com/advanced_search
google is a popular and common search tool we all use daily, but you may not be aware of its advanced search features. the google advanced search allows you to filter results by usage rights, but it does not offer a list of licenses to search by (e.g., creative commons).

mason oer metafinder (mom) https://mom.gmu.edu
this utility from george mason university libraries searches multiple oer repositories at once. you can add or remove sources to modify your search targets.
oasis search https://oasis.geneseo.edu/
openly available sources integrated search (oasis) is a search tool developed at suny geneseo that aims to make the discovery of open content easier. this tool will simultaneously search different open content sources.

be aware that these search tools rely on license metadata being detected on the source webpage(s), so it is wise to confirm the cc license on the content you want to reuse before doing so.

finding more: images, videos, audio
● images
  ○ creative commons search (https://ccsearch.creativecommons.org/)
  ○ pexels (https://www.pexels.com/)
  ○ pixabay (https://pixabay.com/)
  ○ noun project (https://thenounproject.com/) (great for icons)
● video (be sure videos include accurate captions or a transcript to allow for full accessibility)
  ○ youtube (https://www.youtube.com/) (use the creative commons filter)
  ○ vimeo (https://vimeo.com/) (use the creative commons filter)
● audio (be sure audio files include a transcript to allow for full accessibility)
  ○ bandcamp (https://bandcamp.com/tag/creative-commons)
  ○ library of congress audio files (https://www.loc.gov/audio/collections/)

if you still haven’t found what you’re looking for, contact fondren library’s digital scholarship services for help locating relevant oer or other zero cost course materials: cds@rice.edu.

evaluating oer

by the end of this module section, you should be able to:
● investigate the available reuse options for oer - adopt, adapt, combine and create
● identify perspectives on evaluating and defining ‘quality’ as it relates to course materials
● utilize relevant rubrics for evaluating oer

in the previous section, finding oer, you focused on organizing your search and finding relevant oer. this section will focus on elements of evaluating oer.

first things first: what do you want to do with the oer you found?
the first part of evaluating an oer is asking yourself what you want to do with that oer. do you want to adopt it and use it as is? or do you want to adapt and modify the content to meet your needs? if you found an oer that matched your learning outcomes perfectly, but some modification was required, does the license on that resource allow you to modify it? or is it licensed in a way that does not allow for modifications or derivatives? if modifications are not allowed, you may want to consider another resource. before diving into rubrics, consider the license for the oer and what the permissions allow.

evaluation questions

the following questions can help guide you when selecting and evaluating oer. the list below is also available in pdf format from affordable learning georgia.

clarity, comprehensibility, and readability
● is the content, including any instructions, exercises, or supplemental material, clear and comprehensible to students?
● is the content well-categorized in terms of logic, sequencing, and flow?
● is the content consistent with its language and key terms?

content accuracy and technical accuracy
● is the content accurate based on both your expert knowledge and through external sources?
● are there any factual, grammatical, or typographical errors?
● is the interface easy to navigate? are there broken links or obsolete formats?

adaptability and modularity
● is the resource in a file format which allows for adaptations, modifications, rearrangements, and updates?
● is the resource easily divided into modules, or sections, which can then be used or rearranged out of their original order?
● is the content licensed in a way which allows for adaptations and modifications?

appropriateness
● is the content presented at a reading level appropriate for higher education students?
● how is the content useful for instructors or students?
● is the content itself appropriate for higher education?

accessibility
● is the content accessible to students with disabilities?
● if you are using web resources, does each image have alternate text that can be read?
● do videos have accurate closed-captioning?
● are students able to access the materials in a quick, non-restrictive manner?
● more on evaluating accessibility can be found at:
  ○ open washington - evaluation module - accessibility
  ○ bc campus accessibility toolkit (https://opentextbc.ca/accessibilitytoolkit/)

supplementary resources
● does the oer contain any supplementary materials, such as homework resources, study guides, tutorials, or assessments?
● have you reviewed these supplementary resources in the same manner as the original oer?

evaluation rubrics & checklists

there are plenty of rubrics and evaluation tools available. your department may already use one for evaluating other course material or textbooks for adoption. if it does, use that! apart from considering whether you want to exercise the rs and whether the licensing on the resources allows for it, evaluating oer should not be any different from evaluating other course material under consideration for adoption.
● additional evaluation tools can be found on fondren library’s oer research guide.

curriculum mapping

another successful approach to evaluating an oer is to use a course map template to track course outcomes, activities, and teaching resources.
a course map, also known as a curriculum map, is a record of teaching and learning that can provide faculty an opportunity to align oer with course learning outcomes. an added advantage of course mapping is unearthing unintentional gaps or redundancies in your learning outcomes. additionally, you can use a course map to document the license for the resource, keep track of where the resource lives online, and organize comments as you compile more resources.

as you gather your resources and plan for aspects of course redesign when incorporating your oer, know there are tools available to help you. for example, a blank course map template was created for the texas learn oer project (from which this course is adapted).

a comment on quality

often, in conversations surrounding the evaluation of oer, common questions emerge related to quality. a typical question might be: is the quality of the oer as good as commercially produced copyrighted course material? as you find and evaluate oer, challenge yourself to consider how quality is defined and measured.

take a minute to read this blog post from david wiley, on quality and oer. after reading and reflecting, do you agree or disagree with this statement?

“for educational materials, the degree to which they support learning is the only meaning of quality we should care about.”

module : accessibility

by the end of this module, you should be able to:
● explain universal design and how it improves accessibility for all learners
● identify steps for choosing and using accessible oer
● list three ways accessibility must be considered when adopting oer
● reflect on accessibility of current teaching resources and how they can be improved

accessibility and universal design

instructors should ensure that the teaching materials they use are accessible to all students. applying a universal design approach to your curriculum allows you to improve accessibility for all learners.

universal design is the design of products and environments to be usable by all people, to the greatest extent possible, without the need for adaptation or specialized design. ron mace and colleagues at north carolina state university coined the term universal design (ud), with the understanding that designing to meet the needs of disabled people benefits everyone. for example, a curb cutout, designed to accommodate wheelchairs transitioning from sidewalks to streets, also benefits people with strollers, bike riders, and people who may have depth issues.

what does ud mean for learning and curriculum design? universal design means that we design courses to be as useful as possible to the widest range of people. a proactive approach improves accessibility for all students. for example, although closed captions are added for deaf students, many students may use them when watching online videos in the library or if they are learning english. using a ud framework makes our courses more user-friendly for all learners.

an overview of accessibility

as instructors, we have legal and ethical obligations to ensure that our courses are fully accessible to all learners, including those with disabilities.
we use digital resources in our courses because we believe they enhance learning. however, unless carefully chosen with accessibility in mind, these resources can have the opposite effect for students with disabilities, erecting daunting barriers that make learning difficult or impossible. for example, consider the accessibility challenges that the students described below might face:
● students who are deaf or hard of hearing are unable to access the contents of a video presentation unless it’s captioned.
● students who are blind or visually impaired use assistive technologies such as audible screen reader software or braille devices to access the content of websites, online documents, and other digital resources. they depend on authors providing alternate text that describes the content of images as well as headings, subheadings, lists, and other markup that helps them understand the structure and outline of the resource.
● some students who have learning disabilities such as dyslexia use assistive technologies that visibly highlight digital text as it’s read aloud, and are therefore dependent on text being readable (as opposed to a scanned image).
● students who are physically unable to use a mouse are unable to use interactive web and software applications unless these applications can be operated with a keyboard.
● students who are color blind may be unable to understand content that communicates information solely using color (for example, a bar chart with color as the sole means of differentiating between the bars).

the web content accessibility guidelines (wcag) . , developed by the world wide web consortium, provide an international standard that defines accessibility of web-based resources. the principles of wcag . are applicable to other digital assets as well, including software, video, and digital documents.
the do-it (disabilities, opportunities, internetworking, and technology) center at the university of washington (https://www.washington.edu/doit/) has a wealth of resources available to instructors on universal design in the classroom and in digital resources. its accessibility checklist (http://www.washington.edu/accessibility/checklist/) can help anyone creating or choosing digital resources to understand the accessibility requirements related to the features and functions of those resources.

the rest of this module provides tips for ensuring that the resources you’re choosing for your course are accessible to all learners.

choosing and using accessible video

when selecting video, be sure to choose videos that include accurate closed captioning. closed captions provide a text version of the spoken audio and other critical sounds, displayed in sync with the video. closed captions make video accessible to students who are deaf or hard of hearing but also benefit many others: they help second-language students understand the spoken audio; they help all students learn the spelling of the words that are being spoken; they make it possible to search the video for specific content; and they can be repurposed as an interactive transcript, which is a great feature for everyone!

captions are supported by all major video hosting services including youtube and vimeo. if a video is captioned, it will have a cc button on the video player. additionally, when selecting audio files (like a podcast), be sure the file also has a full written transcript available.

video resources

youtube automatically captions most videos that are uploaded to its website. however, automatic captions, which are created by a computer, are not accurate enough to be relied upon (consider the effect of one missed “not” on the meaning of the video).
to check whether a video has reasonably accurate captions created by humans, click the cc button on the video player to turn captions on, and watch a few short segments of the video.

consult the following resources for additional information on finding videos that have captions:
● searching youtube for videos with captions
● turning youtube captions on and off

if you find an open-licensed video that is perfect for your course but does not currently have captions, caption it! here’s how:
● youtube: how to contribute subtitles and closed captions
● ted open translation project (https://www.ted.com/participate/translate)
● khan academy: volunteer (https://www.khanacademy.org/contribute)
● amara (http://amara.org/) – a free tool for captioning and subtitling any public video
● dotsub (https://dotsub.com/) – another free tool for captioning and subtitling any public video

choosing and using accessible images

if images are used to communicate information, they should include short text descriptions for individuals who are unable to see the images. these short descriptions are typically referred to as “alternate text” or “alt text.” most authoring tools that support adding images to content also support adding alt text to an image. when you’re adding an image to a web page or document, simply look for an “alt text” field in the image properties dialog and enter a short description into the space provided. if the authoring tool does not support "alt text", include a description of the image after the figure title. note that any image containing words, such as a screen capture of a quote or "tweet", should include a transcript of the words displayed in the image. this is also good practice when you're sharing these images on social media!

the alt text that you enter for a particular image depends on the context. think about what you want to communicate by adding the image. then, add alt text that will communicate the same idea to someone who is unable to see the image.

the following resources provide additional guidance for writing good alt text.
● webaim: alternate text (http://webaim.org/techniques/alttext/)
● guidelines for describing stem images – national center for accessible media (https://www.wgbh.org/foundation/ncam/guidelines/guidelines-for-describing-stem-images)

if the image contains important detail that is too complex to be described in one or two brief sentences (for example, a chart or graph), then the text description will need to be provided separately from the image, either within surrounding text on the same page, or on a separate page that is accessible via a link on the main page. remember, if it is an image of text, you must provide a transcript.

choosing and using accessible course material

when choosing among the wide variety of course materials that are available, be sure to consider whether these materials might present challenges or barriers for students with disabilities. ask specific questions, such as:
● is all written content presented as text, so students using assistive technologies can read it?
● if the materials include images, is the important information from the images adequately communicated with accompanying alt text?
● if the materials include audio or video content, is it captioned or transcribed?
● if the materials have a clear visual structure including headings, sub-headings, lists, and tables, is this structure properly coded so it’s accessible to blind students using screen readers?
● if the materials include buttons, controls, drag-and-drop, or other interactive features that are operable with a mouse, can they also be operated with a keyboard alone for students who are physically unable to use a mouse?
● do the materials avoid communicating information using color alone (e.g., the red line means x, the green line means y)?

if you find open course materials that are perfect for your course but you are unable to answer “yes” to each of the above questions, contact the author and talk to them about accessibility. your feedback may inspire them to improve the accessibility of their materials, which will benefit everyone!

choosing and using accessible textbooks

many of the downloadable textbooks available through sites like openstax or open textbook library are provided in pdf format. pdf, like most other document formats, includes support for accessibility features such as headings, subheadings, lists, and alt text on images, but the author and/or publisher must make a conscious effort to include these features.

in order to support accessibility features, a pdf file must be tagged. a tagged pdf is a type of pdf that includes an underlying tagged structure that enables headings to be identified as headings, lists as lists, images as images with alt text, etc. tags provide the foundation on which accessibility can be built.

to determine whether a particular pdf is tagged, open it in adobe acrobat or adobe reader and go to document properties (ctrl + d in windows; command + d in mac os x). in the lower left corner of the document properties dialog, “tagged” is either “yes” or “no.”
resources

the following resources provide additional guidance on creating accessible documents, particularly in pdf, on evaluating whether pdfs are accessible, and on fixing their accessibility problems if they are not. additionally, reach out to staff at your institution, such as instructional designers or an accessibility support specialist, for help and guidance.
● adobe: pdf accessibility overview (http://www.adobe.com/accessibility/pdf/pdf-accessibility-overview.html)
● webaim: pdf accessibility (http://webaim.org/techniques/acrobat/)
● bc campus open education accessibility toolkit (https://opentextbc.ca/accessibilitytoolkit/)

if you find an open textbook that is perfect for your course but is not accessible, contact the author and talk to them about accessibility.

module : creative commons licensing in-depth

by the end of this module, you should be able to:
● identify the differences between the six currently available creative commons licenses
● identify the conditions, including attribution, that apply when using openly licensed material
● recognize how different license permissions impact remixing compatibility
● use tools to guide you in choosing the appropriate license for your own work
● use tools for creating attribution statements in your work

cc-by cc-by-sa cc-by-nc cc-by-nc-sa cc-by-nd cc-by-nc-nd

no, that wasn’t a typo! the acronyms above are representative of the six different creative commons (cc) licenses. in module you were introduced to open licenses and how they differ from all rights reserved copyright. in this module, you will learn about the different conditions and permissions of these licenses.
this short slide show presentation provides the nuts and bolts of creative commons licenses and their conditions: creative commons licensing, the rs, and oer: the shortest possible introduction (https://docs.google.com/presentation/d/ _temmzngibzfrmjxh zsouhibonmgifk u uiyegauo/edit?usp=sharing)

six licenses

all creative commons (cc) licenses are structured to give the user permission to make a wide range of uses as long as the user complies with the conditions in the license. the basic condition in all of the licenses is that the user provides credit to the licensor and certain other information, such as where the original work may be found. understanding the meaning of each condition can be useful when deciding which cc license to use on your own work. as discussed in module , understanding the meaning of the conditions can also be useful in evaluating an open resource.

each condition has its own symbol:
● attribution or “by”: all of the licenses include this condition.
● sharealike or “sa”: adaptations based on this work must be licensed under the same license.
● noncommercial or “nc”: the work is only available to be used for noncommercial purposes.
● noderivatives or “nd”: reusers cannot share adaptations of the work.

all of the licenses include the by condition: all of the licenses require that the creator be attributed in connection with their work. beyond that commonality, the licenses vary in whether (1) commercial use of the work is permitted and (2) the work can be adapted, and if so, on what terms.
the six licenses, from least to most restrictive in terms of the freedoms granted reusers, are:
● the attribution license or “cc by” allows people to use the work for any purpose (even commercially and even in modified form) as long as they give attribution to the creator.
● the attribution-sharealike license or “by-sa” allows people to use the work for any purpose (even commercially or in modified form), as long as they give attribution to the creator and make any adaptations they share with others available under the same or a compatible license.
● the attribution-noncommercial license or “by-nc” allows people to use the work for noncommercial purposes only, and only as long as they give attribution to the creator.
● the attribution-noncommercial-sharealike license or “by-nc-sa” allows people to use the work for noncommercial purposes only, and only as long as they give attribution to the creator and make any adaptations they share with others available under the same or a compatible license.
● the attribution-noderivatives license or “by-nd” allows people to use the unadapted work for any purpose (even commercially), as long as they give attribution to the creator.
● the attribution-noncommercial-noderivatives license or “by-nc-nd” is the most restrictive license offered by cc. it allows people to use the unadapted work for noncommercial purposes only, and only as long as they give attribution to the creator.

note: the no-derivatives condition allows sharing and reuse but only if the content is left unchanged. this presents an issue when searching for oer, as no customization or adaptation is allowed by the license. for this reason, nd content is not considered oer and should be considered for reuse only in situations where no adaptations are needed.

attribution

all six of the creative commons licenses include the by or attribution condition.
this is a requirement of reuse. the original creator has explicitly informed the user of this requirement through the use of the by condition. as you learned in the slide show presentation earlier in this module, citations and attributions are similar but different. providing attribution is the legal requirement of the open license. while some tools, like cc search (https://ccsearch.creativecommons.org/), include the attribution in the resource, there are other tools available to help users easily create attribution statements for work they reuse, remix, or modify.
● attribution builder (http://www.openwa.org/open-attrib-builder/) - similar to a citation generator, this tool builds attribution statements that can be copied and pasted into documents and websites. note: all the attribution statements for these modules were created using this tool.

when creating attribution statements a good rule of thumb is to remember the acronym tasl:
● title of the work
● author of the work
● source or where the work can be found
● license of the work

include urls, when available. example: "creative commons th birthday celebration san francisco" (https://www.flickr.com/photos/sixteenmilesofstring/ /in/set-) by tvol (https://www.flickr.com/photos/sixteenmilesofstring/) is licensed under cc by . (https://creativecommons.org/licenses/by/ . /)

note: you may not always have all the information above; include as much as you can.

combining cc licenses

the by (attribution) condition is a part of all the licenses, but not all of the conditions work together. for example, the sa and nd conditions do not appear in the same license because there is no reason to include the share-alike condition when no derivatives are being allowed.
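the combination rule just described can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function name `cc_licenses` and the lowercase labels are hypothetical, and the rule encoded is the one given in the text (by is always present, nc is optional, and sa and nd never appear together).

```python
from itertools import product


def cc_licenses():
    """Enumerate license names from the conditions described in the text.

    Assumptions (illustrative only): BY is always included, NC is optional,
    and at most one of SA or ND is added because the two never co-occur.
    """
    names = []
    for noncommercial, adaptation in product((False, True), (None, "sa", "nd")):
        parts = ["by"]
        if noncommercial:
            parts.append("nc")
        if adaptation is not None:
            parts.append(adaptation)
        names.append("cc " + "-".join(parts))
    return names


if __name__ == "__main__":
    for name in cc_licenses():
        print(name)
```

two choices for noncommercial use times three choices for adaptations (none, sa, nd) give exactly six combinations, which is why there are six licenses and no license that carries both sa and nd.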
together, the conditions form the six cc licenses.

as you find different types of oer to use in your courses, you may find the need to remix and modify the content. understanding how the different licenses can or cannot be combined is a critical step in reusing openly licensed material. the license compatibility chart illustrates how licenses work, or don't work, together: "license compatibility chart" (https://wiki.creativecommons.org/wiki/wiki/cc_license_compatibility) by creative commons (https://creativecommons.org/) is licensed under cc by . (http://creativecommons.org/licenses/by/ . )

choosing a license for your work

remember, when sharing your work, selecting and displaying a license with it ensures the work can be adopted and adapted how you want! if you don't select a license, all published material may be assumed to be all rights reserved even if you intended it to be openly licensed. when creating work to share, carefully consider how you want your work to be used when choosing which open license to apply. as the original creator of your work, you have choices.
● do you want to allow derivatives?
● do you want to allow use for commercial purposes?
● do you want the same license to be applied to derivatives?
● if this work was made using openly licensed material, is there a copyright provision you must abide by?

creative commons designed the licenses to provide more options to the creator than all-rights reserved copyright. the cc license chooser (https://creativecommons.org/choose/) is a simple tool designed to help creators decide which license is best for their work. with two questions, the tool prompts you to select conditions for sharing your work. it then generates a license icon, a statement, and code to embed, which you can copy and paste into your work.

remember: when remixing content to create something new, if any of your adapted content includes the sa (share alike) condition, you must apply the sa condition to your newly remixed finished work.

fondren library staff are available to help you navigate creative commons licenses. please contact digital scholarship services at cds@rice.edu.

module : adapting, creating & sharing oer

by the end of this module, you should be able to:
● determine reasons for adapting & creating
● apply needed steps for adapting & creating oer with proper attribution and licensing
● recognize the considerations in choosing a license for your work
● recognize the variety of creation and authoring tools available
● create your own oer

in the previous six modules, you’ve learned a great deal about open educational resources and how they can be used as effective teaching and learning material in your courses. in this module, you will gain experience in applying what you’ve learned to successfully adopt, adapt, and create an oer.

adapting an existing open educational resource

the term adaptation is commonly used to describe the process of making changes to an existing work. we also can replace “adapt” with revise, modify, alter, customize, or other synonyms that describe the act of making a change. one advantage of choosing an open educational resource is that it gives faculty the legal right to add to, adapt, or delete content from the open work to fit their specific course without obtaining permission from the copyright holder. as you learned in module , this is possible because the copyright holder already has granted permission by releasing their work using an open (or creative commons) license.
if you are considering making changes to an open resource, such as an open textbook, ask yourself the following questions:
● how much content do i wish to change? do i want to remove chapters or rewrite entire chapters?
● what technical format is the original textbook - an ms word doc, google doc, or pdf? a word document is much easier to modify than a pdf document.
● what type of license is the content released under? does it have a creative commons license that allows for modification or adaptation of the content?
● how comfortable are you with using technology and creating content?

if you decide to adapt an existing open resource, here are six recommended steps to follow:
1. check the license of the work - does it allow for modifications or derivatives?
2. check the format of the work - common formats are html files (webpages), word or open documents (google docs), text files, epub, latex files (if the original book includes math or science formulas and equations).
3. choose tools for editing an open textbook (or other open resource) - there are many available. your choice of editing tool may vary depending on the original format of the resource.
4. choose the output for the work - students like having material in multiple formats. this allows them to choose what works best for them. some may prefer printed versions of the textbook; others will prefer using a website. still others will like to use an e-reader or e-reading software. by offering multiple formats you are making your content more accessible.
5. determine access for the work - how will your students access the content? will it be available in an lms, google classroom, oer commons, or another online hosting service?
6. choose a license - the open license you choose will depend on how the textbook you adapted was licensed.
for example, if the original textbook was licensed with a creative commons attribution-sharealike (cc by-sa) license, then you must release your book with the same license to ensure it is compliant with the terms of use.

creating open educational resources

the alms framework

for work to be truly “open” and allow the r permissions, the work should be meaningfully accessible and editable. how can you ensure adopters can easily reuse, revise, remix, redistribute, and retain the work? the alms framework, established by hilton, wiley, stein, and johnson ( ) (https://scholarsarchive.byu.edu/facpub/ /), highlights the vital importance of offering source files and creating work in easily adoptable formats.
● access: offer in a format that can be easily edited with freely accessible tools
● level: format should not require advanced technical expertise to revise content
● meaningful: offer in an editable format
● source: source file that is accessible and editable

using the alms framework offers oer creators a structure guiding the openness of the content while ensuring access to adopters in a meaningful way. when creating work, consider sharing it in several formats that permit accessible classroom adoption: ms word, pdf, and google doc. which source file do you prefer to use?

review the video below to get a brief introduction to creating oer: creating open educational resources: tips for new creators (https://youtu.be/dv-hiwtmq u)

the video outlines tips for creators:
● determine how your oer will meet your course needs
● check if you've already created something you can use as a base for your oer
● evaluate tools and determine where you will build your oer
● consider what license you will apply to your oer
● decide where and how you want to share your oer

there are low tech, medium tech, and high tech tools and authoring platforms available to create your oer. consider the tips previously mentioned and determine which tool best meets your needs. here are some widely used tools:
● google docs
● google sites
● google slides
● adobe spark
● pressbooks
● oer commons open author

whichever creation tool or authoring platform you choose, be aware of any restrictions this tool may have on how the final work may be published or shared. before creating your work, look closely at the terms of use for that product. additional information about publishing your oer can be found in sharing your work.

examples of faculty and student-created oer:
● university of texas at arlington
  ○ mavs open press: collection of open textbooks (https://uta.pressbooks.pub/)
  ○ collection of oer created by uta faculty, staff, and students (https://rc.library.uta.edu/uta-ir/handle/ /)
● university of houston open textbooks catalog (https://uhlibraries.pressbooks.pub/catalog/)
● mathematical reasoning: writing and proof (http://scholarworks.gvsu.edu/books/ /) - this open textbook’s author won an award ( ) from the mathematical association of america for its impact on undergraduate students.
● george mason english open educational resources (https://journals.gmu.edu/index.php/oerenglish) - a peer reviewed repository of open educational resources for use in mason’s english classes. the site allows faculty to easily search, reuse, and adapt the teaching materials developed by their colleagues.
● the open logic project (http://openlogicproject.org/) - an international collaboration of people contributing to an open textbook in logic.

sharing your work

are you interested in sharing your material? do you have an engaging course activity, image, assessment item, video, or a whole course that might be beneficial to other faculty in your discipline? sharing your work is a personal choice and can be daunting, but it also can be rewarding. sharing your work with others allows for increased use as well as opportunities for collaboration, enhancement, and improvement of your work. you can start small by sharing your work with others in your department or just at your institution. or, if you are ready, you can share it globally with other educators and students, thus contributing to the open education community at large. whether you share it locally or globally as an oer, consider the following steps as your guide to sharing your work:

step 1: terms of use

decide on the terms of use. do you wish to release your work under a creative commons license or in the public domain? please make sure to review the difference between these two copyright terms:
● by releasing your work under a creative commons license, you retain ownership while allowing others to use your work (as long as they attribute it to you) without needing to ask permission of you directly.
● by releasing your work in the public domain, your copyright ownership is waived. it is as if you are giving your work to the public as a gift. users may still cite you when adopting your work, but they are not required to do so.

please see “what is the difference between public domain and open license?” in module for details.

step 2: seeking copyright clearance

be sure that the work is eligible to be shared. to release your work with a cc license or in the public domain, your work should be cleared of all copyright issues. to do so, your work should be one or a combination of the following types:
1. your original work,
2. built from open resources,
3. built from the public domain,
4. built from copyrighted work that you obtained permission to use and distribute for the life of your openly licensed work, or
5. a combination of the above works

note: for any third-party materials, whether openly licensed or copyrighted, those materials need to be attributed as not governed by the cc license you chose for your work, but under different terms and by different authors.

getting permission to use copyrighted materials

if you must use any items that are copyrighted with all-rights reserved, be sure to obtain the permission letter(s) from the author(s). sample email to ask for permission to use the work:

hello dr. r.b lone star,

i am a faculty member with the ____ project. the purpose of this project is to design openly licensed science and technology courses that can be taught face-to-face, hybrid, and/or online. these courses will be freely available on the internet for anyone to copy, modify, and use. one of the purposes of this project is to offer educational resources to regions where formal educational opportunities are scarce or expensive.

i am creating a course titled “horticulture history of the texas bluebonnet” and i would like to use a post from your blog titled “environment and climate: impacts on the texas bluebonnet” from february . i am seeking your permission to distribute this material as part of our course. you will maintain your copyright but will be giving us permission to distribute this material for reuse as part of the teaching of this course. we will most likely copy the text of your post into a google document and attribute you. a full citation for the work will accompany it, as will a statement of copyright ownership.

please contact me at xxxx@bluebonnetu.edu or by telephone at -xxx-xxxx with information about this request. thank you for your time and attention.

regards,
your name

step 3: selecting a repository

images

consider flickr (https://www.flickr.com/) or wikimedia commons (https://commons.wikimedia.org/wiki/main_page). as you upload your image to these repositories, you will see the option to select the terms of use. open washington has created simple instructions (http://www.openwa.org/flickr-instructions/) if you need help in uploading an image to your flickr account and marking it with a cc license.

videos

consider youtube (https://www.youtube.com/) or vimeo (http://vimeo.com/). for help, consult these instructions created by open washington for uploading videos in youtube (http://www.openwa.org/youtube-instructions/). always provide captions to your videos. youtube automatically creates captions; always verify that the captions are correct. they can be edited easily by following these simple instructions (https://support.google.com/youtube/answer/).

course materials

consider oer commons (https://www.oercommons.org/). additionally, if your institution has an institutional repository, work with your librarians to add your work to your institutional collection. alternatively, web storage space like google drive allows for easy and free access. if you choose a web storage space, make sure to (1) manually mark your work as cc-licensed or in the public domain by placing the copyright notice somewhere visible and (2) make the link accessible by the public.

rice digital scholarship archive (https://scholarship.rice.edu/)

the rice digital scholarship archive (rdsa) provides global access to research and scholarship produced at rice university. managed by fondren library, we welcome the addition of rice-created oer. benefits of using the rdsa include:
● visibility: when you deposit your work in the rdsa, it becomes available to search engines as part of a worldwide network of research collections, so your peers worldwide will be able to find it quickly. items can also be assigned dois, which further helps with citation and discovery.
● stability: each item deposited in the rdsa gets a permanent, citable, linkable url that will not change or break over time.
● longevity: the rdsa provides long-term storage for your materials by managing backups, and ensuring that your work remains accessible at a stable location on the web and available to search engines. rdsa will help keep works in common file formats up to date, ensuring that as technology and formats evolve, your work will remain accessible and usable.

if you would like to learn more, please contact fondren’s digital scholarship services at cds@rice.edu.

the lognormal distribution is not an appropriate parametric model for shot length distributions of hollywood films

nick redfern

abstract

we examine the assertion that the two-parameter lognormal distribution is an appropriate parametric model for the shot length distributions of hollywood films. a review of the claims made in favour of assuming lognormality for shot length distributions finds them to be lacking in methodological detail and statistical rigour. we find there is no supporting evidence to justify the assumption of lognormality in general for shot length distributions. in order to test this assumption we examined a total of hollywood films from to , inclusive, to determine goodness-of-fit of a normal distribution to log-transformed shot lengths of these films using four separate measures: the ratio of the geometric mean to the median; the ratio of the shape factor σ to the estimator $\sigma^{*} = \sqrt{2 \times \ln(\bar{x}/M)}$; the shapiro-francia test; and the jarque-bera test. normal probability plots were also used for visual inspection of the data.
the results show that, while a small number of films are well modelled by a lognormal distribution, this is not the case for the overwhelming majority of films tested ( out of ). therefore, we conclude there is no justification for claiming the lognormal distribution is an adequate parametric model of shot length data for hollywood films, and recommend the use of robust statistics that do not require underlying parametric models for the analysis of film style.

keywords: lognormal distribution, goodness-of-fit, film style, shot length distribution

je n'avais pas besoin de cette hypothèse-là (i had no need of that hypothesis)
pierre-simon laplace

1. introduction

although the most frequently cited statistic of film style is the average (mean) shot length (asl), the distribution of shot lengths in a motion picture is definitely not normal. the distribution of shot lengths in a motion picture is typically positively skewed and contains a number of outlying data points whilst being bounded by zero at the lower end. given the skewed nature of this data, a logarithmic transformation may be appropriate before proceeding with statistical analysis in order to ‘normalise’ the distribution by removing the skew and the influence of outliers (quinn & keogh: - ). the data can then be summarised by finding the parameters for the underlying lognormal distribution, and can be analysed using parametric methods that assume data is normally distributed, with the results transformed back to the original scale. the use of the lognormal distribution for describing the shot lengths in a motion picture in general has been suggested in two papers: one by barry salt (with an additional commentary online), and one by jordan de long, kaitlin l. brunick, and james e. cutting.

the lognormal distribution and hollywood cinema [ ]

in this paper we review the claim that a lognormal distribution is an appropriate parametric model for the distribution of shot lengths in a motion picture.
in the next section we introduce the lognormal distribution and note some features relevant to this study. in section three we review the claims that this distribution is an appropriate model for shot lengths in hollywood films, focussing on the methodology employed and the conclusions derived. in section four we test these claims against the shot length distributions of a sample of films using a range of different methods to determine if lognormality is an appropriate assumption. from these results we draw some conclusions about the appropriate use of statistics in the analysis of film style.

2. the lognormal distribution

the distribution of shot lengths in a motion picture is asymmetric and exhibits positive skewness. in part this is due to a number of shots that are much longer than the majority of others. an additional reason is that the distribution is bounded on one side by zero: no shot can be equal to or less than zero seconds long. this means the distribution cannot be approximated by a normal distribution since this would imply that some shots have negative length, and this is obviously impossible. therefore the mean shot length and the standard deviation cannot be used to give an accurate description of film style since these are the parameters of the normal distribution. in these circumstances we may suspect that a skewed probability distribution will prove to be an adequate model for the data, and, given that the distribution of the data is bounded below by zero, that the lognormal distribution will fulfil this role. in applying a logarithmic transformation we do so in the expectation that the resulting data set will be symmetrical and that the influence of outliers will be removed (i.e. the transformed variable will be normally distributed), and that the dependence of the standard deviation on the mean will be eliminated.
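the point that a normal model implies impossible shot lengths can be made concrete with a minimal sketch. the function name and the mean and standard deviation used here are hypothetical, chosen only for illustration, and are not values from any film in the sample.

```python
from statistics import NormalDist


def share_at_or_below_zero(mean_seconds, sd_seconds):
    """Probability that a normal model assigns to shot lengths <= 0 seconds."""
    return NormalDist(mu=mean_seconds, sigma=sd_seconds).cdf(0.0)


if __name__ == "__main__":
    # Hypothetical summary statistics for a positively skewed film:
    # a standard deviation close to (or above) the mean is what makes
    # the normal model place noticeable mass below zero.
    p = share_at_or_below_zero(mean_seconds=10.0, sd_seconds=12.0)
    print(f"share of shots implied to be <= 0 s: {p:.1%}")
```

whenever the standard deviation is of the same order as the mean, as is common for skewed shot length data, the normal model places a substantial share of its probability on lengths that cannot occur.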
the lognormal distribution is defined in relation to the normal distribution: a random variable $X$ is lognormally distributed if its logarithm $Y = \log(X)$ is normally distributed. therefore, the lognormal distribution is a continuous probability distribution with the density function

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^{2}}\left(\log x - \mu\right)^{2}\right\}, \quad 0 < x < \infty,$$

where μ is the arithmetic mean of $\log(X)$ and σ is the standard deviation of $\log(X)$. this is true irrespective of the base of the logarithm, but throughout this paper we will use the natural logarithm (i.e. the logarithm to base e). kleiber and kotz ( : - ), burmaster and hull ( ), and limpert, stahel, and abbt ( ) provide detailed reviews of the properties of the lognormal distribution.

back-transforming μ into the original scale of the data gives the geometric mean (g), which is approximately equal to the median (m) if the data are lognormal since the median of a lognormal distribution is at $x = \exp(\mu)$. in this context $\exp(\mu)$ is a location factor. as the distribution is right-skewed both the geometric mean and the median are less than the arithmetic mean ($\bar{x}$) of the data in the original scale. however, if a random variable is lognormally distributed then we can estimate the value of the arithmetic mean by exploiting a mathematical relationship between the arithmetic mean, the median, and the shape factor of the lognormal distribution, where $\bar{x} = M \times \exp(0.5\sigma^{2})$.

footnote: for example, based on data from the sample discussed below, the mean shot length of the apartment is . seconds with a standard deviation of . seconds, implying that % of the shots in this film are less than or equal to seconds.

since $X$ is lognormally distributed if $Y = \log(X)$ is normally distributed, it is easy to analyse the distribution of data as all we need do is apply a logarithmic transform and then use statistical methods based on normal distributions.
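the relationships above (geometric mean close to the median, and arithmetic mean close to the median times exp(0.5σ²)) can be checked numerically. the sketch below uses simulated lognormal data rather than real shot lengths; the function names and parameter values are assumptions made for illustration only.

```python
import math
import random
import statistics


def simulate_shot_lengths(n, mu, sigma, seed):
    """Draw n lognormal 'shot lengths' (simulated, not real film data)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def lognormal_summary(shot_lengths):
    """Summarise data on the log scale and back-transform to seconds."""
    logs = [math.log(x) for x in shot_lengths]
    mu = statistics.fmean(logs)        # arithmetic mean of log(X)
    sigma = statistics.stdev(logs)     # shape factor: sd of log(X)
    median = statistics.median(shot_lengths)
    return {
        "geometric_mean": math.exp(mu),
        "median": median,
        "sigma": sigma,
        "mean": statistics.fmean(shot_lengths),
        # estimate of the arithmetic mean implied by lognormality
        "mean_from_median": median * math.exp(0.5 * sigma ** 2),
    }


if __name__ == "__main__":
    data = simulate_shot_lengths(20000, mu=1.5, sigma=0.8, seed=1)
    for key, value in lognormal_summary(data).items():
        print(f"{key}: {value:.3f}")
```

for data that really are lognormal, the ratio of the geometric mean to the median and the ratio of the estimated to the observed mean should both be close to one; marked departures from one are a first sign that the lognormal model does not fit.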
testing the goodness-of-fit of a lognormal distribution to data is therefore equivalent to testing the goodness-of-fit of a normal distribution to $Y = \log(X)$. the principles and methods of testing the goodness-of-fit of a normal distribution are well understood (see thode ), and normality tests are widely available in statistical software packages.

3. the lognormal distribution and film style

in this section we examine the methodology and conclusions behind the claims of salt and of de long, brunick, and cutting that a lognormal distribution adequately models the frequency distribution of shot lengths in a motion picture.

3.1 barry salt, ‘let the numbers speak’ ( ) and ‘the metrics in cinemetrics’ ( )

the strongest claim for the use of the lognormal distribution in analysing film style has been put forward by barry salt ( : - ; ), who asserts the generality of the lognormal distribution for the shot length distributions of all films. on the basis of this assertion he makes a series of subsequent general claims about shot length distributions: that the asl is an informative statistic even with heavily skewed shot length distributions because $\bar{x} = M \times \exp(0.5\sigma^{2})$ under lognormality; that the characteristic shape factor of shot length distributions is σ = . ; and that the asl and σ are ‘fairly independent’ up to asl ≈ seconds.

there are three principal problems with salt’s underlying claim of lognormality. the first problem is the sample of films on which these claims are based. in salt ( ) the sample used contains a total of just films, and one of these uses data from only the first minutes of the film. of the films in the sample two are silent films ( from and from ), were released in the s, four from the s, and one each from the s, s, s, and s. six of the films are french, one is german, one is brazilian, and the remainder are american.
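as an illustration of testing lognormality by testing the normality of log-transformed data, here is a minimal pure-python sketch of the jarque-bera test named in the abstract. it relies on the standard asymptotic result that the statistic follows a chi-squared distribution with two degrees of freedom, for which the upper tail probability is exp(-jb/2). the function names and the simulated exponential data are assumptions for illustration, not the paper's data or software.

```python
import math
import random


def jarque_bera_log(shot_lengths):
    """Jarque-Bera statistic and asymptotic p-value for log(shot length)."""
    logs = [math.log(x) for x in shot_lengths]
    n = len(logs)
    mean = sum(logs) / n
    # central moments of the log-transformed data
    m2 = sum((v - mean) ** 2 for v in logs) / n
    m3 = sum((v - mean) ** 3 for v in logs) / n
    m4 = sum((v - mean) ** 4 for v in logs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # chi-squared survival function with 2 degrees of freedom
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def simulate_exponential_lengths(n, seed):
    """Simulated skewed but non-lognormal 'shot lengths' (illustrative)."""
    rng = random.Random(seed)
    return [rng.expovariate(0.2) for _ in range(n)]


if __name__ == "__main__":
    jb, p = jarque_bera_log(simulate_exponential_lengths(2000, seed=1))
    print(f"jb = {jb:.1f}, p = {p:.3g}")
```

exponential data are positively skewed on the original scale, much like shot lengths, yet their logarithms remain clearly skewed, so the test rejects lognormality. skewness on the arithmetic scale is therefore not evidence for a lognormal model.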
this sample cannot be considered to represent any population of films sorted by historical era or by nation, and it is unclear on what basis the results are generalized to other films. salt ( ) provides further individual examples but again there is no attempt to systematically analyse a large sample of films that represents a defined population.

the second problem is the method by which salt measures goodness-of-fit. salt measures goodness-of-fit by plotting the frequency distribution of shot lengths in a histogram with the predicted frequency distribution from a lognormal distribution on logarithmic plotting paper. goodness-of-fit is then determined using the coefficient of determination (r²), with values close to 1 considered as evidence the data is well fitted by a lognormal distribution. this is described as the ‘standard method’ of determining the relationship between two variables, and the squared correlation between the observed and theoretical quantiles of a distribution is a powerful method of determining goodness of fit when combined with normal probability plots (see sections . . and . . ). the use of r² here, however, is incorrect. sorting the data changes its structure so that data values are no longer independently and identically distributed, and it is necessary to take into account the high correlation and heteroscedasticity of the order statistics. in these circumstances ordinary least squares regression is not appropriate since the relationship between the ordered data and the ordered theoretical values is always monotonically increasing, and it is necessary to interpret the results in the context of a generalized least squares model (cullen & frey : ). there is no indication given by salt that this is the case, and since these features may result in high values of r² even if the underlying distribution is not lognormal, the results presented are unreliable.
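the warning that ordered data yield high r² values even when the model is wrong can be illustrated with a simplified sketch. this is not salt's exact procedure (which used binned frequencies); it computes the squared correlation between sorted log-transformed values and standard normal quantiles for simulated exponential data, which are not lognormal. the plotting positions (i + 0.5)/n and all names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def probability_plot_r2(values):
    """Squared correlation between sorted values and normal quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # (i + 0.5) / n is one common plotting-position convention
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_q = sum(quantiles) / n
    mean_v = sum(ordered) / n
    sxy = sum((q - mean_q) * (v - mean_v) for q, v in zip(quantiles, ordered))
    sxx = sum((q - mean_q) ** 2 for q in quantiles)
    syy = sum((v - mean_v) ** 2 for v in ordered)
    return sxy ** 2 / (sxx * syy)


def log_exponential_sample(n, seed):
    """Logs of simulated exponential data: skewed, so not normal."""
    rng = random.Random(seed)
    return [math.log(rng.expovariate(0.2)) for _ in range(n)]


if __name__ == "__main__":
    r2 = probability_plot_r2(log_exponential_sample(500, seed=1))
    print(f"r^2 for non-lognormal data: {r2:.3f}")
```

because both sequences are sorted, their relationship is monotonically increasing by construction, so r² sits close to 1 even for a clearly wrong model. without a null distribution or decision rule for the statistic, a high r² alone cannot establish lognormality.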
additionally, since the data is binned, only a limited amount of information is used, and the degree of correlation between the theoretical and observed values depends on the size of the bins chosen for the histogram rather than on the data itself. finally, no decision rule or null distribution is given for the r² statistics, and so what constitutes goodness-of-fit in this context is not defined.

the third problem arises in the use of graphics to support the claim of lognormality. goodness-of-fit in both articles is represented visually using histograms with fitted lognormal distributions on an arithmetic scale rather than as histograms of the transformed data with a fitted normal distribution on a logarithmic scale. this gives a misleading impression of the goodness-of-fit, and figure illustrates the difference this can make. the histogram of salt’s data for little annie rooney (without titles) on an arithmetic scale clearly shows the positive skew and outliers of this distribution, and the fitted distribution appears to be a reasonable fit. however, when the same data is plotted on a logarithmic scale it does not show the familiar bell-shaped curve of the normal distribution and is obviously still skewed after the data transformation has been applied. the normal distribution fitted to $\log(X)$ is obviously a very poor fit. the histogram on the arithmetic scale may lead us to infer this data is lognormally distributed, and for this film salt ( ) reports that r² = . , but examination of the log-transformed data shows this conclusion to be wrong (shapiro-francia: . , p < . ).

there are, then, serious methodological issues regarding salt’s conclusions, which are based on a small, unrepresentative sample and flawed techniques for determining goodness-of-fit.
based on the evidence in this paper, the claim cannot be considered proven and certainly does not justify salt’s interpretation that the lognormal distribution is an appropriate parametric model for shot length distributions in general. in his online commentary, salt refers to aitchison and brown’s ( ) monograph on the lognormal distribution, where the plotting of the grouped cumulative frequencies on lognormal probability paper is described, but he does not apply geary’s test or the χ² test of normality to the log-transformed data even though these are discussed in the same text (see aitchison & brown : - ). in the circumstances, the χ² test would seem a more obvious choice for determining goodness-of-fit than r². however, χ² is not recommended as a normality test since binning the data is inherently wasteful and affects the test statistic, and there are many other normality tests available that are considerably more powerful (d’agostino ).

figure histograms of shot length data for little annie rooney (without titles) on an arithmetic scale with fitted lognormal distribution log-N ~ ( . , . ²) (top) and on a logarithmic scale with fitted normal distribution N ~ ( . , . ²) (bottom). source: http://www.cinemetrics.lv/movie.php?movie_id= , accessed january .

de long, brunick, and cutting, ‘film through the human visual system: finding patterns and limits’ ( )

the claim that shot length distributions are lognormal has been repeated by de long, brunick, and cutting ( ), whose research on film style is based on a sample of high grossing films at the us box office sampled five years apart from to , inclusive. they write:

despite being the popular metric, asl may be inappropriate because the distribution of shot lengths isn’t a normal bell curve, but rather a highly skewed, lognormal distribution. this means that while most shots are short, a small number of remarkably long shots inflate the mean.
this means that the large majority of shots in a film are actually below average, leading to systematic over-estimation of individual film’s shot length. a better estimate is a film’s median shot length, a metric that provides a better estimate of shot length.

the appropriateness of assuming a lognormal distribution for shot length distributions is casually asserted but is not demonstrated. although their sample included films, they provide only one example of a shot length distribution with a fitted lognormal density function to support this argument (see figure ). no other results are presented, and it appears the reader is expected to infer the generality of the lognormal distribution based on this single piece of evidence. again, no goodness-of-fit tests are conducted to test the null hypothesis of lognormality for any data set. like salt, they use the histogram and density function of the untransformed data for a night at the opera to illustrate goodness-of-fit. the result is similarly misleading, and, as we shall see below, the distribution of shot lengths in a night at the opera is definitely not lognormal.

figure histogram of shot lengths in a night at the opera with a fitted lognormal distribution, from de long j, brunick kl, and cutting je, film through the human visual system: finding patterns and limits, in jc kaufman and dk simonton (eds.) the social science of cinema. new york: oxford university press: in press. this graph was downloaded from the online version of this paper available at http://people.psych.cornell.edu/~jec /pubs/socialsciencecinema.pdf, accessed january .

if salt’s claims are the result of a methodologically flawed analysis, then de long, brunick, and cutting do not appear to have used any methodology for assessing goodness-of-fit at all. they simply state that shot length distributions are lognormal, and present one chart as evidence that this is true in general. naturally, we would not expect to find a paper presenting charts similar to figure for all films in a sample, but we may reasonably expect some more detailed evidence in support of so general and unequivocal a statement as that quoted above. given this situation, we do not consider de long, brunick, and cutting’s paper to be any sort of evidence of the lognormality of the shot length distributions of motion pictures.

there is a clear difference between these two papers in their interpretation of the asserted lognormality of shot lengths, and this also creates methodological problems for the statistical analysis of film style. salt ( ) uses the lognormal distribution to justify retaining the mean shot length based on the relationship between the arithmetic mean, the median, and σ under lognormality, while de long, brunick, and cutting adopt the opposite conclusion and refer to the lognormal distribution as justification for using the median shot length in place of the mean because it does not lead to the systematic overestimation of a film’s cutting rate given a skewed data set. therefore, although the claims made in these two papers may at first appear to be mutually supportive, they are in fact opposed in their fundamental approaches to analysing film style. inevitably, the result is unnecessary confusion for the reader since we are expected to maintain two conflicting ideas based on the same reasoning for two statistics that lead to obscure and contradictory conclusions.

is lognormality a reasonable assumption for shot length distributions of hollywood films?
since the appropriateness of the lognormal distribution for shot lengths has not been demonstrated, in this section we test this claim against a sample of hollywood films to determine if it is in fact reasonable to make any such assumption.

methods

we assessed the lognormality of the data using several different methods based on descriptive statistics, normal probability plots, and statistical tests of the null hypothesis of lognormality. two different types of normality test were applied to the logarithms of the shot length data for each film to avoid relying on a single test that may fail to identify some kinds of deviation from the assumed model. it is important to remember that failure to reject the null hypothesis is not proof that a lognormal distribution is an appropriate model. the decision on whether or not lognormality was assumed was based on consideration of all these factors together, and, if necessary, the use of additional graphical methods (e.g. histograms, density traces, etc). all statistical analysis was conducted using r (version . . ) and microsoft excel .

sample

the sample used in this study is that used by de long, brunick, and cutting in their analyses of film style since this is the data on which they apparently base their claims regarding film style. the majority of these films are american, with a handful of them british, and cover seventy years of filmmaking and a wide range of genres and filmmakers. this data can be accessed via the cinemetrics website: http://www.cinemetrics.lv/index.php. although this sample comprises a total of films we were unable to use all the films in our analysis due to rounding or what we assume are data entry errors. the minimum shot length for nine films was given as . seconds, while the minimum was less than . s for seven films. these films were excluded as no conclusion could be reached regarding lognormality because logarithms exist only for real numbers strictly greater than zero.
therefore, the sample used in this study comprises a total of films.

descriptive statistics

the two-parameter lognormal distribution is described by the parameters μ and σ, where $\log(X) \sim N(\mu, \sigma^2)$, and so we focus on estimates of these parameters as a means of assessing goodness-of-fit. the cutting rate of a film is described in seconds, and we interpret the average shot length in arithmetic space even though transforming the data implies we conduct any analysis in logarithmic space. we are therefore interested in exp(μ) as a measure of location. candidates for exp(μ) are the geometric mean ($G$) and the median ($M$). if the data are lognormally distributed then these statistics will be approximately equal, and $G/M = 1$. if this ratio differs from 1 we conclude the data is not lognormally distributed. as the median may be greater than the geometric mean, any description of the ratios for the whole sample of films will underestimate the true average discrepancy, and so we use the consistent ratio $\max(G, M)/\min(G, M)$ to estimate the median discrepancy between these two estimates of exp(μ).

salt claims we should retain the asl as a statistic of film style because its ratio to the median shot length allows us to derive the shape factor of a lognormal distribution that adequately describes the distribution of shot lengths in a motion picture. therefore, we compare two estimates of the shape factor of the distributions: σ, the standard deviation of the log-transformed shot lengths (the maximum likelihood estimate); and the estimate derived from the ratio of the arithmetic mean ($\bar{x}$) and the median ($M$): $\sigma^* = \sqrt{2 \times \ln(\bar{x}/M)}$. if the data are lognormally distributed then the two shape factor estimates will be approximately equal and $\sigma^*/\sigma = 1$. if this ratio differs from 1, the lognormal distribution is not a good model for the data.
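the descriptive checks defined above can be sketched in a few lines of code. this is a minimal illustration in python rather than the r and excel workflow used for the analysis in this paper, and the function and key names (`lognormal_ratios`, `consistent_ratio`, `g_over_m`, and so on) are hypothetical labels introduced only for this example.

```python
import math
import statistics


def consistent_ratio(a, b):
    """Ratio of the larger value to the smaller, so the result is always >= 1."""
    return max(a, b) / min(a, b)


def lognormal_ratios(shot_lengths):
    """Location and shape-factor comparisons for one film's shot lengths (seconds)."""
    if any(x <= 0 for x in shot_lengths):
        raise ValueError("logarithms require strictly positive shot lengths")
    if len(set(shot_lengths)) < 2:
        raise ValueError("need at least two distinct shot lengths")

    logs = [math.log(x) for x in shot_lengths]
    mean = statistics.fmean(shot_lengths)          # arithmetic mean (the ASL)
    median = statistics.median(shot_lengths)       # M
    geo_mean = math.exp(statistics.fmean(logs))    # G, an estimate of exp(mu)
    sigma = statistics.pstdev(logs)                # divisor n: maximum likelihood estimate

    # sigma* is only defined when the mean is at least as large as the median.
    if mean >= median:
        sigma_star = math.sqrt(2 * math.log(mean / median))
    else:
        sigma_star = float("nan")

    shape_discrepancy = (
        consistent_ratio(sigma, sigma_star) if sigma_star > 0 else float("nan")
    )
    return {
        "g_over_m": geo_mean / median,
        "location_discrepancy": consistent_ratio(geo_mean, median),
        "sigma": sigma,
        "sigma_star": sigma_star,
        "sigma_star_over_sigma": sigma_star / sigma,
        "shape_discrepancy": shape_discrepancy,
    }
```

ratios close to 1 are what lognormality predicts, but ratios close to 1 do not by themselves establish that the data are lognormal.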
again, we use the consistent ratio $\max(\sigma, \sigma^*)/\min(\sigma, \sigma^*)$ to determine the true size of the median discrepancy between estimates since σ may be greater than σ*. since salt gives the shape factor on a logarithmic scale we follow this convention, and we do not use the multiplicative standard deviation (exp(σ)).

normal probability plots with log-transformed data

a normal probability plot is a scatter plot of the quantiles of the observed distribution (i.e. the log-transformed shot lengths of a film) against the expected quantiles of the theoretical distribution (the normal distribution). the points in the normal probability plot will show a strong linear pattern if the theoretical distribution is a good model for the observed quantiles. fitting a reference line makes assessing the linearity of the plot easier, where the intercept and the slope of the line are the generalized least squares estimates of μ and σ, respectively. deviations from this line indicate the normal distribution is not an appropriate model for the data. see burmaster and hill ( ) for an overview of assessing lognormality with probability plots.

the shapiro-francia test

the shapiro-francia (sf) test ( ) is a correlation-based goodness-of-fit test related to the normal probability plot. the test statistic is the squared correlation between the expected values of the normal order statistics ($m_i$) and the observed quantiles of the log-transformed data ($y_{(i)}$),

$$W' = \frac{\left(\sum_{i=1}^{n} m_i y_{(i)}\right)^2}{\sum_{i=1}^{n} m_i^2 \times \sum_{i=1}^{n} \left(y_{(i)} - \bar{y}\right)^2}.$$

$W'$ is equal to the square of the probability plot correlation coefficient and is therefore a measure of the linearity of a normal probability plot (filliben ). looney and gulledge ( : ) point out the order statistics of the observed quantiles are highly correlated and heteroscedastic, and that the usual set of critical values used for interpreting a correlation coefficient do not apply in these circumstances.
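as an illustration of how this statistic relates to the normal probability plot, the following python sketch computes the squared correlation between the ordered log-transformed shot lengths and approximate expected normal order statistics. it is not the nortest implementation used in this paper: the use of blom's approximation for the expected order statistics is an assumption made here, the function name `shapiro_francia_w` is hypothetical, and no p-value is computed because that requires a separate approximation to the null distribution.

```python
import math
from statistics import NormalDist


def shapiro_francia_w(shot_lengths):
    """Squared probability-plot correlation for log-transformed shot lengths."""
    y = sorted(math.log(x) for x in shot_lengths)   # observed quantiles y_(i)
    n = len(y)

    # Assumption: Blom's approximation to the expected normal order statistics m_i.
    std_normal = NormalDist()
    m = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

    y_bar = sum(y) / n
    numerator = sum(mi * yi for mi, yi in zip(m, y)) ** 2
    denominator = sum(mi * mi for mi in m) * sum((yi - y_bar) ** 2 for yi in y)
    return numerator / denominator
```

values near 1 correspond to a nearly linear probability plot, but, as the surrounding text stresses, the value must be judged against the null distribution of the statistic and not against the usual critical values for a correlation coefficient.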
$W'$ should not be confused with r² since this may lead to flawed inferences arising from use of the wrong null distribution. the shapiro-francia test was implemented using the r package nortest (version . ), which allows for samples of size ≤ n ≤ . lognormality was rejected at α = . .

the jarque-bera test

as sample sizes become very large it becomes increasingly likely that a statistical hypothesis test will reject the null hypothesis in the presence of trivial deviations from the theoretical model. the number of shots in films from the sample range from to , and so we employ the jarque-bera test ( ) as an additional test of normality suitable for large sample sizes. the jarque-bera test is a goodness-of-fit test based on the skewness and kurtosis of the normal distribution. skewness describes the asymmetry of a distribution and kurtosis is a measure of the peak of the data relative to a normal distribution. for a normal distribution, skewness is equal to 0 (i.e. the distribution is symmetrical) and the kurtosis is equal to 3. the jarque-bera test compares the sample skewness and kurtosis to these values. the test statistic for a sample of size n is

$$JB = n\left(\frac{S^2}{6} + \frac{(K - 3)^2}{24}\right),$$

where $S$ is the sample skewness and $K$ is the sample kurtosis. under the null hypothesis, $JB$ has an asymptotic χ² distribution with two degrees of freedom. the jarque-bera test was implemented using the r package tseries (version . - ). lognormality was rejected at α = . .

results

table presents the full set of results for each film. we find the lognormal distribution is not an appropriate model for of the films in the sample. of the remaining nine films, lognormality appears to be a reasonable assumption in eight cases. the results for one film (blood on the sun) are inconclusive and this film is discussed separately.

descriptive statistics

the first method employed was the comparison of the ratios $G/M$ and $\sigma^*/\sigma$.
the ratios of the location estimates range from a minimum of . for harvey to a maximum of . for annie get your gun. the median is greater than the geometric mean in five cases, but is only substantially different for harvey. the median of the consistent ratios of the location estimates is . (iqr: . , % ci: . , . ). the range of values for the ratio of the shape factor estimates is from a minimum of . for harvey to a maximum of . for sense and sensibility. σ* is greater than σ in four cases, but this ratio is only substantially less than 1 for harvey. the median of the consistent ratios of the shape factor estimates is . (iqr: . , % ci: . , . ). therefore, the average discrepancy between the location estimates is % and the average discrepancy between the shape factor estimates is %.

this method of assessing goodness-of-fit depends on the proposition that if the shot lengths of a film are lognormally distributed then $G/M$ and $\sigma^*/\sigma$ are approximately equal to 1. however, there are numerous occasions when the two ratios are both approximately equal to one but the normal probability plots and/or the hypothesis tests lead us to reject the lognormal distribution as a model for the data. this may occur if the distribution exhibits bimodality after the transformation has been applied, if the distribution is symmetrical and leptokurtic, or if one of the tails of the distribution deviates markedly from the theoretical distribution while the remainder of the data points are reasonably well fitted. therefore, the ratios of the geometric mean to the median and of σ* to σ cannot be considered reliable evidence that a shot length distribution is lognormal. relying on these ratios to conclude shot lengths are lognormal is logically flawed, since it affirms the consequent in the above proposition.
these ratios are reliable evidence (by modus tollens) that the data is not lognormally distributed, and where we observe large discrepancies we can be confident the assumed model is not appropriate. however, we note the case of brief encounter as a film for which lognormality is a reasonable assumption, as figure a and the results in table demonstrate, but where the ratios of the location ( . ) and shape ( . ) estimates are greater than those of many films where lognormality is not a reasonable assumption. the % discrepancy between $G$ and $M$ for this film is just below the median discrepancy for the whole sample, and relying on this statistic may lead to rejection of lognormality when it is appropriate in this instance. the problem of interpreting these results therefore becomes one of deciding what constitutes a ‘large discrepancy’ between estimates. we therefore conclude that it is not possible to discriminate accurately between shot length distributions that are lognormal and those that are not by this method alone. relying on this method may lead researchers to reject the null hypothesis when it is true (type i error) or to fail to reject the null hypothesis when it is false (type ii error). the results of the hypothesis tests in table indicate type ii errors will be more common and that researchers will fail to reject lognormality as a model for their data by relying on these ratios.

normal probability plots with log-transformed data

figure presents the normal probability plots for the log-transformed shot length data of eight films. if the log-transformed data is well fitted by a normal distribution with parameters μ and σ then the data points will lie along a straight line. brief encounter (figure a) and barry lyndon (figure b) both demonstrate this pattern, and we conclude that, in light of the other results, the lognormal distribution is an appropriate model for shot lengths in these films.
the remaining six plots show that applying a logarithmic transformation does not necessarily produce a symmetric distribution and that there are a variety of reasons for this. a logarithmic transformation is applied to data in the expectation it will normalise skewed data and remove the influence of outliers. however, it is clear that in some cases such a transformation does not make the data symmetrical or completely solve the problem of outliers.

de long, brunick, and cutting provide the example of a night at the opera to illustrate the lognormality of shot lengths, but it is clear from the probability plot for this film (figure c) that a lognormal distribution is not a good fit. specifically, the distribution of shot lengths is positively skewed even after the data is log-transformed, and the fitted distribution N ~ ( . , . ²) underestimates the frequency of shorter shots while overestimating the number of longer takes. there are also outliers evident in both tails after a transformation has been applied. for this film there are discrepancies of % between the geometric mean and the median and of % for the two shape factor estimates, both substantially above the average values for these ratios. the ‘evidence’ presented by de long, brunick, and cutting in support of their argument is both categorically wrong (it is wrong without qualification) and a category error (it is an error of classification). a similar pattern can be seen in the plot of pretty woman (figure g), and the failure to eliminate the positive skew and/or outliers from the data is common right across the time period covered.

harvey (figure d) evidently has a leptokurtic distribution with heavier tails than expected.
this means that although the distribution of $\log(X)$ for this film is roughly symmetrical it has a higher peak than a normal distribution, indicating that the log-transformed shot lengths for this film exhibit less variation than expected (and thereby explaining why the ratio of shape factors was less than 1 for this film) with an increased density of shot lengths in both tails. a lognormal distribution clearly cannot accurately model this distribution, and it is important to consider not only the skew but also the kurtosis in interpreting shot length distributions.

the normal probability plot for the apartment (figure e) appears to be similar to brief encounter and barry lyndon, but we conclude the lognormal distribution is not appropriate based on its bimodality after transformation (see below). bimodality is evident in figure e as the distribution jumps from below the fitted line to the predicted values, but it is far easier to see this pattern in the histogram of the transformed data in figure . we can be confident that when the data points deviate substantially from a straight line there is strong evidence against lognormality, but in this instance the apparent linearity of the plot may be easily misinterpreted. again, this indicates that reliance on a single method for assessing goodness-of-fit is dangerous and may be avoided by using normal probability plots with the shapiro-francia statistic and histograms of the log-transformed data.

this plot also demonstrates a problem with applying a logarithmic transformation: because the logarithmic transformation stretches the interval [0, 1] and compresses the interval [1, ∞), we may find that some data points appear as outliers in the lower tail of the distribution. this can clearly be seen in figure e and is also apparent in figure . the ‘creation’ of outliers in the lower tail of a distribution means that in a handful of cases the log-transformed data may exhibit negative skew, but such instances are relatively rare.
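the stretching and compressing effect of the logarithm described above can be seen with a small worked example. the shot lengths used here (0.1 s, 1 s, and 10 s) are hypothetical values chosen for illustration and are not taken from any film in the sample; the helper name `log_width` is likewise introduced only for this sketch.

```python
import math


def log_width(lower, upper):
    """Width of the interval [lower, upper] after a natural-log transformation."""
    return math.log(upper) - math.log(lower)


# Arithmetic widths: 0.9 s below one second, 9 s above it.
below = log_width(0.1, 1.0)    # interval inside (0, 1]
above = log_width(1.0, 10.0)   # interval inside [1, infinity)

# On the log scale both intervals have the same width, ln(10), so each second
# below 1 s is stretched ten times more than each second between 1 s and 10 s.
stretch_below = below / 0.9
stretch_above = above / 9.0
```

a very short shot can therefore sit as far from the centre of the log-transformed distribution as a very long take, which is how lower-tail outliers can appear after the transformation.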
shampoo (figure f) exhibits a common pattern in probability plots for films from the s onwards, deviating markedly from the theoretical distribution in its lower tail. this plot indicates that the lower tail of the distribution is heavier than would be expected if $\log(X)$ were normally distributed, and, therefore, that a lognormal distribution underestimates the density of shorter shots in this film. since this feature is evident for many films in the sample, this implies that assuming the lognormal distribution as a general distribution of film style will systematically give misleading descriptions of film style. shampoo also includes an outlier in the upper tail: the longest shot for this film is given as . seconds and is three times greater than the second longest shot ( . s), and a lognormal distribution simply will not adequately model a data set containing such an extreme outlier. this outlier can be clearly seen to be substantially different after a logarithmic transformation has been applied in the upper right corner of figure f. walk the line (figure h) shows the opposite pattern to shampoo, with a lighter lower tail than expected, indicating that the fitted normal distribution overestimates the density of shorter shots. this pattern is much less common in this sample.

figure normal probability plots of log-transformed shot lengths for eight films

the shapiro-francia test

the null hypothesis of lognormality was rejected for of the films tested with the shapiro-francia test. this clearly indicates that the assumption of lognormality is not justified for the vast majority of films.

the jarque-bera test

the null hypothesis of lognormality was rejected for of the films tested with the jarque-bera test, and again we conclude that lognormality is not appropriate in the vast majority of cases.
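for readers who wish to see how the jarque-bera statistic is obtained from the log-transformed shot lengths, a minimal python sketch follows. it is not the tseries implementation used for the results reported here; the function name `jarque_bera` is hypothetical, moment estimators with divisor n are assumed, and the p-value uses the fact that the survival function of a χ² distribution with two degrees of freedom is exp(−x/2).

```python
import math


def jarque_bera(shot_lengths):
    """Jarque-Bera statistic and asymptotic p-value for log-transformed shot lengths."""
    y = [math.log(x) for x in shot_lengths]
    n = len(y)
    mean = sum(y) / n

    # Central moments with divisor n (an assumption of this sketch).
    m2 = sum((v - mean) ** 2 for v in y) / n
    m3 = sum((v - mean) ** 3 for v in y) / n
    m4 = sum((v - mean) ** 4 for v in y) / n
    if m2 == 0:
        raise ValueError("all shot lengths are identical")

    skewness = m3 / m2 ** 1.5      # 0 for a normal distribution
    kurtosis = m4 / m2 ** 2        # 3 for a normal distribution

    jb = n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)
    p_value = math.exp(-jb / 2)    # chi-squared survival function, 2 degrees of freedom
    return jb, p_value
```

because the χ² reference distribution is asymptotic, the p-value from this sketch should be treated with caution for films with few shots.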
there is a discrepancy between the results of the sf test and the jb test for the apartment and barry lyndon. for the apartment, the sf test rejected the null hypothesis while the jb statistic was not statistically significant in this instance. nonetheless, we consider the lognormal distribution to be inappropriate based on the above average difference between the two location estimates ( . ) and on the histogram of the log-transformed shot lengths (see figure ). although the resulting distribution is symmetrical, the shot lengths in this film are bimodal under a logarithmic transformation and the peak of the fitted distribution lies directly over the trough between the modes. in the case of barry lyndon, the sf test again rejected lognormality while the jb test indicates the null hypothesis is a plausible model for this data. there is no large discrepancy between the location estimates ( . ) and the ratio of the shape factor estimates ( . ) is well below the sample median. as noted above, the probability plot for this film indicates a strong linear relationship between the theoretical and observed values, and the histogram of $\log(X)$ and the fitted normal distribution (figure ) also support the interpretation that a lognormal distribution is a reasonable assumption for this data.

figure histogram of log-transformed shot lengths in the apartment with fitted normal distribution N ~ ( . , . ²)

figure histogram of log-transformed shot lengths in barry lyndon with fitted normal distribution N ~ ( . , . ²)

blood on the sun

there is one case where the results are inconclusive. in the case of blood on the sun, the null hypothesis was rejected for the sf test but not for the jb test.
the histogram of $\log(X)$ and the fitted normal distribution for this film are similarly inconclusive: although the two density estimates in figure appear to be similar, there is a greater than average discrepancy in its location estimates ( . ) and the ratio between the shape factor estimates ( . ) is below average but still much greater than 1. thus we find some evidence against lognormality for this film and some for, and although we cannot quite reject the null hypothesis of lognormality this does not justify any such assumption for shot lengths in this film. in fact, given these results are inconclusive we caution against making any such assumption and recommend a conservative approach and the use of robust methods, particularly in light of the size of the ratios of the descriptive statistics.

figure histogram of log-transformed shot lengths in blood on the sun with fitted normal distribution N ~ ( . , . ²)

discussion

the first stage in the statistical analysis of film style should always be a detailed examination of the data. exploratory data analysis (eda) is an approach to data analysis characterized by scepticism of methods that may obscure the structure of the data and openness to unanticipated patterns (tukey , hartwig & dearing , behrens ). eda employs a range of methods to maximise insight into a data set by revealing the underlying structure of the data and extracting the relevant features, identifying outliers, generating hypotheses, and testing underlying statistical assumptions. eda places substantial emphasis on resistant and robust methods requiring few assumptions about the data and which are applicable in a wide range of circumstances: ‘robustness is important in eda because the underlying form of the data cannot always be presumed, and statistics that can be easily fooled (like the mean) may mislead’ (behrens : ).
the eda approach is clearly different to classical statistics in which models and hypotheses are determined before seeing the data, and in which the objectives of statistical analysis are the estimation of model parameters and the confirmation of the presence or absence of specific features of the data. eda does not replace confirmatory hypothesis tests but assumes that the more we know about our data the more effective our subsequent analyses will be. exploratory data analysis is then a fundamental part of statistical research, and any study ‘that does not include a thorough exploratory data analysis is not complete’ (kundzewicz & robson : ). a fundamental principle of eda is that obtaining a well-rounded view of a data set requires the use of a range of exploratory methods in conjunction, combining tabular, numerical, and graphical representations in a flexible and intuitive manner.

this paper has demonstrated the importance of using a range of different numerical and graphical exploratory techniques alongside normality tests. it is clear that there is no single method for judging the fit of a lognormal distribution to shot length data, and it is necessary to use multiple methods to properly evaluate the assumptions that underpin the statistical analysis of film style. it is also clear that comparing the density trace of a fitted lognormal distribution to histograms of the untransformed data is misleading and should not be used. the histogram of the log-transformed data with a fitted normal distribution is more informative, and, when used alongside several descriptive statistics and normal probability plots, allows us to gain some insight into the results of the formal hypothesis tests of normality.

we tested the assumption of a lognormal distribution for hollywood films released from to using a range of exploratory and confirmatory statistical methods, and we found this assumption was not justified in cases.
designing a study of film style based on this assumption will lead to flawed inferences due to incorrect descriptions of film style. furthermore, reliance on parametric statistical tests that assume data is normally distributed (after an appropriate transformation is applied) may result in a loss of statistical power. departures from the underlying normal distribution may be arbitrarily small but still lead to fundamentally flawed conclusions (wilcox , huber & ronchetti : - ). from the above results we see those departures take such a variety of forms – bimodality, positive skew, negative skew, deviations in the lower or upper tails, outliers in the lower and/or upper tails, leptokurtosis – that it is unlikely there is a simple and general method for dealing with these deviations. a far simpler approach that avoids these problems is to make no such assumption, and to use statistical methods that do not depend on assumptions of lognormality. this can be illustrated in the choice of measures of central tendency and dispersion for describing shot length distributions. salt has previously claimed the asl is a useful statistic of film style, but this appears inconsistent with claims the data is lognormally distributed. it would seem obvious that, since the mean does not locate the centre of the data, we should give it up as a statistic of film style in favour of some other measure of central tendency such as the geometric mean or the median. nonetheless, he advocates retaining the mean shot length even though the distribution of shot lengths is highly skewed, although it is not clear what the mean shot length means in this context since it no longer functions as a measure of central tendency. statements such as ‘the mean shot length of little annie rooney (minus titles) is . seconds’ no longer mean what the majority of film scholars think they mean as this statistic is no longer intended to be used as a description of the cutting rate of the film.
however, this argument is invalid since it is based on the relationship between the arithmetic mean, the median, and σ under lognormality and this fundamental assumption is not justified in general. the justification given by de long, brunick, and cutting for preferring the median shot length to the mean because shot lengths are lognormal is clearly flawed as the premise of this argument is invalid for the vast majority of cases. the median is a superior measure of central tendency not because we are justified in assuming lognormality but because it locates the centre of a distribution irrespective of its shape. the proper justification for using the median shot length to describe the style of a film is that it is resistant to the effects of outliers and robust to deviations from the assumed model due to its high breakdown point and bounded influence function (wilcox , ). neither the arithmetic nor the geometric mean is resistant or robust, and use of these statistics leads to flawed interpretations of the data. additionally, the shape factor of the log-transformed shot length data is not a resistant or robust measure of dispersion and similarly leads to incorrect conclusions. robust measures of dispersion such as the interquartile range, d , or y (rousseeuw & croux ) are far superior to σ or σ*, and this is again due to their high breakdown points and bounded influence functions. calculating these statistics from the untransformed data expresses the dispersion of shot lengths in seconds, making the interpretation of film style easier than using σ with the asl, and they have clearly defined meanings.

conclusion

this paper examined the claim that lognormality is a reasonable assumption for shot length distributions of hollywood films.
in reviewing the claims put forward to support this assumption we find them lacking in methodological detail and statistical rigour, and certainly unable to justify the conclusions presented. in testing the shot length data of hollywood films we find that, while the lognormal distribution is an adequate model for a handful of films, this is not the case for the vast majority of films in the sample. consequently, we conclude there is no justification for the claim that the lognormal distribution is an appropriate parametric model for the shot length distributions of hollywood films. references aitchison j and brown jac the lognormal distribution, with special reference to its use in economics. cambridge: cambridge university press. behrens jt principles and practices of exploratory data analysis, psychological methods ( ): - . burmaster de and hull da using lognormal distributions and lognormal probability plots in probabilistic risk assessments, human and ecological risk assessment: an international journal ( ): - . cullen ac and frey hc probabilistic techniques in exposure assessment: a handbook for dealing with variability and uncertainty in models and inputs. new york: plenum. d’agostino rb tests for the normal distribution, in rb d’agostino and ma stephens (eds.) goodness of fit techniques. new york: marcel dekker: - . de long j, brunick kl, and cutting je film through the human visual system: finding patterns and limits, in jc kaufman and dk simonton (eds.) the social science of cinema. new york: oxford university press: in press. an online version of this paper is available at http://people.psych.cornell.edu/~jec /pubs/socialsciencecinema.pdf, accessed january . filliben jj the probability plot correlation coefficient test for normality, technometrics ( ): - . hartwig f and dearing be exploratory data analysis. newbury park, ca: sage. huber pj and ronchetti em robust statistics, second edition.
new york: john wiley & sons. jarque cm and bera ak a test for normality of observations and regression residuals, international statistical review ( ): – . kleiber c and kotz s statistical size distributions in economics and actuarial sciences. hoboken, nj: john wiley & sons. kundzewicz zb and robson aj change detection in hydrological records: a review of the methodology, hydrological sciences ( ): - . limpert e, stahel wa, and abbt m log-normal distributions across the sciences: keys and clues, bioscience ( ): - . looney sw and gulledge tr use of the correlation coefficient with normal probability plots, the american statistician ( ): - . quinn gp and keogh mj experimental design and data analysis for biologists. cambridge: cambridge university press. rousseeuw pj and croux c alternatives to the median absolute deviation, journal of the american statistical association : – . shapiro ss and francia rs an approximate analysis-of-variance test for normality, journal of the american statistical association : - . salt b moving into pictures: more on film history, style, and analysis. london: starword. salt b the metrics in cinemetrics, http://www.cinemetrics.lv/ metrics_in_cinemetrics.php, accessed january . thode hc testing for normality. new york: marcel dekker, inc. tukey jw exploratory data analysis. reading, ma: addison-wesley. wilcox rr how many discoveries have been lost by ignoring modern statistical methods?, american psychologist ( ): - . wilcox rr introduction to robust estimation and hypothesis testing, second edition. burlington, ma: elsevier academic press. the lognormal distribution and hollywood cinema [ ] table results of statistical tests of the null hypothesis shot lengths in hollywood films are lognormally distributed (see text for discussion of films marked *) title year median (m) geometric mean (g) z [ σ σ* \∗ \ shapiro- francia p jarque-bera p lognormal? a tale of two cities . . . . . . . < . . < . no top hat . . . . . . . < . . < . 
no les miserables . . . . . . . < . . < . no the informer . . . . . . . . . . no westward ho . . . . . . . < . . < . no a night at the opera . . . . . . . < . . < . no anna karenina . . . . . . . . . . no the steps . . . . . . . < . . < . no captain blood . . . . . . . . . . yes mutiny on the bounty . . . . . . . < . . < . no fantasia . . . . . . . . . . yes foreign correspondent . . . . . . . < . . < . no grapes of wrath . . . . . . . < . . < . no pinocchio . . . . . . . < . . < . no rebecca . . . . . . . < . . < . no santa fe trail . . . . . . . < . . < . no the great dictator . . . . . . . . . . yes the letter . . . . . . . < . . . no thief of bagdad . . . . . . . < . . < . no bells of st marys . . . . . . . < . . < . no blood on the sun . . . . . . . . . . n/a* brief encounter . . . . . . . . . . yes detour . . . . . . . . . . no in pursuit to algiers . . . . . . . < . . < . no leave her to heaven . . . . . . . < . . < . no lost weekend . . . . . . . < . . < . no spellbound . . . . . . . < . . < . no the lognormal distribution and hollywood cinema [ ] title year median (m) geometric mean (g) z [ σ σ* \∗ \ shapiro- francia p jarque-bera p lognormal? all about eve . . . . . . . < . . < . no annie get your gun . . . . . . . < . . < . no born yesterday . . . . . . . < . . < . no cheaper by the dozen . . . . . . . < . . < . no cinderella . . . . . . . < . . < . no harvey . . . . . . . < . . < . no the asphalt jungle . . . . . . . < . . < . no the flame and the arrow . . . . . . . < . . < . no battle cry . . . . . . . < . . < . no east of eden . . . . . . . < . . < . no lady and the tramp . . . . . . . < . . . no mr roberts . . . . . . . < . . < . no night of the hunter . . . . . . . . . . yes rebel without a cause . . . . . . . < . . < . no seven year itch . . . . . . . < . . . no the ladykillers . . . . . . . . . . no the trouble with harry . . . . . . . < . . < . no butterfield . . . . . . . < . . < . no exodus . . . . . . . . . . yes inherit the wind . . . . . . 
. < . . < . no magnificent seven . . . . . . . < . . < . no ocean’s . . . . . . . < . . < . no peeping tom . . . . . . . < . . < . no spartacus . . . . . . . < . . < . no swiss family robinson . . . . . . . < . . < . no the apartment . . . . . . . . . . no* time machine . . . . . . . < . . < . no the lognormal distribution and hollywood cinema [ ] title year median (m) geometric mean (g) z [ σ σ* \∗ \ shapiro- francia p jarque-bera p lognormal? dr zhivago . . . . . . . < . . < . no flight phoenix . . . . . . . < . . < . no flying machines . . . . . . . < . . < . no help . . . . . . . < . . < . no shenandoah . . . . . . . < . . < . no sound of music . . . . . . . < . . < . no that darn cat . . . . . . . < . . < . no the great race . . . . . . . < . . < . no thunderball . . . . . . . < . . < . no what’s new pussycat . . . . . . . < . . < . no airport . . . . . . . < . . < . no aristocats . . . . . . . . . . yes beneath the planet of the apes . . . . . . . < . . < . no catch . . . . . . . < . . < . no five easy pieces . . . . . . . < . . < . no kelly’s heroes . . . . . . . < . . . no patton . . . . . . . < . . < . no tora! tora! tora! . . . . . . . < . . . no barry lyndon . . . . . . . . . . yes* three days of the condor . . . . . . . < . . < . no one flew over the cuckoo’s nest . . . . . . . < . . < . no dog day afternoon . . . . . . . < . . < . no jaws . . . . . . . < . . < . no the man who would be king . . . . . . . < . . < . no monty python the holy grail . . . . . . . < . . < . no return of the pink panther . . . . . . . < . . < . no the rocky horror picture show . . . . . . . < . . < . no shampoo . . . . . . . < . . < . no the lognormal distribution and hollywood cinema [ ] title year median (m) geometric mean (g) z [ σ σ* \∗ \ shapiro- francia p jarque-bera p lognormal? airplane . . . . . . . < . . < . no coal miner’s daughter . . . . . . . < . . < . no the empire strikes back . . . . . . . < . . < . no nine to five . . . . . . . < . . < . no ordinary people . 
. . . . . . < . . < . no popeye . . . . . . . < . . < . no stir crazy . . . . . . . < . . < . no superman . . . . . . . < . . < . no the blue lagoon . . . . . . . < . . < . no urban cowboy . . . . . . . < . . < . no back to the future . . . . . . . < . . < . no cocoon . . . . . . . < . . < . no out of africa . . . . . . . < . . < . no police academy . . . . . . . < . . < . no rambo ii . . . . . . . < . . < . no spies like us . . . . . . . < . . < . no the color purple . . . . . . . < . . < . no witness . . . . . . . < . . < . no dick tracy . . . . . . . < . . < . no die hard . . . . . . . < . . < . no ghost . . . . . . . < . . < . no goodfellas . . . . . . . < . . < . no home alone . . . . . . . < . . < . no hunt for red october . . . . . . . < . . < . no pretty woman . . . . . . . < . . < . no teenage mutant ninja turtles . . . . . . . < . . < . no total recall . . . . . . . < . . < . no the lognormal distribution and hollywood cinema [ ] title year median (m) geometric mean (g) z [ σ σ* \∗ \ shapiro- francia p jarque-bera p lognormal? ace ventura . . . . . . . < . . < . no apollo . . . . . . . < . . < . no batman forever . . . . . . . < . . < . no casper . . . . . . . < . . < . no goldeneye . . . . . . . < . . < . no jumanji . . . . . . . < . . < . no pocahontas . . . . . . . < . . < . no sense and sensibility . . . . . . . < . . < . no toy story . . . . . . . < . . < . no castaway . . . . . . . < . . < . no charlie’s angels . . . . . . . < . . < . no dinosaur . . . . . . . < . . < . no erin brockovich . . . . . . . < . . < . no the grinch who stole christmas . . . . . . . < . . < . no scary movie . . . . . . . < . . < . no the perfect storm . . . . . . . < . . < . no what women want . . . . . . . < . . < . no x-men . . . . . . . < . . < . no hitch . . . . . . . < . . < . no king kong . . . . . . . < . . < . no the longest yard . . . . . . . < . . < . no madagascar . . . . . . . < . . . no mr & mrs smith . . . . . . . < . . < . no walk the line . . . . . . . < . 
. < . no wedding crashers . . . . . . . < . . < . no

due to data sets containing shot lengths less than or equal to . seconds a total of films were excluded from the sample: philadelphia story ( ), anchors aweigh ( ), mildred pierce ( ), king solomon’s mines ( ), sunset boulevard ( ), to catch a thief ( ), little big man ( ), mash ( ), jewel of the nile ( ), rocky iv ( ), dances with wolves ( ), the usual suspects ( ), mission impossible ( ), chicken little ( ), harry potter and the goblet of fire ( ), and star wars: episode iii – the revenge of the sith ( ).

moving the archivist closer to the creator: implementing integrated archival policies for born digital photography at colleges and universities

brian keough and mark wolfe
university at albany, state university of new york, albany, new york, usa

journal of archival organization, : – , copyright © taylor & francis group, llc issn: - print / - online doi: . / . . publisher: routledge

this article discusses integrated approaches to the management and preservation of born digital photography. it examines the changing practices among photographers, and the needed relationships between the photographers using digital technology and the archivists responsible for acquiring their born digital images. special consideration is given to technical issues surrounding preservation and access of image formats. it explores how integrated policies can enhance the success of managing born digital photographs in an academic setting and illustrates the benefits and challenges to acquisition, description, and dissemination of born digital photographs. it advocates for the archivist’s active involvement in the photographer’s image management practices to improve the acquisition and preservation of images.
keywords digital photography, born digital records, raw image format, academic archives

address correspondence to brian keough, m.e. grenander department of special collections and archives, science library , university at albany, suny, washington avenue, albany, ny . e-mail: bkeough@albany.edu

introduction

college and university archives have historically played a crucial role in preserving the photographic records of their institutions. analog film negatives, contact sheets, and prints were created for administrative and publicity purposes, and then were traditionally transferred by campus photographers to the archives. the role of the archivist in collecting and preserving campus photography was well established and understood in our profession. however, as photographers transition from analog film to digital formats, college and university photographs from the s may paradoxically be in greater danger than photographs from the s. the shift from analog to digital has been a disruptive force that puts digital assets at risk. the ease in shooting images that digital photography affords has dramatically increased the number of images of enduring value. growing file sizes, as well as the numerous proprietary formats created by digital cameras, render old print-based management practices an impractical option. the recent announcement that the eastman kodak company has filed for bankruptcy rings the death knell for the era of print photography and film and should be a clear indication to archivists that preserving digital photography is now our central concern. although current practices of campus photographers work relatively smoothly for meeting their business needs, access and preservation are becoming increasingly unmanageable activities.
photographers are required to devote more time, in addition to their core business duties, to the tasks of managing their digital images. archivists’ inability to acquire images in a timely manner contributes to bulging storage servers and difficult-to-manage optical disk collections in photography departments. digital photography requires an unprecedented level of engagement by the archivist throughout the entire lifecycle of the records, but given the size of the problem where does the archivist begin? archivists can take small steps to develop tools appropriate to their setting that can lead toward long-term solutions. the wide implementation of institutional repository (ir) software and digital asset management systems (dams) for digitized collections may be potential test beds for implementing practical methods to provide access for born digital images. the relative ease with which institutions adopt standards and practices for digitization is reflected in the large number of well-documented case studies. the issues surrounding born digital collections are well known, and standards and practices are being developed to solve these problems. however, there remain too few examples documented by repositories that demonstrate practical methods and tools for providing access and preservation for born digital images, and such practical approaches deserve more attention. this article explores the policies and technical issues related to born digital images that influence their preservation in the college and university archives context. it discusses how the university at albany’s special collections and archives department began acquiring born digital images. it suggests practical approaches and methods to meet the challenges of digital photography with special consideration given to possible staffing and financial constraints, and it advocates greater collaboration between university archives and campus photography departments.
literature review

the archivist and photographer relationship

the digital age requires more involvement, not less, by the archivist for institutions that hope to preserve their cultural legacies. elizabeth yakel and william brown, as early as , examined the effect of digital technology on records creators and archivists. they argued that in the digital age there is a need for an “active archivist who serves as part of the administrative team, both culling and packaging information as well as working with administrative colleagues in the evaluation and interpretation of the data.” however, the close relationships that institutional archivists once enjoyed with paper records creators faded with the advent of digital technology. many archivists, contrary to professional best practice, are excluded or not able to get involved in the policy decisions about records creators. lisl zach and marcia frank peri suggested that there has been relatively little development of campus electronic records programs between and and concluded that the acquisition and management of institutional digital records is comparatively neglected. until recently, digital records that arrived on magnetic and optical media typically played an ancillary role to the paper records contained in the collection. whereas floppy disks and other aging media pose problems, they are only the tip of the iceberg. increasingly, repositories are being tasked with acquiring digital collections with no paper counterpart, putting archivists in the uncomfortable position of not being able to provide access to the materials they own. a recent online computer library center (oclc) report noted alarming findings among association of research libraries affiliated special collections departments.
the report recognizes a “widespread lack of basic infrastructure for collecting and managing born-digital materials: more than two thirds [of the collections studied] cited lack of funding as an impediment, while more than half noted lack of both expertise and time for planning.” indeed, the report summarized that special collections departments state born digital materials as their second biggest challenge. yet, these findings are a striking contrast to the amount of investment institutions have made in their technological infrastructure. the oclc report also found that % of the respondents are using an ir, suggesting that lack of technology might be a smaller obstacle to acquiring and preserving born digital records than conventional wisdom would suggest. the failed attempts to successfully deploy ir software for the digital scholarship community have met obstacles similar to those encountered by archivists working with electronic records. as ir deployments proliferated, libraries floundered to acquire faculty research due in part to a faulty assumption that faculty (the records creators) would self-archive their materials. even relatively simple digital assets, such as preprints in pdf format, are difficult to acquire because in many cases faculty are left to their own devices to manage their own records. the “build it and they will come” approach has largely failed because of the lack of staffing and administration of the repository software and programs. the poor levels of participation in irs may be owed in part to the emergent role of the repository librarian, who has suddenly assumed many of the responsibilities traditionally fulfilled by an archivist.
the findings of the miracle (making institutional repositories a collaborative learning environment) project suggest that the “type of one-on-one collection development and content recruitment now being carried out by librarians to populate irs is exactly the type of field work that archivists have done for decades.” archivists are professionally trained in collection building, especially the heterogeneous content and mixed format typically found in archival and manuscript collections. archivists are accustomed to networking with creators to ensure that custody is taken of those collections, and yet archivists have not played a significant role, especially when the ir is deemed the place to deposit electronic records. although it is important to understand this historical distinction between the role of archivists and ir librarians, merely shifting the responsibility of collecting digital scholarship to the archivist will not solve the problem. similarly, archivists have struggled to acquire born digital material from campus administrative offices. unless archivists redouble their engagement with campus records creators, digital scholarship and administrative records will most likely remain in separate silos. absent the involvement of an archivist, the contemporary campus photographer devotes a surprisingly large amount of time to access and preservation issues. professional photography literature devotes significantly more effort to these issues than the museum and archival community. this development is a logical one especially for commercial photographers who may have no dedicated managers or archivists to care for their legacy collections.
jessica bushey says that “it is necessary to rearticulate the role of the photographer as both creator and preserver,” yet she hastens to add that photographers still must obtain “input from those entrusted with preserving digital materials, such as archivists.”

born digital photography

to improve this situation, it is incumbent upon archivists to develop a greater understanding of digital photographic practices. the photographer’s technical environment has changed dramatically and generated a new and complex set of terminology and techniques. rules of thumb developed for digital image scanning operations may not be enough for the archivist to fully understand the problem space of born digital photography. patricia russotti and richard anderson include over new, digitally specific terms in their recent review of best practices for digital photography. as with other types of digital materials, such as audio and video, the digital imaging literature and terminology runs deep and wide and can be daunting to the novice. in particular, archivists need to develop a better understanding of digital photography file formats and metadata practices if they aim to engage photographers effectively. born digital “originals” bring different kinds of responsibility to planning and decisions not found in scanning operations and introduce a new level of complexity and cost. the archivists’ commitment to authenticity may lead them toward wanting to preserve the “camera raw” (raw) format as the “negative,” but this commitment may come into conflict with the realities of resource and staffing constraints because of its complexity and additional file storage. the emphasis put on authenticity, reliability, and accuracy of electronic records can be daunting, and it can even lead to further inaction by the archivist.
the proprietary nature of the raw format, as adopted among the most popular camera manufacturers, has become a serious concern among photographers who worry that they may not be able to access their files in the future.

raw image format

the importance of the raw file format to the photographer cannot be overstated, and archivists must learn about it to meet the needs of the photographer and to preserve the format effectively. photographers work in the raw format because it affords the highest quality in resolution and color range, and it allows the photographer to non-destructively adjust properties of the image. many photographers consider the raw image file generated by a digital camera as the “digital negative,” and image developing refers to the process of color selection and conversion of the raw file into a jpeg or tiff format. the raw file stores the information gathered from the camera’s sensor in an unprocessed format. michael k. bennett and f. barry wheeler explored the barriers and benefits to archivists who want to preserve an image in raw format, which they consider the true camera “negative,” versus typical practices of preserving an image “positive” in tiff format. until the advent of raw, pictures shot using the jpeg and tiff formats were typically rendered inside the camera. raw, however, defers processing of the image until the image is transferred to a desktop computer and rendered inside adobe camera raw software or a comparable program. managing this format is complicated by the fact that there are dozens of proprietary formats in use by different camera brands, sometimes differing with each successive model. this increased functionality of cameras consequently increases the burden of the archivist.
currently, adobe’s digital negative (dng) is the only standardized, openly documented raw format, and the library of congress supports it as the most sustainable format for long-term storage of images. if adobe successfully makes dng a widely accepted standard among camera manufacturers, then archivists may reap the same benefits for photographs as they have with documents reformatted to pdf. adobe should be commended for the degree of openness it has brought to these formats, but archivists should be wary of becoming too reliant on one software manufacturer to manage their collections. by converting to dng, photographers have the option of storing the parametric image edits made to the raw image in a “side car” file. parametric image editing allows for cropping and rotating the image, as well as adjusting the color and contrast non-destructively, whereas the same edits made to a tiff are irreversible. adobe’s camera raw software affords the user the ability to adjust dozens of settings, and then records them in an extensible metadata platform (xmp) file, or side car file, which resides separate from the raw file; thus the digital negative is never tampered with. the information encoded in the xmp file, that is, the tagged record of the edits and color settings made to the image, can be inspected in a text editor and is easily read by a human. adobe’s dng converter is a free tool that allows the archivist to preserve the original proprietary raw file, as created by the camera, inside the dng file. the converter can also extract the proprietary files, such as those in the nikon electronic format (nef), if needed. additionally, with the conversion to dng, the converter creates and stores an md hash code within the file, which is useful for documenting a file’s accuracy and authenticity.
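the role a stored hash plays in documenting a file’s accuracy can be sketched as follows. this is a minimal illustration using python’s standard `hashlib` module, assuming an md5 digest computed over the whole file; the function names `file_digest` and `verify_fixity` and the example path are hypothetical, and the sketch does not reproduce how the adobe converter computes or embeds its own hash inside a dng file.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks.

    Assumption: the digest covers the entire file on disk, which is a
    common fixity check in repositories but not the DNG-internal hash.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large raw files need not fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest, algorithm="md5"):
    """Compare a file's current digest with one recorded at acquisition."""
    return file_digest(path, algorithm) == recorded_digest.lower()


# Hypothetical usage: record a digest at transfer, check it again later.
# recorded = file_digest("transfer/commencement_0001.dng")
# assert verify_fixity("transfer/commencement_0001.dng", recorded)
```

a mismatch between the recorded and recomputed digest indicates that the file has changed since acquisition, whether through corruption or editing.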
the file size of a raw image varies with the megapixel count and bit depth of the camera and with the settings selected by the photographer. a nikon d camera with image sensor sized at × pixels ( . megapixels) that shoots a bit, lossless image will produce a . mb file size, which makes the raw format in this same model roughly twice the size of a high quality, rendered jpeg format. archivists have relied heavily on the tiff format as an archival master for use in repositories because of its ability to preserve lossless images. although the tiff format has been phased out as an option in professional cameras, it is still a common rendering option in computer software and scanners. because tiff is a fully rendered file format, the settings of a camera or scanner are encoded permanently into the image, thereby increasing the size (especially using - bit settings) and preventing any future re-renderings by the photographer or archivist. however, if an archivist chooses dng as a master format, other options are available that can actually surpass the storage footprint of tiff. archivists may choose to embed the proprietary raw image inside of the dng raw file. although this allows the records creator to retain the original proprietary format, migrating to the open dng format in this way will double the storage requirements for the repository.

embedded metadata and image handling software

the sheer volume of born digital images requires archivists to economize their efforts, and capturing metadata can be one of the most costly aspects of acquiring a born digital collection. whereas creating a descriptive metadata record about the image is important, photographers and archivists also need to understand and use embedded metadata, which is critical to ensuring the preservation of digital assets.
preservation file formats allow self-describing information to be embedded into the “header” of the image itself. embedded metadata guards against loss if the file is separated from its descriptive database record. the tiff, raw (most proprietary versions), and jpeg formats store the following three kinds of metadata in the file: exchangeable image file (exif) format, international press telecommunications council (iptc), and xmp; the dng format includes even more. the iptc standard was merged with xmp, and now users can extend xmp to adopt the iptc core metadata standard. xmp is an adobe-created, xml-based formatting standard, which is primarily used for images and pdf files. the exif metadata is the information generated automatically by the camera; this includes such information as the f-stop setting, flash, focal length of the lens, and more. the iptc metadata contains descriptive information about the content of the image, such as the creator and date. iptc can be extended by the dublin core standard as well, which allows for time saving processes through automation. previously we discussed xmp in the context of side car files and the information they store from the parametric image editing. the dng format stores descriptive information in the xmp-formatted iptc “chunk” inside the dng file, whereas edits made to the image (cropping, color correction) are stored in a separate file, known as the side car file, that is formatted in xmp as well.

the simple act of viewing an image, and the slightly more difficult act of reading and writing embedded metadata, are common barriers to managing the raw format in a repository. raw images often do not display the same user-friendly image information in the form of thumbnail images as jpeg and tiff do, making it impractical to conduct search and management actions through windows explorer and finder programs.
further, there are known issues with handling and interacting with the embedded data in files through some default file managers, and especially with older generations of applications, yet this should not be an issue if the proper software is used. adobe bridge can help photographers and archivists safely automate some of the processes of adding metadata to the image file header. other software dedicated to preparing images for ingestion includes adobe lightroom, apple’s aperture, and photo mechanic. this additional image “collection processing” software, also known as “pieware,” may be a small cost in comparison to the added control it gives and the time it saves. as part of the preserving digital images project, artstor developed new practices and tools for photographers so that they could create archive-ready born digital images. the project developed best practices that advise photographers to embed metadata into the image using only adobe bridge or its equivalent, without having to rely on cumbersome excel spreadsheets. archivists can instead use artstor’s recently developed embedded metadata extraction tool to extract metadata from digital assets and export it as spreadsheets.

the archivist must have a firm understanding of the technical complexities of the contemporary photographer’s work, but technical knowledge alone will not guide archivists through the steps of implementing the processes for acquiring and preserving digital images. research projects, conferences, and standards groups have developed an overwhelming number and array of technical guidelines, best practices, frameworks, and models to manage and preserve electronic records.
in light of these theoretical inroads in research, christopher prom posed the question, “have libraries and archives made adequate progress in implementing the procedures, tools and services to actually preserve digital records?” the number of demonstrated implementations pales in comparison to our plethora of theory and standards, but this is slowly beginning to change. the gap between theory and practice has proven to be much wider than initially thought. although repositories hold out for a comprehensive electronic records solution, ben goldman wisely recommended “moving forward with practical and achievable steps” toward that goal “responsibly.” with this in mind, what are the short-term measures archivists can take to acquire and extend the life of their digital images?

university at albany’s born digital images

this section describes the university at albany’s experience in formulating practical methods for acquiring and providing access to born digital images. it will discuss how the university archives dealt with the backlog of born digital images, and then describe how it wrote integrated image management policies with the photographer to improve the university archives’ ability to acquire images. we believe these short-term, practical steps will help us build toward our long-term goal of the archives becoming a “systematic institutional function” within the university.

background

since the early s, the university archives has regularly acquired photographic prints and negatives from the campus photography department. the photographer typically transferred the images seven to ten years after their initial use, even earlier in some cases when the photographer ran out of cabinet storage.
archivists transferred prints and negatives into the archives, re-housed them in acid-free boxes and folders, and placed them in the repository’s processing backlog. the collections typically required little processing because they were well organized and easily accessible. currently, there are more than , prints and negatives that are used frequently by the university community in exhibits and promotional material, and by off-campus researchers.

analog to digital

campus photographers began gradually converting to digital photography in . during the course of ten years, campus photographers routinely used up their allowed data storage and would rely on optical media as an “archiving” solution. as the digital camera industry evolved, the photographers did too, adopting new file formats to meet their needs and to stay current with industry standards. over time, the photography department confronted image management issues, such as redundancy and proprietary formats, that made search and retrieval of their own files difficult. in addition to the photography department, many departments across the university began struggling to manage their digital assets, and the campus implemented a collaborative project to plan and purchase a dams. the project charter sought to solve “a broad-based need, expressed by several university constituencies, for an enterprise-wide service focused on the management, curation and accessibility of digital media and related assets.” the project deemed it important that the dams feature superior image functionality because images were a priority for nearly all of the project participants. the project chose luna insight as its dams, and the system was adopted by various departments, including the university archives. originally, it was intended that each department would upload and manage its own digital collections.
it quickly became clear, however, that the other departments lacked the qualified staff and time required to upload their images into the dams. the university art museum was the lone exception, because it already had an employee tasked with managing its digital images on a legacy system. as with many implementations of ir software, the “build it and they will come” approach was not working. it was clear that the university archives needed to play a leadership role for campus departments using the luna insight system. as the university archives began using the dams for its digitized collections, it also began exploring how the archivists could get directly involved with acquiring born digital records across the campus. realizing that access to proprietary formats was problematic, archivists dealt with the images taken between and first to address three major challenges: the lack of image descriptions; the lack of tools to facilitate the transfer of images and database records from the photographer to the university archives; and the lack of a centralized portal to access digital images. to address these challenges, we first needed a better understanding of how the campus photographer worked.

our first step in understanding the photographer’s practice was to capture, organize, and retain information about his workflow. we had previously created some common ground between the university archives and the photographer, having already had conversations about the planning and implementation of luna insight. we met with the campus photographer to learn what information he obtained during his work, and we discussed with him what information the archives lacked.
we learned that when the photographer shot pictures for a particular job, he entered information about the event into a microsoft access database, including the date, unique job number, brief description, and name of the person or department who made the request. after completing the job, the photographer loaded the raw files onto a local hard drive, created a folder for each event that used the unique job number as the folder name, and placed all of the images created from the job in that folder. the photographer typically burned the job’s images to a cd and gave it to the customer. the images are used for the marketing of a particular event, and they are then repurposed for a multitude of uses in a semi-active manner by other departments on campus and in official university publications and alumni literature. for this purpose, the photographer created jpeg derivatives from the raw files (nef), which were edited for color correction and selected for publication or distribution; these might be comparable to the “print positive” in analog photography. until , unfortunately, the photographer deleted % of the jpeg derivatives to save storage space.

transfer to archives

in , the campus photographer transferred to the university archives , raw files (with connected xmp files) stored on optical discs totaling close to two terabytes. one of the first steps was to transfer the files from the optical media onto a shared network drive while maintaining the original order and unique job numbers. graduate students appraise these images based on retention guidelines developed by the archivist. students work primarily in adobe bridge and photoshop to view, describe, and rename images and to do batch conversions. before we obtained bridge, students had to go through the cumbersome task of renaming files using windows explorer. if an image must be opened, the student uses the adobe camera raw program that runs inside of photoshop.
a batch file conversion is conducted on the images, converting them from proprietary raw files (nef) to tiff using adobe bridge and photoshop’s image processor tool.

it is important to adopt self-describing file naming conventions as well as to embed metadata into the header of the image file. adobe bridge is used to normalize all the file names using a consistent and planned file naming and identifier convention. file names should consist of lowercase characters from the latin alphabet and arabic numerals; they should not contain special characters or punctuation except for the underscore character, which designates meaningful spaces in a file name. these practices help to identify files in the case of a disaster in which the files are separated from their descriptive metadata records. graduate students create a first draft of a dublin core descriptive metadata record for each image that is later reviewed and enhanced by an archivist before being uploaded with the affiliated images to the university’s luna insight system.

improve workflow for current images

the nature of the campus photographer’s work aids in the development of descriptive metadata and master file formats for access and preservation. after analyzing the photographer’s work process, we altered it so that he now creates a tiff derivative of all raw files when he completes a job and enhances the event description. instead of transferring images to the university archives on optical media, the photographer transfers images directly onto a dedicated share on the university libraries’ server, which is restricted to the university archives staff and the photographer. the tiffs are loaded into the dams by the university archives and the raw files are stored offline on servers.
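the file naming convention described in this section (lowercase latin letters, arabic numerals, and underscores only) can be checked and applied mechanically. the following is a minimal sketch under stated assumptions: the function names are hypothetical, the university archives performed this step in adobe bridge rather than with a script, and the choice to turn spaces and hyphens into underscores is an illustrative interpretation of “meaningful spaces.”

```python
import re

# A valid name: lowercase letters, digits, underscores, plus an optional extension.
_VALID_NAME = re.compile(r"^[a-z0-9_]+(\.[a-z0-9]+)?$")


def is_valid_name(filename):
    """Return True if the file name already follows the convention."""
    return bool(_VALID_NAME.match(filename))


def normalize_name(filename):
    """Rewrite a file name so that it follows the convention.

    Spaces and hyphens become underscores, all other punctuation is dropped,
    and the extension (the text after the last period) is lowercased.
    """
    stem, dot, extension = filename.rpartition(".")
    if not dot:  # no period at all, so the whole name is the stem
        stem, extension = filename, ""
    stem = stem.lower()
    stem = re.sub(r"[\s\-]+", "_", stem)    # meaningful spaces -> underscore
    stem = re.sub(r"[^a-z0-9_]", "", stem)  # drop remaining special characters
    stem = re.sub(r"_+", "_", stem).strip("_")
    extension = re.sub(r"[^a-z0-9]", "", extension.lower())
    return f"{stem}.{extension}" if extension else stem


if __name__ == "__main__":
    # Hypothetical job folder contents, for illustration only.
    for original in ["Job 1234 - Commencement (Final).NEF", "dsc_0001.nef"]:
        print(original, "->", normalize_name(original), is_valid_name(original))
```

a script like this renames nothing by itself; in practice the normalized names would be reviewed, for example to catch two files collapsing to the same name, before any batch rename is carried out.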
we currently describe the images in such a way that the crucial job numbers used to identify the photographer’s work are preserved. the photographer can use old job numbers to search for images in luna insight. to ensure that each image is assigned a unique file name by the camera, we recommended to the photographer that he change the file numbering setting on his camera to “continuous” and avoid using “auto reset.” if the incorrect setting is selected, the photographer may risk overwriting older files. digital image collections shot without the “continuous” setting may also complicate the important task of appending identifier information to the image.

findings

it has become abundantly clear that archivists can no longer passively wait for the transferal of born digital collections to our repositories. the archivist must become involved in the work of the creator to ensure that valuable digital information is not lost. archivists must leverage their traditional relationships with the departments outside of the library while at the same time acquiring the proper skills and methods to enable them to properly take custody when the time comes. the archivist must strike a balance between complete inaction and waiting for the perfect solution. numerous colleges and universities are exploring alternatives to conventional acquisition practices, such as using online vendors to host their images off site, and this is especially true for recent born digital images. however, other institutions are already fully involved in the act of acquiring born digital images into the archives and are administering the software and technology locally using their campus it infrastructure.
although the oclc report points to the lack of it support as being a large barrier to acquisition, our experiences show that archivists can surmount this barrier by taking small incremental steps. some of the most important decisions are made at the moment of acquisition; these decisions may have lasting, irreversible effects on the sustainability of the collection. all attempts must be made to “do no harm” to the digital assets. does the repository consider the images used for a particular event the most important thing worth documenting, or does it want all of the images? furthermore, the archivists’ commitment to authenticity may lead them toward wanting to preserve the raw format as the “negative,” but this commitment may come into conflict with the realities of resource and staffing constraints. archivists who aim to preserve the digital masters may want to consider migrating their image collections out of the proprietary raw file created by the camera to a more stable raw format such as dng. dng will preserve all of the settings of the proprietary format and the metadata. dng also has a built-in data validation mechanism that stores “hash” information to guard against the possibility of undetected data corruption. hashes, or checksums, are short values computed from the contents of each image file; a checksum that still matches gives the user strong assurance that the file has not been altered or corrupted. each time the file is handled by the computer there is a small risk of data corruption. thus, the checksum that resides within the dng file allows the archivist to periodically check all of the files to make sure they have not become corrupted. archivists can assume that as image processing software improves, older dng files can be reprocessed into even more faithful reproductions of the original. once the image has been converted to tiff, the archivist permanently loses future capabilities to eliminate noise and to recover colors lost from the original image.
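the periodic checksum verification discussed above can be sketched in a few lines. this is an illustrative example only: it does not read the digest stored inside a dng file, but instead keeps a separate, hypothetical manifest of sha-256 checksums for every file in a folder, a technique that works for any format. the function names and the choice of sha-256 are assumptions, not part of the university archives’ workflow.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Compute a checksum of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Record a checksum for every file directly inside the folder."""
    return {
        path.name: file_digest(path)
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    }


def find_changed(folder, manifest):
    """Return the names of files that are missing or no longer match the manifest."""
    changed = []
    for name, expected in sorted(manifest.items()):
        path = Path(folder) / name
        if not path.is_file() or file_digest(path) != expected:
            changed.append(name)
    return changed
```

a repository might build the manifest once at the moment of acquisition and then run the comparison on a schedule; any name returned by `find_changed` signals a file to restore from a backup copy.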
the university archives is considering the costs and benefits of moving to dng. most dams require users to upload surrogate copies of their images because raw formats are not given the same functionality in dams as jpeg and tiff. the ingest engines do not currently handle raw, but this is likely to change as raw becomes a more accepted preservation format. even though raw provides a wide range of options to archivists and photographers, choosing these options for long-term preservation should be considered carefully because storage requirements can become overly costly. if storage costs are not a concern, the justified belief that technology will improve over time suggests that archivists should attempt to preserve the raw format, dng preferably, when possible. the archivist and the photographer will have to come to agreement on what they want to preserve. although migrating the images out of a proprietary raw format to one such as dng may meet the photographer’s needs, it does bring additional costs in management and storage to the repository. michael j. bennett and f. barry wheeler stated that, “as a baseline, familiarity with the concepts of parametric editing is necessary to confidently sustain dng files while the format and its tools continue to mature.” the archivist and photographer may have to make a compelling case to their managers as to why preserving multiple formats is needed. access copies must be created in place of the raw files because most dams and ir software do not display raw files natively. once the raw files are disposed of, however, something is lost, and the repository will not be able to turn back. archivists need to consider the benefits of embedding metadata into their image files.
the university archives has not begun enhancing the embedded metadata in its images beyond what is already there, though this is something it is considering for future integrated workflow improvements. the work of the artstor project holds great promise for not only making it easier for the photographers to embed information into image files, but also for making it easier for archivists to get that information back out of the file and into their conventional metadata records. because records creators are typically not concerned with extensive metadata, the artstor project might be a way forward that lightens the costly burden of creating metadata for the creator and the archivist.

our profession must continue building better tools to preserve and provide access to our digital heritage, but we must remember that, when possible, building relationships with our creators is an integral first step to making effective use of these tools. with this in mind, lisl zach and marcia frank peri suggested that archivists must find a “champion with influence within their institutions . . . forming strategic alliances with key players outside of the library.” the photographer’s mindset and work environment may suit repositories looking for a willing collaborator to begin an electronic records management program. it has become clear that without the archivist’s pre-custodial intervention early in the lifecycle of born digital images, their long-term preservation may be at risk. the records creator must not only have a desire to transfer his or her records, but also see a legitimate reason to do so. whether in print or digital formats, photographers are accustomed to keeping pace with technical innovation in their field. thus, photographers’ technical understanding of their own work environment, and their keen awareness of the problems that long-term access and preservation pose, can make them willing partners for archivists who plan to acquire born digital records.
notes

tiffany hsu, “kodak files for long-expected chapter bankruptcy,” los angeles times, january , (accessed march , ). available at http://articles.latimes.com/ /jan/ /business/la-fi-kodak-bankrupt- .
william e. brown, jr. and elizabeth yakel, “redefining the role of college and university archives in the information age,” american archivist ( ): .
susan e. davis, “electronic records planning in “collecting” repositories,” american archivist ( ): .
lisl zach and marcia frank peri, “practices for college and university electronic records management (erm) programs: then and now,” american archivist ( ): .
jackie m. dooley, katherine luce, and oclc research, taking our pulse: the oclc research survey of special collections and archives (dublin, ohio: oclc research, ), .
ibid., .
ibid., .
dorothea salo, “innkeeper at the roach motel,” library trends ( ): .
karen markey, soo young rieh, beth st. jean, jihyun kim, and elizabeth yakel, census of institutional repositories in the united states: miracle project research findings (washington, dc: council on library and information resources, ), .
jessica bushey, “he shoots, he stores: new photographic practice in the digital age,” archivaria ( ): .
patricia russotti and richard anderson, digital photography best practices and workflow handbook (burlington, ma: focal press, ), – .
camera raw is also conventionally referred to as raw or raw, but it is not an acronym.
jessica elaine bushey, “born digital images as reliable and authentic records” (master’s thesis, university of british columbia, ), accessed march , , http://www.interpares.org/display_file.cfm?doc = ip - _dissemination_thes_bushey_ubc-slais_ .pdf.
michael reichmann and juergen specht, “the raw flaw,” accessed march , , http://www.luminous-landscape.com/essays/raw-flaw.shtml.
philip andrews, yvonne butler, and joe farace, raw workflow from capture to archives: a complete digital photographer’s guide to raw imaging (oxford, uk: focal press, ), .
michael j. bennett and f. barry wheeler, “raw as archival still images format: a consideration,” uconn libraries published works paper , accessed march , , http://digitalcommons.uconn.edu/libr_pubs/ /.
the openraw working group has been advocating for a raw format that is even more open and comprehensive than the current dng format. it does not feel dng is a fully open and documented format as adobe purports. see http://www.openraw.org/about/index.html (accessed march , ).
see http://en.wikipedia.org/wiki/digital_negative (accessed march , ).
these are sometimes referred to as “buddy” files.
bennett and wheeler, “raw as archival still images format,” .
richard anderson, “should i convert to dng?” dpbestflow.org, accessed march , , http://dpbestflow.org/node/ .
memory card capacity for a nikon d camera, accessed march , , http://imaging.nikon.com/lineup/dslr/d /features .htm.
“raw vs. rendered,” accessed march , , http://dpbestflow.org/camera/raw-vs-rendered#tiff.
russotti and anderson, digital photography best practices, – .
it has been reported that rotating some image formats, and even checking the properties of an image, can permanently change or destroy the exif data. see: “nikon also warn about windows xp,” dpreview.com, accessed march , , http://www.dpreview.com/news/ / / /nikonxpwarnings. see also, peter krogh, “metadata handling,” dpbestflow.org, accessed mar , , http://dpbestflow.org/metadata/metadata-handling#durability.
see http://dpbestflow.org/image-editing/catalog-pieware.
“final report on activities related to artstor’s ndiipp grant: preserving digital still images,” available at http://www.digitalpreservation.gov/partners/documents/artstor_finalreport .pdf
chris prom, “making digital curation a systematic institutional function,” the international journal of digital curation , no. ( ): – , accessed march , , http://ijdc.net/index.php/ijdc/article/viewfile/ / .
for example, michael forstrom’s “managing electronic records in manuscript collections: a case study from the beinecke rare book and manuscript library,” american archivist ( ), – , is an excellent case study on managing born digital collections, and he reviews literature of other prominent case studies.
ben goldman, “bridging the gap: taking practical steps toward managing born-digital collections in manuscript repositories,” rbm: a journal of rare books, manuscripts, and cultural heritage ( ): .
prom, “making digital curation,” – .
“university at albany university digital image database proof-of-concept project charter,” november , .
jeffrey warda, the aic guide to digital photography and conservation documentation (washington, dc: american institute for conservation, ), .
the dublin core standard is widely used by libraries and archives, and provides for easier interoperability and flexibility.
for example, the web site smugmug.com contains the born digital images of numerous colleges and universities. in an e-mail from john edens, january , , he explains that university at buffalo’s creative services are housing current photos as well as some historic photos online through smugmug.com, and the ub university archives are looking at solutions to preserve the images once they are no longer needed for their current uses online.
the tufts university archives contain born digital photos from to .
see http://dl.tufts.edu/view_collection.jsp?pid=tufts:ua . .do.ua (accessed march , ).
bennett and wheeler, “raw as archival.”
ibid.
zach and peri, “practices for college,” .

giving your patrons the world: barriers to, and the value of, international interlibrary loan

kurt munson, hilary h. thompson
portal: libraries and the academy, volume , number , january , pp. - (article)
published by johns hopkins university press
doi: https://doi.org/ . /pla. .
https://muse.jhu.edu/article/
copyright © by johns hopkins university press, baltimore, md .

feature: global perspectives

abstract: using the and survey by the reference and user services association sharing and transforming access to resources section (rusa stars) of international interlibrary loans (ill), the authors explore barriers to this method of meeting patrons’ information needs. they evaluate international ill in the context of developments in the information landscape, the united nations human development index, colonialism, and the current cultural, political, and economic climate. strategies to improve future access to global information resources are also considered. despite its challenges, this service meets a demonstrated information need, but further investigation is required to determine its exact value to researchers and where best to focus resources on improvement.
introduction

this study explores the barriers to international interlibrary loan (ill) to provide a better understanding of obstacles impeding greater use of this service and the value it provides to library patrons. the reference and user services association sharing and transforming access to resources section (rusa stars) international interlibrary loan committee conducted three surveys of international ill between and . the survey targeted only libraries in the united states, while the and surveys engaged libraries worldwide. the and surveys were sufficiently similar to garner comparable results and to indicate trends. the free-text responses to the survey reinforce the quantitative data and provide greater insight into the barriers that exist. because the majority ( percent) of the libraries reporting were academic, primarily at colleges and universities, the scope of this paper is limited to those, with occasional consideration given to national and research libraries. the latter may contain particularly unique or comprehensive holdings, and so they may be the sole source for borrowing certain materials.

for the purposes of this article, interlibrary loan is defined as a service where an item not available in the local library is acquired from another library to meet a patron’s demonstrated need. international ill involves requesting an item when a copy is not attainable within the requestor’s home country. ill is only one of many mechanisms to acquire materials not available locally, and in some cases, it may not be the most effective way to acquire materials or provide the least restrictive use of them. for example, purchasing a paperback book from amazon may be cheaper than paying a fee charged by the lending library plus international return shipping, and the patron could likely borrow the item for longer if it is locally owned.
ill is a labor-intensive operation requiring an existing mechanism or mechanisms for placing a request, the transport of physical items, an agreement between the libraries as to which systems (ordering, shipping, ensuring compliance, and tracking) they will use, payment for the use of those systems, and possibly charges for the use of the item. materials may also be digitized, but this fulfillment method requires workflows and equipment to digitize the original (if it is not already digital), a delivery method for the electronic file, the ability to render it in a usable way locally, and potentially, payment. international ill relies upon a complicated infrastructure of legal, logistical, and procedural shared values. the degree to which libraries share these values affects international ill’s ability to meet local researchers’ needs, and international ill can only occur if the infrastructure is sufficiently robust to support it.
the international ill survey considered global borrowing and lending in a quantitative, transactional, process-driven way, asking such questions as: how many items do you borrow internationally per year? how were the requests sent? how was payment provided? to which countries do you lend most frequently? to which countries will you not lend? the results provide data about the volume of transactions, but those numbers cannot reveal the value a researcher places on an item. guided by the notion of importance to the patron, this paper explores the barriers to international ill as a means to meet researchers’ information needs.
patron needs and the information landscape
the acquisition process begins only after a patron determines that a resource potentially meets his or her information need. materials held by a library but lacking metadata do not exist for the purposes of international ill.
there is a relationship between the perceived value of the item and the effort required to acquire it; the effort expended in obtaining that item is directly tied to its value to the patron. in this context (the researcher’s need), international ill must be evaluated as a tool and its use contextualized.
the and international ill surveys both indicated that rare or older materials and local dissertations are the most challenging to acquire, with more than percent of respondents in both surveys who borrow internationally reporting difficulty obtaining them. yet these items, particularly dissertations, are the materials most likely to be unique and for which no other method to acquire them exists, and their uniqueness likely increases their value for researchers. audiovisual media were also identified as difficult to acquire via international ill, with percent reporting difficulty in and percent in . the different formats used to encode dvds and vhs cassettes for sale in specific regions further complicate use of these materials if they are lent abroad.
several trends brought about by the internet increase the availability of materials and thus directly affect the use of, and reduce the need for, international ill. these trends include the proliferation of legally digitized works, the growth of open access initiatives for born-digital scholarship, and the illegal sharing of copyrighted materials via the world wide web.
legal online availability of works that have entered the public domain reduces the need to request older and thus likely rarer materials, while self-archiving of preprints and illegal file sharing provide alternate acquisition mechanisms for more current, copyright-protected materials. the development of mass-digitization projects, scan-on-demand programs, and open access journals and repositories has reduced the need for international ill because many sources can now be obtained legally from the internet without transporting a physical item or involving ill staff. the google books library project, which scans works in the collections of library partners and adds them to its digital inventory; national digital libraries (for example, gallica, the digital library of the bibliothèque nationale de france); and scan-on-demand services (for example, ethos, the british library’s electronic theses online service) now provide either free access or purchase options. in the past, ill was the only way to acquire such resources. moreover, nontraditional archives, such as the queer zine archive project (qzap), provide access to previously unheard voices. options for free legal access to current scholarship also have expanded, with more than , journals and , repositories currently listed in the directory of open access journals (doaj) and the directory of open access repositories (doar). for countries ranked lower on
the united nations (un) human development index (a composite of life expectancy, education, and per capita income), such programs as hinari (originally an acronym for health inter-network access to research initiative, which provides access to research on health), agora (access to global online research in agriculture), and oare (online access to research in the environment) provide superior, more cost-effective access to materials than ill and thereby reduce the need for it.
despite these developments, many publications still lie behind a restrictive pay wall, and the need for easier access has driven the development and wide usage of several international and perhaps less-than-legal tools for acquiring journal articles and other materials. the website sci-hub and the twitter hashtag #icanhazpdf are growing examples of these services. they also speak to the difficulties of traditional international ill as a tool and, at a more basic level, reflect the need for easier access to scholarly information regardless of location and affiliation. while many individual articles can be purchased online directly from publishers, the price may be too high for individual scholars to pay, or the -hour access model provided by some publishers may be too restrictive. alternatively, scholars may be unable to obtain a reproduction through ill in the time frame required, or the process of requesting the article through the library may be too cumbersome. again, the relationship between the value the researcher perceives the item to have and the ease of acquisition merits consideration. newly developed tools, regardless of their legality, often provide access where access did not exist previously, and they do so quickly, easily, and freely.
the efficacy of international ill must be considered in terms of meeting the user’s information needs within his or her limited time, energy, and resources. thus, the greatest need for international ill and its strength as a tool lie in acquiring rare or unique resources that are still protected by copyright and not licensed for public display and distribution. for foreign dissertations and other single-source items, international ill remains the most effective mechanism to meet the user’s need.
barriers to international ill
while the rusa stars surveys concentrated on ill-specific topics, broader cultural, social, economic, and political issues affect international ill too. by considering international ill in this larger context, one can better understand its barriers. the ideas behind the un human development index (henceforth referred to as “the index”) provide a useful framework for exploring trends. the data from the survey suggest that the degree of human development, along with shared culture and infrastructure, has a greater effect on international ill than any other factor. the authors of this article posit that the greatest volume of international ill should occur between nations ranked highest on the index. moreover, similar economic and legal systems should influence international ill, as might a shared border. political stability and economic cooperation further provide an environment conducive to sharing. such considerations as a shared language, previous colonial relationship, or current administrative and constitutional ties should also inform this notion of shared cultural heritage and norms, thus increasing volume.
therefore, a high number of international ill transactions should occur between the united kingdom, current members of the commonwealth, and former british colonies, dominions, territories, and protectorates. likewise, spanish-speaking nations in south america should most often interact with one another and with spain. following this logic, the least advantaged countries on the index should engage in the smallest number of international ill transactions. their capacity to engage in resource sharing is further reduced if they suffer from an epidemic, war, insurgence, or other unrest.
a review of the data provided by the most recent rusa stars survey is best done by the classes of access defined earlier, starting with the concept of relative advantage or disadvantage using the index. libraries in countries responded to the yes or no questions asking whether they borrowed and lent internationally. as table shows, there is a positive correlation between human development (hd) group and survey participation, with the greatest representation of countries belonging to the very high hd group ( percent) and the lowest within the low hd group ( percent). the general lack of response from countries ranked low on the index is telling; moreover, the lone respondent from the low hd group did not participate in international ill at all. on the opposite end of the spectrum, the majority of respondents in very highly developed countries borrow ( percent) or lend ( percent) internationally (see figure ). countries from the high or medium hd groups fall between these two extremes, with to percent of respondents participating in international ill activity. when it comes to international ill volume, the greatest cluster of activity, exceeding requests per year, occurred between the ranks of . and . (see figure ). this range represents the top percent of the index, supporting the authors’ assertion that the highest volume of international ill should occur between the highest-ranking nations. china, which had the most libraries reporting above average activity outside of this range, is a conspicuous outlier. because china had the highest – average annual hdi growth rate of survey respondents in the high hd group (and the second highest among all respondents), one can infer that rapid growth in development, in addition to very high development, may lead to increased participation in international ill.
table . response by country to questions about borrowing and lending internationally
united nations human development group | number of countries with responses | total number of countries | percentage of countries represented
very high | | | %
high | | | %
medium | | | %
low | | | %
the european union (eu) and the schengen area, which comprises european countries that have abolished passport and other types of border control at their shared borders, serve as excellent examples of how similar culture, political stability, economic cooperation, and shared borders provide an environment conducive to sharing. eighty-eight european respondents identified countries from which they borrow abroad most frequently. table shows the top countries, percent of which are eu or schengen member states or both. of those frequent lenders, germany was selected times, nearly twice as often as its nearest competitors. germany shares borders with nine countries, all of which participate in the schengen agreement, whose open borders increase the speed with which goods, including library materials, can travel. of the survey respondents from germany’s neighbors who borrow internationally, all listed germany as one of their top five lenders.
proximity, open borders, and generally favorable lending policies seem to enhance germany’s popularity as an international lender, while its very high hdi rank (sixth in the world) aids its libraries in sustaining generous international ill activity.
because the survey was distributed in english (and had relatively few responses from spanish-speaking countries), the former british empire provides the best case study to explore how a shared cultural heritage resulting from colonialism might affect intercontinental ill activity. libraries from six former british colonies, dominions, territories, or protectorates (henceforth referred to simply as “former british colonies”) and nine current members of the commonwealth participated in the survey. according to the central intelligence agency’s world factbook, english is either the primary spoken language or an official language with de jure status in of those countries, and it is commonly spoken in two more. despite dispersion across five continents, a shared language and current or historical ties seem to support higher levels of international ill activity. commonwealth members and former british colonies are more likely to borrow and lend internationally than other respondents; this is true overall as well as within the very high and medium hd groups (see table ). likewise, their -month international ill volume compared to other respondents is equal or higher across all categories. their responses also indicate that commonwealth members and former british colonies are likely to borrow from and lend to one another more frequently ( percent and percent, respectively) than to other countries ( percent and percent, respectively). among these countries’ top five most active borrowers and lenders abroad, all belong to the very high hd group, and four of the five (australia, united kingdom, canada, and the united states) belong to this shared cultural network (see table ).
figure . participation in international interlibrary loan (ill) by united nations human development group, according to the reference and user services association sharing and transforming access to resources section (rusa stars) survey
figure . international interlibrary loan (ill) volume by united nations human development index value, according to the reference and user services association sharing and transforming access to resources section (rusa stars) survey
table . top countries from which european libraries borrow most frequently
country | number of respondents | percentage of respondents | european union member | schengen member | un human development index (hdi) value | un hdi rank
germany | | % | yes | yes | . |
united kingdom | | % | yes | no | . |
united states | | % | no | no | . |
france | | % | yes | yes | . |
norway | | % | no | yes | . |
sweden | | % | yes | yes | . |
denmark | | % | yes | yes | . |
italy | | % | yes | yes | . |
netherlands | | % | yes | yes | . |
spain | | % | yes | yes | . |
table . response to questions about borrowing and lending internationally: comparison of commonwealth members and former british colonies to other respondents
un human development group | borrow internationally (yes): commonwealth members and former british colonies | borrow internationally (yes): other respondents | lend internationally (yes): commonwealth members and former british colonies | lend internationally (yes): other respondents
very high | % | % | % | %
high | % | % | % | %
medium | % | % | % | %
low | % | n/a | % | n/a
overall | % | % | % | %
table . top five countries from which commonwealth members and former british colonies borrow and to which they lend most frequently
country | number of respondents | percentage of respondents | un human development index (hdi) value | un hdi rank
borrow
germany | | % | . | very high
united kingdom | | % | . | very high
canada | | % | . | very high
australia | | % | . | very high
united states | | % | . | very high
lend
canada | | % | . | very high
united kingdom | | % | . | very high
australia | | % | . | very high
united states | | % | . | very high
denmark | | % | . | very high
why libraries do not engage in international ill
using the un human development index as a framework can inform an understanding of barriers to international ill, but only to an extent. it does not, for instance, explain why libraries in countries with very high hd do not lend abroad. libraries may have many reasons not to participate in international ill. the authors will now consider possible reasons in four broad categories: cultural, political, economic, and policy. the authors’ own observations inform these reasons, as do the open comments and responses to questions related to barriers in the and rusa stars surveys.
cultural reasons
scholarship has a long, well-documented, global history of resource sharing, from the copying of medieval european manuscripts to the transmission of knowledge across the arab world, which extended from baghdad, now the capital of iraq, to seville, spain, during the middle ages. the creation of new knowledge is predicated upon review and synthesis of existing knowledge. thomas aquinas could never have written the summa theologica without the preservation of aristotle’s thoughts in arabic and the sharing of those manuscripts with the west. loans of materials have also resulted in the theft of those materials, violating the trust that must underlie contemporary international ill. in the past, materials deemed unsuitable were suppressed, destroyed, or hidden if they did not fit the prescribed or existing shared values.
perhaps they challenged the prevailing national sense of self, as nineteenth- and early twentieth-century racist materials do in the united states and australia today. given the age of such items and the risk of theft or defacement, libraries may transfer historical items that cause discomfort from open stacks to special collections. this relocation, while ensuring preservation and allowing for presentation with curatorial interpretation, may limit access for remote researchers because materials in special collections are among the most difficult to acquire via ill.
censorship also creates barriers that may hamper international ill. the comstock laws of the united states, which prohibited sending through the mail any material deemed “obscene, lewd, and/or lascivious,” including information about birth control, provide an example. so do the extensive internet regulations in china, commonly known as the great firewall of china, and north korea’s exclusion from the internet. international ill can only flourish where the free exchange of information and ideas is permitted.
barriers to international ill are not always overt; a culture of sharing may simply not exist, or it may exist only in a form not used elsewhere. for example, many latin american countries lack the ill coordination at the national level that is common in north america and europe. only two responses to the rusa stars survey came from central or south america, which is not surprising given that the authors had no known contacts to whom to send the survey. although the survey garnered responses from latin american libraries thanks to improved distribution efforts, the response rate from this region remained very low. of the responses received, percent did not participate in any national or international resource-sharing networks (the highest percentage of any continent).
where ill does exist in latin america, it is based upon interinstitutional agreements and personal connections, and those relationships vouch for the safety of the materials. resource sharing exists in latin america, but it occurs through channels that are difficult for an uninitiated ill practitioner at a north american library to discover and use. additional research is needed to determine if other parts of the world (for example, africa, the middle east, or central and southeast asia) employ similar methods to meet researchers’ needs.
political reasons
international ill requires a level of political stability to ensure an exchange infrastructure exists and to protect the safety of the item transported. the and rusa stars surveys asked respondents to identify the top five countries to which they would not lend. the results, while restrictive toward the middle east, are less so than those of . the arab spring, a wave of protests and demonstrations in and , and the civil war in syria that began in , with their resulting political instability, likely contributed to the increased reluctance to lend to those countries in . likewise, india’s weak transportation infrastructure may explain why three respondents in the survey identified it as a country to which others would not lend.
shared notions of ownership and property rights enshrined in law also affect international ill’s ability to serve researchers. items must pass from one jurisdiction to another safely in transit to the requesting library’s patron. the intellectual property rights for digitized materials such as journal articles must be honored. china’s reputation for counterfeiting and weak enforcement of intellectual property rights may be responsible for its inclusion on the and lists of countries to which libraries will not lend.
even among politically and economically stable world trade organization (wto) signatories ranked highest on the index, barriers to international ill exist. thirty-two percent of the respondents in noted copyright law and licensing terms as the greatest barriers to international ill. germany’s legal restrictions upon electronic transmission were particularly noted in the open comments. while german law does not prohibit international ill, it does increase the delivery time because a paper copy must be mailed or obsolescent technology such as fax machines used. even mybib el®, a novel presentation platform that meets the legal requirements of germany’s copyright law by allowing the borrowing library to print only a single copy of the file directly from the lending library’s server, causes delays because the file cannot be delivered directly to the patron. as a result, the library cannot meet the researcher’s need in as timely a fashion as it otherwise might (and as the researcher likely expects it to do).
respondents to the survey indicated that shipping, at percent, and customs, at percent, were additional barriers to international ill. while a nation has the responsibility to protect its residents and secure its borders, the commingling of ill materials with other items shipped via international post or commercial delivery systems makes international ill more challenging. exemptions and codes for “used library materials, no monetary value” exist, but compliance with customs regulations and the associated paperwork slow the process by adding work for library staff. moreover, the lending of books and other returnables across borders assumes international standards and agreements on the transport of materials, particularly trustworthy handoffs between national postal systems.
economic reasons
all interlibrary loan is predicated upon trust and common expectations. the lending and borrowing libraries must trust each other and agree that the former will supply an information resource to the latter that, if a physical item, will be returned in a timely manner. the borrower accepts the costs to both libraries for processing and shipping or, if electronic delivery is possible, reimburses the lending library for the effort it expended.
payment ranked as the highest barrier to international ill, with percent reporting this difficulty in . payment involves not only reimbursing the lender for the provision of the item but also identifying and agreeing upon which mechanism will be used to submit the payment. the international federation of library associations and institutions (ifla) supports a voucher system for ill payment, but it is an antiquated process that requires the physical exchange of laminated plastic cards. electronic payment systems, such as interlibrary loan fee management (ifm) provided by the online computer library center (oclc) and docline’s electronic fund transfer system (efts), do exist, but these are tied to requests made in specific closed systems that require membership. their use for international ill is limited because participation is either dominated by, or specifically limited to, libraries in north america. libraries also use credit cards and invoices, but these methods incur extensive processing costs, as do bank drafts. an electronic, vendor-neutral clearinghouse for payments, often requested in the survey results, would greatly reduce the payment processing cost for international ill.
implementation, however, would require the development of such a system, ifla or another trusted entity to serve as guarantor, and agreement among libraries around the world to use it.
policy reasons
finally, institutions may not engage in international ill as a matter of policy, or they may choose to participate in ways that fall outside international norms or via alternative systems. national libraries frequently play the role of guardian and keeper of that country’s cultural heritage. as depositories for the nation’s published materials, they may be less likely to lend materials because they see their holdings as a unified whole from which parts cannot be separated. likewise, some research libraries (for example, the linda hall library in kansas city, missouri, or the newberry library in chicago) house collections that are entirely noncirculating, so these institutions inherently do not lend materials via ill (though they may provide scans and reproductions). the united states national library of medicine’s historic nonparticipation in the oclc interlibrary loan system represents another policy that impedes easy international or even national access. its docline ill system is a closed one designed to serve only medical libraries.
the national library of medicine’s policy in many ways encapsulates cultural, political, and economic barriers. culturally, it reinforces the specialness and high status historically accorded to doctors and medicine in the west. politically, it asserts difference. economically, it imposes high costs in the form of staff manually processing non-docline requests for both borrowing and lending.
ultimately, it demonstrates how larger external factors culminate in barriers to ill.
improving access to global information resources
in , the national library of medicine began preparations to lend materials via the oclc interlibrary loan system. likewise, plans are underway to expand the partnership between the center for research libraries, a consortium of university, college, and independent research libraries, and the linda hall library, thereby increasing physical access to the latter’s collections. these shifts in policy serve as encouraging reminders that barriers can come down and that change for the better is possible. beyond creating more inclusive policies, there are numerous other ways that the library community can improve international ill or otherwise expand access to global information resources. the possibility to effect change exists at all levels, from the institutional to international.
at the institutional level, academic libraries should conduct qualitative studies to determine the value of international ill to their university community. this niche service provides otherwise unobtainable materials to library users, but to what end? questions related to the benefits and impact of this service include: does it help graduate students finish their dissertations in a timely fashion or support faculty in producing publications for promotion and tenure? which disciplines, if any, rely on international ill to meet their information needs? understanding the value that this service provides to local researchers not only would help determine appropriate levels of funding and support but also could pinpoint where resources should be focused for best return on investment. on the lending side, libraries should ensure that their policies and licensing agreements for e-resources do not prohibit or hamper international ill. greater collaboration between special collections and ill departments could improve access to older and rare materials.
mechanisms such as atlas systems’ illiad and aeon web platform integration, which allow for communication between an institution’s interlibrary loan and special collections management systems, are needed to quickly and easily determine if special collections materials requested via ill can be provided and, if so, how. programs should be developed to lend original materials with curatorial approval and to provide on-demand digitization where condition and copyright permit, if such programs do not exist already.
at the consortial level, access to global information resources could be improved through coordinated data gathering to inform the building of collective collections for international materials. reducing duplication in future purchasing should allow for libraries to collect more unique materials, and building more distinctive local collections would better support both consortial and national resource sharing. special attention should be paid to collecting publications from countries for which our institutions have dedicated academic programs or research centers, such as those for latin american or middle eastern studies, especially if those countries rank lower on the un human development index or are difficult to borrow materials from. a consortium might also consider using resource-sharing data to identify potential partnerships worth establishing abroad, whether with an institution, consortium, or resource-sharing network.
any such reciprocal agreements should be mutually beneficial and pursued with mindfulness of cultural differences and economic disparities and with sensitivity to histories of colonialism, exploitation, or unwanted foreign intervention.

at the national and international levels, coordinated advocacy and development are needed. resource-sharing practitioners and the organizations that represent them should engage in conversations about, and actively support the development of, an iso (international organization for standardization)-based open system for ill request brokering that is vendor neutral. such interoperability could facilitate the building of transnational reciprocal partnerships by eliminating the need for libraries to adopt an additional system or to manually complete online forms to engage in resource sharing abroad. likewise, the development of an electronic, vendor-neutral system for remuneration is essential to reducing barriers associated with payment. while acknowledging that the complexity of dutch banking law poses challenges, national committees such as the rusa stars international interlibrary loan committee should continue to lobby ifla to support an electronic equivalent of ifla vouchers. finally, ifla must persist in its advocacy efforts for copyright law exceptions for libraries and archives, in particular those that permit interlibrary loan and document delivery, and preferably without restrictions related to international or electronic delivery. ifla has attended meetings of the world intellectual property organization standing committee on copyright and related rights since and should not abandon its efforts to elevate all signatory countries to a minimum standard for libraries.

many of the cultural, political, economic, or policy barriers to international ill are entrenched, but this is increasingly less true for technological ones.
while previous consortial or national shared catalog systems tended to be closed systems (often client-server integrated library systems), the movement toward cloud-based library service platforms provides opportunities for cross-system communication. this new architecture could help increase international ill by providing more efficient ways to exchange requests abroad. for example, if two countries have their national shared catalogs on the same ex libris alma resource management server, it is much easier to place and track requests between libraries in those nations. all the necessary information is stored on the same server, and requesting between national systems can be automated. this model represents a marked improvement from the current process of exchanging e-mails and ifla vouchers. just as the introduction of web-accessible catalogs transformed discovery, these new architectures coupled with new open standards have the potential to transform requesting and delivery.

the best way to improve future access to global information resources is to continue improving our understanding of the current barriers to the exchange of those resources. increasing comprehension requires three things: expanded participation in international ill surveys to increase the data available, coordination between surveying organizations to obtain comparable data sets, and improved analysis and distribution of those results to stakeholders. improved coordination between different library associations should lead to wider survey distribution.
to this end, the rusa stars international interlibrary loan committee plans to coordinate production and distribution of its international ill survey with ifla, including translation into additional languages beyond english. this collaboration should provide a more holistic view of international ill, in contrast to the previous surveys, in which the united states was overrepresented.

conclusion

implementing the ideas to improve access to global information resources outlined here should begin with countries ranked higher on the un human development index. these nations already have the societal and physical infrastructure required to exchange library materials across borders, and numerous academic libraries in countries with very high human development already engage in international ill on a regular basis. this activity depends upon a safe environment for scholarly inquiry and discovery within institutions of higher education and reflects a mutual commitment to learning and the sharing of ideas. as the survey demonstrates, shared culture and open borders facilitate higher levels of international ill between certain countries, and open architecture could further increase resource sharing in the future. other barriers beyond the technological pose difficulty for ill practitioners seeking to borrow or lend materials abroad, and emerging alternatives (both legitimate and illicit) to international ill have developed and will continue to develop. clearly, international ill exists in a complex socioeconomic, transnational landscape filled with many challenges. academic library staff must navigate around them to provide the resources requested by their patrons. nevertheless, for many institutions, this service remains a viable mechanism for meeting their patrons’ information needs, as evidenced by the increase in international ill over the past five years reported by most of the survey respondents.
investing in improvements to international ill is warranted, and several methods to ameliorate the current service model exist, but where to focus our efforts and resources for best results has yet to be determined.

the rusa stars international ill surveys have concentrated on numbers and tools, with barriers as only a secondary consideration. numbers are easy to gather, but they cannot establish the service’s value to researchers, demonstrate the change in this value over time, or explain how this resource acquisition tool helps advance scholarship. quantitative data do not provide qualitative results. cooperative initiatives, particularly a modern payment system to reduce the economic friction of international ill, can make this service easier, but additional study to demonstrate its value to the researcher is still needed. the rusa stars international interlibrary loan committee and other interested resource-sharing professionals should collaboratively pursue with ifla, other professional organizations, and peer institutions abroad opportunities to enhance our understanding of how this international service supports local researchers. this knowledge, in turn, will help us to invest wisely in future improvements to international ill.

kurt munson is the assistant head of access services at northwestern university libraries in evanston, illinois; he may be reached by e-mail at: kmunson@northwestern.edu. hilary h. thompson is the head of resource sharing and reserves at the university of maryland libraries in college park; she may be reached by e-mail at: hthomps @umd.edu.

notes .
for more information about the , , and surveys, see tina baich, tim jiping zou, heather weltin, and zheng ye yang, “lending and borrowing across borders: issues and challenges with international resource sharing,” reference & user services quarterly , ( ): – ; tina baich and heather weltin, “going global: an international survey of lending and borrowing across borders,” interlending & document supply , ( ): – ; and kurt munson, hilary h. thompson, jason cabaniss, heidi nance, and poul erlandsen, “the world is your library, or the state of international interlibrary loan in ,” interlending & document supply , ( ): – . . baich and weltin, “going global,” ; munson, thompson, cabaniss, nance, and erlandsen, “the world is your library,” . . ibid. . john bohannon, “who’s downloading pirated papers? everyone,” science , ( ), http://www.sciencemag.org/news/ / /whos-downloading-pirated-papers-everyone; carolyn gardner and gabriel gardner, “bypassing interlibrary loan via twitter: an exploration of #icanhazpdf requests,” association of college and research libraries (acrl) conference, portland, oregon, march – , , http://eprints.rclis.org/ /. . united nations development programme, human development reports, “ human development index statistical annex tables,” accessed january , , http://hdr.undp.org/en/data#. . while libraries in the high human development (hd) group are nearly as likely to borrow internationally as those in the medium hd group, they are noticeably less likely to lend. these data suggest that libraries in countries with high (but not very high) human development may experience an increased fear of loss of material compared to their peers in medium and very high hd groups. .
provided, of course, that the country already has developed enough to support international interlibrary loan (ill). the highest human development index (hdi) growth rate among survey respondents was that of zimbabwe, also the survey’s only respondent from the low hd group. despite extensive growth, the country’s human development still appears insufficient to sustain international ill activity. . germany shares borders with denmark, poland, the czech republic, austria, switzerland, france, belgium, luxembourg, and the netherlands. switzerland does not belong to the european union but participates in the schengen area; european commission, migration and home affairs, “schengen area,” accessed march , , http://ec.europa.eu/home-affairs/what-we-do/policies/borders-and-visas/schengen_en. . one hundred percent of german survey respondents said they lend both returnables and nonreturnables and accept international federation of library associations and institutions (ifla) vouchers for payment. however, electronic delivery of nonreturnables is often restricted due to german copyright law, as discussed in the section “political reasons.” . central intelligence agency, the world factbook, , accessed march , , https://www.cia.gov/library/publications/the-world-factbook/. . tina baich, jennifer block, paul drake, lee anne hooley, jennifer jacobs, karen l. janke, and leetta m. schmidt, ala [american library association] rusa stars [reference and user services association, sharing and transforming access to resources section] international interlibrary loan survey: executive summary (chicago: rusa, ), ; leetta schmidt, e-mail to the author. . megan gaffney, “interlibrary loan between the united states and latin america: the current landscape,” paper presented at the ifla document delivery and resource sharing satellite meeting, august – , , washington, dc; leetta m.
schmidt, “interlibrary lending in mexican, caribbean, central american, and south american libraries,” journal of interlibrary loan, document delivery, & electronic reserve , ( ): – ; robert a. seal, “interlibrary loan: integral component of global resource sharing,” paper presented at the ifla/seflin (southeast florida library information network) international summit on library cooperation in the americas, miami, fl, april , . . in , seven countries in the middle east were identified a total of times as countries to which libraries would not lend abroad; in , countries in the middle east were selected times, a percent increase. . china was included three times in both the and surveys. in , libraries in the united states, canada, and sweden reported refusing to lend to china. munson, thompson, cabaniss, nance, and erlandsen, “the world is your library,” . . ibid. . ibid. . ibid. . in , the online computer library center (oclc) symbol for the national library of medicine (nlm) was turned on as a supplier in the oclc policies directory, but there is currently a broad deflection in place, preventing most, if not all, libraries from submitting requests through this system. the nlm’s access unit head confirmed their plan to begin lending to non-docline libraries via oclc, though no start date has been determined. mary wassum, e-mail correspondence with the author, august , . . the current partnership only allows for the scanning of articles from the linda hall library in kansas city, mo, which are requested and provided via rapidill. kevin wilks, e-mail correspondence with the author, august , . . oclc is writing an iso (international organization for standardization) module for implementation with its resource-sharing products, and this module may be made available under an open license for broader use. 
likewise, index data is currently engaging in conversations with different communities about developing “a modular, component-oriented alternative to monolithic request brokering systems” using the folio (future of libraries is open) architecture. a white paper outlining a model for open resource-sharing brokerage should be forthcoming. iso newsletter # , distributed via e-mail to subscribers, june , ; sebastian hammer, post to the folio discussions list, july , . . international federation of library associations and institutions (ifla), “statements made by ifla at the wipo [world intellectual property organization] standing committee on copyright and related matters (sccr),” updated october , , https://www.ifla.org/publications/statements-made-by-ifla-at-the-wipo-standing-committee-on-copyright-and-related-matters. . forty-eight percent of survey respondents reported making more international ill requests now than they did five years ago, while only percent reported no change, and percent reported making fewer requests; munson, thompson, cabaniss, nance, and erlandsen, “the world is your library,” .

today and in perpetuity: a canadian consortial strategy for owning and hosting ebooks☆

tony horava⁎

university of ottawa, associate university librarian (collections), university private, ottawa, ontario k n n canada

the journal of academic librarianship xxx ( ) xxx–xxx

☆ this is an open-access article distributed under the terms of the creative commons attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
⁎ tel.: + x ; fax: + ; e-mail address: thorava@uottawa.ca.
- /$ – see front matter © the author. published by elsevier inc. all rights reserved. http://dx.doi.org/ . /j.acalib. . .
article info

article history: received march ; accepted april ; available online xxxx

keywords: ebooks; consortia; local hosting; licensing; ontario council of university libraries (ocul); canada

abstract

the ontario council of university libraries (ocul) is a provincial consortium of twenty-one publicly funded universities in ontario, canada. a consortially-built platform called scholars portal is our digital library for archiving ebook content and making it available / to university students and faculty. an ebooks committee has responsibility for coordinating the consortial acquisition of ebooks, within the context of an information resources committee. this paper discusses the consortial strategy and philosophy for ebook licensing in ocul, which involve a focus on ownership and local loading rights, for dual purposes of preservation and immediate access. key processes, tools, and accomplishments of this innovative service model are highlighted.

© the author. published by elsevier inc. all rights reserved.

consortia and the ebook landscape

the growing importance of consortia for ebook acquisition is very much in evidence. according to a report, colleges in the u.s. purchase % of their ebooks via consortial agreements (ontario council of university libraries, a, b). this represents a substantial portion of ebook expenditures, and is higher than for public libraries ( %) or for special libraries ( %). consortia in the digital age were originally viewed as buying clubs that focused on negotiating lower prices for group purchases (allen & hirshon, ) and exerting greater influence in the development of scholarly resources (alberico, ). collective acquisition of digital content leads inevitably to a greater focus on cost-sharing and its management.
stern ( a) notes that “payment models for individual libraries already allow for complex funding options, but consortial funding allocations and reporting will be far more complicated and will require central tracking for annual analysis across the group.” this has become a fact of life for consortia acquiring digital scholarly content, and numerous models for allocating costs exist, whether based on central funding from governmental sources, institutional funding, or a hybrid of the two. many criteria can be employed to serve as proxies for demand and value. what is crucial is that the criteria be perceived as equitable, transparent, consistently applied, and affordable by the members. cost–benefit analyses reflect the value of providing much greater access to scholarly material through consortia and saving the time of the user (scigliano, ).

today, however, consortia are “pursuing complex cooperative collection development strategies, and what's more, content production, hosting, and sharing” (zeoli, ). in this context, consortia are playing a much larger market role in the scholarly communications ecosystem with respect to the development, acquisition, and integration of digital scholarship into the life of academic institutions (and other sectors as well). one can agree with maskell ( ) that “consortia might be considered not as augmenting and strengthening the role of academic libraries in that cycle, but rather becoming an increasingly powerful intermediary between the publisher and the academic library.”

while there are numerous and powerful advantages to leveraging consortial action in acquiring digital content, there are challenges that are inherent to the structure and functioning of consortia. hazen ( , ) observes that,
the internal workings of consortia reinforce the grounds for doubt. these bodies are instruments of their members' collective will, but also are beholden to each participant's priorities and claims. group decisions are susceptible to lowest-common-denominator, weak-link-in-chain, and divide-and-conquer distortions. consortia, in their current form, may be equivocal instruments of collective resolve.

human dynamics come into play in collective decision-making, thus testing the collaborative will and the common strategic objectives of the group. it is safe to say that there is a very wide spectrum of dynamics, goals, and resource capacities in library consortia as they evolve in their responses and strategies to the thorny challenges in the scholarly communications ecosystem. there will always be a tension between institutional and consortial interests; this is one of the trade-offs implicit in any collective action. consortia carry out an extensive and ongoing dialogue with each other, and learn from best practices and experience elsewhere in the community. considering how rapid the growth and influence of consortia on the marketplace have been, it is difficult to predict how this will evolve in light of the massive transformations we are witnessing in the publishing industry, in the re-invention of research in the digital age, and in the ubiquitous impact of information technologies on the delivery of education.
what is clear is that ebooks are a disruptive force, in the best sense of the term, and that all consortia need to rethink their licensing strategies in relation to the value, purpose, viability, and integration of ebooks into their strategic thinking.

a myriad of challenges

in ontario as elsewhere, the ebook landscape has become a complex and challenging focal point for libraries in recent years. the chronic ‘messiness’ of ebooks is well known: the difficulties in agreeing on a common definition and format for ebooks; the challenges around pricing; drm restrictions on use; lack of simultaneous publication with print; the lack of a clear understanding of ebook use; and the need for a better integration of ebooks into the workflows of researchers. part chameleon, part revolutionary, and part adolescent, the ebook challenges our assumptions on the nature and use of long-form scholarship in today's world. it is therefore a time of much experimentation and questioning, as libraries try to harness the rich potential of the ebook to support learning and teaching in a sustainable manner.

the range of licensing options has expanded enormously: subscription, purchase, demand-driven, and pay-per-view acquisition models are proliferating. there are big deal-style complete collections (though not always complete!); subject collections; and bespoke or customized collections. there is major growth in the availability of backlist titles in digital form. there are workflow issues around the delivery of front list ebook titles, particularly in relation to timeliness. polanka comments that, “even when an ebook was made available for purchase, some publishers imposed an additional delay before ebooks could be included in leased collections. lack of uniformity across the publishing industry causes ongoing problems for aggregators and libraries alike.” (polanka, ).
there is a wide range of preferences for book use: while a growing number of users are comfortable with ebooks and associated reader technology, a sizeable minority still prefers print books for some or all learning purposes (revelle, messner, shrimplin, & hurst, ). there are vendor integration issues around the impact on approval plans, duplication of content, marc records, invoicing, and comprehensiveness of coverage. there is the lack of standardization of isbns for ebooks. conference papers and online and in-person conversations in the library world are peppered with concerns on how to address these many challenges and to respond to user expectations for digital content delivery, ease of use, and timeliness of availability. a search in the academic complete database on the licensing of ebooks yielded hits in total, as of this writing (search string = ebook* and (sales or price* or pricing or model or licensing or business or strateg*)). the number of hits has increased annually: for , for , and for . professional dialogue on ebook issues is exploding in many directions, such as patron-driven acquisition, which is “poised to become the norm” (acrl, ) in academic libraries. experimentation around the forms and functioning of e-textbooks is another indicator of the turbulent state of the venerable monograph. the only certainty is that the future forms of the book will be diverse, media-rich, and predicated on business models that exploit new technologies for delivery, interaction, and discoverability.

looming above this landscape, however, is the sobering financial cloud: all libraries are struggling to maintain strong collections in the context of financial restraint, which in many cases means a flat or declining budget.
the loss of purchasing power constrains the options available for libraries vis-a-vis ebooks. as the serials and continuations budget (mostly spent on digital content) is subject to inflationary pressures, the monies available for monographs are declining and under threat in many institutions. the vendors, encouraged by rapid developments in e-reader technology (e-readers, tablets, and smartphones), as well as a dramatic take-up of ebooks by the general public, are sensing many new opportunities in the ebook marketplace. the journals market is quite saturated. hence publishers and aggregators are rushing to digitize front list and backlist titles, offering multiple licensing options and delivery channels, and overhauling production workflows so that the digital format becomes the default format of publication. this competitive arena is good for libraries, but it doesn't obviate the need for consortia to carefully assess these options in light of strategic goals and significant challenges in working together with partners in the monograph supply chain.

the ontario council of university libraries (ocul)

in ontario, canada's most populous province, there are twenty-one publicly funded universities. these range from the comprehensive research-intensive university (university of toronto, over , fte) to the small, focused undergraduate university (such as algoma university, about fte) and many others in between. there are over , university students in the province. the ontario council of university libraries (ocul) is the consortium that represents these universities' interests and that spearheads consortial activities. founded in , ocul developed collaborative initiatives such as a province-wide interlibrary loan agreement to provide books free of charge, and a union catalogue of serials held in ontario libraries.
since the late s, a projects officer and an information resources (ocul-ir) committee have coordinated the licensing of digital resources; as of this writing there are about licensed products for an annual expenditure of about m. the ir committee includes representatives from each of the twenty-one member institutions. there are separate specialized groups that coordinate the purchase of digital maps, data sets, and geospatial resources. ocul works as one provincial consortium within a partnership framework that includes other canadian regional groupings through consortia canada and with other academic libraries through crkn (canadian research knowledge network). this has led to a unique and innovative form of collaboration “with the goal of enhancing research supports and creating rich learning environments for ontario's diverse and growing student population.” (ontario council of university libraries, a) ocul supports “canada's knowledge economy by providing the information tools and access essential for high quality education and research in ontario's universities.” (scholars portal, a, b). the centerpiece of collaboration is a cyber-infrastructure called scholars portal, launched in , which preserves and provides access to a broad range of scholarly content that has been licensed by the consortium on a perpetual access basis (i.e. via purchase agreements with publishers). scholars portal was originally developed using seed money from the ontario government in the late s, and is now sustained by the twenty-one member institutions on a cost-share basis. local loading rights have been negotiated to enable the archiving of these resources in perpetuity. as of this writing, there are , , full-text articles from , journals loaded on the journals server; , ebooks on the books platform (this includes , commercial titles and , open access titles digitized by the internet archive). scholars portal has become a powerful discovery portal for a wide range of scholarly research aggregated on a single platform. this also includes born-digital ontario government documents catalogued by the ontario legislative library. there is a growing number of social science datasets and geospatial resources housed on other scholars portal servers. scholars portal also provides services to support library and researcher workflows: it “supports the online inter-library loan platform [racer] for ontario's universities and provides support for citation management systems, a virtual chat reference service and other tools designed to aid and enhance academic research in ontario.” (scholars portal, a, b) the technological infrastructure and the twenty-four dedicated staff are financed by the members of the consortium, and located at the university of toronto, which acts as a service provider to ocul (while being a member as well). there are three scholars portal staff who are specifically responsible for managing and archiving the ebook content that is received from publishers.

negotiations with publishers for ebook content have been predicated on several key principles: ) the best possible pricing, such that the consortial agreement is better than what any institution could achieve on its own; ) alignment with core licensing provisions as outlined in our model ebook license; ) perpetual access rights based on purchase agreements; ) local hosting on scholars portal; and ) prioritization of scholarly publishers' collections based on a consensus of our institutional research and teaching needs. these principles will be explored in the course of this paper.

ebook platform

scholars portal books is our locally-built platform for archiving and accessing scholarly monograph texts to support teaching, research, and learning.
it is based on ebrary software purchased in , following an rfp selection process. it is a ‘light archive’: it is regularly accessed by authorized users to search and browse the collection. the platform provides the ability to search individual collections, or across all public or subscribed collections. searches can be refined according to various facets: subject, language, author, series, and type. the interface also provides related journal articles on the search term, thus providing an added benefit for students who may be looking for monograph and/or journal literature. access to collections is based on entitlements associated with acquisition decisions made by each school, and carefully controlled by ip ranges. shibboleth authentication is also available for federated identity management and appropriate access. where required, drm is in place to ensure that usage of the ebooks respects the terms of the license agreements. the platform allows for the highlighting of text, bookmarking, or saving of books, based on setting up personal accounts. pdf downloads are supported, and export to citation management systems such as refworks and endnote is available. a bilingual interface (english and french) is available, in recognition that several of our institutions are either bilingual or include substantial numbers of french-speaking students and professors in their communities.

once a consortial license has been signed, the publisher works with staff to transfer data and often metadata (although sometimes we need to get marc records from a different source). once staff receive the data, they analyze the files and the metadata to determine if there are any issues to address, such as incomplete or corrupt files, and set up the entitlements for the participating schools. once the problems are resolved, one of the programmers writes a loader and the books are loaded. the marc records with the scholars portal url are generated.
the team then does a quality assurance check before distributing the marcs to schools and opening up access. the lack of standardization of isbns for ebooks (multiple or problematic isbns) is an issue. there is a wiki that provides ebook files and metadata to the institutions (organized by publisher/collection and date). the process of working with publishers to source and load book files has been a learning experience for everyone involved. the ebook record specifications are provided on the wiki; there are general requirements on record creation as well as marc field-specific requirements. the cataloguers within ocul libraries have been very active in reviewing marc records from publishers, discussing issues on cataloguing of ebooks and developing the above specifications.

it is no exaggeration to say that the scholars portal books platform constitutes one of the most innovative and successful collaborative ventures in the consortial environment. this has been recognized by the charleston advisor, which awarded ocul an ebook innovation award (machovec, ). ocul's collaborative approach to providing its members with access to a growing collection of ebooks through the scholars portal books platform was cited by george machovec for offering “greater local control, customized functionality, and permanence that can be depended upon” ( ). scholars portal as an organization has won an innovation achievement award from the canadian association of college and university libraries (cla, ).

the ebooks committee

in – , ocul participated in a tri-sector consortial agreement for netlibrary titles, through the cool (consortium of ontario libraries) initiative. this involved the colleges and the public libraries in a unique model of cross-sector collaboration for digital collection development.
this was the first large-scale consortial acquisition of ebooks in ontario. each consortium was responsible for selecting its own titles, and significantly the pool of total titles was made available to all three sectors. it resulted in titles being purchased. netlibrary was problematic in terms of the single user, single book model. the interface at that time was quite primitive, and there were challenges in obtaining access to titles and to marc records. the coordination of this initiative was undertaken by three librarian members of the ocul-ir committee as well as the projects officer. this experience, early in the evolution of the ebook marketplace, revealed the need for a designated group to oversee consortial offers for ebook collections. there was a multiplicity of questions around acquisition and licensing models, delivery channels, pricing, access methods, content availability, and marc records.

the transition from print books to ebooks has been a long series of growing pains, as publishers, vendors, and libraries have struggled to define a sustainable scholarly supply chain that can meet everyone's interests and workflow capabilities. as is well known, this has occurred in marked contrast to the journals industry, where the transition from print to digital has been relatively painless and well-accepted: the e-format is now the stable, standard model for production and delivery of journal literature. for many reasons, such as immature business models, licensing restrictions, usability problems, and lack of widespread availability of titles, this has not been the case for monographs in digital format. at the same time, there was growing interest in exploring and resolving these issues in order to advance the development of the digital library. as more academic libraries embraced strategies for enhancing their web presence and for creating, acquiring and managing digital scholarship, this became more important.
in recognition of this reality, ocul-ir struck an ebooks committee in . the committee's mandate is to play a leadership role in ebook acquisition and licensing strategies. this creates efficiencies in the relationships with publishers and vendors for acquiring new content. the committee is composed of six members: two from small institutions (under , fte); two from medium-size institutions ( – , fte); and two from large institutions (above , fte). the assistant director of scholars portal (collections and digital preservation) is also a member. there are two co-chairs drawn from this group. this composition ensures that any vendor or publisher ebook proposal is reviewed by members representing a wide range of interests and circumstances. moreover, there was a recognition that ebook offers were becoming more and more time-consuming to address. the committee serves a key coordination role in shepherding an ebook offer from the initial proposal through the call for interest, the fielding of questions and the seeking of clarifications from the publisher, the crafting of the final offer, the request for decisions, and then, ideally, the finalizing of a license agreement with the vendor.

one of the committee members is designated as the lead negotiator, and works in collaboration with the projects officer. the committee examines ebook offers and responds to solicitations of interest from vendors. through analysis and exchange of ideas, the committee determines whether a proposed offer or a solicitation of interest is worthy of being presented to the membership as a consortial proposal. there is regular consultation with members through a distribution list, where questions are raised and clarifications given.
this allows for a consistent process, builds expertise, and ensures that scholars portal hosting issues are taken into account (as described above). monitoring national and international developments, such as ebook business models, innovative consortial opportunities, and intelligence gathering in the publishing and scholarly communications field, is also important to this process. the committee formally reports to the ocul-ir group at biannual meetings, where accomplishments and strategic directions are discussed. this framework has been very successful in enabling the ebooks committee to assess offers in a systematic manner, and to be guided by objectives that are critical to the consortium's strategic goals. it should also be pointed out that the committee can proactively initiate negotiations with an ebook vendor if there is sufficient interest percolating from the members (at least three institutions demonstrating such interest). this provides for greater flexibility.

the model ebook license and vendor template

one of the key objectives in negotiation is to pursue conformity with the ocul model ebook license (ocul, b). the ebook license, vetted by legal counsel, is predicated upon the standard database license agreement developed in to represent our technological, legal, and business interests in negotiating for e-content with vendors. the archiving of licensed content is pivotal to the value of crafting our own model license for ebooks (gillies & horava, ):

the local load provision is the heart of why ocul needed its own distinct model license agreement. moreover, as the future of various publisher–library preservation partnerships, such as lockss and portico, is far from certain and vendor assurances of perpetual access are viewed with some skepticism by librarians, a local solution still provides the most reliable, responsible, and utilitarian option for digital archiving. (p.
)

the license includes key provisions that affect ebook agreements, such as ownership and perpetual access, local loading rights, marc records, file format, delivery issues, and usage rights specific to ebooks such as interlibrary loans. there is a clause on copyright legislation to ensure that no statutory rights under canadian copyright law are eroded under a license agreement. this model license is monitored in relation to our rapidly changing environment and updated as appropriate. we are in the process of creating a separate local load license that would complement the standard ebook license; this will make the administration of license agreements more manageable, since entitlements to local loading can change over time as institutions decide to join existing licenses. moreover, local loading can apply to various content types, not only ebooks.

another important element has been the crafting of an ebook template for vendors (ocul, b), to be sent to vendors with whom the committee is negotiating an offer. this is a series of questions intended to capture all the relevant information needed to assess the offer. it also signals to the vendor that we have carefully considered our principles and requirements for a consortial ebook agreement, thus proactively articulating what we require and value from a vendor. the template is therefore a reflection of our practices and experience in negotiating ebook license agreements. the vendor is asked to complete this template, which is distributed to members with the offer itself. this is especially important for negotiating with a new vendor, i.e. one with whom the consortium does not have any previous history for ebook acquisition. it is generally not necessary for a vendor with whom we are negotiating a successive ebook agreement.

the template requires that the vendor's offer be in accordance with the ocul ebook model license. this is a critical issue for us, since this license is our common blueprint for shared principles and interests with regard to ebook licensing and management. as mentioned earlier, an ownership model with perpetual access and local hosting rights on scholars portal is a key priority for us. the template poses detailed questions around pricing, such as list price, consortial discounts, minimum purchase, multi-year pricing, deep discounting for print, and any ongoing fees. there are various questions around content, such as inclusions, exclusions, front list description, backlist coverage, types of works, and enhancements such as tables of contents and cover images. there are questions around access and accessibility, such as formats, authentication mechanisms, printing and downloading options, and checkout terms (where applicable).

the template has evolved in relation to our experience with vendors and our platform and access requirements. we now provide detailed specifications on marc records, such as the oclc standard and the need for a unique identifier mapping the ebook file to the marc record. with regard to accessibility (in the context of the ontarians with disabilities act), we ask questions about compliance with these evolving standards. as authentication systems such as shibboleth are providing alternatives to proxy server recognition, we ask for the vendor's technical capabilities in this area. as researchers are becoming more interested in large-scale computational analysis of scholarly data, we are asking the vendor whether they support data or text mining of their licensed ebook content. these issues are also reflected in the model ebook license.
the consortium has also developed a ‘local load toolkit’, a document that provides ocul librarians with an overview of the rationale behind local hosting and discusses specific criteria, issues, and talking points to address with publishers. this is significant because local hosting as an overarching philosophy for the consortium applies to various types of content other than ebooks (such as ejournals, primary source material, data sets, geospatial content, etc.) and many publishers are not familiar with the business implications and operational logistics involved in supporting local hosting of their licensed content. for some publishers, it takes a while to understand what we are doing and why it is important to us. on the question of local content management of ebooks, polanka ( ) observes:

some libraries negotiate with vendors to obtain ebook files and host them on local servers. this provides greater control, but requires technological expertise to develop the interface and load content. a vendor may also send content files directly to libraries for archival purposes while at the same time providing access through its interface. good intentions aside, the files are of little use to libraries without the servers, interface, and technological expertise to deliver content to users. (p. )

developing the in-house technological expertise and the infrastructure to host the content has been a critical priority for ocul. this has meant an ongoing review of user services, hardware capacity and software requirements, and staffing in relation to the consortium's strategic priorities.
as can be imagined, this represents a significant and sustained investment by the twenty-one members for our shared services and collections, today and into the future.

ebook accomplishments

the committee has enjoyed much success in negotiating ebook agreements with various publishers that support ownership and local hosting arrangements. the first was a comprehensive front list agreement with springer, and this set the standard for other agreements. these have included elsevier, oxford, taylor & francis, sage, ovid, wiley, spie, ieee, emerald, morgan & claypool, thieme, and brill. in some cases, this has not been comprehensive coverage of ebooks (based on what has been available). institutions can decide whether to participate in a given agreement; this is an opt-in model for licensing. there is much content from canadian publishers, as described further below.

the model ebook license has served as a foundation for negotiations. this has been a milestone in articulating and asserting our licensing interests to provide the broadest possible access rights, legal protections, and requirements for perpetual access and local hosting. the model license has been deployed for ebook agreements, sometimes with only minimal changes required by the publishers. this depends on the dynamics of negotiations. publishers have learned that we have core requirements if they wish to do business with us. as the scale of licensed content on scholars portal has grown, and our model has become widely known, publishers have become more familiar with our interests and requirements. the fact that there are robust security protocols, and that there have been no breaches such as systematic downloading of content, has been strong testimony to the viability of our approach.

there has been a national canadian licensing agreement (via crkn) for ebook backlist collections from oxford, cambridge, and taylor & francis that include the option of local hosting.
a total of titles were acquired in . these titles have been loaded onto scholars portal, and are accessible by the schools that have participated in this agreement with ingram. there has been a patron-driven acquisitions project in with ebrary, involving sixteen of the ocul institutions. this led to the purchase of scholarly titles for the participating schools, and provided valuable experience in understanding the strengths and limitations of a consortial pda project (davis, jin, neely, & rykse, ). these titles generally provided very good value for money, based on the tiered contribution model for financing the project. these titles have been loaded on the scholars portal platform.

also noteworthy is a pilot project in – with oxford for the upso (university press scholarship online) front list collection: these ebooks are automatically delivered to scholars portal upon publication, for local hosting. as a consequence, the digital copy becomes the default format for supporting research and teaching. there were thirteen institutions that participated in this project. we look forward to evaluating this new delivery model for monographs, particularly in humanities and social sciences, and moving forward with the second phase of this agreement in – .

university press ebooks, both canadian and american, are of great interest to us. this is considered to be core research content that is of broad value to our faculty and students. we have struck an agreement (ocul, a) with the association of canadian university presses and ebound canada to acquire a comprehensive collection of over canadian scholarly ebooks published from to , with perpetual access and local loading rights. scholars portal will be the exclusive point of access for this collection. this number is expected to grow to almost titles by . this is a ground-breaking agreement that will allow institutions to access the e-copy as soon as titles are published.
scholars portal is thereby becoming an integral player in the evolution of the scholarly communications ecosystem in canada. this will have important ramifications for the dissemination of scholarly canadian content, and the participating institutions will be assessing the use and integration of these monographs in their communities. as duplication is problematic for libraries, there will be impacts on acquisition practices around approval plans and firm ordering for print titles.

the consortial strategy for managing ebook licensing and acquisition in ontario has entailed an educational process in our relations with vendors. we have invested time and energy in sensitizing vendors to our core principles; we have communicated our priorities and objectives, in the unique context of the scholars portal infrastructure and its centrality for how we do business. this has involved a significant amount of information sharing, negotiating and decision-making. ownership and local hosting of ebooks are cornerstones of our philosophy and a demonstration of stewardship of our licensed scholarly resources. some publishers have readily accepted this model, while in other cases significant discussion was needed to convince key players and gain approval. many discussions on the implications and logistics of hosting ebook files and associated marc records have occurred. negotiating these matters has made us more assertive, more experienced, and more focused on our strategic goals.

moving the goalposts of vendor relations to adopt our frame of thinking has been an important theme during the past several years. the deployment of the model license and ebook template to assert our requirements and interests has been an important milestone in this regard.
this has mitigated risk for the consortium as a whole and for the individual members: we only agree to terms and conditions for ebook agreements that meet our requirements, and we have walked away from negotiations that haven't led to a positive outcome. an important spinoff has been the growth of expertise among the committee members in ebook issues in general and the dynamics of negotiation in particular.

what will be the attributes of the scholarly monograph, assuming it survives, in an information culture where so many competing alternatives for disseminating research exist? what will be its intellectual form and structure, what media will it involve, and how will it be used in different and novel ways to address research and learning needs? all players in the ecosystem are seeking a sustainable, forward-looking strategy that can transform the scholarly monograph into a viable form that aligns with the business models, technological options, and supply chain realities of the digital era. we feel that our strategy will safeguard our collective investments in long-form monograph scholarship and serve our patrons effectively into the future.

we are proud that in february scholars portal received certification as a trusted digital repository for journals, following an audit process with the center for research libraries, and thereby became the first canadian library organization to receive this distinction. it is hoped that this certification will eventually extend to ebooks as well. this milestone demonstrates ocul's commitment “to the long-term preservation of scholarly resources for the benefit of future students and researchers” (ocul, b). stern ( b) observes that “preservation of digital and born-digital materials is still a topic of debate. consortia are the logical groups to explore these details and to develop best practices to ensure safe storage, functionality, and access”.
preservation of knowledge and information resources for future generations has been a core value for librarianship, but the ground has shifted dramatically as we leave the print era behind: the quantity and range of scholarly works in digital formats have increased exponentially in recent years, and preservation planning today requires a new paradigm of thinking. we have squarely addressed the challenge of long-term preservation for our community, while building a large-scale digital library of books to be accessed and widely used on a daily basis.

conclusion

based on a shared vision and technological infrastructure, the ocul consortium has developed a comprehensive and ambitious strategy regarding ebook acquisition. ownership, preservation, and integration on our own platform are key goals that have inspired us. our strategy is one that few consortia can afford to adopt, as it requires a long-term commitment to a shared technological infrastructure, significant staff support, and the ability to share costs in a sustainable manner. we now have a critical mass of ebook content. we have experience in negotiating numerous agreements with publishers, and in the logistics of loading and archiving ebooks and associated metadata. we have learned many lessons along the way and we have developed a collaborative ethos that has served us well. while there are always challenges in working with publishers, and working within a consortial framework, we now have a large measure of control over our collective investment in scholarly ebooks, today and into the future.
as the market for monographs rapidly evolves, we expect there will be more opportunities for our ebooks committee and for scholars portal, whether it be in new ‘business’ models, new content, new partnerships or developing innovative roles in the scholarly communication supply chain. it is an exciting time to be engaged in the messiness of ebooks.

references

alberico, s. ( ). academic libraries in transition. new directions for higher education, , – .
allen, b., & hirshon, a. ( ). hanging together to avoid hanging separately: opportunities for academic libraries and consortia. information technology and libraries, ( ), – .
association of college and research libraries ( ). top ten trends in academic libraries: a review of the trends and issues affecting academic libraries in higher education. college & research libraries news, ( ), – .
canadian library association ( ). ontario council of university libraries (ocul) awarded cacul innovation achievement award. http://www.cla.ca/am/template.cfm?section=news &contentid= &template=/cm/contentdisplay.cfm (retrieved march , ).
davis, k., jin, l., neely, c., & rykse, h. ( ). shared patron-driven acquisition within a consortium: the ocul pda pilot. serials review, ( ), – .
gillies, s., & horava, t. ( ). developing a model license: a canadian consortium's experience. in w. jones (ed.), e-journals access and management (pp. – ). new york: routledge.
hazen, d. ( ). lost in the cloud: research library collections and community in the digital age. library resources & technical services, ( ), – .
machovec, g. ( ). from your managing editor: twelfth annual reader's choice awards. the charleston advisor, ( ), – .
maskell, c. ( ). consortia: anti-competitive or in the public good? library hi-tech, ( ), .
ontario council of university libraries ( a). collaborate, innovate, deliver. http://www.ocul.on.ca (retrieved march , ).
ontario council of university libraries ( b). ocul model licenses. http://www.ocul.on.ca/node/ (retrieved march , ).
ontario council of university libraries ( a). ocul, acup/apuc and ebound partner to promote canadian ebook scholarship. http://www.ocul.on.ca/node/ (retrieved march , ).
ontario council of university libraries ( b). ocul's scholars portal - canada's first certified trustworthy digital repository. http://www.ocul.on.ca/node/ (retrieved march , ).
ontario council of university libraries. about. http://www.ocul.on.ca/about (retrieved march , ).
polanka, s. ( ). purchasing ebooks in libraries: a maze of opportunities and challenges. library technology reports, ( ), – .
revelle, a., messner, k., shrimplin, a., & hurst, s. ( ). book lovers, technophiles, pragmatists, and printers: the social and demographic structure of user attitudes toward ebooks. college and research libraries, ( ), – .
scholars portal ( a). about. http://spotdocs.scholarsportal.info/display/sp/about (retrieved march , ).
scholars portal ( b). ocul overview. http://spotdocs.scholarsportal.info/display/sp/ocul+overview (retrieved march , ).
scigliano, m. ( ). consortium purchases: case study for a cost–benefit analysis. the journal of academic librarianship, ( ), – .
stern, d. ( a). ebooks: from institutional to consortial considerations. online, ( ), – .
stern, d. ( b). library use of ebooks edition. new york: primary research group.
zeoli, m. ( ). how do you eat an elephant? or econtent and the future of the academic book vendor. against the grain, ( ), – .
making connections while earning an mlis

by: sarah loor and michael a. crumpton

loor, s. & crumpton, m.a. ( ). making connections while earning an mlis. the bottom line, ( ). http://dx.doi.org/ . /bl- - -

***this version ©emerald group publishing limited. this is not the final version. this article has been accepted for publication in the bottom line, published by emerald group publishing limited. reprinted with permission. figures and/or pictures may be missing from this version of the document. the version of record is available at http://dx.doi.org/ . /bl- - - .***

abstract:

purpose: this article discusses a collaboration with a non-profit organization conducted as part of the real learning connections project at the university of north carolina at greensboro. we discuss our experiences working with a non-profit partner from outside the university and the benefits gained from collaboration.

design/methodology/approach: this is a reflection based on the personal experiences of the authors as collaborators in the project.

findings: through our experience with the real learning connections project, we found that collaborating with a non-profit organization provides a unique opportunity for library school students to learn practical skills while also providing value to the non-profit organization in the form of expertise in information services.

originality/value: this piece discusses the benefits of collaboration from the perspective of both an lis student and a professional librarian, as well as considering the experiences of an external non-profit organization.
keywords: real learning connections | collaboration

article:

schools of library and information studies typically offer students the opportunity to apply theory learned in the classroom to practical learning experiences such as practicum assignments, internships, service learning, or independent study opportunities. being able to learn practical skills while earning the graduate degree can prove very valuable (coleman, ). the quality of such opportunities varies considerably as the profession debates the perceived value of practice versus theory (ball, ).

at the same time, libraries are working beyond their walls to provide information to connect to other non-profit organizations (salinas & chabrán, / ). libraries seek opportunities to develop these relationships and provide influence on a broader scale. in this article we describe the activities of combining an internship for an lis graduate student with the outreach efforts of the university library. this experience not only enhanced the perceived value of the mlis degree earned, but also provided the library an opportunity for a long-term collaborative relationship with another non-profit.

introduction

the term “experiential learning assignment” includes, but is not limited to, internships, practica, and service learning. experiential learning opportunities may be paid or unpaid, academic or nonacademic, university-sponsored, or independent and may take place outside the university environment. thus the real learning connections model was created to explore an in-depth opportunity for learning. at the interstices of the model are the areas of concern that will be affected by the interaction between the participants. academic content is both what the practitioners learned in their own programs of study and the content of the curriculum (bird & crumpton, ).
practical work experience is the nature of the tasks completed. current theory and research are both the work of the faculty member involved and a connection to the literature. for years the university libraries and the department of library and information studies at the university of north carolina at greensboro have collaborated on a learning partnership called real learning connections. this is a unique venture that triangulates the work of a faculty member, a practitioner and a student, the goal being to harness the learning power that could be shared among these three in a collaborative way. experiential learning opportunities for students in the department of library and information studies are work experiences that allow them to sample professional environments in which they might seek a career or which might give them experience to help prepare them for a career or enhance their current career. the concept of this program has involved the convergence of theory and practice; using a lis graduate student as a conduit, projects have been undertaken that create learning objectives for not only the student, but the librarian(s) and lis faculty member(s) involved for each individual project. the administrators of the real learning connections have published* and presented on the different projects completed and what was learned from all parties involved. each year new variables are added to the projects’ execution and the overall research goal is to create a sustainable internship model that fosters professional growth for students, organizational learning for practicing librarians and curriculum assessment and modification for the lis department. one of the more recent projects has been to provide outreach to a non-profit organization in the form of professional expertise with regard to resource allocation and accessibility. 
this project provided tangible support for a non-profit enterprise, sought to broaden research on the subject of domestic violence in order to help those affected, achieved learning objectives for a second year lis student with a related background in the subject area and established collaborative teams with both the library and lis department organizations for working beyond the walls of campus. from this project the real learning connections program is hoping to define a model for sharing expertise and learning opportunities at an organizational level with community groups or non-profits. this model can help produce foundational support, using professional library science expertise for other organizations across a variety of needs. this takes organizational commitment as projected needs could cross departmental lines and subject or discipline knowledge. intern background when sarah first came on board to real learning connections, she was excited about the opportunity to work on a project that would incorporate her pre-library school experience. like many of her fellow students, she entered the mlis program after spending some time in another field. in her case, she was an attorney who had worked in the non-profit immigrant advocacy field. one area that she specifically focused on during her internships and her clinical experience in law school was advocacy for immigrant survivors of domestic violence. there are immigration remedies that exist for people who have experienced domestic violence and other crimes. these remedies exist in order to encourage the reporting of such crimes, whose victims might otherwise be too afraid to advocate for themselves. as an intern and law clinic student, sarah worked with people who had survived intimate partner violence, sexual assault, and human trafficking to help them move toward building a life free of abuse. 
she enjoyed many things about this work, but she was especially driven by the goal of working as part of a team to help people gain access to the resources they needed to reach their goals of a better life. in many ways, deciding to pursue an mlis degree at the university of north carolina at greensboro was a logical progression of the career path sarah had started in law. she wanted to focus on helping connect people with the resources they needed to accomplish their goals. when she started the program, she saw the potential for the degree to be used to help effect positive social change, but wasn’t sure where her specific path would lead. when the opportunity emerged to work with the national resource center on domestic violence (nrcdv) through the real learning connections project, sarah saw that the past experience she had worked to gain in the legal field was coming together with the things she had been learning during her library school courses to help create a unique learning experience. vawnet the national online resource center on violence against women (vawnet) is a project of the national resource center on domestic violence (nrcdv) and has a comprehensive and easily accessible online collection of full-text, searchable materials and resources on domestic violence, sexual violence and related issues. sarah’s main objective during the course of the school year spent working on the real learning connections project was to learn more about digital scholarship. one of the courses she had taken the summer before beginning the real learning connections project focused on this topic and this project was a chance to put what she had learned in the class into practice. much of sarah’s work on this project focused on finding new digital content to submit to the nrcdv for inclusion in vawnet’s database of resources. she was free to look for any resources as long as they were high-quality, open-access, and related in some way to the topic of violence against women. 
some of the content she found was what would more traditionally be considered scholarly publication. since one target audience for vawnet is lawyers and legal support staff, sarah’s legal background was an asset that helped her find relevant law review articles. law journals are often available to anyone on an open-access basis, with many hosted on the digital commons network. even for the many users of vawnet who are not legal professionals, law review articles can be a valuable resource, as they often discuss cutting edge issues in-depth. vawnet’s manager had also specified that new content from state and local level anti-violence organizations would be welcome additions to the site. sarah also looked for content from national level organizations. searching for this type of content was different than the type of research we typically do in academia. there was no centralized database to use to begin searching. instead, sarah had to rely on using her searching skills to find websites of relevant organizations and investigate what each organization offered in terms of resources. finding one good resource would often lead to others. some of the resources found fell into the realm of more traditional articles. other resources included things like tipsheets, videos, and webinars. working on this project reinforced the idea that in the digital age, many different people and organizations can be content creators and that much high-quality content is available for free. during the project, sarah also learned more about the technical aspects of digital scholarship. she shadowed vawnet’s manager to learn how she adds new content to the site and studied html via free lessons available online through codecademy. she explored the idea of incorporating filtered searching into vawnet’s search structure so that users could narrow and broaden their searches as they went along without having to create multiple new searches.
searches would become more efficient and users would be better able to connect with the precise content they needed from the huge amount of content available on vawnet. finally, sarah performed analysis of search terms people use in google to find vawnet in order to help the nrcdv determine whether any changes could be made to the proposed structure of their new website so that people could more easily find resources on frequently searched topics. sarah also learned more about the possibilities available for a website to utilize digital scholarship by researching potential next steps for vawnet and presenting them to the vawnet team. one potential way for a website like vawnet to efficiently provide access to a broad variety of content would be linking to institutional repositories. the site could create a page with a listing of scholars who write about issues relevant to that site’s topic, a brief description of each scholar’s general research topics, and a link to each scholar’s page at her university’s institutional repository. while work deposited in institutional repositories may be an author’s personal copy of the text and may not have the formatting that is applied to the published version in a journal, linking to such work would allow a site like vawnet to provide free access to content that might otherwise be found only as a paid resource. another option we explored was the idea of a site like vawnet linking to statewide library websites, which allow residents of specific states to access resources such as article databases and ebooks. for example, the statewide library site in north carolina is called nc live. access is available via credentials provided by public libraries throughout the state, and people who have an affiliation with a north carolina community college, college, or university can access nc live using credentials from their institution as well. a site like vawnet could consider having state-specific resource pages. 
a user could select her own state and be taken to a page that would link to that state’s online library. other state-specific resources could be provided here as well. in the case of vawnet, resources could include a link to that state’s domestic violence coalition, links to other organizations within that state that produce resource materials on violence against women, and links to state statutes on domestic violence. organizational gains as stated earlier, part of the objectives for real learning connection projects is to provide multiple learning opportunities for those involved. this project scope was intended to provide expertise for vawnet as it relates to the organization and classification of the information they were making available for researchers. this starts by identifying resources that are freely available, such as institutional repositories, but that must be sought out and discovered in order to be added or linked into vawnet’s purview. subsequent options become available as mentioned, such as state consortium sites or targeted subscription services for subject specific needs. vawnet’s gain in this project was to become informed on information seeking and retrieval techniques such as search schemas, tagging of keywords or concepts, and metadata techniques for content optimization. part of this project’s outcome was to make the case advocating the need for librarian expertise and training within vawnet’s organizational structure. the role of the university libraries and the department of library and information studies at unc at greensboro was to operationalize the concept of non-profit collaboration by sharing expertise, in this case through sarah as a graduate student. for the libraries, this project enables librarians to become embedded in an organizational service arena with another non-profit.
the library school could evaluate how working across environments could impact howard gardner’s theory of multiple intelligences to improve the curriculum, as described by cargo ( ). cargo concluded that libraries can provide the means by which other disciplines can develop multiple access points of understanding and knowledge. the faculty member involved in this project already had connections with vawnet and was able to help coordinate the roles of the intern and the libraries in order to produce a tangible outcome for both. in addition, the department of library and information studies was able to gain a new experience that could be shared with other faculty members to broaden this type of project or approach when working with students on their practicums and internships. this further can influence the curriculum by providing curriculum development strategies beyond collective faculties' expertise (bird & crumpton, ).
our journey on learning
the real learning connections program also gave sarah the experience of working to help a real-world organization with its information needs. while she had been assigned other projects as part of her coursework that attempted to simulate real-world projects as much as possible, there is really no substitute for actual first-hand experience. the project showed the need for flexibility when goals change during the course of the project and the importance of following the client’s lead. in an article about a partnership between a health sciences library and a community organization dedicated to services for parents of children with autism, alfasso ( ) discusses some of the logistical challenges that can arise during the course of a partnership. alfasso writes about how waiting for a county library budget to be passed affected the ability to schedule meeting rooms for the partnership project (p. ). similarly, we faced some delays in getting started while vawnet was waiting for a contract to be finalized.
since the project only ran for one academic year, this delay did affect the amount of services we were able to provide, but we worked to find other things we could help with in the meantime that didn’t require contract finalization, such as sarah locating new content for vawnet. alfasso also discusses how “scheduling meeting times when all parties could be present” was a challenge for the partnership with the autism organization (p. ). scheduling could also be a challenge during the real learning connections project, since we had so many collaborators and since we were geographically distant (the unc at greensboro collaborators were in north carolina and the vawnet collaborators in pennsylvania). we were able to use technology tools such as webex, blackboard collaborate, and jira project tracking software to help us bridge the distance. salinas and chabrán ( / ), reflecting on past library partnerships for increasing access to technology for latino communities, write, “we were mindful that we were often perceived as agents of a large university. part of the partnership development was articulating a clear understanding of what our goals were, what we could do, and what we could not do” (p. ). at times we similarly needed to convey that certain library resources (for example, access to electronic resources licensed by the university) could not be included within the scope of the project, while always making sure to communicate what the services were that we could offer. we would highly recommend that other lis students seek out similar collaborative experiences to help put the skills they are learning in their programs into practice and to help information seekers achieve their goals. collaborations like this one between library schools and non-profit organizations are mutually beneficial to both parties. 
students gain valuable experience and non-profits, which often run on tight budgets, gain access to librarian and library student expertise in information organization. collaborations also offer students a chance to make a more personal connection with librarians, lis faculty, and non-profit professionals. since sarah completed her degree as a distance education student, this was especially important in her case. michael served as a mentor during the course of this project to guide sarah in the development of these new skills. at times when work on the main project with vawnet was a bit slower on our end while waiting for direction from the nrcdv, he was able to direct her to related projects to keep building her experience with digital scholarship. for example, she had the opportunity to edit articles for the journal of learning spaces, an open-access digital journal hosted by the university, which helped her see how the production of digital scholarship happens behind the scenes. the experience of working with vawnet and the real learning connections project has helped sarah immensely with the transition to professional employment after graduation. she will be working with a legal organization to help organize online information, so the vawnet project was instrumental in helping her to be prepared for the job. when she first started library school, she had solid experience in the legal world but needed the formal education provided by the mlis program to help transition to putting that experience to work in the information services field. the project helped sarah envision the ways that the skills gained in an mlis program could be leveraged in ways other than traditional work in a library and gave her the confidence to pursue a nontraditional position. while all of the coursework in the program was valuable, she considers the real learning connections project to be the key that has truly prepared her the most for work as an information professional. 
working on a project that extended beyond the bounds of the library also allowed us to fulfill goals of both the lis department and the university library. the unc at greensboro lis program’s mission is to “connect people, libraries, and information through research, teaching and service to enrich living and working in a global environment” (the university of north carolina at greensboro school of education, ). through the real learning connections project, we forged connections among our department, our university library, and a non-profit organization in order to provide enhanced service to the diverse community of users that relies on vawnet for information. the unc at greensboro university libraries mission statement reminds us that “through expertise in information services, the university libraries foster the success and impact of the unc greensboro community by promoting learning, inspiring creativity and enhancing research and collaboration in a diverse and innovative environment” (the university of north carolina at greensboro walter clinton jackson library, ). this project allowed those of us at the university to collaborate successfully with a nontraditional partner in a way that can serve as a model for future collaboration between libraries and community organizations. describing their partnership experience, salinas and chabrán ( / ) write, “working beyond the library walls has been both a privilege and an honor. the experience transformed us in many ways. we have come to appreciate that good partnerships are not born but nurtured” (p. ). our experience working with vawnet has similarly taught us how to work in partnership with a community organization and we hope that future partnerships with other organizations will continue to thrive at our institution and others.
references
alfasso, a. ( ). information literacy instruction for community members: an academic partnership with a community nonprofit organization.
journal of consumer health on the internet, ( ), - . http://dx.doi.org/ . / . .
ball, m.a. ( ). practicums and service learning in lis education. journal of education for library and information science, ( ), - .
bird, n. & crumpton, m.a. ( ). real learning connections: questioning the learner in the lis internship. journal of education for library and information science (jelis), ( ), - .
cargo, r. ( ). made for each other: nonprofit management education, online technology, and libraries. journal of academic librarianship, ( ), - .
coleman, j.g., jr. ( ). the role of the practicum in library schools. journal of education for library and information science, ( ), - .
salinas, r. & chabrán, r. ( / ). preparing ethnic non-profits for the st century. in w. miller & r.m. pellen (eds.), libraries beyond their institutions: partnerships that work (pp. - ). binghamton, ny: haworth press. doi: . /j v n _
the university of north carolina at greensboro school of education. ( ). library and information studies (lis). retrieved from http://soe.uncg.edu/academics/departments/lis/
the university of north carolina at greensboro walter clinton jackson library. ( ). mission statement, goals & customer service values. retrieved from https://library.uncg.edu/info/mission_statement.aspx

language and thought in hildegard of bingen's visionary trilogy: close and distant readings of a thinker's development
jeroen de gussem, dinah wouters
parergon, volume , number , , pp. - (article)
published by australian and new zealand association of medieval and early modern studies (inc.)
doi: https://doi.org/ . /pgn. .
for additional information about this article: https://muse.jhu.edu/article/
[ this content has been declared free to read by the publisher during the covid- pandemic. ]

jeroen de gussem and dinah wouters*

by combining the methods of distant reading (computational stylistics) and close reading, the authors discuss the development of language and thought in hildegard of bingen’s visionary works (sciuias, liber uite meritorum and liber diuinorum operum). the visionary trilogy, although written over the course of three decades, raises the impression of a monolithic and seemingly unchanging voice. moving beyond this impression, the interdisciplinary analysis presented here reveals that the trilogy exhibits interesting differences at the word level which cannot simply be explained through external historical circumstances (e.g. manuscript transmission or different secretaries). instead, the results raise pertinent questions regarding the trilogy’s internal development in didactic method, style, and philosophy.

* this article is the result of a close collaboration between the henri pirenne institute for medieval studies at ghent university, the clips computational linguistics group at the university of antwerp, and the centre traditio litterarum occidentalium division for computer-assisted research into latin language and literature housed in the corpus christianorum library and knowledge centre of brepols publishers in turnhout (belgium). brepols publishers has generously provided the digitized text files of the editions hildegardis bingensis, sciuias, ed. by adelgundis führkötter and angela carlevaris, corpus christianorum continuatio mediaeualis (hereafter cited as cc cm), (turnhout: brepols, ); liber uite meritorum, ed. by angela carlevaris, cc cm, (turnhout: brepols, ); and liber diuinorum operum, ed. by albert derolez and peter dronke, cc cm, (turnhout: brepols, ), which are used in the experiments described in this article. for brepols’s online library of latin texts, see <www.brepolis.net>. the authors would like to thank their colleagues from the latin curriculum at ghent university for their advice and helpful comments to improve this article. this research has been realized thanks to financial support from the special research fund at ghent university and the research foundation flanders (fwo).

i. the authorship of hildegard of bingen

hildegard of bingen’s ( – ) major visionary works sciuias (c. – ), liber uite meritorum (c. – ) and liber diuinorum operum (finished c. ) have been said to form an ‘organic unity’, a tripartite magnum opus bound by strong conceptual, generic, and formal ties. this is emphasized by the structure the treatises share: each of them follows a similar pattern where series of visions are complemented by allegorical explanations. the formal similarities of the visionary works invoke an unchanging, biblical, factual style that strives to align form and meaning, and it is often difficult to pin down where such a carefully aligned construction shows deviations. hildegard clearly intended these works to be sequels. this is perhaps best exemplified by how she placed them jointly at the heading of her allegedly self-redacted opera omnia, the monumental wiesbaden riesencodex.

note: beverly m. kienzle and travis a. stevens, ‘intertextuality in hildegard’s works: ezekiel and the claim to prophetic authority’, in a companion to hildegard of bingen, ed. by beverly m. kienzle and debra l. stoudt (leiden: brill, ), p. .
note: a handful of studies that have approached hildegard’s three major works rather holistically are (from more recent to older): maura zátonyi, vidi et intellexi: die schrifthermeneutik in der visionstrilogie hildegards von bingen (münster: aschendorff, ); viki ranff, wege zu wissen und weisheit: eine verborgene philosophie bei hildegard von bingen (stuttgart-bad cannstatt: frommann-holzboog, ); fabio chávez alvarez, ‘die brennende vernunft’: studien zur semantik der ‘rationalitas’ bei hildegard von bingen (stuttgart-bad cannstatt: frommann-holzboog, ); hans liebeschütz, das allegorische weltbild der heiligen hildegard von bingen (leipzig: teubner, ).
note: michael embach, die schriften hildegards von bingen. studien zu ihrer Überlieferung und rezeption im mittelalter und in der frühen neuzeit, in erudiri sapientia, (berlin: akademie verlag, ), pp. – . the manuscript can nowadays be consulted in the hessische landesbibliothek in wiesbaden and is a nearly complete collection of hildegard’s oeuvre. the bulk of the codex must have been finished during her lifetime, before , and most certainly is a product of the rupertsberg scriptorium, the monastery where hildegard resided from onwards.

simultaneously, scholars have been aware that adopting a unified approach to the trilogy carries some risks. anne h. king-lenzmeier, for instance, notes that ‘it has been customary to combine [hildegard’s] three theological-visionary treatises into one chapter, facilitating analysis of the three of them in combination’, but also that such approaches have sometimes come at the cost of ‘the evolution in her thought’. constant mews also noted that ‘insufficient account has been taken of the evolution of her thought between the writing of sciuias and [...] her last great composition, the book of divine works ( – )’. although a few thematic developments in hildegard’s visionary oeuvre have been noted, discussing the differences between the prophet’s works, especially when it comes to formulation and phraseology, has been difficult to detach from the prophetess’s authorship and the influence of her collaborators. it is well known that hildegard collaborated with assistants in composing her works, monks such as her provost volmar of disibodenberg or guibert of gembloux, but possibly also female scribes, amongst them richardis of stade. hildegard presented her collaboration with secretaries as necessary to her readership due to her limited schooling. although these ‘autobiographical’ assertions in which hildegard stressed her weaker sex (‘paupercula forma’) and lack of education (‘indocta’) likely reflect a self-conscious construction of a persona, it is indeed careless to express full confidence that the thought found in hildegard’s texts unanimously coincides with that of the author. then again, hildegard stressed that her secretaries were not to change the sense of her visions and were to focus on formal aspects of the language such as grammar and spelling. these teams were subject to change throughout her life (as is emphasized in figure ).

note: anne h. king-lenzmeier, hildegard of bingen: an integrated vision (collegeville: the liturgical press ), p. .
note: constant mews, ‘from scivias to the liber divinorum operum: hildegard’s apocalyptic imagination and the call to reform’, journal of religious history, ( ), – (p. ).
note: despite the fact that hildegard’s last work, the liber diuinorum operum, might read as a reworking of her first, the sciuias, the differences in thematic choices are undeniably present in the shift of focus from sacramental theology and the role of the church in salvation history to creation and the natural world. secondly, the differences between these respective works’ literary form have been more and more re-evaluated, with specific attention for her usage of allegorical forms and exegetical principles. after the personification narratives of liber uite meritorum, hildegard again harks back in liber diuinorum operum to the kind of allegory that she also used in sciuias: non-narrative, abstract and difficult, perhaps even more abstract and difficult than before. at the same time, allegory is playing a smaller role, because the exegetical character of the work is made more explicit. in the second vision of liber diuinorum operum, for instance, the allegory receives several interpretations at different levels of meaning (pertaining to the cosmos, to the divine, to the human body, and to the soul), while sciuias only ever offered one interpretation. furthermore, biblical interpretation no longer serves an ancillary role in liber diuinorum operum; whole commentaries are inserted into the visions. bernard mcginn argued that in sciuias and liber uite meritorum ‘the bible appears in a supporting role, that is, as providing selected and we might say “atomized” proof-texts to back up points revealed in the visiones mysticae’. in the liber diuinorum operum, however, ‘rather than using the bible in a piecemeal fashion and restricting herself to short passages, hildegard now began to exegete lengthy and traditionally difficult sections of scripture and to integrate them in a structural way into her visionary narratives’. see bernard mcginn, ‘hildegard of bingen as visionary and exegete’, in hildegard von bingen in ihrem historischen umfeld. internationaler kongress zum jährigen jubiläum, ed. by alfred haverkamp (bingen am rhein: verlag philipp von zabern, ), pp. – (p. ).
note: see especially the tripartite article of hildephonse herwegen, ‘les collaborateurs de sainte hildegarde’, revue bénédictine, . ( ), – ; – ; – .
note: morgan powell, ‘vox ex negativo. hildegard of bingen, rupert of deutz and authorial identity in the twelfth century’, in unverwechselbarkeit. persönliche identität und identifikation in der vormodernen gesellschaft, ed. by peter von moos, norm und struktur, (cologne: böhlau, ), pp. – .
note: joan ferrante, ‘scribe quae vides et audis: hildegard, her language, and her secretaries’, in the tongue of the fathers: gender and ideology in twelfth-century latin, ed. by david townsend and andrew taylor, the middle ages series (philadelphia: university of pennsylvania press, ), pp. – (p. ).
note: for instance, richardis of stade’s departure from disibodenberg (c. ) and volmar of disibodenberg’s death (c. ) proved to have sensitive repercussions in hildegard’s life. richardis left hildegard’s side shortly after the completion of sciuias: vita sanctae hildegardis, ed. by monika klaes, cc cm, (turnhout: brepols, ) . . – , p. : ‘nam, cum librum sciuias scriberem, quandam nobilem puellam, supradicte marchionisse filiam, in plena karitate habebam, sicut paulus thimotheum; que in diligenti amicitia in omnibus his se michi coniunxerat et in passionibus meis michi condoluit, donec ipsum librum compleui’. the epilogue to the liber diuinorum operum likewise testifies to the fact that volmar’s decease meant a substantial changeover in her assistance, as ludwig of st eucharius and her cousin wescelin of st andrew came to help her in the final throes of finishing the last part of her trilogy.

each of hildegard’s treatises must have presented a very different undertaking from the outset. the three visionary works were produced over a considerable time span of around thirty-three years (from until ). in this stretch of time, hildegard changed her working environment from disibodenberg to the rupertsberg. the collaborators around her who assisted her in her writing of latin and understanding of scripture came and went. these material changes and relocations introduced her to new sources of knowledge: the libraries’ contents differed, she allegedly travelled to preach at other locations such as cologne or trier c. – , the liturgical ceremonies changed in nature, and so on. by the time hildegard wrote the first sentence of the liber diuinorum operum, virtually no condition would have been the same as when she first sat down to write sciuias.

note: according to the chronicler of the annales sancti disibodi, disibodenberg housed a considerable number of books on the flourishing liberal arts. likewise, johannes trithemius, historian and abbot of sponheim, claimed that disibodenberg was active in book production in the twelfth century. see constant mews, ‘hildegard and the schools’, in hildegard of bingen: the context of her thought and art, ed. by charles burnett and peter dronke (london: warburg institute, ), pp. – (pp. – ). rupertsberg would have been a new convent, possibly without any books at all in the early beginning. the convent might have exchanged books with disibodenberg, but it likely provided a different and especially more modest supply of books.
note: these have often been named ‘preaching tours’, of which there were allegedly four. however, the idea that hildegard made four ‘discrete’ preaching tours is a conjectured itinerary, not a fact. see beverley mayne kienzle, hildegard of bingen and her gospel homilies: speaking new mysteries (turnhout: brepols, ), p. ; and franz j. felten, ‘hildegard von bingen – oder: was bringen jubiläen für die wissenschaft?’, deutsches archiv für erforschung des mittelalters, . ( ), – (see pp. , – ).
note: presumably hildegard had a far more upfront participation in the liturgical ceremony in the rupertsberg, where the ceremony would have taken place in the company of women only (excluding the priest).

the synergetic character of hildegard’s compositions could lead one to believe that observable lexical differences (which we will come to discuss soon), especially those that are hard to explain, are merely accidental imports by secretaries, and not a result of conscious writing strategies. in this rationale, the linguistic alterations within the trilogy are mere idiosyncrasies to the text. they are unintentionally slipped in by someone other than an original author. although this observation is vital, it follows a logic that relies on some presumptions that are difficult to defend. the distinction between ‘collaborative’ and ‘individual’, for instance, is not a constructive approach to give meaning to hildegard’s texts. after all, collaborative writing is a common phenomenon from roman times to the twelfth century and beyond, both for the learned and the unlearned, for male and female authorship. there are hundreds and thousands of art pieces in history that are collectively established and still the product of a singular vision. in that light, it would be an underestimation of a writer so versatile as hildegard to presume that any differences between her treatises are aberrations.

note: here, the insights of the new philology of the s have been indispensable. see bernard cerquiglini, in praise of the variant: a critical history of philology, trans. by betsy wing, parallax: re-visions of culture and society, nd edn ( ; repr. baltimore: johns hopkins university press, ); and stephen g. nichols, ‘introduction: philology in a manuscript culture’, speculum, . ( ), – .
note: see elizabeth j. bryan, collaborative meaning in medieval scribal culture: the otho laȝamon, editorial theory and literary criticism (ann arbor: university of michigan press, ), pp. – .

the article’s structure is as follows: we depart from a machine-driven ‘distant reading’ of hildegard’s trilogy in order to speak about the strategies of these texts (to be explained in part ii). the aim is to lay bare a trail of statistically detectable patterns of difference that reveal a great deal about how hildegard (which will always mean: hildegard and her team) subtly yet structurally developed her style of writing.
hildegard’s preferences for certain words in each of the trilogy’s works is understood as an indicative direction from which interpretation can proceed. the act of qualitative interpretation that follows, then, means to bridge the gap between hildegard’s language and her thought through a close reading, thereby taking into account that language and thought in development is a much more complex given than merely quantitative difference. throughout, we have decided to maintain an emphasis on what essentially constitutes the text and what are its formal characteristics. we do not wish to place emphasis here on historically contextualizing the origin of these particular textual differences, that is, which historical actors or events caused the textual changes to occur. rather, we focus on exploring the ultimate effect of this variety as a textual strategy. ii. distant reading a major critique voiced within the humanities’ computational turn has been the intersubjective aspect of ‘traditional’ philology. too often—these critics argue— traditional philology’s interpretations are supported by anecdotal arguments and so-called cherry-picking. for example, when it comes to the subject at hand, discussing what constitutes the difference between one text and another is susceptible to bias on behalf of the researcher and often presents too broad a question to be answered objectively. the reader runs the risk of choosing passages or words that suit his or her argument best. practitioners of computational methods, such as computational stylistics (stylometry), digital palaeography, or digital stemmatology, have often strengthened their position by stating that, in such debates, they can provide the common ground. in addition, their methods are presented as more powerful and mathematically more exact than the human mind in a reading task of high complexity. from a mathematical perspective, measuring difference is easy and requires at most a few seconds. 
‘distant reading’, first coined by moretti, has been a popular term for conceptualizing this statistical approach to textual material. obviously, there has been a good deal of scepticism raised against such claims. david m. berry, ‘the computational turn: thinking about the digital humanities’, culture machine, ( ), – . a collection of moretti’s most seminal papers, which arguably form the manifestos of ʻdistant reading’, can now be found in franco moretti, distant reading (london: verso, ); but moretti introduced the approach as early as , in franco moretti, ‘conjectures on world literature’, new left review, ( ), – . computational methods, as well, are never unsupervised: they are steered in a direction. in addition, computational methods either seem to raise more questions than they answer, or confirm what we already knew, both of which make qualitative readings indispensable and place into question the computational turn’s claims to objectivity. for a critical stance parergon . ( ) jeroen de gussem and dinah wouters initially, the concept of distant reading can strike one as rather abstract. figure provides an exemplary graph, containing a network of words showing the highest variation in frequency in hildegard’s trilogy, that serves as an intuitive illustration of its rationale. before we proceed to explain the network graph in figure , a few words on the texts and the way in which they were preprocessed are in order. the analyses, carried out both here and below, relied on the digitized texts of hildegard of bingen’s sciuias, liber uite meritorum and liber diuinorum operum as they appear in the scholarly editions included in the online brepols library of latin texts. since we have contemporary manuscripts by hildegard, the critical editions have always based themselves quite transparently on the oldest available versions (importantly, all of which stem from the final years of her life). 
manuscript transmission and its reliability is, of course, an inherent problem, and equally so the reliability of the editions. on this basis, one might conclude that the entire problem of hildegard’s variety is rendered irrelevant or unreliable as the underlying data are flawed. however, this is not a constructive mindset in order to say anything meaningful about medieval history at all. next, the editions’ texts have been slightly amended to make them suitable for comparison, a procedure which is called ʻpreprocessing’ in the field of computational stylistics. preprocessing entails minor interventions in the text such as the deletion of irrelevant textual material, the normalization of divergent orthographical forms, and the lemmatization of texts with annotation software. towards the confrontation of close and distant reading in medieval studies, see julie orlemanski, ‘scales of reading’, exemplaria, . – ( ), – . the digitized text files of these editions have been generously provided by our project partner brepols publishers (see n. * above). see n. * above for the editions. aside from the advantage that editions are already digitized, and therefore save us the time of transcribing all manuscript variants, the first and foremost advantage is that they are recent and critical publications which were carefully scrutinized by experts in the field. we have carefully studied the variety of the existent branches in the critical apparatuses, and we argue that—based on the available evidence—editorial principles or manuscript variety will not have caused hildegard’s semantics to change or her lexical variety to drop as significantly as it does in the data that we present. nevertheless, we will take care that at those instances where troublesome distortion by editorial principles should be located, they are presented as openly as possible. 
the lengthy capitula and other chapter titles were excluded, because their succinct and formulaic character is not representative of hildegard’s writing style.

the divergent orthographical conventions between the editions were normalized towards a classicized norm as maintained in charlton t. lewis and charles short, a latin dictionary (oxford: clarendon press, ) (e.g. <e> to <ae>, <v> to <u> or vice versa, tanquam to tamquam, inicium to initium) by lemmatizing the texts with annotation software, so that each of the inflected or conjugated words in the main text (these are called ‘tokens’) is referred to a dictionary lemma or headword, as illustrated in table .

mike kestemont and jeroen de gussem, ‘integrated sequence tagging for medieval latin using deep representation learning’, ed. by marco büchler and laurence mellerin, journal of data mining and digital humanities, special issue on computer-aided processing of intertextuality in ancient languages ( ), – .

table . intuition of lemmatization

token      lemma     pos-tag
uisioni    visio     nn
magno      magnus    adj

each word cloud coloured in blue, red or green within the network graph in figure corresponds to a work of hildegard’s trilogy and is tied together by lemmas (sixty in total) that typify that work, in the sense that the lemma is conspicuously frequent in one work and then becomes proportionally uncommon in the other two works. we will not elaborate here upon the technical details of network theory, which are better documented elsewhere. in essence, these networks put on display the affiliations between the frequencies (the edges) of lemmas (the nodes) that were initially generated by means of a coefficient of variation. this coefficient is a simple, normalized percentage that informs us on whether a certain lemma shows conspicuously unpredictable behaviour in terms of its average frequency across the trilogy.
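as a rough illustration of how such a coefficient can be computed, the sketch below divides each raw lemma count by the length of its work and then expresses the standard deviation of the three relative frequencies as a percentage of their mean. this is a minimal sketch, not the authors’ implementation: the use of the population standard deviation is an assumption, and the counts, text lengths, and the lemma ‘deus’ are hypothetical values chosen only to contrast a concentrated lemma with an evenly spread one.

```python
from statistics import mean, pstdev


def relative_frequencies(counts, lengths):
    """Divide each raw lemma count by the length (in tokens) of its work."""
    return {work: counts[work] / lengths[work] for work in counts}


def coefficient_of_variation(values):
    """Population standard deviation over the mean, as a percentage.

    Assumption: the article does not state which standard deviation it uses.
    """
    avg = mean(values)
    if avg == 0:
        return 0.0
    return 100 * pstdev(values) / avg


# Hypothetical figures for illustration only; they are NOT the article's data.
lengths = {"sciuias": 100_000, "lvm": 50_000, "ldo": 80_000}
counts = {
    "sacerdos": {"sciuias": 200, "lvm": 5, "ldo": 8},   # concentrated in one work
    "deus": {"sciuias": 1000, "lvm": 500, "ldo": 800},  # evenly spread
}

cv = {
    lemma: coefficient_of_variation(
        list(relative_frequencies(per_work, lengths).values())
    )
    for lemma, per_work in counts.items()
}

# Lemmas with the highest coefficient are the candidates that typify one work.
ranked = sorted(cv, key=cv.get, reverse=True)
print(ranked)
```

a lemma whose relative frequency is identical in all three works receives a coefficient of zero, whereas a lemma concentrated in a single work receives a high one, which is what makes the measure usable as a filter for work-specific vocabulary.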
table shows both positive and negative examples of words that have a high or a low variation coefficient. note that the average frequency was weighted by taking into account the length of the text by dividing the raw frequency by text length (hence the float numbers in the table).

table . word frequencies per work normalized by text length, and their coefficient of variation

lemma          sciuias    lvm    ldo    coefficient of variation
sacerdos       .          .      .
vermis         .          .      .
firmamentum    .          .      .
contritio      .          .      .

the network in figure , which maps out the lemmas (nodes) by their frequencies (edges), for the larger part visualizes what seem to be the main thematic developments between the three works in the trilogy through distant reading. interestingly, the graph thereby provides a perfect illustration of earlier interpretations of hildegard’s work. for example, constant mews argued that the author’s increased independence from ecclesiastic institutions is reflected in her writing, and this observation corresponds to our data.

mark e. j. newman, networks: an introduction (oxford: oxford university press, ), p. : ‘a network is, in its simplest form, a collection of points joined together in pairs by lines. in the jargon of the field the points are referred to as vertices or nodes and the lines are referred to as edges’.

floats are numbers with digits behind the comma.

mews, ‘from scivias to the liber divinorum operum’, p. : ‘when hildegard moved with her nuns to the new community at rupertsberg c. , a new phase in her intellectual development begins. having obtained ecclesiastical approval, she turned her attention away from ecclesia to reflection on the natural world’.

the word cloud of sciuias is filled with lemmas that relate to ecclesia and the sacraments (altar, baptizo, habitus, magisterium, oblatio, panis, sacerdos, sanctificatio, sponsa).
likewise, liber uite meritorum’s specific focus on sin and the hardships of life is omnipresent (impietas, ingluvies, malignus, patio, torqueo, vermis), as is the liber diuinorum operum’s preoccupation with the cosmos and its undisturbed balance (aequalis, aquosus, firmamentum, medietas, planeta). immediately, the network illustrates both the advantages and disadvantages of distant reading acutely. on the one hand, very little is learned. we did not need a computer to repeat mews’s findings. on the other hand, the result is still significant from a theoretical perspective, and in the margins some new discoveries rise to the surface. mews’s reading of difference gains in authority because it is democratized. whatever impression mews gathered from his qualitative reading process is objectively traceable, and we can all participate in its discovery because it now proves to be a text-inherent, replicable trait. the opposite movement, that is, a qualitative reading that could elucidate quantitative findings, is less self- evident. newer directions of difference and development within the network do not so easily allow for interpretation. why, exactly, does hildegard use the indefinite pronoun quilibet ten times more often in liber diuinorum operum? why does she use adverbs itaque and etenim more often in liber diuinorum operum, words that rarely occur in sciuias or liber uite meritorum? why does the interrogative conjunction an practically only occur in sciuias? here again, the difficulty of hildegard’s secretaries arises. words such as quilibet, itaque, or an, seemingly meaningless function words such as adverbs, pronouns, conjunctions, particles, or prepositions, seem to be examples of the kinds of syntactic or grammatical markers that secretaries would have adjusted. 
a matter that further complicates this observation, is that in computational authorship attribution, such function words are the statistically measurable (i.e., numerous) items of a text that gain meaning because of their frequency. these are words such as et, in, quod. they are variable and all over the text: salient, easy to count, easy to spot, structural, scalable, highly informative and—allegedly— immune to conscious control. all of these aspects have rendered function words extremely robust discriminators in attributing authorship, or in discovering an author’s stylistic fingerprint. kestemont, deploige, and moens have applied stylometry to reveal stylistic deviations within a few of hildegard’s late letters, the visio de sancto martino and the visio ad guibertum missa, which they mosteller and wallace’s revolutionary work of demonstrated the usefulness of function words in statistically determining authorship, breathing life into the belief that there is something like a ‘secret life of pronouns’ whose occurrences betray aspects of the author’s profile or his or her latent preferences and writing tics; see frederick mosteller and david l. wallace, applied bayesian and classical inference: the case of the federalist papers, springer series in statistics, nd edn ( ; repr. new york: springer, ). hildegardis bingensis, opera minora ii, ed. by jeroen deploige and others, cc cm, a (turnhout: brepols, ). parergon . ( ) language and thought in hildegard of bingen’s visionary trilogy convincingly attributed to the arrival of hildegard’s last secretary: guibert of gembloux. the study singles out a list of words that tellingly typify guibert’s stylistic preferences. an example is the higher frequency of relative pronoun qui, which relates to guibert’s notorious reputation for ‘constructing eloquent but complex sentences with a lot of embedded relative clauses’. 
as emphasized already, we have limited our scope of investigation to an analysis of the effect of measurable change (function words and others), and we are currently not interested in ʻattributing’ the measured phenomena. stylometry has often been viewed as the ideal method for tracing such external influences, but it has seldom been used for exploring internal development in texts. we want to take a step back and see whether we can also use stylometry to trace the development of these texts in their own right. could we link some of the changes that stylometric analyses come up with to evolutions in form, thought, or method that contribute to how the texts function and maybe even to how they construct their monolithic voice and the impression of unchanging stability? iii. principal differences the word clouds in the network (figure ) come with several disadvantages. firstly, lemmatizing the trilogy is accompanied by the risk that some information of the original text is irrevocably lost. if a word often appears in a fixed context or construction, and therefore in the same inflection or conjugation, we cannot derive this information from the figure. the second problem ties in closely with this first issue: figure only takes into account single words, not lengthier constructions or formulas. the third problem is more technical but also most pertinent: the way frequencies were calculated for the words in table is very naive. figure does not take into account if these words’ frequencies know a consistent distribution. this means that, hypothetically speaking, inflected forms of firmamentum could have occurred intensively only in the very last part of liber diuinorum operum and not anywhere else in liber diuinorum operum. one could then hesitate whether or not firmamentum is a representative word for liber diuinorum operum as a whole. distribution, then, is an informative aspect of the feature which we might want to retain. 
in what follows we attempt to address all of these aforementioned problems. we computationally ascertain the principal variables that generate textual difference between the three works of the trilogy. presently, we keep the trilogy’s texts intact, and do not lemmatize them such as in the previous experiment. in addition we do not only focus on words, but also on so-called word bigrams, which here equals two-word combinations (e.g. the bigrams in the phrase ‘nam a principio uisionum tuarum’ are: ‘nam a | a principio | principio uisionum | uisionum tuarum’). thirdly, we split up the original text in a number of smaller mike kestemont, sara moens, and jeroen deploige, ‘collaborative authorship in the twelfth century: a stylometric study of hildegard of bingen and guibert of gembloux’, digital scholarship in the humanities, . ( ), – (p. ). parergon . ( ) jeroen de gussem and dinah wouters text segments (also called ‘samples’), each consisting of , words, labelled by a class (i.e., sciuias, liber uite meritorum and liber diuinorum operum) and a number indicating its ordinal position in the original text from which it was segmented. this concept of text segmentation or sampling, that is, splitting up a larger population (the text) into smaller groups, enables us to track down consistently recurrent features that typify a certain class. table gives a technical overview of the subsequent steps. the main idea is to apply a few techniques from the field of multivariate data analysis and machine learning, the details of which are consultable in the table, in order to attain features (both words and bigrams) that distinguish best between the works in the trilogy. these features’ frequencies also indicate which words are typical and evenly distributed in sciuias, liber uite meritorum or liber diuinorum operum. consequently, we can plot the samples in a pca cluster plot (figure ), which is a visualization of the data’s principal components. 
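the sampling and feature-extraction steps described here can be sketched as follows. this is a minimal sketch under stated assumptions rather than the authors’ pipeline: the function names are hypothetical, the sample size and the number of features are toy values (the article’s own figures are not reproduced here), trailing text shorter than one sample is simply dropped, and the later mutual-information filtering and pca steps are left out.

```python
from collections import Counter
from statistics import mean, pstdev


def bigrams(tokens):
    """Return adjacent two-word combinations, e.g. 'nam a', 'a principio'."""
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def make_samples(tokens, size, label):
    """Slice a text into consecutive samples of `size` tokens.

    Each sample keeps its class label and its ordinal position in the text.
    A trailing remainder shorter than `size` is dropped (an assumption).
    """
    return [
        {"label": label, "position": i + 1, "tokens": tokens[start:start + size]}
        for i, start in enumerate(range(0, len(tokens) - size + 1, size))
    ]


def most_frequent(samples, n, unit):
    """Pick the n most frequent units (words or bigrams) over all samples."""
    totals = Counter()
    for sample in samples:
        totals.update(unit(sample["tokens"]))
    return [u for u, _ in totals.most_common(n)]


def vectorize(samples, features, unit):
    """Count each feature per sample, then standardize every feature column
    by removing its mean and scaling to unit variance."""
    raw = []
    for sample in samples:
        counts = Counter(unit(sample["tokens"]))
        raw.append([counts[f] for f in features])
    columns = []
    for col in zip(*raw):
        mu, sigma = mean(col), pstdev(col)
        # A constant column carries no information; map it to zeros.
        columns.append([(x - mu) / sigma if sigma else 0.0 for x in col])
    return [list(row) for row in zip(*columns)]


# The bigram example given in the text.
print(bigrams("nam a principio uisionum tuarum".split()))

# Toy text of twelve tokens, cut into two six-token samples.
toy = "et in est et in et et et et in in in".split()
samples = make_samples(toy, size=6, label="sciuias")
words = most_frequent(samples, n=3, unit=lambda tokens: tokens)
matrix = vectorize(samples, words, unit=lambda tokens: tokens)
print(words)
print(matrix)
```

passing `bigrams` instead of the identity function as `unit` yields bigram features from the same samples, so words and bigrams can be vectorized with one routine and concatenated afterwards.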
in essence, a pca plot yields a simplified impression of the data’s most significant trends. the samples, indicated by colored dots and numbered by their position in the text, are overlayed by their ‘loadings’, the words that discriminate the classes in writing style best. led by the pca plot as a distant reading recommendation, we here further trace down the pca’s loadings (the words that overlay the numbered dots indicating the text samples) for an actual close reading analysis. surprisingly, the pca plot teaches us that it is the smaller function words that present significant deviances as to the frequency by which they are applied. even more conspicuous is that the strongest shifts seem to take place in the sphere of modal particles (quippe vs uidelicet), comparative conjunctions (sicut, sic, et sicut, ut), causal and interrogative adverbs or constructions (ita quod / ut, quemadmodum, itaque, etenim, namque, quia). these words, meaningless as they may appear, first and foremost play a very practical role in the genre that hildegard is exercising. after all, exegesis has an explanatory function. if those words that bind hildegard’s exegetical argumentation together show a strong shift, then we must look out for how this affects the text’s tone, and to what extent we can ascertain a rhetorical purpose or systematicity in these preferences. simultaneously (as has been emphasized a few times at this point) we also need to consider that some of these changes are likely the result of hildegard’s collaborative authorship: the frequencies of sampling is common practice in computational stylistics, yet finding a generally valid adequate length that is suitable to any author, genre, or language has been subject to much debate. adequate sample length has been discussed explicitly for hildegard of bingen and medieval latin in kestemont and others, ‘collaborative authorship in the twelfth century’, pp. – . for the problem in general, see maciej eder, ‘does size matter? 
authorship attribution, small samples, big problem’, digital scholarship in the humanities, . ( ), – ; or kim luyckx and walter daelemans, ‘the effect of author set size and data size in authorship attribution’, literary and linguistic computing, . ( ), – .

for an elaborate explanation of pca and how the technique is applied in computational stylistics, see josé nilo g. binongo and m. wilfrid a. smith, ‘the application of principal components analysis to stylometry’, literary and linguistic computing, . ( ), – .

table . steps towards finding the principal variables of difference

preprocessing

. sampling
original texts are sliced into , -word samples.

. labelling
each of the , -word samples is assigned a label according to the class from which it was derived, of which there are (i.e., ‘does the text sample belong to sciuias, liber uite meritorum or liber diuinorum operum?’)

dimensionality reduction

. feature extraction
‘predictae causae deum ita tangunt, ut eas acceptando sciat et uideat quia homo, seipsum restaurans de instabilitate quae sibi in ruina adae orta est’
we derive, from the entire corpus in general, the , most frequent words and , most frequent bigrams as features. then, consequently, each of the n , -word samples is vectorized according to this list. the raw frequencies were normalized by removing their mean and scaling to unit variance.
features: [‘et’, ‘in’, ‘est’, ...]
sample : [- . , . , . , ...]
sample : [- . , - . , - . , ...]
sample : [- . , - . , - . , ...]
sample n: ...

. feature selection
a mutual information regression algorithm is used as a filter method, where only those features are selected (from originally , features) that distinguish particularly well between our classes. this algorithm could, for instance, decide that function word ‘et’ is not informative enough to distinguish between classes, and therefore becomes superfluous.
[‘et’, ‘in’, ‘est’, ...]
sample : [- . , . , . , ...]
sample : [- . , - . , - . , ...]
sample : [- . , - . , - . , ...]

. principal components analysis
in order to visualize the high-dimensional feature vectors, the vectors are reduced to principal components that capture the most important dynamics of the selected features. this final dimensionality reduction step is necessary, since the human mind essentially visualizes in two or three dimensions only.

plotting

cluster plot (figure )

smaller words with an agglutinative function such as uidelicet or namque are often seen as excellent examples of words that betray an author’s writing preferences. in this light, two of the bigrams in the pca plot are telling, namely ita quod vs ita ut. we know, again from the ghent autograph that was mentioned earlier, that secretaries made this correction very often in hildegard’s text and had to adjust the subjunctive that followed in such clauses. in this particular case we are obviously dealing with a corrective procedure that has very little meaningful impact for the text. yet, importantly, it seems to have been a linguistic procedure to which her collaborators paid far more attention in her final work, and it might have contributed to the fact that the liber diuinorum operum reads as the most complex work of the three.

in contrast to such more obvious scribal interferences, the interrogative function words deserve closer attention. in figure , the exclusivity of an to sciuias had already been demonstrated. this could be linked to the more inquiring and dialogical character of sciuias, which is also confirmed in the pca plot: we see that interrogative constructions with quomodo and quid est appear very frequently. the work moreover contains an enormous number of question marks ( ) in contrast to liber uite meritorum ( ) and liber diuinorum operum ( ). these changes are unlikely to have been solely the result of different scribes’ interferences.
rather, they should be ascribed to a change of tone, which is much more didactic in sciuias, where often a format of question-and-response is employed in its exegesis:

quod dicitur: mulier propter uirum creata est, et uir propter mulierem factus est; quoniam ut illa de uiro ita et uir de illa, ne alterum ab altero discedat in unitate factorum natorum suorum, quia in uno opere unum operantur, quemadmodum aer et uentus opera sua inuicem complicant. quomodo? aer de uento mouetur, et uentus aeri implicatur, ita quod in ambitu eorum quaeque uiridia illis subdita sunt. quid est hoc? mulier uiro et uir mulieri in opere filiorum cooperatur.

derolez, introduction to hildegardis, liber diuinorum operum, p. xciv: ‘so the hundreds of instances of ita quod followed by a verb in the indicative were systematically changed into ita ut followed by the verb in the subjunctive. it is probably because of the intervention of learned friends after volmarus’s death that the language and style of ldo in its final form are better than in sciuias, for instance’.

the punctuation maintained in the editions closely follows the manuscripts (which indeed show question marks), and can therefore be argued to be a feature of the text for which hildegard and her collaborators were responsible, and not the modern editors. for the liber diuinorum operum, albert derolez states that he ‘followed the punctuation of g[hent ms ] strictly and continuously’; see ‘introduction’, p. cxiv. for sciuias and liber uite meritorum, no explicit comment is made by führkötter and carlevaris on punctuation, but on our own closer inspection of the manuscripts they likewise appear to have followed hildegard’s punctuation closely. the digitized and consultable versions of sciuias and liber uite meritorum that were used for the edition are respectively the wiesbaden riesencodex and the former dendermonde ms codex . the latter was moved in from the abbey of st peter and paul in dendermonde to the maurits sabbe library in leuven.

this already explains why words such as quid, quod, and quomodo abound in sciuias. it moreover turns out that these words actively appear in an exegetical function. they ensure the transfer from a biblical citation to its exegesis. we first notice that all exegeses of biblical citations are introduced by an exegetical formula such as quid est hoc? ( times in sciuias, and practically never again in liber uite meritorum and liber diuinorum operum) or quod etiam sic intelligendum est (only in liber uite meritorum, nowhere else). if we further look at the distributions of such formulas, we discover an astonishing systematicity (table ).

table . exegetical formulas in the vision books

sciuias
part                                  quod dicitur
part                                  quid est hoc?
part                                  hoc tale est

liber uite meritorum
part of each vision: psychomachia     cuius sensus talis est
part of each vision: afterlife        quod etiam sic intelligendum est

liber diuinorum operum
part                                  quod sic intellectui patet
part                                  hoc considerandum sic est
part                                  huius sententie intellectus hoc modo accipiendus est

the first part of sciuias uses the same words after each biblical quote: quod dicitur. in the second part, however, we only and consistently find quid est hoc, and quod dicitur is never used again as an introduction to exegesis in any of the three works. quid est hoc is an exception in this regard, because it is the fixed formula for the second part, but it also regularly occurs in the first part in the body of the exegesis (i.e., not immediately following the biblical citation). all other formulas are specific to the introduction of an exegesis in one particular part of a book and are never used elsewhere. to continue, yet another formula is used in the third part of the book, namely hoc tale est. in liber uite meritorum, which is not divided into parts, we find a different but equally regular pattern.
each of these six visions has two parts. the first part is a psychomachia where the personifications of virtues and vices argue with each other, and the second part gives a view of the afterlife, where the sinners who fell victim to these vices undergo their punishments. each of these two parts has its own exegetical formula: for the first part, this is cuius sensus talis est and for the second part quod etiam sic intelligendum est. lastly, the same organization is apparent in the last vision hildegardis, sciuias, i. . , p. (our italics). with one exception (hildegardis, sciuias, i. . , p. ), where the text says quid est hoc instead. parergon . ( ) jeroen de gussem and dinah wouters book. whereas the first part consistently uses quod sic intellectui patet, the second part always says hoc considerandum sic est, and the third part tells us that huius sententie intellectus hoc modo accipiendus est. to our knowledge, this sort of systematicity is found in no other exegetical text. the exact repetition of one formula for each part has a prophetic ring to it, whereas the variation in formulas is at once rhetorical variatio and yet so systematic and monolithic that readers do not even notice it. it acts as a sort of prophetic appropriation of the rhetorical technique. furthermore, the change of formulas between the texts seems to follow a development in how the text teaches. the formulas in sciuias indicate a simple deictic relationship between the citation and its exegesis, pointing from one to the other and saying ‘this is that’ (e.g. quid est hoc, tale est, quid est, qui est, quod est, id est). in liber uite meritorum, there is a sensus which needs to be apprehended; the reader must make some effort to understand what is said. liber diuinorum operum further develops this trend by putting the stress on intellectus as both the meaning of what is said and the intellect which grasps that meaning. 
this rhetorical effect is reflected in the changing frequencies of smaller words. in liber diuinorum operum, hildegard starts using more words such as quippe (modal particle), sicut (conjunction of comparison), itaque (adverb of cause), quemadmodum (interrogative adverb), namque, and sic. her final work strongly revolves around ‘intellect’, around understanding (the capitula of liber diuinorum operum are teeming with the phrase quomodo intellectum sit), whereas sciuias is more occupied with the condition that precedes understanding, namely the right way of thinking. in the same vein, the occurrence of uidelicet (etymologically linked to ‘seeing’: ‘it is easy to see’) drops in the liber diuinorum operum in favour of other modal particles such as quippe or scilicet (by contrast, a word that is etymologically linked to ‘knowing’: ‘it is easy to know’). the way in which hildegard links the vision with its explanation has become less of a one-to-one relationship; the likeness between image and explanation has become modal (quemadmodum […] sic [...], sicut enim, sic que). such links are harder to interpret for the less advanced reader.

in sciuias hildegard still takes her reader by the hand. this is reflected by the higher frequency of direct addresses throughout the work. in sciuias it is god himself who is subject, speaking directly to the audience in terms of ego, me, and meus (also visible in the pca plot). later on in hildegard’s trilogy, in liber uite meritorum, god becomes a more distant yet constantly present object, objectivized through hildegard, who takes on a reportorial role (deum or his uocem). finally, it is his divine works that gain central focus in liber diuinorum operum (operibus, opera). as sabina flanagan notes, sciuias is ‘essentially a work of instruction and direction, a “how to” book rather than an abstract meditation on theological questions’.
flanagan also stressed that hildegard was still trying to find her method of writing in sciuias: ‘we might conclude, then, that many general themes are stated in the first part which are explored at greater length in subsequent books of sciuias. indeed, it is possible that hildegard had not quite found her method at this stage’. this sort of didacticism gradually disappears in the two later vision books, and makes way for a style of exegetical explanation targeting a more advanced reader.

sabina flanagan, hildegard of bingen, – : a visionary life, nd edn ( ; repr. london: routledge, ), p. .

what we should mostly take away from these experiments is how much control the text exerts over the placement of its words, and that there is a strongly embedded didactic line that appears throughout. if in this case each word is right where it should be in a system of uninterrupted order, this suggests a similar attitude with regard to other words. it also shows that hildegard’s collaboration with secretaries in composing her works does not equal an absence of structured, underlying thought, when even the preferences for the very small words seem to have a distinct purpose.

iv. lexical richness

in this next section we devise a set-up in which we quantify hildegard’s lexical richness. vocabulary richness, or diversity measurement, is a very intuitive and simple approach to measuring lexical differences between texts. it is one of the oldest methods for statistical text analysis, and it remains a valid exploratory approach to an author’s lexical variety. interestingly, the method has been considered well suited to yielding informative figures about a developing mind in the process of creating.
lexical richness has been said to visualize a ‘very subtle mental process’, which allows us to see in what parts of the text the author meanders from one subject to another (high variety), or is highly fixated on a particular subject (low variety). vocabulary richness is nowadays also considered a helpful method to assess the acquisition of, or proficiency in, a (second) language, although a correlation between lexical richness and control over the language is harder to maintain in the context of literature (cf. below). literature will often be homonymous, sparse, or repetitive with an artistic purpose.

flanagan, hildegard, p. .
it has for some time even been regarded a viable method to distinguish between works of different authorship. in the meantime its reliability to determine authorship has been much debated or withdrawn. for the earliest studies in this field, see george udny yule, the statistical study of literary vocabulary (cambridge: cambridge university press, ). for an overview, see fiona j. tweedie and r. harald baayen, ‘how variable may a constant be? measures of lexical richness in perspective’, computers and the humanities, ( ), – .
carrington b. williams, style and vocabulary: numerical studies (london: charles griffin, ), p. .
see n. above.

in figure , the texts were segmented into equally-sized , -word slices or samples, the concept of which should by now be clear. again, there was no lemmatization. subsequently, the total size of the vocabulary in each sample was divided by the total number of words in the sample (always , words), yielding a type-token ratio (ttr) for each of these samples. the ttr is a simple ratio that informs us how many different words are used within a text sample. the higher the number of different words, the more ‘lexically diverse’ or ‘lexically rich’ the sample becomes.
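the slicing and the ratio described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the code behind the figure: the function names are hypothetical, the tokenizer simply takes runs of letters, a trailing remainder shorter than one slice is dropped, and `slice_size` is left as a parameter because the slice length used in the article is not reproduced here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (no lemmatization)."""
    # [^\W\d_]+ matches runs of letters only, including non-ASCII letters
    return re.findall(r"[^\W\d_]+", text.lower())


def ttr_per_slice(tokens, slice_size):
    """Return one type-token ratio per consecutive, non-overlapping slice."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    ratios = []
    # assumption: a trailing remainder shorter than slice_size is dropped
    for start in range(0, len(tokens) - slice_size + 1, slice_size):
        sample = tokens[start:start + slice_size]
        # types = distinct word forms, tokens = all words in the sample
        ratios.append(len(set(sample)) / slice_size)
    return ratios


def mean_ttr(tokens, slice_size):
    """Average the per-slice ratios, i.e. the value a dashed mean line would show."""
    ratios = ttr_per_slice(tokens, slice_size)
    if not ratios:
        raise ValueError("text is shorter than one slice")
    return sum(ratios) / len(ratios)


# toy input for illustration only; a real run would use a whole work and a much larger slice
toy_tokens = tokenize("Et uerbum sonuit, et uerbum iussit; et sic uerbum et Deus unum sunt")
toy_ratios = ttr_per_slice(toy_tokens, 4)
```

because every slice has the same length, the ratios are directly comparable across slices and works, which is the reason for fixing the sample size instead of computing one ratio over a whole text.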
for example, if there are , different words in a , -word sample, then that sample is very diverse in vocabulary (the ttr would be . , which is approximately the highest score hildegard attains). once we have worked through the entire text and computed a ttr every , words, we are able to obtain an average ttr value for each work, which is indicated by the dashed line (following the usual colour scheme maintained throughout this article). as the close reading in the previous section has already suggested, this dashed line indicates that hildegard used a smaller variety of words with each successive work in the trilogy, with the clearest and most significant drop in the liber diuinorum operum. hildegard shows a diachronic tendency to distil her vocabulary to a concentration of carefully chosen words. this style of writing also made her increasingly abstract or difficult to understand. in other words: hildegard’s thought increases in complexity within a more limited lexical range.

when analysed more closely, lexically rich or, conversely, sparse episodes within hildegard’s oeuvre give a good indication of what the ttr measures: this often turns out to be a stylistic strategy in line with a particular thematic focus. for example, one of the lexically sparsest episodes of the liber diuinorum operum is the fourth vision. the vision’s obsession with cosmic symmetry is embedded in the latin: the descriptions are repetitive and synonymous. strikingly, the third most frequent word throughout the entire passage is uerbum, which takes up a central position in the beginning of her commentary on the prologue of the gospel of john. we can here observe how hildegard meditatively revolves and circles around the equalization of uerbum and deus, and the divine potential to create (creatura).
because without the beginning, before the beginning of creatures, and even in the beginning of these creatures there was the word, and this same word was before the beginning and in the beginning of creatures with god and in no way distinct from god; because god’s will was in his word and his word created everything, just as he had preordained it before all time. and why is it called the word? because with a resounding voice it animated all creatures and summoned these creatures to itself. because what god dictated in the word, this word ordered through sound; and what the word ordered, god dictated in the word. and so god was the word. for the word was in god, and god dictated to it all his will in secret; and the word resounded and brought forth all creatures; and so the word and god are one.

note that text length can have an impact on such an average. however, this does not have to be a problem, since our aim is mostly to present a transparent overview of the lexical richness in the entire trilogy. moreover, text length has clearly not prevented the liber uite meritorum from outperforming liber diuinorum operum when it comes to type/token ratio.
derolez and dronke, introduction to hildegardis, liber diuinorum operum, p. xcii.

a lexically rich passage within hildegard’s trilogy, on the other hand, is found in the fifth part of liber uite meritorum. it concerns the divine judgement of man giving in to worldly temptations. interestingly, in an episode that is teeming with many different words that are semantically rooted in vice (uitium, iniquitas), overabundance (superabundare) and luxury (luxuria), hildegard’s writing style is somewhat more digressive and ornate. the sentences are much longer and the syntax is considerably more complex: the allurement of earthly distraction and open-endedness thereby seems to become tangible through the language.
for while injustice abounded, to the extent that it deemed itself unconquerable by anyone, the lord in his zeal destroyed every beginning and every head that raged in the perversity of tenebrous infidelity, namely from the beginning of pride with the devil, which had prepared for itself a seat in the infernal kingdom, up to the beginning of adam’s transgression, where the latter, captivated and incarcerated, subjected himself to that same devil.

the stark contrast between these two passages, where a humble vocabulary focuses on god’s potential to create and a rich vocabulary on god’s potential to destroy, emphasizes how hildegard could adapt her style to suit the passage. importantly, it should be noted that a usage of fewer words, as in the passage of liber diuinorum operum, does not necessarily equal a simpler text or simpler reading experience. on the contrary: in observing the modest number of transmitted manuscripts of liber diuinorum operum, barbara newman has speculated that the work might have rather been ‘far too complex to become popular in hildegard’s time’, an observation which mcginn has echoed by stating that liber diuinorum operum is bolder and more original than its predecessors in its material, structure, and narrative. kowalewska has likewise asserted that ‘while it is undoubtedly a fact that hildegard’s language is to a large extent homonymous, it does not manifest her linguistic poverty’. perhaps it is precisely this somewhat abstract quality to hildegard’s later work, which seems less intent on explaining itself through the use of more words, that renders it more difficult to grasp.

our translation. see hildegardis, liber diuinorum operum, i. . . – , p. : ‘nam sine principio ante principium creaturarum et etiam in principio ipsarum erat uerbum, et idem uerbum ante principium et in ipso principio creaturarum erat apud deum et nullo modo a deo diuisum; quoniam deus in uerbo suo uoluit et uerbum suum omnia crearet, sicut ante secula preordinauerat. et | quare dicitur uerbum? quia cum sonante uoce omnes creaturas suscitauit et eas ad se uocauit. nam quod deus in uerbo dictauit, hoc uerbum sonando iussit; et quod uerbum iussit, hoc deus in uerbo dictauit. et ita deus erat uerbum. verbum enim in deo fuit, et deus in illo omnem uoluntatem suam secreto dictauit; et uerbum sonuit et omnes creaturas produxit; et sic uerbum et deus unum sunt’.
our translation. see hildegardis, liber uite meritorum, . – , p. : ‘dum iniquitas superabundauit, ita quod a nullo se superari existimauit, contriuit dominus in zelo suo omne initium et omne caput quod in peruersitate tenebrose infidelitatis grassabatur, scilicet ab initio superbie diaboli, que sibi sedem in tartareo regno parauerat, usque ad initium transgressionis ade, ubi ipse captiuatus et incarceratus eidem diabolo se subiecerat’.
barbara newman, review of hildegardis, liber divinorum operum, ed. by a. derolez and p. dronke, in speculum, . ( ), – .
mcginn, ‘hildegard of bingen as visionary and exegete’, p. .

v. semantic change

this combination of decreasing lexical richness and increasing abstraction deserves closer attention. an abstract writing style implies more than simply using fewer words to describe a reality. it also implies that the semantic connection between the word and that reality is problematized. if fewer words are used, does that mean that fewer realities are indicated, or that larger parts of reality are taken together? we will give a few examples of philosophically significant words of which the relation between word and semantic content changes from the first vision book sciuias to the last book liber diuinorum operum.
these two works, which chronologically demarcate hildegard’s literary achievement, are structurally and thematically very similar, yet show the greatest difference in lexical richness, which urges a closer comparison. in tracing these semantics, we will encounter a trend of specification: polysemic words are bent towards a single univocal meaning. on the other hand, we will sometimes encounter remarkable instances of quantitative change going hand in hand with qualitative, semantic, change. sometimes, words just disappear. while we start out by looking at a cluster of terms in which the semantic values of the words shift in relation to each other, we end by zooming in on instances of words that disappear towards the last book of the visionary trilogy.

first, we will discuss terms or clusters of terms that acquire more specific and technical philosophical meanings. we start with hildegard’s semantic cluster of the word itself: uerbum, uox, sonus, sonitus, and nomen. in sciuias, uerbum occurs as the centre of the cluster. forms of uerbum occur times in the work. as one would expect, they are used all the time, both in non-philosophical usages and in the context of the word of god. sciuias contains little critical reflection on the nature of the word. the only time when hildegard discusses the nature of the word is when she uses it as an analogy to demonstrate the nature of the trinity. the word has three elements: ‘sonus, virtus et flatus’. ‘sed sonum habet ut audiatur, uirtutem ut intellegatur, flatum ut compleatur’. these three correspond to the father, the son, and the holy spirit, respectively: like the elements of the word, the persons of the trinity cannot be separated.

małgorzata kowalewska, ‘the linguistic artistry of hildegard of bingen as exemplified in her letters’, roczniki kulturoznawcze, . ( ), – (p. ).
it might be helpful for the reader to know the words which we investigated but for which we could not detect significant semantic changes. these were: speculum, uoluntas, intellectus, imago, and similitudo.
hildegardis, sciuias, ii. . , p. .

sonus occurs times in sciuias. apart from being used as an aspect of the word, it denotes harmonious sound, often the sound of singing or of a message or truth that is revealed. mostly, this happens in a typological relationship where sonus denotes the anterior and hierarchically lesser element. the second element is then often uerbum. we would rather expect such a typological relation to be represented, in the augustinian tradition, by uox and uerbum. however, in sciuias, uox occurs only in its most common, non-metaphoric usage. the same is true for nomen: it only denotes the name and title by which one identifies a person, and has no further philosophical connotation. a word related to sonus, sonitus, appears times and represents the negative connotations of sound, with the meaning of noise and distorted sound. this difference between positive sonus and negative sonitus corresponds to the normal usage. however, it also happens that they are used interchangeably.

the semantic relations within this cluster change markedly in liber diuinorum operum. as was noted earlier, hildegard’s last book includes important theoretical reflections on language and knowing through words. this engagement with words is reflected by the change of terms. the most significant change is that the word is now also that through which we understand, whereas in sciuias it was only the means of communication. in this sense, the word is called ‘name’, nomen. nomen in this philosophical usage is now often found in the vicinity of words like forma, officia, discernere, comprehendere. the name corresponds to the form of something created (forma creaturae), and through the name we discern (discernere) the functions (officia) that it has.
for instance at i. . , p. : ‘sonus laetitiae et prosperitatis’; at iii. . , p. : ‘tunc etiam declaratum est in acuta iustitia mysterium verbi dei, insinuatum uidelicet per sonum patriarcharum et prophetarum’; and at iii. , p. , ll. – : ‘et sonus ille, ut uox multitudinis in laudibus de supernis gradibus in harmonia symphonizans, sic dicebat’.
for instance, adam and christ at iii. . , p. : ‘significante in circumcisione primum sonum oboeditionis post casum adae, praecurrentem operantem oboedientiam in uero uerbo quod est in filio dei, ut sonus uerbum praecurrit’; the patriarchs, prophets, and christ at iii. . , p. : ‘insinuatum uidelicet per sonum patriarcharum et prophetarum, qui praedixerunt ipsum verbum cum omni iustitia manifestandum’.
for a more extensive discussion of the philosophy of the word in hildegard, see ranff, wege zu wissen und weisheit; chávez alvarez, die brennende vernunft; and dinah wouters, ‘“nisi per nomina”: language as the medium of thought in hildegard of bingen’s thinking’, revue d’histoire ecclésiastique, . – ( ), – .
hildegardis, liber diuinorum operum, i. . , p. , ll. – : ‘uelut creaturas, que homini note sunt, formis et nominibus suis ab inuicem discreuit’.
hildegardis, liber diuinorum operum, i. . , p. , l. : ‘officia et nomina creaturarum discernit’.
hildegardis, liber diuinorum operum, i. . , p. , ll. – : ‘quia homo nullam rem alio modo nisi per nomina discernit’.

consequently, the use of uerbum is reduced to signifying the word that is communicated. either it is the word of god, in the form of creation or christ, or it is the spoken human word. in these contexts, uerbum is still used in a typological relation to sonus, and now also to uox. thus, the traditional, augustinian pair of uox–uerbum comes into play, and uox is now also part of the word, which was not the case in sciuias. verbum now no longer has three parts; it is not analogous to the trinity anymore.
instead, nomen has adopted this structure. lastly, the word sonitus disappears in liber diuinorum operum, which is not easy to explain, because it did carry a different meaning than sonus. it is not the case that sonus loses its connotation of harmony. instead, the meaning of disharmonious sound is filled in by strepitus, which was also present in sciuias, but only times. so, this change seems less motivated by a change of meaning than by the desire to use fewer words for the same meaning.

the same phenomenon presents itself with the term species: there exists a semantic overlap with forma in sciuias, but liber diuinorum operum narrows down the use of the term so that the two are strictly separated in meaning. species is used times in sciuias and for all kinds of meanings. it denotes visible beauty and the outer appearance of things. it also has the meaning of the english word species, to denote the species of animals that left the ark and the species of humankind. in its sense of outer appearance, species overlaps with forma, which stands both for the outer appearance and the inner nature of something. the concept of forma thus corresponds to the philosophical idea that the ‘form’ of something is its essential nature, the sum of its properties, which is abstracted from the thing and thought in the mind. so, forma indicates a continuity between outer and inner appearance, and if forma and species occupied wholly different semantic fields, we would expect species not to carry this meaning. however, this is not the case: species also sometimes includes a metaphorically motivated analogy between outer appearance and inner essence. in liber diuinorum operum, however, species occurs only in the commentary on genesis and only to denote animal ‘species’. the term did not even keep its meaning of ‘beauty’. for a word that is fairly common, this points to a deliberate avoidance.
hildegard chose to appoint forma as the philosophically correct word for outer and inner appearance, and, more importantly, for the continuity between them. there cannot be an overlap of meaning.

hildegardis, liber diuinorum operum, ii. . , p. , ll. – , – : ‘hoc considerandum sic est: vox primum sonat et uim uerbi in se habet, ita ut quecumque annuntiat scienter intelligantur. [...] sed et uox aliquantum aliena est nec intelligibilis, uerbum autem notum et intelligibile est’.
hildegardis, liber diuinorum operum, i. . , p. , ll. – .
hildegardis, sciuias, ii. . , pp. – : ‘quapropter et ibi spiritus sanctus apparuit, quia fidelibus per eum remissio peccatorum fit, ibi uidelicet ob mysticum secretum eundem vnigenitum meum idem spiritus sanctus in specie columbae ostendens, quae simplicis et sinceri moris est; quoniam et spiritus sanctus in simplicitate et in bonitate omnium bonorum indeficiens iustitia est’; and at iii. . , p. : ‘sed homines isti quos uides in hac multitudine dicuntur compulsae oues, humanam speciem habentes propter opera hominum, et umbrosum uestitum quod est dubitatio in operibus peccatorum, in districtione tamen timoris metuentes iudicium dei’.

further, we will give some examples of terms that completely disappear from sciuias to liber diuinorum operum, due to, presumably, overspecification. first, there are typus and the adverb typice, which denote either the way that an image features in hildegard’s visions or the typological relation of prefiguration between old and new testament. we see that hildegard uses typus and typice together times in sciuias, but not at all in liber diuinorum operum. in sciuias, she used the word together with figura ( times), which replaces typus completely in liber diuinorum operum ( times), a process that is familiar by now.
second, there is the strange case of conscientia, which is a central word in sciuias ( times), but disappears without a trace in liber diuinorum operum. viki ranff offers a hypothetical explanation that fits in well with our own findings. she notes that scientia is very close and almost identical with conscientia, because it always serves ethical knowledge. scientia is, for hildegard, always scientia boni et mali, with which conscientia was associated in sciuias. therefore, we may conjecture that hildegard felt that the word conscientia had become superfluous. it says much about hildegard’s terminological rigour that she would even do away with such a common and culturally central word.

the last term that we will discuss is one that disappears almost completely: sermo. in sciuias, the word appears times. it is not a word that seems to receive special attention. it is used for hildegard’s own visions, the words of the faith, the kind words of a teacher, angelic words, and all kinds of human words. in the latter sense it may also be used negatively. in the last vision book, however, things have changed drastically: sermo appears only twice. if we look at the two passages in which sermo does appear, we see that it is now used in an extremely specific, namely extremely negative, sense. the first time is in a biblical quote which warns against the trickery of the antichrist: ‘neque terreamini neque per spiritum neque per sermonem neque per epistolam tanquam per nos missam, quasi instet dies domini’. in explaining the quote, hildegard defines this sermo as verbosa seductio. four chapters later, the talk is about the ‘two witnesses’ in the apocalypse, enoch and helias. god has taught them his mysteries, and therefore they know things ‘ita ut illa sciant quasi ea corporaliter uiderint’, which makes them ‘sapientiores […] scriptis et sermonibus sapientium’.

the term only appears once in the capitula or introductory chapters of the liber diuinorum operum, which we did not include in this analysis. the capitula have been identified by albert derolez as written by another hand, one that is not found anywhere else in the work (‘hand ’). the chapters also originally headed a version of the liber diuinorum operum which must have looked quite different from the current one, meaning that it was written somewhat before the work’s ‘final’ completion (inasmuch as that term can be properly used here); see derolez and dronke, introduction to liber diuinorum operum, p. lxxxviii.
ranff, wege zu wissen und weisheit.
hildegardis, sciuias, i. . , p. : ‘ego enim opus istud per hunc hominem edissero, cui idem opus in homine ignotum est et qui sermonem istum non ab homine, sed a scientia dei accepit’.
hildegardis, sciuias, i. . , p. : ‘quapropter et ex eo quidam flatus cum turbinibus suis exiens per praedictum instrumentum se ubique diffundit: quoniam ab inundatione baptismatis salutem credentibus afferentis uerissima fama cum uerbis fortissimorum sermonum egrediens omnem mundum manifestatione beatitudinis suae perfudit, ut iam in populis infidelitatem deserentibus et fidem catholicam appetentibus aperte declaratur’.
hildegardis, sciuias, ii. . , p. : ‘quomodo? si habet rectos rectores et spiritales magistros zelum meum habentes, hi debent eum ad seruitutem meam reuocare, et hoc primum facient supplicatione, exhortatione et blando sermone eum lenientes, et deinde uerberibus et constrictione frigoris ac famis et aliis his similibus castigationibus eum corripientes, quatenus his miseriis admonitus infernales poenas ad mentem suam reuocet, et eas timens a se putredinem animae suae auferat, et ad semitam illam quam deseruerat ita reuocatus redeat’.
hildegardis, sciuias, ii. . , p. , ll. – : ‘sed quod eadem beata virgo per angelicum sermonem in eodem secreto ueram allocutionem audiuit’.
it is instructive to also take a look at liber uite meritorum. this book, too, only features sermo twice. the first time it is used by the personified fallacia, who says that she voices a multitude of opinions because she fears to be contradicted. the second time it is used by infidelitas, who decides to do as she pleases, because she is confused by the ‘[m]ultos […] rumores et multos sermones ac multas doctrinas’ that she hears but does not understand. from these four passages we can conclude that sermo had lost all of its positive connotations after sciuias. apparently, sermo had come to be associated with bad and misleading teaching, and even with fallacy, seduction and infidelity. but why sermo? sermo was a word that carried much meaning, and many different meanings, too, but it was in no way a negative word or one that would be shunned by some writers. hildegard’s usage seems completely idiosyncratic.

hildegardis, sciuias, iii. . , p. , ll. – : ‘sed te in eadem emptione durante non habebis partem lucis in consortio supernorum angelorum: quoniam in sermone linguae tuae rapacitatem cordis tui protulisti aliud concupiscens quam ciues aeternae claritatis desiderent’.
ii thessalonians . ; hildegardis, liber diuinorum operum, iii. . , p. , ll. – .
hildegardis, sciuias, iii. . , p. , ll. – .
hildegardis, liber uite meritorum, ii. , p. , ll. – : ‘si enim loquela mea in uno modo esset, ab omnibus damnarer; et ideo sermones meos multiplico, ne ab ullo superer, et hoc mihi utilius est, quam fustibus et gladiis percutiar’.
hildegardis, liber uite meritorum, iii. , p. , ll. – : ‘multos quoque rumores et multos sermones ac multas doctrinas audio, quas nescio. vnde faciam quicquid ad utilitatem meam optimum fuerit’.

what this analysis of philosophically relevant vocabulary shows is that terms are used with greater care and specification in the last vision book than in the first one. first of all, hildegard apparently acquired a more philosophical vocabulary: philosophical distinctions are made (uerbum–nomen), and terms are combined in ways that are more in line with tradition (uox–uerbum). secondly, there is the tendency to eliminate semantic overlap by restricting competing terms both in meaning and use (species–forma, sonus–uox, sonitus–strepitus, typus–figura, conscientia–scientia). the changes described above are motivated by the wish to create a philosophical terminology that is clear and precise. researchers of hildegard’s philosophy have noticed her careful choice of words: ‘die differenzierte und präzise nuanciertheit des sprachlichen ausdruckes vornehmlich bei wissens-, erkenntnis- und weisheitsthemen wirkt wie ein bewußt erstelltes begriffliches instrumentarium, das eine hohe sensibilität gegenüber philosophischen fragestellungen erwarten läßt’. christel meier has spoken of ‘[die] kontingenz einer art bildterminologie die systemhaft wie die abstrakte terminologie eines philosophischen werks die visionsschriften durchzieht’. meier herself has analysed the ‘bildterminologie’ of colour, while barbara maurmann has described the terminology of the winds.

despite the attention that scholars have paid to this terminology, the changes in it have been less noticed. this might be partly due to how the text itself deals with these changes. first of all, hildegard does not signal that she will use certain words for certain meanings, and she does not give definitions. second, she prefers common words over more specialized ones to denote very specific philosophical meanings. although hildegard chooses her words with great care, this only becomes apparent to a reader when one closely analyses her word use, not when one reads the text linearly or analyses her philosophy. her terminology does not serve the purpose of making connections to other ideas and texts. on the contrary, hildegard seems intent on avoiding any such inferences.
with regard to hildegard’s use of sources, peter dronke has noted ‘a sense [...] that verbal reminiscences have at times been deliberately covered over’, christel meier has voiced the strong suspicion that ‘mögliche wortanklänge an verwandte vorstellungen durch synonymengebrauch umgangen werden’, and jochen schröder has asserted that in hildegard’s use of ezechiel ‘naheliegende similien wie demonstrativ ausgemerzt [werden]’. this phenomenon certainly has to do with hildegard’s ‘bewußt angestrebte originalität’, as schröder says, and dronke sees it as ‘a mark of the individual creative will of [a] writer […] of genius’. meier explains it as an effect of hildegard’s wish to create an allegory that is hermetically closed to the reader.

ranff, wege zu wissen und weisheit, p. ; and chávez alvarez, die brennende vernunft.
christel meier, ‘eriugena im nonnenkloster? Überlegungen zum verhältnis von prophetentum und werkgestalt in den figmenta prophetica hildegards von bingen’, frühmittelalterliche studien, ( ), – (p. ).
christel meier, ‘die bedeutung der farben im werk hildegards von bingen’, frühmittelalterliche studien, ( ), – ; barbara maurmann, die himmelsrichtungen im weltbild des mittelalters: hildegard von bingen, honorius augustodunensis und andere autoren (munich: funk, ).
peter dronke, ‘the allegorical world-picture of hildegard of bingen: revaluations and new problems’, in hildegard of bingen: the context of her thought and art, ed. by burnett and dronke, pp. – (p. ); meier, ‘zwei modelle von allegorie im . jahrhundert: das allegorische verfahren hildegards von bingen und alans von lille’, in formen und funktionen der allegorie, ed. by walter haug (stuttgart: metzler, ), pp. – (p. ); jochen schröder, ‘die formen der ezechielrezeption in den visionsschriften hildegards von bingen’, in ‘im angesicht gottes suche der mensch sich selbst’, ed. by rainer berndt (berlin: akademie verlag, ), pp. – (p. ).
however, we do not only notice this elusiveness in the creative use of allegory but also in the use of philosophical terms. hildegard is deliberately disowning her learning. her persona of a visionary prophet writing the words of god brings her in a difficult position in relation to her steep learning curve. as hildegard the visionary, she is unlearned and knows nothing of philosophical knowledge. the voice that speaks in her visionary texts, however, is the voice of god, who knows everything, but is also not learned, in the sense that he is above all human learning. both can neither gain new knowledge nor refer to other human knowledge. this provides hildegard with a great freedom of saying things in her own way, but it also puts restrictions on what she is able to say. this argument might explain why the specification of a philosophical terminology would go hand in hand with the use of fewer words. usually, the creation of such a technical terminology rather seems to entail using more or even new words. if you want every semantical node to correspond to one lexical form, as hildegard clearly does, you would need more and not fewer words. yet, this lexical analysis—together with our lexical richness analysis—suggests that the more hildegard’s thought gained complexity, the fewer words she used. it is of course not proven that philosophical terms play the major role in this trend, but seeing that many terms simply disappear, we can assume that this does play a role. this is an interesting phenomenon which deserves to be pursued further. vi. conclusion hildegard of bingen’s visionary trilogy is not as monolithic as it appears, although it is designed to give that impression. precisely this aspect of her work makes it a grateful study object for a distant reading. such an abstract approach, which is inductive and exploratory, prides itself on being capable of temporarily deactivating any subjective impressions that can be misleading in a close reading. 
moreover, it allows us to spot subtle nuances, changes and meaningful variations which often go unnoticed in an extensive corpus, by laying bare statistically measurable trends. it has been our aim to show that, on the one hand, statistical analysis is able to point out patterns which otherwise remain unread, and, on the other hand, that these patterns are not automatically incidental or ‘unconscious’ elements of the text. a text will always be incomparably richer than its deepest and most nuanced reading. distant reading or data analysis can help us reduce the complexity of the text so that it yields an overview. our starting point was the analysis of the principal components in each of the three works, which showed that each work is stylistically distinct from the others to quite a large degree. we then traced the paths of words changing frequency, position, and meaning. through close readings of particular elements, we tried to indicate how these changes and differences need not only indicate external influences on the writing process but can also be emblematic of a strategic development within the visionary works as they evolve from didactic treatise to abstract philosophy. beginning with the small words, we looked at the frequencies of modal particles, comparative conjunctions, causal and interrogative adverbs that introduce exegetical explanations. these kinds of words were a distinctive marker of change in our computational analysis. behind some of their functionalities we have been able to ascertain a systematicity and a rationale which complicate any claim that they are solely ‘coincidental’ or ‘unconscious’ imports of hildegard’s assistants.

schröder, ‘ezechielrezeption’, p. ; dronke, ‘the allegorical world-picture’, p. .
meier, ‘zwei modelle’, p. .
sciuias relies on a question–response structure where an instructor directly addresses a readership, and where exegesis functions largely through deictic one-to-one relationships. liber diuinorum operum dispenses with this didactic style and its corresponding word use. the exegetical formulas which introduce biblical exegesis in each of the treatises are a particularly interesting case by which to demonstrate how hildegard deals with change. variation of these formulas occurs only en bloc, which reinforces rather than diminishes the monolithic voice of the visionary works. moreover, the change of formulas seems to be involved in the evolution from a question-and-answer format to the dry style of a treatise, evolving from a deictic quid est hoc? to admonishments to probe the deeper sense of something. we continued with an analysis of the variety and decrease in vocabulary richness throughout the three texts. we showed how the variety in vocabulary richness can often be attributed to conscious variations in style bound to a particular thematic focus. the decrease of richness over the course of the trilogy is a more complex case, and one we decided to elucidate through the study of some philosophical terminology. the remarkable results of this analysis indicate that decreasing vocabulary richness goes hand in hand with deliberate semantic shifts. we apparently have to do with an author who greatly evolved in thought and in the ways by which she put that thought into language. interestingly, it appears that the more hildegard’s thought gained complexity, the fewer words she used. moreover, this evolution appears to go hand in hand with the didactic-exegetical rationale, where the word reaches a level of abstraction and distance that would only have been comprehensible for the advanced reader. 
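the vocabulary-richness measure referred to above can be illustrated with a type-token ratio (ttr): the number of distinct word forms divided by the total number of tokens. what follows is only a minimal sketch in python, not the procedure used for the analysis itself; the tokenizer, the window size, and the sample sentence are assumptions chosen for illustration, and no lemmatization is attempted.

```python
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters (a crude stand-in for lemmatization)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens; 0.0 for an empty sample."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def windowed_ttr(tokens, window):
    """TTR of consecutive, equally sized samples, so texts of different length stay comparable."""
    return [
        type_token_ratio(tokens[i:i + window])
        for i in range(0, len(tokens) - window + 1, window)
    ]


# Illustrative sample only: nine tokens, of which "et" occurs twice.
sample = "et verbum caro factum est et habitavit in nobis"
tokens = tokenize(sample)
ttr = type_token_ratio(tokens)
```

because a raw ttr falls as a text grows longer, the windowed variant compares samples of equal size; the mean of those values is the kind of figure a dashed mean line in a plot would summarize.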
there is a development in didactic strategy from the deictic character of sciuias to the intellectual qualities of the liber diuinorum operum, where one could say that sciuias is the minor to the much more abstract maior liber diuinorum operum. indeed, hildegard’s striving for a philosophical clarity has not made her an easier read. the exact meanings of words are not explicated to the reader. a helpful way to understand the changing semantics of hildegard’s words is by looking at ‘word’ itself. whereas in sciuias, verbum denotes only spoken communication, in liber diuinorum operum verbum is contrasted with nomen, which denotes the word that is thought. in a long intellectual tradition which distinguishes between the linguistic word and the metaphorical ‘inner word’, the latter consists of pure insight into the essential reality of that which a word denotes. we suggest that this is what hildegard tries to accomplish in her later books by using words the way she does, namely with the utmost precision and by trying to make form and meaning line up. whereas sciuias follows the model of spoken language in using a wide array of words with different meanings, liber diuinorum operum interiorizes the word by using an intellectual and philosophical language, a theological language, too, which imitates the univocal act of the word about which john says ‘et verbum caro factum est’. in conclusion, this article has attempted to pair a broad, computational view on variation, change, and development in hildegard of bingen’s visionary trilogy with a close analysis of occurrences of variation and change in the works. we have been able to link our stylometric data to a didactic programme, to a variation in high and low style, and to a refinement of philosophical ideas. moreover, there is an interesting link between the way in which the text construes a monolithic voice and the way it deals with variation and change.
we hope that the preliminary discussion presented here can make an argument for studying change and development in the works of hildegard in a more comprehensive way. the aim of further scholarship should be to read these texts as being the product of collaborative authorship but not therefore lacking a proper internal voice, coherence, and development.

ghent university

shimizu tetsuro, ‘words and concepts in anselm and abelard’, in langage, sciences, philosophie au xiie siècle, ed. by joël biard (paris: vrin, ), pp. – ; le langage mental du moyen Âge à l’âge classique, ed. by joël biard (louvain: peeters, ); luisa valente, ‘verbum mentis — vox clamantis: the notion of the mental word in twelfth-century theology’, in the word in medieval logic, theology and psychology, ed. by tetsuro shimizu and charles burnett (turnhout: brepols, ), pp. – ; martin lenz, ‘mental language’, in the oxford handbook of medieval philosophy, ed. by john marenbon (new york: oxford university press, ), pp. – .

[figure . timeline of the trilogy’s composition process]

[figure . network graph generated by means of k nearest neighbour algorithm, connecting the lemmas with the highest coefficient of variation by their frequency vectors. created with gephi (the algorithm used was force atlas , embedded in gephi, an open-source tool for network manipulation and visualization; see mathieu jacomy and others, ‘forceatlas , a continuous graph layout algorithm for handy network visualization designed for the gephi software’, plos one, . ( ), – ). algorithm = ‘ball tree’; number of neighbours = .]

[figure . principal components analysis plot]

[figure . network graph generated by means of k nearest neighbour algorithm, connecting the lemmas with the highest coefficient of variation by their frequency vectors. created with gephi (the algorithm used was force atlas , embedded in gephi, an open-source tool for network manipulation and visualization; see mathieu jacomy and others, ‘forceatlas , a continuous graph layout algorithm for handy network visualization designed for the gephi software’, plos one, . ( ), – ). algorithm = ‘ball tree’; number of neighbours = .]

[figure . type-token ratio (ttr), yielding figures of hildegard’s decreasing lexical richness. the dashed line indicates the mean. created with matplotlib (john d. hunter, ‘matplotlib: a d graphics environment’, computing in science & engineering, . ( ), – ).]

research article

the ‘assertive edition’
on the consequences of digital methods in scholarly editing for historians

georg vogeler

published online: may
© springer nature switzerland ag

abstract
the paper describes the special interest among historians in scholarly editing and the resulting editorial practice in contrast to the methods applied by pure philological textual criticism. the interest in historical ‘facts’ suggests methods the goal of which is to create formal representations of the information conveyed by the text in structured databases. this can be achieved with rdf representations of statements extracted from the text, by automatic information extraction methods, or by hand.
the paper suggests the use of embedded rdf representations in tei markup, following the practice in several recent projects, and it concludes with a proposal for a definition of the ‘assertive edition’.

keywords: digital scholarly edition · history · rdf (resource description framework) · semantic web · historical documents · critical edition · tei (text encoding initiative)

international journal of digital humanities ( ) : –
https://doi.org/ . /s - - -
* georg vogeler, georg.vogeler@uni-graz.at, zentrum für informationsmodellierung, universität graz, graz, austria

introduction

the approach to scholarly editing used by historians differs from the approach used by literary scholars. both historians and literary scholars share an interest in a good text created by textual criticism, as texts are the main sources on which historians draw in their constructions of narratives about history. nevertheless, historians can have a slightly different approach to text: linguistic and physical aspects are considered mere intermediates to the information conveyed by the text. historians consider the content of the text ‘data’, and they want to use this data in their research to gain knowledge about the past. the circumstances under which archival documentation, a major type of text with which historians work, was created support this perception of text: people recorded administrative activities in text to preserve information about these activities for contemporary but absent clerks or for future clerks. in other words, they stored data in texts written on paper. in pre-digital editorial practice this can lead to decisions which are unacceptable to literary scholars, such as paraphrasing parts of the text. i will try to show that in digital scholarly editing the approach to editing used by historians can be reconciled with methods of textual scholarship.
i suggest calling this combined method ‘assertive editing’ to avoid the impression that this method can only be used by historians. the method of assertive editing is not defined by disciplinary interests but by an interest in one facet of text: the information recorded. in terms of patrick sahle’s text wheel (sahle :iii, - ), the assertive edition is the editorial practice dedicated to the ‘text as content’ perspective. in the following i will usually contrast this ‘content’ with the ‘text’ as pure transcription and the result of text-critical work.

contributions to the assertive edition

assertive editing is fed by two streams in pre-digital and early digital scholarship. the ideas of content-oriented navigation, the possibility of multiple forms of representation, and extensive historical commentary are drawn from pre-digital editorial practice in historical research. i will try to show this by presenting three major german printed series of historical editions: the monumenta germaniae historica (mgh), the records of the early modern imperial diet (‘reichstagsakten’), and the official minutes from the imperial chancellery (‘akten der reichskanzlei’: bundesarchiv ).

pre-digital contributions

to facilitate navigation and reception, the editions in the mgh diplomata series prepend abstracts of the legal core to each document. this is common practice in european charter editions, and it was codified by a committee of historical editors under the direction of robert-henri bautier in (bautier : , ). more recent diplomata series editions dedicate a paragraph in the introduction of each document to the historical context (e.g. the charters of emperor frederic ii by koch - ). some mgh editions of historiographical texts indicate in the margins the year to which the current text refers (e.g. georg waitz’s edition of the historia danorum roskildensi, : - ). this helps readers find the events in which they are interested.
of course, abstracts serve more purposes than simple navigation. in the editions of the imperial diets, abstracts replace some of the documents (e.g. heil , - ). dietmar heil describes the interest of the editors: ‘the priority is … philological authenticity, but optimal accessibility’ ( , , trans. georg vogeler). they reduce historical orthography and change punctuation when it deviates from the modern syntactical analysis of the text (heil , - ). editors of correspondence have also considered this approach (steinecke ). editorial work in contemporary history is defined by the selection of significant material and contextualization of the text. the editors of the minutes of the cabinet of the german federal government (bundesarchiv ), for instance, explain their selection by relevance of content; they discard as irrelevant, for example, the agenda at the head of each minute, invitations, and their attachments. this content-oriented approach can be found in other editorial principles of this edition. orthographic and syntactic errors are emended without notice, for instance. single entries start with a heading, persons present, and the place and time of the meeting, not as a verbatim copy but as an extract created by the editors. the minutes also serve as an example of the third element of pre-digital editorial practice. they add extensive notes on the subjects of the meetings to each transcript with the aim of making the texts understandable. this kind of annotation is not specific to this one edition, but is generally recommended in historical editing (cullen ; stevens and burg , ). the edition of the minutes of the bundeskabinett serves primarily to illuminate government decisions, rather than their wording. similarly, many mgh editors add extensive comments on the historical context, e.g. in the pre-publication of anonymous continuations of frutolf - (marxreiter ). these approaches have been directly transferred into electronic editions.
the idea of facilitating the understanding of the text accepts translations as a way of editing. this leads to solutions like david postles’ online representation of stubbington medieval records (postles ), which gives the text in a translation of the original latin. this is not an individual practice, as p.d.a. harvey discusses in his introduction to historical translation as a method in editing ( , - ). from a historian’s point of view, a translation is a sensible solution, as it facilitates the use of the document. it would not satisfy the research interests of textual scholars. paul d.a. harvey argues that the edition of historical records can be reduced to a calendar of abstracts when the original or photocopies of the records are easily accessible ( , - ). several projects follow an approach of this kind. the edition of the records of the swiss foreign office (zala et al. – ) replaces the transcription with images. this calendar-plus-image approach is also used by soundtoll registers (veluwenkamp and van der woude ; gøbel ) and by peter rauscher and his colleagues in the donauhandel project ( - ). both create databases with structured information directly from the source and link it to images of the source.

early digital contributions

historians’ interests in the ‘facts’ and the dominance of sociological approaches to history in the s to s led them to create ‘databases’ of historical information (boonstra et al. ). a famous example of this approach is the online catasto of (herlihy et al. ), an online edition created by r. burr litchfield and anthony molho based upon david herlihy and christiane klapisch-zuber’s project census and property survey of florentine dominions in the province of tuscany, - ( ; herlihy ; herlihy ). the data keeps close to the source, copying the information on wealth recorded for each taxable household in the city (as it is found in the initial tax declarations of plus additions and adjustments made in and ).
seeing historical records as an accidental medial solution to preserve and process information, one could consider this database a simple change in recording medium, not in information itself. the needs of the recording medium, however, require substantial changes in the recording method. herlihy and klapisch-zuber had to create new encodings and had to break the text rigorously into table columns. in the end, the database tries to recreate the information recorded by the florentine officials, addressing three essential questions: who had to pay what amount of taxes for which kind of property. philological editors certainly cannot consider this database an edition. the encoders did not copy family names, names, and patronymics letter by letter, but standardized them and truncated them when they went beyond ten letters. historians were well aware of the modifications that database encoding made to the original records. in the s, however, digital scholarly editing was not yet developed enough to provide a solution. the concept of scholarly editing does not even appear in the more recent book on historical information science by lawrence mccrank ( ). at the time, computing methods in the historical sciences chiefly meant the production of relational databases and spreadsheets. in the s, manfred thaller proposed a historical database system that kept closer to the original source ( , , , ). he developed the clio database as a ‘source oriented’ database. it was meant to reduce the amount of encoding and transformation of the source that was customary. clio kept as much information from the source as possible by allowing for hierarchical organization of information, better representation of incomplete data, and integration of alternatives and comments. this source-oriented database approach is clearly a type of editorial work, combining text from the source with interpretation by and for historians.
at the same time, a philologist would regret the lack of a full transcription.

digital editions and facts

digital scholarly editing has developed since the days of clio and has built upon the methods developed for the mgh, the reichstagsakten, and the akten der reichskanzlei. the assertive edition developing out of these strands is something between pure textual representations and well-formed databases structured around specific research questions. no edition yet calls itself an assertive edition, but many bear features that fit the definition put forward here. a selection may be found by searching patrick sahle’s catalogue for ‘general subject area: history’ (sahle – ). browsing through the projects on the list, one can identify four major questions:

1. which interface elements are typical for an assertive edition?
2. how can we use automatic information extraction processes in the scholarly edition?
3. is semantic markup (provided by the tei) sufficient?
4. how can we integrate the web of data (the ‘semantic web’) into scholarly editions?

interface elements

editions like the letters of alfred escher (jung – ), the acta pacis westfalica (lanzinner and braun ), and the diplomatic correspondence of thomas bodley - (adams ) offer avenues of access to the text beyond the pre-existing textual structure. typically, tools include indices of persons, places, and subject keywords. other entry points to the texts show better what an assertive scholarly edition would concern itself with: apw, for instance, gives access via a timeline of events, a calendar of relevant dates, and a map. indeed, indices of persons, places, and events, as well as calendars and maps, are fast becoming default components for historical digital editions.
additional fact-oriented interface elements seem to depend more on the type of documents edited: rich prosopographical information, as found in correspondence, suggests using network visualisations, for instance in the diplomatic correspondence of thomas bodley (adams , visualisations). economic information suggests the use of bar charts to visualize income and expenditure, as in the case of the edition of the municipal accounts of basel - (burghartz , konten). the latter builds upon the source-oriented database approach advocated by manfred thaller by allowing the user to select entries from the accounts and collect them in a ‘data basket’ (burghartz , databasket). this allows the user to perform basic arithmetic operations and download the results as a spreadsheet. finally, semantic networks like those used in burkhardt source (ghelardi et al. ) hold some general promise, but for the moment they remain isolated solutions for single projects.

information extraction

the user interfaces, of course, are only the surface of the edition. how does one harvest information? what form does the information take as digital data? which models relate the information to the transcription? one approach to data harvesting from texts is automatic information extraction. computational linguists have been working on this since the s. their goal is to reduce free prose text to answers to the question ‘who did what to whom and when?’ and to represent these answers in a structured way. a typical information extraction pipeline starts with generic natural language processing steps and then uses named entity recognition to mark up the words representing persons, locations, or organizations, temporal data, and quantifying data. the pipeline then relates these entities to one another, building connections between the entities. this can take the form of predicates in sentences, coreferences by pronouns, etc.
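the two pipeline stages just described, entity recognition followed by relating the entities, can be sketched in a few lines of python. this is a deliberately naive illustration under stated assumptions, not a description of any existing tool: the gazetteer stands in for a trained named entity recogniser, the predicate list stands in for syntactic parsing, and the sample sentence (including its year) is invented.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteer; a real pipeline would use a trained named entity recogniser.
GAZETTEER = {
    "bertrand": "PERSON",
    "breslau": "LOCATION",
    "guards battalion": "ORGANIZATION",
}
YEAR = re.compile(r"\b1[0-9]{3}\b")


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int


def recognise_entities(sentence):
    """Mark up gazetteer matches and four-digit years, ordered by position in the sentence."""
    lowered = sentence.lower()
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", lowered):
            found.append(Entity(name, label, match.start()))
    for match in YEAR.finditer(sentence):
        found.append(Entity(match.group(), "DATE", match.start()))
    return sorted(found, key=lambda entity: entity.start)


def extract_relations(sentence, entities, predicates):
    """Link each PERSON to later non-date entities if a known predicate stands between them."""
    lowered = sentence.lower()
    relations = []
    for person in (e for e in entities if e.label == "PERSON"):
        for other in entities:
            if other.start <= person.start or other.label == "DATE":
                continue
            between = lowered[person.start + len(person.text):other.start]
            verbs = [p for p in predicates
                     if re.search(r"\b" + re.escape(p) + r"\b", between)]
            if verbs:
                relations.append((person.text, verbs[0], other.text))
    return relations


# Invented sample sentence, for illustration only.
sentence = "In 1757 Bertrand marched with the guards battalion to Breslau."
entities = recognise_entities(sentence)
relations = extract_relations(sentence, entities, predicates=["marched"])
```

the output is a list of (subject, predicate, object) tuples, which is already close to the structured ‘who did what to whom’ form; everything that makes the task hard for historical texts (spelling variation, coreference, implicit predicates) is left out of the sketch.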
the possible relationships can be inferred from external knowledge about the domain, like dates of birth and death for people mentioned in a text, or they can result from the semantic role, as inferred from the predicate in a sentence. the task is very domain-specific, as it depends on what type of information is considered relevant. a typical task for historical research could be event extraction, which is already applied to automatic news analysis (see grishman for a general introduction). recent projects dealing with us foreign affairs records have taken this approach to transcripts of archival documents. they take the historical records as source data without any intermediate scholarly processing. using ocr to create a digital representation of the text, scholars then apply distant reading methods like topic modelling or information extraction to this corpus (e.g. kaufmann – ). gao et al. ( ) have even used the electronic texts of the cables in the s for their computer-based analysis. the aim of implementing this approach in scholarly editions would be to create a reliable text with classical textual criticism and to extract from this text the information for historians. existing information extraction methods are built for modern texts, and thus they have to be modified to be applicable to historical texts or historical texts have to be modified to come closer to modern texts. piotrowski ( ) has described the many challenges in this task. some progress has been made, e.g. in the handling of variants in historical language, for instance by bryan jurish ( , , , ) or kestemont et al. ( ). however, most of the problems still remain to be solved. scholarly editors still have to rely on their own competence and on human labour for the introduction of substantial knowledge about what people in the past wrote in their texts.
tei and semantic markup

the problems computers still have with historical languages led to the decision to create manually annotated texts. digital editions use the extensible mark-up language xml to add semantic markup to texts. this is made possible in particular by the strong connection between the community maintaining the guidelines of the text encoding initiative (tei) and the community of digital scholarly editors. tei provides semantic annotation for many phenomena interesting to historians: names of persons, locations, or organizations can be encoded as <name>, temporal expressions as <date> and <time>. with the tei p there are even guidelines concerning how to encode structured descriptions of persons, places, and events, structures that are similar to database structures. still, the markup provided by the tei is deficient in ways of expressing historical information of interest in this present study. an example is the <event> element. the tei guidelines consider it a concept independent from text, to which text can refer. an expression like ‘my inauguration’ in ‘after my inauguration, i decided to leave the town’ is not an ‘event’, but should be encoded like any other referring string with the <rs> element. nevertheless, while places and persons have a dedicated <persname>/<placename> tagging, historians interested in marking up named events like ‘world war i’, the ‘battle of marathon’, the ‘coronation of charlemagne’, the ‘contract of maastricht’, or the ‘lisbon earthquake’ in their sources have to employ workarounds. this observation illuminates the distance between a major practice in digital scholarly editing and the research interests of historians. one reason for this might be that scholars agree much more easily on the identification of individual names of concrete persons, places, and organizations than on more abstract events.
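how such inline annotation can be harvested is shown in the following python sketch, which reads a small tei-style paragraph and lists the annotated strings with their attributes. the fragment is an assumption made for illustration: the @ref targets, the @type values on <rs>, the addressee, and the date are all invented, and only the sentence about the inauguration is taken from the discussion above.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: identifiers, the addressee, and the date are invented for illustration.
SAMPLE = f"""<p xmlns="{TEI_NS}">after <rs type="event" ref="#inauguration">my inauguration</rs>,
i decided to leave <rs type="place" ref="#town">the town</rs> and wrote to
<persName ref="#bertrand">bertrand</persName> on
<date when="1757-03-01">the first of march</date>.</p>"""

SEMANTIC_ELEMENTS = {"name", "persName", "placeName", "orgName", "date", "time", "rs"}


def collect_annotations(xml_text):
    """Return one record per semantic element, in document order."""
    root = ET.fromstring(xml_text)
    records = []
    for element in root.iter():
        local_name = element.tag.split("}", 1)[-1]  # drop the '{namespace}' prefix
        if local_name in SEMANTIC_ELEMENTS:
            records.append({
                "element": local_name,
                "text": "".join(element.itertext()).strip(),
                "attributes": dict(element.attrib),
            })
    return records


annotations = collect_annotations(SAMPLE)
```

the event in this sketch can only be reached through the generic <rs> element and a project-specific @type value, which is exactly the kind of workaround at issue: the record for ‘my inauguration’ carries no more built-in semantics than any other referring string.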
the sample events above have formal names (some, more than one), but text often describes events in a much looser way: ‘my inauguration as bishop’ is clearly an individual event, but one unlikely to have a formalized name. many events do not even bear names at all. rather, they are told as a story: ‘when hitler’s troops crossed the polish border on september in the year , world war ii started.’ this sentence clearly refers to the event ‘nazi invasion of poland’, but the event could just as easily be referred to as ‘start of world war ii’, or in many other ways. this example demonstrates that even these short identifiers are not just an arbitrary ‘name’. they create different contexts and are therefore part of a specific discourse.

http://www.tei-c.org/release/doc/tei-p -doc/en/html/ref-event.html

web of data: semantic markup by reference

linking different names for the same event is a typical competency of semantic web technologies as proposed by the w c since (berners-lee et al. ; rebranded as ‘web of data’ activities by w c ). the semantic web uses abstract unique identifiers (uris) as representations of the concepts covered by the name. with uris, scholars can create digital representations of events without relying on ambiguous natural language terms. an increasing number of digital scholarly editions use semantic web technologies to solve naming issues. the most prominent method is the extension of classical indices: while previously, such indices standardized names to represent the historical fact behind a name for a person or a place, uris allow identifying persons, places, and organisations for technical processing, even if there is no name. gautier poupeau described this approach in , and the digital edition of the fine rolls of king henry iii, created – (ciula et al. ), made extensive use of these technologies in its back-end.
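the principle of identifying by reference rather than by name can be made concrete with a very small python sketch. the uris below are hypothetical placeholders under example.org, not identifiers from any authority file, and the lookup table stands in for what an index or a linked-data service would provide.

```python
# Hypothetical identifiers; a real edition would point to an authority file or its own index.
EVENT_URIS = {
    "nazi invasion of poland": "http://example.org/event/invasion-of-poland",
    "start of world war ii": "http://example.org/event/invasion-of-poland",
    "lisbon earthquake": "http://example.org/event/lisbon-earthquake",
}


def resolve(label):
    """Map a label found in a text to its URI, ignoring case and extra whitespace."""
    normalized = " ".join(label.lower().split())
    return EVENT_URIS.get(normalized)


def same_event(label_a, label_b):
    """Two labels denote the same event if both resolve to the same URI."""
    uri_a, uri_b = resolve(label_a), resolve(label_b)
    return uri_a is not None and uri_a == uri_b
```

for example, `same_event("Nazi invasion of Poland", "start of World War II")` holds although the two labels share no wording and frame the event differently, while a label that is absent from the table, such as ‘battle of marathon’, simply resolves to nothing. the labels stay part of their respective discourse; only the identifier is shared.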
a good example of the use of semantic web technologies in scholarly editions is the teutsche academie der bau-, bild- und mahlerey-künste, by joachim von sandrart (kirchner et al. – ). the text refers to many artists and artistic objects, which are identified and described in the index and can be downloaded from the site as an rdf dataset. a more extensive formalisation than the index approach is demonstrated by the old bailey project (hitchcock et al. – ). the basic transcription of the text was annotated in xml in order to facilitate structured searching and statistical analysis. this approach works because the records already tend to have a regular structure. the meaning of particular words or phrases like names and crimes is tagged and further sorted into subcategories like types of verdict. the final encoding of the texts contains formal descriptions of the relationships established by the markup. they are processed in a separate database, but they are also kept together with the text in the xml. old bailey online is not just a database of criminal trials, but an assertive scholarly edition, representing the statements made by the transcription in a formal way and linking the statements to the transcription and to the image. following semantic web / web of data activities of the w c, the digital representation of data is increasingly realized through rdf triples. in the context of the assertive edition, they have the advantage that they model facts as statements about reality in a simple but expressive way, as they can be read as subject-predicate-object propositions. parallel to the development of embedded annotation with xml, digital humanities has developed methods for stand-off annotation. since , stand-off annotation has been increasingly realized with rdf. a standard for this annotation has been found in the open annotation vocabulary (sanderson et al. ). digital editions have made use of this possibility. pundit (grassi et al.
; morbidoni and piccioli ; andreini et al. ; net ) is the most advanced application of the semantic web to digital scholarly editions, used for example in the scholarly edition of the correspondence of jacob burckhardt (ghelardi et al. ). it allows annotation of any part of the text. textual fragments can be used as the subjects or objects of an rdf triple. in the jacob burckhardt edition, pundit reduces the possible predicates to references to artworks and artists, general comments, quotations and references, dates, and geographical identification. pundit saves this as an rdf reference to the html elements. work is underway on linking the annotation directly to the xml/tei source. in the end, the semantic networks, which are a unique interaction feature of this digital edition, can describe the content of the text through direct links to the part of the source text that contains the information.

http://ta.sandrart.net/data/ and single expressions via the rest api of the project at http://ta.sandrart.net/de/info/services/rest/, so http://ta.sandrart.net/services/rdf/person/ returns the rdf data for philipp melanchthon, for example.
https://www.oldbaileyonline.org/static/project.jsp#mark-up

how to combine transcription with databases?

looking forward, a number of questions arise: can we build scholarly editions which include results similar to those created by information extraction software but controlled by hand, thus bringing the full power of human understanding to the annotation? can we encode the propositions made by the words of a text into the transcription?
can we embed the statements extracted by the reader into the sequence of characters and thus create a single digital resource representing the transcription and the information conveyed by it to the editor? if so, how?

one possible approach is suggested by rdfa, the w3c’s proposed serialization of rdf embedded in html markup. it provides attributes for html elements describing rdf triples. existing html element attributes like @href or @src can be used as objects in the ‘subject predicate object’ triple structure. additional attributes like @typeof, @resource and @property permit some of the full expressiveness of rdf.

listing a: example of a sentence in rdfa encoding

<p xmlns="http://www.w .org/ /xhtml">
<span resource="gnd:gleim">bertrand marched out with the
<span property="ex:ispartof" resource="ex:guards">guards battalion to
<span property="ex:marchto" resource="geo:wrocław">breslau</span></span>
and writes
<span property="ex:writes" resource="_:briefe"><span property="ex:emotionalquality" resource="ex:serene">gay</span> letters</span></span></p>

listing b: triples extracted from the sentence (in turtle/n3 notation)

gnd:bertrand ex:ispartof ex:guards ;
    ex:marchto geo:wrocław ;
    ex:writes [ ex:emotionalquality ex:serene ] .

the listing demonstrates which triples can be extracted from a sentence in a fictive letter by using rdfa markup as semantic annotation. this method is attractive, as it closely relates the assertive expression to the text.

how might tei be similarly extended? the standardized markup of the tei covers some typical basic facts that might be extracted from texts. however, assertive annotation can be much richer and highly diverse. something more flexible is needed. therefore, i would suggest transferring the rdfa approach to tei, creating a ‘teia’ annotation style. the tei-community has already discussed the idea of directly importing the rdfa attributes into the tei (tei-community , ), but it was argued convincingly that a foreign namespace is not controllable by tei and is therefore not recommended. fortunately, tei provides attributes which cover much of a teia approach: @ref creates a link from a verbal expression to an entity, and @ana links textual fragments to any kind of analytical annotation. as the tei guidelines reduce the use of @ref to reference strings, the globally usable @ana seems to be the best candidate for a generic linking of textual fragments to rdf triple structures describing the relevant facts.

https://net .github.io/pundit /xmltei.html

the système modulaire de la gestion d’information historique (symogih), which was developed by francesco beretta and his team (beretta and vernus ; beretta et al. ), makes use of rdf-based semantic markup in combination with tei transcriptions. in the edition of the journal of léonard michon (letricot ), for instance, the transcribed texts are accompanied by a marginal index with short notes on events, facts, and persons. they are formalisations of the very text, e.g. ‘le roy luy a envoyé à marseille monsieur de saint olon, gentilhomme ordinaire, qui le suivra jusqu’à paris’ is represented by a descriptive text ‘françois pidou de saint olon accompagne l'ambassadeur perse de marseille à versailles’ and the people involved in the event. this information is represented as an rdf statement (http://symogih.org/resource/info ) about the two persons involved. the annotation is encoded with the tei, and the global attribute @ana links the text to a database of the formalized description of the content (beretta ).

other examples of this approach are found in projects realized at the zentrum für informationsmodellierung at the university of graz in cooperation with the historical department at the university of basel.
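how @ana can tie a textual fragment to a set of rdf-style statements may be easier to see in a small python sketch. everything specific in it is an assumption made for illustration: the tei fragment, the pointer values (#fact1, #fact2), the ex: and geo: names, and the in-memory dictionary standing in for a database are invented and are not taken from symogih, gams, or any other project.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# hypothetical tei fragment: the @ana pointers are invented for this sketch
TEI_SAMPLE = f"""
<p xmlns="{TEI_NS}">
  <seg ana="#fact1">bertrand marched out with the guards battalion to breslau</seg>
  and <seg ana="#fact2">writes gay letters</seg>
</p>
""".strip()

# stand-in for the external store of statements that @ana points to
FACTS = {
    "fact1": [("ex:bertrand", "ex:marchto", "geo:wrocław")],
    "fact2": [("ex:bertrand", "ex:writes", "ex:letters")],
}


def linked_statements(tei_xml, facts):
    """pair every text fragment carrying @ana with the triples it points to."""
    root = ET.fromstring(tei_xml)
    result = []
    for element in root.iter():
        # @ana may hold several whitespace-separated pointers
        for pointer in element.get("ana", "").split():
            key = pointer.lstrip("#")
            # normalise whitespace in the text content of the element
            text = " ".join("".join(element.itertext()).split())
            for triple in facts.get(key, []):
                result.append((text, triple))
    return result
```

calling linked_statements(TEI_SAMPLE, FACTS) yields one (fragment, triple) pair per pointer, so the transcription stays plain tei while the statements live outside it.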
susanna burghartz’s team created transcriptions of two sets of administrative records from the city of basel from the early modern period: the annual accounts of the city from to (burghartz ) and a criminal court record, the ‘urfehdebuch’ (register of oath of truce) from - (burghartz et al. ). while digital humanities projects related to the early modern period focus very often on handling the specific properties of early modern texts (nelson and terras ; estill et al. ), the basel editions can be considered assertive editions. both projects are realized in a very flexible technical environment, the gams (steiner and stigler – ), which is a framework for archiving and publication of humanities data sources, in particular digital scholarly editions.

in the jahrrechnung der stadt basel the core information unit addressed is clear: the monetary amount of a single transaction, as transmitted by the historical accountant, i.e. his rubrics (vogeler a; vogeler b; cf. vogeler for a deeper discussion of editorial methods appropriate for historical accounts). however, even this simple criterion needs interpretation: repaid loans and interest are mixed in one common category of income. for a financial analysis this is unacceptable; accordingly, stand-off annotation is used to apply sub-categories to individual entries.

in the case of the urfehdebuch the main category is, as is true of the old bailey records, a single case. at least one core property of the data structure is already represented by the textual structure in the archival manuscript: the heading gives the name of the offender.
however, type of offence, victim, and punishment have to be extracted from the text and are encoded with links to a taxonomy developed for the project (pollin and vogeler ).

http://symogih.org/
http://journal-michon.symogih.org/documents/document-text.html?id=diob &volume_number= &page_number=
http://gams.uni-graz.at

embedding the interpretation of facts into the text seems straightforward, but it has several drawbacks. the examples above have shown that we need at least links to external knowledge organization systems and the translation into full rdf triples to express enough of the content of the texts. information science teaches us to go even further and to include time in the relationship between data and information, i.e. between edited text and the facts the historian considers to be represented by the text. börje langefors ( / ) has formulated his ‘infological equation’, according to which the information is a function of the data, the recovering structure, and the time when the interpretation takes place. this conceptualization of information argues in favour of stand-off annotation, as the semantic value of an edited text is an interpretation by the editor. in fact, there is a long-standing discussion in the text encoding community on the risks of embedded semantic markup, summarized by thaller and buzzetti ( ). standardized technical solutions for this approach do not yet exist. rdf has established itself as a common data structure for the exchange of the factual interpretations of text. the question of how to maintain the linkage between the text edited and the rdf is still under discussion.
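a stand-off annotation that records who interpreted a passage and when could look like the following python sketch. it is not a standardized solution, which, as noted, does not yet exist: the record layout, the character-offset addressing, and all names are assumptions chosen for illustration, and the date in any usage is made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StandOffAnnotation:
    start: int            # character offset where the annotated fragment begins
    end: int              # offset just past the end of the fragment
    triple: tuple         # (subject, predicate, object) asserted by the editor
    editor: str           # who interpreted the text
    interpreted_on: date  # when the interpretation took place


def annotate(text, fragment, triple, editor, interpreted_on):
    """create an annotation pointing at the first occurrence of fragment."""
    start = text.index(fragment)  # raises ValueError if the fragment is absent
    return StandOffAnnotation(start, start + len(fragment), triple, editor, interpreted_on)


def fragment_of(text, annotation):
    """recover the annotated fragment from the untouched transcription."""
    return text[annotation.start:annotation.end]
```

because the transcription itself is never modified, several editors can attach differing statements to the same offsets at different times, which is the point of keeping interpretation apart from the edited text.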
conclusion

all of this leads to the following description of assertive editions: they are scholarly representations of historical documents in which the information on facts asserted by the transcription is the focus of editorial work. they help the user/reader understand the text and use the information conveyed in the text as structured data. this data includes interpretations of the text based on the context and the expertise of the editor. in fact, interpretation is part of the core of the critical activity of the editor. she concludes on the basis of her knowledge about the written text, its layout, and the historical circumstances under which it was produced how to describe the content beyond pure transcription. this can include normalization, categorization, reference to external resources, formal knowledge representations, and many other forms of transformation.

the assertive edition is not a well-defined type of scholarly editing yet. however, assertive editions exist. the methods according to which they are created, modelled, and made available online are becoming part of scholarship. digital assertive editions can be identified by the user interface and in the data structures, which try to combine the transcription with a database of statements made in the text. on the one hand, a few historians have already implemented the concept. it allows them to employ source-oriented critical methods while working with large amounts of data. the majority of historians still focus on the structured data extracted from the sources. databases are their major tool, often employing rich interfaces and elaborate visualizations. the majority of scholarly editors, on the other hand, employ traditional methods of textual scholarship; they ponder complex transcription problems, evaluate variants, and include textual materiality. the combination (deep links between structured data and text with assertive editing) is still rare.
one reason for this is the limited technological ability to realize such links. tools like pundit, frameworks like symogih or gams, and best practice examples like the projects cited above are steps in the process of addressing this.

acknowledgements

open access funding provided by university of graz. this text profited very much from the suggestions of the reviewers, my colleagues at the institut für dokumentologie und editorik, and in particular the hard work of sean winslow, for which i am deeply grateful.

open access this article is distributed under the terms of the creative commons attribution . international license (http://creativecommons.org/licenses/by/ . /), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made.

references

all urls were last checked on apr. , and if possible archived via the web archiving service of http://archive.org.

adams, r. (ed.) ( ). the diplomatic correspondence of thomas bodley. centre for editing lives and letters. retrieved from http://www.livesandletters.ac.uk/bodley/bodley.html.
andreini, g., di donato, f., giacomi, d., giusti, e., & masotti, r. ( ). pundit. semantic annotation for digital humanities. in digital humanities : conference abstracts (pp. – ). kraków: jagiellonian university & pedagogical university. retrieved from http://dh .adho.org/abstracts/ .
bautier, r.-h. ( ). normalisation internationale des methodes de publication des documents latins du moyen age, colloque de barcelone, - octobre . in: bulletin philologique et historique, pp. – .
beretta, f. ( ). the symogih.org project and tei: encoding structured historical data in xml texts. text encoding initiative conference and members’ meeting . connect, animate, innovate., oct , lyon, france. retrieved from https://halshs.archives-ouvertes.fr/halshs- .
beretta, f., & vernus, p.
( ). le projet symogih et la modélisation de l'information: une opération scientifique au service de l'histoire. les carnets du larhra , pp. – .
beretta, f., butez, c. c., carpentier, a., & delcourte, marie ( ). reconstituer les évolutions des espaces forestiers de l'avesnois aux xive – xviiie siècles. approches méthodologiques. bulletin du centre d’études médiévales d’auxerre | bucema, hors-série , retrieved from http://cem.revues.org/ .
berners-lee, t., hendler, j., & lassila, o. ( ). the semantic web. the scientific american, ( ), – .
boonstra, o., breure, l., & doorn, p. ( ). past, present and future of historical information science. amsterdam: dans. retrieved from http://www.oapen.org/search?identifier= .
bundesarchiv ( ff.). die kabinettsprotokolle der bundesregierung, munich. retrieved from http://www.bundesarchiv.de/cocoon/barch/ /index.html.
burghartz, s. (ed.) ( ). jahrrechnungen der stadt basel – – digital. basel/graz. retrieved from http://hdl.handle.net/ / . .
burghartz, s., calvi, s., & vogeler, g. (eds.) ( ). urfehdebücher der stadt basel – digitale edition. basel/graz. retrieved from http://hdl.handle.net/ / . .
ciula, a., spence, p., & veira, j. m. ( ). expressing complex associations in medieval historical documents. the henry iii fine rolls project. literary and linguistic computing, , – . retrieved from http://www.finerollshenry .org.uk/.
cullen, c. t. ( ). principles of annotation in editing historical documents; or, how to avoid breaking the butterfly on the wheel of scholarship. in vogt, g. l., & bush jones, j. (eds.), literary and historical editing (pp. – ). kansas: univ. of kansas libraries (university of kansas publications library series, ).
estill, l., jakacki, d. k., & ullyot, m. ( ). early modern studies after the digital turn. toronto: arizona center for medieval and renaissance studies.
gao, y., goetz, j., mazumder, r., & connelly, m. ( ). mining events with declassified diplomatic documents, , december .
retrieved from https://arxiv.org/abs/ . .
ghelardi, m. et al. ( ). burckhardt source. retrieved from http://burckhardtsource.org/.
gøbel, e. ( ). the sound toll registers online project, – . international journal of maritime history, xxii( ), – .
grassi, m., morbidoni, c., nucci, m., fonda, s., & piazza, f. ( ). pundit: augmenting web contents with semantics. literary and linguistic computing, , – . https://doi.org/ . /llc/fqt .
grishman, r. ( ). information extraction. in mitkov, r. (ed.), the oxford handbook of computational linguistics, nd edition (chapter ). oxford university press. https://doi.org/ . /oxfordhb/ . . .
harvey, p.d.a. ( ). editing historical records. london: british library.
heil, d. (ed.) ( ). der reichstag zu konstanz . munich: historische kommission (deutsche reichstagsakten / mittlere reihe / deutsche reichstagsakten unter maximilian i. ). retrieved from http://reichstagsakten.de/index.php?vol=rta .
heil, d. ( ). per aspera ad acta. ein werkstattbericht zur edition der deutschen reichstagsakten aus der zeit kaiser maximilians i. in wolgast, e., göttingen (eds.), nit wenig verwunderns und nachdenkens. die 'reichstagsakten – mittlere reihe' in edition und forschung. schriftenreihe der historischen kommission bei der bayerischen akademie der wissenschaften (pp. – ). https://doi.org/ . / . .
herlihy, d. ( ). direct and indirect taxation in tuscan urban finance, ca. – .
finances et comptabilité urbaines du xiiie au xvie siècle. actes du colloque international blankenberge , september – . brussels: pro civitate (historische uitgaven in- ), pp. – .
herlihy, d. ( ). medieval and renaissance pistoia. the social history of an italian town, – . new haven: yale univ. press.
herlihy, d., & klapisch-zuber, c. ( ). les toscans et leurs familles. paris: fondation nationale des sciences politiques.
herlihy, d., klapisch-zuber, c., litchfield, r. b., & molho, a. (eds.) ( ). online catasto of . version . . [machine readable data file based on d. herlihy and c. klapisch-zuber, census and property survey of florentine domains in the province of tuscany, – .] florentine renaissance resources/stg: brown university, providence, r.i., . retrieved from http://cds.library.brown.edu/projects/catasto/.
hitchcock, t., shoemaker, r., emsley, c., howard, s., & mclaughlin, j. et al. ( – ). the old bailey proceedings online, – . retrieved from http://www.oldbaileyonline.org.
jung, j. (ed.) ( – ). alfred escher-briefedition. retrieved from https://www.briefedition.alfred-escher.ch/.
jurish, b. ( ). finding canonical forms for historical german text. in storrer, geyken, siebert, & würzner (eds.), text resources and lexical knowledge (pp. – ). proceedings konvens. berlin: de gruyter.
jurish, b. ( ). more than words. using token context to improve canonicalization of historical german. ldv forum – ldv (pp. – ). retrieved from http://www.jlcl.org/ _heft /bryan_jurish.pdf.
jurish, b. ( ). finite-state canonicalization techniques for historical german. potsdam: universität potsdam. retrieved from http://nbn-resolving.de/urn:nbn:de:kobv: -opus- .
jurish, b. ( ). canonicalizing the deutsches textarchiv. in hafemann, ingelore (ed.): perspektiven einer corpusbasierten historischen linguistik und philologie. internationale tagung des akademienvorhabens ‘altägyptisches wörterbuch’ an der bbaw, , december – . berlin (thesaurus linguae aegyptiae ), pp. – .
kaufmann, m.
( – ). ‘everything on paper will be used against me’: quantifying kissinger. retrieved from http://blog.quantifyingkissinger.com/.
kestemont, m., de pauw, g., van nie, r., & daelemans, w. ( ). lemmatisation for variation-rich languages using deep learning. digital scholarship in the humanities, ( ), – . https://doi.org/ . /llc/fqw .
kirchner, t., nova, a., blüm, c., schreurs, a., & wübbena, t. (eds.) ( – ). sandrart, j. von: teutsche academie der bau-, bild- und mahlerey-künste, nürnberg / / . retrieved from http://ta.sandrart.net/.
koch, w. (ed.) ( – ). die urkunden friedrichs ii., currently vols. hannover. (mgh dd ).
langefors, b. ( ). theoretical analysis of information systems, studentlitteratur, auerbacher.
lanzinner, m., & braun, g. (eds.) ( ). acta pacis westfalica digital. retrieved from http://apw.digitale-sammlungen.de/.
letricot, r. (ed.) ( ). Édition critique numérique des mémoires de léonard michon, université lyon , larhra (cnrs umr ). retrieved from http://journal-michon.symogih.org/index.html.
marxreiter, b. ( ). frutolfi chronici continuationes anonymae ad annum et ad annum , mgh scriptores xxxiii, digitale vorabedition, url: http://www.mgh.de/fileadmin/downloads/pdf/bamberger_weltchronistik/continuatio_i_und_ii/satzlauf_ - - t /frutolf-fortsetzungen_bis_ _und_ _satzlauf_ - - t .pdf.
mccrank, l. j. ( ). historical information science. an emerging unidiscipline. medford: information today.
morbidoni, c., & piccioli, a. ( ). curating a document collection via crowdsourcing with pundit . . in gandon, f., guéret, c., villata, s., breslin, j., faron-zucker, c., & zimmermann, a. (eds.), the semantic
web: eswc satellite events. eswc . lecture notes in computer science, vol. . springer, cham.
nelson, b. h., & terras, m. (eds.) ( ). digitizing medieval and early modern material culture. new technologies in medieval and renaissance studies, tempe (medieval and renaissance texts and studies ; new technologies in medieval and renaissance studies ).
piotrowski, m. ( ). natural language processing for historical texts. san rafael, ca. (synthesis lectures on human language technologies).
pollin, c., & vogeler, g. ( ). semantically enriched historical data. drawing on the example of the digital edition of the ‘urfehdebücher der stadt basel’. whise . workshop on humanities in the semantic web, proceedings of the second workshop on humanities in the semantic web (whise ii), co-located with th international semantic web conference (iswc ). in adamou, a., daga, e., & isaksen, l. (eds.), ceur-workshop proceedings , pp. – .
postles, d. (ed.) ( ). stubbington account rolls from c. and others. retrieved from http://www.historicalresources.myzen.co.uk/stubb/ prelim.html.
poupeau, g. ( ). de l'index nominum à l'ontologie. comment mettre en lumière les réseaux sociaux dans les corpus historiques numériques? in digital humanities . the first adho international conference: conference abstracts (pp. – ). paris: université paris-sorbonne.
rauscher, p., & serles, a. (eds.) ( – ). der donauhandel.
quellen zur österreichischen wirtschaftsgeschichte des . und . jahrhunderts. retrieved from http://www.univie.ac.at/donauhandel/.
sahle, p. ( – ). a catalog of digital scholarly editions. retrieved from http://www.digitale-edition.de/.
sahle, p. ( ). digitale editionsformen, norderstedt (schriften des instituts für dokumentologie und editorik – ).
sanderson, r., ciccarese, p., & van de sompel, h. (eds.) ( ). open annotation data model, , february . retrieved from http://openannotation.org/spec/core/.
steinecke, h. ( ). brief-regesten. theorie und praxis einer neuen editionsform. zeitschrift für deutsche philologie, , – .
steiner, e., & stigler, j. ( – ). gams and cirilo client: policies, documentation and tutorial. http://gams.uni-graz.at/o:gams.doku.
stevens, m. e., & burg, s. b. (eds.). ( ). editing historical documents: a handbook of practice. oxford: altamira press.
tei-community ( ). rdf and tei. http://tei-l. .n .nabble.com/rdf-and-tei-xml-tt .html.
tei-community ( ). tei and rdfa. http://tei-l. .n .nabble.com/tei-and-rdfa-was-re-saws-and-lod-was-re-cross-references-among-segs-in-tei-tt .html.
thaller, m. ( ). automation on parnassus clio – a databank oriented system for historians. historical social research, ( ), – .
thaller, m. ( ). gibt es eine fachspezifische datenverarbeitung in den historischen wissenschaften? quellenbanktechniken in der geschichtswissenschaft. geschichtswissenschaft und elektronische datenverarbeitung. kaufhold, k. h., & schneider, j. wiesbaden (eds.), beiträge zur wirtschafts- und sozialgeschichte , pp. – .
thaller, m. ( ). the historical workstation project. histoire et informatique. smets, j. (ed.), ve congrès ‘history & computing’ – septembre à montpellier, montpellier, pp. – .
thaller, m. ( ). kleio. a database system, st. katharinen: scripta mercaturae (halbgraue reihe zur historischen fachinformatik: serie b ).
thaller, m., & buzzetti, d. ( ). beyond embedded mark-up. digital humanities .
retrieved from http://www.dh .uni-hamburg.de/conference/programme/abstracts/beyond-embedded-mark-up. .html.
veluwenkamp, j. w., & van der woude, s. ( ff.). soundtoll registers online. http://www.soundtoll.nl.
vogeler, g. ( a). digitale edition von wirtschafts- und rechnungsbüchern. in gleba, g., & petersen, n. (eds.), wirtschafts- und rechnungsbücher des mittelalters und der frühen neuzeit (pp. – ). göttingen. https://doi.org/ . /gup - .
vogeler, g. ( b). warum werden mittelalterliche und frühneuzeitliche rechnungsbücher eigentlich nicht digital ediert? grenzen und möglichkeiten der digital humanities. baum, c. & stäcker, t. (eds.), wolfenbüttel (zfdg - sonderband ). retrieved from http://zfdg.de/sb _ .
vogeler, g. ( ). the content of accounts and registers in their digital edition. xml/tei, spreadsheets, and semantic web technologies. in j. sarnowsky (ed.), konzeptionelle Überlegungen zur edition von rechnungen und amtsbüchern des späten mittelalters (pp. – ). göttingen: university press göttingen.
w3c ( ). w3c data activity building the web of data. retrieved from https://www.w .org/ /data/.
waitz, g. (ed.) ( ).
ex rerum danicarum scriptoribus saec. xii. et xiii. ex historiis islandicis. ex rerum polonicarum scriptoribus saec. xii. et xiii. ex rerum ungaricarum scriptoribus saec. xiii, hannover; (mgh scriptores ). retrieved from http://www.mgh.de/dmgh/resolving/mgh_ss_ .
zala, s. et al. (ed.) ( – ). diplomatische dokumente der schweiz. bern/zürich. https://www.dodis.ch/.

international journal of instruction
july ● vol. , no.
e-issn: - ● www.e-iji.net
p-issn: - x

use and mastery of virtual learning environment in brazilian open university

margarita victoria gomez
prof., nove de julho university, são paulo brazil; advanced programme in contemporary culture. federal university of rio de janeiro (pacc/frj), brazil, marvi@uninove.br

this paper describes and analyses the dynamics of the use and/or mastery of virtual learning environments (vles) by educators and students of the open university, an important part of the brazilian educational system. a questionnaire with items was answered by students/instructors/coordinators of the media in education and physics courses of two federal universities, between and early . the interview with a coordinator was transcribed and related to the data systematised in tables and graphs.
interpretative analysis, in an open dialogue with the references and with the data from the universidade aberta do brasil (uab - open university of brazil) site, resulted in the final considerations. these suggest that the use and/or mastery of vles by students is important, and that the specificities of these uses support studies and publications, which are still few in the literature in this area of knowledge. the work reflects the development of the open distance education system, conducted with strong popular participation, as a response to the challenge posed to the educational policies for expanding the public provision of higher education, also using vles for this purpose.

keywords: distance education, virtual learning environment, higher education, teacher, open university system, brazil.

introduction

technological innovations (castells, ) and the specific use of new devices have impacted research to the extent that it has started to respond to the challenges of science as an open enterprise. the royal society published the report ( ) science as an open enterprise: open data for open science, which systematised the facts that led to rethinking a model of science based on closed laboratories, processes with little transparency, and inaccessible publications. opening science may imply the use of free software, creative commons and boai (budapest open access initiative), and the free culture of open source, among others; this movement can be considered in the area of education and, more specifically, in brazil and in latin america. how can we make use of networked technological devices for research and for capacity building in education involving social and educational changes in favour of citizenship?
we currently believe it is possible for science, technology and education to be affected by innovations, opening up to popular needs and reflecting on the people who think of these uses in their own experience. it may be a total pantopia, as michel serres ( ) says, yet universities are opening up to provide the working class with capacity-building courses through virtual learning environments (vles) and open educational resources (oer) (palloff & pratt, ), which stand as an expression of changes going beyond the instrumental aspect, since they are both cultural and educational.

in academic practice, several software packages have been tested and used, such as toolbook ii, webct, universite, topclass, designer’s edge, first class, aulanet, learning space, teleduc, and moodle, which allow interaction and dialogue, synchronously or asynchronously. however, we still consider that we are at the beginning of an era in which vles and oers will be widely explored in favour of open and distance education (moore & kearsley, ; stiles, ; olnet-oer research - ).

universities re-territorialise lessons, laboratories, instructors and management towards virtual environments. by connecting pedagogical management, instructors, students, libraries, electronic publications, digitised learning objects, simulation laboratories, digital blackboards, ipads, tablets and social networks, a new dynamic is given to the learning process, which is reflected both in society and in public policies. sharples, mcandrew, ferguson, whitelock, among others, presented a report in , which updates the area of innovation in education: moocs (massive open online courses); badges to accredit learning; seamless learning; crowd learning; geo-learning; learning from gaming; exploiting the power of digital games for learning; digital scholarship; citizen inquiry.
in this quest to understand contemporary educational innovations better, in the / period, we systematised and analysed the dynamics of the use and/or mastery of vles by educators and students in open higher education, a relevant part of the brazilian national education system. the specificities of the study may inform the design of proposals for open education provision, as well as contribute to studies and publications, which are still scarce in the literature in this area of knowledge. the work reflects the pedagogy of virtuality as regards understanding virtual education systems, which develop as a rhizome (deleuze & guattari, ), with extensive popular participation responding to the challenge posed to educational policies.

context and review of literature

the open distance education system in brazil relies on ample popular participation, which constitutes a challenge to educational policies. the uab ( ) system articulates the provision of courses at federal and state universities as well as at federal institutes, aiming to open these institutions and to include more people in higher education.

gomez international journal of instruction, july ● vol. , no.

currently, in the higher education context, public, economic and pedagogical investment in developing the system is considerable (costa, ). uab was established by decree nº (june ); distance education was chosen as its method, keeping it as a face-to-face/virtual system, with accessibility as a political option. this system is, in principle, not a university, and it is not even open, since it requires an exam for acceptance in the courses; yet it is in some way opposed to a higher education restricted to certain sectors in public, free-of-charge higher education institutions.
the system for selecting and granting access to uab follows the procedures of the face-to-face modality, with an entrance exam and specific didactic procedures, along with the principles and guidelines of the “quality referential for distance courses” (brasil, ). brazil, with about million inhabitants, has % of the young people in its population, between and , entering higher education. uab is among the actions started by the federal government to improve this situation and to help achieve the goals of the national education plan, i.e., providing access to higher education to at least % of the population in this age group by . the data of the official uab portal report that by , it already had face-to-face poles in institutions. in the higher education expansion and dissemination process, courses were offered at the time, encompassing about thousand enrolments. the ample popular participation gives the enterprise a certain particularity. for , the aim was to have , students, but we still do not have reliable data (cf. capes communication advisory, brazil, sept. ).

the open education concept was created by the historian and cambridge university professor john clarke stobart ( - ). he was a school inspector and the first director of education at the bbc in london ( ). with the programmes children's hour and the epilogue, the motto "nation shall speak peace unto nation" was launched and the creation of a cultural network by open radio broadcast was proposed. open education is currently an education modality that, in higher education, organises distance provision so as to expand access to those interested and to the less favoured sectors, to people of any social class or type of housing. in its pedagogical proposal, it uses printed, audiovisual and computer-based didactic systems. didactic material is produced as printed text, audio, video, video-lessons, programmes and radio broadcasts, along with the internet.
traditional correspondence, radio waves, video-lessons, television, specific courses, seminars and online courses, among others, can be employed specifically. assessment in open education follows the pedagogical and methodological proposal adopted by the institution. it is a modality that requires autonomy and self-discipline on the part of the student. the methodological, gnosiological, cultural and political opening permits students to obtain a degree similar to the one granted by a face-to-face university in order to act in the market. open education originated in the industrial movement of the s and s, and found in the united kingdom open university ( ) the expression of the first university offering open higher education, in the distance modality, to the working class.

virtual learning environments (vles)

in this universe, it is necessary to understand the use and mastery of vles by both educators and students, considering that the environment, understood as everything surrounding or involving living beings and/or things in an open learning process, generates a differentiated space and academic culture. when we describe the higher education process that, going beyond the coastal region, seeks to democratise education in brazil and to take it everywhere in the country, and when we identify the connections of this open education with social demands, some questions arise concerning the de-territorialisation of education towards virtuality. the analysis of this educational process reveals some little-known data on the experiences, uses and knowledge that are articulated in the use of vles aiming at capacity building. this implies the notion of distance open education as a support to this practice, which is intended to be democratic and is manifested in the courses offered by the uab.
because it is a current theme in the brazilian educational debate, the use of vles for training instructors in the open and distance modality implies considering the experience of the instructors and the pedagogical, technological and cultural factors involved, which deserve a reinterpretation from the standpoint of cultural studies. hence, the study with those participating in the courses mentioned herein involved understanding education as a cultural action conducted in the praxis and in the dialogue of educators/students who seek to understand the vle, in initial or in continued education, to provide a sort of education which somehow requires resorting to them.

virtual learning environments (vles), free or proprietary, are software for developing and managing online courses, which allow involving people and/or artifacts in the distance learning process or supporting face-to-face activities. the most widely known environments are moodle, teleduc, sakai, proinfo and webct. vles are digital cyberculture devices (levy, ) and are part of the educators’ life. the vle supports objects or open educational resources (oer) which, as in a didactic unit, contribute to learning and, on the whole, build a repository that can be reused. both the environment and the objects are digitally improved by technological and cultural artifacts, and are currently very well articulated in pedagogical proposals for distance education. these cultural artifacts have been studied from the cultural perspective, as an interdisciplinary field which allows some methodological flexibility to analyse the cultural mastery, the creation of meanings, the transformation and the dissemination of this process in distance open higher education, a process that affects the face-to-face system, although the reach of this effect is not yet well known.
english cultural studies hold that any object that may be considered cultural deserves to be analysed and criticised. williams ( ), hoggart ( ) and hall ( ) are some of the representatives of this thought, who contributed to meeting this challenge. education open to the community expands through vles, as a rhizome (deleuze & guattari, ; sharples et al., ), in its diversity and heterogeneity, with multiple methodologies and themes. communication explores the digital means and develops through technological and discursive devices, demanding new skills and competencies for organizing experiences deriving from thought and learning. education thus opens to culture and is hence understood as a cultural production and action, which allows understanding the use and mastery of vles. the idea of education as an act of knowledge and a political act is inseparable from the quest for social transformation (freire, ; henry giroux and peter mclaren, ), inserted in the movement towards generating a learning community based on digital culture.

we believe that education is a cultural action and that freire’s pedagogy has specific contributions to make to reinventing learning. brazil is one of the few countries in latin america in which the relationship between pedagogy and culture has developed with a certain presence and consistency. likewise, the brazilian educator and educational thinker paulo freire ( - ) is one of the few in south america acknowledged by authors in other continents, recognized as a source and as a master. freire contributes to thinking about this cybercultural process (levy, ). from the pedagogy of virtuality (romo, castañeda, orozco, gomez, ), we believe it necessary to speak about new territorialities where education occurs or, in other words, about the information and communication society, about the network society. for the future of virtual learning, this perspective is fundamental to understanding this process and education as dialogue and communication.
this dialogue is where women and men make themselves present; they are not cast out from communication and dialogue. freire ( a) had already raised the question of technology for teaching adults how to read and to write in the first cultural circles held in pernambuco (brazil). these were circles formed within charitable societies, soccer clubs, district associations, and in churches. the educators were in charge of preparing the creation of a circle, visiting the congregation club or the parochial church or the district association and of talking about the idea of a pedagogical work. once the proposal was accepted, a large promotion effort was made in the area, using popular resources (…) when two or three circles had been created, the educator made a thematic survey among the participants, which was studied by us, in a team, at the home office of the action group. once the themes had been “treated” they were organized in a program to be discussed with the participants of the circle (…) we prepared the material for the discussions, taking into consideration the resources available (…) ( ).

in this sense, the vle can operate as a cultural artifact generating others, besides creating a learning community. the study with the participating teaching staff allowed identifying who those seeking open education courses are and how the use and mastery of vles occurs along undergraduate and specialization courses, from the perspective of both educators and students. for this, it was necessary to know what they considered to be the potentialities of vles, to know the teaching competencies for acting in this virtual learning environment and to recognize the adequate elements for reinventing culture in this digital context.
by systematising and analysing what instructors think and the meaning of using vles, in initial and continued education, for those taking courses in brazilian public universities connected to the uab system, one can perceive another pedagogy about to bloom.

method

as a result of the research, we present the description and the analysis of the data concerning the participating population and qualitative aspects (attitudes and opinions likely to be considered and which reveal important aspects for understanding how these people are thinking about or mastering virtual environments in their education process). specifically, participants in the course media in education and in a physics course, at two federal universities, shared their experiences with us, which allowed us to know specificities of the phenomenon from the data collected. these data, compared with the literature, with the data from the universidade aberta do brasil (uab) site and with those of the participants themselves, resulted in this analysis, which shows that students and instructors use and master the devices. studying the uses and mastery of vles was an attempt to understand the changes in student and lecturer culture in higher education. it has been described as a landmark in the public perception of digital culture and academic orientation.

the participating population started with people responding to a questionnaire, and closed with , representing . % of those invited and involved in the courses, that is, instructors, monitors and students. the respondents were selected by invitation, and the questionnaire was answered by those who agreed to participate, via internet or face-to-face, and individually. an interview was also conducted with a course coordinator based on the same instrument. sections of the interviewee’s statement were related to the data systematised in tables and graphs, along with the interpretation. the questionnaire included questions, most of them closed.
some questionnaires were applied semi-face-to-face, which resulted in an interesting debate, as we started from the premise that this methodology would challenge the legitimacy of the virtual and of the face-to-face response. of course, different relationships are established with the questions; however, the legitimacy is the same. in both cases, people opted for identifying themselves with the questions presented; the respondents were interested in seeing that most are teachers thinking of issues that will serve to reflect about the area in which they work or study. therefore, be it on paper or virtually, the attitude of the respondent faced with this area cannot be measured. respondents might fail to respond, but if they respond and lie when answering, this is because they have an ideal model they want to correspond to, and even that provides information likely to be analysed. we are discussing a de facto behaviour or a behaviour they believe in. individuals responded because they have some involvement with what they are responding to. what matters is not whether the response is true, but the [respondent’s] attitude towards it; it is not a question of checking truths, but ways of thinking about the response. the issue was checking a thought, that is, knowing how these people think of it.

the application of the research instruments was preceded by a pre-test and, after the necessary adjustments, the instruments were made available in the networks pertinent to the courses so that those interested in participating could respond. the method for analysing the data was meant to evidence the existing relations between the phenomenon studied and the factors surveyed.
these and other data systematised from the research instruments are organised in blocks:

block : identification (age, gender); education (bachelor degree; second bachelor degree; year the degree was granted, specialisation and others; names of the institutions and of the courses attended); professional activity (main one and others; places of work; public or private; work regime; full or partial dedication; function; performs or not another paid activity; length of activity).
block : motivation and information technology facilities.
block : use of vles (opinion; preparation for using the tool; platform used; time the tool has been used for professional capacity building; regularity of use; time spent in capacity building activities; number of students involved; whether using vle is compulsory regarding the course programme; respondent’s role in the course: coordinator, instructor, monitor, student, others; vle technical characteristics in semi-face-to-face education; most important ends for using vle in education; impact of using vles in education/activity as an instructor).
block : instructor’s competencies; contents that could favour and expand the use of vle; problem situation found in the education received; vle contribution to learning and to the educator’s work; importance of the course pedagogical proposal with vle; assessment of the monitor/instructor in the semi-face-to-face modality.
block : professional aspiration following this training and their perception of the reach of the uab programme.

identification and context of the study

the data obtained are presented with the interpretations and indications for the final considerations and are provided as a contribution to other researchers in the area.
the research was conducted over two years, in late and early , using an interview and the application of a semi-face-to-face questionnaire to the instructors and students of the two public universities involved, one of them in são paulo and the other in rio de janeiro. the online questionnaire, in the questionpro software, totalled views, out of which started to respond and concluded ( . %).

identification: from the data obtained from the . % respondents who concluded the responses to the questionnaire, % were from são paulo and . % from rio de janeiro. students who enrol have to be aware that, in this semi-face-to-face modality, they have to go to the pole every other week. the respondents live outside the capitals of the states of são paulo and rio de janeiro. this information is important for showing that higher education has left the state capitals, reaching people who would otherwise have difficulties in being included in the higher education system:

i think that, for the first time in the history of this country there was the decision to establish a network of universities with distance education (de); in each micro-region in this country, there is poles. (e )

the age prevailing among the respondents was between and ( people); between and ( people); between and ( people); between and ( people) and over ( people). regarding gender, out of the total who responded, . % were male and . % female. the research asked questions regarding the respondents’ education: they said they had taken elementary education mostly in state schools and higher education at private institutions and that, for being of a lower social class, they studied at state schools up to their secondary education, but did not manage to enter a state university. concerning the distribution per type of secondary education institution, . % had attended state schools and . % attended private schools.
according to one interviewee:

the public is very diversified, there are teachers with secondary education, technical degrees, people with higher education who decided to take a specific course, students from the state learning network. (e )

regarding distribution per type of undergraduate course institution, inversely to secondary education, in their first higher education degree, . % attended private institutions and . % attended a state institution. for their second higher education degree, this percentage increases in private institutions, with . %; . % attended state institutions. the same occurs for specialization, with . % attending private institutions and . % attending state institutions. out of the total participants, . % are students and . % are monitors, instructors, coordinators and/or perform other non-specified functions in the two courses researched. these students’ elementary education occurred in state schools, where . % work; . % work in private schools and . % work in other activities. out of these, . % act mainly as teachers in ef – fundamental (elementary) teaching i; . % in ef - fundamental (elementary) teaching ii; . % in secondary teaching (em); . % in pre-school education and , % in higher education.

concerning the work regime, . % have taken a public selection exam to occupy a post; . % are hired; . % [of the course monitors] have a research grant and . % have other activities as intern and monitor. as to hour load, the respondents’ dedication to work was: . % full time - h (at one institution); . % - h (at more than one institution); . % part time – less than h (in one institution); . % - less than h (in more than one institution). how long he/she has been an educator: of the respondents said they had acted as educators and . % had over -year experience. motivation to take the course: most say they want a free graduate course; . % want their salaries raised; .
% want specific education for their present job; . % want to obtain formal degrees and certificates; . % want to have formal promotion in their teaching or management career; . % say it is the best continued education; . % like its being free; . % are close to the face-to-face poles; for . %, there is a lack of options; . % chose other reasons (course at a federal institution; renowned institution; improvement; flexible hours; knowledge of new media aiming at capacity building for teaching distance education; updating the terms and uses of the "media" surrounding us; the theme). according to the interviewee:

students who seek de, except for the local education teacher, are people who use the name [status] of the university to print in their curricula. (e )

despite being a radical statement, it demonstrates a certain mobilisation, which is causing changes in the cultural matrix in both personal and institutional spheres. this unknown area will lead the institution to seek other possibilities.

concerning computer use, , or % have one and use it. regarding the place to access the internet, the major one is their home with . %; workplace, . %; lan house . %; telecentre, . %; others, . %. they effectively use the internet: . % have a computer at home; out of these, . % are connected to the internet. they use it for communication with others via e-mail; to access the vle; for the news; entertainment; school research and social networks. indeed, there is a concern for the situation generated by the open distance modality and a quest for new solutions for expanding, disseminating and democratising higher education in brazil.

using vle in the training process

the survey revealed that instructors/coordinators have positive expectations about vles facilitating the sharing of material and ideas. the data regarding vle use presented issues related to its structural elements. out of the .
% respondents saying they use the internet to access the platform as a means to generate room for education, . % agree that vle is a device for performing activities that cannot be performed via e-mail; . % think it is a repository of didactic resources (texts, videos, images etc.); . % think it is software serving to assemble and to manage courses accessible via internet; . % think it promotes interaction between instructors and students and . % think it allows sharing knowledge by using e-mail, chat, forum.

although the respondents agree with these potentialities of vle, only . % had formal training for using it, and . % were trained in the course media in education (in percentage terms, these were the ones who responded the most). the respondents who used other environments said they would also introduce vle in other courses and institutions and mentioned proinfo, moodle, thinkquest and the freire platform. they ranked the environments they use by priority with numbers, being the most important and the least important; moodle, proinfo and the freire platform were the most widely used, besides the institution’s own platform.

vles are of recent use in higher education and this is somehow expressed in the responses. . % have used it for over a year; . % started to use it from eight months to one year ago; . % have used it from four to seven months and . % have used it for less than three months. other spans of use mentioned are years ( ); years ( ); years ( ); years ( ); over years ( ); over years ( ). one respondent says “it is my second de course; the first one was a requirement to take a position/selection examination; hours in months” and other respondents say “i don’t use it.” currently, the frequency of vle use is mostly weekly ( . %) and daily ( . %). . % of the respondents state that vle use is compulsory for the course. .
% say it is not compulsory, which suggests they use e-mail to communicate with the instructor and, sometimes, with colleagues. specifically, regarding the weekly average time used for education activities, they declare: to hours ( . %); to hours ( . %); over hours ( . %). in the respondents’ opinion, the key vle technical characteristics contribute to the development of the semi-face-to-face higher education proposal.

table : technical characteristics of vle
n %
being a virtual environment promoting development and the control of the lesson, discussions and assessments .
integrating chats, forums and video-lessons .
stimulating collaborative group and teamwork .
counting on a good system for managing both instructor’s and student’s activities .
promoting external channels for web interaction: portal, domínio público, e-proinfo .
promoting spaces for reading and writing .
being a support to a good interaction between people / good broadband .
allowing transferring variable size files (between x and x kb or mb) .
others _ specify .

as the importance of easy access, they specified aps; integrating chat, forum and libraries; providing continuous education for those not counting on time to attend face-to-face courses; promoting interaction between students and instructor.

the respondents consider the first most important ends of using vle in education:

table : purpose of vle.
n %
managing contents .
managing the course .
following students’ progress .
designing online activities to complement in-class work .
consulting tutors .
talking to colleagues .
stimulating collaborative learning .
forming learning communities .
propitiating teamwork .
flexibilising the hours to access the course .
flexibilising the place to access the course .
lowering the costs of the higher education course .
using repositories of learning objects .
total

the positive impact of using vles in instructors’ education/actuation was mentioned, in decreasing order of importance, on work/study conditions ( ); pedagogical innovations ( ); better planning quality ( ); democratising the access to online education ( ); organisation of teaching/curricular integration ( ); collective knowledge construction ( ); using the digital library ( ); participating in the design of activities ( ); infrastructure of the institution ( ); satisfaction of mutual interests ( ); decentralisation of higher education ( ). the interviewee states:

in the discipline i teach, there are some widely used tools, such as the tutorial room; it is the classroom; it is used intensively in vle and it is where students interact with the tutors at a distance as well as with the coordination. (e )

data regarding teacher’s competencies

in the respondents’ opinion, the relevant and necessary competencies for the instructor’s didactic work with vle, in decreasing order of importance, are: methodology for building online proposal (lesson, course, programme); technical-scientific knowledge of pedagogical methodologies (e.g. constructivism); pedagogical mediation; skill for using communication tools; knowledge of basic information concepts; fostering strategies for analysis and critical reflection; sensitisation towards considering personal and technological differences; knowledge of assessment for the semi-face-to-face modality; ability for teamwork; respect for colleagues’ production; respect for others; ability for group work and openness to learn from others. from a list presented, the respondents indicated situations that are or are not a problem in education, e.g. participating in face-to-face sessions.

table : problems in education.
%
participating in face-to-face moments .
time available for designing, correcting and proposing other activities .
keeping discipline and regularity of access and use .
guidance regarding the correct use of vle .
articulating the work in an interdisciplinary fashion .
integrating vle in the contents and curricular school time .
integrated use of virtual laboratory .
adequate use of learning objects supported by vle .
integrated use of the virtual library .

concerning the vle contribution to learning and the educator’s work, the respondents agree with important aspects.

table : contributions of vle to education
%
will contribute to teachers’ higher education .
will act as a facilitator of learning for students .
will contribute to the teachers’ work in the classroom .
will promote access to new technologies and knowledge to improve education .
will contribute to forming virtual learning communities .
will promote the use of learning repositories (learning objects) .
will contribute to internationalise education .
will promote the precarisation of teachers’ education .

for the respondents, what matters in the design of an education proposal using vle is: the quality of the material/learning objects ( . %); knowing the possibilities provided by vle ( . %); pedagogical and technological mediation ( . %); curricular adequacy to the students’ universe ( . %); interest and the pertinence of associated themes ( . %); integration between traditional and distance education ( . %); promoting a totally online modality ( . %); the proposal design ( . %):

here (in the platform) you will find the use of the video-lessons i produced; it is a constant feedback, one helps the other. (e )

for the respondents, those teaching through vle should be good educators connected to the institution.

table : open education instructor.
n %
educator, connected to the institution, with education and experience in the teaching area and in distance education .
tutor with a ba/bsc degree who is concerned with the teaching and learning processes, who follows and assesses the activities .
monitor - student who monitors and is concerned with the execution of activities .
others/specify .
total

the competence of the monitor/educator/coordinator requires knowing de technology. the interviewee considers that % of this total do not:

i am one of the few who uses the platform and seek solutions, but % do not; they place the test and correct it. (e )

when required to mention at least three concepts of what being a good instructor in the semi-face-to-face higher education modality is, participants mostly responded: knowledgeable in the subject and experienced in what he/she teaches.

table : what it means to be a good educator in open education.
n %
good communicator ,
knowledge transmitter ,
concerned with learning and not with showing off knowledge ,
learning facilitator ,
knowledgeable in the subject and experienced in what he/she teaches ,
knowing how to listen ,
knowing how to relate the themes with the social and the student’s reality ,
total

the interviewee, a course coordinator, considers that:

the group of monitors is composed of people not regularly hired, with no professional stability and with little time to be trained. thus, this is not a stable professional. (e )

while making these statements, the interviewee shows she is committed to her work and admits that there is some dissatisfaction with the conditions for acting as an instructor. de is being built by face-to-face instructors and, as the interviewee says, it is a long process.

the major professional aspirations for the coming years, after this course, were expressed in decreasing order of importance: acting as an instructor/monitor in distance education ( . %); remaining in the present position, at the same institution ( . %); conducting another professional activity in the educational area ( . %); occupying school board and management positions ( , %); getting a better job away from the school ( .
%); keeping the same position, but at a different school ( . %); and retiring ( . %).

the emergence of the open university system, in the attempt to take higher education everywhere in the country and to expand and democratise access to it, provides higher education to teachers and to people in general. there is a large popular demand for higher education, which constitutes a challenge to educational policies that cannot meet it face-to-face.

discussion

more than five years after activities began, this experience compels us to know and to give a new meaning to the use of vles and of open educational resources, and to understand that the culture generated is an integral part of the learning process and that educating is creating new spaces for higher education. therefore, studies of open distance education as a cultural action lead us to incorporate the people’s culture into the learning process. vles, as elements integrating the teacher’s education, act as agents of culture and of subjectivities. evidently, in a cybercultural universe, educators/students use computers and work in a network to solve increasingly complex educational and capacity building problems (romo, castaneda, orozco, gomez, ).

returning to the blocks or analysis matrices presented herein, we find some resonance among the survey participants, who admit that education combining face-to-face and distance moments generated imbalances in their practices, but provided them with new reflections and learning. in these new educational arrangements, new educator functions are incorporated, such as instructors, monitors and supervisors, which raise quite a debate, due to the conditions at work. the questionings lead us to observe that educators and students are mastering vles which, formerly, were not even a part of their daily life.
they attempt to articulate their needs with the potentialities of vle use, faced with the lack of higher education institutions near them and of the time and conditions to travel and attend face-to-face lessons. if we ask them whether they conduct research, whether they form communities of culture, learning and knowledge in this ambit, one thing is certain: they use the devices, reflect on technical, didactic and collaboration strategies in the courses and, especially, they are open to learning. through vles, people, knowledge, experiences, practices and methodologies circulated which, in turn, allowed inclusions, interactions and social uses of education and communication technologies, mediating relationships among instructors, students, learning managers and community in the vle. vles support and expand initial or continued education on the internet, de-territorialising this process, which is only possible by discovering and rediscovering the traditions of education and of virtual learning environments, not through a discourse that promotes repetition, but rather by means of transformation and critique by the very participants in the courses. in this sense, instructors and students are rediscovering themselves, relating to and using (either on their own initiative, or supervised by tutors, and the latter by the official bodies) the educational possibilities of vles, be it to respond, to seek information or to create and innovate.

besides the technical and logistic potentialities of virtual learning environments, the use of free software for education is in keeping with an era of free access, which leads us to perceive some issues that point to vles as a substantial part of contemporary culture and education.
concerning higher education, we can say that: it is actually the same conventional university that, faced with the pressures of the information and knowledge society, takes a position; students seek open education because it is public and free, besides granting them a degree or a certificate to act in the market; also, students come from different backgrounds, but they mostly seek distance education, at the moment, for continued education. when instructors and students take the position of those who learn, this confers on them a certain authority and autonomy because of and for the use they make of cultural artifacts. the experimental, conceptual and practical perspectives of the respondents contribute to the research implicit in this educational dynamic, in a context which open higher education is entering.

references

brasil. secretaria de educação a distância – seed/mec ( ). referenciais de qualidade para educação superior a distância. brasília, august.
castells, m. ( ). the rise of the network society, the information age: economy, society and culture. uk: blackwell.
costa, j. c. da ( ). modelos de educação superior a distância e implementação da universidade aberta do brasil. revista brasileira de informática na educação. ( ),
deleuze, g., guattari, f. ( ). a thousand plateaus. trans. brian massumi. london and new york: continuum, . vol. of capitalism and schizophrenia.
freire, p. ( ) pedagogy of the oppressed. new york: herder and herder.
freire, p. ( a) cultural action for freedom. [cambridge], harvard educational review.
freire, p. ( ) letters to cristina: reflections on my life and work. new york: routledge.
giroux, h., mclaren, p. ( ) cultural studies and education: towards a performative practice. eds. henry a. giroux and patrick shannon.
hall, s. ( ) ‘the centrality of culture: notes on the revolutions of our time’, in k. thompson (ed.) media and cultural regulation, vol.
of the culture, media and identities course books. london: sage and the open university.
hoggart, r. ( ) the uses of literacy. harmondsworth: penguin books.
information resources management association (irma). virtual learning environments: concepts, methodologies, tools and applications, usa, igi global, .
levy, p. ( ) cyberculture. rapport au conseil de l’europe dans le cadre du projet nouvelles technologie: coopération culturelle et communication, odile jacob, paris.
open university. accessed at: http://www.open.ac.uk/.
moore, m. g., & kearsley, g. ( ) distance education: a systems view of online learning. belmont, ca: wadsworth-cengage learning.
olnet oer research ( ) accessed at: http://www.olnet.org/
palloff, r.m. & pratt, k. ( ). building online learning communities: effective strategies for the virtual classroom. san francisco, ca: jossey bass.
romo, r., castañeda, m., orozco, m., gomez, m. ( ) on-line education: an emancipating vision. available at: https://tojde.anadolu.edu.tr/tojde /pdf/review_ .pdf
royal society science policy centre. ( ) science as an open enterprise. london: the royal society.
serres, m. ( ) atlas. madrid: julliard.
sharples, m. et al. ( ) innovating pedagogy . available at: http://www.open.ac.uk/personalpages/mike.sharples/reports/innovating_pedagogy_report_ .pdf
stiles, m.j. ( ). death of the vle. a challenge to a new orthodoxy. ( ), - . available from: http://serials.uksg.org/openurl.asp?genre=article&issn= [paper]
universidade aberta do brasil ( ) available at: www.uab.capes.gov.br/
williams, r. ( ) working-class culture in the uses of literacy symposium, universities and left review .

turkish abstract (in english translation)

use and mastery of virtual learning environments in the brazilian open university

this study describes and analyses the dynamics of the use and mastery of virtual learning environments by the educators and students of the open university, which forms an important part of the brazilian education system. between and the beginning of , a questionnaire of items was completed by students/instructors/coordinators of the media in education and physics courses at two state universities. the interview with the coordinator was transcribed and related to the data systematised in tables and graphs. the interpretative analysis, drawn from an open dialogue with the references and from the data gathered from the universidade aberta do brasil (uab – open university of brazil), shaped the final considerations. all of this shows that the use and mastery of vles by students is important, and the specificity of these uses supports studies and publications, which are still few in the literature of the field. the study shows the development of the open distance education system, carried out with high participation, as a response to the challenges posed to educational policies to expand the public provision of higher education and the use of vles to this end.

keywords: distance education, virtual learning environments, higher education, teacher, open university system, brazil

french abstract (in english translation)

use and mastery of the virtual learning environment in the brazilian open university

this paper describes and analyses the dynamics of the use and/or mastery of virtual learning environments (vles) by educators and students of the open university, an important part of the brazilian educational system. students/instructors/coordinators of the media in education and physics courses of two federal universities answered a questionnaire with items, between and the beginning of . the interview with a coordinator was transcribed and related to the data systematised in tables and graphs. the interpretative analysis, in open dialogue with the references and with the data from the universidade aberta do brasil (uab - open university of brazil) site, led to the final considerations. these suggest that the use and/or mastery of vles by students is important, and that the specificities of these uses subsidise studies and publications, which are still few in the literature of this field of knowledge. the work reflects the development of the open distance education system, conducted with strong popular participation, as a response to the challenge posed to educational policies to expand the public provision of higher education, also using vles to this end.

keywords: distance education, virtual learning environment, higher education, teacher, open university system, brazil

arabic abstract (in english translation)

use and mastery of the virtual learning environment in the brazilian open university

this paper describes and analyses the dynamics of the use and/or mastery of virtual learning environments (vles) by the teachers and students of the open university, an important part of the brazilian education system. a questionnaire with items was answered by students/instructors/coordinators of the media in education and physics courses of two federal universities, between and early . the interview with a coordinator was transcribed and related to the data systematised in tables and graphs. the interpretative analysis, in open dialogue with the references and with the data from the universidade aberta do brasil (uab - open university of brazil) site, resulted in the final considerations. these suggest that the use and/or mastery of these environments by students is important, and that the specificities of these uses support studies and publications, which are still few in the literature of this field of knowledge. the work reflects the development of the open distance education system, conducted with strong popular participation, in response to the challenge posed to educational policies to expand state provision of higher education, also using these environments for this purpose.
keywords: distance education, virtual learning environment, higher education, teacher, open university system, brazil

bulletin of the american society for information science and technology – october/november – volume , number

feature

sustainability: scholarly repository as an enterprise
by oya yildirim rieger

editor’s summary
the expanding need for an open information sharing infrastructure to promote scholarly communication led to the pioneering establishment of arxiv.org, now maintained by the cornell university library. to be sustainable, the repository requires careful, long term planning for services, management and funding. the library is developing a sustainability model for arxiv, based on voluntary contributions and the ongoing participation and support of libraries and research laboratories around the world. the sustainability initiative is based on a membership model and builds on arxiv’s technical, service, financial and policy infrastructure. five principles for sustainability drive development, starting with deep integration into the scholarly community. also key are a clearly defined mandate and governance structure, a stable yet innovative technology platform, systematic creation of content policies and strong business planning strategies. repositories like arxiv must consider usability and lifecycle alongside values and trends in scholarly communication. to endure, they must also support and enhance their service by securing and managing resources and demonstrating responsible stewardship.

keywords: digital repositories management information infrastructure governance scholarly communication

oya yildirim rieger is associate university librarian for digital scholarship and preservation services at cornell university library. she oversees the library’s digitization, online repository, digital preservation, electronic publishing and e-scholarship initiatives with a focus on needs assessment, requirements analysis, business modeling and information policy development. she can be reached at oyr <at>cornell.edu.

over the past two decades, advancements in information and communication technologies have ushered in new modes of knowledge creation, dissemination, sharing and enquiry. open access has emerged as an alternative and viable publishing model and an increasingly vital component of the scholarly communication infrastructure. the vision of an open and robust information infrastructure aims to facilitate the broad dissemination of research outputs of all types – including research data – to allow use, nullification, refinement and reuse. creating a ubiquitous, comprehensive and linked research data environment requires a seamless network of content, technologies, policies, expertise and practices. it is also critical to view this scholarly organization as an enterprise that needs to be maintained, improved, assessed and promoted over time. open access does not entirely remove fees and access limitations, but it replaces and reconfigures them for the key stakeholders in the scholarly communication endeavor. as we explore a range of issues related to research data curation and management, it is prudent and timely to consider how these services will be maintained and developed in order to flourish over time.

what we may consider an exploratory or pilot initiative often ends up transitioning into production, sometimes with insufficient time to think through implications. therefore, as we envision research data support, it is critical that we consider long-term development and management issues upstream as a component of an enduring service infrastructure.

simply put, sustainability is the capacity to endure. for repositories, sustainability entails long-term maintenance of responsibility, which has technical, socioeconomic, policy and business dimensions and encompasses the concept of stewardship, the responsible management of resource use. at the heart of the concept is the ability to secure resources such as technologies, expertise, policies, visions and standards needed to protect and enhance the value of a service based on the needs of the user community. in the ensuing discussion, based on cornell’s experience with arxiv, i highlight the key premises of sustainability.

arxiv sustainability initiative

started in august by paul ginsparg, arxiv.org is internationally acknowledged as a pioneering digital archive and open-access distribution service for research articles. the e-print repository, which moved to the cornell university library in , has transformed the scholarly communication infrastructure of multiple fields of physics and plays an increasingly prominent role in mathematics, computer science, quantitative biology, quantitative finance and statistics. as of august , arxiv included over , e-prints. in there were , new submissions and close to million downloads from all over the world. arxiv’s operating costs for are projected to be approximately $ , , including six fte staff, server maintenance and networking.

since january , cornell university library (cul) has undertaken an effort to establish a sustainability model for arxiv. as a three-year interim strategy for - , cul initially established a voluntary institutional contribution model and invited pledges from libraries and research laboratories worldwide that represent arxiv's heaviest institutional users. since scholars worldwide depend on the stable operation and continued development of arxiv, this strategy aimed to reduce arxiv's dependence on a single institution, instead creating a broad-based, community-supported resource.
keeping open access academic resources such as arxiv sustainable entails not only covering the associated operating costs but also continuing to enhance the resources’ value based on the needs of the full range of user communities. cornell’s sustainability initiative has striven to assess and strengthen arxiv’s technical, service, financial and policy infrastructure. one of the goals of the business planning initiative has been to engage the institutions that benefit from arxiv to assist in defining the future of the service.

the arxiv membership model, which will be launched in , is an outcome of the three-year planning process and was facilitated with a planning grant from the simons foundation. the model is founded on a set of operating principles for arxiv and presents a business model for generating revenues. the background information about the planning activities is available at http://arxiv.org/help/support.

the business model is composed of four sources of revenue: cornell’s annual funding of $ , per year and indirect expenses (represents % of direct expenses), a $ , per year gift from the simons foundation, annual fee income from the member institutions and a $ , per year challenge grant from the simons foundation based on the revenues generated through membership payments. the substantial gift from the simons foundation aims to encourage long-term community support by lowering arxiv membership fees, making participation affordable to a broad range of institutions. based on institutional usage ranking, the annual fees are set in four tiers within the $ , - , range.

sustainability principles

arxiv as a sociotechnical system consists of technical systems and standards – activities and practices involved in developing and using the system and the social arrangements and organizations that provide it with a structural framework.
the sustainability planning process aims not only to investigate how to diversify revenue models but also to ensure that arxiv strives to meet a set of operational, editorial, financial and governance principles and can set an example of a transparent and reliable community-supported service. based on cornell’s experience in planning the future of arxiv, the following discussion proposes five sustainability principles, including the consideration of disciplinary cultures, creation of a governance structure, stability of technological components, creation of content policies and adherence to managerial best practices [ ].

deep integration into the scholarly community and scholarly processes. disciplinary characteristics, work practices and conventions of academia play an important role in researchers’ assessment and appropriation of information communication technologies. repository deployment cannot be fully understood without comprehending how a specific technology is embedded in its social context. the information and communication technology integration characteristic of disciplinary communities often mirrors underlying differences in epistemic cultures. arxiv is a scholarly communication forum informed and guided by scientists and the scientific cultures being served.
through paul ginsparg’s leadership, which is rooted in both the academic and information science communities, the service has consistently focused on the disciplinary cultures represented in the digital repository and on community need [ ]. arxiv’s wide international acceptance attests to the importance of relying on user- and evidence-based strategies for informing it modifications and enhancements, user support services and associated repository policies.

clearly defined mandate and governance structure. although best practices in developing technical architectures and associated processes and policies underpin a digital repository, organizational attributes are equally important. the trustworthy repositories audit & certification: criteria and checklist (trac) tool (http://trac.edgewall.org/) emphasizes that organizational attributes affect the performance, accountability and sustainability of repositories. the first criteria in the trac assessment tool are governance and organizational viability. similarly, subject repositories must have clearly defined mandates and associated governance structures to reflect a commitment to the long-term stewardship of a service. the general purpose of governance is to ensure that an organization has the means to envision its future and the management structures and processes in place to ensure that the plan is implemented and sustained. good governance is participatory, consensus oriented, accountable, transparent, responsive, efficient, equitable and inclusive. however, it also needs to be nimble and flexible – not allowing any gridlocks or excessive groupthink.

the key editorial, governance and economic tenets of arxiv are delineated by a set of principles (https://confluence.cornell.edu/x/xkstbw). cul holds the overall administrative and financial responsibility for arxiv’s operation and development, with strategic and operational guidance from its member advisory board (mab) and its scientific advisory board (sab).
cul manages the moderation of submissions and user support (including the development and implementation of policies and procedures), operates arxiv’s technical infrastructure, assumes responsibility for archiving to ensure long-term access, oversees arxiv mirror sites and establishes and maintains collaborations with related initiatives to improve services for the scientific community through interoperability and tool sharing. mab is elected from arxiv’s membership and advises cul on issues related to repository management and development, standards implementation, interoperability, development priorities, business planning and financial planning. sab is composed of scientists and researchers in areas covered by arxiv and provides advice and guidance pertaining to the intellectual oversight of arxiv, with particular focus on the policies and operation of arxiv's moderation system.

technology platform stability and innovation. the existing repository ecology is a complex of architectures and features that are optimized to fulfill the specific needs of institutional, subject or archival repositories. the landscape is becoming even more heterogeneous with the addition of scientific social networking sites that profile local scholarly activities and open data initiatives that focus on data curation models. a critical component of a sustainability plan is to consider this rich context and understand how the service fits within the broader framework. in particular, we need to factor in the following three aspects:

• interoperability arrangements that link a given repository to related systems, services and communities
• features that support supplementary information objects such as underlying data, auxiliary multimedia content and research methodologies
• functionality and arrangements that lower barriers to contributing content to multiple complementary repositories.

among the critical roles of repositories is facilitating the preservation function.
digital preservation (used interchangeably with archiving) refers to a range of managed activities to support the long-term maintenance of bitstreams, thereby ensuring that digital objects are usable. however, ensuring enduring access involves more than bitstream preservation. it must provide continued access to digital content through various delivery methods. as we assess the value of subject repositories, it is important to differentiate between bitstream preservation and preserving access. also, some subject repositories may opt to institute best practices for managing digital content but may not be in a position to assume full preservation responsibilities. it is therefore critical to assess the clearly defined roles and associated procedures that pertain to this responsibility.

there is often an inherent tension between technological stability and innovation. although dependability and consistency are important service attributes, also essential is keeping pace with evolving user needs through research and development (r&d) projects. accordingly, assessing resource needs and planning for investments in staffing and equipment are challenging tasks on both the maintenance and r&d fronts. however, given the uncertainties associated with the development and testing of new features and services, an innovation agenda needs to be carefully thought out in order to ensure that operational stability is not undermined.

the sustainability of arxiv also depends on enabling interoperability and creating efficiencies among repositories with related and complementary content to reduce duplication of efforts.
for instance, cornell has collaborated with the nsf data conservancy project (http://dataconservancy.org/) to launch a pilot interface that allows arxiv submitters to upload data associated with their articles directly to the data conservancy repository. links to the data are added in the arxiv record automatically. this service is a research project, however, and arxiv and the data conservancy make no guarantee about continued availability of datasets uploaded via this mechanism past the end of .

systematic development of content policies. content curation and stewardship roles have traditionally been shared by libraries and publishers – publishers with a focus on creation and distribution and libraries specializing in discovery and preservation. in digital information environments, repository services such as arxiv blur the distinction between libraries and publishers. an essential criterion in assessing open access, online resources is the availability of clearly defined collection policies and submission guidelines that reflect the content curation and stewardship role of the hosting institution. although arxiv is not peer-reviewed, submissions are reviewed by subject-based moderators. additionally, an endorsement system is in place to ensure that content is relevant to current research in the specified disciplines. it is also critical to have clearly articulated policies about the copyright status of the deposited materials as well as conflict management processes (such as responding to concerns in regard to rejected submissions).

arxiv supplements the traditional publication system by providing immediate dissemination and open access to scholarly articles (which often appear later in conventional journals).
arxiv complements, rather than competes with, the commercial and scholarly society journal publishing market. among the challenges are ensuring the authority and integrity of e-prints and distinguishing between succeeding versions, such as a pre-print paper and its published version in a scholarly journal. it is therefore critical that the repository and preservation community address the versioning of scholarly articles, tracking them from initial submission to pre-print archive to final publication in a formal scholarly journal. other challenges include linking the burgeoning corpus of institutional repositories with related subject repositories in order to achieve version control as well as creating a critical mass of related materials on particular topics. cornell’s participation in the open researcher and contributor id (orcid) author identifiers initiative (http://www.orcid.org) aims to enable better author linking and allow improvements in ownership claiming.

reliance on business planning strategies. business plans offer an overall view of a given product and its relevant user segments, key stakeholders, communication channels, competencies, resources, networks, collaborations, cost structures and revenue models. the primary purpose of a business planning process is to convey a clear value proposition to justify investment in a service or product by its potential users. the value proposition describes the benefits that a product or service provides. in other words, value propositions respond to this question: why should an institution purchase your product or service? since the focus of the value proposition is on the customer, it should be stated from the end-users’ perspective. value propositions may be based on a range of characteristics such as service features, customer support, product customization and economical pricing. the key challenge in creating a value proposition is addressing the needs of all stakeholders.
for instance, in the case of arxiv, the stakeholders include scientists, libraries, research centers, societies, publishers and funding agencies. although they are likely to have common goals, each group attaches value to a specific aspect of arxiv. for instance, from the end-users’ perspective, scientists’ highest priority for arxiv is likely to be the robustness and reliability of the repository and access features.

business models also convey financial plans. in a collaborative business model such as the arxiv membership model it is critical to clearly define and justify the pricing model and the budget to understand how revenues are being generated and spent. maintaining, supporting and further developing a repository involve a range of expenses including management, programming, system administration, curation, storage, hardware, facilities (space, furniture, networking, phone), research and training (such as attending meetings and conferences), outreach and promotion, user documentation and administrative support.

looking ahead

the arxiv case study illustrates the need to manage repositories holistically by taking into consideration a range of lifecycle and usability issues, as well as factoring in changing patterns and values of scholarly communication ecology. increasing emphasis on open science and the burgeoning data management mandates usher in a complex suite of technology, policy and service needs. we must consider the sustainability requirements upstream and remember that the services we are experimenting with and creating now have long-term implications. bowker et al.
[ ] argue that understanding the new information infrastructures requires an integrative view that goes beyond studying only technical, organizational or social aspects. such an integrative approach involves comprehending the emerging infrastructure within the context of day-to-day routines, evolving academic practices and business procedures. viewing scientific repositories as enterprises will ensure that the emerging scholarly communication structures and practices will be effective, efficient and enduring.

resources mentioned in the article
[ ] rieger, o.y. ( ). assessing the value of open access information systems: making a case for community-based sustainability models. journal of library administration, , : - .
[ ] ginsparg, p. ( ). arxiv at . nature, , – , doi: . / a. retrieved august , , from http://www.nature.com/nature/journal/v /n /full/ a.html.
[ ] bowker, g.b, baker, k., millerand, f., & ribes, d. ( ). towards information infrastructure studies: ways of knowing in a networked environment. in j. huntsinger, m. allen, and l. klasrup (eds.), international handbook of internet research. london: springer.

polysemy and synonymy in syntactic dependency networks
radek čech, department of czech language, university of ostrava, ostrava, czech republic
ján mačutek, department of applied mathematics and statistics, comenius university, bratislava, slovakia
zdeněk žabokrtský, institute of formal and applied linguistics, charles university in prague, prague, czech republic
aleš horák, natural language processing centre, masaryk university, brno, czech republic

abstract

the relationship between two important semantic properties of language (polysemy and synonymy) and one of the most fundamental syntactic network properties (the degree of the node) is observed. based on the synergetic theory of language, it is hypothesized that a word which occurs in more syntactic contexts, i.e. it has a higher degree, should be more polysemous and have more synonyms than a word which occurs in fewer syntactic contexts, i.e. it has a lesser degree. six languages are used for hypothesis testing and, tentatively, the hypotheses are corroborated. the analysis of syntactic dependency networks presented in this study brings a new interpretation of the well-known relationship between frequency and polysemy (or synonymy).

introduction

a complex network (cf. newman ) is a model of a system. it contains sets of nodes, representing entities, and links, representing relations among them. syntactic properties of language can be described, inter alia, as a system which contains both words and syntactically motivated relations between pairs of words (cf. mel’čuk, ; hudson, ).
consequently, it is not surprising that complex networks have been used for the analysis of syntax almost since complex network theory emerged (barabási and albert, ; barabási, ). over the last decade, analyses of syntactic complex networks have opened new insights into how language functions. specifically, new models of language acquisition were proposed (cf. ninio, , ; ke and yao, ; corominas-murtra et al., ), the relation between syntax and communication needs was analysed (cf. ferrer i cancho et al., ; ferrer i cancho, a), differences in the statistical properties of syntactic networks were used for typological studies (cf. čech and mačutek, ; liu and li, ; abramov and mehler, ; liu and xu, ), the origin of projectivity, i.e. the fact that syntactic dependency crossing occurs only very rarely, was investigated (cf. ferrer i cancho, b, ), as was the origin of syntax and its role in syntactic complex networks (cf. ferrer i cancho et al., ; solé, ; liu and hu, ; liu et al., ; čech et al., ). despite these achievements, a linguistic explanation of syntactic network properties is strongly needed, especially because the majority of language network analyses were merely descriptive rather than explanatory (cf. ferrer i cancho, ) and focused on global network characteristics (cong and liu, a, b).

correspondence: radek čech, department of czech language, university of ostrava, reální , ostrava , czech republic. e-mail: cechradek@gmail.com
digital scholarship in the humanities. © the author . published by oxford university press on behalf of eadh. all rights reserved. for permissions, please email: journals.permissions@oup.com. doi: . /llc/fqv. advance access published july ,
on the other hand, network analyses can bring a finer-grained interpretation of some traditional linguistic problems (liu, ; gao et al., ), as this study also shows for polysemy and synonymy, and the complex network approach has the potential to inspire a theory of language, according to ferrer i cancho ( ).

one possible way to explore a syntactic network from a linguistic point of view is to find those language characteristics which, for theoretical reasons, should correlate with syntactic network properties. an approach of this kind was developed within synergetic linguistics by köhler ( , a). it was used to analyse phonology (coloma, ), lexical properties (köhler, b; köhler and altmann, ; wang, ), syntax (köhler, , , ; čech and mačutek, ; čech et al., , ; liu, ; gao et al., ), and morphology (prün and steiner, ).

the present study follows this approach, and its aim is to scrutinize the relationship between syntactic properties modelled by a complex network, on the one hand, and semantic properties (word polysemy and synonymy), on the other. specifically, the number of syntactically motivated lexical contexts of a given word is taken into account; in the complex network, it is determined by the degree of the node representing the word (see section for more details). the degree of the word is determined by the number of syntactic dependencies in all syntactic trees in a corpus; an aggregation of these trees makes the syntactic complex network.

we hypothesize that a word (actually, we considered a lemma, i.e. a canonical word form) which occurs in more syntactically motivated lexical contexts, i.e. it has a higher degree, should be more polysemous and have more synonyms than a word which occurs in fewer contexts, i.e. it has a lesser degree.
based on this deduction, we put forth hypotheses as follows:
(1) the higher the out-degree/in-degree of a word, the more meanings it has;
(2) the higher the out-degree/in-degree of a word, the more synonyms it has.

the out-degree of a word expresses the total number of words which are connected to it as its modifiers, i.e. the number of its syntactically dependent words, while the in-degree of the word expresses the total number of words which the word under consideration is connected to, i.e. the number of its heads, in an observed sample of language (see section ). since it is assumed that these hypotheses are not language specific, six languages (czech, dutch, english, german, italian, and spanish) are used for their testing. admittedly, they do not cover all types of languages (all of them belong to the indo-european family), but we consider them a sufficient sample for a first preliminary analysis in this field.

the article is organized as follows: the status of polysemy and synonymy in language is presented in section ; the language material together with the methodology used is introduced in section ; section is focused on results; and the article ends with a conclusion (section ).

polysemy and synonymy of the word

at least since zipf ( ), it is well known that the semantic aspect of language (i.e. the meaning of any language unit) is closely connected to other language properties (e.g. relative frequency, degree of intensity of accent, and degree of crystallization of the configuration). probably the most systematic incorporation of certain semantic characteristics of language into a language model was presented by köhler ( , b, ). specifically, in the synergetic model of language, köhler focuses on two semantic properties of language units, viz. polysemy and synonymy.
he hypothesizes particular relationships among them and other language characteristics, such as frequency, word length, inventory sizes of language units, and functional loads of language units, and tests all hypotheses experimentally. consequently, both polysemy and synonymy can be viewed as language properties which are ruled by a complex mechanism emerging as a result of intricate interrelations among so-called communication requirements (cf. köhler, b). the system of these requirements represents a novel way of developing an approach originally proposed by zipf ( ). according to zipf, the properties of language are determined by a principle which should have a crucial impact on human behaviour in general; he calls it the principle of least effort. as for the relationship between the principle of least effort and polysemy, zipf ( : – ) pointed out that

[f]rom the viewpoint of the speaker (the speaker’s economy) who has the job of selecting not only the meanings to be conveyed but also the words that will convey them, there would doubtless exist an important latent economy in a vocabulary that consisted exclusively of one single word – a single word that would mean whatever the speaker wanted it to mean. (. . .) but from the viewpoint of the auditor, who has the job of deciphering the speaker’s meanings, the important internal economy of speech would be found rather in a vocabulary of such size that it possessed a distinctly different word for each different meaning to be verbalized.

as a result of these opposite communication strategies, an equilibrium regarding the number of meanings to be conveyed by particular words should emerge; this assumption was corroborated in zipf ( ), tuldava ( ), ferrer i cancho and solé ( ), ferrer i cancho ( a), and kelih ( ). it was shown that the distribution of the number of meanings follows certain regularities.
inspired by these findings, in the following paragraphs we hypothesize some relationships between both polysemy and synonymy, on the one hand, and degrees of nodes representing canonical word forms (lemmas) in a syntactic complex network, on the other.

first, it is necessary to realize that the meaning of a word is strongly influenced by words which are in a syntactic relation with it in actual language usage. obviously, the presence of syntactically related words makes it possible to express meaning refinements; thus, syntactically related words take part in conveying the meaning of the word under consideration. consequently, it is reasonable to assume that, with an increasing variability of syntactic contexts of the word, the variability of subtle meaning differences of the word also increases. if the differences of meaning are ‘striking enough’, they are detected by a lexicographer and they are recorded in dictionaries or databases such as the wordnet (see below) which are usually used for the determination of the number of meanings of the word (i.e. its polysemy).

further, from the speaker’s point of view, the presence of syntactically related words enables the speaker to use a unique word for conveying more meanings because syntactically related words take part in its meaning expression and its differentiation. this is in clear agreement with the speaker’s economy: the speaker tends to use as few words as possible for as many meanings as possible (see above); obviously, the ‘cost’ of the increasing polysemy of the word is the length (or complexity) of the syntactic constructions needed for the expression of the meaning (cf. köhler, ). from the auditor’s point of view, the presence of syntactically related words makes the determination of the meaning of the word easier, cf. the priming effect (hoey, ).
so, the auditor’s economy ‘presses’ the speaker to use words with more meanings in a syntactically more complex environment because this makes it easier for the auditor to determine the meaning of the entire expression. if the speaker ignored the auditor’s economy, the probability of the auditor’s misunderstanding would increase and, consequently, so would the probability that the auditor asks the speaker for a new, better explanation. from the speaker’s point of view, the repetition of an expression conveying the same meaning is in clear contradiction to his communication strategy (i.e. the speaker’s economy).

in sum, because both the out-degree and the in-degree of the word represent the number of syntactically related words in a corpus, we hypothesize that the higher the out-degree (or in-degree) of the word, the more meanings the word has.

as for the relationship between synonymy and the degree of the word, it should be viewed as a consequence of the relationship between polysemy and synonymy. the more meanings a word obtains, the more semantic domains of other words it penetrates and, consequently, these words become its synonyms (cf. köhler, ; wimmer and altmann, ). hence, because the relationship between the degree and polysemy is assumed, the relationship between the degree and synonymy should also be hypothesized, i.e. the higher the out-degree/in-degree of the word, the more synonyms it has.

fig. the network containing the first fifty lemmas from the english treebank

fig. the tree graph expressing the structure of the sentence ‘tom saw my parents yesterday in boston’ based on the dependency grammar formalism. links between words represent the syntactic dependency relations, the arrows express the direction of the dependency
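the out-degree and in-degree used in the hypotheses can be illustrated with a minimal python sketch. the edge list is an assumed lemmatized analysis of the example sentence ‘tom saw my parents yesterday in boston’ (with the preposition treated as the head of its group), and the function name `degrees` is hypothetical; it is not part of any tool used in this study. repeated head-modifier pairs are counted once, in line with counting only unique connections between lemmas.

```python
from collections import defaultdict

def degrees(edges):
    """Return (out_degree, in_degree) dicts for a lemma network.

    edges: iterable of (head_lemma, modifier_lemma) pairs, one per
    dependency relation; links run from the head to the modifier.
    Repeated pairs are counted once (only unique connections count).
    """
    modifiers = defaultdict(set)  # head -> distinct modifiers
    heads = defaultdict(set)      # modifier -> distinct heads
    for head, modifier in edges:
        modifiers[head].add(modifier)
        heads[modifier].add(head)
    out_degree = {lemma: len(mods) for lemma, mods in modifiers.items()}
    in_degree = {lemma: len(hs) for lemma, hs in heads.items()}
    return out_degree, in_degree

# Assumed dependency analysis of the example sentence (lemmatized).
EXAMPLE_EDGES = [
    ("see", "tom"), ("see", "parent"), ("parent", "my"),
    ("see", "yesterday"), ("see", "in"), ("in", "boston"),
    ("see", "tom"),  # the same relation seen again, e.g. in another sentence
]

out_degree, in_degree = degrees(EXAMPLE_EDGES)
print(out_degree["see"])    # 4
print(in_degree["boston"])  # 1
```

lemmas that never act as a head (such as ‘tom’ here) simply have no entry in `out_degree`; a real pipeline would probably want to record them with a degree of zero.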
language material and methodology

for empirical studies on the relation between variability of the syntactic context and polysemy of words, we naturally need two types of data resources: ( ) syntactically annotated data, and ( ) dictionary data with enumerated words’ meanings. in our experiments, the role of the former type is played by dependency treebanks, while wordnets are used as the latter type. we managed to interlink the two types of information for six languages so far: czech, dutch, english, german, italian, and spanish. although treebanks and wordnets are simultaneously available for several other languages, we were not able to include them in our experiments due to various technical obstacles, such as an insufficient size, a missing lemmatization, or an incompatible tokenization.

we did not use the original treebank shapes for building dependency networks, but we used their ‘prague dependency treebank—harmonized’ forms, as introduced in zeman et al. ( ), where many treebanks were transformed in order to maximize the compatibility of the resulting dependency trees with annotation guidelines for so-called analytical trees (surface-syntactic dependency trees) of the prague dependency treebank (hajič et al., ). for instance, subordinating conjunctions are heads of subordinating clauses according to the prague dependency treebank style, analogously to prepositions being heads of prepositional groups. we do not claim that the prague dependency treebank style is superior to other treebanks’ conventions when it comes to building syntactic networks, but we prefer to treat all the languages as uniformly as possible, and the prague dependency treebank harmonization was the only normalization that was readily available to us.

we used data originating from the treebanks listed below (cf. zeman et al., , for more details on the respective resources and associated normalization procedures):
czech: prague dependency treebank . (hajič et al., ),
dutch: alpino treebank (van der beek et al., ),
english: penn treebank (surdeanu et al., ),
german: tiger treebank (brants et al., ),
italian: italian syntactic-semantic treebank (montemagni et al., ),
spanish: ancora (taulé et al., ).

the first wordnet database was published by miller et al. ( ) at princeton university. since then, more than seventy national wordnets were created following the same principles as the original princeton wordnet for english (cf. horák et al., ). the data in wordnet databases are organized as networks of basic entities called ‘synsets’, synonym sets. each synset corresponds to one meaning of a word or a collocation. the synonymy relation in wordnet is not the standard ‘strict’ synonymy, where synonymous words are simply identical (or nearly identical) in meaning (e.g. ‘pretty’ and ‘handsome’). the word meanings in synsets are ‘in a near synonymy’ relation: they are synonymous in the sense that they can be exchanged in the ‘same contexts’. for example, the synset ‘exist: , be: ’ in wordnet relates the fourth meaning (generally called a ‘literal’) of ‘to be’ with the first meaning of ‘to exist’ with the definition of ‘to have an existence’.

for the purpose of the current experiment, we used the data of six languages in the form resulting from the eurowordnet and balkanet eu projects.

table. ranked distribution of out-degrees in the czech treebank (columns: rank, out-degree)
the studied wordnets have the following statistics:
czech wordnet: , words and collocations, , synsets, , literals;
dutch wordnet: , words and collocations, , synsets, , literals;
english wordnet: , words and collocations, , synsets, , literals;
german wordnet: , words and collocations, , synsets, , literals;
italian wordnet: , words and collocations, , synsets, , literals;
spanish wordnet: , words and collocations, , synsets, , literals.

table. mean values of out-degrees/in-degrees, the mean number of synonyms, and the mean number of meanings in the czech treebank (columns: out-degree, synonyms, polysemy, in-degree, synonyms, polysemy)

using these wordnet data, the nodes of the treebank-based syntactic networks were labelled with two values:
number of synonyms: the number of other literals in all synsets for the lemma,
number of meanings: the number of different literals/synsets for the lemma.

for example, the verb ‘intend’ has four meanings/synsets in the english wordnet:
(1) intend: , mean: , think: ;
(2) intend: , destine: , designate: , specify: ;
(3) mean: , intend: ;
(4) mean: , intend: , signify: , stand for: .
the node ‘intend’ thus received nine synonyms and four meanings.

in constructing the syntactic dependency networks, the methods developed by ferrer i cancho et al. ( ) and liu ( ) were followed. each node of the complex network represents a particular lemma. two nodes are linked if there is a dependency relation between the respective lemmas in the treebank.
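the labelling of nodes with synonym and meaning counts, as in the ‘intend’ example, can be sketched as follows. this is a minimal illustration under stated assumptions: synsets are represented as plain lists of words with the sense numbers omitted, and the function name `wordnet_counts` is hypothetical rather than taken from any wordnet software.

```python
def wordnet_counts(lemma, synsets):
    """Return (number_of_synonyms, number_of_meanings) for a lemma.

    synsets: list of synsets, each given as a list of literal words.
    Meanings = synsets that contain the lemma.
    Synonyms = other literals in those synsets, counted per synset.
    """
    own = [synset for synset in synsets if lemma in synset]
    synonyms = sum(len([w for w in synset if w != lemma]) for synset in own)
    return synonyms, len(own)

# The four synsets of 'intend' listed in the text (sense numbers omitted).
INTEND_SYNSETS = [
    ["intend", "mean", "think"],
    ["intend", "destine", "designate", "specify"],
    ["mean", "intend"],
    ["mean", "intend", "signify", "stand for"],
]

print(wordnet_counts("intend", INTEND_SYNSETS))  # (9, 4)
```

the count of nine arises because a word such as ‘mean’ is counted once for every synset it shares with ‘intend’; with sense numbers attached, these are distinct literals.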
the links are directed; they go from the head to the modifier (see fig. ). a single graph is used in our analysis, i.e. only unique connections between particular lemmas are counted. thus, a global syntactic dependency network is constructed by accumulating sentence structures, and the network should be viewed as an emergent property of sentence structures (ferrer i cancho et al., ; ferrer i cancho, b). the free software pajek . (cf. de nooy et al., ) was used for network creation and computing. for an illustration, fig. shows the syntactic dependency complex network containing the first fifty lemmas from the english treebank by surdeanu et al. ( ).

fig. relation between the mean out-degree (x-axis) and the mean number of synonyms (y-axis); panels: cz, du, en, ge, it, sp

statistical methodology

we follow the procedures described in čech et al. ( ), which we briefly recall here. first, words which have zero meanings (as determined by the absence of synsets in the wordnet) were omitted from our analyses. in the following step, rank-frequency distributions of out-degrees were constructed (for each language separately). the rank-frequency distribution is then exploited in the construction of the figures in section (ranks were used in the binning procedure; see below). values of out-degrees are ordered from the highest to the lowest; the highest value receives rank 1, the second highest rank 2, etc. thereafter, we consider the ranks to be values of an auxiliary random variable and the out-degrees their frequencies. this auxiliary variable was used because the data are highly skewed: histograms constructed directly from both out- and in-degree values contain huge numbers of empty bins. in the example from the czech language, cf. table , in the rank-frequency distribution we assume to have , -times value , , -times value , etc; value , does not occur (it has frequency 0).

fig. relation between the mean out-degree (x-axis) and the mean number of meanings (y-axis); panels: cz, du, en, ge, it, sp

next, a histogram was created from the rank-frequency distribution. its bin width was calculated first according to scott ( ). then, the bin boundaries were adjusted so that equal values (i.e. equal out-degrees in the original data) are kept in one bin (e.g. all ranks corresponding to out-degree belong to the same bin). the reason for adjusting the bin widths (which results in a histogram with different bin widths) is that there is no reasonable hierarchy of word lemmas which share the same out-degree in a network; they appear in the same order as they were entered into the treebanks (or then found in the process of network creation). therefore, we assign all lemmas with the same out-degree to the same histogram bin. then, for each histogram bin we computed the mean out-degree, the mean number of synonyms, and the mean number of meanings (i.e. the mean polysemy) of the lemmas represented by ranks belonging to the bin.

the same procedure was applied also to in-degrees. we thus obtained six data sets for out-degrees and six data sets for in-degrees. this approach (ranked frequencies) serves as a tool for a better visualization; the tests (cf. section ) were performed on the original data.

the two hypotheses from section (the higher the out-degree/in-degree of a word, the more meanings it has; the higher the out-degree/in-degree of a word, the more synonyms it has) were tested using the kendall correlation coefficient (cf. hollander and wolfe, : ); see section .

fig. relation between the mean in-degree (x-axis) and the mean number of synonyms (y-axis); panels: cz, du, en, ge, it, sp
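for readers unfamiliar with the kendall correlation coefficient, the following is a minimal pure-python sketch. it implements the tau-b variant, which corrects for ties; choosing tau-b is an assumption of this sketch (ties are frequent in degree data), since the text names only the kendall coefficient. the function name `kendall_tau_b` and the toy data are hypothetical and do not come from the treebanks.

```python
from math import sqrt

def kendall_tau_b(xs, ys):
    """Kendall's tau-b for two equally long numeric sequences.

    Counts concordant and discordant pairs and corrects the
    denominator for pairs tied in only one of the variables.
    Quadratic in the sample size; meant for illustration only.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: not counted
            elif dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied_x = concordant + discordant + tied_y_only  # pairs not tied in x
    untied_y = concordant + discordant + tied_x_only  # pairs not tied in y
    if untied_x == 0 or untied_y == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / sqrt(untied_x * untied_y)

# Hypothetical toy data: out-degrees and numbers of meanings of five lemmas.
toy_out_degree = [1, 1, 2, 3, 5]
toy_meanings = [1, 2, 2, 3, 4]
print(round(kendall_tau_b(toy_out_degree, toy_meanings), 3))  # 0.889
```

a positive value indicates that lemmas with a higher degree tend to have more meanings; for samples of realistic size a library routine with an n log n algorithm would be preferable to this quadratic loop.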
results

the mean numbers of synonyms and meanings in the histogram bins tend to increase quite clearly with increasing out-degree/in-degree. the values for the czech treebank can be found in table . figures – present the values graphically. we use the following abbreviations for the languages: cz (czech), du (dutch), en (english), ge (german), it (italian), sp (spanish).

in addition to the figures, the association between the variables (out-degree/in-degree and the number of synonyms, out-degree/in-degree and the number of meanings) was measured by the kendall correlation coefficient (cf. hollander and wolfe, ). the values of the correlation coefficient can be found in table . as the original data without any smoothing were used, the coefficients are relatively small; nevertheless, the correlation is positive and statistically significant in all cases (all p-values are below . ), which, together with the trends for binned data in figs – , corroborates the hypotheses from section (the higher the out-degree/in-degree of the word, the more synonyms it has; the higher the out-degree/in-degree of the word, the more meanings it has). given that the sample sizes in table are quite large, the (very) small p-values themselves do not say too much: it is well known that, for large samples, virtually all null hypotheses are rejected (cf., e.g. mačutek and wimmer, , and references therein). therefore, figs – provide perhaps more reliable evidence.

fig. relation between the mean in-degree (x-axis) and the mean number of meanings (y-axis); panels: cz, du, en, ge, it, sp

conclusions

the results presented in the study brought two main findings. first, we found out that one of the most fundamental syntactic network properties, the degree of the node, significantly correlates with some important semantic properties (polysemy and synonymy) of language.
moreover, the hypothesis concerning the relationships between degrees and polysemy (or synonymy) is based on theoretical linguistic reasoning. consequently, our findings, perhaps, advocate the usage of complex networks in linguistic research and they can be viewed as a constructive response to the call for a linguistic explanation of syntactic network properties (cf. cong and liu, a, b; ferrer i cancho, ).

second, the empirical corroboration of the hypothesis concerning the relationship between the degree and polysemy can be interpreted as a deeper insight into the well-known relationship between frequency and polysemy (e.g. zipf, ; baayen and moscoso del prado, ; ilgen and karaoglan, ). in other words, the original hypothesis concerning the impact of the word (or lemma) frequency on polysemy presupposes ‘implicitly’ that frequent words occur in more contexts and, consequently, this fact leads to an increase of polysemy. in our study, contexts have been ‘explicitly’ operationalized: the degrees in fact express the number of syntactic contexts. further, the use of single graphs diminishes the impact of frequency as much as possible in studies of this kind and it allows observing the impact of a purely syntactic property (i.e. the degree of the node in a syntactic complex network) of the lemma on its polysemy. to sum up, our approach experimentally corroborates the implicit assumption and reveals more detailed characteristics of the relationship between these important language characteristics.

acknowledgements

this work was supported by the czech science foundation [grant no. p / / to r.č.]; the scientific grant agency of the slovak republic [grant no. vega / / to j.m.]; the ministry of education of the czech republic [lindat-clarin project lm to a.h.]; the ministry of the interior of the czech republic [project vf to a.h.]; and by the czech science foundation [project p / / to a.h.].

references

abramov, o. and mehler, a. ( ).
automatic language classification by means of syntactic dependency networks. journal of quantitative linguistics, : – . baayen, h. and moscoso del prado, m. ( ). semantic density and past tense formation in three germanic languages. language, : – . barabási, a. ( ). the new science of networks. cambridge, ma: perseus publishing. barabási, a. and albert, r. ( ). emergence of scaling in random networks. science, : – . brants, s., dipper, s., hansen, s., lezius, w. and smith, g. ( ). the tiger treebank. in hinrichs, e. and simov, k. (eds), proceedings of the first workshop on treebanks and linguistic theories. sozopol: bulgarian academy of sciences, pp. – .

table. kendall correlation coefficients (columns: cz, du, en, ge, it, sp; rows: number of lemmas; synonyms, out-degree; meanings, out-degree; synonyms, in-degree; meanings, in-degree)

coloma, g. ( ). towards a synergetic statistical model of language phonology. journal of quantitative linguistics, : – . cong, j. and liu, h. ( a). approaching human language with complex networks. physics of life reviews, : – . cong, j. and liu, h. ( b). linguistic complex networks: rationale, application, interpretation, and directions: reply to comments on ‘‘approaching human language with complex networks’’. physics of life reviews, : – . corominas-murtra, b., valverde, s. and solé, r. v. ( ). emergence of scale-free syntax networks. in nolfi, s. and mirolli, m. (eds), evolution of communication and language in embodied agents. berlin: springer, pp. – . čech, r. and mačutek, j. ( ). word form and lemma syntactic dependency networks in czech: a comparative study. glottometrics, : – . čech, r. and mačutek, j. ( ). on quantitative analysis of valency in czech. in grzybek, p., kelih, e. and mačutek, j. (eds), text and language. structures, functions, interrelations, quantitative perspectives.
wien: praesens verlag, pp. – . čech, r., mačutek, j. and žabokrtský, z. ( ). the role of syntax in complex networks: local and global importance of verbs in a syntactic dependency network. physica a: statistical mechanics and its applications, : – . čech, r., pajas, p. and mačutek, j. ( ). full valency. verb valency without distinguishing complements and adjuncts. journal of quantitative linguistics, : – . de nooy, w., mrvar, a. and batagelj, v. ( ). exploratory social network analysis with pajek. new york: cambridge university press. ferrer i cancho, r. ( a). zipf’s law from a commu- nicative phase transition. european physical journal b, : – . ferrer i cancho, r. ( b). the structure of syntactic dependency networks: insights from recent advances in network theory. in altmann, g., levickij, v. and perebyinis, v. (eds), problems of quantitative linguistics. chernivtsi: ruta, pp. – . ferrer i cancho, r. ( a). when language breaks into pieces. a conflict between communication through isolated signals and language. biosystems, : – . ferrer i cancho, r. ( b). why do syntactic links not cross? europhysics letters, : – . ferrer i cancho, r. ( ). some word order biases from limited brain resources. a mathematical approach. advances in complex systems, : – . ferrer i cancho, r. ( ). network theory. in hogan, p. c. (ed.), the cambridge encyclopedia of the language sciences. cambridge: cambridge university press, pp – . ferrer i cancho, r. ( ). beyond description. comment on ‘‘approaching human language with complex networks’’ by cong & liu. physics of life reviews, ( ): – . ferrer i cancho, r. and solé, r. ( ). least effort and the origins of scaling in human language. proceedings of the national academy of sciences usa, : – . ferrer i cancho, r., solé, r. v. and köhler, r. ( ). patterns in syntactic dependency networks. physical review e, : article no. . ferrer i cancho, r., riordan, o. and bollobás, b. ( ). the consequences of zipf’s law for syntax and symbolic reference. 
proceedings of the royal society of london series b, : – . gao, s., zhang, h. and liu, h. ( ). synergetic properties of chinese verb valency. journal of quantitative linguistics, : – . hajič, j., panevová, j., hajičová, e., sgall, p., pajas, p., štěpánek, j., havelka, j., mikulová, m., žabokrtský, z. and ševčíková-razímová, m. ( ). prague dependency treebank . (cd-rom). philadelphia: linguistic data consortium. hoey, m. ( ). lexical priming: a new theory of words and language. new york: routledge. hollander, m. and wolfe, d. a. ( ). nonparametric statistical methods. hoboken: wiley. horák, a., pala, k. and rambousek, a. ( ). the global wordnet grid software design. in tanács, a., csendes, d., vincze, v., fellbaum, c. and vossen, p. (eds), fourth global wordnet conference proceedings. szeged: university of szeged, pp. – . hudson, r. ( ). language networks. the new word grammar. oxford: oxford university press. ilgen, b. and karaoglan, b. ( ). investigation of zipf’s law-of-meaning on turkish corpora. nd international symposium on computer and information sciences (iscis ), ankara: ieee, pp. – . ke, j. and yao, y. ( ). analysing language development from a network approach. journal of quantitative linguistics, : – . kelih, e. ( ). modelling polysemy in different languages: a continuous approach. glottometrics, : – . köhler, r. ( ). zur linguistischen synergetik: struktur und dynamik der lexik. bochum: brockmeyer. köhler, r. ( ). syntactic structures. properties and interrelations. journal of quantitative linguistics, : – . köhler, r. ( a). synergetic linguistics. in köhler, r., altmann, g. and piotrowski, r. g. (eds), quantitative linguistics. an international handbook. berlin; new york: de gruyter, pp. – . köhler, r. ( b). properties of lexical units and systems. in köhler, r., altmann, g. and piotrowski, r. g. (eds), quantitative linguistics. an international handbook. berlin; new york: de gruyter, pp. – .
köhler, r. ( ). quantitative analysis of syntactic structures in the framework of synergetic linguistics. in mehler, a. and köhler, r. (eds), aspects of automatic text analysis. berlin; heidelberg; new york: springer, pp. – .
köhler, r. ( ). quantitative syntax analysis. berlin; boston: de gruyter.
köhler, r. and altmann, g. ( ). begriffsdynamik und lexikonstruktur. in beckmann, f. and heyer, g. (eds), theorie und praxis des lexikons. berlin: de gruyter, pp. – .
liu, h. ( ). the complexity of chinese syntactic dependency networks. physica a: statistical mechanics and its applications, : – .
liu, h. ( ). quantitative properties of english verb valency. journal of quantitative linguistics, : – .
liu, h. and hu, f. ( ). what role does syntax play in a language network? epl, : article no. .
liu, h. and li, w. w. ( ). language clusters based on linguistic complex networks. chinese science bulletin, : – .
liu, h. and xu, c. ( ). can syntactic networks indicate morphological complexity of a language? epl, : article no. .
liu, h., zhao, y. and huang, w. ( ). how do local syntactic structures influence global properties in language networks? glottometrics, : – .
mačutek, j. and wimmer, g. ( ). evaluating goodness-of-fit of discrete distribution models in quantitative linguistics. journal of quantitative linguistics, : – .
mel’čuk, i. ( ). dependency syntax: theory and practice. albany: state university of new york press.
miller, g. a., beckwith, r., fellbaum, c., gross, d. and miller, k. ( ). five papers on wordnet (csl report ). princeton: princeton university.
montemagni, s., barsotti, f., battista, m., calzolari, n., corazzari, o., lenci, a. and zampolli, a. ( ). building the italian syntactic-semantic treebank. in abeillé, a. (ed), treebanks: building and using parsed corpora. dordrecht: kluwer, pp. – .
newman, m. e. j. ( ). networks. an introduction. oxford: oxford university press.
ninio, a. ( ).
language and the learning curve: a new theory of syntactic development. oxford: oxford university press.
ninio, a. ( ). syntactic development, its input and output. oxford: oxford university press.
prün, c. and steiner, p. ( ). quantitative morphologie. eigenschaften der morphologischen einheiten und systeme. in köhler, r., altmann, g. and piotrowski, r. g. (eds), quantitative linguistics. an international handbook. berlin; new york: de gruyter, pp. – .
scott, d. w. ( ). on optimal and data-based histograms. biometrika, : – .
solé, r. v. ( ). syntax for free? nature, : .
surdeanu, m., johansson, r., meyers, a., màrquez, l. and nivre, j. ( ). the conll- shared task on joint parsing of syntactic and semantic dependencies. in proceedings of the twelfth conference on computational natural language learning. manchester: acl, pp. – .
taulé, m., martí, m. a. and recasens, m. ( ). ancora: multilevel annotated corpora for catalan and spanish. in proceedings of the sixth international conference on language resources and evaluation (lrec’ ). marrakech, pp. – .
tuldava, j. ( ). probleme und methoden der quantitativ-systemischen lexikologie. trier: wvt.
van der beek, l., bouma, g., daciuk, j., gaustad, t., malouf, r., van noord, g., prins, r. and villada, b. ( ). algorithms for linguistic processing. nwo pionier progress report. groningen: graduate school for behavioral and cognitive neurosciences.
wang, l. ( ). synergetic studies on some properties of lexical structures in chinese. journal of quantitative linguistics, : – .
wimmer, g. and altmann, g. ( ). two hypotheses on synonymy. in ondrejovič, s. and považaj, m. (eds), lexicographica . bratislava: veda, pp. – .
zeman, d., mareček, d., popel, m., ramasamy, l., štěpánek, j., žabokrtský, z. and hajič, j. ( ). hamledt: to parse or not to parse?
in proceedings of the eighth international conference on language resources and evaluation (lrec’ ). istanbul, pp. – .
zipf, g. k. ( ). the psycho-biology of language. an introduction to dynamic philology. boston: houghton-mifflin.
zipf, g. k. ( ). the meaning frequency relationship of words. journal of general psychology, : – .
zipf, g. k. ( ). human behavior and the principle of least effort. cambridge: addison-wesley.

notes

the canonical word form is the infinitive form of a verb and the nominative form of a substantive, adjective, pronoun, and so on (for example, the lemma of the word forms ‘go’, ‘goes’, ‘went’, ‘gone’ is go). in this analysis, simple lemmatization is used; differences among particular parts of speech are not taken into account, e.g. all forms of the word ‘mean’ fall under one lemma, mean, without differentiation of verb, noun, or adjective. the terms polysemy, the number of meanings, and synonymy are used in accordance with the approach presented by wordnet (see section ). values for all languages analysed in the article can be found at http://www.cechradek.cz/data/network_polysemy_synonymy_tables.pdf. the kendall correlation is a measure of the monotonic relation between two variables. it achieves its maximum value, 1, if the relation ‘the greater x, the greater y’ holds for all pairs x, y; similarly, its minimum value, −1, corresponds to the relation ‘the greater x, the smaller y’.

re-defining the library

lynne brindley
the british library, london, uk

abstract

purpose: originally the keynote speech to the bielefeld conference exploring the challenges facing libraries in the digital age and considering ways in which they need to reshape and rethink their services and skills to maintain their relevance and contribution.
design/methodology/approach: a review of a wide range of recently published materials ( – ) gives a broad perspective on the challenges facing libraries. these are then considered within the case study experience of the british library to identify key themes for redefining the concept of ‘library’.
findings: gives a clear articulation of the challenges facing libraries. through the case study the author identifies seven themes as central to redefining the library in the 21st century: 1. know your users and keep close to them; 2. re-think the physical spaces and create a desirable draw; 3. integrate marketing into the organisation; 4. open up legacy print collections to digital channels and through digitisation; 5. reduce legacy costs and continue to improve productivity in traditional activities; 6. invest more in innovation and digital activities; 7. develop our people and ensure the right mix of skills.
practical implications: a practical source of ideas for those seeking to develop their own library activities and a thought-provoking analysis for anyone interested in the implications of the digital age.
originality/value: this paper gives an original view of changes within the library sector from one of the leaders in the field and is rooted in the practical and innovative approaches adopted by one of the world’s great research libraries.
keywords: libraries, change management, strategic management
paper type: conceptual paper

i am delighted and honoured to be giving the keynote speech at this important and prestigious conference of experts. thank you for the invitation. there is one advantage of giving a keynote speech, namely that no-one has already covered what you are about to say (as so frequently happens when you are later on the programme) and it is the prerogative of the keynote speaker to ask more questions than to give answers!
i’m sure that i will be doing just that, for the theme of the conference – re-defining, re-thinking, developing new paradigms for library and information services in the 21st century – is fundamental to all of us and to the future health of scholarship and research in all our countries and in our institutions and for those we serve. this is a formidable leadership responsibility. on a less serious note i have also decided to spare you a power-point presentation. i’m sure there will be plenty of those to come: indeed it is my normally preferred style but i think that a keynote speech should be just that!

flavour of the challenges

when i was starting to prepare for this paper – blank screen in front of me, several cups of coffee drunk and all possible displacement activities completed – i happened to receive the january copy of information world review ( ) through the post at home. i needed to look no further than scanning the topics of the first few pages to indicate a flavour of the challenges we face. let me share a few with you:
• the accuracy of wikipedia articles on science is validated by nature. the error level is not significantly greater than levels in encyclopaedia britannica;
• libraries are urged to embrace web 2.0, the second generation of internet technology, to satisfy user demands; existing library catalogue standards, such as marc and z39.50, are said to need replacing by xml technology, enabling access to information from a wide variety of web services; library catalogues are compared unfavourably to amazon and google search services;
• the now rather done to death topic of open access is covered with a suggestion of a split among science, technology and medicine publishers: oxford university press, blackwell publishing and springer, all of whom have signed up to the wellcome trust open access model, versus the dutch (reed elsevier, wolters kluwer) and the north americans (wiley, sage), who appear to be steadfastly unmoved by open access;
• michael gorman slams digitisation of scholarly texts as a waste of money, while elisabeth niggemann, director of die deutsche bibliothek, urges both public (european union) and private sector investment to ensure wider access to collections; and
• apple’s flagship store in regent street in london is projected as a model for libraries as they reshape space and service provision to encourage knowledge exchange and the solving of information problems.

there is also a review of the year (chillingworth, ) - the stories that rocked your world in - and just a sample of topics is indicative of the complexity and rapidity of change within our information world over just twelve months: in january google announced its plans to digitise the collections of five major libraries, including the bodleian at oxford. later developments included the open content alliance (oca) with yahoo and microsoft, and the british library/microsoft agreement to digitise millions of pages; macmillan joined the digitisation goldrush with bookstore; publishers and authors hit google with lawsuits over copyright violation; and amazon announced a new service allowing book content to be purchased by the page or chapter. digitisation has never been such a hot topic! content may really be king, at least for a little while!
but in the uk subject librarians were being threatened with redundancy at bangor university; they were not considered as offering value for money compared with the net; and at the science museum the library collections were to be split up in the face of a financial crisis. regulatory and legislative issues pertaining to the digital environment surfaced strongly. in the uk the freedom of information act came into force; in a year when terrorist bombs struck in london, terrorism was never far from the headlines with new legislation on police powers and criminal evidence and a new terrorism bill all threatening to compromise librarians and their normal business activities. and as the year ended the uk government announced a fundamental review of intellectual property in the digital environment to take place in . what does all this mean? i have laboured these developments, to some extent a rather random sample, because they seem to me to represent either explicitly or implicitly the nature of the serious challenge to libraries and to information professionals in this first decade of the st century. they suggest a picture of ever more rapid innovation, mostly happening outside libraries and driven from the commercial sector; a picture of confusion and contradiction in the range of business models that are emerging and being experimented with; and of new demands from discerning and empowered users. let me quote from david warlock (cited in "flashback", ) a well respected industry watcher who says: […] users are in control on the networks, and exhibit a power unparalleled in the legacy world of print […] the gold […] is the greater ability of users to discriminate, select, personalise and customise, as powerful players in the information industry network. (p. ) but such a challenge is also an exciting opportunity for the library and information sector to play new roles and to define a new future in a very fast moving and competitive environment. 
there is in any case no choice but to change, and change quickly if we wish to remain relevant for the future.

challenges for the library and information sector

the challenge for libraries in the 21st century, as now only one part of a great diversity of alternatives, is to find new ways to add value and remain relevant in this rapidly changing, confusing and competitive environment. while the distant future for libraries is not clear, it is timely for libraries to challenge some historic assumptions and ask some fundamental strategic questions:
• how can we serve the needs of the digitally savvy, impatient google generation for whom the web – a global information commons – has primacy of place for information and knowledge seeking?
• how can we continue to enable the research and learning process when increasingly it is happening in a virtual realm outside the context of the library?
• how can we be relevant to those who have never set foot in a library – to provide the infinite connectivity to information with its stacks in the ether?
• does the library as place have relevance and how should space be best used?
• where should we focus in the information value chain, and what should we not do?
• how can libraries provide effective stewardship of both digital and physical collections, and what is our role regarding non-traditional information types, such as e-science data?
• how are publishing and intellectual property regimes changing, and how must we influence thinking on them and change in response?
• what types of skills do libraries need to exploit advances in technology and informatics, both to enhance knowledge exploration and presentation and to enable new ways of searching and mining their collections?
• what types of collaboration and alliances do libraries need to engage in to present coherent collections and to create innovative new products and services for content delivery?
these were some of the questions we asked ourselves at the british library when we were formulating our new strategy in / (the british library, ). technology is arguably turning on its head our assumptions about our value, it is challenging the roles of all accepted players; and it is enabling increasingly promiscuous users with different and higher needs to have much wider choice to fit their digital lifestyles. perceptions of libraries and information resources i hope that i have made at least the outline case for major, transformational change in the library and information sector, driven from the imperatives of the external environment – the information industry, the technology and most importantly the demands of the users. i want now to move on to some observations of how libraries are perceived by their users, to give context to the nature of the change and re-definition that may be required. in this i have been greatly assisted by a recent report published by the online computer library center ( ). if you have not already seen the report i commend it warmly to you as a rich, international source of market research data on perceptions of the library and its services, which should act as another wake-up call for us all. the report followed on from an earlier environmental scan identifying some dissonance in expectations of libraries. user priorities were seen as ease of use, convenience and availability, all regarded as equally important to the information consumer as information quality and trustworthiness (online computer library center, a). the objectives of the study were to ascertain how libraries are perceived by today’s information consumer and to see whether libraries still matter and what future patterns of library use might be. with this acknowledgement let me pick out some of the salient headlines: libraries are seen as more trustworthy/credible and as providing more accurate information than search engines. 
search engines are seen as more reliable, cost-effective, easy to use, convenient and fast. the library is not the first or only stop for many information seekers. search engines are the favourite place to begin a search and respondents indicate that google is the search engine most recently used to begin their searches. in addition users wanted ‘more books’ and longer and more convenient opening hours. perhaps no surprises there! through increasing familiarity with search engines and the web comes greater self-reliance of information consumers, who feel confident in their own evaluation of sources of all kinds. survey respondents are generally satisfied with libraries and librarians but most do not plan to increase their use of libraries. indeed the brand association of libraries appears to be rather depressingly nostalgic, traditional, and focussed on books. even with their strong emotional attachment to the idea of the library there was clear dissatisfaction with the physical and service experience of the libraries they use. poor signage, inhospitable surroundings, unfriendly staff, lack of parking, dirt, cold, hard-to-use systems and inconvenient hours were repeatedly mentioned. and finally more on brand image. most respondents feel that library is synonymous with books. books dominated responses across all regions surveyed and across all age groups, despite libraries’ growing investment in electronic resources and digital activities. in summary, these findings do not make comfortable reading. i am sure that each one of you will wish to argue that the findings do not apply to your particular institution or type of library, but that might suggest a level of denial that would be inappropriate. 
the opportunities are there for significant re-branding and re-positioning in the design and delivery of both digital and physical services, recognising that the information landscape will if anything become even more competitive and consumers will become even more discerning and willing to take what information they need from wherever they can most conveniently and painlessly get it. the call to action suggests the need for much deeper understanding of each of our user communities; a much more developed sense of place as social context for services; and greater attention to relevance, distinctiveness and convenience of all of our future services.

how libraries stack up

lest we get too concerned i would draw your attention to yet another online computer library center report ( b) which, through a range of statistical data, re-affirms the importance and comparative scale of library activity, primarily but not exclusively in a usa context. the basis on which new opportunities can be built is immense:
• us public library cardholders outnumber amazon customers by almost to ;
• each day, us libraries circulate nearly times more items than amazon handles;
• one out of every six people in the world is a registered library user;
• five times more people visit us public libraries each year than attend us professional and college football, basketball, baseball and hockey games combined;
• there are over million libraries worldwide with billion volumes; and
• there are some , librarians worldwide.

we already know how well information professionals network (virtually and physically); we already know how willing we are to share experience and best practice. these are strong characteristics that should serve us well as we seek to ramp up the scale and pace of change.

how do other industries respond to change?

i would like to turn now briefly to what, if any, help and understanding we might get from models for change more generally.
when considering the issue of re-definition in the context of the british library i have found useful the framework offered by mcgahan ( ) in a harvard business review article entitled ‘how industries change’. she argues strongly that to develop strategy and make appropriate investments in innovation within the organisation requires a real understanding of the nature of change within the industry. this is of course easy to say but difficult to assess, particularly when taking a long-term view in a rapidly changing short-term context. the business world is littered with misinterpretations of signs. she suggests, however, four distinctive trajectories of change – radical, progressive, creative, and intermediating. i believe that libraries and information services (depending on their particular nature) are operating in an environment between intermediating and radical change. an industry on a radical change trajectory is entirely transformed, probably over a timescale of decades, with an end result of complete reconfiguration (usually diminished). companies dealing with radical transformation, it is suggested, should move strongly to improve productivity in existing activities without significant investment, conduct experiments with new products and services and develop new distribution channels. intermediating change is more common than radical change. it is where the core assets – knowledge, brand, content, patents – retain much of their value if they are used in new ways. this requires the simultaneous preservation of valuable assets and re-structuring of key relationships, and means finding innovative and unconventional ways of extracting value from core resources. managing this dual track approach is extremely challenging.
from a british library perspective we continue to focus on increasing productivity and streamlining traditional processes (largely through systems changes); we are finding innovative ways of exploiting our core assets of content combined with expertise; and we are opening up new channels of delivery largely through digital partnerships and new service developments. this re-positioning in the digital library world, at the same time as sustaining our core statutory functions both for the print and digital domains, is a major leadership and management challenge requiring changes in structure, skills and investment patterns. digital library challenges it is interesting to also consider the views of leading experts in digital library developments and associated research. cliff lynch ( ) in assessing prospects for digital libraries in the next decade suggests strongly that the major challenge is to: connect and integrate digital libraries with broader individual, group and societal activities, and doing this across meaningful time horizons that recognize digital libraries and related constructs as an integral and permanent part of the evolving information environment. additionally he argues that: the issue of the future of libraries as social, cultural and community institutions, along with related questions about the character and treatment of what we have come to call “intellectual property” in our society, form perhaps the most central of the core questions within the discipline of digital libraries – and that these questions are too important to be left to libraries, who should be seen as nothing more than one group among a broad array of stakeholders. this questioning of ‘what is a digital library anymore, anyway’ is echoed in a challenging article of this title by carl lagoze et al. ( ) at cornell university. 
he worries about a perception that the ‘googlization’ of digital libraries and information more generally means that digital library problems have either already been solved or will be solved by google, msn, yahoo! and others. one might regard this as simply a plea for more research funding from interested parties, or perhaps more seriously as a need for us to think well beyond search and access as presently conceived towards a much richer information environment for information sharing, aggregation, manipulation, collaborative working, and indeed digital preservation which cliff lynch sees as an enormous, fundamental societal issue for the next decade. so where is all this leading us and what should we be doing? i hope that so far in this keynote speech i have painted a broad picture of an increasingly challenging information environment within which libraries of all types operate. just reflecting on developments within the last twelve months the pace of change is probably faster than any of us has experienced in our professional lives so far. in addition there is much evidence that in this new context whilst libraries are still valued there are increasing signs of dissatisfaction and perceptions of a lessening value compared with other options. however, the changing environment provides opportunities for re-defining our roles and our relevance and for developing new roles in a much wider range of public and private partnerships and collaborations. there is a need for strong strategic and thought leadership, more risk-taking than has generally been associated with libraries, and greater understanding of and attention to the changing demands of our different user populations. i would like therefore in the final part of my talk to pull together some themes of general strategic relevance for the conference, drawing on our experience of major change, re-branding and strategy development at the british library over the past few years. 
the british library has become more externally-focussed and market-facing; we have rationalised and modernised our portfolio of services and have a higher public and government profile. we have faced up to difficult staffing issues, have brought in people with new and complementary skills and have invested to catch up on our technology infrastructure and do leading work in areas such as digital preservation.

re-defining the library

i would like now to offer some themes that i believe will be central to the continuing re-definition and re-positioning of libraries to remain relevant in the 21st century. they are likely to be more or less relevant to you depending on the context of your service and are certainly not comprehensive.

1. know your users and keep close to them (and your lost users and your non-users)

if there is a common message coming across from all those thinking deeply about the future of libraries it is that we need to be more deeply involved with our users: to really understand how their work patterns are changing, to anticipate their future requirements, and to see how information services can be better integrated with the increasingly digital life-style of new generations of students, researchers and knowledge workers. the british library had, like many libraries, for a long time viewed its users as a homogeneous group of ‘readers’. in , we looked closely at this group and identified, unsurprisingly, that not all readers were the same.
using expertise from commercial marketing we identified clear audiences, all of whom had specific ideas of what they needed from the library: researchers, including staff in higher and further education; postgraduates; high r&d industries; writers and scholars; government researchers; the library network; schools and young people; individuals – including undergraduates – pursuing their own research projects; the wider, general public; and business, which was identified as a core audience which at that stage was under-utilising the wealth of resources the library had to offer. our core functions as a great research library are primary but our users and their needs are varied and distinct. defining a library that will balance the needs of the humanities professor, the new entrepreneur, the undergraduate with approaching finals and the office worker looking for somewhere pleasant to spend their lunchbreak is a challenge, but our new, focused understanding of who our users are is an important catalyst for creativity in approaching the rest of my core themes.

2. re-think the physical spaces of the library and create a ‘desirable draw’

libraries should aim to be uplifting, innovative and inspiring cultural, social and intellectual spaces, encouraging debate and collaboration, and desirable as places to be in, even in the age of ubiquitous internet access. some of the best models for the future will come, not simply from existing libraries (although there are some stunning international examples) but also from bookshops, apple stores, museums and galleries and new concepts in retailing spaces. if the british library is the physical representation of the information age, what should it be like? can the surroundings and atmosphere add value to – even enhance – the information we are providing?
while defining our users, we identified that we were underutilised by business and entrepreneurs: an important audience for ourselves and for our role as support to the uk’s creative and technological economy. market research showed that for every £ of public investment we received, we generated £ . of national value. how could we expand the use business made of our resources? what could we do to make ourselves more appealing to them? we found a good model in the new york public library’s science, industry and business library and i am proud that, one month tomorrow, the british library’s business and ip centre will open. the aim will be to deliver a coherent programme of support at each stage of the innovation lifecycle: information support and networking space, with expert support from partner organisations on business planning, financial development and intellectual property. but rethinking our physical space need not always mean rebuilding it. the british library’s st pancras site had just opened when i arrived in , after a costly and protracted building process. much of what we have done in the last five years has been about encouraging new audiences to come and use us as a public amenity. every summer now we have live music sessions on the piazza and, tying into our launch of mozart’s digitised musical diary, our central foyer was filled with people listening to live performances of mozart string quartets. we have installed london’s largest public wi-fi space and are becoming a place of choice for laptop users as well as delivering an additional resource for our core researchers. and before christmas we gained much media coverage for our nobel prize exhibition through our second “mingles” evening billed under the strapline “in love with science and want to share your passion? come and meet other like-minded single people and attend an open evening at the exhibition”. libraries can be sexy - over new visitors came along that night.

3. integrate marketing in your organisation and in the way you approach strategy and service development

my two previous themes have not been about changing the core library function of repository and steward, but have been about changing our attitudes to that function to take a proactive approach to engaging our users and marketing our resources to meet their needs. as we seek to redefine ourselves it is essential that we integrate marketing into the way we approach strategy and service development. in , for the first time we introduced a directorate for strategic marketing and communication, headed up by a respected commercial specialist to help us do this. our marketing had traditionally concentrated on promotional activity without sufficient emphasis on positioning or branding. curators were the driving force behind much of what we did, with the effect that too often we could be focused on our own areas of interest and expertise rather than connected to those of our users. the new marketing director led on the tough questions – who are our users? what is our proposition to them? – and sought to engage all library staff in considering the answers. the result has not been simply rethinking how we use our space, or improving signage and welcome, or remodelling our treasures gallery to best display the jewels of the nation’s heritage: rather, a dynamic synergy between curatorial and marketing staff has emerged. exhibitions are now agreed and taken forward on the basis of whether they will meet a market need and then developed in tandem with promotional events. our most successful exhibition drew upon curatorial expertise and knowledge and enhanced this with a marketing edge. the result was a journey down the silk road that included letters, maps, objects and the diamond sutra, the world’s oldest printed book, and drew record-breaking numbers of visitors.
it was supported by a series of seminars and lectures drawing in new audiences to hear a range of speakers and experts. . open up your legacy print collections to digital channels and reveal them through digitisation extracting new value from core assets is a key part of managing ‘intermediating change’. many of the great libraries represented here have unique assets, particularly in their primary collections of archives, manuscripts and rare materials. in addition the scale and scope of secondary source material (even just considering the corpus of out of copyright material) that can be digitised and therefore made infinitely more visible and accessible to the world is enormous. there are great opportunities here for libraries to find new channels, deliver public value and ensure business models that enable sustainability. the latter, however, depends on some careful thinking on the value of the intellectual property over which we have stewardship balanced against many of our professional aspirations to open our content freely to everyone. it is always worth reflecting that initial one-off costs of digitisation pale into insignificance set against a commitment to perpetual access and preservation of the newly created digital asset. the british library’s award winning turning the pages (ttp) programme and technology has been a direct product of our thinking about how digital can help us open up access to some of our most valuable items - and it is a source of immense pleasure to me that it has been seized by many of the institutions here today for doing that with their own treasures. at its start ttp was about helping us give members of the public access to precious books while keeping the originals safely under glass, allowing them to virtually 'turn' the pages of manuscripts in a realistic way, using touch-screen technology and interactive animation.
but that initial step was just the beginning of the potential digitisation affords us as we rise to the challenges of ‘intermediating change’. the british library has developed a set of assessment criteria to help us prioritise areas of our collections for digitisation. high on our list is our vast collection of newspapers, due to its fragile nature. not only is there a huge collection of at risk material that could be digitised, but also, the time-based nature of newspapers makes them ideal to ‘map’ and link to other historical collections and items. we are also exploring opportunities for digitising our unique audio collection. audio resources are generally harder to manage and the resultant resource is more difficult to search, relying as it does on metadata. however, much progress has been made in speech recognition technology, and automated full-text creation of audio content may soon be possible, making searching of speech based items much easier. digitisation also makes exciting reunification projects possible, increasing the world’s knowledge and advancing digital scholarship. three weeks ago we launched a virtually-united leaf of mozart’s musical diary - cut into two pieces years ago by the widowed and impoverished mrs mozart to increase the total sale value. ttp has given the world access to previously unknown music and a catalogue of complete works integrating sound from the sound archive into the digitised work. the bbc covered the story and our website recorded a record number of hits as people linked through to us on the back of media coverage. and many of you will already know about the unprecedented collaboration underway to digitally reunify the codex sinaiticus involving all four of the institutions at which parts of the manuscript are held: st catherine’s monastery, sinai, the british library, the university of leipzig and the national library of russia. digitisation opens doors to new and dynamic partnerships.
last autumn the british library announced its intention to work with microsoft to digitise , out of copyright books and make them available over the internet. there are complex intellectual property issues involved in such partnership working, but i view the microsoft deal as an example of how libraries can work with the new players in the information arena as we modernise and update our services. . reduce legacy costs and continue to improve productivity in traditional library activities underpinning the functioning of our libraries are many traditional processes and activities – selection, acquisition, cataloguing, fetching and retrieving, preserving, and so on. we need to be vigilant to ensure that these well-worn routines continue to be challenged both in the way we do them and the priority we give them. freeing up resources for investment in new things, and in that i include freeing up our best staff, as well as creating funds for research and technological innovation will be critical if we are to keep up with changing expectations. this sometimes means challenging long-assumed professional roles and other entrenched working arrangements. for over years the british library’s document supply service has fulfilled a critical role in the uk’s information provision and remains the world’s largest document supplier. at its peak level of demand in / – only six years ago – it was fulfilling over million requests for individual documents annually – three million to uk customers and one million overseas. but demand has slowed in recent years because of the new opportunities provided for through the ‘big deals’ and easier digital access. three years ago, the library embarked upon a programme of extensive modernisation of our document supply operation, partly funded by the uk government’s invest to save budget. 
the library replaced all photocopiers with state of the art scanning stations, which enable us to deliver all documents electronically regardless of their original format and which have reduced standard turnaround times from days to just hours. . invest more in innovation and digital activities one of our real challenges is to create enough resources for faster innovation and investment in experiments and new digital services. all of us have opportunities for adding value to our communities through new roles such as institutional repository management, digital asset management and audit, digital scholarship, e-learning activities, and so on. some of these roles require new kinds of consortial and other partnerships across the public and the private sector. the legal deposit libraries act has given us responsibility for archiving the uk’s digital output and we are preparing the infrastructure and methodology for this now. no-one has exactly the product we are looking for, so we are using a combination approach, of buying in components and working with the supplier to develop them to meet our needs as dom – our digital object management system. the first practical release of the storage system provides a preservation-quality digital store for material received under the voluntary deposit scheme. subsequent releases will add more functionality to this. we will then extend this service to handle other types of material, initially e-journals. other high priority materials are cds, dvds, and other ‘hand-held’ items, and the growing number of digital newspapers. technical direction of the solution architecture has been validated by two external technical advisory panels. we are now focusing on ingest and are in the process of selecting a vendor. requirements for other functions (data management, administration, and access) are being developed. digital preservation is being addressed by a newly-recruited team and we are building european partnerships. .
develop our people and ensure that we have the right mix of skills i believe that the role of librarian and information professional has never been more important, but what we mean by librarian has to be reconsidered. it is we, not just our institutions, who must face up to intermediating change. the information professionals of the future need to be outward-going people, with really sharp business skills and a huge understanding of technology and the implications of the internet. they need to be able to understand and engage with users to bring their collections to life, in a way that a search engine on someone’s desk simply cannot. they need to concentrate on what they are good at - information management, metadata, reference services, to name but a few - and be ruthless about bringing in specialisms they need from outside to add value to their core tasks. because it is the mixture that is so powerful. very few of the examples i have given today of how we at the british library are redefining ourselves for the st century would have been possible without the new blend of skills and expertise we have acquired over the last five years. my final point is to encourage us all to have a much stronger voice in the grand challenge debates that are part of the development of the digital society and economy. there are many big and complex policy issues that are live at national government, european and indeed global level. as library and information professionals we should speak robustly to ensure appropriate balances between public interest and commercial imperatives in digital copyright and intellectual property regulation; we should be central in working on and piloting new and sustainable business models for digital services; and we should widen understanding and debate about the importance to society and to individuals of digital preservation. 
concluding remarks i would like to conclude simply with a quote from the library’s founders when they brought together the original collections in to be ‘preserved therein for publick use, to all posterity’ providing access to the world’s knowledge for ‘all studious and curious persons’. at this early stage of the st century it is arguably the technological geeks who, in their fight to win a browser war, are doing most to re-interpret and fulfil the utopian dream of universal access. it is imperative that we, as the custodians of the world’s knowledge, create our own vision and contribution to this desired future by re-defining the library to be relevant for this and future generations.
ali s. al-aufi abstract—the emergence of networked information and communication has transformed the accessibility and delivery of scholarly information and fundamentally impacted on the processes of research and scholarly communication. the purpose of this study is to investigate disciplinary differences in the use of networked information for research and scholarly communication at sultan qaboos university, oman. this study has produced quantitative data about how and why academics within different disciplines utilize networked information that is made available either internally through the university library, or externally through networked services accessed by the internet. the results indicate some significant differences between the attitudes and practice of academics in the science disciplines when compared to those from the social sciences and humanities. while respondents from science disciplines show overall longer and more frequent use of networked information, respondents from humanities and social sciences indicated more positive attitudes and a greater degree of satisfaction toward library networked services. keywords—academics, arab world, disciplinary culture, networked information, scholarly communication, sultan qaboos university, oman. i. introduction for most computer users, networking technology was first made available with the advent of the internet and the associated technology of the world wide web. as a result of the internet, the public gained access to numerous types of networked information resources and services, including e-mail, mailing lists, bulletin boards, internet chat, and different multimedia formats, both audio and visual.
academic users were quick to take advantage of these developments and others that were delivered to their desktop as the world wide web became established as the common delivery platform for digital information services. in particular the rapid implementation and acceptance of ‘networked information’ in the form of web-based delivery of academic content such as e-journals, library catalogues, and bibliographic databases, transformed the processes of research and scholarly communication. networked information has fundamentally changed the manner in which academics correspond and work, and has had a far-reaching impact on many aspects of the research environment, including the accessibility of information, collaborative research, and the dissemination of research outputs.
ali s. al-aufi is an assistant professor with the department of library and information science, sultan qaboos university, oman (e-mail: alaufia@squ.edu.om). paul genoni is a senior lecturer with the school of media, culture and creative arts, curtin university of technology, australia (e-mail: p.genoni@curtin.edu.au).
sultan qaboos university (squ) was opened in as the first public university in oman. currently, the university consists of seven colleges: agriculture and marine sciences; arts and social sciences; commerce and economics; education and islamic sciences; engineering; medicine, and sciences. furthermore, a college of law was attached to squ based on a royal decree issued by his majesty sultan qaboos bin said in april , which will bring the number of colleges at squ to eight. education is provided free for all students at squ, including tuition fees, text books, on campus food, and accommodation. the university provides various educational support centres to assist student learning, such as the educational technology centre, language centre, and the data system centre.
the language centre plays a major role in preparing students to commence their higher education by providing intensive english language instruction. in addition, the university provides and supports various research centres and laboratories such as those dedicated to water, the environment, oil, telecommunications, remote sensing, earthquakes, seismology and omani studies [ ]. the internet was made available to squ in late . since that time networked information and related technologies have become commonplace at squ, and they are now considered essential assets in enhancing the university’s teaching and research outcomes. although the internet and networked information are assumed to be widely used for research related purposes at squ, there remains a need to investigate the precise extent and patterns of their use by academics for research and scholarly communication, and how this might in turn be impacting upon the research effectiveness of the university and the nation. ii. statement of the problem as early as , bailey [ ] reported that “global computer networks, such as the internet, have created a complex electronic communication system that has significantly changed the way scholars informally exchange information and has started to change formal scholarly publication activities” (p. ). by the late s, these transforming effects of the internet were being widely felt on the established systems of research and scholarly communication. the rapid diffusion of the internet and networking technologies was having an impact not only in developed countries. globally, academics and researchers were finding they could acquire information, undertake collaborative research projects, and communicate their research findings, far more easily and rapidly with the aid of networking technologies.
ali s. al-aufi and paul genoni, digital scholarship and disciplinary culture: an investigation of sultan qaboos university, oman. world academy of science, engineering and technology, international journal of information and communication engineering vol: , no: , . international scholarly and scientific research & innovation ( ). http://waset.org/publication/digital-scholarship-and-disciplinary-culture:-an-investigation-of-sultan-qaboos-university,-oman/
while research on scholarly communication practices, including the use of networked information by university-based research communities, has grown steadily, very little of this research has been based in the arab world, or oman in particular. use of networked information in the developed countries had an impact for some time before it was transmitted to the developing countries. as a result, the amount of research related to the impact of these technologies conducted in developed countries has significantly exceeded that conducted in developing countries. studies of the adoption of the internet and networked information by research communities therefore need to be conducted in developing arab countries in order to:
• assess the information and technology gaps that exist between developing arab countries and developed countries and also between developing arab countries and other groups of developing countries,
• identify whether patterns of research related uses of networking technologies in developing arab countries have been influenced by local factors, such as the existing social, educational, and linguistic conditions, and
• assess the impact that networking technologies have had on the research productivity of academics in developing arab countries.
the major goal of this research is to investigate whether there are disciplinary differences in the way in which networked information and communication are being used in an arabic academic environment. since the internet was introduced to squ, the use of the internet by the university’s academics for purposes of research and scholarly communication has remained largely unexamined. furthermore, there have been no significant attempts to examine arab scholars’ attitudes in the networked environment based on their academic disciplines. iii. objectives the primary objectives of this study are to: . investigate the use of networked information and its impact on patterns of research and scholarly communication in an arabic context, using sultan qaboos university as an example. . identify the disciplinary differences reflected in the use of networked information for research and scholarly communication. iv. review of the literature the information use and scholarly communication patterns of researchers in all major academic disciplines have been the subject of research for some time. it has been argued that scholarly communication is a social activity wherein relationships are influenced by the disciplinary culture within which scholars are grouped. use of information by academics in science disciplines in particular had been examined closely. some notable earlier studies include menzel [ ] and garvey, tomita, & woolf [ ]. research into information use and scholarly communication in humanities and social science disciplines also commenced during the s and s. examples of these studies include simonton [ ] and gleaves [ ] in the humanities, and line [ ] and skelton [ ] in the social sciences. it should be noted that the literature of scholarship provides no absolute consensus as to what constitutes the ‘sciences’, ‘social sciences’ or ‘humanities’. 
the sciences are frequently grouped into the natural sciences [ ], physical sciences [ ], and applied sciences, but each of these groups has been differently constituted at different points in time. meadows [ ] states that at the beginning of the twentieth century ‘science’ referred to natural sciences in english-speaking countries. he adds that lack of a universal definition of science leads to differences in organizational structures that in turn have an effect on communication patterns. the natural sciences have been defined as “a set of separate, specialized disciplines—consisting primarily of physics, chemistry and biology—of relatively recent origin” [ ] (p. ). for the purposes of this study, science is defined by the organizational structure of sultan qaboos university where the science disciplines are divided into colleges of science, engineering, medicine, and agriculture. as with science, there is no universally accepted definition for the social sciences and humanities. cohen [ ] indicates that the social sciences comprise a set of disciplines that study social phenomena and relationships among people. they generally include archaeology, economics, history, political science, psychology, and sociology. for some commentators, history is considered part of the humanities [ ]. humanities is said to refer to classical studies [ ], and white [ ] notes that “disciplines of the humanities such as philosophy, history, and literary studies offer models and methods for addressing dilemmas and acknowledging ambiguity and paradox. they can help us face the tension between the concerns of individuals and those of groups and promote civil and informed discussion of conflicts, placing current issues in historical perspective” (p. ). in this study, social sciences and humanities are also based on the organizational structure of squ.
these include all departments affiliated with the colleges of arts and social sciences, commerce and economics, and education and islamic sciences. research investigating the scholarly use of networked information in academic environments has included studies investigating a single discipline; studies that are inter-disciplinary; and studies that are multi-disciplinary or cross-disciplinary. the literature includes many studies that investigate the use of the internet or computer networks within a single discipline. examples include bishop [ ]; brown [ ]; shaw [ ]; and zhang [ ]. inter-disciplinary studies investigate the use of the internet or networked information or information seeking behavior by academics in two or more academic disciplines, within one broad area such as the social sciences. examples of these studies are abdulaziz [ ]; abels, liebscher, & denman [ , ]; costa & meadows [ ]; eisend [ ]; and seyal, abd-rahman, & mahbubur-rahim [ ]. multi-disciplinary studies investigate and often compare the use of the internet or networked information by academics in two or more broad disciplines, such as academics in social sciences disciplines compared to academics in the sciences or humanities.
examples of multi-disciplinary studies conducted in developed countries include applebee, bruce, clayton, pascoe, & sharpe [ ]; applebee, clayton, & pascoe [ ]; bane & milheim [ ]; bruce [ ]; budd & connaway [ ]; heterick [ ]; houghton, steele, & henty [ ]; kaminer [ ]; lazinger, bar-ilan, & peritz [ ]; schauder [ ]; wang & cohen [ ]. for examples of multi-disciplinary studies conducted in arab and developing countries see abdullah [ ]; adika [ ]; bin-alsabti [ ], boumarafi [ ]; ehikhamenor [ ]; jirjees & nashir [ ]; mamtora [ ]; uddin [ ]. while garvey [ ] indicated that the discipline of the researcher influenced a researcher’s information seeking behavior when using traditional information sources, abels et.al [ ]; tenopir [ ]; and torma & vakkari [ ] have also asserted that the researchers’ disciplinary culture is closely tied with the way in which they use networked information. the differences in the use of networked information between scientists and social scientists are explained by costa and meadows [ ] as being based on two factors. these are firstly, the differences in information needs and types of information used by the two groups, and secondly, because scientists were using computers some time before social scientists. costa and meadows also argue that although most studies conducted after the mid- s indicate disciplinary differences in networked information use between scientists and social scientists (cohen [ ]; lazinger, bar-ilan, & peritz [ ]; schauder [ ] ), some recent studies suggest that these differences have decreased over time. by , for example, there were indications that academics in all disciplines were using e-mail almost equally (costa & meadows [ ] ). costa and meadows reported on the network based communication practices of two groups of social scientists (economists and sociologists) and compared these results with other studies conducted in science disciplines. 
they concluded that social scientists were using networked communication to a lesser degree than scientists. it might be difficult, however, to compare the result of an interdisciplinary study conducted in two particular environments (brazil and the united kingdom) to the results of multi-disciplinary studies that examined these differences in a variety of environments. applebee et al. [ ] reported on a survey of the disciplinary differences within broad classifications of the discipline groupings of sciences, arts/humanities, and social sciences (management, administration, and commerce). participants from science disciplines were reported to be the most frequent users of e-mail to communicate with researchers or colleagues at the same university campus. in contrast, social scientists were reported to be the most frequent users of e-mail to communicate with researchers located remotely. in addition, applebee et al. [ ] indicated that it may be unreliable to associate frequency of use with academic disciplines, because frequency of use does not make it clear as to exactly how the internet is used or what types of internet services are used. therefore, the researchers decided to assess the disciplinary differences by comparing the usefulness of e-mail for research. assessed on that basis, it was found that science respondents indicated a more positive response than those from the social sciences or humanities. budd & connaway [ ] investigated the use of networked information by sampling academics at six departments representing the three broad categories of sciences, social sciences, and humanities, but no attempt was made to examine differences based on these disciplinary categories. differences were instead examined based on respondents’ departmental affiliation, as the result of which no significant differences were reported. 
for instance, when asked whether they use networked information, respondents from the departments of sociology ( %), physics ( %), and chemistry ( %) indicated a majority of positive responses; whereas those from the departments of english, psychology, and history indicated a majority of negative responses. and while respondents from sociology (social sciences) indicated the highest positive response for using networked information; in contrast, psychology respondents, also from the social sciences, indicated quite low usage of networked information. lu [ ] investigated how “electronic vehicles” (such as web site of a journal, e-mail address for a journal, electronic submission, electronic publishing) had impacted on formal scholarly communication by conducting a study of the communication practices of journals in both social science and natural science disciplines. the results indicated that the majority of categories of “vehicles” were used more frequently in natural science disciplines than social sciences. heterick [ ] compared economists’ (social sciences) and humanities scholars’ attitudes towards electronic resources. the findings indicate a variance of attitudes between the two groups of scholars. for example, while almost % of economists consider the library’s online catalog “very important”, nearly % of humanities scholars consider this to be the case. when respondents were asked whether networked information will reduce their personal visits to the physical library, almost % of the economists agreed, as compared to only % of those from the humanities. the results of this study also indicate differences between the attitudes of economists and humanities scholars toward the reliability of electronically stored information. while only % of the economists indicated they would trust a repository of electronic information stored locally, almost % of the humanities scholars reported a similar level of trust. 
talja and maula [ ] indicated that by scholars of humanities disciplines were still recognized as low users of e-journals and databases, while most scholars in science disciplines were already high level users. that conclusion was also supported by lenares [ ] who found that physical and biological scientists reported higher use than humanities and social science scholars. lenares drew the sample for her study from twenty research universities in the united states. another study conducted in nigeria by ehikhamenor [ ] also revealed disciplinary differences, although these differences were not consistent. this was attributed to the ambivalence all respondents felt toward various internet services. torma and vakkari [ ] investigated how academics’ disciplines and availability of “electronic resources” correlate with their frequency and purpose of use of electronic resources provided by the finnish national electronic library (finelib). data were collected using an annual survey of users of the finelib website. there were respondents identified as belonging to one of six disciplinary groups. the findings indicated disciplinary differences, with respondents from the natural sciences ( %), economists ( %), medicine ( %), and engineering ( %) reporting using finelib on a daily basis in more cases than academics from the social sciences ( %) or humanities ( %).
however, the study found that “perceived availability” of electronic resources was a stronger predictor of the frequency of use than disciplinary differences. networked information services provided by libraries have also been differently used by academics according to their disciplines. borgman [ ] and tenopir [ ] revealed that academics from science disciplines use “electronic information” sourced from academic libraries more than their counterparts from the social sciences and humanities.

v. methodology

this study is primarily concerned with investigating the disciplinary differences in the use of networked information for research and scholarly communication in an arabic academic environment. the methodology selected is quantitative, and the particular tool is a questionnaire survey. the survey was administered at sultan qaboos university, oman, in december , and the academic staff of squ were the subjects of this study. a quantitative approach was necessary in order to generate a basic understanding regarding the current use of networked information for research and scholarly communication at squ. it was assumed that a questionnaire would function in two ways. firstly, it would indicate the academics’ patterns of use of, and attitudes towards, networked information for the purpose of research and scholarly communication at squ. secondly, the many variables associated with this research topic can be appropriately investigated by use of a questionnaire in which respondents report the necessary demographic data that is essential to developing an appreciation of the different cohorts that form parts of the population being studied.

the questionnaire was translated from english into arabic to allow academics who do not speak english to participate. native arabic speakers who also speak english were given the choice of completing an english or arabic version. the researcher undertook initial translation of the questionnaire.
after the translated draft was completed, it was sent to a professional translator with extensive experience in translating bilingual documents from english to arabic and vice versa. the response rate to the survey was % (n= ) of the distributed questionnaires. the overall response rate of the whole population of academics at squ ( ) was . %. therefore, if . % is considered to be the valid response rate, it is quite acceptable given that the response rate to academic surveys is generally low (ejust [ ]; tomney & burton [ ]; weingart & anderson [ ]).

the questionnaire data were coded and entered into the statistical package for social sciences (spss). both descriptive and inferential quantitative analyses were used to extract maximum information from the data: firstly, descriptive analysis involving frequency and percentage distributions of all variables, with mean scores calculated whenever required; and secondly, inferential analysis for testing associations between particular variables using both parametric and non-parametric statistics. three types of inferential statistical tests were undertaken, but only the parametric technique of one-way analysis of variance (one-way anova) is reported in this study, as the supplementary testing did not produce any variations in results. anova is best used to investigate how several independent variables interact with each other and how these interactions affect the dependent variables [ ]. in this study one-way anova was used to determine whether there were significant relationships based on differences of group means between particular variables and disciplinary differences.
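the one-way anova described above can be sketched in a few lines. the example below is a minimal illustration and not the study’s spss procedure: it computes the quantities that populate an anova table (sum of squares, df, mean square, and f) for two groups, and checks that with only two groups f equals the square of the pooled-variance independent-samples t statistic, which is one reason a supplementary t test would be expected to agree with anova. the function names and the response values are hypothetical and are not taken from the study’s data.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Between-groups SS: each group's size times the squared distance
    # of its mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups SS: squared distance of each value from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "df_total": len(all_values) - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


def pooled_t(a, b):
    """Independent-samples t statistic assuming equal (pooled) variances."""
    mean_a, mean_b = mean(a), mean(b)
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    std_error = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean_a - mean_b) / std_error


# Hypothetical five-point scale responses; NOT data from the study.
humanities_social = [3, 4, 2, 3, 3, 4, 2, 3]
sciences = [4, 5, 4, 3, 5, 4, 4, 5]

table = one_way_anova([humanities_social, sciences])
t = pooled_t(humanities_social, sciences)

print("between groups:", table["ss_between"], table["df_between"], table["ms_between"])
print("within groups: ", table["ss_within"], table["df_within"], table["ms_within"])
print("total:         ", table["ss_total"], table["df_total"])
print("f:", table["f"], " t squared:", t ** 2)
```

the sig. column of such a table is the upper-tail probability of f under the f distribution with (df between, df within) degrees of freedom. assuming scipy is available, `scipy.stats.f.sf(f, df_between, df_within)` would give that value, to be compared with the chosen significance level.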
moreover, although it was appropriate to use the independent-samples t test to compare the group means of an independent variable with two levels [ ], as in the case of disciplinary differences (" " humanities and social science, and " " sciences), significant differences in the results should not vary to a large extent if one-way anova is used instead. it is claimed that one-way anova is suitable for comparing the group means of an independent variable with two or more levels, whereas the independent-samples t test is only applicable to an independent variable with two levels [ ]. both the kruskal-wallis test and independent-samples t test (results not reported) were used to verify and qualify the result of anova in this study. this process enhanced the reliability and the trustworthiness of the inferential analysis used in this study. for all of the inferential analysis results, the minimum level of significance was determined at . .

there were several techniques used to increase the quality and reliability of this study. the questionnaire was developed with reference to the existing literature and the research objectives and questions of the study. before the questionnaire was distributed for the major collection of data it was piloted and also examined by expert referees in the area of the research. modifications based on feedback were made whenever applicable. all academics at squ were considered probable respondents to the questionnaire, thereby increasing the response rate and improving the likelihood of measuring variations in academics’ perceptions and attitudes. all colleges at squ were sampled, which allowed for the comparison of the results within and between different sample groups.

vi. findings

the categorization of disciplines in the current study has been based on the organizational structure of squ, in which colleges have been divided into two broad disciplinary arrangements.
these are humanities and social sciences as one major division, and sciences as another division. the humanities and social sciences division includes three colleges, namely the college of arts and social sciences, the college of commerce and economics, and the college of education. the science division consists of four colleges: the college of agriculture and marine sciences, the college of medicine and health sciences, the college of engineering, and the college of science. therefore, one-way anova is used to determine whether there are significant differences in the respondents’ opinions and attitudes according to the broad disciplinary categories associated with squ’s two divisions.

use of networked information

table i refers to the following items:
. how would you describe your skills as a user of networked information?
. how important is it for you to be proficient in using and applying networked information?
. how long have you been using networked information services?

table i anova test of discipline versus skills, importance of use and length of use
[columns: item, sum of squares, df, mean square, f, sig.; rows per item: between groups, within groups, total; cell values missing]

participants’ assessment of their proficiency in using networked information differs significantly across the disciplinary groupings.
it is concluded from the anova table (table i) that there are statistically significant differences in group means at . for items and . descriptive data illustrate that respondents from science disciplines consider proficiency in using networked information more important than do their counterparts from the humanities and social sciences. moreover, respondents from science disciplines have been using networked information longer than respondents from the humanities and social sciences. despite these differences in perceived importance and duration of use, there was no significant difference between the disciplines regarding their perception of their current level of skill in using networked information.

frequency of use of networked information

table ii refers to the following items:
. how frequently do you use e-mail?
. how frequently do you use mailing lists?
. how frequently do you use bulletin boards?
. how frequently do you use internet chat?
. how frequently do you use video conferencing?
. how frequently do you use e-journals?
. how frequently do you use full-texts other than e-journals?
. how frequently do you use web-based library catalogues?
. how frequently do you use web-based databases?
. how frequently do you use internet search engines?

table ii anova test of discipline versus frequency of use of networked information
[columns: item, sum of squares, df, mean square, f, sig.; rows per item: between groups, within groups, total; cell values missing]
of the ten items listed in the questionnaire that measure the frequency of use of networked information, five items ( , , , and ) are found to differ significantly across the disciplinary groupings of the participants at the . level (table ii). descriptive data illustrate that respondents from science disciplines use e-mail, mailing lists, internet search engines, and e-journals more frequently than respondents from the humanities and social sciences. in contrast, respondents from humanities and social sciences disciplines use internet chat more frequently than do their colleagues from the sciences.

scholarly communication activities

table iii refers to the following items:
. to communicate with academics or researchers at the same institution.
. to communicate with academics or researchers at different institutions in oman.
. to communicate with academics or researchers at different institutions within the arab states.
. to communicate with academics or researchers at different institutions globally.
. to exchange documents or information about issues or topics in an area of research.
. to obtain bibliographic references.
. to provide or obtain updates on research.
. to ask questions or provide answers.
. to keep current in an area of research.
. to learn about conference announcements.
. to communicate with publishers.
table iii anova test of discipline versus scholarly communication activities
[columns: item, sum of squares, df, mean square, f, sig.; rows per item: between groups, within groups, total; cell values missing]

the anova test reported in table iii revealed that of the above listed scholarly communication activities, nine items (all items except and ) differ significantly as a function of disciplinary affiliation at the . level. the descriptive data illustrate that respondents from science disciplines indicated more positive responses for all of the above scholarly communication activities than respondents from humanities and social science disciplines.

impact of networked information

table iv refers to the following items:
. i enjoy using networked information.
. networked information makes it easier for me to research and publish collaboratively.
. networked information has helped me access new tools for my research and scholarly communication.
. networked information provides me with the capabilities to easily work beyond geographic boundaries.
. networked information has helped me establish new relations with other researchers.
. the use of networked information will increase my number of publications over the next few years.
. the use of networked information will improve the quality of my research over the next few years.
. some of my research will be published electronically over the next few years.
. networked information will widen the scholarly community within which i am in contact over the next few years.
.
i will become increasingly dependent on networked information over the next few years.

table iv anova test of discipline versus impact of networked information
[columns: item, sum of squares, df, mean square, f, sig.; rows per item: between groups, within groups, total; cell values missing]

among a list of items that investigated the impact of networked information on research and scholarly communication, only two items ( and ) were found to differ significantly across the participants’ disciplinary groupings at the . level (table iv). descriptive data illustrate that respondents from science disciplines indicated more positive responses for the two significant items than did participants from humanities and social science disciplines. these results support the commonly held view that science scholars are more likely to form research teams than those in other disciplines; they also indicate that science scholars have adopted the use of networked information sources to assist in this regard.

training and library support

table v refers to the following items:
.
i am able to access all networked information from my office or lab.
. the university runs occasional training sessions for faculty members to use networked information.
. the university commitment to improving the process of electronic scholarly communication is highly appreciated.
. the library website is easy to navigate and gives comprehensive instructions and information.
. the availability of networked information resources at the library is sufficient.
. the library’s web-based catalog is clear and easy to use.
. e-journals in my field are adequate and useful.
. computer facilities and electronic equipment in the library are adequate.
. i receive updates from the library through a networked medium such as email or group mailing lists.
. the library informs me about networked information resources and services that are newly available.
. the library invites me to attend sessions on networked information.
. librarians are very collaborative and helpful.
. i am overall satisfied about the networked information services facilitated by the library.

table v anova test of discipline versus training and library support
[columns: item, sum of squares, df, mean square, f, sig.; rows per item: between groups, within groups, total; cell values missing]
it is concluded from the above anova table (table v) that the respondents’ attitudes to training and library support differ significantly according to disciplinary groupings at . for items , , , , , , and . the descriptive data illustrate that participants from humanities and social science disciplines indicated more positive responses to those statistically significant items than respondents from the sciences.

perception of arabic as a scholarly language

table vi refers to the following:
. the availability of networked information in english sufficiently substitutes for the extreme shortage of networked information in arabic.
. sufficient availability of arabic networked information would have increased my intellectual productivity.
. sufficient availability of arabic networked information would have encouraged me to think about publishing more in arabic.
. sufficient availability of arabic networked information would help me to remain current in my field.
. teaching and learning in arabic within my discipline is getting difficult due to the lack of networked information in arabic.
. i strongly encourage colleagues and students to use english in writing and publishing.
. learning the fields of sciences and technology nowadays in arabic will risk the learners’ academic and career future.
. absence of arabic e-journals and sufficient arabic networked information is a reason why arab academics favour english.
.
without being electronically available, the arabic language will not be able to contribute to the human and scientific development.
. the domination of english language will lead to the continuous decline of the arabic language for the academic purposes.
. the presence of arabic networked information on the internet will improve to a great extent in the next few years.
. i would certainly prefer to write and publish in arabic if the language was sufficiently available in a networked environment.

table vi anova test of discipline versus academics’ perception
[columns: item, sum of squares, df, mean square, f, sig.; rows per item: between groups, within groups, total; cell values missing]

respondents’ perception of arabic as a scholarly language in the arabic networked environment across discipline groups revealed statistically significant differences in group means at the . level for six items ( , , , , , and ) in table vi. it should be noted that all the colleges in the science division teach their programs in english, while colleges in the humanities and social sciences division teach in arabic, with the exception of the college of commerce and economics which teaches in english. it is therefore apparent that the responses to six statements in table vi correlate not only with discipline but also with the language used for teaching and research.
the first three statistically significant statements ( , , and ), all of which address the issue of sufficiency of “arabic networked information”, reflect the greater reliance on arabic by social science and humanities scholars. statements and , which address the importance or need to use english (or languages other than arabic), recorded a significantly more positive response from the science scholars. interestingly, the science-based respondents also indicated (more than their colleagues in other disciplines) that they would “prefer to write and publish in arabic if the language was sufficiently available in a networked environment”. this strongly suggests that the use of english is a choice that is made for them by the ubiquitous use of english for science communication.

vii. discussion

as indicated previously, categorization of disciplines in this study has been based on the organizational structure of squ, in which colleges have been divided according to the broad disciplinary arrangement of the humanities and social sciences in one division, and the sciences in another division. the purpose of this section is to discuss the use of networked information at squ from a multi-disciplinary perspective, in which the two broad disciplinary categories used by the university for administrative purposes are compared. in general terms the results from this study indicate that science scholars at squ are significantly more active users of networked information than their social science and humanities colleagues. the results reported in tables ii, iii and iv in particular indicate the extent to which science respondents are more heavily engaged in the use of networked information for research purposes. these tables record a statistically significant difference in response by discipline for a variety of activities that are essential components of research productivity.
these results reflect disciplinary differences regarding the use of, and attitudes towards, networked information at squ, and can therefore be compared to results from similar studies conducted elsewhere. it is also the case, however, that such comparisons need to be undertaken with caution, as the results may reflect not only differences in the research and scholarly communication based on disciplinary characteristics, but are also likely to be indicative of particular aspects of socio-educational development in oman and arab countries more generally. these additional factors include the state of the arabic language as a means of scholarly communication, and the level of development of an effective research culture in developing arab countries. the united nations development programme in its influential arab human development report ( ) [ ] pointed to the “crisis of the arabic language” (p. ) when used for scholarly purposes and called for the “arabisation” of university education, particularly science education, in the middle-east region. the report noted the domination of english for use in scholarly communication, and the failure, due to a series of socio-political and technical issues, to adapt arabic for scholarly use in the digital environment. for these reasons arab scholars’ attitudes towards the use of the two languages for teaching and research purposes are of great importance.
as noted previously disciplines and language are currently closely aligned at squ and the results reported in table vi seemingly reflect the different experience of respondents who teach and research in english (from the sciences) and those who use arabic (from the social sciences and humanities). the results recording the significantly lesser reliance by social science and humanities scholars on networked sources of information are very likely indicative of the under-representation of arabic on the internet and in other digitised information sources. it should also be noted that the results also point to the uncertain attitude towards the two languages held by science scholars, whereby they recognise the necessity of using english while retaining a preference for arabic. the arab human development report also argued that arab humanities and social science scholars have been working in a vacuum, as their ‘invisible college’ or social networks are poorly formed. this claim is supported by the results of the current study, with respondents from the humanities and social sciences reporting that they communicate with colleagues less frequently than respondents from the science disciplines, and that they have been less successful in working “collaboratively” or establishing “new relations”. it can be speculated that science scholars, due to their use of english and the more international focus of science research, have been able to make use of networked communication to attach themselves to established international, collaborative research communities. this is apparently not the case, however, for the social sciences and humanities wherein scholarship is frequently limited by a local or regional focus and further confined by the use of arabic. as a result such collaborative communities are yet to develop for these disciplines, and even opportunities for regional networking appear to be limited. 
whereas collaborative research cultures have generally been slower to develop in the social sciences and (particularly) the humanities, the evidence suggests that this is strongly the case in arab countries. as noted, earlier studies [ , , ] conducted in developed countries recorded similar differences between disciplines in a networked environment, but more recent studies have suggested that the disciplinary ‘gap’ in the use of networked information might be closing [ ]. the current study contradicts this trend as disciplinary differences are still strongly indicated in the use of networked information at squ, suggesting that there is a ‘lag’ in closing this gap. this is possibly due to the comparatively late uptake of networking technology at squ, and elsewhere in the arab world, but may also be due to the issues associated with language and underdeveloped research cultures. an intriguing element of the results is the extent to which social science and humanities respondents reported a more positive response to the networked services provided by the squ library (table v). it is likely that this reflects a generally greater dependence on library services and support within these disciplines, but it is relevant to note that the higher level of satisfaction extends to “networked information services”, when other elements of the results indicate that social science and humanities respondents use these resources less than their science counterparts, and also report being comparatively dissatisfied with the level of these resources available in their preferred language (arabic). with regard to library use, respondents were also asked to record their use of various networked library services. although the results were not statistically significant, web-based library catalogs and web-based library databases were used more frequently by respondents from the social sciences and humanities.
this finding contradicts that of torma and vakkari [ ], who reported that science scholars use networked library services more frequently than do those from the social sciences and humanities. the finding also contradicts those of borgman [ ] and tenopir [ ] who claimed that academics from science disciplines use “electronic information” in academic libraries more than those from the social sciences and humanities disciplines. in the arab world, ibrahim [ ] investigated the use of networked information and library services at the united arab emirates university and reported that academics from science disciplines indicated higher use than their counterparts in the social sciences and humanities. the extent to which the results from this study might be extrapolated to other arab countries, or developing countries more generally, is also relevant. it can be hypothesised that, as other arab countries in the persian gulf region share similar circumstances in terms of the development of their higher education, research and communication infrastructure, they may demonstrate similar results. they also experience many of the same social and linguistic circumstances that contextualize the results of this study. it would be less safe to assume the results would be replicated in other arab countries (for example those of the maghreb region), or in developing countries more generally.

viii. conclusion

the research reported above indicates that science scholars at sultan qaboos university are more dependent on networked information than those from the social sciences and humanities. this situation likely reflects differences that are intrinsic to the nature of scholarship within the disciplines and have previously been reported with regard to more ‘traditional’ forms of scholarship.
while other research suggests that these differences might to some extent be minimised within a networked environment there is little indication that this had occurred at sultan qaboos university at the time this research was conducted. it is concluded that this may be due in part to the comparatively recent uptake of networking technologies in oman, but it is also likely to reflect aspects of the current state of scholarship in developing arab countries, in particular the poor utilization of arabic in digital information environments and the lack of developed research cultures. in both respects the results from this study indicate that scholars in the social sciences and humanities are disadvantaged in a manner which is likely to negatively impact on their use of networked services for research and communication. additional research is required in developing arab countries in order to understand more about the particular circumstances faced by scholars when using networked information services. this research could focus on the educational and social contexts in which the technology is deployed, in order to better understand their impact on research productivity in different disciplines. the conclusions of this study also have implications for the development and implementation of digital library services aimed at optimising the research productivity of oman and other developing arab countries.
in particular, academic librarians need to develop strategies to provide scholars, particularly those working in the social sciences and humanities, with support in compiling and accessing digitised arabic resources, and to assist in using networking technologies to build and sustain regional research communities for these same disciplines.

references
[ ] ministry of information, oman - . , muscat: ministry of information. [ ] bailey, c.w., scholarly electronic publishing on the internet, the nren, and the nii: charting possible futures. serials review . ( ): p. - . [ ] menzel, h., information needs and uses in science and technology, in annual review of information science and technology, c.z. cuadra, editor. , john wiley and sons: new york. p. - . [ ] garvey, w.d., k. tomita, and p. woolf, the dynamic scientific information user. information storage and retrieval, . ( ): p. - . [ ] simonton, w.c., characteristics of the research literature in the fine arts during the period - . , university of illinois: chicago. [ ] gleaves, e.s., characteristics of the research materials used by scholars who write in journals in the field of american literature. , emory university: atlanta. [ ] line, m.b., the information uses and needs of social scientists: an overview of infross. aslib proceedings, . ( ): p. - . [ ] skelton, b., scientists and social scientists as information users: a comparison of results of science user studies with the investigation into information requirements of the social sciences. journal of librarianship, . : p. - . [ ] lu, s., the transition to the virtual world in formal scholarly communication: a comparative study of the natural sciences and the social sciences. , university of california: los angeles. [ ] bouazza, a., use of information sources by physical scientists, social scientists, and humanities scholars at carnegie mellon university, in school of library and information science. , university of pittsburgh: pittsburgh.
[ ] meadows, a.j., communicating research. , california: academic press. [ ] bynum, w.e., et al, dictionary of the history of science. , new jersey: princeton university press. [ ] cohen, b., an analysis of interactions between the natural sciences and the social sciences, in the natural sciences and the social sciences: some critical and historical perspectives, b. cohen, editor. , kluwer academic publishers: dordrecht, the netherlands. p. - . [ ] white, l.m., the humanities, in handbook of the undergraduate curriculum: a comprehensive guide to purposes, structures, practices, and change, j.l.r. jerry g. gaff, et. al.,, editor. , jossey-bass: san francisco. p. - . [ ] bishop, a.p., the role of computer networks in aerospace engineering. library trends, . ( ): p. - . [ ] brown, c.d., the role of computer-mediated communication in the research process of music scholars: an exploratory investigation. information research, . ( ). [ ] shaw, w., the use of the internet by english academics. information research, . ( ). [ ] zhang, y., scholarly use of internet-based electronic resources: a survey report. library trends, . ( ): p. - . [ ] abdulaziz, t.o., benefits of the internet on egyptians academics at social sciences disciplines. majallat maktabat almalik fahad alwataniyya (journal of king fahad national library), . ( ): p. - . (source in arabic). [ ] abels, e.g., p. liebscher, and d.w. denman, factors that influence the use of electronic networks by science and engineering faculty at small institutions. part i queries. journal of the american society for information science, . ( ): p. - . [ ] abels, e.g., p. liebscher, and d.w. denman, factors that influence the use of electronic networks by science and engineering faculty at small institutions. part ii preliminary use indicators. journal of the american society for information science, . ( ): p. - . [ ] costa, s. and j. meadows, the impact of computer usage on scholarly communication among social scientists. 
journal of information science, . ( ): p. - . [ ] eisend, m., the internet as a new medium for the sciences: the effects of internet use on traditional scientific communication media among social scientists in germany. online information review, . ( ): p. - . [ ] seyal, a.h., m.n. abd-rahman, and m. mahbubur-rahim, determinants of academic use of the internet: a structural equation model. behaviour & information technology, . ( ): p. - . [ ] applebee, a., et al., academics online: a nationwide quantitative study of australian academic use of the internet. , adelaide: auslib press. [ ] applebee, a., p. clayton, and c. pascoe, australian academic use of the internet. internet research: electronic networking applications and policy, . ( ): p. - . [ ] bane, a.f. and w.d. milheim, internet insights: how academics are using the internet. computers in libraries, . ( ): p. - . [ ] bruce, h., internet services and academic work: an australian perspective. internet research, . ( ): p. - . [ ] budd, j.m. and l.s. connaway, university faculty and networked information: results of a survey. journal of the american society for information science, . ( ): p. - . [ ] heterick, b., e-content: faculty attitudes toward electronic resources. educause review, . ( ): p. - . [ ] houghton, j.w., c. steele, and m. henty, changing research practices in the digital information and communication environment. , canberra: commonwealth of australia. [ ] kaminer, n., scholars and the use of the internet. library and information science research, . ( ): p. - . [ ] lazinger, s.s., j. bar-ilan, and b.c. peritz, internet use by faculty members in various disciplines: a comparative case study. journal of the american society for information science, . ( ): p. - . [ ] schauder, d., electronic publishing of professional articles: attitudes of academics and implications for the scholarly communication industry. journal of the american society for information science, . ( ): p. - . [ ] wang, y.-m. and a. 
cohen, communicating and sharing in cyberspace: university faculty use of internet resources. international journal of educational telecommunications, . ( ): p. - . [ ] abdullah, n.m., faculty members' attitudes toward the internet at cairo university. aalam alma'lomat walmaktabat walnashir (world of information, libraries, and publishing), . ( ): p. - . (source in arabic). [ ] adika, g., internet use among faculty members of universities in ghana. library review, . ( ): p. - . [ ] bin-alsabti, a., electronic exchange of information among academic researchers at mentouri university of constantine. alarabiyya (the arabic ), . (source in arabic). [ ] boumarafi, b.m., use of the internet by faculty members at al-sharjah university. rissalat almaktaba (library message), . ( ): p. - (source in arabic). [ ] ehikhamenor, f.a., internet facilities: use and non-use by nigerian university scientists. journal of information science, . ( ): p. - . [ ] jirjees, j.m. and a. nashir, use of the internet by faculty members of the yemeni universities in sana'a city, in the united arabic strategy for information in the era of internet and other studies, afli, editor. , the arab federation for libraries and information (afli). (source in arabic): beirut. [ ] mamtora, j., pacific academics and the internet. australian academic & research libraries, . ( ): p. - .
[ ] uddin, m.n., internet use by university academics: a bipartite study of information and communication needs. online information review, . ( ): p. - . [ ] garvey, w., communication: the essence of science. , oxford: pergamon press. [ ] tenopir, c. use and users of electronic library resources: an overview and analysis of recent research studies. [cited / ]; available from: http://www.clir.org/pubs/reports/pub /contents.html. [ ] torma, s. and p. vakkari, dicipline, availability of electronic resources and the use of finnish national electronic library - finelib. information research, . ( ). [ ] lu, s. a cross sectional study of the impact of the internet on formal scholarly communication. in information access in the global information economy, october - . . pittsburgh: asis. [ ] talja, s. and h. maula, reasons for the use and non-use of electronic journals and databases: a domain analytic study in four scholarly disciplines. journal of documentation, . ( ): p. - . [ ] lenares, d. faculty use of electronic journals at research institutions. in th national conference of the association of college & research libraries, april . . detroit, michigan: acrl. [ ] borgman, c.l., digital libraries and the continuum of scholarly communication. journal of documentation, . ( ): p. - . [ ] ejust. e-journal user study: report of first survey. [cited / /]; available from: http://ejust.stanford.edu/method_surveys.html. [ ] tomney, h. and p.f. burton, electronic journals: a study of usage and attitudes among academics. journal of information science, . : p. - . [ ] weingart, s.j. and j.a. anderson, when questions are answers: using a survey to achieve faculty awareness of the library's electronic resources. college and research libraries, . : p. - . [ ] field, a., discovering statistics using spss. nd ed. , london: sage publications. [ ] blanksby, p.e. and j.g. barber, spss for social workers: an introductory workbook. 
, boston: pearson education. [ ] united nations development programme. arab human development report : building a knowledge society. [cited november]; available from: http://www.undp.org.sa/reports/ahdr% % -% english.pdf. [ ] ibrahim, a.e., use and user perception of electronic resources in the united arab emirates university (uaeu). libri, . : p. - .

designing multilingual digital pedagogy initiatives: the programming historian for english, spanish, and french speaking dh communities

sofia papastamkou (sofia.papastamkou@univ-lille.fr), antonio rojas castro (antonio.rojas-castro@bbaw.de), anna-maria sichani (a.sichani@sussex.ac.uk)
irhis, cnrs/university of lille, bbaw, university of sussex

about the project
the programming historian is a community-led pedagogy initiative launched in for humanities scholars that aims to publish open-access peer-reviewed tutorials in english, spanish and french on a wide range of digital tools, techniques, and workflows. it is now a proudly multilingual open access journal involving a large team of editors, authors, translators and reviewers, that requires software development, conducts community surveys and solicits community input.

increasing access
the programming historian aims to publish accessible tutorials targeted at a global audience.
for that reason, authors are encouraged to address technological, linguistic and cultural barriers:
● methods & tools should be accessible to international audiences and reproducible in languages other than the original (e.g. multilingual documentation, different character sets for text analysis tools...);
● primary sources and alternative datasets from outside their geographical expertise can be suggested for readers to explore;
● specific cultural references, idiomatic expressions, or tones might not register for all audiences;
● historical persons, organizations, or events specific to a particular culture shall be clarified and explained in detail;
● examples of code and metadata shall use internationally recognized standards for date and time formats.

three editions
open access is not only about copyright issues and internet connection! language also shapes access to information on the internet. by translating and publishing original tutorials in spanish and french, the programming historian aims to make open educational resources accessible for a diverse community of scholars:
● english edition contains original tutorials;
● spanish edition contains tutorials ( translated tutorials and original tutorials);
● french edition contains translated tutorials.
all our tutorials including translations are peer-reviewed by experts!

bibliography
● crymble, a., ‘identifying and removing gender barriers in open learning communities: the programming historian’ in: blended learning in practice (autumn, ), – .
● gibbs, f., ‘editorial sustainability and open peer review at programming historian’, dh commons, vol. ( ).
● smith, m. s., ‘opening education’, science, ( ), - ( ).
● sichani, a.-m., baker, j., afanador llach, m. j., walsh, b., ‘diversity and inclusion in digital scholarship and pedagogy: the case of the programming historian’, insights, ( ).
follow us / join us / síguenos / Únete / suivez-nous / rejoignez-nous
programminghistorian.org @proghist

internationalisation strategy
● open access | open peer review | open ethos
● full language initiatives
● contact zone: writing for a global audience
● inclusivity: gender-inclusive writing
● ad hoc translations
● neutral political policy

quality enhancement on e-learning
e.s.i. ossiannilsson
department of engineering and management, oulu university, oulu, finland

abstract
purpose – benchmarking, a method for quality assurance, has not been widely used in higher education with regard to e-learning. today, e-learning is an integral part of higher education, and so should also be an integral part of quality assurance systems. however, quality indicators, benchmarks and critical success factors for e-learning have not been taken seriously into consideration, nor incorporated into ordinary national or international quality assurance systems. the purpose of this paper is to describe how the european association of distance teaching universities (eadtu) initiated and developed e-xcellence+, a quality benchmarking assessment method and tool.
design/methodology/approach – this paper, which is part of a larger research project on european benchmarking, focuses on experiences from universities taking part in the e-xcellence+ valorization process.
findings – the results showed that benchmarking is a powerful tool to support improved governance and management in higher education, in alignment with national and international quality agencies. the tool can serve for quality improvements in teaching and learning. additionally, the results showed critical success issues for e-learning.
originality/value – this original paper reports on a europe-wide study examining benchmarking of e-learning and presents suggestions for tackling quality issues.
keywords europe, higher education, universities, distance learning, benchmarking, e-learning, quality assurance, critical success factors, quality enhancement
paper type research paper

the current issue and full text archive of this journal is available at www.emeraldinsight.com/ - .htm
campus-wide information systems vol. no. , pp. - © emerald group publishing limited - doi . /
the author would like to express her thanks to eadtu, and to colleagues who have participated in the e-xcellence+ project; also to professor pekka kess, the author’s supervisor, oulu university, fi, and professor and senior consultant paul bacsich, matic media, ltd, uk.

1. introduction
benchmarking as a method for quality enhancement has until now not been widely used in higher education (moriarty and smallman, ), and especially not with regard to e-learning (ossiannilsson, a). quality assurance, quality indicators, benchmarks and critical success factors for e-learning have not been taken seriously into account in regular quality assurance within higher education (the swedish national agency for higher education (nahe), ; ossiannilsson, ; ubachs, ). the underlying quality concepts have not been conceptualised. the quality of e-learning has been discussed in quality assurance methods, but according to an international study by nahe ( ) e-learning has been considered and managed in a more disconnected way. few methods have so far focused on the parameters of quality assurance governing e-learning, and criteria based on ease of access, new forms of interaction, flexibility, accessibility and personalisation, and other pedagogical aspects relevant for e-learning are missing. additionally, there is a lack of experience and of theoretical frameworks concerning the values and impacts of benchmarking e-learning in higher education (ossiannilsson, a, ; bacsich, , ; schreurs, ). there is therefore a need for an enhanced understanding of how benchmarking can be used in new contexts, focusing particularly on values and impacts for higher education institutions and their stakeholders participating in benchmarking exercises (ossiannilsson, a; ossiannilsson and landgren, ).

recently, one benchmarking initiative at european level was conducted by the european association of distance teaching universities (eadtu). under the e-learning programme, the e-xcellence benchmarking project was carried out by a consortium from european countries working in lifelong, open and flexible learning, together with expertise in quality assurance and accreditation processes from members of the european association for quality assurance in higher education (enqa), in cooperation with the association of european institutions of higher education (eua) and the united nations educational, scientific and cultural organisation (unesco). the intention with e-xcellence was to supplement existing quality assurance systems on e-learning specific issues, and not to interfere with ordinary quality assurance systems in higher education (ubachs, ).

this paper focuses on the experiences of european universities that participated in local seminars and took part in the quickscan process in the framework of e-xcellence+ by eadtu. ongoing research by ossiannilsson ( b, ; ossiannilsson and landgren, ) centres on two european benchmarking initiatives on e-learning recently completed in / . the first, e-xcellence+ (ubachs, ), was carried out by eadtu and is the one elaborated on in this paper. the second, the esmu e-learning benchmarking exercise (ossiannilsson, b, ; ossiannilsson and landgren, ), was conducted by the european centre for strategic management of universities (esmu) in cooperation with eadtu. the paper will not focus on single benchmarks, indicators, critical success factors, or the benchmark methodology as such, but on values and impacts for stakeholders that participated in benchmarking exercises.
the research concerns aspects of value and impact and aims to be innovative with regard to new concepts of benchmarking e-learning in higher education.

2. benchmarking e-learning
today and in the years ahead in the twenty-first century, universities face new challenges: to be competitive not just in educational, social, managerial and technological respects, but also to work from a global perspective, to be a driver for innovation and to contribute to sustainable development (ehlers and pawlowski, ; ehlers and schneckenberg, ; ossiannilsson, a, ; ossiannilsson and landgren, ). issues such as demonstrating respect for the individual student and their learning processes, accountability for the use of funding, both public and private, quality of education and research, and contributing to economic growth and sustainability have thus become more important (ehlers and pawlowski, ; ehlers and schneckenberg, ; ubachs, ). higher education institutions have to face increased demands for enhanced learning through new technology: digital skills in education, learning for the future in a global context within sustainable dimensions, and the integration of technology into all aspects of their strategic planning to ensure their survival in the years to come. the survey by nahe ( ) emphasised that e-learning must be assessed from a holistic point of view and argued that:

existing methods of quality assessment need to be adapted. there is a need that quality aspects for e-learning are integrated into existing quality assurance systems. internal competence and the provision of information in the e-learning area need to be guaranteed. internal working methods need to be adapted to the special conditions which apply for the assessment of borderless education (nahe, , p. ).
research and experience show that knowledge gaps on how e-learning can be embedded and integrated in ordinary quality assurance are both explicit and demanding (ossiannilsson and landgren, ). benchmarking is a rather new phenomenon in higher education (ubachs, ; ossiannilsson, b; moriarty, ; esmu, a, b, ). the definition of benchmarking, on the other hand, is not very explicit or clear (revica, ). enqa defined benchmarking as “[…] a learning process, which requires trust, understanding, selecting and adapting good practices in order to improve” (esmu, a, p. ). the locus of benchmarking lies between the current and desirable states of affairs, and benchmarking contributes to the transformation process that realises these improvements (moriarty, ; moriarty and smallman, ). benchmarking might identify the changes necessary to achieve the aims. the concept of change seems to be implicit in benchmarking: a change consistent with benchmarking-directed improvement processes. benchmarking is not only about change but also about improvement or, as harrington summarized already in : “all improvement is change, but not all change is improvement” (moriarty, , p. ). moriarty elaborated on this further and stated that benchmarking is not just about changes; it is more about identification and successful implementation. esmu ( a, b, ) emphasises that benchmarking is an ongoing process to improve the performance of higher education institutions. an extended literature review on benchmarking was carried out by esmu ( b), aiming to clarify the understanding of the concept. in addition, one of the underlying purposes of the study was to improve the practice of benchmarking in higher education, as a powerful tool to support improved governance and management in higher education.
according to esmu ( b) there are at least ten good reasons to use benchmarking as a management tool in higher education: to self-assess the institution; to gain a better understanding of processes; to measure, compare and discover new ideas; to obtain data to support decision making; to identify targets for improvement; to strengthen institutional identity; to formulate and implement strategy; to enhance reputation; to respond to national performance indicators and benchmarks; and to set new standards for the sector in the context of higher education reforms. esmu ( b, p. ) defined benchmarking as an “[…] internal organisational process aiming to improve the organization’s performance by learning about possible improvements of its primary and/or support processes by looking at these processes in other, better-performing organizations”.

e-learning is not easy to define either. most often the concept covers both technical and digital means, but it also covers e-learning as learning, and learning through e-learning (ossiannilsson, b). the concept is used to cover a wide set of applications, pedagogical processes and learning supported by information and communication technology, such as web-based learning, computer-based learning, virtual classrooms and digital collaboration, with the added value of increased accessibility, flexibility and interactivity. mcloughlin and lee ( ) stress the “three p’s of pedagogy” for the networked society: personalisation, participation and productivity. bonk ( ) shows how technology has transformed educational opportunities for learners, as well as those of innovators from the worlds of technology and education, and so reveals the power of opening up the world of learning.
new conceptualisations of e-learning in the twenty-first century will change the scene (ehlers and pawlowski, ; ehlers and schneckenberg, ; ossiannilsson and landgren, ) and may have an impact on how benchmarking of e-learning in higher education will be conducted in the future, and on what kinds of quality issues will matter. in a comprehensive literature review by ossiannilsson ( a), the context of benchmarking e-learning in higher education was explored. as the literature showed, the trend today is that e-learning is more and more embedded in strategies of learning and teaching at universities (ehlers and pawlowski, ; ehlers and schneckenberg, ; nahe, ; ossiannilsson and landgren, ; ubachs, ). enhancing learning, teaching and assessment by the use of technology is one of a number of ways in which institutions can address their own strategic missions.

3. material and methods
e-xcellence+
the eadtu’s e-xcellence instrument was developed to complement existing quality assurance systems in higher education, and not to interfere with current systems (ubachs, ). the quality benchmarking assessment instrument that was developed covered pedagogical, organisational and technical frameworks, with special attention to accessibility, flexibility, interactivity and personalization. the instrument was based on three elements:
1. first, a manual on quality assurance covering benchmarks on e-learning, with indicators related to the benchmarks, guidance for improvement and references to e-xcellence level performance. the benchmarks were grouped into three areas covering six fields in total: strategic management; products (curriculum design, course design, course delivery); and services (staff and student support), as illustrated in figure ;
2. second, assessors’ notes, which provided a more detailed description of the issues and approaches; and
3. finally, the tools, i.e. the online instrument.
the quickscan tool, which is based on e-xcellence level benchmarks and is independent of particular institutional or national systems, is supplemented by a full online manual; all of this is fully available on a web portal that was launched in (ubachs, ). during the development of e-xcellence+, stakeholders and policymakers were involved in addition to the partners within the project. the benchmarking can be accomplished as a so-called online quickscan, as a full assessment with evidence, or both.

[figure . the three main areas for the benchmarks and indicators according to e-xcellence+: management, products, services. source: ossiannilsson and landgren ( )]

the quickscan is a simplified version of the full assessment tool, which in turn is a comprehensive tool. the online quickscan offers the opportunity to comment on each specific issue by indicating: not adequate, partially adequate, largely adequate or fully adequate. after a completed online quickscan, feedback is immediately generated based on the manual and the assessors’ notes and e-mailed back to the responsible respondent. feedback is, however, given only for the answers not adequate and partially adequate. the quickscan approach was highly valued and led to commitment during the work. with the full assessment, the instrument also offers the opportunity to comment on each specific issue and to refer to documents, links or other references that can be used as reference on that specific aspect of e-learning. in , eua highlighted the initiative as: by modelling the e-xcellence tool on the needs and interests of institution and giving them a choice of modes with different degrees of intensity, the tool incorporates what has been endorsed on the european level as good practice in external quality assurance processes.
moreover, by developing a set of benchmarks for the european level to build its tool on, the e-xcellence project has contributed toward building a european dimension for the specific field of e-learning (ubachs, , p. ).

e-xcellence+ became the phase for valorisation of the instrument at local, national and european levels within higher and adult education. within e-xcellence+, eadtu wanted to broaden the implementation and to receive feedback for enhancing the instrument. the e-xcellence+ consortium consisted of expert representatives from open universities, traditional universities and assessment and accreditation bodies for higher and adult education. the consortium encompassed countries with an outreach to the rest of europe. e-xcellence+ was piloted during / at local seminars, and three universities carried out the full assessment, together with site visits and road maps. several universities carried out the quickscan. universities that conducted the full assessment, site visits and road maps, and committed themselves to continue benchmarking e-learning in higher education every second year, obtained the e-xcellence associated label.

eadtu, with its e-xcellence+ initiative, emphasised that any e-learning benchmarking initiative needs to be integrated with, and not interfere with, ordinary quality assessment in higher education institutions (ubachs, ). e-learning courses have, for a long time, been seen as special tracks in many universities. in the s this was probably needed, as the phenomenon and development of the internet was fairly new. at the present time, in the twenty-first century, e-learning is embedded in universities, and personalised, interactive and mobile learning, the use of social media and open educational resources (oer) are emphasised. e-learning quality criteria must therefore be integrated into any quality assurance systems, methods and movements, and critical success factors have to be identified within new environments, e.g.
social media and oer. this is almost certainly one of the crucial aspects and one of the benefits of benchmarking e-learning in higher education.

the quickscan tool was valorised through the e-xcellence+ project during and . introduction and dissemination of the tool was organised through local seminars in european countries. eadtu supported the improvement processes of e-learning by self-assessment, onsite assessment and accreditation, by embedding the instrument in national and institutional policy frameworks. five cases out of the universities involved at the time are included in this research.

the cases
in order to explore the complex and multifaceted phenomena in depth, this study used an exploratory multiple case study strategy (yin, ). a mixed-method approach was applied, combining quantitative but mainly qualitative data sources with integrated methods for analysing data (creswell and clarke, ; yin, ). a case study protocol was worked out for the data procedure (yin, ). the cases for the current study were selected from the local seminars conducted by eadtu at european universities (five out of ) (see table i). data for the cases were collected by the author, assisted by eadtu, in / . in this paper, the analyses from the conducted seminars are discussed.

data collection, procedure and analysis
altogether some participants (vice-rectors, management, professors and students) attended the five local seminars at the involved institutions in europe (explored in this paper) in the dissemination and valorisation phase of e-xcellence+. by that time, one of the five had conducted the full assessment and site visits and had worked out road maps. the data were collected mainly through reports from the seminars, but also using questionnaires and interviews following the case study protocol. the data were analysed within a holistic, but also within an embedded, multiple case design (yin, ).
according to yin ( ) the cases were also analysed as cross cases in order to identify similarities and differences and to provide further insight into processes and the generalising of the case study results.

findings
the questions for the seminars covered areas such as: application, added value, shortcomings, integration, institutional integration, next step and other issues. in the following, the answers from the five participating institutions, based on cross case analyses according to the areas mentioned above, are summarised.

application
the quickscan was conducted with staff at different levels (vice-rectors, professors, management and students). it was carried out through meetings, seminars, dialogues and questionnaires, both on an institutional and programme level (e.g. master program level).

added value
the institutions indicated that new views and recommendations came out of the assessment for further improvements. they stressed that it was a valuable exercise and process to go through and they obtained an overview of the performance at programme, faculty or institutional level. e-xcellence+ allowed the institutions to show their expertise in e-learning more than conventional assessments were doing. within e-xcellence+ dialogues an agenda was initiated for processes of quality enhancement and improvements. additionally the need for policy beyond a virtual learning environment was highlighted. as a team approach was necessary for conducting the quickscan, this also enabled teambuilding at all levels and allowed different stakeholders to take part, everyone from students to management. a comprehensive assessment approach was made possible at the same time as it served as a checklist.

table i. universities involved in local seminars, e-xcellence+ , by eadtu
university, number of individuals, local seminar date
(i) alpha, - november
(ii) beta, - march
(iii) gamma, - january
(iv) delta, - february
(v) epsilon, - march

quality enhancement on e-learning
the documentation and the internal discussions were expressed as benefits of high value. all institutions emphasised the power of benchmarking and the internal dialogues which were initiated through e-xcellenceþ . through a guided dialogue the team obtained a clearer understanding of the opportunity it offered to a critical study of the institution’s position in relation to other institutions, and they also discovered clearly defined paths of enhancement. it was explicitly expressed that the tool has to be used as a total entity. the benchmarks were relevant for the institutions. student evaluations were still missing as benchmarks and have to be added in the tool. the tool offered opportunities for different ambitions. the fundamental principles were easy to understand for formulating decisions; namely, what is the position now and what are the aims for the future? in addition, what are the central issues in the organisation and what will be the policy outlines? it was highlighted that the tool as such is flexible enough to make choices but needs fine-tuning. moreover, it is important to bear in mind that benchmarks can even be pre-selected based on relevance. the tools are improvement tool and not accreditation tools, which is important to bear in mind. in summary, the respondents expressed values on conducting benchmarking on e- learning as it obtains transparency, to start and maintain internal dialogues, to strengthen teambuilding and to develop trust and a culture of scholarship of teaching and learning. additional values were expressed as through the benchmarking process also discussions on the meaning and understanding of concepts such as e-learning meant different things to different persons and within the teams and that this was allowed among the institutions. thus, the understanding of benchmarks could be understood differently in different contexts. 
shortcomings
the shortcomings mentioned were that the benchmarks were overly dedicated to distance learning educational institutions. some institutions expressed that normative definitions should be used. benchmarks should be in a position to balance the context of the institution. the institutions emphasised that students are not involved explicitly; they should be added to the system, create their own benchmark exercise or be involved with the team. additional shortcomings were that the quickscan only provides answers that are not fully adequate or adequate. users might want feedback on all given answers. other shortcomings were that the benchmark formulations were sometimes too general but often also too complex. interpretations of the benchmarks were sometimes difficult, and there were also sometimes far too many aspects covered per benchmark. in addition, as the tool is in english, there were both language and linguistic barriers.

institutional integration
some institutions said that they operate in accordance with the enqa standards and therefore have a strong wish to have e-xcellence integrated/recognised by enqa. they also stated that it was immediately applicable as a self-assessment tool. in addition, institutions mentioned that it fitted in with the aims of the organisation. however, the tool needs fine-tuning. it was emphasised that the ambition must be in congruence with the ambition of the institution and within a step-by-step approach. contextualisation is necessary and the benchmarks should reflect a blended mode approach to teaching and learning.

next steps
the next step would be to investigate the integration of the benchmarks in the internal quality assurance processes and systems. all institutions expressed their willingness and their need to work out road maps based on e-xcellence.
one of the institutions stated that their national agency for higher education would like to integrate the system, and had taken initiatives to develop e-learning criteria themselves, but are now inspired by the e-xcellence. on the other hand, another institution stated that their national agency for higher education was doubtful of an e-xcellence associated label. other issues for next steps were expressed as the needs to include social media, web . and oer in the benchmarks and indicators. other issues as has been stated above students’ input was missing within the benchmarks and indicators. the tool as it was at the time being probably is best used for open universities and the issues in a blended mode context are underestimated. institutions stressed the challenges to incorporate e-learning in ordinary quality assurance processes. the function of the quickscan was not immediately clear and there were requests for a guide, e.g. to use the tool on an individual basis, within a team approach, and from certain roles within the institution, or to select relevant themes. there were even requests for guidelines for different scenarios on how to use the quickscan, e.g. who is rating and which benchmarks are answered by whom? feedback options and cultural differences were also emphasised. even demands for better links between the benchmarks and the manual were suggested. recommendations were also to provide a “light” version vs an advanced version. issues were raised on language and interpretations of benchmarks. some benchmarks were too compact and too complex, and there should be possibilities to give neutral answers. the quickscan was presented as an assessment, whereas some institutions understood it more like a signal tool for internal use, and thus with no need for any label. nevertheless, a label is just issued for institutions going through the whole process with full assessment, site visits and working out roadmaps. 
the institutions emphasised the discussions about the costs of recognition and, accordingly, about the use of the label, its usefulness and sustainability. in summary, at least five key findings became explicit through the research, on three levels. the value and impact of going through eadtu’s benchmarking were expressed within the institutions at all levels. on the foundational level, dialogue within the institution or the department, teambuilding and transparency were highlighted. on the second level, policy making and decisions, e.g. a policy statement, were emphasised. finally, on the third level, quality improvement and quality assurance were highlighted as the value and impact of taking part in benchmarking processes (see figure ).

discussion
the ten good reasons to conduct benchmarking described by esmu ( a) were largely confirmed and verified by the participating institutions in the local seminars. they also emphasised that the challenges for universities in the twenty-first century are to bring together all aspects of e-learning in a holistic framework, and to perceive it in a more contextualised manner. the fact that e-learning is more and more embedded in strategies on learning and teaching at universities nowadays is mostly a benefit, but what will the consequences be and how should universities pay attention to critical success factors, if there are any? experience from e-xcellence+ by eadtu can be expressed as both internal and external outcomes. internal outcomes were that, within the universities, individuals conducting the quickscan adhered to the same conceptual framework, which led to trust, transparency, and internal and extended dialogues. external outcomes were described as visibility for stakeholders, students, agencies and the public.
findings from this study emphasised that benchmarking must always serve to identify strengths and weaknesses and to gain better insight into the institution, with a view to setting targets and benchmarks for improvement and enhancement. benchmarking requires an explicit focus on continuous improvement and enhancement and on the search for best practices, and must be more than just a comparison of statistical data. a benchmark exercise must always be envisaged as a dynamic exercise with relevant benchmarks, as the aims are to identify good practice, which will lead to improvement and implementation of changes. further, benchmarking requires institutional willingness to increase organisational performance, to act as a learning organisation and to review processes on an ongoing basis. in addition, the process as such requires the motivation to search for new practice and readiness to implement new models of operation. there is a strong need for commitment from the very beginning, at both individual and management level, especially if the result of the process demands overriding changes, and for the implementation process. moreover, one success factor is the commitment to change. benchmarking requires institutional strategic development and is based on a continuous, long-term and professional approach.

figure . key findings at different levels on the use of eadtu benchmarking quickscan tool (levels, from foundation to top: dialogue, teambuilding, transparency; policy statement; quality enhancement)

conclusions
the impression is that constructive alignment of benchmarking e-learning in universities with the mandates of national governments and quality agencies will change the scenario and be of importance for quality enhancement in the twenty-first century.
this will be owing to changed learning and teaching paradigms with among issues as blended mode approaches, personalisation, participation, collaborative- ubiquitous- and open learning, oer, and social media and changed and new demands from the new millennium learners entering higher education. quality has to a higher extent to be valued from the learners’ dimensions and perspective as well as which currently are the most common i.e. learning outcomes and management. in addition, the discourse on scholarship of teaching and learning, including digital scholarship in a global knowledge-based sustainable society will be of utmost importance. although key benefits of benchmarking are well known, significant gaps still appear in the use of benchmarking practices in european higher education institutions. benchmarking is a powerful strategic tool to assist decision makers to improve quality and effectiveness of organisational processes and, ultimately, aims to build a european platform. through benchmarking, there can be large improvements in higher education institutions to meet international standards and guidelines, and to reach the position of the best international player in the higher education arena. other aspects are about fast-changing professional practice and globalisation and how to keep the staff in line with newly required competencies in a lifelong learning perspective. technology and digital scholarship is a useful tool for creating a new kind of university, but much more important are structural and cultural changes in which technology will play a supporting role. without these cultural and structural changes, technology cannot change the university on its own. 
will benchmarking e-learning in higher education, in alignment with national and international quality boards and agencies, prove to be a powerful tool for improving teaching and learning in a blended mode in the twenty-first century, and for supporting improved governance and management in higher education? more research has to be done from a holistic perspective to answer questions on the value and impact of benchmarking e-learning in higher education, such as the following w-questions: why and how shall benchmarking be conducted, what shall be scrutinised, when shall it be done and for how long, where shall it be done, and by and for whom?

references
bacsich, p. ( ), “evaluating impact of e-learning: benchmarking”, towards a learning society: proceedings of the elearning conference, brussels, may, pp. - .
bacsich, p. ( ), “benchmarking e-learning in uk universities: the methodologies”, in mayes, t. and higher education academy (eds), higher education academy and related national e-learning initiatives, higher education academy, bristol, pp. - .
bonk, c.j. ( ), the world is open: how web technology is revolutionizing education, jossey-bass, san francisco, ca.
creswell, j.w. and clarke, p. ( ), designing and conducting mixed methods research, sage publications, thousand oaks, ca.
ehlers, u.-d. and pawlowski, j. (eds) ( ), “quality in european e-learning: an introduction”, handbook on quality and standardization in e-learning, springer, berlin, hamburg and new york, ny, pp. - .
ehlers, u.-d. and schneckenberg, d. (eds) ( ), “introduction: changing cultures in higher education”, changing cultures in higher education, springer, berlin, heidelberg, pp. - .
the european centre for strategic management of universities (esmu). (eds) ( a), benchmarking in european higher education. findings of a two-year eu funded project, esmu, brussels.
the european centre for strategic management of universities (esmu). (eds) ( b), a practical guide. benchmarking in european higher education, esmu, brussels.
the european centre for strategic management of universities (esmu). (eds) ( ), a university benchmarking handbook. benchmarking in higher education, esmu, brussels.
mcloughlin, c. and lee, m.j.w. ( ), “the three p’s pedagogy for the networked society: personalisation, participation and productivity”, international journal of teaching and learning in higher education, vol. no. , pp. - .
moriarty, j.p. ( ), “a theory of benchmarking”, unpublished phd thesis, lincoln university, lincoln.
moriarty, j.p. and smallman, c. ( ), “en route to a theory on benchmarking”, benchmarking: an international journal, vol. no. , pp. - .
ossiannilsson, e. ( a), “benchmarking on e-learning in universities: impact and value, european perspectives”, international journal of management in education, special issue on virtual university, accepted.
ossiannilsson, e. ( b), “benchmarking e-learning in higher education findings from eadtu’s e-xcellence+ project and esmu’s e-learning benchmarking exercise”, in soinila, m. and stalter, m. (eds), quality assurance of e-learning, the european association for quality assurance in higher education (enqa), helsinki, pp. - .
ossiannilsson, e. ( ), “findings from european benchmarking exercises on e-learning: value and impact”, journal of creative education, vol. no. , pp. - .
ossiannilsson, e. and landgren, l. ( ), “quality in e-learning – a conceptual framework based on experiences from three international benchmarking projects at lund university, sweden”, journal of computer assisted learning, special issue on quality in e-learning, vol. no. , pp. - .
revica ( ), “bibliography of benchmarking, reviewing traces of european virtual campuses”, available at: www.virtualcampuses.eu/index.php/bibliography_of_benchmarking (accessed june ).
schreurs, b. ( ), reviewing the virtual campus phenomenon.
the rise of large-scale e-learning initiatives worldwide, europace ivzw, leuven.
the swedish national agency for higher education (nahe) ( ), e-learning quality: aspects and criteria, högskoleverket, nahe, stockholm.
ubachs, g. ( ), quality assessment for e-learning a benchmarking approach, european association of distance teaching universities (eadtu), heerlen.
yin, r.k. ( ), case study research. design and methods, sage publications, inc, thousand oaks, ca.

about the author
e.s.i. ossiannilsson is a phd candidate at oulu university, department of industrial engineering and management, finland. she is also a senior administrative officer/project manager/flexible learning adviser at lund university, human resources, staff and educational development, centre for educational development, sweden. she has, for the last ten years, worked with regional, national and international projects on e-learning and open educational resources and web . in higher education and within quality issues and benchmarking. she is affiliated with several international organisations such as eden-nap, efquel, eucen, icde and sverd. she serves as expert in the quality grid epprobate and is service development partner at oer services. she serves as referee for national and international journals and elearning europa.eu, as well as being on the boards for international conferences.
her research focuses on quality and benchmarking e-learning in higher education, principally regarding processes, values and impacts and with particular interest in the benchmarking exercises carried out through eadtu, the e-xcellence and e-xcellence+ projects and the esmu elearning , benchmarking exercises. e.s.i. ossiannilsson can be contacted at: ebba.ossiannilsson@oulu.fi

we need to talk about the digital humanities job
talkinghumanities.blogs.sas.ac.uk

dr james o’sullivan, a lecturer in digital arts and humanities at university college cork, explains why institutions need to think very carefully about the demarcation between public and digital humanities, because while they are related, they are not necessarily the same thing.

this is not a commentary on the definition, legitimacy, or future of digital humanities (dh) – there is already enough of that around. rather, it is a treatment of one of the field’s most significant yet elided aspects – jobs. not just any job, not the tenure-track professorship wherein digital humanities is combined with an established discipline like literary studies or history; this is an exploration of ‘the dh job’. i refer to positions largely considered to be ‘alt-ac’ designed to support the development of dh within a particular institution. this is both a matter of pragmatics and ethics: the extent to which such roles align with existing frameworks needs to be fully appreciated if they are to benefit higher education, and we shouldn’t be putting people in these positions until we’ve answered such questions.

so, what is the dh job? it is one that tasks its holder with ‘supporting research and teaching in digital humanities’.
essentially, it goes to someone who quickly becomes ‘the dh person’ called upon to do a lot of things, from teaching classes to advising colleagues on how to install wordpress (which, in my book, isn’t digital humanities). the tasks assigned to an institution’s post-holder can run the intellectual gauntlet, but a lot of time is spent explaining that, no, help with twitter isn’t really part of the purview. now would be a good time to point out that i have held two such positions, and in both cases, i have nothing but positive things to say about my colleagues, and the institutions for which i worked. but i do speak from experience, and if your job is to support digital humanities, you’re going to spend a lot of time explaining why you can’t support something. a common example would be web design. when faculty seek out a colleague who can help them with their dh activities, they often want someone who can make their work more accessible. institutions need to think very carefully about the demarcation between public and digital humanities, because while they are related, they are not necessarily the same thing. this is particularly problematic at institutions that want to align with digital humanities because everybody else is doing it – these are the institutions where the aim is to produce visible scholarship, not visible scholarship. it is a peculiar situation when one must ask a string of seasoned professors, ‘where is the scholarly value?’ essentially, the digital humanities person is responsible for all things dh, but this is entirely dependent on the support systems already in place. at some universities, you will be able to point colleagues towards resources more suited to their requirements.
at others, you might just have to do the collegial thing and help them out with their social media. but this is where collegiality becomes dangerous. at annual review time, the dh person might find it hard to articulate what’s been keeping him or her busy. those quick sit-downs to run through a platform and that one-to-one instruction and advice over lukewarm coffee add up, but that time is often unaccounted for. six months into my first digital humanities job, it was clear that the bulk of my time had been spent being a good colleague, but a poor institutional resource. supporting the individual is part of the broader agenda, but one must distinguish between activity which supports the institution as a whole, and that which satisfies a colleague who’d like to try something digital, but isn’t committed in the long term to whatever that digital thing might be. most institutions want digital humanities, but only some know why, and even fewer have really thought about what it looks like in the context of their scholarship and teaching. there are places where ‘capacity’ relates to a concrete set of activities and processes to which the dh person will be contributing. but there are also institutions where ‘building capacity’ means, ‘we don’t really know what we want, we just know that we don’t have it’. i have a game i like to play with search committees when i interview for a job. i ask them for their understanding of digital humanities. you never get consensus, which is positive in some respects, but you often get such vague and tentative answers that you wonder if they have even heard of it before. how is the person in role supposed to build capacity if the institution doesn’t know what it is they want to build?
one could argue that this is the ideal scenario, as you can shape the agenda to suit your vision, but do we really want institutional capacity being guided by the perspective of someone who, like everyone else, will be carrying preconceptions and assumptions forged by personal experiences and disciplinary biases? one of the hardest parts of being the dh person is that you’re often the lone ranger, deprived of a base department where you’re surrounded by an intellectual community of peers. faculty and staff must go somewhere, but while the colleague in the neighbouring office might be entirely amicable, the likelihood is that their interests will lie elsewhere. the dh person cannot be all things to all people so they need to be based in the school or department where they can have the most impact and where their interests and expertise can best be utilised and nurtured. if that person comes from a literary studies background for example, it will not be long before they become dissatisfied with being cut off from the english department. community is as much a social matter as it is an intellectual one. the repartee in the corridor, the chance meeting of minds, the sympathetic nod when an application is rejected, the last minute decision to go for a quick drink – these are the things that make people like where they work. coupled with this risk of isolation is the need to avoid stepping on toes. institutional desires to hire someone responsible for digital humanities tend to emerge from pre-existing curiosities and activities, meaning the new employee might be perceived as a threat by colleagues. it is not nice being the hire who is viewed with suspicion. where the dh person should live will also tell you something about how their time should be occupied.
this goes back to notions of success: if you hire a postdoc, then freedom to pursue their own scholarship is essential; otherwise, a lot of time will be spent contributing to projects which fail to excite, and disinterested parties make for the worst collaborators. the more they are required to step beyond their own interests, the more concerned they will become with future prospects – if the dh person wants to go back to that traditional teaching role, or at least, keep the option open, how much of their current position is going to be relevant to their next application? it doesn’t matter if you’re a literary scholar with expertise in text analysis, if all your publications are in history and sociology. what are the prospects for someone who wants to stay in alt-ac? the dh job is usually a junior position, and as a role with no real antecedents in arts and humanities faculties, the avenues for career progression are not at all clear. these are often fixed-term roles; what happens when the contract runs its course? in my experience, search committees for the dh job do not ask where you see yourself in five years – it’s something they really don’t want to consider. will we start to see senior versions of the dh job emerge, or is the idea that capacity building will lead to new opportunities? professionals need clearly defined paths for advancement or they will not see their job as a career. maybe this is a feasible model – with all that capacity building, maybe the need for ‘a centre’ will emerge, with the logical step being that the dh person runs such a resource. but if that’s the idea, then who should that person be? should he or she be a scholar, a domain-specific expert who has advanced computational expertise, or a community builder whose scholarly expertise is less important than the ability to engage others?
perhaps the dh person should be a grant writer, because centres are expensive, or a project manager, capable of delivering digital projects and running a facility that would involve a considerable amount of development activities? ideally, the dh person would be all these things, a glorious amalgam that can single-handedly secure funding, execute the day-to-day administration of all that precious capacity, lead a team of humanities scholars and software developers, and maintain their own reputation as an international scholar. i don’t know many people who satisfy all these requirements, and those that do, tend to like where they live. the digital humanities job is a good thing. it creates a space for people who do not want that traditional role, and it allows institutions to build something. whether that something has value depends on the ability of stakeholders to really think deeply about what it is that they want, why they want it, who can provide it, and how that person might remain professionally engaged and personally fulfilled.

dr james o’sullivan (@jamescosullivan) lectures in digital arts and humanities at university college cork. his work has been published in a variety of interdisciplinary journals, including digital scholarship in the humanities, digital humanities quarterly, leonardo, and hyperrhiz: new media cultures. he and shawna ross are the editors of reading modernism with machines ( ). he is the author of several collections of poetry, including courting katie ( ), and the founding editor of new binary press. his writing has also appeared in the guardian and la review of books.
liber dh & dch working group - use case survey

q name
demmy verbeke
q library
ku leuven libraries, artes
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
research data management support; teaching about data science, rdm, databases, etc.; support for the preparation of project proposals with digital component; r&d projects initiated by the library (ocr, ner and linked data)
q please describe the positive aspects of this activity
re-establish the library as a central partner in research
q please describe what could be going better concerning this activity
not all library staff has received the necessary training, so there is big train-the-trainer need
q how long have you been doing this activity in your library?
- years
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
yes, it is part of the library’s policy
q do you have a dedicated budget for this activity?
no
q are you assessing the impact of your work? if no, why not?
not exactly clear how we would do this
q what kind of collection are you using in the activity?
a born-digital collection we have a license to; a digitised collection we have a license to; a born-digital collection we created and curate; a digitised collection we created and curate; metadata
q how is the data you are using licensed?
public domain, cc ; any other cc- license; copyrighted
q how did you find/build a relationship with the researchers working in this activity?
training / outreach events; physical space – facilitating/hosting; other forums – committees, faculty liaison, others?; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity? advisory / consultation roles; skills training and development
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity? yes, namely: rdm training, oa training, copyright training
q if you have followed or offered any dh training for librarians, how is it organised? it belongs to a personal professional development programme; it belongs to a library-wide training programme; it belongs to the scope of this activity
q how aware are academics in your institution of the dh activities the library is active in? aware (they know (some of) the activities the library is active in)
q if the library promotes the activity, where does it do so? articles in journals, conferences, own website
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity: https://bib.kuleuven.be/english/research; and esp.
https://bib.kuleuven.be/english/research/digital-humanities
see also:
http://www.tijdschriftkarakter.be/de-vrije-hand-de-valorisatie-van-digitaal-onderzoek-in-de-menswetenschappen/
https://lirias repo.kuleuven.be/bitstream/id/ /
https://lirias repo.kuleuven.be/bitstream/id/ /
http://www.northernrenaissance.org/renaissance-studies-digital-humanities-and-the-library/
https://www.digitisation.eu/tools-evaluation- university-library-ku-leuven/
https://acrl.ala.org/dh/ / / /opportunistic-librarian/
https://lirias repo.kuleuven.be/bitstream/id/ /

q name: andreas degkwitz
q library: library of the humboldt university berlin
q is it ok if we share this use case (including the name of library)? yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: digitising the fairy tale collection within the private library of jacob and wilhelm grimm and making the annotations in these books searchable. we have started to prepare the project and an application to the german research foundation.
q please describe the positive aspects of this activity: the fairy tale collection, like the entire private library of the grimm brothers, is relevant for dh projects because of the annotations, comments and remarks made by jacob and wilhelm grimm and other researchers of the th century. the corpus of books covers about volumes ( % of the private library). we have to apply for funds from the german research foundation so that we can run the activity as a project.
q please describe what could be going better concerning this activity: the cooperation between the library and the grimm researchers (grimm-arbeitsstelle) of the university is going very well. we have had, and still have, some problems finding the necessary technical expertise to identify the annotations in the full texts automatically and to make these elements searchable.
our own competence is too limited to realise the project. we need technical cooperation with an appropriate institution.
q how long have you been doing this activity in your library? - years
q how many people of your library are involved in the activity? -
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? yes, it is part of the library’s policy
q do you have a dedicated budget for this activity? yes, with external funding
q are you assessing the impact of your work? if yes, how? the project will be a pilot to process further materials of the grimms' private library in this way
q what kind of collection are you using in the activity? a digitised collection we created and curate
q how is the data you are using licensed? public domain, cc
q how did you find/build a relationship with the researchers working in this activity? through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity? digital content / collections; advisory / consultation roles
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity? yes, namely
q if you have followed or offered any dh training for librarians, how is it organised? it belongs to a personal professional development programme
q how aware are academics in your institution of the dh activities the library is active in? aware (they know (some of) the activities the library is active in)
q if the library promotes the activity, where does it do so?
articles in journals, conferences, own website
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity: the project is in the phase of preparation. therefore no project site is available yet. here are some links about our activities and collections:
- https://www.digi-hub.de/viewer/
- https://www.digi-hub.de/viewer/sammlungen/

q name: sinéad keogh
q library: university of limerick
q is it ok if we share this use case (including the name of library)? yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: we created an online exhibition following the lives of one family throughout the first world war. using the letters, diaries and photographs from the armstrong family archive, we add a new post each week to show their lives years ago that week.
q please describe the positive aspects of this activity: engagement with the archive has dramatically increased, with subscribers, followers and likes on the site and social media platforms. we have also had people contact us to give further information, to correct errors and even to contribute material to the archive.
q please describe what could be going better concerning this activity: with the ww centenary commemorations approaching, we really wanted to do this project but the resources were not available. so it became a labour of love to do it. more staff and time resources would have been useful.
q how long have you been doing this activity in your library? - years
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? no, it is an ad-hoc activity
q do you have a dedicated budget for this activity? no
q are you assessing the impact of your work? if yes, how? google analytics, social media, publications
q what kind of collection are you using in the activity? a digitised collection we created and curate
q how is the data you are using licensed? copyrighted
q how did you find/build a relationship with the researchers working in this activity? online presence – dh lab/portal, social media; physical space – facilitating/hosting; other forums – committees, faculty liaison, others?
q what would be the main topics describing your relationship with the researchers working in the activity? digital content / collections; skills training and development
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity? yes, namely: html, php, digitisation
q if you have followed or offered any dh training for librarians, how is it organised? it belongs to a personal professional development programme
q how aware are academics in your institution of the dh activities the library is active in? not aware (they have no idea the library does anything related to dh)
q if the library promotes the activity, where does it do so? articles in journals, conferences, own website, partner's website, social media, blog posts
q where can we find any extra information about the activity?
please list all the links to websites, news items, blog posts, articles or any other communication about the activity:
http://longwaytotipperary.ul.ie/
https://www.facebook.com/armstrongfamilymoyaliffe
https://twitter.com/ww ul

q name: anda baklāne
q library: national library of latvia
q is it ok if we share this use case (including the name of library)? yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: provide collections of text files for researchers ("corpus on demand")
q please describe the positive aspects of this activity: ( ) there is a growing interest among researchers, and we can see potential in this activity. ( ) it allows us to build upon already existing digital resources/services.
q please describe what could be going better concerning this activity: unclear long-term funding solutions.
q how long have you been doing this activity in your library? under a year
q how many people of your library are involved in the activity? -
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? yes, it is part of the library’s policy
q do you have a dedicated budget for this activity? no
q are you assessing the impact of your work? if yes, how? not yet, but we plan to do so.
q what kind of collection are you using in the activity? a digitised collection we created and curate
q how is the data you are using licensed? public domain
q how did you find/build a relationship with the researchers working in this activity?
training / outreach events; online presence – dh lab/portal, social media; physical space – facilitating/hosting; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity? digital content / collections; advisory / consultation roles; skills training and development
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.), everything starting from understanding what kinds of files/data are needed, how to deliver them, what advice to give about the further handling and computation of data.
q did you follow or offer any training for librarians as part of this activity? yes, namely: this summer we are organizing a day course at the library (with invited lecturers). we also try to learn from colleagues at conferences and seminars, and by asking for advice.
q if you have followed or offered any dh training for librarians, how is it organised? it belongs to the scope of this activity
q how aware are academics in your institution of the dh activities the library is active in? aware (they know (some of) the activities the library is active in)
q if the library promotes the activity, where does it do so? conferences, own website, partner's website, social media
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity:
http://www.digitalhumanities.lv/
http://www.digitalhumanities.lv/bssdh/
https://www.lnb.lv/en/researchers/digital-humanities

q name:
q library: stockholm university library
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website, but please do not include my name
q please describe one of the activities that you are doing in your library that you would define as dh: digitization (scanning and, in some cases, optical character recognition of printed materials)
q please describe the positive aspects of this activity: increases findability and availability of research materials. full-text scans also allow for various types of new inquiry (text-mining, etc.).
q please describe what could be going better concerning this activity: a long-term plan with dedicated funding. we could also benefit from more detailed workflows.
q how long have you been doing this activity in your library? more than years
q how many people of your library are involved in the activity? -
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? no, it is an ad-hoc activity
q do you have a dedicated budget for this activity? yes, with internal funding
q are you assessing the impact of your work? if yes, how? we have some limited statistics on numbers of views/downloads, but only for certain items, and these statistics are not collected systematically.
q what kind of collection are you using in the activity? a born-digital collection we have a license to; a digitised collection we have a license to; a digitised collection we created and curate
q how is the data you are using licensed? public domain, copyrighted
q how did you find/build a relationship with the researchers working in this activity? other forums – committees, faculty liaison, others?; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections
q what were the most significant skill-gaps you identified for this activity? soft skills (communication, project management, etc.)
q did you follow or offer any training for librarians as part of this activity? no, because our digitization personnel were self-taught.
q if you have followed or offered any dh training for librarians, how is it organised? not applicable, we did not follow or offer any training
q how aware are academics in your institution of the dh activities the library is active in? not aware (they have no idea the library does anything related to dh)
q if the library promotes the activity, where does it do so? own website
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity:
su library digitization information site: https://www.su.se/english/library/research-support/digitalisation
su map collections (in swedish): https://kartavdelningen.sub.su.se/kartrummet/
digitized su special collections in libris (federated catalog in sweden): http://libris.kb.se/hitlist? d=libris&q=db% adigi+images.sub.su.se&f=simp&spell=true&hist=true&p=
digitized dissertations from su (in diva, federated infrastructure for academic publications in sweden): http://su.diva-portal.org/smash/resultlist.jsf?dswid=- &language=sv&searchtype=research&query=&af=[]&aq=[[]]&aq = [[{% dateissued% % a{% from% % a% % % c% to% % a% % }}% c{% publicationtypecode% % a[% monographdoctoralthesis% % c% comprehensivedoctoralthesis% ]}]]&aqe= []&noofrows= &sortorder=author_sort_asc&sortorder =title_sort_asc&onlyfulltext=false&sf=all

q name: lotte wilms
q library: kb, national library of the netherlands
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: we run a researcher-in-residence programme where we invite an early career researcher to join the library for months for , fte to do a research project with our digital collections. we offer help from a programmer, advisor, and collection specialists.
q please describe the positive aspects of this activity: we build our network in the academic community, and we learn a great deal from how researchers work with our collection and how we can improve our collections and access to them.
q please describe what could be going better concerning this activity: last year we had very few proposals, so this year we decided to open the call for proposals for a longer period, invite more researchers and set up consultation slots.
q how long have you been doing this activity in your library? - years
q how many people of your library are involved in the activity? -
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? yes, it is part of the library’s policy
q do you have a dedicated budget for this activity? yes, with internal funding
q are you assessing the impact of your work? if yes, how? we recently did an evaluation and interviewed previous participants.
q what kind of collection are you using in the activity? a born-digital collection we created and curate; a digitised collection we created and curate; metadata
q how is the data you are using licensed? public domain, copyrighted
q how did you find/build a relationship with the researchers working in this activity?
online presence – dh lab/portal, social media; other forums – committees, faculty liaison, others?; through existing relationships; other (please specify): call for proposals
q what would be the main topics describing your relationship with the researchers working in the activity? digital content / collections; advisory / consultation roles; other services or technical expertise (please specify): tool development
q what were the most significant skill-gaps you identified for this activity? none
q did you follow or offer any training for librarians as part of this activity? no, because
q if you have followed or offered any dh training for librarians, how is it organised? not applicable, we did not follow or offer any training
q how aware are academics in your institution of the dh activities the library is active in? aware (they know (some of) the activities the library is active in)
q if the library promotes the activity, where does it do so? articles in journals, conferences, own website, partner's website, social media, blog posts
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity:
https://www.kb.nl/en/organisation/research-expertise/researcher-in-residence
http://lab.kb.nl/about-us/blog
https://arbido.ch/de/ausgaben-artikel/ /zusammenarbeit/the-researcher-in-residence-programme-at-the-kb-national-library-of-the-netherlands

q name: kirsty lingstadt
q library: university of edinburgh
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: we are working on improving accuracy of ocr for mid th to mid th century scottish printed text with a focus on legal material. as part of this work we are looking at automated metadata extraction, including places and people, and are developing tools to facilitate this.
q please describe the positive aspects of this activity: though project-specific, the methods and tools can be more widely applied and will take us further forward with the development of digital humanities tools required for students and researchers.
q please describe what could be going better concerning this activity: the challenge is funding, as just now we are working on project-specific funding. once this comes to an end we will need to find a way to ensure that we can continue this work.
q how long have you been doing this activity in your library? - years
q how many people of your library are involved in the activity? -
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? yes, it is part of the library’s policy
q do you have a dedicated budget for this activity? yes, with external funding
q are you assessing the impact of your work? if yes, how? measuring accuracy of tools and processes as well as the ability to use these for other projects.
q what kind of collection are you using in the activity? other (please specify): collection we are digitising and are making available cc-by
q how is the data you are using licensed? any other cc- license
q how did you find/build a relationship with the researchers working in this activity?
other (please specify): currently working to build relationships with researchers.
q what would be the main topics describing your relationship with the researchers working in the activity? digital content / collections; digital storage / preservation / hosting; other services or technical expertise (please specify): tools for metadata extraction.
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.), also attracting researchers to the content
q did you follow or offer any training for librarians as part of this activity? no, because more general training has been offered to library staff, e.g. library software carpentry
q if you have followed or offered any dh training for librarians, how is it organised? other (please specify): wider dh training being delivered within the university and the library is engaged in developing some of the programming
q how aware are academics in your institution of the dh activities the library is active in? somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so? conferences, own website, social media, blog posts
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity: website area not live yet - due to go live late july .

q name: per cullhed
q library: uppsala university library
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: the hosting of a digital repository (alvin) and services to promote the use of its content.
q please describe the positive aspects of this activity: the creation of digital media for use in, for example, dh without the subscription costs associated with bought digital media
q please describe what could be going better concerning this activity: growth is slowed down by the need to maintain traditional library services
q how long have you been doing this activity in your library? more than years
q how many people of your library are involved in the activity? more than
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? yes, it is part of the library’s policy
q do you have a dedicated budget for this activity? yes, with a combination of internal and external funding
q are you assessing the impact of your work? if yes, how? download statistics and other knowledge of usage
q what kind of collection are you using in the activity? a born-digital collection we have a license to; a digitised collection we have a license to; a digitised collection we created and curate; metadata
q how is the data you are using licensed? public domain, cc , any other cc- license , copyrighted
q how did you find/build a relationship with the researchers working in this activity? training / outreach events; online presence – dh lab/portal, social media; other forums – committees, faculty liaison, others?
; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity? digital content / collections; advisory / consultation roles; skills training and development; other services or technical expertise (please specify): use of the web for building contexts based on digital content
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.), there is a general lack of combined programming and subject skills
q did you follow or offer any training for librarians as part of this activity? yes, namely: promoting the new competences needed (in many ways)
q if you have followed or offered any dh training for librarians, how is it organised? other (please specify): lectures, workshops, study visits and regular courses
q how aware are academics in your institution of the dh activities the library is active in? aware (they know (some of) the activities the library is active in)
q if the library promotes the activity, where does it do so? articles in journals, conferences, own website, social media, blog posts, other (please specify): lectures
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity:
www.alvin-portal.org
http://ub.uu.se/om-biblioteket/forum-for-digital-humaniora
https://liber .org/wp-content/uploads/ / /ws _cullhed_digital-humanities-forum.pdf
http://www.ub.uu.se/anvand-biblioteket/labb/

q name: caleb derven
q library: glucksman library, university of limerick
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: building and deploying a digital library service
q please describe the positive aspects of this activity: offering a new service to the research community in our university; engaging with new technologies; creating a (hopefully) sustainable digital scholarship tool.
q please describe what could be going better concerning this activity: more dedicated staff for creating resources; full-time developer resources
q how long have you been doing this activity in your library? - years
q how many people of your library are involved in the activity? -
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? yes, it is part of the library’s policy
q do you have a dedicated budget for this activity? yes, with internal funding
q are you assessing the impact of your work? if yes, how? we regularly assess analytic metrics related to the use of the resource.
q what kind of collection are you using in the activity? a digitised collection we created and curate; metadata
q how is the data you are using licensed? cc , any other cc- license , copyrighted
q how did you find/build a relationship with the researchers working in this activity? scoping tools – environmental scans, surveys; training / outreach events; other forums – committees, faculty liaison, others?; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections; digital storage / preservation / hosting; advisory / consultation roles; skills training and development
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity? yes, namely: metadata training
q if you have followed or offered any dh training for librarians, how is it organised? it belongs to the scope of this activity
q how aware are academics in your institution of the dh activities the library is active in? somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so? not applicable, the library does not promote the activity
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity: respondent skipped this question

q name: clara riera
q library: universitat oberta de catalunya
q is it ok if we share this use case (including the name of library)? yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh: give training and consultancy to dh researchers for digital projects
q please describe the positive aspects of this activity: interdisciplinary, connecting inside research to outside infrastructure
q please describe what could be going better concerning this activity: more time at the beginning of the research project to spend with library research services
q how long have you been doing this activity in your library? under a year
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy? no, it is an ad-hoc activity
q do you have a dedicated budget for this activity? no
q are you assessing the impact of your work? if no, why not? at the moment the pilot is in an initial phase
q what kind of collection are you using in the activity? other (please specify): catalogues
q how is the data you are using licensed? cc , copyrighted
q how did you find/build a relationship with the researchers working in this activity? training / outreach events; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity? digital content / collections; digital storage / preservation / hosting
q what were the most significant skill-gaps you identified for this activity? hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity? yes, namely: we took part in a consortium course
q if you have followed or offered any dh training for librarians, how is it organised? not applicable, we did not follow or offer any training
q how aware are academics in your institution of the dh activities the library is active in? not aware (they have no idea the library does anything related to dh)
q if the library promotes the activity, where does it do so? not applicable, the library does not promote the activity
q where can we find any extra information about the activity?
please list all the links to websites, news items, blog posts, articles or any other communication about the activity
http://transfer.rdi.uoc.edu/en/node/ (project website)
http://www.ub.edu/openscienceandthehumanities/ / / /social-networks-of-the-past-mapping-hispanic-and-lusophone-literary- modernity- - / (conference presenting the research project)

q name
susan halfpenny
q library
university of york
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
digital creativity week (https://www.york.ac.uk/library/news/ /digital-creativity/): a week-long event working with archival data in creative ways. students spent the week looking at data from the yorkshire dictionary project (https://www.york.ac.uk/borthwick/projects/yorkshire-dictionary/). activities included data cleaning and visualisation, image and audio editing, and coding. students showcased their outputs to staff at an event.
q please describe the positive aspects of this activity
collaborative endeavour: staff from the library, it services and archives collaborated to host the event and provide a range of activities. damien murphy, the research champion for creativity, supported the development of the event and provided useful academic contacts.
archival data: yorkshire dictionary data provided by the archives enabled students to get involved in a current project. students used this as a source of inspiration and also picked up other data sources as they progressed with the project.
creative digital output: students produced a digital story linked to the data they had explored. the project ended with them presenting their work in the space to enable an immersive experience for the audience, which incorporated a visual presentation, audio and ar triggers.
q please describe what could be going better concerning this activity
this is the first year we have run this event. however, the plan is to run the event again in . some of the considerations for going forward include:
structure of the week: students came up with their own ideas and the output wasn't fixed. more direction and structure would be worthwhile in the future to enable students to settle on an idea for their projects earlier in the week.
marketing and communications: we had a lot of students drop out of the event, which meant we only had participants. for future events, earlier communications and over-recruitment would be considered to enable us to work with a bigger group.
smaller events: as well as running a week-long event, we would like to run more one-off events to enable us to reach more staff and students.
q how long have you been doing this activity in your library?
under a year
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
no, it is an ad-hoc activity
q do you have a dedicated budget for this activity?
yes, with internal funding
q are you assessing the impact of your work?
yes: participant feedback
q what kind of collection are you using in the activity?
other: archive data
q how is the data you are using licensed?
copyrighted
q how did you find/build a relationship with the researchers working in this activity?
through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity?
advisory / consultation roles, skills training and development
q what were the most significant skill-gaps you identified for this activity?
hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity?
no, because staff involved had the skills to support the activity
q if you have followed or offered any dh training for librarians, how is it organised?
not applicable, we did not follow or offer any training
q how aware are academics in your institution of the dh activities the library is active in?
somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so?
own website, social media, blog posts
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity
https://www.york.ac.uk/library/news/ /digital-creativity/

q name
despoina gkogkou
q library
library and information center, university of patras
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
we hold several digital collections (digitized periodicals, books, manuscripts and other archival materials). we also run a digital publishing platform (open access serials in several humanities and social science disciplines). the library hosted a series of dh seminars on literary networks organized by the department of mathematics and a meeting on dh in greece organized by the department of philology.
q please describe the positive aspects of this activity
our collections of digitized press of the th and th centuries are considered among the most important in greece and are widely used by scholars and researchers. the content of our collections is valuable for greek and comparative philologists, historians of the press and archivists, and the activity has given the library a stimulus towards digital humanities.
q please describe what could be going better concerning this activity
it would be of great benefit if we could enhance the processing of our data (ocr, text mining techniques).
q how long have you been doing this activity in your library?
more than years
q how many people of your library are involved in the activity?
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
no, it is an ad-hoc activity
q do you have a dedicated budget for this activity?
no
q are you assessing the impact of your work?
yes: through our in-person relations and communication with the academics in our university who use the collections most
q what kind of collection are you using in the activity?
a digitised collection we have a license to, a born-digital collection we created and curate, a digitised collection we created and curate, metadata
q how is the data you are using licensed?
public domain, cc , any other cc- license
q how did you find/build a relationship with the researchers working in this activity?
training / outreach events, physical space – facilitating/hosting, through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections, advisory / consultation roles, skills training and development
q what were the most significant skill-gaps you identified for this activity?
hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity?
yes, namely: we offer training and guidance to interns adding content and metadata
q if you have followed or offered any dh training for librarians, how is it organised?
it belongs to a personal professional development programme
q how aware are academics in your institution of the dh activities the library is active in?
somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so?
articles in journals, conferences, partner's website, social media
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity
https://library.upatras.gr/english (general information)
https://library.upatras.gr/digital (digital collections - in greek only)

q name
ioannis clapsopoulos
q library
university of thessaly library & information centre
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
our main activity in dh is through a recently initiated research project titled “thessaly memory documentation and communication” [the.me.do.com], which constitutes the main part of the more wide-ranging “university of thessaly historical archive design and implementation” project.
the.me.do.com is a collaboration project, with a research team comprising central library [http://www.lib.uth.gr/lws/en/en_hp.asp] staff and teaching-research staff members of the university’s department of history, archaeology and social anthropology (http://www.ha.uth.gr/setlanguage.php?lang=en). a project webpage (in greek and english), including a progress log, is maintained on the research gate platform: http://bit.ly/ uyaqbh
q please describe the positive aspects of this activity
our main activity in dh is through a recently initiated research project titled “thessaly memory documentation and communication” [the.me.do.com], which constitutes the main part of the more wide-ranging “university of thessaly historical archive design and implementation” project. the.me.do.com is a collaboration project, with a research team comprising central library [http://www.lib.uth.gr/lws/en/en_hp.asp] staff and teaching-research staff members of the university’s department of history, archaeology and social anthropology (http://www.ha.uth.gr/setlanguage.php?lang=en). an initial project webpage (in greek and english), including a progress log, is maintained on the research gate platform: http://bit.ly/ uyaqbh
q please describe what could be going better concerning this activity
since the the.me.do.com project is at an early stage of implementation, its progress will be evaluated for the first time by the end of .
q how long have you been doing this activity in your library?
under a year
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
yes, it is part of the library’s policy; yes, it is part of the parent organisation’s policy
q do you have a dedicated budget for this activity?
yes, with a combination of internal and external funding
q are you assessing the impact of your work?
no: the first impact assessment will be done within .
q what kind of collection are you using in the activity?
a born-digital collection we created and curate, a digitised collection we created and curate, metadata
q how is the data you are using licensed?
cc , any other cc- license
q how did you find/build a relationship with the researchers working in this activity?
other forums – committees, faculty liaison, others; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections, skills training and development
q what were the most significant skill-gaps you identified for this activity?
none: the the.me.do.com project is at an early stage of implementation, so no skill-gaps have been identified yet.
q did you follow or offer any training for librarians as part of this activity?
no, because: not yet - required training needs will be evaluated by the end of .
q if you have followed or offered any dh training for librarians, how is it organised?
it belongs to the scope of this activity
q how aware are academics in your institution of the dh activities the library is active in?
somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so?
own website, social media, other: articles in journals and conferences will be presented as the project evolves
q where can we find any extra information about the activity?
please list all the links to websites, news items, blog posts, articles or any other communication about the activity
http://bit.ly/ uyaqbh : initial the.me.do.com project website at the research gate platform (a dedicated project website will be published by october ; social media project accounts, i.e. twitter, facebook and instagram, will also be published by october ).
http://bit.ly/ llmmsj : the university of thessaly historical archive powerpoint short presentation (in greek).

q name
emmanuelle bermès
q library
bibliothèque nationale de france
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
the corpus project: a years internal research program aimed at defining new services for researchers using digital collections (gallica, web archives, metadata). cf. c.bnf.fr/fom
q please describe the positive aspects of this activity
iterations with researchers on their needs. concrete results using bnf collections. support from the library's management.
q please describe what could be going better concerning this activity
currently we struggle to get a clear view of the programming and budget for the physical space works. we are lacking dedicated staff.
q how long have you been doing this activity in your library?
- years
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
yes, it is part of the library’s policy
q do you have a dedicated budget for this activity?
yes, with internal funding
q are you assessing the impact of your work?
yes: yearly reviews
q what kind of collection are you using in the activity?
a born-digital collection we created and curate, a digitised collection we created and curate, metadata
q how is the data you are using licensed?
any other cc- license
q how did you find/build a relationship with the researchers working in this activity?
training / outreach events, physical space – facilitating/hosting, through existing relationships, other: api website - api.bnf.fr
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections, advisory / consultation roles, skills training and development, other services or technical expertise: infrastructure, physical space (hosting research teams), tools to analyse the data
q what were the most significant skill-gaps you identified for this activity?
hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity?
no, because: not yet, but we will
q if you have followed or offered any dh training for librarians, how is it organised?
not applicable, we did not follow or offer any training
q how aware are academics in your institution of the dh activities the library is active in?
somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so?
articles in journals, conferences, blog posts, other: workshops
q where can we find any extra information about the activity?
please list all the links to websites, news items, blog posts, articles or any other communication about the activity
c.bnf.fr/fom
eleonora moiraghi's report: hal-bnf.archives-ouvertures.fr/hal-
bnf research blog: bnf.hypotheses.org/ bnf.hypotheses.org/ bnf.hypotheses.org/ (reports from the workshops)
webcorpora.hypotheses.org

q name
agnes ponsati
q library
spanish national library
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
at bne we have created a portal, bnelab, under which we will place the different activities we are developing to improve the use and re-use of our digital/print collections and to enrich our collection. take a look at www.bne.es/bnelab
q please describe the positive aspects of this activity
promotion of bne heritage collections with different narratives and also within new communities apart from the academic public.
q please describe what could be going better concerning this activity
n.a.
q how long have you been doing this activity in your library?
- years
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
yes, it is part of the library’s policy
q do you have a dedicated budget for this activity?
yes, with external funding
q are you assessing the impact of your work? if yes, how?
we follow up the impact through some analysis tools, the social media impact, and the impact on newspapers, radio and tv programs
q what kind of collection are you using in the activity?
a digitised collection we have a license to, a digitised collection we created and curate, metadata
q how is the data you are using licensed?
public domain, any other cc- license
q how did you find/build a relationship with the researchers working in this activity?
online presence – dh lab/portal, social media; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections, digital storage / preservation / hosting, advisory / consultation roles
q what were the most significant skill-gaps you identified for this activity?
soft skills (communication, project management, etc.)
q did you follow or offer any training for librarians as part of this activity?
yes, namely
q if you have followed or offered any dh training for librarians, how is it organised?
it belongs to the scope of this activity
q how aware are academics in your institution of the dh activities the library is active in?
somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so?
own website, partner's website, social media, blog posts
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity
http://www.bne.es/bnelab/

q name
helga kardos
q library
library of the hungarian parliament
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
digitisation of parliamentary documents, legal books, journals, other textual materials. digitised legislative knowledge base.
q please describe the positive aspects of this activity
- to know the collection deeply
- to protect the collection
- to make the documents familiar to researchers and students
q please describe what could be going better concerning this activity
better and richer metadata (even in english). language limits are narrowing the audience and possibilities.
q how long have you been doing this activity in your library?
- years
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
yes, it is part of the library’s policy; yes, it is part of the parent organisation’s policy
q do you have a dedicated budget for this activity?
yes, with internal funding
q are you assessing the impact of your work?
yes: with surveys
q what kind of collection are you using in the activity?
a digitised collection we created and curate, metadata
q how is the data you are using licensed?
public domain, copyrighted
q how did you find/build a relationship with the researchers working in this activity?
online presence – dh lab/portal, social media; through existing relationships
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections
q what were the most significant skill-gaps you identified for this activity?
hard skills (coding, tools, etc.)
q did you follow or offer any training for librarians as part of this activity?
no, because of lack of money
q if you have followed or offered any dh training for librarians, how is it organised?
not applicable, we did not follow or offer any training
q how aware are academics in your institution of the dh activities the library is active in?
somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so?
articles in journals, own website, social media, blog posts
q where can we find any extra information about the activity? please list all the links to websites, news items, blog posts, articles or any other communication about the activity
dtt.ogyk.hu -> digital collection of the library
ogyk.hu -> website of the library

q name
karine bacher-eyroi
q library
université toulouse capitole / library research support dept.
q is it ok if we share this use case (including the name of library)?
yes, you may publish it on the liber website and please include my name as contact person
q please describe one of the activities that you are doing in your library that you would define as dh
- digitisation of research materials & publications
- digital research publishing
q please describe the positive aspects of this activity
- developing expertise & skills in our team
- added-value projects highlighting the library in our university
- working closely with researchers & research units as real partners and stakeholders of projects
q please describe what could be going better concerning this activity
- better evaluating the time needed
- having a dedicated budget
- upskilling the team on the technical aspects of dh
- integrating dh in our organisation chart
- better advocacy towards researchers & the university board
- having an institutional policy on dh
q how long have you been doing this activity in your library?
- years
q how many people of your library are involved in the activity?
-
q is the activity being undertaken in an embedded program in your library? in other words, is it part of a policy?
yes, it is part of the library’s policy
q do you have a dedicated budget for this activity?
no
q are you assessing the impact of your work?
no: the activity is quite recent in the library, and we lack elements and evaluation tools. but developing the assessment of our work is a priority.
q what kind of collection are you using in the activity?
a born-digital collection we created and curate, a digitised collection we created and curate, metadata
q how is the data you are using licensed?
public domain, copyrighted
q how did you find/build a relationship with the researchers working in this activity?
online presence – dh lab/portal, social media; through existing relationships; other: responding to the research units' demands
q what would be the main topics describing your relationship with the researchers working in the activity?
digital content / collections, advisory / consultation roles, skills training and development, other services or technical expertise: expertise on tools, formats, standards, legislation
q what were the most significant skill-gaps you identified for this activity?
hard skills (coding, tools, etc.), potential of tdm, advocacy & communication
q did you follow or offer any training for librarians as part of this activity?
no, because we haven't really developed that for the moment, except for presentations of some specific tools. we take part in the programs of professional training institutions.
q if you have followed or offered any dh training for librarians, how is it organised?
it belongs to the scope of this activity
q how aware are academics in your institution of the dh activities the library is active in?
somewhat aware (they know that the library does something, but are not sure what)
q if the library promotes the activity, where does it do so?
conferences, own website, other: internal university board of research units meeting
q where can we find any extra information about the activity?
please list all the links to websites, news items, blog posts, articles or any other communication about the activity
http://www.ut-capitole.fr/bibliotheques/

ku leuven libraries
library of the humboldt university berlin
university of limerick
national library of latvia
stockholm university library
kb, national library of the netherlands
university of edinburgh
uppsala university library
glucksman library, university of limerick
universitat oberta de catalunya
university of york
library and information center, university of patras
university of thessaly library & information centre
bibliothèque nationale de france
spanish national library
library of hungarian parliament
université toulouse capitole / library research support dept.

digitcult | scientific journal on digital cultures
published july
correspondence should be addressed to giulio lughi, università di torino. email: giulio.lughi@mail.com
digitcult, scientific journal on digital cultures is an academic journal of international scope, peer-reviewed and open access, aiming to value international research and to present current debate on digital culture, technological innovation and social change. issn: - . url: http://www.digitcult.it
copyright rests with the authors. this work is released under a creative commons attribution (it) licence, version . . for details please see http://creativecommons.org/licenses/by/ . /it/
digitcult http://dx.doi.org/ . / , vol. , iss. , – . doi: . /

la visualizzazione digitale negli studi di cultural heritage (digital visualization in cultural heritage studies)

abstract
this essay connects to other contributions published in the journal digitcult that focus on the possibility of developing new forms of academic writing and, more generally, of critical and scientific writing.
the essay first notes that even the most innovative proposals often suffer from a kind of dependence on the centrality of writing: it is certainly acknowledged that the digital offers ample possibilities for expansion beyond written textuality, but the guiding role in argumentation and expository strategy nonetheless always belongs to the written word. this contribution instead proposes a different point of view, identifying the visual component, and interactive video in particular, as a possible driving element for developing new forms of critical and interpretive discourse. to this end, some historical (pre-digital) precedents relevant to the discussion are considered; then some software tools that make it possible to produce texts centered on the visual component are examined; and finally an experiment with an interactive webdoc, centered on the visual, is presented, analyzing a complex site-specific installation by the artist giuseppe penone.

digital visualization in cultural heritage studies
this paper relates to other contributions published in the journal digitcult that focus on the possibility of developing new forms of academic writing, and more generally of critical and scientific writing. we note, however, that even the most innovative proposals often suffer from a dependence on the centrality of writing. it is certainly recognized that the digital paradigm offers great possibilities for expanding written textuality (e.g. with multimedia and the possibility of adding images, sounds and videos; with the possibility of intervening directly on the code; or with the semantic management of the text through keywords and ontologies), but in every case the guiding role in the argumentation still belongs to writing.
this paper instead aims to propose a different point of view, entrusting the visual component, and in particular interactive video, with the leading role. for this purpose, interesting historical precedents are examined; some software tools that allow the production of texts focused on the visual are considered; and finally an interactive webdoc experiment, centered on the visual, is proposed, describing a complex site-specific installation by the italian artist giuseppe penone.

giulio lughi, università di torino

the journal “digitcult” has recently begun a meta-reflection on the prospects opened up by the digital in the processes of critical analysis and interpretation of cultural products, as well as on the forms of communication and dissemination of scholarly studies. are there textual forms that are alternative, or at least complementary, to the classic written paper or the monograph? the reflection was anticipated by riva ( ) in his essay on the digital monograph, where he explores the english-language debate on this topic, also drawing on his own experiments at brown university; it was then explicitly systematized by roncaglia ( ) in the programmatic contribution entitled nuove forme per la scrittura accademica: l’avvio di una sperimentazione; next came the experimental contribution by venerandi ( ), who addresses the subject in a hypertextual essay, activated by scanning a qr code, in which he stresses the importance of programming languages; then the contribution by leonetti ( ), also a hypertext external to the journal and activated via qr code, which underlines the importance of markup and semantic relations within the text; and finally the essay by meschini ( ), which addresses the issues of digital scholarship from a broader perspective.
Behind these reflections lies the broad theme of digital convergence: the multiplication of power that the digital brings to the transmission of knowledge, because the logical processing of alphanumeric strings can render any expressive code on output devices, from written text to images, graphics, video and music. The digital thus becomes an omnivorous metamedium capable of generating multiple expressive forms. This process has been investigated above all by Lev Manovich ( , ), who develops the concept of deep remixability to indicate the unifying and at the same time hybrid power of the digital: "united in a common software environment, the languages of cinematography, animation, computer animation, special effects, graphic design and print have come to form a new metalanguage". Compared with pre-digital forms of textuality, the salient point is that in this perspective the digital is not simply a technology (like video, audio, advanced printing processes, etc.) but also, and above all, a language and a logic, able to enter the heart of the processes of textual construction and to interconnect expressive levels that the industrial paradigm kept carefully separate. Hence the enormous textual, and therefore communicative and cultural, potential that the digital unfolds: reticular argumentative structure (hypertextuality), the interweaving of different expressive codes (multimediality), the possibility of autonomous dynamism (programmability), and the capacity to respond to the actions of the reader/viewer (interactivity). These points are well represented in the contributions cited above, which nonetheless seem to grant a kind of primacy to the written code, a primacy that in practice results in a contested relationship between written textuality and other textualities, the visual in particular.
Riva's position is emblematic in this sense: "In short, rethinking my book as a digital monograph compelled me to shift the weight of my argument from the written to the visual component, embedding as much of my argument in the latter. At the same time, this also required a substantial shift in my writing strategy, reducing the overall "weight" of the textual component (from in excess of […] words, in its first envisioned draft, to about […] in the current plan) but investing the written text with a new crucial function: supporting the visualizations (in the shape of captions or internal annotations), on the one hand, and providing a narrative frame which allows the reader to connect the various visualizations among themselves, and follow a path toward some theoretical and methodological conclusions". ( ) Here the growing weight of the visual in digital textual forms is certainly recognized, yet the written text is still assigned the task ("investing the written text with a new crucial function") of structuring the narrative and argumentative frame within which the visual elements are placed.
With respect to this approach I will try here to shift the point of view, focusing mainly on the role of the visual component and concentrating in particular on the use of interactive video in digital cultural heritage projects. To do so I will first take up some observations made in the contributions mentioned above. I will then extend the discussion to some antecedents (including pre-digital ones) that address the relationship between the written and the visual. Next I will examine the arrival on the market, and hence in the practices of interactive text production, of authoring software that abandons the "writing-oriented" model and aims decidedly at a "visual-oriented" textuality. Finally I will present an interactive webdoc experiment, developed by me and my collaborators, that applies these forms of expanded textuality to the exploration and analysis of an important site-specific work of contemporary art, Giuseppe Penone's Giardino delle sculture fluide, located in the Parco Basso of the Reggia di Venaria Reale, a short distance from Turin.
Let us first consider the contributions to DigitCult cited above. Roncaglia ( ) concentrates mainly on the hypertext form, but always from the standpoint of writing, within which he calls for greater use of data visualization, the so-called infographics, as an "enrichment" of traditional forms of textuality. He also recalls the experiments of Marvin Minsky, who in a famous experiment "enters" the written text on video to comment on certain passages. In both cases the "guiding role" of textuality remains entrusted to writing; the visual components are certainly taken into account, but mostly as supplementary, if not accessory, elements.
Venerandi, for his part, focuses above all on coding: writing code as a tool for managing the various textual components, including multimedia ones, in an articulated and flexible way, and for moving beyond the concept of the linear page toward the digital platform as the organizing instrument of the new textuality. In his case too, however, the visual remains in second place to the written, while the full semiotic integration of the codes is expressed only as a hope ("one could instead imagine..."). In the section "Il multimediale" Venerandi stresses that "the presence of audio and video in digital content has given rise, especially in publishing, to products that are not always homogeneous. Video is used as an element that augments a textual content, but this augmentation is still handled crudely: the video is content juxtaposed with the text, with no real integration between the elements. One could instead imagine contents, even belonging to different media, that make up the page of the digital publication and that converse with one another according to the reader's interactions".
As for Leonetti's contribution, his attention is directed mainly at the semantic relations present in the text and at the markup procedures that make it possible to highlight them and make them discoverable through dedicated search engines. The relationship between the written and the visual does not emerge from his contribution. It should be noted, however, that the authoring software Leonetti suggests, and which he used to produce his contribution, is decidedly "writing-oriented", in the sense that it is organized by pages within which images, videos and other media elements can be embedded.
Confirming how problematic the relationship between the written and the visual is, Meschini's essay finally addresses, in an articulated way, the broader theme of digital scholarship, setting out its complexity as a series of systematic oppositions between key elements (document-centric and data-centric organization, single-code and multi-code, individual and collective, informative and narrative, synchronic and diachronic, short form and long form, etc.). On the theoretical level he recognizes that the communicative potential of the visual differs from that of verbal language but, like Venerandi, he remains cautious about its applicability to argumentative texts, treating it only as a hypothesis for the future ("at first glance it may seem reductive..."; "...may carry considerable force..."; "...may be a significant component..."): "...at first glance it may seem reductive to use audiovisual language in place of textual language, which requires greater participation on the part of the user, with fundamental consequences at the cognitive level [...]. It should not be forgotten that the suprasegmental and performative aspect of verbal language, and the synaesthetic approach of the dual visual/auditory channel, may carry considerable force in effectively conveying even concepts that are not strictly narrative. Moreover, visual grammar has a level of expressiveness comparable to that of text, though of a different nature; consequently, learning to decode it may be a significant component of that more general activity which is education in complexity". ( )
Notes:
On "expanded" textuality: the theme was proposed in a round table entitled "Oltre il libro. Ebook e testualità espansa", which I organized at the Salone del Libro in Turin in May […], as part of the programme "Talking about: la scienza per capire il mondo", under an agreement between the Fondazione del Libro, della Musica e della Cultura and the Università degli Studi di Torino and in collaboration with StoryCodeTorino: http://www.storycodetorino.com/oltre-il-libro-ebook-e-testualita-espansa/
Giardino delle sculture fluide: https://www.lavenaria.it/it/esplora/i-capolavori/giardino-delle-sculture-fluide
Minsky's experiment: https://www.youtube.com/watch?v= gbe min-w&feature=youtu.be&t=
Venerandi's contribution is available only online and cannot be cited by page.
Authoring software suggested by Leonetti: https://www.epubeditor.it/home/home/
As can be seen, the relationship between the written and the visual is a fundamental conceptual junction, with all its contradictions and difficulties: the opposition between written and visual has a semiotic complexity that goes well beyond the definition of digital scholarship, since it also touches psychological, cognitive and cultural aspects. Precisely in order to survey the breadth of this theme quickly, and to better contextualize how it can affect critical and scientific texts, it is worth considering some historical antecedents (including pre-digital ones) in which one can detect a drive to go beyond the purely written dimension of critical argumentation and to experiment with more complex interpretive forms.
First, as a precursor, there is the debate on the so-called pseudo-essay (Gallerani ), a hybrid form in which, as early as the turn of the twentieth century, the textual restlessness of literary criticism shows itself as it tries to leave the canonical form of the "essay" and take on "other" genres (the dialogue, the self-portrait, the fictionalized biography, the parafictional rewriting, the autobiography, etc.). This field of textual experimentation was originally tied to writing alone, but in its most recent outcomes it has blended with various media forms, including of course visual ones.
Turning to art criticism, the experimentation promoted by Carlo Ludovico Ragghianti is of considerable interest. Between […] and […] he made nineteen critofilm, a term he coined (Scremin , ) to indicate their explicitly critical and scholarly function, in contrast with other forms of art documentary (for example the "literary" ones of Roberto Longhi and Umberto Barbaro, or the "narrative" ones of Luciano Emmer). The critofilm is an exercise in art criticism developed exclusively through the moving image, so much so that ideally it should be silent. Ragghianti paid particular attention to the technical side of the operation, to the point of devising a precise shooting methodology and giving special prominence to editing, as well as having special dollies purpose-built for camera movements and focusing. Returning to our written/visual opposition, we could say that whereas in Longhi's documentaries the dominant code is verbal language, in Ragghianti's critofilm attention shifts decisively to the potential (including the technological potential) of the visual.
Ragghianti's was the experimentation of a single scholar; in the 1960s the theme of the analytical capacities of the visual came to involve an entire discipline, with the first formulations of visual sociology. This is a field of study and a method of inquiry that on the one hand analyzes pre-existing iconographic material in order to study social phenomena, and on the other, more interesting for our purposes, engages in producing iconographic materials, photos and videos, which themselves constitute the scientific discourse of social analysis (Faccioli, Losacco ).
More generally, the idea of using video to develop a critical discourse is now widely applied in video essays, which are so widespread at every level that they have given rise to dedicated festivals for school and university students. Often, however, even in the most refined forms, such as Sergio Risaliti's reading of Giulio Paolini, the predominance of written textuality (in this case delivered orally) emerges very clearly. Conversely, in the Finite Rants project launched by Fondazione Prada, which "intends to test the versatility of the visual essay in expressing thought through images and to demonstrate its relevance in contemporary visual production", the visual component prevails, in this case marked by a strongly artistic-experimental or avant-garde cinema character.
Notes:
Visual sociology: https://visualsociology.org/
Video essay festivals: https://tinyurl.com/u h n
Risaliti's reading of Paolini: https://tinyurl.com/s k k a
In either case the video essay remains tied to linear video, without opening up to the interactive, programmable and dynamic possibilities that the digital makes available. Paradoxically, the power of the digital is used only in post-production and then locked inside the linear video, which is never controlled from outside. This can be seen in Ian Garwood's Indy Vinyl project ( ), which is otherwise articulated and fascinating: its video essays, although they deal with themes related to immersivity and virtual reality, remain tied to the form of linear video and seem almost dependent on the documentary matrix. It is no accident that it is a scholar of digital media, rather than of cinema, who glimpses the potential of these interpretive forms: commenting on a video essay about the relationship between cinema and video games, Jenkins ( ) states that "... it represented an innovative form of scholarship. I am hoping we will see more examples of these kinds of analytic video essays in the future".
Along the same lines Scolari ( ), taking up Jenkins's remarks, widens the perspective further by asking whether a transmedia scientific discourse can be articulated, one that goes beyond the usual practice in scientific publications of simply inserting illustrative or explanatory images into the written text. Scolari proposes instead to draw on the lesson of McLuhan, and in particular on the books the Canadian scholar produced in collaboration with designers such as Quentin Fiore and Harley Parker. What Scolari emphasizes is the complete semiotic integration of written and visual text in McLuhan's experiments, a fusion of analytical-communicative intent and expressive means that Scolari calls "a typographic and cognitive big bang".
Finally, moving beyond a reductive conception of the text as tied to the page or to a magnetic medium, there is the example of an entire museum that presents itself as a single, large visual text with a didactic and argumentative purpose: the M9 museum in Mestre (Venice) offers a narrative analysis of the Italian twentieth century based entirely on the display of multimedia visual materials, and adds an interactive dimension that lets visitors choose their own paths of discovery.
From more than one hundred and fifty archives the museum has gathered over six thousand photos, eight hundred videos, and digital reproductions of posters and of newspaper and magazine pages. What matters here, however, is the decidedly "visual-oriented" approach of the exhibition design (bearing in mind that this is a history museum, not an art museum). A genuine reversal of perspective has taken place: it is no longer the written text that guides the argumentative discourse, embedding visual materials where necessary, but the visual materials that lead the dance, opening up, where necessary, to short spoken or written texts.
Against this complex and dynamic conceptual landscape, in which the written and the visual seem to converse in a perpetually unstable balance, let us now examine some concrete working tools: application software available to the scholarly community for experimenting with new forms of "expanded" textuality. It should be said at once that we again face the dichotomy between "writing-oriented" and "visual-oriented" tools. These are programs developed mostly in the last decade, authoring tools that aim to make the most advanced features of interactive digital design accessible to a wide audience without requiring knowledge of programming languages. They are commonly regarded as content curation tools, a sector now geared mainly toward social marketing, but one that was initially (and in part still is) aimed at strengthening argumentative, journalistic or scientific writing.
Notes:
Finite Rants: http://www.fondazioneprada.org/project/finiterants/
Video essay on cinema and video games: https://vimeo.com/
On the conception of text: in a short essay (Lughi ) I hypothesized some time ago that in the digital scenario, and especially in the mobile/locative paradigm, texts and spaces tend to merge into a single experiential dimension: "texts as spaces" and "spaces as texts".
M9 museum: https://www.m museum.it/
On programming languages: it is of course possible that some scholars, or particularly well-organized research teams, are able, even in the humanities, to develop a textual product by working directly on the code, or at least by assembling groups of programmers able to interact productively with a view to a professionally significant result. It seems more realistic, however, to think of using authoring software like the programs briefly examined here, in which any intervention on the code is limited to simple adjustments or customizations: such software is in fact widely used in publishing houses and newsrooms and by freelance authors.
In the first group, the "writing-oriented" one, we should consider for example Scalar (https://scalar.me/anvc/scalar/), probably the most articulated and powerful platform for academic writing: a free, open-source product that allows the text to be structured in a reticular, interactive way, with good capabilities for the semantic markup of multimedia content. As the platform's presentation puts it: "Scalar enables users to assemble media from multiple sources and juxtapose them with their own writing in a variety of ways, with minimal technical expertise required". Equally interesting is PubCoder (https://www.pubcoder.com/), whose most notable feature is its ability to manage interactively individual "objects" of various kinds within the page, with a set of built-in functions ranging from animation to questionnaires to the insertion of smart objects. Along the same lines is Atavist (https://atavist.com/), structured in pages that scroll both horizontally and vertically, into which blocks can be embedded: textual elements of various kinds, including of course images and videos (which can be handled with graphic devices such as galleries and juxtapositions), but also tables, maps and charts dynamically linked to external data-management programs, with extensive embedding capabilities for other programs (e.g. PowerPoint slides, Instagram and Twitter posts, StoryMap, mail services, audio repositories, animated GIFs, etc.).
On the other side we find the "visual-oriented" software. Here the nuclear elements, the basic units of the text, are not written pages, as in the previous cases, but video clips, whose assembly and coordination structure the argumentative discourse. Prezi (https://prezi.com/), with its more recent developments, deserves mention for its pioneering use of vector graphics, which gives depth to the structure of the text and offers the reader not only the horizontal interactive mobility typical of hypertext but also a vertical, nested mobility, like Chinese boxes, that makes it possible to develop the necessary in-depth material visually. In marketing, Wirewax (https://www.wirewax.com/) and Eko (https://eko.com/) are now very widespread: in both, the guiding thread is provided by video clips, which can be linked to other video elements or to other multimedia materials, including of course written texts. Both products have a highly developed interactive component, which shows in the ability to organize branching narratives, to open additional explanatory panels, and to connect to external online services.
Also "video-oriented", though aimed less at marketing than at the construction of complex argumentative texts such as journalistic reportage or highly interactive essays, are Racontr (https://racontr.com/) and Klynt (https://www.klynt.net/). These are certainly the most mature products in this area, and both rely on the possibility of adding textual layers to the video that are independent of the video itself and can be managed with simple commands that require no programming skills. In linear video, other visual or written elements can certainly be incorporated in post-production, but they are locked into the video at the end of the rendering process. In the software discussed here, by contrast, the textual layers remain independent: they can be moved, deleted, or made to appear and disappear through interactive commands, or can themselves become triggers of dynamics within the layered text. This extreme flexibility, which makes the text expandable and manoeuvrable in all directions, including the vertical direction of depth, is the ideal testing ground for trying to give life to the experimental concerns, the reversal of perspective and the integration of written and visual, whose stages we have briefly retraced starting from the pre-digital twentieth century, and which are today a central point in the debate on the enhancement of cultural heritage.
In this direction I developed, together with my collaborators at the Carmel laboratory for the Invisibilia platform, the interactive webdoc Il giardino delle sculture fluide di Giuseppe Penone, available at https://www.invisibilia.net/penone/. The webdoc was built with Klynt and is therefore decidedly "visual-oriented": it allows nomadic, typically hypertextual consultation, just as Penone's site-specific installation leaves the physical visitor free to wander without directions among the sculptures that populate the garden. The webdoc also plays on the multiplicity of textual layers, on their depth, opacity and transparency, to experiment with the different forms of textual complexity that convey the emotional experience, "mediated" yet engaging, of a visit to the garden. More detailed information on Penone's work and on the webdoc itself is available within the product, as well as in dedicated analyses from a semiotic and media-studies perspective (Biggio ) and an art-historical one (Cammarata ) published in DigitCult.
The question of the opposition between written and visual nevertheless remains open. It is far from trivial, and it is rooted in our cultural history (one need only recall Horace's ut pictura poesis, then Gregory the Great's pictura est laicorum literatura, and again Victor Hugo's ceci tuera cela). Such an opposition clearly cannot be resolved in the abstract; more realistically, it can be managed, through open and flexible application tools, by shifting weight between the two components according to the communicative goals one wants to achieve.
All this takes place in a theoretical context in which the concepts of documentation, popularization and spectacularization still need a systemic framing in light of the digital paradigm. In the meantime, the field is open to experimentation and discussion, with theoretical reflection accompanied by practical applications, like the one proposed here, with a view to feeding the debate.
Notes:
Carmel laboratory: https://www.invisibilia.net/?page_id=
Invisibilia platform: https://www.invisibilia.net/
Ut pictura poesis: Horace thus sums up, within the classical tradition, the long debate on the relationship between the written and the visual which, passing through Aristotle and Plato, goes back to Simonides of Ceos (6th century BC): poema loquens pictura, pictura tacitum poema.
Pictura est laicorum literatura: this formula is attributed to Pope Gregory the Great (7th century), to indicate that images can offer the laity (the unlettered) an alternative form of access to the sacred (written) texts of the faith.
Ceci tuera cela: the expression appears in the second chapter of the novel Notre-Dame de Paris ( ), set at the end of the fifteenth century, and means that "the book will kill the cathedral": with Gutenberg's invention, according to Hugo, written culture is destined to prevail over the imaginative, tactile and sensory world represented by the great Gothic buildings.
On spectacularization: I recently addressed the theme (Lughi ) in an essay focused mainly on the artistic field, from which indications can nonetheless be drawn for other forms of cultural communication.
Bibliography
Biggio, Federico. "Digital arts and humanities for cultural heritage - interpretation and enunciation in digital documentation practices of cultural heritage". DigitCult - Scientific Journal on Digital Cultures, [S.l.], v. , n. , p. - , Dec. . Available at https://digitcult.lim.di.unimi.it/index.php/dc/article/view/
Cammarata, Silvia. "From wood to garden. Telling the places of Giuseppe Penone". DigitCult - Scientific Journal on Digital Cultures, [S.l.], v. , n. , p. - , Dec. . Available at https://digitcult.lim.di.unimi.it/index.php/dc/article/view/
Faccioli, Patrizia and Losacco, Giuseppe. Nuovo manuale di sociologia visuale. Dall'analogico al digitale. Milano: Franco Angeli, .
Gallerani, Guido Mattia. Pseudo-saggi. (Ri)scritture tra critica e letteratura. Milano: Morellini, .
Garwood, Ian. "From 'video essay' to 'video monograph'?: Indy Vinyl as academic book". NECSUS, Features, Spring _#Intelligence. Available at https://necsus-ejms.org/from-video-essay-to-video-monograph-indy-vinyl-as-academic-book/
Jenkins, Henry. "Transmedia synergies: remediating films and video games". Confessions of an Aca-Fan, January . Available at http://henryjenkins.org/ / /transmedia-synergies-remediating-films-and-video-games.html
Leonetti, Francesco. "The semantic challenge of digital publishing: an essay that knows what it is about". DigitCult - Scientific Journal on Digital Cultures, [S.l.], v. , n. , p. - , June . Available at https://digitcult.lim.di.unimi.it/index.php/dc/article/view/
Lughi, Giulio. "Text-space dynamics". Planum. The Journal of Urbanism , , p. - , June .
Lughi, Giulio. "Tecno-kitsch: la spettacolarizzazione digitale dell'arte". Piano B. Arti e culture visive, [S.l.], v. , n. , p. - , Nov. . Available at https://pianob.unibo.it/article/view/ /
Meschini, Federico. "Documents, mediality and narration. What we talk about when we talk about digital scholarship". DigitCult - Scientific Journal on Digital Cultures, [S.l.], v. , n. , p. - , June . Available at https://digitcult.lim.di.unimi.it/index.php/dc/article/view/
Riva, Massimo. "An emerging scholarly form: the digital monograph". DigitCult - Scientific Journal on Digital Cultures, [S.l.], v. , n. , p. - , Dec. . Available at https://digitcult.lim.di.unimi.it/index.php/dc/article/view/
Roncaglia, Gino. "Experimenting with new forms of academic writing". DigitCult - Scientific Journal on Digital Cultures, [S.l.], v. , n. , p. - , Dec. . Available at https://digitcult.lim.di.unimi.it/index.php/dc/article/view/
Scremin, Paola. "I critofilm Olivetti di Carlo Ludovico Ragghianti". I critofilm di Carlo Ludovico Ragghianti. Associazione Archivio Storico Olivetti e Fondazione Olivetti. Ivrea, .
Scolari, Carlos. "Entre el transmedia y McLuhan: ¿hacia un storytelling científico transmedia?" Hipermediaciones.com - November . Available at https://tinyurl.com/qmzn j
Venerandi, Fabrizio. "Notes for a 'digital native writing'". DigitCult - Scientific Journal on Digital Cultures, [S.l.], v. , n. , p. - , Dec. . Available at https://digitcult.lim.di.unimi.it/index.php/dc/article/view/
dh-antiphoner
The Burns Antiphoner: From Manuscript to Interactive Resource
Anna Kijas, Boston College
DH | Montreal, Canada | August ,
Burns Antiphoner, http://burnsantiphoner.bc.edu/.
Project Partners
Project team:
• Primary investigator and researcher: Dr. Michael Noone (Morrissey College of Arts and Sciences)
• Project manager and development: Anna Kijas (Digital Scholarship Group, O'Neill Library)
• Project assistant: Jonathan Mott (June – August ; music undergraduate)
• Research and editorial: Dr. Graeme Skinner (University of Sydney)
• Web application development: Ben Florin (Systems, O'Neill Library)
External consultants & partners:
• Cantus Database developer: Jan Koláček (Prague, Czech Republic)
• K&M Productions
Acknowledgements
This project received expertise and support from Digital Scholarship, Digital Library Programs, Special Collections, and Systems & Applications at the Boston College Libraries. We would like to thank the following individuals who were consulted during the development of this project: Andrew Hankinson (McGill University), Diva.js developer, and Debra Lacoste (University of Waterloo), Cantus project manager.
Burns Antiphoner
Burns Antiphoner, http://burnsantiphoner.bc.edu/. Franciscan Antiphoner. John J. Burns Library, http://hdl.handle.net/ / .
[…]th-century Franciscan antiphoner
Daily Office services
Michael Noone, "Novice's Guide," Burns Antiphoner. http://burnsantiphoner.bc.edu/novices-guide/.
Example of palimpsest
Franciscan Antiphoner, folio r.
Franciscan Antiphoner, folio v. Brief passage of written polyphony in the opening of the sequence Celi solem imitantes.
"Der ander nocturnus" written in German
Franciscan Antiphoner, folio r.
Manuscripts for Pleasure and Piety exhibit at the McMullen Museum. September – December , . http://beyondwords .org/.
Contents by liturgical occasion
AntiphonerInventory
Inventory identifies incipits (snapshot view)
Project goals
1) Develop an interactive, open access and responsive website with a searchable database and dynamic presentation layer;
2) Use and develop open source technology;
3) Contribute our data to the scholarly community through a collaboration with Cantus; and
4) Develop software, workflows, and documentation that can inform future projects using similar technologies.
Timeframe: June - August
Metadata record
Metadata record for Franciscan Antiphoner, http://hdl.handle.net/ / .
Digitized Franciscan Antiphoner, http://hdl.handle.net/ / .
Tech workflow
Incipit encoding using MEI
Elements in neume module
Structure and semantics of neumes
Cantus Manuscript Database
Cantus Manuscript Database: http://cantus.uwaterloo.ca/
Top: Cantus fields. Bottom: Cantus fields continued across spreadsheet
Incipit record in Cantus
Unus ex duobus qui secuti… http://cantus.uwaterloo.ca/chant/ .
"Obtulerunt pro eo Domino": example of where duplicate incipit text merged into one string
Verovio
Verovio: http://www.verovio.org/index.xhtml.
Diva.js
Diva.js, https://ddmal.github.io/diva.js/.
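The project goals above mention a searchable database built on an inventory of incipits. As a rough illustration of what such a lookup involves, here is a minimal sketch of a text search over incipits. It is not the project's implementation: the record shape (`folio`, `incipit`), the placeholder folio labels, and the function names `normalize` and `search_incipits` are all assumptions made for this example; only the three incipit texts come from the slides.

```python
import re
import unicodedata

# Hypothetical inventory records. The field names and the folio labels
# ("A", "B", "C") are illustrative placeholders, not the project's schema.
INVENTORY = [
    {"folio": "A", "incipit": "Unus ex duobus qui secuti"},
    {"folio": "B", "incipit": "Obtulerunt pro eo Domino"},
    {"folio": "C", "incipit": "Celi solem imitantes"},
]


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    letters_only = re.sub(r"[^a-z\s]", " ", no_accents.lower())
    return " ".join(letters_only.split())


def search_incipits(query: str, records: list[dict]) -> list[dict]:
    """Return records whose normalized incipit contains the normalized query."""
    needle = normalize(query)
    if not needle:
        return []
    return [r for r in records if needle in normalize(r["incipit"])]


if __name__ == "__main__":
    for hit in search_incipits("pro eo, DOMINO", INVENTORY):
        print(hit["folio"], hit["incipit"])
```

Normalizing both the query and the stored text before comparing means that differences in capitalization, punctuation, or spacing do not hide a match, which matters for chant texts transcribed under varying conventions.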
https://github.com/bcdigschol/antiphoner-site/blob/master/view/antiphoner-data.json
json output from inventory/incipit data (snapshot)
the burns antiphoner in action
burns antiphoner, http://burnsantiphoner.bc.edu/.
antiphonerdata
antiphonerdata with performance
search interface
resources
boston college libraries
• burns antiphoner: http://burnsantiphoner.bc.edu/
• digital scholarship group: http://ds.bc.edu
• github: https://github.com/bcdigschol/antiphoner-site
cantus database
• tutorial for cantus contributors: http://cantus.uwaterloo.ca/tutorial
diva.js
• full documentation: https://github.com/ddmal/diva.js/wiki
• other tools from ddmal: https://github.com/ddmal
mei
• full guidelines: http://music-encoding.org/?page_id=
• schemas, customizations, xslt stylesheets: https://github.com/music-encoding/music-encoding
verovio
• documentation and viewer: http://www.verovio.org/index.xhtml
contact information
anna kijas
digital scholarship librarian, boston college
http://ds.bc.edu
email: kijas@bc.edu
twitter: @anna_kijas
3d/vr in the academic library: emerging practices and trends
jennifer grayburn, zack lischer-katz, kristina golubiewski-davis, and veronica ikeshoji-orlati, editors
council on library and information resources
february
isbn - - - -
clir publication no.
published by: council on library and information resources, south clark street, arlington, va
web site at http://www.clir.org
copyright © by council on library and information resources. this work is licensed under a creative commons attribution-noncommercial-sharealike . international license.
cover image credit: cover images feature the virtual bethel project, images courtesy of zebulun wood
cover image designed by reed scriven
contents
foreword, by christa williford
introduction. 3d/vr creation and curation: an emerging field of inquiry, by zack lischer-katz, kristina golubiewski-davis, jennifer grayburn, and veronica ikeshoji-orlati
1. collaborative and lab-based approaches to 3d and vr/ar in the humanities, by victoria szabo
2. 3d cultural heritage informatics: applications to 3d data curation, by will rourk
3. virtual reality for preservation: production of virtual reality heritage spaces in the classroom, by zebulun m. wood, albert william, and andrea copeland
4. using 3d photogrammetry to create open-access models of live animals: 2d and 3d software solutions, by jeremy a. bot and duncan j. irschick
5. what happens when you share 3d models online (in 3d)?, by thomas flynn
6. building for tomorrow: collaborative development of sustainable infrastructure for architectural and design documentation, by ann baird whiteside
7. 3d/vr preservation: drawing on a common agenda for collective impact, by jessica meyerson
8. cs3dp: developing agreement for 3d standards and practices based on community needs and values, by jennifer moore, adam rountrey, and hannah scates kettler
conclusion. 3d/vr: stakeholders, ecosystems, and future directions, by zack lischer-katz, kristina golubiewski-davis, jennifer grayburn, and veronica ikeshoji-orlati
acknowledgments
glossary
about the editors and authors
foreword
more and more individuals, both within and outside higher education, are creating and using 3d, virtual reality (vr), and augmented reality (ar) tools and systems for research and learning. applications span the full breadth of the disciplinary spectrum: 3d models of vanished historical spaces allow archaeologists and historians to test theories about their arrangement and use; vr environments allow students of anatomy to explore simulations of the human body; ar ties digitized archives and artifacts to physical locations, merging the past with the present; and 3d printing enables student artists and engineers to share prototypes of their designs.
the essays in 3d/vr in the academic library: emerging practices and trends capture just some of the ways that 3d/vr is already having an impact on the creation and transfer of knowledge. bringing together the insights of notable innovators in research, pedagogy, cultural heritage, and science, the volume signals an expansion of these technologies within education while simultaneously acknowledging serious challenges. the continued rapid development and expansion of 3d/vr may potentially transform learning, but keeping up with these changes can quickly overwhelm the capacity of any single department or organization.
decision makers face difficult choices as they seek to calibrate their investments in related tools and expertise to the priorities of their institutions. the interdisciplinary appeal of 3d/vr can generate productive collaborations among students, faculty, and staff with different academic, technical, and professional backgrounds, and contributing to these teams can be rewarding. at the same time, this very interdisciplinarity challenges accepted structures of authority: where should this work take place? who should lead it? in what venues should the work be shared with others? how should credit be distributed to contributors?
other challenges arise from the adaptation of technologies developed for profit-driven industries to educational and nonprofit sectors. as vr technologies develop with the interests of big commercial players, and as the means of encoding 3d data change alongside vr hardware and software, academic and independent creators who value broad access, sharing, and sustainability must find ways of maintaining 3d/vr content so that it is susceptible to critique, reproducible, and repurposable.
to help confront all of these challenges, 3d/vr creators badly need the support of specialist librarians and archivists. as interdisciplinary meeting points; as service providers dedicated to providing the raw materials for research and learning; and as organizations dedicated to the collection, preservation, and distribution of knowledge, campus libraries already have staff with many of the skills needed for the wise application of new technologies to research. expertise in research methodologies, in digital publication, in digital library development, in the use of metadata for discovery, in archival appraisal: all of these skills are relevant to the creation of sustainable environments in which educational applications of 3d/vr can thrive.
one major concern, the digital preservation of 3d/vr data, is a common thread that ties the pieces of this volume together. the projects described by the authors are all quite distinct in their reasons for being and in their potential for reuse. consequently, the preservation priorities for each body of work are different. for uses of 3d/vr to reconstruct vanished historical places or artifacts, preserving the context for each reconstruction, the motivation behind it, and the choices that experts make about the evidence considered are essential elements. without the context, the rigor of the scholarship that underlies digital reconstructions can be obscured. in other cases, when researchers digitally scan contemporary natural and built environments in 3d, preserving “raw” scans of these environments in a common, nonproprietary format that is discoverable and reusable may be an important consideration. opportunities for 3d capture may come only once, and the captured data can become the best surviving record of a place, artifact, or organism that will otherwise be lost to time. for works of pure invention, including games, visual art, and film incorporating 3d/vr, documentation of the creative process could prove critical for future artists and historians.
this volume’s editors draw upon the contributions of each author to outline practical steps forward for collecting institutions. notably, they encourage librarians and archivists to envision pathways to 3d/vr preservation and access that entail collective action across institutions. tackling the complexities of digital preservation for 3d data and vr applications can easily stretch skills and technical capacity at one organization well beyond its limits, so broad consultation will be necessary to inform the choices that each institution must make.
as vr for the home becomes ubiquitous, its uses for research and learning will continue to multiply.
documenting the variety of ways in which researchers, educators, and information professionals use 3d data and vr software should become a priority for academic libraries and archives, if only to inform future applications. for example, virtual environments already show great potential for research in philosophy, psychology, and neuroscience. in his article, “are we already living in virtual reality?,” joshua rothman describes how experiments with “virtual embodiment” (a particular type of vr that allows a person to temporarily “become” someone else) suggest that it is possible to dramatically, perhaps permanently, alter people’s perceptions of themselves and others. such experiences hold the potential for tremendous beneficence or tremendous harm. mapping a future for 3d/vr in education will require more than sophisticated technical and financial insights; in time, drawing appropriate moral and ethical boundaries could prove even more vital. a fully nuanced understanding of the histories of 3d/vr and related technologies must be available to help future researchers and teachers cope with these challenges.
christa williford
note: the new yorker, april , https://www.newyorker.com/magazine/ / / /are-we-already-living-in-virtual-reality
introduction. 3d/vr creation and curation: an emerging field of inquiry
zack lischer-katz, kristina golubiewski-davis, jennifer grayburn, and veronica ikeshoji-orlati
three-dimensional (3d) modeling, 3d capture techniques, and virtual reality (vr) are becoming increasingly common in research and teaching. following a period of intense, yet short-lived enthusiasm in the s, vr has returned, showing great promise as a tool for enhancing scholarship and pedagogy in the context of higher education. vr refers to a set of technologies that create immersive digital environments, re-creating the perceptions of our everyday experiences in order to simulate places and things.
typically, this involves providing stereoscopic visual and aural information to the eyes and ears, which the system changes as a user moves their head, body, and hands. the release of the inexpensive google cardboard vr viewer in has generated significant interest, with more than million viewers having been shipped (vanian ). the release of affordable, fully functional vr headsets in (most notably, the htc vive and oculus rift headsets) is fueling experimentation in a variety of academic contexts, from library-based makerspaces and humanities classes (figueroa ) to architecture and design programs (enis ) and law schools (dilbeck ). experiments in the vr world aim to bring haptic and olfactory forms of perceptual information into the mix to create a greater experience of perceptual “fidelity,” that is, the closeness of the user’s experience of a simulated, virtual world to the user’s ordinary experience of the physical world (bowman and mcmahan ).
throughout this report, we use the term 3d/vr to stress the intertwined nature of 3d data/models and vr-related technologies (such as augmented reality), the latter of which may be understood as 3d viewing and visualization platforms. although in some ways this is an oversimplification, conceptualizing 3d as the content and vr as the platform allows us to simultaneously consider the particular creation and curation needs of each aspect separately, as well as their interdependencies.
as 3d and vr tools are introduced into higher education contexts, they enable faculty and students to engage with highly detailed 3d data, from cultural heritage artifacts to scientific simulations, in new ways.
the creation and use of 3d data have become more widely practiced thanks in part to the decreasing costs of the hardware and software for capturing and processing 3d data; photogrammetry, structured light scanning, laser scanning (lidar), computed tomography (ct) scanning, and other techniques are moving out of the lab and finding new applications in a plethora of new fields. with 3d and vr technology, a professor may take students on an immersive field trip to stonehenge, changing the lighting to simulate various phases of solar events; an archaeologist may capture 3d scans of an archaeological excavation and share these data with a colleague on the other side of the world in the form of an immersive virtual exploration of the site; a biochemistry professor may explore complex protein structures with students; or a chemical engineer may simulate the movement of fluids in various porous rock materials.
recent research has demonstrated the educational benefits of 3d/vr for teaching in a variety of fields, including architecture (angulo ; milovanovic et al. ), anthropology (lischer-katz, cook, and boulden ), and medicine (kersten-oertel, chen, and collins ), to name a few. more generally, 3d/vr has been shown to augment spatial analytic skills (ragan et al. ), particularly in such areas as big data exploratory analysis (donalek et al. ; van dam, laidlaw, and simpson ), design prototyping (seth, vance, and oliver ), graph analysis (ware and mitchell ), and analysis of volumetric datasets (laha, bowman, and socha ; prabhat et al. ). in the humanities, 3d/vr has become a popular means of producing new, multimedia forms of scholarship, such as 3d digitization and vr visualization of medieval manuscripts (endres, forthcoming), as well as making cultural heritage artifacts and sites more easily accessible to scholars and the public (bentkowska-kafel, denard, and baker ), and forming a “3-d digital heritage ecosystem” (limp et al. ).
there is also potential for 3d/vr to shape the structure and functioning of the academic library of the future (cook and lischer-katz, forthcoming) by using vr to host traditional library services, such as collection browsing and searching (cook ). because 3d/vr technologies are applicable to a number of fields, many academic libraries have become sites for cross-disciplinary research and experimentation in 3d and vr. some digital humanities and digital scholarship centers, media labs, and makerspaces, for example, have embraced the technology, often providing access and support for community members interested in exploring 3d and vr environments for the first time, meeting curricular goals, or examining research data.
note: examples of currently available vr applications that support this type of work include stonehenge vr sandbox (https://store.steampowered.com/app/ /stonehenge_vr_sandbox/); nefertari: journey to eternity, which is built on photogrammetric scans of nefertari’s tomb (https://store.steampowered.com/app/ /nefertari_journey_to_eternity/); nanome, which enables the exploration of chemicals, proteins, and nanotechnology (https://store.steampowered.com/app/ /nanome/); irisvr, which supports architecture and interior design work (https://irisvr.com/); and oculus medium, which provides an artistic sculpting and modeling space in vr (https://www.oculus.com/medium/). these are available as of october , .
as 3d and vr projects scale up and move outside of the specialist disciplines where they have existed for decades (e.g., computer science labs, media arts programs), questions arise concerning skills development, interdisciplinary collaboration, publication, sustainability, preservation, and reuse. in addition to providing access to 3d and vr resources, the academic library, already a center for collaboration, instruction, research, and collection preservation, is well poised to provide leadership in this field.
essays in this report
the eight essays presented here emerged from talks given at 3d/vr creation and curation in higher education: a colloquium to explore standards and best practices, a mini-conference that took place march – , , at the bizzell library at the university of oklahoma in norman, oklahoma. organized by the editors of this volume who, at that time, were council on library and information resources (clir) postdoctoral fellows working in academic libraries, the event was funded by the alfred p. sloan foundation through a clir microgrant with co-sponsorship from the university of oklahoma libraries, university of california santa cruz university library, and temple university libraries. although the primary focus on 3d/vr was intentionally narrow in order to maintain a small, intimate group, many of the issues that arose also apply to other immersive technologies, including augmented reality (ar), mixed reality (mr), and extended reality (xr). by its nature, this report is a snapshot, a portrait of the varied professional objectives and workflows that have developed around 3d/vr at this moment, rather than a compendium of universal best practices that will be applicable across space and time.
over the course of a day and a half, the schedule included moderated discussions by the organizers interspersed with presentations by the eight experts, each representing knowledge in one or more of the following areas: 3d content creation, vr visualization and analysis, 3d/vr-based educational deployment, and 3d/vr data curation. the invited speakers were intentionally chosen from diverse subject expertise backgrounds in order to stimulate cross-disciplinary conversation about how the unique challenges of 3d/vr are handled in a variety of circumstances. in addition to the eight invited experts, the participants included librarians, administrators, faculty, postdoctoral fellows, and content developers. this approach enabled the sharing of knowledge between practitioners in different communities of practice, thus fostering dialogue and developing holistic knowledge about these complex and multifaceted technologies.
by addressing the full 3d/vr lifecycle within a tightly knit community of experts and stakeholders, the organizers of this colloquium aimed to identify points of tension and gaps in existing practices and knowledge in order to foster common understanding for the librarians and digital curators tasked with supporting and managing these new data types. in this context, content creators, faculty, librarians, digital curators, and preservationists talked across their projects, concerns, and needs. in particular, the conversations allowed participants to work toward a common understanding of when and how to support 3d/vr throughout production, dissemination, and archiving workflows for different projects. the presentations and moderated discussions bridged language and workflow gaps, allowing experts in 3d scanning and vr development to exchange knowledge with experts in project management and digital preservation.
practitioners are quick to point out that 3d/vr projects do not exist in a vacuum, but rather are implemented alongside other research and teaching activities already supported by institutes of higher education. the first chapter introduces two collaborative frameworks for 3d/vr projects in which teaching and research become increasingly intertwined. in “collaborative and lab-based approaches to 3d and ar/vr in the humanities,” victoria szabo presents duke university’s “lab”-based model of collaboration by using a common topic or theme of investigation to find shared goals among a range of invested departments and stakeholders, including libraries. groups composed of librarians, developers, faculty, and students unite with a common goal to produce, teach, share, and evaluate research using 3d/vr media.
the next four chapters focus more closely on the tools and workflows to create and distribute 3d/vr content, providing varying perspectives of how, why, and for whom these projects are created. these chapters focus on the unique concerns of digitization technology, model accuracy, artistic intervention, and optimization for sharing files in different contexts. working at the university of virginia, will rourk probes the distinction between 3d models and 3d data. in “3d cultural heritage informatics: applications to 3d data curation,” he stresses the importance of thinking of 3d as “dimensional data” within cultural heritage projects and introduces a full range of 3d technology and scholarly outputs: 3d data, 3d prints, vr experiences, animation, open-access models, etc. he outlines the different workflows for the capture technology, research outputs, and preservation strategies selected, and he describes the library’s role in what he calls “3d cultural heritage informatics (3dchi).” zebulun m.
wood, albert william, and andrea copeland focus on 3d/vr as a central project for classroom-based learning, using the media arts and sciences classroom at indiana university–purdue university indianapolis (iupui) as a collaborative space to build the virtual bethel project. “virtual reality for preservation: production of virtual reality heritage spaces in the classroom” describes their students’ work on a highly detailed 3d model of a historically significant african american church that was being redeveloped in downtown indianapolis. incorporating historical research, community relationships, 3d data capture, and digital creation, the authors describe developing students’ skills, negotiating student group dynamics, building on existing community relationships, and addressing audience concerns and feedback. while their project does not address library participation, their workflow, focus on instruction, and use of archival content provide avenues for librarians to consider should the question of their participation in similar projects arise.
in “using 3d photogrammetry to create open-access models of live animals: 2d and 3d software solutions,” jeremy a. bot and duncan j. irschick introduce the importance of animated 3d models of individual animals for the digital life project. for this project, researchers at the university of massachusetts–amherst (with partners at the university of oklahoma) developed new photogrammetric techniques for capturing accurate 3d data from living organisms, including frogs, sharks, turtles, and other animals. capturing data from a living animal prone to unpredictable motion is a difficult procedure that requires a digital artist to register and stitch together the many images produced in the capture process in order to produce a complete model amenable to animation.
the complex workflow described underscores both the technical mastery and digital intervention necessary to transform individual scans into a complete, animated model and, as the authors argue, makes open-source software an ideal choice to preserve and document the 3d model at various stages in the creation process.
“what happens when you share 3d models online (in 3d)?” focuses on the broader dissemination of 3d models online through webgl and webvr. thomas flynn introduces the ways in which cultural heritage institutions and libraries use platforms such as sketchfab, a commercial platform to share and sell 3d content, to reach new audiences. dissemination, however, is not preservation; flynn notes that sketchfab is not a repository and that the files it disseminates for sharing are necessarily derivatives and optimized for web browser sharing. the ease of sharing, annotating, and embedding data, nonetheless, provides the opportunity to engage with new and old audiences in unexpected ways and enables the dissemination of 3d models to a global community of scholars and enthusiasts.
the final three chapters address the complexities of sustainability and preservation of 3d/vr projects by introducing three ongoing initiatives to promote better standards and workflows across disciplines. in “building for tomorrow: collaborative development of sustainable infrastructure for architectural and design documentation,” ann baird whiteside reports on work being conducted under the auspices of harvard university library’s building for tomorrow project, funded through the institute of museum and library services (imls). this project is looking at the preservation issues related to curating architectural and design 3d data, particularly in regard to the archives of architects whose materials are increasingly born digital.
some common issues in her chapter that may be applicable to other academic contexts include the risk of file format obsolescence and the long-term accessibility of 3d collections. whiteside focuses on the future of 3d architectural data, an ongoing concern for industry and education alike for at least the past decade, showing the enhanced risks of proprietary formats to medium-term access and beyond. she also discusses the need to store multiple types of files for each architectural project, which is analogous to work done in digital humanities and other research where multiple types of media files are combined to create complex, interactive scholarly outputs. furthermore, whiteside underscores that, because of the complexity of the situation, any solutions to sustainability questions must be community-wide, rather than tackled on an institution-by-institution, ad hoc basis. whiteside’s chapter is particularly important in considerations of collective action and shared interinstitutional missions, since small institutions need to be able to adopt standards and best practices, but require the support of larger institutions to do so.
jessica meyerson’s chapter, “3d/vr preservation: drawing on a common agenda for collective impact,” introduces concepts from her work with the software preservation network that can be applied to curating and preserving 3d/vr software into the future. while working with 3d/vr requires specialized workflows with robust version control, involves a variety of file formats and data types with multiple relationships between files, and depends on an amalgam of complex hardware and software platforms, meyerson suggests that an examination of 3d/vr curation challenges can be guided by findings from other areas of digital curation. she identifies three major data curation challenges: (1) scale, (2) standards and interoperability, and (3) software and hardware dependence.
to address these challenges, she argues for a collective impact approach, outlines the major components of this approach, and discusses how it could be implemented to support the development of standards and best practices for 3d/vr software preservation.
the final chapter reports on work being conducted by another imls-funded project, community standards for 3d preservation (cs3dp). in “cs3dp: developing agreement for 3d standards and practices based on community needs and values,” jennifer moore, adam rountrey, and hannah scates kettler review existing projects that are tackling the curation problems of 3d/vr and identify gaps in these projects that need to be addressed. they argue for more consensus-building across different stakeholder groups around preservation standards. a critical aspect of creating 3d/vr preservation standards is balancing the need to structure workflows around a set of common practices with the need to keep new technologies flexible and open to innovation. through two national forum meetings in and ongoing collaborative work online, cs3dp brought together various stakeholder groups working with 3d data to develop standards and best practices. the project formed working groups from assembled forum participants to structure stakeholder involvement and generate consensus on standards and best practices related to critical topics, such as metadata and intellectual property rights.
directions forward
the issues discussed in this report were chosen to prompt greater self-awareness for library professionals as they develop programs that use 3d and vr technologies and work to integrate changing scholarly demands and conventions with existing library services and policies.
across these eight essays, three critical approaches emerge that librarians and digital curators need to address as they use 3d/vr to support their communities: (1) treat the academic outputs that use 3d/vr as scholarly products; (2) build a 3d/vr scholarly community to support knowledge exchange across a range of stakeholder groups; and (3) develop technical tools, training, and infrastructure to support a 3d/vr research ecosystem.
treat 3d/vr as scholarly products
libraries and other institutions need to consider 3d/vr as scholarly products in their own right, rather than as illustrations or supplemental material. as such, these projects must be managed throughout their research lifecycle like other types of research data. the intellectual value of 3d/vr is still under debate within academic communities, but the library can lead the way in establishing the products of 3d/vr projects as scholarly outputs, which will encourage greater acceptance of their use in academia. in her chapter, victoria szabo points out that the scholarly standards governing the use of 3d/vr are still being developed. while will rourk emphasizes the need to rethink 3d/vr as data, szabo argues that 3d/vr should also be taken more seriously as scholarly output and should be treated alongside the outputs of other types of research projects. both approaches, 3d/vr as data and 3d/vr as scholarly output, are necessary perspectives and affect the ways in which libraries interact with scholars when supporting this type of research. libraries can aid in the acceptance process by supporting peer review practices and developing publishing platforms for 3d/vr projects. to this end, libraries need to consider how 3d/vr should fit alongside other types of data and scholarly products, which will have an impact on scholarly services and research data management programs.
build a 3d/vr scholarly community to support knowledge exchange
across the essays in this report there is a collective clarion call to action to build a scholarly community of knowledge- and skill-sharing around 3d/vr. the authors collectively ask: how can we (as content creators, users, curators, archivists, etc.) prevent the “siloization” of 3d/vr creation and curation practices and knowledge? libraries can play a role in this de-siloization as a cross-disciplinary collaborative space. the expertise that libraries offer in data management and digital scholarship practices, together with their ability to engage community members, places them in a prime position to be partners in 3d/vr scholarship projects.
recent grant-funded projects designed to bring communities together and promote knowledge exchange for 3d/vr are leading the way in addressing these types of issues. in this report, ann baird whiteside describes harvard university library’s building for tomorrow project, which brings together stakeholders in the architecture and design communities currently working with 3d data and trying to archive it for the long term; jessica meyerson lays out the model of the software preservation network as a means of structuring knowledge exchange across networks of digital curation practice related to 3d/vr; and jennifer moore, adam rountrey, and hannah scates kettler demonstrate through the cs3dp project how community-based working groups can help establish standards and recommended practices that are applicable to both large and small institutions in a variety of fields. in addition to these innovative, grant-funded initiatives, these chapters demonstrate the importance of communicating across communities to promote effective knowledge exchange.
this sort of cross-disciplinary and cross-institutional communication can take the form of forums, white papers, conferences, publications, and the like, which can collectively produce an active scholarly community composed of scholars and information professionals working together to develop knowledge in support of these complex technologies. sustained and sustainable collaboration and coordination are essential and will help to balance the obligations of content creators with those of librarians and digital curators for preserving these new research data types and scholarly outputs. although content creators can assist this process by choosing archival formats, capturing metadata, and preparing their materials for archiving throughout the data collection process, librarians should take the lead in providing information about standards and best practices and supporting d/vr creation and preservation workflows.

a third imls grant-funded project, developing library strategy for d and virtual reality collection development and reuse (lib dvr), is also ongoing, but it was not discussed in these proceedings. lib dvr will be issuing a series of journal articles and a comprehensive white paper on its national forum meetings starting in the spring of . more information can be found here: https://lib.vt.edu/research-learning/lib dvr.html.

develop technical tools, training, and infrastructure to support a d/vr ecosystem

in addition to identifying the need to reconfigure scholarly conventions and institutional expectations around d/vr and enhance communication and knowledge exchange across scholarly communities and information institutions, the essays in this report stress the growing requirements for technical tools, training, and infrastructure that are designed specifically to support a holistic d/vr research ecosystem. the evolution of such an ecosystem must be flexible and forward-looking enough to take into account the changing technical and scholarly landscape. as libraries are increasingly called upon to support knowledge exchange beyond traditional books and journals, the creation of novel types of research infrastructure will shape the preservation and access expectations of constituents. by being part of the conversation, librarians can position themselves to better understand the needs of the d/vr community and the services that the library can support. the essays in this volume outline a number of approaches by which library professionals can increase their involvement in d/vr. because every library and academic community approaching d/vr will be unique, we encourage each one to incorporate an appropriate mix of strategies from across the examples provided here.

conclusion

academic libraries of the twenty-first century have enthusiastically embraced digital technology and emergent media as new forms of information and data that must be managed and made accessible. with greater and greater frequency, the library itself is being seen not only as a site of knowledge preservation and access, but also as a place for experimentation and knowledge production. library staff, moreover, play increasingly important roles in teaching, research support, project collaboration, technology consultation and instruction, digital publishing, and more. the future of d/vr as a scholarly and pedagogical ecosystem of tools depends on fostering close relationships among faculty, librarians, and digital curators. this report seeks to elucidate key dimensions of the current landscape of d/vr in academia, in the hope of developing common strategies for defining the library’s role and nurturing effective relationships between libraries and the range of academic stakeholder groups.
developing these new skills and collaborations around emerging technologies such as d/vr can potentially enhance the profile and maintain the relevance of the academic library both as the custodian and curator of all forms of research and educational data, and as a catalyst for innovation in scholarship and pedagogy at the heart of the twenty-first-century university.

collaborative and lab-based approaches to d and vr/ar in the humanities

victoria szabo

abstract

this paper explores the interdisciplinary humanities lab model for collaborative work in d and virtual reality/augmented reality (vr/ar). it draws upon our experiences at duke university with lab projects focused on historical and cultural visualization. at their best, shared digital projects promote engagement and deeper learning by students, expose new research questions for the diverse researchers and subject-area specialists involved, and result in applications that provide a deeper understanding of historic sites, objects, and phenomena to the wider public. working with d and vr/ar in particular demands integration of in-depth critical, creative, and technical areas of knowledge.
interdisciplinary lab projects leverage the combined expertise and skill sets of subject-area faculty, librarians, technical staff, and students from various academic backgrounds. at the same time, however, such an ensemble approach to scholarship may disrupt the existing academic ecosystem, challenging disciplinary boundaries as well as institutional norms around teaching, research, labor, and resource allocation. by honoring diverse stakeholder goals, fostering critical conversations around theoretical and technical questions, and establishing standards for tools and methods beyond the individual case, lab partners from various institutions may be able to contribute both to their specific projects and to the advancement of sustainable and scalable approaches to future work in d and vr/ar at their own institutions and beyond.

introduction

scholars have begun to look to computationally generated d models, animations, virtual reality, augmented reality, and games to represent present, past, future, and fictive environments and objects. complementing textual, image-based, and quantitative analyses with d visualizations is nothing new. precursor technologies include illustrative d drawings, wooden architectural models, annotated exhibitions, immersive dioramas and tableaux, and even historical re-enactments. the study and creation of online texts, image collections, and time-based media archives is commonplace. geographic information systems (gis) and web mapping are supported in classrooms, libraries, and digital humanities centers as part of a broader spatial turn in humanistic teaching and research. data visualization has experienced similar growth. nonspecialists may also explore d and virtual reality/augmented reality (vr/ar) for conducting and communicating their teaching and research.
nonetheless, the academic status of digital scholarship itself often remains ambiguous. best practices for production and evaluation within disciplinary fields are still needed. evaluation guidelines note benefits such as deeper student engagement with course materials or public access to scholarship. however, undertaking a digital project may, or may not, translate into creditable research activity. a digital experiment at times may fail, or a tool may fall out of favor or reach. this uncertainty can have a chilling effect on scholars who might advance the field. the project-based humanities lab model of research and teaching offers a way to address the gap between discipline-focused objectives and digital expertise. in the lab model, participants from various backgrounds focus on a common academic question over a sustained period. the lab model complements other institutional structures by creating novel opportunities for faculty, staff, and students to collaborate across disciplinary bounds. at duke university’s wired! lab for digital art history & visual culture (http://dukewired.org) and related initiatives, for example, first-year curricular experiments with d and vr/ar in the humanities led to the establishment of new graduate programs and to interdisciplinary and international collaborations around best practices for future research.

d and vr/ar in historical and cultural visualization

why use d and vr/ar? for the wired! lab, possible objectives might include creating historical reconstructions of significant spaces and structures, imagining fictive spaces, annotating or recontextualizing objects of interest, exploring databases and networks across multiple dimensions, translating virtual archives into virtual museums, or creating virtual settings for the exploration of spatiality,
we describe this activity more generally as “historical and cultural visualization.” students learn about these topics in first-year seminars focused on art history topics, in thematic electives, and through participation in digital cultural heritage and museum exhibition projects. some of our most advanced work in wired! has taken place as part of the international visualizing venice group of researchers (lanzoni, gior- dano, and bruzelius ). the key concept that ties our digital humanities approaches together, and especially d and vr/ar, is the idea of the model. whether we are modeling an object, an environment, or both, a conceptual model underlies our processes and informs what we develop. if we add to that a dimension of time or interactivity, we make use of that model in a process of simulation or virtualization. key challenges for us in historical and cultural modeling in particu- lar may include representing change over time, representing uncer- tainty, documenting process and underlying data, producing content collaboratively, allowing for counterfactuals and conflicting interpre- tations among experts, distinguishing between evidence-based and placeholder elements, and connecting to existing or future data and systems. and all of this is before we populate the models, program in any procedural interactivity, or build in any agents. such challenges exist in any kind of model— d, digital, or other- wise—but are brought into greater relief when working with compu- tationally produced d and vr systems. the same dimensional, rep- resentational qualities that provide greater immersive potential in d and vr systems also have a greater likelihood of being fudged in an attempt to produce a coherent whole, as will rourk demonstrates in his discussion of d data versus d models (see chapter ). indeed, the software often demands filling in informational gaps, etc. added to these challenges is a relatively high technology turnover rate in the field. 
without deep understanding of the medium and critical engagement with its strengths and weaknesses, we end up in danger of providing what the most strident critics of d and vr abhor: inaccurate, expensive infotainment that does little to advance or communicate knowledge in the field. providing data transparency, modularity, and iterability becomes important to counterbalance these effects. creating a black box object in an expensive walled garden, a closed digital ecosystem, is also an almost certain way to limit the life span and impact of a d or vr/ar project over the long term. therefore, scholars in the field need to understand what goes into these models and to be active partners in their creation and care.

the digital humanities ecosystem

how then do we address the challenges of working with such complex technologies, while at the same time attending to disciplinary concerns? within many academic libraries, digital scholarship has become an important area for partnership and support. as joan k. lippincott and diane goldenberg-hart have noted, library support of digital scholarship extends beyond digital humanities project support to include science and social science research. it also delves into broader infrastructure and training needs (lippincott and goldenberg-hart ). because d and vr/ar are such cross-disciplinary practices, academic libraries may be the ideal place to facilitate communication around their critical and creative use. at duke, for example, interest in d and vr/ar exists in our engineering programs, computer science, campus information technology (it), and the innovation and entrepreneurship program, as well as in the arts and humanities. d/vr also overlaps with game studies, medical education, civic awareness, and social justice.
many participants in various programs wish to use new media forms for studying, creating, and archiving, and they turn to the library for guidance and support. even grant-funding agencies in the humanities now expect rich data, access, and sharing plans from the grantees. the need to find scalable solutions is becoming more acute. this need includes places for storing both working files and final presentational outputs, as ann baird whiteside notes in chapter . despite the good will and interest of the constituencies involved, and perhaps because the challenges are so complex and overlapping, it is sometimes unclear in any moment who in the university setting should be responsible for providing not only content storage, archiving, and production server access, but also instruction and training. is it reasonable to expect librarians to take on this role? or should the staff of departments and programs arrange to teach and support digital humanities methods? perhaps teachers in writing programs could take on some responsibility for digital literacies and training. but how far should that responsibility go in the case of tools and methods that require a significant effort to master? regardless of who teaches them, digital tools and methods are sometimes treated as a set of skills to be acquired separately from their application to a disciplinary research challenge. critical attention to underlying data structures, metadata standards, the assumptions of software packages, or the affordances of interactive media forms may seem a concern better suited to information science, media studies, or communication studies, not to mention geography, architecture, or media arts. more transparent software interfaces may seem to obviate the need for such extra-domain expertise. with their user-friendly interfaces and starter tutorials, they may yield more impressive outputs than ever before. yet, their uninformed use may produce more critically questionable results.
digital projects that force consistency of data, elide uncertainty, cherry-pick examples, or eschew complexity and nuance are rightly critiqued. critics might rightly argue that the alluring technologies of d and vr/ar do more harm than good. the pedagogical challenge does not lie only in the fact that digital tools are being taught separately from specific projects, although students may find it harder to learn from examples too far afield from their disciplinary interest. a bigger issue lies in the follow-up to the initial workshop session or tutorial video, when students (and researchers) apply what they have learned. individual project support becomes more difficult to sustain at scale. this is where a project-based lab program can be helpful. especially when undertaken with an eye toward developing models and precedents, in-depth collaboration on specific projects can yield more generalizable results. diverse campus partners may collaborate on such an effort without having to commit to generalized support for everyone who attends a workshop. the lab model does not replace the library digital scholarship center. rather, project-focused labs invite partners from libraries and information technology organizations to help create generalizable solutions and best practices that fit the scholarly questions at the heart of the lab’s mission.

new directions in multimodal scholarly publishing

today’s digital humanities ecosystem produces and supports both print publications and digital scholarship. the scholarly communications institute made digital scholarship an explicit focus of the institute in chapel hill, north carolina, for example. the challenges to the norms of scholarly publishing are already clear. the next generation of scholars will be increasingly hybrid, having grown up with multimodal technologies throughout their schooling.
more focus on rigorous peer review, rather than the medium of circulation, opens up possibilities for new platforms (modern language association ; american historical association ; college art association and society of architectural historians ). online publication opens the door to other formats, both as supplements and as stand-alone resources. digital does not necessarily challenge the primacy of textual exposition as the primary communicative mode of expression for many fields, as edward l. ayers ( ) notes. in the last several years that has begun to change, with consequences for our publishing systems. with the support of the national endowment for the humanities office of digital humanities, the getty foundation, the andrew w. mellon foundation, and others, gaps between the form of the original production and its documentation are decreasing, and expectations are rising for the quality and accessibility of the work presented, including that in d and vr/ar. as it becomes more important to review projects in their original media forms, old substitutes for original work may no longer be accepted. digital mapping projects may need geoservers to store and share vector and raster data, and presentation layers and models, but the support is uneven. today in many cases scholars are still reduced to creating screenshots or video documentation of their vr/ar experiences, at least for archival purposes. advances in webvr and open d formats get us closer to being able to share such content in its “native” form more openly and transparently. emulators and code repositories also hold promise. (https://trianglesci.org/ -institute/validating-and-valuing-digital-scholarship/) in the absence of other solutions, d and vr/ar experiences are likely to be served up from common commercial or open-source server platforms.
platforms like sketchfab and the availability of unity d and unreal as free presentation environments make it possible to share work in a format closer to the original than was possible before. externally hosted digital objects appear as iframes or other types of embedded web objects. they have anticipated active lifetimes or what may be considered limited-time performances of technology exhibitions. the most lasting documentation may paradoxically be not in scholarly journals or websites but on social media and other generalist platforms like youtube and flickr. valve’s steam store is now allowing almost all content on its game store, for example, which may make it easier for some developers to share their work. project code is often shared on github, with github education paving the way for commercial options. it is up to practitioners in the field to think through the implications of these choices. whatever systems are put in place, whether homegrown, open source, commercial, or some combination of these, they should meet the development, collaboration, and sharing needs of scholars and researchers. labs can be a great place to test possibilities. a lab with a research agenda can serve as a client or real-life testing ground for a proposed solution in a way that an internal test may not because the end products ultimately need to serve the scholarly community for which they were produced.

labs in series and in parallel

an individual lab may operate within a larger network of labs, as well as within the context of university-wide curricular and research programs. duke’s wired! lab was formed in within the department of art, art history and visual studies, with support from information science + studies. it started out with an experimental class of five faculty and eight students who were interested in d modeling and the reconstruction of historical buildings and environments. the initiatives in the wired!
lab today evolved from those beginnings and fall within one of two main research categories: digital cities/urban histories and the lives of things (wired! ). current wired! lab collaborators include faculty in art and architectural history, archaeology, visual and media studies; the staff director of the departmental visual media center; the university subject-area librarian for visual studies; and a dedicated digital humanities specialist. we also work closely with the nasher museum of art staff and the rubenstein library’s special collections librarians on special projects. the group holds regular meetings and friday afternoon project work sessions. the disciplinary orientation of the wired! lab has been central to its high profile in art history circles. wired! has embraced the “digital art history” label and focus for its work. it was founded by caroline bruzelius, an eminent art historian focused on medieval architecture with an interest in d modeling of cathedrals, and is now led by paul jaskot, also a noted art historian who works on holocaust architecture and geospatial analysis. these credits give wired! academic clout. undergraduates from a wide range of disciplines, as well as students pursuing ma, mfa, and phd degrees, join lab teams as participants in lab-based courses and projects organized under the digital art history rubric. lab projects take the forms of posters, papers, publications, exhibitions, apps, and vr experiences. wired! does not exist in isolation, however, nor could it achieve all it has without benefiting from the existence of other labs and programs. in , just as the wired! lab was becoming more established, the greaterthangames: transmedia applications, virtual worlds, and digital storytelling lab (gtg) was formed at the john hope franklin humanities institute. like wired!, gtg was built on a curricular experiment.
in this case it was the virtual realities first-year course cluster, which ran from to and included courses from computer science, math, media studies, and classics. the game lab that followed made a substantive alternate reality game; held a semester-long mobile app development workshop attended by students, faculty, and staff; and developed art games in unity and location-based ar apps. it also partnered with wired! on an architectural history ipad app, visualizing san giovanni e paolo. although the lab disbanded in , the teaching and research collaborations continued. my own work on augmented reality for digital city applications grew out of this experience (szabo ). additional labs co-located with wired! today include the duke art, law and markets initiative (dalmi), the emergence lab, the dig digital archaeology lab, the information science + studies lab, the john hope franklin humanities institute labs, and the shorter-term bass connections projects. we also work closely with the visualization and interactive systems (vis) group, a cross-functional, cross-campus interest group/lab that includes the leaders of the duke immersive virtual environment (dive) in the pratt school of engineering, as well as library representatives focused on digital scholarship, visualization, and gis. undergraduate, ma, and phd students in digital art history and computational media, arts & cultures draw upon this whole network in writing their hybrid theses and dissertations.

sharing solutions

the key to collaboration lies in finding common ground around research topics of mutual interest. for example, in wired! we became very interested in creating richly annotated, immersive architectural models that could be explored in both active and passive modes.
https://www.dukedalmi.org
https://cmac.duke.edu/labs/emergence-lab
https://fhi.duke.edu/labs
https://bassconnections.duke.edu
http://virtualreality.duke.edu
http://cmac.duke.edu

the interaction scripts developed there could be used in other contexts, such as in producing vr art projects and instructional content, but they were also useful for diverse other applications in the dive itself. this possibility of sharing code builds upon a choice we made several years ago when we discussed the need for shared d workflows and pipelines, both for the purposes of introductory instruction and for more advanced vr implementation. we collectively embraced the decision to standardize on a unity d-based workflow, investing together in some professional licenses to make it work. our shared goal was to make it easier to move from d model to immersive environment by providing a way for researchers to develop and display their own content in unity d, and then to export it to various platforms, including but not limited to the dive. critically for our partners in engineering, our projects become interesting case studies for them, too, in terms of user interface design, graphics standards, and multimodal data management schemes. from our perspective, standardizing in unity d has made it easier for us to move into head-mounted display variations on immersive vr experiences originally conceived of as limited to cave-based installations only. we have also begun working with unity d plus vuforia to create ar projects focused on generating d models from d images. current work with wired! lab postdoctoral researchers from padua, italy, who are working on visualizing the scrovegni chapel giotto frescoes is an excellent example of how our “new” workflow is working across projects and teams.
the historical researchers were interested both in representing the changing architecture of the building itself through interactive building information modeling (bim) and in ar exploration of the frescoes contained within it. the dive researchers were interested in comparing vr/ar approaches with mobile-based approaches to the same materials for research purposes. the padua team has begun to bring its existing unity 3d-based models into the dive and head-mounted displays. they are also exploring ways to layer information about the frescoes into a real-time ar tool for use by visitors. this research may not only benefit future projects with our local museum, but also lead to tutorials and platform choices to share with the other labs, library, and it partners (giordano et al. ).

curricular challenges

as anyone who works in 3d and vr/ar knows, integrating these technologies into teaching and research can be complicated when working with beginners, whether those beginners are first-year students or advanced faculty. one way to scaffold learning is to use our own projects as teaching examples, as the padua example suggests, deconstructing them into tutorial sample files. another is to create lab projects that students can participate in at different levels of the curriculum, and to create courses around projects that have a life beyond a specific class. for example, we have used the duke campus and durham, north carolina, community resources in both teaching and lab research.

nonetheless, managing the balance between subject-area and technical instruction remains difficult. regardless of who teaches technical topics, we wonder how and when to include tutorials as part of the core requirements of a course. we ask if it is better to separate technical instruction from primary classwork, and what the costs and benefits are of having students working in groups.
we also ask whether we should privilege historical research over digital execution in final projects. since many students from computer science and engineering are attracted to our courses, we debate whether we should allow them to flex their skills or expect them to learn how to do qualitative research. by having students work in cross-functional teams, we try to have them do a bit of both. this approach also prepares them for future roles in labs. these areas need further clarification and research. ultimately, the answers may be highly context-driven (in terms of student interest and skill level, discipline, course subject matter, and particular research questions).

despite these efforts, the dream of perfect hybridity in teaching with digital technologies is elusive. while the use of some tools, such as readily accessible web-based presentation tools, is easy to teach in a single training session, more involved gis, 3d, and vr/ar topics require greater degrees of scaffolding and explication. we have also created downloadable tutorials and some online instruction, and we are working with the libraries to develop more. despite our best efforts to pack everything into a single course, we have determined that some topics, such as historical gis, unity 3d interaction design, physical computing, and web-based multimedia communications, require their own courses for any depth of study.

hybrid futures

while a few of our faculty, such as edward triplett, former clir fellow and now a departmental lecturer and wired! lab member, can move easily between historical research and technical wizardry, most faculty members are more accomplished in one area or another. we look to the next generation of scholars for more flexibility and for more collaborative teaching and research models than is currently the norm.
to produce this next generation of hybrid scholars, we invite graduate students into the labs to serve as research assistants to faculty and mentors to undergraduates. we also encourage them to develop projects of their own within the phd lab in digital knowledge, a lab i co-direct with philip stern from the department of history, as part of the duke digital humanities initiative at the john hope franklin humanities institute. the phd lab scholars program currently includes about digitally curious graduate students from a wide range of humanities disciplines. we meet biweekly to share project work, and in between those sessions, we meet with technology-focused working groups organized by faculty, staff, and library partners. this year’s working groups will focus on digital pedagogy, vr/ar, digital mapping, text analysis, and environmental humanities. we also invite our graduate students’ faculty mentors to come and see what we are doing, offer critiques, and perhaps learn a bit themselves about what is possible.

http://www.dukewired.org/workshops/tutorials/ http://sites.fhi.duke.edu

the duke digital humanities initiative also supports a digital humanities fellows program at north carolina central university (nccu), a distinguished historically black university with a strong community presence. the nccu fellows meet monthly at duke and have established their own digital humanities lab at their home institution. in the coming year, the faculty from nccu will work with the information science + studies and wired! labs on a local neighborhood history project using location-based 3d and ar to share archival materials and to document resident experiences. we hope this partnership will contribute both to positive community relations and to best practices for public-facing, accessible digital humanities research.
national and international connections

these types of digital humanities projects benefit greatly from national and international connections. these connections include the wired! lab partnership in the visualizing venice group on research and training. visualizing venice started out as an export of the “lab” model to our research partners in venice and padua. the group has multiple collective aims: to advance art history research, to advance training, and to explore new modalities of presentation and exhibition. from to , among our other activities, the visualizing venice group taught thematically focused, hands-on, two-week summer workshops in venice for junior scholars and young scholars interested in digital art and architectural history. the workshops allowed us to train the next generation of scholars on an international level and to reflect together on the next stages for our collaborations.

then in june , with the encouragement of the getty foundation, we shifted focus to offer instead an advanced topics in digital art history: 3d geospatial networks institute (#dahvenice ). unlike our earlier workshops, which were targeted toward individuals, this opportunity privileged applications from interdisciplinary teams, building on our experiences with the lab model and extending it outward. teams from countries converged in venice to share projects and ideas, reflecting a wide range of interests. participants include architectural historians, art historians, and archaeologists, as well as experts in gis, 3d, bim, and vr, with projects coming from various scales and perspectives. the project continues over the course of the year.

http://digitalhumanities.duke.edu http://visualizingvenice.org
we hope to establish a robust virtual community (or lab) in which we produce best practices, grant proposals, shared software resources, and joint publications. as organizers, we have already benefited from the group’s counsel as we expand our project conceptually from the visualizing venice focus to the visualizing cities project. when we next meet as a group, in june , we will workshop our deliverables in anticipation of public dissemination.

at the same time as the wired! lab and visualizing venice have pushed the boundaries of digital art history, we have also furthered the conversation about 3d and vr/ar as digital humanities methods. with the support of the national endowment for the humanities, duke university hosted the two-week virtual and augmented digital humanities institute in july . here our emphasis was specifically on the affordances of vr/ar as media forms, as noted in the project description. participants from the disciplines of art history, literature, education, modern languages, history, media art, and classics met alongside engineers and developers, including our own vis group members, as well as local partners in key library units focused on mapping, visualization, and digital scholarship. a few local faculty who were digital humanities skeptics were also invited. the group discussed questions of ethics, longevity, the digital divide, access, and fair use, as well as favorite software, best practices for development, and project demonstrations. as with the digital art history group, we are seeking to generalize the lessons learned from particular projects into wider principles and best practices, while at the same time building a virtual community around our shared interests.
this push and pull between media possibilities and disciplinary objectives, between common dreams and individual objectives, between open-ended possibilities and established values, will continue to animate our conversations over the coming months and years around 3d and vr/ar in humanities scholarship and beyond.

within the wired! lab, and in other labs like greaterthangames and the phd lab in digital knowledge, we have found that what ultimately makes the humanities lab work is the participants’ willingness to communicate; to share time, expertise, and resources; and to honor both individual and group project goals. for some participants, the hook is a scholarly question; for others, it may be about media, tools, systems, pedagogy, or publishing. beyond the project focus that keeps everyone engaged, the secret of the lab model is that any self-identified working group can declare themselves a lab and behave accordingly. we had participants in venice declare themselves a “lab” with laughter, but also a bit of elation. however empowering such a statement may be, the lab model also depends on the centralized infrastructure and vision of the libraries, departments, foundations, commercial entities, and community partners on which it relies. for 3d and vr/ar in the humanities, in all their complexity, expense, and potential, it will take all of us to advance the field.

http://vardhi.org

references

american historical association. . guidelines for the professional evaluation of digital scholarship by historians. available at https://www.historians.org/teaching-and-learning/digital-history-resources/evaluation-of-digital-scholarship-in-history/guidelines-for-the-professional-evaluation-of-digital-scholarship-by-historians.

ayers, edward l. . “does digital scholarship have a future?” educause review ( ): – .

college art association and the society of architectural historians. .
guidelines for the evaluation of digital scholarship in art and architectural history. available at http://www.collegeart.org/pdf/evaluating-digital-scholarship-in-art-and-architectural-history.pdf.

giordano, andrea, isabella friso, cosimo monteleone, and federico panarotto. . “time and space in the history of cities.” in digital research and education in architectural heritage, edited by sander münster, kristina friedrichs, florian niebling, and agnieszka seidel-grzesinska, – . springer international publishing. version of paper available at https://duke.app.box.com/s/ qvbgyittqw v wfjne efz.

johnson, erik. . “who gets to be on the steam store?” (blog), june , . available at https://steamcommunity.com/games/ /announcements/detail/ .

lanzoni, kristin huffman, andrea giordano, and caroline astrid bruzelius. . visualizing venice: mapping and modeling time and change in a city. routledge research in digital humanities. london, new york: routledge, taylor & francis group.

lippincott, joan k., and diane goldenberg-hart. . cni workshop report digital scholarship centers: trends & good practice. available at https://www.cni.org/wp-content/uploads/ / /cni-digitial-schol.-centers-report- .web_.pdf.

modern language association. . guidelines for evaluating work in digital humanities and digital media. available at https://www.mla.org/about-us/governance/committees/committee-listings/professional-issues/committee-on-information-technology/guidelines-for-evaluating-work-in-digital-humanities-and-digital-media.

szabo, victoria. . “apprehending the past: augmented reality, archives, and cultural memory.” in the routledge companion to media studies and digital humanities, edited by jentery sayers, – . new york: routledge.

wired! . “about wired!: history.” duke university. accessed jan. , . available at http://www.dukewired.org/about/.
3d cultural heritage informatics: applications to 3d data curation

will rourk

abstract

the physical elements of history directly link us with the heritage of diverse cultures. artifacts, architecture, and historic sites, active or derelict, must be documented while they are extant; otherwise, the opportunity to maintain a connection with their historic narratives may be compromised or lost to time. data collection technologies such as laser scanning, photogrammetry, and other three-dimensional data recording techniques and devices allow the documentation of the conditions of a historic place or object with submillimeter accuracy. libraries can cultivate the resulting high-resolution 3d data to provide a variety of modes for exploring, researching, preserving, or reconstructing physical historical features. the university of virginia library uses methods that implement the full scope of 3d data curation through the collection, processing, archiving, and distribution of data and its derivatives to the scholarly community. the application of these methods as an intrinsic role within libraries opens up a potential area of information science that can be identified as 3d cultural heritage informatics (lib3dchi). more specifically, lib3dchi can be described as a set of techniques that yield primary source data derived from the existing conditions of historic and culturally relevant objects, places, and sites. 3d technologies such as web3d, computer-aided design (cad), 3d printing, and virtual reality can help make a stronger connection to these objects, places, and sites by providing access to measured data about them through sensory-immersive technologies.
libraries are promoting democratized access to 3d assets as a means of providing new forms of knowledge to the scholarly community.

introduction

to maintain the connection between the physical elements of history and their historic narratives, it is essential to document the features of artifacts, architecture, and historic sites, both active and derelict. data collection technologies such as laser scanning, photogrammetry, and other 3d data recording techniques and devices allow the documentation of the dimensions, scale, and geometry of a historic place or object with submillimeter accuracy. libraries can curate the resulting high-resolution 3d data and provide a variety of modes for exploring, researching, preserving, or reconstructing physical historical features. the application of these methods within libraries opens up a potential area of library information science that can be identified as 3d cultural heritage informatics (lib3dchi). at the uva library, lib3dchi is defined through the four fundamental stages of collecting, processing, archiving, and accessing 3d data.

collecting 3d data

at the heart of lib3dchi is the acquisition of precision data. documentation of the existing physical conditions of material cultural heritage provides the fundamental base layer of data for lib3dchi. producing reliable 3d data requires precision measuring instruments and techniques. 3d documentation technologies such as laser scanning, structured light scanning, and photogrammetry can provide measured data with millimeter to submillimeter precision. these technologies can provide two types of fundamental information about an object: surface geometry and color texture. laser scanners and structured light scanners are metrology devices, and their main function is to collect surface data in three dimensions during the data collection phase. lasers can detect surface variations from to .
mm, depending on the sensor. external measuring devices such as a meter stick or calibrated scale bars should be included when photographing the subject so measurements can be taken from the images and applied to the data when they are processed using photogrammetry software such as agisoft photoscan pro (measurement is not allowed in photoscan standard).

choosing the most appropriate technology to yield the most effective data depends on the subject matter to be documented. scale and surface characteristics are the two most important features in determining which tool to use. for laser-based documentation technologies, shiny surfaces and mostly black, mostly white, or semitransparent textured surfaces may generate enough distortion to corrupt the surface data or even make it impossible to collect the data at all. because of the properties of light, lasers tend to reflect off shiny or white surfaces, while they are absorbed into black textured surfaces. structured light and photogrammetry may work better for such surfaces, as these techniques use less concentrated light to gather information. structured light flashes a light pattern onto a surface, while photogrammetry uses images of the surface to generate data based on triangulation from a camera pair. as long as the surface is kept consistently lit with a minimum of texture changes like shadowing, reflections, or color shifts, either of these techniques may succeed where laser light may fall short. the quartz arrowhead in figure – shows conditions that are not conducive to the use of laser scanning, but are more amenable to structured light scanning.

3d data collection techniques yield raw data in point cloud form. for example, laser scanners function by shooting a beam of light into the space all around the device, recording millions of points from surfaces that intersect with the laser.
photogrammetry produces similar data by generating sparse and dense point clouds during processing. the raw point cloud data are commonly captured in a proprietary file format, along with other metadata that may have been recorded by the scanning device. raw data may not be readable in software other than what the manufacturer of the device has provided. in this case, the data must be processed and exported to other formats such as ply, obj, dae, pts, or x3d to be usable in other software. point cloud data can be converted to derivative 3d mesh formats and imported into other 3d software to be transformed into 3d content for a multitude of purposes. figure – shows scan data from gutzon borglum’s the aviator at different stages of data processing from point cloud to refined mesh, which is the most commonly used form of 3d content. the processing phase must preserve as much precision as possible of the raw point cloud data as they are converted to a 3d mesh.

fig. – . challenging surfaces: semitransparent, white textured quartz arrowhead (right image courtesy of ben ford, rivanna archaeological services llc)

fig. – . stages of 3d data processing from point cloud to mesh

processing 3d data

in documenting a site, place, or object, it may be necessary to collect several datasets to cover the entire surface of the subject. a scanning device can capture only surfaces that are visible to the scanner. thus, in most cases, either the scanner or the subject will need to be repositioned to ensure that all surface data have been captured and the subject has been fully documented. multistage scanning sessions will yield separate datasets that must be registered together to produce one cohesive dataset that accurately represents the documented subject. 3d data registration techniques depend on the technique used for documentation (e.g., laser scanning, photogrammetry).
targets, and other registration devices, such as spheres, checkerboards, and coded targets, can be placed in the environment during the scanning process to help ensure that datasets are merged accurately. processing software will recognize standard targets and use them to precisely fit datasets together. current software is able to find similarities between datasets, called correspondences, to perform registration without targets.

it must be emphasized that data are at the heart of lib3dchi. 3d data are commonly misunderstood as “3d models” or “3d pictures” in discussions of the content produced through 3d scanning. however, 3d documentation technologies provide the actual, measured conditions of a place or thing, not just an appearance of or similarity to the subject. object dimensions can be obtained from measurements between any two points in a point cloud, and point clouds can contain millions to billions of points, yielding a multitude of dimensional measurements.

the fundamental structure of 3d point cloud data can be reduced to values for x,y,z,i,r,g,b such as the contents of the pts file seen in figure – . xyz describes a point existing along three axes in space. rgb is a color value attributed to that point in space. the variable, i, is an angle of incidence value that describes how light is interacting with the surface at that point in space (california department of transportation ).

along with the fundamental data, raw data from a scanner may include metadata that are produced by other sensors in the measuring device, including a global positioning system (gps), altimeter, temperature, resolution settings, and other information specific to the conditions in which the data were collected. the data can be output to a variety of modes that include, but are not exclusive to, 3d cad-type modes, such as files for 3d printing, vr, web3d, or interactive 3d environments.
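as a rough illustration of the x,y,z,i,r,g,b structure described above, the following python sketch parses pts-style text and measures the distance between two points. it is not drawn from the uva workflow: the names PtsPoint, parse_pts, and distance are hypothetical, the sample values are invented, and the parser assumes whitespace-separated records of exactly seven fields, skipping any other line (such as a leading point count).

```python
import math
from typing import List, NamedTuple


class PtsPoint(NamedTuple):
    """One point record: position, the text's i value, and color."""
    x: float
    y: float
    z: float
    i: float  # the "i" column; scanner documentation often calls it intensity
    r: int
    g: int
    b: int


def parse_pts(text: str) -> List[PtsPoint]:
    """Parse whitespace-separated x y z i r g b records.

    Assumption: any line that does not have exactly seven fields
    (for example a header line holding the point count) is skipped.
    """
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue
        x, y, z, i = (float(value) for value in fields[:4])
        r, g, b = (int(value) for value in fields[4:])
        points.append(PtsPoint(x, y, z, i, r, g, b))
    return points


def distance(p: PtsPoint, q: PtsPoint) -> float:
    """Straight-line distance between two points, in the cloud's own units."""
    return math.dist((p.x, p.y, p.z), (q.x, q.y, q.z))


# Invented sample data: a point count followed by three records.
SAMPLE = """3
0.0 0.0 0.0 -1200 128 64 32
3.0 4.0 0.0 -900 130 66 30
3.0 4.0 12.0 -850 131 65 29
"""

points = parse_pts(SAMPLE)
print(len(points), "points parsed")
print("distance between first two points:", distance(points[0], points[1]))
```

the same distance function applies to any pair of points in a cloud, which is what allows dimensions to be read directly from scan data; the units are whatever the scanning device recorded.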
it is important for the archival record of the data to document the conditions under which the data were collected and produced. careful archiving of source data and documentation metadata not only preserves measured 3d data, but also ensures repeatability of data collection and processing methods, facilitating the reproducibility of data output (pedersini, sarti, and tubaro ).

fig. – . ascii text readout of pts formatted file contents showing x,y,z,i,r,g,b

archiving 3d data

libraries must develop effective means to balance preservation and access in archiving the diverse forms of 3d cultural heritage data. archiving strategies can be placed on a spectrum of preservation and access from long-term storage to immediate use. at one end of the spectrum are dark archives provided by the academic preservation trust (aptrust), a collaborative consortium of higher education institutions managed by uva; the aptrust’s goal is to develop strategies for storage and successful retrieval of all forms of data going into the future. aptrust provides storage of data in perpetuity and is not designed to allow immediate retrieval.

http://aptrust.org

on the other end of the spectrum are archival solutions that offer immediate, open access to data. developed at harvard university, dataverse is an open source, open data platform that has the ability to ingest just about any data file format and then publish it openly to the world. the uva library favors an open access approach to archives, incorporating dataverse into its scholarly institutional repository, libra, to allow open access to scholarship created by the uva community. 3d formats can pose a variety of challenges to most repositories because of their diversity in format types, but dataverse is extremely flexible.
once data have been uploaded and published in the uva dataverse, a solr script creates searchable facets in virgo, the uva library’s search and discovery platform. search terms in virgo depend primarily on the parsing of key terms and description fields in dataverse. dataverse allows open access to students, faculty, staff, and other teachers and researchers who wish to discover and download 3d primary source data, and then incorporate 3d data into their research and pedagogy.

although 3d technologies are widely utilized, little is understood of the differences in 3d formats. understanding 3d is dependent on fully understanding how the content is generated. 3d content can be manually modeled using authoring tools like sketchup, 3d studio max, or blender. 3d data can be generated using laser scanners or through photogrammetric processes. source content is different from the derivatives that are exported from 3d authoring, editing, and optimization software. source content may be usable only by the tools that created it or the software that was used during data acquisition. content must be exported to formats that are contingent on context, such as 3d printing, cad, and vr.

problematically, dataverse will ingest a variety of source files and their derivatives and dependencies, and present the content as a list with no visual reference (figure – ). therefore, scholars unfamiliar with 3d formats may have difficulty understanding how to use a particular file that is available in dataverse.

https://dataverse.org/

fig. – . dataverse interface

in collaboration with the uva library, the uva institute for advanced technology in the humanities (iath) is developing a more user-friendly interface to access dataverse assets and serve as an intermediary between dataverse and virgo.
the interface uses the open-source web3d viewer 3d heritage online presenter (3dhop) to provide an interactive 3d model for users to explore the data before download. it provides links for individual content formats according to potential use, for example, pts, xyz, or e57 for point cloud data or obj, ply, stl, x3d, or dae formats for mesh data. the interface in figure – presents the subject graphically and contextually with links to the repository, museum, or other collection source where the original subject is kept, providing a bridge between the technical data, or provenance, and contextual metadata (d’andrea and fernie , – ; denard , ).

http://vcg.isti.cnr.it/3dhop/

fig. – . drupal interface with 3dhop viewer for accessing 3d data in uva library virgo

accessing 3d data

once lib3dchi data have been archived and made openly accessible, scholars can use the data for a variety of academic purposes. libraries are employing 3d content experts to help the scholarly constituency with a variety of tools, techniques, and information to use 3d data most effectively. the following examples illustrate 3d data use at the uva library.

3d data collections

the 3d greek vase scanning and printing (3dgv) project is an ongoing collaborative effort between uva archaeology professor tyler jo smith and the uva library. in early , smith received a grant to study 3d documentation and 3d printing of artifacts from the fralin museum of art at uva. vases and other, similar artifacts of the classical greek period were scanned using a creaform zscanner cx laser scanner at a precision of . to . mm to collect the 3d surface and color texture data of the original. the data were processed and refined for 3d printing at a : scale in abs and pla plastic, typical of most printers today.
while these types of printers do not replicate color texture from the 3d data, the geometric surface data can yield a nearly exact replica of the original vase’s form. this enables students engaged in archaeology, art and architectural history, and other fields to directly handle artifacts, reducing stress and potential degradation of the original. this project continues today as new students are exposed to 3d documentation processes.

the dataset from the documentation of fralin greek artifacts has become a collection on the uva library’s dataverse, and the scholarly public can access all of the data openly through virgo. museums such as the fralin are keen to open collections data to the public with the understanding that a creative commons cc license enables the wider community to download and gain full access to the data, unfettered by restrictive copyright protections.

the fralin has engaged in other projects with the uva library to document and provide access to other cultural heritage artifacts. in the spring of , the fralin hosted an exhibition entitled collect, care, conserve, curate: the life of the art object, which explored various techniques and tools that museum conservators use for collection stewardship. figure – shows artifacts from the fralin mesoamerican and african collections that were 3d scanned and printed to provide facsimiles of artifacts for museum visitors to pick up and explore while the originals remained safe, but visible, behind glass. in addition to the physical reproduction, a 3dhop web viewer was made available on an ipad kiosk for visitors to explore the fully color-textured object in 3d. providing multiple forms of interaction from multiple derivatives of the 3d documentation data can bring scholars and the curious public alike closer to the physical reality of tangible cultural heritage.
http://archaeology.virginia.edu/ d-greek-vases.html http://archaeology.virginia.edu/ d-greek-vases.html d cultural heritage informatics: applications to d data curation d scan-to-print technology for accessibility d scan-to-print technology can be an effective solution for connect- ing the general public with invaluable artifacts and historic objects, but it can also be used to enhance interaction and appreciation of artwork for people with disabilities. at the gari melchers home and studio museum at belmont house, university of mary washington, michelle crow-dolby teaches young children about the artwork of early twentieth-century artist gari melchers. she often works with children who have low vision disabilities that prevent them from fully participating in classroom activities. she looked to the uva library for help in finding a technical solution for these students. crow-dolby recognized the work that the uva library had done with d scanning combined with d printing and saw an opportuni- ty for productive collaboration. the library once again used a crea- form zscanner to d-scan sculpture from gari melchers and d print a : replication of the original. crow-dolby now uses this technique to allow the children to interact directly with the sculpture, which visitors to the museum could not normally handle. by providing a replica of an original artifact or artwork based on precision measured data, d scan-to-print output could be con- sidered a form of d data in its own right. digital data gathered by d documentation techniques are encoded in the physical form of the d print. it must be noted, however, that the physical replication of an artifact is not exact. the d scanning and printing processes degrade the precision of the information derived from the original subject. at best, laser scanning can provide a precision of only . mm of the surface of the original. the many minute details and nu- ances that exist below . mm are not recorded. 
additional informa- tion, such as high-resolution photographs, scholarly accounts, and more descriptive metadata fields, must accompany the data record for preservation. fig. – . collect, care, conserve, curate: the life of the art object exhibit d cultural heritage informatics: applications to d data curation historic d reconstruction: lhasavr before the development of d acquisition methods, manual d modeling was a standard method for d documentation of archi- tecture and artifacts. just as plan, section, and elevation renderings will remain the standard for architectural documentation, historic d reconstruction continues to be an effective means for generating cultural heritage data. this is especially true of non-extant places and things or time-dependent, ephemeral conditions of architectural and archaeological features that may have undergone renovation, res- toration, or reconstruction. with careful research into historic docu- mentation of a place or thing, d modeling can digitally re-create a cultural feature. a massive d reconstruction project at uva is the lhasa vr project with the tibetan and himalayan library (thlib). the main effort of this project was to build a comprehensive historical gis of the tibetan capital of lhasa prior to the occupation by chinese forces. more than , buildings have been razed or significantly altered since then. with the use of early twentieth-century maps and early high-altitude photographs, a comprehensive d map was cre- ated of the city and its immediate surroundings in the lhasa-kyichu river valley. more than , features were identified by using esri arcmap mapping tools. each feature was linked to the thlib data- base to create a map enriched by information. the d map was then converted to a d model by using the esri cityengine procedural modeling tool and data from the thlib database to auto-generate buildings according to their historic characteristics. 
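The footprint-to-building step can be illustrated with a minimal Python sketch. It is not CityEngine's rule language and not the project's actual workflow; it only shows the general idea of extruding a 2D footprint into a 3D block whose height comes from an attribute in a linked database record. The field names `footprint` and `storeys` and the 3.0-unit storey height are hypothetical.

```python
def extrude_footprint(footprint, height):
    """Extrude a 2D footprint polygon into a prism.

    footprint: list of (x, y) corner points in order around the outline.
    Returns (vertices, faces); faces are tuples of vertex indices.
    """
    n = len(footprint)
    if n < 3:
        raise ValueError("a footprint needs at least three corners")
    bottom = [(x, y, 0.0) for x, y in footprint]
    top = [(x, y, float(height)) for x, y in footprint]
    vertices = bottom + top
    # One quad per footprint edge, joining the bottom ring to the top ring.
    walls = [(i, (i + 1) % n, n + (i + 1) % n, n + i) for i in range(n)]
    floor = tuple(reversed(range(n)))   # reversed so the face points downward
    roof = tuple(range(n, 2 * n))
    return vertices, [floor, roof] + walls


def building_from_record(record, storey_height=3.0):
    """Derive a block model from a (hypothetical) database record."""
    height = record["storeys"] * storey_height
    return extrude_footprint(record["footprint"], height)


# Hypothetical record for a two-storey building with a rectangular footprint.
record = {"footprint": [(0, 0), (10, 0), (10, 8), (0, 8)], "storeys": 2}
vertices, faces = building_from_record(record)
```

Keeping the geometry rule separate from the record lookup mirrors the idea in the text: the map features carry the historical attributes, and the model is regenerated from them rather than drawn by hand.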
higher resolution d models were produced in autodesk d studio max and then im- ported into the cityengine d gis model. the view of the d model in figure – is from the cityen- gine web viewer utility, a free web d viewer for d gis models http://www.thlib.org/ fig. – . thlib lhasa vr project interface in esri cityengine web viewer http://www.thlib.org/ d cultural heritage informatics: applications to d data curation created with cityengine. d models created in ds max were based on architectural information provided by the nonprofit, non- governmental organization tibet heritage fund. projects such as lhasa vr expand the use of d reconstruction models by using information systems such as gis or bim to link historic informa- tion to d data. vr pedagogy: virtual museums the transformation of d data into pedagogical content is the fun- damental goal of the d cultural heritage informatics process. the effective use of d content in the classroom is contingent on having tools that promote an engaging and meaningful experience. immer- sive technologies such as vr can bridge the gap between physical history and cultural heritage data. d technologies play a crucial role in providing alternative modes of access to the physical world. publicly accessible vr systems give users of both university and public libraries an opportunity to experience immersive content. the uva library provides public vr spaces that its scholarly community can use for teaching, learning, and playing. playtime encourages students and faculty to become familiar with d content interaction and opens up a variety of potential tools and methods for interact- ing with d data. using free software such as authoring tools in the unity d game engine and the steam vr platform, d and vr specialists at uva teach pedagogical methods that are easy to learn and apply to a variety of academic fields. 
a method developed by arin bennett, vr/ar specialist with the uva library scholars’ lab, and will rourk uses unity 3d as the main platform to help students curate spatial narratives based on curriculum research topics. narrative spaces are designed on the principles of museum spatial layout. a research topic is represented spatially by creating “rooms” in a virtual museum that relate to the arguments in a paper. the details of the argument are expressed by images, text, audio, or video objects placed in a room, much like objects in a museum exhibition. the rooms are usually designed for a linear experience, in much the same manner that a paper would be read. interactions with these spaces are usually kept simple, enabling, at a minimum, the ability to “walk” from one space to another to see images or read embedded or linked text. students who want to add more interactive features or create more specifically curated experiential elements can do so by learning javascript or c# scripting, which are both native to unity 3d. scripting is not taught in the class, but learning how to build advanced interaction by downloading premade scripts is encouraged. the class is designed for novices with little experience in 3d modeling or content creation.
students then lead their classmates in a tour of their research narrative while providing their own scholarly narrative. the 3d content is either output to a web3d viewer in a webpage or converted into a virtual environment for immersive exploration with vr headsets such as the htc vive or oculus rift. both modes are networked and allow multiple participants, so that other students or faculty can join in from other parts of the university, as well as from external institutions.
open access vr: dchivr in collaboration with a team of researchers from james madison uni- versity, the university of mary washington, and james madison’s historic home at montpelier, staff at the uva library are researching methods for enabling open access to lib dchi content in vr sys- tems. each of these institutions engage in vr systems-based interac- tion with their own constituency, while acknowledging that this con- tent can easily be shared across networks with the broader scholarly community. this multi-institutional team has entitled their effort the d cultural heritage informatics virtual reality ( dchivr) project, and their goal is to develop a platform for sharing content that is for- matted for use in vr systems. the system is to be based on the mod- el of open access, search, and discovery with dataverse and virgo at the uva library. to model how this system would work, the team is using the steam vr platform to provide lib dchi content collected and processed from laser scanners, photogrammetry, and historical d reconstruction. participants sign up for free with steam vr and can connect to one another in vr for a multiparticipant exploration of d documentation data of historic places. figure – shows a meeting taking place in data produced from two sites. at left is a d model of the anatomical theater, academi- cal village, university of virginia, produced for the juel project (jef- ferson’s university, the early life) by iath. at right is d data from laser scanning and aerial photogrammetry performed at the warm springs bathhouses (ca. late eighteenth century) in bath county, virginia. d documentation data were converted into d vr content with little to no editing so that participants can explore an authentic replication of an actual historic site. through networked vr systems, historical data can be easily distributed to the wider scholarly com- munity for a firsthand experience of cultural heritage features. 
an open source, open access solution is the desired outcome. fig. – . dchivr meeting in vr at the uva academical village (ca. s ce) and warm springs bathhouses (ca. late eighteenth century ce) d cultural heritage informatics: applications to d data curation conclusion the methods of lib dchi help bridge the experiential gap between subject and scholar. one of the goals of scholarly research is to be- come more closely acquainted with cultural heritage through its physical record. access to collections traditionally means examining artifacts behind glass, visiting a historic building in a remote or dis- tant location, or working with media that express a likeness of the original such as in photographs, video, audio, or conjectural physical or digital d modeling. in most cases, the researcher is physically separated from the subject and connected primarily through media or other facsimile representations. d documentation technologies and methods such as laser scanning, photogrammetry, and histori- cal reconstruction modeling can provide precise data that are useful for creating accurately measured representations of artifacts, archi- tecture, archaeological sites, and other elements of physical cultural heritage. web d technologies can help conveniently distribute d content and data through web browser interfaces. vr is a powerful mode for exploring cultural heritage d content, as it immerses the scholar in a life-sized, scaled representation of historic subjects with tools to explore, annotate, and make inferences while sustaining dis- course with other scholars and researchers within the global commu- nity. d printing can put accurately reproduced facsimiles directly into the hands of students and researchers who may otherwise have only limited tactile, visible, or other sensory interactions with the original subject. 
d technologies can strengthen a user’s connection to the subject by providing access to measured data of historic objects, places, and sites through sensory-immersive technologies. libraries are foster- ing democratized access to d assets as a means of providing new forms of knowledge to the scholarly community. by considering the creation, access, and preservation of d data, lib dchi techniques help to ensure that d assets are meaningful and useful for research and pedagogy long into the future. references california department of transportation. . “terrestrial laser scanning specifications.” in caltrans surveys manual, chapter . available at http://www.dot.ca.gov/landsurveys/docs/surveys- manual/ _surveys.pdf. d’andrea, andrea, and kate fernie. . “carare . : a metadata schema for d cultural objects.” in proceedings of the digital heritage international congress, – . piscataway, nj: institute of electrical and electronics engineers, inc. available at . / digitalheritage. . . denard, hugh, ed. . “london charter for the computer-based visualisation of cultural heritage.” draft . , february . avail- able at http://www.londoncharter.org/. http://www.dot.ca.gov/landsurveys/docs/surveys-manual/ _surveys.pdf http://www.dot.ca.gov/landsurveys/docs/surveys-manual/ _surveys.pdf https://doi.org/ . /digitalheritage. . https://doi.org/ . /digitalheritage. . http://www.londoncharter.org/ d cultural heritage informatics: applications to d data curation pedersini, federico, augusto sarti, and stefano tubaro. . “auto- matic monitoring and d reconstruction applied to cultural heri- tage.” journal of cultural heritage ( ): – . related reading boehler, wolfgang, muyambi vicent, and andreas marbs. . “investigating laser scanner accuracy.” in proceedings of xixth in- ternational symposium, cipa : new perspectives to save cultural heritage. antalya, turkey, september –october . champion, erik. . 
“the role of d models in virtual heritage infrastructures.” in cultural heritage infrastructures in digital humani- ties, edited by agiatis benardou, erik champion, costis dallas, and lorna m. hughes, – . abingdon, oxon, new york, ny: rout- ledge, taylor & francis group. dalton, margaret stieg, and laurie charnigo. . “historians and their information sources.” college & research libraries. ( ): – . available at . /crl. . . . gomes, leonardo, olga regina pereira bellon, and luciano silva. . “ d reconstruction methods for digital preservation of cultur- al heritage: a survey.” pattern recognition letters : – . available at https://doi.org/ . /j.patrec. . . . gruber, ethan wooster. . recent advancements in roman numis- matics. master’s thesis. charlottesville: university of virginia. guillem, anais, roko zarnic, and george bruseker. . “building an argumentation platform for d reconstruction using cidoc- crm and drupal.” digital heritage, – . available at . /digitalheritage. . . jomier, julien. . “open science—towards reproducible re- search.” information services & use ( ): – . available at https://content.iospress.com/articles/information-services-and- use/isu . marty, paul f., and katherine burton jones. . museum informatics: people, information, and technology in museums. new york: routledge. marty, paul f., and michael b. twidale. . “museum informat- ics across the curriculum: ten years of preparing lis students for careers transcending libraries, archives, and museums.” journal of education for library and information science ( ): . meyer, bonnie. . the accuracy myth. eden prairie, mn: strata- sys ltd. available at http://www.stratasys.com/resources/search/ white-papers/accuracy-myth. münster, sander, mieke pfarr-harfst, piotr kuroczynski, and ma- rinos ioannides. . d research challenges in cultural heritage ii: how to manage data and knowledge related to interpretative digital d reconstructions of cultural heritage. springer international publishing. 
pfarr-harfst, mieke. . “typical workflows, documentation approaches and principles of 3d digital reconstruction of cultural heritage.” in 3d research challenges in cultural heritage ii, edited by sander münster, mieke pfarr-harfst, piotr kuroczyński, and marinos ioannides, – . springer international publishing.
rajapakse, ravihansa, margot brereton, laurianne sitbon, and paul roe. . “a collaborative approach to design individualized technologies with people with a disability.” in proceedings of the annual meeting of the australian special interest group for computer human interaction, ozchi , – . new york: acm. doi: . / . .
rourk, william m. . “scan bim.” inform, , – .
schroer, carla. . “photogrammetry.” cultural heritage imaging. accessed december , . available at http://culturalheritageimaging.org/technologies/photogrammetry/.
schroer, carla, and mark mudge. . “photogrammetry training, practical scientific use of photogrammetry in cultural heritage.” san francisco, ca: cultural heritage imaging.
sula, chris alen. . “digital humanities and libraries: a conceptual model.” journal of library administration ( ): – .
turner, hannah, gabby resch, daniel southwick, rhonda mcewen, adam k. dubé, and isaac record. . “using 3d printing to enhance understanding and engagement with young audiences: lessons from workshops in a museum.” curator ( ): – .
wachowiak, melvin j., and vicky karas. . “3d scanning and replication for museum and cultural heritage applications.” journal of the american institute for conservation : – .
williams, robert v. . “enhancing the cultural record: recent trends and issues in the history of information science and technology.” libraries & the cultural record ( ): – .

virtual reality for preservation: production of virtual reality heritage spaces in the classroom
zebulun m. wood, albert william, and andrea copeland

abstract
the bethel ame church was the oldest african american church in indianapolis. in november , the congregation moved out of downtown, and the building that had housed the congregation since was sold. it is now being redeveloped into a hotel. through the virtual bethel project, faculty and students in the media arts and science (mas) program at indiana university–purdue university indianapolis (iupui) created a 3d virtual space of the physical sanctuary to preserve the cultural heritage of bethel. during its creation, virtual bethel served as a curricular and co-curricular experience for the undergraduate students in the 3d graphics and animation specialization within class n d team production, which was co-taught by albert william and zebulun wood. virtual bethel, finished in , was the first historical and cultural preservation project that used vr within our class, program, school, and indiana university (iu) campus. users can interact with various types of primary sources (e.g., photographs, video, audio, text) to learn about the underrepresented history of african americans associated with the church. virtual bethel was created in a series of classes within the mas program in the school of informatics and computing (soic), iupui. methods of teaching a team of students to preserve historic spaces using vr are discussed, as are our philosophies toward productions when working with varying stakeholders’ priorities related to data preservation, asset preservation, and cultural preservation.

virtual bethel was funded by an indiana university new frontiers grant in (http://research.iu.edu/funding_newfrontiers.shtml). the authors would like to thank the team of the advanced visualization lab (avl), iupui: thanks to jeff rodgers, chauncey frend, tyler jackson, and michael boyles for their unrelenting support of our programs and the all-too-often thankless hours they put into our projects by fielding questions and inspiring all. we thank the students of virtual bethel, first luke brown for coming onto the project to embed and build a vr storytelling database, the first of its kind at iu, and for being at every public showcase and documenting everything. thanks to the original team of tyler jackson, rachel davidson, bryan dinkens, thomas springer, roxanne wheeler, and charles yu, and to skip comer and lisha chen for creating our website and the ability to provide virtual bethel to the world. we thank online resources, inc. for the original 3d scan of bethel ame, and kisha tandy for providing research and for assistance in curating the virtual bethel vignettes. finally, we thank the bethel ame congregation, especially olivia mcgee-lockhart, for curating the mission, public showcases, vignettes, and accuracy of content within virtual bethel. thank you for your continued partnership with our students, school, and university.

project background and significance
the media arts and science (mas) program in the school of informatics and computing (soic) provides undergraduate courses in 3d production and visualization. as a project-based course, n d team production is designed to involve community partners in providing students with real-world, context-driven, hands-on learning opportunities.
the course focuses on the creation of high-end, broadcast-quality animations through team- based learning. students learn skills in areas related to production in a d project. these skills include preproduction tasks such as the development of a story, script writing, research, conceptual drawing, storyboarding, animatics, and project management. production skills are explored in d asset creation, time management, file manage- ment, sound, and title sequences. postproduction processes include final rendering, movie creation, and formatting for various playback devices. more recently, the program has embraced projects imple- menting and leveraging emerging technologies such as d printing, vr, and augmented reality (ar). founded in , the bethel ame church was once a vital part of a thriving african american community in the heart of the indi- ana avenue jazz district. before that, bethel played a vital role in the underground railroad. bethel has significant meaning not just in af- rican american history, but also in the local heritage of indianapolis. recently, the church site was rezoned for redevelopment, leaving the historic building—and the materials housed in the church archive for more than years—in a vulnerable position. although the effort to digitally preserve the at-risk physical space is not innovative in and of itself within cultural preservation domains (e.g., arc/k project, cyark, iconem, masterworksvr ), the virtual bethel project incorporates associated digitized and born-digital archival materials into the virtual space to provide a new way of learning history and interacting with bethel’s primary sources. the project intends to develop this space as a virtual learn- ing environment for undergraduate students’ history and primary source education. the methods used to develop virtual bethel by engaging undergraduate students will be relevant to studies of other historic sites and archives with similar ambitions. 
http://arck-project.org https://www.cyark.org http://iconem.com/en/ http://masterworksvr.com https://comet.soic.iupui.edu/bethel/

community: respecting the heritage and institution’s members
communication among the undergraduate n d team production class, bethel’s church membership, and local historians connected the students easily with andrea copeland, chair of the library and information science department in the soic at iupui. copeland focuses her research on facilitating connections between groups typically underrepresented by heritage institutions and community preservation infrastructures. her research finds that trusted relationships are essential for reducing social distances and building connections among individuals, institutions, and knowledge. she has worked for several years with congregants from the bethel ame church. working together on this project, we learned valuable lessons that can be used to support future community-driven heritage projects.

course description and necessary skills for restoring 3d scanned structures
the n d team production class allows students to work as a group and emulates the collaborative team environment found in the media and animation industry. the goal of the course is to bring students together to work on a common project. at this point in their undergraduate degree program, students have completed a number of prerequisites, including intermediate courses in 3d modeling, texturing and lighting, and animation. virtual bethel served as the first cultural preservation project using vr as a medium to engage students with the aim of educating the public.
regardless of the project assigned, students are encouraged to bring their existing knowledge and specialty (e.g., modeling, unwrapping, shading, materials, light- ing, game development) to the team and investigate new skill sets that they may want to develop or that are needed to facilitate the success of the production. we have found through teaching this course for five semesters over five years that students engage in this class for a variety of rea- sons with a range of positive outcomes. the course allows them to work together on a single project synergistically. students can apply their existing skills and also find ways to implement other interests that they may not have yet developed. the course teaches them team dynamics and develops skills essential to the success of group work, including communication, leadership, organization, and account- ability. past class assessments indicate that the experiences students have in this class are far beyond regular class work in contributing to a larger team and serving the community, that they relish this experi- ence, and that the overall result is a very satisfying academic exercise (lombardi ; mcleod ). at the beginning of class, we assess student strengths to com- pare the existing skill sets with a project’s goals. we carefully look at student abilities, often recruiting students that we feel might benefit virtual reality for preservation: production of virtual reality heritage spaces in the classroom from the experience and who can contribute specifically to certain portions of the project as it develops. we expect strong student lead- ership from within the group and encourage students to be account- able to their peers rather than to the instructors. we set up a data and communication structure that ensures all students begin interacting as soon as the class starts. 
we encourage the students to adopt a communication system that is easiest for them; many times, students have opted to use facebook or other social media platforms. they also communicate daily and share files through a system used in the indiana university infrastructure called iu box. technical knowledge of the craft (in this case, animation) is, of course, critical to the success of the project; yet the students are en- couraged to research the topics on their own early in the semester. supplementary materials are gathered and provided in our course materials on canvas, the campus learning management system. knowing the history of the subject or the background of the story gives students more interest in the success of the project and their role within it. the classroom as a d production and preservation studio in the fall semester of , our n class was presented with the opportunity to re-create the historic bethel ame church in india- napolis. we felt that a vr experience using d models and textures would be a powerful method of telling the story of this church. we also felt that this opportunity for community engagement was a per- fect fit with the collaborative nature of the class, as iu strives to high- light its engagement and impact throughout the state. we presented the students with this opportunity in the first class meeting. they considered this a great use of their skills and time, and they were excited to be working to give back to the downtown indianapolis community. during our second class meeting, all students and faculty visited the church, which was within walking distance of our classroom, to experience the space that we would re-create. in early september , on a warm, late afternoon with sunlight streaming through the stained glass windows, we spent more than two hours exploring every corner of the church. we took about , high-quality photo- graphs to use as reference for the d models and textures that would go into the vr experience. 
we used a gigapan robotic camera system to capture images that were stitched together to form highly de- tailed panoramas of the interior, and a ricoh theta camera captured low-resolution -degree images. all of these photographs were cataloged, archived, and used to help re-create the digital space. measurements of some structures were also taken as references for scaling the models. before the start of the semester, online resources, inc., a lo- cal d scanning company, had created laser scans of the interior of virtual reality for preservation: production of virtual reality heritage spaces in the classroom the bethel ame church for the project with a mm surphaser lidar laser scanner. this data set was approximately . gb and contained million polygons. tests of the scanned data using the unreal game engine outputting to an htc vive vr system showed that, because of the density of the scanned mesh and size of the data set, we had to greatly reduce the scan so we could view bethel in vr. the group decided early on to use the scan as a type of digital tracing paper, an excellent reference resource that could speed up our modeling workflow (figure – ). one of the first steps taken as we started to develop the church interior was to bring the reduced scanned data into unreal and scale it correctly. we simply placed a six-foot box representing the size of a human and compared it with the ob- jects and space within the scene. students used autodesk maya as the software to model all of the as- sets. the scanned data were reduced using pixologic zbrush’s decimation tools, then brought into maya and scaled to their proper dimensions for use as a template. objects that had been measured by students were used as references, and then the scanned model was brought to its proper relative scale based on unit settings (mm, cm, meter) in maya. 
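The scaling step described above can be sketched in Python. This is an illustration under stated assumptions, not the class's actual Maya procedure: each hand-measured object is paired with the length of the same object in the unscaled scan, and the median ratio gives one uniform scale factor. The helper names, the millimetre working unit, and the sample measurements are all hypothetical.

```python
from statistics import median

# Conversion factors to millimetres (the working unit is an assumption).
UNIT_TO_MM = {"mm": 1.0, "cm": 10.0, "m": 1000.0}

# Six-foot reference box used for the human-scale check, in millimetres.
SIX_FOOT_MM = 6 * 304.8


def to_mm(value, unit):
    """Convert a measurement given in mm, cm, or m to millimetres."""
    return value * UNIT_TO_MM[unit]


def uniform_scale_factor(pairs):
    """Estimate one scale factor from (measured_value, unit, scan_length) pairs.

    scan_length is the size of the same object in the unscaled scan, in scene
    units. The median keeps a single bad measurement from skewing the result.
    """
    ratios = [to_mm(value, unit) / scan_length for value, unit, scan_length in pairs]
    if not ratios:
        raise ValueError("at least one measured reference is required")
    return median(ratios)


# Hypothetical references measured on site, each in a different unit.
references = [(2.0, "m", 20.0), (1500, "mm", 15.0), (90, "cm", 9.0)]
factor = uniform_scale_factor(references)

# Height the six-foot box should have when expressed in unscaled scan units.
box_height_in_scan_units = SIX_FOOT_MM / factor
```

Converting every measurement to one unit before taking ratios avoids the most common failure in this step, which is mixing millimetre, centimetre, and metre values from different tools.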
using the scanned data and refer- ence photographs, our modelers were able to determine the basic shapes of each part of the specific objects and be- gin manual reconstruction. we needed to be cognizant of the number of poly- gons that this model contained so that it would show enough detail, but the number could not be so high as to hin- der the unreal engine executable file when it was brought to vr and dis- played in stereoscopy. production meetings, in which iterations of the virtual environment, individual assets, and interactions in the virtual environment were dis- cussed, were held at the beginning of each week’s class in a conference room, instead of our usual classroom. this allowed students to focus on all of the assets, brought them together as a team, and showed them how their contribu- tions were affecting the big picture. the fig. – . the original lidar scanned data (top), retopologized interior (middle), and fully textured and lit virtual bethel ready for viewing in vr (bottom) virtual reality for preservation: production of virtual reality heritage spaces in the classroom team reviewed each student’s work weekly, addressed concerns as needed, then tasked the students with moving forward on new assignments. we, the instructors, stressed to each student that oth- ers relied on their progress and thus adherence to deadlines was mandatory. production meetings were followed by visits to the iu advanced visualization lab to view the progress on the vr envi- ronment within the unreal game engine. after the team meetings, the rest of the class was dedicated to lab/production time for all members to spend time interacting, working, and receiving indi- vidualized instruction from both faculty and team leads as needed. on artistic or preservation considerations, the students considered feedback from the client/public and adjusted the interactives based on the audience’s reaction to the environment, primary resources within vr, or even the vr hardware itself. 
as we neared completion, some church congregation members and the church’s pastor visited to see our progress. it was surreal to watch the members find the pew where they had always sat and to observe the pastor stand in his favorite spot to give a sermon. the students saw the impact of their work on the visitors and started to understand how this project was important for the community. this was a very powerful visit for all involved.

as students completed 3d models for populating the vr environment, they kept records of their progress on a spreadsheet. as digital models were completed, students began unwrapping the 3d objects so they could be textured using allegorithmic’s substance painter. this process applied materials and colors to the 3d models to give them a sense of realism. to simplify the identification of particular materials, an internal team library was created for students to access materials that had been identified in bethel. materials represented surfaces in the real world (e.g., types of wood [cherry, oak, poplar], paint [glossy, matte, aged], metals [chrome, copper, iron], plastics [silicone, rubber, shiny, glossy]). as models were painted within substance painter, texture maps were exported for use in unreal’s physically based rendering (pbr) shaders, which experts in the video game and architectural visualization industry commonly use to make surfaces appear realistic.

after all models had pbr materials applied and were loaded into unreal, various processes were used to optimize the scene. for example, the textures were tested to see whether there were errors, lighting was added to the scene to simulate light coming through the windows, navigation controls were optimized, and teleportation locations were created to permit navigation in the environment.

https://kb.iu.edu/d/apel
see a short video of virtual bethel in early production used to solicit additional support: virtual bethel solicitation, october , .
available at https://vimeo.com/

building a campus infrastructure for virtual preservation of cultural heritage

in projects such as virtual bethel, the prior knowledge, resources, and communication ability of support staff, faculty, project partners, and client partners are critical to project planning. readily available hardware and software can ensure the success or guarantee the failure of projects. the mas faculty have made several choices and adopted specific philosophies to ensure success across media projects of many kinds. following are discussions of specific considerations concerning vr projects involving the digital preservation of spaces, including technical considerations; associated costs; student technical and artistic competencies; variations in student level of confidence and leadership; variations in peer-to-peer organization, communication, and accountability; and student learning outcomes.

technical considerations

the mas program faculty and students make every effort to stay software agnostic, especially in relation to game development engines. in an age when software updates are daily and software companies are purchased every hour, it is impossible to anticipate which updates, plug-ins, or software will stop being supported or will be changed entirely. we encourage our faculty and students to test, vet, and hone their skills on multiple platforms. the unity game development engine tends to enable easier porting to various head-mounted devices (hmds), mobile devices, and app stores, while unreal, until recently, supported only higher rendering and realism capabilities. we built virtual bethel using the unreal game engine because we wanted to develop contained systems for porting virtual bethel onto full vr, mobile devices, and web environments for maximum public access.
this was a new process for us, and future projects in the program will benefit from the lessons learned.

another important technical consideration is that 3d scans of objects, spaces, or both must be completed and delivered before the start of a production course semester. geometry created from scan data is often best at the highest possible capture settings. whether the data are captured via lidar, structured light, photogrammetry, or by other means, the bigger the dataset, the bigger the textures, and the larger and more frequent the photos, the better the end result. given the temporal constraints of the academic semester, student productions cannot be delayed because of the need to recapture or find additional photography.

associated costs

vr projects like virtual bethel have substantial costs. the total support funded by indiana university’s new frontiers grant in was considerable at roughly $ , . most of the funds covered the costs of student hourly labor outside of class time (texture artists, lighters, game developers, web developers, user experience [ux] researchers), faculty summer support, and a modest honorarium for our community partners. nearly percent of the budget was used to purchase server space for the website and invest in a mobile vr workstation with a laptop capable of showcasing iterations of virtual bethel to the public during its production so that students could receive feedback (copeland et al. ).

without an mas or similar program, an institution is unlikely to have the necessary resources for such projects, including lab space, hardware, and software. the lab space, computers, display hardware, and 3d and game design software had already been purchased for teaching mas students in the undergraduate and graduate program within soic, iupui.
the core labs, it and it , house computers with cintiq displays with the latest software and hardware for film, game and vr art, production, and development, as well as vr hardware, respectively (figure – ). unless this infrastructure and the supporting it staff already exist, projects like virtual bethel would be unreasonably expensive.

fig. – . 3d (left) and vr production (right) labs at the school of informatics and computing

in terms of labor, the three to five mas students who worked during the digital replication of virtual bethel averaged a total of hours per week, followed by two students who averaged ten hours per week in labor in the second semester, and one student who, once the project was funded, worked over the last eight months at an average of hours per week embedding audio and interactions and iterating on the story vignettes within virtual bethel. two students were paid outside of class after the first semester through the support of the iu new frontiers grant for two additional semesters to continue the development of the interaction portions of virtual bethel; to embed a curation experience; and to refine standardized workflows for creating multidevice vr executables for any audience or hardware such as oculus, vive (full vr), ios and android (mobile vr), and web-based versions of the experience (for those lacking hmd or mobile vr hardware). a graduate student with web development experience was also paid to develop the public facing website and store all of the content.

https://research.iu.edu/funding-proposals/funding/opportunities/new-frontiers/index.html
critical decision-making points through the project based on stakeholder feedback

several critical decision-making points arose through the project, particularly in terms of vr navigation, curation of bethel’s story and history, implementation of audio and voice, protecting the cultural protocols of the bethel congregation, and preservation community priorities with vr constraints.

virtual reality navigation

during the first public showcase in the fall of , we noted that many of the bethel membership were elderly and took several minutes to learn to use vr navigation. some members also were unable to stand because of their health. the decision was made and implemented immediately after that session to provide two modes of vr navigation when showcasing virtual bethel with hmds: one standing mode with teleportation enabled, and a second sitting mode that would allow the user to transition to various locations preloaded in the virtual bethel sanctuary by clicking one button. making the vr experience accessible to those who cannot stand or walk has become a key priority in building alternative vr navigation interactions for future projects.

curation of bethel’s story and history

after the space was initially showcased to the local membership, bethel resident historian and virtual bethel curator olivia mcgee-lockhart and several heritage partners were given the opportunity to experience the virtual bethel space. they expressed gratitude at the re-creation of the space, but said that the experience felt flat and lacked a sense of story, import, or exploration. we knew the space alone was not enough to educate or leave an impact on an audience, so we decided to embed interactive story vignettes (figure – ). the vignettes included short written content and surrounded digital versions of content such as scanned newspapers, ledgers, and photographs of bethel and its membership.
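as an illustration only, a story vignette of the kind described above can be pictured as a small record: short written content plus the digitized items it surrounds. the sketch below is python and is not the project's actual schema or code; the class names, field names, location labels, and file path are all hypothetical, and an in-memory dictionary stands in for any real storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MediaItem:
    """One digitized source shown with a vignette (hypothetical structure)."""
    kind: str  # e.g. "newspaper", "ledger", "photograph"
    path: str  # hypothetical location of the scanned file


@dataclass
class StoryVignette:
    """Short written content plus the digitized items it surrounds."""
    vignette_id: str
    location: str  # hypothetical label for a spot in the sanctuary
    text: str
    media: List[MediaItem] = field(default_factory=list)


class VignetteStore:
    """In-memory stand-in for vignette storage (an assumption, not the project's design)."""

    def __init__(self) -> None:
        self._by_id: Dict[str, StoryVignette] = {}

    def add(self, vignette: StoryVignette) -> None:
        # Refuse duplicate ids so an existing vignette is never silently overwritten.
        if vignette.vignette_id in self._by_id:
            raise ValueError(f"duplicate vignette id: {vignette.vignette_id}")
        self._by_id[vignette.vignette_id] = vignette

    def at_location(self, location: str) -> List[StoryVignette]:
        # Return the vignettes placed at one location, ordered by id.
        found = [v for v in self._by_id.values() if v.location == location]
        return sorted(found, key=lambda v: v.vignette_id)


store = VignetteStore()
store.add(StoryVignette("pulpit-01", "pulpit", "placeholder text",
                        [MediaItem("photograph", "scans/placeholder.png")]))
print(len(store.at_location("pulpit")))
```

keeping media as a list of typed items, rather than fixed fields, is one way such a record could hold several kinds of source material without changing its shape.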
in the late fall of , the team turned its attention to creating a database within virtual bethel so that it could outlive the vr team. now virtual bethel can accept additional story vignettes that house various types of data.

https://comet.soic.iupui.edu/bethel/

fig. – . examples of interactive story vignettes

implementation of audio and voice

on martin luther king day, , we were showcasing the latest version of virtual bethel, which then contained story vignettes; we noticed several audience members saying that they would love to “hear” ambient church members and mcgee-lockhart, as a critical part of the vignette experience. as a result, mcgee-lockhart recorded audio for each vignette, discussing the history of specific parts of the church, and the audio was programmed to play upon interaction with the vignettes. this enriched the experience of the space for users and enabled them to hear why each space was so important to its congregants and for the city of indianapolis.

plan for sharing

early in the relationship, because of the inexperience of project partners in both the game development processes and vr, we created contractual protections for virtual bethel that still make it legally difficult to share the project folders with interested parties. our goal was to ensure safe, responsible, and ethical use of virtual bethel files, as we feared that ease of access could lead to unimagined and potentially harmful reuse of the digital files. we wanted to protect bethel’s membership from the possibility that a game or other interactive experience could be made based on the virtual bethel content without their express consent.
preservation community priority versus vr constraints

early in the project, stakeholders unfamiliar with the technical and hardware constraints of vr expected objects in virtual space to have the same level of realism and accuracy that objects have in real life. we explained to project stakeholders and community partners that varying levels of realism can be achieved based on the following factors: student labor force size, ability, and available time. we are very proud of our students’ final version of virtual bethel and its level of realism/believability. in our opinion, it exceeds many other projects in both aesthetics and interaction.

looking ahead

as new technologies emerge, as virtual interactions become simpler to implement, and as alternate realities permit ever-richer experiences, new opportunities are continually emerging for research and application. looking forward, we suggest some areas of research in which to invest prior to building a virtual recreation.

agnostic platforms

project partners and stakeholders at iu used virtual bethel’s methodology to recreate various spaces, but with significant differences; for example, some teams used the unity game development engine, while our team had used the unreal game development engine. this meant that project teams frequently could not use the virtual artifacts and interactions that others created. developing across multiple software platforms can be cumbersome and often leaves one team feeling left behind the other. in mas, we have committed to creating projects in both the unreal and unity game development engines, both to increase the employability of our students and to stay current with trends in these industry-leading applications for next-generation interactions.
ux standards and institutional documentation

there is little documentation for developing platform-specific interaction standards for virtual reality. documenting virtual interactions that do (and do not) work well, sharing the code that creates them, and explaining the reasoning and context behind the choices made during their creation is therefore critical to building a project management and knowledge-sharing infrastructure. what works well in a full vr experience for navigation and interaction must be configured separately for mobile devices and again for web-based virtual experiences on a desktop computer. the virtual bethel team has committed to delivering vr environments for all major platforms and to documenting the processes of creating interactions for each platform to aid future iu teams. creating tools that automate the development of navigation and interaction inputs across vr platforms and devices will expedite the testing, evaluation, and accessibility of virtual reality-supported historical and cultural preservation projects.

advanced capture technologies

capturing 3d artifacts and spaces is becoming cheaper, more efficient, and more accurate every day. for example, we can now use a combination of photogrammetry-produced high-resolution textures with highly accurate spatial data from laser scanning, an approach that was not available to us when the project started two years ago. documenting methods for combining geometry and textures from multiple imaging tools will be important as advanced capture technologies and techniques change over time.

quantifying authenticity

throughout the creation of virtual bethel, our core concern was to recreate the chapel for the congregants.
each of the stakeholders we spoke with (the engineers scanning the space, the librarians scanning documents and artifacts, our students, the bethel membership, and the public at large) had their own definition of what was real, true, or believable in virtual reality. we began to frequently ask the following questions:
• how can we quantify authenticity of virtual objects/spaces for different stakeholders when forced to remake spaces/artifacts for vr experiences?
• what are the dependent variables in defining authenticity?
• do digital born artifacts have authenticity as a digital replica?
• at what point does digitized content made for ar or vr not represent the physical artifact on which it is based?
• how much freedom does a 3d artist have?

much effort was spent educating all project partners and stakeholders on the geometry, texture, and lighting constraints of vr (and limitations of student ability) while also convincing them to capture the highest possible quality of scans for posterity.

we believe that virtual reality will become the interaction medium of choice for audiences to learn about and experience history. watching bethel congregants, students, children and colleagues interact with, gain insights from, and understand the vr medium and how bethel is being preserved in a new way leads our team to believe this is a much more natural, believable, and accessible media environment with which to engage, educate, and entertain. as hardware adoption and the public’s comfort interacting with digital content increase, preservationists will no longer just preserve, but will also have the opportunity to lead the curation of and interaction with the objects, spaces, and time periods they protect (see costa and melotti ; morcillo et al. ).
accessibility for all

media arts and science at iupui is committed to making available and representing its students’ and project partners’ hard work in as many ways as possible. until full head-mounted displays are ubiquitous in homes around the world, we see it as necessary and ethical to create full, mobile, and web-based vr iterations of all of our projects to ensure that all audiences can learn from the virtual spaces we create.

conclusion

throughout its creation, virtual bethel has benefited the faculty, students, librarians, preservationists, community partners, and, most important, bethel church members. positioning a 3d/vr production as a focal point for heritage preservation inspired quick stakeholder buy-in, enthusiasm, and flexibility through collective understanding. all stakeholders embraced this emerging technology as a unique preservation method. the iupui library has become a place to learn about emerging technologies, anticipate trends, and preserve the digital files of productions such as virtual bethel. the virtual bethel project has become an exemplar of what a library can offer its public and how an academic institution can leverage faculty and on-campus resources while integrating its students into authentic and engaging curricular and co-curricular projects. undergraduate students led the vr production day to day, and the result, with a bit of organization and regular community feedback, was more than anyone could have imagined. we could not be happier.

during its creation, virtual bethel inspired six other preservation projects and student teams of varying sizes at iu. virtual bethel’s success has inspired the integration of mas students and faculty in the virtual re-creation of several environments and time periods in and around indianapolis.
the collaboration of community members, historians, preservationists, librarians, student game developers, and 3d artists realizes a new opportunity to develop exciting experiences that are authentic, accurate, and informative, both inside and outside of academia.

we did not just scan the bethel ame church, we did not just document the space, we re-created it. furthermore, we are preserving it in a medium that will far outlive the physical church or anyone on the team. we embedded within the vr environment access to more history than the real space could ever provide, separate from any single historian, member, or moment. we made it possible to add and amend new content to the virtual bethel database at any time. we believe projects like virtual bethel are redefining what the preservation of an endangered cultural heritage site means. the scanning of 2d and 3d objects is the first step in a much larger preservation pipeline, one in which an audience readily accesses a space that no longer exists, listens to voices that can no longer be heard, and holds artifacts that no longer can be held. future audiences will demand to interact with and understand history on their own terms, while a new niche of vr curators will initiate preservation projects, provide access, and steward the experience.

references

copeland, andrea, zebulun m. wood, lydia spotts, and ayoung yoon. . “learning through virtual reality: virtual bethel case study.” presentation at iconference , sheffield, uk, march – , .

costa, nicolò, and marxiano melotti. . “digital media in archaeological areas, virtual reality, authenticity and hyper-tourist gaze.” sociology mind ( ): – . available at . /sm. . .

lombardi, marilyn m. . making the grade: the role of assessment in authentic learning. educause learning initiative. available at https://library.educause.edu/~/media/files/library/ / /eli -pdf.pdf.
mcleod, saul. . “kolb’s learning styles and experiential learning cycle.” available at https://www.simplypsychology.org/learning-kolb.html.

morcillo, jesús muñoz, franziska schaaf, ralf h. schneider, and caroline y. robertson-von trotha. . “authenticity through vr-based documentation of cultural heritage. a theoretical approach based on conservation and documentation practices.” virtual archaeology review ( ): – . available at https://polipapers.upv.es/index.php/var/article/view/ . available at https://doi.org/ . /var. . .

related reading

jennett, charlene, anna l. cox, paul cairns, samira dhoparee, andrew epps, tim tijs, and alison walton. . “measuring and defining the experience of immersion in games.” international journal of human-computer studies ( ): – . available at https://doi.org/ . /j.ijhcs. . . .

looxid labs. . “measuring the power of vr education: when vr classroom needs eeg and eye-tracking technology.” medium (december ). available at https://medium.com/@looxid.labs/measuring-the-power-of-vr-education-when-vr-classroom-needs-eeg-and-eye-tracking-technology- e f.

portnoy, lindsay. . “metrics that matter: how to use vr to transform classroom learning.” medium (april , ). available at https://medium.com/@lportnoy/metrics-that-matter-how-vr-can-boost-classroom-engagement-and-learning-ff c b .

sydell, laura. .
“3d scans help preserve history, but who should own them?” all tech considered (may ). available at https://www.npr.org/sections/alltechconsidered/ / / / / d-scans-help-preserve-history-but-who-should-own-them.

wood, zebulun, albert william, ayoung yoon, and andrea copeland. . “virtual bethel: preservation of indianapolis’ oldest black church.” in research methods for the digital humanities, edited by lewis levenberg, tai neilson, and david rheams, – . cham, switzerland: palgrave macmillan.

using 3d photogrammetry to create open-access models of live animals: 2d and 3d software solutions

jeremy a. bot and duncan j. irschick

abstract

novel technological solutions emerging over the past five years have made it possible to consider previously unheard of re-creations of the world as it exists in its stunning, full-color, 3d topography. the potential value of the rendering of real animals for science, conservation, education, and story-telling is substantial. for scientists, 3d models of live animals provide valuable data for testing theories on body shape and movement, and they represent “avatars” of actual specimens for further analysis.
for conservationists, the ability to use novel technological solutions such as virtual reality (vr), augmented reality (ar), or gaming applications to present real-life animals opens new doors for reaching the public. for educators, the ability to tell stories around a specific animal, instead of a generic animal, is significant. drawing on our work with the digital life project, we describe our process for using open-source software to recreate living animals in 3d, from photocapture to animation.

introduction

the past decade has witnessed an upsurge in efforts to digitally preserve the world, such as 3d scanning of buildings (cyark.org) and corals (thehydro.us). some methods for 3d scanning include magnetic resonance imaging (mri), computed tomography (ct), laser or white light scanning, and photogrammetry (huising and gomes pereira ; bythell, pan, and lee ; gignac and kley ). of these methods, photogrammetry has emerged as a flexible tool that can be used for various research applications (debevec et al. ; bythell, pan, and lee ; dai and lu ). unlike the other methods listed, photogrammetry does not require expensive hardware or software (weinberg et al. ; linder ; falkingham ).

this chapter examines the practice and art of generating 3d models of live animals using photogrammetry, the science of deriving measurements in 3d space using photographs (figure – ). in particular, we focus on the technical workflow using open-source software (i.e., computer programs that are distributed with the original source codes, which allow them to be read and edited in their uncompiled state).

re-creating live animals as full 3d models by means of photogrammetry is not simple.
for entertainment purposes, a 3d artist creates a generic animal model using multiple reference photos from various animals, modeling in as much detail as necessary. for purposes of conservation, science, and education, the 3d artist is under more constraints as they must be faithful to the original details of a single animal. it is also important to realize that most 3d scanned models represent a single snapshot, with the subject frozen in time. this is an issue as animals are rarely static. the ability to record and demonstrate the movement of a living animal enhances storytelling elements and can provide new tools for scientists and educators. for example, a recording of a dinosaur in motion, if such a thing were possible, would expand our understanding of how these creatures moved. as behaviors affect structures, this helps us better understand the animal’s underlying anatomy, and even gives hints to the environments they traversed. thus, in the review of our technical workflow, we also discuss the methods and mindset that we used for re-creating movement.

in this paper we first focus on the process of using 3d photogrammetry for the photocapture of objects. second, we discuss the use of this method and other aspects of photography to photocapture live animals in different settings (e.g., field, laboratory). third, we show how open-source software (e.g., blender) can be used to reconstruct the resulting 3d meshes and then render a model in its final forms to create a vr- and ar-ready “asset” (raitt and minter ; yirci ).

fig. – . a green sea turtle is scanned using photogrammetry software, remodeled, textured, and animated.

uses for 3d models of living animals

lifelike 3d models of animals (i.e., those that appear in death much the same as they do in life) are valuable for several reasons. in many cases, such as with mammals and birds, scanning a dead specimen is very different from scanning a live specimen, both in terms of
the process and the result, whereas in other cases, the distinction will not be obvious. first, these 3d models can represent 3d digital voucher specimens. traditionally, scientists collect live specimens in the field, euthanize them, preserve them with fixatives, and then deposit them in museums. using photogrammetry, researchers can scan live specimens in the field or the lab and then create 3d models that are connected with metadata (e.g., museum accession numbers) for scientific analyses. second, the 3d models can be used for computational modeling analyses, such as testing theories of how sea turtles or sharks swim. third, the models have various educational uses, ranging from virtual and augmented reality (vr and ar) to 3d printing to demonstrate animal form and function (blagoderov et al. ). fourth, the models serve as a powerful tool for animal conservation programs. in the same ways that photography pioneered new ways of understanding animals, the use of vr and ar has similar potential to facilitate the visualization of live animals and demonstrate their uniqueness. finally, the 3d models have potential commercial value, as unique 3d models from animals allow more specific content for various kinds of work, such as commercial movies or video games.

capturing and processing 3d data using photogrammetry

combined with computer software, digital photos can be used to generate a 3d polygon mesh and surface color texture files. many popular photogrammetry software solutions can process all steps in the model-making process, from the capture of initial photos to the production of final textured 3d mesh (sidebar at left).

technology overview

the first step in scanning an object with photogrammetry is to take multiple photos of the subject at varying angles. next, the computer software generates a 3d mesh by finding similarities between sets of photos.
the matched patterns in each photo are triangulated, resulting in the conversion of pixels into points in 3d space. the process enables us to use these point cloud data to construct a polygon mesh. having calculated the coordinates of the points and the positions of the camera, we can project the original photos onto the surface of the mesh to create a color texture map. because photogrammetry relies on the ability to identify similar patterns between each photo, the process works best with detailed, nonreflective surfaces that are viewed under consistent lighting. for example, it is problematic to reconstruct shiny objects or objects with no obvious landmarks (e.g., a white sheet) with 3d software. the process of using photographs and videos to create 3d models via photogrammetry has been covered in other publications (e.g., falkingham, ; baqersad et al., ).

software for photogrammetry and 3d mesh manipulation

photogrammetry software:
• realitycapture by capturingreality, https://www.capturingreality.com/
• photoscan by agisoft, http://www.agisoft.com/

open-source software tools that also handle various portions of the photogrammetry processing:
• visualsfm, http://ccwu.me/vsfm/index.html
• meshlab, http://www.meshlab.net
• bundler, http://www.cs.cornell.edu/~snavely/bundler
• cmvs, http://www.di.ens.fr/cmvs/
• colmap, https://colmap.github.io/
• mve, https://www.gcc.tu-darmstadt.de/home/proj/mve/
• mvs-texturing, https://www.gcc.tu-darmstadt.de/home/proj/texrecon/
• openmvg, https://github.com/openmvg/openmvg
• openmvs, http://cdcseacave.github.io/openmvs
• theia, http://www.theia-sfm.org
• meshroom, https://github.com/alicevision/meshroom
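the triangulation step described in the technology overview can be illustrated with a minimal python sketch. it assumes that a matched feature has already been turned into two viewing rays, each given by a camera position and a direction toward the feature; it then estimates the 3d point as the midpoint of the shortest segment between the two rays. this is a textbook two-ray method chosen for clarity, not the algorithm of any package listed above, and the function name and sample coordinates are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(c1: Sequence[float], d1: Sequence[float],
                         c2: Sequence[float], d2: Sequence[float]) -> Vec3:
    """Estimate a 3D point from two rays c1 + s*d1 and c2 + t*d2.

    Returns the midpoint of the shortest segment joining the rays, which
    equals their intersection when the rays actually meet.
    """
    w0 = [p - q for p, q in zip(c1, c2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w0), _dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays carry no depth information.
        raise ValueError("rays are parallel; cannot triangulate")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    x, y, z = ((u + v) / 2 for u, v in zip(p1, p2))
    return (x, y, z)


# Two hypothetical cameras two units apart, both looking at the same feature.
point = triangulate_midpoint((0, 0, 0), (1, 0, 5), (2, 0, 0), (-1, 0, 5))
print(point)  # prints (1.0, 0.0, 5.0)
```

the parallel-ray check reflects why photos taken from nearly the same position add little: the closer the two viewing directions, the less well constrained the depth of the point becomes.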
single camera

because it is possible to reconstruct only what the lens can see within its viewing angle, the use of a single camera is best suited for situations in which the subjects being sampled are static, such as architecture, sculptures, or other inanimate objects. a static object allows the photographer to move around the subject, taking multiple pictures from various viewing angles, without having to worry that the subject will move between photos. for most living animals, therefore, the use of a single camera is unsuitable. when a subject moves with respect to the background while photos are being captured in sequence, the patterns in the photos will not align properly. if there are misalignments, the resulting 3d reconstruction will have inaccuracies, resulting in a lack of resolution or the presence of visual artifacts or noise. further, lighting quality around a subject, both in the field and the laboratory, can vary considerably. thus, the effective use of photogrammetry as a tool requires some skill in lighting and basic photography.

fig. – . frog surrounded by a multicamera rig built by the digital life project. photograph by christine shepard.

multiple cameras

to capture moving objects (e.g., animals), one method that the digital life team (http://digitallife3d.org) has used is a multicamera rig (figure – ). multicamera rigs can be configured in many sizes and shapes, and they have been most widely used for human 3d imaging, such as the xxarray system by artist alexx henry in los angeles. a typical multicamera rig consists of to cameras on some form of fixed system, such as tripods or sets of / metal rods, with all of them pointing
toward a central area. the cameras are then synchronized using wireless triggers or wires, which ensures that all of them take a photo at the same time. an alternative method is the use of video that is synchronized by a common motion in view of all cameras (e.g., a clapper or a ball drop).

our focus has been on creating multicamera rigs for work with live animals that vary in shape, size, and behavior. the digital life team focused on creating four different kinds of multicamera rigs, which together represent the beastcam technology platform. these rigs are the handheld beastcam, the beastcam macro, the beastcam array, and the beastcam stand. the handheld beastcam is a four-camera system designed in part by kasey smart, dylan briggs, and duncan j. irschick. the system works through the synchronization of four small cameras on flexible arms that are triggered through wires. the beastcam macro was designed by trevor mayhan for creating 3d models of small animals ranging in size from about to inches in body length, such as small frogs or reptiles. the beastcam array, developed by zachary corriveau, can hold more than cameras and was funded in part to create 3d models of small reptiles. this system is best suited for small to mid-sized animals ranging from about to inches in length. finally, the beastcam stand system was created by michael perriera originally for 3d models of houses, but the utility of this system for large animals soon became apparent. each of these systems is field-portable. the inclusion of scale bars next to the model allows the user to reconstruct the scale of the specimen.

aligning images to create an accurate 3d model

especially with live animals, coverage of photos is typically insufficient to capture in detail every portion of an animal.
the capture of additional images over occluded areas with a handheld camera can help to remedy this problem, even when the original animal has moved from the position in which all the cameras were synchronized. in the case of sea turtles, we used two scans (dorsal and ventral sides) along with individual images to create the full 3d model (figure – ). however, because the animal was often in a different pose when the dorsal and ventral sides were scanned, additional work was necessary to integrate the photos/scans into a single model. the animal poses within the individual scans are adjusted so that the meshes align. this is accomplished by finding corresponding landmarks between the portions. these landmarks serve as reference points, allowing the photographer to scale the mesh portions so that they match in size. once the rigid elements of the animal have been used to bring the portions to a consistent scale, the poses of the extremities are aligned with one another. rather than arbitrarily adjusting the points of the mesh, a temporary motion rig (used to allow the model to move) is placed within each portion so that limbs can be rotated into place. this procedure helps reduce the risk of distortions, thus preserving the proportions and anatomical structure of the animal modeled. finally, the aligned mesh portions can be manually stitched together or used as a template onto which a mesh can be drawn.

multimedia data file format definitions

dng: photos taken with our cameras use this raw adobe digital negative format. https://helpx.adobe.com/photoshop/digital-negative.html

obj: we use this open file format developed by wavefront technologies, a popularly used standard for storing 3d mesh data. it includes texture coordinate information (also known as uvs) that maps a 2d texture to the 3d model. when combined with mtl files, the obj file can manage material properties and link to the texture image files that have been applied to the mesh. in our process, we export our scans in this format as it is compatible with nearly all 3d animation software packages. however, it does not support skeletons, skinning, or animation data. http://www.cs.utah.edu/~boulos/cs /obj_spec.pdf

png: for our textures, we use this open standard lossless image format that supports bit color depth with rgb or bit rgb + alpha. it is commonly used for textures in 3d animation because of its color depth and lossless data compression, and on the web as recommended by the world wide web consortium (w3c). http://www.libpng.org/pub/png/

blend: this open-source file format is the native file type for blender, our core 3d open-source animation tool set protected under general public license v . the format supports mesh geometry, uvs, materials, skeletons, skinning, and animation. blender has been slowly growing in popularity since the free version was released in . currently, many game engines and 3d websites are directly supporting blender’s file format. we use .blend files throughout our 3d modeling and animation process. https://docs.blender.org/manual/en/dev/data_system/

fig. – . the dorsal (left) and ventral (right) sides of a green sea turtle are captured in two separate scans using our multicamera setup. the scans are processed separately within realitycapture (photogrammetry software), then brought into the blender animation software for merging and cleanup.
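the landmark-based scaling and alignment of two scan portions can be illustrated numerically. the text describes doing this interactively in the animation software; the sketch below instead assumes a least-squares similarity fit (the umeyama method) with numpy, and the function name and landmark coordinates are hypothetical. it covers only the rigid parts of the animal, not the reposing of limbs.

```python
import numpy as np

def similarity_from_landmarks(src, dst):
    """Least-squares scale s, rotation R, translation t so that
    dst is approximately s * R @ src + t (Umeyama's method)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    S, D = src - mu_s, dst - mu_d
    cov = D.T @ S / len(src)              # cross-covariance of the landmark sets
    U, sig, Vt = np.linalg.svd(cov)
    d = np.ones(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        d[-1] = -1.0                      # avoid returning a mirrored solution
    R = U @ np.diag(d) @ Vt
    var_s = (S ** 2).sum() / len(src)
    s = (sig * d).sum() / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

# Hypothetical landmarks picked on the rigid shell in the ventral scan.
ventral = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 2.0, 0.0],
                    [0.0, 0.0, 0.5]])

# Made-up ground truth: the dorsal scan is larger, rotated, and shifted.
angle = np.radians(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
s_true, t_true = 1.7, np.array([0.3, -0.2, 1.0])
dorsal = s_true * ventral @ R_true.T + t_true

s, R, t = similarity_from_landmarks(ventral, dorsal)
aligned = s * ventral @ R.T + t           # ventral landmarks in the dorsal frame
print(s)
```

with real scans the landmarks never match exactly, so the residual distance between `aligned` and the target landmarks is a useful check on how well the rigid parts agree.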
reducing the polygon count of 3d mesh

the 3d mesh reconstructions from our photogrammetry processing have a high number of polygons, which should ideally be reduced to shorten processing time. the models exported from our photogrammetry software are in the obj file format, contain to million triangle polygons, and include fine surface details mixed with mesh artifacts typically caused by surface reflections in the source photographs (figure – ). using our current camera type and configuration, we have found that meshes exported in excess of million polygons add no useful details, but rather result in more unusable scan noise. though many of today’s computers can display these high-density 3d models, the processing performance of different systems varies widely.

traditional method: simplifying mesh using edge collapsing

processes traditionally used for reducing the number of polygons in a reconstructed mesh involve combining/collapsing the existing polygon edges (yirci ). many of these simplification tools have automated processes that will decimate the mesh to a specified reduction percentage.

fig. – . this . million polygon mesh reconstruction of a pixie frog’s face shows the eyes and a nostril. surface reflections on the skin and eyeball have caused some noise artifacts that appear as sharp bumps, cavities, or wrinkles on the surface.

the decimated mesh is less dense than the original photogrammetry reconstruction (figure – ). though the number of polygons in the decimated mesh is more manageable, the irregular, triangulated structure of the mesh does not produce the best results when used with mesh deformers common in 3d animation (e.g., skinning, lattices). instead of using this approach, we have adopted an alternative method.

fig. – . the . million polygon mesh of the frog shown in figure – would be a challenge to work with on most computer systems (left). using blender’s decimate modifier, we reduced the mesh to / th of its original polygon count, resulting in a workable number of , polygon triangles, which can be viewed or edited on even the most modest of computer setups.

our preferred method: simplifying mesh while optimizing the mesh for deformation

converting the geometry and textures to an animation-friendly format is a task often left to “3d artists,” as it typically involves a manual process of 3d mesh reconstruction commonly referred to as retopologizing. we have two primary goals when simplifying the dense polygon mesh in preparation for animation:

1. reduce the polygon count, enabling better 3d performance of the model in real time while maintaining the accuracy and high-frequency detail of the original scan (via “normal map” texture files).
2. structure the polygon edges in a way that allows for better deformation when the mesh is bound to a skeleton for animation. for instance, in figure – , the benefits of using quad polygons (right) over triangular polygons (left) are clear.

fig. – . a randomly triangulated mesh (left) is not based on the function of the underlying anatomy and therefore produces undesirable wrinkles and bumps in the mesh. a retopologized quad mesh (right) follows the flow of the limbs, with edge loops lining up with the creases in deformation, thus producing fewer undesirable distortions.

a reduced mesh can contain non-manifold geometry or intersecting polygons that occur when the decimate processes attempt to reduce natural high-frequency detail or surface noise generated in the photogrammetry processing. a mesh that has been retopologized to include primarily quad polygons can be structured so that the polygon “edge flow” aligns with the direction of deformation (figure – ). these quad meshes are common in animated 3d models used in film, television, and video games (raitt and minter ).
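one symptom of the non-manifold geometry mentioned above can be checked with a few lines of code. the sketch below is a simplified illustration in plain python, not a feature of any tool named in the text: it counts how many faces use each edge and flags edges shared by more than two faces. the function name and the tiny face list are hypothetical, and the check does not detect non-manifold vertices or intersecting polygons.

```python
from collections import Counter

def classify_edges(faces):
    """Return (boundary_edges, non_manifold_edges) for a polygon mesh.

    faces: iterable of vertex-index tuples (triangles or quads).
    An edge used by one face is a boundary edge; an edge used by
    more than two faces is non-manifold.
    """
    counts = Counter()
    for face in faces:
        n = len(face)
        for i in range(n):
            a, b = face[i], face[(i + 1) % n]
            counts[(min(a, b), max(a, b))] += 1  # ignore edge direction
    boundary = sorted(e for e, c in counts.items() if c == 1)
    non_manifold = sorted(e for e, c in counts.items() if c > 2)
    return boundary, non_manifold

# Hypothetical example: three triangles fanning out from the shared edge (0, 1).
faces = [(0, 1, 2), (1, 0, 3), (0, 1, 4)]
boundary, non_manifold = classify_edges(faces)
print(non_manifold)
```

running a check like this on a decimated mesh before retopologizing gives a quick indication of where cleanup will be needed.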
fig. – . this quad mesh of , polygons shows an edge flow that resembles the underlying muscle structure of a human. the arm (left) shows a slight twisting of the polygons as they move down over the forearm. the blue strip on the forearm resembles the flow of the brachioradialis muscle, which allows the forearm to twist inward and outward. the face topology (right) resembles the muscles that raise the eyebrows, surround the eyes for squinting, and circle the mouth for retracting for a snarl or pulling at the corners of the mouth for a smile. model by angela guenette for the blenderella modeling class (https://ponderstudios.com/ / / /advanced-character-modeling-workflow/), used with permission and licensed via cc by . (https://creativecommons.org/licenses/by/ . /).

retopologizing is traditionally a manual process, though software tools are now available to assist this process by allowing a user to change the mesh structure by drawing curves on the surface. the algorithm guides the reconstruction of the mesh edges into a new quad mesh topology that follows these curves. commercial 3d sculpting software packages that include automatic retopologizing include zbrush by pixologic, 3d coat by pilgway, autodesk’s mudbox, and the foundry’s modo. we use instant meshes (https://github.com/wjakob/instant-meshes), an open-source software solution, to begin the process of retopologizing our high-density meshes. with this software, we can create guide curves and export a reduced mesh consisting of mostly quad polygons with an edge flow that matches our guides. these guides not only simplify some of the cleanup process, but also give us quick edge flow results that align with the shape of the scanned animal (figure – ).

fig. – . in this comparison of the topology of the top of the frog’s head, we can see that the original dense mesh of . million polygons (left) has been retopologized using instant meshes into an isotropic quad mesh of , polygons with an edge flow that preserves the silhouette (right).

in addition to the reduction of polygons and better edge layout for skeletal animation, the resulting edge flow simplifies the process for re-creating the uv maps for the textures.

reconstructing mesh normals and uv coordinates

it is important to accurately reconstruct mesh normals and uv coordinates as these elements dictate the overall color map of a 3d model, as well as ensure that the model can be effectively used in a wide variety of platforms.

uvs

the term uvs refers to uv maps or texture coordinates. the letters u and v denote the axes of a plane, since x, y, and z are used for the coordinates in the 3d space. for a 2d image texture to be applied to a 3d object, each vertex of a 3d mesh contains a pair of numbers (us and vs) that represent the location at which that particular vertex would plot on a scale of 0 to 1 on a square plane. the us and vs refer to the horizontal (width) and vertical (height) coordinates, respectively. a uv map is visualized as a wireframe cage flattened into something that resembles a 2d clothing pattern or animal pelt. marked “seam” edges enclose these pieces (uv “islands,” as blender calls them). many 3d software packages allow the user to create seams on the mesh (the shores to the metaphorical islands) by selecting edges in a 3d viewport and then marking the edges as seams (figure – ). next, running an unwrap function cuts the mesh along its marked seam lines to flatten the mesh into a 2d plane.
this function attempts to limit the stretching of the mapping by ensuring that the relative area and general shape of each uv polygon is similar to that of the source polygon on the initial 3d model.

note: wikipedia defines skeletal animation as “a technique in computer animation in which a character (or other articulated object) is represented in two parts: a surface representation used to draw the character (called skin or mesh) and a hierarchical set of interconnected bones (called the skeleton or rig) used to animate (pose and keyframe) the mesh.”

fig. – . uv texture coordinates on top of the color texture image with seams outlined in red (left) and 3d turtle model with edges marked as seams for automatic layout of uvs using blender’s uv unwrap function (right).

most photogrammetry software has the capability to automatically generate the uvs for the resulting 3d mesh scan. however, these auto-generated seams can result in a map with fragmented texture files (see figure – , left image).

fig. – . the auto-generated uv maps on the scanned mesh are often quite fragmented, and while useful initially, they can be very difficult to work with when texture painting or manipulating the texture file using external image-editing software (left). seams are created on the retopologized frog model by selecting edges and marking them (middle). the color texture information can then be transferred from the original mesh scan to the new retopologized mesh via a baking process, which results in a texture file that is much easier to work with in the texture-editing process, though it may result in some unused texture space (right).

as the retopologization creates a new mesh, it is necessary to create a new uv map.
a newer, cleaner edge flow will allow for seams to be easily marked using edge loop selection shortcuts in the animation software (figure – ). the resulting uv map will usually contain dozens of uv islands, instead of the hundreds or thousands of islands that result from the original photogrammetry scan.

fig. – . the cleaner edge flow that can result from retopologization can also be used to cleanly trim unnecessary geometry from the scan. these split edges now create a boundary that makes it possible to select points on the mesh section to be trimmed, and then the “select linked” function will grow the selection until it hits the split edge boundary (middle). the selected geometry can now be deleted, resulting in a trimmed mesh with a less jagged silhouette (right).

sometimes the new uv islands, if not organized efficiently, can result in some large gaps between the islands; these gaps represent unused texture space. therefore, it is important to compare the original textured photogrammetry scan with the newly textured retopologized mesh to see if there is any loss in resolution or texel density (i.e., number of pixels per square meter in real life). an infographic on texel density is available at https://www.artstation.com/artwork/qboqp.

normals

when describing a 3d mesh, normals are often used to describe the facing direction of a polygon vertex, edge, or face. a face normal is a vector that is perpendicular to its polygon face (figure – ). it is important to have normals facing in a consistent direction (preferably pointing outward), and they can be adjusted manually or automatically. blender has some features that allow the normals to be automatically recalculated, smoothed, or flipped. normals should be adjusted before the baking of the “normal map” in the next section (conversion of high-frequency mesh detail into a 2d image texture).

fig. – . the face normals (red) show a vector that is perpendicular to each face of the retopologized mesh.

recovering high-frequency geometry details

the mesh retopologization reduces the polygon count and improves the edge flow, but the reduction removes the high-frequency details of the mesh. there are a few different ways to recover this detail. the following method brings back the fine details, but applies them to our more deformation-friendly retopologized mesh.

multiresolution and shrinkwrap modifiers to recapture high-frequency details

like other 3d animation software packages (e.g., 3d studio max), the blender 3d animation software package contains modifiers that allow us to alter our geometry without losing the original input mesh data. in most cases, these modifiers can be permanently applied to the mesh or cumulatively stacked, maintaining the original base mesh information and modification history so that we can track our progress. we use two modifiers for transferring the high-frequency scan data into our retopologized mesh. the multiresolution modifier subdivides our mesh into levels, increasing the polygon count exponentially with each subdivision. this modifier can also store mesh edits that have been applied at the various subdivision levels. the shrinkwrap modifier adjusts the affected mesh by morphing the surface to match that of another target mesh.

to retrieve the high-frequency geometry from our original scan, we use the multiresolution modifier to first increase the polygon count of our retopologized mesh. the multiresolution modifier adds subdivision levels where we can store the high-frequency details from our original scan, and it also gives us the ability to use blender’s geometry sculpting tools to smooth out some of the high-frequency surface noise or to add details back into the mesh that were lost during the photogrammetry processing.
pressing the subdivide button within this modifier adds new subdivision levels; we can then use the preview or sculpt slider values to set the number of subdivision levels that we need (figure – ).

fig. – . the multiresolution modifier in blender receives an input mesh (far left) and is able to add subdivision levels, increasing the polygon count in steps to match or exceed that of our original . million polygon scan, eventually providing enough point data to capture the high-resolution data from the original photogrammetry scan.

blender manual, multiresolution modifier: https://docs.blender.org/manual/en/dev/modeling/modifiers/generate/multiresolution.html
blender manual, shrinkwrap modifier: https://docs.blender.org/manual/en/dev/modeling/modifiers/deform/shrinkwrap.html

with the polygon count of our retopologized mesh increased, we can now use blender’s shrinkwrap modifier to match the original photogrammetry mesh. as the name implies, it shrink-wraps the modified mesh by pushing out or pulling in the points of our now subdivided retopologized mesh to match that of the original scanned mesh (figure – ). the shrinkwrap modifier has an option to automatically perform this point-to-surface matching, using projections of the surface normals in the positive or negative directions in the project mode. the new surface position information from the shrinkwrap modifier can then be permanently applied to, and stored in, the multiresolution modifier at a chosen subdivision level by clicking the apply button for the shrinkwrap modifier in the stack. the apply button will make the shrinkwrap modifier’s effects permanent by storing them within the multiresolution modifier (the next modifier in the stack). because this change is made permanent, the shrinkwrap modifier is no longer needed and is automatically removed. the multiresolution modifier stores these modifications within the subdivision level specified in the preview numerical slider.

fig. – . the retopologized mesh with a multiresolution modifier added at only four subdivision levels (left). while previewing four subdivision levels, a shrinkwrap modifier is also added to the mesh, using the original . million polygon photogrammetry mesh as a target (right).

with the high-frequency details applied to the retopologized mesh via the multiresolution modifier, blender’s sculpting tools can now be used. unless the multiresolution modifier’s apply base button is pressed, the modifier will store the sculpting information and keep the base retopologized mesh unaltered. though the information in this modifier can be accessed only from the blender animation software, we can export the retopologized mesh with all the captured high-frequency details and sculpted updates at whatever subdivision level we choose, depending on the level of detail required for our specific needs.

normal maps: storing high-frequency surface detail via image texture file(s)

if a lower polygon count is required, a normal map is commonly used to “fake” high-resolution surface detail. the normal map is a texture applied to the surface of the mesh that interacts with the lighting to create the appearance of high-resolution detail without the need for a high polygon count. this significantly reduces the file size of the 3d asset and improves performance when the model is animated. in most cases, the differences between actual geometry and normal-mapped details are imperceptible to the average user.

the normal map works in combination with the mesh normals (direction in which each polygon face faces) in order to accurately calculate the way light interacts with the surface (figure – ). therefore, if the mesh changes, the normal directions may change, and noticeable problems develop in the way in which light interacts with the normal-mapped surface.
when dealing with organic meshes, the development of visible (hard) polygon edges is the most common problem.

fig. – . the retopologized mesh (left) has a normal map image file (center) applied in order to change the way that light affects the surface, which creates the perception of high-frequency polygon detail for users without increasing the total number of polygons (right).

the creation of a normal map involves two mesh objects: a source mesh that contains the high-frequency mesh details, and a lower polygon target mesh that will receive the normal map with the high-frequency details (figure – ). for this task, we work within a blender file that contains a single mesh object that has the high- and low-resolution details required to make a normal map stored in its multiresolution modifier. to create source and target objects, we duplicate the mesh object. we now remove the multiresolution modifier from one of these objects, but not before determining the polygon density that best fits our destination platform (e.g., games, film, web). testing will be necessary to determine the best performance for a particular platform, but generally a real-time graphics engine requires a lower number of polygons, while video content can handle a higher number of polygons.

on our lower resolution target mesh object, we use the preview slider on the multiresolution modifier to choose a subdivision level that works for our needs and make this permanent by clicking the apply button. to receive the normal map information, we create a new texture for the target mesh object. the bake options are found in the render properties panel. a normal map is baked by selecting both source and target meshes, setting the bake mode to normals, and clicking the bake button. since blender currently has more than one render engine, there are two methods to bake these textures.
in blender’s current state, we have found that the blender internal render engine is faster than blender’s cycles render engine. these normal maps are saved into a png image file format and applied to the low-resolution target mesh in the object’s materials inside of blender, or exported for use in other 3d applications. (texture baking in blender: https://docs.blender.org/manual/en/dev/render/blender_render/bake.html)

fig. – . the high-frequency source mesh and lower resolution target mesh are baked to create a normal map texture that is applied to the lower resolution mesh to capture the high-frequency details without the overhead typically needed for meshes with high polygon counts.

rigging and animating the retopologized mesh

the final steps for bringing our model to life involve the application of movement to a 3d rig. because the details of creature rigging and animation are software specific, we will cover only basic theory, while pointing out considerations that apply to animal scans, particularly those that require the merger of multiple meshes.

creation of the skeletal rig

animation of a polygon mesh requires the creation of a skeletal rig. these rigs are composed of a hierarchy of bones (also known as joints in autodesk’s maya) that give the polygon mesh a site to attach itself. when the joints are animated, each vertex of the mesh follows the influencing bone(s) based on a weight value, causing the mesh to deform. for example, a point that is weighted 100 percent to a single bone will follow the influencing bone by maintaining its relative positioning to the influence (i.e., the force applied to that part of the skeleton), while a vertex that has its weight split between two bones (50 percent to each) will maintain a position halfway between its relative positioning to each of the two influencing bones.
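the weighting behavior described above matches linear blend skinning, a common formulation in which a vertex's posed position is the weighted average of where each influencing bone would carry it. the sketch below assumes that formulation; the text does not state which method a given package uses, and the function names, bone transforms, and vertex are hypothetical values for illustration.

```python
import numpy as np

def skin_vertex(rest_pos, bone_transforms, weights):
    """Linear blend skinning for a single vertex.

    rest_pos: (x, y, z) position of the vertex in the rest pose.
    bone_transforms: list of 4x4 matrices, each moving a point from the
        rest pose to where that bone alone would carry it.
    weights: one influence weight per bone (normalized here to sum to 1).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    p = np.append(np.asarray(rest_pos, dtype=float), 1.0)  # homogeneous point
    posed = sum(wi * (M @ p) for wi, M in zip(w, bone_transforms))
    return posed[:3]

def translation(tx, ty, tz):
    """4x4 matrix for a pure translation (stand-in for a bone's motion)."""
    M = np.eye(4)
    M[:3, 3] = [tx, ty, tz]
    return M

bone_a = translation(0.0, 0.0, 0.0)   # this bone stays put
bone_b = translation(2.0, 0.0, 0.0)   # this bone moves 2 units along x
vertex = [1.0, 0.0, 0.0]

full = skin_vertex(vertex, [bone_a, bone_b], [0.0, 1.0])  # 100 percent bone_b
half = skin_vertex(vertex, [bone_a, bone_b], [0.5, 0.5])  # 50 percent each
print(full, half)
```

the vertex weighted entirely to `bone_b` moves the full 2 units, while the vertex split evenly between the two bones moves 1 unit, halfway between the two candidate positions.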
the process of attaching the mesh to the skeleton is commonly referred to as “skinning.” most 3d animation software has a default or automatic bind process that can be useful for quickly binding the mesh to a skeleton with minimal manual weight “painting” work required. the automatic process looks for bones that are closest to each vertex and automatically assigns influencing weight values to each vertex. to customize the results of the automatic binding algorithms, extra bones can be added to the hierarchy (figure – ). though a skeletal rig can appear similar to a biological skeleton, it is typically different; the rig’s primary goal is to produce the most accurate deformation results on the mesh geometry during animated motion.

fig. – . based on anatomical study and an understanding of mesh deformation via a skeleton (“armature” as blender calls it), bones are placed inside of a turtle mesh. the vertices of the mesh are bound to the skeleton, with each bone’s influence over the mesh displayed as a heat map. additional bones are added inside the shell in order to prevent the turtle’s flippers from affecting the shell, keeping the shell rigid (left). the retopologized edge flow greatly simplifies the adjustment of these influence “weights,” as demonstrated by the edge rings influenced by the turtle’s neck bone (right).

fig. – . two scans capture the turtle in two halves, each in a unique pose (left). the purple mesh is displayed bound to a purple skeleton, while the gray mesh is bound to the gray skeleton via an automatic skinning process and aligned by manipulating the skeletons, which deform the two meshes (right).

in cases where multiple scans must be merged into a single model, a skeleton can be used to pose both halves of the specimen while preserving the proportions of the animal (figure – ).
instead of arbitrarily adjusting the points of the mesh manually, the halves of the mesh are posed using the skeleton to meet at an in-between pose. the alignment of the skeleton and mesh surfaces assists in the validation of the skeleton placement and mesh skinning.

animation

working with video of animal movements, we typically use the rigged/animatable model to create a repeating/cycling locomotion (e.g., lasseter ; kerlow, ). for a land creature, this involves a walk cycle. for most of our marine animals, we create swim cycle animations. though our goal is to create accurate motions, animal movement is dynamic and not easily captured in a single animation. we are currently working on processes to more accurately apply animations to our models, increasing the level of detail and decreasing the time needed to create these animations. when working with animals of a similar structure, we aim to repurpose and refine previously created animations and workflows.

conclusion

in this paper, we have provided an overview of the technical workflow of rendering live 3d animals using photogrammetry. our workflow has value as it enables 3d artists, scientists, and educators to recreate complex 3d geometries in a scientifically accurate manner. the value of the full-color, accurate, and fully rigged 3d animal models is that they represent valuable starting points for many scientific investigations, as well as educational use for demonstrating biodiversity and animal form and function.

references

baqersad, javad, peyman poozesh, christopher niezrecki, and peter avitabile. . “photogrammetry and optical methods in structural dynamics–a review.” mechanical systems and signal processing : – .

blagoderov, vladimir, ian j. kitching, laurence livermore, thomas j. simonsen, and vincent s. smith. . “no specimen left behind: industrial scale digitization of natural history collections.” zookeys : – . available at https://doi.org/ . /zookeys. . .
bythell, john christopher, po-cheng pan, and janice lee. . “three-dimensional morphometric measurements of reef corals using underwater photogrammetry techniques.” coral reefs ( ): – .
dai, fei, and ming lu. . “assessing the accuracy of applying photogrammetry to take geometric measurements on building products.” journal of construction engineering and management : – . available at http://www .hcmut.edu.vn/~ndlong/tk/mat/baibao-baitapnhom/nhom .pdf.
debevec, paul e., camillo j. taylor, jitendra malik, golan levin, george borshukov, and yizhou yu. . “image-based modeling and rendering of architecture with interactive photogrammetry and view-dependent texture mapping.” in iscas ’ : proceedings of the ieee international symposium on circuits and systems, : – . piscataway, nj: ieee.
falkingham, peter l. . “acquisition of high resolution three-dimensional models using free open-source, photogrammetric software.” palaeontologia electronica ( ) t. available at https://doi.org/ . / .
gignac, paul m., and nathan j. kley. . “iodine-enhanced micro-ct imaging: methodological refinements for the study of the soft-tissue anatomy of post-embryonic vertebrates.” journal of experimental zoology part b. ( ): – .
huising, e. jeroen, and luisa maria gomes pereira. . “errors and accuracy estimates of laser data acquired by various laser scanning systems for topographic applications.” isprs journal of photogrammetry and remote sensing ( ): – .
kerlow, isaac. . the art of 3d computer animation and effects. hoboken, nj: wiley.
lasseter, john. . “principles of traditional animation applied to 3d computer animation.” siggraph ‘ proceedings of the th annual conference on computer graphics and interactive techniques, – . new york: association for computing machinery.
linder, wilfried. .
digital photogrammetry: a practical course. berlin and heidelberg: springer-verlag.
raitt, bay, and greg minter. . “digital sculpture techniques.” nichmen graphics. available at http://www.theminters.com/misc/articles/derived-surfaces/derived-surfaces.pdf.
weinberg, seth m., nicole m. scott, katherine neiswanger, carla a. brandon, and mary l. marazita. . “digital three-dimensional photogrammetry: evaluation of anthropometric precision and accuracy using a genex 3d camera system.” the cleft-palate craniofacial journal ( ): – .
yirci, murat. . a comparative study on polygon mesh simplification algorithms. a thesis submitted to the graduate school of natural and applied sciences of middle east technical university. middle east technical university. available at http://etd.lib.metu.edu.tr/upload/ /index.pdf.

what happens when you share 3d models online (in 3d)?

thomas flynn

abstract

as 3d content capture and production become increasingly accessible and affordable, more cultural institutions are contemplating the use of this format alongside established media such as 2d photography. this chapter addresses the possibilities and value of publishing cultural heritage-related 3d data online, in a real-time interactive format. using existing projects and initiatives published on the sketchfab platform by way of example, the chapter provides an overview of 3d within the cultural heritage sector.
introduction

as the cultural heritage lead at sketchfab, much of my work involves promoting the use of 3d within the cultural sector, including libraries, archives, and museums. this includes introducing organizations to the concept of 3d digitization if they do not already have a program in place and extolling the benefits of putting 3d models online if they are already creating digital 3d files.

this chapter will draw heavily on my personal experience advising cultural heritage professionals with regard to 3d and producing 3d content for national institutions; i will offer examples of how individuals and institutions within the sketchfab community are using 3d to achieve their goals. while i heartily recommend a trial of sketchfab to display 3d, regardless of which browser-based 3d viewer is used, these examples highlight the value of dynamic, online interaction with 3d content. (other open-source and commercial viewers are mentioned in the following pages.)

a broad theme to keep in mind in discussing the creation and use of digital 3d models of cultural objects and spaces—especially online—is one of experimentation and, to a certain degree, skepticism. since i began working in the realm of 3d for cultural heritage in , i have seen the boom in 3d production matched with a boom in conferences, think tank activities, and online discussions about various aspects of the technology. at the same time, there is still some skepticism in a variety of fields about the value of 3d as more than just a fad or experimental technique.

defining 3d

the term 3d is often understood to describe various non-interactive and interactive experiences, including 2d renders (i.e., images, video) of 3d data, animated turntable presentations, -degree panoramas, all the way to complete virtual or augmented reality experiences. even when we say something like 3d data, there are questions. are we talking about vector or raster data?
point clouds or meshes? textured or untextured? animated or static? raw or edited data? d scanned or computer generated? with regard to the cultural and historical d content on sketch- fab, the vast majority of d models are surface d models produced via photogrammetry and, to a lesser degree, structured light, laser scanning, and computed tomography. a relatively small portion of the d models are born digital (i.e., a “ d representation of an item that may not have a specific real-world counterpart ... created using digital imaging or drafting software rather than scanning” [devet et al. ]). the variation in understanding may result in part from personal experience. somebody who has grown up with interactive games and computer-generated movies may expect the term d to apply to the experience of d, whereas somebody who has been trained to use d imaging workflows in professional work may generally con- sider d to refer to quantifiable data. understanding the variations in perceived meanings may help better define the best presentation method for d models for a given audience. based on my own experience with d, i would go so far as to suggest that the term d means real-time interaction, that it indicates we are in control of our experience of the model or data and able to move to a different perspective at will. the ability to manipulate and not just spectate is a simple yet key reflection of how we explore physical objects in the real world, and digital d allows this to hap- pen virtually. to experience the full value of true d is through dy- namic, personal interaction. 
being explicit in our definition can help us avoid confusion as the cultural heritage community establishes metadata standards (e.g., the community standards for 3d data preservation [cs3dp] project, the international image interoperability framework [iiif] community group on 3d) and pursues ongoing digitization projects and collaborations. by considering 3d data separately from 3d experiences and services, we can perhaps build stronger and longer-lasting use cases for 3d in the cultural heritage sector.

http://cs dp.org

growing popularity of 3d in the cultural heritage community

with more than museums using sketchfab to share some of their collections in digital 3d, plus thousands more libraries, scientific organizations, individual researchers, imaging experts, and hobbyist 3d scanners using it, sketchfab has become a go-to tool for the cultural heritage sector to display and disseminate 3d data online. as of october , , there were , 3d models in sketchfab’s cultural heritage & history category, , of which are downloadable under one of the following creative commons licenses:

• cc by—attribution: depending on the non-commercial (nc), no derivatives (nd), and share alike (sa) choices, others may share, edit, and use the model, but they must give you credit for the original work.
• cc by-nc—noncommercial: others cannot use your model commercially.
• cc by-nd—noderivatives: others may use and share the model, but it cannot be altered.
• cc by-sa—sharealike: depending on the nc choices, others may share, edit, and use the model, but derivative work must be shared under the same license. (sketchfab a)

combined, the top ten most viewed 3d models in the category have been viewed more than , , times.

the popularity of sketchfab within the cultural sector can be attributed largely to the fact that it is a free, easy-to-use service with web browser and embed support.
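the four license options listed above can be summarized as a small lookup table. the following is a minimal sketch, not part of any sketchfab or creative commons tooling; the names `LicenseTerms`, `CC_TERMS`, and `reuse_allowed` are hypothetical, and the table covers only the four licenses named in this chapter. attribution is required under all four, so it is not modeled as a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    commercial_use: bool  # may the model be used commercially?
    derivatives: bool     # may the model be altered?
    share_alike: bool     # must adaptations carry the same license?


# Only the four licenses named in the text; attribution applies to all.
CC_TERMS = {
    "CC BY": LicenseTerms(commercial_use=True, derivatives=True, share_alike=False),
    "CC BY-NC": LicenseTerms(commercial_use=False, derivatives=True, share_alike=False),
    "CC BY-ND": LicenseTerms(commercial_use=True, derivatives=False, share_alike=False),
    "CC BY-SA": LicenseTerms(commercial_use=True, derivatives=True, share_alike=True),
}


def reuse_allowed(license_code, commercial, modified):
    """Return True if the intended reuse is compatible with the license."""
    terms = CC_TERMS[license_code]
    if commercial and not terms.commercial_use:
        return False
    if modified and not terms.derivatives:
        return False
    return True


# Example: selling an unmodified copy of a CC BY-NC model is not permitted.
nc_commercial = reuse_allowed("CC BY-NC", commercial=True, modified=False)
```

a check of this kind only captures the coarse permissions. the share-alike flag, for instance, would still need to be honored when a derivative is republished.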
by supporting 3d annotations, audio, and animations, plus making it possible to view any uploaded model on the web in virtual reality (vr) and augmented reality (ar), the free sketchfab app gives cultural organizations all they need to tell their stories using the objects and spaces in their care through 3d.

it can be inferred from the trending data in figure – that the annotation functionality—the ability to add a clickable hotspot to a particular part of a 3d model—first piqued cultural organizations’ interest in the sketchfab platform. when clicked, the hotspot displays a text or image pop-up and transports the 3d viewer perspective to a user-defined angle, zoom, and pan (figure – ). this feature enables the creation of virtual tours around a 3d model, with contextual and other information linked directly to 3d navigation. it may also suggest that clever 3d technology must be underpinned by a relevant use case that truly adds value in a given application. what use is a 3d model on its own? only when the model is presented with contextual information that is strictly linked to the 3d nature of the media does it attract users from the cultural sector.

https://iiif.io/community/groups/ d/#about https://sketchfab.com/models/categories/cultural-heritage-history anyone can join sketchfab on the free basic tier (https://sketchfab.com/plans); sketchfab offers free pro tier subscriptions and hosting to all cultural heritage organizations (https://sketchfab.com/museums). http://creativecommons.org/licenses/by/ . / http://creativecommons.org/licenses/by-nc/ . / http://creativecommons.org/licenses/by-nd/ . / http://creativecommons.org/licenses/by-sa/ . /

fig. – .
the jericho skull—3d model by the british museum on sketchfab showing an active annotation https://skfb.ly/rghd

fig. – . monthly sign-ups to the museum segment on sketchfab, –

the ability to add annotations makes a 3d model more than the sum of its data; it creates a self-contained educational artifact, which is sometimes the main reason that organizations use sketchfab to share 3d content online. for example, jonathan r. hendricks, director of publications at the paleontological research institution and adjunct associate professor in the department of earth and atmospheric sciences at cornell university, explains the significance of the digital atlas of ancient life program:

the next step of our project is to annotate the existing models and incorporate them into both the textbook and into planned virtual teaching collections, which will allow students access to organized, curated collections of virtual fossils when the real things are not available. (jonathan r. hendricks, personal correspondence, september , )

the second uptick in museum sign-ups in late can be attributed to the launch of sketchfab’s cultural heritage program, which offers free pro tier subscriptions for all museums and cultural institutions. this offer means that these organizations can use sketchfab for free with the benefit of larger file uploads, more annotations, private models, and other advantages.

a final increase in sign-ups from museums could also be attributed to the addition of support for webvr (a new web standard for browser-based vr content) in , which allows users to click a button on any online sketchfab model and jump into a vr mode, viewable on smartphone/cardboard devices as well as on dedicated headsets. although the long-term value of vr may still be under debate, the addition of webvr to sketchfab may have interested audiences beyond cultural organizations in the platform.
this means that anyone producing 3d content could also become a producer of vr experiences, with no coding required. the same 3d model viewable on a regular screen can now be presented in a way that affords a level of presence (the feeling of visiting a place or space physically) and an impression of scale for the viewer.

certain functionality has not found widespread use. for example, the ability to add machine-readable tags to plot 3d models on a map would seem like a very practical feature, but it is seldom used at this point. indeed, most institutions do not even tag their models on sketchfab in a consistent manner or take advantage of sketchfab’s apis to link 3d model data to their online collection databases to keep them connected and up-to-date.

the increasing need for meaningful ways to display 3d models, in particular 3d models of historical and cultural content, is also confirmed by the existence of several other online 3d viewers alongside the sketchfab viewer, which was launched in . these include the 3dhop ( ) and the universal viewer ( , 3d supported as of ). both offer open-source and cultural heritage-centric 3d viewing experiences for those interested in self-hosting a 3d viewer. the emergence of dedicated 3d viewers from large commercial companies, such as google (poly, ), microsoft (remix 3d, ), and facebook (3d posts, ) suggests an even wider audience for all kinds of 3d content.

https://labs.sketchfab.com/experiments/map/ https://sketchfab.com/developers http:// dhop.net https://universalviewer.io

audiences for historical 3d content

by publicly sharing 3d data online for viewing, download, and reuse, we gain insight into the active audience for such historical 3d models.
while researchers often acquire d data as part of academic study or for condition documentation during collection assessments, a much larger and more general audience is to be found online for cultural and historical d data. this suggestion arises from obser- vations of how users interact with cultural heritage d models on sketchfab and how they share and reuse the same models both on- line and offline. use by publishing organizations the simplest way to put d models online and show them to au- diences is to use a d viewer. sometimes, as with the harvard semitic museum, minneapolis museum of art, and château de versailles, this means adding links or embedding d models from respective institutional profile accounts on sketchfab in official col- lection pages and exhibition websites. the royal academy of fine arts (madrid) and the museo archeologico nazionale di napoli employed a simple, but elegant, in-gallery solution, displaying web- based d models of original artifacts alongside physical plaster casts by using internet-connected tablets (marqués ). at other times, a third party uses d models published by one or more institutions. sarah bond, associate professor in the classics department at the university of iowa, uses d models from multiple sketchfab users in her undergraduate and graduate classes, in place of physical replicas. she explains the relevance for her course: i have begun to integrate d models of inscriptions into my courses ... d models and digital humanities approaches to material culture provide ample opportunity for transporting students and the general public to “visit” and then translate inscriptions in situ. while nothing will ever replace doing squeezes and rubbings on-site, these are a close second when used in a browser, on a mobile device, or loaded into a vr viewer. (bond ) services like sketchfab provide a framework for existing web teams to create sophisticated d interactive displays. 
for example, https://poly.google.com https://www.remix d.com/discover?section= b f e ab e ab ffaa https://developers.facebook.com/docs/sharing/ d-posts https://semiticmuseum.fas.harvard.edu/ d-models https://collections.artsmia.org/art/ /the-doryphoros-italy http://www.chateauversailles.fr/grands-formats/hameau-de-la-reine https://www.remix d.com https://developers.facebook.com/docs/sharing/ d-posts https://semiticmuseum.fas.harvard.edu/ d-models https://semiticmuseum.fas.harvard.edu/ d-models https://collections.artsmia.org/art/ /the-doryphoros-italy http://www.chateauversailles.fr/grands-formats/hameau-de-la-reine http://www.chateauversailles.fr/grands-formats/hameau-de-la-reine https://poly.google.com https://www.remix d.com/discover?section= b f e ab e ab ffaa https://developers.facebook.com/docs/sharing/ d-posts https://semiticmuseum.fas.harvard.edu/ d-models https://collections.artsmia.org/art/ /the-doryphoros-italy http://www.chateauversailles.fr/grands-formats/hameau-de-la-reine what happens when you share d models online (in d) the natural history museum in london used the sketchfab viewer api and models hosted on the platform to power a custom in-gallery touch screen for teaching visitors about whale skeletons (shakiry and capewell ). outreach placing d models online is an extension of the work that museums, archives, and libraries already do to make their location-specific col- lections accessible to a wider audience. being able to measure this outreach in some way is even more valuable. each model on sketch- fab has some publicly visible statistics: the number of times the mod- el has been viewed anywhere online and the number of likes and comments from members of the sketchfab community. for example, at the time of this writing, the d scan granite head of amenemhat iii published by the british museum had been viewed more than , times, had been downloaded , times, and had garnered likes and comments since its publication in . 
it is notable that most of the likes are from nonacademic users. to put the cul- tural heritage content on sketchfab in context with the wider com- munity on the platform, consider that the sketchfab profiles in the organization: museum user segment (the label users give them- selves upon signing up to the service) represent less than percent of the more than million registered accounts on the platform. the total number of models in the cultural heritage category (uploaded by both institutions and individuals) accounts for about percent of the total number of d models uploaded to sketchfab. in addition to the direct views on sketchfab, top embed referrals to the granite head of amenemhat iii model include creativecommons. org; an educational resource site from alabama community college, .cgchannel.com (a computer graphics community site); and open- culture.com (an aggregator for publicly available online educational media). while potentially accessed by academics and museum staff, these sites are unlikely to be referred to as serious research portals. the thousands of viewers that the sites referred to the granite head of amenemhat iii model on sketchfab suggest a wider viewership for historical d data. sketchfab is, as far as the author knows, the only community in which it is possible to leave a comment directly related to an individ- ual collection item—rendered in d or otherwise—from a museum, library, or archive. by commenting on d models, users show an interest beyond simply viewing items in a collection. comments on the granite head of amenemhat iii model range from simple thanks for the opportunity to view and download the d data to critiques of the data quality, questions about digitization workflow, and links to the data in new contexts. sketchfab also allows any user to create their own groupings of d models from other users in collections. such collections have many purposes. 
they may be simple collections of favorite models, https://sketchfab.com/models/ d b b e d de a https://skfb.ly/b eu https://skfb.ly/b eu https://sketchfab.com/models/ d b b e d de a what happens when you share d models online (in d) collections created for specific uses (e.g., for use as artistic refer- ence ), or collections intended to unite similar collections from dif- ferent institutions. interestingly, the centre des monuments nationaux (cmn) in france does not publish d models on sketchfab, but has created collections of d scans made by other users of monuments under its care. although some institutions and estates are protective of their collection’s intellectual property, cmn wanted to encourage unoffi- cial crowd-sourced digitization efforts. according to mélisande vial- ard, mission stratégie, prospective et numérique at cmn, at the moment, we don’t have a lot of models nor data because we are just starting to think about it. we would like to know more about how to work with the communities on sketchfab especially because some of our monuments are already on your platform like the sainte-chapelle or the towers of la rochelle and it would be great to encourage these initiatives. (mélisande vialard, personal communication, july , ) as d digitization tools become more accessible and simpler to use, many are taking up d scanning as a hobby and, increasingly, unofficial collections of historical d content are being posted online. as is the case with the models curated by cmn, hobbyist or com- mercial d scans fulfill a need where institutional capacity is lacking. there are, however, concerns about the diffusion of inaccurate or incomplete data. daniel pletinckx, the cultural technology expert at visual dimension bvba suggests that qualified objects could get a visible quality stamp. ... the important point here is to connect with the ch [cultural heritage] domain, so that ch experts start to feel connected and responsible for d ch resources. 
this would also stimulate d producers to deliver quality as the quality level of some items is questionable. (daniel pletinckx, personal communication, september , ) on occasion, sketchfab has received and acted upon valid con- tent take-down requests, as occurred when the artists rights society, representing henry moore, requested the removal of several d scans of the artist’s work from the platform. in this case, the copy- right claim is clearer than, for example, when referring to ancient monuments. however, sketchfab is digital millennium copyright act (dmca)–compliant and has a clear process for resolving any in- fringement claims (sketchfab b). in tandem with the decision to accept d models generated by the general public is the decision to release d data to the general public for download and reuse. institutions publishing on sketch- fab have the option to enable d models to be downloaded under https://sketchfab.com/demoon/collections/ref_sculpture https://sketchfab.com/tlatollotl/collections/aztec https://sketchfab.com/lecmn/collections https://sketchfab.com/lecmn/collections https://sketchfab.com/lecmn/collections https://sketchfab.com/demoon/collections/ref_sculpture https://sketchfab.com/tlatollotl/collections/aztec https://sketchfab.com/lecmn/collections what happens when you share d models online (in d) several types of creative commons licenses listed earlier. along with individual researchers and hobbyists, cultural organizations have made more than , cultural heritage–related d models available for download and reuse. the act of releasing d content under a generous license (i.e., permitting modification and reuse) means that cultural content can be reused in ways that are often unexpected. 
to take just a few ex- amples, the effect of the british museum releasing many of its d scans for download led to a bust of zeus hosting a youtube show as a vr avatar (smith ), d printed museums in a school classroom (quince ), a hologram of sir robert bruce cotton (interactive studio ), and a livestream of a d print of a pacific island god statue (museotechniki ). in addition to the reuse of d scans for simple fun, there are examples of reuse for what might be considered more serious or worthy purposes. archivist abira hussein has incorporated scans released by the british museum in her work healing through archives in multimedia webvr and tactile experiences, combining the d with d imagery, panoramas, and audio interviews with the so- mali immigrant population in london. some institutions, such as réunion des musées nationaux (rmn)-grand palais and the grand rapids public museum, have begun to license d data for commercial reuse on the sketchfab store. while the commodification of cultural data is not new, the monetization of a relatively new medium—and one that can be used to create convincing replicas of cultural objects—has prompted some comment. ethan gruber, director of data science at the american numismatic society, tweets his response to the sale of cultural d content: “beyond the ethical implications for profiting from heritage acquired (or stolen) via colonialism, you may run aground of the law in various jurisdictions regarding indigenous art” ( ). the ethical question is bound up with the monetization of muse- ums in general and should be discussed along with such topics as in- stitutional funding streams, paid exhibitions, gift shops, and existing licensing models for other media and replicas. regarding the legal question, sketchfab (like most online media platforms) has channels in place for anyone to report rights violations and regularly acts on such reports to remove content and review licenses. 
where there are no ethical or legal concerns, there is an opportunity for promoting the reuse of 3d content while creating a new revenue stream for cultural organizations.

https://sketchfab.com/models/ eb e b f d a eac b e; https://twitter.com/nebulousflynn/status/ https://sketchfab.com/francecollections/store https://sketchfab.com/grpm/store for example, 2d image licenses from the museum of modern art (http://www.scalarchives.com/web) and the guggenheim museum (http://www.artres.com) https://twitter.com/abirahussein

libraries and 3d

like museums, many libraries and archives are currently publishing 3d content. the british library (figure – ), national library of scotland (figure – ), cambridge digital library, and state library of queensland have all found that they have something in their print and special collections that can be experienced digitally in 3d, and these are just the institutions on sketchfab.

fig. – . jane austen desk—open view by the british library on sketchfab

fig. – .
the stag (stage set of ) by national library of scotland on sketchfab

https://sketchfab.com/britishlibrary https://sketchfab.com/natlibscot https://sketchfab.com/camdiglib https://sketchfab.com/slq https://skfb.ly/ xyxz https://skfb.ly/ zf g

publishing content online in 3d opens up new and unique ways for the general public to interact with artifacts. adi keinan-schoonbaert, digital curator (polonsky fellow) for the hebrew manuscripts digitisation project at the british library, explains in a blog post:

while a museum is more of a usual suspect for these novel technologies, libraries are perhaps less so. they are perceived to hold books, manuscripts, documents, or in short—compilations of two-dimensional text. but nothing physical that a library holds is in fact two-dimensional, and some items kept in libraries may be of unanticipated nature. libraries have more potential to engage with 3d modelling and printing than one would expect. (keinan-schoonbaert )

although publishing 3d data can be seen as part of libraries’ general goal to engage the public with their collections, it should also be considered (where appropriate) as a legitimate, yet underexplored way to fulfill research outreach targets.

3d can augment or add value to 2d media, but the value is limited by the kind of 2d or image artifact in question. two examples illustrate the point.
figure – depicts a reuse of miniature of dunstan as a bishop, writing a commentary of the rule of saint benedict, with an inscription “s[an]c[tu]s dunstanus,” adapted from the british library’s catalogue of illuminated manuscripts. in this case, the additional material information has been added to the d data of the manu- script to allow a viewer to experience changes in the appearance of gold leaf on the manuscript as it is viewed from different angles. this could be said to be a more authentic experience of the manuscript compared with an on-screen or printed reproduction of the same. fig. – . miniature of dunstan as a bishop, adaptation by thomas flynn http://www.bl.uk/catalogues/illuminatedmanuscripts/illumin. asp?size=mid&illid= https://skfb.ly/ fl http://www.bl.uk/catalogues/illuminatedmanuscripts/illumin.asp?size=mid&illid= http://www.bl.uk/catalogues/illuminatedmanuscripts/illumin.asp?size=mid&illid= http://www.bl.uk/catalogues/illuminatedmanuscripts/illumin.asp?size=mid&illid= http://www.bl.uk/catalogues/illuminatedmanuscripts/illumin.asp?size=mid&illid= https://skfb.ly/ fl what happens when you share d models online (in d) figure – illustrates the digital morphing of a map from with more modern elevation data added and contour lines overlaid. it is possible then to see the difference in the way that a particular area of land has been documented over time and the possible devia- tion in survey accuracy. thus, figure – illustrates an artistic addi- tion to augment the user experience, while figure – shows how the addition of scientific data can enable the user to view a historic docu- ment in a new way. fig. – . planta da cidade de ouro preto - mg – , adaptation by rolling drone geotecnologias conclusion the potential use cases for d data have not yet been entirely ex- plored. 
by putting d models online and making use of existing platforms and functionality, however, we are starting to get a sense of current possibilities and limitations, and to learn what audiences react to and enjoy. the biggest benefit of putting d content online may be that it increases the visibility of, and in turn engagement with, d data. as examples of the use and reuse of d data are circulated, their value and usefulness will be challenged and contested. while the simple “technical magic” of d digitization can impress and excite now, the novelty will wear off and the focus will move to the ques- tion of how d can help achieve existing institutional goals. in the last few years, many people who are producing or work- ing with d data in the cultural heritage sector have moved from “wow!” to “why?” as they assume a more critical view of the value of d capture and display. as individual researchers; d community groups; museum, archive, and library staff; and those working in in- ternational think tanks take up the task of establishing standards and https://skfb.ly/ biqy https://skfb.ly/ biqy what happens when you share d models online (in d) recording project outcomes, we move toward a time when d data will play as ubiquitous a role as imagery, video, and text currently do in our everyday work. references bond, sarah e. . “replacing the squeeze? teaching classical epigraphy with d models.” history from below (blog), january , . available at https://sarahemilybond.com/ / / /replac- ing-the-squeeze-teaching-classical-epigraphy-with- d-models/. devet, katie, jasmine clark, julie hardesty, andrea thomer, and jon blundell. . “ d metadata,” presentation at cs dp-forum . ann arbor, michigan, august – . gruber, ethan [@ewg ]. . “beyond the ethical implications for profiting from heritage acquired (or stolen) via colonialism, you may run aground of the law in various jurisdictions regard- ing indigenous art.” twitter, june , , : pm. 
available at https://twitter.com/ewg /status/ .

interactive studio. . british museum–buste 3d de sir robert bruce cotton–hologramme 3d interactif, video, : , posted may , . available at https://www.youtube.com/watch?v=ft- o qa .

keinan-schoonbaert, adi. . “can’t judge a book by its cover? perhaps you can!” asian and african studies blog, the british library, may , . available at http://blogs.bl.uk/asian-and-african/ / /cant-judge-a-book-by-its-cover-perhaps-you-can.html.

marqués, néstor f. . “new technologies for old collections.” cultural heritage, sketchfab (blog), march , . available at https://blog.sketchfab.com/new-technologies-old-collections/.

museotechniki. . live streaming of the 3d printing of a’a figure. video, : : , livestreamed march , . available at https://www.youtube.com/watch?v=irgwpvokvdy&t= s.

quince, graham. . “sketchfabrication: printing egyptian treasures in the classroom with sketchfab.” community stories, sketchfab (blog), november , . available at https://blog.sketchfab.com/sketchfabrication-printing-egyptian-treasures-in-the-classroom-with-sketchfab/.

shakiry, shia, and ben capewell. . “nhm london: building custom 3d interactives with the sketchfab api.” api spotlight, sketchfab (blog), september , . available at https://blog.sketchfab.com/nhm-london-building-custom- d-interactives-sketchfab-api/.

sketchfab. a. “downloading models.” sketchfab help center. last modified october , . available at https://help.sketchfab.com/hc/en-us/articles/ -downloading-models#licenses.
sketchfab. b. “report violation.” sketchfab help center. last modified september , . available at https://help.sketchfab.com/hc/en-us/articles/ -report-violation.

smith, trevor f. . spaciblō show . : quick design testing. video, : , posted january , . available at https://www.youtube.com/watch?v=oqbp fgmo g&list=plmkl fhlolh b it-zerk gacbyl zxlhq.
building for tomorrow: collaborative development of sustainable infrastructure for architectural and design documentation

ann baird whiteside

abstract

the use of 2d and 3d computer-aided design (cad) and building information modeling software is now routine in architecture and design firms and in design education programs. cad is particularly problematic for the libraries, museums, and archives responsible for the long-term management of design documentation as cad is highly volatile, relying on proprietary mathematical algorithms to represent shapes and structures, and is packaged in complex, proprietary, and rapidly evolving software products that are expensive, digitally encrypted, and obsolete within years. architectural museums and archives are facing a rapidly growing need to preserve digital information and are grappling with the need for technological tools, technical expertise in digital preservation, autocad expertise, archival expertise, and repositories that can preserve and disseminate the archived data. many institutions, especially smaller ones, lack the technical infrastructure and expertise to implement scalable preservation of design records.

a community-based approach to building an infrastructure comprising technology, digital preservation strategies, standards, and education for archivists of these collections is critical if we are to preserve digital architectural records at scale.
in , the frances loeb library at the harvard university graduate school of design applied for a grant from the institute of museum and library services, building for tomorrow, to support the convening of a national forum under the national digital platform funding priority during . the grant supported two priority-setting meetings of engaged stakeholders to frame a national/international collaborative infrastructure to support the long-term preservation of digital design data, specifically in the architecture and design fields.

background

the preservation and management of architecture and design records have undergone significant change over the last years as architects, engineers, and other designers move to the use of digital technology for design documentation. archivists and librarians responsible for keeping this documentation have long been concerned about how it will be preserved. in architecture and design projects, many types of digital files are produced during planning and construction, and these files are important for long-term preservation for future renovations/restorations and scholarly research. computer-aided design (cad) is particularly problematic for the libraries, museums, and archives responsible for the long-term management of design documentation because cad is highly volatile, relying on proprietary mathematical algorithms to represent shapes and structures, and packaged in complex, proprietary, and rapidly evolving software products that are expensive, digitally encrypted, and obsolete within a few years. libraries and archives are increasingly under pressure to acquire these twenty-first century collections to support the next generation of architectural students and historians.
since the introduction of cad software in the s, industries that design and develop our built environment have been moving from pencil and paper to computers and digital files. the earliest adopters of the new technology were the aerospace and automotive industries; they were followed enthusiastically by the fields of architecture and design. cad allows architects to take previously unimaginable risks in their designs and to experiment with new forms and materials without building prototypes or performing expensive structural analyses until much later in the process. u.s. architects such as frank gehry led the way, and schools of architecture have been teaching the technical expertise needed to unite architecture, engineering, and software design. this has led to a new generation of architects who leverage technology and has opened new doors for innovation in design.

the use of 2d and 3d cad and building information modeling (bim) software is now routine in architecture and design firms. designers typically use multiple types of cad software throughout the design process, which is characterized by phases: concept, schematic design, design development, construction, and as-built. for many architects and firms, a building project now comprises tens of thousands of digital files that include 2d drawings; 3d models; video; images; and communications among architects, clients, contractors and other parties, including e-mail messages, contracts, specifications, requests for information, and architects’ supplemental instructions. 3d models may consist of multiple interrelated files, requiring deep knowledge to understand, that are usually held by the project architect. in addition to 3d cad models, there are hundreds or even thousands of detailed 2d-layer drawings produced for particular aspects of a building; there are 3d printed objects, and there are project “outputs” (for example, drawings or sketches of the building).
there are photographs and videos of the construction site, websites about the building, bims, and other multimedia related to the project. with cad, architects and designers can modify files throughout the design lifecycle, allowing for a less linear and more flexible design process and more innovative design. although the architecture profession has standards for managing design files, firms often use conventions that are easiest to use within the firm and may not be easily translatable to those outside the firm.

several challenges are associated with the software in the architecture profession. there are multiple programs used throughout a design project; the software has highly robust functionality, which also makes it difficult to preserve; and the software is proprietary, making migration of the data to other software systems difficult. the software relies on complex mathematical algorithms to represent shapes and structures. furthermore, software products change rapidly, and they are expensive, encrypted, and quickly become obsolete. alex ball, research officer at ukoln in the united kingdom and attached to the digital curation centre (dcc) based at the university of bath, described the issues in this way:

the issue of poor interoperability between cad systems and between versions is exacerbated by the rate of software development. in order to maintain a competitive edge, there is constant pressure on cad vendors to release new versions of their software with increased functionality or fewer limitations. not only does this create instability regarding file formats and their interpretation, it also means that individual versions of cad packages can become obsolete rather quickly (ball , ).
to preserve the records of significant building projects completely, all the digital information should be captured and linked or packaged together into a collection that can be easily searched and, once found, navigated and preserved over time.

in the analog and early digital world, the contractual deliverable to clients was a set of printed, wet-signed and wet-stamped drawing sets. we have moved to a model of electronically signed 2d and 3d files that can be manipulated to convey information that is at least as detailed as traditional printed plans. further, over the last five years, students in architecture and design schools have been routinely using cad for modeling, skipping the 2d drawing process entirely, meaning that the coming generation of architects will be producing documentation only in 3d models, adding urgency to the problem of preserving this type of documentation.

the impact of the shift to an entirely digital architectural production workflow on the record of architectural innovation and practice (in architecture libraries, archives, and museums) is only beginning to be understood. no longer can libraries acquire blueprints or drawings, a few images, and a scale model or two to represent a major work of architecture in their collections. now they must acquire the 3d cad models and 2d drawing files, bims, digital images, videos, and documents, delivered on a computer hard drive, often with no annotation whatsoever.

fortunately, the standards for cad are advancing so that options are emerging to represent cad drawings and models in ways that achieve a degree of interoperability across systems and time. these standards are complex and offer many trade-offs.
such standards include iso : electronic document management—design and operation of an information system for the preservation of electronic documents–specifications; iso : information and documentation—data exchange protocol for interoperability and preservation; industry foundation classes (buildingsmart international ); standard for exchange of product model data iso (step tools, inc. ); and initial graphics exchange specification (iges ). different software programs support different standards, and each standard supports different aspects of the represented design. archiving these digital design files raises a host of questions about the purposes that the digital designs should serve, their authenticity, and the best ways to manage such assets technically in the digital future.

to complicate matters further, multiple cad software vendors supply the architecture, engineering, and construction industry, including autodesk, inc., bentley systems (microstation), dassault systèmes (catia), and others. as with all digital software, file format obsolescence is a barrier to our ability to archive many digital design files. architectural museums and archives are faced with a rapidly growing need to preserve digital information and are grappling with the needs for technological tools, for technical expertise in digital preservation, for autocad expertise, for archival expertise, and for repositories that can preserve and disseminate the archived data. over the last two decades, the risk of losing this portion of our digital cultural heritage has grown tremendously.

history of previous work ( – )

over the last years, architectural practitioners, archivists, and technologists have been working on the problem of preserving digital design work. the art institute of chicago initiated the first project in the united states; it lasted from to and was conducted by kristine a. fallon, an architect.
the research project of the massachusetts institute of technology (mit), future-proofing architectural computer-aided design (facade), took place from to and was conducted by technologists, librarians, and architects. the society of american archivists design records section has been doing related research and surveys in the archival community since . the international confederation of architecture museums (icam) conference addressed these issues in several presentations. in november , the library of congress, the architect of the capitol, and the national gallery of art hosted a summit, designing the future landscape: digital architecture, design and engineering assets, at the library of congress. finally, the canadian centre for architecture has led the effort for the profession, using tools and techniques from the digital preservation community.

art institute of chicago

in , the curatorial department of architecture at the art institute of chicago undertook a study to identify requirements for creating and maintaining an archive of born-digital objects, with architect kristine a. fallon as principal investigator.

the first step of the study was to conduct a survey to determine how design firms were then using digital design tools, what types of digital design data were being produced, how important digital design data were to understanding the design process, and whether digital methods were the primary way that the firms did work. if digital methodologies were not already primary in their work, respondents were asked whether they thought that digital methodologies would be primary within five years. they were also asked to list the specific software products that they used in each use category. survey respondents reported using digital design tools in all categories identified in the survey.
the most frequently cited were in the areas of communication/presentation ( percent) and documentation ( percent). the least frequently cited use was for rapid prototyping ( percent). more than half of the respondents reported using digital working methods primarily in data gathering, documentation, communication/presentation, and design exploration (kristine fallon associates ).

the second step was to conduct in-depth case studies of projects ranging in scale from industrial design to urban design at nine u.s. design firms. the case studies showed that digital design tools are integral to the design process and digital images are central to design decision-making.

the third step was to validate that the findings drawn from the case studies could be generalized to the broader design community. this was done through an international survey, in which staffs at design firms were asked how they used digital design tools, how important the tools were to their practices, and which products they used.

the team also conducted research into earlier archiving projects and existing standards, methodologies, and products for collecting and archiving digital design data. it was found that no museum or archival institution had solved the key problem of ensuring long-term preservation of the numerous and rapidly changing data formats from digital architecture projects.

basing their assessment on the open archival information system (oais) reference model for a long-term data repository system (iso : ), the team identified six distinct stages of the workflow (planning/programming, design, construction, closeout/commissions, operations/maintenance, and disposal) for bringing digital design data from design office to museum or archive and for making it accessible to the public.

the report included recommendations on procedures, technology and related requirements, and a start-up implementation plan.
a summary of the recommendations includes the following guidelines:

• the design practitioner should organize, name, and maintain design data in such a way that a curator or archivist can discern the contents of data files and the time sequence in which they were produced. the designers should preserve important outputs (drawings, images, and animations presented to clients) in archival formats.
• once the archive and a design firm have defined the content of a gift of digital design data, the design firm should prepare the submission information package (sip). sips should contain the content files and some level of descriptive information, including file-naming standards or project directory structure for a given set of files, and should be sent to the archive for ingest.
• recommended archival formats suitable for inclusion in the sip include the following:
  • pdf/a and tiff
  • mpeg
  • png
  • bmp
  • extensible 3d
  • universal 3d

the art institute of chicago’s report provides thorough recommendations and requirements for the implementation of a complete preservation plan for digital design records. recommendations are included for institutional funding, hardware and software infrastructure, digital storage, data maintenance, cataloging and access, and personnel. the recommendations distribute the workload of digital preservation between creators and archivists, and assume firms and collecting institutions will have the technical infrastructure and technology expertise required.

mit facade project

in , the mit libraries received an institute of museum and library services (imls) grant to develop a strategy for processing and preserving the digital outputs of architectural projects involving 2d, 3d, and other digital files.
the facade project team worked with the offices of frank gehry, moshe safdie, and thom mayne to create a collection of digital material from these major architects. the collection was to include different 3d cad modeling tools that the architects had used in their normal work practices and that could be used as a research test bed. the project team was provided with design files for specific projects for the research.

the research explored the best way to associate 3d designs with related 2d drawings, digital images and videos, e-mail messages and other communications, and bims. specifically, what techniques should be applied to preserve native cad models over archival timeframes? is it necessary to preserve software, or is an emulation framework required? what additional process information is required to capture the building lifecycle, and how can that be stored in digital archives? what other annotations must be supported to retain the architect’s intentions and instructions to contractors and subcontractors who do the construction, and how should this information be kept? how do we archive this information into digital preservation repositories and make it accessible?

five major areas of investigation were defined for the project: (1) analysis, identification, and description of major cad formats; (2) analysis, design, and implementation of native cad file ingestion; (3) management, preservation, and dissemination practices; (4) analysis and recommendations related to process documentation (relationships among various cad files and versions, and between cad files and other documentation); and (5) training, outreach, and dissemination of results to the digital library and digital preservation communities.
material was acquired on a hard drive or set of dvds, in the file system in use by the firm, and without annotation to help determine what was included. of the test collections acquired, the size ranged from just under , files ( gb) to almost , files ( gb) for a building in progress. the 3d cad models in particular were each very large (comprising one or more separate files), but usually few in number. the 2d cad drawings and other files were smaller, but numerous. each firm responded to requests for project files differently, depending on their own practices. as happens in the analog world when collections are acquired, some firms simply turned over all the files; others had already culled the project files for their own archives, in which case the team acquired a smaller set consisting of what the firm considered important to keep. ideally, the team would acquire complete sets of data that included not just the designs and client presentations, but other archival material that is often of high historical value.

for each building, the team sought material from all stages of the project, including concept design, schematic design, design development, construction documents, and construction administration. while 3d models were the focus of the research on digital preservation, the context provided by the other materials in the collection (e.g., client presentations, correspondence with clients and contractors, and digital images) was key to understanding the models. unlike analog architectural drawings, which show through line drawings the design intent of an architect, cad drawings and models are intricate and multilayered, with layers combined to produce one file. the intent of the architect’s design is integrated into the layers of the files and into the references between files. as a result, the combination of files, rather than a single file, is necessary to understand the architect’s design intent.
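
The point that design intent lives in the references between files can be illustrated with a small sketch. The Python below assumes that the references between CAD files have already been extracted into a simple mapping; the file names and the `references` mapping are hypothetical and are not taken from the FACADE collections. It collects every file that would need to be kept alongside a given model for that model to remain interpretable.

```python
from collections import deque


def files_needed(root, references):
    """Return the set of files reachable from `root` by following references.

    `references` maps a file name to the file names it refers to.
    This is an illustrative structure, not a real CAD API.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in references.get(current, ()):
            if target not in seen:  # the `seen` set also guards against cycles
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical project: names and links are invented for illustration.
references = {
    "tower_model.dwg": ["core.dwg", "facade_panels.dwg"],
    "facade_panels.dwg": ["panel_detail.dwg"],
    "core.dwg": [],
    "site_photo.jpg": [],
}

needed = files_needed("tower_model.dwg", references)
print(sorted(needed))
```

A breadth-first walk is used here only because it is simple and tolerates circular references; how references are actually discovered inside a native CAD file depends on the format and is outside this sketch.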
the project team developed a set of recommended best practices to support a preservation strategy that covered all the materials received in a building collection. these recommendations included

• special processing of 3d cad models to generate derivative versions with greater long-term archiving potential than the native software format. the team identified the need for four versions with distinct formats to ensure long-term preservation:
  1. original (the originally submitted version of the cad model)
  2. display (an easily viewable format to present to users, normally 3d pdf)
  3. standard (full representation in preservable standard format, normally ifc or step)
  4. desiccated (simple geometry in a preservable standard format, normally iges)
• semi-automated conversion processing of other key design file formats (e.g., 2d drawings into pdf files).
• automated conversion processing of common digital file formats (e.g., microsoft office documents and jpeg images) as part of archive ingest.
• no special processing for the remaining classes of file formats, although these will come under more generalized digital repository preservation strategies outside the scope of facade’s focused concerns.

for the project, the team was able to acquire all the cad software products used by the architects who contributed to the research collection, and the team had valid access to those products throughout the project. should an archive need to keep cad software in perpetuity to view older cad models, the archive would have to continue buying license keys for the software forever and hope that those cad companies do not go out of business. this is obviously not a realistic strategy for long-term preservation; ideally, access to software will be maintained for many decades.
the team discussed this issue with representatives of several of the leading cad software companies, and they were open to the idea of escrowing unrestricted copies of the software with appropriate libraries and archives. the team concluded that this was the best avenue to pursue.

the team performed a detailed case study of emulation as a strategy on the accudraw software on the apple ii platform (long since obsolete), and the team was able to view accudraw models by running the software in a simulated environment. the process and lessons learned were documented in detail, and the team felt that this approach was technically viable for preserving modern cad software and data. however, the issue of legal access to the software via license keys remains a significant barrier.

community discussions made it clear that many institutions, especially smaller ones, lack the technical infrastructure and expertise to implement the preservation models developed by the facade team and the art institute of chicago. a key take-away was that a more community-based approach to building an infrastructure with technology, digital preservation strategies, standards, and education for archivists of these collections is critical if we are to preserve digital architectural records at a national or international scale.

following the facade project, mit and the harvard graduate school of design did further work in and . this included developing a more robust set of metadata fields for the description of digital design files and continuing work on a prototype of the original accessioning tool built in the first facade project. the work was funded by a small grant from the harvard library and did not continue because of a lack of funding and changes in personnel at each institution.
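
The four-version scheme recommended by the FACADE team earlier in this section lends itself to a simple completeness check at ingest. The sketch below is only an illustration under stated assumptions: the role names and typical formats follow the list in the recommendations, while the `Derivative` class, the `missing_versions` helper, and the file names are hypothetical and are not part of any FACADE tool.

```python
from dataclasses import dataclass

# Roles and typical formats as listed in the FACADE recommendations.
FACADE_VERSIONS = {
    "original": "native CAD format as submitted",
    "display": "3D PDF",
    "standard": "IFC or STEP",
    "desiccated": "IGES",
}


@dataclass
class Derivative:
    """One stored version of a CAD model (hypothetical record type)."""
    role: str
    filename: str


def missing_versions(derivatives):
    """List the recommended roles that have no stored derivative yet."""
    present = {d.role for d in derivatives}
    return [role for role in FACADE_VERSIONS if role not in present]


# Hypothetical model with only two of the four versions generated so far.
stored = [
    Derivative("original", "museum_model.catpart"),
    Derivative("display", "museum_model.pdf"),
]
todo = missing_versions(stored)
print(todo)
```

A check like this says nothing about whether a conversion preserved the geometry faithfully; it only records which of the recommended versions exist for a given model.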
society of american archivists design records section

in , the society of american archivists design records section launched a cad/bim task force as a catalyst for a community-wide initiative to address the numerous legal, technical, and curatorial issues of born-digital architectural records. the work of the task force to date includes a survey of firms and architectural archives that was used to learn about holdings and current archival practices for born-digital design data, the development of a born-digital studies bibliography, and a report of the task force (leventhal and zalduendo , ).

icam conference

in september , icam held its conference in montreal and new york. the opening session was devoted to the topic of archiving born-digital architectural materials. representatives from five international institutions of varying sizes described efforts at the institutional level to deal with digital design data. they discussed such issues as collecting strategies, software curation, and tools for archiving and description. the overall impression from across the presentations was that the complex problems inherent in preserving digital design data are multilayered and comprise many general challenges. each institution was developing its own processes and methodologies, and many were not working across allied domains, such as digital preservation, from which they could adapt solutions and standards. the presentations made it clear that the problems would be better worked on collaboratively across institutions and across domains to take advantage of expertise at both the national and the international levels.

library of congress

in november , the library of congress, the architect of the capitol, and the national gallery of art hosted a summit, titled designing the future landscape: digital architecture, design and engineering assets, at the library of congress (leventhal ).
the summit brought together stakeholders in the architecture, design, and engineering professions, from creators to curators, to explore the issues and obstacles of long-term preservation and access to the records of their projects and to begin working toward sustainable solutions. the critical issues that arose through the summit were the need to identify the full array of digital design files created; the need to determine which design records, and specific types of data or information in those records, the various stakeholders need in the immediate and long-term future; and the need to develop better communication and information-sharing practices, which are critical to developing sustainable solutions to problems with the long-term preservation, access, and use of digital design files.

http://www.icam-web.org/data/media/cms_binary/original/ .pdf

having so many representatives from the range of digital design communities engaged in discussion at the summit indicated how pressing the issues of preservation are, as we risk losing vital cultural heritage information. discussions highlighted the fact that advancements in digital preservation tools, such as bitcurator and archivematica, and collection management tools such as archivesspace, offer technological support for the preservation of digital design files. the communities at the summit also recognized the importance of working with colleagues across domains to support the preservation efforts.

building for tomorrow

the research that has been done from various perspectives since has paved a path for understanding the digital preservation needs for 3d architecture and design files. we have tested different preservation methods and tools, and we have begun to engage the communities, from those working in creation through those working in preservation and access.
to further this work, the frances loeb library at the harvard university graduate school of design (gsd) applied for a grant from imls in to support the convening of a national forum under the national digital platform funding priority during . the grant supported two priority-setting meetings of engaged stakeholders (architects, architectural historians, archivists, librarians, technologists, digital preservationists, and others) who are framing a national/international collaborative infrastructure to support the long-term preservation of digital design data, specifically in the architecture and design fields. the infrastructure includes the ongoing integration of knowledge, standards, technologies, and management across generations of technology and practice. the first meeting, a day-and-a-half-long forum, was held immediately prior to the society of architectural historians annual conference in st. paul, minnesota, in april . the outcomes of the forum will be published in spring along with a set of strategic directions and actions forming the basis for a strategic plan of community-based work in the area of preservation for digital design records.

at a second meeting, a steering committee met may – , , at the harvard university gsd. the goals of the meeting were to refine the group’s strategic directions and actions, and to determine its immediate next steps. a significant area of focus in the plan is the development of connections to other communities that have intersections with this work, including the software industry, those working in the realm of 3d and vr, and those who provide digital preservation tools and technologies, in order to leverage expertise and find commonalities across disciplinary domains.
https://www.imls.gov/sites/default/files/grants/lg- - - - /proposals/lg- - - - -full-proposal-documents.pdf

an outcome of the may steering committee meeting was a decision to build a formal coalition of stakeholders. a proposal for the structure of a building for tomorrow coalition of allied communities is being drafted and will be shared for community comment as part of the engagement strategy. participants in building for tomorrow will engage design records community members who could not be present at the forum, specifically software vendors with whom we want to engage, and will convene a meeting of those working on three current imls projects focused on 3d preservation (building for tomorrow, community standards for 3d preservation [cs3dp], and developing library strategy for 3d virtual reality collection development and reuse [lib3dvr]). we will also convene another meeting of the steering committee, which we hope to use as a jumping-off point to formalize a digital architecture, design, and engineering (dade) coalition.

conclusion

over the last year, as a result of the community work in this area and through the building for tomorrow meetings, several steps have been taken to help move this work forward in a community-based model. we have the tools, technologies, and technological expertise to preserve digital architecture, design, and engineering data, and we can apply them to this domain.
several efforts in the united states focus on developing standards for preservation for 3d data, and we need to collaborate across our communities to leverage the expertise across those domains. we need to garner the engagement of designers in practice to incorporate thinking about preservation at a far earlier stage in their careers. we need to push harder to engage the software industry in order to align incentives for preserving software with the need to preserve the files created using the software. we need to think about sets of services around the preservation of digital design files to meet the needs of institutions of various sizes. and, to be successful, the preservation of digital design files must be an effort across domains and institutions.

references

ball, alex. . “preserving computer-aided design (cad).” dpc technology watch report - (april): .
buildingsmart international. . “ifc overview.” accessed june , . available at http://www.buildingsmart-tech.org/specifications/ifc-overview.
iges (initial graphics exchange specification). . techopedia, accessed june , . available at https://www.techopedia.com/definition/ /initial-graphics-exchange-specification-iges.
iso (international standards organization). . “iso : electronic document management - design and operation of an information system for the preservation of electronic documents - specifications.” available at https://www.iso.org/standard/ .html.
iso (international standards organization). .
“iso : information and documentation - data exchange protocol for interoperability and preservation.” available at https://www.iso.org/standard/ .html.
kristine fallon associates, inc. . “collecting, archiving and exhibiting digital design data.” art institute of chicago. available at http://www.artic.edu/sites/default/files/ c.pdf.
leventhal, aliza, for the library of congress. . designing the future landscape: digital architecture, design & engineering assets. a report on the architecture, design and engineering summit organized by the library of congress, the national gallery of art and the architect of the capitol on november – , at the library of congress. available at http://loc.gov/preservation/digital/meetings/designingthefuturelandscapereport.pdf.
leventhal, aliza, and inés zalduendo. . “draft bibliography on studies dealing with legal, technical, and curatorial issues related to born-digital architectural records.” society of american archivists cad/bim taskforce. available at https://www .archivists.org/sites/all/files/ar% taskforce_born% digital% studiesbibliography_al+iz_finaldraft_revised.pdf.
leventhal, aliza, and inés zalduendo. “cad bim task force report.” society of american archivists design records roundtable. available at http://www .archivists.org/sites/all/files/ _cadbimtaskforcereport.pdf.
step tools, inc. . “step standard - iso .” accessed june , . available at http://www.steptools.com/stds/step/.

related reading

smith, mackenzie. . “final report for the mit facade project: october –august .” mit. available at https://www.architectuurarchiefvlaanderen.be/sites/default/files/projecten/bijlagen/bib_ _facade_final.pdf.
smith, mackenzie. . “can cad be saved? preserving digital designs is harder than you think.” arcca, the journal of the american institute of architects, california council ( ): – . available at http://aiacc.org/wp-content/uploads/ / /arcca _ .pdf.
3d/vr preservation: drawing on a common agenda for collective impact

jessica meyerson

chapter abstract

digital
curation is now part of the repertoire of all computationally dependent domains, requiring mechanisms for alignment to address common digital curation challenges that are beyond the scope of any individual organization or domain. this chapter explores the collective impact methodology (ci) and its application to cultural stewardship and digital curation challenges within and beyond the 3d and virtual reality (vr) discourse. to demonstrate the relevance of ci to 3d/vr preservation, the chapter provides a ci case study on software preservation that is experiencing early success in aligning efforts across sectors toward the long-term preservation and reuse of software. the chapter concludes by posing questions for consideration by 3d/vr practitioners as well as possible next steps to address one or more 3d/vr data curation challenges.

introduction

in a article, john wenzler argues that one of the greatest challenges to the realization of a sustainable scholarly communications infrastructure is the collective action dilemma: the inherent difficulty in looking beyond localized, organizational needs toward “‘conscious coordination’ in the management of the scholarly record” ( ). wenzler’s article inspired focused responses, including one regarding the development of financial mechanisms to sustain open scholarly infrastructure (lewis ) and another advocating the use of social alignment mechanisms to direct the focus of individual institutions toward common goals (neylon ). the collective action dilemma is characterized by wenzler, lewis, and neylon as the primary challenge to the sustainability of an open scholarly infrastructure. it is also a challenge faced by cultural stewards engaging in digital curation across the information management landscape.
digital curation is now part of the repertoire of all computationally dependent domains, necessitating alignment mechanisms to address common digital curation challenges that are beyond the scope of any individual organization or domain (digital curation centre ; poole ). 3d and virtual reality (vr) data preservation is treated here as both a digital curation use case (positioned at the intersection of some of the most difficult digital curation challenges, including scale, standards and interoperability, and hardware and software dependencies) and a boundary-spanning area requiring alignment of interests across a broad range of stakeholders, including the software industry, hardware manufacturers, academia, cultural stewardship organizations, standards bodies, policymakers, publishers, and (re)users of 3d/vr collections.

collective impact

as an alternative to the isolated impact of single organizations acting alone, collective impact (ci) is a collective action approach to complex social challenges that accounts for conflicting institutional or domain-specific priorities by design (kania and kramer ). kania and kramer ( ) argue that successful ci initiatives share these five conditions:

1. a shared vision, or common agenda that engages all of the stakeholder communities
2. a shared system of measurement that serves to measure the effectiveness of varied stakeholder activities and outcomes toward advancing the common agenda
3. mutually reinforcing activities that leverage the unique contributions of each stakeholder group in a way that complements and amplifies the work of all stakeholder groups
4. constant communication to demonstrate a commitment to fairness and evidence-based decision-making over time
5.
a backbone organization with paid staff to plan, coordinate, and sustain the effort through facilitation, technological infrastructure, and data collection, synthesis, and reporting

in this chapter, 3d/vr preservation refers to (1) preservation of the means (parameters, data, and software) to render the volume and surface of a 3d object mathematically and (2) preservation of the “technological system(s) that use interdependent hardware and software to place users in a computer-generated, three-dimensional environment that is immersive and interactive” (campbell , ).

infrastructure studies make a useful frame for expanding the potential scope of stakeholder communities for software. specifically, in their in-depth study of the worm online community, star and ruhleder ( ) outline several key characteristics of infrastructure that apply to software, including embeddedness, transparency, reach, and scope. they also note that such infrastructure should embody standards and be learned as part of a membership, be built on an installed base, be fixed in modular increments, and be visible on breakdown.

the greatest concentration of ci case studies is found in the fields of education, public health, homelessness, youth development, urban and economic development, community development, and the environment (cchd ). one of the longest running well-documented applications of ci is strive together, “a national, nonprofit network of community partnerships” working “to ensure that every child succeeds from cradle to career, regardless of race, income or zip code.” strive together represents years of experimentation, testing, and refinement of their “theory of action,” which serves as an invaluable resource for understanding the evolution of ci over time in response to shifting social and economic circumstances.
through this and other examples, john kania and mark kramer ( ) have argued that ci emphasizes “complexity and emergence” as two characteristics of system-level challenges that ci is particularly well suited to address.

telescoping the 3d/vr literature: common data curation challenges for complex data

3d/vr is a collective action problem, evidenced through a specific set of challenges that individual institutions or individuals cannot tackle, including scale, standards and interoperability, and hardware and software dependence. we need a collective action solution to address these challenges if we want to progress past highly constrained and localized 3d/vr curation environments to ensure 3d/vr preservation over the long term.

according to lischer-katz, the challenge of archiving vr objects arises from the “complexity of [these] data objects, in terms of the variety of data types, relations between files, and dependencies on a variety of hardware and software platforms” ( , ). while 3d/vr is associated with disciplinary, format, and platform-specific challenges, many of the digital curation challenges cited in the 3d/vr literature are common for a broad range of complex digital data. these challenges include the implications of scale for stewardship and access, standards and interoperability, and dependency on specific hardware and software environments for meaningful access. by positioning 3d/vr preservation at the nexus of these common data curation challenges, we identify areas where collective action can make the greatest impact on long-term preservation.
https://www.strivetogether.org/about/

in “collaborating for equity and justice: moving beyond collective impact,” tom wolff and coauthors summarize the limitations of ci, including the “failure to cite advocacy and systems change as core strategies, engage those most affected in the community as partners with equal power, and directly address the causes of social problems and their political, racial, and economic contexts.” the authors propose six principles for collaborative practice that promote equity and justice. these six principles inform a broader framing of ci discussed throughout the text of this chapter (wolff et al. ).

scale

the challenge of scale has many facets in 3d/vr preservation, including file size, volume of files, number of possible platforms and formats, and the geometric complexity of the environment or object that is represented by a 3d/vr rendering. while geometric complexity is unique to 3d/vr preservation, the volume, size, and range of potential formats to be curated or preserved for a single vr environment represent well documented digital curation challenges. for example, in the field of digital archives, digital corpora for a single user’s storage media can easily come to tens of thousands of files (sloyan ). in the context of distributed research project teams, the interdisciplinary zuse institute berlin (zib) points out that “the trend in large-scale research data management constitutes a paradigm shift from previous local data storage in file systems and databases to the coordinated orchestration of unstructured, high-volume data maintained in distributed sites.”

standards and interoperability

in a digital curation ecosystem in which numerous tools are applied to a growing volume of digital assets at each phase of the workflow, standards and interoperability are key.
the tools themselves frequently change, and granular metadata facilitates the reuse of outputs from one tool to another. the business case for standardization and interoperability is based on two well-grounded assumptions: (1) standards facilitate interoperability by allowing the exchange of information across contexts, and (2) interoperability facilitates flexibility in the approach to data curation because the interpretation of the outputs of curation activities is not dependent on a given tool. for example, audiovisual preservation professionals have been particularly concerned with the persistence of embedded metadata due to varying levels of support for the metadata across the range of software applications used to render audiovisual files (lacinak and forsberg ). similarly, in 3d/vr-based practice, “digital watermarking” describes a family of techniques that embed uniquely identifiable information about an information object in the object itself to address “security aspects of data and user authentication or data integrity protection” (steinbach, dittman, and neuhold ). digital watermarking makes it possible both to detect unauthorized uses of the intellectual property (e.g., the underlying 3d models) and to track the chain of custody through distributed technological infrastructures. tools to support access to working 3d representations and reproducibility of those representations would have to be able to track specific instances of a model across systems. however, this level of interoperability relies on standardization around the specific properties and, in some cases, the machine-readable encoding standard used to represent these metadata.
http://www.zib.de/research/large-scale-data-management-curation-analysis

according to the principal investigator and the co-investigators of the community standards for 3d data preservation (cs3dp) project, there are currently no agreed-upon standards for the preservation of 3d data (moore et al. ). more specifically, establishing agreed-upon schemas for provenancial metadata remains a challenge (cohen-boulakia et al. ). an agreed-upon provenancial schema would facilitate interoperability among systems and tools included in the workflow, as well as interoperability of execution environments for reproducibility or replay. provdata is a schema adopted by some virtual heritage practitioners; however, equivalent standards do not exist in every domain that employs 3d/vr technologies (koller, frischer, and humphreys ). one can hope for a common set of extensible elements that could permit searches for complex digital objects across domains (e.g., all models within the same coordinate system) and within a single domain (e.g., all models of a particular monument that have a low uncertainty quotient), similar to the implementation of preservation metadata maintenance activity (premis) in digital repository and preservation systems to track preservation “events” over the life of a digital object. as community infrastructures grow to include distributed networks of shared software and data across participating organizations, it is crucial not only to track provenance and preservation events associated with assets across their life, but also to make that metadata discoverable. such a system feature enables organizational and individual users to know the source location and to use the provenancial chain of events associated with the lifecycle of a digital object as a second-order data set.
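the kind of search and event tracking described above can be pictured with a small python sketch. it is only an illustration under stated assumptions: the record fields, the 0.0 to 1.0 uncertainty scale, and the function names are hypothetical, the sample values are made up, and the event structure is loosely inspired by the idea of preservation events rather than taken from the premis data dictionary or any agreed-upon schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationEvent:
    """One event in an object's life (field names are illustrative, not PREMIS)."""
    event_type: str   # e.g. "ingest" or "format migration"
    date: str         # ISO 8601 date, so string order matches time order
    agent: str        # tool or organization responsible


@dataclass
class ModelRecord:
    """Hypothetical descriptive record for one 3D model."""
    identifier: str
    domain: str
    subject: str
    coordinate_system: str
    uncertainty: float  # assumed scale: 0.0 (certain) to 1.0 (highly uncertain)
    events: List[PreservationEvent] = field(default_factory=list)


def in_coordinate_system(records, coordinate_system):
    """Cross-domain search: every model that shares a coordinate system."""
    return [r for r in records if r.coordinate_system == coordinate_system]


def low_uncertainty_models(records, domain, subject, threshold):
    """Single-domain search: models of one subject at or below a threshold."""
    return [
        r for r in records
        if r.domain == domain and r.subject == subject and r.uncertainty <= threshold
    ]


def event_chain(record):
    """Return the record's events in chronological order."""
    return sorted(record.events, key=lambda e: e.date)


# Made-up sample records, for illustration only.
records = [
    ModelRecord("m1", "virtual heritage", "monument a", "EPSG:4326", 0.1,
                [PreservationEvent("format migration", "2020-05-01", "tool x"),
                 PreservationEvent("ingest", "2019-01-15", "archive y")]),
    ModelRecord("m2", "architecture", "building b", "EPSG:4326", 0.6),
    ModelRecord("m3", "virtual heritage", "monument a", "local grid", 0.7),
]
```

because the events are ordinary structured data, the same records that support discovery can also be exported and analyzed in their own right, which is the sense in which the provenancial chain becomes a second-order data set.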
software and hardware dependence

according to the authors of “research challenges for digital archives of 3d cultural heritage models”:

anecdotal evidence suggests that digital models that are only two or three years old are losing their original functionality and information richness because of poor archival practice. cad software products used to create virtual heritage models belong to a rapidly changing market segment, which means that updated versions are inevitable (on the order of every months) and the full reuse of data (without loss of information) is not secured at all. (koller, frischer, and humphreys , : )

likewise, findings from the pooling activities, resources and tools for heritage e-research networking, optimization and synergies (parthenos) project include an existing and “important gap between open-source solutions . . . and the opaque alternatives of commercial black boxes . . . raising the question of our software dependency” (alliez et al. ).

https://www.loc.gov/standards/premis/
for example, software heritage, datacite re3data, eaasi.
for an in-depth study of the longevity of computer-aided design, refer to smith .

software and hardware dependencies have been similarly discussed in other data curation contexts, including time-based media conservation (laurenson ), archives and special collections (meyerson et al. ), and peer review in light of an expanding definition of a well-formed research object (bailleul , ). of the common 3d/vr digital curation challenges listed earlier, software dependence arguably has the broadest reach. all sectors and disciplinary domains use some form of software as part of their work process. while the function and user community for a specific software title varies widely, some software dependency challenges span domains.
the transition from a shared acknowledgment of the need for collaboration to a sustained effort aimed at system-level change is one of the key social infrastructure challenges that ci is meant to address. the american library association ( ), the educopia institute (lippincott and skinner ; skinner, drummond, and halbert ; skinner and mccain ), and the institute of museum and library services ( ) are among those who have embraced ci as a conceptual framework within cultural stewardship. of these, the cultural stewardship ci case studies with the greatest extensibility to 3d/vr preservation have been undertaken by educopia institute, which has translated its strategies to collective action challenges that are topical rather than regional in scope. the software preservation network (spn) is the most recent affiliated community of educopia, and is actively applying the ci framework to “saving software, together.”

software preservation network: a case study in collective impact

before detailing spn’s application of ci, it is critical to outline the recurring challenges found in the software preservation discourse (meyerson, potterbusch, and work ). layering major events in digital preservation over the software preservation timeline highlights broader trends that have directly impacted the creation, preservation, and reuse of software and digital data, including changes to u.s. copyright law that favor copyright holders (aufderheide et al. ); the period of annual price reduction for computational resources (koller, frischer, and humphreys ); the growth of the institutional repository landscape (lynch and lippincott ); increasing acquisition of hybrid collections (redwine et al.
); funder requirements for data sharing and reproducibility of computationally dependent research (national institutes of health ; national science foundation ); and a shift away from migration and toward emulation as a long-term digital access strategy (bergmeyer ; bwfla ; granger ; rosenthal ; rothenberg ; waters and garrett ).

whether your data is a simulation of climate change impact on ocean sea levels or a cad file from or deus ex, there are four major software preservation challenges that recur in the discourse over time, are applicable to any domain, and must be addressed through programmatic and strategic collaboration. these challenges are as follows:

challenge 1: no single institution can feasibly locate, much less acquire or procure, all the software needed to address its existing born-digital data.
implication: coordinating collection development and sharing of software resources among cultural stewardship organizations is essential for long-term preservation.

challenge 2: intellectual property regimes in different nations conflict.
implication: leveraging existing legal tools across national contexts (and sharing a body of anecdotal evidence from researchers and practitioners, which is crucial to advocating for the expansion of stewardship and user rights) is necessary to ensure that intellectual property law does not restrict the preservation of and access to software.

challenge 3: as distribution models change from physical installation media to software as a service, software libraries and executables are squarely and exclusively in the control of software producers.
implication: articulating and aligning the shared needs/interests of cultural stewardship organizations in order to represent community needs and capabilities in conversation with the software industry is essential to ensure that software producers do not dictate preservation decisions.
challenge 4: there are no broadly adopted standards for describing the technical, provenancial, and relational properties of software.
implication: to make software discoverable and preservable, it is essential to analyze the body of existing descriptive metadata standards for software and identify their common properties. once a content-level iso standard has been developed, creators and practitioners can use it to map their localized metadata implementation to the broader software preservation ecosystem.

in response to these challenges, the software preservation network builds on existing collections and digital preservation infrastructure to support the specific preservation and access needs of software.

common agenda

a common agenda is a shared vision or goal for change that engages all of the stakeholder communities around a problem area. historically, software preservation efforts have focused largely on technological developments or on attempts to tackle the full range of challenges within a single domain. spn exists as a commons for sectors to align and prioritize resource investment around shared challenges. this in no way obviates the need for more domain-specific initiatives, tools, working groups, and organizations. instead, we want to help amplify that work and funnel some of that collective energy toward international-scale challenges to effectively promote cross-domain discussions that have representation from all of those areas. the common agenda that unites spn’s participants and supporters is to preserve software through community engagement, infrastructure support, and knowledge generation. spn tackles this agenda across five core activity areas: law & policy, metadata & standards, training & education, research-in-practice, and technological infrastructure. communication, governance, and advisory groups support the work of the five activity areas.
shared measurement

the spn research-in-practice working group is focused on improving shared measurement through review and synthesis of previous software preservation (and related) research. the purpose of this synthesis is to create a software preservation research toolkit consisting of (1) a centralized longitudinal data effort to track progress and developments in the software preservation landscape, and (2) a set of data-gathering instruments that individuals can use to gather data about software preservation and curation in their local organization or community (hagenmaier ). by aggregating findings from across contexts and survey instruments, practitioners in the field can more easily map the landscape of software preservation and curation, allowing us to focus energy and expertise more precisely and effectively.

shared measurement, “collecting data and measuring results consistently on a short list of indicators at the community level and across all participating organizations” (kania & kramer ), is one of the most difficult and also one of the most powerful components of collective impact. one key insight resulting from analysis of the software preservation discourse is that the only measurable change over time is an increase in the number of project-based and programmatic efforts addressing some aspect of software preservation. this measure does not, however, advance understanding of whether, how, and to what extent these efforts have enabled preservation, sharing, and long-term reuse of software. without shared measurement, there is no empirically-grounded way for stakeholders to determine whether past and current efforts have successfully advanced the state of the field.
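to make the idea of a short list of indicators measured consistently across organizations more concrete, the following python sketch aggregates locally gathered reports into community-level totals per year. everything in it is an assumption made for illustration: the indicator names, the report layout, and the function name are hypothetical and do not describe spn's actual toolkit or data, and the sample numbers are invented.

```python
from collections import defaultdict

# Hypothetical indicator names; a real short list would be agreed by the community.
INDICATORS = ("titles_preserved", "titles_accessible")


def aggregate(reports, indicators=INDICATORS):
    """Sum each shared indicator per year across all reporting organizations.

    Each report is a dict with "organization", "year", and any subset of the
    indicators. A missing indicator is skipped rather than counted as zero, and
    the second return value records how many reports supplied each indicator.
    """
    totals = defaultdict(lambda: {name: 0 for name in indicators})
    reporting = defaultdict(lambda: {name: 0 for name in indicators})
    for report in reports:
        for name in indicators:
            if name in report:
                totals[report["year"]][name] += report[name]
                reporting[report["year"]][name] += 1
    return dict(totals), dict(reporting)


# Invented sample reports from two organizations over two reporting years.
reports = [
    {"organization": "library a", "year": 1, "titles_preserved": 10, "titles_accessible": 4},
    {"organization": "museum b", "year": 1, "titles_preserved": 5},
    {"organization": "library a", "year": 2, "titles_preserved": 12, "titles_accessible": 6},
]
totals, reporting = aggregate(reports)
```

keeping the count of reporting organizations next to each total is a deliberate choice in this sketch: it separates a change in outcomes from a change in how many organizations responded, which is the distinction the paragraph above says a simple count of projects cannot make.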
mutually reinforcing activities

in addition to requesting that the spn working group coordinators map their action plan items to strategic goals, spn requested that the working group coordinators map their activities to the related activities of other working groups. a shared understanding of unique stakeholder contributions and roles allows each group to play to their strengths and interests while highlighting the value of participation to all groups. in a series of spn all-hands strategy meetings held in january and february of , network members articulated specific mutually reinforcing activities shared among working groups. through the documentation of those activities, members reported increased trust in the process as a whole.

http://www.softwarepreservationnetwork.org/about/

spn identifies affiliated projects in addition to mutually reinforcing working group activities. spn’s involvement in an affiliated project consists first and foremost of tracking the project goals and measures of success against the collective agenda. beyond shared measurement, spn may shape an affiliated project or serve in an as-needed advisory capacity. currently, spn affiliated projects include scaling emulation as a service infrastructure (eaasi), fostering communities of practice for software preservation in libraries, archives and museums (fcop), and code of best practices for fair use in software preservation.

constant communication

as mentioned earlier, meetings have been a crucial component of building spn. the spn forum in resulted in the development of a community roadmap that drives the first spn working groups and has been used as the foundation for iterative development of the current spn mission, vision, values, and strategic goals.
the spn all-hands strategy meetings allowed all the current members of the network to clarify the relationships between group goals and broader strategic goals, provide real-time feedback about priorities, and identify gaps and opportunities. between larger convenings of network members, spn stays in regular communication with both active participants and the spn community at large through monthly working group meetings, monthly working group coordinator meetings, quarterly advisory committee meetings, and bimonthly newsletter and e-mail communications. the minutes of all meetings are in a shared google drive, and meetings are always expected to result in time-bounded, distributed tasks that address the group’s action plan. ad hoc meetings also take place between different groups or sets of stakeholders.

http://www.softwarepreservationnetwork.org/projects/
http://www.softwarepreservationnetwork.org/eaasi/
http://www.softwarepreservationnetwork.org/fcop/
http://www.softwarepreservationnetwork.org/bp-fair-use/

backbone organization support

spn has undertaken, with some success, the coordination and management of the day-to-day facilitation work, including stakeholder engagement, communications, data collection and analysis, and other responsibilities. however, the last two years have shown that relying solely on dedicated volunteers for organizationally critical tasks can present challenges. organizations have paid support roles (such as communications specialist, financial manager, systems administrator, program manager) because certain functions are ongoing and critical for both the survival and the accountability of an organization.
spn, community standards for 3d preservation (cs3dp), and building for tomorrow (b t) are awarded grants with some staff hours and some travel support for external participants. while this level of funding may be sufficient for a two-year period in which volunteer participants work together to articulate the priorities for action, dedicated full-time staff are eventually essential to push the common agenda forward and track activities to the system of shared measurement. in august , spn launched a membership and sponsorship campaign and a two-year seed-funded project with the support of members and counting. membership and sponsorship are open to organizations in every sector committed to the preservation and long-term reuse of software. while the launch of the campaign is a direct response to the challenges of sustaining ci initiatives without dedicated resources, the formalization of governance and administrative operations will, by necessity, affect the dedicated community of individuals and organizations that have driven the effort thus far. 3d/vr practitioners should consider spn and other documented ci efforts as a source of information that could improve their efficacy for 3d/vr preservation.

conclusion: collective impact for 3d/vr preservation

there are numerous points of intersection between the software preservation agenda, general data curation goals for complex data sets, and 3d/vr curation challenges. listed below are several questions for the 3d/vr community of practice (wenger ) to consider; these questions may inform collective action for the curation challenges unique to 3d and vr data:

• because 3d/vr preservation encompasses numerous digital curation challenges, should 3d/vr preservation be framed as a collective impact problem? or is it potentially more effective to focus on each major data curation challenge that bears particular relevance for 3d/vr?
• where are the discussions about community infrastructure taking place? who is leading them?
is there currently a backbone organization that can drive the alignment efforts necessary to sustain a ci effort for 3d/vr or associated digital curation challenges?
• which 3d/vr stakeholders are missing from the current 3d/vr preservation discourse, and how might the problems be framed differently to ensure that those stakeholders become part of a sustained and evolving solution to the inherent data curation challenges?
• what forms of shared measurement make sense for a ci effort aimed at 3d/vr preservation?
• what are the major gaps or issues with the ci model? what are the blind spots, particularly for tackling 3d/vr preservation?

the following are recommended next steps toward a ci approach to 3d/vr curation challenges:

• identify past 3d/vr project and programmatic goals and associated outcomes in multiple domains.
• determine clear gaps in the landscape and use them to set a common agenda.
• make the components of the agenda measurable, and explicitly identify key stakeholders that are well positioned to lead work related to specific components of the agenda.
• publish the agenda for feedback, and elicit participation from stakeholder groups currently unrepresented in the 3d/vr preservation discourse.
• track current and future project and programmatic goals and associated outcomes to the common agenda.
• determine which existing organizations have the capacity to serve as the backbone organization for a 3d/vr collective impact initiative, and if this model is unsuitable, determine the most appropriate model to support that work.
looking ahead, the enterprise of digital curation for 3d/vr data depends on the presence of actors at every level of participation; acknowledgment of the unique contributions of individuals, organizations, and domains; and a ci approach that enables the 3d/vr community to direct those contributions toward a common agenda. while the complexity of 3d/vr data raises many hurdles, most of the essential curation challenges can be addressed in concert with work being done in the broader digital curation community.

references

alliez, pierre, laurent bergerot, jean-françois bernard, clotilde boust, george bruseker, nicola carboni, mehdi chayani, matteo dellepiane, nicolo dell’unto, and bruno dutailly. . digital 3d objects in art and humanities: challenges of creation, interoperability and preservation. white paper: a result of the parthenos workshop, bordeaux, france, november – december , . available at https://hal.inria.fr/parthenos/hal- v .
american library association. . “collective impact.” available at http://www.ala.org/tools/future/trends/collectiveimpact.
aufderheide, patricia, brandon butler, krista cox, and peter jaszi. . the copyright permissions culture in software preservation and its implications for the cultural record. washington, dc: association of research libraries. available at http://www.arl.org/publications-resources/ -the-copyright-permissions-culture-in-software-preservation-and-its-implications-for-the-cultural-record#. wpsbjkingio.
bailleul, john. . report on the first ieee workshop on the future of research curation and research reproducibility. available at file:///users/kathlinsmith/downloads/ieee_reproducibility_workshop_report_final% ( ).pdf.
bergmeyer, winfried. .
“the keep emulation framework.” in proceedings of the st international workshop on semantic digital archives (sda ), – . berlin, germany, september .
“bwfla: emulation as a service.” . accessed june , . available at http://eaas.uni-freiburg.de/.
campbell, savannah. . “a rift in our practices?: toward preserving virtual reality.” master’s thesis. new york university, may . available at https://www.nyu.edu/tisch/preservation/program/student_work/ spring/ s_thesis_campbell.pdf.
cchd (center for community health and development), university of kansas. . “other models for promoting community health and development” (chapter ), “collective impact” (section ). accessed june , . available at https://ctb.ku.edu/en/table-of-contents/overview/models-for-community-health-and-development/collective-impact/main.
cohen-boulakia, sarah, khalid belhajjame, olivier collin, jérôme chopard, christine froidevaux, alban gaignard, konrad hinsen, pierre larmande, yvan le bras, frédéric lemoine, fabien mareuil, hervé ménager, christophe pradal, and christophe blanchet. . “scientific workflows for computational reproducibility in the life sciences: status, challenges and opportunities.” future generation computer systems (october): – . available at https://doi.org/ . /j.future. . . .
digital curation centre. . dcc curation lifecycle model. accessed september , . available at http://www.dcc.ac.uk/resources/curation-lifecycle-model.
granger, stewart. . “emulation as a digital preservation strategy.” d-lib magazine ( ). available at https://doi.org/ . /october -granger.
hagenmaier, wendy. . “help spn create a toolkit for software preservation research” blog, aug. , . available at https://connect.clir.org/blogs/wendy-hagenmaier/ / / /help-spn.
institute of museum and library services. . strengthening networks, sparking change: museums and libraries as community catalysts.
available at https://www.imls.gov/publications/strengthening-networks-sparking-change-museums-and-libraries-community-catalysts.
kania, john, and mark kramer. . “collective impact.” stanford social innovation review ( ): – .
kania, john, and mark kramer. . “embracing emergence: how collective impact addresses complexity,” stanford social innovation review, blog, jan. , .
available at https://ssir.org/articles/entry/social_progress_through_collective_impact.
koller, david, bernard frischer, and greg humphreys. . “research challenges for digital archives of 3d cultural heritage models.” journal on computing and cultural heritage ( ): : – : . available at https://doi.org/ . / . .
lacinak, chris, and walter forsberg. . a study of embedded metadata support in audio recording software: summary of findings and conclusions. arsc technical committee. available at http://www.arsc-audio.org/pdf/arsc_tc_md_study.pdf.
laurenson, pip. . “authenticity, change and loss in the conservation of time-based media installations.” tate papers no. . available at https://www.tate.org.uk/research/publications/tate-papers/ /authenticity-change-and-loss-conservation-of-time-based-media-installations.
lewis, david w. . “the . % commitment.” available at https://scholarworks.iupui.edu/bitstream/handle/ / /the% . % % commitment.pdf?sequence= &isallowed=y.
lippincott, sarah, and katherine skinner. . “the library publishing coalition: from collective action to collective impact.” presentation at the coalition of networked information. washington, d.c., december – , . available at https://www.cni.org/topics/publishing/the-library-publishing-coalition-from-collective-action-to-collective-impact.
lischer-katz, zack. . “preserving research data at ou: from text to virtual reality” (draft). university of oklahoma libraries. available at http://vrpreservation.oucreate.com/wp-content/uploads/ / /draftofroadmapdocumentversion _ .docx.pdf.
lynch, clifford a., and joan k. lippincott. . “institutional repository deployment in the united states as of early .” d-lib magazine ( ). available at https://doi.org/ . /september -lynch.
meyerson, jessica, megan r. potterbusch, and lauren work. . “software preservation network literature review.” osf. february . available at osf.io/qdsyc.
meyerson, jessica, zach vowell, wendy hagenmaier, aliza leventhal, fernando rios, elizabeth russey roke, and tim walsh. . “the software preservation network (spn): a community effort to ensure long term access to digital cultural heritage.” d-lib magazine ( / ). available at https://doi.org/ . /may -meyerson.
moore, jennifer, adam rountrey, hannah scates kettler, tassie gniady, nicholas f. polys, monique lassere, kristy golubiewski-davis, and cynthia hudson-vitale. . “cs3dp (community standards for 3d data preservation).” last updated nov. , . open science framework. available at https://osf.io/ewt h/.
national institutes of health. . funding: grants (nih guide to grants and contracts.) accessed june , . available at https://grants.nih.gov/funding/index.htm.
national science foundation. . funding. accessed june , . available at https://www.nsf.gov/funding/.
neylon, cameron. . “against the . % commitment.” science in the open: the online home of cameron neylon (blog), jan. , . available at https://cameronneylon.net/blog/against-the- - -commitment/.
poole, alex h. . “the conceptual landscape of digital curation.” journal of documentation ( ): – . available at https://doi.org/ . /jd- - - .
redwine, gabriela, megan barnard, kate donovan, erika farr, michael forstrom, william m. hansen, jeremy leighton john, nancy kuhl, seth shaw, and susan thomas. . born digital: guidance for donors, dealers, and archival repositories. washington, dc: council on library and information resources. available at https://www.clir.org/pubs/reports/pub /.
rosenthal, david s. h. . emulation & virtualization as preservation strategies. new york: the andrew w. mellon foundation. available at https://mellon.org/media/filer_public/ c/ e/ c eee d- - ba -a - b e a c a /rosenthal-emulation- .pdf.
rothenberg, jeff. . avoiding technological quicksand: finding a viable technical foundation for digital preservation. washington, dc: council on library and information resources. available at https://www.clir.org/pubs/reports/rothenberg/.
skinner, katherine, christine drummond, and martin halbert. . chrysalis: moving forward collectively. white paper.
available at https://educopia.org/publications/chrysalis-moving-forward-collectively.
skinner, katherine, and edward mccain. . “from collaborative action to collective impact.” presentation at the collaboration culture symposium. university of missouri, march – , . available at https://www.rjionline.org/stories/edward-mccain-and-katherine-skinner-from-collaborative-action-to-collective.
sloyan, victoria. . “born-digital archives at the wellcome library: appraisal and sensitivity review of two hard drives.” archives and records ( ): – . available at https://doi.org/ . / . . .
smith, mackenzie. . “future-proofing architectural computer-aided design: mit’s facade project.” editions infolio.
available at https://dspace.mit.edu/bitstream/handle/ . / /gaudi- proceedings-smith.pdf?sequence= .
star, susan leigh, and karen ruhleder. . “steps toward an ecology of infrastructure: design and access for large information spaces.” information systems research ( ): – .
steinbach, martin, jana dittman, and erich neuhold. . “digital watermarking.” in encyclopedia of multimedia, edited by borko furht. springer us.
waters, donald, and john garrett. . preserving digital information, report of the task force on archiving of digital information. washington, dc: council on library and information resources. available at https://www.clir.org/pubs/reports/pub /.
walzer, norman, liz weaver, and catherine mcguire. . “collective impact approaches and community development issues.” community development ( ): – . available at https://doi.org/ . / . . .
wenzler, john. . “scholarly communication and the dilemma of collective action: why academic journals cost too much.” college & research libraries ( ): . available at https://crl.acrl.org/index.php/crl/article/view/ .
wenger, etienne. . communities of practice: learning, meaning, and identity. cambridge: cambridge university press.
wolff, tom, meredith minkler, susan wolfe, bill berkowitz, linda bowen, frances dunn butterfoss, brian d. christens, vincent t. francisco, arthur t. himmelman, and kien s. lee. . “collaborating for equity and justice: moving beyond collective impact.” nonprofit quarterly (january ). available at https://nonprofitquarterly.org/ / / /collaborating-equity-justice-moving-beyond-collective-impact/.
cs3dp: developing agreement for 3d standards and practices based on community needs and values

jennifer moore, adam rountrey, and hannah scates kettler

chapter

abstract

the pace and growth of 3d model creation have increased tremendously in recent years, a trend that will continue as technology evolves. the ever-present need to develop practices and standards to ensure the long life of this data type has never been more apparent to both 3d creators and data curators. in this moment, a community is coming together in an effort to increase the accessibility, discoverability, usefulness, and integrity of data. one such effort, the community standards for 3d data preservation, is focused on collective development of flexible and extensible best practices and/or standards for preservation, documentation, and dissemination of 3d data.

the growth of 3d data creation and the need for curation

technical advancements in creation and capture of 3d data and a reduction in costs have inspired intense growth and interest in creating, sharing, and using digital 3d data in the last decade. the digital 3d medium is still rapidly evolving; new users and creators are coming in from many backgrounds, often with different end goals. yet, as users and creators, we are also at a point in the development and usage of 3d data at which many recognize the need for a system of standards, or at least guidance, for the documentation, preservation, and dissemination of 3d data.
locally developed standards are now starting to emerge independently in labs, museums, libraries, and businesses, often with little communication among stakeholders. this could be problematic for the larger community, as a system of standards optimized in isolation solely for the needs of a specific field or discipline is unlikely to meet the needs of those in other fields and may even be completely unusable outside its home field. for example, a standard designed specifically for 3d models of archaeological sites and collecting events is unlikely to meet the needs of game developers and architects, and it may unnecessarily limit the discoverability of the content.

collaboratively developed standards would increase accessibility, discoverability, and usefulness outside the field of the user or creator, and they would foster innovative applications and scholarly work by reducing the barriers to cross-disciplinary access. however, the more generalized standards require compromises, which means that they may not fully meet the needs of any specific field. given the need for documentation and preservation standards in most fields using 3d data, the broader community is now choosing to come together at a powerful, formative time, asking questions not only about how information should be encoded for sharing and preservation, but also about what we value and how we can design a system that promotes those values. in other words, what kinds of things do we want our standards to encourage and facilitate?

relationship of data curation and 3d data today

historically, librarians, curators, and museum managers have been concerned with curation and preservation of scholarly works and physical objects.
in spec kit produced for the association of research libraries (arl), the authors reported that two-thirds of arl libraries surveyed were engaged in or developing data curation services in their libraries at some level (hudson-vitale et al. ). over the last decade, digital data curation services and research on digital preservation have become part of the established mission of many libraries, but few have explored the complexities of 3d data curation in depth.

at first glance, 3d data may appear to be “just like any other data,” but this is not a full characterization. like digital photographs or simulation results, 3d models are, in the most reduced sense, just bits and bytes. this means that some parts of the preservation process, such as file storage and fixity, are not substantially different for different data types. of course, there is more to preservation than just storage. one could argue that the goal of preservation is to maintain an object, ensuring that it may be “used” (even if in a restricted way) over a longer period. for a 2d image repository to be useful, potential users need to be able to judge the suitability of an image for an intended purpose. they may need information about the photographer, location, date, subject, and licensing. in some cases, detailed information about the photography equipment and post-processing may be necessary. a 3d repository has a similar preservation mission, to make 3d data usable over a long period, but the information required to determine whether the data are fit for a particular purpose is different. because of the growth, complexity, lack of standardized documentation, and rapid change in 3d data formats and creation methods, the larger 3d community is only beginning to assess what is needed to ensure that 3d datasets remain sufficiently useful in order to consider them preserved.
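the remark above that file storage and fixity do not differ much across data types can be made concrete. the following is a minimal python sketch of a fixity check, not a description of any repository’s actual workflow: it assumes sha-256 as the digest, and the function names `sha256_of` and `fixity_ok` are hypothetical, chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large 3D files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """True when the file's current digest matches the recorded one."""
    return sha256_of(path) == recorded_digest.lower()
```

because the check reads raw bytes, it treats a 3d mesh and a photograph identically; what differs for 3d data is the documentation needed to judge fitness for purpose, which a checksum cannot supply.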
it is important that more data curators and librarians are now exploring the problem of 3d data curation, but agreement among librarians and curators is not enough to ensure its preservation; standards for preservation are useful only if adopted by the creator and user communities.

aligning creators and curators

the 3d/virtual reality (vr) ecosystem includes a broad array of communities and stakeholders that are involved in the creation, preservation, and dissemination of 3d data. libraries, museums, and archives should engage with this large community to assess needs in 3d data curation and agree upon practices and standards. indeed, the first step for the community standards for 3d data preservation (cs3dp) project, funded by the institute of museum and library services (imls), was to develop a better understanding of stakeholders’ needs in the 3d data creation and data curation community (moore et al. ). we distributed a survey to libraries, museums, university labs, and government agencies, and asked their staffs to share it with other interested parties.

in our initial survey, we received more than responses from individuals who worked with 3d data or had responsibility for curating data. notably, percent of all respondents said that they were not using documented best practices or standards for preservation, documentation, and dissemination of 3d data. of this group, percent said that they did not use them because they were unaware of such standards. the respondents who were using standards had largely developed them in-house. the survey data made it clear that we were at a critical point in standardization. existing standards lacked buy-in or were so poorly known that local ad hoc standards were being created to fill in the gap.
the increase in local standards was not indicative of a desire for independence or isolation in preservation practices, as the majority ( percent) of respondents said they would like to collaboratively develop standards and best practices as a community (moore and scates kettler ). community members clearly recognized the barriers to collaboration and aggregation introduced by local solutions and were ready to work together on a more unified system of standards.

the cs3dp project emerged from the idea that the adoption of standards depends not only on meeting the needs of the community, but also on community ownership and stakeholder investment. we used the survey results to build a five-part framework for organizing ideas: (1) preservation best practices, (2) management and storage, (3) metadata, (4) copyright and ownership, and (5) access and discoverability. these framework topics eventually led to working groups that focused on considering the details of each topic. however, unanticipated discussions during the planning and subsequent forum may have been some of the most important. for example, at the first gathering of cs3dp participants, many remarked that the wide variety of fields and backgrounds represented at the gathering was both unusual and much needed. in fact, the breadth of backgrounds and expertise was so diverse that we made it a priority to develop agreed-upon vocabularies for use in our discussions so that we could speak to one another with better clarity and understanding. expertise was sufficiently diverse that we needed standards to even start to talk about standards! questions that arose in discussions showed the desire of the community to work together and develop a general understanding of the current state of 3d preservation.
among the questions raised were “what is 3d data?” “what is it that should be preserved?” “how do we facilitate sharing without discouraging creativity and innovation?” and “what stakeholders are missing from these discussions?”

the next step in the process of developing 3d standards is including perspectives that were not adequately represented in the initial survey, including, but not limited to, those of indigenous and native communities, those from nations with technological bandwidths different from those commonly found in north america and europe, and those from the entertainment industry. targeted engagement with these communities in the form of actual relationships, not just consultations, is appropriate if the future of 3d data preservation is to take into account the actual breadth of the needs and requirements of all the users and creators of 3d content.

this concept was brought to the fore as narcisse mbunzama spoke as part of the first cs3dp forum on february , . during the panel on discovery and access (wittenberg, nieves, and mbunzama ), mbunzama discussed the innovations in 3d in the democratic republic of the congo (drc) and use cases there that heavily rely on mobile devices for dissemination and access. the 3d data are shared via the drc’s unreliable cellular and internet services, and are processed using basic computers. yet, the contribution and scholarship surrounding 3d data is no less active in the drc than it is anywhere else in the world. a standard and set of best practices for 3d preservation should account for the concerns of such active communities and should meet the needs and requirements of various types of users.

bridging efforts in the community

the process of collaboratively building 3d data curation has only just begun in earnest.
however, our efforts are building on and incorporating stakeholders from well-established projects such as morphosource, smithsonian 3d, cultural heritage imaging, and facade (smith ), as well as initiatives such as the guides to good practice from the archaeological data service and digital antiquity and the 3d icons report on metadata and thesaurii (d’andrea and fernie ), funded by the european commission and other european-based funders. these efforts have been extremely valuable in setting the tone for discussions. yet, they tend to be focused on particular disciplinary needs or specific 3d data creation methodologies. because of this focus, the applicability beyond specific contexts of data creation is limited. we want to assess the previous work and build on it to develop 3d data curation practices and standards that will be broadly applicable and extensible to meet the needs of multiple use cases and user groups. such a set of standards would support the creation of simplified, broadly applicable preservation systems as well as enhanced accessibility through data aggregators.

the cs3dp project is not alone in tackling the aforementioned concerns and is building connections with other projects. these bridges will facilitate communication about diverse needs and concerns and will promote the development of standards that will be used by a larger interdisciplinary community.
the cs3dp team has tracked preceding efforts, and we continue to monitor current efforts to address the curation of 3d data, including the national endowment for the humanities (neh)–funded forums on 3d in and , which focused on the scholarly rigor, research output, and user experience rather than the preservation of 3d content (neh ); a white paper resulting from the parthenos workshop held in france in late (parthenos ); imls-funded forums, including developing library strategy for 3d and virtual reality collection development and reuse (lib3dvr) (virginia tech ); building for tomorrow (harvard university ); the international image interoperability framework (iiif) community 3d interest group (iiif ); and the clir 3d/vr colloquium (3d/vr creation and curation in higher education ). these groups’ efforts, the cross-pollination of our teams, and the conversation that is continually growing more focused have proven invaluable in identifying problems to be addressed in 3d preservation, community standards development, and the formation of best practices.

in sharing our experience and utilizing our larger collective network, can we begin to deepen our understanding of each other’s needs, current practices, and shortcomings for 3d data preservation? in the short term, the cs3dp team anticipates that the contributors to these projects will maintain cross-pollination and collective growth, and will move forward together to support widely applicable 3d data standards and recommendations.
https://www.morphosource.org/ https:// d.si.edu/about http://culturalheritageimaging.org/ http://guides.archaeologydataservice.ac.uk/g gp/ d_toc

extensible standards that encourage innovation through congruity

in an ideal 3d/vr ecosystem, 3d data standards and practices facilitate curation, but do not inhibit innovation, extensibility, and adaptability. the cs3dp team sees this as an imperative for any 3d standard or best practice. as articulated during multiple 3d practitioner gatherings (humanities heritage 3d visualization , lib3dvr, cs3dp, building for tomorrow, and clir’s 3dvr colloquium), a way forward is for the community to build a minimum set of standards and guides that include mechanisms for review, amendment, and a la carte extension. although this goal is somewhat aspirational, we believe it can be accomplished by focusing on identifying the commonalities among disciplines, modalities, and use cases.

as we come to understand our common needs, we can build congruity through practical tools. templates for documenting 3d data workflows that are both modality-specific and unobtrusive could provide a structure that fosters efficient dissemination, collaboration, and assessment of 3d data. agreement on documentation could permit the evaluation of a 3d resource for specific tasks and contexts and could uphold and expose scholarly rigor. the community is working now to identify needs, extant methods, and vocabularies and to provide direction on how this documentation can be created and linked. an analysis of ongoing requirements is taking place, which aims to make it possible for repositories to be optimized for long-term 3d data preservation and for access.
congruous 3d data repository requirements will support the discovery of 3d assets that are years old, allowing future scholars to reuse, re-create, and remix these data. repositories will be able to describe which data holdings have been treated for preservation and persistence, which will in turn inform reusers about the dataset’s integrity. that sort of quality assurance permits datasets to be retrieved, studied, augmented, and cited to advance research or allow for the production of new scholarship. through community consensus we will move toward interoperability by establishing preferred formats that will make vendor adoption practical. in multiple forums we have heard arguments for responsive platform design that can deliver heavy data for use in powerful, high-performance computers, as well as decimated data for use on small devices, such as a smartphone, which is particularly important for countries that depend on mobile accessibility.

we do not envision there being “one repository to rule them all,” but the community may at some point be able to build a tool that aggregates metadata following the standards on which we agree. an aggregator would allow access from disparate repositories so that users could search across disciplines to find digital 3d assets. for example, an animator for the discovery channel could search for a 3d rendering of a plesiosaur and find a reconstruction offered from the university of michigan with appropriate metadata that would enable them to determine whether the 3d asset would be useful, appropriate, and accurate, without duplicating effort. users should be able to download, derive, augment, and redistribute an object with its provenance attached as embedded metadata or, at the very least, clearly articulated and explained in related documentation.
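the aggregator idea above can be illustrated with a minimal sketch. everything in it is an assumption made for illustration: the record fields, the function names, and the two placeholder repositories are hypothetical and do not represent an agreed cs3dp schema or any existing tool. it shows only the shape of the idea: separate repositories publish records in a shared form, one index searches across them, and a derived object keeps its provenance.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AssetRecord:
    """hypothetical minimal metadata record; field names are illustrative only."""
    title: str
    repository: str
    discipline: str
    file_format: str
    provenance: tuple = ()  # ordered history of how the object was made and changed


def aggregate(*repositories):
    """combine the record lists published by separate repositories into one index."""
    return [record for records in repositories for record in records]


def search(index, term):
    """case-insensitive title search across every repository in the index."""
    needle = term.lower()
    return [record for record in index if needle in record.title.lower()]


def derive(record, step):
    """return a copy of the record with one more provenance step appended."""
    return replace(record, provenance=record.provenance + (step,))


# placeholder records, invented for this example
paleontology_repo = [
    AssetRecord("plesiosaur skeleton reconstruction", "paleontology repository",
                "paleontology", "ply", ("photogrammetry capture",)),
]
heritage_repo = [
    AssetRecord("bronze age sword scan", "heritage repository",
                "archaeology", "obj", ("structured-light scan",)),
]

index = aggregate(paleontology_repo, heritage_repo)
hits = search(index, "plesiosaur")
web_copy = derive(hits[0], "decimated for mobile delivery")
print(web_copy.repository, web_copy.provenance)
```

because the records are immutable, deriving a lighter copy for a small device leaves the original record untouched, and the new copy carries the full history of what was done to it.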
through a common understanding of what is central to 3d data, a community-developed approach to data curation will facilitate reproducible workflows and provide methods by which to evaluate the integrity of data objects. established requirements will allow creators to reference guidelines that not only will support data management planning, but also will inform workflows and documentation of the data generated and the arguments created. preservation of these data is essential in supporting the basis of scholarship. curators value openness and sharing, but that must be balanced with cultural sensitivity, a rights structure that supports creator innovation, and proper recognition for novel forms of scholarship. such a structure will also provide a basis for the measurement of 3d scholarship and thereby help to facilitate promotion and tenure based on these data.

determining rights and attributions for creators and scholars is not the only consideration while developing common needs for 3d preservation. understanding 3d data creation and ownership is problematic, as suggested by angel nieves in his talk at cs3dp forum (wittenberg, nieves, and mbunzama ), because cultural heritage assets have been largely in the hands of the privileged. adopting a lens grounded in diversity, inclusion, and equity by including additional voices of marginalized user and creator communities will ideally address some of the biases within digital collections that reflect a singular, sometimes predatory, cultural perspective. by integrating our communities’ diverse needs and perspectives early in the establishment of a standard 3d data curation process and throughout its development, we can begin to build real relationships that address the prevalence of bias and violence done by the academy, archives, libraries, and museums.
a best practice or standard steeped in the values of diversity, inclusion, and equity can promote social justice through the inclusive curation and preservation of 3d cultural resources.

once the community of diverse 3d scholars and contributors has been established, we may yet succeed in creating an inclusive, collaborative 3d data preservation ecosystem. community collaboration is necessary to build flexible standards and practices that afford librarians and data curators the tools that they need to support the 3d community’s continued innovation, development, open dissemination, and further discovery of 3d research. with the appropriate consideration of different perspectives, we will generate the necessary fervor to support 3d research where we can share, use, find, and benefit from each other’s data well into the future.

references

3d/vr creation and curation in higher education. . accessed june , . available at http://vrpreservation.oucreate.com/colloquium/.

d’andrea, andrea, and kate fernie. . “d . report on metadata and thesaurii.” slideshare. 3d-icons. accessed january , . available at https://www.slideshare.net/mobile/ dicons/ dicons-d -report-on-metadata-and-thesaurii.

harvard university. . “building for tomorrow.” accessed june , . available at https://projects.iq.harvard.edu/buildingtomorrow/home.

hudson-vitale, cynthia, heidi imker, lisa r. johnston, jake carlson, wendy kozlowski, robert olendorf, and claire stewart. . “spec kit : data curation.” available at https://publications.arl.org/data-curation-spec-kit- /.

iiif (international image interoperability framework). . “iiif 3d community group.” accessed june , . available at http://iiif.io/community/groups/ d/charter/.

moore, jennifer, and hannah scates kettler. . “who cares about 3d preservation?” iassist quarterly ( ): – . available at https://doi.org/ . /iq .
moore, jennifer, adam rountrey, hannah scates kettler, tassie gniady, nicholas f. polys, monique lassere, kristy golubiewski-davis, and cynthia hudson-vitale. . “cs3dp (community standards for 3d data preservation).” open science framework, january. available at https://osf.io/ewt h/.

neh (national endowment for the humanities). . “humanities heritage 3d visualization: theory and practice.” accessed june , . available at https://www.neh.gov/divisions/odh/institutes/humanities-heritage- d-visualization-theory-and-practice.

parthenos. . “‘digital 3d objects in art and humanities’ workshop.” parthenos project (blog). november , . available at http://www.parthenos-project.eu/digital- d-objects-in-art-and-humanities-workshop.

smith, mackenzie. . “future-proofing architectural computer-aided design: mit’s facade project.” editions infolio. available at http://dspace.mit.edu/handle/ . / #files-area.

virginia tech. . “3d collection strategies.” accessed june , . available at https://lib.vt.edu/content/lib_vt_edu/en/research-learning/lib dvr.html.

wittenberg, jamie, angel nieves, and narcisse mbunzama. . “presentation panel on discoverability/access.” community standards for 3d data preservation (cs3dp) forum , february – , , st. louis, mo. available at http://ir.uiowa.edu/cs dp/forum /presentations/ .

3d/vr: stakeholders, ecosystems, and future directions

zack lischer-katz, kristina golubiewski-davis, jennifer grayburn, and veronica ikeshoji-orlati

conclusion

the essays in this report offer a glimpse at how 3d and vr are being used for their scholarly and pedagogical benefits. supporting these technologies furthers the mission of academic libraries to ensure that their constituencies have access to scholarly information in all forms and formats. the potential benefits of 3d/vr technology can be fully realized only when the technology is properly integrated into research programs and teaching curricula, an area in which the library can lead. yet, the essays also make it clear that there remains a range of critical considerations that library professionals need to keep in mind as they shepherd novel 3d/vr technologies into their institutions, especially as they find themselves supporting new and complex technical workflows, scholarly practices, and data curation and digital preservation requirements.

the great diversity in the range of stakeholders involved complicates the development of comprehensive technical tools.
one of the benefits of the clir 3d/vr colloquium was that it not only brought together a diverse range of stakeholder groups and enabled knowledge sharing across often-siloed groups, but also helped to identify stakeholder groups that the planning committee had not identified before the 3d/vr discussion.

the box below shows the stakeholder groups represented at the colloquium, including invited experts and other attendees from the broader university of oklahoma academic community, as well as additional stakeholder groups that had not been included in the colloquium.

diverse stakeholders

the following stakeholder groups were represented at the clir 3d/vr colloquium:
• 3d technologies and architecture specialists
• archaeological 3d data researchers
• independent 3d animators
• industry
  — corporate 3d scanning practitioners
  — 3d model sharing platform representatives
• library communities
  — academic library administrators
  — digital humanities librarians
  — digital library applications developers
  — digital preservation specialists
  — digital scholarship specialists
  — emerging technology librarians
  — entrepreneurship librarians
  — fine and applied arts librarians
  — gis and anthropology librarians
  — librarians of reference and instructional services
  — postdoctoral fellows in data curation
  — social sciences and humanities librarians
• meteorologists
• nonprofit preservation organization representatives from the software preservation network and clir
• university faculty members with expertise in
  — advanced medical imaging
  — biology
  — english
  — fine arts, sculpture
  — journalism
  — media arts and sciences

the following groups, not represented at the colloquium, were also identified as stakeholders:
• architecture libraries, archives, museums
• industries
  — aerospace
  — automotive
  — games and entertainment
  — software development
• museum patrons
  — children with disabilities
  — elementary school educators and students
• museum staff
• public library patrons
• public library staff
• students, undergraduate and graduate, in
  — architectural history
  — art
  — computer science
  — digital humanities
• underrepresented communities
  — communities with different technological bandwidths (i.e., impact of “digital divide”)
  — elderly populations
  — indigenous and native communities
  — minority communities seeking to preserve their cultural heritage
• other university programs/schools
  — law school libraries
  — schools of architecture

these expansive lists illustrate the wide range of communities that have an interest in the development of standards and best practices around 3d/vr technologies. no longer does this area of research interest only a small group of specialists. from schoolchildren to advanced researchers across multiple disciplines, the use and impact of 3d/vr are expanding rapidly and becoming increasingly mainstream.

in part because of this growing range of stakeholder groups, there is still much debate about who will lead the way in establishing the 3d/vr ecosystem identified in this report. for example, will researchers and other content creators, libraries and archives, or commercial platforms take the lead? should the ecosystem be a centralized one or a more loosely associated network? it is clear that the key elements of any 3d/vr ecosystem should include the following tools and modules:
• a universal viewer that integrates with existing and emergent 3d creation workflows and vr visualization tools.
• tools and metadata schemata that enable the documentation of production workflows, especially in cases where 3d data are hand-edited or reduced in some way through the production of scholarly outputs. these tools and schemata should also support metadata capture for preservation and rights management purposes.
• a storage and retrieval platform for 3d data, designed to support the full range of 3d data uses, including creation, 3d printing, visualization and analysis, and publication.

libraries can take the lead in supporting these 3d/vr infrastructural components because they are the multidisciplinary hubs of academic institutions and have experience supporting the administration of information resources, as well as research and development around new scholarly technologies. furthermore, they harbor a long institutional history of preserving and providing access to knowledge for academic institutions, and they can meet the needs of multiple departments simultaneously. for this reason, many libraries are currently working to support 3d/vr projects through a combination of providing training opportunities (such as workshops within the library), hiring staff to provide services in support of research projects, and working to create a communitywide framework for preservation.

several models of partnership may be forged among technologists, faculty, and other interested stakeholders working toward the common goal of supporting current creation and preservation efforts using 3d/vr work in academic institutions. the diverse approaches discussed in these essays may guide librarians and digital curators alike as they approach the complex field of 3d/vr data creation, dissemination, and preservation. of course, each institution has a set of unique local needs; thus, each case study presented herein offers unique challenges and solutions that provide directions for further experimentation and research.
at the same time, cross-disciplinary communities in the field (e.g., community standards for 3d preservation, building for tomorrow, the software preservation network, and developing library strategy for 3d and virtual reality collection development and reuse, to name a few) are working toward a more cohesive model of addressing the standardization of preservation methods and training; they hope to integrate their findings into a holistic approach that addresses the needs of communities across national and international contexts. as more libraries take up the call to action around this topic, we hope that this report, presented as a snapshot of the current state of the field, can provide an entrée into critical engagement with the exciting field of 3d/vr and other emerging scholarly tools in the rapidly changing academic library of the twenty-first century.

acknowledgments

the idea of a clir colloquium and proceedings report on 3d/virtual reality (3d/vr) arose from early conversations among clir postdoctoral fellows interested in 3d/vr scholarship and the subsequent development of a working group at the coalition for networked information (cni) in . through a generous microgrant from clir funded by the alfred p. sloan foundation, we were able to realize our vision for a colloquium that would bring together diverse perspectives on this exciting and rapidly changing field. we are grateful to clir for its financial, logistical, and intellectual support throughout this process. in particular, we would like to thank elliott shore and lauren coats for encouraging these discussions by building into the clir education and training program formal opportunities to collaborate. we would also like to thank alice bishop and christa williford, who provided unwavering support and guidance throughout the application, planning, and publication process.
special thanks to elizabeth parke for her key input on colloquium planning and logistics, diane ramirez for her help with colloquium travel and reimbursements, and kathlin smith for her expert direction and astute input on report publication.

the colloquium itself was truly a collaborative endeavor, and we are indebted to numerous sponsors and participants who made this event so successful. thank you to our cosponsoring institutions: the university of oklahoma libraries; university library at the university of california, santa cruz; and temple university libraries. the support of many individuals within these institutions, namely tara carlisle, matt cook, carl grant, and joe luccia, contributed greatly to our efforts. thank you also to all of the attendees of the clir 3d/vr colloquium, who represented a range of perspectives and contributed to the enriching conversations that supplemented our scheduled presentations and shaped the content within these pages.

we would also like to extend our appreciation to our peer reviewers, whose expertise and support improved this report immeasurably: christy caldwell, university of california, santa cruz; jasmine clark, temple university; matt cook, university of oklahoma; andrea copeland, indiana university–purdue university, indianapolis; bill endres, university of oklahoma; chad hutchens, university of wyoming; tom lee, university of connecticut; kate murray, library of congress; elizabeth parke, mcgill university; samantha porter, university of minnesota; anthony sanchez, university of arizona; hannah scates kettler, university of iowa; carla schroer, cultural heritage imaging; li sou, university of bradford; chris strasbaugh, ohio state university; edward triplett, duke university; and jamie wittenberg, indiana university, bloomington.

finally, thank you to reed sciven for the cover image design.
glossary

360-degree video: digital video files created by using either special lenses or multiple cameras to capture all angles of view (i.e., images 360 degrees around the camera setup). for playback on a computer monitor, the user selects the part of the scene to be viewed by means of a keyboard and mouse, and then moves the perspective around the recorded 360-degree scene. with a vr headset, users move their heads and bodies to see different viewing angles of the scene.

3d file formats: 3d meshes

a. x3d: open-source, royalty-free international organization for standardization–compliant standard that defines a 3d data file (encoded in a variety of formats, including xml, classicvrml, compressed binary encoding, and json encoding). it replaced virtual reality markup language (vrml) in . it is actively maintained by the web3d consortium. file extensions include .x3d, .x3dv, .x3db, .x3dz, .x3dbz, and .x3dvz.

b. obj: file format for encoding 3d geometry. originally developed by wavefront technologies, it is now an open format and widely supported by 3d modeling software. obj files store only geometry information, so textures have to be stored in a separate file. obj files have the extension .obj. because they are supported by most software programs, obj files are widely used, even though they lack a variety of functions, including internal texture mapping and embedded metadata capabilities.

c. dae: open-source, international organization for standardization–compliant standard file format for encoding 3d data. also referred to as collaborative design activity (collada), dae has provision for some metadata fields, can contain scale information, and is encoded as xml. it was originally designed as an interchange standard for moving 3d models between different 3d modeling and animation packages. many 3d software packages can open and export dae files. the nonprofit khronos group currently manages dae. the file extension is .dae.

d. ply: file format developed at stanford university by greg turk in to encode 3d geometry and some surface properties. it is also known as the polygon file format or, less commonly, the stanford triangle format. the file extension is .ply. it has more functions than obj, but lacks the rich feature set, including embedded metadata, of dae and x3d.

e. fbx: proprietary 3d file format developed by kaydara and now owned by the large cad modeling software corporation, autodesk. derived from filmbox, it is not openly documented, but it can be encoded as ascii in a structured form that is human readable. the file extension is .fbx.

f. stl: file format used by cad software to encode 3d information. the name is derived from its use in stereolithography, a form of 3d printing. it is often output from 3d modeling software in order to enable 3d printing. it stores geometric information in a 3d, cartesian coordinate system and does not carry surface texture information or scale information. the file extension is .stl.

ar (augmented reality): technologies that blend computer-generated images, sounds, and haptic sensory inputs with real-world sensations. unlike vr technologies, ar does not cover up the user’s sensory field completely, but “augments” it by adding additional layers of sensory information that complement real-world phenomena.

assets: a general term for individual components that compose 3d and virtual environments (see: mesh).

bim (building information modeling): use of digital tools to model the physical and functional properties of architectural spaces. it enables governments and businesses to plan for and manage critical infrastructure.

cad (computer-aided design): use of computers to assist in the creation of designs, including architectural and engineering designs. these designs can be 2d (taking the place of traditional drafting techniques) or 3d (taking the place of model building). architectural and engineering design work is now done primarily using cad.
decimate: to reduce the information in a 3d polygon mesh by simplifying its geometry (i.e., by reducing the number of polygons). this is often necessary to make it more practical to work with the resulting mesh files, which can otherwise be very large. because graphics processing power is limited, web-based viewers and virtual reality systems can display only a limited number of polygons at one time without system slowdown.

gis (geographic information system): system that displays, manages, and analyzes layers of spatial data.

lidar (light detection and ranging): remote-sensing technology that measures distance using low-energy laser pulses. often used in aerial surveys, lidar technology emits a laser pulse, which reflects off an object, and records its return delay to determine distance points. compiled distance points can be used to generate 3d representations of target objects (see point cloud).

mr (mixed reality): technologies that blend computer-generated images, sounds, and haptic sensory inputs with real-world sensations. mr is similar to ar in that digital layers of sensory input overlay real-world phenomena, yet mr technologies more convincingly “mix” virtual and physical reality through the real-time interaction between the two.

mesh: generated from a set of data points (see point cloud) by software (e.g., agisoft photoscan and other photogrammetric processing software), a grid of triangles or other polygons that define the 3d surface of an object. because the mesh is composed of geometric elements that have vertices (points) and lines, it can represent the same spatial information as a point cloud with fewer data points. in addition, this creates a “watertight” model that can be 3d printed or brought into a vr environment and analyzed as a solid object rather than a set of discrete data points floating in space. meshes can also be generated from computer-aided design (cad) projects, in which there is no original source object.

photogrammetry: technique in which (at least) two photographs are taken from slightly different perspectives to calculate the three-dimensional coordinates of a space or object. by measuring changes in the vertical (y axis) and horizontal (x axis) position of points in each photograph, the distance from the camera to the object in question can be calculated, producing the data on the depth (z axis). the basic technique dates back to the nineteenth century when surveyors and those producing topographical maps measured points in photographs and mechanically compared them to calculate precise spatial coordinates. contemporary photogrammetry uses computers and sometimes hundreds of captured images to rapidly produce highly detailed 3d data that can be used to produce 3d models in the form of polygon meshes (see mesh).

point cloud: set of data points that describe the x, y, and z spatial coordinates of real-world objects. obtained by light detection and ranging (lidar) and photogrammetry techniques, these data points can number in the millions. when rendered graphically, they resemble a cloud of individual points suspended in space. depending on the quality of the scanning process, there may be holes in the data that need to be filled in before the point cloud can be converted into a polygon mesh (see mesh) for 3d modeling and printing purposes.

retopologization: process in which the surface of a 3d polygon mesh (see mesh) is replaced with a more efficient configuration of polygons. it can help reduce the file size and ensure that edges of polygons conform to the edges of the model features, which makes animation and other activities easier to accomplish. tools provided with 3d modeling software are used in the process.
structured-light scanning: highly accurate 3d scanning process that involves superimposing patterns of light onto a physical object and using a camera to capture information about the distortion of the light patterns caused by the contours of the object. structured-light scanning can capture large sections of an object at once and can be quicker than some other capture processes, including photogrammetry and laser scanning.

texture map / uv mapping: map that carries color and texture information of an object and defines how that information should be wrapped around a 3d mesh (see mesh). uv mapping is a common way of defining and attaching texture maps.

vr (virtual reality): use of complex arrangements of computer software and hardware to re-create the sensory impression of real-world experience. current systems use head-mounted displays (hmds), hand controllers, and tracking sensors to produce interactive and immersive worlds composed of stereoscopic images and sounds. some vr systems have started to provide haptic and olfactory interfaces as well as sound and image. in addition to hmds, cave systems (employing large, multiscreened rooms and 3d glasses with head tracking) have been developed, but they are far more costly than the current wave of commercially available hmds and require custom installation.

xr (extended reality): overarching term encompassing the full spectrum of experience, from real-life to blended to full immersive reality (see 360-degree video, ar, mr, and vr).
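the photogrammetry entry above says that depth can be calculated from the shift of a point between two photographs. a minimal sketch of that calculation follows, under simplifying assumptions that are not part of the glossary: two identical, perfectly aligned cameras separated horizontally, pixel coordinates measured from the image centre, and a known focal length in pixels. under those assumptions depth is focal length times camera separation divided by the horizontal shift (disparity). the function and parameter names are illustrative; production photogrammetry software solves a much more general problem over many images.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """distance (z axis) to a point seen by an idealized two-camera setup."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; zero shift means infinite distance")
    return focal_length_px * baseline_m / disparity_px


def triangulate(x_left_px, y_px, x_right_px, focal_length_px, baseline_m):
    """return (x, y, z) in metres, relative to the left camera, for one matched point."""
    z = depth_from_disparity(focal_length_px, baseline_m, x_left_px - x_right_px)
    # similar triangles: image offset / focal length = real offset / depth
    x = x_left_px * z / focal_length_px
    y = y_px * z / focal_length_px
    return (x, y, z)


# one matched point: it appears 100 px right of centre in the left photograph
# and 50 px right of centre in the right photograph (a 50 px shift)
point = triangulate(x_left_px=100, y_px=20, x_right_px=50,
                    focal_length_px=1000, baseline_m=0.5)
print(point)
```

repeating this for every matched point across the photographs yields the set of x, y, and z coordinates that the glossary calls a point cloud.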
about the editors and authors

editors

kristina golubiewski-davis is interim director of the digital scholarship commons at the university of california, santa cruz (ucsc), where she works to provide access to technologies that enable new modes of knowledge building, creates opportunities for students to integrate digital tools into their learning, builds partnerships with ucsc faculty and staff to facilitate digital research and scholarly publication, and develops spaces that foster experimentation and innovation. she previously held a clir postdoctoral fellowship at middlebury college and received her phd in anthropology at the university of minnesota. her dissertation focused on connecting measurable aspects of bronze swords to communities of knowledge during the late bronze age. throughout her career, she has been an advocate for working with 3d and digital technologies for research, outreach, and educational purposes.

jennifer grayburn is director of digital scholarship and public services at union college in schenectady, new york. she received her phd in art and architectural history from the university of virginia, with a focus on the art and literature of the medieval north sea world. as a graduate student, she held the positions of praxis fellow project manager and makerspace technologist in the scholars’ lab, where she started her research on the critical use of 3d models and 3d printing in the classroom. she expanded her research on critical making and digital pedagogy as a clir postdoctoral fellow at temple university and applied these pedagogical concepts as a consultant for the carnegie museum of art’s copy + paste exhibition.

veronica ikeshoji-orlati is the robert h. smith postdoctoral research associate for digital projects at the center for advanced study in the visual arts at the national gallery of art in washington, dc. she received her phd in mediterranean art and archaeology from the university of virginia, where she studied the intersection of performance and visual culture in south italy and sicily during the fourth century bce. her research continues to focus on the material culture of the ancient mediterranean, with a particular emphasis on applying data analysis and visualization tools to explore the production, decoration, distribution, and deposition of figure-decorated vases. from to , she served as a clir postdoctoral fellow in data curation and research data management at vanderbilt university.

zack lischer-katz is a clir postdoctoral research fellow in data curation at the university of oklahoma libraries. in his fellowship, he is developing guidelines for curating research data associated with virtual reality technologies and 3d models, and is researching the impact of virtual reality and 3d tools on research and pedagogy in academic libraries. his personal research addresses the cultures and techniques of preserving visual forms of information, including film, video, and emergent media. he received his phd in communication, information, and library studies from rutgers university, and his dissertation work focused on analog video digitization and the discourses and embodied experiences of media preservationists.

authors

jeremy a. bot is a professional 3d rigger and animator. his work involves the study and translation of creature behavior and motion into 3d graphics for film, tv, video games, and live performance. he currently works as a rigging supervisor for a large animation studio in vancouver, british columbia, and assists in the creation of a robotic artificial intelligence–driven human. under the banner of the digital life project, he frequently gives talks on using open-source tools for photogrammetry and animation.
andrea copeland is chair of the department of library and information science in the school of informatics and computing at indiana university in indianapolis. her research focus is public libraries and their relationship with communities. her current emphasis centers on connecting the cultural outputs of individuals and community groups to a sustainable preservation infrastructure. she is the co-editor of a recent volume, participatory heritage, which explores the many ways that people participate in cultural heritage activities outside of formal institutions.

thomas flynn began working with 3d for cultural heritage while at the british museum, creating the museum’s first online collection of downloadable 3d scans. he went on to co-found museum in a box (museuminabox.org) in and joined sketchfab as cultural heritage lead in . at sketchfab, his work focuses on exploring the possibilities of accessible 3d digitization and online 3d and extended reality, and encouraging institutions large and small to share their 3d data, knowledge, workflows, and experiments in this rapidly evolving field.

duncan j. irschick is a professor of biology at the university of massachusetts and the director and co-founder of the digital life project (www.digitallife d.org). his research focuses primarily on form and function in a wide range of animals, including reptiles, amphibians, and invertebrates, with a recent emphasis on sharks and sea turtles. he has published more than papers in peer-reviewed journals, and he regularly gives seminars and workshops at universities worldwide. his contact information is: department of biology, morrill science center, university of massachusetts at amherst, amherst ma . email: irschick@bio.umass.edu.
jessica meyerson is research program officer for educopia institute and co-founder of the software preservation network, a position that allows her to promote the essential role of software preservation in responsible and effective digital stewardship. she works across institutions, communities, and sectors to support applied research that advances digital preservation practices, including several concurrent projects aimed at broadening participation in software preservation and exploring curation approaches for software-dependent objects.

jennifer moore is the data services coordinator and anthropology librarian in the university libraries, and program faculty in international and area studies in the school of arts & sciences at washington university in st. louis. she leads data management and curation initiatives, teaches and provides support for data analysis, and oversees the libraries’ 3d capture initiatives. she is a co–principal investigator on the imls-funded community standards for 3d data preservation project and is a partner on the imls-funded project, building the digital curation workforce: advancing specialized data curation.

adam rountrey is a research museum collection manager at the university of michigan museum of paleontology. he has been involved with acquisition, analysis, visualization, preservation, and dissemination of 3d specimen data at that institution since . during this time, he developed the photogrammetry workflows and 3d web viewer for the university of michigan online repository of fossils, and he currently manages the online repository. in addition, he is a co–principal investigator on the imls-funded community standards for 3d data preservation project and is particularly interested in issues related to rights and ownership of 3d data in museum settings.
will rourk, who has a background in architecture and architectural history, has been a 3d data and content specialist with the university of virginia (uva) library, and more recently with the scholars’ lab, for more than two decades. he focuses on a cultural heritage informatics approach to the collection, processing, preservation, and distribution of 3d data of historic architecture and artifacts. his current methods for collecting measured data of historic content involve 3d laser scanning and photogrammetry. he has also been the lead architectural consultant to the uva tibetan and himalayan library since .

hannah scates kettler is a digital humanities research and instruction librarian in the university of iowa’s digital scholarship & publishing studio. her work entails leading digital projects from inception to preservation; managing the process; and providing research, development, and instruction support. her research interests include cultural heritage representation in digital libraries and infrastructure support for 3d virtual heritage collections. scates kettler is a co–principal investigator on the imls community standards for 3d data preservation project and on the andrew w. mellon collections as data: part to whole project. she is chair of the digital library federation (dlf) cultural assessment interest group and vice-chair elect for the association of college & research libraries’ newly formed digital scholarship section, as well as one of three inaugural dlf futures fellows.
victoria szabo is associate research professor of visual and media studies in the department of art, art history & visual studies at duke university, where she is a member of the wired! lab for digital art history & visual culture. she is the founding director of graduate studies for the phd in computational media, arts & cultures, and she also directs the information science + studies program, as well as the duke digital humanities initiative at the john hope franklin humanities institute. she is currently chair of the digital arts community for the association for computing machinery’s special interest group on computer graphics and interactive techniques.

ann baird whiteside is librarian/assistant dean for information resources at the harvard university graduate school of design. the focus of her work is expanding the creation of and access to digital resources in close collaboration with scholars and the use of technology to support teaching and research. she also actively works to create a bridge between technology, research support, and the re-envisioning of the twenty-first century library. she has been involved in projects such as cataloging cultural objects, the first facade project, sahara, and the building for tomorrow imls grant.

albert william is a lecturer in the media arts and science program within the school of informatics and computing at iupui.
he specializes in the 3d design and animation of scientific and medical content. he has received the silicon graphics, inc., award for excellence in computational sciences and visualization, the award for excellence in the scholarship of teaching, and the service and community engagement award. each year, he leads a service learning study abroad program to greece where the group documents cultural and historical artifacts by creating videos that include photography, 3d reconstructions of ancient sites, and virtual reality.

zebulun m. wood is a co-director and lecturer in the media arts and sciences program, human-centered computing department, within the school of informatics and computing (soic) at iupui. he works with students on projects that improve lives and disrupt industries. he is currently leading projects that explore the use of mixed reality in mental health intervention, d and d printing in prosthetic design, augmented reality in education, and education for innovation and entrepreneurial thinking. he has worked with students and soic faculty on virtual bethel, a unique collaboration involving digital archiving and a 3d re-creation of the bethel ame church building in indianapolis.

council on library and information resources
south clark street, arlington, va
web: www.clir.org

white paper report
report id:
application number: hd
project director: katherine walter (kwalter @unl.edu)
institution: university of nebraska, lincoln
reporting period: / / - / /
report due: / /
date submitted: / /

november
centernet: cyberinfrastructure for the digital humanities
white paper

background: in response to a summit held at the national endowment for the humanities in and hosted by the maryland institute for technology in the humanities (mith), a north american group of centers began serving as a steering committee to foster collaboration.
in , a digital humanities start-up grant was received by the university of nebraska-lincoln and the university of maryland to expand the idea of centernet worldwide, to create a formal structure, and to develop a business plan for long-term stability of the organization.

activity: through the funding provided by the national endowment for the humanities, centernet has developed into an international network of digital humanities centers. to achieve this end, an international summit was held in july at king’s college london. about center directors and funders were invited to attend. there was tremendous enthusiasm for creating an international association, and a rationale and a general structure for doing so were discussed. consequently, four regional affiliates have formed in asia pacific, europe, north america, and the u.k. and ireland, although we believe that the u.k. and ireland will merge with centernet europe in . many centernet members have been seeking centers in other parts of the world; however, it seems likely that libraries and information technology units may serve in place of centers in regions such as south america, africa, southeast asia and india. our hope is to help emerging centers develop by providing consulting and expertise.

in order to conduct the business of centernet, each regional affiliate selected two representatives to serve on an international executive council. with the help and backing of the international executive council, the principal investigators have worked with many others to develop governance documents, membership and business plans, to form partnerships with other organizations, and to plan for centernet’s future programming. as of , centernet is a constituent organization of the alliance of digital humanities organizations (adho), and through this relationship, has a measure of stability for the future. as a constituent organization, centernet is unlike any of the others in adho.
it is an organization of centers rather than an organization of individuals. because of the need to address membership fees differently for centernet, adho has worked with oxford university press to change the means of joining centernet from paying subscription fees (to llc: the journal of digital scholarship in the humanities) to paying membership fees, resulting in a subscription to llc only if desired.

while developing centernet’s governance structure and financial planning has taken a long time, the international board now seems to have reached consensus about our strategic directions, and as its organizational documentation and financial situation play out, the council is considering various programs and membership benefits of centernet.

• one is the revitalization of arts-humanities.net under another name and in partnership with dhcommons. working with partners at king’s college london and the university of southampton, a new drupal backend is in place. an international transition committee has proposed several far-reaching and exciting changes that are being considered by the international executive council.
• day of dh, formerly managed by the university of alberta, has been adopted as a program of centernet, and this in turn becomes a benefit of membership.
• the centernet website includes a map of the world with the location of centers and an aggregated rss feed feature.
• centernet has a listserve that will continue to be open.

in addition to the work of the regional affiliates and the international executive council, centernet has an annual general membership meeting at the adho digital humanities conferences. the vision for centernet is to share expertise and promote collaboration among centers, whether to facilitate conferences or to foster innovation through collaborative project development.
working with partners such as the consortium of humanities centers and institutes (chci), new curricular models are being developed for graduate education in digital humanities. we are establishing other formal affiliations with like-minded organizations, including the coalition of humanities and arts infrastructure network (chain), dhcommons, the digital library federation, and humanities. through these partnerships we believe that centernet will continue to make a difference.

conclusion: although paperwork remains ahead of us, centernet is now a reality, and it has great potential for encouraging international collaboration for research and teaching.

the principle and the pragmatist: on conflict and coalescence for librarian engagement with open access initiatives.
sarah potvin
the journal of academic librarianship. doi: . /j.acalib. . .

abstract: this article considers open access (oa) training and the supports and structures in place in academic libraries in the united states from the perspective of a new librarian. oa programming is contextualized by the larger project of scholarly communication in academic libraries, and the two share a historical focus on journal literature and a continued emphasis on public access and the economics of scholarly publishing.
challenges in preparing academic librarians for involvement with oa…
towards annotopia—enabling the semantic interoperability of web-based annotations

jane hunter * and anna gerber
school of information technology and electrical engineering, the university of queensland, st. lucia, qld , australia; e-mail: a.gerber@uq.edu.au
* author to whom correspondence should be addressed; e-mail: j.hunter@uq.edu.au; tel.: + - - - .

article. future internet , , - ; doi: . /fi. issn - . open access. www.mdpi.com/journal/futureinternet
received: june ; in revised form: august / accepted: august / published: august

abstract: this paper describes the results of a collaborative effort that has reconciled the open annotation collaboration (oac) ontology and the annotation ontology (ao) to produce a merged data model [the open annotation (oa) data model] to describe web-based annotations, and hence facilitate the discovery, sharing and re-use of such annotations. using a number of case studies that include digital scholarly editing, 3d museum artifacts and sensor data streams, we evaluate the oa model’s capabilities.
we also describe our implementation of an online annotation server that supports the storage, search and retrieval of oa-compliant annotations across multiple applications and disciplines. finally, we discuss outstanding issues associated with the oa ontology, and the impact that certain design decisions have had on the efficient storage, indexing, search and retrieval of complex structured annotations.

keywords: annotation; interoperability; semantic web; ontology

introduction and background

annotating documents is a core and pervasive practice for scholars across both the humanities and sciences. it can be used to organize and share existing knowledge or to facilitate the creation of new knowledge. annotations may be used by an individual scholar for note-taking, for classifying data and documents, or by groups of scholars to enable shared editing, collaborative analysis and pedagogy.

although numerous systems exist for annotating digital resources [ ], many of these systems can only be used to annotate specific collections, are based on proprietary annotation models, and were not designed to support open access to or sharing of annotations. scholars working across multiple content repositories often have to learn to use a number of annotation clients, all with different capabilities and limitations, and are unable to integrate their own annotations across collections, or to reference or respond to annotations created by colleagues using different systems. this presents a barrier for collaborative digital scholarship across disciplinary, institutional and national boundaries. given the importance of annotation as a scholarly practice, these issues, combined with the lack of robust interoperable annotation tools, and the difficulty of migrating annotations between systems, are hindering the exploitation of digital resources by scholars across many disciplines.
the open annotation collaboration (oac) data model

the oac project is a collaboration between the university of illinois, the university of queensland, los alamos national laboratory research library, george mason university and the university of maryland, which is funded by the andrew w. mellon foundation. the oac project was established to define a framework to enable sharing and interoperability of scholarly annotations on digital resources. the collaboration has produced an rdf-based open annotation data model [ ], which draws on a number of previous efforts including: the w3c’s annotea model [ ]; agosti’s formal model [ ]; sane scholarly annotation exchange [ ]; and oats (the open annotation and tagging system) [ ]. an analysis of these existing models reveals that, on the whole, they have not been designed to be both web-centric and resource-centric. moreover, they have modeling shortcomings that include: the preclusion of existing web resources (regardless of media type) from being the content or target of an annotation; the inability to publish annotations as independent, stand-alone web resources; and the lack of support for attaching a single annotation (body) to multiple targets.

the annotation ontology (ao)

developed concurrently with, but independently of, the oac model, the annotation ontology [ ] provides a model for annotating scientific documents on the web. the ao model draws on the annotea model and reuses the provenance, authoring and versioning (pav) ontology for annotation provenance. ao was designed to be integrated with the semantic web applications in neuromedicine (swan) ontology used for hypothesis-based representation of scientific discourse. it was also designed to integrate easily with existing scientific ontologies and vocabularies such as those expressed using simple knowledge organization system (skos) and social web ontologies such as semantically-interlinked online communities (sioc).
ao was originally designed to enable biomedical domain ontologies to be used to annotate scientific literature, but its creators believe it is more widely applicable to web resources in general.

the merged open annotation model (oa)

the w3c open annotation community group was established in to specify an extensible, interoperable framework for representing annotations, by aligning the oac and ao data models. figure illustrates the draft open annotation (oa) model [ ], which has emerged from the alignment of the two models. a comparison of the key differences between the oac and oa models is presented in table .

in figure , an oa annotation instance, anno , represents the reification of an annotates relationship between a body (the comment or data being attached by the annotator) and one or more target resources [where the body is “about” the target(s)]. body and target resources can be of any type, e.g., image, audio, video or text-based formats including html. annotation provenance is recorded via properties (see top section of figure ) that describe the context in which the annotation was created, including the time of creation and the human or software agent responsible for creating the annotation. annotation bodies can be identified by uri or included as inline content (i.e., where the content is embedded in the annotation) using properties from the w3c’s “representing content in rdf” [ ] ontology, as illustrated in the bottom left of figure .

figure . the open annotation (oa) model. note: figure republished from the oa core data model, under the terms of the w3c community contributor license agreement.

table . comparison of open annotation collaboration (oac) and oa models.

feature: provenance
  oac: recommends use of dublin core creator, created etc.
  oa: defines provenance properties, distinguishing between annotator and generator.

feature: annotation types
  oac: defines two subtypes of oac:annotation: oac:reply and oac:dataannotation. oac encourages subtyping.
  oa: common subtypes described in oax rather than the core model. provides guidelines for subtyping. oa recommends explicitly including oa:annotation type for all annotations.

feature: equivalent serializations
  oac: no equivalent concept in oac.
  oa: defines equivalent property for maintaining a relationship between re-published copies of an annotation.

feature: body
  oac: does not specify cardinality of annotation bodies.
  oa: explicitly allows annotations without a body, does not allow multiple bodies.

feature: resource segments
  oac:
  • recommends use of fragment uris for target and body resources, with a dcterms:ispartof relationship to base resource.
  • constrainedtarget and constrainedbody with oac:constrains allow other types of segment descriptions to be specified via the oac:constrainedby property.
  • defines a number of constraint types including oac:offsetrangeconstraint, oac:prefixpostfixconstraint, oac:svgconstraint, oac:webtimeconstraint and oac:contextconstraint.
  oa:
  • oa:specificresource replaces oac:constrainedtarget and oac:constrainedbody.
  • oa:hassource property replaces oac:constrains.
  • oac:constraints are replaced with oa:selectors.
  • fragment uris are disallowed for oa:hasbody and oa:hastarget: oa:fragmentselector is defined for representing fragments instead.
  • all other types of selectors are described in oax rather than the core model.
  • allows multiple alternative selectors.

feature: state
  oac: no equivalent concept in oac.
  oa: defines oa:hasstate which associates an oa:state with an oa:specificresource to allow clients to retrieve the correct representation of the resource (e.g., version, format, language etc).

feature: style
  oac: no equivalent concept in oac.
  oa: defines oa:hasstyle which associates an oa:style with an oa:specificresource to enable hints to be provided to the client recommending how to display the annotation and/or selector.

feature: extensions
  oac: no equivalent concept in oac.
  oa: splits the model into a stable core (oa), and extension (oax), which may change. oax includes subtypes of annotation, selector, state and style, hassemantictag and support for named graphs for structured data.

many annotations apply to segments of resources, such as a region of an image, or a paragraph of text within a larger document. in these cases, instead of targeting the uri that identifies the entire resource, an annotation will target a specificresource representing the segment of the resource. the extent of the segment is described by a selector and the complete resource is specified by hassource. the oa community group is working on defining a set of common subclasses of selector, including fragmentselector, which can be used to specify uri fragments using existing fragment schemes including html/xml ids, xpointers, or w3c media fragments, and svgselector for describing image regions.

state captures contextual information about a target resource when the annotation was attached, to assist clients in retrieving the correct version or representation of the target resource as it existed at the time of annotation. for example, the state information ensures the appropriate mime type is retrieved, if the same uri can return different representations via http content negotiation. style is used to provide hints to an annotation client about how an annotation should be displayed, which is useful in applications where color or formatting of the annotation markers needs to be persistent. selectors, states and styles can be specified inline (as for the body of this annotation), or by reference using a uri or urn. selectors, states and styles can also be applied to a specificresource representing the body of an annotation.
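the segment-targeting pattern described above can be sketched as a small set of rdf-style triples. this is a minimal illustration in plain python, not the authors' implementation: the example.org uris and the "#target" and "#selector" identifiers are hypothetical, the term casing follows the published oa vocabulary rather than the lowercased text here, and the use of oa:hasSelector and rdf:value to attach the selector and its fragment is an assumption about the draft model.

```python
# Minimal sketch: an OA-style annotation whose target is a segment of a resource.
# All example.org URIs are hypothetical; oa:hasSelector and rdf:value are
# assumed here as the way a selector and its fragment string are attached.

def make_segment_annotation(anno_uri, body_uri, source_uri, fragment):
    """Return a set of (subject, predicate, object) triples."""
    target = anno_uri + "#target"      # hypothetical id for the SpecificResource
    selector = anno_uri + "#selector"  # hypothetical id for the selector
    return {
        (anno_uri, "rdf:type", "oa:Annotation"),
        (anno_uri, "oa:hasBody", body_uri),
        # The annotation targets the SpecificResource, not the whole page.
        (anno_uri, "oa:hasTarget", target),
        (target, "rdf:type", "oa:SpecificResource"),
        (target, "oa:hasSource", source_uri),   # the complete resource
        (target, "oa:hasSelector", selector),   # the extent of the segment
        (selector, "rdf:type", "oa:FragmentSelector"),
        (selector, "rdf:value", fragment),
    }


def full_resources_for(triples, anno_uri):
    """Follow hasTarget, then hasSource, to recover the complete resource URIs."""
    targets = {o for s, p, o in triples if s == anno_uri and p == "oa:hasTarget"}
    return {o for s, p, o in triples if s in targets and p == "oa:hasSource"}


triples = make_segment_annotation(
    "http://example.org/anno/1",
    "http://example.org/body/1",
    "http://example.org/page.html",
    "para3",
)
```

a client that only needs the document being annotated can call full_resources_for and ignore the selector; a client that renders the highlight reads the selector as well.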
in addition to the core classes and properties which are defined in the core open annotation namespace (“oa:”) [ ], the w3c annotation community group is also working on an extension ontology (namespace “oax:”) [ ] that defines additional sub-classes and properties that are specific to certain common use cases and content media types. these extensions include: oa:annotation subclasses (e.g., oax:bookmark, oax:comment, oax:description, oax:highlight, oax:question, oax:reply, oax:tag); annotation types (dctypes:dataset; dctypes:image; dctypes:movingimage; dctypes:sound; dctypes:text); media-specific selectors (oax:textoffsetselector; oax:textquoteselector; oax:svgselector); oa:style subclasses (oax:cssvaluestyle, oax:xsltstyle) and additional properties (e.g., hassemantictag, to be used for tags that have uris). it is anticipated that the core data model will be mostly static whilst the extensions specification will provide the flexibility to accommodate changes and refinements based on community feedback.

in the remainder of this paper, we apply the open annotation (oa) model to a number of scholarly annotation use cases spanning several disciplines, in order to evaluate its capabilities. we also provide a discussion of some of the problematic and open issues associated with the oa model.

case studies

annotations for electronic scholarly editions

scholarly editions are the outcome of detailed study of a specific literary work or collection of shorter works such as poems or short stories. when preparing a scholarly edition, scholarly editors aim to provide a comprehensive description of the history of the literary work(s) including information about significant versions and physical forms. the ifla frbr model, which allows bibliographic entities to be described as works, expressions, manifestations or items [ ], can be used as a foundation for describing these versions.
versions can be represented as frbr expressions, while each issue of a given version can be represented as a frbr manifestation. physical characteristics of specific physical objects such as manuscripts are described at the frbr item level. in addition, a scholarly edition usually comprises a textual essay, textual notes that analyse variations and editorial decisions, and a textual apparatus that is compiled to record the alterations made between different versions of the work. annotations can be used to document these textual notes and variations, and can provide an additional layer of information about the documents being studied, and the people or organizations who were involved in the production of the work over time. annotations in the form of explanatory notes may also address the content of the text, identifying such things as allusions to other works, historical contexts and stylistic significance.

modern scholarly editions are increasingly collaborative exercises that involve a team of editors, advisors and editorial board members dispersed globally. the austese project [ ] is a collaboration between the university of queensland, university of nsw, curtin university, university of sydney, queensland university of technology, loyola university, chicago and the university of saskatchewan, which aims to develop a set of interoperable services to support the production of electronic scholarly editions by distributed collaborators in a web 2.0 environment. the annotation service that we are developing for the austese project enables scholarly editors to:
• create annotations that relate transcripts with facsimiles;
• attach notes to text and image selections;
• reference secondary sources;
• annotate textual variations, recording information about the reason for or the source of the variation;
• engage in collaborative discussion about texts through comments, questions and replies.
to support these use cases, we have defined several custom subclasses of the oa annotation class including: explanatorynote to provide explanatory commentary on selected characters, words, paragraphs, sections etc.; textualnote, to document or provide support for editorial decisions; and variationannotation to describe textual variations between multiple versions of a literary work. when a scholarly editor creates a variationannotation, the content of the annotation includes an assertion that there is a directional relationship between two versions (the original and the edited); and commentary and/or metadata (e.g., the responsible agent, the date of the variation, the reason for the variation), which applies to the asserted relationship: it does not apply to the versions individually. figure shows an example of a variationannotation. the body (ex:rem ) is an oai ore resource map [ ] that reifies the isvariantof relationship that the scholar is attaching between the original and variant texts. our use of ore resource map to reify a relationship between resources was inspired by the way that the oa annotation class reifies the oa:annotates relationship between body and target resources. the properties and relationships displayed in blue (at the bottom of figure ) are encapsulated within the body, whilst those at the top of figure , exist within the annotation graph. as an ore resource map, the body resource can be identified by a uri and can be reused outside of the context of the annotation. by using ore to represent complex scholarly annotation bodies, we hope to leverage existing tools such as lore [ ], a graphical authoring, publishing and visualization tool that we developed for authoring ore resource maps. future internet , figure . oa: annotation describing textual variation. 
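As a rough illustration of the structure described above, the following Python sketch builds the two graphs of a variation annotation as plain (subject, predicate, object) tuples: the annotation graph, and the body graph that reifies the variant relationship. It is a minimal sketch under stated assumptions, not the AustESE implementation. The `ex:` class and property names (`ex:VariationAnnotation`, `ex:isVariantOf`, `ex:responsibleAgent`, `ex:variationReason`) and all URIs are hypothetical; only the body/target split and the use of an ORE resource map as the body come from the text.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_variation_annotation(
    anno: str, body: str, original: str, variant: str, agent: str, reason: str
) -> Dict[str, List[Triple]]:
    """Return two named graphs, keyed by the annotation URI and the body URI."""
    # Annotation graph: links the annotation to its body and to both versions.
    annotation_graph = [
        (anno, "rdf:type", "ex:VariationAnnotation"),  # hypothetical subclass name
        (anno, "oa:hasBody", body),
        (anno, "oa:hasTarget", original),
        (anno, "oa:hasTarget", variant),
    ]
    # Body graph: the resource map carries the directional assertion and the
    # commentary that applies to the relationship, not to either version.
    body_graph = [
        (body, "rdf:type", "ore:ResourceMap"),
        (variant, "ex:isVariantOf", original),  # hypothetical property name
        (body, "ex:responsibleAgent", agent),   # hypothetical property name
        (body, "ex:variationReason", reason),   # hypothetical property name
    ]
    return {anno: annotation_graph, body: body_graph}


graphs = build_variation_annotation(
    "ex:anno", "ex:rem", "ex:original.html", "ex:variant.html",
    "ex:editor", "spelling regularised",
)
```

Keeping the two graphs separate mirrors the point made in the text: the body can be identified and reused on its own, outside the context of the annotation.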
In Figure , both of the target resources are transcripts rendered as HTML; however, scholars may also annotate other digital surrogates of the same version of the work (including transcripts) in alternate formats such as plain text, TEI XML or scanned page images. Each target transcript is associated with a FRBR manifestation via custom ontology properties, which identify the bibliographic entities for the versions of the work under study. These properties can be traversed by SPARQL queries to retrieve annotations across all digital surrogates of the same FRBR manifestation. A URI identifying the person who created the annotation, as well as timestamps for when the annotation was created or modified, are recorded as properties of the annotation (see Figure ). Also associated with each creator URI is their FOAF name property, which is used for display purposes. The OA model allows metadata such as dcterms:creator and timestamps (associated with dcterms:created and oa:annotated) to be specified for each body and target resource as well as for the annotation object. This means that the author of an annotation body can be different from the author of the annotation object, allowing scholars to create annotations in which the body comprises content created by other scholars. An example of this type of reuse is the creation of annotations that use selected notes from an existing electronic scholarly edition as commentary. Figure shows an OA model instance associating a textual note originally created by Paul Eggert for a digital scholarly edition of Ned Kelly’s Jerilderie Letter [ ]. By attaching dcterms:creator and dcterms:created to the body resource, the original scholar who created the note and the date it was created can be clearly specified.
the person who has created the new annotation (damien ayers) is future internet , attributed, since they discovered and reused the original content, and may have attached tags or other metadata to the annotation to aid discovery or classification. figure above illustrates how collaborators can generate a discussion around a particular piece of text by replying to existing annotations. this example only shows a single reply, however it is possible to have a chain of replies in which each previous reply is the target of the next reply. figure . oa instance for textual note. figure . oa instance illustrating a reply to an existing annotation/textualnote. . . migrating annotations across multiple representations of museum artefacts the d semantic annotations ( dsa) project [ ] aims to develop simple, semantic annotation services for d digital objects (and their sub-parts) that will facilitate the discovery, capture, future internet , inferencing and exchange of cultural heritage knowledge. the dsa annotation tool enables users (curators, scholars, students and the general public) to search, browse, retrieve and annotate d digital representations of cultural heritage objects via a web browser. the first step involves generating a collection of d models of artefacts from the uq antiquities museum (greek vases and roman sculptures) and the uq anthropology museum (indigenous carvings from the wik community in western cape york). the d representations have been created using a konica minolta i d laser scanner. the dsa annotation client, shown in figure , is a web application that allows ontology-based (semantic) tags and/or free text comments to be attached to d objects. annotations can be attached to the whole object or to a point, surface region or d volumetric segment of the object. the central panel provides an interactive d viewer that allows the user to pan, zoom and rotate the d object. 
Existing annotations on the object are indicated by the colored markers attached to the object, and are listed on the right-hand side. Clicking on an annotation in the right-hand list rotates and zooms the object so that the selected annotation is centred, with its details shown in the panel at the bottom centre.

Figure . 3DSA annotation client.

The laser scanner produces archival-quality models, with each model being on average between – MB, containing . – million polygons and with surface textures scanned as high-resolution bitmaps. However, many of the users accessing the collection from remote locations have limited bandwidth, poor graphics capabilities and limited compute power, and hence are unable to render very high-resolution, archival-quality models. In order to support users with variable internet connectivity and computing resources, each model is converted into a high-resolution and a low-resolution X3D format, as well as a “2.5D” FlashVR file constructed from a sequence of 2D images. Table outlines the characteristics of the three formats for the web.

Table . 3D object characteristics.

Attribute        High quality 3D    Low quality 3D    2.5D VR
Format           X3D                X3D               Flash
File size        – MB               – MB              MB
Polygon count    . million          ,                 n/a

In order to allow users to collaborate regardless of which format they are using to access the museum objects, 3DSA supports the automatic migration of annotations between the high-resolution 3D model, the low-resolution 3D model and the 2.5D representation. The annotation client automatically migrates annotations attached to a point or region on one representation of the model across to the other formats. In the case of the X3D format files, the segments being annotated are identical, as the 3D representations share the same local matrix. However, when migrating to the FlashVR image sequence, points need to be projected from a 3D space (identified using x, y and z co-ordinates) onto a 2D plane (x and y only).
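The projection step described above can be sketched in a few lines of Python. The paper does not say which projection the annotation client uses, so this is only an illustration under stated assumptions: the image sequence is taken to be a turntable sequence in which each frame shows the object rotated about its vertical (y) axis, the camera looks along the z axis, and the projection is orthographic. The function name `project_to_frame` and its parameters are hypothetical.

```python
import math
from typing import Tuple


def project_to_frame(
    point: Tuple[float, float, float], yaw_degrees: float
) -> Tuple[Tuple[float, float], bool]:
    """Project a 3D point onto the image plane of one frame.

    Assumes the object is rotated about its y axis by yaw_degrees in this
    frame and viewed orthographically along the z axis. Returns the 2D
    point and whether the point faces the camera (non-negative depth).
    """
    x, y, z = point
    angle = math.radians(yaw_degrees)
    # Rotate about the y axis, then drop the depth coordinate.
    x_rot = x * math.cos(angle) + z * math.sin(angle)
    z_rot = -x * math.sin(angle) + z * math.cos(angle)
    return (x_rot, y), z_rot >= 0.0


front_xy, front_visible = project_to_frame((1.0, 2.0, 3.0), 0.0)
side_xy, _ = project_to_frame((0.0, 0.5, 1.0), 90.0)
```

The visibility flag matters in practice: a point on the far side of the object in a given frame has a valid 2D position but should not be drawn as a marker in that frame.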
figure illustrates an oa model instance for a sample annotation after it has been migrated across the three formats. rather than having to manage three separate annotations, and ensure that their contents remain synchronized, the oa model allows all of the derivative models to be targeted within a single annotation. the body of this annotation is specified inline using cnt:chars (to represent the content) and cnt:characterencoding (to specify the type of the content) from the “representing content in rdf” ontology. figure . instance of oa model for dsa. future internet , the dsa client also captures the co-ordinates, rotation, light position and other related data, which is used to recreate the precise view, zoom level, and position at which the annotation was originally attached. there is no standard for addressing segments within d models, and as there may be many different values stored to represent each segment including co-ordinates, transformation matrixes etc, these d selectors are not well suited to being encoded in a uri fragment, in the style of the w c media fragments [ ] or xpointers. consequently, we have had to define custom subclasses of selector to describe points, surface regions and segments of a d or . vr model. in the example shown in figure , the two d format targets share the same x d-format selector ( dsel) for a given surface region, however the . vr format requires a different selector (qtvrsel) because the data has been transformed to apply to the d plane. in addition to having to support selectors that precisely specify points, d surface regions or d volumetric segments on d digital models, there is also a need to attach structured annotations that comprise both ontology-based (semantic) keywords (e.g., “attic, lekythos”) as well as full-text descriptions “the woman holds a lidded bowl over a large kalathos”. 
The semantic tags would be attached using the “oax:hassemantictag” property and the textual description would be represented as an inline body (since it does not have a URI). The text box below illustrates how the OA model would represent such an annotation (using the Turtle RDF format).

    @prefix oa: <http://www.openannotation.org/ns/> .
    @prefix oax: <http://www.openannotation.org/ns/> .
    @prefix cnt: <http://www.w .org/ /content#> .
    @prefix ex: <http://www.example.org/#> .

    ex:anno a oa:annotation;
        oa:hasbody ex:body ;
        oa:hastarget ex:st ;
        oa:hastarget ex:st ;
        oa:hastarget ex:st .

    ex:st a oa:specificresource;
        oa:hasselector ex: dsel;
        oa:hassource <http://itee.uq.edu.au/eresearch/ dsa/vase _highres.x d>.

    ex:st a oa:specificresource;
        oa:hasselector ex: dsel;
        oa:hassource <http://itee.uq.edu.au/eresearch/ dsa/vase _lowres.x d>.

    ex:st a oa:specificresource;
        oa:hasselector ex:vrsel;
        oa:hassource <http://itee.uq.edu.au/eresearch/ dsa/vase .qtvr>.

    ex:anno oax:hassemantictag <http://itee.uq.edu.au/eresearch/ dsa_ontology#lekythos>.
    ex:anno oax:hassemantictag <http://itee.uq.edu.au/eresearch/ dsa_ontology#attic>.

    ex:body a cnt:contentastext;
        cnt:characterencoding "utf- ";
        cnt:chars "the woman holds a lidded bowl over a large kalathos".

Automatic Semantic Tagging of Species Accelerometry Data and Video Streams

Within the OzTrack project [ ] at the University of Queensland, we are working with ecologists from the UQ Eco-Lab, providing tools to enable them to analyse large volumes of 3D accelerometry data acquired by attaching tri-axial accelerometers to animals (e.g., crocodiles, cassowaries, wild dogs) to study their behavior.
the interpretation of animal accelerometry data is an onerous task due to: the volume and complexity of the data streams; the variability of activity and behavioural patterns across animals (due to age, environment, season); the lack of visualization and analysis tools; the inability to share data and knowledge between experts; the lack of ground truth data (e.g., in many cases there is no observational video); and the inaccessibility of machine learning/automatic recognition tools. as a result of the challenges identified above, we have been collaborating with the eco-lab developing the saar (semantic annotation and activity recognition) system, with the aim being: • to provide a set of web services and graphical user interfaces (guis) that enable ecologists to quickly and easily analyse and tag d accelerometry datasets and associated video using terms from controlled vocabularies (pre-defined ontologies) that describe activities of interest (e.g., walking, running, foraging, climbing, swimming, standing, lying, sleeping, feeding); • to build accurate, re-usable, automatic activity recognition systems (using machine learning techniques (support vector machines))—by training classifiers using training sets that have been manually annotated by domain experts. figure illustrates the overall system architecture of the saar system. saar uses the oa model to represent tags created manually by experts via the tagging system as well as tags that have been automatically generated by the classification model. 
this case study is interesting from the open annotation data model perspective because: • it provides a situation whereby the tags, that describe animal activities (running, walking, swimming, climbing, standing, lying, sleeping, feeding), are generated automatically by the svm classification model; • it provides an example of the application of the oa fragment selector—tags are attached to a temporal segment of the d accelerometry streams as well as the associated video content where available. the client interface comprises a web-based plot-video visualization interface (see top of figure ) developed using a combination of ajax, flot (a plotting jquery library) [ ], html video player library (video.js) [ ] with javascript. this interface enables users to interactively visualize both tri-axial accelerometer data alongside simultaneously recorded videos. users invoke the semantic annotation service by selecting a segment of accelerometer data from the timeline or a segment of video from the video timeline, and then attaching an activity tag chosen from a pull-down menu (whose values are extracted from a pre-defined ontology). the manually created annotation is stored in a backend rdf triple store (see section ) implemented using the apache tomcat java server and sesame . . additional annotation functions such as edit, refresh, and retrieve annotations are also supported. future internet , the automatic activity recognition component is implemented using the libsvm java library. at the training stage, users can interactively search for and retrieve specific segments/annotations using the following annotation search terms: species, creator, animal id and activity_tag. figure . high level architectural view of the saar system. the query is converted to the sparql query language which queries the annotation server. the retrieved annotations and data streams are processed to generate a set of application-dependent features that correspond to each tag. 
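The conversion of search terms into a SPARQL query mentioned above might look roughly like the following Python sketch. It is an illustration only, not the SAAR code. The four term names come from the text, but the `ex:` predicates they are mapped to are hypothetical, and the namespace URIs are simply the ones used in the Turtle example elsewhere in this paper.

```python
PREFIXES = (
    "PREFIX oa: <http://www.openannotation.org/ns/>\n"
    "PREFIX ex: <http://www.example.org/#>\n"
)

# Search terms named in the text; the ex: predicates built from them are assumed.
ALLOWED_TERMS = ("species", "creator", "animal_id", "activity_tag")


def _literal(value: str) -> str:
    """Quote a string as a SPARQL literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_annotation_query(**terms: str) -> str:
    """Build a SELECT query that matches annotations on every supplied term."""
    unknown = set(terms) - set(ALLOWED_TERMS)
    if unknown:
        raise ValueError(f"unsupported search terms: {sorted(unknown)}")
    patterns = ["?anno a oa:Annotation ."]
    for name in ALLOWED_TERMS:
        if name in terms:
            patterns.append(f"?anno ex:{name} {_literal(terms[name])} .")
    body = "\n  ".join(patterns)
    return f"{PREFIXES}SELECT ?anno WHERE {{\n  {body}\n}}"


query = build_annotation_query(species="crocodile", activity_tag="running")
```

Restricting the accepted terms to a fixed list and escaping the literal values keeps user input from altering the structure of the generated query.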
after the specific hierarchical svm classification model is built for all of the activity tags, new tri-axial accelerometer data are input to the trained svm classifier which then automatically tags the fresh input data (based on similarity between features). the automatically assigned tags are displayed in the timeline visualization pane, where experts can check or correct them. figure illustrates an instance of the oa data model that represents an automatically extracted tag (“running”) that has been assigned to a specific temporal segment (from s to s) of accelerometry data and associated video, using an oa:fragmentselector containing a w c media fragment (“t = , ”). future internet , figure . oa representation of an automatically generated animal activity tag. . implementation each of the case studies described above makes use of custom annotation client software, which has been developed to provide a user interface tailored to the annotation types and user needs, specific to the community. however, by adopting the common oa model within these applications, we have been able to develop a generic open source annotation repository, lorestore [ ], which can be used to store, search and retrieve annotations for all of the case studies described above. the lorestore repository was developed as a java web application on top of the sesame . rdf framework, and uses a named graph to store each annotation and ore-compliant data body. a rest api is provided to support crud operations on annotations and bodies (create, read, update, delete) as well as search and retrieval by target uri or via keyword search. a sparql endpoint has also been implemented to enable custom queries, and atom feeds are implemented to allow subscription to, or harvesting of annotations by target. the repository supports http content negotiation so that annotations can be retrieved by their identifying uri in a variety of serialization formats including trig, trix, rdf/xml and json. 
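To make the content negotiation step concrete, the sketch below picks a serialization from an HTTP Accept header. It is a simplified illustration, not lorestore's implementation: the media type strings are the ones commonly used for these formats and are an assumption here, the default format is arbitrary, and the parser ignores details such as wildcard subtypes.

```python
from typing import Optional

# Commonly used media types for the four formats named in the text (assumed).
SUPPORTED = {
    "application/trig": "TriG",
    "application/trix": "TriX",
    "application/rdf+xml": "RDF/XML",
    "application/json": "JSON",
}


def choose_format(accept_header: str, default: str = "application/rdf+xml") -> Optional[str]:
    """Return the supported media type with the highest q-value, or None."""
    best_type, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        quality = 1.0  # q defaults to 1 when the client does not give one
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_type == "*/*":
            media_type = default  # client accepts anything: use the default
        if media_type in SUPPORTED and quality > best_q:
            best_type, best_q = media_type, quality
    return best_type


chosen = choose_format("text/html;q=0.9, application/trig;q=0.8, application/json;q=0.5")
```

A `None` result means none of the requested types can be served, in which case a server would typically answer with a 406 status.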
authentication and user account management are implemented using the emmet framework over spring security, which supports basic role-based, openid and/or shibboleth-based authentication. annotations can be flagged as private, so that only the owner can view or modify their content, or locked, so that they cannot be modified. a web-based user interface for the repository provides a search interface and renders annotations from their underlying rdf representation to human-readable html as well as diagrammatic representations. for administrators, the web interface also supports various content management and user management operations. figure illustrates the technical components of our generic annotation repository, lorestore. future internet , in the near future we will begin implementation of a web service that validates an annotation by checking its conformance with the open annotation core ontology. this service will return a positive or negative message, with regard to compliance, prior to the annotation being saved in the backend repository. the validation web service will be designed so that it can interact with any authoring tool or backend repository. it is shown in figure between the client and the security layer. figure . technical components of the lorestore annotation repository. . discussion and evaluation the exercise described above—evaluating the oa data model through a number of domain-specific use cases - has enabled us to identify the model’s strengths and weaknesses. the main strengths of the oa model that we have identified are as follows: • the oa model supports multiple targets, and each can be associated with a selector for specifying the segment of interest. 
this has allowed us to create annotations that describe textual variation across multiple textual documents, to migrate a single annotation across multiple representations of a d museum object, and to attach a semantic tag to both accelerometry data streams and video that documents an animal’s movements—without extending the model or creating aggregate targets. by comparison, previous annotation data models, such as the annotea ontology that underpins ao were designed with the assumption of a single target, so they provide no mechanism for associating an annotea context (used for describing the segment of a target resource) with a specific target. • bodies and targets can be any media type and can be located on any server. this flexibility means that we can directly annotate digital resources that have been made available through online collections, such as those digitized and published by archives, museums, libraries and future internet , publishers. it also allows us to create bodies that are rdf, so that metadata properties associated with the body can be stored separately rather than included in the annotation graph, making the provenance of the body, the target and the annotation clear and explicit. • because the oac model is rdf-based, it is a trivial exercise to extend the model and include properties from existing domain-specific ontologies within the annotation graph. for example, for the scholarly editions case study, we use custom properties to link target documents to frbr entities, allowing us to query and retrieve annotations across multiple versions of the same frbr expression or work. the main weaknesses of the oa model that we have identified are: • the relative complexity of the oa model for basic use cases. it is possible that many developers who want to implement simple tagging of web pages or whole digital resources, will find the oa model too complex for their needs and hence it will not be widely adopted. 
(on the other hand it is the flexibility afforded by such complexity that enables the oa model to represent complex scholarly annotation use cases, that simpler models like annotea are not capable of supporting.) • the ambiguity that exists within the model. for example, there are multiple ways to represent resource segments. such ambiguity increases the development effort required to produce tools that fully implement the model. for example, for annotations on part of an image, the image segment could be specified: using an svgselector (with a constrains relationship to the image uri); using a media fragment identifier in the target uri (with an ispartof relationship to the image uri); or using a media fragment expressed as a oa:fragmentselector. in all three cases there is no direct link from the annotation object to the image uri, so a query to retrieve all annotations on a given image must examine the target uri (for “whole of image” annotations), as well as uris related via properties. • the explosion of uris. the semantic web/linked open data approach adopted by oa, recommends that every resource (bodies, annotations, targets, selectors, states, styles) is accessible on the web via a persistent uri. annotation authoring and management tools will need to create, track and manage large numbers of uris. • the current oa model does not support multiple bodies—only multiple targets. a common situation that we found in our case studies is the need to attach both multiple semantic tags (or keywords), as well as a textual description to a resource, within a single annotation event. the current model recommends that you create multiple separate annotations. but this creates redundancy and also does not accurately reflect the annotation event. an alternative approach is to attach the textual description as an inline body and to use the “hassemantictag” property to attach the semantic tags. 
as discussed in the next bullet point, we don’t support the “hassemantictag” property that is currently included in the extension ontology as it is inconsistent with the existing class and property hierarchy. • lack of support for multiple semantic tags. a semantictag should ideally be defined as a subclass of the oa:annotation class in the extensions ontology. in addition, the core oa ontology should allow multiple bodies to be defined. these two changes would enable the future internet , model to support the use case that we describe above, and simultaneously maintain a well-structured and consistent ontology. • performance issues when querying. the complexity of the oa model may lead to slow performance when querying and retrieving on annotation fields that are buried a number of levels down the annotation graph. for example, the most common query is on targets e.g., “give me all of the annotations on this target resource”. the target resource may be the object of the hastarget or hassource properties, or a selection that is retrieved by applying the selector to hassource, or part of an aggregated target. resolving all of these alternatives to match the target uri and retrieve and display the relevant annotations, may require optimization to improve performance. • lack of standards for specifying segments of resources. the ability to use segments or fragments of resources as bodies or targets, is extremely useful. however the interoperability of this aspect of the model is limited by a lack of standards across communities for describing/identifying parts of things e.g., text segments across document formats. the w c media fragments working group recently published a proposed recommendation which can be applied to multimedia resources (images, video, audio)—but there is a real need for communities to agree on schemas/mechanism for selectors on textual resources, maps, timelines and d objects. . 
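The target lookup problem raised in the bullets above can be illustrated with a small Python sketch over in-memory triples. It is a simplified model under stated assumptions: triples are plain tuples, the `dcterms:` prefix on the part-of property is assumed (the text names the relationship only as “ispartof”), only one level of indirection is followed, and all URIs are made up for the example.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Properties that link a segment or specific resource back to the whole resource.
LINK_PROPERTIES = ("oa:hasSource", "dcterms:isPartOf")


def annotations_on(resource: str, triples: Iterable[Triple]) -> List[str]:
    """Find annotations that target a resource directly or via a segment of it."""
    triples = list(triples)
    # Start with the resource itself, then add anything that points back to it.
    targets = {resource}
    for subject, predicate, obj in triples:
        if obj == resource and predicate in LINK_PROPERTIES:
            targets.add(subject)
    return sorted(
        {s for s, p, o in triples if p == "oa:hasTarget" and o in targets}
    )


example = [
    ("ex:a1", "oa:hasTarget", "ex:image.jpg"),                    # whole image
    ("ex:a2", "oa:hasTarget", "ex:sr1"),                          # specific resource
    ("ex:sr1", "oa:hasSource", "ex:image.jpg"),
    ("ex:a3", "oa:hasTarget", "ex:image.jpg#xywh=10,10,50,50"),   # fragment URI
    ("ex:image.jpg#xywh=10,10,50,50", "dcterms:isPartOf", "ex:image.jpg"),
    ("ex:a4", "oa:hasTarget", "ex:other.jpg"),                    # unrelated
]
found = annotations_on("ex:image.jpg", example)
```

Even in this small model the query needs two passes over the data, which is the kind of extra work that may call for indexing or query optimization in a real repository.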
Conclusions

In this paper we first describe the Open Annotation (OA) model that has recently been developed through the W3C Open Annotation Community Group by aligning the OAC and AO data models. We then apply this model to a number of use cases that have arisen within collaborative projects conducted through the UQ ITEE eResearch Lab. We have demonstrated how the OA model can be applied and extended in each use case to support scholarly annotation practices ranging from basic attachment of comments and tags to digital resources, through to relating multiple target resources, and data annotation. We have also illustrated how the OA model has enabled us to create common backend infrastructure for storing, indexing and querying annotations across applications and authoring clients. We have identified and discussed some of the outstanding issues that need to be addressed in order to improve the interoperability of this approach. By exposing annotations to the semantic web as linked open data, the Open Annotation approach enables robust machine-to-machine interactions and automated analysis, aggregation and reasoning over distributed annotations and annotated resources. Moreover, this annotation environment will allow scholars and tool-builders to leverage traditional models of scholarly annotation, while simultaneously enabling the evolution of these models and tools to make the most of the potential offered by the semantic web and linked data environments.

Acknowledgments

We gratefully acknowledge the contributions to this paper and to the annotation cases described within from the following individuals: Roger Osborne and Paul Eggert (scholarly editions); Chih-Hao Yu (3D museum objects); and Lianli Gao (sensor data streams). We also wish to thank the Andrew W.
mellon foundation for their generous funding of the oac project and our collaborators from the open annotation collaboration (in particular tim cole, herbert van de sompel, rob sanderson) and the w c open annotation community group. references . hunter, j. collaborative semantic tagging and annotation systems. ann. rev. inf. sci. technol. , , – . . sanderson, r.; van de sompel, h. open annotation: beta data model guide. open annot. collab. . available online: http://www.openannotation.org/spec/beta/ (accessed on april ). . kahan, j.; koivunen, m.r.; prud’hommeaux, e.; swick, r.r. annotea: an open rdf infrastructure for shared web annotations. comput. netw. , , – . . agosti, m.; ferro, n. a formal model of annotations of digital content. acm trans. inf. syst. , . doi: . / . . . boot, p. a sane approach to annotation in the digital edition. jahrb. computerphilogie , , – . . bateman, s.; farzan, r.; brusilovsky, p.; mccalla, g. oats : the open annotation and tagging system. in proceedings of the third annual international scientific conference of the learning object repository research network, montreal, canada, – november . . ciccarese, p.; ocana, m.; garcia castro, l.j.; das, s.; clark, t. an open annotation ontology for science on web . . j. biomed. semant. , , s : –s : . . sanderson, r.; ciccarese, p.; van de sompel, h. open annotation draft data model. open annot. collab. . available online: http://www.openannotation.org/spec/core/ (accessed on august ). . koch, j.; velasco, c.a.; ackermann, p. representing content in rdf . ; w c working draft may ; w c: cambridge, ma, usa, . available online: http://www.w .org/ tr/content-in-rdf / (accessed on august ). . the core open annotation namespace. available online: http://www.w .org/ns/openannotation/ core/ (accessed on august ). . the extensions namespace. available online: http://www.w .org/ns/openannotation/extension/ (accessed on august ). . international federation of library associations and institutions (ifla). 
functional requirements for bibliographic records; ifla: hague, the netherlands, . . austese project. available online: http://austese.net/ (accessed on august ). . open archives initiative—object reuse and exchange (oai-ore). ore user guide—resource map implementation in rdf/xml; oai: new york, ny, usa, . available online: http://www.openarchives.org/ore/ . /rdfxml (accessed on august ). . gerber, a.; hunter, j. compound object authoring and publishing tool for literary scholars based on the ifla-frbr. int. j. digit. curation , , – . . eggert, p.; mcquilton, j.; lee, d.; crowley, j. ned kelly’s jerilderie letter: the jitm worksite. available online: http://web.srv.adfa.edu.au/jitm/jl/annotation_viewer.html (accessed on august ). future internet , . yu, c.h.; groza, t.; hunter, j. high speed capture, retrieval and rendering of segment-based annotations on d museum objects. lect. notes comput. sci. , , – . . w c media fragments working group. media fragments uri . (basic); w c proposed recommendation; w c: cambridge, ma, usa, . available online: http://www.w .org/tr/ /pr-media-frags- / (accessed on august ). . oztrack project. available online: http://www.oztrack.org/ (accessed on august ). . flot—attractive javascript plotting for jquery. available online: http://code.google.com/p/flot/ (accessed on august ). . videojs—the open source html video player. available online: http://videojs.com/ (accessed on august ). . lorestore on github home page. available online: https://github.com/uq-eresearch/lorestore/ (accessed on august ). © by the authors; licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution license (http://creativecommons.org/licenses/by/ . /). 
A Study on the Future Development of Korean Institutional Repository through an Analysis of Developmental Aspects of Japanese

조재인 (Jane Cho)*

Abstract

An institutional repository (IR) is a key instrument not only for the long-term preservation and rapid distribution of a university's intellectual output, but also for changing a scholarly communication flow dominated by commercial publishers. In Korea, the dCollection system, promoted by the Ministry of Education, Science and Technology, has established itself as the core operational tool for collecting and preserving institutional output and for sharing and circulating it nationwide. It is hard to say, however, that dCollection is a tool that universities operate voluntarily to account for their existence by announcing their research accomplishments and, beyond that, to realize open access. Japan, like Korea, has spread IR operation through government programs such as the “Next Generation Infrastructure Construction Project.” Although adoption has been relatively slow there, individual universities are voluntarily realizing the repository's intrinsic functions and maturing their operating skills, so the two countries show similar but distinct patterns of development. This study comparatively analyzes Japan's IR support policies and the development of its repositories, and surveys the challenges facing Korean institutional repositories. In Korea, building and operating a database of the copyright policies of academic societies is urgently needed to promote self-archiving. Institutional and technical support is also needed so that institutions can mature their repository operations on their own, including integration with campus systems such as research achievement systems, registration of diverse content, and stronger outward dissemination.
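Keywords: IR (institutional repository), open access, self-archiving, dCollection, KERIS, NII

* Full-time lecturer, Department of Library and Information Science, Incheon City College (chojane@icc.ac.kr)
■ Manuscript received: ■ First reviewed: ■ Accepted:
■ Journal of the Korean Society for Information Management, ( ): - , . [doi: . /kosim. . . . ]

Introduction

Background and Purpose of the Study

Researchers, who produce scholarly papers, submit their results even though they gain no direct economic benefit from publication, because they hold that research results are the common intellectual property of humankind. However, a small number of commercial publishers that monopolize information through expensive scholarly journals have created barriers to access. The open access movement began from the judgment that the current system is producing dysfunction in scholarly communication, and from an effort to return the leading role to researchers. Because the institutions and individuals involved respond from many different positions, however, the movement has become too complex to be understood as a simple contest between commercial publishers and researchers (逸村裕 ). There are two ways to realize open access: publishing open access journals, and self-archiving. Self-archiving means obtaining permission from the publisher to archive, in a repository, papers that were published in subscription journals and are distributed commercially. The institutional repository, which has become the main vehicle for self-archiving, lets researchers preserve their work over the long term and distribute their results quickly, which can increase citation rates. For institutions, it reduces the time-consuming work of maintaining publications and can change a scholarly communication flow dominated by commercial publishers (이나니 ). It is also an important means of explaining the institution's reason for existence by announcing the university's research results to the outside world.

The construction of institutional repositories in Korea, which began in , gained momentum as part of the “knowledge information creation and distribution system project (dCollection)” promoted by KERIS (Korea Education & Research Information Service) under the Ministry of Education, Science and Technology. In Korea the effort started as a government-led support program that preceded any voluntary need or will on the part of universities. In this respect it resembles developments in Japan, where repositories were first promoted through the Next-Generation Academic Content Infrastructure Joint Construction Project (次世代学術コンテンツ基盤共同構築事業) of NII (National Institute of Informatics, 国立情報学研究所). NII began its support program for institutional repository operation in with national universities such as Chiba University (千葉大学) at its centre, and has continued it to the present through the “CSI commissioned project, phases - ”. Unlike Korea, where a system and a standard business model were distributed wholesale, universities selected through open calls adopt and operate systems voluntarily and propose tested solutions to the institutional and technical problems that arise in operation, which later-adopting libraries can consult. As a result, adoption has been slower than in Korea, but operating skills are maturing through competition among universities, so the two countries show similar yet distinct patterns of development.

An analysis of the operation and development of Japanese institutional repositories, which has so far not been properly studied in this field, should yield implications for the direction of Korea's repository support policy and for maturing the operating skills of individual universities. In that context, this paper pursues the following goals. First, it reviews the policies and technical trends recently announced by major organizations in relation to open access and institutional repositories.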
두 번째, 한국의 기 포지토리 황을 요 약해 보고 일본의 기 포지토리 지원 정책 과 발 양상을 다각도로 살펴본다. ‘nii의 차 세 학술 컨텐츠 기반 공동 구축(次世代学術 コンテンツ基盤共同構築) 사업' 련 문건, 년도 ‘csi 탁사업 보고 교류회(csi 委 託事業補で交流会)'에서 발표된 개별 학의 운 실태, 일본 roar(registry of open access repositories)인 irdb(institutional re- pository database contents analsis)를 기반으 로 시스템 운 , 컨텐츠 수집 체계, 작권, 아 카이빙 컨텐츠의 외부 발신 노력 등 운 반 을 분석해 본다. 세 번째, 일본의 개 양상과 한국의 황을 비교 분석해 보고, 이를 기반으로 한국 기 포지토리의 발 방향을 조망해 본다. . 최신 동향 및 운영 지원 기술 년 말을 기 으로 ‘opendoar(the di- rectory of open access repositories, http:// www.opendoar.org/)’에는 미국 개( %), 국 개( %), 독일 개( %), 일본 개( %) 기 을 포함한 개국, , 개의 기 포지토리가 등록되어 있다. ) 한편, oaister (http://www.oaister.org/viewcolls.html)에 속한 기 포지토리의 숫자도 년 월 개, 년 월 개, 년 월 , 개 기 으로 순조로운 성장세를 보이고 있다. 본 장에서는 기 포지토리와 련된 최신 정책 동향을 먼 살펴보며, 더불어 기 포지토 리 운 성숙에 기여하고 있는 련 기술 서 비스를 살펴본다. . . 기 포지토리 련 최신 정책 동향 본격 인 오 엑세스 운동은 년 상업 출 사에 항하는 sparc 운동에서부터 시작 하여, 년 부다페스트 선언에 의해 가속화되 었다. 그 이후에도 사회과학분야에 한 오 엑세스를 지지하기 한 베를린 선언, 미국의 국립보건연구원(nih)과 국 웰컴 재단의 연 구 성과물 공공 근 오 엑세스 성명 추진, 국제 도서 연맹(ifla)의 ‘학술 연구 문헌의 오 엑세스에 한 성명(ifla statement on open access to scholarly literature and research documentation)', 연구 성과 임 상실험 결과 공개를 의무화하는 헬싱키 선언 등(suber )은 연구자들 뿐만 아니라, 학회 출 사들의 움직임도 가속화 시키고 있다. 이러한 노력에 한 결과로 연구자들의 셀 아 카이빙은 증가하게 되었고 출 사들은 정책을 변경하기 시작하고 있다. nih가 오 엑세스 의무화를 결정한 년 월 이후, pubmed central의 논문 등록수( 년 월 기 )가 배 이상으로 증가하 으며(nihns statistics ), jisc(the joint information systems committee)가 오 엑세스 잡지를 출 하지 ) 한국의 경우, 개( 외경제정책연구원, 카이스트, 물리학연구정보센터, 서울 ) 기 포지토리가 등록. 정보 리학회지 제 권 제 호 않고 있는 개 형 출 사의 정책을 조사한 보고서를 통해서도 출 정책 변경을 검토하고 있는 곳이 폭 증가하고 있음이 보고된 바 있 다(beunen ). 한편, 최근에는 ‘공 자 으로 수행된 연구정 보 엑세스에 한 원칙 가이드라인’이 oecd 가입 국가로부터 승인받아 주목받고 있다. 더 불어 고에 지 물리학 분야에 새로운 학술정보 유통 임웤이 도입되어 화제가 되고 있다. oecd 가이드라인의 작성 배경과 주요 내용 을 먼 살펴보면 다음과 같다(oecd ). 년 반에 oecd는 ‘공 자 으로 수행된 연구정보의 액세스에 한 국제 가이드 라인’ 을 가맹국 정부에 의해서 승인받았다. 의된 내용의 반은 연구자와 연구 기 , 국가 사이 의 정보 액세스와 공유를 진하는 것으로 다 음과 같은 내용을 천명하고 있다. 첫 번째, 국가 의 공공 연구 기 과 연구 커뮤니티의 연구 정 보 공개와 공유 문화를 장려한다. 두 번째, 데이 타 액세스와 공유를 진한다. 
세 번째, 공 자 으로 수행된 연구 성과 공유를 제한함으로써 래될 수 있는 잠재 손실에 한 인식을 제 고시킨다. 네 번째, 가맹국의 정보 공개 규제 행에 하여 재검토한다. 다섯 번째, 가맹국에 서는 연구 정보 공개 결정을 한 원칙과 규정 을 결정한다. 여섯 번째, 연구 정보의 국제 공 유와 유통 환경을 개선한다. 한편, 이와 더불어 oecd는 공개의 성, 다양성의 수용 정도, 입수의 투명성, 법성과 지 재산권 보호, 엑 세스 결정에 한 책임, 상호 운용성 등 공 자 으로 수행된 연구 성과 공개에 한 고려 원 칙을 제시하고 있다. 한편, 앞서 말한 바와 같이, 유럽 원자력 연구 소(cern)는 scoap (sponsoring consortium for open access publishing in particle phys- ics, http://scoap .org/) 제안을 통해 학술정 보 유통의 신을 래하고 있어 세계 으 로 주목받고 있다. 그 목 과 내용을 살펴보면 다음과 같다. 이 로젝트는 도서 등이 지 까지 학술 잡지의 구독비에 사용하고 있었던 산을 국가 단 의 분담 으로 용하여, 고 에 지 물리학 분야의 핵심 종을 오 엑세스화하는 것이다. 이는 게재 논문의 국별 분담을 근간으로 학술잡지 출 비용을 분담하 는 새로운 모델이다. 오 엑세스 화에는 최 , 만 유로가 필요할 것이라고 추측되 고 있는데, 주요 국가에는 개발도상국 분에 한 기여까지 요구되고 있다고 한다. 년 월 순 까지 미국· 국․독일․ 랑스 등 개국 으로부터 개 이상의 학․연구기 ․도서 컨소시엄이 심을 표명하여 필요 분담 의 . %가 이미 형성되었다고 한다. scoap 는 비용 분담 비율 등 불확정 요소가 다수 있다 고 평가 받고는 있으나, 학술 정보 유통의 신 시도로서 주목받고 있다(国立国会図書館 関西館 ). . . 기 포지토리의 최신 운 지원 기술 동향 기 포지토리의 운 목 은 아카이빙된 컨텐츠의 근성을 향상시키고 포지토리간 의 상호 운용을 지원하여 컨텐츠의 사용성을 제고하는 것이다. 여기에서는 그러한 맥락에서 주목할 만한 몇 가지 최신 운 지원 기술 서비스에 하여 살펴보도록 한다. 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 ∙airway 아카이빙된 정보 자원에 한 가시성을 확보 하고 근성을 제고하는 것은 기 포지토리 운 에 있어 주요한 이슈 에 하나이다. 더 불어 자 등 외부의 다양한 정보원과 연계하는 것은 기 포지토리를 통하여 오 엑세스를 효과 으로 실 시킬 수 있도록 한다. airway는 기 포지토리에 아카이 빙된 오 엑세스 자원을 상으로 openurl 기반의 네비게이션을 제공하는 서비스로 자 구독 라이센스를 가지고 있지 않은 이용 자를 오 아카이 에 장되어 있는 동일 문 헌으로 근시킬 수 있도록 한다(sugita et al. ). 링크리 버를 통해 기 포지토리 외 부에서 방황하는 이용자를 오 엑세스가 가능 한 정보원으로 인도해 주는 역할을 하는 이 서 비스는 openurl을 통한 오 액세스 문헌의 소재 악 반에 응용 가능하다고 평가받고 있다. google의 검색 엔진이 기 포지토리 의 코드를 단 %정도만 인덱스하고 있다는 조사 분석 결과가 나온 바 있다(hagedorn and santelli ). 이러한 시도는 구 과 같은 인 터넷 검색 엔진 뿐 아니라, oaister 등 oai- pmh 로바이더를 보완하여 기 포지토 리에 소장된 문헌에 한 근성을 향상시키는 기능을 할 수 있을 것으로 평가받고 있다. ∙foresite(functional object re-use and exchange: supporting information topology experiments) 오 아카이 의 상호 운 에 한 요구가 제기되면서 오 아카이 오 젝트의 재이용 교환에 한 로토콜인 oai-ore(http: //www.openarchives.org/ore/ . /toc)를 이용 한 툴 킷이 개발되었다. 웹 자원은 uri 기반의 원자 개념을 가지고 있으나, 자원의 집합체 도 통상 하나의 단 가 될 수 있다. 
링크된 웹 사이트, 소셜 네트워킹 사이트의 주석, 코멘트, 자 의 권호도 집합체로써 하나의 단 가 될 수 있으며, 특정 기 포지토리에 장된 복수의 자원도 그러한 개념으로 바라볼 수 있 을 것이다. 콜 션 차원의 네비게이션을 가능 하도록 하여 학술 성과와 그 구성 요소의 유연 한 재이용을 가능하도록 하는 고도 학술 커뮤니 이션의 기 가 되는 것이 바로 oai-ore 로토콜이다. foresite는 oai-ore를 이 용하여 jstor에 수록된 정보를 맵으로 만들고 그것을 기 포지토리에 추가하는 시 험 로젝트이다. jisc의 조성으로 이루어지 고 있는 이 로젝트는 dspace를 상으로 진 행되고 있으며, atom, rdf/xml 등의 일 로 처리하는 라이 러리 ‘foresite-toolkit (http://foresite.cheshire .org/)’을 공개하 다 (catalogablog ). ∙rioja(repository interface for overlaid journal archives) 한편, 런던 학을 심으로 한 학도서 들이 포지토리를 활용한 자 의 실 가능성에 해 조사 연구를 실시하는 로젝트 ‘rioja(http://www.ucl.ac.uk/ls/rioja/)’를 추진하고 있다. 등록의 편이성이나 속보성을 가지고 있을 뿐 아니라 장기 보존까지 강력한 포지토리에 사독 기능을 추가하여, 리 린 트 등 아카이 자원의 질 보증을 담보하는 로젝트이다. 더 나아가 기 포지토리 자 정보 리학회지 제 권 제 호 원을 화할 수 있는 api를 개발하고 있다. 여기에서는 물리학 분야의 리 린트 아카이 인 ‘arxiv’와 자출 시스템 ‘open journal systems(ojs)'을 사용해 데모 을 개발하 고 있다(国立国会図書館関西館 ). ∙claddier(citation, location, and deposition in discipline & institutional repositories) 국 사우스햄튼 학과 과학기술평의회(sci- ence and technology facilities council)가 포지토리 컨텐츠간의 상호 링크와 참조에 해 검토하는 claddier(http://claddier.badc. ac.uk/) 로젝트를 실시하고 있다. 지 까지의 인용은 주로 새로운 문헌이 과거의 문헌을 참 조하는 ‘backward citations’이었으며, 그 상도 공식 으로 발표된 문헌에 한정되어 있었 다. 하지만 이 과제에서는 포지토리내에 있 는 디지털 컨텐츠의 동 인 링크를 제안하고 있 으며, 트랙백 구조를 확장한 인용 통지와 상호 링 크 기능으로 주목받고 있다(digital scholarship, citation, location, and deposition in dis- cipline & institutional repositories ). ∙names 기 포지토리나 주제 포지토리에 등록 되는 컨텐츠의 수가 증함에 따라 작성자와 기 명에 한 거 통제의 요성은 더욱 커 지고 있다. mimas는 국국립도서 (bl)과 공동으로 포지토리 용 명칭 거 로젝트 ‘names(http://names.mimas.ac.uk)’를 추진 하 다. 포지토리에 등록된 컨텐츠에는 미들 네임의 이니셜 표기나, 성명의 순서 등 이름과 기 명 표기에 일 성이 부재하다. 따라서 부 정확한 검색 결과가 나타날 수 있는데, 이러한 문제는 포지토리의 메타데이타 공유에 의해 보다 더 심각해질 우려가 있다. 년 월에 로토타입 소 트웨어 사양을 공개하 는데, 여기에는 명칭 거의 등록, 데이타의 출력, 인 터페이스의 정의, 거 일의 리, 외부 시스 템과의 호환, 등록된 거 데이타의 수정, 국 거주자 식별을 한 필드 등 각종 사항들이 정 의되어 있다(国立国会図書館関西館 ). 한편, 네덜란드에서도 연구자에게 디지털 식별 자를 부여해 학술 기 포지토리나 도서 목록 그리고 각종 연구 정보 시스템에 활용될 수 있도록 하는 로젝트를 추진하고 있다. 
이 로젝트의 이름은 “digital author identifier (dai, http://www.surffoundation.nl/smartsite. dws?ch=eng&id= )”인데, 학술 기 포지토리 포털 “darenet”에 활용되고 있다 . 한국 기관 레포지토리 현황 오 엑세스 실 의 주요 방편으로 세계 인 움직임을 보이고 있는 기 포지토리 운 이 한국에서는 ‘지식정보 생성․유통 체계 사업(dcollection)’의 일환으로 추진되고 있다. opendoar에 의하면 한국에는 외경제정책 연구원 등 개 기 도 독자 인 기 포지토 리 시스템을 운 하고 있는 것으로 나타나고 있으나, 본 고에서는 dcollection을 심으로 한국 기 포지토리의 정책 지원 내용과 재까지의 주요 개 황에 하여 살펴보도 록 한다. 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 한국의 기 포지토리 지원 사업은 정부의 국가지식정보자원 리 사업의 일환으로 시작 되었다. 년도 후반에 시작된 국가지식정보 자원 리 사업은 국가의 주요 지식정보자원을 발굴하고 체계 으로 디지털화하여 유통하는 것을 목 으로 하는 사업이었다. 그러나 년도부터는 지식의 규모 디지털화보다 지식 의 생성․유통 로세스의 리가 더욱 강조되 기 시작하 는데, 그때부터 ‘dcollection’ 사업 에 심이 집 되기 시작하 다. 이 사업은 학이나 연구 기 의 학 논문과 학술지논문이 생성됨과 동시에 기 포지토리로 등록되고, 등록된 자원의 메타데이타는 추출되어 포털을 통해 통합 검색될 수 있도록 하는 체계로 요약 될 수 있겠다. 한국의 기 포지토리는 시스템과 표 업 무 모델을 학에 일 보 할 정도로 정부 주 도로 추진되고 있다. 년도에 개 기 에서 시작되어 년도에는 개로 증가하 으며, 소 학에 호스 시스템 보 을 통하여 년도에는 개 기 포지토리가 설치 되었다. 교육과학기술부 산하 한국교육학술정 보원(이하 keris)은 강력한 기 포지토리 지원 기 으로서 다음과 같은 역할을 수행하고 있다(한국교육학술정보원 ). 첫 번째, 상시 으로 기 포지토리 소 트웨어의 보 과 유지보수를 지원하고 있으며, 기 포지토리 포털을 통해 통합 검색을 제 공하고 있다. 두 번째, 기 포지토리 운 을 한 표 업무 모델을 개발하고 기 포지토리 활용 실태와 효과를 분석한다. 세 번째, 통합 포털의 활용성을 제고하기 하여 메타데이타 등록에 한 학의 기여를 평가하고 있으며, 활성화를 한 커뮤니티 운 도 병행하고 있다. 기 포지토리의 운 에 한 keris의 시스템 측면의 지원 내용을 다시 몇 가지로 요 약하여 살펴보면 아래와 같다(futureinfonet 컨소시엄 ). 첫 번째, 앞서 말한 바와 같이 dcollection을 직 개발하여 보 하고 있으며, 여건이 열악 한 소 학을 한 호스 시스템을 운 하 고 있다. 호스 참여 학의 기 별 홈페이지 구축, 커스터마이징, 더불어 기 데이타 이 업무까지 행하고 있다. 두 번째, 기 포지토리의 포털을 운 하 고 있으며, 하베스 된 메타데이타의 보정 업 무까지 수행하고 있다. 메타데이타의 기 별 오류 내역을 리하는 등 통합형 데이타베이스 의 품질 리를 해서도 많은 노력을 기울이 고 있다. 세 번째, 기 포지토리 운 을 한 표 화와 기술 개발을 주도하고 있다. 기 포지토 리 사업의 표 화 역인 메타데이타, 제출 워크 로우, 유통 로토콜 표 을 개발․ 리 하고 있으며, 기 포지토리 시스템 활용 극 화를 한 각종 api(도서 시스템의 제출자 인증 연계, sru를 활용한 통합검색, 학소장 자료의 통합검색) 개발, 연구업 리시스템과 연계 지원, ccl(creative commons license) 기반의 라이선스 리 시스템, 그리고 연구자 페 이지를 도입하는 등 서비스 고도화에 주력하고 있다. 한편, 한국교육학술정보원이 발표한 년 도 자료를 보면 참여 기 측면의 운 반을 정보 리학회지 제 권 제 호 다음과 같이 악할 수 있다(강정원 ). 그 내용을 간략하게 정리하면 다음과 같다. 
첫 번째, 기 포지토리 등록 데이타의 부분은 학 논문 원문 데이타(국립 %, 사립 %)이며, 그 다음은 학부설연구소 논문 데 이타(국립 %, 사립 %)이다. 그 밖에 등록 되고 있는 학술 자료는 거의 부재하다. 두 번째, 기 포지토리와 학내 타 시스템 간의 상호 운용도 활발하다고 보긴 어렵다. 도 서 의 opac과 통합 검색이 되도록 구 해 놓은 기 이 반 정도에 이르지만, 시스템 으로 연동되어 있다고 볼 수는 없다. 하지만 주 요 몇 개 학을 심으로는 교내 연구업 정 보시스템과 기 포지토리의 연계가 시도되 고 있다. 세 번째, 제도 으로 컨텐츠가 수집․ 리 되도록 학내에 합의가 형성되어 있는 곳은 흔 치 않았다. 그리고 부분의 기 에 담 부서, 운 지침, 보존 정책, 납본 규정, 인센티 가 부재하고 학내 홍보도 원활하게 이루어지고 있 지 않은 것으로 조사되었다. . 일본 기관 레포지토리의 정책적 지원 내용과 전개 일본 정부는 년도까지 ‘선도 자도서 로젝트(先 的電子図書館プロジェクト)’, ‘ 자 정보의 수집․검색 시스템(電子的情 報の収集․検索システム)’이란 이름의 자 도서 로젝트로 학에 한 지원을 아끼지 않았다. 년도에 이르러서는 학술정보의 체 계 인 수집과 자 의 안정 이용, 학 연구 성과의 효과 를 주요 과제로 선정 하 다. 일 유(逸村裕 )의 말을 인용하 면, 그 당시 기 포지토리라는 용어가 범용 으로 사용되지는 않았지만, 학내에서 생산된 학술 정보의 발신 체제를 확립하는 한편, 발신 되는 학의 정보에 한 통합 포털 운 을 강 조하고 있어, 재의 기 포지토리 개념을 이미 많이 포함하고 있었다고 한다. 일본에서 기 포지토리 구축에 최 로 착 수한 곳은 치바 학(千葉大学)이며, 년 월부터는 국립정보학연구소(nii)에서 국립 개 학을 심으로 오 소스 소 트웨어의 시범 운 지원을 시작하게 되었다. 그 이후 nii의 ‘csi 탁 사업’을 통해 재까지 여개 의 기 포지토리가 운 되고 있다. 도서 은 공모를 통해 자 을 지원받으며, 포지토 리의 구축 운 에 련된 기술 , 제도 문 제에 한 실증 제안을 해야 하는 책임을 지 닌다. . 정책적 지원 내용 . . 개요 앞서 말한 바와 같이, 일본 문부과학성 산하 nii는 학 학술정보 공유 유통을 한 담 기 으로 년 에는 종합목록과 상호 차 사업 을 통해 학 간의 오 라인 자원 유통을 도모 하 다. 그 이후에는 학의 자도서 사업 지원과 자 의 안정 유통에 을 맞추 었으며, 년 에 들어서는 기 포지토리 운 을 통한 연구성과물의 체계 리와 효과 발신에 주력하고 있다. nii의 기 포지토 리 지원 내용의 략을 살펴보면 다음과 같다. 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 첫 번째, 년 월부터 nii는 국립 개 학을 심으로 dspace, eprints와 같은 오 소 스를 심으로 기 포지토리 시범 운 을 지 원하기 시작하 다(国立情報学研究所 ). nii는 이를 통해 참여 학이 구축 운 기 술을 축 하고 그 경험을 공개하여 다른 도서 이 참조할 수 있도록 하 다. 두 번째, 그 이후 년부터 년까지는 「차세 학술 컨텐츠 기반 공동 구축(次世代 学術コンテンツ基盤共同構築事業)」을 한 제 기 탁 사업이 추진되었는데, 기 포지 토리 구축 운 확산을 목표로 하는 이 사업에서 는 국립 학과 와세다 학(早稲田大学), 이오기쥬쿠 학(慶應義塾大学)을 포함한 총 개 학이 참여하 다. 이 사업은 nii의 「최 첨단학술정보 기반정비 사업 csi( 先端学術 情報基盤整備(csi)」의 일환으로 추진되었다. 년도에는 공모에 의해 국립 개, 사립 개 총 개 학이 선정되었으며, 년도에 는 개 학이 참여하 는데, 자 지원을 통 해 기 포지토리를 구축하는데서 끝나는 것 이 아니라, 일종의 「선구 인 연구 개발 사업 (先駆的な研究開発事業)」으로서 기 포 지토리 구축과 운 에 련되는 기술 혹은 제도 인 문제의 해결에 한 제안을 요구하고 있다(逸村裕 ). 
세 번째, 년에서 년까지 추진되는 제 기 탁 사업(http://www.nii.ac.jp/irp/rfp/) 에서는 기 포지토리의 국 개와 포 지토리의 질 향상을 목 으로 하고 있으며, 연 구 개발 로젝트의 계승으로 컨텐츠 련 정책 의 개발과 기 포지토리 운 기술의 성숙을 강조하고 있다(国立情報学研究所 ). . . 시스템 지원 nii는 지원 기 이 운 할 소 트웨어를 스 스로 결정하고 신기술을 개발하여 용하도록 유도하고 있다. nii는 다만 jairo(junii 의 후속 , jairo.nii.ac.jp/)라는 통합 포털을 통해 개별 기 포지토리의 컨텐츠가 외부로 원활 하게 될 수 있도록 하는 역할에 최선을 다 하고 있다. 한편, nii는 국비수혜연구성과물 포 털(kaken, seika.nii.ac.jp)과, 학술지논문 포 털(cinii, ci.nii.ac.jp) 시스템을 기 포지토 리와 연동하여, 원문보기 권한이 없는 이용자 가 어느 치에 있건 간에 기 포지토리에 동 으로 연동되어 원하는 자료로 근할 수 있도록 지원하고 있다. 이를 통해 명실공히 오 지원 사업명 연 도 주요 내용 학술 기 포지토리(repository) 구축 소 트웨어 실장 실 험 로젝트(学術 機関 リポジトリ(repository) 構築 ソフ トウェア 室長 実験 プロジェクト) 오 소스 심의 기 포지토리 시범 운 지원 결과 공개 차세 학술 컨텐츠 기반 공동 구축(次世代学術コンテンツ 基盤共同構築) 탁 사업 기 - 기 포지토리 구축 확 운 기술 제도 연구 차세 학술 컨텐츠 기반 공동 구축(次世代学術コンテンツ 基盤共同構築) 탁 사업 기 - 기 포지토리 구축 확 상호 제휴에 의한 새로운 서비스 창출, 기 포지토리 운 편리성 향상을 한 연구 <표 > 일본의 기 포지토리 지원 사업 정보 리학회지 제 권 제 호 엑세스가 실 될 수 있도록 도모하고 있는 데, 그 로세스를 간단하게 요약하면 다음과 같다. 첫 번째, nii는 기 포지토리 통합 포털을 일본의 표 학술 원문서비스인 cinii와 연동시키고 있다. 라이선스가 필요한 cinii로 먼 근한 이용자가 오 엑세스될 수 있는 기 포지토리로 편리하게 도달할 수 있도록 하기 함이다. 기 포지토리의 uri가 안정 될 경우, 거기에서 추출한 메타데이타를 이용하 여 cinii에서 기 포지토리로 도달할 수 있 는 링크를 생성시킬 수 있다. cinii는 일본 최 의 학술논문원문 포털 서비스로 일본 내 주요 학회 학술논문, 학 학술논문 뿐 아니라 sci, ssci, h&ci까지 포 하고 있다. 동 연동을 해서는 앞에서 언 한 바 있는 airway라 는 링크리 버를 사용하는데, 를 들어 이용 자가 cinii를 통해 검색한 논문의 원문을 보고 자하나, 이용 권한이 없을 경우, 해당 서지의 간 략화면에 생성된 링크를 클릭하여 자의 기 포지토리로 연동될 수 있다. 기 포지토 리로 연동된 이용자는 거기에서 원하는 논문의 원문까지 오 엑세스할 수 있게 된다. 물론 학 술지에 발표된 논문이 기 포지토리에 셀 아카이빙되어 있음을 제로 해야 한다. 두 번째, 공 자 으로 수행된 연구 성과를 효과 으로 공개하여 활용성을 제고하기 하 여 nii는 국비수혜연구 성과물 db인 kaken 을 통해서도 기 포지토리에 인도되도록 지 원하고 있다. 일본은 국비수혜 결과로 산출된 연구 결과를 요약하여 학회지에 발표하도록 하 고(文部科学省研究振 局学術研究助成課 ), 더불어 결과물의 디지털 카피를 연구자가 소속된 기 의 포지토리에 제출하도록 권고하 고 있다(国立情報学研究所 ). 연구지원 기 인 jsps(日本学術振 会)는 연구 결과가 출 된 학회지 정보를 포함하여 반 인 연구 성과에 한 보고를 받고 그 간략 정보를 nii에 서 운 하는 kaken에 등록한다. kaken은 cinii와 연계되어 있어, 이용자는 원문이 서비 스되고 있는 cinii로 바로 연계될 수 있다. 한 편, cinii의 원문 근 권한이 없는 이용자는 결과물의 디지털 카피가 아카이빙 되어 있는 기 포지토리로 편리하게 연동되어 무료로 원문에 근할 수 있도록 지원된다. 
세 번째, nii는 일본 roar(registry of open access repositories)인 irdb(institutional repository database contents analsis, irdb. nii.ac.jp/analysis/index_e.php) 서비스를 운 하고 있다. 여기에는 포지토리 운 기 정 보, 포지토리 링크 정보, 사용 소 트웨어 정 보 뿐 아니라, 수록 컨텐츠의 증감, 본문의 유무 비율, 컨텐츠 ( 자 , 출 사 )의 분포, 컨 텐츠 언어(일어, 어 등)의 분포 등 다양한 정 보를 제공하고 있다(国立情報学研究所 ). . 운영 현황 앞에서는 nii의 기 포지토리 지원 내용 의 략에 하여 살펴보았다. 여기에서는 일 본 학도서 의 기 포지토리 운 황을 시스템, 컨텐츠, 작권 정보 리, 기 포지 토리 연합체 운 측면에서 살펴본다. . . 시스템 년 csi 탁사업 보고에 따르면, 기 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 포지토리를 운 하고 있는 일본 학 개 개 학이 dspace를 채택하고 있다고 하 다. eprint를 이용하고 있는 학이 개, xoonipas 를 이용하는 학이 개 있으며, 독자 으로 오 소스 소 트웨어를 개발하여 운 하는 곳도 있다. 한 nalis-r( 개), e-repository( 개), infolib-dbr( 개), ilissurf e-lib( 개) 등의 상용 소 트웨어를 도입하고 있는 학도 여럿 있다고 한다(国立大学図書館協会 ). 한 편, 지역 단 로 복수의 학이 공동 포지토 리를 구축하여 운 하는 경우도 있다. 공동 포지토리는 복수 기 에서 생산된 교육 연구 성 과를 하나의 서버에 축 ․보존하는 것으로 포지토리 구축 비용을 최소화하고 기술 노하우 를 공유할 수 있다. 이는 소 규모 기 의 포지토리 도입 장벽을 없애고 오 엑세스의 변 확 에 기여하고 있다고 평가되고 있다. 국공 사립 학․단기 학․고등 문학교 도서 등 개 이 참여하고 있는 히로시마 공동 포 지토리(広島県大学共同リポジトリ: harp) 가 가장 표 인데, 이 시스템은 nii의 csi 사 업을 통해 기 경비를 충당하고 개 학이 연간 천엔 정도를 분담하여 유지보수와 서버 갱신비를 충당하고 있다. 이 시스템은 기본 으 로 dspace를 사용하고 있지만, 학이 개별 으로 컨텐츠를 하베스 하고 별도의 스타일 쉬 트를 운 하고 있다(広島大学図書館 ). 일본의 학들은 기 포지토리에 아카이 빙된 컨텐츠에 한 외부 발신을 매우 강조하 고 있다. nii의 jairo(junii의 후속 )가 가 장 기본 인 유통 채 역할을 하지만, 그 이외에 도 roar(http://archives.eprints.org/), open doar(http://www.opendoar.org/)에 부분 의 학이 등록되어 근 채 을 확보하고 있다. oaister(http://www.oaister.org/)에도 개 학이 등록되어 있을 뿐 아니라, google scholar (http://www.scholar.google.com/)나 scirus (http://www.scirus.com/)와 같은 서치 엔진 에 색인을 등록하여 학 연구 성과의 로벌한 발신이 가능하도록 하고 있다. 뿐만 아니라, 홋카 이도 학(北海道大学: huscap), 츠쿠바 학 (筑波大学: つくばリポジトリ tulips-r), 치 바 학(千葉大学: curator), 나고야 학(名古屋大学: nagoya repository), 큐슈 학(九州大学: 九州大学学術情報リポジト リ qir) 등은 앞서 언 한 airway 시스템 을 도입하여, 상용 자 이나 데이타베이스 에서 이용자가 심리스(seamlss)하게 기 포지토리로 연계될 수 있도록 지원하고 있다 (杉田茂樹 ). 한편 기 포지토리 시스템과 학내 연구업 시스템과의 연계도 상당히 진척되어 있는 수 이다. 년 월 drf(digital repository federation, http://drf.lib.hokudai.ac.jp/drf/index. 
php?digital% repository% federation) 회의 결과에서 연구업 db와 연계가 완료된 학이 체의 %이며, %는 재 검토 이라고 밝 졌다(金沢大学情報部情報 画課 ). nii의 ‘ 탁사업 역 '로 실시한 기 포지토리와 연구 실 데이타베이스의 연동 로젝트 결과, 두 가지 모델이 개발되었는데, 한 가지는 카나자와 학(金沢大学: 金沢大学 学術情報リポジトリ, kura)과 와세다 학 (早稲田大学: dspace at waseda university) 이 채택한 ‘ 리 등록형 모델’이고, 다른 한가지 는 큐슈 학((九州大学: 九州大学学術情報 정보 리학회지 제 권 제 호 リポジトリ qir)이 채택한 ‘이용자 지원형’ 모델이다. ‘ 리 등록형 모델’은 교원이 연구실 db에 실 을 등록 는 갱신할 때, 등록된 컨텐츠를 자동으로 기 포지토리에 업로드 시키는 방식이다. 이 방식은 연구업 시스템의 실 물 상세화면에서 교원이 본인의 화일을 등 록할 수 있는데, 그 정보는 기 포지토리 시 스템으로도 자동 등록된다. 등록된 실 물은 사서가 도서 의 리자용 화면에서 출 사를 악하고 작권 정책을 확인한 후 아카이빙될 수 있도록 처리한다. 한편 큐슈 학이 채택하 고 있는 ‘이용자 지원형’은 ‘ 리 등록형’과 같이 연구실 db시스템에서 기 포지토리로 데 이터를 자동 업로드 하기 한 인 시스템 변화가 불필요하다. 이는 원문을 기 포지토 리로 등록시킬 수 있는 창을 학내연구실 db 시스템에 연계해 주는 정도로 지원된다. 다만, 연구자에게 자신의 연구실 물에 한 이용자 의 엑세스 통계를 제공하여, 연구자 스스로가 자신의 연구 실 물 등록 의욕을 향상시킬 수 있도록 운 하고 있다(金沢大学 ). . . 컨텐츠의 수집 체계 일본 학의 기 포지토리에는 학술잡지논 문, 연구 보고서, 학 논문, 기요 논문(교내 학술 논문), 학회 발표 자료와 교재류 등 다양한 연구실 물이 탑재되어 있다. nii의 irdb(institutional repository database contents analsis, irdb. nii.ac.jp/analysis/index_e.php)에 의하면 년 월 일을 기 으로 기 포지토리는 개, 코드는 , 건이 등록되어 있다는 것을 확인할 수 있다. <표 >에서 보이는 바와 같이 아카이빙 컨텐츠는 기요논문(교내발간논 문) , 건, 학술잡지논문 , 건, 회의 발표자료 , 건, 그밖에 학 논문, 기술보고 서, 연구보고서, 리 린트, 교재 등이다. 国立大学図書館協会( )가 발표한 자료 에 의하면, 주요 정보원별 수집 체계는 아래와 같이 요약해 볼 수 있을 것이다. 첫 번째, 년도 월 을 기 으로 가장 많은 데이터가 탑재된 유형의 컨텐츠는 기요 논문이다. 이는 nii가 추진한 ‘ 학 기요 공개 지원 사업(大学紀要公開サポート事業)’과 ‘학 술잡지 공개지원 사업(学術雑誌公開支援事 業)’의 일환으로 순조롭게 수집될 수 있었다. 기요 논문은 학에서 간행된 학술논문을 일컫 는데, 학들은 사업을 통해 이를 디지털화 하고 cinii에 탑재하여 국 으로 유통될 수 구 분 건 수 구 분 건 수 기요논문(교내발간논문) , 연구보고서 , 학술잡지논문 , 기술보고서 , 회의발표용논문 , 회의발표용자료 , 학 논문 , 데이타 일반잡지논문 , 리 린트 도서 , 소 트웨어 교재 , 그외 , <표 > 일본 기 포지토리 컨텐츠 통계 ( 년 월 일 기 ) 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 있도록 하 다. nii는 기 포지토리 지원 사 업을 시작하면서 이 게 수집된 기요 논문의 원문을 자의 소속 기 포지토리로 이 시 켰고, 이 데이타가 개별 학 포지토리의 기 컨텐츠로서 자리잡게 된 것이다. 두 번째, 기 포지토리의 메인 컨텐츠라 고 할 수 있는 학술잡지논문은 , 건이 구 축되어 있는데, 아직까지는 도서 의 사서가 교원으로부터 수집하고, 출 사의 작권 정책 을 확인하는 과정을 거쳐 포지토리에 행 입력해 주고 있다. 
학술잡지 논문의 원활한 아 카이빙에는 자의 최종 원고 수집이 건인데, 셀 아카이빙이 정착될 때까지 행 입력은 병행되어야 할 것이라는 망이 지배 이다. 한편, 실제 운 에 있어서 작권 정책 확인과 아카이빙 허락에는 시간이 필요하거나, 불가능 한 경우도 존재하고 있다고 밝 지고 있다. 세 번째, , 건이 구축되어 있는 연구비성 과보고서는 문부성의 연구비 수혜 결과물 는 학내 연구성과보고서를 일컫는다. 앞서 설명한 바와 같이 연구자는 최종 성과물이 발표된 학 술잡지의 주소를 기재하여 jsps에 제출하고 동시에 결과물의 디지털 원문은 학내 기 포지토리로 등록하게 된다. 학내 포지토리에 등록된 디지털 원문은 이미 학회지에 출 되었 기 때문에, 사서에 의해 출 사의 작권 정책 을 확인하는 과정을 거친다. 네 번째, 학이 생산하는 주요 연구 실 물 하나인 박사학 논문( , 건)은 그동안 한국의 경우와 같이 극 으로 수집되지 않았 다. 그러나 년 nii의 과제로 박사학 논문 디지털화 사업이 채택되자, 기 포지 토리를 심으로 수집․유통되기 시작하고 있 다고 보고되고 있다. . . 작권 정보 리 년 월 일본 학 회 작권 정책 데이타 베이스(society copyright policies in japan, 약칭 scpj(http://www.tulips.tsukuba.ac.jp/ scpj/, 이하, scpj)가 공개되었다. 이것은 nii csi 사업 역 “국내 학 회 등 작권 정책 공유․공개 로젝트(国内ハックヒョブフェな ど著作権ポリッシュ共有․公開プロジェク ト)”에 의해 추진되었다. 년에 이미 치바 학에서 개 학 회의 작권 정책을 조사한 바 있는데, 이 로젝트는 츠쿠바 학(筑波大学), 코베 학(兵庫教育大学) 등에 의해 계속되어 년 월까지 , 개의 의 작권 정책 이 조사되어 구축되었다(国立大学図書館協会 ). 학에서 해외 학술잡지에 출 된 연 구자의 성과물을 아카이빙할 때는 sherpa/ romeo를 참조하지만, 자국내 학술잡지에 출 된 연구성과물 등록을 해서는 scpj를 통 해 작권 정책을 확인하고 제시되는 조건에 따라 서비스하게 된다. . . 기 포지토리 연합 일본에서는 기 포지토리 구축을 진행시 키는 학의 정보 공유를 한 디지털 포지토 리 연합(digital repository federation: drf) 이 운 되고 있다(北海道大学 ). 메일링 리스트를 운 하고 주기 인 컨퍼런스를 통해 정보를 교환하고 있는데, 그 운 목 을 정리 하면 다음과 같다. 첫 번째, 포지토리의 발견 루트를 정비한다. 두 번째, 컨텐츠의 보존과 식 별을 한 연구를 수행한다. 세 번째, 작권 정 정보 리학회지 제 권 제 호 책 데이타베이스의 국제 제휴를 도모한다. 네 번째, 포지토리 평가 체제를 마련한다. 다 섯 번째, 연구성과시스템과 기 포지토리의 연계성을 향상한다. 여섯 번째, 지역 공동 포 지토리의 경제성을 검증한다. 일곱 번째, 오 엑세스 컨텐츠의 임펙트를 평가한다. . 주요 사례 장에서는 일본 학도서 의 기 포지 토리 운 황을 몇 가지 논 으로 살펴보았다. 앞서 언 한 바와 같이 ‘csi 탁 사업’은 참여 학이 기 포지토리 구축 운 에 실증 제안을 하도록 함으로써 경험을 공유하고 경쟁 으로 기술을 성숙시킬 수 있도록 유도하고 있 다. 따라서 학은 매해 포지토리 운 에 한 신기술을 개발․ 용하고 운 사례를 발표 해야 한다. 아래 표에서는 년 월 ‘csi 탁 사업 보고 교류회(컨텐츠계): 기 포지 토리로부터 퍼지는 학술 정보 발신․유통: 최 신 동향에서 과제 해결까지'에서 발표된 운 사례 몇 가지 내용을 발췌하여 정리하 다. 학명 ir 주요 운 사례 치바 학 千葉大学 curator (http://mitizane.ll. 
chiba-u.jp/curator/i ndex.html) - scirus ) 검색 엔진에 색인 제공, 스코퍼스와 제휴를 통해 기 포지토리의 정보가 로벌하 게 발신될 수 있도록 함(千葉大学附属図書館 ) - 기 포지토리 활성화에 따라 교내 학생들의 정보 수집 행태가 변화되고 있음을 보고(千葉 大学 ) ․학생들이 요구한 자료가 curator에 존재하고 있다는 사유로 상호 차가 거부되고 있는 사례가 증가 ․패스 인더(수업자료 네비게이터) 게재 문헌이 기 포지토리에 포함되고 있어, 도서 서비스의 기본 축이 인쇄장서에서 자 로 그리고 기 포지토리로 변화되고 있음을 보고 시마네 학 島根大学 (島根大学学術国 際部図書情報課 ) swan (http://www.lib.sh imane-u.ac.jp/ /coll ection/repo/) - 학내 발표 논문과 학외 발표 논문의 포지토리 등록 로세스 공개 ․기요논문, 학 논문 등 학내 발표 논문의 경우는 도서 에서 기요편집 원회등에 일 허락을 받고 등록하여 공개 ․학술지논문, 기술보고, 회의자료 등 학외 발표 논문의 경우는 교원이 스스로 등록하되, 작권 정책 확인과 허락 련 로세스는 사서가 담함 시즈오카 학 静岡大学 (静岡大学附属図 書館 ) sure (http://ir.lib.shizuo ka.ac.jp/) - dspace의 controlled vocabulary기능에 일본 십진 분류법 에 의해 주제 분류 수행 - opac을 학내학술정보시스템의 근간으로 삼아, 각종 구독 자 과 오 엑세스 자 (doaj,nii-els,nii-reo, j-stage) 이 연계 서비스될 수 있도록 운 - doi 뿐 아니라 다양한 메타데이타를 활용하여 자 의 opac과 webcat+, cinii, junii를 연계하고 있음 교토 학 京都大学 (京都大学附属図 書館電子情報掛 ) kurenai (http://repository. kulib.kyoto-u.ac.jp/ dspace/) - 기 포지토리와 출 부의 제휴를 통해 자 렛폼 마련 ․출 부 발행 연구서, 학술상 수상작 등을 상으로 함 ․도서 은 편집을 거친 가치 있는 컨텐츠를 획득하고 출 부는 어필할 수 있는 장소를 마련한다는 측면에 의미를 두고 있음 ․ 학부의 건을 호스트 훗가이도 학 北海道大学 (杉田茂樹, ) huscap (http://eprints.mat h.sci.hokudai.ac.jp/) - 링크리 버를 통해 기 포지토리로 근 채 강화 ․wos - sfx - huscap 연계 기능 실 ․airway 실험 로젝트 참여 - 소속 교원이 발표한 논문을 톰슨 isi에 제공을 의뢰하여 건 일을 일 제공 받음.(추가 으로 건이 자발 으로 등록됨)(国立大学図書館協会 ) <표 > 일본 주요 학 기 포지토리 운 사례 ) elservier가 운 하는 방 한 과학기술연구정보 검색 엔진으로 억 천만개의 과학기술 , 연구자 홈페이지 코 스웨어, 리 린트, 특허, 기 포지토리 등의 과학기술정보를 수록. 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 토후코 학 東北大学 tour (http://ir.library.to hoku.ac.jp/) - 교육 성과 심의 아카이빙. 최종 강의자료 탑재(国立大学図書館協会 ) 오타루 상과 학 樽商科大学 ( 樽商科大学附 属図書館 ) barrel (http://barrel.ih.ot aru-uc.ac.jp/dspac e/index.jsp) - 교원 의견 수렴을 통해 기 포지토리의 기능 개발 ※연구자별 문헌 열람, 복 제출의 수고를 경감하기 한 연구정보 db와의 제휴, 본인이 공개한 논문의 철회 기능, 연구자페이지와 메타데이타 재이용 기능 등 . 양국 정책적 지원 내용 및 전개 양상 비교 분석 장에서는 일본 기 포지토리의 추진 배경, 정책 지원 내용, 운 황과 주요 사례 를 살펴보았다. 본 장에서는 포지토리의 확 산 지원 체계, 기 포지토리 시스템 컨텐츠 등 다양한 측면에서 양국의 개 양상 을 비교 분석해 본다. 
∙확산 황 한국은 년 부터 지식정보의 생산․ 유통 체계 리에 주력하기 시작하여 재에는 개 기 포지토리 시스템을 국 으로 보 하는데 이르게 되었다. 일본도 비슷한 시 기부터 학 학술정보 리 체계 정비와 연구 성과의 발신에 주력하기 시작하여 년 반 을 기 으로 여개의 기 포지토리가 구축 되었다. ∙지원 체계 keris는 언 한 바와 같이 시스템의 일 보 과 각종 로세스의 표 화를 통해 한국 학에 기 포지토리가 신속하게 확산될 수 있도록 하는 직 역할을 하 다. 더불어, 기 포지토리 운 성숙을 한 각종 기술과 정책을 직 개발 보 하고 있어, 아직은 환경 으로 열악한 한국 학도서 을 표 에 의해 획일 으로 이끌어 나가고 있는 셈이다. 반면, nii는 학이 자발 으로 시스템을 선택하고 운 기술을 성숙시켜서 공개하도록 책임을 부 여하고 있다. 셀 아카이빙을 한 로세스 와 제도 연구, 학 회 작권 정책 데이타베이 스 구축 운 지원 등에도 많은 심을 쏟고 있다. ∙기 포지토리 통합 포털 keris는 dcollection에 의해 하베스 된 메타데이타를 기반으로 통합 포털을 운 하고 있으며, 데이터 보정 작업을 통해 통합 데이터 베이스의 품질 리에도 많은 노력을 기울이고 있다. 뿐만 아니라, dcollection을 통해 학 논 문을 온라인으로 제출하도록 국 학의 학사 규정을 변경하 다. 그로 인해 체 학 논문 수여분의 %가 제도 으로 등록되고 있다. keris의 이러한 일련의 노력은 학에서 생 산된 학술정보를 효과 으로 수집하여 국 으로 공유․유통시키는데 많은 기여를 하고 있 다. 그러나 셀 아카이빙을 통한 오 엑세스 실 이라는 기 포지토리 본연의 기능에서 는 한 발짝 떨어져 있는 것처럼 보인다. 한편, nii는 후자 쪽에 좀 더 무게를 두고 있다. nii 는 통합 포털인 jairo를 운 하는 것 이외에 도 개별 포지토리에 수록된 데이타의 외부 정보 리학회지 제 권 제 호 발신력 강화에 주력하고 있다. nii는 기 포 지토리를 오 엑세스 실 을 한 주요 수단 으로 보고, airway를 통해 cinii, kaken 등 다양한 외부 정보원에서 이용자가 기 포지토리에 동 으로 연계될 수 있도록 지원하 고 있다. ∙기 포지토리 시스템 한국의 학도서 은 부분은 keris가 개 발․보 한 dcollection을 기 포지토리 시 스템으로 설치․운 하고 있다. keris가 기 포지토리 소 트웨어 유지보수와 기능 개 선을 지원하고 있으며, 학은 앙에서 일 보 된 표 업무 모델을 채택하고 있다. 소 학은 호스 시스템을 통해, ir 홈페이지 구 축, 커스터마이징, 더불어 기 데이타 이 업 무까지 지원받고 있다. 한편 일본의 학은 기 이 운 할 소 트웨어를 직 결정하고, 신 기술 개발을 통해 독자 으로 운 기술을 발 시키고 있다. dspace, eprint, xoonipas이외 에 오 소스 소 트웨어를 개발하여 운 하거 나 상용 시스템을 도입하는 경우도 있으며, 지 역 단 로 복수 학이 공동 포지토리를 운 하고 있는 사례도 볼 수 있다. ∙컨텐츠 한국 기 포지토리의 주요 컨텐츠는 교내 학술논문과 학 논문이다. 진정으로 오 엑세 스가 필요한 학술지 논문 특히, 국내외 상용 출 사에서 유통하고 있는 학술 의 원문은 아 직까지 포지토리에 등록되지 못하고 있다. 반면, nii가 추진하는 기 포지토리 사업의 참여 학은 교내학술논문과 학 논문이외에 도 교재, 도서류, 회의발표용 논문 등이 등록되 어 있으며, 유료 학술지논문과 해외 상용출 사의 원문까지도 극 으로 수집되어 포지토리에 등록되고 있다. ∙ 작권 리 에서 언 한 바와 같이 일본의 기 포 지토리에 국내외 상용 학술 의 원문이 등록 되고 있는 것은 교원의 교외 발표논문이 극 으로 수집되고, 작권 허가 여부가 일일이 확인되고 있기 때문이다. 이는 scpj(학 회 작권 정책 데이타베이스)에 의해 편리하게 이루어질 수 있었다. scpj에는 년 월까 지 , 개의 에 한 작권 정책이 구축 되어 있다. nii의 지원에 의해 몇 개 주요 학 이 작권 정책의 조사 리․운 업무를 담하고 있다. 
한편 한국에서는 학 회에 한 작권 정책 조사가 부담스러운 숙제로 남 아 있는 상황이다. ∙운 기술 한국은 시스템의 개발과 유지보수, 메타데이 타 로토콜 표 화, 각종 api 개발 등 기 포지토리 질 성숙을 한 기술 개발이 keris를 심으로 이루어지고 있다. 반면, 일 본의 경우 참여 학이 직 다양한 운 기술 을 개발하고 용하며, 그 기술 노하우를 공개 하는 책임을 지닌다. 따라서 기 간에 용 기 술과 운 수 에는 차이가 나타나고 있으나 그 경험과 지식은 상호 원활하게 교류되고 있 다고 평가할 수 있겠다. 많은 기 에서 연구업 시스템이나 opac과 같은 학내 타시스템과 의 연계가 순조롭게 이루어지고 있으며, 링크 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 리 버를 활용하여 자 , 검색엔진 등 다 양한 외부 자원에서 기 포지토리로의 심리 (seamless)한 근 채 이 마련되고 있다. ∙외부 발신 체계 한국 기 포지토리의 메타데이타는 riss (http://www.riss u.net)로 하베스 되고 있 으나, riss 밖으로는 원활하게 발신되지 못하고 있는 상황이다. 특히 국제 발신은 거의 이루어 지지 못하고 있다. 반면, 일본의 개별 학은 부분 roar, oaister에 기 포지토리를 자발 으로 등록하고 있으며, google scholar, scirus와 같은 국제 검색 엔진에 색인을 제공하 여 기 의 연구 성과가 로벌하게 발신될 수 있도록 노력하고 있다. . 한국 기관 레포지토리의 발전 과제 한국과 일본의 기 포지토리는 정부의 강 력한 지원으로 확산되었으나, 그 발 양상은 비슷한 듯 상이하다. 한국은 기 포지토리 시스템을 개발하여 일 배포하고 규모 호스 을 운 할 정도로 정부 주도 이었으며, 개 별 학은 업무 로세스, 등록된 컨텐츠의 유 구 분 한 국 일 본 지원 체계 지식정보 생성 유통 사업(dcollection) 차세 학술컨텐츠기반 공동구축사업 keris nii 확산 황 개 여개 기 포지토리 시스템 dcollection의 일 보 dspase, eprint 등 다양한 소 트웨어 개별 채택 기 포지토리 포털의 운 체계 riss jairo 학술지포털, 연구성과물포털과 연동 미비 cinii, kaken 등과 개별 ir의 동 연동 / airway를 통해 개별 ir로의 근 채 지원 컨텐츠 교내학술논문(기요논문) ○ ○ 학 논문 ○( % 등록율) △ 국내학회지논문 × ○ 해외상용학회지논문 × ○ 연구비수혜보고서 × ○ 기타 × 강의자료, 단행본, 회의보고서, 소 트웨어 등 국내학 회 작권 폴리시 db × scpj 운 ( , 여개 일본학 회 의 작권 정보 수록) 연구업 db 연동 개교 %가 연계 완료, %가 미연계, %검토 외부 발신 체계 국제 oai 로바이더 등록 독자 ir 운 (dspace) 기 - 개 등록 roar opendoar 부분 등록 oaister 개 등록 국제 검색엔진 등록 × 일부 구 스칼라 scirus, scopus에 색인 제공 자국내 디 토리 운 여부 × 일본 roar irdb운 <표 > 한국과 일본의 기 포지토리 개 황 비교 정보 리학회지 제 권 제 호 형, 운 기술까지 유사한 모습으로 발 되어 가고 있다. 반면, 일본은 공모를 통해 기 포 지토리 구축 운 학을 선정하고, 운 기술 의 개발과 공개 책임을 부과함으로써 독자 으 로 운 기술을 성숙시켜 나가고 있다. 재, 한 국의 dcolletion은 학의 연구 성과를 공표하 고 보존하기 한 자발 운 도구라기보다, 분산되어 있는 국가의 학술연구성과를 수집하 여 신속하게 공유 유통시키기 한 포털 주도 운 도구로써 의미 있어 보인다. 어 되었 건 상 으로 열악한 국내 환경에서 이러한 방식의 추진은 지 과 같은 기반 마련과 인식 확산에 동력으로 작용하 다. 이제 공은 학 측으로 넘겨져, 기 포지토리의 자발 운 성숙으로 연계되어야 할 것이다. 본 연구에서 일본의 기 포지토리 개 양상을 조사해 본 결과, 한국의 기 포지토 리 발 에 참조할 수 있는 몇 개의 과제를 도출 할 수 있었다. 
첫 번째, 한국 학이 자발 으로 기 포 지토리를 성숙시킬 수 있는 기반이 마련되어야 할 것이다. 지 까지의 양 성장은 부분 정 부 주도로 이루어졌으나 기 포지토리의 질 수 을 제고시키는 것은 학의 자발 노 력에 의존한다. 학내에서 기 포지토리가 학의 성과를 외부로 알릴 수 있는 공식 창 구로서 인정받고 체계 으로 지원받을 수 있어 야 한다. 그 게 하기 해서는 기 포지토 리에 한 인식 제고와 제도 지원이 필요할 것이다. 더불어 셀 아카이빙 로세스를 정 립하고 교원의 인식을 제고시키는 등 학내에서 기 포지토리를 활성화시키기 한 단계 과제의 수립과 이행이 필요할 것이다. 기 포지토리 운 을 단순히 도서 만의 사업이 아 니라, 학의 사회 책임에 한 이행, 연구 활 동에 한 사회 환원이라는 에서 개시 켜야 할 것이다. 두 번째, 기 포지토리를 통해 오 엑세 스가 실 될 수 있도록 기반이 마련되어야 한 다. 진정으로 오 엑세스가 필요한 자원, 특히 상업출 사의 학술 원문이 기 포지토 리로 등록될 수 있도록 지원되어야 한다. 국내 학회지 논문 아카이빙을 해서는 먼 , 일본 의 scpj와 같은 학 회 작권 정책 데이타베 이스가 구축․운 되어야 할 것이다. 출 양 상이나 작권 리 수 에 따라 조사 진행 자 체가 어려운 학 회도 존재하겠지만, keris 주 하에 몇 개 주요 학이 분담하여 시 히 추진해야 할 과제로 보인다. 한편, 외국 특 히 sci 에 수록된 한국 자의 논문은 자가 소속된 기 에 아카이빙되어 오 엑세 스될 수 있도록 일 허가를 추진하는 것도 의 미 있을 것이다. 세 번째, 기 포지토리의 사용성과 컨텐츠 의 발신력이 제고되어야 할 것이다. 연구업 시 스템, opac 등 학내 시스템과의 연동, 외부 자 원으로부터의 심리스한 연계, 오 아카이 간 의 상호 운용성 제고, 이름 거 구축 등 기 포지토리의 사용성을 제고하기 한 다각 연구와 실험이 이루어져야 한다. 더불어, 기 포지토리 수록 컨텐츠의 로벌한 발신에도 심을 기울여야 할 것이다. roar, oaister에 등록을 유도하고 google scholar 등 국제 검색 엔진에 기 포지토리 데이타가 노출될 수 있 도록 제휴하는 노력도 필요할 것이다. 마지막으로 dcollection은 단지 분산된 자원 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 을 수집하기 한 통합 포털의 운 도구가 아 니라, 사회 책임을 가지고 있는 학이 그 성 과를 외부로 알리고, 학내 자원을 수집․ 리․보존할 수 있는 효과 수단임을 인정받아 야 할 것이다. 자발 으로 필요를 느끼기 시작 한 학들이 학내에 별도의 ir 시스템을 도입 하여 운 하기 시작한다면, dcollection은 본연 의 기능이 무색해질 것이다. 학 논문 온라인 제출기로 축소되기에는 그동안 축 된 기술 ․운 토 가 무 아깝다. 참 고 문 헌 강정원. . dcollection 보 학의 활용 황 개선 방안. ꡔ 년 dcollection 참 여기 워크 자료ꡕ. [cited . . ]. <http://www.dcollection.net/search/no ticedetail.do?ticeid= >. 京都大学附属図書館電子情報掛. . 京都 大学学術情報リポジトリの戦略. ꡔ平 成 年度csi委託事業交流報ꡕ. [cited . . ]. <http://repository.kulib.kyoto-u.ac.jp/ dspace/bitstream/ / / /kure nai_csi_ .pdf>. 広島大学図書館. . 地域共同リポジトリ の取組みと課題:広島県大学共同リ ポジトリ-(harp).ꡔ平成 年度csi委 託事業報告交流会ꡕ. [cited . . ]. <http://www.nii.ac.jp/irp/event/ /debrief/pdf/ - _hirodai.pdf>. 国立国会図書館関西館. . 故エネルギー 物理学は 再び 学術 情報 流通に 革新 を 持って来ようか. ꡔ傾向 アウェアネ ス-eꡕ, no. . [cited . . ]. <http://current.ndl.go.jp/e >. 国立国会図書館関西館. . 
arxivとojsを 活用したオーバーレイジャーナルの構 想(英国). ꡔcurrent awareness portalꡕ. [cited . . ]. <http://current.ndl.go.jp/node/ >. 国立国会図書館関西館. . リポジトリに おける名称典拠 入の試み(英国). ꡔcurrent awareness portalꡕ. [cited . . ]. <http://current.ndl.go.jp/e >. 国立大学図書館協会. . ꡔ電子図書館機能 の高次化に向けて: ꡕ. 国立大学図書 館協会. [cited . . ]. <http://wwwsoc.nii.ac.jp/janul/j/projec ts/si/dc_lastreport.pdf>. 国立情報学研究所. . 学術機関リポジト リ構築連携支援事業とnii関連事業. ꡔ平成 年度csi委託事業報告交流会 ꡕ. [cited . . ]. <http://www.nii.ac.jp/irp/event/ /debrief/pdf/ - _nii.pdf>. 国立情報学研究所. . ꡔ学術機関リポジト リ構築ソフトウェア実装実験プロ 정보 리학회지 제 권 제 호 ジェクト報告書ꡕ. [cited . . ]. <http://www.nii.ac.jp/metadata/irp/ni i-irpreport.pdf>. 金沢大学. . ꡔa project on data sharing for archievement database and insti- tutional repositoryꡕ. [cited . . ]. <http://www.lib.kanazawa-u.ac.jp/k ura/achievement/>. 金沢大学情報部情報 画課. . 「業績db․ ir連携プロジェクト」の概要と今後 の課題. ꡔ平成 年度csi委託事業報 告交流会ꡕ. [cited . . ]. <dspace.lib.kanazawa-u.ac.jp/dspace /bitstream/ / / /h csi_ _ka nazwa_presentation.pdf>. 島根大学学術国際部図書情報課. . ꡔirと セルフアーカイビング-島根大学学 術情報リポジトリ(swan) ほか-ꡕ. [cited . . ]. <http://www.nii.ac.jp/irp/event/ /debrief/pdf/ - _shimanedai.pdf>. 文部科学省研究振 局学術研究助成課.  . ꡔ平成 年度科学研究費補助金 における制度改正について(通知)ꡕ. [cited . . ]. <http://www.mext.go.jp/a_menu/shi nkou/hojyo/ .htm>. 北海道大学. . 機関リポジトリコミュニティ の活性化, ꡔ平成19年度CSI委託 事業報告交流会ꡕ. [cited . . ]. <http://mitizane.ll.chiba-u.jp/metadb /up/irwg / _csi-report_nii. pdf>. 杉田茂樹. . access path to institutional resourcesvia link resolvers. ꡔ平成 年度 csi委託事業報告交流会ꡕ. [cited . . ]. <http://www.nii.ac.jp/irp/event/ /debrief/pdf/ - _hirodai.pdf>. 樽商科大学附属図書館. . 社会科学系 規模大学irbarrelの取組み. ꡔ平成 年度csi委託事業報告交流会ꡕ. [cited . . ]. <http://www.nii.ac.jp/irp/event/ /debrief/pdf/ - _otarushodai.pdf>. 이나니. . ꡔ기 포지토리를 심으로 한 학 학술정보 리 방안 연구ꡕ. 서울: 한국교육학술정보원. 逸村裕. . 動向レビュー: 日本における 機関リポジトリの展開: 学術情報流 通と蓄積の変容, ꡔカレントアウェア ネスꡕ, no . [cited . . ]. <http://current.ndl.go.jp/ca >. 静岡大学附属図書館. . irからgeniiへ のリンクで広がる学術情報ナビゲー ション. ꡔ平成 年 度csi委託事業報 告交流会ꡕ. [cited . . ]. <http://www.nii.ac.jp/irp/event/ /debrief/pdf/ - _shizuokadai.pdf>. 千葉大学. . 
図書館活動全体から⾒たir. ꡔ平成 年 度csi委託事業報告交流会ꡕ. [cited . . ]. <http://www.nii.ac.jp/irp/event/ /debrief/pdf/ - _chibadai.pdf>. 千葉大学附属図書館. . ꡔcuratorの パートナー、scirus(サイラス) についてꡕ. 일본의 개 양상을 통해서 본 한국 기 포지토리의 과제 [cited . . ]. <http://mitizane.ll.chiba-u.jp/curator/ aboutscirus.html>. 한국교육학술정보원. . dcollection 추진 경 과 보고. ꡔ 년 dcollection 참여기 워크 자료ꡕ. [cited . . ] <http://www.dcollection.net/search/ noticedetail.do?ticeid= >. beunen, annemarie. . acceptance of the jisc/surf licence to publish & ac- companying principles by traditional publishers of journals. “surf foudation final report.” <http://www.surffoundation.nl/downlo ad/ltp-final-report-dec .pdf>. quated in 国立国会도図書館関西館. 伝統ある 学術雑誌出版社はオープンアクセスを 受け入れているか? current awareness portal( . . ). [cited . . ]. <http://current.ndl.go.jp/node/ >. digital scholarship, citation, location, and deposition in discipline & institutional repositories. . “digitalkoans.” [cited . . ]. <http://digital-scholarship.org/digital koans/ / / /citation-location-a nd-deposition-in-discipline-institution al-repositories/>. catalogablog. . “oai-ore resource maps.” <http://catalogablog.blogspot.com/ / /oai-ore-resource-maps.html>. [cited . . ]. futureinfonet 컨소시엄. . 년도 dcollection 호스 시스템 보 . ꡔ 년 dcollection 참여기 워크 자료ꡕ. [cited . . ]. <http://www.dcollection.net/search/ noticedetail.do?ticeid= >. hagedorn, kat and joshua santelli. . “google still not indexing hidden web urls,” d-libmagazine. vol , no / , july/august. [cited . . ]. <http://www.dlib.org/dlib/july /ha gedorn/ hagedorn.html>. oecd. . “oecd principles and guidelines for access to research data from pu- blic funding.” [cited . . ]. <http://www.oecd.org/dataoecd/ / / .pdf>. suber, peter. . “timeline of the open ac- cess movement.” [cited . . ]. <http://www.earlham.edu/~peters/f os/timeline.htm>. sugita, shigeki, et. al. . “linking service to open access repositories.” d-lib magazine, vol. , no. / . [cited . . ]. <http://www.dlib.org/dlib/march /s ugita/ sugita.html>. 
Prifysgol Bangor / Bangor University

Revisiting the classification of Gallo-Italic
Tamburelli, Marco; Brasca, Lissander
Digital Scholarship in the Humanities
DOI: . /llc/fqx
Published: / /
Peer reviewed version

Citation for published version (APA): Tamburelli, M., & Brasca, L. ( ). Revisiting the classification of Gallo-Italic: a dialectometric approach. Digital Scholarship in the Humanities, ( ), - . https://doi.org/ . /llc/fqx

Revisiting the classification of Gallo-Italic: a dialectometric approach.
Marco Tamburelli (Bangor University)
Lissander Brasca (Bangor University)

Abstract
While Gallo-Italic varieties clearly belong to the Romance language family, their subgrouping as either Gallo-Romance or Italo-Romance has been the source of disagreement in the classificatory literature. While earlier analyses tended to classify Gallo-Italic as Gallo-Romance (notably Schmid, ; Bec, - ), later work has either argued for or tacitly assumed a classification of Gallo-Italic as part of the Italo-Romance branch, a view that is both different from and irreconcilable with the earlier Gallo-Romance classifications. In this paper we aim to contribute to the development of an empirically based classification of Gallo-Italic through the use of dialectometry applied to atlas corpora, and specifically through the measurement of Levenshtein distance. Using three wordlists (Swadesh , Swadesh , Leipzig-Jakarta) and comparing twenty-six linguistic varieties across Italy and south-eastern France, we show that Gallo-Italic is best classified as a third subgroup within the Gallo-Romance branch. Our results also clearly identify all the major bundles of isoglosses established through traditional dialectological methods and confirm Gallo-Italic as a relatively homogenous group distinct from Italo-Romance.

Introduction
Since the work of Bartoli ( ) and Wartburg ( ), it is generally agreed that the Rimini-La Spezia line is an important isogloss for Romance classification in general and for the classification of Italian vernaculars in particular (see Green, ; Iacobini, ; Repetti, ; and the volume edited by Maiden & Parry, , for more recent discussion). There is also broad agreement on the fact that most of the Romance vernaculars traditionally spoken between the Alps and the Rimini-La Spezia line form a generally homogenous group known as “Gallo-Italic”.
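The Levenshtein distance measurement named in the abstract can be illustrated with a short sketch. It is not the authors' implementation. It assumes plain character strings as transcriptions, unit cost for every insertion, deletion and substitution, and normalisation by the length of the longer word; the normalisation actually used in the study is not stated in this excerpt. The function names and the word forms in the usage note are invented for illustration and are not dialect data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop (less memory)
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer word's length
    (an assumed normalisation, chosen so that values fall in [0, 1])."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def variety_distance(words_a: dict, words_b: dict) -> float:
    """Mean normalised distance over the concepts attested in both
    word lists (concept -> transcription)."""
    shared = [concept for concept in words_a if concept in words_b]
    if not shared:
        raise ValueError("the two word lists share no concepts")
    total = sum(normalized_distance(words_a[c], words_b[c]) for c in shared)
    return total / len(shared)
```

With two invented lists such as `{"c1": "kata", "c2": "miso"}` and `{"c1": "kato", "c2": "miso"}`, `variety_distance` averages one substitution in four characters with one exact match. Repeating the calculation for every pair of varieties gives a distance matrix of the kind that clustering methods take as input.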
as shown in figure , this group is comprised of the romance varieties historically spoken in the administrative regions of emilia-romagna, liguria, lombardy, and piedmont in italy and the canton of ticino in switzerland, as well as in smaller areas in the province of trento, the swiss canton of grisons/graubünden and in the northern most part of tuscany. gallo-italic varieties border with venetan varieties to the east and with occitan and franco-provençal to the west (for an overview, see harris & vincent, ; and posner, ). fig. the gallo-italic group and its neighbouring romance varieties. each variety is identified via its iso - code (lewis, simons, & fennig, ). the classification of gallo-italic while the existence of gallo-italic as a group is undisputed, the classification of gallo-italic within romance is a point of contention. specifically, the scholarly literature on romance classification proposes two distinct classifications for gallo-italic. according to what is perhaps the most influential tradition on the classification of italian vernaculars, gallo-italic varieties belong to the italo-romance branch. on this view, gallo-italic is part of a dialect group that includes all romance varieties historically spoken in italy, corsica, and canton ticino (southern switzerland), but excluding occitan, franco-provençal/arpitan (classified as gallo-romance), sardinian (classified as a separate branch), ladin and friulian (classified as rhaeto-romance) . works that are representative of this tradition include wartburg ( ), merlo ( - ), hall ( ), pellegrini ( ; ), loporcaro ( ) and, in some respects, lausberg ( ). within this tradition, gallo-italic varieties are either explicitly classified as italo-romance (and consequently excluded from the gallo-romance group) or indirectly presumed to be italo-romance, as is the case with classifications that consider gallo-italic as simply “northern italian” or even just “italian” (e.g. kabatek & pusch, ). 
however, according to a second, less influential tradition, gallo-italic is part of the gallo-romance branch, separated from italo-romance by the rimini-la spezia line. hull ( ) is perhaps the most extensive comparative analysis carried out within this tradition, culminating in a proposal for a genealogical classification of gallo-italic as part of a wider historical linguistic branch that he calls “padanian”, comprising the gallo-italic and rhaetic continua (not unlike the “cisalpine” group in the work of pellegrini, , ), and belonging to the larger gallo-romance branch within the western romance group. hull concludes that “the romance vernaculars of northern italy and rhaetia have conserved, and in many cases have developed further, their original gallo-roman structure” ( : ), and warns against giving unwarranted weight to the “superficial italic, german and franco-occitan influences” which are “insufficient to warrant a classification of all or part of the rhaeto-cisalpine zone as “italo-romance” in the strictly linguistic sense of the term.” ( : ). similar conclusions had previously been reached by ascoli ( ), schmid ( ) and bec ( - ) who explicitly classified gallo-italic varieties as part of the gallo-romance branch, a classification that found further support in the work of pellegrini ( , ) and kotliarov ( ). nevertheless, the classification of gallo-italic remains unclear, with most recent work not taking any particular stand on the issue, while at the same time assuming – either implicitly or explicitly – that gallo-italic is essentially italo-romance, and thus linguistically closer to the varieties south of the rimini-la spezia line than to – say – provençal or rumantsch. the assumption that gallo-italic belongs to the italo-romance branch is rather widespread in the modern sociolinguistic literature (e.g. cerruti, ; dal negro & vietti, ) but can also be found in the literature on romance typology (e.g. schmid, ), lexicography (e.g.
barbato & varvaro, ; crevatin, ), and syntactic analysis (garzonio & poletto, ). these two classifications – both based on the family-tree model – are not only distinct from each other, they are also irreconcilable insofar as they take gallo-italic to belong to two distinct sub-groups that are not in a sisterhood relationship. fig. partial tree model indicating the position of gallo-romance in relation to italo- romance. as shown by the oval in figure , the two classifications are irreconcilable since the two nodes under which they classify gallo-italic, namely gallo-romance and italo-romance, are neither sisters nor in a mother-daughter relationship. in other words, the contention is not only about the classification of gallo-italic in itself, but also about its ancestor as either western-romance or proto-italian, thus making the two classifications considerably different from each other. classificatory criteria the genetic or genealogical classification of languages is based on the measurement of successive innovations. each innovation sets a variety apart from its original parent language, and shared innovations among varieties provide evidence for the formation of a sub-family (see fox, for a detailed overview). importantly, innovations are only considered as evidence if they are pervasive. this is in order to exclude the possibility that some apparently innovative trait resulted from borrowing rather than from systematic change (see joseph & janda, for a complete discussion). for the same reason, the innovations at the basis of classificatory linguistics are mostly phonetic/phonological and occasionally morpho- phonological, since those are the linguistic areas where regular, systematic change can be observed. the more innovations a set of varieties share, the more likely it is that those innovations are due to common ancestry rather than to coincidental, parallel development. 
while these principles are generally undisputed, dialectologists have often selected specific linguistic traits that they wished to analyse for signs of innovation and/or archaisms. for example, pei ( ) compiled a classification of some romance varieties based solely on the development of stressed vowels (see also lüdtke, ), while politzer ( ) gave particular importance to the synchronic conservation of plural -s when classifying the romance varieties of italy, a position partly endorsed by pellegrini ( ) and francescato ( ). perhaps unsurprisingly, different classifications have emerged depending on the traits selected and/or on the significance that different researchers have associated with particular traits. however, given that – as we have seen – language grouping in classificatory linguistics is intended to reflect systematic, pervasive change, researchers have increasingly questioned classifications that rely on linguistic traits selected a priori. while being a practical necessity in traditional comparative dialectology, the selection of a limited number of specific traits necessarily involves subjective judgements (mcmahon & mcmahon; ; starostin, ; szmrecsanyi & wolk, ), and may result in erroneous classifications as the pre-selected traits become overly influential in the final analysis. in keeping with this view, this paper aims to contribute to the development of an empirically-based classification of gallo-italic through the use of dialectometry applied to atlas corpora, and specifically through the measurement of levenshtein distance. the following research question will therefore be addressed: should gallo-italic be classified as part of the gallo-romance branch or the italo- romance branch? unlike traditional dialectology, dialectometry does not select linguistic traits a priori, relying instead on the extraction of patterns from quantitative data. 
therefore, dialectometric analyses can offer an insight into unresolved classifications, by largely eliminating the issue of subjective feature selection and enabling the identification of aggregate differences (nerbonne & kleiweg, ) and ‘seemingly hidden structures’ (goebl and schiltz, : ) emerging from the combination of individual linguistic variables. dialectometric measurements in general, and levenshtein distance in particular, have been successfully applied in the classification of varieties within the irish gaelic (kessler, ), dutch (heeringa, ; nerbonne, ; nerbonne et al., ), and norwegian (gooskens & heeringa, ) continua, as well as the italo-romance varieties of tuscany (montemagni et al., ; wieling et al., ). moreover, the measurement of linguistic distance has been argued to help evaluate the descriptive power of traditional classifications, particularly in cases of disagreement (tang & van heuven, ; wichmann, holman, bakker, & brown, ), as is the case for gallo-italic.
method
wordlist comparison
wordlist comparison is a quantitative technique aimed at detecting the degree of historical affinity between languages and/or language varieties. this technique relies on the use of one or more lists of basic vocabulary items, where each item is a concept (or meaning, hence wordlists are sometimes also called “meaning lists”, e.g. mcmahon and mcmahon, ), together with the associated word forms, which are collected for each concept in each of the language varieties to be compared. all word forms corresponding to each concept are then compared across language varieties in order to measure the proportion of sound change correspondences and cognates that are shared, thus yielding a classification of the degrees of phylogenetic relatedness among the linguistic varieties at issue. wordlist comparison differs radically from a corpus comparison approach (e.g. heeringa, ) as well as from approaches that compare items on word databases (e.g. nerbonne et al., ).
specifically, wordlist comparison aims to evaluate only items of basic vocabulary that are known to be relatively resistant to borrowing (tadmor, haspelmath, & taylor, ; thomason, ), such as pronouns, numerals and body parts. in doing so, wordlists exclude as much as possible items that may be shared across varieties for reasons other than shared genealogy – particularly contact-induced change – thereby leading to more accurate genealogical grouping (e.g. gray & atkinson, ; mcmahon et al., ; haspelmath, ). the most widely used wordlists are the ones developed by morris swadesh in the s, which came to be known as the “swadesh lists”. these lists were compiled by swadesh himself as part of his pioneering work on quantitative lexical comparison and were based on his own experience of what tended to be “stable” items in a language’s lexicon. the original swadesh list included items (known as the “swadesh ” list), while a later proposal introduced an alternative version containing items, which came to be known as the “swadesh- ” (for a detailed discussion, see swadesh, ). while the swadesh lists have been widely used in the literature on quantitative comparison, tadmor ( ) pointed out the need for a more strongly “empirically-based” alternative that “makes use of the powers of computational linguistics” (tadmor, : ). it is with this aim in mind that the leipzig-jakarta list was developed. tadmor ( ) and haspelmath and tadmor ( ) argue that the leipzig-jakarta is probably the most empirically accurate wordlist available, having been developed through quantitative comparison of , words from a database of languages representing a broad range of linguistic families and sub-families from across the world. this process involved an initial list of , concepts to be associated with their respective word forms in all languages.
however, languages are not always semantically congruent, with some languages having more than one word associated with concepts for which other languages have no word (e.g. they rely on a periphrastic construction). consequently, the number of word forms collected for the , concepts varied across languages, ranging between and word forms depending on language. these word forms and their associated concepts were subsequently weighted for “borrowability” (van hout & muysken, ), namely the “relative likelihood that words with particular concepts would be borrowed” (haspelmath & tadmor, : ). this process resulted in the compilation of the leipzig- jakarta wordlist, containing only the concepts that consistently achieved a low score for borrowability across the corpus. as a result, the leipzig-jakarta is probably the most accurately calibrated wordlist available for genealogical classification. nevertheless, the current study also includes the swadesh wordlists since these have been widely used in the literature on quantitative comparison of the major romance and other indo-european languages (e.g. forster, toth, & bandelt, ; mcmahon & mcmahon, ; rama et al., ). materials: wordlists three conventional wordlists were compiled for the current study: swadesh (swadesh, ), swadesh (swadesh, ), and leipzig-jakarta (tadmor, ). the three wordlists have a high degree of overlap (see tadmor, for a detailed analysis). this is chiefly due to two facts. firstly, all three lists have been developed with the precise intent to minimise the presence of borrowings in order to increase the accuracy of the resulting genealogical grouping (see discussion in the previous section above). secondly, there exist only a limited number of concepts to choose from when constructing a wordlist since only a small subset of word categories and semantic fields tend to be resistant to borrowing (see tadmor, haspelmath, & taylor for a detailed overview). 
specifically, the leipzig-jakarta shares % of the items with the swadesh- list and % of the items with the swadesh list. the swadesh list – which was developed from the swadesh list with the intent to provide a potentially more accurate wordlist (swadesh, ) – resulted from the removal of concepts from the swadesh list plus the addition of concepts. overall, the three lists contain distinct concepts. at this point it is important to note that while the swadesh list is close to being a subset of the swadesh , this does not necessarily mean that the swadesh will yield a more reliable classification than the swadesh (see also zhang & gong, ). this is due to the fact that wordlists are based on two parameters (e.g. mcmahon & mcmahon, ; swadesh, ; zhang & gong, ). one of these is a quantitative parameter, namely the number of items that make up the list, and its importance lies in the fact that fewer items necessarily provide fewer measurement points with which to gauge degrees of genealogical relatedness (embleton, ). the other parameter, which is qualitative in nature, is the high resistance to borrowing required of the concepts in the wordlist. while a longer list will include more potential for measurement, it also increases the probability that some of its items might include “undiagnosed or misdiagnosed loans” that can subsequently “obscure the familial signal and lead to erroneous classifications” (mcmahon et al., : ). as an example, let us take a hypothetical wordlist l which misguidedly includes the concept for “letter” among its items. let us then suppose that this list is used to measure the degree of genealogical relatedness between english, french, and german. due to the fact that the english word for “letter” is a borrowing from french, our hypothetical list would partially contribute towards the erroneous conclusion that english is genealogically closer to french (‘lettre’) than to german (‘brief’). 
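The effect of a single undiagnosed loan can be illustrated with a small sketch. It averages a length-normalised edit distance over the items of a toy wordlist, once without and once with the concept "letter". Everything except the forms for "letter" is an assumption made for illustration: the two other concepts, the use of ordinary spelling in place of phonetic transcription, the normalisation by the longer form, and all function and variable names.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(previous[j] + 1,            # deletion
                               current[j - 1] + 1,         # insertion
                               previous[j - 1] + (x != y)))  # substitution
        previous = current
    return previous[-1]


def normalised(a, b):
    """Edit distance divided by the longer form (assumed normalisation)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(wordlist, variety_a, variety_b):
    """Average per-concept distance between two varieties."""
    scores = [normalised(forms[variety_a], forms[variety_b])
              for forms in wordlist.values()]
    return sum(scores) / len(scores)


# Hypothetical two-item core list in ordinary spelling.
CORE = {
    "hand": {"english": "hand", "german": "hand", "french": "main"},
    "water": {"english": "water", "german": "wasser", "french": "eau"},
}
# The same list with the borrowed item discussed in the text.
WITH_LOAN = dict(CORE, letter={"english": "letter",
                               "german": "brief",
                               "french": "lettre"})

for name, wordlist in (("core", CORE), ("with loan", WITH_LOAN)):
    print(name,
          round(mean_distance(wordlist, "english", "french"), 3),
          round(mean_distance(wordlist, "english", "german"), 3))
```

In this toy setting, adding "letter" lowers the average English-French distance and raises the English-German one, which is the direction of the distortion described above. With only three items the effect is exaggerated; on a full wordlist a single item would move the average far less.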
naturally, a single item would only minimally skew the overall result, but the point remains that the longer the wordlist, the higher the probability that such items may have been unwittingly included (on this point, see also mcmahon & mcmahon, ). therefore, a longer wordlist does not necessarily yield more genealogically accurate results, since length is only one of the two relevant parameters, and arguably the less important one as far as establishing degrees of genealogical relatedness is concerned. indeed, emory ( ) argued that the swadesh- yields more accurate results than the swadesh- as far as polynesian languages are concerned (though this is not true of all languages), echoing swadesh’s view that “quality is at least as important as quantity” ( : ). with these points in mind, we followed current practice in quantitative linguistic comparison (e.g. calude & pagel, ; rosendal & mapunda, ; syrjänen, honkola, korhonen, lehtinen, vesakoski & wahlberg, ) by including all three conventional wordlists in the current study.
materials: atlases
the three wordlists were compiled with the respective word forms taken from two linguistic atlases: the linguistic atlas of italy and southern switzerland (sprach- und sachatlas italiens und der südschweiz, jaberg and jud, - ) and the atlas linguistique de la france (gilliéron & edmont, ). the two atlases share a number of methodological features, not least because the linguistic atlas of italy and southern switzerland was produced by former pupils of gilliéron and with the explicit intent to apply the field techniques of the atlas linguistique de la france to the swiss and italian areas. both atlases covered the romance-speaking areas in their respective countries of interest (i.e. france, italy and switzerland), and the authors report that particular attention was paid to collecting data at equidistant intervals.
for most locations, data was collected from a single informant, usually male, for a total of localities (atlas linguistique de la france) and (linguistic atlas of italy and southern switzerland). the informants’ age varied widely for both atlases, though the majority of informants were males (due to cultural restrictions of the times) and between the ages of forty and seventy. the two atlases were compiled with data elicited by means of a questionnaire that was administered orally by the respective researchers. the questionnaire elicited individual lexical items and simple phrases which were then transcribed by the researchers using a set of symbols and diacritics to achieve accurate phonetic representation.
procedure
the three wordlists were applied to a total of twenty-four linguistic points. twenty points were taken from the linguistic atlas of italy and southern switzerland (sprach- und sachatlas italiens und der südschweiz, jaberg and jud, - ) and four points from the atlas linguistique de la france (gilliéron & edmont, ). the linguistic points include six points within the gallo-italic continuum, as well as eight points from italo-romance varieties (i.e. varieties south of the rimini-la spezia line, including all three traditionally identified subgroups: central, southern, and extreme southern), two points within sardinian and eight points representing gallo-romance varieties that are uncontroversially classified as separate from italo-romance. within the varieties spoken in italy, we included a point immediately north of the rimini-la spezia line (loiano, atlas point ) and one immediately to the south (barberino, atlas point ) in order to examine the extent of the linguistic differences across the established bundle of isoglosses. the two selected points are among the closest inhabited points across the rimini-la spezia line, approximately twenty-five km apart “as the crow flies” ( km by road) and with mostly uninhabited mountainous terrain between them.
standard italian and standard french were also included to provide potential reference points, thus amounting to varieties in total. a summary of the varieties included in the comparison is given in table (north-west to south-east), while figure provides the geographical positions of the linguistic points.

table summary of the linguistic points from the linguistic atlas of italy and southern switzerland (ais) and the atlas linguistique de la france (alf) with respective classifications

geographical point (atlas ref.) | subgrouping | country (atlas)
standard french (n/a) | oïl, gallo-romance | france (n/a)
haute-savoie, chamonix ( ) | franco-provençal, gallo-romance | france (alf)
savoie, chignin ( ) | franco-provençal, gallo-romance | france (alf)
aosta valley, rhêmes ( ) | franco-provençal, gallo-romance | italy (ais)
lanzo valley, stura ( ) | franco-provençal, gallo-romance | italy (ais)
hautes alpes, le monetier ( ) | occitan (vivaroaupenc), gallo-romance | france (alf)
susa valley, rochemolles ( ) | occitan, gallo-romance | italy (ais)
susa valley, cesana ( ) | occitan, gallo-romance | italy (ais)
var, aups ( ) | occitan (provençal), gallo-romance | france (alf)
turin ( ) | pedemontese, gallo-italic | italy (ais)
asti ( ) | pedemontese, gallo-italic | italy (ais)
milan ( ) | lombard, gallo-italic | italy (ais)
bergamo ( ) | lombard, gallo-italic | italy (ais)
nonantola ( ) | emilian, gallo-italic | italy (ais)
loiano ( ) | emilian, gallo-italic | italy (ais)
barberino ( ) | central italian, italo-romance | italy (ais)
florence ( ) | tuscan, italo-romance | italy (ais)
standard italian (n/a) | italian, italo-romance | italy (n/a)
perugia ( ) | central italian, italo-romance | italy (ais)
lazio, rieti ( ) | central italian, italo-romance | italy (ais)
naples ( ) | southern italian, italo-romance | italy (ais)
basilicata, pisticci ( ) | southern italian, italo-romance | italy (ais)
calabria, acri ( ) | extreme southern, italo-romance | italy (ais)
palermo ( ) | sicilian, extreme southern, italo-romance | italy (ais)
macumere ( ) | logudorese, sardinian | italy (ais)
cagliari ( ) | campidanese, sardinian | italy (ais)

fig. location of the linguistic points included in the dialectometric comparison

as the two atlases use different transcription systems, all items were re-transcribed in the international phonetic alphabet (ipa) by the first author.

distance measurements
distance between the varieties was measured using string edit distance, commonly known as levenshtein distance. in its simplest form, levenshtein distance is the sum of the least costly set of operations needed to “transform one string into another” (nerbonne & heeringa, : ). figure illustrates how this measurement is achieved when comparing the word for “liver” between a lombard variety (milan, atlas point ) and an occitan variety (le monetier, atlas point ).

fig. example of distance measurement including insertion, deletion and substitution

the levenshtein distance is given by the total number of operations, thus yielding a distance of four for the above example. however, it has been argued that “more phonetic sensitivity” (nerbonne & heeringa, : ) needs to be incorporated into string distance measures if we are to achieve accurate measurements on linguistic data. in the current research, measurements were therefore carried out according to a more “linguistically responsible” version of levenshtein distance measurements (nerbonne, colen, gooskens, kleiweg, and leinonen, : ), which applies parameters that have been developed specifically for linguistic data. these parameters include the incorporation of normalisation for word length as well as the requirement that consonants and vowels always be kept distinct. finally, insertions and substitutions of diacritic marks are weighed . each (as opposed to the regular weight of ).
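The basic measure can be sketched in a few lines. The code below is a minimal illustration of plain Levenshtein distance with unit costs, followed by a normalisation for word length. It is not the authors' implementation: dividing by the length of the longer string is an assumed normalisation scheme, the example strings are hypothetical rather than atlas data, and the function names are invented for this sketch.

```python
def levenshtein(source, target):
    """Return the minimum number of insertions, deletions and
    substitutions needed to turn `source` into `target`."""
    # previous[j] is the distance between the part of `source` processed
    # so far and the first j segments of `target`.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]
        for j, t in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,             # delete s
                current[j - 1] + 1,          # insert t
                previous[j - 1] + (s != t),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def normalised_distance(source, target):
    """Raw distance divided by the longer length (assumed scheme)."""
    longest = max(len(source), len(target))
    return levenshtein(source, target) / longest if longest else 0.0


# Hypothetical strings, not forms from the atlases.
print(levenshtein("kitten", "sitting"))          # two substitutions, one insertion
print(normalised_distance("kitten", "sitting"))
```

The sketch treats every character as one segment and does not enforce the separation of vowels and consonants. In a fuller version, the reduced weight for diacritic marks would enter as a fractional cost for those insertions and substitutions.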
weighting diacritic marks less heavily than full segments allows an oral [a] and a nasalized [ã] to be counted as closer than a pair of distinct vowels such as [a] and [o], thus yielding a more phonetically informed measurement. comparison was carried out on all items on the wordlists (i.e. both cognates and non-cognates). as non-cognates necessarily yielded distances approaching %, the current comparison reflects lexical distance as well as phonetic distance between the varieties. this, together with the fact that wordlists maximally exclude potential borrowings, should allow for a potentially highly accurate representation of overall linguistic distance between varieties (bryant, filimon, & gray, ; kessler, ; yang, ).
results and discussion
figures to show hierarchical clustering of the twenty-six varieties for each of the wordlists using ward’s method, which has been shown to be a highly reliable method for language clustering (batagelj, pisanski, & keržič, ; nerbonne et al., ; nerbonne, heeringa, & kleiweg, ). there was a positive correlation between all three distance matrices, and a mantel test revealed that all correlations were statistically significant: swadesh and swadesh (r = . , p = < . ); swadesh and leipzig-jakarta (r = . , p = < . ); leipzig-jakarta and swadesh (r = . , p = < . ).
fig. hierarchical clustering based on the swadesh wordlist
fig. hierarchical clustering based on the swadesh wordlist
fig. hierarchical clustering based on the leipzig-jakarta wordlist
while not identical, the three wordlists return similar results and the groups that emerge are relatively similar, with some significant clustering patterns. in particular, we observe clusters that correspond to groups identified by traditional dialectological methods (for an overview, see harris & vincent, ; and posner, ). occitan varieties on both sides of the french-italian border form a single cluster in all three cases, as do the franco-provençal varieties of chamonix, chignin, and rhêmes.
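The comparison of distance matrices reported above can be sketched as a permutation-based Mantel test. The code is an illustrative sketch under stated assumptions, not the procedure or software used in the study: it uses Pearson correlation over the upper triangles, a one-sided test, a default of 999 permutations, and toy matrices built from invented positions on a line. All names are hypothetical, and the matrices are assumed not to be constant.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]


def mantel(matrix_a, matrix_b, permutations=999, seed=0):
    """Return (r, p): observed correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(matrix_a)))
    fixed = upper_triangle(matrix_a, identity)
    observed = pearson(fixed, upper_triangle(matrix_b, identity))
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the varieties of the second matrix
        if pearson(fixed, upper_triangle(matrix_b, order)) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Toy data: distances between four invented positions on a line.
positions = [0, 1, 3, 7]
toy_a = [[abs(p - q) for q in positions] for p in positions]
toy_b = [[2 * value for value in row] for row in toy_a]
print(mantel(toy_a, toy_b))
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, which is why an ordinary significance test on the correlation coefficient would not be appropriate here.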
the variety of stura-lanzo, however, is consistently clustered with gallo-italic rather than with the franco-provençal group with which it is traditionally associated. this could be due to the fact that the variety in question has a number of traits (some innovative and some conservative ones) that overlap with gallo-italic, such as [j] for latin -li-, retention of word-final -l and -s (e.g. lombard [na:s] and stura- lanzo [na:s] but franco-provençal [na:] or [no] for “nose”), retention of postvocalic [ŋ], and consistent use of [y] for latin Ū (long /u/), which in most franco-provençal varieties can be realized as [ø] or even undergo deletion. for two of the wordlists, franco-provençal varieties are clustered with french, and thus classed as closer to french than to occitan / provençal, in line with ascoli’s ( ) original suggestion and with a more or less established consensus among dialectologists (posner, ). this is not the case for the swadesh wordlist, however, suggesting that the swadesh and leipzig-jakarta more accurately reflect the traditional distinction between the three gallo-romance sub-groups. the rimini-la spezia line also consistently emerges as a major bundle of isoglosses in all three wordlists, with varieties on each side appearing in separate clusters, showing that the systematic differences originally identified by bartoli ( ) have been confirmed by the edit distance measurements. in particular, we can observe that loiano (immediately north of the rimini-la spezia bundle of isoglosses) and barberino (immediately to the south) are embedded in separate clusters, despite being separated by only km of sparsely inhabited territory. within varieties south of the rimini- la spezia line we can observe separate sub-clusters for varieties each side of the rome- ancona bundle of isoglosses originally identified by rohlfs ( ), separating central varieties from southern varieties and ultimately from sardinian. 
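For readers unfamiliar with the clustering step behind these groupings, Ward's method can be sketched directly on a distance matrix using the Lance-Williams update on squared distances. This is a minimal sketch, not the software used in the study: it assumes the input dissimilarities can be treated like Euclidean distances, it reports merge heights as square roots of the updated values (one common convention), and the function name and toy data are hypothetical.

```python
from math import sqrt


def ward_clustering(dist):
    """Agglomerative clustering with Ward's criterion.

    `dist` is a symmetric matrix of pairwise distances. Returns a list of
    merges (cluster_a, cluster_b, height, size); original items are
    numbered 0..n-1 and each new cluster receives the next free number.
    """
    n = len(dist)
    size = {i: 1 for i in range(n)}
    # Squared distances between all currently active clusters.
    d2 = {frozenset((i, j)): dist[i][j] ** 2
          for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(size) > 1:
        pair = min(d2, key=d2.get)  # closest pair under Ward's criterion
        i, j = sorted(pair)
        d_ij = d2.pop(pair)
        n_i, n_j = size.pop(i), size.pop(j)
        for k, n_k in size.items():
            d_ik = d2.pop(frozenset((i, k)))
            d_jk = d2.pop(frozenset((j, k)))
            # Lance-Williams update for Ward's method.
            d2[frozenset((next_id, k))] = (
                (n_i + n_k) * d_ik + (n_j + n_k) * d_jk - n_k * d_ij
            ) / (n_i + n_j + n_k)
        size[next_id] = n_i + n_j
        merges.append((i, j, sqrt(d_ij), n_i + n_j))
        next_id += 1
    return merges


# Toy data: four invented positions forming two tight pairs.
positions = [0, 1, 10, 11]
toy = [[abs(p - q) for q in positions] for p in positions]
for merge in ward_clustering(toy):
    print(merge)
```

On the toy data the two tight pairs are merged first and joined only at a much greater height, which is the kind of structure a dendrogram makes visible.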
the swadesh list appears to be less sensitive to the isoglottic bundle of the roma-ancona, clustering the variety of rieti with southern rather than with central varieties. finally, the tuscan origins of (standard) italian emerge in all three wordlists, with the varieties of florence and barberino clustering very closely with italian. classification of gallo-italic the gallo-italic group surfaces as a relatively homogenous cluster, with emilian varieties being slightly removed from the core cluster. as to its classification, comparisons for all three wordlists cluster gallo-italic varieties as closer to gallo-romance than to varieties south of the rimini-la spezia line, or italo-romance proper. fig. map representing the classification of gallo-italic as part of gallo-romance and separated from italo-romance by the rimini–la spezia line. this classification is consistent with the work of schmid ( ), bec ( - ), and hull ( ) but in opposition to the rather widespread stance that takes gallo-italic as essentially italo-romance (hall, ; pellegrini, , ; loporcaro, ; among others). as dialectometric comparison of wordlists does not select any linguistic feature a priori, it might be the case that the analyses that have assumed close affinity between gallo-italic and italo- romance have been biased towards traits that happen to be specific to italo-romance. alternatively, it may be the case that the sociolinguistic influence supposedly exerted on gallo-italic by tuscan has been previously overestimated (a point originally noted by hull, ). while the distinction between gallo-italic and other linguistic groups of italy (i.e. tuscan, central italian, southern italian, and extreme southern) is also accepted by traditional analyses, this distinction is often assumed to be an instance of sisterhood, with the different groups as parallel subsections of the italo-romance branch (e.g. de mauro, ). 
our dialectometric analysis shows that this is likely to be inaccurate, as the two subgroups are actually in a hierarchical relationship, with gallo-italic being closer to occitan than to italian, and thus considerably more distant from italian than either southern italian or sicilian are. in cladistic terms, gallo-italic appears as an ingroup of the gallo-romance branch rather than in a sisterhood relationship with italo-romance. similarly, the hierarchical clustering also consistently shows gallo-italic as being more distant from italian than occitan is from french, as the latter pair appear in a sisterhood relationship, while gallo-italic and italian do not. this is in keeping with the view that the rimini-la spezia line marks a stronger bundle of isoglosses than the oc-oïl line (wartburg, ; lausberg, ). our results are therefore compatible with the classification of gallo-italic as “a living branch of the gallo-romance linguistic tradition” (hull, : ).
conclusions
classification of gallo-italic as either gallo-romance or italo-romance has been a point of contention in the literature. while early studies on romance classification tended to group gallo-italic with gallo-romance (notably schmid, ), a later and arguably more influential tradition argued for the classification of gallo-italic as italo-romance by also relying on a priori selected traits, and thus not always in keeping with the tenets of the cladistic model. the current dialectometric study showed that – when relying on all components of wordlist comparison – the relatively large bundle of isoglosses that constitute the rimini-la spezia line consistently leads gallo-italic to be clustered with gallo-romance and as considerably distant from italo-romance varieties, lending support to the analyses of bec ( - ), schmid ( ), and hull ( ). specifically, gallo-italic forms a relatively homogenous third subgroup within the gallo-romance branch, the other two being occitan and franco-burgundian (i.e.
langue d’oïl and franco-provençal), as argued in hull’s ( ) extensive analysis. the rimini-la spezia line therefore emerges as the most important isogloss bundle in the north-south dimension of romance varieties. further research may benefit from expanding the set of linguistic properties to also include systematic comparison of grammatical properties and typological traits (bakker et al., ) in combination with the phonetic and lexical ones considered here. this might reveal further similarities between gallo-italic and other gallo-romance varieties, for example in syllabic structure (montreuil, ) or in the formation of clausal negation (zanuttini, ). some scholars classify venetan varieties as […] though the classification of rhaeto-romance and indeed the existence of rhaeto-romance as a separate branch is itself disputed, see benincà & haiman ( ) for discussion. on the elusiveness of a linguistic definition for “italo-romance”, see maiden and parry, ( ). though more than one informant was questioned for some areas whenever the fieldworkers thought it necessary. see temple, for a detailed discussion of the fieldwork techniques of the atlas linguistique de la france. references ascoli, g. i. ( ). paul meyer e il franco-provenzale. archivio glottologico italiano : - . bakker, d., müller, a., velupillai, v., wichmann, s., brown, c. h., brown, p. and holman, e. w. ( ). adding typology to lexicostatistics: a combined approach to language classification. linguistic typology, ( ): - . barbato, m., & varvaro, a. ( ). dialect dictionaries. international journal of lexicography, ( ): - . bartoli, m. ( ). caratteri fondamentali delle lingue neolatine. archivio glottologico italiano, : - . batagelj, v., pisanski, t., and keržič, d. ( ). automatic clustering of languages. computational linguistics, ( ): - . bec, p. ( - ). manuel pratique de philologie romane. a. and j. picard, paris. benincà, p., and haiman, j. ( ). rhaeto-romance languages. routledge, london. 
bryant, d., filimon, f., & gray, r. d. ( ). untangling our past: languages, trees, splits and networks. in mace, r., holden, c. j., & shennan, s. (eds.) the evolution of cultural diversity: a phylogenetic approach. left coast press, pp. - . calude, a. s., & pagel, m. ( ). frequency of use and basic vocabulary. in filipovic, l. & pütz, m. (eds.) multilingual cognition and language use: processing and typological perspectives. amsterdam: john benjamins publishing, pp. - . cerruti, m. ( ). regional varieties of italian in the linguistic repertoire. international journal of the sociology of language, : - . crevatin, f. ( ). italo-romance etymology and dictionaries: a difficult relationship. international journal of lexicography, ( ): - . dal negro, s., & vietti, a. ( ). italian and italo-romance dialects. international journal of the sociology of language, : - . de mauro, t. ( ). storia linguistica dell'italia unita. bari: laterza. embleton, s. ( ). statistics in historical linguistics. bochum: brockmeyer. emory, k. p. ( ). east polynesian relationships: settlement pattern and time involved as indicated by vocabulary agreements. the journal of the polynesian society, - . filipovic, l. & pütz, m. ( ). multilingual cognition and language use: processing and typological perspectives. amsterdam: john benjamins publishing. fox, a. ( ). linguistic reconstruction: an introduction to theory and method. oxford university press on demand. forster, p., toth, a., & bandelt, h. j. ( ). evolutionary network analysis of word lists: visualising the relationships between alpine romance languages. journal of quantitative linguistics, ( ): - . francescato, g. ( ). rhaeto-friulian. trends in romance linguistics and philology : - . garzonio, jacopo, and cecilia poletto. ( ). quantifiers as negative markers in italian dialects. linguistic variation yearbook ( ): - . gilliéron, j., and edmont, e. ( ). atlas linguistique de la france. champion. goebl, h., and schiltz, g. ( ). 
a dialectometrical compilation of clae and clae : isoglosses and dialect integration. computer developed linguistic atlas of england (clae). tübingen: max niemeyer verlag, . gooskens, c., and heeringa, w. ( ). perceptive evaluation of levenshtein dialect distance measurements using norwegian dialect data. language variation and change, ( ): - . gray, r. d., and atkinson, q. d. ( ). language-tree divergence times support the anatolian theory of indo-european origin. nature, ( ): - . green, j. n. ( ). romance languages. in comrie, b. (ed.), the world's major languages. london: routledge, pp. - . hall, r. a. ( ). external history of the romance languages (vol. ). new york: elsevier. harris, m., and vincent, n. (eds.). ( ). the romance languages. london: routledge. haspelmath, m. ( ). loanword typology: steps toward a systematic cross-linguistic study of lexical borrowability. in: stolz, t., bakker, d., and salas palomo, r. (eds.) aspects of language contact: new theoretical, methodological and empirical findings with special focus on romancisation processes. berlin: mouton de gruyter, pp. - . haspelmath, m., & tadmor, u. ( ). the loanword typology project and the world loanword database. in haspelmath, m. and tadmor, u. (eds.) loanwords in the world's languages: a comparative handbook. walter de gruyter, pp. - . heeringa, w. ( ) measuring dialect pronunciation differences using levenshtein distance. ph.d. thesis, university of groningen. hull, g. ( ). the linguistic unity of northern italy and rhaetia. ph.d. thesis, university of sydney. iacobini, c. ( ). the role of dialects in the emergence of italian phrasal verbs. morphology, ( ): - . jaberg, k. and jud, j. ( - ). sprach- und sachatlas italiens und der südschweiz, zofingen. joseph, b. d., & janda, r. d. ( ). the handbook of historical linguistics. malden, ma: blackwell. kabatek, j and pusch c. d. ( ). the romance languages. in kortmann, b., and van der auwera, j. 
(eds.), the languages and linguistics of europe: a comprehensive guide (vol. ). berlin: walter de gruyter. kessler, b. ( ). computational dialectology in irish gaelic. proceedings of the european acl. dublin: acl. – . kotliarov, ivan. ( ) a law of elision of unstressed vowels in western romance languages. glotta ( - ): - . lausberg, h. ( ). romanische sprachwissenschaft (vol. ). berlin: walter de gruyter. lewis, m. p., simons, g. f., and fennig, c. d. ( ). ethnologue: languages of the world, th ed. dallas, texas: sil international. online version: http://www. ethnologue. com. loporcaro, m. ( ). profilo linguistico dei dialetti italiani (vol. ). roma: laterza. lüdtke, h. ( ). die strukturelle entwicklung des romanischen vokalismus. romanisches seminar an der universität bonn. maiden, m., and parry, m. m. ( ). the dialects of italy. london: routledge mcmahon, a., & mcmahon, r. ( ). finding families: quantitative methods in language classification. transactions of the philological society, ( ): - . mcmahon, a. and r. mcmahon. ( ). language classification by numbers. oxford: oxford university press. mcmahon, a., heggarty, p. mcmahon, p. and slaska, n. ( ) swadesh sublists and the benefits of borrowing: an andean case study.transactions of the philological society ( ): - . merlo, c. ( - ). i dialetti lombardi. l’italia dialettale, : - . montemagni, s., wieling, m., de jonge, b. & nerbonne, j. ( ). synchronic patterns of tuscan phonetic variation and diachronic change: evidence from a dialectometric study. literary and linguistic computing, ( ): - . montreuil, j. p. ( ). sonority and derived clusters in raeto-romance and gallo-italic. amsterdam studies in the theory and history of linguistic science : - . nerbonne, j. ( ). computational contributions to the humanities. literary and linguistic computing, ( ): - . nerbonne, j., colen, r., gooskens, c., kleiweg, p., and leinonen, t. ( ). gabmap-a web application for dialectology. dialectologia, special issue : - . 
nerbonne, j. and heeringa, w. ( ). measuring dialect differences. in: auer, p. and schmidt, j. e. (eds.), language and space. an international handbook of linguistic variation. volume : theories and methods. berlin and new york: de gruyter mouton, pp. - . nerbonne, j., heeringa, w., van den hout, e., van der kooi, p., otten, s., and van de vis, w. ( ). phonetic distance between dutch dialects: clin vi, proceedings of the sixth clin meeting, antwerpen, december . nerbonne, j., heeringa, w., and kleiweg, p. ( ). edit distance and dialect proximity. in sankoff, d., and kruskal, j. b. (eds.) time warps, string edits, and macromolecules: the theory and practice of sequence comparison, nd edition. reading: addison-wesley publication. nerbonne, j., and kleiweg, p. ( ). toward a dialectological yardstick. journal of quantitative linguistics, ( - ): - . pei, m. a. ( ). a new methodology for romance classification. word, ( ), - . pellegrini, g. b. ( ). i cinque sistemi linguistici dell’italo-romanzo. revue roumaine de linguistique, : - . pellegrini, g. b. ( ). il “cisalpino” e l’italo-romanzo. archivio glottologico italiano, : - . pellegrini, g. b. ( ). il cisalpino e il retoromanzo. italia settentrionale: crocevia di idiomi romanzi. atti del convegno internazionale di studi (trento ), tübingen, - . politzer, r. l. ( ). final-s in the romania. romanic review, ( ): - . posner, r. ( ) the romance languages. cambridge, cambridge university press. rama, t., borin, l., mikros, g. k., & macutek, j. ( ). comparative evaluation of string similarity measures for automatic language classification. in mikros, g. k., and macutek, j. (eds.) sequences in language and text (vol. ). berlin/boston: walter de gruyter: - . repetti, l. ( ). teaching about the other italian languages: dialectology in the italian curriculum. italica, - . rohlfs, g. ( ). la struttura linguistica dell'italia (vol. ). keller. rosendal, t., & mapunda, g. ( ). is the tanzanian ngoni language threatened? 
a survey of lexical borrowing from swahili. journal of multilingual and multicultural development, ( ), - . schmid, h. ( ). Über randgebiete und sprachgrenzen. vox romanica xv. francke, bern. schmid, s. ( ). phonological typology, rhythm types and the phonetics-phonology interface. a methodological overview and three case studies on italo-romance dialects. in ender, a., leemann, a. a, and wälchli, b. (eds.), methods in contemporary linguistics. a festschrift in honour of iwar werlen. berlin/new york: mouton de gruyter, - . starostin, g. ( ). preliminary lexicostatistics as a basis for language classification: a new approach. journal of language relationship, : - . swadesh, m. ( ). "lexico-statistic dating of prehistoric ethnic contacts." proceedings of the american philosophical society, ( ): - . swadesh, m. ( ). towards greater accuracy in lexicostatistic dating. international journal of american linguistics, ( ): - . swadesh, m. ( ), origin and diversification of language, aldine· atherton, inc., chicago. syrjänen, k., honkola, t., korhonen, k., lehtinen, j., vesakoski, o., and wahlberg, n. ( ). shedding more light on language classification using basic vocabularies and phylogenetic methods: a case study of uralic. diachronica, ( ): - . szmrecsanyi, b., & wolk, c. ( ). holistic corpus-based dialectology. revista brasileira de linguística aplicada, ( ), - . tadmor, u. ( ). loanwords in the world’s languages: findings and results. in haspelmath, m. and tadmor, u. (eds.) loanwords in the world's languages: a comparative handbook. walter de gruyter, pp. - . tadmor, u., haspelmath, m., & taylor, b. ( ). borrowability and the notion of basic vocabulary. diachronica, ( ): - . tang, c., & van heuven, v. j. ( ). mutual intelligibility of chinese dialects experimentally tested. lingua, ( ): - . temple, r. a. ( ). old wine into new wineskins. a variationist investigation into patterns of voicing in plosives in the atlas linguistique de la france. 
transactions of the philological society, ( ): - . thomason, s. g. ( ). language contact. washington, d.c.: georgetown university press. van hout, r., & muysken, p. ( ). modeling lexical borrowability. language variation and change, ( ): - . wartburg, w. von ( ). die ausgliederung der romanischen sprachräume. bern: verlag francke. wichmann, s., holman, e. w., bakker, d., and brown, c. h. ( ). evaluating linguistic distance measures. physica a: statistical mechanics and its applications, ( ): - . wieling, m., montemagni, s., nerbonne, j., & baayen, r. h. ( ). lexical differences between tuscan dialects and standard italian: accounting for geographic and sociodemographic variation using generalized additive mixed modeling. language, ( ): - . yang, c. ( ). nisu dialect geography, sil electronic survey reports, , dallas: sil international. zanuttini, r. ( ). negation and clausal structure. oxford university press. zhang, m., & gong, t. ( ). how many is enough? statistical principles for lexicostatistics. frontiers in psychology, .

librarians in transition: scholarly communication as a core competency
eastern illinois university, from the selectedworks of steve brantley
fall september ,
steve brantley; todd bruns, eastern illinois university; kirstin duffin, eastern illinois university
available at: https://works.bepress.com/steve_brantley/ /

running head: librarians in transition

librarians in transition: scholarly communication support as a developing core competency

abstract: modern digital scholarship requires faculty to navigate an increasingly complex research and publication world.
liaison librarians are uniquely suited to assist faculty with scholarly communication needs, yet faculty do not identify the library as a provider of these services. proactive promotion of scholarly communication services by librarians is needed. in this article, scholarly communication training developed for librarians at a mid-sized public university is described. two surveys, one describing faculty digital scholarship needs and one describing librarian attitudes toward scholarly communication services, are presented. articulating scholarly communication support as a core competency affirms the importance of this developing role for librarians.

keywords: core competencies, liaison librarians, libraries & scholars, library services, scholarly communication

author information:
steve brantley (corresponding author), head of reference services, associate professor and liaison to communication studies, booth library, eastern illinois university, jsbrantley@eiu.edu
todd a. bruns, institutional repository librarian, associate professor and liaison to the school of technology, booth library, eastern illinois university, tabruns@eiu.edu
kirstin i. duffin, reference librarian, assistant professor and liaison to biological sciences, chemistry, and geology/geography, booth library, eastern illinois university, kduffin@eiu.edu

received: august , accepted: october ,

the activities of scholarly communication-support librarians have grown and changed in recent years due to the increasingly complex nature of modern digital scholarship. today’s scholars must wrestle with a dizzying array of options, from selecting search portals to ensure a thorough review of the literature, to making decisions about modes of publication and data curation, as well as considering their rights to self-archiving and self-determined dissemination of their work.
the variety of needs across disciplines necessitates an “engagement-centric” librarianship that is embedded in and responsive to the scholarly life of our faculty (kenney, ). for several years academic libraries have developed services to support scholarly communication (sc) on their campuses, and many association of research libraries (arl) member libraries have established librarians dedicated to the issues of sc and publishing (newman, blecic, & armstrong, ; radom, feltner-reichert, & stringer-stanback, ). scholars have begun to study the trajectory of these positions since librarian job responsibilities have adapted to scholarship developments in the digital world (bonn, ; xia & li, ). in many cases, even when a dedicated sc support position exists, the demand goes beyond the abilities of a single librarian, and traditional roles are taking up the banner of sc support. many studies have described initiatives where libraries implement sc training (bresnahan & johnson, ; bruns, brantley, & duffin, ; kirchner, ; malenfant, ; rodriguez, ; wirth & chadwell, ), and subject liaisons are increasingly being called upon to add sc support to their repertoire of duties.

another argument for expanding the responsibilities of liaison librarians is the impact of the longstanding serials crisis, coupled with devastating losses in library collection budgets following the great recession (prottsman, ). reduced budgets not only decrease the amount of collections work, but compel librarians to seek out materials available as open access and redouble outreach efforts to their faculty in order to better serve and collaborate with them. plutchak ( ) argues that, because of the availability of online resources, scholars now tend to view their research processes as largely outside the library, although the proliferation of online sc and changes in scholarly publishing ironically increase scholars’ need for librarians’ skills.
in this article, the authors argue for a change in the way sc knowledge and skills are perceived. while sc expertise is currently often the domain of a single position in the library, the authors believe researchers will be better served if sc is considered a core competency of subject librarians, similar to reference, instruction, and collection development. here, we describe a sc education program implemented for librarians at eastern illinois university (eiu), a carnegie classification master’s l university (awarding at least master’s degrees annually). we review a survey of eiu librarians’ behaviors and attitudes surrounding sc. we then consider the results of a survey of the digital scholarship needs of eiu faculty, which further emphasize the growing need for librarians to integrate sc skills into their competencies. finally, we affirm the need for sc to be a core competency of liaison librarianship and suggest that becoming proficient in sc support services is achievable for all librarians, irrespective of their maturity in the field.

literature review

the library and information science (lis) literature relevant to sc in the digital age has matured over the last several years and encompasses many facets of the concept. a sample of
a sample of librarians  in  transition       these issues include managing institutional repositories (armstrong, ; bruns, knight-davis, corrigan, & brantley, ; bull & eden, ; burns, lana, & budd, ; royster, ; schlangen, ; sterman, ), authors’ rights (wirth & chadwell, ), open access (clobridge, ; harnad, ; pinfield, ; quinn, ; suber, ; zhao, ), bibliometrics and altmetrics (bladek, ; brown, ; bruns & inefuku, ; carpenter, lagace, & bahnmaier, ; gargouri et al., ; gordon, ; konkiel & scherer, ), data management (krier & strasser, ; patel, ; pinfield, cox, & smith, ), library publishing (allen, ; busher & kamotsky, ; gilman, ; mcintyre, chan, & gross, ; park & shim, ; steele, ), research support services (kennan, corrall, & afzal, ; mitchell, ; vinopal & mccormick, ), and faculty engagement (reinsfelder & anderson, ; wiegand, ). rather than provide an exhaustive review, the authors focus on aspects of the literature relevant to the transformation of liaison librarians into service providers of sc support, trends toward faculty engagement in library services, including assessment of faculty needs, and literature promoting sc support as a core competency. transforming the liaison role there is no single way to introduce a sc-support training program. at oakland university, a medium-sized public university, several sc-related training events for librarians were implemented in anticipation of open access week, allowing the liaisons to increase their comfort and understanding with these issues (rodriguez, ). at the university of british columbia, liaison librarians surveyed the sc environment of their subject areas (kirchner, ). by doing so, liaisons refocused their services from “library-centric” (the collection) to “scholar-centric” (engagement and outreach) (kenney, ). oregon state university organized an authors’ rights workshop for their librarians (wirth & chadwell, ). 
at the university of librarians  in  transition       colorado at boulder, librarians were polled to determine where their sc knowledge was strong and where additional preparation was needed (bresnahan & johnson, ). the assessment revealed that most discomfort was related to data management. practical, hands-on training was identified as a solution to address this issue. liaison assignment of duties were redefined at the university of minnesota (malenfant, ). reference desk hours and collection development responsibilities were replaced with sc and institutional repository outreach initiatives. such a restructuring emphasized the move to offering new services, brought to light areas where liaisons felt less prepared to provide services, and resulted in a scholarly communications collaborative for sharing resources and support (malenfant, ). at a small liberal arts college in new york, the conversation began with liaison interviews of faculty to understand how to address campus sc needs (swoger, brainard, & hoffman, ). while there has been discussion about integrating sc into the responsibilities of liaison librarians (e.g., beaubien, masselink, & tyron, ; bresnahan & johnson, ; cox, verbaan, & sen, ; finlay, tsou, & sugimoto, ; kirchner, ; malenfant, ; rodriguez, ), not much has been written about this transition from the perspective of the liaison (but see, for example, taylor, ; turtle & courtois, ; zhang, liu, & mathews, ). this may be because sc still feels beyond the expertise of liaison librarians who fill multiple roles in an environment of constricting staff and budgets. without a formal transition that requires active participation with sc services, added responsibilities may end up somewhere near the bottom of a liaison’s long to-do list. faculty may not be engaging librarians with questions about copyright, open access, and institutional repositories, so librarians may have a sense that there is no pressing need to educate themselves on these issues. 
on the contrary, to remain relevant to the campus community, such a proactive service is necessary. acquiring expertise in sc issues will allow librarians to create a faculty-centric model of service as advocated by hahn ( ) and royster ( ). reviewing malenfant’s study of liaison work at the university of minnesota and incorporating liaison efforts across many research libraries that have developed following the university of minnesota initiative, kenney ( ) identifies six issues that will affect the liaison model into the future. one of these issues, assessment, emphasizes that liaison activities should be evaluated in terms of the effect they have on scholars, rather than simply by quantifying accomplishments. another issue has to do with redefining the liaison role to better suit the changing research landscape (kenney, ). a arl special report reviews major changes in liaison librarianship over the last several years and identifies emergent trends and recommendations (jaguszewski & williams, ). the authors conclude that “[n]ew roles in research services, digital humanities, teaching and learning, digital scholarship, user experience, and copyright and scholarly communication [emphasis added] are being developed at research libraries across the country, requiring professional development and re-skilling of current staff, [and] creative approaches to increase staff capacity” (jaguszewski & williams, , p. ). at academic institutions, librarians are recognizing the essentiality of sc support services. are faculty beginning to turn to librarians for assistance with these services?

faculty engagement and needs surveys

subject librarians, with their close ties to academic departments, are well-positioned to support the research needs of faculty by offering sc services (kenney, ; malenfant, ; neugebauer & murray, ; plutchak, ; thomas, ).
following an engagement-centric service model, liaison librarians support the research needs most valued by their departments and institution rather than those deemed most relevant by the library, although these indicators need not be divergent. in this way, liaisons are providing sc support in ways that are more meaningful for their faculty. sc support may be administered in many ways, including this three-tiered approach advocated by thomas ( ): supporting faculty in navigating publishing models and making their work open access, consulting with faculty about copyright transfer agreements and the fair use of copyrighted work, and enabling faculty to evaluate open-access publications and meet funder mandates. with the growth of institutional repositories and the steady migration of scholarship from print only to include a hybrid space of print and digital media, opportunities for a wide range of services and support have been created. academic librarians can assume these duties. given that this set of tasks and services is still emergent, there are few published studies that survey faculty about the kinds of digital content they create, and fewer still that ask faculty to identify their own and their students’ needs in the realm of digital scholarship. instead, some current literature focuses on the practices and preferences of faculty regarding their use of existing library resources and services. profera, jefferson, and hosburgh ( ) assess the use of physical and virtual library spaces to gauge the effect on faculty use of library resources and services. zhang ( ) examines the use of library services by engineering faculty at a large research university. other studies have surveyed faculty about specific digital practices such as scientific data collection and preservation. with their us faculty survey, ithaka s+r began querying faculty about the types of research data they collect (wolff, rod, & schonfeld, ).
this survey revealed that faculty want to manage their own data; however, a majority of faculty indicated that the provision of research support services by the library is highly important. toups and hughes ( ) probed faculty attitudes on and needs for data management to inform the development of data curation services at a small liberal arts university and found greater needs existed than were anticipated. data-support services have been developed in light of the findings from focus groups and interviews. scaramozzino, ramírez, and mcgaughey ( ) surveyed stem faculty at california polytechnic state university san luis obispo (cal poly) about their data collection behaviors and attitudes in three areas: data preservation, data sharing, and educational needs. the survey evidenced that these scholars have data management and educational needs with which the library can assist, although faculty did not currently view librarians as a source of expertise for this information. cal poly, like eiu, is primarily a teaching university, and faculty at these types of institutions have unique data curation needs. the eiu faculty survey, presented later in this article, will add further insight into this nascent body of literature about scholars’ digital scholarship service needs.

scholarly communication support as a core competency

the new association of college and research libraries (acrl) framework for information literacy for higher education, adopted by the acrl board in , incorporates sc in the scholarship as conversation frame, an update from the old information literacy competency standards for higher education, which had no such mention of sc (mullen, ). librarian engagement with promoting sc services is gradually expanding. however, familiarity with traditional library resources makes incorporating open-access resources, including the institutional repository, challenging for many reference and liaison librarians (mullen, ).
the evolution in methods of sc prompts scholar interest in support services and suggests the need for sc skills to become a core librarian competency (bailey, ; bonn, ; bresnahan & johnson, ; kenney, ; kirchner, ; neugebauer & murray, ; thomas, ; wirth, ). commenting on the role of librarians in the sc environment, wolf ( ) argues that librarians are often at the center of this relationship between researcher, funding agencies, and publishers. in june , the task force on librarians’ competencies in support of e-research and scholarly communication (a joint effort between the arl, canadian association of research libraries - carl, association of european research libraries - liber, and coalition of open access repositories - coar) published its “librarians’ competencies profile for scholarly communication and open access” (calarco, shearer, schmidt, & tate, ). the authors of this profile argue that the traditional role librarians have played in relation to open access and sc has expanded such that specific skills and competencies must be outlined to assist administrators seeking to employ highly qualified professionals (calarco et al., ). in the broader context of research support services, which include sc knowledge as well as data management and bibliometrics, research has shown “a near-universal support” to develop targeted lis education to better serve library professionals in this emerging service role (kennan et al., , p. ). steele ( ) notes the essential work of librarians with researchers and repositories and calls for librarian-led sc literacy programs for researchers. sc is increasingly being viewed as a central service that libraries can provide and in which librarians should be skilled.
the program: training the “scholarly communication coach”

background and context

institutional repositories (irs) continue to develop in academia, along with the expansion of discipline repositories such as arxiv, e-lis, social sciences research network (ssrn), english and american literature research network (from ssrn), and scholars’ commons like the digital commons network, research gate, mendeley and academia.edu. additional repositories are listed at the simmons college-sponsored open access directory, http://oad.simmons.edu. despite this growth, and perhaps because of the simultaneous development of discipline and scholar commons, irs are not seen as central to the research enterprise of the institution, remain unknown to many faculty, and are viewed by scholars as limited to the domain of the library, rather than being a component of their research life (creaser, ; cullen & chawner, ; dutta & paul, ; hahn & wyatt, ). eiu’s ir, the keep, was launched in . as more faculty began contributing their works to the ir, the ir librarian was fielding a growing number of faculty questions surrounding sc issues, such as authors’ rights, copyright, choosing a reputable publisher, journal embargoes, and participating in the ir. the head of reference saw this as an opportunity to expand reference services to meet faculty need. the dean of the library recognized the value of providing such services, so the ir librarian and head of reference developed a program to expand awareness among liaison librarians of the complex sc landscape. the faculty demand for assistance with sc-related questions exceeds the support that one knowledgeable ir librarian is able to provide. librarianship at eiu is collaborative, communication among librarians is often informal, and librarian skill sets are diverse to meet campus need. the liaison librarians represent all areas of operation in the library.
they share collection-development responsibilities and single or multiple department liaison duties. continuing to expand one’s skill set is an acknowledged part of the variety of liaison responsibilities. learning more about sc resources is not regarded by liaisons as a burden, yet feeling comfortable with the range of sc issues in one or more disciplines could be perceived as a challenge.

program planning

to begin organizing a sc training program at eiu, the ir librarian and head of reference searched the literature for any similar programs implemented at institutions analogous to eiu. they reviewed potential sc services and selected those most appropriate for the eiu faculty population to include in librarian training. for example, consultations with the campus research services office indicated that research data management (rdm) services were not in high demand, leading to the decision to limit training in rdm to basic services surrounding data management plans. where more extensive data services are required, the ir librarian will handle data management. the training program had as its ultimate goal the integration of library services into the sc and digital scholarship environment of each liaison’s department(s). the training program was to help the liaisons become “scholarly communication coaches” (brantley & bruns, ). we have defined scholarly communication coach as a subject liaison who has a working understanding of sc issues such that they can field most common concerns and provide basic consultative sc services (bruns et al., ).

course components

the sc coach program drew upon the structure of sc trainings developed at the universities of minnesota and british columbia, utilizing, respectively, the systems and environmental scan methods (kirchner, ; malenfant, ). with the environmental scan, the liaison analyzes the assigned department’s programs, such as student journals or undergraduate research fairs.
the liaison investigates the discipline’s stance on open access via scholarly societies and professional associations. the liaison also discovers the discipline’s preeminent sources of publication, such as the major journals and discipline repositories. additionally, the liaison collects information on faculty participation in the ir (thekeep.eiu.edu) and online scholars’ networks including academia.edu, research gate, and others previously mentioned.

for the second component of training, the ir librarian and head of reference developed a sc coach toolkit, which includes select sc resources to assist liaisons in their communication with faculty. the toolkit points to such sources as the dmptool and sherpa/romeo (for publisher copyright policies). it also includes resources to assess publisher quality, such as jeffrey beall’s list of potential, possible, or probable predatory scholarly open-access publishers (https://scholarlyoa.com/publishers/) and the directory of open access journals (https://doaj.org/) promoted by berger and cirasella ( ). these core tools help answer most questions posed by our faculty, including questions about data management and how to navigate publisher copyright agreements.

as a third component of sc coach training, liaisons attended a workshop on authors’ rights, intellectual property, creative commons licensing, and digital publishing. this session was designed to help librarians field questions about where to publish and which open-access journals or open repositories would most increase scholarly visibility and research impact. the session also aimed to enhance the liaison’s understanding of non-traditional ways of documenting research impact, such as with altmetrics.
desired outcomes and future directions

by designing and hosting a sc training program at eiu, the ir librarian and head of reference intended to enable liaison librarians to be well informed about the sc support services they may offer faculty in their subject domains. the sc coach role helps to personify this mission. learning about and applying these new skills will take an intentional investment of time. it will require active engagement by librarians with their subject faculty. librarian education on sc issues will involve periodic refresher sessions and assessment activities, such as the internal survey described below. assessment exercises will help determine the level of contact the librarian has had with faculty researchers over the course of a year. environmental scans of the state of sc within a discipline performed every to years will refresh librarians’ understanding of the ongoing needs of scholars.

the sc coach training was an introduction to the establishment of sc and digital scholarship support services at eiu. with the help of this instruction, the ongoing work of building sc services and engaging faculty in collaboration can grow. at a teaching institution like eiu, where research is important but not primary, demonstrating to faculty the benefits of seeking out liaisons for sc support is a central focus for the continued vitality of the library. when librarians focus on services relevant to scholars’ needs, not only are they able to position the ir as a valued university resource, they again find themselves critical partners in the research heart of the university.

liaison feedback on scholarly communication support: enthusiasm, indifference or reluctance?

to better understand the current practices among eiu liaison librarians in providing sc services, the authors conducted an internal survey. at eiu’s booth library, librarians serve academic departments.
the ir librarian, with expertise in sc, is also a subject liaison. the remaining liaisons also have responsibilities in reference, instruction, collection development, acquisitions, cataloging, circulation, technology support, and/or management and administration. sc is not a role formally acknowledged in these liaisons' assignment of duties.

six months after presenting the sc coach workshop, the authors surveyed the liaisons seeking to understand their involvement with sc in their relationships with their department faculty (survey questions may be found in appendix a). the ir librarian was excluded from the survey. twelve of responded to the survey. half of the liaisons attended the sc workshop ( of survey respondents). since the workshop, those librarians who participated in the training were more likely to report that their faculty had asked them about sc topics ( of , versus of of the liaisons who did not attend the workshop). workshop participants were more likely to have read about sc services or actively identified open-access resources to add to the library collection since the workshop ( of , versus of of the liaisons who did not attend the workshop).

drawing broadly generalizable conclusions from this survey is impossible due to the small sample size, but a few notable points emerge from our liaison feedback. librarians working with departments where there is more interest in sc topics (e.g., open-access publishing, copyright concerns, self-archiving in the ir) are more likely to participate in professional development related to sc, and they are more likely to seek out opportunities for self-directed learning on sc topics. the results also suggest that librarians who elect to attend in-house workshops are interested in broadening their awareness of sc issues and services.
there was no difference between librarians who attended the sc workshop and those who did not with regard to proactively engaging department faculty in conversations about sc topics ( of , and of , respectively). there was also no difference in how librarians viewed sc support being integrated into their assigned duties. for those who attended the workshop and for those who did not, of cited sc knowledge as being a part of their subject-specialist responsibilities. only two librarians believed sc services were beyond the scope of a liaison’s assigned duties.

most liaisons recognize sc support as part of their job without having it formally acknowledged in their assignment of duties, but some liaisons may be more willing to engage with sc issues when it is a formally acknowledged responsibility. despite the benefits for liaisons of discussing sc services with faculty, such as helping authors better understand their rights and striving to make research more openly available to the global community (kenney, ; mullen, ), if a department faculty member does not request the services, liaisons may not recognize the value.

beginning in the and ithaka s+r us faculty surveys, respondents were asked to rate the importance of the statement, “the library provides active support that helps to increase the productivity of my research and scholarship” (housewright, schonfeld, & wulfson, , p. ). in , % of respondents rated this statement as important. in , that number declined to approximately %. it should be noted that, compared to every other survey question, the fewest respondents considered this statement to be important (housewright et al., ). in the ithaka s+r us faculty survey, this statement, now referred to as “research support,” made slight gains in importance. just over % of faculty rated research support as highly important (wolff et al., ).
given the low level of faculty awareness and perceived importance of liaison services, it seems questionable at best to treat a lack of faculty requests as evidence of an absence of need.

in the eiu liaison survey, of respondents ( of who attended the workshop and of who did not) indicated a desire for more guidance or familiarity with sc services. a single workshop on issues surrounding sc is not enough to increase librarian familiarity with this topic. online resources like the sc coach toolkit (booth.eiu.edu/sccoach), additional workshops, and self-directed instruction will provide liaisons with an increased understanding of the importance of engaging with faculty on sc issues. additional training could include influential communication and leadership development (mcneil, ). establishing a stronger sense of community among liaison librarians helped with the development of sc services at the university of iowa (koffel, magarrell, raber, & thormodson, ). barton and waters ( ) note that faculty must hear about these library services at least seven times and through multiple contact points (e.g., print handouts, web pages, in-person engagement) before becoming open to utilizing them.

librarians, too, can be slow to adapt. the concept of shifting roles of the reference librarian to include skills in promoting and teaching about the ir was addressed in a special issue of reference services review (rockman, ). for libraries that are just beginning to develop an ir, this change in service emphasis may not yet be realized. at eiu, where the ir has been aggressively developed and promoted, liaison librarians are still learning how to incorporate sc as a primary skill set (bruns et al., ). there remains a need for greater emphasis on the importance of sc services, but it is unclear from whom in the library this message should be coming.
should the ir librarian, for whom these services are primary, be the champion, or should the subject liaison, who has the closest relationships with faculty in the academic departments, be the sc services advocate? the best answer may be institutionally dependent. regardless of the type of institution, the role must be adopted by each librarian. this will ensure that the liaison librarian’s duties move from passive, library-centric collection building toward an active, participatory, and collaborative role, engaged with faculty in their teaching and research activities (buehler & boateng, ; kenney, ).

the need: campus survey of the digital needs of faculty

in early , bepress, an open-access, scholarly-publishing services company and the producer of the digital commons repository software, surveyed the eiu faculty on their digital scholarship needs. the survey, for which eiu’s ir librarian was consulted in writing, asked about the types of digital products faculty (and their students) created in the course of their teaching and research (see appendix b). in addition, the survey asked faculty to rank the importance of each type of product to manage, organize, preserve, and share. these questions illuminated the fact that faculty are creating a wide variety of digital works, which go beyond the traditional published articles, book chapters, and monographs. documenting these other forms of faculty research productivity, such as data from literary and historical research, open educational resources like textbooks and web scripts related to geographical research, is encouraging for any university, but it is especially so at a comprehensive university that emphasizes undergraduate and master’s level teaching above faculty research. survey results indicate there is a potential role for librarian support to manage and promote these works.

approximately %, or members, of the university faculty at eiu took the survey.
of the respondents, % were from the humanities, % were from the social sciences, and % were from the sciences. only % of respondents did not indicate their discipline.

the first part of the survey asked faculty to select, from a list of options, all the types of digital scholarship products they create. the options were:

•   working papers and reports
•   published documents (articles, books, book chapters, conference proceedings)
•   historical and archival documents
•   multimedia (video, audio, image)
•   primary research materials, such as research data
•   creative works (art, photography, graphics, music compositions)
•   other (please specify)

faculty indicated that they most frequently create published documents (chosen by % of respondents) and working papers (selected by % of survey takers). while this response could have been expected of an active faculty, we did not anticipate the large number of projects being created outside this traditional realm of scholarship: % of respondents create multimedia works, and more than % create primary research materials, such as research data. notable among respondents who create primary research materials (data) is the relatively high number of humanities faculty ( %) and social sciences faculty ( %) who produce research data in need of management and preservation, in addition to the sciences faculty ( %). some respondents ( %) also produce creative works in the course of their research and scholarly activity. a small portion of faculty ( %) create historical and archival documents, with humanities scholars representing over half of this population.

next, faculty were asked to select the digital products they felt were the most important to manage, organize, preserve, and share. across the sciences, humanities, and social sciences, faculty were most likely to rank published documents as their most important digital products ( % of respondents).
more than % of respondents selected research data as most important.

the second part of the survey asked faculty to rank the importance of a range of scholarly activities. the range included the following options, which have been ordered by rank here from the survey results:

1.   organizing my digital scholarship
2.   managing and preserving research data
3.   increasing the citations/visibility of my work
4.   promoting my students' work
5.   publishing an online journal or conference
6.   measuring and demonstrating the impact of my work

the survey then asked respondents for which of these activities they require the most help. their answers shifted slightly, with a higher ranking going to “increasing the citations/visibility of my work”:

1.   organizing my digital scholarship
2.   increasing the citations/visibility of my work
3.   managing and preserving research data
4.   promoting my students' work
5.   publishing an online journal or conference
6.   measuring and demonstrating the impact of my work

these two ranking questions, in particular, elucidate faculty perceptions and preferences regarding the kinds of sc support librarians can provide. although faculty may not be explicitly asking for assistance from their liaison librarians, this survey provides demonstrable evidence of faculty need for these services. it becomes our job as librarians and liaisons to provide the outreach and information alerting the faculty to our ability and readiness to provide these sc services.

developing a core competency

the results of the eiu faculty survey add a powerful testament to the growing body of literature underscoring a continuing need in academia for the professional librarian skillset, as well as for the continued development of this skillset to support faculty research and publishing.
despite the flagging statistics of traditional library services evidenced in countless articles and surveys (e.g., association of research libraries, ; carlson, ; foster & gibbons, ; gayton, ), a coming “great age of librarians” may yet be on the horizon (plutchak, , p. ). as plutchak asserts, library services such as reference, collection development, and programming, as well as the library as place, may have declining value to researchers in the digital age. the physical library may be viewed as less essential to conducting research and therefore less esteemed. as digital scholarly content creation continues to proliferate, however, and the electronic avenues through which scholars access information grow ever more complex, the information landscape becomes increasingly difficult to navigate. this makes the librarian skillset more relevant and necessary than ever (plutchak, ).

to bridge the gap in academic services for faculty sc support, the authors call for a liaison-librarian core competency to be developed around sc issues. these skills need to be developed in lis education (corrall, kennan, & afzal, ; kennan et al., ). analysis of job advertisements predicts that, beyond a passing familiarity with the issues, sc is on the brink of being accepted as a core role of academic librarianship, alongside reference, instruction, and collection development (finlay et al., ). key areas of focus for both professional development and lis education are defined in the report initiated by coar, “librarians’ competencies profile for open access and scholarly communication” (calarco et al., ).

the development of a sc competency skillset for existing librarians has many useful examples, including the sc coach program implemented by the authors.
additional programs and examples stemming from have matured to a point where best practices for a particular institution can be selected and customized, and assessment of their effectiveness can be measured (bresnahan & johnson, ; kirchner, ; malenfant, ; wirth & chadwell, ). kenney ( ) is an example of such an assessment, which draws upon the work of malenfant ( ). acrl has also developed a sc toolkit that serves as an updated and ample guide to developing individual competency or library services (http://acrl.ala.org/scholcomm/). in may , the acrl science & technology section (sts) scholarly communications committee began the scholarly communications investigations series, an initiative whose aim is to share informational sc posts on the sts discussion list (sts-l).

with a wealth of resources available to help librarians at all stages of their career make the transition to sc coach, the moment is ripe for a transition to this new library service model. at research-intensive universities and teaching-centric institutions, librarian support is desirable and needed.

references

allen, b. m. ( ). all hype or real change: has the digital revolution changed scholarly communication? journal of library administration, ( ), – . doi: . /
armstrong, m. ( ). institutional repository management models that support faculty research dissemination. oclc systems & services, ( ), – . doi: . /oclc- - -
association of research libraries. ( ). service trends in arl libraries, - [graph]. washington, d.c.: author. retrieved from http://www.arl.org/storage/documents/service-trends.pdf
bailey, c. w. ( ). the role of reference librarians in institutional repositories. reference services review, ( ), – . doi: . /
barton, m. r., & waters, m. m. ( ). creating an institutional repository: leadirs workbook. cambridge, ma: mit libraries. retrieved from http://hdl.handle.net/ . /
beaubien, s., masselink, l., & tyron, j. ( ).
recasting the role of comprehensive university libraries: starting points for educating librarians on the issues of scholarly communication and institutional repositories. in d. m. mueller (ed.), pushing the edge: explore, engage, extend: proceedings of the fourteenth national conference of the association of college and research libraries, march - , , seattle, washington. chicago, il: association of college and research libraries. retrieved from http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/national/seattle/papers/ .pdf
berger, m., & cirasella, j. ( ). beyond beall’s list: better understanding predatory publishers. college & research libraries news, ( ), – . retrieved from http://crln.acrl.org/content/ / / .full
bladek, m. ( ). bibliometrics services and the academic library: meeting the emerging needs of the campus community. college & undergraduate libraries, ( – ), – . doi: . / . .
bonn, m. ( ). tooling up: scholarly communication education and training. college & research libraries news, ( ), – . retrieved from http://crln.acrl.org/content/ / / .full
brantley, s., & bruns, t. a. ( , march). scholarly communication coaches. presentation at the meeting of the illinois association of college and research libraries, oak brook, il. retrieved from http://thekeep.eiu.edu/lib_fac/ /
bresnahan, m. m., & johnson, a. m. ( ). assessing scholarly communication and research data training needs. reference services review, ( ), – . doi: . /rsr- - -
brown, m. ( ). is altmetrics an acceptable replacement for citation counts and the impact factor? the serials librarian, ( ), – . doi: . / x. .
bruns, t. a., knight-davis, s., corrigan, e. k., & brantley, s. ( ). it takes a library: growing a robust institutional repository in two years. college & undergraduate libraries, ( – ), – . doi: . / . .
bruns, t., brantley, s., & duffin, k. ( ). scholarly communication coaching: liaison librarians’ shifting roles.
in b. eden (ed.), partnerships and new roles in the st-century academic library: collaborating, embedding, and cross-training for the future (pp. – ). lanham, md: rowman & littlefield. retrieved from http://thekeep.eiu.edu/lib_fac/ /
bruns, t., & inefuku, h. ( ). purposeful metrics: matching institutional repository metrics to purpose and audience. in b. b. callicott, d. scherer, & a. wesolek (eds.), making institutional repositories work (pp. – ). west lafayette, in: purdue university press. retrieved from https://works.bepress.com/todd_bruns/ /
buehler, m. a., & boateng, a. ( ). the evolving impact of institutional repositories on reference librarians. reference services review, ( ), – . doi: . /
bull, j., & eden, b. l. ( ). successful scholarly communication at a small university: integration of education, services, and an institutional repository at valparaiso university. college & undergraduate libraries, ( – ), – . doi: . / . .
burns, c. s., lana, a., & budd, j. m. ( ). institutional repositories: exploration of costs and value. d-lib magazine, ( – ). doi: . /january -burns
busher, c., & kamotsky, i. ( ). stories and statistics from library-led publishing. learned publishing, ( ), – . doi: . /
calarco, p., shearer, k., schmidt, b., & tate, d. ( ). librarians’ competencies profile for scholarly communication and open access. retrieved from confederation of open access repositories website: https://www.coar-repositories.org/files/competencies-for-scholcomm-and-oa_june- .pdf
carlson, s. ( , november ). the deserted library. the chronicle of higher education, p. a . retrieved from http://chronicle.com
carpenter, t. a., lagace, n., & bahnmaier, s. ( ). developing standards for emerging forms of assessment: the niso altmetrics initiative. the serials librarian, ( – ), – . doi: . / x. .
clobridge, a. ( ). open access: progress, possibilities, and the changing scholarly communications ecosystem.
online searcher, ( ), – .
corrall, s., kennan, m. a., & afzal, w. ( ). bibliometrics and research data management services: emerging trends in library support for research. library trends, ( ), – . doi: . /lib. .
cox, a., verbaan, e., & sen, b. ( ). upskilling liaison librarians for research data management. ariadne, . retrieved from http://www.ariadne.ac.uk/issue /cox-et-al
creaser, c. ( ). open access to research outputs — institutional policies and researchers’ views: results from two complementary surveys. new review of academic librarianship, ( ), – . doi: . /
cullen, r., & chawner, b. ( ). institutional repositories: assessing their value to the academic community. performance measurement and metrics, ( ), – . doi: . /
dutta, g., & paul, d. ( ). awareness on institutional repositories-related issues by faculty of university of calcutta. desidoc journal of library & information technology, ( ), – . doi: . /djlit. .
finlay, c., tsou, a., & sugimoto, c. ( ). scholarly communication as a core competency: prevalence, activities, and concepts of scholarly communication librarianship as shown through job advertisements. journal of librarianship and scholarly communication, ( ), ep . doi: . / - .
foster, n. f., & gibbons, s. (eds.). ( ). studying students: the undergraduate research project at the university of rochester. chicago, il: association of college and research libraries.
gargouri, y., hajjem, c., larivière, v., gingras, y., carr, l., brody, t., & harnad, s. ( ). self-selected or mandated, open access increases citation impact for higher quality research. plos one, ( ), e . doi: . /journal.pone.
gayton, j. t. ( ). academic libraries: “social” or “communal?” the nature and future of academic libraries. journal of academic librarianship, ( ), – . doi: . /j.acalib. . .
gilman, i. ( - ). adjunct no more: promoting scholarly publishing as a core service of academic libraries. against the grain, ( ), – .
retrieved from http://commons.pacificu.edu/libfac/ /
gordon, g. j. ( ). strategic access. legal information management, ( ), – . doi: . /s
hahn, k. l. ( ). talk about talking about new models of scholarly communication. journal of electronic publishing, ( ). doi: . / . .
hahn, s. e., & wyatt, a. ( ). business faculty’s attitudes: open access, disciplinary repositories, and institutional repositories. journal of business & finance librarianship, ( ), – . doi: . / . .
harnad, s. ( ). gold open access publishing must not be allowed to retard the progress of green open access self-archiving. logos, ( – ), – . doi: . / x
housewright, r., schonfeld, r. c., & wulfson, k. ( ). ithaka s+r us faculty survey . new york, ny. doi: . /sr.
jaguszewski, j. m., & williams, k. ( ). new roles for new times: transforming liaison roles in research libraries. washington, dc: association of research libraries. retrieved from http://www.arl.org/component/content/article/ /
kennan, m. a., corrall, s., & afzal, w. ( ). “making space” in practice and education: research support services in academic libraries. library management, ( / ), – . doi: . /lm- - -
kenney, a. r. ( ). leveraging the liaison model: from defining st century research libraries to implementing st century research universities. new york, ny: ithaka s+r. retrieved from http://www.sr.ithaka.org/wp-content/mig/files/sr_briefingpaper_kenney_ .pdf
kenney, a. r. ( ). from engaging liaison librarians to engaging communities. college & research libraries, ( ), – . doi: . /crl. . .
kirchner, j. ( ). scholarly communications: planning for the integration of liaison librarian roles. research library issues, , – .
koffel, j., magarrell, k., raber, e., & thormodson, k. ( ). liaison connection: building a better community. college & research libraries news, ( ), – . retrieved from http://crln.acrl.org/content/ / / .full
konkiel, s., & scherer, d. ( ).
new opportunities for repositories in the age of altmetrics. bulletin of the american society for information science and technology, ( ), – . doi: . /bult. .
krier, l., & strasser, c. a. ( ). data management for libraries: a lita guide. chicago, il: ala techsource.
malenfant, k. j. ( ). leading change in the system of scholarly communication: a case study of engaging liaison librarians for outreach to faculty. college & research libraries, ( ), – . doi: . /crl. . .
mcintyre, g., chan, j., & gross, j. ( ). library as scholarly publishing partner: keys to success. journal of librarianship and scholarly communication, ( ), ep . doi: . / - .
mcneil, b. ( ). librarians and scholarly communication: outreach, advocacy, and leadership within the academic community. in w. c. welburn, j. welburn, & b. mcneil (eds.), advocacy, outreach, and the nation’s academic libraries: a call for action (pp. – ). chicago, il: association of college and research libraries.
mitchell, e. t. ( ). research support: the new mission for libraries. journal of web librarianship, ( ), – . doi: . / . .
mullen, l. b. ( , may-june). open access and the practice of academic librarianship: strategies and considerations for “front line” librarians. paper presented at the meeting of the international association of scientific and technological university libraries, warsaw, poland. retrieved from http://docs.lib.purdue.edu/iatul/ /papers/
neugebauer, t., & murray, a. ( ). the critical role of institutional services in open access advocacy. international journal of digital curation, ( ), – . doi: . /ijdc.v i .
newman, k. a., blecic, d. d., & armstrong, k. l. ( ). scholarly communication education initiatives (spec kit ). washington, dc: association of research libraries. retrieved from http://publications.arl.org/scholarly-communication-spec-kit- /
park, j.-h., & shim, j. ( ). exploring how library publishing services facilitate scholarly communication.
journal of scholarly publishing, ( ), – . doi: . /jsp. . .
patel, d. ( ). research data management: a conceptual framework. library review, ( / ), – . doi: . /lr- - -
pinfield, s. ( ). making open access work: the “state-of-the-art” in providing open access to scholarly literature. online information review, ( ), – . doi: . /oir- - -
pinfield, s., cox, a. m., & smith, j. ( ). research data management and libraries: relationships, activities, drivers and influences. plos one, ( ), e . doi: . /journal.pone.
plutchak, t. s. ( ). breaking the barriers of time and space: the dawning of the great age of librarians. journal of the medical library association, ( ), – . doi: . / - . . .
profera, e., jefferson, r., & hosburgh, n. ( ). personalizing library service to improve scholarly communication. the serials librarian, ( – ), – . doi: . / x. .
prottsman, m. f. ( ). communication and collaboration: collection development in challenging economic times. journal of electronic resources in medical libraries, ( ), – . doi: . / . .
quinn, m. m. ( ). open access in scholarly publishing: embracing principles and avoiding pitfalls. the serials librarian, ( ), – . doi: . / x. .
radom, r., feltner-reichert, m., & stringer-stanback, k. ( ). organization of scholarly communication services (spec kit ). washington, dc: association of research libraries. retrieved from http://publications.arl.org/organization-of-scholarly-communication-services-spec-kit- /
reinsfelder, t. l., & anderson, j. a. ( ). observations and perceptions of academic administrator influence on open access initiatives. the journal of academic librarianship, ( ), – . doi: . /j.acalib. . .
rockman, i. f. (ed.). ( ). reference librarians and institutional repositories [special issue]. reference services review, ( ).
rodriguez, j. e. ( ). scholarly communications competencies: open access training for librarians. new library world, ( / ), – . doi: .
/nlw- - -
royster, p. ( , june). the advice not taken: how one repository found its own path. presentation at open repositories , helsinki, finland. retrieved from http://www.doria.fi/handle/ /
scaramozzino, j. m., ramírez, m. l., & mcgaughey, k. j. ( ). a study of faculty data curation behaviors and attitudes at a teaching-centered university. college & research libraries, ( ), – . doi: . /crl-
schlangen, m. ( ). content, credibility, and readership: putting your institutional repository on the map. public services quarterly, ( ), – . doi: . / . .
steele, c. ( ). scholarly communication, scholarly publishing and university libraries. plus ça change? australian academic & research libraries, ( ), – . doi: . / . .
sterman, l. ( ). institutional repositories: an analysis of trends and a proposed collaborative future. college & undergraduate libraries, ( – ), – . doi: . / . .
suber, p. ( ). open access. elec, cambridge, ma: mit press.
swoger, b. j. m., brainard, s. a., & hoffman, k. d. ( ). foundations for a scholarly communications program: interviewing faculty at a small public liberal arts college. journal of librarianship and scholarly communication, ( ), ep . doi: . / - .
taylor, s. ( ). coming full circle: scholarly communication and the role of liaison librarians. in k. l. anderson (ed.), sustainability in a changing climate: proceedings of the th annual conference of the international association of aquatic and marine science libraries and information centers (iamslic) (pp. – ). newport, or: iamslic. retrieved from http://hdl.handle.net/ /
thomas, w. j. ( ). the structure of scholarly communications within academic libraries. serials review, ( ), – . doi: . / . .
toups, m., & hughes, m. ( ). when data curation isn’t: a redefinition for liberal arts universities. journal of library administration, ( ), – . doi: . / . .
turtle, e. c., & courtois, m. p. ( ). scholarly communication: science librarians as advocates for change.
issues in science & technology librarianship, . doi: . /f svh vinopal, j., & mccormick, m. ( ). supporting digital scholarship in research libraries: scalability and sustainability. journal of library administration, ( ), – . doi: . / . . wiegand, s. ( ). beginning the conversation: discussing scholarly communication. the serials librarian, ( – ), – . doi: . / x. . wirth, a. a. ( ). incorporating existing library partnerships into open access week events. collaborative librarianship, ( ), – . wirth, a. a., & chadwell, f. a. ( ). rights well: an authors’ rights workshop for librarians. portal: libraries and the academy, ( ), – . doi: . /pla. . librarians  in  transition       wolf, m. ( ). untying knots and joining dots: the role of librarians in the scholarly communication environment. insights, ( ), – . doi: . /uksg. wolff, c., rod, a. b., & schonfeld, r. c. ( ). ithaka s+r us faculty survey . new york, ny: ithaka s+r. doi: . /sr. xia, j., & li, y. ( ). changed responsibilities in scholarly communication services: an analysis of job descriptions. serials review, ( ), – . doi: . / . . zhang, l. ( ). use of library services by engineering faculty at mississippi state university, a large land grant institution. science & technology libraries, ( ), - . doi: . / x. . zhang, y., liu, s., & mathews, e. ( ). convergence of digital humanities and digital libraries. library management, ( – ), – . doi: . /lm- - - zhao, l. ( ). riding the wave of open access: providing library research support for scholarly publishing literacy. australian academic & research libraries, ( ), – . doi: . / . . librarians  in  transition       appendix a liaison librarian sc support services survey . did you attend the scholarly communication coach training workshop? o   yes o   no . since the workshop, have you engaged faculty of your assigned departments in conversation about open access, copyright concerns, the keep, or other scholarly-communication related topics? o   yes o   no . 
since the workshop, have faculty of your assigned departments asked you about open access, copyright concerns, the keep, or other scholarly-communication related topics? o   yes o   no . since the workshop, have you read about scholarly communication services or identified open- access resources to add to our collection? choose all that apply: o   yes o   no o   did not attend the workshop o   other (please specify) . where do you see scholarly communication services fitting within your assigned duties? o   reference o   subject specialist o   it is not part of my assigned duties. o   other (please specify) . what would make you more inclined to incorporate scholarly communication services into your liaison duties? . comments?   librarians  in  transition       appendix b faculty digital needs survey . as part of my research, i develop the following types of digital materials: o   working papers and reports o   published documents (articles, books, book chapters, conference proceedings) o   historical and archival documents o   multimedia (video, audio, image) o   primary research materials (such as research data) o   creative works (art, photography, graphics, music compositions) o   other (please specify) . which of the above research materials are the most important to manage/organize/preserve/share? why? . as part of my teaching, i develop the following types of digital materials o   multimedia (video, audio, image) o   open educational resources (textbooks, syllabi, course materials) o   historical and archival documents o   primary research materials (such as research data) o   creative works (art, photography, graphics, music composition) o   other (please specify) . which of the above teaching materials are the most important to manage/organize/preserve/share? why? . 
my students develop the following types of digital materials: o   electronic theses and dissertations o   capstone projects o   multimedia (video, audio, image) o   historical and archival documents o   primary research materials (such as research data) o   creative works (art, photography, graphics, music composition) o   other (please specify) . which of the above student projects are the most important for you or your students to manage/organize/preserve/share? why? . what is the most important to you? (rank in order) o   organizing my digital scholarship o   promoting my students' work o   increasing the citations/visibility of my work o   managing and preserving research data o   publishing an online journal or conference librarians  in  transition       o   measuring and demonstrating the impact of my work o   other (please specify) . where do you need the most help? (rank in order) o   organizing my digital scholarship o   promoting my students' work o   increasing the citations/visibility of my work o   managing and preserving research data o   publishing an online journal or conference o   measuring and demonstrating the impact of my work o   other (please specify) . do you have any specific projects that would benefit from digital support services? please describe. 
eastern illinois university from the selectedworks of steve brantley fall september , librarians in transition: scholarly communication as a core competency
digital scholarship and writing sprints: an academic author perspective
insights – ( ), march
claire taylor, professor in hispanic studies, university of liverpool, uk
during academic book week ( – november ) academics within the university of liverpool, in conjunction with liverpool university press, held a writing sprint focused around modern languages (ml), one of the major research areas within the university and one of the key areas of publishing within the press. the writing sprint brought together experts around some fundamental questions in ml research, and resulted in a collaborative-authored piece at the end of the week. this article explores the inspirations behind the sprint, describes the methodology and research questions, and finally discusses the advantages and challenges of undertaking such an activity.
inspiration
three inspirations were behind the writing sprint. the first was my own research area, as a researcher in latin american culture specializing in digital culture. my background, and my original training as a doctoral student, was in the analysis of conventional print-based texts, but the direction in which my research has taken me means that i have been exploring what happens when the print medium meets digital technologies. if my research, and that of my colleagues, has led us to think about the changes in textual practice in the contemporary era, so, too, the writing sprint as a format invites us to rethink our textual practice as researchers.
the second inspiration was the realization that the authors and artists that we research publish in far more varied and innovative forms than we ourselves do. for instance, in my research i have analysed twitter poetry, hypertext novels, net art, blog short stories and quite a number of other varied genres. there are many examples that i could give here, such as radikal karaoke by the argentine author belén gache, which is classified by the author as a ‘collection of poetry’ but whose interface is that of a karaoke machine, and which includes sound, moving image and colour effects as much as text. this collection of poetry involves significant input from the reader, who actually becomes the co-author and the co-voicer of the work. or, the hypertext novels by colombian author jaime alejandro rodríguez, such as gabriella infinita and golpe de gracia, both of which subvert the conventions of the print text and include multiple pathways through the story, as well as including multimedia formats such as audio, still images and moving images, text and video game interaction. or, yet another example could be argentine author and artist marina zerbarini’s eveline: fragmentos de una respuesta, which is a hypertext short story with a deliberately complex interface that refuses to follow the linear format which would conventionally underpin the short story. in all of these cases, and many more, the objects of study that i have considered do not actually ‘look like’ a book; they are not bound into a volume, they do not appear on a printed page, they do not have a linear structure, they do not respect the conventional delineation between author and reader, and so forth. all of these objects of study raise fundamental questions about text, authorship, the role of the reader and related issues that have been analysed in depth by the likes of janet h murray, n katherine hayles and many others.
thus the question arises that, if these are the objects of study, why do we continue to use the static format of print books or journal articles when we analyse them?
finally, the third inspiration behind the writing sprint was the recent venture by liverpool university press (lup) into open access (oa) publishing with its modern languages open (mlo) initiative, launched in . mlo is a peer-reviewed online platform for the oa publication of research within modern languages (ml) to a global audience. since mlo has the aim of facilitating interdisciplinarity, as well as promoting open access, i and other colleagues at the university of liverpool were keen to explore how this platform might help enable a more collaborative, multi-authored piece of work.
aims
in undertaking this writing sprint we had five main aims: facilitating collaboration; encouraging new ways of thinking about academic writing; engaging in reflective practice; rethinking peer review; and using our emerging digital scholarship to transform our writing practice.
firstly, we were aiming to achieve collaboration in the writing process. we wanted to try to create an academic piece that would no longer be a single-authored piece, but instead we wanted to enable real collaboration throughout the entire writing process. secondly, we wanted to encourage new ways of thinking about academic writing, in terms of its style and ‘voice,’ something which was enabled by the blog format (about which more below). the third aim was to be able to reflect on the practice as much as the content. with the traditional form of academic writing, the accepted process is for an author to submit his/her article or book when it is finished, and then it is subsequently published. what we wanted to do in the writing sprint was actually see the writing process itself as it happened, and make that writing process part of the research question, as much as the finished product.
fourthly, we also envisaged the writing sprint as an interesting way of rethinking the peer-review process, because contributors would be writing in a highly visible way (in real time, on a blog), with the various respondents who were nuancing and shaping the thoughts also doing so in a visible format that was open to public view. this entailed a rethinking of the conventional mode of peer review which is still, in the main, an anonymous process. finally, we wanted to make use of digital transformations in our writing process, exploring how digital tools (such as the blog platform) can help us rethink our practice as we are in the actual process of writing.
developing a methodology
thus inspired, i collaborated with a colleague in liverpool, niamh thornton, who is reader in latin american studies, to put together a writing sprint that would take place over the course of a week to coincide with academic book week. we focused the sprint around ml as a discipline, and specifically on how it engages with the digital in multiple ways. we commissioned several pieces of words each from experts in their field, and we appointed a broader group of respondents who were invited to dialogue with each piece, nuance it and shape the debate. all the participants then responded to the main question and, by the end of the week, a final piece would emerge for publication on lup’s mlo platform.
digital as theme
the main theme for the writing sprint was ‘modern languages and the digital: the shape of the discipline’. within this theme, we asked contributors to consider how digital technologies are changing the shape of ml research and publishing, and how the conceptual, methodological and practical bases of ml research are having to adapt to the challenges of the digital. we also asked how our encounter with the digital transforms our work as modern linguists, both in terms of our practice and in terms of our understanding of what ml is.
finally, we also asked contributors to think about how the digital might be central to the (re)conceptualization of ml as a trans-disciplinary enterprise, and how modern languages have a transformative effect at the cutting edge of digital humanities.
exploring digital scholarship in modern languages
six additional questions fed into the main theme over the course of the sprint. the first – ‘(big?) data and ml’ – focused on how the ever-increasing volumes of data that are available to us as researchers are changing the way in which we engage in ml research. we asked contributors to explore which data tools and concepts are helpful to us, not just in an instrumental sense of how we undertake our research, but also in a more conceptual sense of how we understand what ml is. we also asked how tools such as crowdsourcing might help generate audience engagement in, and increase the public understanding of, ml.
the second question – ‘ml and digital archives’ – started from the premise that technology has allowed us to gather material and share it with a wider community. we asked what technologies can be used to make archiving possible and lasting, and asked contributors to consider whether, if we work online and create spaces, we become archivists. if so, we asked what the ethical issues arising from this might be, and whether the digital archive is an act of recovery or curation.
the third question – ‘ml: the digital as object of study’ – looked at how digital technologies have caused us to rethink existing literary and cultural formats, and how new platforms have transformed our understanding of what a ‘text’ is. we asked contributors to share their experiences of new cultural forms that are being developed at the interface between literary-cultural expression and new media technologies.
we also asked contributors to explore what existing rich cultural, literary and artistic heritage (going well beyond the anglophone) such works build on. and we invited them to think how these new forms might force us to rethink the (implicit) nation-state assumptions that conventionally underpin ml practice. question four – ‘ml and digital ethnography’ – explored how ml has changed its methodological approach when analysing digital practices online. this question asked contributors to consider how ml research into the digital might be as much about practices as about texts. we also asked contributors to consider what we learn from ethnography, and what the boundaries are between digital ethnography and textual analysis. question five focused on the issue of ‘users and interfaces,’ and started from the premise that digital writing and publishing not only has to take into account readers as end users, but also has to recognize their potential in an open and dynamic dialogue. we asked how we should tap into the potential for readers to respond, improve upon, and change the process of publishing and editing as interfaces and platforms develop. we also asked what platforms we need to make reader engagement possible. finally, question six looked at ‘ml as research and process.’ traditional academia discourages sharing of process and encourages researchers to share a final finessed piece. digital spaces, by contrast, allow us to reveal, share and upend this by showing the tools, materials and infrastructure of our study. we asked contributors to consider in what ways this has changed how we think about the end result of our research, and what the benefits and pitfalls of this laying bare might be. we asked contributors to consider whether this fundamentally changes our research in itself. 
the sprint process as it happened
over the course of the week of the book sprint, ten contributors wrote and published individual blog posts, containing their reflections and responses to the questions. the length of each individual entry varied, with most main entries comprising around words, and some shorter reflections arising spontaneously as the week went along. at the end of the week, these entries made up a collaborative piece totalling just under , words.
the response to the questions, and the subsequent reflections and dialogues that arose, provided some illuminating perspectives on the key issues pertaining to ml and its negotiation with the digital. from the arguments developed by kirsty hooper on the necessity for modern linguists to engage with what may at first glance be an unfamiliar scenario of big data, through to tori holme’s thought-provoking musings on how an engagement with digital culture must entail both an understanding of the origins and principles of ethnography, as well as an awareness of how ethnography itself is being changed and challenged by digital technologies, all the contributions encouraged us to think beyond our conventional disciplinary boundaries and to evaluate our practice. to view these and all the other rich contributions, see our writing sprint blog and the mlo platform where the sprint will then be published in spring .
challenges
the challenges we encountered concerned the visibility of the writing process, establishing an authorial voice, working under significant time pressure, adapting a technical solution and working collaboratively. how best to reassure contributors, who might have been potentially daunted by the total visibility of the writing process, presented one challenge.
we attempted to address this by setting out clear guidelines, as well as trying to generate a collaborative spirit amongst the contributors. the challenge of establishing the authorial voice also required a new approach: as the process unfolded, we realized that each contributor wrote in his/her own voice, employing differing tones and styles. we concluded that, instead of aiming to achieve a consistent authorial voice (as one would do with a single-authored piece, or even a conventional joint-authored article, say), we had to allow for multiple authorial voices and styles to emerge. writing within a five-day time period to coincide with academic book week was also a challenge. the timescale was certainly different from that experienced by most academics when writing an academic piece, so it was important to engage in forward planning and prompting in order to ensure it all ran to time. our technical challenge was to adapt the wordpress blogging platform to the needs of the writing sprint, and to try to create an interface which looked as dynamic as possible, and where the dialogues between contributors were as visible as possible. we were not, perhaps, able to make the dynamism of the exercise quite as visible on wordpress as we would have liked, given the limitations of the platform, and this is an area for further development when we next engage in such an activity. what might be thought to be a challenge, however – the collaborative aspect – was in fact very positive, and getting authors to work together turned out to be a smooth process.
conclusion
the sprint proved effective in bringing academics out of their silos and working collaboratively across geographical distance by virtually connecting colleagues at various institutions in the uk and worldwide, and across different academic disciplines and departments.
the sprint also provided opportunities for reflection as part of the process, since it was an iterative practice that developed over the course of the week, allowing all contributors to reflect as the piece took shape. we were able to record the process as much as the end result, something that almost never happens with a traditional book chapter or article. at the end of the week, we had not only the finished piece, but also the record of how we had got there, which was enlightening in itself. and finally, we achieved a much richer output than a single-authored piece at the end, since the input from experts in different but related fields meant that new light was shed upon some of the key concerns we are all grappling with. it is almost certain that a single-authored piece would not have achieved this same richness, since no one person combines all the different skills and expertise that our writing sprint participants brought collectively: in this way, by working collaboratively, we were able to come up with a much more rounded, much more profound piece of work than had each one of us written our own individual piece.
in conclusion, the writing sprint proved a valuable and productive process, allowing us to explore responses to the key questions established, and to address our main aims. although the notion of the writing sprint implies spontaneity, our experience was that, whilst the sprint was productive and spontaneous during the week of writing itself, planning in advance of the week was necessary. provided that advance planning is undertaken, and that contributors are selected and briefed in plenty of time, a writing sprint can be an exciting way to enable collaboration and to encourage authors to think about new ways of presenting their research.
abbreviations and acronyms
a list of the abbreviations and acronyms used in this and other insights articles can be accessed here – click on the url below and then select the ‘abbreviations and acronyms’ link at the top of the page it directs you to: http://www.uksg.org/publications#aa
competing interests
the author has declared no competing interests.
references
1. gache, b, radikal karaoke, : http://belengache.net/rk/ (accessed january ).
2. rodríguez, j a, gabriella infinita, : http://www.javeriana.edu.co/gabriella_infinita/principal.htm (accessed january ).
3. rodríguez, j a, golpe de gracia, : http://collection.eliterature.org/ /works/rodriguez_golpe_de_gracia/ (accessed january ).
4. zerbarini, m, eveline: fragmentos de una respuesta, : http://marina-zerbarini.com.ar/evy/ (accessed january ).
5. murray, j h, hamlet on the holodeck: the future of narrative in cyberspace, , cambridge, mit.
6. hayles, n k, electronic literature: new horizons for the literary, , notre dame, indiana, university of notre dame press.
7. modlangdigital: the modern languages open writing sprint, : https://modernlangdigital.wordpress.com/ (accessed february ).
8. modern languages open: http://www.modernlanguagesopen.org/ (accessed february ).
article copyright: © claire taylor. this is an open access article distributed under the terms of the creative commons attribution licence, which permits unrestricted use and distribution provided the original author and source are credited.
claire taylor, professor in hispanic studies, department of modern languages and cultures, cypress building, university of liverpool, chatham street, liverpool l zr, uk. e-mail: c.l.taylor@liverpool.ac.uk orcid id: http://orcid.org/ - - -
to cite this article: taylor, c, digital scholarship and writing sprints: an academic author perspective, insights, , ( ), – ; doi: http://dx.doi.org/ . /uksg.
published by uksg in association with ubiquity press on march
devil in the digital: ambivalent results in an object‐based teaching course
doi: . /muan.
@article{turin devilit, title={devil in the digital: ambivalent results in an object‐based teaching course}, author={m. turin}, journal={museum anthropology}, year={ }, volume={ }, pages={ - } }
in , i piloted a course in which students used web-based tools to explore underdocumented collections of himalayan materials at yale university. through class-based research and contextualization, i set students the goal of augmenting existing metadata and designing media-rich, virtual tours of the collections that could be incorporated into the sparse catalogue holdings held within the library system. the process was experimental and had mixed results, as this article documents. the class…
citations
big data, bad metadata: a methodological note on the importance of good metadata in the age of digital history. kimmo elo. computer science.
references
engaging with pasts in the present: curators, communities, and exhibition practice. m. k. scott. history.
salvaging the records of salvage ethnography: the story of the digital himalaya project. m. turin. engineering.
from sage on the stage to guide on the side. a. king. psychology.
the cultural biography of objects. c. gosden, y. marshall. art.
himalayan exhibit unites regional artifacts. yale daily news.
cultural property: a contribution to the debate.
learn more → resources datasetssupp.aiapiopen corpus organization about usresearchpublishing partnersdata partners   faqcontact proudly built by ai with the help of our collaborators terms of service•privacy policy the allen institute for ai by clicking accept or continuing to use the site, you agree to the terms outlined in our privacy policy, terms of service, and dataset license accept & continue [pdf] creating, curating, and sharing online faculty development resources: the medical education in cases series experience. | semantic scholar skip to search formskip to main content> semantic scholar's logo search sign increate free account you are currently offline. some features of the site may not work correctly. doi: . /acm. corpus id: creating, curating, and sharing online faculty development resources: the medical education in cases series experience. @article{chan creatingca, title={creating, curating, and sharing online faculty development resources: the medical education in cases series experience.}, author={t. chan and brent thoma and m. lin}, journal={academic medicine : journal of the association of american medical colleges}, year={ }, volume={ }, pages={ - } } t. chan, brent thoma, m. lin published medicine academic medicine : journal of the association of american medical colleges problem it is difficult to engage clinicians in continuing medical education that does not focus on clinical expertise. evolving online technologies (e.g., massive open online courses [moocs]) are disrupting and transforming medical education, but few online nonclinical professional development resources exist. 
approach in august , the academic life in emergency medicine web site launched the medical education in cases (medic) series to engage clinicians in an online professional… expand view on pubmed links.lww.com save to library create alert cite launch research feed share this paper citationsbackground citations methods citations view all topics from this paper education, medical scientific publication education, medical, continuing citations citation type citation type all types cites results cites methods cites background has pdf publication type author more filters more filters filters sort by relevance sort by most influenced papers sort by citation count sort by recency crowdsourced curriculum development for online medical education e. shappell, t. chan, + authors j. ahn medicine cureus pdf save alert research feed social media in knowledge translation and education for physicians and trainees: a scoping review t. chan, k. dzara, sara paradise dimeo, anuja bhalerao, l. maggio psychology, medicine perspectives on medical education pdf save alert research feed building a global, online community of practice: the openpediatrics world shared practices video series. t. wolbrink, n. kissoon, n. mirza, j. burns sociology, medicine academic medicine : journal of the association of american medical colleges save alert research feed social-media-enabled learning in emergency medicine: a case study of the growth, engagement and impact of a free open access medical education blog s. carley, iain beardsell, + authors r. body medicine postgraduate medical journal save alert research feed a quantitative study on anonymity and professionalism within an online free open access medical education community d. dimitri, a. gubert, amanda b miller, brent thoma, t. chan medicine cureus pdf save alert research feed creating a virtual journal club: a community of practice using multiple social media strategies. michelle s. lin, j. 
sherbino computer science, medicine journal of graduate medical education pdf save alert research feed communication, learning and assessment: exploring the dimensions of the digital learning environment brent thoma, alison turnquist, f. zaver, a. k. hall, t. chan computer science, medicine medical teacher view excerpt, cites background save alert research feed education becomes social: the intersection of social media and medical education. r. madanick sociology, medicine gastroenterology save alert research feed large-scale online education programmes and their potential to effect change in behaviour and practice of health and social care professionals: a rapid systematic review a. zubala, kacper lyszkiewicz, e. lee, laura l. underwood, mary j. renfrew, n. gray psychology, computer science interact. learn. environ. pdf view excerpt, cites background save alert research feed professional identity (trans)formation in medical education: reflection, relationship, resilience. h. wald psychology, medicine academic medicine : journal of the association of american medical colleges pdf save alert research feed ... ... references showing - of references online learning for faculty development: a review of the literature d. cook, y. steinert medicine medical teacher save alert research feed medical education reimagined: a call to action. c. prober, s. khan psychology, medicine academic medicine : journal of the association of american medical colleges pdf save alert research feed free open access meducation (foam): the rise of emergency medicine and critical care blogs and podcasts ( – ) mike d cadogan, brent thoma, t. chan, m. lin medicine emergency medicine journal save alert research feed why it’s so hard to measure online readership the atlantic. february what is the theory that underpins our moocs? [blog post elearnspace. june what is the theory that underpins our moocs ? [ blog post ] . elearnspace . june , what is the theory that underpins our moocs? 
gemms (gateway to early modern manuscript sermons – ): confronting the challenges of sermons research

how to cite: james, anne, and jeanne shami. . “gemms (gateway to early modern manuscript sermons – ): confronting the challenges of sermons research.” digital studies/le champ numérique ( ): , pp. – . doi: https://doi.org/ . /dscn.

published: december

peer review: this is a peer-reviewed article in digital studies/le champ numérique, a journal published by the open library of humanities.

copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.

open access: digital studies/le champ numérique is a peer-reviewed open access journal.

digital preservation: the open library of humanities and all its journals are digitally preserved in the clockss scholarly archive service.

anne james and jeanne shami
university of regina, ca
corresponding author: anne james (anne.james@uregina.ca)

this essay describes the first phase of the gemms project ( – ), the purpose of which is to increase access to early modern english manuscript sermons and sermon notes ( – ) by creating a freely accessible online bibliographic database of records for these materials. it reviews the need for this resource, the current state of its development, and the challenges and successes to date as well as proposed developments in its next phase.

keywords: sermons; manuscripts; bibliographic databases; sermon notes; renaissance

cet article décrit la première phase du projet gemms ( – ), dont l’objectif est de faciliter l’accès à des manuscrits et notes des sermons en anglais moderne naissant ( – ) en créant une base de données bibliographique en ligne et en libre accès des archives de ces documents. cet article examine le besoin de cette ressource, l’état actuel de son développement, les défis et réussites jusqu’à présent, ainsi que les développements proposés pour la prochaine phase du projet.

mots-clés: sermons; manuscrits; base de données bibliographique; notes de sermons; renaissance

james and shami: gemms (gateway to early modern manuscript sermons – ) art. , page of

introduction

the gemms project responds to a need among early modern researchers for more effective and efficient access to records of manuscript sermons scattered among numerous archives in the united kingdom and north america.
while the project’s primary goal is a freely accessible, fully searchable bibliographic database, an important secondary goal is creating a community of sermon scholars who will become contributors to as well as users of this data. this essay presents the rationale for the project and discusses three challenges it confronts due to the circumstances in which these materials were created and have been collected: the vast number and geographic dispersal of the manuscripts; the inadequacies of existing bibliographic tools for locating them; and the lack of taxonomies for describing their contents. a summary of these challenges, progress to date, and future plans comprises the remainder of the essay. the research context: problems of access sermons are an important resource for early modern scholars in many academic disciplines. early twentieth-century interest was largely confined to literary scholars who prized the sermons of well-known preachers such as john donne and lancelot andrewes for their rhetorical qualities, but who ignored the vast majority of undistinguished performances (shami , – ). later in the century, however, interest developed among a wider range of scholars for numerous reasons: increased popularity of literary and historical methodologies that expanded their research from a smaller group of elite texts to larger groups of more popular texts; easier access to a greater range of early modern printed texts through projects such as early english books online (eebo) that decrease the necessity for visiting specialized archival collections; and increased interest in oral culture, its relationship to print culture, and the circulation and reception of both print and manuscript texts. 
defined by contemporary john deios as “the expounding of scripture and applying of it to the present state, by the working of gods spirit in the mouth of a man called for that purpose” (deios , ; morrissey , ), the sermon functioned as an instrument of god, conveying saving grace to instruct, move, and convert. as lay access to scripture in the vernacular increased, preaching by appropriately trained and licensed ministers also offered the church a means of guiding and directing individual interpretation within a communal setting. as oral performances, sermons came to function as instruments of the state as well as the church by enabling public engagement on controversial religious and political topics, but even sermons conspicuously neutral, uninspired, or mundane are best understood as “radically occasional pieces of performed writing, contingent upon the contexts [including place and auditory] in and for which they were delivered” (mccullough , ). these characteristics make them important sources of information for political, religious, and social historians, as well as theologians, church historians, and those interested in the histories of ideas, the book, performance, and women. among non-academics, genealogists and local historians also rely upon sermon evidence. the reformation made sermons the dominant cultural form of early modern english literature by encouraging the proliferation and development of sermons – the most prominent of the preacher’s ministerial duties – as part of the liturgy and as free-standing events, grafting these discourses onto a robust medieval tradition of catholic preaching and sermon attendance (carlson , ; shami , – ; wabuda , – ; wooding , , ).
based upon the numbers of parishes and preaching occasions – which expanded to include not only traditional sunday services, but also regular weekday lectures or combination lectures across the kingdom, and sermons on special public and political occasions – godfrey davies calculated that at least , sermons would have been preached in england and wales between and alone (davies , ). however, edith klotz’s sampling of the short title catalogue (stc) suggested that only about , of these sermons survive in print (shami , ). while klotz’s estimate may be low, gemms research supports her conclusion: of the , sermons currently contained in the gemms database, we have so far identified fewer than that were printed. the reasons for this dearth of printed sermon records likely range from preachers’ disinclination or lack of time to prepare their sermons for print, to lack of demand for print copies (which were considered less efficacious than the spoken word in the early part of the period) and the sheer volume of sermons preached even by a single preacher in his career. moreover, those that were printed constitute an unrepresentative sample. the contingencies of history have determined that sermons by well-known preachers, those preached in larger centres, those preached on special occasions or to prominent persons, and those that stirred controversy are more likely to survive in print. in contrast, sermons preached by marginalized preachers, including catholic priests, conforming clergy during the interregnum and nonconformists after , and quaker women (whose sermons do not survive, although we have evidence that they were preached) (shami , – ), were unlikely to be authorized for print. scholars have also become interested in traces of sermons that are entirely absent from the print record, such as notes and outlines by preachers, auditors, and readers (hunt ).
in other words, not only are many early modern sermons inaccessible to researchers through sources such as eebo, but also relying entirely or primarily upon printed sermons may lead to limited or flawed conclusions. consequently, researchers, beginning with historians, began increasingly looking to manuscript sermons to provide them with a more accurate understanding of sermon culture (cogswell , ; eales ; hughes ; lake ; walsham ). such manuscripts exist in abundance in numerous libraries and archives but have been difficult to access systematically for three reasons: the materials are widely dispersed; they have seldom received full cataloguing by repositories; and the traces of sermons they record are enormously varied. gemms seeks to improve access to these materials by confronting these difficulties. geographic dispersal manuscripts frequently circulated within communities linked by their doctrinal or other affiliations, but sometimes widely separated geographically, and ministers preached in multiple locations during their careers. some left england to seek religious freedom in america, resulting in their sermons being archived on different continents. later acquisitions by antiquarians and collectors, and eventual dispersal to purchasing libraries, resulted in yet more scattering of manuscripts. for example, traces of sermons by the prolific late-seventeenth-century nonconformist preacher philip henry may be found in at least six libraries throughout england, scotland, and wales (and some remain in private hands). recently, anne james made the serendipitous discovery that the handwriting in a volume of unattributed auditor’s notes in the national library of scotland matched that of seven volumes in the british library attributed to john hall of yorkshire (nls ms. ; bl mss. – ).
how this volume became separated from the others remains unknown (and several other volumes likely remain unaccounted for). consequently, users are required to consult multiple archives to search for sermons by specific preachers or preached in specific locations or on particular texts or occasions, with little ability to predict where they might be housed, or even if they exist. inadequate bibliographic access the bibliographic tools upon which sermon scholars have traditionally relied are the catalogues, online and print, of libraries and archives. many of these catalogues were produced in the nineteenth and early twentieth centuries when sermons by ordinary parish ministers were considered to be of little interest to researchers. these institutions also purchased quantities of manuscripts containing sermons, which were often available cheaply during the last half-century, but as they were consulted relatively infrequently in most archives, limited staffing and budgets prevented full cataloguing even of complete sermons. in the catalogues of large repositories, individual sermons were seldom itemized, unless they were known to be preached by well-known preachers. even works by such preachers could escape attribution: john donne’s gunpowder plot sermon on lamentations : , corrected in his own hand, remained unattributed within british library ms. royal .b.xx until , when jeanne shami recognized donne’s handwriting (donne ). sermon notes by unknown preachers and auditors are even less likely to have received individual attention, while sermons and sermon notes that appear in miscellanies and commonplace books are seldom described. consequently, thousands of volumes have been briefly described in finding aids as ‘a volume of sermons’ or simply ‘sermon notes,’ often followed by a very broad range of possible dates (morrissey ). 
smaller and more specialized repositories often rely on printed catalogues that may only be consulted on-site to supplement online catalogues. for example, the congregational library (housed at dr. williams’s library) possesses a painstakingly detailed multi-volume catalogue compiled by librarian david powell that exists only in a single typewritten copy housed in the archive. many such small or private libraries lack the resources to fully catalogue their manuscript collections. helen kemp, at the university of essex, has recently itemized manuscript sermons in dr. plume’s library (to be accessible through gemms), a private collection where full cataloguing had not previously been possible. problems of classification given the increasing recognition of the importance of sermons in early modern culture and the growing reliance of scholars on sermons, a lack of appropriate and widely accepted taxonomies for describing these materials has become a serious problem. these include both taxonomies of the various religious positions expressed in sermons by preachers as well as the classification of the various traces left by the sermons. earlier twentieth-century scholarship placed sermons within the history of english prose style, leading to outmoded taxonomies that obscured the genre’s complexity and bequeathed inaccurate labels (such as the anachronistic “anglo-catholic”) to cover widely divergent sermon practitioners (ferrell and mccullough , ; blench ; mitchell ; davies ).
revisionist historiography has also challenged simplistic labels and distinctions (between anglican and puritan preachers, for example) by proposing a more nuanced spectrum of mainstream protestant positions (puritans, conformist calvinists, anti-calvinists, laudians) and sensitivity to nonconforming sects, including quakers, nonconformists, independents, baptists, and dissenters later in the century (green ; green ; haigh ; lake ; lake ; milton ; spurr ; tyacke ; tyacke ). that preaching spectrum also needs to be enlarged to include catholic preachers whose influence among important catholic families over our period, as michael questier highlights, proved “quite out of proportion to their numbers and even to the material resources of their patrons” (questier , ). moreover, the performative nature of sermons means that their traces survive in a wide variety of forms: complete sermons written by preachers; scribal copies; preachers’ pulpit notes; notes by listeners and readers of sermons in varying degrees of detail; outlines by preachers, listeners, and readers; letters and other documents containing accounts of sermons; and lists of sermons. because earlier scholars concentrated primarily upon complete sermons, little effort has been made, either by cataloguers or by researchers, to define the characteristics that distinguish the surviving forms from each other. in practice, many of these distinctions are unclear. gemms project: purpose and scholarly context sermon scholars have long been aware of these limitations and their impact on both the quality and quantity of research on manuscript sermons and on notetaking at sermons (morrissey , – ). while many create private lists or databases that reflect their own research programs, few have the time and opportunity to perform a comprehensive or systematic search of the archives.
a few american scholars have attempted to provide public access and establish more collaborative approaches to scholarship; however, their successes have been limited. the earliest of these projects is southern manuscript sermons before (lofaro ), essentially a bibliography of manuscript sermons from the five southern colonies/states (maryland, virginia, north and south carolina, and georgia) during the colonial period. begun by richard beale davis in , it was expanded in scope and completed by michael a. lofaro (university of tennessee, knoxville), who took over the work in . the project’s stated goal was to “help scholars to construct a more complete picture of the nature of the southern mind before and reveal how it contributes to a national ethos” (lofaro , xiii), and more specifically to enable comparisons between these sermons and those preached in the northern colonies/states. although the printed version, published in , promised additions and corrections to the online database, it seems new records are not being added. in appearance, the records for the over sermons in the database resemble those of a traditional library catalogue, with individual sermons assigned accession numbers, and authors and titles providing the main points of entry. while the individual records are informative, search functionality is limited by the use of controlled vocabulary rather than free text searching, and the evolving nature of sermon scholarship has rendered some assigned key words obsolete. equally frustrating is that terms one might expect to find, such as ‘calvin’, ‘election’ or, more broadly, ‘theology of grace’, are not among the keywords used. moreover, the database’s coverage of repositories and preachers is idiosyncratic, since data collection was undertaken by individual researchers over a period of many years, and only complete sermons were included.
several scholars have initiated collaborative projects to undertake the more labour-intensive task of presenting transcriptions of manuscript sermons; however, they have struggled to find volunteer transcribers. sermon notebooks online (neuman ) is the fairly recent creation of meredith marie neuman at clark university, author of jeremiah’s scribes: creating sermon literature in puritan new england (university of pennsylvania press, ). she describes her personal website as “an experiment in sharing the onerous work of transcription by breaking it down into less onerous pieces” (neuman ), and invites volunteers to take on tasks ranging from entire sermons to smaller sections of text. to date, however, few seem to have accepted the challenge, and the site presents a single transcription, alongside useful paleographic resources. neuman ( ) does not indicate an intention to create a database with search functions as more transcriptions are added, and this development would likely require a more committed group of collaborators and/or massive funding. such an approach has been taken by teams (transcribing early american manuscript, hutchins ), a group-sourcing initiative under the general editorship of zach hutchins at colorado state university, author of inventing eden: primitivism, millennialism, and the making of new england (oxford, ). the stated goal of this project is “to transcribe and publish a selection of sermons indicative of the rich diversity of manuscript sermons available in archives” (hutchins ). the project also relies upon volunteer transcribers, and the content is therefore dictated by their interests and research projects. although the website lists few contributors, the group has succeeded in digitizing transcriptions of fifty sermons. unlike sermon notebooks online, which invites contributions from transcribers with all levels of experience, this group follows standardized transcription protocols clearly indicated on the site.
the search functions are relatively primitive, but users may search by bible book and verse, author, denomination, location, and year, as well as within the full text. possibly due to an inability to obtain institutional permissions, the site presents the transcriptions without images of the originals, which removes the reader from the manuscript as material object. in the uk, the national archives (tna) provides bibliographic access not only to its own materials, but also to records listed in the catalogues of over , other archives in the nation; however, the omission of private libraries containing significant manuscript sermon collections, such as dr. williams’s library, makes this resource less useful for sermon scholars. to date, however, the active community of early modern sermons scholars in the uk, perhaps deterred by the vast number of manuscripts and collections in the british isles, has not attempted notable subject-specific databases or transcription projects. despite their limitations, such projects bear witness to the desire of sermon scholars to find more efficient means of expanding their research resources. recognizing the gaps between project intentions and the imperfections of individual efforts enabled us to develop a set of guidelines for our own project. first, we recognized the importance of developing a database that would support robust searching functions and of ensuring stable hosting that would enable ongoing maintenance and updating. we also saw that providing access to a more representative set of sermons (including sermon notes) would only be possible through collaboration involving both a dedicated research team and a committed group of sermon scholars centred on the project as both users and contributors.
while witnessing the limited success of group-sourcing transcription affirmed our decision not to include transcriptions in the first iteration of gemms, southern manuscript sermons demonstrated the pitfalls of assigning keywords likely to become outdated as points of access. gemms project: a brief history of challenges and successes funded by a five-year social sciences and humanities research council of canada insight grant ( – ), the first phase of the project is nearing completion and has achieved its two primary goals: collecting a significant quantity of metadata at libraries and archives in the uk and establishing a stable and fully searchable database containing this metadata. before beginning this theoretical and practical work, we appointed a board of uk sermon scholars including historians, literary scholars, and a theologian, who have provided advice on various matters, including creating a taxonomy and on the types of queries sermon scholars were likely to bring to the database. this group also collaborated with us regarding initial design decisions, contributed data, assisted with recruitment of uk research assistants, and expanded the international scope of the project. our first challenge was to determine the parameters of the project and construct a taxonomy for the very diverse materials involved. we established the date range for both theoretical and practical reasons: represents the beginnings of english protestantism, which changed both the importance and style of preaching, while marks the end of the stuart dynasty.
from a practical perspective, relatively few manuscripts remain from the sixteenth and early seventeenth centuries, while by the early eighteenth century the volume of both printed and manuscript sermons is so great as to be unmanageable, and the increased availability of printed sermons decreases the need to rely on manuscripts to ensure a wide representation of preaching styles and theological positions. the decision to include sermons housed in north america as well as in the uk and ireland recognizes an opportunity both to link scattered works of particular preachers and to offer possibilities for research that compares and contrasts the sermon cultures on each continent. just as the geographical distribution of manuscripts has been challenging for traditional scholarship, it remains challenging for digital scholarship. we began by compiling lists of manuscripts at repositories to establish a working bibliography of over , manuscripts and manuscript collections. with the majority of these collections housed in the uk and the principal researchers based in canada, we have tried to make efficient use of our research trips by focusing on repositories with rich collections, such as the bodleian library and dr. williams’s library, while extending our searches to smaller and less well-known archives as time permits. research assistants (ras) have also collected data at many smaller repositories that have been easier for them to visit, and iter fellows in toronto have collected data from manuscripts that have been digitized and are available online from repositories, primarily in the us. while this system is efficient (although not all of the records were accurate), we recognize that data has not been collected systematically, library by library, geographical area by geographical area.
any search, therefore, reflects only the results of the data we have entered, data not statistically relevant because we have not aimed for complete coverage of any single collection (with the exception of the contents of dr. plume’s library: see below). moreover, it initially limited us to those with online finding aids, repositories that most experienced researchers would be familiar with. old-fashioned human connections led us to dr. helen kemp, a recent graduate of the university of essex, when she applied for an ra position. we learned of dr. kemp’s work in cataloguing individual sermons in manuscripts at the private dr. plume’s library and were successful in hiring her to add this data to gemms. similarly, a chance encounter at dr. williams’s library led dr. shami to a collection of uncatalogued manuscripts at westminster college, cambridge. it is these connections that have led to adding knowledge of new collections and have contributed greatly to the strength of this resource. we have now collected data from archives and have added data from manuscripts, resulting in over , records of individual sermons, the majority of which had never been individually catalogued. recognizing quickly that entering data directly into the database during these research trips was inefficient, we reverted to taking notes and photographing information that was difficult to decipher. ras then entered the metadata into the database. while this working arrangement makes it possible to collect and process data efficiently, as well as to help to train ras in palaeography, there are disadvantages. not all repositories allow photography, and when incomplete or defective information is found in our notes, often months after it has been collected, we must either rely on our memories or send ras to verify the correct information, a relatively expensive procedure for a single piece of missing data. checking for accuracy occupies a significant amount of research time because we understand that accuracy is crucial to the long-term usefulness and survival of gemms. developing a taxonomy for the various kinds of sermon traces described above was a primary concern in the first stages of data collection and database design. because most scholars have made limited use of sermon traces other than complete sermons, there has been little agreement on how to classify or refer to these materials (morrissey , ). gemms has classified items into sermon types (figure ) according to two criteria: structure and creator. short summaries or lists of sermons kept in sermon notebooks, commonplace books, letters, and diaries represent another class of materials. as some manuscripts contain large numbers of such brief traces, we have chosen not to attempt individual entries for each sermon mentioned, but to identify the lists or diaries as ‘sermon reports’ and to summarise their contents.

figure : gemms sermon taxonomy.
‘auditor’s notes’: notes taken by an auditor of a sermon.
‘auditor’s outline’: a list of the heads or main points of a sermon written by an auditor.
‘preacher’s notes’: notes written by a preacher, rather than a fully written out sermon.
‘preacher’s outline’: a list of the heads or main points of a sermon written by the preacher.
‘reader’s notes’: notes taken by a reader of a sermon.
‘reader’s outline’: a list of the heads or main points of a sermon written by a reader.
‘sermon’: a sermon written out in full, though it may contain minor corrections or revisions, such as changing the occasional word or phrase. it may have been written either by the preacher or another person.
‘sermon draft’: a fully written out sermon with substantial revisions or corrections, not just changes to the occasional word or phrase. the revisions may have been made at any time, including for a subsequent preaching of a sermon. sermon drafts will almost always have been written by the preacher, but someone else may have later revised the sermon.
‘sermon fragment’: a part of a sermon, either written by the preacher or someone else.
‘sermon notes’: notes of a sermon when it is not known whether they were written by the preacher or someone else.
‘sermon outline’: an outline of the heads or main points of a sermon when it is not known whether it was written by the preacher or someone else.
‘transcription of manuscript sermon’: a copy of a sermon known to be transcribed from a manuscript sermon.
‘transcription of printed sermon’: a copy of a sermon known to be transcribed from a print edition.
‘transcription of sermon (unknown source)’: a sermon that has been copied from another source, but it is unclear whether that source sermon was from a manuscript or a printed text.

we add an additional means of classification, sermon genre, to indicate when a sermon is clearly of a particular kind, such as a funeral sermon, a fast sermon, or a sermon in preparation for communion. while these genres are often indicated fairly clearly in sermon headings, others such as doctrinal, instructional, confutational, and rehearsal can seldom be recognized without reading the sermons, which has not been possible given the time constraints of data collection. we anticipate that as researchers use the database they will be able to add such details to the entries and to offer corrections when necessary; however, the lack of such information in the shorter term may reduce the usefulness of the database for some potential users (full taxonomy available at gateway to early modern manuscript sermons a). this range of materials complicates the processes of data collection and entry.
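the two criteria named above, structure and creator, can be pictured as a small lookup for the notes and outline types in the taxonomy. the following is only an illustrative sketch: the names `Structure`, `Creator`, and `sermon_type` are hypothetical and are not taken from the gemms database, and the sketch covers only the notes and outline labels, not full sermons, drafts, fragments, or transcriptions.

```python
from enum import Enum


class Structure(Enum):
    # Hypothetical labels for the "structure" criterion.
    NOTES = "notes"
    OUTLINE = "outline"


class Creator(Enum):
    # Hypothetical labels for the "creator" criterion.
    PREACHER = "preacher"
    AUDITOR = "auditor"
    READER = "reader"
    UNKNOWN = "unknown"


def sermon_type(structure: Structure, creator: Creator) -> str:
    """Return the taxonomy label for a notes or outline item.

    When the creator is not known, the taxonomy falls back to the
    generic 'sermon notes' / 'sermon outline' labels.
    """
    if creator is Creator.UNKNOWN:
        return f"sermon {structure.value}"
    return f"{creator.value}'s {structure.value}"
```

for instance, `sermon_type(Structure.NOTES, Creator.AUDITOR)` yields the label for notes taken by an auditor, while an unknown creator yields the generic label.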
often, we are dealing with incomplete or illegible information, requiring researchers and assistants – like detectives – to interpret exactly what they have in front of them. incorrect and/or incomplete cataloguing offers researchers the pleasures of problem-solving, but also uses valuable time, sometimes to no satisfactory end. while a clearly defined taxonomy supports consistency in data entry, our experience also indicates that applying the taxonomy (which cannot account for every permutation of classification) is a compromise between precision and utility and is handled differently by different researchers. despite training of ras and detailed guidelines for data entry, the classification of the heterogeneous contents of many manuscripts is not always clear. as the number of contributors grows, the potential for variant interpretations also increases. allowing users to contribute their own data offers an even greater possibility of eroding the value of the taxonomy. moreover, as our research continues, we discover materials that do not fit existing classifications. for example, ra hannah yip, in a recent blog post, has raised the question of sermons prepared by laypersons that do not focus on a specific biblical text, a category that many traditional definitions of the sermon exclude (gateway to early modern manuscript sermons b). we are faced with the question of expanding the classification system and perhaps correcting existing records, or simply using free text fields to note anomalies and ambiguities. identifying people associated with the manuscripts poses a second problem.
with the assistance of our ras, we have supplemented the archival data to provide brief biographies for individuals who can be traced with the aid of online resources such as the clergy of the church of england database ( ) (cced), the surman index online ( ) (for nonconformist preachers), the oxford dictionary of national biography, and the oxford and cambridge alumni databases ( ) (with the website providing links to these resources), but the gemms people records have presented many challenges. multiple versions of names need to be collated and cross-referenced, and it is often difficult to distinguish one mr. smith from another (requiring separate entries) due to insufficient biographical information. identifying named individuals with no other claims to fame (particularly subjects of funeral sermons) is frequently impossible. inconsistencies of geographical locations, which are frequently abbreviated in manuscripts and some of which no longer exist as separate communities, are equally troubling; handwriting is frequently cramped, and notes taken in haste at sermons are at times barely and incompletely legible.

one of the greatest problems we have faced, however, is that many sermons can be dated only very approximately. in order to cope with the problems arising from dates, we have established two different types of dates: composition date and preaching date. a composition date is given if there is no indication of when the sermon was preached. many manuscripts are dated only provisionally across a broad date range (e.g. seventeenth century), and even when examined yield few clues as to a more precise date range, so we have had to train ras to use judgment and consistency when entering metadata related to dates. having no date is not an option. these dates often span a large range of years, and ‘ca.’ is added when they are approximate. they may be based on the repository’s dating of the manuscript or on the preacher’s active dates, if known.
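the two date types described above could be recorded along the following lines. this is a minimal sketch under stated assumptions, not the gemms schema: the class name `SermonDate`, its field names, and the example years are all hypothetical, chosen only to illustrate a mandatory year range, a ‘ca.’ flag, and a note of what the dating is based on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SermonDate:
    """Hypothetical date record: every sermon gets a year range, never 'no date'."""

    kind: str                    # "composition" or "preaching"
    start_year: int              # first year of the range
    end_year: int                # last year of the range (equal to start_year for a single year)
    circa: bool = False          # True when the range is approximate ('ca.')
    basis: Optional[str] = None  # e.g. "repository dating" or "preacher's active dates"

    def __post_init__(self) -> None:
        if self.kind not in ("composition", "preaching"):
            raise ValueError(f"unknown date kind: {self.kind!r}")
        if self.start_year > self.end_year:
            raise ValueError("start_year must not be later than end_year")

    def label(self) -> str:
        """Human-readable form, prefixed with 'ca.' when approximate."""
        if self.start_year == self.end_year:
            span = str(self.start_year)
        else:
            span = f"{self.start_year}-{self.end_year}"
        return f"ca. {span}" if self.circa else span

    def overlaps(self, first: int, last: int) -> bool:
        """True if this range shares at least one year with [first, last]."""
        return self.start_year <= last and first <= self.end_year
```

a record dated only to the seventeenth century would then still be returned by a search limited to a single decade, because `overlaps` compares whole ranges instead of requiring an exact year.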
dates of actual preachings are given when provided on the manuscript, although these may also be approximate within a short span of months or years. a further complication is that some manuscripts use ‘old style’ dating, in which the year begins on march, while others use ‘new style’, with the year beginning january, a system becoming more common by the late seventeenth century. when a manuscript follows a chronological order, it is generally possible to tell which style is being used; however, in many cases this is impossible, and we note any uncertainties.

these problems of classification, lack of biographical certainty, and vague dating create challenges that make effective searches difficult. the advanced search function (figure ) allows researchers to search within the classification of sermons, manuscripts, or sermon reports, and to limit those searches by repository, manuscript, date, bible book, person and role, sermon type, sermon genre, preaching occasion, and preaching location. free text searching allows access to other relevant information including the identification of print editions or other manuscript witnesses, when they exist, to enable users to compare different versions of the same sermon, an exercise that helps researchers to better understand the transition from manuscript to print and the ways in which sermons might be adapted to different audiences.

figure : advanced search for sermons.

we also record physical characteristics of manuscripts that may be useful to book historians, such as unusual bindings or paper, or the presence of clasps. simple searches are impressively swift, but we are still working to ensure that users can drill down through many levels to arrive at very specific – and accurate – findings.
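the ‘old style’ and ‘new style’ year reckoning mentioned above can be reconciled mechanically once the style of a manuscript is known. the following is a minimal sketch, assuming the usual english convention that the old style year turned on 25 march; the function names and the example dates are hypothetical and are not part of gemms. it adjusts only the year number and does not convert between the julian and gregorian calendars.

```python
def new_style_year(written_year: int, month: int, day: int) -> int:
    """Year under 1 January reckoning for a date whose written year turns on 25 March.

    Assumption: the manuscript is known to use Old Style year reckoning.
    """
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError("month must be 1-12 and day 1-31")
    # Dates from 1 January to 24 March still carry the previous year's number
    # under Old Style reckoning, so they move forward by one year.
    if (month, day) < (3, 25):
        return written_year + 1
    return written_year


def dual_year(written_year: int, month: int, day: int) -> str:
    """Dual-dated label such as '1624/1625' when the two reckonings disagree."""
    adjusted = new_style_year(written_year, month, day)
    if adjusted == written_year:
        return str(written_year)
    return f"{written_year}/{adjusted}"
```

for a sermon headed 10 february 1624 in an old style manuscript, `dual_year(1624, 2, 10)` gives `1624/1625`, which keeps the written year visible alongside the adjusted one. where the style cannot be determined, the uncertainty still has to be recorded by hand.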
achieving consistent and reliable results from more complex searches remains a challenge. as an alternative to the search functions, users can browse tables including sermons, manuscripts, repositories, people, places, bible books, and sermon reports. however, browsing becomes less and less efficient as more records are added to the database. as in any such project, search results depend upon the quality of the data as well as the functionality of the search structure. while our advanced search function is robust, it frequently encounters problems resulting from incomplete or ambiguous data. these problems lack easy solutions since they are intrinsic to the materials themselves.

future directions

while creating a useful and viable database has taken, and continues to take, many of our resources, we have only begun to tackle our most complex challenge: creating an international, collaborative community of scholars who use manuscript sermons as evidence for their varied research objectives. this is a more formidable task. our immediate community consists of board members and known sermon scholars, as well as research associate jennifer farooq, past and present ras in the uk (lucy walton, catherine evans, hannah yip), and canadian ras and iter fellows (robert imes, benjamin durham, david robinson, adam richter, and brandon taylor). these researchers, supervised and trained by us and our research associate, have progressed from data entry and interpretation to wider application of our published data (for example, several ras organized and participated in a conference on early modern sermons at sheffield on november ). however, moving beyond this community will require a vibrant social media presence that facilitates user feedback on the data we are presenting and delivering and that will build momentum beyond what we can accomplish in our public presentations.
we will also need to engage board members more creatively to assist with hiring research assistants, contributing data, evaluating and reviewing gemms, and expanding and enriching our online community networks. while some sermon scholars have helped to publicise this resource in print (clement , ; morrissey , ) and have indicated that they are using it in their work, we have found it difficult to move beyond this dedicated group. partly this may be a problem of the breadth of audience for this database. as daniel pitti ( ) notes, “the narrower and more specific the intended audience, the easier it will be to identify and define the uses of the data” ( ). aside from those scholars who study sermons exclusively or primarily, there is a large group of others whose interest in them may be specific to a particular situation or project. drawing in these potential users will require engaging with them to determine their needs. to this end, we plan to turn our focus to developing partnerships with other researchers who are interested in manuscript sermons and sermon notes from this period for a variety of purposes. we have established a partnership with the new england beginnings project co-ordinated by dr. francis j. bremer ( ): “a partnership to encourage and promote activities that commemorate the cultures that shaped early new england.” not only will this partnership facilitate collaboration with american scholars, but it also offers possibilities for links with non-scholarly organizations, since it focuses on engaging both academics and “a wide, general public audience” ( ).
we envision many avenues of participation with members of its loose federation that includes a variety of organizations and scholars: participation in their guest scholars program, an initiative that is committed to making the participants’ views on new england available to schools, colleges, and community groups via technologies such as teleconferencing, skype, etc.; and planning and co-hosting an international conference focused on interactions between england and new england involving sermons as a showcase for gemms as a research resource and as a collaborative initiative to bring these two worlds together, historically and culturally, for the education and mutual benefit of north american and european participants. other future plans include offering workshops at universities that have suitable programs. attracting casual or occasional users will require thinking beyond academic structures to historical and genealogical societies, perhaps using the new england beginnings as a model for this kind of engagement. of special importance will be working directly with librarians and archivists to develop relationships that progress beyond gaining access to their collections to genuine collaborations that could include digitization projects.

migration to the drupal platform during our next phase, set to begin in , will also be challenging. we intend for this migration to enable us to create an enhanced search interface and to develop group-sourcing features. drupal has been selected for its ability to be customized, as well as for its extensive usage among university libraries, which are the academy’s primary dispensers and preservers of digital scholarly content.
this new phase will also see increasing promotion of the database among scholars, and, most importantly, the addition of features that will allow users to contribute their own metadata and to add comments, upload images (with permissions), and attach files (including transcriptions) to existing records. however, while early users and prospective contributors have expressed interest in the expansion of the project to include online images and transcriptions, we are unlikely to add these features systematically due first to the sheer volume of materials and secondly to repository policies and copyright restrictions. additionally, while we hope to enable researchers to plan their time and travel more effectively based on the information the database provides, we see value in consulting these materials in person whenever possible to benefit from seeing the sermons in context and recognizing the manuscripts as material objects.

we were fortunate at the beginning of the project to secure hosting by iter: gateway to the middle ages and the renaissance ( ), which has given the website a stable platform that ensures its future availability and free access. iter’s mandate is to provide “a flexible environment for communication, exchange, and collaboration” (bowen, crompton, and hiebert , ) among renaissance scholars, and it is therefore concerned not simply with hosting resources, but also with the potential use of these resources for scholarship and teaching, with opportunities for publication and “social knowledge creation” (bowen, crompton, and hiebert , ). however, “storage is not synonymous with sustainability [:] what sustains a scholarly text [or database] is the community that is built around it, the way that its ideas are taken up by other scholarship, and the way that it circulates among scholars, students, and the general public” (bowen, crompton, and hiebert , ).
the project is ultimately valuable only if such circulation occurs. we hope that situating our “boutique” resource (powell, et al. , ) within iter, with its connections to rekn (renaissance knowledge network ) and inke (implementing new knowledge environments ) will help to engage scholars in different disciplines and scattered geographical locations in new kinds of research using the data we are collecting about early modern sermons and provide additional publication opportunities.

competing interests
the authors have no competing interests to declare.

author contributions
both authors contributed equally to the conceptualization, original draft preparation, review and editing of this essay.

editorial contributions
section/layout editor: mahsa miri, university of lethbridge journal incubator
copy editor: shahina parvin, university of lethbridge journal incubator.

references
blench, john wheatley. . preaching in england in the late fifteenth and sixteenth centuries: a study of english sermons, –c. . oxford: basil blackwell.
bowen, william r., constance crompton, and matthew hiebert. . “iter community: prototyping an environment for social knowledge creation and communication.” scholarly and research communication ( ): – . doi: https://doi.org/ . /src. v n a
bremer, francis j. . “new england beginnings.” accessed october, . https://www.newenglandbeginnings.org.
cambridge alumni database. . “cambridge alumni database.” accessed october, . http://venn.lib.cam.ac.uk/acad/enter.html.
carlson, eric. . “‘the boring of the ear’: shaping the pastoral vision of preaching in england, – .” in preachers and people in the reformations and early modern period, edited by larissa taylor, – . leiden: brill.
clement, jennifer. . “introduction: rhetoric, emotion and the early modern english sermon.” english studies ( ): – . doi: https://doi.org/ . / x. .
clergy of the church of england database. . accessed october, . http://db.theclergydatabase.org.uk/jsp/search/index.jsp.
cogswell, thomas. . “the politics of propaganda: charles i and the people in the s.” journal of british studies ( ): – . doi: https://doi.org/ . /
davies, godfrey. . “english political sermons, – .” huntington library quarterly : – . doi: https://doi.org/ . /
davies, horton. . like angels from a cloud: the english metaphysical preachers – . san marino, ca: huntington library.
deios, laurence. . that the pope is that antichrist: and an answer to the obiections of sectaries, which condemne this church of england. london: george bishop and ralph newberie.
donne, john. . john donne’s gunpowder plot sermon: a parallel-text edition, edited by jeanne shami. pittsburgh: duquesne university press.
eales, jacqueline. . “provincial preaching and allegiance in the first english civil war ( – ).” in the culture of english puritanism, – , edited by c. durston and j. eales, – . new york: st. martin’s press. doi: https://doi.org/ . / - - - - _
ferrell, lori anne, and peter mccullough. . the english sermon revised: religion, literature and history in england, – . manchester: manchester university press.
gateway to early modern manuscript sermons. a. “gemms’s sermon taxonomy.” accessed october . http://gemmsproject.blogspot.com/p/gemms-sermon-taxonomy.html.
gateway to early modern manuscript sermons. b. “silent preaching: laypeople’s manuscript sermons, c. –c. .” accessed october . http://gemmsproject.blogspot.com.
green, ian m. . the re-establishment of the church of england, – . oxford: oxford university press.
green, ian m. . continuity and change in protestant preaching in early modern england. friends of dr. williams’s library, th lecture. london: dr. williams’s trust.
haigh, christopher. . “the recent historiography of the english reformation.” in reformation to revolution: politics and religion in early modern england, edited by margo todd, – . abingdon: routledge.
hughes, ann. . politics, society and civil war in warwickshire, – . cambridge: cambridge university press. doi: https://doi.org/ . /cbo
hunt, arnold. . the art of hearing: english preachers and their audiences, – . cambridge: cambridge university press.
hutchins, zachary ed. . “transcribing early american manuscript sermons (teams).” accessed october . http://earlyamericansermons.org.
implementing new knowledge environment (inke). . accessed october . https://inke.ca.
iter: gateway to the middle ages and the renaissance. . “iter: bibliography.” accessed october . https://www.itergateway.org.
lake, peter. . “calvinism and the english church, – .” past and present : – . doi: https://doi.org/ . /past/ . .
lake, peter. . anglicans and puritans?: presbyterianism and english conformist thought from whitgift to hooker. crows nest: allen and unwin.
lofaro, michael gen. ed. . southern manuscript sermons before : a bibliography. knoxville: newfound press. accessed october , . https://trace.tennessee.edu/cgi/viewcontent.cgi?article= &context=utk_newfound-ebooks. doi: https://doi.org/ . /v p w
mccullough, peter. .
“preaching and the context: john donne’s sermon at the funerals of sir william cokayne.” in the oxford handbook of the early modern sermon, edited by peter mccullough, hugh adlington, and emma rhatigan, – . oxford: oxford university press. doi: https://doi.org/ . /oxfordhb/ . .
milton, anthony. . the british delegation and the synod of dort ( – ). suffolk: boydell press.
mitchell, w. fraser. . english pulpit oratory from andrewes to tillotson: a study of its literary aspects. new york: russell and russell.
morrissey, mary. . politics and the paul’s cross sermons, – . oxford: oxford university press. doi: https://doi.org/ . /acprof:oso/ . .
morrissey, mary. . “sermon-notes and seventeenth-century manuscript communities.” huntington library quarterly ( ): – . doi: https://doi.org/ . /hlq. .
neuman, meredith. . “sermon notebooks online.” accessed april . https://wordpress.clarku.edu/meneuman/sermon-notebooks-online/.
pitti, daniel v. . “designing sustainable projects and publications.” in a companion to digital humanities, edited by susan schreibman, ray siemens, and john unsworth. wiley. accessed october , . http://www.digitalhumanities.org/companion/view?docid=blackwell/ / .xml&chunk.id=ss - - &toc.depth= &toc.id=ss - - &brand=default.
powell, daniel, raymond siemens, matthew hiebert, lindsey seatter, and william r. bowen. .
“transformation through integration: the renaissance knowledge network (rekn) and a next wave of scholarly publication.” scholarly and research communication ( ): – . doi: https://doi.org/ . /src. v n a
questier, michael. . catholicism and community in early modern england: politics, aristocratic patronage and religion, c. – . cambridge: cambridge university press. doi: https://doi.org/ . /cbo
renaissance knowledge network. . accessed october . https://rekn.org.
shami, jeanne. . “introduction: reading donne’s sermons.” john donne journal ( – ): – .
shami, jeanne. . “women and sermons.” in the oxford handbook of the early modern sermon, edited by peter mccullough, hugh adlington, and emma rhatigan, – . oxford: oxford university press.
shami, jeanne. . “the sermon.” in the oxford handbook of early modern english literature and religion, edited by andrew hiscock, and helen wilcox, – . oxford: oxford university press. doi: https://doi.org/ . /oxfordhb/ . .
spurr, john. . the restoration church of england, – . new haven: yale university press. doi: https://doi.org/ . /j.ctt qx d
the surman index online.
“how to use the surman index.” accessed october . http://surman.english.qmul.ac.uk/.
tyacke, nicholas. . anti-calvinism: the rise of english arminianism, c. – . oxford: clarendon press.
tyacke, nicholas. . aspects of english protestantism, c. – . manchester: manchester university press.
wabuda, susan. . preaching during the english reformation. cambridge: cambridge university press.
walsham, alexandra. . “‘the fatall vesper’: providentialism and anti-popery in late jacobean london.” past and present : – . doi: https://doi.org/ . /past/ . .
wooding, lucy. . “from tudor humanism to reformation preaching.” in the oxford handbook of the early modern sermon, edited by peter mccullough, hugh adlington, and emma rhatigan, – . oxford: oxford university press. doi: https://doi.org/ . /oxfordhb/ . .

how to cite this article: james, anne, and jeanne shami. . “gemms (gateway to early modern manuscript sermons – ): confronting the challenges of sermons research.” digital studies/le champ numérique ( ): , pp. – . doi: https://doi.org/ . /dscn.
submitted: october accepted: april published: december
copyright: © the author(s). this is an open-access article distributed under the terms of the creative commons attribution . international license (cc-by . ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. see http://creativecommons.org/licenses/by/ . /.
open access digital studies/le champ numérique is a peer-reviewed open access journal published by open library of humanities.
terras, m. ( ). "present, not voting: digital humanities in the panopticon closing plenary speech, digital humanities ". literary and linguistic computing, ( ): - .

present, not voting: digital humanities in the panopticon closing plenary speech, digital humanities

melissa terras
reader in electronic communication, department of information studies, university college london. m.terras@ucl.ac.uk

abstract
digital humanities faces many issues in the current financial and educational climate. in this closing plenary from the digital humanities conference at king’s college london, major concerns about the current role and function of digital humanities are raised, demonstrating the practical and theoretical aspects of digital humanities research in regard to an individual project at university college london: transcribe bentham. it is suggested that those in the digital humanities have to be more aware of our history, impact, and identity, if the discipline is to continue to flourish in tighter economic climes, and that unless we maintain and establish a more professional attitude towards our scholarly outputs, we will remain ‘present, not voting’ within the academy. the plenary ends with suggestions as to how the individual, institution, and funding body can foster and aid the digital humanities, ensuring the field’s relevance and impact in today’s academic culture. this paper is a transcript of what was planned to be said at dh , although the spoken plenary digresses from the following in places.
the video of the speech can be viewed at http://www.arts-humanities.net/video/dh _keynote_melissa_terras_present_not_voting_digital_humanities_panopticon.

. introduction

. preamble, the first.

firstly, let me say how honoured i am to have been asked to be the plenary speaker at dh . i understand that this is a deviation from previous conferences – for the first time, instead of getting someone external to the community to talk about semi-related research areas, they’ve asked someone from well within the discipline to present. i’m aware that, in a room that holds people, there are people other than myself who are more than qualified to stand up for an hour and say what they currently think of the digital humanities (not to mention the other folks registered for the conference who may be watching from the streaming lecture theatre). it’s also worth saying that i am incredibly nervous. many of those in the audience are close colleagues, many are good friends. this is not a conference i can walk away from and forget a disastrous presentation.

i’m very aware that the rules of giving plenary speeches have changed as rapidly as the information environment over the last few years. i remember a plenary at allc/ach (as the conference was then known) ten or so years ago where the speaker read out a chapter of their book, never looking up at the audience once, and with no concession given for the change of presentational mode: ‘as i said on page . as i will discuss in chapter five...’ nowadays, given that i’m being recorded and simultaneously broadcast online, that just won’t cut it. you expect more. as well as being <nervous>, i’m aware that i’m being #nervous. many of you will already have tweeted comments about what i’ve said online, even though i’ve not really begun yet.
that’s fine, and i’m not looking for any special treatment. i just want you to be aware that i’m aware that these are changed days. i don’t know how i’m being watched and perceived, as much as you don’t know what i’m going to say next. in fact, surveillance is just one of the things i want to talk to you about today.

. preamble the second

for those of you who don’t know me, i’m from university college london (http://www.ucl.ac.uk/), just a mile north from king’s. ucl and king’s were both founder members of the university of london in (harte , university of london ), and the two universities have an interesting, competitive history. ucl was set up as a secular educational establishment, letting in anyone who could pay the fees (such as gandhi. and women. (harte , harte and north )) whereas king’s was set up as an anglican church based institution, in reaction to the ‘great godless of gower street’ threatening to allow education to all just up the road. the two institutions have remained locked in friendly – but sometimes fierce – competitive mode, ever since. a recent provost’s newsletter from ucl ran with a headline that ucl had beaten king’s in the women’s rugby varsity match - (grant ). on the academic front, we are often competing for the same staff, students, grants, even facilities. ucl has recently established its centre for digital humanities (http://www.ucl.ac.uk/dh/), which forms a competitive alternative to the teaching and research that has been established at king’s centre for computing in the humanities (now the department of digital humanities, http://www.kcl.ac.uk/ddh/). and so it goes on.

we’re proud at ucl of the different nature of our university. as opposed to king’s, we never will have a theology department, and do not provide a place of worship on campus. many of the founding principles of ucl were influenced by the jurist, philosopher, and legal and social reformer, jeremy bentham, who believed in equality, animal rights, utilitarianism, and welfarism (http://www.ucl.ac.uk/bentham-project/). ucl special collections host , folios of bentham’s letters and manuscripts, many of which have not been transcribed. upon his death, bentham’s skeleton was preserved as an ‘auto-icon’ (a two fingers up to those who believed in religion and the need for a christian funeral) which now sits, dressed in his favourite outfit, in the cloisters at ucl. there is an oft repeated story that bentham’s body is wheeled into senate meetings, although he is noted in the minutes as ‘present, not voting’ (bentham project ). you will notice that in early colour photos, you can still see bentham’s real preserved head at the base of the auto-icon. this has been stored in special collections ever since , when (the story goes) students from king’s stole it, merrily kicked it around the quad, and then held it to ransom. friendly competition, indeed.

. introduction proper

time to draw this properly back to digital humanities and the plenary in question. one of bentham’s main interests was penal reform, and he is perhaps most famous for his design of the panopticon, a prison which allowed jailors to observe (-opticon) all (pan-) prisoners without the incarcerated being able to tell whether they are being watched. this psychologically, and physically, brutal prison was never built, but the concept has lived on as metaphor, influencing a wide range of artists, writers, and theorists, including george orwell (who worked in room of senate house, in-between ucl and king’s, and would have been well aware of jeremy bentham’s work) and foucault ( ). indeed, the panopticon can be taken as a metaphor for western society, and increasingly, online communication, particularly social media.
every time you tweet, do you know who is paying attention? what audience are we performing for, and can you be sure you are in control of how our actions are viewed and used?

now, i cannot pretend to be an omnipresence that has been watching all that has been happening in the digital humanities over the past few months. but when you are asked to do a plenary speech such as this, believe me, you start to pay attention to things. you do your homework. i’ve been peering into the twittersphere panopticon and wondering what to say. which will be the following:

i’m going to talk briefly about the transcribe bentham project, as a type of dh project that can objectively demonstrate the changes that are occurring in our field at the moment.

i’m going to use this project as a window to peer at current issues in dh – or at least things that i’ve learnt from the project – and the wider community – over the past few months.

and finally, i’m going to set you all homework based on the key things that are emerging in our field. friendly competition is not so friendly just now. there are tough times ahead for academia, given the current financial crisis and promised cutbacks. what can we learn from the areas highlighted by this discussion, and what can we do better as a field, so those who are looking at us (and believe me, managers and administrators and financial experts are looking at us) can visibly see what we are up to?

. transcribe bentham

transcribe bentham is a one-year, arts and humanities research council funded project, housed under the auspices of the bentham project at ucl (http://www.ucl.ac.uk/bentham-project/).
the bentham project aims to produce new editions of the scholarship of jeremy bentham, and so far twelve volumes of bentham’s correspondence have been published by the bentham project, plus various collections of his work on jurisprudence and legal matters (http://www.ucl.ac.uk/bentham-project/publications). however, there is much more work to be done to make his writings more accessible, and to provide transcripts of the materials therein. although a previous grant from the ahrc in - has allowed for the completion of a catalogue of the manuscripts held within ucl (http://www.benthampapers.ucl.ac.uk/), and transcriptions have been completed of some , folios (currently stored in ms word...), there are many hours of work that need to be invested in reading, transcribing, labelling, and making accessible the works of this interdisciplinary historical figure if they are to be analysed, consulted, and utilised by scholars across the various disciplines interested in bentham’s writings. crowdsourcing - the harnessing of online activity to aid in large scale projects that require human cognition - is becoming of interest to those in the library, museum and cultural heritage industry, as institutions seek ways to publicly engage their online communities, as well as aid in creating useful and usable digital resources (holley ). as one of the first cultural and heritage projects to apply crowdsourcing to a non-trivial task, ucl's bentham project has recently set up the ‘transcribe bentham’ initiative: an ambitious, open source, participatory online environment which is being developed to aid in transcribing , folios of bentham’s handwritten documents (http://www.ucl.ac.uk/transcribe-bentham/). to be formally launched in september , this experimental initiative will aim to engage with individuals such as school children, amateur historians, and other interested parties, who can provide time to help us read bentham’s manuscripts.
the integration of user communities will be key to the success of the project, and an additional project remit is to monitor the success of trying to engage the wider community with such documentary material. will we get high quality, trustworthy transcriptions as a result of this work? will people be interested in volunteering their time and effort to read the (poor) handwriting of a great philosopher? what technical and pragmatic difficulties will we run into? how can we monitor success in a crowdsourced environment? one of the other things that is interesting about the bentham project, and the transcribe bentham initiative, is that it demonstrates neatly the progression of digital humanities in historical manuscript based projects. the bentham project has been primarily occupied with print output, gaining a web presence in the mid s, then an online database of the bentham archive in the early th century, and is now carrying out a moderately large-scale digitisation project to scan in bentham’s writings for transcribe bentham. in addition, the bentham project has gone from a simple web page, to an interactive web . environment, from ms word to tei-encoded xml texts, and from relatively inward looking academic project to an outward facing, community-building exercise. we can peer at dh through this one project, and see the transformative aspects that technologies have had on our working practices, and the practices of those working in the historical domain. . transcribe bentham, and emerging issues in dh i could talk about crowdsourcing for an entire hour, but i thought it would be more useful to point out to those involved in the digital humanities community some of the emergent issues that i have found myself tackling whilst engaging in the transcribe bentham project.
it’s certainly true that for every project that you work on you learn new things about the field, and over the past year various aspects of dh research and issues that concern the dh community have raised their head. i’m going now to talk about some of these issues, backing up what i say with some observations of what has been happening in the dh community, through conversations that others have been having on twitter. forgive me if you just think i’m a stalker. a lot of these issues are becoming more visible in the dh community, so i’m going to quote you on those. . our dependence on primary sources, our dependence on modern technology i’ve never felt more of a jack of all trades, master of none, than when working on transcribe bentham. and it’s great. let’s be clear – the bentham project belongs to professor philip schofield, who has been working on it for over years (http://www.ucl.ac.uk/laws/academics/profiles/index.shtml?schofield). i’ve just been drafted in to help bridge the gap between primary sources, dedicated scholars, and new technology. on the one hand, i’m utterly dependent on scholars who know less than me about IT, and more than me about their subject domain, to make an academic contribution. on the other, i’m utterly dependent on some programmers who have the time to work up the ideas we have for transcribe bentham into a complex (but seemingly simple!) working environment for transcribing documents, in just a few months. i’d be lost without access to historical knowledge and source material, but i’d be lost without access to new, online cutting edge, technologies. this is something i see repeated across our discipline. when ray siemens tweets ‘among many highlights of an excellent week, holding a well-read copy of thynne’s chaucer, as well as an ipad...’ we get the joy (siemens ).
when brian croxall tweets ‘gah, can’t get online at this hotel’ (croxall ) we feel his frustration at being cut off from a service and an environment which is becoming as essential to us as running water or oxygen. when bethany nowviskie tweets ‘dreamed last night that my #dh poster was a set of flexible, give away, interactive touchscreens. maybe not too far in the future’ (nowviskie ) we nod in recognition, and go, yeah... that would be cool... let me google that and see if they exist... we exist in this parallel state where we are looking towards humanities research, and computational technology, and it can be immensely rewarding, and great fun. i’m really enjoying working on transcribe bentham. i really enjoy the duality of dh research (as long as i can get online when i want). . legacy data but as well as working with historical documents (or artefacts, or whatever), it’s becoming increasingly common with the digital humanities that we have to work with historical digital documents – or legacy data, left over from the not-so-distant past, in different formats and structures that need bringing into current thinking on best practice with digital data. this can take immense amounts of work. converting , transcribed bentham documents from ms word to tei-compliant xml, with any granularity of markup, is not a trivial task. linking these transcripts with the records currently held in the online database, and then ucl’s library record system (to deal with usability and sustainability issues) is not a trivial task. linking existing transcriptions with any digitised images of the writings which exist is not a trivial task. transcribe bentham, then, is dealing with sorting its own ducks into a row, as well as undertaking new and novel research.
most of us understand this, and we understand just how much work (and cost) is involved in continually ensuring we are maintaining and updating our work and our records to make sure that our digital resources can continue to be used. so we understand that a seemingly simple tweet by tom elliott saying ‘more batlas legacy data added to pleiades today, courtesy of @sgillies http://bit.ly/dbuyfg’ (elliott ) belies an incredible amount of work to convert and maintain an existing resource. as well as looking forward to the future, and new technologies, us dh peoples must be our own archivists. . sustainability which brings me to the thorny issue of sustainability. we hope with transcribe bentham that the project will continue far beyond its one year remit, but there are some decisions to make in that regard. will the user forums, and user contributions, continue to be monitored and moderated if we can’t afford a staff member to do so? will the wiki get locked down at the close of funding or will we leave it to its own devices, to become an online free-for-all? we are at the stage, in a one-year-project, where we already need to be applying for future funding, before we have even got anything to demonstrate that it’s worth continuing our funding (and there is no guarantee in the current climate that any funding will be forthcoming, see below). but we are lucky in transcribe bentham – its father project, the bentham project, will continue whatever happens, under the watchful auspices of philip schofield. so when dan cohen is quoted, by shane landrum, in a tweet that reminds us ‘being a labour of love is often the best sustainability model’ (landrum ) we understand what that means.
sustainability is an area of huge concern for the dh community, and is going to become more so as financial issues get more complex. . digital identity transcribe bentham is going to live or die by its digital identity and digital presence. it doesn’t have any equivalent in the offline world. it is what it is: an online place to hang out and help transcribe some documents, should that take your fancy. to be a success, then, our functionality, digital presence, and digital identity need to be absolutely spot on. ironically, i’ve never worked on a digital humanities project before where the digital presence mattered so much, and i’ve come to realise that we all should be taking our digital identity and digital presence a lot more seriously. it’s not enough just to whack up a website and say ‘that’ll do, now back to writing books’. if we are going to be in the business of producing digital resources, we have to be able to excel at producing digital resources, and be conscious of our digital identity and digital presence. we are lucky at transcribe bentham to have gained the input of one of my phd students, rudolf ammann (@rkammann) who is also a gifted graphic designer. he has taken it upon himself to whip both ucl centre for digital humanities, and transcribe bentham, into online shape, whilst designing logos for us which are both fitting, useful, and memorable. we’re being careful with transcribe bentham to roll our presence out over twitter and facebook to try and encourage interaction. we hope that someone will be watching. suddenly, it matters in a way that didn’t matter before, if people are looking at our website and our resource. i believe that digital presence and digital identity is becoming more important to digital humanities as a discipline. 
so when amanda french jokily tweets ‘i feel like a got a rejection letter yesterday from @dhnow when too few rted my ‘binary hero’ post http://bit.ly/akpbix’ (french ) we understand the complexity of interacting in the new digital environment: we want the discourse, and want the attention (and if you don’t know what dhnow is, you should be reading it every day: http://digitalhumanitiesnow.org/). likewise when matt kirschenbaum tweets ‘has twitter done more as dh infrastructure than any dedicated effort to date?’ and this is immediately retweeted by tim sherratt with an addendum ‘[for me it has!]’ (sherratt ) we understand the possibilities that are afforded with new modes of online communication. how we can harness this properly for transcribe bentham remains to be seen – but we are at least aware we need to make the effort. . embracing the random, embracing the open there are large differences between producing a perfect (or as near perfect as can be) print edition of bentham’s letters, and learning to deal with the various levels of quality of input we will be getting with transcribe bentham. there are large differences between working in a close knit group of scholars, to working with the general public. there are also differences in producing online editions and sources which you are willing to open up to other uses – and one of the things we want to do with transcribe bentham is to provide access to the resulting xml files so that others can reuse the information (via web-services, etc). the hosting and transcription environment we are developing will be open source, so that others can use it. and this sea change, from working in small groups, to really reaching out to users is something we have to embrace, and learn to work with.
we also have to give up on ideas of absolute perfection, and go for broader projects, embracing input from a wider audience, and the chaos that ensues. so we understand when dan cohen tweets ‘another leitmotif i’m sensing: as academics, we need to get over our obsession w/ perfect, singular, finished, editorial vols’ (cohen ). bring it. let’s see what happens... . impact i only realised recently that my automatic reaction to getting involved with the transcribe bentham project was ‘how can i get from this some output that counts for me’. we wrote into the grant bid a period of user testing and feedback, and one of the reasons is to get a few pretty much guaranteed publications out of the project, looking at the success – or not! – of crowdsourcing in cultural heritage projects. get a few academic outputs in there, then we can go and play online, and not have to worry too much about how creating an open source tool, or reaching out to a potential audience of thousands, will ‘count’ in the academic world. because no matter how successful transcribe bentham is, the ‘impact’ will be felt in the same usual way – through publications. this is a nonsense, but it’s part of the academic game, and is becoming of increasing frustration to those working in the digital humanities. it’s not enough to make something that is successful and interesting and well used: you have to write a paper about it that gets published in the journal of successful academic stuff to make that line on your cv count, and to justify your time spent on the project. so we understand the frustration felt by stephen ramsay when he puts a mini-documentary online which goes as viral as things really get in the digital humanities, viewed by thousands of people, but which will have no real impact on his career: ‘i’ve published some print articles. funny thing though: none of them were read by people in the space of weeks .... had their titles printed on t-shirts, or resulted in dozens of emails from adoring fans. so why am i writing journal articles again? .... oh wait, nevermind, my department doesn’t count movies.’ (ramsay a, b, c). . routes to jobs this is a tricky one. should those hired in digital humanities projects to do technical work have a phd in digital humanities – even if the tasks in the role are service level (such as marking up tei) and don’t require that academic training? i’m willing to admit that transcribe bentham walked right into the storm with this one when our job adverts for our two ras went up. we advertised for two postdocs: one with a historical background that had experience working in the bentham studies area, one with tei chops to help us with the back end of the system. we specified that we wanted phds because of the changed rules in employment at universities in the uk (well, at least those involved in the common pay framework): if we had advertised for posts at non-phd level, we wouldn’t have been able to employ someone with a phd, even if they wanted to work for less money, because of the spine point system. so, we advertise for two postdocs, and if someone good comes through without a phd, we can employ them on a lower rate. but we forgot to mention that applications from those without a phd may also be considered. cue much online discussion in various forums. we get this frustration. dot porter said, on facebook rather than twitter, ‘i get annoyed when i read job postings for positions that require a phd, and then read the job description and can’t figure out why. maybe i’m sensitive, not having a phd? is a phd really required for one to take part in the digital humanities these days, even in supporting (non-research) roles?’ (porter ). this is becoming a real issue in digital humanities.
there is no clear route to an academic job, and no clear route to phd, and there are a lot of people at a high level in the field who do not have phds. yet increasingly, we expect the younger intake to have gone down that route, and then to work in service level roles (partly because there are few academic jobs). it remains to be seen how we can address this. in transcribe bentham, we changed the advert to make it clear we accepted applications from non-phds. in the end we did appoint two post-docs, but at least we made it clear that people had the option to apply for a job where, ostensibly, you didn’t need a phd, just the skill set, to undertake the task properly. . young scholars this problem of employment and career and progression taps into a general frustration for young scholars in our field. it can be hard to get a foothold, and hard to get a job (not just in digital humanities – in the uk over % of graduates under the age of are currently unemployed. it’s a tough time to be coming out of university, phd or no phd). perhaps twas always difficult to make the transition from academic student to academic academic, but twitter amplifies the issues that are facing young scholars in the field trying to make headway. i was very aware when hiring for transcribe bentham that there were some very good candidates out there who just weren’t getting a break (the person who came second in the interview, and who we would have employed instantly had we had two historical post-doc positions, later told me that he had had over interviews, but we were the first people to give him any feedback).
we shouldn’t forget the pressure young scholars are under (at a time when we are complaining of the financial pressures that us paid academics are under) and how difficult it can be for them on both a professional and personal level. it makes me sad to hear tweets like the one from ryan cordell saying ‘just wrote a tough email withdrawing from #dh . even if i got a bursary, i just couldn’t swing it in the same summer as our move #sigh’. (cordell ). (however, it is worth noting that the digital humanities conference currently offers four different types of bursary to young scholars, as well as mentoring schemes such as those provided by ach). . economic downturn which brings me to the next doom and gloom point. when brett bobley tweets ‘two weeks ago, no one in my kid’s school had silly bandz; now they all wear them. how come higher ed never moves that fast?’ (bobley ) we all chuckle at the thought of the academy as being a reactive, immediate place to be. it takes a few years for the impact of outside events to trickle down. it’s only now that the economic downturn is starting to hit higher education. in the uk, cuts over the next few years are predicted to be anything between % to %, depending on what leak or rumour or governmental minister you believe. these are uncertain times for research, and for institutions, and for individuals, and for projects. we don’t know if there will be money to even apply for to continue the research and application in the transcribe bentham project. we don’t know, even if we submit an application, that the funding council won’t suddenly reject all applications due to their funding cuts. we don’t know how to make an economic case for projects in the arts, humanities, heritage, and culture, so that when panjandrums and apparatchiks are deciding which swingeing cut to make next, we can display our relevance, our impact, the point of our existence, and why people should keep writing the cheques. these are uncertain times.
how this is affecting digital humanities is slowly beginning to be played out. . money, the humanities, and job security i feel that it would be morally wrong of me to come to a conference at king’s that has the word humanities in the title and not broach the subject of what had happened over the past year to the humanities at king’s (morgan a, morgan b, tickle and bowcott ). palaeography is a subject close to my heart, and as @drgnosis tweeted during the opening speech of #dh , ‘i weep for palaeography’ (@drgnosis ). i also like to think that had any one of the other registered conference attendees from the digital humanities community been asked to give this plenary that they would have the guts to raise this issue. but i am a guest here and do not want to be rude or impolite. so i’ll repeat what was tweeted by john theibault: ‘there’s going to be a bit of a pall over dh because of all that’s gone on with kcl’ (theibault ). and i recommend if you do not understand what i am talking about, then you read about it, and understand how little respect was given to humanities academics at king’s over the past year from their management (pears ). and i suggest you hope that your own management have not been taking notes, and do not proceed in a similar fashion, for what hope is there then for the humanities? it’s very difficult for those in the humanities to make the economic case for their existence, and that is what we are being expected to do in the current climate. we need to be able to explain why projects like transcribe bentham are relevant, and important and useful. those in the humanities are historically bad at doing this, and those in dh are no different.
but dh is different from traditional humanities research: on the plus side, we should be able to articulate the transferable skill set that comes with dh research, that can educate and influence a wide range of culture, heritage, creative, and even business processes. on the downside, projects like transcribe bentham are more expensive than paying one individual scholar for a year to write their scholarly tome on, say, byzantine sigillography – the digital equivalent will require researchers, computer programmers, computer kit, digitisation costs, etc. to ensure that the digital humanities are funded at the time when funding is being withdrawn from the humanities, we need to be prepared, and to articulate and explain why what we do is important, and relevant. . fears for the future of course, it’s not just the humanities that are in a perilous financial state: in the uk, it’s the whole of the sector. at king’s, it’s not just humanities that have taken the hit, but also the engineering faculty (hurst ). profitable groups from disciplines such as computer science have been poached wholesale by other universities (not so friendly competition now, is it?). and this is a pattern we are seeing across the universities in the uk. we’re all scared; for the continuation of our projects (such as transcribe bentham), for our students, for our young scholars on temporary contracts, for our ‘research profile’ (whatever that may mean) and for our own jobs. we understand the implicit horror in a tweet such as that from simon tanner saying ‘england next? plan to close smaller #welsh #universities broadly welcomed by #education professionals. http://bit.ly/dxwbsj #he #wales.’ (tanner ). if we think that no-one is watching us and making value judgements about our community, our research, our relevance, and our output, then we are misguided. 
it’s not just other scholars who are paying attention, but those who hold the purse strings – who often have no choice but to make brutal cuts. the humanities are one of the easiest targets, given scholars’ reluctance or inability to make the case for themselves. i’m reminded of a phrase from orwell’s , and what happened to society when under the horrific pressure and surveillance within. allow me to paraphrase: if we are not prepared, and if we are not careful, these cuts will be ‘a boot stamping on the face of the humanities, forever’. i remember very strongly that at the end of an upbeat dh neil fraistat stood up and said ‘the digital humanities have arrived!’. but in , the place we have arrived to is a changed landscape, and not nearly as optimistic. we are not in kansas now, toto. . digital humanities in the panopticon so let us pretend that we are someone from outside our community, watching the goings on in academia and making value judgements, and financial judgements, about our discipline and field. how does digital humanities itself hold up when under scrutiny? how do we fare with the crucial aspects of digital identity, impact, and sustainability? the answer is – not very well. from the outside looking in, we look amateur. we should know and understand best, amongst many academic fields, how important it is to maintain and sustain our digital presence and our community. but our web presence, across the associations, sucks. the ach website says it was last updated in (http://www.ach.org/). the allc web site is a paean to unnecessary white space (http://www.allc.org/).
sdh/semi is not so bad, but has its own problems with navigation and presentation (i’m including it here so as not to leave out a whole association, http://www.sdh-semi.org/) – but the adho website is a prime example of what happens when wikis are not wiki-ed (http://digitalhumanities.org/). these are our outward faces. these are our representations of the field. we’ve been slow to embrace other social media and new technologies when we are the field that is supposed to show how it is done. but what you may not know is that the associations have recently taken this on board. there is a lot of hard work going on behind the scenes on all accounts, so i don’t want to lay into folks too badly on this. a wireframe of the new adho site, which should be up and running shortly, demonstrates that we are moving into the st century, finally. what’s interesting is the big space for a mission statement, and a definition of the field (which we at dh don’t have, yet!). we need to take our digital presence more seriously, and to embrace the potentials that we all know about, but haven’t pitched in to help represent for the discipline. what about impact? we’ve been historically bad at articulating our relevance and our successes and our impact beyond our immediate community (and sometimes within our immediate community – it surprised me recently when a leading scholar in the field was told, via twitter, of the role the dh community had in the formulation of xml). we’re bad at knowing our own history, as a discipline, and having examples listed off the top of our heads of why our research community is required in today’s academe. as for sustainability, digital humanities scholars should know how important it is to preserve our discipline’s heritage, and should lead the way in demonstrating this to other fields.
yet it’s only been recently that scholars in the field have started to note the disappearance of abstracts from previous conferences, websites which have disappeared overnight, the fact we don’t have, and can’t locate, a complete back run of the journals printed by the associations. for example, we don’t have any of the image files included in the allc/ach abstract book. we need to look after our heritage: no-one else will. what you may not know, again, is that a few people are working behind the scenes to try and build up digital copies of our discipline’s history, and hopefully over the next year or so we’ll see this available online. we need to be leading the way in the humanities for publishing and maintaining and sustaining our discipline, to demonstrate that, yes, we really do know what we are talking about. at the moment, it looks like we don’t. why does all this matter? i bring you back to the title of my talk: ‘present, not voting: digital humanities in the panopticon’. our community matters - although heck, a lot of you are not voting – for the ach and allc elections, turnout was around %. we need new blood in the associations. we need people who are not just prepared to whine but prepared to roll up their sleeves and do things to improve our associations, our community, and our presence in academia. but the fact of the matter remains: if we do not treat our research presence seriously, if we are not prepared to stick up for digital humanities, if we are not prepared to demonstrate our relevance and our excellence and our achievements, then the status of those working within dh (including the relevance of digital scholarship, and how it is treated by those in the humanities) will not improve, and we’ll be as impotent as we have ever been. we should be demonstrating excellence and cohesion and strength in numbers. we should be prepared, as best we can, for whatever is coming next in the financial downturn, and in academia. 
if we self identify as digital humanists – and i presume many of those here at the conference would – then we need to articulate what that means, and what’s the point of our community. it’s the only way to prepare for what is coming. . homework so far, so doom and gloom. but we are a community who are full of those who like to do things, and make things, and achieve things. and there are plenty of practical things we can do to ensure the continuation of our individual careers, our individual projects (such as transcribe bentham), our centres, and our teaching. for the individual, we can be prepared by having at the tip of our tongues what we do and why we matter and why we should be supported and why dh makes sense. (those definitions of dh must be personal, and must vary – but how many of us, when asked to explain dh, go ‘well, it’s kinda the intersection of...’ – and you lost them at kinda). we need to have thought about the impact of our work, and why it is relevant. when asked, or queried, about this (either in a personal or professional setting) we should know. and it really doesn’t hurt to have learnt a little about the background of our field, and its impact, and its successes, so we can throw in a few ‘for examples’ when the blue-sky nature of research pays off, and for when the application of our research in the wider community works, and for some major problems that need to be solved about digital culture and use and tools and why we are the people to do it. individuals can find support in networks of scholars, and become active in communities (both dh and individual subject organisations): there is strength in numbers. individuals can take their digital identity seriously – let’s show other scholars and other disciplines how best to proceed.
we need to learn to play the academic game with regard to publications, though, and ensure all of our wonderful whizz bangy tools are equally followed up with research papers in important places, which is a bit of a bind, but the only way to maintain and improve our academic credentials at present. individuals can promote and be the advocates for dh, and for dh-based research. we can also ensure that we support the younger cohort and students and young scholars who are just entering our field: it’s our role to be ambassadors for dh in every way we can. for those individuals who do have some management sway and some management clout, there is also plenty that can be done to push forward the digital humanities agenda, within departments and institutions. more support and kudos can be given to digital scholarship and digital outputs within the humanities, and this becomes something that can be raised and pushed within institutional committee structures, to ensure they count for hiring and promotion and tenure. (indeed, establishing devoted tenure track posts for digital humanities scholars may be something those in the united states could work towards). issues of funding and employment for young scholars in the discipline should be watched out for, including the phd and hiring/qualification issue, but this is something that can be tackled through careful, watchful leadership. my main advice to those in dh management, though, would be to ensure you fully embed your activities within institutional infrastructures: become indispensable. get involved with academic departments and service areas. provide advisory services and engage with as wide a spectrum of people within your institution as you can. be ready to defend your staff and your projects in the current financial climate, and be forewarned. there is also strength in numbers in management, in local, regional, national, and international communities.
collaborations should be entered into, rather than competition, to further embed projects and people into the wider academic field. strategies and policies should be developed to deal with the coming hardships that face us. from an institutional point of view, building up a centralised record of all the individuals and projects involved in dh within an institution can facilitate new research, and build on existing strengths to make it clear where new research opportunities may lie. i would suggest that dh centres should integrate closely with library systems (and ischools). institutions can also support digital outputs as being research in the internal promotion of individual scholars. the establishment of teaching programs (such as the new ucl centre for digital humanities ma in digital humanities, http://www.ucl.ac.uk/dh/courses/mamsc) provides essential training for young scholars entering our field, and institutions should look to the opportunities which exist in providing this graduate-level training – which is sorely needed in our field. institutions can also encourage collaboration with other institutions, and provide facilities, for example, for visiting scholars, to encourage cross-fertilisation of teaching and research ideas. the adho organisations can also do plenty to maintain and support research, teaching, and the dh community. our digital presence should be (and is being) sorted out as a matter of urgency. within those digital resources, adho and its constituent organisations should provide the community with the ammunition which is necessary to defend dh as a relevant, useful, successful research field. information about the successes of dh can be pushed, including projects and initiatives that have been important to both our and other communities.
the value and impact of dh can be documented and presented. a register of good projects can also be maintained. best practice in the running of projects and centres can be pushed, and advice given to those who need it in all matters dh. collaboration should be encouraged, and the associations should continue the work they are doing in supporting young scholars. if anyone has any further ideas, then please do contact the associations. they are there to help you. my suggestions for funding agencies are relatively succinct – i am not sure how much leeway they have in providing funds at the moment, although it is worth saying that certain funders (more than others) have been and are being very supportive to dh, and are engaged with and listening to our community. we need financial support, both to carry out blue-sky research, and to build dh infrastructures. funding agencies can also help with the sustainability of projects, and in wrapping up and archiving projects. they can aid, encourage, and facilitate collaboration, and graduate research. considering the large investment that has been made in dh, particularly over the past ten to fifteen years, it makes sense for them to continue supporting us to ensure our research comes to fruition, although we are all very aware of the changed financial academic world in which we live. . wrapping up this has been an honest tour of what dh means to me, and some of the issues which dh is presented with at the moment. it’s been necessarily negative in places. but i hope i have left you with the feeling that there is proactive activity which we, as individuals, departments, institutions, organisations, and agencies can take to further entrench ourselves in the humanities pantheon and to demonstrate that we really are indispensable to the humanities.
i don’t know what is going to happen with transcribe bentham, whether the project itself will be a success, whether the resulting transcriptions will be accepted by the historical community, or whether we’ll still have a funded project to talk about in a year’s time, but for me it is part of the learning curve to distil and understand how our current research aims fit into the current academic framework. one thing i do know, is that jeremy bentham would have loved the fact that a picture of his manky embalmed head was being broadcast on a giant screen at king’s college london (especially when involved with a speech that raises issues about kcl!). i’ve really enjoyed having the chance to talk to you about my thoughts about transcribe bentham, and the digital humanities in general. thank you for listening in person, and see you on twitter, and in the panopticon.
true s-cones are concentrated in the ventral mouse retina and wired for color detection in the upper visual field francisco m. nadal-nicolás , *, vincent p. kunze , †, john m. ball , †, brian t. peng , †, akshay krisnan , †, gaohui zhou , †, lijin dong , wei li , *. retinal neurophysiology section, national eye institute, national institutes of health, bethesda, maryland, usa. genetic engineering facility, national eye institute, national institutes of health, bethesda, maryland, usa. †equal contribution *corresponding authors abstract color, an important visual cue for survival, is encoded by comparing signals from photoreceptors with different spectral sensitivities. the mouse retina expresses a short wavelength-sensitive and a middle/long wavelength-sensitive opsin (s- and m-opsin), forming opposing, overlapping gradients along the dorsal-ventral axis. here, we analyzed the distribution of all cone types across the entire retina for two commonly used mouse strains. we found, unexpectedly, that “true s-cones” (s-opsin only) are highly concentrated (up to % of cones) in ventral retina. moreover, s-cone bipolar cells (scbcs) are also skewed towards ventral retina, with wiring patterns matching the distribution of true s-cones. in addition, true s-cones in the ventral retina form clusters, which may augment synaptic input to scbcs.
such a unique true s-cone and scbc connecting pattern forms a basis for mouse color vision, likely reflecting evolutionary adaption to enhance color coding for the upper visual field suitable for mice’s habitat and behavior. keywords genuine s-cone, cone distribution, cone cluster, mammalian photoreceptor, s-cone bipolar cells, blue bipolar cells, color vision. . introduction topographic representation of the visual world in the brain originates from the light-sensitive photoreceptors in the retina (rhim et al., ). although the neuronal architecture of the retina is similar among different vertebrates, the numbers and distributions of photoreceptors vary considerably (hunt and peichl, ). such patterns have been evolutionarily selected, adapting to the animal’s unique behavior (diurnal or nocturnal) and lifestyle (prey or predator) for better use of the visual information in the natural environment (dominy and lucas, ; gerl and morris, ; peichl, ). color, an important visual cue for survival, is encoded by comparing signals carried by photoreceptors with different spectral preferences (baden and osorio, ). while amongst mammals, trichromatic color vision is privileged for some primates (jacobs et al., ; nathans et al., ; yokoyama and yokoyama, ), most terrestrial mammals are dichromatic (marshak and mills, ; puller and haverkamp, ; jacobs, ). the mouse retina expresses two types of cone opsins, s- and m-opsin, with peak sensitivities at nm and nm, respectively (jacobs et al., ; nikonov et al., ). the expression patterns of these two opsins form opposing and overlapping gradients along the dorsal-ventral axis, resulting in a majority of cones expressing both opsins (herein either “mixed cones” or m + s + ) (applebury et al., ; ng et al., ; wang et al., ). 
thus, s-opsin enrichment in the ventral retina better detects short-wavelength light from the sky, and m-opsin in the dorsal retina perceives the ground (e.g., a grassy field) (baden et al., ; gouras and ekesten, ; osorio and vorobyev, ; szél et al., ), while co-expression of both opsins (herein either mixed cones or m + s + ) (röhlich et al., ) broadens the spectral range of individual cones and improves perception under varying conditions of ambient light (chang et al., ). this unusual opsin expression pattern poses a challenge for color-coding, particularly so for mixed cones. however, it has been discovered that a small population of cones only expresses s-opsin (“true s-cones”, or s + m - ). these true s-cones are thought to be evenly distributed across the retina (franke et al., ; haverkamp et al., ; szatko et al., ; wang et al., ) and to be critical for encoding color, especially in the dorsal retina where they are quasi-evenly distributed in a sea of cones expressing only m-opsin (m + s - ), a pattern akin to mammalian retinas in general (haverkamp et al., ; wang et al., ). nonetheless, subsequent physiological studies revealed that color-opponent retinal ganglion cells (rgcs) are more abundant in the dorsal-ventral transition zone (chang et al., ) and the ventral retina (joesch and meister, ). recent large-scale two-photon imaging results further demonstrated that color-opponent cells were mostly located in the ventral retina (szatko et al., ). intriguingly, a behavior-based study demonstrated that the ability of mice to distinguish color is also restricted to the ventral retina (denman et al., ). these results prompted us to study, at the single-cell level and across the whole retina, the spatial distributions of cone types with different opsin expression configurations and, more importantly, their connections with s-cone bipolar cells, in order to better understand the anatomical base for the unique color-coding scheme of the mouse retina. .
results and discussion . . true s-cones are highly concentrated in the ventral retina of pigmented mouse. in mouse retina, the gradients of s- and m-opsin expression along the dorsal-ventral axis have been well documented (figure a-b) (applebury et al., ; calderone and jacobs, ; chang et al., ; haverkamp et al., ; jelcick et al., ; lyubarsky et al., ; ortín- martínez et al., ; szél et al., ; wang et al., ), but the distribution of individual cone types with different combinations of opsin expression across the whole retina has not been characterized (but see baden et al., ; eldred et al., , which we discuss below). we developed a highly reliable algorithm to automatically quantify the different opsins (s and m) and cone types (m + s - , true s, and mixed cones, figure , figure - figure supplement ) based on high-resolution images of entire flat-mount retinas immunolabeled with s- and m-opsin antibodies (figure - figure supplement ). as demonstrated in examples of opsin labeling from dorsal, medial, and ventral retinal areas of the pigmented mouse (figure b, left), while m opsin-expressing cones (m + : m + s + + m + s - ) were relatively evenly distributed across three regions, s opsin-expressing cones (s + : m + s + + s + m - ) showed considerable anisotropy, with a high density in the ventral retina and a precipitous drop in the dorsal retina, confirming previous observations (haverkamp et al., ; jelcick et al., ; ortín-martínez et al., ). surprisingly, instead of finding an even distribution of true s-cones as previously presumed (baden et al., ; haverkamp et al., ; wang et al., ), we found the ventral region had much more numerous true s-cones (~ % of the local cone population; figure c left, supplementary file a) than did the dorsal region (~ %). this result is evident from density maps of cone types from three examples of pigmented mice, showing highly concentrated true s-cones in the ventral retina (figure a, left column, bottom row). 
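the cone-type tally described above can be illustrated with a minimal sketch. it assumes each detected cone has already been reduced to two booleans (m-opsin positive, s-opsin positive); the function names, the type labels and the toy region are hypothetical and are not taken from the authors' imagej routine.

```python
from collections import Counter

def classify_cone(m_positive, s_positive):
    """Map the two opsin labels of one cone to a cone type (labels are hypothetical)."""
    if m_positive and s_positive:
        return "mixed"    # m + s +
    if s_positive:
        return "true_s"   # s + m -
    if m_positive:
        return "m_only"   # m + s -
    return None           # no opsin signal detected

def cone_type_percentages(cones):
    """cones: iterable of (m_positive, s_positive) booleans for one retinal region."""
    counts = Counter(classify_cone(m, s) for m, s in cones)
    counts.pop(None, None)  # ignore objects without any opsin label
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}

# Hypothetical toy region: two mixed cones, one true s-cone, one m-only cone.
region = [(True, True), (True, True), (False, True), (True, False)]
print(cone_type_percentages(region))
```

running the same tally separately on dorsal, medial and ventral samples would give the kind of per-region percentages that the pie graphs summarize.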
in addition, m + s - -cones were concentrated in the dorsal retina, whereas mixed cones dominated the medial and ventral retina (figure c left and figure a, left column, th and th rows). . . despite the vast difference in s-opsin expression pattern, the distribution of true s-cones is strikingly similar between the pigmented and albino mouse. such a highly skewed distribution of true s-cones conflicts with the general notion that true s- cones only account for ~ % of cones and are evenly distributed across the mouse retina (baden et al., ; franke et al., ; haverkamp et al., ; szatko et al., ; wang et al., ); however, it is not unprecedented considering the diverse s-cone patterns seen in mammals (ahnelt et al., ; ahnelt and kolb, ; calderone et al., ; hendrickson et al., ; hendrickson and hicks, ; kryger et al., ; müller and peichl, ; nadal-nicolás et al., ; ortín-martínez et al., , ; peichl, ; schiviz et al., ; szél et al., ). therefore, we also examined an albino mouse line to determine whether this observation persists across different mouse strains. overall, albino retinas had slightly smaller cone populations (figure b, supplementary file b; ortín-martínez et al., ). interestingly, while m-opsin expressing cones had similar distributions in both strains, s-opsin expression extended well into the dorsal retina of the albino mouse, exhibiting a greatly reduced gradient of s-opsin expression toward the dorsal retina compared to that seen in pigmented mice (figure b-c, figure a second row; applebury et al., ; ortín-martínez et al., ). consequently, most cones in the dorsal retina were mixed cones, and m + s - cones were very sparse ( %, compared to % in pigmented mouse, figure c right, supplementary file a, figure a right). however, despite these differences, the percentage and distribution of true s-cones were remarkably conserved between strains. 
in both strains, true s-cones were extremely sparse in the dorsal retina ( %) but highly concentrated in the ventral retina ( % vs %, figure c and supplementary file a). notably, the density maps of true s-cones are nearly identical in both strains (figure a, bottom row). evaluating the distribution of the three main cone populations (mixed, m + s - , and true s-cone) in four retinal quadrants centered upon the optic nerve head reveals different profiles between pigmented and albino strains for mixed and m + s - cones (figure c). for example, in the dorsotemporal (dt) quadrant, we observed an increase of m + s - cones from the center to the periphery (green line) in pigmented mice, compared to a majority of mixed cones (gray line) in albino mice. however, true s-cone profiles (magenta lines) were similar between the two strains in all quadrants, except for a slightly increased density along the edge of the ventronasal (vn) quadrant in pigmented mice. a recent study successfully modeled cone opsin expression and type determination according to graded thyroid hormone signaling in a pigmented mouse strain (c bl/ ) (eldred et al., ). it would be interesting to see whether a different pattern of thyroid hormone and/or receptor distribution could recapitulate a similar true s-cone distribution with a very different form of s-opsin expression. . . s-cone bipolar cells exhibit a dorsal-ventral gradient with a higher density in the ventral retina. one major concern regarding cone classification based on opsin immunolabeling is that some s + m - cones may instead be mixed cones with low m-opsin expression (applebury et al., ; baden et al., ; nikonov et al., ; röhlich et al., ). even though similar cone-type distributions have been observed in mouse retina, it has been assumed that only a fraction of the s + m - cones are ‘true’ s-cones (baden et al., ; eldred et al., ).
out of caution, s + m - cones were only referred to as “anatomical” s-cones due to a lack of confirmation regarding their bipolar connections (baden et al., ). thus, both true s-cones and s-cone bipolar cells have been generally acknowledged to be evenly distributed across the retina (haverkamp et al., ; wang et al., ; baden et al., ; szatko et al., ; franke et al., ; eldred et al., ). in order to confirm the distribution of true s-cones, it is critical to uncover the distribution and dendritic contacts of s-cone bipolar cells (type , or scbcs). previously, scbcs have only been identified among other bipolar, amacrine and ganglion cells in a thy - clomeleon mouse line, rendering the quantification of their distribution across the entire retina impractical (haverkamp et al., ). we generated a copine -venus mouse line, in which scbcs are specifically marked (figure , supplementary file c), owing to the fact that cpne is an scbc-enriched gene (shekhar et al., ). in retinal sections, these venus + bipolar cells have axon terminals narrowly ramified in sub-lamina of ipl (figure a), closely resembling type bcs as identified in em reconstructions (behrens et al., ; stabio et al., a). in flat-mount view, these bipolar cells are often seen to extend long dendrites to reach true s-cones, bypassing other cone types (figure b-c). the majority of dendritic endings formed enlarged terminals beneath true s-cones pedicles (figure c-c’), but occasional slender “blind” endings were present (arrow in figure c-c”), which have been documented for s-cone bipolar cells in many species (haverkamp et al., ; herr et al., ; kouyama and marshak, ). unexpectedly, we found that the distribution of scbcs was also skewed toward vn retina, albeit with a shallower gradient (figure d-e). to examine the connections between true s- cones and scbcs, we immunolabeled s- and m-opsins in copine -venus mouse retinas. 
because m-opsin antibody signals did not label cone structures other than their outer segments, we first identified true s-cones at the outer segment level and then traced s-opsin labeling to their pedicles in the outer plexiform layer (opl), where they connect with scbcs (figure c, for more details see material and methods). although convergent as well as divergent connections were found between true s-cones and scbcs in both dorsal and ventral retina (see the source data), we noted different connectivity patterns. while in the dorsal retina, a single true s-cone connected to approximately scbcs ( . ± . , see material and methods), in the ventral retina, a single scbc contacted approximately true s-cones ( . ± . ; figure c, supplementary file ). these results agree well with the true s-cone to scbc ratios calculated from cell densities in the dt and vn retina. specifically, in the dorsal retina, the true s-cone to scbc ratio was approximately : . , compared to . : in the ventral retina (supplementary file ). accordingly, both data sets support the presence of a prevalent divergence of true s-cone to scbc connections in the dorsal retina, in comparison to a prominent convergence of contacts from true s-cones to scbcs in vn retina. critically, the specificity of wiring from true s-cones to scbcs also confirms the identity of true s-cones as revealed by opsin labeling and further supports the finding that true s-cones are highly concentrated in vn mouse retina. . . true s-cones in the ventral retina are not evenly distributed but form clusters. as demonstrated above, in the mouse retina, despite a large population of mixed cones, scbcs precisely connect with true s-cones, preserving this fundamental mammalian color circuitry motif (behrens et al., ; breuninger et al., ; haverkamp et al., ; mills et al., ). however, the increased density of scbcs in the ventral retina does not match that of true s-cones (compare figure d and figure a, last row).
thus, individual scbcs in the ventral retina may be required to develop more dendrites to maximize the number of contacts made with different s-cone terminals (supplementary file , graphs in figure c). intriguingly, we discovered in both strains that true s-cones in the ventral retina appeared to cluster together rather than forming an even distribution, as revealed by k-nearest neighbor analysis (figure a- b, supplementary file ). ideally, such true s-cone clustering may increase the availability of targets for individual scbcs in a reduced space. to quantify the spatial patterning of true s-cone populations (or their lack thereof), we compared the observed true s-cone distributions within -mm diameter vn and dt retinal samples to artificially generated alternative populations (figure c). to this end, we considered two extreme patterning rules: first, one in which the space between true s-cone locations was maximized within the set of actual locations for all cones, creating a relatively uniform (evenly “distributed”) mosaic of true s-cones. at the other extreme, cone identities were permuted randomly (“shuffled”) among observed cone locations (figure c). repetition of these algorithms generated distributions of patterning metrics for true s-cones (see below) that remain constrained by the observed cone locations and proportions of cone types for each - mm sample. to quantitatively compare the patterning of real true s-cone populations to their artificial counterparts, we first computed two measures of regularity for true s-cones: nearest neighbor and voronoi diagram regularity indices (nnri and vdri, respectively; reese and keeley, ; figure c-d); larger values of these metrics indicate smaller variability in the spacing between cones and thus more regular patterns. 
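as a rough illustration of the comparison just described, the sketch below computes a nearest neighbor regularity index (taken here as the mean divided by the standard deviation of nearest-neighbor distances, the usual definition of this index) and builds the "shuffled" control by drawing random subsets of the observed cone locations. the function names, the brute-force search, the number of repeats and the seed are assumptions for illustration only; the "distributed" control and the voronoi-based index are not reproduced here.

```python
import math
import random
import statistics

def nearest_neighbor_distances(points):
    """Distance from each (x, y) point to its closest other point (brute force)."""
    return [
        min(math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i)
        for i, (xi, yi) in enumerate(points)
    ]

def nnri(points):
    """Regularity index: mean / SD of nearest-neighbor distances (larger = more regular)."""
    d = nearest_neighbor_distances(points)
    sd = statistics.stdev(d)
    return math.inf if sd == 0 else statistics.mean(d) / sd

def shuffled_nnri(all_cones, n_true_s, n_repeats=200, seed=0):
    """NNRI values for random relabelings: n_true_s locations drawn from all cone locations."""
    rng = random.Random(seed)
    return [nnri(rng.sample(all_cones, n_true_s)) for _ in range(n_repeats)]
```

an observed nnri that falls inside (or below) the range returned by `shuffled_nnri` would indicate a mosaic no more regular than chance, which is the pattern the text goes on to report for true s-cones.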
interestingly, far from being regularly distributed, true s- cone placement was quite irregular and nearly indistinguishable from shuffled populations (including a slight trend toward regularity measures lower than random, which may indicate a tendency toward clustering, figure d; see reese, ). to further probe the possibility of true s-cone clustering, we computed the ratios of true s-cone neighbors for each cone (denoted here as the s-cone neighbor ratio [scnr]; see methods for the calculation of the scnr search radius for each retinal sample). intriguingly, scnrs were significantly larger for true s-cones than for other cone types, which were equal to expected ratios due to random chance— especially so in ventral retinas, further indicating a clustering of true s-cones in those areas (figure e). notably, a more extreme form of clustering of s-cones has been observed in the “wild” mouse (warwick et al., ) and with much lower densities in some felids (ahnelt et al., ). here, such clustering may reflect the mode of true s-cone development in the ventral retina, for example, by “clonal expansion” to achieve unusually high densities (bruhn and cepko, ; reese et al., ). it is tempting to speculate that it may also facilitate the wiring of true s-cones with sparsely distributed scbcs, which were not observed to cluster in the ventral retina (figure e). indeed, we observed examples of groups of true s-cones forming clusters whose pedicles in the opl were tightly congregated in a patch and contacted by a nearby scbc (figure f). . . enriched true s-cones in the ventral retina may provide an anatomical base for mouse color vision. despite being nocturnal and having a rod-dominated retina (carter-dawson and lavail, ; jeon et al., ), mice can detect color (denman et al., ; jacobs et al., ). 
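returning to the clustering analysis above, the s-cone neighbor ratio can be sketched as follows. this is a minimal illustration that assumes the ratio for a cone is the fraction of its neighbors within a search radius that are true s-cones; the radius value, the type labels and the toy coordinates are hypothetical, and the authors' per-sample radius calculation (described in their methods) is not reproduced.

```python
import math

def s_cone_neighbor_ratios(cones, radius):
    """cones: list of (x, y, cone_type). Returns [(cone_type, ratio_or_None), ...]."""
    results = []
    for i, (xi, yi, kind) in enumerate(cones):
        neighbors = [
            other_kind
            for j, (xj, yj, other_kind) in enumerate(cones)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius
        ]
        # Fraction of neighbors that are true s-cones; None if the cone has no neighbors.
        ratio = neighbors.count("true_s") / len(neighbors) if neighbors else None
        results.append((kind, ratio))
    return results

def mean_scnr(cones, radius, kind):
    """Average neighbor ratio over all cones of one type that have at least one neighbor."""
    values = [r for k, r in s_cone_neighbor_ratios(cones, radius)
              if k == kind and r is not None]
    return sum(values) / len(values) if values else None

# Hypothetical toy sample: three clustered true s-cones and four mixed cones far away.
toy = [(0, 0, "true_s"), (1, 0, "true_s"), (0, 1, "true_s"),
       (10, 0, "mixed"), (11, 0, "mixed"), (10, 1, "mixed"), (11, 1, "mixed")]
print(mean_scnr(toy, 2.0, "true_s"), mean_scnr(toy, 2.0, "mixed"))
```

under random placement the ratio for every cone type would be expected to sit near the overall proportion of true s-cones in the sample (3 of 7 in this toy case), so a markedly higher mean for true s-cones than for other types points to clustering.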
although it remains uncertain whether the source of long-wavelength sensitive signals for color opponency arises in rods or m-cones (baden and osorio, ; ekesten et al., ; ekesten and gouras, ; joesch and meister, ; reitner et al., ), it is clear that true s-cones provide short- wavelength signals for color discrimination. given the previously-held notion that true s-cones are evenly distributed across the retina (baden et al., ; franke et al., ; haverkamp et al., ; szatko et al., ; wang et al., ), whereas m + s - cones are concentrated in the dorsal retina of pigmented mouse, it is intuitive to speculate that color coding is prevalent in the dorsal retina. however, previous physiological and behavioral studies indicate that, although luminance detection can occur across the mouse retina, color discrimination is restricted to the ventral retina (breuninger et al., ; denman et al., ; szatko et al., ). thus, our discovery of high enrichment of true s-cones in the ventral retina provides a previously missed anatomical feature for mouse color vision that could help to re-interpret these results. from projections mapping true s-cone densities into visual space (figure -figure supplement ; sterratt et al., ), it is conceivable that high ventral true s-cone density will provide a much higher sensitivity of short-wavelength signals, thus facilitating color detection for the upper visual field. although the true s-cone signals carried by scbcs in the dorsal retina might not be significant for color detection, they could certainly participate in other functions, such as non-image forming vision, that are known to involve short-wavelength signals (altimus et al., ; doyle et al., ; patterson et al., ). interestingly, the overall true s-cone percentage in the mouse retina remains approximately % (figure b), and the average true s- cone to scbc ratio across the whole retina is about . 
: (supplementary file b-c), similar to what has been reported in other mammals (ahnelt et al., ; ahnelt and kolb, ; bumsted et al., ; bumsted and hendrickson, ; curcio et al., ; hendrickson and hicks, ; hunt and peichl, ; kryger et al., ; lukáts et al., ; müller and peichl, ; ortín- martínez et al., ; peichl et al., ; schiviz et al., ; shinozaki et al., ; szél et al., ). such a spatial rearrangement of true s-cones and scbcs likely reflects evolutionary adaption to enhance short-wavelength signaling and color coding for the upper visual field as best suited for the habitat and behavior of mice (baden et al., ). for example, it may facilitate aerial predator detection during daytime (yilmaz and meister, ). similarly, skewed s-cone arrangement has been reported for other terrestrial prey mammals (famiglietti and sharpe, ; juliusson et al., ; röhlich et al., ), while zebrafish possess a uv-enriched ventral retina that enhances their predation (zimmermann et al., ). in addition, we observed that the clustering of true s-cones in the ventral retina may allow several neighboring cones of the same type to converge onto the same scbc (figure f), which could potentially enhance signal- to-noise ratios for more accurate detection, as described recently in human fovea (schmidt et al., ). it is also remarkable that despite the very different s-opsin expression patterns in both mouse strains, the true s-cone population and distribution are strikingly similar between pigmented and albino mice, suggesting a common functional significance. . acknowledgements the authors would like to thank the nei animal care team, especially megan kopera and ashley yedlicka. . competing interests the authors declare no competing or financial interests. . funding this research was supported by intramural research program of the national eye institute, national institutes of health to wl. . figures figure . cone outer segments across retinal areas. 
immunodetection of m and s wavelength-sensitive opsins in retinal sections (a) and flat-mount retinas (b) in two mouse strains (pigmented and albino mice, left and right columns respectively). (c) retinal scheme of s-opsin expression used for image sampling to quantify and classify cones in three different retinal regions. pie graphs show the percentage of cones manually classified as m + s - (green), s + m - (true s, magenta) and m + s + (mixed, gray) based on the opsin expression in different retinal areas from four retinas per strain. black mouse: pigmented mouse strain (c bl ), white mouse: albino mouse strain (cd ).

figure . topography and total number of different opsins (m + , s + ) and cone-type populations in the whole mouse retina. (a) density maps depicting the distributions of different opsin-expressing cones (m + and s + ) and different cone populations classified anatomically as: all, m + s + (mixed), m + s - , s + m - (true s) cones in pigmented and albino mice (left and right side respectively). each column shows different cone populations from the same retina, and the number of quantified cones is shown at the bottom of each map. color scales are shown in the right panel of each row (from [purple] to , [dark red] for all cone types except to , cones/mm [dark red] for the true s-cones and m + s - -cone in the albino strain). retinal orientation depicted by d: dorsal, n: nasal, t: temporal, v: ventral. (b) histogram showing the mean ± standard deviation of different cone subtypes for eight retinas per strain (supplementary file b). the percentages of each cone subtype are indicated inside of each bar, where % indicates the total of the ‘all cones’ group. (c) opsin expression profile across the different retinal quadrants (retinal scheme, dt: dorsotemporal, dn: dorsonasal, vt: ventrotemporal, vn: ventronasal).
line graphs show the spatial profile of relative opsins expression (mixed [gray], m + s - [green], true s-cones [magenta]), where the sum of these three cone populations at a given distance from the optic nerve (on) head equals %. black mouse: pigmented mouse strain, white mouse: albino mouse strain.

figure -figure supplement . validation of automatic routine for cone outer segment quantification. (a) retinal photomontages for m- and s-opsin signal in the same pigmented retina (corresponding to the second column in figure a). the square depicts an area of interest selected (transition zone of s-opsin expression) to perform the automatic routine validation by comparing manual and automatic quantifications. the images processed by the automatic routine using imagej show the selection of positive objects from the corresponding original image. (b) x, y graph showing the linear correlation (pearson coefficient, r ) between manual and automatic quantifications. , m + and , s + cones were manually annotated while , m + and , s + cones were automatically identified in random images obtained from retinal photomontages. (c) all, mixed, m + s - - and true s-cone populations are extracted from the original m- and s-cone images. all-cones were quantified after overlapping m- and s-signals. mixed (m + s + ) cones were obtained by subtracting the background of the s-opsin image in the m-opsin one. m + s - cones for pigmented mice are obtained after subtracting the s-opsin signal from the m-opsin photomontage. finally, m + s - cones for albino and true s-cones (s + m - ), in both strains, are manually marked on the retinal photomontage (adobe photoshop cc). the b&w images show the processed image after automatic quantification. the number of quantified cones is shown at the bottom of each image. black mouse: pigmented mouse strain.

figure . s-cone bipolar cells (scbcs) in cpne -venus mouse retina.
(a) retinal cross section showing the characteristic morphology of scbcs (behrens et al., ; breuninger et al., ). (b) detailed view of the selective connectivity between venus + scbcs and true s-cone terminals (yellow arrows). note that scbcs avoid contacts with cone terminals lacking s-opsin expression (m + s - -cone pedicles, identified using cone arrestin), as well as a mixed cone pedicle, marked with an asterisk. on the contrary, the scbcs prefer to develop multiple contacts with the same true s-cone pedicle. (c) images from flat-mount retinas focused on the inner nuclear and outer plexiform layers (inl+opl) or in the photoreceptor outer segment (os) layer of the corresponding area. magnifications showing divergent and convergent connectivity patterns from true s-cone pedicles in dorsal and ventral retinal domains, respectively. in the dt retina, six venus + scbcs (cyan circles) contact a single true s-cone pedicle (magenta circle in dt), while one venus + scbc contacts at least four true s-cone pedicles in the vn retina (magenta circles in vn), which belong to cones possessing s + m - oss (yellow circles). connectivity between true s-cones and scbcs in dt and vn retina was assessed as the average number of true s-cone pedicles contacting a single scbc per retina (magenta plot) or the average number of scbcs contacting a single true s-cone pedicle per retina (cyan plot) (p< . , p< . , respectively; n= ). (c’) detailed view of a secondary scbc bifurcation independently contacting two true s-cone pedicles. (c”) detailed view of a “blind” scbc process. (d) density maps depicting the distributions of scbcs in cpne -venus mice. (d) venus + scbcs along the dt-vn axis from a flat-mount retina (corresponding to the white frame in d) showing the gradual increase of scbcs towards the vn retina where true s-cone density peaks (last row in figure a).
(e) demonstration of venus + scbc densities color-coded by the k-nearest neighbor algorithm according to the number of other venus + scbcs found within an m radius in two circular areas of interest (dt and vn). although venus + scbcs exhibit a sparse density without forming clusters (circular maps), they were significantly denser in vn retina (p< . ; n= ).

figure . clustering of true s-cones in the ventronasal (vn) retina. (a) retinal magnifications from flat-mount retinas demonstrating grouping of true s-cones in the vn area, where true s-cone density peaks. white dashed lines depict independent groups of true s-cones that are not commingled with mixed cones (m + s + , white outer segments in the merged image). (b) retinal scheme of true s-cones used for selecting two circular areas of interest along the dorsotemporal-ventronasal (dt-vn) axis. circular maps demonstrate true s-cone clustering in these regions. true s-cone locations are color-coded by the k-nearest neighbor algorithm according to the number of other true s-cones found within an m radius. (c-e) analytical comparisons of dt and vn populations of true s-cones to their simulated alternatives. (c) example real and simulated true s-cone populations and their quantification. images depict true s-cone locations (magenta dots) and boundaries of their voronoi cells (dashed lines) from original and example simulated (“distributed”, “shuffled”) cone populations. gray dots indicate the locations of other cone types. observed cone locations were used for all simulated populations; only their cone identities were changed. the annotated features are examples of those measurements used in the calculations presented in d-e. (d) comparison of sample regularity indices for one albino vn retinal sample to violin plots of those values observed for n= simulated cone populations.
note that average regularity indices for true s-cones were lower than those of shuffled populations, whereas those values lay between shuffled and distributed populations when all cones were considered. plots on the right show values for all actual retinal samples normalized using the mean and standard deviations of their simulated “shuffled” counterparts. the y-axis range corresponding to ± . standard deviations from the mean (i.e., that containing ~ % of shuffled samples) is highlighted in gray. (e) comparison of the real average scnr for the example in c-d to those values for its simulated counterparts. note that the average scnr for all cones in this sample was equal to that predicted by random chance (i.e., the ratio of true s-cones to all cones), which in turn was equal to the average for true s-cones for shuffled samples. in contrast, the real true s-cone scnr was higher. plot on the right shows true s-cone scnr values for all samples, normalized as described for d. (f) convergent connectivity from a true s-cone cluster to a single scbc in the vn retina. images of a true s-cone cluster, in a flat-mount retina, focused on the photoreceptor outer segment layer and the inner nuclear-outer plexiform layers (inl+opl). the upper left panel shows the numerical and colored identification of each true s-outer segment in the cluster (note that the number positions indicate the locations where outer segments contact the photoreceptor inner segment). each true s-cone pedicle belonging to this cluster is outlined and color coded (middle upper panel) and overlaid upon the scbc dendritic profile (right upper panel). to identify synaptic contacts between the scbc and the cone pedicles (maximum intensity projection, excluding the scbc soma, shown in lower left panel), we acquired orthogonal single plane views zooming into putative dendritic tips. an example for the contact with cone # is shown in lower middle panel, corresponding to the box area in lower left panel (f).
the lower right panel shows dendritic endings of this scbc (black) contacting the marked cones (# - ). it also contacts two additional cones outside of the field of view (# , ). dashed line depicts the soma of the scbc. dendrites from other scbcs are color coded for differentiation.

figure -figure supplement . reconstruction and mapping of true s-cone densities into visual space. representative left eye from a -month-old pigmented mouse (c ). (a) s-opsin antibody labeling; (b) true s-cone density contour lines separated by quintiles overlaid onto s-opsin labeling; (c) quintile heatmap contours of true s-cone density. the top two rows demonstrate the flat-mount retina with marks for edges and relaxing cuts, followed by its reconstruction into uncut retinal space with lines of latitude and longitude that have been projected onto the flat-mount. the bottom two rows show the reconstructed retina inverted into visual space using orthogonal and sinusoidal projections. for these views, eye orientation angles for elevation and azimuth of ° and °, respectively, have been used as in (sterratt et al., ). for orthogonal projections, the globe has been rotated forward by ° to emphasize the relationship of true s-cone densities to the upper pole of the visual field. s-opsin labeling is restricted to the upper visual field, but true s-cones are concentrated toward its lateral edges.

. references
ahnelt pk, fernández e, martinez o, bolea ja, kübber-heiss a. . irregular s-cone mosaics in felid retinas. spatial interaction with axonless horizontal cells, revealed by cross correlation. j opt soc am a opt image sci vis : – . doi: . /josaa. .
ahnelt pk, kolb h. . the mammalian photoreceptor mosaic-adaptive design. prog retin eye res : – . doi: . /s - ( ) -
ahnelt pk, schubert c, kübber-heiss a, schiviz a, anger e. . independent variation of retinal s and m cone photoreceptor topographies: a survey of four families of mammals. vis neurosci : – . doi: .
/s x altimus cm, güler ad, villa kl, mcneill ds, legates ta, hattar s. . rods-cones and melanopsin detect light and dark to modulate sleep independent of image formation. proc natl acad sci usa : – . doi: . /pnas. applebury ml, antoch mp, baxter lc, chun ll, falk jd, farhangfar f, kage k, krzystolik mg, lyass la, robbins jt. . the murine cone photoreceptor: a single cone type expresses both s and m opsins with retinal spatial patterning. neuron : – . doi: . /s - ( ) - baden t, euler t, berens p. . understanding the retinal basis of vision across species. nat rev neurosci : – . doi: . /s - - - baden t, osorio d. . the retinal basis of vertebrate color vision. annu rev vis sci : – . doi: . /annurev-vision- - baden t, schubert t, chang l, wei t, zaichuk m, wissinger b, euler t. . a tale of two retinal domains: near-optimal sampling of achromatic contrasts in natural scenes through asymmetric photoreceptor distribution. neuron : – . doi: . /j.neuron. . . behrens c, schubert t, haverkamp s, euler t, berens p. . connectivity map of bipolar cells and photoreceptors in the mouse retina. elife . doi: . /elife. breuninger t, puller c, haverkamp s, euler t. . chromatic bipolar cell pathways in the mouse retina. j neurosci : – . doi: . /jneurosci. - . bruhn sl, cepko cl. . development of the pattern of photoreceptors in the chick retina. j neurosci : – . bumsted k, hendrickson a. . distribution and development of short-wavelength cones differ between macaca monkey and human fovea. j comp neurol : – . bumsted k, jasoni c, szél a, hendrickson a. . spatial and temporal expression of cone opsins during monkey retinal development. j comp neurol : – . calderone jb, jacobs gh. . regional variations in the relative sensitivity to uv light in the mouse retina. vis neurosci : – . doi: . /s calderone jb, reese be, jacobs gh. . topography of photoreceptors and retinal ganglion cells in the spotted hyena (crocuta crocuta). brain behav evol : – . doi: . / carter-dawson ld, lavail mm. . 
rods and cones in the mouse retina. i. structural analysis using light and electron microscopy. j comp neurol : – . doi: . /cne. chang l, breuninger t, euler t. . chromatic coding from cone-type unselective circuits in the mouse retina. neuron : – . doi: . /j.neuron. . . curcio ca, allen ka, sloan kr, lerea cl, hurley jb, klock ib, milam ah. . distribution and morphology of human cone photoreceptors stained with anti-blue opsin. j comp neurol : – . doi: . /cne. denman dj, luviano ja, ollerenshaw dr, cross s, williams d, buice ma, olsen sr, reid rc. . mouse color and wavelength-specific luminance contrast sensitivity are non-uniform across visual space. elife . doi: . /elife. dominy nj, lucas pw. . ecological importance of trichromatic vision to primates. nature : – . doi: . / doyle se, yoshikawa t, hillson h, menaker m. . retinal pathways influence temporal niche. proc natl acad sci usa : – . doi: . /pnas. ekesten b, gouras p. . cone and rod inputs to murine retinal ganglion cells: evidence of cone opsin specific channels. vis neurosci : – . doi: . /s ekesten b, gouras p, yamamoto s. . cone inputs to murine retinal ganglion cells. vision res : – . doi: . /s - ( ) -x eldred kc, avelis c, jr rjj, roberts e. . modeling binary and graded cone cell fate patterning in the mouse retina. plos computational biology :e . doi: . /journal.pcbi. famiglietti ev, sharpe sj. . regional topography of rod and immunocytochemically characterized “blue” and “green” cone photoreceptors in rabbit retina. vis neurosci : – . doi: . /s franke k, maia chagas a, zhao z, zimmermann mj, bartel p, qiu y, szatko kp, baden t, euler t. . an arbitrary-spectrum spatial visual stimulator for vision research. elife . doi: . /elife. gerl ej, morris mr. . the causes and consequences of color vision. evo edu outreach : – . doi: . /s - - -x gouras p, ekesten b. . why do mice have ultra-violet vision? exp eye res : – . doi: . /j.exer. . . 
haverkamp s, wässle h, duebel j, kuner t, augustine gj, feng g, euler t. . the primordial, blue-cone color system of the mouse retina. j neurosci : – . doi: . /jneurosci. - . hendrickson a, djajadi hr, nakamura l, possin de, sajuthi d. . nocturnal tarsier retina has both short and long/medium-wavelength cones in an unusual topography. j comp neurol : – . doi: . / - ( ) : < ::aid-cne > . .co; - z hendrickson a, hicks d. . distribution and density of medium- and short-wavelength selective cones in the domestic pig retina. exp eye res : – . doi: . /exer. . herr s, klug k, sterling p, schein s. . inner s-cone bipolar cells provide all of the central elements for s cones in macaque retina. j comp neurol : – . doi: . /cne. hunt dm, peichl l. . s cones: evolution, retinal distribution, development, and spectral sensitivity. vis neurosci : – . doi: . /s jacobs gh. . the distribution and nature of colour vision among the mammals. biol rev camb philos soc : – . doi: . /j. - x. .tb .x jacobs gh, neitz j, deegan jf. . retinal receptors in rodents maximally sensitive to ultraviolet light. nature : – . doi: . / a jacobs gh, neitz m, deegan jf, neitz j. . trichromatic colour vision in new world monkeys. nature : – . doi: . / a jacobs gh, williams ga, fenwick ja. . influence of cone pigment coexpression on spectral sensitivity and color vision in the mouse. vision res : – . doi: . /j.visres. . . jelcick as, yuan y, leehy bd, cox lc, silveira ac, qiu f, schenk s, sachs aj, morrison ma, nystuen am, deangelis mm, haider nb. . genetic variations strongly influence phenotypic outcome in the mouse retina. plos one :e . doi: . /journal.pone. jeon cj, strettoi e, masland rh. . the major cell populations of the mouse retina. j neurosci : – . joesch m, meister m. . a neuronal circuit for colour vision based on rod-cone opponency. nature : – . doi: . /nature juliusson b, bergström a, röhlich p, ehinger b, van veen t, szél a. . complementary cone fields of the rabbit retina. 
invest ophthalmol vis sci : – . kouyama n, marshak dw. . bipolar cells specific for blue cones in the macaque retina. j neurosci : – . kryger z, galli-resta l, jacobs gh, reese be. . the topography of rod and cone photoreceptors in the retina of the ground squirrel. vis neurosci : – . doi: . /s lukáts a, szabó a, röhlich p, vígh b, szél a. . photopigment coexpression in mammals: comparative and developmental aspects. histol histopathol : – . doi: . /hh- . lyubarsky al, falsini b, pennesi me, valentini p, pugh en. . uv- and midwave-sensitive cone-driven retinal responses of the mouse: a possible phenotype for coexpression of cone photopigments. j neurosci : – . marshak dw, mills sl. . short-wavelength cone-opponent retinal ganglion cells in mammals. vis neurosci : – . doi: . /s x mills sl, tian l-m, hoshi h, whitaker cm, massey sc. . three distinct blue-green color pathways in a mammalian retina. j neurosci : – . doi: . /jneurosci. - . müller b, peichl l. . topography of cones and rods in the tree shrew retina. j comp neurol : – . doi: . /cne. nadal-nicolás fm, salinas-navarro m, jiménez-lópez m, sobrado-calvo p, villegas-pérez mp, vidal-sanz m, agudo-barriuso m. . displaced retinal ganglion cells in albino and pigmented rats. front neuroanat : . doi: . /fnana. . nadal-nicolás fm, vidal-sanz m, agudo-barriuso m. . the aging rat retina: from function to anatomy. neurobiol aging : – . doi: . /j.neurobiolaging. . . nathans j, thomas d, hogness ds. . molecular genetics of human color vision: the genes encoding blue, green, and red pigments. science : – . doi: . /science. ng l, hurley jb, dierks b, srinivas m, saltó c, vennström b, reh ta, forrest d. . a thyroid hormone receptor that is required for the development of green cone photoreceptors. nat genet : – . doi: . / nikonov ss, kholodenko r, lem j, pugh en. . physiological features of the s- and m-cone photoreceptors of wild-type mice from single-cell recordings. j gen physiol : – . doi: . /jgp. 
ortín-martínez a, jiménez-lópez m, nadal-nicolás fm, salinas-navarro m, alarcón-martínez l, sauvé y, villegas-pérez mp, vidal-sanz m, agudo-barriuso m. . automated quantification and topographical distribution of the whole population of s- and l-cones in adult albino and pigmented rats. invest ophthalmol vis sci : – . doi: . /iovs. - ortín-martínez a, nadal-nicolás fm, jiménez-lópez m, alburquerque-béjar jj, nieto-lópez l, garcía-ayuso d, villegas-pérez mp, vidal-sanz m, agudo-barriuso m. . number and distribution of mouse retinal cone photoreceptors: differences between an albino (swiss) and a pigmented (c /bl ) strain. plos one :e . doi: . /journal.pone. osorio d, vorobyev m. . photoreceptor spectral sensitivities in terrestrial animals: adaptations for luminance and colour vision. proc biol sci : – . doi: . /rspb. . patterson ss, kuchenbecker ja, anderson jr, neitz m, neitz j. . a color vision circuit for non-image-forming vision in the primate retina. curr biol. doi: . /j.cub. . . peichl l. . diversity of mammalian photoreceptor properties: adaptations to habitat and lifestyle? anat rec a discov mol cell evol biol : – . doi: . /ar.a. peichl l, künzle h, vogel p. . photoreceptor types and distributions in the retinae of insectivores. vis neurosci : – . doi: . /s puller c, haverkamp s. . bipolar cell pathways for color vision in non-primate dichromats. vis neurosci : – . doi: . /s reese be. . . - mosaics, tiling, and coverage by retinal neurons in: masland rh, albright td, albright td, masland rh, dallos p, oertel d, firestein s, beauchamp gk, catherine bushnell m, basbaum ai, kaas jh, gardner ep, editors. the senses: a comprehensive reference. new york: academic press. pp. – . doi: . /b - - . - reese be, keeley pw. . design principles and developmental mechanisms underlying retinal mosaics. biol rev camb philos soc : – . doi: . /brv. reese be, necessary bd, tam pp, faulkner-jones b, tan ss. . clonal expansion and cell dispersion in the developing mouse retina. 
eur j neurosci : – . doi: . /j. - . . .x reitner a, sharpe lt, zrenner e. . is colour vision possible with only rods and blue-sensitive cones? nature : – . doi: . / a rhim i, coello-reyes g, ko h-k, nauhaus i. . maps of cone opsin input to mouse v and higher visual areas. j neurophysiol : – . doi: . /jn. . rodieck rw. . the density recovery profile: a method for the analysis of points in the plane applicable to retinal studies. vis neurosci : – . doi: . /s x röhlich p, van veen t, szél a. . two different visual pigments in one retinal cone cell. neuron : – . doi: . / - ( ) - schiviz an, ruf t, kuebber-heiss a, schubert c, ahnelt pk. . retinal cone topography of artiodactyl mammals: influence of body height and habitat. j comp neurol : – . doi: . /cne. schmidt bp, boehm ae, tuten ws, roorda a. . spatial summation of individual cones in human color vision. plos one :e . doi: . /journal.pone. shekhar k, lapan sw, whitney ie, tran nm, macosko ez, kowalczyk m, adiconis x, levin jz, nemesh j, goldman m, mccarroll sa, cepko cl, regev a, sanes jr. . comprehensive classification of retinal bipolar neurons by single-cell transcriptomics. cell : - .e . doi: . /j.cell. . . shinozaki a, hosaka y, imagawa t, uehara m. . topography of ganglion cells and photoreceptors in the sheep retina. j comp neurol : – . doi: . /cne. stabio me, sabbah s, quattrochi le, ilardi mc, fogerson pm, leyrer ml, kim mt, kim i, schiel m, renna jm, briggman kl, berson dm. a. the m cell: a color-opponent intrinsically photosensitive retinal ganglion cell. neuron : - .e . doi: . /j.neuron. . . stabio me, sondereker kb, haghgou sd, day bl, chidsey b, sabbah s, renna jm. b. a novel map of the mouse eye for orienting retinal topography in anatomical space. j comp neurol : – . doi: . /cne. sterratt dc, lyngholm d, willshaw dj, thompson id. . standard anatomical and visual space for the mouse retina: computational reconstruction and transformation of flattened retinae with the retistruct package. 
plos comput biol :e . doi: . /journal.pcbi. szatko kp, korympidou mm, ran y, berens p, dalkara d, schubert t, euler t, franke k. . neural circuits in the mouse retina support color vision in the upper visual field. biorxiv . doi: . / szél a, diamantstein t, röhlich p. . identification of the blue-sensitive cones in the mammalian retina by anti-visual pigment antibody. j comp neurol : – . doi: . /cne. szél a, lukáts a, fekete t, szepessy z, röhlich p. . photoreceptor distribution in the retinas of subprimate mammals. j opt soc am a opt image sci vis : – . doi: . /josaa. . szél a, röhlich p, caffé ar, juliusson b, aguirre g, van veen t. . unique topographic separation of two spectral classes of cones in the mouse retina. j comp neurol : – . doi: . /cne. wang yv, weick m, demb jb. . spectral and temporal sensitivity of cone-mediated responses in mouse retinal ganglion cells. j neurosci : – . doi: . /jneurosci. - . warwick ra, kaushansky n, sarid n, golan a, rivlin-etzion m. . inhomogeneous encoding of the visual field in the mouse retina. curr biol : - .e . doi: . /j.cub. . . yang h, wang h, shivalila cs, cheng aw, shi l, jaenisch r. . one-step generation of mice carrying reporter and conditional alleles by crispr/cas-mediated genome engineering. cell : – . doi: . /j.cell. . . yilmaz m, meister m. . rapid innate defensive responses of mice to looming visual stimuli. curr biol : – . doi: . /j.cub. . . yokoyama s, yokoyama r. . molecular evolution of human visual pigment genes. mol biol evol : – . doi: . /oxfordjournals.molbev.a zimmermann mjy, nevala ne, yoshimatsu t, osorio d, nilsson d-e, berens p, baden t. . zebrafish differentially process color across visual space to match natural scenes. curr biol : - .e . doi: . /j.cub. . . . methods . . 
key resources

key resources table
reagent type (species) or resource | designation | source or reference | identifiers | additional information
strain, strain background (mus musculus, male) | c bl/ j mouse strain | jackson laboratory | cat# , rrid:imsr_jax: | pigmented mouse inbred strain
strain, strain background (mus musculus, male) | crl:cd- (icr) mouse strain | charles river | cat# , rrid:imsr_crl: | albino mouse strain
strain, strain background (mus musculus, male) | copine -venus mouse line | this paper | material and methods section . .
antibody | anti-opn sw (n- ) (goat polyclonal) | santa cruz biotechnology | cat#sc- , rrid:ab_ | if ( : )
antibody | anti-opsin red/green (rabbit polyclonal) | millipore/sigma | cat#ab , rrid:ab_ | if ( : )
antibody | anti-cone arrestin (rabbit polyclonal) | millipore/sigma | cat#ab , rrid:ab_ | if ( : )
antibody | anti-gfp (chicken polyclonal) | millipore/sigma | cat#ab , rrid:ab_ | if ( : )
antibody | anti-rabbit (donkey polyclonal) | jackson immunoresearch | cat# - - , rrid:ab_ | if ( : )
antibody | anti-rabbit cy (donkey polyclonal) | jackson immunoresearch | cat# - - , rrid:ab_ | if ( : )
antibody | anti-goat (donkey polyclonal) | jackson immunoresearch | cat# - - , rrid:ab_ | if ( : )
antibody | anti-goat cy (donkey polyclonal) | jackson immunoresearch | cat# - - , rrid:ab_ | if ( : )
antibody | anti-chicken (donkey polyclonal) | jackson immunoresearch | cat# - - , rrid:ab_ | if ( : )
sequence-based reagent | copine _grna_l( / ) | this paper | ’gagacatgactggtccaa ’
sequence-based reagent | copine _grna_r( / . ) | this paper | ’gcctcggagcgtagcgtcc ’
software, algorithm | zen | zeiss | zen lite black edition . sp
software, algorithm | fiji-imagej | nih | v . r | https://imagej.nih.gov/ij/
software, algorithm | sigma plot | systat software | .
software, algorithm | graphpad prism | graph pad software | . .
software, algorithm | photoshop | adobe | cc . .
software, algorithm | matlab | mathworks
software, algorithm | r | the r project for statistical computing | . . | https://www.r-project.org/
software, algorithm | retina and visual space | retistruct package | sterratt dc et al., plos comput biol.
software, algorithm | zotero | corporation for digital scholarship | . | https://www.zotero.org/download/
other | dapi | thermofisher scientific | cat# d , rrid:ab_ | ( ug/ml)

. . lead contact and materials availability
further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, wei li (liwei @nei.nih.gov).

. . method details

. . . animal generation, handling and ethics statement
three-month-old male pigmented (c bl/ j, n= ) and albino (cd , n= ) mice were obtained from the national eye institute breeding colony. the venus-cpne mouse line (n= ; based on previous single cell sequencing data (shekhar et al., )) carries a reporter (venus) allele under the control of the mouse cpne locus. the reporter allele was created directly in b .sjl(f ) zygotes using crispr-mediated homologous recombination (hr) (yang et al., ). briefly, a hr targeting template was assembled with pcr fragments of ’ and ’ homology arms of bp and bp respectively, flanking exon one, and a venus expression cassette carrying the bovine growth hormone polyadenylation (bgh-polya) signal sequence as the terminator. homology arms were designed such that integration of the reporter cassette would be at the position right after the first codon of the cpne gene in exon one. a pair of guide rnas (grna), with outward orientation ( bp apart), were synthesized by in vitro transcription as described (yang et al., ) and tested for their efficiency and potential toxicity in a zygote differentiation assay where mouse fertilized eggs were electroporated with spcas protein and grna ribonuclear particles. eggs were cultured in vitro for days in ksom (origio inc, ct) until differentiated to blastocysts. viability and indel formation were counted respectively.
grna sequences are ( ) copine _grna_l( / ), ’gagacatgactggtccaa ’; ( ) copine _grna_r( / . ), ’gcctcggagcgtagcgtcc ’. a mixture of the targeting plasmid (supercoiled, ng/µl) with two tested grnas ( ng/µl each) and the spcas protein (life science technology, ng/µl) was microinjected into mouse fertilized eggs, which were transferred to pseudopregnant female recipients as described elsewhere (yang et al., ). with a total of f live births from pseudopregnant females, were found to carry the knockin allele by homologous recombination, a hr rate of %. f founders in b .sjl f ( % c bl genome) were crossed consecutively for generations with c bl /j mice to reach near congenic state to c bl /j. mice were housed on a : hour light/dark cycle. all experiments and animal care were conducted in accordance with protocols approved by the animal care and use committee of the national institutes of health and following the association for research in vision and ophthalmology guidelines for the use of animals in research.

. . . tissue collection
all animals were sacrificed with an overdose of co and perfused transcardially with saline followed by % paraformaldehyde. to preserve retinal orientation, eight retinas per mouse strain/line were dissected as flat whole-mounts by making four radial cuts (the deepest one in the dorsal pole previously marked with a burn signal as described (nadal-nicolás et al., ; stabio et al., b)). the two remaining retinas were cut in dorso-ventral orientation ( m) after cryoprotection in increasing gradients of sucrose (sigma-aldrich sl) and embedding in optimal cutting temperature (oct; sakura finetek).

. . . immunohistochemical labeling
immunodetection of flat-mounted retinas or retinal sections was carried out as previously described (nadal-nicolás et al., ). importantly, the retinal pigmented epithelium was removed before the immunodetection. first, whole-retinas were permeabilized ( x ’) in pbs .
% triton x- (tx) and incubated with shaking overnight at room temperature with s-opsin ( : ) and m-opsin ( : ) or cone arrestin ( : ) primary antibodies diluted in blocking buffer ( % normal donkey serum). cpne -venus retinas were additionally incubated with an anti-gfp antibody ( : ) to enhance the original venus signal. retinas were washed in pbs . % tx before overnight incubation with the appropriate secondary antibodies ( : ). finally, retinas were thoroughly washed prior to mounting with photoreceptor side up on slides and covered with anti-fading solution. retinal sections were counterstained with dapi.

. . . image acquisition
retinal whole-mounts were imaged with a x objective using a lsm zeiss confocal microscope equipped with computer-driven motorized stage controlled by zen lite software (black edition, zeiss). m- and s-opsins were imaged together to allow the identification and quantification of different cone types. magnifications from flat mounts and retinal cross-sections (figure ) were taken from dorsal, medial and ventral areas using a x objective for opsin co-expression analysis. images from retinal cross-sections were acquired ~ . mm dorsally or ventrally from the optic disc.

. . . sampling and opsin co-expression measurement
in four retinas per strain, we acquired images from three x m samples ( x) per area of interest (dorsal, medial and ventral). these areas were selected according to the s-opsin gradient in wholemount retinas (see scheme in figure c). cone outer segments were manually classified as m + s - , true s- (s + m - ) or mixed (m + s + ) cones depending on their opsin expression. data representation was performed using graphpad prism . software.

. . .
image processing: manual and automated whole quantification

to characterize the distribution of the different cone photoreceptor types in the mouse retina, we developed and validated an automatic routine (imagej, nih) to identify outer segments, quantify their total number, and extract the location of each individual cone (figure - figure supplement a). briefly, maximum-projection images were background-subtracted and thresholded (background-noise mean value, . ± . % and . ± . % for s- and m-opsin respectively; the threshold was applied at . %) to create a binary mask that was then processed using watershed and despeckle filters to isolate individual cones and reduce noise. the “ d objects counter” plugin was applied to such images to count cones within fixed parameters (shape and size) and extract their xy coordinates for further analysis. this automation was validated by statistical comparison with manual counting performed by an experienced investigator (pearson correlation coefficient r = - % for m- and s-opsin, respectively; figure -figure supplement b).

to count cone subtypes, images were pre-processed with image processing software (adobe photoshop cc) to isolate the desired subtype and then manually marked using photoshop, or automatically counted using imagej as described above. total cone populations were determined by combining m- and s-opsin channels, while mixed m + s + cones were obtained by masking the m-opsin signal with the s-opsin channel. m + s - cones in pigmented mice were obtained by subtracting the s-opsin signal from the m-opsin photomontage. finally, m + s - cones (in albino samples), true s-cones (both strains) (figure -figure supplement c) and venus + scbcs (cpne -venus mouse line) were manually marked on the retinal photomontage (adobe photoshop cc).

. . . topographical distributions.
topographical distributions of cone population densities were calculated from cone locations identified in whole-mount retinas using image processing (see above). from these populations, isodensity maps were created using sigmaplot . (systat software). these maps are filled contour plots generated by assigning to each area of interest ( . x . m) a color code according to its cone density, ranging from (purple) to , cones/mm for all cone types except for true s-cones and m + s - -cone in the albino strain ( , cones/mm ), as represented in the last image of each row of figure a, or , scbcs/mm (figure d) within a -step color- scale. these calculations allow as well, the illustration of the number of cones at a given position from the on center. to analyze the relative opsin expression along the retinal surface, we have considered three cone populations (mixed, m + s - - and true s-cones) dividing the retina in four quadrants: dorsotemporal, dorsonasal, ventrotemporal and ventronasal (dt, dn, vt and vn respectively, scheme in figure c). the relative percentage of cone-types are represented in line graphs from four retinas/strain (sigmaplot . ). . . . scbc sampling and ‘true s-cone’ connectivity to characterize the connectivity of venus + s-cone bipolar cells (venus + scbcs) with true s-cone terminals, we acquired images from the same area ( x m, x) at two focal planes: first, we focused upon the inl+opl, then the corresponding photoreceptor outer segment (os) layer, respectively, for two areas of interest (dt and vn). to verify connectivity between venus + scbc dendrites and true s-cone pedicles in the opl, in addition to s-opsin immunodetection, we also labeled retinas using cone arrestin antibodies to discriminate mixed cone pedicles from true s- cone pedicles, because true s-cone pedicles contain either low or no cone arrestin (figure b, haverkamp et al., ). 
in other retinas, scbc contacts were verified by tracking each cell body from cone pedicles to their respective os to confirm s + m - opsin labeling (figure c). in five retinas (with s- and m-opsin double immunodetection), we analyzed the connectivity between venus + scbcs ( and for dt and vn respectively) and true s-cone pedicles ( and , dt and vn respectively). the number of synaptic contacts was assessed by tracking manually each scbc-branch from the cell body using the zen lite black visualization package (z- stack with m interval). multiple branch contacts in one true s-cone pedicle from a single scbc were considered a single contact and counted only once (figure b), while secondary bifurcations were considered as multiple contacts (figure c’). scbc-blind endings were not counted (figure c”). the average number of contacts per retina was used to calculate the dt and vn means (supplementary file and graphs in c). . . . clustering analysis. k-neighbor maps and variance analysis of voronoi dispersion. to assess the true s-cones and s-cone bipolar cell (scbc) clustering, we performed two comparable sets of analyses. first, we extracted two circular areas ( mm diameter) in the dt- vn axis at mm from the optic disc center (scheme in b). a k-nearest neighbor algorithm (nadal-nicolás et al., ) was used to map the number of neighboring true s-cones within a m radius of each true s-cone to a color-code in its retinal position (figure b). regularity indices were computed for each retinal sample using voronoi diagrams for cone positions as well as nearest neighbor distances (vdri and nnri, respectively (reese and keeley, ); figure c-e). nnris were computed as the ratio of the mean to the standard deviation for the distance from true s-cones to their nearest true s-cone neighbor. true s-cone neighbor ratios (scnr) were calculated for each retinal sample as the average proportion of true s-cones within a given radius for each cone. 
this search radius was calculated separately for each sample to correct for sample-to-sample variations in total density: this radius (r) was calculated as r = √ (a / (√ πn)), where a is the circular area of the mm diameter retinal sample and n is the total number of cones in that sample. for a highly regular cell mosaic containing n cells filling an area a, this calculation estimates the location of the first minimum in the density recovery profile (rodieck, ), providing the average radius of a circle centered upon a cone that will encompass its first tier of cone neighbors (but exclude the second tier) in an evenly distributed mosaic. to minimize edge effects from computations of nnri, vdri, scnr, those values for cones closer to the outer edge of the sample than the scnr search radius were discarded. to produce simulated cone mosaics for comparison with observed values, cone distributions with evenly “distributed” true s-cones were generated by first using a simple mutual repulsion simulation to maximize the distances between true s-cones, followed by assigning the nearest positions among all cone locations as being “true s”. “shuffled” populations of true s-cones were generated by permuting cone identities randomly among all cone locations, holding the proportion of true s-cones constant. voronoi diagrams, neighbor calculations, and mosaic generation and other computations were performed using matlab r b. . . . true s-cone cluster and scbc synaptic contacts evaluation to characterize the true s-cone cluster connectivity in the vn retina, retinal whole-mounts were imaged with a x objective, from the photoreceptor outer segments to the opl, in a z- stack image with . m interval. to visualize the true s-cone clustering and venus + scbc connectivity, we identified numerically, and color coded each true s-outer segment form a cluster. the corresponding true s-pedicles were identified by tracking the cell body from their s + m - oss. 
focusing on the outer plexiform layer (opl), each individual true s-cone pedicle -that form a cluster- was manually outlined and color coded accordingly. lastly, the scbc synaptic terminals, that belong to a single scbc, were identified by their specific contacts to the respective true s-cone pedicle (figure f). . . . retinal reconstruction and visuotopic projection retinal images were reconstructed and projected into visual space using r software v. . . for -bit microsoft windows using retistruct v. . . as in sterratt et al. ( ). reconstruction parameters from that study were used: namely, a rim angle of  (phi = ), and eye orientation angles of  (elevation) and  (azimuthal). for figure -figure supplement , true s-cone density contour lines and heatmaps were computed in matlab and overlaid onto flat- mount retina opsin labeling images using imagej prior to processing by retistruct. . . . statistical analysis statistical comparisons for the percentage of cones/retinal location, the total cone quantifications (supplementary file ) and the dt or vn true s-cones and venus + scbcs (supplementary file ) were carried out using graphpad prism v . for microsoft windows. data are presented as mean ± standard deviation. all data sets passed the d'agostino-pearson test for normality, and the comparisons between strains were performed with student’s t-test. for each mm retinal sample, vdri, nnri, and scnr values were normalized and compared to the distributions of “shuffled” cone populations. such comparisons were not performed against “distributed” populations, because in those populations, vdri and nnri values were consistently much higher—and scnr much lower—than in real samples (see figure d-e). the “shuffled” populations for each retinal region produced measurements that were well described by normal distributions (kolmogorov-smirnov test, matlab). 
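The normalization against "shuffled" populations described above can be sketched in a few lines. The example below is a minimal illustration in Python, not the authors' MATLAB code. It computes the nearest-neighbor regularity index (mean divided by standard deviation of nearest-neighbor distances), builds shuffled populations by drawing the true S-cone identities at random from all cone locations while holding their number constant, and expresses an observed value relative to the mean and standard deviation of the shuffled values. The function names, the use of the sample standard deviation, the handling of a zero standard deviation, and the omission of the edge correction are all assumptions made for illustration.

```python
import math
import random
import statistics


def nnri(points):
    """Nearest-neighbor regularity index: mean / SD of nearest-neighbor distances.

    `points` is a list of (x, y) tuples. The sample standard deviation is an
    assumption; the text only says "standard deviation".
    """
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(
            min(
                math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points)
                if j != i
            )
        )
    sd = statistics.stdev(nearest)
    if sd == 0:
        # Perfectly regular spacing; returning infinity is an illustrative choice.
        return math.inf
    return statistics.mean(nearest) / sd


def shuffled_nnri(all_cones, n_true_s, n_shuffles, rng):
    """NNRI of populations whose 'true S' identity is assigned at random
    among all cone locations, keeping the number of true S-cones fixed."""
    return [nnri(rng.sample(all_cones, n_true_s)) for _ in range(n_shuffles)]


def z_score(observed, shuffled_values):
    """Observed measure expressed in SD units of the shuffled distribution."""
    mu = statistics.mean(shuffled_values)
    sd = statistics.stdev(shuffled_values)
    return (observed - mu) / sd
```

A call might look like `z_score(nnri(true_s_positions), shuffled_nnri(all_cone_positions, len(true_s_positions), 1000, random.Random(0)))`, where both position lists are hypothetical inputs. Drawing a random subset of fixed size from all cone locations is equivalent to permuting cone identities while holding the proportion of true S-cones constant.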
thus, to allow comparisons across samples, we converted each measurement into a z-score using the mean and standard deviation of those measures from shuffled populations. one-tailed student’s t-tests were performed to compare the normalized measures to the distribution of “randomly shuffled” cone population measures, and significance was determined at the p< . level.

. supplementary material

supplementary file . (a) cone numbers in different retinal areas along the dorsoventral axis in pigmented and albino mice. three images/area (dorsal, medial and ventral) from four retinas/strain. different cone type quantifications are shown as average ± sd, corresponding to the percentages shown in fig c. the total number of cones analyzed per location and strain is shown in the last column. total number of cones (b) or s-cone bipolar cells (scbcs, c) in eight retinas/mouse strain or line (average ± sd, see also figure b). significant differences between strains p< . (*), p< . (**), p< . (***), p< . (****).

supplementary file . true s-cone terminals and cpne -venus+scbcs connectivity in dorsotemporal (dt) and ventronasal retina (vn). quantitative data are shown as mean ± sd from the average of five dt and vn retinal areas (figure c). significant differences between retinal areas, p< . (**), p< . (****).

supplementary file . numbers of true s-cones (a) and cpne -venus + scbcs (b) in dorsotemporal (dt) and ventronasal (vn) circular areas ( mm diameter, figs e and b). quantitative data are shown as average ± sd from eight retinas/strain or line. the mean of true s-cones and venus + scbcs in these circular areas was used to calculate the dt:vn and true s-cone:scbc (c) ratios. significant differences between strains p< . (*), p< . (***). true s-cones and scbcs were significantly different between dt and vn retina (p< . ).

september • the saa archaeological record

on ethics, sustainability, and open access in archaeology

eric c.
kansa, sarah whitcher kansa, and lynne goldstein

eric c. kansa is program director for open context, a web-based publishing venue. sarah whitcher kansa is executive director of the alexandria archive institute and editor of open context. lynne goldstein is professor of anthropology in the department of anthropology at michigan state university.

on january , , an internet activist, aaron swartz (figure ), took his own life at the age of . swartz faced severe criminal charges for attempting to mass-download scholarly articles. his tragic death still reverberates around a community of activists who value sharing of knowledge and a free and open internet. his death also places a spotlight on the ethics of academic publishing practices, requiring us to reexamine how we conduct and communicate archaeology.

the story of swartz’s death involves jstor (http://www.jstor.org). most archaeologists are familiar with the online journal repository, originally funded by the mellon foundation. in some ways, jstor is a resounding success, as it serves many scholars worldwide, including archaeologists. unlike many digital scholarly communications initiatives, jstor is also financially “sustainable.” it is held up as a model for how to do digital scholarship correctly. it serves a large community and does not have to come back year after year begging for more grant money.

jstor’s revenues come largely from subscriptions. if you do not have an affiliation with an institution that subscribes to jstor, you do not get access to the vast majority of its resources. in other words, jstor sustains itself by setting up a “paywall.” that paywall blocks some million attempts to access jstor every year.

swartz was allegedly caught attempting a mass download of some . million articles from the jstor repository via mit’s network. although jstor did not pursue charges, u.s. prosecutors indicted swartz for criminal hacking, and he faced to (!) years in federal prison. essentially, u.s.
prosecutors charged swartz with cyber-terrorism, all for downloading academic articles in a manner that did not damage mit’s network or jstor (according to expert witnesses involved in the case). according to swartz’s family, this legal hounding directly motivated his suicide.

this obviously is a tragic case, and another sad example of routine abuse of the legal system with regard to intellectual property and computer crime. jstor did not want to threaten swartz with decades of prison time for downloading articles. but, in the end, that did not matter. he still faced a draconian prison term, roughly equivalent to the punishment for second-degree murder, because he allegedly violated network “terms of service” contractual rules that jstor put into place around research materials.

the tragic case of aaron swartz highlights the ethical urgency of debating open access. however, before exploring this topic, we need to first introduce why open access (figure ) is even an issue. as defined by peter suber, one of the leaders of the open access movement, open access literature is “. . . digital, online, free of charge, and free of most copyright and licensing restrictions.” the world wide web makes open access feasible by dramatically reducing dissemination and copying costs. nevertheless, although the web makes the sharing of content very inexpensive, producing high-quality information, including peer-reviewed literature, remains expensive. debate over the direction of research publication revolves around how production, editing, and peer-review costs should be financed. the case of aaron swartz illustrates the high legal and ethical stakes involved in this debate.

paywall dangers and peer-review papers as gray literature

swartz’s case shows the dangers of our discipline’s normative practice of fee-based (paywall) access to scholarship.
paywalls, enforced by “terms of service” and strong copyright laws, create a legal context with punishments in excess of those dealt to human traffickers. as outrageous as that sounds, swartz’s case shows how our legal institutions treat violations of network terms of service and copyright more seriously than slavery. it is unlikely that the saa, the archaeological institute of america (aia), or the american anthropological association (aaa) would want to press felony charges or seek long prison terms if someone illegally downloaded a journal article from one of their servers. jstor did not press for prosecution either. nevertheless, swartz faced spending the rest of his life behind bars.

although many archaeologists would be saddened to hear of swartz’s case, they do not consider themselves to be “hackers.” writing a program to automatically download jstor articles lies well outside the skill sets and inclinations of most saa members. nevertheless, the legal implications of swartz’s case should still worry the saa and its membership.

archaeologists regularly lament the inaccessibility of crm research locked in “gray literature” reports, and a number of archaeologists have worked to make such reports more publicly available. yet, to anyone outside of journal paywalls, mainstream publications are as inaccessible as gray literature. many researchers, particularly our colleagues in public, crm, and contract archaeology or those struggling as adjunct faculty, either totally lack or regularly lose affiliations with institutions that subscribe to paywall resources such as jstor. many of these people beg logins from their friends and colleagues lucky enough to have access. similarly, file-sharing of copyright-protected articles is routine. email lists and other networks regularly see circulation of papers, all under legally dubious circumstances.
many of us have encountered these underground networks for sharing research but have not considered the risks associated with sharing research outside of official channels. to note that this situation endangers many saa members is no hyperbole.

figure . swartz participated in a number of digital civil liberties efforts, including the campaign to block sopa from becoming law (credit: wikimedia commons, creative commons attribution, share-alike license; source: http://en.wikipedia.org/wiki/file:aaronswartzpipa.jpg).

table . informative links

topics: jstor; aaron swartz’s criminal case and death; open access; law and “terms of service”; copyright’s expanding scope; business practices in scholarly publishing

links and references:
• the mellon foundation’s account of its role with jstor: http://www.mellon.org/news_publications/ publications/jstor-a-history
• jennifer howard, writing for the chronicle of higher education, on jstor’s access issues: http://chronicle.com/blogs/wiredcampus/jstor-tests-free-read-only-access-to-some-articles/
• jstor’s terms of service: http://www.jstor.org/page/info/about/policies/terms.jsp
• mit’s paper the tech on swartz’s indictment: http://tech.mit.edu/v /n /swartz.html
• lawrence lessig (ip attorney, faculty at harvard law school, friend of aaron swartz) on the criminal case and conduct of aaron swartz’s prosecutor: http://lessig.tumblr.com/post/ /prosecutor-as-bully
• nbc news report on aaron swartz’s family’s reaction to his suicide: http://usnews.nbcnews.com/_news/ / / / -family-of-aaron-swartz-government-officials-partly-to-blame-for-his-death
• alex stamos, cyber-crime expert and expert witness on the swartz case: http://unhandled.com/ / / / the-truth-about-aaron-swartzs-crime/
• ian millhiser on comparing the punishments swartz faced with punishments for violent crimes: http://www.alternet.org/ -awful-crimes-get-you-less-prison-time-what-aaron-swartz-faced
• a canonical definition of open access by peter suber: http://legacy.earlham.edu/~peters/fos/brief.htm
• special issue on open access in archaeology, edited by mark lake, published in the paywall journal world archaeology: http://www.tandfonline.com/toc/rwar / /
• office of science and technology policy (white house) open access memorandum: http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_ .pdf
• harvard university calls for open access: http://arstechnica.com/science/ / /harvard-library-advises-its-faculty-to-go-open-access/
• nature’s editor on open access: http://www.guardian.co.uk/science/ /jun/ /open-access-research-inevitable-nature-editor
• summary of a report issued by the aaa (available only to aaa members) on the lack of sustainability of open access in anthropology: http://aaanet.org/press/an/ /davis.html
• mixed reactions to the finch report (uk) open access recommendations: http://legacy.earlham.edu/~peters/fos/newsletter/ - - .htm#uk-ec
• the electronic frontier foundation (eff), a civil liberties group, on terms of service: https://www.eff.org/deeplinks/terms-of-% ab% use
• how copyright term extensions led to nothing entering the public domain in the us: http://web.law.duke.edu/cspd/publicdomainday
• mass litigation and massive fines for copyright infringement: https://www.eff.org/wp/riaa-v-people-five-years-later
• media industry lobbying in trade agreements: http://www.theatlantic.com/technology/archive/ / / why-an-international-trade-agreement-could-be-as-bad-as-sopa/ /
• a library legal assessment of sopa: http://www.librarycopyrightalliance.org/bm~doc/lca-sopa- nov .pdf
• georgia state fair use news and commentary: http://www.educause.edu/focus-areas-and-initiatives/policy-and-security/educause-policy/issues-and-positions/intellectual-property/georgia-state-copy
• boycott efforts protesting elsevier’s lobbying for copyright expansions and stiffer punishments: http://www.guardian.co.uk/science/ /feb/ /academics-boycott-publisher-elsevier
• cathy davidson (director of hastac, a digital humanities center) commenting on swartz’s death and academic publishing: http://hastac.org/blogs/cathy-davidson/ / / /tragedies-scholarly-publishing-
• guardian editorial on “academic publishers make rupert murdoch look like a socialist”: http://www.guardian.co.uk/commentisfree/ /aug/ /academic-publishers-murdoch-socialist

while table provides much more background information, below we highlight some of the legal risks involved in conventional publishing:
• sharing logins to gain access to university library systems can involve grave legal risks. it involves the same sort of violations of terms of service that made aaron swartz face up to years in prison. for instance, jstor’s terms of service (which swartz allegedly violated, according to his felony charges) specifically prohibit actions such as sharing logins.
• sharing papers (mainly in email, but also social networking sites) also carries risks, mainly in civil and not criminal law. this could change if the “stop online piracy act” (sopa) passes, making many more copyright crimes felonies. already, mass copyright lawsuits with financially ruinous penalties happen, even involving , people at a time, including children.
• law professor john tehranian ( ) published a study in which he calculated a jaw-dropping $ . billion (the “b” is no typo!) in potential copyright liability involved in routine academic research and instructional activities over the course of a single year.

copyright has expanded over the years into a more-or-less absolute and perpetual property right. in fact, no u.s. copyrighted works entered into the public domain last year. it is already illegal for even libraries to circumvent technological copyright protections (drm) to archive and preserve electronic books and other media.
worse, elsevier, the world’s leading commercial scientific publisher, lobbied in favor of sopa, a bill that would have made copyright infringement, even without a commercial motivation, a felony offense. that would have put many routine library activities, including preservation of the published archaeological record, at grave risk.

the evidence is clear that current intellectual property rules carry significant legal risks for everyone. it is worse for researchers at the margins of the profession who lack their own institutional logins. essentially, paywalls create a criminalized underclass of researchers who bend and break rules to participate in their professional community. it is a perverse travesty that we have relegated essential professional communications to a quasi-legal/illegal underground, when supposedly we are a community dedicated to advancing the public good through the creation of knowledge about the past. like it or not, this legal context shapes academic communication and shapes its ethics.

moving beyond narrow visions of sustainability

if you are lucky enough to have a stable university affiliation, copyright and terms-of-service issues may seem removed from your reality or daily concerns. university libraries typically insulate faculty from the escalating costs of publication because libraries, not faculty, do the purchasing (or, more truthfully, the renting, because once a library stops subscribing, access to back issues also ceases) of access to commercial and semi-commercial publication repositories. faculty members take access largely for granted. this mindset is pervasive. we are certain that many reading this paper may dismiss its concerns thinking: “if it ain’t broke, don’t fix it.” but publication is broken and desperately needs to be fixed.
recently, jason baird jackson ( ) asked: “last year, did you get paid nothing to work hard for a multinational corporation with reported revenues of over a billion dollars?” he was referring to scholars who write, edit, and perform peer-review services for journals managed by commercial giants like elsevier, springer, wiley blackwell, and taylor & francis. his question highlights some uncomfortable truths about scholarly communication. publicly sponsored research, often conducted by scholars at public universities, gets written up, edited, and reviewed without direct compensation, only to become commercial intellectual property that universities must purchase, at exorbitant and rapidly growing prices ( – x the rate of inflation), back from publishers (kansa ). the largest commercial science publishers routinely see profit margins in excess of percent, making the industry more profitable than the oil business. that profitability comes from dominating the circulation of knowledge, knowledge that is mainly created through public dollars in the form of grant-funded research.

intellectual property barriers and cost escalations certainly do not help small publishing houses specializing in books and monographs or society publishers such as the saa. the cost escalations of the big commercial science publishers translate to fewer funds to buy humanities and social science books and journals (davidson ). it is self-defeating for archaeology’s professional societies to fight open access, since they are simply helping to perpetuate cost-escalations in the areas of publishing in stem (science, technology, engineering and medicine) fields that university administrators prioritize over the humanities and social sciences. our professional societies need to consider this larger economic reality when determining their positions on open access (kansa ).
figure . the public library of science originally designed this open access logo and released it to the public domain using the creative commons zero public domain dedication (source: wikimedia commons: http://en.wikipedia.org/wiki/file:open_access_logo_plos_white.svg).

the negative impacts of publication cost increases and other dysfunctions diffuse widely, making them difficult for faculty to notice. but they do contribute to the general bloat of overhead and costs that leaves less room and money for teaching and research. in a recent case, cambridge university press, oxford university press, and sage publishers sued georgia state university (gsu) over e-reserves to curtail “fair use” (limitations in copyright law to allow research, instruction, critique, and free speech). the suit was dismissed, but at the cost of over $ million in legal fees to gsu. in dismissing the case and siding with gsu, the judge ordered the plaintiffs to cover the university’s legal fees. however, the publishers have filed appeals, and the legal bills continue to pile up as the matter remains unresolved. depending on the outcome of the appeals, gsu may still end up saddled with the high costs of defending “fair use” from legal challenge.

for perspective, $ million could have fully endowed a new professorial chair in archaeology, relieved some student debt, funded research, bought archaeological monographs, or supported new high-quality, peer-reviewed open access publishing venues. worse, the high costs and risks of defending against such charges, even when dismissed(!), create a chilling effect across all higher education institutions. fair use has a great deal of ambiguity, making it very risky to defend in court (lessig : ). although the publishers lost in the first round of this case, they still signaled to universities the threat of future litigation.
this will no doubt motivate university administrators to make it much more cumbersome and costly for faculty and students to exchange publications in instructional settings.

the georgia state case highlights the dangers of thinking too narrowly about “sustainability.” fixating on narrowly defined notions of sustainability leads to what economists call a collective action problem. each individual project or organization tries to survive, so each has a strong incentive to monetize its intellectual property (via paywalls and absolutism in copyright). damaging negative externalities (legal risks and costly barriers) are problems for others. thus, the current system pits the interests of professional societies and publishers against those of society members, students, adjuncts, crm researchers, libraries, funding agencies, and the public.

despite all of these dysfunctions, professional societies representing archaeology typically undermine or avoid open access and cling to paywalls. they do so out of fear, not malice. they worry that a loss of subscription revenue will make it impossible to support editing and peer-review activities essential to quality publication. they also worry that they will lose dues-paying members should publications be openly available. although these are valid concerns, open access financing models and years of experimentation teach us that the status quo of paywalls need not be an ugly necessity (suber :chapter ). some points:
• green open access. in a february , statement, the white house office of science and technology policy (ostp ) endorsed a “green” model of open access, where copies of peer-review publications become available in open access repositories after an embargo period. green open access enables publishers to profit on exclusive access for a limited time. unfortunately, most green models have a limitation in that even after articles are publicly available, they are still under “all rights reserved” copyright.
this can still inhibit reuse and make reuse costly and legally risky. routine reuse such as duplication of a previously published image for comparison as well as innovative approaches such as text-mining still can be stymied in costly and complex licensing and permissions negotiations (kansa ). nevertheless, green models will surely be an important step forward from the current status quo.
• article processing fees (gold open access). article processing fees, such as the $ , charged by the public library of science (plos) for immediate, free-of-charge, open access publishing in their peer-reviewed journals, represent the most widely known model. this level of expense is far too high for archaeology, a field struggling with limited grant budgets (which rarely include funds for publication). but smaller, author-side fees may be feasible by subsidizing publication costs from other sources. such subsidies may be an ethical necessity, since any fee to publish represents a barrier. publication fee structures may need to be modeled like saa membership dues, with different costs charged to researchers at different career stages and affiliations.
• membership fees. peerj, an important new commercial publisher, thinks it can make a profit by publishing free and open peer-review papers financed with only $ in membership dues levied on contributing authors per year. that price-point has greater feasibility than plos charges, but most archaeologists will still need subsidized support. the recently launched peerj is currently only focusing on biomedical sciences, so its application to archaeology needs more investigation.
• redirection and subsidies. the points above indicate that subsidies likely need to play an important role in financing open access. the most obvious source for such subsidies comes from redirecting library subscription income to directly underwrite open access publication costs.
the particle physics community recently made this shift: libraries, physics journals, and other stakeholders came together to form the scoap (http://scoap .org/) consortium that now directly finances open access publishing. this kind of approach needs to be explored for archaeology.
open access does not represent an unachievable, utopian position. on the contrary, open access is a viable strategy to overcome a myriad of dysfunctions plaguing academic publishing that range from extremism in copyright to price-gouging by monopolistic publishers. our professional societies serve their memberships poorly when they call into question the financial feasibility and sustainability of open access without addressing or even acknowledging the dysfunctions of the current publishing status quo.
even harvard university, the wealthiest academic institution in the world (and an organization not known for its radicalism), claims that it can no longer afford scholarly publications. harvard recently began urging its faculty to publish in open access venues. the editor of nature, one of the most prestigious titles in scientific publishing, also believes a shift to open access is inevitable.
despite all of these recent developments, the leadership of the saa, aia, and aaa still cling to the notion that open access represents a threat to sustainability. we disagree, and argue that paywall business models show signs of deep trouble when they motivate suing customers (as illustrated in the gsu case) and out-pricing even the deepest pockets of harvard university.
leadership and the public interest
the authors of this paper are all very well aware of the costs of publication. two of us (kansa and whitcher kansa) run open context (http://opencontext.org), an open access publisher of archaeological data.
developing and undertaking editorial and peer-review processes to publish higher-quality archaeological data provides us with a good understanding of the real labor costs involved in publication. goldstein, a former editor of american antiquity, one of the flagship journals in archaeology, also has substantial experience and expertise with the cost and effort required in publication.
however, publication costs make up only a fraction of the larger costs of doing quality archaeological research. good research requires costly training, facilities, access to remote locations, workers, labs, equipment, curation, and site conservation. often, these costs are financed through public funding or publicly regulated philanthropy. but even in cases where research costs are not directly financed through public support, field work is a highly regulated activity. archaeological field work is almost universally governed or, in the case of cultural resource management (crm), mandated by public laws and agencies.
in other words, there is a clear public interest in archaeological research, and it extends to the dissemination of that research. in recognition of this, archaeological dissemination practices should work toward building public knowledge goods. the open access critique of status quo publishing is that conventional publishing models subvert the public interest and do not produce public goods, despite public mandates and financing of research. conventional publishing channels public support of research exclusively into private hands.
archaeology’s professional societies should not be dragged kicking and screaming toward sharing the outcomes of their publicly supported and publicly regulated research. if archaeology does not seriously engage with the open access issue, it runs the risk of alienating many of its stakeholders.
granting agencies, libraries, universities, the press, and members of the public all increasingly recognize the costs and other dysfunctions of paywalls and restrictive intellectual property barriers in scholarship. we see clear evidence for increasing pressure and momentum for change. in response to a white house petition, signed by over , people, the obama administration recently announced a new policy directing all federal granting agencies to require open access to peer-review papers within a year of publication. in addition, the “fair access to science and technology research act” (fastr) is now working its way through congress, and there are similar bills pending in the legislatures of three states.
archaeology needs to better acknowledge and adapt to this new reality (lake ). not all models of open access would necessarily be in the interests of the saa or its members. for example, the uk government-sponsored finch report recommended some open access policies that may be very difficult to apply in the cash-strapped humanities and social sciences. that is why the saa and other professional societies representing archaeologists need to step ahead of the issue. instead of entrenching themselves in a moribund and unsustainable status quo of paywalls, archaeological societies should take the opportunity to help shape policies that implement open access in a manner that promotes and does not diminish our discipline.
recommendations for the saa
the transition to open access will not be easy, but it is necessary. fortunately, the saa can take some immediate, if incremental, steps to address the challenge of improving publication and making it more equitable. below we list some recommendations to guide the saa to better align the communication of archaeological research to the saa’s own ethical principles, many of which read like a call for open access:
• gain experience with open access.
the saa needs to better understand the opportunities and costs associated with open access. it needs to experiment and learn exactly how to run a sustainable peer-reviewed open access publishing service. this experience will give the saa the needed understanding to better articulate policy recommendations to our financial backers. the saa need not do this alone. it can partner with other societies, university library groups developing scholarly communications infrastructure, or other commercial or nonprofit open access publishers.
• refrain from lobbying against or weakening open access. both the aaa and the aia joined with monopolistic publishers like elsevier in lobbying against open access (kansa ). these actions debase these scholarly societies and put them into the camp of commercial giants that promote oppressive intellectual property laws that further commoditize knowledge; harm research, teaching, and free expression; and endanger their own memberships.
• seek legal protections for researchers, students, and the public. the saa also can make a public statement calling for a more equitable and just balance in computer-security and copyright law and in the interpretation of such laws with regard to scholarly works. legal frameworks governing publication need to better reflect our values and protect researchers, instructors, students, and other members of the public in accessing and using published research.
• encourage quality and prestige in open access archaeology. even if the saa does not launch its own open access titles, the saa leadership should encourage greater professionalism and professional recognition for open access. the saa should encourage senior scholars to join editorial boards of open access journals and should provide peer-review and other services to open access titles to increase their prestige, acceptance, and quality.
• publicly endorse open access as a goal to work toward.
the saa can issue a public statement that open access represents a goal for the organization, even if it is currently not financially feasible. the saa needs to investigate funding and organizational requirements to sustain quality open access publishing and make it a goal to build the public support and financial resources needed to adopt publication models that better promote the common good of public knowledge. in other words, if we cannot finance open access with currently available funding, the saa needs to make sustainable open access to peer-reviewed publications the goal of future fundraising and public policy campaigns.
conclusions
media industries have pushed for “the best laws money can buy” (samuelson ) and have pressured legislative bodies and law enforcement agencies to enact stricter controls, more intrusive surveillance, and harsher (actually oppressive) punishments for copying. unfortunately, these laws not only apply to popular music and movies, they also govern scholarly communications. this transformed legal context, together with massive industry consolidation, makes conventional research publishing very different from, and much more costly and legally dangerous than, publishing in the pre-internet era.
something is obviously very wrong when a majority of researchers lack the means to legally participate in their own discipline’s communications and when university presses sue universities over e-reserves. the current situation works to nobody’s interest, except for large conglomerates such as elsevier. professional societies need to openly acknowledge the costs and draconian legal risks of our current publishing model. if open access is too difficult to finance with our existing resources, we need to clearly communicate to public and private granting foundations and other financial supporters the damage done by privatizing research and under-investing in the public good.
pretending that all is fine and well with our current paywall approach to publishing will do nothing to solve our discipline’s finance problems. it will only perpetuate a crippling and dysfunctional publishing system that enriches monopolistic media conglomerates, commoditizes our intellectual outputs, and deprives us of intellectual freedom in teaching and research. open access offers a publishing model far better aligned to our needs and values. it is time for our discipline to work toward open access and a more optimistic and equitable vision for archaeology in the twenty-first century.
references cited
davidson, cathy the tragedies of scholarly publishing in . hastac. humanities, arts, science and technology advanced collaboratory. electronic document, http://hastac.org/blogs/cathy-davidson/ / / /tragedies-scholarly-publishing- , accessed april , .
jackson, jason baird getting yourself out of the business in five easy steps. in hacking the academy, edited by dan cohen and tom scheinfeldt. university of michigan. electronic document, http://www.digitalculture.org/hacking-the-academy/hacking-scholarship/#scholarship-jackson, accessed april , .
kansa, eric openness and archaeology’s information ecosystem. world archaeology ( ): – . doi: http://dx.doi.org/ . / . . (open access preprint: http://alexandriaarchive.org/blog/wp-content/uploads/ /kansa-open-archaeology-self-archive-draft.pdf).
lake, mark open archaeology. world archaeology ( ): – . doi: http://dx.doi.org/ . / . . .
lessig, lawrence free culture: how big media uses technology and the law to lock down culture and control creativity. penguin, new york.
ostp increasing access to the results of federally funded scientific research. memorandum. office of science and technology policy (ostp), washington, dc. electronic document, http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_ .pdf, accessed april , .
samuelson, pamela should economics play a role in copyright law and policy? university of ottawa law & technology journal : ( – ). ssrn scholarly paper, id . electronic document, http://papers.ssrn.com/abstract= , accessed april , .
suber, peter open access. the mit press, cambridge, ma.
tehranian, john infringement nation: copyright reform and the law/norm gap. utah law review : ( ); loyola-la legal studies paper no. - ; university of utah legal studies paper no. - . ssrn scholarly paper, id : social science research network. electronic document, http://papers.ssrn.com/abstract= , accessed april , .
note
. gold open access is a more open approach, where peer-reviewed papers are immediately available, free-of-charge, under liberal licensing conditions.
dspace cover page
creating sub- nm nanofluidic junctions in pdms microfluidic chip via self-assembly process of colloidal particles
the mit faculty has made this article openly available.
citation: xi, wei, abeer syed, pan mao, jongyoon han, and yong-ak song. "creating sub- nm nanofluidic junctions in pdms microfluidic chip via self-assembly process of colloidal particles." jove: engineering : e ( ).
as published: http://dx.doi.org/ . /
publisher: myjove corporation
version: final published version
citable link: http://hdl.handle.net/ . /
terms of use: article is made available in accordance with the publisher's policy and may be subject to us copyright law. please refer to the publisher's site for terms of use.
journal of visualized experiments www.jove.com copyright © journal of visualized experiments march | | e | page of
video article
creating sub- nm nanofluidic junctions in pdms microfluidic chip via self-assembly process of colloidal particles
xi wei, abeer syed, pan mao, jongyoon han, yong-ak song
division of engineering, new york university abu dhabi (nyuad)
department of chemical and biomolecular engineering, new york university tandon school of engineering
newomics, inc.
department of electrical engineering and computer science, department of biological engineering, mit
correspondence to: yong-ak song at rafael.song@nyu.edu
url: http://www.jove.com/video/
doi: . /
keywords: engineering, issue , ion concentration polarization (icp), self-assembly process, silica colloids, nanofluidics, microfluidics, electrokinetic concentration
date published: / /
citation: wei, x., syed, a., mao, p., han, j., song, y.a. creating sub- nm nanofluidic junctions in pdms microfluidic chip via self-assembly process of colloidal particles. j. vis. exp. ( ), e , doi: . / ( ).
abstract
polydimethylsiloxane (pdms) is the prevailing building material to make microfluidic devices due to its ease of molding and bonding as well as its transparency. due to the softness of the pdms material, however, it is challenging to use pdms for building nanochannels. the channels tend to collapse easily during plasma bonding. in this paper, we present an evaporation-driven self-assembly method of silica colloidal nanoparticles to create nanofluidic junctions with sub- nm pores between two microchannels. the pore size as well as the surface charge of the nanofluidic junction is tunable simply by changing the colloidal silica bead size and surface functionalization outside of the assembled microfluidic device in a vial before the self-assembly process.
using the self-assembly of nanoparticles with a bead size of nm, nm, and nm, it was possible to fabricate a porous membrane with a pore size of ~ nm, ~ nm and ~ nm, respectively. under electrical potential, this nanoporous membrane initiated ion concentration polarization (icp) acting as a cation-selective membrane to concentrate dna by ~ , times within min. this non-lithographic nanofabrication process opens up a new opportunity to build a tunable nanofluidic junction for the study of nanoscale transport processes of ions and molecules inside a pdms microfluidic chip.
video link
the video component of this article can be found at http://www.jove.com/video/ /
introduction
nanofluidics is an emerging research area of µtas (micro total analysis systems) to study biological processes or transport phenomena of ions and molecules at the length scale of - nm. with the advent of nanofluidic tools such as nanochannels, transport processes of molecules and ions can be monitored with unprecedented precision and manipulated, if needed, by exploiting features that are available only at this length scale for separation and detection. one of these characteristic nanoscale features is a high ratio of surface to bulk charge (or dukhin number) in nanochannels that can cause a charge imbalance and initiate ion concentration polarization (icp) between the nanochannel and microchannel.
a common device platform for the study of nanofluidic phenomena consists of a two-microchannel system connected by an array of nanochannels as a junction. the material of choice for building such a nanofluidic device is silicon because of its high stiffness, which prevents the channel from collapsing during bonding processes. however, silicon device fabrication requires expensive masks and a substantial amount of processing in the cleanroom facility.
due to the convenience of device fabrication through molding and plasma bonding, polydimethylsiloxane (pdms) has widely been accepted as a building material for microfluidics and it would be an ideal material for nanofluidics as well. however, its low young's modulus, around - kpa, makes the pdms channel easily collapsible during plasma bonding. the minimum aspect ratio of the nanochannel (width to depth) has to be less than : which means that the fabrication of pdms devices via standard photolithography will become extremely challenging if the nanochannel depth has to be below nm, requiring a channel width less than the current limit of photolithography at around µm. to overcome this limitation, there have been attempts to create nanochannels in pdms using non-lithographical methods such as stretching to initiate cracks with mean depth of nm or to form wrinkles after plasma treatment. collapsing a pdms channel with mechanical pressure allowed a nanochannel height as low as nm. even though these highly inventive non-lithographic methods allowed building nanochannels below nm in depth, the dimensional controllability of the nanochannel fabrication still poses an obstacle to a wide acceptance of pdms as a building material for nanofluidic devices.
another critical problem of the nanochannels, whether in silicon or pdms, is the surface functionalization in case there is a need to alter the surface charge on the channel wall for the manipulation of ions or molecules. after device assembly through bonding, the nanochannels are extremely difficult to reach for surface functionalization due to the diffusion-limited transport. to create a nanoscale channel with high dimensional fidelity and facile surface functionalization, the self-assembly method of colloidal particles induced by evaporation in microfluidic devices can be one of the promising approaches. besides the controllability of pore size and surface property, there is even a possibility to tune the size of the pore in-situ when using colloidal particles coated with polyelectrolytes by controlling temperature, ph, and ionic strength. because of these advantages, the self-assembly method of colloidal particles has already found applications for electrochromatography, biosensors, protein concentration and separation of proteins and dna in microfluidics. in this study, we deployed this self-assembly method to build an electrokinetic preconcentration device in pdms that requires a nanofluidic junction between two microchannels. the fundamental mechanism behind the electrokinetic concentration is based on ion concentration polarization (icp). a detailed description of fabrication and assembly steps is included in the following protocol.
protocol
. preparation of the silica colloidal bead suspensions
. preparation of nm and nm silica bead suspensions
. vortex the silica bead stock suspension ( % w/v in water) for sec. to obtain a homogeneous suspension. pipette a total of µl stock suspension into a . ml tube and centrifuge it at , x g for min. . substitute the supernatant with µl of mm sodium phosphate buffer (pb, ph . ). . suspend the silica beads into a final concentration of % in mm sodium phosphate solution at ph . through vortexing.
. surface functionalize nm silica carboxyl beads with poly(allylamine hydrochloride) (pah) and poly(sodium styrene sulfonate) (pss) polyelectrolytes
. suspend . g of nm silica beads with carboxyl group with ml m nacl (ph . ) for % (w/v) bead suspension. . prepare .
% pah (mw k) in m nacl by dissolving µl of the stock solution ( % w/v in water) in ml of m nacl. prepare . % pss (mw k) in m nacl solution by dissolving . g pss in ml m nacl solution. vortex both solutions for min. to dissolve the polyelectrolytes completely. . add µl of pah solution to . ml of % silica carboxyl beads in a ml tube to deposit a positively charged polyelectrolyte layer on silica beads with carboxyl functional group. vortex the bead suspension for min. and incubate it on a tube rotator for min. at rt. . centrifuge the bead suspension at x g for min. and wash off the unbound pah five times with ml di water. after each centrifuge and removal of the supernatant, the beads were densely packed at the bottom of the tube. disrupt the bead clump by vigorous pipetting with ml of di water before adding ml of di water so that the beads can be re-suspended and washed off prior to the next centrifuge step. . follow the steps in . . and . . for pss coating to deposit a negatively charged layer on the beads. re-suspend the beads in . ml of m nacl prior to the pss deposition after removing the di water supernatant from the th washing step of . . . . use the same vigorous pipetting step using ml of m nacl to break up the bead clump at the bottom of the ml tube and then add ml of m nacl. add μl of pss solution to . ml of the silica beads deposited with a single pah layer. after vortexing for min. and incubation for min. on the tube rotator, repeat washing steps with di water. . measure the zeta potential of the beads before and after each polyelectrolyte coating using a dynamic light scattering system according to manufacturer's protocol to verify the polyelectrolyte deposition procedure has been performed correctly (see table ). . repeat five washing steps with di water following the single pss layer deposition and re-suspend the beads in µl of mm sodium phosphate buffer with . % tween ( % w/v) prior to use in the microfluidic device to enhance its flowability. . 
follow the procedure described above from . . to . . for nm silica beads with amine functional group to deposit a single layer of pss.
. fabrication of the pdms microfluidic chip
. microfabrication of the silicon master
. fabricate the silicon master for pdms molding using microfabrication techniques as follows. . spin coat a µm thin photoresist at , rpm on a silicon wafer. pattern the layer using projection lithography (exposure time msec.) and etch nm deep and µm wide planar nanochannels (acting as nanotraps for the silica beads) with reactive ion etching. . use the following etching parameters to achieve an etch rate of . nm/s: chf ( sccm), cf ( sccm), ar ( sccm), pressure mtorr, rf power w. . spin coat the second μm thick photoresist layer at , rpm and perform an alignment to the previously patterned nanotraps. pattern the microchannels via contact lithography and by deep reactive ion etching (drie) of silicon. use the drie parameters in table .
. fabrication of pdms mold
. silanize the silicon master with trichlorosilane ( μl) in a vacuum jar overnight. caution: trichlorosilane is a toxic and corrosive material. always use it in a chemical hood with proper personal protective equipment. . mix the base with the curing agent at a : ratio, cast pdms on the silanized silicon master and cure it at °c for hr in a convection oven. . remove the pdms slab from the silicon master with a knife and plasma bond it to a blank wafer after a plasma treatment in a plasma cleaner for min. attach tapes along the edge to mark a partition line for the following pdms casting step. . silanize the pdms mold in a vacuum jar with trichlorosilane ( μl) overnight. . cast pdms (base: curing agent at : ratio) on the silanized pdms mold and cure it at °c for hr in a convection oven.
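As a rough planning aid for the reactive ion etching step above, the etch time follows from the target depth divided by the etch rate. The sketch below assumes a constant etch rate over the whole etch; the function name and the numeric values are hypothetical placeholders, not the parameters used in this protocol.

```python
def etch_time_s(depth_nm: float, rate_nm_per_s: float) -> float:
    """Return the etch time in seconds for a target depth.

    Assumes a constant etch rate (an idealization; real rates drift
    with chamber conditioning and loading).
    """
    if depth_nm <= 0 or rate_nm_per_s <= 0:
        raise ValueError("depth and rate must be positive")
    return depth_nm / rate_nm_per_s


# Hypothetical example values, chosen only to illustrate the arithmetic.
target_depth_nm = 100.0
measured_rate_nm_per_s = 0.5
print(f"etch time: {etch_time_s(target_depth_nm, measured_rate_nm_per_s):.0f} s")
```

In practice the rate would be measured on a test wafer first and the depth confirmed with a surface profiler after etching.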
fabrication of the pdms device . peel off the cured pdms slab from the pdms mold along the partition line marked with the tape. . punch reservoir holes with . mm biopsy punch, clean with a tape, rinse with isopropyl alcohol (ipa) and dry with nitrogen. . plasma bond the pdms device on a mm x mm microscope glass slide after a plasma treatment in a plasma cleaner for min. . ultrasonicate the bead suspension for min. in an ultrasonic bath prior to filling. pipette a µl bead suspension ( nm non- functionalized silica beads, or nm silica carboxyl beads with pah-pss layers, or nm silica amine beads with a pss layer) into the inlets and each (see figure a, b) immediately after plasma bonding of the pdms chip to a glass substrate. tap gently on the pdms chip with a pipette tip to enhance the bead packing. . after filling the bead delivery channels, cover all the inlets except for and with tape. air-dry the device for hr and store at + °c prior to use. figure gives a step-by-step schematic of the colloidal self-assembly process. . experiment for electrokinetic concentration of dna . fill the reservoirs , with a buffer solution ( μl of mm pb) and reservoir with a dna sample ( μl of nm in mm pb) and apply a gentle negative pressure with an inverted pipette tip on reservoirs , and to fill the channels with the solutions without bubbles (see figure b). . add μl of mm pb to reservoirs and and μl of nm dna to reservoir to balance the pressure and wait for min. to reach equilibrium. . insert the pt wires into reservoirs , , , . . apply voltage across the nanofluidic junction using a voltage divider connected to a source meter and pt wires. first apply v on reservoirs , and gnd on reservoirs , . . decrease the voltage to v on reservoir after ~ sec. . use a mechanical shutter with a periodic opening in every s to minimize photobleaching of the sample when recording the fluorescence signals from the dna. 
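The voltage divider in the electrokinetic concentration steps above can be reasoned about with a simple series resistor chain. This is a minimal sketch, assuming the divider is effectively unloaded (the microchannel resistance is taken to be much larger than the divider resistances, which the protocol does not state). The function name, source voltage and resistor values are hypothetical.

```python
from typing import List


def tap_voltages(v_source: float, resistors_ohm: List[float]) -> List[float]:
    """Node voltages along a series resistor chain from source to ground.

    The first entry is the source node, the last entry is ground (0 V).
    Assumes negligible current is drawn at the taps.
    """
    if not resistors_ohm or any(r <= 0 for r in resistors_ohm):
        raise ValueError("need at least one positive resistance")
    total = sum(resistors_ohm)
    voltages = [v_source]
    remaining = total
    for r in resistors_ohm:
        remaining -= r  # resistance still between this node and ground
        voltages.append(v_source * remaining / total)
    return voltages


# Hypothetical values: one source feeding two reservoirs at different potentials.
nodes = tap_voltages(30.0, [1000.0, 5000.0])
print(nodes)  # source node, intermediate tap, ground
```

The difference between the source node and the intermediate tap is the voltage drop along the sample channel, while the grounded buffer reservoirs set the drop across the bead membrane.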
representative results
an electrokinetic concentrator chip in pdms that contains a self-assembled nanofluidic junction between two microchannels is shown in figure a. the channel in the middle of the device is filled with a dna sample solution and flanked by two buffer solution channels on each side via a µm wide bead delivery channel (figure b). the silica colloidal suspension is flowed into the bead delivery channel immediately after plasma bonding to create a nanofluidic junction between the sample and the buffer solution channel. the nanotrap array consisting of nm deep and μm wide nanochannels is used to trap the colloidal particles. its scanned image obtained with a non-contact surface profiler is shown in figure c. the colloidal bead membranes after evaporation are shown in figure d. the sem in figure e shows the silica beads trapped at the planar nanotrap array separating the sample channel from the bead delivery channel. the nm silica bead packing shows highly ordered hexagonal packing with some minor defects that could cause a variation in the concentration behavior (figure f). the design of the pdms concentrator chip with its dimensions can be found here (https://www.jove.com/files/ftp_upload/ /device_design_new.pptx) and in the supplemental files.
figure . microfluidic concentrator in pdms with an integrated sub- nm nanoporous junction. (a) photo of the pdms concentrator device. (b) schematic of the micro-nanofluidic device with a bead delivery channel between the sample and buffer solution channel. the voltage is applied across the bead membranes between the sample channel and the buffer solution channels. (c) surface profile of the nanotrap array in pdms with a width of μm and a depth of nm.
(d) micrograph of the device with a colloidal particle assembly inside the bead delivery channel after evaporation. (e) scanning electron micrograph of the self-assembled nm silica colloidal particles with the nanotrap arrays between the sample and buffer channel. the nm beads are trapped at the entrance of the nanotraps due to surface tension. (f) hexagonally packed nm silica beads inside the bead delivery microchannel after evaporation. (adapted from ref. with permission from the royal society of chemistry)
a schematic of the microfabrication steps for the pdms concentrator device is shown in figure . to make a pdms device, a double pdms casting is required. the bead filling process in the pdms concentrator is shown in figure . the details for the microfabrication and the filling process can be found in the protocol. the zeta potential of the silica beads without and with polyelectrolyte coating is shown in table .
figure . schematic of the fabrication process for the silicon master, the pdms master and the pdms concentrator device. after two photolithographic and etching steps, the silicon master is cast with pdms. after a double-molding, the pdms device is assembled via plasma bonding and filled with a bead suspension.
figure . step-by-step schematic for self-assembly of colloidal silica beads. µl of the bead suspension was pipetted into the bead delivery channels immediately after plasma treatment. once the bead delivery channel was filled, all but two inlets and were covered with tape and the devices were air-dried for hr prior to use. (reproduced from ref. 
with permission from the royal society of chemistry)
colloidal particles ( nm) | zeta potential (mv)
silica | - .
silica amine | .
silica carboxyl | - .
silica carboxyl, pah coated | .
silica carboxyl, pah, pss coated | - .
silica amine, pss coated | - .
table . zeta potential of silica beads at °c. . % (w/v) colloidal solutions were used for the measurements (n= ).
the sem images taken from the bead packing channel after drying out show pore sizes of about nm, nm and nm, as shown in figure . the pore size corresponds to approximately % of the bead size, nm, nm and nm, respectively ( % of the bead diameter is the theoretical pore size).
figure . sem images of self-assembled nm (a), nm (b) and nm (c) silica colloidal bead packing. pdms devices were reversibly bonded to glass slides and beads were flowed into the channel using negative pressure. after air-drying the devices overnight, the pdms devices were carefully peeled off the glass and imaged. the pore sizes were estimated to be ± , ± and ± nm for nm, nm and nm beads respectively (n= ). these pore sizes were close to the theoretical size, ~ % of the bead diameter. (adapted from ref. with permission from the royal society of chemistry)
when applying voltage of v across the nm bead membrane, an ion depletion zone was observed near the colloidal membrane inside a microchannel filled with a fluorescently labeled dna (figure a, b).
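The theoretical pore size quoted above can be related to the bead diameter by simple geometry. The sketch below assumes the theoretical figure refers to the triangular interstice between three mutually touching spheres in a hexagonally packed layer; since the percentage in the text did not survive extraction, this is an assumption and not confirmed by the source. For beads of diameter $d$, the centers form an equilateral triangle of side $d$, the centroid lies $d/\sqrt{3}$ from each center, and the largest circle fitting in the gap has diameter $d\,(2/\sqrt{3} - 1)$. The function names and the example bead size are hypothetical.

```python
import math


def pore_fraction() -> float:
    """Pore diameter as a fraction of bead diameter for three touching spheres.

    Centroid-to-center distance is d/sqrt(3); subtracting the bead radius d/2
    gives the pore radius, so the pore diameter is d * (2/sqrt(3) - 1).
    """
    return 2.0 / math.sqrt(3.0) - 1.0  # approximately 0.1547


def pore_diameter_nm(bead_diameter_nm: float) -> float:
    """Estimated pore diameter for an ideally hexagonally packed bead layer."""
    if bead_diameter_nm <= 0:
        raise ValueError("bead diameter must be positive")
    return bead_diameter_nm * pore_fraction()


# Hypothetical bead size, used only to illustrate the scaling.
print(f"fraction: {pore_fraction():.4f}")
print(f"pore for 100 nm beads: {pore_diameter_nm(100.0):.1f} nm")
```

Packing defects and bead size dispersion would widen the real pore size distribution, so measured values are expected to scatter around this idealized estimate.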
when the voltage was lowered to v on the left side, the dna molecules accumulated in the form of a plug and their concentration increased due to electroosmotic flow driven by a voltage difference of v- v across the sample channel (figure c, d).

figure . time-lapse micrographs showing the formation of an ion depletion region near the nanofluidic colloidal junctions in the channel filled with dna (initial concentration of nm). the ion depletion region was initiated at t= s and a concentrated dna plug was generated at v = v and v = v across the sample channel while the buffer channels were grounded. the dotted lines highlight the channel walls. a concentration factor of ~ , -fold was achieved within min using a nm colloidal membrane. (reproduced from ref. with permission from the royal society of chemistry)

the silica membranes with a bead size of nm and nm showed the highest concentration factor, ~ , -fold, for the cy tagged dna (caa ccg atg cca cat cat tag cta c) within min (figure a, b). the polyelectrolyte-coated silica bead membranes led to a - to , -fold increase in the dna concentration after min (figure c, d).

figure . fluorescence intensity of dna as a function of time for (a) nm silica beads, (b) nm silica beads, (c) nm pss-coated silica amine beads and (d) nm pah/pss-coated silica carboxyl beads. the dotted lines represent the fluorescence signal intensity level for nm (a, b, c, d), µm (a, b), µm (c) and µm (d) dna. the results have been normalized against background fluorescence. (reproduced from ref.
with permission from the royal society of chemistry)

process time | etch mode | passivation mode
process time | s | . s
overrun | . s | s
platen generator power | w | w
coil generator power | w | w
gas | sf sccm | c f sccm
etch rate | . µm/min |

table . drie parameters.

discussion

following the common device design scheme to study nanofluidics, we fabricated a nanofluidic junction between two microfluidic channels by using the evaporation-driven self-assembly of colloidal nanoparticles instead of lithographically patterning an array of nanochannels. when the colloidal particles were flowed into the bead delivery channel, an array of nanotraps with a depth of nm and a width of µm on both sides of the bead delivery channel, spanning a total width of μm, prevented the bead suspension from flowing into the buffer and sample channels due to the surface tension at the nanotraps. once trapped, the colloidal particles packed rapidly in the bead delivery channel and formed a nanoporous junction between the sample and buffer channels. it is important to load the bead suspension immediately after plasma bonding so that the capillary force drives the silica bead suspension up to the entrance of the outlet reservoirs in the temporarily hydrophilic bead delivery channel. to prevent an air bubble from blocking the flow in the inlet reservoir, it is highly recommended to reach the bottom of the reservoir with the pipette tip before releasing the bead suspension into the reservoir. the flowability of the beads surface-functionalized with polyelectrolytes was drastically reduced compared to the silica beads without surface functionalization; they tended to aggregate more easily and to adhere to the channel surface during the filling process. to prevent clogging of the channel with the polyelectrolyte-coated beads, we added a surfactant, . % tween , to the bead suspension.
in case there was still a clogging problem during filling, gentle tapping on the pdms chip with a pipette tip generally helped to resolve it. also, it is important that the bead suspension is not completely dried out after evaporation, since it would be difficult to infiltrate the bead membrane with the sodium phosphate buffer solution again. therefore, after hr of partial evaporation, all in- and outlets of the pdms device were taped and the device was kept at °c for storage prior to use so that the bead packing stayed moist. during the preconcentration experiments, the self-assembled beads maintained their structural stability for the most part. however, in a few instances, we observed a dislocation of the beads, which indicated defective packing of the beads in the microchannels.

the self-assembled silica beads, ranging in diameter from nm down to nm, can be seen in figure . the theoretical pore size of the bead packing was ~ nm, approximately % of the colloidal particle diameter. we confirmed the pore size using sem analysis and measured a pore size of approximately % of the bead diameter after packing.

using the self-assembled nm and nm colloidal particle membranes as an ion-selective nanoporous junction, we could initiate an ion depletion region at v and concentrate nm cy tagged dna (caa ccg atg cca cat cat tag cta c) in mm sodium phosphate buffer (figure ). by continuously flowing the dna sample towards the ion depletion zone with an electroosmotic flow at a voltage difference v -v = v, we could increase the initial dna concentration ~ , -fold within min (figure a, b). nm beads allowed more robust dna concentration than nm beads (figure b).
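the theoretical pore size quoted above can be illustrated with a short geometric calculation. the sketch below assumes that "pore size" means the narrowest throat of an ideal close-packed layer of equal spheres, i.e. the circle inscribed in the triangular gap between three mutually touching beads. whether the authors used exactly this definition is an assumption, and the function names and the example bead diameter are hypothetical, not taken from the protocol.

```python
import math


def throat_fraction() -> float:
    """Throat diameter divided by bead diameter for ideal close packing.

    Three touching spheres of radius r have centres on an equilateral
    triangle of side 2r. The centroid lies 2r/sqrt(3) from each centre,
    so the inscribed circle has radius 2r/sqrt(3) - r. Dividing its
    diameter by the bead diameter 2r gives 2/sqrt(3) - 1.
    """
    return 2.0 / math.sqrt(3.0) - 1.0


def throat_diameter_nm(bead_diameter_nm: float) -> float:
    """Narrowest pore throat (nm) for beads of the given diameter (nm)."""
    if bead_diameter_nm <= 0:
        raise ValueError("bead diameter must be positive")
    return throat_fraction() * bead_diameter_nm


if __name__ == "__main__":
    # Hypothetical bead diameter, chosen only to illustrate the scaling.
    example_diameter_nm = 100.0
    print(f"throat fraction: {throat_fraction():.4f}")
    print(f"throat for {example_diameter_nm:.0f} nm beads: "
          f"{throat_diameter_nm(example_diameter_nm):.2f} nm")
```

in this idealized geometry the throat is a fixed fraction of the bead diameter (about 15.5%), independent of bead size. real packings contain defects and size variation, so measured pore sizes can deviate from this value.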
since the electrokinetic concentration is based on a force balance between the electroosmotic force and the highly nonlinear electrophoretic forces, the resulting concentration factor is determined by the degree to which this force balance can be maintained during electrokinetic concentration.

another significant advantage of deploying the colloidal particles for building a nanofluidic junction is the ease with which their surface functionalization can be performed. instead of creating a nanochannel through bonding first and then performing a surface functionalization on it, we can simply functionalize the surface of the colloidal particles in a vial outside of the device and then flow them into the channel for self-assembly. based on this approach, we could initiate icp using the nm silica amine particles coated with a single layer of pss and the nm silica carboxyl particles coated with a layer of pah and pss (figure c and d) at lower voltages ( v and v, respectively) than with the colloidal particles without surface functionalization ( v). this result shows that the surface functionalization of the colloidal particles prior to the self-assembly was effective in increasing the surface charge of the colloidal particles and resulted in stronger icp. however, in terms of the concentration factor obtained, the nanofluidic junction of the surface-functionalized beads was less effective than that of the non-functionalized silica beads. the amine/pss-coated beads enabled a factor of ~ , while the carboxyl/pah/pss bead membrane showed a , -fold increase after min (figure d). this result can be explained by a higher surface charge of the surface-functionalized nanopores, which led to an increased length of the ion depletion region, pushing the sample concentration plug farther away from the bead membrane and therefore leading to a less stable concentration.
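the concentration factors discussed above are read from the fluorescence intensity relative to reference levels of known dna concentration (the dotted lines in the fluorescence-versus-time figure). a minimal sketch of that bookkeeping follows, assuming piecewise-linear interpolation between calibration levels. the interpolation scheme, the function names and all numbers are hypothetical placeholders, not methods or values from the study.

```python
from typing import List, Tuple

# Each calibration point is (known concentration, measured intensity),
# both in arbitrary but consistent units.
Calibration = List[Tuple[float, float]]


def estimate_concentration(intensity: float, calibration: Calibration) -> float:
    """Interpolate a concentration from a background-corrected intensity."""
    points = sorted(calibration, key=lambda p: p[1])  # sort by intensity
    if len(points) < 2:
        raise ValueError("need at least two calibration points")
    if not points[0][1] <= intensity <= points[-1][1]:
        raise ValueError("intensity outside the calibrated range")
    for (c0, i0), (c1, i1) in zip(points, points[1:]):
        if i0 <= intensity <= i1:
            if i1 == i0:
                return c0
            return c0 + (c1 - c0) * (intensity - i0) / (i1 - i0)
    raise AssertionError("unreachable: range was checked above")


def concentration_factor(intensity: float, calibration: Calibration,
                         initial_concentration: float) -> float:
    """Ratio of the estimated plug concentration to the starting concentration."""
    if initial_concentration <= 0:
        raise ValueError("initial concentration must be positive")
    return estimate_concentration(intensity, calibration) / initial_concentration


if __name__ == "__main__":
    # Hypothetical calibration levels and reading, for illustration only.
    levels = [(1.0, 10.0), (100.0, 50.0), (1000.0, 200.0)]
    print(concentration_factor(125.0, levels, initial_concentration=1.0))
```

linear interpolation is only one reasonable choice. if the fluorescence response is nonlinear over the range of interest, a fitted calibration curve would be more appropriate.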
we believe that shortening the total width of the nanoporous bead membrane from the current mm (the section of the bead membrane parallel to the sample channel) could mitigate this instability issue. according to our previous study, the width of the nanoporous junction determines the amount of ionic current passing through it. as the width increases, the ionic current increases and, since more cations can migrate through the membrane, the depletion length increases and the concentration plug is pushed further away from the nanoporous junction. therefore, the accumulation occurs in a less confined manner and the sample plug becomes less stable. empirically, the nanoporous junction should be ~ - µm in width. another feature to improve is the insufficient thickness ( µm) of the pdms wall between the sample channel and the bead delivery channel. this thin pdms section led to insufficient bonding that enabled an ionic current between the buffer and sample channels. therefore, the entire bead membrane section parallel to the sample channel ( mm in width) was acting as a nanoporous junction, even though only μm of the bead membrane was intended as a nanoporous junction according to the total width of the nanotrap array. the pdms wall thickness should be at least µm.

disclosures

the authors have nothing to disclose.

acknowledgements

this work was supported by nih r eb - a and new york university abu dhabi (nyuad) research enhancement fund . we express our thanks to the technical staff of mit mtl for their support during microfabrication and to james weston and nikolas giakoumidis of nyuad for their support in taking sem pictures and building a voltage divider, respectively. the device fabrication in pdms was conducted in the microfabrication core facility of nyuad. lastly, we would like to thank rebecca pittam from the nyuad center for digital scholarship for video shooting and editing.

references

. mawatari, k., kazoe, y., shimizu, h., pihosh, y., & kitamori, t.
extended-nanofluidics: fundamental technologies, unique liquid properties, and application in chemical and bio analysis methods and devices. anal chem. , - , ( ). . tsukahara, t., mawatari, k., & kitamori, t. integrated extended-nano chemical systems on a chip. chem soc rev. , - , ( ). . mani, a., zangle, t. a., & santiago, j. g. on the propagation of concentration polarization from microchannel-nanochannel interfaces part i: analytical model and characteristic analysis. langmuir. , - , ( ). . aizel, k. et al. enrichment of nanoparticles and bacteria using electroless and manual actuation modes of a bypass nanofluidic device. lab chip. , - , ( ). . wang, y. c., stevens, a. l., & han, j. million-fold preconcentration of proteins and peptides by nanofluidic filter. anal chem. , - , ( ). . karnik, r. et al. electrostatic control of ions and molecules in nanofluidic transistors. nano letters. , - , ( ). . mao, p., & han, j. y. fabrication and characterization of nm planar nanofluidic channels by glass-glass and glass-silicon bonding. lab chip. , - , ( ). . mao, p., & han, j. massively-parallel ultra-high-aspect-ratio nanochannels as mesoporous membranes. lab chip. , - , ( ). . balducci, a., mao, p., han, j. y., & doyle, p. s. double-stranded dna diffusion in slitlike nanochannels. macromolecules. , - , ( ). . yamada, m., mao, p., fu, j. p., & han, j. y. rapid quantification of disease-marker proteins using continuous-flow immunoseparation in a nanosieve fluidic device. anal chem. , - , ( ). . huh, d. et al. tuneable elastomeric nanochannels for nanofluidic manipulation. nat mater. , - , ( ). . chung, s., lee, j. h., moon, m. w., han, j., & kamm, r. d. non-lithographic wrinkle nanochannels for protein preconcentration. adv mater. , - , ( ). . park, s. m., huh, y. s., craighead, h.
g., & erickson, d. a method for nanofluidic device prototyping using elastomeric collapse. proc natl acad sci. , - , ( ). . zeng, y., & harrison, d. j. self-assembled colloidal arrays as three-dimensional nanofluidic sieves for separation of biomolecules on microchips. anal chem. , - , ( ). . malekpourkoupaei, a., kostiuk, l. w., & harrison, d. j. fabrication of binary opal lattices in microfluidic devices. chem mat. , - , ( ). . merlin, a., salmon, j.-b., & leng, j. microfluidic-assisted growth of colloidal crystals. soft matter. , - , ( ). . schepelina, o., & zharov, i. pnipaam-modified nanoporous colloidal films with positive and negative temperature gating. langmuir. , - , ( ). . schepelina, o., & zharov, i. poly( -(dimethylamino)ethyl methacrylate)-modified nanoporous colloidal films with ph and ion response. langmuir. , - , ( ). . smith, j. j., & zharov, i. ion transport in sulfonated nanoporous colloidal films. langmuir. , - , ( ). . gaspar, a., hernandez, l., stevens, s., & gomez, f. a. electrochromatography in microchips packed with conventional reversed-phase silica particles. electrophoresis. , - , ( ). . lee, s. y. et al. high-fidelity optofluidic on-chip sensors using well-defined gold nanowell crystals. anal chem. , - , ( ). . hu, y. l. et al. interconnected ordered nanoporous networks of colloidal crystals integrated on a microfluidic chip for highly efficient protein concentration. electrophoresis. , - , ( ). . zhang, d.-w. et al. microfabrication-free fused silica nanofluidic interface for on chip electrokinetic stacking of dna. microfluid nanofluid. , - , ( ). . syed, a., mangano, l., mao, p., han, j., & song, y. a. creating sub- nm nanofluidic junctions in a pdms microchip via self-assembly process of colloidal silica beads for electrokinetic concentration of biomolecules. lab chip. , - , ( ). . kim, s. j., song, y. a., & han, j. 
nanofluidic concentration devices for biomolecules utilizing ion concentration polarization: theory, fabrication, and applications. chem soc rev. , - , ( ). . fu, j. p., mao, p., & han, j. y. continuous-flow bioseparation using microfabricated anisotropic nanofluidic sieving structures. nat protoc. , - , ( ). . plecis, a., nanteuil, c., haghiri-gosnet, a. m., & chen, y. electropreconcentration with charge-selective nanochannels. anal chem. , - , ( ). . ko, s. h. et al. nanofluidic preconcentration device in a straight microchannel using ion concentration polarization. lab chip. , - , ( ).

white paper report
report id:
application number: ht
project director: philip ethington (philipje@usc.edu)
institution: university of southern california
reporting period: / / - / /
report due: / /
date submitted: / /

office of grant management
room
national endowment for the humanities
pennsylvania avenue, n.w.
washington, d. c. .

nov.

dear office of grant management:

please find attached the white paper for “broadening the digital humanities: the vectors-cts summer institute on digital approaches to american studies,” july to august (id number: ht- - ), awarded to the university of southern california. do not hesitate to contact us if you have any questions or require additional materials.

sincerely,
philip j. ethington, tara mcpherson, john carlos rowe

white paper
grant id number: ht- -
grant term: / / to / /
grant title: “broadening the digital humanities: the vectors-cts summer institute on digital approaches to american studies,” july to august .
project directors: philip ethington, tara mcpherson, john carlos rowe
grant institution: university of southern california
nov.
white paper: “broadening the digital humanities: the vectors-cts summer institute on digital approaches to american studies,” co-hosted by the vectors-center for transformative scholarship and the american studies and ethnicity department at usc, july to august . nov.

background: during the summer of , we held a very productive four-week summer institute, “broadening the digital humanities: the vectors-cts summer institute on digital approaches to american studies,” co-hosted by the vectors-center for transformative scholarship and the american studies and ethnicity department at usc, july to august . our primary audience was the american studies humanities scholar who does not have a great deal of computing experience but who has begun to express an interest in the digital humanities and in interactive media more broadly. scholars were offered the opportunity to explore the benefits of interactive media for scholarly analysis and authorship, with an emphasis on two interoperable authoring and publishing platforms: the multimedia authoring platform scalar and the geohistorical narrative visualization platform hypercities. please see “outcomes” below for detailed project descriptions and post-institute project developments.

response to the call for proposals: the response to the call for proposals was very strong. ninety-nine proposals were submitted and reviewed, indicating a high level of interest in the institute’s vision. after review, seventeen fellows representing fourteen proposals were selected. both the submitted proposals and the selected fellows represented a range of career levels (from advanced ph.d. student to endowed professor), a variety of colleges and universities, and a broad geographic distribution. (see below for a list of summer fellows.) additionally, an advanced undergraduate was included in the pool, working alongside his father, a professor of africana studies.
we were quite gratified at the strong overall quality across the applicant pool and could easily have accepted additional high-caliber proposals. in fact, narrowing the final pool to ten proved very difficult and, in the end, the decision was made to accept seventeen fellows by tapping additional usc financial resources. our selection criteria were threefold: to achieve a diversity of content matter and theoretical frameworks; to optimize the match-up between our expertise and that of the applicants; and to achieve a balance of junior and senior scholars. in keeping with the value that many place on collaborative scholarship in digital humanities, we accepted four two-scholar partnerships, pushing the total number of fellows to , all of the highest caliber.

neh summer institute fellows:

fellows | project titles
nicholas brown and sarah kanouse | recollecting black hawk
wendy cheng | a people’s guide to los angeles
elizabeth cornell | keywords collaboratory
brandon costelloe-kuehn and nick shapiro | networking asthmatic spaces
matt delmont | the nicest kids in town: american bandstand, rock 'n' roll, and civil rights in s philadelphia
kara keeling and thenmozhi soundararajan | digital media and social movements
david kim and mike rocchio | mapping the murals: chicano community murals in la
debra levine | act up oral history project
curtis marez | cesar chavez’s video library
mark marino | critical code studies
carrie rentschler | there/not there: witness in genovese case
nicholas sammond | biting the invisible hand: blackface minstrelsy and animation
jonathan sterne | mp : the meaning of a format
kara thompson | a future perfect: time, queerness, indigeneity
oliver wang | legions of boom: mobile sounds, sights and sites
scott wilson | century villages of cabrillo (adjacent to the port of long beach)

usc instruction team:
steve anderson
craig dietrich
phil ethington (pi)
erik loyer
tara mcpherson (co-pi)
jillian o'connor
john carlos rowe (co-pi)

visiting presenters (please see appendix, p.
for biographical sketches)
mark allen
anne balsamo
randy bass
anne burdick
sharon daniel
kathleen fitzpatrick
gary hall
alexandra juhasz
marsha kinder
caroline levander

work timeline: based upon our assessments and those of our fellows, the institute was a strong success. the institute began july , , ran for four weeks, and brought fellows to the usc campus. our work on the grant began well in advance of the institute and continues to the present, as we continue to support various fellows in ongoing projects. specifically, we have undertaken the following, in line with the activities proposed in the initial grant proposal:

fall-winter -
• call for proposals, curriculum, lining up visiting presenters
• call for proposals posted online and broadcast via discussion lists
• logistical planning for the summer institute began (housing, lab schedule, etc.)

spring
• erik loyer and craig dietrich worked with todd presner and dave shepard of hypercities to establish interoperability between the platforms. hypercities was established as a partner archive and hypercities can be inserted into scalar pages.
• proposals received and evaluated by review committee
• participants announced and confirmed
• visiting faculty for the institute confirmed
• travel and housing plans established
• workshop curriculum fine-tuned

june
• finalized logistics, technical support, and curriculum
• finalized daily schedule
• opened institute wiki
• assisted fellows with travel and housing needs
• finalized travel for guest presenters

july-august
• “broadening the digital humanities: the vectors-cts summer institute on digital approaches to american studies,” july to august neh july – august

fall -present
• continued assessment of outcomes
• continued support of several fellows' projects
• publication and publicity support for fellows

evaluation of the institute: fellows and presenters quickly established an atmosphere of risk-free learning and creativity.
institute co-pis and instruction team carefully interviewed and tracked each participant to support participants’ visions, helping them to understand the affordances of each platform, strategies for choosing between goals for each project, and how best to position the project to become a “publication.” each participant received weekly instruction concerning a) needed tutorials and support, arranging for hands-on expert help and lab time on specific topics ranging from spatial data processing and -d tools to audio and video processing and archiving; and b) reasonable goals and best use of time. the morning sessions of the institute consisted of seminar-style discussions led by guest presenters, the grant pis, or the fellows themselves. afternoon sessions consisted of labs, technical workshops and demonstrations, and project development time. this mix of activities seemed to strike a balance that allowed “something for everyone.” while some fellows seemed to garner more from the seminars, others gravitated toward the hands-on aspects of the workshops and lab time. there were several things that the fellows all seemed to appreciate. most found the quality of the guest presenters to be top notch, and many commented on how useful it was to see a range of types of work presented. each also expressed gratitude for the way in which the institute interwove conceptual questions and technical approaches. for this set of scholars (all tied to the “traditional” humanities), lodging questions of the digital within larger intellectual frames proved deeply satisfying. this indeed proved key to “broadening” the digital humanities (as was our theme).

assessment protocol: pis and instruction team met weekly with each fellow in scheduled meetings to assess each participant’s satisfaction with the curriculum and rate of progress, as well as immediate, intermediate, and longer-term goals. the assignment of hands-on experts, lab topics, and demos was adjusted accordingly.
outcomes: projects update reports, as of november

we are very pleased that about half of the projects launched during the summer institute are either already published (e.g., matt delmont, uc press), are being submitted (e.g., curtis marez, to the journal american literature), or are being presented at major conferences (e.g., keeling and soundararajan). these projects are listed first under “a. projects being published, submitted, or presented at conferences.” of course, several of the projects have entered longer development cycles, but these also seem to hold promise for the participants. these projects are listed second below, under “b. projects with longer development timelines.”

a. projects being published, submitted, or presented at conferences

we followed our work in the four-week summer institute with several reports at the american studies association annual conventions. at the convention, john carlos rowe advertised the institute at the digital caucus and with handouts at the convention; at the convention, he reported on our results to the digital caucus. finally, at this year's asa convention, he discussed long-term implications of the institute at a panel on transnational publishing and another on digital education for graduate students.

*************
matt delmont, professor of american studies, scripps college
http://mattdelmont.com/
project title: the nicest kids in town

i used scalar to create a digital companion to my book. i "published" the scalar project in january . the scalar project adds to the book because i was able to include + images (compared to in the book), as well as video clips related to my research. the link is here: http://scalar.usc.edu/nehvectors/nicest-kids

excerpt from review citing the scalar site

gayle wald ( ). the nicest kids in town: american bandstand, rock ’n’ roll, and the struggle for civil rights in s philadelphia. by matthew f. delmont.
berkeley: university of california press, ./the nicest kids in town digital project, http://scalar.usc.edu/nehvectors/nicestkids/index.. journal of the society for american music, , pp doi: . /s

“some of the sources that delmont uses in this regard are available in a free online companion to the nicest kids in town, constructed using innovative scalar software developed by the alliance for networking visual culture. scalar allows content producers to author projects, or “books,” that combine text and media, without subordinating the former to the latter. as users navigate the text of the scalar-based nicest kids in town, images and video clips scroll into view, accompanied by useful links to information about their provenance and content. a “stripe,” or index, running down the left-hand side of the page provides the user with an index to the material, making it easy to navigate among the three “paths,” in addition to an introduction, which constitute the main body of the project. the dozens of images on the digital nicest kids in town are of far higher quality than the illustrations in the book, with its grainy black-and-white reproductions. users will also appreciate what amount to visual “footnotes”—images of the newspaper clippings from which delmont quotes. it’s easy to imagine the digital nicest kids as a nice tool for helping undergraduates understand the significance and use of primary sources and other archival materials in the production of knowledge.”

*************
kara keeling and thenmozhi soundararajan
project title: digital media and social movements

kara keeling, assistant professor in the division of critical studies in the school of cinematic arts and in the department of american studies and ethnicity at the university of southern california
http://dornsife.usc.edu/ase/people/faculty_display.cfm?person_id=

thenmozhi soundararajan, ph.d.
student, school of cinematic arts, usc

from third cinema to media justice: third world majority and the promise of third cinema is a collaborative multi-media archive and scholarship project. it consists of an archive that contains the materials produced by third world majority (twm) during the years of their existence as a collective, and a collection of scholarly pieces, historical retrospectives, and other dialogues with the work of twm. twm was one of the first women of color media justice collectives in the united states. it operated from to . both the twm archive and the writings about it are part of the scalar project.

since the institute ended, we have successfully uploaded the entire twm archive to the internet archive and begun the process of linking that material to the scalar anthology. we have also collaboratively produced content for the anthology, started recruiting others to contribute to the volume, and presented the project in a plenary at the us cultural studies association's annual conference in san diego. we are scheduled to present it at the allied media conference in detroit, mi and the association for cultural studies conference in paris, france in june and july of . both presentations provide opportunities to produce additional content for the anthology and to identify and solicit contributors to the volume. we plan to have the archive available in scalar and to issue invitations to contributors by the end of august .

*************
elizabeth cornell
project title: keywords collaboratory
elizabeth cornell, project coordinator, keywords collaboratory, fordham university
http://www.elizabethfcornell.net/

i worked on a digital version of the book, keywords for american cultural studies, edited by bruce burgett and glenn hendler and published by nyu press. at the moment, we plan to use scalar for keyword essays, to be published alongside the second edition of the print version, which will contain other essays.
we anticipate publication in the fall of .

*************
curtis marez
project title: cesar chavez’s video library
curtis marez, associate professor of ethnic studies at the university of california, san diego

my project is called “cesar chavez’s video library, or farm workers and the secret history of new media.” it argues that farm workers have been influential actors in the political history of new media. farm worker unions, most notably the ufw, have been early and innovative adopters of older forms of “new media,” such as portable film and video technology, in ways that illuminate the political limits and possibilities of more recent new media practices among immigrant rights activists. i have continued to work on the project and have recently submitted a multimedia “essay” to the journal american literature.

*************
david kim and mike rocchio
project title: mapping the murals: chicano community murals in la
david kim, ph.d. student, department of information studies (expected ), ucla
mike rocchio, ph.d. student, department of architecture and urban design, ucla

mapping chicano murals in la is a digital model and simulation of the estrada court public housing in east los angeles, which features + community murals installed in the s and the s. during the summer institute, we built the beta version of the model in google sketchup and integrated the model into the hypercities platform, which allowed us to combine narratives, archival materials and other resources towards a spatial analysis of race, ethnicity and cultural nationalism as these concepts are embodied in the murals. we have now wrapped up the digital publication version of the project in hypercities and will be presenting it at mla (special session: race in the digital humanities) and at an architecture studies conference. we submitted the online version for peer review to the cambridge university press journal urban history in october .
*************
nick shapiro and brandon costelloe-kuehn
project title: networking asthmatic spaces
nick shapiro, graduate student, university of oxford
http://oxford.academia.edu/nickshapiro
brandon costelloe-kuehn, phd candidate, department of science and technology studies, rensselaer polytechnic institute
http://rpi.academia.edu/brandoncostelloekuehn

we created a scalar site that includes a short film, gis maps and an oral history journey through the experiences of residents in the "fema" trailers. these temporary housing units, originally built to accommodate gulf coast residents who were displaced by the hurricanes of , were found to contain potentially toxic chemicals, and have been resold across the united states in tandem with a widening foreclosure crisis. the project enhances users' capacity to visualize connections between environmental, public health and economic crises; to move across scales, engaging material that situates them inside the trailers and the lives of residents, and then zooming out to see how hazards at the local level are distributed nationally; and to understand how scientifically-engaged media can generate new perspectives on complex problems.

as these units continue to be sold to every corner of the u.s., we have been interviewing residents with irritated eyes, bloody noses, memory loss, insomnia, diarrhoea and respiratory issues. we have launched an auxiliary study in collaboration with a private indoor air quality lab which questions the prevailing scientific consensus that the trailers have off-gassed their store of potentially toxic chemicals in the almost seven years since they were manufactured. to date, the average level of formaldehyde found, across eight states, is over parts per billion, the epa’s recommended maximum indoor air concentration. we are currently working on a paper that layers the embodied knowledge of fema trailer inhabitants and our numerical data on the ongoing toxicity of these domestic spaces.
building on our work with scalar, we aim to craft new contexts in which layered claims of toxicity, based on embodied awareness and technologically mediated measurements and visualizations, can be heard, making these trailers and the effects and affects they engender graspable as objects of epistemic action. we are in conversation with the interactive web designer at wired.com to develop a website and we have lectured internationally on our research in addition to having been featured in internationally syndicated news media.

*************

oliver wang
project title: legions of boom: mobile sounds, sights and sites
oliver wang, assistant professor, sociology department, california state long beach
http://www.csulb.edu/colleges/cla/departments/sociology/people/oliverwang.htm

my digital project is a research repository focused on the history of the filipino american mobile disc jockey community in the san francisco bay area. it includes text, audio and visual resources, designed to introduce the social history of this community to both newcomers and those who grew up in the scene. the long-term goal is to create a dynamically-updated repository that can include contributions from visitors, thus emphasizing the community aspect of "community history." as i was preparing for my tenure file and book revisions (the latter of which relates to the research on the site), i have made minimal progress this past year. however, now that i have gained tenure and a fall semester sabbatical, i will be completing the site this summer and publicly launching it by early fall ( ).

b. projects with longer development timelines

wendy cheng
project title: a people’s guide to los angeles
wendy cheng, assistant professor, asian pacific american studies and justice and social inquiry, school of social transformation, arizona state university
https://webapp .asu.edu/directory/person/

i worked to develop a digital, interactive version of my coauthored book, a people's guide to los angeles, which presents sites of struggles over power and alternative and minority histories throughout los angeles county. although the book has now been published, unfortunately i have not made any progress on the digital project since the end of the institute. we (my coauthors and i, in conversation with uc press) are currently working to develop a people's guide book series and would like to return to the question of a digital, interactive online presence in the future that would serve as a hub for these various projects, but i don't have a sense of when or how that would develop. the institute helped tremendously, however, to identify what the digital project might look like, and what the questions, problems, and needs would likely be in order to realize the project.

*************

nicholas sammond
project title: biting the invisible hand: blackface minstrelsy and animation
nicholas sammond, associate professor, cinema studies institute and department of english, university of toronto
http://www.utoronto.ca/cinema/faculty-sammond.html

my project entailed developing an online companion to my upcoming book, biting the invisible hand: blackface minstrelsy and the industrialization of american animation (duke university press, forthcoming). the companion is not a literal translation of the book, but a complementary resource, one that permits the reader (or stand-alone visitor) to view the cartoons, minstrel ephemera, and other media elements to which the book refers, but which the book cannot deliver in substantial form.
to date, the companion, which is only a prototype, has been undergoing beta testing by student workers, with the goal of refining its organization, structure, and flows. with the book slated to be under review by the end of the summer, significant development on the companion will commence from july forward, intensifying in september and october.

*************

sarah kanouse and nicholas brown
project title: recollecting black hawk
sarah kanouse, assistant professor, intermedia program, school of art and art history
http://www.readysubjects.org/bio.html
nick brown, phd candidate, department of landscape architecture and american indian studies, university of illinois at urbana-champaign
http://walkinginplace.org/ http://criticalspatialpractice.blogspot.com/

last summer, we set out to work on a digital supplement to the photo-text book re-collecting black hawk. immersion in the discussion of digital humanities at the summer institute helped us to realize that a spin-off project--related but stand-alone--would be more appropriate than what we initially envisioned. the reconceptualized project has been delayed by the need to finish the print book, but we plan to return to it, using resources at our home campuses, once the publication timeline is firmed up.

*************

carrie rentschler
project title: there/not there: witness in genovese case
carrie rentschler, associate professor and william dawson scholar of feminist media studies in the department of art history and communication studies at mcgill university
http://www.mcgill.ca/igsf/about/staff

my digital project is an annotated archive of materials that animate the cultural life and case construction of the infamous kitty genovese murder. while the archive constitutes the research materials for a book i am writing, when complete, the digital archive will have significance beyond that publication, and will be of interest to students who are taught the case in high school and university classrooms.
the project is not yet complete. it is much as it was at the end of the seminar last summer due to my current administrative responsibilities. i am, however, quite eager to complete it, because of how useful i believe it will be pedagogically and as a small research archive.

*************

jonathan sterne
project title: mp3: the meaning of a format
jonathan sterne, associate professor, department of art history and communications studies, mcgill university.
http://media.mcgill.ca/en/jonathan_sterne

my project was to create a digital companion to my book mp3: the meaning of a format. i was especially interested in expanding the audio capabilities of scalar and companion sites like critical commons. i've made some progress but am not yet that close to done. i hope to "go live" with something in august when my book comes out. i've replotted the project -- the concept was a little foggier last summer and i tried using existing models like pouring in text. instead, what i need to do is pick a few core "takeaway" concepts and provide additional illustration and material to the book, especially audio material.

*************

kara thompson
project title: colonized time
kara thompson, assistant professor of english & american studies, college of william and mary
ktthompson@wm.edu

i submitted a project proposal for “mapping with reservations,” a multimedia, multilayered cartographic representation of the reservation system in the u.s., beginning in and continuing to the present day. this was clearly a project much too onerous for the time and scope of the fellowship. after some training with scalar, i created a minimal framework for “colonized time,” which is a cultural history of the black hills from - . i try to use the paths and visual orientations of scalar to show how the black hills is a site well known in a dominant tourist imagination, but underneath the venerated, re-enacted “wild west” is the very present colonization of the lakota people.
i have not made any progress on the project since i left the institute. immediately following our time at the iml, i began my first tenure track position, which completely consumed my research and writing time. i do hope to take up the project again and devise a way to use it for either research or teaching. http://scalar.usc.edu/nehvectors/kara-thompson/index. [i notice the hypercities inserts are not loading properly, at least not on my computer, which is why i am sending a link instead of a screen shot].   ************* debra levine project title: act up oral history project       appendix visiting presenters mark allen is the founder of machine project, a non-profit community space in the echo park neighborhood of los angeles investigating art, technology, natural history, science, music, literature, and food. in the machine project storefront on north alvarado street, allen and his colleagues produce events, workshops and site-specific installations using hands-on engagement to make rarefied knowledge accessible. in his own work, allen is interested in sculpture and performance, asking how they can affect the viewer in a deep, personal way. how can the viewer be moved from a passive position to a state of engagement and communal experience? allen has been working with these concerns since graduate school, and his practice has transformed from studio artist to include collaborator, facilitator and producer as he has investigated these questions. under his direction, machine functions as a research laboratory, investigating performance, sculpture and installation as lived experience for the viewer. anne balsamo is professor of interactive media in the school of cinematic arts, and of communication in the annenberg school of communication and journalism at the university of southern california. from - , she served as director of the institute for multimedia literacy. her work focuses on the relationship between culture and technology. 
in , she co-founded onomy labs, inc., a silicon valley technology design and fabrication company that builds cultural technologies. her first book, technologies of the gendered body: reading cyborg women (duke university press, ) investigated the social and cultural implications of emergent bio-technologies. her new book project, designing culture: the technological imagination at work, examines the relationship between cultural reproduction and technological innovation.

randy bass is executive director of georgetown's center for new designs in learning and scholarship, a university-wide center supporting faculty work in new learning and research environments. he is the director of the visible knowledge project (vkp). in conjunction with the vkp, he is also the director of the american studies crossroads project, an international project on technology and education in affiliation with the american studies association, with major funding in the past by the us department of education and the annenberg/cpb project. in conjunction with the crossroads project, bass is the supervising editor of engines of inquiry: a practical guide for using technology to teach american studies, and executive producer of the companion video, engines of inquiry: a video tour of learning and technology in american culture studies. he has served as co-leader of the neh-funded "new media classroom project: building a national conversation on narrative inquiry and technology," in conjunction with the american social history project/center for media and learning (at the cuny graduate center). he is also the electronic resources editor for the heath anthology of american literature (third edition, paul lauter, ed.).

anne burdick is chair of the media design program at the pasadena art center. she is a regular participant in the international dialogue regarding the future of graduate education and research in design.
in addition, she designs experimental text projects in diverse media, for which she has garnered recognition, from the prestigious leipzig award for book design to i.d. magazine’s interactive design review for her work with interactive texts. burdick has designed books of literary/media criticism by authors such as marshall mcluhan and n. katherine hayles and she is currently developing electronic corpora with the austrian academy of sciences. burdick’s writing and design can be found in the los angeles times, eye magazine and electronic book review, among others, and her work is held in the permanent collections of both sfmoma and moma. burdick studied graphic design at both art center college of design and san diego state university prior to receiving a b.f.a. and m.f.a. in graphic design at california institute of the arts.

sharon daniel is professor of film and digital media at the university of california, santa cruz where she teaches classes in digital media theory and practice. her research involves collaborations with local and online communities, which exploit information and communications technologies as new sites for “public art.” daniel is the co-creator of the web-based interactive project public secrets, which examines the spaces of the prison system through the voices of incarcerated women. the award-winning project exemplifies precise and elegant interface design and the use of an algorithm to generate random “text boxes” that act as metaphors for the project’s central thesis.

kathleen fitzpatrick is associate professor of english and media studies and chair of the media studies program at pomona college in claremont, california. she is the author of the anxiety of obsolescence: the american novel in the age of television (vanderbilt up, ), which was selected as an “outstanding academic title” for by choice.
she serves on the editorial board of the pearson custom introduction to literature database anthology, as well as of the journal of e-media studies and the journal of transformative works, and is a member of the executive committee of the mla discussion group on media and literature. she has recently finished a book-length project, to be published by new york university press, entitled planned obsolescence: publishing, technology, and the future of the academy. she is a founder of mediacommons and a frequent blogger and has recently been appointed the first director of scholarly communication for the mla. gary hall is a london-based cultural and media theorist working on new media technologies, continental philosophy and cultural studies. he is professor of media and performing arts in the school of art and design at coventry university, uk. he is the author of culture in bits (continuum, ) and digitize this book!: the politics of new media, or why we need open access now (minnesota up, ) and co-editor of new cultural studies: adventures in theory (edinburgh up, ) and experimenting: essays with samuel weber (fordham up, ). he is also founding co-editor of the open access journal culture machine, director of the cultural studies open access archive csearch, co-founder of the open humanities press and co-editor of the ohp's culture machine liquid books series. his work has appeared in numerous journals, including angelaki, cultural politics, cultural studies, parallax and the oxford literary review. he is currently developing a series of politico-institutional interventions - recently dubbed 'deconstructions in the public sphere' - which use digital media to creatively perform critical and cultural theory. in / he will be a visiting fellow at the centre for research in the arts, humanities and social sciences at cambridge university.   alexandra juhasz is professor of media studies at pitzer college. 
she makes and studies committed media practices that contribute to political change and individual and community growth. she is the author of aids tv: identity, community and alternative video (duke university press, ), women of vision: histories in feminist film and video (university of minnesota press, ), f is for phony: fake documentary and truth’s undoing, co-edited with jesse lerner (minnesota, ), and media praxis: a radical web-site integrating theory, practice and politics, www.mediapraxis.org. she has published extensively on documentary film and video. dr. juhasz is also the producer of educational videotapes on feminist issues from aids to teen pregnancy. she recently completed the feature documentaries scale: measuring might in the media age ( ), video remains ( ) and dear gabe ( ) as well as women of vision: histories in feminist film and video ( ) and the shorts, released: short videos about women and film ( ) and naming prairie ( ), a sundance film festival, , official selection. she is the producer of the feature films, the watermelon woman (cheryl dunye, ) and the owls (dunye, ). her current work is on and about youtube: www.youtube.com/mediapraxisme and www.aljean.wordpress.com. marsha kinder is a cultural theorist and prolific film scholar, whose specializations include narrative theory, digital media, children's media culture, and spanish cinema. she has published more than essays and books, including blood cinema: the reconstruction of national identity in spain with companion cd-rom ( ), playing with power in movies, television and video games: from muppet babies to teenage mutant ninja turtles ( ), self and cinema ( ) and closeup ( ). since kinder has directed the labyrinth project, an art collective and research initiative on interactive cinema and database narrative. 
labyrinth has produced a series of award-winning interactive installations and dvds that have been exhibited at museums, conferences, film festivals and new media festivals worldwide. kinder has worked for sega as a rater of violence in video games, has written, directed and produced game prototypes and online courseware projects, and received a number of awards, including the sundance online festival jury award for new narrative forms, british academy of film & tv arts for best interactive project in the learning category, and new media invision award for best overall design.

caroline levander is the vice provost for interdisciplinary initiatives, carlson professor in the humanities, and professor of english at rice university. she is currently writing laying claim: imagining empire on the u.s.-mexico border (oxford university press) and where is american literature? (wiley-blackwell’s manifesto series). levander has recently co-edited teaching and studying the americas ( ), a companion to american literary studies ( ), and "the global south and world disorder," with walter mignolo, for the global south ( ). she is the recipient of grants and fellowships from the mellon foundation, the national endowment for the humanities, the national humanities center, the huntington library, and the institute of museum and library science's national leadership grant, among other agencies. in addition to co-editing a book series, imagining the americas, with oxford university press, levander is author of cradle of liberty: race, the child and national belonging from thomas jefferson to w.e.b. du bois (duke up ) and voices of the nation: women and public speech in nineteenth-century american culture and literature (cambridge up , paperback reprint ) and co-editor of hemispheric american studies ( ) and the american child: a cultural studies reader ( ).
she is also involved in the our americas archive partnership (oaap), a digital archive supported by search tools and teaching materials that provides open access to historical documents on the americas, which are housed at collaborating institutions.

ramesh srinivasan is assistant professor of information studies with a courtesy appointment in design|media arts. srinivasan, who holds m.s. and doctoral degrees from the mit media laboratory and harvard's design school respectively, has focused his research globally on the development of information systems within the context of culturally-differentiated communities. he is interested in how an information system can function as a cultural artifact, as a repository of knowledge that is commensurable with the ontologies of a community. as a complement, he is also interested in how an information system can engage and re-question the notion of diaspora and how ethnicity and culture function across distance. his research therefore involves engaging communities to serve as the designers, authors, and librarians/archivists of their own information systems. his research has spanned such bounds as native americans, somali refugees, indian villages, aboriginal australia, and maori new zealand. he has published widely in scholarly journals.

digital memory in the post-witness era: how holocaust museums use social media as new memory ecologies

information, article

stefania manca

citation: manca, s. digital memory in the post-witness era: how holocaust museums use social media as new memory ecologies. information , , . https://doi.org/ . /info

received: december
accepted: january
published: january

publisher’s note: mdpi stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

copyright: © by the author. licensee mdpi, basel, switzerland.
this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (https://creativecommons.org/licenses/by/ . /).

institute of educational technology, national research council of italy, genoa, italy; stefania.manca@itd.cnr.it

abstract: with the passing of the last testimonies, holocaust remembrance and holocaust education progressively rely on digital technologies to engage people in immersive, simulative, and even counterfactual memories of the holocaust. this preliminary study investigates how three prominent holocaust museums use social media to enhance the general public’s knowledge and understanding of historical and remembrance events. a mixed-method approach based on a combination of social media analytics and latent semantic analysis was used to investigate the facebook, twitter, instagram, and youtube profiles of yad vashem, the united states holocaust memorial museum, and the auschwitz–birkenau memorial and museum. this social media analysis adopted a combination of metrics and was focused on how these social media profiles engage the public at both the page-content and relational levels, while their communication strategies were analysed in terms of generated content, interactivity, and popularity. latent semantic analysis was used to analyse the most frequently used hashtags and words to investigate what topics and phrases appear most often in the content posted by the three museums. overall, the results show that the three organisations are more active on twitter than on facebook and instagram, with the auschwitz–birkenau museum and memorial occupying a prominent position in twitter discourse while yad vashem and the united states holocaust memorial museum had stronger presences on youtube.
although the united states holocaust memorial museum exhibits some interactivity with its facebook fan community, there is a general tendency to use social media as a one-way broadcast mode of communication. finally, the analysis of terms and hashtags revealed the centrality of “auschwitz” as a broad topic of holocaust discourse, overshadowing other topics, especially those related to recent events.

keywords: holocaust remembrance; social media; cultural studies; digital memory; social media analytics; latent semantic analysis

1. introduction

with the advent of increasingly sophisticated communication technologies and with progressive temporal departure from the historical circumstances that marked the “destruction of european jewry” [ ] about years ago, the employment of digital technology has emerged as a specific topic of research in the field of holocaust studies. as a number of scholars have highlighted, “the cosmopolitan holocaust memory of the new millennium is synonymous with digital technology” [ ] (p. ). efforts to save and preserve historical archives combined with attempts to safeguard the testimonies of the last survivors have resulted in numerous undertakings based on the use of advanced digital technologies. the first prominent initiative came from the usc shoah foundation’s institute for visual history and education (formerly survivors of the shoah visual history foundation), a non-profit organization dedicated to recording interviews with survivors and witnesses of the holocaust and other genocides [ ]. subsequently, progressive diminishment of the witness era [ ] has further marked the need to preserve testimonies through digital means. one such initiative, the new dimensions in testimony, gathers a collection of survivor testimonies in interactive d format in a quest to safeguard the possibility of real-time,
question-and-answer virtual dialogue with survivors to learn about and appreciate their life experiences [ , ]. in this vein, the idea of a “virtual holocaust memory” has advanced, embracing both digital and non-digital memory related to the holocaust and, at the same time, drawing attention to the pervasive nature of the virtuality of memory itself [ ].

overall, digital culture opens up new opportunities for externalising collective memories and, in this regard, social media settings may be considered the main arenas of mediatized memory that are increasingly globalised and transcultural [ – ]. due to technological transformation and the increasingly mediated nature of communication, digital memory is progressively becoming “unanchored” from localised contexts, making both individual and collective memory timeless and spaceless [ , ].

in this light, holocaust memorials, remembrance centres, and institutions have had a solid presence on the internet for a considerable time now, curating websites, mailing lists, and other digital services [ , ]. museums use and produce diverse media to transmit and communicate memorial content, including standard printed media, multimedia productions, (often hands-on) media stations, interactive software, and web-based material and services. franken-wendelstorf, greisinger, and gries [ ] explained how the “learning location museum” has expanded into digital space.
furthermore, museums, libraries, and related cultural institutions have started using social media for the development of digital social archives [ ]. indeed, social media have become standard means by which holocaust museums, memorials, and institutions disseminate knowledge and reach out to the public, e.g., for publicising upcoming local events.

within the specific research subfield of social media memory studies [ ], which investigates digital memory of historical events such as those related to the holocaust [ , ], social media holocaust studies have become a topic of scholarship in their own right. some recent projects in this area, such as eva.stories on instagram (https://www.instagram.com/eva.stories/) and the anne frank video diary on youtube (https://www.youtube.com/annefrank), have raised considerable controversy. however, the interest in engaging new generations through novel forms of agency in relation to media witnessing and mediated memory is not something that can be dismissed in principle, as they exemplify the co-creation of socially mediated experiences [ ]. although mass culture has increasingly become prominent in the provision of historical knowledge [ ], some scholars argue that traditional holocaust memory environments, such as memorials, cinema, and television, are no longer suitable for contemporary digital users; they see the need to “resurrect” holocaust commemoration, creating immersive and more engaging memories [ ].

for the most part, much critical debate about social media use has focused on so-called dark tourism at holocaust memorial sites [ ], namely visitors taking selfies and other tourist photographs and subsequently sharing them on social media with hashtags [ – ]. by contrast, little research has focused on proactive social media use by holocaust institutions, such as memorials and museums [ , – ].
in today’s digital age, holocaust museums act both as physical monuments and as mediated and virtual spaces and are thus located at the intersection between commemorative memory and mediated memory [ ]. in this sense, they have a multifaceted mandate that covers commemoration, engagement/education of site visitors, enlightenment of the general public’s understanding of the past, as well as strengthening or challenging of historical narratives [ ]. along with archives and libraries, holocaust museums are public spaces that constitute prime social “memory institutions” and, today, represent the most significant repositories of national and community memories of the nazi genocide [ ]. in this vein, museums position themselves at the intersection of holocaust memory studies and the emerging field of digital history by making content accessible beyond the physical spaces of museums, research institutions, or archives [ ].

however, today, the general expansion of social media into the realm of cultural heritage, not least that of holocaust remembrance, also raises serious concerns about competing forms of local and national memory, including the narratives conveyed through museums [ ]. despite controversial cases of “multidirectional memory” [ ], museums serve to reassure patrons thanks to the legitimacy and authority that people tend to accord to these cultural institutions, especially when set against the confusion of internet sites promoting antisemitism and treating holocaust denial as historical truth [ , ].
more recently, the restrictions posed by the covid-19 pandemic on cultural institutions and heritage sites have accelerated the proliferation of digital memory [ ]; a growing use of social media has been a natural response to the limitations posed specifically on in situ socialisation, thereby giving impetus to a shift from complex onsite digital technology to online social media. various campaigns, such as #rememberingfromhome and #shoahnames, were launched by yad vashem [ ] to celebrate israeli holocaust remembrance day and to foster engagement, participation, and users’ active response through sharing, posting, and commenting, thus configuring new memory ecologies [ ].

this study analyses how three holocaust museums—yad vashem, the united states holocaust memorial museum, and the auschwitz–birkenau memorial and museum—use facebook, twitter, instagram, and youtube to engage their communities both at the content-page and at relational levels. the aim is to investigate what communication strategies the three museums adopt regarding generated content, interactivity, and popularity; these are examined in terms of typology of published content as well as engaging terms and hashtags.

2. related literature

among cultural heritage institutions, museums, monuments, and memorials are leading adopters of digital technologies for education and dissemination activities. these institutions are early up-takers of the internet, driven in part by the widespread push to digitise their archives, thereby making them accessible to an increasingly wide audience. similarly, they have turned to social media use from the early stages [ , ]. several studies have focused on how social media has challenged the traditional flow of museum-based information and have blurred the lines traditionally dividing the roles of exhibition developers, designers, and educators [ ].
other studies have investigated the tensions and synergies between traditional and modern museum practice from the perspective of ethical issues connected to transparency, censorship, and respect for constituencies, especially as museums relinquish direct control over their media content [ ]. at the same time, the paramount importance of encouraging different levels of public participation has been stressed, ranging from merely enjoying content to exercising more participatory roles through the co-creation of new content. this participative turn in cultural policy relies on the paradigm of cultural democracy, according to which diverse social groups should obtain acknowledgement of their cultural practices and no assumption should be made of any superior imperative in the transmission of cultural expression [ ].

in this light, the different degrees of engagement with cultural heritage institutions (attendance, interaction, and co-construction) are also reflected in their social media presence. the participatory culture imbued in social media [ ] is also reflected in the ways that museums act as intermediaries of historical knowledge and cultural heritage through the exploitation of social media as sociotechnical systems and through leveraging their affordances [ ]. the focus of recent studies has shifted from engagement to the extent to which social media contribute to the co-construction of dialogue between museums and their visitors [ ]. the idea of museums as cultural intermediaries is connected with the concept of online value creation. this is manifest in at least three organizational forms in which museums may engage: (1) marketing, which promotes the face of the institution; (2) inclusivity, which nurtures a real online community; and (3) collaboration, which goes beyond communication and promotes constructive interaction with the audience [ , ].
one of the approaches taken for measuring museums’ social media presence involves gauging social media effectiveness by considering both content and relational communication strategies [ ]. according to this approach, engagement is manifested in different behaviours, and communication effectiveness ought to be considered in terms of three consumer engagement dimensions: popularity (e.g., the number of followers and likes); generated content (e.g., the number of posts and comments); and virality (e.g., the number of reposts/shares) [ ]. other studies have investigated post writing as a tool for ascertaining museum engagement and have explored engagement with posts and its distribution by focusing on images, hashtags, and mentions [ ]. other techniques based on topic modelling have been used to derive discourse topics in the content of museums’ posts and the interactions these generated [ ].

notwithstanding the above-reported methodological approaches, to the best of our knowledge, no research study has yet investigated social media engagement centring on major holocaust museums and memorials. moreover, recent studies have shown that research in the two subfields of holocaust remembrance and holocaust education is largely underpinned by different conceptual frameworks. while the former has become a well-established research field, there is a clear lack of empirical research on social media use for teaching and learning about the holocaust [ ]. this study provides a preliminary analysis of what type of content these three major holocaust institutions publish on social media and how they engage their respective online communities.

rationale of the study

holocaust museums’ current pursuit of a dual mission, as sources of cultural heritage and as institutions with an educational calling, is a phenomenon that is increasingly related to their employment of digital technologies [ ].
social media use has the potential to reach millions of people and the power to transform engrained memory paradigms about the historical contexts of national socialism and the holocaust [ ]. although holocaust distortion and trivialisation [ ] have become increasingly pervasive on internet sites and social media, social media may at the same time strengthen holocaust knowledge and raise awareness of the many forms of holocaust distortion being propagated, in part thanks to ready (online) access to accurate historical scientific knowledge on which to judge historical facts [ ].

in this sense, there is a need to raise awareness about the potential that social media channels offer to museums and memorials for holocaust education so that they can better engage their audiences; this involves not only promoting cultural activities and initiatives but also adopting effective social media practices for disseminating accurate historical information.

this study aims to provide a preliminary analysis of social media engagement in three major holocaust museums: yad vashem (yv) in israel, the united states holocaust memorial museum (ushmm), and the auschwitz–birkenau memorial and museum (amm) in poland. the reasons for focusing on these museums lie in their representativeness of worldwide holocaust heritage, their prominence in terms of the number of visitors they receive annually, and their importance as agencies in the field of holocaust education. moreover, despite variance in their holocaust narratives and their differing social, cultural, and political agendas [ ], they are all prominent holocaust heritage tourist sites that play a special role in shaping the collective memory of the holocaust [ – ]. although many academic studies have investigated these museums individually or as part of a group of major heritage sites (e.g., [ – ]), very few have researched their use of social media [ , , – , ].
all three museums run active social media profiles on several platforms in order to share news about their special events and educational initiatives as well as to publicise important dates and ceremonies. in this endeavour, they have adopted their own hashtags (#yadvashem, #ushmm, and #auschwitz) to make it easy for people to locate their official communication. despite this stream of activity, research has yet to produce a comprehensive overview of how these three museums use twitter, facebook, instagram, and youtube as part of media-related learning and socially inherited memory. accordingly, this study aims to answer the following specific research questions:
1. what kind of content do the three museums publish via their social media profiles?
2. what kind of interaction takes place with these profiles?
3. what types of content engage the fans/followers most?

methods and procedure

this study adopts a mixed-method approach grounded in established methods for social media research [ ] and is based on social media analytics and latent semantic analysis. social media analytics are considered a powerful means not only for informing but also for transforming “existing practices in politics, marketing, investing, product development, entertainment, and news media” [ ]. in cultural heritage studies on museums’ use of social media, social media analytics have been used to evaluate the impact of museums’ events [ , ] and to extract inspiring pronouncements [ ].

social media analytics were employed to investigate the three institutions’ use of four different social media platforms: facebook, twitter, instagram, and youtube. specifically, instagram was included in this group because it “encourages conversation and empathy, keeping the holocaust visible in youth discourses” [ ] (p. ) and because it offers a different perspective on holocaust museums’ engagement with social media.
table reports the list of profiles for the three museums investigated here.

table . list of social media profiles per museum (the date of profile creation/activation is shown in brackets).
platform | yv | ushmm | amm
facebook | yadvashem ( june ) | holocaustmuseum ( october ) | auschwitzmemorial ( october )
twitter | @yadvashem ( april ) | @holocaustmuseum ( august ) | @auschwitzmuseum ( may )
instagram | yadvashem (april ) | holocaustmuseum (july ) | auschwitzmemorial (january )
youtube | yadvashem (february ) | holocaustmuseum (august ) | auschwitzmemorial (september )

the activity around these social media profiles was analysed in terms of (1) content (e.g., post frequency and format, and type of information), (2) interactivity (e.g., user response and engagement), and (3) popularity (e.g., number of fans/followers, shares, etc.). this approach is derived from an analysis framework that distinguishes between content and relational communication strategies and that measures the effectiveness of fan pages and posts [ ]. unlike previous studies [ ] that relied on the analytics provided by the museum analytics website (http://www.museum-analytics.org), this study uses fanpage karma (https://www.fanpagekarma.com/) as its reference social media data analysis platform to retrieve data from facebook pages, twitter profiles, instagram profiles, and youtube channels. fanpage karma is one of the leading providers of social media analytics and monitoring. it provides insights into posting metrics, strategies, and the performance of profiles on facebook, twitter, youtube, instagram, linkedin, and pinterest. the service allows for the creation of dashboards and benchmarks for social media profiles and provides instant reports (excel, powerpoint, and pdf) and email updates. the trial version provides metrics for the last days for public pages, while the paid service allows personalised timeframe setting. table shows a sample of metrics considered for the analysis.
data analysis covers two months of activity from th july to th september .

table . list of metrics per platform.

content
• facebook page: number of posts; posts per day; link-posts (number of posts in url format); picture-posts (number of posts in picture format); video-posts (number of posts in video format)
• twitter profile: number of tweets; tweets per day; picture and/or link-tweet; new content-tweet
• instagram profile: number of posts; posts per day; picture-post; carousel-post (post with multiple photos or videos that can be viewed by swiping or clicking left); video-post
• youtube channel: number of videos

interactivity
• facebook page: number of comments on posts; number of reactions to posts; post interaction (%); engagement (%); fans’ posts; fans’ posts with comment by page; fans’ posts with reaction by page; fans’ comments on other fans’ posts
• twitter profile: number of likes; number of likes per tweet; tweet interaction (%); engagement (%); conversations
• instagram profile: number of comments; number of comments per post; post interaction (%); engagement (%)
• youtube channel: number of views; number of views per video; number of likes; number of likes per video; number of dislikes; number of dislikes per video; number of comments; number of comments per video

popularity
• facebook page: number of fans; number of shares
• twitter profile: number of followers; number of retweets; average number of retweets per tweet
• instagram profile: number of followers; follower growth
• youtube channel: number of subscribers; subscriber growth (%)

in addition to social media analytics metrics, this study also considered latent semantic analysis (lsa) [ ]. this is a technique adopted in natural language processing, in particular distributional semantics, that analyses relationships between words; in this study, it was employed to determine the topical structure of communication.
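as a rough illustration of what latent semantic analysis does, the sketch below builds a term-document matrix from a handful of posts and extracts its first latent dimension, i.e., the group of terms that co-occur most strongly. it is a minimal sketch under stated assumptions, not the pipeline used in this study, whose preprocessing is not described here: the tokeniser, the function names (`tokenize`, `term_document_matrix`, `first_latent_dimension`), the use of power iteration in place of a full singular value decomposition, and the toy posts are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into word and hashtag tokens."""
    return re.findall(r"#?\w+", text.lower())


def term_document_matrix(posts):
    """Return (vocab, matrix), where matrix[i][j] counts term i in post j."""
    counts = [Counter(tokenize(post)) for post in posts]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def first_latent_dimension(matrix, iterations=200):
    """Approximate the leading left singular vector of the term-document
    matrix X by power iteration on X X^T (one weight per term)."""
    n_terms, n_docs = len(matrix), len(matrix[0])
    u = [1.0 / math.sqrt(n_terms)] * n_terms
    for _ in range(iterations):
        # Project each post onto the current term weights: v = X^T u.
        v = [sum(matrix[i][j] * u[i] for i in range(n_terms)) for j in range(n_docs)]
        # Map back to term space: w = X v, then renormalise to unit length.
        w = [sum(matrix[i][j] * v[j] for j in range(n_docs)) for i in range(n_terms)]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    return u


# Invented toy posts, for illustration only (not data from the study).
posts = [
    "#auschwitz camp history",
    "camp prisoner #auschwitz",
    "holocaust survivors family",
]
vocab, matrix = term_document_matrix(posts)
weights = dict(zip(vocab, first_latent_dimension(matrix)))
top_terms = sorted(weights, key=weights.get, reverse=True)[:2]
```

on these toy posts, the terms with the largest weights are the ones shared by the first two posts, while the terms of the unrelated third post receive weights close to zero. a full lsa would keep several such dimensions, typically computed with a library svd routine rather than by hand.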
lsa was applied to words and hashtags to analyse which words or strings of words are most frequently used in posts/tweets. given the functional importance and pervasive use of hashtags on twitter, these have been the subject of numerous studies that highlight their status as polysemic texts embodying multiple meanings and usages [ , ]. in this study, the aim is to provide an overview of the topics and phrases that appear most often and to discover which hashtags engage the fans/followers most.

results

an initial analysis was conducted by inspecting social media analytics, which provided insights into how the three museums (yv, ushmm, and amm) used facebook, twitter, instagram, and youtube in the two-month period from th july to th september . tables – report the analytics related to the content, interactivity, and popularity of these three museums on the four social media platforms.

table . content, interactivity, and popularity of museums’ twitter profiles (columns: yv, ushmm, amm).
• content: tweets; tweets per day; picture and/or link-tweet; new content-tweet
• interactivity: likes; likes per tweet; tweet interaction (%); engagement (%); conversations
• popularity: followers; retweets; average number of retweets per tweet

table . content, interactivity, and popularity of museums’ facebook pages (columns: yv, ushmm, amm).
• content: posts; posts per day; link-posts; picture-posts; video-posts
• interactivity: comments on posts; reactions to posts; post interaction (%); engagement (%); fans’ posts; fans’ posts with comment by page; fans’ posts with reaction by page; fans’ comments on other fans’ posts
• popularity: fans; shares

table . content, interactivity, and popularity of museums’ instagram profiles (columns: yv, ushmm, amm).
• content: posts; posts per day; picture-post; carousel-post; video-post
• interactivity: comments; comments per post; post interaction (%); engagement (%)
• popularity: followers; growth

table . content, interactivity, and popularity of museums’ youtube channels (columns: yv, ushmm, amm).
• content: videos
• interactivity: views; views per video; likes; likes per video; dislikes; dislikes per video; comments; comments per video
• popularity: subscribers; subscriber growth

content

looking at content categories, the largest volume of posted content was found on twitter (table ), where, out of tweets, . % (n = ) were produced by amm, with an average of tweets published per day. in terms of content types, in general, more than half of the tweets contained images and/or links. while ushmm tended to publish more original content than the other two profiles (n = ; . %), amm republished the most content produced by other twitter profiles (n = ; . %).

on facebook (table ), the mix of content types varies considerably. here, by contrast, ushmm is the most frequent publisher: out of posts, ushmm accounts for more than half of the content published (n = ; . %), with an average of . posts per day. while external links feature prominently in ushmm content (n = ; . %), amm and, to a lesser degree, yv make massive use of images (n = ; . %, and n = ; . %, respectively). video content is employed more frequently by yv (n = ; . %) and ushmm (n = ; . %), although to a lesser extent than images.

as far as instagram use is concerned (table ), content distribution is more homogeneous (ushmm: n = , . %; amm: n = , . %; and yv: n = , . %).
picture-posts account for most of the content, while yv also tends to publish a small number of carousel-posts (n = ; . %). the ushmm profile also includes a small percentage of video-posts (n = ; . %).

finally, youtube activity (table ) was higher for yv (n = ; %) and ushmm (n = ; . %) than for amm (n = ; . %), although the frequency of video posting per day was quite low (n = . ). all three channels published original content.

interactivity

interactivity was largely investigated using analytics (e.g., the number of total comments/likes or post/tweet interaction) and engagement. for twitter (table ), alongside high variance in the total number of likes that each profile’s content attracted, we also found that amm tweets tend to receive more likes than those of the other two profiles (n = versus for ushmm and for yv). however, if we look at twitter interaction (the average number of interactions per tweet on tweets posted on a given day, in relation to the total number of followers accrued on that same day in the selected period), we can see that yv and amm report a similar percentage ( . % and . %, respectively). engagement levels (the average number of interactions per day on tweets posted on a given day, in relation to the number of followers accrued on that same day in the selected period) were found to differ significantly between the three profiles: amm had the highest engagement, with . % versus . % for ushmm and . % for yv. finally, for the twitter-specific metric conversations (the ratio of @-reply tweets, i.e., tweets interacting with other twitter profiles, to all tweets published in the selected period), yv had a higher ratio ( %) than ushmm ( %) or amm ( %).
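the three twitter metrics defined above differ only in their denominators, which the following sketch makes explicit. it is a minimal illustration under stated assumptions rather than the analytics platform's actual computation: the `Tweet` record, the choice of likes, retweets, and replies as the interactions that count, and the use of a single follower figure for the whole period (instead of a day-by-day count) are all hypothetical simplifications, and the sample numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    likes: int
    retweets: int
    replies: int
    is_reply: bool  # True when the tweet is an @-reply to another profile


def interactions(tweet):
    """Total interactions on one tweet (which ones count is an assumption)."""
    return tweet.likes + tweet.retweets + tweet.replies


def tweet_interaction(tweets, followers):
    """Average interactions per tweet, as a percentage of followers."""
    total = sum(interactions(t) for t in tweets)
    return 100 * total / (len(tweets) * followers)


def engagement(tweets, followers, days):
    """Average interactions per day, as a percentage of followers."""
    total = sum(interactions(t) for t in tweets)
    return 100 * total / (days * followers)


def conversations(tweets):
    """Percentage of published tweets that are @-replies."""
    return 100 * sum(t.is_reply for t in tweets) / len(tweets)


# Invented numbers, for illustration only (not data from the study).
sample = [Tweet(10, 5, 5, False), Tweet(20, 0, 0, True)]
```

because engagement divides by days while tweet interaction divides by the number of tweets, a profile that posts many times a day can show high engagement even when each individual tweet attracts an ordinary level of interaction.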
turning to interactivity on facebook (table ), this was gauged not only by the number of comments on posts, post interaction, and engagement but also by metrics such as the number of posts by fans, fan posts that received comments from the profile page, fan posts that received reactions from the profile page, and comments on fan posts from other fans. regarding the ratio of comments per post and the ratio of reactions per post, ushmm attracted higher activity on both counts: , / = . and , / = . , respectively. for post interaction (the average number of interactions per post, counting reactions such as like, love, haha, thankful, wow, sad, and angry, as well as comments and shares, on posts made on a given day, in relation to the number of fans accrued on the same day in the selected period), amm attracted the most activity ( . %). this is in line with the engagement metrics (the average number of interactions per day on posts made on a given day, in relation to the number of fans accrued on the same day in the selected period), with amm accounting for . % and ushmm accounting for . %. however, the situation is different when we look at the level of users’ active posting and the number of comments or reactions these posts receive. here, there is a huge difference among the three profiles: users post new content almost exclusively on the ushmm page (n = ) and, to a minor extent, on the amm page (n = ); none of the users’ posts received comments from the page owner, and only a limited number of users’ posts on the ushmm page received reactions from the page itself (n = ) or comments from other fans (n = ).

posts on instagram were inspected in terms of the number of comments and likes, post interaction, and engagement (table ). the ratio of comments per post is higher for ushmm ( / = . ) and amm ( / = . ), while the ratio of likes per post is highest for amm ( , / = ).
post interaction metrics (the average number of organic likes and comments per post on posts made on a given day, in relation to the number of followers accrued on the same day in the selected period) are similar in all three profiles, ranging from yv’s . % to amm’s . %. engagement (the average number of organic likes and comments per day on posts made on a given day, in relation to the number of followers accrued on the same day in the selected period) was higher for ushmm and amm, at . %.

finally, youtube interactivity was assessed mostly through views, likes and dislikes, and comments. yv and ushmm collected higher numbers of views both globally and per video (n = and n = , respectively) against only for amm. the likes vs. dislikes ratios are % for yv, % for ushmm, and % for amm. the number of comments was zero in the case of ushmm and amm, while yv collected only a very limited number of comments (n = ).

popularity

popularity was measured in terms of the number of fans/followers and the number of retweets or shares. in the case of twitter (table ), amm has the highest number of followers (n = , , ), followed by ushmm (n = , ). this proportion is also reflected in the average number of retweets per tweet, with . retweets per tweet for amm, . for ushmm, and . for yv. facebook popularity (table ) is found to be higher for ushmm, with , , fans and the highest number of shares (n = , ). instagram popularity (table ) was found to be quite similar among the three profiles, with , followers for amm, , for ushmm, and , for yv. follower growth, that is, the difference between the number of followers on the first and last days of the selected period, was found to be higher for amm and ushmm, with and additional followers, respectively.

finally, youtube popularity was measured via the number of subscribers and subscriber growth. the most popular youtube channel amongst these three museums is yad vashem with , subscribers, followed by ushmm with , subscribers.
although it is the least popular channel with subscribers, the amm channel grew by . % during the considered period.

topic content and hashtag analysis

a second, latent semantic analysis was conducted by inspecting the most commonly occurring words and hashtags used to identify conversation topics on the four social media platforms. on twitter, the words most frequently used by the three profiles are “educate” (n = . k), “history” (n = . k), “people” (n = . k), “learn” (n = . k), “online” (n = . k), and “visit” (n = . k). however, if we look at the profiles individually, we can see that these words largely coincide with those most used by the amm profile, while “nazi” (n = ), “holocaust” (n = ), and “jews” (n = ) tend to prevail for ushmm and “jews” (n = ) and “holocaust” (n = ) tend to prevail for yv.

for twitter hashtags, figure presents those most frequently used by the three twitter profiles. we can see that #auschwitz is clearly the most frequently used (n = . k), although it does not attract a high level of engagement. indeed, despite having a lower number of occurrences, hashtags such as #theresienstadt (n = ) and #zigeunerlager (n = ) generate higher engagement. breaking down these figures by profile, we see that #auschwitz is used only on the amm profile; ushmm mostly used hashtags such as #otd [on this day] (n = ) and #antisemitism (n = ), while the hashtags most frequently adopted by yv were #otd [on this day] (n = ), #martinschoeller (n = ), and # survivors (n = ).

figure . hashtags that the museums used most frequently on twitter (relative frequency is expressed both by text size and by colour).

for facebook, the most popular words employed in posts were “camp” (n = ), “nazi” (n = ), “jews” (n = ), and “holocaust” (n = ).
in terms of differences among the three profiles, amm’s most frequent words were “camp” (n = ), “prisoner” (n = ), and “auschwitz” (n = ), while ushmm’s were “nazi” (n = ), “holocaust” (n = ), and “jews” (n = ) and yv’s were “holocaust” (n = ), “family” (n = ), and “jews” (n = ).

looking at the use of hashtags on facebook (figure ), the most frequent were #auschwitz (n = ) and #backtoschool (n = ), while the one attracting most engagement was #antisemitism (n = ). broken down by institution, #auschwitz was the most frequent and engaging hashtag for amm, while #antisemitism was the most popular and engaging (n = ) for ushmm and #backtoschool (n = ) was that for yv.

figure . hashtags that the museums used most frequently on facebook (relative frequency is expressed both by text size and by colour).
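the distinction drawn here between the hashtag used most often and the hashtag attracting the most engagement can be made concrete with a small sketch. it assumes that each post is available as a pair of text and interaction count and that a hashtag's engagement is the mean number of interactions on the posts that carry it; both the data layout and that definition are illustrative assumptions, as are the `hashtag_stats` helper and the invented sample posts, and none of them is taken from the analytics platform used in this study.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#\w+")


def hashtag_stats(posts):
    """Map each hashtag to (number of posts using it, mean interactions).

    `posts` is an iterable of (text, interaction_count) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # hashtag -> [post count, interaction sum]
    for text, n_interactions in posts:
        # Count a hashtag once per post, ignoring letter case.
        for tag in set(HASHTAG.findall(text.lower())):
            totals[tag][0] += 1
            totals[tag][1] += n_interactions
    return {tag: (count, total / count) for tag, (count, total) in totals.items()}


# Invented posts, for illustration only (not data from the study).
sample_posts = [
    ("#Auschwitz history", 10),
    ("#auschwitz #OTD", 20),
    ("#antisemitism", 90),
]
stats = hashtag_stats(sample_posts)
most_frequent = max(stats, key=lambda tag: stats[tag][0])
most_engaging = max(stats, key=lambda tag: stats[tag][1])
```

in this toy sample, the most frequent hashtag and the most engaging one differ, which mirrors the pattern reported for the museums' facebook pages.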
as for instagram content analysis, the top words employed were “camp” (n = ), “jews” (n = ), “deported” (n = ), “nazi” (n = ), and “jewish” (n = ). by profile, the most employed were “camp” (n = ) for amm, “nazi” (n = ) and “jews” (n = ) for ushmm, and “jews” (n = ) and “camp” (n = ) for yv. figure shows that the most commonly used instagram hashtag was #holocaust (n = ), while the most engaging were #auschwitz (n = ), #history (n = ), and #yadvashem (n = ). broken down, the most popular were #auschwitz (n = ) for amm; #holocaust (n = ) and #history (n = ) for ushmm; and #yadvashem (n = ), #holocaust (n = ), and #history (n = ) for yv.

figure . hashtags that the museums used most frequently on instagram (relative frequency is expressed both by text size and by colour).
given the lack of hashtag use on youtube, the analysis focused exclusively on word frequency. the results show that the posted videos cover a range of topics, with a prevalence of words such as “holocaust” and “auschwitz–birkenau” (figure ).

figure . words that the museums used most frequently on youtube (relative frequency is expressed both by text size and by colour).

discussion

this study investigated how a sample of prominent holocaust museums and organisations use social media to engage their audience about topics related to the holocaust.
the results of this preliminary investigation show that, in general, the three holocaust organisations are quite active on twitter, facebook, instagram, and youtube, although with differing capacities to attract followers and to engage with them. overall, the three profiles are more active on twitter than on the other platforms, and publication date does not seem to influence the capacity to attract followers or to frequently produce content. at the same time, notable differences emerged. while amm’s activity is well established, especially on twitter, with the highest number of followers and tweets published daily, ushmm is more active (both overall and daily) and more popular on facebook; conversely, yv seems to invest more in youtube videos. the particular popularity of amm’s twitter profile is highlighted by the high average number of retweets per tweet. ushmm’s facebook page has the highest number of shared posts; they have had a presence on facebook, twitter, and youtube for more than years now, and this testifies to their social media commitment. this prioritisation is also reflected in the declaration of the auschwitz–birkenau memorial and museum to invest in “a place for discussion which is not available on the official website” [ ] and to engage with holocaust mockers and deniers [ ]. in a similar vein, the united states holocaust memorial museum has recently released a document in which it advocates the role of social media in countering holocaust denial and providing accurate knowledge for history lessons [ ]. instagram adoption is a more recent phenomenon, and here, no significant differences emerge between the three museums except for the more pronounced growth rate for ushmm and amm.
with respect to youtube activity, yad vashem has a long tradition of video production, which is also reflected in the number of subscribers and interactions related to its channel.

regarding the first research question (content type), the data show that the three museums tend to publish new or original content on their social media profiles, except for amm’s twitter profile, where there is a prevalence of reposted (retweeted) content produced by third parties. this suggests that the polish museum’s twitter profile acts as a “bridge” among other holocaust organisations’ profiles, thus contributing to cross-referencing and network-building among holocaust commemoration bodies. further research might investigate how social media is used for community building among holocaust organisations, with opportunities for the development of cooperation strategies and experiences [ ]. as for content media typology, amm and yv have a stronger tendency to publish twitter content that contains images and/or links to external resources, while ushmm seems to prefer textual information. this trend is also reflected to some extent on facebook, where ushmm tends to publish textual content accompanied by links to external resources, while yv and amm make extensive use of images and yv of video content. in this regard, future research might also investigate the relationship between the use of images and visual content and user engagement, following the example set by some recent forward-looking research studies [ ]. finally, as far as instagram is concerned, the only institution to make (limited) use of video in addition to the more standard picture or carousel posts is ushmm. however, further research is needed to study instagram’s aesthetic visual communication and how instagram grammar [ ] encourages conversation and empathy, especially in youth discourse [ ].
in response to the second research question, interactivity was found to be globally higher on instagram, where no major difference emerged among the three museums in terms of post interaction and general interactivity, although ushmm and amm posts seem to attract more comments. the situation changes completely when considering twitter, where amm has by far the highest engagement level, as is also borne out by the high number of likes per tweet that it attracts. however, if we look at the average number of responses to tweets on a given day in relation to the number of followers (twitter interaction), there is no significant difference between yv and amm, showing that more content published does not necessarily mean more user interaction. on youtube, we found a significant level of passive participation, with a high number of views and likes but no active responses in terms of comments left. however, the most interesting outcomes from the data analysis regard facebook. the multifaceted metrics available on facebook activity, such as the number of fan posts and interaction with these posts, allow for a deeper analysis of how content co-construction unfolds on this social media platform. while ushmm’s facebook page allows users to post their own photos or other content, the other two profiles do not allow active participation in their page content. despite this, ushmm has a very low reaction rate to visitors’ posts and, more generally, there is a lack of interaction among the page users themselves. this points to a broadcast-mode use of social media, which is broadly in line with previous studies showing a tendency towards mono-directional communication [ , ].
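the follower-normalised comparison described above (responses per day in relation to the number of followers) can be illustrated with a minimal sketch. the formula used here, mean daily responses divided by followers, is an assumption made for illustration and is not the analytics service's own definition; the profile names and all figures are invented.

```python
from statistics import mean


def interaction_rate(daily_responses, followers):
    """Mean daily responses per follower.

    This is an illustrative normalisation only; it is an assumption,
    not the formula used by any particular analytics service.
    """
    if followers <= 0:
        raise ValueError("followers must be positive")
    return mean(daily_responses) / followers


# Hypothetical figures, invented for illustration only.
profiles = {
    "profile_a": {"daily_responses": [400, 600, 500], "followers": 1_000_000},
    "profile_b": {"daily_responses": [40, 60, 50], "followers": 100_000},
}

rates = {
    name: interaction_rate(p["daily_responses"], p["followers"])
    for name, p in profiles.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.4%} responses per follower per day")
```

in this invented case the first profile receives ten times as many responses in absolute terms, yet both profiles have the same rate once follower counts are taken into account, which is the kind of pattern the comparison between yv and amm points to.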
this trend has been emphasised in other studies, which have highlighted the passivity of “holocaust institutions whose staff members prefer one-directional communication, ‘broadcasting’ a carefully shaped, widely acceptable message via social media but refusing to engage further and bring their considerable expertise to bear on the difficult moral questions of how to develop an appropriate communicative memory of war crimes and what political consequences to draw from that memory” [ ] (pp. – ). however, as stressed in other studies [ ], the way in which amm, for instance, engages with instagram followers shows that it can be possible to exert less control over new channels of communication and representation, thus allowing holocaust-focused institutions to assume an increasingly visible role in transnational social media holocaust discourse. nevertheless, further study and more rigorous methodological approaches are required to understand how holocaust institutions are placing users (and their responsibility for the content they choose to post on social media) at the centre of the debate on sociohistorical agency in the digital age. in the case of this preliminary study, no specific evidence emerges that there has been an erosion of institutional power over how holocaust organisations and holocaust memory are presented and curated [ ] or how social media users are exercising agency in the co-construction of holocaust digital memories [ ]. further research is needed to support these claims as well as to investigate how the perceived threat and actual manifestation of antisemitic and hate speech may be factors potentially conditioning the way memorials approach and embrace social media [ ]. finally, the third research question regards the type of content that most strongly engages fans/followers. this entailed latent semantic analysis of the most frequently used hashtags and words.
the analysis has revealed a set of terms and hashtags that refer to the basic lexicon of holocaust history, which attests to users’ strong interest in historical knowledge and less emphasis on the recent past or on analogies between contemporary events and wwii history. in this light, as kansteiner [ ] (p. ) has highlighted, holocaust-themed social media pages seem mostly to represent “a cyberspace address where [the subscribers] can hang out with peers, pursue their genocide memory interests by adding a thoughtful facet to their virtual selves, and then return to their comfortable lives”. another matter of concern relates to the centrality of auschwitz, both as a hashtag used by holocaust organisations and as a broad topic of holocaust discourse. this is reflected in the dominant popular perception of the holocaust in which auschwitz and related imagery represent an icon of the spatiality of the jewish genocide [ – ]. whether the centrality of “auschwitz” overshadows, and hence inhibits, topical discourses on final solution topics that are less familiar to the wider public is an issue worthy of more in-depth future research, as is whether it contributes to the overall paucity of remembrance of other aspects of the holocaust, such as the holocaust by bullets [ ].

limitations and conclusions

while this study has provided some useful insights based on a combination of social media analytics and topic modelling, some limitations need to be recognised. first of all, the sample generated for this study covered a timespan stretching across the summer of , when museums were still struggling to adjust to the covid-19 pandemic. their social media contents and publication strategies may have been influenced by contingent circumstances, as ordinary activity was disrupted. in this respect, further research might investigate, for instance, a possible overlap of content between facebook and youtube to increase the provision of visual content due to the closure of museums.
a second limitation concerns the adoption of the fanpage karma analytic service, which provides metrics and tools for analysis mostly based on a marketing approach. in future studies, other monitoring tools may be used to compare a diverse set of metrics and indications for engagement measures. thirdly, there is also a need to use mixed-method approaches that combine quantitative tools and qualitative instruments. for example, it is important to analyse posted content through a qualitative codebook that may use predefined or inductively derived categories to analyse historical content, moral lessons, or contemporary events related to holocaust topics. more sophisticated tools for (automatic) semantic analysis could complement such a qualitative approach. moreover, it will be important to consider diverse meanings of “engagement”, applying relative weighting to the metrics adopted for determining engagement and interactivity (in our case, e.g., “youtube interactivity was assessed mostly through views, likes and dislikes, and comments.”). these are each quite different in the nature and level of visitor engagement with the content. finally, the content of visitors’ comments, which were not the object of this study, should be considered in future research to investigate how fans/followers interact textually or with multimedia content with institutional pages/profiles. whatever the specific issues future research focuses on, research based on social media data will allow “unprecedented insights in the generation of historical consciousness because multi-platform consumption of historical content and explicit generation of historical interpretation can be recorded in unprecedented depth and breadth” [ ] (p. ).

funding: this work was supported by the international holocaust remembrance alliance (ihra) under grant no. - “countering holocaust distortion on social media. promoting the positive use of internet social technologies for teaching and learning about the holocaust”.
information , , of informed consent statement: not applicable. data availability statement: the data were obtained through a paid service and are not freely available. acknowledgments: this study was carried out as part of the author’s research project “teaching and learning about the holocaust with social media: a learning ecologies perspective”—doctoral programme in “education and ict (e-learning)”, universat oberta de catalunya, spain. conflicts of interest: the authors declare no conflict of interest. references . hilberg, r. the destruction of the european jewry, revised and definitive edition; holes and meir: new york, ny, usa, . . kansteiner, w. transnational holocaust memory, digital culture and the end of reception studies. in the twentieth century in european memory: transcultural mediation and reception; andersen, t.s., törnquist-plewa, b., eds.; brill: leiden, belgium, ; pp. – . . shandler, j. holocaust memory in the digital age: survivors’ stories and new media practices; stanford university press: palo alto, ca, usa, . . wieviorka, a. the era of the witness; cornell university press: ithaca, ny, usa; london, uk, . . frosh, p. the mouse, the screen and the holocaust witness: interface aesthetics and moral response. new media soc. , , – . [crossref] . zalewska, m. the last goodbye ( ): virtualizing witness testimonies of the holocaust. spectator , , – . . walden, v.g. what is ‘virtual holocaust memory’? mem. stud. . [crossref] . goldberg, a.; hazan, h. marking evil: holocaust memory in the global age; berghahn: new york, ny, usa, . . kansteiner, w.; presner, t. introduction: the field of holocaust studies and the emergence of global holocaust culture. in probing the ethics of holocaust culture; fogu, c., kansteiner, w., presner, t., eds.; harvard university press: cambridge, ma, usa, ; pp. – . . levy, d.; sznaider, n. the holocaust and memory in the global age; temple university press: philadelphia, pa, usa, . . hoskins, a. 
media, memory, metaphor: remembering and the connective turn. parallax , , – . [crossref] . o’connor, p. the unanchored past: three modes of collective memory. mem. stud. . [crossref] . brown, a.; waterhouse-watson, d. the future of the past: digital media in holocaust museums. holocaust stud. , , – . [crossref] . pfanzelter, e. at the crossroads with public history: mediating the holocaust on the internet. holocaust stud. , , – . [crossref] . franken-wendelstorf, r.; greisinger, s.; gries, c. das erweiterte museum. medien, technologien und internet; walter de gruyter: berlin, germany, . . bernsen, d.; kerber, u. praxishandbuch historisches lernen und medienbildung im digitalen zeitalter; verlag barbara budrich: leverkusen, germany, . . birkner, t.; donk, a. collective memory and social media: fostering a new historical consciousness in the digital age? mem. stud. , , – . [crossref] . garde-hansen, j.; hoskins, a.; reading, a. save as digital memories; palgrave macmillan: basingstoke, uk, . . henig, l.; ebbrecht-hartmann, t. witnessing eva stories: media witnessing and self-inscription in social media memory. new media soc . [crossref] . landsberg, a. engaging the past: mass culture and the production of historical knowledge; columbia university press: new york, ny, usa, . . wight, a.c. visitor perceptions of european holocaust heritage: a social media analysis. tour. manag. , , . [crossref] . commane, g.; potton, r. instagram and auschwitz: a critical assessment of the impact social media has on holocaust representation. holocaust stud. , , – . [crossref] . dalziel, i. “romantic auschwitz”: examples and perceptions of contemporary visitor photography at the auschwitz-birkenau state museum. holocaust stud. , , – . [crossref] . zalewska, m. selfies from auschwitz: rethinking the relationship between spaces of memory and places of commemoration in the digital age. digit icons , , – . . lundrigan, m. #holocaust #auschwitz: performing holocaust memory on social media. 
in a companion to the holocaust; gigliotti, s., earl, h., eds.; john wiley & sons: hoboken, nj, usa, ; pp. – . . manca, s. holocaust memorialisation and social media. investigating how memorials of former concentration camps use facebook and twitter. in proceedings of the th european conference on social media—ecsm , brighton, uk, – june ; pp. – . . rehm, m.; manca, s.; haake, s. sozialen medien als digitale räume in der erinnerung an den holocaust: eine vorstudie zur twitter-nutzung von holocaust-museen und gedenkstätten. merz , , – . http://doi.org/ . / http://doi.org/ . / http://doi.org/ . / . . http://doi.org/ . / http://doi.org/ . / . . http://doi.org/ . / . . http://doi.org/ . / http://doi.org/ . / http://doi.org/ . /j.tourman. . http://doi.org/ . / . . http://doi.org/ . / . . information , , of . pennington, l.k. hello from the other side: museum educators’ perspectives on teaching the holocaust. teach. dev. , , – . [crossref] . reading, a. digital interactivity in public memory institutions: the uses of new technologies in holocaust museums. media cult. soc. , , – . [crossref] . katz, d. is eastern european ‘double genocide’ revisionism reaching museums? dapim , , – . [crossref] . rothberg, m. multidirectional memory: remembering the holocaust in the age of decolonization; stanford university press: stanford, ca, usa, . . topor, l. dark hatred: antisemitism on the dark web. j. contemp. antisemitism , , – . [crossref] . samaroudi, m.; echavarria, k.r.; perry, l. heritage in lockdown: digital provision of memory institutions in the uk and us of america during the covid- pandemic. mus. manag. curatorship , , – . [crossref] . worldwide virtual name-reading campaign to mark holocaust remembrance day. . available online: https://www. yadvashem.org/downloads/name-reading-ceremonies.html (accessed on november ). . transformation of holocaust memory in times of covid- . 
available online: https://www.iwm.at/always-active/corona- focus/tobias-ebbrecht-hartmann-transformation-of-holocaust-memory-in-times-of-covid- / (accessed on november ). . marakos, p. museums and social media: modern methods of reaching a wider audience. mediterr. archaeol. archaeom. , , – . . gonzales, r. keep the conversation going: how museums use social media to engage the public. mus. sch. , , – . . wong, a.s. ethical issues of social media in museums: a case study. mus. manag. curatorship , , – . [crossref] . bonet, l.; négrier, e. the participative turn in cultural policy: paradigms, models, contexts. poetics , , – . [crossref] . jenkins, h.; mizuko, i. participatory culture in a networked era: a conversation on youth, learning, commerce, and politics; polity press: cambridge, uk, . . manca, s. researchgate and academia.edu as networked socio-technical systems for scholarly communication: a literature review. res. learn. technol. , , – . [crossref] . gronemann, s.t.; kristiansen, e.; drotner, k. mediated co-construction of museums and audiences on facebook. mus. manag. curatorship , , – . [crossref] . kidd, j. enacting engagement online: framing social media use for the museum. inf. technol. people , , – . [crossref] . padilla-meléndez, a.; del Águila-obra, a.r. web and social media usage by museums: online value creation. int. j. inf. manag. , , – . [crossref] . camarero, c.; garrido, m.-j.; san jose, r. what works in facebook content versus relational communication: a study of their effectiveness in the context of museums. int. j. hum. comput. interact. , , – . [crossref] . agostino, d.; arnaboldi, m.; diaz, m.l.l.; riva, p. exploring the importance of facebook post writing as a museum engagement tool. in proceedings of the th european conference on social media—ecsm , larnaca, cyprus, – july ; pp. – . . diaz, m.l.; arnaboldi, m. the participative turn in museum: the online facet. 
in proceedings of the th international conference on social media & society (smsociety’ ), toronto, on, canada, – july ; pp. – . . manca, s. bridging cultural studies and learning sciences: an investigation of social media use for holocaust memory and education in the digital age. rev. educ. pedagog. cult. stud. . [crossref] . burkhardt, h. geschichte im social web: geschichtsnarrative und erinnerungsdiskurse auf facebook und twitter mit dem kulturwissenschaftlichen medienbegriff medium des kollektiven gedächtnisses’ analysieren. in medien machen geschichte: neue anforderungen an den geschichtsdidaktischen medienbegriff im digitalen wandel; pallaske, c., ed.; logos: berlin, germany, ; pp. – . . bauer, y. creating a “usable” past: on holocaust denial and distortion. isr. j. foreign aff. , , – . [crossref] . burkhardt, h. social media und holocaust education. chancen und grenzen historisch-politischer bildung. in holocaust education revisited. holocaust education–historisches lernen–menschenrechtsbildun; ballis, a., gloe, m., eds.; springer: wiesbaden, germany, ; pp. – . . rotem, s.s. constructing memory: architectural narratives of holocaust museums; peter lang ag: bern, switzerland, . . berenbaum, m.; kramer, a. the world must know: the history of the holocaust as told in the united states holocaust; johns hopkins university press: baltimore, md, usa, . . bernard-donals, m. figures of memory: the rhetoric of displacement at the united states holocaust memorial; state university of new york press: new york, ny, usa, . . griffiths, c. encountering auschwitz: touring the auschwitz-birkenau state museum. holocaust stud. , , – . [crossref] . sievers, l.a. genocide and relevance: current trends in united states holocaust museums. dapim , , – . . burkhardt, h. erinnerungskulturen im social web. auschwitz und der europäische holocaustgedenktag auf twitter. in geschichtsunterricht–geschichtsschulbücher–geschichtskultur. 
aktuelle geschichtsdidaktische forschungen des wissenschaftlichen nach- wuchses. mit einem vorwort von thomas sandkühler; danker, u., ed.; vandenhoeck & ruprecht: göttingen, germany, ; pp. – . . sloan, l.; quan-haase, a. the sage handbook of social media research methods; sage publications: london, uk, . . lassen, n.b.; la cour, l.; vatrapu, r. predictive analytics with social media data. in the sage handbook of social media research methods; sloan, l., quan-haase, a., eds.; sage publications: london, uk, ; pp. – . http://doi.org/ . / . . http://doi.org/ . / http://doi.org/ . / . . http://doi.org/ . /jca/ . . http://doi.org/ . / . . https://www.yadvashem.org/downloads/name-reading-ceremonies.html https://www.yadvashem.org/downloads/name-reading-ceremonies.html https://www.iwm.at/always-active/corona-focus/tobias-ebbrecht-hartmann-transformation-of-holocaust-memory-in-times-of-covid- / https://www.iwm.at/always-active/corona-focus/tobias-ebbrecht-hartmann-transformation-of-holocaust-memory-in-times-of-covid- / http://doi.org/ . / . . http://doi.org/ . /j.poetic. . . http://doi.org/ . /rlt.v . http://doi.org/ . / . . http://doi.org/ . / http://doi.org/ . /j.ijinfomgt. . . http://doi.org/ . / . . http://doi.org/ . / . . http://doi.org/ . / . . http://doi.org/ . / . . information , , of . lê, j.t. #fashionlibrarianship: a case study on the use of instagram in a specialized museum library collection. art docum. , , – . . villaespesa, e. an evaluation framework for success: capture and measure your social-media strategy using the balanced scorecard. mw : museums and the web . available online: https://mw .museumsandtheweb.com/paper/an- evaluation-framework-for-success-capture-and-measure-your-social-media-strategy-using-the-balanced-scorecard/ (accessed on november ). . gerrard, d.; sykora, m.; jackson, t. social media analytics in museums: extracting expressions of inspiration. mus. manag. curatorship , , – . [crossref] . claes, f.; deltell, l. 
social museums: social media profiles in twitter and facebook – . prof. inf. , , – . . deerwester, s.; dumais, s.t.; furnas, g.w.; landauer, t.k.; harshman, r. indexing by latent semantic analysis. j. am. soc. inf. sci. , , – . [crossref] . erz, a.; marder, b.; osadchaya, e. hashtags: motivational drivers, their use, and differences between influencers and followers. comput. hum. behav. , , – . [crossref] . tsuria, r. get out of church! the case of #emptythepews: twitter hashtag between resistance and community. information , , . . auschwitz launches facebook site. bbc news. october . available online: http://news.bbc.co.uk/ /hi/europe/ .stm (accessed on november ). . the auschwitz museum has a twitter account, and this ex-journalist runs it. the times of israel. january . available online: https://www.timesofisrael.com/the-auschwitz-museum-has-a-twitter-account-and-this-ex-journalist-runs-it/ (accessed on november ). . the lessons of history and social media. us holocaust museum. august . available online: https://us-holocaust- museum.medium.com/the-lessons-of-history-and-social-media- ed (accessed on november ). . highfield, t.; leaver, t. instagrammatics and digital methods: studying visual social media, from selfies and gifs to memes and emoji. commun. res. pract. , , – . [crossref] . cole, t. selling the holocaust: from auschwitz to schindler, how history is bought, packaged, and sold; routledge: new york, ny, usa, . . partee allar, k. holocaust tourism in a post-holocaust europe: anne frank and auschwitz. in dark tourism and place identity: managing and interpreting dark places; white, l., frew, e., eds.; routledge: new york, ny, usa, . . pettigrew, a.; karayianni, e. ‘the holocaust is a place where . . . ’: the position of auschwitz and the camp system in english secondary school students’ understandings of the holocaust. holocaust stud. . [crossref] . vice, s. ‘beyond words’: representing the ‘holocaust by bullets’. holocaust stud. , , – . 
[crossref]

author biography

stefania manca is a research director at the institute of educational technology of the national research council of italy. she has a master’s degree in education and is a phd student in education and ict (e-learning). she has been active in the field of educational technology, technology-based learning, distance education and e-learning since . her research interests include social media and social network sites in formal and informal learning, teacher education, professional development, digital scholarship, and student voice-supported participatory practices in schools. she is currently working on a three-year research project about the application of social media to holocaust education from a learning ecologies perspective. she is author of scientific publications on various topics of educational technology, co-editor of the italian journal of educational technology (formerly td tecnologie didattiche), and part of the editorial and scientific boards of international and national journals and conferences on technology-enhanced learning.
ecdotica ( )

alma mater studiorum. università di bologna, dipartimento di filologia classica e italianistica
centro para la edición de los clásicos españoles
fondazione cassa di risparmio in bologna
carocci editore

ecdotica, founded by francisco rico, with gian mario anselmi and emilio pasquini

editorial board: bárbara bordalejo, loredana chines, paola italia, pasquale stoppelli

scientific committee: edoardo barbieri, francesco bausi, pedro m. cátedra, roger chartier, umberto eco †, conor fahy †, inés fernández-ordóñez, domenico fiormonte, hans-walter gabler, guglielmo gorni †, david c. greetham, neil harris, lotte hellinga, mario mancini, armando petrucci †, marco presotto, amedeo quondam, ezio raimondi †, roland reuß, peter robinson, antonio sorella, alfredo stussi, maria gioia tavoni, paolo trovato

managing editor: andrea severi

editorial staff: veronica bernardi, federico della corte, rosy cupo, marcello dani, sara fazion, laura fernández, francesca florimbii, albert lloret, alessandra mantovani, amelia de paz, stefano scioli, marco veglia, giacomo ventura

ecdotica is a peer reviewed journal. anvur: a

ecdotica guarantees and answers for the value and rigour of the contributions published in the journal, though it does not always or necessarily share their perspectives and points of view.

online: http://ecdotica.org

alma mater studiorum. università di bologna, dipartimento di filologia classica e italianistica, via zamboni , bologna. ecdotica.dipital@unibo.it
centro para la edición de los clásicos españoles, don ramón de la cruz, ( b), madrid. cece@uab.es
carocci editore, corso vittorio emanuele ii, roma

with the extraordinary contribution of the university of bologna and with the patronage of alma mater studiorum, università di bologna

contents

saggi
hans walter gabler, beyond author-centricity in scholarly editing
barbara bordalejo, peter m.w. robinson, manuscripts with few significant introduced variants
alberto cadioli, «per formare edizioni corrette». casi ecdotici tra sette e ottocento
john k. young, the editorial ontology of the periodical text
joris j. van zundert, why the compact disc was not a revolution and cityfish will change textual scholarship, or what is a computational edition?

foro. manuali di filologia.
maria luisa meneghetti, manuali di filologia (romanza)
paolo trovato, qualche riflessione su alcuni manuali recenti, compreso il mio
barbara bordalejo, philology manuals: elena pierazzo’s digital scholarly editing

testi
stefano carrai, paola italia (a cura di), la filologia e la stilistica di dante isella. per una antologia

questioni
peter m.w. robinson, the texts of shakespeare
stephen greenblatt, can we ever master king lear?
pasquale stoppelli, ricordo di conor fahy ( - ), con un’ipotesi sul «cancellans» del furioso del

rassegne
hans walter gabler, text genetics in literary modernism and other essays (c. rossi)
edgar vincent, a.e. housman: hero of the hidden life (j. lawrance)
alonso víctor de paredes’ institution, and origin of the art of printing, and general rules for compositors (t. j. dadson)
l. chines, p. scapecchi, p. tinti, p. vecchi galli (a cura di), nel segno di aldo (g. montecchi)
t. zanato, a. comboni (a cura di), atlante dei canzonieri in volgare del quattrocento (g. ventura)

manuscripts with few significant introduced variants

barbara bordalejo - peter m.w. robinson

the literature of stemmatics is rich in discussions of two phenomena which, it is commonly held, render the orderly assignation of manuscripts into families problematic, even impossible. these two phenomena are coincident agreement (where unrelated manuscripts share the same reading, apparently by simple coincidence) and contamination (where a manuscript combines readings from two or more manuscripts). in this article, we suggest that there is a third area of difficulty which causes considerable problems to the stemmatic project.
this third area is the phenomenon of multiple manuscripts within a tradition which cannot be assigned to any family because there is no consistent pattern of agreement in introduced variants between them and other manuscripts. this article describes four textual traditions in which we find this phenomenon, and reflects on how editors have responded to it. although it appears that no previous scholar has isolated the case of manuscripts with few significant shared introduced variants as a problem, our identification of this as a cause of editorial difficulty in four unrelated manuscript traditions (not to mention the exceptional importance of three of those four) leads us to posit that this phenomenon, though previously unacknowledged, may be widespread. indeed, it is likely to be present in every large manuscript tradition.

note: for discussion of the problems caused by contamination and coincident agreement see kane’s introduction to his edition of the a version of piers plowman (george kane, ed. piers plowman: the a version. london, athlone, ).

note: the glossing of “significant” in the formulation of the problem (“of manuscripts with few significant shared variants”) by the phrase “no consistent pattern of agreement” is deliberate. it is our core conviction, based on decades of work with digital tools, that “significant variants” are defined entirely by how the variants are distributed across the whole tradition. that is: if we find a number of variants which are present, over and over again, in the same distinctive pattern of witnesses, then those variants are significant (robinson, «four rules for the application of phylogenetics in the analysis of textual traditions», digital scholarship in the humanities, ( ), pp. - ). this differs sharply from the practice of traditional stemmatics, which puts considerable effort into a priori attempts to define, on the basis of the variant itself (omission? substantive semantic shift?) whether it is “significant” or not (for example, e. vinaver, «principles of textual emendation», studies in french language and medieval literature presented to mildred k. pope, the university press, , pp. - ). it is also a crucial tenet in our work that we base our identification of what patterns of agreement are significant on the most complete collation possible (every word in every witness, or as close as we can manage) rather than any kind of sampling. on the dangers of sampling, see p.m.w. robinson, «the textual tradition of dante’s commedia and the barbi ‘loci’», ecdotica, ( ), pp. - .

first identification: the old norse narrative sequence svipdagsmál

robinson first observed the phenomenon of manuscripts which share few variants with any other manuscripts within a textual tradition in the course of his doctoral work on the old norse svipdagsmál. svipdagsmál is the name given to two poems, gróugaldr and fjöllsvinsmál, normally appearing one after another in manuscripts and long recognized as forming a single narrative sequence, named for the protagonist of the two poems, svipdagr. the two poems are found in some manuscripts, all dating from after c. , although the two poems were likely composed and first copied some two centuries before.

note: p.m.w. robinson, «an edition of svipdagsmál», doctoral dissertation, oxford, .

the svipdagsmál tradition has several features which made it serendipitously suited to an exploration of stemmatic techniques. firstly, an extraordinarily high number of manuscripts are known from unambiguous external evidence to have been copied from one another. fourteen of the forty-six manuscripts are linked as exemplar and copy. this evidence of direct filiation could be used as both the foundation of analysis and as a check upon it. thus, one could measure the success of a quantitative method by the degree to which the method was able to link these fourteen manuscripts. secondly, the high proportion of manuscripts explicitly linked to one another suggested that the surviving manuscripts represent a high proportion of all those which ever existed. we are not facing the situation we have with, for example, the greek new testament where whole branches of the tradition disappeared or left just one or two representatives behind. thirdly, at one point in the history of the tradition it became fashionable to turn a manuscript into a mini-edition, by writing variants from other manuscripts into its margins. when these manuscripts were copied some of these marginal variants were copied from the margin into the text, thereby creating a useful laboratory for exploring contamination. finally, the tradition was sufficiently compact ( manuscripts of a text of around short lines) for it to be completely transcribed, collated and analyzed within the span of a doctorate.

considerably aided by these advantages, robinson was able to produce the table of relationships of the manuscripts given in figure . in essence, he used classic lachmannian techniques to identify groups of manuscripts which shared distinctive (often nonsensical) readings and hence form a distinct family within the tradition. robinson also used a database to validate and refine the identification of distinctive sets of variants. with these tools, cross-checked against the external knowledge of what manuscripts were copied from which, he was able to allocate almost all the manuscripts to one of five groups.

figure : table of relationships of the manuscripts of svipdagsmál.

almost all: but not all.
six manuscripts did not fit into any of the five groups. if they did have any of the variants characteristic of any group, there were so few variants (perhaps just one or two) that the presence of these few variants was likely to be the result of mere coincidence. having failed to allocate any of the six to any of the five groups, robinson sought evidence that any of the six might form some kind of affiliation with any others in the six. once more, if any variants suggested any such affiliation there were so few that they were likely to be the result of mere coincidence. further: there were no variants whatsoever shared by all six (or even any three or more of the six) and not found in the rest of the tradition. hence there is no evidence that the six descend from a single exemplar and no evidence that the six might be divided into smaller groups.

however, to say that there is no evidence of any such affiliation is not the same as to say there is no such affiliation. the problem is not just that these manuscripts share very few variants with any others. it is that in these six there are very few variants of any kind: they seem to be particularly careful copies of their exemplars, all the way back to the common archetype of the whole tradition. of course, there are variants. but these few variants are either found randomly distributed elsewhere in the tradition or occur nowhere else, and hence have no classificatory power. however, this lack of positive evidence of affiliation within the six cannot prove a negative, that there is no affiliation. the six might represent just one line of descent, or as many as six separate lines of descent.

robinson was left, as the editor, with a table of manuscript relationships which showed three clear branches (whose heads are represented by st x and x in figure ) and these six manuscripts.
he designated these six as «manuscripts not members of any groups» and placed them, rather arbitrarily, around the centre of the map, with a line pointing to the six coming (again, arbitrarily) from the line leading to x in the table.

what should an editor do with this information? according to the classic lachmannian formula one should go through the text word by word and at every point where there is variation one should look at the stemma, see which variant was in the most lines of descent and declare it the winner. this would be “scientific” editing indeed (except of course in the annoying case of competing variants being in equal numbers of lines of descent, where one would have to use some kind of editorial judgement).

it appears that trovato would have had robinson proceed in exactly this manner (p. trovato, everything you always wanted to know about lachmann’s method, padova, libreriauniversaria.it, , pp. - ). trovato’s certainty that he knows how to edit svipdagsmál without ever having looked at a single word of the poem is impressive.

by this time, robinson was very skeptical of this procedure. it takes no account of the fundamental rule of lectio difficilior: that a “difficult” reading, though present in a minority of lines of descent or even in none, might have been the origin of “easier” readings found in more lines of descent, and so should be selected by the editor. as we discuss in the next section, at very many points an editor of the canterbury tales would have good reason to select such “difficult” readings. secondly, he had by now come to think that this was not a stemma at all, certainly not in the sense that it offered an iron-clad representation of how the manuscripts related to one another. this applied especially to these six manuscripts. if the six were treated as six lines of descent independent of each other, then their testimony might overwhelm the other three lines of descent.
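the effect of that last point can be made concrete with a minimal sketch. the python below is not robinson's procedure: it simply applies the majority rule described above to one hypothetical variation place, with made-up branch names and readings, and compares the outcome when the six ungrouped manuscripts count as one line of descent with the outcome when they count as six.

```python
from collections import Counter


def majority_reading(readings_by_line):
    """Return (winners, counts): the reading(s) attested in the most lines of descent.

    `readings_by_line` maps a line of descent (hypothetical label) to its reading.
    More than one winner means a tie, which the rule cannot resolve mechanically.
    """
    counts = Counter(readings_by_line.values())
    top = max(counts.values())
    winners = sorted(reading for reading, n in counts.items() if n == top)
    return winners, dict(counts)


# hypothetical variation place: the three grouped branches read "a",
# while the six ungrouped manuscripts all read "b".
branches = {"branch1": "a", "branch2": "a", "branch3": "a"}
ungrouped = {f"ms{i}": "b" for i in range(1, 7)}

# case 1: the six are assumed to descend from one exemplar (one line of descent)
one_line = dict(branches, ungrouped="b")

# case 2: the six are assumed to be six independent lines of descent
six_lines = dict(branches, **ungrouped)

print(majority_reading(one_line))   # "a" wins, three lines to one
print(majority_reading(six_lines))  # "b" wins, six lines to three
```

the same evidence yields opposite texts depending on an assumption about the six that the variants themselves cannot confirm, which is why the mechanical rule offers no safe guidance here.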
in particular: it meant accepting that any one of the six, all copied late in the tradition, was the equivalent of (for example) stockholm papp. (“st”), copied some years earlier, likely in iceland, probably only one or two copies away from the single now-lost medieval exemplar of the whole tradition.

in the event, robinson took a pragmatic course as an editor. he elected to use st and another early manuscript (“ra”) as the base for his edition, with a preference for st where possible, and as the base for the spelling of the edition. he sought to keep the reading of st where possible; when it was not possible, he looked to ra; and when neither yielded a good reading he looked among the other manuscripts (guided by the table of relationships given in figure ), and at the work of other editors (an edition of svipdagsmál, pp. - ). it is notable that he did not accept one reading occurring in any of the six and nowhere else.

it would not be unfair to say that robinson solved the problem of what to do with these six manuscripts by ignoring them. in the context of svipdagsmál this was possible. however, this is not a remedy for all occasions.

geoffrey chaucer’s canterbury tales and o

john manly and edith rickert are the only scholars to have produced a complete analysis of the whole textual tradition of the tales, a labour that took them some twenty years. the results of their work were published in in eight volumes, of which the first two are dedicated to the descriptions of the witnesses and the analysis of their findings and the resulting genetic groups. their grouping of witnesses of the tales is one of the two most enduring conclusions reached by manly and rickert.

j.m. manly, e. rickert, eds, the text of the canterbury tales: studied on the basis of all known manuscripts, vol. , chicago, chicago university press, .
it was also their opinion that national library of wales peniarth d (hengwrt; hg) had the best extant text of the canterbury tales, and they used this manuscript as their base-text.

even though their groupings present considerable problems, their basic structure has been retained and used by every scholar after them. vance ramsey, for example, points out that before manly and rickert the majority of the studies carried out ended up with a binary classification of the manuscripts, a condition avoided by their classification. manly and rickert also made an important contribution to the refinement of the stemmatic method (and part of the basis for the new stemmatics). they proposed that not only do errors have to be taken into account when establishing relationships between texts, but also agreements in possibly “correct” readings.

it follows that for their own research and for the classification of the witnesses, they used all agreements and identified those which they regarded as indicative of what they called “variational groups”. in their wording, these indicative agreements must be “persistent” and “consistent” to have potential to show the relationships between genetic groups. aside from the importance that manly and rickert conferred upon hengwrt, they also showed that there were certain other manuscripts that were especially relevant. in the end, manly and rickert proposed four main groups (a b c d) and an agglomeration of unclassified manuscripts. their classification has been in use since the publication of their work in .

despite this massive effort, and the broad acceptance of manly and rickert’s major conclusions about the value of hengwrt and the four major groups, their edition has been admired rather than used. «no chaucer edition before it [manly and rickert’s] had been supported by such an elaborate apparatus: six volumes to accompany two of text», and perhaps its sheer volume was one of the reasons that textual critics made little use of it. the reception of their work was also influenced by the opinions of those who doubted their methodology. kane, for example, repeatedly accused them of making mistakes, such as using skeat’s student edition as their base for collation when this was an unoriginal text, or supposedly assuming that the rate of variation is uniform among witnesses. what manly and rickert were looking for was evidence of non-random variation which was not the result of agreement by coincidence, thus their interest in “consistency.”

v. ramsey, the manly and rickert text of the canterbury tales. first ed. lampeter, wales, edwin mellen press, , p. .

manly and rickert eds, the text of the canterbury tales, i .

manly and rickert eds, the text of the canterbury tales, i .

g. kane, «john m. manly and edith rickert», in editing chaucer: the great tradition. ed. paul g. ruggiers, norman, oklahoma, pilgrim books, , p. .

perhaps the major reason their edition has not had the use they would have wished is the extraordinary complexity of the picture of manuscript relations they give. they choose to present their analysis tale by tale, and the result is that we are offered some thirty-nine separate histories, each one of them different from each other. manly and rickert explain this (as have many following them) by arguing that this is because of “part-publication”: that the separate parts were originally published separately, and this is why the histories are distinct. yet, this picture of part-publication is contradicted, as noted above, by the evidence that their four constant groups are indeed constant: they appear in every one of the separate histories. how is this possible if the histories are separate?

indeed, as we go from part to part, a pattern emerges. the same constant groups, and some pairs, do appear in every part history.
but in each part there appears to be a loose set of manuscripts which usually stand apart from the constant groups, but whose relationships with each other vary from part to part.

for example, germaine dempster, «manly’s conception of the early history of the canterbury tales», pmla, ( ), pp. - , has pointed out that one must be manly to understand the four-hundred page account of the manuscripts in volume ii. one needs to understand just what is meant by cd, and how this differs from cd*, and how both differ from √cd. a key is given on manly and rickert, volume , pp. - to all the “constant pairs” and “constant groups” represented by manly and rickert’s conventions: this key is not set out clearly, and so dense are the references to these pairs and groups that the reader is soon fatigued with moving back and forth from the text to the key.

manly and rickert did not assert this. instead the quote provided by kane states that «the law of probability is so steady in its working that only groupings of classificatory value have the requisite persistence and consistency to be taken as genetic groups ( . )».

thus, they assign oxford christchurch ms (ch) the following relationships in four parts:
– in the general prologue: it is with hg el gg doto , as not sharing an ancestor from which the other manuscripts descend, and hence perhaps independently descended from the archetype
– in the miller’s tale: it appears on its own, as representing a line of descent distinct from el hg (which represent two other lines of descent), with gg now grouping with mss ad /ha and to apparently contaminated by el
– in the wife of bath’s prologue: ch appears to join a group made up of ad /ha ra tc gl, but then they appear to qualify this by asserting that at some point (their argument is here unclear) it shifts allegiance from this group to a distinct group composed of hg ht bo
– in the nun’s priest’s tale: ch appears to come from the same exemplar as hg el gg ad and the a group.

one notices that in all four cases, ch is linked to hg, and in several to el. we highlight these three manuscripts for several reasons. hg (hengwrt) has been long acknowledged as presenting an excellent text, and it has been recently suggested that both hg and el (the ellesmere chaucer manuscript, huntington library, san marino) were written by adam pinkhurst, who as well as bearing the name “adam”, which may make him the “adam scriveyn” addressed as his scribe in a poem by chaucer, may have worked as chaucer’s scribe in the london customs house from to . this would place the copying of both manuscripts very close to chaucer himself. we add ch to this pair because in all sections of the tales analyzed by us so far, ch el hg form an extraordinarily close trio, over and over sharing variants often found nowhere else or in very few other manuscripts.

because of the likely closeness of these manuscripts to the original of the whole tradition, a possible explanation for these inconsistencies presents itself. we noted above that the manly and rickert groupings do not rely solely on errors to establish genetic affiliations. indeed, this is one of their strong points.
however, the danger is that the editors may fail to realize that they are in the presence of an archetypal reading and attempt to classify and group texts based on such readings. archetypal variants are non-classificatory from a stemmatic perspective because they could be (and should be expected to be) distributed in all parts of the tradition. only variation that has been introduced below the archetype is significant for the classification of witnesses into distinct family groupings. thus: it might be that what manly and rickert see as evidence of affiliation might simply be agreement in ancestral readings in the group hg el ch (joined often by gg ad ht and others).

l. mooney, «chaucer’s scribe», speculum, ( ), pp. - .

since , the canterbury tales project, now led by the co-authors of this article, has been following in the footsteps of manly and rickert. our approach has much in common with that of manly and rickert. like manly and rickert, we believe that we have to base our analysis on the variants at every word in every witness. like them, we think we should disregard the question of originality in seeking to establish consistent groupings of manuscripts, and we should base these groupings on “persistent” and “consistent” attestation of witness groupings – though we are acutely aware that some of these groupings may be in variants ancestral to the whole tradition, and so not indicative of families within the tradition.

as of this date, four of the separate parts of the tales have been fully collated, analyzed and published: the general prologue, miller’s tale, wife of bath’s prologue and nun’s priest’s tale. for these sections we are able to compare directly our results with those of manly and rickert. firstly, we confirm the existence of the four “constant groups” a b c d, clearly present in each section with the same core manuscripts identified by manly and rickert.
secondly, we found ourselves confronted directly with the same phenomenon which manly and rickert found, of manuscripts which do not belong to any of the constant groups but which do not seem to have any other settled affiliation. the outstanding example was ch, which we found over and over sharing readings with both or either one of hg and el, frequently against almost every other manuscript in the tradition. we found a number of other manuscripts which followed the same pattern, though less frequently in agreement with the key hg/el pair than ch. the agreement with hg/el, very commonly in a lectio difficilior usually replaced by an easier reading in the constant groups, suggested to us that these variants were actually present in the archetype of the whole tradition, and that their appearance in these manuscripts was evidence of their common descent from, and their closeness to, the archetype. hence, we named these the o manuscripts, and the variants the o variants, identifying them as such in robinson .

we list here a few variants from the miller’s tale and the nun’s priest’s tale, all following the same pattern. our comments below explain why we believe the hg el reading (here always with ch, and usually with a few others) is archetypal:

two other sections have been fully transcribed and collated, for the merchant’s and franklin’s tales, but not analyzed and published.

miller’s tale : i am thyn absolon my derelyng
my - ch el gg hg ps to
thyn dere - ad bo en ha mg mm ph
thyn - ad bo bw cn cp dd ds en en fi ha ha hk ht la lc ld ma ne nl pw py ry ry se sl tc
thyn owne - cx cx dl he ln pn ra tc wy
i am thyn dere - gl
o my - ha ii
thyn swete - ra
and thyn - sl

line of the miller’s tale is a clear case of lectio difficilior. if the text had modern punctuation we would have a comma after “absolon” clearly indicating “my derelyng” is a vocative expression.
the analysis of the miller’s tale states:

we can imagine it working superbly in a live performance or reading. but it is exactly this shift which a scribe, working from a written exemplar, might fail to catch: and the evidence is that apart from witnesses close to the original (the trio el hg ch; but also to gg ps with the pair ii ha having the related ‘o my’), every other copy failed to register this, and substituted ‘thyn’ for ‘my’ following the ‘thyn’ earlier in the line. once this change was made, it was very unlikely to be reversed, and hence the complete absence of ‘my’ from elsewhere in the tradition.

in this variant, the direction of variation seems obvious, from the lectio difficilior to an easier one in a relatively easy mistake to make. when scribes replaced “my” with “thyn” they did so in following the previous “thyn absolon.” one can easily see why the derivative reading might have made sense for the scribes who were not particularly interested in the performative aspects of the text. what we find most interesting about the distribution of this variant is that it is present in various witnesses representing independent lines of descent. thus, hg el ch ha gg all descend from the archetype of the tradition but each in a separate line (although gg is related to other witnesses).

p.m.w. robinson, b. bordalejo, «stemmatic commentary», in the miller’s tale on cd-rom, edited by p. robinson, leicester, scholarly digital editions, .

b. bordalejo, p.m.w. robinson, «stemmatic commentary», in the nun’s priest’s tale on cd-rom. edited by paul thomas, leicester, scholarly digital editions, .

nun’s priest’s tale : in which she hadde a cok heet chauntecler
a cok heet - ch el hg me
a cok hight - ad ad bw cn cx cx ds en fi he ii ln ma ne nl pn py tc wy
a cok that hight - bo cp dl en gl ha ha ha ht la lc ld mc mg mm ph ph ps pw ra ry se sl sl tc to
a cok hight chaū - en
that hight - ry

as in the previous example, the archetypal reading here is a lectio difficilior. four witnesses preserve this reading, including hg el and ch, but also me (a fragmentary manuscript currently at the national library of wales). here is what we wrote as part of the stemmatic commentary of the nun’s priest’s tale on cd-rom:

there are two characteristic patterns of variant distribution associated with a distinctive or difficult reading preserved almost alone in o. in the first model (as here) we see the archetypal difficult reading generate a range of variants through the tradition as different scribes struggle with the reading (contini’s ‘diffraction’): for further instances, see on np . in the second model, the reading is replaced by a single, obvious and easier reading, which might occur independently to different scribes: for instances of this, see on np .

we have found that it happens with relative frequency that an easier reading appears independently in otherwise unrelated witnesses within the tradition. this occurs when the reading is easily conjectured through its contexts. this is the case of the variant, certres / sterres in kt where the context allows anyone to guess that the intended reading must have been “sterres”. despite that, the variant distribution points towards a misplaced abbreviation in the archetype. bordalejo wrote about this variant:

kt is another example in which the variant in cx agrees with ad ch and ha . hg el cp dd gg and la share the reading ‘sertres.’ only cx has ‘serelis.’ it could be assumed since ad ch ha and cx have shown a consistent relationship in this part of the text, that their ancestor corrected a mistake in o.

w. skeat, the evolution of the canterbury tales, second series. first ed., vol. , london, trubner & co., limited (for the chaucer society), .

b. bordalejo, «the manuscript source of caxton’s second edition of the canterbury tales and its place in the textual tradition of the tales», phd. de montfort, , p. .

in each of these examples, we see three manuscripts our research has shown to be descended directly from the archetype, hg el ch, agreeing in difficult and likely chaucerian original readings against almost all other manuscripts. some other manuscripts do agree with them, but in an inconsistent way. consider the following examples of what we call o readings (readings coming directly from the archetype), showing for each likely archetypal reading just what manuscripts agree with the trio hg el ch:

link , line : to - ch dd el en gg hg ps pw
link , line : that i - ad bo ch dd el en ha ha hg ht ln ph
link , line : preye - bo ch dd ds el en gg hg hk ra tc to
link , line : fame - bo ch cp dl el gg hg la ra sl tc
link , line : nor - bo ch ds el en gg hg hk ln ra
miller’s tale, line : hem - ch dd dl el en gg ha hg ii lc mg ps
miller’s tale, line : ich - ch cp dd el en hg
miller’s tale, line : wyndow - ad bo ch cp el gg hg la lc mg ph ra tc to
miller’s tale, line : til - bo ch cn cp cx el en gg hg la ma pn ra to
miller’s tale, line : for - ad ad bo ch cp dd el en hg la ln ma ph
miller’s tale, line : astromye - bo ch cn el hg la py
miller’s tale, line : astromye - bo ch el hg
miller’s tale, line : he cogheth - ad bo ch dd el gg ha ha he hg ph to
miller’s tale, line : knokketh - ad bo ch dd el en ha ha he hg pn ps to
miller’s tale, line : til he cam - ad ad ch dd ds el en en ha hg la pn ps ry
miller’s tale, line : he brosten hadde - ad ad ch dd ds el en en ha hg la pn ps ry
miller’s tale, line : that - bo ch cn ds el en ha hg hk ma nl
nun’s priest’s tale, : no wyn ne drank she - ch cp cx el ha hg pn ry sl wy
nun’s priest’s tale, : a cok heet - ch el hg me
nun’s priest’s tale, : he krew - ch cx cx el ha he hg ne pn pw py se tc wy
nun’s priest’s tale, : was it - ad ch ds el en ha hg ld ma

this is only a selection of the instances found, in these four and other sections. we note the inconsistencies between the witness distribution in these agreements. besides hg el ch, the other witnesses come and go. this makes grouping of these o witnesses impossible. where they agree with hg el ch this is likely only agreement in ancestral variants. where they do not agree with hg el ch they agree with other witnesses in such a random way that one cannot infer any groupings.

we can now recognize the same fundamental phenomenon we saw in the svipdagsmál tradition. we see in these variants, again and again, the same sigla: ad ad bo dd ds en ha ha hk ps py to and others, with a different selection of sigla joining hg el ch at each instance. once more, we have the case of multiple manuscripts within a tradition which cannot be assigned to any family because there is no consistent pattern of agreement in introduced variants between them and other manuscripts. this is complicated here by the clear ancestral nature of these variants, making it still more difficult to divide these manuscripts into subfamilies.

in terms of editing: the identification of these variants as likely to be archetypal is important. further, the identification of a set of manuscripts as likely to have archetypal variants where others do not gives the editor reason for confidence. if a reading is (for example) in two of hg el ch, and in a number of these other manuscripts, it is then highly likely to have been present in the archetype of the tradition, regardless of what other witnesses might or might not attest to it.
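the come-and-go pattern can be tallied directly. the following python sketch is ours for illustration and is not part of the canterbury tales project's tooling: it takes five of the attestations listed above and counts how often each witness outside hg el ch joins the trio. note the assumption: sigla are used exactly as printed here, so witnesses that share a printed siglum (such as the two "ha" in the second reading) collapse into one, and the counts are indicative only.

```python
from collections import Counter

TRIO = {"hg", "el", "ch"}

# five attestations copied from the list above (reading -> sigla as printed).
# assumption: repeated sigla in one entry are treated as a single witness.
O_READINGS = {
    "to": "ch dd el en gg hg ps pw",
    "that i": "ad bo ch dd el en ha ha hg ht ln ph",
    "ich": "ch cp dd el en hg",
    "astromye": "bo ch el hg",
    "a cok heet": "ch el hg me",
}


def companions(readings):
    """Count, for each witness outside the trio, the readings it shares with the full trio."""
    tally = Counter()
    for sigla in readings.values():
        witnesses = set(sigla.split())
        if TRIO <= witnesses:  # only count places where all of hg el ch agree
            tally.update(witnesses - TRIO)
    return tally


tally = companions(O_READINGS)
total = len(O_READINGS)
for siglum, n in sorted(tally.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{siglum}: joins the trio in {n} of {total} readings")
```

in this small sample no witness outside the trio is present at every reading, which is the inconsistency described above: the agreements are real, but they do not add up to a group.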
dante’s commedia and α

we focus here on just one aspect of the vast textual tradition of the commedia ( manuscripts complete in at least one canticle): the question of manuscripts close to the archetype, which – like the o witnesses of the tales, and the ungrouped manuscripts in the svipdagsmál tradition – evidence the phenomenon of sharing few significant introduced variants with other manuscripts.

first, a brief history. the most influential edition of the commedia of the last decades is that of giorgio petrocchi, first published in . petrocchi elected to build his edition on manuscripts which date from before boccaccio’s copying of the commedia around , representing what he called the “antica vulgata”. from a complete collation of these he created the stemma given below, which he then used in the making of his text. his use of his stemma in the editing of his text was (as it happens) quite similar to robinson’s practice in svipdagsmál in that he used the stemma as a guide when selecting readings, and not as an iron rule. however, he did not select a single manuscript as the base: rather, he typically chose readings from what one might term the florentine tradition, from the influence of the trivulziano manuscript, written in florence in .

dante alighieri, la commedia secondo l’antica vulgata, a cura di g. petrocchi, milano, mondadori, .

figure petrocchi’s stemma of the antica vulgata manuscripts of the commedia.

petrocchi’s edition was a staggering effort by a much respected scholar. however, in federico sanguineti published a new edition of the commedia which challenged both petrocchi’s methods and his conclusions. petrocchi’s methods: where petrocchi based his analysis on a collation of every variant in witnesses, sanguineti claimed that he had looked at variants in all manuscripts: in fact, at the variants in some lines (the “barbi loci”). petrocchi’s conclusions: petrocchi divided the manuscripts at the top of his stemma into five families (designated a to e), and suggested that these five groups may descend from two exemplars, α (abc) and β (de). sanguineti retained the fundamental division of petrocchi’s stemma, with all the manuscripts descending from two copies made from the exemplar, which he names (following petrocchi) α and β. however: according to his analysis the beta family consists of precisely one manuscript: vatican library codex urbinate latino (urb), with all or so other manuscripts descending from the α hyparchetype. this contradicted petrocchi, who placed two other manuscripts as descending from the same beta branch. sanguineti also upended the fundamental premise of petrocchi’s edition, that no manuscript after had value for the establishment of the text, by including laurenziana santa croce ms. plut. . (lausc), dating from around , in the base seven manuscripts he chose as “necessary and sufficient” for the making of an edition.

dantis alaghierii comedia, a cura di f. sanguineti, firenze, sismel-edizioni del galluzzo, .

this simplifies matters somewhat: what petrocchi calls “d” is not really a family but a hypothetical ancestor of la, which also contains readings (presumably by contamination) from petrocchi’s c.

figure sanguineti’s stemma.

around , while sanguineti was developing this argument, prue shaw and robinson became aware of his work through two australian scholars, mary dwyer and diana modesto. the first plan of this group was for all to work together on a digital edition of the commedia based on the “sanguineti seven”, the seven manuscripts identified by sanguineti as the base for an edition. for various reasons this collaboration did not continue.
in the event, shaw and robinson decided to proceed on their own, with shaw as editor and robinson responsible for the technical aspects of the edition, particularly its use of digital tools for transcription, collation and analysis. bordalejo joined this team around , and was responsible for the formal specification of the transcription system used by the edition, the training of the collation team, and the overseeing of the collation process.

shaw’s edition of the commedia was published in . in the period since its first conception as a partnership involving sanguineti, and following sanguineti’s withdrawal from the collaboration in , the purpose of the edition had shifted markedly. as well as exploring the tradition, we now focussed on testing sanguineti’s hypotheses about the tradition. they did not fare well. our analysis suggested that urb was not the unique representative of the β family. rather, a second manuscript also appeared to descend from β: rb. petrocchi had suggested this affiliation and our analysis confirmed it, thereby exactly halving the value of urb. also, our analysis confirmed the traditional view of the lausc manuscript as valueless for the establishment of the original text. we were able to show that it presents an extremely eclectic text, typical of post-boccaccio texts, hence adding support to petrocchi’s choice not to include post - manuscripts.

however, there was one key assertion by sanguineti on which we could not give a definitive answer. it was his explicit assertion that there was a single exemplar, α, from which almost all the manuscripts descend. in terms of the manuscripts analyzed in the shaw edition, this would mean that the two non-β pairs of manuscripts, mart/triv and ash/ham, descend from a single exemplar below the original. the question is important because if there is such an exemplar, then a reading present in both groups represents only one line of descent.
if there is not such an exemplar, then a reading present in both mart/triv and ash/ham represents two independent lines of descent and so has double the evidentiary weight of (say) a reading found only in the beta manuscripts.

dante alighieri, commedia. a digital edition, edited by p. shaw, leicester and florence, scholarly digital editions and sismel, .

accordingly, robinson sought evidence that there was such an exemplar, using the digital vbase tool found in the shaw edition. vbase allows you to ask complex questions about the distribution of variants across a textual tradition. in this instance, if there were an α exemplar below the archetype from which the two pairs mart/triv and ash/ham both descend, distinct from the β exemplar, how might the variants introduced by that exemplar be distributed across the tradition? one might expect each variant to satisfy the following conditions:

1. the variant should be present in all four of ash ham mart-c triv;
2. it should not be present in either the editions of petrocchi (pet) or sanguineti, and so according to their best judgement, it is unlikely to have been present in the archetype;
3. it should not be present in either β manuscript (urb rb)

“mart-c ” was the designation we gave to the variants introduced by luca martini in into his copy of the aldine edition from a now-lost manuscript written in - .

the shaw edition vbase tool allows the reader to find out, in a few seconds, which variants might satisfy these conditions. figure shows our use of this tool to identify the putative set of variants evidencing a shared ancestor below the archetype for the two pairs ash/ham and mart/triv:

figure the vbase tool, showing a search for evidence of α.

the first line of the query corresponds to the first requirement, that it should be present in all four of ash/ham mart/triv.
the second line corresponds to the second and third requirement, that it should not be present in any of fs pet rb urb. as this figure shows, there are just variants in the , variants in the lines of the commedia which satisfy these conditions.

to complicate matters still further: if one alters the query slightly, to return variants where one of urb/pet agrees with ash/ham and mart/triv, the number more than doubles, to variants. one could explain this by hypothesizing that in fact all of ash/ham mart/triv urb/rb share an ancestor below the archetype – above both α and β – which introduced these variants, but that some were not copied into the joint ancestor of rb/urb. one could multiply hypotheses about these manuscripts at will. we have presumed here that a variant not accepted by either sanguineti or petrocchi is unlikely to have been present in the archetype of the whole tradition. but their judgement could be wrong – or, the archetype could itself have contained errors. further, we are dealing with so few variants, among such a mass of variants ( , in all), that we have to reckon with the likelihood that at least some of the are there by simple accident, through coincident variation. where we are dealing with so few variants, the addition or removal of just a few variants because either they are ancestral to the whole tradition or the result of coincidence would change the numbers disproportionately.

hence, we find ourselves in the same position as for the chaucer and svipdagsmál traditions surveyed earlier. indeed, there might be an ancestor below the archetype of the whole tradition, α, from which both the ash/ham and mart/triv pairs descend, as both sanguineti and petrocchi assert. or, there might not be. the numbers of variants indicative of either hypothesis are so few as not to be decisive.
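the three conditions set out for the α search amount to a set query, which can be sketched in a few lines of python. this is not the vbase implementation, whose internals are not described here; the function name, the variant identifiers and the sample attestations are all hypothetical, and "mart" stands in for the mart-c witness.

```python
def alpha_candidates(variants, required, excluded):
    """Return ids of variants attested by every witness in `required`
    and by no witness in `excluded`. Other witnesses are ignored."""
    hits = []
    for variant_id, witnesses in variants.items():
        in_all_required = required <= witnesses
        in_any_excluded = bool(excluded & witnesses)
        if in_all_required and not in_any_excluded:
            hits.append(variant_id)
    return hits


# condition 1: present in all four of these witnesses
REQUIRED = {"ash", "ham", "mart", "triv"}
# conditions 2 and 3: absent from both editions (fs, pet) and both β manuscripts
EXCLUDED = {"fs", "pet", "rb", "urb"}

# hypothetical attestations, for illustration only
variants = {
    "v1": {"ash", "ham", "mart", "triv"},           # satisfies all three conditions
    "v2": {"ash", "ham", "mart", "triv", "urb"},    # also in a β manuscript
    "v3": {"ash", "ham", "triv"},                   # missing from mart
    "v4": {"ash", "ham", "mart", "triv", "pet"},    # accepted by petrocchi
    "v5": {"ash", "ham", "mart", "triv", "lausc"},  # an extra witness does not matter
}

print(alpha_candidates(variants, REQUIRED, EXCLUDED))
```

relaxing the query, as described above, would correspond to shrinking the excluded set; with so few hits, a change of that kind can move the count a great deal.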
as with both the canterbury tales and the svipdagsmál traditions, this state of affairs deprives the editor of the simple recourse of counting variants in discrete lines of descent. once more, at the very top of the stemma, exactly where it is nearest to the archetype, we are unable to determine exactly how many lines of descent there are. there might be just two, α and β, as both petrocchi and sanguineti assert. but it might be that all three of petrocchi’s a b c groups (a: mart/triv; b: ash/ham; c was not represented among the seven we analyzed) represent independent lines of descent. in that situation, it appears that the editor’s best strategy is to proceed with caution as petrocchi did, looking at each variant on a case by case basis, and being advised by the distribution of variants but not ruled by it.

the greek new testament

whatever the problems of editing svipdagsmál, the canterbury tales and the commedia, they are as nothing compared to the challenge of editing the greek new testament. firstly, we are dealing with a massive number of manuscripts: some for the whole text, with up to for any single section (thus, for the gospel of john). secondly, we are dealing with a tradition that extends across millennia, across vast geographical space, and with many versions in many languages, all of which must be examined for their possible testimony. because of these factors, there is the question of just what should be the editorial goal. for the three traditions we have surveyed, and particularly for the chaucer and the dante, where our oldest extant manuscripts date to within a few years of the author’s life (or even to his life) one might plausibly declare that we are trying to recover the closest possible text to that which left the author’s hand: the “original”, if you like.
but the gap between the historical jesus and the first manuscripts, the complexity of the tradition and the paucity of evidence for its very earliest stages render the notion of the “original” text of the new testament problematic indeed. accordingly, the editors of the editio critica maior (ecm) series of editions of the greek new testament, the most ambitious and advanced initiative in greek new testament textual scholarship, declare that the edited text which they offer on the basis of research into all the evidence, including all extant manuscripts and versions in every language, represents not the “original” text but the “initial” text (german “ausgangstext”).

it should be noted that the difficulties are compounded if one chooses to carry out a sample collation rather than a full collation. see robinson, «the textual tradition of dante’s commedia and the barbi loci», ecdotica, ( ), pp. - .

ecm: barbara aland, kurt aland†, gerd mink, holger strutwolf, and klaus wachtel, ed., novum testamentum graecum: editio critica maior, vol. : catholic letters. instl. : the letters of peter, münster, institute for new testament research, . this discussion of the ecm “ausgangstext” draws upon michael w. holmes, «from ‘original text’ to ‘initial text’: the traditional goal of new testament textual criticism in contemporary discussion», in b.d. ehrman and m.w. holmes, eds., the text of the new testament in contemporary research: essays on the status quaestionis, nd ed., brill, , pp. - .

this “initial text” is glossed in the introduction to the ecm edition of - peter, * n. as «the form of a text that stands at the beginning of a textual tradition». the editors are at pains to distinguish this from both the “original” (what might have been first written down) and the “archetype” (the reconstructed ancestor of all surviving witnesses).
it is both less than the original, and more than the archetype: as klaus wachtel described it to robinson, it is «the text which explains all the texts» (personal communication). holmes describes it further as «the reconstructed hypothetical form of text from which all surviving witnesses descend, a stage of a text’s history that stands between its literary formation, on the one hand, and the archetype of the extant manuscripts, on the other» (p. ).

in this discussion, we focus on a single place of variation in the ecm publication: the variation at peter : , where it appears that we have an instance of the same phenomenon identified as present in the three traditions discussed to this point, of the difficulty caused by manuscripts which do not show consistent patterns of agreement in introduced variants between them and other manuscripts. the whole text of this verse is given in most editions as:

ει δε ως χριστιανος μη αισχυνεσθω δοξαζετω δε τον θεον εν τω ονοματι τουτω

this verse is translated in the netbible as «but if you suffer as a christian, do not be ashamed, but glorify god that you bear such a name». the context is that the writer (the apostle peter) is speaking to the reader about the sufferings which might be brought upon him or her as a christian. in the previous verses, the writer instructs the reader that he or she is not insulted but is actually blessed if he or she is called a christian (verse ); he or she should not accept suffering as a mere criminal (verse ) but should glorify god that he or she is called a christian. however, the exact wording of the greek is awkward, as can be seen by the variety of translations of this verse:

common english bible: but don’t be ashamed if you suffer as one who belongs to christ. rather, honor god as you bear christ’s name. give honor to god,
good news bible: however, if you suffer because you are a christian, don’t be ashamed of it, but thank god that you bear christ’s name.
lexham english bible: but if [someone suffers] as a christian, he must not be ashamed, but must glorify god with this name. (bible translations).

https://netbible.org/. accessed march .
these translations from https://www.biblestudytools.com/ (accessed march ).

it appears that the exact reference of the phrase εν τω ονοματι τουτω (“in this name”) causes difficulty: from the immediate context, following the reference to god, the most natural reading is that the name of christian is to be applied to god, and not to the suffering person (δε τον θεον εν τω ονοματι τουτω: “concerning god in this name”). hence the variety of periphrases seen in the translations, intended to show that the name of christian belongs to the suffering person, while the glory belongs to god.

where there is difficulty in the text, one might expect variation: and that is what we have here. in many manuscripts, we find εν τω μερει τουτω (“in this part”) instead of εν τω ονοματι τουτω (“in this name”). this is the reading underlying the king james bible, for long the most influential english-language bible: “yet if any man suffer as a christian, let him not be ashamed; but let him glorify god on this behalf”. here is the distribution of the variants, as summarized by gurry and wasserman:

figure the variants at peter : , gurry and wasserman p. .

p.j. gurry, t. wasserman. a new approach to textual criticism. an introduction to the coherence-based genealogical method, stuttgart, society for biblical literature and deutsche bibelgesellschaft, .

gurry and wasserman observe, as do many other commentators (see the notes to gurry and wasserman’s discussion), that the witnesses to the reading εν τω μερει τουτω are confined almost exclusively to just one branch of the new testament tradition: to the byzantine manuscripts copied after ad and current in the orthodox church to this day. just one other branch of the tradition, the old church slavonic (also dating in its earliest hypothesized form to the th century), supports the byzantine text reading here.

in contrast, the reading εν τω ονοματι τουτω is found in every witness and every version dating from before ad. it is found in the three great uncial manuscripts sinaiticus alexandrinus and vaticanus (designated here), all dating from between and ad. it is found in the bodmer papyrus vii-viii (p ), possibly earlier than all three uncials. it underlies both the vetus latina and vulgate latin versions, dating from before ad; and is the source of the readings in the coptic, syriac, ethiopic, georgian and armenian versions, and indeed every version dating before ad. it is also the reading of the important minuscule , which is held to be a copy of a fourth century uncial manuscript. yet, despite this complete unanimity of the earliest versions, the ecm editors choose the late byzantine reading here.

why do they accept this reading? here, we must take two further factors into account. firstly, although the byzantine tradition arose centuries after the earliest manuscripts and versions, it exists in far more manuscripts than any other version, because the orthodox church has continued using greek, and the greek text has continued to be copied (and later printed) up to this day. as a result when renaissance scholars sought greek manuscripts, for the making of new translations and editions, they found manuscripts carrying the byzantine text. the first printed text of the bible, prepared by erasmus and printed by froben in basel in , was based on seven minuscule manuscripts, all dating after and all carrying the byzantine text. erasmus’s second edition ( ) was used by martin luther as the basis of his german translation of the bible; the third edition ( ) was used by william tyndale for the first english new testament based on greek sources, and also by the creators of the geneva bible and the king james bible.
the influence of the king james bible for later anglophone culture cannot be overstated: for england, and later for its colonies, and then for the emergent united states, it became the book of books; the touchstone by which not just religion but language itself was measured. to this day, there are many fundamentalist christian groups, especially in the united states, and not a few textual scholars, who ascribe extraordinary authority to the king james bible. there are groups of fundamentalist baptists with the motto “king james only”, and a group of well-qualified scholars who assert the value of the byzantine text under various labels: as “textus receptus” or the “majority text”. it is the king james bible which infuses the language of the book of mormon. it is a notable feature of the ecm editions that at many places – as here in peter : – they prefer a reading from the byzantine text where other editors (including the earlier nestle-aland editions) prefer a reading attested by earlier manuscripts. the ecm editors declare no such policy. but that is the effect.

k. lake, s. new, six collations of new testament manuscripts, cambridge, mass. and oxford, harvard up and oxford up, .

see, for example, the postings at http://evangelicaltextualcriticism.blogspot.com/, with peter m. head, tommy wasserman and p.j. williams named as contributors on the masthead as of march . these three are professional academic scholars of high repute. one might fairly report that many posts on the blog show sympathy towards the “majority text” movement, without subscribing to its more extreme positions. see, for example, http://evangelicaltextualcriticism.blogspot.com/ / /whats-happened-to-majority-text.html with its report of the activities of the “majority text society”. the site http://www.majoritytext.com is still maintained, but with little evidence of activity in recent years.

the second factor is the reliance of the ecm editors on the “coherence-based genealogical method” (cbgm): a method developed by gerd mink, formerly of the münster institute for new testament research, specifically to help the ecm editors, and others, choose which reading, among the many shown by their comprehensive collation, should be chosen for the “ausgangstext” (and hence, to appear as the text in the many editions which use the text established by the institute and its partners). this method has become controversial among some new testament scholars, in part because it is difficult to understand. a full description and analysis of the cbgm is beyond the scope of this article; we will focus on its implications for manuscripts which share few introduced variants with others. the fundamental tool of cbgm is that it builds, at every point of variation, what it calls “textual flow diagrams” (see figure ).

figure cbgm textual flow diagram for the variants at john : (addresses - ), showing the descent of reading c from a.

barbara aland, kurt aland†, johannes karavidopoulos, carlo m. martini and bruce m. metzger. novum testamentum graece. th edition. stuttgart, deutsche bibelgesellschaft, .

g. mink, «problems of a highly contaminated tradition: the new testament. stemmata of variants as a source of a genealogy for witnesses», in studies in stemmatology ii, ed. p.th. van reenen, a.a. den hollander, m. van mulken, amsterdam, john benjamins, , pp. - .

a.c. edmondson, an analysis of the coherence-based genealogical method using phylogenetics, doctoral dissertation, university of birmingham, , p. .

these textual flow diagrams look like traditional stemmata, but they differ in two crucial ways:
1. they exclude hypothetical sub-ancestors. only extant manuscripts are included.
2. the textual flow diagrams represent the relationship between texts, not manuscripts, and a key factor in determining the direction of flow between texts is the relative closeness of each text to the hypothetical ausgangstext: in fact, the text of the greek new testament established by the nestle-aland and ecm editors.

this can lead to odd results. although we know that the minuscule (written around ad) is the direct exemplar of (written some years later), the textual flow diagrams actually reverse this, and show as the ancestor of , because differs from the nestle-aland text at only points, while differs at points. cbgm offers suggestions about the different levels of coherence resulting from different choices of “initial text”, and the editors can take these different levels of coherence into account as they choose which reading is most likely to have been present in the “initial text”. “coherence” may be significantly affected by many factors, including the choice of what variant is the ausgangstext.

in several respects, the cbgm works on a model of textual variation which differs from what most scholars think happens in textual traditions: thus the exclusion of hypothetical ancestors, and the determination of textual flow as quite detached from the historical dates of the manuscripts which carry the texts. however, it has powerful practical advantages. it greatly simplifies the textual flow diagrams, as they include only extant manuscripts. it also permits texts which appear in only late manuscripts but represent much earlier states of the text to have full weight.

most often, the method yields apparently good results, or at least results that the great majority of scholars are prepared to accept. but in a few cases – as here – the cbgm offers a surprising choice of reading. the key to cbgm’s choice of the byzantine text reading here is that among the many manuscripts of the byzantine tradition, there are eight which actually have the (b) reading of the uncial manuscripts and others, εν τω ονοματι τουτω (“in this name”), and not the (a) reading εν τω μερει τουτω (“in this part”) of all other byzantine manuscripts. these eight represent three lines of descent. accordingly, the textual flow diagrams show reading (b) descended from (a) three times. now, if the initial text was the older reading (b), we have to deal with two changes, from ονοματι to μερει and back to ονοματι (figure ).

b. bordalejo, «the genealogy of texts: manuscript traditions and textual traditions», in digital scholarship in the humanities, / ( ), pp. - .
edmondson, an analysis, p. .
edmondson, an analysis, p. .

figure textual flow at peter : , with the initial text set to εν τω ονοματι τουτω.

however, if we presume that the initial text were the byzantine reading (a) we have only to deal with one variant, the shift from μερει to ονοματι (figure ).

figure textual flow at peter : , with the initial text set to εν τω μερει τουτω.

in the terms of the cbgm, the second hypothesis (that the original reading was μερει not ονοματι) is more “coherent” than the first. it means that we do not have to presume that the change εν τω ονοματι τουτω to εν τω μερει τουτω ever happened. instead we have only to presume the change εν τω μερει τουτω to εν τω ονοματι τουτω occurred just four times. accordingly, the ecm editors print this reading in their edition of peter : without qualification.

however, a little more thought suggests there is something fundamentally wrong with this hypothesis. consider just the three uncial manuscripts sinaiticus alexandrinus and vaticanus, the papyrus p , and the minuscule , all with texts dating from before ad.
the textual flow diagrams typically show the presence of ονοματι in all five of these as the result of a single change, from ausgangstext μερει (as assigned by the ecm editors) to ονοματι. this is represented as follows by edmondson in figure .

figure textual flow in peter : , showing vaticanus ( ) as the ancestor of sinaiticus alexandrinus ( / ) p and .

that is: the change appears first in vaticanus ( ) and thence descends to sinaiticus ( ) p and . this is in accord with the way in which the cbgm shows textual flow working. because sinaiticus has more variants from the ausgangstext ( ) than has vaticanus ( ), the textual flow shows the text of sinaiticus as descended from vaticanus. but this is simply not true in terms of the manuscripts, rather than the texts. sinaiticus is comprehensively not a copy of vaticanus or descended from it.

it is here that the exclusion of sub-ancestors from the textual flow diagrams becomes a problem. edmondson’s representation of the textual flow appears to show all of / p and descending from . but this is not historically the case. not one of these manuscripts is a descendant of . it is possible that there might have been an exemplar below the archetype from which all of the uncials, p , and all the versions, might have descended. and here we come up against the same problem we have seen in the other three instances studied. we have here multiple witnesses – ten or so – which may or may not share an exemplar below the original, or some of which may share exemplars with others. but it appears that generations of study have failed to find convincing evidence of any such relationship.

indeed, even if there were sure evidence of a single exemplar which introduced the reading of ονοματι for ausgangstext μερει, it is difficult to accept the historical scenario this implies.
it suggests that the archetype had μερει, that this was miscopied as ονοματι just once into a single exemplar some time before ad, from which all the copies made before c. ad descend. somehow, no other copies of the archetype with μερει survive until c. ad, when the first manuscripts with the byzantine text appear. however, difficult as this is to imagine, the alternative in which every one of these copies had μερει in their exemplar, and in every case (ten or more) altered this independently to ονοματι, is even more difficult to accept.

for these reasons, all modern editions before the ecm accepted the older reading of ονοματι, sometimes not even recording the byzantine (and old church slavonic) μερει as a variant. although mink argues that reading μερει is more likely on “internal” grounds, as the harder reading, an argument could be made for the difficulty of ονοματι as it refers not to god (in the phrase immediately preceding) but to χριστιανος, earlier in the sentence. the reading of ονοματι in three branches of the very large byzantine tradition (in place of μερει) might have arisen as a repetition from two verses earlier, where the same word appears in a parallel context. similarly, as the various translations of ονοματι suggest, scribes might have found the sense difficult here and substituted the bland μερει in the exemplars of the byzantine and old church slavonic traditions.

one thing that appears clear is this: in this instance, the effect of the cbgm may be that editors do not pay sufficient attention to the complexities of the historic relations among manuscripts, by suggesting that all the older manuscripts and versions amount to a single line of descent, and so (indeed) have no more stemmatic weight than the single line of descent represented by the byzantine text (or even less, if one regards the old church slavonic as a second, independent line).
accordingly, the ecm editors consider they are licensed to prefer the byzantine reading here.

however, in our view, this is an instance of the same problem of multiple manuscripts representing an uncertain number of lines of descent from the exemplar that we met in the other three traditions here analyzed. as in those cases, the editor must deal with this phenomenon, and not ignore it.

conclusions

it is well-known that contamination and agreement by coincidence make it difficult to determine the exact affiliations of manuscripts within a tradition. the four traditions surveyed in this paper suggest that there is a third circumstance which makes it difficult to determine affiliations within a tradition. this circumstance is when manuscripts do not share significant numbers of introduced variants with other manuscripts. this is further complicated by the case (as seen in the dante and new testament traditions) when it is uncertain whether a particular variant is ancestral to the whole tradition (and hence of no evidentiary value) or introduced below the archetype (and hence of evidentiary value). when a significant amount of archetypal variation is shared by witnesses (as is the case with hg, el, and ch in the canterbury tales tradition), researchers might interpret the shared variation as if it were characteristic of a determinate sub-family when that is not the case. archetypal variation, unless the witnesses in question are very close to the archetype, tends to shift and change, presenting different groupings at diverse points. mistaking these groupings for genetic ones would lead to incorrect conclusions about a tradition.

it is notable that in all four instances, the readings in question are all found in manuscripts which, for multiple reasons, are considered especially close to the archetype.
furthermore, many of the variants themselves are strong candidates for identification as present in the archetype of the whole tradition (whether this archetype be the “original”, the “initial text”, or some other formulation). this is in contrast with the problems offered by instances of contamination and coincident agreement, which are typically seen in manuscripts further from the archetype. accordingly, editorial policy on how to treat these variants is of special importance. we are not able to offer a general rule as to how they should be treated, beyond this: editors need to be alert to this likely situation, at any point where there is variation.

abstract

the literature of stemmatics is rich in discussions of two phenomena which, it is commonly held, render the orderly assignation of manuscripts into families problematic, even impossible. these two phenomena are coincident agreement (where unrelated manuscripts share the same reading, apparently by simple coincidence) and contamination (where a manuscript combines readings from two or more manuscripts). in this article, we suggest that there is a third area of difficulty which causes considerable problems to the stemmatic project. this third area is the phenomenon of multiple manuscripts within a tradition which cannot be assigned to any family because there is no consistent pattern of agreement in introduced variants between them and other manuscripts. this phenomenon can be seen in four different textual traditions: those of the old norse svipdagsmál ( witnesses, - ); the canterbury tales ( witnesses, - ); the commedia of dante (c. witnesses, c. -); the greek new testament (c. witnesses, c.e.-). the likely occurrence of this phenomenon in textual traditions requires attention from editors.
keywords

textual criticism, stemmatics, digital humanities, scholarly editing, coherence based genealogical method, phylogenetics

graphic design and layout: carolina valcárcel (centro para la edición de los clásicos españoles)
ª edition, june
© copyright by carocci editore s.p.a., roma
printed in june by grafiche vd srl, città di castello (pg)
issn -
isbn - - - -
reproduction prohibited by law (art. of the law of april , n. ). without proper authorization, it is forbidden to reproduce this volume, even in part and by any means, including photocopying, even for internal or educational use.

parthenos – d . initial communication plan
knaw-niod, pin
july

horizon - infradev- - / : grant agreement no.

parthenos
pooling activities, resources and tools for heritage e-research networking, optimization and synergies

communication plan

deliverable number: d .
dissemination level: public
delivery date: july
status: final
author(s): reto speck, stefano sbarbati, sheena bassett, franco niccolucci, petra drenth

project acronym: parthenos
project full title: pooling activities, resources and tools for heritage e-research networking, optimization and synergies
grant agreement nr.:

deliverable/document information
deliverable nr./title: . “initial communication plan”
document title: initial communication plan
author(s): reto speck, knaw-niod; stefano sbarbati, pin; sheena bassett, pin; franco niccolucci, pin; petra drenth, knaw-niod
dissemination level/distribution: public

document history
version/date    changes/approval                              author/approved by
v . / /         first draft                                   stefano sbarbati, franco niccolucci
v . / /         first draft and review                        stefano sbarbati
v . / /         website implementation                        stefano sbarbati
v . / /         additional sections                           stefano sbarbati
v . / /         review                                        franco niccolucci
v . / /         review, restructuring, additional sections    reto speck
v . / /         incorporate feedback; new section             reto speck
v . / /         minor corrections                             reto speck

table of content

executive summary
introduction and background
communication and dissemination strategy
    overall objectives
    swot analysis
    communication and dissemination principles
stakeholder communities
multi-level communication
tailored messages
    general message
    extended general message
    research and educational message
    jargon-free public message
    policy- and decision-maker message
available resources
    consultation of the parthenos partners
    resources available in the consortium
    resources available via associated projects
    resources available via related international initiatives
communication channels
    website
    social networks
    mailing list
    newsletters
    e-journal and open access repository
    (joint-) events
dissemination materials
    logo and visual style
    press relations
    media productions
    further planned dissemination materials
external dissemination activities for year
    events
    international event co-operations
    publications
communication evaluation and assessment

executive summary

this deliverable presents the initial results of parthenos wp “communication, dissemination and outreach”, task “development and update of a coordinated communication plan”. it provides a general communication and dissemination strategy for parthenos, as well as a detailed plan of action for months to of the project. the plan is a live document and it will be updated at months , , and .
updates will contain an evaluation of the communication activities undertaken during the previous period; an updated version of the general communication strategy; and detailed planning for the period ahead.

the general objectives of parthenos wp are to:
• disseminate effectively the project goals and outcomes,
• set up efficient tools for the communication towards various stakeholders (scientific communities, professionals, decision makers, public, etc.),
• exploit synergies in liaisons and collaborations.

the present document acts as a general roadmap for all parthenos-related communication and dissemination activities. it presents parthenos’ overall communication and dissemination strategy (sections and ); analyses the project’s stakeholder communities (section ); presents a set of core communication messages (section ); analyses the communication resources available to the project (section ); describes the project’s own communication channels (section ) and dissemination materials produced by the project (section ); lists external dissemination opportunities (section ), and sets evaluation targets for the first months (section ).

introduction and background

parthenos is a project funded by the european commission’s horizon framework programme that started on the st may . the project life span is four years. the consortium built around parthenos is composed of fifteen partners from nine countries. it includes the two european strategy forum on research infrastructures (esfri) european research infrastructure consortia (erics) active in the broad fields of the humanities – dariah and clarin – as well as five major european research infrastructure projects – ariadne, cendari, charisma/iperion-ch, ehri, dch-rp.
the overall goals of parthenos are to:
• increase the cohesion of research sectors in the field of linguistic studies, digital humanities, cultural heritage, history, archaeology and related fields;
• define and implement common guidelines and best practices enabling cross-discipline data curation policies;
• establish a vision about shared virtual research methods for humanities supported by foresight studies;
• mainstream standardization and interoperability in order to support data sharing and re-use;
• develop common tools for data oriented services.

if these objectives are to be achieved during the lifetime of the project, a co-ordinated and comprehensive set of dissemination and communication activities is required in order to maximise the impact of the project both within the consortium and on its external stakeholders. work package (wp) is charged with planning, co-ordinating and implementing all of the project’s communication and dissemination activities. the wp consists of six tasks:

task . – project web site and portal: this task concerns the creation of the parthenos website, which is defined as the central hub of all the project’s external communication activities. details about the initial design and implementation of the project website can be found in section . .

task . – development and update of a coordinated communication plan: the present document and its future iterations are developed within this task.

task . – scientific communication: this task concerns communication at the scientific level. it will evaluate the creation of a scientific e-journal in the service of e-humanities research, and the creation and operation of an open access repository.

task .
– organization of joint events: this task concerns the organisation of joint events (symposia, workshops, public presentations) directly managed by the project, possibly co-located at other international/national events and in collaboration with other major national and international projects.

task . – liaisons with other international initiatives: this task aims at coordinating and pooling existing networks external to the consortium in order to realise mutual benefits.

task . – publicity and outreach: this task concerns activities and materials aimed at informing the public at large of the project’s plans and works. the task co-ordinates and plans outreach opportunities (press conferences, interviews, newspaper articles, etc.), and the preparation of high-quality publicity materials.

the present document was prepared in t . . it presents parthenos’ overall communication and dissemination strategy (sections and ); analyses the project’s stakeholder communities (section ); presents a set of core communication messages (section ); analyses the communication resources available to the project (section ); describes the project’s own communication channels (section ) and dissemination materials produced by the project (section ); lists external dissemination opportunities (section ), and sets evaluation targets for the first months (section ).

communication and dissemination strategy

overall objectives

the overall objectives of parthenos’ communication and dissemination activities are to:
• disseminate effectively the project goals and outcomes,
• set up efficient tools for the communication towards various stakeholders (scientific communities, professionals, decision makers, public, etc.),
• exploit synergies in liaisons and collaborations.
in order to reach these general objectives, five specific objectives have been identified:
• identify and involve internal stakeholders within the partner organisations;
• create an affiliate network of external stakeholders (research infrastructures and networks in related fields);
• ensure that parthenos reaches the core scientific communities in linguistic studies, digital humanities, digital heritage, archaeology and history, as well as professionals in related fields;
• raise awareness about parthenos amongst policy makers, funding bodies and major related public institutions;
• devise a strategy to involve the general public and attract non-professional audiences.

swot analysis

the mission of parthenos, the context in which it operates, and the composition of its consortium lead to a number of unique strengths, weaknesses, opportunities and threats in regard to its communication and dissemination objectives. these unique characteristics are analysed in table as follows:

strengths (helpful & internal)
• experienced management and sufficient resources
• involvement of experienced professionals
• project’s unique approach
• project’s long term vision
• consortium encompasses a large network of stakeholders including existing research infrastructures, possessing strong established dissemination channels
• high level of freedom

weaknesses (harmful & internal)
• consortium's large size and diversity
• tailoring messages to diverse audiences may be difficult
• news worthiness
• difficulty in reaching audiences beyond academia such as policy makers and the general public
• difficulty in framing (horizontally vs vertically) our communication
• lack of involvement, unresponsive partners

opportunities (helpful & external)
• spin news values in our favour (e.g. the message "investing in culture is investing in future")
• involvement of truly international actors
• possibility to create "something different" from scratch
• reaching out to less well represented countries such as newer member states
• present and justify investment in research to the public

threats (harmful & external)
• academic approach to media attention
• bureaucracy
• (too) specific and jargon based information
• lack of co-ordination, leading to message “silos” rather than a single coherent message
• “flat” news feed; failure to continuously keep stakeholders informed

table : swot analysis

our swot analysis indicates that parthenos has a unique opportunity to reach a very wide and varied audience to disseminate information about the project’s innovative research, and to promote the societal and cultural value of humanities research across european societies. however, the project will only be able to realise this potential if it can efficiently marshal the communication and dissemination networks and resources already in existence among its partner institutions and affiliated projects, and if it manages to frame its messages in a coherent, well co-ordinated manner and in accordance with the needs of its various stakeholder communities.

communication and dissemination principles

this section presents a set of five basic principles that have informed the articulation of the parthenos communication plan. adherence to these principles will ensure that the project can fully exploit its strengths and opportunities, while diminishing and managing its weaknesses and threats as outlined above.

adaptability. given the scope of the project and the specific themes involved, the communication strategy needs to be comprehensive enough to cover the project as a whole, while being adaptable to the project’s various research themes and stakeholder communities.
for example, specific channels are to be used to reach particular target groups, and dissemination materials may have to be tailored to the needs of different end users.

flexibility. in line with the previous principle, parthenos’ communication needs to be flexible and open, in order to create a framework that is responsive to changing needs and challenges.

dynamism. the dynamic element is the natural consequence of the two points above. a dynamic strategy is key to maximising the impact of parthenos.

tailoring of messages/usage of appropriate language. as stated above, parthenos needs to be able to speak to academic audiences in a variety of fields, as well as to decision makers and the public at large. to achieve this, parthenos will follow a multi-layered communication strategy that formulates core messages tailored to the needs and expectations of the various target audiences, and expressed in appropriate language (specialised, technical communication vs. plain, jargon-free communication).

exploitation of synergies. parthenos is a clustering project across existing research infrastructures, integrating initiatives and e-infrastructures in the fields of digital humanities, cultural heritage, history, linguistic studies, archaeology and related fields. as such, the project can draw upon a plethora of expertise, networks, and dissemination and communication channels that are already in existence at partner institutions and related projects and that can reach the specific subject communities with which parthenos wishes to engage. parthenos needs to exploit to the fullest the synergy that can be achieved by building bridges between these existing resources, and must avoid a duplication of effort. therefore, achieving better co-ordination and cross-fertilisation of existing communication and dissemination activities is central to parthenos’ mission.
stakeholder communities

key to a successful attainment of parthenos’ communication and dissemination goals is a thorough understanding of the key stakeholder communities with which the project needs to engage. this section identifies and presents the different stakeholder communities relevant to parthenos. the impact parthenos will have on these communities varies considerably, and the influence each community can exert on the project is equally diverse. moreover, each community has distinct needs and interests in terms of communication. therefore, it is essential that parthenos develops a thorough understanding of its stakeholder communities in order to be able to hone and target its communication and dissemination activities accordingly; to develop appropriate channels for contacting and informing stakeholder groups; and to design and plan dissemination materials and activities that maximize the visibility and impact of the project.

the parthenos stakeholder communities are:
• internal stakeholders, i.e. institutional partners who are part of parthenos’ consortium and projects associated with the project;
• research institutions, international networks and individual researchers at varying career levels (phds, post-docs/early career researchers, senior researchers) active in the subject areas covered by parthenos;
• galleries, libraries, archives and museums (glams) operating in these fields;
• non-academic professionals working in fields related to parthenos’ activities such as data management, the cultural industries, etc.;
• educators and students at varying levels in the subject areas covered by parthenos;
• relevant politicians, policy makers and funding bodies;
• the media and the general public.

figure attempts to visualise the degree of influence and mutual dependence that exists between these stakeholder communities on the one hand, and the project on the other.
it should be noted, however, that the boundaries, represented as concentric circles in the diagram, are indicative rather than categorical. they merely exist to highlight the heterogeneous nature of parthenos’ stakeholders, and to emphasise the need for a tailored approach to communication and dissemination, rather than act as a prescriptive classification.

figure : stakeholder map

table provides an in-depth analysis of each stakeholder community’s importance and interests, communication and dissemination requirements, and an indication of the channels we will employ to reach them.

stakeholder group: internal
- institutional partners
- associated projects
description and examples: managers, decision makers and researchers at partner institutions and associated projects
interested in parthenos’ news about:
- development of the shared infrastructure
- details about innovations and new tools and methods
- best practice, guidelines and training opportunities
- conferences and other parthenos events
importance: very high. partners need to be fully engaged in order to get their full support for the project, and to spread news about parthenos via their own networks.
to be reached through the following channels:
- website/social media
- internal meetings
- parthenos conferences/events
- publicity material (flyers, short videos, etc.)
- newsletters (internal/external)
- publications
stakeholder group: researchers
- international networks
- institutions
- individuals (at all career levels)
description and examples: the communities of researchers, networks and institutions active in the parthenos subject areas of digital humanities, digital heritage, linguistic studies, archaeology and history
interested in parthenos’ news about:
- access to data and resources
- recent developments in relevant ris
- forthcoming events, workshops and training opportunities
- details about innovations and access to new tools and methods
- details about how parthenos will innovate research
importance: very high. the principal subject communities underlying parthenos and its associated ris need to be convinced that parthenos will enhance their research activities.
to be reached through the following channels:
- website/social media
- newsletter (external)
- publications
- conference presentations
- publicity material (flyers, short videos, etc.)
- direct networking and via partners’/associated projects’ dissemination channels

stakeholder group: glams/professionals
description and examples: glams active in the subject areas covered by parthenos; non-academic professionals working in fields such as data/information management or the cultural industries
interested in parthenos’ news about:
- overview of parthenos’ mission and its progress
- details about innovations and access to new tools and methods
- best practice guidelines
importance: medium to high. outreach beyond academia is highly desirable; glams are important providers of data and expertise to ris.
to be reached through the following channels:
- website/social media
- newsletter (external)
- publications
- press releases
- publicity material (flyers, short videos, etc.)

stakeholder group: educators/students
description and examples: educators and students at varying levels in the subject areas covered by parthenos
interested in parthenos’ news about:
- overview of parthenos’ mission and its progress
- examples of how parthenos will innovate research in the various subject areas
importance: medium. it is desirable to reach, and contribute to the education of, the next generation of researchers.
to be reached through the following channels:
- website/social media
- newsletter (external)
- press releases
- publicity material (flyers, short videos, etc.)
stakeholder group: politicians, policy makers and funders
description and examples: all the institutional or individual actors that frame the wider context of (european) humanities research/ri development
interested in parthenos’ news about:
- overview of parthenos’ mission and its progress
- innovation potential of parthenos, and benefits the project offers to stakeholders and end-users
- socio-economic impact of parthenos
importance: high. their support is needed to ensure the long-term future of parthenos and its outcomes.
to be reached through the following channels:
- direct networking
- policy briefing
- press releases

stakeholder group: media/general public
description and examples: media outlets and individuals with an interest in ris/research in the subject areas covered by parthenos
interested in parthenos’ news about:
- overview of parthenos’ mission and its progress
- benefits, including socio-economic impact, of parthenos’ achievements
importance: medium. outreach beyond academia is very important: demonstrate benefits of public investment into research.
to be reached through the following channels:
- press releases/media coverage
- website/social networks

table : stakeholder analysis

multi-level communication

one of the main challenges for parthenos is to talk appropriately to all of its stakeholder communities. this implies that parthenos needs to support communication and dissemination activities that are detailed and technical to satisfy the needs of professional and research communities; broad, strategic and clear to have an impact on policy makers and funding bodies; and plain, general and “jargon-free” in order to attract the attention of the interested public. this challenge manifests itself most prominently in regard to those communication channels – especially the website, social media and press releases – that target all the project’s stakeholder communities indiscriminately. one should not underestimate the extent of the challenge.
indeed, given that most parthenos partners’ expertise lies in (academic) research, there is a clear danger of defaulting to an academic style of communication, even where this style is not the best suited to a given communication channel or target audience. defaulting in this way would risk losing the interest of important stakeholders. it would also result in the project’s dissemination and communication efforts aligning poorly with the european commission's strategy in the field of communication of joint european research results, which places a special emphasis on reaching non-academic audiences – see, for instance:
• european commission ( ), european research - a guide to successful communications. eu publications office, luxembourg.
 • european commission ( ). scientific evidence for policymaking. publications office, luxembourg. 
 • european commission ( ). communicating research for evidence-based policymaking. a practical guide for researchers in socio-economic sciences and humanities. publications office, luxembourg. 
in order to tackle this issue, we will design the website and other generic communication instruments in such a way that they can encompass differing styles of communication. in fact, as further described in the website subsection under communication channels, the project website will not only be used as a focal point for all parthenos-related dissemination activities, but also as a news hub about research, digital humanities, innovation and technology in general. both sections will live simultaneously on the website. while the focus will always be on reporting and disseminating the project’s activities, the “news hub” will foster the publicity of those activities using the following communication frame:

“investing in culture is investing in the future”

this will help to frame parthenos’ research mission in a way that is interesting enough to compete in today’s noisy communication environment, while also helping the project to reach a general audience. thereby, we hope to disseminate the achievements of parthenos as relevant and meaningful to communities beyond academia. to achieve this goal, our communication strategy needs to develop along three trajectories:
• communication aimed at internal stakeholders, researchers, educators and non-academic professionals.
• communication aimed at the media and the general public.
• communication aimed at policy and decision makers and funding bodies.

while the requirements to successfully communicate to the first group are relatively clear and well understood, a few more words about the other two trajectories are in order.

the second trajectory aims to engage and address general audiences. this category is broad enough to include specialised media, national and local institutions and the interested public. in order to create interest around parthenos among these communities, we need to focus on the following:
• compelling visual style (website, multimedia production and materials need to be designed with these specific segments in mind – e.g. fresh, appealing and easy to use appearance);
• effective story telling. we need to create stories that are interesting for the public, by, for instance, referring to the “investing in research/culture is investing in the future” interpretative framework;
• parthenos website as a hub for information related to cultural heritage, digital humanities and technologies;
• other media channels as appropriate, e.g. youtube.

regarding policy makers, decision makers and funding bodies, an ec paper of entitled “scientific evidence for policymaking” provides some useful pointers on how to target these groups. the paper advises projects to develop “appropriate dissemination and knowledge sharing strategies from the earliest stages of project planning; include partners from the world of policy-making in their project team in order to ensure that the subjects chosen, as well as the scope of the research, respond to defined policymaking priority areas”. moreover, the paper recommends that projects “develop more subtle ways of engaging with the broader public and embedding social and ethical reflection within the everyday practice of science, develop a programme and a methodology of dissemination of results over the lifecycle of their project in order to provide updated information on progress over time; reflect in terms of added-value of the work undertaken, not only in terms of the scientific research, but in terms of the policy-usefulness of the work undertaken; prepare policy briefings which are easily readable, understandable and useable by policymakers in framing and/or evaluating policies”.

in other words, the third trajectory of parthenos’ communication strategy combines elements from the other two. specific actions and tailored messages targeting policy makers need to be defined on a case by case basis.

european commission ( ).
scientific evidence for policymaking. publications office, luxembourg, p.
ibid., p. -

tailored messages

this section presents a set of tailored messages aimed at different communities that should help to hone the project’s communication along the three trajectories detailed in the multi-level communication section above. the following messages have been developed:
• general message
• extended general message
• research and educational message
• jargon-free public message
• policy- and decision-maker message

it should be noted that these messages are indicative rather than prescriptive. indeed, partners are encouraged to adapt these messages according to their needs. in particular, the “research and educational message” may need adapting depending on the specific subject community that is being addressed.

general message

this represents the official definition of the project, displayed on the project’s website, and will also be used on official dissemination materials.

parthenos aims at strengthening the cohesion of research in the broad sector of linguistic studies, humanities, cultural heritage, history, archaeology and related fields through a thematic cluster of european research infrastructures, integrating initiatives and infrastructures, and building bridges between different, although tightly interrelated, fields. parthenos will achieve this objective through the definition and support of common standards, the coordination of joint activities, the harmonization of policy definition and implementation, and the development of pooled services and of shared solutions to the same problems.

extended general message

the description above, while comprehensive, can be supplemented with further information regarding the scope of the project and the composition of the consortium. the following is the extended description of the project, to be mainly used for communication activities directed to specific stakeholder groups.
parthenos aims at strengthening the cohesion of research in the broad sector of linguistic studies, humanities, cultural heritage, history, archaeology and related fields through a thematic cluster of european research infrastructures, integrating initiatives, e-infrastructures and other world-class infrastructures, and building bridges between different, although tightly interrelated, fields. parthenos will achieve this objective through the definition and support of common standards, the coordination of joint activities, the harmonization of policy definition and implementation, and the development of pooled services and of shared solutions to the same problems.

parthenos will address and provide common solutions to the definition and implementation of joint policies and solutions for the humanities and linguistic data lifecycle, taking into account the specific needs of the sector that require dedicated design, including provisions for cross-discipline data use and re-use, the implementation of common aaa (authentication, authorization, access) and data curation policies, including long-term preservation; quality criteria and data approval/certification; ipr management, also addressing sensitive data and privacy issues; foresight studies about innovative methods for the humanities; standardization and interoperability; common tools for data-oriented services such as resource discovery, search services, quality assessment of metadata, annotation of sources; communication activities; and joint training activities.

built around the two erics of the sector, dariah and clarin, and involving all the relevant integrating activities projects, parthenos will deliver guidelines, standards, methods, services and tools to be used by its partners and by the whole research community. it will exploit commonalities and synergies to optimize the use of resources in related domains.
research and educational message

this message aims to highlight the advantages that the research and educational sectors can gain through parthenos. it focuses on data use and re-use, a topical issue in today’s research area, and on the impetus to professional development and advancement that parthenos provides. it is important to note that, given the heterogeneity of the consortium, tailored messages that reflect partners’ specific area(s) of interests should be created. this will help to maximise the impact in the various research communities addressed by the project, while also helping the consortium to keep control over the global message.

keywords: research infrastructures; involvement of researchers; benefits for researchers; data management; data use/re-use; intellectual property rights; tools and services; standards and guidelines; training; open access; integration and pooling of data, service and expertise.

digital technologies have so far created large digital archives, and new methodologies to support research. it is now necessary to integrate these archives and support these new methods. this is the mission of newly set-up pan-european bodies, called european research infrastructures, created to provide integrated and coordinated facilities, resources and related services to the scientific community to conduct top-level research in their respective fields. parthenos is going to support the work of the two research infrastructures of our sector, called clarin (language studies) and dariah (digital humanities), as well as the contributions of various integration projects addressing specific subdomains. by pooling efforts, taking advantage of commonalities among the involved disciplines, and collecting indications from the respective research communities, results in terms of services and tools will be better, and closer to researchers’ needs.
individual researchers are now able to benefit from the advanced services made available to everybody by parthenos, and may have a voice in the project development by letting the project know their needs and wishes. such services concern all aspects of data management, including: access to and sharing of data; management of intellectual property rights; integration of diverse datasets through shared understanding of concepts and content typical of the various disciplines and approaches; tools for discovering openly available resources such as archives, services and tools; guidelines for making the best use of digital technologies; and, finally, training on all of the above. sharing research outcomes in an open access repository of scientific papers is an important component of this strategy. parthenos fosters the participation of all researchers in its activities and welcomes the involvement and participation of all research institutions, to enlarge the network of facilities, resources and services integrated in the research infrastructures it is serving.

similarly, the high rate of innovation in parthenos is key to presenting the project to educators and students.

keywords: innovation; involvement of education communities; training of young researchers; development of innovative curricula.

education must keep pace with advances in research methodology in order to prepare tomorrow’s researchers and professionals. drawing on the experience of those partners that are primarily educational institutions, parthenos addresses innovation in training and education with a specific activity, developing training plans and up-to-date curricula. the participation and involvement of all actors from the education domain is welcomed in order to collect requirements, test proposals and verify solutions.
in return, educators will receive detailed information about current educational offerings for digital humanities and well-designed plans and curricula they can adapt and implement according to their needs and educational offerings.

jargon-free public message

the following message targets the general public. despite being a pure research project, parthenos needs to find compelling ways to target this audience.

keywords: social and economic relevance of culture/research; “investing in culture is investing in the future”; value for money; new digital tools and methods to explore culture.

culture shapes our identities, aspirations and relations to others and the world. it also shapes the places and landscapes where we live and the lifestyles we develop. arts, heritage, history, literature and language are essential components of our european identity as well as key factors of social and economic development. investing in culture is investing in the future. continuously improving our understanding of history, literature, language, arts and heritage, by making use of the innovative communication and information technologies that are nowadays part of our everyday life, is a necessity. pooling resources at european level will save money, avoid duplication or, even worse, divergence of efforts, and optimize results. the european commission provides significant support to research in the cultural domain, and parthenos is a beneficiary of the ec’s “positive spending review” that has resulted in a bolstering of the budget of horizon , the main european research programme. parthenos will ensure value for money by improving research work in the cultural domain, by providing better and more efficient digital services and tools that are based on information technology, and by ensuring the exploitation of the wealth of information that is already digitally available.
updating the methodologies used in the cultural domains will make cultural content more familiar and accessible to everybody, will help the discovery of new significance in old concepts, and will eventually contribute to social and economic development, thereby resulting in growth and jobs in a strategic sector for europe.

policy- and decision-maker message

as with the public message, the core theme for the message targeted at policy- and decision-makers is “investing in culture is investing in the future”.

keywords: social and economic relevance of research infrastructures; job creation.

parthenos and the europe-wide research infrastructures it supports will not only benefit research, cultural and educational institutions, but also the public at large. improved knowledge in the cultural domain will lead to significant social and economic benefits, not least by accelerating growth and creating jobs in a strategic sector for europe. the strong commitment to research infrastructures provided by the european commission in the horizon programme needs to be accompanied by concrete and operational support to national and local initiatives participating in this transnational and interdisciplinary effort.

a similar message should be addressed to practitioners, professionals and glams, adding, “as happened in the past for other technologies (photography is a ready example), research methods impact directly on professional practice, both of individuals and of institutions.
that is why cultural institutions such as archives, libraries, museums and heritage agencies, as well as professionals operating in these fields, must be tightly connected to important developments such as the creation of europe-wide research infrastructures: research infrastructures will assist cultural institutions to achieve their missions in the service of european culture, and will receive important indications on the social implications of their research work.”

available resources

the aim of this section is to identify the relevant communication and dissemination skills, resources and networks available within the project consortium.

consultation of the parthenos partners

two surveys launched by pin in the first two months of the project have contributed much of the information presented in this section as well as in sections and . these surveys are:

survey of planned dissemination activities and existing communication channels. this survey is implemented as a google spreadsheet (https://docs.google.com/spreadsheets/d/ jmfpd vxl iriuamj fr nkuqxroljl- qfqqyjt pk/edit#gid= ). in order to ensure the consistency and comparability of gathered data, several categories and data validation rules are hard-coded into the spreadsheet. all partners have been invited to input relevant information about the following directly into the spreadsheet:
• planned/forthcoming parthenos-related events;
• planned/forthcoming parthenos-related publications;
• established dissemination channels (such as journals, newsletters, publications, website, social media, etc.);
• undertaken/planned networking/other activities (such as parthenos-related presentations; parthenos-related contributions to newsletters, blogs, etc.).
this survey will remain open throughout the duration of the project, and partners are asked to update their information regularly.
this way, the spreadsheet will become an important record of dissemination activities undertaken by the project, as well as a valuable planning and working tool that contains up-to-date details about forthcoming dissemination opportunities and about existing channels that can be exploited by parthenos.

figure : survey of planned dissemination activities and existing communication channels

survey of partners’ communication goals and feedback on parthenos’ communication strategy. the aim of the survey is to collect qualified opinions and qualitative data from partners to shape parthenos’ communication strategy according to the partners’ needs and opinions. the survey is implemented as an online typeform survey, and it is accessible at https://stefanosbarbati.typeform.com/to/umozzh. the survey is composed of twelve questions, arranged around three main survey areas:
• identification and description of the main communication goal of each partner (independently from parthenos),
• identification of the main communication goal of each partner within the parthenos project,
• measuring the level of agreement and disagreement with the general communication approach proposed.

figure : survey of partners’ communication goals and feedback on parthenos’ communication strategy

while the survey is currently still open, preliminary results suggest that partners generally agree with the multi-level and multi-stakeholder communication strategy described above.

resources available in the consortium

wp is led by knaw-niod and involves all partners in the consortium. most partners have public relations departments in their institutions and access to existing communication and dissemination networks and resources which should be exploited to disseminate parthenos.

the following table gives an overview of the partners’ responsibilities and contributions to the main communication and dissemination tasks.

task no. | task | responsibilities | main contact
- | overall co-ordination of communication/dissemination activities | knaw-niod (wpl), supported by pin (coordinator) | reto speck, reto.speck@kcl.ac.uk
t . . | design, development and technical maintenance of website | pin, supported by knaw-niod | stefano sbarbati, stefano.sbarbati@pin.unifi.it
t . / . | sharing project-internal news through the project website/social media | pin, supported by knaw-niod. input from all partners | stefano sbarbati, stefano.sbarbati@pin.unifi.it
t . / . | sharing relevant external news (i.e. news from partners, associated projects, etc.) through the project website/social media | knaw-niod, supported by pin. input from all partners | petra drenth, p.drenth@niod.knaw.nl
t . | periodic update of coordinated communication plan | knaw-niod, supported by pin, csic and mibact-iccu. input from all partners | reto speck, reto.speck@kcl.ac.uk
t . | evaluation/potential implementation of a scientific e-journal | ugoe, supported by knaw, cnrs, csic and kcl | juliane stiller, stiller@fh-potsdam.de
t . | evaluation/potential implementation of an open access repository | ugoe, supported by knaw, cnrs, csic and kcl | juliane stiller, stiller@fh-potsdam.de
t . | organisation of joint events | pin, supported by clarin, knaw-niod, cnr, kcl, oeaw, mibact-iccu, ugoe | sheena bassett, sheena.giess@gmail.com
t . | liaisons with other international initiatives | clarin, supported by pin, inria, knaw, cnr, csic, forth, kcl, oeaw, mibact-iccu. input from all partners | bente maegaard, bmaegaard@hum.ku.dk
t . . | coordination of press releases/press relations | pin, supported by knaw-niod. input from all partners | stefano sbarbati, stefano.sbarbati@pin.unifi.it
t . | production of publicity material | knaw-niod, supported by all partners | petra drenth, p.drenth@niod.knaw.nl
t . | production of external newsletters | knaw-niod, supported by all partners | petra drenth, p.drenth@niod.knaw.nl
t . | production of internal newsletter | pin, supported by all partners | sheena bassett, sheena.giess@gmail.com
- | publicising project within partners’ countries/networks. adapting/translating dissemination material as required | all partners | -
- | disseminating project background and results at external national/international conferences and other events | all partners. co-ordination by knaw-niod with the support of pin | reto speck, reto.speck@kcl.ac.uk
- | dissemination of project’s results via scientific publications | all partners. co-ordination by knaw-niod with the support of pin | reto speck, reto.speck@kcl.ac.uk

table : tasks and responsibilities

resources available via associated projects

apart from the communication and dissemination resources available via consortium partners, parthenos is associated with five major european integrating/infrastructure projects and includes the two erics in the field of humanities research:

name of project | represented in parthenos via (coordinator in bold) | contact for communication
clarin | clarin-eric | tbc.
dariah* | inria, knaw, cnr, cnrs, oeaw, ugoe, aa | jakob epler
ariadne | pin, knaw, cnr, csic, forth, oeaw, mibact-iccu | sheena bassett
cendari | tcd, inria, kcl, ugoe, sismel | catherine o’brien
charisma/iperion-ch | cnr, cnrs, csic, forth | emilio cano
ehri | knaw, kcl, ugoe | petra drenth
dch-rp | mibact-iccu | sara di giorgio

table : associated projects
*dariah-eric is set to become a full parthenos partner.

in order to reach all the subject communities with which parthenos wishes to engage and in order to achieve synergies across european ris, it will be crucial to exploit communication and dissemination opportunities and networks available via these projects, and to carefully co-ordinate activities. as a first step towards achieving a coordinated approach to communication/dissemination across these projects, wp is currently setting up an informal network of communication officers working at associated projects.
this network will facilitate the exchange of information, ensure wide dissemination of relevant information across the communities addressed by these projects, and achieve synergies by avoiding duplication of effort and by sharing resources. for parthenos, petra drenth (knaw-niod) is responsible for initiating and managing the network. a more detailed account of its purpose, plans and operation will be provided in the next update of the communication plan (due m ).

resources available via related international initiatives

as part of its mission, wp will identify and exploit connections parthenos’ partners have with other relevant international committees, initiatives, projects and other important research infrastructures in and beyond europe. this work is undertaken in task t . and is coordinated by bente maegaard (clarin). we will report details about such existing connections, and possible strategies for using them to disseminate parthenos, in the next update of this communication plan (due m ).

communication channels

apart from re-using existing communication and dissemination resources available at partner institutions, associated projects and related international initiatives, parthenos develops, hosts and manages an array of project dissemination channels. the following section will introduce and describe each such channel and include a plan for its further development over the next months.

website

the project website – http://www.parthenos-project.eu – is a central pillar in our communication and dissemination strategy. it is a hub for all the information about the project and its activities, events and services, and constitutes an important source of information for all stakeholder communities the project is seeking to reach. apart from directly hosting a wealth of content, it will also contain links to relevant information available elsewhere such as publications, presentations, etc.
as such it offers stakeholders one-stop access to information about the project’s background, ambition and results. the website is built using a well-known, modular web content manager (wordpress). it is fully responsive, meaning that it can be easily accessed and browsed via all commonly used devices (desktops, laptops, tablets and smartphones). a google analytics snippet is coded into the website, enabling us to generate comprehensive site usage statistics.

the design of the website facilitates the modular and multilevel communication approach defined in the previous sections. it comes with a modern and appealing layout, chosen to attract non-academic and first-time users to the website, and welcome them to the european research infrastructures universe. overall, the website was designed to conform to the communication principles articulated across this document:
• clearly recognisable appearance (in line with project’s visual identity)
• appealing layout and high accessibility
• modularity and easy adaptation to project’s needs

the website will be embedded in a wider ecosystem of social media. through multimedia products such as videos, documentaries and features narrating parthenos’ progress, the project aims to generate interest among youngsters and students, traditionally the groups with the highest rates of social network usage. while crafting this strategy, parthenos fully endorses the ec’s vision: “[…] communication about european research projects should aim to demonstrate the ways in which research and innovation is contributing to a european 'innovation union' and account for public spending by providing tangible proof that collaborative research adds value […]” needless to say, aiming to reach the wider public does not mean forgetting our core audiences.
we expect to reach all of our stakeholder communities through social media and it is likely that many individuals from the professional and research communities will first encounter parthenos through popular social media channels such as twitter or youtube. figure gives details of the website’s current structure, while figure presents a screenshot of a page.

european commission ( ), communicating eu research and innovation guidance for project participants v , p.

figure : website structure

figure : website screenshot

the website has been designed and built by pin with the involvement of knaw-niod, and it has been online since month of the project. the goals for the next months are first and foremost to gradually expand the content that is available via the website. this will entail adding more concise background and contextual information about the project, for instance through a series of faq-style questions and answers about parthenos and the context in which it operates. likewise, news about parthenos events, announcements, or about significant activities happening in associated projects will be added on a continuous basis. as the project develops over time, we will also regularly publish news stories about the project’s progress and its substantial results. furthermore, we will continuously extend and adapt the design, structure and functionality of the site in response to feedback and changing requirements. pin takes the lead in developing and expanding the website with substantive editorial input provided by knaw-niod. feedback on the site, and input for new content, will be sought from all parthenos partners.

social networks

parthenos takes full advantage of the most used and effective social networks to support its dissemination. in doing so, we will aim to build on the extensive social networks that are already in existence within the consortium.
a twitter account (@parthenos_eu) has been set up, and will be used to report on the project’s activities and alert followers to new content on the website. during major events (such as the kick-off meetings, participation in conferences etc.) knaw-niod and pin will facilitate live blogging sessions. a youtube channel has also been set up (https://www.youtube.com/channel/ucnkjnfo_iffoai vh t hw). the channel already showcases two videos, and we plan to produce further short videos to document forthcoming parthenos meetings and events. flickr will be used to share the project’s photographic documentation. the official parthenos page is available at https://www.flickr.com/photos/parthenos_eu/. usage of further social networks such as google+, linkedin and facebook will be considered when and if the need arises.

over the next months, our main goal is to significantly expand our social networks, and to ensure that our followers receive frequent, interesting and engaging updates from the project. pin and knaw-niod share responsibility for managing the project’s social media accounts. all partners are encouraged to help widen parthenos’ social networks by following us, retweeting our posts, etc.

mailing list

website visitors can sign up to a parthenos email list. this mailing list is hosted on the popular mailchimp platform (see http://mailchimp.com/). subscribers to the list will receive updates and news about parthenos in their inboxes by means of regular newsletters (cf. section . ). our goal over the next months is to populate our mailing list. during the first two months, we will focus on gathering the details of individuals working at parthenos partner institutions. partner recipients of the newsletter will be encouraged to spread the parthenos newsletter among their own networks, which is expected to gradually result in a substantial number of subscribers from outside the parthenos consortium. pin and knaw-niod manage the parthenos mailing list.
all partners will contribute to its population (forwarding of newsletters, spreading the word, etc.).

newsletters

we plan to keep the subscribers to our email list up to date about parthenos by means of a periodic newsletter (ca. issues per year). the newsletters will provide subscribers with a concise summary of all the latest parthenos-related news since the last issue. apart from reports about the project’s progress and announcements about forthcoming events, etc., the newsletter will also contain news about important developments in the various fields related to parthenos’ activities. a template for the newsletter has been designed by pin in accordance with parthenos’ visual style – see below. as already mentioned above, we will aim to produce issues of the newsletter over the forthcoming months, with the first issue distributed on july . the newsletter will be edited by knaw-niod and pin, but will contain contributions from all partners and all work packages.

figure : newsletter

e-journal and open access repository

wp will evaluate the need for creating a scientific e-journal covering the field of e-humanities research. it will further investigate the possibility of setting up and managing a scientific repository service for open access pre-print storage of scientific papers. over the forthcoming months, we will investigate the community requirements and needs for an e-journal and repository in conjunction with wp . ugoe is leading this activity, supported by knaw, cnrs, csic and kcl.

(joint-) events

parthenos will implement a comprehensive programme of joint events (symposia, workshops, public presentations) either directly managed by the project, or co-organised with other relevant international/national initiatives. over the forthcoming months, we will identify themes for events, possible co-organisers and suitable venues. we will also fully plan at least one joint event in that period, to take place in the first half of .
contacts are already established with the many research communities composing parthenos; among others, pin is exploring the possibility of holding one event during the dariah and clarin general assemblies. pin is leading this activity, supported by clarin, knaw-niod, cnr, kcl, oeaw, mibact-iccu, ugoe.

it is expected that the university of potsdam will take over all of the university of göttingen’s (ugoe) contributions to parthenos.

dissemination materials

to support the project’s communication and dissemination mission, an initial set of dissemination materials has been produced. these materials, as well as our plans for the creation of further materials, are presented in this section.

logo and visual style

a distinct, clear and easily recognizable visual style is arguably as important to the achievement of our communication and dissemination goals as the availability of suitable dissemination channels and concise tailored messages. indeed, as famously coined by marshall mcluhan, “the medium is the message”. one of the main goals in designing the project identity was to represent a key assumption behind parthenos, namely the importance of digital technologies in the fields of linguistic studies, digital humanities, cultural heritage, history, archaeology and other related fields. therefore, the project’s visual identity has to symbolize the bridge between technologies on the one hand and the above-mentioned research fields on the other. this bridge is most prominently visible in the project’s logo, which seeks to symbolize the linkages between heritage and modernity. it is composed of two main circles, the outer recalling antiquity and the inner representing modernity through an abstraction of modern electronics wiring, rendered according to the circularity of the logotype.
figure : parthenos logo (simple)

figure : parthenos logo (w/ description)

a clear, minimalistic approach informs all of the documentation templates created for the project. so far the following resources have been created (all available via d science):
• logo (in various versions)
• letterhead
• powerpoint template
• word deliverable template
• a visual style guide including details about font usage and colour schemes
• a rollup used during the kick-off meeting

press relations

obtaining good press coverage is one of the long-term goals of parthenos. good press about the project’s activities means stronger and more sustainable impact for the project. it is, therefore, crucial to set up effective and coherent guidelines to be followed by all the partners when dealing with parthenos press releases. an up-to-date information media kit and project press releases will be developed by pin and knaw-niod for all the project’s major events and to disseminate the project results. partners will be responsible for translations and regional adaptations as well as for spreading the press releases to relevant regional stakeholders and at european/international level. partners wishing to schedule national or international press conferences related to parthenos will need to discuss feasibility in advance. the primary contact for all press relations is stefano sbarbati (pin). as a rule of thumb, press conferences are to be scheduled only to present relevant and newsworthy project outcomes to the press. this rule is valid both for the international media and the local press. each partner is responsible for the organization of its press conferences and especially for inviting relevant journalists. knaw-niod and pin will support press conferences with material if needed.
depending upon the type of news, there might be the need to add some additional material to the press release, such as fact sheets, background information and photos in high definition. a press release template has been created and can be accessed on d science (https://goo.gl/knxutv).

media productions

thanks to the expertise provided by the consortium, parthenos communication will also be based on enriched media. events, topics and project updates will be covered with videos and short documentaries, in order to appeal to wider audiences. the use of videos and enriched media represents not only a means to directly reach wider audiences and communities, but is also a useful instrument for granting access to parthenos to online press and television interested in covering the topic. the production of a series of videos covering the major issues tackled by parthenos is currently under discussion, and its feasibility will be discussed in the next iteration of the present document.

further planned dissemination materials

knaw-niod is currently preparing a first version of a generic parthenos powerpoint presentation and a project leaflet. both will be ready in autumn , and will focus on a general presentation of the project and its main activities. they will be made available to all partners for usage and distribution at conferences and other suitable events. at a later stage in the project, an updated powerpoint presentation and a more detailed brochure will be prepared.

external dissemination activities for year

section above already contains details about our plans for the first year in regard to the dissemination and communication channels directly managed by parthenos. this section contains details about potential dissemination avenues outside the direct control of the project.
events

active participation at external events (conferences, workshops, symposia, fairs) is crucial to parthenos’ dissemination strategy as it allows direct contact with the research communities with which the project wishes to engage. all partners will present parthenos at external events with knaw-niod providing co-ordination. table below details suitable events pertaining to the fields of linguistic studies, digital humanities, archaeology, cultural heritage and history that could be fruitfully targeted by parthenos. it should be noted that for some of the earlier events, deadlines for calls for papers and registration will already have passed. they are listed here for reference purposes in order to identify future iterations of the same events.

name of event | location | date | community
th corpus linguistics conference (cl ) | lancaster, uk | - jul | linguistics
th ieee international conference on e-science | munich, germany | aug - sep | research infrastructure (ri)
digital heritage international congress | granada, spain | sep- oct | heritage
st annual meeting of the european association of archaeologists (eaa ) | glasgow, uk | - sep | archaeology
semantic web for cultural heritage workshop (sw ch’ ) | poitiers, france | - sep | cultural heritage
international conference on theory and practice of digital libraries (tpdl) | poznan, poland | - sep | ri
nd international symposium on virtual archaeology, museums and cultural tourism | delphi, greece | - sep | archaeology; cultural heritage
nd international apex conference (*) | budapest, hungary | - sep | ri; history
national movements and intermediary structures in europe (nise) annual gathering (*) | swansea, uk | - sep | ri; history
athena plus conference | rome, italy | oct | cultural heritage; ri
europeana digital service infrastructure meeting (dsi) | rome, italy | oct | cultural heritage; ri
icdh : xii international conference on digital heritage | london, uk | - nov | cultural heritage
conference on cultural heritage and new technologies (chnt) | vienna, austria | - nov | cultural heritage
computer applications & quantitative methods in archaeology (caa) conference | oslo, norway | mar- apr | archaeology
language resources and evaluation conference (lrec) | portoroz, slovenia | - may | linguistics
digital humanities | krakow, poland | - jul | digital humanities
th annual meeting of the european association of archaeologists (eaa ) | vilnius, lithuania | aug- sep | archaeology
th european association for lexicography (euralex) conference | tbilisi, georgia | - sep | linguistics
th plenary meeting of research data alliance | united states | - sep | ri
ipres , th international conference on digital preservation | bern, switzerland | - oct | ri
th plenary meeting of research data alliance | barcelona, spain | mar | ri
eaa rd annual meeting of the european association of archaeologists | maastricht, netherlands | sep | archaeology

table : external events
(*) parthenos already has confirmed participation at these events

in addition to such external events, the two erics participating in parthenos and associated ri projects will periodically host their own events, which provide excellent dissemination opportunities for parthenos. table below provides an overview of such forthcoming events.

name of event | location | date
clarin workshop on digital resources and services in social sciences and in humanities | prague, czech republic | sep
clarin annual conference | wroclaw, poland | - oct
dariah general assembly | berlin, germany | nov

table : events organised by associated erics/projects

international event co-operations

parthenos is promoting the organization of workshops with top-level extra-eu institutions and research communities, planned for early . among others, discussions are ongoing with the library of congress (us), the national gallery of art (us), a group of us universities, the mexican instituto nacional de antropología e historia (inah), the universidad nacional autónoma de méxico (unam) and other institutions in north america.
the aim is to foster collaboration with world-class institutions in the fields addressed by parthenos, in order to exchange best practices and strengthen the effectiveness of parthenos’ output.

publications

parthenos will be publishing its substantive research writings in academic journals. all partners are encouraged to publish their work in academic journals. authors of scientific publications should adhere to standard good academic practice, and particularly note the following:
• mention eu support for the work,
• notify the consortium of the publication,
• take cognizance of the ec’s open access policy,
• provide a digital copy to the consortium, to be made available on the website (if the publisher agrees with the open access self-archiving). if not, a link will be provided to an archive copy elsewhere, or a copy will be kept in storage in case self-archiving is not allowed.

table below provides a first overview of potential journals that partners could target with their articles. it should be noted that this list is currently very incomplete, and will be updated as an outcome of task . . as it is expected that publication of scientific papers will predominantly occur in the second half of the project, a more extensive list of possible publication outlets will be provided in the next iteration of this deliverable.

name of journal | deadlines

international journal of humanities and arts computing: the international journal of humanities and arts computing (formerly history and computing) is one of the world’s premier multi-disciplinary, peer-reviewed forums for research on all aspects of arts and humanities computing. it focuses both on conceptual or theoretical approaches and case studies or essays demonstrating how advanced information technologies further scholarly understanding of traditional topics in the arts and humanities. the journal also welcomes submissions on policy, epistemological, and pedagogical issues insofar as they relate directly to computing-based arts and humanities research. http://www.euppublishing.com/journal/ijhac

digital humanities quarterly: digital humanities quarterly (dhq) is published by the alliance of digital humanities organizations (adho). it is an open-access, peer-reviewed, digital journal covering all aspects of digital media in the humanities. http://www.digitalhumanities.org/dhq/about/about.html

digital scholarship in the humanities (dsh): formerly known as llc (literary & linguistic computing), dsh is an international peer-reviewed journal on digital scholarship in the humanities, published by oxford journals on behalf of the alliance of digital humanities organizations. it publishes results of research projects, descriptions and evaluations of techniques and methodologies, and reports on work in progress.

journal of computing and cultural heritage: jocch publishes papers of significant and lasting value in all areas relating to innovative use of information and communication technologies in support of cultural heritage. http://jocch.acm.org

international journal of heritage in the digital era: ijhde is a quarterly peer-reviewed journal in the area of digital cultural heritage and digital libraries.

table : journals

communication evaluation and assessment

the effectiveness of parthenos’ communication and evaluation activities will be periodically measured. each subsequent iteration of this deliverable will provide an assessment of the implementation of the previous plan, and provide targets for the forthcoming period. periodic evaluation is undertaken to guarantee that all our stakeholder communities are reached and provided with appropriate information. it also has an important role to play in shaping future iterations of the communication plan by providing feedback on what works and what needs refinement.
all partners have a significant role to play, not only in the implementation of the communication plan, but also in its iterative formulation and review. in particular, all partners are called upon to update their own dissemination activities according to the survey template described previously in section . . for the first period, spanning months - , the following targets are proposed:

indicator; quantity (by month )
avg. number of website visitors per month
total number of website visitors
number of eu/eea countries reached through website
total number of referrals
number of contacts in the mailing list
number of twitter followers
avg. monthly number of tweet impressions
number of joint events -
number of attendees at joint events -
number of press releases
number of leaflets/other publicity materials distributed
number of conference papers
number of attendees reached at conferences
number of scientific papers
articles in professional journals and online newsletters

table : performance targets

profession © by the modern language association of america

valuing digital scholarship: exploring the changing realities of intellectual work

james p. purdy and joyce r. walker

because published research is a significant component of tenure-and-promotion cases, even at institutions with an explicit teaching focus, faculty members often plan their pretenure scholarly activities on the basis of their understanding of how different types of scholarly work will be valued. at the same time, new technologies have influenced tenure-and-promotion considerations, expanding not only available venues of publication but also definitions of scholarly activity and production. because these new technologies include both new knowledge products and new approaches to knowledge construction, efforts to categorize the scholarly value of digital work have been difficult and complicated.
while both faculty members using digital tools and committees charged with evaluating tenure-and-promotion cases have tried to create appropriate categories for digital scholarship, their success remains partial. both continue to raise important questions and concerns about how to approach digital work.

james p. purdy is assistant professor of english and director of the university writing center at duquesne university. joyce r. walker is associate professor of english at illinois state university.

the late twentieth and early twenty-first centuries have seen a range of discussions regarding the value of digital scholarship in tenure-and-promotion cases, both in the humanities in general (andersen; borgman) and in english studies in particular (bernard-donals; carnochan; lang, walker, and dorwick; levine; miall; nahrwald; janice walker). increasingly, these discussions have pointed to the need to account for and value digital work. the cccc and the mla, for instance, both explicitly argue that digital scholarship is legitimate and should be evaluated accordingly. the cccc advises that tenure-and-promotion committees and candidates should “account [for] technology-related work” in research, teaching, and service (cccc promotion), and in its “report of the mla task force on evaluating scholarship for tenure and promotion,” the mla asserts, “departments and institutions should recognize the legitimacy of scholarship produced in new media, whether by individuals or in collaboration, and create procedures for evaluating these forms of scholarship” ( ).

these published recommendations, as well as anecdotal accounts of attempts to follow them, have been valuable to scholars interested in exploring the potentials of digital scholarship. however, they also highlight the limitations that remain embedded in current approaches to digital scholarship.
though we are beginning to recognize the importance of digital work, discussions have tended to focus primarily on establishing digital work as equivalent to print publications to make it count instead of considering how digital scholarship might transform knowledge-making practices. though well-intentioned, the statements of governing institutions such as the cccc and the mla, which guide the decision making of tenure-and-promotion committees nationwide, can inhibit the drive toward alternative forms of scholarship, because they, perhaps unintentionally, place digital and print scholarship into narrowly constructed, oppositional genres that often privilege print and reinscribe the creative-scholarly split that has long been a problem for english studies. cheryl e. ball and ryan m. moeller point to how “many tenure guidelines . . . label research as either creative or scholarly,” counting only the scholarly. in this binary, because digital texts are more visibly and consciously designed, they usually fall into the creative category; print texts fall into the scholarly category, which situates digital work outside the purview of knowledge making in the discipline. given the inclusive intention behind the mla and cccc statements, this division is not what english studies wants.

this narrow binary is perhaps the most significant (and problematic) aspect of current attitudes regarding the value of digital scholarship. w. b. carnochan, for example, reinforces a print-based affiliation in his claim that

you can evaluate an electronic publication in the same way you evaluate anything else, except that (being old school) you may want to read it in printed form. doctoral institutions have little experience evaluating electronic publications not because they pose a unique challenge but because they are not, or not yet, accepted currency. ( )

carnochan is accurate that electronic work is still not widely accepted, though this situation is slowly changing, but he is less accurate that evaluating such work does not “pose a unique challenge.” not all digital publications can easily be printed out, nor should they be.

this binary relation creates a number of difficulties, both for scholars who wish to compose in and for digital spaces and for tenure-and-promotion committees who need and want to evaluate this work: because a binary relation always privileges one term, this positioning inherently situates either print or digital as superior. while current sentiment usually prefers print, the sentiment might eventually shift in favor of digital, which is also problematic. because the current binary privileges print, the perception is that good scholars (who are understandably concerned about tenure and promotion as well as the intrinsic value of their scholarship) will choose print. to prove its merit, then, digital scholarship must establish itself as equivalent to print, using criteria developed in response to the affordances and predispositions of print genres rather than through a process that explores the potential affordances and predispositions of both digital and print texts and publication spaces. finally, situating print and digital publications in a binary relation focuses on the contrast of their respective materialities (i.e., one is printed out, while the other is looked at on a computer screen) instead of allowing scholars and tenure-and-promotion committees to examine and evaluate other differences and similarities that may affect the dissemination, use, and value of these texts.

the field should consider both digital and print publications in relation to larger, more systemic issues regarding the nature and value of various kinds of scholarly work: design and delivery, recentness and relevance, and authorship and accessibility.
these three areas, because they extend beyond simple materiality, provide a framework for analyzing scholarship across media and therefore define the intellectual purview of english studies as meaning making in all textual forms, not just print. a more systemic look at the affordances of digital publications can help us not only discuss ways we might reevaluate print publications but also define a more comprehensive list of the scholarly activities that we value and wish to promote.

design and delivery

in this section, we consider the ways that digital scholarship affords and promotes different kinds of intellectual activities and can significantly alter perspectives regarding the design and delivery of textual information.

awareness and control of design and delivery choices

popular and academic discussions (see, resp., jaschik; levine) often argue that the most important selling point for a turn toward digital environments is their alteration of the form of textual delivery. while productive, such discussions tend to focus on improving the ease and minimizing the expense of bringing information to readers, thereby limiting the value of digital publications to practical aspects of delivery and framing publishing digital scholarship as a material delivery choice, not a knowledge-making practice. other advantages, such as the promotion of nonlinear thinking, are overlooked.

a more nuanced exploration of design and delivery is possible. opportunities for awareness and control are often unavailable in the naturalized systems of print publication, where attention to the rhetorical canon of delivery has diminished (see crowley). for kathleen welch, this move is to the detriment of writing and society itself, because it “reproduces the form/content binary that drives the movement to relegate writing to skills and drills and perpetuates the status quo of racism and sexism . . .
and the removal of student-written language from the larger public arena” ( ). we would add that this neglect of delivery also precludes the possibility of new forms of scholarly knowledge making that might challenge this racism, sexism, and denigration of student writing.

digital scholarship, because it requires authors to create texts that are publishable and readable online, renews the need for a consideration of delivery. as dànielle devoss and james e. porter point out, attention to how digital delivery differs from print delivery is crucial in recognizing new possibilities for textual production and distribution afforded by digital technologies, and in understanding and responding to the application of intellectual property and copyright law to online spaces. these technologies ask for “an expanded notion of delivery . . . that embraces the politics and economics of publishing” ( ). applying this expanded notion of delivery to digital scholarly journals like computers and composition online, kairos, and vectors can help us better understand what it means to deliver and archive scholarship in a single venue. practicing this expanded notion of delivery may better prepare us to recognize not only the unique forms of production and distribution afforded by digital texts but also the forms of production and distribution embedded in print texts.

we do not argue that digital publication spaces are the only locations where examination of the ramifications of design and delivery choices can occur; such considerations (e.g., about page layout and image inclusion) are also part of the decision making in which scholars engage as they publish in print venues. for example, steve westbrook explains how he was unable to include a desired visual image, a student’s parody of a maybelline advertisement, because the copyright holders would not grant him permission ( ).
anne wysocki’s “impossibly distinct,” an argument about the centrality of visual images in textual arguments, had to be reprinted because of mistakes with image reproduction in the initial printing (hawisher and selfe). design and delivery clearly matter in print, particularly when texts explicitly engage topics related to visual design.

despite authorial consideration of design and delivery in print, the visibility of their choices in digital environments encourages us to alter our approach to these choices in both digital and print locations. scholars can make more informed decisions about designing print publications by attending to the design of digital ones (e.g., see laspina ). explorations of digital publications, then, can highlight an author’s ability (and responsibility) to control the design and delivery options available for disseminating scholarly information.

as stephen bernhardt, karen schriver, wysocki (“impossibly distinct” and “multiple media”), and others have reminded us, the visual design of a text shapes readers’ interactions with it and therefore participates in the communication of its ideas. when we do not control textual design, we lose an opportunity to influence readers’ engagement with our scholarship. conversely, when we attempt to better understand and control publication choices, we can see more clearly that they are based on a range of influences and possibilities for arranging, organizing, and structuring information as well as for facilitating a reader’s comprehension and use of a text.

design is part of all scholarly production and should be considered by those who make tenure-and-promotion decisions. the design of our online article “digital breadcrumbs” mimics the design of google and is part of our argument that scholars often use digital resources, including popular search engines like google, in nonprescribed ways for academic research-writing tasks.
in presenting this article as part of our tenure-and-promotion materials, we face decisions about how best to showcase the scholarly content of the article and to highlight the ways the visual and structural design are crucial to our knowledge making.

the fact that a focus on design is often necessary in composing certain types of digital publication highlights an opportunity: tenure-and-promotion committees might include design and delivery choices as part of the range of scholarly activities that can be considered. instead of making a binary comparison of print and digital compositions or relying on a hierarchical catalog of publication venues based on print sources, committees might examine how a scholar uses a range of venues, tailoring them to particular kinds of work. different publications require different scholarly activities to achieve certain rhetorical effects or to reach certain audiences.

use of multiple modes

the use of multiple modes for the delivery of scholarly information is perhaps the most visible way in which digital publication venues can alter our understanding of how we present information. digital venues allow authors to integrate word, sound, image, and video in new ways (hull and nelson; johnson-eilola; selfe; wysocki, “multiple media”). they thereby open the door for conversations about the relation between multiple modes of communication and the scholarly value and legitimacy of these modes. because print publications often do not (and sometimes cannot) offer alternative modal choices, we have learned to see the paragraph-based print text as containing the highest order of logical coherence and, ultimately, knowledge making. however, when we consider alternative modes, we must also reconsider the methods through which they produce knowledge. we can no longer evaluate both print and digital publications only according to criteria developed for print-based communications.
for example, the mit libraries podcast series on scholarly publishing and copyright offers both podcasts and video articles on issues related to scholarly publication. these webtexts are specifically designed for faculty members and students at the massachusetts institute of technology but are available to any interested reader (podcasts). by using sound and video, the authors have altered not only the audience for their texts but also the ways that an audience might be reached and make use of the information. the shift to audio and video may seem relatively simple, but these alternative modes affect when, how, and in what ways users will receive and interact with the information on the site. in considering the value of such texts, one cannot stop at mode of production or genre. a podcast in this series should be evaluated in a framework of the rhetorical situation: how broad or specific is its content? were articles solicited or submitted? how were they reviewed? how have they contributed to disciplinary knowledge? when assessed in such a rich framework, a text can escape the print-digital binary and be valued more accurately for the work it does to make or share knowledge and for the appropriateness of the author’s choices for the needs of content and audience.

rhetorically rich composition practices

digital publication processes can raise awareness of the relation between textual form and content. new media technologies allow us to split form and content in composing (ball and moeller; perkel; stroupe). in turn, digital work asks us to think of textual design as a communicative practice, a notion new to many writers who have been conditioned to ignore or dismiss design. web . (digital spaces, like social networking sites, wikis, and blogs, that allow for dynamic and collaborative content construction) facilitates the coconstruction of meaning and social space.
second life, for example, is a collaboratively authored digital world used for everything from games to language learning to academic conferences. its multiple users design the space of interaction, creating visual landscapes and personal avatars, as in katherine ellison’s island , where students re-create streets and buildings in and clothe avatars in attire appropriate for eighteenth-century london, and bryan carter’s virtual harlem, where students work to rebuild virtually s harlem. such digital spaces expand what it means to compose.

digital compositions are, in fact, providing us with exigency: to compose in many types of digital environments (e.g., webtexts, blogs, wikis, web sites, databases), authors must develop a more complex rhetorical understanding of the nature of each composition. interestingly, at the same time as some scholars have begun to examine the nature of multimodal composition using digital tools, other scholars in cultural-historic activity theory and actor-network theory have been working to develop theoretical frameworks for understanding and mapping rhetorical activity in complex ways (latour; prior and bazerman; prior et al.; russell). that scholars and teachers are increasingly engaging in the study and production of messy texts provides an important opportunity for us to move beyond narrow, print-based, nonrecursive conceptions of publication for tenure and promotion. explorations of more complex rhetorical mapping could significantly enrich our evaluation of scholarly work in english studies.

we would like to see the evaluation of scholarly work in english studies expanded to include not only the design and delivery for a particular textual production but also the entire range of choices made by authors for the development, production, dissemination, and reception of their ideas in various digital and nondigital venues.
such an evaluation would require a more flexible approach by tenure-and-promotion committees to the kinds of activities considered and a more comprehensive approach by faculty members to the compilation of tenure-and-promotion documents. instead of limiting ourselves to discussions of the rigorousness of the peer-review process of a particular journal (whether digital or print), we might begin to ask about the relations among ideas and publication venues, design and delivery of content, and reader interactions and the dissemination of scholarly ideas.

communication of complex ideas

while print scholarship has frequently been considered the venue in which complex ideas are best disseminated, privileging this method for producing knowledge fails to take into account the fact that digital scholarship can also allow for the communication of complex ideas through alternative modes, interconnectivity, nonlinear relations, and various kinds of author-reader interactions. as mike rose points out, complex processes and thoughts can be difficult to communicate in print texts because they lend themselves to presenting chronological, step-based relations between static elements (“sophisticated ineffective books” and “speculations”). web . environments provide the opportunity for nonprint publication (video, audio, image) and more interactive exploration and development of complex ideas. in some cases, digital scholarship may provide a better location for a more speculative, associational kind of knowledge making (joyce walker), as in wysocki’s “a bookling monument,” which suggests, through an interactive textual body interface, that how we see texts shapes how we see (our) bodies, and vice versa.

to take full advantage of these opportunities, we must move away from a perspective on scholarly activity that regards this kind of knowledge making as less than rather than different from a scholarly article published in a print journal.
a wiki or blog that allows faculty members to compile and share resources related to their specialty has clear value as a teaching tool and community resource, but its scholarly value can be harder to assess. gwen tarbox’s blog on children’s literature, book candy, illustrates that combining resources for colleagues and students with knowledge production is indeed possible. evaluating such a text, though, requires an exploration of not only the site but also the associated conversation in which an author endeavors to participate and its conventions and goals.

recentness and relevance

in this section, we discuss the recentness and relevance of various kinds of digital scholarship.

speed

that digital scholarship can be published quickly allows scholars to disseminate promptly the results of their research and readers to gain ready access to information for use in their classrooms or research. conversely, print scholarship, which affords a relatively slow pace of knowledge accretion because the process of publication is more extended, allows more time for authors and editors to ensure accuracy. we draw this contrast not to establish a fast-slow binary but to emphasize that, while the speed of digital publication in scholarly journals can be important (especially for scholars publishing in fields where the state of knowledge changes rapidly), there are other, less formal scholarly venues where speed of interaction is generative. the ability to converse quickly, whether in person or across geographic distance, can stimulate and sustain knowledge making.
using kenneth burke’s parlor conversation as our metaphor ( – ), we can envision scholarly work (e.g., participation in digital conferences, scholarly electronic discussion lists, chat spaces) that contributes to a lively, productive, ongoing conversation, where scholars at various stages in their careers and research both generate new knowledge and benefit from the insights of others. wikipedia can serve this function: discussion from a range of scholars leads to new knowledge (see bruns; poe). such work has long been a part of scholarly activity in the form of face-to-face conferences, but it is important to note differences created by digital spaces, such as the creation of archives of conversations that are used over time, the data mining of site archives for the production of resource materials or research, or even the production of audiovisual archives of conference proceedings that are distributed online.

extended and dynamic knowledge production

not only can digital scholarship be published more quickly, it can also be elaborated more fully and thus do valuable rhetorical work. with essentially no limit on length, digital texts can include more material, such as appendices, questionnaires, data sets, and interview transcripts, which allows for more critical review and replication of research studies. kairos best webtext award winners, for example, are nearly all much longer than a standard print journal article (purdy and walker, “scholarship”). indeed, the authors of one such award-winning webtext said they chose digital scholarship precisely because of this possibility for expanded development. thomas rickert and michael salvo assert their webtext allows for “more detailed examinations of key themes and concepts,” explaining:

the web works on a proliferation model where it costs nothing to produce more, access more, and find additional resources. . . .
in creating this [webtext], then, we have done our best to make available all the resources that are stripped away during the process of creating an authoritative print-based text.

these “resources” include a glossary of definitions, extended endnotes, a list of fifty-two links to follow for additional information, five full-color images, six audio compositions, two podcasts, and a powerpoint slide show.

these extended digital texts can be updated and revised over time. mistakes can be corrected, author affiliations (which can be important for tenure and promotion) updated, and citations added. such work, then, remains current and alive, reflecting new discoveries and perspectives and reinforcing the notions that scholarship evolves, that texts are dynamic, and that knowledge making is an ongoing process. while new editions of print books and articles can take years to be published, digital texts can be updated in days or weeks. print texts, perhaps partly because of the slower speed of publication, tend to value knowledge produced as more permanent and less subject to (speedy) alteration. this more permanent view of knowledge creation has advantages in situations where either frequent revisions of knowledge do not need to occur or consistency over time is valued. to evaluate scholarly activity that extends across digital, face-to-face, and print venues, we need to create frameworks that allow us to consider the ongoing, recursive nature of knowledge production.

authorship and accessibility

this section examines aspects of the authorship and accessibility of digital scholarship to help us think about how scholarly value, in terms of both innovation and knowledge making, may contribute to social interactions.
collaboration

the frequently collaborative nature of digital scholarship allows for renewed attention to the value of collaborative writing and the problems of automatically privileging single-authored publications in tenure-and-promotion decisions. the mla task force calls for valuing “scholarship produced in new media, whether by individuals or in collaboration” (“report” ; our emphasis). anne-marie pedersen and carolyn skinner reaffirm the vital role of collaboration in producing digital scholarship, contending, “[t]he composition of audio or video projects relies on collaborators’ combined knowledge of the project’s topic, its dominant modalities, the technology used for recording and editing, the medium in which the project is read or circulated, and the conventions or expectations of audiences” ( ). scholars must be proficient in multiple modalities and technologies to produce publishable digital work. for this breadth of knowledge and for greater control over design and delivery, collaboration is valuable and at times necessary. given the rapid pace of change in digital technologies, it is difficult for any one scholar to be sufficiently conversant in all the “modalities,” “technolog[ies],” “medi[a],” and “conventions” pedersen and skinner mention. though authorial choices in these areas have traditionally been more limited in print, recognizing how collaboration allows for more informed decisions and production competencies can make us appreciate more its value in print as well as digital forms.

the collaboration encouraged by digital scholarship extends beyond coauthors producing texts: digital scholarship fosters more collaboration among readers, writers, and textual sources. digital texts promote a culture of sharing as they can be circulated easily among many people, for example, through social bookmarking sites, e-mail, and discussion boards. as ellen cushman, dànielle devoss, jeffrey t.
grabill, bill hart-davidson, and jim porter argue, such ease of sharing changes how we interact with and use others’ texts:

[a]udiences and writers are related to each other more interactively in time and space. writers can easily integrate the work of others into new meanings (text, image, sound, and video) with a power and speed impossible before computer technologies[, which] may be one of the most significant impacts of computer technologies on the contexts and practices of writing.

digital scholarship fosters not only interactivity among texts and people but also cooperation over agonism in academic endeavors, a shift consistent with theories of writing, including postprocess and feminist perspectives, that value nonadversarial approaches to knowledge production (see breuch; hutcheon; mortensen and kirsch; moxley; olson). the borrowing and communal engagement facilitated by digital technologies can lead to new textual forms and knowledge-making practices that enact these theoretical perspectives, so they too should be considered in the evaluation of scholarly work.

reader-user interactivity

new and dynamic trajectories of composition can arise from the user interactivity that digital publications promote. digital composition spaces encourage alternative ways for authors and readers to interact (joyce walker), as in electronic literature like emily short’s “galatea” and scholarly webtexts like adrian miles’s “violence of text,” where readers determine how the text unfolds. the digital nature of such texts makes them not only more usable but also reusable. it allows readers to be authors and promotes remix and assemblage, as demonstrated by eric faden’s video “a fair(y) use tale,” which argues for fair use by remixing short disney movie clips (faden et al.).
johndan johnson-eilola and stuart selber argue that these beneficial writing practices help students “learn ways to use existing information to solve real, concrete issues” and “move from a focus on representation (what things mean) to action (how things function, and to what effect)” ( , ). when such writing practices arise from reader-user engagement, there is greater potential for knowledge to be spread and used, which, after all, is the purported goal of scholarly research.

tenure-and-promotion committees judge work for its influence on the field. the influence of print texts is traditionally measured with citation: authors provide in-text and bibliographic references. they can certainly do the same in digital scholarship, but viewing all scholarship through the lens of reader-user interactivity can help us better recognize, understand, and value other possibilities, in any medium. as doug eyman points out, there is a spectrum of citation possibilities in digital and print texts, including formal citation (explicit in-text and bibliographic references), informal citation (in-text mention of an author or title but no explicit citation), hyperlinks, and appropriation or quotation of part or all of a text with or without attribution ( – ). all these levels of engagement should be taken into consideration in assessing the influence of scholarship.

public scholarship

given the increasing calls in english studies to make our scholarship more public, decisions to produce and publish scholarship in ways that better reach the public should be evaluated across media. contributors to profession’s presidential forum, “the humanities at work in the world,” trumpet the value of disseminating our ideas to disciplines outside english and to nonacademic audiences (e.g., barsky – ; brooks ). other scholars argue that extending scholarship beyond university walls is a necessary component of academic work (e.g., cintron; cushman).
the affordances of digital technologies make it easier to export our work more widely. with the right software, readers can access many digital scholarly texts from any networked computer. they do not need to travel to a specific location, show particular credentials, or pay to subscribe to a specific journal to view these texts.

in digital environments, scholarly work can be not only brought to but also shaped by the larger public. for example, the hypercities project, the clergy of the church of england database, and the digital archive of literacy narratives (daln) all depend on public volunteers to contribute content and labor. the hypercities web site calls for people to contribute photos, maps, oral histories, and other texts that document the history of participating cities (getting involved). the clergy database relies on volunteers from across great britain to compile records of clerical ordinations, appointments, and resignations between and from individual dioceses and submit them to a master database at king’s college (what is). the daln asks people from varied social, cultural, and educational groups to submit print, image, video, and audio files that document their literacy development (“daln home”). without public participation, these scholarly projects would not exist—or at least not be as successful. having such an important role in these projects can make the public more connected to and invested in scholarly research endeavors, which can allow digital scholarship to have a broader influence and make the public more conscious of our work and its benefits.

the stakes of expanding the reach of our scholarship may be greater than the respect afforded english studies, however.
if we believe grabill’s claim that “the work of citizenship is knowledge work” and that the skills, habits of mind, and theoretical perspectives we teach and disseminate in our scholarship are crucial to this work ( ), then getting our scholarship to larger public audiences enhances and ensures the health of the citizenry of our nation. the writing and information technologies with which citizens engage are sometimes complex and always rhetorical. what scholars in english studies have to share with the public about negotiating and understanding these texts and technologies can improve the public’s participation in civic activity.

accessibility for research

digital scholarship is also accessible to other scholars for research purposes. researchers increasingly use citation managers, such as endnote and zotero, and social bookmarking sites, such as del.icio.us and digg, to store and organize scholarship. with these programs, scholars can tag, annotate, and classify digital scholarship in ways that allow for easy retrieval and later use, creating a personalized record of resources and connecting to other researchers who have consulted them (purdy). print scholarship, unless it is digitized, cannot be stored in citation managers or social bookmarking sites. but in digital form, scholarship can be searched for key words, authors, and so on and be linked directly with other texts. thus it can be readily found and used by others—including our students, who often turn first to online sources, which they find quickly through key word searches in google. digital work is more likely to be read and cited by younger generations conducting research.

professional scholars also often consult digital texts for research-based writing, though they may use more sophisticated practices (e.g., employing advanced boolean searches, searching in vetted archives). in surveying scientists about their article seeking and reading behaviors, carol tenopir and donald w.
king found that the “advent of digital technologies on searching and publishing . . . has had a dramatic impact on information seeking and reading patterns in science.” their study reveals that in over half the texts scientists read and over ninety percent of searches they did were from electronic sources. the digital realm serves as the primary locus of research for researchers at every level. if academics are more likely to turn to digital sources for their research, academic work published in digital spaces should be recognized for tenure and promotion.

the ability to follow hyperlinked citations in a digital article can shape future citation behavior. not only do the science scholars whom tenopir and king surveyed increasingly search and read digital scholarship, they increasingly cite it. tenopir and king contend, “following citation links in electronic journal articles may have proportionately more influence on citation behavior than reading behavior.” that is, linking can increase the likelihood that a text is cited. this finding is key for scholars who want their work to be read and cited—especially given how tenure-and-promotion committees often look at citation frequency in assessing a text’s success.

the affordances of digital scholarship merit attention as shaping scholarly activity. some search engines operate on the basis of frequency of key words in a text, and page rank on search results can depend on this frequency. if scholars write with such retrieval in mind, what constitutes good academic writing changes. writers are advised to compose titles and abstracts that contain and repeat key words. because digital publications are often more likely than print to be read and used, academic publishing changes too. digital availability is akin to circulation for a print journal, so scholars are well advised to publish in digital venues to maximize exposure to their work.
further potentials for searching and data mining exist once digital scholarship is retrieved. scholars can search in digital documents easily—for instance, to find in which section a particular quotation appears or to determine the number of times a specific word or phrase is used. gathering this data can take considerably more time in print texts. thus, these affordances make more feasible closer attention to both a single scholarly text and a larger corpus of scholarly texts.

digital scholarship furthers productive research because knowledge development can be traced and reshaped more easily in digital venues. scholars can provide a record of how they created a text, as in robert e. cummings and matt barton’s wiki writing, an edited collection that was composed in a wiki. this record provides invaluable access to the development of knowledge and a site for future research. scholars benefit from having multiple versions of their text saved for easy comparison; other researchers benefit by having a built-in repository of work to study.

the processes of knowledge making we discuss above are not unknown to the academy. they are simply less visible than the embedded values that have come to be associated with the scholarly print text. this decreased visibility is partly because a notion of single, print authorship is perhaps easier and less complicated to assess, and the more collaborative and open practices associated with digital work have, in the past, been part of face-to-face interactions and personal communications not easily archivable or sharable with others. tenure-and-promotion guidelines are often not explicit about how this rich range of activities might be documented or assessed, and efforts to categorize activities into discrete sections for service, teaching, administration, and research often exacerbate the problem because a faculty member’s scholarly activities can reach across these boundaries.
because the category of research is often limited to narrowly defined “scholarly publications” (boyer ), other activities, such as those associated with digital work, which may represent significant production and dissemination of new knowledge, are considered secondary to those embodied in single-authored, print-based, textual production. perhaps it is time we ask ourselves why. do we wish to continue to privilege only one of the myriad opportunities now routinely available to us for creating, sharing, and contributing to the knowledge-making practices of our respective scholarly communities? since we have long claimed to value the kinds of speculative thinking and association making that lead to new conversations and innovations, it seems limiting to place primary value on the linear, argumentative coherence that has become the province of the scholarly print essay in the american academy.

as more scholars produce and value digital texts, we need ways to assess them. the process of rethinking these values will require us to create new frameworks, like the one we begin to advance in this article, in which the nature of scholarly work is broadly defined and where the materiality of a text becomes less important than a consideration of the complex rhetorical situation in which it makes meaning for the members of a discipline. as the likelihood that more and more scholarly publication will move online combines with the knowledge that digital forums allow for and often encourage different kinds of scholarly activities, we are all faced with the challenge of evaluating texts that do not look familiar, do not do the same kind of work we are used to seeing, and do not produce the kind of information or ideas we think of as scholarly. a better, more comprehensive understanding of alternative scholarship may require us to rethink how we read a text.
one way to evaluate scholarly production that avoids a simple print-digital binary opposition is to think less about whether a text is digital or print and more about what it produces, participates in, or does. to develop a more robust, complex evaluation framework, we might ask these questions:

what kinds of knowledge or ideas does the text produce or challenge?
who uses it?
who interacts with and changes it?
with which recognizable genre (e.g., blog, scholarly web site, wiki) might it correspond?
what skills or expertise did the scholar use to produce it?
how are the media used to produce and disseminate it appropriate to the topic and audience?
who has shared in this production?
can the author(s) or the community responsible for production claim expertise in the subject matter?
do those who shared in the production continue to use and reuse the text to produce knowledge?
who has evaluated and assessed the ideas this production contains, and can those who have vetted the production claim scholarly expertise that we respect and value?

such an approach to assessment would not look very much like the tenure-and-promotion activities now in place at most institutions. our current means of evaluating scholarly activity have not caught up with our burgeoning desire to account for nontraditional activities. these means rarely allow us access beyond the textual artifact, whether digital or print-based, to an exploration of the activity systems in which people interact (with various tools, institutions, and individuals) to create texts that disseminate information, make arguments, explore ideas, and even contribute to the ways the discipline sees itself.
investing in such activities will likely entail difficulties, but it will also enhance and expand our ability to engage in stimulating, innovative, and valuable kinds of knowledge production—and to reward faculty members for a more comprehensive range of scholarly contributions to our institutions and discipline.

works cited

andersen, deborah lines, ed. digital scholarship in the tenure, promotion, and review process. armonk: sharpe, . print.
ball, cheryl e., and ryan m. moeller. “converging the ass[umptions] between u and me; or, how new media can bridge a scholarly/creative split in english studies.” computers and composition online ( ): n. pag. web. mar. .
barsky, robert f. “safe spaces in an era of gated communities and disproportionate punishments.” profession ( ): – . print.
bernard-donals, michael. “it’s not about the book.” profession ( ): – . print.
bernhardt, stephen a. “seeing the text.” college composition and communication ( ): – . print.
borgman, christine l. scholarship in the digital age: information, infrastructure, and the internet. cambridge: mit p, . print.
boyer, ernest l. scholarship reconsidered: priorities of the professoriate. princeton: carnegie foundation for the advancement of teaching, . print.
breuch, lee-ann m. kastman. “post-process ‘pedagogy’: a philosophical exercise.” jac: a journal of composition theory . ( ): – . print.
brooks, peter. “the humanities as an export commodity.” profession ( ): – . print.
bruns, axel. blogs, wikipedia, second life, and beyond: from production to produsage. new york: lang, . print.
burke, kenneth. the philosophy of literary form. berkeley: u of california p, . print.
carnochan, w. b. “on the tyranny of good intentions: some notes on the mla task force report.” profession ( ): – . print.
cccc promotion and tenure guidelines for work with technology. ncte, nov. . web. dec. .
cintron, ralph.
angels’ town: chero ways, gang life, and rhetorics of the everyday. boston: beacon, . print.
crowley, sharon. the methodical memory: invention in current-traditional rhetoric. carbondale: southern illinois up, . print.
cummings, robert e., and matt barton, eds. wiki writing: collaborative learning in the college classroom. ann arbor: u of michigan p, . print.
cushman, ellen. “the rhetorician as an agent of social change.” college composition and communication ( ): – . print.
cushman, ellen, dànielle devoss, jeff grabill, bill hart-davidson, and jim porter. “widepaper # : why teach digital writing?” writing in digital environments. michigan state u, aug. . web. dec. .
“daln home.” digital archive of literacy narratives. ohio state u, n.d. web. apr. .
devoss, dànielle, and james e. porter. “why napster matters to writing: filesharing as a new ethic of digital delivery.” computers and composition ( ): – . print.
eyman, doug. “digital rhetoric: ecologies and economies of digital circulation.” diss. michigan state u, . print.
faden, eric, et al. a fair(y) use tale. center for internet and society, stanford law school, mar. . web. june .
getting involved. hypercities, n.d. web. sept. .
grabill, jeffrey t. writing community change: designing technologies for citizen action. cresskill: hampton, . print.
hawisher, gail e., and cynthia l. selfe. “erratum: ‘impossibly distinct: on form/content and word/image in two pieces of computer-based interactive multimedia.’” computers and composition . ( ): . print.
hull, glynda a., and mark evan nelson. “locating the semiotic power of multimodality.” written communication ( ): – . print.
hutcheon, linda. “creative collaboration: alternatives to the adversarial academy.” profession ( ): – . print.
jaschik, scott. “abandoning print, not peer review.” inside higher ed. inside higher ed, feb. . web. feb. .
johnson-eilola, johndan. datacloud: toward a new theory of online work. cresskill: hampton, . print.
johnson-eilola, johndan, and stuart a. selber. “plagiarism, originality, assemblage.” computers and composition ( ): – . print.
lang, susan, janice r. walker, and keith dorwick, eds. tenure . spec. issue of computers and composition ( ): – . print.
laspina, james andrew. the visual turn and the transformation of the textbook. mahwah: erlbaum, . print.
latour, bruno. reassembling the social: an introduction to actor-network theory. cambridge: oxford up, . print.
levine, caroline. “rethinking peer review and the fate of the monograph.” profession ( ): – . print.
miall, david s. “the library versus the internet: literary studies under siege?” pmla ( ): – . print.
miles, adrian, ed. “violence of text: an online academic publishing exercise.” kairos: a journal of rhetoric, technology, and pedagogy . ( ): n. pag. web. june .
mortensen, peter, and gesa e. kirsch. “on authority in the study of writing.” college composition and communication ( ): – . print.
moxley, joe. “datalogies, writing spaces, and the age of peer production.” computers and composition ( ): – . print.
nahrwald, cindy. “just professing: a call for the valuation of electronic scholarship.” kairos: a journal for teachers of writing in webbed environments . ( ): n. pag. web. mar. .
olson, gary a. “toward a post-process composition: abandoning the rhetoric of assertion.” post-process theory: beyond the writing process paradigm. ed. thomas kent. carbondale: southern illinois up, . – . print.
pedersen, anne-marie, and carolyn skinner. “collaborating on multimodal projects.” selfe – .
perkel, dan. “copy and paste literacy? literacy practices in the production of a myspace profile.” informal learning and digital media: constructions, contexts, consequences. ed. kirsten drotner, hans siggard jenson, and kim christian schroeder. newcastle: cambridge scholars, . – . print.
podcasts and video tutorials on scholarly publishing and copyright. mit libs., nov. .
web. dec. .
poe, marshall. “the hive.” atlantic monthly sept. : – . print.
prior, paul, and charles bazerman, eds. what writing does and how it does it: an introduction to analysis of texts and textual practices. mahwah: erlbaum, . print.
prior, paul, et al. “resituating and re-mediating the canons: a cultural-historical remapping of rhetorical activity (a collaborative webtext).” kairos: a journal of rhetoric, technology, and pedagogy . ( ): n. pag. web. feb. .
purdy, james p. “the changing space of research: web . and the integration of research and writing environments.” computers and composition . ( ): – . print.
purdy, james p., and joyce r. walker. “digital breadcrumbs: case studies of online research.” kairos: a journal of rhetoric, technology, and pedagogy . ( ): n. pag. web. july .
———. “scholarship on the move: a rhetorical analysis of scholarly activity in digital spaces.” the new work of composing. ed. debra journet, cheryl e. ball, and ryan trauman. logan: computers and composition digital; utah state up, forthcoming.
“report of the mla task force on evaluating scholarship for tenure and promotion.” profession ( ): – . print.
rickert, thomas, and michael salvo. “. . . and they had pro tools.” computers and composition online ( ): n. pag. web. june .
rose, mike. “sophisticated, ineffective books: the dismantling of process in composition texts.” college composition and communication ( ): – . print.
———. “speculations on process knowledge and the textbook’s static page.” college composition and communication ( ): – . print.
russell, david. “activity theory and its implications for writing instruction.” reconceiving writing, rethinking writing instruction. ed. joseph petraglia. mahwah: erlbaum, . – . print.
schriver, karen. dynamics in document design: creating text for readers. new york: wiley, . print.
selfe, cynthia l., ed. multimodal composition: resources for teachers. cresskill: hampton, . print.
short, emily. “galatea.” electronic literature collection: volume one. ed. n. katherine hayles, nick montfort, scott rettberg, and stephanie strickland. college park: electronic lit. organization, . web. feb. .
stroupe, craig. “hacking the cool: the shape of writing culture in the space of new media.” computers and composition ( ): – . print.
tenopir, carol, and donald w. king. “electronic journals and changes in scholarly article seeking and reading patterns.” d-lib nov.-dec. : n. pag. web. dec. .
walker, janice r. “fanning the flames: tenure and promotion and other role-playing games.” kairos: a journal for teachers of writing in webbed environments . ( ): n. pag. web. mar. .
walker, joyce r. “hyper-activity: reading and writing in digital spaces.” kairos: a journal of rhetoric, technology, and pedagogy . ( ): n. pag. web. july .
welch, kathleen. electric rhetoric: classical rhetoric, oralism, and a new literacy. cambridge: mit p, . print.
westbrook, steve. “visual rhetoric in a culture of fear: impediments to multimedia production.” college english ( ): – . print.
what is the clergy of the church of england database project? king’s coll. london, . web. sept. .
wysocki, anne frances. “a bookling monument.” kairos: a journal of rhetoric, technology, and pedagogy . ( ): n. pag. web. june .
———. “impossibly distinct: on form/content and word/image in two pieces of computer-based interactive multimedia.” computers and composition . ( ): – . print.
———. “the multiple media of texts: how onscreen and paper texts incorporate words, images, and other media.” what writing does and how it does it: an introduction to analysis of text and textual practice. ed. charles bazerman and paul prior. mahwah: erlbaum, . – . print.

devising enabling spaces and affordances for personal knowledge management system design
volume , accepting editor raafat saadé │received: october , │ revised: march , │ accepted: may , .
cite as: schmitt, u. ( ).
devising enabling spaces and affordances for personal knowledge management system design. informing science: the international journal of an emerging transdiscipline, , - . retrieved from http://www.informingscience.org/publications/

(cc by-nc . ) this article is licensed to you under a creative commons attribution-noncommercial . international license. when you copy and redistribute this paper in full or in part, you need to provide proper attribution to it to ensure that others can later locate this work (and to ensure that others do not accuse you of plagiarism). you may (and we encourage you to) adapt, remix, transform, and build upon the material for any non-commercial purposes. this license does not permit you to use this material for commercial purposes.

devising enabling spaces and affordances for personal knowledge management system design

ulrich schmitt
university of stellenbosch, business school, bellville, south africa
schmitt@knowcations.org

abstract

aim/purpose: personal knowledge management (pkm) has been envisaged as a crucial tool for the growing creative class of knowledge workers, but adequate technological solutions have not been forthcoming.

background: based on former affordance-related publications (primarily concerned with communication, community-building, collaboration, and social knowledge sharing), the common and differing narratives in relation to pkm are investigated in order to suggest further pkm capabilities and affordances in need to be conferred.

methodology: the paper follows up on a series of the author’s pkm-related publications, firmly rooted in design science research (dsr) methods and aimed at creating an innovative pkm concept and prototype system.

contribution: the affordances presented offer pkm system users the means to retain and build upon knowledge acquired in order to sustain personal growth and facilitate productive collaborations between fellow learners and/or professional acquaintances.
findings: the results call for an extension of nonaka’s seci model and ‘ba’ concept and provide arguments for and evidence supporting the claims that the pkm concept and system is able to facilitate better knowledge traceability and km practices.

recommendations and impact on society: together with the prior publications, the paper points to current km shortcomings and presents a novel trans-disciplinary approach offering appealing opportunities for stakeholders engaged in the context of curation, education, research, development, business, and entrepreneurship. its potential to tackle opportunity divides has been addressed via a pkm for development (pkm d) framework.

future dsr activities: after completing the test phase of the prototype, its transformation into a viable pkm system and cloud-based server based on a rapid development platform and a nosql-database is estimated to take months.

keywords: personal knowledge management (pkm), design science research (dsr), informing science (is), knowledge worker, affordances, path dependency, fixations, digital ecosystems, memes, memex, knowcations

personal knowledge management as informing science

this article is the third publication to validate a novel personal knowledge management (pkm) concept and system. both of the previous papers - like this one - incorporate references to the relevant prior publications covering technical and methodological details and, thus, provide a kind of ‘long discussion case’ aiming to potentially assist it researchers and entrepreneurs engaged in similar projects.
in the first instalment (schmitt, d), the approaches at the heart of the pkm system are put under the is-macroscope by aligning them against some of the informing science’s key methodologies (cohen’s is-framework, leavitt’s diamond model, the is-meta approach, and gill’s and murphy’s three dimensions of design task complexity).

the second article (schmitt, e, j) emphasizes pkm’s status as a ‘wicked’ problem (ill-defined; incomplete, contradictory, changing requirements; complex interdependencies) where the information needed to understand the challenges depends upon one’s idea for solving them. accordingly, it presents a chain of meta-arguments elaborating on the central idea to the pkm concept and system (incorporating notions of complexity, popper’s three worlds, digital ecosystems (de), and a united nations scenario of knowledge mass production over time), before verifying the resulting development process and prototype system against accepted general design science research (dsr) guidelines. dsr aims at creating innovative it artefacts (that extend human and social capabilities and meet desired outcomes) and at following thorough design processes (as evidence of their relevance, utility, rigor, resonance, and publishability).

this follow-up paper turns its sight to the beneficiaries of the novel pkm system (pkms) and the affordances to be bestowed on them. it has been strongly guided by two publications, one addressing network communities with a focus on communication and community-building (mynatt, o’day, adler, & ito, ), and the other aiming to extend this view to collaboration and social knowledge sharing (cabitza, simone, & cornetta, ). however, the affordances elaborated on by these authors only partially cover the wider scope of the pkm concept and system which resulted in the repurposing, restructuring, and extension of the affordances frames.
• thus, the paper will first introduce the notion of affordances – originally introduced as an ecological concept – and point out its shared and differing narratives with personal km.
• then, the common ground in the context of communication, community-building, collaboration, and social knowledge sharing will be addressed.
• next, the further pkm capabilities will be accounted for in the form of additional affordances to be conferred. this new level not only provides the ground for qualifying the pkm concept and system as a disruptive rather than sustaining innovation (schmitt, g) with the potential of becoming a general-purpose-technology (schmitt, h), it also provides the overdue means to support knowledge workers as well as ambidexterity (schmitt, d).

the notion of affordances in a world of ecosystems

from gibson’s ecological point of view, “the affordances of the environment are what it offers the animal, what it provides or furnishes, either for good or ill.” accordingly, he exemplifies “what the environment affords animals, mentioning the terrain, shelters, water, fire, objects, tools, other animals, and human displays” (gibson, ).

subsequently, briscoe ( ) applied the ecological notion of environments or ecosystems also to information science. he introduced his conceptual framework of digital ecosystems (de) as a means for the cross pollination of ideas, concepts, and understanding between different classes of information environments. this general framework has proven very useful for defining the diverse km landscape based on a set of key attributes (figure ).
as a result, the km environment has been portrayed as layers of interdependent and interacting des that differentiate the individual minds (knowledge worker) and their social affiliations (society and institutions) and that distinguish between physical objects (technology) and their explicit or encapsulated information or knowledge they contain (extelligence) as well as their associated sets of ideas or memes they originated from (ideosphere) (schmitt, j). consequently, each de’s affordances – as in the ecological case - differ in what they offer to their inhabitants (agents), what they provide or furnish - either for good or ill.

in focusing on the personal km context, this paper follows up on the structuring exercise of the km-specific des (schmitt, j) and aims to stipulate what each of the six layers is able to afford to relevant beneficiaries in terms of objective effects (gains and detriments) as well as subjective perceptions (values and meanings in form of delighters or demotivators).

while all six layers compose the universal habitat, any set of affiliated agents experiences a particular blend of the six des due to each of their blend-specific attribute values; such a set has been termed ‘knowcations’ and resembles a niche in ecology. while a habitat refers to where an agent lives, a utilized or occupied niche refers more to how an agent lives and, thus, represents a - potentially unique - set of attribute values and related affordances (gibson, ). however, contrary to the ecological setting, the affordances provided in the km context are not just concerned with fulfilling basic survival needs, but aim to empower their agents to manage and further develop their inhabited and utilized ‘knowcation’ (exploitation and practice), to expand their ‘knowcation’ (learning and understanding), or establish new ones (exploration and innovation).
figure : attributes of the six ecosystems interacting with the pkm system

significance of affordances for knowledge societies

progressing civilizations are based on changes by humans in pursuit of affordances. due to zero-sum games of competing for limited resources, the benefit of one agent often takes place to the detriment of others, future generations, or the environment. knowledge, however, is not reduced when consumed and is not lessened when transferred; its view-as-a-resource differs significantly and is destined for non-zero-sum (win-win) interactions.

but, most of this knowledge is subjective and context-specific; its articulation – if possible – often entails generalization, and its encoded representation is “inevitably simplified and selective, for it fails to capture and preserve the tacit skills and judgment of individuals” comprehensively (lam, ). yet, the inventions of writing, printing, digitization, and the internet and cloud have provided us with the means to make our mental outputs explicit, accessible independent of time and space (schmitt, b), and consolidatable as a record of the world’s extelligence (stewart & cohen, ).

hence, the notion of ‘standing on the shoulders of giants’ implies that no scholarly publication stands alone and that scholarship usually “is an inherently social activity, involving a wide range of public and private interactions within a research community. publication, as the public report of research, is part of a continuous cycle of reading, writing, discussing, searching, investigating, presenting, submitting, and reviewing” (borgman, ).

consequently, overall performances and viabilities of institutions and societies rely on the accessible stocks of knowledge, experience, and creativity.
effectively utilized, these stocks convert into innumerable small personal ‘nano actions’, which combine with larger departmental actions that combine to create consolidated enterprise actions that result in the performance of whole institutions and knowledge economies. as the world becomes more sophisticated and integrated and as the work contexts change with increasing complications and complexity, two needs become ever more vital: to understand how people reason and to understand how to ascertain that they are provided with opportunities, supportive attitudes, and adequate resources to do their best (wiig, ). these adequate resources include affordance-conferring technologies as alluded to by mynatt et al. ( ) and cabitza et al. ( ). the novel pkm concept and system offer such a technology and provide overdue support for knowledge workers with the aim of:

• managing/growing the intellectual, social, and emotional capitals of individuals,
• supporting their creative authorship throughout their academic and professional careers anywhere and as contributors and beneficiaries of institutional and societal performance, educational services, and the world’s collective extelligence,
• fostering creative conversations among teams, organizations, and communities for mutual benefit and competitive advantage via network and cloud technologies.

affordances prioritizing communication and collaboration

by applying the ‘standing on the shoulders of giants’ notion, it has proven fruitful to revisit a conceptual framework for ‘network communities’ from the pre-facebook-twitter-google-era (mynatt et al., ), although its criteria (persistence, periodicity, boundaries, engagement, and authoring) needed to be re-contextualized to fit the current time and expanded to accommodate the specifics of the pkms concept.
the former task has already been partially accomplished in a paper advocating to go beyond usability and sociability considerations and to devise concepts for the ‘next community-oriented technologies’ (cabitza et al., ) by shifting the focus from social networking towards conviviality (i.e., pleasing, gratifying, edifying, self-fulfilling, self-expressive experience) and convivial artefacts (defined as any technology aimed at promoting sociality, cooperativity, self-expression and autonomous and creative intercourses among individuals in order to foster collective deliberation, collective planning, and cooperative action).

schmitt

this notion of conviviality has also been an inspiration in developing the pkms concept and prompted the incorporation of the four criteria for ‘capable and convivial design’ (johri & pal, ) into a twelve-criteria pkm for development (pkm d) framework to be applied in personal settings (schmitt, k) as well as development interventions (schmitt, h) in the interdisciplinary and intercultural context. the twelve criteria are closely aligned to the six ecosystems, which also offer a fitting structure for the affordances prioritizing communication, community-building, collaboration, and social knowledge sharing suggested by mynatt and cabitza.

constraints and limitations within the technologies ecosystem

the evolutionary progress of the ‘technologies ecosystem’ is based on a co-evolution of physical and social (including service) technologies directed by business plans (beinhocker, ). novel technological systems and their components are selected based on their utility and fitness, resulting in sustaining innovations (incremental improvements), disruptive innovations (substitutions), or failing products and ideas (schmitt, j). maturing network communities, hence, experience different kinds of constraints or limitations over time in both their physical and virtual spaces of interaction, either self-imposed or caused externally.
on the one hand, these constraints are “providing a base for the mutual production of expectations about social life within the community” (mynatt et al., , p. ); on the other hand, “they require the community to be dynamic, resilient and reactive to unpredictable events” (cabitza et al., ). the technological infrastructure, thus, “has to afford suitable means to support this combination of contrasting conditions” and has to “provide the community members with the awareness of the current constraints, their ‘strength’ and related ‘slack’, and to support their activities in accordance and compliance with those; moreover, it has to equally sustain the reflective behavior of the community members that leads to the adaptation, appropriation and continuous redefinition of those constrains with respect to the changes of the contextual conditions” (cabitza et al., ).

the related intervention in the pkm d framework is termed ‘scaping’, referring to modifying an environment for empowering the capable human resources it accommodates in order to improve ‘accessibility easiness’ and ‘operable autonomy’ for individuals. it ranges from meeting basic necessities (e.g., affording internet access to combat digital divides or access to information and knowledge via effective and affordable artefacts) to supporting individual sovereignty by employing grass-roots, bottom-up, affordable, personal applications (e.g., by affording alternatives to prohibitive approaches and discouraging services of dominant market players) (schmitt, h).

cabitza et al. ( ) stress the point that for today’s “most popular technologies supporting a community, their development is left in the hand of few big players while the research community is just observing and reporting on their usage in different contexts” and how this state is “stifling the development of real alternatives and the quest for disruptive innovation”.
persistence within the extelligence ecosystem

the ‘extelligence ecosystem’ focusses on developing and making effective use of the available world’s explicit knowledge and information. since successful network communities thrive on “a growing mutual acquaintance and on an increasing set of conventions that shape the mutual interactions of the community members”, they have to employ state-of-the-art know-how and afford members “durable, although evolving, structures” (cabitza et al., ) of participants, participation, interactions, and content.

the pkm d intervention is termed ‘sight setting’, referring to the desire to empower citizens by making them highly knowledgeable in order to function competently and effectively in their daily lives, as part of the workforce, and as public citizens (wiig, ). it emphasizes the applicability and productive use of the accessible technologies and extelligence to support the ‘expressive creativity’ and ‘collaborative choice’ of individuals. it ranges from assisting people with their learning and reflection, over developing and articulating their own ideas based on their individual knowledge, background, and situation, towards guiding the self-determination of their lives and careers and their self-choosing of personal and professional acquaintances (schmitt, h).

cabitza et al. ( ) point out that the current and popular social networking sites “do not offer this affordance” because the means to facilitate socialization among their members “are primarily based on a quantitative and merely communication oriented notion of it.
from the technological point of view, their persistence is on the one hand guaranteed by the provider once a critical mass of members has been reached; on the other hand, just for this reason, the technological persistence is fully outside the community space of control.”

engagement within the social ecosystem

the ‘social ecosystem’ hosts individual persons’ minds interacting with other minds (one’s acquaintances and contacts) through their bodies and senses resulting in personal subjective tacit knowledge. due to communication and the sharing of practices, any collective of individuals (e.g., family, friends, societies) is also likely to establish distinct cultures in this ecosystem which are based on nature (kinship, environment) or nurture (e.g., education) (schmitt, h).

the kind of interactions a network community can afford to its members through the channels provided determines its level of mutual engagement (e.g., by sharing, consenting, endorsing, committing, or collaborating). the degree of individual engagement depends either on the member’s choice of participating in the full range of opportunities available or might be restricted according to the services offered to particular roles or forms of membership (cabitza et al., ).

the pkm d intervention is termed ‘socializing’ with the aim of strengthening the personal autonomy and competencies of individuals further by engaging with relevant communities in pursuit of ‘relational interactivity’ and ‘creative conversations’. creating and maintaining social ties represent “an investment in the accumulation of social resources or social capital” (katz, lazer, arrow, & contractor, ) defined as the “sum of the resources, actual or virtual, that accrue to an individual or group by virtue of possessing a durable network of more or less institutionalized relationships of mutual acquaintance and recognition” (bourdieu & wacquant, ).
since finding and keeping these regenerative relationships will be a key competence, an individual’s social capital has to be crafted and nurtured in conscious ways (gratton, ). accordingly, the affordances range from maintaining and classifying contacts and their talents, over making use of this information by the actual facilitation of conversations or collaborations, towards expediting wider intercultural and interdisciplinary discourses (schmitt, g, h) as well as experience management (schmitt, a).

in light of these crucial needs, cabitza et al. ( ) criticize current social media tools and providers in regard to exclusion (of people without access or account), design control (features imposed on members), content ownership (exploitation of information voluntarily shared by members), and collaborative support (restrictive communication/cooperation-oriented functionalities).

authoring within the knowledge worker ecosystem

the ‘knowledge worker ecosystem’ signifies the narrowing of the general ‘social ecosystem’, providing a space for individual knowledge workers as constituents of collective mind sets (e.g., teams, guilds, or professions) engaging in private and professional practices or labor markets. motivated by earnings, reputations, or career prospects, developing one’s attitudes, competences, expertise, and communication skills is key for advancing into desired public or work positions regulated by qualification frameworks and shaped by professional cultures (schmitt, h).

hence, network communities ought to allow their members not only to use but also to manipulate their space, whether as designers or users. this applies to the interactions produced, but even more so to the social, virtual and physical ecology as being available to participants for continuous authoring and re-authoring in the process of living in and developing the community (mynatt et al., , pp. - ).
the pkm d intervention is termed ‘striving’ and subscribes to a definition of ‘knowledge worker’ which is not restricted to the narrowly defined socio-economic categories of the developed world (as in, for example, florida’s creative class ( )) but follows gurteen ( ), who places the virtue of responsibility, rather than an individual’s type of work, at the center of his reflections: “knowledge workers are those people who have taken responsibility for their work lives. they continually strive to understand the world about them and modify their work practices and behaviors to better meet their personal and organizational objectives”. to gurteen’s mind, these self-motivated “knowledge workers see the benefits of working differently for themselves. they are not ‘wage slaves’ - they take responsibility for their work and drive improvement”.

the associated pkm d criteria are ‘ecological reciprocity’, highlighting people’s desires “to give back to their environment and not just take resources from it, [a vital pre-requisite for a] participative culture and working in a collective milieu” (johri & pal, ), as well as ‘personal mastery’, referring to people’s intellectual capital in need of being nurtured by building depth and by putting in the time and resources to create a body of knowledge and skills, not only in one single but in multiple areas (gratton, ; schmitt, h).

while all (or most of the) content in existing social networking sites is user-generated, cabitza et al. ( ) bemoan that the means (and the associated limitations) to produce, share, and consume it, “are imposed from above and subject to change with no notice or consultation (cf., the introduction of the timeline in facebook).
this contributes to undermining the feeling of being in common virtual place; it corroborates the idea of being guests of some host that houses you (probably just to observe you, or take some opportunity to sell you something); and above all, it totally stifles the community affordance of authoring.”

reputation and trust within the institutions ecosystem

the ‘institutions ecosystem’ is an extension of the knowledge worker ecosystem providing a space for professionals and their stakeholders to form institutions (defined as “snapshots of a sub-set of the ideational field that persevere while the network itself continues to fluctuate” (kanengisser, )) with organizational intelligence and memories operating in particular economic and industrial sectors. the driving forces are competitiveness and/or collaboration based on capabilities to successfully exploit and further explore and advance institutional portfolios of interests and expertise leading to profitability or reputation and trust (schmitt, h).

trust (defined as a “bet about the future contingent actions of others” (sztompka, )) cannot be afforded directly but has to be earned by acquiring a reputation of, for example, expertise, professionalism, reliability, or high-quality services/content supplied. while social network communities rely on simple reputational metrics based on ‘likes’, clicks, reads, downloads, or interactions, communities engaged in academic scholarship have established an academic-paper-based citation system that cultivates a sophisticated reputation economy by both crediting the original discoverer and providing a link in a chain of evidence (nielsen, ). depending on the status of the publisher, a citation contributes to varying degrees to citation indices or impact factors, which are also accessible online (e.g., google scholar, researchgate, or web of science).
the remaining pkm d interventions are all related to self-transcendence and seek to further causes beyond individuals’ self which may also “involve service to others or a devotion to an ideal (e.g., truth, art) or a cause (e.g., social justice, environmentalism, pursuit of science, religious faith)” (koltko-rivera, ). thus, the pkm d intervention of the ‘institutions ecosystem’ is termed ‘systemizing’ and refers to deliberate actions of converting individual into institutional or societal performances. the first criterion, ‘institutional performance’, emphasizes the pkm concept’s aim of strengthening individual sovereignty and personal utility not at the expense of organizational knowledge management systems, but rather to foster a fruitful co-evolution for mutual benefit. the second criterion, ‘innovative capabilities’, acknowledges the need of individuals to acquire a thorough understanding of how value can be added to intangible services as well as knowledge assets (defined “as nonphysical claims to future value or benefits” (dalkir, )).

to promote trust and reputation by taking advantage of today’s online realities, nielsen ( ) urges removing barriers that prevent potential contributors in any part of the world from engaging in a wider sharing and faster diffusion of their ideas, sources, data, work-in-progress, pre-prints, and/or code for the benefit of more rapid iterative improvement: “if scientists are to take seriously contributions outside the old paper-based forms, then we should extend the citation system. […] all that’s needed for open science to succeed is for the sharing of scientific knowledge in new media to carry the same kind of cachet that papers do today. at that point the reputational reward of sharing knowledge in new ways will exceed the benefits of keeping that knowledge hidden”.
boundaries within the ideosphere ecosystem

the ‘ideosphere ecosystem’ (defined as an invisible but intelligible, metaphysical sphere of ideas and ideation where we engage in the creation of our world (sandberg, )) represents the entire accumulated explicit human know-how and experience. in popper’s three worlds view, this ecosystem resembles his world 3, which embodies the thought content made explicit in the form of abstract objective knowledge objects, while world 1 comprises the concrete objects and their relationships and effects in the real physical world (comprising the technologies and extelligence ecosystems presented), and world 2 refers to the results of the mental human thought processes in the form of subjective personal knowledge objects (comprising the society, knowledge worker, and institutions ecosystems alluded to) (popper, , ; schmitt, j).

a reputable network community affords transparent boundaries in respect to its internal and external stakeholders, system elements, and features together with their respective potential or permitted interactions, bearing in mind that “the space in which a network community lives is made up of both a physical and a virtual component: these two components are at the same time distinct and highly interconnected, as one cannot exist without the other” (cabitza et al., ).

the pkm d intervention is, consequently, termed ‘scaling’, referring to an ability that goes beyond its usual financial setting of maintaining or improving profit margins with increasing turnover, as evidenced by the two associated self-transcendence-supporting criteria ‘encouraging empowerment’ and ‘technological progress’.
while the former involves helping others to achieve self-actualization, taking avoiding action against ‘overlooked potentials’, and acknowledging responsible leadership and integrity in the process, the latter’s focus incorporates removing barriers, reducing complexities, and providing adequate tools to set the stage for an enabling environment and for stimulating the logics and logistics of new knowledge formation (schmitt, h).

cabitza et al. ( ) caution that current infrastructures “are not totally adequate to support the interplay between on-line and off-line activities when they mainly afford communication-oriented and information sharing functionalities” while affordances to collaborate across interaction spaces suffer from providers enforcing inflexible exit, entry, and data export barriers at the expense of their captured audiences’ attention, time, productivity, funds, and status (schmitt, g).

affordances prioritizing personal knowledge management

as evidenced by the testimonials (last paragraphs of last six subsections), severe deficiencies are hampering communities’ experiences and the respective affordances in each of the ecosystems. they impede the capacities of communication, community-building, collaboration, and social knowledge sharing. from a more comprehensive pkm perspective the current state of affairs is even poorer. figure shows the further key affordances to be introduced as conferred by the pkm concept and system (top section) aligned to a visual summary of the discussion so far.

figure : affordances and fixations affecting a pioneering pkms path development

these key affordances are presented under the concept of technological path constitution or development which incorporates notions of path dependence (emergence, persistence, and dissolution) and path creation (composition, extension, and abolishment).
whereas the former describes a situation based on chance or smaller unintended events, the latter calls for deliberate actions. the specific sequence of a path’s phases (shown in brackets following a generic timeline of generation, continuation, termination), hence, depends on intentional interventions (meyer, ).

accordingly, barriers and neglected affordances - as indicated - can be traced to the current market players’ (deliberate) emphasis on capturing their audiences through inflexible although inferior services. their reliance on top-down, heavyweight, prohibitive, centralized developments and institutional approaches might also be deliberate, but could also be owed to path dependence and technological lock-ins due to unconcerned providers feeling secure in comfort and/or ignorance.

putting path dependence in the wider context of design theory and creativity, le masson, hatchuel, and weil ( ) label these limited perceptual capacities to sense and adequately respond to changing environments ‘fixation effects’. their examples cite grounds owed to undue preoccupations with existing, already designed objects, with existing design rules and machine elements, and with non-relevant reuse of existing knowledge. to overcome these fixations, the first step in a repeating cycle is recognizing and acknowledging them, followed by modifying or reinventing their underlying design theories and models with subsequent diffusion, resulting in “enabling new types of innovation output” (pp. - , ), just like the pkms concept.
technologies ecosystem tied to market barriers and lack of tools

the question why a pkm-like system has not emerged earlier (although what bush ( ) had already imagined seven decades ago as the ‘memex’ can be regarded as its as-close-as-it-gets ancestor) has been linked to seven market barriers (schmitt, e), which led to the formulation of six pkm provisions based on affordances currently not catered for (schmitt, i): (1) digital personal and personalized knowledge always stays in the possession and at the personal disposal of its owner or eligible co-worker; (2) it is based on standardized, consistent, transparent, flexible, secure, and non-redundant formats and (3) is independent of changes in one’s social, educational, professional, or technological environment; (4) a ‘world heritage of memes repository (whomer)’ unlocks collaboration capabilities between the decentralized autonomous pkms capacities, (5) which have to be mutually beneficial to facilitate consolidated team or institutional actions; (6) the whole pkm approach needs to be supported by sound educational interventions.

as a result of these neglected affordances, we do have “many powerful applications for locating vast amounts of digital information, [but] we lack effective tools for selecting, structuring, personalizing, and making sense of the digital resources available to us” (kahle, ). the pkm concept follows bush’s ( ) vision of the ‘memex’ and its prototype system is projected to be transformed into viable pkms device applications supported by a cloud-based whomer server based on a rapid development platform and a nosql-database.

extelligence ecosystem hampered by siloes and book-age paradigm

although progress only recently triggered the change from information scarcity to a never before experienced ever-increasing information abundance, the need for managing the scarce personal attention of those receiving it was already stressed by simon ( ) over four decades ago.
contrary to this essential need, silos have been created based on proprietary digital formats or incompatible semantic ontologies (levy, , p. ) and digital repositories have been fortified by ‘walled garden’ apps and platforms; these developments counteract an open and connective web and have prompted pleas for a ‘new era of networked science’ (nielsen, ). moreover, “the over-simplistic modelling of digital documents as monolithic blocks of linear content, with a lack of structural semantics, does not pay attention to some of the superior features that digital media offers in comparison to traditional paper documents” (signer, ). the continuing fixation on the outdated book-age paradigm still compels us, as noted by mintzberg ( ), to provide linear accounts of a nonlinear world.

the pkm concept follows simon’s advice ( ) that producing and transmitting more and more information should not be our sole concern but that we also must know how much it costs, in terms of scarce attention, to receive it: “in a knowledge-rich world, progress does not lie in the direction of reading information faster, writing it faster, and storing more of it. progress lies in the direction of extracting and exploiting the patterns of the world – its redundancy – so that far less information needs to be read, written, or stored”. this pkms focusses attention by using structural references to re-usable basic information units (ideas or memes just like this paragraph) instead of documents, to be further described in the ‘ideosphere ecosystem’ section below.
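the idea of referencing re-usable information units instead of copying them into documents can be illustrated with a minimal sketch. it is not the pkms's actual data model: the names `Meme`, `Repository`, and `Composition` are hypothetical and chosen for illustration only. the point is merely that each unit is stored once and that 'documents' hold ordered references to it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Meme:
    """A re-usable basic information unit (hypothetical model)."""
    meme_id: str
    text: str


@dataclass
class Repository:
    """Stores each meme exactly once, keyed by its id."""
    memes: dict = field(default_factory=dict)

    def add(self, meme: Meme) -> None:
        self.memes[meme.meme_id] = meme

    def resolve(self, meme_id: str) -> Meme:
        return self.memes[meme_id]


@dataclass
class Composition:
    """A 'document' that holds ordered references, not copies of content."""
    title: str
    refs: list = field(default_factory=list)

    def render(self, repo: Repository) -> str:
        # Content is looked up at render time, so an updated meme
        # is reflected in every composition that references it.
        return "\n".join(repo.resolve(ref).text for ref in self.refs)


repo = Repository()
repo.add(Meme("m1", "attention is scarce in an information-rich world."))
repo.add(Meme("m2", "silos arise from proprietary formats."))

paper = Composition("paper", ["m1", "m2"])
lecture = Composition("lecture", ["m1"])  # re-uses m1 without copying it

stored_units = len(repo.memes)
referenced_units = len(paper.refs) + len(lecture.refs)
```

in this sketch, three references are served by two stored units; with conventional documents, the shared paragraph would exist as a separate copy in each file.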
social ecosystem bound by analysis and industrial age paradigm

as argued in prior papers (schmitt, g, j), three major fixations are adding to the sorry state of supporting knowledge workers in their personal and inter/transdisciplinary capacities:

• education is still modeled after ford’s assembly line and taylor’s scientific management, preparing students in disciplinary siloes for the linear, definite, specialized and predictable career paths of the past century (davidson, ) with the exception of liberal arts or interdisciplinary programs.
• the myths of newton’s clock-work universe rather than systems thinking and design science are still dominating educational content and academic teachings. while ‘design/synthesis’ and ‘evaluation’ top bloom’s revised taxonomy (aect, ), research methodologies, projects, and supervisors are dominated by or preoccupied with ‘analysis’.
• management concepts and models “emanating from the academic discourse fall well short of organizational reality” and lack ‘theory effectiveness’, expecting designs to be purposeful – both in terms of utility (a matter of content) and communication (a question of presentation) to an audience (o’raghallaigh, sammon, & murphy, a, b).

as levy ( ) emphasizes the need for a personal discipline for collection, filtering and creative connection (among data, among people, and between people and data flows) and points to the sustainable growth of autonomous personal km capacities as the most important function of future education, the pkm system’s innovative features and educational philosophies are about to be aligned to an established learning management system. both approaches are seeking to focus our precious attention by substituting redundant information objects with digitally embedded structural references, benefiting creative authorship and novel learning and collaboration experiences (schmitt & saadé, ).
knowledge worker ecosystem seeking autonomy and development

due to the neglected affordances alluded to, “we still take copies and store them in diverse arrays of devices or make mental notes only. over time, copies deteriorate, memories fade and with it the ability to recall the locations and contents of our fragmented personal knowledge inventories and archives. nevertheless, we are unable to part with our accumulated hard and soft copies which slowly but steadily lapse from potential value towards dead ballast.” we also “long for better support for identifying and filling knowledge gaps, detecting and correcting flaws, and deciding on suitable means for evaluating and advancing our repositories including the recording of related to-dos, progress, processes, and feedback” (schmitt, ). a brief ‘pkm needs survey’ exemplifies – as a poster based on eleven flickr images – the challenges knowledge workers are facing (schmitt, n).

figure : pkm system ices cycle versus organizational seci cycle

to address these tasks, the pkms’s personal focus affords personal autonomy in handling, mobilizing, and sharing one’s knowledge. the upper half of figure depicts the iterative cycle of this process. it starts with ( . originating/socializing) information gathering via field and desk investigations, continues with ( . exercising/internalizing) selecting the relevant findings to capture them in the pkms repository, followed by ( . systemizing/combining) utilizing their content and relations by connecting them to related content already present via classification and/or authorship. ( . dialoguing/externalizing) any content can then be voluntarily shared by an individual user with the pkms community, so that ( . originating/socializing) any individual eligible member can potentially engage with it.
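the iterative cycle just described can be sketched as a small state machine. this is only an illustration under stated assumptions: the names `Phase`, `ICES_ORDER`, and `next_phase` are hypothetical, and the ordering simply follows the sequence given in the text (socializing, internalizing, combining, externalizing, and back to socializing).

```python
from enum import Enum


class Phase(Enum):
    """Phases of the cycle, labelled with the paired terms used in the text."""
    INTERNALIZING = "exercising/internalizing"
    COMBINING = "systemizing/combining"
    EXTERNALIZING = "dialoguing/externalizing"
    SOCIALIZING = "originating/socializing"


# Order in which the phases follow one another (assumed from the description).
ICES_ORDER = [
    Phase.INTERNALIZING,
    Phase.COMBINING,
    Phase.EXTERNALIZING,
    Phase.SOCIALIZING,
]


def next_phase(current: Phase) -> Phase:
    """Return the phase that follows `current`, wrapping around at the end."""
    index = ICES_ORDER.index(current)
    return ICES_ORDER[(index + 1) % len(ICES_ORDER)]


# Walk one full iteration, starting with information gathering.
phase = Phase.SOCIALIZING
visited = []
for _ in range(len(ICES_ORDER)):
    phase = next_phase(phase)
    visited.append(phase)
```

after one full iteration, the walk ends where it began, which mirrors the way shared content becomes the starting point for another member's information gathering.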
the terms in brackets ( - ) act as a legend of the cycle presented and fully correspond to the theory of organizational dynamic knowledge creation (nonaka & takeuchi, ) and its further extension known as the concept of ‘ba’ or spaces (nonaka, toyama, & konno, ). its seci model (depicted in the lower half of figure ) promotes individual and collective real-world learning processes in the anti-clockwise manner depicted. although, as the above cycle description and figure show, the pkms cycle - contrary to the seci-cycle - operates in a clock-wise ices fashion, it accommodates a very close co-evolution of the two cycles for the benefit of individuals, community members, and institutions as argued in a recent paper (schmitt, g).

in line with its ambition to tackle opportunity divides, the pkms’s affordances portrayed are meant to be conferred independently of their users’ space (e.g., developed/developing countries), time (e.g., study or career phase), discipline (e.g., natural or social science), or role (e.g., student, professional, or leader), and their focus on creative conversations is based on the “emergence of distributed processes of collective intelligence, which in turn feed them” (levy, ).

institutions ecosystem in need of ambidexterity and innovativeness

the synergies between the pkms concept and organizational km systems have been emphasized in the previous section and figure and have been further detailed in previous articles with regard to km system generations (schmitt, f), integration into earl’s seven km schools as well as ambidextrous organizations (schmitt, d), and disruptive innovations (schmitt, g).

to utilize these synergies, “the aim has to be to collaboratively interlink and collectively harvest prior accumulated knowledge subsets provided the pkms user also benefits.” in effect, pkm devices accommodate a departure from top-down, centralized, institutional km systems towards a more inclusive bottom-up approach.
as a result, the pkm concept is able to “underpin a growing dynamic capability for increasing the capacity of an organization to purposefully create, extend, or modify its resource base - including tacit (attitude and leadership), explicit (knowledge bases, rules and strategies), and encapsulated knowledge (products and services) as well as its wider ecosystem (involvement with the community) - not at the expense of disinterested employees but as a means to motivate them and serve their self-interests”, bearing in mind that the lack of acceptance of and engagement in organizational km has been a prime reason for the failure of many km projects (schmitt, d).

the future of work and knowledge societies is said to be based on the notion that the knowledge and skills of a knowledge worker are portable and mobile (rosenstein, ). accordingly, the pkm affordance presented would finally enable individuals - moving from one project or responsibility to another - to take their personal version of a knowledge management system (able to be continually maintained and updated) with them wherever they choose to go and engage. to take a further step, a recent paper has also looked at entrepreneurship and shows how pkm systems can assist in navigating the barriers of the stage-growth business models (schmitt, k).

ideosphere ecosystem fostering traceability and transdisciplinarity

theory creation and validation constitute important objectives of research to foster understanding in the search for truth and forethought. conceptual schemes provide an alternative, but instead of representing truth they are foremost “evaluated based upon their usefulness to a client”. to be useful, such a scheme needs to be interesting (meaning it conveys something novel to the client), simple enough (to be communicated effectively), and aware of its own limitations (gill, ).
while the notion of the ‘meme’ has been enthusiastically picked up by internet users, it is a highly controversial issue in social sciences and humanities. major criticisms raised include, for example, its ambiguous definition, its difficulties with quantification and measurement, its nature-culture-analogy’s questioned ability of describing complex human behaviors, its dominance of memetic control over human agency, and its doubted value-adding qualities in regard to already existing tools or insights (shifman, ). as applied in the pkms context, it closely resembles its original context (by offering a conceptual scheme seeking for usefulness rather than truth) and significantly differs from the ‘internet meme’ commonly applied “to describe the propagation of content items such as jokes, rumors, videos, or websites from one person to others via the internet” (shifman, ).

storytelling is regarded as a crucial tool for management and leadership, but the ‘meme’ meme not only offers an interesting story; it also considerably simplifies the task of rationalizing the substitution of traditional document-centricity with digitally embedded structural references.

first, memes - originally described as units of cultural transmission or imitation (dawkins, ) - evolve over time through a darwinian process of variation, selection and transmission. in order to ‘survive’, memes have to be able to endure in a medium they occupy and the medium itself has to persevere.
they can either be encoded in durable vectors spreading almost unchanged for millennia, or they succeed in competing for a human host’s limited attention span to be memorized (internalization*) until they are forgotten, codified (externalization*) in further objects or spread by the spoken word to other human hosts’ brains (socialization*) with the potential to mutate into new variants or form symbiotic relationships (combination*) with other memes (memeplexes) to mutually support each other’s fitness and to replicate together (schmitt, a). [the *-marked terms in brackets refer to both of the pkms-ices and seci cycles discussed, as well as to figure , and, thus, ease understanding by ensuring a close alignability of memes’ behavior with the processes of the two co-evolving concepts.]

second, memetics views memes as ‘living’ organisms, capable of reproduction and evolution. as a conceptual consequence, pkm’s ideosphere ecosystem represents the habitat of all memes (or ‘business genes’ as re-labeled by koch ( ) to better fit the commercial context). able to self-replicate by utilizing the human mental storage, memes influence their hosts’ behavior to promote further replication (bjarneskans, grønnevik, & sandberg, ) and, thus, represent basic (cognitive) information-structures. from a meme’s-eye view, every human mind is a machine for making more memes, a vehicle for propagation, an opportunity for replication and a resource to compete for (blackmore, ). so, if memes and their inbuilt ideas are able to flourish in a virtual ‘ideosphere’ as their habitat of operation, pkm systems aiming at supporting individual capacity and repertoire for innovation, sharing, and collaboration are well advised to utilize the very same space and resources and to form a digital counterpart of this ‘ideosphere’.
moreover, since the ideosphere can be visualized using a three-dimensional information-space model (boisot, ), the utilities of memes and memeplexes can again be explicitly displayed, this time in form of their amalgamated states as the pkms user’s knowledge assets and his/her capital (intellectual, social, and emotional) together with the steps, regimes, and km models employed to process them (schmitt, h, c).

third, a meme, of course, exists only virtually and has no intentions of its own; it is merely an information piece in a feedback loop with its longevity being determined by its environment (collis, ). in the pkm context, it represents a distinct basic building block of knowledge in the eyes of the beholder, to be ideally captured and referred to in a quasi-atomic state, perfectly understandable alone by itself, but, being able to be used at any later time - in combination with other meme building blocks stored - without piggybacking irrelevant or potentially redundant information. the pkms’s logics and logistics, thus, afford the recalling, sequencing, and combining of already stored memes with one’s own new meme creations for integration in any type of authoring and sharing activity one would like to pursue. the further decomposition of a basic textual, visual, audio, or video meme in its constituent elements (e.g., words, sounds, sentences) as described by du plessis ( ) is not required; what matters is how memes are able to morph into increasingly complex memeplexes or knowledge assets (e.g., articles, presentations, or scripts). this process has been exemplified (schmitt, d) and further clarified by a concept of dynamic meme reuse classes and attribute modifications (schmitt, d based on mitchell & mitchell, ) which accounts for just eight ways to change a meme (any combination of reusing or modifying context/symbols and/or content/meaning and/or container/application).
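
the count of eight can be read as follows: each of the three meme attributes is either reused or modified, which gives 2 × 2 × 2 = 8 combinations. the python sketch below merely enumerates these combinations under that reading. the attribute labels are taken from the text; the function name `reuse_classes` and the action labels `reuse` and `modify` are illustrative assumptions and are not part of the cited typology or of the pkms itself.

```python
from itertools import product

# attribute labels as named in the text; the action labels are illustrative assumptions
ATTRIBUTES = ("context/symbols", "content/meaning", "container/application")
ACTIONS = ("reuse", "modify")


def reuse_classes():
    """return every assignment of one action to each of the three attributes."""
    return [dict(zip(ATTRIBUTES, combo))
            for combo in product(ACTIONS, repeat=len(ATTRIBUTES))]


if __name__ == "__main__":
    # print one line per combination, numbered in enumeration order
    for number, assignment in enumerate(reuse_classes(), start=1):
        summary = ", ".join(f"{action} {attribute}"
                            for attribute, action in assignment.items())
        print(f"class {number}: {summary}")
```

the numbering in this sketch is arbitrary and does not correspond to any class numbering used in the cited papers; the first combination (all three attributes reused) corresponds to using a meme unchanged.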
fourth, while the notion of the six digital ecosystems embedded in popper’s three worlds provides a pkms meta-level perspective, the conceptual meme scheme affords a transparent grass-roots level foundation. all processes and methodologies incorporated are placed between these two antipodes of the pkms scale, among them the extended ignorance matrix, pkms value chain, pkm d framework, and design task complexity cube (schmitt, d). the focus on memes and their digitally embedded structural references represents the most radical departure from the current document-centric km systems and affords invigorating digital scholarship, individual and institutional curation, and the traceability of knowledge. the latter forms the back-bone of modern manufacturing and stands for the ability to trace the history, application or location of an entity across a whole value chain by creating an as-built genealogy. its significance for the pkms concept has been further detailed in two prior articles (schmitt, e, i).

fifth, the web created by these traceable memes and their relationships directly supports the educational objectives of the pkms concept. with all pkm publications captured in their meme-based representations in the pkms repository, their structural references enable their straightforward repurposing for the educational agenda (in form of e-books, online tutorials, and e-learning course units). the quest for a pkms solution has pondered on many methodologies advocated by scholars and practitioners.
fortuitously, what might have appeared initially as difficult to reconcile or at odds (e.g., km’s objectives, philosophies, and methods) has resulted in the integration of a few hundred km tools and ideas which establishes the baseline for a transparent and coherent educational km concept and km curriculum, including the rationale of how and why some of the original methods had to be adjusted, extended, re-purposed, or merged, an undertaking further elaborated on in a prior paper (schmitt, f).

lastly, only memes are captured in the pkms repository. however, this does not limit its functionality but enriches it, because – in line with memetics – everything is a meme, including the description of people, groups, communities, and organizations together with their geographic, industrial, service, or research field related classifications, including the references to books, periodicals, events, scripts, databases, standards, testimonials, or artefacts together with their topical references and content, including the intentions, forethoughts, and evaluations representing the user’s emotional capital, and including the captured interdependencies between all these entities as permitted by the system. the pkms knowledge bases afford mirroring the virtual ideosphere and the means for creative connections (among data, among people, and between people and data flows) independent of disciplines, and, thus, offer an overdue tool for knowledge workers and interdisciplinary tasks.

conclusions and the way ahead

through the affordances presented, the pkms community members obtain the means to retain and build upon knowledge acquired in order to sustain personal growth and facilitate productive collaborations between fellow learners and/or professional acquaintances.
any meme captured is able to further evolve during learning processes and to form symbiotic relationships with known or newly imagined memes during phases of sensemaking or authorship. as distin ( ) points out, “in recombination, existing memes are appropriately recombined in new situations, creating new ways of thought and novel effects, perhaps as the result of previously recessive memes’ ‘effects’ being revealed in the reshuffle”.

in line with the definition by cabitza et al. ( ) that affordances “point to the offering or provision of either resources or opportunities to someone who recognizes them and is able to exploit them to become capable of performing some action or get some value or benefit”, the investigation into the pkm concept’s and system’s capabilities has brought to light not only novel affordances but also pointed out current limitations due to path dependencies and fixations.

a further case was made for utilizing the notion of memes as a useful metaphor in support of the pkms concept together with its educational ambitions. dawkins ( ) points out three qualities of a meme to enhance its fitness in order to maintain a continued presence in future generations: fecundity, longevity, and copying fidelity. all three features are profiting extensively from the secure, convenient, and standardized storage in the pkms’s knowledge bases and from the creative conversations between networked autonomous pkms devices – and so will the world extelligence and its wider-spread accessibility and usability.

further publications and posters are also under review or planned addressing a pkms sustainability vision, demonstrations and tutorials/workshops, a comparison of how the pkms trail-network compares to traditional hyperlink configurations based on the set of pkms publications, and how the pkms concept compares to, can make use of and add to semantic web technologies.
after completing the test phase of the prototype, its transformation into a viable pkms device application and a cloud-based whomer server based on a rapid development platform and a nosql-database is estimated to take months.

references

the sequence of alphabetical letters used to differentiate the author’s multiple publications in any year includes some gaps since some papers/articles have not been referenced. the letter designations, however, are used consistently for referencing across all publications to avoid confusing readers and, hence, have also not been revised in this article.

aect ( ). bloom’s taxonomy. association for educational communications and technology. retrieved march , , from http://epltt.coe.uga.edu/index.php?title=bloom% s_taxonomy
beinhocker, e. d. ( ). the origin of wealth. harvard business press.
bjarneskans, h., grønnevik, b., & sandberg, a. ( ). the lifecycle of memes. retrieved march , , from http://www.aleph.se/trans/cultural/memetics/memecycle.html
blackmore, s. j. ( ). the power of memes. scientific american, ( ), - .
boisot, m. ( ). exploring the information space: a strategic perspective on information systems. working paper series wp - . university of pennsylvania.
bourdieu, p., & wacquant, l. j. d. ( ). an invitation to reflexive sociology. chicago: university of chicago press.
borgman, c. l. ( ). scholarship in the digital age. mit press.
briscoe, g. ( ). complex adaptive digital ecosystems. proceedings of the international conference on management of emergent digital ecosystems, pp. - .
bush, v. ( ). as we may think. the atlantic monthly, ( ), - .
cabitza, f., simone, c., & cornetta, d. ( ). sensitizing concepts for the next community-oriented technologies: shifting focus from social networking to convivial artifacts. the journal of community informatics, ( ).
retrieved march , , from http://ci-journal.net/index.php/ciej/article/view/ /
collis, j. ( ). introducing memetics. retrieved march , , from http://meme.sourceforge.net/docs/memetics.php
dalkir, k. ( ). knowledge management in theory and practice. butterworth-heinemann.
davidson, c. n. ( ). so last century. times higher education, april, .
dawkins, r. ( ). the selfish gene. paw prints.
distin, k. ( ). the selfish meme: a critical reassessment. cambridge university press.
du plessis, j. ( ). learning objects: using language structures to understand the transition from affordance systems to intelligent systems. interdisciplinary journal of knowledge and learning objects, , - . retrieved from https://www.informingscience.org/publications/
florida, r. ( ). the rise of the creative class – revisited. basic books.
gibson, j. j. ( ). the ecological approach to visual perception. hillsdale, nj: lawrence erlbaum associates, inc.
gill, t. g. ( ). when what is useful is not necessarily true: the underappreciated conceptual scheme. informing science: the international journal of an emerging transdiscipline, , - . retrieved from https://www.informingscience.org/publications/
gratton, l. ( ). the shift – the future of work is already here. uk: harpercollins.
gurteen, d. ( ). taking responsibility. inside knowledge, ( ). retrieved march , , from https://www.scribd.com/document/ /the-gurteen-perspective-taking-responsibility
johri, a., & pal, j. ( ). capable and convivial design (ccd): a framework for designing ict for human development. information technology for development, ( ), - .
kahle, d. ( ). designing open educational technology. in t. iiyoshi & m. s. vijay kumar (eds.), opening up education (pp. - ). mit press.
kanengisser, d. ( ).
how ideas change and how they change institutions: a memetic theoretical framework. paper presented at the annual meeting of the american political science association, washington, d.c., august - , .
katz, n., lazer, d., arrow, h., & contractor, n. ( ). network theory and small groups. small group research, ( ), - .
koch, r. ( ). the power laws of business. nicholas brealey.
koltko-rivera, m. e. ( ). rediscovering the later version of maslow’s hierarchy of needs: self-transcendence and opportunities for theory, research, and unification. review of general psychology, ( ), .
lam, a. ( ). tacit knowledge, organizational learning and societal institutions: an integrated framework. organization studies, ( ), - .
le masson, p., hatchuel, a., & weil, b. ( ). the interplay between creativity issues and design theories: a new perspective for design management studies? creativity and innovation management, ( ), - .
levy, p. ( ). the semantic sphere . wiley.
meyer, u. ( ). integrating path dependency and path creation in a general understanding of path constitution. the role of agency and institutions in the stabilisation of technological innovations. science, technology & innovation studies, (may).
mintzberg, h. ( ). developing theory about the development of theory. in k. g. smith & m. a. hitt (eds.), great minds in management: the process of theory development (pp. – ). new york: oxford university press.
mitchell, b. t., & mitchell, r. k. ( ). digital content reuse in dynamic settings: an organizing typology for digital content users. proceedings of jais theory development workshop. sprouts: working papers on information systems, ( ).
retrieved march , , from http://sprouts.aisnet.org/ -
mynatt, e. d., o’day, v. l., adler, a., & ito, m. ( ). network communities: something old, something new, something borrowed. computer supported cooperative work (cscw), ( - ), - .
nielsen, m. ( ). reinventing discovery - the new era of networked science. princeton university press.
nonaka, i., & takeuchi, h. ( ). the knowledge-creating company. oxford university press.
nonaka, i., toyama, r., & konno, n. ( ). seci, ba and leadership: a unified model of dynamic knowledge creation. long range planning, , - .
o’raghallaigh, p., sammon, d., & murphy, c. ( a). the design of effective theory. systems, signs & actions, ( ), - .
o’raghallaigh, p., sammon, d., & murphy, c. ( b). towards an ontology of innovation models - a conceptual framework. european conference on information systems proceedings (ecis). paper .
popper, k. ( ). objective knowledge - an evolutionary approach. oxford university press.
popper, k. ( ). three worlds - the tanner lecture on human values delivered. the university of michigan april , .
rosenstein, b. ( ). living in more than one world: how peter drucker’s wisdom can inspire and transform your life. berrett-koehler publishers.
sandberg, a. ( ). memetics. retrieved march , , from http://www.aleph.se/trans/cultural/memetics/
schmitt, u. ( ). knowcations - the quest for a personal knowledge management solution. th international conference on knowledge management and knowledge technologies (i-know), sep - , graz, austria. retrieved march , , from http://dl.acm.org/citation.cfm?id= or http://www.researchgate.net/publication/
schmitt, u. ( e). managing personal knowledge to make a difference.
th british academy of management conference (bam), sep - , , liverpool, united kingdom. retrieved march , , from http://dx.doi.org/ . / . . .
schmitt, u. ( b). personal knowledge management devices - the next co-evolutionary driver of human development?! international conference on education and social sciences (intcess ), feb - , , istanbul, turkey. retrieved march , , from http://dx.doi.org/ . / . . .
schmitt, u. ( d). how this paper has been created by leveraging a personal knowledge management system. th international conference on higher education (iche), mar - , , tel aviv, israel. retrieved march , , from http://dx.doi.org/ . / . . .
schmitt, u. ( h). proposing a next generation of knowledge management systems for creative collaborations in support of individuals and institutions. proceedings of the th international joint conference on knowledge discovery, knowledge engineering and knowledge management (ic k), oct - , , rome, italy, pp. - . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( k). making sense of e-skills at the dawn of a new personal knowledge management paradigm. proceedings of the e-skills for knowledge production and innovation conference, november - , , cape town, south africa, pp. - . retrieved march , , from http://proceedings.e-skillsconference.org/ /e-skills - schmitt .pdf
schmitt, u. ( n). who needs personal knowledge management anyway and what for? poster presentation at the e-skills for knowledge production and innovation conference, cape town, south africa, - november . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( d). putting personal knowledge management under the macroscope of informing science. informing science: the international journal of an emerging transdiscipline, , - . retrieved march , , from https://www.informingscience.org/publications/
schmitt, u. ( e).
supporting digital scholarship and individual curation based on a meme-and-cloud-based personal knowledge management concept. academic journal of science (ajs), ( ), - . retrieved march , , from http://www.universitypublications.net/ajs/ /pdf/r me .pdf
schmitt, u. ( f). quo vadis, knowledge management: a regeneration or a revolution in the making? journal of information & knowledge management (jikm), ( ). retrieved march , , from http://dx.doi.org/ . /s
schmitt, u. ( g). knowledge management as artefact and expediter of interdisciplinary discourses. proceedings of th international multi-conference on society, cybernetics and informatics (imsci), orlando, usa, july - , , pp. - . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( h). knowledge management systems as an interdisciplinary communication and personalized general-purpose technology. in special issue of the journal of systemics, cybernetics and informatics : invited papers of the plenary keynote speakers at the ims conferences ( th international multi-conference on society, cybernetics and informatics (imsci), orlando, florida, usa, - july ). retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( i). towards a ‘world heritage of memes repository’ for tracing ideas, tailoring knowledge assets and tackling opportunity divides: supporting a novel personal knowledge management concept.
the international journal of technology, knowledge & society: annual review, , - . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( a). the significance of memes for the successful formation of autonomous personal knowledge management systems. in s. kunifuji, g. a. papadopoulos, & a. m. j. skulimowski (eds.), knowledge, information and creativity support systems (selected [extended] papers from kicss’ , th international conference held in limassol, cyprus, on november - , ), springer series: advances in systems and computing (aisc), vol. , pp. - . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( c). knowcations - positioning a meme and cloud-based nd generation personal knowledge management system. in a. m. j. skulimowski & j. kacprzyk (eds.), knowledge, information and creativity support systems: recent trends, advances and solutions (selected papers from kicss’ - th international conference on knowledge, information, and creativity support systems, nov - , , kraków, poland), springer series: advances in intelligent systems and computing (aisc), , pp. - . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( d). tools for exploration and exploitation capability: towards a co-evolution of organizational and personal knowledge management systems. the international journal of knowledge, culture, and change management: annual review, , - . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( e). design science research championing personal knowledge management system development. proceedings of informing science & it education conference (insite) , pp. - . retrieved march , , from http://www.informingscience.org/publications/
schmitt, u. ( f). redefining knowledge management education with the support of personal knowledge management devices. in v. uskov, r. j. howlett, & l. c.
jain (eds.), smart education and smart e-learning, springer series: smart innovation, systems and technologies, , - . doi: . / - - - - _ . retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( g). utilizing the disruptive promises of personal knowledge management devices for strengthening organizational capabilities of innovativeness and leadership. presented paper at the th ashridge international research conference (airc ), jul - , , berkhamsted, uk. retrieved march , , from http://www.researchgate.net/publication/
schmitt, u. ( h). personal knowledge management for development (pkm d) framework and its application for people empowerment. procedia computer science , c, - . doi: . /j.procs. . . (presented at the international conference on knowledge management (ickm), - october, , vienna, austria). retrieved march , , from https://www.researchgate.net/publication/
schmitt, u. ( j). design science research for personal knowledge management system development – revisited. informing science: international journal of an emerging transdiscipline, , - . retrieved march , , from https://www.informingscience.org/publications/
schmitt, u. ( k). strengthening smes impact and sustainability with the support of personal knowledge management systems and concepts. in th international entrepreneurship forum (ief) conference proceedings, volume : research and reflective papers, dec - , , venice, italy, pp. - .
retrieved march , , from http://www.essex.ac.uk/conferences/ief/documents/ th/conference_proceedings_vol .pdf
schmitt, u. ( a). the logic of use and functioning of personal km-supported experience management. th german workshop on experience management (gwem) within the th conference on professional knowledge management (prowm), apr - , , karlsruhe, germany. retrieved march , , from http://www.researchgate.net/publication/
schmitt, u., & saadé, r. g. ( ). taking on opportunity divides via smart educational and personal knowledge management technologies. th international conference on e-learning (icel), jun - , , orlando, usa. retrieved march , , from http://www.researchgate.net/publication/
shifman, l. ( ). memes in a digital world: reconciling with a conceptual troublemaker. journal of computer-mediated communication, ( ), - .
signer, b. ( ). what is wrong with digital documents? a conceptual model for structural cross-media content composition and reuse. conceptual modeling–er , pp. - . springer berlin.
simon, h. a. ( ). designing organizations for an information-rich world. in m. greenberger (ed.), computers, communication, and the public interest. baltimore: johns hopkins press.
stewart, i., & cohen, j. ( ). figments of reality - the evolution of the curious mind. cambridge university press.
sztompka, p. ( ). trust: cultural concerns. international encyclopedia of the social & behavioral sciences, pp. - . elsevier.
wiig, k. m. ( ). the importance of personal knowledge management in the knowledge society. in d. j. pauleen, & g. e. gorman (eds.), personal knowledge management (pp. - ). gower.

biography

ulrich schmitt’s professional background covers positions as it and management consultant in london and basle, as professor and vice president at two independent universities in germany, as well as vice rector at the polytechnic of namibia and dean of the graduate school at the university of botswana.
he studied management and industrial engineering at tu berlin and cranfield university, completed his phd at basle university, and a science and research management program at speyer university. currently, he is focussing on personal knowledge management and is professor extraordinaire at the university of stellenbosch business school. see web site for previous and upcoming pkm related work: http://www.researchgate.net/profile/ulrich_schmitt

kent academic repository
full text document (pdf)

copyright & reuse
content in the kent academic repository is made available for research purposes.
unless otherwise stated all content is protected by copyright and in the absence of an open licence (eg creative commons), permissions for further reuse of content should be sought from the publisher, author or other copyright holder.

versions of research
the version in the kent academic repository may differ from the final published version. users are advised to check http://kar.kent.ac.uk for the status of the paper. users should always cite the published version of record.

enquiries
for any further enquiries regarding the licence status of this document, please contact: researchsupport@kent.ac.uk
if you believe this document infringes copyright then please contact the kar admin team with the take-down information provided at http://kar.kent.ac.uk/contact.html

citation for published version
redmon, david ( ) video methods, green cultural criminology, and the anthropocene: sanctuary as a case study. deviant behavior, ( ). pp. - . issn - .

doi
https://doi.org/ . / . .

link to record in kar
http://kar.kent.ac.uk/ /

document version
author's accepted manuscript

video methods, green cultural criminology, and the anthropocene: sanctuary as a case study

documentary criminology is a burgeoning, open-ended methodological technique that crafts and depicts sensuous knowledge from the lived experiences of crime, transgression, and harm. this ‘video ethnography paper’ examines my minute documentary, sanctuary, as a case study to demonstrate how documentary criminology draws upon green cultural criminology, video methods, and sensory studies to provide an experiential understanding of crime (in this case, against donkeys) and rehabilitation in the contested notion of an ‘anthropocene’ epoch. i trace how documentary criminology can evoke and enact the lived experiences of “donkey rehabilitation” as sensuous scholarship.
note: i employ the term “harm” as a shorthand for these three concepts (crime, harm, and transgression) throughout the rest of this paper. although “harm” is the central issue confronted in donkey, documentary criminology may be used equally effectively to approach all three concepts.

note: sanctuary can be viewed at the following link: https://vimeo.com/ with the password: greenculture.

re-wilding: a convergence of green cultural criminology and documentary criminology as a feral method

in this article, i advocate for the adoption of video methods, in the form of “documentary criminology,” to enhance and advance our understanding of crimes against species through the construction of ethnographic media as sensory scholarship. i argue that in order to confront and address the anthropocene, criminology must develop new languages and sensibilities: textual, non-textual, sonic, and cinematic. a project of textual and audiovisual reconfiguration of criminology will necessarily be epistemologically and methodologically open-ended and emerging whilst subject to change, adaptation, and transformation. documentary criminology is simply one form among several, offering an aesthetic approach to craft and depict knowledge that pushes the boundaries of sensuous scholarship. where traditional representations of criminological knowledge embrace writing and telling as ways of knowing, documentary criminology extends this process to showing, sensing, and hearing, and in doing so helps us relate to the experience of harm. by highlighting the haptic and sensory closeness of harm, documentary criminologists can emotively touch audiences, inspiring understanding and empathy. documentary criminology’s methodological approach transfigures the form and content of what is recognizable as “knowledge” in criminology. it calls into question taken-for-granted notions of what “counts” as criminological scholarship, thereby expanding methodological and epistemological boundaries.
no longer reliant on the spoken word or textual accounts, documentary criminology reconsiders the foundational assumptions of how criminological knowledge is produced as scholarship. rather than dismissing documentary criminology as an illegitimate video method that fails to conform to staple-of-the-discipline approaches to explore crime, criminologists should seek to expand their methodological sensibilities and enhance their understanding of harm as sensuous knowledge production. as academics, we are taught – and we teach students – to write, read, and deliver textual research: we produce powerpoints, word documents, journal articles and books. yet in doing so, we reduce the plenitude of lived experience, in all its complexity and sensuousness, to language and numbers—we hold ourselves back from plunging into the brute experiences of sound, smell, taste, perception, and color. as a criminologist, i have steeped myself in video methods for the past fifteen years, producing (what i call) ethnographic documentaries with visual and aural sensibilities. in , i premiered my first ethnographic documentary, mardi gras: made in china, at the sundance film festival. at that time, i jokingly called myself a “feral criminologist,” an acknowledgement that i had strayed away from formal criminology to make documentaries. i left behind an academic world of teaching, administrative duties, and departmental meetings to craft ethnographic films from lived experience – documentaries that didn’t “count” towards tenure in academia. my projects, it seemed, were too wild, too far outside the boundaries of academia to count as reliable and valid academic knowledge. i had “gone feral” (rather than “gone native”) in my constant experimentation with the intersections among audiovisual technologies, methodologies, and criminology, and i now existed professionally outside the confines of the academic discipline. 
although i was disappointed in criminology, as a discipline, for not valuing video methods and supporting the making of documentaries to produce and disseminate knowledge, i continued to accept one of criminology’s central goals: the production of sensuous knowledge. it was only the institutional means to achieve that goal i rejected. instead of publishing in academic journals, i turned to popular culture, film festivals, television, netflix, itunes, and film distribution companies to distribute my scholarly work. the relative dearth of institutional support for criminological documentary filmmaking made it impossible to work with students to encourage audiovisual ethnographic research. without institutional support, neither the students nor i could attain the goals of tenure, or the completion of dissertations; these milestones necessitated conformity to accepted institutional methods. so i left academia to become a feral scholar. flash forward more than a decade: today, thanks to a handful of imaginative criminologists working tirelessly at the margins to open up methodological possibilities, criminology has entered an academic, epistemological, and pedagogical climate of profound opportunities, complete with a newfound willingness to experiment with emerging technologies. these opportunities allow criminologists to go beyond the static methods of positivism, rational choice, and quantitative and qualitative research to explore mobile methods that can evoke and depict the fleeting, sensuous, and embodied motions of daily life. les back ( ) illustrates how today’s academic environment increasingly encourages a broad imagination within the social sciences, including criminology. “the tools and devices for research craft are being extended by digital culture in a hyper-connected world, affording new possibilities to re-imagine observation and the generation of alternative forms of research data” (back and puwar : ).
the experiential flux of what people, objects, and animals do can be depicted sensuously. mobile methods of immersion can reside within the ambiguity of fluid experiences (see ferrell ; ferrell ; redmon ). the methodological techniques of documentary criminology allow researchers to depict elements of harm in ways that written text cannot deliver. whereas written knowledge primarily lends itself to linear processing, sensory knowledge engages the viewer through non-linear encounters and indeterminate contacts (macdougall, ; young, ; campbell, ). the video techniques of documentary criminology join sensory experience with harm to produce vibrant encounters (campbell, ; ). these techniques craft a sensory documentary that plunges viewers directly into the fluctuating experiences of harm through pre-reflective attention and external expressivity.
documentary criminology as sensuous scholarship
documentary criminology’s wider objective is invitational. uninterested in grand theory, both documentary criminology and green cultural criminology embrace a perspective-oriented approach that seeks to develop methodologically open-ended and porous sensibilities which evolve and adapt over time. unlike textual criminology, which relies primarily on written language as a technique to render analysis, documentary criminology crafts an aesthetically rich, empirical, sensuous scholarship that uses images, sounds, and textures to immerse audiences in lived experience. documentary criminology attends to lyrical impressions and atmospheres of harm, crime, and pleasures. to paraphrase jane bennett ( ), documentary criminology’s sensuous knowledge is vibrant; it provides an experiential way of knowing and transfigures the real through contact and encounter. when we engage with documentary criminology, we not only know the real intellectually: we also encounter the real with our bodies pre-linguistically. we are sensual before we are verbal; we are pre-reflective and reflective.
documentary criminology’s phenomenological framework “calls us to a series of systematic reflections within which we question and clarify that which we intimately live, but which has been lost to our reflective knowledge through habituation and/or institutionalization” (sobchack : ). the unique window documentary criminology provides into lived experience broadens criminology’s boundaries.
green cultural criminology and the age of the ‘anthropocene’
academics have published poignant and divergent literature on the anthropocene — the “age of the human” — a questionable new epoch defined by human devastation of ecological habitat through acts of destruction (baskin ; bonneuil and fressoz ; hamilton et al. ; pattberg and zelli ; ruddiman ; steffan et al. ). the growing interest in this subject is reflected in the popularity of books such as feral, re-wilding, the intimate bond, feral cities, and rewilding the world. academic conferences (e.g., “animals in the age of the anthropocene”) have considered the complicated relationships among the anthropocene, the wild, humans, and animals. documentaries featuring harm to animals have heralded the birth of a new domain in film studies encompassing “ecocinema” and “popular green criminology” (rust, monani, and cubitt ; kohm and greenhill ). both frameworks pursue a green cultural criminological exploration of how media represents harm against species, ecology, and humankind.
for example, kohm and greenhill ( : ) observe that media’s ‘affective nuances’ can engage audiences and move “criminology forward by reaching audiences rarely exposed to mainstream academic discourses on crime and the environment…[media] opens up spaces for affective engagement with (in)justice and simultaneously suggests a re-examination of taken-for-granted assumptions about offending and harm and their connection to broader contexts.” indeed, such quintessentially mainstream companies as netflix, national geographic, cnn, bbc, arte, amazon, and the discovery channel are currently developing and expanding programs on the problematic relationships between animals, humans, and the last remaining “wild” spaces on earth. the majority of these “wild” documentaries uncritically appear in public platforms such as itunes, television stations, open-access e-journals with video embeds, film festivals, galleries, and movie theaters. the “wild” as a concept that is inseparable from nature and culture – and especially media depictions of both – has been debated among academics. cronon’s ( ) argument, for instance, is that the ‘wild’ is fundamentally a human creation steeped in value-laden romanticism that elevates it to the status of the sacred and divorces the human from the natural. drawing upon the discourse of poets and environmental activists, cronon demonstrates the human construction of the wild as supernatural, transcendental, classist, and often racist. yet, according to cronon, the ‘wild’ is everywhere; it contains its own autonomy and reasons for being inside ecological relationships increasingly decimated by acts of harm (cronon ). media’s entangled relationship with the wild is contradictory, as will be explored in the concluding critiques of this article. merging documentary criminology with green cultural criminology seems natural from the outset, given the overlap in perspective between the two disciplines.
as ferrell, hayward, and young ( : ) indicate, “countless other forms of green cultural criminology can also be imagined, some undertaken, some waiting to be imagined.” green cultural criminology resonates with documentary criminology in several ways: (1) both attend to the mediated dynamics of style, symbolism, and meaning of environmental or species harm; (2) both have a shared focus on resistance to species harm; (3) both advocate for the evocation of sensory experiences of species harm through the construction and dissemination of media. the harm arising from the anthropocene calls upon us, as contemporary criminologists, to expand our understanding of crime, innovate our modes of analysis, and broaden our engagement with public audiences through the production and dissemination of media. shearing ( : ) explores the conceptual consequences of the anthropocene on the shape and content of criminology: “the realization that we humans are powerful biophysical agents invites us, as criminologists, to ask what criminology might be, and should be, in the anthropocene?” south’s ( : ) response is to focus “on the study of ‘harms’ as much as, if not more than, the study of crimes.” brisman and south ( : ) recommend that criminology broaden its understanding of “crime” to include harms to species and their habitats through interdisciplinary approaches.
note: a notable difference between green criminology and green cultural criminology is how the former focuses on a broad array of political, economic, ecological and corporate infrastructures that enact harm against the environment, food production and animals, whereas the latter explores the impact of cultural production and consumption, mediated dynamics, and symbolism of the social construction of harm.
citing several documentaries and fiction movies in their exploration of crimes against species, the authors suggest that researchers implement sensibilities of media production “in popular cultural forms” to transmit, disseminate, and bring empathetic concern to critically resist social harms (brisman and south : ; brisman et al. ). acknowledging the prospects of a green cultural criminology, ferrell ( ) also drifts into the discussion on the state of contemporary criminology in the age of the anthropocene. similar to brisman and south ( ), ferrell offers suggestions for methodological linkages between cultural and green criminology with video methods in his exploration of crime: “even in this emergent stage, though, particular orientations can be identified – orientations that create some particularly fertile ground for the intertwined growth of green criminology and cultural criminology. by the nature of their subject matter, both green criminology and cultural criminology push against the conventional boundaries of criminology, and so tend to upset the definitional and epistemic order of the discipline…among cultural criminology’s more useful innovations has been its emphasis on the visual, not only as an essential criminological subject matter in an increasingly mediated world but as a mode of criminological documentation and analysis. a visual criminology of this sort seems particularly appropriate for recording and communicating the little lost ecologies of everyday life” (ferrell : – ).
note: documentary criminology is a theoretical and methodological sensibility that actively enacts and produces media as sensuous scholarship, whereas visual criminology examines and interprets pre-existing visual representations of crime such as images and videos. visual ethnography is understood as a research method emerging from the social sciences to gain a deeper understanding of social life through lived experience and its visual representation.
this call to re-examine and broaden criminology’s focus on the study of depicting harm also necessitates the invention of new methodological sensibilities to conduct research on and depict harm. such a re-examination demonstrates there is no singular “criminology”; rather, several “criminologies” exist (michalowski ). i consider documentary criminology to be an essential part of this plural and open-ended emerging project that invites new methods to craft sensuous knowledge about the consequences of harm in the age of the anthropocene. it is here that documentary criminology and green cultural criminology can forge an alliance, and i demonstrate such an alliance with the documentary sanctuary as a case study.
sanctuary as a case study of documentary criminology
sanctuary arrives during a particularly crucial period of the anthropocene and green cultural criminology’s response to it, as capitalism fuels the growth of urbanization and consumerism encourages expansive development without regard for the impact on other species that share the planet. today, fewer and fewer “wild” spaces remain in which donkeys and other animals can live without human intervention and harm. according to the world wildlife fund, the planet has seen a % reduction in the overall number of wild animals since ; carbon emissions, urban expansion, pollution, abuse, and trafficking are pushing the wilderness to the margins and sequestering species within urban habitats. into the breach created by this global crisis, institutions have arisen to protect marginalized species and the environments they inhabit. sanctuary responds to the harms of the anthropocene by exploring the rehabilitation of one particular species as an analog for larger issues of animal rights and eco-justice. once highly valued as farming and transport animals, donkeys in the post-industrial world have been rendered superfluous.
as their functional value in society has diminished, the hundreds of thousands of donkeys currently in existence have been re-commodified; rising criminal networks illegally abduct and traffic in equines – horses and donkeys – (sollund ), selling them to corporate factory farms where they are slaughtered, processed, and falsely packaged as “beef” in france, sweden, canada, south africa, australia, the u.s., the u.k., and other countries. state complicity in the abusive treatment of donkeys is rampant: the parks and wildlife department and the bureau of land management in the united states routinely round up and shoot donkeys; in mexico and in quebec, canada, donkey hides are sold and carcasses are butchered as meat; taliban and isis fighters plant bombs in donkeys, and soldiers kill donkeys suspected of transporting weapons for terrorists. sollund ( : ) refers to animals that are abducted, trafficked, and killed in this commodity chain as victims of crimes. “wildlife trade is the abduction, acquisition, collection, destruction, possession, or transportation of animals for the purposes of barter, exchange, export, import, or purchase” (sollund : ). to date, over , donkeys have been rescued from harm, abuse and abandonment in the u.k., usa, france, spain, canada, and ireland. this startling number of rescues makes us wonder: what happens after these animals are rescued? what are the sensory elements of the rehabilitative process that donkeys undergo inside donkey sanctuaries as they recover from abuse and abandonment? sanctuary is set inside the donkey sanctuary located in sidmouth, devon, uk. the donkey sanctuary’s (hereinafter “the sanctuary”) mission is to care for the welfare of, and to rehabilitate, abandoned and abused donkeys. the sanctuary is inseparably tied to and born of the anthropocene, but it also provides resistance to and refuge from this destructive epoch.
the ambiguous process of donkey rehabilitation in the sanctuary is evoked in almost every phase of action, from donkey rescue to donkey surgery and donkey dentistry. aesthetic depictions of these processes capture the sights, sounds, and patterns of donkeys’ everyday experiences as they expressively reside in a human-made total institution that rehabilitates them. the camera captures their movements — braying, walking, eating, embodying their habitat, perceiving their environment. the images and sounds of care work and rehabilitation are messy, unpleasant, and at times alarming: a needle in a donkey’s neck for anesthetization; surgery to repair damage; farriers cutting directly into the tissue and nerves of donkey hooves. sanctuary unsettles the body and troubles the conscience while also instilling in audiences the vitality necessary to affectively encounter the vulnerability of species. sanctuary demonstrates how video methods can be implemented to explore the broad ramifications of the anthropocene through the microcosm of care-work, where the damage of human violence and cruelty against one particular species — donkeys — is healed through rehabilitation. although the goal of care-work is to improve the health of damaged donkeys, the invasive measures required to achieve this goal inevitably inflict pain, even as caretakers work to alleviate the ravages of human cruelty in the greater interest of long-term improvement. the invasive procedures undertaken to rescue donkeys from cruelty and habitat loss render them dependent on their human caretakers. donkeys cannot decide when, where, or what to eat; they are not free to leave; they are bound by the institution’s spatial and temporal barriers, put in place to protect and heal them. the institution, to paraphrase goffman ( ), is “total.” within this mandatory enclosure, many of the donkeys are cut off from the wider ecology forever.
together, these donkeys lead an enclosed, formally administered life (goffman : ), sutured between a cruel world of deliberate abuse and the inadvertent but inevitable pain of rehabilitation. how do donkeys inhabit, embody, and expressively experience this institutional space of care-work that subjects them to distress in order to provide rehabilitation, sanctuary and security? sanctuary examines the broader conditions of harm and healing by foregrounding the film’s subjects — the donkeys — and capturing the starkness of their brays, trots, and spatial negotiations inside the confines of the care facility. through experiential images and sounds, the audience comes to understand how donkeys’ lives are shaped, orchestrated, and managed inside the sanctuary’s total institution. cinematically, the film brings the audience inside the contact zones of social control and victimology. indeed, green cultural criminology provides a compelling approach to forging relationships between its theoretical framework and video methodologies enacted as documentary criminology. in examining how these connections are forged, i now turn to discuss how documentary criminology embeds itself within green cultural criminology to craft sensuous scholarship in the form of ethnographic documentaries.
methodological sensibilities: four approaches to evoking aspects of the anthropocene
sanctuary took three months to prepare and five years to make. in my filmmaking, i acted initially as a trained ethnographer. i relied on the attuned skills of patience, participation, and immersive participatory-observation while taking detailed notes. i remained stationary in various parts of the sanctuary for several days – at times, i slept there for up to seven nights; when not overnighting at the sanctuary, i slept in an adjacent bungalow for up to a month at a time, off and on for five years.
my goal in undergoing this immersion was to understand the rhythms and sounds of rehabilitation, the redundant movements through which humans and donkeys encounter each other during care work procedures, and the haptic interactions of touch as a rehabilitative process. i gave particular attention to the rhythms and patterns of caretakers and their choreographic gestures, how they delivered and isolated the donkeys to provide rehabilitation from abuse. i incorporated my observations into the techniques i used to move and place my camera and sound recorder in relation to the rehabilitation process. from my initial immersive activities, i decided to focus on and implement four methodological techniques: (1) learning to attend; (2) continuous long take; (3) sensory reliance; and (4) sonic communication. i believe each of these techniques can be fruitfully incorporated into green cultural criminology. in the following pages, i will use sanctuary as a case study to explore ways to extend green cultural criminology into a practice-based methodology of audiovisual sensory scholarship that crafts media out of ethnographic encounters. the four methodological techniques i identify can evoke sensuous scholarship through ethnographic immersion. by providing sensory substance via these four video methods, criminologists can advance a novel understanding of how rehabilitation unfolds for animals that have been abused by humans, enhancing our study of deviance and crime. methodologically, the four techniques i will outline assist in evoking an “order of things” (bennett ), a structure of an experience (sniadecki ), and atmospheric drama in the mundane experiences of situational care work (vannini ). learning to attend requires a sensibility of openness, which in turn allows one to construct sensuousness through cinematic immersion in flux (ferrell : ).
continuous long take, or extended duration, helps the videographer evoke a continuation of experience that can make the familiar unfamiliar, and the unfamiliar familiar – a fundamental goal in sociological criminology. sensory reliance “proceeds neither through the reductionism of abstract language nor the subordination of image and sound to argument, but instead through the expansive potential of aesthetic experience and experiential knowledge” (sniadecki : ). finally, sonic communication is based on the understanding that meaning “does not emerge only from language; it engages with the ways in which our sensory experience is pre- or non-linguistic, and part of our bodily being in the world” (karel : ). in the rest of this article, i discuss the implementation of these four video methods to amplify and evoke empirical atmospheres of harm and rehabilitation in the context of sanctuary.
1. learning to attend: rehabilitation through touch
a central tenet in documentary criminology is that one must learn to attend to the activity of engagement by placing one’s body in proximity to it. in practice, learning to attend requires continual interaction with the activities you are investigating (in this case, harm and rehabilitation). in sanctuary, touch is the first type of interaction used to rehabilitate abused and abandoned donkeys. attending to activities such as rehabilitation-through-touch often entails ongoing adjustments to one’s positioning of the camera in response to the movements underway. foregrounding sensory encounters of touch through the skilled practice of ethnographic attentiveness helps the videographer to evoke the lived experience of rehabilitation and retain its animated features. the donkeys’ movements (feet shuffling, ears flapping, heads bowed while eating) and their varied brays are mundane, but when given full cinematic attention they together produce a symphony of movement, sound, and emotion.
in sanctuary, each cinematic shot of rehabilitation is precise in how it attends to the minutiae of care work. the durations and situations involved in the rehabilitation effort are carefully captured. human hands move across the body of an abused donkey; a woman calmly speaks to donkeys while stroking them; dentists use machines to repair donkeys’ neglected teeth; a veterinarian places a sedative in a donkey’s neck immediately prior to performing surgery. each specific situation of care work and rehabilitation occurs inside tightly contained spaces intended to create safety for the donkeys. rather than creating a series of juxtapositions that condense the shots into fragments, the video method of learning to attend allows the rehabilitative experience to play out in an unusually lengthy manner. each shot of attentiveness enhances the rehabilitative process and gives attention to touch and contact: the interactive texture of hands, brays, hooves, and machines. laura marks discusses touch and contact as haptic interactions particular to the surface of the body. to touch is to trace a memory onto the body and activate the skin by moving through an immediate environment of material contact (marks : xii). “haptic criticism is mimetic: it presses up to the object and takes its shape. mimesis is a form of representation based on getting close enough to the other thing to become it” (marks : xiii). contact through touch also engages a material association with hearing: these are haptic sounds. a bray, for instance, is a touch of sensuousness for some people, a stirring in the chest that offers a new way to experience sensations inside our bodies. in haptic visuality, by contrast, the eyes touch but do not attempt to produce identification; they encourage an embodied relationship between viewer and image (marks : ). fingers and hands touch donkeys to explore their hair, head, nostrils, legs, and tails.
learning to attend to haptic encounters as a methodological sensibility requires intuitive, embodied movement based on the experience of contact. the cinematic approach is to move with lived experience; the movement of the body, camera, and sound recorder occurs intuitively during the rehabilitative process. the criminologist-as-filmmaker responds to the experiential activity while remaining immersed in it, thereby co-constructing involvement as a relationship. collaborative dynamics emerge among the criminologist-as-filmmaker, the caretaker, and the donkeys – all of whom share the experiential dynamics of the situation. these dynamics converge to produce a singularity of unique experience for each human and donkey, or what manning and massumi ( : ) call a “catalyzing moment” that helps the situation develop a “creative participation which would be encouraged to take on their shape, direction, and momentum in the course of the event” (manning and massumi, : ). manning and massumi ( ) refer to the methodological approach of learning to attend as “techniques of relation.” techniques of relation always occur within “enabling constraints” and are therefore devices for catalyzing and modulating interaction; they comprise a domain of practices (manning and massumi : ). the collaboration here is among the filmmakers, the donkeys, and the caretakers, all of whom share overlapping experience through encounters inside enabling constraints. the learning-to-attend approach connects various experiential practices, arranges unexpected filmic explorations, and allows more exploratory movement with the camera and sound recorder. “this means that what is key is less what ends are pre-envisioned – or any kind of subjective intentional structure – than how the initial conditions for unfolding are set” (manning and massumi : ). learning to attend is a skilled practice and a re-wilding technique that relies on open-ended physicality, embodied skills, and cinematic immersion.
learning to attend brings the critical faculties of intuitive practice, mobility, and flexibility to documentary criminology as a feral technique that ruptures and undoes “proper,” pre-conceived, and rigid methodologies. the wildness joins the already existing experiences to allow unforeseen possibilities, unexpected practices, and new types of movements to emerge. in this sense, each film that emerges from documentary criminology takes its own shape, form and momentum to arrive at an unknown outcome, rather than abiding by pre-conceived rules, a “vision,” or procedures – all of which are in line with green cultural criminology’s approach of attunement through affective encounters.
2. long take: duration of an experience
sanctuary experiments with long, unbroken shots designed to inflect the continuity of lived experience and to more fully explore the expressivity of donkey rehabilitation within the sanctuary habitat. the drama of duration produces shots that are mundane yet highly charged, attuned to everyday moments of texture and the sounds of machines used to rehabilitate donkeys. slow movements and extended scenes offer audiences the opportunity to thoughtfully reflect on their relationships with time, the space in which rehabilitation occurs, and nonlinguistic soundscapes. the use of long takes in sanctuary invites viewers into the active process of rehabilitating donkeys and helps them bear witness to the external response to that rehabilitation (subjectively, of course, we will never know the donkeys’ internal experience of it). sanctuary opens up a rare moment in the lives of viewers, allowing them to be actively present during the rehabilitation of donkeys and exposing them to the tension inherent in this process. the film’s long-take shots explore how donkeys circulate within the sanctuary, moving from arrival to isolation, grooming (haircut, bathing, hooves pared) to membership in the herd.
implementing the long-take technique gives documentary criminologists the chance to help audiences sensuously understand not only the harmful implications of animal abuse but also the broader context of the length of time it takes to heal invisible and visible wounds. experiential long takes provide audiences an opportunity to connect (or to use green cultural criminology’s term, “cathect”) with donkeys during the rehabilitative process. cathexis is “the process of charging an object, activity, or place with emotional energy, which is in turn related to memory creation” (pretty : ). documentary criminology’s advantage in this case is its ability to evoke highly charged atmospheres and appeal to the affective nature of the senses. brisman and south ( : ) suggest that cathexis facilitates “attachment to objects, activities, and places, and this matures over time as a part, and as a reflection, of biography and experience.” although brisman and south use the term “cathexis” to critically engage the limitations of consumerism, it also affirms an empathetic and sensuous interaction between viewers and abused donkeys, catalyzed during highly charged circumstances. cathexis suggests that viewers may construct an attachment to animals, so that eventually the affective charge becomes part of their own biographical identity. documentary criminology is inherently an experiential and sensuous medium, and for this reason, the technique of the long take has the potential to connect, or cathect, donkeys with viewers. the long take intentionally eschews expository narrative and avoids constructing tension through the juxtaposition of shots in cinéma vérité style or with the use of words. instead, the long take closely resembles scott macdonald’s ( ) phenomenological pragmatism: it evokes brute lived experience shaped into a narrative of everyday encounters, where the tension resides within the shot rather than between the juxtaposed shots.
the long takes presented in sanctuary shift the presentation of lived experience away from a dramatic, edited narrative to an attuned phenomenological inquiry of presence. long takes can elucidate the structure of an experience and reveal drama in mundane situational moments. the long take, as a technique of documentary criminology, offers a compelling means to inflect the fluctuating richness of complex motions and encounters that occur at the intersection of human, animal, and object rehabilitation. methodologically, understanding rehabilitation entails paying close attention to atmosphere: how donkeys position themselves to eat, where they stand, and how they move in synchronous rhythms toward and within barriers of spatial limitations and freedoms. the camera is positioned to glance at donkeys within metal walls, but donkeys look back with intention. their gaze holds the audience. it is clear that donkeys are not only objects to be looked at; they are subjects who look back.
sensory reliance
the mechanical sounds of a lorry’s movement mingle with a donkey’s muffled snorts. where is the donkey going and why? an introductory long-take shot sets the cinematic tone of sanctuary as a single donkey enters an institution and guides the audience to the herd. we see that there are more donkeys — thousands more, in fact. from within the herd, the story unfolds patiently and attentively, in a spirit of curious exploration, with gentle sounds, harrowing brays, distressed movements, grinding machines, and embodied gestures. documentary criminology’s methodological approach highlights and foregrounds these sensory textures and kinetic inflections within the cinematic context of sensory criminology, a cinematic aesthetic that seeks to craft and implement media to situate audiences inside immersive phenomena of deep personal presence rather than didactic exposition or textual representation.
without the aid of voiceover or expert interviews, the audience is left to sensuously engage directly with donkey rehabilitation. what remains when human language is stripped from documentary analysis? when the verbal, expository language is omitted, requiring viewers to rely on their own interpretive skills to experience the documentary? experiential immersion into the sensory aesthetics of the donkey’s ecological space requires audiences to rely on sensibilities of orientation rather than a narrator’s voice, an expert’s interview, or an academic’s explanation – here, the documentary itself is the analysis. documentary criminology’s methodological technique of “sensory reliance” is open and expansive. it relies on the audience to add to the movie through their sensory engagement; it cultivates attentiveness and patience. relying on senses places bodies in contact with each other as a way of knowing. flashes of donkey experience enrich the criminological imagination. instead of subordinating lived experiences as instrumental fodder for linguistic explanation, sanctuary prioritizes the richness of unspoken sensory experiences of green cultural criminology – the criminological imagination – to reach beyond verbal, numerical or textual criminology. sanctuary embraces sensory criminology and the inflections of video ethnography by deliberately enhancing the relationship between sight, sound, and movement: tactile sensations transmitted to the viewer’s body. by relying on sensory engagement, the film immerses audiences in the donkeys’ institutional habitat, enveloping them in haptic contact through tactile and aural engagement with the rehabilitative process. in sanctuary, each rehabilitative scene is open-ended “as seen, felt, and heard—they speak to the body…” (redmon : ). the aesthetics of species harm – here, the damage to donkeys’ habitats and bodies – is offered as a puzzle to be teased out (brisman and south : ). 
documentary criminology, when combined with a green cultural criminology ethos, is infused with vitality: it brings audiences into a sensory, embodied relationship with donkey rehabilitation and its enlivened surroundings – it demonstrates damage but also resistance to harm. the documentary connects with audiences in a physical, internal way while also encouraging audiences to touch and be touched; to transform and be transformed; to act and be acted upon by the physical and animal world. john dewey ( : ) quotes a poet who maintains that “poetry seemed ‘more physical than intellectual,’ and goes on to say that he recognizes poetry by physical symptoms such as bristling of the skin, shivers in the spine, constriction of the throat, and a feeling in the pit of the stomach like keats’ ‘spear going through me.’” this physical, sensory response is what is evoked with sanctuary: the eeriness of the dark barn, the closeness of the fur, the varied loud sounds of the bray, and the confrontation of the donkey’s gaze — all generate tension felt on and within the body. these aural and tactile experiences are kinetic, intended to activate audiences’ bodies, senses, and minds – thereby providing a “thick” understanding of what abused donkeys go through during the process of rehabilitation. documentary criminology approaches the human/non-human barrier not as a problem to be rationally solved, but as an opportunity to be recognized and embraced — an opportunity to acknowledge relational sentience embedded in profound difference. a larger objective of documentary criminology is to sensuously inflect and infuse these human/non-human differences with vitality so they flourish rather than diminish. documentary criminology, in this instance, is open and expansive. when we empathetically immerse audiences in habitats replete with sounds and visuals—when we envelop them in haptic contact—we astonish by way of pre-reflective contact. . 
sonic communication and diegetic sounds (or, how can we ignore the range of donkey brays?)
sonic communication — the diegetic soundscape of criminological atmospheres — provokes embodied and ‘felt’ responses in audiences. documentary criminology maximizes the technique of sonic communication in sanctuary through the aural textures of rehabilitation, whereas spoken human language is minimized. approximately seven english sentences are spoken in sanctuary; overwhelmingly, the language of the soundscape is instead mechanical, animal, and environmental. non-verbal communication emphasizes the status of the donkeys as victims of crimes – burning, torture, stabbing, and starvation. banging metal bars clash with donkey brays; the sounds of impatient donkeys scampering in a barn commingle with the echoes and refractions of their hooves running across the concrete floors as they prepare for medicine to be forcefully inserted into them. the sounds of donkeys jumping, resisting their medicine and licking their lips blend with the unseen donkey brays in the background. these atmospheric sounds of rehabilitation are the living substance of aural phenomenological experience that communicates to audiences. we are so acclimatized to the presence of an expository, disembodied voice directing viewers’ attention that the mere omission of this voice starkly foregrounds the ambiguity of sonic language. exposition that ‘clarifies’ closes off interpretative possibilities, whereas the non-expository, sonically vibrant atmosphere offered in sanctuary is open-ended. foregrounding sonic communication in documentary criminology has the methodological benefit of reinforcing the audience’s connection to the sensory immediacy of rehabilitation as aesthetic knowledge. the depiction of sound as aesthetic knowledge aligns with katz’s quest to convey the sensory details of crime and the victims who experience it.
katz ( : ) writes: “social science literature contains only scattered evidence of what it means, feels, sounds, tastes, or looks like to commit a particular crime. readers of research on homicide and assault do not hear the slaps and curses, see the pushes and shoves, or feel the humiliation and rage that may build toward the attack …”. indeed, audiences do not hear (or see) any of the sensory elements addressed by katz ( ) in written form. yet documentary criminology as sensuous scholarship can communicate sonic textures from multiple perspectives. for example, the sounds of repairing donkey teeth; the grating of damaged donkey hooves; the tactile softness of massaging a wound — all these sensory encounters permeate the sanctuary and activate viewers’ bodies in uncomfortable and pleasurable ways. the sound of the donkey dentist’s machine grinding on the enamel of donkey teeth is part of disciplinary rehabilitation as well as the sanctuary’s sonic environment. these sounds agitate and vibrate, evoking memories and empathy. rehabilitation is heard, but the violence against donkeys is absent: audiences hear the consequences of harm through routine care work. the visceral sounds of rehabilitation communicate non-verbally through aesthetic experience. these diegetic and depicted sounds of documentary criminology fill in katz’s gap. sound acts with force; sound is felt internally and externally on the skin of the body. sonic communication inflects immediacy and pre-reflective expressivity; the body naturally registers experiential sounds differently than written communication. sonic communication develops rapport but also ruptures viewers’ (listeners’) engagement. the sonic components of rehabilitation in sanctuary come in a wide range of styles, embodying pre-linguistic sensory experiences through their duration and continuity.
the soundscapes of criminology can immerse the body into an aural atmosphere — an overlooked frontier of lived experience in criminological research methods. criminologists almost never study the sounds of crime, transgression, harm — or, in this case, rehabilitation. it is only in documentary criminology that sound stands on an equal footing with visual and textual representation. documentary criminology demonstrates how sound is directly connected to embodiment and sense of place. sound may not be seen, but it is perceptible to the body. the “visual” in “visual criminology” is often privileged as a primary way of knowing experience, effectively eschewing the aural, the affective, the tactile, haptic, and ambient — yet all experiences necessarily contain invisible, crucial textures of sound. [footnote: an exception, in which one encounters the “smells” of criminology, is chura, david. . i don’t wish nobody to have a life like mine. beacon press. boston.] documentary criminology is interested in the invisible as much as the visible (davies et al. ), and it is in the invisible domain of the sonic that documentary perhaps gains its most powerful methodological and epistemological traction. a sonic approach to criminology records and depicts sounds of harm in ways that bring audiences into the experience. sound is felt through the body; bodies are in sound as much as sound works through, on, and with bodies. a benefit of foregrounding sonic criminology is that it evokes the sensory immediacy of sound as aesthetic knowledge. the environment of donkey rehabilitation becomes fuller when it is realized through sonic embodiment. evoking the sensuous sounds of a particular place — whether those sounds evoke revulsion and horror, seduction and affirmation, silence or laughter — is crucial to the construction of criminological documentaries. criminological sounds are active, affective experiences that provide audiences with a crucial interpretive key.
sonic criminology can be understood as the production of aural information as aesthetic knowledge in its own right rather than as supplementary material to aid visual and textual representation.
conclusion
this article has used the documentary sanctuary as a case study to discuss four methodological techniques that aid the production of sensory scholarship in documentary criminology: ( ) learning to attend; ( ) continuous long take; ( ) sensory reliance; and ( ) sonic communication. i have explained and shown how sanctuary employs these techniques to position its audience to sensuously understand the distressing rehabilitative process donkeys undergo while recovering from abuse. the four techniques outlined in this article are in no way definitive and remain malleable and invitational to an open-ended exploration of lived experience. these methodological techniques allow criminologists to craft sensory experience that is dynamic, engaged, and attuned to the sensibilities emerging from direct encounters rather than pre-existing prescribed rules. the articulation and rendering of these methodological techniques establishes flexible expectations of what documentary criminology entails. the formation of a criminological documentary occurs through actual practice and ongoing encounters in the field. the researcher’s approach is attuned to the rhythms, dynamics, and ambiguity of lived experience. the experiential nature of this approach sets documentary criminology fundamentally apart from other, more textually-based criminologies. as a sensory-based methodology, documentary criminology seeks to answer the question: how can we know and relate to harm? this question is relational and sensuous rather than textual or verbal; answering it necessitates an epistemological reconfiguration of the criminological imagination. methodologically, documentary criminology answers this question by enhancing our understanding of sensory-based harm.
in this article, we saw this approach played out particularly through video methods, used to advance our understanding of donkey rehabilitation as sensory scholarship in the anthropocene epoch. this article has also borrowed extensively from green cultural criminology’s open-ended practices to re-wild methodologies, inviting feral approaches to mobile encounters that allow researchers to move with the flux of sensory experience. the four techniques outlined in this article intersect with green cultural criminology’s interest in ethnography and the crafting of media to engage popular audiences. by forging connections between green cultural criminology and video methods, documentary criminology can create sensory scholarship that re-imagines the relationship between popular audiences and academic researchers. sensory documentaries are currently being disseminated on a range of popular platforms, including itunes, netflix, and movie theaters. and here is precisely where i’d like to highlight two critiques of documentary criminology (though of course more exist). the first shortcoming of documentary criminology is its emphasis on longitudinal form. for example, not everyone can spend five years making an ethnographic documentary. how do academics overcome this limitation while also preserving the integrity of their research? saunders ( ) cites the emergence of digital technology as a transformative ‘practice’ in disciplines among the landscapes of higher education. practices, according to saunders ( : ), are “routine behaviours derived from a personal or collective knowledge base.” an ongoing conundrum is how to enact new practices from and within ongoing practices in criminology (see ferrell ). 
in an increasingly reductive academic infrastructure that examines the ‘bottom line’ of research output in terms of numbers and text inside an audit culture, efficiency, and new managerialism that overextends and degrades academics (becher and trowler : ; daniels and thistlethwaite : ) – and that leaves out whole swathes of curriculums and disciplines (saunders : ) – where do researchers who practice documentary criminology fit into this neoliberal model? daniels and thistlethwaite ( : ) suggest the future of researchers will be a bricolage of practices. “digital media technologies make it easier to create hybrid projects across fields that are typically separate. the future of being a scholar will include more blending of academia, journalism, and documentary filmmaking …” (daniels and thistlethwaite : ). it will shift to digital models of communication and modes of digital scholarship thereby presenting scholars with amazing “new opportunities to do their work in ways that matter to wider publics…being a scholar in the digitally networked classroom means guiding students to new knowledge and helping them become lifelong learners” (daniels and thistlethwaite : ). digital media technologies, when implemented as a research sensibility, allow scholars to depict, disseminate, and re-imagine knowledge as sensuous to enhance textual scholarship. ferrell ( : ) has come closest to defining this emerging digital media as ‘instant ethnography’. instant ethnography “denotes an ethnography of moments and ephemeral meanings and in so doing confronts yet another conventional assumption underlying the sense of ethnographic method as a totalizing enterprise: the notion that durable social groups and situations are to be studied through enduring ethnographic research.” sometimes documentaries combine immediate situations with longitudinal media immersion, interpretively recording the immediacy of crime, harm, and transgression over the course of time. 
participating in and intimately observing fluctuating activities allows researchers to explore and depict immediate experience as ‘instant ethnography’ while retaining inflections of dynamic experiences with audiovisual immediacy, mediated verstehen, and intimate attunement. ferrell, hayward, and young ( : - ) further elaborate on instant ethnography by drawing upon cartier-bresson’s notion of ‘decisive moment’ as a technique to depict the expressive significance of an event. it offers researchers insight to “say something significant about the world the image encapsulates” (ferrell, hayward and young : ). documentary criminology and instant ethnography overlap in ways that invite further methodological explorations as open-ended practices that contribute to novel research techniques as a sensibility. a second criticism of documentary criminology is from a commodified perspective of the ‘knowledge industry’. how can an anti-consumerist/anti-capitalist green cultural criminological project be squared with the fact that research films will be sold on itunes, netflix, amazon and other digital platforms and are thus commodified products? indeed, i have demonstrated how and why documentary criminology is compatible with the ethos and theoretical frameworks of green cultural criminology in spite of the dissemination techniques and distribution outlets chosen by documentary criminology. it is important to find possibilities as well as limitations. the tools of video ethnography and documentary filmmaking are technologies of the culture industries that help produce and disseminate popular knowledge in civil society. we as criminologists and knowledge producers (in the culture and knowledge industries) occupy nuanced contradictory positions: we are consumers, we are consumed, and we make consumable goods. let me be clear: there is no space of non-commodification in academia.
books, journal articles, and documentaries are all part of the culture and knowledge industries, and by extension the crime industry too. media can be a way of translating complex ideas in everyday life in contradictory ways. media production demonstrates cronon’s ( ) proposition of humans’ construction of and presence in nature. vannini ( : ) has demonstrated how the growing popularity of popular documentaries distributed on netflix, itunes and so on can humble and teach researchers a lot about their role in the public sphere and how to reach different popular audiences beyond text. for example, text-based criminological articles, chapters and books are commodified products sold on amazon, itunes, and so forth – and so are movies. movies have distribution companies and so do academic publications. movies and academic publications are branded, packaged and sold as a commodity; both are consumed by their recipients; and both generate income for the distributor and publisher – some more so than others. documentary criminology can hold contradictory positions: it can be anti-consumerist while also functioning as a commodity to undo harm (see redmon ). the question for my approach to documentary criminology is how to tap into existing modes of dissemination in order to further make research available – whether it’s free on youtube or purchased on netflix or itunes.
as vannini ( : ) states, “more than ever before hybrid tv makes it possible—not easy, but at least possible—to reach a wide, diverse, documentary-savvy, and potentially socially conscious audience thirsty for entertaining and intelligent ethnographic content.” it is my conclusion that a shared goal of green cultural criminology and documentary criminology is to bring video ethnography “out of the rigid disciplinary and methodological debate following its inception by outlining its potential to reach multiple publics and inspire dialogue among audiences beyond the academy” (taggart and vannini : ). documentary criminology’s feral tendencies, methodological advantages, and unique practice-based sensibility offer a new and exciting way to advance this goal in spite of the above limitations but especially because of its possibilities. works cited andrist, l. chepp, v. dean, p. miller, m ( ) “toward a video pedagogy: a teaching typology with learning goals,” teaching sociology, vol. ( ) - . back, l ( ) ‘live sociology: social research and its futures,’ in back, les and puwar, nirmal, eds. ( ), live methods. malden: wiley-blackwell. back, l and puwar, n ( ), a manifesto for live methods: provocations and capacities. the sociological review, : – . baskin, j ( ) paradigm dressed as epoch: the ideology of the anthropocene. environmental values ( ) [february]: - . becher, t and trowler, p ( ) academic tribes and territories. buckingham: open university press. bennett, j ( ) vibrant matter: a political ecology of things. durham, nc: duke press. bonneuil, c., and jean-baptiste fressoz ( ) the shock of the anthropocene: the earth, history and us. london: verso. brisman, a and south, n ( ) ‘a green-cultural criminology: an exploratory outline’, crime, media culture, ( ): - . brisman a and south n ( ) green cultural criminology: constructions of environmental harm, consumerism, and resistance to ecocide. london: routledge. 
brisman, a, bill mcclanahan, and nigel south ( ) toward a green-cultural criminology of “the rural.” critical criminology ( ): - . brisman a, south n and white r (eds) ( ) environmental crime and social conflict: contemporary and emerging issues. farnham: ashgate. campbell, e ( ) ‘landscapes of performance: stalking as choreography,’ environment and planning d: society and space, ( ), - . campbell, e ( ) ‘transgression, affect and performance: choreographing a politics of urban space,’ british journal of criminology, ( ), - . chura, d ( ) i don’t wish nobody to have a life like mine. beacon press. boston. cronon, w ( ) “the trouble with wilderness; or, getting back to the wrong nature” ( ), accessed november , . http://www.williamcronon.net/writing/trouble_with_wilderness_main.html daniels, j and thistlethwaite, p ( ) being a scholar in the digital era: transforming scholarly practice for the public good. bristol: policy press. davies, francis, p, wyatt p (eds) ( ) invisible crimes and social harms. london: palgrave. dewey, j ( ) art as experience. new york: penguin books. donovan, t ( ) feral cities: adventures with animals in the urban jungle. chicago: review press. fagan, b ( ) the intimate bond, how animals shaped human history. bloomsbury press. ferrell, j. . kill method: a provocation. journal of theoretical and philosophical criminology, vol ( ). ferrell, j ( ) “disciplinarity and drift,” in mary bosworth and carolyn hoyle, editors, what is criminology? oxford, uk: oxford university press. ferrell, j ( ) ‘cultural criminology and the politics of meaning’. critical criminology, : – . ferrell, j, hayward, k, and young, j ( ) cultural criminology: an invitation. london: sage press. ferrell, j ( ) ‘we never, never talked about photography’: documentary photography, visual criminology, and method’, forthcoming in routledge international handbook of visual criminology. london: routledge.
fraser, c ( ) re-wilding the world: dispatches from the conservation revolution. picador press, new york. goffman, e ( ) asylums: essays on the social situation of mental patients and other inmates. new york: anchor press. hamilton, c., christophe bonneuil and françois gemmene, (eds), ( ) the anthropocene and the global environmental crisis: rethinking modernity in a new epoch. routledge: nyc. karel, e ( ) interview with ernst karel, by mark peter wright / february , : https://earroom.wordpress.com/ / / /ernst-karel/ katz, j ( ) seductions of crime. new york: basic books. kohm s and greenhill p ( ) ‘this is the north, where we do what we want’: popular green criminology and ‘little red riding hood’ films. in south n and brisman a (eds) routledge international handbook of green criminology. london and new york: routledge. macdonald, s ( ) american ethnographic film and personal documentary: the cambridge turn. berkeley: university of california press. macdougall, d ( ) the corporeal image. princeton: princeton university press. manning, e and massumi, b ( ) thought in the act: passages in the ecology of experience. minneapolis: university of minnesota press. marks, l ( ) the skin of the film: intercultural cinema, embodiment, and the senses. durham, nc: duke university press. michalowski, r ( ) "keynote address: critical criminology for a global age." western criminology review ( ): - . monbiot, g ( ) feral: searching for enchantment on the frontiers of rewilding. penguin press, london. pattberg, p, and fariborz zelli, (eds), ( ) environmental politics and governance in the anthropocene: institutions and legitimacy in a complex world. nyc: routledge. pretty, j ( ) 'the consumption of a finite planet: well-being, convergence, divergence and the nascent green economy.' environmental and resource economics, ( ). pp.
- . redmon, d ( ) mardi gras made in china, carnivalesque films. redmon, d ( ) beads, bodies, and trash. nyc: routledge. ruddiman, w ( ) the anthropogenic greenhouse era began thousands of years ago. climatic change ( ): - . rust, s, monani, and cubitt, (eds) ( ) ecocinema theory and practice. routledge, new york. saunders, m ( ) “transformations from without and within the disciplines” in tribes and territories in the st century: rethinking the significance of disciplines in higher education, edited by paul trowler, murray saunders and veronica bamber. london: routledge. shearing, c ( ) ‘criminology and the anthropocene’. criminology & criminal justice, vol. ( ) – . sniadecki, j ( ) ‘chaiqian/demolition: reflections on media practice,’ visual anthropology review, : – . sobchack, v ( ) the active eye: a phenomenology of cinematic vision. quarterly review of film and video, ( ): – . sollund, r ( ) ‘the victimization of women, children and non-human species through trafficking and trade’ in south n and brisman a (eds) the routledge international handbook of green criminology. london: routledge. south, n and brisman a (eds) ( ) the routledge international handbook of green criminology. london: routledge. south, n ( ) ‘anticipating the anthropocene and greening criminology’. criminology & criminal justice, vol. ( ) – . steffen, w, jacques grinevald, paul crutzen, and john mcneil ( ) the anthropocene: conceptual and historical perspectives. philosophical transactions of the royal society a [ march]: - . taggart, j and vannini, p ( ) ‘life off-grid: considerations for a multi-sited, public ethnographic film’, in video methods; bates, c., ed.; routledge: new york, ny, usa, ; pp. – . vannini, p. . “ethnographic film and video on hybrid television: learning from the content, style, and distribution of popular ethnographic documentaries,” journal of contemporary ethnography, vol. ( ) – .
vannini, p ( ) ‘video methods beyond representation: experimenting with multimodal, sensuous, affective intensities in the st century’, in video methods; bates, c., ed.; routledge: new york, ny, usa, ; pp. – . young, a ( ) ‘the scene of the crime: is there such a thing as ‘‘just looking’’?’, in hayward, k and presdee, m, eds, framing crime: cultural criminology and the image. oxford: routledge.
journal of religion, media and digital culture volume , issue (december ) http://jrmdc.com
imaging sacred artifacts: ethics and the digitizing of lichfield cathedral's st chad gospels
bill endres, university of kentucky, usa
keywords: digitization, ethics, sacred artifacts, ethnography, medieval manuscripts, cultural heritage artifacts, digital humanities
abstract
this essay examines complexities that attend digitizing a cultural heritage artifact that is sacred to a contemporary community. it argues that scholars must first determine how the artifact participates in the life of its community. if this participation is integral, scholars should treat the artifact as a present-day cultural phenomenon, inseparable from its community. to explain the implications of this shift, the author turns to ethnography, which has a lengthy tradition of interacting with communities for generating research. photographing a sacred artifact is not unlike other ethnographic research, whether tape recording stories, collecting documents, or gathering information about social practices. to guide digital work, the essay proposes ethnographic ethical principles, demonstrating their value in digitizing the th -century st chad gospels at lichfield cathedral, england—supporting jamie bianco's recent call for an "ethical turn" in the digital humanities.
about the author
dr bill endres is assistant professor at the university of kentucky, affiliated with the department of writing, rhetoric and digital studies.
he is a visual rhetorician and digital humanist, and his research focuses on illuminated manuscripts made in the british isles between and ce. his research agenda has three main focuses: digitizing manuscripts, exploring digital representation, and studying how early medieval monks organized visuals and invented conventions for images and text to interact dynamically and portray knowledge. dr endres has published on the book of kells and d representation, and the chronicle of higher education has covered his work. his website "manuscripts of lichfield cathedral" (lichfield.as.uky.edu) allows visitors to view the first d renderings of an illuminated manuscript on the web.
to cite this article: endres, b., . “imaging sacred artifacts: ethics and the digitizing of lichfield’s st chad gospels”, journal of religion, media and digital culture ( ), pp. - . [online] available at: http://jrmdc.com/papers-archive/volume- -issue- -december- /.
introduction
in , i had the privilege of digitizing lichfield cathedral's th -century gospel-book, the st chad gospels. the st chad gospels (fig. ) is one of the most significant illuminated manuscripts to survive in the british isles: it is considered a national treasure. however, the lichfield community considers it to be far more important than a cultural heritage artifact. this great gospel-book connects the lichfield community to its founding in the th century. as lichfield cathedral's canon chancellor pete wilcox explains, ‘one of the things that makes it so precious to us is it unites us to our patron. [...] there is only a cathedral in lichfield because of st chad, and this artifact links us within a generation to chad’ (howard, ).
figure . luke's portrait; mark's incipit
indeed, a sacred artifact that unites a community with its founding saint is a treasure. such an artifact generates a centrifugal force far greater than most historical artifacts. the st chad gospels' material presence contains the community's originating energy. this energy has sustained itself over the last thirteen hundred years—through viking invasions, the gospel-book's likely theft, the anglican church's twice separation from the roman catholic church, civil wars, world wars, and economic downturns. its presence represents what it means to maintain a christian community through centuries of struggles. in contrast, academics value sacred artifacts for their contributions to the larger intellectual and artistic history of a people. this value is emphasized by labels such as cultural heritage artifact, a concept that originates from the united nations educational, scientific and cultural organization (unesco) and its convention of . this designation detaches a sacred object from its immediate community and resituates it within the context of world culture and history, and suggests that rights to an artifact belong to all human beings. as a consequence, when digital scholars justify their projects through designations like cultural heritage, the immediate community's concerns and needs tend to become relegated to a secondary status. this study explores what happens when a scholar approaches digitizing a historic and sacred artifact from the perspective of a digital ethnographer: as a present-day cultural phenomenon inseparable from its local context. such a perspective reverses the impetus that generally motivates digital efforts: it puts the community first.
by doing so, such an approach invites ethnographic ethics to become a transforming force for digital scholarship, building upon a call in the digital humanities (dh) for an “ethical turn”. ethnographic ethics can contribute to such an “ethical turn”, providing insights to guide digital scholars' interactions with a community, opening communal knowledge to a project, and avoiding opportunistic ventures that produce uninspired scholarship and fall well short of a community's needs.
digital scholarship and an emerging "ethical turn"
dh has yet to engage in a larger conversation about ethics (spiro, ; svensson, a), prompting jamie bianco ( ) to call for an “ethical turn”. this is not to claim that dh scholars have acted unethically. traditional disciplinary ethics regularly guide digital activities that fall under the umbrella term “digital humanities”. however, digital activities regularly fall outside the realm of a traditional discipline's previous reflections on ethical actions. for example, the statement of professional ethics by the modern language association ( ) has little to do with digital scholarship and what makes dh distinct, such as its focus on interdisciplinary collaboration and research activities beyond the university (terras, nyhan and vanhoutte, ). likewise, the mla guidelines for evaluating work in the digital humanities and digital media are silent on ethical practices for interdisciplinary collaboration and engaging communities outside academia when pursuing digital scholarship. furthermore, methodologies like crowdsourcing (causer and wallace, ; doan, ramakrishnan and halevy, ; blanke, hedges and dunn, ; brabham, ) reverse longstanding academic notions of the scholar as expert who generates specialty knowledge and disseminates it.
statements of ethics have little chance to stay up with an evolving dh that promotes paradigm shifts in scholarly identity and practices yet has difficulty in defining itself (terras, nyhan and vanhoutte, ; gold, ). bianco's call is twofold. first, bianco calls for scholars of dh to examine the power dynamics that have emerged within the burgeoning field, dynamics that remain relatively unscrutinized. she does not accept tom scheinfeldt's ( , p. ) claims of dh's “niceness” and the field's scholars as ‘the golden retrievers of the academy.’ for bianco ( , p. ), the dynamics of power within dh need to be interrogated—particularly when dh computer code and computational methods ‘operate through a web of politics, people, institutions, and technics in a network of uneven, albeit ubiquitous, relations.’ bianco's call supports scholarly inquiries like tara mcpherson's ( ) “why are the digital humanities so white? or thinking the histories of race and computation.” second, bianco calls for creative critique as a digital and ethical practice for dh. she views this approach as exposing the power dynamics of code and computational methods, adding a layer of transparency to digital scholarship. for her notion of creative critique, bianco ( ) draws on the digital art of sharon daniels, who uses critical theory as a dimension in her artistic expression. daniels' ( ) “public secrets”, which explores women in prison, has inspired discussions about the role of academic activism in dh (svensson, ). for example, alex juhasz ( ), also inspired by daniels' work, views the digital as ‘what was needed to push more scholars to engage with the personal and political implications of their practices’, a slow transition that he had already perceived happening in traditional scholarship. 
i wish to answer bianco's call in a slightly different manner. operating through willard mccarty's ( ) notion of dh as a “trading zone”, i want to borrow from ethnography's rich tradition of scholarly thought about ethical principles when engaging communities to produce research (jones and watt, ; atkins et al., ; hammersley and atkinson, ). i want to use this knowledge to inform digital scholarship on sacred artifacts. mccarty recognizes that dh has always operated de facto ethnographically; i wish to make this de facto operating more mindful and productive. from years of experience and reflection on research in communities, ethnographers have developed ethical principles to guide their activities (jones, ; brunt, ; degan, ; faubion, ; macdonald, ). these principles informed my digitization of the st chad gospels and can productively inform other digital projects with sacred artifacts. by design, ethnographic ethical principles help ethnographers to think through issues connected to varying social contexts, research agendas, methods, and outcomes (jones and watt, ; atkins et al., ; hammersley and atkinson, ). they protect the interests of a community, but they also protect and promote the integrity of research and its outcomes (jones and watt, ; atkins et al., ; hammersley and atkinson, ). by tapping ethnography's rich tradition of engaging communities to produce research, digital scholars have an opportunity to take advantage of already established disciplinary knowledge. such knowledge, rethought in a digital context, provides digital scholars with insights into structuring projects, a means to think through complex social situations, and a metric to inspire project outcomes—irreplaceable for honoring a sacred artifact as a present-day cultural phenomenon.
such borrowing from the realm of ethics has already begun. digital scholars have leveraged feminist ethics to gain insights for re-inventing digital practices and outcomes. working from feminist theory, scholars are asking if it is enough simply to curate women's writing and build digital archives (wernimont, ; earhart, ; flanders and wernimont, ; roundtree, ). for instance, working from ellen rooney, jacqueline wernimont argues that digitally archiving women's writing requires an intervention. wernimont ( , para. ) states, ‘editing everything won't move us much further along in the effort to end oppression of women if we don't use those editorial opportunities to recenter the role of women's writing in historical and contemporary debates about gender, sex, ethics, and the social dynamics of power.’ by mobilizing feminist ethics, wernimont ( , par. - ) is able to rethink the archive and propose strategies for using “interpretive markup” to address longstanding issues related to women writers being left out of larger historical and cultural debates. feminist scholars are demonstrating the importance of employing ethics to question historical values that otherwise become embedded in archives and perpetuated by dh. ethical questions fold easily into dh's rich tradition of building as thinking, making such building more dynamic and truly transformative (drucker, ). jacqueline wernimont demonstrates how feminist ethics can inspire innovation, asking scholars to respond to the world before them. however, some scholars in dh, like lisa spiro ( ), believe pursuing a statement of ethics is too restrictive. spiro argues for a statement of values. her reasoning is that ethical guidelines limit and prescribe behaviors. spiro ( , p.
- ) believes dh ‘needs a values statement to articulate its mission’, create a coherent identity, and set priorities for the burgeoning field. however, ethical principles are not necessarily restrictive. their main demand is presence. they function as a tool to think through implications of this presence, situated within a specific social context and motivated by a specific research agenda. these principles can do for digital scholars what they have done for ethnographers: guide interactions, protect the interests of a community, promote long-term relations, facilitate innovation, protect the integrity of research, and transform outcomes (jones and watt, ; hammersley and atkinson, ).
an ethnographic ethical frame: principles to guide the demands of a sacred artifact's context
addressing ethics does not necessarily mean generating a rigid ethical code. martyn hammersley and paul atkinson ( , p. ) point out that ‘legitimate action on the part of researchers is necessarily a matter of judgment in context.’ for such judgments, ethical principles are needed that function like heuristics. to make their point, hammersley and atkinson ( , pp. - ) discuss informed consent within four different approaches for judging actions: actions defined as ethical or not; actions judged in context; actions judged relative to the values held by the researcher (relativism); and ethics defined as irrelevant to certain types of research. because the nature of informed consent required for research can dramatically vary, depending upon what is being studied and its community, hammersley and atkinson ( ) show that judgment in context is generally the most productive approach to determine ethical action.
for example, studying racial harassment within a repressive regime will likely require a different level of informed consent than studying members of a democratically elected government (hammersley and atkinson, , p. ). julie scott jones and sal watt ( ) take a similar position in ethnography in social science practice, with the slight difference of describing ethical judgment originating out of a cultivated ethnographic sensibility. while ethics can take the form of prescribed standards (first and last scenarios), the complexities of social interactions and their context require flexibility; otherwise, potential harm to participants becomes an increased possibility and the integrity of the research is compromised. on the other hand, ethical relativism is rarely an acceptable position for a field (brown and dobrin, ). elizabeth murphy and robert dingwall ( ) mention one further benefit for using ethical principles over ethical codes: their relationship to research methods. methods affect the research and participants, needing consideration in any discussion of ethics. as murphy and dingwall ( , p. ) point out, stringently adhering to ethical codes ‘may not give real protection to research participants but actually increase the risk of harm by blunting ethnographers' sensitivities to the method-specific issues which do arise’. to act ethically and protect participants, scholars need constantly to assess the dynamics of their research in context as it unfolds. ethical principles cultivate presence and sensitivities to these interactions. hammersley and atkinson ( , p. ) remind scholars that ‘research [in communities] cannot be programmed, that its practice is replete with the unexpected’. to address judgments in context, ethnographers divide ethics into two types of principles: deontological and consequential. deontological principles focus on inherent rights, and consequentialist principles focus on outcomes (murphy and dingwall, ). 
the inherent rights of deontological principles normally include privacy, self-determination, and justice—where justice refers to treating research participants equally (avoiding favoritism). privacy and self-determination come into play with informed consent and covert research. in what can seem the most innocent research, questions emerge about how much information to divulge when gaining consent. information can have unexpected and negative consequences. for instance, an ethnographer found that research designed to improve health care for a disabled child became difficult after information was given for informed consent, because parents became obsessed with proving themselves to be good parents rather than revealing needs in social service (murphy and dingwall, , pp. - ). social dynamics make research in communities full of unanticipated events, and ethical principles are needed to guide thinking and generate productive counteractions as research unfolds. consequentialist principles have the most to offer digital scholars. focusing on the outcomes of research in a community, these principles cover doing no harm and beneficence (reciprocity) (murphy and dingwall, ; skeggs, ; hsu, ). harm can usually be recognized, but beneficence has a substantial range. while digital projects often claim that digitally preserving a sacred artifact is inherently an identifiable benefit for a community, such a claim is not self-evident. for example, if digital images are stored in an archive and the only way to view the images is visiting the archive, then beneficence is minimal at best. an example of beneficence is found in the research of digital ethnographer wendy hsu ( ). she studied asian american musicians and their uses of social media to promote their bands.
hsu became intrigued by one particular south asian american punk band. in studying their social media exchanges, she generated a map of global communities in which the band had a strong following. she shared this research and her map with the band, engaging in an act of beneficence. hsu ( ) states, ‘my gift in exchange for the band's time spent with me, in this case, was marketing analytics that displayed evidence of their global following.’ bronislaw malinowski, considered by many to be the father of ethnography, once wrote, ‘preconceived ideas are pernicious in any scientific work, but foreshadowed problems are the main endowment of a scientific thinker’ (qtd. in hammersley and atkinson, , p. - ). ethical principles are a way to use experience and reflection to provide scholars with a sense of foreshadowed problems. for a digital project, the foreshadowed problem of beneficence is how the digitized materials will meet the needs of an artifact's community. such a question invites contemplation, conversations, and attention to communal needs. finding answers to the question, however, can transform digital projects and their outcomes.
epistemology and respect
before i turn to my digital project with lichfield cathedral, a note on epistemology is needed. in many ways, traditions of learning within spiritual organizations generate correspondences that can encourage a digitization of a sacred artifact. many religious communities, including lichfield, have leaders with advanced degrees. in the anglican tradition, some of these church leaders have become leading experts on the items in their cathedral libraries, like dean h. e. savage of lichfield cathedral's work on the st chad gospels.
the digital humanities actually began with scholarship on sacred texts, when in the s father roberto busa approached thomas watson of ibm to discuss using computers to generate a lemmatized concordance for the writings of st thomas aquinas (hockey. ). however, these correspondences do not preclude tensions between academic and religious communities. differences in epistemology, values, and goals can generate differing perspectives. the modern day university had its origins in the scientific revolution of the enlightenment. during this period, people reacted against the powers of the church and its control over culture and knowledge. to free intellectual activity from this control, the enlightenment turned to individual rationality, authorizing people and their use of reason to generate and validate knowledge (pagden, ; edelstein, ). such a staunch belief in the individual and reason stood in stark contrast with a religious, community-based epistemology, one centered in faith and tradition. while neither academia nor religions are the same today as during the enlightenment, these legacies continue in various forms and awareness of potential differences in interests is needed. complicating matters, two further epistemic tensions exist within academe. first, traditional humanities scholars have been skeptical of digital humanists, concerned that positivistic epistemologies based on mathematical models shape the scholarship and rob it of significant context that defies numerical representation. according to these critics, humanistic knowledge demands critical thought and interpretation (ramsay, ; edwards, ). while at first this appears a valid critique, it does not recognize that critical thought and interpretation make interventions at various places within any kind of scholarship. this current essay argues for one such intervention, the need for ethnographically-derived ethical principles to be included as part of the critical framework for digital projects. 
second, digitization can be viewed as an act of violence, converting sacred manuscripts into numbers, bits and bytes, 1s and 0s. a digitized sacred manuscript is no longer a sacred manuscript: it is a dataset. this is the method of positivism and objectivism, demanding that knowledge be derived from logic and mathematics, information that can be objectified. but this is also the moment for a humanistic intervention, to understand what of the culture needs to be re-inserted into the process and product. what is lost or transformed when a sacred artifact is converted into 1s and 0s and rematerialized on the screen? a digital version of a sacred artifact needs its community. it needs to be honored as a cultural heritage item but viewed as a present-day cultural phenomenon, inseparable from its local community. its cultural heritage designation is just one facet of its identity. this attitude restores the artifact to its status as a living presence, as something vibrant within a community. maurice merleau-ponty ( , p. ) warns, ‘science manipulates things and gives up living in them.’ ethnography and its ethical principles keep the community in a digital humanities project, and they keep the relationship between people and artifacts alive.
the lichfield cathedral community and its sacred artifact
situated in the english midlands, about two hours north of london by train and about fifteen miles north of birmingham, lichfield is a city of approximately , people (lichfield city council, ). once the ecclesiastical center for the powerful medieval kingdom of mercia, lichfield has slipped into relative obscurity.
in modern times, routes chosen for major modes of transportation bypassed lichfield, beginning in the eighteenth century with canals and followed by nineteenth-century rail and twentieth-century motorways. lichfield has relied economically on the industrial base of birmingham. however, birmingham and the midlands suffered from an economic downturn in the second half of the twentieth century, with industrial jobs leaving the area (crafts and woodward, ). in , lichfield regained international attention. approximately four miles from lichfield cathedral, terry herbert unearthed pieces of what would turn out to be the largest anglo-saxon hoard ever found (mercian trail partnership, ). the hoard contains more than , items and includes religious artifacts such as gold crosses, as well as items used for war, like sword hilts decorated exquisitely with gold and inlaid garnets. the hoard reminded the world of lichfield’s past prominence, when the midlands and beyond were ruled by powerful kings such as penda, offa, and Æthelbald, who gained control of london (brown and farr, ). around ce, about fifty years after the hoard's estimated burial, monks made the st chad gospels (brown, ; stein, ; alexander, ). it is a latin gospel-book, containing the complete gospels of matthew and mark and the beginning of luke. the remainder of luke and the whole of john appear to have been in a second volume, likely lost either during the sixteenth-century reformation or seventeenth-century english civil war (savage, ). the namesake of this great gospel-book, st chad, served as an early bishop of mercia (bede, ). when named bishop, he moved his episcopal see to lichfield, establishing the medieval city as a significant christian center (brown and farr, ; bede, ).
upon st chad's death in ce, the city became a destination for pilgrims (brown, ; bede, ). bishop hedda built a new church in , likely to accommodate st chad's shrine and these pilgrims. the st chad gospels was perhaps made to honor st chad, just as the lindisfarne gospels had been made to honor st cuthbert (brown, ; henderson, ). while the place of the st chad gospels' making is subject to debate, a preponderance of evidence supports lichfield. however, the st chad gospels' past is filled with uncertainty. it spent some of its early years in wales. how it arrived in wales is a mystery, but it likely was stolen for the precious metals and jewels that decorated its cover, the manuscript discarded once the cover was removed (brown, ; james, ; savage, ). as recorded in the annals of ulster, the ninth-century book of kells suffered a similar fate (bambury and beechinor, ). the theft of the st chad gospels' cover would explain why the preliminary materials are missing and why the first surviving page is the opening of matthew's gospel, the page worn in such a manner as to suggest that it served a substantial time as the manuscript's front (brown, ). furthermore, the surviving volume of the st chad gospels ends with luke : , an unusual place to intentionally divide a gospel-book. marginalia chronicle the st chad gospels' time in wales. these marginalia contain the oldest surviving examples of old welsh writing, including one entry recording gelhi trading his best horse for the manuscript and presenting it to the church of st telio at llandeilo fawr (brown, ; jenkins and owen, ; stein, ; alexander, ). in the tenth century, the manuscript returned to lichfield, again under unknown circumstances. 
perhaps when the manuscript was stolen and the cover torn loose, part of the manuscript was discarded near lichfield and the cathedral recovered it, providing evidence of ownership. when relations improved between wales and mercia, perhaps such evidence helped to secure the first half of this great gospel-book's return (savage, ). in the past, the old welsh marginalia encouraged scholars to consider a welsh origin for the manuscript (scrivener, ; dafydd jenkins and morfydd owen have made rare later arguments for a welsh origin ( )). however, other substantial evidence makes this possibility unlikely. the entry about gelhi trading a horse for this great gospel-book suggests that the manuscript was stolen—otherwise, it would not have been in secular hands (brown, ; james, ; savage, ). also, connecting the st chad gospels to northumbria (upper northern england) and ireland are its decorative correspondences with the lindisfarne gospels, durham cassiodorus, and book of kells (brown, ; henderson, ; stevick, ; stein, ; savage, ). the first three bishops of mercia were irish, and st chad spent time studying in ireland on suggestion from his mentor, aidan, the irish monk who was the first abbot at lindisfarne (brown, ; bede, ; savage, ). likewise, st chad reflects lichfield's strong bonds with northumbria: st chad was born there, studied at lindisfarne, and later served as bishop of york before being named bishop of mercia (bede, ). in , with the discovery of the lichfield angel, the possibility of a welsh origin for the st chad gospels diminished further. the angel likely formed part of the shrine that housed st chad's bones. its color palette mirrors that of the st chad gospels, a color palette unique amongst surviving insular gospel-books (brown, ).
while a welsh origin is unlikely, sacred artifacts are like beloved children, whether adopted or not. in wales, the st chad gospels is still cherished and an indelible part of welsh history, recording medieval life at llandeilo fawr in its marginalia and inspiring welsh religious and artistic practices.
entry story
in , i visited lichfield cathedral to study the st chad gospels. this great gospel-book had never had a printed facsimile made, so its study required a visit to lichfield cathedral library. both the canon chancellor, the revd. dr. pete wilcox, and the cathedral librarian, pat bancroft, thought that the cathedral had digital images for nearly the whole manuscript from the st chad gospels' loan to the british library in . they planned to set me up on a computer with the images, since conservation practices limited access to the manuscript. however, the british library had photographed only targeted pages for a version of the st chad gospels for its turning the pages software, which emulates the turning of a page on a computer screen or kiosk. these images are inseparable from this application, so i arrived at lichfield to learn that the cathedral did not have high-resolution images for the st chad gospels. however, mrs. bancroft could supply me with " x " black and white photographs. the cathedral had these photographs from roger powell's rebinding of the manuscript in . i spent prolonged hours sitting on a hardwood chair, hunched over, and regularly peering through a magnifying glass, working through the images. i took copious notes, gathering information for my research. little did i know that i was demonstrating the need for scholarly access to high-resolution images of the st chad gospels (howard, , p. a ). furthermore, i was being interviewed for my digital project by mrs.
bancroft and revd. dr. wilcox. my time in that hardwood chair was testimony to my commitment to and relationship with the st chad gospels, a relationship different from that of the members of the cathedral community but a relationship all the same. this relationship was the building of my trust relations with the cathedral that would lead to my digital work.
beneficence in digital work and my project at lichfield cathedral
lichfield cathedral had five desired outcomes for digitizing the st chad gospels, each having a different relationship to the ethnographic principle of beneficence:
• increase access and study of the st chad gospels
• acquire color photographs of the whole manuscript
• determine how well the st chad gospels is aging
• recover any erased text on page (perhaps information about provenance)
• digitally preserve the photographs in a digital archive
from a digital humanities' perspective, beneficence seems nearly automatic for a digital project like mine: access to the st chad gospels was limited as part of its conservatory strategies. furthermore, supplying the cathedral with digital images for the complete st chad gospels easily appears a benefit for the community. however, high-resolution images generate their own complications. while digital photographs provide access for scholars and members of a community to explore a sacred artifact, technological requirements and expertise can severely restrict digital access. if beneficence is to be an outcome, the sheer size of a dataset and its images must be managed. for the st chad gospels, i generated over , files and a half terabyte of data. this quantity of data is necessary for digital preservation and recovering information from worn, water-damaged or erased areas on pages.
one means to recover visual data is through multispectral imaging, photographs taken using different spectral bands of light, from near ultraviolet to infrared. images captured with ultraviolet and infrared light are adept at revealing information not normally visible to the unaided human eye. generally, i took photographs using thirteen different spectral bands. however, because ultraviolet light interacts with vellum and darkens it over time, i used ultraviolet light only for pages that required some type of recovery. therefore, the number of photographs for a page varies from twelve to thirteen. the multispectral images are roughly megabytes (produced with a mega-pixel monochrome camera). with twelve or thirteen images per page, multispectral photographs generate about . gigabytes of data for each page—the st chad gospels has pages. to produce colored photographs, images taken with red, green and blue bands of light are combined. human vision is trichromatic, and one trichromatic model for generating colors that humans can see is based upon red, green and blue bands of light. this model is efficient for electronic devices, like digital cameras—thus, the common designation of a color digital photograph as an rgb image. the color images for the st chad gospels are roughly megabytes each. therefore, the dataset for the st chad gospels consists of impressively large files that accumulate into a half terabyte of data. to engage this dataset, someone needs a substantial computer and sophisticated graphics editing software. but part of the rich data for my project also includes 3d images. prior to the eighteenth century, the st chad gospels suffered from water damage, which caused severe cockling (wrinkling of the pages). in , roger powell ( ) flattened the pages and rebound the manuscript, but the st chad gospels' pages are vellum (calfskin).
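the band combination described above, in which separate red, green and blue captures are merged into one color image, can be illustrated with a minimal sketch. it is not the project's actual processing pipeline: the function name `combine_bands_to_rgb` and the tiny 2 x 2 sample values are hypothetical, and a real workflow would also need registration, color and exposure correction.

```python
def combine_bands_to_rgb(red, green, blue):
    """Merge three monochrome captures (lists of pixel rows) into rows of (r, g, b) pixels."""
    if not (len(red) == len(green) == len(blue)):
        raise ValueError("band images must have the same number of rows")
    rgb_rows = []
    for r_row, g_row, b_row in zip(red, green, blue):
        if not (len(r_row) == len(g_row) == len(b_row)):
            raise ValueError("band images must have the same row length")
        # Each output pixel takes its three channels from the same position in each band.
        rgb_rows.append(list(zip(r_row, g_row, b_row)))
    return rgb_rows


# Hypothetical 2 x 2 captures (0-255 intensities), not data from the project.
red = [[255, 0], [0, 10]]
green = [[0, 255], [0, 20]]
blue = [[0, 0], [255, 30]]

rgb = combine_bands_to_rgb(red, green, blue)
```

the sketch treats each band as a grid of intensities and simply pairs them position by position, which is why the three captures must share the same dimensions.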
nearly thirteen-hundred-year-old vellum is hygroscopic, absorbing moisture readily from the air; therefore, cockling gradually returns because any absorbed moisture encourages pages to return to their prior state. in case cockling had returned, i took 3d images so pages could be digitally flattened. the richness of 3d data also provides further opportunities for preservation and presentation of a manuscript on the web. for instance, 3d data makes it possible to develop a tool for measuring any aspect of a page. these measurements can be compared in the future to provide valuable information about how the st chad gospels is aging. for example, aspects like holes in the vellum can be measured and compared to determine if they are increasing in size and in need of special care. lost chips of pigment can be measured to assess losses and predict future trends. for my digital project to meet the ethnographic challenge of beneficence, i needed to transform the way in which access to the data from a digital project is delivered over the web. this goes beyond supplying lichfield cathedral with all of the post-processed images that have been color and exposure corrected, embedded with metadata, logically organized and easily accessible on an external hard drive that works with a british power supply. while this provides the cathedral with the images to use as they see fit, a cathedral generally does not have the available expertise and resources to present a half terabyte of data on the web for its community. in dh, a benchmark for providing images is the archimedes palimpsest project (netz et al., ), a project that involves a secular text overwritten three hundred years later with a liturgical one, the manuscript currently in private hands.
the owner of the palimpsest has generously made all of the images available for download, including color and multispectral images. however, downloading and viewing these images requires technical skill, software, and computing power, not to mention the aid of a fast internet connection. this solution makes sense for the archimedes palimpsest project. its managers (netz, et al., ) are clear about the project's intent: make its images available for researchers through a “public archive”. such a means to access images is acceptable for researchers but not for a sacred artifact's community. to overcome these challenges, i developed and adapted various available technologies. they include what i refer to as an overlay viewer for the rgb and multispectral images for each page, historical image overlays for viewing how the st chad gospels is aging, and 3d renderings of sixteen of the st chad gospels' pages. this development work would not have been possible without the infrastructure of a university. but the technical challenge is only half of the issue of access and beneficence. intellectual access is also needed. during imaging, i had the unexpected opportunity to engage members of the cathedral community. a group of members, affectionately known as the “babysitters”, signed up for shifts to accompany the st chad gospels during its imaging, as representatives of the cathedral. each took a two-hour shift, some more than one, sitting in the dark while the led lighting flashed through its multiple bands of light time and again. while the imaging ran, i talked with them. they asked questions about the st chad gospels, and i pointed out various features and informally interviewed them. i asked them about the cathedral, the community, and the staffordshire area.
beyond the enjoyment, these interactions enabled me to connect to community members. i listened to tales of riding ponies in the field where the staffordshire hoard was found, and witnessed the joy in the face of lyneth, one of the “babysitters”, when i asked if she would like to hold the st chad gospels when technical difficulties required the removal of the manuscript from the book cradle. my interactions gave me irreplaceable insight into the community's relationship to the st chad gospels and the types of questions members had about this great gospel-book. these interactions directed the design of the website for the st chad gospels. they supplied me with insights into the type of information community members would need to explore the great gospel-book more completely. one way i supplied this information was by creating a features section that lists pages for each gospel, major decoration, marginalia, dry-point glosses, and a range of other significant aspects of the st chad gospels. each feature has a link to its page. for access to images, i provide thumbnails for each page, labeled with information about the page's chapter, verses, and features of interest.

overlay viewer: organizing and making nearly a terabyte of data accessible

if my digitization of the st chad gospels was to have benefits for the community at lichfield cathedral through the web, i had to organize the data and make it accessible to members of the community who might not have strong computer skills, a powerful computer, specialty software, or a fast internet connection. barriers to access have been a consistent issue with the web, created by lack of infrastructure, technical expertise, and cost of computers and software.
members of the cathedral community share these issues with a large number of people, including academics. outside of dh, the research that many scholars engage in does not require substantial expertise with computers, and in the realm of teaching the same holds for many educators. i share with the lichfield community the desire for easy access to and use of images. having technical expertise does not mean that i wish to employ it constantly. i wanted a digital presentation of images on the web that efficiently meets my research and teaching needs as well as those of other scholars. for example, the value of multispectral images is in overlaying them for comparisons. by overlaying multispectral photographs and adjusting the top image's opacity, a scholar can compare visual information captured by different bands of light, especially those that provide visual information not seen by the unaided human eye. this allows scholars to explore concealed aspects of a manuscript. beneficence suggests that members of the lichfield community have the same opportunity as scholars to explore these secrets. downloading and accumulating roughly fourteen images into a single file for the pages of the st chad gospels (using graphics software like photoshop) is neither practical nor efficient for interested members of lichfield cathedral, or for most scholars. each single file with its fourteen layers would amount to about . gigabytes. to solve this problem, i developed an overlay viewer that performs the two necessary functions of organizing the images and making them accessible. on my website for the cathedral's manuscripts, the nearly four thousand images of the st chad gospels are organized at the viewer level. pages are arranged in chronological order, displayed as thumbnails with the beginning and ending chapter and verse listed for each page.
this is an aid to understanding what each page contains, since the st chad gospels is in latin and the insular majuscule script is initially a little difficult to read. when a visitor clicks a thumbnail, the viewer opens with the rgb image. at the top of the viewer, two drop-down lists present selections for any two images of a page, one for overlay, its opacity adjustable through a slide-bar (fig. ).

figure . overlay viewer with drop-down list for selecting images of a page and slide-bar in the middle to adjust transparency of the top image. page – erasure in middle area.

the overlay viewer allows members of the community to explore pages and possibly recover text, as on page . when imaging the st chad gospels, page was of special interest to the “babysitters”. they were curious as to what this erased area might contain, including whether it would have information that would suggest an origin for the manuscript in lichfield or wales. the erased area is below the display script and above the marginalia, where smudges appear. its vellum exhibits areas where a knife seems to have scraped writing from the page. infrared images allow for a view deeper into the vellum. ultraviolet darkens areas where any residual ink might remain. overlaying combinations of ultraviolet, infrared and rgb images and adjusting transparency provide opportunities for recovering any erased text. unlike with the archimedes palimpsest project, a visitor does not need to download images for a page and use graphics software to combine them into a single file for comparison. my web viewer performs this function.
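the overlay comparison described above amounts to alpha blending: each displayed pixel is a weighted mix of the top image and the image beneath it, with the slide-bar setting the weight. the following is a minimal python sketch of that idea, assuming two aligned grayscale images held as equal-sized lists of rows with values from 0 to 255. the function name blend_overlay and the sample values are illustrative assumptions and are not taken from the viewer's actual code.

```python
def blend_overlay(top, bottom, opacity):
    """Blend two aligned grayscale images (lists of rows, values 0-255).

    opacity is the weight of the top image: 0.0 shows only the bottom
    image, 1.0 shows only the top image.
    """
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0.0 and 1.0")
    if len(top) != len(bottom) or any(len(t) != len(b) for t, b in zip(top, bottom)):
        raise ValueError("images must have the same dimensions")
    return [
        [round(opacity * t + (1.0 - opacity) * b) for t, b in zip(top_row, bottom_row)]
        for top_row, bottom_row in zip(top, bottom)
    ]


# Hypothetical 2x2 samples standing in for an infrared image (top)
# and a visible-light image (bottom) of the same page area.
infrared = [[200, 100], [50, 0]]
visible = [[0, 100], [150, 200]]
half = blend_overlay(infrared, visible, 0.5)
```

moving the opacity from 0.0 toward 1.0 in such a scheme gradually replaces the lower image with the upper one, which is what makes differences between spectral bands easy to spot.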
a visitor simply selects the desired images from the drop-down lists and adjusts the transparency as desired for the top image. the viewer makes nearly , images for the st chad gospels accessible and interactive. because bandwidth can still vary for web access, the viewer automatically loads a mid-sized image, which is x pixels, with a smaller image ( x pixels) available by clicking the minus sign. a full-sized image is available by clicking the plus sign; full-sized images are x pixels. to speed access, images are sliced and loaded in segments of x pixels.

historical image overlays

one of the challenges in caring for an early medieval manuscript is managing its preservation. inks, pigments, and vellum age slowly and subtly. discerning these subtle changes is difficult if not impossible without exacting visual information and a means to compare this information across time. providing information to lichfield cathedral about the aging process of the st chad gospels generated further beneficence from my digital project, similar to the global map of fans generated by the research of wendy hsu ( ) and shared with the asian american punk band. for lichfield cathedral, the st chad gospels' aging is of special interest because of its continued but restricted use. the st chad gospels is the oldest known gospel-book still serving its original purpose, albeit in extremely limited roles. up to six times a year, the st chad gospels is removed from its climate-controlled case and carried in a procession, such as during christmas day mass (howard, ). while extreme caution is taken and the roles tightly restricted, information about the aging of the st chad gospels contributes to decisions about these uses.
as part of the ethnographic principle of beneficence, information about the aging of the st chad gospels is important to balance against the risks taken by the cathedral in the digitization effort. ethnographers recognize that a community and/or its members generally assume more risk than the researcher in a research project, and part of the notion of beneficence is to balance any risk taken by a community and its members (murphy and dingwall, ). for instance, photographing the st chad gospels created added opportunities for mishaps to befall the manuscript. but beyond these risks, imaging places strains on a manuscript. for example, opening a page stresses the cohesion between pigment and vellum: layers of pigment are rigid and vellum is flexible, arching one way, then the other when a page is turned. imaging increases a manuscript's exposure to ultraviolet light, which damages vellum, slowly and through accumulation. such stresses were managed and minimized. imaging happened in a dark room and used led lighting that generated only the wavelength of light necessary for any one image. nonetheless, imaging still added age to the st chad gospels, even if aging was minimal and the change unnoticeable. providing the cathedral with information about how the manuscript is aging is an important counterbalance for the risks undertaken by the community and the stresses that imaging placed on the st chad gospels. one more risk bears mentioning. lichfield cathedral does not receive government funds for support of the st chad gospels or its cathedral. when the cathedral agrees to an imaging project and shares its treasured manuscript, there is no guarantee that it will not hurt the revenue stream connected to it.
residing in the english midlands, the st chad gospels does not have the advantages of a large city, like the ninth-century book of kells at trinity college in dublin, where drawing visitors is easy. although some digital humanists and computer scientists claim that digitization leads to wider exposure for a manuscript and to economic gain through visitors and requests for commercial uses, i do not know of any research that substantiates this claim for a cathedral. any exposure leading to economic gain would depend upon countless variables and circumstances, varying on a case-by-case basis. simon tanner and marilyn deegan ( ) have done excellent scholarship on the benefits for museums and libraries of digitizing their collections. unlike lichfield cathedral, these institutions usually receive government funding and have large collections that have a variety of possible appeals, but the benefits that tanner and deegan identify are still likely to apply in small ways to lichfield, primarily in the area of ethos rather than financial support. providing additional ascertainable benefits is a way to meet the requirements of the ethnographic principle of beneficence and ensure that a cathedral and its community receive a tangible benefit for their risks. to gauge how the st chad gospels is aging, i researched past photographic efforts, gained access to their photographs, digitized them, aligned them in photoshop, and compared them. on the manuscripts of lichfield cathedral website, i have placed historical images in my overlay viewer for nine of the most significant and telling pages of the st chad gospels. the viewer allows a page to be compared across time. the historical image overlays include images from six photographic efforts:
– photographs published in f. h. a. scrivener's codex s. ceaddae latinus
– photographs taken under unknown circumstances
– photostat copy of the complete st chad gospels produced by the national library of wales, aberystwyth
– photographs taken during the st chad gospels' rebinding by the conway library, courtauld institute of art, london
– photographs taken by the british library, london, in a cooperative effort with lichfield cathedral and llandeilo fawr
– digital images taken through my efforts with the college of arts & sciences and college of engineering, university of kentucky

while these earlier efforts were sporadic and only three photographed the whole manuscript, they provide visual information to identify intervals when a change in the st chad gospels occurred. on the whole, the st chad gospels is aging well. however, a few pages show trends of pigment loss. one of them is luke's incipit, page . in the historical images, the trend of pigment loss continues from to the present; areas of loss since are marked with curved brackets (fig. ).

figure . luke's incipit,

in the fall of , i digitized further images from another photographic effort. in , the bildarchiv der abtei maria laach produced color slides for the major decorated pages of the st chad gospels. the photographs from the courtauld institute represent the manuscript immediately after the flattening of the st chad gospels' pages and before the manuscript was rebound. therefore, the slides provide significant visual information for assessing the effects of the flattening. cockling places stress on pigments as does the flattening process.
these images will provide rare information for scholars and conservators about the effects of flattening, not only helping the cathedral to understand the aging of its beloved manuscript but also helping scholars and conservators understand this process for other medieval manuscripts. on the manuscripts of lichfield cathedral website, i provide further information about trends in the st chad gospels' aging. i include an animated gif to show an area where chips of pigment have broken free on luke's portrait (see: https://lichfield.as.uky.edu/historical-images).

3d renderings

one of the most promising new technologies i used to digitize the st chad gospels was 3d imaging. as mentioned earlier, i captured 3d data in case cockling had returned and the pages of the st chad gospels needed digital flattening. however, the richness of 3d data provides further opportunities for preservation and presentation of a manuscript on the web. for example, because 3d data allows for measuring features of a page, such measurements can be taken and compared in the future to provide valuable information about how a manuscript is aging. the dynamics of 3d renderings also provide advantages over 2d images for scholarship, teaching, and viewing. 3d renderings introduce new ways to view and encounter a sacred manuscript on the web. with 3d, any point on a page can become an epicenter around which a page rotates. for instance, traditional western habits of reading move from left to right and top to bottom, as embedded in scroll bars for web browsers. but 3d opens new ways of interacting with and seeing a page of a manuscript.

figure . chi-rho – lines of sight as encouraged by reading habits and layout
figure . chi-rho – curved brackets mark four triskelia
for example, figure shows the lines of sight generated by western habits of reading and layout. traveling left to right, my line of sight follows the xpi (first three letters of the greek word for christ: chi, rho, iota) and the lower three lines of text. traveling top to bottom, my line of sight is dominated by the x, filling the left side of the page. western culture cultivates a reading from left to right, which then becomes reproduced in various types of artifacts, including books and paintings (kress and van leeuwen, ). in figure , i mark four triskelia, one of which is unique and has been overlooked by scholars, surprisingly so, because this page is one of the most photographed and printed in scholarship on the st chad gospels. triskelia are normally constructed with a three-pronged twirling geometric shape inside ( through ), following the celtic design (fig. ). however, the unique triskelion is constructed out of three heads of anglo-saxon-styled birds. i did not notice the uniqueness of this triskelion until i examined the page in 3d (fig. ).

figure . chi-rho – triskelion made using three heads of anglo-saxon-styled birds

i made a point near this triskelion an epicenter from which i rotated the page. the birds are worn, inviting the eye to see them as geometric shapes, as expected. however, 3d invites a different heuristic for seeing, one that encourages scholars to see a manuscript's feature anew, beyond expectations and habits of looking. i have written elsewhere about my efforts with 3d and its potential for studying manuscripts, examining issues such as digital representation and preservation, interaction, and heuristics of seeing (endres, ). one of the main challenges of 3d is accessibility.
however, with the advent of webgl, major browsers have built-in support, making 3d viable for delivery over the web and without the need for specialty software or downloads. at this writing, google chrome and safari have strong support, with other browsers at various stages of adding it. my goal for the 3d renderings of the st chad gospels is to offer an experience with a digital version of a manuscript that is more dynamic and interactive than is possible with 2d images. to facilitate this interaction and take advantage of 3d's rich data, i designed a number of tools for the 3d gallery. these tools enable a visitor to:
• generate a url for the exact position that a 3d image is manipulated into (a position that can be saved, sent to a friend or colleague, and used as a citation)
• zoom in and out, the resolution a healthy x pixels
• measure any feature of a page, allowing a measurement to be a line or polygon
• create and save an annotation that includes a note, measurement, and the position in which a 3d rendering has been manipulated, able to be reloaded later
• manipulate a 3d rendering by panning the camera or performing an alt + left click to make any point an epicenter around which a rendering rotates

the goal of the 3d renderings and digital space is interaction, engagement, and discovery. to ensure that the cathedral received as much benefit as possible from the 3d data, i converted all of the 3d files into adobe pdfs. these pdfs are viewable with the free adobe reader. adobe reader includes a collection of tools, like a measurement tool, which is precise and rivals any specialty software for 3d renderings. in the conversion, adobe slightly mutes colors.
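the line and polygon measurements mentioned above, and the idea of comparing a feature's size over time, can be illustrated with elementary geometry. the following is a minimal python sketch, assuming the measured points have already been projected onto a flat page plane with coordinates in millimetres. the function names and the sample outlines are hypothetical and do not reproduce the 3d gallery's or adobe reader's tools.

```python
import math


def line_length(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.dist(p, q)


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    vertices is a list of (x, y) points in order around the outline.
    """
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical outlines of the same hole traced at two different dates (mm).
hole_earlier = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
hole_later = [(0.0, 0.0), (5.0, 0.0), (5.0, 3.0), (0.0, 3.0)]
growth = polygon_area(hole_later) - polygon_area(hole_earlier)
```

a positive difference between two dated tracings of the same feature would indicate that it has grown. a real surface measurement would also need to account for the curvature of a cockled page, which this flat sketch ignores.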
even with muted colors, pdfs make the 3d renderings quickly accessible and easy to use, requiring one file instead of three (see my video on 3d: http://youtu.be/ebr hnnnnrs).

concluding thoughts

ethnographic ethical principles provide a means to understand and explore the digitization of a sacred artifact, its benefits to the sacred artifact's community, and a project's resulting digital representations of artifact and community. however, ethnographic principles have their greatest value before a digital project begins. they provide insight into the engagement between researchers and a community, requiring that a sacred artifact be viewed as a present-day cultural phenomenon regardless of its past. ethnographic principles require digital humanists to build a digitization effort around awareness and respect for community members and thoughtful beneficence for the community that will not be negated by technical requirements. one of the most important questions that ethnographic principles require to be asked is what level of beneficence is acceptable for a digitization of a sacred artifact. this is a question that must be answered for each digital project, dependent on the sacred artifact's community, the scope of the project and its technical expertise, and the context of the sacred artifact. i have attempted to demonstrate how ethnography and its ethical principles unfolded in my digital efforts with the st chad gospels and its community at lichfield cathedral. dh scholars are engaging people outside academe to produce research, in ways beyond their own discipline's reflections on ethics and research. ethnographers learned long ago the difficulties that occur when scholars engage communities to generate research. ethical principles are not meant to restrict behavior or limit academic freedom.
instead, they provide disciplinary wisdom, accumulated through reflection on years of experiences. they are designed to facilitate research, guide interactions, and protect communities along with the integrity of research. ethical principles are needed to transform dh, perhaps more so than digital technologies. they protect and cultivate human relations, the essence of the humanities.

acknowledgments: with gratitude to the lichfield cathedral community: prior canon chancellor pete wilcox, canon chancellor anthony moore, clare townsend, jo burkinshaw, alex nicholson-ward, chris craddock, simon ferguson, diana arthur, cheryl baxter, anita caithness, gary cox, margaret davies, mary harris, adam johns, brian jones, liz kendrick, lyneth lockwood, pat scaife, arleen trickett, and ann waller (to name a few), and in loving memory of pat bancroft. images reproduced with the kind permission of the chapter of lichfield cathedral.

an example of more traditional academic activism in dh is jim ridolfo and bill hart-davidson's samaritan archives project. this project recognizes the diaspora of the samaritan community and its texts. it ‘seeks to digitize thousands of ancient, sacred samaritan texts and make them available to the samaritan community’ (matrix, ; ridolfo, hart-davidson and mcleod, ). also, forthcoming in will be jim ridolfo's the digital samaritan, ann arbor: university of michigan press.
patrik svensson has borrowed this term from mccarty and put it to good use. see his "the digital humanities as a humanities project" ( ) and "beyond the big tent" ( ). mccarty credits peter galison ( ) for the term.
hammersley and atkinson ( ) present references to ethnographic research by different scholars who hold various views on these issues.
for a brief history of ethnography, see "origins and ancestors" by julie scott jones ( ) in ethnography in social science practice.
for a map of the english canal system, see uk government's maps of waterways. it shows canals along with other manmade waterways: <http://www.waterways.org.uk/pdf/wwwaterwaysmap>.
pete is now dean of liverpool cathedral.
unfortunately, pat has passed away.
for technical expertise, i am much indebted to noah adler, director of research computation and application development for the college of arts & sciences at the university of kentucky.
in , the bbc filmed the christmas service at lichfield cathedral. to view the role of the st chad gospels during this service, visit bbc christmas day workshop (only available in great britain) <http://www.bbc.co.uk/programmes/b grzmv/episodes/guide>.
i discuss 3d, its implications in seeing, and its potentials for scholarship in "more than meets the eye: going 3d with an early medieval manuscript" ( ).
for these 3d tools, i am grateful for the talent and expertise of noah adler and justin hall.

bibliography

alexander, j.j.g., . insular manuscripts th - th century. london: harvey miller.
atkinson, p., et al. eds., . handbook of ethnography. london: sage.
bambury, p. and beechinor, b., eds. and comps., . the annals of ulster. corpus of electronic text, university college cork, ireland, [online] available at: http://www.ucc.ie/celt/online/t a/ [accessed november ].
bauer, j., . “who you calling untheoretical?” journal of digital humanities ( ), [online] available at: https://journalofdigitalhumanities.org [accessed november ].
bede, . the ecclesiastical history of the english people. mcclure, j. and collins, r. eds. . cambridge university press, cambridge.
bianco, j., . “this digital humanities which is not one.” in: gold, m., ed. . debates in the digital humanities. minneapolis: university of minnesota press. pp. - .
blanke, t., hedges, m. and dunn, s., . “arts and humanities e-science: current practices and future challenges.” future generation computer systems, ( ), pp. - .
brabham, d., . “crowdsourcing as a model for problem solving: an introduction and cases.” convergence: the international journal of research into new media technologies, ( ), pp. - .
brown, m., . “the lichfield angel and the manuscript context: lichfield as a centre of insular art.” journal of the british archaeological association, ( ), pp. - .
brown, m., . “the lichfield/llandeilo gospels reinterpreted.” in: kennedy, r. and meecham-jones, s. eds. . authority and subjugation in writing of medieval wales. new york: palgrave macmillan. pp. - .
brown, m. and farr, c., . mercia: an anglo-saxon kingdom in europe. london: continuum.
brown, s. and dobrin, s., . “new writers of the cultural sage: from postmodern theory shock to critical praxis.” in: brown, s. and dobrin, s., eds., . ethnography unbound: from theory shock to critical praxis. albany: state university of new york press. pp. - .
brunt, l., . “into the community.” in: atkinson, p. et al. eds. . handbook of ethnography. london: sage. pp. - .
causer, t. and wallace, v., . “building a volunteer community: results and findings from ‘transcribe bentham.’” digital humanities quarterly, ( ), [online] available at: http://digitalhumanities.org: /dhq/vol/ / / / .html [accessed november ].
crafts, n.f.r. and woodward, n., . the british economy since . oxford: oxford university press.
daniel, s., . public secrets. vectors journal, ( ), [online] available at: http://vectors.usc.edu/projects/index.php?project= [accessed november ].
deegan, m.j., . “the chicago school of ethnography.” in: atkinson, p. et al. eds. . handbook of ethnography. london: sage. pp. - .
doan, a., ramakrishnan, r. and halevy, a., . “crowdsourcing systems on the world-wide web.” communications of the acm, ( ), pp. - .
drucker, j., . speclab: digital aesthetics and projects in speculative computing. chicago: university of chicago press.
earhart, a., . “recovering the recovered text: diversity, canon building, and digital studies.” presentation at dh in hamburg and slightly modified for talk at university of kansas. [video] oct. available at: https://www.youtube.com/watch?v= ui pijdreo [accessed november ].
edelstein, d., . the enlightenment: a genealogy. chicago: university of chicago press.
edwards, c., . “the digital humanities and its users.” in: gold, m., ed. . debates in the digital humanities. minneapolis: university of minnesota press. pp. - .
endres, b., . “more than meets the eye: going 3d with an early medieval manuscript.” proceedings of the digital humanities congress , [online] available at: http://www.hrionline.ac.uk/openbook/chapter/dhc -endres [accessed december ].
faubion, j.d., . “currents of cultural fieldwork.” in: atkinson, p. et al. eds. . handbook of ethnography. london: sage. pp. - .
flanders, j. and wernimont, j., . “feminism in the age of digital archives: the women writers project.” tulsa studies in women’s literature ( ), pp. - .
galison, p., . image and logic: a material culture of microphysics. chicago: university of chicago press.
gold, m. ed., . debates in the digital humanities. minneapolis: university of minnesota press.
hammersley, m. and atkinson, p., . ethnography: principles in practice. new york: routledge.
henderson, g., . from durrow to kells: the insular gospel-books - . london: thames and hudson.
hockey, s., . “the history of humanities computing.” in: schreibman, s., siemens, r. and unsworth, j., eds. . companion to the digital humanities. oxford: blackwell. pp. - .
howard, j., . “ st -century imaging helps scholars reveal rare th -century manuscript.” the chronicle of higher education, dec., pp. a , a .
hsu, w., . “digital ethnography toward augmented empiricism: a new methodological framework.” journal of digital humanities, ( ), [online] available at: http://journalofdigitalhumanities.org/ - /digital-ethnography-toward-augmented-empiricism-by-wendy-hsu/ [accessed november ].
james, p., . “the lichfield gospels: the question of provenance.” parergon ( ), pp. - .
jenkins, d. and owen, m., . “the welsh marginalia in the lichfield gospels, part i.” cambridge medieval celtic studies, (summer), pp. - .
jones, j., . “origins and ancestors: a brief history of ethnography.” in: jones, j. and watt, s. eds. . ethnography in social science practice. new york: routledge. pp. - .
jones, j. and watt, s. eds., . ethnography in social science practice. new york: routledge.
juhasz, a., . “digital humanities.” media praxis july [online] available at: http://aljean.wordpress.com/ / /digital-humanities [accessed november ].
kress, g. and van leeuwen, t., . reading images: the grammar of visual design. new york: routledge.
lichfield city council, . lichfield city council: statistics. office for national statistics – census. [online] available at: <http://www.lichfield.gov.uk/cc-statistics.ihtml> [accessed november ].
macdonald, s., . “british social anthropology.” in: atkinson, p. et al. eds. . handbook of ethnography. london: sage. pp. - .
matrix, . “samaritan archives project receives international press attention.” matrix, june , [online] available at: http://www .matrix.msu.edu/ / /samaritan-archives-project-receives-international-press-attention [accessed november ].
mccarty, w., . humanities computing. new york: palgrave.
mcpherson, t., . “why are the digital humanities so white? or thinking the histories of race and computation.” in: gold, m., ed. . debates in the digital humanities. minneapolis: university of minnesota press. pp. - .
mercian trail partnership, . staffordshire hoard, [online] available at: <http://www.staffordshirehoard.org.uk/> [accessed december ].
merleau-ponty, m., . “eye and mind.” in: johnson, g. ed. . the merleau-ponty aesthetic reader. chicago: northwestern university press. pp. - .
modern language association, . statement of professional ethics. mla, [online] available at: <http://www.mla.org/repview_profethics> [accessed november ].
downloaded from brill.com / / : : am via free access http://jrmdc.com/ http://www .matrix.msu.edu/ / /samaritan-archives-project-receives-international-press-attention http://www .matrix.msu.edu/ / /samaritan-archives-project-receives-international-press-attention p a g e | journal of religion, media and digital culture volume , issue (december ) http://jrmdc.com murphy, e. and dingwall, r., . “the ethics of ethnography.” in: atkinson, p. et al. eds. . handbook of ethnography. london: sage. pp. - . netz, r., et al., . “the digital archimedes palimpsest.” the archimedes palimpsest, [online] available at: <http://archimedespalimpsest.org/digital/> [accessed december ]. pagden, a., . the enlightenment. oxford: oxford university press. pidd, m., stubbs, e. and thomson, c. e., . “the hengwrt canterbury tales: inadmissible evidence?” occasional papers, , pp. - . powell, r., . “the lichfield st. chad's gospels: repair and rebinding, - .” the library, ( ), pp. - . ramsay, s., . “in praise of pattern.” text technology ( ), pp. - . ridolfo, j., hart-davidson, w., and mcleod, m., . “balancing stakeholder needs: archive . as community-centred design.” ariadne , [online] available at: <http://www.ariadne.ac.uk/issue /ridolfo-et-al> [accessed november ] roundtree, a., . “simulated visuals: some rhetorical and ethical implications.” digital humanities quarterly ( ), [online] available at: <http://digitalhumanities.org: /dhq/vol/ / / / .html> [accessed november ]. spiro, l., . “this is why we fight.” in: gold, m., ed. . debates in the digital humanities. minneapolis: university of minnesota press. pp. - . scheinfeldt, t., . “why digital humanities is “nice.”” in: gold, m., ed. . debates in the digital humanities. minneapolis: university of minnesota press. pp. - . downloaded from brill.com / / : : am via free access http://jrmdc.com/ p a g e | journal of religion, media and digital culture volume , issue (december ) http://jrmdc.com scrivener, f.h.a., . codex s. 
ceaddae latinus, evangelia sss. matthaei, marci, lucae ad cap. iii. complectens. cambridge: c. j. clay and sons. skeggs, b., . “feminist ethnography.” in: atkinson, p. et al. eds. . handbook of ethnography. london: sage. pp. - . stevick, r.d., . “the x crosses in the lindisfarne and lichfield gospels.” gesta ( ), pp. - . stein, w., . the lichfield gospels. ph.d. dissertation, berkeley: university of california. svensson, p., . “the landscape of digital humanities.” digital humanities quarterly ( ), [online] available at: http://digitalhumanities.org: /dhq/vol/ / / / .html [accessed december ]. svensson, p., . “the digital humanities as a humanities project.” arts and humanities in higher education . pp. - . svensson, p., . “beyond the big tent.” in: gold, m., ed. . debates in the digital humanities. minneapolis: university of minnesota press. pp. - . tanner, s., and deegan, m., . “inspiring research, inspiring scholarship: the value and impact of digitizing resources for learning, teaching, research and enjoyment.” jisc. available at: <http://www.kdcs.kcl.ac.uk/innovation/inspiring.html> [accessed december ]. terras, m., nyhan, j. and vanhoutte, e. eds., . defining digital humanities. burlington: ashgate. downloaded from brill.com / / : : am via free access http://jrmdc.com/ http://digitalhumanities.org: /dhq/vol/ / / / .html p a g e | journal of religion, media and digital culture volume , issue (december ) http://jrmdc.com unesco, . convention concerning the protection of the world cultural and natural heritage, oct. to nov. [online] available at: <http://whc.unesco.org/en/conventiontext> [accessed december ]. wernimont, j. ( ). “whence feminism? assessing feminist interventions in digital literary archives.” digital humanities quarterly ( ), [online] available at: <http://digitalhumanities.org: /dhq/vol/ / / / .html> [accessed november ]. 
canadian history blogging: reflections at the intersection of digital storytelling, academic research, and public outreach

tina adcock, keith grant, stacy nation-knapper, beth robertson, and corey slumkoski

journal of the canadian historical association / revue de la société historique du canada, volume , number ,
publisher: the canadian historical association / la société historique du canada
all rights reserved © the canadian historical association / la société historique du canada,
uri: https://id.erudit.org/iderudit/ ar
doi: https://doi.org/ . / ar

cite this article: adcock, t., grant, k., nation-knapper, s., robertson, b. & slumkoski, c. ( ). canadian history blogging: reflections at the intersection of digital storytelling, academic research, and public outreach. journal of the canadian historical association / revue de la société historique du canada, ( ), – . https://doi.org/ . / ar
abstract

this article surveys the impacts of blogging on canadian historical practice to date. drawing upon the experiences and practices of five collaborative or multi-author canadian history blogs (activehistory.ca, the otter~la loutre, findings/trouvailles, the acadiensis blog, and borealia), it explores how this activity is changing the ways in which canadian historians tell stories, publish their research, teach, and serve academic and wider communities. blogging has encouraged new forms of historical storytelling and the inclusion of underrepresented and marginalized voices in public discussions of canadian historical narratives. it is being integrated into cycles of academic publication and undergraduate and graduate classrooms. yet challenges remain with regard to determining the place and value of blogging within standard paradigms of academic labour. as more canadian historians come to read, write for, and edit historical blogs, however, they will not only help shift the practice of canadian history inside and outside university campuses, but will also experience the pleasures and rewards of this kind of digital historical work for themselves.

since the turn of the century, history blogging (or weblogging) has grown in volume and popularity. emerging and established canadian historians are increasingly publishing work in collaborative and multi-author blogs. like peer-reviewed journals, these blogs often focus upon specific geographical, thematic, temporal, or methodological areas of research. yet despite the proliferation of canadian history blogging in recent years, there has been little collective discussion about how this activity is reshaping the ways in which we research, write, publish, and teach canadian history.
as with other digital historical tools and platforms, blogs are producing innovative forms of scholarly writing and publication and modes of public engagement. as we embark upon this digital turn, we should also assess how blogging, along with other social media technologies, is transforming the historian’s craft.

this article surveys the impacts of blogging on canadian historical practice to date. it focuses upon the five collaborative or multi-author canadian history blogs with which the authors have editorial affiliations: activehistory.ca (beth robertson), the otter~la loutre (tina adcock), findings/trouvailles (stacy nation-knapper, tina adcock), the acadiensis blog (corey slumkoski), and borealia (keith grant). it discusses this activity’s relationship to four pillars of the historical profession (storytelling, publication, teaching, and service) and suggests some of the possibilities and pitfalls that have emerged so far from each of these intersections. we argue that blogging has encouraged new forms of historical storytelling and the inclusion of underrepresented and marginalized voices and perspectives in public discussions of canadian historical narratives. it is being integrated into cycles of academic publication and undergraduate and graduate classrooms. yet challenges remain with regard to determining the place and value of blogging within standard paradigms of academic labour.

we are heartened to note blogging’s growing traction in the canadian historical discipline, as symbolized most recently by activehistory.ca’s receipt of the canadian historical association’s public history prize in .
indeed, our decision to publish this article in the cha’s scholarly organ, instead of on our blogs, is a conscious act of outreach to colleagues who may not have had the opportunity to delve very deeply into the canadian history blogosphere. historians sometimes turn to interviews, op-eds, and other forms of popular media to disseminate their research beyond the academy. here, we travel in something like the opposite direction. we provide a snapshot of the present state of canadian history blogging in a traditional print journal, albeit in an issue published electronically, in order to bring these activities more fully into conversation with other contemporary developments in the canadian historical profession. we invite readers to follow the hyperlinks in our notes and to join the rich scholarly communities that have congregated around these digital seminar tables, or within these digital common rooms. as more canadian historians come to read, write for, and edit history blogs, they will not only help shift the practice of canadian history inside and outside university campuses, but will also experience the pleasures and rewards of this kind of digital historical work for themselves.

about the blogs

each of the five blogs represented here takes a unique approach to, or has a distinct purpose in, disseminating canadian history. one of the earliest multi-author blogs in this field, activehistory.ca, has provided a model for subsequent efforts. the founders of activehistory.ca were inspired by influential single-author blogs such as the one written by christopher moore, as well as international websites such as history & policy. determined to engage a wider public, they established the blog in , following the conference “active history: history for the future” held at glendon college in september of that year.
although some of the initial people are still involved, the editorial collective, collaborators, and contributors have evolved over time. the co-editors understand “active history” in various ways, as “history that listens and is responsive; history that will make a tangible difference in people’s lives; history that makes an intervention and is transformative to both practitioners and communities.” activehistory.ca develops this notion of active history through blogging. whether providing informed commentary on current events from a historical perspective, illuminating community-based research practices, or reflecting on public engagement, the website provides a platform for academics, public historians, archivists, museum professionals, and civil servants to engage in the potential of historical thinking.

the otter~la loutre, originally named nature’s chroniclers, is the blog of the network in canadian history and environment (niche). it began as a communication hub for participants in niche’s -year strategic clusters grant ( – ) from sshrc. from the website’s inception in , alan maceachern, adam crymble, and jim clifford invited canadian environmental historians to submit posts suitable for academic and non-academic audiences. at first, some of the material published online was reprinted in a separate digital newsletter that was sent to interested parties several times each year. joshua macfadyen oversaw the otter~la loutre’s transition to a fully web-based environment, and expanded its remit to include features such as a regular book review section and additional french-language content. since , niche’s website and blog have been run by an editorial collective of eight scholars, four of whom work closely with the otter~la loutre.
although the blog chiefly represents and serves the canadian environmental historical community, it also reaches out to canadian historians whose research and teaching concern broader relationships between canadians and their environments in the past. its readership extends past national borders, too. niche’s blog, website, and facebook and twitter feeds together comprise an important hub for environmental historians in north america, since there is no equivalent digital presence among american environmental historians. digital environmental historians from around the world also follow niche’s social media accounts and read the otter~la loutre.

douglas hunter and patrice dutil founded findings/trouvailles in the fall of to further the champlain society’s mission of increasing “public awareness of, and accessibility to, canada’s rich store of historical records.” the blog continues the society’s tradition of publishing historical “finds,” but does so in a digital format that complements the society’s characteristic red-bound editions. monthly posts focus on a specific textual, audiovisual, or material primary source, illuminating the content and context of this “find” and explaining how it enhances present understandings of canadian history. the original findings/trouvailles editorial committee included hunter, dutil, stacy nation-knapper, and tina adcock; it has since expanded to include eight editors. early contributors to findings/trouvailles included members of the editorial committee and of the society. the roll-call of authors is now more diverse, and the editors especially encourage early-career scholars to publish research finds in an accessible format that openly promotes enthusiasm for historical research.
the acadiensis blog was launched in to place the journal acadiensis at the forefront of digitally-engaged scholarly periodicals. although digitized back issues of acadiensis had long been available on the journal’s website, the website was redesigned and expanded in . journal co-editors john reid and sasha mullally turned increasingly toward social media as a form of publication, outreach, and promotion. they set up the acadiensis blog, a facebook page, and twitter feed for the journal, and recruited corey slumkoski to oversee these elements as the journal’s first digital communications editor. at the time this paper was submitted for publication, the blog had published approximately posts on a weekly schedule. new content is generally published every monday, and most thursdays feature a “throwback” piece wherein a link is posted to an article from acadiensis’ archives relevant to current events. for example, during the fort mcmurray fire of may , as many displaced maritimers returned home, the blog featured patricia thornton’s important article about outmigration from the maritimes.

borealia is a collaborative blog on early canadian history, broadly construed. it is a forum for historians of northern north america until approximately the end of the nineteenth century, encompassing indigenous, french, british, and early canadian national history. when denis mckim and keith grant launched the blog in , these related subfields were often diffused among several specialized conferences and journals. they wanted to create an online space to bring those various subfields together under an “early canadian” banner for collaboration and cross-fertilization, and to introduce this work to an interested general readership. borealia invites contributions from professional historians working in academic, public, and alt-academic settings and advanced graduate students.
to date, over historians, representing many career stages, have written for the blog. borealia has also developed into a forum for transnational conversations, especially between early canadianists and early americanists. what began as an accident of the editors’ personal networks and research interests has become an intentional editorial stance of seeking to put early “canada” in continental, atlantic, and global contexts. with half of the blog’s readers and a third of its contributors located in the united states, borealia’s editors now encourage essays that straddle the border and check nationalistic assumptions in the historiography of early north america.

blogging and storytelling

canadian digital historians have used the nimbleness, flexibility, and accessibility of blogging platforms to experiment with narrative form and tell complex, yet engaging stories en plein air to more-than-academic audiences. blogging has also created space for voices traditionally underrepresented in, or excluded from, canadian history narratives and their contemporary narration. it is already transforming and democratizing the field, although it retains the potential to reproduce existing structural inequities in the canadian academy.

bloggers often tell stories about canada’s past that are smaller in scope, though not necessarily less significant than those found in conventional venues of publication. most posts on collaborative or multi-author blogs are – , words long. the blog post’s brevity makes it an ideal form in which to engage with a single item or argument in depth: to explore a fascinating source culled from a journal article or book chapter, to linger over a methodological or historiographical problem, or to develop a stray thematic thread from a larger research project.
within the canadian history blogosphere, findings/trouvailles provides a unique venue for this kind of micronarrative. each post discusses and analyzes a source, but dispenses with the larger analytical apparatus that surrounds such sources in long-form historical narratives. by highlighting the thrill of the “find,” findings/trouvailles posts also reveal the affective dimension of historical research, or the excitement of unearthing jewels in the archives. such experiences are rarely discussed in academic publications, but can help attract non-academic readers to history blogs even as they educate them about this often-obscured facet of professional historical work.

while blog posts are nimble in scope, blogging platforms are equally nimble in their flexibility and celerity of publication. unlike the print medium, blogs can integrate non-textual and digitized sources easily. an image or video clip can become the centrepiece of a post, or can be used to supplement arguments advanced concurrently in traditional print venues. activehistory.ca has also experimented with “exhibits” that link blogging to less text-based conceptions of storytelling. furthermore, collaborative and multi-author canadian history blogs favour a condensed process of peer review. posts undergo at least one round of editorial intervention and subsequent revision prior to publication, but are usually not sent out for external peer review. blogs can therefore publish topical content quickly, enabling historians to provide expert, near real-time commentary on current events. within several days of the federal election, the acadiensis blog featured a maritime-centred analysis of the results, written by a well-known political scientist. publishing a similar piece in a peer-reviewed journal such as acadiensis would have taken much longer (perhaps over a year) and would have resulted in a piece far less timely.
even a non-refereed research note in the print journal would have taken some time to see the light of day, owing to acadiensis’ set biannual publication schedule.

for these and other reasons, blogs are an ideal vehicle for communicating canadian historical research to academics who do not specialize in this field, as well as students and members of the public. most canadian history blogs are intentionally oriented toward a hybrid audience of academic and non-academic readers. they enable historians to conduct conversations in public that are meant to be overheard. they are written in a more academic manner than popular history magazines, though like them, they strive for readable narrative and engaging style. as the editors at borealia like to tell their contributors, if the blog were a restaurant, it would be casual fine dining: professional, energetic, and accessible.

while all of the blogs discussed here fit this characterization, they employ different strategies to create such an ambience. findings/trouvailles’ editorial committee calls for submissions that are “informed, but not necessarily scholarly.” this encourages authors to write in a less formal, and often less formulaic manner that conveys what the editors hope will be a contagious enthusiasm for historical research. it also makes the products of that research more digestible to readers unfamiliar with the conventions of academic history-writing. while findings/trouvailles privileges enthusiastic storytelling, activehistory.ca challenges contributors to communicate complex historical ideas to a broader public. the editors maintain a deep commitment to theory and critical analysis, while eschewing jargon that might alienate those working beyond specific scholarly fields or academic institutions.
activehistory.ca’s posts speak to, but do not underestimate, their readership. the blog’s editors have also reached out to members of underrepresented and marginalized groups in the canadian historical profession and canadian society, offering them the space and freedom to tell stories that often stray from and usefully challenge the kinds of canadian historical narratives dominant in the public sphere. indeed, the canadian history blogs discussed herein are now striving to tell more inclusive histories, with respect to language, race, gender, and stage of career.

although the canadian history blogosphere, like the profession writ large, is largely english-speaking, some blogs have actively recruited francophone writers, sought out editors fluent in french, and published french-language content. undoubtedly the most successful has been histoireengagée.ca, activehistory.ca’s french-language sister site. like activehistory.ca, histoireengagée.ca endeavours to publish accessible, jargon-free posts that place contemporary issues and debates in canada and québec and on the world stage into dialogue with historical knowledge. although the two blogs operate independently, they maintain a close relationship. they periodically translate and share content, and their editors are striving to make this happen more regularly. findings/trouvailles and borealia have each published a handful of french-language pieces, and have simultaneously produced english translations to increase such posts’ uptake. the otter~la loutre’s experience demonstrates that sustaining bilingual publishing practices can be difficult over time, however. niche once had a french-canada coordinator who ensured the regular publication of french-language content on the otter~la loutre and on the now-defunct french-language blog qu’est-ce qui se passe.
they also sourced french-language content from elsewhere, and wrote english- and french-language posts about ongoing environmental historical research in québec and french canada. unfortunately, the otter~la loutre’s last french-language editor stepped down in , and the number of posts written in french has diminished accordingly.

in recent years, indigenous scholars such as chelsea vowel, erica violet lee, and zoe todd have maintained influential personal blogs where they regularly debunk longstanding myths about indigenous peoples and cultures and counter neocolonial renditions of canadian history. such conversations also happen on collaborative and multi-author blogs, where they often challenge editors’ and readers’ ideas about how canadian history should be told, who should do the telling, and what forms that telling should take. one concrete outcome of such discussions was activehistory.ca’s theme week dedicated to indigenous history, which gwich’in scholar crystal fraser guest-edited in january . this theme week has become one of the blog’s most widely-read examples of this genre. contributing authors did much more than insert indigenous stories into existing canadian historical narratives. as adam gaudry noted, a simple “add and stir” approach is neither sufficient nor unproblematic. contributors played with and contested eurocentric notions of history’s disciplinary boundaries, its accredited authors, and appropriate methods of conveying its knowledge. their perspectives on colonialism, dispossession, clean water on reserves, the truth and reconciliation commission, and other pressing topics demonstrate the potential of often-marginalized voices to intervene in and shift public and academic conversations about representations of the past.
activehistory.ca has become increasingly invested in communicating the histories of those underrepresented in such dialogues, including refugees, african canadians, people with disabilities, and sexual and gender minorities. in so doing, its editors consciously join others who present blogging as a viable means of drawing together academic, community, and embodied identities to make space for marginalized peoples within online public spheres. yet activehistory.ca’s editors remain all too aware of the inherent challenges of representing diverse voices and perspectives. they strive, if at times imperfectly, to seek out a range of scholarship, to make space for alternative and divergent ways of knowing and writing about the past, and to welcome and respond to productive critiques of their efforts in this vein.

canadian history blogging: reflections at the intersection of digital storytelling, academic research, and public outreach

arguably, this set of best inclusive practices is especially necessary in subfields such as environmental history that remain older, whiter, and more masculine than is perhaps now the norm in the canadian historical profession. to give just one example, the otter~la loutre and borealia’s joint series on early canadian environmental history included an introductory post with suggested readings on this topic. the paucity of women authors on that reading list led to some productive and generous dialogue, both in the comments beneath the post and on twitter, about gender and authorship in the subfields of early canadian history and canadian environmental history. like the editors of activehistory.ca, those of borealia and the otter~la loutre actively encourage multiple voices and approaches, but sometimes fall short of the mark, owing to structural reasons, personal myopia, or contingent circumstances.
happily, mistakes can engender useful conversations and renew editors’ aspirational commitment to equitable representation on their blogs.

finally, the place of non-tenure-track, alt-academic, and post-academic scholars in canadian history blogging merits consideration. blogging is one means by which these often exemplary writers and researchers can continue to intervene in scholarly debates and thereby participate in scholarly communities, to everyone’s benefit. however, blogging is usually unpaid labour; as with many aspects of the academic economy, bloggers are paid in kind rather than coin. editors should think especially carefully before asking non-tenure-track scholars, or those transitioning out of the academy, to work on the same terms as tenure-track or tenured scholars. as melissa gregg has argued, blogging may provide a means for emerging authors to contribute to what seems to be a radical democratization of knowledge in the web . world. but they often only do so when precariously paid — if at all — and thoroughly overworked. some simply cannot afford to do so. when the editors of findings/trouvailles recently approached a post-academic scholar about writing a post, they replied, quite reasonably, that they no longer wrote for venues that did not pay them. as in the case of other underrepresented or marginalized scholars, we should make our blogs welcoming places for such contributors, and should not hesitate to approach them. but we must remain mindful of such issues when we do so. while canadian history blogging has produced more flexible, responsive, and inclusive modes of storytelling in front of larger and more heterogeneous audiences, it is imbricated with existing academic structures of power and privilege and may work to entrench, rather than subvert, present inequities, despite blog editors’ best intentions.
blogging and publishing

no simple answer exists to the question of how academic history blogging relates to traditional print publishing. blogs can complement scholarly publishing in several ways, but the medium is also pushing scholarly publication toward a more open-access model. academic blogging can be situated at several points on what book historian robert darnton terms the “communications circuit.” darnton’s diagram maps the relationship between authors, publishers, consumers, and readers, highlighting the social production of knowledge. historians currently use blogging to gain feedback in the early stages of writing and framing a project, to draw attention to book or journal publications, and to connect with readers throughout the research cycle. we agree with julia martin and brian hughes’ definition of blogging as a form of publication that grants “scholars and researchers a more accessible avenue of discourse than peer-reviewed journals.”

blogging can help writers in the pre-publication stages of historical communication. writing to engage a broad public readership can help scholars become better writers and more effective communicators. the format also provides historians with an opportunity to try out ideas, write a preliminary analysis of a research find, solicit sources, or initiate an informed conversation with readers on the direction of a project, using the comments sections of blogs or linked social media accounts. we also know of contributors who have been approached by university presses about book contracts on the basis of well-written blog posts outlining their research project.

in the new hybrid ecology of scholarly publication, blogging, along with other forms of social media, plays a vital role in the circulation of traditional print publications.
canadian history blogs have developed various strategies to draw attention to new conventionally-published scholarship. at borealia, keith grant curates a twice-yearly preview of new books in early canadian history. similarly, at the otter~la loutre, alan maceachern posts an annual “booklook” featuring new monographs in environmental history, and other editors have written similar collaborative posts about new journal articles. every month, the otter~la loutre’s social media editor jessica dewitt selects the five best scholarly or popular articles about environmental history published online in the past month, and prepares a combined blog post and video interview with otter editor-in-chief sean kheraj called “#envhist worth reading.” inspired by and often in collaboration with the otter~la loutre, activehistory.ca has published pieces that highlight ten publications meant to help contextualize a pressing contemporary issue. some of these posts have focused on the idle no more movement, debates over vaccinations, or the truth and reconciliation commission’s impact on the practice of history.

several canadian history blogs also publish author interviews, book reviews, and essays on recent historiography, taking advantage of their accelerated editorial process to engage with new monographs and emerging trends in research months (or even years) before major print journals. acadiensis, for example, uses its blog to publish reviews of single books, while the print journal remains committed to longer review essays that assess multiple works. some reviews therefore appear on the blog long before they would be seen in print. that journals such as acadiensis treat their blog as an extension of the print publication suggests that the role of blogs and social media in the circulation of scholarship should be taken into account alongside traditional citations when measuring scholarly impact.
blogs associated with print publications not only aid the latter’s circulation, but can also provide an additional layer of online content, as mentioned above. as an initiative of the champlain society, an organization dedicated to publishing scholarly works and editions, findings/trouvailles occasionally features posts directly related to larger book projects. sandra alston’s piece on a mid-nineteenth-century “canadian ball” at rivière-du-loup provided a taste of the material in william ord mackenzie’s journal, a scholarly edition of which appeared under the champlain society’s imprint later that year. the acadiensis blog encourages scholars to contribute a blog post as a “movie trailer” for their research — a short, pithy piece that captures some of what their study has done, and which links to the longer-form article published in acadiensis. writing for an academic blog, then, can complement several stages of scholarly communication, from effective writing and intellectual exchanges about research to circulating print publications and enhancing them with digital content. yet blogs can also push beyond the boundaries of traditional publications.

blogging is one of the purest forms of open-access publishing, fuelled by an almost unmitigated desire to “engag[e] with a wider community.” there are no blackouts, paywalls, or embargos limiting access to blog posts. recently, sshrc has encouraged canadian scholarly journals to make their content more accessible to members of the public, who support research through their tax dollars. this process has often been fraught with difficulties as journals strive to protect the subscription bases upon which their economic viability rests. free from such financial considerations, canadian history blogs make their content available as a matter of choice. many blogs use some variety of creative commons licensing to specify how their content can be further distributed.
activehistory.ca, in particular, has strongly advocated for the free and unfettered provision of rigorous historical scholarship to the general public.

some of findings/trouvailles’ posts have responded directly to specific requests from different groups to make primary sources openly accessible. stacy nation-knapper wrote a post to honour an indigenous community’s request to make a source available to community members online while further research is conducted into the material’s broader significance. similarly, donald mcleod published a post during toronto’s lgbtq pride celebration to fulfill community members’ queries about a source related to this event and its history. in these ways and others, canadian history blogging has been a form of community-driven and -supported publication that enhances public access to archival sources and their interpretation by professional historians. as rohan maitzen writes, “blogging — free, accessible, interactive — restores immediacy to scholarly discussion, removes logistical roadblocks to knowledge dissemination, and up-ends the communication/validation hierarchy in favour of the open exchange of ideas. is that not what academic publishing is actually supposed to accomplish?”

in some cases, the line between blogs and traditional publications is blurring, creating new blended or hybrid genres with the potential to reverse or reconfigure usual trajectories of scholarly publication. the otter~la loutre has now published close to ten special series of posts on subjects as varied as hydroelectric dams, human-animal relationships, and winter in canadian history.
some series feature new and emerging research at the intersection of environmental history and another thematic field, including the history of science and technology, labour and working-class history, and the history of gender and sexuality. over time, otter editors plan to convert some of these series into free e-books to broaden the reach and maintain the accessibility of these posts, which will otherwise become increasingly difficult to locate on the blog as new content is heaped atop old. likewise, borealia has discussed with a university press the possibility of publishing thematic volumes in both e-book and print formats, similar to those produced in the university of calgary press’ canadian history and environment series.

we believe that enhancing access to historical research, whether published online or offline, can only benefit the profession. yet we recognize that peer review remains an essential, if sometimes contested, aspect not only of scholarly publication, but also of the processes of hiring, tenure, and promotion, as we explore below. canadian history blogs have only begun to experiment with peer review. originally, activehistory.ca’s “papers” section invited longer-form ( , – , words) research or evidence-based opinion essays, attempting to distinguish these more rigorous “papers” from typical posts. the peer-review process solicited feedback about an essay’s strengths and weaknesses and suggestions for its improvement, but also prioritized accessible writing and a streamlined publishing schedule. most peer-reviewed essays appeared online within four to six weeks of submission. over time, however, the line between blog posts and papers began to blur, making this process increasingly unclear and unfair to some authors. for example, editors occasionally ran timely long-form blog posts without peer review, while continuing to require that opinion essays undergo peer review.
in the recent revamping of activehistory.ca, the papers section has come to emphasize “features”; it now houses a broader cross-section of resources available on the site. these include blog series, classroom resources, and book reviews, in addition to long-form essays that are no longer peer-reviewed. the decision to reorganize activehistory.ca in this way was ultimately intended to clarify the kind of ambiguities detailed above, which, in turn, arise out of blogging’s evolving position within cycles of academic writing and publishing.

the group or multi-author blogs characteristic of contemporary canadian history blogging fall somewhere between individually-authored blogs and conventional publication venues. they are one form of the “middle ground” that julia martin and brian hughes call “small p publishing,” defined as “a space between peer-reviewed discourse and classroom discussion or less-formal academic writing.” martin and hughes argue that by “taking the ease of the blogging environment and adding some of the certification provided by traditional publishing, … this form of expression can become more professionally useful and incorporate more members of the academic and professional community.” whether canada’s collaborative and multi-author history blogs remain a mediating presence in the publishing “middle ground” or move increasingly toward the peer-reviewed end of the spectrum is yet to be seen. their present relationship to modes of traditional publishing tilts that activity in the direction of broader access, widening the circulation of canadian historical scholarship.

blogging and teaching

although twenty-first-century students’ “digital nativity” has been rightly critiqued in recent years, when confronted with a topic new to them, most students do naturally turn to google.
students are intimately familiar with digital genres of writing such as wikipedia entries, blog posts, and articles on popular journalistic websites and news aggregators. they may not, however, be critical readers of such content, and they may not have had much practice writing in these genres themselves. engaging with blogs as readers and writers hones and extends digital literacy skills introduced at earlier stages of students’ education. it also enables instructors to achieve learning objectives or educational goals common to post-secondary history courses, including those related to participatory learning, critical reading and writing, and the cultivation of civic-mindedness and empathy through exposure to the past. recently, editors of canadian history blogs have become aware that their materials are increasingly being used in secondary and post-secondary classrooms, and are beginning to provide pedagogically-useful resources for instructors based at such institutions.

instructors should not hesitate to assign blog posts as readings in their courses, despite their lack of peer review. we believe that the quality of the piece is far more important than the medium in which it appears. just because something has appeared on a blog doesn’t mean it can’t engage and inform students, just as the appearance of a piece in a peer-reviewed journal does not guarantee its admittance to syllabi. when selecting blog posts, the instructor may need to assume some of the professional responsibility normally shouldered by peer reviewers and journal editors. they should consider whether the piece is appropriate for use in their classroom, and may choose to supplement it with readings published in more conventional scholarly venues. by openly discussing these kinds of judgments and other issues surrounding blog posts in their classrooms, instructors can help students view the information that they read on their screens in a more critical light.
in high school, students tend to encounter digital literacy predominantly in the context of safety. in post-secondary settings, instructors can draw on students’ familiarity with blogs to initiate wide-ranging conversations about material published online and how to discern between reliable and less reliable resources found there. students can then build upon their existing digital literacy skills by engaging in more sophisticated debates about online content, sources, and the methodologies used to produce these. in many cases, these conversations complement those that many history instructors conduct regarding the judicious use of primary and secondary sources in print. they extend students’ budding analytical expertise concerning source material into the digital realm.

reading blog posts can also lead students to cultivate traits that are highly valued in historical and humanistic education, including empathy and civic-mindedness. posts published on canadian history blogs can encourage students to reflect critically upon their own experiences as canadians, and to situate those experiences within larger historical narratives and trajectories. this sharpens their sense of similarity and difference between the lives of past and present-day canadians, which, in turn, can instill empathetic, inclusive notions of “canadianness” across time and space. in tina adcock’s classroom, ian mosby’s posts about analyzing cookbooks as primary documents, and about his grandmother viewing cooking as drudgery, inspired students to discuss consonant experiences around food, gender, and labour in their own family histories. indeed, by featuring a lively writing style and content with contemporary relevance, blog posts help combat the elephant in all of our classrooms: the notion that canadian history is intrinsically boring. students appreciate it when instructors make an effort to choose interesting readings.
canadian history blogs offer a wealth of fascinating, yet academically rigorous perspectives upon our country’s past.

it is equally, if not even more, worthwhile for students to produce as well as consume digital pieces of writing. assignments that require students to blog, using academic blogs like those discussed here as their models, aid their cultivation of traditional history-writing skills. students practice framing an argument, citing sources appropriately, and situating their arguments in historiographical context, much as they would for a research paper. yet a blog post teaches students more naturally about style, narrative, and effective public communication than does a traditional essay, especially if the assignment or course has a public history component. equipping students with the facility to write clearly and well often figures prominently among our course outcomes, but it is a task for which we do not always provide a suitable medium. beth robertson, for instance, found value in simulating the blog-writing experience by creating reflective online exercises centred around different kinds of media in which students practiced writing critically-informed pieces for a broader audience. by adopting a mode of writing familiar to them as readers, but which they did not associate with academic learning, students seemed better able to wrestle with complex concepts.

reading and writing blog posts can also be incorporated into graduate coursework to good effect. graduate students preparing for comprehensive exams or striving to master particular fields ahead of writing theses can benefit from posts that introduce specific historiographical themes to a general readership, such as those hosted by the early american history blog the junto, or that provide learned discussions of fields.
blog-writing, meanwhile, may prove an even more beneficial exercise in graduate than undergraduate courses. early-career graduate students can all too easily use the linguistic and structural conventions of academic writing as a convenient crutch upon which to lean. by asking students to communicate academic findings clearly but rigorously for a mixed audience, instructors can help them find their own scholarly voice — an important step in their professional development. especially when given free rein with respect to topic and approach, graduate students seem to find the experience of blogging simultaneously enjoyable and useful.

as the connection between blogging and teaching strengthens, canadian history blogs have begun to provide resources for post-secondary instructors. these resources often complement those available on purpose-built historical education websites, such as that operated by the history education network/histoire et Éducation en réseau (then/hier). a growing number of teaching resources and education-themed posts have appeared on activehistory.ca in recent years. both the otter~la loutre specifically and niche’s website more generally offer resources for instructors of environmental history. in the mini-series “portrait of a country,” otter editor claire campbell highlights visual resources available online that can be used to teach important themes in canadian environmental history — the north, wilderness, and so on. in february , the otter~la loutre published a two-part post on the “greatest hits” of canadian environmental history, in which editors selected scholarly works published before that they still found useful in the classroom. niche’s website devotes a page to teaching materials, including suggested textbooks and sample syllabi for canadian and north american environmental history courses.
over at borealia, meanwhile, kathryn magee labelle has provided instructors teaching the pre-confederation survey with specific biographical content about often-overlooked indigenous actors.

although most of our knowledge about blogging and its integration into canadian history classrooms is anecdotal, more and more post-secondary instructors seem to be including blog posts on their syllabi and experimenting with assignments that incorporate blogging. some fruits of canadian history blogging, particularly those published by borealia, are finding their way into high school classrooms as well. the editors of borealia hope to learn more about the pedagogical niche that these posts occupy in such spaces, and to consider how they can intentionally support such uses in ways that respect and reflect the nature of the genre and its present place within the scholarly publishing ecosystem.

blogging, scholarship, and service

like scholars in other fields, canadian historians continue to debate the exact value and place of blogging in academia. should blogging count as scholarship, or service? as knowledge creation, or knowledge mobilization? does blogging ultimately hinder, or help the prospects of early-career academics applying for postdoctoral fellowships and tenure-track positions, and their more senior counterparts applying for tenure and promotion? if blogging can lead to junior scholars’ (further) marginalization within academe, more established scholars may wish to reconsider the wisdom and ethics of courting them as authors and editors. we consider blogging to be scholarship as well as service, but acknowledge that its recognition as scholarship depends largely on the culture of the department or university with which one is affiliated, or at which one is seeking to gain employment.
yet more and more scholars are now recognizing established collaborative and multi-author canadian history blogs as forums for publication as well as knowledge dissemination and public engagement. blog editors can help make canadian historians more familiar with and invested in this kind of history-writing. this may, in turn, help to solidify and naturalize blogging’s place within disciplinary rhythms of work and its assessment.

rohan maitzen has suggested that “academic blogging can and should have an acknowledged place in the overall ecology of scholarship.” while we concur with maitzen in principle, we believe that the more salient question is how to quantify blogging as a form of scholarship, especially given its neophyte status in the academy. there is no consensus regarding how to record blogging in curricula vitae, or how much weight to accord blog posts placed under the heading of publications. the scale of the form makes some scholars pause. although posts often originate from larger research projects, the time and effort expended in writing a blog post is obviously nowhere near that required to produce a peer-reviewed article or book chapter, let alone a monograph. moreover, given that blog posts do not undergo formal peer review, some scholars are understandably chary of recognizing them as fully-fledged works of scholarship. blog posts are perhaps akin to book reviews or op-eds: they count for something, but not for anywhere as much as a more substantive piece of research.

even as blogging constitutes a form of scholarship, it is also a vector for academic service and public outreach. blogs disseminate scholarly research far more widely than most peer-reviewed journals or academic monographs. corey slumkoski, for example, has enjoyed much greater engagement with the general public through his blog posts and stewardship of the acadiensis blog than through any of his peer-reviewed publications.
blogging may yet become a standard form of service within the historical discipline. as nancy janovicek pointed out during the discussion that followed the roundtable on which this article is based, many academics work in publicly-funded institutions or are otherwise supported by public funds. they have a responsibility to convey their research findings to the public that supports them. academic history blogging thus becomes a “means of accountability” to that audience. it is ironic that this singularly useful means of promoting our discipline to laypeople, including potential history students in an era of declining humanities enrollments, is still not consistently regarded as a meaningful contribution by our academic peers and colleagues.

given blogging’s tenuous place within academic hierarchies of labour and prestige, we believe that graduate students aspiring to traditional academic careers should carefully consider how to balance this kind of writing with the work of publishing peer-reviewed articles. this point must be made precisely because blogging, as a form of publication, can be really alluring. a blog post can be written in days rather than weeks or months. it requires fewer and less fundamental revisions than does a refereed article. the length of time between composition and publication is comparatively short, and writers receive near-instantaneous feedback in the form of comments, likes, and tweets. but blog posts are not substitutes for the kind of scholarly work that hiring committees and tenure and promotion committees expect early-career researchers to produce, and that takes substantial time and labour to do well. blogging may provide junior scholars with valuable exposure and networking opportunities, but it will not lead to a permanent academic post in and of itself.
early in one’s academic career, the energy and creativity needed to write blog posts should not hinder or replace the effort required to produce more conventional (and conventionally-rewarded) publications.

nor will editorial work on blogs necessarily help graduate students, precariously employed scholars, or untenured scholars secure a permanent position or promotion. for example, corey slumkoski accepted the editorship of the acadiensis blog not only because it would grant him an active role in the evolving world of digital history, including blogging, but also because it came with an editorial position with acadiensis, that of digital communications editor. he knew that even if his department’s rank, tenure, and promotion committee declined to consider his digital editorial work a meaningful contribution to the field, serving on the editorial committee of a prestigious academic journal would carry weight.

faced with the prospect that peers and colleagues may not accord much value to blogging, academics who write for and edit blogs may need to present that work according to conventional understandings of academic labour, since their digital toil will be assessed within such frameworks. writing posts for a collaborative or multi-author blog can be framed as a form of scholarly impact in one’s tenure package, or as a means of knowledge mobilization in a research grant proposal. editing a thematic blog is best packaged as service to the discipline, particularly if that blog is affiliated with a scholarly society or journal or housed within a university department, centre, or institute.
caveats aside, a growing number of early-career canadian historians have derived tangible professional benefits from digital historical scholarship and service, whether through writing for and editing blogs, microblogging on twitter, or maintaining a visible, thoughtfully-curated online presence on social media or their own independent professional websites. we know of one scholar who was invited to co-edit a peer-reviewed collection, now under contract with the university of british columbia press, by a tenured colleague whom they had met only once in person, but who had been impressed by their tweets. corey slumkoski and tina adcock, along with other blog editors now in tenured or tenure-track positions, confirm that their early-career digital historical activities set them apart from other job applicants and increased their visibility within academic networks, even if they did not directly produce job offers.

we suggest that blog editors, most of whom are still in the early stages of their own careers, afford (other) early-career researchers the same consideration as untenured or precariously employed scholars. emerging scholars must engage in an ever-increasing number of tasks simply to be considered viable candidates in highly competitive job markets. blogging may come to be experienced as yet another requirement imposed upon an overworked and often meagerly compensated segment of academia. canadian history blogs should open their doors to such scholars, but leave it up to individuals to decide whether or not they wish to cross that threshold. the experience of findings/trouvailles, which has specifically sought to feature the research of emerging canadian historians, is instructive. of posts published over three-and-a-half years, only were written by early-career scholars. seven were written by early- to mid-career scholars.
because of the many demands on such scholars’ time, findings/trouvailles has had difficulty soliciting posts from the very demographic its editors set out to work with most closely. interested early-career academics should have the opportunity not only to write for multi-authored blogs, but to serve as editors, too. when findings/trouvailles expanded their editorial committee in january , editors made the conscious decision to bring two doctoral candidates, travis hay and abril liberatori, aboard. daniel ross, who joined the editorial collective of activehistory.ca as a doctoral candidate but who has since defended, has been an invaluable public outreach coordinator. not unlike internships with peer-reviewed journals, positions on a blog’s editorial board give early-career scholars the opportunity to meet junior and senior scholars hailing from all corners of the country, to encounter scholarly writing of all styles and at all stages of completion, and to make a good first impression through careful and generous editorial work and through courteous professional behaviour more generally. given the widespread, if not always accurate, perception that blogging is a young scholar’s game, history blogging may become the front line of a new “history war” waged between advocates of the digital humanities and scholars of more traditional proclivities. even the most technophobic scholars now acknowledge the benefits that digital advances have conferred upon the historical discipline. it would be difficult to find a historian today who laments the use of a word processor to write, or who decries the existence of online journals. yet these things are digital replications of analog aspects of historical practice.
scholars wrote articles and read journals prior to computers and the internet; these technologies have merely eased and enriched the work of scholars and increased their productivity. blogging, however, has no analog comparator. multi-author blogs did not exist in a predigital age. writing for such a blog might well be many scholars’ first foray into digital history, as it is unlikely that they would commence by crunching big data or performing text analysis. history blogging, then, may help to win over scholars who might not consider digital historical or humanities projects as rigorous or valuable as traditional forms of scholarship and service. indeed, this process is already underway. while some mid- or late-career scholars may remain reluctant to count blog posts as scholarship, they are generally reading at least some of the posts published on canadian history blogs. blog editors can facilitate such scholars’ acclimatization to, and perhaps eventual conversion to, digital history by deliberately reaching out to this one, last underrepresented group (underrepresented among digital historians, anyway) and inviting them to write a post. generational or age-based assumptions about the makeup of the history blogging community may actually be dissuading some mid- and late-career academics from participating. we have found that even senior scholars are often genuinely flattered to be approached, and that many are willing to contribute posts. afterwards, some have thanked editors for the opportunity to re-engage with their long-term research projects, however fleetingly, in the midst of seemingly never-ending teaching and administrative duties, and for the pleasure that they derived both from writing the post and witnessing people’s reactions to it.
if, by becoming bloggers themselves, some “traditional” historians might come to take digital historical work more seriously, then more outreach toward the upper as well as the lower end of the career spectrum should occur. in time, blogging may become a natural part of historical work, a typical, although not obligatory, step in a piece’s journey from conception to publication. this will depend, however, on the goodwill of established scholars in their capacity as editors as well as authors. currently, a scholar conducts research, writes up the results in a paper, presents that paper at one or more conferences, revises it based on feedback received informally from one’s colleagues, and then submits it to a journal, where it is formally reviewed. publishing a blog post could easily become part of this process. blogging would enable authors to reach a wider and more varied audience more efficiently, solicit a broader range of comments and suggestions, and thus tune the paper more finely before it arrives in a journal editor’s inbox. threading blogs into the publication process may be further facilitated by the structural similarities emerging between group and multi-author blogs and scholarly journals. rebecca goetz contends that most such blogs tend “to function more like journals, publishing book reviews, interviews with new authors, and … generally longer pieces.” but editors of academic journals will need to view such blogs as collaborators rather than competitors in the work of publication. potential contributors to the acadiensis blog have sometimes hesitated to submit posts, fearing that doing so might negatively affect their chances of publishing similar research in acadiensis one day. aware of such concerns, the editorial board of acadiensis decided that posting a piece on the acadiensis blog would not disqualify a longer version of that piece from being accepted for publication in the journal.
we hope that the editorial boards of other academic journals adopt similar policies with regard to blogging more broadly. blogging is both scholarship and service. it often mobilizes research and writing practices similar, if not identical, to those required to produce traditional modes of scholarship. but it also encompasses elements of community engagement and scholarly networking that reflect and help fulfill expectations of academic service. whether blogging hinders or helps academic careers in a formal sense depends on the person, the place, and a host of contingent and shifting factors, although it appears to be gaining ground and approbation within the canadian historical discipline. when managed well and integrated thoughtfully into historians’ regular cycles of labour and expected duties and responsibilities, blogging can benefit individual historians, the academic community, and the larger public we should be striving to reach.
conclusion: blogging and community
blogging has already reshaped the work of researching, teaching, and communicating canadian history in manifold ways. it has usefully broadened the discipline’s range of narrators and narrative forms, and increased the potential reach of the narratives that result. it has allowed academic research to travel faster and farther than ever before, both prior to and after its publication in conventional scholarly forums. it provides students with the opportunity to refine skills associated with digital literacy and the comprehension and composition of scholarly writing, and instructors with the means to achieve learning objectives central to teaching history at the post-secondary level. finally, it enables both junior and senior scholars to connect with like-minded peers and members of the public, and often to derive personal and professional benefits from doing so.
while the labour of blogging has the potential to reinscribe persistent imbalances of power within the academy and canadian society, it may also offer marginalized individuals and groups the means to broadcast their perspectives more widely. if the canadian history blogging community commits to operating on a care-full, thoughtful, and continually self-reflective basis, and is willing to learn from its missteps and mistakes, we see every reason to be optimistic about its future. we conclude on a variation of the theme of inclusiveness that runs throughout the essay by reflecting upon the community-building capacities of canadian history blogging. blogs can help construct academic communities from scratch, as niche’s website and the otter~la loutre have done for the subfield of canadian environmental history. they can provide space online to expand existing communities and heighten the profile of particular subfields, as borealia is doing for scholars of early canadian history. they can continue and broaden conversations begun at academic conferences, as in the case of activehistory.ca, now among canada’s foremost multi-author blogs. they can draw historians specializing in disparate nations or regions together by inaugurating interdisciplinary conversations around certain themes or sets of questions, as the notches blog has done for the history of sexuality, or the age of revolutions blog has done for the concept of revolutions in history. finally, blogs can bridge the divide between paper-based and digital scholarship, helping staid academic journals such as acadiensis and bc studies find a foothold in the web . world of scholarly communication. we are now entering a new phase of community-building in the canadian history blogosphere. collaborative and multi-author canadian history blogs have traditionally served as structural and practical models for each other, and have shared some editorial personnel.
but they have otherwise operated largely in isolation, even as they have each become vibrant gathering places for canadian historians and fellow travellers. in the last year, however, we have seen more and more cross-platform collaborations, both within and beyond canada. in may , borealia and the otter~la loutre hosted a joint series on early canadian environmental history, capitalizing on the energy and expertise of contributors and readers from both subfields. in – , the otter~la loutre and the wisconsin-based “digital magazine” edge effects are co-hosting “seeds,” a series that showcases the research of emerging canadian and american environmental historians. smaller-scale partnerships have occurred between borealia and both the acadiensis blog and activehistory.ca in canada, and with the junto and the republic blogs on early american history in the united states. the proliferation of group and multi-author blogs has created greater vibrancy and momentum in the canadian digital historical community. it has fostered an atmosphere of collaboration, not competition; it has produced greater cohesion between the readers and writers of these blogs, rather than their fragmentation or dispersal. just as a street full of used bookshops becomes a destination, so the growth of history blogging in canada has raised the medium’s profile. we hope that the ever-growing collegiality and conviviality of the canadian history blogosphere will entice more historians to read, write, and edit blogs themselves, to experience the joys and benefits of blogging directly, and perhaps to adjust their estimations of this activity accordingly. such a development would benefit not only the profession, but canadian society.
in tandem with broader ideals of “active history” and community-engaged research, history blogging can help break down the all-too-persistent myth of the disengaged ivory-tower academic, both in the eyes of scholars themselves and members of the public. beth robertson remembers being told in one of her first graduate seminars that if anyone in the room thought they could change the world by doing history, they should leave now. perhaps no one history blogger or digital historian has the ability to change the world, at least not radically. but professional historians are still constituent members of canadian society, capable of making meaningful contributions to public debates, to policy-making, and to commonly-held understandings of canadian culture and nation. blogging offers a user-friendly, easily accessible means to do just that. and even if digital historians wind up leaving canada much as it was when they began, blogging may still affect the way they understand, situate, and carry themselves as researchers and scholars in the wider world.
***
tina adcock is an assistant professor of history at simon fraser university. she co-edits the otter~la loutre, the blog of the network in canadian history and environment, and sits on the editorial committee of findings/trouvailles, the champlain society’s blog.
tina adcock est professeure adjointe en histoire à l’université de simon fraser. elle est codirectrice de the otter~la loutre, blog du network in canadian history and environment, et siège sur le comité de rédaction de findings/trouvailles, blog de la champlain society.
keith grant is a ph.d. candidate in history at the university of new brunswick, and a founding co-editor of borealia: a group blog on early canadian history.
keith grant est doctorant en histoire de l’université du nouveau-brunswick et membre fondateur de borealia: a group blog on early canadian history, dont il assure la codirection.
stacy nation-knapper is a postdoctoral fellow at the l.r. wilson institute for canadian history at mcmaster university. she is chair of the editorial committee for findings/trouvailles, the blog of the champlain society.
stacy nation-knapper est boursière postdoctorale à la l.r. wilson institute for canadian history à l’université mcmaster. elle dirige le comité de rédaction de findings/trouvailles, blog de la champlain society.
beth robertson is a sessional lecturer with the department of history at carleton university, as well as a research associate with carleton university’s disability research group. she is one of the co-editors of activehistory.ca.
beth robertson est chargée de cours au département d’histoire de l’université carleton, ainsi que chercheure associée au disability research group de la même université. elle est membre du comité rédactionnel de activehistory.ca.
corey slumkoski is an associate professor of history at mount saint vincent university, and the digital communications editor for acadiensis.
corey slumkoski est professeur agrégé en histoire de l’université mount saint vincent et directeur des communications numériques pour acadiensis.
endnotes
for the characteristics of collaborative (or group) and multi-author blogs, see patrick dunleavy, “shorter, better, faster, free: blogging changes the nature of academic research, not just how it is communicated,” lse impact blog, december , http://blogs.lse.ac.uk/impactofsocialsciences/ / / /shorter-better-faster-free/ and sadie bergen, “from personal to professional: collaborative history blogs go mainstream,” perspectives on history, april , https://www.historians.org/publications-and-directories/perspectives-on-history/april- /from-personal-to-professional-collaborative-history-blogs-go-mainstream/.
in addition to those discussed herein, other collaborative and multi-author canadian history blogs include that run by the peer-reviewed journal bc studies (http://www.bcstudies.com/?q=blog), and teaching the past, hosted by the history education network/histoire et Éducation en réseau (then/hier) (http://thenhier.ca/en/node/ ). important single-author canadian history blogs include christopher moore’s history news (http://christophermoorehistory.blogspot.ca/), andrew smith’s the past speaks (https://pastspeaks.com/), and andrea eidinger’s unwritten histories (http://www.unwrittenhistories.com/). corey slumkoski discusses some of the blogs and websites featured here in “history on the internet . : the rise of social media,” acadiensis , no. ( ): – . katherine o’flaherty and robert gee discuss canadian scholars’ use of the microblogging service twitter in “‘inviting coworkers’: linking scholars of atlantic canada on the twitter backchannel,” acadiensis , no. ( ): – . on digital canadian history and digital humanities in canada, see, for example, corey slumkoski, margaret conrad and lisa charlong, “history on the internet: the atlantic canada portal,” acadiensis , no. ( ): – ; john bonnett and kevin kee, “transitions: a prologue and preview of digital humanities research in canada,” digital studies/le champ numérique , no. ( ): http://www.digitalstudies.org/ojs/index.php/digital_studies/article/view/ / ; sasha mullally, “democratizing the past?: canada’s history on the world wide web,” in settling and unsettling memories: essays in canadian public history, ed. nicole neatby and peter hodgins (toronto: university of toronto press, ), – ; corey slumkoski, “regional history in a digital age: the problems and prospects of atlantic canadian studies,” scholarly and research communication , no. ( ): , pp.; jennifer bonnell and marcel fortin, eds., historical gis research in canada (calgary: university of calgary press, ).
key works on digital history include roy rosenzweig, “can history be open source? wikipedia and the future of the past,” journal of american history , no. ( ): – ; daniel j. cohen and roy rosenzweig, digital history: a guide to gathering, preserving, and presenting the past on the web (philadelphia: university of pennsylvania press, ); roy rosenzweig, clio wired: the future of the past in the digital age (new york: columbia university press, ); daniel j. cohen, et al., “interchange: the promise of digital history,” journal of american history , no. ( ): – . the american historical association (aha) has prominently encouraged conversations about digital history. see especially “guidelines for the evaluation of digital scholarship in history,” https://www.historians.org/teaching-and-learning/digital-history-resources/evaluation-of-digital-scholarship-in-history/guidelines-for-the-evaluation-of-digital-scholarship-in-history, <viewed july >; “resources for getting started in digital history,” december , https://www.historians.org/teaching-and-learning/digital-history-resources/resources-for-getting-started-in-digital-history. on blogging and historical scholarship, see alex sayf cummings and jonathan jarrett, “only typing? informal writing, blogging, and the academy,” in writing history in the digital age, ed. jack dougherty and kristen nawrotzki (ann arbor: university of michigan press, ), – . scholars in other fields have already begun to consider blogging’s effects upon academic practices. see, for example, melissa gregg, “feeling ordinary: blogging as conversational scholarship,” journal of media & cultural studies , no. (june ): – ; aaron barlow, blogging america: the new public sphere (westport, ct: praeger publishers, ); gill kirkup, “academic blogging: academic practice and academic identity,” london review of education , no.
(march ): – ; tania heap and shailey minocha, “an empirically grounded framework to guide blogging for digital scholarship,” research in learning technology: alt-c conference proceedings ( ): – ; karen fricker, “blogging,” contemporary theatre review , no. ( ): – . moore, history news; history & policy, http://www.historyandpolicy.org. “about,” activehistory.ca, http://activehistory.ca/about/, <viewed july >. one occasion on which activehistory.ca fostered public engagement recently was at its second conference, “new directions in active history: institutions, communications and technologies,” held in . there, federal servants discussed the importance of historical thinking to government policy, and curators and archivists considered how the findings of the truth and reconciliation commission are reshaping their practice. alan corbiere also spoke about his work as the anishinaabemowin revitalization program coordinator at lakeview school, m’chigeeng first nation. although the conference happened in real time, the variety of topics, ideas and voices on display there reflected the kind of vibrant online conversations that activehistory.ca has helped foster since its inception. videos of some of the panels (http://activehistory.ca/videos-from-new-directions-in-active-history- /) and blog posts emerging from the conference were posted to the website. the event also led to other initiatives and collaborations between the blog and groups such as the graphic history collective (ghc) that have endeavoured to insert alternative and lesser-known narratives into the public conversation surrounding the th anniversary of confederation.
see especially activehistory.ca’s cross-posting of the ghc’s activist art project “remember | resist | redraw: a radical history poster project,” e.g., january , http://activehistory.ca/ / /remember-resist-redraw- /. the newsletter helped disseminate early otter content more widely than would otherwise have been the case, as many academics still preferred to read material on paper rather than on their screens. jim clifford recalls that one person printed the newsletter off and read it on the bus long before they began reading the otter~la loutre regularly online. personal communication with author, march . “about us: the mission of the champlain society,” the champlain society, http://www.champlainsociety.ca/about-us/, <viewed july >. patricia thornton, “the problem of out-migration from atlantic canada, – : a new look,” acadiensis , no. ( ): – . digital history encompasses scholarship presented using digital technologies as well as that produced using computational tools and methods. see american historical association, “guidelines for the evaluation of digital scholarship in history.” field notes, a web-only feature of the peer-reviewed journal environmental history, invites “media-rich” essays that can either stand alone or complement scholarship published in the journal. see field notes, environmental history, http://environmentalhistory.net/field-notes/, <viewed august >. beth a. robertson and dorotea gucciardo, “science, technology and gender in canada: an activehistory.ca exhibit in collaboration with the canadian science and technology museum,” activehistory.ca, november , http://activehistory.ca/ / /science-technology-and-gender-in-canada-an-activehistory-ca-exhibit-in-partnership-with-the-canadian-science-and-technology-museum/. “how to submit to findings/trouvailles,” the champlain society, http://www.champlainsociety.ca/how-to-submit-to-findingstrouvailles/, <viewed august >.
see especially mathieu lapointe, “de cannon à bastarache: la commission d’enquête comme manœuvre d’évitement,” histoireengagée.ca, february , http://histoireengagee.ca/?p= ; yves-michel thomas, “reconstruction d’haïti, (re-) construire l’État,” histoireengagée.ca, october , http://histoireengagee.ca/?p= . see also “pour une histoire engagée,” histoireengagée.ca, september , http://histoireengagee.ca. see especially thomas peace, “does the crowd matter? the moral economy in the twenty-first century,” activehistory.ca, february , http://activehistory.ca/ / /does-the-crowd-matter-the-moral-economy-in-the-twenty-first-century/, translated and republished as thomas peace, “la foule importe-t-elle? l’économie morale au xixe siècle,” histoireengagée.ca, march ; http://histoireengagee.ca/?p= ; camille robert, “the ninth floor: sur les traces de black power à montréal,” histoireengagée.ca, february , http://histoireengagee.ca/?p= , translated and republished as camille robert, “the ninth floor: finding black power in montreal,” activehistory.ca, march , http://activehistory.ca/ / /the-ninth-floor-finding-black-power-in-montreal/. chelsea vowel, âpihtawikosisân|law, language, life: a plains cree speaking métis woman in montreal, http://apihtawikosisan.com, <viewed august >; erika violet lee, moontime warrior: fearless philosophizing, embodied resistance, https://moontimewarrior.com, <viewed august >; zoe todd, zoe s.c. todd: academic, writer, indigenous feminist, métis advocate, https://zoesctodd.wordpress.com, <viewed august >. crystal fraser, “politics and personal experiences: an editor’s introduction to indigenous research in canada,” activehistory.ca, january , http://activehistory.ca/ / /politics-and-personal-experiences-an-editors-introduction-to-indigenous-research-in-canada/.
adam gaudry, “paved with good intentions: simply requiring indigenous content is not enough,” activehistory.ca, january , http://activehistory.ca/ / /paved-with-good-intentions-simply-requiring-indigenous-content-is-not-enough/. on this point, see also, for example, timothy j. stanley, “why i killed canadian history: towards an anti-racist history in canada,” histoire sociale/social history , no. ( ): – , especially – . see, for example, heide estes, “blogging and academic identity,” literature compass , no. ( ): – . sean kheraj and denis mckim, “early canadian environmental history series: editorial introduction and essential reading,” the otter~la loutre, may , http://niche-canada.org/ / / /early-canadian-environmental-history-series-editorial-introduction/. on the importance of inclusive and diverse citation practices more generally, see sara ahmed, “making feminist points,” feministkilljoys (blog), september , https://feministkilljoys.com/ / / /making-feminist-points/. melissa gregg, “banal bohemia: blogging from the ivory tower hot-desk,” convergence: the international journal of research into new media technologies , no. ( ): – . robert darnton, “what is the history of books?,” in the kiss of lamourette: reflections in cultural history (new york: norton, ), – . for one adaptation of darnton’s model for the digital age, see “the digital communications circuit,” the book unbound: disruption and disintermediation in the digital age, university of stirling, , http://www.bookunbound.stir.ac.uk/research/infographic/, <viewed july >. julia martin and brian hughes, “small p publishing: a networked blogging approach to academic discourse,” journal of electronic resources librarianship , no. ( ): . see pat thomson, “seven reasons why blogging can make you a better academic writer,” times higher education, january , https://www.timeshighereducation.com/blog/seven-reasons-why-blogging-can-make-you-better-academic-writer.
on the relationship between blogs and other forms of social media, see dunleavy, “shorter, better, faster, free.” see alan maceachern, “booklook ,” the otter~la loutre, may , http://niche-canada.org/ / / /booklook- /; “ new articles in canadian environmental history: from high modernism to ducks,” the otter~la loutre, may , http://niche-canada.org/ / / / -new-articles-in-canadian-environmental-history-from-high-modernism-to-ducks/; “new articles in canadian environmental history: from adventurous tourists to experimental farms,” the otter~la loutre, december , http://niche-canada.org/ / / /new-articles-in-canadian-environmental-history-from-adventurous-tourists-to-experimental-farms/; keith grant, “new books in early canadian history, , part ,” january , https://earlycanadianhistory.ca/ / / /new-books-in-early-canadian-history- -preview-part- /. see http://niche-canada.org/category/media/envhist-worth-reading/ for an index of all such posts. see especially krista mccracken, “ten books to contextualize reconciliation in archives, museums and public history,” activehistory.ca, october , http://activehistory.ca/ / /ten-books-to-contextualize-reconciliation-in-archives-museums-and-public-history/; kate barker, “ten books to contextualize the history of infectious diseases and vaccinations,” activehistory.ca, april , http://activehistory.ca/ / /ten-books-to-contextualize-the-history-of-infectious-diseases-and-vaccinations/; andrew watson and thomas peace, “ten books to contextualize idle no more,” activehistory.ca, january , http://activehistory.ca/ / /ten-books-to-contextualize-idle-no-more/.
for an example of a piece republished from the otter~la loutre, see stacy nation-knapper, andrew watson and sean kheraj, “ten books to contextualize the alberta tar sands,” activehistory.ca, may , http://activehistory.ca/ / /ten-books-to-contextualize-the-alberta-tar-sands/. see jason priem et al., “altmetrics: a manifesto,” altmetrics, october , http://altmetrics.org/manifesto/. for reflections on the relationship between a journal’s print and online spaces, see lucinda matthews-jones, “blogging the victorians for the journal of victorian culture online,” journal of victorian culture , no. ( ): – ; helen rogers, “academic journals in the digital age: an editor’s perspective,” journal of victorian culture , no. ( ): – . sandra alston, “a ‘canadian ball’ at rivière-du-loup, , from the journal of william ord mackenzie,” findings/trouvailles, february , http://www.champlainsociety.ca/canadian-ball-at-riviere-du-loup- -william-ord-mackenzie/; alston and cicely blackstock, eds., ‘another world’: william ord mackenzie’s sojourn in the canadas, - (toronto: the champlain society, ). this volume was available only to society members; the associated findings/trouvailles post allowed a sliver of its content much wider circulation than it would otherwise have enjoyed. heap and minocha, “empirically grounded framework,” – . see, for example, jim clifford, “the tpp and public domain content in canada,” activehistory.ca, october , http://activehistory.ca/ / /the-tpp-and-public-domain-content-in-canada/; ian milligan, “research is getting a bit more open: good news for historical research in canada,” activehistory.ca, march , http://activehistory.ca/ / /research-is-getting-a-bit-more-open-good-news-for-historical-research-in-canada/.
stacy nation-knapper, “kootenay pelly and spokane garry: indigenous students at the red river mission school,” findings/trouvailles, december , http://www.champlainsociety.ca/kootaney-pelly-and-spokane-garry-indigenous-students-at-the-red-river-mission-school/. donald w. mcleod, “a gay ‘kiss-in’ and the quest for equality, ,” findings/trouvailles, june , http://www.champlainsociety.ca/a-gay-kiss-in/. members of the public also engage with blog posts after their publication, particularly when they have direct personal experience with the activity or phenomenon under discussion. for example, daniel heidt’s otter post about postwar arctic weather stations elicited comments from people who had worked as technicians in such places, or were otherwise familiar with these stations and their histories. see heidt, “met techs, the environment and science at the joint arctic weather stations, – ,” the otter~la loutre, march , http://niche-canada.org/ / / /met-techs-the-environment-and-science-at-the-joint-arctic-weather-stations- - /. see also the comments beneath merle massie, “to sled or not to sled: boundaries and borders in canada’s winter playgrounds,” the otter~la loutre, january , http://niche-canada.org/ / / /to-sled-or-not-to-sled-boundaries-and-borders-in-canadas-winter-playgrounds- /. rohan maitzen, “scholarship . : blogging and/as academic practice,” journal of victorian culture , no. ( ): . “dam nation: hydroelectric developments in canada,” the otter~la loutre, september , http://niche-canada.org/tag/dam-nation-hydroelectric-developments-in-canada/; “animal metropolis,” the otter~la loutre, february , http://niche-canada.org/tag/animal-metropolis/; tina adcock, “a cold kingdom,” the otter~la loutre, march , http://niche-canada.org/ / / /a-cold-kingdom/.
see “landscapes of science,” the otter~la loutre, january–august , http://niche-canada.org/tag/landscapes-of-science/; “when blue meets green,” the otter~la loutre, november , http://niche-canada.org/tag/when-blue-meets-green/; “(un)natural identities,” the otter~la loutre, march–may , http://niche-canada.org/tag/unnatural-identities/.
martin and hughes, “small p publishing,” .
now-classic critiques of students as digital natives include ellen helsper and rebecca eynon, “digital natives: where is the evidence?,” british educational research journal , no. ( ): – ; susan j. bennett and karl a. maton, “beyond the ‘digital natives’ debate: towards a more nuanced understanding of students’ technology experiences,” journal of computer assisted learning , no. ( ): – .
teachinghistory.org, a project of the roy rosenzweig center for history and new media at george mason university, has a section called “the digital classroom” (http://teachinghistory.org/digital-classroom) that provides teachers with resources for using blogs and other digital tools to teach historical thinking skills.
on this overlap, see also douglas a. powell, casey j. jacob, and benjamin j. chapman, “using blogs and new media in academic practice: potential roles in research, teaching, learning and extension,” innovative higher education , no. (august ): – .
journal of the cha / revue de la shc
ian mosby, “speak, recipe: reading cookbooks as life stories,” activehistory.ca, august , http://activehistory.ca/ / /speak-recipe-reading-cookbooks-as-life-stories/; mosby, “the forgotten parts of food culture: unpaid labour and drudgery,” off the page (blog), th shelf, july , http:// thshelf.com/blog/ / / /the-forgotten-parts-of-food-culture-unpaid-labour-and-drudgery.
for an excellent overview of how to teach digital historical writing, see sean kheraj, “best practices for writing history on the web,” activehistory.ca, october , http://activehistory.ca/ / /best-practices-for-writing-history-on-the-web/.
the junto: a group blog on early american history, http://earlyamericanists.com.
for a sterling example of this kind of digital historiographical debate, see “early canadian environmental history: a forum,” borealia, may , https://earlycanadianhistory.ca/ / / /early-canadian-environmental-history-a-forum/.
the history education network/histoire et Éducation en réseau, http://thenhier.ca.
see, for example, thomas peace, “indigenous peoples: a starting place for the history of higher education in canada,” activehistory.ca, january , http://activehistory.ca/ / /rethinking-higher-education-colonialism-and-indigenous-peoples/; alison little, “black history education through the archives of ontario,” activehistory.ca, february , http://activehistory.ca/ / /black-history-education-through-the-archives-of-ontario/.
these and other teaching-themed posts are indexed at http://niche-canada.org/category/the-otter/teaching/.
“teaching materials/matériel d’enseignement,” niche, http://niche-canada.org/resources/teaching-materials-materiel-denseignement/, <viewed august >.
kathryn magee labelle, “forgotten indigenous figures — early canadian biographies and course content,” borealia, june , https://earlycanadianhistory.ca/ / / /forgotten-indigenous-figures-early-canadian-biographies-and-course-content/.
maitzen, “scholarship . ,” .
rachel leow, “reflections on feminism, blogging and the historical profession,” journal of women’s history , no. (winter ): .
nancy janovicek’s and rachel leow’s arguments resemble those that sshrc has put forth to support its request that online journals move to embrace open-access publication more fully. on this point, see also slumkoski, “history on the internet . ,” .
see tina adcock, “go south, young historian?,” the otter~la loutre, june , http://niche-canada.org/ / / /go-south-young-historian/.
see alison mountz et al., “for slow scholarship: a feminist politics of resistance through collective action in the neoliberal university,” acme: an international e-journal for critical geographies , no. ( ): – ; yvonne hartman and sandy darab, “a call for slow scholarship: a case study on the intensification of academic life and its implications for pedagogy,” the review of education, pedagogy and cultural studies , nos. – ( ): – .
pushback against what some regard as the inevitability of mandatory academic engagement on social media is already occurring. see “i’m a serious academic, not an instagrammer,” academics anonymous (blog), higher education network, the guardian, august , https://www.theguardian.com/higher-education-network/ /aug/ /im-a-serious-academic-not-a-professional-instagrammer.
rebecca goetz, “a brief history of blogging as experienced by yours truly,” historianess (blog), june , https://historianess.wordpress.com/ / / /a-brief-history-of-blogging-as-experienced-by-yours-truly/.
notches: (re)marks on the history of sexuality (blog), http://notchesblog.com; age of revolutions (blog), https://ageofrevolutions.com.
edge effects (blog), center for culture, history, and environment, nelson institute for environmental studies, university of wisconsin-madison, http://edgeeffects.net; “seeds: new research in environmental history,” the otter~la loutre, august –, http://niche-canada.org/tag/seeds-new-research-in-environmental-history/.
the republic (blog), society for historians of the early american republic (shear), http://www.shear.org/blog/.
taking the british library forward in the twenty-first century
d-lib magazine november volume number issn -

lynne brindley
chief executive, the british library
chief-executive@bl.uk

introduction

the end of the th century saw the british library successfully moved into its new, flagship building in london. problems and criticisms associated with the largest public building project of the century in the uk have largely ceased as users and visitors signal their approval. researchers love the reading rooms and the much enhanced quality of associated services; visitors marvel at the showcase exhibition galleries; and large parts of the library’s collections are now housed in an environment fitting to their international standing. the building stands as a confident symbol of the importance of all libraries to the nation’s cultural, educational and economic success. yet it is at this very time of success that the british library needs to turn to, and accelerate its engagement with, the critical and transformational issues associated with the digital world and the new demands being made by our wide range of users and stakeholders. my arrival as newly appointed chief executive in july has provided the opportunity for the library to focus on new strategic directions and this article aims to share our early thinking with the d-lib community.

strategic directions for the british library

we are engaged in a strategic journey that recognises the centrality of the web to our future and seeks gradually to re-position the library as a key player in a multiplicity of collaborative arrangements, within national and international networks of libraries, with scholars and researchers, and with other public and private sector bodies to ensure the timely provision of appropriate services for the future.
our emerging vision has the strap-line of "making accessible the world’s intellectual, scientific and cultural heritage". we seek to make the collections of the british library (and other great collections) accessible on everyone’s ‘virtual bookshelf’, wherever this may be: at work, at school, at college, at home. this implies a larger focus on e-strategy, including digitisation and digital collecting; more emphasis on presentation of the library’s collections in the context of other great collections and worthwhile resources world-wide; much more active use and development of navigational tools to assist users; and reaching out through the web (directly and mediated through appropriate educational agencies and the public libraries) to a much wider public.

the british library's e-strategy

our e-strategy will be at the core of our work and will underpin many of our priority developments. firstly, and at the core of the british library’s future relevance and mission, is the continuing effort being expended to ensure that the uk will have an adequate system of legal deposit for an electronic age. our collaboration with the other legal deposit libraries is critical to defining the framework for this, and for devising a practical solution. meanwhile, a code of practice for the voluntary deposit of non-print publications has been agreed as an interim measure, endorsed by publishing trade bodies, the legal deposit libraries and our sponsoring government department, department for culture, media and sport (dcms). our digital infrastructure is being critically enhanced by a deal that the british library has just concluded with ibm, following a lengthy procurement process, to provide a digital store which will form the technical platform to support the library’s acquisition and preservation of collection materials in digital form, together with digitised elements of its own historical collections.
the digital store will be designed using the open archival information system (oais) reference model and will build on the work of the cedars digital preservation project within which the british library is acting as a test site. the dutch national library, the koninklijke bibliotheek, has embarked simultaneously on a similar project with ibm, and we envisage working in close collaboration with koninklijke bibliotheek as we move into uncharted digital territories. we will hope to contribute to, and share findings from, the international digital preservation research agenda, not least by providing an excellent test-bed for such work. the library is also pursuing new opportunities for digitisation of its collections. we were encouraged by a recent house of commons committee report, primarily on public libraries but dealing with some british library matters ( ). let me quote from the report.

….we strongly support the british library in its endeavours to continue its digitalisation of internationally important books and manuscripts. we recommend that, wherever possible, those images should be freely available on the internet. we consider that support for this process should be considered a high priority for lottery or government funding as appropriate. it should be the government’s avowed aim to establish the british library as a hub for the uk and the international library network. this will enable the british library to become a universal resource rather than the preserve of a relatively small number of users on the site: a library for the many not just for the few. the expansion of the british library’s role should not be at the expense of and should in no way compromise the performance of the british library’s core statutory functions….

the library currently has two major digitisation bids in progress for lottery funding of nationally significant heritage material.
we are leading a consortium of bidders on the theme of ‘a national sense of place’, focussed on the location and appearance of places within the uk. we are also a partner in a bid on the subject of ‘moving here’, with content based on immigration to england. there are plans for the digitisation of some , of our most attractive images to create a picture library, our early photographic collections are being put on the web, and we are in collaboration with keio university digitising several surviving copies of gutenberg’s bible, enabling scholars to compare copies virtually, in ways previously impossible. but the label of digitisation hides rather deeper considerations and policy issues. whilst we wish to make a critical mass of digital material available, we see limited merit in digitisation without some coherence of purpose and integrity. with cliff lynch, executive director of the coalition for networked information in the us, we believe that it is critical to ‘weave primary content with commentary, criticism, scholarship and instruction’. materials digitised need to be described, related, contextualised, justified and scoped. these are complex tasks involving a range of new collaborations with scholars, teachers, educational publishers, and so on. new models for these kinds of partnerships are international and complex: we need to share lessons on how to ‘re-purpose’ our materials, and understand the range of business models appropriate for such ventures.

collaborative e-ventures in support of research and scholarship

let me share with you two examples of the library’s current involvement with such collaborations. the first example is the international dunhuang project (idp) which is currently a showcased project within the ecai (electronic cultural atlas initiative) led from the university of california, berkeley. the project is developing methodologies and software for storing, querying, and displaying cultural features that vary through time.
the idp aims to bring together all the cave documents in high-quality digital format, and integrate electronically dispersed, fragile and relatively inaccessible collections housed in four major institutions -- the national library of china, the british library, the bibliotheque nationale de france, and the institute of oriental studies, st petersburg. the project has already had substantial support from the mellon foundation and from the higher education funding council for england, and with their continuing and potentially extended support, we have every hope that this will be a digital scholarship project of immense international value. a second, rather different example, is fathom, a recently announced partnership involving the british library, columbia university new york, the london school of economics, cambridge university press, the smithsonian institution’s national museum of natural history and the new york public library, together with a growing number of cultural and educational contributors of international standing. it is intended that fathom.com through its website will provide access to a range of e-course and related content, and will act as a quality knowledge space. the british library is actively developing digital content and a range of ‘stories’ contributed by our curators. i cite these two examples particularly to lead into another facet of our e-strategy, namely the recognition of the important contribution our curators and bibliographers have to make in this new environment. one of the greatest assets of the british library, in parallel with its collections, is the expertise of many of its curatorial staff, who are often international scholars in their fields. they will have critical roles to play in international scholarship and research projects in the digital field, in entrepreneurial internet ventures, and in supporting wider public access to, and understanding of, our great collections in an e-setting. 
we are currently refining the sets of skills and competencies needed for these new roles of brokering, interpretation, exploitation, and complex partnerships.

collaboration with higher education

collaboration and partnership with the higher education sector, in the uk and more widely, is a strategic priority. the uk framework of the distributed national electronic resource (dner), sponsored by the joint information systems committee (jisc), provides an excellent focus for dialogue and for a significant contribution to its fulfilment by the british library. what is emerging is a shared agenda for development and the opportunity for the library to enrich services for researchers and students, and for the incorporation of many of the dner’s offerings in the services offered to the wider user communities of the library. more specifically we are offering free access to students and academics to zetoc, which provides desk-top access to the table of contents of some million journal articles and conferences, all accessible from the british library: this service is in partnership with jisc and the national data-centre based at manchester. we are also working within the same partnership to develop our electronic document delivery services as part of a distributed network of providers. in the area of portal development the dner has a well-established framework through its resource discovery network (rdn) with its faculty/subject nodes. the library will make initial contributions to this network in the areas of complementary medicine and in sustainable business: reciprocally the library will embed aspects of this navigational service within its own portal to improve services to its reading room users, and to its remote users. further ideas are emerging, for example, collaborative ways of improving coverage of the national bibliography in the area of quality electronic resources evaluated and catalogued within the rdn.
potentially the british library can also enrich the dner through its special relationship with other national libraries all over the world. we are in the final stages of negotiating a major european commission funded programme to enable the development of ‘the european library’, to assist the provision, over the next few years, of a pan-european digital national library for europe, providing open and seamless access to the digital resources of the major national libraries, with multilingual access, together with technical and business models that can be extended more widely. for a d-lib article it is inevitable that the focus is on digital collaboration, but mention should also be made of the significant progress being made, in partnership with uk higher education, towards more collaboration in collection development and management. recognising that panizzi’s aspiration for the british museum library to be comprehensive for all time has been for some time unachievable, despite a continuing commitment to high levels of acquisition, we intend to work closely with others to move towards a more integrated and distributed collections strategy nationally, in which the library plays a leading role. this is of course sensitive territory, and will require trust, dialogue, and flexibility, as we aim together to provide the most effective collection coverage to support research and scholarship within the uk research library system.

collaboration for wider public access

developing as a national library ‘for the many’ without detriment to the core statutory functions, as recommended by the house of commons committee report referred to above, and playing a role in the government’s widening access agenda is a major challenge for the library. of course we offer a range of facilities at st pancras for the general public, including exhibitions, public talks and lectures, and tours of the building.
however, we envisage that the provision of digitised material from the british library’s collections on the web will clearly represent a major plank in scaling up provision to meet this new challenge. we believe strongly that we need to develop partnerships with the public libraries in the uk as major agents in extending access, through, for example, the people’s network, through learning centres, and through traditional public library channels. to this end we have just announced a call for proposals within the library’s co-operation and partnership programme which will encourage practical manifestations of this outreach strategy. we will encourage projects that provide public library gateways to the library’s range of services; we would seek projects to streamline admission to learning resources in cities, regions and the national library, ensuring easier referral; we are interested in exploring the design and delivery of regional and virtual exhibitions to reach wider publics. worth mentioning is ‘turning the pages’, an animated computer simulation of leafing through selected library treasures, where we have shown that we can lead the way in using creative multi-media technology to make our treasures accessible to a wider audience. we have licensed three sites in northumberland for the ‘turning the pages’ digitised version of the lindisfarne gospels: the original precious manuscript is on temporary loan to the northeast: the digitised version is available for the long-term and for everyone to share. ways of scaling up this kind of initiative will be sought within our e-strategy.

conclusions

this article can only skim the surface of the developing strategic directions of the british library. it has focussed broadly on the e-strategy as the engine for organisational and cultural change to meet new challenges, but has inevitably been selective.
there are strands of our strategic ‘work in progress’ that i have paid little attention to in this article; they are important and will emerge in later iterations of our new strategic directions. i have stressed the importance of partnerships as the library seeks to re-position itself more integrally in the national and international library network. we believe that, primarily through digital developments, we can enable wider access to our collections, working closely with partners, such as public libraries and public and private players in the education sector, to ‘re-purpose and re-present’ our offerings. we will work with them to define priorities for our digital programmes. at the same time we seek to work with scholars and researchers to ensure innovative programmes of the highest international quality and scholarly value: the modernisation and re-interpretation of the curatorial role is essential to achieve these objectives. i have skirted over the issues of organisational and cultural change needed to succeed, but the library does not underestimate their importance and the difficulty of the task. hawkins and battin ( ) have expressed it well when they say:

because of the capacity of digital technology to eliminate barriers to information access and global communication, it is no longer possible to confine changes to individual units, institutions, or commercial organisations. new, pervasive interrelationships among all those who use digital technology present unprecedented financial and managerial challenges, as we seek to re-interpret social values and institutional missions in a reconfigured world.

we are entering into a range of discussions, debates and dialogues on the nature and pace of our strategic development as a relevant national library for the st century. feedback from this article would be welcomed. we would commit to a consolidated response to such feedback in a later d-lib issue. please email comments to ann.clarke@bl.uk.

references

. house of commons culture, media and sport committee. sixth report. public libraries. london, the stationery office limited, may .
. hawkins, brian l and battin, patricia, eds. the mirage of continuity: reconfiguring academic information resources for the st century. washington, clir and aau, .

copyright © lynne brindley
doi: . /november -brindley

“reflecting forward” on the digital in multidirectional memory-work between canada and south africa
réfléchir à l’avenir : la place du numérique dans le travail de mémoire multidirectionnelle entre le canada et l’afrique du sud

teresa strong-wilson, claudia mitchell, connie morrison, linda radford and kathleen pithouse-morgan

mcgill journal of education / revue des sciences de l'éducation de mcgill
volume , number , fall
uri: https://id.erudit.org/iderudit/ ar
doi: https://doi.org/ . / ar
publisher: faculty of education, mcgill university
issn - (digital)
all rights reserved © faculty of education, mcgill university,

cite this article: strong-wilson, t., mitchell, c., morrison, c., radford, l. & pithouse-morgan, k. ( ). “reflecting forward” on the digital in multidirectional memory-work between canada and south africa. mcgill journal of education / revue des sciences de l'éducation de mcgill, ( ), – . https://doi.org/ . / ar

mcgill journal of education • vol. no fall “reflecting forward” on the digital

teresa strong-wilson & claudia mitchell, mcgill university
connie morrison, memorial university
linda radford, university of ottawa
kathleen pithouse-morgan, university of kwa-zulu natal, south africa

abstract.
we explore the place that the digital can occupy in teachers’ pedagogical practices around social justice and especially how memory-work can deepen and enhance teacher practices. like walter benjamin, we see memory as being a medium for exploring the past and where the digital provides greater opportunities for teachers to work productively across geographical contexts that are wrestling with issues of social justice. we argue for the potential of michael rothberg’s notion of multidirectional memory as a logical direction in which to pursue notions of cross-border, transnational productive remembering facilitated by digital means. we also pose a number of questions we see as critical for working through and “reflecting forward” on issues central to digital scholarship within the context of multidirectional memory.

réfléchir à l’avenir : la place du numérique dans le travail de mémoire multidirectionnelle entre le canada et l’afrique du sud

résumé. nous explorons la place que peut occuper le numérique au sein des pratiques pédagogiques des enseignants œuvrant en justice sociale et particulièrement la manière dont le travail de mémoire peut approfondir et améliorer ces pratiques enseignantes. À la manière de walter benjamin, nous considérons la mémoire comme un moyen d’explorer le passé ainsi qu’un endroit où le numérique offre aux enseignants des possibilités accrues de travailler efficacement au cœur de contextes géographiques aux prises avec des problématiques de justice sociale. nous soutenons que le concept de mémoire multidirectionnelle développé par michael rothberg a le potentiel et constitue la voie logique pour mieux saisir les notions de mémoire productive transnationale et transfrontalière, à l’aide des outils numériques.
nous exposons également un certain nombre de questions que nous considérons fondamentales pour trouver des solutions et réfléchir à l’avenir en ce qui a trait à des problématiques propres à la recherche numérique dans le contexte de la mémoire multidirectionnelle.

strong-wilson, mitchell, morrison, radford & pithouse-morgan
revue des sciences de l’éducation de mcgill • vol. no automne

language has unmistakably made plain that memory is not an instrument for exploring the past, but rather a medium. (benjamin in assmann, , p. )

where thinking suddenly stops in a constellation pregnant with tensions, it gives that constellation a shock. (benjamin in rothberg, , p. )

scholarly publications tell the story of data. (borgman, , p. )

teachers are the primary “memory agents” in schools, ranging from their role in selecting which texts, approaches to text and projects become the focus of student learning within the curriculum, to the fact that teachers often come to occupy a space in the memories of former students (o’reilly-scanlon, ) and also need to contend with their own memories of learning, schooling and the curriculum (pinar, ). teachers also stand at the front lines in integrating technology into the curriculum, developing students’ “ st century” skills (unesco, ). as co-authors, we have all been teachers (elementary or secondary) and are now teacher educators while also being educational researchers; our research regularly brings us back in contact with students and classrooms. we also share an abiding interest in memory in benjamin’s ( ) sense of its being a medium, and have been exploring this interest primarily through actively engaging teachers (ourselves included) in autobiographical and biographical forms of memory-work. our memory-work projects have primarily been located in two places: canada and south africa, with some of us working mostly in canada and some of us mostly in south africa.
in canada, one key focus has been canada’s history of relations with indigenous peoples, especially the legacy of residential schooling, while in south africa, the focus has mainly been on the effects of hiv and aids on rural schooling in a post-apartheid context. our work has been framed by social justice issues of race and/or gender. sensing their interrelatedness, we have looked for opportunities to bring this work together through, for instance, a research collaboration on partnerships in education, which resulted in a symposium held in durban, south africa in and an edited book on self-study and social justice (pithouse, mitchell, & moletsane, ), but most notably through a productive remembering research workshop held at mcgill in , which resulted in two co-edited collections of papers: memory and pedagogy (mitchell, strong-wilson, pithouse & allnutt, ) and productive remembering and social agency (strong-wilson, mitchell, allnutt, & pithouse-morgan, ). these conversations helped us to begin to collectively develop our ideas around memory as a medium for “productive remembering” as phenomenon and method. however, it was only when we embarked on talking about research that each of us had been conducting separately in relation to teachers, students and the digital that we could envision generating “digital dialogue” (wegerif, ) between teachers in canada and south africa, and in so doing link this dialogue to our previous memory work, through what we provisionally called digital memory-work (strong-wilson, mitchell, morrison, radford, & pithouse-morgan, ).

we are interested in exploring the place that the digital can occupy in teachers’ pedagogical practices around social justice and in particular, with how memory-work can deepen and enhance teacher practices.
as democracies, both canada and south africa are haunted by glaring examples of their “present pasts,” with apartheid continuing to have an impact on south africa years after the first democratic elections, and the idle no more movement testifying to unresolved intergenerational issues from canada’s shameful legacy of indian residential schools. at the same time, there is also a multidirectional flow between the two countries in relation to these shared histories. following canada’s example of establishing the reservation system, south africa established the group areas act in , which legally enforced apartheid. canada, following the truth and reconciliation commission hearings in south africa beginning in , established its own structure in , the indian residential schools truth and reconciliation commission. both countries continue to share a shameful present in relation to sexual violence amongst indigenous girls and young women. and yet, the contexts are also positioned very differently with respect to questions of social justice and post-colonialism, with south africa living out the post-effects of colonialism as apartheid in what is meant to be a post-apartheid state, and canada wrestling with its status as a settler colonial society and the ongoing legacy of its fraught relations with indigenous peoples, which have crystallized around residential schools. what would be the educational usefulness of bringing together these shared and simultaneously vastly different political contexts? the field known as “memory and pedagogy” is concerned with transformation: with how critically engaging with the past / one’s past can change the future (mitchell et al., ). memory studies emphasizes that our relation to the past is about how we live in the present, where memory (remembering / forgetting) entails “working through” the past to avoid repeating injustice or trauma (hodgkin & radstone, ; simon, rosenberg, & eppert, ).
memory-work refers to a set of practices, typically collaborative, that help participants connect personal memories to larger social, political or economic issues and thus work through those issues in ways that engender a deeper commitment or consciousness (haug, a, b; haug et al., ; strong-wilson et al., ). we originally coined the phrase “digital memory-work” to articulate an interrelationship between digital media and memory-work that we saw as pending yet imminent, in which digital media would be used both to explore and to represent memory-work. in so doing, we drew on insights from various fields, including the emerging field of digital memory (ernst, ), linking this with the burgeoning literature on teachers’ responsibility to meaningfully integrate digital media in classrooms. we see the potential of digital forms of memory-work to help promote teacher agency and lead to transformation in classroom practices through teachers leveraging digital tools (e.g., wider range of resources; online dialogue with a broader group of teachers) to access the past so as to investigate social in / justice. “digital critical pedagogies” is the term we have been using for teaching approaches that can move theorizing (memory-work) to practice (changes in a teacher’s pedagogy). given that the project data will primarily be in digital form, the question posed by the mje / rsÉm special issue around scholarly representation is highly germane to our thinking through of the project. how might working with the digital in the context of memory-work challenge our present boundaries around what constitutes representation in scholarship and potentially contribute to new insights in research and practice?
a preliminary question concerns the implications of setting in motion a “constellation” of memories (very possibly difficult and traumatic) through memory-work with teachers across the two country contexts. what theoretical framework(s) can support the use of digital dialogue for productive forms of remembering that can lead to social agency?
section one: multidirectional memory-work and the digital
multidirectional memory-work
multidirectional memory is michael rothberg’s ( ) alternative to a “zero-sum” (p. ) game in which memories compete for space and attention within the public sphere. in the wake of the second world war, but only really beginning in the s with the highly publicized eichmann trial, personal testimonials and stories of violent injustice began to be unleashed (rothberg, ). susannah radstone ( ) has noted the central place of holocaust memories in shaping the nascent field of memory studies. rothberg begins his second book by citing literary critic walter benn michaels’ exasperation with the public space given over to the jewish holocaust in the us holocaust museum on the mall in washington dc. what about what americans did to black people?, michaels asks. rothberg uses michaels’ observation as a starting-point for proposing a different reading of post wwii history and thus a different trajectory for memory studies. whereas rothberg’s first book ( ) focused on the study of literary representations of the holocaust, in particular those that he called “traumatic realism,” in his second book ( ), he delves more deeply into questions of representation (of what kind of story is being told and whose story is being told) by re-envisioning the holocaust through a lens of decolonization, in which the holocaust is one (albeit a central) piece of a larger canvas marked by struggles for freedom against violent injustice. how does he arrive at this point?
the key elements of his argument are germane to seeing multidirectional memory as a logical direction in which to pursue notions of cross-border, transnational productive remembering facilitated by digital means. what are those key elements? the notion of multidirectional memory is based on relatedness through juxtaposition; methodologically, it resembles pastiche in the sense that it brings together histories that might otherwise seem unlikely “bedfellows” (rothberg, , p. ). it does this by arguing first that memory, “while concerned with the past, happens in the present” (p. ). memory occupies a present space that memory studies has tended to characterize as a space of contestation (hodgkin & radstone, ), in which memories compete with one another to be seen and heard (e.g., counter-memories vs. dominant narrative ideology; counter-memory vs. counter-memory). what we do with the present space, though, suggests rothberg, is for us to imagine and re-shape; memory as “present past” is ultimately future-directed. building on that argument, rothberg argues for memory as a form of work, but on the largest possible canvas so as to allow for “dynamic form[s] of contiguity” (p. ), with memories intersecting with one another, coming from and moving into different directions. multidirectional memory is “concerned simultaneously with individual and collective memory” and has focused “on both agents and sites of memory, and especially on their interaction within specific historical and political contexts of struggle and contestation” (p. ). whereas multidirectional memory-work might be beginning to sound like memory studies’ version of multiculturalism, rothberg is careful to emphasize the specificity of histories, which remain intact; the overriding metaphor (borrowed from walter benjamin) is of elements being brought into “constellation” through being juxtaposed.
the constellation (within benjamin’s thinking) produces shock; this shock or “arrest” produced by the constellation is what can lead to consciousness and potentially, social action and change (strong-wilson, yoder & phipps, ). multidirectional memory does depend on a comparative approach to memory, but one in which difficult and traumatic memories come to the table on an equal footing; this required rothberg to come to terms with the place of the holocaust within multidirectional memory. he develops an argument, begun in his earlier book (rothberg, ), against seeing the holocaust as a unique event. based on his re-reading of key authors on the subject of the holocaust (e.g., arendt’s origins of totalitarianism, rothberg, ; but also adorno’s famous dictum that no poetry was possible after auschwitz, rothberg, ), rothberg re-positions the holocaust within the global effects of colonization and imperialism. he draws attention to the fact that holocaust memory occurred during the same period as movements for de-colonization, and that holocaust memory has played, and continues to play, a pivotal role in provoking, granting permission for, and even drawing attention to “the articulation of other histories” (p. ) that pre-date as well as post-date the holocaust itself. as such, multidirectional memory relies on both “collective” as well as “shared” memory (rothberg, , p. ). shared memory is predicated on the mediation of memory through networks of communication and refers to individuals’ communicating about memories of an event; it is built on a “division of mnemonic labor” (margalit cited in rothberg, , p. ).
following halbwachs’ classic conceptualization of memory as simultaneously individual and collective (individuals provide the “locus” for remembrance but memories are filtered through living with others and in relation to collective frameworks, rothberg, , p. ), multidirectional memory is collective in that “it is formed within social frameworks”; it is shared memory in that it is “formed within mediascapes” that depend on a division of labour (p. ). rothberg ( ) has argued that multidirectional memory goes further than either shared or collective memory in highlighting the “displacements and contingencies” (p. ) that accompany re-telling memories and where those memories take on an “affective charge” through becoming part of a larger constellation or “network of associations” (p. ). the locus of memory-work is thus shifted as the work is determined in relation to associations and triggers across contexts that cannot be anticipated or foreseen in advance (p. ): benjamin’s tensions “where thinking suddenly stops in a constellation,” giving “that constellation a shock” (benjamin, as cited in rothberg, , p. ). but what is the purpose of such multidirectional memory-work? although he writes about history, rothberg comes out of english studies. he has been primarily interested in questions of representation. the entire argument of his first book on traumatic realism and the holocaust rests on his critique of what he often refers to as narrative “continuity” (rothberg, , p. ) and that alice pitt and deborah britzman ( ) (in education) have called “lovely knowledge” (p. ). such stories are the bedtime ones with the happy, tidy ending that we may wish to hear but that, especially in relation to trauma and difficulty, we know cannot be true — and that in their inauthenticity, can be harmful and misleading. 
the key characteristic and insight of what rothberg ( ) has called “traumatic realism” (as a new genre of holocaust story) is how it wrestles with the ways in which the nazis deliberately and perversely yoked the everyday with the extreme. traumatic realism might be considered as one possible form for multidirectional memory as it depends on interrupting continuity in favour of producing benjaminian shocks. our key question then asks: what kind of pastiche story might be told by bringing together histories as diverse as canada’s and south africa’s? multidirectional memory begins with dissimilarity “since no two events are ever alike” (rothberg, , p. ). its method lies in constructing links between “disparate documents” (p. ) and thus in focusing “intellectual energy on investigating what it means to invoke connections nevertheless” (p. ; italics added). it is that “nevertheless” that discloses multidirectional memory’s reliance on the association (which is an old association) between memory and imagination and, in another leap, its being, despite its “dark subject matter,” “written under the sign of optimism” (p. ). one of the main positive goals of multidirectional memory is of “re-framing justice in a globalizing world” (fraser cited in rothberg, , p. ), thus the need for a comparative approach, like the one proposed in our memory-work project involving the digital.
multidirectional memory-work, the digital and scholarship
“e-research encompasses a disruptive set of technologies with the potential to revolutionize the social sciences,” says christine borgman ( , p. ), even as she points out that the term “new” is often bandied about but as yet rarely explained (p. ). fundamentally, scholarly communications, whether in formal settings (publications) or informal ones (conferences), “tell the story of data” (p.
) no matter what form that data takes, from biological specimens to pot-shards from an archeological dig to responses to interview questions, to digital objects and artifacts. how will we know what is new? dutton and jeffreys ( ) suggest that we take our cue from our everyday lives, where digital devices have brought about fundamental transformations in how we do things. we might expect the same for research, they maintain. the term digital scholarship encompasses research on digital media as well as scholarly communication that uses digital media, says one go-to collaborative e-source that has successfully infiltrated academe, namely wikipedia. most research is presently “on” digital media, in the sense of being “about” it. as yet, there are few examples of the use of digital media to present, or represent, research or act as a host / site for research. this dilemma was faced by one of the co-authors, who encountered insuperable challenges in the representation of her doctoral research on avatars (morrison, ), compelled at the time (by the expected format of the dissertation) to bring avatars from their virtual spaces (their screen homes) to paper. she found that studying avatars designed for dynamic use in online spaces on the static world of a printed page was akin to studying the cinematic contributions of james cameron by reading his movie scripts and ignoring the visual spectacle. as a social and shared phenomenon, digital media has become an integral part of our everyday lives (lankshear & knobel, , ). this has happened in a variety of ways (oral, visual, written) using an increasing array of devices (digital cameras, cell phones, ipods, tablets). the emerging field of “digital memory,” based on the idea of the archive as dynamic, comes out of the recognition that memory is not the same for all time and changes according to the context (huyssen, ; radstone, ).
digital memory scholars note that what distinguishes digital memory from classical notions of memory as storehouse is that the present has become more accessible and moves more quickly into becoming the digital past (ernst, ). this makes digital memory open to transformation and reinvention (bouchardon & bachimont, ) but also to being readily forgotten. as haskins ( ) points out, “large quantities of digitized materials does not translate into a usable past” (p. ). we live in a digital age of “perfect remembering” with little consciousness or discussion of how and what to remember, or how and what to forget (mayer-schonberger, ). the ubiquitous (but mobile) presence of digital media has raised challenging questions for the place of remembering and forgetting within society as well as within scholarship. as borgman ( ) points out, “many of the assumptions about content and context associated with physical artifacts and print do not hold in distributed, digital environments”; rather, “digital objects often are malleable, mutable and mobile” (p. ). to date, the most cited and debated article in the journal memory studies is connerton’s ( ) “seven types of forgetting.” this is due to his fifth type, “annulment,” which connerton ties to the rise of new media. arguing for the need for erasure, connerton states: “the concept of discarding may come to occupy as central a role in the st century as the concept of production did in the th century” (p. ). this debate was started by andreas huyssen ( ), who argued that technology threatens to dissolve the space we know as memory, while radstone ( ) has begged to differ, seeing possibility in a cultural preoccupation with, and working through of, memory through the new medium of the digital.
multidirectional memory-work would invariably involve the use of those digital tools that are already pervasive and ubiquitous and that are already the focus of shared as well as collective memories through various networks. multidirectional memory-work provides a needed focus on representation (on which story is being told, by whom and how, using which digital tools to which effect and to what end), where devising methods of multidirectional memory-work (how to approach, share and juxtapose memories across political contexts) will need to take place alongside conversations around digital representation. these conversations, we argue, are not only useful but necessary, given the shifting tides towards e-scholarship. in the section below, we briefly describe the project that is underway but move fairly quickly into discussion of key questions and issues surrounding the digital that we foresee as highly pertinent to moving our multidirectional memory-work inquiry forward in the context of digital scholarship.
section two: a digital project of multidirectional memory-work
our project focuses on the need to increase teachers’ fluency with digital media in ways that are critical and thoughtful. memory-work has proven to be highly effective in linking theorizing with practice by embedding teachers’ commitment to teaching to social in / justice first within their own histories then by sharing with other teachers (elbaz-luwisch, ; mitchell et al., ; strong-wilson, ). we are interested in the approaches that schratz and walker ( ) describe in their book research as social change, connecting the self with the social for the purpose of “reflecting-on-the-future” (wilson, , p. ). reflecting on the future involves notions of agency and “anticipatory reflection” on the teaching that is to come (wilson, , p. ), as informed by critical reflection on the past. in so doing, we locate our fieldwork within the kind of participatory forms of research that take account of dynamics of collaboration / collectivity (achinstein, ; kapoor & jordan, ) and translation into action / practice (marcos, miguel, & tillema, ). these forms include: teacher action research and scholarship of practice (cochran-smith & lytle, ; kemmis & mctaggart, ; loughran, hamilton, labosky & russell, ); memory-work methods, social autobiography and autoethnography (hasebe-ludt, chambers & leggo, ; strong-wilson, ; mitchell et al., ; strong-wilson et al., ); participatory visual methodologies (mitchell, ); and self-study methodologies (hamilton, ; kitchen & russell, ; pithouse et al., ). our project has a dual focus in that it seeks first to create digital memory-work workshops or “digital retreats” (mitchell & de lange, ) to engage primary and secondary teachers across six sites in canada and south africa in investigating social injustice / the present past and second, to support teachers’ development of digital pedagogical approaches to social injustice. the workshops are meant to adapt to a digital context the work of haug et al. ( ) and others exploring memory-work through: group selection of a topic or theme (e.g., “recall an early memory of social injustice”), digitally representing the memories, creating individual and shared digital artifacts, and group approaches to analysis of digital artifacts (e.g., what do our memory pieces have in common? how do differing national contexts / pasts play out? are there certain dominant themes? what memories / pasts are missing? what do we make of these memories? what next?). while there will be various follow-up actions to these site-based workshops, and the generation of a range of digital artefacts as data (e.g., cellphilms, i-movies, digital stories, podcasts, classroom-based social justice projects), three that are particularly pertinent to digital scholarship are:
1. a group webinar in which teachers from both countries will meet on site but digitally screen, critique and analyze their digital memory-work with one another across sites (and consider ways to take the work forward through critical digital pedagogies);
2. a teacher blog in which, using an agreed-upon sharing protocol, transnational groups of teachers will post and respond to visual and text-based examples of their digital pedagogies (teachers will also be invited to “blog” on-going reflections on their own / others’ classroom projects, reflections which will also be analyzed as digital data);
3. the creation of a digital archive.
what will be critical is attention during data collection and analysis to conceiving memory-work as multidirectional and, at the same time, wrestling with these ideas in the context of exploring the capacity of digital tools to help perform memory-work as well as represent understandings that are the result of multidirectional memory-work. beginning in january , two of the co-authors will be engaged in a pilot project, titled exploring digital approaches to multidirectional memory-work, focused precisely on this: exploring and developing these tools and understandings through professional development research workshops with teachers and teacher educators.
section three: the future conditional, or working through the “what-if” questions
questions of “doing,” and the types of data produced by researchers and / or teachers as part of the doing (e.g., cellphilms, i-movies, digital stories), raise new questions about representation and the ways in which working with digital methodologies, especially those using the autobiographical and autoethnographic, in and of themselves become central to this “doing.” we refer to this section as “future conditional” as a way to signpost the space (figuratively and otherwise) that we occupy in our project of multidirectional memory work using digital tools. we foresee questions, ones with no definitive answers, but which may help to chart a path forward. our questions are not new: some emerge from the literature and some from our previous work, in which, however, we were more focused on “the technologies of doing” and less on “the technologies of representing.” in tracing the movement from an analogue model of scholarly publishing to a digital one still coming into being, pochoda ( ) identifies several “digital affordances” that will drive change; one of these is the ability of content to more flexibly inform format:
in the procrustean print system, authors are compelled to fit their argument into the short-form article or the long-form text (itself falling within a limited spectrum of potential lengths). by contrast, the digital regime, in principle, permits publication in any length and in a wide and expanding variety of digital (as well as print) containers. (p. )
what will this new regime look like? we are not sure but we know that it will likely be different and that questions central to our own digital memory-work about the digital dialogue in and around the self and between selves are also central to our digital scholarship.
whereas digital technology began as a “sustaining innovation” for the analogue model / print-container (a model inspired by voltaire’s set of encyclopedias, pochoda, ), allowing it to perform its work more quickly and more efficiently (e.g., through scholars’ diy word formatting of their manuscripts for publishers), the digital has now definitely become a productive yet “disruptive innovation” (christensen cited in pochoda, , p. ), “one premised on digitally inspired and digitally mediated resources and perspectives introduced at every juncture of the system” (pochoda, , p. ). we have found that many issues that arise are not necessarily made explicit in the resulting scholarship, but rather become buried in the sorting out of things. this was the case for one of the authors in preparing an article for a special issue of sociology online dedicated to the inclusion of digital material as central to the representation itself. the article draws on work with community health care workers in a rural setting in kwazulu natal, south africa in which the community health care workers engage in participatory analysis in co-creating a digital archive of photovoice data related to stigma and hiv&aids (see de lange & mitchell, ). as the author comments in a set of notes produced during the writing of the article:
the challenges are not about the technology itself (i.e., creating hyperlinks or preparing the material for a digital realm in other ways). that is easy. but how do we first gain ethical clearance from the participants to have their data part of a public archive when they don’t really have any idea what a digital archive is regardless of whether it is restricted or public? and how do we make sure that we don’t misrepresent the visual data?
it is one thing for us as the research team to screen a participatory video at a conference or public event– we can set the stage- although even there, decisions get made about what images to show outside of south africa. sometimes the visual is too explicit. will this be a case of colonial cringe? (fieldnotes, may, )
framed then by this reflecting forward, we offer below the following four questions as a set of “future conditional” “what if” questions and issues. while these are by no means the only questions, they are ones that seem to be particularly critical at this present juncture in the project.
question 1: what are the challenges in addressing social justice issues through a multidirectional memory lens, across divergent geographical contexts, and using digital tools?
the central challenge is to create a productive context for the prompting of shared multidirectional memory-work across continents. in a post-apartheid era in south africa, memories pertaining to social justice issues will include lingering legacies of the past such as widespread social and economic inequities, impoverished schools, and high levels of violence and xenophobia, while in canada, social justice issues may be more related to immigration (issues such as racism and persistent social and economic inequities) and, of course, to the treatment of indigenous peoples, including legacies of residential schools. given the highly visual nature of much e-material, as well as the practicalities of working digitally across geographical distances, this work will likely invoke visual images. for instance, it may demand that participants initially engage in the process of what prosser ( ) has called “picturing atrocity” (p. ), based on the idea of photography in / of crisis.
batchen, gidley, miller, and prosser ( ) in their book picturing atrocity: photography in crisis are speaking of pictures of atrocity in public journalism, offering close readings of images depicting atrocities in the congo in the early th century (twomey, ), the “iconography of famine” (campbell, ), images of the civil rights movement in the us (abel, ), through to the mushroom cloud of hiroshima (hariman & lucaites, ). their work anticipates, we would argue, the types of digital representations that might also be produced in digital photovoice and participatory video projects as well as the kinds of images that teachers may have accepted on faith as trustworthy. if so, these visual representations are likely to bring particular demands in terms of critical engagement (e.g., as prompts for discussion) and then to be re-represented multidirectionally, perhaps through the creation (within and across geographical contexts) of “dialectical images,” which rely on juxtaposition to instigate a shock or “standstill in a constellation saturated with tensions” (benjamin in abbas, , p. ). at the same time these images may be framed as what brown and phu ( ) refer to in their book of the same name as “feeling photography.” dialectical images are often staged images (e.g., in canada, of the photograph of a photographing of the royal canadian mounted police standing alongside a first nations chief in front of a teepee; see simon, , p. ). in their analysis of images produced in community-based research in rural south africa, mitchell, de lange, stuart, moletsane & buthelizi ( ) highlight the ways in which photos, especially those that are “staged,” can be particularly provocative, raising questions about what should be used in public contexts. a photo on stigma “staged” by a group of grade nine boys, for example, depicted a boy committing suicide.
their caption for the photo, “suicide,” read: “he can’t accept that the hiv is positive. he feels he has to commit suicide because he would not like to tell people that he has aids” (p. ). batchen et al. ( ) make the argument that photographs of atrocity (and we would argue that “suicide” is an example) carry with them “a particular set of ethical responsibilities” (p. ). while the authors are speaking more of media representations produced by professional journalists as opposed to community researchers using digital tools, we would suggest that the same rules should apply:
the media (photographer) has a responsibility to contextualize and caption the atrocity photography correctly. we have a responsibility to read the image closely, perhaps not immediately to trust what we see in the image. if an atrocity has been committed, someone is responsible. this matter of responsibility gave rise to the first humanitarian campaigns that worked with atrocity photographs. do we also have a responsibility to respond to the photograph beyond simply reading it? what is the question that atrocity photographs ask of us? (batchen et al., , p. )
susan sontag ( ) makes a similar argument in regarding the pain of others when she observes: “narratives can make us understand. photographs do something else; they haunt us” (p. ). what we need to anticipate then is discussion and contextualization of images by teachers engaged in “digital dialogue” with one another across geographical contexts, and where images would serve as only one kind of prompt that would lend itself to digital dialogue; others would be films, popular culture, objects (viz., pictures of objects) as well as writing, including literary writing, by published authors and / or by the teachers (strong-wilson et al., a).
question 2: how do we interpret the presence of multimedia in our scholarship on multidirectional memory-work?
related to the first question, when data is collected, archived, analyzed and disseminated through multimedia / digital forms, the tendency may be to privilege these accounts as more truthful or trustworthy, based on the positive social prejudice towards digital formats which are associated with relevancy and innovation. in the wake of poststructuralist frameworks, we know that truth is relational and that words, representations, and subjects are unstable and often contradictory. with autobiographical / autoethnographic research, we are also dealing with the subjectivity of lived experience. lived experience as refracted through a multimedia format may seem to provide more direct and immediate access to experience: a first-hand, witness account. we need to be careful not to take the image / visual at face value as evidence of truth, and instead contextualize it as a version of an event or experience, which we see as central to multidirectional forms of memory. we need to begin from the premise that just like print text, multimedia data forms are value laden, are subject to interpretations as diverse as those who view / listen / experience them, and may even be commercially or politically driven (e.g., by relying on particular programs or software). also implicit in media constructs are power structures imported from the social and cultural contexts within which they exist (fiske, ), which includes the power to access particular media and technologies. what this implies in multidirectional memory-work using the digital is the need to foreground process and participatory approaches to data collection and interpretation / analysis. a foregrounding of process would involve the documentation, theorizing as well as engaging of the participant in reflection on the “construction scars” (pinar & pautz, ) involved in working in / with the past before these traces disappear into the final work. using the digital, this would mean using the blog to good effect.
through participatory processes of engaging with one another’s digital “data” (viz., through memory-work across geographical contexts), teachers can be invited into a collective process of interpretation, similar to métissage (hasebe-ludt et al., ), in which the teacher authors construct narratives out of the pieces of their “pasts,” read and critique one another’s pieces and in which, in a digital prologue or epilogue, they reflect on the outcome as well as process. a scholarly article may also take the form of a teacher blog, in which teachers show, for instance, the process by which digital memory-work was transformed into digital pedagogies, or how digital dialogue across transnational contexts informed the creation of particular digital pedagogies.
question : what is the relationship between the autobiographical and autoethnographic, and use of the web as public sphere for multidirectional memory?
one of the key issues emerging from digital scholarship is the ephemeral and mutable character of digital media, the fact that digital records cannot survive by “benign neglect,” and therefore need to be curated (borgman, , p. ). even as we worry over the future of its traces, because they are embedded within a distributed sphere (the web), they can be keyword searched (another digital affordance) and thus become permanent, un-erasable (mayer-schönberger, ), and redistributed in another context remote from multidirectional memory, or even misused. once the data becomes saved in digital format, even if password-protected, might it become accessed anywhere-anytime (e.g., through being shared by teachers with others)? how much depends on an individual’s ability or desire (or prior knowledge) to establish privacy settings?
as jones ( ) points out, existing paradigms of the relationship between media and memory and associated theoretical models are “inadequate for understanding the profound impact of the supreme accessibility, transferability and circulation of digital content: on how individuals, groups and societies come to remember and forget” (garde-hansen et al., , p. ). (p. ) another question is: what will teachers themselves consider as ephemeral and as “collectible” and why? in raising this last question, we identify concerns about the private and the public. the adding of hyperlinks, for instance, is google’s attempt at a cultural institute: making available through virtual museums the last century’s historical and cultural events, archived photos, manuscripts, letters and first-hand video testimonials. the google “world wonders project,” which links street view technology with unesco world heritage sites, represents an extension of that project. what are the implications of using digital forms of memory-work (which though collective, begin with the private and autobiographical) for digital pedagogies which are necessarily shared and public? will we inadvertently be contributing to the creation of a virtual museum of the personal? and if yes, what will be the implications of this for future generations? we do not yet have answers to these questions, beyond creating password-protected sites.
question : what are some of the new ethical challenges associated with digital representation in multidirectional memory-work?
there is perhaps no issue that is receiving more attention currently than the ethics of self-representation in a digital age, particularly in the context of “selfies” and online bullying. while much of this work takes place within a diy culture, what happens when it is part of a data-gathering project?
what are the responsibilities of the researchers to safeguard participants, and in the case of the teachers as consenting adults, what should be the guiding principles? to gain ethical clearance for research projects from university ethics boards, it is customary to make a commitment to protect participants by ensuring confidentiality and anonymity. however, when participants are producers of digital artifacts such as online videos or blogs, they might well choose to “go public” as the makers or authors of their work (as discussed in the previous section). but what about others, such as participants’ family members or former teachers who might be identifiable in digital memory-work artifacts even if their names or faces are not made public? how will researchers and participants address the blurring of the lines between their roles in the research, a challenge in participatory approaches to research, but with particular dilemmas when dealing with digital data and artifacts? another concern is that memory-work, especially when focused on issues of social injustice, can elicit painful stories of the past that can be traumatic for those who lived through the distressing experiences and for those who are hearing about or seeing these reconstructed memories (see for example, masinga, ; mitchell, ; de lange & mitchell, ). while we have developed strategies for attending to the possible emotional consequences of memory-work in our face-to-face work with teachers (pithouse et al., ), the public and essentially uncontainable nature of digital scholarship presents us with new, somewhat unpredictable challenges. what does this mean in relation to traditional forms of academic dissemination (even those making provision for digital scholarship) and the everyday uses that participants might want to make of their own digital self-representations?
who will “own” the digital artifacts that are produced? with this blurring of the lines come questions of ownership, for instance, with respect to copyright and distribution. we live in a “share culture” in which the teachers involved in research projects may very well wish to share their artifacts (and perhaps those of others) with colleagues, friends, and family as well as post them online on sites accessible to many others. while we do not see this as problematic (and even potentially highly desirable), we acknowledge that when the boundaries become widened, it can be a challenge to locate impact and track distribution of the research. as “ephemera,” the artifacts may potentially pass beyond the ken of the researcher. new ways to engage with distribution and archive may need to be devised in light of such digital, participatory memory-work research. the digital setting of the research can be an occasion for addressing digital dialogism as the scene for a multiplicity of voices and perspectives writ large. as we know from bakhtin ( ), dialogical texts can help us understand relations in ways that are not mechanical as they avoid authorial finality. digital dialogical texts can further blur the lines to allow for multiple, non-subordinated perspectives. gubrium and harper ( ) highlight the potential of dialogic editing, something that we see as being further enhanced through access to google docs and other digital platforms. hence, we see the involvement of the teacher participants as crucial in developing appropriate, context-sensitive ethical guidelines for the project. we anticipate that the ethics of the project will be the subject of an ongoing conversation with our participants as the project evolves and new ethical dilemmas must be attended to. thus, a critical and self-reflexive study of the ways in which ethical issues play out will be a key aspect of our project.
conclusion
matthews and aston ( ) maintain that multimedia (such as, but not limited to audio, video, and digital image) is much more than a simple tool for recording and documenting research in the humanities and social sciences. it is the primary research output. memory, as benjamin suggests (in the opening quote), is itself a medium for sharing and communication. multidirectional memory depends on the critical and creative generation that comes about through “constellation” across tensions. in this paper, we have sought to bring to bear the pending digital platform for scholarship to multidirectional approaches to memory-work for social justice, as we see these movements as in productive tandem but accompanied by the need to “reflect forward” on challenging questions immediately ahead. what story do we want our data to tell?
notes
1. idle no more is a grassroots social movement of in canada that was initiated in and galvanized significant ongoing public attention to pressing social and political issues affecting indigenous people in canada.
2. one of the current conversations that studies this shared history is located within a year sshrc and idrc joint-funded partnership (mitchell & moletsane, - ) called “networks for change and well-being: girl-led ‘from the ground up’ policy making to address sexual violence in canada and south africa.”
3. session one: collective remembering & social justice issues; session two: working with memories (including ethical issues around memory-work, the digital, & teacher collaboration); session three: digital memory-work part i; session four: digital memory-work part ii; session five: viewing & critiquing digital productions; and session six: envisioning theory to practice.
4. our plan is to collect data based on the teachers’ digital artifacts (e.g., digital stories, i-movies, cellphilms, etc.)
as well as documentation of the teachers’ process (individual and collective) of working with / through the past using digital forms of memory-work, the teacher blog, and the creation of a digital archive composed of data from the project as well as links to pertinent websites. we will use nvivo to work with digital data across sites as well as draw on digitizing coding methods informed by participatory analysis so that teachers can be invited into the data analysis process through individual and group coding.
references
abbas, a. ( ). on fascination: walter benjamin’s images. new german critique, , - .
abel, e. ( ). history at a standstill: agency and gender in the image of civil rights. in g. batchen, m. gidley, n. k. miller, & j. prosser (eds.), picturing atrocity: photography in crisis (pp. - ). london, uk: reaktion books.
achinstein, b. ( ). conflict amid community: the micropolitics of teacher collaboration. teachers college record, ( ), – .
assmann, a. ( ). cultural memory and western civilization: functions, media, archives. new york, ny: cambridge university press.
bakhtin, m. ( ). problems of dostoevsky’s poetics. minneapolis, mn: university of minnesota press.
batchen, g., gidley, m., miller, n. k., & prosser, j. (eds.). ( ). picturing atrocity: photography in crisis. london, uk: reaktion books.
benjamin, w. ( ). excavation and memory. in m. bullok & m. k. jennings (eds.), walter benjamin: selected writings, volume ( - ). london, uk: cambridge.
borgman, c. l. ( ). scholarship in the digital age: information, infrastructure and the internet. cambridge, ma: mit press.
bouchardon, s., & bachimont, b. ( ). preservation of digital literary words: another model of memory? retrieved from http://www.utc.fr/~bouchard/wp-content/uploads/ / / - -bouchardon-bachimont-epoetry .pdf
brown, e. h., & phu, t. (eds.). ( ). feeling photography. durham, nc: duke university press.
campbell, d.
( ). the iconography of famine. in g. batchen, m. gidley, n. k. miller, & j. prosser (eds.), picturing atrocity: photography in crisis (pp. - ). london, uk: reaktion books.
cochran-smith, m., & lytle, s. ( ). inquiry as stance: practitioner research for the next generation. new york, ny: teachers college press.
connerton, p. ( ). seven types of forgetting. memory studies, ( ), - .
de lange, n., & mitchell, c. ( ). community health workers working the digital archive: a case for looking at participatory archiving in studying stigma in the context of hiv and aids. sociological research online, ( ), – . retrieved from http://www.socresonline.org.uk/ / / .html
dutton, w. h., & jeffreys, p. w. ( ). worldwide research. cambridge, ma: mit press.
elbaz-luwisch, f. ( ). auto/biography and pedagogy: memory and presence in teaching. new york, ny: peter lang.
ernst, w. ( ). digital memory and the archive. minneapolis, mn: university of minnesota press.
fiske, j. ( ). media matters: race and gender in us politics. minneapolis, mn: university of minnesota press.
garde-hansen, j., hoskins, a., & reading, a. ( ). introduction. in j. garde-hansen, a. hoskins, & a. reading (eds.), save as… digital memories (pp. - ). basingstoke, uk: palgrave macmillan.
gubrium, a., & harper, k. ( ). participatory visual and digital methods. walnut creek, ca: left coast press.
hamilton, m. l. (ed.). ( ). reconceptualizing teaching practice: self-study in teacher education. london, uk: falmer press.
hariman, r., & lucaites, j. ( ). the iconic image of mushroom cloud and the cold war nuclear optic. in g. batchen, m. gidley, n. k. miller & j. prosser (eds.), picturing atrocity: photography in crisis (pp. - ). london, uk: reaktion books.
hasebe-ludt, e., chambers, c., & leggo, c. ( ). life writing and literary métissage as an ethos for our times. new york, ny: peter lang.
haskins, e. ( ). between archive and participation: public memory in a digital age. rhetoric society quarterly, , - .
haug, f. ( a).
memory work. australian feminist studies, ( ), - .
haug, f. ( b). memory-work: a detailed rendering of the method for social science research. in a. e. hyle, m. ewing, d. montgomery, & j. s. kaufman (eds.), dissecting the mundane: international perspectives on memory-work (pp. - ). new york, ny: university press of america.
haug, f., andresen, s., bunz-elfferding, a., hauser, k., lang, u., laudan, m., … meir, u. ( ). female sexualization: a collective work of memory. london, uk: verso.
hodgkin, k., & radstone, s. ( ). introduction. in k. hodgkin & s. radstone (eds.), contested pasts: the politics of memory (pp. - ). london, uk: routledge.
hoskins, a. ( ). the end of decay time. memory studies, ( ), – .
huyssen, a. ( ). twilight memories: marking time in a culture of amnesia. new york, ny: routledge.
jones, s. ( ). catching fleeting memories: victim forums as mediated remembering communities. memory studies, ( ), – .
kapoor, d., & jordan, s. (eds.). ( ). education, participatory action research, and social change: international perspectives. new york, ny: palgrave macmillan.
kemmis, s., & mctaggart, r. ( ). communicative action and the public sphere. in n. denzin & y. lincoln (eds.), the sage handbook for qualitative research (pp. - ). thousand oaks, ca: sage.
kitchen, j., & russell, t. (eds.). ( ). canadian perspectives on the self-study of teacher education practices. retrieved from http://www.csse-scee.ca/docs/cate_book_on_self-study_% % _% % .pdf
lankshear, c., & knobel, m. ( ). new literacies: everyday practices and classroom learning. berkshire, uk: open university press.
lankshear, c., & knobel, m. (eds.). ( ).
digital literacies: concepts, policies and practices. new york, ny: peter lang.
loughran, j., hamilton, m., laboskey, v., & russell, t. (eds.). ( ). international handbook of self-study of teaching and teacher education. toronto, on: kluwer.
marcos, j. j. m., miguel, e. s., & tillema, h. ( ). teacher reflection on action: what is said (in research) and what is done (in teaching). reflective practice, ( ), – .
masinga, l. ( ). journeys to self-knowledge: methodological reflections on using memory-work in a participatory study of teachers as sexuality educators. journal of education, , - .
matthews, p., & aston, j. ( ). interactive multimedia ethnography: archiving workflow, interface aesthetics and metadata. acm journal on computing and cultural heritage, ( ), article .
mayer-schönberger, v. ( ). delete: the virtue of forgetting in the digital age. princeton, nj: princeton university press.
mitchell, c. ( ). doing visual research. london, uk: sage.
mitchell, c., & de lange, n. ( ). what can a teacher do with a cellphone? using participatory visual research to speak back in addressing hiv&aids. south african journal of education, ( ), - .
mitchell, c., de lange, n., stuart, j., moletsane, r., & buthelezi, t. ( ). children’s provocative images of stigma, vulnerability and violence in the age of aids: revisualizations of childhood. in n. de lange, c. mitchell, & j. stuart (eds.), putting people in the picture: visual methodologies for social change (pp. - ). rotterdam, netherlands: sense publishers.
mitchell, c., strong-wilson, t., pithouse, k., & allnutt, s. ( ). memory and pedagogy. new york, ny: routledge.
o’reilly-scanlon, k. ( ). she’s still on my mind: teachers’ memories, memory-work and self-study (unpublished doctoral dissertation). mcgill university, montreal, qc.
pinar, w., & pautz, a. ( ). construction scars: autobiographical voice in biography. in c. krindel (ed.), writing educational biography: explorations in qualitative research (pp. - ).
new york, ny: garland publishing.
pinar, w. ( ). the character of curriculum studies: bildung, currere, and the recurring question of the subject. new york, ny: palgrave macmillan.
pithouse, k., mitchell, c., & moletsane, r. (eds.). ( ). making connections: self-study & social action. new york, ny: peter lang.
pitt, a., & britzman, d. ( ). speculations on qualities of difficult knowledge in teaching and learning: an experiment in psychoanalytic research. qualitative studies in education, ( ), - .
pochoda, p. ( ). the big one: the epistemic system break in scholarly monograph publishing. new media and society, ( ), – .
prosser, j. ( ). introduction. in g. batchen, m. gidley, n. k. miller & j. prosser (eds.), picturing atrocity: photography in crisis (pp. - ). london, uk: reaktion books.
radstone, s. ( ). introduction. in s. radstone (ed.), memory and methodology (pp. - ). oxford, uk: berg.
rothberg, m. ( ). traumatic realism: the demands of holocaust representation. minneapolis, mn: university of minnesota press.
rothberg, m. ( ). multidirectional memory: remembering the holocaust in the age of decolonization. stanford, ca: stanford university press.
schratz, m., & walker, r. ( ). research as social change: new opportunities for qualitative research. london, uk: routledge.
simon, r. i. ( ). teaching against the grain: texts for a pedagogy of possibility. toronto, on: oise press.
simon, r., rosenberg, s., & eppert, c. (eds.). ( ). between hope and despair: pedagogy and the remembrance of historical trauma. maryland, md: rowman & littlefield.
sontag, s. ( ). regarding the pain of others. new york, ny: farrar, straus & giroux.
strong-wilson, t. ( ). bringing memory forward: storied remembrance in social justice education with teachers.
new york, ny: peter lang.
strong-wilson, t., mitchell, c., allnutt, s., & pithouse-morgan, k. ( ). productive remembering and social agency. rotterdam, netherlands: sense publishers.
strong-wilson, t., mitchell, c., morrison, c., radford, l. & pithouse-morgan, k. ( ). looking forward through looking back: using digital memory-work in teaching for transformation. in l. thomas (ed.), becoming teacher: sites for teacher development in canadian teacher education (pp. - ). http://www.csse-scee.ca/associations/about/cate-acfe
strong-wilson, t., yoder, a., & phipps, h. ( ). going down the rabbit-hole: teachers’ engagements with ‘dialectical images’ in canadian children’s literature on social justice. changing english, ( ), - .
twomey, c. ( ). severed hands: authenticating atrocity in the congo, - . in g. batchen, m. gidley, n. k. miller, & j. prosser (eds.), picturing atrocity: photography in crisis (pp. - ). london, uk: reaktion books.
unesco. ( ). ict competency standards for teachers. paris, france: unesco.
wegerif, r. ( ). a dialogic understanding of the relationship between cscl and teaching thinking skills. international journal of computer-supported collaborative learning, ( ), - .
wilson, j. p. ( ). reflecting-on-the-future: a chronological consideration of reflective practice. reflective practice, ( ), - .
teresa strong-wilson is associate professor in the faculty of education at mcgill university and editor-in-chief of the mcgill journal of education / revue des sciences de l’éducation de mcgill. she has interests across the fields of: memory, literacy / literacies, the digital, stories, early childhood, children’s literature, social justice education, indigenous education, curriculum theory and teacher education.
she has published extensively in these areas through several peer-reviewed forums: books (authored, co-authored and co-edited) and journals, including changing english, children’s literature in education, educational theory, and teachers and teaching.
claudia mitchell is a james mcgill professor in the faculty of education, mcgill university and an honorary professor in the school of education, university of kwazulu-natal. her research cuts across several areas including visual and other participatory methodologies in relation to working with youth to address gender and sexuality, girlhood, teacher identity, and critical areas of international development linked to gender and hiv and aids. she is the co-founder and editor of girlhood studies: an interdisciplinary journal.
connie morrison holds a ph.d. in education from memorial university newfoundland where she conducted research on avatars and the cultural politics of online girlhood identity. she has designed and taught a wide range of undergraduate and graduate courses under the umbrellas of curriculum, literacy and popular culture. her teaching pedagogies are informed by principles of social justice. she has been an assistant professor in mcgill’s department of integrated studies in education, the co-editor of english quarterly, and is currently a visiting assistant professor at memorial university. she is the author of who do they think they are? teenage girls and their avatars in spaces of social online communication ( ).
linda radford is a lecturer at the university of ottawa’s faculty of education. her research focuses on social media, changing literacies and the place of digital technologies in teacher education. recent and ongoing projects include the empowerment of marginalized youth through engaging literacies in critical ways, the development of an urban education community program, and assessing change practice initiatives through university-ministry partnerships.
she has published in several peer-reviewed venues: journals (e.g., changing english) and book chapters.
kathleen pithouse-morgan is senior lecturer in teacher development studies in the school of education, university of kwazulu-natal, south africa. she has served as secretary of the self-study of teacher education special interest group and is lead investigator of the nrf-funded transformative education / al studies (tes) project. kathleen is first editor of making connections: self-study & social action (pithouse, mitchell, & moletsane, ) and co-editor of teaching and hiv & aids (mitchell & pithouse, ), memory and pedagogy (mitchell et al., ), and productive remembering and social agency (strong-wilson et al., ). email: pithousemorgan@ukzn.ac.za
teresa strong-wilson est professeur agrégée à la faculté des sciences de l’éducation de l’université mcgill et rédactrice en chef de la revue des sciences de l’éducation de mcgill. elle s’intéresse à une variété de domaines tels que la mémoire, la littératie / les littératies, le numérique, les histoires, la petite enfance, la littérature jeunesse, l’éducation à la justice sociale, l’éducation des autochtones, l’étude des programmes et la formation des enseignants. elle a publié un grand nombre de textes dans ces domaines et dans des ouvrages avec comités de révision par les pairs : livres (écrits, co-écrits et publiés conjointement) et revues, incluant changing english, children’s literature in education, educational theory, et teachers and teaching.
claudia mitchell est professeur james mcgill et travaille au sein de la faculté des sciences de l’éducation de l’université mcgill. elle est aussi professeur honoraire à la faculté d’éducation de l’university of kwazulu-natal.
ses recherches couvrent un éventail de champs d’intérêts incluant les méthodologies visuelles et les autres types de méthodologies participatives en lien avec le travail auprès des jeunes pour discuter de genre et de sexualité, de l’enfance chez les filles et d’identité enseignante. elle s’intéresse également aux domaines critiques du développement international en lien avec le genre et le vih / sida. elle est la co-fondatrice et la rédactrice de la revue girlhood studies: an interdisciplinary journal.
connie morrison est détentrice d’un doctorat en éducation de la memorial university située à terre-neuve. pour compléter sa thèse, elle a piloté un projet de recherche portant sur les avatars et les politiques culturelles régissant l’identité virtuelle des filles. elle a créé et enseigné une grande variété de cours, portant sur les programmes, la littératie et la culture populaire. ses pratiques pédagogiques s’inspirent de principes de justice sociale. elle a été professeur adjoint au département d’études intégrées en sciences de l’éducation de l’université mcgill, corédactrice du english quarterly et effectue actuellement une résidence à la memorial university à titre de professeur adjoint. elle est l’auteur de l’ouvrage who do they think they are? teenage girls and their avatars in spaces of social online communication ( ).
linda radford est chargée de cours à l’université d’ottawa au sein de la faculté d’éducation. ses recherches portent sur les médias sociaux, les nouvelles formes de littératies et la place des technologies numériques dans la formation des enseignants. ses projets récents et en cours comportent la responsabilisation des jeunes marginaux par une interaction critique avec la littératie, le développement d’un programme d’éducation communautaire en milieu urbain et l’évaluation des initiatives de changement à l’aide de partenariats universités-ministères.
elle a publié des textes dans divers ouvrages révisés par des pairs tels que des revues universitaires (p. ex. changing english) et des livres.
kathleen pithouse-morgan est chargée de cours senior en teacher development studies à la faculté de l’éducation de l’university of kwazulu-natal, située en afrique du sud. elle a été secrétaire du self-study of teacher education special interest group et est chercheur principal du projet nrf-funded transformative education / al studies (tes). kathleen est la première rédactrice de making connections: self-study & social action (pithouse, mitchell et moletsane, ) et la corédactrice de teaching and hiv & aids (mitchell et pithouse, ), memory and pedagogy (mitchell et al., ) et productive remembering and social agency (strong-wilson et al., ). son adresse courriel est pithousemorgan@ukzn.ac.za
viewpoint
exploring the utility of an emerging altmetric platform: a swot analysis of plum analytics
introduction
plum analytics is an altmetric tool that provides novel documentation of research usage, reach and impact. by drawing from scholarly sources, as well as media channels, blogs and social media, plum analytics builds upon, extends and advances traditional measures of citation. in recent years, the platform has become a primary means for scholars and universities to garner altmetric data regarding the public significance of research (tucker, a, b). while research on altmetric platforms has flourished in recent years (bawden, ), research focused specifically on the plum platform is still emerging. this report opens a new avenue for research by examining plum analytics’ contributions to the growing landscape of digital documentation. the report is organized into the following sections: section presents an overview of plum analytics. section analyzes plum’s primary strengths, weaknesses, opportunities, and threats (swot).
section discusses the theoretical, practical, and social significance of the research, while posing questions for future studies. section offers conclusions. . an overview of the plum analytics plum analytics categorizes research impact into five clusters. the clusters unite to create the “plum print,” a visual representation located next to a citation or abstract (figure ). the “plum print” for each research artifact assigns a different color to each type of data: green for usage, purple for captures, yellow for mentions, blue for social media and orange for citations. each component of the visual representation expands based on the number of data points in each category (lindsay, ), for example, if usage is the most robust category of impact, the green node is the biggest component of the plum print; if captures is the largest category of impact, the purple node is the biggest component of the plum print (figure ). . five dimensions of data: the plum print the five dimensions of data illustrated through the “plum print” are described below: ( ) usage: usage data provides a number of who has clicked on, downloaded, viewed, played or placed a library request for a published document. ( ) captures: capture data provides information on who has saved a file, including bookmarks, code forks, favorites, readers and watchers. ( ) mentions: mention data measures how the data from an article has been engaged in other articles, including blog posts, comments, reviews, wikipedia entries and news media. ( ) social media: social media data measures tweets, facebook likes and social media references to the file. swot analysis of plum analytics digital library perspectives vol. no. / , pp. - © emerald publishing limited - doi . /dlp- - - http://dx.doi.org/ . 
/dlp- - - ( ) citations: citation data includes citation counts from traditional citation indexes like scopus and also provides indications of societal impact, including citation indexes, patent citations, clinical citations and policy citations. . data aggregation: plumx plum analytics houses a data aggregator, plumx, that imports researcher data and artifact data from google scholar, orcid, vivo and institutional repositories (rathemacher, ). plumx locates and compiles data on multiple types of artifacts, including articles, case studies, abstracts, books, book chapters, data sets, videos from youtube, slideshare, vimeo, figshare, github, audio recordings, figures, government documents, images, musical scores and maps (collister and deliyannides, ; lindsay, ). articles can be identified by a doi, pubmed id, isbn, url and/or patent numbers (rathemacher, ; torres-salinas et al., ). artifacts are then linked to the researchers who created them. because an article may appear on multiple platforms, plumx ensures that metrics for all versions of the same item are counted and presented together (rathemacher, ). data produced through the plum interface is available as: � individual artifact data; and � institutional aggregate data. figure . plum print embedded within a citation figure . five dimensions of the plum print dlp , / . artifact data and aggregate data . . individual artifact data. plumx reports relevant metrics on an article in a single interface (lindsay, ). the data can be displayed on directories, dashboards, widgets and application program interfaces (rathemacher, ). plumx provides three different reports, including: ( ) artifacts by publication year; ( ) sunbursts; and ( ) artifact overviews (figure ). from an individual artifact, a user can create an “embed widget” that can be used to showcase plum altmetrics on any website (figure ). . . institutional aggregate data. plumx profiles are available for researchers and institutions. 
from the institution screen, users can find researchers who are members of a given university. users can then access different options to filter data by individual researcher, artifact or artifact type (lindsay, ) and can view the impact of a given researcher or an entire institution with customizable graphics. other functions include: plumx + grants, plumx funding opportunities and plumx benchmarks. the plumx + grants section matches institutions and researchers with a grant database and provides assessments of past performance, including successful application outcomes. the plumx funding opportunities section enables researchers to search for new grants. the plumx benchmarks section allows for a comparison of grant outcomes across institutions (lindsay, ).

figure . plum x artifact overview
figure . data viewable from plum’s embedded widget

. swot analysis of plum analytics

as the popularity of plum analytics continues to grow (crosby, ), increased scholarly study of the platform is warranted. to such ends, a descriptive swot analysis of the site was conducted (for an overview of the swot analytic framework, see helms and nixon, ; pickton and wright, ; williams, ). the primary goal of this approach is to provide useful information for key stakeholders, including adopters (i.e. colleges, universities and other academic institutions), users (i.e. librarians, researchers and information specialists) and developers looking to improve the functionality of the platform. key strengths and weaknesses are detailed below.

. strengths

• plum metrics are derived from a broad range of platforms, with a greater reach than other altmetric providers (tucker, a, b). web-based sources, including blogs, academic social networks and internet-based news sources, are incorporated into plum analytics.
these measures allow researchers to demonstrate the impact of web-native research in addition to more traditional scholarly outputs (rathemacher, );
• plum analytics tracks multiple types of research sources, including books, book chapters, posters and articles. while other platforms focus primarily on the impact of scholarly journals, plumx is particularly well suited to assess the broad impact of books by using algorithms that aggregate data generated from multiple isbn variations (torres-salinas et al., );
• plum analytics integrates large data sets across platforms and sources quickly and seamlessly. in comparison with other products, including the web of science and scopus, which limit the number of items that can be downloaded, plum analytics allows for simple entry of large data sets (torres-salinas et al., ). additionally, plumx provides better coverage of mendeley readers than altmetric.com (ortega, );
• plum analytics showcases more timely assessments of research impact than traditional metrics. as hillary corbett, director of scholarly communications and digital publishing at northeastern university, observes, “altmetrics allow scholars to create a more complete picture of how their work is being accessed and used from the moment of publication – and sometimes years before traditional metrics would show any impact” (rathemacher, );
• altmetrics encourage a renewed focus on public engagement. as the value of scholarship is amplified through digital networks, blogs, news outlets and social media, altmetrics provide empirical documentation of connections between scholars, academic research and public audiences (williams, a);
• plum analytics helps contextualize scholarly work.
while some altmetric products use only a numerical scoring system, plumx uses multiple forms of qualitative, descriptive and comparative data to illustrate various dimensions of impact (collister and deliyannides, );
• plum analytics facilitates deeper exploration of research by providing direct links to the digital platforms and external settings in which research appears (rathemacher, );
• plum analytics is based on algorithms that can lessen errors introduced by manual data entry processes;
• plum analytics has a large user base. as of october , plum analytics had . million individual pieces of research output and had accrued . billion individual researcher interactions (crosby, ). plum analytics is also quickly expanding its presence in the academic publishing marketplace. in , plum analytics originated as a provider of alternate metrics for measuring the impact of research. by , plum analytics was operating with elsevier, a host that provides digital solutions and services for research and design, to further promote the prominence of data analytics (tucker, c). by connecting with elsevier, plum analytics is now available to elsevier’s clients, as well as scopus, science direct and mendeley; and
• the visual depiction of data presented through the plum print allows users to quickly assess impact in multiple domains. visual depictions are extremely powerful. they draw in users’ attention, aid users’ evaluation of information and enhance users’ retention of knowledge (williams and woodacre, ).

. weaknesses

• plum metrics account for research from traditional journals, as well as from sources that have not been peer reviewed; therefore, plum metrics are not a replacement for traditional metrics (williams, b, ).
rather, it is best to use plum data in combination with traditional metrics, including journal impact factors and citation counts;
• while altmetrics are promising, they have not yet replaced traditional measures of scholarly impact (lindsay, );
• plum analytics faces some of the same problems as traditional citation metrics, including author disambiguation and lack of regulation (brigham, );
• altmetric tools such as altmetric.com, plumx and crossref event data (ced) can yield discrepant counts of metrics in their data, which may, in turn, cast doubt on the reliability of these tools for measuring impact (ortega, );
• the five dimensions of plum analytics are not always mutually exclusive. because of potential overlap between the five domains, the validity of the measures may be compromised (torres-salinas et al., );
• plum output data does not precisely match plum input data. for example, torres-salinas et al. ( ) found that not all output records conserve the original isbn input numbers, thereby leaving researchers to manually verify the accuracy of results;
• data processing speeds may be slowed if the size of the data sample is very large. for samples with fewer than , items, the processing is normally completed within three hours (torres-salinas et al., ); however, processing time increases as sample size increases;
• subscription to plum analytics is not free. costs can be prohibitive and lessen the likelihood of adoption (lapinski et al., );
• plumx data are made available to institutions in exchange for access to their institutional repository usage data (rathemacher, ), which can, in turn, compromise data security and information privacy; and
• plumx tools are not directly available to individual researchers. currently, they are only available to scholars through university-wide subscriptions (lapinski et al., ).
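the manual isbn verification noted in the weaknesses above could be partly scripted. the sketch below is a minimal illustration and not part of plumx or its api: it assumes the isbns submitted and the isbns found in an exported result set are already available as plain lists of strings. the function names `normalize_isbn` and `missing_isbns` and the sample values are hypothetical. it reports which submitted isbns do not appear in the output, so that only those records need checking by hand.

```python
def normalize_isbn(raw: str) -> str:
    """Strip hyphens and spaces and upper-case a trailing 'x' check character."""
    return raw.replace("-", "").replace(" ", "").upper()


def missing_isbns(submitted, returned):
    """Return the submitted ISBNs (normalized, in input order) absent from the output.

    Assumption: both arguments are iterables of ISBN strings. No conversion
    between ISBN-10 and ISBN-13 is attempted, so variant forms of the same
    book are treated as different identifiers.
    """
    returned_set = {normalize_isbn(r) for r in returned}
    seen = set()
    missing = []
    for raw in submitted:
        isbn = normalize_isbn(raw)
        if isbn not in returned_set and isbn not in seen:
            missing.append(isbn)
            seen.add(isbn)
    return missing


if __name__ == "__main__":
    # Hypothetical sample data: the first two entries are two forms of one book.
    submitted = ["978-0-306-40615-7", "0-306-40615-2", "978 1 4028 9462 6"]
    returned = ["9780306406157", "9781402894626"]
    print(missing_isbns(submitted, returned))
```

with the sample values, the isbn-10 form `0306406152` is flagged even though its isbn-13 equivalent was returned. a plain string comparison therefore narrows the manual check but does not resolve isbn variations on its own.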
as plum analytics continues to grow and develop, it becomes essential to consider the opportunities and threats that the site may face in the future. some of the primary opportunities and threats on the horizon are presented below.

. opportunities

• altmetrics, like those produced by plum analytics, can be used by researchers to showcase the public significance and impact of their work (williams, a, c);
• plum analytics provides individual researchers with data they can use to bolster cvs, job applications and tenure and promotion dossiers. making these resources free for researchers to adopt and use could sustain the future growth of plum analytics;
• plum analytics captures networked scholarship that is leveraged by scholars to build their scholarly brands and strengthen their scholarly identities (williams and woodacre, ; williams, ). this asset could be further promoted to researchers as a means of increasing popularity and use;
• in the future, plum could continue to increase the number of measures and impact indicators it offers. as of , plumx delivered measures, whereas altmetric.com delivered only . this has been viewed as a strategic advantage that plumx holds over altmetric.com, which will need to be maintained over time to secure a position in the academic publishing market (torres-salinas et al., );
• to enhance bibliographic data and account for isbn variations between inputs and outputs, plumx could create its own data index of the book entries/isbns, which could be of potential value to universities, libraries and researchers (torres-salinas et al., );
• some university rankings do not include metrics of scholarly books in their ratings of research productivity.
by continuing to provide fruitful data on book metrics, plum analytics could leverage its unique ability to report empirical data inclusive of all scholarly manuscripts (torres-salinas et al., );
• plum analytics has recently been purchased by elsevier and will now have the direct means to incorporate plum analytics data into mendeley profiles (carpenter, );
• communication researchers and information scientists are strategically positioned to continue exploration of and experimentation with emerging altmetrics. librarians will play a key role in education and outreach concerning these tools (lapinski et al., );
• plum analytics’ quest for inclusivity presents an opportunity to bring new and undiscovered research to the forefront and may work to level the playing field between new and established researchers (rathemacher, ); and
• plum analytics works with academic institutions to decide what metrics are tracked. while these decisions are currently made at the institutional level, empowering individual researchers to have more control in the data selection and collection process will strengthen the long-term viability of the platform.

. threats

• competitors can, directly and indirectly, influence the continuity, development and impact of plum analytics;
• an array of altmetric providers (e.g. piwowar, altmetric.com and ced) currently offer similar products and services;
• altmetric competitors vie with one another for academic customers, which can lead to the downfall of any given data provider;
• decisions regarding the adoption and use of plum analytics may be influenced by the value plum products and services offer in comparison with other altmetric providers.
• examples of notable competitive values and benefits available through other altmetric providers include the following:
(a) piwowar hosts an “impactstory” that tracks the impact of a researcher’s full body of scholarship while also placing each work into the context of other works produced within the same discipline and timeframe (rathemacher, );
(b) ced extracts more wikipedia citations than plumx (ortega, );
(c) in the domain of blogs, news and tweets, altmetric.com has better coverage than plum analytics (ortega, ); and
(d) plum’s trademark “plum print” presents a unique form of data visualization; however, altmetric.com also promotes data visualization in the form of a “donut” symbol used to showcase the multiple metrics they produce (brigham, ).
• in a competitive marketplace, threats to academic values of openness and transparency may arise. altmetrics providers, including plum analytics, need to maintain openness and transparency in designing, promoting, and monitoring their delivery tools, methods and measurement techniques to sustain value in an academic marketplace;
• in digital environments, threats of “gaming” can occur. altmetrics providers, including plum analytics, can be compromised by “gaming,” through which individuals and companies create multiple social media profiles programmed to endorse links to certain articles as a means of artificially inflating altmetrics (brigham, , p. );
• data collection glitches can threaten the accuracy of reported data. ensuring the validity of data outputs is necessary for maintaining the credibility of the platform.
while much of the plum analytics data collection process is automatic, users must manually verify that all data sources are active and linking correctly to the plum database to ensure the accuracy and efficacy of the output data;
• the viability of plum analytics is currently dependent upon institutional resources needed to pay for services; and
• the continued success of plum analytics is dependent upon the acceptance of altmetrics as a valued means of assessing scholarly research (rathemacher, ; williams, a).

. discussion

in review, this work conceptualizes and outlines what plum analytics are and how they work, while also illustrating the value that altmetrics bring to researchers, academic communities and the public at large. the analysis highlights several overarching themes regarding the utility and performance of the platform. among the technology’s notable strengths, plum analytics provides:
• unique information about the scope and reach of research;
• broad and robust measures of research impact inclusive of web-native scholarship; and
• timely information about research impact not available through traditional metrics.

the analysis also points to weaknesses of the digital platform, including:
• barriers to entry that limit access for researchers who do not have a subscription to the service;
• costs required to use the platform without an institutional affiliation; and
• internal validity concerns stemming from the potential overlap between evaluation categories.

looking to the future, plum analytics introduces opportunities, such as:
• further strengthening engagement between researchers and the public;
• enhancing the means by which researchers can tailor messages to reach target audiences; and
• educating academic communities about how analytics can be used to build and strengthen scholarly reputations.
it is important to also note threats that could hamper the development of altmetrics, such as:
• ownership and partnership changes that have limited the platform’s growth;
• promotional challenges that have not been met to fully raise awareness among researchers and academic institutions; and
• competition with various altmetric providers, including altmetric.com, that are also gaining popularity.

developments in each of these domains should continue to be monitored over time as the platform progresses.

. significance of the findings

the findings hold social, theoretical and practical/applied significance.

. . social significance. a primary societal benefit of plum analytics is the opportunity for researchers to easily measure and enhance the public impact of their scholarship. by monitoring how research flows through digital and social media, researchers can identify networks that are promoting their research and can amplify public engagement within and across those networks. information garnered about social media use and sharing can also be used by researchers to tailor information to specific audiences who may be well-served by learning about particular research findings. in addition to using measures of social networking to identify members of the public who are encountering research, academics can also use this information to communicate directly with those who are discussing and forwarding their scholarship. such discussion can, in turn, incite the attention of a participant’s larger social network, and can thereby expand the overall reach of the research. as scholarship attracts media attention, these analytics also highlight the impact of research reported in the news. not only can scholars use the metrics to see when their research is cited in news reports but they can also observe how news of their work circulates online.
researchers can then use this information to communicate directly with journalists who are promoting their work and with audiences who are being served by their research.

. . theoretical significance. the theoretical take-away from the findings is essential to the continued study of altmetrics and altmetric platforms. most importantly, the outcomes of this study show that the significance of altmetric platforms, like plum analytics, is largely dependent upon scholars’ use and adoption of the technology. this underscores a theoretically motivated view from the vantage of use and dissemination that will be particularly important as the platform evolves (blumler and katz, ; parker and plank, ; rogers, ). the disciplines of communication and information studies are home to theoretical perspectives that will inform future research, while also forging a dialogue between these complementary areas of study. in the domain of communication, uses and gratifications theory provides a relevant backdrop for studying the ways in which diverse publics use research to meet diverse needs (blumler and katz, ; parker and plank, ). in the domain of information studies, the study of altmetrics can be contextualized through the theoretical lens of information dissemination and the diffusion of innovation (rogers, ). as digital analytics develop across time, these complementary theoretical perspectives can be merged to longitudinally evaluate the use and evolution of digital analytic platforms and research metrics. furthermore, continued study in this domain will open new opportunities for interdisciplinary research, as altmetrics are not field-specific, but are of importance to all disciplines.

. . practical/applied significance. the findings carry practical significance to researchers. most notably, this study documents the importance of digitally based measurement.
while some academic institutions and scholars continue to place higher value on print-based publications and traditional citation measures, the world of scholarship is expanding in a digital landscape (laakso et al., ; odlyzko, ). in recognition of this evolution, more robust measurement tools that capture the presence and reach of online scholarship need to be more fully understood and embraced. this is not to dismiss the value of traditional publication conduits and measures but rather to broaden our understanding of the impact that scholarship carries within digital spaces. expanded metrics are particularly important in the context of promotion and tenure decisions, which focus heavily on the “impact” of research (holden et al., ; hendrix, ). plum analytics, and other digital measurement tools, can play a role in contextualizing and substantiating the impact of research, as they provide information about public engagement with research and immediate measures of academia’s “real-world” significance (williams, a). continued discussion concerning the normative implications of altmetric platforms is warranted. as the world of publication grows, a need to embrace research across various publication modes is evident. online publication venues should not be dismissed as lesser than traditional print publications. rather, online and offline publication venues should both be acknowledged and valued as contributing to the expansion of our collective knowledge (johnson, ; lewis, ). the normative implications of this perspective will require sustained attention from researchers as digital scholarship and measurement tools advance.

. . future research. future research should continue to explore the adoption, use, significance, and impact of plum analytics and other altmetric tools. some important, untapped research questions that await further exploration include the following:

rq1. how are scholars using altmetrics, generally, and plum analytics, specifically?
rq2. from a theoretical perspective, what are the uses and gratifications that academics garner from altmetrics? what value(s) do these measures provide to scholars? (for an overview of uses and gratifications theory see blumler and katz, ; parker and plank, );
rq3. from an applied perspective, does the shifting nature of open access publishing, and the metrics that assign value to public scholarship, lead scholars to write to broader audiences? how can altmetric tools help to encourage and promote public engagement with research?
rq4. from a socio-technological vantage, how might increasing the transparency of the algorithms underlying altmetrics influence academics’ adoption and use of these new tools? (manca, );
rq5. from an institutional perspective, how many and what types of institutions have adopted plumx? what values do they derive from these metrics?
rq6. from a methodological vantage, how stable, reliable and valid are altmetric data? (torres-salinas et al., , p. ); and
rq7. from a critical and analytic perspective, how can institutions and researchers compare and evaluate the different altmetrics that have recently emerged? (collister and deliyannides, ).

. conclusion

new media afford opportunities for academics to discover and develop new methods and venues for data collection and analysis. as the digital environment continues to grow, these tools promise to improve the dissemination and evaluation of research while encouraging continued and future dialogues among varied researchers, audiences and disciplines. scholars should not only be aware of these new platforms but should also look to evaluate, improve and embrace change.

ann e. williams
department of communication, georgia state university, atlanta, georgia, usa

references

bawden, d. ( ), “altmetrics and qualitative understanding at the croatian seaside”, journal of documentation, vol. no. .
blumler, j.g.
and katz, e. ( ), the uses of mass communications: current perspectives on gratifications research, sage publications, thousand oaks, ca.
brigham, t.j. ( ), “emerging technologies”, medical reference services quarterly, vol. no. , pp. - .
carpenter, t.a. ( ), “plum goes orange – elsevier acquires plum analytics”, available at: https://scholarlykitchen.sspnet.org/ / / /plum-goes-orange-elsevier-acquires-plum-analytics/ (accessed january ).
collister, l.b. and deliyannides, t.s. ( ), “altmetrics: documenting the story of research”, against the grain, pp. - .
crosby, t. ( ), “telling the story of research through altmetrics categories”, available at: https://plumanalytics.com/telling-story-research-altmetrics-categories/ (accessed january ).
helms, m.m. and nixon, j. ( ), “exploring swot analysis – where are we now? a review of academic research from the last decade”, journal of strategy and management, vol. no. , pp. - .
hendrix, d. ( ), “tenure metrics: bibliometric education and services for academic faculty”, medical reference services quarterly, vol. no. , pp. - .
holden, g., gary, r. and kathleen, b. ( ), “bibliometrics: a potential decision making aid in hiring, reappointment, tenure and promotion decisions”, social work in health care, vol. nos / , pp. - .
johnson, r.k. ( ), “open access: unlocking the value of scientific research”, journal of library administration, vol. no. , pp. - .
laakso, m., patrik, w., helena, b., linus, n. and bo-christer, b.t. ( ), “the development of open access journal publishing from to ”, plos one, vol. no. , p. e .
lapinski, s., piwowar, h. and priem, j. ( ), “riding the crest of the altmetrics wave: how librarians can help prepare faculty for the next generation of research impact metrics”, arxiv preprint arxiv . .
lewis, d.w. ( ), “the inevitability of open access”, college and research libraries, vol. no. , pp. - .
lindsay, j.m.
( ), “plumx from plum analytics: not just altmetrics”, journal of electronic resources in medical libraries, vol. no. , pp. - .
manca, s. ( ), “researchgate and academia.edu as networked socio-technical systems for scholarly communication: a literature review”, research in learning technology, vol. , pp. - .
odlyzko, a. ( ), “competition and cooperation: libraries and publishers in the transition to electronic scholarly journals”, journal of scholarly publishing, vol. no. , pp. - .
ortega, j.l. ( ), “reliability and accuracy of altmetric providers: a comparison among altmetric.com, plumx and crossref event data”, scientometrics, vol. no. , pp. - .
parker, b.j. and plank, r.e. ( ), “a uses and gratifications perspective on the internet as a new information source”, american business review, vol. no. , pp. - .
pickton, d.w. and wright, s. ( ), “what’s swot in strategic analysis?”, strategic change, vol. no. , pp. - .
rathemacher, a.j. ( ), “altmetrics: help your researchers measure their full impact”, serials review, vol. no. , pp. - .
rogers, e.m. ( ), diffusion of innovations, simon and schuster, new york, ny.
torres-salinas, d., gumpenberger, c. and gorraiz, j. ( ), “plumx as a potential tool to assess the macroscopic multidimensional impact of books”, frontiers in research metrics and analytics, vol. no. .
tucker, d. ( a), “plum analytics metrics are now available to more researchers”, available at: www.elsevier.com/connect/plum-analytics-metrics-are-now-available-to-more-researchers (accessed january ).
tucker, d.
( b), “more researchers to now benefit from plum analytics metrics”, available at: www.elsevier.com/about/press-releases/science-and-technology/more-researchers-to-now-benefit-from-plum-analytics-metrics (accessed january ).
tucker, d. ( c), “elsevier acquires leading ‘altmetrics’ provider plum analytics”, available at: www.elsevier.com/about/press-releases/corporate/elsevier-acquires-leading-altmetrics-provider-plum-analytics (accessed january ).
williams, a.e. ( a), “altmetrics: an overview and evaluation”, online information review, vol. no. , pp. - .
williams, a.e. ( b), “f : an overview and evaluation”, information and learning science, vol. nos / , pp. - .
williams, a.e. ( c), “kudos: bringing your publications to life?”, information and learning science, vol. nos / , pp. - .
williams, a.e. ( ), “exploring the utility of academia.edu: a swot analysis”, information and learning science, vol. no. , pp. - .
williams, a.e. and woodacre, m.a. ( ), “the possibilities and perils of academic social networking sites”, online information review, vol. no. , pp. - .

corresponding author
ann e.
williams can be contacted at: annwilliams@gsu.edu
supporting digital scholarship in research libraries: scalability and sustainability

jennifer vinopal, librarian for digital scholarship initiatives, new york university, vinopal@nyu.edu
monica mccormick, program officer for digital scholarly publishing, new york university, monica.mccormick@nyu.edu

notes
this is the peer-reviewed version of an article published in the journal of library administration, ( ), , special issue “digital humanities in libraries: new models for scholarly engagement.” this version of the work is licensed under a creative commons attribution-noncommercial . unported license.
issue url: http://www.tandfonline.com/toc/wjla / /
article url: http://dx.doi.org/ . / . .

recommended citation
vinopal, jennifer, and monica mccormick. “supporting digital scholarship in research libraries: scalability and sustainability.” journal of library administration , no. (january , ): – . doi: . / . . .

keywords
digital scholarship, digital humanities, library services, leadership, organizational culture, scalability and sustainability

abstract
new york university libraries and our partners in information technology services offer effective enterprise-wide technology solutions for many academic practices, but we are still working to solve the "faculty website problem": providing services for digital scholarship and publishing in a way that is both scalable and sustainable. this article describes our study of nyu scholars' needs and digital scholarship support at other research institutions, and then introduces a service model we developed for supporting such services (which may include digitization, hosting of research data, digital publishing, the development of software for scholarly practices, and more). we then discuss the challenges to research libraries of implementing our service model in a scalable, sustainable way, by addressing project and tool selection, staffing, and organizational change.

introduction: the faculty website problem

at new york university, as at other large research institutions, we are working hard to support faculty and students who increasingly expect sophisticated new services for digital scholarship. nyu libraries, with our colleagues in academic technology services (ats, a unit of nyu information technology services), offer tools and support teams for activities including high performance computing; geographic information systems; quantitative and qualitative data analysis; data finding and management; the digitization, creation, manipulation, storage, and sharing of media content; repository services; digital preservation; streaming media platforms; digital journal publishing; online collaboration; and intellectual property consultation. these are enterprise-level services, offered to as many members of the nyu community as possible. despite this breadth of services and expertise, we find ourselves challenged to respond effectively to what we have come to call “the faculty website problem”: an ever-growing number of requests for web-based spaces and tools to collaborate on scholarly research and share the results.
although scholars often describe their needs with the catch-all term “website,” such requests actually represent a diverse set of activities which may be achieved in a variety of ways: with a wiki or basic blog, with more complex tools like a custom-designed database with public or private web access, with tools for collaboration with colleagues at nyu and beyond, with integration with platforms elsewhere, or with some combination of all of these. support for these projects can be equally varied, and may require anything from a single consultation about available enterprise-level tools, to semester-long training and advice for a course’s student projects, or an open-ended commitment to implement a new tool or manage a scholarly digital collection.

over the years we have approached these needs in several ways. in the late s and early  s, academic technology services had small, discipline-focused computing groups who supported specialized faculty projects in the humanities, arts, social sciences, and sciences. because of the idiosyncratic nature of faculty projects, the significant time required to plan and accomplish their long-term research initiatives, and the need for ongoing care, development, and migration of resulting websites and databases, these computing groups could only support a few faculty per year. in the mid- s, in an effort to provide technology services to more users, nyu libraries and ats jointly committed to offering enterprise-level academic tools (e.g., wikis, blogs, streaming services, file storage, repository services) and correspondingly robust support services for the widest array of faculty and students.

note: we follow the scholarly communication institute’s definition of digital scholarship as “the use of digital evidence and method, digital authoring, digital publishing, digital curation and preservation, and digital use and reuse of scholarship.” smith rumsey, abby.  . scholarly communication institute  : new-model scholarly communication: road map for change. charlottesville, va: university of virginia library. http://www.uvasci.org/wp-content/uploads/ / /sci -report.pdf.

while emphasizing commodity tools and services has allowed us to provide a broad clientele with relatively easy-to-use solutions for many digital research needs, this standardization has come at the expense of supporting the kinds of innovative, web-based collaboration, communication, and publication activities that are becoming a regular part of scholarly practice across the disciplines at nyu and beyond. current areas of scholarly exploration include the use and development of new tools and methods for multimodal and collaborative publishing (e.g., scalar and mediacommons), open peer review (e.g., mediacommons press), and data analysis and visualization (e.g., topic modeling, mapping and timeline tools). so far, our work in these areas is in early development.

to continue developing services that respond to changing scholarly practice, dean of libraries carol mandel asked us in april   to better define nyu scholars’ needs, to investigate how other universities, especially their research libraries, are supporting new web-based forms of collaboration, communication, and publishing, and to then propose a service model that might be adopted at nyu libraries. we conducted research from april through november  , and submitted a report in december  . in this article we describe our findings and offer a high-level model for deploying scalable and sustainable digital scholarship services. we then discuss some important institutional and organizational challenges to research libraries and offer recommendations for providing effective digital scholarship support.
note: nyu’s digital library technology services has developed tools and platforms relevant to digital scholarship (e.g., mediacommons, a network and publishing platform for scholars in media studies, http://mediacommons.futureofthebook.org/) but the dlts group’s primary focus is on processing, enabling access to, and preserving digital materials from the nyu community and collaborating partner organizations.

note: scalar, in beta development by the alliance for networking visual culture, is an open source authoring and publishing platform that’s designed to make it easy for authors to write long-form, born-digital scholarship online. scalar enables users to assemble media from multiple sources and juxtapose them with their own writing in a variety of ways, with minimal technical expertise required. see http://scalar.usc.edu/scalar

note: mediacommons press is part of the mediacommons scholarly network. it uses the tool commentpress (built on wordpress) to enable open online peer review. see http://mediacommons.futureofthebook.org/mcpress/

note: services are sustainable when they can be efficiently maintained over time, and scalable when they can be provided effectively as demand increases.

gathering data

nyu scholars’ needs
to learn more about nyu scholars’ needs (including both faculty and graduate students), we partnered with subject specialists to identify and interview eleven nyu faculty who are experimenting with technology for their research and publishing. we also performed a service gap analysis by reviewing recent technology support requests from scholars that we were either unable or only partially able to meet. both sources of data revealed similar faculty needs and gaps in available services and resources.

scholars want help developing, using, and maintaining websites for storing and presenting their digital research content. research may be used in various ways online: as a personal archive, to collaborate with students or colleagues, or to publish these materials via the web. scholar requests for custom-built databases with web-searchable front ends indicate a need for interoperable tools and repositories that allow scholars to create, store, and work with materials in various formats (multimedia, images, text, annotation, etc.) and then provide easy online access to these materials. they want these sites to be dynamic (to add new content and functionality as needed) and to facilitate collaboration with colleagues. faculty also need help for themselves and their students to learn new skills, methods, and tools, and they want support integrating them into their work.

interviews with peer institutions
to understand how our peers support digital scholarship, we interviewed colleagues at fourteen institutions, focusing our questions mainly on services for online publishing and scholarly collaboration. we also asked about staffing, service location within the organization, and scalability and sustainability concerns. we discussed the same issues at conferences with colleagues from many other libraries.
note: the fourteen institutions were california state university, los angeles; columbia university; cornell university; duke university; emory university; harvard university; princeton university; university of california, los angeles; university of chicago; university of kansas; university of michigan; university of virginia; yale university; and the educopia institute.

among great variation in the tools, services, and staffing models our peers offer, we identified three basic approaches. all institutions we interviewed provide some version of these general types:

digitizing collections: infrastructure for digitization, preservation, and access
these services are driven primarily by library collections and focus on building infrastructure and workflows that may also be used for scholars’ projects or shared with other parts of the library, making efficient use of staff time and equipment. project selection can be closely aligned with library strategic priorities, user demand, or other criteria. however, these services do not address scholars’ needs for the kinds of collaborative, multimodal digital services listed above.

digital research & publishing services
with a focus on scalability, these services support a wide range of needs with a small amount of customization and are typically available to most scholars. examples include journal and conference paper hosting; institutional repositories; consultation on project planning, metadata, and digitization best practices; video and audio production; blogging, wiki, and content management platforms with a fixed set of templates and standard plug-ins for simple website creation; copyright and ip consultation. many tools can be provided with minimal training to users and without ongoing intervention by the service team. related reference-type consultations are handled on a regular basis. while requests for customized services cannot typically be accommodated, service teams may consider strategically undertaking a special project if it is likely to result in a first-of-a-kind, rather than one-of-a-kind, solution, which might eventually be rolled out more widely.

digital scholarship or digital humanities centers
these centers are scholar-driven with a strong research and development component and may not be affiliated with the library. they include high-touch collaborations among scholars for a limited number of projects per semester or year. scholars and staff on project teams are true research partners in this model, and staff may also pursue research projects on their own. such projects may result in tools or platforms that can be reused in other settings (for example, the open source library discovery interface blacklight, which grew out of a staff project at the university of virginia’s scholars’ lab). but because of the tight integration between a scholar’s research methodology and its expression in digital form, the products may sometimes be idiosyncratic and thus hard to maintain over time without ongoing developer intervention.

no single service model mentioned here fully describes any of the organizations we spoke to, but we found it helpful to characterize services in these ways as the models suggest quite different approaches, staffing levels, and required skills. furthermore, none of our colleagues felt confident that they had solved the problem of providing services for the breadth of digital scholarship needs in a way that was both sustainable and scalable. like most of our peer institutions, nyu libraries currently provides some services from each of these general types.
note: thanks to rebecca kennison of columbia university for the useful distinction between first-of-a-kind and one-of-a-kind solutions.

note: blacklight: http://projectblacklight.org/

a high-level model for scalable and sustainable services
drawing on our understanding of practices and trends at peer institutions and our own faculty’s research and requests, we developed a high-level model to describe how an organization might support digital scholarship. we had several guiding principles in designing this four-tier model. services should be sustainable (so they can be maintained over time) and scalable (in order to benefit as many scholars as possible). our experience suggests, and peer interviews confirmed, that one effective way to achieve scalability and sustainability is through service and tool standardization. there are other considerations, both programmatic and strategic, as we discuss in the next section. as well, these services should promote the development of reusable tools, platforms, and methods, and facilitate the creation of preservable, reusable scholarly content to ensure the long-term value of and access to the institution’s research. this multi-level service model puts a strong emphasis on developing, maintaining, and integrating standard tools, platforms, and support services for a large community of users. the model should integrate current services and initiatives, and build out new service components only when necessary. finally, these services should capitalize on staff knowledge and expertise, while providing an opportunity for staff to gain new skills.

the model we envisioned has four tiers, with the first (and most widely-used) at the bottom.

tier  : enterprise academic tools
these are enterprise-level academic tools that meet the basic computing needs of a vast majority of students and faculty.
examples include: learning management systems, wikis, video streaming, individual and shared file storage, and virtual computer labs. these tools are designed to meet academic and administrative computing needs, but do not necessarily lend themselves to scholars’ research requirements. most offer little to no customization for individual projects. tier  : standard research services like the enterprise academic and administrative tools in tier  , these services are designed to be available to as many scholars as possible. however, tools at this service level are designed specifically to support research and scholarship. examples include: journal and conference­paper hosting tools (e.g., open journal systems or bepress), cms and web­hosting platforms (e.g., wordpress), and web exhibit platforms (e.g., omeka).  though certain tools or platforms may enable a large number of configuration choices, this service level does not offer that option. rather, to the extent possible, tools should offer a fixed set of templates, so users can pick the format, style, or functionality that best meets their needs. for example, an institution­wide wordpress service could give users the choice of a limited number of design templates and approved plug­ins. if services at this level are well designed and supported, a majority of scholars could rely on these sustainable alternatives to one­off solutions. tier  : enhanced research services this level builds on tier   and includes the ability to offer some custom configuration of the standard services described there. tier   provides select scholars with staff support for more sustained consultation and customization that go beyond the standard services and templates. services might include designing a special interface to a standard tool or providing custom­tailored metadata options for a repository. 
in addition, this level could include short or long-term project consultations with scholars on project planning, grant seeking, or digital methodologies. services in tier   could lead to more in-depth partnerships at service tier  . though the goal will always be to help as many scholars as possible, access to tier   services, requiring more staff time and support, will be necessarily selective and a well-defined selection process is required to manage demand. selection processes for these services will vary from institution to institution; criteria can range from focusing on vip faculty, to partnering with a particular department or program, only accepting projects that come with grant funding, or offering funds for which scholars may compete. whatever selection process is chosen, it needs to be well understood by staff and potential project partners so decisions demonstrate a strategic approach to services.

tier  : applied research and development
this level is more experimental and aimed at developing methods and infrastructure with possible (but not certain) future research value. the focus is on partnership with innovative scholars, ideally leading to reusable products or integration among existing tools. a key objective is to create “first-of-a-kind” tools, platforms, methods, or integrations that meet emerging research needs, and to implement them in a cycle that supports use, testing, and improvement. ultimately, the goal is to enable such services, methods, or tools to be rolled out as tier   or tier   services. work in tier   is highly selective, mostly grant-funded, and extremely staff intensive.

this tiered model provides a way for organizations to recognize their existing and desired services as a spectrum of methods for supporting digital scholarship, ranging from enterprise-level tools to experimental, resource-intensive initiatives. articulating how the institution’s services fall into the four tiers will help library staff and leadership consider the organization’s strengths, gaps, and research needs, and determine how to best invest time and effort to strategically develop new services. in the next section, we address some challenges of implementing this model.

note: these services complement existing library research services such as subject specialist research assistance, the library catalog, etc.

considerations for implementation
this high-level service model is not prescriptive; it can be applied in a variety of ways, depending on the given organizational context and structure. we believe it could be implemented with many different initiatives, tools, or services to achieve the desired level of engagement and support. similarly, it can rely on a wide range of possible staffing arrangements. in planning to offer services for digital scholarship, institutions must be guided by local considerations such as user needs, strategic priorities, and existing organizational structures and services. however, in order to implement scalable and sustainable services, there are certain programmatic and strategic requirements without which these initiatives may fail.

scholars’ needs for digital scholarship support are inherently diverse; in attempting to meet them without considering scale and sustainability, we risk developing narrowly focused or short-lived solutions that are difficult to maintain over time and with infrastructure that cannot be repurposed to benefit other projects. none of the peers we consulted have fully solved this problem, but they shared many helpful approaches. we are giving their ideas considerable thought as we develop and refine our own services. in this final section, we describe some of the most significant challenges to scalability and sustainability and propose some methods for addressing them.
note: this past summer, miriam posner, the library loon, and mike furlough, among others, had a thoughtful discussion via their blogs about the institutional challenges to supporting digital humanities in the library (miriam posner expands her original blog post as an article in this issue). they identified common impediments that cause incipient digital humanities services in libraries to falter and scholars and staff to become frustrated with the services offered. their insightful observations complement the implementation considerations we derived from our research in   and our subsequent work in this area. posner, “what are some challenges to doing dh in the library?” retrieved from http://miriamposner.com/blog/?p= ; library loon, “additional hurdles to novel library services,” retrieved from http://gavialib.com/ / /additional-hurdles-to-novel-library-services/; furlough, “some institutional challenges to supporting dh in the library,” retrieved from http://www.personal.psu.edu/mjf /blogs/on_furlough/ / /some-institutional-challenges-to-supporting-dh-in-the-library.html

selection and scoping
though we talk about them as related goals, scalability and sustainability should also be considered individually when evaluating service options. there are times when one may be a more important consideration than the other. for instance, a valuable service might be sustainable at a given staffing level, but not scalable to a larger clientele without adding significantly more resources or using a different technology. to get the most out of institutional investment in new initiatives, it’s important to identify the intended audience, define the scalability and sustainability goals, and select tools, services, and projects strategically to meet these goals.

for services intended to be scalable, our model advocates providing tools that offer a limited range of alternative interfaces and functionality but can be run and supported efficiently and thus offered to a large number of users (see tier   in our proposed service model above). the city university of new york, for example, is developing the commons in a box, a content management system for blogging and collaboration, with a set of design templates and plug-ins for different needs. columbia’s center for digital research and scholarship offers a standard software platform and a tiered service model for journal publishing, with the basic service available at no charge and customization options provided for a fee. such approaches provide useful alternatives for patrons, while building in constraints (templates, fee structures) that ensure the service can be supported with the resources available.

once a tool or platform is selected for implementation, service definitions are critical to setting user and staff expectations for their use. according to the itil (it infrastructure library) service management framework, a service definition or service level agreement (sla) typically specifies details of service hours and availability, functionality, service and customer support levels, customer and service provider obligations, as well as any associated fees. slas should also help staff and scholars understand the differences among services.
for example, a training service should clearly state when and how training may occur, who is served, and what level of training is to be expected. and training to use tools must be clearly distinguished from, say, engaging in a long-term project with a scholar.

note: commons in a box is described as, “a new open-source project that will help other organizations quickly and easily install and customize their own commons platforms.” http://news.commons.gc.cuny.edu/ / / /the-cuny-academic-commons-announces-the-commons-in-a-box-project/

note: the cdrs tiers of journal service are described at http://cdrs.columbia.edu/cdrsmain/texture-publications/which-service-level-is-right-for-my-journal/ more information about their journal services may be downloaded from http://cdrs.columbia.edu/cdrsmain/texture-publications/

note: for more information about the it infrastructure library (itil) see http://www.itil-officialsite.com/aboutitil/whatisitil.aspx

note: service level agreements are defined here: http://www.knowledgetransfer.net/dictionary/itil/en/service_level_agreement.htm

when services are well defined and understood by all involved, it is easier to carefully assess the needs of a potential scholarly project and determine whether it can be met with an existing service (tier   in our model) or if it requires consideration as a special project (tiers   or  ). the traditional reference interview process provides an excellent model for these types of evaluations. for instance, a faculty member approached us about a “digital humanities project” that amounted to the need for a wiki where documents could be shared with students -- a request easily met with a service already in place that could support the project as it evolved. more complex projects require a more substantial investigation before they can be selected, and will rely on the staff member who is conducting the initial interview to know where to refer the patron, or to be empowered to assemble a team to discuss the request.

having a well-developed project selection process allows organizations to make informed choices about how to strategically deploy staff on more experimental initiatives. portfolio management -- the process of documenting and assessing both projects and the services within an organization -- provides a broad overview of the organization’s work and enables service gap analysis, resource allocation, and project selection, and can thus facilitate strategic alignment (vinopal,  ). we believe that project selection should be undertaken as part of an active portfolio management process to ensure scalability and sustainability. all projects in tier   must, by definition, be selected, since those services cannot be offered widely.
and for tier  , an organization may want to leverage its project selection process to identify “stretch” projects that will help it explore new areas and develop new capacities that may eventually benefit many other scholars. to ensure the return on resource investment, these “first­of­a­kind” projects must be selected strategically to fill in known gaps in the service portfolio. success with this approach requires that decision makers:  ) understand the organization’s service portfolio and service gaps;  ) have articulated the strategic priorities of the organization, in order to develop services that meet those goals; and  ) have a well­understood decision making process for selecting initiatives, assigning resources, and moving new projects forward. some of those we interviewed have a regular meeting at which projects are assessed for their fit with organizational goals, skills, and staff time. others assemble project assessment teams ad­hoc as requests arrive. however, without clear selection criteria, an overview of the project and service portfolio, and a strong understanding of project needs, this ad hoc method can result in a bulging portfolio and difficulties completing work on schedule. once projects are selected, many institutions develop written agreements with project partners to clarify responsibilities and define project scope. these agreements are similar in some ways to slas, described above, but focus on the specific project rather than a broad service. project agreements may stipulate the length of time any resulting systems (e.g., a specially­designed website) will be supported, by whom, and what kind of ongoing support is to be expected (for example, bug fixes only, ongoing development of new functionality, platform and content migration, etc.). 
situating services and staffing our research indicated that services supporting digital scholarship can be positioned within the library in any number of ways: they might be established as a separate, new unit or department; fully integrated into the existing organization, with staff members from many departments spending some of their time on digital services; or managed in a hybrid approach, with a small core staff who draw support from subject specialists, metadata experts, etc., on an ad hoc basis, depending on project need. sometimes grant funding is used to hire staff for initial projects, with positions evolving into permanent lines as need is demonstrated and budgets allow. all of these approaches have implications for service sustainability and scalability. no matter how these services are configured within the library, it is important that they eventually become an integral part of the holistic service environment of the organization. in their report “new roles for new times: digital curation for preservation,” walters and skinner emphasize the library­wide transformation required to build what they call “the trio of strong infrastructures, content, and services” to support digital scholarship. ( ) while launching digital scholarship services as a separate unit or department with dedicated (and possibly new) staff may afford the unit flexibility and speed to develop quickly, consideration should be given to the relationship between that unit and the rest of the organization. if the eventual goal is to foster a new level of organization­wide engagement with emerging research practices and needs, then incubating new services among a small group can potentially limit the development and contributions of other staff. as a consequence, when service needs grow, it may be challenging for staff outside the new unit to support the services in an integrated way. 
on the other hand, a staffing approach that will rely from the start on the participation of the whole organization may create problems of dilution and diffusion. scattering responsibility for the initiative across the organization can inhibit focus and may also negatively affect staff participation, especially so if this work is in addition to staff’s responsibilities for existing services. as well, library­wide staffing for new services would require a very clear message about priorities and goals for the organization, the departments, and the individuals involved, addressing questions such as: how do the new services build on existing work? what new skills are staff expected to acquire? what current work may become a lower priority? and, who has the authority to delegate this new work to staff across various departments? this last question is particularly important, as existing reporting structures can prove particularly resistant to cross­departmental collaboration. this amount of organizational change requires significant time, which might hinder an effective digital scholarship presence on campus. a third option is a hybrid model that falls somewhere between the “separate unit” and the “fully integrated” approaches described above. one way to implement this model is to identify current staff who are best situated (because of knowledge and skills) to help develop digital scholarship services, then free them up to lead the initiative, without necessarily creating a new unit. the organization could then incorporate other staff or hire new staff strategically and incrementally as service direction and definition are established. these efforts could be supported by ad hoc reliance on subject specialists, archivists, metadata experts and others as needed, with more staff being trained and brought in to the services as time goes on. 
according to a survey of arl libraries conducted in  , this provisional model is common among libraries developing support for digital humanities. (bryson, et al.,  ) our research suggests that it applies to general digital scholarship services as well. this incremental approach to staffing and service development has advantages, in that it can respond flexibly to fast­developing needs. being small and somewhat apart from the existing organization during start­up phase, service providers can take a more exploratory, experimental approach to their work and then bring their experiences and conclusions back to the organization for larger­scale implementation. for example, staff may spend time developing strategic partnerships or running small test projects to learn what works and what does not. during the initial phase of service design, it is especially important to assess work being done and to use these early experiments and experiences to document needs and the resources required to meet them. assessment activities can include: determining success criteria, evaluating client satisfaction, identifying what did and didn’t work, calculating staff hours spent on development and support activities, estimating costs and possible efficiencies, and considering next steps. it is equally important to share these assessments at the appropriate level of detail with the rest of the staff, so that experience and learning are shared, and the services’ evolution is understood.   while effective in a time of rapid change in service needs and financial constraints, ad hoc service provision should be seen as a tactic on the way to a longer­term strategy for robust and scalable service design and support. the authors of the arl digital humanities survey note, “as demand for services supporting the digital humanities has grown, libraries have begun to re­evaluate their provisional service and staffing models. 
many respondents expressed a desire to implement practices, policies, and procedures that would allow them to cope with increases in demand for services” (bryson, et al.,  ). scaling up these services and keeping them going over time can be challenging for staff. like the “fully integrated” approach above, this hybrid model requires clear direction from library leadership about expectations and priorities; otherwise those assigned to initiate these services may have difficulty summoning the project and service support from colleagues who are already fully occupied with their own work. additionally, if services in this area rely primarily on fellowship- or grant-funded staff, it can be very challenging to sustain them once staff leave or funds are spent.
funding
like the other service support considerations discussed above, funding approaches for digital scholarship services are diverse, including hard funding, fees for some or all services, and internal or external grant funding. special funds are frequently required for projects and services that are offered in tiers   and  , since these are more staff-intensive and may require advanced technology skills. some institutions require scholars who are proposing projects to come with grant money in hand. others partner with scholars to help them secure funding. another model is for those providing digital scholarship services (e.g., a digital scholarship center) to receive institutional funds that they then award as grants to researchers through a competitive project selection process. as with service definitions and project selection criteria, funding models should be well defined and clearly understood by all involved.
strategic vision
noting how innovative digital initiatives and services successfully develop at some institutions and not at others, a colleague of ours has asked: “what can you do if my library director gets it and yours doesn’t?” this simple question cuts to the heart of the matter: grassroots innovation and a few enterprising, proactive staff are no substitute for library leadership providing sustained vision, guidance, and support for these new initiatives. the scalability and sustainability of library initiatives depend not only on careful choices about technology deployment, well-developed service descriptions, and effective project selection and portfolio management, but also on staff having a clear understanding of how and why they are investing their time and talent in complex new services. it is critical to identify strategic priorities that align with the larger institution’s mission and goals, and to clearly articulate what the organization will and will not focus on. with such an array of options (tools, services, platforms, service models) no organization can undertake them all. library leaders need to select organizational priorities, make them known, and fund them. without focus, nascent efforts can become muddled and ineffectual. to foster cross-library engagement with this new service domain, leadership should ensure that it is understood across the organization as a strategic priority, and create a shared vision of how these new services relate to the library’s mission and goals and can be effectively integrated with existing ones. it is also important to frankly acknowledge the challenges of providing stable ongoing services while remaining responsive to emerging needs. implementing project and portfolio management to document and track the organization’s services and projects can help to guard against taking on more work than can be accomplished at any one time.
authority and time to accomplish
the staff who are specifically engaged in developing services for digital scholarship have particular needs arising from the way these services are situated within the organization. for a start, it is critical to identify staff with the appropriate knowledge and skills, and to give them the time to explore digital scholarship needs and establish the appropriate services. in addition, they must be provided with sufficient professional development support to maintain currency with rapidly evolving technology and standards. furthermore, as we have said, because digital services necessarily rely on a wide range of expertise, staffing for them is frequently ad hoc in nature. a common scenario is for projects to be managed by a digital services person with project support staff who all report to others. as a result, those charged with creating digital scholarship services often have considerable responsibility to accomplish initiatives without the authority to mobilize the resources needed to succeed. this is the particular challenge that evolves from building services that are not housed in a traditional department or unit but instead are more interstitial and rely on cross-organizational support for staffing. it is critical that new service managers have the authority to accomplish their work within the scope of the vision and direction that leadership sets out. given the inherently ambiguous nature of new service requirements, digital scholarship service leaders need the authority to make decisions, to direct the work of involved staff, and to establish a process for decision-making and communication about priorities up and down the hierarchy. everybody involved in these ventures, even in an ad hoc capacity, needs to understand his or her role and responsibility in the project or service’s success.
because the implementation of innovative new services requires a concomitant change in organizational mindset and practice, higher-level administration may need to intervene when work “gets stuck.” it is not enough for library administration to remind department managers or their staff about organizational priorities in the abstract; they must recognize the time required for this work and help staff set priorities and allocate enough time to participate in this new initiative.
guidance
establishing new ventures requires even more guidance and feedback from leadership than maintaining existing services. those developing new digital scholarship initiatives will need a process for regularly communicating with library leadership about progress and priorities, and for seeking direction at critical junctures. implementing our tiered services model, for instance, will require a selection process for projects at tiers  and  , which are more staff-intensive. as well, goals with clear measures of progress and success should be established, so that projects and services can be regularly assessed, and changes implemented as needed. the steering process can take many forms, including regular meetings with a designated steering committee or ad hoc meetings with the library director or other appropriate manager. no matter what process is enacted, it should be clearly articulated, so there is no confusion about how and when staff should report, how much autonomy they have in decision-making, and when they should seek feedback. what is important is that everyone involved in the service development process, from top-level leadership down, should understand how the new service will be guided, how service priorities will be set, who makes which decisions, which success criteria and assessment measures will be used, and how questions will be answered when problems arise.
conclusion
over the course of this article we have highlighted challenges to and strategies for building scalable and sustainable digital scholarship services. more and more scholars want to adopt digital tools, platforms, and practices for research and teaching, and these technologies and methodologies evolve rapidly. as the nature of scholarship changes, research libraries’ practices will also adapt in order to partner most effectively with scholars. new models for librarian-scholar collaboration include much more librarian engagement with the entire research process than ever before. from grant seeking, project planning, data collection and organization, and metadata creation, to data analysis and visualization, content dissemination, and long-term archiving, libraries have significant roles to play in developing and sustaining effective practices in digital scholarship. the organizational challenges required for a research library to become and remain engaged with this quickly evolving scholarly landscape are not inconsequential. this requires not just a one-time organizational change, but also the development of an organizational culture that is inquisitive, adaptable, responsive, and that welcomes change, one that is willing to try new things, assess their success, and sometimes simply move on. as new opportunities, roles, and responsibilities emerge, library leadership must take an active role in articulating a strategic vision, defining priorities, addressing the connections between new services and established ones, facilitating horizontal as well as vertical communication and collaboration, and building a staff that are lifelong learners with evolving job descriptions. our success in supporting new scholarly practices hinges on our ability to scale and sustain this kind of organizational change.
references
bryson, t., posner, m., st. pierre, a., & varner, s. ( ). digital humanities (no.  ). spec kit (p.  ). association of research libraries.
retrieved from http://www.arl.org/bm~doc/spec- -web.pdf
vinopal, j. ( ). project portfolio management for academic libraries: a gentle introduction. college & research libraries,  ( ),  – . retrieved from http://crl.acrl.org/content/ / / .full.pdf+html?sid=cde e - b- - cde- ce fc e
walters, t., & skinner, k. ( ). new roles for new times: digital curation for preservation. washington, d.c.: association of research libraries. retrieved from http://www.arl.org/bm~doc/nrnt_digital_curation mar .pdf
preparing for the 21st century: academic library realignment
jennifer e. nutefall and faye a. chadwell
oregon state university libraries, corvallis, oregon, united states.
abstract
purpose – the purpose of this article is to communicate how an academic library can establish and implement a realignment process to prepare itself to serve users in the 21st century.
design/methodology/approach – the authors employed a case study approach to present the challenges of realigning an academic library. we describe the collaborative and interactive process that oregon state university (osu) libraries undertook to envision what a 21st century academic library might demand and to realign its units to support this vision.
we summarize the positive outcomes of this process and provide an overview of what next steps might be.
findings – a combination of visioning exercises and collaborative study of the appropriate lis literature was key to establishing the direction that the osu libraries’ realignment would take and the eventual organizational structure the libraries implemented. the realignment activities emphasized not only collaboration among unit heads but also the importance of clear communication, ongoing assessment, and connection to the university’s overall strategic goals and realignment in order to guarantee eventual success.
originality/value – this article describes a process that most academic libraries could emulate to shift the focus of legacy operations and departments to those that successfully meet the challenges of the 21st century academic library.
keywords -- academic libraries, realignment, change, future
paper type -- case study
contact: jennifer e. nutefall, associate university librarian for innovative user services, oregon state university libraries, the valley library, corvallis, or - ; email: jennifer.nutefall@oregonstate.edu; phone: - - , fax: - - . faye a. chadwell, donald and delpha campbell university librarian and osu press director; oregon state university libraries, the valley library, corvallis, or - ; email: faye.chadwell@oregonstate.edu; phone: - - ; fax: - - .
brief biographical note: jennifer e. nutefall is associate university librarian for innovative user services at oregon state university libraries. before starting at osu in april she was instruction coordinator at the gelman library, george washington university, from - . during her time at gwu, jennifer assisted with the campus-wide implementation of a new first-year writing program that integrated librarians into the curriculum.
prior to gwu she worked as a reference/instruction librarian at the state university of new york (suny), college at brockport from - . she holds a bs in journalism and an mls from syracuse university and an ma in education and human development from george washington university.
faye a. chadwell was appointed the donald and delpha campbell university librarian and osu press director at oregon state university in may . prior to this appointment she was osu's associate university librarian for collections and content management, a position she'd held since august . an oregonian since , she also served as the head of collection development and acquisitions at the university of oregon libraries. she worked at the university of south carolina in columbia as the social sciences bibliographer and as a reference librarian from - . she holds a b.a. and an m.a. in english from appalachian state university and an mls from the university of illinois at urbana-champaign.
introduction
what does a 21st century university look like? this question motivated oregon state university’s (osu) president and provost to announce a plan to realign the university to focus on strategic directions and priorities in three signature areas: advancing the science of sustainable earth ecosystems, improving human health and wellness, and promoting economic growth and social progress. implemented during the - academic year, the final realignment created four divisions out of existing colleges. osu libraries (osul) chose to use the university’s desire for realignment as an opportunity to seriously review library functions and organization with the intention of delivering library services that address anticipated needs of the osu community. how is a 21st century library organized to meet the needs of its community?
this article will provide an overview of the literature on change in academic libraries and the articles used for the realignment, a description of the realignment process, responsibilities of the new departments and lessons learned.
literature review
this literature review will set the context for this case study, covering articles from the united states that focus on change in academic libraries and the articles chosen and consulted by library administrators and department heads during the realignment process. the libraries’ plan to align with the rest of the campus was reinforced by franklin ( ). his case study focused on the university of connecticut’s alignment of the library with the campus strategic plan. in the university of connecticut libraries rethought its approach to services and created a reorganization project team. the reorganization’s focus was to “shift the libraries’ focus from an organizational structure based on internal library functions to a structure designed to support the university’s academic plan” (p. ). the final reorganization included program areas: academic research services; undergraduate education and access services; the thomas j. dodd research center; the regional campus libraries; and central services. franklin concludes that the success of the reorganization will be measured over the next five years using metrics from the libraries’ strategic plan and demonstrating “how well integrated the uconn libraries have become in the university’s efforts to carry out its academic plan” (p. , ). fitch, thomason, and wells ( ) also focus on a library-wide reorganization, at samford university in alabama. they discuss how samford university’s library completely rethought their physical and staff operations and implemented an organizational structure “to meet the challenge of service excellence with flexibility, enthusiasm, and efficiency” (p. ).
a team of professional staff was tasked with the reorganization with the goal to improve services and processes. the team’s process proved unique as they solicited input from the whole library staff to gather ideas for the new organizational structure, a new structure that eventually allowed the library to automate and plan for a building expansion. the authors conclude that “professional and support staff must be empowered to participate in planning and changing their library to produce a responsive, customer-centered environment” (p. ). several lis authors address the question: “what does the 21st century library look like?” barclay ( ) discusses the planning behind the uc merced library as a model for structuring the 21st century research library. he provides insights on library as place, signage, rfid, and collections. while barclay does not explicitly discuss merced’s organizational structure, he does outline three principles for creating a 21st century research library. principle one is to begin by asking “what is it we want to do?” starting with this question can help break down traditional thinking. principle two focuses on technology, stressing how “we always maintained that we would use technology to achieve specifics ends but would not use technology for its own sake” (p. ). principle three is to plan bravely, with courage and the knowledge that plans may fail to work. to engender collaboration and build investment in the process, during osu libraries’ realignment process department heads were asked to provide articles that focused on the future of collections, services, and space or other relevant topics that could inform our realignment discussion. the articles selected ran the gamut of topics. a few selected here reflect the trends in academic libraries but also emphasize some of the strategic challenges informing the university’s realignment needs and opportunities.
every librarian knows that collection acquisition and management are undergoing changes that will have impact across library units. anderson ( ) counsels librarians, especially “serialists” to “future proof” themselves by recognizing that the future for library collections will focus on unique collections and digital resources. he makes five predictions that describe this future while also providing one strategy per prediction to help libraries confront the future. anderson concludes that the successful library of the future “will be the one that has found new ways of meeting its stakeholders’ needs” (pg. ). how do libraries reconceptualize collections, space, and services? in what proved to be a pivotal article, pritchard ( ) discusses these areas and writes “the key is in reorienting our work to a much more refined definition of services, focusing on unique strengths, local needs, and multiple ways of delivering information” (p. ). she emphasizes the importance of defining the library’s mission and users. in terms of services, she states as libraries try to locate new services within typical organization charts, where does one put things like digital publishing, scholarly communication support, or information management consultation, in which we advise faculty about structure and metadata for their own databases and web sites? these are increasingly important services, yet formalizing them requires taking apart older notions of departments and tasks (p. ).     she concludes by advocating for a deconstruction of the library as a way of bringing a new perspective to library organization. a major focus for osu is international students and the internationalization of the university. 
becker’s ( ) article on internationalization of higher education and the role of australian academic libraries provides an in-depth case study analysis of two libraries identified as engaging in wide and deep internationalization practices: ibis university and the university of greenfields. because higher education institutions across the globe are seeking to increase their globalization, becker offers ways to apply the findings of her research for other libraries seeking to help their institutions internationalize. two articles included in the osu libraries’ realignment reading packet focused on k- students and their needs to provide some understanding of the users that academic libraries can expect to encounter in to years. lawrence hardy ( ) argues that since the availability of information has expanded exponentially, well-trained school librarians ought to be positioned to help students navigate and evaluate the wealth and diversity. “libraries,” states joyce kasman valenza, a school librarian/blogger quoted in hardy’s article, “need to change from places just to get stuff to places to make stuff, do stuff, and share stuff” (pg. ). elizabeth haynes ( ) writes about the class of and the challenges this group of “digital natives” will bring to librarians and other educators not only because of the anticipated growth and development of hand-held devices and other digital technology but also because of the culture or mindset that these digital natives bring to the classrooms and libraries as a result of their early and ongoing exposure to technology. other pieces consulted during the realignment process were the ithaka report, the arl scenarios report, and acrl’s futures thinking report on academic libraries in     (schonfeld & housewright, ; arl ; staley & malenfant ). 
additionally, articles from mainstream media were consulted when highlighting issues libraries will be facing in the near and not so near future, such as digital decay and the rise of digital content (cohen ; darnton ; kellogg ; kolowich, ).
background
osu libraries (http://osulibrary.oregonstate.edu/) provides support to meet the teaching, learning, and research needs of osu's students, faculty, and staff. during the - academic year, the university asked all units on campus, including the library, to submit a plan for strategic realignment and budget reduction. the goals of the realignment were to:
• restructure administrative and academic units to advance the university’s strategic goals and signature areas;
• achieve budget savings; and
• develop a system to monitor progress, accountability, and savings.
prior to submitting the libraries’ response, the university librarian (ul) and two associate university librarians (auls) discussed whether the current organizational structure aligned with the goals and major initiatives in the library’s strategic plan. how would a realignment benefit the library and the university and what might it look like? this initial brainstorming session on possibilities for realignment yielded enough ideas to move forward with engaging department heads in the visioning process. priority areas emerged in teaching, scholarly communication, open access, community engagement, data curation, and digitization. although prompted by the campus request, the libraries had continuously evaluated and reexamined their organizational structure along with their strategic goals. the existing organizational structure included nine departments. figure 1 provides the organizational chart showing the reporting lines for these nine departments.
[insert figure one]
process
at the beginning of the realignment process in march , the ul and auls met to map out the overall goals and decided to exclude the osu press from the realignment. the process and content for the realignment moved to the auls because of their close integration and involvement with the department heads whom they supervise. before meeting with the department heads, the auls met several times to discuss potential realignment models, focusing their discussions on assumptions related to organizational development and asking what might shape or influence the future organizational development of osul and academic libraries in general. the auls also outlined what activities librarians and library staff might pursue more frequently in the future as a part of their regular responsibilities. these activities include instruction, outreach, assessment, scholarly communication, digital publishing, and working with metadata. the outcome of these meetings was a list of activities with broader descriptions that led to a preliminary list of possible unit configurations and missions:
• get it department – acquisitions, circulation, ill, collection development, selectors
• build a learning environment – teachers, space
• knowledge organization and distribution – catalog, institutional repository
• scholarly communication
• unique at osu
• assessment
• technology
the next step was gathering input from the libraries’ administrative team known as library administration, management, and planning (lamp). lamp’s department heads (excluding osu press) were divided into two groups with an aul leading each group. the auls made a conscious decision not to create these groups simply by aligning lamp members who reported to a specific aul (respectively, the aul for innovative user services and the aul for collections and content management). instead the membership of these groups was mixed to facilitate cross-pollination of ideas.
each group was given two months for their discussions and charged with presenting, as a final product, a new organizational structure with department descriptions and a new organizational chart. the following sections provide an overview of the process used by each aul and their group to come up with their proposed reorganization. the group led by the aul for innovative user services started by writing down all the current library services and activities that they could think of. this activity was followed by the creation of a list of services or activities the group considered a priority in the next - years. these exercises were done individually and then each member shared the activities/services seen as a priority so areas of overlap could begin to be identified. the priority activities/services were grouped into categories and included assessment, outreach, space, data curation, and digital collections. the group also identified activities the library could stop doing, including journal claiming, cataloging, subject specialists, fines, copiers (only provide scanning), and book processing. from the categories the group began moving items under more central categories with significant discussion focusing on areas of overlap. the final plan included five departments or priority areas as figure illustrates.
[insert figure ]
as outlined at the beginning of the realignment process, a document was produced that included a definition for each area, unit activities, and an organizational chart. the goal of this group’s realignment plan is to maintain the libraries’ relationship to the university and to continue to offer services and resources that make students and faculty successful in their learning, teaching and research.
prior to the first group meeting led by the aul for collections and content management, department heads were asked to come prepared to draw on what they had learned from readings, what they knew about how other libraries and organizations are evolving, and what positions other libraries are formulating that might eventually take prominence in a new structure. at this group’s first meeting, members participated in a visioning exercise that asked them to imagine what the osul organizational structure might look like and act like in the next - years. group members were encouraged to forget about what the existing units and organizational structure do and to not consider (at least temporarily) the work that many library staff were currently performing. the brainstorming/visioning exercise made use of questions that the auls had created during their earlier meetings. at the next meeting, the aul employed yet another series of questions intended to envision a new organizational design. this meeting concluded with each lamp member attempting to complete the following statement with five different responses: we should be delivering or providing (fill in the blank) services or performing (fill in the blank) operations/functions. as a follow up, each individual was asked to explain why the service or function was important. 
here are the important areas of focus established as a result of this meeting and the prior meeting:
• support for the knowledge creation process (copyright, scholarly communication, data management);
• provision of a centrally located and inviting place for users to study, collaborate, and socialize;
• promotion of unique resources held at osu libraries;
• customized information retrieval and fulfillment – the capacity to fill any sort of researcher request whenever and whatever via transparent processes and systems;
• engagement with users, especially undergraduates, to help them be successful;
• application and development of tools and services to solve problems for users and for library staff.
at this group’s final meeting, lamp members reviewed the notes and lists from the previous meetings. the emphasis needed to be on creating units that would advance the libraries and support areas of endeavor that need to be supported. as a result of this activity, the group proposed six departments or units to carry out the significant future areas of focus. these units were:
• center for digital scholarship
• emerging technologies, trends, and services
• user services (also known as the get it department)
• primary research center
• learning and liaison services
• guin library
though the auls did not receive any formal feedback on the separate processes (and their respective merits and drawbacks) that each pursued, informal feedback from lamp members did indicate that the department heads were pleased with the collaborative nature of the activities and the level of input they were able to provide.
final realignment and implementation
the goal of the realignment exercises was to strategically realign the libraries so as to position library staff to anticipate and address the needs and expectations of users at present and in the future.
at the lamp meeting on may , each group presented their final recommendation including organizational structure and answered questions. while each group ended up with a different final product, there were clear areas of overlap in focus and direction that created a foundation for the realignment. after this meeting the auls regrouped with the ul to finalize an organizational structure that brought together the best structure from each group’s recommendation. this structure embedded the scope of foci that the group led by the aul for innovative user services had identified. this structure also drew upon the departmental combinations that the other group led by the aul for collections and content management created. it also recognized that given the staff size of osu libraries, the final units would still need to draw upon fte from more than just a single unit to provide necessary operations and services. the ul and auls then tentatively assigned faculty and staff to appropriate units and created timelines for communicating when departments would be dissolved, new reporting structures would be in place, and staff would move to new locations. in july , the auls planned a retreat with lamp to discuss the final phases of the realignment. this retreat covered the mission of the departments and their titles; staffing – assignment of librarians and staff to departments; location--where/how the new departments will be formed; timeline for implementation; gaining feedback on the realignment; and future position searches that would need to be undertaken. a second phase of the realignment focused on the merger of     the special collections and university archives units into one. this phase was initiated (and planned for) following the retirement of the head of special collection in january . it also involved engaging an outside consultant to provide expertise on the merger of workflows, consolidating physical collections, and focusing staff expertise. 
the final realignment included six departments, which are described below. figure demonstrates the realigned library’s organization.

1. the collections and resource sharing department (crsd) brings together fte from acquisitions, collection development, ill and access services into a single unit to ensure that users have the content they need for learning, teaching and research. the creation of crsd acknowledges the importance of resource sharing, including collaborative collection development, and new methods of acquisition, such as user-driven collection building, as strategies to meet this goal.

2. the teaching and engagement department emphasizes the increasing importance that information literacy plays in the success of osu students and the university (increased retention of students, the development of lifelong learners, support for an information literate society). this department will focus not only on the teaching aspects of information literacy but also on developing physical and virtual learning environments conducive to student learning and success.

3. the center for digital scholarship and services is dedicated to supporting osu’s research enterprise through the organization, delivery, management and preservation of a wide range of digital and print resources for scholars and students at osu and beyond.

4. emerging technologies and services (ets) leads the development and support of the libraries’ it infrastructure and online environment. ets monitors trends and new technologies. its pursuit and support of new tool and service development as well as collaborative partnerships will position the libraries to respond to the evolving information landscape.

5. branch libraries department - this department includes the multifaceted operations of the osu libraries’ two branch libraries, the guin library at the hatfield marine science center and the osu-cascades library embedded in the central oregon community college library. while separated by physical distance, the branches address similar challenges of providing seamless and appropriate services for our users. though the users and institutional settings require different approaches to providing services and resources, they also require alignment with overall osu libraries’ policies, mission and vision.

6. special collections and archives research center (scarc) - this newer unit draws upon the distinctive materials within special collections and university archives. its focus is practical in that it will create a single service and physical access point for the libraries’ unique collections of records, manuscripts, and visual materials. the vision for scarc is to integrate the libraries’ significant special collections and archival holdings more thoroughly into the research and teaching of the university, especially by engaging student workers and student researchers not only in processing and describing archival content, but also in creating new knowledge based on the holdings.

[insert figure ]

looking ahead

the process of realignment and reorganization is never complete, and libraries need to remain flexible and agile to meet users’ evolving needs. while osu libraries’ new organizational structure was implemented in september of , additional changes were anticipated. for example, the retirement of the head of special collections provided an opportunity to look at the merger of special collections and university archives. osu libraries also anticipates changes with services at the branch libraries. to make any realignment successful, however, the process needs to be communicated clearly and the entire library, not just the administrative team, needs to understand and invest in the new model. the department heads need to communicate and work with their staff throughout the process and plan for the changes. as with any change process, each experience also provides learning opportunities.
here are four lessons learned from this realignment process.

1. know your destination. libraries undertake realignment for various reasons. from an administrative perspective, it is important to provide an overall vision and context. there should be clearly articulated goals for the realignment and an outline of what the realignment will accomplish. this will ensure that the realignment is purposeful rather than just an interesting intellectual exercise. it will also give staff across all units a common understanding of how their new unit may contribute to the overall library organization and mission.

2. communication. a challenge in any organization is to communicate the information needed in a timely fashion. during a time of transition and change, communication is especially important. while the administrative group talked regularly about the realignment, it could have been emphasized and communicated more thoroughly to the library faculty and staff. this would have allowed questions and concerns to be addressed early on, especially in regard to the timeline. it also would have allowed for broader input from across the library, which is a key factor in successful reorganization for any institution or agency.

3. assessment. as with any change, assessment is key to knowing its success. establishing indicators of success for the realignment and then conducting follow-up would have provided valuable input for future processes. more follow-up with department heads and librarians should have been done to gauge their understanding of the process and to learn what worked.

4. connection to the university’s realignment. the university’s realignment was the initial driving force for osul’s decision to begin its realignment. its focus was largely on consolidating colleges and departments, especially as a means to reduce expenditures.
while the budgetary focus was not reflected in osul’s realignment process there is an impact in repurposing positions and collections. the libraries are better positioned for the future by using its resources more wisely. in the future, the university’s focus on collegial and departmental structures will require osul’s examination of the impact these changes might have on the libraries’ organization in the future, especially within individual units.

conclusion

there are many valid or worthwhile reasons why an academic library might elect to undergo a realignment process: tremendous technological change, severe budgetary need, or evolving management or change theory. these are especially valid when careful planning is undertaken so the realignment is responsive rather than reactive to the present and perceived future needs of users. while users were not directly consulted during this realignment process, it is expected that user input will be received during the next strategic planning cycle. it is also important to remember that the process for realignment is not static – it is evolving and iterative. this is what distinguishes osul’s process--that the libraries’ realignment was driven primarily by the desire to prepare its organization proactively for success as a 21st century academic library serving the 21st century academic library user.

white paper
report id:
application number: pw- -
project director: john renaud
institution: university of california, irvine
reporting period: / / - / /
report due: / /
date submitted: / /

white paper
pw- -
piloting linked open data for artists' books
white paper lead author: laura j. smart
project director: john p. renaud
university of california, irvine
/ /

pw- - : piloting linked open data for artists' books white paper uc irvine libraries page of

“piloting linked open data on artists’ books: a case study in interoperability and sustainability”

abstract: artists’ books are a common component of many art libraries, and are of great interest to artists and art historians because of their highly visual, interactive and sculptural qualities. however, many of these art-like qualities remain under-described when only represented in the typical library catalog. uci libraries completed a national endowment for the humanities - humanities collections research resources foundations grant to extend interoperability and discoverability of artists’ books through the use of linked open data (lod). we implemented processes of transforming legacy metadata from our library catalog to linked open data while enhancing records with visual resources association (vra) core elements. in addition to publishing linked open data with digital surrogates of artists’ books in our special collections, we built a prototype visualization tool to allow researchers to traverse relationships within and between the works, discovering connections between artists, genres, techniques, and materials.
this article describes the behind-the-scenes processes and challenges in making the project interoperable, with an emphasis on the metadata aspect of the project, and offers ways to sustain the project’s growth through the recommendations and toolkits gathered from the council on library and information resources (clir), digital library federation (dlf), and ithaka s+r. interoperability and sustainability are major issues for the continued growth of digital humanities; once applied to art information, digital humanities projects are even more likely to suffer complications related to these issues. this article describes a digital humanities project using art information and directly discusses the lessons learned and recommended resources for tackling these issues head on.

introduction:

artists’ books are well known in the art information community to defy easy categorization. according to scholar, critic, and book artist johanna drucker, an artists’ book "interrogates the conceptual or material form of the book as part of its intention, thematic interests, or production activities." this artistic intervention can include "fine printing, independent publishing, the craft tradition of books, conceptual art, painting and other traditional arts, politically motivated art activity and activist production, performance of both traditional and experimental varieties, concrete poetry, experimental music, computer and electronic arts, and last but not least, the tradition of the illustrated book, the livre d’artiste." many artists’ book collections are held in libraries, in both art libraries and special collections libraries. as an item in the library’s collection, an artists’ book is typically described in the library’s catalog using standard bibliographic description.
a standard description includes title, author, imprint, place of publication, year, notes, subjects, and genres, but doesn’t necessarily convey the complex and nuanced meaning and associations triggered by interaction with the object, or even give much of a sense of what the artists’ book looks like. cataloging descriptive standards from the rare books and manuscripts section (rbms) of the association of college and research libraries (acrl) help a great deal by providing more detailed and nuanced descriptions, but are better suited to rare book objects than to art objects. furthermore, they lack subject terms for the more nuanced concepts that artists grapple with through the interplay between form and content.

johanna drucker, the century of artists’ books (new york: granary books, ): , .

many institutions that house artists’ books have published images of them in their respective repositories, ensuring visual access to the collection. this is crucial not only for the types of objects being portrayed, but also for the user community most likely to use the objects, i.e., visual artists and art historians. a number of artists’ book collections have been at least partially photographed, enhanced with vra core, and published in digital asset environments, such as the joan flasch artists’ book collection of school of the art institute of chicago, or reed college’s digital library collection of artists’ books. others have gone even further in developing research tools: johanna drucker has created artists’ books online, a portal that describes and illustrates an impressive number of artists’ books; book artists unbound, from university of miami, enhances artists’ books records through encoded archival context for corporate bodies, persons, and families (eac-cpf); and artists’ books dc is a tool that geolocates collections in the washington, d.c. area.
uci’s project pushed these advances further to develop workflows and tool recommendations so that other libraries may implement these collection enhancements. our pilot has valuable takeaways for a variety of stakeholders, from visual resource professionals to art librarians, catalogers, and those who work in special collections and digital scholarship centers.

linked data

as christian bizer, tom heath, and tim berners-lee state, the adoption of the linked data best practices has led to the extension of the web with a global data space connecting data from diverse domains such as people, companies, books, scientific publications, films, music, television and radio programmes, genes, proteins, drugs and clinical trials, online communities, statistical and scientific data, and reviews.

christian bizer, tom heath, and tim berners-lee ( ) linked data - the story so far. international journal on semantic web and information systems, vol. ( ), pages - . doi: . /jswis. . accessed as a preprint here: http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf

data is linked through the use of uniform resource identifiers (uris) and serialized in a resource description framework (rdf) format. a number of authority files have published their data as linked data, including library of congress name authority file (lcnaf), library of congress subject headings (lcsh), and getty authorities, including art & architecture thesaurus (aat), thesaurus of geographic names (tgn), and union list of artist names (ulan). uci libraries staff believed linked data could be used in a browse/discovery layer to expose the complex and nuanced meanings hidden in the existing metadata.
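to make the uri-and-rdf idea above concrete, here is a minimal sketch in python that writes three statements about one book as n-triples, one of the standard rdf serializations. everything named here is an assumption for illustration: the example.org uris, the title text, and the choice of dublin core terms as predicates are placeholders, not identifiers or modeling decisions taken from the project.

```python
# Hypothetical identifiers throughout: the example.org URIs and the title
# are placeholders, not data from the project described in this paper.
DCTERMS = "http://purl.org/dc/terms/"


def triple(subject, predicate, obj, literal=False):
    """Serialize one subject-predicate-object statement as an N-Triples line."""
    if literal:
        # Escape backslashes and double quotes inside a literal value.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    else:
        obj_term = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} ."


book = "http://example.org/artists-books/book-1"
statements = [
    triple(book, DCTERMS + "title", "an example artists' book", literal=True),
    triple(book, DCTERMS + "creator", "http://example.org/agents/artist-1"),
    triple(book, DCTERMS + "subject", "http://example.org/concepts/imprisonment"),
]
ntriples = "\n".join(statements)
print(ntriples)
```

each printed line is one statement. swapping a placeholder object uri for a uri from a published authority file (lcnaf, lcsh, or the getty vocabularies) is what would connect a local record to data held elsewhere.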
linked data has the promise to harness authoritative metadata in real time to dynamically display related data on a particular topic. linked open data has all the qualities of linked data while also adhering to the qualities of openness: it is available for others to use with the greatest possible technical ease and the clearest possible license. staff across a wide variety of departments at university of california, irvine libraries wanted to try using linked data for themselves, using a known issue in the art information community: the impossibility of conveying the meaning of artists’ books through a typical library catalog.

scope of the project

uci libraries was awarded a humanities collections research resources foundations grant from the national endowment for the humanities to pursue piloting linked open data on artists’ books (plodab). we identified the following high level procedures as necessary for our grant work:
• choose books for pilot based on differences and similarities; take pictures of books; upload to digital asset management system (dams)
• identify systems needed for dams, application, and interface layers
• extract, enhance, and transform legacy metadata from library’s catalog to a form of lod that visualization tool can use
• build a visualization tool using linked data to display the relationships between books
• test visualization tool on user community and gather audience for project launch and next steps

to focus on concepts of interoperability and sustainability, this article will focus on the metadata and user testing aspects of our work.

identifying processes for transforming metadata to vra linked data

we had three guiding principles for designing the workflow for creating linked open data (lod). our metadata should be interoperable with metadata from other artists’ books collections and with linked data practices used within the visual arts community.
our processes should be scalable, meaning they could be used for very large sets of metadata beyond the small sample of records we used for the pilot. they should be extensible, meaning others could add on processes to further automate workflow or metadata enhancement.

we designed our processes based on several assumptions. we believed other libraries would have their artists’ books metadata encoded in marc and stored in a relational database or integrated library system (ils). we expected that the people applying our process might have varying levels of technical training. and, we anticipated that interoperability of our metadata would be based on our selection of vra as a descriptive standard. we presumed the community would trust our evaluation and choice of standard and guessed that many others were already using vra. finally, we assumed we would only be able to advocate for a high level generic process due to the wide range of local variation in applying descriptive standards.

we identified basic areas of transformation with associated tasks. the ordering of phases is malleable. local context determines what steps to perform in which software tool. there are a variety of ways of making the resulting linked data available. how one does the work depends on which choices are made. one may be able to do the transformation using the database in which the legacy metadata resides. or, one might have to use a variety of tools within a complex systems architecture. all steps should be automated wherever possible to meet the guiding principles of extensibility and scalability and to minimize the potential for introducing variability and error.
table : tasks for translating legacy metadata to linked data

functional requirements
• create list of desiderata for features of transformed linked metadata
• select descriptive standard and serializations based on desiderata
• select tools for storing and exposing metadata

analysis and cleaning
• review existing metadata for anomalies
• fix inconsistencies (normalize dates, correct spelling errors, etc.)

mapping
• review, select, and adapt crosswalks between source descriptive standard and target descriptive standard, or create new crosswalk if required

translation
• convert legacy description to new descriptive standard
• convert legacy encoding to new encoding

enhancement
• identify appropriate linked data vocabularies
• match data to associated uris

testing
• try out process for translation and/or enhancement
• revise if required

we created a rubric to analyze rda, cdwa, and cco encoded in mods as potential alternatives to using vra. there are documented questions for considering context of use when choosing a descriptive metadata standard. see for example, chapter in miller, stephen j. ( ) metadata for digital collections: a how-to-do-it manual. new york, neal-schuman publishers.

we had a complex systems architecture and specific linked data vocabulary requirements which impacted our selection of tools and the design of our work process. our legacy resource description and access (rda)/anglo-american cataloging rules (aacr2) machine readable cataloging (marc) metadata was managed in our millennium ils. our works were stored in a nuxeo digital asset management system hosted by california digital library (cdl), the university of california-wide centralized service platform for digital services and tools. and, our user front end was a bespoke visualization tool requiring our data to be provided as a static rdf-xml file.
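the mapping and translation tasks in the table above can be sketched as a lookup table plus a small function. this is an illustrative assumption, not the crosswalk the project used: the marc tags are standard, but pairing them with these particular vra core elements, the `CROSSWALK` and `translate` names, and the sample record are all hypothetical.

```python
# Illustrative crosswalk only. The MARC tags are standard, but the choice of
# target VRA Core elements is an assumption, not the project's actual map.
CROSSWALK = {
    "100": "agent",     # main entry, personal name
    "245": "title",     # title statement
    "260$c": "date",    # imprint, date of publication subfield
    "650": "subject",   # topical subject heading
    "655": "worktype",  # genre/form term
}


def translate(record):
    """Convert {marc_tag: [values]} into {vra_element: [values]}.

    Tags with no crosswalk entry are returned separately so that a person
    can review them instead of having them silently dropped.
    """
    vra, unmapped = {}, {}
    for tag, values in record.items():
        element = CROSSWALK.get(tag)
        if element is None:
            unmapped.setdefault(tag, []).extend(values)
        else:
            vra.setdefault(element, []).extend(values)
    return vra, unmapped


# Hypothetical source record, keyed by MARC tag.
sample = {
    "245": ["an example title"],
    "650": ["imprisonment", "war"],
    "500": ["a general note"],
}
vra, unmapped = translate(sample)
print(vra)
print(unmapped)
```

keeping the unmapped fields visible reflects the point made in the text that local variation limits what a generic process can cover; a real crosswalk would need to be adapted from published ones and reviewed against local cataloging practice.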
we needed to ensure our linked data utilized the library of congress and getty vocabularies predominantly used by the visual arts community. our choices were also affected by the technical knowledge of our staff. our subject expert had little training on working with marc metadata within the ils.

the choice of nuxeo impacted the mapping because it introduced the need for multiple maps. cdl uses the digital public library of america (dpla) application profile of dublin core (dc) for metadata ingest and storage in nuxeo. the metadata team therefore had to map to that standard in addition to ensuring the metadata could be exported to vra for the ultimate creation of linked data in rdf. fortunately, the team didn't have to work from scratch. there are several metadata crosswalks available from other institutions, including library of congress. the team reviewed these crosswalks and made adaptations based on our local context.

we decided to work with our metadata outside of the ils environment, for three reasons: millennium doesn’t have internal conversion tools; it would be easier for staff to manipulate the metadata in spreadsheets due to familiarity with that software; and nuxeo required data to be ingested in delimited form, as did the tools we were considering for cleaning and enhancement.

our final process went roughly as follows:
1. create functional requirements which specify the content and encoding of the metadata you need and how it is to be handled
2. export source (marc) records from ils
3. perform initial marc enhancement. we used the marcedit linked data tool to automatically add uris from the library of congress vocabularies to the source marc file.
4. convert marc to delimited form. we used the marcedit export tab delimited tool.
5. clean and enhance the marc-ordered delimited data to create a “canonical” source file
   a. clean data in spreadsheet form. we tested openrefine and considered excel macros but found it more efficient to do the clean-up manually due to the small size of our pilot data set.
   b. reconcile the data against the getty vocabularies
      i. we tested the openrefine sparql reconciliation tool. it was unsuccessful due to lack of matches within the source metadata.
      ii. we manually added getty uris to our spreadsheet.
6. create metadata map for converting descriptive content (rda/aacr2 data to vra and dc)
7. ingest “canonical” tab delimited file to nuxeo and convert description information to dublin core
   a. give metadata map and file to cdl to program batch loading script
   b. test load
   c. review quality post-ingest
   d. revise “canonical” source data and load production version in nuxeo
8. convert description to vra and rdf for linked data
   a. convert tab delimited “canonical” data to comma delimited data
   b. run conversion
   c. edit xml files. we found the output from the heidelberg tool to be excellent, but insufficient. it didn’t include enough repeating subject-related elements to match our source data. we also wanted to use some vra attributes which weren’t included in the tool. we did the hand editing using oxygen xml editor.
9. publish linked data. in our case, it’s a static rdf-xml file posted on the uci libraries github.

the marcedit openrefine integration tool, which might have accelerated our work, was not available at that time.

hindrances to metadata interoperability

our experience creating linked data uncovered some of the well documented barriers to making interoperable metadata. for example, our results were unsatisfactory when we tested enhancement with the openrefine getty reconciliation tool. there was little keyword string matching due to synonymy and term granularity differences. synonymy causes an obvious mismatch in search terms.
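the synonymy problem can be illustrated with a small exact-match reconciler. this is a plain-python stand-in, not the openrefine reconciliation tool the project tested, and the vocabulary entries, labels, and example.org uris are invented placeholders rather than real getty records. it shows how matching only on preferred labels misses a synonym that an alternate-label index would catch, while terms with no match at all are queued for a person to review.

```python
# Hypothetical mini-vocabulary: labels and URIs are placeholders,
# not real entries from the Getty vocabularies.
VOCAB = [
    {"uri": "http://example.org/vocab/1", "pref": "letterpress printing",
     "alt": ["letterpress"]},
    {"uri": "http://example.org/vocab/2", "pref": "screen printing",
     "alt": ["silkscreen", "serigraphy"]},
]


def normalize(term):
    """Lowercase a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def build_index(vocab, include_alt):
    """Map normalized labels to concept URIs, optionally including synonyms."""
    index = {}
    for concept in vocab:
        labels = [concept["pref"]]
        if include_alt:
            labels += concept["alt"]
        for label in labels:
            index[normalize(label)] = concept["uri"]
    return index


def reconcile(terms, index):
    """Return (matched term -> URI, terms left for manual review)."""
    matched, review = {}, []
    for term in terms:
        uri = index.get(normalize(term))
        if uri is None:
            review.append(term)
        else:
            matched[term] = uri
    return matched, review


source_terms = ["Silkscreen", "letterpress printing", "pochoir"]
strict = reconcile(source_terms, build_index(VOCAB, include_alt=False))
loose = reconcile(source_terms, build_index(VOCAB, include_alt=True))
print(strict)
print(loose)
```

with preferred labels only, the synonym falls into the review list; with alternate labels indexed, only the term absent from the vocabulary remains. exact string matching of this kind does nothing for terms that sit at a broader or narrower level than the vocabulary entry, which still need human judgment.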
granularity problems occur when concepts are in the same classification category but at a broader or narrower level. the aacr2 and rda descriptive cataloging rules dictate that catalogers describe the work at a level of specificity reflected by the “item in hand” and with literal transcription of some elements from the piece. so, two works may be in our politics collection but use the subject terms war and imprisonment. we might prefer to use the getty vocabularies at the broader level to facilitate grouping concepts in the browsing interface.

we also had a gigo problem when we attempted to automate conversion from aacr2/rda marc to vra. we wanted to use the vra material, technique and cultural context elements. we had difficulty because of semantic issues in our subject terms and hidden information within the records. the library of congress subject headings (lcsh) used in our source records created prior to conflated subject and genre terms. in addition, our local subject cataloging practice was to put terms describing technique and material into general subject fields. cultural context was buried as implicit data within the geographic location of publication.

the gigo issue was not easily addressed. the subject librarian had to very carefully review marc subject and imprint fields to manually match concepts with getty vocabularies if we were to expose the richness of the works with linked data triples. detailed subject analysis and remediation required her domain expertise in visual arts. this is time consuming original cataloging work, akin to creating records from scratch. it would not be generalizable to other institutions nor scalable to collections with a large number of legacy records.

we tested an xslt provided by northwestern university but found the source xslt would need too much customization to work for our needs. we tested the vra core xml transform tool designed by heidelberg university and generated acceptable baseline rdf. http://kjc-ws .kjc.uni-heidelberg.de: /exist/apps/csv xml/index.xq

the literature on metadata interoperability and the issues in metadata quality which affect interoperability go back to the early days of digital library development. see for example the published works of thomas bruce, diane hillmann, naomi dushay, sarah shreeves, and jenn riley.

gigo, or “garbage in, garbage out” is a phrase used in computer and information science to describe how computers can only work with the data as given. if there are problems with input, the output will reflect the same problem.

lessons learned and suggestions for scaling metadata work?

our experience produced many lessons and shaped our thinking about what we might do differently to make it easier to work with a large number of records when transforming legacy metadata to linked data.

lesson 1: choose tools and partners based on current capacity

our choice to work with cdl and use nuxeo as dams was based on our long term aspiration to increase participation from other campuses within the uc and to incorporate more collections into a single artists’ books resource for access and discovery. nuxeo met our functional requirements. yet, cdl had only recently migrated their in-house dams and hadn’t yet implemented all aspects of the software, including features for exporting metadata as linked data. there is risk in selecting metadata management systems based on undeveloped or as-yet-to-be-implemented features. the delay in developing export features caused us to create our own static linked data.
this resulted in silos of data without an easy means to keep those silos synchronized. it would be advisable to create memorandums of understanding for expected deliverables and deadlines when working with external partners for development.

lesson 2: prioritize your functional requirements in detail

we generally considered features to be required or desired without any detailed ranking because it was difficult for us to reach consensus and we had a limited amount of project time. it would have helped to spend time prioritizing features up-front. we would have saved time doing serialization conversions if we had ranked metadata transfer features higher and selected our systems accordingly. for example, if we could have imported raw marc data into our dams and let the system make it dublin core, we would have had less work in configuring our delimited data.

lesson 3: follow best practices for data migration

we are aware that it is a well-known best practice to review, clean, and enhance metadata prior to transforming its content or structure during the migration process. we made the choice to work with the metadata in spreadsheets outside of the ils, however, because some of our team members were not trained to use traditional cataloging rules or systems. it made it easier for the metadata neophytes to do the work but it added complexity to managing different versions of the metadata. we may have saved time by training team members to work with metadata at its source.

lesson 4: use version control tools

our proliferation of working files caused confusion, delay, and extra work. version control may have helped us avoid that.

lesson 5: legacy metadata is messy metadata. accept semi-automation and good-enough records

there is no fully automated way of converting/migrating metadata without lossiness or error.
there are scaling and granularity issues within the legacy metadata as a result of descriptive cataloging rules, semantic interoperability, and local practice and/or cataloger error. there are tools which can handle serialization changes or descriptive standard mapping for thousands of records. variations in source data, however, are not consistent enough to be covered by algorithms. thus, software tools or programming scripts can only semi-automate clean-ups and quality control. humans will need to manage metadata mapping and review quality control prior to ingest and after migration. there may not be additional time, budget, or labor for fixing the complex issues within the source data. if that is the case, then one will need to accept less-than-perfect description within the target application.

sustainability

in the midst of working on the grant, other environmental factors made an impact on how we measured its success. the digital scholarship services unit (members of which were the main source of expertise for project management, metadata, and systems for the grant) was charged to develop business cases for implementing digital scholarship services for the uci campus community, including data science, digital preservation, scholarly communication, and digital humanities. emilee mathews, creative lead on the grant project, was appointed the digital humanities coordinator and charged to develop the business case for why a project like plodab should continue to be supported by the libraries. using this charge as a springboard, our main questions were:

• if we continue to build the tool, will our campus community use it in research and teaching?
• will other libraries, archives, and museums find our project and its associated documentation useful? would they want to partner with us to develop it further?
• what are fruitful sources of additional internal and external funding?

several publications developed by the library community served us in good stead.
“fit for purpose: developing business cases for new services in research libraries” lays out the components of a business case tool set to justify the launch of a new service to library administration, and provides guidelines on how to go about doing so. the ithaka s+r report “sustaining the digital humanities: host institution support beyond the start-up phase” discusses the results of the survey, site visits, and interviews conducted to determine how institutions can prepare themselves to sustain digital humanities projects, and derives recommendations. additionally, team members attended workshops by the ithaka team which were formative in establishing documentation, including “building a business case toolkit” and “finding and keeping an audience in a competitive environment.”

“fit for purpose: developing business cases for new services in research libraries.” http://mcpress.media-commons.org/businesscases/

nancy l. maron and sarah pickle: “sustaining the digital humanities: host institution support beyond the start-up phase,” june , . http://www.sr.ithaka.org/wp-content/uploads/ / /sr_supporting_digital_humanities_ f.pdf

findings from ithaka and clir reports

throughout these materials, recurrent themes of institutional priorities, culture, and capacity, robust user community, and scalability of efforts were pointed to as main factors for the establishment or continuation of a proposed service or project.

mission

a sustainable project needs to add value to the institution’s mission, and to be strategically tied into the institution’s larger priorities.
as “sustaining dh” points out, “the system that will work best for an institution, its faculty and staff, is the one most closely tailored to the goals that institution holds dear.” “fit for purpose” is influenced by social entrepreneurship, which emphasizes developing services in accordance with the institution’s mission, and further recommends that the team proposing potential services “examine the library’s mission to determine if it would need to be modified to include a proposed service.”

organizational capacity

the organization’s ability to engage a project is crucial to the project’s success, both in its initial development and its ongoing support. “fit for purpose” recommends that the institution reflect on whether it has “sufficient physical, human, and financial resources available to consider embarking on new initiatives at the present time.” “sustaining dh” notes that in-kind institutional support makes up the highest percentage of ongoing support funding. these findings indicate that institutions need to have prepared mechanisms in place, while project teams should have a clear sense of what happens to their project after the initial content has been created and the technical infrastructure has been developed.

goodness of fit

goodness of fit between institution, project, and timing is key to success. “sustaining dh” outlines several different models of institutional support for dh projects: the service model, the lab model, and the network model. the service model tends to be a centralized unit on campus “where the unit seeks to support faculty and students in their work…” by offering consultations, talks, workshops, and meetups, emphasizing support and education.
unlike the “lifting all boats” method embraced by the service model, lab models have a greater “focus on innovation and project development.” while the lab model excels at developing projects, “even the most successful labs cannot absorb the costs of long-term hosting and support.” by contrast to both service and lab models, in a network model of supporting dh work, “there are multiple units whose services have developed over time, in the library and it departments, but also visualization labs, centers in museums, and instructional technology groups, each of which was formed to meet a specific need.” understanding what model your institution fits into, or if there is no model currently in place, what model would fit most organically with institutional culture and current priorities, will help to determine if your institution has the capacity to take on the administrative and overhead costs associated with ensuring your project has a long life span.

building a business case toolkit http://www.sr.ithaka.org/services/workshops/#library-business-planning and finding and keeping an audience in a competitive environment http://www.sr.ithaka.org/services/workshops/#keep-an-audience

“sustaining dh,” . “fit for purpose,” . ibid., . “sustaining dh,” . ibid., . ibid., .
user base

the project planning team should conceive of what success looks like, how it is measured, and how to achieve penetration with its audience, particularly in projects based out of the library: “those focused on library-based digital collections will want to gain a strong sense of who is using the materials, and how, in order to make a strong case to administration for future support.” “fit for purpose” recommends talking to potential stakeholders throughout the development process to ensure that the proposed service has a defined audience and will have concrete impact. the report talks about value propositions as a way to demonstrate exactly “the intended benefits or value...to reaffirm the rationale for moving ahead.” the ithaka workshops provide crucial frameworks for developing value propositions and other user-focused brainstorming exercises.

scale solutions

in “sustaining the digital humanities,” the authors recommend institutions “figure out how to use scale solutions, without overly limiting the creativity and research aims of project leaders.” however, “does every research library need a digital humanities center?” cautions that “some large-scale projects to create comprehensive technical solutions for dh have demonstrated the danger of de-contextualizing scholarship and producing a homogenizing effect.” scale solutions also promote interoperability.

ibid., . ibid., . ibid., . “fit for purpose,” . ithaka s+r workshops, “building a business case toolkit” and “finding and keeping an audience in a competitive environment.” “sustaining dh,” . jennifer schaffner and ricky erway, “does every research library need a digital humanities center?” (dublin, ohio: oclc research, . http://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-digital-humanities-center- .pdf): .

applying these recommendations to plodab
mission: plodab augmented the library’s mission to advance research and innovation while preserving the cultural record. the uci libraries university librarian encourages library staff to pursue external funding opportunities; plodab was one recent larger-scale grant project.

organizational capacity: uci libraries has staff with expertise in metadata, web services, systems, special collections, and visual arts; the grant allowed us to hire a short-term programmer to develop the visualization tool. the project had a large impact on several project team members’ workloads and took up a lot of time from the members of the digital scholarship services team, which hindered their ability to take on other projects. frankly, the technical complexity paired with the aggressive timeline of the project was a stretch for the team members. while the project definitely had positive effects on the team’s skills and abilities, it was also challenging to incorporate into daily workflow.

goodness of fit: in considering the various models of digital humanities work being done on campuses that “sustaining dh” highlights, it became clear that the project was not a perfect fit for uci’s institutional culture. with its emphasis on project development and spurring innovation, the project was more suited to a lab model for supporting dh; meanwhile, our institutional culture would be a more natural fit for the service or network model, with its emphasis on faculty research and success. the project was not initiated by a particular faculty or other campus group; it was conceived within the libraries.
end users were considered throughout the project’s lifecycle; however, none were involved at the outset. so, unless we parlay the research and development we’ve done to apply directly to faculty and graduate students, or our culture shifts significantly through leadership or money, it is less likely that we will continue the project as currently scoped.

user base: the project team conducted iterative user testing during the prototype building phase of the project, serving three main purposes. first, during the grant project, we were able to gain early feedback from potential users and incorporate some of their suggestions into the tool, while keeping other suggestions for longer-term development. second, we learned how faculty anticipated using the tool, and discovered that it was primarily understood as a pedagogical tool. lastly, we began to build an audience for the continuation of the tool, while identifying the types of content that would be most useful to add in terms of our faculty’s interests. project members were strategically gathering data on what content would be most useful to add, using a model inspired by the center for primary research and training (cprt), a unit of university of california, los angeles’ special collections. cprt staff have successfully managed a workflow based on faculty interest and use of material to selectively digitize pedagogically crucial materials. we also considered the library, archive, and museum community to be a key audience for the project. we purposely built materials and tools to be not only interoperable but also adaptable to others’ situations. project team members started to present the project via blog posts and conference presentations in january, . we received enthusiastic feedback about potential partnerships from a wide variety of institutions.
scale solutions: the project’s use of authoritative metadata, institutional repositories, and open source tools with robust communities all weighed heavily in favor of its scalability.

the uci libraries drafted a new strategic plan with new vision, mission, and values at the same time that the project was taking place. http://www.lib.uci.edu/sites/all/files/docs/uci-libraries-strategic-plan- - - .pdf

in sum, the plodab project has both strengths and weaknesses with respect to its ultimate sustainability by uci libraries and its staff. its strengths include its contribution to mission, the interested user base (particularly the library, archives, and museums community), and the use of scale solutions. however, its weaknesses include uci’s limited organizational capacity to keep the project afloat, the goodness of fit, and the lack of a faculty champion for its continued existence. the project’s team members believe that the best chance for the project’s sustainability is a.) to find a faculty champion who would be willing to take on a lead role in the next phases of the project; and b.) to pursue external partnerships with institutions that can actively contribute to the project.

lessons learned from sustainability

• understand your audience
• understand your institutional readiness and capacity, and ability to provide institutional support at the outset and ongoing. our project’s identity as more of a lab model project than a service model project illuminates some ways that we should have understood our own organizational predilections better
• understand the workload, staff expertise
• understand the grant landscape

conclusion

while the plodab project experienced both successes and challenges in its mission to achieve interoperability and sustainability, it greatly enriched team members’ experience, which has been beneficial to the institution’s greater capacity for digital scholarship.
references

bizer, christian, tom heath, and tim berners-lee. “linked data - the story so far.” international journal on semantic web and information systems , no. ( ): - . doi: . /jswis. .

drucker, johanna. the century of artists’ books. new york: granary books, .

“fit for purpose: developing business cases for new services in research libraries.” http://mcpress.media-commons.org/businesscases/

maron, nancy l., and sarah pickle. “sustaining the digital humanities: host institution support beyond the start-up phase,” june , . http://www.sr.ithaka.org/wp-content/uploads/ / /sr_supporting_digital_humanities_ f.pdf

miller, stephen j. metadata for digital collections: a how-to-do-it manual. new york: neal-schuman publishers, .

schaffner, jennifer, and ricky erway. does every research library need a digital humanities center? dublin, ohio: oclc research, . http://www.oclc.org/content/dam/research/publications/library/ /oclcresearch-digital-humanities-center- .pdf

university of california, irvine libraries. “ national endowment for the humanities final report: piloting linked open data on artists’ books.” [url forthcoming]

____. “university of california, irvine libraries strategic plan,” may , . http://www.lib.uci.edu/sites/all/files/docs/uci-libraries-strategic-plan- - - .pdf

thermal inequity in richmond, va: the effect of an unjust evolution of the urban landscape on urban heat islands

sustainability article

kelly c. saverino ,* , emily routman , todd r. lookingbill , andre m. eanes , jeremy s. hoffman and rong bao

citation: saverino, k.c.; routman, e.; lookingbill, t.r.; eanes, a.m.; hoffman, j.s.; bao, r. thermal inequity in richmond, va: the effect of an unjust evolution of the urban landscape on urban heat islands. sustainability , , . https://doi.org/ . /su

academic editor: troy abel

received: october
accepted: january
published: february

publisher’s note: mdpi stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

copyright: © by the authors. licensee mdpi, basel, switzerland. this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license (https://creativecommons.org/licenses/by/ . /).

department of geography and the environment, university of richmond, richmond, va , usa; em.routman@gmail.com (e.r.); tlooking@richmond.edu (t.r.l.); andre.eanes@richmond.edu (a.m.e.); rong.bao@richmond.edu (r.b.)
science museum of virginia, richmond, va , usa; jhoffman@smv.org
* correspondence: kelly.saverino@richmond.edu; tel.: + - - -

abstract: the urban heat island (uhi) effect is caused by intensive development practices in cities and the diminished presence of green space that results. the evolution of these phenomena has occurred over many decades. in many cities, historic zoning and redlining practices barred black and minority groups from moving into predominately white areas and obtaining financial resources, a practice that still affects cities today, and has forced these already disadvantaged groups to live in some of the hottest areas. in this study, we used a new dataset on the spatial distribution of temperature during a heat wave in richmond, virginia to investigate potential associations between extreme heat and current and historical demographic, socioeconomic, and land use factors.
we assessed these data at the census block level to determine if blocks with large differences in temperature also had significant variation in these covariates. the amount of canopy cover, percent impervious surface, and poverty level were all shown to be strong correlates of uhi when analyzed in conjunction with afternoon temperatures. we also found strong associations of historical policies and planning decisions with temperature using data from the university of richmond’s digital scholarship lab’s “mapping inequality” project. finally, the church hill area of the city provided an interesting case study due to recent data suggesting the area’s gentrification. differences in demographics, socioeconomic factors, and uhi were observed between north and (more gentrified) south church hill. both in church hill and in richmond overall, our research found that areas occupied by people of low socioeconomic status or minority groups disproportionately experienced extreme heat and corresponding impacts on health and quality of life.

keywords: urban heat islands; socioeconomic inequity; temperature; poverty; redlining; race

. introduction

warming attributable to human emissions of heat-trapping gases is unequivocal [ ]. at the planetary scale, observable trends include decreased snow and ice cover, changes in phenology/growing seasons, droughts, and some aspects of hurricane formation, strength, and translational speed [ ]. the consequences for human health and well-being are dire. as the climate warms, typical weather patterns become more extreme. this includes an increase in heavy rain during monsoon season, followed by an extremely hot dry period, which can result in severe human and economic losses [ ]. the changes in atmospheric conditions also affect heat, the deadliest weather hazard in the united states. deaths from heat and related drought make up . % of all natural hazard deaths in the u.s.; severe summer weather is a close second at . % [ ].
the frequency, duration, and intensity of heat waves across the u.s. have all increased over the past years [ ]. these changes are accentuated in cities through urban heat island (uhi) effects.

an urban heat island is an area of a city that is consistently warmer, in terms of surface temperature, air temperature, or both, than its suburban and rural surroundings [ ]. uhis result from increased land surface temperature (lst) as heat energy gets trapped on the earth’s surface [ ]. this is often caused by highly intense development, from industrial areas to dense mixed-use city centers, along with the removal of green space [ , ]. urban areas, as a result, have a high net radiation retention due to their generally low albedo, poor thermal dissipation, and emissions of heat and heat-trapping pollutants [ ]. trees and other vegetation reduce the amount of radiation that reaches and is retained in the ground, lowering surface temperatures and mitigating uhi effects as a whole [ ]. ziter et al. showed that temperature decreased with increasing canopy cover, with the greatest cooling effects when canopy cover surpassed % [ ].

unfortunately, racial minorities and communities of low socioeconomic status suffer disproportionately from uhi in the united states, which is compounded by lack of access to central air conditioning in their homes or public refuge facilities [ , ].
this distribution of at-risk populations stems in large part from the historical practices of exclusionary zoning and redlining [ ].

throughout the th century, public and private sectors around the u.s. used several different tactics to exclude minority groups from certain neighborhoods. redlining occurred when people living in an area classified as having a poor financial status were refused a financial resource, such as a loan or insurance. these policies served to uphold segregation by denying resources either directly based on race or indirectly based on historical proxies for race, such as one’s neighborhood, financial status, health conditions, and insurance history [ ]. this racially biased policy has historically been an issue in cities throughout the u.s. and has resulted in black and other minority residents getting trapped in the cycle of poverty, a pattern of segregation that is still visible in many u.s. cities today [ ].

as early as , baltimore became the first city to create outright racial zoning laws. richmond followed suit and in passed its own segregation law that banned black households from moving into white household blocks and vice-versa. in , the supreme court case buchanan v. warley banned these racial zoning codes, declaring city-mandated racial zoning unconstitutional [ ]. in response to the decision in buchanan v. warley, which specifically dealt with municipal legal statutes, the racial restrictive covenant became a common practice. this entailed the buyer of a house entering into a written agreement with the seller promising not to sell, rent, or transfer the property to “any person not of the caucasian race” and specifically “against the occupancy as owners or tenants of any portion of said property for resident or other purposed by people of the negro or mongolian race” [ ] (p. ). in , the federal home owners’ loan corporation (holc) began assessing perceived risk in major u.s. cities.
their goal was to determine which areas would be more likely to pay off their amortized home loans. these designations are what later led to the practice of redlining [ ]. by , percent of both chicago and los angeles carried restrictive covenants barring black families [ ]. the long-lasting effects of these policies on many u.s. cities are well documented, but the specific negative implications on environmental conditions, such as temperature resulting from discrimination against and segregation of black communities, require further exploration.

hoffman et al. [ ] described how the areas that experience the highest surface temperatures in cities like richmond and baltimore correlate significantly with historic redlining. holc map shapefiles for us cities or urban areas were compared to land surface temperatures (lsts) derived from landsat imagery collected between june and august . mean lst within each holc security polygon was estimated, yielding the ability to show on average how much cooler or warmer a single holc security polygon is in comparison to all of the holc security polygons overall. the resulting non-uniform distribution in lst differences across various cities demonstrates that present-day urban heat is influenced by historical policies. in general, cities that have experienced redlining tend to be warmer overall than those that have not experienced redlining [ ].

our study builds on this work by investigating in further depth whether increased heat in certain areas of richmond, virginia is associated with the history of those areas and current demographic factors, especially those related to race and socioeconomic status. we explore the history of the city, including its history of redlining that created legacy spatial patterns in the demographics of the city, and we analyze more recent demographic change over the past decade including ongoing impacts of gentrification and immigration.
we then consider how these changing demographics are related to the distribution of temperature in the city. our goal is to assess which populations of the city may be most at risk from extreme heat and to evaluate the historical and contemporary drivers of these patterns. . materials and methods . . study area richmond, virginia provides a history of segregation over lines of race, ethnicity, and class, which has had a lasting footprint on the modern landscape. as of , richmond has an area of . km and an estimated population of , . according to american community survey -year estimates ( – ), richmond’s racial makeup is % black, % white, % asian, and % other, with % of richmond residents identifying ethnically as hispanic or latino [ ]. the majority of the white populations reside near the more suburban west end, while the vast majority of the black populations are located on the east side of the city, which is denser and more urban. additionally, the latino populations tend to reside in the southern parts of richmond, near the chesterfield border (figure ). race and class are heavily intertwined in richmond: those areas with the highest median household income and lowest poverty are in the mostly white west end while the areas with low income and high poverty are in the mostly black east end. although the black population accounted for % of the population in , they accounted for % of those in poverty. richmond’s poverty rate of . % in was much higher than both virginia ( . %) and the u.s. ( . %); the median household income in richmond ($ , ) was significantly lower than that of virginia ($ , ) and the u.s. ($ , ) in (figure ). sustainability , , x for peer review of our study builds on this work by investigating in further depth whether increased heat in certain areas of richmond, virginia is associated with the history of those areas and current demographic factors, especially those related to race and socioeconomic sta- tus. 
we explore the history of the city, including its history of redlining that created legacy spatial patterns in the demographics of the city, and we analyze more recent demographic change over the past decade including ongoing impacts of gentrification and immigration. we then consider how these changing demographics are related to the distribution of tem- perature in the city. our goal is to assess which populations of the city may be most at risk from extreme heat and to evaluate the historical and contemporary drivers of these pat- terns. . materials and methods . . study area richmond, virginia provides a history of segregation over lines of race, ethnicity, and class, which has had a lasting footprint on the modern landscape. as of , richmond has an area of . km and an estimated population of , . according to american community survey -year estimates ( – ), richmond’s racial makeup is % black, % white, % asian, and % other, with % of richmond residents identifying ethnically as hispanic or latino [ ]. the majority of the white populations reside near the more suburban west end, while the vast majority of the black populations are located on the east side of the city, which is denser and more urban. additionally, the latino populations tend to reside in the southern parts of richmond, near the chesterfield border (figure ). race and class are heavily intertwined in richmond: those areas with the highest median household income and lowest poverty are in the mostly white west end while the areas with low income and high poverty are in the mostly black east end. although the black population accounted for % of the population in , they accounted for % of those in poverty. richmond’s poverty rate of . % in was much higher than both virginia ( . %) and the u.s. ( . %); the median household income in richmond ($ , ) was significantly lower than that of virginia ($ , ) and the u.s. ($ , ) in (figure ). figure . race and ethnicity distribution in richmond, va. 
source: american community survey (acs) -year estimates ( – ). figure . race and ethnicity distribution in richmond, va. source: american community survey (acs) -year esti- mates ( – ). because of richmond’s unique development history, the urban form varies throughout the city. according to data from the multi-resolution land characteristics consortium national land cover database (nlcd) [ ], . % of richmond’s land cover is impervious surfaces. the abundance of developed surfaces in the eastern part of the city is evident in figure . the southeast includes the highest concentration of industrial land uses. in contrast, much of the one-third of the city’s area that is classified as single-family housing can be found in the western extents of the city (figure ) [ ]. the racial, ethnic, and socioeconomic diversity, as well as vast differences in urban form throughout different areas sustainability , , of of the city, allowed for in-depth analyses of potential relationships between temperature and a variety of demographic and geographic factors associated with the marginalization of certain urban populations. sustainability , , x for peer review of figure . economic indicators in richmond, va, including median household income, poverty, and educational attain- ment. source: acs -year estimates ( – ). because of richmond’s unique development history, the urban form varies through- out the city. according to data from the multi-resolution land characteristics con- sortium national land cover database (nlcd) [ ], . % of richmond’s land cover is impervious surfaces. the abundance of developed surfaces in the eastern part of the city is evident in figure . the southeast includes the highest concentration of industrial land uses. in contrast, much of the one-third of the city’s area that is classified as single-family housing can be found in the western extents of the city (figure ) [ ]. 
figure . economic indicators in richmond, va, including median household income, poverty, and educational attainment. source: acs -year estimates ( – ).
figure . land use and land cover in richmond, va. sources: richmond planning and development review and national land cover database.
the socioeconomic and racial divisions evident in richmond today are a continuing legacy of zoning codes established nearly one hundred years ago. in the late s, harland bartholomew, a st. louis-based planner who was hired to create a city plan and begin the annexation of some of the wealthiest areas in neighboring henrico county, asserted that one of the greatest “deficiencies” of richmond’s zoning code was “the lack of single-family residential districts,” since “the soundest realty values in any city” are owner-occupied single-family dwellings [ ] (p. ). bartholomew’s solution to this so-called problem was a new zoning ordinance based on socioeconomic status and increased housing type designation of richmond’s neighborhoods. this forced families who could not afford single-family housing out of these designated areas and prevented others from moving in, a practice that became known as economic zoning. because race and class are so intertwined, single-family districts would keep non-white, mainly black, and poor individuals from moving into these districts, creating wealthy, white neighborhoods in areas deemed ‘most desirable’ or grade a. the new ordinance was adopted in with minimal backlash, since zoning was a novel planning tool and its format was too complicated for the average person to understand [ ].
the results of the zoning ordinance showed a clear difference in treatment between majority-white and majority-black neighborhoods. the most notable change from the zoning ordinance was the division from just three residential districts into three single-family, one two-family, and two multiple-family residential districts. in overlaying the “negro areas” map from bartholomew’s master plan with the zoning map, none of the black areas became zoned “a” or “b” single family; just a handful of black areas in the northside, the east end, and manchester were zoned “c” single family, but the majority were either in “d” two-family, “e” or “f” multiple family, or even in “j” light industrial or “k” heavy industrial districts (figure ). because of the link between race and class, the implication was that the poor, black residents could not afford to live in the exclusive, white, single-family enclaves. this was intensified by the use of restrictive covenants to prevent black families from moving into homes, as well as the refusal of banks to make home loans to those households due to redlining [ ]. consequently, these factors led to black families continuing to live in the smaller, older, and less expensive homes in the less ‘prestigious’ residential districts. the zoning ordinance was also problematic in that it placed many predominantly black areas in or adjacent to industrial districts, including scott’s addition, manchester, and shockoe bottom. according to richard rothstein, “the racial aspect of” the choice to zone industrial areas close to black neighborhoods “was a desire to avoid the deterioration of white neighborhoods when african american sites were available alternatives . . . the welfare of african americans did not count for much in this policy making” [ ] (p. ). through the zoning ordinance, the city was able to insulate and “protect” the wealthy and white neighborhoods. 
after decades of these segregation tactics, the poorest citizens, who were disproportionately people of color, were left practically unable to move anywhere beyond inner cities. disinvestment toward the poor inner-city communities through the mid- th century left a stark contrast between the majority-black areas and the majority-white areas [ ]. these socioeconomic and racial contrasts still exist today in richmond, and we therefore predict that these long-term inequities, coupled with physical differences in the city’s landscape, will result in more black communities located in the hottest areas of the city, causing them to be most affected by the urban heat island effect and its negative repercussions.
figure . zoning categories in richmond, va that resulted from the zoning ordinance and areas with greater than % black population in . source: richmond planning and development review.
. . data
the spatial distribution of temperature in the city of richmond is derived from a field sample conducted on july . on this day, one of the hottest of the summer, teams of researchers, students, and volunteers walked, biked (three bicycles), or drove (nine cars) throughout areas of the city with vehicle-mounted sensors that recorded their spatial coordinates at one-second intervals and corresponding afternoon ( : – : p.m.) temperature using a type “t” thermocouple and data logger [ ]. the tens of thousands of observations generated were then fed into a machine learning algorithm to model temperature at -m resolution based on the associations between these observations and spectral data from the sentinel- satellite constellation [ ]. a full description of the temperature data collection, quality assurance, and modeling is available in shandas et al., [ ]. our study relates to shandas et al., by expanding on their urban heat island location prediction technique to determine if the hottest areas of richmond, va are occupied by certain socioeconomic, racial, or ethnic groups. the temperature map was then aggregated by census block groups [ ] using arcgis pro and areas of extreme high temperature were classified as urban heat islands (figure ).
figure . average afternoon temperature by census block group in richmond, va in july . images on the left provide an example of how data from one of the bike-route transects was used to create the temperature model for the city. values from the temperature model were then aggregated by census block. source: shandas et al., .
impervious surface and canopy cover data were retrieved from the national land cover database (nlcd) [ ]. green space data were sourced from an arcgis map of parks and open space created for the richmond master plan [ ]. the “parks and open space” layer used to represent green space encompasses recognized parks, sports fields, farmer’s markets, community gardens, cemeteries, and other vegetated areas. these maps were overlaid with a map of richmond’s census blocks [ ] using arcgis pro, and the percentage of each census block comprising impervious surface, canopy cover, and green space was calculated (figure ).
figure . greenspace, impervious surface, and canopy cover as percentages of census block group.
redlining data were obtained from the “mapping inequality” database of the university of richmond’s digital scholarship lab [ ], with the city divided into polygons designated by one of four holc security ratings: a = best, b = still desirable, c = definitely declining, d = hazardous. the security ratings were meant to indicate the residential desirability of a neighborhood, as well as the “threat of infiltration of foreign-born, negro, or lower-grade population” [ ]. the proportion of each census block falling within each holc rating was calculated. in subsequent analyses, grade a “best” and grade b “still desirable” sections were combined, while grade c “definitely declining” and grade d “hazardous” sections were combined.
demographic data were collected to assess any associations between demographic factors and temperature data. we aggregated data from the american community survey’s (acs) most recent -year estimates ( – ) at the block group level [ ]. we collected data for the following attributes: total population (table a ), race and ethnicity (tables a and a ), household income (table a ), highest educational attainment (table a ), transportation to work for workers years and older (table a ), poverty status in (tables a a, a b, and a c), per capita income (table a ), and number of housing units in each structure (table a ). race and ethnicity were normalized as a percent of the total population. for highest educational attainment, we used the number of adults over years old with at least a bachelor’s degree normalized as a percent of total population.
all data were gathered to be representative of the – range to be consistent with the -year acs estimates. temperature data represent july . nlcd data used to extract impervious surface and canopy cover were derived from the dataset. green space data represents and was sourced from the richmond planning and development office.
the only exception is the redlining data, which was included to assess the legacy effects of historical policy. to examine trends in the demographic data, we also collected the -year acs estimates from data ( – ) [ ].
. . analysis
to identify areas of the city with high and low values for a given variable, we first divided the city into quartiles based on our chosen predictor variables. for example, the first quartile (q ) blocks based on household income had an average median household income of $ , and the highest quartile (q ) blocks had an average median household income of $ , ; the q blocks based on white population had an average of . % of the population identifying as white, while the q blocks had, on average, . % of the population of the blocks identifying as white. then, to assess whether the predictor variable had a significant association with temperature (e.g., whether low household income blocks were associated with increased temperatures), we conducted t-tests on the temperature data, grouping the census blocks by q and q for the variable of interest. for example, we asked whether the set of census blocks with a median household income of $ , (q ) was significantly warmer than the set of census blocks with a median household income of $ , (q ). we repeated these tests for all predictor variables to determine which variables had the strongest relationship with temperature. determining what factors were most strongly associated with temperature differences allowed us to look at what populations of the city are most at risk. by mapping all the variables, we were able to identify which portions of the city had the greatest risks.
to determine the possible influence of gentrification, we looked at temporal patterns between temperature and demographic factors from the acs data (from to ).
we identified three areas of the city with large changes in highest educational attainment, which we used as an indicator of gentrification: church hill, manchester, and scott’s addition (figure ).
figure . changes in educational attainment between and by census blocks in richmond, va. educational attainment in this case is defined as adults age or older with a bachelor’s degree or higher. source: acs -year estimates ( – ).
. results
mapping the temperature data from shandas et al. [ ] throughout richmond provides evidence that heat is unequally distributed throughout the city. general spatial discrepancies emerge. the most urban, downtown areas and areas toward the eastern side of the city were significantly hotter than the more suburban parts of richmond. the hottest areas of the city included scott’s addition, manchester, and church hill. in contrast, areas in the western part of the city, especially those with higher tree cover south of the river, were considerably cooler.
sites of extreme heat included areas that were historically redlined. the distribution of holc security ratings throughout richmond, as well as their corresponding average afternoon temperatures, can be seen in figure . the importance of this historical policy in determining the trajectory of landscape evolution leading to disparities in urban heating is evident from the overlap of the two data layers.
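the grouping of holc grades described in the data section can be illustrated with a minimal sketch. it assumes each census block is represented by a mapping from holc grade to the share of the block's area carrying that rating; the function name, the dictionary layout, and the example values are hypothetical and are not taken from the study's arcgis workflow.

```python
def combine_holc_shares(shares):
    """Collapse per-grade area shares into the two groups used in the analysis.

    `shares` maps a HOLC grade ("A" to "D") to the fraction of one census
    block's area with that rating. Ungraded area is simply absent.
    This layout is an assumption made for illustration.
    """
    return {
        "A+B": shares.get("A", 0.0) + shares.get("B", 0.0),
        "C+D": shares.get("C", 0.0) + shares.get("D", 0.0),
    }


# Hypothetical block: 25% grade B, 50% grade C, 12.5% grade D, rest ungraded.
example_block = {"B": 0.25, "C": 0.5, "D": 0.125}
combined = combine_holc_shares(example_block)
```

treating ungraded area as zero keeps the two combined shares from being forced to sum to one, which matters for blocks that lay outside the original holc maps.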
figure . home owners’ loan corporation (holc) risk ratings throughout richmond, va. average afternoon temperature is overlaid, with larger circles indicating higher temperatures. sources: shandas et al., and home owners’ loan corporation, division of research and statistics ( april ).
we determined that a variety of social-environmental factors were associated with temperature (table ). a total of of the social-environmental factors were associated with statistically significant differences in afternoon temperature according to the t-tests (p < . ). at a significance level of p < . , out of of the variables had a significant association with temperature. the six variables most strongly associated with temperature were percent impervious surface, percent canopy cover, percent of the block that was redlined category c or d, percent drove alone to work, median household income, and percent below poverty level (figure ). the strong associations between temperature and category c or d redlining, public transportation use, and low median household income suggest that the poorest neighborhoods were indeed the hottest.
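the quartile comparison behind these t-tests can be sketched as follows. this is an illustration under stated assumptions, not the authors' code: the field names and the toy values are hypothetical, the difference is taken as top quartile minus bottom quartile, and welch's unequal-variance form of the t statistic is assumed because the text does not say which variant was used.

```python
from math import sqrt
from statistics import mean, quantiles, variance


def quartile_temperature_gap(blocks, predictor, temp="temp_c"):
    """Compare mean temperature in the bottom and top predictor quartiles.

    `blocks` is a list of dicts, one per census block group (hypothetical
    field names). Returns (top-quartile mean minus bottom-quartile mean,
    Welch t statistic for that difference).
    """
    values = [b[predictor] for b in blocks]
    low_cut, _, high_cut = quantiles(values, n=4)  # three quartile cut points
    low = [b[temp] for b in blocks if b[predictor] <= low_cut]
    high = [b[temp] for b in blocks if b[predictor] >= high_cut]
    diff = mean(high) - mean(low)
    # Welch's t: standard error built from each group's own sample variance.
    se = sqrt(variance(high) / len(high) + variance(low) / len(low))
    return diff, diff / se


# Toy data only: eight made-up block groups, not values from the study.
toy_blocks = [
    {"impervious_pct": p, "temp_c": t}
    for p, t in zip(
        [10, 20, 30, 40, 50, 60, 70, 80],
        [30.0, 30.4, 31.0, 31.5, 32.1, 32.4, 33.0, 33.6],
    )
]
gap, t_stat = quartile_temperature_gap(toy_blocks, "impervious_pct")
```

a positive gap here means the top-quartile blocks are warmer. turning the t statistic into a p-value requires the t-distribution, which a statistics library can supply.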
percent impervious surface and percent canopy cover, the two strongest variables associated with temperature, each have a q –q temperature difference with an absolute value of about °c. this difference of °c is about equal to the entirety of warming that the earth has experienced since pre-industrial times, and the impacts of this warming on the planet have been well-documented [ , ].
we were interested in further exploring the variation between racial groups due to richmond’s history of redlining and racial division, despite the absence of the race variables from the six most significant variables. black communities and those living below the poverty line were disproportionately located in the hottest areas of richmond and in turn exposed to the negative consequences of the heat, compared to their wealthier, white counterparts (figure ). again, there was a clear east–west divide that can be traced back to historical housing policies.
table . quartile analyses of the variables associated with afternoon temperature.
variable q average q average temperature difference °c (q –q ) t-test p-value
impervious surface (%) . . . . × −
canopy cover (%) . . − . . × −
redlining c + d (%) . . . × −
drove alone to work (%) . . − . . × −
below poverty level (%) . . . .
median household income ($) , , − . .
per capita income ($) , , − . .
white population (%) . . − . .
detached housing (%) . . − . .
green space (%) . − . .
black population (%) . . . .
redlining a + b (%) . − . .
non-hispanic white population (%) . . − . .
attached housing (%) . . .
bachelor’s degree or more (%) . . − . .
hispanic population (%) . . . .
variables were sourced from the american community survey (acs), the national land cover database (nlcd), green space data from the richmond planning and development office, and historical data on redlining from the home owners’ loan corporation, division of research and statistics. the table lists variables from most to least significant.
q and q averages indicate the average value of the variable within those quartiles. the temperature difference (°c) shows the heat difference between the q average temperature and the q average temperature. positive values indicate that q blocks are warmer than q blocks and negative values indicate that q blocks are warmer than q blocks.
figure . comparison of variables associated with average afternoon temperature throughout richmond, va. percent impervious surface, percent canopy cover, percent of a block rated category c or d, percent drove alone to work, percent below the poverty level, and median household income produced some of the strongest associations with afternoon temperature.
figure . distribution of white and black populations throughout richmond, va, compared to the average afternoon temperature. in each map, the larger the circle is, the greater the population of the corresponding race. the white alone map indicates that the greatest portion of white residents live toward the western end of the city, while the black alone map indicates that the greatest portion of black residents live toward the eastern end.
to evaluate the changes in the aforementioned variables over time, american census data from was analyzed and compared to data. in , the two strongest variables associated with afternoon temperature were median household income and percent drove alone to work (table ). this was consistent with our analyses, in which these two variables also had the strongest associations with temperature.
the magnitude of the temperature differences between the q and q neighborhoods was also similar between and at around . °c. four out of the seven demographic variables analyzed for had a significant relationship with temperature at a significance level of p < . and an additional two variables were significant at a level of p < . . the variables were only sourced from acs data and did not include the redlining or land use and land cover variables.
table . quartile analyses of the seven variables associated with afternoon temperature from the american community survey (acs), ordered from most to least significant.
variable q average q average temperature difference °c (q -q ) t-test p-value
median household income ($) , , − . . × −
drove alone to work (%) . . − . .
non-hispanic white population (%) . . − . .
bachelor’s degree or more (%) . . − . .
white population (%) . . − . .
black population (%) . . . .
hispanic population (%) . . .
q and q averages indicate the average value of the variable within those quartiles. the temperature difference (°c) shows the heat difference between the q average temperature and the q average temperature. positive values indicate that q blocks are warmer than q blocks and negative values indicate that q blocks are warmer than q blocks.
notable changes occurred in the demographic data from to . for example, the difference between q and q median household income in was $ , , while the difference in was $ , . this shows a $ , increase in the disparity between the median household incomes of the richest and poorest areas of richmond within the eight-year period. this disparity is driven, in part, by pockets of gentrification in the city. one example is south church hill, which experienced an increase in median household income of $ from to . during that same period, north church hill experienced little to no gentrification.
demographic patterns between these two adjacent neighborhoods are typical of gentrification, with more minority groups, black and hispanic populations, and those of lower socioeconomic status in the north (table ). we hypothesized that the gentrified south church hill (blocks east and west) would have a lower mean afternoon temperature than north church hill (block ), perhaps due to green space increase or albedo-related land use changes. however, the data did not show a significant difference in temperature. this suggests that historical factors, including redlining, still primarily determine temperature rather than more recent demographic change.
table . comparison of temperature, socioeconomic, and demographic data between north and south church hill.
variable south ( east) south ( west) north ( )
mean afternoon temperature (°c) . . .
below poverty level (%) . . .
per capita income ($) , , ,
median household income ($) , , ,
white population (%) . . .
black population (%) . . .
redlining category d (%) . . .
property value ($) , , ,
south church hill is divided into two census blocks, since both the eastern and western blocks that comprise the area are included.
. discussion
the main goal of this study was to determine if there was any correlation between the hottest areas of the city and certain demographic, socioeconomic, and land use factors. percent impervious surface and percent canopy cover were determined to be the two strongest variables associated with afternoon temperature, meaning that the higher the proportion of impervious surfaces or the less canopy cover an area has, the warmer the area is likely to be. these factors have direct mechanistic influences on temperature, and the strength of these observed associations was not surprising. the next four strongest correlates were each associated with approximately .
°c in temperature difference and included a variety of social and historical factors, among them the redlining holc security ratings of the census blocks.

our results are consistent with national patterns showing the lingering effects of redlining. in a related study, redlined communities were found to be hotter than non-redlined neighbors in % of the u.s. cities studied [ ]. this extreme heat, in turn, poses a significant threat in terms of adverse health effects for the communities living there [ ]. one group that is highly impacted is pregnant women, specifically black women, living within urban heat islands [ – ]. a meta-analysis of the association between heat exposure and adverse birth outcomes found that the chance of a premature birth increases by . % for every . °c increase in temperature [ ]. as climate change inevitably worsens over the coming years, such issues will be exacerbated.

planting trees has the potential to address the inequity in heat in many cities, and cities across the planet have pledged to plant millions of trees to contribute to this effort. programs like the trillion trees initiative (https://trilliontrees.org/) have begun to emphasize the importance of urban forests, where trees provide multiple values including reducing urban heat island effects, fighting pollution, reducing flooding, and sequestering carbon [ , ]. the benefits already provided by existing trees in urban communities include approximately $ . billion in savings from reduced heating and cooling of residential buildings [ ]. in the case of minimal space for tree planting, increasing vegetative cover, such as on roofs, has also been found to aid in cooling [ ]. our study points to the need to concentrate future planting in specific neighborhoods to reduce current inequities in urban heating [ ]. richmond is one of the cities that would greatly benefit from mass tree planting and other efforts to mitigate heat [ ].
in recent planning initiatives, including the office of sustainability’s rvagreen and richmond’s most recent -year plan [ ], there has been notable attention paid to uhi and climate issues. rvagreen is the city of richmond’s “equity-centered climate action and resilience planning initiative to achieve net zero greenhouse gas emissions by and help the community adapt to richmond’s climate impacts of extreme heat, precipitation, and flooding” [ ] (paragraph ). the city plans to address urban heat by using light-colored pavement to reflect sunlight, varying building heights in hot neighborhoods to increase air flow, protecting and restoring natural resources, increasing tree shade on sidewalks, and ensuring that every richmond resident lives within a ten-minute walk of a park [ ]. though the goal is to implement these improvements throughout the entirety of the city, the main and initial focus will be on the hottest and poorest neighborhoods of richmond. additional benefits of these proposed improvements to the physical environment include a variety of health benefits. green urban planning has the potential to reduce mental illness within vulnerable populations [ ], who tend to be those living in the hottest areas of the city. these mental health concerns span from everyday stress to more serious psychiatric disorders like schizophrenia. in a study of , residents of denmark, engemann et al. [ ] found that the risk of developing a psychiatric disorder between adolescence and adulthood was reduced by up to % when childhood development took place in a relatively green area. the association between improved mental health and ample green space was still evident when controlling for socio-economic status, urbanization, and family history of mental conditions, all known predictors of psychiatric disorders. the hottest areas of the city in our study tend to correspond to historically redlined regions as has been documented for many cities of the u.s. [ ]. 
church hill, an area of richmond located in the northeast part of the city, is one of these regions. the southern part of church hill has been gentrified while the north has not. in the s, broad street was the dividing line between north and south. however, as gentrification has spread north, the present-day border has moved to n street, which aligns with the census block divide. even so, we do not yet see differences in temperature among the different neighborhoods in church hill, suggesting that historical factors continue to have an outsized influence on current temperatures in cities.

going forward, it would be beneficial to repeat our data collection to look for any progression or changes within the predictor variables and their relationship to temperature. performing additional case studies, like the church hill analyses, could be useful in evaluating the effects of redlining on other areas known to have experienced gentrification. further analysis of scott’s addition would be an interesting extension since the greatest change in educational attainment from to , our indicator of gentrification, was found here. the area was dominated by industrialization starting in the early th century when the land was annexed to the city of richmond [ ], but has become richmond’s fastest growing neighborhood in recent years [ ]. due to a call for more residential development in the late th century, scott’s addition now includes a variety of breweries, restaurants, and apartment complexes [ ]. currently, the -strong population of scott’s addition consists of . % white collar workers and only . % blue collar workers. households are predominately non-family households without children [ ]. as factories made way for new businesses, younger people with more education and higher incomes (average household income: $ , [ ]) were drawn to the area, contributing to the change in educational attainment over time.
similar patterns can be found in other gentrified regions as well, as wealthier people and businesses move into these areas. it is yet to be seen whether these socioeconomic changes will translate to reductions in thermal stress.

we also stress that our analysis relies on a series of bivariate comparisons to determine the associations between the demographic and land use variables and temperature. of course, there are interactions among many of the variables, and future analyses would benefit from the inclusion of longer time series data and more robust multivariate analyses.

conclusions

based on our research, it is evident that heat is unequally distributed throughout the city of richmond. this is the result of racist historical zoning practices that sought to keep white neighborhoods separate from black and minority communities and protected from the harm of industrialization. the extremely detrimental effects of these practices are still evident in the segregation of richmond today, as the more affluent suburban western end of the city is dominated by white neighborhoods and the urban eastern side of the city is dominated by black neighborhoods. since areas occupied by black communities were essentially sacrificed as industrialization grew, these regions have the most impervious surfaces and least green space and are in turn the warmest parts of the city. in comparison, the western end of the city has a great deal of green space and minimal impervious surfaces.

at local scales, differences in temperature were expected between adjacent gentrified and non-gentrified regions. the temperature difference between north church hill and gentrified south church hill was not significant, yet the two areas did have large differences in percent below the poverty level, per capita income, median household income, and property value, revealing the tangible effects of gentrification.
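the comparison of north and south church hill, like the quartile analyses reported in the results, is a two-sample difference in mean afternoon temperature. the following is a minimal sketch of that kind of bivariate comparison, not the authors' code. the block records, the field names `income` and `temp`, and the helper names are hypothetical; the temperature difference is taken as the top-quartile mean minus the bottom-quartile mean; and welch's unequal-variance form of the t statistic is an assumption, since the text only states that a t-test was used.

```python
import math
import statistics


def quartile_groups(blocks, key):
    """Return (bottom quartile, top quartile) of blocks ranked by `key`."""
    if len(blocks) < 4:
        raise ValueError("need at least four blocks to form quartiles")
    ordered = sorted(blocks, key=lambda block: block[key])
    size = len(ordered) // 4
    return ordered[:size], ordered[-size:]


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Each sample needs at least two values so a sample variance exists.
    var_mean_a = statistics.variance(sample_a) / len(sample_a)
    var_mean_b = statistics.variance(sample_b) / len(sample_b)
    mean_gap = statistics.mean(sample_a) - statistics.mean(sample_b)
    t = mean_gap / math.sqrt(var_mean_a + var_mean_b)
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (len(sample_a) - 1)
        + var_mean_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical census blocks (made-up values, not data from the study).
blocks = [
    {"income": 20000, "temp": 33.0},
    {"income": 25000, "temp": 32.5},
    {"income": 30000, "temp": 32.8},
    {"income": 40000, "temp": 32.0},
    {"income": 55000, "temp": 31.5},
    {"income": 70000, "temp": 31.0},
    {"income": 90000, "temp": 30.6},
    {"income": 120000, "temp": 30.2},
]

bottom, top = quartile_groups(blocks, "income")
bottom_temps = [block["temp"] for block in bottom]
top_temps = [block["temp"] for block in top]

# A negative difference means the top income quartile is cooler.
temp_difference = statistics.mean(top_temps) - statistics.mean(bottom_temps)
t_stat, dof = welch_t(top_temps, bottom_temps)
print(f"difference = {temp_difference:.2f} deg C, t = {t_stat:.2f}, df = {dof:.2f}")
```

a p-value would then come from comparing the statistic against a student's t distribution with the computed degrees of freedom. because each variable is tested on its own, a sketch like this shares the limitation noted above: it cannot separate the effects of variables that interact.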
while the extent of the disparity that the findings revealed is disturbing, the data can be used to implement initiatives to reduce the thermal imbalance. the findings of this study can be used to improve city planning, such as increasing green space in areas that currently lack access. construction in areas known to be the hottest can incorporate heat mitigating tactics. light-colored or vegetated roofs are one such strategy to minimize heat absorption. working to reduce heat inequality in richmond will not only benefit the environment but also has the potential to greatly reduce health inequity among racial and ethnic groups and people of varying socioeconomic statuses throughout the city.

author contributions: conceptualization, k.c.s., t.r.l. and a.m.e.; methodology, k.c.s. and t.r.l.; software, k.c.s.; validation, k.c.s., t.r.l., e.r., j.s.h. and a.m.e.; formal analysis, k.c.s.; investigation, k.c.s., e.r., t.r.l., a.m.e., j.s.h. and r.b.; resources, k.c.s., e.r., t.r.l., j.s.h. and r.b.; data curation, k.c.s., e.r. and r.b.; writing—original draft preparation, k.c.s., e.r. and t.r.l.; writing—review and editing, k.c.s., e.r., t.r.l., a.m.e. and j.s.h.; visualization, k.c.s., e.r., t.r.l. and r.b.; supervision, t.r.l.; project administration, k.c.s. and t.r.l.; funding acquisition, k.c.s., t.r.l. and j.s.h. all authors have read and agreed to the published version of the manuscript.

funding: this research was funded by the university of richmond iqs summer fellowship grant and science museum of virginia support from the virginia academy of science small research grant program. j.s.h. acknowledges further funding support from the noaa office of education award na sec .

institutional review board statement: not applicable.

informed consent statement: not applicable.

data availability statement: not applicable.
acknowledgments: we thank our colleagues at the university of richmond’s spatial analysis lab, digital scholarship lab, and unpacking the census project for their support in developing background datasets for this study.

conflicts of interest: the authors declare no conflict of interest. the funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

references

ipcc. climate change : synthesis report. contribution of working groups i, ii and iii to the fifth assessment report of the intergovernmental panel on climate change; core writing team, pachauri, r.k., meyer, l.a., eds.; ipcc: geneva, switzerland, ; p. .
nasa. facts: the effects of climate change. available online: https://climate.nasa.gov/effects/ (accessed on november ).
wang, s.s.-y.; kim, h.; coumou, d.; yoon, j.-h.; zhao, l.; gillies, r.r. consecutive extreme flooding and heat wave in japan: are they becoming a norm? atmos. sci. lett. , .
borden, k.a.; cutter, s.l. spatial patterns of natural hazards mortality in the united states. int. j. health geogr. , .
habeeb, d.; vargo, j.; stone, b., jr. rising heat wave trends in large us cities. nat. hazards , , – .
arnfield, a.j. two decades of urban climate research: a review of turbulence, exchanges of energy and water, and the urban heat island. int. j. climatol. , , – .
kimuku, c.w.; ngigi, m. study of urban heat island trends to aid urban planning in nakuru county-kenya. j. geogr. inf. syst. , , – .
chakraborty, t.; hsu, a.; manya, d.; sheriff, g. a spatially explicit surface urban heat island database for the united states: characterization, uncertainties, and possible applications. isprs j. photogramm. remote sens. , , – .
livesley, s.j.; mcpherson, e.g.; calfapietra, c. the urban forest and ecosystem services: impacts on urban water, heat, and pollution cycles at the tree, street, and city scale. j. environ. qual. , , – .
solecki, w.d.; rosenzweig, c.; parshall, l.; pope, g.; clark, m.; cox, j.; wiencke, m. mitigation of the heat island effect in urban new jersey. environ. hazards , , – .
farina, a. exploring the relationship between land surface temperature and vegetation abundance for urban heat island mitigation in seville, spain. master’s thesis, lund university, lund, sweden, .
ziter, c.d.; pederson, e.j.; kucharik, c.j.; turner, m.g. scale-dependent interactions between tree canopy cover and impervious surfaces reduce daytime urban heat during summer. proc. natl. acad. sci. usa , , – .
wilson, b. urban heat management and the legacy of redlining. j. am. plan. assoc. , , – .
voelkel, j.; hellman, d.; sakuma, r.; shandas, v. assessing vulnerability to urban heat: a study of disproportionate heat exposure and access to refuge by socio-demographic status in portland, oregon. int. j. environ. res. public health , , .
jones-correa, m. the origins and diffusion of racial restrictive covenants. political sci. q. , , – .
hillier, a.e. redlining and the home owners’ loan corporation. j. urban hist. , , – .
mitchell, b.; franco, j. holc “redlining” maps: the persistent structure of segregation and economic inequality; national community reinvestment coalition (ncrc): washington, dc, usa, ; pp. – .
manning thomas, j.; ritzdorf, m. (eds.) the racial origins of zoning in american cities. in urban planning and the african community: in the shadows; sage publications: thousand oaks, ca, usa, .
united states commission on civil rights (usccr). understanding fair housing; government printing office: washington, dc, usa, ; p. .
rothstein, r. the color of law: a forgotten history of how our government segregated america; liveright publishing corporation a division of w. w. norton & company: new york, ny, usa, .
hoffman, j.s.; shandas, v.; pendleton, n. the effects of historical housing policies on resident exposure to intra-urban heat: a study of us urban areas. climate , , .
u.s. census bureau quickfacts: richmond city, virginia; united states. available online: https://www.census.gov/quickfacts/richmondcityvirginia (accessed on august ).
homer, c.; dewitz, j.; yang, l.; jin, s.; danielson, p.; xian, g.; coulston, j.; herold, n.; wickham, j.; megown, k. completion of the national land cover database for the conterminous united states–representing a decade of land cover change information. photogramm. eng. remote sens. , , – .
richmond planning and development review department. “richmond insights report”. . available online: http://richmond .com/sites/default/files/insightsreport_sept _ .pdf (accessed on september ).
richmond times-dispatch; newsbank, inc.: richmond, va, usa, ; p. .
summers, b.t. capital reinvestment: riot, renewal, and the rise of a black ghetto. in black in place: the spatial aesthetics of race in a post-chocolate city; the university of north carolina press: chapel hill, nc, usa, ; pp. – .
shandas, v.; voelkel, j.; williams, j.; hoffman, j. integrating satellite and ground measurements for predicting locations of extreme urban heat. climate , , .
census blocks map rva. available online: https://www .census.gov/geo/maps/dc map/tract/st _va/c _richmond_city/dc ct_c _ .pdf (accessed on august ).
arcgis parks and open space. available online: http://www.arcgis.com/home/webmap/viewer.html?webmap= f b d e f d b a b (accessed on august ).
nelson, r.k. race and redlining in richmond. redlining richmond n.d. available online: https://dsl.richmond.edu/holc/pages/intro (accessed on october ).
american community survey (acs) -year estimates – . available online: http://data.census.gov/ (accessed on july ).
american community survey (acs) -year estimates – . available online: http://data.census.gov/ (accessed on july ).
nardone, a.; chiang, j.; corburn, j. historic redlining and urban health today in u.s. cities. environ. justice , , – .
smith, m.l.; hardeman, r.r. association of summer heat waves and the probability of preterm birth in minnesota: an exploration of the intersection of race and education. int. j. environ. res. public health , , .
nardone, a.l.; casey, j.a.; rudolph, d.e.; karasek, d.; mujahid, m.; morello-frosch, r. associations between historical redlining and birth outcomes from through in california. plos one , .
krieger, n.; van wye, g.; huynh, m.; waterman, p.d.; maduro, g.; li, w.h.; gwynn, r.c.; barbot, o.; bassett, m.t. structural racism, historical redlining, and risk of preterm birth in new york city, – . am. j. public health , , – .
bekkar, b.; pacheco, s.; basu, r.; denicola, n. association of air pollution and heat exposure with preterm birth, low birth weight, and stillbirth in the us: a systematic review. jama netw. open , , e .
domke, g.m.; oswalt, s.n.; walters, b.f.; morin, r.s. tree planting has the potential to increase carbon sequestration capacity of forests in the united states. proc. natl. acad. sci. usa , , – .
brancalion, p.h.s.; holl, k.d. guidance for successful tree planting initiatives. j. appl. ecol. , , – .
nowak, d.j.; appleton, n.; ellis, a.; greenfield, e. residential building energy conservation and avoided power plant emissions by urban and community trees in the united states. urban for. urban green. , , – .
boycourt, l.a. richmond tree initiative to combat heat, pollution, inequity. available online: https://chesapeakebaymagazine.com/richmond-tree-initiative-to-combat-heat-pollution-inequity/ (accessed on november ).
richmond planning and development review department. “richmond : a guide for growth”. june . available online: http://www.richmond .com/sites/default/files/r _lowresdraft_ .pdf (accessed on september ).
city of richmond rvagreen . what is rvagreen ? available online: https://www.rvagreen .com/what-is-rvagreen- (accessed on september ).
engemann, k.; bøcker, c.; arge, l.; tsirogiannis, c.; mortensen, p.b.; svennings, j. residential green space in childhood is associated with lower risk of psychiatric disorders from adolescence into adulthood. proc. natl. acad. sci. usa , , – .
the history of scott’s addition. available online: https://www.scottsaddition.com/history-of-scotts-addition/ (accessed on november ).
about scott’s addition. available online: https://www.scottsaddition.com/about-scotts-addition/ (accessed on november ).
scott’s addition demographics. available online: https://www.point homes.com/us/neighborhood/va/scott-s-addition-demographics.html (accessed on november ).